2

I have some tasks I only want to run on machines that have NVIDIA GPUs. Is there a good way with Puppet to be able to determine if a specific agent has an NVIDIA GPU or not? I'm able to do it in bash by checking to see if /usr/bin/nvidia-smi exists, but I'm not sure how I should do this in Puppet. Also if there's a better way to do it in bash instead of this way, please let me know.

2 Answers2

3

You should create a custom fact that either checks the existence of /usr/bin/nvidia-smi (if that's sufficient), with something like:

Facter.add(:nvidia_gpu) do
  confine :kernel => 'Linux'
  setcode do
    FileTest.executable?('/usr/bin/nvidia-smi')
  end
end

or perhaps to be more thorough checks to see if a particular PCI device exists, if it shows up as one, using either the output of lspci or walking the /sys/bus/pci directory.

In your Puppet manifests, you can then use the value of $facts['nvidia_gpu'] to control what you do.

bodgit
  • 4,871
0

One can modify the pci_devices fact to detect the GPU is installed in the computer. It uses lspci instead of looking for toolkits, so can be used to install the toolkits with puppet.

# Copyright: Pieter Lexis <pieter@kumina.nl>

This program is free software: you can redistribute it and/or modify

it under the terms of the GNU General Public License as published by

the Free Software Foundation, either version 3 of the License, or

(at your option) any later version.

This program is distributed in the hope that it will be useful,

but WITHOUT ANY WARRANTY; without even the implied warranty of

MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the

GNU General Public License for more details.

There are no dependencies needed for this script, except for lspci.

This script is only tested on Debian (Lenny and Squeeze), if you

have any improvements, send a pull request, ticket or email.

The latest version of this script is available on github at

https://github.com/kumina/fact-pci_devices

def add_fact(fact, code) Facter.add(fact) { setcode { code } } end

case Facter.value(:operatingsystem) when /Debian|Ubuntu/i lspci = "/usr/bin/lspci" when /RedHat|CentOS|Fedora|Scientific|SLES/i lspci = "/sbin/lspci" else lspci = "" end

We can't do this if we don't know the location of lspci

if !lspci.empty? and FileTest.exists?(lspci)

Create a hash of ALL PCI devices, the key is the PCI slot ID.

{ SLOT_ID => { ATTRIBUTE => VALUE }, ... }

slot=""

after the following loop, devices will contain ALL PCI devices and the info returned from lspci

devices = {} %x{#{lspci} -v -mm -k}.each_line do |line| if not line =~ /^$/ # We don't need to parse empty lines splitted = line.split(/\t/) # lspci has a nice syntax: # ATTRIBUTE:\tVALUE # We use this to fill our hash if splitted[0] =~ /^Slot:$/ slot=splitted[1].chomp devices[slot] = {} else # The chop is needed to strip the ':' from the string devices[slot][splitted[0].chop] = splitted[1].chomp end end end

To create your own facts, edit the following code:

raid_counter = 0 raidcontrollers = [] gpus = {} scsicontrollers = {} devices.each_key do |a| case devices[a].fetch("Class") when /^RAID/ # ignore AHCI "fake" RAID, because we don't use it if devices[a].fetch('Driver') != "ahci" add_fact("raidcontroller_#{raid_counter}vendor", "#{devices[a].fetch('Vendor')}") add_fact("raidcontroller#{raid_counter}_model", "#{devices[a].fetch('SDevice')}") raid_counter += 1 raidcontrollers.insert(-1,"#{devices[a].fetch('Driver')}") end when /^3D/ if gpus.key?("#{devices[a].fetch('Device')}") gpus["#{devices[a].fetch('Device')}"]['count'] += 1 else gpus["#{devices[a].fetch('Device')}"] = { 'count' => 1, 'vendor' => "#{devices[a].fetch('Vendor')}", } # Driver might not be defined if devices[a].key?('Driver') gpus["#{devices[a].fetch('Device')}"]['driver'] = "#{devices[a].fetch('Driver')}" end end when /.SCSI controller./ if scsicontrollers.key?("#{devices[a].fetch('Device')}") scsicontrollers["#{devices[a].fetch('Device')}"]['count'] += 1 else scsicontrollers["#{devices[a].fetch('Device')}"] = { 'count' => 1, 'vendor' => "#{devices[a].fetch('Vendor')}", 'driver' => "#{devices[a].fetch('Driver')}" } end end end add_fact("raidcontrollers", raidcontrollers.join(",")) add_fact("gpus", gpus) add_fact("scsicontrollers", scsicontrollers) end