Revision 10b1de81

ID 10b1de81bbd2c92e9fb9e897e38231ff8903729a
Parent 758ca724
Child c53197ce

Added by Nuno Fachada about 12 years ago

Configurable warning and critical temperatures for GPUs

View differences:

plugins/gpu/nvidia_gpu_
@@ -17,6 +17,8 @@
 
  [nvidia_gpu_*]
   env.smiexec - Location of nvidia-smi executable.
+  env.warning - Warning temperature
+  env.critical - Critical temperature
 
 =head2 DEFAULT CONFIGURATION
 
@@ -101,8 +103,8 @@
 			while [ $nGpusCounter -lt $nGpus ]
 			do
 				gpuName=`echo "$nGpusOutput" | sed -n $(( $nGpusCounter + 1 ))p | cut -d \( -f 1`
-				echo "temp${nGpusCounter}.warning 75"
-				echo "temp${nGpusCounter}.critical 95"
+				echo "temp${nGpusCounter}.warning ${warning:-75}"
+				echo "temp${nGpusCounter}.critical ${critical:-95}"
 				echo "temp${nGpusCounter}.info Temperature information for $gpuName"
 				: $(( nGpusCounter = $nGpusCounter + 1 ))
 			done 
@@ -205,8 +207,6 @@
 done
 
 # TODO Follow multigraph suggestion from Flameeyes to look into multigraph plugins http://munin-monitoring.org/wiki/MultigraphSampleOutput, in order to reduce the amount of round trips to get the data.
-# TODO Put warning and critical as vars in config with sensible defaults
-
 # TODO Nvidia only: Add unsupported output options from nvidia-smi for those who have that option (how to test?). Test if they are supported and put them in suggest (or not) in case they are supported (or not)
 
 

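The temperature thresholds in this revision rely on the shell's `${var:-default}` expansion: Munin passes each `env.NAME value` setting from the plugin configuration to the plugin as an environment variable `NAME`, and the expansion falls back to 75 or 95 when the variable is unset or empty. A minimal sketch of that behaviour follows. The helper name `emit_thresholds` and the override value 80 are assumptions made for illustration and are not part of the plugin.

```shell
#!/bin/sh
# Sketch of the threshold fallback used in the plugin's config output.
# An illustrative configuration stanza (values are assumptions):
#   [nvidia_gpu_*]
#   env.warning 80
#   env.critical 90
# Munin would export these to the plugin as $warning and $critical.

# emit_thresholds is a hypothetical helper, not a function of the plugin.
# It prints the two threshold lines for the GPU index given as $1.
emit_thresholds() {
    idx=$1
    echo "temp${idx}.warning ${warning:-75}"
    echo "temp${idx}.critical ${critical:-95}"
}

# No overrides configured: both defaults apply.
unset warning critical
emit_thresholds 0

# Only the warning threshold is overridden; critical keeps its default.
warning=80
emit_thresholds 1
```

Because `:-` also treats an empty value as missing, an empty `env.warning` setting would yield the default as well, rather than an empty threshold.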