├── README.md
├── nv.chart.py
├── nv.conf
└── python_modules
    └── pynvml.py


/README.md:
--------------------------------------------------------------------------------
  1 | # README #
  2 | 
  3 | ## Overview ##
  4 | <!-- MarkdownTOC depth=0 -->
  5 | 
  6 | - [About](#about)
  7 | - [Requirements](#requirements)
  8 | - [Installation](#installation)
  9 | 	- [General](#general)
 10 | 	- [Installation Example](#installation-example)
 11 | - [Options](#options)
 12 | 	- [Memory Clock Factor](#memory-clock-factor)
 13 | 	- [Legacy Mode](#legacy-mode)
 14 | - [Charts](#charts)
 15 | - [Known Bugs/Issues](#known-bugsissues)
 16 | - [FAQ](#faq)
 17 | - [License](#license)
 18 | - [Version History](#version-history)
 19 | - [Contact](#contact)
 20 | - [Screenshots](#screenshots)
 21 | 
 22 | <!-- /MarkdownTOC -->
 23 | 
 24 | ## About ##
 25 | 
 26 | [NetData](https://github.com/firehol/netdata/) plugin that polls Nvidia GPU data.
 27 | 
 28 | ![nv plugin screenshot](http://semper.space/netdata_nv/screenshot00.png "Netdata nv plugin")
 29 | 
 30 | 
 31 | ## Requirements ##
 32 | 
 33 | * Nvidia driver installed (this plugin uses the NVML library)
 34 | * nvidia-ml-py Python package (Python NVML wrapper)
 35 | 
 36 | 
 37 | ## Installation ##
 38 | 
 39 | ### General ###
 40 | The NetData installation path used throughout this readme is `/usr/libexec/netdata/`. On some NetData installations the path differs, e.g. `/usr/lib/x86_64-linux-gnu/netdata`.
 41 | 
 42 | Install the nvidia-ml-py Python package via `pip install nvidia-ml-py` or copy the `pynvml.py` file from the "nvidia-ml-py" package (https://pypi.python.org/pypi/nvidia-ml-py) to `/usr/libexec/netdata/python.d/python_modules/`.
 43 | 
 44 | **IMPORTANT**: Version 7.352.0 of the nvidia-ml-py package does not work with Python >=3.2; see the [Known Bugs/Issues](#known-bugsissues) section of this readme.
 45 | 
 46 | With a default NetData installation, copy the `nv.chart.py` script to `/usr/libexec/netdata/python.d/` and the `nv.conf` config file to `/etc/netdata/python.d/`.
 47 | 
 48 | Then restart NetData to activate the plugin.
 49 | 
 50 | To disable the nv plugin, edit `/etc/netdata/python.d.conf` and add `nv: no`.
 51 | 
 52 | 
 53 | ### Installation Example ###
 54 | 
 55 | Example for a standard NetData installation under Ubuntu; it works with Python >=2.6 and >=3.2:
 56 | 
 57 | ```
 58 | cd /tmp/
 59 | 
 60 | git clone https://github.com/coraxx/netdata_nv_plugin --depth 1
 61 | 
 62 | sudo cp netdata_nv_plugin/nv.chart.py /usr/libexec/netdata/python.d/
 63 | 
 64 | sudo cp netdata_nv_plugin/python_modules/pynvml.py /usr/libexec/netdata/python.d/python_modules/
 65 | 
 66 | sudo cp netdata_nv_plugin/nv.conf /etc/netdata/python.d/
 67 | ```
 68 | 
 69 | 
 70 | ## Options ##
 71 | 
 72 | Options are set in the `nv.conf` file.
 73 | ### Memory Clock Factor ###
 74 | 
 75 | The plugin multiplies the memory clock reported by NVML by `nvMemFactor`. Set `nvMemFactor: 2` in the `nv.conf` file to display the doubled, "real" clock speed of [DDR RAM](https://en.wikipedia.org/wiki/DDR_SDRAM#Double_data_rate_.28DDR.29_SDRAM_specification), which transfers data twice per clock cycle. Default is `1`.
 76 | 
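As a rough illustration of the option above, the following Python sketch mirrors how `nv.chart.py` reads `nvMemFactor` and scales the memory clock. The helper names `parse_mem_factor` and `displayed_mem_clock` and the sample clock value are hypothetical and not part of the plugin.

```python
def parse_mem_factor(configuration):
    """Return nvMemFactor from a config dict, falling back to 1 (hypothetical helper)."""
    try:
        return int(configuration['nvMemFactor'])
    except (KeyError, TypeError, ValueError):
        # Missing or non-integer values fall back to the default of 1
        return 1


def displayed_mem_clock(reported_mhz, factor):
    """Scale the memory clock reported by NVML by the configured factor."""
    return reported_mhz * factor


# Sample value only; a real reading would come from NVML
print(displayed_mem_clock(3505, parse_mem_factor({'nvMemFactor': 2})))
```

With the default factor of `1` the reported value is shown unchanged.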
 77 | 
 78 | ### Legacy Mode ###
 79 | 
 80 | With older GPUs like my Nvidia GeForce 9600m gt, the load and clock frequencies cannot be read through the NVML library, so only temperature and memory usage are displayed.
 81 | 
 82 | Set `legacy: True` in the `nv.conf` file to poll GPU and memory load/frequency via the nvidia-settings application (also installed with the Nvidia driver).
 83 | 
 84 | *Tested under Ubuntu 16.04*
 85 | 
 86 | **IMPORTANT:** Legacy mode only works with a running X session, so it will **not** work on headless machines. If the X session is not owned by root, which is usually the case on e.g. Ubuntu, you **must** allow `root` to connect to it. To do so, run this command in a terminal as the user who owns the X session (i.e. the user logged into the desktop environment, e.g. GNOME):
 87 | 
 88 | `xhost +local:root`
 89 | 
 90 | Don't forget to restart NetData afterwards, e.g. with `sudo service netdata restart`.
 91 | 
 92 | For the sake of completeness: If you want to disable the root access to the X session again, execute:
 93 | 
 94 | `xhost -local:root`
 95 | 
 96 | 
 97 | 
 98 | ## Charts ##
 99 | 
100 | Depending on the graphics card, the following information is extracted:
101 | 
102 | - GPU, memory, encoder, and decoder load
103 | - Free and used memory
104 | - ECC errors (only for cards equipped with ECC memory e.g. Quadro cards)
105 | - Temperature
106 | - Fan speed
107 | - Clock frequency for GPU core, SM and memory
108 | 
109 | Readouts for units (S-class systems) are integrated but not tested yet. These add:
110 | 
111 | - intake, exhaust and board temperatures
112 | - PSU current, voltage and power
113 | - Fan rpm
114 | 
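The ECC chart has one line per memory location, counter type, and error type, so a card with ECC memory contributes 20 lines. The sketch below rebuilds the dimension ids that `nv.chart.py` uses for them; the helper `ecc_dimension_ids` is a hypothetical name introduced only for illustration.

```python
from itertools import product

LOCATIONS = ['L1_CACHE', 'L2_CACHE', 'DEVICE_MEMORY', 'REGISTER_FILE', 'TEXTURE_MEMORY']
COUNTERS = ['VOLATILE', 'AGGREGATE']
ERROR_TYPES = ['CORRECTED', 'UNCORRECTED']


def ecc_dimension_ids(gpu_index):
    """Build the ECC dimension ids for one GPU (hypothetical helper)."""
    return [
        'device_ecc_errors_{0}_{1}_{2}_{3}'.format(location, counter, error_type, gpu_index)
        for location, counter, error_type in product(LOCATIONS, COUNTERS, ERROR_TYPES)
    ]


ids = ecc_dimension_ids(0)
print(len(ids))
print(ids[0])
```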
115 | 
116 | ## Known Bugs/Issues ##
117 | 
118 | Bugs:
119 | * No known bugs at the moment
120 | 
121 | Issues:
122 | * While making this plugin compatible with Python 3, I encountered an old Python 2 style `print` in nvidia-ml-py's `pynvml.py` file. Line 1671, `print c_count.value`, must be changed to `print(c_count.value)` for the module to be importable under Python 3.
123 | You can apply this fix yourself or use the fixed version from this repo.
124 | 
125 | 
126 | ## FAQ ##
127 | 
128 | > Why does only one of my two whatever values show up in the whatever graph/chart?
129 | 
130 | Probably because the one not drawn in the graph is zero. For example with two graphics cards where only one is under load, chances are that NetData only draws the one with load > 0%. As soon as the other one also returns values > 0 it will be drawn as well.
131 | 
132 | 
133 | ## License ##
134 | 
135 | The MIT License (MIT)
136 | Copyright (c) 2016 Jan Arnold
137 | 
138 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
139 | 
140 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
141 | 
142 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
143 | 
144 | ## Version History ##
145 | 
146 | v0.6:
147 | * added encoder and decoder utilization
148 | * fixed memError/eccCounter assignment issue
149 | 
150 | v0.5:
151 | * fixed legacy mode (consult [Legacy Mode](#legacy-mode) for detailed information on usage)
152 | 
153 | v0.4:
154 | * debug, info and error message cleanup (still a lot of debug messages since legacy mode is not working yet)
155 | * more detailed error reporting for pynvml import
156 | * added nvMemFactor setting to config
157 | * changed default chart order
158 | 
159 | v0.3:
160 | * potential bugs fixed
161 | * fit for Python >=2.6 and >=3.2 (see known bugs section)
162 | 
163 | v0.2:
164 | * code cleanup (thanks to @paulfantom for the feedback)
165 | 
166 | v0.1:
167 | * initial release
168 | 
169 | 
170 | ## Contact ##
171 | 
172 | Who do I talk to?
173 | 
174 | * Repo owner or admin
175 | * Other community or team contact
176 | 
177 | 
178 | ## Screenshots ##
179 | 
180 | * Nvidia GeForce GTX 980 equipped PC running Ubuntu 16.04:
181 | 
182 | ![nv plugin screenshot 1](http://semper.space/netdata_nv/screenshot01.png "Netdata nv plugin")
183 | 
184 | 
185 | * Nvidia GeForce 9600m gt/9400m with legacy mode on a MacBook Pro late 2008 running Ubuntu 16.04:
186 | 
187 | ![nv plugin screenshot 2](http://semper.space/netdata_nv/screenshot02.png "Netdata nv plugin")
188 | 
189 | 


--------------------------------------------------------------------------------
/nv.chart.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | # -*- coding: utf-8 -*-
  3 | """
  4 | NetData plugin for Nvidia GPU stats.
  5 | 
  6 | Requirements:
  7 | #	- Nvidia driver installed (this plugin uses the NVML library)
  8 | #	- nvidia-ml-py Python package (Python NVML wrapper) installed or copy the 'pynvml.py' file
  9 | #	  from the 'nvidia-ml-py' package (https://pypi.python.org/pypi/nvidia-ml-py/7.352.0) to
 10 | #	  '/usr/libexec/netdata/python.d/python_modules/'. For use with Python >=3.2 please see known bugs
 11 | #	  in the README file.
 12 | 
 13 | 
 14 | The MIT License (MIT)
 15 | Copyright (c) 2016 Jan Arnold
 16 | 
 17 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
 18 | documentation files (the "Software"), to deal in the Software without restriction, including without limitation
 19 | the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
 20 | and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
 21 | 
 22 | The above copyright notice and this permission notice shall be included in all copies or substantial portions
 23 | of the Software.
 24 | 
 25 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
 26 | TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 27 | THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 28 | CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 29 | DEALINGS IN THE SOFTWARE.
 30 | 
 31 | # @Title			: nv.chart
 32 | # @Project			:
 33 | # @Description		: NetData plugin for Nvidia GPU stats
 34 | # @Author			: Jan Arnold
 35 | # @Email			: jan.arnold (at) coraxx.net
 36 | # @Copyright		: Copyright (C) 2016  Jan Arnold
 37 | # @License			: MIT
 38 | # @Credits			:
 39 | # @Maintainer		: Jan Arnold
 40 | # @Date				: 2018/11/02
 41 | # @Version			: 0.6
 42 | # @Status			: stable
 43 | # @Usage			: automatically processed by netdata
 44 | # @Notes			: With default NetData installation put this file under
 45 | #					: /usr/libexec/netdata/python.d/
 46 | #					: and the config file under /etc/netdata/python.d/
 47 | # @Python_version	: 2.7.12 and 3.5.2
 48 | """
 49 | # ======================================================================================================================
 50 | from bases.FrameworkServices.SimpleService import SimpleService
 51 | from subprocess import Popen, PIPE
 52 | from re import findall
 53 | try:
 54 | 	import pynvml
 55 | except (ImportError, SyntaxError) as e:
 56 | 	# No Service instance exists at import time, so 'self.error' is unavailable here; write to stderr instead
 57 | 	import sys
 58 | 	if isinstance(e, ImportError):
 59 | 		sys.stderr.write("Please install pynvml: pip install nvidia-ml-py\n")
 60 | 	else:
 61 | 		sys.stderr.write("Please fix line 1671 in pynvml.py file from the nvidia-ml-py package. 'print c_count.value' must be 'print(c_count.value)' to be compatible with Python >=3.2\n")
 62 | 	raise
 63 | 
 64 | ## Plugin settings
 65 | update_every = 1
 66 | priority = 60000
 67 | retries = 10
 68 | 
 69 | ORDER = ['load', 'memory', 'frequency', 'temperature', 'fan', 'ecc_errors']
 70 | 
 71 | CHARTS = {
 72 | 	'memory': {
 73 | 		'options': [None, 'Memory', 'MB', 'Memory', 'nv.memory', 'stacked'],
 74 | 		'lines': [
 75 | 			# generated dynamically
 76 | 		]},
 77 | 	'load': {
 78 | 		'options': [None, 'Load', '%', 'Load', 'nv.load', 'line'],
 79 | 		'lines': [
 80 | 			# generated dynamically
 81 | 		]},
 82 | 	'ecc_errors': {
 83 | 		'options': [None, 'ECC errors', 'counts', 'ECC', 'nv.ecc', 'line'],
 84 | 		'lines': [
 85 | 			# generated dynamically
 86 | 		]},
 87 | 	'temperature': {
 88 | 		'options': [None, 'GPU temperature', 'C', 'Temperature', 'nv.temperature', 'line'],
 89 | 		'lines': [
 90 | 			# generated dynamically
 91 | 		]},
 92 | 	'fan': {
 93 | 		'options': [None, 'Fan speed', '%', 'Fans', 'nv.fan', 'line'],
 94 | 		'lines': [
 95 | 			# generated dynamically
 96 | 		]},
 97 | 	'frequency': {
 98 | 		'options': [None, 'Frequency', 'MHz', 'Frequency', 'nv.frequency', 'line'],
 99 | 		'lines': [
100 | 			# generated dynamically
101 | 		]}
102 | }
103 | 
104 | 
105 | class Service(SimpleService):
106 | 	def __init__(self, configuration=None, name=None):
107 | 		SimpleService.__init__(self, configuration=configuration, name=name)
108 | 
109 | 		# Chart
110 | 		self.order = ORDER
111 | 		self.definitions = CHARTS
112 | 
113 | 	def check(self):
114 | 		## Check legacy mode
115 | 		try:
116 | 			self.legacy = self.configuration['legacy']
117 | 			if self.legacy == '': raise KeyError
118 | 			if self.legacy is True: self.info('Legacy mode set to True')
119 | 		except KeyError:
120 | 			self.legacy = False
121 | 			self.info("No legacy mode specified. Setting to 'False'")
122 | 
123 | 		## Real memory clock is double (DDR double data rate ram). Set nvMemFactor = 2 in conf for 'real' memory clock
124 | 		try:
125 | 			self.nvMemFactor = int(self.configuration['nvMemFactor'])
126 | 			if self.nvMemFactor == '': raise KeyError
127 | 			self.info("'nvMemFactor' set to:",str(self.nvMemFactor))
128 | 		except Exception as e:
129 | 			if isinstance(e, KeyError):
130 | 				self.info("No 'nvMemFactor' configured. Setting to 1")
131 | 			else:
132 | 				self.error("nvMemFactor in config file is not an int. Setting 'nvMemFactor' to 1", str(e))
133 | 			self.nvMemFactor = 1
134 | 
135 | 		## Initialize NVML
136 | 		try:
137 | 			pynvml.nvmlInit()
138 | 			self.info("Nvidia Driver Version:", str(pynvml.nvmlSystemGetDriverVersion()))
139 | 		except Exception as e:
140 | 			self.error("pynvml could not be initialized", str(e))
141 | 			# NVML failed to initialize, so there is nothing to shut down
142 | 			return False
143 | 
144 | 		## Get number of graphic cards
145 | 		try:
146 | 			self.unitCount = pynvml.nvmlUnitGetCount()
147 | 			self.deviceCount = pynvml.nvmlDeviceGetCount()
148 | 			self.debug("Unit count:", str(self.unitCount))
149 | 			self.debug("Device count", str(self.deviceCount))
150 | 		except Exception as e:
151 | 			self.error('Error getting number of Nvidia GPUs', str(e))
152 | 			pynvml.nvmlShutdown()
153 | 			return False
154 | 
155 | 		## Get graphic card names
156 | 		data = self._get_data()
157 | 		name = ''
158 | 		for i in range(self.deviceCount):
159 | 			if i == 0:
160 | 				name = name + str(data["device_name_" + str(i)]) + " [{0}]".format(i)
161 | 			else:
162 | 				name = name + ' | ' + str(data["device_name_" + str(i)]) + " [{0}]".format(i)
163 | 		self.info('Graphics Card(s) found:', name)
164 | 		for chart in self.definitions:
165 | 			self.definitions[chart]['options'][1] = self.definitions[chart]['options'][1] + ' for ' + name
166 | 		## Dynamically add lines
167 | 		for i in range(self.deviceCount):
168 | 			gpuIdx = str(i)
169 | 			## Memory
170 | 			if data['device_mem_used_'+str(i)] is not None:
171 | 				self.definitions['memory']['lines'].append(['device_mem_free_' + gpuIdx, 'free [{0}]'.format(i), 'absolute', 1, 1024**2])
172 | 				self.definitions['memory']['lines'].append(['device_mem_used_' + gpuIdx, 'used [{0}]'.format(i), 'absolute', 1, 1024**2])
173 | 			# self.definitions['memory']['lines'].append(['device_mem_total_' + gpuIdx, 'GPU:{0} total'.format(i), 'absolute', -1, 1024**2])
174 | 
175 | 			## Load/usage
176 | 			if data['device_load_gpu_' + gpuIdx] is not None:
177 | 				self.definitions['load']['lines'].append(['device_load_gpu_' + gpuIdx, 'gpu [{0}]'.format(i), 'absolute'])
178 | 				self.definitions['load']['lines'].append(['device_load_mem_' + gpuIdx, 'memory [{0}]'.format(i), 'absolute'])
179 | 
180 | 			## Encoder Utilization
181 | 			if data['device_load_enc_' + gpuIdx] is not None:
182 | 				self.definitions['load']['lines'].append(['device_load_enc_' + gpuIdx, 'enc [{0}]'.format(i), 'absolute'])
183 | 
184 | 			## Decoder Utilization
185 | 			if data['device_load_dec_' + gpuIdx] is not None:
186 | 				self.definitions['load']['lines'].append(['device_load_dec_' + gpuIdx, 'dec [{0}]'.format(i), 'absolute'])
187 | 
188 | 			## ECC errors
189 | 			if data['device_ecc_errors_L1_CACHE_VOLATILE_CORRECTED_' + gpuIdx] is not None:
190 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L1_CACHE_VOLATILE_CORRECTED_' + gpuIdx, 'L1 Cache Volatile Corrected [{0}]'.format(i), 'absolute'])
191 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L1_CACHE_VOLATILE_UNCORRECTED_' + gpuIdx, 'L1 Cache Volatile Uncorrected [{0}]'.format(i), 'absolute'])
192 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L1_CACHE_AGGREGATE_CORRECTED_' + gpuIdx, 'L1 Cache Aggregate Corrected [{0}]'.format(i), 'absolute'])
193 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L1_CACHE_AGGREGATE_UNCORRECTED_' + gpuIdx, 'L1 Cache Aggregate Uncorrected [{0}]'.format(i), 'absolute'])
194 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L2_CACHE_VOLATILE_CORRECTED_' + gpuIdx, 'L2 Cache Volatile Corrected [{0}]'.format(i), 'absolute'])
195 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L2_CACHE_VOLATILE_UNCORRECTED_' + gpuIdx, 'L2 Cache Volatile Uncorrected [{0}]'.format(i), 'absolute'])
196 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L2_CACHE_AGGREGATE_CORRECTED_' + gpuIdx, 'L2 Cache Aggregate Corrected [{0}]'.format(i), 'absolute'])
197 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_L2_CACHE_AGGREGATE_UNCORRECTED_' + gpuIdx, 'L2 Cache Aggregate Uncorrected [{0}]'.format(i), 'absolute'])
198 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_DEVICE_MEMORY_VOLATILE_CORRECTED_' + gpuIdx, 'Device Memory Volatile Corrected [{0}]'.format(i), 'absolute'])
199 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_DEVICE_MEMORY_VOLATILE_UNCORRECTED_' + gpuIdx, 'Device Memory Volatile Uncorrected [{0}]'.format(i), 'absolute'])
200 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_DEVICE_MEMORY_AGGREGATE_CORRECTED_' + gpuIdx, 'Device Memory Aggregate Corrected [{0}]'.format(i), 'absolute'])
201 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_DEVICE_MEMORY_AGGREGATE_UNCORRECTED_' + gpuIdx, 'Device Memory Aggregate Uncorrected [{0}]'.format(i), 'absolute'])
202 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_REGISTER_FILE_VOLATILE_CORRECTED_' + gpuIdx, 'Register File Volatile Corrected [{0}]'.format(i), 'absolute'])
203 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_REGISTER_FILE_VOLATILE_UNCORRECTED_' + gpuIdx, 'Register File Volatile Uncorrected [{0}]'.format(i), 'absolute'])
204 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_REGISTER_FILE_AGGREGATE_CORRECTED_' + gpuIdx, 'Register File Aggregate Corrected [{0}]'.format(i), 'absolute'])
205 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_REGISTER_FILE_AGGREGATE_UNCORRECTED_' + gpuIdx, 'Register File Aggregate Uncorrected [{0}]'.format(i), 'absolute'])
206 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_TEXTURE_MEMORY_VOLATILE_CORRECTED_' + gpuIdx, 'Texture Memory Volatile Corrected [{0}]'.format(i), 'absolute'])
207 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_TEXTURE_MEMORY_VOLATILE_UNCORRECTED_' + gpuIdx, 'Texture Memory Volatile Uncorrected [{0}]'.format(i), 'absolute'])
208 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_TEXTURE_MEMORY_AGGREGATE_CORRECTED_' + gpuIdx, 'Texture Memory Aggregate Corrected [{0}]'.format(i), 'absolute'])
209 | 				self.definitions['ecc_errors']['lines'].append(['device_ecc_errors_TEXTURE_MEMORY_AGGREGATE_UNCORRECTED_' + gpuIdx, 'Texture Memory Aggregate Uncorrected [{0}]'.format(i), 'absolute'])
210 | 
211 | 			## Temperature
212 | 			if data['device_temp_' + gpuIdx] is not None:
213 | 				self.definitions['temperature']['lines'].append(['device_temp_' + gpuIdx, 'GPU:{0}'.format(i), 'absolute'])
214 | 
215 | 			## Fan
216 | 			if data['device_fanspeed_' + gpuIdx] is not None:
217 | 				self.definitions['fan']['lines'].append(['device_fanspeed_' + gpuIdx, 'GPU:{0}'.format(i), 'absolute'])
218 | 
219 | 			## GPU and Memory frequency
220 | 			if data['device_core_clock_' + gpuIdx] is not None:
221 | 				self.definitions['frequency']['lines'].append(['device_core_clock_' + gpuIdx, 'core [{0}]'.format(i), 'absolute'])
222 | 				self.definitions['frequency']['lines'].append(['device_mem_clock_' + gpuIdx, 'memory [{0}]'.format(i), 'absolute'])
223 | 			## SM frequency, usually same as GPU - handled extra here because of legacy mode
224 | 			if data['device_sm_clock_' + gpuIdx] is not None:
225 | 				self.definitions['frequency']['lines'].append(['device_sm_clock_' + gpuIdx, 'sm [{0}]'.format(i), 'absolute'])
226 | 
227 | 		## Check if GPU Units are installed and add charts
228 | 		if self.unitCount:
229 | 			for i in range(self.unitCount):
230 | 				gpuIdx = str(i)
231 | 				if data['unit_temp_intake_' + gpuIdx] is not None:
232 | 					self.definitions['temperature']['lines'].append(['unit_temp_intake_' + gpuIdx, 'intake (unit {0})'.format(i), 'absolute'])
233 | 					self.definitions['temperature']['lines'].append(['unit_temp_exhaust_' + gpuIdx, 'exhaust (unit {0})'.format(i), 'absolute'])
234 | 					self.definitions['temperature']['lines'].append(['unit_temp_board_' + gpuIdx, 'board (unit {0})'.format(i), 'absolute'])
235 | 				if data['unit_fan_speed_' + gpuIdx] is not None:
236 | 					if 'unit_fan' not in self.definitions:  # create the chart once, then add one line per unit
237 | 						self.order.append('unit_fan')
238 | 						self.definitions['unit_fan'] = {'options': [None, 'Unit fan', 'rpm', 'Unit Fans', 'nv.unit', 'line'], 'lines': []}
239 | 					self.definitions['unit_fan']['lines'].append(['unit_fan_speed_' + gpuIdx, 'Unit{0}'.format(i), 'absolute'])
240 | 				if data['unit_psu_current_' + gpuIdx] is not None:
241 | 					if 'unit_psu' not in self.definitions:  # create the chart once, then add the lines of every unit
242 | 						self.order.append('unit_psu')
243 | 						self.definitions['unit_psu'] = {'options': [None, 'Unit PSU', 'mixed', 'Unit PSU', 'nv.unit', 'line'], 'lines': []}
244 | 					self.definitions['unit_psu']['lines'].extend([
245 | 						['unit_psu_current_' + gpuIdx, 'current (A) (unit {0})'.format(i), 'absolute'],
246 | 						['unit_psu_power_' + gpuIdx, 'power (W) (unit {0})'.format(i), 'absolute'],
247 | 						['unit_psu_voltage_' + gpuIdx, 'voltage (V) (unit {0})'.format(i), 'absolute']])
248 | 		return True
249 | 
250 | 	def _get_data(self):
251 | 		data = {}
252 | 
253 | 		if self.deviceCount:
254 | 			for i in range(self.deviceCount):
255 | 				gpuIdx = str(i)
256 | 				handle = pynvml.nvmlDeviceGetHandleByIndex(i)
257 | 				name = pynvml.nvmlDeviceGetName(handle)
258 | 				brand = pynvml.nvmlDeviceGetBrand(handle)
259 | 
260 | 				### Get data ###
261 | 				## Memory usage
262 | 				try:
263 | 					mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
264 | 				except Exception as e:
265 | 					self.debug(str(e))
266 | 					mem = None
267 | 
268 | 				## ECC errors
269 | 				try:
270 | 					eccErrors = {}
271 | 					eccCounterType = ['VOLATILE_ECC', 'AGGREGATE_ECC']
272 | 					memErrorType = ['ERROR_TYPE_CORRECTED', 'ERROR_TYPE_UNCORRECTED']
273 | 					memoryLocationType = ['L1_CACHE', 'L2_CACHE', 'DEVICE_MEMORY', 'REGISTER_FILE', 'TEXTURE_MEMORY']
274 | 					for memoryLocation in range(5):
275 | 						_eccCounter = {}  # new dict per memory location, so locations do not share one object
276 | 						for eccCounter in range(2):
277 | 							_memError = {}  # new dict per counter type, so volatile and aggregate counts stay separate
278 | 							for memError in range(2):
279 | 								_memError[memErrorType[memError]] = pynvml.nvmlDeviceGetMemoryErrorCounter(handle,memError,eccCounter,memoryLocation)
280 | 							_eccCounter[eccCounterType[eccCounter]] = _memError
281 | 						eccErrors[memoryLocationType[memoryLocation]] = _eccCounter
282 | 				except Exception as e:
283 | 					self.debug(str(e))
284 | 					eccErrors = None
285 | 
286 | 				## Temperature
287 | 				try:
288 | 					temp = pynvml.nvmlDeviceGetTemperature(handle,pynvml.NVML_TEMPERATURE_GPU)
289 | 				except Exception as e:
290 | 					self.debug(str(e))
291 | 					temp = None
292 | 
293 | 				## Fan
294 | 				try:
295 | 					fanspeed = pynvml.nvmlDeviceGetFanSpeed(handle)
296 | 				except Exception as e:
297 | 					self.debug(str(e))
298 | 					fanspeed = None
299 | 
300 | 				## GPU and Memory Utilization
301 | 				try:
302 | 					util = pynvml.nvmlDeviceGetUtilizationRates(handle)
303 | 					gpu_util = util.gpu
304 | 					mem_util = util.memory
305 | 				except Exception as e:
306 | 					self.debug(str(e))
307 | 					gpu_util = None
308 | 					mem_util = None
309 | 
310 | 				## Encoder Utilization
311 | 				try:
312 | 					encoder = pynvml.nvmlDeviceGetEncoderUtilization(handle)
313 | 					enc_util = encoder[0]
314 | 				except Exception as e:
315 | 					self.debug(str(e))
316 | 					enc_util = None
317 | 
318 | 				## Decoder Utilization
319 | 				try:
320 | 					decoder = pynvml.nvmlDeviceGetDecoderUtilization(handle)
321 | 					dec_util = decoder[0]
322 | 				except Exception as e:
323 | 					self.debug(str(e))
324 | 					dec_util = None
325 | 
326 | 				## Clock frequencies
327 | 				try:
328 | 					clock_core = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_GRAPHICS)
329 | 					clock_sm = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_SM)
330 | 					clock_mem = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_MEM) * self.nvMemFactor
331 | 				except Exception as e:
332 | 					self.debug(str(e))
333 | 					clock_core = None
334 | 					clock_sm = None
335 | 					clock_mem = None
336 | 
337 | 				### Packing data ###
338 | 				self.debug("Device", gpuIdx, ":", str(name))
339 | 				data["device_name_" + gpuIdx] = name
340 | 
341 | 				self.debug("Brand:", str(brand))
342 | 
343 | 				self.debug(str(name), "Temp      :", str(temp))
344 | 				data["device_temp_" + gpuIdx] = temp
345 | 
346 | 				self.debug(str(name), "Mem total :", str(getattr(mem, 'total', None)), 'bytes')  # mem is None if the query failed
347 | 				data["device_mem_total_" + gpuIdx] = getattr(mem, 'total', None)
348 | 
349 | 				self.debug(str(name), "Mem used  :", str(getattr(mem, 'used', None)), 'bytes')
350 | 				data["device_mem_used_" + gpuIdx] = getattr(mem, 'used', None)
351 | 
352 | 				self.debug(str(name), "Mem free  :", str(getattr(mem, 'free', None)), 'bytes')
353 | 				data["device_mem_free_" + gpuIdx] = getattr(mem, 'free', None)
354 | 
355 | 				self.debug(str(name), "Load GPU  :", str(gpu_util), '%')
356 | 				data["device_load_gpu_" + gpuIdx] = gpu_util
357 | 
358 | 				self.debug(str(name), "Load MEM  :", str(mem_util), '%')
359 | 				data["device_load_mem_" + gpuIdx] = mem_util
360 | 
361 | 				self.debug(str(name), "Load ENC  :", str(enc_util), '%')
362 | 				data["device_load_enc_" + gpuIdx] = enc_util
363 | 
364 | 				self.debug(str(name), "Load DEC  :", str(dec_util), '%')
365 | 				data["device_load_dec_" + gpuIdx] = dec_util
366 | 
367 | 				self.debug(str(name), "Core clock:", str(clock_core), 'MHz')
368 | 				data["device_core_clock_" + gpuIdx] = clock_core
369 | 
370 | 				self.debug(str(name), "SM clock  :", str(clock_sm), 'MHz')
371 | 				data["device_sm_clock_" + gpuIdx] = clock_sm
372 | 
373 | 				self.debug(str(name), "Mem clock :", str(clock_mem), 'MHz')
374 | 				data["device_mem_clock_" + gpuIdx] = clock_mem
375 | 
376 | 				self.debug(str(name), "Fan speed :", str(fanspeed), '%')
377 | 				data["device_fanspeed_" + gpuIdx] = fanspeed
378 | 
379 | 				self.debug(str(name), "ECC errors:", str(eccErrors))
380 | 				if eccErrors is not None:
381 | 					data["device_ecc_errors_L1_CACHE_VOLATILE_CORRECTED_" + gpuIdx] = eccErrors["L1_CACHE"]["VOLATILE_ECC"]["ERROR_TYPE_CORRECTED"]
382 | 					data["device_ecc_errors_L1_CACHE_VOLATILE_UNCORRECTED_" + gpuIdx] = eccErrors["L1_CACHE"]["VOLATILE_ECC"]["ERROR_TYPE_UNCORRECTED"]
383 | 					data["device_ecc_errors_L1_CACHE_AGGREGATE_CORRECTED_" + gpuIdx] = eccErrors["L1_CACHE"]["AGGREGATE_ECC"]["ERROR_TYPE_CORRECTED"]
384 | 					data["device_ecc_errors_L1_CACHE_AGGREGATE_UNCORRECTED_" + gpuIdx] = eccErrors["L1_CACHE"]["AGGREGATE_ECC"]["ERROR_TYPE_UNCORRECTED"]
385 | 					data["device_ecc_errors_L2_CACHE_VOLATILE_CORRECTED_" + gpuIdx] = eccErrors["L2_CACHE"]["VOLATILE_ECC"]["ERROR_TYPE_CORRECTED"]
386 | 					data["device_ecc_errors_L2_CACHE_VOLATILE_UNCORRECTED_" + gpuIdx] = eccErrors["L2_CACHE"]["VOLATILE_ECC"]["ERROR_TYPE_UNCORRECTED"]
387 | 					data["device_ecc_errors_L2_CACHE_AGGREGATE_CORRECTED_" + gpuIdx] = eccErrors["L2_CACHE"]["AGGREGATE_ECC"]["ERROR_TYPE_CORRECTED"]
388 | 					data["device_ecc_errors_L2_CACHE_AGGREGATE_UNCORRECTED_" + gpuIdx] = eccErrors["L2_CACHE"]["AGGREGATE_ECC"]["ERROR_TYPE_UNCORRECTED"]
389 | 					data["device_ecc_errors_DEVICE_MEMORY_VOLATILE_CORRECTED_" + gpuIdx] = eccErrors["DEVICE_MEMORY"]["VOLATILE_ECC"]["ERROR_TYPE_CORRECTED"]
390 | 					data["device_ecc_errors_DEVICE_MEMORY_VOLATILE_UNCORRECTED_" + gpuIdx] = eccErrors["DEVICE_MEMORY"]["VOLATILE_ECC"]["ERROR_TYPE_UNCORRECTED"]
391 | 					data["device_ecc_errors_DEVICE_MEMORY_AGGREGATE_CORRECTED_" + gpuIdx] = eccErrors["DEVICE_MEMORY"]["AGGREGATE_ECC"]["ERROR_TYPE_CORRECTED"]
392 | 					data["device_ecc_errors_DEVICE_MEMORY_AGGREGATE_UNCORRECTED_" + gpuIdx] = eccErrors["DEVICE_MEMORY"]["AGGREGATE_ECC"]["ERROR_TYPE_UNCORRECTED"]
393 | 					data["device_ecc_errors_REGISTER_FILE_VOLATILE_CORRECTED_" + gpuIdx] = eccErrors["REGISTER_FILE"]["VOLATILE_ECC"]["ERROR_TYPE_CORRECTED"]
394 | 					data["device_ecc_errors_REGISTER_FILE_VOLATILE_UNCORRECTED_" + gpuIdx] = eccErrors["REGISTER_FILE"]["VOLATILE_ECC"]["ERROR_TYPE_UNCORRECTED"]
395 | 					data["device_ecc_errors_REGISTER_FILE_AGGREGATE_CORRECTED_" + gpuIdx] = eccErrors["REGISTER_FILE"]["AGGREGATE_ECC"]["ERROR_TYPE_CORRECTED"]
396 | 					data["device_ecc_errors_REGISTER_FILE_AGGREGATE_UNCORRECTED_" + gpuIdx] = eccErrors["REGISTER_FILE"]["AGGREGATE_ECC"]["ERROR_TYPE_UNCORRECTED"]
397 | 					data["device_ecc_errors_TEXTURE_MEMORY_VOLATILE_CORRECTED_" + gpuIdx] = eccErrors["TEXTURE_MEMORY"]["VOLATILE_ECC"]["ERROR_TYPE_CORRECTED"]
398 | 					data["device_ecc_errors_TEXTURE_MEMORY_VOLATILE_UNCORRECTED_" + gpuIdx] = eccErrors["TEXTURE_MEMORY"]["VOLATILE_ECC"]["ERROR_TYPE_UNCORRECTED"]
399 | 					data["device_ecc_errors_TEXTURE_MEMORY_AGGREGATE_CORRECTED_" + gpuIdx] = eccErrors["TEXTURE_MEMORY"]["AGGREGATE_ECC"]["ERROR_TYPE_CORRECTED"]
400 | 					data["device_ecc_errors_TEXTURE_MEMORY_AGGREGATE_UNCORRECTED_" + gpuIdx] = eccErrors["TEXTURE_MEMORY"]["AGGREGATE_ECC"]["ERROR_TYPE_UNCORRECTED"]
401 | 				else:
402 | 					data["device_ecc_errors_L1_CACHE_VOLATILE_CORRECTED_" + gpuIdx] = None
403 | 
404 | 		## Get unit (S-class Nvidia cards) data
405 | 		if self.unitCount:
406 | 			for i in range(self.unitCount):
407 | 				gpuIdx = str(i)
408 | 				handle = pynvml.nvmlUnitGetHandleByIndex(i)
409 | 
410 | 				try:
411 | 					fan = pynvml.nvmlUnitGetFanSpeedInfo(handle)
412 | 					fan_speed = fan.speed  # Fan speed (RPM)
413 | 					fan_state = fan.state  # Flag that indicates whether fan is working properly
414 | 				except Exception as e:
415 | 					self.debug(str(e))
416 | 					fan_speed = None
417 | 					fan_state = None
418 | 
419 | 				try:
420 | 					psu = pynvml.nvmlUnitGetPsuInfo(handle)
421 | 					psu_current = psu.current  # PSU current (A)
422 | 					psu_power = psu.power  # PSU power draw (W)
423 | 					psu_state = psu.state  # The power supply state
424 | 					psu_voltage = psu.voltage  # PSU voltage (V)
425 | 				except Exception as e:
426 | 					self.debug(str(e))
427 | 					psu_current = None
428 | 					psu_power = None
429 | 					psu_state = None
430 | 					psu_voltage = None
431 | 
432 | 				try:
433 | 					temp_intake = pynvml.nvmlUnitGetTemperature(handle,0)  # Temperature at intake in C
434 | 					temp_exhaust = pynvml.nvmlUnitGetTemperature(handle,1)  # Temperature at exhaust in C
435 | 					temp_board = pynvml.nvmlUnitGetTemperature(handle,2)  # Temperature on board in C
436 | 				except Exception as e:
437 | 					self.debug(str(e))
438 | 					temp_intake = None
439 | 					temp_exhaust = None
440 | 					temp_board = None
441 | 
442 | 				self.debug('Unit fan speed:',str(fan_speed))
443 | 				data["unit_fan_speed_" + gpuIdx] = fan_speed
444 | 
445 | 				self.debug('Unit fan state:',str(fan_state))
446 | 				data["unit_fan_state_" + gpuIdx] = fan_state
447 | 
448 | 				self.debug('Unit PSU current:',str(psu_current))
449 | 				data["unit_psu_current_" + gpuIdx] = psu_current
450 | 
451 | 				self.debug('Unit PSU power:', str(psu_power))
452 | 				data["unit_psu_power_" + gpuIdx] = psu_power
453 | 
454 | 				self.debug('Unit PSU state:', str(psu_state))
455 | 				data["unit_psu_state_" + gpuIdx] = psu_state
456 | 
457 | 				self.debug('Unit PSU voltage:', str(psu_voltage))
458 | 				data["unit_psu_voltage_" + gpuIdx] = psu_voltage
459 | 
460 | 				self.debug('Unit temp intake:', str(temp_intake))
461 | 				data["unit_temp_intake_" + gpuIdx] = temp_intake
462 | 
463 | 				self.debug('Unit temp exhaust:', str(temp_exhaust))
464 | 				data["unit_temp_exhaust_" + gpuIdx] = temp_exhaust
465 | 
466 | 				self.debug('Unit temp board:', str(temp_board))
467 | 				data["unit_temp_board_" + gpuIdx] = temp_board
468 | 
469 | 		## Get data via legacy mode
470 | 		if self.legacy:
471 | 			try:
472 | 				output, error = Popen(
473 | 					[
474 | 						"nvidia-settings",
475 | 						"-c", ":0",
476 | 						"-q", "GPUUtilization",
477 | 						"-q", "GPUCurrentClockFreqs",
478 | 						"-q", "GPUCoreTemp",
479 | 						"-q", "TotalDedicatedGPUMemory",
480 | 						"-q", "UsedDedicatedGPUMemory"
481 | 					],
482 | 					shell=False,
483 | 					stdout=PIPE,stderr=PIPE).communicate()
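484 | 				output = repr(str(output))  # repr() escapes newlines, so the '.*?' patterns below see a single line
485 | 				if len(output) < 800:  # heuristic: a response this short is treated as a failed query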
486 | 					raise Exception('Error in fetching data from nvidia-settings ' + output)
487 | 				self.debug(str(error), output)
488 | 			except Exception as e:
489 | 				self.error(str(e))
490 | 				self.error('Setting legacy mode to False')
491 | 				self.legacy = False
492 | 				return data
493 | 			for i in range(self.deviceCount):
494 | 				gpuIdx = str(i)
495 | 				if data["device_temp_" + gpuIdx] is None:
496 | 					try:
497 | 						coreTemp = findall(r'GPUCoreTemp.*?(gpu:\d*).*?\s(\d*)', output)[i][1]
498 | 						data["device_temp_" + gpuIdx] = int(coreTemp)
499 | 						self.debug('Using legacy temp for GPU {0}: {1}'.format(gpuIdx, coreTemp))
500 | 					except Exception as e:
501 | 						self.debug(str(e), "skipping device_temp_" + gpuIdx)
502 | 				if data["device_mem_used_" + gpuIdx] is None:
503 | 					try:
504 | 						memUsed = findall(r'UsedDedicatedGPUMemory.*?(gpu:\d*).*?\s(\d*)', output)[i][1]
505 | 						data["device_mem_used_" + gpuIdx] = int(memUsed)
506 | 						self.debug('Using legacy mem_used for GPU {0}: {1}'.format(gpuIdx, memUsed))
507 | 					except Exception as e:
508 | 						self.debug(str(e), "skipping device_mem_used_" + gpuIdx)
509 | 				if data["device_load_gpu_" + gpuIdx] is None:
510 | 					try:
511 | 						gpu_util = findall(r'(gpu:\d*).*?graphics=(\d*),.*?memory=(\d*)', output)[i][1]
512 | 						data["device_load_gpu_" + gpuIdx] = int(gpu_util)
513 | 						self.debug('Using legacy load_gpu for GPU {0}: {1}'.format(gpuIdx, gpu_util))
514 | 					except Exception as e:
515 | 						self.debug(str(e), "skipping device_load_gpu_" + gpuIdx)
516 | 				if data["device_load_mem_" + gpuIdx] is None:
517 | 					try:
518 | 						mem_util = findall(r'(gpu:\d*).*?graphics=(\d*),.*?memory=(\d*)', output)[i][2]
519 | 						data["device_load_mem_" + gpuIdx] = int(mem_util)
520 | 						self.debug('Using legacy load_mem for GPU {0}: {1}'.format(gpuIdx, mem_util))
521 | 					except Exception as e:
522 | 						self.debug(str(e), "skipping device_load_mem_" + gpuIdx)
523 | 				if data["device_core_clock_" + gpuIdx] is None:
524 | 					try:
525 | 						clock_core = findall(r'GPUCurrentClockFreqs.*?(gpu:\d*).*?(\d*),(\d*)', output)[i][1]
526 | 						data["device_core_clock_" + gpuIdx] = int(clock_core)
527 | 						self.debug('Using legacy core_clock for GPU {0}: {1}'.format(gpuIdx, clock_core))
528 | 					except Exception as e:
529 | 						self.debug(str(e), "skipping device_core_clock_" + gpuIdx)
530 | 				if data["device_mem_clock_" + gpuIdx] is None:
531 | 					try:
532 | 						clock_mem = findall(r'GPUCurrentClockFreqs.*?(gpu:\d*).*?(\d*),(\d*)', output)[i][2]
533 | 						data["device_mem_clock_" + gpuIdx] = int(clock_mem)
534 | 						self.debug('Using legacy mem_clock for GPU {0}: {1}'.format(gpuIdx, clock_mem))
535 | 					except Exception as e:
536 | 						self.debug(str(e), "skipping device_mem_clock_" + gpuIdx)
537 | 
538 | 		return data
539 | 
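
The legacy branch above extracts per-GPU values from the text printed by `nvidia-settings`. The following is a minimal, standalone sketch of that idea, not the plugin's own code. The `SAMPLE` text is an assumption modelled on typical `nvidia-settings -q` output (the real format can differ between driver versions), and `parse_legacy`, `SCALAR` and `UTIL` are hypothetical names introduced only for illustration.

```python
import re

# ASSUMPTION: hypothetical sample text; real nvidia-settings output may differ.
SAMPLE = """
  Attribute 'GPUUtilization' (host:0[gpu:0]): graphics=27, memory=11, video=0, PCIe=1
  Attribute 'GPUCoreTemp' (host:0[gpu:0]): 54.
  Attribute 'UsedDedicatedGPUMemory' (host:0[gpu:0]): 312.
"""

# Attributes that carry a single integer value, e.g. "...[gpu:0]): 54."
SCALAR = re.compile(r"Attribute '(\w+)' \(.*?\[gpu:(\d+)\]\): (\d+)\.")
# The utilization attribute carries several key=value pairs on one line.
UTIL = re.compile(
    r"Attribute 'GPUUtilization' \(.*?\[gpu:(\d+)\]\): graphics=(\d+), memory=(\d+)"
)


def parse_legacy(text):
    """Return {gpu_index: {metric_name: int}} parsed from the given text."""
    result = {}
    for name, gpu, value in SCALAR.findall(text):
        result.setdefault(int(gpu), {})[name] = int(value)
    for gpu, graphics, memory in UTIL.findall(text):
        entry = result.setdefault(int(gpu), {})
        entry["load_gpu"] = int(graphics)
        entry["load_mem"] = int(memory)
    return result


parsed = parse_legacy(SAMPLE)
print(parsed)
```

Keying the result by the GPU index found in the text, instead of by the position of a match in the `findall` list, avoids an `IndexError` when one GPU is missing from the output.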


--------------------------------------------------------------------------------
/nv.conf:
--------------------------------------------------------------------------------
 1 | # netdata python.d.plugin configuration for nv
 2 | #
 3 | # Set "legacy: True" for old Nvidia graphics cards.
 4 | #
 5 | #       With older GPUs like my Nvidia GeForce 9600M GT, the load and clock frequencies
 6 | #       cannot be read by the NVML lib. Only temperature and memory usage are displayed.
 7 | #
 8 | #       Set "legacy: True" in the "nv.conf" file to poll GPU and memory load/frequency
 9 | #       via the nvidia-settings application (also installed with the Nvidia driver).
10 | #
11 | #       IMPORTANT: This legacy mode only works with a running X session, so it will NOT
12 | #       work on headless clients. If the X session is not hosted by root, which is
13 | #       usually the case on e.g. Ubuntu, you **must** allow "root" to connect to
14 | #       the X session. To do that, run this command in a terminal as the user who owns
15 | #       the X session (i.e. the user logged in to the desktop environment, e.g. GNOME;
16 | #       tested under Ubuntu 16.04):
17 | #
18 | #       $ xhost +local:root
19 | #
20 | #       Don't forget to restart NetData afterwards with e.g. "sudo service netdata restart"
21 | #
22 | #       For the sake of completeness: If you want to disable the root access to the X session
23 | #       again, execute:
24 | #
25 | #       $ xhost -local:root
26 | #
27 | # Set "nvMemFactor" to 2 if you want to display "the real clock speed"
28 | # (https://en.wikipedia.org/wiki/DDR_SDRAM#Double_data_rate_.28DDR.29_SDRAM_specification)
29 | 
30 | update_every : 1
31 | retries      : 10
32 | priority     : 20000
33 | 
34 | legacy       : False
35 | nvMemFactor  : 1
36 | 
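
As a rough illustration of the `nvMemFactor` option described in the comments above, the sketch below scales a reported memory clock by the configured factor. It assumes the factor is applied as a plain multiplier, which the config comments suggest but do not state. `displayed_mem_clock` is a hypothetical helper and not a function of the plugin, and the clock value is made up.

```python
def displayed_mem_clock(reported_mhz, nv_mem_factor=1):
    """Scale the reported memory clock (MHz) by the configured factor.

    ASSUMPTION: the factor is a simple multiplier; None means "no reading".
    """
    if reported_mhz is None:
        return None
    return reported_mhz * nv_mem_factor


# With the default factor the value is passed through unchanged.
default_view = displayed_mem_clock(3505)
# With "nvMemFactor: 2" the displayed value is doubled.
doubled_view = displayed_mem_clock(3505, 2)
print(default_view, doubled_view)
```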


--------------------------------------------------------------------------------
/python_modules/pynvml.py:
--------------------------------------------------------------------------------
   1 | #####
   2 | # Copyright (c) 2011-2015, NVIDIA Corporation.  All rights reserved.
   3 | #
   4 | # Redistribution and use in source and binary forms, with or without
   5 | # modification, are permitted provided that the following conditions are met:
   6 | # 
   7 | #    * Redistributions of source code must retain the above copyright notice,
   8 | #      this list of conditions and the following disclaimer.
   9 | #    * Redistributions in binary form must reproduce the above copyright
  10 | #      notice, this list of conditions and the following disclaimer in the
  11 | #      documentation and/or other materials provided with the distribution.
  12 | #    * Neither the name of the NVIDIA Corporation nor the names of its
  13 | #      contributors may be used to endorse or promote products derived from
  14 | #      this software without specific prior written permission.
  15 | #
  16 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
  17 | # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  18 | # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  19 | # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
  20 | # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
  21 | # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 
  22 | # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 
  23 | # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
  24 | # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 
  25 | # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF 
  26 | # THE POSSIBILITY OF SUCH DAMAGE.
  27 | #####
  28 | 
  29 | ##
  30 | # Python bindings for the NVML library
  31 | ##
  32 | from ctypes import *
  33 | from ctypes.util import find_library
  34 | import sys
  35 | import os
  36 | import threading
  37 | import string
  38 |     
  39 | ## C Type mappings ##
  40 | ## Enums
  41 | _nvmlEnableState_t = c_uint
  42 | NVML_FEATURE_DISABLED    = 0
  43 | NVML_FEATURE_ENABLED     = 1
  44 | 
  45 | _nvmlBrandType_t = c_uint
  46 | NVML_BRAND_UNKNOWN = 0
  47 | NVML_BRAND_QUADRO  = 1
  48 | NVML_BRAND_TESLA   = 2
  49 | NVML_BRAND_NVS     = 3
  50 | NVML_BRAND_GRID    = 4
  51 | NVML_BRAND_GEFORCE = 5
  52 | NVML_BRAND_COUNT   = 6
  53 | 
  54 | _nvmlTemperatureThresholds_t = c_uint
  55 | NVML_TEMPERATURE_THRESHOLD_SHUTDOWN = 0
  56 | NVML_TEMPERATURE_THRESHOLD_SLOWDOWN = 1
  57 | NVML_TEMPERATURE_THRESHOLD_COUNT = 2
  58 | 
  59 | _nvmlTemperatureSensors_t = c_uint
  60 | NVML_TEMPERATURE_GPU     = 0
  61 | NVML_TEMPERATURE_COUNT   = 1
  62 | 
  63 | _nvmlComputeMode_t = c_uint
  64 | NVML_COMPUTEMODE_DEFAULT           = 0
  65 | NVML_COMPUTEMODE_EXCLUSIVE_THREAD  = 1
  66 | NVML_COMPUTEMODE_PROHIBITED        = 2
  67 | NVML_COMPUTEMODE_EXCLUSIVE_PROCESS = 3
  68 | NVML_COMPUTEMODE_COUNT             = 4
  69 | 
  70 | _nvmlMemoryLocation_t = c_uint
  71 | NVML_MEMORY_LOCATION_L1_CACHE = 0
  72 | NVML_MEMORY_LOCATION_L2_CACHE = 1
  73 | NVML_MEMORY_LOCATION_DEVICE_MEMORY = 2
  74 | NVML_MEMORY_LOCATION_REGISTER_FILE = 3
  75 | NVML_MEMORY_LOCATION_TEXTURE_MEMORY = 4
  76 | NVML_MEMORY_LOCATION_COUNT = 5
  77 | 
  78 | # These are deprecated, instead use _nvmlMemoryErrorType_t
  79 | _nvmlEccBitType_t = c_uint
  80 | NVML_SINGLE_BIT_ECC    = 0
  81 | NVML_DOUBLE_BIT_ECC    = 1
  82 | NVML_ECC_ERROR_TYPE_COUNT = 2
  83 | 
  84 | _nvmlEccCounterType_t = c_uint
  85 | NVML_VOLATILE_ECC      = 0
  86 | NVML_AGGREGATE_ECC     = 1
  87 | NVML_ECC_COUNTER_TYPE_COUNT = 2
  88 | 
  89 | _nvmlMemoryErrorType_t = c_uint
  90 | NVML_MEMORY_ERROR_TYPE_CORRECTED   = 0
  91 | NVML_MEMORY_ERROR_TYPE_UNCORRECTED = 1
  92 | NVML_MEMORY_ERROR_TYPE_COUNT       = 2
  93 | 
  94 | _nvmlClockType_t = c_uint
  95 | NVML_CLOCK_GRAPHICS  = 0
  96 | NVML_CLOCK_SM        = 1
  97 | NVML_CLOCK_MEM       = 2
  98 | NVML_CLOCK_COUNT     = 3
  99 | 
 100 | _nvmlDriverModel_t = c_uint
 101 | NVML_DRIVER_WDDM       = 0
 102 | NVML_DRIVER_WDM        = 1
 103 | 
 104 | _nvmlPstates_t = c_uint
 105 | NVML_PSTATE_0               = 0
 106 | NVML_PSTATE_1               = 1
 107 | NVML_PSTATE_2               = 2
 108 | NVML_PSTATE_3               = 3
 109 | NVML_PSTATE_4               = 4
 110 | NVML_PSTATE_5               = 5
 111 | NVML_PSTATE_6               = 6
 112 | NVML_PSTATE_7               = 7
 113 | NVML_PSTATE_8               = 8
 114 | NVML_PSTATE_9               = 9
 115 | NVML_PSTATE_10              = 10
 116 | NVML_PSTATE_11              = 11
 117 | NVML_PSTATE_12              = 12
 118 | NVML_PSTATE_13              = 13
 119 | NVML_PSTATE_14              = 14
 120 | NVML_PSTATE_15              = 15
 121 | NVML_PSTATE_UNKNOWN         = 32
 122 | 
 123 | _nvmlInforomObject_t = c_uint
 124 | NVML_INFOROM_OEM            = 0
 125 | NVML_INFOROM_ECC            = 1
 126 | NVML_INFOROM_POWER          = 2
 127 | NVML_INFOROM_COUNT          = 3
 128 | 
 129 | _nvmlReturn_t = c_uint
 130 | NVML_SUCCESS                   = 0
 131 | NVML_ERROR_UNINITIALIZED       = 1
 132 | NVML_ERROR_INVALID_ARGUMENT    = 2
 133 | NVML_ERROR_NOT_SUPPORTED       = 3
 134 | NVML_ERROR_NO_PERMISSION       = 4
 135 | NVML_ERROR_ALREADY_INITIALIZED = 5
 136 | NVML_ERROR_NOT_FOUND           = 6
 137 | NVML_ERROR_INSUFFICIENT_SIZE   = 7
 138 | NVML_ERROR_INSUFFICIENT_POWER  = 8
 139 | NVML_ERROR_DRIVER_NOT_LOADED   = 9
 140 | NVML_ERROR_TIMEOUT             = 10
 141 | NVML_ERROR_IRQ_ISSUE           = 11
 142 | NVML_ERROR_LIBRARY_NOT_FOUND   = 12
 143 | NVML_ERROR_FUNCTION_NOT_FOUND  = 13
 144 | NVML_ERROR_CORRUPTED_INFOROM   = 14
 145 | NVML_ERROR_GPU_IS_LOST         = 15
 146 | NVML_ERROR_RESET_REQUIRED      = 16
 147 | NVML_ERROR_OPERATING_SYSTEM    = 17
 148 | NVML_ERROR_LIB_RM_VERSION_MISMATCH = 18
 149 | NVML_ERROR_UNKNOWN             = 999
 150 | 
 151 | _nvmlFanState_t = c_uint
 152 | NVML_FAN_NORMAL             = 0
 153 | NVML_FAN_FAILED             = 1
 154 | 
 155 | _nvmlLedColor_t = c_uint
 156 | NVML_LED_COLOR_GREEN        = 0
 157 | NVML_LED_COLOR_AMBER        = 1
 158 |      
 159 | _nvmlGpuOperationMode_t = c_uint
 160 | NVML_GOM_ALL_ON                 = 0 
 161 | NVML_GOM_COMPUTE                = 1
 162 | NVML_GOM_LOW_DP                 = 2
 163 | 
 164 | _nvmlPageRetirementCause_t = c_uint
 165 | NVML_PAGE_RETIREMENT_CAUSE_DOUBLE_BIT_ECC_ERROR           = 0
 166 | NVML_PAGE_RETIREMENT_CAUSE_MULTIPLE_SINGLE_BIT_ECC_ERRORS = 1
 167 | NVML_PAGE_RETIREMENT_CAUSE_COUNT                          = 2
 168 | 
 169 | _nvmlRestrictedAPI_t = c_uint
 170 | NVML_RESTRICTED_API_SET_APPLICATION_CLOCKS                = 0
 171 | NVML_RESTRICTED_API_SET_AUTO_BOOSTED_CLOCKS               = 1
 172 | NVML_RESTRICTED_API_COUNT                                 = 2
 173 | 
 174 | _nvmlBridgeChipType_t = c_uint
 175 | NVML_BRIDGE_CHIP_PLX = 0
 176 | NVML_BRIDGE_CHIP_BRO4 = 1      
 177 | NVML_MAX_PHYSICAL_BRIDGE = 128
 178 | 
 179 | _nvmlValueType_t = c_uint
 180 | NVML_VALUE_TYPE_DOUBLE = 0
 181 | NVML_VALUE_TYPE_UNSIGNED_INT = 1
 182 | NVML_VALUE_TYPE_UNSIGNED_LONG = 2
 183 | NVML_VALUE_TYPE_UNSIGNED_LONG_LONG = 3
 184 | NVML_VALUE_TYPE_COUNT = 4
 185 | 
 186 | _nvmlPerfPolicyType_t = c_uint
 187 | NVML_PERF_POLICY_POWER = 0
 188 | NVML_PERF_POLICY_THERMAL = 1
 189 | NVML_PERF_POLICY_COUNT = 2
 190 | 
 191 | _nvmlSamplingType_t = c_uint
 192 | NVML_TOTAL_POWER_SAMPLES = 0
 193 | NVML_GPU_UTILIZATION_SAMPLES = 1
 194 | NVML_MEMORY_UTILIZATION_SAMPLES = 2
 195 | NVML_ENC_UTILIZATION_SAMPLES = 3
 196 | NVML_DEC_UTILIZATION_SAMPLES = 4
 197 | NVML_PROCESSOR_CLK_SAMPLES = 5
 198 | NVML_MEMORY_CLK_SAMPLES = 6
 199 | NVML_SAMPLINGTYPE_COUNT = 7
 200 | 
 201 | _nvmlPcieUtilCounter_t = c_uint
 202 | NVML_PCIE_UTIL_TX_BYTES = 0
 203 | NVML_PCIE_UTIL_RX_BYTES = 1
 204 | NVML_PCIE_UTIL_COUNT = 2
 205 | 
 206 | _nvmlGpuTopologyLevel_t = c_uint
 207 | NVML_TOPOLOGY_INTERNAL = 0
 208 | NVML_TOPOLOGY_SINGLE = 10
 209 | NVML_TOPOLOGY_MULTIPLE = 20
 210 | NVML_TOPOLOGY_HOSTBRIDGE = 30
 211 | NVML_TOPOLOGY_CPU = 40
 212 | NVML_TOPOLOGY_SYSTEM = 50
 213 | 
 214 | # C preprocessor defined values
 215 | nvmlFlagDefault             = 0
 216 | nvmlFlagForce               = 1
 217 | 
 218 | # buffer size
 219 | NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE      = 16
 220 | NVML_DEVICE_UUID_BUFFER_SIZE                 = 80
 221 | NVML_SYSTEM_DRIVER_VERSION_BUFFER_SIZE       = 81
 222 | NVML_SYSTEM_NVML_VERSION_BUFFER_SIZE         = 80
 223 | NVML_DEVICE_NAME_BUFFER_SIZE                 = 64
 224 | NVML_DEVICE_SERIAL_BUFFER_SIZE               = 30
 225 | NVML_DEVICE_VBIOS_VERSION_BUFFER_SIZE        = 32
 226 | NVML_DEVICE_PCI_BUS_ID_BUFFER_SIZE           = 16
 227 | 
 228 | NVML_VALUE_NOT_AVAILABLE_ulonglong = c_ulonglong(-1)
 229 | NVML_VALUE_NOT_AVAILABLE_uint = c_uint(-1)
 230 | 
 231 | ## Lib loading ##
 232 | nvmlLib = None
 233 | libLoadLock = threading.Lock()
 234 | _nvmlLib_refcount = 0 # Incremented on each nvmlInit and decremented on nvmlShutdown
 235 | 
 236 | ## Error Checking ##
 237 | class NVMLError(Exception):
 238 |     _valClassMapping = dict()
 239 |     # List of currently known error codes
 240 |     _errcode_to_string = {
 241 |         NVML_ERROR_UNINITIALIZED:       "Uninitialized",
 242 |         NVML_ERROR_INVALID_ARGUMENT:    "Invalid Argument",
 243 |         NVML_ERROR_NOT_SUPPORTED:       "Not Supported",
 244 |         NVML_ERROR_NO_PERMISSION:       "Insufficient Permissions",
 245 |         NVML_ERROR_ALREADY_INITIALIZED: "Already Initialized",
 246 |         NVML_ERROR_NOT_FOUND:           "Not Found",
 247 |         NVML_ERROR_INSUFFICIENT_SIZE:   "Insufficient Size",
 248 |         NVML_ERROR_INSUFFICIENT_POWER:  "Insufficient External Power",
 249 |         NVML_ERROR_DRIVER_NOT_LOADED:   "Driver Not Loaded",
 250 |         NVML_ERROR_TIMEOUT:             "Timeout",
 251 |         NVML_ERROR_IRQ_ISSUE:           "Interrupt Request Issue",
 252 |         NVML_ERROR_LIBRARY_NOT_FOUND:   "NVML Shared Library Not Found",
 253 |         NVML_ERROR_FUNCTION_NOT_FOUND:  "Function Not Found",
 254 |         NVML_ERROR_CORRUPTED_INFOROM:   "Corrupted infoROM",
 255 |         NVML_ERROR_GPU_IS_LOST:         "GPU is lost",
 256 |         NVML_ERROR_RESET_REQUIRED:      "GPU requires restart",
 257 |         NVML_ERROR_OPERATING_SYSTEM:    "The operating system has blocked the request.",
 258 |         NVML_ERROR_LIB_RM_VERSION_MISMATCH: "RM has detected an NVML/RM version mismatch.",
 259 |         NVML_ERROR_UNKNOWN:             "Unknown Error",
 260 |         }
 261 |     def __new__(typ, value):
 262 |         '''
 263 |         Maps value to a proper subclass of NVMLError.
 264 |         See _extractNVMLErrorsAsClasses function for more details
 265 |         '''
 266 |         if typ == NVMLError:
 267 |             typ = NVMLError._valClassMapping.get(value, typ)
 268 |         obj = Exception.__new__(typ)
 269 |         obj.value = value
 270 |         return obj
 271 |     def __str__(self):
 272 |         try:
 273 |             if self.value not in NVMLError._errcode_to_string:
 274 |                 NVMLError._errcode_to_string[self.value] = str(nvmlErrorString(self.value))
 275 |             return NVMLError._errcode_to_string[self.value]
 276 |         except NVMLError_Uninitialized:
 277 |             return "NVML Error with code %d" % self.value
 278 |     def __eq__(self, other):
 279 |         return self.value == other.value
 280 | 
 281 | def _extractNVMLErrorsAsClasses():
 282 |     '''
 283 |     Generates a hierarchy of classes on top of NVMLError class.
 284 | 
 285 |     Each NVML Error gets a new NVMLError subclass. This way try/except blocks can filter appropriate
 286 |     exceptions more easily.
 287 | 
 288 |     NVMLError is a parent class. Each NVML_ERROR_* gets its own subclass.
 289 |     e.g. NVML_ERROR_ALREADY_INITIALIZED will be turned into NVMLError_AlreadyInitialized
 290 |     '''
 291 |     this_module = sys.modules[__name__]
 292 |     nvmlErrorsNames = filter(lambda x: x.startswith("NVML_ERROR_"), dir(this_module))
 293 |     for err_name in nvmlErrorsNames:
 294 |         # e.g. Turn NVML_ERROR_ALREADY_INITIALIZED into NVMLError_AlreadyInitialized
 295 |         class_name = "NVMLError_" + string.capwords(err_name.replace("NVML_ERROR_", ""), "_").replace("_", "")
 296 |         err_val = getattr(this_module, err_name)
 297 |         def gen_new(val):
 298 |             def new(typ):
 299 |                 obj = NVMLError.__new__(typ, val)
 300 |                 return obj
 301 |             return new
 302 |         new_error_class = type(class_name, (NVMLError,), {'__new__': gen_new(err_val)})
 303 |         new_error_class.__module__ = __name__
 304 |         setattr(this_module, class_name, new_error_class)
 305 |         NVMLError._valClassMapping[err_val] = new_error_class
 306 | _extractNVMLErrorsAsClasses()
 307 | 
 308 | def _nvmlCheckReturn(ret):
 309 |     if (ret != NVML_SUCCESS):
 310 |         raise NVMLError(ret)
 311 |     return ret
 312 | 
 313 | ## Function access ##
 314 | _nvmlGetFunctionPointer_cache = dict() # function pointers are cached to prevent unnecessary libLoadLock locking
 315 | def _nvmlGetFunctionPointer(name):
 316 |     global nvmlLib
 317 | 
 318 |     if name in _nvmlGetFunctionPointer_cache:
 319 |         return _nvmlGetFunctionPointer_cache[name]
 320 |     
 321 |     libLoadLock.acquire()
 322 |     try:
 323 |         # ensure library was loaded
 324 |         if (nvmlLib == None):
 325 |             raise NVMLError(NVML_ERROR_UNINITIALIZED)
 326 |         try:
 327 |             _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
 328 |             return _nvmlGetFunctionPointer_cache[name]
 329 |         except AttributeError:
 330 |             raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
 331 |     finally:
 332 |         # lock is always freed
 333 |         libLoadLock.release()
 334 | 
 335 | ## Alternative object
 336 | # Allows the object to be printed
 337 | # Allows mismatched types to be assigned
 338 | #  - like None when the Structure variant requires c_uint
 339 | class nvmlFriendlyObject(object):
 340 |     def __init__(self, dictionary):
 341 |         for x in dictionary:
 342 |             setattr(self, x, dictionary[x])
 343 |     def __str__(self):
 344 |         return self.__dict__.__str__()
 345 | 
 346 | def nvmlStructToFriendlyObject(struct):
 347 |     d = {}
 348 |     for x in struct._fields_:
 349 |         key = x[0]
 350 |         value = getattr(struct, key)
 351 |         d[key] = value
 352 |     obj = nvmlFriendlyObject(d)
 353 |     return obj
 354 | 
 355 | # pack the object so it can be passed to the NVML library
 356 | def nvmlFriendlyObjectToStruct(obj, model):
 357 |     for x in model._fields_:
 358 |         key = x[0]
 359 |         value = obj.__dict__[key]
 360 |         setattr(model, key, value)
 361 |     return model
 362 | 
 363 | ## Unit structures
 364 | class struct_c_nvmlUnit_t(Structure):
 365 |     pass # opaque handle
 366 | c_nvmlUnit_t = POINTER(struct_c_nvmlUnit_t)
 367 |     
 368 | class _PrintableStructure(Structure):
 369 |     """
 370 |     Abstract class that produces nicer __str__ output than ctypes.Structure.
 371 |     e.g. instead of:
 372 |       >>> print(str(obj))
 373 |       <class_name object at 0x7fdf82fef9e0>
 374 |     this class will print
 375 |       class_name(field_name: formatted_value, field_name: formatted_value)
 376 |     
 377 |     _fmt_ dictionary of <str _field_ name> -> <str format>
 378 |     e.g. class that has _field_ 'hex_value', c_uint could be formatted with
 379 |       _fmt_ = {"hex_value" : "%08X"}
 380 |     to produce nicer output.
 381 |     Default formatting string for all fields can be set with key "<default>" like:
 382 |       _fmt_ = {"<default>" : "%d MHz"} # e.g all values are numbers in MHz.
 383 |     If not set it's assumed to be just "%s"
 384 | 
 385 |     Exact format of returned str from this class is subject to change in the future.
 386 |     """
 387 |     _fmt_ = {}
 388 |     def __str__(self):
 389 |         result = []
 390 |         for x in self._fields_:
 391 |             key = x[0]
 392 |             value = getattr(self, key)
 393 |             fmt = "%s"
 394 |             if key in self._fmt_:
 395 |                 fmt = self._fmt_[key]
 396 |             elif "<default>" in self._fmt_:
 397 |                 fmt = self._fmt_["<default>"]
 398 |             result.append(("%s: " + fmt) % (key, value))
 399 |         return self.__class__.__name__ + "(" + ", ".join(result) + ")"
 400 |     
 401 | class c_nvmlUnitInfo_t(_PrintableStructure):
 402 |     _fields_ = [
 403 |         ('name', c_char * 96),
 404 |         ('id', c_char * 96),
 405 |         ('serial', c_char * 96),
 406 |         ('firmwareVersion', c_char * 96),
 407 |     ]
 408 | 
 409 | class c_nvmlLedState_t(_PrintableStructure):
 410 |     _fields_ = [
 411 |         ('cause', c_char * 256),
 412 |         ('color', _nvmlLedColor_t),
 413 |     ]
 414 | 
 415 | class c_nvmlPSUInfo_t(_PrintableStructure):
 416 |     _fields_ = [
 417 |         ('state', c_char * 256),
 418 |         ('current', c_uint),
 419 |         ('voltage', c_uint),
 420 |         ('power', c_uint),
 421 |     ]
 422 | 
 423 | class c_nvmlUnitFanInfo_t(_PrintableStructure):
 424 |     _fields_ = [
 425 |         ('speed', c_uint),
 426 |         ('state', _nvmlFanState_t),
 427 |     ]
 428 | 
 429 | class c_nvmlUnitFanSpeeds_t(_PrintableStructure):
 430 |     _fields_ = [
 431 |         ('fans', c_nvmlUnitFanInfo_t * 24),
 432 |         ('count', c_uint)
 433 |     ]
 434 | 
 435 | ## Device structures
 436 | class struct_c_nvmlDevice_t(Structure):
 437 |     pass # opaque handle
 438 | c_nvmlDevice_t = POINTER(struct_c_nvmlDevice_t)
 439 | 
 440 | class nvmlPciInfo_t(_PrintableStructure):
 441 |     _fields_ = [
 442 |         ('busId', c_char * 16),
 443 |         ('domain', c_uint),
 444 |         ('bus', c_uint),
 445 |         ('device', c_uint),
 446 |         ('pciDeviceId', c_uint),
 447 |         
 448 |         # Added in 2.285
 449 |         ('pciSubSystemId', c_uint),
 450 |         ('reserved0', c_uint),
 451 |         ('reserved1', c_uint),
 452 |         ('reserved2', c_uint),
 453 |         ('reserved3', c_uint),
 454 |     ]
 455 |     _fmt_ = {
 456 |             'domain'         : "0x%04X",
 457 |             'bus'            : "0x%02X",
 458 |             'device'         : "0x%02X",
 459 |             'pciDeviceId'    : "0x%08X",
 460 |             'pciSubSystemId' : "0x%08X",
 461 |             }
 462 | 
 463 | class c_nvmlMemory_t(_PrintableStructure):
 464 |     _fields_ = [
 465 |         ('total', c_ulonglong),
 466 |         ('free', c_ulonglong),
 467 |         ('used', c_ulonglong),
 468 |     ]
 469 |     _fmt_ = {'<default>': "%d B"}
 470 | 
 471 | class c_nvmlBAR1Memory_t(_PrintableStructure):
 472 |     _fields_ = [
 473 |         ('bar1Total', c_ulonglong),
 474 |         ('bar1Free', c_ulonglong),
 475 |         ('bar1Used', c_ulonglong),
 476 |     ]
 477 |     _fmt_ = {'<default>': "%d B"}
 478 | 
 479 | # On Windows with the WDDM driver, usedGpuMemory is reported as None
 480 | # Code that processes this structure should check for None, e.g.
 481 | #
 482 | # if info.usedGpuMemory is None:
 483 | #     # TODO handle the error
 484 | #     pass
 485 | # else:
 486 | #    print("Using %d MiB of memory" % (info.usedGpuMemory / 1024 / 1024))
 487 | #
 488 | # See NVML documentation for more information
 489 | class c_nvmlProcessInfo_t(_PrintableStructure):
 490 |     _fields_ = [
 491 |         ('pid', c_uint),
 492 |         ('usedGpuMemory', c_ulonglong),
 493 |     ]
 494 |     _fmt_ = {'usedGpuMemory': "%d B"}
 495 | 
 496 | class c_nvmlBridgeChipInfo_t(_PrintableStructure):
 497 |     _fields_ = [
 498 |         ('type', _nvmlBridgeChipType_t),
 499 |         ('fwVersion', c_uint),
 500 |     ]
 501 | 
 502 | class c_nvmlBridgeChipHierarchy_t(_PrintableStructure):
 503 |     _fields_ = [
 504 |         ('bridgeCount', c_uint),
 505 |         ('bridgeChipInfo', c_nvmlBridgeChipInfo_t * 128),
 506 |     ]    
 507 | 
 508 | class c_nvmlEccErrorCounts_t(_PrintableStructure):
 509 |     _fields_ = [
 510 |         ('l1Cache', c_ulonglong),
 511 |         ('l2Cache', c_ulonglong),
 512 |         ('deviceMemory', c_ulonglong),
 513 |         ('registerFile', c_ulonglong),
 514 |     ]
 515 | 
 516 | class c_nvmlUtilization_t(_PrintableStructure):
 517 |     _fields_ = [
 518 |         ('gpu', c_uint),
 519 |         ('memory', c_uint),
 520 |     ]
 521 |     _fmt_ = {'<default>': "%d %%"}
 522 | 
 523 | # Added in 2.285
 524 | class c_nvmlHwbcEntry_t(_PrintableStructure):
 525 |     _fields_ = [
 526 |         ('hwbcId', c_uint),
 527 |         ('firmwareVersion', c_char * 32),
 528 |     ]
 529 | 
 530 | class c_nvmlValue_t(Union):
 531 |     _fields_ = [
 532 |         ('dVal', c_double),
 533 |         ('uiVal', c_uint),
 534 |         ('ulVal', c_ulong),
 535 |         ('ullVal', c_ulonglong),
 536 |     ]
 537 | 
 538 | class c_nvmlSample_t(_PrintableStructure):
 539 |     _fields_ = [
 540 |         ('timeStamp', c_ulonglong),
 541 |         ('sampleValue', c_nvmlValue_t),
 542 |     ]
 543 | 
 544 | class c_nvmlViolationTime_t(_PrintableStructure):
 545 |     _fields_ = [
 546 |         ('referenceTime', c_ulonglong),
 547 |         ('violationTime', c_ulonglong),
 548 |     ]
 549 | 
 550 | ## Event structures
 551 | class struct_c_nvmlEventSet_t(Structure):
 552 |     pass # opaque handle
 553 | c_nvmlEventSet_t = POINTER(struct_c_nvmlEventSet_t)
 554 | 
 555 | nvmlEventTypeSingleBitEccError     = 0x0000000000000001
 556 | nvmlEventTypeDoubleBitEccError     = 0x0000000000000002
 557 | nvmlEventTypePState                = 0x0000000000000004
 558 | nvmlEventTypeXidCriticalError      = 0x0000000000000008
 559 | nvmlEventTypeClock                 = 0x0000000000000010
 560 | nvmlEventTypeNone                  = 0x0000000000000000
 561 | nvmlEventTypeAll                   = (
 562 |                                         nvmlEventTypeNone |
 563 |                                         nvmlEventTypeSingleBitEccError |
 564 |                                         nvmlEventTypeDoubleBitEccError |
 565 |                                         nvmlEventTypePState |
 566 |                                         nvmlEventTypeClock |
 567 |                                         nvmlEventTypeXidCriticalError
 568 |                                      )
 569 | 
 570 | ## Clock Throttle Reasons defines
 571 | nvmlClocksThrottleReasonGpuIdle           = 0x0000000000000001
 572 | nvmlClocksThrottleReasonApplicationsClocksSetting = 0x0000000000000002
 573 | nvmlClocksThrottleReasonUserDefinedClocks         = nvmlClocksThrottleReasonApplicationsClocksSetting # deprecated, use nvmlClocksThrottleReasonApplicationsClocksSetting
 574 | nvmlClocksThrottleReasonSwPowerCap        = 0x0000000000000004
 575 | nvmlClocksThrottleReasonHwSlowdown        = 0x0000000000000008
 576 | nvmlClocksThrottleReasonUnknown           = 0x8000000000000000
 577 | nvmlClocksThrottleReasonNone              = 0x0000000000000000
 578 | nvmlClocksThrottleReasonAll               = (
 579 |                                                nvmlClocksThrottleReasonNone |
 580 |                                                nvmlClocksThrottleReasonGpuIdle |
 581 |                                                nvmlClocksThrottleReasonApplicationsClocksSetting |
 582 |                                                nvmlClocksThrottleReasonSwPowerCap |
 583 |                                                nvmlClocksThrottleReasonHwSlowdown |
 584 |                                                nvmlClocksThrottleReasonUnknown
 585 |                                             ) 
 586 | 
 587 | class c_nvmlEventData_t(_PrintableStructure):
 588 |     _fields_ = [
 589 |         ('device', c_nvmlDevice_t),
 590 |         ('eventType', c_ulonglong),
 591 |         ('eventData', c_ulonglong)
 592 |     ]
 593 |     _fmt_ = {'eventType': "0x%08X"}
 594 | 
 595 | class c_nvmlAccountingStats_t(_PrintableStructure):
 596 |     _fields_ = [
 597 |         ('gpuUtilization', c_uint),
 598 |         ('memoryUtilization', c_uint),
 599 |         ('maxMemoryUsage', c_ulonglong),
 600 |         ('time', c_ulonglong),
 601 |         ('startTime', c_ulonglong),
 602 |         ('isRunning', c_uint),
 603 |         ('reserved', c_uint * 5)
 604 |     ]
 605 | 
 606 | ## C function wrappers ##
 607 | def nvmlInit():
 608 |     _LoadNvmlLibrary()
 609 |     
 610 |     #
 611 |     # Initialize the library
 612 |     #
 613 |     fn = _nvmlGetFunctionPointer("nvmlInit_v2")
 614 |     ret = fn()
 615 |     _nvmlCheckReturn(ret)
 616 |    
 617 |     # Atomically update refcount
 618 |     global _nvmlLib_refcount
 619 |     libLoadLock.acquire()
 620 |     _nvmlLib_refcount += 1
 621 |     libLoadLock.release()
 622 |     return None
 623 |     
 624 | def _LoadNvmlLibrary():
 625 |     '''
 626 |     Load the library if it isn't loaded already
 627 |     '''
 628 |     global nvmlLib
 629 |     
 630 |     if (nvmlLib is None):
 631 |         # lock to ensure only one caller loads the library
 632 |         libLoadLock.acquire()
 633 |         
 634 |         try:
 635 |             # ensure the library still isn't loaded
 636 |             if (nvmlLib is None):
 637 |                 try:
 638 |                     if (sys.platform[:3] == "win"):
 639 |                         # cdecl calling convention
 640 |                         # load nvml.dll from %ProgramFiles%/NVIDIA Corporation/NVSMI/nvml.dll
 641 |                         nvmlLib = CDLL(os.path.join(os.getenv("ProgramFiles", "C:/Program Files"), "NVIDIA Corporation/NVSMI/nvml.dll"))
 642 |                     else:
 643 |                         # assume linux
 644 |                         nvmlLib = CDLL("libnvidia-ml.so.1")
 645 |                 except OSError:
 646 |                     _nvmlCheckReturn(NVML_ERROR_LIBRARY_NOT_FOUND)
 647 |                 if (nvmlLib is None):
 648 |                     _nvmlCheckReturn(NVML_ERROR_LIBRARY_NOT_FOUND)
 649 |         finally:
 650 |             # lock is always freed
 651 |             libLoadLock.release()
 652 |             
 653 | def nvmlShutdown():
 654 |     #
 655 |     # Leave the library loaded, but shutdown the interface
 656 |     #
 657 |     fn = _nvmlGetFunctionPointer("nvmlShutdown")
 658 |     ret = fn()
 659 |     _nvmlCheckReturn(ret)
 660 |     
 661 |     # Atomically update refcount
 662 |     global _nvmlLib_refcount
 663 |     with libLoadLock:
 664 |         # never let the refcount drop below zero
 665 |         if (_nvmlLib_refcount > 0):
 666 |             _nvmlLib_refcount -= 1
 667 |     return None
 668 | 
 669 | # Added in 2.285
 670 | def nvmlErrorString(result):
 671 |     fn = _nvmlGetFunctionPointer("nvmlErrorString")
 672 |     fn.restype = c_char_p # otherwise return is an int
 673 |     ret = fn(result)
 674 |     return ret
 675 | 
 676 | # Added in 2.285
 677 | def nvmlSystemGetNVMLVersion():
 678 |     c_version = create_string_buffer(NVML_SYSTEM_NVML_VERSION_BUFFER_SIZE)
 679 |     fn = _nvmlGetFunctionPointer("nvmlSystemGetNVMLVersion")
 680 |     ret = fn(c_version, c_uint(NVML_SYSTEM_NVML_VERSION_BUFFER_SIZE))
 681 |     _nvmlCheckReturn(ret)
 682 |     return c_version.value
 683 | 
 684 | # Added in 2.285
 685 | def nvmlSystemGetProcessName(pid):
 686 |     c_name = create_string_buffer(1024)
 687 |     fn = _nvmlGetFunctionPointer("nvmlSystemGetProcessName")
 688 |     ret = fn(c_uint(pid), c_name, c_uint(1024))
 689 |     _nvmlCheckReturn(ret)
 690 |     return c_name.value
 691 | 
 692 | def nvmlSystemGetDriverVersion():
 693 |     c_version = create_string_buffer(NVML_SYSTEM_DRIVER_VERSION_BUFFER_SIZE)
 694 |     fn = _nvmlGetFunctionPointer("nvmlSystemGetDriverVersion")
 695 |     ret = fn(c_version, c_uint(NVML_SYSTEM_DRIVER_VERSION_BUFFER_SIZE))
 696 |     _nvmlCheckReturn(ret)
 697 |     return c_version.value
 698 | 
 699 | # Added in 2.285
 700 | def nvmlSystemGetHicVersion():
 701 |     c_count = c_uint(0)
 702 |     hics = None
 703 |     fn = _nvmlGetFunctionPointer("nvmlSystemGetHicVersion")
 704 |     
 705 |     # get the count
 706 |     ret = fn(byref(c_count), None)
 707 |     
 708 |     # this should only fail with insufficient size
 709 |     if ((ret != NVML_SUCCESS) and
 710 |         (ret != NVML_ERROR_INSUFFICIENT_SIZE)):
 711 |         raise NVMLError(ret)
 712 |     
 713 |     # if there are no hics
 714 |     if (c_count.value == 0):
 715 |         return []
 716 |     
 717 |     hic_array = c_nvmlHwbcEntry_t * c_count.value
 718 |     hics = hic_array()
 719 |     ret = fn(byref(c_count), hics)
 720 |     _nvmlCheckReturn(ret)
 721 |     return hics
 722 | 
 723 | ## Unit get functions
 724 | def nvmlUnitGetCount():
 725 |     c_count = c_uint()
 726 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetCount")
 727 |     ret = fn(byref(c_count))
 728 |     _nvmlCheckReturn(ret)
 729 |     return c_count.value
 730 | 
 731 | def nvmlUnitGetHandleByIndex(index):
 732 |     c_index = c_uint(index)
 733 |     unit = c_nvmlUnit_t()
 734 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetHandleByIndex")
 735 |     ret = fn(c_index, byref(unit))
 736 |     _nvmlCheckReturn(ret)
 737 |     return unit
 738 | 
 739 | def nvmlUnitGetUnitInfo(unit):
 740 |     c_info = c_nvmlUnitInfo_t()
 741 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetUnitInfo")
 742 |     ret = fn(unit, byref(c_info))
 743 |     _nvmlCheckReturn(ret)
 744 |     return c_info
 745 | 
 746 | def nvmlUnitGetLedState(unit):
 747 |     c_state = c_nvmlLedState_t()
 748 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetLedState")
 749 |     ret = fn(unit, byref(c_state))
 750 |     _nvmlCheckReturn(ret)
 751 |     return c_state
 752 | 
 753 | def nvmlUnitGetPsuInfo(unit):
 754 |     c_info = c_nvmlPSUInfo_t()
 755 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetPsuInfo")
 756 |     ret = fn(unit, byref(c_info))
 757 |     _nvmlCheckReturn(ret)
 758 |     return c_info
 759 | 
 760 | def nvmlUnitGetTemperature(unit, type):
 761 |     c_temp = c_uint()
 762 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetTemperature")
 763 |     ret = fn(unit, c_uint(type), byref(c_temp))
 764 |     _nvmlCheckReturn(ret)
 765 |     return c_temp.value
 766 | 
 767 | def nvmlUnitGetFanSpeedInfo(unit):
 768 |     c_speeds = c_nvmlUnitFanSpeeds_t()
 769 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetFanSpeedInfo")
 770 |     ret = fn(unit, byref(c_speeds))
 771 |     _nvmlCheckReturn(ret)
 772 |     return c_speeds
 773 |     
 774 | # added to API
 775 | def nvmlUnitGetDeviceCount(unit):
 776 |     c_count = c_uint(0)
 777 |     # query the unit to determine device count
 778 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetDevices")
 779 |     ret = fn(unit, byref(c_count), None)
 780 |     if (ret == NVML_ERROR_INSUFFICIENT_SIZE):
 781 |         ret = NVML_SUCCESS
 782 |     _nvmlCheckReturn(ret)
 783 |     return c_count.value
 784 | 
 785 | def nvmlUnitGetDevices(unit):
 786 |     c_count = c_uint(nvmlUnitGetDeviceCount(unit))
 787 |     device_array = c_nvmlDevice_t * c_count.value
 788 |     c_devices = device_array()
 789 |     fn = _nvmlGetFunctionPointer("nvmlUnitGetDevices")
 790 |     ret = fn(unit, byref(c_count), c_devices)
 791 |     _nvmlCheckReturn(ret)
 792 |     return c_devices
 793 | 
 794 | ## Device get functions
 795 | def nvmlDeviceGetCount():
 796 |     c_count = c_uint()
 797 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetCount_v2")
 798 |     ret = fn(byref(c_count))
 799 |     _nvmlCheckReturn(ret)
 800 |     return c_count.value
 801 | 
 802 | def nvmlDeviceGetHandleByIndex(index):
 803 |     c_index = c_uint(index)
 804 |     device = c_nvmlDevice_t()
 805 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetHandleByIndex_v2")
 806 |     ret = fn(c_index, byref(device))
 807 |     _nvmlCheckReturn(ret)
 808 |     return device
 809 | 
 810 | def nvmlDeviceGetHandleBySerial(serial):
 811 |     c_serial = c_char_p(serial)
 812 |     device = c_nvmlDevice_t()
 813 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetHandleBySerial")
 814 |     ret = fn(c_serial, byref(device))
 815 |     _nvmlCheckReturn(ret)
 816 |     return device
 817 | 
 818 | def nvmlDeviceGetHandleByUUID(uuid):
 819 |     c_uuid = c_char_p(uuid)
 820 |     device = c_nvmlDevice_t()
 821 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetHandleByUUID")
 822 |     ret = fn(c_uuid, byref(device))
 823 |     _nvmlCheckReturn(ret)
 824 |     return device
 825 |     
 826 | def nvmlDeviceGetHandleByPciBusId(pciBusId):
 827 |     c_busId = c_char_p(pciBusId)
 828 |     device = c_nvmlDevice_t()
 829 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetHandleByPciBusId_v2")
 830 |     ret = fn(c_busId, byref(device))
 831 |     _nvmlCheckReturn(ret)
 832 |     return device
 833 | 
 834 | def nvmlDeviceGetName(handle):
 835 |     c_name = create_string_buffer(NVML_DEVICE_NAME_BUFFER_SIZE)
 836 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetName")
 837 |     ret = fn(handle, c_name, c_uint(NVML_DEVICE_NAME_BUFFER_SIZE))
 838 |     _nvmlCheckReturn(ret)
 839 |     return c_name.value
 840 | 
 841 | def nvmlDeviceGetBoardId(handle):
 842 |     c_id = c_uint()
 843 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetBoardId")
 844 |     ret = fn(handle, byref(c_id))
 845 |     _nvmlCheckReturn(ret)
 846 |     return c_id.value
 847 | 
 848 | def nvmlDeviceGetMultiGpuBoard(handle):
 849 |     c_multiGpu = c_uint()
 850 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetMultiGpuBoard")
 851 |     ret = fn(handle, byref(c_multiGpu))
 852 |     _nvmlCheckReturn(ret)
 853 |     return c_multiGpu.value
 854 | 
 855 | def nvmlDeviceGetBrand(handle):
 856 |     c_type = _nvmlBrandType_t()
 857 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetBrand")
 858 |     ret = fn(handle, byref(c_type))
 859 |     _nvmlCheckReturn(ret)
 860 |     return c_type.value
 861 |     
 862 | def nvmlDeviceGetSerial(handle):
 863 |     c_serial = create_string_buffer(NVML_DEVICE_SERIAL_BUFFER_SIZE)
 864 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetSerial")
 865 |     ret = fn(handle, c_serial, c_uint(NVML_DEVICE_SERIAL_BUFFER_SIZE))
 866 |     _nvmlCheckReturn(ret)
 867 |     return c_serial.value
 868 | 
 869 | def nvmlDeviceGetCpuAffinity(handle, cpuSetSize):
 870 |     affinity_array = c_ulonglong * cpuSetSize
 871 |     c_affinity = affinity_array()
 872 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetCpuAffinity")
 873 |     ret = fn(handle, cpuSetSize, byref(c_affinity))
 874 |     _nvmlCheckReturn(ret)
 875 |     return c_affinity
 876 | 
 877 | def nvmlDeviceSetCpuAffinity(handle):
 878 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetCpuAffinity")
 879 |     ret = fn(handle)
 880 |     _nvmlCheckReturn(ret)
 881 |     return None
 882 | 
 883 | def nvmlDeviceClearCpuAffinity(handle):
 884 |     fn = _nvmlGetFunctionPointer("nvmlDeviceClearCpuAffinity")
 885 |     ret = fn(handle)
 886 |     _nvmlCheckReturn(ret)
 887 |     return None
 888 | 
 889 | def nvmlDeviceGetMinorNumber(handle):
 890 |     c_minor_number = c_uint()
 891 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetMinorNumber")
 892 |     ret = fn(handle, byref(c_minor_number))
 893 |     _nvmlCheckReturn(ret)
 894 |     return c_minor_number.value
 895 |     
 896 | def nvmlDeviceGetUUID(handle):
 897 |     c_uuid = create_string_buffer(NVML_DEVICE_UUID_BUFFER_SIZE)
 898 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetUUID")
 899 |     ret = fn(handle, c_uuid, c_uint(NVML_DEVICE_UUID_BUFFER_SIZE))
 900 |     _nvmlCheckReturn(ret)
 901 |     return c_uuid.value
 902 |     
 903 | def nvmlDeviceGetInforomVersion(handle, infoRomObject):
 904 |     c_version = create_string_buffer(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE)
 905 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetInforomVersion")
 906 |     ret = fn(handle, _nvmlInforomObject_t(infoRomObject),
 907 |              c_version, c_uint(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE))
 908 |     _nvmlCheckReturn(ret)
 909 |     return c_version.value
 910 | 
 911 | # Added in 4.304
 912 | def nvmlDeviceGetInforomImageVersion(handle):
 913 |     c_version = create_string_buffer(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE)
 914 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetInforomImageVersion")
 915 |     ret = fn(handle, c_version, c_uint(NVML_DEVICE_INFOROM_VERSION_BUFFER_SIZE))
 916 |     _nvmlCheckReturn(ret)
 917 |     return c_version.value
 918 | 
 919 | # Added in 4.304
 920 | def nvmlDeviceGetInforomConfigurationChecksum(handle):
 921 |     c_checksum = c_uint()
 922 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetInforomConfigurationChecksum")
 923 |     ret = fn(handle, byref(c_checksum))
 924 |     _nvmlCheckReturn(ret)
 925 |     return c_checksum.value
 926 | 
 927 | # Added in 4.304
 928 | def nvmlDeviceValidateInforom(handle):
 929 |     fn = _nvmlGetFunctionPointer("nvmlDeviceValidateInforom")
 930 |     ret = fn(handle)
 931 |     _nvmlCheckReturn(ret)
 932 |     return None
 933 | 
 934 | def nvmlDeviceGetDisplayMode(handle):
 935 |     c_mode = _nvmlEnableState_t()
 936 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetDisplayMode")
 937 |     ret = fn(handle, byref(c_mode))
 938 |     _nvmlCheckReturn(ret)
 939 |     return c_mode.value
 940 |     
 941 | def nvmlDeviceGetDisplayActive(handle):
 942 |     c_mode = _nvmlEnableState_t()
 943 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetDisplayActive")
 944 |     ret = fn(handle, byref(c_mode))
 945 |     _nvmlCheckReturn(ret)
 946 |     return c_mode.value
 947 |     
 948 |     
 949 | def nvmlDeviceGetPersistenceMode(handle):
 950 |     c_state = _nvmlEnableState_t()
 951 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPersistenceMode")
 952 |     ret = fn(handle, byref(c_state))
 953 |     _nvmlCheckReturn(ret)
 954 |     return c_state.value
 955 |     
 956 | def nvmlDeviceGetPciInfo(handle):
 957 |     c_info = nvmlPciInfo_t()
 958 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPciInfo_v2")
 959 |     ret = fn(handle, byref(c_info))
 960 |     _nvmlCheckReturn(ret)
 961 |     return c_info
 962 |     
 963 | def nvmlDeviceGetClockInfo(handle, type):
 964 |     c_clock = c_uint()
 965 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetClockInfo")
 966 |     ret = fn(handle, _nvmlClockType_t(type), byref(c_clock))
 967 |     _nvmlCheckReturn(ret)
 968 |     return c_clock.value
 969 | 
 970 | # Added in 2.285
 971 | def nvmlDeviceGetMaxClockInfo(handle, type):
 972 |     c_clock = c_uint()
 973 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetMaxClockInfo")
 974 |     ret = fn(handle, _nvmlClockType_t(type), byref(c_clock))
 975 |     _nvmlCheckReturn(ret)
 976 |     return c_clock.value
 977 | 
 978 | # Added in 4.304
 979 | def nvmlDeviceGetApplicationsClock(handle, type):
 980 |     c_clock = c_uint()
 981 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetApplicationsClock")
 982 |     ret = fn(handle, _nvmlClockType_t(type), byref(c_clock))
 983 |     _nvmlCheckReturn(ret)
 984 |     return c_clock.value
 985 | 
 986 | # Added in 5.319
 987 | def nvmlDeviceGetDefaultApplicationsClock(handle, type):
 988 |     c_clock = c_uint()
 989 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetDefaultApplicationsClock")
 990 |     ret = fn(handle, _nvmlClockType_t(type), byref(c_clock))
 991 |     _nvmlCheckReturn(ret)
 992 |     return c_clock.value
 993 | 
 994 | # Added in 4.304
 995 | def nvmlDeviceGetSupportedMemoryClocks(handle):
 996 |     # first call to get the size
 997 |     c_count = c_uint(0)
 998 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetSupportedMemoryClocks")
 999 |     ret = fn(handle, byref(c_count), None)
1000 |     
1001 |     if (ret == NVML_SUCCESS):
1002 |         # special case, no clocks
1003 |         return []
1004 |     elif (ret == NVML_ERROR_INSUFFICIENT_SIZE):
1005 |         # typical case
1006 |         clocks_array = c_uint * c_count.value
1007 |         c_clocks = clocks_array()
1008 |         
1009 |         # make the call again
1010 |         ret = fn(handle, byref(c_count), c_clocks)
1011 |         _nvmlCheckReturn(ret)
1012 |         
1013 |         clocks = []
1014 |         for i in range(c_count.value):
1015 |             clocks.append(c_clocks[i])
1016 | 
1017 |         return clocks
1018 |     else:
1019 |         # error case
1020 |         raise NVMLError(ret)
1021 | 
1022 | # Added in 4.304
1023 | def nvmlDeviceGetSupportedGraphicsClocks(handle, memoryClockMHz):
1024 |     # first call to get the size
1025 |     c_count = c_uint(0)
1026 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetSupportedGraphicsClocks")
1027 |     ret = fn(handle, c_uint(memoryClockMHz), byref(c_count), None)
1028 |     
1029 |     if (ret == NVML_SUCCESS):
1030 |         # special case, no clocks
1031 |         return []
1032 |     elif (ret == NVML_ERROR_INSUFFICIENT_SIZE):
1033 |         # typical case
1034 |         clocks_array = c_uint * c_count.value
1035 |         c_clocks = clocks_array()
1036 |         
1037 |         # make the call again
1038 |         ret = fn(handle, c_uint(memoryClockMHz), byref(c_count), c_clocks)
1039 |         _nvmlCheckReturn(ret)
1040 |         
1041 |         clocks = []
1042 |         for i in range(c_count.value):
1043 |             clocks.append(c_clocks[i])
1044 | 
1045 |         return clocks
1046 |     else:
1047 |         # error case
1048 |         raise NVMLError(ret)
1049 | 
1050 | def nvmlDeviceGetFanSpeed(handle):
1051 |     c_speed = c_uint()
1052 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetFanSpeed")
1053 |     ret = fn(handle, byref(c_speed))
1054 |     _nvmlCheckReturn(ret)
1055 |     return c_speed.value
1056 |     
1057 | def nvmlDeviceGetTemperature(handle, sensor):
1058 |     c_temp = c_uint()
1059 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetTemperature")
1060 |     ret = fn(handle, _nvmlTemperatureSensors_t(sensor), byref(c_temp))
1061 |     _nvmlCheckReturn(ret)
1062 |     return c_temp.value
1063 | 
1064 | def nvmlDeviceGetTemperatureThreshold(handle, threshold):
1065 |     c_temp = c_uint()
1066 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetTemperatureThreshold")
1067 |     ret = fn(handle, _nvmlTemperatureThresholds_t(threshold), byref(c_temp))
1068 |     _nvmlCheckReturn(ret)
1069 |     return c_temp.value
1070 | 
1071 | # DEPRECATED use nvmlDeviceGetPerformanceState
1072 | def nvmlDeviceGetPowerState(handle):
1073 |     c_pstate = _nvmlPstates_t()
1074 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPowerState")
1075 |     ret = fn(handle, byref(c_pstate))
1076 |     _nvmlCheckReturn(ret)
1077 |     return c_pstate.value
1078 |     
1079 | def nvmlDeviceGetPerformanceState(handle):
1080 |     c_pstate = _nvmlPstates_t()
1081 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPerformanceState")
1082 |     ret = fn(handle, byref(c_pstate))
1083 |     _nvmlCheckReturn(ret)
1084 |     return c_pstate.value
1085 | 
1086 | def nvmlDeviceGetPowerManagementMode(handle):
1087 |     c_pcapMode = _nvmlEnableState_t()
1088 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPowerManagementMode")
1089 |     ret = fn(handle, byref(c_pcapMode))
1090 |     _nvmlCheckReturn(ret)
1091 |     return c_pcapMode.value
1092 |     
1093 | def nvmlDeviceGetPowerManagementLimit(handle):
1094 |     c_limit = c_uint()
1095 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPowerManagementLimit")
1096 |     ret = fn(handle, byref(c_limit))
1097 |     _nvmlCheckReturn(ret)
1098 |     return c_limit.value
1099 | 
1100 | # Added in 4.304
1101 | def nvmlDeviceGetPowerManagementLimitConstraints(handle):
1102 |     c_minLimit = c_uint()
1103 |     c_maxLimit = c_uint()
1104 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPowerManagementLimitConstraints")
1105 |     ret = fn(handle, byref(c_minLimit), byref(c_maxLimit))
1106 |     _nvmlCheckReturn(ret)
1107 |     return [c_minLimit.value, c_maxLimit.value]
1108 | 
1109 | # Added in 4.304
1110 | def nvmlDeviceGetPowerManagementDefaultLimit(handle):
1111 |     c_limit = c_uint()
1112 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPowerManagementDefaultLimit")
1113 |     ret = fn(handle, byref(c_limit))
1114 |     _nvmlCheckReturn(ret)
1115 |     return c_limit.value
1116 |     
1117 | 
1118 | # Added in 331
1119 | def nvmlDeviceGetEnforcedPowerLimit(handle):
1120 |     c_limit = c_uint()
1121 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetEnforcedPowerLimit")
1122 |     ret = fn(handle, byref(c_limit))
1123 |     _nvmlCheckReturn(ret)
1124 |     return c_limit.value
1125 | 
1126 | def nvmlDeviceGetPowerUsage(handle):
1127 |     c_milliwatts = c_uint()  # NVML reports power usage in milliwatts
1128 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPowerUsage")
1129 |     ret = fn(handle, byref(c_milliwatts))
1130 |     _nvmlCheckReturn(ret)
1131 |     return c_milliwatts.value
1132 | 
1133 | # Added in 4.304
1134 | def nvmlDeviceGetGpuOperationMode(handle):
1135 |     c_currState = _nvmlGpuOperationMode_t()
1136 |     c_pendingState = _nvmlGpuOperationMode_t()
1137 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetGpuOperationMode")
1138 |     ret = fn(handle, byref(c_currState), byref(c_pendingState))
1139 |     _nvmlCheckReturn(ret)
1140 |     return [c_currState.value, c_pendingState.value]
1141 | 
1142 | # Added in 4.304
1143 | def nvmlDeviceGetCurrentGpuOperationMode(handle):
1144 |     return nvmlDeviceGetGpuOperationMode(handle)[0]
1145 | 
1146 | # Added in 4.304
1147 | def nvmlDeviceGetPendingGpuOperationMode(handle):
1148 |     return nvmlDeviceGetGpuOperationMode(handle)[1]
1149 |     
1150 | def nvmlDeviceGetMemoryInfo(handle):
1151 |     c_memory = c_nvmlMemory_t()
1152 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetMemoryInfo")
1153 |     ret = fn(handle, byref(c_memory))
1154 |     _nvmlCheckReturn(ret)
1155 |     return c_memory
1156 | 
1157 | def nvmlDeviceGetBAR1MemoryInfo(handle):
1158 |     c_bar1_memory = c_nvmlBAR1Memory_t()
1159 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetBAR1MemoryInfo")
1160 |     ret = fn(handle, byref(c_bar1_memory))
1161 |     _nvmlCheckReturn(ret)
1162 |     return c_bar1_memory
1163 |     
1164 | def nvmlDeviceGetComputeMode(handle):
1165 |     c_mode = _nvmlComputeMode_t()
1166 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeMode")
1167 |     ret = fn(handle, byref(c_mode))
1168 |     _nvmlCheckReturn(ret)
1169 |     return c_mode.value
1170 |     
1171 | def nvmlDeviceGetEccMode(handle):
1172 |     c_currState = _nvmlEnableState_t()
1173 |     c_pendingState = _nvmlEnableState_t()
1174 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetEccMode")
1175 |     ret = fn(handle, byref(c_currState), byref(c_pendingState))
1176 |     _nvmlCheckReturn(ret)
1177 |     return [c_currState.value, c_pendingState.value]
1178 | 
1179 | # added to API
1180 | def nvmlDeviceGetCurrentEccMode(handle):
1181 |     return nvmlDeviceGetEccMode(handle)[0]
1182 | 
1183 | # added to API
1184 | def nvmlDeviceGetPendingEccMode(handle):
1185 |     return nvmlDeviceGetEccMode(handle)[1]
1186 | 
1187 | def nvmlDeviceGetTotalEccErrors(handle, errorType, counterType):
1188 |     c_count = c_ulonglong()
1189 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetTotalEccErrors")
1190 |     ret = fn(handle, _nvmlMemoryErrorType_t(errorType),
1191 |              _nvmlEccCounterType_t(counterType), byref(c_count))
1192 |     _nvmlCheckReturn(ret)
1193 |     return c_count.value
1194 | 
1195 | # This is deprecated, instead use nvmlDeviceGetMemoryErrorCounter
1196 | def nvmlDeviceGetDetailedEccErrors(handle, errorType, counterType):
1197 |     c_counts = c_nvmlEccErrorCounts_t()
1198 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetDetailedEccErrors")
1199 |     ret = fn(handle, _nvmlMemoryErrorType_t(errorType),
1200 |              _nvmlEccCounterType_t(counterType), byref(c_counts))
1201 |     _nvmlCheckReturn(ret)
1202 |     return c_counts
1203 |     
1204 | # Added in 4.304
1205 | def nvmlDeviceGetMemoryErrorCounter(handle, errorType, counterType, locationType):
1206 |     c_count = c_ulonglong()
1207 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetMemoryErrorCounter")
1208 |     ret = fn(handle,
1209 |              _nvmlMemoryErrorType_t(errorType),
1210 |              _nvmlEccCounterType_t(counterType),
1211 |              _nvmlMemoryLocation_t(locationType),
1212 |              byref(c_count))
1213 |     _nvmlCheckReturn(ret)
1214 |     return c_count.value
1215 |     
1216 | def nvmlDeviceGetUtilizationRates(handle):
1217 |     c_util = c_nvmlUtilization_t()
1218 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetUtilizationRates")
1219 |     ret = fn(handle, byref(c_util))
1220 |     _nvmlCheckReturn(ret)
1221 |     return c_util
1222 | 
1223 | def nvmlDeviceGetEncoderUtilization(handle):
1224 |     c_util = c_uint()
1225 |     c_samplingPeriod = c_uint()
1226 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetEncoderUtilization")
1227 |     ret = fn(handle, byref(c_util), byref(c_samplingPeriod))
1228 |     _nvmlCheckReturn(ret)
1229 |     return [c_util.value, c_samplingPeriod.value]
1230 | 
1231 | def nvmlDeviceGetDecoderUtilization(handle):
1232 |     c_util = c_uint()
1233 |     c_samplingPeriod = c_uint()
1234 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetDecoderUtilization")
1235 |     ret = fn(handle, byref(c_util), byref(c_samplingPeriod))
1236 |     _nvmlCheckReturn(ret)
1237 |     return [c_util.value, c_samplingPeriod.value]
1238 | 
1239 | def nvmlDeviceGetPcieReplayCounter(handle):
1240 |     c_replay = c_uint()
1241 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPcieReplayCounter")
1242 |     ret = fn(handle, byref(c_replay))
1243 |     _nvmlCheckReturn(ret)
1244 |     return c_replay.value
1245 | 
1246 | def nvmlDeviceGetDriverModel(handle):
1247 |     c_currModel = _nvmlDriverModel_t()
1248 |     c_pendingModel = _nvmlDriverModel_t()
1249 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetDriverModel")
1250 |     ret = fn(handle, byref(c_currModel), byref(c_pendingModel))
1251 |     _nvmlCheckReturn(ret)
1252 |     return [c_currModel.value, c_pendingModel.value]
1253 | 
1254 | # added to API
1255 | def nvmlDeviceGetCurrentDriverModel(handle):
1256 |     return nvmlDeviceGetDriverModel(handle)[0]
1257 | 
1258 | # added to API
1259 | def nvmlDeviceGetPendingDriverModel(handle):
1260 |     return nvmlDeviceGetDriverModel(handle)[1]
1261 | 
1262 | # Added in 2.285
1263 | def nvmlDeviceGetVbiosVersion(handle):
1264 |     c_version = create_string_buffer(NVML_DEVICE_VBIOS_VERSION_BUFFER_SIZE)
1265 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetVbiosVersion")
1266 |     ret = fn(handle, c_version, c_uint(NVML_DEVICE_VBIOS_VERSION_BUFFER_SIZE))
1267 |     _nvmlCheckReturn(ret)
1268 |     return c_version.value
1269 | 
1270 | # Added in 2.285
1271 | def nvmlDeviceGetComputeRunningProcesses(handle):
1272 |     # first call to get the size
1273 |     c_count = c_uint(0)
1274 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses")
1275 |     ret = fn(handle, byref(c_count), None)
1276 |     
1277 |     if (ret == NVML_SUCCESS):
1278 |         # special case, no running processes
1279 |         return []
1280 |     elif (ret == NVML_ERROR_INSUFFICIENT_SIZE):
1281 |         # typical case
1282 |         # oversize the array in case more processes are created
1283 |         c_count.value = c_count.value * 2 + 5
1284 |         proc_array = c_nvmlProcessInfo_t * c_count.value
1285 |         c_procs = proc_array()
1286 |         
1287 |         # make the call again
1288 |         ret = fn(handle, byref(c_count), c_procs)
1289 |         _nvmlCheckReturn(ret)
1290 |         
1291 |         procs = []
1292 |         for i in range(c_count.value):
1293 |             # use an alternative struct for this object
1294 |             obj = nvmlStructToFriendlyObject(c_procs[i])
1295 |             if (obj.usedGpuMemory == NVML_VALUE_NOT_AVAILABLE_ulonglong.value):
1296 |                 # special case for WDDM on Windows, see comment above
1297 |                 obj.usedGpuMemory = None
1298 |             procs.append(obj)
1299 | 
1300 |         return procs
1301 |     else:
1302 |         # error case
1303 |         raise NVMLError(ret)
1304 | 
1305 | def nvmlDeviceGetGraphicsRunningProcesses(handle):
1306 |     # first call to get the size
1307 |     c_count = c_uint(0)
1308 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetGraphicsRunningProcesses")
1309 |     ret = fn(handle, byref(c_count), None)
1310 | 
1311 |     if (ret == NVML_SUCCESS):
1312 |         # special case, no running processes
1313 |         return []
1314 |     elif (ret == NVML_ERROR_INSUFFICIENT_SIZE):
1315 |         # typical case
1316 |         # oversize the array in case more processes are created
1317 |         c_count.value = c_count.value * 2 + 5
1318 |         proc_array = c_nvmlProcessInfo_t * c_count.value
1319 |         c_procs = proc_array()
1320 |         
1321 |         # make the call again
1322 |         ret = fn(handle, byref(c_count), c_procs)
1323 |         _nvmlCheckReturn(ret)
1324 |         
1325 |         procs = []
1326 |         for i in range(c_count.value):
1327 |             # use an alternative struct for this object
1328 |             obj = nvmlStructToFriendlyObject(c_procs[i])
1329 |             if (obj.usedGpuMemory == NVML_VALUE_NOT_AVAILABLE_ulonglong.value):
1330 |                 # special case for WDDM on Windows, see comment above
1331 |                 obj.usedGpuMemory = None
1332 |             procs.append(obj)
1333 | 
1334 |         return procs
1335 |     else:
1336 |         # error case
1337 |         raise NVMLError(ret)
1338 | 
1339 | # Raises NVMLError (NVML_ERROR_NOT_SUPPORTED) if the hardware doesn't support auto boosted clocks
1340 | def nvmlDeviceGetAutoBoostedClocksEnabled(handle):
1341 |     c_isEnabled = _nvmlEnableState_t()
1342 |     c_defaultIsEnabled = _nvmlEnableState_t()
1343 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetAutoBoostedClocksEnabled")
1344 |     ret = fn(handle, byref(c_isEnabled), byref(c_defaultIsEnabled))
1345 |     _nvmlCheckReturn(ret)
1346 |     return [c_isEnabled.value, c_defaultIsEnabled.value]
1347 | 
1348 | ## Set functions
1349 | def nvmlUnitSetLedState(unit, color):
1350 |     fn = _nvmlGetFunctionPointer("nvmlUnitSetLedState")
1351 |     ret = fn(unit, _nvmlLedColor_t(color))
1352 |     _nvmlCheckReturn(ret)
1353 |     return None
1354 |     
1355 | def nvmlDeviceSetPersistenceMode(handle, mode):
1356 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetPersistenceMode")
1357 |     ret = fn(handle, _nvmlEnableState_t(mode))
1358 |     _nvmlCheckReturn(ret)
1359 |     return None
1360 |     
1361 | def nvmlDeviceSetComputeMode(handle, mode):
1362 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetComputeMode")
1363 |     ret = fn(handle, _nvmlComputeMode_t(mode))
1364 |     _nvmlCheckReturn(ret)
1365 |     return None
1366 |     
1367 | def nvmlDeviceSetEccMode(handle, mode):
1368 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetEccMode")
1369 |     ret = fn(handle, _nvmlEnableState_t(mode))
1370 |     _nvmlCheckReturn(ret)
1371 |     return None
1372 | 
1373 | def nvmlDeviceClearEccErrorCounts(handle, counterType):
1374 |     fn = _nvmlGetFunctionPointer("nvmlDeviceClearEccErrorCounts")
1375 |     ret = fn(handle, _nvmlEccCounterType_t(counterType))
1376 |     _nvmlCheckReturn(ret)
1377 |     return None
1378 | 
1379 | def nvmlDeviceSetDriverModel(handle, model):
1380 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetDriverModel")
1381 |     ret = fn(handle, _nvmlDriverModel_t(model))
1382 |     _nvmlCheckReturn(ret)
1383 |     return None
1384 |     
1385 | # Raises NVMLError (NVML_ERROR_NOT_SUPPORTED) if the hardware doesn't support setting auto boosted clocks
1386 | def nvmlDeviceSetAutoBoostedClocksEnabled(handle, enabled):
1387 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetAutoBoostedClocksEnabled")
1388 |     ret = fn(handle, _nvmlEnableState_t(enabled))
1389 |     _nvmlCheckReturn(ret)
1390 |     return None
1391 | 
1392 | # Raises NVMLError (NVML_ERROR_NOT_SUPPORTED) if the hardware doesn't support setting auto boosted clocks
1393 | def nvmlDeviceSetDefaultAutoBoostedClocksEnabled(handle, enabled, flags):
1394 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetDefaultAutoBoostedClocksEnabled")
1395 |     ret = fn(handle, _nvmlEnableState_t(enabled), c_uint(flags))
1396 |     _nvmlCheckReturn(ret)
1397 |     return None
1398 | 
1399 | # Added in 4.304
1400 | def nvmlDeviceSetApplicationsClocks(handle, maxMemClockMHz, maxGraphicsClockMHz):
1401 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetApplicationsClocks")
1402 |     ret = fn(handle, c_uint(maxMemClockMHz), c_uint(maxGraphicsClockMHz))
1403 |     _nvmlCheckReturn(ret)
1404 |     return None
1405 |     
1406 | # Added in 4.304
1407 | def nvmlDeviceResetApplicationsClocks(handle):
1408 |     fn = _nvmlGetFunctionPointer("nvmlDeviceResetApplicationsClocks")
1409 |     ret = fn(handle)
1410 |     _nvmlCheckReturn(ret)
1411 |     return None
1412 | 
1413 | # Added in 4.304
1414 | def nvmlDeviceSetPowerManagementLimit(handle, limit):
1415 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetPowerManagementLimit")
1416 |     ret = fn(handle, c_uint(limit))
1417 |     _nvmlCheckReturn(ret)
1418 |     return None
1419 |     
1420 | # Added in 4.304
1421 | def nvmlDeviceSetGpuOperationMode(handle, mode):
1422 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetGpuOperationMode")
1423 |     ret = fn(handle, _nvmlGpuOperationMode_t(mode))
1424 |     _nvmlCheckReturn(ret)
1425 |     return None
1426 | 
1427 | # Added in 2.285
1428 | def nvmlEventSetCreate():
1429 |     fn = _nvmlGetFunctionPointer("nvmlEventSetCreate")
1430 |     eventSet = c_nvmlEventSet_t()
1431 |     ret = fn(byref(eventSet))
1432 |     _nvmlCheckReturn(ret)
1433 |     return eventSet
1434 | 
1435 | # Added in 2.285
1436 | def nvmlDeviceRegisterEvents(handle, eventTypes, eventSet):
1437 |     fn = _nvmlGetFunctionPointer("nvmlDeviceRegisterEvents")
1438 |     ret = fn(handle, c_ulonglong(eventTypes), eventSet)
1439 |     _nvmlCheckReturn(ret)
1440 |     return None
1441 | 
1442 | # Added in 2.285
1443 | def nvmlDeviceGetSupportedEventTypes(handle):
1444 |     c_eventTypes = c_ulonglong()
1445 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetSupportedEventTypes")
1446 |     ret = fn(handle, byref(c_eventTypes))
1447 |     _nvmlCheckReturn(ret)
1448 |     return c_eventTypes.value
1449 | 
1450 | # Added in 2.285
1451 | # raises NVML_ERROR_TIMEOUT exception on timeout
1452 | def nvmlEventSetWait(eventSet, timeoutms):
1453 |     fn = _nvmlGetFunctionPointer("nvmlEventSetWait")
1454 |     data = c_nvmlEventData_t()
1455 |     ret = fn(eventSet, byref(data), c_uint(timeoutms))
1456 |     _nvmlCheckReturn(ret)
1457 |     return data
1458 | 
1459 | # Added in 2.285
1460 | def nvmlEventSetFree(eventSet):
1461 |     fn = _nvmlGetFunctionPointer("nvmlEventSetFree")
1462 |     ret = fn(eventSet)
1463 |     _nvmlCheckReturn(ret)
1464 |     return None
1465 | 
1466 | # Added in 3.295
1467 | def nvmlDeviceOnSameBoard(handle1, handle2):
1468 |     fn = _nvmlGetFunctionPointer("nvmlDeviceOnSameBoard")
1469 |     onSameBoard = c_int()
1470 |     ret = fn(handle1, handle2, byref(onSameBoard))
1471 |     _nvmlCheckReturn(ret)
1472 |     return (onSameBoard.value != 0)
1473 | 
1474 | # Added in 3.295
1475 | def nvmlDeviceGetCurrPcieLinkGeneration(handle):
1476 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetCurrPcieLinkGeneration")
1477 |     gen = c_uint()
1478 |     ret = fn(handle, byref(gen))
1479 |     _nvmlCheckReturn(ret)
1480 |     return gen.value
1481 | 
1482 | # Added in 3.295
1483 | def nvmlDeviceGetMaxPcieLinkGeneration(handle):
1484 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetMaxPcieLinkGeneration")
1485 |     gen = c_uint()
1486 |     ret = fn(handle, byref(gen))
1487 |     _nvmlCheckReturn(ret)
1488 |     return gen.value
1489 | 
1490 | # Added in 3.295
1491 | def nvmlDeviceGetCurrPcieLinkWidth(handle):
1492 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetCurrPcieLinkWidth")
1493 |     width = c_uint()
1494 |     ret = fn(handle, byref(width))
1495 |     _nvmlCheckReturn(ret)
1496 |     return width.value
1497 | 
1498 | # Added in 3.295
1499 | def nvmlDeviceGetMaxPcieLinkWidth(handle):
1500 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetMaxPcieLinkWidth")
1501 |     width = c_uint()
1502 |     ret = fn(handle, byref(width))
1503 |     _nvmlCheckReturn(ret)
1504 |     return width.value
1505 | 
1506 | # Added in 4.304
1507 | def nvmlDeviceGetSupportedClocksThrottleReasons(handle):
1508 |     c_reasons = c_ulonglong()
1509 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetSupportedClocksThrottleReasons")
1510 |     ret = fn(handle, byref(c_reasons))
1511 |     _nvmlCheckReturn(ret)
1512 |     return c_reasons.value
1513 | 
1514 | # Added in 4.304
1515 | def nvmlDeviceGetCurrentClocksThrottleReasons(handle):
1516 |     c_reasons = c_ulonglong()
1517 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetCurrentClocksThrottleReasons")
1518 |     ret = fn(handle, byref(c_reasons))
1519 |     _nvmlCheckReturn(ret)
1520 |     return c_reasons.value
1521 | 
1522 | # Added in 5.319
1523 | def nvmlDeviceGetIndex(handle):
1524 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetIndex")
1525 |     c_index = c_uint()
1526 |     ret = fn(handle, byref(c_index))
1527 |     _nvmlCheckReturn(ret)
1528 |     return c_index.value
1529 | 
1530 | # Added in 5.319
1531 | def nvmlDeviceGetAccountingMode(handle):
1532 |     c_mode = _nvmlEnableState_t()
1533 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetAccountingMode")
1534 |     ret = fn(handle, byref(c_mode))
1535 |     _nvmlCheckReturn(ret)
1536 |     return c_mode.value
1537 |     
1538 | def nvmlDeviceSetAccountingMode(handle, mode):
1539 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetAccountingMode")
1540 |     ret = fn(handle, _nvmlEnableState_t(mode))
1541 |     _nvmlCheckReturn(ret)
1542 |     return None
1543 | 
1544 | def nvmlDeviceClearAccountingPids(handle):
1545 |     fn = _nvmlGetFunctionPointer("nvmlDeviceClearAccountingPids")
1546 |     ret = fn(handle)
1547 |     _nvmlCheckReturn(ret)
1548 |     return None
1549 | 
1550 | def nvmlDeviceGetAccountingStats(handle, pid):
1551 |     stats = c_nvmlAccountingStats_t()
1552 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetAccountingStats")
1553 |     ret = fn(handle, c_uint(pid), byref(stats))
1554 |     _nvmlCheckReturn(ret)
1555 |     if (stats.maxMemoryUsage == NVML_VALUE_NOT_AVAILABLE_ulonglong.value):
1556 |         # special case for WDDM on Windows, see comment above
1557 |         stats.maxMemoryUsage = None
1558 |     return stats
1559 | 
1560 | def nvmlDeviceGetAccountingPids(handle):
1561 |     count = c_uint(nvmlDeviceGetAccountingBufferSize(handle))
1562 |     pids = (c_uint * count.value)()
1563 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetAccountingPids")
1564 |     ret = fn(handle, byref(count), pids)
1565 |     _nvmlCheckReturn(ret)
1566 |     return list(map(int, pids[0:count.value]))  # list() because map() is lazy on Python 3
1567 | 
1568 | def nvmlDeviceGetAccountingBufferSize(handle):
1569 |     bufferSize = c_uint()
1570 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetAccountingBufferSize")
1571 |     ret = fn(handle, byref(bufferSize))
1572 |     _nvmlCheckReturn(ret)
1573 |     return int(bufferSize.value)
1574 | 
1575 | def nvmlDeviceGetRetiredPages(device, sourceFilter):
1576 |     c_source = _nvmlPageRetirementCause_t(sourceFilter)
1577 |     c_count = c_uint(0)
1578 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetRetiredPages")
1579 |     
1580 |     # First call will get the size
1581 |     ret = fn(device, c_source, byref(c_count), None)
1582 |     
1583 |     # this should only fail with insufficient size
1584 |     if ((ret != NVML_SUCCESS) and
1585 |         (ret != NVML_ERROR_INSUFFICIENT_SIZE)):
1586 |         raise NVMLError(ret)
1587 | 
1588 |     # call again with a buffer
1589 |     # oversize the array for the rare cases where additional pages
1590 |     # are retired between NVML calls
1591 |     c_count.value = c_count.value * 2 + 5
1592 |     page_array = c_ulonglong * c_count.value
1593 |     c_pages = page_array()
1594 |     ret = fn(device, c_source, byref(c_count), c_pages)
1595 |     _nvmlCheckReturn(ret)
1596 |     return list(map(int, c_pages[0:c_count.value]))  # list() because map() is lazy on Python 3
1597 | 
1598 | def nvmlDeviceGetRetiredPagesPendingStatus(device):
1599 |     c_pending = _nvmlEnableState_t()
1600 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetRetiredPagesPendingStatus")
1601 |     ret = fn(device, byref(c_pending))
1602 |     _nvmlCheckReturn(ret)
1603 |     return int(c_pending.value)
1604 | 
1605 | def nvmlDeviceGetAPIRestriction(device, apiType):
1606 |     c_permission = _nvmlEnableState_t()
1607 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetAPIRestriction")
1608 |     ret = fn(device, _nvmlRestrictedAPI_t(apiType), byref(c_permission))
1609 |     _nvmlCheckReturn(ret)
1610 |     return int(c_permission.value)
1611 | 
1612 | def nvmlDeviceSetAPIRestriction(handle, apiType, isRestricted):
1613 |     fn = _nvmlGetFunctionPointer("nvmlDeviceSetAPIRestriction")
1614 |     ret = fn(handle, _nvmlRestrictedAPI_t(apiType), _nvmlEnableState_t(isRestricted))
1615 |     _nvmlCheckReturn(ret)
1616 |     return None
1617 | 
1618 | def nvmlDeviceGetBridgeChipInfo(handle):
1619 |     bridgeHierarchy = c_nvmlBridgeChipHierarchy_t()
1620 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetBridgeChipInfo")
1621 |     ret = fn(handle, byref(bridgeHierarchy))
1622 |     _nvmlCheckReturn(ret)
1623 |     return bridgeHierarchy
1624 | 
1625 | def nvmlDeviceGetSamples(device, sampling_type, timeStamp):
1626 |     c_sampling_type = _nvmlSamplingType_t(sampling_type)
1627 |     c_time_stamp = c_ulonglong(timeStamp)
1628 |     c_sample_count = c_uint(0)
1629 |     c_sample_value_type = _nvmlValueType_t()
1630 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetSamples")
1631 | 
1632 |     ## First Call gets the size
1633 |     ret = fn(device, c_sampling_type, c_time_stamp, byref(c_sample_value_type), byref(c_sample_count), None)
1634 | 
1635 |     # Stop if this fails
1636 |     if (ret != NVML_SUCCESS):
1637 |         raise NVMLError(ret)
1638 | 
1639 |     sampleArray = c_sample_count.value * c_nvmlSample_t
1640 |     c_samples = sampleArray()
1641 |     ret = fn(device, c_sampling_type, c_time_stamp,  byref(c_sample_value_type), byref(c_sample_count), c_samples)
1642 |     _nvmlCheckReturn(ret)
1643 |     return (c_sample_value_type.value, c_samples[0:c_sample_count.value])
1644 | 
1645 | def nvmlDeviceGetViolationStatus(device, perfPolicyType):
1646 |     c_perfPolicy_type = _nvmlPerfPolicyType_t(perfPolicyType)
1647 |     c_violTime = c_nvmlViolationTime_t()
1648 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetViolationStatus")
1649 | 
1650 |     ## Invoke the method to get violation time
1651 |     ret = fn(device, c_perfPolicy_type, byref(c_violTime))
1652 |     _nvmlCheckReturn(ret)
1653 |     return c_violTime
1654 |     
1655 | def nvmlDeviceGetPcieThroughput(device, counter):
1656 |     c_util = c_uint()
1657 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetPcieThroughput")
1658 |     ret = fn(device, _nvmlPcieUtilCounter_t(counter), byref(c_util))
1659 |     _nvmlCheckReturn(ret)
1660 |     return c_util.value
1661 | 
1662 | def nvmlSystemGetTopologyGpuSet(cpuNumber):
1663 |     c_count = c_uint(0)
1664 |     fn = _nvmlGetFunctionPointer("nvmlSystemGetTopologyGpuSet")
1665 | 
1666 |     # First call will get the size
1667 |     ret = fn(cpuNumber, byref(c_count), None)
1668 | 
1669 |     if ret != NVML_SUCCESS:
1670 |         raise NVMLError(ret)
1671 | 
1672 |     # call again with a buffer
1673 |     device_array = c_nvmlDevice_t * c_count.value
1674 |     c_devices = device_array()
1675 |     ret = fn(cpuNumber, byref(c_count), c_devices)
1676 |     _nvmlCheckReturn(ret)
1677 |     return list(c_devices[0:c_count.value])  # map(None, ...) is Python 2 only
1678 | 
1679 | def nvmlDeviceGetTopologyNearestGpus(device, level):
1680 |     c_count = c_uint(0)
1681 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetTopologyNearestGpus")
1682 | 
1683 |     # First call will get the size
1684 |     ret = fn(device, level, byref(c_count), None)
1685 | 
1686 |     if ret != NVML_SUCCESS:
1687 |         raise NVMLError(ret)
1688 | 
1689 |     # call again with a buffer
1690 |     device_array = c_nvmlDevice_t * c_count.value
1691 |     c_devices = device_array()
1692 |     ret = fn(device, level, byref(c_count), c_devices)
1693 |     _nvmlCheckReturn(ret)
1694 |     return list(c_devices[0:c_count.value])  # map(None, ...) is Python 2 only
1695 | 
1696 | def nvmlDeviceGetTopologyCommonAncestor(device1, device2):
1697 |     c_level = _nvmlGpuTopologyLevel_t()
1698 |     fn = _nvmlGetFunctionPointer("nvmlDeviceGetTopologyCommonAncestor")
1699 |     ret = fn(device1, device2, byref(c_level))
1700 |     _nvmlCheckReturn(ret)
1701 |     return c_level.value
1702 | 


--------------------------------------------------------------------------------
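Several wrappers in `pynvml.py` above (for example `nvmlDeviceGetRetiredPages`) use a two-call pattern: the first call passes no buffer and only learns the required size, the second call passes an oversized ctypes array. The sketch below illustrates that pattern without needing a GPU or the NVML library. Everything in it is an assumption made for illustration: `fake_get_retired_pages`, `_FAKE_RETIRED_PAGES`, and the status constants are hypothetical stand-ins, not part of NVML or of this repository, and the counter is passed directly instead of via `byref()` because the callee here is plain Python.

```python
from ctypes import c_uint, c_ulonglong

# Hypothetical status codes, named after the NVML constants used in pynvml.py.
SUCCESS = 0
ERROR_INSUFFICIENT_SIZE = 7

# Hypothetical data that the stand-in "native" call reports.
_FAKE_RETIRED_PAGES = [0x1000, 0x2000, 0x3000]


def fake_get_retired_pages(c_count, c_pages):
    """Stand-in for a native call: fills c_pages only if it is large enough."""
    needed = len(_FAKE_RETIRED_PAGES)
    if c_pages is None or c_count.value < needed:
        # Report the required size and signal that the buffer was too small.
        c_count.value = needed
        return ERROR_INSUFFICIENT_SIZE
    for i, page in enumerate(_FAKE_RETIRED_PAGES):
        c_pages[i] = page
    c_count.value = needed
    return SUCCESS


def get_retired_pages():
    """Two-call pattern: query the size first, then fetch into a buffer."""
    c_count = c_uint(0)

    # First call: no buffer, so the callee only reports the required size.
    ret = fake_get_retired_pages(c_count, None)
    if ret not in (SUCCESS, ERROR_INSUFFICIENT_SIZE):
        raise RuntimeError("unexpected status %d" % ret)

    # Oversize the buffer in case more entries appear between the two calls.
    c_count.value = c_count.value * 2 + 5
    c_pages = (c_ulonglong * c_count.value)()

    # Second call: the callee fills the buffer and updates the count.
    ret = fake_get_retired_pages(c_count, c_pages)
    if ret != SUCCESS:
        raise RuntimeError("unexpected status %d" % ret)

    # Materialize a list so the result behaves the same on Python 2 and 3.
    return list(map(int, c_pages[0:c_count.value]))
```

Calling `get_retired_pages()` returns a plain list of integers. The explicit `list(...)` matters on Python 3, where `map()` returns a lazy iterator rather than a list.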