├── LICENSE ├── Makefile ├── README.md ├── SECURITY.md ├── images └── sinewave.png ├── psst.1 ├── src ├── logger.c ├── logger.h ├── parse_config.c ├── parse_config.h ├── perf_msr.c ├── perf_msr.h ├── psst.c ├── psst.h ├── rapl.c └── rapl.h └── whitepapers └── Generic_perf_per_watt.pdf /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 2, June 1991 3 | 4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc., 5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 6 | Everyone is permitted to copy and distribute verbatim copies 7 | of this license document, but changing it is not allowed. 8 | 9 | Preamble 10 | 11 | The licenses for most software are designed to take away your 12 | freedom to share and change it. By contrast, the GNU General Public 13 | License is intended to guarantee your freedom to share and change free 14 | software--to make sure the software is free for all its users. This 15 | General Public License applies to most of the Free Software 16 | Foundation's software and to any other program whose authors commit to 17 | using it. (Some other Free Software Foundation software is covered by 18 | the GNU Lesser General Public License instead.) You can apply it to 19 | your programs, too. 20 | 21 | When we speak of free software, we are referring to freedom, not 22 | price. Our General Public Licenses are designed to make sure that you 23 | have the freedom to distribute copies of free software (and charge for 24 | this service if you wish), that you receive source code or can get it 25 | if you want it, that you can change the software or use pieces of it 26 | in new free programs; and that you know you can do these things. 27 | 28 | To protect your rights, we need to make restrictions that forbid 29 | anyone to deny you these rights or to ask you to surrender the rights. 
30 | These restrictions translate to certain responsibilities for you if you 31 | distribute copies of the software, or if you modify it. 32 | 33 | For example, if you distribute copies of such a program, whether 34 | gratis or for a fee, you must give the recipients all the rights that 35 | you have. You must make sure that they, too, receive or can get the 36 | source code. And you must show them these terms so they know their 37 | rights. 38 | 39 | We protect your rights with two steps: (1) copyright the software, and 40 | (2) offer you this license which gives you legal permission to copy, 41 | distribute and/or modify the software. 42 | 43 | Also, for each author's protection and ours, we want to make certain 44 | that everyone understands that there is no warranty for this free 45 | software. If the software is modified by someone else and passed on, we 46 | want its recipients to know that what they have is not the original, so 47 | that any problems introduced by others will not reflect on the original 48 | authors' reputations. 49 | 50 | Finally, any free program is threatened constantly by software 51 | patents. We wish to avoid the danger that redistributors of a free 52 | program will individually obtain patent licenses, in effect making the 53 | program proprietary. To prevent this, we have made it clear that any 54 | patent must be licensed for everyone's free use or not licensed at all. 55 | 56 | The precise terms and conditions for copying, distribution and 57 | modification follow. 58 | 59 | GNU GENERAL PUBLIC LICENSE 60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 61 | 62 | 0. This License applies to any program or other work which contains 63 | a notice placed by the copyright holder saying it may be distributed 64 | under the terms of this General Public License. 
The "Program", below, 65 | refers to any such program or work, and a "work based on the Program" 66 | means either the Program or any derivative work under copyright law: 67 | that is to say, a work containing the Program or a portion of it, 68 | either verbatim or with modifications and/or translated into another 69 | language. (Hereinafter, translation is included without limitation in 70 | the term "modification".) Each licensee is addressed as "you". 71 | 72 | Activities other than copying, distribution and modification are not 73 | covered by this License; they are outside its scope. The act of 74 | running the Program is not restricted, and the output from the Program 75 | is covered only if its contents constitute a work based on the 76 | Program (independent of having been made by running the Program). 77 | Whether that is true depends on what the Program does. 78 | 79 | 1. You may copy and distribute verbatim copies of the Program's 80 | source code as you receive it, in any medium, provided that you 81 | conspicuously and appropriately publish on each copy an appropriate 82 | copyright notice and disclaimer of warranty; keep intact all the 83 | notices that refer to this License and to the absence of any warranty; 84 | and give any other recipients of the Program a copy of this License 85 | along with the Program. 86 | 87 | You may charge a fee for the physical act of transferring a copy, and 88 | you may at your option offer warranty protection in exchange for a fee. 89 | 90 | 2. You may modify your copy or copies of the Program or any portion 91 | of it, thus forming a work based on the Program, and copy and 92 | distribute such modifications or work under the terms of Section 1 93 | above, provided that you also meet all of these conditions: 94 | 95 | a) You must cause the modified files to carry prominent notices 96 | stating that you changed the files and the date of any change. 
97 | 98 | b) You must cause any work that you distribute or publish, that in 99 | whole or in part contains or is derived from the Program or any 100 | part thereof, to be licensed as a whole at no charge to all third 101 | parties under the terms of this License. 102 | 103 | c) If the modified program normally reads commands interactively 104 | when run, you must cause it, when started running for such 105 | interactive use in the most ordinary way, to print or display an 106 | announcement including an appropriate copyright notice and a 107 | notice that there is no warranty (or else, saying that you provide 108 | a warranty) and that users may redistribute the program under 109 | these conditions, and telling the user how to view a copy of this 110 | License. (Exception: if the Program itself is interactive but 111 | does not normally print such an announcement, your work based on 112 | the Program is not required to print an announcement.) 113 | 114 | These requirements apply to the modified work as a whole. If 115 | identifiable sections of that work are not derived from the Program, 116 | and can be reasonably considered independent and separate works in 117 | themselves, then this License, and its terms, do not apply to those 118 | sections when you distribute them as separate works. But when you 119 | distribute the same sections as part of a whole which is a work based 120 | on the Program, the distribution of the whole must be on the terms of 121 | this License, whose permissions for other licensees extend to the 122 | entire whole, and thus to each and every part regardless of who wrote it. 123 | 124 | Thus, it is not the intent of this section to claim rights or contest 125 | your rights to work written entirely by you; rather, the intent is to 126 | exercise the right to control the distribution of derivative or 127 | collective works based on the Program. 
128 | 129 | In addition, mere aggregation of another work not based on the Program 130 | with the Program (or with a work based on the Program) on a volume of 131 | a storage or distribution medium does not bring the other work under 132 | the scope of this License. 133 | 134 | 3. You may copy and distribute the Program (or a work based on it, 135 | under Section 2) in object code or executable form under the terms of 136 | Sections 1 and 2 above provided that you also do one of the following: 137 | 138 | a) Accompany it with the complete corresponding machine-readable 139 | source code, which must be distributed under the terms of Sections 140 | 1 and 2 above on a medium customarily used for software interchange; or, 141 | 142 | b) Accompany it with a written offer, valid for at least three 143 | years, to give any third party, for a charge no more than your 144 | cost of physically performing source distribution, a complete 145 | machine-readable copy of the corresponding source code, to be 146 | distributed under the terms of Sections 1 and 2 above on a medium 147 | customarily used for software interchange; or, 148 | 149 | c) Accompany it with the information you received as to the offer 150 | to distribute corresponding source code. (This alternative is 151 | allowed only for noncommercial distribution and only if you 152 | received the program in object code or executable form with such 153 | an offer, in accord with Subsection b above.) 154 | 155 | The source code for a work means the preferred form of the work for 156 | making modifications to it. For an executable work, complete source 157 | code means all the source code for all modules it contains, plus any 158 | associated interface definition files, plus the scripts used to 159 | control compilation and installation of the executable. 
However, as a 160 | special exception, the source code distributed need not include 161 | anything that is normally distributed (in either source or binary 162 | form) with the major components (compiler, kernel, and so on) of the 163 | operating system on which the executable runs, unless that component 164 | itself accompanies the executable. 165 | 166 | If distribution of executable or object code is made by offering 167 | access to copy from a designated place, then offering equivalent 168 | access to copy the source code from the same place counts as 169 | distribution of the source code, even though third parties are not 170 | compelled to copy the source along with the object code. 171 | 172 | 4. You may not copy, modify, sublicense, or distribute the Program 173 | except as expressly provided under this License. Any attempt 174 | otherwise to copy, modify, sublicense or distribute the Program is 175 | void, and will automatically terminate your rights under this License. 176 | However, parties who have received copies, or rights, from you under 177 | this License will not have their licenses terminated so long as such 178 | parties remain in full compliance. 179 | 180 | 5. You are not required to accept this License, since you have not 181 | signed it. However, nothing else grants you permission to modify or 182 | distribute the Program or its derivative works. These actions are 183 | prohibited by law if you do not accept this License. Therefore, by 184 | modifying or distributing the Program (or any work based on the 185 | Program), you indicate your acceptance of this License to do so, and 186 | all its terms and conditions for copying, distributing or modifying 187 | the Program or works based on it. 188 | 189 | 6. 
Each time you redistribute the Program (or any work based on the 190 | Program), the recipient automatically receives a license from the 191 | original licensor to copy, distribute or modify the Program subject to 192 | these terms and conditions. You may not impose any further 193 | restrictions on the recipients' exercise of the rights granted herein. 194 | You are not responsible for enforcing compliance by third parties to 195 | this License. 196 | 197 | 7. If, as a consequence of a court judgment or allegation of patent 198 | infringement or for any other reason (not limited to patent issues), 199 | conditions are imposed on you (whether by court order, agreement or 200 | otherwise) that contradict the conditions of this License, they do not 201 | excuse you from the conditions of this License. If you cannot 202 | distribute so as to satisfy simultaneously your obligations under this 203 | License and any other pertinent obligations, then as a consequence you 204 | may not distribute the Program at all. For example, if a patent 205 | license would not permit royalty-free redistribution of the Program by 206 | all those who receive copies directly or indirectly through you, then 207 | the only way you could satisfy both it and this License would be to 208 | refrain entirely from distribution of the Program. 209 | 210 | If any portion of this section is held invalid or unenforceable under 211 | any particular circumstance, the balance of the section is intended to 212 | apply and the section as a whole is intended to apply in other 213 | circumstances. 214 | 215 | It is not the purpose of this section to induce you to infringe any 216 | patents or other property right claims or to contest validity of any 217 | such claims; this section has the sole purpose of protecting the 218 | integrity of the free software distribution system, which is 219 | implemented by public license practices. 
Many people have made 220 | generous contributions to the wide range of software distributed 221 | through that system in reliance on consistent application of that 222 | system; it is up to the author/donor to decide if he or she is willing 223 | to distribute software through any other system and a licensee cannot 224 | impose that choice. 225 | 226 | This section is intended to make thoroughly clear what is believed to 227 | be a consequence of the rest of this License. 228 | 229 | 8. If the distribution and/or use of the Program is restricted in 230 | certain countries either by patents or by copyrighted interfaces, the 231 | original copyright holder who places the Program under this License 232 | may add an explicit geographical distribution limitation excluding 233 | those countries, so that distribution is permitted only in or among 234 | countries not thus excluded. In such case, this License incorporates 235 | the limitation as if written in the body of this License. 236 | 237 | 9. The Free Software Foundation may publish revised and/or new versions 238 | of the General Public License from time to time. Such new versions will 239 | be similar in spirit to the present version, but may differ in detail to 240 | address new problems or concerns. 241 | 242 | Each version is given a distinguishing version number. If the Program 243 | specifies a version number of this License which applies to it and "any 244 | later version", you have the option of following the terms and conditions 245 | either of that version or of any later version published by the Free 246 | Software Foundation. If the Program does not specify a version number of 247 | this License, you may choose any version ever published by the Free Software 248 | Foundation. 249 | 250 | 10. If you wish to incorporate parts of the Program into other free 251 | programs whose distribution conditions are different, write to the author 252 | to ask for permission. 
For software which is copyrighted by the Free 253 | Software Foundation, write to the Free Software Foundation; we sometimes 254 | make exceptions for this. Our decision will be guided by the two goals 255 | of preserving the free status of all derivatives of our free software and 256 | of promoting the sharing and reuse of software generally. 257 | 258 | NO WARRANTY 259 | 260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN 262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED 264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS 266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE 267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, 268 | REPAIR OR CORRECTION. 269 | 270 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR 272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING 274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED 275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY 276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 278 | POSSIBILITY OF SUCH DAMAGES. 
279 | 280 | END OF TERMS AND CONDITIONS 281 | 282 | How to Apply These Terms to Your New Programs 283 | 284 | If you develop a new program, and you want it to be of the greatest 285 | possible use to the public, the best way to achieve this is to make it 286 | free software which everyone can redistribute and change under these terms. 287 | 288 | To do so, attach the following notices to the program. It is safest 289 | to attach them to the start of each source file to most effectively 290 | convey the exclusion of warranty; and each file should have at least 291 | the "copyright" line and a pointer to where the full notice is found. 292 | 293 | 294 | Copyright (C) 295 | 296 | This program is free software; you can redistribute it and/or modify 297 | it under the terms of the GNU General Public License as published by 298 | the Free Software Foundation; either version 2 of the License, or 299 | (at your option) any later version. 300 | 301 | This program is distributed in the hope that it will be useful, 302 | but WITHOUT ANY WARRANTY; without even the implied warranty of 303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 304 | GNU General Public License for more details. 305 | 306 | You should have received a copy of the GNU General Public License along 307 | with this program; if not, write to the Free Software Foundation, Inc., 308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 309 | 310 | Also add information on how to contact you by electronic and paper mail. 311 | 312 | If the program is interactive, make it output a short notice like this 313 | when it starts in an interactive mode: 314 | 315 | Gnomovision version 69, Copyright (C) year name of author 316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 317 | This is free software, and you are welcome to redistribute it 318 | under certain conditions; type `show c' for details. 
319 | 320 | The hypothetical commands `show w' and `show c' should show the appropriate 321 | parts of the General Public License. Of course, the commands you use may 322 | be called something other than `show w' and `show c'; they could even be 323 | mouse-clicks or menu items--whatever suits your program. 324 | 325 | You should also get your employer (if you work as a programmer) or your 326 | school, if any, to sign a "copyright disclaimer" for the program, if 327 | necessary. Here is a sample; alter the names: 328 | 329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program 330 | `Gnomovision' (which makes passes at compilers) written by James Hacker. 331 | 332 | , 1 April 1989 333 | Ty Coon, President of Vice 334 | 335 | This General Public License does not permit incorporating your program into 336 | proprietary programs. If your program is a subroutine library, you may 337 | consider it more useful to permit linking proprietary applications with the 338 | library. If this is what you want to do, use the GNU Lesser General 339 | Public License instead of this License. 
340 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | VERSION = 2.1
2 |
3 | BINDIR = /usr/bin
4 | MANDIR = /usr/share/man/man1
5 | WARNFLAGS = -Wall -Wformat
6 | CC ?= gcc
7 | CFLAGS += -D VERSION=\"$(VERSION)\"
8 | CFLAGS += -D_LINUX_ -Wall -O2 -Wfloat-equal
9 | DBG_CFLAGS = -DDEBUG -g -O0
10 | LDFLAGS += -DPASS2
11 | TARGET = psst
12 |
13 | INSTALL_PROGRAM = install -m 755 -p
14 | DEL_FILE = rm -f
15 |
16 | SRC_PATH = ./src
17 | OBJS = $(SRC_PATH)/parse_config.o $(SRC_PATH)/logger.o $(SRC_PATH)/rapl.o \
18 | 	$(SRC_PATH)/perf_msr.o $(SRC_PATH)/psst.o
19 | OBJS +=
20 |
21 | psst: $(OBJS) Makefile
22 | 	$(CC) ${CFLAGS} $(LDFLAGS) $(OBJS) -o $(TARGET) -lpthread -lrt -lm
23 |
24 | install:
25 | 	mkdir -p $(BINDIR)
26 | 	$(INSTALL_PROGRAM) "$(TARGET)" "$(BINDIR)/$(TARGET)"
27 | 	gzip -c psst.1 > psst.1.gz
28 | 	mv -f psst.1.gz $(MANDIR)
29 |
30 | uninstall:
31 | 	$(DEL_FILE) "$(BINDIR)/$(TARGET)"
32 |
33 | clean:
34 | 	find . -name "*.o" | xargs $(DEL_FILE)
35 | 	rm -f $(TARGET)
36 |
37 | dist:
38 | 	git tag v$(VERSION)
39 | 	git archive --format=tar --prefix="$(TARGET)-$(VERSION)/" v$(VERSION) | \
40 | 		gzip > $(TARGET)-$(VERSION).tar.gz
41 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Power Stress and Shaping Tool __(PSST)__ - a tool for controlled stressing and monitoring of x86 SoCs for power/thermal analysis.
2 |
3 | What can psst do?
4 | =================
5 | __PSST__ is a userspace-controlled power virus for the cpu and other SoC sub-components such as gpu & memory.
6 | Presently only the cpu is supported. The intent is to subject the SoC to different run-time-varying utilization levels for analysis.
7 | This is done by controlled duty-cycling of utilization to a specific contour.
8 | The simplest contour could be a fixed low utilization.
This allows simple usage such as logging of system parameters at a fixed low overhead even at small polling intervals (ms).
9 | More complex usages, such as the study of governors, workloads etc., are possible by applying different power shape contours.
10 |
11 | Dependencies
12 | ============
13 | The tool depends on certain x86 model-specific registers. This can be addressed by loading the msr.ko module.
14 | Normally, this should suffice:
15 |
16 |     $ sudo modprobe msr
17 |
18 | Additionally, if you need to monitor the power parameters, ensure that the kernel is up to date for the x86 platform
19 | being used. If energy counters for the platform are not supported by the present version of the intel_rapl driver, you will see this message:
20 |
21 |     $ dmesg | grep rapl
22 |     intel_rapl: driver does not support CPU family xxx model yyy
23 |
24 | In that case, update to a newer intel_rapl driver that supports the platform.
25 |
26 | Why another virus?
27 | =================
28 | The fundamental difference compared to other power virus or monitoring tools is that the work done during the hog is not
29 | a dummy function, but the tool's own useful work functions such as accounting, logging (in-memory), or power shape contour
30 | changes. psst just executes real work functions, duty-cycled in controlled loops. More work functions could be
31 | added to this tool over time, and they will be accounted for against the ON-time of the duty cycle.
32 |
33 | The tool's most important use case is logging at a fixed "own" overhead -- not more than the presently requested
34 | load (active C0 percent). This ensures that monitoring does not influence the overall system load (C0%). This makes it possible
35 | to monitor SoC power/thermal parameters at a much finer time granularity, typically comparable to a governor's poll period
36 | (tens of ms). For instance, a 10 ms poll could cause up to 50% cpu overhead in traditional polling. Further, psst's logging
37 | is aligned with the C0 activity that is being analyzed.
This ensures a well-coalesced synthetic workload.
38 |
39 | Sample output with verbose mode
40 | ===============================
41 |     Verbose mode ON
42 |     CPU domain. Following 4 cpu selected:
43 |     cpu 0 [was online or chosen]
44 |     cpu 1 [was online or chosen]
45 |     cpu 2 [was online or chosen]
46 |     cpu 3 [was online or chosen]
47 |
48 |     poll period 500ms
49 |     run duration 3600000ms
50 |     Log file path: /var/log/psst.csv
51 |     power curve shape: single-step,0.1
52 |     # Time, FreqReal, LoadIn, LoadOut, PkgPwr, PwrCore, PwrGpu, PwrDram, CpuDts, SocDts
53 |     # [ms], [MHz], [C0_%], [C0_%], [mWatt], [mWatt], [mWatt], [mWatt], [DegC], [DegC]
54 |     #--------------------------------------------------------------------------------------------------------------------------------
55 |     0, 2914, 0.10, 1.00, 0.00, 0.00, 0.00, 0.00, 24.00, 24.00
56 |     506, 2903, 6.31, 6.56, 1466.67, 868.53, 0.00, 284.18, 24.00, 22.00
57 |     1015, 2901, 2.75, 4.37, 1122.80, 572.63, 0.00, 283.08, 20.00, 21.00
58 |     1523, 2900, 1.11, 2.00, 782.23, 251.71, 0.00, 281.98, 20.00, 21.00
59 |     2031, 2900, 1.46, 1.46, 664.79, 134.03, 0.00, 279.91, 20.00, 21.00
60 |     ^C
61 |
62 | _Note:_ Sometimes a system study involves clamping values or disabling features that influence the result parameters.
63 | Typically this involves frequency-influencing features such as cpu-freq governors.
64 | Clamping frequency is not an intended part of this tool. Such requirements are best handled per platform using
65 | sysfs or another appropriate interface. However, it may be a good idea to stop the X Window System and run in console mode if background tasks are treated as irrelevant utilization noise in your analysis.
66 |
67 | Plotting and analysis
68 | =====================
69 | The generated csv can be analyzed using any plotting tool. The following example uses _gnuplot_.
70 | To plot LoadOut (i.e., the realized load) and the cpu DTS temperature, check the column numbers in the log file /var/log/psst.csv
71 |
72 |     $ sudo apt-get install gnuplot-x11
73 |     $ gnuplot
74 |     gnuplot> cd '/var/log'
75 |     gnuplot> set datafile separator ','
76 |     gnuplot> plot 'psst.csv' using 1:4 w lines, 'psst.csv' using 1:9 w lines
77 |
78 | ![alt text](images/sinewave.png)
79 |
80 | USAGE
81 | =====
82 |     $ ./psst --help
83 |
84 |     psst [options ]
85 |     Supported options are:
86 |     -C|--cpumask         hex bit mask of cpu# to be selected.
87 |                          (e.g., a1 selects cpu 0,5,7. default: every online cpu. Max:400 [1024])
88 |     -p|--poll-period     (ms) for logging (default: 500 ms)
89 |     -d|--duration        (ms) to run the tool (default: 3600000 i.e., 1hr)
90 |     -l|--log-file        (default: /var/log/psst.csv)
91 |     -v|--verbose         enables verbose mode (default: disabled when args specified)
92 |     -V|--Version         prints version when specified
93 |     -T|--track-max-cpu   track the cpu# which had max freq during each polling
94 |     -h|--help            prints usage when specified
95 |     -s|--shape-func      (default: single-step,0.1)
96 |     Supported power shape functions & args are:
97 |       single-step,v      where v is load step height.
98 |       sinosoid,w,a       where w is wavelength [seconds] and a is the max amplitude (load %)
99 |       stair-case,v,u     where v is load step height, u is step length (sec)
100 |       single-pulse,v,u   where v is load step height, u is step length (sec)
101 |       linear-ramp,m      where m is the slope (load/sec)
102 |       saw-tooth,m,a      slope m (load/sec); reversed after max a% or min(0.1)%
103 |
104 | example 1: use psst just for logging system power/thermal parameters with minimum overhead
105 |     $ sudo ./psst #implied default args: -s single-step,0.1 -p 500 -v
106 |
107 | example 2: linear ramp CPU power with slope 3 (i.e., 3% usage increase every sec) applied for cpu0, cpu1 & cpu3.
108 | poll and report every 700 ms. output on terminal.
run for 33 sec
109 |     $ sudo ./psst -s linear-ramp,3 -C b -p 700 -d 33000 -v
110 |
111 | More details on some options
112 | ============================
113 | -T|--track-max-cpu Track the cpu# with maxed freq during poll period
114 | With this option, the cpu that reached the max frequency in the poll window is determined & printed in the output.
115 | One application of this is checking for symmetric load distribution on SMP systems, wherein each cpu
116 | gets a roughly equal percentage of hits at the end.
117 |
118 |
119 | _Note:_ different cpus can be stressed with different functions simultaneously. To do this, just invoke separate
120 | commands for each cpu. Here is a fun example to demonstrate the controllability: linear ramp on cpu0,
121 | sine wave on cpu1, single-step on cpu2, single-pulse on cpu3 -- at the same time. Since -C takes a hex bit mask, cpu0..cpu3 are selected with masks 1, 2, 4 and 8.
122 |
123 | Launch a "system monitor"-like utility to observe the system load. Then execute the following _one-shot_ in a terminal window:
124 |
125 |     sleep 2; sudo ./psst -C 1 -d 30000 -s linear-ramp,2 &
126 |     sleep 2; sudo ./psst -C 2 -d 30000 -s sinosoid,15,50 &
127 |     sleep 2; sudo ./psst -C 4 -d 30000 -s single-step,20 &
128 |     sleep 5; sudo ./psst -C 8 -d 30000 -s single-pulse,60,2 &
129 |
130 | Code structure
131 | ==============
132 |     .
133 |     |-- logger.c # in-memory logging functions
134 |     |-- logger.h
135 |     |-- Makefile
136 |     |-- parse_config.c # parse cmdline related routines
137 |     |-- parse_config.h
138 |     |-- perf_msr.c # x86 msr counters for aperf/mperf etc
139 |     |-- perf_msr.h
140 |     |-- psst.c # main routine & core work function
141 |     |-- psst.h
142 |     |-- rapl.c # x86 energy register interface
143 |     `-- rapl.h
144 |
145 | Build
146 | =====
147 |     $ sudo make
148 |     $ sudo ./psst
149 |
150 | Version log
151 | ===========
152 | 11/2017 v0.1 first checkin. supports cpu load. about 6 power shape functions.
153 | last test platform: kabylake mobile/client.
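
Appendix: reading the -C cpumask
================================
The `-C|--cpumask` option in USAGE takes a hex bit mask in which bit N selects cpu N, so `a1` (binary 1010 0001) selects cpu 0, 5 and 7, and `b` (binary 1011) selects cpu 0, 1 and 3. The following is only a minimal sketch of that decoding, not the code in parse_config.c: the helper name `cpumask_to_list` is hypothetical, and it assumes the mask fits in 64 bits, whereas the tool documents a larger maximum.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of psst): decode a hex cpumask string
 * such as "a1" into a list of cpu numbers, lowest cpu first.
 * Returns the number of cpus written to out[], or -1 on a malformed mask.
 * Assumption: the mask fits in an unsigned long long (at most 64 cpus).
 */
static int cpumask_to_list(const char *hex, int *out, int max_out)
{
    char *end;
    unsigned long long mask;
    int cpu, n = 0;

    assert(out != NULL && max_out > 0);

    /* Reject empty strings, signs and leading whitespace up front. */
    if (hex == NULL || !isxdigit((unsigned char)hex[0]))
        return -1;

    errno = 0;
    mask = strtoull(hex, &end, 16);
    if (errno != 0 || *end != '\0')
        return -1;

    /* Bit N set in the mask means cpu N is selected. */
    for (cpu = 0; mask != 0 && n < max_out; cpu++, mask >>= 1) {
        if (mask & 1ULL)
            out[n++] = cpu;
    }
    return n;
}
```

Calling `cpumask_to_list("a1", cpus, 8)` fills `cpus` with 0, 5, 7 and returns 3. Masks for single cpus are simply powers of two: 1, 2, 4, 8 for cpu0 through cpu3.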
154 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | # Security Policy 2 | Intel is committed to rapidly addressing security vulnerabilities affecting our customers and providing clear guidance on the solution, impact, severity and mitigation. 3 | 4 | ## Reporting a Vulnerability 5 | Please report any security vulnerabilities in this project utilizing the guidelines [here](https://www.intel.com/content/www/us/en/security-center/vulnerability-handling-guidelines.html). 6 | -------------------------------------------------------------------------------- /images/sinewave.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/intel/psst/f8693dee31934d04feed9bd01329ad8aed6a9e3e/images/sinewave.png -------------------------------------------------------------------------------- /psst.1: -------------------------------------------------------------------------------- 1 | .TH PSST 1 "November 28, 2017" 2 | .nr SM ((\n[.l] - \n[.i]) / 1n - 41) 3 | .SH NAME 4 | psst \- Power Stress and Shaping Tool 5 | .SH SYNOPSIS 6 | .\" The general command line 7 | .B psst [options ] 8 | .SH DESCRIPTION 9 | The Power Stress and Shaping Tool (\fBPSST\fR) is a controlled power 'virus' 10 | for Intel SoC components such as CPU and GPU. 11 | .br 12 | The objective is to subject the SoC at different run-time-varying 13 | utilization levels for analysis. This is done by controlled duty-cycling 14 | of utilization to specific contour. A simplest contour could be fixed low 15 | utilization. This allows simple usage, such as logging of system parameters at 16 | fixed low overhead even at small polling intervals (ms). More complex usage, 17 | such as study of governors, workloads e.t.c., are possible by applying 18 | different power shape contours and options. 
19 | 20 | .SH OPTIONS 21 | .TP 22 | .B \-C \-\-cpumask CPUMASK 23 | CPUMASK is hexadecimal bit mask of cpu selected. (e.g., a1 selects 24 | cpu 0, 5, 7. default: every online cpu, maximum 1024) 25 | .TP 26 | .B \-p \-\-poll\-period pollperiod 27 | pollperiod specifies period for logging in milliseconds (default 500 ms) 28 | .TP 29 | .B \-d \-\-duration m 30 | specifies duration m in milliseconds to run psst (default is 3600000; 1 hour) 31 | .TP 32 | .B \-l \-\-log\-file path 33 | specifies the full path to the logfile (default is /var/log/psst.csv) 34 | .TP 35 | .B \-v \-\-verbose 36 | enables verbose mode (default: disabled when args specified) 37 | .TP 38 | .B \-V \-\-version 39 | prints version 40 | .TP 41 | .B \-T \-\-track\-max\-cpu 42 | Track the cpu# with maxed freq during poll period 43 | .TP 44 | .B \-h \-\-help 45 | prints help 46 | .TP 47 | .B \-s \-\-shape\-func shape-func,arg 48 | Specifies power shape function and argument: 49 | .TS 50 | expand; 51 | lB lBw(\n[SM]n) 52 | l l. 53 | Shape Function Argument 54 | single-step,v T{ 55 | where v is load step height [C0%]. (default shape: single-step,0.1) 56 | T} 57 | sinosoid,w,a T{ 58 | where w is wavelength [seconds] and a is the amplitude (max load %) 59 | T} 60 | stair-case,v,u T{ 61 | where v is load step height [C0%], u is step length (sec) 62 | T} 63 | single-pulse,v,u T{ 64 | where v is load step height [C0%], u is step length (sec) 65 | T} 66 | linear-ramp,m T{ 67 | where m is the slope (load/sec) 68 | T} 69 | saw-tooth,m,a T{ 70 | slope m (load/sec); reversed after max a% or min(0.1)% 71 | T} 72 | .TE 73 | .SH EXAMPLES 74 | .IP 1. 4 75 | Use psst just for logging various power/thermal parameters: 76 | .RS 8 77 | sudo psst # same as $./psst -s single-step,0.1 -p 500 -v 78 | .RE 79 | .IP 2. 
4 80 | Linear ramp CPU power with slope 3% usage-per-sec applied for cpu 1 and 3, 81 | polling and reporting every 700mS, output on terminal, running for 33 seconds: 82 | .RS 8 83 | sudo psst -s linear-ramp,3 -C a -p 700 -d 33000 -v 84 | .RE 85 | .SH AUTHOR 86 | Started by Noor Mubeen 87 | .SH COPYRIGHT 88 | Copyright \(co 2017, Intel Corporation 89 | License GPLv2: GNU GPL version 2 90 | .br 91 | This is free software: you are free to change and redistribute it. 92 | There is NO WARRANTY, to the extent permitted by law. 93 | Or, say, there is NO warranty; not even for MERCHANTABILITY 94 | or FITNESS FOR A PARTICULAR PURPOSE. 95 | -------------------------------------------------------------------------------- /src/logger.c: -------------------------------------------------------------------------------- 1 | /* 2 | * logger.c functions around logging polled values of open file descriptors 3 | * 4 | * Copyright (c) 2017, Intel Corporation. 5 | * 6 | * This program is free software; you can redistribute it and/or modify it 7 | * under the terms and conditions of the GNU General Public License, 8 | * version 2, as published by the Free Software Foundation. 9 | * 10 | * This program is distributed in the hope it will be useful, but WITHOUT 11 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 12 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 13 | * more details. 
14 | * 15 | * Author: Noor ul Mubeen 16 | */ 17 | 18 | #define _GNU_SOURCE 19 | #include 20 | #define _POSIX_C_SOURCE 200809L 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | #include 28 | #include 29 | #include "psst.h" 30 | #include "logger.h" 31 | #include "rapl.h" 32 | #include "perf_msr.h" 33 | #include "parse_config.h" 34 | #ifdef _LINUX_ 35 | #include 36 | #elif defined _ANDROID_ 37 | #include 38 | #endif 39 | 40 | #define INIT_COL(e, h, u, f, m, p, v) { \ 41 | .report_enabled = e, \ 42 | .header_name = #h, \ 43 | .unit = #u, \ 44 | .fmt = #f, \ 45 | .unit_multiplier = m, \ 46 | .fd_type = p, \ 47 | .value = v, \ 48 | } 49 | 50 | /* fixed columns of the log output */ 51 | struct log_col_desc col_desc[] = { 52 | /* TIME_STAMP_MS: time stamp in millisec */ 53 | INIT_COL(1, Time, [ms], 9.0, 1, NO_FD, 0), 54 | /* FREQ_REALIZED: average frequency (cpu0) since last poll */ 55 | INIT_COL(1, Freq, [MHz], 8.2, 1, MSR_FD, 0), 56 | /* MAX_FREQ_CPU: smp cpu that delivered max freq in last sample */ 57 | INIT_COL(1, MaxCPU, [#], 6.0, 1, NO_FD, 0), 58 | /* LOAD_REQUEST: cpu overhead requested by this program */ 59 | INIT_COL(1, LoadRq, [C0_%], 6.2, 1, NO_FD, 0), 60 | /* LOAD_REALIZED: actual overall cpu overhead in the system */ 61 | INIT_COL(1, Load, [C0_%], 7.2, 1, MSR_FD, 0), 62 | /* SCALE_FACTOR: workload scaling factor on a cpu */ 63 | INIT_COL(1, ScaleF, [%], 7.2, 1, MSR_FD, 0), 64 | /* Normalized productive perf */ 65 | INIT_COL(1, Qperf, [perf/uS], 9.2, 1, MSR_FD, 0), 66 | /* PKG_POWER_RAPL: sysfs rapl power package scope. */ 67 | INIT_COL(1, pwrPkg0, [mWatt], 8.2, 1, NORMAL_FD, 0), 68 | INIT_COL(1, pwrPkg1, [mWatt], 8.2, 1, NORMAL_FD, 0), 69 | INIT_COL(1, pwrPkg2, [mWatt], 8.2, 1, NORMAL_FD, 0), 70 | INIT_COL(1, pwrPkg3, [mWatt], 8.2, 1, NORMAL_FD, 0), 71 | /* PKG_POWER_LIMIT: sysfs rapl power limit (pkg). 
*/ 72 | INIT_COL(0, PkgLmt, [mWatt], 7.2, 0.001, NORMAL_FD, 0), 73 | /* PP0_POWER_RAPL: sysfs rapl power PP0 or core scope. */ 74 | INIT_COL(1, PwrCore, [mWatt], 8.2, 1, NORMAL_FD, 0), 75 | /* PP1_POWER_RAPL: sysfs rapl power PP1 or uncore scope. */ 76 | INIT_COL(1, PwrGpu, [mWatt], 8.2, 1, NORMAL_FD, 0), 77 | /* DRAM_POWER_RAPL: sysfs rapl power DRAM scope. */ 78 | INIT_COL(1, PwrDram, [mWatt], 7.2, 1, NORMAL_FD, 0), 79 | /* CPU_DTS: cpu die temp */ 80 | INIT_COL(1, CpuDts, [DegC], 6.2, 0.001, NORMAL_FD, 0), 81 | /* SOC_DTS: soc (package) die temp */ 82 | INIT_COL(1, SocDts, [DegC], 6.2, 0.001, NORMAL_FD, 0), 83 | }; 84 | 85 | int complete_path(char *path, char *compl) 86 | { 87 | FILE *fp; 88 | int sz; 89 | fp = popen(path, "r"); 90 | if (!fp) { 91 | perror("complete_path()"); 92 | dbg_print("popen failed. path %s\n", path); 93 | return -1; 94 | } 95 | sz = fread(compl, 1, 256, fp); 96 | if (!ferror(fp) && (sz != 0)) { 97 | compl[sz-1] = '\0'; 98 | } else { 99 | dbg_print("fread failed. path %s\n", path); 100 | pclose(fp); 101 | return -1; 102 | } 103 | pclose(fp); 104 | return 0; 105 | } 106 | 107 | int count_tzone_paths(char *base, char *match) 108 | { 109 | int sz; 110 | FILE *fp; 111 | /* generally sufficient */ 112 | char cmd[MAX_LEN]; 113 | char result[8]; 114 | 115 | /* find ${base}* -name ${match} 2>/dev/null | wc -l */ 116 | sprintf(cmd, "find %s* -name %s 2>/dev/null | wc -l", base, match); 117 | fp = popen(cmd, "r"); 118 | if (!fp) { 119 | perror("count_tzone_paths()"); 120 | dbg_print("popen failed. base %s match %s\n", base, match); 121 | return -1; 122 | } 123 | sz = fread(cmd, 1, sizeof(cmd), fp); 124 | if (!sz) { 125 | pclose(fp); 126 | dbg_print("fread failed.
cmd %s\n", cmd); 127 | return -1; 128 | } 129 | strncpy(result, cmd, sz); 130 | result[sz - 1] = '\0'; 131 | pclose(fp); 132 | return atoi(result); 133 | } 134 | 135 | int get_node_name(char *base, char *node, char *result) 136 | { 137 | int sz; 138 | FILE *fp; 139 | char path[MAX_LEN]; 140 | 141 | sprintf(path, "cat %s/%s 2>/dev/null", base, node); 142 | fp = popen(path, "r"); 143 | if (!fp) { 144 | perror("get_node_name()"); 145 | dbg_print("popen failed. path %s\n", path); 146 | return -1; 147 | } 148 | sz = fread(path, 1, sizeof(path), fp); 149 | if (!sz) { 150 | pclose(fp); 151 | dbg_print("fread failed. path %s\n", path); 152 | return -1; 153 | } 154 | if (result) { 155 | strncpy(result, path, sz); 156 | result[sz - 1] = '\0'; 157 | } 158 | pclose(fp); 159 | return 0; 160 | } 161 | 162 | int find_path(char *base, char *node, char *match, char *replace, char *buf) 163 | { 164 | int sz, replace_sz, fd, found = 0; 165 | FILE *fp; 166 | char value[64], path[MAX_LEN]; 167 | char list[2048] = {0}; 168 | char *token, *loc; 169 | 170 | sprintf(path, "find %s* -name %s 2>/dev/null", base, node); 171 | fp = popen(path, "r"); 172 | if (!fp) { 173 | perror("find_path()"); 174 | dbg_print("popen failed. base %s\n", base); 175 | return -1; 176 | } 177 | 178 | sz = fread(list, 1, sizeof(list), fp); 179 | pclose(fp); 180 | if (!sz) { 181 | dbg_print("fread failed. 
path %s\n", path); 182 | return -1; 183 | } else 184 | list[sz] = '\0'; 185 | 186 | token = strtok(list, "\n"); 187 | if (!token) 188 | return -1; 189 | do { 190 | fd = open(token, O_RDONLY); 191 | if (fd != -1) { 192 | sz = read(fd, value, sizeof(value)); 193 | close(fd); 194 | } 195 | if (sz > 0) { 196 | value[sz - 1] = '\0'; 197 | if (strcmp(value, match) == 0) { 198 | found = 1; 199 | break; 200 | } 201 | } 202 | token = strtok(NULL, "\n"); 203 | } while (token); 204 | 205 | if (!found || !token) 206 | return -1; 207 | 208 | loc = strstr(token, node); 209 | if (!loc) 210 | return -1; 211 | 212 | sz = loc - token; 213 | strncpy(buf, token, sz); 214 | replace_sz = strnlen(replace, MAX_LEN-sz-1); 215 | strncpy(buf+sz, replace, replace_sz); 216 | 217 | buf[sz + replace_sz] = '\0'; 218 | return 0; 219 | } 220 | 221 | int exit_cpu_thread, exit_io_thread; 222 | #define PAGE_SIZE_BYTES 4096 223 | static char *page[2]; 224 | static char *active_pg; 225 | static char *dirty_pg; 226 | static int active_pg_filled, dirty_pg_filled; 227 | static int io_inprogress; 228 | 229 | static pthread_mutex_t pmutex = PTHREAD_MUTEX_INITIALIZER; 230 | static pthread_cond_t pcond = PTHREAD_COND_INITIALIZER; 231 | 232 | void initialize_log_page(void) 233 | { 234 | page[0] = (char *)malloc(PAGE_SIZE_BYTES * 8); 235 | page[1] = (char *)malloc(PAGE_SIZE_BYTES * 8); 236 | active_pg = page[0]; 237 | dirty_pg = page[1]; 238 | } 239 | 240 | void trigger_disk_io(void) 241 | { 242 | /* trigger write IO */ 243 | pthread_mutex_lock(&pmutex); 244 | pthread_cond_signal(&pcond); 245 | pthread_mutex_unlock(&pmutex); 246 | } 247 | 248 | void accumulate_flush_record(char *record, int sz) 249 | { 250 | if ((PAGE_SIZE_BYTES - active_pg_filled < sz) || exit_cpu_thread) 251 | return; 252 | 253 | /* 254 | * each old record has nul terminator. write new one starting 255 | * on last nul terminator. 
we just need one nul at the end 256 | */ 257 | if (active_pg_filled != 0) 258 | active_pg_filled -= 1; 259 | 260 | memcpy(active_pg + active_pg_filled, record, sz); 261 | active_pg_filled += sz; 262 | 263 | if (PAGE_SIZE_BYTES - active_pg_filled <= sz) { 264 | /* 265 | * swap active with dirty. Note this is not mutex locked. 266 | * the idea for separate buffer of some size is that they 267 | * never conflict. if we have conflict, the purpose of 268 | * delegating io operation to separate thread is defeated. 269 | */ 270 | if (!io_inprogress) { 271 | /* swap buffers: filled page becomes dirty, the other one active */ 272 | dirty_pg = active_pg; 273 | dirty_pg_filled = active_pg_filled; 274 | 275 | active_pg = (dirty_pg == page[0]) ? page[1] : page[0]; 276 | active_pg_filled = 0; 277 | 278 | /* IO the other page */ 279 | dbg_print("actvpg:%p, drtypg:%p\n", (void *)active_pg, 280 | (void *)dirty_pg); 281 | trigger_disk_io(); 282 | } else { 283 | /* purpose defeated. too-much/too-slow IO ? */ 284 | printf("**IO err: fix buffer size or too slow IO **\n"); 285 | return; 286 | } 287 | } 288 | } 289 | 290 | void page_write_disk(void *confg) 291 | { 292 | int ret; 293 | struct config *cfg = (struct config *)confg; 294 | sigset_t sigmask; 295 | sigfillset(&sigmask); 296 | 297 | ret = pthread_sigmask(SIG_BLOCK, &sigmask, NULL); 298 | if (ret) 299 | printf("page_write_disk: couldn't mask signals.
err:%d\n", ret); 300 | 301 | do { 302 | int wr_sz; 303 | UNUSED(wr_sz); 304 | pthread_mutex_lock(&pmutex); 305 | pthread_cond_wait(&pcond, &pmutex); 306 | pthread_mutex_unlock(&pmutex); 307 | 308 | io_inprogress = 1; 309 | /* if we are exiting, just dump the active page */ 310 | if (!exit_cpu_thread) { 311 | wr_sz = write(cfg->log_file_fd, dirty_pg, 312 | dirty_pg_filled - 1); 313 | if (wr_sz == -1) 314 | perror("fail dirty pg write"); 315 | dbg_print("wrote %d io page bytes to log.\n", wr_sz); 316 | } else { 317 | /* reset to top of page */ 318 | wr_sz = write(cfg->log_file_fd, active_pg, 319 | active_pg_filled - 1); 320 | if (wr_sz == -1) 321 | perror("fail active pg write"); 322 | /* NULL terminate whatever data we have written */ 323 | dbg_print("wrote %d active pg bytes to log.\n", wr_sz); 324 | } 325 | io_inprogress = 0; 326 | 327 | } while (!exit_io_thread); 328 | 329 | pthread_cond_destroy(&pcond); 330 | pthread_mutex_destroy(&pmutex); 331 | pthread_exit(NULL); 332 | } 333 | 334 | struct config configpv; 335 | void initialize_logger(void) 336 | { 337 | int i; 338 | char path[MAX_LEN] = ""; 339 | 340 | for (i = 0; i < MAX_COL_NUM; i++) { 341 | if (!col_desc[i].report_enabled) { 342 | dbg_print(" %d.report_disabled for %s\n", 343 | i, col_desc[i].header_name); 344 | continue; 345 | } 346 | 347 | switch (i) { 348 | case FREQ_REALIZED: 349 | case LOAD_REALIZED: 350 | case SCALE_FACTOR: 351 | case NORM_PERF: 352 | /* XXX: for gfx C0, create separate columns */ 353 | if (get_node_name("/dev/cpu/0", "msr", NULL) < 0) { 354 | col_desc[i].report_enabled = 0; 355 | } 356 | continue; /* No file descriptor required */ 357 | case MAX_FREQ_CPU: 358 | if (get_node_name("/dev/cpu/0", "msr", NULL) < 0) 359 | col_desc[i].report_enabled = 0; 360 | continue; /* No file descriptor required */ 361 | case TIME_STAMP_MS: 362 | case LOAD_REQUEST: 363 | continue; /* No file descriptor required */ 364 | case PKG0_POWER_RAPL: 365 | if (find_path(BASE_PATH_RAPL, "name", "package-0", 366 |
"energy_uj", path)) { 367 | col_desc[i].report_enabled = 0; 368 | rapl_pp0_supported = 0; 369 | } else { 370 | rapl_pp0_supported = 1; 371 | } 372 | break; 373 | case PKG1_POWER_RAPL: 374 | if (find_path(BASE_PATH_RAPL, "name", "package-1", 375 | "energy_uj", path)) { 376 | col_desc[i].report_enabled = 0; 377 | } 378 | break; 379 | case PKG2_POWER_RAPL: 380 | if (find_path(BASE_PATH_RAPL, "name", "package-2", 381 | "energy_uj", path)) { 382 | col_desc[i].report_enabled = 0; 383 | } 384 | break; 385 | case PKG3_POWER_RAPL: 386 | if (find_path(BASE_PATH_RAPL, "name", "package-3", 387 | "energy_uj", path)) { 388 | col_desc[i].report_enabled = 0; 389 | } 390 | break; 391 | case PP0_POWER_RAPL: 392 | if (find_path(BASE_PATH_RAPL, "name", "core", 393 | "energy_uj", path)) 394 | col_desc[i].report_enabled = 0; 395 | break; 396 | case PP1_POWER_RAPL: 397 | if (find_path(BASE_PATH_RAPL, "name", "uncore", 398 | "energy_uj", path)) 399 | col_desc[i].report_enabled = 0; 400 | break; 401 | case DRAM_POWER_RAPL: 402 | if (find_path(BASE_PATH_RAPL, "name", "dram", 403 | "energy_uj", path)) 404 | col_desc[i].report_enabled = 0; 405 | break; 406 | case CPU_DTS: 407 | if (find_path(BASE_PATH_CPUDTS, "name", "coretemp", 408 | "temp2_input", path)) 409 | col_desc[i].report_enabled = 0; 410 | break; 411 | case SOC_DTS: 412 | if (find_path(BASE_PATH_TZONE, "type", "x86_pkg_temp", 413 | "temp", path)) 414 | col_desc[i].report_enabled = 0; 415 | break; 416 | } 417 | /* close only on exit */ 418 | col_desc[i].poll_fd = open(path, O_RDONLY); 419 | if (col_desc[i].poll_fd < 0) { 420 | dbg_print("disabling column %s\n", 421 | col_desc[i].header_name); 422 | col_desc[i].report_enabled = 0; 423 | } 424 | } 425 | 426 | initialize_log_page(); 427 | return; 428 | } 429 | 430 | static char *log_header; 431 | 432 | struct timespec plog_last_tm, first_tm; 433 | int plog_poll_sec, plog_poll_nsec; 434 | int duration_sec, duration_nsec; 435 | 436 | void initialize_log_clock(void) 437 | { 438 | struct
timespec tm; 439 | if (clock_gettime(CLOCK_MONOTONIC, &tm)) 440 | perror("clock_gettime"); 441 | first_tm.tv_sec = plog_last_tm.tv_sec = tm.tv_sec; 442 | first_tm.tv_nsec = plog_last_tm.tv_nsec = tm.tv_nsec; 443 | } 444 | 445 | uint64_t diff_ns(struct timespec *ts_then, struct timespec *ts_now) 446 | { 447 | uint64_t diff = 0; 448 | if (ts_now->tv_sec > ts_then->tv_sec) { 449 | diff = (ts_now->tv_sec - ts_then->tv_sec) * NSEC_PER_SEC; 450 | diff = diff - ts_then->tv_nsec + ts_now->tv_nsec; 451 | } else { 452 | diff += ts_now->tv_nsec - ts_then->tv_nsec; 453 | } 454 | return diff; 455 | } 456 | 457 | int update_perf_diffs(float *sum_norm_perf) 458 | { 459 | int fd, maxed_cpu_idx; 460 | float max_load, next_max_load; 461 | float _sum_nperf = 0, nperf = 0; 462 | uint64_t aperf_raw, mperf_raw, pperf_raw, tsc_raw, poll_cpu_us; 463 | 464 | for (int t = 0; t < nr_threads; t++) { 465 | fd = perf_stats[t].dev_msr_fd; 466 | 467 | /* 468 | * XXX: per-cpu IPI wakes for msr read will cost power. Need to 469 | * skip poll for idle bound cpus (e.g, using C-state count). 470 | * note: all-core sum perf considers per-respective poll time 471 | */ 472 | read_msr(fd, (uint32_t)MSR_IA32_PPERF, &pperf_raw); 473 | read_msr(fd, (uint32_t)MSR_IA32_APERF, &aperf_raw); 474 | read_msr(fd, (uint32_t)MSR_IA32_MPERF, &mperf_raw); 475 | read_msr(fd, (uint32_t)MSR_IA32_TSC, &tsc_raw); 476 | 477 | perf_stats[t].pperf_diff = cpu_get_diff_pperf(pperf_raw, t); 478 | perf_stats[t].aperf_diff = cpu_get_diff_aperf(aperf_raw, t); 479 | perf_stats[t].mperf_diff = cpu_get_diff_mperf(mperf_raw, t); 480 | perf_stats[t].tsc_diff = cpu_get_diff_tsc(tsc_raw, t); 481 | 482 | poll_cpu_us = perf_stats[t].tsc_diff/cpu_hfm_mhz; 483 | 484 | /* 485 | * Normalized perf metric defined as pperf per load per time. 
486 | * The rationale is detailed in the following paper: 487 | * github.com/intel/psst >whitepapers >Generic_perf_per_watt.pdf 488 | * Given that delta_load = delta_mperf/delta_tsc, we can rewrite 489 | * as given below. 490 | */ 491 | if (perf_stats[t].mperf_diff) { 492 | nperf = (float) perf_stats[t].pperf_diff/poll_cpu_us; 493 | nperf = (float) nperf * perf_stats[t].tsc_diff; 494 | nperf = (float) nperf/(perf_stats[t].mperf_diff); 495 | perf_stats[t].nperf = (uint64_t)nperf; 496 | _sum_nperf += (float) nperf; 497 | } 498 | } 499 | *sum_norm_perf = _sum_nperf; 500 | 501 | max_load = 100*(float)perf_stats[0].mperf_diff/perf_stats[0].tsc_diff; 502 | maxed_cpu_idx = 0; 503 | 504 | for (int t = 1; t < nr_threads; t++) { 505 | next_max_load = 100 * (float) perf_stats[t].mperf_diff / 506 | perf_stats[t].tsc_diff; 507 | 508 | /* float comparison with some meaningful difference */ 509 | if (max_load > (next_max_load + 0.01)) 510 | continue; 511 | else { 512 | max_load = next_max_load; 513 | maxed_cpu_idx = t; 514 | } 515 | } 516 | 517 | return maxed_cpu_idx; 518 | } 519 | #define LOG_HEADER_SZ 2048 520 | 521 | int first_log = 1; 522 | uint64_t pp0_initial_energy, soc_initial_energy[4]; 523 | uint64_t pp0_diff_uj, soc_diff_uj[4]; 524 | 525 | int rapl_pp0_supported; 526 | 527 | void do_logging(float dc) 528 | { 529 | char buf[64]; 530 | char final_buf[1024]; 531 | char val_fmt[16]; 532 | char delim[] = ", "; 533 | char delim_short[] = ", "; 534 | log_col_t i; 535 | int sz, sz1, pkg_num; 536 | int max_cpu = 0; 537 | int m = 0; 538 | float sum_norm_perf = 0; 539 | struct timespec tm; 540 | 541 | *buf = '\0'; 542 | 543 | if (clock_gettime(CLOCK_MONOTONIC, &tm)) 544 | perror("clock_gettime"); 545 | 546 | /* duration_* is the total time this tool runs */ 547 | if (!is_time_remaining(CLOCK_MONOTONIC, &first_tm, duration_sec, 548 | duration_nsec)) 549 | exit_cpu_thread = 1; 550 | 551 | /* we log once in plog_poll_* interval */ 552 | if (!first_log &&
is_time_remaining(CLOCK_MONOTONIC, &plog_last_tm, 553 | plog_poll_sec, plog_poll_nsec)) 554 | return; 555 | 556 | plog_last_tm.tv_sec = tm.tv_sec; 557 | plog_last_tm.tv_nsec = tm.tv_nsec; 558 | 559 | /* 560 | * When dev_msr not supported, the diffs are not populated. 561 | * In these cases the associated columns have been disabled anyway. 562 | */ 563 | if (perf_stats->dev_msr_supported) { 564 | m = update_perf_diffs(&sum_norm_perf); 565 | max_cpu = perf_stats[m].cpu; 566 | } 567 | 568 | for (i = 0; i < MAX_COL_NUM; i++) { 569 | if (!col_desc[i].report_enabled) 570 | continue; 571 | 572 | if (col_desc[i].fd_type == NORMAL_FD) { 573 | lseek(col_desc[i].poll_fd, 0L, SEEK_SET); 574 | sz = read(col_desc[i].poll_fd, buf, 64); 575 | if (sz == -1) { 576 | perror("read poll_fd 1"); 577 | printf(" col desc read fd err %d\n", i); 578 | } 579 | } 580 | 581 | switch (i) { 582 | case TIME_STAMP_MS: 583 | col_desc[i].value = 584 | diff_ns(&first_tm, &plog_last_tm)/1000000; 585 | break; 586 | case LOAD_REQUEST: 587 | col_desc[i].value = dc; 588 | break; 589 | case LOAD_REALIZED: 590 | /* real C0 = delta-mperf/delta-tsc */ 591 | col_desc[i].value = (float) perf_stats[m].mperf_diff * 592 | 100/perf_stats[m].tsc_diff; 593 | break; 594 | case SCALE_FACTOR: 595 | col_desc[i].value = (float) perf_stats[m].pperf_diff * 596 | 100/perf_stats[m].aperf_diff; 597 | break; 598 | case NORM_PERF: 599 | col_desc[i].value = sum_norm_perf; 600 | sum_norm_perf = 0; 601 | break; 602 | case MAX_FREQ_CPU: 603 | col_desc[i].value = max_cpu; 604 | break; 605 | case FREQ_REALIZED: 606 | /* real freq = TSC* delta-aperf/delta-mperf */ 607 | col_desc[i].value = (float) perf_stats[m].aperf_diff / 608 | perf_stats[m].mperf_diff*cpu_hfm_mhz; 609 | break; 610 | case PKG0_POWER_RAPL: 611 | pkg_num = i - PKG0_POWER_RAPL; 612 | if (first_log) 613 | soc_initial_energy[pkg_num] = atoll(buf); 614 | 615 | soc_diff_uj[pkg_num] = atoll(buf) - soc_initial_energy[pkg_num]; 616 | 617 | col_desc[i].value = (float) 
rapl_ediff_pkg0(atoll(buf))/ 618 | configpv.poll_period; 619 | break; 620 | 621 | case PKG1_POWER_RAPL: 622 | pkg_num = i - PKG0_POWER_RAPL; 623 | if (first_log) 624 | soc_initial_energy[pkg_num] = atoll(buf); 625 | 626 | soc_diff_uj[pkg_num] = atoll(buf) - soc_initial_energy[pkg_num]; 627 | 628 | col_desc[i].value = (float) rapl_ediff_pkg1(atoll(buf))/ 629 | configpv.poll_period; 630 | break; 631 | case PKG2_POWER_RAPL: 632 | pkg_num = i - PKG0_POWER_RAPL; 633 | if (first_log) 634 | soc_initial_energy[pkg_num] = atoll(buf); 635 | 636 | soc_diff_uj[pkg_num] = atoll(buf) - soc_initial_energy[pkg_num]; 637 | 638 | col_desc[i].value = (float) rapl_ediff_pkg2(atoll(buf))/ 639 | configpv.poll_period; 640 | break; 641 | case PKG3_POWER_RAPL: 642 | pkg_num = i - PKG0_POWER_RAPL; 643 | if (first_log) 644 | soc_initial_energy[pkg_num] = atoll(buf); 645 | 646 | soc_diff_uj[pkg_num] = atoll(buf) - soc_initial_energy[pkg_num]; 647 | 648 | col_desc[i].value = (float) rapl_ediff_pkg3(atoll(buf))/ 649 | configpv.poll_period; 650 | break; 651 | case PP0_POWER_RAPL: 652 | if (first_log) 653 | pp0_initial_energy = atoll(buf); 654 | 655 | pp0_diff_uj = atoll(buf) - pp0_initial_energy; 656 | 657 | col_desc[i].value = (float) rapl_ediff_cpu(atoll(buf))/ 658 | configpv.poll_period; 659 | break; 660 | case PP1_POWER_RAPL: 661 | col_desc[i].value = (float) rapl_ediff_gpu(atoll(buf))/ 662 | configpv.poll_period; 663 | break; 664 | case DRAM_POWER_RAPL: 665 | col_desc[i].value = (float) rapl_ediff_dram(atoll(buf))/ 666 | configpv.poll_period; 667 | break; 668 | 669 | case PKG_POWER_LIMIT: 670 | case CPU_DTS: 671 | case SOC_DTS: 672 | col_desc[i].value = atoi(buf); 673 | break; 674 | /* dead code. 
happy compiler */ 675 | case MAX_COL_NUM: 676 | break; 677 | 678 | } 679 | col_desc[i].value *= col_desc[i].unit_multiplier; 680 | } 681 | 682 | if (!log_header) { 683 | log_header = malloc(LOG_HEADER_SZ * sizeof(char)); 684 | if (!log_header) { 685 | perror("Failed to malloc log_header"); 686 | exit(EXIT_FAILURE); 687 | } 688 | char hdr_fmt[32]; 689 | 690 | /* add header names */ 691 | sprintf(log_header, "%c", '#'); 692 | sz = 1; 693 | for (i = 0; i < MAX_COL_NUM; i++) { 694 | if (!col_desc[i].report_enabled) 695 | continue; 696 | sprintf(hdr_fmt, "%%%ds%s", 697 | atoi(col_desc[i].fmt), delim); 698 | sz1 = sprintf(log_header + sz, hdr_fmt, 699 | col_desc[i].header_name); 700 | sz += sz1; 701 | } 702 | 703 | if (configpv.super_verbose && col_desc[LOAD_REALIZED].report_enabled) { 704 | i = SCALE_FACTOR; 705 | for (int j = 0; j < nr_threads; j++) { 706 | sprintf(hdr_fmt, "%%%ds%.2d%s", 707 | atoi(col_desc[i].fmt), perf_stats[j].cpu, delim_short); 708 | sz1 = sprintf(log_header + sz, hdr_fmt, 709 | col_desc[i].header_name); 710 | sz += sz1; 711 | } 712 | i = LOAD_REALIZED; 713 | for (int j = 0; j < nr_threads; j++) { 714 | sprintf(hdr_fmt, "%%%ds%.2d%s", 715 | atoi(col_desc[i].fmt), perf_stats[j].cpu, delim_short); 716 | sz1 = sprintf(log_header + sz, hdr_fmt, 717 | col_desc[i].header_name); 718 | sz += sz1; 719 | } 720 | i = FREQ_REALIZED; 721 | for (int j = 0; j < nr_threads; j++) { 722 | sprintf(hdr_fmt, "%%%ds%.2d%s", 723 | atoi(col_desc[i].fmt), perf_stats[j].cpu, delim_short); 724 | sz1 = sprintf(log_header + sz, hdr_fmt, 725 | col_desc[i].header_name); 726 | sz += sz1; 727 | } 728 | sz = sz - sizeof(delim_short) + 2; 729 | } else { 730 | sz = sz - sizeof(delim) + 2; 731 | } 732 | 733 | log_header[sz - 1] = '\n'; 734 | 735 | /* add header unit of measurement */ 736 | sprintf(log_header+sz, "%c", '#'); 737 | sz += 1; 738 | for (i = 0; i < MAX_COL_NUM; i++) { 739 | if (!col_desc[i].report_enabled) 740 | continue; 741 | sprintf(hdr_fmt, "%%%ds%s", 742 | 
atoi(col_desc[i].fmt), delim); 743 | sz1 = sprintf(log_header + sz, hdr_fmt, 744 | col_desc[i].unit); 745 | sz += sz1; 746 | } 747 | if (configpv.super_verbose && col_desc[LOAD_REALIZED].report_enabled) { 748 | i = SCALE_FACTOR; 749 | for (int j = 0; j < nr_threads; j++) { 750 | /* add 2 digits for cpu# */ 751 | sprintf(hdr_fmt, "%%%ds%s", 752 | atoi(col_desc[i].fmt)+2, delim_short); 753 | sz1 = sprintf(log_header + sz, hdr_fmt, 754 | col_desc[i].unit); 755 | sz += sz1; 756 | } 757 | i = LOAD_REALIZED; 758 | for (int j = 0; j < nr_threads; j++) { 759 | sprintf(hdr_fmt, "%%%ds%s", 760 | atoi(col_desc[i].fmt)+2, delim_short); 761 | sz1 = sprintf(log_header + sz, hdr_fmt, 762 | col_desc[i].unit); 763 | sz += sz1; 764 | } 765 | i = FREQ_REALIZED; 766 | for (int j = 0; j < nr_threads; j++) { 767 | sprintf(hdr_fmt, "%%%ds%s", 768 | atoi(col_desc[i].fmt)+2, delim_short); 769 | sz1 = sprintf(log_header + sz, hdr_fmt, 770 | col_desc[i].unit); 771 | sz += sz1; 772 | } 773 | sz = sz - sizeof(delim_short) + 2; 774 | } else { 775 | sz = sz - sizeof(delim) + 2; 776 | } 777 | log_header[sz - 1] = '\n'; 778 | 779 | /* write-out to file */ 780 | sz = write(configpv.log_file_fd, log_header, sz); 781 | if (sz == -1) 782 | perror("log_header write"); 783 | 784 | printf("report being logged to %s... 
^C to exit.\n", configpv.log_file_name); 785 | if (configpv.verbose && !configpv.super_verbose) 786 | printf("%s\n", log_header); 787 | } 788 | 789 | sz = 0; 790 | for (i = 0; i < MAX_COL_NUM; i++) { 791 | if (!col_desc[i].report_enabled) 792 | continue; 793 | sprintf(val_fmt, "%%%sf%s", col_desc[i].fmt, delim); 794 | sz1 = sprintf(final_buf + sz, val_fmt, 795 | col_desc[i].value); 796 | sz += sz1; 797 | } 798 | 799 | if (configpv.super_verbose && col_desc[LOAD_REALIZED].report_enabled) { 800 | int sz2; 801 | i = SCALE_FACTOR; 802 | for (int j = 0; j < nr_threads; j++) { 803 | sprintf(val_fmt, "%%%.3sf%s", col_desc[i].fmt, delim); 804 | sz2 = sprintf(final_buf+sz, val_fmt, (float) perf_stats[j].pperf_diff * 805 | 100/perf_stats[j].aperf_diff); 806 | sz += sz2; 807 | } 808 | 809 | i = LOAD_REALIZED; 810 | for (int j = 0; j < nr_threads; j++) { 811 | sprintf(val_fmt, "%%%.3sf%s", col_desc[i].fmt, delim); 812 | sz2 = sprintf(final_buf+sz, val_fmt, (float)perf_stats[j].mperf_diff 813 | *100/perf_stats[j].tsc_diff); 814 | sz += sz2; 815 | } 816 | 817 | i = FREQ_REALIZED; 818 | for (int j = 0; j < nr_threads; j++) { 819 | sprintf(val_fmt, "%%%.3sf%s", col_desc[i].fmt, delim); 820 | sz2 = sprintf(final_buf+sz, val_fmt, (float)perf_stats[j].aperf_diff / 821 | perf_stats[j].mperf_diff*cpu_hfm_mhz); 822 | sz += sz2; 823 | } 824 | } 825 | 826 | /* erase the last delimiter */ 827 | sz = sz - sizeof(delim) + 2; 828 | final_buf[sz-1] = '\n'; 829 | final_buf[sz] = '\0'; 830 | 831 | if (configpv.verbose && !configpv.super_verbose) 832 | printf("%s", final_buf); 833 | 834 | accumulate_flush_record(final_buf, sz+1); 835 | 836 | first_log = 0; 837 | } 838 | -------------------------------------------------------------------------------- /src/logger.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2017, Intel Corporation. 
3 | * 4 | * This program is free software; you can redistribute it and/or modify it 5 | * under the terms and conditions of the GNU General Public License, 6 | * version 2, as published by the Free Software Foundation. 7 | * 8 | * This program is distributed in the hope it will be useful, but WITHOUT 9 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 10 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 11 | * more details. 12 | * 13 | * Author: Noor ul Mubeen 14 | */ 15 | 16 | #ifndef _LOGGER_H_ 17 | #define _LOGGER_H_ 18 | #include 19 | #include 20 | #include "psst.h" 21 | 22 | #ifdef DEBUG 23 | #define dbg_print(fmt...) printf(fmt) 24 | #else 25 | #define dbg_print(fmt...) ((void)0) 26 | #endif 27 | 28 | #define UNUSED(expr) do { (void)(expr); } while (0) 29 | 30 | #define MSEC_TO_SEC(x) (x/1000) 31 | #define REMAINING_MS_TO_NS(x) ((x % 1000) * 1000000) 32 | 33 | typedef enum log_col {TIME_STAMP_MS, 34 | FREQ_REALIZED, 35 | MAX_FREQ_CPU, 36 | LOAD_REQUEST, 37 | LOAD_REALIZED, 38 | SCALE_FACTOR, 39 | NORM_PERF, 40 | PKG0_POWER_RAPL, 41 | PKG1_POWER_RAPL, 42 | PKG2_POWER_RAPL, 43 | PKG3_POWER_RAPL, 44 | PKG_POWER_LIMIT, 45 | PP0_POWER_RAPL, 46 | PP1_POWER_RAPL, 47 | DRAM_POWER_RAPL, 48 | CPU_DTS, 49 | SOC_DTS, 50 | MAX_COL_NUM,} log_col_t; 51 | 52 | enum col_processing { NO_FD, NORMAL_FD, MSR_FD }; 53 | 54 | struct log_col_desc { 55 | int report_enabled; 56 | char header_name[32]; 57 | char unit[32]; 58 | char fmt[32]; 59 | float unit_multiplier; 60 | enum col_processing fd_type; 61 | int poll_fd; 62 | float value; 63 | }; 64 | 65 | extern int nr_threads; 66 | extern int rapl_pp0_supported; 67 | extern int need_maxed_cpu; 68 | extern int plog_poll_sec, plog_poll_nsec, duration_sec, duration_nsec; 69 | extern struct config configpv; 70 | extern perf_stats_t *perf_stats; 71 | 72 | extern void do_logging(float dc); 73 | extern void initialize_logger(void); 74 | extern void initialize_log_clock(void); 75 | extern void 
page_write_disk(void *); 76 | extern void trigger_disk_io(void); 77 | extern uint64_t diff_ns(struct timespec *, struct timespec *); 78 | extern int update_perf_diffs(float *s); 79 | #endif 80 | -------------------------------------------------------------------------------- /src/parse_config.c: -------------------------------------------------------------------------------- 1 | /* 2 | * parse_config.c: deals with cmd line arg parsing & default value population 3 | * 4 | * Copyright (c) 2017, Intel Corporation. 5 | * 6 | * This program is free software; you can redistribute it and/or modify it 7 | * under the terms and conditions of the GNU General Public License, 8 | * version 2, as published by the Free Software Foundation. 9 | * 10 | * This program is distributed in the hope it will be useful, but WITHOUT 11 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 12 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 13 | * more details. 14 | * 15 | * Author: Noor ul Mubeen 16 | */ 17 | 18 | #define _GNU_SOURCE 19 | #include 20 | #include 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | 28 | #include "parse_config.h" 29 | #include "logger.h" 30 | 31 | static struct option long_options[] = { 32 | {"cpumask", 1, 0, 'C'}, 33 | {"duration", 1, 0, 'd'}, 34 | {"gpumask", 1, 0, 'G'}, 35 | {"log-file", 1, 0, 'l'}, 36 | {"poll-period", 1, 0, 'p'}, 37 | {"shape-func", 1, 0, 's'}, 38 | {"verbose", 0, 0, 'v'}, 39 | {"super-verbose",0, 0, 'S'}, 40 | {"version", 0, 0, 'V'}, 41 | {"help", 0, 0, 'h'}, 42 | {0, 0, 0, 0} 43 | }; 44 | 45 | void print_usage(char *prog) 46 | { 47 | printf("usage:\n"); 48 | printf("%s [options ]\n", prog); 49 | printf("\tSupported options are:\n"); 50 | printf("\t-C|--cpumask\t\t hex bit mask of cpu# to be selected.\n"); 51 | printf("\t\t\t\t(e.g., a1 selects cpu 0,5,7. default: every online cpu. 
Max:400 [1024])\n"); 52 | printf("\t-p|--poll-period\t (ms) for logging (default: 500 ms)\n"); 53 | printf("\t-d|--duration\t\t (ms) to run the tool (default: 3600000 i.e., 1hr)\n"); 54 | printf("\t-l|--log-file\t\t (default: %s)\n", default_log_file); 55 | printf("\t-v|--verbose\t\tenables verbose mode (default: disabled when args specified)\n"); 56 | printf("\t-S|--super-verbose\tprint per-core info (e.g., util) to log file (default: disabled)\n"); 57 | printf("\t-V|--version\t\tprints version when specified\n"); 58 | printf("\t-h|--help\t\tprints usage when specified\n"); 59 | printf("\t-s|--shape-func\t\t (default: single-step,0.1)\n"); 60 | printf("\tSupported power shape functions & args are:\n"); 61 | printf("\t\t\t\t"); 62 | printf("where v is load step height.\n"); 63 | printf("\t\t\t\t"); 64 | printf("where w is wavelength [seconds] and a is the max amplitude (load %%)\n"); 65 | printf("\t\t\t"); 66 | printf("where v is load step height, u is step length (sec)\n"); 67 | printf("\t\t\t"); 68 | printf("where v is load step height, u is step length (sec)\n"); 69 | printf("\t\t\t\twhere m is the slope (load/sec)\n"); 70 | printf("\t\t\t\tslope m (load/sec);reversed after max a%% or min(0.1)%%\n"); 71 | printf("\nexample 1: use psst just for logging system power/thermal parameters with minimum overhead\n"); 72 | printf("\t $ sudo ./psst #implied default args: -s single-step,0.1 -p 500 -v\n"); 73 | printf("\nexample 2: linear ramp CPU power with slope 3 (i.e., 3%% usage increase every sec)" 74 | " applied for cpu0, cpu1 & cpu3.\n\t poll and report" 75 | " every 700mS. output on terminal. 
run for 33 sec\n"); 76 | printf("\t $ sudo ./psst -s linear-ramp,3 -C b -p 700 -d 33000 -v\n"); 77 | } 78 | 79 | static int populate_online_cpumask(cpu_set_t *cpumask); 80 | static void verbose_prints(struct config *configp); 81 | 82 | int populate_default_config(struct config *configp) 83 | { 84 | if (!configp->shape_func[0]) 85 | strncpy(configp->shape_func, "single-step,0.1", 16); 86 | 87 | if (!configp->v_unit) 88 | configp->v_unit = 'C'; 89 | 90 | if (cpu_stress_opt == UNDEFINED) 91 | populate_online_cpumask(&configp->cpumask); 92 | 93 | if (!configp->gpumask) 94 | configp->gpumask = 0x0; 95 | if (!configp->memmask) 96 | configp->memmask = 0x0; 97 | 98 | if (configp->memmask || configp->gpumask) { 99 | /* we want to use cpu0 for non-cpu submitter 100 | * hence we can't have any regular stress function 101 | * request on cpu 0 at the same time 102 | */ 103 | if (CPU_ISSET(0, &configp->cpumask) && 104 | (cpu_stress_opt == WELL_DEFINED)) { 105 | printf("can't stress cpu0 (-C xx) along with -G or -M\n"); 106 | return 0; 107 | } else { 108 | dont_stress_cpu0 = 1; 109 | CPU_SET(0, &configp->cpumask); 110 | } 111 | } 112 | /* 113 | * cpu0 is special. It has to be always enabled. Move the 114 | * user intention as cpu_stress_opt reason. 
115 | */ 116 | if (!CPU_ISSET(0, &configp->cpumask)) { 117 | dont_stress_cpu0 = 1; 118 | CPU_SET(0, &configp->cpumask); 119 | } 120 | 121 | if (!configp->log_file_name[0]) { 122 | strncpy(configp->log_file_name, default_log_file, 123 | sizeof(default_log_file)+1); 124 | } 125 | /* verbose & version option are not turned on by default */ 126 | 127 | if (!configp->cpu_freq) 128 | configp->cpu_freq = -1; 129 | 130 | if (!configp->log_file_fd) { 131 | configp->log_file_fd = open(configp->log_file_name, 132 | O_RDWR|O_CREAT|O_TRUNC, 133 | S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH); 134 | if (configp->log_file_fd == -1) 135 | perror("log file"); 136 | } 137 | if (!configp->poll_period) 138 | configp->poll_period = 500; /* (ms) */ 139 | if (!configp->duration) 140 | configp->duration = 3600000; /* default 60min */ 141 | 142 | initialize_logger(); 143 | if (configp->verbose | configp->super_verbose) 144 | verbose_prints(configp); 145 | 146 | return 1; 147 | } 148 | 149 | static int xchar_to_int(char x) 150 | { 151 | if (isalpha(x)) 152 | return toupper(x) - 55; 153 | if (isdigit(x)) 154 | return x - 48; 155 | return -1; 156 | } 157 | 158 | /* cpuset procfs reports online cpu in this format: 159 | * 0-4,7 : to mean 0,1,2,3,4 & 7 are online 160 | */ 161 | static int cpuset_to_bitmap(char *buf, cpu_set_t *cpumask) 162 | { 163 | int k; 164 | char *token, *subtoken, *pos; 165 | char *save1, *save2, *token_copy; 166 | 167 | pos = strchr(buf, '\n'); 168 | if (!pos) 169 | return 0; 170 | pos[0] = '\0'; 171 | /* e.g: 3,5-11 */ 172 | token = strtok_r(buf, ",", &save1); 173 | do { 174 | if (!token) 175 | break; 176 | token_copy = strdup(token); 177 | subtoken = strtok_r(token_copy, "-", &save2); 178 | /* update 3 (and 5 in next pass) ... 
*/ 179 | if (!subtoken) { 180 | k = atoi(token); 181 | CPU_SET(k, cpumask); 182 | free(token_copy); 183 | /* fall through to fetch the next token */ 184 | } else { 185 | k = atoi(subtoken); 186 | CPU_SET(k, cpumask); 187 | pos = strtok_r(NULL, "-", &save2); 188 | while (pos && ++k <= atoi(pos)) 189 | CPU_SET(k, cpumask); 190 | free(token_copy); 191 | } 192 | token = strtok_r(NULL, ",", &save1); 193 | } while (token); 194 | return 0; 195 | } 196 | 197 | static int populate_online_cpumask(cpu_set_t *cpumask) 198 | { 199 | char buf[65]; 200 | FILE *fp; 201 | size_t sz; 202 | 203 | /* Read the list of online cpus from sysfs. */ 204 | fp = fopen("/sys/devices/system/cpu/online", "r"); 205 | if (fp == NULL) { 206 | printf("Failed to get online cpu list\n"); 207 | return -1; 208 | } 209 | sz = fread(buf, 1, sizeof(buf) - 1, fp); 210 | fclose(fp); 211 | if (sz == 0) { 212 | printf("populate_online_cpumask: fread failed\n"); 213 | return -1; 214 | } 215 | buf[sz] = '\0'; 216 | cpuset_to_bitmap(buf, cpumask); 217 | return 0; 218 | } 219 | 220 | cpu_stress_opt_t cpu_stress_opt = UNDEFINED; 221 | int dont_stress_cpu0; 222 | 223 | static int set_cpu_mask(char *buf, struct config *configp) 224 | { 225 | int arg_bytes = strlen(buf); 226 | if ((arg_bytes == 1) && (buf[0] == '0')) { 227 | /* user wants to: not stress any cpu. 
228 | * let's translate that to cpu0 submitter work 229 | */ 230 | dont_stress_cpu0 = 1; 231 | cpu_stress_opt = WELL_DEFINED; 232 | CPU_SET(0, &(configp->cpumask)); 233 | return 0; 234 | } 235 | 236 | if ((arg_bytes * 4) > CPU_SETSIZE) { 237 | printf("max cpu supported is %d\n", CPU_SETSIZE); 238 | return -1; 239 | } else { 240 | /* arg "a1" (binary 1010 0001) selects cpu 0, 5 and 7 */ 241 | int i, j, k = 0; 242 | for (i = arg_bytes - 1; i > -1; i--) { 243 | for (j = 0; j < 4; j++, k++) { 244 | if (!isxdigit(buf[i])) { 245 | printf("Invalid arg to -C\n"); 246 | return -1; 247 | } 248 | if (xchar_to_int(buf[i]) & (1 << j)) 249 | CPU_SET(k, &(configp->cpumask)); 250 | } 251 | } 252 | cpu_stress_opt = WELL_DEFINED; 253 | } 254 | return 0; 255 | } 256 | 257 | int parse_cmd_config(int ac, char **av, struct config *configp) 258 | { 259 | int c, option_index; 260 | char buf[128]; 261 | size_t len; 262 | 263 | memset(configp, 0, sizeof(struct config)); 264 | CPU_ZERO(&configp->cpumask); 265 | 266 | if (ac == 1) 267 | configp->verbose = 1; 268 | 269 | while ((c = getopt_long(ac, av, "s:G:C:E:l:p:d:hvVTS", 270 | long_options, &option_index)) != -1) { 271 | /* XXX check optarg valid */ 272 | switch (c) { 273 | case 'l': 274 | len = sizeof(configp->log_file_name); 275 | strncpy(configp->log_file_name, optarg, len); 276 | configp->log_file_name[len - 1] = '\0'; 277 | break; 278 | case 'p': 279 | sscanf(optarg, "%d", &configp->poll_period); 280 | if (configp->poll_period <= 0) 281 | return 0; 282 | break; 283 | case 'd': 284 | sscanf(optarg, "%d", &configp->duration); 285 | if (configp->duration <= 0) 286 | return 0; 287 | break; 288 | case 'C': 289 | sscanf(optarg, "%127s", buf); 290 | if (set_cpu_mask(buf, configp) < 0) 291 | return 0; 292 | break; 293 | case 'G': 294 | sscanf(optarg, "%x", &configp->gpumask); 295 | break; 296 | case 'v': 297 | configp->verbose = 1; 298 | break; 299 | case 'S': 300 | configp->super_verbose = 1; 301 | break; 302 | case 'V': 303 | configp->version = 1; 304 | break; 305 | case 's': 
306 | len = sizeof(configp->shape_func); 307 | strncpy(configp->shape_func, optarg, len); 308 | configp->shape_func[len - 1] = '\0'; 309 | break; 310 | case 'h': 311 | case '?': 312 | default: 313 | print_usage("psst"); 314 | return 0; 315 | } /* switch */ 316 | } /* while */ 317 | 318 | if (optind < ac) { 319 | print_usage("psst"); 320 | return 0; 321 | } 322 | 323 | return 1; 324 | } 325 | 326 | static void verbose_prints(struct config *configp) 327 | { 328 | int i; 329 | printf("Verbose mode ON\n"); 330 | dbg_print("v-unit is: %c\n", configp->v_unit); 331 | if (configp->v_unit != 'C') 332 | dbg_print("option not supported\n"); 333 | printf("CPU domain. Following %d cpu selected:\n", 334 | CPU_COUNT(&configp->cpumask)); 335 | 336 | for (i = 0; i < CPU_SETSIZE; i++) { 337 | if (CPU_ISSET(i, &configp->cpumask)) { 338 | printf("\tcpu %d", i); 339 | if (i == 0) 340 | printf("\t[%s]\n", 341 | dont_stress_cpu0 ? 342 | "as work submitter" : "was online or chosen"); 343 | else 344 | printf("\t[%s]\n", "was online or chosen"); 345 | } 346 | } 347 | 348 | printf("\n"); 349 | printf("poll period %dms\n", configp->poll_period); 350 | printf("run duration %dms\n", configp->duration); 351 | printf("Log file path: %s\n", configp->log_file_name); 352 | printf("power curve shape: %s\n", configp->shape_func); 353 | printf("\n"); 354 | } 355 | 356 | int parse_power_shape(char *shape, data_t *pst) 357 | { 358 | char *token; 359 | char *delimiter = ","; 360 | char *running; 361 | 362 | if (!strcmp(shape, "")) 363 | return 0; 364 | /* use strdupa to auto free on stack exit */ 365 | running = strdup(shape); 366 | token = strtok(running, delimiter); 367 | if (!token) { 368 | free(running); 369 | return 1; 370 | } 371 | if (!strcmp(token, "single-step")) { 372 | token = strtok(NULL, delimiter); 373 | if (token) { 374 | pst->psn = SINGLE_STEP; 375 | sscanf(token,"%f",&pst->psa.single_step.v_units); 376 | dbg_print("single step yunit %f\n", 377 | pst->psa.single_step.v_units); 378 | 
free(running); 379 | } else { 380 | free(running); 381 | return 0; 382 | } 383 | } else if (!strcmp(token, "stair-case")) { 384 | token = strtok(NULL, delimiter); 385 | if (token) { 386 | pst->psn = STAIR_CASE; 387 | sscanf(token,"%f",&pst->psa.staircase.y_height); 388 | token = strtok(NULL, delimiter); 389 | if (token) { 390 | pst->psa.staircase.x_length = atoi(token); 391 | dbg_print("staircase y, x %f, %d\n", 392 | pst->psa.staircase.y_height, 393 | pst->psa.staircase.x_length); 394 | } 395 | free(running); 396 | } else { 397 | free(running); 398 | return 0; 399 | } 400 | } else if (!strcmp(token, "sinosoid")) { 401 | token = strtok(NULL, delimiter); 402 | if (token) { 403 | pst->psn = SINOSOID; 404 | sscanf(token,"%d",&pst->psa.sinosoid.x_wavelength); 405 | token = strtok(NULL, delimiter); 406 | if (token) { 407 | sscanf(token,"%f",&pst->psa.sinosoid.y_amplitude); 408 | dbg_print("sine wavelength %d amplitude %f\n", 409 | pst->psa.sinosoid.x_wavelength, 410 | pst->psa.sinosoid.y_amplitude); 411 | } 412 | free(running); 413 | } else { 414 | free(running); 415 | return 0; 416 | } 417 | } else if (!strcmp(token, "single-pulse")) { 418 | token = strtok(NULL, delimiter); 419 | if (token) { 420 | pst->psn = SINGLE_PULSE; 421 | sscanf(token,"%f",&pst->psa.single_pulse.y_height); 422 | token = strtok(NULL, delimiter); 423 | if (token) { 424 | sscanf(token,"%d",&pst->psa.single_pulse.x_length); 425 | dbg_print("singlepulse y, x %f, %d\n", 426 | pst->psa.single_pulse.y_height, 427 | pst->psa.single_pulse.x_length); 428 | } 429 | free(running); 430 | } else { 431 | free(running); 432 | return 0; 433 | } 434 | } else if (!strcmp(token, "linear-ramp")) { 435 | token = strtok(NULL, delimiter); 436 | if (token) { 437 | pst->psn = LINEAR_RAMP; 438 | sscanf(token,"%f",&pst->psa.linear_ramp.slope_y_per_sec); 439 | printf(" linear ramp %f\n", 440 | pst->psa.linear_ramp.slope_y_per_sec); 441 | free(running); 442 | } else { 443 | free(running); 444 | return 0; 445 | } 446 | } else 
if (!strcmp(token, "saw-tooth")) { 447 | token = strtok(NULL, delimiter); 448 | if (token) { 449 | pst->psn = SAW_TOOTH; 450 | sscanf(token,"%f",&pst->psa.saw_tooth.slope_y_per_sec); 451 | token = strtok(NULL, delimiter); 452 | if (token) 453 | sscanf(token,"%f",&pst->psa.saw_tooth.max_y); 454 | dbg_print(" saw tooth slope %.3f, max %.3f\n", 455 | pst->psa.saw_tooth.slope_y_per_sec, 456 | pst->psa.saw_tooth.max_y); 457 | free(running); 458 | } else { 459 | free(running); 460 | return 0; 461 | } 462 | } else { 463 | free(running); 464 | return 0; 465 | } 466 | return 1; 467 | } 468 | -------------------------------------------------------------------------------- /src/parse_config.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2017, Intel Corporation. 3 | * 4 | * This program is free software; you can redistribute it and/or modify it 5 | * under the terms and conditions of the GNU General Public License, 6 | * version 2, as published by the Free Software Foundation. 7 | * 8 | * This program is distributed in the hope it will be useful, but WITHOUT 9 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 10 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 11 | * more details. 
12 | * 13 | * Author: Noor ul Mubeen 14 | */ 15 | 16 | #ifndef _PARSECONFIG_H_ 17 | #define _PARSECONFIG_H_ 18 | 19 | #define _GNU_SOURCE 20 | #include 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | #include "psst.h" 28 | #include "logger.h" 29 | 30 | #define MAX_LEN 512 31 | #define BASE_PATH_RAPL \ 32 | "/sys/devices/virtual/powercap/intel-rapl/intel-rapl" 33 | #define BASE_PATH_TZONE "/sys/devices/virtual/thermal/thermal_zone" 34 | #define BASE_PATH_CPUDTS \ 35 | "/sys/devices/platform/coretemp.0" 36 | 37 | /* paths & cmd specific to Android */ 38 | #if defined(_ANDROID_) 39 | #define default_log_file "/data/psst.csv" 40 | /* paths & cmd specific to non-android Linux */ 41 | #elif defined(_LINUX_) 42 | #define default_log_file "/var/log/psst.csv" 43 | #else 44 | #define default_log_file "./psst.csv" 45 | #endif 46 | 47 | struct config { 48 | char v_unit; 49 | cpu_set_t cpumask; 50 | unsigned int gpumask; 51 | unsigned int memmask; 52 | unsigned int cpu_freq; 53 | unsigned int verbose; 54 | unsigned int super_verbose; 55 | unsigned int version; 56 | char log_file_name[80]; 57 | int log_file_fd; 58 | char shape_func[20]; 59 | int poll_period; 60 | int duration; 61 | }; 62 | 63 | extern int dont_stress_cpu0; 64 | typedef enum cpu_stress_option { UNDEFINED, 65 | WELL_DEFINED } cpu_stress_opt_t; 66 | extern cpu_stress_opt_t cpu_stress_opt; 67 | extern int parse_cmd_config(int ac, char **av, struct config *configp); 68 | extern int populate_default_config(struct config *configp); 69 | extern int parse_power_shape(char *shape, data_t *pst); 70 | extern int avail_freq_item(int item); 71 | 72 | #endif 73 | -------------------------------------------------------------------------------- /src/perf_msr.c: -------------------------------------------------------------------------------- 1 | /* 2 | * perf_msr.c: Intel cpu aperf/mperf msr counter interface 3 | * 4 | * Copyright (c) 2017, Intel Corporation. 
5 | * 6 | * This program is free software; you can redistribute it and/or modify it 7 | * under the terms and conditions of the GNU General Public License, 8 | * version 2, as published by the Free Software Foundation. 9 | * 10 | * This program is distributed in the hope it will be useful, but WITHOUT 11 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 12 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 13 | * more details. 14 | * 15 | * Author: Noor ul Mubeen 16 | */ 17 | 18 | #include 19 | #include 20 | #include 21 | #include 22 | #include 23 | #include 24 | #include "perf_msr.h" 25 | 26 | int read_msr(int fd, uint32_t reg, uint64_t *data) 27 | { 28 | if (pread(fd, data, sizeof(*data), reg) != sizeof(*data)) { 29 | dbg_print("rdmsr fail on fd:%d\n", fd); 30 | return -1; 31 | } 32 | return 0; 33 | } 34 | 35 | int dev_msr_supported = -1; 36 | int cpu_hfm_mhz; 37 | int initialize_dev_msr(int c) 38 | { 39 | int fd; 40 | char msr_file[128]; 41 | 42 | sprintf(msr_file, "/dev/cpu/%d/msr", c); 43 | fd = open(msr_file, O_RDONLY); 44 | if (fd < 0) { 45 | perror("rdmsr: open"); 46 | return -1; 47 | } 48 | return fd; 49 | } 50 | int initialize_cpu_hfm_mhz(int fd) 51 | { 52 | uint64_t msr_val; 53 | int ret; 54 | 55 | ret = read_msr(fd, (uint32_t)MSR_PLATFORM_INFO, &msr_val); 56 | if (ret != -1) { 57 | /* most x86 platform have BaseCLK as 100MHz */ 58 | cpu_hfm_mhz = ((msr_val >> 8) & 0xffUll) * 100; 59 | } else { 60 | printf("***can't read MSR_PLATFORM_INFO***\n"); 61 | return -1; 62 | } 63 | 64 | return 0; 65 | } 66 | 67 | /* routine to evaluate & store a global msr value's diff */ 68 | #define VAR(a, b) (a##b) 69 | #define generate_msr_diff(scope) \ 70 | uint64_t get_diff_##scope(uint64_t cur_value) \ 71 | { \ 72 | uint64_t diff; \ 73 | diff = (VAR(last_, scope) == 0) ? 
0 : (cur_value - VAR(last_, scope)); \ 74 | VAR(last_, scope) = cur_value; \ 75 | return diff; \ 76 | } 77 | 78 | uint64_t *last_aperf = NULL; 79 | uint64_t *last_mperf = NULL; 80 | uint64_t *last_pperf = NULL; 81 | uint64_t *last_tsc = NULL; 82 | 83 | int init_delta_vars(int n) 84 | { 85 | last_aperf = calloc(n, sizeof(uint64_t)); /* zeroed: 0 means "no sample yet" */ 86 | last_mperf = calloc(n, sizeof(uint64_t)); 87 | last_pperf = calloc(n, sizeof(uint64_t)); 88 | last_tsc = calloc(n, sizeof(uint64_t)); 89 | if (!last_aperf || !last_mperf || !last_pperf || !last_tsc) { 90 | printf("allocation failure for perf vars\n"); 91 | return 0; 92 | } 93 | return 1; 94 | } 95 | 96 | /* 97 | * Intel Alderlake hardware errata #ADL026: pperf bits 31:64 could be incorrect. 98 | * https://edc.intel.com/content/www/us/en/design/ipla/software-development-plat 99 | * forms/client/platforms/alder-lake-desktop/682436/007/errata-details/#ADL026 100 | * u64diff() implements a workaround. Assuming real diffs less than MAX(uint32) 101 | */ 102 | #define u64diff(b, a) (((uint64_t)b < (uint64_t)a) ? \ 103 | (uint64_t)((uint32_t)~0UL - (uint32_t)a + (uint32_t)b) :\ 104 | ((uint64_t)b - (uint64_t)a)) 105 | 106 | /* routine to evaluate & store a per-cpu msr value's diff */ 107 | #define VARI(a, b, i) a##b[i] 108 | #define cpu_generate_msr_diff(scope) \ 109 | uint64_t cpu_get_diff_##scope(uint64_t cur_value, int instance) \ 110 | { \ 111 | uint64_t diff; \ 112 | diff = (VARI(last_, scope, instance) == 0) ? \ 113 | 0 : u64diff(cur_value, VARI(last_, scope, instance)); \ 114 | VARI(last_, scope, instance) = cur_value; \ 115 | return diff; \ 116 | } 117 | 118 | cpu_generate_msr_diff(aperf); 119 | cpu_generate_msr_diff(mperf); 120 | cpu_generate_msr_diff(pperf); 121 | cpu_generate_msr_diff(tsc); 122 | -------------------------------------------------------------------------------- /src/perf_msr.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2017, Intel Corporation. 
3 | * 4 | * This program is free software; you can redistribute it and/or modify it 5 | * under the terms and conditions of the GNU General Public License, 6 | * version 2, as published by the Free Software Foundation. 7 | * 8 | * This program is distributed in the hope it will be useful, but WITHOUT 9 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 10 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 11 | * more details. 12 | * 13 | * Author: Noor ul Mubeen 14 | */ 15 | 16 | #ifndef _PERF_MSR_ 17 | #define _PERF_MSR_ 18 | #include 19 | #include 20 | #include "logger.h" 21 | 22 | #define MSR_IA32_MPERF 0xe7 23 | #define MSR_IA32_APERF 0xe8 24 | #define MSR_IA32_PPERF 0x64e 25 | #define MSR_IA32_TSC 0x10 26 | #define MSR_PLATFORM_INFO 0xce 27 | #define MSR_PERF_STATUS 0x198 28 | 29 | extern int cpu_hfm_mhz; 30 | extern int read_msr(int fd, uint32_t reg, uint64_t *data); 31 | extern int initialize_dev_msr(int c); 32 | extern int initialize_cpu_hfm_mhz(int fd); 33 | extern int init_delta_vars(int n); 34 | extern uint64_t cpu_get_diff_aperf(uint64_t a, int i); 35 | extern uint64_t cpu_get_diff_mperf(uint64_t m, int i); 36 | extern uint64_t cpu_get_diff_pperf(uint64_t p, int i); 37 | extern uint64_t cpu_get_diff_tsc(uint64_t t, int i); 38 | #endif 39 | -------------------------------------------------------------------------------- /src/psst.c: -------------------------------------------------------------------------------- 1 | /* 2 | * psst.c: main, creates threads & delegate work to SoC. 3 | * 4 | * Copyright (c) 2017, Intel Corporation. 5 | * 6 | * This program is free software; you can redistribute it and/or modify it 7 | * under the terms and conditions of the GNU General Public License, 8 | * version 2, as published by the Free Software Foundation. 
9 | * 10 | * This program is distributed in the hope it will be useful, but WITHOUT 11 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 12 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 13 | * more details. 14 | * 15 | * Author: Noor ul Mubeen 16 | */ 17 | 18 | #define _GNU_SOURCE 19 | #include 20 | #include 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | #include 28 | #include 29 | #include 30 | #include 31 | #include 32 | #include 33 | #include 34 | 35 | #include "parse_config.h" 36 | #include "psst.h" 37 | #include "logger.h" 38 | #include "rapl.h" 39 | #include "perf_msr.h" 40 | 41 | 42 | void print_version(void) 43 | { 44 | printf("psst version %s\n", VERSION); 45 | } 46 | 47 | int nr_threads; 48 | 49 | /* 50 | * Any additional stress function goes here. 51 | * However, the motive of this tool is reasonable peak power & its 52 | * controllability. Both motives are met using meaningful work. 53 | */ 54 | static void cpu_work(int on_time_us) 55 | { 56 | (void)on_time_us; /* please the compiler */ 57 | return; 58 | } 59 | 60 | int ts_compare(struct timespec *time1, struct timespec *time2) 61 | { 62 | if (time1->tv_sec < time2->tv_sec) 63 | return -1; /* Less than. */ 64 | else if (time1->tv_sec > time2->tv_sec) 65 | return 1; /* Greater than. */ 66 | else if (time1->tv_nsec < time2->tv_nsec) 67 | return -1; /* Less than. */ 68 | else if (time1->tv_nsec > time2->tv_nsec) 69 | return 1; /* Greater than. */ 70 | else 71 | return 0; /* Equal. 
*/ 72 | } 73 | 74 | int is_time_remaining(clockid_t clk, struct timespec *ts_last, 75 | int sec, int nsec) 76 | { 77 | struct timespec ts_now, ts_later; 78 | ts_later.tv_sec = ts_last->tv_sec + sec; 79 | ts_later.tv_nsec = ts_last->tv_nsec + nsec; 80 | if (ts_later.tv_nsec >= NSEC_PER_SEC) { 81 | ts_later.tv_sec++; 82 | ts_later.tv_nsec -= NSEC_PER_SEC; 83 | } 84 | if (clock_gettime(clk, &ts_now)) 85 | perror("clock_gettime"); 86 | if (ts_compare(&ts_now, &ts_later) < 0) 87 | return 1; 88 | else 89 | return 0; 90 | } 91 | 92 | int timespec_to_msec(struct timespec *t) 93 | { 94 | return t->tv_sec*1000 + t->tv_nsec/1000000; 95 | } 96 | unsigned long clockdiff_now_ns(clockid_t clk, struct timespec *ts_then) 97 | { 98 | struct timespec ts_now; 99 | if (clock_gettime(clk, &ts_now)) 100 | perror("clock_gettime"); 101 | return diff_ns(ts_then, &ts_now); 102 | } 103 | 104 | int set_affinity(int pr) 105 | { 106 | cpu_set_t mask; 107 | CPU_ZERO(&mask); 108 | CPU_SET(pr, &mask); 109 | if (sched_setaffinity(0, sizeof(cpu_set_t), &mask) != 0) 110 | perror("sched_setaffinity"); 111 | 112 | return 0; 113 | } 114 | 115 | int set_sched_priority(int min_max) 116 | { 117 | int policy; 118 | struct sched_param param; 119 | 120 | pthread_getschedparam(pthread_self(), &policy, &param); 121 | if (min_max) 122 | param.sched_priority = sched_get_priority_max(policy); 123 | else 124 | param.sched_priority = sched_get_priority_min(policy); 125 | 126 | pthread_setschedparam(pthread_self(), policy, &param); 127 | return 0; 128 | } 129 | 130 | int cap_v_unit(float *v_unitp, float max, float min) 131 | { 132 | if (*v_unitp >= (float)max) { 133 | *v_unitp = max; 134 | return 1; 135 | } else if (*v_unitp <= (float)min) { 136 | *v_unitp = min; 137 | return 1; 138 | } 139 | return 0; 140 | } 141 | 142 | #define PS_MIN_POLL_MS (50) 143 | int power_shaping(ps_t *ps, float *v_unit) 144 | { 145 | float y_delta, rad; 146 | int x_delta = 0; 147 | long long time_ms; 148 | switch (ps->psn) { 149 | case LINEAR_RAMP: 
150 | /* recalculate every PS_MIN_POLL_MS */ 151 | x_delta = PS_MIN_POLL_MS; 152 | if (is_time_remaining(CLOCK_MONOTONIC, &ps->last, 0, x_delta * 1000000)) 153 | return 0; 154 | y_delta = ps->psa.linear_ramp.slope_y_per_sec / ((float)MSEC_PER_SEC/x_delta); 155 | *v_unit = *v_unit + y_delta; 156 | if (cap_v_unit(v_unit, MAX_LOAD, MIN_LOAD)) 157 | return 0; 158 | break; 159 | case SAW_TOOTH: 160 | x_delta = PS_MIN_POLL_MS; 161 | if (is_time_remaining(CLOCK_MONOTONIC, &ps->last, 0, x_delta * 1000000)) 162 | return 0; 163 | if ((*v_unit >= (float)ps->psa.saw_tooth.max_y) || 164 | (*v_unit <= (float)MIN_LOAD)) { 165 | ps->psa.saw_tooth.slope_y_per_sec *= -1; 166 | } 167 | y_delta = ps->psa.saw_tooth.slope_y_per_sec / ((float)MSEC_PER_SEC/x_delta); 168 | *v_unit = *v_unit + y_delta; 169 | cap_v_unit(v_unit, ps->psa.saw_tooth.max_y, MIN_LOAD); 170 | break; 171 | case STAIR_CASE: 172 | /* recalculate every step-stride seconds */ 173 | x_delta = ps->psa.staircase.x_length; 174 | if (is_time_remaining(CLOCK_MONOTONIC, &ps->last, x_delta, 0)) 175 | return 0; 176 | y_delta = ps->psa.staircase.y_height; 177 | *v_unit = *v_unit + y_delta; 178 | cap_v_unit(v_unit, MAX_LOAD, MIN_LOAD); 179 | break; 180 | case SINOSOID: 181 | x_delta = PS_MIN_POLL_MS; 182 | if (is_time_remaining(CLOCK_MONOTONIC, &ps->last, 0, x_delta * 1000000)) 183 | return 0; 184 | /* 2*pi radians == 360 degrees == 1 wavelength */ 185 | time_ms = ps->last.tv_sec * 1000 + ps->last.tv_nsec/1000000; 186 | x_delta = time_ms % (ps->psa.sinosoid.x_wavelength * 1000); 187 | rad = (float)(2*3.14159 * x_delta)/(ps->psa.sinosoid.x_wavelength * 1000); 188 | /* scale sin(x) to +/-amplitude/2 excursions */ 189 | *v_unit = ps->psa.sinosoid.y_amplitude * (1 + sinf(rad))/2; 190 | /* duty cycle of 0.00 does not make sense. 
offset by +1% */ 191 | *v_unit += 1; 192 | break; 193 | case SINGLE_PULSE: 194 | /* rising edge detection */ 195 | if (!ps->begin.tv_sec) { 196 | if (clock_gettime(CLOCK_MONOTONIC, &ps->begin)) 197 | perror("clock_gettime"); 198 | } 199 | 200 | x_delta = (int)ps->psa.single_pulse.x_length; 201 | *v_unit = ps->psa.single_pulse.y_height; 202 | 203 | /* falling edge detection */ 204 | if (is_time_remaining(CLOCK_MONOTONIC, &ps->begin, x_delta, 0)) 205 | return 1; 206 | else 207 | *v_unit = MIN_LOAD; 208 | break; 209 | case NONE: 210 | break; 211 | default: 212 | /* single step */ 213 | *v_unit = ps->psa.single_step.v_units; 214 | break; 215 | } 216 | 217 | /* if we didn't return from the above cases, the shape changed. Update ps */ 218 | if (clock_gettime(CLOCK_MONOTONIC, &ps->last)) 219 | perror("clock_gettime"); 220 | return 1; 221 | 222 | } 223 | 224 | #define START_DELAY 0 225 | 226 | static void work_fn(void *data) 227 | { 228 | int start_pending = 0; 229 | int tick_usec = DEFAULT_TICK_USEC; 230 | int ret, on_time_us, off_time_us, pr; 231 | int cpu_work_exist = 0; 232 | float duty_cycle, dummy; 233 | struct timespec ts; 234 | static int start_ms; 235 | data_t *data_ptr = (data_t*)data; 236 | ps_t ps; 237 | 238 | sigset_t maskset; 239 | sigfillset(&maskset); 240 | ret = pthread_sigmask(SIG_BLOCK, &maskset, NULL); 241 | if (ret) 242 | printf("Couldn't mask signals in work_fn. 
err:%d\n", ret); 243 | 244 | duty_cycle = data_ptr->duty_cycle; 245 | pr = data_ptr->affinity_pr; 246 | ps.psn = data_ptr->psn; 247 | ps.psa = data_ptr->psa; 248 | ps.begin.tv_sec = 0; 249 | 250 | /* 251 | * if this thread is launched for non-cpu work (e.g. gpu work requestor) 252 | * let it run like any normal thread in system 253 | */ 254 | if (CPU_ISSET(pr, &configpv.cpumask)) { 255 | cpu_work_exist = 1; 256 | set_affinity(pr); 257 | set_sched_priority(1); 258 | /* thread could override gpu or other XX_TICK_USEC */ 259 | tick_usec = IA_TICK_USEC; 260 | } 261 | 262 | /* fix duty cycle to a non-zero min value */ 263 | duty_cycle = (fabsf(duty_cycle - MIN_LOAD) < MIN_LOAD/10) ? MIN_LOAD : duty_cycle; 264 | 265 | if (pr == 0) { 266 | plog_poll_sec = MSEC_TO_SEC(configpv.poll_period); 267 | plog_poll_nsec = (plog_poll_sec > 0) ? 268 | REMAINING_MS_TO_NS(configpv.poll_period) : 269 | configpv.poll_period * 1000000; 270 | 271 | duration_sec = MSEC_TO_SEC(configpv.duration); 272 | duration_nsec = (duration_sec > 0) ? 273 | REMAINING_MS_TO_NS(configpv.duration) : 274 | configpv.duration * 1000000; 275 | dbg_print("thread %d sec: %d nsec %d\n", 276 | pr, duration_sec, duration_nsec); 277 | initialize_log_clock(); 278 | update_perf_diffs(&dummy); 279 | if (dont_stress_cpu0) { 280 | duty_cycle = MIN_LOAD; 281 | ps.psn = NONE; 282 | } 283 | } 284 | /* initial on/off time calculation based on duty cycle */ 285 | on_time_us = (tick_usec * duty_cycle / 100); 286 | off_time_us = tick_usec - on_time_us; 287 | dbg_print("Thread:%x DutyCycle:%f ontime:%duS, idletime:%duS\n", 288 | (unsigned int)pthread_self(), 289 | duty_cycle, on_time_us, off_time_us); 290 | 291 | /* monotonic clock initial reference. 
updated during power_shaping */ 292 | if (clock_gettime(CLOCK_MONOTONIC, &ps.last)) 293 | perror("clock_gettime 1"); 294 | 295 | start_ms = timespec_to_msec(&ps.last); 296 | 297 | do { 298 | if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts)) 299 | perror("clock_gettime 2"); 300 | while (is_time_remaining(CLOCK_THREAD_CPUTIME_ID, &ts, 0, 301 | on_time_us*1000)) { 302 | if (timespec_to_msec(&ps.last) - start_ms < START_DELAY) { 303 | if (clock_gettime(CLOCK_MONOTONIC, &ps.last)) 304 | perror("clock_gettime 3"); 305 | start_pending = 1; 306 | } else { 307 | start_pending = 0; 308 | } 309 | 310 | if (!start_pending) { 311 | /* 312 | * add as much work as required in this loop. 313 | * it will be accounted for good. 314 | */ 315 | data_ptr->duty_cycle = duty_cycle; 316 | if (power_shaping(&ps, &duty_cycle)) { 317 | on_time_us = tick_usec * duty_cycle/100; 318 | off_time_us = tick_usec - on_time_us; 319 | } 320 | } 321 | 322 | if (!start_pending) { 323 | /* No work for cpu0 if it was just submitter */ 324 | if ((dont_stress_cpu0 || (pr != 0)) && cpu_work_exist) { 325 | cpu_work(on_time_us); 326 | } 327 | } 328 | 329 | if (pr == 0) { 330 | do_logging(duty_cycle); 331 | /* XXX: gfx, mem work */ 332 | } 333 | } 334 | 335 | if (exit_cpu_thread) 336 | continue; 337 | /* now for OFF cycle */ 338 | ts.tv_sec = 0; 339 | ts.tv_nsec = off_time_us * 1000; 340 | nanosleep(&ts, NULL); 341 | } while(!exit_cpu_thread && cpu_work_exist); 342 | 343 | /* report out energy index details before exit */ 344 | int N; 345 | int time_ms; 346 | float soc_r_avg, pp0_r_avg; 347 | if (clock_gettime(CLOCK_MONOTONIC, &ts)) 348 | perror("clock_gettime 1"); 349 | time_ms = timespec_to_msec(&ts) - start_ms; 350 | N = (int)time_ms/configpv.poll_period; 351 | 352 | if (pr == 0) { 353 | printf("\nDuration: %d ms. poll: %d ms. 
samples: %d\n", 354 | time_ms, configpv.poll_period, N); 355 | if (rapl_pp0_supported) { 356 | soc_r_avg = (float)(soc_diff_uj[0])/(time_ms*1000); 357 | pp0_r_avg = (float)(pp0_diff_uj)/(time_ms*1000); 358 | printf("Applicable to SOC\n"); 359 | printf("\tAvg soc power: %.3f W\n", soc_r_avg); 360 | printf("\tEnergy consumed (soc): %.3f mJ\n", 361 | (float)soc_diff_uj[0]/1000); 362 | printf("Applicable to CPU\n"); 363 | 364 | printf("\tAvg cpu power: %.3f W\n", pp0_r_avg); 365 | printf("\tEnergy consumed (cpu): %.3f mJ\n", 366 | (float)pp0_diff_uj/1000); 367 | } 368 | } 369 | pthread_exit(NULL); 370 | return; 371 | } 372 | 373 | 374 | /* signal handler: terminate all threads on cpu */ 375 | static void psst_signal_handler(int sig) 376 | { 377 | dbg_print("caught signal %d\n", sig); 378 | 379 | switch (sig) { 380 | case SIGTERM: 381 | case SIGKILL: 382 | case SIGINT: 383 | exit_cpu_thread = 1; 384 | break; 385 | default: 386 | break; 387 | } 388 | } 389 | 390 | static pthread_t *thread_ptr; 391 | static data_t *data_ptr; 392 | perf_stats_t *perf_stats; 393 | 394 | int main(int argc, char *argv[]) 395 | { 396 | int c, t = 0, ret; 397 | float duty; 398 | void *res; 399 | data_t *pst; 400 | struct config *cfg; 401 | 402 | exit_cpu_thread = 0; 403 | cfg = &configpv; 404 | 405 | if (!parse_cmd_config(argc, argv, cfg)) { 406 | printf("failed to parse_cmd_config\n"); 407 | exit(EXIT_FAILURE); 408 | } 409 | 410 | if (cfg->version) { 411 | print_version(); 412 | exit(EXIT_SUCCESS); 413 | } 414 | 415 | if (geteuid() != 0) { 416 | printf("run as root\n"); 417 | exit(EXIT_FAILURE); 418 | } 419 | 420 | if (!populate_default_config(cfg)) { 421 | printf("failed to populate_default_config\n"); 422 | exit(EXIT_FAILURE); 423 | } 424 | 425 | pthread_t io_thread; 426 | pthread_attr_t attr_io; 427 | if (pthread_attr_init(&attr_io)) { 428 | perror("io thread attr"); 429 | exit(EXIT_FAILURE); 430 | } 431 | pthread_attr_setdetachstate(&attr_io, PTHREAD_CREATE_JOINABLE); 432 | pthread_attr_t 
attr_t; 433 | 434 | data_t base_data; 435 | /* shape func is common to all threads */ 436 | pst = &base_data; 437 | 438 | /* Android lib does not support suboption(). parse it manually */ 439 | if (!parse_power_shape(cfg->shape_func, pst)) { 440 | printf("failed parse_power_shape \"%s\"\n", cfg->shape_func); 441 | printf("see --help for usage\n"); 442 | exit(EXIT_FAILURE); 443 | } 444 | 445 | /* default/starting duty cycle */ 446 | duty = MIN_LOAD; 447 | nr_threads = CPU_COUNT(&cfg->cpumask); 448 | 449 | if (!init_delta_vars(nr_threads)){ 450 | exit(EXIT_FAILURE); 451 | } 452 | 453 | thread_ptr = malloc(sizeof(pthread_t) * nr_threads); 454 | if (!thread_ptr) { 455 | perror("malloc thread_ptr"); 456 | goto bail; 457 | } 458 | data_ptr = malloc(sizeof(data_t) * nr_threads); 459 | if (!data_ptr) { 460 | perror("malloc data_ptr"); 461 | goto bail; 462 | } 463 | perf_stats = malloc(sizeof(perf_stats_t) * nr_threads); 464 | if (!perf_stats) { 465 | perror("malloc perf_stat"); 466 | goto bail; 467 | } 468 | 469 | for (c = 0, t = 0; c < CPU_SETSIZE && t < nr_threads; c++) { 470 | if (!CPU_ISSET(c, &cfg->cpumask)) 471 | continue; 472 | ret = initialize_dev_msr(c); 473 | if (ret < 0) { 474 | perf_stats[t].dev_msr_supported = 0; 475 | printf("*** No /dev/cpu/%d/msr. 
check CONFIG_X86_MSR support ***\n\n", c); 476 | break; 477 | } else { 478 | perf_stats[t].cpu = c; 479 | perf_stats[t].dev_msr_fd = ret; 480 | perf_stats[t].dev_msr_supported = 1; 481 | } 482 | t++; 483 | } 484 | if (initialize_cpu_hfm_mhz(perf_stats[0].dev_msr_fd)) 485 | goto bail; 486 | 487 | /* thread for deferred disk IO of logs */ 488 | if (pthread_create(&io_thread, &attr_io, 489 | (void *)&page_write_disk, (void *)cfg)) { 490 | perror("io thread create"); 491 | goto bail; 492 | } 493 | if (!nr_threads) { 494 | nr_threads = 1; 495 | CPU_SET(0, &cfg->cpumask); 496 | } 497 | 498 | ret = pthread_attr_init(&attr_t); 499 | if (ret) { 500 | perror("pthread_attr_init:"); 501 | goto bail; 502 | } 503 | 504 | pthread_attr_setdetachstate(&attr_t, PTHREAD_CREATE_JOINABLE); 505 | /* fork pthreads for each logical cpu selected & set affinity to cpu. */ 506 | for (c = 0, t = 0; c < CPU_SETSIZE && t < nr_threads; c++) { 507 | if (!CPU_ISSET(c, &cfg->cpumask)) 508 | continue; 509 | data_ptr[t].duty_cycle = duty; 510 | /* setaffinity to specific processor */ 511 | data_ptr[t].affinity_pr = c; 512 | data_ptr[t].psn = pst->psn; 513 | data_ptr[t].psa = pst->psa; 514 | ret = pthread_create(&thread_ptr[t], &attr_t, (void *)&work_fn, 515 | (void *)&data_ptr[t]); 516 | if (ret) { 517 | perror("Failed pthread create"); 518 | goto bail; 519 | } 520 | t++; 521 | } 522 | 523 | if (signal(SIGINT, psst_signal_handler) == SIG_ERR) 524 | printf("Cannot handle SIGINT\n"); 525 | 526 | /* attr not needed after create */ 527 | pthread_attr_destroy(&attr_t); 528 | dbg_print("Created %d Thread + 1 io thread\n", t); 529 | while (0 < t--) { 530 | pthread_join(thread_ptr[t], &res); 531 | close(perf_stats[t].dev_msr_fd); 532 | dbg_print("Thread %d cleaned\n", t); 533 | } 534 | 535 | /* we exit the logger thread above. 
time to flush any remaining data */ 536 | exit_io_thread = 1; 537 | trigger_disk_io(); 538 | 539 | pthread_attr_destroy(&attr_io); 540 | pthread_join(io_thread, &res); 541 | dbg_print("IO Thread cleaned\n"); 542 | 543 | bail: 544 | return 1; 545 | } 546 | -------------------------------------------------------------------------------- /src/psst.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2017, Intel Corporation. 3 | * 4 | * This program is free software; you can redistribute it and/or modify it 5 | * under the terms and conditions of the GNU General Public License, 6 | * version 2, as published by the Free Software Foundation. 7 | * 8 | * This program is distributed in the hope it will be useful, but WITHOUT 9 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 10 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 11 | * more details. 12 | * 13 | * Author: Noor ul Mubeen 14 | */ 15 | 16 | #ifndef _PSST_H_ 17 | #define _PSST_H_ 18 | #include 19 | 20 | #define MSEC_PER_SEC (1000) 21 | #define USEC_PER_SEC (1000000) 22 | #define NSEC_PER_SEC (1000000000) 23 | /* 24 | * kernel's USER_HZ is not exported to user space 25 | * typically platforms have kernel (HZ == USER_HZ == 1000 per sec) 26 | */ 27 | #define IA_DUTY_CYCLE_PER_SEC (50) 28 | #define IA_TICK_USEC (USEC_PER_SEC / IA_DUTY_CYCLE_PER_SEC) 29 | 30 | #define DEFAULT_TICK_USEC (IA_TICK_USEC) 31 | 32 | #define MIN_LOAD (0.10) 33 | #define MAX_LOAD (100) 34 | 35 | enum power_shape_name { 36 | SINGLE_STEP, 37 | SINOSOID, 38 | STAIR_CASE, 39 | SINGLE_PULSE, 40 | LINEAR_RAMP, 41 | SAW_TOOTH, 42 | GROWTH_CURVE, 43 | DECAY_CURVE, 44 | NONE 45 | }; 46 | 47 | typedef union { 48 | struct single_step_t { 49 | float v_units; 50 | } single_step; 51 | struct staircase_t { 52 | float y_height; 53 | int x_length; 54 | } staircase; 55 | struct sinosoid_t { 56 | float y_amplitude; 57 | int x_wavelength; 58 | } 
sinosoid; 59 | struct singlepulse_t { 60 | float y_height; 61 | int x_length; 62 | } single_pulse; 63 | struct linear_ramp_t { 64 | float slope_y_per_sec; 65 | } linear_ramp; 66 | struct saw_tooth_t { 67 | float slope_y_per_sec; 68 | float max_y; 69 | } saw_tooth; 70 | struct growth_curve_t { 71 | } growth_curve; 72 | struct decay_curve_t { 73 | } decay_curve; 74 | } power_shape_attr_t; 75 | 76 | typedef struct { 77 | enum power_shape_name psn; 78 | power_shape_attr_t psa; 79 | struct timespec last; 80 | struct timespec begin; 81 | } ps_t; 82 | 83 | typedef struct { 84 | float duty_cycle; 85 | int affinity_pr; 86 | enum power_shape_name psn; 87 | power_shape_attr_t psa; 88 | } data_t; 89 | 90 | typedef struct { 91 | int last_time_taken; 92 | struct timespec ts; 93 | } perf_t; 94 | 95 | typedef struct { 96 | int cpu; 97 | int dev_msr_fd; 98 | int dev_msr_supported; 99 | uint64_t aperf_diff; 100 | uint64_t mperf_diff; 101 | uint64_t pperf_diff; 102 | uint64_t tsc_diff; 103 | uint64_t nperf; 104 | } perf_stats_t; 105 | 106 | extern int is_time_remaining(clockid_t, struct timespec *, int, int); 107 | extern unsigned int *perf_time; 108 | extern uint64_t pp0_diff_uj, soc_diff_uj[4]; 109 | extern int exit_cpu_thread, exit_io_thread; 110 | 111 | #endif 112 | -------------------------------------------------------------------------------- /src/rapl.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Interface functions to intel rapl. 3 | * 4 | * Copyright (c) 2017, Intel Corporation. 5 | * 6 | * This program is free software; you can redistribute it and/or modify it 7 | * under the terms and conditions of the GNU General Public License, 8 | * version 2, as published by the Free Software Foundation. 9 | * 10 | * This program is distributed in the hope it will be useful, but WITHOUT 11 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 12 | * FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU General Public License for 13 | * more details. 14 | * 15 | * Author: Noor ul Mubeen 16 | */ 17 | 18 | #include "logger.h" 19 | static long long prev_pkg0, prev_pkg1, prev_pkg2, prev_pkg3; 20 | static long long prev_cpu; 21 | static long long prev_gpu; 22 | static long long prev_dram; 23 | 24 | #define VAR(a, b) (a##b) 25 | #define generate_rapl_ediff(scope) \ 26 | int rapl_ediff_##scope(long long cur_ewma) \ 27 | { \ 28 | long long ediff; \ 29 | ediff = (VAR(prev_, scope) == 0) ? 0 : (cur_ewma - VAR(prev_, scope)); \ 30 | VAR(prev_, scope) = cur_ewma; \ 31 | if (ediff <= 0) { \ 32 | return 0; \ 33 | } \ 34 | return (int)ediff; \ 35 | } 36 | 37 | /* These functions return energy diff in micro-joules since last sample */ 38 | generate_rapl_ediff(pkg0); 39 | generate_rapl_ediff(pkg1); 40 | generate_rapl_ediff(pkg2); 41 | generate_rapl_ediff(pkg3); 42 | generate_rapl_ediff(cpu); 43 | generate_rapl_ediff(gpu); 44 | generate_rapl_ediff(dram); 45 | -------------------------------------------------------------------------------- /src/rapl.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2017, Intel Corporation. 3 | * 4 | * This program is free software; you can redistribute it and/or modify it 5 | * under the terms and conditions of the GNU General Public License, 6 | * version 2, as published by the Free Software Foundation. 7 | * 8 | * This program is distributed in the hope it will be useful, but WITHOUT 9 | * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 10 | * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 11 | * more details. 
12 | * 13 | * Author: Noor ul Mubeen 14 | */ 15 | 16 | #ifndef _RAPL_H_ 17 | #define _RAPL_H_ 18 | 19 | extern int rapl_ediff_pkg0(long long); 20 | extern int rapl_ediff_pkg1(long long); 21 | extern int rapl_ediff_pkg2(long long); 22 | extern int rapl_ediff_pkg3(long long); 23 | extern int rapl_ediff_soc(long long); 24 | extern int rapl_ediff_cpu(long long); 25 | extern int rapl_ediff_gpu(long long); 26 | extern int rapl_ediff_dram(long long); 27 | #endif 28 | 29 | -------------------------------------------------------------------------------- /whitepapers/Generic_perf_per_watt.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/intel/psst/f8693dee31934d04feed9bd01329ad8aed6a9e3e/whitepapers/Generic_perf_per_watt.pdf --------------------------------------------------------------------------------