├── .gitignore ├── AUTHORS.txt ├── CONTRIBUTING.md ├── LICENSE.txt ├── Makefile ├── README.md ├── armpmu_lib.h ├── ko ├── Makefile └── enable_arm_pmu.c ├── load-module ├── perf_arm_pmu.c ├── perf_event_open.c └── unload-module /.gitignore: -------------------------------------------------------------------------------- 1 | *.ko 2 | *.o 3 | *~ 4 | -------------------------------------------------------------------------------- /AUTHORS.txt: -------------------------------------------------------------------------------- 1 | Author, target of blame: 2 | 3 | Austin Seipp 4 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | ## Commits 4 | 5 | Rules for contribution: 6 | 7 | * 80-character column maximum. 8 | * The first line of a commit message should be 73 columns max. 9 | * Try to make commits self contained. One thing at a time. 10 | If it's a branch, squash the commits together to make one. 11 | * Always run tests. If benchmarks regress, give OS information, 12 | and we'll discuss. 13 | * Always reference the issue you're working on in the bug tracker 14 | in your commit message, and if it fixes the issue, close it. 15 | 16 | You can use GitHub pull requests OR just email me patches directly 17 | (see `git format-patch --help`,) whatever you are more comfortable with. 18 | 19 | One nice aspect of submitting a pull request is that 20 | [travis-ci.org](http://travis-ci.org) bots will automatically merge, build 21 | and run tests against your commits, and continue as you update the request, 22 | so you can be sure you didn't typo stuff or something before a final merge. 23 | 24 | For multi-commit requests, I will often squash them into the smallest 25 | possible logical changes and commit with author attribution. 26 | 27 | ### Notes on sign-offs and attributions, etc. 
28 | 29 | When you commit, **please use -s to add a Signed-off-by line**. I manage 30 | the `Signed-off-by` line much like Git itself: by adding it, you make clear 31 | that the contributed code abides by the source code license. I'm pretty 32 | much always going to want you to do this. 33 | 34 | I normally merge commits manually and give the original author attribution 35 | via `git commit --author`. I also sign-off on it, and add an `Acked-by` field 36 | which basically states "this commit is not totally ludicrous." 37 | 38 | Other fields may be added in the same vein for attribution or other purposes 39 | (`Suggested-by`, `Reviewed-by`, etc.) 40 | 41 | ## Hacker notes 42 | 43 | N/A. 44 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | Copyright (c) 2013 Austin Seipp 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining 4 | a copy of this software and associated documentation files (the 5 | "Software"), to deal in the Software without restriction, including 6 | without limitation the rights to use, copy, modify, merge, publish, 7 | distribute, sublicense, and/or sell copies of the Software, and to 8 | permit persons to whom the Software is furnished to do so, subject to 9 | the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be 12 | included in all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 17 | NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 18 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 19 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 20 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | all: ko/enable_arm_pmu.ko perf_arm_pmu perf_event_open 2 | perf_arm_pmu: perf_arm_pmu.c armpmu_lib.h 3 | @echo CC perf_arm_pmu 4 | @$(CC) -O3 -std=gnu99 perf_arm_pmu.c -o perf_arm_pmu 5 | perf_event_open: perf_event_open.c 6 | @echo CC perf_event_open 7 | @$(CC) -O3 -std=gnu99 perf_event_open.c -o perf_event_open 8 | ko/enable_arm_pmu.ko: ko/enable_arm_pmu.c 9 | @echo KMOD ko/enable_arm_pmu.ko 10 | @$(MAKE) -C ko > /dev/null 11 | runtests: all 12 | @echo SUDO load-module 13 | @./load-module 14 | @./perf_arm_pmu 64 15 | @./perf_event_open 64 16 | @echo SUDO unload-module 17 | @./unload-module 18 | 19 | clean: 20 | @($(MAKE) -C ko clean > /dev/null) && rm -f perf_arm_pmu perf_event_open *.o *~ 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # User-mode access to ARM PMU cycle counters 2 | 3 | This repository contains a kernel module and a header library (`armpmu_lib.h`). 4 | 5 | The code targets ARMv7-A and AArch64 machines. On 32-bit ARM, this generally 6 | means you'll need a Cortex-A series core (A7, A8, A9, A15, etc.) 7 | 8 | More details are available in [my blog post][blog].
9 | 10 | [blog]: http://neocontra.blogspot.com/2013/05/user-mode-performance-counters-for.html 11 | 12 | # Testing 13 | 14 | To compile, load, test and remove the module, you can just run: 15 | 16 | ``` 17 | $ sudo make runtests 18 | ``` 19 | 20 | ## Tested on 21 | 22 | * Samsung Chromebook 23 | * Exynos 5 Dual, 1.7GHz Cortex-A15 24 | * Ubuntu 13.04 25 | * ODROID-U2 26 | * Exynos 4 Quad, 1.7GHz Cortex-A9 27 | * Ubuntu/Linaro 12.10 derivative 28 | 29 | TBD: PandaBoard. 30 | 31 | # Join in 32 | 33 | Be sure to read the [contributing guidelines][contribute]. File bugs 34 | in the GitHub [issue tracker][]. 35 | 36 | Master [git repository][gh]: 37 | 38 | * `git clone https://github.com/thoughtpolice/enable_arm_pmu.git` 39 | 40 | There's also a [BitBucket mirror][bb]: 41 | 42 | * `git clone https://bitbucket.org/thoughtpolice/enable_arm_pmu.git` 43 | 44 | # Authors 45 | 46 | See [AUTHORS.txt](https://raw.github.com/thoughtpolice/enable_arm_pmu/master/AUTHORS.txt). 47 | 48 | # License 49 | 50 | MIT. See 51 | [LICENSE.txt](https://raw.github.com/thoughtpolice/enable_arm_pmu/master/LICENSE.txt) 52 | for terms of copyright and redistribution.
53 | 54 | [contribute]: https://github.com/thoughtpolice/enable_arm_pmu/blob/master/CONTRIBUTING.md 55 | [issue tracker]: http://github.com/thoughtpolice/enable_arm_pmu/issues 56 | [gh]: http://github.com/thoughtpolice/enable_arm_pmu 57 | [bb]: http://bitbucket.org/thoughtpolice/enable_arm_pmu 58 | -------------------------------------------------------------------------------- /armpmu_lib.h: -------------------------------------------------------------------------------- 1 | #ifndef ARMPMU_LIB_H 2 | #define ARMPMU_LIB_H 3 | #include <stdint.h> /* uint32_t */ 4 | static inline uint32_t 5 | rdtsc32(void) 6 | { 7 | #if defined(__GNUC__) 8 | uint32_t r = 0; 9 | #if defined __aarch64__ 10 | asm volatile("mrs %0, pmccntr_el0" : "=r" (r)); 11 | #elif defined(__ARM_ARCH_7A__) 12 | asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(r) ); 13 | #else 14 | #error Unsupported architecture/compiler! 15 | #endif 16 | return r; 17 | #endif 18 | } 19 | 20 | #define ARMV8_PMEVTYPER_P (1U << 31) /* EL1 modes filtering bit */ 21 | #define ARMV8_PMEVTYPER_U (1U << 30) /* EL0 filtering bit */ 22 | #define ARMV8_PMEVTYPER_NSK (1U << 29) /* Non-secure EL1 (kernel) modes filtering bit */ 23 | #define ARMV8_PMEVTYPER_NSU (1U << 28) /* Non-secure User mode filtering bit */ 24 | #define ARMV8_PMEVTYPER_NSH (1U << 27) /* Non-secure Hyp modes filtering bit */ 25 | #define ARMV8_PMEVTYPER_M (1U << 26) /* Secure EL3 filtering bit */ 26 | #define ARMV8_PMEVTYPER_MT (1U << 25) /* Multithreading */ 27 | #define ARMV8_PMEVTYPER_EVTCOUNT_MASK 0x3ff 28 | 29 | static inline void 30 | enable_pmu(uint32_t evtCount) 31 | { 32 | #if defined(__GNUC__) && defined __aarch64__ 33 | evtCount &= ARMV8_PMEVTYPER_EVTCOUNT_MASK; 34 | asm volatile("isb"); 35 | /* Just use counter 0 */ 36 | asm volatile("msr pmevtyper0_el0, %0" : : "r" (evtCount)); 37 | /* PMCNTENSET_EL0: set bit 0 to enable event counter 0 (bit 31 is the cycle counter) */ 38 | uint32_t r = 0; 39 | 40 | asm volatile("mrs %0, pmcntenset_el0" : "=r" (r)); 41 | asm volatile("msr pmcntenset_el0, %0" :
: "r" (r|1)); 42 | #else 43 | #error Unsupported architecture/compiler! 44 | #endif 45 | } 46 | 47 | static inline uint32_t 48 | read_pmu(void) 49 | { 50 | #if defined(__GNUC__) && defined __aarch64__ 51 | uint32_t r = 0; 52 | asm volatile("mrs %0, pmevcntr0_el0" : "=r" (r)); 53 | return r; 54 | #else 55 | #error Unsupported architecture/compiler! 56 | #endif 57 | } 58 | 59 | static inline void 60 | disable_pmu(uint32_t evtCount) 61 | { 62 | #if defined(__GNUC__) && defined __aarch64__ 63 | /* PMCNTENSET_EL0 ignores zero bits, so clear bit 0 through PMCNTENCLR_EL0 */ 64 | uint32_t r = 1; 65 | 66 | (void)evtCount; /* unused; only counter 0 is managed */ 67 | asm volatile("msr pmcntenclr_el0, %0" : : "r" (r)); 68 | #else 69 | #error Unsupported architecture/compiler! 70 | #endif 71 | } 72 | 73 | 74 | #endif /* ARMPMU_LIB_H */ 75 | -------------------------------------------------------------------------------- /ko/Makefile: -------------------------------------------------------------------------------- 1 | obj-m := enable_arm_pmu.o 2 | KDIR := /lib/modules/$(shell uname -r)/build 3 | PWD := $(shell pwd) 4 | 5 | all: 6 | $(MAKE) -C $(KDIR) M=$(PWD) modules 7 | clean: 8 | $(MAKE) -C $(KDIR) M=$(PWD) clean 9 | -------------------------------------------------------------------------------- /ko/enable_arm_pmu.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Enable user-mode ARM performance counter access. 3 | */ 4 | #include <linux/module.h> 5 | #include <linux/kernel.h> 6 | #include <linux/smp.h> 7 | 8 | /** -- Configuration stuff ------------------------------------------------- */ 9 | 10 | #define DRVR_NAME "enable_arm_pmu" 11 | 12 | #if !defined(__arm__) && !defined(__aarch64__) 13 | #error Module can only be compiled on ARM machines.
14 | #endif 15 | 16 | /** -- Initialization & boilerplate ---------------------------------------- */ 17 | #define ARMV8_PMCR_MASK 0x3f 18 | #define ARMV8_PMCR_E (1 << 0) /* Enable all counters */ 19 | #define ARMV8_PMCR_P (1 << 1) /* Reset all counters */ 20 | #define ARMV8_PMCR_C (1 << 2) /* Cycle counter reset */ 21 | #define ARMV8_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */ 22 | #define ARMV8_PMCR_X (1 << 4) /* Export to ETM */ 23 | #define ARMV8_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug */ 24 | #define ARMV8_PMCR_N_SHIFT 11 /* Number of counters supported */ 25 | #define ARMV8_PMCR_N_MASK 0x1f 26 | 27 | #define ARMV8_PMUSERENR_EN_EL0 (1 << 0) /* EL0 access enable */ 28 | #define ARMV8_PMUSERENR_CR (1 << 2) /* Cycle counter read enable */ 29 | #define ARMV8_PMUSERENR_ER (1 << 3) /* Event counter read enable */ 30 | 31 | #define ARMV8_PMCNTENSET_EL0_ENABLE (1U << 31) /* Cycle counter enable bit */ 32 | 33 | #define PERF_DEF_OPTS (1 | 16) 34 | #define PERF_OPT_RESET_CYCLES (2 | 4) 35 | #define PERF_OPT_DIV64 (8) 36 | 37 | static inline u32 armv8pmu_pmcr_read(void) 38 | { 39 | u64 val=0; 40 | asm volatile("mrs %0, pmcr_el0" : "=r" (val)); 41 | return (u32)val; 42 | } 43 | static inline void armv8pmu_pmcr_write(u32 val) 44 | { 45 | val &= ARMV8_PMCR_MASK; 46 | isb(); 47 | asm volatile("msr pmcr_el0, %0" : : "r" ((u64)val)); 48 | } 49 | 50 | static void 51 | enable_cpu_counters(void* data) 52 | { 53 | printk(KERN_INFO "[" DRVR_NAME "] enabling user-mode PMU access on CPU #%d", 54 | smp_processor_id()); 55 | 56 | #if defined(__aarch64__) 57 | /* Enable user-mode access to counters. */ 58 | asm volatile("msr pmuserenr_el0, %0" : : "r"((u64)ARMV8_PMUSERENR_EN_EL0|ARMV8_PMUSERENR_ER|ARMV8_PMUSERENR_CR)); 59 | /* Initialize & Reset PMNC: C and P bits.
*/ 60 | armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C); 61 | /* PMINTENCLR_EL1, Performance Monitors Interrupt Enable Clear register: 62 | * writing 0 to PMINTENSET_EL1 has no effect, so use the clear register. */ 63 | /* Disable the cycle counter overflow interrupt request (bit 31). */ 64 | asm volatile("msr pmintenclr_el1, %0" : : "r" ((u64)1 << 31)); 65 | /* PMCNTENSET_EL0: bit 31 enables the cycle counter; event counters (bits 30:0) stay off */ 66 | asm volatile("msr pmcntenset_el0, %0" : : "r" (ARMV8_PMCNTENSET_EL0_ENABLE)); 67 | /* Start counting */ 68 | armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMCR_E); 69 | #elif defined(__ARM_ARCH_7A__) 70 | /* Enable user-mode access to counters. */ 71 | asm volatile("mcr p15, 0, %0, c9, c14, 0" :: "r"(1)); 72 | /* Program PMU and enable all counters */ 73 | asm volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(PERF_DEF_OPTS)); 74 | asm volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(0x8000000f)); 75 | #else 76 | #error Unsupported Architecture 77 | #endif 78 | } 79 | 80 | static void 81 | disable_cpu_counters(void* data) 82 | { 83 | printk(KERN_INFO "[" DRVR_NAME "] disabling user-mode PMU access on CPU #%d", 84 | smp_processor_id()); 85 | 86 | #if defined(__aarch64__) 87 | /* PMCNTENCLR_EL0: writing 1 to bit 31 disables the cycle counter */ 88 | asm volatile("msr pmcntenclr_el0, %0" : : "r" ((u64)1 << 31)); 89 | /* (Writing 0 to PMCNTENSET_EL0 would leave the counter enabled.) */ 90 | /* Program PMU and disable all counters */ 91 | armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMCR_E); 92 | /* Disable user-mode access to counters. */ 93 | asm volatile("msr pmuserenr_el0, %0" : : "r"((u64)0)); 94 | #elif defined(__ARM_ARCH_7A__) 95 | /* Program PMU and disable all counters */ 96 | asm volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(0)); 97 | asm volatile("mcr p15, 0, %0, c9, c12, 2" :: "r"(0x8000000f)); 98 | /* Disable user-mode access to counters.
*/ 99 | asm volatile("mcr p15, 0, %0, c9, c14, 0" :: "r"(0)); 100 | #else 101 | #error Unsupported Architecture 102 | #endif 103 | } 104 | 105 | static int __init 106 | init(void) 107 | { 108 | on_each_cpu(enable_cpu_counters, NULL, 1); 109 | printk(KERN_INFO "[" DRVR_NAME "] initialized"); 110 | return 0; 111 | } 112 | 113 | static void __exit 114 | fini(void) 115 | { 116 | on_each_cpu(disable_cpu_counters, NULL, 1); 117 | printk(KERN_INFO "[" DRVR_NAME "] unloaded"); 118 | } 119 | 120 | MODULE_AUTHOR("Austin Seipp "); 121 | MODULE_LICENSE("Dual MIT/GPL"); 122 | MODULE_DESCRIPTION("Enables user-mode access to ARMv7/ARMv8 PMU counters"); 123 | MODULE_VERSION("0:0.1-dev"); 124 | module_init(init); 125 | module_exit(fini); 126 | -------------------------------------------------------------------------------- /load-module: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | /sbin/insmod ./ko/enable_arm_pmu.ko || exit 1 3 | -------------------------------------------------------------------------------- /perf_arm_pmu.c: -------------------------------------------------------------------------------- 1 | /** compile with -std=gnu99 */ 2 | #include <stdio.h> 3 | #include <stdlib.h> 4 | #include <stdint.h> 5 | 6 | #include "armpmu_lib.h" 7 | 8 | /* Simple loop body to keep things interesting. Make sure it gets inlined.
*/ 9 | static inline int 10 | loop(int* __restrict__ a, int* __restrict__ b, int n) 11 | { 12 | unsigned sum = 0; 13 | for (int i = 0; i < n; ++i) 14 | if(a[i] > b[i]) 15 | sum += a[i] + 5; 16 | return sum; 17 | } 18 | 19 | int 20 | main(int ac, char **av) 21 | { 22 | uint32_t time_start = 0; 23 | uint32_t time_end = 0; 24 | uint32_t cnt_start = 0; 25 | uint32_t cnt_end = 0; 26 | 27 | int *a = NULL; 28 | int *b = NULL; 29 | int len = 0; 30 | int sum = 0; 31 | 32 | if (ac != 2) return -1; 33 | len = atoi(av[1]); 34 | printf("%s: len = %d\n", av[0], len); 35 | 36 | a = malloc(len*sizeof(*a)); 37 | b = malloc(len*sizeof(*b)); 38 | 39 | for (int i = 0; i < len; ++i) { 40 | a[i] = i+128; 41 | b[i] = i+64; 42 | } 43 | 44 | printf("%s: beginning loop\n", av[0]); 45 | time_start = rdtsc32(); 46 | sum = loop(a, b, len); 47 | time_end = rdtsc32(); 48 | printf("%s: done. sum = %d; time delta = %u\n", av[0], sum, time_end - time_start); 49 | 50 | printf("%s: beginning loop\n", av[0]); 51 | enable_pmu(0x008); 52 | cnt_start = read_pmu(); 53 | sum = loop(a, b, len); 54 | cnt_end = read_pmu(); 55 | disable_pmu(0x008); 56 | printf("%s: done. 
sum = %d; event 0x%03x delta = %u\n", av[0], sum, 0x008, cnt_end - cnt_start); 57 | 58 | free(a); free(b); 59 | return 0; 60 | } 61 | -------------------------------------------------------------------------------- /perf_event_open.c: -------------------------------------------------------------------------------- 1 | /** compile with -std=gnu99 */ 2 | #define _GNU_SOURCE 3 | #include <stdio.h> 4 | #include <stdlib.h> 5 | #include <stdint.h> 6 | #include <unistd.h> 7 | #include <sys/types.h> 8 | #include <sys/syscall.h> 9 | #include <linux/perf_event.h> 10 | 11 | static int fddev = -1; 12 | __attribute__((constructor)) static void 13 | init(void) 14 | { 15 | static struct perf_event_attr attr; 16 | attr.type = PERF_TYPE_HARDWARE; 17 | attr.config = PERF_COUNT_HW_CPU_CYCLES; 18 | fddev = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0); 19 | } 20 | 21 | __attribute__((destructor)) static void 22 | fini(void) 23 | { 24 | close(fddev); 25 | } 26 | 27 | static inline long long 28 | cpucycles(void) 29 | { 30 | long long result = 0; 31 | if (read(fddev, &result, sizeof(result)) != (ssize_t)sizeof(result)) return 0; 32 | return result; 33 | } 34 | 35 | /* Simple loop body to keep things interesting. Make sure it gets inlined. */ 36 | static inline int 37 | loop(int* __restrict__ a, int* __restrict__ b, int n) 38 | { 39 | unsigned sum = 0; 40 | for (int i = 0; i < n; ++i) 41 | if(a[i] > b[i]) 42 | sum += a[i] + 5; 43 | return sum; 44 | } 45 | 46 | int 47 | main(int ac, char **av) 48 | { 49 | long long time_start = 0; 50 | long long time_end = 0; 51 | 52 | int *a = NULL; 53 | int *b = NULL; 54 | int len = 0; 55 | int sum = 0; 56 | 57 | if (ac != 2) return -1; 58 | len = atoi(av[1]); 59 | printf("%s: len = %d\n", av[0], len); 60 | 61 | a = malloc(len*sizeof(*a)); 62 | b = malloc(len*sizeof(*b)); 63 | 64 | for (int i = 0; i < len; ++i) { 65 | a[i] = i+128; 66 | b[i] = i+64; 67 | } 68 | 69 | printf("%s: beginning loop\n", av[0]); 70 | time_start = cpucycles(); 71 | sum = loop(a, b, len); 72 | time_end = cpucycles(); 73 | printf("%s: done. 
sum = %d; time delta = %lld\n", av[0], sum, time_end - time_start); 74 | 75 | free(a); free(b); 76 | return 0; 77 | } 78 | -------------------------------------------------------------------------------- /unload-module: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | /sbin/rmmod enable_arm_pmu || exit 1 3 | --------------------------------------------------------------------------------
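`perf_arm_pmu.c` reports `time_end - time_start` on two `uint32_t` samples, and the module header notes that the PMCR `D` bit makes the cycle counter tick only every 64th CPU cycle. The sketch below shows how those two facts combine when interpreting a delta. The helper names `counter_delta32` and `ticks_to_cycles` are hypothetical and not part of `armpmu_lib.h`. It also assumes the counter wraps at most once between the two samples.

```c
#include <stdint.h>

/* Hypothetical helper (not in armpmu_lib.h): elapsed ticks between two
 * 32-bit counter samples. Unsigned subtraction is modulo 2^32, so the
 * result stays correct if the counter wrapped once between the samples. */
static inline uint32_t
counter_delta32(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: convert ticks to CPU cycles. When the PMCR "D" bit
 * (PERF_OPT_DIV64 in the module) is set, each tick stands for 64 cycles.
 * The result is widened to 64 bits so the multiplication cannot overflow. */
static inline uint64_t
ticks_to_cycles(uint32_t ticks, int div64)
{
    return div64 ? (uint64_t)ticks * 64u : (uint64_t)ticks;
}
```

With the module as shipped, the divider is not enabled, so `ticks_to_cycles(counter_delta32(time_start, time_end), 0)` is simply the raw delta. Measurements long enough to wrap the 32-bit counter more than once cannot be recovered this way and would need a wider counter or periodic sampling.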