├── LICENSE
├── README.md
├── freud-dwarf
    ├── LICENSE
    ├── Makefile
    ├── README.md
    ├── code-gen.cc
    ├── code-gen.hh
    ├── create-instrumentation.cc
    ├── dwarf-explorer.cc
    ├── dwarf-explorer.hh
    ├── dwarf
    │   ├── .gitignore
    │   ├── Doxyfile
    │   ├── Makefile
    │   ├── abbrev.cc
    │   ├── attrs.cc
    │   ├── cursor.cc
    │   ├── data.hh
    │   ├── die.cc
    │   ├── die_str_map.cc
    │   ├── dwarf++.hh
    │   ├── dwarf.cc
    │   ├── elf.cc
    │   ├── expr.cc
    │   ├── frame.cc
    │   ├── internal.hh
    │   ├── line.cc
    │   ├── loclist.cc
    │   ├── rangelist.cc
    │   ├── small_vector.hh
    │   └── value.cc
    ├── elf
    │   ├── .gitignore
    │   ├── Makefile
    │   ├── common.hh
    │   ├── data.hh
    │   ├── elf++.hh
    │   ├── elf.cc
    │   ├── enum-print.py
    │   ├── mmap_loader.cc
    │   └── to_hex.hh
    ├── instr-expr-context.cc
    ├── instr-expr-context.hh
    ├── structures.hh
    ├── table-generator.cc
    ├── table-generator.hh
    ├── table_intf.txt
    ├── utils.cc
    └── utils.hh
├── freud-pin
    ├── LICENSE
    ├── dumper.cc
    ├── freud-pin.cpp
    ├── logger.cc
    ├── makefile
    ├── makefile.rules
    ├── reader.cc
    ├── reader.hh
    ├── rtn_descriptor.h
    └── rtn_execution.h
├── freud-statistics
    ├── .deps
    │   ├── freud_statistics-analysis.Po
    │   ├── freud_statistics-checker.Po
    │   ├── freud_statistics-main.Po
    │   ├── freud_statistics-plotter.Po
    │   ├── freud_statistics-reader.Po
    │   ├── freud_statistics-reader_annotations.Po
    │   ├── freud_statistics-regression.Po
    │   ├── freud_statistics-stats.Po
    │   └── freud_statistics-utils.Po
    ├── AUTHORS
    ├── ChangeLog
    ├── INSTALL
    ├── LICENSE
    ├── Makefile
    ├── README
    ├── analysis.cc
    ├── analysis.hh
    ├── checker.cc
    ├── checker.hh
    ├── const.hh
    ├── function.hh
    ├── hpi.R
    ├── main.cc
    ├── measure.hh
    ├── method.hh
    ├── plotter.cc
    ├── plotter.hh
    ├── reader.cc
    ├── reader.hh
    ├── reader_annotations.cc
    ├── reader_annotations.hh
    ├── regression.cc
    ├── regression.hh
    ├── stats.cc
    ├── stats.hh
    ├── utils.cc
    └── utils.hh
├── micro_benchmark
    ├── LICENSE
    ├── Makefile
    └── micro_benchmark.cc
├── micro_benchmark_analysis
    └── Makefile
├── test.sh
└── test
    ├── LICENSE
    ├── Makefile
    ├── instr_tests.hh
    ├── stats_tests.hh
    └── test.cc

/README.md:
--------------------------------------------------------------------------------
1 | # *FREUD*
2 | This is the main repository for Freud, originally described in the publication
3 | > Analyzing System Performance with Probabilistic Performance Annotations
4 | > D. Rogora, A. Carzaniga, A. Diwan, M. Hauswirth, R. Soulé
5 | > In EuroSys '20: Proceedings of the Fifteenth European Conference on Computer Systems. Heraklion, Crete, Greece. April 2020.
6 |
7 | Freud consists of three components: **freud-dwarf**, **freud-pin**, and **freud-statistics**.
8 |
9 | **freud-dwarf**
10 | This is the component that creates the custom Pin tool that instruments the target program.
11 |
12 | *Compilation*
13 | ```sh
14 | cd freud-dwarf
15 | make
16 | ```
17 | **freud-pin**
18 | This component consists of the shared part of the Pin tool that performs the instrumentation.
19 | It cannot be used as is; it must be recompiled whenever the instrumentation target changes.
20 | Trying to compile this component without an appropriate *feature_processing.cc* will fail.
21 |
22 | *Setup*
23 | ```sh
24 | # download the latest Intel Pin for your platform
25 | # extract the downloaded archive to the root of this repository and rename the root folder of Pin to "pin"
26 | cd freud-pin
27 | make
28 | ```
29 | Pin can be found at https://software.intel.com/en-us/articles/pin-a-binary-instrumentation-tool-downloads
30 |
31 | **freud-statistics**
32 | This component performs the statistical analysis on the data produced by the Pin tool. It depends on the R statistical package, which must be installed (with development headers) on your computer.
33 | On Ubuntu, install the packages *r-base* and *r-base-dev*. Also, adjust the include location for the R headers in freud-statistics/Makefile, if necessary.
34 |
35 | *Compilation*
36 | ```sh
37 | cd freud-statistics
38 | make
39 | ```
40 | ***TEST***
41 | This repository includes some tests to help you get started with Freud.
For the tests, there are two additional components: **micro_benchmark** and **test**.
42 | **Once *all* the components are built correctly, run the test.sh script in the root folder of the repository to check that everything works.**
43 | The expected output of test.sh is "All tests completed successfully!" for both the instrumentation and the statistical analysis.
44 | Make sure that the *R_HOME* environment variable in test.sh is defined correctly for your system!
45 |
46 | **micro_benchmark**
47 | This is a set of basic functions whose expected performance is known. See section 5 of the paper for more details.
48 |
49 | *Compilation*
50 | ```sh
51 | cd micro_benchmark
52 | make
53 | ```
54 |
55 | **test**
56 | This is a program that parses the performance annotations produced for the *micro_benchmark* and acts as an oracle to validate them.
57 |
58 | *Compilation*
59 | ```sh
60 | cd test
61 | make
62 | ```
63 |
64 | # HOW TO PRODUCE PERFORMANCE ANNOTATIONS FOR YOUR FAVOURITE C/C++ PROGRAM
65 |
66 | Note that in this document the "target" or "instrumented" program refers to the program for which we want to create performance annotations (e.g. mysql, envoy...)
67 |
68 | The end-to-end process has 4 main steps:
69 | 1. COMPILE THE TARGET PROGRAM WITH gcc/g++ WITH DEBUGGING SYMBOLS. The binary that you want to instrument must contain debugging symbols, preferably conforming to the DWARF standard.
70 | 2. CREATE THE PINTOOL. Decide which symbols to instrument, and generate the feature-extraction code.
71 | 3. RUN THE APPLICATION WITH THE PINTOOL. Use the output of phase 2 to compile a custom Pin tool, and use it to run the target application with a "relevant" workload.
72 | 4. RUN THE STATISTICAL ANALYSIS. Use the output of phase 3 to run the statistical analysis tool.
73 |
74 | **COMPILE THE TARGET PROGRAM WITH DEBUGGING SYMBOLS**
75 | With gcc this means adding the -g -gstrict-dwarf flags to the compilation flags.
How to do this depends on the build system.
76 |
77 | **CREATE THE PINTOOL**
78 | This step requires the *freud-dwarf* tool (built from create-instrumentation.cc), which is created in the freud-dwarf directory.
79 | To generate the required information run
80 | ```sh
81 | ./freud-dwarf /path/to/the/binary
82 | ```
83 | It is recommended to use the --sym_wl=symbol_name option to specify a comma-separated list of symbols to instrument. The default is to instrument *every* symbol.
84 | Symbols are specified by their C++ mangled names.
85 | For example, to instrument "int __attribute__ ((noinline)) test_linear_structs(struct basic_structure * bs)" in the micro_benchmark, you can run:
86 | ```sh
87 | nm -g micro_benchmark | grep test_linear_structs # "_Z19test_linear_structsP15basic_structure"
88 | ./freud-dwarf micro_benchmark --sym_wl=_Z19test_linear_structsP15basic_structure
89 | ```
90 | The output of this phase is a pair of files: "table.txt" and "feature_processing.cc"
91 |
92 | TODO: in the current version of Freud, a bug prevents this procedure from working with local symbols (lowercase "t" in nm). This will be addressed soon.
93 |
94 | **RUN THE APPLICATION WITH THE PINTOOL**
95 | This step requires Pin, which in turn uses a custom Pin tool that is compiled using the "feature_processing.cc" file created in the previous step.
96 | ```sh
97 | cp table.txt feature_processing.cc freud-pin/
98 | cd freud-pin
99 | make clean; make
100 | ../pin/pin -t obj-intel64/freud-pin.so --pin-tool-arguments -- /path/to/the/target/binary
101 | ```
102 | You can append any command line parameter for the target program to the end of the command.
103 | The freud-pin tool accepts several command line arguments. For a description, pass the -h parameter to the Pin tool (before -- in the invocation line).
104 | When the target program exits, the Pin tool creates a "symbols" directory, containing one subfolder for each symbol.
Each subfolder contains binary data.
105 |
106 | **RUN THE STATISTICAL ANALYSIS**
107 | This step requires the analysis tool, which is created in the freud-statistics directory. The analysis tool has many command line parameters, which are described by the help message that it prints when executed without parameters. For example, the following command creates performance annotations (3) with the default R2 threshold (0) for the *time* metric (0) for the "_Z20test_linear_branchesiii" symbol, whose binary logs are in the given directory.
108 | ```sh
109 | ./freud-statistics 3 0 0 _Z20test_linear_branchesiii symbols/_Z20test_linear_branchesiii/
110 | ```
111 | The output of the analysis is in the _eps_ (for the plots) and _ann_ (for the text annotations) directories.
112 |
113 | ***Contributions***
114 | This repository bundles a modified version of libelfin [https://github.com/aclements/libelfin]. Thanks to the original authors.
115 |

--------------------------------------------------------------------------------
/freud-dwarf/Makefile:
--------------------------------------------------------------------------------
1 | all:
2 | 	$(MAKE) -C elf
3 | 	$(MAKE) -C dwarf
4 | 	$(MAKE) freud-dwarf
5 |
6 | install:
7 | 	$(MAKE) -C elf install
8 | 	$(MAKE) -C dwarf install
9 |
10 | clean:
11 | 	$(MAKE) -C elf clean
12 | 	$(MAKE) -C dwarf clean
13 | 	rm -f *.o
14 | 	rm -f freud-dwarf
15 | check:
16 | 	cd test && ./test.sh
17 |
18 | # Find libs
19 | export PKG_CONFIG_PATH=elf:dwarf
20 | #CPPFLAGS+=$$(pkg-config --cflags libelf++ libdwarf++)
21 | CPPFLAGS+=-g -Ielf -Idwarf -std=c++11
22 |
23 | # Statically link against our libs to keep the example binaries simple
24 | # and dependencies correct.
25 | LIBS=dwarf/libdwarf++.a elf/libelf++.a
26 |
27 | # Dependencies
28 | CPPFLAGS+=-MD -MP -MF .$@.d
29 | -include .*.d
30 |
31 | freud-dwarf: create-instrumentation.o dwarf-explorer.o code-gen.o table-generator.o utils.o instr-expr-context.o $(LIBS)
32 | 	$(LINK.cc) $^ $(LOADLIBES) $(LDLIBS) -o $@
33 |

--------------------------------------------------------------------------------
/freud-dwarf/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/usi-systems/freud/caa50f4f636cbfd1b73b8995432506fb5713fc8d/freud-dwarf/README.md

--------------------------------------------------------------------------------
/freud-dwarf/code-gen.hh:
--------------------------------------------------------------------------------
1 | #ifndef CODE_GEN_HH
2 | #define CODE_GEN_HH
3 |
4 | #include "structures.hh"
5 | #include "dwarf-explorer.hh"
6 |
7 | class code_generator {
8 | private:
9 |   static void create_switch_cases_dfs(const dwarf_explorer * de, const int tree_size, std::string & feature_processing, const std::string sym_name, std::unordered_set & used_names, const int offset, bool & need_arr);
10 |   static std::string get_complex_feature_processing_text_from_addr(const struct member &m, std::unordered_set &used_names, bool & need_arr);
11 |   static std::string get_complex_feature_processing_text_from_reg(const struct member &m, std::unordered_set &used_names, bool &can_add_size);
12 |
13 | public:
14 |   static void create_instrumentation_code(const dwarf_explorer * de, std::string fprocessfname);
15 | };
16 |
17 | #endif
18 |

--------------------------------------------------------------------------------
/freud-dwarf/create-instrumentation.cc:
--------------------------------------------------------------------------------
1 | /*
2 | Copyright 2020 Daniele Rogora
3 |
4 | Licensed under the Apache License, Version 2.0 (the "License");
5 | you may not use this file except in compliance with the License.
6 | You may obtain a copy of the License at
7 |
8 | http://www.apache.org/licenses/LICENSE-2.0
9 |
10 | Unless required by applicable law or agreed to in writing, software
11 | distributed under the License is distributed on an "AS IS" BASIS,
12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | See the License for the specific language governing permissions and
14 | limitations under the License.
15 | */
16 |
17 | #include "elf++.hh"
18 | #include "dwarf++.hh"
19 |
20 | #include "structures.hh"
21 | #include "utils.hh"
22 | #include "code-gen.hh"
23 | #include "dwarf-explorer.hh"
24 | #include "table-generator.hh"
25 |
26 | std::unordered_map> artificial_features;
27 |
28 | void parse_parameters(const int argc, const char **argv, unsigned int & max_depth, unsigned int & max_features,
29 |     std::unordered_set & cu_bl,
30 |     std::unordered_set & cu_wl,
31 |     std::unordered_set & symbols_bl,
32 |     std::unordered_set & symbols_wl
33 |     ) {
34 |   for (int a = 2; a < argc; a++) {
35 |     if (strncmp(argv[a], "--max-depth=", 12) == 0) {
36 |       int d = atoi(argv[a] + 12);
37 |       if (d == 0) {
38 |         utils::log(VL_ERROR, "Error parsing max-depth, or max-depth == 0!");
39 |         exit(-1);
40 |       }
41 |       max_depth = d;
42 |     } else if (strncmp(argv[a], "--max-features=", 15) == 0) {
43 |       int f = atoi(argv[a] + 15);
44 |       if (f == 0) {
45 |         utils::log(VL_ERROR, "Error parsing max-features, or max-features == 0!");
46 |         exit(-1);
47 |       }
48 |       max_features = f;
49 |     } else if (strncmp(argv[a], "--sym_wl=", 9) == 0) {
50 |       // TODO: only one symbol at a time is supported right now
51 |       std::string sym_name = argv[a];
52 |       sym_name = sym_name.substr(9, std::string::npos);
53 |       int comma_pos = -1;
54 |       int last_idx;
55 |       do {
56 |         last_idx = comma_pos + 1;
57 |         comma_pos = sym_name.find(",", last_idx);
58 |         std::string sn = sym_name.substr(last_idx, comma_pos - last_idx);
59 |         symbols_wl.insert(sn);
60 |         utils::log(VL_DEBUG, "Adding sym " + sn);
61 |       }
62 |       while (comma_pos !=
std::string::npos);
63 |     } else if (strncmp(argv[a], "--cu_wl=", 8) == 0) {
64 |       utils::log(VL_ERROR, "TODO: implement " + std::string(argv[a]));
65 |     } else {
66 |       utils::log(VL_ERROR, "Unhandled parameter: " + std::string(argv[a]));
67 |       exit(-1);
68 |     }
69 |
70 |   }
71 | }
72 |
73 | int main(const int argc, const char **argv)
74 | {
75 |   if (argc < 2) {
76 |     fprintf(stderr, "usage: %s elf-file [--max-depth=n] [--max-features=n] [--sym_wl=sym1,sym2,...] [--cu_wl=path]\n", argv[0]);
77 |     return 2;
78 |   }
79 |
80 |   unsigned int max_depth = MAX_DEPTH;
81 |   unsigned int max_features = MAX_FEATURES;
82 |   // BLACKLISTS FOR COMPILATION_UNITS AND SYMBOLS. IGNORED IF EMPTY.
83 |   std::unordered_set cu_bl = {};
84 |   std::unordered_set symbols_bl = {};
85 |   // WHITELISTS FOR COMPILATION_UNITS AND SYMBOLS. IGNORED IF EMPTY.
86 |   std::unordered_set cu_wl = { };
87 |   std::unordered_set symbols_wl = { };
88 |
89 |   // Parse parameters
90 |   parse_parameters(argc, argv, max_depth, max_features, cu_bl, cu_wl, symbols_bl, symbols_wl);
91 |
92 |   // Load the binary
93 |   int fd = open(argv[1], O_RDONLY);
94 |   if (fd < 0) {
95 |     fprintf(stderr, "%s: %s\n", argv[1], strerror(errno));
96 |     return 1;
97 |   }
98 |   elf::elf ef(elf::create_mmap_loader(fd));
99 |   dwarf::dwarf dw(dwarf::elf::create_loader(ef));
100 |
101 |   // FIXME: dwarf_explorer still wants the binary filename to find definitions when needed
102 |   dwarf_explorer * curiosity = new dwarf_explorer(argv[1], max_depth, max_features);
103 |
104 |   // PHASE 1: explore dwarf info to find function and variables
105 |   for (auto cu : dw.compilation_units()) {
106 |     const dwarf::die curoot = cu.root();
107 |     std::string cuname = to_string(curoot[dwarf::DW_AT::name]);
108 |     if ((cu_wl.size() > 0 && cu_wl.find(cuname) == cu_wl.end())
109 |         || (cu_bl.size() > 0 && cu_bl.find(cuname) != cu_bl.end()))
110 |       continue;
111 |
112 |     dwarf::line_table lt = cu.get_line_table();
113 |     utils::log(VL_INFO, "### Exploring Compilation Unit " + cuname + "###");
114 | #ifdef
WITH_GLOBAL_VARIABLES
115 |     curiosity->walk_tree_dfs(curoot, lt, curoot, symbols_wl, symbols_bl);
116 | #else
117 |     // Pass the symbol lists here too, matching the declaration in dwarf-explorer.hh
118 |     curiosity->walk_tree_dfs(curoot, lt, symbols_wl, symbols_bl);
118 | #endif
119 |   }
120 |   utils::log(VL_INFO, "### DONE EXPLORING DWARF INFO ###");
121 |
122 |   // PHASE 2: generate code and table info
123 |   if (curiosity->found_info()) {
124 |     utils::log(VL_INFO, "Creating table...");
125 |     table_generator::create_table(curiosity, "table.txt");
126 |     utils::log(VL_INFO, "Table created!");
127 |     utils::log(VL_INFO, "Creating instrumentation code...");
128 |     code_generator::create_instrumentation_code(curiosity, "feature_processing.cc");
129 |     utils::log(VL_INFO, "Instrumentation code created!");
130 |   } else {
131 |     utils::log(VL_ERROR, "Could not find info for the given symbols!");
132 |   }
133 |
134 |   utils::log(VL_INFO, "All done!");
135 |   delete curiosity;
136 |   return 0;
137 | }
138 |

--------------------------------------------------------------------------------
/freud-dwarf/dwarf-explorer.hh:
--------------------------------------------------------------------------------
1 | #ifndef DWARF_EXPLORER_HH_INCLUDED
2 | #define DWARF_EXPLORER_HH_INCLUDED
3 |
4 | #include
5 | #include
6 | #include
7 | #include
8 | #include
9 | #include
10 | #include
11 | #include
12 |
13 | #include "elf++.hh"
14 | #include "dwarf++.hh"
15 | #include "structures.hh"
16 |
17 | class dwarf_explorer {
18 | private:
19 |   unsigned int max_depth, max_features;
20 |   std::string binary_filename;
21 |   std::unordered_map class_definitions_map;
22 |   std::vector cu_variables;
23 |   std::unordered_set undefined_types;
24 |
25 |   // Explore the dwarf debugging tree to find the definition of a specific class/structure
26 |   // The information collected is returned directly to the caller
27 |   bool walk_tree_definition(const dwarf::die &node, const dwarf::die &parent, const dwarf::line_table lt, std::string name, std::vector ptr_vec, std::unordered_set &visited_structs, std::list &members, std::vector offsets,
const std::string current_name, bool is_array, struct array_info &ai, bool is_aligned_membuf, std::string amembuf_type, unsigned int &num_features, const std::string & cuname, std::string & complex_type, bool & lname_found, std::string & lname); 28 | 29 | // Extract the info about the type of the object "node" 30 | // The information collected is returned directly to the caller 31 | std::string get_par_type(const dwarf::die &node, unsigned int &local_is_ptr, std::vector ptr_vec, std::unordered_set &visited_structs, std::list &members, std::vector offsets, const std::string current_name, bool is_array, struct array_info &ai, bool is_aligned_membuf, std::string amembuf_type, unsigned int &num_features, const bool decl, bool & lname_found, std::string & lname); 32 | 33 | // Find all the "formal_parameters" that are passed to node 34 | void dump_parameters(const dwarf::die &node, std::string fname, uint64_t address); 35 | 36 | // Find all the global variables in the compilation unit cu_node 37 | void dump_cu_variables(const dwarf::die &cu_node, std::string cu_name); 38 | 39 | std::string get_class_name(const dwarf::die &node, std::string & lname); 40 | 41 | std::string get_class_linkage_name(const std::string & tname, const dwarf::die &node); 42 | 43 | public: 44 | // TODO: add setters/getters for these objects 45 | std::map> func_pars_map; 46 | std::unordered_map> types; 47 | std::unordered_map hierarchy_tree_nodes_map; 48 | std::unordered_map func_addr_map; 49 | 50 | // Constructor 51 | dwarf_explorer(std::string binary_path, const unsigned int depth, const unsigned int features); 52 | 53 | // Returns true if we do not have any information about tname type 54 | bool type_is_undefined(const std::string tname) const; 55 | 56 | // Returns true if we extracted some info about our symbols of interest 57 | bool found_info() const { return !func_pars_map.empty(); }; 58 | 59 | // Compute the number of possible dynamic types of the type tname 60 | unsigned int 
get_class_graph_size(const std::string tname) const; 61 | 62 | // Parse the debugging information to find the definition of the class/struct cname 63 | bool find_definition(const std::string & cname, std::vector ptr_vec, std::unordered_set &visited_structs, std::list &members, std::vector offsets, const std::string current_name, bool is_array, struct array_info &ai, bool is_aligned_membuf, std::string amembuf_type, unsigned int &num_features, std::string & complex_type); 64 | 65 | // Scan the debugging symbols to find the possible features for the parameters 66 | // and global variables for each symbol in symbols_wl 67 | #ifdef WITH_GLOBAL_VARIABLES 68 | void walk_tree_dfs(const dwarf::die node, const dwarf::line_table lt, const dwarf::die last_cu, const std::unordered_set & symbols_wl, const std::unordered_set & symbols_bl); 69 | #else 70 | void walk_tree_dfs(const dwarf::die node, const dwarf::line_table lt, const std::unordered_set & symbols_wl, const std::unordered_set & symbols_bl); 71 | #endif 72 | }; 73 | 74 | #endif 75 | -------------------------------------------------------------------------------- /freud-dwarf/dwarf/.gitignore: -------------------------------------------------------------------------------- 1 | *.o 2 | to_string.cc 3 | libdwarf++.a 4 | libdwarf++.so 5 | libdwarf++.so.* 6 | libdwarf++.pc 7 | /doc/ 8 | -------------------------------------------------------------------------------- /freud-dwarf/dwarf/Makefile: -------------------------------------------------------------------------------- 1 | # Changed when ABI backwards compatibility is broken. 2 | # Typically uses the major version. 
3 | SONAME = 0 4 | 5 | CXXFLAGS+=-g -O2 -Werror 6 | override CXXFLAGS+=-std=c++11 -Wall -fPIC -Wno-unused-private-field 7 | 8 | ifeq ($(shell uname -s),Darwin) 9 | SONAME_FLAG=-install_name 10 | else 11 | SONAME_FLAG=-soname 12 | endif 13 | 14 | all: libdwarf++.a libdwarf++.so.$(SONAME) libdwarf++.so libdwarf++.pc 15 | 16 | SRCS := dwarf.cc cursor.cc die.cc value.cc abbrev.cc \ 17 | expr.cc rangelist.cc line.cc attrs.cc \ 18 | die_str_map.cc elf.cc to_string.cc loclist.cc frame.cc 19 | HDRS := dwarf++.hh data.hh internal.hh small_vector.hh ../elf/to_hex.hh 20 | CLEAN := 21 | 22 | libdwarf++.a: $(SRCS:.cc=.o) 23 | ar rcs $@ $^ 24 | CLEAN += libdwarf++.a $(SRCS:.cc=.o) 25 | 26 | $(SRCS:.cc=.o): $(HDRS) 27 | 28 | to_string.cc: ../elf/enum-print.py dwarf++.hh data.hh Makefile 29 | @echo "// Automatically generated by make at $$(date)" > to_string.cc 30 | @echo "// DO NOT EDIT" >> to_string.cc 31 | @echo >> to_string.cc 32 | @echo '#include "internal.hh"' >> to_string.cc 33 | @echo >> to_string.cc 34 | @echo 'DWARFPP_BEGIN_NAMESPACE' >> to_string.cc 35 | @echo >> to_string.cc 36 | python3 ../elf/enum-print.py < dwarf++.hh >> to_string.cc 37 | python3 ../elf/enum-print.py -s _ -u --hex -x hi_user -x lo_user < data.hh >> to_string.cc 38 | @echo 'DWARFPP_END_NAMESPACE' >> to_string.cc 39 | CLEAN += to_string.cc 40 | 41 | libdwarf++.so.$(SONAME): $(SRCS:.cc=.o) 42 | $(CXX) $(CXXFLAGS) $(LDFLAGS) -shared -Wl,$(SONAME_FLAG),$@ -o $@ $^ 43 | CLEAN += libdwarf++.so.* 44 | 45 | libdwarf++.so: 46 | ln -s $@.$(SONAME) $@ 47 | CLEAN += libdwarf++.so 48 | 49 | # Create pkg-config for local library and headers. This will be 50 | # transformed in to the correct global pkg-config by install. 
51 | libdwarf++.pc: always 52 | @(VER=$$(git describe --match 'v*' | sed -e s/^v//); \ 53 | echo "libdir=$$PWD"; \ 54 | echo "includedir=$$PWD"; \ 55 | echo ""; \ 56 | echo "Name: libdwarf++"; \ 57 | echo "Description: C++11 DWARF library"; \ 58 | echo "Version: $$VER"; \ 59 | echo "Requires: libelf++ = $$VER"; \ 60 | echo "Libs: -L\$${libdir} -ldwarf++"; \ 61 | echo "Cflags: -I\$${includedir}") > $@ 62 | CLEAN += libdwarf++.pc 63 | 64 | .PHONY: always 65 | 66 | PREFIX?=/usr/local 67 | 68 | install: libdwarf++.a libdwarf++.so.$(SONAME) libdwarf++.so libdwarf++.pc 69 | install -d $(PREFIX)/lib/pkgconfig 70 | install -t $(PREFIX)/lib libdwarf++.a 71 | install -t $(PREFIX)/lib libdwarf++.so.$(SONAME) 72 | install -t $(PREFIX)/lib libdwarf++.so 73 | install -d $(PREFIX)/include/libelfin/dwarf 74 | install -t $(PREFIX)/include/libelfin/dwarf data.hh dwarf++.hh small_vector.hh 75 | sed 's,^libdir=.*,libdir=$(PREFIX)/lib,;s,^includedir=.*,includedir=$(PREFIX)/include,' libdwarf++.pc \ 76 | > $(PREFIX)/lib/pkgconfig/libdwarf++.pc 77 | 78 | clean: 79 | rm -f $(CLEAN) 80 | 81 | .DELETE_ON_ERROR: 82 | -------------------------------------------------------------------------------- /freud-dwarf/dwarf/abbrev.cc: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2013 Austin T. Clements. All rights reserved. 2 | // Use of this source code is governed by an MIT license 3 | // that can be found in the LICENSE file. 4 | 5 | #include "internal.hh" 6 | 7 | using namespace std; 8 | 9 | DWARFPP_BEGIN_NAMESPACE 10 | 11 | static value::type 12 | resolve_type(DW_AT name, DW_FORM form) 13 | { 14 | switch (form) { 15 | case DW_FORM::addr: 16 | return value::type::address; 17 | 18 | case DW_FORM::block: 19 | case DW_FORM::block1: 20 | case DW_FORM::block2: 21 | case DW_FORM::block4: 22 | // Prior to DWARF 4, exprlocs didn't have their own 23 | // form and were represented as blocks. 24 | // XXX Should this be predicated on version? 
25 | switch (name) { 26 | case DW_AT::location: 27 | case DW_AT::byte_size: 28 | case DW_AT::bit_offset: 29 | case DW_AT::bit_size: 30 | case DW_AT::string_length: 31 | case DW_AT::lower_bound: 32 | case DW_AT::return_addr: 33 | case DW_AT::bit_stride: 34 | case DW_AT::upper_bound: 35 | case DW_AT::count: 36 | case DW_AT::data_member_location: 37 | case DW_AT::frame_base: 38 | case DW_AT::segment: 39 | case DW_AT::static_link: 40 | case DW_AT::use_location: 41 | case DW_AT::vtable_elem_location: 42 | case DW_AT::allocated: 43 | case DW_AT::associated: 44 | case DW_AT::data_location: 45 | case DW_AT::byte_stride: 46 | return value::type::exprloc; 47 | default: 48 | return value::type::block; 49 | } 50 | 51 | case DW_FORM::data4: 52 | case DW_FORM::data8: 53 | // Prior to DWARF 4, section offsets didn't have their 54 | // own form and were represented as data4 or data8. 55 | // DWARF 3 clarified that types that accepted both 56 | // constants and section offsets were to treat data4 57 | // and data8 as section offsets and other constant 58 | // forms as constants. 59 | // XXX Should this be predicated on version? 
60 | switch (name) { 61 | case DW_AT::location: 62 | case DW_AT::stmt_list: 63 | case DW_AT::string_length: 64 | case DW_AT::return_addr: 65 | case DW_AT::start_scope: 66 | case DW_AT::data_member_location: 67 | case DW_AT::frame_base: 68 | case DW_AT::macro_info: 69 | case DW_AT::segment: 70 | case DW_AT::static_link: 71 | case DW_AT::use_location: 72 | case DW_AT::vtable_elem_location: 73 | case DW_AT::ranges: 74 | goto sec_offset; 75 | default: 76 | // Fall through 77 | break; 78 | } 79 | case DW_FORM::data1: 80 | case DW_FORM::data2: 81 | return value::type::constant; 82 | case DW_FORM::udata: 83 | return value::type::uconstant; 84 | case DW_FORM::sdata: 85 | return value::type::sconstant; 86 | 87 | case DW_FORM::exprloc: 88 | return value::type::exprloc; 89 | 90 | case DW_FORM::flag: 91 | case DW_FORM::flag_present: 92 | return value::type::flag; 93 | 94 | case DW_FORM::ref1: 95 | case DW_FORM::ref2: 96 | case DW_FORM::ref4: 97 | case DW_FORM::ref8: 98 | case DW_FORM::ref_addr: 99 | case DW_FORM::ref_sig8: 100 | case DW_FORM::ref_udata: 101 | return value::type::reference; 102 | 103 | case DW_FORM::string: 104 | case DW_FORM::strp: 105 | return value::type::string; 106 | 107 | case DW_FORM::indirect: 108 | // There's nothing meaningful we can do 109 | return value::type::invalid; 110 | 111 | case DW_FORM::sec_offset: 112 | sec_offset: 113 | // The type of this form depends on the attribute 114 | switch (name) { 115 | case DW_AT::stmt_list: 116 | return value::type::line; 117 | 118 | case DW_AT::location: 119 | case DW_AT::string_length: 120 | case DW_AT::return_addr: 121 | case DW_AT::data_member_location: 122 | case DW_AT::frame_base: 123 | case DW_AT::segment: 124 | case DW_AT::static_link: 125 | case DW_AT::use_location: 126 | case DW_AT::vtable_elem_location: 127 | return value::type::loclist; 128 | 129 | case DW_AT::macro_info: 130 | return value::type::mac; 131 | 132 | case DW_AT::start_scope: 133 | case DW_AT::ranges: 134 | return 
value::type::rangelist; 135 | 136 | default: 137 | throw format_error("DW_FORM_sec_offset not expected for attribute " + 138 | to_string(name)); 139 | } 140 | } 141 | throw format_error("unknown attribute form " + to_string(form)); 142 | } 143 | 144 | attribute_spec::attribute_spec(DW_AT name, DW_FORM form) 145 | : name(name), form(form), type(resolve_type(name, form)) 146 | { 147 | } 148 | 149 | bool 150 | abbrev_entry::read(cursor *cur) 151 | { 152 | attributes.clear(); 153 | 154 | // Section 7.5.3 155 | code = cur->uleb128(); 156 | if (!code) 157 | return false; 158 | 159 | tag = (DW_TAG)cur->uleb128(); 160 | children = cur->fixed() == DW_CHILDREN::yes; 161 | while (1) { 162 | DW_AT name = (DW_AT)cur->uleb128(); 163 | DW_FORM form = (DW_FORM)cur->uleb128(); 164 | if (name == (DW_AT)0 && form == (DW_FORM)0) 165 | break; 166 | attributes.push_back(attribute_spec(name, form)); 167 | } 168 | attributes.shrink_to_fit(); 169 | return true; 170 | } 171 | 172 | DWARFPP_END_NAMESPACE 173 | -------------------------------------------------------------------------------- /freud-dwarf/dwarf/attrs.cc: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2013 Austin T. Clements. All rights reserved. 2 | // Use of this source code is governed by an MIT license 3 | // that can be found in the LICENSE file. 
4 | 5 | #include "dwarf++.hh" 6 | 7 | using namespace std; 8 | 9 | DWARFPP_BEGIN_NAMESPACE 10 | 11 | #define AT_ANY(name) \ 12 | value at_##name(const die &d) \ 13 | { \ 14 | return d[DW_AT::name]; \ 15 | } \ 16 | static_assert(true, "") 17 | 18 | #define AT_ADDRESS(name) \ 19 | taddr at_##name(const die &d) \ 20 | { \ 21 | return d[DW_AT::name].as_address(); \ 22 | } \ 23 | static_assert(true, "") 24 | 25 | #define AT_ENUM(name, type) \ 26 | type at_##name(const die &d) \ 27 | { \ 28 | return (type)d[DW_AT::name].as_uconstant(); \ 29 | } \ 30 | static_assert(true, "") 31 | 32 | #define AT_FLAG(name) \ 33 | bool at_##name(const die &d) \ 34 | { \ 35 | return d[DW_AT::name].as_flag(); \ 36 | } \ 37 | static_assert(true, "") 38 | 39 | #define AT_FLAG_(name) \ 40 | bool at_##name(const die &d) \ 41 | { \ 42 | return d[DW_AT::name##_].as_flag(); \ 43 | } \ 44 | static_assert(true, "") 45 | 46 | #define AT_REFERENCE(name) \ 47 | die at_##name(const die &d) \ 48 | { \ 49 | return d[DW_AT::name].as_reference(); \ 50 | } \ 51 | static_assert(true, "") 52 | 53 | #define AT_STRING(name) \ 54 | string at_##name(const die &d) \ 55 | { \ 56 | return d[DW_AT::name].as_string(); \ 57 | } \ 58 | static_assert(true, "") 59 | 60 | #define AT_UDYNAMIC(name) \ 61 | uint64_t at_##name(const die &d, expr_context *ctx) \ 62 | { \ 63 | return _at_udynamic(DW_AT::name, d, ctx); \ 64 | } \ 65 | static_assert(true, "") 66 | 67 | static uint64_t _at_udynamic(DW_AT attr, const die &d, expr_context *ctx, int depth = 0) 68 | { 69 | // DWARF4 section 2.19 70 | if (depth > 16) 71 | throw format_error("reference depth exceeded for " + to_string(attr)); 72 | 73 | value v(d[attr]); 74 | switch (v.get_type()) { 75 | case value::type::constant: 76 | case value::type::uconstant: 77 | return v.as_uconstant(); 78 | case value::type::reference: 79 | return _at_udynamic(attr, v.as_reference(), ctx, depth + 1); 80 | case value::type::exprloc: 81 | { 82 | std::string tmp_str; 83 | return 
v.as_exprloc().evaluate(ctx, tmp_str).value; 84 | } 85 | default: 86 | throw format_error(to_string(attr) + " has unexpected type " + 87 | to_string(v.get_type())); 88 | } 89 | } 90 | 91 | ////////////////////////////////////////////////////////////////// 92 | // 0x0X 93 | // 94 | 95 | AT_REFERENCE(sibling); 96 | // XXX location 97 | AT_STRING(name); 98 | AT_ENUM(ordering, DW_ORD); 99 | AT_UDYNAMIC(byte_size); 100 | AT_UDYNAMIC(bit_offset); 101 | AT_UDYNAMIC(bit_size); 102 | 103 | ////////////////////////////////////////////////////////////////// 104 | // 0x1X 105 | // 106 | 107 | // XXX stmt_list 108 | AT_ADDRESS(low_pc); 109 | taddr 110 | at_high_pc(const die &d) 111 | { 112 | value v(d[DW_AT::high_pc]); 113 | switch (v.get_type()) { 114 | case value::type::address: 115 | return v.as_address(); 116 | case value::type::constant: 117 | case value::type::uconstant: 118 | return at_low_pc(d) + v.as_uconstant(); 119 | default: 120 | throw format_error(to_string(DW_AT::high_pc) + " has unexpected type " + 121 | to_string(v.get_type())); 122 | } 123 | } 124 | AT_ENUM(language, DW_LANG); 125 | AT_REFERENCE(discr); 126 | AT_ANY(discr_value); // XXX Signed or unsigned 127 | AT_ENUM(visibility, DW_VIS); 128 | AT_REFERENCE(import); 129 | // XXX string_length 130 | AT_REFERENCE(common_reference); 131 | AT_STRING(comp_dir); 132 | AT_ANY(const_value); 133 | AT_REFERENCE(containing_type); 134 | // XXX default_value 135 | 136 | ////////////////////////////////////////////////////////////////// 137 | // 0x2X 138 | // 139 | 140 | DW_INL at_inline(const die &d) 141 | { 142 | // XXX Missing attribute is equivalent to DW_INL_not_inlined 143 | // (DWARF4 section 3.3.8) 144 | return (DW_INL)d[DW_AT::inline_].as_uconstant(); 145 | } 146 | AT_FLAG(is_optional); 147 | AT_UDYNAMIC(lower_bound); // XXX Language-based default? 
AT_STRING(producer);
AT_FLAG(prototyped);
// XXX return_addr
// XXX start_scope
AT_UDYNAMIC(bit_stride);
AT_UDYNAMIC(upper_bound);

//////////////////////////////////////////////////////////////////
// 0x3X
//

AT_REFERENCE(abstract_origin);
AT_ENUM(accessibility, DW_ACCESS);
// XXX const address_class
AT_FLAG(artificial);
// XXX base_types
AT_ENUM(calling_convention, DW_CC);
AT_UDYNAMIC(count);
expr_result
at_data_member_location(const die &d, expr_context *ctx, taddr base, taddr pc)
{
        value v(d[DW_AT::data_member_location]);
        switch (v.get_type()) {
        case value::type::constant:
        case value::type::uconstant:
                return {expr_result::type::address, base + v.as_uconstant()};
        case value::type::exprloc:
        {
                std::string tmp_str;
                return v.as_exprloc().evaluate(ctx, base, tmp_str);
        }
        case value::type::loclist:
                // XXX
                throw std::runtime_error("not implemented");
        default:
                throw format_error("DW_AT_data_member_location has unexpected type " +
                                   to_string(v.get_type()));
        }
}
// XXX decl_column decl_file decl_line
AT_FLAG(declaration);
// XXX discr_list
AT_ENUM(encoding, DW_ATE);
AT_FLAG(external);

//////////////////////////////////////////////////////////////////
// 0x4X
//

// XXX frame_base
die at_friend(const die &d)
{
        return d[DW_AT::friend_].as_reference();
}
AT_ENUM(identifier_case, DW_ID);
// XXX macro_info
AT_REFERENCE(namelist_item);
AT_REFERENCE(priority); // XXX Computed might be useful
// XXX segment
AT_REFERENCE(specification);
// XXX static_link
AT_REFERENCE(type);
// XXX use_location
AT_FLAG(variable_parameter);
// XXX 7.11 The value DW_VIRTUALITY_none is equivalent to the absence
// of the DW_AT_virtuality attribute.
AT_ENUM(virtuality, DW_VIRTUALITY);
// XXX vtable_elem_location
AT_UDYNAMIC(allocated);
AT_UDYNAMIC(associated);

//////////////////////////////////////////////////////////////////
// 0x5X
//

// XXX data_location
AT_UDYNAMIC(byte_stride);
AT_ADDRESS(entry_pc);
AT_FLAG(use_UTF8);
AT_REFERENCE(extension);
rangelist
at_ranges(const die &d)
{
        return d[DW_AT::ranges].as_rangelist();
}
// XXX trampoline
// XXX const call_column, call_file, call_line
AT_STRING(description);
// XXX const binary_scale
// XXX const decimal_scale
AT_REFERENCE(small);
// XXX const decimal_sign
// XXX const digit_count

//////////////////////////////////////////////////////////////////
// 0x6X
//

AT_STRING(picture_string);
AT_FLAG_(mutable);
AT_FLAG(threads_scaled);
AT_FLAG_(explicit);
AT_REFERENCE(object_pointer);
AT_ENUM(endianity, DW_END);
AT_FLAG(elemental);
AT_FLAG(pure);
AT_FLAG(recursive);
AT_REFERENCE(signature); // XXX Computed might be useful
AT_FLAG(main_subprogram);
// XXX const data_bit_offset
AT_FLAG(const_expr);
AT_FLAG(enum_class);
AT_STRING(linkage_name);

rangelist
die_pc_range(const die &d)
{
        // DWARF4 section 2.17
        if (d.has(DW_AT::ranges))
                return at_ranges(d);
        taddr low = at_low_pc(d);
        taddr high = d.has(DW_AT::high_pc) ? at_high_pc(d) : (low + 1);
        return rangelist({{low, high}});
}

DWARFPP_END_NAMESPACE

--------------------------------------------------------------------------------
/freud-dwarf/dwarf/cursor.cc:
--------------------------------------------------------------------------------
// Copyright (c) 2013 Austin T. Clements. All rights reserved.
// Use of this source code is governed by an MIT license
// that can be found in the LICENSE file.

#include "internal.hh"

#include <cstring>   // memmove
#include <stdexcept> // logic_error, underflow_error

using namespace std;

DWARFPP_BEGIN_NAMESPACE

void cursor::skip_bytes(uint32_t how_many) {
        pos += how_many;
}

int64_t
cursor::sleb128()
{
        // Appendix C
        unsigned int tmp = 0;
        return sleb128(tmp);
}

int64_t
cursor::sleb128(unsigned int &read)
{
        // Appendix C
        uint64_t result = 0;
        unsigned shift = 0;
        while (pos < sec->end) {
                uint8_t byte = *(uint8_t*)(pos++);
                read++;
                result |= (uint64_t)(byte & 0x7f) << shift;
                shift += 7;
                if ((byte & 0x80) == 0) {
                        // Sign-extend if the sign bit of the last byte is set
                        if (shift < sizeof(result)*8 && (byte & 0x40))
                                result |= -((uint64_t)1 << shift);
                        return result;
                }
        }
        underflow();
        return 0;
}

uint64_t
cursor::read_4_or_8_bytes_field()
{
        uint64_t res = 0;
        // Section 7.4
        switch (sec->fmt) {
        case format::dwarf32:
                res = fixed<uint32_t>();
                break;
        case format::dwarf64:
                res = fixed<uint64_t>();
                break;
        default:
                throw logic_error("cannot read 4/8byte field with unknown format");
        }
        return res;
}

shared_ptr<section>
cursor::subsection()
{
        // Section 7.4
        const char *begin = pos;
        section_length length = fixed<uword>();
        format fmt;
        if (length < 0xfffffff0) {
                fmt = format::dwarf32;
                length += sizeof(uword);
        } else if (length == 0xffffffff) {
                length = fixed<uint64_t>();
                fmt = format::dwarf64;
                length += sizeof(uword) + sizeof(uint64_t);
        } else {
                throw format_error("initial length has reserved value");
        }
        pos = begin + length;
        return make_shared<section>(sec->type, begin, length, sec->ord, fmt);
}

void
cursor::skip_initial_length()
{
        switch (sec->fmt) {
        case format::dwarf32:
                pos += sizeof(uword);
                break;
        case format::dwarf64:
                pos += sizeof(uword) + sizeof(uint64_t);
                break;
        default:
                throw logic_error("cannot skip initial length with unknown format");
        }
}

section_offset
cursor::offset()
{
        switch (sec->fmt) {
        case format::dwarf32:
                return fixed<uint32_t>();
        case format::dwarf64:
                return fixed<uint64_t>();
        default:
                throw logic_error("cannot read offset with unknown format");
        }
}

void
cursor::string(std::string &out)
{
        size_t size;
        const char *p = this->cstr(&size);
        out.resize(size);
        memmove(&out.front(), p, size);
}

const char *
cursor::cstr(size_t *size_out)
{
        // Scan string size
        const char *p = pos;
        while (pos < sec->end && *pos)
                pos++;
        if (pos == sec->end)
                throw format_error("unterminated string");
        if (size_out)
                *size_out = pos - p;
        pos++;
        return p;
}

void
cursor::skip_form(DW_FORM form)
{
        section_offset tmp;

        // Section 7.5.4
        switch (form) {
        case DW_FORM::addr:
                pos += sec->addr_size;
                break;
        case DW_FORM::sec_offset:
        case DW_FORM::ref_addr:
        case DW_FORM::strp:
                switch (sec->fmt) {
                case format::dwarf32:
                        pos += 4;
                        break;
                case format::dwarf64:
                        pos += 8;
                        break;
                case format::unknown:
                        throw logic_error("cannot read form with unknown format");
                }
                break;

                // size+data forms
        case DW_FORM::block1:
                tmp = fixed<uint8_t>();
                pos += tmp;
                break;
        case DW_FORM::block2:
                tmp = fixed<uint16_t>();
                pos += tmp;
                break;
        case DW_FORM::block4:
                tmp = fixed<uint32_t>();
                pos += tmp;
                break;
        case DW_FORM::block:
        case DW_FORM::exprloc:
                tmp = uleb128();
                pos += tmp;
                break;

                // fixed-length forms
        case DW_FORM::flag_present:
                break;
        case DW_FORM::flag:
        case DW_FORM::data1:
        case DW_FORM::ref1:
                pos += 1;
                break;
        case DW_FORM::data2:
        case DW_FORM::ref2:
                pos += 2;
                break;
        case DW_FORM::data4:
        case DW_FORM::ref4:
                pos += 4;
                break;
        case DW_FORM::data8:
        case DW_FORM::ref_sig8:
                pos += 8;
                break;

                // variable-length forms
        case DW_FORM::sdata:
        case DW_FORM::udata:
        case DW_FORM::ref_udata:
                while (pos < sec->end && (*(uint8_t*)pos & 0x80))
                        pos++;
                pos++;
                break;
        case DW_FORM::string:
                while (pos < sec->end && *pos)
                        pos++;
                pos++;
                break;

        case DW_FORM::indirect:
                skip_form((DW_FORM)uleb128());
                break;

        default:
                throw format_error("unknown form " + to_string(form));
        }
}

void
cursor::underflow()
{
        throw underflow_error("cannot read past end of DWARF section");
}

DWARFPP_END_NAMESPACE

--------------------------------------------------------------------------------
/freud-dwarf/dwarf/die.cc:
--------------------------------------------------------------------------------
// Copyright (c) 2013 Austin T. Clements. All rights reserved.
// Use of this source code is governed by an MIT license
// that can be found in the LICENSE file.

#include "internal.hh"
#include <cinttypes> // PRIu64
#include <cstdio>    // printf
#include <cstdlib>   // exit

using namespace std;

DWARFPP_BEGIN_NAMESPACE

die::die(const unit *cu)
        : cu(cu), abbrev(nullptr)
{
}

const unit &
die::get_unit() const
{
        return *cu;
}

section_offset
die::get_section_offset() const
{
        return cu->get_section_offset() + offset;
}

void
die::read(std::shared_ptr<section> cudata, section_offset off)
{
        cursor cur(cudata, off);

        position = off;
        offset = off;

        acode = cur.uleb128();
        if (acode > 1024) {
                printf("Weird acode %" PRIu64 "\n", acode);
                exit(-1);
        }
        if (acode == 0) {
                abbrev = nullptr;
                next = cur.get_section_offset();
                return;
        }
        abbrev = &cu->get_abbrev(acode);

        tag = abbrev->tag;

        // XXX We can pre-compute almost all of this work in the
        // abbrev_entry.
        attrs.clear();
        attrs.reserve(abbrev->attributes.size());
        for (auto &attr : abbrev->attributes) {
                attrs.push_back(cur.get_section_offset());
                cur.skip_form(attr.form);
        }
        next = cur.get_section_offset();
}

void
die::read(section_offset off)
{
        cursor cur(cu->data(), off);

        position = off;
        offset = off;

        acode = cur.uleb128();
        if (acode > 1024) {
                printf("Weird acode %" PRIu64 "\n", acode);
                exit(-1);
        }
        if (acode == 0) {
                abbrev = nullptr;
                next = cur.get_section_offset();
                return;
        }
        abbrev = &cu->get_abbrev(acode);

        tag = abbrev->tag;

        // XXX We can pre-compute almost all of this work in the
        // abbrev_entry.
        attrs.clear();
        attrs.reserve(abbrev->attributes.size());
        for (auto &attr : abbrev->attributes) {
                attrs.push_back(cur.get_section_offset());
                cur.skip_form(attr.form);
        }
        next = cur.get_section_offset();
}

bool
die::has(DW_AT attr) const
{
        if (!abbrev)
                return false;
        // XXX Totally lame
        for (auto &a : abbrev->attributes)
                if (a.name == attr)
                        return true;
        return false;
}

value
die::operator[](DW_AT attr) const
{
        // XXX We can pre-compute almost all of this work in the
        // abbrev_entry.
        if (abbrev) {
                int i = 0;
                for (auto &a : abbrev->attributes) {
                        if (a.name == attr)
                                return value(cu, a.name, a.form, a.type, attrs[i]);
                        i++;
                }
        }
        throw out_of_range("DIE does not have attribute " + to_string(attr));
}

value
die::resolve(DW_AT attr) const
{
        // DWARF4 section 2.13, DWARF4 section 3.3.8

        // DWARF4 is unclear about what to do when there's both a
        // DW_AT::specification and a DW_AT::abstract_origin.
        // Conceptually, though, a concrete inlined instance cannot
        // itself complete an external function that wasn't first
        // completed by its abstract instance, so we first try to
        // resolve abstract_origin, then we resolve specification.

        // XXX This traverses the abbrevs at least twice and
        // potentially several more times

        if (has(attr))
                return (*this)[attr];

        if (has(DW_AT::abstract_origin)) {
                die ao = (*this)[DW_AT::abstract_origin].as_reference();
                if (ao.has(attr))
                        return ao[attr];
                if (ao.has(DW_AT::specification)) {
                        die s = ao[DW_AT::specification].as_reference();
                        if (s.has(attr))
                                return s[attr];
                }
        } else if (has(DW_AT::specification)) {
                die s = (*this)[DW_AT::specification].as_reference();
                if (s.has(attr))
                        return s[attr];
        }

        return value();
}

die::iterator
die::begin() const
{
        if (acode > 1024)
                printf("ACODE is too big, fixme %" PRIu64 "\n", acode);
        if (acode > 1024 || !abbrev || !abbrev->children)
                return end();
        return iterator(cu, next);
}

die::iterator::iterator(const unit *cu, section_offset off)
        : d(cu)
{
        d.read(off);
}

die::iterator &
die::iterator::operator++()
{
        if (!d.abbrev)
                return *this;

        if (!d.abbrev->children) {
                // The DIE has no children, so its successor follows
                // immediately
                d.read(d.next);
        } else if (d.has(DW_AT::sibling)) {
                // They made it easy on us.  Follow the sibling
                // pointer.  XXX Probably worth optimizing
                d = d[DW_AT::sibling].as_reference();
        } else {
                // It's a hard-knock life.  We have to iterate through
                // the children to find the next DIE.
                // XXX Particularly unfortunate if the user is doing a
                // DFS, since this will result in N^2 behavior.  Maybe
                // a small cache of terminator locations in the CU?
                iterator sub(d.cu, d.next);
                while (sub->abbrev)
                        ++sub;
                d.read(sub->next);
        }

        return *this;
}

const vector<pair<DW_AT, value> >
die::attributes() const
{
        vector<pair<DW_AT, value> > res;

        if (!abbrev)
                return res;

        // XXX Quite slow, especially when using this to traverse an
        // entire DIE tree since each DIE will produce a new vector
        // (whereas other vectors get reused).  Might be worth a
        // custom iterator.
        int i = 0;
        for (auto &a : abbrev->attributes) {
                res.push_back(make_pair(a.name, value(cu, a.name, a.form, a.type, attrs[i])));
                i++;
        }
        return res;
}

bool
die::operator==(const die &o) const
{
        return cu == o.cu && offset == o.offset;
}

bool
die::operator!=(const die &o) const
{
        return !(*this == o);
}

bool
die::contains_section_offset(section_offset off) const
{
        auto contains_off = [off] (const die& d) { return off >= d.get_section_offset() && off < d.next; };

        if (contains_off(*this)) return true;

        for (const auto& child : *this) {
                if (contains_off(child)) return true;
        }

        return false;
}

DWARFPP_END_NAMESPACE

size_t
std::hash<dwarf::die>::operator()(const dwarf::die &a) const
{
        return hash<decltype(a.cu)>()(a.cu) ^
                hash<decltype(a.get_unit_offset())>()(a.get_unit_offset());
}

--------------------------------------------------------------------------------
/freud-dwarf/dwarf/die_str_map.cc:
--------------------------------------------------------------------------------
// Copyright (c) 2013 Austin T. Clements. All rights reserved.
// Use of this source code is governed by an MIT license
// that can be found in the LICENSE file.

#include "internal.hh"

#include <cstring>       // strcmp
#include <unordered_map>
#include <unordered_set>

using namespace std;

// XXX Make this more readily available?
namespace std {
        template<>
        struct hash<dwarf::DW_TAG>
        {
                typedef size_t result_type;
                typedef dwarf::DW_TAG argument_type;
                result_type operator()(argument_type a) const
                {
                        return (result_type)a;
                }
        };
}

DWARFPP_BEGIN_NAMESPACE

struct string_hash
{
        typedef size_t result_type;
        typedef const char *argument_type;
        result_type operator()(const char *s) const
        {
                result_type h = 0;
                for (; *s; ++s)
                        h += 33 * h + *s;
                return h;
        }
};

struct string_eq
{
        typedef bool result_type;
        typedef const char *first_argument_type;
        typedef const char *second_argument_type;
        bool operator()(const char *x, const char *y) const
        {
                return strcmp(x, y) == 0;
        }
};

struct die_str_map::impl
{
        impl(const die &parent, DW_AT attr,
             const initializer_list<DW_TAG> &accept)
                : attr(attr), accept(accept.begin(), accept.end()),
                  pos(parent.begin()), end(parent.end()) { }

        unordered_map<const char*, die, string_hash, string_eq> str_map;
        DW_AT attr;
        unordered_set<DW_TAG> accept;
        die::iterator pos, end;
        die invalid;
};

die_str_map::die_str_map(const die &parent, DW_AT attr,
                         const initializer_list<DW_TAG> &accept)
        : m(make_shared<impl>(parent, attr, accept))
{
}

die_str_map
die_str_map::from_type_names(const die &parent)
{
        return die_str_map
                (parent, DW_AT::name,
                 // All DWARF type tags (this is everything that ends
                 // with _type except thrown_type).
                 {DW_TAG::array_type, DW_TAG::class_type,
                  DW_TAG::enumeration_type, DW_TAG::pointer_type,
                  DW_TAG::reference_type, DW_TAG::string_type,
                  DW_TAG::structure_type, DW_TAG::subroutine_type,
                  DW_TAG::union_type, DW_TAG::ptr_to_member_type,
                  DW_TAG::set_type, DW_TAG::subrange_type,
                  DW_TAG::base_type, DW_TAG::const_type,
                  DW_TAG::file_type, DW_TAG::packed_type,
                  DW_TAG::volatile_type, DW_TAG::restrict_type,
                  DW_TAG::interface_type, DW_TAG::unspecified_type,
                  DW_TAG::shared_type, DW_TAG::rvalue_reference_type});
}

const die &
die_str_map::operator[](const char *val) const
{
        // Do we have this value?
        auto it = m->str_map.find(val);
        if (it != m->str_map.end())
                return it->second;
        // Read more until we find the value or the end
        while (m->pos != m->end) {
                const die &d = *m->pos;
                ++m->pos;

                if (!m->accept.count(d.tag) || !d.has(m->attr))
                        continue;
                value dval(d[m->attr]);
                if (dval.get_type() != value::type::string)
                        continue;
                const char *dstr = dval.as_cstr();
                m->str_map[dstr] = d;
                if (strcmp(val, dstr) == 0)
                        return m->str_map[dstr];
        }
        // Not found
        return m->invalid;
}

DWARFPP_END_NAMESPACE

--------------------------------------------------------------------------------
/freud-dwarf/dwarf/elf.cc:
--------------------------------------------------------------------------------
// Copyright (c) 2013 Austin T. Clements. All rights reserved.
// Use of this source code is governed by an MIT license
// that can be found in the LICENSE file.

#include "dwarf++.hh"

#include <cstring> // strcmp

using namespace std;

DWARFPP_BEGIN_NAMESPACE

static const struct
{
        const char *name;
        section_type type;
} sections[] = {
        {".debug_abbrev",   section_type::abbrev},
        {".debug_aranges",  section_type::aranges},
        {".debug_frame",    section_type::frame},
        {".debug_info",     section_type::info},
        {".debug_line",     section_type::line},
        {".debug_loc",      section_type::loc},
        {".debug_macinfo",  section_type::macinfo},
        {".debug_pubnames", section_type::pubnames},
        {".debug_pubtypes", section_type::pubtypes},
        {".debug_ranges",   section_type::ranges},
        {".debug_str",      section_type::str},
        {".debug_types",    section_type::types},
        {".eh_frame",       section_type::eh_frame},
};

bool
elf::section_name_to_type(const char *name, section_type *out)
{
        for (auto &sec : sections) {
                if (strcmp(sec.name, name) == 0) {
                        *out = sec.type;
                        return true;
                }
        }
        return false;
}

const char *
elf::section_type_to_name(section_type type)
{
        for (auto &sec : sections) {
                if (sec.type == type)
                        return sec.name;
        }
        return nullptr;
}

DWARFPP_END_NAMESPACE

--------------------------------------------------------------------------------
/freud-dwarf/dwarf/internal.hh:
--------------------------------------------------------------------------------
// Copyright (c) 2013 Austin T. Clements. All rights reserved.
// Use of this source code is governed by an MIT license
// that can be found in the LICENSE file.

#ifndef _DWARFPP_INTERNAL_HH_
#define _DWARFPP_INTERNAL_HH_

#include "dwarf++.hh"
#include "../elf/to_hex.hh"

#include <algorithm> // std::min
#include <memory>    // std::shared_ptr
#include <stdexcept> // std::runtime_error
#include <vector>

DWARFPP_BEGIN_NAMESPACE

enum class format
{
        unknown,
        dwarf32,
        dwarf64
};

enum class byte_order
{
        lsb,
        msb
};

/**
 * Return this system's native byte order.
 */
static inline byte_order
native_order()
{
        static const union
        {
                int i;
                char c[sizeof(int)];
        } test = {1};

        return test.c[0] == 1 ? byte_order::lsb : byte_order::msb;
}

/**
 * A single DWARF section or a slice of a section.  This also tracks
 * dynamic information necessary to decode values in this section.
 */
struct section
{
        uint64_t sec_offset;
        section_type type;
        const char *begin, *end;
        const format fmt;
        const byte_order ord;
        unsigned addr_size;

        section(section_type type, const void *begin,
                section_length length,
                byte_order ord, format fmt = format::unknown,
                unsigned addr_size = 0)
                : sec_offset(0), type(type), begin((char*)begin), end((char*)begin + length),
                  fmt(fmt), ord(ord), addr_size(addr_size) { }

        section(const uint64_t offset,
                section_type type, const void *begin,
                section_length length,
                byte_order ord, format fmt = format::unknown,
                unsigned addr_size = 0)
                : sec_offset(offset), type(type), begin((char*)begin), end((char*)begin + length),
                  fmt(fmt), ord(ord), addr_size(addr_size) { }

        section(const section &o) = default;

        std::shared_ptr<section>
        slice(section_offset start, section_length len,
              format fmt = format::unknown,
              unsigned addr_size = 0)
        {
                if (fmt == format::unknown)
                        fmt = this->fmt;
                if (addr_size == 0)
                        addr_size = this->addr_size;

                return std::make_shared<section>(
                        type, begin+start,
                        std::min(len, (section_length)(end-begin)),
                        ord, fmt, addr_size);
        }

        size_t size() const
        {
                return end - begin;
        }
};

/**
 * A cursor pointing into a DWARF section.  Provides deserialization
 * operations and bounds checking.
 */
struct cursor
{
        // XXX There's probably a lot of overhead to maintaining the
        // shared pointer to the section from this.  Perhaps the rule
        // should be that all objects keep the dwarf::impl alive
        // (directly or indirectly) and that keeps the loader alive,
        // so a cursor just needs a regular section*.

        std::shared_ptr<section> sec;
        const char *pos;

        cursor()
                : pos(nullptr) { }
        cursor(const std::shared_ptr<section> sec, section_offset offset = 0)
                : sec(sec), pos(sec->begin + offset) { }

        /**
         * Read a subsection.  The cursor must be at an initial
         * length.  After, the cursor will point just past the end of
         * the subsection.  The returned section has the appropriate
         * DWARF format and begins at the current location of the
         * cursor (so this is usually followed by a
         * skip_initial_length).
         */
        std::shared_ptr<section>
        subsection();
        void skip_bytes(uint32_t how_many); // useful to parse loclists
        std::int64_t sleb128();
        std::int64_t sleb128(unsigned int &read);
        section_offset offset();
        void string(std::string &out);
        const char *cstr(size_t *size_out = nullptr);

        void
        ensure(section_offset bytes)
        {
                if ((section_offset)(sec->end - pos) < bytes || pos >= sec->end)
                        underflow();
        }

        template<typename T>
        T fixed()
        {
                ensure(sizeof(T));
                static_assert(sizeof(T) <= 8, "T too big");
                uint64_t val = 0;
                const unsigned char *p = (const unsigned char*)pos;
                if (sec->ord == byte_order::lsb) {
                        for (unsigned i = 0; i < sizeof(T); i++)
                                val |= ((uint64_t)p[i]) << (i * 8);
                } else {
                        for (unsigned i = 0; i < sizeof(T); i++)
                                val = (val << 8) | (uint64_t)p[i];
                }
                pos += sizeof(T);
                return (T)val;
        }

        std::uint64_t uleb128()
        {
                // Appendix C
                // XXX Pre-compute all two byte ULEB's
                unsigned int tmp = 0;
                return uleb128(tmp);
        }

        std::uint64_t uleb128(unsigned int &read)
        {
                // Appendix C
                // XXX Pre-compute all two byte ULEB's
                std::uint64_t result = 0;
                int shift = 0;
                while (pos < sec->end) {
                        uint8_t byte = *(uint8_t*)(pos++);
                        read++;
                        result |= (uint64_t)(byte & 0x7f) << shift;
                        if ((byte & 0x80) == 0)
                                return result;
                        shift += 7;
                }
                underflow();
                return 0;
        }

        taddr address()
        {
                switch (sec->addr_size) {
                case 1:
                        return fixed<uint8_t>();
                case 2:
                        return fixed<uint16_t>();
                case 4:
                        return fixed<uint32_t>();
                case 8:
                        return fixed<uint64_t>();
                default:
                        throw std::runtime_error("address size " + std::to_string(sec->addr_size) + " not supported");
                }
        }

        uint64_t read_4_or_8_bytes_field();
        void skip_initial_length();
        void skip_form(DW_FORM form);

        cursor &operator+=(section_offset offset)
        {
                pos += offset;
                return *this;
        }

        cursor operator+(section_offset offset) const
        {
                return cursor(sec, pos + offset);
        }

        bool operator<(const cursor &o) const
        {
                return pos < o.pos;
        }

        bool end() const
        {
                return pos >= sec->end;
        }

        bool valid() const
        {
                return !!pos;
        }

        section_offset get_section_offset() const
        {
                return pos - sec->begin;
        }

private:
        cursor(const std::shared_ptr<section> sec, const char *pos)
                : sec(sec), pos(pos) { }

        void underflow();
};

/**
 * An attribute specification in an abbrev.
 */
struct attribute_spec
{
        DW_AT name;
        DW_FORM form;

        // Computed information
        value::type type;

        attribute_spec(DW_AT name, DW_FORM form);
};

typedef std::uint64_t abbrev_code;

/**
 * An entry in .debug_abbrev.
 */
struct abbrev_entry
{
        abbrev_code code;
        DW_TAG tag;
        bool children;
        std::vector<attribute_spec> attributes;

        abbrev_entry() : code(0) { }

        bool read(cursor *cur);
};

/**
 * A section header in .debug_pubnames or .debug_pubtypes.
 */
struct name_unit
{
        uhalf version;
        section_offset debug_info_offset;
        section_length debug_info_length;
        // Cursor to the first name_entry in this unit.  This cursor's
        // section is limited to this unit.
        cursor entries;

        void read(cursor *cur)
        {
                // Section 7.19
                std::shared_ptr<section> subsec = cur->subsection();
                cursor sub(subsec);
                sub.skip_initial_length();
                version = sub.fixed<uhalf>();
                if (version != 2)
                        throw format_error("unknown name unit version " + std::to_string(version));
                debug_info_offset = sub.offset();
                debug_info_length = sub.offset();
                entries = sub;
        }
};

/**
 * An entry in a .debug_pubnames or .debug_pubtypes unit.
 */
struct name_entry
{
        section_offset offset;
        std::string name;

        void read(cursor *cur)
        {
                offset = cur->offset();
                cur->string(name);
        }
};

DWARFPP_END_NAMESPACE

#endif

--------------------------------------------------------------------------------
/freud-dwarf/dwarf/rangelist.cc:
--------------------------------------------------------------------------------
// Copyright (c) 2013 Austin T. Clements. All rights reserved.
// Use of this source code is governed by an MIT license
// that can be found in the LICENSE file.

#include "internal.hh"

using namespace std;

DWARFPP_BEGIN_NAMESPACE

rangelist::rangelist(const std::shared_ptr<section> &sec, section_offset off,
                     unsigned cu_addr_size, taddr cu_low_pc)
        : sec(sec->slice(off, ~0, format::unknown, cu_addr_size)),
          base_addr(cu_low_pc)
{
}

rangelist::rangelist(const initializer_list<pair<taddr, taddr> > &ranges)
{
        synthetic.reserve(ranges.size() * 2 + 2);
        for (auto &range : ranges) {
                synthetic.push_back(range.first);
                synthetic.push_back(range.second);
        }
        synthetic.push_back(0);
        synthetic.push_back(0);

        sec = make_shared<section>(
                section_type::ranges, (const char*)synthetic.data(),
                synthetic.size() * sizeof(taddr),
                native_order(), format::unknown, sizeof(taddr));

        base_addr = 0;
}

rangelist::iterator
rangelist::begin() const
{
        if (sec)
                return iterator(sec, base_addr);
        return end();
}

rangelist::iterator
rangelist::end() const
{
        return iterator();
}

bool
rangelist::contains(taddr addr) const
{
        for (auto ent : *this)
                if (ent.contains(addr))
                        return true;
        return false;
}

rangelist::iterator::iterator(const std::shared_ptr<section> &sec, taddr base_addr)
        : sec(sec), base_addr(base_addr), pos(0)
{
        // Read in the first entry
        ++(*this);
}

rangelist::iterator &
rangelist::iterator::operator++()
{
        // DWARF4 section 2.17.3
        taddr largest_offset = ~(taddr)0;
        // Widen before shifting: a plain int `1` would overflow when
        // addr_size is 4.
        if (sec->addr_size < sizeof(taddr))
                largest_offset += (taddr)1 << (8 * sec->addr_size);

        // Read in entries until we reach a regular entry or an
        // end-of-list.  Note that pos points to the beginning of the
        // entry *following* the current entry, so that's where we
        // start.
        cursor cur(sec, pos);
        while (true) {
                entry.low = cur.address();
                entry.high = cur.address();

                if (entry.low == 0 && entry.high == 0) {
                        // End of list
                        sec.reset();
                        pos = 0;
                        break;
                } else if (entry.low == largest_offset) {
                        // Base address change
                        base_addr = entry.high;
                } else {
                        // Regular entry.  Adjust by base address.
                        entry.low += base_addr;
                        entry.high += base_addr;
                        pos = cur.get_section_offset();
                        break;
                }
        }

        return *this;
}

DWARFPP_END_NAMESPACE

--------------------------------------------------------------------------------
/freud-dwarf/dwarf/small_vector.hh:
--------------------------------------------------------------------------------
// Copyright (c) 2013 Austin T. Clements. All rights reserved.
// Use of this source code is governed by an MIT license
// that can be found in the LICENSE file.

#ifndef _DWARFPP_SMALL_VECTOR_HH_
#define _DWARFPP_SMALL_VECTOR_HH_

DWARFPP_BEGIN_NAMESPACE

/**
 * A vector-like class that only heap allocates above a specified
 * size.
13 | */ 14 | template<class T, unsigned Min> 15 | class small_vector 16 | { 17 | public: 18 | typedef T value_type; 19 | typedef value_type& reference; 20 | typedef const value_type& const_reference; 21 | typedef size_t size_type; 22 | 23 | small_vector() 24 | : base((T*)buf), end(base), cap((T*)&buf[sizeof(T[Min])]) 25 | { 26 | } 27 | 28 | small_vector(const small_vector &o) 29 | : base((T*)buf), end(base), cap((T*)&buf[sizeof(T[Min])]) 30 | { 31 | *this = o; 32 | } 33 | 34 | small_vector(small_vector &&o) 35 | : base((T*)buf), end(base), cap((T*)&buf[sizeof(T[Min])]) 36 | { 37 | if ((char*)o.base == o.buf) { 38 | // Elements are inline; have to copy them 39 | base = (T*)buf; 40 | end = base; 41 | cap = (T*)&buf[sizeof(T[Min])]; 42 | 43 | *this = o; 44 | o.clear(); 45 | } else { 46 | // Elements are external; swap pointers 47 | base = o.base; 48 | end = o.end; 49 | cap = o.cap; 50 | 51 | o.base = (T*)o.buf; 52 | o.end = o.base; 53 | o.cap = (T*)&o.buf[sizeof(T[Min])]; 54 | } 55 | } 56 | 57 | ~small_vector() 58 | { 59 | clear(); 60 | if ((char*)base != buf) 61 | delete[] (char*)base; 62 | } 63 | 64 | small_vector &operator=(const small_vector &o) 65 | { 66 | size_type osize = o.size(); 67 | clear(); 68 | reserve(osize); 69 | for (size_type i = 0; i < osize; i++) 70 | new (&base[i]) T(o[i]); 71 | end = base + osize; 72 | return *this; 73 | } 74 | 75 | size_type size() const 76 | { 77 | return end - base; 78 | } 79 | 80 | bool empty() const 81 | { 82 | return base == end; 83 | } 84 | 85 | void reserve(size_type n) 86 | { 87 | if (n <= (size_type)(cap - base)) 88 | return; 89 | 90 | size_type target = cap - base; 91 | if (target == 0) 92 | target = 1; 93 | while (target < n) 94 | target <<= 1; 95 | 96 | char *newbuf = new char[sizeof(T[target])]; 97 | T *src = base, *dest = (T*)newbuf; 98 | for (; src < end; src++, dest++) { 99 | new(dest) T(*src); 100 | src->~T(); // destroy the old element, not the freshly constructed copy 101 | } 102 | if ((char*)base != buf) 103 | delete[] (char*)base; 104 | base = (T*)newbuf; 105 | end = dest; 106 | cap = base 
+ target; 107 | } 108 | 109 | reference operator[](size_type n) 110 | { 111 | return base[n]; 112 | } 113 | 114 | const_reference operator[](size_type n) const 115 | { 116 | return base[n]; 117 | } 118 | 119 | reference at(size_type n) 120 | { 121 | return base[n]; 122 | } 123 | 124 | const_reference at(size_type n) const 125 | { 126 | return base[n]; 127 | } 128 | 129 | /** 130 | * "Reverse at". revat(0) is equivalent to back(). revat(1) 131 | * is the element before back. Etc. 132 | */ 133 | reference revat(size_type n) 134 | { 135 | return *(end - 1 - n); 136 | } 137 | 138 | const_reference revat(size_type n) const 139 | { 140 | return *(end - 1 - n); 141 | } 142 | 143 | reference front() 144 | { 145 | return base[0]; 146 | } 147 | 148 | const_reference front() const 149 | { 150 | return base[0]; 151 | } 152 | 153 | reference back() 154 | { 155 | return *(end-1); 156 | } 157 | 158 | const_reference back() const 159 | { 160 | return *(end-1); 161 | } 162 | 163 | void push_back(const T& x) 164 | { 165 | reserve(size() + 1); 166 | new (end) T(x); 167 | end++; 168 | } 169 | 170 | void push_back(T&& x) 171 | { 172 | reserve(size() + 1); 173 | new (end) T(std::move(x)); 174 | end++; 175 | } 176 | 177 | void pop_back() 178 | { 179 | end--; 180 | end->~T(); 181 | } 182 | 183 | void clear() 184 | { 185 | for (T* p = base; p < end; ++p) 186 | p->~T(); 187 | end = base; 188 | } 189 | 190 | private: 191 | char buf[sizeof(T[Min])]; 192 | T *base, *end, *cap; 193 | }; 194 | 195 | DWARFPP_END_NAMESPACE 196 | 197 | #endif 198 | -------------------------------------------------------------------------------- /freud-dwarf/elf/.gitignore: -------------------------------------------------------------------------------- 1 | *.o 2 | to_string.cc 3 | libelf++.a 4 | libelf++.so 5 | libelf++.so.* 6 | libelf++.pc 7 | -------------------------------------------------------------------------------- /freud-dwarf/elf/Makefile: 
-------------------------------------------------------------------------------- 1 | # Changed when ABI backwards compatibility is broken. 2 | # Typically uses the major version. 3 | SONAME = 0 4 | 5 | CXXFLAGS+=-g -O2 -Werror 6 | override CXXFLAGS+=-std=c++11 -Wall -fPIC -Wno-unused-private-field 7 | 8 | ifeq ($(shell uname -s),Darwin) 9 | SONAME_FLAG=-install_name 10 | else 11 | SONAME_FLAG=-soname 12 | endif 13 | 14 | 15 | all: libelf++.a libelf++.so libelf++.so.$(SONAME) libelf++.pc 16 | 17 | SRCS := elf.cc mmap_loader.cc to_string.cc 18 | HDRS := elf++.hh data.hh common.hh to_hex.hh 19 | CLEAN := 20 | 21 | libelf++.a: $(SRCS:.cc=.o) 22 | ar rcs $@ $^ 23 | CLEAN += libelf++.a $(SRCS:.cc=.o) 24 | 25 | $(SRCS:.cc=.o): $(HDRS) 26 | 27 | to_string.cc: enum-print.py data.hh Makefile 28 | @echo "// Automatically generated by make at $$(date)" > to_string.cc 29 | @echo "// DO NOT EDIT" >> to_string.cc 30 | @echo >> to_string.cc 31 | @echo '#include "data.hh"' >> to_string.cc 32 | @echo '#include "to_hex.hh"' >> to_string.cc 33 | @echo >> to_string.cc 34 | @echo 'ELFPP_BEGIN_NAMESPACE' >> to_string.cc 35 | @echo >> to_string.cc 36 | python3 enum-print.py -u --hex --no-type --mask shf --mask pf \ 37 | -x loos -x hios -x loproc -x hiproc < data.hh >> to_string.cc 38 | @echo 'ELFPP_END_NAMESPACE' >> to_string.cc 39 | CLEAN += to_string.cc 40 | 41 | libelf++.so.$(SONAME): $(SRCS:.cc=.o) 42 | $(CXX) $(CXXFLAGS) $(LDFLAGS) -shared -Wl,$(SONAME_FLAG),$@ -o $@ $^ 43 | CLEAN += libelf++.so.* 44 | 45 | libelf++.so: 46 | ln -s $@.$(SONAME) $@ 47 | CLEAN += libelf++.so 48 | 49 | # Create pkg-config for local library and headers. This will be 50 | # transformed in to the correct global pkg-config by install. 
51 | libelf++.pc: always 52 | @(VER=$$(git describe --match 'v*' | sed -e s/^v//); \ 53 | echo "libdir=$$PWD"; \ 54 | echo "includedir=$$PWD"; \ 55 | echo ""; \ 56 | echo "Name: libelf++"; \ 57 | echo "Description: C++11 ELF library"; \ 58 | echo "Version: $$VER"; \ 59 | echo "Libs: -L\$${libdir} -lelf++"; \ 60 | echo "Cflags: -I\$${includedir}") > $@ 61 | CLEAN += libelf++.pc 62 | 63 | .PHONY: always 64 | 65 | PREFIX?=/usr/local 66 | 67 | install: libelf++.a libelf++.so libelf++.so.$(SONAME) libelf++.pc 68 | install -d $(PREFIX)/lib/pkgconfig 69 | install -t $(PREFIX)/lib libelf++.a 70 | install -t $(PREFIX)/lib libelf++.so.$(SONAME) 71 | install -t $(PREFIX)/lib libelf++.so 72 | install -d $(PREFIX)/include/libelfin/elf 73 | install -t $(PREFIX)/include/libelfin/elf common.hh data.hh elf++.hh 74 | sed 's,^libdir=.*,libdir=$(PREFIX)/lib,;s,^includedir=.*,includedir=$(PREFIX)/include,' libelf++.pc \ 75 | > $(PREFIX)/lib/pkgconfig/libelf++.pc 76 | 77 | clean: 78 | rm -f $(CLEAN) 79 | 80 | .DELETE_ON_ERROR: 81 | -------------------------------------------------------------------------------- /freud-dwarf/elf/common.hh: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2013 Austin T. Clements. All rights reserved. 2 | // Use of this source code is governed by an MIT license 3 | // that can be found in the LICENSE file. 4 | 5 | #ifndef _ELFPP_COMMON_HH_ 6 | #define _ELFPP_COMMON_HH_ 7 | 8 | #define ELFPP_BEGIN_NAMESPACE namespace elf { 9 | #define ELFPP_END_NAMESPACE } 10 | #define ELFPP_BEGIN_INTERNAL namespace internal { 11 | #define ELFPP_END_INTERNAL } 12 | 13 | #include <cstdint> 14 | 15 | ELFPP_BEGIN_NAMESPACE 16 | 17 | /** 18 | * A byte ordering. 19 | */ 20 | enum class byte_order 21 | { 22 | native, 23 | lsb, 24 | msb 25 | }; 26 | 27 | /** 28 | * Return either byte_order::lsb or byte_order::msb. If the argument 29 | * is byte_order::native, it will be resolved to whatever the native 30 | * byte order is. 
31 | */ 32 | static inline byte_order 33 | resolve_order(byte_order o) 34 | { 35 | static const union 36 | { 37 | int i; 38 | char c[sizeof(int)]; 39 | } test = {1}; 40 | 41 | if (o == byte_order::native) 42 | return test.c[0] == 1 ? byte_order::lsb : byte_order::msb; 43 | return o; 44 | } 45 | 46 | /** 47 | * Return v converted from one byte order to another. 48 | */ 49 | template<typename T> 50 | T 51 | swizzle(T v, byte_order from, byte_order to) 52 | { 53 | static_assert(sizeof(T) == 1 || 54 | sizeof(T) == 2 || 55 | sizeof(T) == 4 || 56 | sizeof(T) == 8, 57 | "cannot swizzle type"); 58 | 59 | from = resolve_order(from); 60 | to = resolve_order(to); 61 | 62 | if (from == to) 63 | return v; 64 | 65 | switch (sizeof(T)) { 66 | case 1: 67 | return v; 68 | case 2: { 69 | std::uint16_t x = (std::uint16_t)v; 70 | return (T)(((x&0xFF) << 8) | (x >> 8)); 71 | } 72 | case 4: 73 | return (T)__builtin_bswap32((std::uint32_t)v); 74 | case 8: 75 | return (T)__builtin_bswap64((std::uint64_t)v); 76 | } 77 | } 78 | 79 | ELFPP_BEGIN_INTERNAL 80 | 81 | /** 82 | * OrderPick selects between Native, LSB, and MSB based on ord. 83 | */ 84 | template<byte_order ord, typename Native, typename LSB, typename MSB> 85 | struct OrderPick; 86 | 87 | template<typename Native, typename LSB, typename MSB> 88 | struct OrderPick<byte_order::native, Native, LSB, MSB> 89 | { 90 | typedef Native T; 91 | }; 92 | 93 | template<typename Native, typename LSB, typename MSB> 94 | struct OrderPick<byte_order::lsb, Native, LSB, MSB> 95 | { 96 | typedef LSB T; 97 | }; 98 | 99 | template<typename Native, typename LSB, typename MSB> 100 | struct OrderPick<byte_order::msb, Native, LSB, MSB> 101 | { 102 | typedef MSB T; 103 | }; 104 | 105 | ELFPP_END_INTERNAL 106 | 107 | ELFPP_END_NAMESPACE 108 | 109 | #endif 110 | -------------------------------------------------------------------------------- /freud-dwarf/elf/enum-print.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2013 Austin T. Clements. All rights reserved. 2 | # Use of this source code is governed by an MIT license 3 | # that can be found in the LICENSE file. 
4 | 5 | import sys, re 6 | from optparse import OptionParser 7 | 8 | def read_toks(): 9 | data = sys.stdin.read() 10 | while data: 11 | data = data.lstrip() 12 | if data.startswith("//") or data.startswith("#"): 13 | data = data.split("\n",1)[1] 14 | elif data.startswith("/*"): 15 | data = data.split("*/",1)[1] 16 | elif data.startswith("\"") or data.startswith("'"): 17 | c = data[0] 18 | m = re.match(r'%s([^\\%s]|\\.)*%s' % (c,c,c), data) 19 | yield m.group(0) 20 | data = data[m.end():] 21 | else: 22 | m = re.match(r"[_a-zA-Z0-9]+|[{}();]|[^_a-zA-Z0-9 \n\t\f]+", data) 23 | yield m.group(0) 24 | data = data[m.end():] 25 | 26 | enums = {} 27 | 28 | def do_top_level(toks, ns=[]): 29 | while toks: 30 | tok = toks.pop(0) 31 | if tok == "enum" and toks[0] == "class": 32 | toks.pop(0) 33 | name = toks.pop(0) 34 | # Get to the first token in the body 35 | while toks.pop(0) != "{": 36 | pass 37 | # Consume body and close brace 38 | do_enum_body("::".join(ns + [name]), toks) 39 | elif tok == "class": 40 | name = do_qname(toks) 41 | # Find the class body, if there is one 42 | while toks[0] != "{" and toks[0] != ";": 43 | toks.pop(0) 44 | # Enter the class's namespace 45 | if toks[0] == "{": 46 | toks.pop(0) 47 | do_top_level(toks, ns + [name]) 48 | elif tok == "{": 49 | # Enter an unknown namespace 50 | do_top_level(toks, ns + [None]) 51 | elif tok == "}": 52 | # Exit the namespace 53 | assert len(ns) 54 | return 55 | elif not ns and tok == "string" and toks[:2] == ["to_string", "("]: 56 | # Get the argument type and name 57 | toks.pop(0) 58 | toks.pop(0) 59 | typ = do_qname(toks) 60 | if typ not in enums: 61 | continue 62 | arg = toks.pop(0) 63 | assert toks[0] == ")" 64 | 65 | if typ in options.mask: 66 | make_to_string_mask(typ, arg) 67 | else: 68 | make_to_string(typ, arg) 69 | 70 | def fmt_value(typ, key): 71 | if options.no_type: 72 | val = key 73 | else: 74 | val = "%s%s%s" % (typ, options.separator, key) 75 | if options.strip_underscore: 76 | val = val.strip("_") 77 
| return val 78 | 79 | def expr_remainder(typ, arg): 80 | if options.hex: 81 | return "\"(%s)0x\" + to_hex((int)%s)" % (typ, arg) 82 | else: 83 | return "\"(%s)\" + std::to_string((int)%s)" % (typ, arg) 84 | 85 | def make_to_string(typ, arg): 86 | print("std::string") 87 | print("to_string(%s %s)" % (typ, arg)) 88 | print("{") 89 | print(" switch (%s) {" % arg) 90 | for key in enums[typ]: 91 | if key in options.exclude: 92 | print(" case %s::%s: break;" % (typ, key)) 93 | continue 94 | print(" case %s::%s: return \"%s\";" % \ 95 | (typ, key, fmt_value(typ, key))) 96 | print(" }") 97 | print(" return %s;" % expr_remainder(typ, arg)) 98 | print("}") 99 | print() 100 | 101 | def make_to_string_mask(typ, arg): 102 | print("std::string") 103 | print("to_string(%s %s)" % (typ, arg)) 104 | print("{") 105 | print(" std::string res;") 106 | for key in enums[typ]: 107 | if key in options.exclude: 108 | continue 109 | print(" if ((%s & %s::%s) == %s::%s) { res += \"%s|\"; %s &= ~%s::%s; }" % \ 110 | (arg, typ, key, typ, key, fmt_value(typ, key), arg, typ, key)) 111 | print(" if (res.empty() || %s != (%s)0) res += %s;" % \ 112 | (arg, typ, expr_remainder(typ, arg))) 113 | print(" else res.pop_back();") 114 | print(" return res;") 115 | print("}") 116 | print() 117 | 118 | def do_enum_body(name, toks): 119 | keys = [] 120 | while True: 121 | key = toks.pop(0) 122 | if key == "}": 123 | assert toks.pop(0) == ";" 124 | enums[name] = keys 125 | return 126 | keys.append(key) 127 | if toks[0] == "=": 128 | toks.pop(0) 129 | toks.pop(0) 130 | if toks[0] == ",": 131 | toks.pop(0) 132 | else: 133 | assert toks[0] == "}" 134 | 135 | def do_qname(toks): 136 | # Get a nested-name-specifier followed by an identifier 137 | res = [] 138 | while True: 139 | res.append(toks.pop(0)) 140 | if toks[0] != "::": 141 | return "::".join(res) 142 | toks.pop(0) 143 | 144 | parser = OptionParser() 145 | parser.add_option("-x", "--exclude", dest="exclude", action="append", 146 | help="exclude FIELD", 
metavar="FIELD", default=[]) 147 | parser.add_option("-u", "--strip-underscore", dest="strip_underscore", 148 | action="store_true", 149 | help="strip leading and trailing underscores") 150 | parser.add_option("-s", "--separator", dest="separator", 151 | help="use SEP between type and field", metavar="SEP", 152 | default="::") 153 | parser.add_option("--hex", dest="hex", action="store_true", 154 | help="return unknown values in hex", default=False) 155 | parser.add_option("--no-type", dest="no_type", action="store_true", 156 | help="omit type") 157 | parser.add_option("--mask", dest="mask", action="append", 158 | help="treat TYPE as a bit-mask", metavar="TYPE", default=[]) 159 | (options, args) = parser.parse_args() 160 | if args: 161 | parser.error("expected 0 arguments") 162 | 163 | do_top_level(list(read_toks())) 164 | -------------------------------------------------------------------------------- /freud-dwarf/elf/mmap_loader.cc: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2013 Austin T. Clements. All rights reserved. 2 | // Use of this source code is governed by an MIT license 3 | // that can be found in the LICENSE file. 
4 | 5 | #include "elf++.hh" 6 | 7 | #include <system_error> 8 | 9 | #include <sys/types.h> 10 | #include <sys/stat.h> 11 | #include <sys/mman.h> 12 | #include <fcntl.h> 13 | #include <unistd.h> 14 | 15 | using namespace std; 16 | 17 | ELFPP_BEGIN_NAMESPACE 18 | 19 | class mmap_loader : public loader 20 | { 21 | void *base; 22 | size_t lim; 23 | 24 | public: 25 | mmap_loader(int fd) 26 | { 27 | off_t end = lseek(fd, 0, SEEK_END); 28 | if (end == (off_t)-1) 29 | throw system_error(errno, system_category(), 30 | "finding file length"); 31 | lim = end; 32 | 33 | base = mmap(nullptr, lim, PROT_READ, MAP_SHARED, fd, 0); 34 | if (base == MAP_FAILED) 35 | throw system_error(errno, system_category(), 36 | "mmap'ing file"); 37 | close(fd); 38 | } 39 | 40 | ~mmap_loader() 41 | { 42 | munmap(base, lim); 43 | } 44 | 45 | const void *load(off_t offset, size_t size) 46 | { 47 | if (offset + size > lim) 48 | throw range_error("offset exceeds file size"); 49 | return (const char*)base + offset; 50 | } 51 | }; 52 | 53 | std::shared_ptr<loader> 54 | create_mmap_loader(int fd) 55 | { 56 | return make_shared<mmap_loader>(fd); 57 | } 58 | 59 | ELFPP_END_NAMESPACE 60 | -------------------------------------------------------------------------------- /freud-dwarf/elf/to_hex.hh: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2013 Austin T. Clements. All rights reserved. 2 | // Use of this source code is governed by an MIT license 3 | // that can be found in the LICENSE file. 
4 | 5 | #ifndef _ELFPP_TO_HEX_HH_ 6 | #define _ELFPP_TO_HEX_HH_ 7 | 8 | #include <string> 9 | #include <type_traits> 10 | 11 | template<typename T> 12 | std::string 13 | to_hex(T v) 14 | { 15 | static_assert(std::is_integral<T>::value, 16 | "to_hex applied to non-integral type"); 17 | if (v == 0) 18 | return std::string("0"); 19 | char buf[sizeof(T)*2 + 1]; 20 | char *pos = &buf[sizeof(buf)-1]; 21 | *pos-- = '\0'; 22 | while (v && pos >= buf) { 23 | int digit = v & 0xf; 24 | if (digit < 10) 25 | *pos = '0' + digit; 26 | else 27 | *pos = 'a' + (digit - 10); 28 | pos--; 29 | v >>= 4; 30 | } 31 | return std::string(pos + 1); 32 | } 33 | 34 | #endif // _ELFPP_TO_HEX_HH_ 35 | -------------------------------------------------------------------------------- /freud-dwarf/instr-expr-context.cc: -------------------------------------------------------------------------------- 1 | #include "instr-expr-context.hh" 2 | 3 | extern bool ctx_used; 4 | 5 | instr_expr_context::instr_expr_context (dwarf::taddr ppc) : ipc{ppc} { 6 | // Using 8 registers: RDI .. 
R9 + RSP + RBP 7 | for (int r = 0; r < CONTEXT_REGS; r++) { 8 | registers[r].reg = r; 9 | registers[r].offset = 0; 10 | } 11 | } 12 | 13 | dwarf::register_content instr_expr_context::reg (unsigned regnum) { 14 | ctx_used = true; 15 | return registers[regnum]; 16 | } 17 | 18 | void instr_expr_context::set_reg(unsigned regnum, unsigned newreg, int64_t offset) { 19 | registers[regnum].reg = newreg; 20 | registers[regnum].offset = offset; 21 | } 22 | 23 | void instr_expr_context::add_reg_offset(unsigned regnum, int64_t off) { 24 | registers[regnum].offset += off; 25 | } 26 | 27 | void instr_expr_context::set_cfa(unsigned regnum, int64_t offset) { 28 | //std::cout << "Setting CFA_ to " << regnum << "; " << offset << std::endl; 29 | cfa.regnum = regnum; 30 | cfa.offset = offset; 31 | } 32 | 33 | dwarf::cfa_eval_result instr_expr_context::get_cfa() { 34 | return cfa; 35 | } 36 | 37 | unsigned instr_expr_context::get_cfa_reg() { 38 | return cfa.regnum; 39 | } 40 | 41 | int64_t instr_expr_context::get_cfa_offset() { 42 | return cfa.offset; 43 | } 44 | 45 | void instr_expr_context::set_cfa_reg(unsigned regnum) { 46 | //std::cout << "Setting CFA_REG to " << regnum << std::endl; 47 | cfa.regnum = regnum; 48 | } 49 | 50 | void instr_expr_context::set_cfa_offset(int64_t off) { 51 | //std::cout << "Setting CFA_OFF to " << off << std::endl; 52 | cfa.offset = off; 53 | } 54 | 55 | dwarf::taddr instr_expr_context::pc() { 56 | return ipc; 57 | } 58 | 59 | dwarf::taddr instr_expr_context::deref_size (dwarf::taddr address, unsigned size) { 60 | //TODO take into account size 61 | ctx_used = true; 62 | printf("ASKING FOR DEREF %u\n", size); 63 | return 0; 64 | } 65 | 66 | -------------------------------------------------------------------------------- /freud-dwarf/instr-expr-context.hh: -------------------------------------------------------------------------------- 1 | #ifndef INSTR_EXPR_CONTEXT_HH_INCLUDED 2 | #define INSTR_EXPR_CONTEXT_HH_INCLUDED 3 | 4 | #include 5 | #include 6 | 7 | 
#include "dwarf++.hh" 8 | #include "elf++.hh" 9 | #include "structures.hh" 10 | 11 | class instr_expr_context : public dwarf::expr_context { 12 | public: 13 | instr_expr_context (dwarf::taddr ppc); 14 | 15 | dwarf::register_content reg (unsigned regnum) override; 16 | 17 | void set_reg(unsigned regnum, unsigned newreg, int64_t offset); 18 | 19 | void add_reg_offset(unsigned regnum, int64_t off); 20 | 21 | void set_cfa(unsigned regnum, int64_t offset); 22 | 23 | dwarf::cfa_eval_result get_cfa(); 24 | 25 | unsigned get_cfa_reg(); 26 | 27 | int64_t get_cfa_offset(); 28 | 29 | void set_cfa_reg(unsigned regnum); 30 | 31 | void set_cfa_offset(int64_t off); 32 | 33 | dwarf::taddr pc() override; 34 | 35 | dwarf::taddr deref_size (dwarf::taddr address, unsigned size) override; 36 | 37 | private: 38 | dwarf::taddr ipc; 39 | struct dwarf::register_content registers[CONTEXT_REGS]; 40 | struct dwarf::cfa_eval_result cfa; 41 | }; 42 | 43 | 44 | #endif 45 | -------------------------------------------------------------------------------- /freud-dwarf/structures.hh: -------------------------------------------------------------------------------- 1 | #ifndef STRUCTURES_HH_DEFINED 2 | #define STRUCTURES_HH_DEFINED 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | 11 | #include "utils.hh" 12 | 13 | #define MAX_DEPTH 3 14 | #define MAX_FEATURES 512 15 | #define WITH_STRUCT_IN_REG 16 | #define CONTEXT_REGS 8 17 | //#define USE_VECTORS_FOR_VARIADIC 18 | #define WITH_GLOBAL_VARIABLES 19 | 20 | // Cross information from: 21 | // - https://github.com/libunwind/libunwind/blob/d32956507cf29d9b1a98a8bce53c78623908f4fe/src/x86_64/init.h 22 | // - https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI 23 | // - readelf output for XMM registers 24 | // -----------------------------------------------------------> INTs and addresses 25 | // 0 should be RAX, return value (smaller than 64bit) in amd64 26 | // 1 should be RDX, 2 in amd64 27 | // 2 
should be RCX, 3 in amd64 28 | // 3 should be RBX, not used 29 | // 4 should be RSI, 1 in amd64 30 | // 5 should be RDI, 0 in amd64 31 | // 6 should be RBP, 7 in my PinTool 32 | // 7 should be RSP, 6 in my PinTool 33 | // 8 should be R8, 4 in amd64 34 | // 9 should be R9, 5 in amd64 35 | // ... 36 | // 15 should be R15, not used 37 | // 16 should be RIP, not used 38 | // -----------------------------------------------------------> FLOATs 39 | // 17 is XMM0 in amd64 40 | // ... 41 | // 24 is XMM 7 42 | 43 | // Here another apparently good source, with also x86 and ARM, should it ever become 44 | // necessary... 45 | // https://www.imperialviolet.org/2017/01/18/cfi.html 46 | 47 | const std::vector dwarf2amd64 { -1, 2, 3, -1, 1, 0, 7, 6, 4, 5, -1, -1, -1, -1, -1, -1, -1, 8, 9, 10, 11, 12, 13, 14, 15 }; 48 | 49 | struct hierarchy_tree_node { 50 | hierarchy_tree_node(std::string n, std::string l): class_name(n), linkage_name(l) { 51 | n = std::to_string(n.size()) + n; 52 | unsigned char * tmp; 53 | tmp = (unsigned char *)n.c_str(); 54 | uint64_t h = utils::hash(tmp); 55 | sym_name_hash = std::to_string(h); 56 | 57 | h = utils::hash((unsigned char *)l.c_str()); 58 | sym_lname_hash = std::to_string(h); 59 | utils::log(VL_DEBUG, "HASH: " + class_name + " -> " + sym_name_hash); 60 | }; 61 | std::string class_name; 62 | std::string linkage_name; 63 | std::string sym_name_hash; 64 | std::string sym_lname_hash; 65 | std::set maybe_real_type; 66 | std::unordered_map offsets; 67 | }; 68 | 69 | struct array_info { 70 | uint32_t dims; 71 | std::vector counts; 72 | array_info(): dims(0) {}; 73 | }; 74 | 75 | struct member { 76 | std::list feat_names; 77 | std::string type_name; 78 | std::vector offset; 79 | std::vector pointer; 80 | struct array_info ai; 81 | }; 82 | 83 | struct parameter { 84 | std::string name; 85 | std::string type_name; 86 | std::string linkage_type_name; 87 | struct array_info ai; 88 | uint64_t location; 89 | int64_t offset_from_location; 90 | bool uses_input; 
91 | bool in_XMM; 92 | bool is_ptr; 93 | bool is_addr; 94 | bool is_register_with_addr; 95 | }; 96 | 97 | struct feature_info { 98 | std::string out_name; 99 | unsigned char num_of_values; 100 | feature_info() { out_name = "auto_created"; num_of_values = 0;}; 101 | feature_info(std::string n, unsigned char f): out_name(n), num_of_values(f) {}; 102 | }; 103 | 104 | const std::unordered_map basic_features { 105 | {"bool",{"b",1}}, 106 | {"char",{"s",1}}, {"unsigned char",{"n",1}}, {"signed char",{"n",1}}, 107 | {"short int",{"n",1}}, {"short unsigned int",{"n",1}}, 108 | {"int",{"n",1}}, {"unsigned int",{"n",1}}, 109 | {"long int",{"n", 1}}, {"long unsigned int",{"n",1}}, 110 | {"long long int",{"n",1}}, {"long long unsigned int",{"n",1}}, {"size_t",{"n",1}}, 111 | {"float",{"n",1}}, {"double",{"n",1}}, 112 | 113 | // ARRAYS 114 | {"array bool",{"a",2}}, 115 | {"array char",{"a",2}}, {"array unsigned char",{"a",2}}, {"array signed char",{"a",2}}, 116 | {"array int", {"a", 2}}, {"array unsigned int", {"a", 2}}, 117 | {"array long int",{"a", 2}}, {"array long unsigned int",{"a", 2}}, 118 | {"array short int", {"a", 2}}, {"array short unsigned int", {"a", 2}}, 119 | {"array long long int",{"a",2}}, {"array long long unsigned int",{"a",2}}, {"array size_t",{"a",2}}, 120 | {"array float",{"a",2}}, {"array double",{"a",2}}, 121 | 122 | // ENUMS 123 | {"enum int",{"e",1}}, {"enum unsigned int",{"e",1}}, {"enum unsigned char",{"e",1}}, 124 | {"enum long",{"e",1}}, {"enum unsigned long",{"e",1}}, 125 | 126 | }; 127 | 128 | 129 | #endif 130 | -------------------------------------------------------------------------------- /freud-dwarf/table-generator.cc: -------------------------------------------------------------------------------- 1 | #include "table-generator.hh" 2 | 3 | extern std::unordered_map> artificial_features; 4 | 5 | /// ************ PRIVATE ************ //// 6 | bool table_generator::check_features(dwarf_explorer * de, const std::string sym_name, const 
std::string type_name) { 7 | if (de->type_is_undefined(sym_name)) 8 | return false; 9 | 10 | // --- SWITCH: complex vs basic type 11 | if (de->hierarchy_tree_nodes_map.find(sym_name) != de->hierarchy_tree_nodes_map.end()) { 12 | // --- SWITCH: complex type 13 | 14 | if (de->hierarchy_tree_nodes_map.at(sym_name)->maybe_real_type.size()) { 15 | // if polymorphic, there should be at least the vptr 16 | // check if that is true... 17 | bool res = false; 18 | for (auto member: de->types.at(type_name)) { 19 | if (member.feat_names.front().find("_vptr") == 0) { 20 | res = true; 21 | break; 22 | } 23 | } 24 | return res; 25 | } else { 26 | // not polymorphic 27 | return !de->types.at(type_name).empty(); 28 | } 29 | } else if (basic_features.find(type_name) != basic_features.end()) { 30 | // --- SWITCH: primitive type 31 | return (basic_features.at(type_name).num_of_values > 0); 32 | } else { 33 | utils::log(VL_ERROR, "Could not find type " + type_name); 34 | return false; 35 | } 36 | } 37 | 38 | void table_generator::create_table_entries_for_classes_dfs(dwarf_explorer * de, int tree_size, std::string &descriptors_code, const std::string sym_name, const std::string ppname) { 39 | struct hierarchy_tree_node * htn = de->hierarchy_tree_nodes_map.at(sym_name); 40 | descriptors_code += get_table_entry_for_node(de, tree_size, htn, ppname); 41 | 42 | // GO DFS 43 | for (hierarchy_tree_node * htn_c: htn->maybe_real_type) { 44 | create_table_entries_for_classes_dfs(de, tree_size, descriptors_code, htn_c->class_name, ppname); 45 | } 46 | } 47 | 48 | std::string table_generator::get_table_entry_for_node(dwarf_explorer * de, int tree_size, const struct hierarchy_tree_node * mrt, const std::string ppname) { 49 | std::string descriptors_code; 50 | // HASH OF THE RUNTIME OBJECT TYPE 51 | if (tree_size > 1) 52 | descriptors_code = mrt->sym_lname_hash + "\n"; 53 | else 54 | descriptors_code = mrt->sym_name_hash + "\n"; 55 | 56 | std::string pp_type_name = "class " + mrt->class_name; 57 | if 
(de->types.find(pp_type_name) == de->types.end()) { 58 | pp_type_name = "struct " + mrt->class_name; 59 | if (de->types.find(pp_type_name) == de->types.end()) { 60 | utils::log(VL_DEBUG, "Need to find definition for " + pp_type_name); 61 | std::vector empty_vec; 62 | std::unordered_set empty_uset; 63 | std::list members; 64 | std::string name = ""; 65 | bool false_bool = false; 66 | struct array_info empty_ai; 67 | std::string empty_str = ""; 68 | unsigned int num_features = 0; 69 | std::string complex_type; 70 | bool found = de->find_definition(mrt->class_name, empty_vec, empty_uset, members, empty_vec, name, false_bool, empty_ai, false_bool, empty_str, num_features, complex_type); 71 | if (found) { 72 | pp_type_name = complex_type + mrt->class_name; 73 | de->types[pp_type_name] = members; 74 | } else { 75 | utils::log(VL_ERROR, "Could not find definition for " + pp_type_name); 76 | descriptors_code += std::to_string(0) + "\n"; 77 | return descriptors_code; 78 | } 79 | } 80 | } 81 | 82 | unsigned int num_of_features = 0; 83 | std::string member_names = ""; 84 | bool first = false, last = false, size_found = false; 85 | struct member first_m, last_m; 86 | for (struct member m: de->types[pp_type_name]) { 87 | if (basic_features.find(m.type_name) != basic_features.end()) { 88 | if (basic_features.at(m.type_name).num_of_values == 0) { 89 | utils::log(VL_ERROR, "Missing basic type " + m.type_name); 90 | exit(-1); 91 | } 92 | num_of_features += basic_features.at(m.type_name).num_of_values; 93 | for (std::string n: m.feat_names) { 94 | member_names += m.type_name + "\n"; 95 | member_names += ppname + n + "\n"; 96 | } 97 | if (m.feat_names.size() != basic_features.at(m.type_name).num_of_values) { 98 | utils::log(VL_ERROR, "Inconsistent data! 
T: " + pp_type_name + " - " + m.feat_names.front() + "; " + std::to_string(m.feat_names.size()) + " - " + std::to_string(basic_features.at(m.type_name).num_of_values)); 99 | utils::log(VL_ERROR, m.type_name + " should have " + std::to_string(basic_features.at(m.type_name).num_of_values) + " values, not " + std::to_string(m.feat_names.size())); 100 | for (std::string fname: m.feat_names) { 101 | utils::log(VL_ERROR, m.type_name + " " + fname); 102 | } 103 | exit(-1); 104 | } 105 | } else { 106 | utils::log(VL_ERROR, "Missing basic type " + m.type_name); 107 | exit(-1); 108 | } 109 | 110 | /* 111 | * HEURISTIC, PART1: LOOK FOR PAIRS LIKE (START,END), (FIRST,LAST)... 112 | */ 113 | std::string uname = utils::uppercase(m.feat_names.front()); 114 | if ( 115 | uname.find("FIRST") != std::string::npos || 116 | uname.find("START") != std::string::npos || 117 | uname.find("BEGIN") != std::string::npos 118 | ) { 119 | if (first) { 120 | utils::log(VL_DEBUG, ppname + ": there's more than one possible first member for heuristic 1, skipping the last"); 121 | } else { 122 | first = true; 123 | first_m = m; 124 | } 125 | } 126 | else if ( 127 | uname.find("LAST") != std::string::npos || 128 | uname.find("STOP") != std::string::npos || 129 | uname.find("FINISH") != std::string::npos || 130 | uname.find("END") != std::string::npos 131 | ) { 132 | if (last) { 133 | utils::log(VL_DEBUG, ppname + ": there's more than one possible last member for heuristic 1, skipping the last"); 134 | } else { 135 | last = true; 136 | last_m = m; 137 | } 138 | } 139 | } 140 | if (first && last && first_m.type_name == last_m.type_name) { 141 | size_found = true; 142 | // TODO: remove this hack 143 | // right now I know I have only one artificial feature, but not necessarily true 144 | if (artificial_features[pp_type_name].empty()) 145 | artificial_features[pp_type_name].push_back("size"); 146 | } 147 | if (artificial_features.find(pp_type_name) != artificial_features.end()) { 148 | num_of_features += 
artificial_features[pp_type_name].size(); 149 | for (std::string af_name: artificial_features[pp_type_name]) { 150 | member_names += "unsigned int\n"; // right now it's only size, so unsigned int should be good 151 | member_names += ppname + ".artificial." + af_name + "\n"; 152 | } 153 | } 154 | 155 | // NUM_OF_FEATURES 156 | descriptors_code += std::to_string(num_of_features) + "\n"; 157 | // FEATURE NAMES 158 | descriptors_code += member_names; 159 | return descriptors_code; 160 | } 161 | 162 | 163 | /// ************ PUBLIC ************ //// 164 | void table_generator::create_table(dwarf_explorer * de, std::string tbl_filename) { 165 | std::ofstream descf(tbl_filename); 166 | std::string descriptors_code = ""; 167 | 168 | // FOR EACH SYMBOL 169 | for (auto p: de->func_pars_map) { 170 | if (p.second.size() == 0) 171 | continue; // no features, skip 172 | 173 | descriptors_code += "###\n"; 174 | // SYMBOL NAME 175 | descriptors_code += p.first + "\n"; 176 | if (de->func_addr_map.find(p.first) == de->func_addr_map.end()) { 177 | utils::log(VL_ERROR, "Couldn't find entry address for " + p.first); 178 | exit(-1); 179 | } 180 | 181 | // INSTRUMENTATION ENTRY POINT 182 | descriptors_code += std::to_string(de->func_addr_map[p.first]) + "\n"; 183 | 184 | size_t sz = p.second.size(); 185 | std::unordered_set to_skip; 186 | 187 | for (auto pp: p.second) { 188 | std::string sym_name = pp.type_name.substr(pp.type_name.find(" ") + 1, std::string::npos); 189 | 190 | // Check whether it actually has features 191 | if (check_features(de, sym_name, pp.type_name) == false) { 192 | to_skip.insert(sym_name); 193 | sz--; 194 | } 195 | } 196 | 197 | // NUM OF FORMAL PARAMETERS 198 | descriptors_code += std::to_string(sz) + "\n"; 199 | for (auto pp: p.second) { 200 | std::string sym_name = pp.type_name.substr(pp.type_name.find(" ") + 1, std::string::npos); 201 | if (to_skip.find(sym_name) == to_skip.end()) { //!de->type_is_undefined(sym_name)) { 202 | /***** 203 | * DESCRIPTOR INFO 204 
| ****/ 205 | 206 | // LOCATION 207 | descriptors_code += std::to_string(pp.location) + "\n"; 208 | 209 | // OFFSET FROM LOCATION 210 | descriptors_code += std::to_string(pp.offset_from_location) + "\n"; 211 | 212 | // IS_ADDR or REG. CONTAINING ADDR 213 | descriptors_code += std::to_string(pp.is_addr + pp.is_register_with_addr) + "\n"; 214 | 215 | // IS_PTR; THE PARAMETER REPRESENTS A PTR 216 | descriptors_code += std::to_string(pp.is_ptr) + "\n"; 217 | 218 | // NAME OF THE TYPE OF THE FORMAL PARAMETER 219 | descriptors_code += pp.type_name + "\n"; 220 | 221 | // --- SWITCH: complex vs basic type 222 | if (de->hierarchy_tree_nodes_map.find(sym_name) != de->hierarchy_tree_nodes_map.end()) { 223 | // --- SWITCH: complex type 224 | 225 | // NUMBER OF POTENTIAL RUNTIME TYPES (for polymorphic types) 226 | int tree_size = de->get_class_graph_size(sym_name); 227 | descriptors_code += std::to_string(tree_size) + "\n"; 228 | 229 | create_table_entries_for_classes_dfs(de, tree_size, descriptors_code, sym_name, pp.name); 230 | } else if (basic_features.find(pp.type_name) != basic_features.end()) { 231 | // --- SWITCH: primitive type 232 | 233 | // NUMBER OF POTENTIAL RUNTIME TYPES (for basic types) 234 | descriptors_code += std::to_string(1) + "\n"; 235 | 236 | // HASH OF THE RUNTIME OBJECT TYPE (always 0) 237 | descriptors_code += std::to_string(0) + "\n"; 238 | 239 | // NUM OF FEATURES 240 | unsigned feat_num = basic_features.at(pp.type_name).num_of_values; 241 | descriptors_code += std::to_string(feat_num) + "\n"; 242 | 243 | // FEAT_NAMES 244 | for (int f = 0; f < feat_num; f++) { 245 | descriptors_code += pp.type_name + "\n"; 246 | descriptors_code += pp.name + (f > 0 ? 
std::to_string(f) : "") + "\n"; 247 | } 248 | // TODO: handle arrays decently 249 | // Arrays have 2 values, where the second is an aggregate function of the values in the array 250 | //if (pp.type_name.find("array ") == 0) 251 | // descriptors_code += "artificial_" + pp.name + "_aggr\n"; 252 | } else { 253 | utils::log(VL_ERROR, "Could not find type " + pp.type_name); 254 | // JUST SAY 0 RUNTIME TYPES 255 | descriptors_code += std::to_string(0) + "\n"; 256 | } 257 | } else { 258 | utils::log(VL_ERROR, "Could not find type " + sym_name); 259 | } 260 | // ########################################################## 261 | } 262 | } 263 | descf << descriptors_code; 264 | descf.close(); 265 | } 266 | 267 | -------------------------------------------------------------------------------- /freud-dwarf/table-generator.hh: -------------------------------------------------------------------------------- 1 | #ifndef TABLE_GENERATOR_HH_INCLUDED 2 | #define TABLE_GENERATOR_HH_INCLUDED 3 | 4 | #include "structures.hh" 5 | #include "dwarf-explorer.hh" 6 | #include 7 | 8 | class table_generator { 9 | private: 10 | static bool check_features(dwarf_explorer * de, const std::string sym_name, const std::string type_name); 11 | static void create_table_entries_for_classes_dfs(dwarf_explorer * de, int tree_size, std::string &descriptors_code, const std::string sym_name, const std::string ppname); 12 | static std::string get_table_entry_for_node(dwarf_explorer * de, int tree_size, const struct hierarchy_tree_node * mrt, const std::string ppname); 13 | 14 | public: 15 | static void create_table(dwarf_explorer * de, std::string tbl_filename); 16 | }; 17 | 18 | #endif 19 | -------------------------------------------------------------------------------- /freud-dwarf/table_intf.txt: -------------------------------------------------------------------------------- 1 | [ for each symbol ] 2 | ### 3 | symbol_name 4 | instr_entry_point 5 | param_num 6 | [ for each param ] 7 | location 8 | 
offset_from_location 9 | is_addr + is_reg_with_addr 10 | is_ptr 11 | type_name 12 | number_of_feature_sets 13 | [ for each feature_set ] 14 | hash_of_the_feature_set (0 if basic_type) 15 | num_of_features 16 | [ for each feature of the set ] 17 | type_name 18 | fname 19 | -------------------------------------------------------------------------------- /freud-dwarf/utils.cc: -------------------------------------------------------------------------------- 1 | #include "utils.hh" 2 | 3 | enum verbosity_levels vl = VL_INFO; 4 | 5 | std::string utils::log_label(enum verbosity_levels vl) { 6 | if (vl == VL_ERROR) { 7 | return "ERR: "; 8 | } else if (vl == VL_INFO) { 9 | return "INFO: "; 10 | } else if (vl == VL_DEBUG) { 11 | return "DBG: "; 12 | } 13 | return "UNK: "; 14 | } 15 | 16 | void utils::log(enum verbosity_levels l, const std::string msg) { 17 | if (l <= vl) 18 | std::cout << log_label(l) << msg << std::endl; 19 | } 20 | 21 | std::string utils::uppercase(std::string str) { 22 | std::string res = ""; 23 | for (auto c: str) res += toupper(c); 24 | return res; 25 | } 26 | 27 | void utils::validate_f_name(std::string &t) { 28 | std::replace(t.begin(), t.end(), '~', '_'); 29 | std::replace(t.begin(), t.end(), '.', '_'); 30 | std::replace(t.begin(), t.end(), '-', '_'); 31 | std::replace(t.begin(), t.end(), '&', '_'); 32 | std::replace(t.begin(), t.end(), '\'', '_'); 33 | std::replace(t.begin(), t.end(), ')', '_'); 34 | std::replace(t.begin(), t.end(), '(', '_'); 35 | std::replace(t.begin(), t.end(), '*', '_'); 36 | std::replace(t.begin(), t.end(), ':', '_'); 37 | std::replace(t.begin(), t.end(), ',', '_'); 38 | std::replace(t.begin(), t.end(), '>', '_'); 39 | std::replace(t.begin(), t.end(), '<', '_'); 40 | std::replace(t.begin(), t.end(), ' ', '_'); 41 | std::replace(t.begin(), t.end(), '[', '_'); 42 | std::replace(t.begin(), t.end(), ']', '_'); 43 | std::replace(t.begin(), t.end(), '=', '_'); 44 | std::replace(t.begin(), t.end(), ';', '_'); 45 | } 46 | 47 | 
std::string utils::copy_validate_f_name(std::string in) { 48 | std::string res = in; 49 | validate_f_name(res); 50 | return res; 51 | } 52 | 53 | std::string utils::get_function_name(std::string type) { 54 | std::string fname = type; 55 | validate_f_name(fname); 56 | fname = "get_" + fname + "_features"; 57 | return fname; 58 | } 59 | 60 | std::string utils::get_signature(std::string type) { 61 | std::string sig = type; 62 | sig = "void " + get_function_name(type) + "(" + type + " value, double *values);\n\n"; 63 | sig += "void " + get_function_name(type) + "_ptr(void * value, double *values);\n\n"; 64 | return sig; 65 | } 66 | 67 | unsigned long utils::hash(unsigned char *str) { 68 | unsigned long hash = 5381; 69 | int c; 70 | while (c = *str++) 71 | hash = ((hash << 5) + hash) + c; 72 | return hash; 73 | } 74 | 75 | -------------------------------------------------------------------------------- /freud-dwarf/utils.hh: -------------------------------------------------------------------------------- 1 | #ifndef UTILS_HH_DEFINED 2 | #define UTILS_HH_DEFINED 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | 9 | enum verbosity_levels { 10 | VL_ERROR = 0, 11 | VL_QUIET, 12 | VL_INFO, 13 | VL_DEBUG 14 | }; 15 | 16 | class utils { 17 | private: 18 | static std::string log_label(enum verbosity_levels vl); 19 | 20 | public: 21 | static void log(enum verbosity_levels l, std::string msg); 22 | static std::string uppercase(std::string str); 23 | static void validate_f_name(std::string &t); 24 | static std::string copy_validate_f_name(std::string in); 25 | static std::string get_function_name(std::string type); 26 | static std::string get_signature(std::string type); 27 | static unsigned long hash(unsigned char *str); 28 | }; 29 | 30 | #endif 31 | -------------------------------------------------------------------------------- /freud-pin/dumper.cc: -------------------------------------------------------------------------------- 1 | size_t 
write_logs_to_file(std::string rtn_name, const struct routine_descriptor & desc) { 2 | time_t now = time(NULL); 3 | std::stringstream now_ss; 4 | now_ss.str(""); 5 | mkdir("symbols/", S_IRWXU | S_IRWXG); 6 | std::string folder = "symbols/" + rtn_name; 7 | now_ss << folder << "/idcm_" << rtn_name << "_" << now << ".bin"; 8 | size_t samples_count = 0; 9 | for (THREADID tid = 0; tid < MAXTHREADS; tid++) { 10 | samples_count += desc.get_stored_history_size(tid); // I do not want to keep a counter in the struct, since I would need to sync on that 11 | } 12 | if (samples_count == 0) { 13 | return 0; // do not print unnecessary things 14 | } 15 | mkdir(folder.c_str(), S_IRWXU | S_IRWXG); 16 | ofstream outFile(now_ss.str().c_str(), std::ios::binary); 17 | if (!outFile.is_open()) { 18 | log(VL_ERROR, "Could not open outfile " + now_ss.str()); 19 | return 0; 20 | } 21 | 22 | // **** RTN NAME **** 23 | uint32_t name_len = rtn_name.size(); 24 | outFile.write((char *)&name_len, sizeof(uint32_t)); 25 | outFile.write(rtn_name.c_str(), sizeof(char) * name_len); 26 | 27 | // **** FEATURES NAMES (incl. sys.) 
**** 28 | // print all the names of all the possible features 29 | // we will point to these strings afterwards 30 | unordered_map fname_offsets; 31 | unordered_map ftype_offsets; 32 | set ftype_names; 33 | uint32_t tot_fnames = 0; 34 | std::fpos tot_fnames_position = outFile.tellp(); 35 | outFile.write((char *)&tot_fnames, sizeof(uint32_t)); 36 | 37 | for (auto p: desc.params) { 38 | for (auto rttf: p.runtime_type_to_features) { 39 | for (struct primitive_feature pf: rttf.second) { 40 | string fname = pf.name; 41 | if (fname_offsets.find(fname) == fname_offsets.end()) { 42 | fname_offsets.insert(make_pair(fname, outFile.tellp())); 43 | ftype_names.insert(pf.type); 44 | uint16_t fname_len = fname.size(); 45 | outFile.write((char *)&fname_len, sizeof(uint16_t)); 46 | outFile.write(fname.c_str(), sizeof(char) * fname_len); 47 | tot_fnames++; 48 | } 49 | } 50 | } 51 | } 52 | for (string sfname: desc.system_variable_names) { 53 | if (fname_offsets.find(sfname) == fname_offsets.end()) { 54 | fname_offsets.insert(make_pair(sfname, outFile.tellp())); 55 | uint16_t sfname_len = sfname.size(); 56 | outFile.write((char *)&sfname_len, sizeof(uint16_t)); 57 | outFile.write(sfname.c_str(), sizeof(char) * sfname_len); 58 | tot_fnames++; 59 | } 60 | } 61 | 62 | // **** TYPES NAMES **** 63 | uint32_t tcount = ftype_names.size(); 64 | outFile.write((char *)&tcount, sizeof(uint32_t)); 65 | for (string t: ftype_names) { 66 | ftype_offsets.insert(make_pair(t, outFile.tellp())); 67 | uint16_t ftype_len = t.size(); 68 | outFile.write((char *)&ftype_len, sizeof(uint16_t)); 69 | outFile.write(t.c_str(), sizeof(char) * ftype_len); 70 | } 71 | 72 | // GO BACK TO WRITE THE CORRECT NUM OF FEATURES 73 | std::fpos prev_pos = outFile.tellp(); 74 | outFile.seekp(tot_fnames_position); 75 | outFile.write((char *)&tot_fnames, sizeof(uint32_t)); 76 | outFile.seekp(prev_pos); 77 | 78 | // **** FOR EVERY RUN **** 79 | // I need to correct the number of samples, considering also the samples for which 
time == -1! 80 | // will do it later, though 81 | samples_count = 0; 82 | std::fpos num_of_samples_position = outFile.tellp(); 83 | outFile.write((char *)&samples_count, sizeof(uint32_t)); 84 | 85 | //cout << "Basics " << rtn.first << endl; 86 | for (THREADID tid = 0; tid < MAXTHREADS; tid++) { 87 | if(desc.get_stored_history_size(tid) == 0) 88 | continue; 89 | //cout << "TID " << tid << endl; 90 | 91 | const vector * history = desc.get_stored_history_ptr(tid); 92 | for(uint32_t i = 0; i < history->size(); i++){ 93 | uint64_t tim = (*history)[i]->diff(); 94 | if (tim == UINT64_MAX) 95 | continue; 96 | //cout << "OBS " << i << ": " << (*history)[i]->total_waiting_time << endl; 97 | 98 | samples_count++; 99 | 100 | // **** ID, [ METRICS ] **** 101 | outFile.write((char *)&(*history)[i]->unique_id, sizeof(uint32_t)); 102 | outFile.write((char *)&tim, sizeof(uint64_t)); 103 | outFile.write((char *)&(*history)[i]->allocated_memory, sizeof(uint64_t)); 104 | outFile.write((char *)&(*history)[i]->total_lock_holding_time, sizeof(uint64_t)); 105 | outFile.write((char *)&(*history)[i]->total_waiting_time, sizeof(uint64_t)); 106 | outFile.write((char *)&(*history)[i]->minor_page_faults, sizeof(uint64_t)); 107 | outFile.write((char *)&(*history)[i]->major_page_faults, sizeof(uint64_t)); 108 | 109 | 110 | // **** NUM OF FEATURES **** 111 | uint32_t pn = 0, tot_features = 0; 112 | std::fpos tf_pos = outFile.tellp(); 113 | outFile.write((char *)&tot_features, sizeof(uint32_t)); 114 | uint32_t rot_idx = 0; 115 | string runtime_type; 116 | // LOCAL AND GLOBAL FEATURES 117 | for (; pn < desc.param_count; pn++) { 118 | struct dwarf_formal_parameter dfp = desc.params[pn]; 119 | if (dfp.runtime_type_to_features.size() > 1) { 120 | // No std::to_string() in STL... 
121 | std::ostringstream oss; 122 | oss << (*history)[i]->runtime_types[rot_idx++]; 123 | runtime_type = oss.str(); 124 | } else if (dfp.type_name.find("class ") == 0 || 125 | dfp.type_name.find("struct ") == 0) { 126 | std::ostringstream oss; 127 | string stmp = dfp.type_name.substr(dfp.type_name.find(" ") + 1, string::npos); 128 | oss << stmp.size(); 129 | stmp = oss.str() + stmp; 130 | oss.str(""); 131 | oss << freud_hash((const unsigned char *)stmp.c_str()); 132 | runtime_type = oss.str(); 133 | } else { 134 | // basic type 135 | runtime_type = "0"; 136 | } 137 | 138 | if (dfp.runtime_type_to_features.find(runtime_type) == dfp.runtime_type_to_features.end()) { 139 | log(VL_ERROR, "Warning, found unknown dynamic type " + runtime_type + " for " + dfp.type_name); 140 | //exit(-1); 141 | } else { 142 | for (struct primitive_feature pf: dfp.runtime_type_to_features[runtime_type]) { 143 | if (tot_features >= (*history)[i]->feature_values.size()) { 144 | log(VL_ERROR, "Fewer features than expected in " + rtn_name + "!"); 145 | log(VL_ERROR, "Type: " + runtime_type); 146 | std::ostringstream oss; 147 | oss << "Got "; // str("Got ") would leave the put position at 0, so the next << would overwrite the label 148 | oss << (*history)[i]->feature_values.size(); 149 | log(VL_ERROR, oss.str()); 150 | oss.str(""); oss << "Feat "; 151 | oss << pf.name; 152 | log(VL_ERROR, oss.str()); 153 | exit(-1); 154 | } 155 | uint64_t offs = fname_offsets[pf.name]; 156 | uint64_t toffs = ftype_offsets[pf.type]; 157 | outFile.write((char *)&offs, sizeof(uint64_t)); 158 | outFile.write((char *)&toffs, sizeof(uint64_t)); 159 | outFile.write((char *)&((*history)[i]->feature_values[tot_features++]), sizeof(int64_t)); 160 | } 161 | } 162 | } 163 | if (tot_features != (*history)[i]->feature_values.size()) { 164 | log(VL_ERROR, "More features than expected in " + rtn_name + "!"); 165 | log(VL_ERROR, runtime_type); 166 | std::ostringstream oss; 167 | oss << "Exp "; 168 | oss << tot_features << " / Got " << (*history)[i]->feature_values.size(); 169 | log(VL_ERROR, oss.str()); 170 | exit(-1); 171 | 
} 172 | 173 | // SYSTEM FEATURES 174 | for (uint16_t f = 0; f < ((*history)[i])->system_feature_values.size(); f++) { 175 | // System features have 1! value by definition 176 | uint64_t offs = fname_offsets[desc.system_variable_names[f]]; 177 | uint64_t toffs = 0; // Fake, it's a placeholder 178 | outFile.write((char *)&offs, sizeof(uint64_t)); 179 | outFile.write((char *)&toffs, sizeof(uint64_t)); 180 | outFile.write((char *)&((*history)[i]->system_feature_values[f]), sizeof(int64_t)); 181 | tot_features++; 182 | } 183 | prev_pos = outFile.tellp(); 184 | outFile.seekp(tf_pos); 185 | outFile.write((char *)&tot_features, sizeof(uint32_t)); 186 | outFile.seekp(prev_pos); 187 | 188 | // BRANCHES 189 | uint32_t num_of_branches = (*history)[i]->branches.size(); 190 | outFile.write((char *)&num_of_branches, sizeof(uint32_t)); 191 | for (const auto & b: (*history)[i]->branches) { 192 | outFile.write((char *)&b.first, sizeof(uint16_t)); 193 | uint32_t num_of_executions = b.second.size(); 194 | outFile.write((char *)&num_of_executions, sizeof(uint32_t)); 195 | for (bool t: b.second) { 196 | outFile.write((char *)&t, sizeof(bool)); 197 | } 198 | } 199 | 200 | // CHILDREN 201 | uint32_t num_of_children = (*history)[i]->children.size(); 202 | outFile.write((char *)&num_of_children, sizeof(uint32_t)); 203 | for (uint32_t c: (*history)[i]->children) 204 | outFile.write((char *)&c, sizeof(uint32_t)); 205 | } 206 | } 207 | prev_pos = outFile.tellp(); 208 | outFile.seekp(num_of_samples_position); 209 | outFile.write((char *)&samples_count, sizeof(uint32_t)); 210 | outFile.seekp(prev_pos); 211 | outFile.close(); 212 | return samples_count; 213 | } 214 | 215 | VOID dump_logs(VOID * arg) { 216 | size_t tot_w = 0; 217 | while (!quit_dump_thread) { 218 | PIN_Sleep(DUMP_LOG_PERIOD); 219 | // SWITCH ALL DESCRIPTORS 220 | for (std::pair ar: routines_catalog) { 221 | routines_catalog[ar.first].switch_history(); 222 | } 223 | 224 | // DUMP 225 | for (std::pair ar: routines_catalog) { 226 | 
// BINARY OUTPUT 227 | tot_w += write_logs_to_file(ar.first, ar.second); 228 | } 229 | }; 230 | if (tot_w > 0) { 231 | std::ostringstream oss; 232 | oss << "Written info with " << tot_w << " samples"; 233 | log(VL_INFO, oss.str()); 234 | } 235 | else { 236 | log(VL_ERROR, "Nothing collected!"); 237 | } 238 | } 239 | -------------------------------------------------------------------------------- /freud-pin/logger.cc: -------------------------------------------------------------------------------- 1 | enum verbosity_levels { 2 | VL_ERROR = 0, 3 | VL_QUIET, 4 | VL_INFO, 5 | VL_DEBUG 6 | }; 7 | 8 | enum verbosity_levels vl = VL_DEBUG; 9 | 10 | std::string log_label(enum verbosity_levels vl) { 11 | if (vl == VL_ERROR) { 12 | return "ERR: "; 13 | } else if (vl == VL_INFO) { 14 | return "INFO: "; 15 | } else if (vl == VL_DEBUG) { 16 | return "DBG: "; 17 | } 18 | return "UNK: "; 19 | } 20 | 21 | void log(enum verbosity_levels l, const std::string msg) { 22 | if (l <= vl) 23 | std::cout << log_label(l) << msg << std::endl; 24 | } 25 | 26 | 27 | -------------------------------------------------------------------------------- /freud-pin/makefile: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # DO NOT EDIT THIS FILE! 4 | # 5 | ############################################################## 6 | 7 | PIN_ROOT=/opt/pin-3.30 8 | # If the tool is built out of the kit, PIN_ROOT must be specified in the make invocation and point to the kit root. 9 | ifdef PIN_ROOT 10 | CONFIG_ROOT := $(PIN_ROOT)/source/tools/Config 11 | else 12 | CONFIG_ROOT := ../Config 13 | endif 14 | include $(CONFIG_ROOT)/makefile.config 15 | include makefile.rules 16 | include $(TOOLS_ROOT)/Config/makefile.default.rules 17 | 18 | ############################################################## 19 | # 20 | # DO NOT EDIT THIS FILE! 
21 | # 22 | ############################################################## 23 | -------------------------------------------------------------------------------- /freud-pin/makefile.rules: -------------------------------------------------------------------------------- 1 | ############################################################## 2 | # 3 | # This file includes all the test targets as well as all the 4 | # non-default build rules and test recipes. 5 | # 6 | ############################################################## 7 | 8 | 9 | ############################################################## 10 | # 11 | # Test targets 12 | # 13 | ############################################################## 14 | 15 | ###### Place all generic definitions here ###### 16 | 17 | # This defines tests which run tools of the same name. This is simply for convenience to avoid 18 | # defining the test name twice (once in TOOL_ROOTS and again in TEST_ROOTS). 19 | # Tests defined here should not be defined in TOOL_ROOTS and TEST_ROOTS. 20 | TEST_TOOL_ROOTS := 21 | 22 | # This defines the tests to be run that were not already defined in TEST_TOOL_ROOTS. 23 | TEST_ROOTS := 24 | 25 | # This defines the tools which will be run during the tests, and were not already defined in 26 | # TEST_TOOL_ROOTS. 27 | TOOL_ROOTS := freud-pin 28 | 29 | # This defines the static analysis tools which will be run during the tests. They should not 30 | # be defined in TEST_TOOL_ROOTS. If a test with the same name exists, it should be defined in 31 | # TEST_ROOTS. 32 | # Note: Static analysis tools are in fact executables linked with the Pin Static Analysis Library. 33 | # This library provides a subset of the Pin APIs which allows the tool to perform static analysis 34 | # of an application or dll. Pin itself is not used when this tool runs. 35 | SA_TOOL_ROOTS := 36 | 37 | # This defines all the applications that will be run during the tests. 
38 | APP_ROOTS := 39 | 40 | # This defines any additional object files that need to be compiled. 41 | OBJECT_ROOTS := reader 42 | 43 | # This defines any additional dlls (shared objects), other than the pintools, that need to be compiled. 44 | DLL_ROOTS := 45 | 46 | # This defines any static libraries (archives), that need to be built. 47 | LIB_ROOTS := 48 | 49 | ###### Define the sanity subset ###### 50 | 51 | # This defines the list of tests that should run in sanity. It should include all the tests listed in 52 | # TEST_TOOL_ROOTS and TEST_ROOTS excluding only unstable tests. 53 | SANITY_SUBSET := $(TEST_TOOL_ROOTS) $(TEST_ROOTS) 54 | 55 | 56 | ############################################################## 57 | # 58 | # Test recipes 59 | # 60 | ############################################################## 61 | 62 | # This section contains recipes for tests other than the default. 63 | # See makefile.default.rules for the default test rules. 64 | # All tests in this section should adhere to the naming convention: .test 65 | 66 | 67 | ############################################################## 68 | # 69 | # Build rules 70 | # 71 | ############################################################## 72 | 73 | # This section contains the build rules for all binaries that have special build rules. 74 | # See makefile.default.rules for the default build rules. 
75 | -------------------------------------------------------------------------------- /freud-pin/reader.cc: -------------------------------------------------------------------------------- 1 | #include "reader.hh" 2 | 3 | void bin_reader::read(std::string filename, std::unordered_map &rmap) { 4 | ifstream table(filename.c_str()); 5 | std::string line, name, fhash; 6 | int count, isptr, fcount, isaddr, fsets; 7 | uint64_t address, pos; 8 | int64_t off; 9 | int lnum = 0; 10 | while (std::getline(table, line)) { 11 | lnum++; 12 | if (line != "###") { 13 | std::ostringstream oss; 14 | oss << "Parsing problem at line " << lnum; 15 | log(VL_ERROR, oss.str()); 16 | oss.str(""); 17 | oss << "Expected ###, got " << line; 18 | log(VL_ERROR, oss.str()); 19 | exit(-1); 20 | } 21 | // SYMBOL NAME 22 | std::getline(table, name); 23 | lnum++; 24 | 25 | // INSTR ENTRY POINT 26 | std::getline(table, line); 27 | lnum++; 28 | address = strtoull(line.c_str(), NULL, 10); 29 | 30 | // NUM OF PARAMS 31 | std::getline(table, line); 32 | lnum++; 33 | count = atoi(line.c_str()); 34 | rmap[name].name = name; 35 | rmap[name].address = address; 36 | rmap[name].taken_count = 0; 37 | rmap[name].param_count = (uint16_t)count; 38 | 39 | for (int i = 0; i < count; i++) { 40 | // LOCATION 41 | std::getline(table, line); 42 | lnum++; 43 | pos = strtoull(line.c_str(), NULL, 10); 44 | 45 | // OFFSET FROM LOCATION 46 | std::getline(table, line); 47 | lnum++; 48 | off = strtoll(line.c_str(), NULL, 10); 49 | 50 | // IS_ADDR | IS_REG_WITH_ADDR 51 | std::getline(table, line); 52 | lnum++; 53 | isaddr = atoi(line.c_str()); 54 | 55 | // IS_PTR 56 | std::getline(table, line); 57 | lnum++; 58 | isptr = atoi(line.c_str()); 59 | 60 | struct dwarf_formal_parameter dfp; 61 | // TYPE_NAME 62 | std::getline(table, line); 63 | dfp.type_name = line; 64 | lnum++; 65 | 66 | // NUMBER OF FEATURE_SETS 67 | std::getline(table, line); 68 | lnum++; 69 | fsets = atoi(line.c_str()); 70 | 71 | int max_f_c = 0; 72 | for (int f = 
0; f < fsets; f++) { 73 | // HASH OF THE FEATURE SET 74 | std::getline(table, fhash); 75 | lnum++; 76 | 77 | // NUM_OF_FEATURES 78 | std::getline(table, line); 79 | lnum++; 80 | fcount = atoi(line.c_str()); 81 | if (fcount > max_f_c) 82 | max_f_c = fcount; 83 | 84 | log(VL_DEBUG, "Found runtime type " + fhash + " for " + dfp.type_name); 85 | 86 | for (int c = 0; c < fcount; c++) { 87 | // TYPE_NAME 88 | std::getline(table, line); 89 | lnum++; 90 | string ht = line; 91 | 92 | // FEATURE_NAME 93 | std::getline(table, line); 94 | lnum++; 95 | string ft = line; 96 | dfp.runtime_type_to_features[fhash].push_back(primitive_feature(ht, ft)); 97 | vector ca; 98 | uint32_t dims = 0; 99 | if (ft.find("array ") == 0) { 100 | // read info about the dims 101 | std::getline(table, line); 102 | lnum++; 103 | dims = strtoul(line.c_str(), 0, 10); 104 | for (unsigned int i = 0; i < dims; i++) { 105 | std::getline(table, line); 106 | lnum++; 107 | uint32_t c = atoi(line.c_str()); 108 | ca.push_back(c); 109 | } 110 | } 111 | // TODO: DO SOMETHING! 112 | //dfs.array_dims.push_back(dims); 113 | //dfs.array_counts.push_back(ca); 114 | } 115 | } 116 | dfp.position = pos; 117 | dfp.offset = off; 118 | dfp.is_addr = ((bool)(isaddr > 0)); 119 | dfp.is_reg_with_addr = ((bool)(isaddr > 1)); 120 | dfp.is_ptr = ((bool)isptr); 121 | rmap[name].params.push_back(dfp); 122 | rmap[name].max_features_count += max_f_c; 123 | } 124 | 125 | // Add system features 126 | // I have only 1! 
sys feature right now, hardcoded in rtn_descriptor 127 | rmap[name].system_variable_names.push_back("cpu_clock"); 128 | } 129 | 130 | } 131 | 132 | -------------------------------------------------------------------------------- /freud-pin/reader.hh: -------------------------------------------------------------------------------- 1 | #ifndef READER_HH_INCLUDED 2 | #define READER_HH_INCLUDED 3 | 4 | #include "rtn_descriptor.h" 5 | 6 | class bin_reader { 7 | public: 8 | static void read(std::string fname, std::unordered_map &rmap); 9 | }; 10 | 11 | #endif 12 | -------------------------------------------------------------------------------- /freud-pin/rtn_descriptor.h: -------------------------------------------------------------------------------- 1 | #ifndef RTN_STRUCTURE_H_ 2 | #define RTN_STRUCTURE_H_ 3 | 4 | #include 5 | #include 6 | #include 7 | 8 | #include "rtn_execution.h" 9 | 10 | #define EXPECTED_SAMPLES_PER_PERIOD 1000.0 // num of logs to collect for a specific object 11 | #define MAXTHREADS 64 12 | 13 | #define SYSTEM_FEATURES_CNT 1 14 | 15 | unsigned int DUMP_LOG_PERIOD = 5000; 16 | 17 | struct primitive_feature { 18 | primitive_feature(string t, string n): type(t), name(n) {};// { array_dims = 0; }; 19 | string type; 20 | string name; 21 | }; 22 | 23 | struct dwarf_formal_parameter 24 | { 25 | bool is_addr; 26 | bool is_reg_with_addr; 27 | bool is_ptr; 28 | uint64_t position; 29 | int64_t offset; 30 | string type_name; 31 | unordered_map> runtime_type_to_features; 32 | uint32_t array_dims; 33 | vector array_counts; 34 | }; 35 | 36 | struct routine_descriptor 37 | { 38 | PIN_MUTEX history_mutex; 39 | unsigned int executed_count; 40 | unsigned int taken_count; 41 | string name; 42 | uint64_t address; // The address of the first instruction after the prologue 43 | uint16_t param_count; 44 | unsigned int max_features_count; 45 | 46 | vector params; 47 | /* 48 | vector is_addr; 49 | vector is_reg_with_addr; 50 | vector is_ptr; 51 | vector position; 52 | 
vector offset; 53 | vector feature_type; 54 | vector array_dims; 55 | vector> array_counts; 56 | vector variable_names; 57 | 58 | // How many features a specific parameter (identified by the idx in the vector) has 59 | vector feature_counts; 60 | 61 | // The codes denoting the feature types associated with the parameter 62 | vector feature_names; 63 | */ 64 | 65 | vector history[2][MAXTHREADS]; 66 | int first_execution; 67 | volatile bool active_history; 68 | 69 | // SYSTEM FEATURES 70 | vector system_variable_names; 71 | 72 | void init() { 73 | PIN_MutexInit(&history_mutex); 74 | first_execution = 0; 75 | executed_count = taken_count = 0; 76 | active_history = false; // Not really needed 77 | for (int t = 0; t < MAXTHREADS; t++) { 78 | history[0][t].reserve(EXPECTED_SAMPLES_PER_PERIOD); 79 | history[1][t].reserve(EXPECTED_SAMPLES_PER_PERIOD); 80 | } 81 | } 82 | 83 | void acquire_lock() { 84 | PIN_MutexLock(&history_mutex); 85 | } 86 | 87 | void release_lock() { 88 | PIN_MutexUnlock(&history_mutex); 89 | } 90 | 91 | void add_to_history(struct rtn_execution * re, THREADID tid, int pos) { 92 | PIN_MutexLock(&history_mutex); 93 | // The second part may occur if log dump happened during the execution of the 94 | // instrumented symbol 95 | if (pos == -1 || (unsigned int)pos >= history[active_history][tid].size()) 96 | history[active_history][tid].push_back(re); 97 | else { 98 | // evict the old sample 99 | delete history[active_history][tid][pos]; 100 | history[active_history][tid][pos] = re; 101 | } 102 | PIN_MutexUnlock(&history_mutex); 103 | } 104 | 105 | void switch_history() { 106 | PIN_MutexLock(&history_mutex); 107 | taken_count = 0; 108 | active_history = !active_history; 109 | for (int t = 0; t < MAXTHREADS; t++) { 110 | for (struct rtn_execution * re: history[active_history][t]) { 111 | delete re; 112 | } 113 | history[active_history][t].clear(); 114 | } 115 | PIN_MutexUnlock(&history_mutex); 116 | } 117 | 118 | inline size_t get_stored_history_size(const 
THREADID tid) const { 119 | return history[!active_history][tid].size(); 120 | } 121 | 122 | inline const vector * get_stored_history_ptr(const THREADID tid) const { 123 | return &(history[!active_history][tid]); 124 | } 125 | 126 | struct rtn_execution * get_sample(THREADID tid, const int idx) { 127 | if (history[active_history][tid].size() == 0) { 128 | // this must be the first execution of this symbol on this thread 129 | // so I'm not going to remove anything, even though I should 130 | // never mind, ignore this case 131 | return 0; 132 | } 133 | PIN_MutexLock(&history_mutex); 134 | struct rtn_execution * addr = history[active_history][tid][idx]; 135 | PIN_MutexUnlock(&history_mutex); 136 | return addr; 137 | } 138 | 139 | }; 140 | 141 | #endif 142 | -------------------------------------------------------------------------------- /freud-pin/rtn_execution.h: -------------------------------------------------------------------------------- 1 | #ifndef RTN_EXECUTION_H_ 2 | #define RTN_EXECUTION_H_ 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | 9 | // TODO: these values depend on the processor, should be parsed automatically 10 | #define TSC_CLOCK_RATE_SEC 2300000000u 11 | #define TSC_CLOCK_RATE_MSEC 2300000u 12 | #define TSC_CLOCK_RATE_USEC 2300u 13 | 14 | //#define NO_CTX_SWITCH 15 | 16 | using namespace std; 17 | 18 | static __inline__ uint64_t rdtsc(){ 19 | #ifdef USE_OLD_RDTSC 20 | unsigned int lo,hi; 21 | __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi)); 22 | return ((uint64_t)hi << 32) | lo; 23 | #else 24 | // https://stackoverflow.com/questions/14783782/which-inline-assembly-code-is-correct-for-rdtscp 25 | unsigned long long tsc; 26 | __asm__ __volatile__( 27 | "rdtscp;" 28 | "shl $32, %%rdx;" 29 | "or %%rdx, %%rax" 30 | : "=a"(tsc) 31 | : 32 | : "%rcx", "%rdx"); 33 | 34 | return tsc; 35 | #endif 36 | } 37 | 38 | struct rtn_execution 39 | { 40 | bool active; 41 | uint64_t start_time; 42 | uint64_t stop_time; 43 | uint64_t 
allocated_memory; 44 | uint64_t total_lock_holding_time; 45 | uint64_t total_waiting_time; 46 | uint64_t minor_page_faults; 47 | uint64_t major_page_faults; 48 | uint64_t voluntary_ctx_switches; 49 | uint64_t involuntary_ctx_switches; 50 | 51 | // SYSTEM FEATURES 52 | vector system_feature_values; 53 | 54 | bool started_with_lock; 55 | uint64_t descendants_instrumentation_time; 56 | uint32_t unique_id; 57 | std::vector children; 58 | vector feature_values; 59 | vector runtime_types; 60 | 61 | unordered_map> branches; 62 | ADDRINT return_val; 63 | 64 | void init(uint32_t uid, bool lock) { 65 | total_waiting_time = descendants_instrumentation_time = start_time = stop_time = 0; 66 | active = true; 67 | // PROCFS stats 68 | // init to zero in case we are not reading procfs 69 | minor_page_faults = major_page_faults = voluntary_ctx_switches = involuntary_ctx_switches = 0; 70 | allocated_memory = 0; 71 | unique_id = uid; 72 | total_lock_holding_time = 0; 73 | started_with_lock = lock; 74 | // TODO: use meaningful values for these 75 | runtime_types.reserve(1); 76 | system_feature_values.reserve(1); 77 | } 78 | 79 | void trigger_start() { 80 | if (!active) 81 | return; 82 | start_time = rdtsc(); 83 | } 84 | 85 | inline void add_runtime_type(uint64_t rt) { 86 | runtime_types.push_back(rt); 87 | } 88 | 89 | void add_feature_value(void ** arg) { 90 | // hoping to catch any non explicitly defined type here 91 | feature_values.push_back(0); 92 | } 93 | 94 | void add_feature_value(unsigned char ** arg) { 95 | feature_values.push_back(0); 96 | } 97 | 98 | template 99 | void add_feature_value(T ** arg) { 100 | // Skipping ptr2ptr! 
101 | feature_values.push_back(0); 102 | } 103 | 104 | template 105 | void add_feature_value(T * arg) { 106 | T foo; 107 | if (arg && PIN_SafeCopy(&foo, arg, sizeof(T)) == sizeof(T)) 108 | // looks like we can read this thing 109 | feature_values.push_back(foo); 110 | else 111 | feature_values.push_back(0); 112 | } 113 | 114 | template 115 | void add_feature_value(T arg) { 116 | // looks like we can read this thing 117 | feature_values.push_back((int64_t)arg); 118 | } 119 | 120 | void add_feature_value(long long unsigned int* arg){ 121 | long long unsigned int tmp; 122 | if (arg && PIN_SafeCopy(&tmp, arg, sizeof(long long unsigned int)) == sizeof(long long unsigned int)) { 123 | // looks like we can read this thing 124 | //if (foo > INT32_MAX) 125 | // std::cout << "TODO: unsigned long long too big for uint32_t!" << std::endl; 126 | //std::cout << "Added long long val! " << foo << std::endl; 127 | feature_values.push_back(tmp % UINT32_MAX); 128 | } 129 | else 130 | feature_values.push_back(0); 131 | } 132 | 133 | void add_feature_value(long unsigned int * arg){ 134 | //std::cout << "Got value " << hex << (ADDRINT)arg << std::endl; 135 | long unsigned int foo; 136 | if (arg && PIN_SafeCopy(&foo, arg, sizeof(long unsigned int)) == sizeof(long unsigned int)) { 137 | // looks like we can read this thing 138 | //if (foo > INT32_MAX) 139 | // std::cout << "TODO: unsigned long too big for uint32_t!" << std::endl; 140 | //std::cout << "Added val! " << foo << std::endl; 141 | feature_values.push_back(foo % UINT32_MAX); 142 | } 143 | else 144 | feature_values.push_back(0); 145 | //std::cout << "Read 4 bytes " << arg << std::endl; 146 | } 147 | 148 | 149 | #if UNSIGNED_CHAR_IS_STRING 150 | void add_feature_value(unsigned char * arg){ 151 | // FIXME: does it make sense to consider this as a string? 
152 | //std::cout << "Reading str from " << (uint64_t)arg << std::endl; 153 | unsigned char *tmp_ptr = arg; 154 | unsigned char foo; 155 | int len = 0; 156 | while (arg && PIN_SafeCopy(&foo, tmp_ptr++, 1) == 1 && foo != '\0') 157 | len++; 158 | feature_values.push_back(len); 159 | } 160 | #endif 161 | 162 | void add_feature_value(char ** arg){ 163 | // Try to compute the length of the string pointed to by the first pointer (which is also the only one many times), and do it safely. The whole pointer must be copied before it is dereferenced. This might be super slow 164 | char *tmp_ptr = 0; 165 | int len = 0; 166 | if (arg && PIN_SafeCopy(&tmp_ptr, arg, sizeof(char *)) == sizeof(char *)) { 167 | char foo; 168 | while (tmp_ptr && PIN_SafeCopy(&foo, tmp_ptr++, 1) == 1 && foo != '\0') 169 | len++; 170 | } 171 | feature_values.push_back(len); 172 | } 173 | 174 | void add_feature_value(char * arg){ 175 | // Compute the (maybe not null terminated) string length, and do it safely. This might be super slow 176 | char *tmp_ptr = arg; 177 | char foo; 178 | int len = 0; 179 | while (arg && PIN_SafeCopy(&foo, tmp_ptr++, 1) == 1 && foo != '\0') 180 | len++; 181 | feature_values.push_back(len); 182 | } 183 | 184 | 185 | #include 186 | 187 | // ARRAYS 188 | template 189 | void add_feature_value_array(T * arg, const unsigned int dims, const vector * counts) { 190 | void * foo; 191 | uint32_t res = 1; 192 | T sum = 0; 193 | if (arg && PIN_SafeCopy(&foo, arg, sizeof(void *)) == sizeof(void *)) { 194 | // the first feature is the total length of the array 195 | for (uint32_t c : *counts) { 196 | //printf("Adding dim %u / %u\n", c, dims); 197 | res *= (c + 1u); 198 | } 199 | feature_values.push_back((int)res); 200 | // An array is guaranteed to have contiguous memory allocation 201 | // so this should not break... 
202 | for (uint32_t i = 0; i < res; i++) { 203 | sum += arg[i]; 204 | } 205 | feature_values.push_back((int)sum); 206 | } 207 | else { 208 | feature_values.push_back(0); 209 | feature_values.push_back(0); 210 | } 211 | } 212 | 213 | template 214 | void add_feature_value_array_var(T * arg, const unsigned int dims, ...) { 215 | void * foo; 216 | uint32_t res = 1; 217 | T sum = 0; 218 | if (arg && PIN_SafeCopy(&foo, arg, sizeof(void *)) == sizeof(void *)) { 219 | va_list ap; 220 | va_start(ap, dims); 221 | // the first feature is the total length of the array 222 | for (uint32_t c = 0; c < dims; c++) 223 | res *= (va_arg(ap, uint32_t) + 1u); 224 | va_end(ap); 225 | feature_values.push_back((int)res); 226 | // An array is guaranteed to have contiguous memory allocation 227 | // so this should not break... 228 | for (uint32_t i = 0; i < res; i++) { 229 | //printf("Reading array, adding %u, value %d\n", i, (int)arg[i]); 230 | sum += arg[i]; 231 | } 232 | feature_values.push_back((int)sum); 233 | } 234 | else { 235 | feature_values.push_back(0); 236 | feature_values.push_back(0); 237 | } 238 | } 239 | 240 | void do_malloc(uint64_t mem_size) { 241 | allocated_memory += mem_size; 242 | } 243 | 244 | void released_all_locks(const uint64_t tot_time, const uint64_t release_time) { 245 | if (!started_with_lock) 246 | total_lock_holding_time += tot_time; 247 | else 248 | total_lock_holding_time += release_time - start_time; 249 | 250 | // Should this function grab another lock, that must be considered from 251 | // the lock grabbing time 252 | started_with_lock = false; 253 | } 254 | 255 | void done_waiting(const uint64_t tot_time) { 256 | total_waiting_time += tot_time; 257 | } 258 | 259 | void init_stats(uint64_t minor, uint64_t major, uint64_t vcsw, uint64_t ivcsw) { 260 | minor_page_faults = minor; 261 | major_page_faults = major; 262 | voluntary_ctx_switches = vcsw; 263 | involuntary_ctx_switches = ivcsw; 264 | } 265 | 266 | void trigger_end(uint64_t itime, uint64_t locks, 
uint64_t lock_start, uint64_t minor_pf, uint64_t major_pf, uint64_t vctxs, uint64_t ivctxs){ 267 | if(!active) 268 | return; 269 | 270 | stop_time = rdtsc(); 271 | 272 | if (locks > 0 && started_with_lock == false) { 273 | // this guy acquired a lock himself! 274 | total_lock_holding_time += (stop_time - lock_start) - descendants_instrumentation_time; 275 | } 276 | minor_page_faults = minor_pf - minor_page_faults; 277 | major_page_faults = major_pf - major_page_faults; 278 | voluntary_ctx_switches = vctxs - voluntary_ctx_switches; 279 | involuntary_ctx_switches = ivctxs - involuntary_ctx_switches; 280 | 281 | #ifdef NO_CTX_SWITCH 282 | if (voluntary_ctx_switches | involuntary_ctx_switches) { 283 | std::cout << minor_page_faults << " | " << major_page_faults << " | " << voluntary_ctx_switches << " | " << involuntary_ctx_switches << std::endl; 284 | 285 | // TODO: handle properly, instead of simply thrashing 286 | return; 287 | } 288 | #endif 289 | descendants_instrumentation_time = itime; 290 | active = false; 291 | } 292 | 293 | uint64_t diff(){ 294 | if(active) 295 | return UINT64_MAX; 296 | uint64_t duration = stop_time - start_time; 297 | if (started_with_lock) { 298 | // if we're here, this function never saw a lock release, but it's 299 | // kept the lock for the whole duration 300 | total_lock_holding_time = duration - descendants_instrumentation_time; 301 | } 302 | total_lock_holding_time /= TSC_CLOCK_RATE_USEC; 303 | total_waiting_time /= TSC_CLOCK_RATE_USEC; 304 | return (duration - descendants_instrumentation_time) / TSC_CLOCK_RATE_USEC; 305 | } 306 | 307 | }; 308 | 309 | #endif 310 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-analysis.Po: -------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-checker.Po: 
-------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-main.Po: -------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-plotter.Po: -------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-reader.Po: -------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-reader_annotations.Po: -------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-regression.Po: -------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/.deps/freud_statistics-stats.Po: -------------------------------------------------------------------------------- 1 | # dummy 2 | -------------------------------------------------------------------------------- /freud-statistics/AUTHORS: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/usi-systems/freud/caa50f4f636cbfd1b73b8995432506fb5713fc8d/freud-statistics/AUTHORS -------------------------------------------------------------------------------- /freud-statistics/ChangeLog: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/usi-systems/freud/caa50f4f636cbfd1b73b8995432506fb5713fc8d/freud-statistics/ChangeLog -------------------------------------------------------------------------------- /freud-statistics/Makefile: -------------------------------------------------------------------------------- 1 | .PHONY: all 2 | all: freud-statistics 3 | 4 | 5 | OBJECTS = utils.o stats.o reader.o reader_annotations.o regression.o plotter.o analysis.o checker.o main.o 6 | 7 | CXXFLAGS = -g -O3 -std=c++11 -I/usr/share/R/include 8 | LIBS = -lR 9 | 10 | freud-statistics: $(OBJECTS) 11 | $(CXX) $(CXXFLAGS) $(LDFLAGS) $(OBJECTS) $(LIBS) -o $@ 12 | 13 | .PHONY: clean 14 | clean: 15 | rm -f *.o freud-statistics 16 | -------------------------------------------------------------------------------- /freud-statistics/README: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/usi-systems/freud/caa50f4f636cbfd1b73b8995432506fb5713fc8d/freud-statistics/README -------------------------------------------------------------------------------- /freud-statistics/analysis.hh: -------------------------------------------------------------------------------- 1 | #ifndef ANALYSIS_HH_DEFINED 2 | #define ANALYSIS_HH_DEFINED 3 | 4 | #include 5 | #include 6 | //#include 7 | #include 8 | 9 | #include "stats.hh" 10 | #include "method.hh" 11 | #include "utils.hh" 12 | 13 | #define PCC_THRESHOLD 0.4 14 | 15 | enum phase { 16 | ORIGINAL = 0, 17 | REMOVED_NOISE, 18 | MAIN_TREND 19 | }; 20 | 21 | class analysis { 22 | private: 23 | // data set 24 | //const std::vector * data; 25 | std::unordered_map * m_ids_map; 26 | method * mtd; 27 | metric_type metric; 28 | feature_type ftype; 29 | double min_det; 30 | 31 | bool compute_best_multiple_regression(std::vector &result); 32 | bool explore_regression_tree(const std::vector & data, std::vector branches, std::vector &result, int depth, const std::vector & path_conditions, const 
std::unordered_set features, const enum phase ph); 33 | multiple_regression * compute_best_multiple_regression(const std::vector &in_data, const std::unordered_set &features); 34 | 35 | bool split_on_pivots(const std::vector pvalues, const std::string fkey, const std::vector in, std::vector> &out); 36 | 37 | public: 38 | analysis(const std::string &mname, method * m, metric_type mtype, std::unordered_map * mim); 39 | bool cluster(); 40 | bool cluster(std::vector &result); 41 | bool find_regressions(); 42 | bool find_regressions(double min_r2); 43 | }; 44 | 45 | #endif 46 | -------------------------------------------------------------------------------- /freud-statistics/checker.cc: -------------------------------------------------------------------------------- 1 | #include "checker.hh" 2 | 3 | 4 | checker::checker(std::string mname, method * m, metric_type mtype, bool remove_dependencies, annotation * performance_annotation, std::ofstream * out) { 5 | metric = mtype; 6 | mtd = m; 7 | mtd->name = mname; 8 | pa = performance_annotation; 9 | out_file = out; 10 | } 11 | 12 | void checker::filter_data(corr_with_constraints cwc, std::vector in, std::vector &out) { 13 | bool filtered = false; 14 | // If the PC is based on clustering, skip this phase, the data has already been filtered along with the cluster checks 15 | if (cwc.path_conditions_vec.size() > 0 && cwc.path_conditions_vec[0].feature != "CL") { 16 | utils::get_filtered_data(cwc, in, out); 17 | std::cout << "Filtered on pc: " << in.size() << " - " << out.size() << std::endl; 18 | in = out; 19 | filtered = true; 20 | } 21 | 22 | printf("TODO: reimplement the checker\n"); 23 | exit(-1); 24 | /* 25 | if (cwc.main_trend) { 26 | utils::get_main_trend(cwc.reg->get_fkeyidx(), metric, in, out); 27 | filtered = true; 28 | } else if (cwc.addnoise_removed) { 29 | utils::remove_addnoise(cwc.reg->get_fkeyidx(), metric, in, out); 30 | filtered = true; 31 | } 32 | 33 | if (!filtered) 34 | out = in; 35 | 36 | if (out.size() == 
0) 37 | std::cout << "ERR: out is empty in filter_data!" << std::endl; 38 | */ 39 | } 40 | 41 | bool checker::validate_annotation() { 42 | printf("TODO: reimplement the checker\n"); 43 | exit(-1); 44 | return false; 45 | /* 46 | if (pa->regressions.size() + pa->clusters.size() == 0) { 47 | (*out_file) << "Empty annotation read!" << std::endl; 48 | return true; //false; 49 | } 50 | 51 | bool result = true; 52 | std::cout << "Ann with " << pa->regressions.size() << "R and " << pa->clusters.size() << "C" << std::endl; 53 | 54 | for (corr_with_constraints cwc: pa->regressions) { 55 | //if (result == false) 56 | // break; 57 | 58 | bool v = false; 59 | std::vector data = mtd->data, f_data; 60 | 61 | if (cwc.path_conditions_vec.size() && cwc.path_conditions_vec[0].feature == "CL" && cwc.path_conditions_vec[0].flt == FLT_EQ) { 62 | 63 | // ######### REGRESSIONS AFTER CLUSTERS ######### 64 | v = stats::validate_cluster_before_regression(mtd->data, metric, cwc.path_conditions_vec[0].probability, cwc.path_conditions_vec[0].cl_lower_bound, cwc.path_conditions_vec[0].cl_upper_bound, data); 65 | result = result && v; 66 | //std::cout << "Validation1: " << v << " - " << result << std::endl; 67 | } 68 | 69 | // ######### REGRESSIONS ######### 70 | //std::cout << "PFilters: " << cwc.path_conditions_vec.size() << std::endl; 71 | if (cwc.main_trend || cwc.addnoise_removed || cwc.path_conditions_vec.size() > 0) { 72 | // filter the data 73 | filter_data(cwc, data, f_data); 74 | } else { 75 | f_data = data; 76 | } 77 | 78 | if (f_data.size() <= MAX_DEGREE + 1) { 79 | // Not enough samples to check this 80 | (*out_file) << "Skipping, not enough samples (" << f_data.size() << ")" << std::endl; 81 | continue; 82 | } 83 | 84 | if (cwc.constant) { 85 | v = stats::validate_constant(f_data, metric, cwc.constant_value); 86 | } else { 87 | univariate_regression * r = cwc.reg; 88 | v = stats::validate_regression(f_data, r->get_fkeyidx(), metric, r); 89 | } 90 | //std::cout << "Validation: " << 
v << std::endl; 91 | (*out_file) << "Validation: " << v << " -- (" << f_data.size() << ")" << std::endl; 92 | result = result && v; 93 | } 94 | 95 | // ######### CLUSTERS ######### 96 | if (pa->clusters.size() > 0) { 97 | result = result && stats::validate_clusters(mtd->data, metric, pa->clusters); 98 | } 99 | 100 | return result; 101 | */ 102 | } 103 | -------------------------------------------------------------------------------- /freud-statistics/checker.hh: -------------------------------------------------------------------------------- 1 | #ifndef CHECKER_HH_DEFINED 2 | #define CHECKER_HH_DEFINED 3 | 4 | #include "method.hh" 5 | #include "stats.hh" 6 | #include "utils.hh" 7 | 8 | #include 9 | 10 | 11 | class checker { 12 | private: 13 | metric_type metric; 14 | method * mtd; 15 | annotation * pa; 16 | std::ofstream * out_file; 17 | void filter_data(corr_with_constraints cwc, std::vector in, std::vector &out); 18 | 19 | public: 20 | checker(std::string mname, method * m, metric_type mtype, bool remove_dependencies, annotation * performance_annotation, std::ofstream * out); 21 | 22 | bool validate_annotation(); 23 | 24 | }; 25 | 26 | 27 | #endif 28 | -------------------------------------------------------------------------------- /freud-statistics/const.hh: -------------------------------------------------------------------------------- 1 | #ifndef CONST_HH_DEFINED 2 | #define CONST_HH_DEFINED 3 | 4 | // Max degree for a regression; internal parameter, not to be modified alone 5 | #define MAX_DEGREE 2 6 | 7 | enum filter_type { 8 | FLT_LE = 0, 9 | FLT_GT, 10 | FLT_BT, 11 | FLT_EQ, 12 | FLT_DF, 13 | FLT_NONE 14 | }; 15 | 16 | enum operations { 17 | PRINT, 18 | GNUPLOT, 19 | PCC, 20 | ANNOTATION, 21 | DEPENDENCIES, 22 | CHECK 23 | }; 24 | 25 | enum feature_type { 26 | FT_BOOL = 0, 27 | FT_INT, 28 | FT_COLLECTION, 29 | FT_COLLECTION_AGGR, 30 | FT_STRING, // and also mysqllexstring are caught here 31 | FT_PATH, 32 | FT_NASCII, 33 | FT_SIZE, 34 | FT_RESOURCE, 35 | 
FT_ENUM, 36 | FT_BRANCH_EXEC, 37 | FT_BRANCH_ISTRUE, 38 | FT_UNKNOWN // either not existing, or not found/decided yet 39 | }; 40 | 41 | enum computation_state { 42 | ST_TOT, 43 | ST_STDDEV, 44 | ST_COVARIANCE, 45 | ST_REGRESSION, 46 | ST_CLUSTER 47 | }; 48 | 49 | #define METRICS_COUNT 6 50 | 51 | enum metric_type { 52 | MT_TIME = 0, 53 | MT_MEM, 54 | MT_LOCK, 55 | MT_WAIT, 56 | MT_PAGEFAULT_MINOR, 57 | MT_PAGEFAULT_MAJOR 58 | }; 59 | 60 | #endif 61 | -------------------------------------------------------------------------------- /freud-statistics/function.hh: -------------------------------------------------------------------------------- 1 | struct function { 2 | std::string function_name; 3 | std::vector features; 4 | std::vector data; 5 | }; 6 | -------------------------------------------------------------------------------- /freud-statistics/hpi.R: -------------------------------------------------------------------------------- 1 | get_hpi <- function(vec) { 2 | library(ks) 3 | 4 | res <- hscv(vec) 5 | return(res) 6 | } 7 | 8 | get_cc <- function(mat) { 9 | pcc <- max(abs(cor(x=mat[,1], y=mat[,2], method="pearson")), abs(cor(x=mat[,1], y=mat[,2], method="spearman"))) 10 | return(pcc) 11 | } 12 | 13 | check_model <- function(coeff, data, nlogn) { 14 | x <- data[2,] 15 | y_actual <- data[1,] 16 | if (nlogn) { 17 | mod <- lm(formula = as.formula(y_actual ~ do.call("offset", list(coeff[2] * x)) + do.call("offset", list(coeff[3] * x * log2(x))))) 18 | } else { 19 | mod <- lm(formula = as.formula(y_actual ~ do.call("offset", list(coeff[2] * x)) + do.call("offset", list(coeff[3] * x**2)))) 20 | } 21 | mod$coefficients[1] = coeff[1] # set the intercept 22 | xdf <- data.frame(x = x) 23 | y_pred <- predict(mod, xdf) 24 | rss <- sum((y_pred - y_actual) ^ 2) ## residual sum of squares 25 | tss <- sum((y_actual - mean(y_actual)) ^ 2) ## total sum of squares 26 | rsq <- 1 - rss/tss 27 | return(rsq) 28 | } 29 | 30 | cor_wrap <- function(mat) { 31 | res <- cor(mat) 32 | 
return(res) 33 | } 34 | 35 | get_multimodel <- function(mat, frm) { 36 | #print(frm) 37 | tryCatch({ 38 | mod <- lm(eval(parse(text=frm))) 39 | sum <- summary(mod) 40 | bic <- BIC(mod) 41 | numterms <- mod$rank 42 | res <- c(bic, sum$r.squared, mod$coefficients, sum$coefficients[(numterms)*3+1:numterms]) 43 | return(res) 44 | }, 45 | # Not possible to compute the lm, return 0 46 | error = function(e) { return(0) } 47 | ) 48 | } 49 | 50 | get_best_model <- function(mat) { 51 | bic_threshold <- 15 52 | best_is_nlogn <- 0 53 | y <- mat[1,] 54 | x <- mat[2,] 55 | 56 | # Linear model 57 | model <- lm(y ~ x) 58 | linear_bic <- BIC(model) 59 | best_model <- model 60 | 61 | # Nlogn model 62 | xlogx <- x * log2(x) 63 | model <- lm(y ~ x + xlogx) 64 | nlogn_bic <- BIC(model) 65 | if (nlogn_bic < linear_bic - bic_threshold) { 66 | best_model <- model 67 | best_is_nlogn <- 1 68 | } 69 | 70 | # Quadratic model 71 | x2 <- x ** 2 72 | model <- lm(y ~ x + x2) 73 | quad_bic <- BIC(model) 74 | if (quad_bic < linear_bic - bic_threshold) { 75 | if (!best_is_nlogn || quad_bic < nlogn_bic) { 76 | best_model <- model 77 | best_is_nlogn <- 0 78 | } 79 | } 80 | 81 | res <- c(summary(best_model)$r.squared, best_model$rank + best_is_nlogn) 82 | for (i in 1:best_model$rank) { 83 | res <- append(res, best_model$coefficients[i]) 84 | } 85 | 86 | return(res) 87 | 88 | } 89 | 90 | 91 | #################### 92 | # Local minima and maxima: https://stackoverflow.com/questions/6836409/finding-local-maxima-and-minima 93 | # Modified to account for the smoothness factor 94 | localMaxima <- function(x) { 95 | smooth_factor <- 0L #0.000003 96 | # Use -Inf instead if x is numeric (non-integer) 97 | y <- diff(c(-Inf, x)) > smooth_factor 98 | rle(y)$lengths 99 | y <- cumsum(rle(y)$lengths) 100 | y <- y[seq.int(1L, length(y), 2L)] 101 | if (x[[1]] == x[[2]]) { 102 | y <- y[-1] 103 | } 104 | y 105 | } 106 | 107 | localMinima <- function(x) { 108 | # Use -Inf instead if x is numeric (non-integer) 109 | 
smooth_factor <- 0L #0.000003 110 | y <- diff(c(Inf, x)) > smooth_factor 111 | rle(y)$lengths 112 | y <- cumsum(rle(y)$lengths) 113 | y <- y[seq.int(1L, length(y), 2L)] 114 | if (x[[1]] == x[[2]]) { 115 | y <- y[-1] 116 | } 117 | y 118 | } 119 | #################### 120 | ### VARIABLE KDE ### 121 | #' Broadcasting an array to a wanted shape 122 | #' 123 | #' Extending a given array to match a wanted shape, similar to repmat. 124 | #' @param arr Array to be broadcast 125 | #' @param dims Desired output dimensions 126 | #' 127 | #' @return array of dimension dim 128 | #' @examples 129 | #' M = array(1:12,dim=c(1,2,3)) 130 | #' N = broadcast(M,c(4,2,3)) 131 | #' 132 | #' @note # Fri Feb 9 14:41:25 2018 ------------------------------ 133 | #' @author Feng Geng (shouldsee.gem@gmail.com) 134 | #' @export 135 | broadcast <- function(arr, dims){ 136 | DIM = dim(arr) 137 | EXCLUDE = which(DIM==dims) #### find out the margin that will be excluded 138 | if (length(EXCLUDE)!=0){ 139 | if( !all(DIM[-EXCLUDE]==1) ){ 140 | errmsg = sprintf("All non-singleton dimensions must be equal: Array 1 [%s] , Wanted shape: [%s]", 141 | paste(DIM,collapse = ','), 142 | paste(dims,collapse = ',')) 143 | stop(errmsg) 144 | } 145 | perm <- c(EXCLUDE, seq_along(dims)[-EXCLUDE]) 146 | }else{ 147 | perm <- c(seq_along(dims)) 148 | } 149 | 150 | arr = array(aperm(arr, perm), dims[perm]) 151 | arr = aperm(arr, order(perm) 152 | ,resize = T) 153 | } 154 | #' Broadcasting an operation on multi-dimensional arrays 155 | #' 156 | #' Broadcasting two arrays to the same shape and then applying FUN. 157 | #' 158 | #' @param arrA,arrB Arrays on which to perform the operator FUN(arrA,arrB) 159 | #' @param FUN function to be performed (must be vectorised) 160 | #' 161 | #' @return An array of the shape with each dimension being the maximum of dimA,dimB. 
162 | #' @examples 163 | #' 164 | #' M = array(1:12,dim=c(2,2,3)) 165 | #' N = rbind(c(1,2)) 166 | #' O = bsxfun(M,N,'*') 167 | #' @note 168 | #' 169 | #' # Fri Feb 9 14:41:25 2018 ------------------------------ 170 | #' 171 | #' @author Feng Geng (shouldsee.gem@gmail.com) 172 | #' @export 173 | #' 174 | # Thanks to https://github.com/shouldsee/Rutil/blob/master/R/bsxfun.R 175 | bsxfun <- function(arrA,arrB,FUN = '+',...){ 176 | FUN <- match.fun(FUN) 177 | arrA <- as.array(arrA) 178 | arrB <- as.array(arrB) 179 | dimA <- dim(arrA) 180 | dimB <- dim(arrB) 181 | if (identical(dimA,dimB)){ 182 | #### Trivial scenario 183 | arr = FUN(arrA,arrB, ...) 184 | return(arr) 185 | } 186 | 187 | arrL = list(arrA,arrB) 188 | orient <- order(sapply(list(dimA,dimB),length)) 189 | # orient <- order(sapply(arrL,length)) 190 | arrL = arrL[orient] 191 | dim1 <- dim(arrL[[1]]) 192 | dim2 <- dim(arrL[[2]]) 193 | dim1 <- c(dim1, rep(1,length(dim2)-length(dim1))) #### Padding smaller array to be of same length 194 | dim(arrL[[1]])<-dim1; 195 | nonS = (dim1!=1) & (dim2!=1) 196 | 197 | if( !all(dim1[nonS]== dim2[nonS])){ 198 | errmsg = sprintf("All non-singleton dimensions must be equal: Array 1 [%s] , Array 2 [%s]", 199 | paste(dimA,collapse = ','), 200 | paste(dimB,collapse = ',')) 201 | stop(errmsg) 202 | } 203 | ### Broadcasting to fill singleton dimensions 204 | # browser() 205 | dims <- pmax(dim1,dim2) 206 | arrL = lapply(arrL,function(x)broadcast(x,dims)) 207 | arrL = arrL[order(orient)] 208 | arr = FUN(arrL[[1]],arrL[[2]], ...) 209 | return(arr) 210 | } 211 | 212 | if (interactive()){ 213 | 214 | M = array(1:12,dim=c(2,2,3)) 215 | N = rbind(c(1,2)) 216 | O = bsxfun(M,N,'*') 217 | 218 | M = array(1:12,dim=c(2,2,3)) 219 | N = rbind(c(1,2)) 220 | O = bsxfun(N,M,'*') 221 | } 222 | 223 | # Reference: 224 | # Kernel density estimation via diffusion 225 | # Z. I. Botev, J. F. Grotowski, and D. P. Kroese (2010) 226 | # Annals of Statistics, Volume 38, Number 5, pages 2916-2957. 
227 | akde1d <- function(X,grid,gam) { 228 | # begin scaling preprocessing 229 | n = NROW(X);d = NCOL(X); 230 | MAX = max(X);MIN = min(X);scaling=MAX-MIN; 231 | MAX=MAX+scaling/10;MIN=MIN-scaling/10;scaling=MAX-MIN; 232 | X=bsxfun(X,MIN,'-');X=bsxfun(X,scaling,'/'); 233 | #if (nargin<2)|isempty(grid) % failing to provide grid 234 | grid=seq(MIN,MAX,scaling/(2^12-1)); 235 | #end 236 | # FIXME: check @rdivide matlab operation 237 | mesh=bsxfun(grid,MIN,'-');mesh=bsxfun(mesh,scaling,'/'); 238 | #if nargin<3 % failing to provide speed/accuracy tradeoff 239 | gam=ceiling(n^(1/3))+20; 240 | if (gam > n) { 241 | gam = gam - 20 242 | } 243 | #end 244 | #end preprocessing 245 | # algorithm initialization 246 | print(n) 247 | print(gam) 248 | del=.2/n^(d/(d+4));perm=sample(n);mu=matrix(X[perm[1:gam]], gam, d); 249 | w=matrix(runif(gam), 1, gam);w=w/sum(w);Sig=(del^2)*matrix(runif(gam),gam,d); 250 | 251 | ent=-Inf; 252 | for (iter in seq(1,1500)) { 253 | Eold = ent; 254 | rEM=regEM(w,mu,Sig,del,X); # update parameters 255 | w=rEM$w;mu=rEM$mu;Sig=rEM$sig;del=rEM$del;ent=rEM$ent; 256 | err=abs((ent-Eold)/ent); # stopping condition 257 | #print(ent) 258 | #print(Eold) 259 | #print('Iter. Tol. 
Bandwidth '); 260 | #print('%4i %8.2e %8.2e\n',iter,err,del); 261 | #fprintf('----------------------------\n'); 262 | #print(ent) 263 | #print(Eold) 264 | #print(err) 265 | if (iter > 200) { 266 | break #, end 267 | } 268 | if (err < 0.00001) { 269 | break #, end 270 | } 271 | } 272 | pdf = probfun(mesh,w,mu,Sig)/prod(scaling); # evaluate density 273 | del=del*scaling; # adjust bandwidth for scaling 274 | return(list("pdf"=pdf, "grid"=grid)); 275 | } 276 | 277 | #################################################### 278 | probfun <- function(x,w,mu,Sig) { 279 | gam=NROW(mu); 280 | d=NCOL(mu); 281 | out=0; 282 | for (k in seq(1,gam)) { 283 | S=Sig[k,]; 284 | xx=bsxfun(x,mu[k,],'-'); 285 | xx=bsxfun(xx^2,S,'/'); 286 | out=out+exp(-.5*matrix(apply(xx, 1, function(x) sum(x)), NROW(xx), 1)+log(w[k])-.5*sum(log(S))-d*log(2*pi)/2); 287 | } 288 | return(out); 289 | } 290 | #################################################### 291 | regEM <- function(w,mu,Sig,del,X) { 292 | #function [w,mu,Sig,del,ent]=regEM(w,mu,Sig,del,X) 293 | gam=NROW(mu);d=NCOL(mu); 294 | n=NROW(X);d=NCOL(X); 295 | log_lh=matrix(0,n,gam); log_sig=log_lh; 296 | #print(gam) 297 | #print(w) 298 | #print(mu) 299 | for (i in seq(1,gam)) { 300 | #print(i) 301 | #print(Sig) 302 | s=Sig[i,]; 303 | Xcentered = bsxfun(X, mu[i],'-'); 304 | xRinv = bsxfun(Xcentered^2, s,'/'); 305 | tmp = bsxfun(xRinv, s,'/') 306 | xSig = matrix(apply(tmp, 1, function(x) sum(x)), NROW(tmp), 1)+.Machine$double.eps; 307 | log_lh[,i]=-.5*matrix(apply(xRinv, 1, function(x) sum(x)), NROW(xRinv), 1)-.5*sum(log(s))+log(w[i])-d*log(2*pi)/2-.5*del^2*sum(1./s); 308 | log_sig[,i]=log_lh[,i]+log(xSig); 309 | } 310 | maxll = matrix(apply(log_lh, 1, function(x) max(x)), NROW(log_lh), 1); 311 | maxlsig = matrix(apply(log_sig, 1, function(x) max(x)), NROW(log_sig), 1); 312 | p=exp(bsxfun(log_lh, maxll, '-')); 313 | #print(log_lh) 314 | psig=exp(bsxfun(log_sig, maxlsig, '-')); 315 | density=matrix(apply(p, 1, function(x) sum(x)), NROW(p), 1); # 
density = sum(p,2); 316 | psigd=matrix(apply(psig, 1, function(x) sum(x)), NROW(psig), 1); # psigd=sum(psig,2); 317 | logpdf=log(density)+maxll; logpsigd=log(psigd)+maxlsig; 318 | p = bsxfun(p, density,'/'); #% normalize classification prob. 319 | ent=sum(logpdf); 320 | w = apply(p, 2, function(x) sum(x)); # w=sum(p,1); 321 | #print(which(w>0)) 322 | for (i in which(w>0)) { 323 | mu[i]=t(p[,i]) %*% X / w[i]; # compute mu's 324 | Xcentered = bsxfun(X,mu[i],'-'); 325 | Sig[i,]=t(p[,i]) %*% (Xcentered^2)/w[i]+del^2; # compute sigmas 326 | } 327 | #print(Sig) 328 | w=w/sum(w); 329 | curv=mean(exp(logpsigd-logpdf)); 330 | del=1/(4*n*(4*pi)^(d/2)*curv)^(1/(d+2)); 331 | return(list("w" = w, "mu" = mu, "sig" = Sig, "del" = del, "ent" = ent)); 332 | } 333 | 334 | 335 | #################### 336 | 337 | kde_clustering <- function(vec) { 338 | library(ks) 339 | k <- kde(vec, positive = T) 340 | maxima <- k$eval.points[localMaxima(k$estimate)] 341 | minima <- k$eval.points[localMinima(k$estimate)] 342 | res <- c(as.numeric(length(maxima))) 343 | res <- append(res, maxima) 344 | res <- append(res, as.numeric(length(minima))) 345 | res <- append(res, minima) 346 | return(res) 347 | } 348 | 349 | vkde_clustering <- function(vec) { 350 | k <- akde1d(vec) 351 | maxima <- k$grid[localMaxima(k$pdf)] 352 | minima <- k$grid[localMinima(k$pdf)] 353 | res <- c(as.numeric(length(maxima))) 354 | res <- append(res, maxima) 355 | res <- append(res, as.numeric(length(minima))) 356 | res <- append(res, minima) 357 | return(res) 358 | } 359 | -------------------------------------------------------------------------------- /freud-statistics/main.cc: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2020 Daniele Rogora 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 
6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | #include 18 | #include 19 | #include 20 | #include 21 | #include 22 | #include 23 | 24 | #include "const.hh" 25 | #include "method.hh" 26 | #include "stats.hh" 27 | #include "reader.hh" 28 | #include "reader_annotations.hh" 29 | #include "function.hh" 30 | #include "analysis.hh" 31 | #include "checker.hh" 32 | #include "plotter.hh" 33 | 34 | #include 35 | #include 36 | 37 | void printUsage(const char *argv[]) { 38 | std::cout << "Usage: " << argv[0] << " OPERATION [function_name] [magictype] INPUT_FILES" << std::endl; 39 | std::cout << std::endl << "Operations: " << std::endl; 40 | std::cout << "\t" << GNUPLOT << ": produce a gnuplot with values measured and the specified parameter for a specific function (" << argv[0] << " 1 MTYPE FNAME FCOUNT [\"ft1\",\"ft2\"]) PATH" << std::endl; 41 | std::cout << "\t" << ANNOTATION << ": (min_r2) (metric type) (symbol_name) print derived best-effort annotations for every function" << std::endl; 42 | std::cout << "\t" << CHECK << ": check a given performance annotation against the input data" << std::endl; 43 | } 44 | 45 | /** 46 | * Loads external R source code 47 | */ 48 | void source(const char *name) 49 | { 50 | SEXP e; 51 | PROTECT(e = lang2(install("source"), mkString(name))); 52 | R_tryEval(e, R_GlobalEnv, NULL); 53 | UNPROTECT(1); 54 | } 55 | 56 | int main(int argc, const char *argv[]) { 57 | if (argc < 3) { 58 | printUsage(argv); 59 | return 0; 60 | } 61 | 62 | // Initialize the embedded R environment. 
63 | int r_argc = 4; 64 | char exe[] = "R"; 65 | char o1[] = "--silent"; 66 | char o2[] = "--no-save"; 67 | char o3[] = "--gui=none"; 68 | char *r_argv[] = { exe, o1, o2, o3 }; 69 | Rf_initEmbeddedR(r_argc, r_argv); 70 | source("hpi.R"); 71 | 72 | int k; 73 | int features_count; 74 | int operation = atoi(argv[1]); 75 | double min_r2 = 0; // 0 means "not given"; replaced by MIN_DET below 76 | metric_type mtype; 77 | std::string symbol_name; 78 | std::vector features_index_type; 79 | 80 | if (operation == PRINT) 81 | k = 3; // argv[2] is the name of a function 82 | else if (operation == GNUPLOT) { 83 | features_count = atoi(argv[4]); 84 | if (features_count < 1) { 85 | utils::log(VL_ERROR, "There must be at least 1 feature!"); 86 | exit(-1); 87 | } 88 | for (int i = 0; i < features_count; i++) 89 | features_index_type.push_back(argv[5 + i]);//ftype = atoi(argv[3]); 90 | features_index_type.push_back("gnuplot OP"); 91 | mtype = (metric_type)atoi(argv[2]); 92 | k = 5 + features_count; // argv[2] metric type, argv[3] fname, argv[4] features count 93 | } 94 | else if (operation == ANNOTATION || operation == CHECK) { 95 | k = 5; 96 | min_r2 = std::stod(std::string(argv[2])); 97 | mtype = (metric_type)atoi(argv[3]); 98 | symbol_name = argv[4]; 99 | } 100 | else 101 | k = 2; 102 | 103 | if (min_r2 == 0) 104 | min_r2 = MIN_DET; 105 | 106 | utils::log(VL_INFO, "Reading data from " + std::string(argv[k]) + "... 
"); 107 | 108 | // this is the main vector, that holds all the data collected from the system 109 | std::map data; 110 | std::unordered_map m_ids_map; 111 | reader::read_folder(argv[k], data, m_ids_map); 112 | 113 | if (operation == ANNOTATION) { 114 | unsigned int understood = 0; 115 | std::stringstream out_ann_file; 116 | if (symbol_name != "-") 117 | out_ann_file << "ann/" << symbol_name << ".txt"; 118 | else 119 | out_ann_file << "ann/all.txt"; 120 | std::ofstream outFile(out_ann_file.str().c_str(), std::ios::out); 121 | if (!outFile.is_open()) { 122 | utils::log(VL_ERROR, "Could not open outfile " + out_ann_file.str()); 123 | exit(-1); 124 | } 125 | for (std::pair p: data) { 126 | if (symbol_name != "-" && p.first != symbol_name) 127 | continue; 128 | 129 | utils::log(VL_INFO, "Creating Performance Annotation (" + std::to_string(mtype) + ") for " + p.first); 130 | 131 | // TRY REGRESSIONS 132 | analysis a(p.first, p.second, mtype, &m_ids_map); 133 | understood += a.find_regressions(min_r2); 134 | 135 | utils::log(VL_INFO, "Regressions found: " + std::to_string(p.second->performance_annotation.regressions.size())); 136 | plotter::plot_annotations(p.second, mtype); 137 | 138 | // CLUSTERING if nothing worked, fall back on kde 1d clustering 139 | if (p.second->performance_annotation.regressions.size() == 0) { 140 | analysis a(p.first, p.second, mtype, &m_ids_map); 141 | if (a.cluster()) 142 | utils::log(VL_INFO, "Clustering succeeded!"); 143 | } 144 | outFile << utils::generate_annotation_string(&(p.second->performance_annotation)); 145 | } 146 | outFile.close(); 147 | 148 | } else if (operation == GNUPLOT) { 149 | // TODO: remove 150 | for (std::pair p: data) { 151 | if (p.first == argv[3]) { 152 | utils::log(VL_INFO, "Plotting " + std::to_string(mtype) + " for method " + p.first); 153 | plotter::graph(p.second, "eps/" + p.first, features_index_type, mtype); 154 | } 155 | } 156 | } else if (operation == CHECK) { 157 | // CHECK THE ANNOTATION 158 | 
std::stringstream out_check_file; 159 | if (symbol_name != "-") 160 | out_check_file << "ass/" << symbol_name << ".txt"; 161 | else 162 | out_check_file << "ass/all.txt"; 163 | std::ofstream outFile(out_check_file.str().c_str(), std::ios::out); 164 | if (!outFile.is_open()) { 165 | utils::log(VL_ERROR, "Could not open outfile " + out_check_file.str()); 166 | exit(-1); 167 | } 168 | 169 | std::map m_ann_map; 170 | for (std::pair p: data) { 171 | if (symbol_name != "-" && p.first != symbol_name) 172 | continue; 173 | 174 | utils::log(VL_INFO, "Reading annotations from " + std::string(argv[k + 1])); 175 | std::string afname = argv[k + 1]; 176 | afname += "/" + symbol_name + ".txt"; 177 | reader_annotations::read_annotations_file(afname, symbol_name, m_ann_map); 178 | 179 | if (m_ann_map.find(symbol_name) == m_ann_map.end()) { 180 | utils::log(VL_ERROR, "Couldn't read annotation for " + symbol_name + "; skipping"); 181 | continue; 182 | } 183 | 184 | checker c(p.first, p.second, mtype, false, m_ann_map[p.first], &outFile); 185 | bool validated = c.validate_annotation(); 186 | if (validated) 187 | outFile << "Validation succeeded for " << p.first << std::endl; 188 | else 189 | outFile << "Validation failed for " << p.first << std::endl; 190 | } 191 | outFile.close(); 192 | } 193 | } 194 | -------------------------------------------------------------------------------- /freud-statistics/measure.hh: -------------------------------------------------------------------------------- 1 | #ifndef MEASURE_HH_INCLUDED 2 | #define MEASURE_HH_INCLUDED 3 | 4 | class measure { 5 | private: 6 | public: 7 | measure() {}; 8 | measure(std::string mname) {}; 9 | // I probably want to add a method here, either by name or by struct method 10 | std::string method_name; 11 | unsigned char depth; 12 | uint64_t measures[METRICS_COUNT]; 13 | std::unordered_map> branches; 14 | std::vector features; 15 | measure *parent; 16 | //std::vector children; 17 | std::vector children_ids; 18 | 19 | 20 | 
static unsigned char get_feature_idx(std::string fkey) { 21 | int pipe_pos = fkey.find("|"); 22 | unsigned char idx = (unsigned char)(atoi(fkey.substr(0, pipe_pos).c_str())); 23 | return idx; 24 | } 25 | static feature_type get_feature_type(std::string fkey) { 26 | int pipe_pos = fkey.find("|"); 27 | feature_type ftype = (feature_type)(atoi(fkey.substr(pipe_pos+1, fkey.length()-pipe_pos).c_str())); 28 | return ftype; 29 | } 30 | 31 | std::string get_v_name(std::string fkey) { 32 | int pipe_pos = fkey.find("|"); 33 | int idx = atoi(fkey.substr(0, pipe_pos).c_str()); 34 | feature_type ftype = (feature_type)(atoi(fkey.substr(pipe_pos+1, fkey.length()-pipe_pos).c_str())); 35 | for (feature f: features) { 36 | if (f.idx == idx && ftype == f.type) 37 | return f.var_name; 38 | } 39 | return "fixme_notfound"; 40 | } 41 | 42 | long long int get_feature_value(std::string fkey) { 43 | int colon_pos = fkey.find(":"); 44 | if (colon_pos == std::string::npos) { 45 | int pipe_pos = fkey.find("|"); 46 | int idx = atoi(fkey.substr(0, pipe_pos).c_str()); 47 | feature_type ftype = (feature_type)(atoi(fkey.substr(pipe_pos+1, fkey.length()-pipe_pos).c_str())); 48 | for (feature f: features) { 49 | if (f.idx == idx && ftype == f.type) 50 | return f.value; 51 | } 52 | } else { 53 | printf("THIS SHOULD NOT HAPPEN! 
OFFENDING FKEY: %s\n", fkey.c_str()); 54 | exit(-1); 55 | /* 56 | std::string f1, f2; 57 | f1 = fkey.substr(0, colon_pos); 58 | f2 = fkey.substr(colon_pos+1); 59 | return get_feature_value(f1) * get_feature_value(f2); 60 | */ 61 | } 62 | return 0; 63 | } 64 | 65 | }; 66 | 67 | #endif 68 | -------------------------------------------------------------------------------- /freud-statistics/plotter.hh: -------------------------------------------------------------------------------- 1 | #ifndef PLOTTER_HH_DEFINED 2 | #define PLOTTER_HH_DEFINED 3 | 4 | #include 5 | #include 6 | #include "method.hh" 7 | 8 | #define OF_INTERVALS 3 9 | 10 | class plotter { 11 | private: 12 | static std::string getGnuplotScriptContent(const std::vector & data, std::string fname, const std::string &main_feat, const std::vector &magic, metric_type mtype, double radius, corr_with_constraints cwc, std::string func_name, std::string data_name); 13 | 14 | static void plot_annotation(method * mtd, corr_with_constraints cwc, metric_type metric); 15 | 16 | public: 17 | static bool graph(std::vector data, std::string name, std::vector & fkeys, corr_with_constraints cwc, metric_type metric); 18 | 19 | static bool graph(method * mtd, std::string name, std::vector & fkeys, metric_type metric); 20 | 21 | static void plot_annotations(method * mtd, metric_type metric); 22 | }; 23 | 24 | #endif 25 | -------------------------------------------------------------------------------- /freud-statistics/reader.cc: -------------------------------------------------------------------------------- 1 | #include "reader.hh" 2 | #define MAX_FEATURES 512 3 | 4 | feature_type reader::get_type(const std::string & t) { 5 | if (t.find("enum ") == 0) { 6 | return FT_ENUM; 7 | } else if (t.find("int") != std::string::npos 8 | || t.find("long") != std::string::npos 9 | || t.find("float") != std::string::npos 10 | || t.find("double") != std::string::npos 11 | ) { 12 | return FT_INT; 13 | } else if (t.find("char") != 
std::string::npos) { 14 | return FT_STRING; 15 | } else if (t.find("size") != std::string::npos) { 16 | return FT_SIZE; 17 | } else if (t.find("array") != std::string::npos) { 18 | return FT_COLLECTION; 19 | } 20 | return FT_UNKNOWN; 21 | } 22 | 23 | void reader::read_file_binary(std::string filename, std::map &data, std::unordered_map & m_ids_map, const uint32_t file_idx) { 24 | //std::cout << "Reading file " << filename << std::endl; 25 | std::ifstream in(filename.c_str(), std::ios::binary); 26 | 27 | if (!in.is_open()) 28 | std::cout << "Cannot open: " << filename << std::endl; 29 | 30 | uint64_t entry_num = 0; 31 | 32 | while (in) { 33 | uint32_t name_len; 34 | in.read((char *)&name_len, sizeof(uint32_t)); 35 | if (in.eof()) 36 | break; 37 | 38 | char *fname = new char[name_len + 1]; 39 | in.read(fname, sizeof(char) * name_len); 40 | fname[name_len] = '\0'; 41 | method * mtd; 42 | if (data.find(fname) == data.end()) { 43 | mtd = new method(); 44 | data[fname] = mtd; 45 | } 46 | else { 47 | mtd = data.at(fname); 48 | mtd->name = fname; 49 | } 50 | //std::cout << "METHOD " << fname << ": " << mtd->data.size() << std::endl; 51 | 52 | // FEATURE NAMES 53 | uint32_t feature_names_count; 54 | in.read((char *)&feature_names_count, sizeof(uint32_t)); 55 | std::unordered_map feature_names; 56 | //std::cout << "fcount " << fcount << ", vncount " << vncount << std::endl; 57 | char *vname = new char[UINT16_MAX]; 58 | for (uint32_t v = 0; v < feature_names_count; v++) { 59 | uint16_t vnlen; 60 | uint64_t foff = in.tellg(); 61 | in.read((char *)&vnlen, sizeof(uint16_t)); 62 | //std::cout << "LEN: " << vnlen << std::endl; 63 | in.read(vname, sizeof(char) * vnlen); 64 | vname[vnlen] = '\0'; 65 | feature_names.insert(std::make_pair(foff, vname)); 66 | //std::cout << "VAR: " << vname << " at " << foff << std::endl; 67 | } 68 | delete[] vname; 69 | 70 | // TYPE NAMES 71 | uint32_t type_names_count; 72 | in.read((char *)&type_names_count, sizeof(uint32_t)); 73 | std::unordered_map 
type_names; 74 | //std::cout << "fcount " << fcount << ", vncount " << vncount << std::endl; 75 | char *tname = new char[UINT16_MAX]; 76 | for (uint32_t v = 0; v < type_names_count; v++) { 77 | uint16_t vnlen; 78 | uint64_t foff = in.tellg(); 79 | in.read((char *)&vnlen, sizeof(uint16_t)); 80 | //std::cout << "LEN: " << vnlen << std::endl; 81 | in.read(tname, sizeof(char) * vnlen); 82 | tname[vnlen] = '\0'; 83 | type_names.insert(std::make_pair(foff, tname)); 84 | //std::cout << "TYPE: " << tname << " at " << foff << std::endl; 85 | } 86 | delete[] tname; 87 | 88 | uint32_t samples_count; 89 | in.read((char *)&samples_count, sizeof(uint32_t)); 90 | for (uint32_t s = 0; s < samples_count; s++) { 91 | uint64_t time, mem, lock_holding_time, waiting_time, minor_page_faults, major_page_faults; 92 | uint32_t uid_r; 93 | in.read((char *)&uid_r, sizeof(uint32_t)); 94 | in.read((char *)&time, sizeof(uint64_t)); 95 | in.read((char *)&mem, sizeof(uint64_t)); 96 | in.read((char *)&lock_holding_time, sizeof(uint64_t)); 97 | in.read((char *)&waiting_time, sizeof(uint64_t)); 98 | in.read((char *)&minor_page_faults, sizeof(uint64_t)); 99 | in.read((char *)&major_page_faults, sizeof(uint64_t)); 100 | if (waiting_time > time || lock_holding_time > time) { 101 | std::cout << "Entry " << entry_num << std::endl; 102 | std::cout << "Time: " << time << std::endl; 103 | std::cout << "Mem: " << mem << std::endl; 104 | std::cout << "Lock: " << lock_holding_time << std::endl; 105 | std::cout << "Wait: " << waiting_time << std::endl; 106 | std::cout << "MiPF: " << minor_page_faults << std::endl; 107 | std::cout << "MaPF: " << major_page_faults << std::endl; 108 | exit(-1); 109 | } 110 | uint64_t uid = ((1llu << 32) * (const uint64_t)file_idx) + uid_r; 111 | measure *m = new measure(); 112 | m->method_name = fname; 113 | if (m_ids_map.find(uid) != m_ids_map.end()) { 114 | std::cout << "Non unique id found! 
" << fname << "; " << uid << std::endl; 115 | std::cout << "Prev: " << m_ids_map.at(uid)->method_name << std::endl; 116 | exit(-1); 117 | } 118 | m_ids_map.insert(std::make_pair(uid, m)); 119 | m->measures[MT_TIME] = time; 120 | m->measures[MT_MEM] = mem; 121 | m->measures[MT_LOCK] = lock_holding_time; 122 | m->measures[MT_WAIT] = waiting_time; 123 | m->measures[MT_PAGEFAULT_MINOR] = minor_page_faults; 124 | m->measures[MT_PAGEFAULT_MAJOR] = major_page_faults; 125 | 126 | // NUM OF FEATURES 127 | uint32_t num_of_features; 128 | in.read((char *)&num_of_features, sizeof(uint32_t)); 129 | for (uint32_t f = 0; f < num_of_features; f++) { 130 | uint64_t name_offset; 131 | uint64_t type_offset; 132 | int64_t value; 133 | in.read((char *)&name_offset, sizeof(uint64_t)); 134 | in.read((char *)&type_offset, sizeof(uint64_t)); 135 | in.read((char *)&value, sizeof(int64_t)); 136 | std::string type; 137 | if (type_offset > 0) 138 | type = type_names.at(type_offset); 139 | else 140 | type = "sysf"; // system feature 141 | struct feature feat; 142 | feat.type = get_type(type); // info not given by Pin 143 | std::string fname = feature_names.at(name_offset); 144 | if (feat.type == FT_ENUM) { 145 | mtd->enums.insert(fname); 146 | } 147 | // Never use the vptr or enums as an actual feature for regressions 148 | if (feat.type != FT_ENUM && fname.find("_vptr") == std::string::npos) 149 | mtd->feature_set.insert(fname); 150 | feat.value = value; 151 | //std::cout << "READ: " << fname << ": " << value << std::endl; 152 | m->features_map.insert(make_pair(fname, feat)); 153 | } 154 | 155 | // BRANCHES 156 | uint32_t num_of_branches; 157 | in.read((char *)&num_of_branches, sizeof(uint32_t)); 158 | for (int b = 0; b < num_of_branches; b++) { 159 | uint16_t branch_id; 160 | in.read((char *)&branch_id, sizeof(uint16_t)); 161 | uint32_t num_of_executions; 162 | in.read((char *)&num_of_executions, sizeof(uint32_t)); 163 | for (int e = 0; e < num_of_executions; e++) { 164 | bool taken; 165 | 
in.read((char *)&taken, sizeof(bool)); 166 | m->branches[branch_id].push_back(taken); 167 | } 168 | } 169 | 170 | // CHILDREN 171 | uint32_t num_of_children; 172 | in.read((char *)&num_of_children, sizeof(uint32_t)); 173 | for (int c = 0; c < num_of_children; c++) { 174 | uint32_t c_id; 175 | in.read((char *)&c_id, sizeof(uint32_t)); 176 | uint64_t c_id_64 = (1llu << 32) * file_idx; 177 | c_id_64 += c_id; 178 | m->children_ids.push_back(c_id_64); 179 | } 180 | mtd->data.push_back(m); 181 | 182 | entry_num++; 183 | 184 | } 185 | delete[] fname; 186 | //std::cout << "Done reading samples" << std::endl; 187 | } 188 | in.close(); 189 | } 190 | 191 | 192 | bool reader::read_folder(std::string folder_name, std::map &data, std::unordered_map & m_ids_map) { 193 | DIR *dir; 194 | uint32_t fid = 0; 195 | struct dirent *ent; 196 | if ((dir = opendir(folder_name.c_str())) != 0) { 197 | while ((ent = readdir (dir)) != NULL) { 198 | if (strncmp(ent->d_name, "idcm", strlen("idcm")) == 0) 199 | read_file_binary(folder_name + ent->d_name, data, m_ids_map, fid++); 200 | } 201 | closedir(dir); 202 | } else { 203 | perror ("Could not open dir"); 204 | return false; 205 | } 206 | //utils::log(VL_INFO, "Read " + std::to_string(m_ids_map.size()) + " measurements"); 207 | return true; 208 | } 209 | -------------------------------------------------------------------------------- /freud-statistics/reader.hh: -------------------------------------------------------------------------------- 1 | #ifndef READER_HH_DEFINED 2 | #define READER_HH_DEFINED 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | 15 | #include "method.hh" 16 | #include "utils.hh" 17 | 18 | class reader { 19 | private: 20 | static feature_type get_type(const std::string & t); 21 | static void read_file_binary(std::string filename, std::map &data, std::unordered_map & m_ids_map, const uint32_t file_idx); 22 | 23 | public: 24 | static bool 
read_folder(std::string folder_name, std::map &data, std::unordered_map & m_ids_map); 25 | }; 26 | 27 | #endif 28 | -------------------------------------------------------------------------------- /freud-statistics/reader_annotations.cc: -------------------------------------------------------------------------------- 1 | #include "reader_annotations.hh" 2 | #include "method.hh" 3 | 4 | void reader_annotations::read_annotations_file(std::string filename, std::string symbol_name, std::map &data) { 5 | data[symbol_name] = new annotation(filename); 6 | } 7 | 8 | bool reader_annotations::read_annotations_folder(std::string folder_name, std::map &data) { 9 | DIR *dir; 10 | struct dirent *ent; 11 | if ((dir = opendir(folder_name.c_str())) != NULL) { 12 | while ((ent = readdir (dir)) != NULL) { 13 | //if (strncmp(ent->d_name, "idcm", strlen("idcm")) == 0) 14 | read_annotations_file(folder_name + ent->d_name, "todo_implement_me", data); 15 | } 16 | closedir(dir); 17 | } else { 18 | perror ("Could not open dir"); 19 | return false; 20 | } 21 | std::cout << "Read " << data.size() << " annotations " << std::endl; 22 | return true; 23 | } 24 | -------------------------------------------------------------------------------- /freud-statistics/reader_annotations.hh: -------------------------------------------------------------------------------- 1 | #ifndef READER_ANNOTATIONS_HH_DEFINED 2 | #define READER_ANNOTATIONS_HH_DEFINED 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | 15 | #include "method.hh" 16 | 17 | 18 | class reader_annotations { 19 | private: 20 | 21 | public: 22 | static void read_annotations_file(std::string filename, std::string symbol_name, std::map &data); 23 | static bool read_annotations_folder(std::string folder_name, std::map &data); 24 | }; 25 | 26 | #endif 27 | 28 | -------------------------------------------------------------------------------- /freud-statistics/regression.cc: 
-------------------------------------------------------------------------------- 1 | #include "regression.hh" 2 | #include 3 | #include 4 | #include 5 | #include 6 | 7 | multiple_regression::multiple_regression(std::stringstream &ss) { 8 | std::string temp; 9 | ss >> temp; // [det= 10 | ss >> determination; 11 | ss >> temp; // ] 12 | 13 | // Coefficients 14 | while (ss) { 15 | ss >> temp; 16 | if (!ss) 17 | break; 18 | int star_index = temp.find("*"); 19 | if (star_index == std::string::npos) { 20 | // Intercept 21 | intercept_value = std::stod(temp); 22 | } else { 23 | std::string c = temp.substr(0, star_index); 24 | std::string fname = temp.substr(star_index + 1, std::string::npos); 25 | coefficients[fname] = std::stod(c); 26 | features.push_back(fname); 27 | } 28 | } 29 | } 30 | 31 | multiple_regression::multiple_regression(const double r2, const double intercept, const std::vector &in_features, const std::vector &var_names, const std::vector &degrees, const std::vector> &coefficients, const std::unordered_map &coefficients_map, const std::unordered_map &vnames) { 32 | determination = r2; 33 | intercept_value = intercept; 34 | this->coefficients = coefficients_map; 35 | this->var_names = vnames; 36 | this->features = in_features; 37 | int nr = features.size(); 38 | } 39 | 40 | 41 | void multiple_regression::print(std::ostream & stream) { 42 | stream << "RR "; 43 | stream << intercept_value; 44 | stream << " + "; 45 | #if 0 46 | for (std::pair p: feature_regressions) { 47 | std::string fname = p.first; 48 | univariate_regression * r = p.second; 49 | stream << " {" << fname << "_" << p.second->var_name << "} "; 50 | for (int i = 0; i < r->degree; i++) { 51 | stream << r->coefficients[i]; 52 | stream << " "; 53 | } 54 | } 55 | #endif 56 | stream << "; det: " << determination; 57 | } 58 | 59 | std::string multiple_regression::get_string() { 60 | std::ostringstream res; 61 | res.precision(std::numeric_limits::digits10 + 1); 62 | res << std::scientific; 63 | 64 | res << 
"[det= " << determination << " ] "; 65 | res << intercept_value; 66 | for (std::string f: features) { 67 | if (coefficients[f] >= 0) 68 | res << " +" << coefficients[f] << "*" << var_names[f]; 69 | else 70 | res << " " << coefficients[f] << "*" << var_names[f]; 71 | } 72 | return res.str(); 73 | } 74 | -------------------------------------------------------------------------------- /freud-statistics/regression.hh: -------------------------------------------------------------------------------- 1 | #ifndef REGRESSION_HH_DEFINED 2 | #define REGRESSION_HH_DEFINED 3 | 4 | #include "const.hh" 5 | 6 | #include 7 | #include 8 | #include 9 | #include 10 | 11 | #include 12 | #include 13 | #include 14 | 15 | class multiple_regression { 16 | public: 17 | // constructors 18 | multiple_regression(const double r2, const double intercept, const std::vector &in_features, const std::vector &var_names, const std::vector &degrees, const std::vector> &coefficients, const std::unordered_map &coefficients_map, const std::unordered_map &vnames); 19 | 20 | // parse from string 21 | multiple_regression(std::stringstream &in); 22 | 23 | // R^2 24 | double determination; 25 | double intercept_value; 26 | std::vector features; 27 | std::unordered_map coefficients; 28 | std::unordered_map var_names; 29 | 30 | void print(std::ostream & stream); 31 | std::string get_string(); 32 | }; 33 | 34 | #endif 35 | -------------------------------------------------------------------------------- /freud-statistics/stats.hh: -------------------------------------------------------------------------------- 1 | #ifndef STATS_HH_INCLUDED 2 | #define STATS_HH_INCLUDED 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | 10 | #include "method.hh" 11 | #include "regression.hh" 12 | 13 | // Maximum allowed correlation between two different features when filtering 14 | #define CORRELATION_THRESHOLD 0.9 15 | 16 | // Minimum r2 value that is accepted for a correlation 17 | #define MIN_DET 0.5 18 | 19 | 
// Maximum acceptable p-value for a feature in a regression 20 | #define GOOD_PVALUE 2e-11 21 | 22 | // Additional penalty for a higher degree for a regression 23 | #define BIC_DIFF 10 24 | 25 | // When filtering features to have a better understanding of the main features, 26 | // how many features to remove at each iteration 27 | #define DROP_K 5 28 | 29 | // Degree used to represent internally the nlogn term 30 | #define LOG_DEG 3 31 | 32 | // Constants used to evaluate the distance between performance annotations 33 | #define MAX_CONST_DST 1.05 34 | #define MAX_CLUST_DST 0.05 35 | 36 | class stats { 37 | private: 38 | static void compute_tot_min_max(const std::vector & data, std::string fkey, metric_type mtype, double &tot, double &sqr_tot, double &min, double &max, double &tot_feature, double &sqr_tot_feature, bool remove_dependencies); 39 | static double compute_correlation_coefficient(const std::vector & data, std::string fkey, metric_type mtype, bool spearman); 40 | static void compute_variances(const std::vector & data, double tot, double sqr_tot, double tot_feature, double sqr_tot_feature, double &var, double &feature_var, const double avoid_c_cancellation_metric, const double avoid_c_cancellation_feature, const std::string fkey, metric_type mtype); 41 | 42 | public: 43 | /*********************** 44 | REGRESSION ANALYSIS 45 | **********************/ 46 | static void find_noncorr_features(const std::vector &in_data, const std::unordered_set &features, std::vector &noncorr_features); 47 | static void free_r_memory(const int howmany); 48 | static void * prepare_data_for_r(const std::vector &data, const std::vector &features, std::vector &features_interaction, std::vector &quad_features, std::vector &quad_features_interaction, std::vector &log_features, std::vector &log_features_interaction, std::unordered_map &names_map_to_r, std::unordered_map &names_map_to_c, std::unordered_map &quad_names_map_to_r, std::unordered_map &quad_names_map_to_c, 
std::unordered_map &log_names_map_to_r, std::unordered_map &log_names_map_to_c, const metric_type mtype); 49 | static bool compute_multiple_regression(const std::vector &data, const std::vector &features_to_use, const std::vector &quad_features_to_use, const std::vector &log_features_to_use, const std::vector &features_to_use_interaction, const std::vector &quad_features_to_use_interaction, const std::vector &log_features_to_use_interaction, metric_type mtype, std::vector &used_features, double &det, double &intercept, std::vector> &coefficients, double &bic, const int degree, void * matp, const std::unordered_map &names_map_to_r, const std::unordered_map &names_map_to_c, const std::unordered_map &quad_names_map_to_r, const std::unordered_map &quad_names_map_to_c, const std::unordered_map &log_names_map_to_r, const std::unordered_map &log_names_map_to_c, std::unordered_map &coefficients_map, const bool refining, const double min_det); 50 | 51 | /*********************** 52 | CLUSTER ANALYSIS 53 | **********************/ 54 | static void kde_clustering(const std::vector & data, const metric_type mtype, std::vector & clusters, bool R_needs_init); 55 | static void create_clusters(const std::vector & data, const metric_type mtype, const double * kde, const double min, const double max, std::vector & clusters); 56 | 57 | /*********************** 58 | MISC 59 | **********************/ 60 | static void is_constant(const std::vector & data, const metric_type mtype, bool &is_c, uint64_t &c); 61 | 62 | /*********************** 63 | PERFORMANCE ANNOTATION VALIDATION 64 | **********************/ 65 | static bool validate_cluster_before_regression(const std::vector & data, metric_type mtype, const double cl_probability, const double lb, const double ub, std::vector & f_data); 66 | static bool validate_regression(const std::vector & data, std::string feature_key, metric_type mtype, multiple_regression * r); 67 | static bool validate_constant(const std::vector & data, metric_type 
mtype, int cvalue); 68 | static bool validate_clusters(const std::vector & data, metric_type mtype, const std::vector & clusters); 69 | 70 | /*********************** 71 | PLOTTING 72 | **********************/ 73 | static void printGNUPlotData(const std::vector & data, std::vector fkeys, const metric_type mtype, bool remove_dependencies, std::string &res, double min, double max); 74 | static void get_ofeature_intervals(const std::vector & data, std::string feat, std::vector &values, const int intervals); 75 | static double get_avg_value(const std::vector & data, std::string feat); 76 | }; 77 | 78 | #endif 79 | -------------------------------------------------------------------------------- /freud-statistics/utils.hh: -------------------------------------------------------------------------------- 1 | #ifndef UTILS_HH_INCLUDED 2 | #define UTILS_HH_INCLUDED 3 | 4 | #include "method.hh" 5 | 6 | enum verbosity_levels { 7 | VL_ERROR = 0, 8 | VL_QUIET, 9 | VL_INFO, 10 | VL_DEBUG 11 | }; 12 | 13 | 14 | class utils { 15 | private: 16 | static std::string log_label(enum verbosity_levels vl); 17 | 18 | public: 19 | static void log(enum verbosity_levels l, std::string msg); 20 | static bool similar_cwc(struct corr_with_constraints cwc1, struct corr_with_constraints cwc2); 21 | 22 | static bool similar_regressions(std::vector vr1, std::vector vr2); 23 | 24 | static void get_filtered_data(corr_with_constraints cwc, const std::vector in, std::vector &out); 25 | 26 | static void get_main_trend(std::string fkey, metric_type metric, const std::vector data, std::vector &res); 27 | 28 | static void remove_addnoise(std::string fkey, metric_type metric, const std::vector data, std::vector &res); 29 | 30 | static std::string get_random_color(int seed); 31 | static int get_degree(std::string feat); 32 | static std::string base_feature(std::string feature); 33 | 34 | // Annotation string generators 35 | static std::string generate_annotation_string(const struct annotation * pa); 36 | 
static std::string generate_annotation_string(const corr_with_constraints cwc); 37 | static std::string generate_annotation_string(const std::vector * in); 38 | static std::string generate_annotation_string(const std::vector * clusters); 39 | static bool comp_contains(const std::string &comp, const std::string &f); 40 | }; 41 | 42 | #endif 43 | -------------------------------------------------------------------------------- /micro_benchmark/Makefile: -------------------------------------------------------------------------------- 1 | CXXFLAGS+=-std=c++11 -g -O1 -gstrict-dwarf -gdwarf-4 -fPIC -pie 2 | 3 | CLEAN := 4 | 5 | all: micro_benchmark_bin 6 | 7 | micro_benchmark_bin: micro_benchmark 8 | CLEAN += micro_benchmark 9 | 10 | clean: 11 | rm -f $(CLEAN) 12 | -------------------------------------------------------------------------------- /micro_benchmark_analysis/Makefile: -------------------------------------------------------------------------------- 1 | ROOT=./ 2 | WORKLOADS=mysql_classes 3 | LOGDIR=$(ROOT)logs 4 | EPSDIR=$(ROOT)eps 5 | GPLDIR=$(ROOT)gnuplot 6 | ANNDIR=$(ROOT)ann 7 | CHECKDIR=$(ROOT)ass 8 | SYMDIR=$(ROOT)symbols 9 | 10 | ANALYZER=$(ROOT)freud-statistics 11 | CHECKER=$(ROOT)freud-statistics 12 | 13 | LOGFILES := $(wildcard $(LOGDIR)/*.bin) 14 | 15 | .PHONY: all 16 | all: analysis 17 | 18 | .PHONY: clean 19 | clean: 20 | rm -rf $(CHECKDIR) 21 | rm -rf $(ANNDIR) 22 | rm -rf $(EPSDIR) 23 | rm -rf $(GPLDIR) 24 | 25 | $(SYMDIR): $(LOGFILES) 26 | rm -rf $(SYMDIR) 27 | mkdir $(SYMDIR) 28 | $(SYM_SPLIT) $(LOGFILES) -o $(SYMDIR) 29 | 30 | SYMBOLS := $(patsubst $(SYMDIR)/%,%,$(wildcard $(SYMDIR)/*)) 31 | #EPSDIRS := $(addprefix $(EPSDIR)/,$(addsuffix _graphs,$(SYMBOLS))) 32 | ANNDIRS := $(addprefix $(ANNDIR)/,$(addsuffix .txt,$(SYMBOLS))) 33 | CHECKDIRS := $(addprefix $(CHECKDIR)/,$(addsuffix .txt,$(SYMBOLS))) 34 | 35 | .PHONY: analysis 36 | analysis: $(ANNDIRS) 37 | 38 | $(EPSDIR): 39 | mkdir $@ 40 | 41 | $(ANNDIR): 42 | mkdir $@ 43 | mkdir $(EPSDIR) 44 | 
mkdir $(GPLDIR) 45 | 46 | ## $(info analysis_template called with parameter "$(1)") 47 | 48 | define analysis_template 49 | $$(ANNDIR)/$(1).txt: 50 | -mkdir $$(ANNDIR) > /dev/null 2>&1 51 | time -o times.txt --append -f "%e" $(ANALYZER) 3 0.9 0 $(1) $$(SYMDIR)/$(1)/ 52 | endef 53 | 54 | $(foreach sym,$(SYMBOLS),$(eval $(call analysis_template,$(sym)))) 55 | 56 | .PHONY: check 57 | check: $(CHECKDIRS) 58 | 59 | $(CHECKDIR): 60 | mkdir $@ 61 | 62 | define check_template 63 | $$(CHECKDIR)/$(1).txt: 64 | -mkdir $$(CHECKDIR) > /dev/null 2>&1 65 | $(CHECKER) 5 0 0 $(1) $$(SYMDIR)/$(1)/ $(ANNDIR) 66 | endef 67 | 68 | $(foreach sym,$(SYMBOLS),$(eval $(call check_template,$(sym)))) 69 | 70 | 71 | -------------------------------------------------------------------------------- /test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | export R_HOME=/usr/lib/R 3 | 4 | set -e 5 | 6 | echo "Cleaning up previous runs..." 7 | rm -rf test/ann 8 | rm -rf test/symbols 9 | rm -rf micro_benchmark_analysis/eps 10 | rm -rf micro_benchmark_analysis/plots 11 | rm -rf micro_benchmark_analysis/ann 12 | rm -rf micro_benchmark_analysis/symbols 13 | rm -rf freud-pin/symbols 14 | cp freud-statistics/freud-statistics micro_benchmark_analysis 15 | cp freud-statistics/hpi.R micro_benchmark_analysis 16 | 17 | echo "Creating the instrumentation..." 18 | ./freud-dwarf/freud-dwarf micro_benchmark/micro_benchmark --max-depth=4 >> test_log 19 | mv feature_processing.cc freud-pin/ 20 | mv table.txt freud-pin/ 21 | cd freud-pin/ 22 | make clean > /dev/null 23 | make &>> test_log 24 | mkdir symbols 25 | 26 | echo "Running test program for instrumentation..." 27 | ../pin/pin -t obj-intel64/freud-pin.so -- ../micro_benchmark/micro_benchmark --test_instrumentation &>> test_log 28 | mv symbols ../test/ 29 | cd ../ 30 | 31 | echo "Analyzing instrumentation results..." 
32 | cd test 33 | ./test --test-instr 34 | 35 | echo "Running test program for the statistical analysis..." 36 | rm -rf symbols # clean the logs for the test of the instrumentation 37 | cd ../freud-pin/ 38 | mkdir symbols 39 | ../pin/pin -t obj-intel64/freud-pin.so -- ../micro_benchmark/micro_benchmark #&>> test_log 40 | mv symbols ../micro_benchmark_analysis/ 41 | 42 | echo "Performing statistical analysis..." 43 | cd ../ 44 | cp freud-statistics/freud-statistics micro_benchmark_analysis 45 | cp freud-statistics/hpi.R micro_benchmark_analysis 46 | cd micro_benchmark_analysis/ 47 | mkdir eps 48 | mkdir plots 49 | make &>> make_log 50 | 51 | echo "Analyzing statistical results..." 52 | mv ann ../test/ 53 | cd ../test 54 | ./test --test-stats 55 | 56 | echo "Done!" 57 | echo "You can find the plots for the benchmarks in micro_benchmark_analysis/eps" 58 | -------------------------------------------------------------------------------- /test/Makefile: -------------------------------------------------------------------------------- 1 | CXXFLAGS=-std=c++11 2 | 3 | all: test 4 | 5 | LIBS= ../freud-statistics/reader.o ../freud-statistics/regression.o ../freud-statistics/reader_annotations.o 6 | 7 | test: $(LIBS) 8 | 9 | CLEAN += test 10 | 11 | clean: 12 | rm -f $(CLEAN) 13 | -------------------------------------------------------------------------------- /test/test.cc: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2020 Daniele Rogora 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | #include "../freud-statistics/method.hh" 18 | #include "../freud-statistics/reader.hh" 19 | #include "../freud-statistics/regression.hh" 20 | #include "../freud-statistics/reader_annotations.hh" 21 | 22 | #include <string.h> 23 | 24 | #include "instr_tests.hh" 25 | 26 | #define TEST_STATS 27 | 28 | #ifdef TEST_STATS 29 | #include "stats_tests.hh" 30 | #endif 31 | 32 | // enums defined in analysis/const.hh 33 | /* 34 | FT_BOOL = 0, 35 | FT_INT = 1, 36 | FT_COLLECTION = 2, 37 | FT_COLLECTION_AGGR = 3, 38 | FT_STRING = 4, // and also mysqllexstring are caught here 39 | FT_PATH = 5, 40 | FT_NASCII = 6, 41 | FT_SIZE = 7, 42 | FT_RESOURCE = 8, 43 | FT_BRANCH_EXEC = 9, 44 | FT_BRANCH_ISTRUE = 10, 45 | FT_STRUCT_MEMBER = 11, 46 | FT_STRUCT_MEMBER_2 = 12, 47 | ... 48 | FT_STRUCT_MEMBER_899 = 910, 49 | */ 50 | 51 | bool test_instrumentation_output() { 52 | bool err = false; 53 | std::unordered_map m_ids_map; 54 | std::map data; 55 | 56 | // _Z15test_linear_inti 57 | std::cout << "Checking _Z15test_linear_inti" << std::endl; 58 | data.clear(); 59 | m_ids_map.clear(); 60 | reader::read_folder("symbols/_Z15test_linear_inti/", data, m_ids_map); 61 | err |= check_Z15test_linear_inti(data); 62 | 63 | // _Z23test_linear_int_pointerPi 64 | std::cout << "Checking _Z23test_linear_int_pointerPi" << std::endl; 65 | data.clear(); 66 | m_ids_map.clear(); 67 | reader::read_folder("symbols/_Z23test_linear_int_pointerPi/", data, m_ids_map); 68 | err |= check_Z23test_linear_int_pointerPi(data); 69 | 70 | // _Z17test_linear_floatf 71 | std::cout << "Checking _Z17test_linear_floatf" << std::endl; 72 | data.clear(); 73 | m_ids_map.clear(); 74 | reader::read_folder("symbols/_Z17test_linear_floatf/", data, m_ids_map); 75 | err |= check_Z17test_linear_floatf(data); 76 | 77 | // _Z13test_quad_inti 78 | std::cout << "Checking _Z13test_quad_inti" << std::endl; 79 | data.clear(); 80 |
m_ids_map.clear(); 81 | reader::read_folder("symbols/_Z13test_quad_inti/", data, m_ids_map); 82 | err |= check_Z13test_quad_inti(data); 83 | 84 | // _Z16test_quad_int_wnii 85 | std::cout << "Checking _Z16test_quad_int_wnii" << std::endl; 86 | data.clear(); 87 | m_ids_map.clear(); 88 | reader::read_folder("symbols/_Z16test_quad_int_wnii/", data, m_ids_map); 89 | err |= check_Z16test_quad_int_wnii(data); 90 | 91 | // _Z19test_linear_charptrPc 92 | std::cout << "Checking _Z19test_linear_charptrPc" << std::endl; 93 | data.clear(); 94 | m_ids_map.clear(); 95 | reader::read_folder("symbols/_Z19test_linear_charptrPc/", data, m_ids_map); 96 | err |= check_Z19test_linear_charptrPc(data); 97 | 98 | // _Z19test_linear_structsP15basic_structure 99 | std::cout << "Checking _Z19test_linear_structsP15basic_structure" << std::endl; 100 | data.clear(); 101 | m_ids_map.clear(); 102 | reader::read_folder("symbols/_Z19test_linear_structsP15basic_structure/", data, m_ids_map); 103 | err |= check_Z19test_linear_structsP15basic_structure(data); 104 | 105 | // _Z19test_linear_classesP11basic_class 106 | std::cout << "Checking _Z19test_linear_classesP11basic_class" << std::endl; 107 | data.clear(); 108 | m_ids_map.clear(); 109 | reader::read_folder("symbols/_Z19test_linear_classesP11basic_class/", data, m_ids_map); 110 | err |= check_Z19test_linear_classesP11basic_class(data); 111 | 112 | // _Z25test_linear_fitinregister15fit_in_register 113 | std::cout << "Checking _Z25test_linear_fitinregister15fit_in_register" << std::endl; 114 | data.clear(); 115 | m_ids_map.clear(); 116 | reader::read_folder("symbols/_Z25test_linear_fitinregister15fit_in_register/", data, m_ids_map); 117 | err |= check_Z25test_linear_fitinregister15fit_in_register(data); 118 | 119 | // _Z20test_linear_branchesiii 120 | std::cout << "Checking _Z20test_linear_branchesiii" << std::endl; 121 | data.clear(); 122 | m_ids_map.clear(); 123 | reader::read_folder("symbols/_Z20test_linear_branchesiii/", data, m_ids_map); 124 | 
err |= check_Z20test_linear_branchesiii(data); 125 | 126 | // _Z18test_linear_vectorPSt6vectorIiSaIiEE 127 | std::cout << "Checking _Z18test_linear_vectorPSt6vectorIiSaIiEE" << std::endl; 128 | data.clear(); 129 | m_ids_map.clear(); 130 | reader::read_folder("symbols/_Z18test_linear_vectorPSt6vectorIiSaIiEE/", data, m_ids_map); 131 | err |= check_Z18test_linear_vectorPSt6vectorIiSaIiEE(data); 132 | 133 | // _Z18test_linear_farrayPA10_j 134 | std::cout << "Checking _Z18test_linear_farrayPA10_j" << std::endl; 135 | data.clear(); 136 | m_ids_map.clear(); 137 | reader::read_folder("symbols/_Z18test_linear_farrayPA10_j/", data, m_ids_map); 138 | err |= check_Z18test_linear_farrayPA10_j(data); 139 | 140 | // _Z14test_namespacePN8whatever25namespaced_abstract_classE 141 | std::cout << "Checking _Z14test_namespacePN8whatever25namespaced_abstract_classE" << std::endl; 142 | data.clear(); 143 | m_ids_map.clear(); 144 | reader::read_folder("symbols/_Z14test_namespacePN8whatever25namespaced_abstract_classE/", data, m_ids_map); 145 | err |= check_Z14test_namespacePN8whatever25namespaced_abstract_classE(data); 146 | 147 | return err; 148 | } 149 | 150 | #ifdef TEST_STATS 151 | bool test_statistical_analysis() { 152 | bool err = false; 153 | std::map m_ann_map; 154 | 155 | // _Z15test_linear_inti 156 | m_ann_map.clear(); 157 | std::string sym_name = "_Z15test_linear_inti"; 158 | std::cout << "Checking annotation for " << sym_name << std::endl; 159 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 160 | err |= check_ann_Z15test_linear_inti(m_ann_map); 161 | 162 | // _Z23test_linear_int_pointerPi 163 | 164 | // _Z13test_quad_inti 165 | sym_name = "_Z13test_quad_inti"; 166 | std::cout << "Checking " << sym_name << std::endl; 167 | m_ann_map.clear(); 168 | std::cout << "Checking annotation for " << sym_name << std::endl; 169 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 170 | err |= 
check_ann_Z13test_quad_inti(m_ann_map); 171 | 172 | // _Z16test_quad_int_wnii 173 | sym_name = "_Z16test_quad_int_wnii"; 174 | std::cout << "Checking " << sym_name << std::endl; 175 | m_ann_map.clear(); 176 | std::cout << "Checking annotation for " << sym_name << std::endl; 177 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 178 | err |= check_ann_Z16test_quad_int_wnii(m_ann_map); 179 | 180 | // _Z19test_linear_charptrPc 181 | sym_name = "_Z19test_linear_charptrPc"; 182 | std::cout << "Checking " << sym_name << std::endl; 183 | m_ann_map.clear(); 184 | std::cout << "Checking annotation for " << sym_name << std::endl; 185 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 186 | err |= check_ann_Z19test_linear_charptrPc(m_ann_map); 187 | 188 | // _Z19test_linear_structsP15basic_structure 189 | sym_name = "_Z19test_linear_structsP15basic_structure"; 190 | std::cout << "Checking " << sym_name << std::endl; 191 | m_ann_map.clear(); 192 | std::cout << "Checking annotation for " << sym_name << std::endl; 193 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 194 | err |= check_ann_Z19test_linear_structsP15basic_structure(m_ann_map); 195 | 196 | // _Z19test_linear_classesP11basic_class 197 | sym_name = "_Z19test_linear_classesP11basic_class"; 198 | std::cout << "Checking " << sym_name << std::endl; 199 | m_ann_map.clear(); 200 | std::cout << "Checking annotation for " << sym_name << std::endl; 201 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 202 | err |= check_ann_Z19test_linear_classesP11basic_class(m_ann_map); 203 | 204 | // _Z25test_linear_fitinregister15fit_in_register 205 | sym_name = "_Z25test_linear_fitinregister15fit_in_register"; 206 | std::cout << "Checking " << sym_name << std::endl; 207 | m_ann_map.clear(); 208 | std::cout << "Checking annotation for " << sym_name << std::endl; 209 | 
reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 210 | err |= check_ann_Z25test_linear_fitinregister15fit_in_register(m_ann_map); 211 | 212 | // _Z20test_linear_branchesiii 213 | sym_name = "_Z20test_linear_branchesiii"; 214 | std::cout << "Checking " << sym_name << std::endl; 215 | m_ann_map.clear(); 216 | std::cout << "Checking annotation for " << sym_name << std::endl; 217 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 218 | err |= check_ann_Z20test_linear_branchesiii(m_ann_map); 219 | 220 | // _Z18test_linear_vectorPSt6vectorIiSaIiEE 221 | sym_name = "_Z18test_linear_vectorPSt6vectorIiSaIiEE"; 222 | std::cout << "Checking " << sym_name << std::endl; 223 | m_ann_map.clear(); 224 | std::cout << "Checking annotation for " << sym_name << std::endl; 225 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 226 | err |= check_ann_Z18test_linear_vectorPSt6vectorIiSaIiEE(m_ann_map); 227 | 228 | // _Z18test_linear_farrayPA10_j 229 | sym_name = "_Z18test_linear_farrayPA10_j"; 230 | std::cout << "Checking " << sym_name << std::endl; 231 | m_ann_map.clear(); 232 | std::cout << "Checking annotation for " << sym_name << std::endl; 233 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 234 | err |= check_ann_Z18test_linear_farrayPA10_j(m_ann_map); 235 | 236 | // _Z22test_random_clusteringi 237 | sym_name = "_Z22test_random_clusteringi"; 238 | std::cout << "Checking " << sym_name << std::endl; 239 | m_ann_map.clear(); 240 | std::cout << "Checking annotation for " << sym_name << std::endl; 241 | reader_annotations::read_annotations_file("ann/" + sym_name + ".txt", sym_name, m_ann_map); 242 | err |= check_ann_Z22test_random_clusteringi(m_ann_map); 243 | 244 | 245 | return err; 246 | } 247 | #endif 248 | 249 | int main(int argc, char *argv[]) { 250 | 251 | if (argc <= 1) { 252 | std::cout << "you must
specify either --test-instr or --test-stats" << std::endl; 253 | return -1; 254 | } 255 | 256 | if (strcmp(argv[1], "--test-instr") == 0) { 257 | std::cout << "\033[1;34m ### TESTING INSTRUMENTATION ### \033[0m" << std::endl; 258 | if (test_instrumentation_output()) { 259 | std::cout << "\033[1;31m Could not complete all tests successfully! \033[0m" << std::endl; 260 | return -1; 261 | } 262 | std::cout << "\033[1;32m All tests completed successfully! \033[0m" << std::endl; 263 | } 264 | #ifdef TEST_STATS 265 | else if (strcmp(argv[1], "--test-stats") == 0) { 266 | std::cout << "\033[1;34m ### TESTING STATISTICS ### \033[0m" << std::endl; 267 | if (test_statistical_analysis()) { 268 | std::cout << "\033[1;31m Could not complete all tests successfully! \033[0m" << std::endl; 269 | return -1; 270 | } 271 | std::cout << "\033[1;32m All tests completed successfully! \033[0m" << std::endl; 272 | } 273 | #endif 274 | else { 275 | std::cout << "Could not parse parameter, or feature not enabled: " << argv[1] << std::endl; 276 | return -1; 277 | } 278 | 279 | } 280 | --------------------------------------------------------------------------------
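The two test functions in test/test.cc repeat the same clear/read/check sequence once per mangled symbol and OR each result into `err`. That pattern could be driven from a table instead. The following is only a minimal sketch: `symbol_check` and `run_checks` are hypothetical names that do not exist in the repository, and the callbacks here receive a folder path, whereas the real `check_*` functions receive the data already parsed by `reader::read_folder` or `reader_annotations::read_annotations_file`.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical table entry (not part of Freud): a mangled symbol name and
// a callback that returns true when the check for that symbol FAILS,
// mirroring the "err |= check_...(...)" convention used in test.cc.
struct symbol_check {
    std::string symbol;
    std::function<bool(const std::string &)> check;
};

// Runs every check in the table and returns true if any of them failed.
// All checks run even after a failure, because |= does not short-circuit.
bool run_checks(const std::vector<symbol_check> &checks,
                const std::string &folder) {
    bool err = false;
    for (const auto &c : checks) {
        // An empty std::function would throw when called; fail early instead.
        assert(static_cast<bool>(c.check));
        std::cout << "Checking " << c.symbol << std::endl;
        err |= c.check(folder + "/" + c.symbol + "/");
    }
    return err;
}
```

Returning the accumulated `err` from a single loop keeps the failure flag from being dropped by accident, and adding a new symbol becomes a one-line change to the table.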