├── LICENSE ├── README.md ├── SPDX.spdx ├── cla-README.pdf ├── cla-metrics ├── .DS_Store ├── README.md ├── find_elf.sh ├── fn_metrics.py ├── input │ ├── .DS_Store │ ├── elfs │ │ └── .placeholder │ ├── objdumps │ │ └── .placeholder │ └── source-info.json └── output │ ├── .DS_Store │ ├── elf-results │ └── .placeholder │ ├── graphs │ └── .placeholder │ └── obj-results │ └── .placeholder ├── go-cla-examples ├── Makefile └── src │ ├── init │ ├── init.c │ └── init.h │ └── main.go └── rust-cla-examples ├── Cargo.lock ├── Cargo.toml ├── Makefile ├── build.rs └── src ├── init ├── init.c └── init.h └── main.rs /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 MIT Lincoln Laboratory 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Overview 2 | 3 | This repository contains the source code behind the NDSS '22 paper "Cross-Language Attacks", available [here](TODO). 4 | 5 | The paper shows that adding code in "safe" languages such as Rust to applications in unsafe languages such as C/C++ may undermine hardening techniques that have been applied to the C/C++ code. This paradoxical result shows the importance of having well-thought-out and consistent threat models. Here we provide the proofs of concept referenced in the paper, for both Rust and Go. We also provide the analysis scripts we used to gauge how prevalent these vulnerabilities might be in Firefox. 6 | 7 | ## Objective 8 | 9 | The objective of this project is to aid authors of multi-language software 10 | applications in hardening their code. Securing such applications effectively 11 | requires understanding the threat model that they face, and how different 12 | defenses compose. We hope that our exploration of this subject results in more 13 | secure software. 14 | 15 | # Directory Layout 16 | 17 | ## rust-cla-examples 18 | 19 | In this directory, one can find a mixed-language application (MLA) with both Rust and C code that is vulnerable to a number of Cross-Language Attacks (CLAs). The C side of the program can be compiled either as a static library (libinit.a) or as a dynamic shared library (libinit.so). Furthermore, the C library is compiled with Control Flow Integrity (CFI) to prevent code-reuse attacks. However, the C code contains a series of spatial memory corruption errors (out-of-bounds, OOB) and temporal memory corruption errors (use-after-free, UAF, or double free) that an attacker can leverage to degrade the spatial and temporal safety of Rust or to bypass the CFI protection on the C library. 
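As a rough illustration of the spatial case described above, the following sketch shows how an unchecked write on the foreign side of an FFI boundary can change data that safe Rust code never touches. It is not taken from `rust-cla-examples`: the `Session` type, the `init_name` routine (written in Rust here as a stand-in for a C function so that the example builds without a C toolchain), and `corrupt_flag` are all hypothetical names chosen for this example.

```rust
// Hypothetical illustration only; not the repository's proof of concept.

/// A Rust-owned record whose layout is shared with foreign code.
#[repr(C)]
struct Session {
    name: [u8; 8],
    is_admin: u8,
}

/// Stand-in for an unchecked C routine: it trusts `len` and performs
/// no bounds check against the 8-byte `name` field.
unsafe extern "C" fn init_name(dst: *mut u8, len: usize, fill: u8) {
    for i in 0..len {
        // Safety: the caller must guarantee `dst..dst + len` is writable.
        unsafe { *dst.add(i) = fill };
    }
}

/// Hands the record to the "foreign" routine with an off-by-one length
/// and returns the flag that no safe Rust code ever wrote to.
fn corrupt_flag() -> u8 {
    let mut s = Session { name: [0; 8], is_admin: 0 };
    // Pointer to the whole 9-byte record, so the write below stays
    // inside one object while still running past the end of `name`.
    let base = std::ptr::addr_of_mut!(s) as *mut u8;
    unsafe { init_name(base, 9, 1) };
    // The in-bounds part of the write filled `name` as intended...
    assert_eq!(s.name, [1u8; 8]);
    // ...and the ninth byte landed in the adjacent field.
    s.is_admin
}

fn main() {
    println!("is_admin after foreign call: {}", corrupt_flag());
}
```

The sketch keeps the overflow inside a single `#[repr(C)]` struct so that it stays well-defined Rust. In a real mixed-language binary the corrupted neighbor could be any Rust-owned object that the C code can reach.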
20 | 21 | ## go-cla-examples 22 | In this directory, one can find another mixed language application (MLA) with both Go and C code that is vulnerable to a number of Cross Language Attacks (CLAs). Similar to the Rust MLA example, the C side of the program can be compiled separately as a static library (libinit.a) or a dynamic shared library (libinit.so). In fact, these CLA examples contain a similar set of attacks as the Rust MLA. 23 | 24 | ## cla-metrics 25 | A series of scripts to analyze mixed-language binaries for metrics that indicate the opportunity for a Cross-Language Attack (CLA). 26 | 27 | # Disclaimer 28 | 29 | Cross-Language Attacks is distributed under the terms of the MIT License 30 | DISTRIBUTION STATEMENT A. Approved for public release: distribution unlimited. 31 | 32 | © 2021 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 33 | 34 | 35 | Subject to FAR 52.227-11 – Patent Rights – Ownership by the Contractor (May 2014) 36 | SPDX-License-Identifier: MIT 37 | 38 | This material is based upon work supported by the Under Secretary of Defense (USD) for Research & Engineering (R&E) under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of USD (R&E). 
39 | 40 | The software/firmware is provided to you on an As-Is basis 41 | -------------------------------------------------------------------------------- /SPDX.spdx: -------------------------------------------------------------------------------- 1 | SPDXVersion: SPDX-2.1 2 | PackageName: Cross-Language Attacks 3 | PackageHomePage: https://github.com/mit-ll/cross-language-attacks/ 4 | PackageOriginator: MIT Lincoln Laboratory 5 | PackageCopyrightText:2021 Massachusetts Institute of Technology 6 | PackageLicenseDeclared: MIT 7 | -------------------------------------------------------------------------------- /cla-README.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-README.pdf -------------------------------------------------------------------------------- /cla-metrics/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/.DS_Store -------------------------------------------------------------------------------- /cla-metrics/README.md: -------------------------------------------------------------------------------- 1 | # cla-metrics 2 | A series of scripts to analyze mixed-language binaries for metrics that indicate the opportunity for a Cross-Language Attack (CLA). 3 | -------------------------------------------------------------------------------- /cla-metrics/find_elf.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Used to collect a list of file names from a directory (recursively) that are elfs 4 | 5 | # Input is the path to the search directory 6 | path=$1 7 | 8 | echo "Finding ELFs..." 
9 | 10 | # Count the number of / in path to strip files later 11 | char="/" 12 | stripped=${path//[^$char]/} 13 | num=${#stripped} 14 | 15 | if [ -f input/files.txt ] 16 | then 17 | echo "Cleaning up input/files.txt" 18 | rm input/files.txt 19 | echo "Cleaning up input/elfs/" 20 | rm input/elfs/* 21 | echo "Cleaning up input/objdumps/" 22 | rm input/objdumps/* 23 | fi 24 | 25 | shopt -s globstar 26 | 27 | count=0 28 | for f in $path**/* 29 | do 30 | if file -L $f | grep -qi elf 31 | then 32 | # Skip certain files 33 | if [[ $f =~ "Test" ]] 34 | then 35 | printf "Skipping $f\n" 36 | continue 37 | fi 38 | 39 | if [[ $f == *.jsm ]] 40 | then 41 | printf "Skipping $f\n" 42 | continue 43 | fi 44 | 45 | if [[ $f == *.objdump ]] 46 | then 47 | printf "Skipping $f\n" 48 | continue 49 | fi 50 | 51 | # holds binary name 52 | b=$f 53 | 54 | # strip path from binary name 55 | for i in $(eval echo {1..$num}) 56 | do 57 | b="${b#*/}" 58 | done 59 | 60 | # count remaining paths for this file 61 | strippedb=${b//[^$char]/} 62 | numb=${#strippedb} 63 | 64 | # strip rest of path from binary name 65 | for i in $(eval echo {1..$numb}) 66 | do 67 | b="${b#*/}" 68 | done 69 | 70 | ((count+=1)) 71 | printf "File #$count: $b\n" 72 | objdump -d $f > input/objdumps/$b.objdump 73 | cp $f input/elfs/$b.elf 74 | echo $b >> input/files.txt 75 | fi 76 | done 77 | -------------------------------------------------------------------------------- /cla-metrics/fn_metrics.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | # Authors: Samuel Mergendahl and Nathan Burow 3 | # Copyright: MIT Lincoln Laboratory 4 | 5 | import sys 6 | import re 7 | import struct 8 | import json 9 | import numpy as np 10 | import matplotlib 11 | import matplotlib.pyplot as plt 12 | 13 | from argparse import ArgumentParser 14 | from elftools.elf.elffile import ELFFile 15 | 16 | # Object that helps retrieve CLA-relevant info from an ELF File 17 | class Tagger: 18 | 19 
| # Initializes the Tagger Class 20 | def __init__(self, e, f, r_path, n): 21 | 22 | # Tags holds all the metadata 23 | self.tags = {} 24 | 25 | # elf holds functions to search the elffile 26 | #with open(e_path, "rb") as f: 27 | self.elf = ELFFile(e) 28 | 29 | # fns holds incoming metadata derived from the high-level source 30 | #with open(f_path, "rb") as f: 31 | self.fns = json.load(f) 32 | 33 | # res holds the path to store derived metadata 34 | self.res = r_path 35 | 36 | # bin_name holds the name of the elf binary 37 | self.bin_name = n 38 | 39 | # Main language variable defaults to c++ 40 | self.main_lang = "c++" 41 | 42 | # elf assembly flavor 43 | self.assembly = "x86" 44 | 45 | # Main function to append metadata to tags 46 | def tag(self, name, field, tag): 47 | 48 | # Adds a newly seen function to the metadata 49 | if name not in self.tags: 50 | self.tags[name] = {field:tag} 51 | 52 | # Adds a newly seen metadata field to an already seen function 53 | elif field not in self.tags[name]: 54 | self.tags[name][field] = tag 55 | 56 | # Appends metadata to an already seen field for a function 57 | else: 58 | tmp = self.tags[name][field] 59 | if type(tmp) is not list: 60 | tmp = [tmp] 61 | tmp.append(tag) 62 | self.tags[name][field] = tmp 63 | 64 | # Finds all the function names and initializes the tags metadata with all function names in elf 65 | def get_all_fns(self): 66 | 67 | # Check if we can use dwarf information 68 | if self.elf.has_dwarf_info: 69 | print(str(self.bin_name) + " has dwarf info!") 70 | dwarf_info = self.elf.get_dwarf_info() 71 | 72 | # Iterate through the entire symbol table 73 | stab = self.elf.get_section_by_name(".symtab") 74 | for symb in stab.iter_symbols(): 75 | 76 | # Ignore empty, already seen, and symbol names without an address 77 | # TODO: should we add a not null qualifier? 
78 | if symb.name != "" and symb.name not in self.tags and symb["st_value"] != 0: 79 | 80 | # Symbol table value is virtual address, so make relative to .text 81 | text = self.elf.get_section_by_name(".text") 82 | offset = symb["st_value"] - text["sh_addr"] 83 | size = symb["st_size"] 84 | start = symb["st_value"] 85 | 86 | self.tag(symb.name, "addr", symb["st_value"]) 87 | 88 | # For each function in the ELF, determine if it is a static, dynamic, closure, etc. 89 | def tag_function_type(self): 90 | # Check if we can use dwarf information 91 | if self.elf.has_dwarf_info: 92 | print(str(self.bin_name) + " has dwarf info!") 93 | dwarf_info = self.elf.get_dwarf_info() 94 | 95 | # Iterate through the entire symbol table 96 | stab = self.elf.get_section_by_name(".symtab") 97 | for symb in stab.iter_symbols(): 98 | 99 | # Ignore empty, null, and symbol names without an address 100 | if symb.name != "" and symb.name and symb["st_value"] != 0: 101 | 102 | # Some shorter names 103 | name = symb.name 104 | fn = symb 105 | 106 | # No need to work if we already identified the type 107 | if "type" not in self.tags[fn.name]: 108 | 109 | # Tag v0 mangler types 110 | if fn.name.startswith('_R'): 111 | 112 | # Set main language global as rust since only rust uses v0 113 | self.main_lang = "rust" 114 | 115 | # Closure 116 | if 'CN' in fn.name: 117 | self.tag(name, "type", "closure") 118 | # TODO: Dynamic dispatch 119 | #elif 'N' in fn.name: 120 | # self.tag(name, "type", "dynamic") 121 | # Generic arguments impl 122 | elif 'IN' in fn.name: 123 | self.tag(name, "type", "static") 124 | # Trait impl root ('X' in the v0 grammar) 125 | elif 'X' in fn.name: 126 | self.tag(name, "type", "static") 127 | # Inherent impl root ('M' in the v0 grammar) 128 | elif 'M' in fn.name: 129 | self.tag(name, "type", "static") 130 | elif 'N' in fn.name: 131 | self.tag(name, "type", "free_fn") 132 | 133 | # Tag legacy mangler types 134 | elif fn.name.startswith('_Z'): 135 | 136 | # Set main language global as c++ 137 | # (assume that rust is compiled
with v0) 138 | self.main_lang = "c++" 139 | 140 | # TODO: More expressive function types for C++ 141 | if 'ZN' in fn.name: 142 | self.tag(name, "type", "static") 143 | else: 144 | self.tag(name, "type", "free_fn") 145 | 146 | # This script only analyzes v0 Rust manglers or typical C++ manglers 147 | else: 148 | self.tag(name, "type", "unknown") 149 | 150 | # For each function in the ELF, determine if it is a C/C++ or Rust function 151 | def tag_language(self): 152 | 153 | # Check if we can use dwarf information 154 | if self.elf.has_dwarf_info: 155 | print(str(self.bin_name) + " has dwarf info!") 156 | dwarf_info = self.elf.get_dwarf_info() 157 | 158 | # Use source level info to get language 159 | stab = self.elf.get_section_by_name(".symtab") 160 | if "rust" in self.fns.keys(): 161 | if self.fns["rust"]: 162 | for rust_fn in self.fns["rust"]: 163 | possible_funcs = list(filter(lambda s: rust_fn == s.name, stab.iter_symbols())) 164 | for fn in possible_funcs: 165 | print("tagging " + str(fn.name) + " as a rust function") 166 | self.tag(fn.name, "lang", "rust") 167 | 168 | if "c++" in self.fns.keys(): 169 | if self.fns["c++"]: 170 | for c_fn in self.fns["c++"]: 171 | possible_funcs = list(filter(lambda s: c_fn == s.name, stab.iter_symbols())) 172 | for fn in possible_funcs: 173 | print("tagging " + str(fn.name) + " as a c++ function") 174 | self.tag(fn.name, "lang", "c++") 175 | 176 | if "c" in self.fns.keys(): 177 | if self.fns["c"]: 178 | for c_fn in self.fns["c"]: 179 | possible_funcs = list(filter(lambda s: c_fn == s.name, stab.iter_symbols())) 180 | for fn in possible_funcs: 181 | print("tagging " + str(fn.name) + " as a c function") 182 | self.tag(fn.name, "lang", "c") 183 | 184 | # Use name mangling info to get language 185 | stab = self.elf.get_section_by_name(".symtab") 186 | for symb in stab.iter_symbols(): 187 | 188 | # Ignore empty, null, and symbol names without an address 189 | if symb.name != "" and symb.name and symb["st_value"] != 0: 190 | 191 | 
func_name = symb.name 192 | 193 | # No need to work if we already identified the language 194 | if "lang" not in self.tags[symb.name]: 195 | mangled = False 196 | 197 | # Simple check to see if the name is mangled in any way 198 | # I.e., if there is more than one uppercase character, assume mangled 199 | # Okay to over estimate the mangled names, 200 | # as it will only underestimate the number of external functions 201 | if sum(1 for c in symb.name if c.isupper()) > 0: 202 | mangled = True 203 | 204 | # tags the language and whether it is an external language call 205 | if "lang" not in self.tags[func_name] and not mangled and self.tags[func_name]["type"] == "unknown" and self.main_lang == "c++": 206 | 207 | # tag the language 208 | if func_name.startswith("_"): 209 | self.tag(func_name, "lang", "c") 210 | else: 211 | self.tag(func_name, "lang", "rust") 212 | 213 | # also tag that it is an external language call 214 | self.tags[func_name]["type"] = "external" 215 | 216 | elif "lang" not in self.tags[func_name] and not mangled and self.tags[func_name]["type"] == "unknown" and self.main_lang == "rust": 217 | # tag the language 218 | if func_name.startswith("_"): 219 | self.tag(func_name, "lang", "c") 220 | else: 221 | self.tag(func_name, "lang", "c++") 222 | 223 | # also tag that it is an external language call 224 | self.tags[func_name]["type"] = "external" 225 | 226 | elif "lang" not in self.tags[func_name]: 227 | # tag the language 228 | self.tag(func_name, "lang", self.main_lang) 229 | 230 | # Save the collected tags metadata to a file 231 | def save_results(self): 232 | f = open(self.res, "w") 233 | json.dump(self.tags, f, indent=4) 234 | 235 | # This function uses the Tagger class to generate 236 | # the function types and language of each function in the elf file 237 | # Stores results in a json file 238 | def generate_elf_metrics(elf_path, fns_path, results_path, binary): 239 | print("Started ELF Tagging.") 240 | tagger = Tagger(elf_path, fns_path, 
results_path, binary) 241 | 242 | print("Initializing tags...") 243 | tagger.get_all_fns() 244 | 245 | print("Tagging function types...") 246 | tagger.tag_function_type() 247 | 248 | print("Tagging language...") 249 | tagger.tag_language() 250 | 251 | print("Saving results...") 252 | tagger.save_results() 253 | 254 | # This function generates metrics for a series of elf binaries 255 | # file path is a text file that holds the names of a bunch of elf files 256 | def elf_reader(file_path): 257 | 258 | with open(file_path) as f: 259 | lines = [line.rstrip() for line in f] 260 | 261 | for binary in lines: 262 | # Skip jsm executables 263 | if "jsm" not in binary: 264 | print("Generating elf metrics for: " + str(binary)) 265 | with open("input/source-info.json", "rb") as fns: 266 | try: 267 | with open("input/elfs/" + str(binary) + ".elf", "rb") as e: 268 | generate_elf_metrics(e, fns, "output/elf-results/" + str(binary) + "_results.json", str(binary)) 269 | except IOError: 270 | print("Error " + str(binary) + " does not exist.") 271 | 272 | 273 | def generate_obj_metrics(obj_path, binary): 274 | functionStartRegex=re.compile(r"^[\da-f]{16} <.+>:$") 275 | callRegex = re.compile(r"call") 276 | functionName=re.compile(r"<(.+)>:?$") 277 | indirectCallRegex = re.compile(r"call.? 
*") 278 | 279 | functionToCalls = {} 280 | curFunc = "" 281 | indirectCallCount = 0 282 | count = 0 283 | with open(obj_path, "r") as fp: 284 | for line in fp: 285 | function = functionStartRegex.search(line) 286 | if function: 287 | name = functionName.search(line) 288 | if name: 289 | #print("Current Function: " + str(name.group(1)) 290 | #TODO: check if this function already exists and handle it 291 | #gracefully if so 292 | functionToCalls[name.group(1)] = [] 293 | if curFunc: 294 | functionToCalls[curFunc].append(indirectCallCount) 295 | #if indirectCallCount: 296 | # print(str(curFunc) + " has " + str(indirectCallCount) + " indirect calls") 297 | indirectCallCount = 0 298 | curFunc = name.group(1) 299 | else: 300 | print("Couldn't find function name for line: " + str(line)) 301 | sys.exit(1) 302 | call = callRegex.search(line) 303 | if call: 304 | name = functionName.search(line) 305 | if name: 306 | #print("\tCalls: " + str(name.group(1))) 307 | functionToCalls[curFunc].append(name.group(1)) 308 | else: 309 | if indirectCallRegex.search(line): 310 | #print("Couldn't find name for: " + str(line) + " assuming indirect call") 311 | indirectCallCount += 1 312 | else: 313 | print("Error on line: " + str(line)) 314 | print("Neither direct nor indirect") 315 | sys.exit(1) 316 | count +=1; 317 | if count % 1000000 == 0: 318 | print(str(count / float(87087059) * 100) + "% complete") 319 | 320 | with open("output/obj-results/" + str(binary) + "_results.json", "w") as fp: 321 | json.dump(functionToCalls, fp, indent=4) 322 | 323 | # This function generates metrics for a series of objdumps 324 | # file path is a text file that holds the names of a bunch of objdump files 325 | def obj_reader(file_path): 326 | 327 | with open(file_path) as f: 328 | lines = [line.rstrip() for line in f] 329 | 330 | for binary in lines: 331 | # Skip jsm executables 332 | if "jsm" not in binary: 333 | print("Generating obj metrics for: " + str(binary)) 334 | 
generate_obj_metrics("input/objdumps/" + str(binary) + ".objdump", str(binary)) 335 | 336 | # Combine objdump file processing from output/obj-results/ into one json file 337 | def combine_obj_results(file_path): 338 | full_json = {} 339 | 340 | with open(file_path) as f: 341 | lines = [line.rstrip() for line in f] 342 | for binary in lines: 343 | try: 344 | with open("output/obj-results/" + str(binary) + "_results.json") as j: 345 | res_data = json.load(j) 346 | 347 | for fn_name in res_data.keys(): 348 | res_data_list = res_data[fn_name] 349 | tmp_dict = {} 350 | tmp_call_list = [] 351 | 352 | # Strip indirect calls 353 | if not res_data_list: 354 | num_indir_calls = float(0) 355 | else: 356 | num_indir_calls = res_data_list[-1] 357 | tmp_dict["num_indirect_calls"] = num_indir_calls 358 | 359 | # Strip dynamic calls info from name 360 | tmp_dict["num_dynamic_calls"] = float(0) 361 | if len(res_data_list) > 1: 362 | for cs in res_data_list[0:-2]: 363 | 364 | # add @binary on the end of call site 365 | if '@' in str(cs): 366 | tmp_dict["num_dynamic_calls"] = tmp_dict["num_dynamic_calls"]+1 367 | else: 368 | cs = str(cs) + '@' + str(binary) 369 | 370 | # add to set first to prevent duplicates 371 | tmp_set = set(tmp_call_list) 372 | tmp_set = tmp_set.union(set([cs])) 373 | tmp_call_list = list(tmp_set) 374 | #tmp_call_list.append(cs) 375 | 376 | tmp_dict["call_sites"] = tmp_call_list 377 | 378 | # Add a unique token for the function to prevent repeated functions 379 | if '@' in str(fn_name): 380 | full_json[str(fn_name)] = tmp_dict 381 | else: 382 | full_json[str(fn_name) + "@" + str(binary)] = tmp_dict 383 | 384 | except IOError: 385 | print("Error " + str(binary) + " does not have any obj results.") 386 | 387 | f = open("output/obj-results/full.json", "w") 388 | json.dump(full_json, f, indent=4) 389 | 390 | # Combine elf file processing from output/elf-results/ into one json file 391 | def combine_elf_results(file_path): 392 | full_json = {} 393 | 394 | with 
open(file_path) as f: 395 | lines = [line.rstrip() for line in f] 396 | for binary in lines: 397 | try: 398 | with open("output/elf-results/" + str(binary) + "_results.json") as j: 399 | res_data = json.load(j) 400 | 401 | for fn_name in res_data.keys(): 402 | full_json[str(fn_name) + "@" + str(binary)] = res_data[fn_name] 403 | 404 | except IOError: 405 | print("Error " + str(binary) + " does not have any elf results.") 406 | 407 | with open("output/elf-results/full.json", "w") as f: 408 | json.dump(full_json, f, indent=4) 409 | 410 | # Combine elf processing with objdump processing 411 | def combine_elf_and_obj_results(): 412 | print("Opening elf results...") 413 | with open("output/elf-results/full.json") as fj: 414 | full_json = json.load(fj) 415 | 416 | print("Opening objdump results...") 417 | with open("output/obj-results/full.json") as dj: 418 | dump = json.load(dj) 419 | 420 | print("Iterating through objdump functions for std lib call sites...") 421 | for name in dump.keys(): 422 | for cs in dump[name]["call_sites"]: 423 | if cs not in full_json: 424 | if "+0x" not in cs: 425 | if "LIBCXX" in cs or "libcxx" in cs or "LIBC++" in cs or "libc++" in cs or "GXX" in cs or "gxx" in cs: 426 | full_json[cs] = { 427 | "addr": "unknown", 428 | "type": "free_fn", 429 | "lang": "c++", 430 | "call_sites": [], 431 | "num_indirect_calls": float(0), 432 | "num_dynamic_calls": float(0), 433 | } 434 | elif "LIBC" in cs or "libc" in cs or "GCC" in cs or "GXX" in cs or "NSS" in cs or "nss" in cs: 435 | full_json[cs] = { 436 | "addr": "unknown", 437 | "type": "external", 438 | "lang": "c", 439 | "call_sites": [], 440 | "num_indirect_calls": float(0), 441 | "num_dynamic_calls": float(0), 442 | } 443 | 444 | print("Iterating through objdump functions...") 445 | for name in dump.keys(): 446 | 447 | if name in full_json.keys(): 448 | call_list = dump[name]["call_sites"] 449 | try: 450 | num_indirect = float(dump[name]["num_indirect_calls"]) 451 | num_dynamic = 
float(dump[name]["num_dynamic_calls"]) 452 | except Exception as e: 453 | print(e) 454 | print("...saving as 0 instead...") 455 | num_indirect = float(0) 456 | num_dynamic = float(0) 457 | 458 | # Adds a call_sites metadata field to an already seen function 459 | if "call_sites" not in full_json[name]: 460 | full_json[name]["call_sites"] = call_list 461 | full_json[name]["num_indirect_calls"] = num_indirect 462 | full_json[name]["num_dynamic_calls"] = num_dynamic 463 | 464 | # Appends metadata to an already seen call_sites for a function 465 | else: 466 | tmp = full_json[name]["call_sites"] 467 | tmp_set = set(tmp) 468 | tmp_set = tmp_set.union(set(call_list)) 469 | full_json[name]["call_sites"] = list(tmp_set) 470 | 471 | full_json[name]["num_indirect_calls"] = full_json[name]["num_indirect_calls"] + num_indirect 472 | full_json[name]["num_dynamic_calls"] = full_json[name]["num_dynamic_calls"] + num_dynamic 473 | else: 474 | # Objdump found a function that the elf files couldn't 475 | # Temporary solution: check if objdump has it as a @plt function 476 | # Ignore all other cases 477 | found = False 478 | if "@plt" in name: 479 | stripped_name = str(name).split('@')[0] 480 | 481 | for n in full_json.keys(): 482 | if n.startswith(stripped_name) and not found: 483 | 484 | # Same as if we found it above, but need to use n to save rather than name 485 | call_list = dump[name]["call_sites"] 486 | try: 487 | num_indirect = float(dump[name]["num_indirect_calls"]) 488 | num_dynamic = float(dump[name]["num_dynamic_calls"]) 489 | except Exception as e: 490 | print(e) 491 | print("...saving as 0 instead...") 492 | num_indirect = float(0) 493 | num_dynamic = float(0) 494 | 495 | # Adds a call_sites metadata field to an already seen function 496 | if "call_sites" not in full_json[n]: 497 | full_json[n]["call_sites"] = call_list 498 | full_json[n]["num_indirect_calls"] = num_indirect 499 | full_json[n]["num_dynamic_calls"] = num_dynamic 500 | 501 | # Appends metadata to an 
already seen call_sites for a function 502 | else: 503 | tmp = full_json[n]["call_sites"] 504 | tmp_set = set(tmp) 505 | tmp_set = tmp_set.union(set(call_list)) 506 | full_json[n]["call_sites"] = list(tmp_set) 507 | 508 | full_json[n]["num_indirect_calls"] = full_json[n]["num_indirect_calls"] + num_indirect 509 | full_json[n]["num_dynamic_calls"] = full_json[n]["num_dynamic_calls"] + num_dynamic 510 | 511 | found = True 512 | 513 | if not found: 514 | found = True 515 | 516 | call_list = dump[name]["call_sites"] 517 | try: 518 | num_indirect = float(dump[name]["num_indirect_calls"]) 519 | num_dynamic = float(dump[name]["num_dynamic_calls"]) 520 | except Exception as e: 521 | print(e) 522 | print("...saving as 0 instead...") 523 | num_indirect = float(0) 524 | num_dynamic = float(0) 525 | 526 | if "_Z" in name: 527 | t = "unknown" 528 | l = "c++" 529 | else: 530 | t = "external" 531 | l = "c" 532 | 533 | full_json[name] = { 534 | "addr": "unknown", 535 | "type": t, 536 | "lang": l, 537 | "call_sites": call_list, 538 | "num_indirect_calls": num_indirect, 539 | "num_dynamic_calls": num_dynamic, 540 | } 541 | 542 | if not found: 543 | print("Not in full: " + str(name)) 544 | 545 | for name in full_json.keys(): 546 | # ELF found a function that the objdump files couldn't 547 | # TODO: Should we really set this as zero, or "unknown"? 
548 | 549 | if "call_sites" not in full_json[name]: 550 | full_json[name]["call_sites"] = [] 551 | if "num_indirect_calls" not in full_json[name]: 552 | full_json[name]["num_indirect_calls"] = float(0) 553 | if "num_dynamic_calls" not in full_json[name]: 554 | full_json[name]["num_dynamic_calls"] = float(0) 555 | 556 | 557 | # Save combined results 558 | print("Saving combined elf and objdump metadata...") 559 | with open("output/metadata.json", "w") as f: 560 | json.dump(full_json, f, indent=4) 561 | 562 | # Add transfer point data to metadata using call sites and language tags 563 | def get_transfer_points(): 564 | print("Opening metadata to find transfer and visitor points...") 565 | with open("output/metadata.json") as mj: 566 | md = json.load(mj) 567 | 568 | # For each function, get a list of its call sites that cross a language 569 | print("Iterating through metadata functions...") 570 | md_keys_copy = list(md.keys())  # real copy: md gains keys while we iterate 571 | count = 0 572 | percent = 0 573 | 574 | for fn in md_keys_copy: 575 | print(str(count) + " of " + str(len(md_keys_copy)) + " complete.") 576 | count = count + 1 577 | if count % max(1, len(md_keys_copy) // 100) == 0: 578 | percent = percent + 1 579 | print(str(percent) + " percent complete...") 580 | 581 | if "call_sites" in md[fn] and "lang" in md[fn]: 582 | call_sites_copy = list(md[fn]["call_sites"])  # real copy: the original list is edited below 583 | for cs in call_sites_copy: 584 | if cs in md: 585 | if "lang" in md[cs]: 586 | if md[cs]["lang"] != md[fn]["lang"]: 587 | 588 | ### transfer points 589 | # Creates a transfer points metadata field to an already seen function 590 | if "transfer_points" not in md[fn]: 591 | md[fn]["transfer_points"] = [cs] 592 | 593 | # Appends metadata to an already seen transfer points list for a function 594 | else: 595 | tmp = md[fn]["transfer_points"] 596 | tmp_set = set(tmp) 597 | tmp_set = tmp_set.union(set([cs])) 598 | md[fn]["transfer_points"] = list(tmp_set) 599 | 600 | ### visitor points 601 | # Creates a visitor points metadata field to an already seen function 602 | 
if "visitor_points" not in md[cs]: 603 | md[cs]["visitor_points"] = [fn] 604 | 605 | # Appends metadata to an already seen transfer points list for a function 606 | else: 607 | tmp = md[cs]["visitor_points"] 608 | tmp_set = set(tmp) 609 | tmp_set = tmp_set.union(set([fn])) 610 | md[cs]["visitor_points"] = list(tmp_set) 611 | else: 612 | print("Error: Call site " + str(cs) + " has no language information.") 613 | else: 614 | # TODO: Objdump called functions plus offsets, need to add it as a function or remove it from the call sites list 615 | # Temporary solution: just remove functions with offsets from call sites list 616 | if '+0x' in str(cs): 617 | print("Removing " + str(cs) + " as a call site...") 618 | md[fn]["call_sites"].remove(cs) 619 | else: 620 | # Temporary solution 2: find plt calls 621 | found = False 622 | if "@plt" in cs: 623 | stripped_name = str(cs).split('@')[0] 624 | 625 | # If the plt function exists in the metadata, replace the plt call site with the real function 626 | for newcs in md.keys(): 627 | if newcs.startswith(stripped_name) and not found: 628 | md[fn]["call_sites"].remove(cs) 629 | md[fn]["call_sites"].append(newcs) 630 | 631 | # Same as above but with newcs instead of cs 632 | if "lang" in md[newcs]: 633 | if md[newcs]["lang"] != md[fn]["lang"]: 634 | 635 | ### transfer points 636 | # Creates a transfer points metadata field to an already seen function 637 | if "transfer_points" not in md[fn]: 638 | md[fn]["transfer_points"] = [newcs] 639 | 640 | # Appends metadata to an already seen transfer points list for a function 641 | else: 642 | tmp = md[fn]["transfer_points"] 643 | tmp_set = set(tmp) 644 | tmp_set = tmp_set.union(set([newcs])) 645 | md[fn]["transfer_points"] = list(tmp_set) 646 | 647 | ### visitor points 648 | # Creates a visitor points metadata field to an already seen function 649 | if "visitor_points" not in md[newcs]: 650 | md[newcs]["visitor_points"] = [fn] 651 | 652 | # Appends metadata to an already seen transfer 
points list for a function 653 | else: 654 | tmp = md[newcs]["visitor_points"] 655 | tmp_set = set(tmp) 656 | tmp_set = tmp_set.union(set([fn])) 657 | md[newcs]["visitor_points"] = list(tmp_set) 658 | 659 | else: 660 | print("Error: Call site " + str(newcs) + " has no language information.") 661 | 662 | found = True 663 | 664 | # If the plt function is not in the metadata, add the plt call site to the metadata 665 | if not found: 666 | found = True 667 | 668 | if "_Z" in cs: 669 | t = "unknown" 670 | l = "c++" 671 | else: 672 | t = "external" 673 | l = "c" 674 | 675 | md[cs] = { 676 | "addr": "unknown", 677 | "type": t, 678 | "lang": l, 679 | "call_sites": [], 680 | "num_indirect_calls": float(0), 681 | "num_dynamic_calls": float(0), 682 | } 683 | 684 | if "lang" in md[cs]: 685 | if md[cs]["lang"] != md[fn]["lang"]: 686 | 687 | ### transfer points 688 | # Creates a transfer points metadata field to an already seen function 689 | if "transfer_points" not in md[fn]: 690 | md[fn]["transfer_points"] = [cs] 691 | 692 | # Appends metadata to an already seen transfer points list for a function 693 | else: 694 | tmp = md[fn]["transfer_points"] 695 | tmp_set = set(tmp) 696 | tmp_set = tmp_set.union(set([cs])) 697 | md[fn]["transfer_points"] = list(tmp_set) 698 | 699 | ### visitor points 700 | # Creates a visitor points metadata field to an already seen function 701 | if "visitor_points" not in md[cs]: 702 | md[cs]["visitor_points"] = [fn] 703 | 704 | # Appends metadata to an already seen transfer points list for a function 705 | else: 706 | tmp = md[cs]["visitor_points"] 707 | tmp_set = set(tmp) 708 | tmp_set = tmp_set.union(set([fn])) 709 | md[cs]["visitor_points"] = list(tmp_set) 710 | 711 | if not found: 712 | print("Error: Call site " + str(cs) + " does not exist in metadata.") 713 | print("...removing as a call site...") 714 | md[fn]["call_sites"].remove(cs) 715 | else: 716 | print("Error: Function " + str(fn) + " either has no call sites field or no language 
information field.") 717 | 718 | print("Adding size of call sites, transfer points, and visitor points lists...") 719 | for fn in md.keys(): 720 | if "call_sites" in md[fn]: 721 | md[fn]["num_call_sites"] = len(md[fn]["call_sites"]) 722 | else: 723 | md[fn]["call_sites"] = [] 724 | md[fn]["num_call_sites"] = float(0) 725 | 726 | if "transfer_points" in md[fn]: 727 | md[fn]["num_transfer_points"] = len(md[fn]["transfer_points"]) 728 | else: 729 | md[fn]["transfer_points"] = [] 730 | md[fn]["num_transfer_points"] = float(0) 731 | 732 | if "visitor_points" in md[fn]: 733 | md[fn]["num_visitor_points"] = len(md[fn]["visitor_points"]) 734 | else: 735 | md[fn]["visitor_points"] = [] 736 | md[fn]["num_visitor_points"] = float(0) 737 | 738 | print("Saving metadata with transfer and visitor points...") 739 | with open("output/metadata_with_tps.json", "w") as f: 740 | json.dump(md, f, indent=4) 741 | 742 | def get_invocation_points(): 743 | print("Opening metadata to find invocation points...") 744 | with open("output/metadata_with_tps.json") as mj: 745 | md = json.load(mj) 746 | 747 | # For each function, record which functions call it (its invocation points) 748 | print("Iterating through metadata functions...") 749 | for fn in md.keys(): 750 | 751 | for cs in md[fn]["call_sites"]: 752 | 753 | if cs in md: 754 | 755 | ### invocation points 756 | # Creates an invocation points metadata field for an already seen function 757 | if "invocation_points" not in md[cs]: 758 | md[cs]["invocation_points"] = [fn] 759 | 760 | # Appends metadata to an already seen invocation points list for a function 761 | else: 762 | tmp = md[cs]["invocation_points"] 763 | tmp_set = set(tmp) 764 | tmp_set = tmp_set.union(set([fn])) 765 | md[cs]["invocation_points"] = list(tmp_set) 766 | else: 767 | print("Error: " + str(cs) + " does not exist in the metadata.") 768 | 769 | for fn in md.keys(): 770 | if "invocation_points" not in md[fn]: 771 | md[fn]["invocation_points"] = [] 772 | 773 | md[fn]["num_invocations"] =
float(len(md[fn]["invocation_points"])) 774 | 775 | print("Saving metadata with invocation points...") 776 | with open("output/metadata_with_invos.json", "w") as f: 777 | json.dump(md, f, indent=4) 778 | 779 | 780 | def generate_cdfs(): 781 | print("Setting plot params...") 782 | plt.style.use('ggplot') 783 | 784 | plt.rcParams['figure.titlesize'] = 20 785 | plt.rcParams['axes.labelsize'] = 16 786 | plt.rcParams['axes.titlesize'] = 16 787 | plt.rcParams['xtick.labelsize'] = 14 788 | plt.rcParams['ytick.labelsize'] = 14 789 | plt.rcParams['legend.fontsize'] = 16 790 | plt.rcParams['axes.grid'] = True 791 | plt.rcParams['grid.color'] = '0.45' 792 | plt.rcParams['axes.facecolor'] = '0.95' 793 | 794 | # with_invos is the json metadata file with the most data 795 | print("Loading metadata to generate cdfs...") 796 | with open("output/metadata_with_invos.json") as mj: 797 | md = json.load(mj) 798 | 799 | ### CDFs 800 | ## Make a CDF for number of indirect calls 801 | both_indirs = [] 802 | rust_indirs = [] 803 | c_indirs = [] 804 | 805 | ## Make a CDF for number of dynamic calls 806 | both_dynamics = [] 807 | rust_dynamics = [] 808 | c_dynamics = [] 809 | 810 | 811 | ## Make a CDF for number of call sites 812 | both_cs = [] 813 | rust_cs = [] 814 | c_cs = [] 815 | 816 | ## Make a CDF for number of visitor points 817 | both_vps = [] 818 | rust_vps = [] 819 | c_vps = [] 820 | 821 | ## Make a CDF for number of invocation points 822 | both_invos = [] 823 | rust_invos = [] 824 | c_invos = [] 825 | 826 | ## Make a CDF for number of transfer points 827 | both_tps = [] 828 | rust_tps = [] 829 | c_tps = [] 830 | 831 | ### Table values 832 | ## Table value of total number of functions 833 | both_fns = 0 834 | rust_fns = 0 835 | c_fns = 0 836 | 837 | ## Table value of total number of indirect calls 838 | both_total_indirs = 0 839 | rust_total_indirs = 0 840 | c_total_indirs = 0 841 | 842 | ## Table value of total number of dynamic calls 843 | both_total_dynamics = 0 844 | rust_total_dynamics
= 0 845 | c_total_dynamics = 0 846 | 847 | ## Table value of total number of call sites 848 | both_total_cs = 0 849 | rust_total_cs = 0 850 | c_total_cs = 0 851 | 852 | ## Table value of total number of visitor points 853 | both_total_vps = 0 854 | rust_total_vps = 0 855 | c_total_vps = 0 856 | 857 | ## Table value of total number of invocations 858 | both_total_invos = 0 859 | rust_total_invos = 0 860 | c_total_invos = 0 861 | 862 | ## Table value of total number of transfer points 863 | both_total_tps = 0 864 | rust_total_tps = 0 865 | c_total_tps = 0 866 | 867 | ## Table value of total number of closures 868 | rust_closures = 0 869 | 870 | ## Table value of total number of monomorphized functions 871 | rust_monos = 0 872 | 873 | ## Largest Degree Call sites 874 | both_top_cs = { 875 | "first": { 876 | "name": "unknown", 877 | "num": 0 878 | }, 879 | "second": { 880 | "name": "unknown", 881 | "num": 0 882 | }, 883 | "third": { 884 | "name": "unknown", 885 | "num": 0 886 | }, 887 | } 888 | rust_top_cs = { 889 | "first": { 890 | "name": "unknown", 891 | "num": 0 892 | }, 893 | "second": { 894 | "name": "unknown", 895 | "num": 0 896 | }, 897 | "third": { 898 | "name": "unknown", 899 | "num": 0 900 | }, 901 | } 902 | c_top_cs = { 903 | "first": { 904 | "name": "unknown", 905 | "num": 0 906 | }, 907 | "second": { 908 | "name": "unknown", 909 | "num": 0 910 | }, 911 | "third": { 912 | "name": "unknown", 913 | "num": 0 914 | }, 915 | } 916 | 917 | ## Largest Degree Invocations 918 | both_top_invos = { 919 | "first": { 920 | "name": "unknown", 921 | "num": 0 922 | }, 923 | "second": { 924 | "name": "unknown", 925 | "num": 0 926 | }, 927 | "third": { 928 | "name": "unknown", 929 | "num": 0 930 | }, 931 | } 932 | rust_top_invos = { 933 | "first": { 934 | "name": "unknown", 935 | "num": 0 936 | }, 937 | "second": { 938 | "name": "unknown", 939 | "num": 0 940 | }, 941 | "third": { 942 | "name": "unknown", 943 | "num": 0 944 | }, 945 | } 946 | c_top_invos = { 947 | "first": { 948 |
"name": "unknown", 949 | "num": 0 950 | }, 951 | "second": { 952 | "name": "unknown", 953 | "num": 0 954 | }, 955 | "third": { 956 | "name": "unknown", 957 | "num": 0 958 | }, 959 | } 960 | 961 | ## Largest Degree Transfer Points 962 | both_top_tps = { 963 | "first": { 964 | "name": "unknown", 965 | "num": 0 966 | }, 967 | "second": { 968 | "name": "unknown", 969 | "num": 0 970 | }, 971 | "third": { 972 | "name": "unknown", 973 | "num": 0 974 | }, 975 | } 976 | rust_top_tps = { 977 | "first": { 978 | "name": "unknown", 979 | "num": 0 980 | }, 981 | "second": { 982 | "name": "unknown", 983 | "num": 0 984 | }, 985 | "third": { 986 | "name": "unknown", 987 | "num": 0 988 | }, 989 | } 990 | c_top_tps = { 991 | "first": { 992 | "name": "unknown", 993 | "num": 0 994 | }, 995 | "second": { 996 | "name": "unknown", 997 | "num": 0 998 | }, 999 | "third": { 1000 | "name": "unknown", 1001 | "num": 0 1002 | }, 1003 | } 1004 | 1005 | ## Largest Degree Visitor Points 1006 | both_top_vps = { 1007 | "first": { 1008 | "name": "unknown", 1009 | "num": 0 1010 | }, 1011 | "second": { 1012 | "name": "unknown", 1013 | "num": 0 1014 | }, 1015 | "third": { 1016 | "name": "unknown", 1017 | "num": 0 1018 | }, 1019 | } 1020 | rust_top_vps = { 1021 | "first": { 1022 | "name": "unknown", 1023 | "num": 0 1024 | }, 1025 | "second": { 1026 | "name": "unknown", 1027 | "num": 0 1028 | }, 1029 | "third": { 1030 | "name": "unknown", 1031 | "num": 0 1032 | }, 1033 | } 1034 | c_top_vps = { 1035 | "first": { 1036 | "name": "unknown", 1037 | "num": 0 1038 | }, 1039 | "second": { 1040 | "name": "unknown", 1041 | "num": 0 1042 | }, 1043 | "third": { 1044 | "name": "unknown", 1045 | "num": 0 1046 | }, 1047 | } 1048 | 1049 | 1050 | print("Looping through metadata to collect graph data...") 1051 | for name in md.keys(): 1052 | 1053 | if md[name]["lang"] == "rust": 1054 | try: 1055 | rust_indirs.append(float(md[name]["num_indirect_calls"])) 1056 | rust_dynamics.append(float(md[name]["num_dynamic_calls"])) 1057 
| rust_cs.append(float(md[name]["num_call_sites"])) 1058 | rust_tps.append(float(md[name]["num_transfer_points"])) 1059 | rust_vps.append(float(md[name]["num_visitor_points"])) 1060 | rust_invos.append(float(md[name]["num_invocations"])) 1061 | 1062 | rust_fns = rust_fns + 1 1063 | 1064 | rust_total_indirs = rust_total_indirs + float(md[name]["num_indirect_calls"]) 1065 | rust_total_dynamics = rust_total_dynamics + float(md[name]["num_dynamic_calls"]) 1066 | rust_total_cs = rust_total_cs + float(md[name]["num_call_sites"]) 1067 | rust_total_tps = rust_total_tps + float(md[name]["num_transfer_points"]) 1068 | rust_total_vps = rust_total_vps + float(md[name]["num_visitor_points"]) 1069 | rust_total_invos = rust_total_invos + float(md[name]["num_invocations"]) 1070 | 1071 | #if md[name]["type"] == "closure": 1072 | #rust_closures = rust_closures + 1 1073 | #if md[name]["type"] == "static": 1074 | #rust_monos = rust_monos + 1 1075 | if "_R" in name: 1076 | mangled = False 1077 | if sum(1 for c in name if c.isupper()) > 1: 1078 | mangled = True 1079 | 1080 | if mangled: 1081 | if "CN" in name: 1082 | rust_closures = rust_closures + 1 1083 | elif "IN" in name: 1084 | rust_monos = rust_monos + 1 1085 | elif "X" in name: 1086 | rust_monos = rust_monos + 1 1087 | elif "M" in name: 1088 | rust_monos = rust_monos + 1 1089 | 1090 | ## Check top call sites 1091 | if float(md[name]["num_call_sites"]) > float(rust_top_cs["first"]["num"]): 1092 | #print("Found new top leader") 1093 | #print(md[name]["num_call_sites"]) 1094 | #print(rust_top_cs) 1095 | 1096 | # Get tmps 1097 | tmp1 = { 1098 | "name": rust_top_cs["first"]["name"], 1099 | "num": rust_top_cs["first"]["num"], 1100 | } 1101 | tmp2 = { 1102 | "name": rust_top_cs["second"]["name"], 1103 | "num": rust_top_cs["second"]["num"], 1104 | } 1105 | 1106 | #print(tmp1) 1107 | #print(tmp2) 1108 | 1109 | # Set the new leader 1110 | rust_top_cs["first"]["name"] = str(name) 1111 | rust_top_cs["first"]["num"] = 
float(md[name]["num_call_sites"]) 1112 | 1113 | #print(tmp1) 1114 | #print(tmp2) 1115 | 1116 | # Move down 1117 | rust_top_cs["second"] = tmp1 1118 | rust_top_cs["third"] = tmp2 1119 | 1120 | elif float(md[name]["num_call_sites"]) > float(rust_top_cs["second"]["num"]): 1121 | # Get tmps 1122 | tmp2 = { 1123 | "name": rust_top_cs["second"]["name"], 1124 | "num": rust_top_cs["second"]["num"], 1125 | } 1126 | 1127 | # Set the new leader 1128 | rust_top_cs["second"]["name"] = str(name) 1129 | rust_top_cs["second"]["num"] = float(md[name]["num_call_sites"]) 1130 | 1131 | # Move down 1132 | rust_top_cs["third"] = tmp2 1133 | 1134 | elif float(md[name]["num_call_sites"]) > float(rust_top_cs["third"]["num"]): 1135 | # Set the new leader 1136 | rust_top_cs["third"]["name"] = str(name) 1137 | rust_top_cs["third"]["num"] = float(md[name]["num_call_sites"]) 1138 | 1139 | ## Check top invocations 1140 | if float(md[name]["num_invocations"]) > float(rust_top_invos["first"]["num"]): 1141 | # Get tmps 1142 | tmp1 = { 1143 | "name": rust_top_invos["first"]["name"], 1144 | "num": rust_top_invos["first"]["num"], 1145 | } 1146 | tmp2 = { 1147 | "name": rust_top_invos["second"]["name"], 1148 | "num": rust_top_invos["second"]["num"], 1149 | } 1150 | 1151 | # Set the new leader 1152 | rust_top_invos["first"]["name"] = str(name) 1153 | rust_top_invos["first"]["num"] = float(md[name]["num_invocations"]) 1154 | 1155 | # Move down 1156 | rust_top_invos["second"] = tmp1 1157 | rust_top_invos["third"] = tmp2 1158 | 1159 | elif float(md[name]["num_invocations"]) > float(rust_top_invos["second"]["num"]): 1160 | # Get tmps 1161 | tmp2 = rust_top_invos["second"] 1162 | tmp2 = { 1163 | "name": rust_top_invos["second"]["name"], 1164 | "num": rust_top_invos["second"]["num"], 1165 | } 1166 | 1167 | # Set the new leader 1168 | rust_top_invos["second"]["name"] = str(name) 1169 | rust_top_invos["second"]["num"] = float(md[name]["num_invocations"]) 1170 | 1171 | # Move down 1172 | rust_top_invos["third"] 
= tmp2 1173 | 1174 | elif float(md[name]["num_invocations"]) > float(rust_top_invos["third"]["num"]): 1175 | # Set the new leader 1176 | rust_top_invos["third"]["name"] = str(name) 1177 | rust_top_invos["third"]["num"] = float(md[name]["num_invocations"]) 1178 | 1179 | ## Check top Transfer Points 1180 | if float(md[name]["num_transfer_points"]) > float(rust_top_tps["first"]["num"]): 1181 | # Get tmps 1182 | tmp1 = { 1183 | "name": rust_top_tps["first"]["name"], 1184 | "num": rust_top_tps["first"]["num"], 1185 | } 1186 | tmp2 = { 1187 | "name": rust_top_tps["second"]["name"], 1188 | "num": rust_top_tps["second"]["num"], 1189 | } 1190 | 1191 | # Set the new leader 1192 | rust_top_tps["first"]["name"] = str(name) 1193 | rust_top_tps["first"]["num"] = float(md[name]["num_transfer_points"]) 1194 | 1195 | # Move down 1196 | rust_top_tps["second"] = tmp1 1197 | rust_top_tps["third"] = tmp2 1198 | 1199 | elif float(md[name]["num_transfer_points"]) > float(rust_top_tps["second"]["num"]): 1200 | # Get tmps 1201 | tmp2 = rust_top_tps["second"] 1202 | tmp2 = { 1203 | "name": rust_top_tps["second"]["name"], 1204 | "num": rust_top_tps["second"]["num"], 1205 | } 1206 | 1207 | # Set the new leader 1208 | rust_top_tps["second"]["name"] = str(name) 1209 | rust_top_tps["second"]["num"] = float(md[name]["num_transfer_points"]) 1210 | 1211 | # Move down 1212 | rust_top_tps["third"] = tmp2 1213 | 1214 | elif float(md[name]["num_transfer_points"]) > float(rust_top_tps["third"]["num"]): 1215 | # Set the new leader 1216 | rust_top_tps["third"]["name"] = str(name) 1217 | rust_top_tps["third"]["num"] = float(md[name]["num_transfer_points"]) 1218 | 1219 | ## Check top Visitor Points 1220 | if float(md[name]["num_visitor_points"]) > float(rust_top_vps["first"]["num"]): 1221 | # Get tmps 1222 | tmp1 = { 1223 | "name": rust_top_vps["first"]["name"], 1224 | "num": rust_top_vps["first"]["num"], 1225 | } 1226 | tmp2 = { 1227 | "name": rust_top_vps["second"]["name"], 1228 | "num":
rust_top_vps["second"]["num"], 1229 | } 1230 | 1231 | # Set the new leader 1232 | rust_top_vps["first"]["name"] = str(name) 1233 | rust_top_vps["first"]["num"] = float(md[name]["num_visitor_points"]) 1234 | 1235 | # Move down 1236 | rust_top_vps["second"] = tmp1 1237 | rust_top_vps["third"] = tmp2 1238 | 1239 | elif float(md[name]["num_visitor_points"]) > float(rust_top_vps["second"]["num"]): 1240 | # Get tmps 1241 | tmp2 = { 1242 | "name": rust_top_vps["second"]["name"], 1243 | "num": rust_top_vps["second"]["num"], 1244 | } 1245 | 1246 | # Set the new leader 1247 | rust_top_vps["second"]["name"] = str(name) 1248 | rust_top_vps["second"]["num"] = float(md[name]["num_visitor_points"]) 1249 | 1250 | # Move down 1251 | rust_top_vps["third"] = tmp2 1252 | 1253 | elif float(md[name]["num_visitor_points"]) > float(rust_top_vps["third"]["num"]): 1254 | # Set the new leader 1255 | rust_top_vps["third"]["name"] = str(name) 1256 | rust_top_vps["third"]["num"] = float(md[name]["num_visitor_points"]) 1257 | 1258 | except Exception as e: 1259 | print(e) 1260 | 1261 | #elif md[name]["lang"] == "c" or md[name]["lang"] == "c++": 1262 | else: 1263 | try: 1264 | c_indirs.append(float(md[name]["num_indirect_calls"])) 1265 | c_dynamics.append(float(md[name]["num_dynamic_calls"])) 1266 | c_cs.append(float(md[name]["num_call_sites"])) 1267 | c_tps.append(float(md[name]["num_transfer_points"])) 1268 | c_vps.append(float(md[name]["num_visitor_points"])) 1269 | c_invos.append(float(md[name]["num_invocations"])) 1270 | 1271 | c_fns = c_fns + 1 1272 | 1273 | c_total_indirs = c_total_indirs + float(md[name]["num_indirect_calls"]) 1274 | c_total_dynamics = c_total_dynamics + float(md[name]["num_dynamic_calls"]) 1275 | c_total_cs = c_total_cs + float(md[name]["num_call_sites"]) 1276 | c_total_tps = c_total_tps + float(md[name]["num_transfer_points"]) 1277 | c_total_vps = c_total_vps + float(md[name]["num_visitor_points"]) 1278 | c_total_invos = c_total_invos + float(md[name]["num_invocations"]) 
1279 | 1280 | ## Check top call sites 1281 | if float(md[name]["num_call_sites"]) > float(c_top_cs["first"]["num"]): 1282 | # Get tmps 1283 | tmp1 = { 1284 | "name": c_top_cs["first"]["name"], 1285 | "num": c_top_cs["first"]["num"], 1286 | } 1287 | tmp2 = { 1288 | "name": c_top_cs["second"]["name"], 1289 | "num": c_top_cs["second"]["num"], 1290 | } 1291 | 1292 | # Set the new leader 1293 | c_top_cs["first"]["name"] = str(name) 1294 | c_top_cs["first"]["num"] = float(md[name]["num_call_sites"]) 1295 | 1296 | # Move down 1297 | c_top_cs["second"] = tmp1 1298 | c_top_cs["third"] = tmp2 1299 | 1300 | elif float(md[name]["num_call_sites"]) > float(c_top_cs["second"]["num"]): 1301 | # Get tmps 1302 | tmp2 = { 1303 | "name": c_top_cs["second"]["name"], 1304 | "num": c_top_cs["second"]["num"], 1305 | } 1306 | 1307 | # Set the new leader 1308 | c_top_cs["second"]["name"] = str(name) 1309 | c_top_cs["second"]["num"] = float(md[name]["num_call_sites"]) 1310 | 1311 | # Move down 1312 | c_top_cs["third"] = tmp2 1313 | 1314 | elif float(md[name]["num_call_sites"]) > float(c_top_cs["third"]["num"]): 1315 | # Set the new leader 1316 | c_top_cs["third"]["name"] = str(name) 1317 | c_top_cs["third"]["num"] = float(md[name]["num_call_sites"]) 1318 | 1319 | ## Check top invocations 1320 | if float(md[name]["num_invocations"]) > float(c_top_invos["first"]["num"]): 1321 | # Get tmps 1322 | tmp1 = { 1323 | "name": c_top_invos["first"]["name"], 1324 | "num": c_top_invos["first"]["num"], 1325 | } 1326 | tmp2 = { 1327 | "name": c_top_invos["second"]["name"], 1328 | "num": c_top_invos["second"]["num"], 1329 | } 1330 | 1331 | # Set the new leader 1332 | c_top_invos["first"]["name"] = str(name) 1333 | c_top_invos["first"]["num"] = float(md[name]["num_invocations"]) 1334 | 1335 | # Move down 1336 | c_top_invos["second"] = tmp1 1337 | c_top_invos["third"] = tmp2 1338 | 1339 | elif float(md[name]["num_invocations"]) > float(c_top_invos["second"]["num"]): 1340 | # Get tmps 1341 | tmp2 = { 1342 | 
"name": c_top_invos["second"]["name"], 1343 | "num": c_top_invos["second"]["num"], 1344 | } 1345 | 1346 | # Set the new leader 1347 | c_top_invos["second"]["name"] = str(name) 1348 | c_top_invos["second"]["num"] = float(md[name]["num_invocations"]) 1349 | 1350 | # Move down 1351 | c_top_invos["third"] = tmp2 1352 | 1353 | elif float(md[name]["num_invocations"]) > float(c_top_invos["third"]["num"]): 1354 | # Set the new leader 1355 | c_top_invos["third"]["name"] = str(name) 1356 | c_top_invos["third"]["num"] = float(md[name]["num_invocations"]) 1357 | 1358 | ## Check top Transfer Points 1359 | if float(md[name]["num_transfer_points"]) > float(c_top_tps["first"]["num"]): 1360 | # Get tmps 1361 | tmp1 = { 1362 | "name": c_top_tps["first"]["name"], 1363 | "num": c_top_tps["first"]["num"], 1364 | } 1365 | tmp2 = { 1366 | "name": c_top_tps["second"]["name"], 1367 | "num": c_top_tps["second"]["num"], 1368 | } 1369 | 1370 | # Set the new leader 1371 | c_top_tps["first"]["name"] = str(name) 1372 | c_top_tps["first"]["num"] = float(md[name]["num_transfer_points"]) 1373 | 1374 | # Move down 1375 | c_top_tps["second"] = tmp1 1376 | c_top_tps["third"] = tmp2 1377 | 1378 | elif float(md[name]["num_transfer_points"]) > float(c_top_tps["second"]["num"]): 1379 | # Get tmps 1380 | tmp2 = { 1381 | "name": c_top_tps["second"]["name"], 1382 | "num": c_top_tps["second"]["num"], 1383 | } 1384 | 1385 | # Set the new leader 1386 | c_top_tps["second"]["name"] = str(name) 1387 | c_top_tps["second"]["num"] = float(md[name]["num_transfer_points"]) 1388 | 1389 | # Move down 1390 | c_top_tps["third"] = tmp2 1391 | 1392 | elif float(md[name]["num_transfer_points"]) > float(c_top_tps["third"]["num"]): 1393 | # Set the new leader 1394 | c_top_tps["third"]["name"] = str(name) 1395 | c_top_tps["third"]["num"] = float(md[name]["num_transfer_points"]) 1396 | 1397 | ## Check top invocations 1398 | if float(md[name]["num_visitor_points"]) > float(c_top_vps["first"]["num"]): 1399 | # Get tmps 1400 | tmp1 
= { 1401 | "name": c_top_vps["first"]["name"], 1402 | "num": c_top_vps["first"]["num"], 1403 | } 1404 | tmp2 = { 1405 | "name": c_top_vps["second"]["name"], 1406 | "num": c_top_vps["second"]["num"], 1407 | } 1408 | 1409 | # Set the new leader 1410 | c_top_vps["first"]["name"] = str(name) 1411 | c_top_vps["first"]["num"] = float(md[name]["num_visitor_points"]) 1412 | 1413 | # Move down 1414 | c_top_vps["second"] = tmp1 1415 | c_top_vps["third"] = tmp2 1416 | 1417 | elif float(md[name]["num_visitor_points"]) > float(c_top_vps["second"]["num"]): 1418 | # Get tmps 1419 | tmp2 = { 1420 | "name": c_top_vps["second"]["name"], 1421 | "num": c_top_vps["second"]["num"], 1422 | } 1423 | 1424 | # Set the new leader 1425 | c_top_vps["second"]["name"] = str(name) 1426 | c_top_vps["second"]["num"] = float(md[name]["num_visitor_points"]) 1427 | 1428 | # Move down 1429 | c_top_vps["third"] = tmp2 1430 | 1431 | elif float(md[name]["num_visitor_points"]) > float(c_top_vps["third"]["num"]): 1432 | # Set the new leader 1433 | c_top_vps["third"]["name"] = str(name) 1434 | c_top_vps["third"]["num"] = float(md[name]["num_visitor_points"]) 1435 | 1436 | except Exception as e: 1437 | print(e) 1438 | 1439 | try: 1440 | both_indirs.append(float(md[name]["num_indirect_calls"])) 1441 | both_dynamics.append(float(md[name]["num_dynamic_calls"])) 1442 | both_cs.append(float(md[name]["num_call_sites"])) 1443 | both_tps.append(float(md[name]["num_transfer_points"])) 1444 | both_vps.append(float(md[name]["num_visitor_points"])) 1445 | both_invos.append(float(md[name]["num_invocations"])) 1446 | 1447 | both_fns = both_fns + 1 1448 | 1449 | both_total_indirs = both_total_indirs + float(md[name]["num_indirect_calls"]) 1450 | both_total_dynamics = both_total_dynamics + float(md[name]["num_dynamic_calls"]) 1451 | both_total_cs = both_total_cs + float(md[name]["num_call_sites"]) 1452 | both_total_tps = both_total_tps + float(md[name]["num_transfer_points"]) 1453 | both_total_vps = both_total_vps + 
float(md[name]["num_visitor_points"]) 1454 | both_total_invos = both_total_invos + float(md[name]["num_invocations"]) 1455 | 1456 | ## Check top call sites 1457 | if float(md[name]["num_call_sites"]) > float(both_top_cs["first"]["num"]): 1458 | # Get tmps 1459 | tmp1 = both_top_cs["first"] 1460 | tmp2 = both_top_cs["second"] 1461 | tmp1 = { 1462 | "name": both_top_cs["first"]["name"], 1463 | "num": both_top_cs["first"]["num"], 1464 | } 1465 | tmp2 = { 1466 | "name": both_top_cs["second"]["name"], 1467 | "num": both_top_cs["second"]["num"], 1468 | } 1469 | 1470 | # Set the new leader 1471 | both_top_cs["first"]["name"] = str(name) 1472 | both_top_cs["first"]["num"] = float(md[name]["num_call_sites"]) 1473 | 1474 | # Move down 1475 | both_top_cs["second"] = tmp1 1476 | both_top_cs["third"] = tmp2 1477 | 1478 | elif float(md[name]["num_call_sites"]) > float(both_top_cs["second"]["num"]): 1479 | # Get tmps 1480 | tmp2 = both_top_cs["second"] 1481 | tmp2 = { 1482 | "name": both_top_cs["second"]["name"], 1483 | "num": both_top_cs["second"]["num"], 1484 | } 1485 | 1486 | # Set the new leader 1487 | both_top_cs["second"]["name"] = str(name) 1488 | both_top_cs["second"]["num"] = float(md[name]["num_call_sites"]) 1489 | 1490 | # Move down 1491 | both_top_cs["third"] = tmp2 1492 | 1493 | elif float(md[name]["num_call_sites"]) > float(both_top_cs["third"]["num"]): 1494 | # Set the new leader 1495 | both_top_cs["third"]["name"] = str(name) 1496 | both_top_cs["third"]["num"] = float(md[name]["num_call_sites"]) 1497 | 1498 | ## Check top invocations 1499 | if float(md[name]["num_invocations"]) > float(both_top_invos["first"]["num"]): 1500 | # Get tmps 1501 | tmp1 = { 1502 | "name": both_top_invos["first"]["name"], 1503 | "num": both_top_invos["first"]["num"], 1504 | } 1505 | tmp2 = { 1506 | "name": both_top_invos["second"]["name"], 1507 | "num": both_top_invos["second"]["num"], 1508 | } 1509 | 1510 | # Set the new leader 1511 | both_top_invos["first"]["name"] = str(name) 1512 | 
both_top_invos["first"]["num"] = float(md[name]["num_invocations"]) 1513 | 1514 | # Move down 1515 | both_top_invos["second"] = tmp1 1516 | both_top_invos["third"] = tmp2 1517 | 1518 | elif float(md[name]["num_invocations"]) > float(both_top_invos["second"]["num"]): 1519 | # Get tmps 1520 | tmp2 = { 1521 | "name": both_top_invos["second"]["name"], 1522 | "num": both_top_invos["second"]["num"], 1523 | } 1524 | 1525 | # Set the new leader 1526 | both_top_invos["second"]["name"] = str(name) 1527 | both_top_invos["second"]["num"] = float(md[name]["num_invocations"]) 1528 | 1529 | # Move down 1530 | both_top_invos["third"] = tmp2 1531 | 1532 | elif float(md[name]["num_invocations"]) > float(both_top_invos["third"]["num"]): 1533 | # Set the new leader 1534 | both_top_invos["third"]["name"] = str(name) 1535 | both_top_invos["third"]["num"] = float(md[name]["num_invocations"]) 1536 | 1537 | ## Check top Transfer Points 1538 | if float(md[name]["num_transfer_points"]) > float(both_top_tps["first"]["num"]): 1539 | # Get tmps 1540 | tmp1 = { 1541 | "name": both_top_tps["first"]["name"], 1542 | "num": both_top_tps["first"]["num"], 1543 | } 1544 | tmp2 = { 1545 | "name": both_top_tps["second"]["name"], 1546 | "num": both_top_tps["second"]["num"], 1547 | } 1548 | 1549 | # Set the new leader 1550 | both_top_tps["first"]["name"] = str(name) 1551 | both_top_tps["first"]["num"] = float(md[name]["num_transfer_points"]) 1552 | 1553 | # Move down 1554 | both_top_tps["second"] = tmp1 1555 | both_top_tps["third"] = tmp2 1556 | 1557 | elif float(md[name]["num_transfer_points"]) > float(both_top_tps["second"]["num"]): 1558 | # Get tmps 1559 | tmp2 = { 1560 | "name": both_top_tps["second"]["name"], 1561 | "num": both_top_tps["second"]["num"], 1562 | } 1563 | 1564 | # Set the new leader 1565 | both_top_tps["second"]["name"] = str(name) 1566 | both_top_tps["second"]["num"] = float(md[name]["num_transfer_points"]) 1567 | 1568 | # Move down 1569 | both_top_tps["third"] = tmp2 1570 | 1571 | elif 
float(md[name]["num_transfer_points"]) > float(both_top_tps["third"]["num"]): 1572 | # Set the new leader 1573 | both_top_tps["third"]["name"] = str(name) 1574 | both_top_tps["third"]["num"] = float(md[name]["num_transfer_points"]) 1575 | 1576 | ## Check top Visitor Points 1577 | if float(md[name]["num_visitor_points"]) > float(both_top_vps["first"]["num"]): 1578 | # Get tmps 1579 | tmp1 = { 1580 | "name": both_top_vps["first"]["name"], 1581 | "num": both_top_vps["first"]["num"], 1582 | } 1583 | tmp2 = { 1584 | "name": both_top_vps["second"]["name"], 1585 | "num": both_top_vps["second"]["num"], 1586 | } 1587 | 1588 | # Set the new leader 1589 | both_top_vps["first"]["name"] = str(name) 1590 | both_top_vps["first"]["num"] = float(md[name]["num_visitor_points"]) 1591 | 1592 | # Move down 1593 | both_top_vps["second"] = tmp1 1594 | both_top_vps["third"] = tmp2 1595 | 1596 | elif float(md[name]["num_visitor_points"]) > float(both_top_vps["second"]["num"]): 1597 | # Get tmps (copy the old second place before it is overwritten) 1598 | tmp2 = { 1599 | "name": both_top_vps["second"]["name"], 1600 | "num": both_top_vps["second"]["num"], 1601 | } 1602 | 1603 | # Set the new leader 1604 | both_top_vps["second"]["name"] = str(name) 1605 | both_top_vps["second"]["num"] = float(md[name]["num_visitor_points"]) 1606 | 1607 | # Move down 1608 | both_top_vps["third"] = tmp2 1609 | 1610 | elif float(md[name]["num_visitor_points"]) > float(both_top_vps["third"]["num"]): 1611 | # Set the new leader 1612 | both_top_vps["third"]["name"] = str(name) 1613 | both_top_vps["third"]["num"] = float(md[name]["num_visitor_points"]) 1614 | 1615 | except Exception as e: 1616 | print(e) 1617 | 1618 | ### Print table metrics 1619 | ## Number of Functions 1620 | print("Total functions: ") 1621 | print("Rust: " + str(rust_fns)) 1622 | print("C/C++: " + str(c_fns)) 1623 | print("Both: " + str(both_fns)) 1624 | 1625 | ## Number of Indirect Calls 1626 | print("Total Indirect Calls: ") 1627 | print("Rust: " + str(rust_total_indirs)) 1628 | print("C/C++: " +
str(c_total_indirs)) 1629 | print("Both: " + str(both_total_indirs)) 1630 | 1631 | ## Number of Dynamic Calls 1632 | print("Total Dynamic Calls: ") 1633 | print("Rust: " + str(rust_total_dynamics)) 1634 | print("C/C++: " + str(c_total_dynamics)) 1635 | print("Both: " + str(both_total_dynamics)) 1636 | 1637 | ## Number of Call Sites 1638 | print("Total Call Sites: ") 1639 | print("Rust: " + str(rust_total_cs)) 1640 | print("C/C++: " + str(c_total_cs)) 1641 | print("Both: " + str(both_total_cs)) 1642 | 1643 | ## Number of Transfer Points 1644 | print("Total Transfer Points: ") 1645 | print("Rust: " + str(rust_total_tps)) 1646 | print("C/C++: " + str(c_total_tps)) 1647 | print("Both: " + str(both_total_tps)) 1648 | 1649 | ## Number of Visitor Points 1650 | print("Total Visitor Points: ") 1651 | print("Rust: " + str(rust_total_vps)) 1652 | print("C/C++: " + str(c_total_vps)) 1653 | print("Both: " + str(both_total_vps)) 1654 | 1655 | ## Number of Invocations 1656 | print("Total Invocations: ") 1657 | print("Rust: " + str(rust_total_invos)) 1658 | print("C/C++: " + str(c_total_invos)) 1659 | print("Both: " + str(both_total_invos)) 1660 | 1661 | ## Number of Rust closures 1662 | print("Total Closures: ") 1663 | print("Rust: " + str(rust_closures)) 1664 | 1665 | ## Number of Rust monomorphized functions 1666 | print("Total Monomorphized Functions: ") 1667 | print("Rust: " + str(rust_monos)) 1668 | 1669 | ### Print Top contenders 1670 | ## Top Call Sites 1671 | print("Top Call Sites:") 1672 | print("Both: " + str(both_top_cs)) 1673 | print("Rust: " + str(rust_top_cs)) 1674 | print("C/C++: " + str(c_top_cs)) 1675 | 1676 | ## Top Invocations 1677 | print("Top Invocations:") 1678 | print("Both: " + str(both_top_invos)) 1679 | print("Rust: " + str(rust_top_invos)) 1680 | print("C/C++: " + str(c_top_invos)) 1681 | 1682 | ## Top Transfer Points 1683 | print("Top Transfer Points:") 1684 | print("Both: " + str(both_top_tps)) 1685 | print("Rust: " + str(rust_top_tps)) 1686 |
print("C/C++: " + str(c_top_tps)) 1687 | 1688 | ## Top Visitor Points 1689 | print("Top Visitor Points:") 1690 | print("Both: " + str(both_top_vps)) 1691 | print("Rust: " + str(rust_top_vps)) 1692 | print("C/C++: " + str(c_top_vps)) 1693 | 1694 | ### Make a CDF for number of indirect calls 1695 | rust_x = np.sort(np.array(rust_indirs)) 1696 | rust_y = np.arange(1, len(rust_x)+1)/len(rust_x) 1697 | 1698 | c_x = np.sort(np.array(c_indirs)) 1699 | c_y = np.arange(1, len(c_x)+1)/len(c_x) 1700 | 1701 | both_x = np.sort(np.array(both_indirs)) 1702 | both_y = np.arange(1, len(both_x)+1)/len(both_x) 1703 | 1704 | print("Generating CDF for indirect function calls...") 1705 | print(both_y) 1706 | cdf_plt = plt.figure() 1707 | 1708 | # Graph labels 1709 | plt.title("Number of Indirect Function Calls") 1710 | plt.xlabel("Number of Indirect Function Calls") 1711 | plt.ylabel("Cumulative Distribution Function (CDF)") 1712 | 1713 | #plt.axis([0, max(both_x), 0, 1]) 1714 | plt.axis([0, 10, 0.9, 1]) 1715 | 1716 | # Grayscale 1717 | #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/10000000) 1718 | #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000) 1719 | #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/10000000) 1720 | 1721 | # Color 1722 | plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/10000000) 1723 | plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000) 1724 | plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/10000000) 1725 | 1726 | # Generate and save graph 1727 | #plt.grid(True) 1728 | plt.grid(True, color='0.45') 1729 | plt.plot() 1730 | #plt.show() # disabled so the legend below is added to this figure before it is saved, as in the other plots 1731 | plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3) 1732 | print("Saving indirs...") 1733 |
cdf_plt.savefig("output/graphs/indirs.pdf", bbox_inches='tight') 1734 | 1735 | 1736 | ### Make a CDF for number of dynamic calls 1737 | rust_x = np.sort(np.array(rust_dynamics)) 1738 | rust_y = np.arange(1, len(rust_x)+1)/len(rust_x) 1739 | 1740 | c_x = np.sort(np.array(c_dynamics)) 1741 | c_y = np.arange(1, len(c_x)+1)/len(c_x) 1742 | 1743 | both_x = np.sort(np.array(both_dynamics)) 1744 | both_y = np.arange(1, len(both_x)+1)/len(both_x) 1745 | 1746 | print("Generating CDF for dynamic function calls...") 1747 | cdf_plt = plt.figure() 1748 | 1749 | # Graph labels 1750 | plt.title("Number of Dynamic Function Calls") 1751 | plt.xlabel("Number of Dynamic Function Calls") 1752 | plt.ylabel("Cumulative Distribution Function (CDF)") 1753 | 1754 | #plt.axis([0, max(both_x), 0.9, 1]) 1755 | plt.axis([0, 20, 0.85, 1]) 1756 | 1757 | #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/5000000) 1758 | #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000) 1759 | #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/5000000) 1760 | 1761 | plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/5000000) 1762 | plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000) 1763 | plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/5000000) 1764 | 1765 | # Generate and save graph 1766 | #plt.grid(True) 1767 | plt.grid(True, color='0.45') 1768 | plt.plot() 1769 | plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3) 1770 | print("Saving dynamics...") 1771 | cdf_plt.savefig("output/graphs/dynamics.pdf", bbox_inches='tight') 1772 | 1773 | ### Make a CDF for number of call sites 1774 | rust_x = np.sort(np.array(rust_cs)) 1775 | rust_y = np.arange(1, len(rust_x)+1)/len(rust_x) 1776 | 1777 | 
c_x = np.sort(np.array(c_cs)) 1778 | c_y = np.arange(1, len(c_x)+1)/len(c_x) 1779 | 1780 | both_x = np.sort(np.array(both_cs)) 1781 | both_y = np.arange(1, len(both_x)+1)/len(both_x) 1782 | 1783 | print("Generating CDF for number of call sites...") 1784 | cdf_plt = plt.figure() 1785 | 1786 | # Graph labels 1787 | plt.title("Number of Call Sites") 1788 | plt.xlabel("Number of Call Sites") 1789 | plt.ylabel("Cumulative Distribution Function (CDF)") 1790 | 1791 | #plt.axis([0, max(both_x), 0.9, 1]) 1792 | plt.axis([0, 50, 0.8, 1]) 1793 | 1794 | #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/10000000) 1795 | #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000) 1796 | #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/10000000) 1797 | 1798 | plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/10000000) 1799 | plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000) 1800 | plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/10000000) 1801 | 1802 | # Generate and save graph 1803 | #plt.grid(True) 1804 | plt.grid(True, color='0.45') 1805 | plt.plot() 1806 | plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3) 1807 | print("Saving calls...") 1808 | cdf_plt.savefig("output/graphs/calls.pdf", bbox_inches='tight') 1809 | 1810 | ### Make a CDF for number of transfer points 1811 | rust_x = np.sort(np.array(rust_tps)) 1812 | rust_y = np.arange(1, len(rust_x)+1)/len(rust_x) 1813 | 1814 | c_x = np.sort(np.array(c_tps)) 1815 | c_y = np.arange(1, len(c_x)+1)/len(c_x) 1816 | 1817 | both_x = np.sort(np.array(both_tps)) 1818 | both_y = np.arange(1, len(both_x)+1)/len(both_x) 1819 | 1820 | print("Generating CDF for number of transfer points...") 1821 | cdf_plt = plt.figure() 
1822 | 1823 | # Graph labels 1824 | plt.title("Number of Transfer Points") 1825 | plt.xlabel("Number of Transfer Points") 1826 | plt.ylabel("Cumulative Distribution Function (CDF)") 1827 | 1828 | #plt.axis([0, max(both_x), 0.9, 1]) 1829 | plt.axis([0, 20, 0.9, 1]) 1830 | 1831 | #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/5000000) 1832 | #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000) 1833 | #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/5000000) 1834 | 1835 | plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/5000000) 1836 | plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000) 1837 | plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/5000000) 1838 | 1839 | # Generate and save graph 1840 | #plt.grid(True) 1841 | plt.grid(True, color='0.45') 1842 | plt.plot() 1843 | plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3) 1844 | print("Saving tps...") 1845 | cdf_plt.savefig("output/graphs/tps.pdf", bbox_inches='tight') 1846 | 1847 | ### Make a CDF for number of visitor points 1848 | #print("rust_vps") 1849 | #print(rust_vps) 1850 | #print("c_vps") 1851 | #print(c_vps) 1852 | #print("both_vps") 1853 | #print(both_vps) 1854 | rust_x = np.sort(np.array(rust_vps)) 1855 | rust_y = np.arange(1, len(rust_x)+1)/len(rust_x) 1856 | 1857 | c_x = np.sort(np.array(c_vps)) 1858 | c_y = np.arange(1, len(c_x)+1)/len(c_x) 1859 | 1860 | both_x = np.sort(np.array(both_vps)) 1861 | both_y = np.arange(1, len(both_x)+1)/len(both_x) 1862 | 1863 | print("Generating CDF for number of visitor points...") 1864 | cdf_plt = plt.figure() 1865 | 1866 | # Graph labels 1867 | plt.title("Number of Visitor Points") 1868 | plt.xlabel("Number of Visitor Points") 1869 | 
plt.ylabel("Cumulative Distribution Function (CDF)") 1870 | 1871 | #plt.axis([0, max(both_x), 0.9, 1]) 1872 | plt.axis([0, 10, 0.9, 1]) 1873 | 1874 | #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/500000) 1875 | #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000) 1876 | #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/500000) 1877 | 1878 | plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/250000) 1879 | plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000) 1880 | plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/250000) 1881 | 1882 | # Generate and save graph 1883 | #plt.grid(True) 1884 | plt.grid(True, color='0.45') 1885 | plt.plot() 1886 | plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3) 1887 | print("Saving vps...") 1888 | cdf_plt.savefig("output/graphs/vps.pdf", bbox_inches='tight') 1889 | 1890 | ### Make a CDF for number of invocations 1891 | #print("rust_invos") 1892 | #print(rust_invos) 1893 | 1894 | rust_x = np.sort(np.array(rust_invos)) 1895 | rust_y = np.arange(1, len(rust_x)+1)/len(rust_x) 1896 | 1897 | #print("c_invos") 1898 | #print(c_invos) 1899 | 1900 | c_x = np.sort(np.array(c_invos)) 1901 | c_y = np.arange(1, len(c_x)+1)/len(c_x) 1902 | 1903 | #print("both_invos") 1904 | #print(both_invos) 1905 | 1906 | both_x = np.sort(np.array(both_invos)) 1907 | both_y = np.arange(1, len(both_x)+1)/len(both_x) 1908 | 1909 | print("Generating CDF for number of invocations...") 1910 | cdf_plt = plt.figure() 1911 | 1912 | # Graph labels 1913 | plt.title("Number of Invocations") 1914 | plt.xlabel("Number of Invocations") 1915 | plt.ylabel("Cumulative Distribution Function (CDF)") 1916 | 1917 | #plt.axis([0, max(both_x), 0.9, 1]) 1918 | 
plt.axis([0, 30, 0.8, 1]) 1919 | 1920 | #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/1000000) 1921 | #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000) 1922 | #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/1000000) 1923 | 1924 | plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/750000) 1925 | plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000) 1926 | plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/750000) 1927 | 1928 | # Generate and save graph 1929 | #plt.grid(True) 1930 | plt.grid(True, color='0.45') 1931 | plt.plot() 1932 | plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3) 1933 | print("Saving invos...") 1934 | cdf_plt.savefig("output/graphs/invos.pdf", bbox_inches='tight') 1935 | 1936 | if __name__ == "__main__": 1937 | 1938 | # Version that takes a file of binaries 1939 | parser = ArgumentParser() 1940 | parser.add_argument("bin_paths", type=str, help=""" 1941 | Path of file that contains list of binaries to generate metrics for 1942 | """) 1943 | args = parser.parse_args() 1944 | 1945 | # Find each relevant binary 1946 | # TODO: call find_elf.sh from python 1947 | 1948 | # For each elf, create a json file of function metadata 1949 | elf_reader(args.bin_paths) 1950 | 1951 | # Combine each elf json file into a single json file of function metadata 1952 | combine_elf_results(args.bin_paths) 1953 | 1954 | # For each objdump, create a json file of function metadata 1955 | obj_reader(args.bin_paths) 1956 | 1957 | # Combine each obj json file into a single json file of function metadata 1958 | combine_obj_results(args.bin_paths) 1959 | 1960 | # Combine elf and obj json metadata into one json file of function metadata 1961 | 
combine_elf_and_obj_results() 1962 | 1963 | # Use call sites in function metadata to find transfer and visitor points and save 1964 | get_transfer_points() 1965 | 1966 | # Use call sites in function metadata to find invocations 1967 | get_invocation_points() 1968 | 1969 | # Make graphs from metadata 1970 | generate_cdfs() 1971 | -------------------------------------------------------------------------------- /cla-metrics/input/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/input/.DS_Store -------------------------------------------------------------------------------- /cla-metrics/input/elfs/.placeholder: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/input/elfs/.placeholder -------------------------------------------------------------------------------- /cla-metrics/input/objdumps/.placeholder: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/input/objdumps/.placeholder -------------------------------------------------------------------------------- /cla-metrics/input/source-info.json: -------------------------------------------------------------------------------- 1 | { 2 | "rust": [], 3 | "c": [] 4 | } 5 | -------------------------------------------------------------------------------- /cla-metrics/output/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/.DS_Store -------------------------------------------------------------------------------- 
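Each plot in `generate_cdfs()` of `fn_metrics.py` repeats one recipe: sort the per-function counts, then pair the i-th smallest value with the fraction `i / n`. The sketch below isolates that recipe in plain Python, assuming a hypothetical helper name `empirical_cdf` and made-up sample counts. The script itself uses `np.sort` and `np.arange` for the same computation.

```python
def empirical_cdf(samples):
    """Return (xs, ys) where ys[i] is the fraction of samples <= xs[i]."""
    xs = sorted(samples)
    n = len(xs)
    # The i-th smallest sample (1-based) sits at cumulative fraction i / n.
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


# Hypothetical per-function call counts, for illustration only.
example_counts = [3, 0, 1, 0]
x, y = empirical_cdf(example_counts)
print(x)
print(y)
```

Factoring the computation out this way would let the six plotting sections share one code path, so an axis range or marker change is made in a single place.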
/cla-metrics/output/elf-results/.placeholder: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/elf-results/.placeholder -------------------------------------------------------------------------------- /cla-metrics/output/graphs/.placeholder: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/graphs/.placeholder -------------------------------------------------------------------------------- /cla-metrics/output/obj-results/.placeholder: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/obj-results/.placeholder -------------------------------------------------------------------------------- /go-cla-examples/Makefile: -------------------------------------------------------------------------------- 1 | DIR=$(dir $(realpath $(firstword $(MAKEFILE_LIST)))) 2 | 3 | all: go 4 | 5 | dynamic-c: 6 | #clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 7 | #clang -fPIC -shared -o $(DIR)src/init/libinit.so $(DIR)src/init/init.o -fsanitize=cfi -flto -fvisibility=hidden 8 | clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c 9 | clang -fPIC -shared -o $(DIR)src/init/libinit.so $(DIR)src/init/init.o 10 | 11 | static-c: 12 | clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 13 | ar crs $(DIR)src/init/libinit.a $(DIR)/src/init/init.o 14 | 15 | go: dynamic-c 16 | gofmt -e -s -w . 
17 | CGO_CFLAGS="-flto -ffat-lto-objects" go build $(DIR)/src/main.go 18 | 19 | sim: 20 | LD_LIBRARY_PATH=$(DIR)/src/init/ $(DIR)/main 21 | 22 | obj: go 23 | objdump -S $(DIR)/main > $(DIR)/main.obj 24 | 25 | clean: 26 | rm -rf $(DIR)/main 27 | rm -f $(DIR)/src/init/libinit.a 28 | rm -f $(DIR)/src/init/libinit.so 29 | rm -f $(DIR)/src/init/init.o 30 | rm -f $(DIR)/main.obj 31 | -------------------------------------------------------------------------------- /go-cla-examples/src/init/init.c: -------------------------------------------------------------------------------- 1 | #include <malloc.h> 2 | #include <stdint.h> 3 | #include <stdio.h> 4 | #include <stdlib.h> 5 | 6 | // Simple function that acts as user input 7 | extern int64_t get_attack(); 8 | 9 | // Simple initialization function 10 | void init() { 11 | // Turns off heap checks for double frees 12 | // Set to lowest level which just prints out the error and continues 13 | // Not strictly necessary, but helps with presentation 14 | // I.e., prints when we achieved a double free 15 | mallopt(M_CHECK_ACTION, 1); 16 | } 17 | 18 | // Given the array to modify, this function sets a field in the array 19 | // Can cause an OOB vulnerability 20 | void user_given_array(int64_t array_ptr_addr) { 21 | // These values could be set by a corruptible source, e.g., user input 22 | // Thus, the index points to Data.cb and the value is the address of attack() 23 | // This is an OOB write, as it indexes past the allocated array of size 3 24 | int64_t array_index = 3; 25 | int64_t array_value = get_attack(); 26 | 27 | int64_t* a = (void *)array_ptr_addr; 28 | printf("addr of a[array_index] in user_given_array: %ld\n", (int64_t)&(a[array_index])); 29 | 30 | a[array_index] = array_value; 31 | printf("Done with user_given_array.\n"); 32 | } 33 | 34 | // This function prints the address of a given array 35 | // Can cause UaF and DF vulnerabilities 36 | void print_array_addr(int64_t array_ptr_addr) { 37 | int64_t* a = (void *)array_ptr_addr; 38 | printf("addr of a in print_array_addr: 
%ld\n", (int64_t)a); 39 | 40 | // This is an unnecessary free call, as Go allocated the array 41 | // (and subsequently Go will free this array later) 42 | //free(a); 43 | 44 | // C now thinks it can use a for something else 45 | // (e.g., set it to a user defined address) 46 | // Go may not realize this functionality occurs 47 | // These values could be set by a corruptible source, e.g., user input 48 | int64_t array_value = get_attack(); 49 | *a = array_value; 50 | 51 | printf("Done with print_array_addr.\n"); 52 | } 53 | 54 | 55 | // This function allocates its own array and populates it based on user input 56 | void user_set_array() { 57 | 58 | // Initialize array 59 | int64_t a[1] = { 0 }; 60 | 61 | // These values could be set by a corruptible source, e.g., user input 62 | // Thus, the index points to Data.cb and value is the address of attack() 63 | // This is an OOB write, as it indexes past the allocated array of size 1 64 | // Note: the value of array_index only works 50% of the time 65 | // It depends on what memory address the stack gets 66 | // (i.e., stack location is not deterministic in Go) 67 | // Thus, we corrupt two likely candidates 68 | 69 | int64_t array_index = (((int64_t)get_attack() + 824628677912) - (int64_t)&a)/8; 70 | int64_t array_index2 = (((int64_t)get_attack() + 824628698392) - (int64_t)&a)/8; 71 | 72 | int64_t array_value = get_attack(); 73 | 74 | printf("addr of &a: %ld\n", (int64_t)&a); 75 | printf("array_value: %ld\n", array_value); 76 | printf("addr of a[array_index] in user_set_array: %ld\n", (int64_t)&(a[array_index])); 77 | printf("addr of a[array_index2] in user_set_array: %ld\n", (int64_t)&(a[array_index2])); 78 | 79 | a[array_index] = array_value; 80 | a[array_index2] = array_value; 81 | printf("Done with user_set_array.\n"); 82 | } 83 | 84 | // Go calls this function to get the right address of a call back function 85 | // If Go doesn't properly sanitize data from this function, 86 | // it could return corrupted data 87 | 
int64_t get_cb_from_c() { 88 | // These values could be set by a corruptible source, e.g., user input 89 | int64_t call_back_addr = get_attack(); 90 | 91 | return call_back_addr; 92 | } 93 | 94 | // Given the slice to modify, this function sets a field in the slice header 95 | // Can cause an OOB vulnerability 96 | void user_given_slice(int64_t slice_ptr_addr) { 97 | // These values could be set by a corruptible source, e.g., user input 98 | // Thus, the index points into the slice fat pointer and the value is too large 99 | // This corrupts the slice length, so later accesses can index past the real length of 2 100 | int64_t array_index = 1; 101 | int64_t array_value = 10000000; 102 | 103 | int64_t* a = (void *)slice_ptr_addr; 104 | printf("addr of a[array_index] in user_given_slice: %ld\n", (int64_t)&(a[array_index])); 105 | 106 | a[array_index] = array_value; 107 | printf("Done with user_given_slice.\n"); 108 | } 109 | -------------------------------------------------------------------------------- /go-cla-examples/src/init/init.h: -------------------------------------------------------------------------------- 1 | #include <stdint.h> 2 | 3 | void init(); 4 | void user_given_array(int64_t array_ptr_addr); 5 | void print_array_addr(int64_t array_ptr_addr); 6 | void user_set_array(); 7 | void user_given_slice(int64_t slice_ptr_addr); 8 | int64_t get_cb_from_c(); 9 | -------------------------------------------------------------------------------- /go-cla-examples/src/main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | /* 4 | #include "./init/init.h" 5 | #include <stdint.h> // for C.int64_t 6 | #include <stdlib.h> // for free() 7 | #cgo LDFLAGS: -L./init/ -linit 8 | */ 9 | import "C" 10 | 11 | import ( 12 | "fmt" 13 | "unsafe" 14 | ) 15 | 16 | var fa_attack = attack 17 | 18 | const MAX_LENGTH = 3 19 | 20 | //export get_attack 21 | func get_attack() C.int64_t { 22 | 23 | // Get a pointer to the address of the function pointer 24 | p := unsafe.Pointer(&fa_attack) 25 | 26 | // 
Pull out the address of a function pointer from the pointer to the address of the function pointer 27 | p_addr := C.int64_t(uintptr(p)) 28 | 29 | return p_addr 30 | } 31 | 32 | type Data struct { 33 | vals [MAX_LENGTH]int64 34 | cb *func(*int64) 35 | slice []int64 36 | cb2 *func(*int64) 37 | } 38 | 39 | // A simple benign function that doubles a value 40 | //go:noinline 41 | func doubler(x *int64) { 42 | fmt.Println("Not attacked! Adding two to the input...") 43 | *x = *x + 2 44 | } 45 | 46 | // A simple benign function that increments a value 47 | //go:noinline 48 | func incrementer(x *int64) { 49 | fmt.Println("Not attacked! Adding one to the input...") 50 | *x = *x + 1 51 | } 52 | 53 | // Attack aims to call this 54 | // Could be replaced with actual gadgets that together execute a weird machine 55 | //go:noinline 56 | func attack() { 57 | fmt.Println("We were attacked!") 58 | } 59 | 60 | // Main function 61 | //go:noinline 62 | func analyze_data(cb_fptr *func(*int64)) { 63 | 64 | // Initialize program 65 | C.init() 66 | 67 | // Set up some function pointers 68 | fa1 := incrementer 69 | fp1 := (*func(*int64))(&fa1) 70 | fa2 := doubler 71 | fp2 := (*func(*int64))(&fa2) 72 | 73 | // Initialize some data 74 | data := Data{ 75 | vals: [3]int64{1, 2, 3}, 76 | cb: fp1, 77 | slice: []int64{4, 5}, 78 | cb2: fp2, 79 | } 80 | fmt.Println("Start data: vals[0]=", data.vals[0], "cb=", data.cb, "slice[0]=", data.slice[0], "cb2=", data.cb2) 81 | 82 | // Get and print the addresses of the Data struct 83 | data_vals_addr := C.int64_t(uintptr(unsafe.Pointer(&data.vals))) 84 | data_cb_addr := C.int64_t(uintptr(unsafe.Pointer(&data.cb))) 85 | data_slice_addr := C.int64_t(uintptr(unsafe.Pointer(&data.slice))) 86 | data_cb2_addr := C.int64_t(uintptr(unsafe.Pointer(&data.cb2))) 87 | 88 | fmt.Println("data_vals_addr", data_vals_addr) 89 | fmt.Println("data_cb_addr", data_cb_addr) 90 | fmt.Println("data_slice_addr", data_slice_addr) 91 | fmt.Println("data_cb2_addr", data_cb2_addr) 92 | 
93 | // Get and print the address of the function argument to this function 94 | cb_fptr_addr := C.int64_t(uintptr(unsafe.Pointer(&cb_fptr))) 95 | fmt.Println("cb_fptr_addr", cb_fptr_addr) 96 | 97 | // Get and print the address of heap data that a new stores 98 | doubler_fp := new(func(*int64)) 99 | doubler_fp = fp2 100 | doubler_fp_addr := C.int64_t(uintptr(unsafe.Pointer(&doubler_fp))) 101 | fmt.Println("doubler_fp_addr", doubler_fp_addr) 102 | 103 | // Get and print the address of heap data that a new stores 104 | doubler2_fp := new(func(*int64)) 105 | doubler2_fp = fp2 106 | doubler2_fp_addr := C.int64_t(uintptr(unsafe.Pointer(&doubler2_fp))) 107 | fmt.Println("doubler2_fp_addr", doubler2_fp_addr) 108 | 109 | // Get a callback function pointer from C 110 | incrementer_fp_addr := C.get_cb_from_c() 111 | // Derive a function pointer from the address of a pointer to a function pointer 112 | incrementer_fp := (*func(*int64))(unsafe.Pointer(uintptr(incrementer_fp_addr))) 113 | 114 | // Section 4 Attacks 115 | /* Go Static Bounds Check Bypass Attack */ 116 | fmt.Println("Launching Go Bounds Check Bypass Attack...") 117 | C.user_given_array(data_vals_addr) 118 | 119 | fmt.Println("Calling data.cb...") 120 | (*data.cb)(&data.vals[0]) 121 | fmt.Println("Updated data: vals[0]=", data.vals[0]) 122 | 123 | /* Go Garbage Collection Bypass Attack */ 124 | fmt.Println("Launching Go Garbage Collection Bypass Attack...") 125 | C.print_array_addr(doubler_fp_addr) 126 | 127 | fmt.Println("Calling doubler_fp...") 128 | (*doubler_fp)(&data.vals[0]) 129 | fmt.Println("Updated data: vals[0]=", data.vals[0]) 130 | 131 | /* C/C++ Hardening Bypass Attack */ 132 | fmt.Println("Launching C/C++ Hardening Bypass Attack...") 133 | C.user_set_array() 134 | 135 | fmt.Println("Calling cb_fptr...") 136 | (*cb_fptr)(&data.vals[0]) 137 | fmt.Println("Updated data: vals[0]=", data.vals[0]) 138 | 139 | // Section 5 Attacks 140 | /* Corrupting Go Dynamic Bounds */ 141 | fmt.Println("Launching Go 
Dynamic Bounds Check Bypass Attack...") 142 | C.user_given_slice(data_slice_addr) 143 | 144 | // Now we can access past the length of data.slice in *Safe Go* 145 | // Length of slice is only 2 (and capacity is 2) 146 | // E.g., data.slice[22] actually points to doubler2_fp on the heap 147 | // So setting data.slice[22] actually corrupts the value a pointer holds 148 | // Moreover, slice_index and slice_val could come from a corruptible source, e.g., user input 149 | data_slice0_addr := C.int64_t(uintptr(unsafe.Pointer(&data.slice[0]))) 150 | slice_index := (doubler2_fp_addr - data_slice0_addr) / 8 151 | 152 | if slice_index > 0 { 153 | data_slice_I_addr := C.int64_t(uintptr(unsafe.Pointer(&data.slice[slice_index]))) 154 | 155 | fmt.Println("addr of data.slice[0]:", data_slice0_addr) 156 | fmt.Println("addr of data.slice[slice_index]:", data_slice_I_addr) 157 | 158 | slice_val := int64(get_attack()) 159 | 160 | /* This OOB is done in Safe Go! */ 161 | data.slice[slice_index] = slice_val 162 | 163 | } else { 164 | // ASLR placed doubler2_fp "below" data.slice in the heap 165 | // But, we can't access a slice with a negative index 166 | fmt.Println("ASLR protected us! 
Better luck next time attacker...") 167 | } 168 | 169 | fmt.Println("Calling doubler2_fp...") 170 | (*doubler2_fp)(&data.vals[0]) 171 | fmt.Println("Updated data: vals[0]=", data.vals[0]) 172 | 173 | /* Corrupting Intended Interactions */ 174 | fmt.Println("Launching Intended Interactions Attack...") 175 | 176 | fmt.Println("Calling incrementer_fp...") 177 | (*incrementer_fp)(&data.vals[0]) 178 | fmt.Println("Updated data: vals[0]=", data.vals[0]) 179 | 180 | /* Corrupting with Double Frees */ 181 | // Unsure exactly when, but Go will try to free doubler_fp 182 | // but it was already freed in print_array_addr 183 | // This will cause an abort, but could be used to execute a weird machine 184 | } 185 | 186 | func main() { 187 | // Set up a function pointer 188 | fa0 := incrementer 189 | fp0 := (*func(*int64))(&fa0) 190 | 191 | // Call the main function 192 | analyze_data(fp0) 193 | 194 | fmt.Println("Finished main.") 195 | } 196 | -------------------------------------------------------------------------------- /rust-cla-examples/Cargo.lock: -------------------------------------------------------------------------------- 1 | # This file is automatically @generated by Cargo. 2 | # It is not intended for manual editing. 
3 | [[package]] 4 | name = "cc" 5 | version = "1.0.72" 6 | source = "registry+https://github.com/rust-lang/crates.io-index" 7 | checksum = "22a9137b95ea06864e018375b72adfb7db6e6f68cfc8df5a04d00288050485ee" 8 | 9 | [[package]] 10 | name = "rust-cla-ex" 11 | version = "0.1.0" 12 | dependencies = [ 13 | "cc", 14 | ] 15 | -------------------------------------------------------------------------------- /rust-cla-examples/Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "rust-cla-ex" 3 | version = "0.1.0" 4 | authors = ["Samuel Mergendahl"] 5 | edition = "2018" 6 | 7 | # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html 8 | 9 | [dependencies] 10 | 11 | [build-dependencies] 12 | cc = "1.0" 13 | 14 | [[bin]] 15 | name = "cla" 16 | path = "src/main.rs" 17 | -------------------------------------------------------------------------------- /rust-cla-examples/Makefile: -------------------------------------------------------------------------------- 1 | DIR=$(dir $(realpath $(firstword $(MAKEFILE_LIST)))) 2 | 3 | all: rust 4 | 5 | dynamic-c: 6 | clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 7 | clang -fPIC -shared -o $(DIR)src/init/libinit.so $(DIR)src/init/init.o -fsanitize=cfi -flto -fvisibility=hidden 8 | 9 | static-c: 10 | clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 11 | ar crs $(DIR)src/init/libinit.a $(DIR)/src/init/init.o 12 | 13 | rust: static-c 14 | RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" cargo build --release 15 | 16 | sim: 17 | $(DIR)/target/release/cla 18 | 19 | obj: rust 20 | objdump -S $(DIR)/target/release/cla > $(DIR)/main.obj 21 | 22 | clean: 23 | rm -rf target 24 | rm -f $(DIR)/src/init/libinit.a 25 | rm -f $(DIR)/src/init/libinit.so 26 | rm -f $(DIR)/src/init/init.o 27 | rm -f $(DIR)/main.obj 28 | 
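The bounds-check bypass in `user_given_array` works because index 3 of a three-element `int64_t` array addresses the next field of the enclosing struct, which in the examples is `Data.cb`. The sketch below reproduces that layout with Python's `ctypes`, assuming a simplified two-field stand-in for `Data`: `cb` is modeled as a plain 64-bit integer, and the values `0x1111` and `0x4242` are hypothetical placeholders for a benign and an attacker-chosen address.

```python
import ctypes

MAX_LENGTH = 3


class Data(ctypes.Structure):
    # Simplified stand-in for the examples' Data struct: only the first two
    # fields, with cb stored as a 64-bit integer instead of a function pointer.
    _fields_ = [
        ("vals", ctypes.c_int64 * MAX_LENGTH),
        ("cb", ctypes.c_int64),
    ]


data = Data((ctypes.c_int64 * MAX_LENGTH)(1, 2, 3), 0x1111)

# View the struct the way the C helper does: as a bare int64_t pointer.
a = ctypes.cast(ctypes.pointer(data), ctypes.POINTER(ctypes.c_int64))

# Index 3 is one past the end of vals. The write stays inside `data`,
# but it lands on cb rather than on any element of vals.
array_index = 3
a[array_index] = 0x4242

print(Data.cb.offset)
print(hex(data.cb))
```

The write never leaves the struct, so the sketch is safe to run. It only shows that the language-level array bound is not enforced once the address has been handed to code that treats it as a raw pointer.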
-------------------------------------------------------------------------------- /rust-cla-examples/build.rs: -------------------------------------------------------------------------------- 1 | fn main() { 2 | println!("cargo:rustc-link-search=./src/init/"); 3 | //println!("cargo:rustc-link-lib=dylib=init"); 4 | println!("cargo:rustc-link-lib=static=init"); 5 | } 6 | -------------------------------------------------------------------------------- /rust-cla-examples/src/init/init.c: -------------------------------------------------------------------------------- 1 | #include <malloc.h> 2 | #include <stdint.h> 3 | #include <stdio.h> 4 | #include <stdlib.h> 5 | 6 | // Simple function that acts as user input 7 | extern int64_t get_attack(); 8 | 9 | // Simple initialization function 10 | void init() { 11 | // Turns off heap checks for double frees 12 | // Set to lowest level which just prints out the error and continues 13 | // Not strictly necessary, but helps with presentation 14 | // I.e., prints when we achieved a double free 15 | mallopt(M_CHECK_ACTION, 1); 16 | } 17 | 18 | // Given the array to modify, this function sets a field in the array 19 | // Can cause an OOB vulnerability 20 | void user_given_array(int64_t array_ptr_addr) { 21 | // These values could be set by a corruptible source, e.g., user input 22 | // Thus, the index points to Data.cb and the value is the address of attack() 23 | // This is an OOB write, as it indexes past the allocated array of size 3 24 | int64_t array_index = 3; 25 | int64_t array_value = get_attack(); 26 | 27 | int64_t* a = (void *)array_ptr_addr; 28 | printf("addr of a[array_index] in user_given_array: %ld\n", (int64_t)&(a[array_index])); 29 | 30 | a[array_index] = array_value; 31 | printf("Done with user_given_array.\n"); 32 | } 33 | 34 | // This function prints the address of a given array 35 | // Can cause UaF and DF vulnerabilities 36 | void print_array_addr(int64_t array_ptr_addr) { 37 | int64_t* a = (void *)array_ptr_addr; 38 | printf("addr of a in print_array_addr: %ld\n", 
(int64_t)a); 39 | 40 | // This is an unnecessary free call, as Rust allocated the array 41 | // (and subsequently Rust will free this array later) 42 | free(a); 43 | 44 | // C now thinks it can use a for something else 45 | // (e.g., set it to a user defined address) 46 | // Rust may not realize this functionality occurs 47 | // These values could be set by a corruptible source, e.g., user input 48 | int64_t array_value = get_attack(); 49 | *a = array_value; 50 | 51 | printf("Done with print_array_addr.\n"); 52 | } 53 | 54 | 55 | // This function allocates its own array and populates it based on user input 56 | void user_set_array() { 57 | 58 | // Initialize array 59 | int64_t a[1] = { 0 }; 60 | 61 | // These values could be set by a corruptible source, e.g., user input 62 | // Thus, the index points to Data.cb and value is the address of attack() 63 | // This is an OOB write, as it indexes past the allocated array of size 1 64 | int64_t array_index = 28; 65 | int64_t array_value = get_attack(); 66 | 67 | printf("addr of &a: %ld\n", (int64_t)&a); 68 | printf("array_value %ld\n", array_value); 69 | printf("addr of a[array_index] in user_set_array: %ld\n", (int64_t)&(a[array_index])); 70 | 71 | a[array_index] = array_value; 72 | printf("Done with user_set_array.\n"); 73 | } 74 | 75 | // Rust calls this function to get the right address of a call back function 76 | // If Rust doesn't properly sanitize data from this function, 77 | // it could return corrupted data 78 | int64_t get_cb_from_c() { 79 | // These values could be set by a corruptible source, e.g., user input 80 | int64_t call_back_addr = get_attack(); 81 | 82 | return call_back_addr; 83 | } 84 | 85 | // Given the Vec to modify, this function sets a field in the Vec's fat pointer 86 | // Can cause an OOB vulnerability 87 | void user_given_vec(int64_t vec_ptr_addr) { 88 | // These values could be set by a corruptible source, e.g., user input 89 | // Thus, the index points into the Vec fat pointer and the value is too large 90 | // This 
is an OOB as it indexes past the allocated array of size 3 91 | int64_t array_index = 2; 92 | int64_t array_value = 10000000; 93 | 94 | int64_t* a = (void *)vec_ptr_addr; 95 | 96 | printf("addr of a[array_index] in user_given_vec: %ld\n", (int64_t)&(a[array_index])); 97 | 98 | a[array_index] = array_value; 99 | printf("Done with user_given_vec.\n"); 100 | } 101 | -------------------------------------------------------------------------------- /rust-cla-examples/src/init/init.h: -------------------------------------------------------------------------------- 1 | #include <stdint.h> 2 | 3 | void init(); 4 | void user_given_array(int64_t array_ptr_addr); 5 | void print_array_addr(int64_t array_ptr_addr); 6 | void user_set_array(); 7 | void user_given_vec(int64_t vec_ptr_addr); 8 | int64_t get_cb_from_c(); 9 | -------------------------------------------------------------------------------- /rust-cla-examples/src/main.rs: -------------------------------------------------------------------------------- 1 | /* External unsafe functions */ 2 | 3 | extern "C" { fn init(); } 4 | extern "C" { fn user_set_array(); } 5 | extern "C" { fn user_given_array(array_ptr_addr: i64); } 6 | extern "C" { fn user_given_vec(vec_ptr_addr: i64); } 7 | extern "C" { fn print_array_addr(array_ptr_addr: i64); } 8 | extern "C" { fn get_cb_from_c() -> i64; } 9 | 10 | // A simple function that acts as user input 11 | #[no_mangle] 12 | extern "C" fn get_attack() -> i64 { 13 | return attack as i64; 14 | } 15 | 16 | pub const MAX_LENGTH: usize = 3; 17 | 18 | // A simple struct that we frequently manipulate 19 | pub struct Data { 20 | vals: [i64;MAX_LENGTH], 21 | cb: fn(&mut i64), 22 | vecs: std::vec::Vec<i64>, 23 | cb2: fn(&mut i64) 24 | } 25 | 26 | // A simple benign function that adds two to a value 27 | #[no_mangle] 28 | #[inline(never)] 29 | pub fn doubler(x: &mut i64) { 30 | println!("Not attacked! 
Adding two to input..."); 31 | *x += 2; 32 | } 33 | 34 | // A simple benign function that increments a value 35 | #[no_mangle] 36 | #[inline(never)] 37 | pub fn incrementer(x: &mut i64) { 38 | println!("Not attacked! Adding one to input..."); 39 | *x += 1; 40 | } 41 | 42 | // Attack aims to call this 43 | // Could be replaced with actual gadgets that together execute a weird machine 44 | #[no_mangle] 45 | #[inline(never)] 46 | pub fn attack() { 47 | println!("We were attacked!"); 48 | } 49 | 50 | // Main function 51 | #[no_mangle] 52 | #[inline(never)] 53 | fn analyze_data(cb_fptr: fn(&mut i64)) { 54 | 55 | // Initialize program 56 | unsafe{init()}; 57 | 58 | // Set up some function pointers 59 | let fp1 = incrementer; 60 | let fp2 = doubler; 61 | 62 | // Initialize some data 63 | let mut data = Data { 64 | vals: [1,2,3], 65 | cb: fp1, 66 | vecs: vec![4,5], 67 | cb2: fp2 68 | }; 69 | println!("Start data: vals[0]={}, cb={}, vecs[0]={}, cb2={}", 70 | data.vals[0], 71 | data.cb as *const fn(&mut i64) as i64, 72 | data.vecs[0], 73 | data.cb2 as *const fn(&mut i64) as i64); 74 | 75 | // Get and print the addresses of the Data Struct 76 | let data_vals_addr = &data.vals as *const i64 as i64; 77 | let data_cb_addr = &data.cb as *const fn(&mut i64) as i64; 78 | let data_vecs_addr = &data.vecs as *const std::vec::Vec as i64; 79 | let data_cb2_addr = &data.cb2 as *const fn(&mut i64) as i64; 80 | 81 | println!("data_vals_addr: {}", data_vals_addr); 82 | println!("data_cb_addr: {}", data_cb_addr); 83 | println!("data_vecs_addr: {}", data_vecs_addr); 84 | println!("data_cb2_addr: {}", data_cb2_addr); 85 | 86 | // Get and print the address of the function argument to this function 87 | let cb_fptr_addr = &cb_fptr as *const fn(&mut i64) as i64; 88 | println!("cb_fptr_addr: {}", cb_fptr_addr); 89 | 90 | // Get and print the address of heap data that a Box stores 91 | let doubler_fp: Box = Box::new(doubler); 92 | let doubler_fp_addr = &(*doubler_fp) as *const fn(&mut i64) as i64; 
    println!("doubler_fp_addr: {}", doubler_fp_addr);

    // Get and print the address of heap data that a second Box stores
    let doubler2_fp: Box<fn(&mut i64)> = Box::new(doubler);
    let doubler2_fp_addr = &(*doubler2_fp) as *const fn(&mut i64) as i64;
    println!("doubler2_fp_addr: {}", doubler2_fp_addr);

    // Get a callback function pointer from C
    // Uses unsafe to turn the raw address returned by C into a function pointer;
    // this models an intended cross-language interaction
    let incrementer_fp = unsafe {
        let c_addr: i64 = get_cb_from_c();
        let ptr = c_addr as *const fn(&mut i64);
        let fp: fn(&mut i64) = std::mem::transmute::<*const fn(&mut i64), fn(&mut i64)>(ptr);
        fp
    };

    // Section 4 Attacks
    /* Rust Bounds Check Bypass Attack */
    println!("Launching Rust Bounds Check Bypass Attack...");
    unsafe{ user_given_array(data_vals_addr) }

    println!("Calling data.cb...");
    (data.cb)(&mut data.vals[0]);
    println!("Updated data: vals[0]={}", data.vals[0]);

    /* Rust Lifetime Bypass Attack */
    println!("Launching Rust Lifetimes Bypass Attack...");
    unsafe{ print_array_addr(doubler_fp_addr) }

    println!("Calling doubler_fp...");
    doubler_fp(&mut data.vals[0]);
    println!("Updated data: vals[0]={}", data.vals[0]);

    /* C/C++ Hardening Bypass Attack */
    println!("Launching C/C++ Hardening Bypass Attack...");
    unsafe{ user_set_array() }

    println!("Calling cb_fptr...");
    cb_fptr(&mut data.vals[0]);
    println!("Updated data: vals[0]={}", data.vals[0]);

    // Section 5 Attacks
    /* Corrupting Rust Dynamic Bounds */
    println!("Launching Dynamic Rust Bounds Check Bypass Attack...");
    unsafe{ user_given_vec(data_vecs_addr) }

    // Now we can access past the length of data.vecs in *Safe Rust*
    // Length of vec is only 2 (and capacity is 2)
    // E.g., data.vecs[22] actually points to doubler2_fp on the heap
    // So setting data.vecs[22] actually corrupts the value a pointer holds
    // Moreover, vec_index and vec_val could come from a corruptible source, e.g., user input

    let data_vecs0_addr = &data.vecs[0] as *const i64 as i64;
    let vec_index = ((doubler2_fp_addr - data_vecs0_addr)/8) as usize;
    let vec_val = get_attack();

    println!("addr of data.vecs[0]: {}", &data.vecs[0] as *const i64 as i64);
    println!("addr of data.vecs[vec_index]: {}", &data.vecs[vec_index] as *const i64 as i64);

    /* This OOB is done in Safe Rust! */
    data.vecs[vec_index] = vec_val;

    println!("Calling doubler2_fp...");
    doubler2_fp(&mut data.vals[0]);
    println!("Updated data: vals[0]={}", data.vals[0]);

    /* Corrupting Intended Interactions */
    println!("Launching Intended Interactions Attack...");
    println!("Calling incrementer_fp...");
    incrementer_fp(&mut data.vals[0]);
    println!("Updated data: vals[0]={}", data.vals[0]);

    /* Corrupting with Serialization Errors */
    // TODO

    /* Corrupting vTable dynamic dispatch */
    // TODO

    /* Corrupting with Double Frees */
    // Rust will now free doubler_fp as it goes out of scope here,
    // but it was already freed in print_array_addr
    // This will cause an abort, but could be used to execute a weird machine
}

fn main() {
    // Set up a function pointer
    let fp0 = doubler;

    // Call the main function
    analyze_data(fp0);

    println!("Finished main.");
}

--------------------------------------------------------------------------------
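The comment on get_cb_from_c in init.c notes that Rust must sanitize the address it receives, while main.rs transmutes that address directly. The following is a minimal sketch of one possible check, not code from this repository: it assumes Rust keeps an allowlist of callbacks it is willing to accept from C. The names `ALLOWED`, `addr_of`, `checked_callback`, and `add_two` are hypothetical, and the stand-in callbacks share one signature to keep the sketch short.

```rust
// Hypothetical sketch (not part of the repository): accept an address from C
// only if it matches a callback that Rust itself has allowlisted.

// Stand-ins for the callbacks in main.rs, with a unified signature.
fn incrementer(x: &mut i64) { *x += 1; }
fn add_two(x: &mut i64) { *x += 2; }
fn attack(x: &mut i64) { *x = -1; }

// Allowlist of callbacks that C is permitted to hand back (an assumption).
const ALLOWED: [fn(&mut i64); 2] = [incrementer, add_two];

// Numeric address of a function pointer, in the i64 form used across the FFI.
fn addr_of(f: fn(&mut i64)) -> i64 {
    f as usize as i64
}

// Returns the matching allowlisted pointer, or None for any other address.
// The untrusted integer is only compared, never transmuted.
fn checked_callback(c_addr: i64) -> Option<fn(&mut i64)> {
    ALLOWED.iter().copied().find(|f| addr_of(*f) == c_addr)
}
```

With a check like this, an address equal to `addr_of(attack)` would be rejected because `attack` is not in the allowlist. It only covers the intended-interaction path; it does nothing about the attacks above that overwrite function pointers in place through memory corruption.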