├── .gitignore ├── README.md ├── code ├── README.md ├── bbskel │ ├── Makefile │ ├── README.md │ ├── src │ │ ├── BBSkel.cpp │ │ └── BBSkel.h │ └── test │ │ └── foo.c ├── cgsskel │ ├── Makefile │ ├── README.md │ ├── src │ │ ├── CGSSkel.cpp │ │ └── CGSSkel.h │ └── test │ │ └── foo.c ├── comminute │ ├── Makefile │ ├── README.md │ ├── conf │ │ ├── constantarg.cfg │ │ ├── fdsink.cfg │ │ ├── fdsource.cfg │ │ ├── fexternalizer.txt │ │ ├── pdfunctions.cfg │ │ ├── sensitivesink.cfg │ │ └── sensitivesource.cfg │ ├── src │ │ ├── Analysis │ │ │ ├── NaiveConstantArgCheck.cpp │ │ │ ├── NaiveConstantArgCheck.h │ │ │ ├── NaiveFileDescLeak.cpp │ │ │ ├── NaiveFileDescLeak.h │ │ │ ├── NaiveSensitiveDataLeak.cpp │ │ │ ├── NaiveSensitiveDataLeak.h │ │ │ ├── NaiveSensitiveDataLeak_ahead.cpp │ │ │ ├── PotentiallyDangerousScan.cpp │ │ │ ├── PotentiallyDangerousScan.h │ │ │ ├── PotentiallyDangerousScanFunctionPass.cpp │ │ │ ├── PotentiallyDangerousScanFunctionPass.h │ │ │ ├── PotentiallyDangerousScanUserMethod.cpp │ │ │ ├── PotentiallyDangerousScanUserMethod.h │ │ │ ├── StoreCollector.cpp │ │ │ ├── StoreCollector.h │ │ │ ├── TargetCallSitesPass.cpp │ │ │ └── TargetCallSitesPass.h │ │ ├── Comminute.cpp │ │ └── Transform │ │ │ ├── ChoosePhiValue.cpp │ │ │ ├── ChoosePhiValue.h │ │ │ ├── FunctionExternalizer.cpp │ │ │ └── FunctionExternalizer.h │ ├── tests │ │ ├── FE001.c │ │ ├── NCA001.c │ │ ├── NCA002.c │ │ ├── NFDL001.c │ │ ├── NFDL002.c │ │ ├── NFDL003.c │ │ ├── NFDL004.c │ │ ├── NFDL005.c │ │ ├── NFDL006.c │ │ ├── NFDL007.c │ │ ├── NSDL001.c │ │ ├── NSDL002.c │ │ ├── NSDL003.c │ │ └── PD001.c │ └── thirdparty │ │ └── jsoncpp-1.8.0.tar.gz ├── fpskel │ ├── Makefile │ ├── README.md │ └── src │ │ ├── FPSkel.cpp │ │ └── FPSkel.h ├── intflip │ ├── Makefile │ ├── README.md │ ├── replace.cfg │ ├── replacepp.cfg │ ├── src │ │ ├── BaseRandomizer.h │ │ ├── BitFlipRandomizer.cpp │ │ ├── BitFlipRandomizer.h │ │ ├── FlipConfig.cpp │ │ ├── FlipConfig.h │ │ ├── InjectRandomizers.cpp │ │ ├── 
IntReplacerIterate.cpp │ │ ├── IntReplacerVisitor.cpp │ │ ├── LiftConstantIntPass.cpp │ │ ├── ReplaceRandomizer.cpp │ │ ├── ReplaceRandomizer.h │ │ └── TypeValueSupport.h │ ├── test │ │ ├── foo.c │ │ └── foo.cpp │ └── thirdparty │ │ └── jsoncpp-1.8.0.tar.gz ├── lpskel │ ├── Makefile │ ├── README.md │ ├── src │ │ ├── LPSkel.cpp │ │ └── LPSkel.h │ └── test │ │ └── foo.c ├── mpskel │ ├── Makefile │ ├── README.md │ └── src │ │ ├── MPSkel.cpp │ │ └── MPSkel.h ├── newpm │ ├── Makefile │ ├── README.md │ ├── build │ │ ├── lib │ │ │ └── TestFunctionPass.dylib │ │ └── tests │ │ │ ├── FE001 │ │ │ ├── FE001.bc │ │ │ └── FE001.ll │ ├── src │ │ ├── Analysis │ │ │ ├── TestFunctionPass.cpp │ │ │ └── TestFunctionPass.h │ │ └── manager.cpp │ └── tests │ │ └── FE001.c ├── npassert │ ├── Makefile │ ├── README.md │ ├── conf │ │ └── targ.cfg │ ├── src │ │ ├── NullPtrAssertPass.cpp │ │ └── NullPtrAssertPass.h │ └── test │ │ ├── ex01.c │ │ └── ex02.c ├── rpskel │ ├── Makefile │ ├── README.md │ ├── src │ │ ├── RPSkel.cpp │ │ └── RPSkel.h │ └── test │ │ └── foo.c └── visitorskel │ ├── Makefile │ ├── README.md │ └── src │ └── VisitorSkelModulePass.cpp ├── possible_projects.txt ├── projects.md └── slides ├── ExtraNonsense.pdf ├── MainDeck.pdf └── SecurityRDprojectsLLVM.pdf /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | *.o -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | This repo contains some materials related to a talk 3 | on security r&d projects that in some way use LLVM 4 | for doing their work. 5 | 6 | 7 | - projects.md is a list of such projects 8 | - code is a set of very basic, example code to help those interested in getting up to speed with some basics of LLVM. Don't hate. 
9 | - slides is the slide deck 10 | 11 | #### Again on the code 12 | I must repeat myself: the provided code is only meant to be basic helpers for learning about LLVM 13 | and is not meant to be some new awesome tool (sadly, no). There is a great deal of research in 14 | dynamic and program analysis that is being done and the goal of this code is to make it so you 15 | can start to more easily read some of the code from those projects ... and then make sense of it. 16 | This is to avoid getting lost in code versus meaning. 17 | 18 | 19 | -------------------------------------------------------------------------------- /code/README.md: -------------------------------------------------------------------------------- 1 | Very basic example code. 2 | 3 | - bbskel: BasicBlock pass skeleton 4 | - cgsskel: CallGraphSCC pass skeleton 5 | - comminute: tool illustrating pass mgr, pass deps, and more 6 | - fpskel: FunctionPass skeleton 7 | - intflip: integer argument randomizer|bit-flipper 8 | - mpskel: ModulePass skeleton 9 | - newpm: Making use of the new pass manager API (req 4.0) 10 | - npassert: NULL pointer check insertion 11 | - rpskel: RegionPass skeleton 12 | - visitorskel: Using an InstVisitor class 13 | 14 | 15 | -------------------------------------------------------------------------------- /code/bbskel/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.9 2 | LLVM_HOME=/usr/bin 3 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 4 | 5 | ifndef VERBOSE 6 | QUIET:=@ 7 | endif 8 | 9 | SRC_DIR?=$(PWD)/src 10 | 11 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 12 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 13 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 14 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 15 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 16 | 17 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 18 | LDFLAGS+=-shared -Wl,-O1 19 | 20 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 21 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 
22 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 23 | 24 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 25 | CPPFLAGS+=-I$(SRC_DIR) 26 | 27 | 28 | PASS=BBSkel.so 29 | PASS_OBJECTS=BBSkel.o 30 | 31 | default: prep $(PASS) 32 | 33 | prep: 34 | $(QUIET)mkdir -p built 35 | 36 | %.o : $(SRC_DIR)/%.cpp 37 | @echo Compiling $*.cpp 38 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 39 | 40 | $(PASS) : $(PASS_OBJECTS) 41 | @echo Linking $@ 42 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 43 | 44 | clean: 45 | $(QUIET)rm -rf built test/*.bc 46 | 47 | 48 | tests: 49 | $(CC) -emit-llvm -o test/foo.bc -c test/foo.c 50 | 51 | runtests: 52 | $(OPT) -load built/BBSkel.so -mem2reg -bbskel < test/foo.bc 53 | -------------------------------------------------------------------------------- /code/bbskel/README.md: -------------------------------------------------------------------------------- 1 | 2 | # BBSkel 3 | 4 | This is a basic block pass skeleton. Note, there are also 5 | loop passes, region passes, etc. 6 | 7 | 8 | # Build & Run 9 | 10 | First check the Makefile to set path to llvm-config and version. 11 | 3.8, 3.9 should be fine, so should 4.0 12 | 13 | ``` 14 | $ make 15 | $ opt-X.Y -load built/BBSkel.so -bbskel < file.bc 16 | ... 
17 | $ 18 | ``` 19 | 20 | 21 | -------------------------------------------------------------------------------- /code/bbskel/src/BBSkel.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/Function.h" 3 | #include "llvm/IR/Instructions.h" 4 | #include "llvm/Support/raw_ostream.h" 5 | 6 | using namespace llvm; 7 | 8 | #include "BBSkel.h" 9 | 10 | void 11 | BBSkel::getAnalysisUsage(AnalysisUsage &AU) const 12 | { 13 | // No changes to CFG, so tell the pass manager 14 | AU.setPreservesCFG(); 15 | } 16 | 17 | bool 18 | BBSkel::doFinalization(Function &F) 19 | { 20 | errs() << "#BBs: " << bbcount << "\n"; 21 | errs() << "#Is: " << icount << "\n"; 22 | return false; 23 | } 24 | 25 | bool 26 | BBSkel::runOnBasicBlock(BasicBlock &B) 27 | { 28 | bbcount++; 29 | errs() << " Basic Block found:\n"; 30 | for (auto &I : B) { // Iterate through instructions in the block 31 | ++icount; 32 | // Note in output if instruction is a call/invoke 33 | if (isa<CallInst>(I) || isa<InvokeInst>(I)) { 34 | errs() << " C "; 35 | } else { 36 | errs() << " "; 37 | } 38 | I.dump(); 39 | errs() << " used by:\n"; 40 | // Go through and dump the users of each instruction. 41 | for (auto ui = I.user_begin(); ui != I.user_end(); ++ui) { 42 | errs() << " U: "; 43 | ui->dump(); 44 | } 45 | errs() << " ~~~~ \n"; 46 | } 47 | errs() << " --- end of basic block ---\n"; 48 | 49 | // return true if the basic block was modified. 50 | return false; 51 | } 52 | 53 | 54 | 55 | /* 56 | * Register this pass to be made usable. 57 | * Needs the static ID initialized and the pass declaration given. 
58 | */ 59 | char BBSkel::ID = 0; 60 | static RegisterPass<BBSkel> XX("bbskel", "BasicBlock Pass Skeleton"); 61 | 62 | -------------------------------------------------------------------------------- /code/bbskel/src/BBSkel.h: -------------------------------------------------------------------------------- 1 | #ifndef __BBSKEL_H 2 | #define __BBSKEL_H 3 | 4 | struct BBSkel : public BasicBlockPass { 5 | /* 6 | * Every one of your passes needs to declare this and define it. 7 | * Its address is used by the pass system, so the value does not matter. 8 | */ 9 | static char ID; 10 | 11 | unsigned bbcount; 12 | unsigned icount; 13 | 14 | BBSkel() : BasicBlockPass(ID) { 15 | bbcount = 0; 16 | icount = 0; 17 | } 18 | 19 | // Called on each BasicBlock in given compilation unit 20 | virtual bool runOnBasicBlock(BasicBlock &); 21 | 22 | /* 23 | * Used to help order passes by pass manager. 24 | * Declare any passes you need run prior here.. as well as 25 | * any information such as preserving CFG or similar. 26 | */ 27 | virtual void getAnalysisUsage(AnalysisUsage &) const; 28 | 29 | /* 30 | * Called once all basic blocks of a function have been processed. 
31 | */ 32 | virtual bool doFinalization(Function &); 33 | }; 34 | 35 | #endif 36 | -------------------------------------------------------------------------------- /code/bbskel/test/foo.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | void 9 | leaks_passwd() 10 | { 11 | char *p; 12 | struct addrinfo hints, *result; 13 | 14 | p = getpass("enter passwd: "); 15 | memset(&hints, 0, sizeof(struct addrinfo)); 16 | hints.ai_family = AF_UNSPEC; 17 | hints.ai_socktype = SOCK_DGRAM; 18 | hints.ai_flags = 0; 19 | hints.ai_protocol = 0; 20 | (void)getaddrinfo(p, "http", &hints, &result); 21 | } 22 | 23 | int 24 | main(int argc, char **argv) 25 | { 26 | leaks_passwd(); 27 | return 0; 28 | } 29 | -------------------------------------------------------------------------------- /code/cgsskel/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.9 2 | LLVM_HOME=/usr/bin 3 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 4 | 5 | ifndef VERBOSE 6 | QUIET:=@ 7 | endif 8 | 9 | SRC_DIR?=$(PWD)/src 10 | 11 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 12 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 13 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 14 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 15 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 16 | 17 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 18 | LDFLAGS+=-shared -Wl,-O1 19 | 20 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 21 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 22 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 23 | 24 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 25 | CPPFLAGS+=-I$(SRC_DIR) 26 | 27 | PASS=CGSSkel.so 28 | PASS_OBJECTS=CGSSkel.o 29 | 30 | default: prep $(PASS) 31 | 32 | prep: 33 | $(QUIET)mkdir -p built 34 | 35 | %.o : $(SRC_DIR)/%.cpp 36 | @echo Compiling $*.cpp 37 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 38 | 
39 | $(PASS) : $(PASS_OBJECTS) 40 | @echo Linking $@ 41 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 42 | 43 | clean: 44 | $(QUIET)rm -rf built test/*.bc 45 | 46 | 47 | tests: 48 | $(CC) -emit-llvm -o test/foo.bc -c test/foo.c 49 | 50 | runtests: 51 | $(OPT) -load built/CGSSkel.so -cgsskel < test/foo.bc 52 | -------------------------------------------------------------------------------- /code/cgsskel/README.md: -------------------------------------------------------------------------------- 1 | 2 | # CallGraphSCCPass Skeleton 3 | 4 | Bottom-up style skeleton to use and augment for call graph (CG) building. 5 | 6 | # Build & Run 7 | 8 | First check the Makefile to set the path to llvm-config and the version. 9 | 3.8 and 3.9 should be fine, and so should 4.0. 10 | 11 | ``` 12 | $ make 13 | $ opt-X.Y -load built/CGSSkel.so -cgsskel < file.bc 14 | ... 15 | $ 16 | ``` 17 | 18 | 19 | -------------------------------------------------------------------------------- /code/cgsskel/src/CGSSkel.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/Instructions.h" 3 | #include "llvm/Analysis/CallGraphSCCPass.h" 4 | #include "llvm/Analysis/CallGraph.h" 5 | #include "llvm/Support/raw_ostream.h" 6 | 7 | using namespace llvm; 8 | 9 | #include "CGSSkel.h" 10 | 11 | void 12 | CGSSkel::getAnalysisUsage(AnalysisUsage &AU) const 13 | { 14 | // No changes to CFG, so tell the pass manager 15 | AU.setPreservesCFG(); 16 | } 17 | 18 | bool 19 | CGSSkel::doFinalization(CallGraph &G) 20 | { 21 | return false; 22 | } 23 | 24 | bool 25 | CGSSkel::doInitialization(CallGraph &G) 26 | { 27 | return false; 28 | } 29 | 30 | bool 31 | CGSSkel::runOnSCC(CallGraphSCC &GSCC) 32 | { 33 | errs() << " Strongly connected component found:\n"; 34 | 35 | /* 36 | * Singular SCCs can be used to detect recursion. 
See: 37 | * http://llvm.org/docs/doxygen/html/FunctionAttrs_8.cpp_source.html 38 | */ 39 | if (GSCC.isSingular()) { 40 | errs() << " SCC is singular\n"; 41 | } 42 | for (auto &G : GSCC) { 43 | errs() << " Dump:\n"; 44 | G->dump(); 45 | } 46 | errs() << " --- end of SCC ---\n"; 47 | 48 | // return true if Module has been changed. 49 | return false; 50 | } 51 | 52 | /* 53 | * Register this pass to be made usable. 54 | * Needs the static ID initialized and the pass declaration given. 55 | */ 56 | char CGSSkel::ID = 0; 57 | static RegisterPass<CGSSkel> XX("cgsskel", "CallGraphSCC Pass Skeleton"); 58 | 59 | -------------------------------------------------------------------------------- /code/cgsskel/src/CGSSkel.h: -------------------------------------------------------------------------------- 1 | #ifndef __CGSSKEL_H 2 | #define __CGSSKEL_H 3 | 4 | struct CGSSkel : public CallGraphSCCPass { 5 | /* 6 | * Every one of your passes needs to declare this and define it. 7 | * Its address is used by the pass system, so the value does not matter. 8 | */ 9 | static char ID; 10 | 11 | CGSSkel() : CallGraphSCCPass(ID) { 12 | } 13 | 14 | // Return true if Module was modified, otherwise false. 15 | virtual bool runOnSCC(CallGraphSCC &); 16 | 17 | /* 18 | * Used to help order passes by pass manager. 19 | * Declare any passes you need run prior here.. as well as 20 | * any information such as preserving CFG or similar. 
21 | */ 22 | virtual void getAnalysisUsage(AnalysisUsage &) const; 23 | 24 | virtual bool doInitialization(CallGraph &CG); 25 | virtual bool doFinalization(CallGraph &); 26 | }; 27 | 28 | #endif 29 | -------------------------------------------------------------------------------- /code/cgsskel/test/foo.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | void 9 | leaks_passwd() 10 | { 11 | char *p; 12 | struct addrinfo hints, *result; 13 | 14 | p = getpass("enter passwd: "); 15 | memset(&hints, 0, sizeof(struct addrinfo)); 16 | hints.ai_family = AF_UNSPEC; 17 | hints.ai_socktype = SOCK_DGRAM; 18 | hints.ai_flags = 0; 19 | hints.ai_protocol = 0; 20 | (void)getaddrinfo(p, "http", &hints, &result); 21 | } 22 | 23 | int 24 | main(int argc, char **argv) 25 | { 26 | leaks_passwd(); 27 | return 0; 28 | } 29 | -------------------------------------------------------------------------------- /code/comminute/Makefile: -------------------------------------------------------------------------------- 1 | LLBIN=/usr/lib/llvm-3.9/bin 2 | LLVM_CONFIG=$(LLBIN)/llvm-config 3 | #QUIET:=@ 4 | QUIET:= 5 | 6 | BUILDJSONCPP:=# 7 | ifdef WITHJSONCPP 8 | BUILDJSONCPP:= 9 | endif 10 | 11 | SRC_DIR?=$(PWD)/src 12 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) -Lthirdparty/jsoncpp-1.8.0/build/src/lib_json -ljsoncpp 13 | 14 | COMMON_FLAGS=-Wall -Wextra -g 15 | 16 | 17 | CXXFLAGS+=$(COMMON_FLAGS) $(shell $(LLVM_CONFIG) --cxxflags) 18 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) -std=c++11 -I$(SRC_DIR) -Ithirdparty/jsoncpp-1.8.0/include 19 | 20 | LOADABLE_MODULE_OPTIONS=-shared -Wl,-O1 21 | 22 | LDIS=$(LLBIN)/llvm-dis 23 | CPP=$(LLBIN)/clang++ 24 | CC=$(LLBIN)/clang 25 | 26 | BOD=build/obj 27 | PASSMGR=Comminute 28 | OPM=build/bin/$(PASSMGR) 29 | PASS=ComminuteShared.so 30 | PASS_OBJECTS=Analysis/TargetCallSitesPass.o \ 31 | Analysis/StoreCollector.o \ 32 | 
Transform/FunctionExternalizer.o \ 33 | Transform/ChoosePhiValue.o \ 34 | Analysis/NaiveSensitiveDataLeak.o \ 35 | Analysis/NaiveFileDescLeak.o \ 36 | Analysis/PotentiallyDangerousScan.o \ 37 | Analysis/PotentiallyDangerousScanUserMethod.o \ 38 | Analysis/PotentiallyDangerousScanFunctionPass.o \ 39 | Analysis/NaiveConstantArgCheck.o 40 | 41 | # XXX 42 | # This is awful... Im just like "PUT IT ALL IN" 43 | LIBS=$(shell $(LLVM_CONFIG) --libs) -lclang 44 | LIBS+=-lpthread -ldl -lncurses -lz 45 | 46 | TDIR=build/tests 47 | 48 | default: prep $(PASS) passmgr 49 | 50 | prep: 51 | @echo "Prep phase" 52 | $(QUIET)mkdir -p build/obj 53 | $(QUIET)mkdir -p build/obj/Analysis 54 | $(QUIET)mkdir -p build/obj/Transform 55 | $(QUIET)mkdir -p build/bin 56 | $(QUIET)mkdir -p build/lib 57 | 58 | define builditdood 59 | $(QUIET)$(CPP) -o $(BOD)/$(1)/$(@F) -c $(CPPFLAGS) $(CXXFLAGS) $< 60 | endef 61 | 62 | Transform/%.o: $(SRC_DIR)/Transform/%.cpp 63 | @echo "Compiling $*.cpp" 64 | $(call builditdood,Transform) 65 | 66 | Analysis/%.o: $(SRC_DIR)/Analysis/%.cpp 67 | @echo "Compiling $*.cpp" 68 | $(call builditdood,Analysis) 69 | 70 | %.o : $(SRC_DIR)/%.cpp 71 | @echo "Compiling $*.cpp" 72 | $(call builditdood,.) 
73 | 74 | passmgr: 75 | @echo "Building passmanager clean up ldflags XXX" 76 | $(QUIET)$(CPP) -o $(BOD)/Comminute.o -c $(CPPFLAGS) $(CXXFLAGS) src/Comminute.cpp 77 | $(QUIET)$(CPP) -o $(OPM) $(CXXFLAGS) build/obj/Comminute.o ${addprefix $(BOD)/,$(PASS_OBJECTS)} $(LDFLAGS) $(LIBS) 78 | 79 | 80 | $(PASS) : $(PASS_OBJECTS) 81 | @echo "Linking $@" 82 | $(QUIET)$(CPP) -o build/lib/$@ $(LOADABLE_MODULE_OPTIONS) $(CXXFLAGS) $(LDFLAGS) ${addprefix $(BOD)/,$^} 83 | 84 | test: testprep testnca testnsdl testpd testnfdl 85 | 86 | testprep: 87 | $(QUIET)mkdir -p $(TDIR) 88 | 89 | testnca: 90 | $(QUIET)$(CC) -o $(TDIR)/NCA001 tests/NCA001.c 91 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NCA001.bc -c tests/NCA001.c 92 | $(QUIET)$(LDIS) $(TDIR)/NCA001.bc 93 | $(QUIET)$(CC) -o $(TDIR)/NCA002 tests/NCA002.c 94 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NCA002.bc -c tests/NCA002.c 95 | $(QUIET)$(LDIS) $(TDIR)/NCA002.bc 96 | 97 | testnsdl: 98 | $(QUIET)$(CC) -o $(TDIR)/NSDL001 tests/NSDL001.c 99 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NSDL001.bc -c tests/NSDL001.c 100 | $(QUIET)$(LDIS) $(TDIR)/NSDL001.bc 101 | $(QUIET)$(CC) -o $(TDIR)/NSDL002 tests/NSDL002.c 102 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NSDL002.bc -c tests/NSDL002.c 103 | $(QUIET)$(LDIS) $(TDIR)/NSDL002.bc 104 | 105 | testpd: 106 | $(QUIET)$(CC) -o $(TDIR)/PD001 tests/PD001.c 107 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/PD001.bc -c tests/PD001.c 108 | $(QUIET)$(LDIS) $(TDIR)/PD001.bc 109 | 110 | testnfdl: 111 | $(QUIET)$(CC) -o $(TDIR)/NFDL001 tests/NFDL001.c 112 | $(QUIET)$(CC) -emit-llvm -o $(TDIR)/NFDL001.bc -c tests/NFDL001.c 113 | $(QUIET)$(LDIS) $(TDIR)/NFDL001.bc 114 | $(QUIET)$(CC) -o $(TDIR)/NFDL002 tests/NFDL002.c 115 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NFDL002.bc -c tests/NFDL002.c 116 | $(QUIET)$(LDIS) $(TDIR)/NFDL002.bc 117 | $(QUIET)$(CC) -o $(TDIR)/NFDL003 tests/NFDL003.c 118 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NFDL003.bc -c tests/NFDL003.c 119 | $(QUIET)$(LDIS) $(TDIR)/NFDL003.bc 120 | 
$(QUIET)$(CC) -o $(TDIR)/NFDL004 tests/NFDL004.c 121 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NFDL004.bc -c tests/NFDL004.c 122 | $(QUIET)$(LDIS) $(TDIR)/NFDL004.bc 123 | $(QUIET)$(CC) -o $(TDIR)/NFDL005 tests/NFDL005.c 124 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NFDL005.bc -c tests/NFDL005.c 125 | $(QUIET)$(LDIS) $(TDIR)/NFDL005.bc 126 | $(QUIET)$(CC) -o $(TDIR)/NFDL006 tests/NFDL006.c 127 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NFDL006.bc -c tests/NFDL006.c 128 | $(QUIET)$(LDIS) $(TDIR)/NFDL006.bc 129 | $(QUIET)$(CC) -o $(TDIR)/NFDL007 tests/NFDL007.c 130 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/NFDL007.bc -c tests/NFDL007.c 131 | $(QUIET)$(LDIS) $(TDIR)/NFDL007.bc 132 | 133 | help: 134 | @echo "make jsoncpp" 135 | @echo "make " 136 | @echo "...See build/" 137 | @echo "make clean or make cleanall which requires jsoncpp rebuild" 138 | @echo "make test" 139 | @echo "make runtests" 140 | @echo "make runconstantarg" 141 | @echo "make runsensitiveleak" 142 | @echo "make runfdleak" 143 | @echo "make rundangerfn" 144 | 145 | 146 | runtests: runconstantarg runsensitiveleak runfdleak rundangerfn 147 | 148 | runconstantarg: 149 | @echo "***" 150 | @echo "*** Running: Naive constant arg on NCA001 ***" 151 | @echo "***" 152 | $(QUIET)$(OPM) -naive-constant-arg $(TDIR)/NCA001.bc $(TDIR)/NCA001_out.bc 153 | @echo "***" 154 | @echo "*** Running: Naive constant arg on NCA002 ***" 155 | @echo "***" 156 | $(QUIET)$(OPM) -naive-constant-arg $(TDIR)/NCA002.bc $(TDIR)/NCA002_out.bc 157 | 158 | runsensitiveleak: 159 | @echo "***" 160 | @echo "*** Running: Naive sensitive leak on NSDL001 ***" 161 | @echo "***" 162 | $(QUIET)$(OPM) -naive-sensitive-data-leak $(TDIR)/NSDL001.bc $(TDIR)/NSDL001_out.bc 163 | $(QUIET)$(LDIS) $(TDIR)/NSDL001_out.bc 164 | @echo "***" 165 | @echo "*** Running: Naive sensitive leak on NSDL002 ***" 166 | @echo "***" 167 | $(QUIET)$(OPM) -naive-sensitive-data-leak $(TDIR)/NSDL002.bc $(TDIR)/NSDL002_out.bc 168 | $(QUIET)$(LDIS) $(TDIR)/NSDL002_out.bc 169 | 170 | 
rundangerfn: 171 | @echo "***" 172 | @echo "*** Running: Potentially danger function on PD001 ***" 173 | @echo "***" 174 | $(QUIET)$(OPM) -dangerous-function $(TDIR)/PD001.bc $(TDIR)/PD001_out.bc 175 | @echo "***" 176 | @echo "*** Running: Potentially danger function via user method on PD001 ***" 177 | @echo "***" 178 | $(QUIET)$(OPM) -dangerous-function-user-method $(TDIR)/PD001.bc $(TDIR)/PD001_out-um.bc 179 | @echo "***" 180 | @echo "*** Running: Potentially danger function via function pass on PD001 ***" 181 | @echo "***" 182 | $(QUIET)$(OPM) -dangerous-function-fpass $(TDIR)/PD001.bc $(TDIR)/PD001_out-fp.bc 183 | 184 | runfdleak: 185 | @echo "***" 186 | @echo "*** Running: Naive file descriptor leak on NFDL001 ***" 187 | @echo "***" 188 | $(QUIET)$(OPM) -naive-fd-leak $(TDIR)/NFDL001.bc $(TDIR)/NFDL001_out.bc 189 | @echo "***" 190 | @echo "*** Running: Naive file descriptor leak on NFDL002 ***" 191 | @echo "***" 192 | $(QUIET)$(OPM) -naive-fd-leak $(TDIR)/NFDL002.bc $(TDIR)/NFDL002_out.bc 193 | @echo "***" 194 | @echo "*** Running: Naive file descriptor leak on NFDL003 ***" 195 | @echo "***" 196 | $(QUIET)$(OPM) -naive-fd-leak $(TDIR)/NFDL003.bc $(TDIR)/NFDL003_out.bc 197 | @echo "***" 198 | @echo "*** Running: Naive file descriptor leak on NFDL004 ***" 199 | @echo "***" 200 | $(QUIET)$(OPM) -naive-fd-leak $(TDIR)/NFDL004.bc $(TDIR)/NFDL004_out.bc 201 | @echo "***" 202 | @echo "*** Running: Naive file descriptor leak on NFDL005 ***" 203 | @echo "***" 204 | $(QUIET)$(OPM) -naive-fd-leak $(TDIR)/NFDL005.bc $(TDIR)/NFDL005_out.bc 205 | @echo "***" 206 | @echo "*** Running: Naive file descriptor leak on NFDL006 ***" 207 | @echo "***" 208 | $(QUIET)$(OPM) -naive-fd-leak $(TDIR)/NFDL006.bc $(TDIR)/NFDL006_out.bc 209 | @echo "***" 210 | @echo "*** Running: Naive file descriptor leak on NFDL007 ***" 211 | @echo "***" 212 | $(QUIET)$(OPM) -naive-fd-leak $(TDIR)/NFDL007.bc $(TDIR)/NFDL007_out.bc 213 | 214 | 215 | # :^D 216 | jsoncpp: 217 | @echo "Building jsoncpp-1.8.0" 
218 | cd thirdparty && \ 219 | tar zxvf jsoncpp-1.8.0.tar.gz && \ 220 | cd jsoncpp-1.8.0 && \ 221 | rm -rf build && \ 222 | mkdir -p build && \ 223 | cd build && \ 224 | cmake .. && \ 225 | make && \ 226 | cd ../../ 227 | 228 | jsonclean: 229 | $(QUIET)rm -rf thirdparty/jsoncpp-1.8.0 230 | 231 | clean: 232 | $(QUIET)rm -rf build tests/*.ll 233 | 234 | cleanall: clean jsonclean 235 | 236 | -------------------------------------------------------------------------------- /code/comminute/README.md: -------------------------------------------------------------------------------- 1 | 2 | # Comminute 3 | 4 | The code here is intended to be some basics that help with learning 5 | the LLVM API. The points dealt with here are: 6 | 7 | - Pass manager use 8 | - Pass dependency 9 | - Some basic IR instruction and value analysis 10 | - Use/User API 11 | 12 | It is not meant to be a great bug-hunting tool. It is meant 13 | to help get you to the point where you can start to think about 14 | interprocedural and other analyses. Once you get the feel you 15 | should start to look at other code, like SVF, to get into things. 16 | You should look at building CFGs so you can evaluate globals 17 | better. You should look at some extended interprocedural SSA or 18 | perhaps using Andersen's Alias Analysis for analyzing pointers. 19 | There is a lot to doing good static analysis and that's where the 20 | meat of the research is! 21 | 22 | I repeat... Just a learning tool. 23 | 24 | ### Multiple potentially dangerous examples 25 | 26 | The potentially dangerous function examples are there to illustrate 27 | that you can do the same thing a few different ways. Essentially, depending on your 28 | design needs and end goals, you may want to implement one 29 | methodology or another... 
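
As a rough illustration of two of those ways, the sketch below contrasts scanning every instruction with starting from the target function and walking its users. It is a toy model in plain C++, not LLVM code: `ToyInstruction`, `ToyFunction`, `ToyModule`, `findByScan`, `findByUsers` and `makeExampleModule` are hypothetical names invented here, and `strcpy` is the target only because it is the function listed in conf/pdfunctions.cfg.

```cpp
// Toy model only: these types are hypothetical stand-ins, not LLVM classes.
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A call site is identified by (caller name, instruction index).
using CallSiteId = std::pair<std::string, std::size_t>;

struct ToyInstruction {
    std::string callee;  // empty when the instruction is not a call
};

struct ToyFunction {
    std::string name;
    std::vector<ToyInstruction> body;
};

struct ToyModule {
    std::vector<ToyFunction> functions;
    // Reverse index in the spirit of a user list: callee -> its call sites.
    std::map<std::string, std::vector<CallSiteId>> users;

    void addFunction(const ToyFunction &f) {
        assert(!f.name.empty() && "functions need a name");
        for (std::size_t i = 0; i < f.body.size(); ++i) {
            if (!f.body[i].callee.empty()) {
                users[f.body[i].callee].emplace_back(f.name, i);
            }
        }
        functions.push_back(f);
    }
};

// Way 1: visit every instruction of every function and test each one.
std::set<CallSiteId> findByScan(const ToyModule &m, const std::string &target) {
    std::set<CallSiteId> found;
    for (const auto &f : m.functions) {
        for (std::size_t i = 0; i < f.body.size(); ++i) {
            if (f.body[i].callee == target) {
                found.emplace(f.name, i);
            }
        }
    }
    return found;
}

// Way 2: start at the target and walk only the call sites that use it.
std::set<CallSiteId> findByUsers(const ToyModule &m, const std::string &target) {
    std::set<CallSiteId> found;
    auto it = m.users.find(target);
    if (it != m.users.end()) {
        found.insert(it->second.begin(), it->second.end());
    }
    return found;
}

// Small example module: two functions, two calls to strcpy.
ToyModule makeExampleModule() {
    ToyModule m;
    m.addFunction(ToyFunction{"copy_name",
        {ToyInstruction{""}, ToyInstruction{"strcpy"}, ToyInstruction{""}}});
    m.addFunction(ToyFunction{"main",
        {ToyInstruction{"copy_name"}, ToyInstruction{"strcpy"}}});
    return m;
}
```

Both functions report the same call sites for `strcpy` in `makeExampleModule()`. The user-walking version never looks at instructions that do not reference the target, which is one reason you might prefer it; the scanning version needs no reverse index and is easier to extend with per-instruction checks.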
30 | 31 | 32 | -------------------------------------------------------------------------------- /code/comminute/conf/constantarg.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "srandom" : 0, 3 | "srand48" : 0 4 | } 5 | -------------------------------------------------------------------------------- /code/comminute/conf/fdsink.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "close" : 0 3 | } 4 | -------------------------------------------------------------------------------- /code/comminute/conf/fdsource.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "socket" : -1, 3 | "open" : -1 4 | } 5 | -------------------------------------------------------------------------------- /code/comminute/conf/pdfunctions.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "functions" : [ "strcpy" ] 3 | } 4 | -------------------------------------------------------------------------------- /code/comminute/conf/sensitivesink.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "getaddrinfo" : 0 3 | } 4 | -------------------------------------------------------------------------------- /code/comminute/conf/sensitivesource.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "getpass" : -1 3 | } 4 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/NaiveConstantArgCheck.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * NaiveConstantArgCheck 3 | * 4 | * There are some functions that you do not want constant values 5 | * as arguments... some such are seed functions to (P)RNGs. 
6 | * This attempts some basic detection of such cases, but is quite 7 | * naive 8 | * 9 | * Note: 10 | * Things would be better if this used the TargetCallSitesPass. 11 | * Mostly wanted to show using instruction iteration in an 12 | * example. There are many improvements to be made.. C'est la vie. 13 | * 14 | */ 15 | 16 | #include "llvm/IR/Module.h" 17 | #include "llvm/IR/Function.h" 18 | #include "llvm/Pass.h" 19 | #include "llvm/IR/InstIterator.h" 20 | #include "llvm/IR/Constant.h" 21 | #include "llvm/IR/Constants.h" 22 | #include "llvm/IR/Instructions.h" 23 | #include "llvm/IR/CallSite.h" 24 | #include "llvm/IR/DebugInfoMetadata.h" 25 | #include "llvm/IR/DebugLoc.h" 26 | #include "llvm/Support/raw_ostream.h" 27 | #include "llvm/Analysis/LazyCallGraph.h" 28 | 29 | #include 30 | #include 31 | #include 32 | 33 | #include 34 | #include 35 | #include 36 | 37 | using namespace llvm; 38 | 39 | #include "NaiveConstantArgCheck.h" 40 | 41 | void 42 | NaiveConstantArgCheck::getAnalysisUsage(AnalysisUsage &AU) const 43 | { 44 | } 45 | 46 | bool 47 | NaiveConstantArgCheck::runOnModule(Module &M) 48 | { 49 | errs() << "Running naive constant argument check pass\n"; 50 | 51 | Json::Value caDict; 52 | std::ifstream cfgFileStream; 53 | std::map<std::string, std::pair<Function *, unsigned>> _existingToCheck; 54 | 55 | cfgFileStream.open(this->getConfigFilePath()); 56 | cfgFileStream >> caDict; 57 | cfgFileStream.close(); 58 | 59 | Json::Value::Members mems = caDict.getMemberNames(); 60 | for (auto memberIt = mems.begin(); memberIt != mems.end(); ++memberIt) { 61 | std::string fnName = *memberIt; 62 | unsigned fnArgIdx = caDict[fnName].asUInt(); 63 | 64 | // Lookup Function by name in this module 65 | Function *f = M.getFunction(fnName); 66 | if (f == NULL) { 67 | continue; 68 | } 69 | 70 | // If arg count for target fn and this fn don't match, don't add it. 
71 | if (f->arg_size() == 0 || f->arg_size() <= fnArgIdx) { 72 | continue; 73 | } 74 | _existingToCheck[fnName] = std::make_pair(f, fnArgIdx); 75 | } 76 | 77 | /* 78 | * Go through every function (cept some) 79 | * Go through every call instruction 80 | * Determine if the called function is a sink function 81 | * Naively check if the argument in question is constant (to that function) 82 | * If so, add to result set. 83 | * See User in naive sensitive data leak for another way to do things. 84 | * 85 | */ 86 | for (auto &f : M) { 87 | Function *parentFunction = &f; 88 | 89 | /* 90 | * Skip analyzing any functions we check the calling of. 91 | */ 92 | auto etcIt = _existingToCheck.begin(); 93 | for ( ; etcIt != _existingToCheck.end(); ++etcIt) { 94 | auto fi = etcIt->second; 95 | Function *ignoreMe = fi.first; 96 | if (ignoreMe == parentFunction) { 97 | break; 98 | } 99 | } 100 | if (etcIt != _existingToCheck.end()) { 101 | continue; 102 | } 103 | 104 | /* 105 | * Iterate through the instructions that compose the function. 106 | * 107 | * Instead of going through instructions to find CallInst, etc, 108 | * we could use an instruction visitor. For such an example, see the 109 | * visitor code found in the intflip base above this directory. 110 | * 111 | * Further, as per two comments above and code in another file, could 112 | * just utilize the User list associated with the Function. 113 | */ 114 | for (inst_iterator iIt = inst_begin(f); iIt != inst_end(f); ++iIt) { 115 | /* 116 | * Check if the current instruction is a call instruction. If not, 117 | * skip to the next instruction. 118 | * 119 | * Some might use: 120 | * if (CallInst *ci = dyn_cast<CallInst>(fInst)) { 121 | * } else if (InvokeInst *ii ....) 122 | * instead... 
123 | * 124 | */ 125 | Instruction *fInst = &*iIt; 126 | if (!isa<CallInst>(fInst) && !isa<InvokeInst>(fInst)) { 127 | continue; 128 | } 129 | DILocation *lineInfo = fInst->getDebugLoc().get(); 130 | CallSite cs(fInst); 131 | Function *calledFunction = cs.getCalledFunction(); 132 | 133 | auto fIt = _existingToCheck.begin(); 134 | for (; fIt != _existingToCheck.end(); ++fIt) { 135 | auto j = fIt->second; 136 | Function *badFunc = j.first; 137 | unsigned idx = j.second; 138 | 139 | if (badFunc == calledFunction) { 140 | 141 | // We just assume there needs to be enough arguments 142 | // Improvement would check function signature or something. 143 | unsigned nOps = cs.getNumArgOperands(); 144 | if (nOps <= idx) { 145 | continue; 146 | } 147 | 148 | // Get argument by index. 149 | Value *arg = cs.getArgOperand(idx); 150 | 151 | // and determine if this argument is constant 152 | if (!isa<Constant>(arg)) { 153 | continue; 154 | } 155 | 156 | /* 157 | * Since the arg was constant and was to a fn of interest, 158 | * then save off this info as a finding to display. 159 | * 160 | */ 161 | NaiveConstantArgCheckResult n(parentFunction, 162 | calledFunction, arg, idx, lineInfo); 163 | n.printResult(); 164 | } 165 | } 166 | } 167 | } 168 | return false; 169 | } 170 | 171 | void 172 | NaiveConstantArgCheckResult::printResult() 173 | { 174 | bool hl = hasLocation(); 175 | bool hn = caller->hasName(); 176 | 177 | if (hn && hl) { 178 | unsigned line = loc->getLine(); 179 | StringRef file = loc->getFilename(); 180 | StringRef fdir = loc->getDirectory(); 181 | std::cout << " !" << caller->getName().str() << " calls " 182 | << callee->getName().str() << " where arg index: " 183 | << argIndex << " is a constant\n"; 184 | std::cout << " " << file.str() << ":" << line << "\n"; 185 | } else if (hn && !hl) { 186 | std::cout << " !" 
<< caller->getName().str() << " calls " 187 | << callee->getName().str() << " with argument " 188 | << argIndex << " of constant value: \n "; 189 | argument->dump(); 190 | } 191 | } 192 | 193 | char NaiveConstantArgCheck::ID = 0; 194 | static RegisterPass<NaiveConstantArgCheck> XX("naive-con-arg-check", "Basic constant arg check"); 195 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/NaiveConstantArgCheck.h: -------------------------------------------------------------------------------- 1 | #ifndef _NAIVECONSTANTARGCHECK_H 2 | #define _NAIVECONSTANTARGCHECK_H 3 | 4 | class NaiveConstantArgCheckResult { 5 | Function *caller; 6 | Function *callee; 7 | Value *argument; 8 | unsigned argIndex; 9 | DILocation *loc; 10 | 11 | public: 12 | NaiveConstantArgCheckResult(Function *aCaller, 13 | Function *aCallee, Value *arg, unsigned idx, DILocation *d) : 14 | caller(aCaller), callee(aCallee), argument(arg), argIndex(idx), loc(d) {}; 15 | 16 | Function *getCaller() { return caller; } 17 | Function *getCallee() { return callee; } 18 | Value *getArgument() { return argument; } 19 | unsigned getArgumentIndex() { return argIndex; } 20 | bool hasLocation() { 21 | if (loc == NULL) { 22 | return false; 23 | } 24 | return true; 25 | } 26 | DILocation *getLocation() { return loc; } 27 | void printResult(); 28 | }; 29 | 30 | struct NaiveConstantArgCheck : public ModulePass { 31 | private: 32 | std::vector<NaiveConstantArgCheckResult> _results; 33 | std::string configFilePath; 34 | 35 | public: 36 | static char ID; 37 | 38 | NaiveConstantArgCheck() : ModulePass(ID) {} 39 | 40 | virtual bool runOnModule(Module &M); 41 | virtual void getAnalysisUsage(AnalysisUsage &) const; 42 | 43 | void setConfigFilePath(std::string a) { configFilePath = a; } 44 | std::string getConfigFilePath() { return configFilePath; } 45 | }; 46 | #endif // !_NAIVECONSTANTARGCHECK_H 47 | --------------------------------------------------------------------------------
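The comment block in NaiveConstantArgCheck.cpp describes the check as: go through every function except the configured ones, look at each call, and flag the call when the configured argument index holds a constant. The sketch below models only that filtering logic in plain C++ with no LLVM dependency, so it can be compiled on its own. `ToyArg`, `ToyCall`, `ToyFinding`, and `findConstantArgs` are hypothetical stand-ins invented for illustration; they are not part of this repo or of LLVM, and the real pass works on LLVM call sites and `Value`s rather than strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an argument Value: we only track constness.
struct ToyArg {
    bool isConstant;
};

// Hypothetical stand-in for a call instruction inside some caller.
struct ToyCall {
    std::string caller;
    std::string callee;
    std::vector<ToyArg> args;
};

// One reported finding: caller, callee, and the offending argument index.
struct ToyFinding {
    std::string caller;
    std::string callee;
    std::size_t argIndex;
};

// sinks maps a function name to the argument index that should not be
// a constant (the role the config file plays for the real pass).
std::vector<ToyFinding>
findConstantArgs(const std::vector<ToyCall> &calls,
                 const std::map<std::string, std::size_t> &sinks)
{
    std::vector<ToyFinding> findings;
    for (const ToyCall &c : calls) {
        // Skip calls made from inside one of the checked functions.
        if (sinks.count(c.caller) != 0) {
            continue;
        }
        // Only calls to a configured function are of interest.
        auto s = sinks.find(c.callee);
        if (s == sinks.end()) {
            continue;
        }
        // The call must actually have an argument at that index.
        std::size_t idx = s->second;
        if (c.args.size() <= idx) {
            continue;
        }
        // Naive part: flag it only when that argument is a constant.
        if (!c.args[idx].isConstant) {
            continue;
        }
        findings.push_back({c.caller, c.callee, idx});
    }
    return findings;
}
```

Like the pass itself, this does no data-flow work: a value that is constant in practice but reaches the call through a variable is not flagged.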
/code/comminute/src/Analysis/NaiveFileDescLeak.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * NaiveFileDescLeak 3 | * 4 | * Look for cases where 5 | * 6 | * %k = call srcfnthatreturnsfd 7 | * ... 8 | * 9 | * and no call close %k in the same function 10 | */ 11 | 12 | #include "llvm/IR/Module.h" 13 | #include "llvm/IR/Function.h" 14 | #include "llvm/Pass.h" 15 | #include "llvm/IR/DebugLoc.h" 16 | #include "llvm/IR/DebugInfoMetadata.h" 17 | #include "llvm/IR/InstIterator.h" 18 | #include "llvm/IR/Instructions.h" 19 | #include "llvm/IR/CallSite.h" 20 | #include "llvm/Support/raw_ostream.h" 21 | 22 | #include 23 | 24 | using namespace llvm; 25 | 26 | #include "StoreCollector.h" 27 | #include "TargetCallSitesPass.h" 28 | #include "NaiveFileDescLeak.h" 29 | 30 | void 31 | NaiveFileDescLeak::getAnalysisUsage(AnalysisUsage &AU) const 32 | { 33 | AU.addRequired<TargetCallSitesPass>(); 34 | AU.setPreservesCFG(); 35 | } 36 | 37 | void 38 | NaiveFileDescLeak::printLeak(TargetCallSite *s) 39 | { 40 | Instruction *i = s->getInstruction(); 41 | DILocation *loc = i->getDebugLoc().get(); 42 | 43 | errs() << " ! file descriptor leak:\n"; 44 | if (loc) { 45 | errs() << " " << s->getCaller()->getName() << " calls " 46 | << s->getCalled()->getName() << " at " << loc->getFilename() 47 | << ":" << loc->getLine() << " and never closes\n"; 48 | } else { 49 | // No debug info. 50 | errs() << " " << s->getCaller()->getName() << " calls " 51 | << s->getCalled()->getName() << " and never closes \n"; 52 | i->getDebugLoc().dump(); 53 | } 54 | } 55 | 56 | void 57 | NaiveFileDescLeak::printLikelyFP(TargetCallSite *s, std::string reason) 58 | { 59 | Instruction *i = s->getInstruction(); 60 | DILocation *loc = i->getDebugLoc().get(); 61 | 62 | errs() << " ! 
file descriptor FP likely:\n"; 63 | if (loc) { 64 | errs() << " " << s->getCaller()->getName() << " calls " 65 | << s->getCalled()->getName() << " at " << loc->getFilename() 66 | << ":" << loc->getLine() << " reason: " << reason << "\n"; 67 | } else { 68 | // No debug info. 69 | errs() << " " << s->getCaller()->getName() << " calls " 70 | << s->getCalled()->getName() << " reason: " << reason << "\n"; 71 | i->getDebugLoc().dump(); 72 | } 73 | } 74 | 75 | bool 76 | NaiveFileDescLeak::runOnModule(Module &M) 77 | { 78 | errs() << "Running naive file descriptor leak pass\n"; 79 | 80 | /* 81 | * Make use of the result of running the TargetCallSitesPass. 82 | * It gives locations where file descriptors were created 83 | * and locations of file descriptor destroyer calls (e.g. close(2)). 84 | */ 85 | TargetCallSitesPass &p = getAnalysis<TargetCallSitesPass>(); 86 | if (p.src_empty()) { 87 | return false; 88 | } 89 | 90 | /* 91 | * If there exist no close-like calls, then every source is 92 | * leaking or we are missing some function that closes (e.g. 93 | * it exists in a different compilation unit (Module) or we 94 | * do not know some call from an API we externalized does a 95 | * close). 96 | */ 97 | if (p.snk_empty()) { 98 | errs() << " ! No close-like calls found.\n"; 99 | for (auto tcs = p.src_begin(); tcs != p.src_end(); ++tcs) { 100 | TargetCallSite *s = tcs->get(); 101 | printLeak(s); 102 | } 103 | return false; 104 | } 105 | /* 106 | * For every open()-like call, if the returned Value has no users at 107 | * all, then nothing can ever close it, so report it as a leak. 
108 | */ 109 | for (auto srcIt = p.src_begin(); srcIt != p.src_end(); ) { 110 | TargetCallSite *srcSite = &*srcIt->get(); 111 | Value *possiblyLeaked = srcSite->getTarget(); 112 | if (possiblyLeaked->user_empty()) { 113 | printLeak(srcSite); 114 | srcIt = p.src_erase(srcIt); 115 | } else { 116 | ++srcIt; 117 | } 118 | } 119 | 120 | for (auto snkIt = p.snk_begin(); snkIt != p.snk_end(); ++snkIt) { 121 | TargetCallSite *snkSite = &*snkIt->get(); 122 | Value *closingVar = snkSite->getTarget(); 123 | if (isa<Argument>(closingVar)) { 124 | printLikelyFP(snkSite, "Value is an Argument to parent function (unsupported)"); 125 | } else if (isa<GlobalVariable>(closingVar)) { 126 | printLikelyFP(snkSite, "Value is a GlobalVariable (unsupported)"); 127 | } 128 | } 129 | 130 | /* 131 | * Now, for every close()-like, go through and see if we can easily 132 | * find a source open(). If we can, we remove the value from being 133 | * a possible leak. 134 | * 135 | * Note that this is naive. Note that if you use the PHINode axe 136 | * you may get false negatives. Etc etc etc :P 137 | */ 138 | for (auto snkIt = p.snk_begin(); snkIt != p.snk_end(); ) { 139 | TargetCallSite *snkSite = &*snkIt->get(); 140 | Value *closedValue = snkSite->getTarget(); 141 | bool remd = false; 142 | for (auto srcIt = p.src_begin(); srcIt != p.src_end(); ) { 143 | TargetCallSite *srcSite = &*srcIt->get(); 144 | Value *possiblyLeaked = srcSite->getTarget(); 145 | if (closedValue == possiblyLeaked) { 146 | snkIt = p.snk_erase(snkIt); 147 | srcIt = p.src_erase(srcIt); 148 | remd = true; 149 | break; 150 | } else { 151 | ++srcIt; 152 | } 153 | } 154 | if (remd == false) { 155 | ++snkIt; 156 | } 157 | } 158 | 159 | /* 160 | * If we did not remove some sources then either: 161 | * (a) we failed to track things properly (very likely! :D) 162 | * (b) we have some fd leak 163 | * these are basic, weak assumptions. 
164 | */ 165 | for (auto srcIt = p.src_begin(); srcIt != p.src_end(); ++srcIt) { 166 | TargetCallSite *srcSite = &*srcIt->get(); 167 | Value *v = srcSite->getTarget(); 168 | if (isa<Argument>(v)) { 169 | printLikelyFP(srcSite, "Value is an Argument to parent function (unsupported)"); 170 | } else if (isa<GlobalVariable>(v)) { 171 | printLikelyFP(srcSite, "Value is a GlobalVariable (unsupported)"); 172 | } else { 173 | printLeak(srcSite); 174 | } 175 | } 176 | return false; 177 | } 178 | 179 | char NaiveFileDescLeak::ID = 0; 180 | static RegisterPass<NaiveFileDescLeak> XX("naive-fd-leak", "Naive fd leak"); 181 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/NaiveFileDescLeak.h: -------------------------------------------------------------------------------- 1 | #ifndef __NAIVEFILEDESCLEAK_H 2 | #define __NAIVEFILEDESCLEAK_H 3 | 4 | struct NaiveFileDescLeak : public ModulePass { 5 | public: 6 | static char ID; 7 | 8 | typedef std::pair FuncArg; 9 | typedef std::map FuncArgMap; 10 | 11 | NaiveFileDescLeak() : ModulePass(ID) { } 12 | virtual bool runOnModule(Module &); 13 | virtual void getAnalysisUsage(AnalysisUsage &) const; 14 | 15 | void printLeak(TargetCallSite *s); 16 | void printLikelyFP(TargetCallSite *s, std::string r); 17 | 18 | private: 19 | }; 20 | 21 | 22 | #endif 23 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/NaiveSensitiveDataLeak.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * NaiveSensitiveDataLeak 3 | * 4 | * It makes use of the TargetCallSitesPass as a way to get uses and 5 | * target values of interest (sinks, sources, and the related data); 6 | * that pass runs prior to this one. 7 | * 8 | * There is no handling of special entry points or callbacks that are 9 | * tainted a priori. 
10 | */ 11 | 12 | #include "llvm/IR/Module.h" 13 | #include "llvm/IR/Function.h" 14 | #include "llvm/Pass.h" 15 | #include "llvm/IR/InstIterator.h" 16 | #include "llvm/IR/Instructions.h" 17 | #include "llvm/IR/CallSite.h" 18 | #include "llvm/IR/DebugInfoMetadata.h" 19 | #include "llvm/IR/DebugLoc.h" 20 | #include "llvm/Support/raw_ostream.h" 21 | 22 | #include 23 | 24 | using namespace llvm; 25 | 26 | #include "TargetCallSitesPass.h" 27 | #include "NaiveSensitiveDataLeak.h" 28 | 29 | void 30 | NaiveSensitiveDataLeak::getAnalysisUsage(AnalysisUsage &AU) const 31 | { 32 | AU.addRequired<TargetCallSitesPass>(); 33 | AU.setPreservesCFG(); 34 | } 35 | 36 | bool 37 | NaiveSensitiveDataLeak::runOnModule(Module &M) 38 | { 39 | errs() << "Running naive sensitive data leak pass\n"; 40 | /* 41 | * We can use the upstream analysis from the TargetCallSitesPass. 42 | * Always nice to use already available tools. 43 | */ 44 | TargetCallSitesPass &p = getAnalysis<TargetCallSitesPass>(); 45 | 46 | 47 | if (p.src_empty()) { 48 | return false; 49 | } 50 | if (p.snk_empty()) { 51 | return false; 52 | } 53 | 54 | // For each sink value available, we must attempt to trace it to a source 55 | for (auto snkIt = p.snk_begin(); snkIt != p.snk_end(); ++snkIt) { 56 | TargetCallSite *snkSite = &*snkIt->get(); 57 | Value *leakData = snkSite->getTarget(); 58 | /* 59 | * Walk the sources from last to first, decrementing inside the loop 60 | * body so the iterator never steps before src_begin(). 61 | */ 62 | auto srcIt = p.src_end(); 63 | while (srcIt != p.src_begin()) { 64 | --srcIt; 65 | TargetCallSite *srcSite = &*srcIt->get(); 66 | Value *originalSourceData = srcSite->getTarget(); 67 | Value *sourceData = originalSourceData; 68 | if (isa<CallInst>(leakData) || isa<InvokeInst>(leakData)) { 69 | if (leakData == sourceData) { 70 | printResult(srcSite, snkSite); 71 | break; 72 | } 73 | } 74 | } 75 | } 76 | return false; 77 | 78 | } 79 | 80 | void 81 | NaiveSensitiveDataLeak::printResult(TargetCallSite *srcSite, 82 | TargetCallSite *snkSite) 83 | { 84 | Instruction *snkIn = snkSite->getInstruction(); 85 | 
Instruction *srcIn = srcSite->getInstruction(); 86 | 87 | DILocation *snkLoc = snkIn->getDebugLoc().get(); 88 | DILocation *srcLoc = srcIn->getDebugLoc().get(); 89 | 90 | errs() << " ! sensitive data leak \n"; 91 | if (snkLoc == NULL || srcLoc == NULL) { 92 | // No debug info, so there are no file/line details to print. 93 | errs() << " " << snkSite->getCaller()->getName() 94 | << " calls " << snkSite->getCalled()->getName() 95 | << " where arg idx #" << snkSite->getArgIndex() 96 | << " is tainted sensitive.\n"; 97 | return; 98 | } 99 | errs() << " " << snkSite->getCaller()->getName() 100 | << " calls " << snkSite->getCalled()->getName() 101 | << " where arg idx #" << snkSite->getArgIndex() 102 | << " is tainted sensitive. file: " << snkLoc->getFilename() 103 | << " line: " << snkLoc->getLine() << "\n"; 104 | errs() << " source: " << srcSite->getCaller()->getName() 105 | << " calls " << srcSite->getCalled()->getName() 106 | << " at line: "<< srcLoc->getLine() << " of file: " 107 | << srcLoc->getFilename() << "\n"; 108 | } 109 | 110 | char NaiveSensitiveDataLeak::ID = 0; 111 | static RegisterPass<NaiveSensitiveDataLeak> XX("naive-sensitive-leak", 112 | "Naive sensitive data leak"); 113 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/NaiveSensitiveDataLeak.h: -------------------------------------------------------------------------------- 1 | #ifndef __NAIVESENSITIVEDATALEAK_H 2 | #define __NAIVESENSITIVEDATALEAK_H 3 | 4 | struct NaiveSensitiveDataLeak : public ModulePass { 5 | static char ID; 6 | typedef std::pair FuncArg; 7 | typedef std::map FuncArgMap; 8 | 9 | NaiveSensitiveDataLeak() : ModulePass(ID) { } 10 | virtual bool runOnModule(Module &); 11 | virtual void getAnalysisUsage(AnalysisUsage &) const; 12 | private: 13 | void parseAndCheckConfig(FuncArgMap *, bool); 14 | void printResult(TargetCallSite *srcSite, TargetCallSite *snkSite); 15 | 16 | }; 17 | 18 | 19 | #endif 20 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/NaiveSensitiveDataLeak_ahead.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * XXX I accidentally committed this file.. so maybe ignore.. but 3 | * leaving it because it might be useful to read for someone. 
4 | */ 5 | 6 | /* 7 | * NaiveSensitiveDataLeak 8 | * 9 | * It makes use of the TargetCallSitesPass as a way to get uses and 10 | * target values of interest (sinks, sources, and the related data); 11 | * that pass runs prior to this one. 12 | * 13 | * It is using the StoreCollector as a poor means of tracking 14 | * memory load/store use. That could be a pass but it isn't. There 15 | * are much better ways for handling this... Andersen's AA or 16 | * MemoryDependenceAnalysis. The former is a bit more memory intensive 17 | * and aggressive, while the latter is ``lazy''. 18 | * 19 | * There is no handling of special entry points or callbacks that are 20 | * tainted a priori. 21 | */ 22 | 23 | #include "llvm/IR/Module.h" 24 | #include "llvm/IR/Function.h" 25 | #include "llvm/Pass.h" 26 | #include "llvm/IR/InstIterator.h" 27 | #include "llvm/IR/Instructions.h" 28 | #include "llvm/IR/CallSite.h" 29 | #include "llvm/IR/DebugInfoMetadata.h" 30 | #include "llvm/IR/DebugLoc.h" 31 | #include "llvm/Support/raw_ostream.h" 32 | 33 | #include 34 | 35 | using namespace llvm; 36 | 37 | #include "TargetCallSitesPass.h" 38 | #include "NaiveSensitiveDataLeak.h" 39 | #include "StoreCollector.h" 40 | 41 | void 42 | NaiveSensitiveDataLeak::getAnalysisUsage(AnalysisUsage &AU) const 43 | { 44 | AU.addRequired<TargetCallSitesPass>(); 45 | AU.setPreservesCFG(); 46 | } 47 | 48 | bool 49 | NaiveSensitiveDataLeak::runOnModule(Module &M) 50 | { 51 | errs() << "Running naive sensitive data leak pass\n"; 52 | TargetCallSitesPass &p = getAnalysis<TargetCallSitesPass>(); 53 | 54 | 55 | if (p.src_empty()) { 56 | return false; 57 | } 58 | if (p.snk_empty()) { 59 | return false; 60 | } 61 | 62 | StoreCollector *store = new StoreCollector(); 63 | 64 | // For each sink value available, we must attempt to trace it to a source 65 | for (auto snkIt = p.snk_begin(); snkIt != p.snk_end(); ++snkIt) { 66 | TargetCallSite *snkSite = &*snkIt->get(); 67 | Value *leakData = snkSite->getTarget(); 68 | 69 | #if 0 70 | /* Lazily refresh the StoreInst holder */ 71 | if 
(snkSite->getCaller() != store->getFunction()) { 72 | store->collect(snkSite->getCaller()); 73 | } 74 | #endif 75 | 76 | for (auto srcIt = p.src_begin(); srcIt != p.src_end(); ++srcIt) { 77 | TargetCallSite *srcSite = &*srcIt->get(); 78 | Value *originalSourceData = srcSite->getTarget(); 79 | Value *sourceData = originalSourceData; 80 | 81 | while (true) { 82 | if (isa<CallInst>(leakData) || isa<InvokeInst>(leakData)) { 83 | if (leakData == sourceData) { 84 | printResult(srcSite, snkSite); 85 | } 86 | /* 87 | * Does not handle propagators at this point :( 88 | * so any call is either a source or dead end. 89 | */ 90 | break; 91 | } 92 | 93 | if (isa(leakData)) { 94 | break; 95 | } 96 | if (isa(leakData)) { 97 | break; 98 | } 99 | 100 | if (CastInst *ci = dyn_cast<CastInst>(leakData)) { 101 | User *cu = cast<User>(ci); 102 | leakData = cu->getOperand(0); 103 | continue; 104 | } 105 | if (LoadInst *li = dyn_cast<LoadInst>(leakData)) { 106 | Value *memLoc = li->getPointerOperand(); 107 | leakData = store->find(memLoc); 108 | assert(leakData != NULL && "memLoc not in storeCollect"); 109 | continue; 110 | } 111 | if (GetElementPtrInst *gp = 112 | dyn_cast<GetElementPtrInst>(leakData)) { 113 | leakData = gp->getPointerOperand(); 114 | continue; 115 | } 116 | if (isa(leakData)) { 117 | errs() << "Data leaked escaped function analysis, unknown result.\n"; 118 | break; 119 | } 120 | errs() << "Unhandled:\n "; 121 | leakData->dump(); 122 | assert(0 == 1); 123 | } 124 | } 125 | } 126 | delete store; 127 | return false; 128 | 129 | } 130 | 131 | void 132 | NaiveSensitiveDataLeak::printResult(TargetCallSite *srcSite, 133 | TargetCallSite *snkSite) 134 | { 135 | Instruction *snkIn = snkSite->getInstruction(); 136 | Instruction *srcIn = srcSite->getInstruction(); 137 | 138 | DILocation *snkLoc = snkIn->getDebugLoc().get(); 139 | DILocation *srcLoc = srcIn->getDebugLoc().get(); 140 | 141 | errs() << " ! 
sensitive data leak \n"; 142 | errs() << " " << snkSite->getCaller()->getName() 143 | << " calls " << snkSite->getCalled()->getName() 144 | << " where arg idx #" << snkSite->getArgIndex() 145 | << " is tainted sensitive. file: " << snkLoc->getFilename() 146 | << " line: " << snkLoc->getLine() << "\n"; 147 | errs() << " source: " << srcSite->getCaller()->getName() 148 | << " calls " << srcSite->getCalled()->getName() 149 | << " at line: "<< srcLoc->getLine() << " of file: " 150 | << srcLoc->getFilename() << "\n"; 151 | } 152 | 153 | char NaiveSensitiveDataLeak::ID = 0; 154 | static RegisterPass<NaiveSensitiveDataLeak> XX("naive-sensitive-leak", 155 | "Naive sensitive data leak"); 156 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/PotentiallyDangerousScan.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * 3 | * PotentiallyDangerousScan 4 | * 5 | * This is yet another naive scan :-). This is just 6 | * checking for CWE 676 which is just the use of potentially 7 | * dangerous functions. 8 | * 9 | * There are, again, multiple ways that one might do this 10 | * check in LLVM (i.e., ignoring objdump | grep :)). You 11 | * could use a visitor, instruction iteration, or even 12 | * the User checking of the functions in question. 13 | * Each of those is valid, but here we perform this 14 | * by relying on the CallGraph pass being run prior 15 | * to this pass. We then just go through those results 16 | * and perform the analysis. So this code illustrates 17 | * pass dependency (via getAnalysisUsage()) and using 18 | * the CallGraphWrapperPass as a source of data. 
19 | * 20 | */ 21 | #include "llvm/IR/Module.h" 22 | #include "llvm/IR/Function.h" 23 | #include "llvm/IR/ValueHandle.h" 24 | #include "llvm/Pass.h" 25 | #include "llvm/IR/InstIterator.h" 26 | #include "llvm/IR/Instructions.h" 27 | #include "llvm/Analysis/CallGraph.h" 28 | 29 | #include 30 | #include 31 | #include 32 | #include 33 | 34 | #include 35 | #include 36 | #include 37 | 38 | using namespace llvm; 39 | 40 | #include "PotentiallyDangerousScan.h" 41 | 42 | /* 43 | * This informs the pass manager that prior to running this pass, the 44 | * CallGraphWrapperPass should be run. This helps in ordering passes so 45 | * you can have a reasonable expectation of state of the IR (or other) 46 | * upon entry to your runOn*() function. 47 | * 48 | */ 49 | void 50 | PotentiallyDangerousScan::getAnalysisUsage(AnalysisUsage &AU) const 51 | { 52 | AU.addRequired<CallGraphWrapperPass>(); 53 | } 54 | 55 | 56 | bool 57 | PotentiallyDangerousScan::runOnModule(Module &M) 58 | { 59 | Json::Value fnDict; 60 | std::ifstream cfgFileStream; 61 | std::vector<Function *> pdFunctions; 62 | 63 | /* 64 | * Make use of libjsoncpp to ingest a json config file. 65 | * This file houses the set of functions that one might consider 66 | * to fall under CWE 676. Then see if any of them exist in this 67 | * module. 68 | * 69 | * I am being trusting of the configs.. 70 | */ 71 | cfgFileStream.open(this->getConfigFilePath()); 72 | cfgFileStream >> fnDict; 73 | cfgFileStream.close(); 74 | Json::Value fnList = fnDict["functions"]; 75 | assert(fnList.isArray() && "fnList was not an array"); 76 | Json::ArrayIndex aLen = fnList.size(); 77 | for (Json::ArrayIndex ai = 0; ai < aLen; ai++) { 78 | Function *f = M.getFunction(fnList[ai].asString()); 79 | if (f != NULL) { 80 | pdFunctions.push_back(f); 81 | } 82 | } 83 | 84 | /* 85 | * A great thing about the pass design is that you can share information 86 | * between passes. Here we are getting the call graph as previously built 87 | * by the call graph pass. 
If you look at the getAnalysisUsage() function near 88 | * the top of this file, you will note the dependency on that pass 89 | * being run prior to this one. 90 | * 91 | */ 92 | CallGraphWrapperPass &cgPass = getAnalysis<CallGraphWrapperPass>(); 93 | 94 | /* store results in pairs of caller and callee */ 95 | typedef std::pair<Function *, Function *> PDRes; 96 | std::vector<PDRes> results; 97 | 98 | /* 99 | * The container we go through is a map with key Function * and value of 100 | * CallGraphNode list. The key is the caller and value the callees. 101 | * 102 | * The first entry in this map has a NULL caller and contains the set of all functions. 103 | * 104 | */ 105 | CallGraph &cg = cgPass.getCallGraph(); 106 | for (auto cgIt = cg.begin(); cgIt != cg.end(); ++cgIt) { 107 | /* 108 | * Calling function 109 | */ 110 | Function *caller = const_cast<Function *>(cgIt->first); 111 | if (caller == NULL) { 112 | continue; 113 | } 114 | 115 | CallGraphNode &callees = *cgIt->second; 116 | if (callees.size() == 0) { 117 | continue; 118 | } 119 | 120 | /* 121 | * We iterate through a vector of CallRecords. Each CallRecord is a 122 | * std::pair whose second member is a CallGraphNode *, 123 | * in which each CGN is a called function. 
124 | * 125 | */ 126 | for (const auto &crIt : callees) { 127 | if (Function *callee = crIt.second->getFunction()) { 128 | 129 | /* determine if the callee is something we consider dangerous */ 130 | for (auto pdIt = pdFunctions.begin(); pdIt != pdFunctions.end(); ++pdIt) { 131 | Function *pdFn = *pdIt; 132 | if (pdFn == callee) { 133 | /* Save a result since matched a pd function */ 134 | auto r = std::make_pair(caller, callee); 135 | results.push_back(r); 136 | break; 137 | } 138 | } 139 | } 140 | } 141 | } 142 | 143 | errs() << "Results for potentially dangerous function call usage:\n"; 144 | for (auto rIt = results.begin(); rIt != results.end(); ++rIt) { 145 | Function *caller = rIt->first; 146 | Function *callee = rIt->second; 147 | errs() << " " << caller->getName().str() << " calls " << \ 148 | callee->getName().str() << "\n"; 149 | } 150 | 151 | return false; 152 | } 153 | 154 | char PotentiallyDangerousScan::ID = 1; 155 | static RegisterPass<PotentiallyDangerousScan> XX("pot-danger", "Potentially Dangerous Call (CWE 676)"); 156 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/PotentiallyDangerousScan.h: -------------------------------------------------------------------------------- 1 | #ifndef __POTENTIALLYDANGEROUSSCAN_H 2 | #define __POTENTIALLYDANGEROUSSCAN_H 3 | 4 | struct PotentiallyDangerousScan : public ModulePass { 5 | private: 6 | std::string _cfgFilePath; 7 | 8 | public: 9 | static char ID; 10 | 11 | PotentiallyDangerousScan() : ModulePass(ID) { } 12 | virtual bool runOnModule(Module &); 13 | virtual void getAnalysisUsage(AnalysisUsage &) const; 14 | 15 | void setConfigFilePath(std::string s) { _cfgFilePath = s; } 16 | std::string getConfigFilePath() { return _cfgFilePath; } 17 | }; 18 | 19 | #endif 20 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/PotentiallyDangerousScanFunctionPass.cpp: 
-------------------------------------------------------------------------------- 1 | /* 2 | * 3 | * PotentiallyDangerousScanFunctionPass 4 | * 5 | * This example implements the scan as a function pass and 6 | * then iterates through the instructions for calls. This could 7 | * also be done as a visitor. 8 | * 9 | */ 10 | #include "llvm/IR/Module.h" 11 | #include "llvm/IR/Function.h" 12 | #include "llvm/IR/ValueHandle.h" 13 | #include "llvm/Pass.h" 14 | #include "llvm/IR/InstIterator.h" 15 | #include "llvm/IR/CallSite.h" 16 | #include "llvm/IR/Instructions.h" 17 | #include "llvm/Analysis/CallGraph.h" 18 | 19 | #include 20 | #include 21 | #include 22 | #include 23 | 24 | #include 25 | #include 26 | #include 27 | 28 | using namespace llvm; 29 | 30 | #include "PotentiallyDangerousScanFunctionPass.h" 31 | 32 | void 33 | PotentiallyDangerousScanFunctionPass::getAnalysisUsage(AnalysisUsage &AU) const 34 | { 35 | } 36 | 37 | void 38 | PotentiallyDangerousScanFunctionPass::setConfigFilePath(std::string s) 39 | { 40 | 41 | _cfgFilePath = s; 42 | pdFunctions.clear(); 43 | lookupPDFunctions = true; 44 | } 45 | 46 | 47 | bool 48 | PotentiallyDangerousScanFunctionPass::runOnFunction(Function &f) 49 | { 50 | std::string caller = "unnamed_func"; 51 | 52 | if (lookupPDFunctions) { 53 | Json::Value fnDict; 54 | std::ifstream cfgFileStream; 55 | 56 | cfgFileStream.open(this->getConfigFilePath()); 57 | cfgFileStream >> fnDict; 58 | cfgFileStream.close(); 59 | Json::Value fnList = fnDict["functions"]; 60 | assert(fnList.isArray() && "fnList was not an array"); 61 | Json::ArrayIndex aLen = fnList.size(); 62 | Module *m = f.getParent(); 63 | for (Json::ArrayIndex ai = 0; ai < aLen; ai++) { 64 | Function *pdFn = m->getFunction(fnList[ai].asString()); 65 | if (pdFn != NULL) { 66 | pdFunctions.push_back(pdFn); 67 | } 68 | } 69 | lookupPDFunctions = false; 70 | } 71 | // Nothing to find. 
72 | if (pdFunctions.empty()) { 73 | return false; 74 | } 75 | 76 | if (f.hasName()) { 77 | caller = f.getName().str(); 78 | } 79 | for (auto ii = inst_begin(f); ii != inst_end(f); ++ii) { 80 | Instruction *in = &*ii; 81 | if (isa<CallInst>(in) || isa<InvokeInst>(in)) { 82 | CallSite cs(in); 83 | Function *called = cs.getCalledFunction(); 84 | for (auto pdi = pdFunctions.begin(); pdi != pdFunctions.end(); ++pdi) { 85 | Function *pdf = *pdi; 86 | if (called == pdf) { 87 | std::cout << " " << caller << " called " << pdf->getName().str() << "\n"; 88 | break; 89 | } 90 | } 91 | } 92 | } 93 | return false; 94 | } 95 | 96 | char PotentiallyDangerousScanFunctionPass::ID = 1; 97 | static RegisterPass<PotentiallyDangerousScanFunctionPass> XX("pot-danger-function-pas", "Potentially Dangerous Call (CWE 676) func pass"); 98 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/PotentiallyDangerousScanFunctionPass.h: -------------------------------------------------------------------------------- 1 | #ifndef __POTENTIALLYDANGEROUSSCANFUNCTIONPASS_H 2 | #define __POTENTIALLYDANGEROUSSCANFUNCTIONPASS_H 3 | 4 | struct PotentiallyDangerousScanFunctionPass : public FunctionPass { 5 | private: 6 | std::string _cfgFilePath; 7 | std::vector<Function *> pdFunctions; 8 | bool lookupPDFunctions; 9 | 10 | 11 | public: 12 | static char ID; 13 | 14 | PotentiallyDangerousScanFunctionPass() : FunctionPass(ID) { 15 | pdFunctions.clear(); 16 | lookupPDFunctions = true; 17 | } 18 | virtual bool runOnFunction(Function &); 19 | virtual void getAnalysisUsage(AnalysisUsage &) const; 20 | 21 | std::string getConfigFilePath() { return _cfgFilePath; } 22 | void setConfigFilePath(std::string); 23 | }; 24 | 25 | #endif 26 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/PotentiallyDangerousScanUserMethod.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * PotentiallyDangerousScanUserMethod 3 | * 4 | * 
This example is implemented by looking at the User list associated 5 | * with each p.d. function. 6 | * 7 | * Similar to PotentiallyDangerousScan and 8 | * PotentiallyDangerousScanFunctionPass 9 | * 10 | */ 11 | #include "llvm/IR/Module.h" 12 | #include "llvm/IR/Function.h" 13 | #include "llvm/IR/ValueHandle.h" 14 | #include "llvm/Pass.h" 15 | #include "llvm/IR/InstIterator.h" 16 | #include "llvm/IR/Instructions.h" 17 | #include "llvm/IR/CallSite.h" 18 | #include "llvm/Analysis/CallGraph.h" 19 | 20 | #include 21 | #include 22 | #include 23 | #include 24 | 25 | #include 26 | #include 27 | #include 28 | 29 | using namespace llvm; 30 | 31 | #include "PotentiallyDangerousScanUserMethod.h" 32 | 33 | void 34 | PotentiallyDangerousScanUserMethod::getAnalysisUsage(AnalysisUsage &AU) const 35 | { 36 | } 37 | 38 | 39 | bool 40 | PotentiallyDangerousScanUserMethod::runOnModule(Module &M) 41 | { 42 | Json::Value fnDict; 43 | std::ifstream cfgFileStream; 44 | 45 | cfgFileStream.open(this->getConfigFilePath()); 46 | cfgFileStream >> fnDict; 47 | cfgFileStream.close(); 48 | Json::Value fnList = fnDict["functions"]; 49 | assert(fnList.isArray() && "fnList was not an array"); 50 | Json::ArrayIndex aLen = fnList.size(); 51 | for (Json::ArrayIndex ai = 0; ai < aLen; ai++) { 52 | Function *f = M.getFunction(fnList[ai].asString()); 53 | if (f == NULL) { 54 | continue; 55 | } 56 | for (auto fi = f->user_begin(); fi != f->user_end(); ++fi) { 57 | User *u = *fi; 58 | if (isa<CallInst>(u) || isa<InvokeInst>(u)) { 59 | 60 | /* CallSite is a nice container for call and invoke */ 61 | CallSite cs(u); 62 | Function *caller = cs.getCaller(); 63 | std::string callerName = "unnamed_func"; 64 | if (caller->hasName()) { 65 | callerName = caller->getName().str(); 66 | } 67 | std::cout << " " << callerName << " calls " 68 | << fnList[ai].asString() << "\n"; 69 | } 70 | } 71 | } 72 | return false; 73 | } 74 | 75 | char PotentiallyDangerousScanUserMethod::ID = 1; 76 | static RegisterPass<PotentiallyDangerousScanUserMethod> XX("pot-danger-user-method", 
"Potentially Dangerous Call (CWE 676) done with User list"); 77 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/PotentiallyDangerousScanUserMethod.h: -------------------------------------------------------------------------------- 1 | #ifndef __POTENTIALLYDANGEROUSSCANUSERMETHOD_H 2 | #define __POTENTIALLYDANGEROUSSCANUSERMETHOD_H 3 | 4 | struct PotentiallyDangerousScanUserMethod : public ModulePass { 5 | private: 6 | std::string _cfgFilePath; 7 | 8 | public: 9 | static char ID; 10 | 11 | PotentiallyDangerousScanUserMethod() : ModulePass(ID) { } 12 | virtual bool runOnModule(Module &); 13 | virtual void getAnalysisUsage(AnalysisUsage &) const; 14 | 15 | void setConfigFilePath(std::string s) { _cfgFilePath = s; } 16 | std::string getConfigFilePath() { return _cfgFilePath; } 17 | }; 18 | 19 | #endif 20 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/StoreCollector.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * There are methods that are much better than what I am doing. There are 3 | * aggressive methods such as Andersen's Alias Analysis and there are 4 | * lazy methods such as making use of the MemoryDependenceAnalysis API. 5 | * 6 | * What is going on here is quite basic and will miss many things. 
7 | */ 8 | #include "llvm/IR/Module.h" 9 | #include "llvm/IR/Function.h" 10 | #include "llvm/IR/InstIterator.h" 11 | #include "llvm/IR/Instructions.h" 12 | 13 | #include 14 | #include 15 | 16 | using namespace llvm; 17 | 18 | #include "StoreCollector.h" 19 | 20 | void 21 | StoreCollector::collect(Function *f) 22 | { 23 | collectedFunction = f; 24 | if (storeMap.empty() == false) { 25 | storeMap.clear(); 26 | } 27 | for (auto ii = inst_begin(*f); ii != inst_end(*f); ++ii) { 28 | Instruction *in = &*ii; 29 | if (!isa<StoreInst>(in)) { 30 | continue; 31 | } 32 | StoreInst *s = cast<StoreInst>(in); 33 | Value *storedVal = s->getValueOperand(); 34 | Value *storedLoc = s->getPointerOperand(); 35 | storeMap[storedLoc] = storedVal; 36 | continue; 37 | } 38 | } 39 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/StoreCollector.h: -------------------------------------------------------------------------------- 1 | #ifndef __STORECOLLECTOR_H 2 | #define __STORECOLLECTOR_H 3 | 4 | class StoreCollector { 5 | // key is pointer and value is the stored value 6 | std::map<Value *, Value *> storeMap; 7 | Function *collectedFunction; 8 | 9 | public: 10 | StoreCollector() { 11 | collectedFunction = NULL; 12 | } 13 | void collect(Function *); 14 | Function *getFunction() { 15 | return collectedFunction; 16 | } 17 | 18 | Value *find(Value *storeLoc) { 19 | auto s = storeMap.find(storeLoc); 20 | if (s == storeMap.end()) { 21 | return NULL; 22 | } 23 | return s->second; 24 | } 25 | }; 26 | #endif 27 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/TargetCallSitesPass.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * This is a dependency for a few of the passes. Collects 3 | * call sites and organizes them by TargetCallType. 
4 | * 5 | */ 6 | 7 | #include "llvm/IR/Module.h" 8 | #include "llvm/IR/Function.h" 9 | #include "llvm/Pass.h" 10 | #include "llvm/IR/InstIterator.h" 11 | #include "llvm/IR/Instructions.h" 12 | #include "llvm/IR/CallSite.h" 13 | #include "llvm/Support/raw_ostream.h" 14 | 15 | #include 16 | #include 17 | #include 18 | 19 | #include 20 | #include 21 | #include 22 | 23 | using namespace llvm; 24 | 25 | #include "TargetCallSitesPass.h" 26 | #include "../Transform/FunctionExternalizer.h" 27 | 28 | void 29 | TargetCallSitesPass::getAnalysisUsage(AnalysisUsage &AU) const 30 | { 31 | AU.setPreservesCFG(); 32 | } 33 | 34 | void 35 | TargetCallSitesPass::parseConfig(std::string configFilePath, TargetCallType tct, 36 | Module *M) 37 | { 38 | Json::Value dict; 39 | std::ifstream strm; 40 | 41 | // XXX quite trusting 42 | strm.open(configFilePath); 43 | strm >> dict; 44 | strm.close(); 45 | Json::Value::Members mems = dict.getMemberNames(); 46 | for (auto memIt = mems.begin(); memIt != mems.end(); ++memIt) { 47 | std::string fnName = *memIt; 48 | 49 | int argIdx = dict[fnName].asInt(); 50 | assert(argIdx >= -1 && "Argument index should be >= -1"); 51 | 52 | /* 53 | * Determine if the name given for the function in the config is 54 | * a name of a function in this module. 55 | */ 56 | Function *fp = M->getFunction(fnName); 57 | if (fp == NULL) { 58 | continue; 59 | } 60 | 61 | /* 62 | * See if argument counts match up 63 | * If we were cool, we would check arg types if we detected 64 | * no name mangling. 65 | * 66 | * The difference between sink and source cases is that at some point 67 | * want to not just handle return value of source but allow for in/out 68 | * or out arguments to be tainted. 69 | */ 70 | if (argIdx != -1 && \ 71 | (fp->arg_size() == 0 || fp->arg_size() <= (unsigned)argIdx)) { 72 | continue; 73 | } 74 | 75 | /* 76 | * Check this Function's User list. If there is no instruction using 77 | * this function, then we have nothing to check for it. 
78 | */ 79 | if (fp->user_empty() == true) { 80 | continue; 81 | } 82 | 83 | /* Ok, so we have name, argument, function, and a non-empty user list */ 84 | for (auto userIt = fp->user_begin(); userIt != fp->user_end(); 85 | ++userIt) { 86 | User *targUser = *userIt; 87 | 88 | /* 89 | * Not handling functions passed as callbacks or entry pts based on 90 | * RT environment, so just get CallSites. Then make sure the called 91 | * function is the targeted function (from config file) 92 | */ 93 | if (!isa<CallInst>(targUser) && !isa<InvokeInst>(targUser)) { 94 | continue; 95 | } 96 | Instruction *targInst = cast<Instruction>(targUser); 97 | std::unique_ptr<TargetCallSite> tcs(new TargetCallSite(targInst, 98 | argIdx)); 99 | targetCallMap[tct].push_back(std::move(tcs)); 100 | } 101 | } 102 | } 103 | 104 | bool 105 | TargetCallSitesPass::runOnModule(Module &M) 106 | { 107 | errs() << "Running target call sites pass.\n"; 108 | 109 | for (auto k = targetConfigMap.begin(); k != targetConfigMap.end(); ++k) { 110 | TargetCallType t = k->first; 111 | std::string p = k->second; 112 | parseConfig(p, t, &M); 113 | } 114 | return false; 115 | } 116 | char TargetCallSitesPass::ID = 0; 117 | static RegisterPass<TargetCallSitesPass> XX("target-call-sites", ""); 118 | -------------------------------------------------------------------------------- /code/comminute/src/Analysis/TargetCallSitesPass.h: -------------------------------------------------------------------------------- 1 | #ifndef __TARGETCALLSITESPASS_H 2 | #define __TARGETCALLSITESPASS_H 3 | 4 | // Probably should just derive from CallSite, but so it goes.
5 | class TargetCallSite { 6 | CallSite callSite; 7 | int argOperandIndex; 8 | 9 | public: 10 | TargetCallSite(Instruction *c, int i) : callSite(c), argOperandIndex(i) {} 11 | ~TargetCallSite() { } 12 | 13 | int getArgIndex() { 14 | return argOperandIndex; 15 | } 16 | 17 | Instruction *getInstruction() { 18 | return callSite.getInstruction(); 19 | } 20 | 21 | Function *getCaller() { 22 | return callSite.getCaller(); 23 | } 24 | 25 | Function *getCalled() { 26 | return callSite.getCalledFunction(); 27 | } 28 | 29 | Value *getTarget() { 30 | if (argOperandIndex == -1) { 31 | return callSite.getInstruction(); 32 | } 33 | return callSite.getArgOperand(argOperandIndex); 34 | } 35 | }; 36 | 37 | struct TargetCallSitesPass : public ModulePass { 38 | static char ID; 39 | 40 | typedef enum _TargetCallType { 41 | SinkCall, 42 | SourceCall 43 | } TargetCallType; 44 | 45 | TargetCallSitesPass() : ModulePass(ID) { 46 | // There is probably a better, c++-ier way to do this with templates 47 | // or something. 
48 | targetConfigMap[SinkCall] = ""; 49 | targetCallMap[SinkCall].reserve(0); 50 | targetConfigMap[SourceCall] = ""; 51 | targetCallMap[SourceCall].reserve(0); 52 | } 53 | ~TargetCallSitesPass() { 54 | for (auto k = targetCallMap.begin(); k != targetCallMap.end(); ++k) { 55 | targetCallMap[k->first].clear(); 56 | } 57 | } 58 | 59 | virtual bool runOnModule(Module &); 60 | virtual void getAnalysisUsage(AnalysisUsage &) const; 61 | 62 | void setConfig(TargetCallType ty, std::string path) { 63 | targetConfigMap[ty] = path; 64 | } 65 | 66 | typedef std::vector<std::unique_ptr<TargetCallSite>> TargetVector; 67 | typedef TargetVector::const_iterator iterator; 68 | 69 | iterator src_begin() { 70 | return targetCallMap[SourceCall].begin(); 71 | } 72 | iterator src_end() { 73 | return targetCallMap[SourceCall].end(); 74 | } 75 | bool src_empty() { 76 | return targetCallMap[SourceCall].empty(); 77 | } 78 | iterator src_erase(iterator pos) { 79 | return targetCallMap[SourceCall].erase(pos); 80 | } 81 | 82 | iterator snk_begin() { 83 | return targetCallMap[SinkCall].begin(); 84 | } 85 | iterator snk_end() { 86 | return targetCallMap[SinkCall].end(); 87 | } 88 | bool snk_empty() { 89 | return targetCallMap[SinkCall].empty(); 90 | } 91 | iterator snk_erase(iterator pos) { 92 | return targetCallMap[SinkCall].erase(pos); 93 | } 94 | 95 | private: 96 | void parseConfig(std::string, TargetCallType, Module *); 97 | std::map<TargetCallType, std::string> targetConfigMap; 98 | std::map<TargetCallType, TargetVector> targetCallMap; 99 | }; 100 | 101 | 102 | #endif 103 | -------------------------------------------------------------------------------- /code/comminute/src/Comminute.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Comminute 3 | * 4 | * This is a very basic example of using a (legacy) pass 5 | * manager. We configure passes to run based on command 6 | * line program options given. Some passes have dependencies, 7 | * some do not.
It is meant to show how you can 8 | * run passes programmatically rather than 9 | * via opt. 10 | * 11 | */ 12 | 13 | #include "llvm/LinkAllPasses.h" 14 | #include "llvm/IR/LegacyPassManager.h" 15 | #include "llvm/IR/LLVMContext.h" 16 | #include "llvm/IR/Module.h" 17 | #include "llvm/Analysis/MemoryDependenceAnalysis.h" 18 | #include "llvm/IRReader/IRReader.h" 19 | #include "llvm/Support/SourceMgr.h" 20 | #include "llvm/Support/FileSystem.h" 21 | #include "llvm/Support/raw_ostream.h" 22 | #include "llvm/Support/CommandLine.h" 23 | #include "llvm/Bitcode/ReaderWriter.h" 24 | #include "llvm/Bitcode/BitcodeWriterPass.h" 25 | #include "llvm-c/Core.h" 26 | #include "llvm/Analysis/CallGraph.h" 27 | 28 | #include 29 | #include 30 | 31 | using namespace llvm; 32 | 33 | #include "Analysis/TargetCallSitesPass.h" 34 | #include "Transform/FunctionExternalizer.h" 35 | #include "Transform/ChoosePhiValue.h" 36 | #include "Analysis/NaiveConstantArgCheck.h" 37 | #include "Analysis/NaiveSensitiveDataLeak.h" 38 | #include "Analysis/NaiveFileDescLeak.h" 39 | #include "Analysis/PotentiallyDangerousScan.h" 40 | #include "Analysis/PotentiallyDangerousScanUserMethod.h" 41 | #include "Analysis/PotentiallyDangerousScanFunctionPass.h" 42 | 43 | /* 44 | * Command line arguments...
45 | * 46 | * comminute 47 | * 48 | * commands: 49 | * -choose-phi-value :: choose incoming edge to phi node 50 | * -naive-sensitive-data-leak :: very basic sensitive data leak check 51 | * -naive-fd-leak :: very basic file descriptor leak check 52 | * -naive-constant-arg :: check some functions for constant arg usage 53 | * -dangerous-function :: looks for calls to a ``weak'' API 54 | * -dangerous-function-user-method :: " but using User list 55 | * -dangerous-function-fpass :: " but as a Function pass 56 | * 57 | */ 58 | cl::opt<std::string> InputBitcodeFile(cl::Positional, cl::desc(""), 59 | cl::Required); 60 | cl::opt<std::string> OutputBitcodeFile(cl::Positional, cl::desc(""), 61 | cl::Required); 62 | cl::opt<int> ChoosePhiValuePass("choose-phi-value", 63 | cl::desc("Choose value to use from PhiNode (defaults to first)"), 64 | cl::init(-1)); 65 | cl::opt<bool> NaiveSDL("naive-sensitive-data-leak", 66 | cl::desc("Perform Naive Sensitive Data Leak Analysis"), cl::init(false)); 67 | cl::opt<bool> NaiveCAC("naive-constant-arg", 68 | cl::desc("Perform Naive Constant Argument Check"), cl::init(false)); 69 | cl::opt<bool> NaiveFDL("naive-fd-leak", 70 | cl::desc("Perform Naive File Desc Leak check"), cl::init(false)); 71 | cl::opt<bool> PotentiallyDangerous("dangerous-function", 72 | cl::desc("Silly CWE 676"), cl::init(false)); 73 | cl::opt<bool> PotentiallyDangerousUserMethod("dangerous-function-user-method", 74 | cl::desc("Silly CWE 676 using User list"), cl::init(false)); 75 | cl::opt<bool> PotentiallyDangerousFunctionPass("dangerous-function-fpass", 76 | cl::desc("Silly CWE 676 using function pass"), cl::init(false)); 77 | 78 | int 79 | main(int argc, char **argv) 80 | { 81 | std::error_code ec; 82 | 83 | legacy::PassManager passManager; 84 | std::unique_ptr<Module> irModule; 85 | ModulePass *modPass; 86 | SMDiagnostic err; 87 | raw_fd_ostream *outputStream; 88 | 89 | cl::ParseCommandLineOptions(argc, argv); 90 | 91 | std::cout << " Reading input bitcode file: " << InputBitcodeFile << "\n"; 92 | irModule =
parseIRFile(InputBitcodeFile, err, 93 | *unwrap(LLVMGetGlobalContext())); 94 | if (irModule == nullptr) { 95 | std::cout << " IR issue: " << err.getMessage().str() << "\n"; 96 | return -1; 97 | } 98 | 99 | /* 100 | * Add the pass that removes the body of some functions we do not 101 | * wish to have as local to this module. 102 | */ 103 | std::cout << " Adding function externalizer pass.\n"; 104 | FunctionExternalizer *fe = new FunctionExternalizer(); 105 | fe->setFunctionListFilePath("conf/fexternalizer.txt"); 106 | passManager.add(fe); 107 | 108 | /* 109 | * The mem2reg pass promotes alloc+load+store to register based. 110 | * This helps with analysis by reducing load/store chasing. 111 | */ 112 | std::cout << " Adding mem2reg pass.\n"; 113 | passManager.add(createPromoteMemoryToRegisterPass()); 114 | 115 | std::cout << " Adding constant propagation passes.\n"; 116 | passManager.add(createConstantPropagationPass()); 117 | passManager.add(createIPConstantPropagationPass()); 118 | 119 | /* 120 | * The issue here is that one selection is made. To be better, 121 | * you would want to reason, as best you can, about the paths 122 | * selected. The same goes for branch analysis, etc. This is like 123 | * taking an axe to things :-P. 124 | */ 125 | if (ChoosePhiValuePass >= 0) { 126 | std::cout << " Adding phi value selector pass\n"; 127 | std::cout << " Using edge index: " << ChoosePhiValuePass << "\n"; 128 | ChoosePhiValue *c = new ChoosePhiValue(); 129 | c->setEdgeIndex(ChoosePhiValuePass); 130 | passManager.add(c); 131 | } 132 | 133 | if (NaiveSDL) { 134 | /* 135 | * Add the naive sensitive data leak checking pass. 
136 | * 137 | */ 138 | std::cout << " Adding naive sensitive data leak pass.\n"; 139 | TargetCallSitesPass *pt = new TargetCallSitesPass(); 140 | pt->setConfig(TargetCallSitesPass::SourceCall, 141 | "conf/sensitivesource.cfg"); 142 | pt->setConfig(TargetCallSitesPass::SinkCall, 143 | "conf/sensitivesink.cfg"); 144 | passManager.add(pt); 145 | NaiveSensitiveDataLeak *n = new NaiveSensitiveDataLeak(); 146 | passManager.add(n); 147 | } else if (NaiveFDL) { 148 | /* 149 | * Add the fd leak check. 150 | */ 151 | std::cout << " Adding naive file descriptor leak pass.\n"; 152 | TargetCallSitesPass *pt = new TargetCallSitesPass(); 153 | pt->setConfig(TargetCallSitesPass::SourceCall, 154 | "conf/fdsource.cfg"); 155 | pt->setConfig(TargetCallSitesPass::SinkCall, 156 | "conf/fdsink.cfg"); 157 | passManager.add(pt); 158 | NaiveFileDescLeak *n = new NaiveFileDescLeak(); 159 | passManager.add(n); 160 | } else if (NaiveCAC) { 161 | /* 162 | * Add the naive constant argument checker pass. This also adds 163 | * the ConstantPropagation pass which attempts to lower cases like 164 | * int i = 0; foo(i); to foo(0);. This makes it easier for us to 165 | * do the constant checking without having to follow-back. 
166 | * 167 | */ 168 | std::cout << " Adding naive constant argument pass.\n"; 169 | NaiveConstantArgCheck *nca = new NaiveConstantArgCheck(); 170 | nca->setConfigFilePath("conf/constantarg.cfg"); 171 | passManager.add(nca); 172 | } else if (PotentiallyDangerous) { 173 | std::cout << " Adding call graph pass\n"; 174 | passManager.add(new CallGraphWrapperPass()); 175 | std::cout << " Adding dangerous fn scan pass.\n"; 176 | PotentiallyDangerousScan *p = new PotentiallyDangerousScan(); 177 | p->setConfigFilePath("conf/pdfunctions.cfg"); 178 | passManager.add(p); 179 | } else if (PotentiallyDangerousUserMethod) { 180 | std::cout << " Adding dangerous fn scan user method pass.\n"; 181 | PotentiallyDangerousScanUserMethod *p = new PotentiallyDangerousScanUserMethod(); 182 | p->setConfigFilePath("conf/pdfunctions.cfg"); 183 | passManager.add(p); 184 | } else if (PotentiallyDangerousFunctionPass) { 185 | std::cout << " Adding dangerous fn scan function pass.\n"; 186 | PotentiallyDangerousScanFunctionPass *p = new PotentiallyDangerousScanFunctionPass(); 187 | p->setConfigFilePath("conf/pdfunctions.cfg"); 188 | passManager.add(p); 189 | } 190 | 191 | /* 192 | * Open output stream and use that as conduit for writing output 193 | * bitcode file. 194 | * 195 | */ 196 | std::cout << " Adding bitcode writer pass\n"; 197 | outputStream = new raw_fd_ostream(OutputBitcodeFile, ec, sys::fs::F_None); 198 | modPass = createBitcodeWriterPass(*outputStream, false, true); 199 | passManager.add(modPass); 200 | 201 | /* 202 | * Actually run the passes added on this module. With this very 203 | * basic tool, results only go to std[out|err].
204 | * 205 | */ 206 | std::cout << " Running passes\n"; 207 | passManager.run(*irModule.get()); 208 | outputStream->close(); 209 | std::cout << " Finished...\n"; 210 | return 0; 211 | } 212 | -------------------------------------------------------------------------------- /code/comminute/src/Transform/ChoosePhiValue.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Choose a value from a Phi node and replace all 3 | * Phi node uses with it. This ignores some paths, but 4 | * simplifies analysis. Will attempt to drop no longer 5 | * used basic blocks. 6 | */ 7 | #include "llvm/IR/Module.h" 8 | #include "llvm/IR/Function.h" 9 | #include "llvm/Pass.h" 10 | #include "llvm/IR/InstIterator.h" 11 | #include "llvm/IR/CallSite.h" 12 | #include "llvm/IR/Instructions.h" 13 | #include "llvm/IR/SymbolTableListTraits.h" 14 | #include "llvm/Support/raw_ostream.h" 15 | 16 | using namespace llvm; 17 | 18 | #include "ChoosePhiValue.h" 19 | 20 | bool 21 | ChoosePhiValue::runOnModule(Module &M) 22 | { 23 | errs() << "Running choose phi value pass.\n"; 24 | bool rv = false; 25 | 26 | /* 27 | * For every function in this compilation unit, iterate through 28 | * the instructions looking for PHINodes. When a PHINode is found, 29 | * the chosen incoming edge will be used, if it exists, otherwise 30 | * the first (0) is chosen. The edge is removed. 31 | * 32 | * Take all the removed edges and replace the corresponding PHINode 33 | * uses with the edge's value. So, where before things were using 34 | * PHINode as the Value, they will use the incoming edge Value. 35 | * Remove all other incoming edges and then erase the PHINode. 36 | * Then, for each unused incoming edge, attempt to erase its 37 | * BasicBlock.
38 | * 39 | */ 40 | for (auto &f : M) { 41 | std::vector<std::pair<PHINode *, Value *>> replaceList; 42 | 43 | for (auto ii = inst_begin(f); ii != inst_end(f); ++ii) { 44 | Instruction *in = &*ii; 45 | if (PHINode *pn = dyn_cast<PHINode>(in)) { 46 | unsigned usedEdge = edgeIndex; 47 | if (pn->getNumIncomingValues() <= edgeIndex) { 48 | errs() << "Not enough incoming values...using 0\n"; 49 | usedEdge = 0; 50 | } 51 | // Remove chosen incoming value, add to replacement list. 52 | Value *x = pn->removeIncomingValue(usedEdge, false); 53 | replaceList.push_back(std::make_pair(pn, x)); 54 | 55 | } 56 | } 57 | // We know we are going to change the CFG. 58 | if (replaceList.empty() == false) { 59 | rv = true; 60 | } 61 | for (auto pc : replaceList) { 62 | /* Replace all uses of the PHINode with the selected Value */ 63 | pc.first->replaceAllUsesWith(pc.second); 64 | while (pc.first->getNumIncomingValues() > 0) { 65 | Value *d = pc.first->removeIncomingValue((unsigned)0, false); 66 | // Each instruction resides in a BasicBlock 67 | assert(isa<Instruction>(d) == true); 68 | Instruction *vi = cast<Instruction>(d); 69 | BasicBlock *bb = vi->getParent(); 70 | if (bb->user_empty()) { 71 | bb->eraseFromParent(); 72 | continue; 73 | } 74 | // Attempt to remove users of BasicBlock so we can axe it 75 | attemptUserReduction(bb); 76 | if (bb->user_empty()) { 77 | bb->eraseFromParent(); 78 | continue; 79 | } 80 | } 81 | assert(pc.first->user_empty()); 82 | pc.first->eraseFromParent(); 83 | } 84 | #ifdef DEBUG 85 | for (auto ii = inst_begin(f); ii != inst_end(f); ++ii) { 86 | Instruction *in = &*ii; 87 | assert(!isa<PHINode>(in) && "PHINode chooser missed."); 88 | } 89 | #endif 90 | } 91 | return rv; 92 | } 93 | 94 | void 95 | ChoosePhiValue::attemptUserReduction(BasicBlock *bb) 96 | { 97 | /* 98 | * Go through each user of this and we should just be finding 99 | * BranchInst.
If there is just one operand, then this is 100 | * an unconditional branch and cannot be mucked with (unless 101 | * we did deeper analysis), so we just return because 102 | * we know there will always be a user of this BasicBlock. 103 | * 104 | * Otherwise, we make the true/false branch be the same and 105 | * not our basic block. ... Repeat all this until gone 106 | * through all users... 107 | */ 108 | for (auto ui = bb->user_begin(); ui != bb->user_end(); ++ui) { 109 | User *u = *ui; 110 | if (Instruction *i = dyn_cast<Instruction>(u)) { 111 | if (BranchInst *bi = dyn_cast<BranchInst>(i)) { 112 | if (bi->getNumOperands() == 1) { 113 | return; 114 | } 115 | Value *tb = bi->getOperand(1); 116 | Value *fb = bi->getOperand(2); 117 | if (tb == cast<Value>(bb)) { 118 | bi->setOperand(1, fb); 119 | } else if (fb == cast<Value>(bb)) { 120 | bi->setOperand(2, tb); 121 | } 122 | } 123 | } 124 | } 125 | } 126 | 127 | char ChoosePhiValue::ID = 0; 128 | static RegisterPass<ChoosePhiValue> XX("choose-phi-value", 129 | "Choose Phi value to use"); 130 | -------------------------------------------------------------------------------- /code/comminute/src/Transform/ChoosePhiValue.h: -------------------------------------------------------------------------------- 1 | #ifndef __CHOOSEPHIVALUE_H 2 | #define __CHOOSEPHIVALUE_H 3 | 4 | struct ChoosePhiValue : public ModulePass { 5 | private: 6 | unsigned edgeIndex; 7 | void attemptUserReduction(BasicBlock *bb); 8 | 9 | public: 10 | static char ID; 11 | ChoosePhiValue() : ModulePass(ID) { } 12 | virtual bool runOnModule(Module &); 13 | 14 | // phi [ a, b], [ c, d] .... 15 | // this gets us [a, b] if edgeIndex = 0.
16 | void setEdgeIndex(unsigned e) { 17 | edgeIndex = e; 18 | } 19 | unsigned getEdgeIndex() { 20 | return edgeIndex; 21 | } 22 | }; 23 | #endif 24 | -------------------------------------------------------------------------------- /code/comminute/src/Transform/FunctionExternalizer.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Often in Android a thing to do is to 3 | * grab source to a project and just build it 4 | * in with your native code. This is painful from 5 | * the analysis perspective because you (we) do 6 | * not want to scan non-customer code. So, what 7 | * this attempts to do is to look for any functions 8 | * that are libc, libssl, DESLib, etc and convert 9 | * them from their lifted-wrapper versions to 10 | * just be the external declaration. 11 | * 12 | * To do this, you just deleteBody() on the function. 13 | * 14 | */ 15 | #include "llvm/IR/Module.h" 16 | #include "llvm/IR/Function.h" 17 | #include "llvm/Pass.h" 18 | #include "llvm/IR/InstIterator.h" 19 | #include "llvm/IR/CallSite.h" 20 | #include "llvm/IR/Instructions.h" 21 | #include "llvm/IR/SymbolTableListTraits.h" 22 | #include "llvm/Support/raw_ostream.h" 23 | 24 | #include 25 | #include 26 | #include 27 | #include 28 | 29 | using namespace llvm; 30 | 31 | #include "FunctionExternalizer.h" 32 | 33 | bool 34 | FunctionExternalizer::runOnModule(Module &M) 35 | { 36 | errs() << "Running function externalizer pass.\n"; 37 | 38 | 39 | std::ifstream fileHandle(this->_functionListFile); 40 | std::string fnName; 41 | bool rv = false; 42 | 43 | /* 44 | * Each line is a function name to extern. It does not do any 45 | * checking of the target function's signature etc, so be aware. 46 | */ 47 | while (std::getline(fileHandle, fnName)) { 48 | // skip comment line. 49 | if (fnName.find("#", 0) == 0) { 50 | continue; 51 | } 52 | // Does the function exist within this module? 
53 | Function *f = M.getFunction(fnName); 54 | if (f == NULL) { 55 | continue; 56 | } 57 | // Definition is already outside of this module. 58 | if (f->isDeclaration()) { 59 | continue; 60 | } 61 | // Remove the body (definition) of the function. Leave declaration. 62 | errs() << "Deleting body of function: " << f->getName().str() << "\n"; 63 | f->deleteBody(); 64 | rv = true; 65 | } 66 | return rv; 67 | } 68 | char FunctionExternalizer::ID = 0; 69 | static RegisterPass<FunctionExternalizer> XX("fn-extern", 70 | "Function externalizer"); 71 | -------------------------------------------------------------------------------- /code/comminute/src/Transform/FunctionExternalizer.h: -------------------------------------------------------------------------------- 1 | #ifndef __FUNCTIONEXTERNALIZER_H 2 | #define __FUNCTIONEXTERNALIZER_H 3 | 4 | struct FunctionExternalizer : public ModulePass { 5 | private: 6 | std::string _functionListFile; 7 | 8 | public: 9 | static char ID; 10 | FunctionExternalizer() : ModulePass(ID) { } 11 | virtual bool runOnModule(Module &); 12 | void setFunctionListFilePath(std::string a) { this->_functionListFile = a; } 13 | std::string getFunctionListFilePath() { return this->_functionListFile; } 14 | }; 15 | 16 | #endif 17 | -------------------------------------------------------------------------------- /code/comminute/tests/FE001.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | int 4 | FE001_foo() 5 | { 6 | return 0; 7 | } 8 | 9 | int 10 | main(int argc, char **argv) 11 | { 12 | return 1; 13 | } 14 | -------------------------------------------------------------------------------- /code/comminute/tests/NCA001.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NCA001 3 | * 4 | * NaiveConstantArg 001 5 | * 6 | */ 7 | #include 8 | #include 9 | 10 | void 11 | seed_random() 12 | { 13 | 14 | srandom(0xdeadbeef); 15 | } 16 | 17 | int 18 | main(int argc, char **argv) 19 | { 20
| long int rv; 21 | 22 | seed_random(); 23 | rv = random(); 24 | (void)printf("random value = %ld\n", rv); 25 | return 0; 26 | } 27 | -------------------------------------------------------------------------------- /code/comminute/tests/NCA002.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NCA002 3 | * 4 | * NaiveConstantArg 002 5 | * 6 | * The diff between this and 001 is using a pointer 7 | * which causes load/store to occur. They should be 8 | * removed by mem2reg. Let's see. 9 | * 10 | */ 11 | #include 12 | #include 13 | 14 | void 15 | seed_random() 16 | { 17 | unsigned int *seedp; 18 | 19 | seedp = (unsigned int *)malloc(sizeof(unsigned int)); 20 | if (seedp == NULL) return; 21 | *seedp = 0xdeadbeef; 22 | srandom(*seedp); 23 | free(seedp); 24 | } 25 | 26 | int 27 | main(int argc, char **argv) 28 | { 29 | long int rv; 30 | 31 | seed_random(); 32 | rv = random(); 33 | (void)printf("random value = %ld\n", rv); 34 | return 0; 35 | } 36 | -------------------------------------------------------------------------------- /code/comminute/tests/NFDL001.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NFDL001 3 | * 4 | * Naive File Descriptor Leak 001 5 | * 6 | */ 7 | #include 8 | #include 9 | #include 10 | 11 | void 12 | leaks_fd() 13 | { 14 | int fd; 15 | fd = socket(AF_INET, SOCK_STREAM, 0); 16 | if (fd == -1) { 17 | perror("socket"); 18 | } 19 | return; 20 | } 21 | 22 | int 23 | main(int argc, char **argv) 24 | { 25 | leaks_fd(); 26 | return 0; 27 | } 28 | -------------------------------------------------------------------------------- /code/comminute/tests/NFDL002.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NFDL002 3 | * 4 | * Naive File Descriptor Leak 002 5 | * .. well this does not leak. 
6 | * 7 | */ 8 | #include 9 | #include 10 | #include 11 | #include 12 | 13 | void 14 | foo(char *k) 15 | { 16 | return; 17 | } 18 | 19 | void 20 | leaks_fd() 21 | { 22 | int fd; 23 | 24 | 25 | fd = socket(AF_INET, SOCK_STREAM, 0); 26 | if (fd == -1) { 27 | perror("socket"); 28 | } 29 | close(fd); 30 | return; 31 | } 32 | 33 | int 34 | main(int argc, char **argv) 35 | { 36 | leaks_fd(); 37 | return 0; 38 | } 39 | -------------------------------------------------------------------------------- /code/comminute/tests/NFDL003.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NFDL003 3 | * 4 | * Naive File Descriptor Leak 003 5 | * 6 | * copies fd to another local variable and closes that... so no leak 7 | */ 8 | #include 9 | #include 10 | #include 11 | #include 12 | 13 | void 14 | foo(char *k) 15 | { 16 | return; 17 | } 18 | 19 | void 20 | leaks_fd() 21 | { 22 | int fd; 23 | int fd2; 24 | 25 | fd2 = 0; 26 | fd = socket(AF_INET, SOCK_STREAM, 0); 27 | if (fd == -1) { 28 | perror("socket"); 29 | } 30 | fd2 = fd; 31 | close(fd2); 32 | return; 33 | } 34 | 35 | int 36 | main(int argc, char **argv) 37 | { 38 | leaks_fd(); 39 | return 0; 40 | } 41 | -------------------------------------------------------------------------------- /code/comminute/tests/NFDL004.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NFDL004 3 | * 4 | * Naive File Descriptor Leak 004 5 | * 6 | * One leaks one does not 7 | * 8 | */ 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | void 15 | foo(char *k) 16 | { 17 | return; 18 | } 19 | 20 | void 21 | leaks_fd() 22 | { 23 | int fd, fd2; 24 | 25 | 26 | fd = socket(AF_INET, SOCK_STREAM, 0); 27 | if (fd == -1) { 28 | perror("socket"); 29 | } 30 | fd2 = socket(AF_INET, SOCK_STREAM, 0); 31 | if (fd2 == -1) { 32 | perror("socket"); 33 | } 34 | close(fd); 35 | return; 36 | } 37 | 38 | int 39 | main(int argc, char **argv) 40 | { 41 | leaks_fd(); 42 | return 
0; 43 | } 44 | -------------------------------------------------------------------------------- /code/comminute/tests/NFDL005.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NFDL005 3 | * 4 | * Naive File Descriptor Leak 005 5 | * 6 | * global variable that ``seems'' to leak. 7 | * 8 | */ 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | void 15 | foo(char *k) 16 | { 17 | return; 18 | } 19 | 20 | int fd; 21 | 22 | void 23 | leaks_fd() 24 | { 25 | fd = socket(AF_INET, SOCK_STREAM, 0); 26 | if (fd == -1) { 27 | perror("socket"); 28 | } 29 | return; 30 | } 31 | 32 | int 33 | main(int argc, char **argv) 34 | { 35 | leaks_fd(); 36 | return 0; 37 | } 38 | -------------------------------------------------------------------------------- /code/comminute/tests/NFDL006.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NFDL006 3 | * 4 | * Naive File Descriptor Leak 006 5 | * 6 | * global variable that does not leak, but naive analysis won't catch.
7 | * 8 | */ 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | void 15 | foo(char *k) 16 | { 17 | return; 18 | } 19 | 20 | int fd; 21 | 22 | void 23 | leaks_fd() 24 | { 25 | fd = socket(AF_INET, SOCK_STREAM, 0); 26 | if (fd == -1) { 27 | perror("socket"); 28 | } 29 | return; 30 | } 31 | 32 | void 33 | closes_fd() 34 | { 35 | close(fd); 36 | } 37 | 38 | int 39 | main(int argc, char **argv) 40 | { 41 | leaks_fd(); 42 | closes_fd(); 43 | return 0; 44 | } 45 | -------------------------------------------------------------------------------- /code/comminute/tests/NFDL007.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NFDL007 3 | * 4 | * Naive File Descriptor Leak 007 5 | * 6 | * opens and closes a global var fd 7 | * 8 | */ 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | void 15 | foo(char *k) 16 | { 17 | return; 18 | } 19 | 20 | int fd; 21 | 22 | void 23 | leaks_fd() 24 | { 25 | fd = socket(AF_INET, SOCK_STREAM, 0); 26 | if (fd == -1) { 27 | perror("socket"); 28 | } 29 | close(fd); 30 | return; 31 | } 32 | 33 | int 34 | main(int argc, char **argv) 35 | { 36 | leaks_fd(); 37 | return 0; 38 | } 39 | -------------------------------------------------------------------------------- /code/comminute/tests/NSDL001.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NSDL001 3 | * 4 | * Naive Sensitive Data Leak 001 5 | * 6 | */ 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | void 15 | leaks_passwd() 16 | { 17 | char *p; 18 | struct addrinfo hints, *result; 19 | 20 | p = getpass("enter passwd: "); 21 | /* l.v. p is now tainted with sensitive data */ 22 | memset(&hints, 0, sizeof(struct addrinfo)); 23 | hints.ai_family = AF_UNSPEC; 24 | hints.ai_socktype = SOCK_DGRAM; 25 | hints.ai_flags = 0; 26 | hints.ai_protocol = 0; 27 | /* leak password via getaddrinfo() DNS lookup. contrived af.
*/ 28 | (void)getaddrinfo(p, "http", &hints, &result); 29 | } 30 | 31 | int 32 | main(int argc, char **argv) 33 | { 34 | leaks_passwd(); 35 | return 0; 36 | } 37 | -------------------------------------------------------------------------------- /code/comminute/tests/NSDL002.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NSDL002 3 | * 4 | * Naive Sensitive Data Leak 002 5 | * 6 | */ 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | 15 | void 16 | leaks_passwd(unsigned lookup) 17 | { 18 | char *p; 19 | struct addrinfo hints, *result; 20 | 21 | p = getpass("enter passwd: "); 22 | /* l.v. p is now tainted with sensitive data */ 23 | memset(&hints, 0, sizeof(struct addrinfo)); 24 | hints.ai_family = AF_UNSPEC; 25 | hints.ai_socktype = SOCK_DGRAM; 26 | hints.ai_flags = 0; 27 | hints.ai_protocol = 0; 28 | /* leak password via getaddrinfo() DNS lookup. contrived af. */ 29 | memset(p, 0, strlen(p)); // XXX :PpPp 30 | (void)getaddrinfo(p, "http", &hints, &result); 31 | } 32 | int 33 | main(int argc, char **argv) 34 | { 35 | leaks_passwd(random()); 36 | return 0; 37 | } 38 | -------------------------------------------------------------------------------- /code/comminute/tests/NSDL003.c: -------------------------------------------------------------------------------- 1 | /* 2 | * NSDL003 3 | * 4 | * Naive Sensitive Data Leak 003 5 | * 6 | */ 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | 15 | void 16 | leaks_passwd(unsigned lookup) 17 | { 18 | char *p, *a2l = "www.cw-complex.com"; 19 | struct addrinfo hints, *result; 20 | if (lookup) { 21 | p = getpass("enter passwd: "); 22 | } else { 23 | p = a2l; 24 | } 25 | memset(&hints, 0, sizeof(struct addrinfo)); 26 | hints.ai_family = AF_UNSPEC; 27 | hints.ai_socktype = SOCK_DGRAM; 28 | hints.ai_flags = 0; 29 | hints.ai_protocol = 0; 30 | (void)getaddrinfo(p, "http", &hints, &result); 
/* which path did p take? */ 31 | } 32 | int 33 | main(int argc, char **argv) 34 | { 35 | leaks_passwd(random()); 36 | return 0; 37 | } 38 | -------------------------------------------------------------------------------- /code/comminute/tests/PD001.c: -------------------------------------------------------------------------------- 1 | /* 2 | * PD001 3 | * 4 | * Potentially Dangerous 001 5 | * 6 | */ 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | void 15 | blind_copy() 16 | { 17 | char buf[512]; 18 | char buf2[32]; 19 | struct addrinfo hints, *result; 20 | 21 | memset(&hints, 0, sizeof(struct addrinfo)); 22 | hints.ai_family = AF_UNSPEC; 23 | hints.ai_socktype = SOCK_DGRAM; 24 | hints.ai_flags = 0; 25 | hints.ai_protocol = 0; 26 | 27 | (void)getaddrinfo("www.evilgoogle.com", "http", &hints, &result); 28 | int sfd = socket(result->ai_family, result->ai_socktype, result->ai_protocol); 29 | if (sfd == -1) { 30 | return; 31 | } 32 | memset(&buf, 0, 512); 33 | connect(sfd, result->ai_addr, result->ai_addrlen); 34 | read(sfd, &buf, 511); 35 | strcpy(buf2, buf); 36 | 37 | } 38 | 39 | int 40 | main(int argc, char **argv) 41 | { 42 | blind_copy(); 43 | return 0; 44 | } 45 | -------------------------------------------------------------------------------- /code/comminute/thirdparty/jsoncpp-1.8.0.tar.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/code/comminute/thirdparty/jsoncpp-1.8.0.tar.gz -------------------------------------------------------------------------------- /code/fpskel/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.9 2 | LLVM_HOME=/usr/bin 3 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 4 | 5 | ifndef VERBOSE 6 | QUIET:=@ 7 | endif 8 | 9 | SRC_DIR?=$(PWD)/src 10 | 11 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 12 | 
CC=$(LLVM_HOME)/clang-$(LLVM_VER) 13 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 14 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 15 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 16 | 17 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 18 | LDFLAGS+=-shared -Wl,-O1 19 | 20 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 21 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 22 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 23 | 24 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 25 | CPPFLAGS+=-I$(SRC_DIR) 26 | 27 | 28 | PASS=FPSkel.so 29 | PASS_OBJECTS=FPSkel.o 30 | 31 | default: prep $(PASS) 32 | 33 | prep: 34 | $(QUIET)mkdir -p built 35 | 36 | %.o : $(SRC_DIR)/%.cpp 37 | @echo Compiling $*.cpp 38 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 39 | 40 | $(PASS) : $(PASS_OBJECTS) 41 | @echo Linking $@ 42 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 43 | 44 | clean: 45 | $(QUIET)rm -rf built 46 | 47 | -------------------------------------------------------------------------------- /code/fpskel/README.md: -------------------------------------------------------------------------------- 1 | 2 | # FPSkel 3 | 4 | This is a function pass skeleton. 5 | 6 | 7 | # Build & Run 8 | 9 | First check the Makefile to set path to llvm-config and version. 10 | 3.8, 3.9 should be fine, so should 4.0 11 | 12 | ``` 13 | $ make 14 | $ opt-X.Y -load built/FPSkel.so -fpskel < file.bc 15 | ... 
16 | $ 17 | ``` 18 | 19 | 20 | -------------------------------------------------------------------------------- /code/fpskel/src/FPSkel.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/Function.h" 3 | #include "llvm/IR/Instructions.h" 4 | #include "llvm/Support/raw_ostream.h" 5 | 6 | using namespace llvm; 7 | 8 | #include "FPSkel.h" 9 | 10 | void 11 | FPSkel::getAnalysisUsage(AnalysisUsage &AU) const 12 | { 13 | AU.setPreservesCFG(); 14 | } 15 | 16 | 17 | bool 18 | FPSkel::runOnFunction(Function &F) 19 | { 20 | unsigned nbb = 0; 21 | unsigned ins = 0; 22 | 23 | if (F.isDeclaration()) { 24 | errs() << "Ignoring function declaration.\n"; 25 | return false; 26 | } 27 | if (F.hasName()) { 28 | errs() << "\nFunction: " << F.getName() << "\n"; 29 | } else { 30 | errs() << "\nFunction: not named\n"; 31 | } 32 | 33 | for (auto &B : F) { // Iterate through Basic Blocks in a function 34 | ++nbb; 35 | errs() << " Basic Block found:\n"; 36 | B.dump(); 37 | for (auto &I : B) { // Iterate through instructions in the block 38 | ++ins; 39 | } 40 | errs() << " --- end of basic block ---\n"; 41 | } 42 | errs() << " Total of " << nbb << " blocks in this function\n"; 43 | errs() << " Total of " << ins << " instructions in this function\n"; 44 | errs() << "--- end of function ---\n"; 45 | 46 | // Return true only if the function was modified; this pass only inspects it. 47 | return false; 48 | } 49 | 50 | /* 51 | * Register this pass to be made usable with -fpskel option. 52 | * Needs the static ID initialized and the pass declaration given.
53 | */ 54 | char FPSkel::ID = 0; 55 | static RegisterPass<FPSkel> XX("fpskel", "Function Pass Skeleton"); 56 | 57 | -------------------------------------------------------------------------------- /code/fpskel/src/FPSkel.h: -------------------------------------------------------------------------------- 1 | #ifndef __FPSKEL_H 2 | #define __FPSKEL_H 3 | 4 | struct FPSkel : public FunctionPass { 5 | /* 6 | * For all of your passes you will need this and to define it. 7 | * Its address is used by the pass system, so the value does not matter. 8 | */ 9 | static char ID; 10 | 11 | FPSkel() : FunctionPass(ID) { } 12 | 13 | // Called on each function in given compilation unit 14 | virtual bool runOnFunction(Function &); 15 | 16 | /* 17 | * Used to help order passes by pass manager. 18 | * Declare any passes you need run prior here.. as well as 19 | * any information such as preserving CFG or similar. 20 | */ 21 | virtual void getAnalysisUsage(AnalysisUsage &) const; 22 | 23 | }; 24 | 25 | #endif 26 | -------------------------------------------------------------------------------- /code/intflip/Makefile: -------------------------------------------------------------------------------- 1 | # 2 | # This is all a bit hack-ish, but should give the idea 3 | # about llvm-config usage, which is the key tool to make 4 | # life easy^Hier. 5 | # 6 | # make jsoncpp (once) 7 | # make 8 | # 9 | # I assume you have things in /usr/bin. Often this is the 10 | # case, but providing a path for you to set.
11 | # 12 | LLVM_VER=3.9 13 | LLVM_HOME=/usr/bin 14 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 15 | 16 | ifndef VERBOSE 17 | QUIET:=@ 18 | endif 19 | 20 | SRC_DIR?=$(PWD)/src 21 | 22 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 23 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 24 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 25 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 26 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 27 | 28 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 29 | LDFLAGS+=-Lthirdparty/jsoncpp-1.8.0/build/src/lib_json -ljsoncpp 30 | LDFLAGS+=-shared -Wl,-O1 31 | 32 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 33 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 34 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 35 | 36 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 37 | CPPFLAGS+=-I$(SRC_DIR) -Ithirdparty/jsoncpp-1.8.0/include 38 | 39 | 40 | PASS=libIntFlip.so 41 | PASS_OBJECTS=FlipConfig.o \ 42 | LiftConstantIntPass.o \ 43 | ReplaceRandomizer.o \ 44 | BitFlipRandomizer.o \ 45 | InjectRandomizers.o \ 46 | IntReplacerVisitor.o \ 47 | IntReplacerIterate.o 48 | 49 | 50 | # IntReplacerVisitor.o 51 | 52 | 53 | default: prep $(PASS) 54 | 55 | # Quite the hack :-P 56 | jsoncpp: 57 | @echo Building jsoncpp-1.8.0 58 | cd thirdparty && \ 59 | tar zxvf jsoncpp-1.8.0.tar.gz && \ 60 | cd jsoncpp-1.8.0 && \ 61 | rm -rf build && \ 62 | mkdir -p build && \ 63 | cd build && \ 64 | cmake .. 
&& \ 65 | make && \ 66 | cd ../../ 67 | 68 | 69 | prep: 70 | $(QUIET)mkdir -p built 71 | 72 | %.o : $(SRC_DIR)/%.cpp 73 | @echo "CPPFLAGS: ${CPPFLAGS}" 74 | @echo "CXXFLAGS: ${CXXFLAGS}" 75 | @echo Compiling $*.cpp 76 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 77 | 78 | $(PASS) : $(PASS_OBJECTS) 79 | @echo Linking $@ 80 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 81 | 82 | clean: 83 | $(QUIET)rm -rf built 84 | 85 | jsonclean: 86 | $(QUIET)rm -rf thirdparty/jsoncpp-1.8.0 87 | 88 | tests: 89 | $(QUIET)echo "Generating bitcode from C" 90 | $(QUIET)$(CC) -emit-llvm -c -o test/foo.bc test/foo.c 91 | $(QUIET)echo "Lifting constants to local variables" 92 | $(QUIET)$(OPT) -load built/libIntFlip.so -lift-constant-int-args -o test/foo2.bc < test/foo.bc 93 | $(QUIET)echo "Injecting randomizer functions" 94 | $(QUIET)$(OPT) -load built/libIntFlip.so -inject-randomizers -o test/foo3.bc < test/foo2.bc 95 | $(QUIET)echo "Replacing ints with visitor method based on replace.cfg" 96 | $(QUIET)$(OPT) -load built/libIntFlip.so -replace-ints-visitor -o test/foo4.bc -repcfg=replace.cfg < test/foo3.bc 97 | $(QUIET)echo "Replacing ints with iteration method based on replace.cfg" 98 | $(QUIET)$(OPT) -load built/libIntFlip.so -replace-ints-iterate -o test/foo5.bc -repcfg=replace.cfg < test/foo3.bc 99 | $(QUIET)echo "llvm-dissing..." 
100 | $(QUIET)$(DIS) --o=test/foo.ll test/foo.bc 101 | $(QUIET)$(DIS) --o=test/foo2.ll test/foo2.bc 102 | $(QUIET)$(DIS) --o=test/foo3.ll test/foo3.bc 103 | $(QUIET)$(DIS) --o=test/foo4.ll test/foo4.bc 104 | $(QUIET)$(DIS) --o=test/foo5.ll test/foo5.bc 105 | $(QUIET)echo "Building foo, foo4, foo5 executables" 106 | $(QUIET)$(CC) -o test/foo5 test/foo5.bc -pthread -lbsd 107 | $(QUIET)$(CC) -o test/foo4 test/foo4.bc -pthread -lbsd 108 | $(QUIET)$(CC) -o test/foo test/foo.bc -pthread -lbsd 109 | $(QUIET)echo "" 110 | $(QUIET)echo "Generating bitcode from C++ source" 111 | $(QUIET)$(CXX) -emit-llvm -c -o test/foopp.bc test/foo.cpp 112 | $(QUIET)echo "Lifting constants to local variables" 113 | $(QUIET)$(OPT) -load built/libIntFlip.so -lift-constant-int-args -o test/foopp2.bc < test/foopp.bc 114 | $(QUIET)echo "Injecting randomizer functions" 115 | $(QUIET)$(OPT) -load built/libIntFlip.so -inject-randomizers -o test/foopp3.bc < test/foopp2.bc 116 | $(QUIET)echo "Replacing ints with visitor method based on replacepp.cfg" 117 | $(QUIET)$(OPT) -load built/libIntFlip.so -replace-ints-visitor -o test/foopp4.bc -repcfg=replacepp.cfg < test/foopp3.bc 118 | $(QUIET)echo "Replacing ints with iteration method based on replacepp.cfg" 119 | $(QUIET)$(OPT) -load built/libIntFlip.so -replace-ints-iterate -o test/foopp5.bc -repcfg=replacepp.cfg < test/foopp3.bc 120 | $(QUIET)echo "llvm-dissing..." 
121 | $(QUIET)$(DIS) --o=test/foopp.ll test/foopp.bc 122 | $(QUIET)$(DIS) --o=test/foopp2.ll test/foopp2.bc 123 | $(QUIET)$(DIS) --o=test/foopp3.ll test/foopp3.bc 124 | $(QUIET)$(DIS) --o=test/foopp4.ll test/foopp4.bc 125 | $(QUIET)$(DIS) --o=test/foopp5.ll test/foopp5.bc 126 | $(QUIET)echo "Building foopp, foopp4, foopp5 executables" 127 | $(QUIET)$(CXX) -o test/foopp5 test/foopp5.bc -pthread -lbsd 128 | $(QUIET)$(CXX) -o test/foopp4 test/foopp4.bc -pthread -lbsd 129 | $(QUIET)$(CXX) -o test/foopp test/foopp.bc -pthread -lbsd 130 | 131 | cleantests: 132 | rm -f test/*.bc test/*.ll foo foo4 foo5 foopp foopp4 foopp5 133 | 134 | cleanall: clean jsonclean cleantests 135 | -------------------------------------------------------------------------------- /code/intflip/README.md: -------------------------------------------------------------------------------- 1 | 2 | # intflip 3 | 4 | Currently, what this does is replaces all 8, 16, 32, and 64 bit integers arguments to functions 5 | with possibly randomly changed values. The purpose is to re-run test suites 6 | with the modified application in order to simulate the low-probability bit or 7 | more flips that could occur in certain extreme situations. The aim is to perform 8 | some basic analysis as to the stability of the code given random changes. 9 | 10 | ## Aside 11 | 12 | - I do not do much of anything with being sane about memory usage. You're warned. :-P 13 | 14 | - The way I do the RNG bit is possibly overkill. *shrug*. ISO C standard 15 | one should be fine. 16 | 17 | - The randomizer insertion could just be C code that gets linked in, but 18 | going through the generation and insertion of a function is a good 19 | exercise. 20 | 21 | # Requirements 22 | 23 | - LLVM 3.9.0 (should work with 3.8 & 4.0) 24 | 25 | # Building 26 | 27 | $ make jsoncpp 28 | $ make 29 | 30 | # Process 31 | 32 | All executables listed below are supposed to be in your path. You will 33 | ``compile'' to IR and then work on that. 
I would follow the basic steps 34 | and then determine how you want to handle performing the analysis. The 35 | analysis would be a combination of running unit, or other, tests with 36 | the injected randomizers and analyzing how those runs performed given 37 | the probability distribution you are looking at. 38 | 39 | ## Passes 40 | 41 | This is a list of passes available. They are intended to be used per the 42 | ``Basic Work Flow'' section below. 43 | 44 | - -lift-constant-int-args 45 | - -inject-randomizers 46 | - -replace-ints-visitor 47 | - -replace-ints-iterate 48 | - -replace-ints-cgpass 49 | 50 | ## Basic Work Flow 51 | 52 | *Setup replace.cfg* 53 | 54 | The file replace.cfg informs the passes how to set up the random integer replacement action. Each 55 | key names a function; any call instruction /in/ that function that has at least one 56 | integer argument is rewritten so that the argument may be replaced by a randomly selected value. 57 | 58 | Each value is a dict that should have the keys/values: 59 | - "analyze" : true|false 60 | - "type" : 0|1 ... 0 for randomly replace with random value. 1 for randomly replace with random bit flip 61 | - "mean" : a 32-bit random number is drawn, and the argument is changed only if that number is <= mean. 62 | The form (in the shipped replace.cfg these entries sit under the top-level "functions" key): 63 | { 64 | "foo" : { 65 | "analyze" : true, 66 | "type" : 0, 67 | "mean" : 500 68 | } 69 | } 70 | 71 | *Compile your code to IR* 72 | 73 | > clang-3.9 -emit-llvm -o foo.bc -c foo.c 74 | 75 | Of course with a larger code base there will be more work involved. 76 | There is a tool out there that helps with, at least, the merging 77 | of multiple bitcode files; it may be found at [whole-program-llvm](https://github.com/travitch/whole-program-llvm "whole-program-llvm"). 78 | 79 | *Lift ConstantInt to local variable* 80 | 81 | > opt-3.9 -load built/libIntFlip.so -lift-constant-int-args -o foo2.bc < foo.bc 82 | 83 | This will convert constant integers into local variables.
This helps 84 | the next step, which replaces integer arguments with calls to a randomizer 85 | function: there is no special casing for constant integers.. we just go after 86 | the local vars. 87 | 88 | *Inject the randomizer functions* 89 | 90 | > opt-3.9 -load built/libIntFlip.so -inject-randomizers -o foo3.bc < foo2.bc 91 | 92 | This generates the functions that will possibly change integer values 93 | and injects them into the bitcode file. We could just link in code, but 94 | we do this to demonstrate some writing of our own functions via the API. 95 | 96 | *Modify integer arguments to use randomizer functions* 97 | 98 | > opt-3.9 -load built/libIntFlip.so -replace-ints-visitor -repcfg=replace.cfg -o foo4.bc < foo3.bc 99 | 100 | *Build the executable* 101 | 102 | > llc-3.9 -o=foo4.s foo4.bc 103 | > clang++-3.9 -o foo4 foo4.s 104 | 105 | or however you want... 106 | 107 | *Dump IR from resultant bitcode file* 108 | 109 | > llvm-dis-3.9 -o=foo.ll foo.bc 110 | 111 | You can do the above for each bitcode file and compare them. 112 | 113 | 114 | # Some things one might wish to do 115 | 116 | - Wrap all and use a pass manager 117 | -- add pass dependencies for ordering 118 | - Improved control over configurations 119 | - Build out a whole test run harness 120 | -- run test cases that have known outcomes (typically, just unit tests and some other) 121 | -- adjusting probability distribution mean and seeing how that impacts results 122 | - Improved probability distributions.. 123 | -- evolutionary (evolve with execution steps) 124 | - Overall model of app that is intelligently injected based on a model of a specific event 125 | -- that would be like a model based on a real gamma ray burst or something weird 126 | - Randomly change instructions 127 | -- Could be done in a few ways, but would want to be arch specific 128 | -- to find all 1-bit mutable instructions for instruction i, find all instructions j s.t. dist(i,j) = 1 129 | -- similar for 2 bit.. just make dist(i,j) = 2...
130 | -- Then lift those sets to IR 131 | -- so.. you attempt to fix it in IR.. this will not always work because of ``nice graph'' desires. 132 | - whatever. 133 | 134 | 135 | # Inspirational credits 136 | - NASA 137 | - John Regehr (Utah) 138 | - Gamma rays, alpha particles 139 | -------------------------------------------------------------------------------- /code/intflip/replace.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "global" : { 3 | }, 4 | "skip" : [ 5 | "printf" 6 | ], 7 | "functions" : { 8 | "thd_bf8" : { 9 | "analyze" : true, 10 | "type" : 1, 11 | "mean" : 100000000 12 | }, 13 | "thd_bf16" : { 14 | "analyze" : true, 15 | "type" : 1, 16 | "mean" : 100000000 17 | }, 18 | "thd_bf32" : { 19 | "analyze" : true, 20 | "type" : 1, 21 | "mean" : 100000000 22 | }, 23 | "thd_bf64" : { 24 | "analyze" : true, 25 | "type" : 1, 26 | "mean" : 1000000000 27 | }, 28 | "thd_rr8" : { 29 | "analyze" : true, 30 | "type" : 0, 31 | "mean" : 100000000 32 | }, 33 | "thd_rr16" : { 34 | "analyze" : true, 35 | "type" : 0, 36 | "mean" : 100000000 37 | }, 38 | "thd_rr32" : { 39 | "analyze" : true, 40 | "type" : 0, 41 | "mean" : 100000000 42 | }, 43 | "thd_rr64" : { 44 | "analyze" : true, 45 | "type" : 0, 46 | "mean" : 100000000 47 | } 48 | } 49 | } 50 | -------------------------------------------------------------------------------- /code/intflip/replacepp.cfg: -------------------------------------------------------------------------------- 1 | { 2 | "global" : { 3 | }, 4 | "skip" : [ 5 | "printf" 6 | ], 7 | "functions" : { 8 | "_ZL7thd_bf8Pv" : { 9 | "analyze" : true, 10 | "type" : 1, 11 | "mean" : 100000000 12 | }, 13 | "_ZL8thd_bf16Pv" : { 14 | "analyze" : true, 15 | "type" : 1, 16 | "mean" : 100000000 17 | }, 18 | "_ZL8thd_bf32Pv" : { 19 | "analyze" : true, 20 | "type" : 1, 21 | "mean" : 100000000 22 | }, 23 | "_ZL8thd_bf64Pv" : { 24 | "analyze" : true, 25 | "type" : 1, 26 | "mean" : 1000000000 27 | }, 28 | "_ZL7thd_rr8Pv" : { 
29 | "analyze" : true, 30 | "type" : 0, 31 | "mean" : 100000000 32 | }, 33 | "_ZL8thd_rr16Pv" : { 34 | "analyze" : true, 35 | "type" : 0, 36 | "mean" : 100000000 37 | }, 38 | "_ZL8thd_rr32Pv" : { 39 | "analyze" : true, 40 | "type" : 0, 41 | "mean" : 100000000 42 | }, 43 | "_ZL8thd_rr64Pv" : { 44 | "analyze" : true, 45 | "type" : 0, 46 | "mean" : 100000000 47 | } 48 | } 49 | } 50 | -------------------------------------------------------------------------------- /code/intflip/src/BaseRandomizer.h: -------------------------------------------------------------------------------- 1 | #ifndef __BASERANDOMIZER_H 2 | #define __BASERANDOMIZER_H 3 | 4 | class BaseRandomizer { 5 | public: 6 | static void inject(llvm::Module& M); 7 | }; 8 | 9 | typedef enum { 10 | Replace = 1, 11 | BitFlip = 2 12 | } RandomizerKind; 13 | 14 | #endif // !__BASERANDOMIZER_H 15 | -------------------------------------------------------------------------------- /code/intflip/src/BitFlipRandomizer.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/IRBuilder.h" 3 | 4 | #include "BitFlipRandomizer.h" 5 | 6 | using namespace llvm; 7 | 8 | /* 9 | * This macro helps with the varying bit cases. It generates the code 10 | * that emits the following function, where K = nBits: 11 | * 12 | * intK_t 13 | * __bitflip_randomizer_iK__(intK_t inArg0, int32_t meanValue) 14 | * { 15 | * unsigned rv = arc4random(); 16 | * if (rv <= (unsigned)meanValue) { 17 | * rv = 1 << (rv % K); 18 | * return inArg0 ^ rv; 19 | * } 20 | * return inArg0; 21 | * } 22 | * 23 | * So this says get a random number and if it is less than or equal to the 24 | * given mean then we flip a random bit of the input integer and return it. 25 | * Otherwise, we just return it unscathed.
26 | * 27 | */ 28 | #define injectBFIntN(nBits) \ 29 | static void \ 30 | injectInt##nBits(llvm::Module& M, Function *fnRand) \ 31 | { \ 32 | /* \ 33 | * Add the function intN_t __bitflip_randomizer_iN__(intN_t, int32_t) \ 34 | */ \ 35 | LLVMContext& ctx = M.getContext(); \ 36 | /* \ 37 | * define i8 @__bitflip_randomizer_i8__(i8 %intToFlip, i32 %meanValue) { \ 38 | * } \ 39 | */ \ 40 | std::string int##nBits##_rand = "__bitflip_randomizer_i" #nBits "__"; \ 41 | Constant *cTmp = M.getOrInsertFunction(int##nBits##_rand, \ 42 | Type::getInt##nBits##Ty(ctx), \ 43 | Type::getInt##nBits##Ty(ctx), \ 44 | Type::getInt32Ty(ctx), \ 45 | NULL); \ 46 | Function *bf_i##nBits = cast<Function>(cTmp); \ 47 | bf_i##nBits->setCallingConv(CallingConv::C); \ 48 | \ 49 | Argument& inArg0 = bf_i##nBits->getArgumentList().front(); \ 50 | inArg0.setName("intToFlip"); \ 51 | Argument& inArg1 = bf_i##nBits->getArgumentList().back(); \ 52 | inArg1.setName("meanValue"); \ 53 | \ 54 | /* \ 55 | * entry: \ 56 | * bf_it: \ 57 | * return: \ 58 | */ \ 59 | BasicBlock *blkEntry = BasicBlock::Create(ctx, "entry", bf_i##nBits); \ 60 | BasicBlock *blkBitFlipIt = BasicBlock::Create(ctx, "bf_it", bf_i##nBits); \ 61 | BasicBlock *blkReturn = BasicBlock::Create(ctx, "return", bf_i##nBits); \ 62 | /* \ 63 | * entry: \ 64 | * %__bf_rand_ = call i32 @arc4random() \ 65 | * %__bf_lessthan_ = icmp ule i32 %__bf_rand_, %meanValue \ 66 | * br i1 %__bf_lessthan_, label %bf_it, label %return \ 67 | */ \ 68 | IRBuilder<> builder(blkEntry); \ 69 | Value *callArc4Random = builder.CreateCall(fnRand, None, "__bf_rand_", nullptr); \ 70 | Value *lessThan = builder.CreateICmpULE(callArc4Random, &inArg1, "__bf_lessthan_"); \ 71 | Value *branchBitFlip = builder.CreateCondBr(lessThan, blkBitFlipIt, blkReturn); \ 72 | \ 73 | /* \ 74 | * bf_it: ; preds = %entry \ 75 | * %__bf_bitflip_ = urem i32 %__bf_rand_, 8 \ 76 | * %__bf_cast_randrem_ = trunc i32 %__bf_bitflip_ to i8 \ 77 | * %__bf_shifted_bit_ = shl i8 1, %__bf_cast_randrem_ \ 78 |
* %__bf_xord_retval_ = xor i8 %intToFlip, %__bf_shifted_bit_ \ 79 | * ret i8 %__bf_xord_retval_ \ 80 | */ \ 81 | builder.SetInsertPoint(blkBitFlipIt); \ 82 | Value *randModulus = ConstantInt::get(IntegerType::get(ctx, 32), nBits, false); \ 83 | Value *randRemainder = builder.CreateURem(callArc4Random, randModulus, \ 84 | "__bf_bitflip_"); \ 85 | Value *defaultBit = ConstantInt::get(IntegerType::get(ctx, nBits), 1, false); \ 86 | Value *castRandRem = builder.CreateZExtOrTrunc(randRemainder, \ 87 | Type::getInt##nBits##Ty(ctx), "__bf_cast_randrem_"); \ 88 | Value *shiftedBit = builder.CreateShl(defaultBit, castRandRem, \ 89 | "__bf_shifted_bit_"); \ 90 | Value *xordReturnVal = builder.CreateXor(&inArg0, shiftedBit, "__bf_xord_retval_"); \ 91 | builder.CreateRet(xordReturnVal); \ 92 | \ 93 | /* \ 94 | * return: ; preds = %entry \ 95 | * ret i8 %intToFlip \ 96 | */ \ 97 | builder.SetInsertPoint(blkReturn); \ 98 | builder.CreateRet(&inArg0); \ 99 | } \ 100 | 101 | 102 | 103 | injectBFIntN(64) 104 | injectBFIntN(32) 105 | injectBFIntN(16) 106 | injectBFIntN(8) 107 | 108 | void 109 | BitFlipRandomizer::inject(Module& M) 110 | { 111 | LLVMContext& ctx = M.getContext(); 112 | 113 | /* 114 | * If arc4random is already in the module, then return the Function for it; otherwise, 115 | * declare it so it will be pulled in. It should be within scope, so we assert. 
116 | */ 117 | Constant *lookupRand = M.getOrInsertFunction("arc4random", Type::getInt32Ty(ctx), NULL); 118 | assert(lookupRand != NULL && "BitFlipRandomizer::inject: Unable to getFunction(arc4random)\n"); 119 | 120 | Function *fnRand = cast<Function>(lookupRand); 121 | 122 | injectInt64(M, fnRand); 123 | injectInt32(M, fnRand); 124 | injectInt16(M, fnRand); 125 | injectInt8(M, fnRand); 126 | } 127 | -------------------------------------------------------------------------------- /code/intflip/src/BitFlipRandomizer.h: -------------------------------------------------------------------------------- 1 | #ifndef __BITFLIPRANDOMIZER_H 2 | #define __BITFLIPRANDOMIZER_H 3 | 4 | #include "BaseRandomizer.h" 5 | 6 | class BitFlipRandomizer : BaseRandomizer { 7 | public: 8 | static void inject(llvm::Module& M); 9 | }; 10 | 11 | #endif // !__BITFLIPRANDOMIZER_H 12 | -------------------------------------------------------------------------------- /code/intflip/src/FlipConfig.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | #include 5 | #include 6 | 7 | #include 8 | #include 9 | #include 10 | 11 | #include "FlipConfig.h" 12 | 13 | std::vector<std::string> skip_replace_functions = { 14 | "__rep_randomizer_i64__", 15 | "__rep_randomizer_i32__", 16 | "__rep_randomizer_i16__", 17 | "__rep_randomizer_i8__", 18 | "__bitflip_randomizer_i64__", 19 | "__bitflip_randomizer_i32__", 20 | "__bitflip_randomizer_i16__", 21 | "__bitflip_randomizer_i8__", 22 | "__cxa_allocate_exception", 23 | "__cxa_throw" // ... There are many RT and other functions to skip... but so it goes.
24 | }; 25 | 26 | 27 | FlipConfig::FlipConfig(std::string c) 28 | { 29 | _configFile = c; 30 | std::ifstream cStream; 31 | cStream.open(_configFile); 32 | cStream >> _configDict; 33 | cStream.close(); 34 | 35 | Json::Value globalConfig =_configDict["global"]; // XXX Not used yet 36 | Json::Value fnConfig = _configDict["functions"]; 37 | Json::Value skipFn = _configDict["skip"]; 38 | Json::Value::Members m = fnConfig.getMemberNames(); 39 | for (auto iii = m.begin(); iii != m.end(); ++iii) { 40 | std::string name = *iii; 41 | 42 | Json::Value nv = fnConfig[name]; 43 | Json::Value meanVal = nv["mean"]; 44 | Json::Value typeVal = nv["type"]; 45 | Json::Value analyzeVal = nv["analyze"]; 46 | 47 | FunctionConfig fc{analyzeVal.asBool(), typeVal.asUInt(), meanVal.asUInt()}; 48 | _replace[name] = fc; 49 | } 50 | if (skipFn.isArray()) { 51 | Json::ArrayIndex aLen = skipFn.size(); 52 | for (Json::ArrayIndex ai = 0; ai < aLen; ai++) { 53 | skip_replace_functions.push_back(skipFn[ai].asString()); 54 | } 55 | } 56 | } 57 | 58 | -------------------------------------------------------------------------------- /code/intflip/src/FlipConfig.h: -------------------------------------------------------------------------------- 1 | #ifndef __FLIPCONFIG_H 2 | #define __FLIPCONFIG_H 3 | 4 | struct FunctionConfig { 5 | bool shouldAnalyze; 6 | unsigned randomType; // weak, i know. 
0 = full replace, 1 = bit flip 7 | unsigned mean; // replace int, if random value is <= mean 8 | 9 | FunctionConfig() { shouldAnalyze = false; randomType = 0; mean = 2828282; } 10 | FunctionConfig(bool a, unsigned t, unsigned m) : shouldAnalyze(a), randomType(t), mean(m) {} 11 | }; 12 | 13 | typedef std::map<std::string, FunctionConfig> ReplaceMap; 14 | 15 | class FlipConfig { 16 | private: 17 | std::string _configFile; 18 | Json::Value _configDict; 19 | ReplaceMap _replace; 20 | 21 | public: 22 | FlipConfig(std::string c); 23 | Json::Value getDict() { return _configDict; } 24 | const ReplaceMap &getReplaceMap() const { return _replace; } // by reference, so begin()/end() at call sites refer to the same map 25 | }; 26 | 27 | extern std::vector<std::string> skip_replace_functions; 28 | 29 | #endif // !__FLIPCONFIG_H 30 | -------------------------------------------------------------------------------- /code/intflip/src/InjectRandomizers.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | 3 | #include "ReplaceRandomizer.h" 4 | #include "BitFlipRandomizer.h" 5 | 6 | using namespace llvm; 7 | 8 | struct InjectRandomizers : public ModulePass { 9 | static char ID; 10 | 11 | InjectRandomizers() : ModulePass(ID) {} 12 | 13 | virtual bool 14 | runOnModule(Module &M) 15 | { 16 | ReplaceRandomizer::inject(M); 17 | BitFlipRandomizer::inject(M); 18 | return true; 19 | } 20 | }; 21 | 22 | char InjectRandomizers::ID = 0; 23 | static RegisterPass<InjectRandomizers> XX("inject-randomizers", "Inject randomizer functions"); 24 | -------------------------------------------------------------------------------- /code/intflip/src/IntReplacerIterate.cpp: -------------------------------------------------------------------------------- 1 | 2 | #include "llvm/IR/Module.h" 3 | #include "llvm/Support/Casting.h" 4 | #include "llvm/IR/InstIterator.h" 5 | #include "llvm/IR/CallSite.h" 6 | #include "llvm/IR/Constants.h" 7 | #include "llvm/IR/Instructions.h" 8 | #include "llvm/Support/raw_ostream.h" 9 | #include "llvm/Support/CommandLine.h" 10 | 11 | #include
"TypeValueSupport.h" 12 | #include "BaseRandomizer.h" 13 | 14 | #include 15 | #include "FlipConfig.h" 16 | 17 | #include 18 | #include 19 | 20 | using namespace llvm; 21 | 22 | extern cl::opt<std::string> ReplaceConfigFileName; 23 | 24 | struct IntReplacerIterate: public ModulePass { 25 | static char ID; 26 | FlipConfig *_zConfig; 27 | 28 | 29 | IntReplacerIterate() : ModulePass(ID) 30 | { 31 | _zConfig = new FlipConfig(ReplaceConfigFileName); 32 | 33 | } 34 | virtual bool 35 | runOnModule(Module &M) 36 | { 37 | for (auto mIt = _zConfig->getReplaceMap().begin(); 38 | mIt != _zConfig->getReplaceMap().end(); ++mIt) { 39 | 40 | std::string fnName = mIt->first; 41 | FunctionConfig mFc = mIt->second; 42 | 43 | if (mFc.shouldAnalyze == false) { 44 | errs() << "Skipping analysis of " << fnName << " \n"; 45 | continue; 46 | } 47 | 48 | Function *f = M.getFunction(fnName); 49 | if (f == NULL) { 50 | continue; 51 | } 52 | 53 | for (inst_iterator I = inst_begin(f), E = inst_end(f); I != E; ++I) { 54 | if (isa<CallInst>(&*I) || isa<InvokeInst>(&*I)) { 55 | CallSite cs(&*I); 56 | Function *called = cs.getCalledFunction(); 57 | if (called == NULL || !called->hasName()) { 58 | continue; // XXX Skip indirect calls (no callee); currently require functions to have names 59 | } 60 | 61 | /* 62 | * Determine if the call is of a function we don't want to replace args of.. 63 | */ 64 | if (std::find(skip_replace_functions.begin(), 65 | skip_replace_functions.end(), 66 | called->getName()) != skip_replace_functions.end()) { 67 | errs() << "Skipping: " << called->getName().str() << "\n"; 68 | continue; // Skip it. 69 | } 70 | 71 | /* 72 | * Go through all the called function's arguments. See if 73 | * any are supported by replacement. 74 | */ 75 | unsigned numArgOps = cs.getNumArgOperands(); 76 | for (unsigned ii = 0; ii < numArgOps; ii++) { 77 | Value *va = cs.getArgOperand(ii); 78 | Type *ta = va->getType(); 79 | 80 | /* 81 | * If not an 8, 16, 32, or 64 bit integer, we skip it.
82 | */ 83 | if (TypeValueSupport::isReplaceable(ta, va) == false) { 84 | continue; 85 | } 86 | 87 | /* 88 | * Based on configuration, choose the randomization method. 89 | */ 90 | std::string rndtype = ""; 91 | if (mFc.randomType == 0) { 92 | rndtype = "rep"; 93 | } else { 94 | rndtype = "bitflip"; 95 | } 96 | unsigned nBits = ta->getIntegerBitWidth(); 97 | std::string rndFnName = "__" + rndtype + "_randomizer_i" + std::to_string(nBits) + "__"; 98 | 99 | /* 100 | * The randomizer functions /should/ already be in the module, so get the handle. 101 | */ 102 | Function *insertedRndFn = M.getFunction(rndFnName); 103 | assert(insertedRndFn != NULL); 104 | 105 | /* 106 | * We allow different means for different functions. 107 | */ 108 | ConstantInt *mn = ConstantInt::get(M.getContext(), APInt(32, mFc.mean, false)); 109 | 110 | /* 111 | * Insert call to randomizer with input integer and a mean value. 112 | * It will be inserted before the CallInst. 113 | */ 114 | CallInst *callNewFunc = CallInst::Create(insertedRndFn, 115 | { va, mn }, // Arguments are the integer to maybe flip and the mean value 116 | "__rnd_replicant_", 117 | cs.getInstruction()); // insert our call to the rnd fn before the targeted call instruction 118 | 119 | /* 120 | * Replace the old integer argument with the randomized one 121 | */ 122 | cs.setArgument(ii, callNewFunc); 123 | 124 | } 125 | } 126 | } 127 | } 128 | return true; 129 | } 130 | 131 | }; 132 | 133 | char IntReplacerIterate::ID = 0; 134 | static RegisterPass<IntReplacerIterate> XX("replace-ints-iterate", "Replace int function args with randomizer-integers using instruction iteration"); 135 | -------------------------------------------------------------------------------- /code/intflip/src/IntReplacerVisitor.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * This is one of three examples of an integer replacer pass. 3 | * In this example, we make use of the visitor-patterned 4 | * API..
which allows us to implement an InstVisitor that 5 | * lets us avoid the if-statement checking of instructions 6 | * that you can see in the second integer replacer pass. 7 | * 8 | * We implement visitCallSite() which handles both CallInst and 9 | * InvokeInst cases as one. 10 | * 11 | */ 12 | 13 | 14 | #include "llvm/IR/Module.h" 15 | #include "llvm/IR/InstVisitor.h" 16 | #include "llvm/IR/CallSite.h" 17 | #include "llvm/Support/raw_ostream.h" 18 | #include "llvm/Support/CommandLine.h" 19 | 20 | #include "TypeValueSupport.h" 21 | #include "BaseRandomizer.h" 22 | 23 | #include 24 | #include "FlipConfig.h" 25 | 26 | #include 27 | #include 28 | 29 | using namespace llvm; 30 | 31 | cl::opt<std::string> ReplaceConfigFileName("repcfg", cl::desc("")); 32 | 33 | class CodeModificationLocation { 34 | private: 35 | Instruction *callLocation; 36 | Type *argumentType; 37 | Value *argumentValue; 38 | unsigned argumentIdx; 39 | 40 | public: 41 | CodeModificationLocation(Instruction *c, Type *t, Value *v, unsigned i) : callLocation(c), argumentType(t), argumentValue(v), argumentIdx(i) { }; 42 | CallSite getCallSite() { return CallSite(callLocation); } 43 | Type *getArgumentType() { return argumentType; } 44 | Value *getArgumentValue() { return argumentValue; } 45 | unsigned getArgumentIdx() { return argumentIdx; } 46 | }; 47 | 48 | struct IntReplacerVisitor: public ModulePass { 49 | static char ID; 50 | 51 | std::vector<CodeModificationLocation> modificationList; 52 | FlipConfig *_zConfig; 53 | 54 | IntReplacerVisitor() : ModulePass(ID) 55 | { 56 | _zConfig = new FlipConfig(ReplaceConfigFileName); 57 | } 58 | 59 | virtual bool 60 | runOnModule(Module &M) 61 | { 62 | for (auto mIt = _zConfig->getReplaceMap().begin(); 63 | mIt != _zConfig->getReplaceMap().end(); ++mIt) { 64 | 65 | std::string fnName = mIt->first; 66 | FunctionConfig mFc = mIt->second; 67 | if (mFc.shouldAnalyze == false) { 68 | errs() << "Per config: skipping analysis of " << fnName << " \n"; 69 | continue; 70 | } 71 | 72 | Function *f =
M.getFunction(fnName); 73 | if (f == NULL) { 74 | errs() << "Can't find function: " << fnName << " \n"; 75 | continue; 76 | } 77 | 78 | /* 79 | * The way this currently works -- you can do it many ways -- but the 80 | * way this works is that the visit finds any/all call instructions in 81 | * the target function. There, it looks to see if there are any integer 82 | * arguments that we want to replace with a randomizer.. if there are, 83 | * we save them to a list of locations to change. Once gathering that 84 | * list, we iterate and modify. 85 | */ 86 | CheckCallInsts cCheck; 87 | cCheck.setIntReplacerVisitor(this); 88 | cCheck.visit(*f); 89 | 90 | for (auto iml = modificationList.begin(); 91 | iml != modificationList.end(); ++iml) { 92 | CodeModificationLocation cLoc = *iml; 93 | CallSite cs = cLoc.getCallSite(); 94 | IntegerType *t = cast(cLoc.getArgumentType()); 95 | unsigned nBits = t->getBitWidth(); 96 | 97 | /* Lookup randomizer function to use */ 98 | std::string rndtype = ""; 99 | if (mFc.randomType == 0) { 100 | rndtype = "rep"; 101 | } else { 102 | rndtype = "bitflip"; 103 | } 104 | std::string rndFnName = "__" + rndtype + "_randomizer_i" \ 105 | + std::to_string(nBits) + "__"; 106 | 107 | // the randomizers should already be in, so we assert. 
108 | Function *insertedRndFn = cs.getParent()->getModule()->getFunction(rndFnName); 109 | assert(insertedRndFn != NULL); 110 | 111 | // Base mean value on what the configuration says 112 | ConstantInt *mn = ConstantInt::get(M.getContext(), APInt(32, mFc.mean, false)); 113 | 114 | CallInst *callNewFunc = CallInst::Create(insertedRndFn, 115 | { cLoc.getArgumentValue(), mn }, // Arguments are the integer to maybe flip and the mean value 116 | "__rnd_replicant_", 117 | cs.getInstruction()); // insert our call to the rnd fn before the targeted call instruction 118 | 119 | cs.setArgument(cLoc.getArgumentIdx(), callNewFunc); 120 | } 121 | modificationList.clear(); 122 | } 123 | return true; 124 | } 125 | 126 | struct CheckCallInsts : public InstVisitor<CheckCallInsts> { 127 | IntReplacerVisitor *__ziInst; 128 | 129 | void 130 | setIntReplacerVisitor(IntReplacerVisitor *p) 131 | { 132 | __ziInst = p; 133 | } 134 | 135 | /* 136 | * Could do this with two separate functions visitCallInst() and 137 | * visitInvokeInst().. the difference is that the latter is for the exception handling 138 | * case and the former is for normal calling contexts. 139 | * CallSite provides a reasonable common ground for the two. 140 | * 141 | */ 142 | void 143 | visitCallSite(CallSite callSite) 144 | { 145 | /* 146 | * Skip indirect calls, unnamed callees, and functions on the skip_replace list. 147 | */ 148 | Function *called = callSite.getCalledFunction(); 149 | if (called == NULL || !called->hasName()) { 150 | return; 151 | } 152 | if (std::find(skip_replace_functions.begin(), 153 | skip_replace_functions.end(), 154 | called->getName()) != skip_replace_functions.end()) { 155 | errs() << "Skipping replace function: " << called->getName() << "\n"; 156 | return; 157 | } 158 | 159 | /* 160 | * For each argument, determine if it is of a type and value that is replaceable. 161 | * At this point, we just note the location by storing CallSite, Type, 162 | * Value, and an arg index as a place to change.
163 | * 164 | */ 165 | unsigned numArgOps = callSite.getNumArgOperands(); 166 | for (unsigned ii = 0; ii < numArgOps; ii++) { 167 | Value *va = callSite.getArgOperand(ii); 168 | Type *ta = va->getType(); 169 | if (TypeValueSupport::isReplaceable(ta, va)) { 170 | CodeModificationLocation ml = CodeModificationLocation( 171 | callSite.getInstruction(), 172 | ta, 173 | va, 174 | ii); 175 | __ziInst->modificationList.push_back(ml); 176 | } 177 | } 178 | } 179 | private: 180 | }; 181 | }; 182 | 183 | char IntReplacerVisitor::ID = 0; 184 | static RegisterPass XX("replace-ints-visitor", "Replace int function args with randomizer-integers using instruction visitor"); 185 | -------------------------------------------------------------------------------- /code/intflip/src/LiftConstantIntPass.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * This is a module pass that likely will not have 3 | * impact given possibly run passes, but for our 4 | * purposes we should have it just in case. 5 | * 6 | * What it does is identifies call instructions that 7 | * are passed a integral constant to the target function. 8 | * We lift that constant into a local variable and 9 | * adjust the function call to use the variable. 10 | * 11 | * The reason for doing this is that the passes run after 12 | * this want to act on variables of integer type and so 13 | * this makes that possible to do without checking types.. 14 | * 15 | */ 16 | 17 | #include "llvm/IR/Module.h" 18 | #include "llvm/IR/InstVisitor.h" 19 | #include "llvm/IR/CallSite.h" 20 | 21 | #include "TypeValueSupport.h" 22 | 23 | using namespace llvm; 24 | 25 | struct LiftConstantIntPass : public ModulePass { 26 | static char ID; 27 | 28 | LiftConstantIntPass() : ModulePass(ID) {} 29 | 30 | virtual bool 31 | runOnModule(Module &M) 32 | { 33 | bool modified = false; 34 | 35 | /* 36 | * Visit the call instructions of each function in this 37 | * module. 
38 | */ 39 | for (auto &f : M) { 40 | 41 | CheckCallInsts cCheck; 42 | cCheck.visit(f); 43 | if (cCheck.modified) { 44 | modified = true; 45 | } 46 | } 47 | return modified; 48 | } 49 | 50 | struct CheckCallInsts : public InstVisitor<CheckCallInsts> { 51 | bool modified = false; 52 | 53 | void 54 | visitCallSite(CallSite callSite) 55 | { 56 | unsigned numArgOps = callSite.getNumArgOperands(); 57 | unsigned argIdx; 58 | 59 | /* 60 | * Check all arguments to the called function to see whether 61 | * they are constant integers of a size we can lift. 62 | * 63 | * If an argument is liftable, we allocate space for it 64 | * and store the constant to it. Then, load the value and use 65 | * it to replace the constant integer argument to the 66 | * function called. 67 | */ 68 | for (argIdx = 0; argIdx < numArgOps; argIdx++) { 69 | Value *va = callSite.getArgOperand(argIdx); 70 | 71 | if (TypeValueSupport::isLiftable(va) == true) { 72 | ConstantInt *con = cast<ConstantInt>(va); 73 | 74 | unsigned nBits = con->getBitWidth(); 75 | 76 | AllocaInst *localized__alloc = new AllocaInst( 77 | IntegerType::get(callSite.getParent()->getContext(), nBits), // type to allocate 78 | "__intflip_localized", // give the slot a label 79 | callSite.getInstruction()); // Insert before call instruction 80 | 81 | StoreInst *localized__store = new StoreInst( 82 | con, // value to store 83 | localized__alloc, // where to store it 84 | callSite.getInstruction()); // Insert before call instruction 85 | 86 | LoadInst *localized__load = new LoadInst( 87 | localized__alloc, // pointer to load from 88 | (const char *)"__intflip_loaded", // label the loaded value 89 | callSite.getInstruction()); // Insert before call instruction 90 | 91 | /* 92 | * after that series we now have: 93 | * alloca 94 | * store 95 | * load 96 | * call 97 | * sequence of instructions 98 | */ 99 | 100 | /* replace the constant in the function call */ 101 | callSite.setArgument(argIdx, localized__load); 102 | new_vars++; 103 | modified = true; 104 | } 105 | } 106 | } 107 |
private: 108 | unsigned new_vars = 0; 109 | }; 110 | }; 111 | 112 | char LiftConstantIntPass::ID = 0; 113 | static RegisterPass XX("lift-constant-int-args", "Lifts constant int fn args to a local variable"); 114 | -------------------------------------------------------------------------------- /code/intflip/src/ReplaceRandomizer.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/IRBuilder.h" 3 | 4 | #include "ReplaceRandomizer.h" 5 | 6 | using namespace llvm; 7 | /* 8 | 9 | * This, for each K bits integer type, will inject the function 10 | * 11 | * intK_t 12 | * __rep_randomizer_iK(intK_t inArg0) 13 | * { 14 | * unsigned rv = arc4random(); 15 | * if (rv <= (unsigned)meanValue) { 16 | * return rv; 17 | * } 18 | * return inArg0 19 | * } 20 | * 21 | * The comments in the macro below show the generated IR along each step..but for 22 | * K=64. 23 | */ 24 | #define injectRIntN(nBits) \ 25 | static void \ 26 | injectInt##nBits(llvm::Module& M, Function *fnRand) \ 27 | { \ 28 | LLVMContext& ctx = M.getContext(); \ 29 | /* \ 30 | * define i64 @__rep_randomizer_i64__(i64 %intToFlip, i32 %meanValue) { \ 31 | * } \ 32 | */ \ 33 | std::string int##nBits##_rand = "__rep_randomizer_i" #nBits "__"; \ 34 | Constant *cTmp = M.getOrInsertFunction(int##nBits##_rand, \ 35 | Type::getInt##nBits##Ty(ctx), \ 36 | Type::getInt##nBits##Ty(ctx), \ 37 | Type::getInt32Ty(ctx), \ 38 | NULL); \ 39 | Function *rep_i##nBits = cast(cTmp); \ 40 | rep_i##nBits->setCallingConv(CallingConv::C); \ 41 | Argument& inArg0 = rep_i##nBits->getArgumentList().front(); \ 42 | inArg0.setName("intToFlip"); \ 43 | Argument& inArg1 = rep_i##nBits->getArgumentList().back(); \ 44 | inArg1.setName("meanValue"); \ 45 | /* \ 46 | * entry: \ 47 | * replace_it: \ 48 | * return: \ 49 | */ \ 50 | BasicBlock *blkEntry = BasicBlock::Create(ctx, "entry", rep_i##nBits); \ 51 | BasicBlock *blkRepIt = BasicBlock::Create(ctx, "replace_it", 
rep_i##nBits); \ 52 | BasicBlock *blkReturn = BasicBlock::Create(ctx, "return", rep_i##nBits); \ 53 | /* \ 54 | * entry: \ 55 | * %__rep_rand_ = call i32 @arc4random() \ 56 | * %__rep_lessthan_ = icmp ule i32 %__rep_rand_, %meanValue \ 57 | * br i1 %__rep_lessthan_, label %replace_it, label %return \ 58 | * replace_it: \ 59 | * return: \ 60 | */ \ 61 | IRBuilder<> builder(blkEntry); \ 62 | Value *callArc4Random = builder.CreateCall(fnRand, None, "__rep_rand_", nullptr); \ 63 | Value *lessThan = builder.CreateICmpULE(callArc4Random, &inArg1, "__rep_lessthan_");\ 64 | Value *branchRep = builder.CreateCondBr(lessThan, blkRepIt, blkReturn); \ 65 | \ 66 | /* \ 67 | * ... \ 68 | * replace_it: ; preds = %entry \ 69 | * %__rep_cast_ = zext i32 %__rep_rand_ to i64 \ 70 | * ret i64 %__rep_cast_ \ 71 | */ \ 72 | builder.SetInsertPoint(blkRepIt); \ 73 | Value *castRand = builder.CreateZExtOrTrunc(callArc4Random, \ 74 | Type::getInt##nBits##Ty(ctx), "__rep_cast_"); \ 75 | Value *cr = builder.CreateRet(castRand); \ 76 | \ 77 | /* \ 78 | * ... \ 79 | * return: ; preds = %entry \ 80 | * ret i64 %intToFlip \ 81 | */ \ 82 | builder.SetInsertPoint(blkReturn); \ 83 | Value *cr2 = builder.CreateRet(&inArg0); \ 84 | } \ 85 | 86 | 87 | injectRIntN(64) 88 | injectRIntN(32) 89 | injectRIntN(16) 90 | injectRIntN(8) 91 | 92 | void 93 | ReplaceRandomizer::inject(Module& M) 94 | { 95 | LLVMContext& ctx = M.getContext(); 96 | 97 | /* 98 | * If arc4random is already in the module, then return the Function for it; otherwise, 99 | * declare it so it will be pulled in. It should be within scope, so we assert. 
100 | */ 101 | Constant *lookupRand = M.getOrInsertFunction("arc4random", Type::getInt32Ty(ctx), NULL); 102 | assert(lookupRand != NULL && "ReplaceRandomizer::inject(): failed getFunction(arc4random)\n"); 103 | 104 | Function *fnRand = cast(lookupRand); 105 | 106 | injectInt64(M, fnRand); 107 | injectInt32(M, fnRand); 108 | injectInt16(M, fnRand); 109 | injectInt8(M, fnRand); 110 | 111 | } 112 | 113 | 114 | -------------------------------------------------------------------------------- /code/intflip/src/ReplaceRandomizer.h: -------------------------------------------------------------------------------- 1 | #ifndef __REPLACERANDOMIZER_H 2 | #define __REPLACERANDOMIZER_H 3 | 4 | #include "BaseRandomizer.h" 5 | 6 | class ReplaceRandomizer : BaseRandomizer { 7 | public: 8 | static void inject(llvm::Module& M); 9 | }; 10 | #endif // !__REPLACERANDOMIZER_H 11 | -------------------------------------------------------------------------------- /code/intflip/src/TypeValueSupport.h: -------------------------------------------------------------------------------- 1 | #ifndef __TYPEVALUESUPPORT_H 2 | #define __TYPEVALUESUPPORT_H 3 | 4 | class TypeValueSupport { 5 | 6 | public: 7 | TypeValueSupport() {} 8 | 9 | static bool 10 | isLiftable(llvm::Value *v) 11 | { 12 | /* dyn_cast<>() will return NULL if cast fails */ 13 | if (llvm::ConstantInt *c = llvm::dyn_cast(v)) { 14 | unsigned nBits = c->getBitWidth(); 15 | switch (nBits) { 16 | case 64: 17 | case 32: 18 | case 16: 19 | case 8: 20 | return true; 21 | } 22 | } 23 | return false; 24 | } 25 | 26 | static bool 27 | isReplaceable(llvm::Type *t, llvm::Value *v) 28 | { 29 | 30 | /* isa<>() returns boolean as to whether can cast */ 31 | if (llvm::isa(v) == true) { 32 | /* 33 | * Maybe assert() and say you should run the lifting 34 | * pass before this one? Or just report that... 
35 | */ 36 | return false; 37 | } 38 | 39 | /* dyn_cast<>() will return NULL if cast fails */ 40 | if (llvm::IntegerType *intType = llvm::dyn_cast(t)) { 41 | unsigned nBits = intType->getBitWidth(); 42 | switch (nBits) { 43 | case 64: 44 | case 32: 45 | case 16: 46 | case 8: 47 | return true; 48 | } 49 | } 50 | return false; 51 | } 52 | }; 53 | #endif // !__TYPEVALUESUPPORT_H 54 | -------------------------------------------------------------------------------- /code/intflip/test/foo.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | #define MAXITER (4 * 4096) 6 | 7 | char 8 | bf8(char in) 9 | { 10 | return in; 11 | } 12 | 13 | static void * 14 | thd_bf8(void *a) 15 | { 16 | for (unsigned i = 0; i < MAXITER; i++) 17 | printf("thd_bf8: %d\n", bf8(0)); 18 | return a; 19 | } 20 | 21 | short 22 | bf16(short in) 23 | { 24 | return in; 25 | } 26 | 27 | static void * 28 | thd_bf16(void *a) 29 | { 30 | for (unsigned i = 0; i < MAXITER; i++) 31 | printf("thd_bf16: %d\n", bf16(0)); 32 | return a; 33 | } 34 | 35 | int 36 | bf32(int in) 37 | { 38 | return in; 39 | } 40 | 41 | static void * 42 | thd_bf32(void *a) 43 | { 44 | for (unsigned i = 0; i < MAXITER; i++) 45 | printf("thd_bf32: %d\n", bf32(0)); 46 | return a; 47 | } 48 | 49 | int64_t 50 | bf64(int64_t in) 51 | { 52 | return in; 53 | } 54 | 55 | static void * 56 | thd_bf64(void *a) 57 | { 58 | for (unsigned i = 0; i < MAXITER; i++) 59 | printf("thd_bf64: %ld\n", bf64(0)); 60 | return a; 61 | } 62 | 63 | char 64 | rr8(char in) 65 | { 66 | return in; 67 | } 68 | 69 | static void * 70 | thd_rr8(void *a) 71 | { 72 | for (unsigned i = 0; i < MAXITER; i++) 73 | printf("thd_rr8: %d\n", rr8(0)); 74 | return a; 75 | } 76 | 77 | short 78 | rr16(short in) 79 | { 80 | return in; 81 | } 82 | 83 | static void * 84 | thd_rr16(void *a) 85 | { 86 | for (unsigned i = 0; i < MAXITER; i++) 87 | printf("thd_rr16: %d\n", rr16(0)); 88 | return a; 89 | } 90 | 91 | 
int 92 | rr32(int in) 93 | { 94 | return in; 95 | } 96 | 97 | static void * 98 | thd_rr32(void *a) 99 | { 100 | for (unsigned i = 0; i < MAXITER; i++) 101 | printf("thd_rr32: %d\n", rr32(0)); 102 | return a; 103 | } 104 | 105 | int64_t 106 | rr64(int64_t in) 107 | { 108 | return in; 109 | } 110 | 111 | static void * 112 | thd_rr64(void *a) 113 | { 114 | for (unsigned i = 0; i < MAXITER; i++) 115 | printf("thd_rr64: %ld\n", rr64(0)); 116 | return a; 117 | } 118 | 119 | int 120 | main(int argc, char **argv) 121 | { 122 | pthread_t a,b,c,d,e,f,g,h; 123 | printf("Creating.\n"); 124 | pthread_create(&a, NULL, &thd_bf8, NULL); 125 | pthread_create(&b, NULL, &thd_bf16, NULL); 126 | pthread_create(&c, NULL, &thd_bf32, NULL); 127 | pthread_create(&d, NULL, &thd_bf64, NULL); 128 | pthread_create(&e, NULL, &thd_rr8, NULL); 129 | pthread_create(&f, NULL, &thd_rr16, NULL); 130 | pthread_create(&g, NULL, &thd_rr32, NULL); 131 | pthread_create(&h, NULL, &thd_rr64, NULL); 132 | printf("Created.\n"); 133 | pthread_join(a, NULL); 134 | pthread_join(b, NULL); 135 | pthread_join(c, NULL); 136 | pthread_join(d, NULL); 137 | pthread_join(e, NULL); 138 | pthread_join(f, NULL); 139 | pthread_join(g, NULL); 140 | pthread_join(h, NULL); 141 | return 0; 142 | } 143 | -------------------------------------------------------------------------------- /code/intflip/test/foo.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | #define MAXITER (8 * 4096) 6 | 7 | char 8 | bf8(char in) 9 | { 10 | if (in != 0 && " 0" == NULL) { 11 | throw 2; 12 | } 13 | return in; 14 | } 15 | 16 | static void * 17 | thd_bf8(void *a) 18 | { 19 | try { 20 | for (unsigned i = 0; i < MAXITER; i++) 21 | printf("thd_bf8: %d\n", bf8(0)); 22 | } catch (int e) {} 23 | return a; 24 | } 25 | 26 | short 27 | bf16(short in) 28 | { 29 | if (in != 0 && " 0" == NULL) { 30 | throw 2; 31 | } 32 | return in; 33 | } 34 | 35 | static void * 36 | thd_bf16(void 
*a) 37 | { 38 | try { 39 | for (unsigned i = 0; i < MAXITER; i++) 40 | printf("thd_bf16: %d\n", bf16(0)); 41 | } catch (int e) {} 42 | return a; 43 | } 44 | 45 | int 46 | bf32(int in) 47 | { 48 | if (in != 0 && " 0" == NULL) { 49 | throw 2; 50 | } 51 | return in; 52 | } 53 | 54 | static void * 55 | thd_bf32(void *a) 56 | { 57 | try { 58 | for (unsigned i = 0; i < MAXITER; i++) 59 | printf("thd_bf32: %d\n", bf32(0)); 60 | } catch (int e) {} 61 | return a; 62 | } 63 | 64 | int64_t 65 | bf64(int64_t in) 66 | { 67 | if (in != 0 && " 0" == NULL) { 68 | throw 2; 69 | } 70 | return in; 71 | } 72 | 73 | static void * 74 | thd_bf64(void *a) 75 | { 76 | try { 77 | for (unsigned i = 0; i < MAXITER; i++) 78 | printf("thd_bf64: %ld\n", bf64(0)); 79 | } catch (int e) {} 80 | return a; 81 | } 82 | 83 | char 84 | rr8(char in) 85 | { 86 | if (in != 0 && " 0" == NULL) { 87 | throw 2; 88 | } 89 | return in; 90 | } 91 | 92 | static void * 93 | thd_rr8(void *a) 94 | { 95 | try { 96 | for (unsigned i = 0; i < MAXITER; i++) 97 | printf("thd_rr8: %d\n", rr8(0)); 98 | } catch (int e) {} 99 | return a; 100 | } 101 | 102 | short 103 | rr16(short in) 104 | { 105 | if (in != 0 && " 0" == NULL) { 106 | throw 2; 107 | } 108 | return in; 109 | } 110 | 111 | static void * 112 | thd_rr16(void *a) 113 | { 114 | try { 115 | for (unsigned i = 0; i < MAXITER; i++) 116 | printf("thd_rr16: %d\n", rr16(0)); 117 | } catch (int e) {} 118 | return a; 119 | } 120 | 121 | int 122 | rr32(int in) 123 | { 124 | if (in != 0 && " 0" == NULL) { 125 | throw 2; 126 | } 127 | return in; 128 | } 129 | 130 | static void * 131 | thd_rr32(void *a) 132 | { 133 | try { 134 | for (unsigned i = 0; i < MAXITER; i++) 135 | printf("thd_rr32: %d\n", rr32(0)); 136 | } catch (int e) {} 137 | return a; 138 | } 139 | 140 | int64_t 141 | rr64(int64_t in) 142 | { 143 | if (in != 0 && " 0" == NULL) { 144 | throw 2; 145 | } 146 | return in; 147 | } 148 | 149 | static void * 150 | thd_rr64(void *a) 151 | { 152 | try { 153 | for (unsigned i 
= 0; i < MAXITER; i++) 154 | printf("thd_rr64: %ld\n", rr64(0)); 155 | } catch (int e) {} 156 | return a; 157 | } 158 | 159 | int 160 | main(int argc, char **argv) 161 | { 162 | pthread_t a,b,c,d,e,f,g,h; 163 | printf("Max iter = %u\n", MAXITER); 164 | printf("Creating.\n"); 165 | pthread_create(&a, NULL, &thd_bf8, NULL); 166 | pthread_create(&b, NULL, &thd_bf16, NULL); 167 | pthread_create(&c, NULL, &thd_bf32, NULL); 168 | pthread_create(&d, NULL, &thd_bf64, NULL); 169 | pthread_create(&e, NULL, &thd_rr8, NULL); 170 | pthread_create(&f, NULL, &thd_rr16, NULL); 171 | pthread_create(&g, NULL, &thd_rr32, NULL); 172 | pthread_create(&h, NULL, &thd_rr64, NULL); 173 | printf("Created.\n"); 174 | pthread_join(a, NULL); 175 | pthread_join(b, NULL); 176 | pthread_join(c, NULL); 177 | pthread_join(d, NULL); 178 | pthread_join(e, NULL); 179 | pthread_join(f, NULL); 180 | pthread_join(g, NULL); 181 | pthread_join(h, NULL); 182 | return 0; 183 | } 184 | -------------------------------------------------------------------------------- /code/intflip/thirdparty/jsoncpp-1.8.0.tar.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/code/intflip/thirdparty/jsoncpp-1.8.0.tar.gz -------------------------------------------------------------------------------- /code/lpskel/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.9 2 | LLVM_HOME=/usr/bin 3 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 4 | 5 | ifndef VERBOSE 6 | QUIET:=@ 7 | endif 8 | 9 | SRC_DIR?=$(PWD)/src 10 | 11 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 12 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 13 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 14 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 15 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 16 | 17 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 18 | LDFLAGS+=-shared -Wl,-O1 19 | 20 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) 
--includedir) 21 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 22 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 23 | 24 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 25 | CPPFLAGS+=-I$(SRC_DIR) 26 | 27 | PASS=LPSkel.so 28 | PASS_OBJECTS=LPSkel.o 29 | 30 | default: prep $(PASS) 31 | 32 | prep: 33 | $(QUIET)mkdir -p built 34 | 35 | %.o : $(SRC_DIR)/%.cpp 36 | @echo Compiling $*.cpp 37 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 38 | 39 | $(PASS) : $(PASS_OBJECTS) 40 | @echo Linking $@ 41 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 42 | 43 | clean: 44 | $(QUIET)rm -rf built test/*.bc 45 | 46 | 47 | tests: 48 | $(CC) -emit-llvm -o test/foo.bc -c test/foo.c 49 | 50 | runtests: 51 | $(OPT) -load built/LPSkel.so -lpskel < test/foo.bc 52 | -------------------------------------------------------------------------------- /code/lpskel/README.md: -------------------------------------------------------------------------------- 1 | 2 | # LoopPass Skeleton 3 | 4 | Visit loops in code; outer loop met last. 5 | 6 | # Build & Run 7 | 8 | First check the Makefile to set path to llvm-config and version. 9 | 3.8, 3.9 should be fine, so should 4.0 10 | 11 | ``` 12 | $ make 13 | $ opt-X.Y -load built/LPSkel.so -lpskel < file.bc 14 | ... 
15 | $ 16 | ``` 17 | 18 | 19 | -------------------------------------------------------------------------------- /code/lpskel/src/LPSkel.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/Instructions.h" 3 | #include "llvm/Analysis/LoopPass.h" 4 | #include "llvm/Analysis/LoopInfo.h" 5 | #include "llvm/Support/raw_ostream.h" 6 | 7 | using namespace llvm; 8 | 9 | #include "LPSkel.h" 10 | 11 | void 12 | LPSkel::getAnalysisUsage(AnalysisUsage &AU) const 13 | { 14 | // No changes to CFG, so tell the pass manager 15 | AU.setPreservesCFG(); 16 | } 17 | 18 | bool 19 | LPSkel::doFinalization() 20 | { 21 | return false; 22 | } 23 | 24 | bool 25 | LPSkel::doInitialization(Loop *L, LPPassManager &LP) 26 | { 27 | return false; 28 | } 29 | 30 | bool 31 | LPSkel::runOnLoop(Loop *L, LPPassManager &LP) 32 | { 33 | errs() << " Loop found:\n"; 34 | L->print(errs(), 1); 35 | errs() << " --- end of Loop ---\n"; 36 | 37 | // Return true if the Function has been changed. 38 | return false; 39 | } 40 | 41 | /* 42 | * Register this pass to be made usable. 43 | * Needs the static ID initialized and the pass declaration given. 44 | */ 45 | char LPSkel::ID = 0; 46 | static RegisterPass<LPSkel> XX("lpskel", "Loop Pass Skeleton"); 47 | 48 | -------------------------------------------------------------------------------- /code/lpskel/src/LPSkel.h: -------------------------------------------------------------------------------- 1 | #ifndef __LPSKEL_H 2 | #define __LPSKEL_H 3 | 4 | struct LPSkel : public LoopPass { 5 | /* 6 | * For all of your passes you will need to declare this and define it. 7 | * Its address is used by the pass system, so the value does not matter. 8 | */ 9 | static char ID; 10 | 11 | LPSkel() : LoopPass(ID) { 12 | } 13 | 14 | // Return true if Function was modified, otherwise false. 15 | virtual bool runOnLoop(Loop *L, LPPassManager &LP); 16 | 17 | /* 18 | * Used to help order passes by pass manager.
19 | * Declare any passes you need run prior here.. as well as 20 | * any information such as preserving CFG or similar. 21 | */ 22 | virtual void getAnalysisUsage(AnalysisUsage &) const; 23 | 24 | virtual bool doInitialization(Loop *L, LPPassManager &LPM); 25 | virtual bool doFinalization(); 26 | }; 27 | 28 | #endif 29 | -------------------------------------------------------------------------------- /code/lpskel/test/foo.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | int 4 | main(int argc, char **argv) 5 | { 6 | unsigned z = 0; 7 | if (argc <= 0) argc = 30; 8 | for (unsigned k = 0; k < argc; ++k) { 9 | z += k * 2; 10 | } 11 | printf("zeta: %u\n", z); 12 | return 0; 13 | } 14 | -------------------------------------------------------------------------------- /code/mpskel/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.9 2 | LLVM_HOME=/usr/bin 3 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 4 | 5 | ifndef VERBOSE 6 | QUIET:=@ 7 | endif 8 | 9 | SRC_DIR?=$(PWD)/src 10 | 11 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 12 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 13 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 14 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 15 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 16 | 17 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 18 | LDFLAGS+=-shared -Wl,-O1 19 | 20 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 21 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 22 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 23 | 24 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 25 | CPPFLAGS+=-I$(SRC_DIR) 26 | 27 | 28 | PASS=MPSkel.so 29 | PASS_OBJECTS=MPSkel.o 30 | 31 | default: prep $(PASS) 32 | 33 | prep: 34 | $(QUIET)mkdir -p built 35 | 36 | %.o : $(SRC_DIR)/%.cpp 37 | @echo Compiling $*.cpp 38 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 39 | 40 | $(PASS) : $(PASS_OBJECTS) 41 | @echo Linking $@ 42 | 
$(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 43 | 44 | clean: 45 | $(QUIET)rm -rf built 46 | 47 | -------------------------------------------------------------------------------- /code/mpskel/README.md: -------------------------------------------------------------------------------- 1 | 2 | # MPSkel 3 | 4 | This is a module pass skeleton. 5 | 6 | 7 | # Build & Run 8 | 9 | First check the Makefile to set the path to llvm-config and the version. 10 | LLVM 3.8 and 3.9 should be fine, as should 4.0. 11 | 12 | ``` 13 | $ make 14 | $ opt-X.Y -load built/MPSkel.so -mpskel < file.bc 15 | ... 16 | $ 17 | ``` 18 | 19 | 20 | -------------------------------------------------------------------------------- /code/mpskel/src/MPSkel.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/Instructions.h" 3 | #include "llvm/IR/CallSite.h" 4 | #include "llvm/Support/raw_ostream.h" 5 | 6 | using namespace llvm; 7 | 8 | #include "MPSkel.h" 9 | 10 | void 11 | MPSkel::getAnalysisUsage(AnalysisUsage &AU) const 12 | { 13 | AU.setPreservesCFG(); 14 | } 15 | 16 | bool 17 | MPSkel::runOnModule(Module &M) 18 | { 19 | 20 | /* Iterate through all functions in this module */ 21 | for (auto &F : M) { 22 | std::string fname = "not named"; 23 | if (F.hasName()) { 24 | fname = F.getName().str(); 25 | } 26 | 27 | // If no uses, don't look further. 28 | if (F.user_empty()) { 29 | errs() << "Function (" << fname << ") not used.\n"; 30 | continue; 31 | } 32 | errs() << "Listing uses for function (" << fname << ")\n"; 33 | for (auto uit = F.user_begin(); uit != F.user_end(); ++uit) { 34 | User *u = *uit; 35 | errs() << " "; 36 | std::string pn = ""; 37 | 38 | // Is this use a Call or Invoke instruction? 39 | if (isa<CallInst>(u) || isa<InvokeInst>(u)) { 40 | // It is, so let's use the common class CallSite 41 | CallSite cs(dyn_cast<Instruction>(u)); 42 | 43 | // Instruction in a BasicBlock in a Function.
44 | Function *caller = cs.getParent()->getParent(); 45 | if (caller->hasName()) { 46 | pn = caller->getName().str(); 47 | } else { 48 | pn = "not named"; 49 | } 50 | errs() << pn << ": "; 51 | } 52 | 53 | // Just print out what the Value is 54 | u->dump(); 55 | // If it has debug info, we should just dump that as well 56 | } 57 | errs() << "\n"; 58 | } 59 | return false; // Module was not modified 60 | } 61 | 62 | /* 63 | * Register this pass to be made usable. 64 | * Needs the static ID initialized and the pass declaration given. 65 | */ 66 | char MPSkel::ID = 0; 67 | static RegisterPass<MPSkel> XX("mpskel", "Module Pass Skeleton"); 68 | 69 | -------------------------------------------------------------------------------- /code/mpskel/src/MPSkel.h: -------------------------------------------------------------------------------- 1 | #ifndef __MPSKEL_H 2 | #define __MPSKEL_H 3 | 4 | struct MPSkel : public ModulePass { 5 | /* 6 | * For all of your passes you will need to declare this and define it. 7 | * Its address is used by the pass system, so the value does not matter. 8 | */ 9 | static char ID; 10 | 11 | MPSkel() : ModulePass(ID) { } 12 | 13 | // Called on each compilation unit 14 | virtual bool runOnModule(Module &); 15 | 16 | /* 17 | * Used to help order passes by pass manager. 18 | * Declare any passes you need run prior here.. as well as 19 | * any information such as preserving CFG or similar.
20 | */ 21 | virtual void getAnalysisUsage(AnalysisUsage &) const; 22 | 23 | }; 24 | 25 | #endif 26 | -------------------------------------------------------------------------------- /code/newpm/Makefile: -------------------------------------------------------------------------------- 1 | # Point towards the LLVM 4.0 installation (might work with 3.9) 2 | LLBIN=/Users/jaredcarlson/Projects/llvm-4-build/bin 3 | LLVM_CONFIG=$(LLBIN)/llvm-config 4 | #QUIET:=@ 5 | QUIET:= 6 | 7 | 8 | SRC_DIR?=$(PWD)/src 9 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 10 | 11 | COMMON_FLAGS=-Wall -Wextra -g 12 | 13 | 14 | CXXFLAGS+=$(COMMON_FLAGS) $(shell $(LLVM_CONFIG) --cxxflags) 15 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) -std=c++11 -I$(SRC_DIR) 16 | 17 | ifeq ($(shell uname),Darwin) 18 | LOADABLE_MODULE_OPTIONS=-bundle -undefined dynamic_lookup 19 | PASS=TestFunctionPass.dylib 20 | else 21 | LOADABLE_MODULE_OPTIONS=-shared -Wl,-O1 22 | PASS=TestFunctionPass.so 23 | endif 24 | 25 | # Change these to point to your installation of clang/llvm-dis 26 | LDIS=$(LLBIN)/llvm-dis 27 | CPP=/usr/bin/clang++ 28 | CC=/usr/bin/clang 29 | 30 | BOD=build/obj 31 | PASSMGR=Manager 32 | OPM=build/bin/$(PASSMGR) 33 | 34 | PASS_OBJECTS=Analysis/TestFunctionPass.o 35 | 36 | # XXX 37 | # This is awful... I'm just like "PUT IT ALL IN" 38 | LIBS=$(shell $(LLVM_CONFIG) --libs) -lclang 39 | LIBS+=-lpthread -ldl -lncurses -lz 40 | 41 | TDIR=build/tests 42 | 43 | default: prep $(PASS) passmgr 44 | 45 | prep: 46 | @echo "Prep phase" 47 | $(QUIET)mkdir -p build 48 | $(QUIET)mkdir -p build/obj 49 | $(QUIET)mkdir -p build/obj/Analysis 50 | $(QUIET)mkdir -p build/bin 51 | $(QUIET)mkdir -p build/lib 52 | 53 | define builditdood 54 | $(QUIET)$(CPP) -o $(BOD)/$(1)/$(@F) -c $(CPPFLAGS) $(CXXFLAGS) $< 55 | endef 56 | 57 | Analysis/%.o: $(SRC_DIR)/Analysis/%.cpp 58 | @echo "Compiling $*.cpp" 59 | $(call builditdood,Analysis) 60 | 61 | %.o : $(SRC_DIR)/%.cpp 62 | @echo "Compiling $*.cpp" 63 | $(call builditdood,.)
64 | 65 | passmgr: 66 | @echo "Building passmanager clean up ldflags XXX" 67 | $(QUIET)$(CPP) -o $(BOD)/manager.o -c $(CPPFLAGS) $(CXXFLAGS) src/manager.cpp 68 | $(QUIET)$(CPP) -o $(OPM) $(CXXFLAGS) build/obj/manager.o ${addprefix $(BOD)/,$(PASS_OBJECTS)} $(LDFLAGS) $(LIBS) 69 | 70 | 71 | $(PASS) : $(PASS_OBJECTS) 72 | @echo "Linking $@" 73 | $(QUIET)$(CPP) -o build/lib/$@ $(LOADABLE_MODULE_OPTIONS) $(CXXFLAGS) $(LDFLAGS) ${addprefix $(BOD)/,$^} 74 | 75 | test: testprep testfe 76 | 77 | testprep: 78 | $(QUIET)mkdir -p $(TDIR) 79 | 80 | testfe: 81 | $(QUIET)$(CC) -o $(TDIR)/FE001 tests/FE001.c 82 | $(QUIET)$(CC) -g -emit-llvm -o $(TDIR)/FE001.bc -c tests/FE001.c 83 | $(QUIET)$(LDIS) $(TDIR)/FE001.bc 84 | 85 | help: 86 | @echo "make " 87 | @echo "...See build/" 88 | @echo "make clean or make cleanall" 89 | @echo "make test" 90 | @echo "make runtests" 91 | 92 | runtests: runfe 93 | 94 | runfe: 95 | @echo "***" 96 | @echo "*** Running: Function Pass Manager with Function Recognition ***" 97 | @echo "***" 98 | $(QUIET)$(OPM) $(TDIR)/FE001.bc 99 | 100 | clean: 101 | $(QUIET)rm -rf build tests/*.ll 102 | 103 | cleanall: clean 104 | 105 | -------------------------------------------------------------------------------- /code/newpm/README.md: -------------------------------------------------------------------------------- 1 | 2 | # Attempt to leverage the new LLVM Pass Manager 3 | 4 | The code here is intended to be some basics that help with learning 5 | the LLVM API. The points dealt with here are: 6 | 7 | - Pass manager use 8 | - Pass dependency 9 | 10 | This sample is largely extracted from the LLVM source to create an isolated 11 | learning example. For those interested in the new pass manager please check 12 | out Chandler Carruth's talk at the 2014 LLVM Developers' Meeting where he 13 | explains a lot of the rationale behind the design decisions. 
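To make the "pass dependency" point concrete, here is a toy model of what an analysis manager does: it runs an analysis the first time a pass asks for a result and serves the cached value afterwards. This is only a sketch in plain C++. `ToyFunction`, `ToyAnalysisManager`, and their members are hypothetical names that mirror the shape of the new pass manager interface; none of them are LLVM types.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for an IR function (hypothetical, not llvm::Function).
struct ToyFunction {
  std::string Name;
  int InstructionCount;
};

// Toy stand-in for an analysis manager (hypothetical, not LLVM's).
class ToyAnalysisManager {
public:
  explicit ToyAnalysisManager(std::function<int(const ToyFunction &)> Fn)
      : Analysis(std::move(Fn)) {}

  // Run the analysis on first request, then serve the cached value.
  int getResult(const ToyFunction &F) {
    auto It = Cache.find(F.Name);
    if (It != Cache.end())
      return It->second;
    ++Runs;
    int Result = Analysis(F);
    Cache.emplace(F.Name, Result);
    return Result;
  }

  // Never runs the analysis: returns nullptr when nothing is cached yet.
  const int *getCachedResult(const ToyFunction &F) const {
    auto It = Cache.find(F.Name);
    return It == Cache.end() ? nullptr : &It->second;
  }

  // How many times the analysis actually executed.
  int runs() const { return Runs; }

private:
  std::function<int(const ToyFunction &)> Analysis;
  std::map<std::string, int> Cache;
  int Runs = 0;
};
```

A pass that needs the result calls `getResult`; a pass that must not trigger extra work calls `getCachedResult` and handles the null case. The real interface keys results by IR unit and analysis type rather than by name, so treat this purely as an illustration of the caching idea.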
14 | 15 | This is just for those curious as to how passes might be linked together and 16 | especially regarding analysis passes in the future. The legacy pass manager, 17 | at the time of writing (LLVM 4.0), is still the dominant use pattern. 18 | 19 | Again, just a learning tool for those interested in where pass management is 20 | heading. 21 | 22 | ## A Sample run 23 | 24 | With all of the logging turned on: 25 | ``` 26 | $ ./build/bin/Manager build/tests/FE001.bc 27 | Reading bitcode from file: build/tests/FE001.bc 28 | warning: ignoring debug info with an invalid version (700000003) in build/tests/FE001.bc 29 | Running analysis: InnerAnalysisManagerProxy 30 | Starting llvm::Function pass manager run. 31 | Running pass: (anonymous namespace)::TestFunctionPass on FE001_foo 32 | Running analysis: OuterAnalysisManagerProxy 33 | Running analysis: (anonymous namespace)::TestFunctionAnalysis 34 | Finished llvm::Function pass manager run. 35 | Starting llvm::Function pass manager run. 36 | Running pass: (anonymous namespace)::TestFunctionPass on main 37 | Running analysis: OuterAnalysisManagerProxy 38 | Running analysis: (anonymous namespace)::TestFunctionAnalysis 39 | Finished llvm::Function pass manager run. 
40 | Functions analyzed: 2 41 | Instructions analyzed: 8 42 | ``` -------------------------------------------------------------------------------- /code/newpm/build/lib/TestFunctionPass.dylib: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/code/newpm/build/lib/TestFunctionPass.dylib -------------------------------------------------------------------------------- /code/newpm/build/tests/FE001: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/code/newpm/build/tests/FE001 -------------------------------------------------------------------------------- /code/newpm/build/tests/FE001.bc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/code/newpm/build/tests/FE001.bc -------------------------------------------------------------------------------- /code/newpm/build/tests/FE001.ll: -------------------------------------------------------------------------------- 1 | ; ModuleID = 'build/tests/FE001.bc' 2 | source_filename = "tests/FE001.c" 3 | target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" 4 | target triple = "x86_64-apple-macosx10.12.0" 5 | 6 | ; Function Attrs: nounwind ssp uwtable 7 | define i32 @FE001_foo() #0 { 8 | ret i32 0 9 | } 10 | 11 | ; Function Attrs: nounwind ssp uwtable 12 | define i32 @main(i32, i8**) #0 { 13 | %3 = alloca i32, align 4 14 | %4 = alloca i32, align 4 15 | %5 = alloca i8**, align 8 16 | store i32 0, i32* %3, align 4 17 | store i32 %0, i32* %4, align 4 18 | store i8** %1, i8*** %5, align 8 19 | ret i32 1 20 | } 21 | 22 | ; Function Attrs: nounwind readnone 23 | declare void @llvm.dbg.declare(metadata, metadata, metadata) #1 24 | 25 | attributes #0 = { 
nounwind ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+fxsr,+mmx,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" } 26 | attributes #1 = { nounwind readnone } 27 | 28 | !llvm.module.flags = !{!0, !1, !2} 29 | !llvm.ident = !{!3} 30 | 31 | !0 = !{i32 2, !"Dwarf Version", i32 4} 32 | !1 = !{i32 2, !"Debug Info Version", i32 700000003} 33 | !2 = !{i32 1, !"PIC Level", i32 2} 34 | !3 = !{!"Apple LLVM version 8.1.0 (clang-802.0.42)"} 35 | -------------------------------------------------------------------------------- /code/newpm/src/Analysis/TestFunctionPass.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * This is a dependency for a few of the passes. Collects 3 | * call sites and organizes them by TargetCallType. 
4 | * 5 | */ 6 | 7 | #include "llvm/IR/Module.h" 8 | #include "llvm/IR/Function.h" 9 | #include "llvm/Pass.h" 10 | #include "llvm/IR/InstIterator.h" 11 | #include "llvm/IR/Instructions.h" 12 | #include "llvm/IR/CallSite.h" 13 | #include "llvm/Support/raw_ostream.h" 14 | 15 | #include 16 | #include 17 | #include 18 | 19 | using namespace llvm; 20 | 21 | #include "TestFunctionPass.h" 22 | 23 | 24 | -------------------------------------------------------------------------------- /code/newpm/src/Analysis/TestFunctionPass.h: -------------------------------------------------------------------------------- 1 | #ifndef __TARGETCALLSITESPASS_H 2 | #define __TARGETCALLSITESPASS_H 3 | 4 | #include "llvm/IR/PassManager.h" 5 | 6 | using namespace llvm; 7 | 8 | namespace { 9 | 10 | class TestFunctionAnalysis : public AnalysisInfoMixin<TestFunctionAnalysis> { 11 | public: 12 | struct Result { 13 | Result(int Count) : InstructionCount(Count) {} 14 | int InstructionCount; 15 | }; 16 | 17 | TestFunctionAnalysis(int &Runs) : Runs(Runs) {} 18 | 19 | /// \brief Run the analysis pass over the function and return a result. 
20 | Result run(Function &F, FunctionAnalysisManager &AM) { 21 | ++Runs; 22 | int Count = 0; 23 | for (Function::iterator BBI = F.begin(), BBE = F.end(); BBI != BBE; ++BBI) 24 | for (BasicBlock::iterator II = BBI->begin(), IE = BBI->end(); II != IE; 25 | ++II) 26 | ++Count; 27 | return Result(Count); 28 | } 29 | 30 | private: 31 | friend AnalysisInfoMixin<TestFunctionAnalysis>; 32 | static AnalysisKey Key; 33 | 34 | int &Runs; 35 | }; 36 | 37 | AnalysisKey TestFunctionAnalysis::Key; 38 | 39 | class TestModuleAnalysis : public AnalysisInfoMixin<TestModuleAnalysis> { 40 | public: 41 | struct Result { 42 | Result(int Count) : FunctionCount(Count) {} 43 | int FunctionCount; 44 | }; 45 | 46 | TestModuleAnalysis(int &Runs) : Runs(Runs) {} 47 | 48 | Result run(Module &M, ModuleAnalysisManager &AM) { 49 | ++Runs; 50 | int Count = 0; 51 | for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) 52 | ++Count; 53 | return Result(Count); 54 | } 55 | 56 | private: 57 | friend AnalysisInfoMixin<TestModuleAnalysis>; 58 | static AnalysisKey Key; 59 | 60 | int &Runs; 61 | }; 62 | 63 | AnalysisKey TestModuleAnalysis::Key; 64 | 65 | struct TestModulePass : PassInfoMixin<TestModulePass> { 66 | TestModulePass(int &RunCount) : RunCount(RunCount) {} 67 | 68 | PreservedAnalyses run(Module &M, ModuleAnalysisManager &) { 69 | ++RunCount; 70 | return PreservedAnalyses::none(); 71 | } 72 | 73 | int &RunCount; 74 | }; 75 | 76 | struct TestPreservingModulePass : PassInfoMixin<TestPreservingModulePass> { 77 | PreservedAnalyses run(Module &M, ModuleAnalysisManager &) { 78 | return PreservedAnalyses::all(); 79 | } 80 | }; 81 | 82 | struct TestFunctionPass : PassInfoMixin<TestFunctionPass> { 83 | TestFunctionPass(int &RunCount, int &AnalyzedInstrCount, 84 | int &AnalyzedFunctionCount, 85 | bool OnlyUseCachedResults = false) 86 | : RunCount(RunCount), AnalyzedInstrCount(AnalyzedInstrCount), 87 | AnalyzedFunctionCount(AnalyzedFunctionCount), 88 | OnlyUseCachedResults(OnlyUseCachedResults) {} 89 | 90 | PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) { 91 | ++RunCount; 92 | 93 | const 
ModuleAnalysisManager &MAM = 94 | AM.getResult<ModuleAnalysisManagerFunctionProxy>(F).getManager(); 95 | if (TestModuleAnalysis::Result *TMA = 96 | MAM.getCachedResult<TestModuleAnalysis>(*F.getParent())) 97 | AnalyzedFunctionCount += TMA->FunctionCount; 98 | 99 | if (OnlyUseCachedResults) { 100 | // Hack to force the use of the cached interface. 101 | if (TestFunctionAnalysis::Result *AR = 102 | AM.getCachedResult<TestFunctionAnalysis>(F)) 103 | AnalyzedInstrCount += AR->InstructionCount; 104 | } else { 105 | // Typical path just runs the analysis as needed. 106 | TestFunctionAnalysis::Result &AR = AM.getResult<TestFunctionAnalysis>(F); 107 | AnalyzedInstrCount += AR.InstructionCount; 108 | } 109 | 110 | return PreservedAnalyses::all(); 111 | } 112 | 113 | int &RunCount; 114 | int &AnalyzedInstrCount; 115 | int &AnalyzedFunctionCount; 116 | bool OnlyUseCachedResults; 117 | }; 118 | 119 | } 120 | 121 | #endif 122 | -------------------------------------------------------------------------------- /code/newpm/src/manager.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Pass Manager Implementation, divorcing itself from 3 | * the legacy pass manager. 4 | * 5 | * 6 | */ 7 | 8 | #include "llvm/LinkAllPasses.h" 9 | #include "llvm/AsmParser/Parser.h" 10 | #include "llvm/IR/LLVMContext.h" 11 | #include "llvm/IR/PassManager.h" 12 | #include "llvm/Support/SourceMgr.h" 13 | #include "llvm/Support/CommandLine.h" 14 | #include "llvm/Bitcode/BitcodeWriter.h" 15 | #include "llvm/Bitcode/BitcodeReader.h" 16 | #include "llvm/IRReader/IRReader.h" 17 | #include "llvm-c/Core.h" 18 | 19 | #include 20 | #include 21 | 22 | using namespace llvm; 23 | 24 | #include "Analysis/TestFunctionPass.h" 25 | 26 | /* 27 | * The only pass that is merged into this manager is a simple function list utility 28 | * so there isn't a need for output, etc, but for consistency's sake it's best to keep 29 | * a similar syntax. 
30 | */ 31 | cl::opt<std::string> InputBitcodeFile(cl::Positional, cl::desc(""), 32 | cl::Required); 33 | 34 | /* 35 | * Utility for testing 36 | */ 37 | std::unique_ptr<Module> parseIR(LLVMContext &Context, const char *IR) { 38 | SMDiagnostic Err; 39 | return parseAssemblyString(IR, Err, Context); 40 | } 41 | 42 | 43 | int main(int argc, char *argv[]) { 44 | 45 | std::error_code ec; 46 | // Here we can explicitly manage analyses, not just a pass. 47 | // In the grand scheme this tends to save us time, as analysis 48 | // passes don't transform the IR. More importantly it allows us to 49 | // customize the analysis output! 50 | SMDiagnostic err; 51 | std::unique_ptr<Module> irModule; 52 | 53 | cl::ParseCommandLineOptions( argc, argv ); 54 | 55 | outs() << "Reading bitcode from file: " << InputBitcodeFile << '\n'; 56 | 57 | irModule = parseIRFile( InputBitcodeFile, err, *unwrap(LLVMGetGlobalContext())); 58 | if ( irModule == nullptr ) { 59 | errs() << "Error: " << err.getMessage().str() << '\n'; 60 | return -1; 61 | } 62 | 63 | 64 | /* 65 | * Our analysis is really about functions 66 | * but to do that we'll want to bridge functions within modules 67 | * so we'll start with the registration of a FunctionAnalysis class 68 | */ 69 | FunctionAnalysisManager FAM(/*DebugLogging*/ true); 70 | int FunctionAnalysisRuns = 0; 71 | FAM.registerPass([&] { return TestFunctionAnalysis(FunctionAnalysisRuns); }); 72 | 73 | /* 74 | * We're creating the ModuleAnalysisManager and we want to register the various 75 | * analysis pieces that are available. This allows the "results" to be 76 | * looked up (in the cache), to see if the analysis has already been done. 
77 | */ 78 | ModuleAnalysisManager MAM(/*DebugLogging*/ true); 79 | int ModuleAnalysisRuns = 0; 80 | MAM.registerPass([&] { return TestModuleAnalysis(ModuleAnalysisRuns); }); 81 | MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); 82 | FAM.registerPass([&] { return ModuleAnalysisManagerFunctionProxy(MAM); }); 83 | ModulePassManager MPM; 84 | 85 | // Count the runs over a Function. 86 | int FunctionPassRunCount = 0; 87 | int AnalyzedInstrCount = 0; 88 | int AnalyzedFunctionCount = 0; 89 | 90 | /* 91 | * In this case we have a function pass that runs within the module pass. 92 | */ 93 | FunctionPassManager FPM(true); 94 | FPM.addPass( TestFunctionPass( FunctionPassRunCount, 95 | AnalyzedInstrCount, 96 | AnalyzedFunctionCount ) ); 97 | 98 | MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); 99 | 100 | /* 101 | * Ultimately we run our passes on a module of IR. The underlying unit at the pass 102 | * level might be a function or module, etc, but we have a "Module" of IR and 103 | * therefore need a ModulePassManager to exercise our analysis. 104 | */ 105 | MPM.run( *irModule.get(), MAM ); 106 | 107 | outs() << "Functions analyzed: " << FunctionAnalysisRuns << '\n'; 108 | outs() << "Instructions analyzed: " << AnalyzedInstrCount << '\n'; 109 | 110 | // je suis fine... 
111 | return 0; 112 | 113 | } 114 | -------------------------------------------------------------------------------- /code/newpm/tests/FE001.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | int 4 | FE001_foo() 5 | { 6 | return 0; 7 | } 8 | 9 | int 10 | main(int argc, char **argv) 11 | { 12 | return 1; 13 | } 14 | -------------------------------------------------------------------------------- /code/npassert/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.8 2 | #LLVM_VER=3.9 3 | #LLVM_VER=4.0 4 | LLVM_HOME=/usr/bin 5 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 6 | 7 | ifndef VERBOSE 8 | QUIET:=@ 9 | endif 10 | 11 | SRC_DIR?=$(PWD)/src 12 | 13 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 14 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 15 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 16 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 17 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 18 | 19 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 20 | LDFLAGS+=-shared -Wl,-O1 21 | 22 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 23 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 24 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 25 | 26 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 27 | CPPFLAGS+=-I$(SRC_DIR) 28 | 29 | PASS=libnpassert.so 30 | PASS_OBJECTS=NullPtrAssertPass.o 31 | 32 | default: prep $(PASS) 33 | 34 | prep: 35 | $(QUIET)mkdir -p built 36 | 37 | %.o : $(SRC_DIR)/%.cpp 38 | @echo Compiling $*.cpp 39 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 40 | 41 | $(PASS) : $(PASS_OBJECTS) 42 | @echo Linking $@ 43 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 44 | 45 | clean: 46 | $(QUIET)rm -rf built 47 | 48 | tests: 49 | $(QUIET)echo "Generating bitcode from C" 50 | $(QUIET)$(CC) -g -emit-llvm -c -o test/ex01.bc test/ex01.c 51 | $(QUIET)$(CC) -g -emit-llvm -c -o test/ex02.bc test/ex02.c 52 | $(QUIET)$(OPT) -load 
built/libnpassert.so -null-ptr-assert -npa-use-function -o test/ex02c.bc < test/ex02.bc 53 | $(QUIET)echo "Attempting to inject assertions" 54 | $(QUIET)$(OPT) -load built/libnpassert.so -null-ptr-assert -o test/ex01a.bc < test/ex01.bc 55 | $(QUIET)$(OPT) -load built/libnpassert.so -null-ptr-assert -o test/ex02a.bc < test/ex02.bc 56 | $(QUIET)echo "Running inject with config file" 57 | $(QUIET)$(OPT) -load built/libnpassert.so -null-ptr-assert -npa-target-config conf/targ.cfg -o test/ex01b.bc < test/ex01.bc 58 | $(QUIET)$(OPT) -load built/libnpassert.so -null-ptr-assert -npa-target-config conf/targ.cfg -o test/ex02b.bc < test/ex02.bc 59 | $(QUIET)echo "Running llvm-dis on the bitcode files" 60 | $(QUIET)$(DIS) --o=test/ex01a.ll test/ex01a.bc 61 | $(QUIET)$(DIS) --o=test/ex01b.ll test/ex01b.bc 62 | $(QUIET)$(DIS) --o=test/ex01.ll test/ex01.bc 63 | $(QUIET)$(DIS) --o=test/ex02a.ll test/ex02a.bc 64 | $(QUIET)$(DIS) --o=test/ex02b.ll test/ex02b.bc 65 | $(QUIET)$(DIS) --o=test/ex02.ll test/ex02.bc 66 | $(QUIET)echo "Compiling to machine code (elf)" 67 | $(QUIET)$(CC) -g -o test/ex01a test/ex01a.bc 68 | $(QUIET)$(CC) -g -o test/ex01 test/ex01.bc 69 | $(QUIET)$(CC) -g -o test/ex02a test/ex02a.bc 70 | $(QUIET)$(CC) -g -o test/ex02c test/ex02c.bc 71 | $(QUIET)$(CC) -g -o test/ex02 test/ex02.bc 72 | 73 | cleantests: 74 | rm -f test/*.bc test/*.ll test/ex01 test/ex01a test/ex02 test/ex02a test/ex02c 75 | 76 | cleanall: clean cleantests 77 | -------------------------------------------------------------------------------- /code/npassert/README.md: -------------------------------------------------------------------------------- 1 | 2 | The code will look for any/all pointer function arguments and will 3 | insert an assert(ptr != NULL); statement. It shows looking at 4 | functions, their arguments, a function pass, declaring a function, 5 | and inserting code. Why? Eh. A few reasons that are BS, but mostly 6 | to help learn API. 
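In source terms, the injected check behaves roughly like the sketch below. The names `npaWouldTrap`, `npaCheck`, and `readFirst` are hypothetical and exist only for illustration. The pass itself works on IR and forces the crash with a load from a null pointer; `std::abort` stands in for that here so the intent stays visible.

```cpp
#include <cstdlib>

// Hypothetical helper: true when the injected check would fire.
bool npaWouldTrap(const void *ptrToCheck) {
  return ptrToCheck == nullptr;
}

// Stand-in for the injected code. The real pass dereferences a null
// pointer to crash; abort() is used here as an assumption-free substitute.
void npaCheck(const void *ptrToCheck) {
  if (npaWouldTrap(ptrToCheck))
    std::abort();
}

// Shape of a function after instrumentation: the check runs first,
// before any of the original body touches the pointer.
int readFirst(const int *ptr) {
  npaCheck(ptr);
  return *ptr;
}
```

The effect is that a null argument fails at function entry rather than at the first dereference somewhere deeper in the body.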
7 | 8 | 9 | ## Build and Run 10 | 11 | This requires LLVM and Clang 3.8, 3.9, or 4.0 releases. You should 12 | review the Makefile to set up the LLVM path and version. 13 | 14 | ``` 15 | $ make 16 | $ make tests 17 | $ cd test 18 | ``` 19 | 20 | If you do not specify a configuration file, the pass checks every pointer 21 | argument of every defined function. If you specify a configuration file, 22 | each line names a function and the zero-based argument to check. The test/ex0*b 23 | files are those in which a configuration file was specified and the 24 | ex0*a are those in which none was specified, so you should see a difference. 25 | 26 | ### Example running by hand 27 | ``` 28 | $ clang -g -emit-llvm -c -o FOO.bc FOO.c 29 | $ opt -load built/libnpassert.so -null-ptr-assert -o FOO_assertall.bc < FOO.bc 30 | $ opt -load built/libnpassert.so -null-ptr-assert -npa-target-config FOO.cfg -o FOO_assertbyconfigfile.bc < FOO.bc 31 | $ clang -g -o FOO_assertall FOO_assertall.bc 32 | $ clang -g -o FOO_assertbyconfigfile FOO_assertbyconfigfile.bc 33 | ``` 34 | 35 | The last clang steps generate a runnable executable. 36 | Use llvm-dis on bitcode (.bc) files to get the human readable IR 37 | to view the differences. One may also use the -npa-use-function option 38 | which will tell the pass to create a separate function to perform 39 | the assertion in. 40 | 41 | So with the above instead of getting a crash like: 42 | ``` 43 | (gdb) r 44 | Starting program: /home/hoser/code/npassert/test/ex02 45 | 46 | Program received signal SIGSEGV, Segmentation fault. 
47 | 0x0000000000400549 in foo (s=0x0) at test/ex02.c:16 48 | 16 return s->one; 49 | (gdb) disass foo 50 | Dump of assembler code for function foo: 51 | 0x0000000000400530 <+0>: push %rbp 52 | 0x0000000000400531 <+1>: mov %rsp,%rbp 53 | 0x0000000000400534 <+4>: sub $0x10,%rsp 54 | 0x0000000000400538 <+8>: mov %rdi,-0x8(%rbp) 55 | 0x000000000040053c <+12>: callq 0x400420 56 | 0x0000000000400541 <+17>: mov %rax,-0x10(%rbp) 57 | 0x0000000000400545 <+21>: mov -0x8(%rbp),%rax 58 | => 0x0000000000400549 <+25>: mov (%rax),%eax 59 | 0x000000000040054b <+27>: add $0x10,%rsp 60 | 0x000000000040054f <+31>: pop %rbp 61 | 0x0000000000400550 <+32>: retq 62 | End of assembler dump. 63 | (gdb) 64 | ``` 65 | 66 | You would get a crash early in the function.. 67 | ``` 68 | (gdb) r 69 | Starting program: /home/hoser/code/npassert/test/ex02a 70 | Program received signal SIGSEGV, Segmentation fault. 71 | 0x0000000000400554 in foo (s=0x0) at test/ex02.c:12 72 | 12 { 73 | (gdb) disass foo 74 | Dump of assembler code for function foo: 75 | 0x0000000000400530 <+0>: push %rbp 76 | 0x0000000000400531 <+1>: mov %rsp,%rbp 77 | 0x0000000000400534 <+4>: sub $0x20,%rsp 78 | 0x0000000000400538 <+8>: cmp $0x0,%rdi 79 | 0x000000000040053c <+12>: mov %rdi,-0x8(%rbp) 80 | 0x0000000000400540 <+16>: jne 0x400558 81 | 0x0000000000400546 <+22>: xor %eax,%eax 82 | 0x0000000000400548 <+24>: mov %eax,%ecx 83 | 0x000000000040054a <+26>: mov %rsp,%rdx 84 | 0x000000000040054d <+29>: add $0xfffffffffffffff0,%rdx 85 | 0x0000000000400551 <+33>: mov %rdx,%rsp 86 | => 0x0000000000400554 <+36>: mov (%rcx),%eax 87 | 0x0000000000400556 <+38>: mov %eax,(%rdx) 88 | 0x0000000000400558 <+40>: mov %rsp,%rax 89 | 0x000000000040055b <+43>: add $0xfffffffffffffff0,%rax 90 | 0x000000000040055f <+47>: mov %rax,%rsp 91 | 0x0000000000400562 <+50>: mov %rsp,%rcx 92 | 0x0000000000400565 <+53>: add $0xfffffffffffffff0,%rcx 93 | 0x0000000000400569 <+57>: mov %rcx,%rsp 94 | 0x000000000040056c <+60>: mov -0x8(%rbp),%rdx 95 | 
0x0000000000400570 <+64>: mov %rdx,(%rax) 96 | 0x0000000000400573 <+67>: mov %rax,-0x10(%rbp) 97 | 0x0000000000400577 <+71>: mov %rcx,-0x18(%rbp) 98 | 0x000000000040057b <+75>: callq 0x400420 99 | 0x0000000000400580 <+80>: mov -0x18(%rbp),%rcx 100 | 0x0000000000400584 <+84>: mov %rax,(%rcx) 101 | 0x0000000000400587 <+87>: mov -0x10(%rbp),%rax 102 | 0x000000000040058b <+91>: mov (%rax),%rdx 103 | 0x000000000040058e <+94>: mov (%rdx),%eax 104 | 0x0000000000400590 <+96>: mov %rbp,%rsp 105 | 0x0000000000400593 <+99>: pop %rbp 106 | 0x0000000000400594 <+100>: retq 107 | ``` 108 | 109 | Or if you add the -npa-use-function option 110 | ``` 111 | (gdb) r 112 | Starting program: /home/hoser/code/npassert/test/ex02c 113 | 114 | Program received signal SIGSEGV, Segmentation fault. 115 | 0x00000000004005b3 in __NPA_assert_8__ () 116 | (gdb) 117 | ``` 118 | it will fault in the assertion function. 119 | -------------------------------------------------------------------------------- /code/npassert/conf/targ.cfg: -------------------------------------------------------------------------------- 1 | # Ignore lines with # 2 | # index by 0 3 | foo,0 4 | foo3,1 5 | foo3,0 6 | -------------------------------------------------------------------------------- /code/npassert/src/NullPtrAssertPass.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/IRBuilder.h" 3 | #include "llvm/IR/Constant.h" 4 | #include "llvm/IR/Constants.h" 5 | #include "llvm/IR/Instructions.h" 6 | #include "llvm/Support/raw_ostream.h" 7 | #include "llvm/Support/CommandLine.h" 8 | 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | 15 | using namespace llvm; 16 | 17 | #include "NullPtrAssertPass.h" 18 | 19 | /* 20 | * cl::opt represent options taken from the command line, they can include defaults 21 | * help descriptions, etc. 
This is how a pass can be made more granular or specify 22 | * a type of analysis, for instance. 23 | */ 24 | cl::opt<std::string> ReplaceConfigFileName("npa-target-config", 25 | cl::desc("configuration file for np asserts"), cl::init("")); 26 | 27 | cl::opt<bool> AssertFunction("npa-use-function", 28 | cl::desc("if set, then create new assertion function"), cl::init(false)); 29 | 30 | void 31 | NullPtrAssertPass::insertAssertionFunction(Module *M) 32 | { 33 | LLVMContext &ctx = M->getContext(); 34 | 35 | /* 36 | * Insert our own assertion function. void __NPA_assert_8__(i8 *, i32) 37 | */ 38 | Constant *assertFnCon = M->getOrInsertFunction("__NPA_assert_8__", 39 | Type::getVoidTy(ctx), 40 | Type::getInt8PtrTy(ctx), Type::getInt32Ty(ctx), NULL); 41 | assertFn = cast<Function>(assertFnCon); 42 | assertFn->setCallingConv(CallingConv::C); 43 | Argument *arg = &assertFn->getArgumentList().front(); 44 | arg->setName("ptrToCheck"); 45 | 46 | /* 47 | * Insert blocks for entry and branches 48 | */ 49 | BasicBlock *blkEntry = BasicBlock::Create(ctx, "npa_entry_blk", assertFn); 50 | BasicBlock *blkAssert = BasicBlock::Create(ctx, "npa_assert_blk", assertFn); 51 | BasicBlock *blkReturn = BasicBlock::Create(ctx, "npa_return_blk", assertFn); 52 | 53 | /* 54 | * Creates a null pointer and compares it with the argument. 
55 | * If equal, branch to blkAssert, if not, return block 56 | */ 57 | IRBuilder<> builder8(blkEntry); 58 | PointerType *ptNull = cast<PointerType>(Type::getInt8PtrTy(ctx)); 59 | Value *cpNull0 = ConstantPointerNull::get(ptNull); 60 | Value *eqNull = builder8.CreateICmpEQ(arg, cpNull0); 61 | (void)builder8.CreateCondBr(eqNull, blkAssert, blkReturn); 62 | 63 | /* 64 | * attempt to deref null 65 | */ 66 | builder8.SetInsertPoint(blkAssert); 67 | Value *crLoad = builder8.CreateAlloca(IntegerType::get(ctx, 32)); 68 | auto ptNullB = PointerType::get(IntegerType::get(ctx, 32), 0); 69 | Value *cpNull = ConstantPointerNull::get(ptNullB); 70 | Value *doLoad = builder8.CreateLoad(cpNull); 71 | Value *doStore = builder8.CreateStore(doLoad, crLoad, true); 72 | (void)builder8.CreateRetVoid(); 73 | 74 | /* return block */ 75 | builder8.SetInsertPoint(blkReturn); 76 | (void)builder8.CreateRetVoid(); 77 | 78 | } 79 | 80 | void 81 | NullPtrAssertPass::insertAssertionFunctionCall(Module *M, Function *F, 82 | Argument *A) 83 | { 84 | LLVMContext &ctx = M->getContext(); 85 | BasicBlock *oe = &F->front(); 86 | Instruction *oi = &oe->front(); 87 | APInt a(32, A->getArgNo(), false); 88 | ConstantInt *cv = ConstantInt::get(ctx, a); 89 | Value *v = new BitCastInst(A, Type::getInt8PtrTy(ctx), "bitcastme_", oi); 90 | CallInst::Create(assertFn, {v, cv}, "", oi); 91 | return; 92 | } 93 | 94 | void 95 | NullPtrAssertPass::insertAssertion(Module *M, Function *F, Argument *A) 96 | { 97 | LLVMContext &ctx = M->getContext(); 98 | 99 | /* 100 | * Get original first BasicBlock in the function. 101 | */ 102 | BasicBlock *origEntry = &F->front(); 103 | 104 | /* 105 | * Add a new BasicBlock before this original first one. 106 | * This block will hold code that contains the compare and 107 | * conditional branch statements. 
108 | */ 109 | BasicBlock *entryBlock = BasicBlock::Create(ctx, "npa_entry_blk", F, 110 | origEntry); 111 | /* 112 | * Add block between the new entry block and the original entry that 113 | * will hold code for causing crash. 114 | */ 115 | BasicBlock *assertBlock = BasicBlock::Create(ctx, "npa_assert_blk", F, 116 | origEntry); 117 | 118 | /* 119 | * Using the IRBuilder API to add code 120 | * Get a NULL pointer and see if it and the argument A 121 | * are equal. 122 | * Branch to the assert block if equal, otherwise, go to 123 | * the original entry block and continue. 124 | */ 125 | IRBuilder<> builder(entryBlock); 126 | Type *typeNull = A->getType(); 127 | assert(isa<PointerType>(typeNull) && "typeNull not PointerType"); 128 | PointerType *ptNull = cast<PointerType>(typeNull); 129 | Value *cpNull0 = ConstantPointerNull::get(ptNull); 130 | Value *eqNull = builder.CreateICmpEQ(A, cpNull0); 131 | (void)builder.CreateCondBr(eqNull, assertBlock, origEntry); 132 | 133 | /* 134 | * Add the crash code to the assertBlock. 135 | * The basic idea is trying to deref null ptr. We could do this 136 | * to re-use some NULLs we might have, but to show more of the API 137 | * we do it in the following manner. 
138 | */ 139 | builder.SetInsertPoint(assertBlock); 140 | Value *crLoad = builder.CreateAlloca(IntegerType::get(ctx, 32)); 141 | auto ptNullB = PointerType::get(IntegerType::get(ctx, 32), 0); 142 | Value *cpNull = ConstantPointerNull::get(ptNullB); 143 | Value *doLoad = builder.CreateLoad(cpNull); 144 | Value *doStore = builder.CreateStore(doLoad, crLoad, true); 145 | // Add a terminator for this block so the IR is sane 146 | builder.CreateBr(origEntry); 147 | } 148 | 149 | bool 150 | NullPtrAssertPass::attemptInsertAssert(Module *M, Function *f, int targetIdx) 151 | { 152 | bool chg = false; 153 | int argIdx = -1; 154 | 155 | /* Decl not defn */ 156 | if (f->isDeclaration() == true) { 157 | return false; 158 | } 159 | 160 | /* No body, nothing to inject */ 161 | if (f->empty() == true) { 162 | return false; 163 | } 164 | 165 | /* Deal with just named functions. */ 166 | if (f->hasName() == false) { 167 | return false; 168 | } 169 | 170 | /* No arguments, nothing to possibly assert. */ 171 | if (f->arg_empty() == true) { 172 | return false; 173 | } 174 | /* We have a specified argument index */ 175 | if (targetIdx >= 0 && f->arg_size() <= (unsigned)targetIdx) { 176 | return false; 177 | } 178 | for (auto ait = f->arg_begin(); ait != f->arg_end(); ++ait) { 179 | ++argIdx; 180 | if (targetIdx != -1 && argIdx != targetIdx) { 181 | continue; 182 | } 183 | Argument *a = &*ait; 184 | Type *ty = a->getType(); 185 | 186 | // The type of this argument is not of pointer; nothing to do. 
187 | if (ty->isPointerTy() == false) { 188 | continue; 189 | } 190 | if (AssertFunction) { 191 | insertAssertionFunctionCall(M, f, a); 192 | } else { 193 | insertAssertion(M, f, a); 194 | } 195 | chg = true; 196 | } 197 | return chg; 198 | } 199 | 200 | bool 201 | NullPtrAssertPass::runOnModule(Module &M) 202 | { 203 | bool chg = false; 204 | 205 | if (AssertFunction) { 206 | insertAssertionFunction(&M); 207 | } 208 | /* Do'em all */ 209 | if (ReplaceConfigFileName == "") { 210 | for (auto fit = M.functions().begin(); 211 | fit != M.functions().end(); ++fit) { 212 | Function *f = &*fit; 213 | if (f->hasName()) { 214 | if (f->getName().str().find("__NPA_",0) == 0) { 215 | continue; 216 | } 217 | } 218 | chg |= attemptInsertAssert(&M, f, -1); // accumulate, do not overwrite 219 | } 220 | } else { 221 | std::ifstream hand(ReplaceConfigFileName); 222 | std::string line; 223 | std::string fnName; 224 | std::string argIdx; 225 | int idx; 226 | // We trust that the user is giving us a sane file. 227 | while (std::getline(hand, line)) { 228 | if (line.find("#", 0) == 0) { 229 | continue; 230 | } 231 | size_t i = line.find(",", 0); 232 | fnName = line.substr(0, i); 233 | argIdx = line.substr(i+1); // XXX ;^p 234 | idx = stoi(argIdx, nullptr, 10); 235 | if (idx < 0) { 236 | idx = -1; 237 | } 238 | Function *f = M.getFunction(fnName); 239 | if (f == NULL) { 240 | errs() << "Unable to find function: " << fnName << "\n"; 241 | continue; 242 | } 243 | chg |= attemptInsertAssert(&M, f, idx); // accumulate, do not overwrite 244 | } 245 | } 246 | return chg; 247 | } 248 | 249 | char NullPtrAssertPass::ID = 0; 250 | static RegisterPass<NullPtrAssertPass> XX("null-ptr-assert", "Inject null ptr checks"); 251 | 252 | -------------------------------------------------------------------------------- /code/npassert/src/NullPtrAssertPass.h: -------------------------------------------------------------------------------- 1 | #ifndef __NULLPTRASSERTPASS_H 2 | #define __NULLPTRASSERTPASS_H 3 | 4 | struct NullPtrAssertPass : public ModulePass { 5 | static char ID; 6 | 7 | 
NullPtrAssertPass() : ModulePass(ID) { } 8 | virtual bool runOnModule(Module &); 9 | 10 | private: 11 | void insertAssertion(Module *M, Function *F, Argument *A); 12 | bool attemptInsertAssert(Module *M, Function *F, int); 13 | void insertAssertionFunction(Module *M); 14 | void insertAssertionFunctionCall(Module *M, Function *F, Argument *A); 15 | 16 | Function *assertFn; 17 | }; 18 | 19 | #endif 20 | -------------------------------------------------------------------------------- /code/npassert/test/ex01.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | void 5 | foo(int *ptr) 6 | { 7 | long int k = 0; 8 | k = random(); 9 | printf("Yessir %ld %d\n", k, *ptr); 10 | return; 11 | } 12 | 13 | void 14 | foo2(char *dingle) 15 | { 16 | long int k = 0; 17 | 18 | k = random(); 19 | printf("dingle %ld %s\n", k, dingle); 20 | } 21 | void 22 | foo3(int *ptr, char *foo) 23 | { 24 | long int k = 0; 25 | k = random(); 26 | printf("Yessir %ld %d %s\n", k, *ptr, foo); 27 | } 28 | 29 | int 30 | main(int argc, char **argv) 31 | { 32 | foo(&argc); 33 | foo2("hi there"); 34 | foo3(&argc, "hi there"); 35 | foo2(NULL); 36 | foo((int *)NULL); 37 | return 0; 38 | } 39 | -------------------------------------------------------------------------------- /code/npassert/test/ex02.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | struct shame { 5 | int one; 6 | int two; 7 | int three; 8 | }; 9 | 10 | int 11 | foo(struct shame *s) 12 | { 13 | long int z; 14 | 15 | z = random(); 16 | return s->one; 17 | } 18 | 19 | int 20 | main(int argc, char **argv) 21 | { 22 | struct shame *s = (struct shame *)NULL; 23 | foo(s); 24 | return 0; 25 | } 26 | -------------------------------------------------------------------------------- /code/rpskel/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.9 2 | LLVM_HOME=/usr/bin 3 
| LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 4 | 5 | ifndef VERBOSE 6 | QUIET:=@ 7 | endif 8 | 9 | SRC_DIR?=$(PWD)/src 10 | 11 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 12 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 13 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 14 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 15 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 16 | 17 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 18 | LDFLAGS+=-shared -Wl,-O1 19 | 20 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 21 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 22 | CXXFLAGS+=-Wall -Wextra -g -Wno-unused-parameter -Wno-unused-variable 23 | 24 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 25 | CPPFLAGS+=-I$(SRC_DIR) 26 | 27 | PASS=RPSkel.so 28 | PASS_OBJECTS=RPSkel.o 29 | 30 | default: prep $(PASS) 31 | 32 | prep: 33 | $(QUIET)mkdir -p built 34 | 35 | %.o : $(SRC_DIR)/%.cpp 36 | @echo Compiling $*.cpp 37 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 38 | 39 | $(PASS) : $(PASS_OBJECTS) 40 | @echo Linking $@ 41 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 42 | 43 | clean: 44 | $(QUIET)rm -rf built test/*.bc 45 | 46 | 47 | tests: 48 | $(CC) -emit-llvm -o test/foo.bc -c test/foo.c 49 | 50 | runtests: 51 | $(OPT) -load built/RPSkel.so -rpskel < test/foo.bc 52 | -------------------------------------------------------------------------------- /code/rpskel/README.md: -------------------------------------------------------------------------------- 1 | 2 | # RegionPass Skeleton 3 | 4 | 5 | # Build & Run 6 | 7 | First check the Makefile to set path to llvm-config and version. 8 | 3.8, 3.9 should be fine, so should 4.0 9 | 10 | ``` 11 | $ make 12 | $ opt-X.Y -load built/RPSkel.so -rpskel < file.bc 13 | ... 
14 | $ 15 | ``` 16 | 17 | 18 | -------------------------------------------------------------------------------- /code/rpskel/src/RPSkel.cpp: -------------------------------------------------------------------------------- 1 | #include "llvm/IR/Module.h" 2 | #include "llvm/IR/Instructions.h" 3 | #include "llvm/Analysis/RegionPass.h" 4 | #include "llvm/Analysis/RegionInfo.h" 5 | #include "llvm/Support/raw_ostream.h" 6 | 7 | using namespace llvm; 8 | 9 | #include "RPSkel.h" 10 | 11 | void 12 | RPSkel::getAnalysisUsage(AnalysisUsage &AU) const 13 | { 14 | // No changes to CFG, so tell the pass manager 15 | AU.setPreservesCFG(); 16 | } 17 | 18 | bool 19 | RPSkel::doFinalization() 20 | { 21 | return false; 22 | } 23 | 24 | bool 25 | RPSkel::doInitialization(Region *R, RGPassManager &RGPM) 26 | { 27 | return false; 28 | } 29 | 30 | bool 31 | RPSkel::runOnRegion(Region *R, RGPassManager &RGPM) 32 | { 33 | errs() << " Region found:\n"; 34 | 35 | R->print(errs(), true, 1, Region::PrintNone); 36 | errs() << " --- end of Region ---\n"; 37 | 38 | // return true if Region has been changed. 39 | return false; 40 | } 41 | 42 | 43 | 44 | /* 45 | * Register this pass to be made usable. 46 | * Needs the static ID initialized and the pass declaration given. 47 | */ 48 | char RPSkel::ID = 0; 49 | static RegisterPass<RPSkel> XX("rpskel", "Region Pass Skeleton"); 50 | 51 | -------------------------------------------------------------------------------- /code/rpskel/src/RPSkel.h: -------------------------------------------------------------------------------- 1 | #ifndef __RPSKEL_H 2 | #define __RPSKEL_H 3 | 4 | struct RPSkel : public RegionPass { 5 | /* 6 | * For all of your passes you will need this and to define it. 7 | * Its address is used by the pass system, so the value does not matter. 8 | */ 9 | static char ID; 10 | 11 | RPSkel() : RegionPass(ID) { 12 | } 13 | 14 | // Return true if Region was modified, otherwise false.
15 | virtual bool runOnRegion(Region *R, RGPassManager &RGM); 16 | 17 | /* 18 | * Used to help order passes by pass manager. 19 | * Declare any passes you need run prior here.. as well as 20 | * any information such as preserving CFG or similar. 21 | */ 22 | virtual void getAnalysisUsage(AnalysisUsage &) const; 23 | 24 | virtual bool doInitialization(Region *R, RGPassManager &RGM); 25 | virtual bool doFinalization(); 26 | }; 27 | 28 | #endif 29 | -------------------------------------------------------------------------------- /code/rpskel/test/foo.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | void 9 | leaks_passwd() 10 | { 11 | char *p; 12 | struct addrinfo hints, *result; 13 | 14 | p = getpass("enter passwd: "); 15 | memset(&hints, 0, sizeof(struct addrinfo)); 16 | hints.ai_family = AF_UNSPEC; 17 | hints.ai_socktype = SOCK_DGRAM; 18 | hints.ai_flags = 0; 19 | hints.ai_protocol = 0; 20 | (void)getaddrinfo(p, "http", &hints, &result); 21 | } 22 | 23 | int 24 | main(int argc, char **argv) 25 | { 26 | leaks_passwd(); 27 | return 0; 28 | } 29 | -------------------------------------------------------------------------------- /code/visitorskel/Makefile: -------------------------------------------------------------------------------- 1 | LLVM_VER=3.9 2 | LLVM_HOME=/usr/bin 3 | LLVM_CONFIG?=$(LLVM_HOME)/llvm-config-$(LLVM_VER) 4 | 5 | ifndef VERBOSE 6 | QUIET:=@ 7 | endif 8 | 9 | SRC_DIR?=$(PWD)/src 10 | 11 | CXX=$(LLVM_HOME)/clang++-$(LLVM_VER) 12 | CC=$(LLVM_HOME)/clang-$(LLVM_VER) 13 | OPT=$(LLVM_HOME)/opt-$(LLVM_VER) 14 | DIS=$(LLVM_HOME)/llvm-dis-$(LLVM_VER) 15 | LNK=$(LLVM_HOME)/llvm-link-$(LLVM_VER) 16 | 17 | LDFLAGS+=$(shell $(LLVM_CONFIG) --ldflags) 18 | LDFLAGS+=-shared -Wl,-O1 19 | 20 | CXXFLAGS+=-I$(shell $(LLVM_CONFIG) --includedir) 21 | CXXFLAGS+=-std=c++11 -fPIC -fvisibility-inlines-hidden 22 | CXXFLAGS+=-Wall -Wextra -g 
-Wno-unused-parameter -Wno-unused-variable 23 | 24 | CPPFLAGS+=$(shell $(LLVM_CONFIG) --cppflags) 25 | CPPFLAGS+=-I$(SRC_DIR) 26 | 27 | 28 | PASS=VisitorSkel.so 29 | PASS_OBJECTS=VisitorSkelModulePass.o 30 | 31 | default: prep $(PASS) 32 | 33 | prep: 34 | $(QUIET)mkdir -p built 35 | 36 | %.o : $(SRC_DIR)/%.cpp 37 | @echo Compiling $*.cpp 38 | $(QUIET)$(CXX) -o built/$*.o -c $(CPPFLAGS) $(CXXFLAGS) $< 39 | 40 | $(PASS) : $(PASS_OBJECTS) 41 | @echo Linking $@ 42 | $(QUIET)$(CXX) -o built/$@ $(LDFLAGS) $(CXXFLAGS) built/*.o 43 | 44 | clean: 45 | $(QUIET)rm -rf built 46 | 47 | -------------------------------------------------------------------------------- /code/visitorskel/README.md: -------------------------------------------------------------------------------- 1 | 2 | # VisitorSkel 3 | 4 | This is a module pass that runs an instruction visitor 5 | on each function it encounters in the given compilation 6 | unit. 7 | 8 | ## Build 9 | 10 | Check the Makefile to be sure the LLVM version and llvm-config 11 | paths are correct, then run make. This produces built/VisitorSkel.so. 12 | 13 | ## Run 14 | 15 | ``` 16 | $ clang -emit-llvm -o file.bc -c file.c 17 | $ opt-X.Y -load built/VisitorSkel.so -vskel < file.bc 18 | ... 19 | $ opt-X.Y -load built/VisitorSkel.so -vskel -fn-to-visit someFuncName < file.bc 20 | ... 21 | ``` 22 | 23 | -------------------------------------------------------------------------------- /code/visitorskel/src/VisitorSkelModulePass.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Module pass that makes use of an InstVisitor but 3 | * does nothing but dump values.
4 | */ 5 | 6 | #include "llvm/IR/Module.h" 7 | #include "llvm/IR/InstVisitor.h" 8 | #include "llvm/Support/raw_ostream.h" 9 | #include "llvm/Support/CommandLine.h" 10 | 11 | using namespace llvm; 12 | 13 | cl::opt<std::string> FunctionToVisit("fn-to-visit", cl::init("")); 14 | 15 | struct VisitorSkelModulePass : public ModulePass { 16 | static char ID; 17 | 18 | VisitorSkelModulePass() : ModulePass(ID) {} 19 | 20 | /* 21 | * The point of this pass is to demonstrate how easily we can find a specific instruction type, rather 22 | * than looping through each instruction... 23 | */ 24 | virtual bool 25 | runOnModule(Module &M) 26 | { 27 | if (FunctionToVisit != "") { 28 | Function *f = M.getFunction(FunctionToVisit); 29 | if (f == NULL || f->isDeclaration()) { 30 | errs() << "Unable to find the function you specified.\n"; 31 | return false; 32 | } 33 | errs() << "Visiting function: " << f->getName() << "\n"; 34 | SkelInstVisitor v; 35 | v.visit(*f); 36 | return false; 37 | } 38 | for (auto &f : M) { 39 | SkelInstVisitor v; 40 | if (f.isDeclaration()) { 41 | errs() << "Skipping function declaration\n"; 42 | continue; 43 | } 44 | if (f.hasName()) { 45 | errs() << "\nVisiting function: " << f.getName() << "\n"; 46 | } else { 47 | errs() << "\nVisiting unnamed function:\n"; 48 | } 49 | // You can call on a module, a function, or a basicblock.... 50 | // depends on what you need. 51 | v.visit(f); 52 | errs() << "\n"; 53 | } 54 | return false; 55 | } 56 | 57 | /* See llvm/IR/InstVisitor.h for full set...
*/ 58 | struct SkelInstVisitor : public InstVisitor<SkelInstVisitor> { 59 | void 60 | visitReturnInst(ReturnInst &I) { 61 | errs() << "X return instruction\n "; 62 | I.dump(); 63 | } 64 | void visitBranchInst(BranchInst &I) { 65 | errs() << "X branch instruction\n "; 66 | I.dump(); 67 | } 68 | void visitSwitchInst(SwitchInst &I) { 69 | errs() << "X switch instruction\n "; 70 | I.dump(); 71 | } 72 | void visitIndirectBrInst(IndirectBrInst &I) { 73 | errs() << "X indirect branch instruction\n "; 74 | I.dump(); 75 | } 76 | void visitCallInst(CallInst &I) { 77 | errs() << "X call instruction\n "; 78 | I.dump(); 79 | } 80 | void visitInvokeInst(InvokeInst &I) { 81 | errs() << "X invoke instruction\n "; 82 | I.dump(); 83 | } 84 | void visitCallSite(CallSite &I) { 85 | errs() << "X call site instruction\n "; 86 | } 87 | 88 | void visitBinaryOperator(BinaryOperator &I) { 89 | errs() << "X binary instruction\n "; 90 | I.dump(); 91 | } 92 | void visitCmpInst(CmpInst &I) { 93 | errs() << "X compare instruction\n "; 94 | I.dump(); 95 | } 96 | void visitBitCastInst(BitCastInst &I) { 97 | errs() << "X bit cast instruction\n "; 98 | I.dump(); 99 | } 100 | void visitICmpInst(ICmpInst &I) { 101 | errs() << "X int compare instruction\n "; 102 | I.dump(); 103 | } 104 | void visitAllocaInst(AllocaInst &I) { 105 | errs() << "X alloca instruction\n "; 106 | I.dump(); 107 | } 108 | void visitLoadInst(LoadInst &I) { 109 | errs() << "X load instruction\n "; 110 | I.dump(); 111 | } 112 | void visitStoreInst(StoreInst &I) { 113 | errs() << "X store instruction\n "; 114 | I.dump(); 115 | } 116 | void visitGetElementPtrInst(GetElementPtrInst &I){ 117 | errs() << "X get element pointer instruction\n "; 118 | I.dump(); 119 | } 120 | void visitPHINode(PHINode &I) { 121 | errs() << "X PHI node instruction\n "; 122 | I.dump(); 123 | } 124 | void visitTruncInst(TruncInst &I) { 125 | errs() << "X truncate instruction\n "; 126 | I.dump(); 127 | } 128 | void visitZExtInst(ZExtInst &I) { 129 | errs() << "X zero extend
instruction\n "; 130 | I.dump(); 131 | } 132 | void visitSExtInst(SExtInst &I) { 133 | errs() << "X sign extend instruction\n "; 134 | I.dump(); 135 | } 136 | void visitUnaryInstruction(UnaryInstruction &I) { 137 | errs() << "X unary instruction\n "; 138 | I.dump(); 139 | } 140 | }; 141 | }; 142 | 143 | char VisitorSkelModulePass::ID = 0; 144 | static RegisterPass<VisitorSkelModulePass> XX("vskel", 145 | "Skeleton code for a module pass that visits function instructions"); 146 | -------------------------------------------------------------------------------- /possible_projects.txt: -------------------------------------------------------------------------------- 1 | 2 | There are many possible projects, but if you are looking for some 3 | ideas to get you going, here are some. Note, you may need to look 4 | at the slide deck or the projects.md file for some idea of what is 5 | being said. 6 | 7 | 8 | - Revive foreign-inference project to newer LLVM. Add support for LibFuzzer 9 | 10 | - Passes that look at slices or pruning or other for improvements in fuzzing 11 | 12 | - Use SAW on open source projects to flesh out things 13 | 14 | - Create a tool that analyzes code meant to use Divine or Seahorn for the 15 | purposes of cataloging or policy validation 16 | 17 | - Build on Comminute to make it a more realistic tool 18 | 19 | - Build on IntFlip 20 | 21 | - Build on npassert to determine asymptotic behaviors of code (think ASAP 22 | from EPFL) 23 | 24 | - Snapshotting kernel regions for replay 25 | 26 | There are more; I will add them as I recall them. 27 | 28 | 29 | -------------------------------------------------------------------------------- /projects.md: -------------------------------------------------------------------------------- 1 | # Security Related LLVM Projects 2 | 3 | Some entries are on the cusp of security and non-security related. Please 4 | submit a pull request (or a friendly email) to add or remove projects.
I 5 | highly doubt it is a complete list, so always welcoming additions and/or 6 | corrections. Further, I had thought to break out papers into their own 7 | section, but decided against it for reasons related to sleep. And even 8 | further... There are some things I just wanted in here for informational 9 | reasons; i.e. they may not be projects. :camel: 10 | 11 | ## Projects 12 | 13 | ### AddressSanitizer (ASan) 14 | > Memory error detector for C/C++ finding UAF, buffer overflow, 15 | > UAR, etc 16 | 17 | [GitHub](https://github.com/google/sanitizers/wiki/AddressSanitizer) 18 | Note the code in lib/Transforms/Instrumentation. 19 | 20 | [``AddressSanitizer: A Fast Address Sanity Checker''](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37752.pdf) 21 | 22 | 23 | ### Alive 24 | > Verifying the InstCombiner code; Regehr et al 25 | 26 | [GitHub](https://github.com/nunoplopes/alive) 27 | 28 | 29 | ### Alive-nj 30 | > Reimplementation of the Alive code 31 | 32 | [GitHub](https://github.com/rutgers-apl/alive-nj) 33 | 34 | 35 | ### American Fuzzy Lop (AFL) 36 | > a security-oriented fuzzer that employs a novel type of compile-time 37 | > instrumentation and genetic algorithms to automatically discover 38 | > clean, interesting test cases that trigger new internal states in the 39 | > targeted binary 40 | 41 | http://lcamtuf.coredump.cx/afl 42 | 43 | ### Andersen's Analysis 44 | [``Program Analysis and Specialization for the C Programming Language''](http://www.cs.cornell.edu/courses/cs711/2005fa/papers/andersen-thesis94.pdf) 45 | 46 | [Standalone LLVM impl](https://github.com/grievejia/andersen) 47 | 48 | ### ASAP (EPFL) 49 | > Analyze runtime checks to balance those that are needed and those that are not 50 | 51 | http://dslab.epfl.ch/proj/asap/ 52 | 53 | ### Causal, Adaptive, Distributed, and Efficient Tracing System (CADETS) 54 | > address flaws in current audit and information-flow systems through 55 | > fundamental improvements in
dynamic instrumentation, scalable 56 | > distributed tracing, and programming-language support. 57 | 58 | http://www.cl.cam.ac.uk/research/security/cadets/ 59 | 60 | ### cclyzer 61 | > A tool for analyzing LLVM bitcode using Datalog. 62 | 63 | [GitHub](https://github.com/plast-lab/cclyzer) 64 | 65 | ### cspgen (Draper) 66 | > Map IR to models in Communicating Sequential Processes (CSP) process calculus. 67 | 68 | [GitHub](https://github.com/draperlaboratory/cspgen) 69 | 70 | ### DangSan (VUSec) 71 | > Dangling pointer detection 72 | 73 | [GitHub](https://github.com/vusec/dangsan) 74 | 75 | [Paper](http://www.cs.vu.nl/~giuffrida/papers/dangsan_eurosys17.pdf) 76 | 77 | ### DataFlowSanitizer 78 | > Generalized data flow API 79 | 80 | http://clang.llvm.org/docs/DataFlowSanitizerDesign.html 81 | 82 | ### Divine 83 | > Explicit-state model checker 84 | 85 | https://divine.fi.muni.cz 86 | 87 | [Thesis](https://is.muni.cz/th/373979/fi_m/thesis.pdf) 88 | 89 | 90 | ### DomTreSat 91 | > static analysis system that takes source code as input and automatically produces path satisfiability reports for paths gathered from a created Dominator Tree structure...main use of this tool is determine reachability of controllable input to a target in the program, as well as what this input needs to be to get there 92 | 93 | [GitHub](https://github.com/trailofbits/DomTreSat) 94 | 95 | 96 | ### DynamicTools 97 | > This project consists of several useful tool for dealing with LLVM 98 | > IR runtime behaviors. Currently it consists of two parts, 99 | > a custom-written LLVM IR interpreter, and an LLVM IR fuzzer (abandoned). 
100 | 101 | [GitHub](https://github.com/grievejia/LLVMDynamicTools) 102 | 103 | ### fcd 104 | > An optimizing decompiler (for reversing) 105 | 106 | http://zneak.github.io/fcd/ 107 | 108 | [GitHub](https://github.com/zneak/fcd) 109 | 110 | ### FindFlows 111 | https://llvm.org/svn/llvm-project/giri/trunk/lib/Static/FindFlows.cpp 112 | 113 | ### Foreign Inference 114 | > Auto-generate wrappers for C code for Python etc. 115 | 116 | [GitHub](https://github.com/travitch/foreign-inference) 117 | Out of date; someone should revive it and build LibFuzzer wrappers 118 | 119 | ### Fracture (Draper) 120 | > Arch independent decompiler to LLVM IR (ver 3.6) 121 | 122 | [GitHub](https://github.com/draperlaboratory/fracture) 123 | 124 | 125 | ### Gist Static Analyzer (EPFL) 126 | > Failure-sketching to help determine reasons for faults 127 | 128 | http://dslab.epfl.ch/proj/gist/ 129 | 130 | [Paper](http://dslab.epfl.ch/pubs/gist.pdf) 131 | 132 | ### Trail of Bits CGC 133 | [``How we fared in the Cyber Grand Challenge''](https://blog.trailofbits.com/2015/07/15/how-we-fared-in-the-cyber-grand-challenge/) 134 | 135 | ### Infer (Facebook) 136 | > Compile time static analyzer 137 | 138 | http://fbinfer.com/ 139 | 140 | [GitHub](https://github.com/facebook/infer) 141 | 142 | [Infer Clang Plugin](https://github.com/facebook/facebook-clang-plugins/tree/ee26293dd046acc5c2dd862d3201aa9f7dace96a) 143 | 144 | ### Kryptonite obfuscator 145 | [GitHub](https://github.com/0vercl0k/stuffz/blob/master/llvm-funz/kryptonite/llvm-functionpass-kryptonite-obfuscater.cpp) 146 | 147 | ### KULFI (Utah) 148 | > Instruction level fault injector 149 | 150 | [GitHub](https://github.com/soarlab/KULFI) 151 | 152 | http://formalverification.cs.utah.edu/fmr/#kulfi 153 | 154 | ### IKOS (NASA) 155 | > a C++ library designed to facilitate the development of sound static analyzers based on Abstract Interpretation 156 | 157 | https://ti.arc.nasa.gov/opensource/ikos/ 158 | 159 | ### laf-intel's transforms 160 | >
Splitting certain compares to improve fuzzing 161 | 162 | [``Circumventing Fuzzing Roadblocks with Compiler Transformations''](https://lafintel.wordpress.com/2016/08/15/circumventing-fuzzing-roadblocks-with-compiler-transformations/) 163 | 164 | ### LeakSanitizer 165 | > Memory leak detection 166 | 167 | [GitHub](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer) 168 | 169 | ### LibFuzzer 170 | > Evolutionary, in-process guided fuzzing library 171 | 172 | http://llvm.org/docs/LibFuzzer.html 173 | 174 | [GitHub](https://github.com/llvm-mirror/llvm/tree/master/lib/Fuzzer) 175 | 176 | [LibFuzzer Tutorial](https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md) 177 | 178 | ### llmc 179 | [``Model checking LLVM IR using LTSmin''](http://fmt.cs.utwente.nl/files/sprojects/190.pdf) 180 | 181 | ### llStar 182 | > Pre/post condition intra-procedural verifier 183 | 184 | https://bitbucket.org/jvillard/llstar/wiki/Home 185 | 186 | ### Mc Sema (ToB) 187 | > Lift MC to IR for retargeting, patching, recompilation, symbolic exec 188 | 189 | [GitHub](https://github.com/trailofbits/mcsema) 190 | 191 | ### MemorySanitizer 192 | > Detect reads of uninitialized memory 193 | 194 | [GitHub](https://github.com/google/sanitizers/wiki/MemorySanitizer) and see source in lib/Transforms/Instrumentation 195 | 196 | ### Obfuscator-LLVM 197 | [GitHub](https://github.com/obfuscator-llvm/obfuscator) 198 | 199 | Note: https://github.com/obfuscator-llvm/obfuscator/tree/llvm-3.6.1/lib/Transforms/Obfuscation 200 | 201 | ### Passes from QuarksLab 202 | [GitHub](https://github.com/quarkslab/llvm-passes/tree/master/llvm-passes) 203 | 204 | [``Turning regular code into atrocities''](http://blog.quarkslab.com/turning-regular-code-into-atrocities-with-llvm.html) 205 | 206 | ### Path Analysis for invariant Generation by Abstract Interpretation (PAGAI) (VERIMAG) 207 | http://pagai.forge.imag.fr/ 208 | 209 | Also check out *screen* from trail of bits which makes
use of PAGAI: https://github.com/trailofbits/screen 210 | 211 | ### pmGen 212 | > Translate IR to Promela for verification 213 | 214 | [GitHub](https://github.com/roselone/pmGen) 215 | 216 | ### PRESAGE (Utah) 217 | > protecting structured address computations against soft errors 218 | 219 | http://formalverification.cs.utah.edu/fmr/#presage 220 | 221 | [Git](https://utahfmr.github.io/PRESAGE/) 222 | 223 | [Paper](https://arxiv.org/abs/1606.08948) 224 | Basically, it attempts to reorganize GetElementPtr's to best handle alpha particle hits. 225 | 226 | ### Pointer analysis with tunable precision (TPA) 227 | [GitHub](https://github.com/grievejia/tpa) 228 | 229 | 230 | ### RaceSan (ToB) 231 | > Data race detector using modified DataCollider algorithm. 232 | 233 | WIP? 234 | 235 | [GitHub](https://github.com/trailofbits/RaceSanitizer) 236 | 237 | ### Remill (ToB) 238 | > Lift MC instructions to IR 239 | 240 | [GitHub](https://github.com/trailofbits/remill) 241 | 242 | ### Return-less code 243 | > Transform IR to have no return statements..
attempt to avoid ROP 244 | 245 | [Paper](http://www4.ncsu.edu/~mcgrace/EUROSYS10.pdf) 246 | 247 | ### Rev.ng 248 | > Lift MC to IR by using QEMU 249 | 250 | https://rev.ng/ 251 | 252 | [Rev.ng LLVM developer mtg 2016](http://llvm.org/devmtg/2016-11/Slides/DiFederico-rev.ng.pdf) 253 | 254 | ### s2e (EPFL) 255 | [GitHub](https://github.com/dslab-epfl/s2e) 256 | 257 | ### SAFECode 258 | http://safecode.cs.illinois.edu/ 259 | 260 | [``Memory Safety for Low-Level Software/Hardware Interactions''](http://llvm.org/pubs/2009-08-12-UsenixSecurity-SafeSVAOS.html) 261 | 262 | ### SafeInit (VUSec) 263 | > Detect uninitialized memory use errors 264 | 265 | [GitHub](https://github.com/vusec/safeinit) 266 | 267 | [Paper](https://www.vusec.net/download/?t=papers/safeinit_ndss17.pdf) 268 | 269 | ### Seahorn 270 | > Intraprocedural model checker 271 | 272 | [GitHub](https://github.com/seahorn/seahorn) 273 | 274 | See the links there to CRAB etc 275 | 276 | ### Security-Oriented Analysis of Application Programs (SOAAP) (Cambridge) 277 | http://www.cl.cam.ac.uk/research/security/ctsrd/soaap 278 | 279 | [GitHub](https://github.com/CTSRD-SOAAP) 280 | 281 | ### Sloopy 282 | http://forsyte.at/people/pani/sloopy/ 283 | 284 | ### Smack 285 | [GitHub](https://github.com/smackers/smack) 286 | 287 | http://soarlab.org/2014/05/smack-decoupling-source-language-details-from-verifier-implementations/ 288 | 289 | ### Software Analysis Workbench (SAW) (Galois Inc) 290 | > Formal verification via equivalence checks 291 | 292 | http://saw.galois.com/ 293 | 294 | [GitHub SAW Script](https://github.com/GaloisInc/saw-script) 295 | 296 | [GitHub llvm-verifier](https://github.com/GaloisInc/llvm-verifier) 297 | 298 | ### Strong Update Analysis (SUPA) (UNSW) 299 | > demand-driven Strong UPdate Analysis that computes points-to information on-demand via value-flow refinement.
300 | > built on SVF (below) 301 | 302 | http://www.cse.unsw.edu.au/~corg/supa/ 303 | 304 | [GitHub](https://github.com/unsw-corg/PTABen) 305 | 306 | ### Static Value Flow (SVF) (UNSW) 307 | > Pointer Analysis and Program Dependence Analysis for C and C++ Programs 308 | 309 | http://unsw-corg.github.io/SVF/ 310 | 311 | [GitHub](https://github.com/unsw-corg/SVF) 312 | 313 | 314 | ### Temporally Enhanced Security Logic Assertions (TESLA) (Cambridge) 315 | http://www.cl.cam.ac.uk/research/security/ctsrd/tesla 316 | 317 | [GitHub](https://github.com/CTSRD-TESLA/) 318 | 319 | ### TokenCap 320 | > Find tokens/magics in code so as to quickly pass fuzzing blockers 321 | 322 | [GitHub](https://github.com/0vercl0k/stuffz/blob/master/llvm-funz/afl-llvm-tokencap-pass.so.cc) 323 | 324 | ### Typesan (VUSec) 325 | [GitHub](https://github.com/vusec/typesan) 326 | 327 | ### Verified LLVM (VeLLVM) 328 | > Model syntax and semantics of LLVM IR in Coq for proving things about code reasoning on IR 329 | 330 | http://www.cis.upenn.edu/~stevez/vellvm/ 331 | 332 | https://deepspec.org/main 333 | 334 | ### Verified ThreadSanitizer (TSan) 335 | > Output research from VeLLVM / deepspec group on verifying TSan does what it says it will 336 | 337 | [GitHub](https://github.com/upenn-acg/verified-tsan) 338 | 339 | 340 | ### Whole program LLVM 341 | > Help linking multiple .bc files to one 342 | 343 | [GitHub](https://github.com/travitch/whole-program-llvm) 344 | -------------------------------------------------------------------------------- /slides/ExtraNonsense.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/slides/ExtraNonsense.pdf -------------------------------------------------------------------------------- /slides/MainDeck.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/slides/MainDeck.pdf -------------------------------------------------------------------------------- /slides/SecurityRDprojectsLLVM.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/roachspray/opcde2017/8c2474e8cd8ff2315b72d54560e83ffd35b4c91d/slides/SecurityRDprojectsLLVM.pdf --------------------------------------------------------------------------------