├── LICENSE ├── common.cc ├── dex_bytecode.cc ├── dex_format.cc ├── dex_helper.cc ├── dex_helper.h ├── dex_ir.cc ├── dex_utf8.cc ├── main.cc ├── reader.cc ├── slicer ├── arrayview.h ├── buffer.h ├── chronometer.h ├── common.h ├── dex_bytecode.h ├── dex_format.h ├── dex_instruction_list.h ├── dex_ir.h ├── dex_leb128.h ├── dex_utf8.h ├── hash_table.h ├── index_map.h ├── memview.h ├── reader.h ├── scopeguard.h └── writer.h └── test.cc /LICENSE: -------------------------------------------------------------------------------- 1 | GNU LESSER GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2022 LSPosed 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | 9 | This version of the GNU Lesser General Public License incorporates 10 | the terms and conditions of version 3 of the GNU General Public 11 | License, supplemented by the additional permissions listed below. 12 | 13 | 0. Additional Definitions. 14 | 15 | As used herein, "this License" refers to version 3 of the GNU Lesser 16 | General Public License, and the "GNU GPL" refers to version 3 of the GNU 17 | General Public License. 18 | 19 | "The Library" refers to a covered work governed by this License, 20 | other than an Application or a Combined Work as defined below. 21 | 22 | An "Application" is any work that makes use of an interface provided 23 | by the Library, but which is not otherwise based on the Library. 24 | Defining a subclass of a class defined by the Library is deemed a mode 25 | of using an interface provided by the Library. 26 | 27 | A "Combined Work" is a work produced by combining or linking an 28 | Application with the Library. The particular version of the Library 29 | with which the Combined Work was made is also called the "Linked 30 | Version". 
31 | 32 | The "Minimal Corresponding Source" for a Combined Work means the 33 | Corresponding Source for the Combined Work, excluding any source code 34 | for portions of the Combined Work that, considered in isolation, are 35 | based on the Application, and not on the Linked Version. 36 | 37 | The "Corresponding Application Code" for a Combined Work means the 38 | object code and/or source code for the Application, including any data 39 | and utility programs needed for reproducing the Combined Work from the 40 | Application, but excluding the System Libraries of the Combined Work. 41 | 42 | 1. Exception to Section 3 of the GNU GPL. 43 | 44 | You may convey a covered work under sections 3 and 4 of this License 45 | without being bound by section 3 of the GNU GPL. 46 | 47 | 2. Conveying Modified Versions. 48 | 49 | If you modify a copy of the Library, and, in your modifications, a 50 | facility refers to a function or data to be supplied by an Application 51 | that uses the facility (other than as an argument passed when the 52 | facility is invoked), then you may convey a copy of the modified 53 | version: 54 | 55 | a) under this License, provided that you make a good faith effort to 56 | ensure that, in the event an Application does not supply the 57 | function or data, the facility still operates, and performs 58 | whatever part of its purpose remains meaningful, or 59 | 60 | b) under the GNU GPL, with none of the additional permissions of 61 | this License applicable to that copy. 62 | 63 | 3. Object Code Incorporating Material from Library Header Files. 64 | 65 | The object code form of an Application may incorporate material from 66 | a header file that is part of the Library. 
You may convey such object 67 | code under terms of your choice, provided that, if the incorporated 68 | material is not limited to numerical parameters, data structure 69 | layouts and accessors, or small macros, inline functions and templates 70 | (ten or fewer lines in length), you do both of the following: 71 | 72 | a) Give prominent notice with each copy of the object code that the 73 | Library is used in it and that the Library and its use are 74 | covered by this License. 75 | 76 | b) Accompany the object code with a copy of the GNU GPL and this license 77 | document. 78 | 79 | 4. Combined Works. 80 | 81 | You may convey a Combined Work under terms of your choice that, 82 | taken together, effectively do not restrict modification of the 83 | portions of the Library contained in the Combined Work and reverse 84 | engineering for debugging such modifications, if you also do each of 85 | the following: 86 | 87 | a) Give prominent notice with each copy of the Combined Work that 88 | the Library is used in it and that the Library and its use are 89 | covered by this License. 90 | 91 | b) Accompany the Combined Work with a copy of the GNU GPL and this license 92 | document. 93 | 94 | c) For a Combined Work that displays copyright notices during 95 | execution, include the copyright notice for the Library among 96 | these notices, as well as a reference directing the user to the 97 | copies of the GNU GPL and this license document. 98 | 99 | d) Do one of the following: 100 | 101 | 0) Convey the Minimal Corresponding Source under the terms of this 102 | License, and the Corresponding Application Code in a form 103 | suitable for, and under terms that permit, the user to 104 | recombine or relink the Application with a modified version of 105 | the Linked Version to produce a modified Combined Work, in the 106 | manner specified by section 6 of the GNU GPL for conveying 107 | Corresponding Source. 
108 | 109 | 1) Use a suitable shared library mechanism for linking with the 110 | Library. A suitable mechanism is one that (a) uses at run time 111 | a copy of the Library already present on the user's computer 112 | system, and (b) will operate properly with a modified version 113 | of the Library that is interface-compatible with the Linked 114 | Version. 115 | 116 | e) Provide Installation Information, but only if you would otherwise 117 | be required to provide such information under section 6 of the 118 | GNU GPL, and only to the extent that such information is 119 | necessary to install and execute a modified version of the 120 | Combined Work produced by recombining or relinking the 121 | Application with a modified version of the Linked Version. (If 122 | you use option 4d0, the Installation Information must accompany 123 | the Minimal Corresponding Source and Corresponding Application 124 | Code. If you use option 4d1, you must provide the Installation 125 | Information in the manner specified by section 6 of the GNU GPL 126 | for conveying Corresponding Source.) 127 | 128 | 5. Combined Libraries. 129 | 130 | You may place library facilities that are a work based on the 131 | Library side by side in a single library together with other library 132 | facilities that are not Applications and are not covered by this 133 | License, and convey such a combined library under terms of your 134 | choice, if you do both of the following: 135 | 136 | a) Accompany the combined library with a copy of the same work based 137 | on the Library, uncombined with any other library facilities, 138 | conveyed under the terms of this License. 139 | 140 | b) Give prominent notice with the combined library that part of it 141 | is a work based on the Library, and explaining where to find the 142 | accompanying uncombined form of the same work. 143 | 144 | 6. Revised Versions of the GNU Lesser General Public License. 
145 | 146 | The Free Software Foundation may publish revised and/or new versions 147 | of the GNU Lesser General Public License from time to time. Such new 148 | versions will be similar in spirit to the present version, but may 149 | differ in detail to address new problems or concerns. 150 | 151 | Each version is given a distinguishing version number. If the 152 | Library as you received it specifies that a certain numbered version 153 | of the GNU Lesser General Public License "or any later version" 154 | applies to it, you have the option of following the terms and 155 | conditions either of that published version or of any later version 156 | published by the Free Software Foundation. If the Library as you 157 | received it does not specify a version number of the GNU Lesser 158 | General Public License, you may choose any version of the GNU Lesser 159 | General Public License ever published by the Free Software Foundation. 160 | 161 | If the Library as you received it specifies that a proxy can decide 162 | whether future versions of the GNU Lesser General Public License shall 163 | apply, that proxy's public statement of acceptance of any version is 164 | permanent authorization for you to choose that version for the 165 | Library. 166 | -------------------------------------------------------------------------------- /common.cc: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #include "slicer/common.h" 18 | 19 | #include 20 | #include 21 | #include 22 | #include 23 | #include 24 | 25 | namespace slicer { 26 | 27 | // Helper for the default SLICER_CHECK() policy 28 | void _checkFailed(const char* expr, int line, const char* file) { 29 | printf("\nSLICER_CHECK failed [%s] at %s:%d\n\n", expr, file, line); 30 | abort(); 31 | } 32 | 33 | // keep track of the failures we already saw to avoid spamming with duplicates 34 | thread_local std::set> weak_failures; 35 | 36 | // Helper for the default SLICER_WEAK_CHECK() policy 37 | // 38 | // TODO: implement a modal switch (abort/continue) 39 | // 40 | void _weakCheckFailed(const char* expr, int line, const char* file) { 41 | auto failure_id = std::make_pair(line, file); 42 | if (weak_failures.find(failure_id) == weak_failures.end()) { 43 | printf("\nSLICER_WEAK_CHECK failed [%s] at %s:%d\n\n", expr, file, line); 44 | weak_failures.insert(failure_id); 45 | } 46 | } 47 | 48 | // Prints a formatted message and aborts 49 | void _fatal(const char* format, ...) { 50 | va_list args; 51 | va_start(args, format); 52 | vprintf(format, args); 53 | va_end(args); 54 | abort(); 55 | } 56 | 57 | } // namespace slicer 58 | -------------------------------------------------------------------------------- /dex_bytecode.cc: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #include "slicer/dex_bytecode.h" 18 | #include "slicer/common.h" 19 | 20 | #include 21 | #include 22 | 23 | namespace dex { 24 | 25 | Opcode OpcodeFromBytecode(u2 bytecode) { 26 | Opcode opcode = Opcode(bytecode & 0xff); 27 | return opcode; 28 | } 29 | 30 | // Table that maps each opcode to the index type implied by that opcode 31 | static constexpr std::array 32 | gInstructionDescriptors = {{ 33 | #define INSTRUCTION_DESCR(o, c, p, format, index, flags, e, vflags) \ 34 | { \ 35 | vflags, \ 36 | format, \ 37 | index, \ 38 | flags, \ 39 | }, 40 | #include "slicer/dex_instruction_list.h" 41 | DEX_INSTRUCTION_LIST(INSTRUCTION_DESCR) 42 | #undef DEX_INSTRUCTION_LIST 43 | #undef INSTRUCTION_DESCR 44 | }}; 45 | 46 | InstructionIndexType GetIndexTypeFromOpcode(Opcode opcode) { 47 | return gInstructionDescriptors[opcode].index_type; 48 | } 49 | 50 | InstructionFormat GetFormatFromOpcode(Opcode opcode) { 51 | return gInstructionDescriptors[opcode].format; 52 | } 53 | 54 | OpcodeFlags GetFlagsFromOpcode(Opcode opcode) { 55 | return gInstructionDescriptors[opcode].flags; 56 | } 57 | 58 | VerifyFlags GetVerifyFlagsFromOpcode(Opcode opcode) { 59 | return gInstructionDescriptors[opcode].verify_flags; 60 | } 61 | 62 | size_t GetWidthFromFormat(InstructionFormat format) { 63 | switch (format) { 64 | case k10x: 65 | case k12x: 66 | case k11n: 67 | case k11x: 68 | case k10t: 69 | return 1; 70 | case k20t: 71 | case k20bc: 72 | case k21c: 73 | case k22x: 74 | case k21s: 75 | case k21t: 76 | case k21h: 77 | case k23x: 78 | case 
k22b: 79 | case k22s: 80 | case k22t: 81 | case k22c: 82 | case k22cs: 83 | return 2; 84 | case k30t: 85 | case k31t: 86 | case k31c: 87 | case k32x: 88 | case k31i: 89 | case k35c: 90 | case k35ms: 91 | case k35mi: 92 | case k3rc: 93 | case k3rms: 94 | case k3rmi: 95 | return 3; 96 | case k45cc: 97 | case k4rcc: 98 | return 4; 99 | case k51l: 100 | return 5; 101 | } 102 | } 103 | 104 | size_t GetWidthFromBytecode(const u2* bytecode) { 105 | size_t width = 0; 106 | if (*bytecode == kPackedSwitchSignature) { 107 | width = 4 + bytecode[1] * 2; 108 | } else if (*bytecode == kSparseSwitchSignature) { 109 | width = 2 + bytecode[1] * 4; 110 | } else if (*bytecode == kArrayDataSignature) { 111 | u2 elemWidth = bytecode[1]; 112 | u4 len = bytecode[2] | (((u4)bytecode[3]) << 16); 113 | // The plus 1 is to round up for odd size and width. 114 | width = 4 + (elemWidth * len + 1) / 2; 115 | } else { 116 | width = GetWidthFromFormat( 117 | GetFormatFromOpcode(OpcodeFromBytecode(bytecode[0]))); 118 | } 119 | return width; 120 | } 121 | 122 | // Dalvik opcode names. 
123 | static constexpr std::array gOpcodeNames = { 124 | #define INSTRUCTION_NAME(o, c, pname, f, i, a, e, v) pname, 125 | #include "slicer/dex_instruction_list.h" 126 | DEX_INSTRUCTION_LIST(INSTRUCTION_NAME) 127 | #undef DEX_INSTRUCTION_LIST 128 | #undef INSTRUCTION_NAME 129 | }; 130 | 131 | const char* GetOpcodeName(Opcode opcode) { return gOpcodeNames[opcode]; } 132 | 133 | // Helpers for DecodeInstruction() 134 | static u4 InstA(u2 inst) { return (inst >> 8) & 0x0f; } 135 | static u4 InstB(u2 inst) { return inst >> 12; } 136 | static u4 InstAA(u2 inst) { return inst >> 8; } 137 | 138 | // Helper for DecodeInstruction() 139 | static u4 FetchU4(const u2* ptr) { return ptr[0] | (u4(ptr[1]) << 16); } 140 | 141 | // Helper for DecodeInstruction() 142 | static u8 FetchU8(const u2* ptr) { 143 | return FetchU4(ptr) | (u8(FetchU4(ptr + 2)) << 32); 144 | } 145 | 146 | // Decode a Dalvik bytecode and extract the individual fields 147 | Instruction DecodeInstruction(const u2* bytecode) { 148 | u2 inst = bytecode[0]; 149 | Opcode opcode = OpcodeFromBytecode(inst); 150 | InstructionFormat format = GetFormatFromOpcode(opcode); 151 | 152 | Instruction dec = {}; 153 | dec.opcode = opcode; 154 | 155 | switch (format) { 156 | case k10x: // op 157 | return dec; 158 | case k12x: // op vA, vB 159 | dec.vA = InstA(inst); 160 | dec.vB = InstB(inst); 161 | return dec; 162 | case k11n: // op vA, #+B 163 | dec.vA = InstA(inst); 164 | dec.vB = s4(InstB(inst) << 28) >> 28; // sign extend 4-bit value 165 | return dec; 166 | case k11x: // op vAA 167 | dec.vA = InstAA(inst); 168 | return dec; 169 | case k10t: // op +AA 170 | dec.vA = s1(InstAA(inst)); // sign-extend 8-bit value 171 | return dec; 172 | case k20t: // op +AAAA 173 | dec.vA = s2(bytecode[1]); // sign-extend 16-bit value 174 | return dec; 175 | case k20bc: // [opt] op AA, thing@BBBB 176 | case k21c: // op vAA, thing@BBBB 177 | case k22x: // op vAA, vBBBB 178 | dec.vA = InstAA(inst); 179 | dec.vB = bytecode[1]; 180 | return dec; 
181 | case k21s: // op vAA, #+BBBB 182 | case k21t: // op vAA, +BBBB 183 | dec.vA = InstAA(inst); 184 | dec.vB = s2(bytecode[1]); // sign-extend 16-bit value 185 | return dec; 186 | case k21h: // op vAA, #+BBBB0000[00000000] 187 | dec.vA = InstAA(inst); 188 | // The value should be treated as right-zero-extended, but we don't 189 | // actually do that here. Among other things, we don't know if it's 190 | // the top bits of a 32- or 64-bit value. 191 | dec.vB = bytecode[1]; 192 | return dec; 193 | case k23x: // op vAA, vBB, vCC 194 | dec.vA = InstAA(inst); 195 | dec.vB = bytecode[1] & 0xff; 196 | dec.vC = bytecode[1] >> 8; 197 | return dec; 198 | case k22b: // op vAA, vBB, #+CC 199 | dec.vA = InstAA(inst); 200 | dec.vB = bytecode[1] & 0xff; 201 | dec.vC = s1(bytecode[1] >> 8); // sign-extend 8-bit value 202 | return dec; 203 | case k22s: // op vA, vB, #+CCCC 204 | case k22t: // op vA, vB, +CCCC 205 | dec.vA = InstA(inst); 206 | dec.vB = InstB(inst); 207 | dec.vC = s2(bytecode[1]); // sign-extend 16-bit value 208 | return dec; 209 | case k22c: // op vA, vB, thing@CCCC 210 | case k22cs: // [opt] op vA, vB, field offset CCCC 211 | dec.vA = InstA(inst); 212 | dec.vB = InstB(inst); 213 | dec.vC = bytecode[1]; 214 | return dec; 215 | case k30t: // op +AAAAAAAA 216 | dec.vA = FetchU4(bytecode + 1); 217 | return dec; 218 | case k31t: // op vAA, +BBBBBBBB 219 | case k31c: // op vAA, string@BBBBBBBB 220 | dec.vA = InstAA(inst); 221 | dec.vB = FetchU4(bytecode + 1); 222 | return dec; 223 | case k32x: // op vAAAA, vBBBB 224 | dec.vA = bytecode[1]; 225 | dec.vB = bytecode[2]; 226 | return dec; 227 | case k31i: // op vAA, #+BBBBBBBB 228 | dec.vA = InstAA(inst); 229 | dec.vB = FetchU4(bytecode + 1); 230 | return dec; 231 | case k35c: // op {vC, vD, vE, vF, vG}, thing@BBBB 232 | case k35ms: // [opt] invoke-virtual+super 233 | case k35mi: { // [opt] inline invoke 234 | dec.vA = InstB(inst); // This is labeled A in the spec. 
235 | dec.vB = bytecode[1]; 236 | 237 | u2 regList = bytecode[2]; 238 | 239 | // Copy the argument registers into the arg[] array, and 240 | // also copy the first argument (if any) into vC. (The 241 | // Instruction structure doesn't have separate 242 | // fields for {vD, vE, vF, vG}, so there's no need to make 243 | // copies of those.) Note that cases 5..2 fall through. 244 | switch (dec.vA) { 245 | case 5: 246 | // A fifth arg is verboten for inline invokes 247 | SLICER_CHECK(format != k35mi); 248 | 249 | // Per note at the top of this format decoder, the 250 | // fifth argument comes from the A field in the 251 | // instruction, but it's labeled G in the spec. 252 | dec.arg[4] = InstA(inst); 253 | FALLTHROUGH_INTENDED; 254 | case 4: 255 | dec.arg[3] = (regList >> 12) & 0x0f; 256 | FALLTHROUGH_INTENDED; 257 | case 3: 258 | dec.arg[2] = (regList >> 8) & 0x0f; 259 | FALLTHROUGH_INTENDED; 260 | case 2: 261 | dec.arg[1] = (regList >> 4) & 0x0f; 262 | FALLTHROUGH_INTENDED; 263 | case 1: 264 | dec.vC = dec.arg[0] = regList & 0x0f; 265 | FALLTHROUGH_INTENDED; 266 | case 0: 267 | // Valid, but no need to do anything 268 | return dec; 269 | } 270 | } 271 | SLICER_CHECK(!"Invalid arg count in 35c/35ms/35mi"); 272 | case k3rc: // op {vCCCC .. v(CCCC+AA-1)}, meth@BBBB 273 | case k3rms: // [opt] invoke-virtual+super/range 274 | case k3rmi: // [opt] execute-inline/range 275 | dec.vA = InstAA(inst); 276 | dec.vB = bytecode[1]; 277 | dec.vC = bytecode[2]; 278 | return dec; 279 | case k45cc: { 280 | // AG op BBBB FEDC HHHH 281 | dec.vA = InstB(inst); // This is labelled A in the spec. 
282 | dec.vB = bytecode[1]; // vB meth@BBBB 283 | 284 | u2 regList = bytecode[2]; 285 | dec.vC = regList & 0xf; 286 | dec.arg[0] = (regList >> 4) & 0xf; // vD 287 | dec.arg[1] = (regList >> 8) & 0xf; // vE 288 | dec.arg[2] = (regList >> 12); // vF 289 | dec.arg[3] = InstA(inst); // vG 290 | dec.arg[4] = bytecode[3]; // vH proto@HHHH 291 | } 292 | return dec; 293 | case k4rcc: 294 | // AA op BBBB CCCC HHHH 295 | dec.vA = InstAA(inst); 296 | dec.vB = bytecode[1]; 297 | dec.vC = bytecode[2]; 298 | dec.arg[4] = bytecode[3]; // vH proto@HHHH 299 | return dec; 300 | case k51l: // op vAA, #+BBBBBBBBBBBBBBBB 301 | dec.vA = InstAA(inst); 302 | dec.vB_wide = FetchU8(bytecode + 1); 303 | return dec; 304 | } 305 | SLICER_FATAL("Can't decode unexpected format 0x%02x (op=0x%02x)", format, 306 | opcode); 307 | } 308 | 309 | } // namespace dex 310 | -------------------------------------------------------------------------------- /dex_format.cc: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 
15 | */ 16 | 17 | #include "slicer/dex_format.h" 18 | #include "slicer/common.h" 19 | 20 | #include 21 | 22 | namespace dex { 23 | 24 | // Compute the DEX file checksum for a memory-mapped DEX file 25 | u4 ComputeChecksum(const Header* header) { 26 | const u1* start = reinterpret_cast(header); 27 | 28 | uLong adler = adler32(0L, Z_NULL, 0); 29 | const int non_sum = sizeof(header->magic) + sizeof(header->checksum); 30 | 31 | return static_cast( 32 | adler32(adler, start + non_sum, header->file_size - non_sum)); 33 | } 34 | 35 | // Returns the human-readable name for a primitive type 36 | static const char* PrimitiveTypeName(char type_char) { 37 | switch (type_char) { 38 | case 'B': return "byte"; 39 | case 'C': return "char"; 40 | case 'D': return "double"; 41 | case 'F': return "float"; 42 | case 'I': return "int"; 43 | case 'J': return "long"; 44 | case 'S': return "short"; 45 | case 'V': return "void"; 46 | case 'Z': return "boolean"; 47 | } 48 | SLICER_CHECK(!"unexpected type"); 49 | return nullptr; 50 | } 51 | 52 | // Converts a type descriptor to human-readable "dotted" form. For 53 | // example, "Ljava/lang/String;" becomes "java.lang.String", and 54 | // "[I" becomes "int[]". 55 | std::string DescriptorToDecl(const char* descriptor) { 56 | std::string ss; 57 | 58 | int array_dimensions = 0; 59 | while (*descriptor == '[') { 60 | ++array_dimensions; 61 | ++descriptor; 62 | } 63 | 64 | if (*descriptor == 'L') { 65 | for (++descriptor; *descriptor != ';'; ++descriptor) { 66 | SLICER_CHECK(*descriptor != '\0'); 67 | ss += (*descriptor == '/' ? '.' : *descriptor); 68 | } 69 | } else { 70 | ss += PrimitiveTypeName(*descriptor); 71 | } 72 | 73 | SLICER_CHECK(descriptor[1] == '\0'); 74 | 75 | // add the array brackets 76 | for (int i = 0; i < array_dimensions; ++i) { 77 | ss += "[]"; 78 | } 79 | 80 | return ss; 81 | } 82 | 83 | // Converts a type descriptor to a single "shorty" char 84 | // (ex. 
"LFoo;" and "[[I" become 'L', "I" stays 'I') 85 | char DescriptorToShorty(const char* descriptor) { 86 | // skip array dimensions 87 | int array_dimensions = 0; 88 | while (*descriptor == '[') { 89 | ++array_dimensions; 90 | ++descriptor; 91 | } 92 | 93 | char short_descriptor = *descriptor; 94 | if (short_descriptor == 'L') { 95 | // skip the full class name 96 | for(; *descriptor && *descriptor != ';'; ++descriptor); 97 | SLICER_CHECK(*descriptor == ';'); 98 | } 99 | 100 | SLICER_CHECK(descriptor[1] == '\0'); 101 | SLICER_CHECK(short_descriptor == 'L' || PrimitiveTypeName(short_descriptor) != nullptr); 102 | 103 | return array_dimensions > 0 ? 'L' : short_descriptor; 104 | } 105 | 106 | } // namespace dex 107 | -------------------------------------------------------------------------------- /dex_helper.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "slicer/reader.h" 4 | #include 5 | #include 6 | #include 7 | 8 | class DexHelper { 9 | public: 10 | DexHelper(const std::vector> &dexs); 11 | void CreateFullCache() const; 12 | std::vector FindMethodUsingString( 13 | std::string_view str, bool match_prefix, size_t return_type, 14 | short parameter_count, std::string_view parameter_shorty, 15 | size_t declaring_class, const std::vector &parameter_types, 16 | const std::vector &contains_parameter_types, 17 | const std::vector &dex_priority, bool find_first) const; 18 | 19 | std::vector FindMethodInvoking( 20 | size_t method_idx, size_t return_type, short parameter_count, 21 | std::string_view parameter_shorty, size_t declaring_class, 22 | const std::vector &parameter_types, 23 | const std::vector &contains_parameter_types, 24 | const std::vector &dex_priority, bool find_first) const; 25 | 26 | std::vector FindMethodInvoked( 27 | size_t method_idx, size_t return_type, short parameter_count, 28 | std::string_view parameter_shorty, size_t declaring_class, 29 | const std::vector &parameter_types, 30 | const 
std::vector &contains_parameter_types, 31 | const std::vector &dex_priority, bool find_first) const; 32 | 33 | std::vector FindMethodGettingField( 34 | size_t field_idx, size_t return_type, short parameter_count, 35 | std::string_view parameter_shorty, size_t declaring_class, 36 | const std::vector &parameter_types, 37 | const std::vector &contains_parameter_types, 38 | const std::vector &dex_priority, bool find_first) const; 39 | 40 | std::vector FindMethodSettingField( 41 | size_t field_idx, size_t return_type, short parameter_count, 42 | std::string_view parameter_shorty, size_t declaring_class, 43 | const std::vector &parameter_types, 44 | const std::vector &contains_parameter_types, 45 | const std::vector &dex_priority, bool find_first) const; 46 | 47 | std::vector FindField(size_t type, 48 | const std::vector &dex_priority, 49 | bool find_first) const; 50 | 51 | struct Class { 52 | const std::string_view name; 53 | }; 54 | struct Field { 55 | const Class declaring_class; 56 | const Class type; 57 | const std::string_view name; 58 | }; 59 | struct Method { 60 | const Class declaring_class; 61 | const std::string_view name; 62 | const std::vector parameters; 63 | const Class return_type; 64 | }; 65 | 66 | size_t CreateClassIndex(std::string_view class_name, 67 | size_t on_dex = -1) const; 68 | size_t CreateMethodIndex(std::string_view class_name, 69 | std::string_view method_name, 70 | const std::vector &params_name, 71 | size_t on_dex = -1) const; 72 | size_t CreateFieldIndex(std::string_view class_name, 73 | std::string_view field_name, 74 | size_t on_dex = -1) const; 75 | 76 | Class DecodeClass(size_t class_idx) const; 77 | Field DecodeField(size_t field_idx) const; 78 | Method DecodeMethod(size_t method_idx) const; 79 | 80 | private: 81 | std::tuple>, 82 | std::vector>> 83 | ConvertParameters(const std::vector &parameter_types, 84 | const std::vector &contains_parameter_types) const; 85 | 86 | std::vector GetPriority(const std::vector &priority) const; 87 | 88 | bool 
ScanMethod(size_t dex_idx, uint32_t method_id, 89 | size_t str_lower = size_t(-1), 90 | size_t str_upper = size_t(-1)) const; 91 | 92 | std::tuple 93 | FindPrefixStringId(size_t dex_idx, std::string_view to_find) const; 94 | 95 | uint32_t FindPrefixStringIdExact(size_t dex_idx, 96 | std::string_view to_find) const; 97 | 98 | bool 99 | IsMethodMatch(size_t dex_id, uint32_t method_id, uint32_t return_type, 100 | short parameter_count, std::string_view parameter_shorty, 101 | uint32_t declaring_class, 102 | const std::vector &parameter_types, 103 | const std::vector &contains_parameter_types) const; 104 | 105 | size_t CreateMethodIndex(size_t dex_idx, uint32_t method_id) const; 106 | size_t CreateClassIndex(size_t dex_idx, uint32_t class_id) const; 107 | size_t CreateFieldIndex(size_t dex_idx, uint32_t field_id) const; 108 | 109 | std::vector readers_; 110 | 111 | // for interface 112 | // indices[method_index][dex] -> id 113 | mutable std::vector> method_indices_; 114 | mutable std::vector> class_indices_; 115 | mutable std::vector> field_indices_; 116 | // rev[dex][method_id] -> method_index 117 | mutable std::vector> rev_method_indices_; // for each dex 118 | mutable std::vector> rev_class_indices_; 119 | mutable std::vector> rev_field_indices_; 120 | 121 | // for preprocess 122 | // strings[dex][str_id] -> str 123 | std::vector> strings_; 124 | // method_codes[dex][method_id] -> code 125 | std::vector> method_codes_; 126 | std::vector> method_params_; 127 | 128 | // for cache 129 | // type_cache[dex][str_id] -> type_id 130 | std::vector> type_cache_; 131 | // method_cache[dex][type_id][str_id] -> method_ids 132 | std::vector>>> 133 | method_cache_; 134 | // field_cache[dex][type_id][str_id] -> field_id 135 | std::vector>> field_cache_; 136 | // class_cache[dex][type_id] -> class_id 137 | std::vector> class_cache_; 138 | 139 | // search result cache 140 | // string_cache[dex][str_id] -> method_ids 141 | mutable std::vector>> string_cache_; 142 | // 
invoking_cache[dex][method_id] -> method_ids 143 | mutable std::vector>> invoking_cache_; 144 | // invoked_cache[dex][method_id] -> method_ids 145 | mutable std::vector>> invoked_cache_; 146 | // getting/setting_cache[dex][field_id] -> method_ids 147 | mutable std::vector>> getting_cache_; 148 | mutable std::vector>> setting_cache_; 149 | mutable std::vector>> declaring_cache_; 150 | // for method search 151 | mutable std::vector> searched_methods_; 152 | 153 | constexpr static uint8_t opcode_len[] = { 154 | 1, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 2, 2, 3, 155 | 5, 2, 2, 3, 2, 1, 1, 2, 2, 1, 2, 2, 3, 3, 3, 1, 1, 2, 3, 3, 3, 2, 2, 2, 156 | 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 157 | 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 158 | 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 159 | 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 160 | 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 161 | 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 162 | 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 163 | 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 164 | 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 3, 3, 2, 2}; 165 | static_assert(sizeof(opcode_len) == 256); 166 | }; 167 | -------------------------------------------------------------------------------- /dex_ir.cc: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #include "slicer/dex_ir.h" 18 | #include "slicer/chronometer.h" 19 | #include "slicer/dex_utf8.h" 20 | #include "slicer/dex_format.h" 21 | 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | #include 28 | 29 | namespace ir { 30 | 31 | // DJB2a string hash 32 | static uint32_t HashString(const char* cstr) { 33 | uint32_t hash = 5381; // DJB2 magic prime value 34 | while (*cstr) { 35 | hash = ((hash << 5) + hash) ^ *cstr++; 36 | } 37 | return hash; 38 | } 39 | 40 | uint32_t StringsHasher::Hash(const char* string_key) const { 41 | return HashString(string_key); 42 | } 43 | 44 | bool StringsHasher::Compare(const char* string_key, const String* string) const { 45 | return dex::Utf8Cmp(string_key, string->c_str()) == 0; 46 | } 47 | 48 | uint32_t ProtosHasher::Hash(const std::string& proto_key) const { 49 | return HashString(proto_key.c_str()); 50 | } 51 | 52 | bool ProtosHasher::Compare(const std::string& proto_key, const Proto* proto) const { 53 | return proto_key == proto->Signature(); 54 | } 55 | 56 | MethodKey MethodsHasher::GetKey(const EncodedMethod* method) const { 57 | MethodKey method_key; 58 | method_key.class_descriptor = method->decl->parent->descriptor; 59 | method_key.method_name = method->decl->name; 60 | method_key.prototype = method->decl->prototype; 61 | return method_key; 62 | } 63 | 64 | uint32_t MethodsHasher::Hash(const MethodKey& method_key) const { 65 | return static_cast(std::hash{}(method_key.class_descriptor) ^ 66 | std::hash{}(method_key.method_name) ^ 67 | 
                     std::hash{}(method_key.prototype));
68 | }
69 |
70 | bool MethodsHasher::Compare(const MethodKey& method_key, const EncodedMethod* method) const {
71 |   return method_key.class_descriptor == method->decl->parent->descriptor &&
72 |          method_key.method_name == method->decl->name &&
73 |          method_key.prototype == method->decl->prototype;
74 | }
75 |
76 | // Human-readable type declaration
77 | std::string Type::Decl() const {
78 |   return dex::DescriptorToDecl(descriptor->c_str());
79 | }
80 |
81 | Type::Category Type::GetCategory() const {
82 |   switch (*descriptor->c_str()) {
83 |     case 'L':
84 |     case '[':
85 |       return Category::Reference;
86 |     case 'V':
87 |       return Category::Void;
88 |     case 'D':
89 |     case 'J':
90 |       return Category::WideScalar;
91 |     default:
92 |       return Category::Scalar;
93 |   }
94 | }
95 |
96 | // Create the corresponding JNI signature:
97 | // https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/types.html#type_signatures
98 | std::string Proto::Signature() const {
99 |   std::string ss;
100 |   ss += "(";
101 |   if (param_types != nullptr) {
102 |     for (const auto& type : param_types->types) {
103 |       ss += type->descriptor->c_str();
104 |     }
105 |   }
106 |   ss += ")";
107 |   ss += return_type->descriptor->c_str();
108 |   return ss;
109 | }
110 |
111 | // Helper for IR normalization
112 | // (it sorts the items and updates their numeric indexes to match)
113 | template <class T, class C>
114 | static void IndexItems(std::vector<T>& items, C comp) {
115 |   std::sort(items.begin(), items.end(), comp);
116 |   for (size_t i = 0; i < items.size(); ++i) {
117 |     items[i]->index = i;
118 |   }
119 | }
120 |
121 | // Helper for IR normalization (DFS for topological sort)
122 | //
123 | // NOTE: this recursive version is clean and simple and we know
124 | // that the max depth is bounded (exactly 1 for JVMTI and a small
125 | // max for general case - the largest .dex file in AOSP has 5000 classes
126 | // total)
127 | //
128 | void DexFile::TopSortClassIndex(Class* irClass, dex::u4* nextIndex) {
129
|   if (irClass->index == dex::u4(-1)) {
130 |     if (irClass->super_class && irClass->super_class->class_def) {
131 |       TopSortClassIndex(irClass->super_class->class_def, nextIndex);
132 |     }
133 |
134 |     if (irClass->interfaces) {
135 |       for (Type* interfaceType : irClass->interfaces->types) {
136 |         if (interfaceType->class_def) {
137 |           TopSortClassIndex(interfaceType->class_def, nextIndex);
138 |         }
139 |       }
140 |     }
141 |
142 |     SLICER_CHECK(*nextIndex < classes.size());
143 |     irClass->index = (*nextIndex)++;
144 |   }
145 | }
146 |
147 | // Helper for IR normalization
148 | // (topologically sort the classes)
149 | void DexFile::SortClassIndexes() {
150 |   for (auto& irClass : classes) {
151 |     irClass->index = dex::u4(-1);
152 |   }
153 |
154 |   dex::u4 nextIndex = 0;
155 |   for (auto& irClass : classes) {
156 |     TopSortClassIndex(irClass.get(), &nextIndex);
157 |   }
158 | }
159 |
160 | // Helper for NormalizeClass()
161 | static void SortEncodedFields(std::vector<EncodedField*>* fields) {
162 |   std::sort(fields->begin(), fields->end(),
163 |             [](const EncodedField* a, const EncodedField* b) {
164 |               SLICER_CHECK(a->decl->index != b->decl->index || a == b);
165 |               return a->decl->index < b->decl->index;
166 |             });
167 | }
168 |
169 | // Helper for NormalizeClass()
170 | static void SortEncodedMethods(std::vector<EncodedMethod*>* methods) {
171 |   std::sort(methods->begin(), methods->end(),
172 |             [](const EncodedMethod* a, const EncodedMethod* b) {
173 |               SLICER_CHECK(a->decl->index != b->decl->index || a == b);
174 |               return a->decl->index < b->decl->index;
175 |             });
176 | }
177 |
178 | // Helper for IR normalization
179 | // (sort the field & method arrays)
180 | static void NormalizeClass(Class* irClass) {
181 |   SortEncodedFields(&irClass->static_fields);
182 |   SortEncodedFields(&irClass->instance_fields);
183 |   SortEncodedMethods(&irClass->direct_methods);
184 |   SortEncodedMethods(&irClass->virtual_methods);
185 | }
186 |
187 | // Prepare the IR for generating a .dex image
188 | // (the .dex format requires a specific
sort order for some of the arrays, etc...)
189 | //
190 | // TODO: not a great solution - move this logic to the writer!
191 | //
192 | // TODO: the comparison predicate can be better expressed by using std::tie()
193 | //  Ex. FieldDecl has a method comp() returning tie(parent->index, name->index, type->index)
194 | //
195 | void DexFile::Normalize() {
196 |   // sort and build the .dex indexes
197 |   IndexItems(strings, [](const own<String>& a, const own<String>& b) {
198 |     // this list must be sorted by string contents, using UTF-16 code point values
199 |     // (not in a locale-sensitive manner)
200 |     return dex::Utf8Cmp(a->c_str(), b->c_str()) < 0;
201 |   });
202 |
203 |   IndexItems(types, [](const own<Type>& a, const own<Type>& b) {
204 |     // this list must be sorted by string_id index
205 |     return a->descriptor->index < b->descriptor->index;
206 |   });
207 |
208 |   IndexItems(protos, [](const own<Proto>& a, const own<Proto>& b) {
209 |     // this list must be sorted in return-type (by type_id index) major order,
210 |     // and then by argument list (lexicographic ordering, individual arguments
211 |     // ordered by type_id index)
212 |     if (a->return_type->index != b->return_type->index) {
213 |       return a->return_type->index < b->return_type->index;
214 |     } else {
215 |       std::vector<Type*> empty;
216 |       const auto& aParamTypes = a->param_types ? a->param_types->types : empty;
217 |       const auto& bParamTypes = b->param_types ? b->param_types->types : empty;
218 |       return std::lexicographical_compare(
219 |           aParamTypes.begin(), aParamTypes.end(), bParamTypes.begin(),
220 |           bParamTypes.end(),
221 |           [](const Type* t1, const Type* t2) { return t1->index < t2->index; });
222 |     }
223 |   });
224 |
225 |   IndexItems(fields, [](const own<FieldDecl>& a, const own<FieldDecl>& b) {
226 |     // this list must be sorted, where the defining type (by type_id index) is
227 |     // the major order, field name (by string_id index) is the intermediate
228 |     // order, and type (by type_id index) is the minor order
229 |     return (a->parent->index != b->parent->index)
230 |                ?
                 a->parent->index < b->parent->index
231 |                : (a->name->index != b->name->index)
232 |                      ? a->name->index < b->name->index
233 |                      : a->type->index < b->type->index;
234 |   });
235 |
236 |   IndexItems(methods, [](const own<MethodDecl>& a, const own<MethodDecl>& b) {
237 |     // this list must be sorted, where the defining type (by type_id index) is
238 |     // the major order, method name (by string_id index) is the intermediate
239 |     // order, and method prototype (by proto_id index) is the minor order
240 |     return (a->parent->index != b->parent->index)
241 |                ? a->parent->index < b->parent->index
242 |                : (a->name->index != b->name->index)
243 |                      ? a->name->index < b->name->index
244 |                      : a->prototype->index < b->prototype->index;
245 |   });
246 |
247 |   // reverse topological sort
248 |   //
249 |   // the classes must be ordered such that a given class's superclass and
250 |   // implemented interfaces appear in the list earlier than the referring
251 |   // class
252 |   //
253 |   // CONSIDER: for the BCI-only scenario we can avoid this
254 |   //
255 |   SortClassIndexes();
256 |
257 |   IndexItems(classes, [&](const own<Class>& a, const own<Class>& b) {
258 |     SLICER_CHECK(a->index < classes.size());
259 |     SLICER_CHECK(b->index < classes.size());
260 |     SLICER_CHECK(a->index != b->index || a == b);
261 |     return a->index < b->index;
262 |   });
263 |
264 |   // normalize class data
265 |   for (const auto& irClass : classes) {
266 |     NormalizeClass(irClass.get());
267 |   }
268 |
269 |   // normalize annotations
270 |   for (const auto& irAnnotation : annotations) {
271 |     // elements must be sorted in increasing order by string_id index
272 |     auto& elements = irAnnotation->elements;
273 |     std::sort(elements.begin(), elements.end(),
274 |               [](const AnnotationElement* a, const AnnotationElement* b) {
275 |                 return a->name->index < b->name->index;
276 |               });
277 |   }
278 |
279 |   // normalize "annotation_set_item"
280 |   for (const auto& irAnnotationSet : annotation_sets) {
281 |     // The elements must be sorted in increasing order, by type_idx
282 |     auto&
          annotations = irAnnotationSet->annotations;
283 |     std::sort(annotations.begin(), annotations.end(),
284 |               [](const Annotation* a, const Annotation* b) {
285 |                 return a->type->index < b->type->index;
286 |               });
287 |   }
288 |
289 |   // normalize "annotations_directory_item"
290 |   for (const auto& irAnnotationDirectory : annotations_directories) {
291 |     // field_annotations: The elements of the list must be
292 |     // sorted in increasing order, by field_idx
293 |     auto& field_annotations = irAnnotationDirectory->field_annotations;
294 |     std::sort(field_annotations.begin(), field_annotations.end(),
295 |               [](const FieldAnnotation* a, const FieldAnnotation* b) {
296 |                 return a->field_decl->index < b->field_decl->index;
297 |               });
298 |
299 |     // method_annotations: The elements of the list must be
300 |     // sorted in increasing order, by method_idx
301 |     auto& method_annotations = irAnnotationDirectory->method_annotations;
302 |     std::sort(method_annotations.begin(), method_annotations.end(),
303 |               [](const MethodAnnotation* a, const MethodAnnotation* b) {
304 |                 return a->method_decl->index < b->method_decl->index;
305 |               });
306 |
307 |     // parameter_annotations: The elements of the list must be
308 |     // sorted in increasing order, by method_idx
309 |     auto& param_annotations = irAnnotationDirectory->param_annotations;
310 |     std::sort(param_annotations.begin(), param_annotations.end(),
311 |               [](const ParamAnnotation* a, const ParamAnnotation* b) {
312 |                 return a->method_decl->index < b->method_decl->index;
313 |               });
314 |   }
315 | }
316 |
317 | }  // namespace ir
318 |
--------------------------------------------------------------------------------
/dex_utf8.cc:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (C) 2017 The Android Open Source Project
3 |  *
4 |  * Licensed under the Apache License, Version 2.0 (the "License");
5 |  * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #include "slicer/dex_format.h" 18 | 19 | namespace dex { 20 | 21 | // Retrieve the next UTF-16 character from a UTF-8 string. 22 | // Advances "*pUtf8Ptr" to the start of the next character. 23 | // 24 | // NOTE: If a string is corrupted by dropping a '\0' in the middle 25 | // of a 3-byte sequence, you can end up overrunning the buffer with 26 | // reads (and possibly with the writes if the length was computed and 27 | // cached before the damage). For performance reasons, this function 28 | // assumes that the string being parsed is known to be valid (e.g., by 29 | // already being verified). 
30 | static u2 GetUtf16FromUtf8(const char** pUtf8Ptr) {
31 |   u4 one = *(*pUtf8Ptr)++;
32 |   if ((one & 0x80) != 0) {
33 |     // two- or three-byte encoding
34 |     u4 two = *(*pUtf8Ptr)++;
35 |     if ((one & 0x20) != 0) {
36 |       // three-byte encoding
37 |       u4 three = *(*pUtf8Ptr)++;
38 |       return ((one & 0x0f) << 12) | ((two & 0x3f) << 6) | (three & 0x3f);
39 |     } else {
40 |       // two-byte encoding
41 |       return ((one & 0x1f) << 6) | (two & 0x3f);
42 |     }
43 |   } else {
44 |     // one-byte encoding
45 |     return one;
46 |   }
47 | }
48 |
49 | int Utf8Cmp(const char* s1, const char* s2) {
50 |   for (;;) {
51 |     if (*s1 == '\0') {
52 |       if (*s2 == '\0') {
53 |         return 0;
54 |       }
55 |       return -1;
56 |     } else if (*s2 == '\0') {
57 |       return 1;
58 |     }
59 |
60 |     int utf1 = GetUtf16FromUtf8(&s1);
61 |     int utf2 = GetUtf16FromUtf8(&s2);
62 |     int diff = utf1 - utf2;
63 |
64 |     if (diff != 0) {
65 |       return diff;
66 |     }
67 |   }
68 | }
69 |
70 | }  // namespace dex
71 |
--------------------------------------------------------------------------------
/main.cc:
--------------------------------------------------------------------------------
1 | #include "dex_helper.h"
2 | #include
3 | #include
4 | #include
5 | #include
6 | #include
7 | #include
8 | #include
9 |
10 | std::ostream &operator<<(std::ostream &out, const DexHelper::Class &clazz) {
11 |   return out << clazz.name;
12 | }
13 |
14 | std::ostream &operator<<(std::ostream &out, const DexHelper::Field &field) {
15 |   return out << field.declaring_class.name << "->" << field.name << ":"
16 |              << field.type.name;
17 | }
18 |
19 | std::ostream &operator<<(std::ostream &out, const DexHelper::Method &method) {
20 |   out << method.declaring_class.name << "->" << method.name << "(";
21 |
22 |   for (auto &param : method.parameters) {
23 |     out << param.name;
24 |   }
25 |
26 |   return out << ")" << method.return_type.name;
27 | }
28 |
29 | int main(int argc, char *argv[]) {
30 |   std::vector> dexs;
31 |   for (int i = 1; i <= 100; ++i) {
32 |     std::string path = "dexs/classes" +
33 |                        (i
                          == 1 ? std::string("") : std::to_string(i)) + ".dex";
34 |     int raw_dex = open(path.data(), O_RDONLY);
35 |     if (raw_dex == -1) {
36 |       break;
37 |     }
38 |     struct stat s {};
39 |     fstat(raw_dex, &s);
40 |     auto *out = reinterpret_cast(
41 |         mmap(nullptr, s.st_size, PROT_READ, MAP_PRIVATE, raw_dex, 0));
42 |     dexs.emplace_back(out, s.st_size);
43 |   }
44 |
45 |   std::string_view to_find = argc > 2 ? argv[2] : "";  // avoid reading past argv when no argument is given
46 |   DexHelper helper(dexs);
47 |
48 |   auto class_idx = helper.CreateClassIndex("Ljava/lang/Object;");
49 |   auto clazz = helper.DecodeClass(class_idx);
50 |   std::cout << "got class: " << clazz << std::endl;
51 |
52 |   auto field_indices = helper.FindField(class_idx, {}, true);
53 |   if (!field_indices.empty()) {
54 |     auto field_idx = field_indices[0];
55 |     auto field = helper.DecodeField(field_idx);
56 |     std::cout << "got field: " << field << std::endl;
57 |     {
58 |       auto method_indices = helper.FindMethodSettingField(field_idx, -1, -1, "",
59 |                                                           -1, {}, {}, {}, true);
60 |       if (!method_indices.empty()) {
61 |         auto method_idx = method_indices[0];
62 |         auto method = helper.DecodeMethod(method_idx);
63 |         std::cout << "got method setting field " << field << " : " << method << std::endl;
64 |       }
65 |     }
66 |     {
67 |       auto method_indices = helper.FindMethodGettingField(field_idx, -1, -1, "",
68 |                                                           -1, {}, {}, {}, true);
69 |       if (!method_indices.empty()) {
70 |         auto method_idx = method_indices[0];
71 |         auto method = helper.DecodeMethod(method_idx);
72 |         std::cout << "got method getting field " << field << " : " << method << std::endl;
73 |       }
74 |     }
75 |   }
76 |   auto method_indices = helper.FindMethodUsingString(
77 |       "isNullableType", false, -1, 1, "VI", -1, {}, {}, {}, true);
78 |   if (!method_indices.empty()) {
79 |     auto method_idx = method_indices[0];
80 |     auto method = helper.DecodeMethod(method_idx);
81 |     std::cout << "got method with string: " << method << std::endl;
82 |     {
83 |       auto method_indices = helper.FindMethodInvoking(method_idx, -1, -1, "",
84 |                                                       -1, {}, {}, {}, true);
85 |       if
(!method_indices.empty()) { 86 | auto method_idx = method_indices[0]; 87 | auto callee = helper.DecodeMethod(method_idx); 88 | std::cout << "got method " << method << " invoking " << callee << std::endl; 89 | } 90 | } 91 | { 92 | auto method_indices = 93 | helper.FindMethodInvoked(method_idx, -1, -1, "", -1, {}, {}, {}, true); 94 | if (!method_indices.empty()) { 95 | auto method_idx = method_indices[0]; 96 | auto caller = helper.DecodeMethod(method_idx); 97 | std::cout << "got method invoked by " << caller << " : " << method << std::endl; 98 | } 99 | } 100 | } 101 | // helper.CreateFullCache(); 102 | } -------------------------------------------------------------------------------- /reader.cc: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 
15 | */ 16 | 17 | #include "slicer/reader.h" 18 | #include "slicer/dex_bytecode.h" 19 | #include "slicer/chronometer.h" 20 | #include "slicer/dex_leb128.h" 21 | 22 | #include 23 | #include 24 | #include 25 | #include 26 | 27 | namespace dex { 28 | 29 | Reader::Reader(const dex::u1* image, size_t size) : image_(image), size_(size) { 30 | // init the header reference 31 | header_ = ptr(0); 32 | ValidateHeader(); 33 | 34 | // start with an "empty" .dex IR 35 | dex_ir_ = std::make_shared(); 36 | dex_ir_->magic = slicer::MemView(header_, sizeof(dex::Header::magic)); 37 | } 38 | 39 | slicer::ArrayView Reader::ClassDefs() const { 40 | return section(header_->class_defs_off, 41 | header_->class_defs_size); 42 | } 43 | 44 | slicer::ArrayView Reader::StringIds() const { 45 | return section(header_->string_ids_off, 46 | header_->string_ids_size); 47 | } 48 | 49 | slicer::ArrayView Reader::TypeIds() const { 50 | return section(header_->type_ids_off, 51 | header_->type_ids_size); 52 | } 53 | 54 | slicer::ArrayView Reader::FieldIds() const { 55 | return section(header_->field_ids_off, 56 | header_->field_ids_size); 57 | } 58 | 59 | slicer::ArrayView Reader::MethodIds() const { 60 | return section(header_->method_ids_off, 61 | header_->method_ids_size); 62 | } 63 | 64 | slicer::ArrayView Reader::ProtoIds() const { 65 | return section(header_->proto_ids_off, 66 | header_->proto_ids_size); 67 | } 68 | 69 | const dex::MapList* Reader::DexMapList() const { 70 | return dataPtr(header_->map_off); 71 | } 72 | 73 | const char* Reader::GetStringMUTF8(dex::u4 index) const { 74 | if (index == dex::kNoIndex) { 75 | return ""; 76 | } 77 | const dex::u1* strData = GetStringData(index); 78 | dex::ReadULeb128(&strData); 79 | return reinterpret_cast(strData); 80 | } 81 | 82 | void Reader::CreateFullIr() { 83 | size_t classCount = ClassDefs().size(); 84 | for (size_t i = 0; i < classCount; ++i) { 85 | CreateClassIr(i); 86 | } 87 | } 88 | 89 | void Reader::CreateClassIr(dex::u4 index) { 90 | auto 
       ir_class = GetClass(index);
91 |   SLICER_CHECK(ir_class != nullptr);
92 | }
93 |
94 | // Returns the index of the class with the specified
95 | // descriptor, or kNoIndex if not found
96 | dex::u4 Reader::FindClassIndex(const char* class_descriptor) const {
97 |   auto classes = ClassDefs();
98 |   auto types = TypeIds();
99 |   for (dex::u4 i = 0; i < classes.size(); ++i) {
100 |     auto typeId = types[classes[i].class_idx];
101 |     const char* descriptor = GetStringMUTF8(typeId.descriptor_idx);
102 |     if (strcmp(class_descriptor, descriptor) == 0) {
103 |       return i;
104 |     }
105 |   }
106 |   return dex::kNoIndex;
107 | }
108 |
109 | // map a .dex index to corresponding .dex IR node
110 | //
111 | // NOTES:
112 | //  1. the mapping between an index and the indexed
113 | //     .dex IR nodes is 1:1
114 | //  2. we do a single index lookup for both existing
115 | //     nodes as well as new nodes
116 | //  3. placeholder is an invalid, but non-null pointer value
117 | //     used to check that the mapping lookup/update is atomic
118 | //  4.
there should be no recursion with the same index 119 | // (we use the placeholder value to guard against this too) 120 | // 121 | ir::Class* Reader::GetClass(dex::u4 index) { 122 | SLICER_CHECK(index != dex::kNoIndex); 123 | auto& p = dex_ir_->classes_map[index]; 124 | auto placeholder = reinterpret_cast(1); 125 | if (p == nullptr) { 126 | p = placeholder; 127 | auto newClass = ParseClass(index); 128 | SLICER_CHECK(p == placeholder); 129 | p = newClass; 130 | dex_ir_->classes_indexes.MarkUsedIndex(index); 131 | } 132 | SLICER_CHECK(p != placeholder); 133 | return p; 134 | } 135 | 136 | // map a .dex index to corresponding .dex IR node 137 | // (see the Reader::GetClass() comments) 138 | ir::Type* Reader::GetType(dex::u4 index) { 139 | SLICER_CHECK(index != dex::kNoIndex); 140 | auto& p = dex_ir_->types_map[index]; 141 | auto placeholder = reinterpret_cast(1); 142 | if (p == nullptr) { 143 | p = placeholder; 144 | auto newType = ParseType(index); 145 | SLICER_CHECK(p == placeholder); 146 | p = newType; 147 | dex_ir_->types_indexes.MarkUsedIndex(index); 148 | } 149 | SLICER_CHECK(p != placeholder); 150 | return p; 151 | } 152 | 153 | // map a .dex index to corresponding .dex IR node 154 | // (see the Reader::GetClass() comments) 155 | ir::FieldDecl* Reader::GetFieldDecl(dex::u4 index) { 156 | SLICER_CHECK(index != dex::kNoIndex); 157 | auto& p = dex_ir_->fields_map[index]; 158 | auto placeholder = reinterpret_cast(1); 159 | if (p == nullptr) { 160 | p = placeholder; 161 | auto newField = ParseFieldDecl(index); 162 | SLICER_CHECK(p == placeholder); 163 | p = newField; 164 | dex_ir_->fields_indexes.MarkUsedIndex(index); 165 | } 166 | SLICER_CHECK(p != placeholder); 167 | return p; 168 | } 169 | 170 | // map a .dex index to corresponding .dex IR node 171 | // (see the Reader::GetClass() comments) 172 | ir::MethodDecl* Reader::GetMethodDecl(dex::u4 index) { 173 | SLICER_CHECK(index != dex::kNoIndex); 174 | auto& p = dex_ir_->methods_map[index]; 175 | auto placeholder = 
reinterpret_cast(1); 176 | if (p == nullptr) { 177 | p = placeholder; 178 | auto newMethod = ParseMethodDecl(index); 179 | SLICER_CHECK(p == placeholder); 180 | p = newMethod; 181 | dex_ir_->methods_indexes.MarkUsedIndex(index); 182 | } 183 | SLICER_CHECK(p != placeholder); 184 | return p; 185 | } 186 | 187 | // map a .dex index to corresponding .dex IR node 188 | // (see the Reader::GetClass() comments) 189 | ir::Proto* Reader::GetProto(dex::u4 index) { 190 | SLICER_CHECK(index != dex::kNoIndex); 191 | auto& p = dex_ir_->protos_map[index]; 192 | auto placeholder = reinterpret_cast(1); 193 | if (p == nullptr) { 194 | p = placeholder; 195 | auto newProto = ParseProto(index); 196 | SLICER_CHECK(p == placeholder); 197 | p = newProto; 198 | dex_ir_->protos_indexes.MarkUsedIndex(index); 199 | } 200 | SLICER_CHECK(p != placeholder); 201 | return p; 202 | } 203 | 204 | // map a .dex index to corresponding .dex IR node 205 | // (see the Reader::GetClass() comments) 206 | ir::String* Reader::GetString(dex::u4 index) { 207 | SLICER_CHECK(index != dex::kNoIndex); 208 | auto& p = dex_ir_->strings_map[index]; 209 | auto placeholder = reinterpret_cast(1); 210 | if (p == nullptr) { 211 | p = placeholder; 212 | auto newString = ParseString(index); 213 | SLICER_CHECK(p == placeholder); 214 | p = newString; 215 | dex_ir_->strings_indexes.MarkUsedIndex(index); 216 | } 217 | SLICER_CHECK(p != placeholder); 218 | return p; 219 | } 220 | 221 | ir::Class* Reader::ParseClass(dex::u4 index) { 222 | auto& dex_class_def = ClassDefs()[index]; 223 | auto ir_class = dex_ir_->Alloc(); 224 | 225 | ir_class->type = GetType(dex_class_def.class_idx); 226 | assert(ir_class->type->class_def == nullptr); 227 | ir_class->type->class_def = ir_class; 228 | 229 | ir_class->access_flags = dex_class_def.access_flags; 230 | ir_class->interfaces = ExtractTypeList(dex_class_def.interfaces_off); 231 | 232 | if (dex_class_def.superclass_idx != dex::kNoIndex) { 233 | ir_class->super_class = 
GetType(dex_class_def.superclass_idx); 234 | } 235 | 236 | if (dex_class_def.source_file_idx != dex::kNoIndex) { 237 | ir_class->source_file = GetString(dex_class_def.source_file_idx); 238 | } 239 | 240 | if (dex_class_def.class_data_off != 0) { 241 | const dex::u1* class_data = dataPtr(dex_class_def.class_data_off); 242 | 243 | dex::u4 static_fields_count = dex::ReadULeb128(&class_data); 244 | dex::u4 instance_fields_count = dex::ReadULeb128(&class_data); 245 | dex::u4 direct_methods_count = dex::ReadULeb128(&class_data); 246 | dex::u4 virtual_methods_count = dex::ReadULeb128(&class_data); 247 | 248 | dex::u4 base_index = dex::kNoIndex; 249 | for (dex::u4 i = 0; i < static_fields_count; ++i) { 250 | auto field = ParseEncodedField(&class_data, &base_index); 251 | ir_class->static_fields.push_back(field); 252 | } 253 | 254 | base_index = dex::kNoIndex; 255 | for (dex::u4 i = 0; i < instance_fields_count; ++i) { 256 | auto field = ParseEncodedField(&class_data, &base_index); 257 | ir_class->instance_fields.push_back(field); 258 | } 259 | 260 | base_index = dex::kNoIndex; 261 | for (dex::u4 i = 0; i < direct_methods_count; ++i) { 262 | auto method = ParseEncodedMethod(&class_data, &base_index); 263 | ir_class->direct_methods.push_back(method); 264 | } 265 | 266 | base_index = dex::kNoIndex; 267 | for (dex::u4 i = 0; i < virtual_methods_count; ++i) { 268 | auto method = ParseEncodedMethod(&class_data, &base_index); 269 | ir_class->virtual_methods.push_back(method); 270 | } 271 | } 272 | 273 | ir_class->static_init = ExtractEncodedArray(dex_class_def.static_values_off); 274 | ir_class->annotations = ExtractAnnotations(dex_class_def.annotations_off); 275 | ir_class->orig_index = index; 276 | 277 | return ir_class; 278 | } 279 | 280 | ir::AnnotationsDirectory* Reader::ExtractAnnotations(dex::u4 offset) { 281 | if (offset == 0) { 282 | return nullptr; 283 | } 284 | 285 | SLICER_CHECK(offset % 4 == 0); 286 | 287 | // first check if we already extracted the same 
"annotations_directory_item" 288 | auto& ir_annotations = annotations_directories_[offset]; 289 | if (ir_annotations == nullptr) { 290 | ir_annotations = dex_ir_->Alloc(); 291 | 292 | auto dex_annotations = dataPtr(offset); 293 | 294 | ir_annotations->class_annotation = 295 | ExtractAnnotationSet(dex_annotations->class_annotations_off); 296 | 297 | const dex::u1* ptr = reinterpret_cast(dex_annotations + 1); 298 | 299 | for (dex::u4 i = 0; i < dex_annotations->fields_size; ++i) { 300 | ir_annotations->field_annotations.push_back(ParseFieldAnnotation(&ptr)); 301 | } 302 | 303 | for (dex::u4 i = 0; i < dex_annotations->methods_size; ++i) { 304 | ir_annotations->method_annotations.push_back(ParseMethodAnnotation(&ptr)); 305 | } 306 | 307 | for (dex::u4 i = 0; i < dex_annotations->parameters_size; ++i) { 308 | ir_annotations->param_annotations.push_back(ParseParamAnnotation(&ptr)); 309 | } 310 | } 311 | return ir_annotations; 312 | } 313 | 314 | ir::Annotation* Reader::ExtractAnnotationItem(dex::u4 offset) { 315 | SLICER_CHECK(offset != 0); 316 | 317 | // first check if we already extracted the same "annotation_item" 318 | auto& ir_annotation = annotations_[offset]; 319 | if (ir_annotation == nullptr) { 320 | auto dexAnnotationItem = dataPtr(offset); 321 | const dex::u1* ptr = dexAnnotationItem->annotation; 322 | ir_annotation = ParseAnnotation(&ptr); 323 | ir_annotation->visibility = dexAnnotationItem->visibility; 324 | } 325 | return ir_annotation; 326 | } 327 | 328 | ir::AnnotationSet* Reader::ExtractAnnotationSet(dex::u4 offset) { 329 | if (offset == 0) { 330 | return nullptr; 331 | } 332 | 333 | SLICER_CHECK(offset % 4 == 0); 334 | 335 | // first check if we already extracted the same "annotation_set_item" 336 | auto& ir_annotation_set = annotation_sets_[offset]; 337 | if (ir_annotation_set == nullptr) { 338 | ir_annotation_set = dex_ir_->Alloc(); 339 | 340 | auto dex_annotation_set = dataPtr(offset); 341 | for (dex::u4 i = 0; i < dex_annotation_set->size; ++i) { 
342 | auto ir_annotation = ExtractAnnotationItem(dex_annotation_set->entries[i]); 343 | assert(ir_annotation != nullptr); 344 | ir_annotation_set->annotations.push_back(ir_annotation); 345 | } 346 | } 347 | return ir_annotation_set; 348 | } 349 | 350 | ir::AnnotationSetRefList* Reader::ExtractAnnotationSetRefList(dex::u4 offset) { 351 | SLICER_CHECK(offset % 4 == 0); 352 | 353 | auto dex_annotation_set_ref_list = dataPtr(offset); 354 | auto ir_annotation_set_ref_list = dex_ir_->Alloc(); 355 | 356 | for (dex::u4 i = 0; i < dex_annotation_set_ref_list->size; ++i) { 357 | dex::u4 entry_offset = dex_annotation_set_ref_list->list[i].annotations_off; 358 | if (entry_offset != 0) { 359 | auto ir_annotation_set = ExtractAnnotationSet(entry_offset); 360 | SLICER_CHECK(ir_annotation_set != nullptr); 361 | ir_annotation_set_ref_list->annotations.push_back(ir_annotation_set); 362 | } 363 | } 364 | 365 | return ir_annotation_set_ref_list; 366 | } 367 | 368 | ir::FieldAnnotation* Reader::ParseFieldAnnotation(const dex::u1** pptr) { 369 | auto dex_field_annotation = reinterpret_cast(*pptr); 370 | auto ir_field_annotation = dex_ir_->Alloc(); 371 | 372 | ir_field_annotation->field_decl = GetFieldDecl(dex_field_annotation->field_idx); 373 | 374 | ir_field_annotation->annotations = 375 | ExtractAnnotationSet(dex_field_annotation->annotations_off); 376 | SLICER_CHECK(ir_field_annotation->annotations != nullptr); 377 | 378 | *pptr += sizeof(dex::FieldAnnotationsItem); 379 | return ir_field_annotation; 380 | } 381 | 382 | ir::MethodAnnotation* Reader::ParseMethodAnnotation(const dex::u1** pptr) { 383 | auto dex_method_annotation = 384 | reinterpret_cast(*pptr); 385 | auto ir_method_annotation = dex_ir_->Alloc(); 386 | 387 | ir_method_annotation->method_decl = GetMethodDecl(dex_method_annotation->method_idx); 388 | 389 | ir_method_annotation->annotations = 390 | ExtractAnnotationSet(dex_method_annotation->annotations_off); 391 | SLICER_CHECK(ir_method_annotation->annotations != nullptr); 
392 | 393 | *pptr += sizeof(dex::MethodAnnotationsItem); 394 | return ir_method_annotation; 395 | } 396 | 397 | ir::ParamAnnotation* Reader::ParseParamAnnotation(const dex::u1** pptr) { 398 | auto dex_param_annotation = 399 | reinterpret_cast(*pptr); 400 | auto ir_param_annotation = dex_ir_->Alloc(); 401 | 402 | ir_param_annotation->method_decl = GetMethodDecl(dex_param_annotation->method_idx); 403 | 404 | ir_param_annotation->annotations = 405 | ExtractAnnotationSetRefList(dex_param_annotation->annotations_off); 406 | SLICER_CHECK(ir_param_annotation->annotations != nullptr); 407 | 408 | *pptr += sizeof(dex::ParameterAnnotationsItem); 409 | return ir_param_annotation; 410 | } 411 | 412 | ir::EncodedField* Reader::ParseEncodedField(const dex::u1** pptr, dex::u4* base_index) { 413 | auto ir_encoded_field = dex_ir_->Alloc(); 414 | 415 | auto field_index = dex::ReadULeb128(pptr); 416 | SLICER_CHECK(field_index != dex::kNoIndex); 417 | if (*base_index != dex::kNoIndex) { 418 | SLICER_CHECK(field_index != 0); 419 | field_index += *base_index; 420 | } 421 | *base_index = field_index; 422 | 423 | ir_encoded_field->decl = GetFieldDecl(field_index); 424 | ir_encoded_field->access_flags = dex::ReadULeb128(pptr); 425 | 426 | return ir_encoded_field; 427 | } 428 | 429 | // Parse an encoded variable-length integer value 430 | // (sign-extend signed types, zero-extend unsigned types) 431 | template 432 | static T ParseIntValue(const dex::u1** pptr, size_t size) { 433 | static_assert(std::is_integral::value, "must be an integral type"); 434 | 435 | SLICER_CHECK(size > 0); 436 | SLICER_CHECK(size <= sizeof(T)); 437 | 438 | T value = 0; 439 | for (size_t i = 0; i < size; ++i) { 440 | value |= T(*(*pptr)++) << (i * 8); 441 | } 442 | 443 | // sign-extend? 
444 | if (std::is_signed::value) { 445 | size_t shift = (sizeof(T) - size) * 8; 446 | value = T(value << shift) >> shift; 447 | } 448 | 449 | return value; 450 | } 451 | 452 | // Parse an encoded variable-length floating point value 453 | // (zero-extend to the right) 454 | template 455 | static T ParseFloatValue(const dex::u1** pptr, size_t size) { 456 | SLICER_CHECK(size > 0); 457 | SLICER_CHECK(size <= sizeof(T)); 458 | 459 | T value = 0; 460 | int start_byte = sizeof(T) - size; 461 | for (dex::u1* p = reinterpret_cast(&value) + start_byte; size > 0; 462 | --size) { 463 | *p++ = *(*pptr)++; 464 | } 465 | return value; 466 | } 467 | 468 | ir::EncodedValue* Reader::ParseEncodedValue(const dex::u1** pptr) { 469 | auto ir_encoded_value = dex_ir_->Alloc(); 470 | 471 | SLICER_EXTRA(auto base_ptr = *pptr); 472 | 473 | dex::u1 header = *(*pptr)++; 474 | dex::u1 type = header & dex::kEncodedValueTypeMask; 475 | dex::u1 arg = header >> dex::kEncodedValueArgShift; 476 | 477 | ir_encoded_value->type = type; 478 | 479 | switch (type) { 480 | case dex::kEncodedByte: 481 | ir_encoded_value->u.byte_value = ParseIntValue(pptr, arg + 1); 482 | break; 483 | 484 | case dex::kEncodedShort: 485 | ir_encoded_value->u.short_value = ParseIntValue(pptr, arg + 1); 486 | break; 487 | 488 | case dex::kEncodedChar: 489 | ir_encoded_value->u.char_value = ParseIntValue(pptr, arg + 1); 490 | break; 491 | 492 | case dex::kEncodedInt: 493 | ir_encoded_value->u.int_value = ParseIntValue(pptr, arg + 1); 494 | break; 495 | 496 | case dex::kEncodedLong: 497 | ir_encoded_value->u.long_value = ParseIntValue(pptr, arg + 1); 498 | break; 499 | 500 | case dex::kEncodedFloat: 501 | ir_encoded_value->u.float_value = ParseFloatValue(pptr, arg + 1); 502 | break; 503 | 504 | case dex::kEncodedDouble: 505 | ir_encoded_value->u.double_value = ParseFloatValue(pptr, arg + 1); 506 | break; 507 | 508 | case dex::kEncodedString: { 509 | dex::u4 index = ParseIntValue(pptr, arg + 1); 510 | 
ir_encoded_value->u.string_value = GetString(index); 511 | } break; 512 | 513 | case dex::kEncodedType: { 514 | dex::u4 index = ParseIntValue(pptr, arg + 1); 515 | ir_encoded_value->u.type_value = GetType(index); 516 | } break; 517 | 518 | case dex::kEncodedField: { 519 | dex::u4 index = ParseIntValue(pptr, arg + 1); 520 | ir_encoded_value->u.field_value = GetFieldDecl(index); 521 | } break; 522 | 523 | case dex::kEncodedMethod: { 524 | dex::u4 index = ParseIntValue(pptr, arg + 1); 525 | ir_encoded_value->u.method_value = GetMethodDecl(index); 526 | } break; 527 | 528 | case dex::kEncodedEnum: { 529 | dex::u4 index = ParseIntValue(pptr, arg + 1); 530 | ir_encoded_value->u.enum_value = GetFieldDecl(index); 531 | } break; 532 | 533 | case dex::kEncodedArray: 534 | SLICER_CHECK(arg == 0); 535 | ir_encoded_value->u.array_value = ParseEncodedArray(pptr); 536 | break; 537 | 538 | case dex::kEncodedAnnotation: 539 | SLICER_CHECK(arg == 0); 540 | ir_encoded_value->u.annotation_value = ParseAnnotation(pptr); 541 | break; 542 | 543 | case dex::kEncodedNull: 544 | SLICER_CHECK(arg == 0); 545 | break; 546 | 547 | case dex::kEncodedBoolean: 548 | SLICER_CHECK(arg < 2); 549 | ir_encoded_value->u.bool_value = (arg == 1); 550 | break; 551 | 552 | default: 553 | SLICER_CHECK(!"unexpected value type"); 554 | } 555 | 556 | SLICER_EXTRA(ir_encoded_value->original = slicer::MemView(base_ptr, *pptr - base_ptr)); 557 | 558 | return ir_encoded_value; 559 | } 560 | 561 | ir::Annotation* Reader::ParseAnnotation(const dex::u1** pptr) { 562 | auto ir_annotation = dex_ir_->Alloc(); 563 | 564 | dex::u4 type_index = dex::ReadULeb128(pptr); 565 | dex::u4 elements_count = dex::ReadULeb128(pptr); 566 | 567 | ir_annotation->type = GetType(type_index); 568 | ir_annotation->visibility = dex::kVisibilityEncoded; 569 | 570 | for (dex::u4 i = 0; i < elements_count; ++i) { 571 | auto ir_element = dex_ir_->Alloc(); 572 | 573 | ir_element->name = GetString(dex::ReadULeb128(pptr)); 574 | ir_element->value = 
ParseEncodedValue(pptr); 575 | 576 | ir_annotation->elements.push_back(ir_element); 577 | } 578 | 579 | return ir_annotation; 580 | } 581 | 582 | ir::EncodedArray* Reader::ParseEncodedArray(const dex::u1** pptr) { 583 | auto ir_encoded_array = dex_ir_->Alloc(); 584 | 585 | dex::u4 count = dex::ReadULeb128(pptr); 586 | for (dex::u4 i = 0; i < count; ++i) { 587 | ir_encoded_array->values.push_back(ParseEncodedValue(pptr)); 588 | } 589 | 590 | return ir_encoded_array; 591 | } 592 | 593 | ir::EncodedArray* Reader::ExtractEncodedArray(dex::u4 offset) { 594 | if (offset == 0) { 595 | return nullptr; 596 | } 597 | 598 | // first check if we already extracted the same "annotation_item" 599 | auto& ir_encoded_array = encoded_arrays_[offset]; 600 | if (ir_encoded_array == nullptr) { 601 | auto ptr = dataPtr(offset); 602 | ir_encoded_array = ParseEncodedArray(&ptr); 603 | } 604 | return ir_encoded_array; 605 | } 606 | 607 | ir::DebugInfo* Reader::ExtractDebugInfo(dex::u4 offset) { 608 | if (offset == 0) { 609 | return nullptr; 610 | } 611 | 612 | auto ir_debug_info = dex_ir_->Alloc(); 613 | const dex::u1* ptr = dataPtr(offset); 614 | 615 | ir_debug_info->line_start = dex::ReadULeb128(&ptr); 616 | 617 | // TODO: implicit this param for non-static methods? 618 | dex::u4 param_count = dex::ReadULeb128(&ptr); 619 | for (dex::u4 i = 0; i < param_count; ++i) { 620 | dex::u4 name_index = dex::ReadULeb128(&ptr) - 1; 621 | auto ir_string = 622 | (name_index == dex::kNoIndex) ? nullptr : GetString(name_index); 623 | ir_debug_info->param_names.push_back(ir_string); 624 | } 625 | 626 | // parse the debug info opcodes and note the 627 | // references to strings and types (to make sure the IR 628 | // is the full closure of all referenced items) 629 | // 630 | // TODO: design a generic debug info iterator? 
631 | // 632 | auto base_ptr = ptr; 633 | dex::u1 opcode = 0; 634 | while ((opcode = *ptr++) != dex::DBG_END_SEQUENCE) { 635 | switch (opcode) { 636 | case dex::DBG_ADVANCE_PC: 637 | // addr_diff 638 | dex::ReadULeb128(&ptr); 639 | break; 640 | 641 | case dex::DBG_ADVANCE_LINE: 642 | // line_diff 643 | dex::ReadSLeb128(&ptr); 644 | break; 645 | 646 | case dex::DBG_START_LOCAL: { 647 | // register_num 648 | dex::ReadULeb128(&ptr); 649 | 650 | dex::u4 name_index = dex::ReadULeb128(&ptr) - 1; 651 | if (name_index != dex::kNoIndex) { 652 | GetString(name_index); 653 | } 654 | 655 | dex::u4 type_index = dex::ReadULeb128(&ptr) - 1; 656 | if (type_index != dex::kNoIndex) { 657 | GetType(type_index); 658 | } 659 | } break; 660 | 661 | case dex::DBG_START_LOCAL_EXTENDED: { 662 | // register_num 663 | dex::ReadULeb128(&ptr); 664 | 665 | dex::u4 name_index = dex::ReadULeb128(&ptr) - 1; 666 | if (name_index != dex::kNoIndex) { 667 | GetString(name_index); 668 | } 669 | 670 | dex::u4 type_index = dex::ReadULeb128(&ptr) - 1; 671 | if (type_index != dex::kNoIndex) { 672 | GetType(type_index); 673 | } 674 | 675 | dex::u4 sig_index = dex::ReadULeb128(&ptr) - 1; 676 | if (sig_index != dex::kNoIndex) { 677 | GetString(sig_index); 678 | } 679 | } break; 680 | 681 | case dex::DBG_END_LOCAL: 682 | case dex::DBG_RESTART_LOCAL: 683 | // register_num 684 | dex::ReadULeb128(&ptr); 685 | break; 686 | 687 | case dex::DBG_SET_FILE: { 688 | dex::u4 name_index = dex::ReadULeb128(&ptr) - 1; 689 | if (name_index != dex::kNoIndex) { 690 | GetString(name_index); 691 | } 692 | } break; 693 | } 694 | } 695 | 696 | ir_debug_info->data = slicer::MemView(base_ptr, ptr - base_ptr); 697 | 698 | return ir_debug_info; 699 | } 700 | 701 | ir::Code* Reader::ExtractCode(dex::u4 offset) { 702 | if (offset == 0) { 703 | return nullptr; 704 | } 705 | 706 | SLICER_CHECK(offset % 4 == 0); 707 | 708 | auto dex_code = dataPtr(offset); 709 | auto ir_code = dex_ir_->Alloc(); 710 | 711 | ir_code->registers = 
dex_code->registers_size; 712 | ir_code->ins_count = dex_code->ins_size; 713 | ir_code->outs_count = dex_code->outs_size; 714 | 715 | // instructions array 716 | ir_code->instructions = 717 | slicer::ArrayView(dex_code->insns, dex_code->insns_size); 718 | 719 | // parse the instructions to discover references to other 720 | // IR nodes (see debug info stream parsing too) 721 | ParseInstructions(ir_code->instructions); 722 | 723 | // try blocks & handlers 724 | // 725 | // TODO: a generic try/catch blocks iterator? 726 | // 727 | if (dex_code->tries_size != 0) { 728 | dex::u4 aligned_count = (dex_code->insns_size + 1) / 2 * 2; 729 | auto tries = 730 | reinterpret_cast(dex_code->insns + aligned_count); 731 | auto handlers_list = 732 | reinterpret_cast(tries + dex_code->tries_size); 733 | 734 | ir_code->try_blocks = 735 | slicer::ArrayView(tries, dex_code->tries_size); 736 | 737 | // parse the handlers list (and discover embedded references) 738 | auto ptr = handlers_list; 739 | 740 | dex::u4 handlers_count = dex::ReadULeb128(&ptr); 741 | SLICER_WEAK_CHECK(handlers_count <= dex_code->tries_size); 742 | 743 | for (dex::u4 handler_index = 0; handler_index < handlers_count; ++handler_index) { 744 | int catch_count = dex::ReadSLeb128(&ptr); 745 | 746 | for (int catch_index = 0; catch_index < std::abs(catch_count); ++catch_index) { 747 | dex::u4 type_index = dex::ReadULeb128(&ptr); 748 | GetType(type_index); 749 | 750 | // address 751 | dex::ReadULeb128(&ptr); 752 | } 753 | 754 | if (catch_count < 1) { 755 | // catch_all_addr 756 | dex::ReadULeb128(&ptr); 757 | } 758 | } 759 | 760 | ir_code->catch_handlers = slicer::MemView(handlers_list, ptr - handlers_list); 761 | } 762 | 763 | ir_code->debug_info = ExtractDebugInfo(dex_code->debug_info_off); 764 | 765 | return ir_code; 766 | } 767 | 768 | ir::EncodedMethod* Reader::ParseEncodedMethod(const dex::u1** pptr, dex::u4* base_index) { 769 | auto ir_encoded_method = dex_ir_->Alloc(); 770 | 771 | auto method_index = 
dex::ReadULeb128(pptr); 772 | SLICER_CHECK(method_index != dex::kNoIndex); 773 | if (*base_index != dex::kNoIndex) { 774 | SLICER_CHECK(method_index != 0); 775 | method_index += *base_index; 776 | } 777 | *base_index = method_index; 778 | 779 | ir_encoded_method->decl = GetMethodDecl(method_index); 780 | ir_encoded_method->access_flags = dex::ReadULeb128(pptr); 781 | 782 | dex::u4 code_offset = dex::ReadULeb128(pptr); 783 | ir_encoded_method->code = ExtractCode(code_offset); 784 | 785 | // update the methods lookup table 786 | dex_ir_->methods_lookup.Insert(ir_encoded_method); 787 | 788 | return ir_encoded_method; 789 | } 790 | 791 | ir::Type* Reader::ParseType(dex::u4 index) { 792 | auto& dex_type = TypeIds()[index]; 793 | auto ir_type = dex_ir_->Alloc(); 794 | 795 | ir_type->descriptor = GetString(dex_type.descriptor_idx); 796 | ir_type->orig_index = index; 797 | 798 | return ir_type; 799 | } 800 | 801 | ir::FieldDecl* Reader::ParseFieldDecl(dex::u4 index) { 802 | auto& dex_field = FieldIds()[index]; 803 | auto ir_field = dex_ir_->Alloc(); 804 | 805 | ir_field->name = GetString(dex_field.name_idx); 806 | ir_field->type = GetType(dex_field.type_idx); 807 | ir_field->parent = GetType(dex_field.class_idx); 808 | ir_field->orig_index = index; 809 | 810 | return ir_field; 811 | } 812 | 813 | ir::MethodDecl* Reader::ParseMethodDecl(dex::u4 index) { 814 | auto& dex_method = MethodIds()[index]; 815 | auto ir_method = dex_ir_->Alloc(); 816 | 817 | ir_method->name = GetString(dex_method.name_idx); 818 | ir_method->prototype = GetProto(dex_method.proto_idx); 819 | ir_method->parent = GetType(dex_method.class_idx); 820 | ir_method->orig_index = index; 821 | 822 | return ir_method; 823 | } 824 | 825 | ir::TypeList* Reader::ExtractTypeList(dex::u4 offset) { 826 | if (offset == 0) { 827 | return nullptr; 828 | } 829 | 830 | // first check to see if we already extracted the same "type_list" 831 | auto& ir_type_list = type_lists_[offset]; 832 | if (ir_type_list == nullptr) { 833 
| ir_type_list = dex_ir_->Alloc(); 834 | 835 | auto dex_type_list = dataPtr(offset); 836 | SLICER_WEAK_CHECK(dex_type_list->size > 0); 837 | 838 | for (dex::u4 i = 0; i < dex_type_list->size; ++i) { 839 | ir_type_list->types.push_back(GetType(dex_type_list->list[i].type_idx)); 840 | } 841 | } 842 | 843 | return ir_type_list; 844 | } 845 | 846 | ir::Proto* Reader::ParseProto(dex::u4 index) { 847 | auto& dex_proto = ProtoIds()[index]; 848 | auto ir_proto = dex_ir_->Alloc(); 849 | 850 | ir_proto->shorty = GetString(dex_proto.shorty_idx); 851 | ir_proto->return_type = GetType(dex_proto.return_type_idx); 852 | ir_proto->param_types = ExtractTypeList(dex_proto.parameters_off); 853 | ir_proto->orig_index = index; 854 | 855 | // update the prototypes lookup table 856 | dex_ir_->prototypes_lookup.Insert(ir_proto); 857 | 858 | return ir_proto; 859 | } 860 | 861 | ir::String* Reader::ParseString(dex::u4 index) { 862 | auto ir_string = dex_ir_->Alloc(); 863 | 864 | auto data = GetStringData(index); 865 | auto cstr = data; 866 | dex::ReadULeb128(&cstr); 867 | size_t size = (cstr - data) + ::strlen(reinterpret_cast(cstr)) + 1; 868 | 869 | ir_string->data = slicer::MemView(data, size); 870 | ir_string->orig_index = index; 871 | 872 | // update the strings lookup table 873 | dex_ir_->strings_lookup.Insert(ir_string); 874 | 875 | return ir_string; 876 | } 877 | 878 | void Reader::ParseInstructions(slicer::ArrayView code) { 879 | const dex::u2* ptr = code.begin(); 880 | while (ptr < code.end()) { 881 | auto dex_instr = dex::DecodeInstruction(ptr); 882 | 883 | dex::u4 index = dex::kNoIndex; 884 | switch (dex::GetFormatFromOpcode(dex_instr.opcode)) { 885 | case dex::k20bc: 886 | case dex::k21c: 887 | case dex::k31c: 888 | case dex::k35c: 889 | case dex::k3rc: 890 | index = dex_instr.vB; 891 | break; 892 | 893 | case dex::k22c: 894 | index = dex_instr.vC; 895 | break; 896 | 897 | default: 898 | break; 899 | } 900 | 901 | switch (GetIndexTypeFromOpcode(dex_instr.opcode)) { 902 | case 
dex::kIndexStringRef: 903 | GetString(index); 904 | break; 905 | 906 | case dex::kIndexTypeRef: 907 | GetType(index); 908 | break; 909 | 910 | case dex::kIndexFieldRef: 911 | GetFieldDecl(index); 912 | break; 913 | 914 | case dex::kIndexMethodRef: 915 | GetMethodDecl(index); 916 | break; 917 | 918 | default: 919 | break; 920 | } 921 | 922 | auto isize = dex::GetWidthFromBytecode(ptr); 923 | SLICER_CHECK(isize > 0); 924 | ptr += isize; 925 | } 926 | SLICER_CHECK(ptr == code.end()); 927 | } 928 | 929 | // Basic .dex header structural checks 930 | void Reader::ValidateHeader() { 931 | SLICER_CHECK(size_ > sizeof(dex::Header)); 932 | 933 | // Known issue: For performance reasons the initial size_ passed to jvmti events might be an 934 | // estimate. b/72402467 935 | SLICER_CHECK(header_->file_size <= size_); 936 | SLICER_CHECK(header_->header_size == sizeof(dex::Header)); 937 | SLICER_CHECK(header_->endian_tag == dex::kEndianConstant); 938 | SLICER_CHECK(header_->data_size % 4 == 0); 939 | 940 | // Known issue: The fields might be slightly corrupted b/65452964 941 | // SLICER_CHECK(header_->data_off + header_->data_size <= size_); 942 | 943 | SLICER_CHECK(header_->string_ids_off % 4 == 0); 944 | SLICER_CHECK(header_->type_ids_size < 65536); 945 | SLICER_CHECK(header_->type_ids_off % 4 == 0); 946 | SLICER_CHECK(header_->proto_ids_size < 65536); 947 | SLICER_CHECK(header_->proto_ids_off % 4 == 0); 948 | SLICER_CHECK(header_->field_ids_off % 4 == 0); 949 | SLICER_CHECK(header_->method_ids_off % 4 == 0); 950 | SLICER_CHECK(header_->class_defs_off % 4 == 0); 951 | SLICER_CHECK(header_->map_off >= header_->data_off && header_->map_off < size_); 952 | SLICER_CHECK(header_->link_size == 0); 953 | SLICER_CHECK(header_->link_off == 0); 954 | SLICER_CHECK(header_->data_off % 4 == 0); 955 | SLICER_CHECK(header_->map_off % 4 == 0); 956 | 957 | // we seem to have .dex files with extra bytes at the end ... 
958 | // Known issue: For performance reasons the initial size_ passed to jvmti events might be an 959 | // estimate. b/72402467 960 | SLICER_WEAK_CHECK(header_->data_off + header_->data_size <= size_); 961 | 962 | // but we should still have the whole data section 963 | 964 | // Known issue: The fields might be slightly corrupted b/65452964 965 | // Known issue: For performance reasons the initial size_ passed to jvmti events might be an 966 | // estimate. b/72402467 967 | // SLICER_CHECK(header_->data_off + header_->data_size <= size_); 968 | 969 | // validate the map 970 | // (map section size = sizeof(MapList::size) + sizeof(MapList::list[size])) 971 | auto map_list = ptr<dex::MapList>(header_->map_off); 972 | SLICER_CHECK(map_list->size > 0); 973 | auto map_section_size = 974 | sizeof(dex::u4) + sizeof(dex::MapItem) * map_list->size; 975 | SLICER_CHECK(header_->map_off + map_section_size <= size_); 976 | } 977 | 978 | } // namespace dex 979 | -------------------------------------------------------------------------------- /slicer/arrayview.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 
15 | */ 16 | 17 | #pragma once 18 | 19 | #include "common.h" 20 | 21 | #include 22 | 23 | namespace slicer { 24 | 25 | // A shallow array view 26 | template 27 | class ArrayView { 28 | public: 29 | ArrayView() = default; 30 | 31 | ArrayView(const ArrayView&) = default; 32 | ArrayView& operator=(const ArrayView&) = default; 33 | 34 | ArrayView(T* ptr, size_t count) : begin_(ptr), end_(ptr + count) {} 35 | 36 | T* begin() const { return begin_; } 37 | T* end() const { return end_; } 38 | 39 | T* data() const { return begin_; } 40 | 41 | T& operator[](size_t i) const { 42 | SLICER_CHECK(i < size()); 43 | return *(begin_ + i); 44 | } 45 | 46 | size_t size() const { return end_ - begin_; } 47 | bool empty() const { return begin_ == end_; } 48 | 49 | private: 50 | T* begin_ = nullptr; 51 | T* end_ = nullptr; 52 | }; 53 | 54 | } // namespace slicer 55 | -------------------------------------------------------------------------------- /slicer/buffer.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 
15 | */ 16 | 17 | #pragma once 18 | 19 | #include "common.h" 20 | #include "arrayview.h" 21 | #include "memview.h" 22 | #include "dex_leb128.h" 23 | 24 | #include 25 | #include 26 | #include 27 | #include 28 | #include 29 | 30 | namespace slicer { 31 | 32 | // A simple growing memory buffer 33 | // 34 | // NOTE: pointers into this buffer are not stable 35 | // since it may be relocated as it expands. 36 | // 37 | class Buffer { 38 | public: 39 | Buffer() = default; 40 | 41 | ~Buffer() { Free(); } 42 | 43 | Buffer(const Buffer&) = delete; 44 | Buffer& operator=(const Buffer&) = delete; 45 | 46 | Buffer(Buffer&& b) { 47 | std::swap(buff_, b.buff_); 48 | std::swap(size_, b.size_); 49 | std::swap(capacity_, b.capacity_); 50 | } 51 | 52 | Buffer& operator=(Buffer&& b) { 53 | Free(); 54 | std::swap(buff_, b.buff_); 55 | std::swap(size_, b.size_); 56 | std::swap(capacity_, b.capacity_); 57 | return *this; 58 | } 59 | 60 | public: 61 | // Align the total size and prevent further changes 62 | size_t Seal(size_t alignment) { 63 | SLICER_CHECK(!sealed_); 64 | Align(alignment); 65 | sealed_ = true; 66 | return size(); 67 | } 68 | 69 | // Returns a pointer within the buffer 70 | // 71 | // NOTE: the returned pointer is "ephemeral" and 72 | // is only valid until the next buffer push/alloc 73 | // 74 | template 75 | T* ptr(size_t offset) { 76 | SLICER_CHECK(offset + sizeof(T) <= size_); 77 | return reinterpret_cast(buff_ + offset); 78 | } 79 | 80 | // Align the buffer size to the specified alignment 81 | void Align(size_t alignment) { 82 | assert(alignment > 0); 83 | size_t rem = size_ % alignment; 84 | if (rem != 0) { 85 | Alloc(alignment - rem); 86 | } 87 | } 88 | 89 | size_t Alloc(size_t size) { 90 | size_t offset = size_; 91 | Expand(size); 92 | std::memset(buff_ + offset, 0, size); 93 | return offset; 94 | } 95 | 96 | size_t Push(const void* ptr, size_t size) { 97 | size_t offset = size_; 98 | Expand(size); 99 | std::memcpy(buff_ + offset, ptr, size); 100 | return offset; 
101 | } 102 | 103 | size_t Push(const MemView& memView) { 104 | return Push(memView.ptr(), memView.size()); 105 | } 106 | 107 | template 108 | size_t Push(const ArrayView& a) { 109 | return Push(a.data(), a.size() * sizeof(T)); 110 | } 111 | 112 | template 113 | size_t Push(const std::vector& v) { 114 | return Push(v.data(), v.size() * sizeof(T)); 115 | } 116 | 117 | size_t Push(const Buffer& buff) { 118 | SLICER_CHECK(&buff != this); 119 | return Push(buff.data(), buff.size()); 120 | } 121 | 122 | // TODO: this is really dangerous since it would 123 | // write any type - sometimes not what you expect. 124 | // 125 | template 126 | size_t Push(const T& value) { 127 | return Push(&value, sizeof(value)); 128 | } 129 | 130 | size_t PushULeb128(dex::u4 value) { 131 | dex::u1 tmp[4]; 132 | dex::u1* end = dex::WriteULeb128(tmp, value); 133 | assert(end > tmp && end - tmp <= 4); 134 | return Push(tmp, end - tmp); 135 | } 136 | 137 | size_t PushSLeb128(dex::s4 value) { 138 | dex::u1 tmp[4]; 139 | dex::u1* end = dex::WriteSLeb128(tmp, value); 140 | assert(end > tmp && end - tmp <= 4); 141 | return Push(tmp, end - tmp); 142 | } 143 | 144 | size_t size() const { return size_; } 145 | 146 | bool empty() const { return size_ == 0; } 147 | 148 | void Free() { 149 | ::free(buff_); 150 | buff_ = nullptr; 151 | size_ = 0; 152 | capacity_ = 0; 153 | } 154 | 155 | const dex::u1* data() const { 156 | SLICER_CHECK(buff_ != nullptr); 157 | return buff_; 158 | } 159 | 160 | private: 161 | void Expand(size_t size) { 162 | SLICER_CHECK(!sealed_); 163 | if (size_ + size > capacity_) { 164 | capacity_ = std::max(size_t(capacity_ * 1.5), size_ + size); 165 | buff_ = static_cast(::realloc(buff_, capacity_)); 166 | SLICER_CHECK(buff_ != nullptr); 167 | } 168 | size_ += size; 169 | } 170 | 171 | private: 172 | dex::u1* buff_ = nullptr; 173 | size_t size_ = 0; 174 | size_t capacity_ = 0; 175 | bool sealed_ = false; 176 | }; 177 | 178 | } // namespace slicer 179 | 180 | 
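Note on Buffer::PushULeb128 / PushSLeb128 above: both stage the encoded bytes in a 4-byte temporary. The sketch below is a standalone illustration of ULEB128 encoding and decoding, not the code in dex_leb128.h. The names EncodeULeb128 and DecodeULeb128 are hypothetical helpers introduced only for this example. It shows that each byte carries 7 payload bits, so a full 32-bit value can take up to five bytes, which is worth keeping in mind when sizing such a temporary.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not slicer's dex::WriteULeb128): encode a 32-bit
// value as ULEB128, 7 payload bits per byte, low-order group first.
std::vector<uint8_t> EncodeULeb128(uint32_t value) {
  std::vector<uint8_t> out;
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0) {
      byte |= 0x80;  // continuation bit: more groups follow
    }
    out.push_back(byte);
  } while (value != 0);
  return out;
}

// Hypothetical helper: decode a ULEB128 value and advance *pptr past it.
// Assumes well-formed input; reads at most five bytes.
uint32_t DecodeULeb128(const uint8_t** pptr) {
  uint32_t result = 0;
  int shift = 0;
  uint8_t byte;
  do {
    byte = *(*pptr)++;
    result |= static_cast<uint32_t>(byte & 0x7f) << shift;
    shift += 7;
  } while ((byte & 0x80) != 0 && shift < 35);
  return result;
}
```

For instance, 300 encodes to the two bytes 0xAC 0x02, while 0xFFFFFFFF needs five bytes. Decoding advances the caller's pointer in the same way the Reader functions above advance pptr.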
-------------------------------------------------------------------------------- /slicer/chronometer.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #include 20 | 21 | namespace slicer { 22 | 23 | // A very simple timing chronometer 24 | class Chronometer { 25 | using Clock = std::chrono::high_resolution_clock; 26 | 27 | public: 28 | // elapsed time is in milliseconds 29 | explicit Chronometer(double& elapsed, bool cumulative = false) : 30 | elapsed_(elapsed), cumulative_(cumulative) { 31 | start_time_ = Clock::now(); 32 | } 33 | 34 | ~Chronometer() { 35 | Clock::time_point end_time = Clock::now(); 36 | std::chrono::duration ms = end_time - start_time_; 37 | if (cumulative_) { 38 | elapsed_ += ms.count(); 39 | } else { 40 | elapsed_ = ms.count(); 41 | } 42 | } 43 | 44 | Chronometer(const Chronometer&) = delete; 45 | Chronometer& operator=(const Chronometer&) = delete; 46 | 47 | private: 48 | double& elapsed_; 49 | Clock::time_point start_time_; 50 | bool cumulative_; 51 | }; 52 | 53 | } // namespace slicer 54 | -------------------------------------------------------------------------------- /slicer/common.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source 
Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | namespace slicer { 20 | 21 | // Encapsulate the runtime check and error reporting policy. 22 | // (currently a simple fail-fast but the intention is to allow customization) 23 | void _checkFailed(const char* expr, int line, const char* file) __attribute__((noreturn)); 24 | #define SLICER_CHECK(expr) do { if(!(expr)) slicer::_checkFailed(#expr, __LINE__, __FILE__); } while(false) 25 | 26 | // A modal check: if the strict mode is enabled, it behaves as a SLICER_CHECK, 27 | // otherwise it will only log a warning and continue 28 | // 29 | // NOTE: we use SLICER_WEAK_CHECK for .dex format validations that are frequently 30 | // violated by existing apps. So we need to be able to annotate these common 31 | // problems and potentially ignore them for parity with the Android runtime. 32 | // 33 | void _weakCheckFailed(const char* expr, int line, const char* file); 34 | #define SLICER_WEAK_CHECK(expr) do { if(!(expr)) slicer::_weakCheckFailed(#expr, __LINE__, __FILE__); } while(false) 35 | 36 | // Report a fatal condition with a printf-formatted message 37 | void _fatal(const char* format, ...) __attribute__((noreturn)); 38 | #define SLICER_FATAL(format, ...) slicer::_fatal("\nSLICER_FATAL: " format "\n\n", ##__VA_ARGS__); 39 | 40 | // Annotation customization point for extra validation / state. 
41 | #ifdef NDEBUG 42 | #define SLICER_EXTRA(x) 43 | #else 44 | #define SLICER_EXTRA(x) x 45 | #endif 46 | 47 | #ifndef FALLTHROUGH_INTENDED 48 | #ifdef __clang__ 49 | #define FALLTHROUGH_INTENDED [[clang::fallthrough]] 50 | #else 51 | #define FALLTHROUGH_INTENDED 52 | #endif // __clang__ 53 | #endif // FALLTHROUGH_INTENDED 54 | 55 | } // namespace slicer 56 | 57 | -------------------------------------------------------------------------------- /slicer/dex_bytecode.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #include "dex_format.h" 20 | 21 | #include 22 | 23 | // .dex bytecode definitions and helpers: 24 | // https://source.android.com/devices/tech/dalvik/dalvik-bytecode.html 25 | 26 | namespace dex { 27 | 28 | // The number of Dalvik opcodes 29 | constexpr size_t kNumPackedOpcodes = 0x100; 30 | 31 | // Switch table and array data signatures are a code unit consisting 32 | // of "NOP" (0x00) in the low-order byte and a non-zero identifying 33 | // code in the high-order byte. (A true NOP is 0x0000.) 
34 | constexpr u2 kPackedSwitchSignature = 0x0100; 35 | constexpr u2 kSparseSwitchSignature = 0x0200; 36 | constexpr u2 kArrayDataSignature = 0x0300; 37 | 38 | // Enumeration of all Dalvik opcodes 39 | enum Opcode : u1 { 40 | #define INSTRUCTION_ENUM(opcode, cname, ...) OP_##cname = (opcode), 41 | #include "dex_instruction_list.h" 42 | DEX_INSTRUCTION_LIST(INSTRUCTION_ENUM) 43 | #undef DEX_INSTRUCTION_LIST 44 | #undef INSTRUCTION_ENUM 45 | }; 46 | 47 | // Instruction formats associated with Dalvik opcodes 48 | enum InstructionFormat : u1 { 49 | k10x, // op 50 | k12x, // op vA, vB 51 | k11n, // op vA, #+B 52 | k11x, // op vAA 53 | k10t, // op +AA 54 | k20t, // op +AAAA 55 | k20bc, // [opt] op AA, thing@BBBB 56 | k22x, // op vAA, vBBBB 57 | k21t, // op vAA, +BBBB 58 | k21s, // op vAA, #+BBBB 59 | k21h, // op vAA, #+BBBB00000[00000000] 60 | k21c, // op vAA, thing@BBBB 61 | k23x, // op vAA, vBB, vCC 62 | k22b, // op vAA, vBB, #+CC 63 | k22t, // op vA, vB, +CCCC 64 | k22s, // op vA, vB, #+CCCC 65 | k22c, // op vA, vB, thing@CCCC 66 | k22cs, // [opt] op vA, vB, field offset CCCC 67 | k30t, // op +AAAAAAAA 68 | k32x, // op vAAAA, vBBBB 69 | k31i, // op vAA, #+BBBBBBBB 70 | k31t, // op vAA, +BBBBBBBB 71 | k31c, // op vAA, string@BBBBBBBB 72 | k35c, // op {vC,vD,vE,vF,vG}, thing@BBBB 73 | k35ms, // [opt] invoke-virtual+super 74 | k3rc, // op {vCCCC .. v(CCCC+AA-1)}, thing@BBBB 75 | k3rms, // [opt] invoke-virtual+super/range 76 | k35mi, // [opt] inline invoke 77 | k3rmi, // [opt] inline invoke/range 78 | k45cc, // op {vC, vD, vE, vF, vG}, meth@BBBB, proto@HHHH 79 | k4rcc, // op {VCCCC .. 
v(CCCC+AA-1)}, meth@BBBB, proto@HHHH 80 | k51l, // op vAA, #+BBBBBBBBBBBBBBBB 81 | }; 82 | 83 | using OpcodeFlags = u1; 84 | enum : OpcodeFlags { 85 | kBranch = 0x01, // conditional or unconditional branch 86 | kContinue = 0x02, // flow can continue to next statement 87 | kSwitch = 0x04, // switch statement 88 | kThrow = 0x08, // could cause an exception to be thrown 89 | kReturn = 0x10, // returns, no additional statements 90 | kInvoke = 0x20, // a flavor of invoke 91 | kUnconditional = 0x40, // unconditional branch 92 | kExperimental = 0x80, // is an experimental opcode 93 | }; 94 | 95 | using VerifyFlags = u4; 96 | enum : VerifyFlags { 97 | kVerifyNothing = 0x0000000, 98 | kVerifyRegA = 0x0000001, 99 | kVerifyRegAWide = 0x0000002, 100 | kVerifyRegB = 0x0000004, 101 | kVerifyRegBField = 0x0000008, 102 | kVerifyRegBMethod = 0x0000010, 103 | kVerifyRegBNewInstance = 0x0000020, 104 | kVerifyRegBString = 0x0000040, 105 | kVerifyRegBType = 0x0000080, 106 | kVerifyRegBWide = 0x0000100, 107 | kVerifyRegC = 0x0000200, 108 | kVerifyRegCField = 0x0000400, 109 | kVerifyRegCNewArray = 0x0000800, 110 | kVerifyRegCType = 0x0001000, 111 | kVerifyRegCWide = 0x0002000, 112 | kVerifyArrayData = 0x0004000, 113 | kVerifyBranchTarget = 0x0008000, 114 | kVerifySwitchTargets = 0x0010000, 115 | kVerifyVarArg = 0x0020000, 116 | kVerifyVarArgNonZero = 0x0040000, 117 | kVerifyVarArgRange = 0x0080000, 118 | kVerifyVarArgRangeNonZero = 0x0100000, 119 | kVerifyRuntimeOnly = 0x0200000, 120 | kVerifyError = 0x0400000, 121 | kVerifyRegHPrototype = 0x0800000, 122 | kVerifyRegBCallSite = 0x1000000, 123 | kVerifyRegBMethodHandle = 0x2000000, 124 | kVerifyRegBPrototype = 0x4000000, 125 | }; 126 | 127 | // Types of indexed reference that are associated with opcodes whose 128 | // formats include such an indexed reference (e.g., 21c and 35c). 129 | enum InstructionIndexType : u1 { 130 | kIndexUnknown = 0, 131 | kIndexNone, // has no index 132 | kIndexVaries, // "It depends." 
Used for throw-verification-error 133 | kIndexTypeRef, // type reference index 134 | kIndexStringRef, // string reference index 135 | kIndexMethodRef, // method reference index 136 | kIndexFieldRef, // field reference index 137 | kIndexInlineMethod, // inline method index (for inline linked methods) 138 | kIndexVtableOffset, // vtable offset (for static linked methods) 139 | kIndexFieldOffset, // field offset (for static linked fields) 140 | kIndexMethodAndProtoRef, // method index and proto index 141 | kIndexCallSiteRef, // call site index 142 | kIndexMethodHandleRef, // constant method handle reference index 143 | kIndexProtoRef, // constant prototype reference index 144 | }; 145 | 146 | // Holds the contents of a decoded instruction. 147 | struct Instruction { 148 | u4 vA; // the A field of the instruction 149 | u4 vB; // the B field of the instruction 150 | u8 vB_wide; // 64bit version of the B field (for k51l) 151 | u4 vC; // the C field of the instruction 152 | u4 arg[5]; // vC/D/E/F/G in invoke or filled-new-array 153 | Opcode opcode; // instruction opcode 154 | }; 155 | 156 | // "packed-switch-payload" format 157 | struct PackedSwitchPayload { 158 | u2 ident; 159 | u2 size; 160 | s4 first_key; 161 | s4 targets[]; 162 | }; 163 | 164 | // "sparse-switch-payload" format 165 | struct SparseSwitchPayload { 166 | u2 ident; 167 | u2 size; 168 | s4 data[]; 169 | }; 170 | 171 | // "fill-array-data-payload" format 172 | struct ArrayData { 173 | u2 ident; 174 | u2 element_width; 175 | u4 size; 176 | u1 data[]; 177 | }; 178 | 179 | // Collect the enums in a struct for better locality. 180 | struct InstructionDescriptor { 181 | u4 verify_flags; // Set of VerifyFlag. 182 | InstructionFormat format; 183 | InstructionIndexType index_type; 184 | u1 flags; // Set of Flags. 
185 | }; 186 | 187 | // Extracts the opcode from a Dalvik code unit (bytecode) 188 | Opcode OpcodeFromBytecode(u2 bytecode); 189 | 190 | // Returns the name of an opcode 191 | const char* GetOpcodeName(Opcode opcode); 192 | 193 | // Returns the index type associated with the specified opcode 194 | InstructionIndexType GetIndexTypeFromOpcode(Opcode opcode); 195 | 196 | // Returns the format associated with the specified opcode 197 | InstructionFormat GetFormatFromOpcode(Opcode opcode); 198 | 199 | // Returns the flags for the specified opcode 200 | OpcodeFlags GetFlagsFromOpcode(Opcode opcode); 201 | 202 | // Returns the verify flags for the specified opcode 203 | VerifyFlags GetVerifyFlagsFromOpcode(Opcode opcode); 204 | 205 | // Returns the instruction width for the specified opcode format 206 | size_t GetWidthFromFormat(InstructionFormat format); 207 | 208 | // Return the width of the specified instruction, or 0 if not defined. Also 209 | // works for special OP_NOP entries, including switch statement data tables 210 | // and array data. 211 | size_t GetWidthFromBytecode(const u2* bytecode); 212 | 213 | // Decode a .dex bytecode 214 | Instruction DecodeInstruction(const u2* bytecode); 215 | 216 | } // namespace dex 217 | -------------------------------------------------------------------------------- /slicer/dex_format.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #include <stdint.h> 20 | #include <string> 21 | 22 | // Definitions for .dex file format structures and helpers. 23 | // 24 | // The names for the structures and fields follow the specification: 25 | // https://source.android.com/devices/tech/dalvik/dex-format.html 26 | 27 | namespace dex { 28 | 29 | // These match the definitions in the VM specification 30 | typedef uint8_t u1; 31 | typedef uint16_t u2; 32 | typedef uint32_t u4; 33 | typedef uint64_t u8; 34 | typedef int8_t s1; 35 | typedef int16_t s2; 36 | typedef int32_t s4; 37 | typedef int64_t s8; 38 | 39 | // General constants 40 | constexpr u4 kEndianConstant = 0x12345678; 41 | constexpr u4 kNoIndex = 0xffffffff; 42 | constexpr u4 kSHA1DigestLen = 20; 43 | 44 | // Annotation visibility 45 | constexpr u1 kVisibilityBuild = 0x00; 46 | constexpr u1 kVisibilityRuntime = 0x01; 47 | constexpr u1 kVisibilitySystem = 0x02; 48 | 49 | // Special visibility: encoded_annotation, not annotation_item 50 | constexpr u1 kVisibilityEncoded = 0xff; 51 | 52 | // encoded_value types 53 | constexpr u1 kEncodedByte = 0x00; 54 | constexpr u1 kEncodedShort = 0x02; 55 | constexpr u1 kEncodedChar = 0x03; 56 | constexpr u1 kEncodedInt = 0x04; 57 | constexpr u1 kEncodedLong = 0x06; 58 | constexpr u1 kEncodedFloat = 0x10; 59 | constexpr u1 kEncodedDouble = 0x11; 60 | constexpr u1 kEncodedString = 0x17; 61 | constexpr u1 kEncodedType = 0x18; 62 | constexpr u1 kEncodedField = 0x19; 63 | constexpr u1 kEncodedMethod = 0x1a; 64 | constexpr u1 kEncodedEnum = 0x1b; 65 | constexpr u1 kEncodedArray = 0x1c; 66 | constexpr u1 kEncodedAnnotation = 0x1d; 67 | constexpr u1 kEncodedNull = 0x1e; 68 | constexpr u1 kEncodedBoolean = 0x1f; 69 | 70 | // encoded_value header 71 | constexpr u1 kEncodedValueTypeMask = 0x1f; 72 | constexpr u1 kEncodedValueArgShift = 5; 73 | 74 | // access_flags 75 | constexpr u4 kAccPublic =
0x0001; // class, field, method, ic 76 | constexpr u4 kAccPrivate = 0x0002; // field, method, ic 77 | constexpr u4 kAccProtected = 0x0004; // field, method, ic 78 | constexpr u4 kAccStatic = 0x0008; // field, method, ic 79 | constexpr u4 kAccFinal = 0x0010; // class, field, method, ic 80 | constexpr u4 kAccSynchronized = 0x0020; // method (only allowed on natives) 81 | constexpr u4 kAccSuper = 0x0020; // class (not used in dex) 82 | constexpr u4 kAccVolatile = 0x0040; // field 83 | constexpr u4 kAccBridge = 0x0040; // method 84 | constexpr u4 kAccTransient = 0x0080; // field 85 | constexpr u4 kAccVarargs = 0x0080; // method 86 | constexpr u4 kAccNative = 0x0100; // method 87 | constexpr u4 kAccInterface = 0x0200; // class, ic 88 | constexpr u4 kAccAbstract = 0x0400; // class, method, ic 89 | constexpr u4 kAccStrict = 0x0800; // method 90 | constexpr u4 kAccSynthetic = 0x1000; // class, field, method, ic 91 | constexpr u4 kAccAnnotation = 0x2000; // class, ic 92 | constexpr u4 kAccEnum = 0x4000; // class, field, ic 93 | constexpr u4 kAccConstructor = 0x00010000; // method (dex only) <(cl)init> 94 | constexpr u4 kAccDeclaredSynchronized = 0x00020000; // method (dex only) 95 | 96 | // map_item type codes 97 | constexpr u2 kHeaderItem = 0x0000; 98 | constexpr u2 kStringIdItem = 0x0001; 99 | constexpr u2 kTypeIdItem = 0x0002; 100 | constexpr u2 kProtoIdItem = 0x0003; 101 | constexpr u2 kFieldIdItem = 0x0004; 102 | constexpr u2 kMethodIdItem = 0x0005; 103 | constexpr u2 kClassDefItem = 0x0006; 104 | constexpr u2 kMapList = 0x1000; 105 | constexpr u2 kTypeList = 0x1001; 106 | constexpr u2 kAnnotationSetRefList = 0x1002; 107 | constexpr u2 kAnnotationSetItem = 0x1003; 108 | constexpr u2 kClassDataItem = 0x2000; 109 | constexpr u2 kCodeItem = 0x2001; 110 | constexpr u2 kStringDataItem = 0x2002; 111 | constexpr u2 kDebugInfoItem = 0x2003; 112 | constexpr u2 kAnnotationItem = 0x2004; 113 | constexpr u2 kEncodedArrayItem = 0x2005; 114 | constexpr u2 kAnnotationsDirectoryItem = 
0x2006; 115 | 116 | // debug info opcodes 117 | constexpr u1 DBG_END_SEQUENCE = 0x00; 118 | constexpr u1 DBG_ADVANCE_PC = 0x01; 119 | constexpr u1 DBG_ADVANCE_LINE = 0x02; 120 | constexpr u1 DBG_START_LOCAL = 0x03; 121 | constexpr u1 DBG_START_LOCAL_EXTENDED = 0x04; 122 | constexpr u1 DBG_END_LOCAL = 0x05; 123 | constexpr u1 DBG_RESTART_LOCAL = 0x06; 124 | constexpr u1 DBG_SET_PROLOGUE_END = 0x07; 125 | constexpr u1 DBG_SET_EPILOGUE_BEGIN = 0x08; 126 | constexpr u1 DBG_SET_FILE = 0x09; 127 | constexpr u1 DBG_FIRST_SPECIAL = 0x0a; 128 | 129 | // special debug info values 130 | constexpr int DBG_LINE_BASE = -4; 131 | constexpr int DBG_LINE_RANGE = 15; 132 | 133 | // "header_item" 134 | struct Header { 135 | u1 magic[8]; 136 | u4 checksum; 137 | u1 signature[kSHA1DigestLen]; 138 | u4 file_size; 139 | u4 header_size; 140 | u4 endian_tag; 141 | u4 link_size; 142 | u4 link_off; 143 | u4 map_off; 144 | u4 string_ids_size; 145 | u4 string_ids_off; 146 | u4 type_ids_size; 147 | u4 type_ids_off; 148 | u4 proto_ids_size; 149 | u4 proto_ids_off; 150 | u4 field_ids_size; 151 | u4 field_ids_off; 152 | u4 method_ids_size; 153 | u4 method_ids_off; 154 | u4 class_defs_size; 155 | u4 class_defs_off; 156 | u4 data_size; 157 | u4 data_off; 158 | }; 159 | 160 | // "map_item" 161 | struct MapItem { 162 | u2 type; 163 | u2 unused; 164 | u4 size; 165 | u4 offset; 166 | }; 167 | 168 | // "map_list" 169 | struct MapList { 170 | u4 size; 171 | MapItem list[]; 172 | }; 173 | 174 | // "string_id_item" 175 | struct StringId { 176 | u4 string_data_off; 177 | }; 178 | 179 | // "type_id_item" 180 | struct TypeId { 181 | u4 descriptor_idx; 182 | }; 183 | 184 | // "field_id_item" 185 | struct FieldId { 186 | u2 class_idx; 187 | u2 type_idx; 188 | u4 name_idx; 189 | }; 190 | 191 | // "method_id_item" 192 | struct MethodId { 193 | u2 class_idx; 194 | u2 proto_idx; 195 | u4 name_idx; 196 | }; 197 | 198 | // "proto_id_item" 199 | struct ProtoId { 200 | u4 shorty_idx; 201 | u4 return_type_idx; 202 | u4 
parameters_off; 203 | }; 204 | 205 | // "class_def_item" 206 | struct ClassDef { 207 | u4 class_idx; 208 | u4 access_flags; 209 | u4 superclass_idx; 210 | u4 interfaces_off; 211 | u4 source_file_idx; 212 | u4 annotations_off; 213 | u4 class_data_off; 214 | u4 static_values_off; 215 | }; 216 | 217 | // "type_item" 218 | struct TypeItem { 219 | u2 type_idx; 220 | }; 221 | 222 | // "type_list" 223 | struct TypeList { 224 | u4 size; 225 | TypeItem list[]; 226 | }; 227 | 228 | // "code_item" 229 | struct Code { 230 | u2 registers_size; 231 | u2 ins_size; 232 | u2 outs_size; 233 | u2 tries_size; 234 | u4 debug_info_off; 235 | u4 insns_size; 236 | u2 insns[]; 237 | // followed by optional u2 padding 238 | // followed by try_item[tries_size] 239 | // followed by uleb128 handlersSize 240 | // followed by catch_handler_item[handlersSize] 241 | }; 242 | 243 | // "try_item" 244 | struct TryBlock { 245 | u4 start_addr; 246 | u2 insn_count; 247 | u2 handler_off; 248 | }; 249 | 250 | // "annotations_directory_item" 251 | struct AnnotationsDirectoryItem { 252 | u4 class_annotations_off; 253 | u4 fields_size; 254 | u4 methods_size; 255 | u4 parameters_size; 256 | // followed by FieldAnnotationsItem[fields_size] 257 | // followed by MethodAnnotationsItem[methods_size] 258 | // followed by ParameterAnnotationsItem[parameters_size] 259 | }; 260 | 261 | // "field_annotations_item" 262 | struct FieldAnnotationsItem { 263 | u4 field_idx; 264 | u4 annotations_off; 265 | }; 266 | 267 | // "method_annotations_item" 268 | struct MethodAnnotationsItem { 269 | u4 method_idx; 270 | u4 annotations_off; 271 | }; 272 | 273 | // "parameter_annotations_item" 274 | struct ParameterAnnotationsItem { 275 | u4 method_idx; 276 | u4 annotations_off; 277 | }; 278 | 279 | // "annotation_set_ref_item" 280 | struct AnnotationSetRefItem { 281 | u4 annotations_off; 282 | }; 283 | 284 | // "annotation_set_ref_list" 285 | struct AnnotationSetRefList { 286 | u4 size; 287 | AnnotationSetRefItem list[]; 288 | }; 289 
| 290 | // "annotation_set_item" 291 | struct AnnotationSetItem { 292 | u4 size; 293 | u4 entries[]; 294 | }; 295 | 296 | // "annotation_item" 297 | struct AnnotationItem { 298 | u1 visibility; 299 | u1 annotation[]; 300 | }; 301 | 302 | // Compute DEX checksum 303 | u4 ComputeChecksum(const Header* header); 304 | 305 | // Converts a type descriptor to a human-readable declaration 306 | std::string DescriptorToDecl(const char* descriptor); 307 | 308 | // Converts a type descriptor to the equivalent shorty type descriptor 309 | char DescriptorToShorty(const char* descriptor); 310 | 311 | } // namespace dex 312 | -------------------------------------------------------------------------------- /slicer/dex_instruction_list.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2019 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef ART_LIBDEXFILE_DEX_DEX_INSTRUCTION_LIST_H_ 18 | #define ART_LIBDEXFILE_DEX_DEX_INSTRUCTION_LIST_H_ 19 | 20 | /** 21 | * Cloned from //art/libdexfile/dex/dex_instruction_list.h. 
22 | */ 23 | 24 | // V(opcode, instruction_code, name, format, index, flags, extended_flags, verifier_flags); 25 | #define DEX_INSTRUCTION_LIST(V) \ 26 | V(0x00, NOP, "nop", k10x, kIndexNone, kContinue, 0, kVerifyNothing) \ 27 | V(0x01, MOVE, "move", k12x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 28 | V(0x02, MOVE_FROM16, "move/from16", k22x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 29 | V(0x03, MOVE_16, "move/16", k32x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 30 | V(0x04, MOVE_WIDE, "move-wide", k12x, kIndexNone, kContinue, 0, kVerifyRegAWide | kVerifyRegBWide) \ 31 | V(0x05, MOVE_WIDE_FROM16, "move-wide/from16", k22x, kIndexNone, kContinue, 0, kVerifyRegAWide | kVerifyRegBWide) \ 32 | V(0x06, MOVE_WIDE_16, "move-wide/16", k32x, kIndexNone, kContinue, 0, kVerifyRegAWide | kVerifyRegBWide) \ 33 | V(0x07, MOVE_OBJECT, "move-object", k12x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 34 | V(0x08, MOVE_OBJECT_FROM16, "move-object/from16", k22x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 35 | V(0x09, MOVE_OBJECT_16, "move-object/16", k32x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 36 | V(0x0A, MOVE_RESULT, "move-result", k11x, kIndexNone, kContinue, 0, kVerifyRegA) \ 37 | V(0x0B, MOVE_RESULT_WIDE, "move-result-wide", k11x, kIndexNone, kContinue, 0, kVerifyRegAWide) \ 38 | V(0x0C, MOVE_RESULT_OBJECT, "move-result-object", k11x, kIndexNone, kContinue, 0, kVerifyRegA) \ 39 | V(0x0D, MOVE_EXCEPTION, "move-exception", k11x, kIndexNone, kContinue, 0, kVerifyRegA) \ 40 | V(0x0E, RETURN_VOID, "return-void", k10x, kIndexNone, kReturn, 0, kVerifyNothing) \ 41 | V(0x0F, RETURN, "return", k11x, kIndexNone, kReturn, 0, kVerifyRegA) \ 42 | V(0x10, RETURN_WIDE, "return-wide", k11x, kIndexNone, kReturn, 0, kVerifyRegAWide) \ 43 | V(0x11, RETURN_OBJECT, "return-object", k11x, kIndexNone, kReturn, 0, kVerifyRegA) \ 44 | V(0x12, CONST_4, "const/4", k11n, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegA) 
\ 45 | V(0x13, CONST_16, "const/16", k21s, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegA) \ 46 | V(0x14, CONST, "const", k31i, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegA) \ 47 | V(0x15, CONST_HIGH16, "const/high16", k21h, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegA) \ 48 | V(0x16, CONST_WIDE_16, "const-wide/16", k21s, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegAWide) \ 49 | V(0x17, CONST_WIDE_32, "const-wide/32", k31i, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegAWide) \ 50 | V(0x18, CONST_WIDE, "const-wide", k51l, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegAWide) \ 51 | V(0x19, CONST_WIDE_HIGH16, "const-wide/high16", k21h, kIndexNone, kContinue, kRegBFieldOrConstant, kVerifyRegAWide) \ 52 | V(0x1A, CONST_STRING, "const-string", k21c, kIndexStringRef, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegBString) \ 53 | V(0x1B, CONST_STRING_JUMBO, "const-string/jumbo", k31c, kIndexStringRef, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegBString) \ 54 | V(0x1C, CONST_CLASS, "const-class", k21c, kIndexTypeRef, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegBType) \ 55 | V(0x1D, MONITOR_ENTER, "monitor-enter", k11x, kIndexNone, kContinue | kThrow, kClobber, kVerifyRegA) \ 56 | V(0x1E, MONITOR_EXIT, "monitor-exit", k11x, kIndexNone, kContinue | kThrow, kClobber, kVerifyRegA) \ 57 | V(0x1F, CHECK_CAST, "check-cast", k21c, kIndexTypeRef, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegBType) \ 58 | V(0x20, INSTANCE_OF, "instance-of", k22c, kIndexTypeRef, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegB | kVerifyRegCType) \ 59 | V(0x21, ARRAY_LENGTH, "array-length", k12x, kIndexNone, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegB) \ 60 | V(0x22, NEW_INSTANCE, "new-instance", k21c, kIndexTypeRef, kContinue | kThrow, kClobber, kVerifyRegA | kVerifyRegBNewInstance) \ 61 | V(0x23, NEW_ARRAY, "new-array", k22c, kIndexTypeRef, kContinue | kThrow, kClobber, kVerifyRegA | kVerifyRegB | kVerifyRegCNewArray) \ 62 | 
V(0x24, FILLED_NEW_ARRAY, "filled-new-array", k35c, kIndexTypeRef, kContinue | kThrow, kClobber, kVerifyRegBType | kVerifyVarArg) \ 63 | V(0x25, FILLED_NEW_ARRAY_RANGE, "filled-new-array/range", k3rc, kIndexTypeRef, kContinue | kThrow, kClobber, kVerifyRegBType | kVerifyVarArgRange) \ 64 | V(0x26, FILL_ARRAY_DATA, "fill-array-data", k31t, kIndexNone, kContinue | kThrow, kClobber, kVerifyRegA | kVerifyArrayData) \ 65 | V(0x27, THROW, "throw", k11x, kIndexNone, kThrow, 0, kVerifyRegA) \ 66 | V(0x28, GOTO, "goto", k10t, kIndexNone, kBranch | kUnconditional, 0, kVerifyBranchTarget) \ 67 | V(0x29, GOTO_16, "goto/16", k20t, kIndexNone, kBranch | kUnconditional, 0, kVerifyBranchTarget) \ 68 | V(0x2A, GOTO_32, "goto/32", k30t, kIndexNone, kBranch | kUnconditional, 0, kVerifyBranchTarget) \ 69 | V(0x2B, PACKED_SWITCH, "packed-switch", k31t, kIndexNone, kContinue | kSwitch, 0, kVerifyRegA | kVerifySwitchTargets) \ 70 | V(0x2C, SPARSE_SWITCH, "sparse-switch", k31t, kIndexNone, kContinue | kSwitch, 0, kVerifyRegA | kVerifySwitchTargets) \ 71 | V(0x2D, CMPL_FLOAT, "cmpl-float", k23x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 72 | V(0x2E, CMPG_FLOAT, "cmpg-float", k23x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 73 | V(0x2F, CMPL_DOUBLE, "cmpl-double", k23x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegBWide | kVerifyRegCWide) \ 74 | V(0x30, CMPG_DOUBLE, "cmpg-double", k23x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegBWide | kVerifyRegCWide) \ 75 | V(0x31, CMP_LONG, "cmp-long", k23x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegBWide | kVerifyRegCWide) \ 76 | V(0x32, IF_EQ, "if-eq", k22t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyRegB | kVerifyBranchTarget) \ 77 | V(0x33, IF_NE, "if-ne", k22t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyRegB | kVerifyBranchTarget) \ 78 | V(0x34, IF_LT, "if-lt", k22t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyRegB | 
kVerifyBranchTarget) \ 79 | V(0x35, IF_GE, "if-ge", k22t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyRegB | kVerifyBranchTarget) \ 80 | V(0x36, IF_GT, "if-gt", k22t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyRegB | kVerifyBranchTarget) \ 81 | V(0x37, IF_LE, "if-le", k22t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyRegB | kVerifyBranchTarget) \ 82 | V(0x38, IF_EQZ, "if-eqz", k21t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyBranchTarget) \ 83 | V(0x39, IF_NEZ, "if-nez", k21t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyBranchTarget) \ 84 | V(0x3A, IF_LTZ, "if-ltz", k21t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyBranchTarget) \ 85 | V(0x3B, IF_GEZ, "if-gez", k21t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyBranchTarget) \ 86 | V(0x3C, IF_GTZ, "if-gtz", k21t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyBranchTarget) \ 87 | V(0x3D, IF_LEZ, "if-lez", k21t, kIndexNone, kContinue | kBranch, 0, kVerifyRegA | kVerifyBranchTarget) \ 88 | V(0x3E, UNUSED_3E, "unused-3e", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 89 | V(0x3F, UNUSED_3F, "unused-3f", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 90 | V(0x40, UNUSED_40, "unused-40", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 91 | V(0x41, UNUSED_41, "unused-41", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 92 | V(0x42, UNUSED_42, "unused-42", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 93 | V(0x43, UNUSED_43, "unused-43", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 94 | V(0x44, AGET, "aget", k23x, kIndexNone, kContinue | kThrow, kLoad, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 95 | V(0x45, AGET_WIDE, "aget-wide", k23x, kIndexNone, kContinue | kThrow, kLoad, kVerifyRegAWide | kVerifyRegB | kVerifyRegC) \ 96 | V(0x46, AGET_OBJECT, "aget-object", k23x, kIndexNone, kContinue | kThrow, kLoad, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 97 | V(0x47, AGET_BOOLEAN, "aget-boolean", k23x, kIndexNone, kContinue | kThrow, kLoad, 
kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 98 | V(0x48, AGET_BYTE, "aget-byte", k23x, kIndexNone, kContinue | kThrow, kLoad, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 99 | V(0x49, AGET_CHAR, "aget-char", k23x, kIndexNone, kContinue | kThrow, kLoad, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 100 | V(0x4A, AGET_SHORT, "aget-short", k23x, kIndexNone, kContinue | kThrow, kLoad, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 101 | V(0x4B, APUT, "aput", k23x, kIndexNone, kContinue | kThrow, kStore, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 102 | V(0x4C, APUT_WIDE, "aput-wide", k23x, kIndexNone, kContinue | kThrow, kStore, kVerifyRegAWide | kVerifyRegB | kVerifyRegC) \ 103 | V(0x4D, APUT_OBJECT, "aput-object", k23x, kIndexNone, kContinue | kThrow, kStore, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 104 | V(0x4E, APUT_BOOLEAN, "aput-boolean", k23x, kIndexNone, kContinue | kThrow, kStore, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 105 | V(0x4F, APUT_BYTE, "aput-byte", k23x, kIndexNone, kContinue | kThrow, kStore, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 106 | V(0x50, APUT_CHAR, "aput-char", k23x, kIndexNone, kContinue | kThrow, kStore, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 107 | V(0x51, APUT_SHORT, "aput-short", k23x, kIndexNone, kContinue | kThrow, kStore, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 108 | V(0x52, IGET, "iget", k22c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 109 | V(0x53, IGET_WIDE, "iget-wide", k22c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegAWide | kVerifyRegB | kVerifyRegCField) \ 110 | V(0x54, IGET_OBJECT, "iget-object", k22c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 111 | V(0x55, IGET_BOOLEAN, "iget-boolean", k22c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 112 | V(0x56, IGET_BYTE, "iget-byte", k22c, 
kIndexFieldRef, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 113 | V(0x57, IGET_CHAR, "iget-char", k22c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 114 | V(0x58, IGET_SHORT, "iget-short", k22c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 115 | V(0x59, IPUT, "iput", k22c, kIndexFieldRef, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 116 | V(0x5A, IPUT_WIDE, "iput-wide", k22c, kIndexFieldRef, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegAWide | kVerifyRegB | kVerifyRegCField) \ 117 | V(0x5B, IPUT_OBJECT, "iput-object", k22c, kIndexFieldRef, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 118 | V(0x5C, IPUT_BOOLEAN, "iput-boolean", k22c, kIndexFieldRef, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 119 | V(0x5D, IPUT_BYTE, "iput-byte", k22c, kIndexFieldRef, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 120 | V(0x5E, IPUT_CHAR, "iput-char", k22c, kIndexFieldRef, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 121 | V(0x5F, IPUT_SHORT, "iput-short", k22c, kIndexFieldRef, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRegCField) \ 122 | V(0x60, SGET, "sget", k21c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 123 | V(0x61, SGET_WIDE, "sget-wide", k21c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegBFieldOrConstant, kVerifyRegAWide | kVerifyRegBField) \ 124 | V(0x62, SGET_OBJECT, "sget-object", k21c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 125 | V(0x63, 
SGET_BOOLEAN, "sget-boolean", k21c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 126 | V(0x64, SGET_BYTE, "sget-byte", k21c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 127 | V(0x65, SGET_CHAR, "sget-char", k21c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 128 | V(0x66, SGET_SHORT, "sget-short", k21c, kIndexFieldRef, kContinue | kThrow, kLoad | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 129 | V(0x67, SPUT, "sput", k21c, kIndexFieldRef, kContinue | kThrow, kStore | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 130 | V(0x68, SPUT_WIDE, "sput-wide", k21c, kIndexFieldRef, kContinue | kThrow, kStore | kRegBFieldOrConstant, kVerifyRegAWide | kVerifyRegBField) \ 131 | V(0x69, SPUT_OBJECT, "sput-object", k21c, kIndexFieldRef, kContinue | kThrow, kStore | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 132 | V(0x6A, SPUT_BOOLEAN, "sput-boolean", k21c, kIndexFieldRef, kContinue | kThrow, kStore | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 133 | V(0x6B, SPUT_BYTE, "sput-byte", k21c, kIndexFieldRef, kContinue | kThrow, kStore | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 134 | V(0x6C, SPUT_CHAR, "sput-char", k21c, kIndexFieldRef, kContinue | kThrow, kStore | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 135 | V(0x6D, SPUT_SHORT, "sput-short", k21c, kIndexFieldRef, kContinue | kThrow, kStore | kRegBFieldOrConstant, kVerifyRegA | kVerifyRegBField) \ 136 | V(0x6E, INVOKE_VIRTUAL, "invoke-virtual", k35c, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgNonZero) \ 137 | V(0x6F, INVOKE_SUPER, "invoke-super", k35c, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgNonZero) \ 138 | V(0x70, INVOKE_DIRECT, "invoke-direct", k35c, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, 
kVerifyRegBMethod | kVerifyVarArgNonZero) \ 139 | V(0x71, INVOKE_STATIC, "invoke-static", k35c, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArg) \ 140 | V(0x72, INVOKE_INTERFACE, "invoke-interface", k35c, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgNonZero) \ 141 | V(0x73, RETURN_VOID_NO_BARRIER, "return-void-no-barrier", k10x, kIndexNone, kReturn, 0, kVerifyNothing) \ 142 | V(0x74, INVOKE_VIRTUAL_RANGE, "invoke-virtual/range", k3rc, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgRangeNonZero) \ 143 | V(0x75, INVOKE_SUPER_RANGE, "invoke-super/range", k3rc, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgRangeNonZero) \ 144 | V(0x76, INVOKE_DIRECT_RANGE, "invoke-direct/range", k3rc, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgRangeNonZero) \ 145 | V(0x77, INVOKE_STATIC_RANGE, "invoke-static/range", k3rc, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgRange) \ 146 | V(0x78, INVOKE_INTERFACE_RANGE, "invoke-interface/range", k3rc, kIndexMethodRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgRangeNonZero) \ 147 | V(0x79, UNUSED_79, "unused-79", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 148 | V(0x7A, UNUSED_7A, "unused-7a", k10x, kIndexUnknown, 0, 0, kVerifyError) \ 149 | V(0x7B, NEG_INT, "neg-int", k12x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 150 | V(0x7C, NOT_INT, "not-int", k12x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 151 | V(0x7D, NEG_LONG, "neg-long", k12x, kIndexNone, kContinue, 0, kVerifyRegAWide | kVerifyRegBWide) \ 152 | V(0x7E, NOT_LONG, "not-long", k12x, kIndexNone, kContinue, 0, kVerifyRegAWide | kVerifyRegBWide) \ 153 | V(0x7F, NEG_FLOAT, "neg-float", k12x, kIndexNone, kContinue, 0, kVerifyRegA | kVerifyRegB) \ 154 | V(0x80, NEG_DOUBLE, "neg-double", k12x, kIndexNone, kContinue, 0, 
kVerifyRegAWide | kVerifyRegBWide) \ 155 | V(0x81, INT_TO_LONG, "int-to-long", k12x, kIndexNone, kContinue, kCast, kVerifyRegAWide | kVerifyRegB) \ 156 | V(0x82, INT_TO_FLOAT, "int-to-float", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegB) \ 157 | V(0x83, INT_TO_DOUBLE, "int-to-double", k12x, kIndexNone, kContinue, kCast, kVerifyRegAWide | kVerifyRegB) \ 158 | V(0x84, LONG_TO_INT, "long-to-int", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegBWide) \ 159 | V(0x85, LONG_TO_FLOAT, "long-to-float", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegBWide) \ 160 | V(0x86, LONG_TO_DOUBLE, "long-to-double", k12x, kIndexNone, kContinue, kCast, kVerifyRegAWide | kVerifyRegBWide) \ 161 | V(0x87, FLOAT_TO_INT, "float-to-int", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegB) \ 162 | V(0x88, FLOAT_TO_LONG, "float-to-long", k12x, kIndexNone, kContinue, kCast, kVerifyRegAWide | kVerifyRegB) \ 163 | V(0x89, FLOAT_TO_DOUBLE, "float-to-double", k12x, kIndexNone, kContinue, kCast, kVerifyRegAWide | kVerifyRegB) \ 164 | V(0x8A, DOUBLE_TO_INT, "double-to-int", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegBWide) \ 165 | V(0x8B, DOUBLE_TO_LONG, "double-to-long", k12x, kIndexNone, kContinue, kCast, kVerifyRegAWide | kVerifyRegBWide) \ 166 | V(0x8C, DOUBLE_TO_FLOAT, "double-to-float", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegBWide) \ 167 | V(0x8D, INT_TO_BYTE, "int-to-byte", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegB) \ 168 | V(0x8E, INT_TO_CHAR, "int-to-char", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegB) \ 169 | V(0x8F, INT_TO_SHORT, "int-to-short", k12x, kIndexNone, kContinue, kCast, kVerifyRegA | kVerifyRegB) \ 170 | V(0x90, ADD_INT, "add-int", k23x, kIndexNone, kContinue, kAdd, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 171 | V(0x91, SUB_INT, "sub-int", k23x, kIndexNone, kContinue, kSubtract, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 172 | V(0x92, MUL_INT, 
"mul-int", k23x, kIndexNone, kContinue, kMultiply, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 173 | V(0x93, DIV_INT, "div-int", k23x, kIndexNone, kContinue | kThrow, kDivide, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 174 | V(0x94, REM_INT, "rem-int", k23x, kIndexNone, kContinue | kThrow, kRemainder, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 175 | V(0x95, AND_INT, "and-int", k23x, kIndexNone, kContinue, kAnd, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 176 | V(0x96, OR_INT, "or-int", k23x, kIndexNone, kContinue, kOr, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 177 | V(0x97, XOR_INT, "xor-int", k23x, kIndexNone, kContinue, kXor, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 178 | V(0x98, SHL_INT, "shl-int", k23x, kIndexNone, kContinue, kShl, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 179 | V(0x99, SHR_INT, "shr-int", k23x, kIndexNone, kContinue, kShr, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 180 | V(0x9A, USHR_INT, "ushr-int", k23x, kIndexNone, kContinue, kUshr, kVerifyRegA | kVerifyRegB | kVerifyRegC) \ 181 | V(0x9B, ADD_LONG, "add-long", k23x, kIndexNone, kContinue, kAdd, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \ 182 | V(0x9C, SUB_LONG, "sub-long", k23x, kIndexNone, kContinue, kSubtract, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \ 183 | V(0x9D, MUL_LONG, "mul-long", k23x, kIndexNone, kContinue, kMultiply, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \ 184 | V(0x9E, DIV_LONG, "div-long", k23x, kIndexNone, kContinue | kThrow, kDivide, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \ 185 | V(0x9F, REM_LONG, "rem-long", k23x, kIndexNone, kContinue | kThrow, kRemainder, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \ 186 | V(0xA0, AND_LONG, "and-long", k23x, kIndexNone, kContinue, kAnd, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \ 187 | V(0xA1, OR_LONG, "or-long", k23x, kIndexNone, kContinue, kOr, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \ 188 | V(0xA2, XOR_LONG, "xor-long", k23x, kIndexNone, kContinue, 
kXor, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \
  V(0xA3, SHL_LONG, "shl-long", k23x, kIndexNone, kContinue, kShl, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegC) \
  V(0xA4, SHR_LONG, "shr-long", k23x, kIndexNone, kContinue, kShr, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegC) \
  V(0xA5, USHR_LONG, "ushr-long", k23x, kIndexNone, kContinue, kUshr, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegC) \
  V(0xA6, ADD_FLOAT, "add-float", k23x, kIndexNone, kContinue, kAdd, kVerifyRegA | kVerifyRegB | kVerifyRegC) \
  V(0xA7, SUB_FLOAT, "sub-float", k23x, kIndexNone, kContinue, kSubtract, kVerifyRegA | kVerifyRegB | kVerifyRegC) \
  V(0xA8, MUL_FLOAT, "mul-float", k23x, kIndexNone, kContinue, kMultiply, kVerifyRegA | kVerifyRegB | kVerifyRegC) \
  V(0xA9, DIV_FLOAT, "div-float", k23x, kIndexNone, kContinue, kDivide, kVerifyRegA | kVerifyRegB | kVerifyRegC) \
  V(0xAA, REM_FLOAT, "rem-float", k23x, kIndexNone, kContinue, kRemainder, kVerifyRegA | kVerifyRegB | kVerifyRegC) \
  V(0xAB, ADD_DOUBLE, "add-double", k23x, kIndexNone, kContinue, kAdd, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \
  V(0xAC, SUB_DOUBLE, "sub-double", k23x, kIndexNone, kContinue, kSubtract, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \
  V(0xAD, MUL_DOUBLE, "mul-double", k23x, kIndexNone, kContinue, kMultiply, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \
  V(0xAE, DIV_DOUBLE, "div-double", k23x, kIndexNone, kContinue, kDivide, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \
  V(0xAF, REM_DOUBLE, "rem-double", k23x, kIndexNone, kContinue, kRemainder, kVerifyRegAWide | kVerifyRegBWide | kVerifyRegCWide) \
  V(0xB0, ADD_INT_2ADDR, "add-int/2addr", k12x, kIndexNone, kContinue, kAdd, kVerifyRegA | kVerifyRegB) \
  V(0xB1, SUB_INT_2ADDR, "sub-int/2addr", k12x, kIndexNone, kContinue, kSubtract, kVerifyRegA | kVerifyRegB) \
  V(0xB2, MUL_INT_2ADDR, "mul-int/2addr", k12x, kIndexNone, kContinue, kMultiply, kVerifyRegA | kVerifyRegB) \
  V(0xB3, DIV_INT_2ADDR, "div-int/2addr", k12x, kIndexNone, kContinue | kThrow, kDivide, kVerifyRegA | kVerifyRegB) \
  V(0xB4, REM_INT_2ADDR, "rem-int/2addr", k12x, kIndexNone, kContinue | kThrow, kRemainder, kVerifyRegA | kVerifyRegB) \
  V(0xB5, AND_INT_2ADDR, "and-int/2addr", k12x, kIndexNone, kContinue, kAnd, kVerifyRegA | kVerifyRegB) \
  V(0xB6, OR_INT_2ADDR, "or-int/2addr", k12x, kIndexNone, kContinue, kOr, kVerifyRegA | kVerifyRegB) \
  V(0xB7, XOR_INT_2ADDR, "xor-int/2addr", k12x, kIndexNone, kContinue, kXor, kVerifyRegA | kVerifyRegB) \
  V(0xB8, SHL_INT_2ADDR, "shl-int/2addr", k12x, kIndexNone, kContinue, kShl, kVerifyRegA | kVerifyRegB) \
  V(0xB9, SHR_INT_2ADDR, "shr-int/2addr", k12x, kIndexNone, kContinue, kShr, kVerifyRegA | kVerifyRegB) \
  V(0xBA, USHR_INT_2ADDR, "ushr-int/2addr", k12x, kIndexNone, kContinue, kUshr, kVerifyRegA | kVerifyRegB) \
  V(0xBB, ADD_LONG_2ADDR, "add-long/2addr", k12x, kIndexNone, kContinue, kAdd, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xBC, SUB_LONG_2ADDR, "sub-long/2addr", k12x, kIndexNone, kContinue, kSubtract, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xBD, MUL_LONG_2ADDR, "mul-long/2addr", k12x, kIndexNone, kContinue, kMultiply, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xBE, DIV_LONG_2ADDR, "div-long/2addr", k12x, kIndexNone, kContinue | kThrow, kDivide, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xBF, REM_LONG_2ADDR, "rem-long/2addr", k12x, kIndexNone, kContinue | kThrow, kRemainder, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xC0, AND_LONG_2ADDR, "and-long/2addr", k12x, kIndexNone, kContinue, kAnd, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xC1, OR_LONG_2ADDR, "or-long/2addr", k12x, kIndexNone, kContinue, kOr, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xC2, XOR_LONG_2ADDR, "xor-long/2addr", k12x, kIndexNone, kContinue, kXor, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xC3, SHL_LONG_2ADDR, "shl-long/2addr", k12x, kIndexNone, kContinue, kShl, kVerifyRegAWide | kVerifyRegB) \
  V(0xC4, SHR_LONG_2ADDR, "shr-long/2addr", k12x, kIndexNone, kContinue, kShr, kVerifyRegAWide | kVerifyRegB) \
  V(0xC5, USHR_LONG_2ADDR, "ushr-long/2addr", k12x, kIndexNone, kContinue, kUshr, kVerifyRegAWide | kVerifyRegB) \
  V(0xC6, ADD_FLOAT_2ADDR, "add-float/2addr", k12x, kIndexNone, kContinue, kAdd, kVerifyRegA | kVerifyRegB) \
  V(0xC7, SUB_FLOAT_2ADDR, "sub-float/2addr", k12x, kIndexNone, kContinue, kSubtract, kVerifyRegA | kVerifyRegB) \
  V(0xC8, MUL_FLOAT_2ADDR, "mul-float/2addr", k12x, kIndexNone, kContinue, kMultiply, kVerifyRegA | kVerifyRegB) \
  V(0xC9, DIV_FLOAT_2ADDR, "div-float/2addr", k12x, kIndexNone, kContinue, kDivide, kVerifyRegA | kVerifyRegB) \
  V(0xCA, REM_FLOAT_2ADDR, "rem-float/2addr", k12x, kIndexNone, kContinue, kRemainder, kVerifyRegA | kVerifyRegB) \
  V(0xCB, ADD_DOUBLE_2ADDR, "add-double/2addr", k12x, kIndexNone, kContinue, kAdd, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xCC, SUB_DOUBLE_2ADDR, "sub-double/2addr", k12x, kIndexNone, kContinue, kSubtract, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xCD, MUL_DOUBLE_2ADDR, "mul-double/2addr", k12x, kIndexNone, kContinue, kMultiply, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xCE, DIV_DOUBLE_2ADDR, "div-double/2addr", k12x, kIndexNone, kContinue, kDivide, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xCF, REM_DOUBLE_2ADDR, "rem-double/2addr", k12x, kIndexNone, kContinue, kRemainder, kVerifyRegAWide | kVerifyRegBWide) \
  V(0xD0, ADD_INT_LIT16, "add-int/lit16", k22s, kIndexNone, kContinue, kAdd | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD1, RSUB_INT, "rsub-int", k22s, kIndexNone, kContinue, kSubtract | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD2, MUL_INT_LIT16, "mul-int/lit16", k22s, kIndexNone, kContinue, kMultiply | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD3, DIV_INT_LIT16, "div-int/lit16", k22s, kIndexNone, kContinue | kThrow, kDivide | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD4, REM_INT_LIT16, "rem-int/lit16", k22s, kIndexNone, kContinue | kThrow, kRemainder | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD5, AND_INT_LIT16, "and-int/lit16", k22s, kIndexNone, kContinue, kAnd | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD6, OR_INT_LIT16, "or-int/lit16", k22s, kIndexNone, kContinue, kOr | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD7, XOR_INT_LIT16, "xor-int/lit16", k22s, kIndexNone, kContinue, kXor | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD8, ADD_INT_LIT8, "add-int/lit8", k22b, kIndexNone, kContinue, kAdd | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xD9, RSUB_INT_LIT8, "rsub-int/lit8", k22b, kIndexNone, kContinue, kSubtract | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xDA, MUL_INT_LIT8, "mul-int/lit8", k22b, kIndexNone, kContinue, kMultiply | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xDB, DIV_INT_LIT8, "div-int/lit8", k22b, kIndexNone, kContinue | kThrow, kDivide | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xDC, REM_INT_LIT8, "rem-int/lit8", k22b, kIndexNone, kContinue | kThrow, kRemainder | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xDD, AND_INT_LIT8, "and-int/lit8", k22b, kIndexNone, kContinue, kAnd | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xDE, OR_INT_LIT8, "or-int/lit8", k22b, kIndexNone, kContinue, kOr | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xDF, XOR_INT_LIT8, "xor-int/lit8", k22b, kIndexNone, kContinue, kXor | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xE0, SHL_INT_LIT8, "shl-int/lit8", k22b, kIndexNone, kContinue, kShl | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xE1, SHR_INT_LIT8, "shr-int/lit8", k22b, kIndexNone, kContinue, kShr | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xE2, USHR_INT_LIT8, "ushr-int/lit8", k22b, kIndexNone, kContinue, kUshr | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB) \
  V(0xE3, IGET_QUICK, "iget-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xE4, IGET_WIDE_QUICK, "iget-wide-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegAWide | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xE5, IGET_OBJECT_QUICK, "iget-object-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xE6, IPUT_QUICK, "iput-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xE7, IPUT_WIDE_QUICK, "iput-wide-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegAWide | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xE8, IPUT_OBJECT_QUICK, "iput-object-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xE9, INVOKE_VIRTUAL_QUICK, "invoke-virtual-quick", k35c, kIndexVtableOffset, kContinue | kThrow | kInvoke, 0, kVerifyVarArgNonZero | kVerifyRuntimeOnly) \
  V(0xEA, INVOKE_VIRTUAL_RANGE_QUICK, "invoke-virtual/range-quick", k3rc, kIndexVtableOffset, kContinue | kThrow | kInvoke, 0, kVerifyVarArgRangeNonZero | kVerifyRuntimeOnly) \
  V(0xEB, IPUT_BOOLEAN_QUICK, "iput-boolean-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xEC, IPUT_BYTE_QUICK, "iput-byte-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xED, IPUT_CHAR_QUICK, "iput-char-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xEE, IPUT_SHORT_QUICK, "iput-short-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kStore | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xEF, IGET_BOOLEAN_QUICK, "iget-boolean-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xF0, IGET_BYTE_QUICK, "iget-byte-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xF1, IGET_CHAR_QUICK, "iget-char-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xF2, IGET_SHORT_QUICK, "iget-short-quick", k22c, kIndexFieldOffset, kContinue | kThrow, kLoad | kRegCFieldOrConstant, kVerifyRegA | kVerifyRegB | kVerifyRuntimeOnly) \
  V(0xF3, UNUSED_F3, "unused-f3", k10x, kIndexUnknown, 0, 0, kVerifyError) \
  V(0xF4, UNUSED_F4, "unused-f4", k10x, kIndexUnknown, 0, 0, kVerifyError) \
  V(0xF5, UNUSED_F5, "unused-f5", k10x, kIndexUnknown, 0, 0, kVerifyError) \
  V(0xF6, UNUSED_F6, "unused-f6", k10x, kIndexUnknown, 0, 0, kVerifyError) \
  V(0xF7, UNUSED_F7, "unused-f7", k10x, kIndexUnknown, 0, 0, kVerifyError) \
  V(0xF8, UNUSED_F8, "unused-f8", k10x, kIndexUnknown, 0, 0, kVerifyError) \
  V(0xF9, UNUSED_F9, "unused-f9", k10x, kIndexUnknown, 0, 0, kVerifyError) \
  V(0xFA, INVOKE_POLYMORPHIC, "invoke-polymorphic", k45cc, kIndexMethodAndProtoRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgNonZero | kVerifyRegHPrototype) \
  V(0xFB, INVOKE_POLYMORPHIC_RANGE, "invoke-polymorphic/range", k4rcc, kIndexMethodAndProtoRef, kContinue | kThrow | kInvoke, 0, kVerifyRegBMethod | kVerifyVarArgRangeNonZero | kVerifyRegHPrototype) \
  V(0xFC, INVOKE_CUSTOM, "invoke-custom", k35c, kIndexCallSiteRef, kContinue | kThrow, 0, kVerifyRegBCallSite | kVerifyVarArg) \
  V(0xFD, INVOKE_CUSTOM_RANGE, "invoke-custom/range", k3rc, kIndexCallSiteRef, kContinue | kThrow, 0, kVerifyRegBCallSite | kVerifyVarArgRange) \
  V(0xFE, CONST_METHOD_HANDLE, "const-method-handle", k21c, kIndexMethodHandleRef, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegBMethodHandle) \
  V(0xFF, CONST_METHOD_TYPE, "const-method-type", k21c, kIndexProtoRef, kContinue | kThrow, 0, kVerifyRegA | kVerifyRegBPrototype)

#define DEX_INSTRUCTION_FORMAT_LIST(V) \
  V(k10x) \
  V(k12x) \
  V(k11n) \
  V(k11x) \
  V(k10t) \
  V(k20t) \
  V(k22x) \
  V(k21t) \
  V(k21s) \
  V(k21h) \
  V(k21c) \
  V(k23x) \
  V(k22b) \
  V(k22t) \
  V(k22s) \
  V(k22c) \
  V(k32x) \
  V(k30t) \
  V(k31t) \
  V(k31i) \
  V(k31c) \
  V(k35c) \
  V(k3rc) \
  V(k45cc) \
  V(k4rcc) \
  V(k51l)

#endif  // ART_LIBDEXFILE_DEX_DEX_INSTRUCTION_LIST_H_
#undef ART_LIBDEXFILE_DEX_DEX_INSTRUCTION_LIST_H_  // the guard in this file is just for cpplint
--------------------------------------------------------------------------------
/slicer/dex_ir.h:
--------------------------------------------------------------------------------
/*
 * Copyright (C) 2017 The Android Open Source Project
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#pragma once

#include "common.h"
#include "memview.h"
#include "arrayview.h"
#include "dex_format.h"
#include "dex_leb128.h"
#include "buffer.h"
#include "index_map.h"
#include "hash_table.h"

#include <stdlib.h>
#include <map>
#include <memory>
#include <vector>
#include <string>

// A simple, lightweight IR to abstract the key .dex structures
//
// 1. All the cross-IR references are modeled as plain pointers.
// 2. Newly allocated nodes are mem-zeroed first
//
// This IR can mirror any .dex file, although for JVMTI BCI
// it's expected to construct the IR for the single modified class only
// (and include only the nodes referenced from that class)

#define SLICER_IR_TYPE \
  using Node::Node;    \
  friend struct DexFile;

#define SLICER_IR_INDEXED_TYPE    \
  using IndexedNode::IndexedNode; \
  friend struct DexFile;

namespace ir {

// convenience notation
template <class T>
using own = std::unique_ptr<T>;

struct Node;
struct IndexedNode;
struct EncodedValue;
struct EncodedArray;
struct String;
struct Type;
struct TypeList;
struct Proto;
struct FieldDecl;
struct EncodedField;
struct DebugInfo;
struct Code;
struct MethodDecl;
struct EncodedMethod;
struct AnnotationElement;
struct Annotation;
struct AnnotationSet;
struct AnnotationSetRefList;
struct FieldAnnotation;
struct MethodAnnotation;
struct ParamAnnotation;
struct AnnotationsDirectory;
struct Class;
struct DexFile;

// The base class for all the .dex IR types:
//   This is not a polymorphic interface, but
//   a way to constrain the allocation and ownership
//   of .dex IR nodes.
struct Node {
  void* operator new(size_t size) {
    return ::calloc(1, size);
  }

  void* operator new[](size_t size) {
    return ::calloc(1, size);
  }

  void operator delete(void* ptr) {
    ::free(ptr);
  }

  void operator delete[](void* ptr) {
    ::free(ptr);
  }

 public:
  Node(const Node&) = delete;
  Node& operator=(const Node&) = delete;

 protected:
  Node() = default;
  ~Node() = default;
};

// a concession for the convenience of the .dex writer
//
// TODO: consider moving the indexing to the writer.
//
struct IndexedNode : public Node {
  SLICER_IR_TYPE;

  // this is the index in the generated image
  // (not the original index)
  dex::u4 index;

  // original index
  // (from the source .dex image or allocated post reader)
  dex::u4 orig_index;
};

struct EncodedValue : public Node {
  SLICER_IR_TYPE;

  dex::u1 type;
  union {
    int8_t byte_value;
    int16_t short_value;
    uint16_t char_value;
    int32_t int_value;
    int64_t long_value;
    float float_value;
    double double_value;
    String* string_value;
    Type* type_value;
    FieldDecl* field_value;
    MethodDecl* method_value;
    FieldDecl* enum_value;
    EncodedArray* array_value;
    Annotation* annotation_value;
    bool bool_value;
  } u;

  SLICER_EXTRA(slicer::MemView original);
};

struct EncodedArray : public Node {
  SLICER_IR_TYPE;

  std::vector<EncodedValue*> values;
};

struct String : public IndexedNode {
  SLICER_IR_INDEXED_TYPE;

  // opaque DEX "string_data_item"
  slicer::MemView data;

  const char* c_str() const {
    const dex::u1* strData = data.ptr();
    dex::ReadULeb128(&strData);
    return reinterpret_cast<const char*>(strData);
  }
};

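// Illustrative sketch (not part of the original header): a DEX
// "string_data_item" is a ULEB128 length prefix (utf16_size) followed by
// NUL-terminated MUTF-8 bytes, which is why String::c_str() decodes and
// discards the prefix before returning the pointer. The names
// SketchReadULeb128, SketchDecode, SketchCStr, kSketchHi and kSketch300 are
// hypothetical stand-ins invented for this example; they assume nothing about
// slicer beyond the layout just described.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for dex::ReadULeb128: decodes an unsigned LEB128
// value (at most 5 bytes) and advances *pptr just past it.
inline uint32_t SketchReadULeb128(const uint8_t** pptr) {
  const uint8_t* ptr = *pptr;
  uint32_t result = 0;
  for (int shift = 0; shift < 35; shift += 7) {
    uint32_t cur = *ptr++;
    result |= (cur & 0x7f) << shift;   // low 7 bits carry the payload
    if ((cur & 0x80) == 0) break;      // high bit clear: last byte
  }
  *pptr = ptr;
  return result;
}

// Convenience wrapper that decodes a value without exposing the cursor.
inline uint32_t SketchDecode(const uint8_t* p) {
  return SketchReadULeb128(&p);
}

// Hypothetical equivalent of ir::String::c_str(): skips the ULEB128
// utf16_size prefix and returns the MUTF-8 bytes that follow it.
inline const char* SketchCStr(const uint8_t* string_data_item) {
  SketchReadULeb128(&string_data_item);
  return reinterpret_cast<const char*>(string_data_item);
}

// Sample string_data_item for "hi": utf16_size = 2, then 'h', 'i', '\0'.
static const uint8_t kSketchHi[] = {0x02, 'h', 'i', 0x00};

// ULEB128 encoding of 300: 0xAC (0x2C plus continuation bit), then 0x02.
static const uint8_t kSketch300[] = {0xAC, 0x02};
```

// With these definitions, SketchCStr(kSketchHi) points at the bytes "hi" and
// SketchDecode(kSketch300) evaluates to 300.
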
struct Type : public IndexedNode {
  SLICER_IR_INDEXED_TYPE;

  enum class Category { Void, Scalar, WideScalar, Reference };

  String* descriptor;
  Class* class_def;

  std::string Decl() const;
  Category GetCategory() const;
};

struct TypeList : public Node {
  SLICER_IR_TYPE;

  std::vector<Type*> types;
};

struct Proto : public IndexedNode {
  SLICER_IR_INDEXED_TYPE;

  String* shorty;
  Type* return_type;
  TypeList* param_types;

  std::string Signature() const;
};

struct FieldDecl : public IndexedNode {
  SLICER_IR_INDEXED_TYPE;

  String* name;
  Type* type;
  Type* parent;
};

struct EncodedField : public Node {
  SLICER_IR_TYPE;

  FieldDecl* decl;
  dex::u4 access_flags;
};

struct DebugInfo : public Node {
  SLICER_IR_TYPE;

  dex::u4 line_start;
  std::vector<String*> param_names;

  // original debug info opcodes stream
  // (must be "relocated" when creating a new .dex image)
  slicer::MemView data;
};

struct Code : public Node {
  SLICER_IR_TYPE;

  dex::u2 registers;
  dex::u2 ins_count;
  dex::u2 outs_count;
  slicer::ArrayView<const dex::u2> instructions;
  slicer::ArrayView<const dex::TryBlock> try_blocks;
  slicer::MemView catch_handlers;
  DebugInfo* debug_info;
};

struct MethodDecl : public IndexedNode {
  SLICER_IR_INDEXED_TYPE;

  String* name;
  Proto* prototype;
  Type* parent;
};

struct EncodedMethod : public Node {
  SLICER_IR_TYPE;

  MethodDecl* decl;
  Code* code;
  dex::u4 access_flags;
};

struct AnnotationElement : public Node {
  SLICER_IR_TYPE;

  String* name;
  EncodedValue* value;
};

struct Annotation : public Node {
  SLICER_IR_TYPE;

  Type* type;
  std::vector<AnnotationElement*> elements;
  dex::u1 visibility;
};

struct AnnotationSet : public Node {
  SLICER_IR_TYPE;

  std::vector<Annotation*> annotations;
};

struct AnnotationSetRefList : public Node {
  SLICER_IR_TYPE;

  std::vector<AnnotationSet*> annotations;
};

struct FieldAnnotation : public Node {
  SLICER_IR_TYPE;

  FieldDecl* field_decl;
  AnnotationSet* annotations;
};

struct MethodAnnotation : public Node {
  SLICER_IR_TYPE;

  MethodDecl* method_decl;
  AnnotationSet* annotations;
};

struct ParamAnnotation : public Node {
  SLICER_IR_TYPE;

  MethodDecl* method_decl;
  AnnotationSetRefList* annotations;
};

struct AnnotationsDirectory : public Node {
  SLICER_IR_TYPE;

  AnnotationSet* class_annotation;
  std::vector<FieldAnnotation*> field_annotations;
  std::vector<MethodAnnotation*> method_annotations;
  std::vector<ParamAnnotation*> param_annotations;
};

struct Class : public IndexedNode {
  SLICER_IR_INDEXED_TYPE;

  Type* type;
  dex::u4 access_flags;
  Type* super_class;
  TypeList* interfaces;
  String* source_file;
  AnnotationsDirectory* annotations;
  EncodedArray* static_init;

  std::vector<EncodedField*> static_fields;
  std::vector<EncodedField*> instance_fields;
  std::vector<EncodedMethod*> direct_methods;
  std::vector<EncodedMethod*> virtual_methods;
};

// ir::String hashing
struct StringsHasher {
  const char* GetKey(const String* string) const { return string->c_str(); }
  uint32_t Hash(const char* string_key) const;
  bool Compare(const char* string_key, const String* string) const;
};

// ir::Proto hashing
struct ProtosHasher {
  std::string GetKey(const Proto* proto) const { return proto->Signature(); }
  uint32_t Hash(const std::string& proto_key) const;
  bool Compare(const std::string& proto_key, const Proto* proto) const;
};

// ir::EncodedMethod hashing
struct MethodKey {
  String* class_descriptor = nullptr;
  String* method_name = nullptr;
  Proto* prototype = nullptr;
};

struct MethodsHasher {
  MethodKey GetKey(const EncodedMethod* method) const;
  uint32_t Hash(const MethodKey& method_key) const;
  bool Compare(const MethodKey& method_key, const EncodedMethod* method) const;
};

using StringsLookup = slicer::HashTable<const char*, String, StringsHasher>;
using PrototypesLookup = slicer::HashTable<const std::string&, Proto, ProtosHasher>;
using MethodsLookup = slicer::HashTable<const MethodKey&, EncodedMethod, MethodsHasher>;

// The main container/root for a .dex IR
struct DexFile {
  // indexed structures
  std::vector<own<String>> strings;
  std::vector<own<Type>> types;
  std::vector<own<Proto>> protos;
  std::vector<own<FieldDecl>> fields;
  std::vector<own<MethodDecl>> methods;
  std::vector<own<Class>> classes;

  // data segment structures
  std::vector<own<EncodedField>> encoded_fields;
  std::vector<own<EncodedMethod>> encoded_methods;
  std::vector<own<TypeList>> type_lists;
  std::vector<own<Code>> code;
  std::vector<own<DebugInfo>> debug_info;
  std::vector<own<EncodedValue>> encoded_values;
  std::vector<own<EncodedArray>> encoded_arrays;
  std::vector<own<Annotation>> annotations;
  std::vector<own<AnnotationElement>> annotation_elements;
  std::vector<own<AnnotationSet>> annotation_sets;
  std::vector<own<AnnotationSetRefList>> annotation_set_ref_lists;
  std::vector<own<AnnotationsDirectory>> annotations_directories;
  std::vector<own<FieldAnnotation>> field_annotations;
  std::vector<own<MethodAnnotation>> method_annotations;
  std::vector<own<ParamAnnotation>> param_annotations;

  // original index to IR node mappings
  //
  // CONSIDER: we only need to carry around
  //   the relocation for the referenced items
  //
  std::map<dex::u4, Type*> types_map;
  std::map<dex::u4, String*> strings_map;
  std::map<dex::u4, Proto*> protos_map;
  std::map<dex::u4, FieldDecl*> fields_map;
  std::map<dex::u4, MethodDecl*> methods_map;
  std::map<dex::u4, Class*> classes_map;

  // original .dex header "magic" signature
  slicer::MemView magic;

  // keep track of the used index values
  // (so we can easily allocate new ones)
  IndexMap strings_indexes;
  IndexMap types_indexes;
  IndexMap protos_indexes;
  IndexMap fields_indexes;
  IndexMap methods_indexes;
  IndexMap classes_indexes;

  // lookup hash tables
  StringsLookup strings_lookup;
  MethodsLookup methods_lookup;
  PrototypesLookup prototypes_lookup;

 public:
  DexFile() = default;

  // No copy/move semantics
  DexFile(const DexFile&) = delete;
  DexFile& operator=(const DexFile&) = delete;

  template <class T>
  T* Alloc() {
    T* p = new T();
    Track(p);
    return p;
  }

  void AttachBuffer(slicer::Buffer&& buffer) {
    buffers_.push_back(std::move(buffer));
  }

  void Normalize();

 private:
  void TopSortClassIndex(Class* irClass, dex::u4* nextIndex);
  void SortClassIndexes();

  template <class T>
  void PushOwn(std::vector<own<T>>& v, T* p) {
    v.push_back(own<T>(p));
  }

  void Track(String* p) { PushOwn(strings, p); }
  void Track(Type* p) { PushOwn(types, p); }
  void Track(Proto* p) { PushOwn(protos, p); }
  void Track(FieldDecl* p) { PushOwn(fields, p); }
  void Track(MethodDecl* p) { PushOwn(methods, p); }
  void Track(Class* p) { PushOwn(classes, p); }

  void Track(EncodedField* p) { PushOwn(encoded_fields, p); }
  void Track(EncodedMethod* p) { PushOwn(encoded_methods, p); }
  void Track(TypeList* p) { PushOwn(type_lists, p); }
  void Track(Code* p) { PushOwn(code, p); }
  void Track(DebugInfo* p) { PushOwn(debug_info, p); }
  void Track(EncodedValue* p) { PushOwn(encoded_values, p); }
  void Track(EncodedArray* p) { PushOwn(encoded_arrays, p); }
  void Track(Annotation* p) { PushOwn(annotations, p); }
  void Track(AnnotationElement* p) { PushOwn(annotation_elements, p); }
  void Track(AnnotationSet* p) { PushOwn(annotation_sets, p); }
  void Track(AnnotationSetRefList* p) { PushOwn(annotation_set_ref_lists, p); }
  void Track(AnnotationsDirectory* p) { PushOwn(annotations_directories, p); }
  void Track(FieldAnnotation* p) { PushOwn(field_annotations, p); }
  void Track(MethodAnnotation* p) { PushOwn(method_annotations, p); }
  void Track(ParamAnnotation* p) { PushOwn(param_annotations, p); }

 private:
  // additional memory buffers owned by this .dex IR
  std::vector<slicer::Buffer> buffers_;
};

}  // namespace ir

#undef SLICER_IR_TYPE
#undef SLICER_IR_INDEXED_TYPE

--------------------------------------------------------------------------------
/slicer/dex_leb128.h:
--------------------------------------------------------------------------------
/*
 * Copyright (C) 2017 The Android Open Source Project
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#pragma once

#include "dex_format.h"

// LEB128 encode/decode helpers:
// https://source.android.com/devices/tech/dalvik/dex-format.html

namespace dex {

// Reads an unsigned LEB128 value, updating the given pointer to
// point just past the end of the read value.
inline u4 ReadULeb128(const u1** pptr) {
  const u1* ptr = *pptr;
  u4 result = *(ptr++);

  if (result > 0x7f) {
    u4 cur = *(ptr++);
    result = (result & 0x7f) | ((cur & 0x7f) << 7);
    if (cur > 0x7f) {
      cur = *(ptr++);
      result |= (cur & 0x7f) << 14;
      if (cur > 0x7f) {
        cur = *(ptr++);
        result |= (cur & 0x7f) << 21;
        if (cur > 0x7f) {
          // We don't check to see if cur is out of
          // range here, meaning we tolerate garbage in the
          // four high-order bits.
          cur = *(ptr++);
          result |= cur << 28;
        }
      }
    }
  }

  *pptr = ptr;
  return result;
}

// Reads a signed LEB128 value, updating the given pointer to
// point just past the end of the read value.
inline s4 ReadSLeb128(const u1** pptr) {
  const u1* ptr = *pptr;
  s4 result = *(ptr++);

  if (result <= 0x7f) {
    result = (result << 25) >> 25;
  } else {
    s4 cur = *(ptr++);
    result = (result & 0x7f) | ((cur & 0x7f) << 7);
    if (cur <= 0x7f) {
      result = (result << 18) >> 18;
    } else {
      cur = *(ptr++);
      result |= (cur & 0x7f) << 14;
      if (cur <= 0x7f) {
        result = (result << 11) >> 11;
      } else {
        cur = *(ptr++);
        result |= (cur & 0x7f) << 21;
        if (cur <= 0x7f) {
          result = (result << 4) >> 4;
        } else {
          // Note: We don't check to see if cur is out of
          // range here, meaning we tolerate garbage in the
          // four high-order bits.
          cur = *(ptr++);
          result |= cur << 28;
        }
      }
    }
  }

  *pptr = ptr;
  return result;
}

// Writes a 32-bit value in unsigned LEB128 (ULEB128) format.
// Returns the updated pointer.
96 | inline u1* WriteULeb128(u1* ptr, u4 data) { 97 | for (;;) { 98 | u1 out = data & 0x7f; 99 | if (out != data) { 100 | *ptr++ = out | 0x80; 101 | data >>= 7; 102 | } else { 103 | *ptr++ = out; 104 | break; 105 | } 106 | } 107 | return ptr; 108 | } 109 | 110 | // Writes a 32-bit value in signed ULEB128 format. 111 | // Returns the updated pointer. 112 | inline u1* WriteSLeb128(u1* ptr, s4 value) { 113 | u4 extra_bits = static_cast(value ^ (value >> 31)) >> 6; 114 | u1 out = value & 0x7f; 115 | while (extra_bits != 0u) { 116 | *ptr++ = out | 0x80; 117 | value >>= 7; 118 | out = value & 0x7f; 119 | extra_bits >>= 7; 120 | } 121 | *ptr++ = out; 122 | return ptr; 123 | } 124 | 125 | } // namespace dex 126 | -------------------------------------------------------------------------------- /slicer/dex_utf8.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #include "dex_format.h" 20 | 21 | // MUTF-8 (Modified UTF-8) Encoding helpers: 22 | // https://source.android.com/devices/tech/dalvik/dex-format.html 23 | 24 | namespace dex { 25 | 26 | // Compare two '\0'-terminated modified UTF-8 strings, using Unicode 27 | // code point values for comparison. 
This treats different encodings 28 | // for the same code point as equivalent, except that only a real '\0' 29 | // byte is considered the string terminator. The return value is as 30 | // for strcmp(). 31 | int Utf8Cmp(const char* s1, const char* s2); 32 | 33 | } // namespace dex -------------------------------------------------------------------------------- /slicer/hash_table.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #include 20 | #include 21 | #include 22 | 23 | namespace slicer { 24 | 25 | // A specialized Key -> T* map (note that, unlike std:: containers, the values 26 | // are always pointers here, and we don't explicitly store the lookup keys) 27 | // 28 | // Implemented as an incrementally resizable hash table: we split the logical hash table 29 | // into two internal fixed size tables, the "full table" and a "insertion table". 30 | // When the insertion table overflows, we allocate a larger hashtable to replace 31 | // it and "insertion table" becomes the "full table" (the old "full table" is 32 | // rehashed into the new hash table) 33 | // 34 | // Similar to open addressing hash tables, all the buckets are a single, 35 | // contiguous array. 
But this table is growing and the collisions are still handled 36 | // as chains (using indexes instead of pointers). 37 | // 38 | // The result is faster than std::unordered_map and uses ~25% of 39 | // the memory used by std::unordered_map 40 | // 41 | // The Hash template argument is a type which must implement: 42 | // 1. hash function : uint32_t Hash(const Key& key) 43 | // 2. key compare : bool Compare(const Key& key, T* value) 44 | // 3. key extraction : Key GetKey(T* value) 45 | // 4. copy semantics 46 | // 47 | template 48 | class HashTable { 49 | private: 50 | // the index type inside the bucket array 51 | using Index = uint32_t; 52 | 53 | static constexpr Index kInitialHashBuckets = (1 << 7) - 1; 54 | static constexpr Index kAvgChainLength = 2; 55 | static constexpr Index kInvalidIndex = static_cast(-1); 56 | static constexpr double kResizeFactor = 1.6; 57 | 58 | struct __attribute__((packed)) Bucket { 59 | T* value = nullptr; 60 | Index next = kInvalidIndex; 61 | }; 62 | 63 | class Partition { 64 | public: 65 | Partition(Index size, const Hash& hasher); 66 | bool Insert(T* value); 67 | T* Lookup(const Key& key, uint32_t hash_value) const; 68 | Index HashBuckets() const { return hash_buckets_; } 69 | void InsertAll(const Partition& src); 70 | void PrintStats(const char* name, bool verbose); 71 | 72 | private: 73 | std::vector buckets_; 74 | const Index hash_buckets_; 75 | Hash hasher_; 76 | }; 77 | 78 | public: 79 | explicit HashTable(const Hash& hasher = Hash()) : hasher_(hasher) { 80 | // we start with full_table_ == nullptr 81 | insertion_table_.reset(new Partition(kInitialHashBuckets, hasher_)); 82 | } 83 | 84 | ~HashTable() = default; 85 | 86 | // No move or copy semantics 87 | HashTable(const HashTable&) = delete; 88 | HashTable& operator=(const HashTable&) = delete; 89 | 90 | // Insert a new, non-nullptr T* into the hash table 91 | // (we only store unique values so the new value must 92 | // not be in the table already) 93 | void Insert(T* 
value); 94 | 95 | // Lookup an existing value 96 | // (returns nullptr if the value is not found) 97 | T* Lookup(const Key& key) const; 98 | 99 | void PrintStats(const char* name, bool verbose); 100 | 101 | private: 102 | std::unique_ptr<Partition> full_table_; 103 | std::unique_ptr<Partition> insertion_table_; 104 | Hash hasher_; 105 | }; 106 | 107 | template <class Key, class T, class Hash> 108 | HashTable<Key, T, Hash>::Partition::Partition(Index size, const Hash& hasher) 109 | : hash_buckets_(size), hasher_(hasher) { 110 | // allocate space for the hash buckets + avg chain length 111 | buckets_.reserve(hash_buckets_ * kAvgChainLength); 112 | buckets_.resize(hash_buckets_); 113 | } 114 | 115 | // Similar to the "cellar" version of coalesced hashing, 116 | // the buckets array is divided into a fixed set of entries 117 | // addressable by the hash value [0 .. hash_buckets_) and 118 | // extra buckets for the collision chains [hash_buckets_, buckets_.size()) 119 | // Unlike coalesced hashing, our "cellar" is growing so we don't actually 120 | // have to coalesce any chains. 121 | // 122 | // Returns true if the insertion succeeded, false if the table overflows 123 | // (we never insert more than the pre-reserved capacity) 124 | // 125 | template <class Key, class T, class Hash> 126 | bool HashTable<Key, T, Hash>::Partition::Insert(T* value) { 127 | SLICER_CHECK(value != nullptr); 128 | // overflow?
129 | if (buckets_.size() + 1 > buckets_.capacity()) { 130 | return false; 131 | } 132 | auto key = hasher_.GetKey(value); 133 | Index bucket_index = hasher_.Hash(key) % hash_buckets_; 134 | if (buckets_[bucket_index].value == nullptr) { 135 | buckets_[bucket_index].value = value; 136 | } else { 137 | Bucket new_bucket = {}; 138 | new_bucket.value = value; 139 | new_bucket.next = buckets_[bucket_index].next; 140 | buckets_[bucket_index].next = buckets_.size(); 141 | buckets_.push_back(new_bucket); 142 | } 143 | return true; 144 | } 145 | 146 | template <class Key, class T, class Hash> 147 | T* HashTable<Key, T, Hash>::Partition::Lookup(const Key& key, uint32_t hash_value) const { 148 | assert(hash_value == hasher_.Hash(key)); 149 | Index bucket_index = hash_value % hash_buckets_; 150 | for (Index index = bucket_index; index != kInvalidIndex; index = buckets_[index].next) { 151 | auto value = buckets_[index].value; 152 | if (value == nullptr) { 153 | assert(index < hash_buckets_); 154 | break; 155 | } else if (hasher_.Compare(key, value)) { 156 | return value; 157 | } 158 | } 159 | return nullptr; 160 | } 161 | 162 | template <class Key, class T, class Hash> 163 | void HashTable<Key, T, Hash>::Partition::InsertAll(const Partition& src) { 164 | for (const auto& bucket : src.buckets_) { 165 | if (bucket.value != nullptr) { 166 | SLICER_CHECK(Insert(bucket.value)); 167 | } 168 | } 169 | } 170 | 171 | // Try to insert into the "insertion table". If that overflows, 172 | // we allocate a new, larger hash table, move the "full table" values to it 173 | // and the "insertion table" becomes the new "full table".
174 | template <class Key, class T, class Hash> 175 | void HashTable<Key, T, Hash>::Insert(T* value) { 176 | assert(Lookup(hasher_.GetKey(value)) == nullptr); 177 | if (!insertion_table_->Insert(value)) { 178 | std::unique_ptr<Partition> new_hash_table( 179 | new Partition(insertion_table_->HashBuckets() * kResizeFactor, hasher_)); 180 | if (full_table_) { 181 | new_hash_table->InsertAll(*full_table_); 182 | } 183 | SLICER_CHECK(new_hash_table->Insert(value)); 184 | full_table_ = std::move(insertion_table_); 185 | insertion_table_ = std::move(new_hash_table); 186 | } 187 | } 188 | 189 | // First look into the "full table" and if the value is 190 | // not found there look into the "insertion table" next 191 | template <class Key, class T, class Hash> 192 | T* HashTable<Key, T, Hash>::Lookup(const Key& key) const { 193 | auto hash_value = hasher_.Hash(key); 194 | if (full_table_) { 195 | auto value = full_table_->Lookup(key, hash_value); 196 | if (value != nullptr) { 197 | return value; 198 | } 199 | } 200 | return insertion_table_->Lookup(key, hash_value); 201 | } 202 | 203 | template <class Key, class T, class Hash> 204 | void HashTable<Key, T, Hash>::Partition::PrintStats(const char* name, bool verbose) { 205 | int max_chain_length = 0; 206 | int sum_chain_length = 0; 207 | int used_buckets = 0; 208 | for (Index i = 0; i < hash_buckets_; ++i) { 209 | if (verbose) printf("%4d : ", i); 210 | if (buckets_[i].value != nullptr) { 211 | ++used_buckets; 212 | int chain_length = 0; 213 | for (Index ci = i; buckets_[ci].next != kInvalidIndex; ci = buckets_[ci].next) { 214 | SLICER_CHECK(buckets_[ci].value != nullptr); 215 | ++chain_length; 216 | if (verbose) printf("*"); 217 | } 218 | max_chain_length = std::max(max_chain_length, chain_length); 219 | sum_chain_length += chain_length; 220 | } 221 | if (verbose) printf("\n"); 222 | } 223 | 224 | int avg_chain_length = used_buckets ?
sum_chain_length / used_buckets : 0; 225 | 226 | printf("\nHash table partition (%s):\n", name); 227 | printf(" hash_buckets : %u\n", hash_buckets_); 228 | printf(" size/capacity : %zu / %zu\n", buckets_.size(), buckets_.capacity()); 229 | printf(" used_buckets : %d\n", used_buckets); 230 | printf(" max_chain_length : %d\n", max_chain_length); 231 | printf(" avg_chain_length : %d\n", avg_chain_length); 232 | } 233 | 234 | template <class Key, class T, class Hash> 235 | void HashTable<Key, T, Hash>::PrintStats(const char* name, bool verbose) { 236 | printf("\nHash table stats (%s)\n", name); 237 | if (full_table_) { 238 | full_table_->PrintStats("full_table", verbose); 239 | } 240 | insertion_table_->PrintStats("insertion_table", verbose); 241 | } 242 | 243 | } // namespace slicer -------------------------------------------------------------------------------- /slicer/index_map.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License.
15 | */ 16 | 17 | #pragma once 18 | 19 | #include "common.h" 20 | #include "dex_format.h" 21 | 22 | #include <vector> 23 | 24 | namespace ir { 25 | 26 | // A simple index tracker and allocator 27 | class IndexMap { 28 | public: 29 | dex::u4 AllocateIndex() { 30 | const auto size = indexes_map_.size(); 31 | while (alloc_pos_ < size && indexes_map_[alloc_pos_]) { 32 | ++alloc_pos_; 33 | } 34 | MarkUsedIndex(alloc_pos_); 35 | return alloc_pos_++; 36 | } 37 | 38 | void MarkUsedIndex(dex::u4 index) { 39 | if (index >= indexes_map_.size()) { 40 | indexes_map_.resize(index + 1); 41 | } 42 | SLICER_CHECK(!indexes_map_[index]); 43 | indexes_map_[index] = true; 44 | } 45 | 46 | private: 47 | std::vector<bool> indexes_map_; 48 | dex::u4 alloc_pos_ = 0; 49 | }; 50 | 51 | } // namespace ir -------------------------------------------------------------------------------- /slicer/memview.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License.
15 | */ 16 | 17 | #pragma once 18 | 19 | #include "common.h" 20 | 21 | #include 22 | #include 23 | 24 | namespace slicer { 25 | 26 | // A shallow, non-owning reference to a "view" inside a memory buffer 27 | class MemView { 28 | public: 29 | MemView() : ptr_(nullptr), size_(0) {} 30 | 31 | MemView(const void* ptr, size_t size) : ptr_(ptr), size_(size) { 32 | assert(size > 0); 33 | } 34 | 35 | ~MemView() = default; 36 | 37 | template <class T> 38 | const T* ptr() const { 39 | return static_cast<const T*>(ptr_); 40 | } 41 | 42 | size_t size() const { return size_; } 43 | 44 | private: 45 | const void* ptr_; 46 | size_t size_; 47 | }; 48 | 49 | } // namespace slicer 50 | 51 | -------------------------------------------------------------------------------- /slicer/reader.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #include "common.h" 20 | #include "dex_format.h" 21 | #include "dex_ir.h" 22 | 23 | #include 24 | #include 25 | #include 26 | #include 27 | 28 | namespace dex { 29 | 30 | // Provides both a low level iteration over the .dex 31 | // structures and incremental .dex IR creation.
32 | // 33 | // NOTES: 34 | // - only little-endian .dex files and host machines are supported 35 | // - aggressive structure validation & minimal semantic validation 36 | // 37 | class Reader { 38 | public: 39 | Reader(const dex::u1* image, size_t size); 40 | ~Reader() = default; 41 | 42 | // No copy semantics (move-only) 43 | Reader(const Reader&) = delete; 44 | Reader& operator=(const Reader&) = delete; 45 | Reader(Reader&&) = default; 46 | Reader& operator=(Reader&&) = default; 47 | 48 | public: 49 | // Low level dex format interface 50 | const dex::Header* Header() const { return header_; } 51 | const char* GetStringMUTF8(dex::u4 index) const; 52 | slicer::ArrayView<const dex::ClassDef> ClassDefs() const; 53 | slicer::ArrayView<const dex::StringId> StringIds() const; 54 | slicer::ArrayView<const dex::TypeId> TypeIds() const; 55 | slicer::ArrayView<const dex::FieldId> FieldIds() const; 56 | slicer::ArrayView<const dex::MethodId> MethodIds() const; 57 | slicer::ArrayView<const dex::ProtoId> ProtoIds() const; 58 | const dex::MapList* DexMapList() const; 59 | 60 | // IR creation interface 61 | std::shared_ptr<ir::DexFile> GetIr() const { return dex_ir_; } 62 | void CreateFullIr(); 63 | void CreateClassIr(dex::u4 index); 64 | dex::u4 FindClassIndex(const char* class_descriptor) const; 65 | 66 | const dex::u1* Image() const { return image_; } 67 | 68 | private: 69 | // Internal access to IR nodes for indexed .dex structures 70 | ir::Class* GetClass(dex::u4 index); 71 | ir::Type* GetType(dex::u4 index); 72 | ir::FieldDecl* GetFieldDecl(dex::u4 index); 73 | ir::MethodDecl* GetMethodDecl(dex::u4 index); 74 | ir::Proto* GetProto(dex::u4 index); 75 | ir::String* GetString(dex::u4 index); 76 | 77 | // Parsing annotations 78 | ir::AnnotationsDirectory* ExtractAnnotations(dex::u4 offset); 79 | ir::Annotation* ExtractAnnotationItem(dex::u4 offset); 80 | ir::AnnotationSet* ExtractAnnotationSet(dex::u4 offset); 81 | ir::AnnotationSetRefList* ExtractAnnotationSetRefList(dex::u4 offset); 82 | ir::FieldAnnotation* ParseFieldAnnotation(const dex::u1** pptr); 83 | ir::MethodAnnotation* ParseMethodAnnotation(const
dex::u1** pptr); 84 | ir::ParamAnnotation* ParseParamAnnotation(const dex::u1** pptr); 85 | ir::EncodedField* ParseEncodedField(const dex::u1** pptr, dex::u4* baseIndex); 86 | ir::Annotation* ParseAnnotation(const dex::u1** pptr); 87 | 88 | // Parse encoded values and arrays 89 | ir::EncodedValue* ParseEncodedValue(const dex::u1** pptr); 90 | ir::EncodedArray* ParseEncodedArray(const dex::u1** pptr); 91 | ir::EncodedArray* ExtractEncodedArray(dex::u4 offset); 92 | 93 | // Parse root .dex structures 94 | ir::Class* ParseClass(dex::u4 index); 95 | ir::EncodedMethod* ParseEncodedMethod(const dex::u1** pptr, dex::u4* baseIndex); 96 | ir::Type* ParseType(dex::u4 index); 97 | ir::FieldDecl* ParseFieldDecl(dex::u4 index); 98 | ir::MethodDecl* ParseMethodDecl(dex::u4 index); 99 | ir::TypeList* ExtractTypeList(dex::u4 offset); 100 | ir::Proto* ParseProto(dex::u4 index); 101 | ir::String* ParseString(dex::u4 index); 102 | 103 | // Parse code and debug information 104 | ir::DebugInfo* ExtractDebugInfo(dex::u4 offset); 105 | ir::Code* ExtractCode(dex::u4 offset); 106 | void ParseInstructions(slicer::ArrayView<const dex::u2> code); 107 | 108 | // Convert a file pointer (absolute offset) to an in-memory pointer 109 | template <class T> 110 | const T* ptr(int offset) const { 111 | SLICER_CHECK(offset >= 0 && offset + sizeof(T) <= size_); 112 | return reinterpret_cast<const T*>(image_ + offset); 113 | } 114 | 115 | // Convert a data section file pointer (absolute offset) to an in-memory pointer 116 | // (offset should be inside the data section) 117 | template <class T> 118 | const T* dataPtr(size_t offset) const { 119 | SLICER_CHECK(offset >= header_->data_off && offset + sizeof(T) <= size_); 120 | return reinterpret_cast<const T*>(image_ + offset); 121 | } 122 | 123 | // Map an indexed section to an ArrayView 124 | template <class T> 125 | slicer::ArrayView<const T> section(int offset, int count) const { 126 | return slicer::ArrayView<const T>(ptr<T>(offset), count); 127 | } 128 | 129 | // Simple accessor for MUTF8 string data 130 | const dex::u1*
GetStringData(dex::u4 index) const { 131 | auto& stringId = StringIds()[index]; 132 | return dataPtr<dex::u1>(stringId.string_data_off); 133 | } 134 | 135 | void ValidateHeader(); 136 | 137 | private: 138 | // the in-memory .dex image 139 | const dex::u1* image_; 140 | size_t size_; 141 | 142 | // .dex image header 143 | const dex::Header* header_; 144 | 145 | // .dex IR associated with the reader 146 | std::shared_ptr<ir::DexFile> dex_ir_; 147 | 148 | // maps for de-duplicating items identified by file pointers 149 | std::map<dex::u4, ir::TypeList*> type_lists_; 150 | std::map<dex::u4, ir::Annotation*> annotations_; 151 | std::map<dex::u4, ir::AnnotationSet*> annotation_sets_; 152 | std::map<dex::u4, ir::AnnotationsDirectory*> annotations_directories_; 153 | std::map<dex::u4, ir::EncodedArray*> encoded_arrays_; 154 | }; 155 | 156 | } // namespace dex 157 | -------------------------------------------------------------------------------- /slicer/scopeguard.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License.
15 | */ 16 | 17 | #pragma once 18 | 19 | namespace slicer { 20 | 21 | // A simple and lightweight scope guard and macro 22 | // (inspired by Andrei Alexandrescu's C++11 Scope Guard) 23 | // 24 | // Here is how it's used: 25 | // 26 | // FILE* file = std::fopen(...); 27 | // SLICER_SCOPE_EXIT { 28 | // std::fclose(file); 29 | // }; 30 | // 31 | // "file" will be closed at the end of the enclosing scope, 32 | // regardless of how the scope is exited 33 | // 34 | class ScopeGuardHelper 35 | { 36 | template <class T> 37 | class ScopeGuard 38 | { 39 | public: 40 | explicit ScopeGuard(T closure) : 41 | closure_(std::move(closure)) 42 | { 43 | } 44 | 45 | ~ScopeGuard() 46 | { 47 | closure_(); 48 | } 49 | 50 | // move constructor only 51 | ScopeGuard(ScopeGuard&&) = default; 52 | ScopeGuard(const ScopeGuard&) = delete; 53 | ScopeGuard& operator=(const ScopeGuard&) = delete; 54 | ScopeGuard& operator=(ScopeGuard&&) = delete; 55 | 56 | private: 57 | T closure_; 58 | }; 59 | 60 | public: 61 | template <class T> 62 | ScopeGuard<T> operator<<(T closure) 63 | { 64 | return ScopeGuard<T>(std::move(closure)); 65 | } 66 | }; 67 | 68 | #define SLICER_SG_MACRO_CONCAT2(a, b) a ## b 69 | #define SLICER_SG_MACRO_CONCAT(a, b) SLICER_SG_MACRO_CONCAT2(a, b) 70 | #define SLICER_SG_ANONYMOUS(prefix) SLICER_SG_MACRO_CONCAT(prefix, __COUNTER__) 71 | 72 | #define SLICER_SCOPE_EXIT \ 73 | auto SLICER_SG_ANONYMOUS(_scope_guard_) = slicer::ScopeGuardHelper() << [&]() 74 | 75 | } // namespace slicer 76 | -------------------------------------------------------------------------------- /slicer/writer.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (C) 2017 The Android Open Source Project 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #include "common.h" 20 | #include "buffer.h" 21 | #include "arrayview.h" 22 | #include "dex_format.h" 23 | #include "dex_ir.h" 24 | 25 | #include 26 | #include 27 | #include 28 | 29 | namespace dex { 30 | 31 | // Specialized buffer for creating a .dex image section 32 | // (tracking the section offset, section type, ...) 33 | class Section : public slicer::Buffer { 34 | public: 35 | explicit Section(dex::u2 mapEntryType) : map_entry_type_(mapEntryType) {} 36 | ~Section() = default; 37 | 38 | Section(const Section&) = delete; 39 | Section& operator=(const Section&) = delete; 40 | 41 | void SetOffset(dex::u4 offset) { 42 | SLICER_CHECK(offset > 0 && offset % 4 == 0); 43 | offset_ = offset; 44 | } 45 | 46 | dex::u4 SectionOffset() const { 47 | SLICER_CHECK(offset_ > 0 && offset_ % 4 == 0); 48 | return ItemsCount() > 0 ? offset_ : 0; 49 | } 50 | 51 | dex::u4 AbsoluteOffset(dex::u4 itemOffset) const { 52 | SLICER_CHECK(offset_ > 0); 53 | SLICER_CHECK(itemOffset < size()); 54 | return offset_ + itemOffset; 55 | } 56 | 57 | // TODO: return absolute offsets? 
58 | dex::u4 AddItem(dex::u4 alignment = 1) { 59 | ++count_; 60 | Align(alignment); 61 | return size(); 62 | } 63 | 64 | dex::u4 ItemsCount() const { return count_; } 65 | 66 | dex::u2 MapEntryType() const { return map_entry_type_; } 67 | 68 | private: 69 | dex::u4 offset_ = 0; 70 | dex::u4 count_ = 0; 71 | const dex::u2 map_entry_type_; 72 | }; 73 | 74 | // A specialized container for a .dex index section 75 | // (strings, types, fields, methods, ...) 76 | template <class T> 77 | class Index { 78 | public: 79 | explicit Index(dex::u2 mapEntryType) : map_entry_type_(mapEntryType) {} 80 | ~Index() = default; 81 | 82 | Index(const Index&) = delete; 83 | Index& operator=(const Index&) = delete; 84 | 85 | dex::u4 Init(dex::u4 offset, dex::u4 count) { 86 | values_.reset(new T[count]); 87 | offset_ = offset; 88 | count_ = count; 89 | return size(); 90 | } 91 | 92 | void Free() { 93 | values_.reset(); 94 | offset_ = 0; 95 | count_ = 0; 96 | } 97 | 98 | dex::u4 SectionOffset() const { 99 | SLICER_CHECK(offset_ > 0 && offset_ % 4 == 0); 100 | return ItemsCount() > 0 ?
offset_ : 0; 101 | } 102 | 103 | T* begin() { return values_.get(); } 104 | T* end() { return begin() + count_; } 105 | 106 | bool empty() const { return count_ == 0; } 107 | 108 | dex::u4 ItemsCount() const { return count_; } 109 | const T* data() const { return values_.get(); } 110 | dex::u4 size() const { return count_ * sizeof(T); } 111 | 112 | T& operator[](dex::u4 i) { 113 | SLICER_CHECK(i < count_); 114 | return values_[i]; 115 | } 116 | 117 | dex::u2 MapEntryType() const { return map_entry_type_; } 118 | 119 | private: 120 | dex::u4 offset_ = 0; 121 | dex::u4 count_ = 0; 122 | std::unique_ptr<T[]> values_; 123 | const dex::u2 map_entry_type_; 124 | }; 125 | 126 | // Creates an in-memory .dex image from a .dex IR 127 | class Writer { 128 | // The container for the individual sections in a .dex image 129 | // (factored out from Writer for a more granular lifetime control) 130 | struct DexImage { 131 | DexImage() 132 | : string_ids(dex::kStringIdItem), 133 | type_ids(dex::kTypeIdItem), 134 | proto_ids(dex::kProtoIdItem), 135 | field_ids(dex::kFieldIdItem), 136 | method_ids(dex::kMethodIdItem), 137 | class_defs(dex::kClassDefItem), 138 | string_data(dex::kStringDataItem), 139 | type_lists(dex::kTypeList), 140 | debug_info(dex::kDebugInfoItem), 141 | encoded_arrays(dex::kEncodedArrayItem), 142 | code(dex::kCodeItem), 143 | class_data(dex::kClassDataItem), 144 | ann_directories(dex::kAnnotationsDirectoryItem), 145 | ann_set_ref_lists(dex::kAnnotationSetRefList), 146 | ann_sets(dex::kAnnotationSetItem), 147 | ann_items(dex::kAnnotationItem), 148 | map_list(dex::kMapList) {} 149 | 150 | Index<dex::StringId> string_ids; 151 | Index<dex::TypeId> type_ids; 152 | Index<dex::ProtoId> proto_ids; 153 | Index<dex::FieldId> field_ids; 154 | Index<dex::MethodId> method_ids; 155 | Index<dex::ClassDef> class_defs; 156 | 157 | Section string_data; 158 | Section type_lists; 159 | Section debug_info; 160 | Section encoded_arrays; 161 | Section code; 162 | Section class_data; 163 | Section ann_directories; 164 | Section ann_set_ref_lists; 165 | Section ann_sets; 166 |
Section ann_items; 167 | Section map_list; 168 | }; 169 | 170 | public: 171 | // interface for allocating the final in-memory image 172 | struct Allocator { 173 | virtual void* Allocate(size_t size) = 0; 174 | virtual void Free(void* ptr) = 0; 175 | virtual ~Allocator() = default; 176 | }; 177 | 178 | public: 179 | explicit Writer(std::shared_ptr<ir::DexFile> dex_ir) : dex_ir_(dex_ir) {} 180 | ~Writer() = default; 181 | 182 | Writer(const Writer&) = delete; 183 | Writer& operator=(const Writer&) = delete; 184 | 185 | // .dex image creation 186 | dex::u1* CreateImage(Allocator* allocator, size_t* new_image_size); 187 | 188 | private: 189 | // helpers for creating various .dex sections 190 | dex::u4 CreateStringDataSection(dex::u4 section_offset); 191 | dex::u4 CreateMapSection(dex::u4 section_offset); 192 | dex::u4 CreateAnnItemSection(dex::u4 section_offset); 193 | dex::u4 CreateAnnSetsSection(dex::u4 section_offset); 194 | dex::u4 CreateAnnSetRefListsSection(dex::u4 section_offset); 195 | dex::u4 CreateTypeListsSection(dex::u4 section_offset); 196 | dex::u4 CreateCodeItemSection(dex::u4 section_offset); 197 | dex::u4 CreateDebugInfoSection(dex::u4 section_offset); 198 | dex::u4 CreateClassDataSection(dex::u4 section_offset); 199 | dex::u4 CreateAnnDirectoriesSection(dex::u4 section_offset); 200 | dex::u4 CreateEncodedArrayItemSection(dex::u4 section_offset); 201 | 202 | // back-fill the indexes 203 | void FillTypes(); 204 | void FillProtos(); 205 | void FillFields(); 206 | void FillMethods(); 207 | void FillClassDefs(); 208 | 209 | // helpers for writing .dex structures 210 | dex::u4 WriteTypeList(const std::vector<ir::Type*>& types); 211 | dex::u4 WriteAnnotationItem(const ir::Annotation* ir_annotation); 212 | dex::u4 WriteAnnotationSet(const ir::AnnotationSet* ir_annotation_set); 213 | dex::u4 WriteAnnotationSetRefList(const ir::AnnotationSetRefList* ir_annotation_set_ref_list); 214 | dex::u4 WriteClassAnnotations(const ir::Class* ir_class); 215 | dex::u4 WriteDebugInfo(const
ir::DebugInfo* ir_debug_info); 216 | dex::u4 WriteCode(const ir::Code* ir_code); 217 | dex::u4 WriteClassData(const ir::Class* ir_class); 218 | dex::u4 WriteClassStaticValues(const ir::Class* ir_class); 219 | 220 | // Map indexes from the original .dex to the 221 | // corresponding index in the new image 222 | dex::u4 MapStringIndex(dex::u4 index) const; 223 | dex::u4 MapTypeIndex(dex::u4 index) const; 224 | dex::u4 MapFieldIndex(dex::u4 index) const; 225 | dex::u4 MapMethodIndex(dex::u4 index) const; 226 | 227 | // writing parts of a class definition 228 | void WriteInstructions(slicer::ArrayView<const dex::u2> instructions); 229 | void WriteTryBlocks(const ir::Code* ir_code); 230 | void WriteEncodedField(const ir::EncodedField* irEncodedField, dex::u4* base_index); 231 | void WriteEncodedMethod(const ir::EncodedMethod* irEncodedMethod, dex::u4* base_index); 232 | 233 | dex::u4 FilePointer(const ir::Node* ir_node) const; 234 | 235 | private: 236 | std::shared_ptr<ir::DexFile> dex_ir_; 237 | std::unique_ptr<DexImage> dex_; 238 | 239 | // CONSIDER: we can have multiple maps per IR node type 240 | // (that's what the reader does) 241 | std::map<const ir::Node*, dex::u4> node_offset_; 242 | }; 243 | 244 | } // namespace dex 245 | -------------------------------------------------------------------------------- /test.cc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LSPosed/DexHelper/bf734d1decfef36783bae0df0cdc897310f5c16e/test.cc --------------------------------------------------------------------------------
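Usage sketch for /slicer/hash_table.h: the header comment lists four requirements for the Hash template argument (hash function, key compare, key extraction, copy semantics). The block below is a minimal, standalone sketch of a policy type that would satisfy them. Symbol and SymbolHasher are hypothetical names invented for illustration and are not part of slicer; the FNV-1a hash is only an example choice, and the HashTable template parameter order mentioned in the trailing comment is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical value type stored by pointer in the table (not part of slicer).
struct Symbol {
  std::string name;
  int id;
};

// Hypothetical policy type covering the four documented requirements.
// The methods are const because the table calls them from const lookups.
struct SymbolHasher {
  // 1. hash function (32-bit FNV-1a, picked only for illustration)
  std::uint32_t Hash(const std::string& key) const {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : key) {
      h ^= c;
      h *= 16777619u;
    }
    return h;
  }

  // 2. key compare: does the stored value match the lookup key?
  bool Compare(const std::string& key, const Symbol* value) const {
    return key == value->name;
  }

  // 3. key extraction: the table does not store keys, so it asks for them
  std::string GetKey(const Symbol* value) const { return value->name; }

  // 4. copy semantics: the implicit copy constructor is sufficient
};

// With the real header available, the table would presumably be declared as
//   slicer::HashTable<std::string, Symbol, SymbolHasher> table;
// (assumption: the parameters are Key, T, Hash in that order).
```

Because the table keeps only T* values and recomputes keys through GetKey, the key must be derivable from the stored object, and Hash(GetKey(v)) has to stay stable for as long as v is in the table.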