├── screenshots
│   ├── avx_demo_two.gif
│   └── avx_title_card.png
├── LICENSE
├── README.md
├── .gitignore
├── misc
│   └── scrape.py
└── plugins
    └── microavx.py

/screenshots/avx_demo_two.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gaasedelen/microavx/HEAD/screenshots/avx_demo_two.gif
--------------------------------------------------------------------------------
/screenshots/avx_title_card.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gaasedelen/microavx/HEAD/screenshots/avx_title_card.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 gaasedelen
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # MicroAVX - An AVX Lifter for the Hex-Rays Decompiler
2 |
3 |

4 | ![MicroAVX Plugin](screenshots/avx_title_card.png)
5 |

6 |
7 | ## Overview
8 |
9 | MicroAVX is an extension of the [IDA Pro](https://www.hex-rays.com/products/ida/) decompiler, adding partial support for a number of common instructions from Intel's [Advanced Vector Extensions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) (AVX). This plugin demonstrates how the Hex-Rays microcode can be used to lift and decompile new or previously unsupported instructions.
10 |
11 | There are no plans to further develop MicroAVX, or to extend its coverage to the complete set of AVX instructions. This plugin is labeled only as a prototype & code resource for the community.
12 |
13 | For more information, please read the associated [blogpost](https://blog.ret2.io/2020/07/22/ida-pro-avx-decompiler).
14 |
15 | ## Releases
16 |
17 | * v0.1 -- Initial release
18 |
19 | ## Installation
20 |
21 | MicroAVX is a cross-platform (Windows, macOS, Linux) Python 2/3 plugin. It has zero third-party dependencies, making the code both portable and easy to install.
22 |
23 | 1. From your disassembler's python console, run the following command to find its plugin directory:
24 |     - **IDA Pro**: `os.path.join(idaapi.get_user_idadir(), "plugins")`
25 |
26 | 2. Copy the contents of this repository's `/plugins/` folder to the listed directory.
27 | 3. Restart your disassembler.
28 |
29 | This plugin is only supported for IDA 7.5 and newer.
30 |
31 | ## Usage
32 |
33 | The MicroAVX plugin loads automatically when an x86_64 executable / IDB is opened in IDA. Simply attempt to decompile any function containing AVX instructions, and the plugin will lift any instructions that it supports.
34 |
35 |

36 | ![Decompiling AVX](screenshots/avx_demo_two.gif)
37 |

38 | 39 | (please note, there is no right click 'AVX toggle' in this release) 40 | 41 | ## Authors 42 | 43 | * Markus Gaasedelen ([@gaasedelen](https://twitter.com/gaasedelen)) -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /misc/scrape.py: -------------------------------------------------------------------------------- 1 | import collections 2 | 3 | import idc 4 | import ida_name 5 | import idautils 6 | import ida_funcs 7 | import ida_hexrays 8 | 9 | #----------------------------------------------------------------------------- 10 | # Scraping Code 11 | #----------------------------------------------------------------------------- 12 | 13 | class MinsnVisitor(ida_hexrays.minsn_visitor_t): 14 | """ 15 | Hex-Rays Micro-instruction Visitor 16 | """ 17 | found = set() 18 | 19 | def visit_minsn(self): 20 | 21 | # we only care about external (unsupported) instructions 22 | if self.curins.opcode != ida_hexrays.m_ext: 23 | return 0 24 | 25 | ins_text = idc.GetDisasm(self.curins.ea) 26 | ins_op = ins_text.split(" ")[0] 27 | 28 | print("- 0x%08X: UNSUPPORTED %s" % (self.curins.ea, ins_text)) 29 | self.found.add(ins_op) 30 | return 0 31 | 32 | def scrape_unsupported_instructions(): 33 | """ 34 | Scrape all 'external' (unsupported) decompiler 
instructions from this IDB. 35 | 36 | Returns a tuple of two maps: 37 | ext2func = { opcode: set([func_ea, func2_ea, ...]) } 38 | func2ext = { func_ea: set([opcode1, opcode2, opcode3]) } 39 | 40 | """ 41 | miv = MinsnVisitor() 42 | ext2func = collections.defaultdict(set) 43 | func2ext = {} 44 | 45 | for address in idautils.Functions(): 46 | 47 | #address = 0x1800017E0 48 | print("0x%08X: DECOMPILING" % address) 49 | func = ida_funcs.get_func(address) 50 | 51 | func_mbr = ida_hexrays.mba_ranges_t(func) 52 | hf = ida_hexrays.hexrays_failure_t() 53 | flags = ida_hexrays.DECOMP_NO_XREFS | ida_hexrays.DECOMP_NO_WAIT | ida_hexrays.DECOMP_WARNINGS 54 | mba = ida_hexrays.gen_microcode(func_mbr, hf, None, flags, ida_hexrays.MMAT_GENERATED) 55 | 56 | if not mba: 57 | print(" - 0x%08x: FAILED %s" % (hf.errea, hf.str)) 58 | continue 59 | 60 | miv.found = set() 61 | mba.for_all_insns(miv) 62 | 63 | # opcode --> [func_ea, func2_ea, ..] 64 | for ins_op in miv.found: 65 | ext2func[ins_op].add(address) 66 | 67 | # func_ea --> [ins_op, ins_op2, ..] 68 | func2ext[address] = miv.found 69 | 70 | print("\nDone scraping...\n") 71 | return (ext2func, func2ext) 72 | 73 | def print_stats(ext2func): 74 | """ 75 | Print stats about the scraped instructions. 76 | """ 77 | print("-"*60) 78 | 79 | func_size_cache = {} 80 | all_funcs = set() 81 | 82 | print("\nFUNC USES -- UNSUPPORTED INSTR (%u types)\n" % len(ext2func)) 83 | for key in sorted(ext2func, key=lambda key: len(ext2func[key]), reverse=True): 84 | function_addresses = ext2func[key] 85 | all_funcs |= function_addresses 86 | 87 | # print the unsupported instruction op, and how many funcs use it 88 | print(" - USES: %d - OP: %s" % (len(function_addresses), key)) 89 | 90 | # compute the size of all the funcs that use this op.. 
91 |         func_sizes = []
92 |         for address in function_addresses:
93 |
94 |             # try to grab the func size if we cached it already
95 |             func_size = func_size_cache.get(address, None)
96 |             if func_size:
97 |                 func_sizes.append((func_size, address))
98 |                 continue
99 |
100 |             # compute the size of the function
101 |             func = ida_funcs.get_func(address)
102 |             func_size = ida_funcs.calc_func_size(func)
103 |             func_sizes.append((func_size, address))
104 |
105 |             # cache the func size for future use
106 |             func_size_cache[address] = func_size
107 |
108 |         # print a few small functions that use this unsupported op..
109 |         func_sizes.sort()
110 |         for size, address in func_sizes[:5]:
111 |             print(" -- SAMPLE FUNC 0x%08X (%u bytes)" % (address, size))
112 |
113 |     print("\n" + "-"*60 + "\n")
114 |     print("AFFLICTED FUNCTIONS (%u funcs)\n" % len(all_funcs))
115 |
116 |     all_funcs = sorted(all_funcs)
117 |     for ea in all_funcs:
118 |         function_name = ida_name.get_short_name(ea)
119 |         print("0x%08X: %s" % (ea, function_name))
120 |
121 | #-----------------------------------------------------------------------------
122 | # Main
123 | #-----------------------------------------------------------------------------
124 |
125 | print("Scraping instructions...")
126 | ext2func, func2ext = scrape_unsupported_instructions()
127 | print("Dumping results...")
128 | print_stats(ext2func)
129 |
--------------------------------------------------------------------------------
/plugins/microavx.py:
--------------------------------------------------------------------------------
1 | import sys
2 |
3 | import idc
4 | import ida_ua
5 | import ida_ida
6 | import ida_idp
7 | import ida_funcs
8 | import ida_allins
9 | import ida_idaapi
10 | import ida_loader
11 | import ida_kernwin
12 | import ida_typeinf
13 | import ida_hexrays
14 |
15 | #-----------------------------------------------------------------------------
16 | # Util
17 | #-----------------------------------------------------------------------------
18 |
19 |
# an empty / NULL mop_t
20 | NO_MOP = ida_hexrays.mop_t()
21 |
22 | # EVEX-encoded instruction, intel.hpp (ida sdk)
23 | AUX_EVEX = 0x10000
24 |
25 | # register widths (bytes)
26 | XMM_SIZE = 16
27 | YMM_SIZE = 32
28 | ZMM_SIZE = 64
29 |
30 | # type sizes (bytes)
31 | FLOAT_SIZE = 4
32 | DOUBLE_SIZE = 8
33 | DWORD_SIZE = 4
34 | QWORD_SIZE = 8
35 |
36 | def size_of_operand(op):
37 |     """
38 |     From ...
39 |     https://reverseengineering.stackexchange.com/questions/19843/how-can-i-get-the-byte-size-of-an-operand-in-ida-pro
40 |     """
41 |     tbyte = 8
42 |     dt_ldbl = 8
43 |     n_bytes = [ 1, 2, 4, 4, 8,
44 |                 tbyte, -1, 8, 16, -1,
45 |                 -1, 6, -1, 4, 4,
46 |                 dt_ldbl, 32, 64 ]
47 |     return n_bytes[op.dtype]
48 |
49 | def is_amd64_idb():
50 |     """
51 |     Return true if an x86_64 IDB is open.
52 |     """
53 |     if ida_idp.ph.id != ida_idp.PLFM_386:
54 |         return False
55 |     return ida_ida.cvar.inf.is_64bit()
56 |
57 | def bytes2bits(n):
58 |     """
59 |     Return the number of bits represented by 'n' bytes.
60 |     """
61 |     return n * 8
62 |
63 | def is_mem_op(op):
64 |     """
65 |     Return true if the given operand *looks* like a mem op.
66 |     """
67 |     return op.type in [ida_ua.o_mem, ida_ua.o_displ, ida_ua.o_phrase]
68 |
69 | def is_reg_op(op):
70 |     """
71 |     Return true if the given operand is a register.
72 |     """
73 |     return op.type in [ida_ua.o_reg]
74 |
75 | def is_avx_reg(op):
76 |     """
77 |     Return true if the given operand is an XMM or YMM register.
78 |     """
79 |     return bool(is_xmm_reg(op) or is_ymm_reg(op))
80 |
81 | def is_xmm_reg(op):
82 |     """
83 |     Return true if the given operand is an XMM register.
84 |     """
85 |     if op.type != ida_ua.o_reg:
86 |         return False
87 |     if op.dtype != ida_ua.dt_byte16:
88 |         return False
89 |     return True
90 |
91 | def is_ymm_reg(op):
92 |     """
93 |     Return true if the given operand is a YMM register.
94 | """ 95 | if op.type != ida_ua.o_reg: 96 | return False 97 | if op.dtype != ida_ua.dt_byte32: 98 | return False 99 | return True 100 | 101 | def is_avx_512(insn): 102 | """ 103 | Return true if the given insn_t is an AVX512 instruction. 104 | """ 105 | return bool(insn.auxpref & AUX_EVEX) 106 | 107 | #----------------------------------------------------------------------------- 108 | # Microcode Helpers 109 | #----------------------------------------------------------------------------- 110 | 111 | def get_ymm_mreg(xmm_mreg): 112 | """ 113 | Return the YMM microcode register for a given XMM register. 114 | """ 115 | xmm_reg = ida_hexrays.mreg2reg(xmm_mreg, XMM_SIZE) 116 | xmm_name = ida_idp.get_reg_name(xmm_reg, XMM_SIZE) 117 | xmm_number = int(xmm_name.split("mm")[-1]) 118 | 119 | # compute the ymm mreg id 120 | ymm_reg = ida_idp.str2reg("ymm%u" % xmm_number) 121 | ymm_mreg = ida_hexrays.reg2mreg(ymm_reg) 122 | 123 | # sanity check... 124 | xmm_name = ida_hexrays.get_mreg_name(xmm_mreg, XMM_SIZE) 125 | ymm_name = ida_hexrays.get_mreg_name(ymm_mreg, YMM_SIZE) 126 | assert xmm_name[1:] == ymm_name[1:], "Reg escalation did not work... (%s, %s)" % (xmm_name, ymm_name) 127 | 128 | # return the ymm microcode register id 129 | return ymm_mreg 130 | 131 | def clear_upper(cdg, xmm_mreg, op_size=XMM_SIZE): 132 | """ 133 | Extend the given xmm reg, clearing the upper bits (through ymm). 134 | """ 135 | ymm_mreg = get_ymm_mreg(xmm_mreg) 136 | 137 | xmm_mop = ida_hexrays.mop_t(xmm_mreg, op_size) 138 | ymm_mop = ida_hexrays.mop_t(ymm_mreg, YMM_SIZE) 139 | 140 | return cdg.emit(ida_hexrays.m_xdu, xmm_mop, NO_MOP, ymm_mop) 141 | 142 | def store_operand_hack(cdg, op_num, new_mop): 143 | """ 144 | XXX: why is there a load_operand(), but no inverse.. ? 145 | """ 146 | 147 | # emit a 'load' operation... 148 | memX = cdg.load_operand(op_num) 149 | assert memX != ida_hexrays.mr_none, "Invalid op_num..." 
150 | 151 | # since this is gonna be kind of hacky, let's make sure a load was actually emitted 152 | ins = cdg.mb.tail 153 | if ins.opcode != ida_hexrays.m_ldx: 154 | if ins.prev.opcode != ida_hexrays.m_ldx: 155 | raise ValueError("Hehe, hack failed :-( (insn 0x%08X op 0x%02X)" % (cdg.insn.ea, ins.opcode)) 156 | prev = ins.prev 157 | cdg.mb.make_nop(ins) 158 | ins = prev 159 | assert ins.d.size == new_mop.size, "%u vs %u" % (new_mop.size, ins.d.size) 160 | 161 | # convert the load to a store :^) 162 | ins.opcode = ida_hexrays.m_stx 163 | ins.d = ins.r # d = op mem offset 164 | ins.r = ins.l # r = op mem segm 165 | ins.l = new_mop # l = value to store (mop_t) 166 | 167 | return ins 168 | 169 | #----------------------------------------------------------------------------- 170 | # Intrinsic Helper 171 | #----------------------------------------------------------------------------- 172 | 173 | class AVXIntrinsic(object): 174 | """ 175 | This class helps with generating simple intrinsic calls in microcode. 
176 | """ 177 | 178 | def __init__(self, cdg, name): 179 | self.cdg = cdg 180 | 181 | # call info, sort of like func_type_data_t() 182 | self.call_info = ida_hexrays.mcallinfo_t() 183 | self.call_info.cc = ida_typeinf.CM_CC_FASTCALL 184 | self.call_info.callee = ida_idaapi.BADADDR 185 | self.call_info.solid_args = 0 186 | self.call_info.role = ida_hexrays.ROLE_UNK 187 | self.call_info.flags = ida_hexrays.FCI_SPLOK | ida_hexrays.FCI_FINAL | ida_hexrays.FCI_PROP 188 | 189 | # the actual 'call' microcode insn 190 | self.call_insn = ida_hexrays.minsn_t(cdg.insn.ea) 191 | self.call_insn.opcode = ida_hexrays.m_call 192 | self.call_insn.l.make_helper(name) 193 | self.call_insn.d.t = ida_hexrays.mop_f 194 | self.call_insn.d.f = self.call_info 195 | 196 | # temp return type 197 | self.call_info.return_type = ida_typeinf.tinfo_t() 198 | self.call_insn.d.size = 0 199 | 200 | def set_return_reg(self, mreg, type_string): 201 | """ 202 | Set the return register of the function call, with a type string. 203 | """ 204 | ret_tinfo = ida_typeinf.tinfo_t() 205 | ret_tinfo.get_named_type(None, type_string) 206 | return self.set_return_reg_type(mreg, ret_tinfo) 207 | 208 | def set_return_reg_basic(self, mreg, basic_type): 209 | """ 210 | Set the return register of the function call, with a basic type assigned. 211 | """ 212 | ret_tinfo = ida_typeinf.tinfo_t(basic_type) 213 | return self.set_return_reg_type(mreg, ret_tinfo) 214 | 215 | def set_return_reg_type(self, mreg, ret_tinfo): 216 | """ 217 | Set the return register of the function call, with a complex type. 
218 |         """
219 |         self.call_info.return_type = ret_tinfo
220 |         self.call_insn.d.size = ret_tinfo.get_size()
221 |
222 |         self.mov_insn = ida_hexrays.minsn_t(self.cdg.insn.ea)
223 |         self.mov_insn.opcode = ida_hexrays.m_mov
224 |         self.mov_insn.l.t = ida_hexrays.mop_d
225 |         self.mov_insn.l.d = self.call_insn
226 |         self.mov_insn.l.size = self.call_insn.d.size
227 |         self.mov_insn.d.t = ida_hexrays.mop_r
228 |         self.mov_insn.d.r = mreg
229 |         self.mov_insn.d.size = self.call_insn.d.size
230 |
231 |         if ret_tinfo.is_decl_floating():
232 |             self.mov_insn.set_fpinsn()
233 |
234 |     def add_argument_reg(self, mreg, type_string):
235 |         """
236 |         Add a register argument with a given type string to the function argument list.
237 |         """
238 |         op_tinfo = ida_typeinf.tinfo_t()
239 |         op_tinfo.get_named_type(None, type_string)
240 |         return self.add_argument_reg_type(mreg, op_tinfo)
241 |
242 |     def add_argument_reg_basic(self, mreg, basic_type):
243 |         """
244 |         Add a register argument with a basic type to the function argument list.
245 |         """
246 |         op_tinfo = ida_typeinf.tinfo_t(basic_type)
247 |         return self.add_argument_reg_type(mreg, op_tinfo)
248 |
249 |     def add_argument_reg_type(self, mreg, op_tinfo):
250 |         """
251 |         Add a register argument of the given type to the function argument list.
252 |         """
253 |         call_arg = ida_hexrays.mcallarg_t()
254 |         call_arg.t = ida_hexrays.mop_r
255 |         call_arg.r = mreg
256 |         call_arg.type = op_tinfo
257 |         call_arg.size = op_tinfo.get_size()
258 |
259 |         self.call_info.args.push_back(call_arg)
260 |         self.call_info.solid_args += 1
261 |
262 |     def add_argument_imm(self, value, basic_type):
263 |         """
264 |         Add an immediate value to the function argument list.
265 | """ 266 | op_tinfo = ida_typeinf.tinfo_t(basic_type) 267 | 268 | mop_imm = ida_hexrays.mop_t() 269 | mop_imm.make_number(value, op_tinfo.get_size()) 270 | 271 | call_arg = ida_hexrays.mcallarg_t() 272 | call_arg.make_number(value, op_tinfo.get_size()) 273 | call_arg.type = op_tinfo 274 | 275 | self.call_info.args.push_back(call_arg) 276 | self.call_info.solid_args += 1 277 | 278 | def emit(self): 279 | """ 280 | Emit the intrinsic call to the generated microcode. 281 | """ 282 | self.cdg.mb.insert_into_block(self.mov_insn, self.cdg.mb.tail) 283 | 284 | #----------------------------------------------------------------------------- 285 | # AVX Lifter 286 | #----------------------------------------------------------------------------- 287 | 288 | class AVXLifter(ida_hexrays.microcode_filter_t): 289 | """ 290 | A Hex-Rays microcode filter to lift AVX instructions during decompilation. 291 | """ 292 | 293 | def __init__(self): 294 | super(AVXLifter, self).__init__() 295 | self._avx_handlers = \ 296 | { 297 | 298 | # Compares (Scalar, Single / Double-Precision) 299 | ida_allins.NN_vcomiss: self.vcomiss, 300 | ida_allins.NN_vcomisd: self.vcomisd, 301 | ida_allins.NN_vucomiss: self.vucomiss, 302 | ida_allins.NN_vucomisd: self.vucomisd, 303 | 304 | # Conversions 305 | ida_allins.NN_vcvttss2si: self.vcvttss2si, 306 | ida_allins.NN_vcvtdq2ps: self.vcvtdq2ps, 307 | ida_allins.NN_vcvtsi2ss: self.vcvtsi2ss, 308 | ida_allins.NN_vcvtps2pd: self.vcvtps2pd, 309 | ida_allins.NN_vcvtss2sd: self.vcvtss2sd, 310 | 311 | # Mov (DWORD / QWORD) 312 | ida_allins.NN_vmovd: self.vmovd, 313 | ida_allins.NN_vmovq: self.vmovq, 314 | 315 | # Mov (Scalar, Single / Double-Precision) 316 | ida_allins.NN_vmovss: self.vmovss, 317 | ida_allins.NN_vmovsd: self.vmovsd, 318 | 319 | # Mov (Packed Single-Precision, Packed Integers) 320 | ida_allins.NN_vmovaps: self.v_mov_ps_dq, 321 | ida_allins.NN_vmovups: self.v_mov_ps_dq, 322 | ida_allins.NN_vmovdqa: self.v_mov_ps_dq, 323 | ida_allins.NN_vmovdqu: 
self.v_mov_ps_dq, 324 | 325 | # Bitwise (Packed Single-Precision) 326 | ida_allins.NN_vorps: self.v_bitwise_ps, 327 | ida_allins.NN_vandps: self.v_bitwise_ps, 328 | ida_allins.NN_vxorps: self.v_bitwise_ps, 329 | 330 | # Math (Scalar Single-Precision) 331 | ida_allins.NN_vaddss: self.v_math_ss, 332 | ida_allins.NN_vsubss: self.v_math_ss, 333 | ida_allins.NN_vmulss: self.v_math_ss, 334 | ida_allins.NN_vdivss: self.v_math_ss, 335 | 336 | # Math (Scalar Double-Precision) 337 | ida_allins.NN_vaddsd: self.v_math_sd, 338 | ida_allins.NN_vsubsd: self.v_math_sd, 339 | ida_allins.NN_vmulsd: self.v_math_sd, 340 | ida_allins.NN_vdivsd: self.v_math_sd, 341 | 342 | # Math (Packed Single-Precision) 343 | ida_allins.NN_vaddps: self.v_math_ps, 344 | ida_allins.NN_vsubps: self.v_math_ps, 345 | ida_allins.NN_vmulps: self.v_math_ps, 346 | ida_allins.NN_vdivps: self.v_math_ps, 347 | 348 | # Square Root 349 | ida_allins.NN_vsqrtss: self.vsqrtss, 350 | ida_allins.NN_vsqrtps: self.vsqrtps, 351 | 352 | # Shuffle (Packed Single-Precision) 353 | ida_allins.NN_vshufps: self.vshufps, 354 | 355 | } 356 | 357 | def match(self, cdg): 358 | """ 359 | Return true if the lifter supports this AVX instruction. 360 | """ 361 | if is_avx_512(cdg.insn): 362 | return False 363 | return cdg.insn.itype in self._avx_handlers 364 | 365 | def apply(self, cdg): 366 | """ 367 | Generate microcode for the current instruction. 368 | """ 369 | cdg.store_operand = lambda x, y: store_operand_hack(cdg, x, y) 370 | return self._avx_handlers[cdg.insn.itype](cdg, cdg.insn) 371 | 372 | def install(self): 373 | """ 374 | Install the AVX codegen lifter. 375 | """ 376 | ida_hexrays.install_microcode_filter(self, True) 377 | print("Installed AVX lifter... (%u instr supported)" % len(self._avx_handlers)) 378 | 379 | def remove(self): 380 | """ 381 | Remove the AVX codegen lifter. 
382 | """ 383 | ida_hexrays.install_microcode_filter(self, False) 384 | print("Removed AVX lifter...") 385 | 386 | #-------------------------------------------------------------------------- 387 | # Compare Instructions 388 | #-------------------------------------------------------------------------- 389 | 390 | # 391 | # the intel manual states that all of these comparison instructions are 392 | # effectively identical to their SSE counterparts. because of this, we 393 | # simply twiddle the decoded insn to make it appear as SSE and bail. 394 | # 395 | # since the decompiler appears to operate on the same decoded instruction 396 | # data that we meddled with, it will lift the instruction in the same way 397 | # it would lift the SSE version we alias each AVX one to. 398 | # 399 | 400 | def vcomiss(self, cdg, insn): 401 | """ 402 | VCOMISS xmm1, xmm2/m32 403 | """ 404 | insn.itype = ida_allins.NN_comiss 405 | return ida_hexrays.MERR_INSN 406 | 407 | def vucomiss(self, cdg, insn): 408 | """ 409 | VUCOMISS xmm1, xmm2/m32 410 | """ 411 | insn.itype = ida_allins.NN_ucomiss 412 | return ida_hexrays.MERR_INSN 413 | 414 | def vcomisd(self, cdg, insn): 415 | """ 416 | VCOMISD xmm1, xmm2/m64 417 | """ 418 | insn.itype = ida_allins.NN_comisd 419 | return ida_hexrays.MERR_INSN 420 | 421 | def vucomisd(self, cdg, insn): 422 | """ 423 | VUCOMISD xmm1, xmm2/m64 424 | """ 425 | insn.itype = ida_allins.NN_ucomisd 426 | return ida_hexrays.MERR_INSN 427 | 428 | #------------------------------------------------------------------------- 429 | # Conversion Instructions 430 | #------------------------------------------------------------------------- 431 | 432 | def vcvttss2si(self, cdg, insn): 433 | """ 434 | CVTTSS2SI r64, xmm1/m32 435 | CVTTSS2SI r32, xmm1/m32 436 | """ 437 | insn.itype = ida_allins.NN_cvttss2si 438 | return ida_hexrays.MERR_INSN 439 | 440 | def vcvtdq2ps(self, cdg, insn): 441 | """ 442 | VCVTDQ2PS xmm1, xmm2/m128 443 | VCVTDQ2PS ymm1, ymm2/m256 444 | """ 445 | 
op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE 446 | 447 | # op2 -- m128/m256 448 | if is_mem_op(insn.Op2): 449 | r_reg = cdg.load_operand(1) 450 | 451 | # op2 -- xmm2/ymm2 452 | else: 453 | assert is_avx_reg(insn.Op2) 454 | r_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 455 | 456 | # op1 -- xmm1/ymm1 457 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 458 | 459 | # 460 | # intrinsics: 461 | # __m128 _mm_cvtepi32_ps (__m128i a) 462 | # __m256 _mm256_cvtepi32_ps (__m256i a) 463 | # 464 | 465 | bit_size = bytes2bits(op_size) 466 | bit_str = str(bit_size) if op_size == YMM_SIZE else "" 467 | intrinsic_name = "_mm%s_cvtepi32_ps" % bit_str 468 | 469 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name) 470 | avx_intrinsic.add_argument_reg(r_reg, "__m%ui" % bit_size) 471 | avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size) 472 | avx_intrinsic.emit() 473 | 474 | # clear upper 128 bits of ymm1 475 | if op_size == XMM_SIZE: 476 | clear_upper(cdg, d_reg) 477 | 478 | return ida_hexrays.MERR_OK 479 | 480 | def vcvtsi2ss(self, cdg, insn): 481 | """ 482 | VCVTSI2SS xmm1, xmm2, r/m32 483 | VCVTSI2SS xmm1, xmm2, r/m64 484 | """ 485 | src_size = size_of_operand(insn.Op3) 486 | 487 | # op3 -- m32/m64 488 | if is_mem_op(insn.Op3): 489 | r_reg = cdg.load_operand(2) 490 | 491 | # op3 -- r32/r64 492 | else: 493 | assert is_reg_op(insn.Op3) 494 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 495 | 496 | # op2 -- xmm2 497 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 498 | 499 | # op1 -- xmm1 500 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 501 | 502 | # create a temp register to compute the final result into 503 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE) 504 | t0_mop = ida_hexrays.mop_t(t0_result, FLOAT_SIZE) 505 | 506 | # create a temp register to downcast a double to a float (if needed) 507 | t1_i2f = cdg.mba.alloc_kreg(src_size) 508 | t1_mop = ida_hexrays.mop_t(t1_i2f, src_size) 509 | 510 | # copy xmm2 into the temp result reg, as we need its upper 3 dwords 511 | 
cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0) 512 | 513 | # convert the integer (op3) to a float/double depending on its size 514 | cdg.emit(ida_hexrays.m_i2f, src_size, r_reg, 0, t1_i2f, 0) 515 | 516 | # reduce precision on the converted floating point value if needed (only r64/m64) 517 | cdg.emit(ida_hexrays.m_f2f, t1_mop, NO_MOP, t0_mop) 518 | 519 | # transfer the fully computed temp register to the real dest reg 520 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0) 521 | cdg.mba.free_kreg(t0_result, XMM_SIZE) 522 | cdg.mba.free_kreg(t1_i2f, src_size) 523 | 524 | # clear upper 128 bits of ymm1 525 | clear_upper(cdg, d_reg) 526 | 527 | return ida_hexrays.MERR_OK 528 | 529 | def vcvtps2pd(self, cdg, insn): 530 | """ 531 | VCVTPS2PD xmm1, xmm2/m64 532 | VCVTPS2PD ymm1, ymm2/m128 533 | """ 534 | src_size = QWORD_SIZE if is_xmm_reg(insn.Op1) else XMM_SIZE 535 | 536 | # op2 -- m64/m128 537 | if is_mem_op(insn.Op2): 538 | r_reg = cdg.load_operand(1) 539 | 540 | # op2 -- xmm2/ymm2 541 | else: 542 | assert is_avx_reg(insn.Op2) 543 | r_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 544 | 545 | # op1 -- xmm1/ymm1 546 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 547 | 548 | # 549 | # intrinsics: 550 | # - __m128d _mm_cvtps_pd (__m128 a) 551 | # - __m256d _mm256_cvtps_pd (__m128 a) 552 | # 553 | 554 | bit_size = bytes2bits(src_size * 2) 555 | bit_str = "256" if (src_size * 2) == YMM_SIZE else "" 556 | intrinsic_name = "_mm%s_cvtps_pd" % bit_str 557 | 558 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name) 559 | avx_intrinsic.add_argument_reg(r_reg, "__m128") 560 | avx_intrinsic.set_return_reg(d_reg, "__m%ud" % bit_size) 561 | avx_intrinsic.emit() 562 | 563 | # clear upper 128 bits of ymm1 564 | if src_size == QWORD_SIZE: 565 | clear_upper(cdg, d_reg) 566 | 567 | return ida_hexrays.MERR_OK 568 | 569 | def vcvtss2sd(self, cdg, insn): 570 | """ 571 | VCVTSS2SD xmm1, xmm2, r/m32 572 | """ 573 | 574 | # op3 -- m32 575 | if is_mem_op(insn.Op3): 576 | r_reg 
= cdg.load_operand(2) 577 | 578 | # op3 -- r32 579 | else: 580 | assert is_reg_op(insn.Op3) 581 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 582 | 583 | r_mop = ida_hexrays.mop_t(r_reg, FLOAT_SIZE) 584 | 585 | # op2 -- xmm2 586 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 587 | 588 | # op1 -- xmm1 589 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 590 | 591 | # create a temp register to compute the final result into 592 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE) 593 | t0_mop = ida_hexrays.mop_t(t0_result, DOUBLE_SIZE) 594 | 595 | # copy xmm2 into the temp result reg, as we need its upper quadword 596 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0) 597 | 598 | # convert float (op3) to a double, storing it in the lower 64 of the temp result reg 599 | cdg.emit(ida_hexrays.m_f2f, r_mop, NO_MOP, t0_mop) 600 | 601 | # transfer the fully computed temp register to the real dest reg 602 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0) 603 | cdg.mba.free_kreg(t0_result, XMM_SIZE) 604 | 605 | # clear upper 128 bits of ymm1 606 | clear_upper(cdg, d_reg) 607 | 608 | return ida_hexrays.MERR_OK 609 | 610 | #------------------------------------------------------------------------- 611 | # Mov Instructions 612 | #------------------------------------------------------------------------- 613 | 614 | def vmovss(self, cdg, insn): 615 | """ 616 | VMOVSS xmm1, xmm2, xmm3 617 | VMOVSS xmm1, m32 618 | VMOVSS xmm1, xmm2, xmm3 619 | VMOVSS m32, xmm1 620 | """ 621 | return self._vmov_ss_sd(cdg, insn, FLOAT_SIZE) 622 | 623 | def vmovsd(self, cdg, insn): 624 | """ 625 | VMOVSD xmm1, xmm2, xmm3 626 | VMOVSD xmm1, m64 627 | VMOVSD xmm1, xmm2, xmm3 628 | VMOVSD m64, xmm1 629 | """ 630 | return self._vmov_ss_sd(cdg, insn, DOUBLE_SIZE) 631 | 632 | def _vmov_ss_sd(self, cdg, insn, data_size): 633 | """ 634 | Templated handler for scalar float/double mov instructions. 
635 | """ 636 | 637 | # op form: X, Y -- (2 operands) 638 | if insn.Op3.type == ida_ua.o_void: 639 | 640 | # op form: xmm1, m32/m64 641 | if is_xmm_reg(insn.Op1): 642 | assert is_mem_op(insn.Op2) 643 | 644 | # op2 -- m32/m64 645 | l_reg = cdg.load_operand(1) 646 | l_mop = ida_hexrays.mop_t(l_reg, data_size) 647 | 648 | # op1 -- xmm1 649 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 650 | d_mop = ida_hexrays.mop_t(d_reg, XMM_SIZE) 651 | 652 | # xmm1[:data_size] = [mem] 653 | insn = cdg.emit(ida_hexrays.m_xdu, l_mop, NO_MOP, d_mop) 654 | 655 | # clear xmm1[data_size:] bits (through ymm1) 656 | clear_upper(cdg, d_reg, data_size) 657 | 658 | return ida_hexrays.MERR_OK 659 | 660 | # op form: m32/m64, xmm1 661 | else: 662 | assert is_mem_op(insn.Op1) and is_xmm_reg(insn.Op2) 663 | 664 | # op2 -- xmm1 665 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 666 | l_mop = ida_hexrays.mop_t(l_reg, data_size) 667 | 668 | # store xmm1[:data_size] into memory at [m32/m64] (op1) 669 | insn = cdg.store_operand(0, l_mop) 670 | insn.set_fpinsn() 671 | 672 | return ida_hexrays.MERR_OK 673 | 674 | # op form: xmm1, xmm2, xmm3 -- (3 operands) 675 | else: 676 | assert is_xmm_reg(insn.Op1) and is_xmm_reg(insn.Op2) and is_xmm_reg(insn.Op3) 677 | 678 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 679 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 680 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 681 | 682 | # create a temp register to compute the final result into 683 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE) 684 | 685 | # emit the microcode for this insn 686 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0) 687 | cdg.emit(ida_hexrays.m_f2f, data_size, r_reg, 0, t0_result, 0) 688 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0) 689 | cdg.mba.free_kreg(t0_result, XMM_SIZE) 690 | 691 | # clear xmm1[data_size:] bits (through ymm1) 692 | clear_upper(cdg, d_reg, data_size) 693 | 694 | return ida_hexrays.MERR_OK 695 | 696 | # failsafe 697 | assert "Unreachable..." 
698 | return ida_hexrays.MERR_INSN 699 | 700 | def vmovd(self, cdg, insn): 701 | """ 702 | VMOVD xmm1, r32/m32 703 | VMOVD r32/m32, xmm1 704 | """ 705 | return self._vmov(cdg, insn, DWORD_SIZE) 706 | 707 | def vmovq(self, cdg, insn): 708 | """ 709 | VMOVQ xmm1, r64/m64 710 | VMOVQ r64/m64, xmm1 711 | """ 712 | return self._vmov(cdg, insn, QWORD_SIZE) 713 | 714 | def _vmov(self, cdg, insn, data_size): 715 | """ 716 | Templated handler for dword/qword mov instructions. 717 | """ 718 | 719 | # op form: xmm1, rXX/mXX 720 | if is_xmm_reg(insn.Op1): 721 | 722 | # op2 -- m32/m64 723 | if is_mem_op(insn.Op2): 724 | l_reg = cdg.load_operand(1) 725 | 726 | # op2 -- r32/r64 727 | else: 728 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 729 | 730 | # wrap the source micro-reg as a micro-operand of the specified size 731 | l_mop = ida_hexrays.mop_t(l_reg, data_size) 732 | 733 | # op1 -- xmm1 734 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 735 | d_mop = ida_hexrays.mop_t(d_reg, XMM_SIZE) 736 | 737 | # emit the microcode for this insn 738 | cdg.emit(ida_hexrays.m_xdu, l_mop, NO_MOP, d_mop) 739 | 740 | # clear upper 128 bits of ymm1 741 | clear_upper(cdg, d_reg) 742 | 743 | return ida_hexrays.MERR_OK 744 | 745 | # op form: rXX/mXX, xmm1 746 | else: 747 | assert is_xmm_reg(insn.Op2) 748 | 749 | # op2 -- xmm1 750 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 751 | l_mop = ida_hexrays.mop_t(l_reg, data_size) 752 | 753 | # op1 -- m32/m64 754 | if is_mem_op(insn.Op1): 755 | cdg.store_operand(0, l_mop) 756 | 757 | # op1 -- r32/r64 758 | else: 759 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 760 | d_mop = ida_hexrays.mop_t(d_reg, data_size) 761 | cdg.emit(ida_hexrays.m_mov, l_mop, NO_MOP, d_mop) 762 | 763 | # 764 | # TODO: the Intel manual doesn't make it entirely clear whether 765 | # the upper bits of an r32 destination need to be cleared 766 | # 767 | 768 | return ida_hexrays.MERR_OK 769 | 770 | # failsafe 771 | assert False, "Unreachable..."
772 | return ida_hexrays.MERR_INSN 773 | 774 | def v_mov_ps_dq(self, cdg, insn): 775 | """ 776 | VMOVAPS xmm1, xmm2/m128 777 | VMOVAPS ymm1, ymm2/m256 778 | VMOVAPS xmm2/m128, xmm1 779 | VMOVAPS ymm2/m256, ymm1 780 | 781 | VMOVUPS xmm1, xmm2/m128 782 | VMOVUPS ymm1, ymm2/m256 783 | VMOVUPS xmm2/m128, xmm1 784 | VMOVUPS ymm2/m256, ymm1 785 | 786 | VMOVDQA xmm1, xmm2/m128 787 | VMOVDQA xmm2/m128, xmm1 788 | VMOVDQA ymm1, ymm2/m256 789 | VMOVDQA ymm2/m256, ymm1 790 | 791 | VMOVDQU xmm1, xmm2/m128 792 | VMOVDQU xmm2/m128, xmm1 793 | VMOVDQU ymm1, ymm2/m256 794 | VMOVDQU ymm2/m256, ymm1 795 | """ 796 | 797 | # op form: reg, reg/[mem] 798 | if is_avx_reg(insn.Op1): 799 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE 800 | 801 | # op2 -- m128/m256 802 | if is_mem_op(insn.Op2): 803 | l_reg = cdg.load_operand(1) 804 | 805 | # op2 -- xmm1/ymm1 806 | else: 807 | assert is_avx_reg(insn.Op2) 808 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 809 | 810 | # wrap the source micro-reg as a micro-operand 811 | l_mop = ida_hexrays.mop_t(l_reg, op_size) 812 | 813 | # op1 -- xmmX/ymmX 814 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 815 | d_mop = ida_hexrays.mop_t(d_reg, op_size) 816 | 817 | # emit the microcode for this insn 818 | cdg.emit(ida_hexrays.m_mov, l_mop, NO_MOP, d_mop) 819 | 820 | # clear upper 128 bits of ymm1 821 | if op_size == XMM_SIZE: 822 | clear_upper(cdg, d_reg) 823 | 824 | return ida_hexrays.MERR_OK 825 | 826 | # op form: [mem], reg 827 | else: 828 | assert is_mem_op(insn.Op1) and is_avx_reg(insn.Op2) 829 | op_size = XMM_SIZE if is_xmm_reg(insn.Op2) else YMM_SIZE 830 | 831 | # op1 -- xmm1/ymm1 832 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 833 | l_mop = ida_hexrays.mop_t(l_reg, op_size) 834 | 835 | # [m128/m256] = xmm1/ymm1 836 | cdg.store_operand(0, l_mop) 837 | return ida_hexrays.MERR_OK 838 | 839 | # failsafe 840 | assert False, "Unreachable..."
841 | return ida_hexrays.MERR_INSN 842 | 843 | #------------------------------------------------------------------------- 844 | # Bitwise Instructions 845 | #------------------------------------------------------------------------- 846 | 847 | def v_bitwise_ps(self, cdg, insn): 848 | """ 849 | VORPS xmm1, xmm2, xmm3/m128 850 | VORPS ymm1, ymm2, ymm3/m256 851 | 852 | VXORPS xmm1, xmm2, xmm3/m128 853 | VXORPS ymm1, ymm2, ymm3/m256 854 | 855 | VANDPS xmm1, xmm2, xmm3/m128 856 | VANDPS ymm1, ymm2, ymm3/m256 857 | """ 858 | assert is_avx_reg(insn.Op1) and is_avx_reg(insn.Op2) 859 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE 860 | 861 | # op3 -- m128/m256 862 | if is_mem_op(insn.Op3): 863 | r_reg = cdg.load_operand(2) 864 | 865 | # op3 -- xmm3/ymm3 866 | else: 867 | assert is_avx_reg(insn.Op3) 868 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 869 | 870 | itype2mcode = \ 871 | { 872 | ida_allins.NN_vorps: ida_hexrays.m_or, 873 | ida_allins.NN_vandps: ida_hexrays.m_and, 874 | ida_allins.NN_vxorps: ida_hexrays.m_xor, 875 | } 876 | 877 | # get the hexrays microcode op to use for this instruction 878 | mcode_op = itype2mcode[insn.itype] 879 | 880 | # wrap the source micro-reg as a micro-operand 881 | r_mop = ida_hexrays.mop_t(r_reg, op_size) 882 | 883 | # op2 -- xmm2/ymm2 884 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 885 | l_mop = ida_hexrays.mop_t(l_reg, op_size) 886 | 887 | # op1 -- xmm1/ymm1 888 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 889 | d_mop = ida_hexrays.mop_t(d_reg, op_size) 890 | 891 | # emit the microcode for this insn 892 | cdg.emit(mcode_op, l_mop, r_mop, d_mop) 893 | 894 | # clear upper 128 bits of ymm1 895 | if op_size == XMM_SIZE: 896 | clear_upper(cdg, d_reg) 897 | 898 | return ida_hexrays.MERR_OK 899 | 900 | #------------------------------------------------------------------------- 901 | # Arithmetic Instructions 902 | #------------------------------------------------------------------------- 903 | 904 | def v_math_ss(self, cdg, insn): 
905 | """ 906 | VADDSS xmm1, xmm2, xmm3/m32 907 | VSUBSS xmm1, xmm2, xmm3/m32 908 | VMULSS xmm1, xmm2, xmm3/m32 909 | VDIVSS xmm1, xmm2, xmm3/m32 910 | """ 911 | return self._v_math_ss_sd(cdg, insn, FLOAT_SIZE) 912 | 913 | def v_math_sd(self, cdg, insn): 914 | """ 915 | VADDSD xmm1, xmm2, xmm3/m64 916 | VSUBSD xmm1, xmm2, xmm3/m64 917 | VMULSD xmm1, xmm2, xmm3/m64 918 | VDIVSD xmm1, xmm2, xmm3/m64 919 | """ 920 | return self._v_math_ss_sd(cdg, insn, DOUBLE_SIZE) 921 | 922 | def _v_math_ss_sd(self, cdg, insn, op_size): 923 | """ 924 | Templated handler for scalar float/double math instructions. 925 | """ 926 | assert is_avx_reg(insn.Op1) and is_avx_reg(insn.Op2) 927 | 928 | # op3 -- m32/m64 929 | if is_mem_op(insn.Op3): 930 | r_reg = cdg.load_operand(2) 931 | 932 | # op3 -- xmm3 933 | else: 934 | assert is_xmm_reg(insn.Op3) 935 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 936 | 937 | # op2 -- xmm2 938 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 939 | 940 | # op1 -- xmm1 941 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 942 | 943 | itype2mcode = \ 944 | { 945 | ida_allins.NN_vaddss: ida_hexrays.m_fadd, 946 | ida_allins.NN_vaddsd: ida_hexrays.m_fadd, 947 | 948 | ida_allins.NN_vsubss: ida_hexrays.m_fsub, 949 | ida_allins.NN_vsubsd: ida_hexrays.m_fsub, 950 | 951 | ida_allins.NN_vmulss: ida_hexrays.m_fmul, 952 | ida_allins.NN_vmulsd: ida_hexrays.m_fmul, 953 | 954 | ida_allins.NN_vdivss: ida_hexrays.m_fdiv, 955 | ida_allins.NN_vdivsd: ida_hexrays.m_fdiv, 956 | } 957 | 958 | # get the hexrays microcode op to use for this instruction 959 | mcode_op = itype2mcode[insn.itype] 960 | op_dtype = ida_ua.dt_float if op_size == FLOAT_SIZE else ida_ua.dt_double 961 | 962 | # create a temp register to compute the final result into 963 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE) 964 | 965 | # emit the microcode for this insn 966 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0) 967 | cdg.emit_micro_mvm(mcode_op, op_dtype, l_reg, r_reg, t0_result, 0) 968 | 
cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0) 969 | cdg.mba.free_kreg(t0_result, XMM_SIZE) 970 | 971 | # clear upper 128 bits of ymm1 972 | assert is_xmm_reg(insn.Op1) 973 | clear_upper(cdg, d_reg) 974 | 975 | return ida_hexrays.MERR_OK 976 | 977 | def v_math_ps(self, cdg, insn): 978 | """ 979 | VADDPS xmm1, xmm2, xmm3/m128 980 | VADDPS ymm1, ymm2, ymm3/m256 981 | 982 | VSUBPS xmm1, xmm2, xmm3/m128 983 | VSUBPS ymm1, ymm2, ymm3/m256 984 | 985 | VMULPS xmm1, xmm2, xmm3/m128 986 | VMULPS ymm1, ymm2, ymm3/m256 987 | 988 | VDIVPS xmm1, xmm2, xmm3/m128 989 | VDIVPS ymm1, ymm2, ymm3/m256 990 | """ 991 | assert is_avx_reg(insn.Op1) and is_avx_reg(insn.Op2) 992 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE 993 | 994 | # op3 -- m128/m256 995 | if is_mem_op(insn.Op3): 996 | r_reg = cdg.load_operand(2) 997 | 998 | # op3 -- xmm3/ymm3 999 | else: 1000 | assert is_avx_reg(insn.Op3) 1001 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 1002 | 1003 | # op2 -- xmm2/ymm2 1004 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 1005 | 1006 | # op1 -- xmm1/ymm1 1007 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 1008 | d_mop = ida_hexrays.mop_t(d_reg, op_size) 1009 | 1010 | itype2name = \ 1011 | { 1012 | ida_allins.NN_vaddps: "_mm%s_add_ps", 1013 | ida_allins.NN_vsubps: "_mm%s_sub_ps", 1014 | ida_allins.NN_vmulps: "_mm%s_mul_ps", 1015 | ida_allins.NN_vdivps: "_mm%s_div_ps", 1016 | } 1017 | 1018 | # create the intrinsic (e.g. _mm_add_ps for xmm, _mm256_add_ps for ymm) 1019 | bit_size = bytes2bits(op_size) 1020 | bit_str = "256" if op_size == YMM_SIZE else "" 1021 | intrinsic_name = itype2name[insn.itype] % bit_str 1022 | 1023 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name) 1024 | avx_intrinsic.add_argument_reg(l_reg, "__m%u" % bit_size) 1025 | avx_intrinsic.add_argument_reg(r_reg, "__m%u" % bit_size) 1026 | avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size) 1027 | avx_intrinsic.emit() 1028 | 1029 | # clear upper 128 bits of ymm1 1030 | if op_size == XMM_SIZE: 1031 | clear_upper(cdg, d_reg) 1032 | 1033 |
return ida_hexrays.MERR_OK 1034 | 1035 | #------------------------------------------------------------------------- 1036 | # Misc Instructions 1037 | #------------------------------------------------------------------------- 1038 | 1039 | def vsqrtss(self, cdg, insn): 1040 | """ 1041 | VSQRTSS xmm1, xmm2, xmm3/m32 1042 | """ 1043 | assert is_xmm_reg(insn.Op1) and is_xmm_reg(insn.Op2) 1044 | 1045 | # op3 -- xmm3 1046 | if is_xmm_reg(insn.Op3): 1047 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 1048 | 1049 | # op3 -- m32 1050 | else: 1051 | assert is_mem_op(insn.Op3) 1052 | r_reg = cdg.load_operand(2) 1053 | 1054 | # op2 - xmm2 1055 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 1056 | 1057 | # op1 - xmm1 1058 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 1059 | 1060 | # create a temp register to compute the final result into 1061 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE) 1062 | 1063 | # copy xmm2 into the temp result reg 1064 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0) 1065 | 1066 | # mov.fpu call !fsqrt.4, t0_result_4.4 1067 | avx_intrinsic = AVXIntrinsic(cdg, "fsqrt") 1068 | avx_intrinsic.add_argument_reg_basic(r_reg, ida_typeinf.BT_FLOAT) 1069 | avx_intrinsic.set_return_reg_basic(t0_result, ida_typeinf.BT_FLOAT) 1070 | avx_intrinsic.emit() 1071 | 1072 | # store the fully computed result 1073 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0) 1074 | cdg.mba.free_kreg(t0_result, XMM_SIZE) 1075 | 1076 | # clear upper 128 bits of ymm1 1077 | clear_upper(cdg, d_reg) 1078 | 1079 | return ida_hexrays.MERR_OK 1080 | 1081 | def vsqrtps(self, cdg, insn): 1082 | """ 1083 | VSQRTPS xmm1, xmm2/m128 1084 | VSQRTPS ymm1, ymm2/m256 1085 | """ 1086 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE 1087 | 1088 | # op2 -- m128/m256 1089 | if is_mem_op(insn.Op2): 1090 | r_reg = cdg.load_operand(1) 1091 | 1092 | # op2 -- xmm2/ymm2 1093 | else: 1094 | assert is_avx_reg(insn.Op2) 1095 | r_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 1096 | 1097 | # op1 --
xmm1/ymm1 1098 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 1099 | 1100 | # intrinsics: __m128 _mm_sqrt_ps (__m128 a), __m256 _mm256_sqrt_ps (__m256 a) 1101 | bit_size = bytes2bits(op_size) 1102 | bit_str = str(bit_size) if op_size == YMM_SIZE else "" 1103 | intrinsic_name = "_mm%s_sqrt_ps" % bit_str 1104 | 1105 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name) 1106 | avx_intrinsic.add_argument_reg(r_reg, "__m%u" % bit_size) 1107 | avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size) 1108 | avx_intrinsic.emit() 1109 | 1110 | # clear upper 128 bits of ymm1 1111 | if op_size == XMM_SIZE: 1112 | clear_upper(cdg, d_reg) 1113 | 1114 | return ida_hexrays.MERR_OK 1115 | 1116 | def vshufps(self, cdg, insn): 1117 | """ 1118 | VSHUFPS xmm1, xmm2, xmm3/m128, imm8 1119 | VSHUFPS ymm1, ymm2, ymm3/m256, imm8 1120 | """ 1121 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE 1122 | 1123 | # op4 -- imm8 1124 | assert insn.Op4.type == ida_ua.o_imm 1125 | mask_value = insn.Op4.value 1126 | 1127 | # op3 -- m128/m256 1128 | if is_mem_op(insn.Op3): 1129 | r_reg = cdg.load_operand(2) 1130 | 1131 | # op3 -- xmm3/ymm3 1132 | else: 1133 | assert is_avx_reg(insn.Op3) 1134 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg) 1135 | 1136 | # op2 -- xmm2/ymm2 1137 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg) 1138 | 1139 | # op1 -- xmm1/ymm1 1140 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg) 1141 | 1142 | # 1143 | # intrinsics: 1144 | # __m128 _mm_shuffle_ps (__m128 a, __m128 b, unsigned int imm8) 1145 | # __m256 _mm256_shuffle_ps (__m256 a, __m256 b, const int imm8) 1146 | # 1147 | 1148 | bit_size = bytes2bits(op_size) 1149 | bit_str = str(bit_size) if op_size == YMM_SIZE else "" 1150 | intrinsic_name = "_mm%s_shuffle_ps" % bit_str 1151 | 1152 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name) 1153 | avx_intrinsic.add_argument_reg(l_reg, "__m%u" % bit_size) 1154 | avx_intrinsic.add_argument_reg(r_reg, "__m%u" % bit_size) 1155 | avx_intrinsic.add_argument_imm(mask_value, ida_typeinf.BT_INT8) 1156 |
avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size) 1157 | avx_intrinsic.emit() 1158 | 1159 | # clear upper 128 bits of ymm1 1160 | if op_size == XMM_SIZE: 1161 | clear_upper(cdg, d_reg) 1162 | 1163 | return ida_hexrays.MERR_OK 1164 | 1165 | #----------------------------------------------------------------------------- 1166 | # Plugin 1167 | #----------------------------------------------------------------------------- 1168 | 1169 | def PLUGIN_ENTRY(): 1170 | """ 1171 | Required plugin entry point for IDAPython Plugins. 1172 | """ 1173 | return MicroAVX() 1174 | 1175 | class MicroAVX(ida_idaapi.plugin_t): 1176 | """ 1177 | The IDA plugin stub for MicroAVX. 1178 | """ 1179 | 1180 | flags = ida_idaapi.PLUGIN_PROC | ida_idaapi.PLUGIN_HIDE 1181 | comment = "AVX support for the Hex-Rays x64 Decompiler" 1182 | help = "" 1183 | wanted_name = "MicroAVX" 1184 | wanted_hotkey = "" 1185 | loaded = False 1186 | 1187 | #-------------------------------------------------------------------------- 1188 | # IDA Plugin Overloads 1189 | #-------------------------------------------------------------------------- 1190 | 1191 | def init(self): 1192 | """ 1193 | This is called by IDA when it is loading the plugin. 1194 | """ 1195 | 1196 | # only bother to load the plugin for relevant sessions 1197 | if not is_amd64_idb(): 1198 | return ida_idaapi.PLUGIN_SKIP 1199 | 1200 | # ensure the x64 decompiler is loaded 1201 | ida_loader.load_plugin("hexx64") 1202 | assert ida_hexrays.init_hexrays_plugin(), "Missing Hexx64 Decompiler..." 1203 | 1204 | # initialize the AVX lifter 1205 | self.avx_lifter = AVXLifter() 1206 | self.avx_lifter.install() 1207 | sys.modules["__main__"].lifter = self.avx_lifter 1208 | 1209 | # mark the plugin as loaded 1210 | self.loaded = True 1211 | return ida_idaapi.PLUGIN_KEEP 1212 | 1213 | def run(self, arg): 1214 | """ 1215 | This is called by IDA when this file is loaded as a script. 1216 | """ 1217 | ida_kernwin.warning("%s cannot be run as a script in IDA." 
% self.wanted_name) 1218 | 1219 | def term(self): 1220 | """ 1221 | This is called by IDA when it is unloading the plugin. 1222 | """ 1223 | if not self.loaded: 1224 | return 1225 | 1226 | # hex-rays automatically cleans up decompiler hooks, so not much to do here... 1227 | self.avx_lifter = None --------------------------------------------------------------------------------
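
The imm8 operand that `vshufps` forwards to `_mm_shuffle_ps` / `_mm256_shuffle_ps` packs four 2-bit lane selectors. A minimal pure-Python model of the 128-bit selection semantics, handy for sanity-checking decompiled output (the helper name is illustrative, not part of the plugin):

```python
def shuffle_ps(a, b, imm8):
    """Model of VSHUFPS xmm semantics (a, b: lists of 4 float lanes).

    dest[0] = a[imm8 & 3]         dest[1] = a[(imm8 >> 2) & 3]
    dest[2] = b[(imm8 >> 4) & 3]  dest[3] = b[(imm8 >> 6) & 3]
    """
    # unpack the four 2-bit selectors from low to high
    sel = [(imm8 >> (2 * i)) & 0b11 for i in range(4)]
    return [a[sel[0]], a[sel[1]], b[sel[2]], b[sel[3]]]
```

For example, imm8 = 0x4E (0b01001110) selects [a[2], a[3], b[0], b[1]], pairing the high halves of `a` with the low halves of `b`.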
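
The recurring temp-kreg + `clear_upper` pattern in the lifter mirrors the VEX scalar-merge rule: the destination takes its low element from the second source, the rest of the xmm bits from the first source, and the upper ymm half is zeroed. A rough sketch of those register semantics using Python ints as 256-bit values (names are illustrative, not from the plugin):

```python
XMM_MASK = (1 << 128) - 1  # low 128 bits of a ymm register

def vmovss_reg_merge(xmm2, xmm3):
    """Model of VMOVSS xmm1, xmm2, xmm3 under VEX encoding:
    bits 31:0 come from xmm3, bits 127:32 from xmm2,
    and bits 255:128 of the ymm destination are zeroed.
    """
    merged = (xmm2 & XMM_MASK & ~0xFFFFFFFF) | (xmm3 & 0xFFFFFFFF)
    return merged  # upper ymm half is implicitly zero
```

This is why the handlers compute into a kreg seeded from xmm2, overwrite only the low element, then call `clear_upper` on the destination.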
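
Handlers like `vsqrtps` and `vshufps` derive the Intel intrinsic name from the operand width: an empty infix for 128-bit `_mm_*` intrinsics, `256` for `_mm256_*`. The naming scheme in isolation (assuming the plugin's byte sizes of 16 for XMM_SIZE and 32 for YMM_SIZE; the helper is illustrative):

```python
XMM_SIZE, YMM_SIZE = 16, 32  # register sizes in bytes, as assumed above

def intrinsic_name(base, op_size):
    """Build an Intel-style intrinsic name for a 128- or 256-bit op."""
    bit_str = "256" if op_size == YMM_SIZE else ""
    return "_mm%s_%s" % (bit_str, base)
```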