├── screenshots
│   ├── avx_demo_two.gif
│   └── avx_title_card.png
├── LICENSE
├── README.md
├── .gitignore
├── misc
│   └── scrape.py
└── plugins
    └── microavx.py
/screenshots/avx_demo_two.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gaasedelen/microavx/HEAD/screenshots/avx_demo_two.gif
--------------------------------------------------------------------------------
/screenshots/avx_title_card.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gaasedelen/microavx/HEAD/screenshots/avx_title_card.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 gaasedelen
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # MicroAVX - An AVX Lifter for the Hex-Rays Decompiler
2 |
3 |
4 |
5 |
6 |
7 | ## Overview
8 |
9 | MicroAVX is an extension of the [IDA Pro](https://www.hex-rays.com/products/ida/) decompiler, adding partial support for a number of common instructions from Intel's [Advanced Vector Extensions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) (AVX). This plugin demonstrates how the Hex-Rays microcode can be used to lift and decompile new or previously unsupported instructions.
10 |
11 | There are no plans to further develop MicroAVX or to extend its coverage to the complete set of AVX instructions. This plugin is intended only as a prototype & code resource for the community.
12 |
13 | For more information, please read the associated [blogpost](https://blog.ret2.io/2020/07/22/ida-pro-avx-decompiler).
14 |
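At its core, the plugin installs a microcode filter whose dispatch table maps instruction types to small handler functions, each of which emits replacement microcode. As a rough sketch of that dispatch pattern outside of IDA (all names here are hypothetical -- the real plugin keys its table on `ida_allins.NN_*` constants and subclasses `ida_hexrays.microcode_filter_t`):

```python
# Minimal sketch of the itype -> handler dispatch pattern used by the lifter.
# Everything here is a stand-in; no IDA APIs are involved.
class ToyLifter:
    def __init__(self):
        # mirrors AVXLifter._avx_handlers: instruction type -> handler
        self.handlers = {
            "vaddss": self.v_math_ss,
            "vmovss": self.vmovss,
        }

    def match(self, itype):
        # analogous to microcode_filter_t.match(): can we lift this insn?
        return itype in self.handlers

    def apply(self, itype, operands):
        # analogous to microcode_filter_t.apply(): emit replacement code
        return self.handlers[itype](operands)

    def v_math_ss(self, operands):
        return "lifted scalar math: %s" % ", ".join(operands)

    def vmovss(self, operands):
        return "lifted scalar mov: %s" % ", ".join(operands)

lifter = ToyLifter()
assert lifter.match("vaddss")
print(lifter.apply("vaddss", ["xmm0", "xmm1", "xmm2"]))
```

The real `match()` / `apply()` pair additionally rejects AVX-512 (EVEX-encoded) instructions and operates on `codegen_t` state rather than strings.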
15 | ## Releases
16 |
17 | * v0.1 -- Initial release
18 |
19 | ## Installation
20 |
21 | MicroAVX is a cross-platform (Windows, macOS, Linux) Python 2/3 plugin. It has no third-party dependencies, making the code both portable and easy to install.
22 |
23 | 1. From your disassembler's python console, run the following command to find its plugin directory:
24 | - **IDA Pro**: `os.path.join(idaapi.get_user_idadir(), "plugins")`
25 |
26 | 2. Copy the contents of this repository's `/plugins/` folder to the listed directory.
27 | 3. Restart your disassembler.
28 |
29 | This plugin is only supported for IDA 7.5 and newer.
30 |
31 | ## Usage
32 |
33 | The MicroAVX plugin loads automatically when an x86_64 executable / IDB is opened in IDA. Simply attempt to decompile any function containing AVX instructions, and the plugin will lift any instructions that it supports.
34 |
35 |
36 |
37 |
38 |
39 | (Please note: there is no right-click 'AVX toggle' in this release.)
40 |
41 | ## Authors
42 |
43 | * Markus Gaasedelen ([@gaasedelen](https://twitter.com/gaasedelen))
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | pip-wheel-metadata/
24 | share/python-wheels/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 | MANIFEST
29 |
30 | # PyInstaller
31 | # Usually these files are written by a python script from a template
32 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
33 | *.manifest
34 | *.spec
35 |
36 | # Installer logs
37 | pip-log.txt
38 | pip-delete-this-directory.txt
39 |
40 | # Unit test / coverage reports
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .coverage
45 | .coverage.*
46 | .cache
47 | nosetests.xml
48 | coverage.xml
49 | *.cover
50 | *.py,cover
51 | .hypothesis/
52 | .pytest_cache/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95 | __pypackages__/
96 |
97 | # Celery stuff
98 | celerybeat-schedule
99 | celerybeat.pid
100 |
101 | # SageMath parsed files
102 | *.sage.py
103 |
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 |
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 |
117 | # Rope project settings
118 | .ropeproject
119 |
120 | # mkdocs documentation
121 | /site
122 |
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 |
128 | # Pyre type checker
129 | .pyre/
130 |
--------------------------------------------------------------------------------
/misc/scrape.py:
--------------------------------------------------------------------------------
1 | import collections
2 |
3 | import idc
4 | import ida_name
5 | import idautils
6 | import ida_funcs
7 | import ida_hexrays
8 |
9 | #-----------------------------------------------------------------------------
10 | # Scraping Code
11 | #-----------------------------------------------------------------------------
12 |
13 | class MinsnVisitor(ida_hexrays.minsn_visitor_t):
14 | """
15 | Hex-Rays Micro-instruction Visitor
16 | """
17 | found = set()
18 |
19 | def visit_minsn(self):
20 |
21 | # we only care about external (unsupported) instructions
22 | if self.curins.opcode != ida_hexrays.m_ext:
23 | return 0
24 |
25 | ins_text = idc.GetDisasm(self.curins.ea)
26 | ins_op = ins_text.split(" ")[0]
27 |
28 | print("- 0x%08X: UNSUPPORTED %s" % (self.curins.ea, ins_text))
29 | self.found.add(ins_op)
30 | return 0
31 |
32 | def scrape_unsupported_instructions():
33 | """
34 | Scrape all 'external' (unsupported) decompiler instructions from this IDB.
35 |
36 | Returns a tuple of two maps:
37 | ext2func = { opcode: set([func_ea, func2_ea, ...]) }
38 | func2ext = { func_ea: set([opcode1, opcode2, opcode3]) }
39 |
40 | """
41 | miv = MinsnVisitor()
42 | ext2func = collections.defaultdict(set)
43 | func2ext = {}
44 |
45 | for address in idautils.Functions():
46 |
47 | #address = 0x1800017E0
48 | print("0x%08X: DECOMPILING" % address)
49 | func = ida_funcs.get_func(address)
50 |
51 | func_mbr = ida_hexrays.mba_ranges_t(func)
52 | hf = ida_hexrays.hexrays_failure_t()
53 | flags = ida_hexrays.DECOMP_NO_XREFS | ida_hexrays.DECOMP_NO_WAIT | ida_hexrays.DECOMP_WARNINGS
54 | mba = ida_hexrays.gen_microcode(func_mbr, hf, None, flags, ida_hexrays.MMAT_GENERATED)
55 |
56 | if not mba:
57 | print(" - 0x%08x: FAILED %s" % (hf.errea, hf.str))
58 | continue
59 |
60 | miv.found = set()
61 | mba.for_all_insns(miv)
62 |
63 | # opcode --> [func_ea, func2_ea, ..]
64 | for ins_op in miv.found:
65 | ext2func[ins_op].add(address)
66 |
67 | # func_ea --> [ins_op, ins_op2, ..]
68 | func2ext[address] = miv.found
69 |
70 | print("\nDone scraping...\n")
71 | return (ext2func, func2ext)
72 |
73 | def print_stats(ext2func):
74 | """
75 | Print stats about the scraped instructions.
76 | """
77 | print("-"*60)
78 |
79 | func_size_cache = {}
80 | all_funcs = set()
81 |
82 | print("\nFUNC USES -- UNSUPPORTED INSTR (%u types)\n" % len(ext2func))
83 | for key in sorted(ext2func, key=lambda key: len(ext2func[key]), reverse=True):
84 | function_addresses = ext2func[key]
85 | all_funcs |= function_addresses
86 |
87 | # print the unsupported instruction op, and how many funcs use it
88 | print(" - USES: %d - OP: %s" % (len(function_addresses), key))
89 |
90 | # compute the size of all the funcs that use this op..
91 | func_sizes = []
92 | for address in function_addresses:
93 |
94 | # try to grab the func size if we cached it already
95 | func_size = func_size_cache.get(address, None)
96 | if func_size is not None:
97 | func_sizes.append((func_size, address))
98 | continue
99 |
100 | # compute the size of the function
101 | func = ida_funcs.get_func(address)
102 | func_size = ida_funcs.calc_func_size(func)
103 | func_sizes.append((func_size, address))
104 |
105 | # cache the func size for future use
106 | func_size_cache[address] = func_size
107 |
108 | # print a few small functions that use this unsupported op..
109 | func_sizes.sort()
110 | for size, address in func_sizes[:5]:
111 | print(" -- SAMPLE FUNC 0x%08X (%u bytes)" % (address, size))
112 |
113 | print("\n" + "-"*60 + "\n")
114 | print("AFFLICTED FUNCTIONS (%u funcs)\n" % len(all_funcs))
115 |
116 | all_funcs = sorted(all_funcs)
117 | for ea in all_funcs:
118 | function_name = ida_name.get_short_name(ea)
119 | print("0x%08X: %s" % (ea, function_name))
120 |
121 | #-----------------------------------------------------------------------------
122 | # Main
123 | #-----------------------------------------------------------------------------
124 |
125 | print("Scraping instructions...")
126 | ext2func, func2ext = scrape_unsupported_instructions()
127 | print("Dumping results...")
128 | print_stats(ext2func)
129 |
--------------------------------------------------------------------------------
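The two maps returned by `scrape_unsupported_instructions()` are a forward and an inverted index over the scan results. Stripped of the IDA calls, the same bookkeeping pattern looks like this (the addresses and opcodes below are fabricated for illustration):

```python
import collections

# Sketch of scrape.py's bookkeeping with made-up data:
#   ext2func maps an unsupported opcode to every function that uses it,
#   func2ext maps a function to the set of unsupported opcodes it contains.
scan_results = {
    0x401000: {"vaddss", "vmovss"},
    0x402000: {"vaddss"},
}

ext2func = collections.defaultdict(set)
func2ext = {}

for func_ea, found in scan_results.items():
    for ins_op in found:
        ext2func[ins_op].add(func_ea)   # opcode  -> {func_ea, ...}
    func2ext[func_ea] = found           # func_ea -> {opcode, ...}

assert ext2func["vaddss"] == {0x401000, 0x402000}
assert func2ext[0x402000] == {"vaddss"}
```

In the real script, `found` comes from running a `minsn_visitor_t` over each function's generated microcode and collecting `m_ext` (unsupported) instructions.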
/plugins/microavx.py:
--------------------------------------------------------------------------------
1 | import sys
2 |
3 | import idc
4 | import ida_ua
5 | import ida_ida
6 | import ida_idp
7 | import ida_funcs
8 | import ida_allins
9 | import ida_idaapi
10 | import ida_loader
11 | import ida_kernwin
12 | import ida_typeinf
13 | import ida_hexrays
14 |
15 | #-----------------------------------------------------------------------------
16 | # Util
17 | #-----------------------------------------------------------------------------
18 |
19 | # an empty / NULL mop_t
20 | NO_MOP = ida_hexrays.mop_t()
21 |
22 | # EVEX-encoded instruction, intel.hpp (ida sdk)
23 | AUX_EVEX = 0x10000
24 |
25 | # register widths (bytes)
26 | XMM_SIZE = 16
27 | YMM_SIZE = 32
28 | ZMM_SIZE = 64
29 |
30 | # type sizes (bytes)
31 | FLOAT_SIZE = 4
32 | DOUBLE_SIZE = 8
33 | DWORD_SIZE = 4
34 | QWORD_SIZE = 8
35 |
36 | def size_of_operand(op):
37 | """
38 | From ...
39 | https://reverseengineering.stackexchange.com/questions/19843/how-can-i-get-the-byte-size-of-an-operand-in-ida-pro
40 | """
41 | tbyte = 8
42 | dt_ldbl = 8
43 | n_bytes = [ 1, 2, 4, 4, 8,
44 | tbyte, -1, 8, 16, -1,
45 | -1, 6, -1, 4, 4,
46 | dt_ldbl, 32, 64 ]
47 | return n_bytes[op.dtype]
48 |
49 | def is_amd64_idb():
50 | """
51 | Return true if an x86_64 IDB is open.
52 | """
53 | if ida_idp.ph.id != ida_idp.PLFM_386:
54 | return False
55 | return ida_ida.cvar.inf.is_64bit()
56 |
57 | def bytes2bits(n):
58 | """
59 | Return the number of bits represented by 'n' bytes.
60 | """
61 | return n * 8
62 |
63 | def is_mem_op(op):
64 | """
65 | Return true if the given operand *looks* like a mem op.
66 | """
67 | return op.type in [ida_ua.o_mem, ida_ua.o_displ, ida_ua.o_phrase]
68 |
69 | def is_reg_op(op):
70 | """
71 | Return true if the given operand is a register.
72 | """
73 | return op.type in [ida_ua.o_reg]
74 |
75 | def is_avx_reg(op):
76 | """
77 | Return true if the given operand is an XMM or YMM register.
78 | """
79 | return bool(is_xmm_reg(op) or is_ymm_reg(op))
80 |
81 | def is_xmm_reg(op):
82 | """
83 | Return true if the given operand is an XMM register.
84 | """
85 | if op.type != ida_ua.o_reg:
86 | return False
87 | if op.dtype != ida_ua.dt_byte16:
88 | return False
89 | return True
90 |
91 | def is_ymm_reg(op):
92 | """
93 | Return true if the given operand is a YMM register.
94 | """
95 | if op.type != ida_ua.o_reg:
96 | return False
97 | if op.dtype != ida_ua.dt_byte32:
98 | return False
99 | return True
100 |
101 | def is_avx_512(insn):
102 | """
103 | Return true if the given insn_t is an AVX512 instruction.
104 | """
105 | return bool(insn.auxpref & AUX_EVEX)
106 |
107 | #-----------------------------------------------------------------------------
108 | # Microcode Helpers
109 | #-----------------------------------------------------------------------------
110 |
111 | def get_ymm_mreg(xmm_mreg):
112 | """
113 | Return the YMM microcode register for a given XMM register.
114 | """
115 | xmm_reg = ida_hexrays.mreg2reg(xmm_mreg, XMM_SIZE)
116 | xmm_name = ida_idp.get_reg_name(xmm_reg, XMM_SIZE)
117 | xmm_number = int(xmm_name.split("mm")[-1])
118 |
119 | # compute the ymm mreg id
120 | ymm_reg = ida_idp.str2reg("ymm%u" % xmm_number)
121 | ymm_mreg = ida_hexrays.reg2mreg(ymm_reg)
122 |
123 | # sanity check...
124 | xmm_name = ida_hexrays.get_mreg_name(xmm_mreg, XMM_SIZE)
125 | ymm_name = ida_hexrays.get_mreg_name(ymm_mreg, YMM_SIZE)
126 | assert xmm_name[1:] == ymm_name[1:], "Reg escalation did not work... (%s, %s)" % (xmm_name, ymm_name)
127 |
128 | # return the ymm microcode register id
129 | return ymm_mreg
130 |
131 | def clear_upper(cdg, xmm_mreg, op_size=XMM_SIZE):
132 | """
133 | Extend the given xmm reg, clearing the upper bits (through ymm).
134 | """
135 | ymm_mreg = get_ymm_mreg(xmm_mreg)
136 |
137 | xmm_mop = ida_hexrays.mop_t(xmm_mreg, op_size)
138 | ymm_mop = ida_hexrays.mop_t(ymm_mreg, YMM_SIZE)
139 |
140 | return cdg.emit(ida_hexrays.m_xdu, xmm_mop, NO_MOP, ymm_mop)
141 |
142 | def store_operand_hack(cdg, op_num, new_mop):
143 | """
144 | XXX: why is there a load_operand(), but no inverse.. ?
145 | """
146 |
147 | # emit a 'load' operation...
148 | memX = cdg.load_operand(op_num)
149 | assert memX != ida_hexrays.mr_none, "Invalid op_num..."
150 |
151 | # since this is gonna be kind of hacky, let's make sure a load was actually emitted
152 | ins = cdg.mb.tail
153 | if ins.opcode != ida_hexrays.m_ldx:
154 | if ins.prev.opcode != ida_hexrays.m_ldx:
155 | raise ValueError("Hehe, hack failed :-( (insn 0x%08X op 0x%02X)" % (cdg.insn.ea, ins.opcode))
156 | prev = ins.prev
157 | cdg.mb.make_nop(ins)
158 | ins = prev
159 | assert ins.d.size == new_mop.size, "%u vs %u" % (new_mop.size, ins.d.size)
160 |
161 | # convert the load to a store :^)
162 | ins.opcode = ida_hexrays.m_stx
163 | ins.d = ins.r # d = op mem offset
164 | ins.r = ins.l # r = op mem segm
165 | ins.l = new_mop # l = value to store (mop_t)
166 |
167 | return ins
168 |
169 | #-----------------------------------------------------------------------------
170 | # Intrinsic Helper
171 | #-----------------------------------------------------------------------------
172 |
173 | class AVXIntrinsic(object):
174 | """
175 | This class helps with generating simple intrinsic calls in microcode.
176 | """
177 |
178 | def __init__(self, cdg, name):
179 | self.cdg = cdg
180 |
181 | # call info, sort of like func_type_data_t()
182 | self.call_info = ida_hexrays.mcallinfo_t()
183 | self.call_info.cc = ida_typeinf.CM_CC_FASTCALL
184 | self.call_info.callee = ida_idaapi.BADADDR
185 | self.call_info.solid_args = 0
186 | self.call_info.role = ida_hexrays.ROLE_UNK
187 | self.call_info.flags = ida_hexrays.FCI_SPLOK | ida_hexrays.FCI_FINAL | ida_hexrays.FCI_PROP
188 |
189 | # the actual 'call' microcode insn
190 | self.call_insn = ida_hexrays.minsn_t(cdg.insn.ea)
191 | self.call_insn.opcode = ida_hexrays.m_call
192 | self.call_insn.l.make_helper(name)
193 | self.call_insn.d.t = ida_hexrays.mop_f
194 | self.call_insn.d.f = self.call_info
195 |
196 | # temp return type
197 | self.call_info.return_type = ida_typeinf.tinfo_t()
198 | self.call_insn.d.size = 0
199 |
200 | def set_return_reg(self, mreg, type_string):
201 | """
202 | Set the return register of the function call, with a type string.
203 | """
204 | ret_tinfo = ida_typeinf.tinfo_t()
205 | ret_tinfo.get_named_type(None, type_string)
206 | return self.set_return_reg_type(mreg, ret_tinfo)
207 |
208 | def set_return_reg_basic(self, mreg, basic_type):
209 | """
210 | Set the return register of the function call, with a basic type assigned.
211 | """
212 | ret_tinfo = ida_typeinf.tinfo_t(basic_type)
213 | return self.set_return_reg_type(mreg, ret_tinfo)
214 |
215 | def set_return_reg_type(self, mreg, ret_tinfo):
216 | """
217 | Set the return register of the function call, with a complex type.
218 | """
219 | self.call_info.return_type = ret_tinfo
220 | self.call_insn.d.size = ret_tinfo.get_size()
221 |
222 | self.mov_insn = ida_hexrays.minsn_t(self.cdg.insn.ea)
223 | self.mov_insn.opcode = ida_hexrays.m_mov
224 | self.mov_insn.l.t = ida_hexrays.mop_d
225 | self.mov_insn.l.d = self.call_insn
226 | self.mov_insn.l.size = self.call_insn.d.size
227 | self.mov_insn.d.t = ida_hexrays.mop_r
228 | self.mov_insn.d.r = mreg
229 | self.mov_insn.d.size = self.call_insn.d.size
230 |
231 | if ret_tinfo.is_decl_floating():
232 | self.mov_insn.set_fpinsn()
233 |
234 | def add_argument_reg(self, mreg, type_string):
235 | """
236 | Add a register argument with a given type string to the function argument list.
237 | """
238 | op_tinfo = ida_typeinf.tinfo_t()
239 | op_tinfo.get_named_type(None, type_string)
240 | return self.add_argument_reg_type(mreg, op_tinfo)
241 |
242 | def add_argument_reg_basic(self, mreg, basic_type):
243 | """
244 | Add a register argument with a basic type to the function argument list.
245 | """
246 | op_tinfo = ida_typeinf.tinfo_t(basic_type)
247 | return self.add_argument_reg_type(mreg, op_tinfo)
248 |
249 | def add_argument_reg_type(self, mreg, op_tinfo):
250 | """
251 | Add a register argument of the given type to the function argument list.
252 | """
253 | call_arg = ida_hexrays.mcallarg_t()
254 | call_arg.t = ida_hexrays.mop_r
255 | call_arg.r = mreg
256 | call_arg.type = op_tinfo
257 | call_arg.size = op_tinfo.get_size()
258 |
259 | self.call_info.args.push_back(call_arg)
260 | self.call_info.solid_args += 1
261 |
262 | def add_argument_imm(self, value, basic_type):
263 | """
264 | Add an immediate value to the function argument list.
265 | """
266 | op_tinfo = ida_typeinf.tinfo_t(basic_type)
267 |
268 | mop_imm = ida_hexrays.mop_t()
269 | mop_imm.make_number(value, op_tinfo.get_size())
270 |
271 | call_arg = ida_hexrays.mcallarg_t()
272 | call_arg.make_number(value, op_tinfo.get_size())
273 | call_arg.type = op_tinfo
274 |
275 | self.call_info.args.push_back(call_arg)
276 | self.call_info.solid_args += 1
277 |
278 | def emit(self):
279 | """
280 | Emit the intrinsic call to the generated microcode.
281 | """
282 | self.cdg.mb.insert_into_block(self.mov_insn, self.cdg.mb.tail)
283 |
284 | #-----------------------------------------------------------------------------
285 | # AVX Lifter
286 | #-----------------------------------------------------------------------------
287 |
288 | class AVXLifter(ida_hexrays.microcode_filter_t):
289 | """
290 | A Hex-Rays microcode filter to lift AVX instructions during decompilation.
291 | """
292 |
293 | def __init__(self):
294 | super(AVXLifter, self).__init__()
295 | self._avx_handlers = \
296 | {
297 |
298 | # Compares (Scalar, Single / Double-Precision)
299 | ida_allins.NN_vcomiss: self.vcomiss,
300 | ida_allins.NN_vcomisd: self.vcomisd,
301 | ida_allins.NN_vucomiss: self.vucomiss,
302 | ida_allins.NN_vucomisd: self.vucomisd,
303 |
304 | # Conversions
305 | ida_allins.NN_vcvttss2si: self.vcvttss2si,
306 | ida_allins.NN_vcvtdq2ps: self.vcvtdq2ps,
307 | ida_allins.NN_vcvtsi2ss: self.vcvtsi2ss,
308 | ida_allins.NN_vcvtps2pd: self.vcvtps2pd,
309 | ida_allins.NN_vcvtss2sd: self.vcvtss2sd,
310 |
311 | # Mov (DWORD / QWORD)
312 | ida_allins.NN_vmovd: self.vmovd,
313 | ida_allins.NN_vmovq: self.vmovq,
314 |
315 | # Mov (Scalar, Single / Double-Precision)
316 | ida_allins.NN_vmovss: self.vmovss,
317 | ida_allins.NN_vmovsd: self.vmovsd,
318 |
319 | # Mov (Packed Single-Precision, Packed Integers)
320 | ida_allins.NN_vmovaps: self.v_mov_ps_dq,
321 | ida_allins.NN_vmovups: self.v_mov_ps_dq,
322 | ida_allins.NN_vmovdqa: self.v_mov_ps_dq,
323 | ida_allins.NN_vmovdqu: self.v_mov_ps_dq,
324 |
325 | # Bitwise (Packed Single-Precision)
326 | ida_allins.NN_vorps: self.v_bitwise_ps,
327 | ida_allins.NN_vandps: self.v_bitwise_ps,
328 | ida_allins.NN_vxorps: self.v_bitwise_ps,
329 |
330 | # Math (Scalar Single-Precision)
331 | ida_allins.NN_vaddss: self.v_math_ss,
332 | ida_allins.NN_vsubss: self.v_math_ss,
333 | ida_allins.NN_vmulss: self.v_math_ss,
334 | ida_allins.NN_vdivss: self.v_math_ss,
335 |
336 | # Math (Scalar Double-Precision)
337 | ida_allins.NN_vaddsd: self.v_math_sd,
338 | ida_allins.NN_vsubsd: self.v_math_sd,
339 | ida_allins.NN_vmulsd: self.v_math_sd,
340 | ida_allins.NN_vdivsd: self.v_math_sd,
341 |
342 | # Math (Packed Single-Precision)
343 | ida_allins.NN_vaddps: self.v_math_ps,
344 | ida_allins.NN_vsubps: self.v_math_ps,
345 | ida_allins.NN_vmulps: self.v_math_ps,
346 | ida_allins.NN_vdivps: self.v_math_ps,
347 |
348 | # Square Root
349 | ida_allins.NN_vsqrtss: self.vsqrtss,
350 | ida_allins.NN_vsqrtps: self.vsqrtps,
351 |
352 | # Shuffle (Packed Single-Precision)
353 | ida_allins.NN_vshufps: self.vshufps,
354 |
355 | }
356 |
357 | def match(self, cdg):
358 | """
359 | Return true if the lifter supports this AVX instruction.
360 | """
361 | if is_avx_512(cdg.insn):
362 | return False
363 | return cdg.insn.itype in self._avx_handlers
364 |
365 | def apply(self, cdg):
366 | """
367 | Generate microcode for the current instruction.
368 | """
369 | cdg.store_operand = lambda x, y: store_operand_hack(cdg, x, y)
370 | return self._avx_handlers[cdg.insn.itype](cdg, cdg.insn)
371 |
372 | def install(self):
373 | """
374 | Install the AVX codegen lifter.
375 | """
376 | ida_hexrays.install_microcode_filter(self, True)
377 | print("Installed AVX lifter... (%u instr supported)" % len(self._avx_handlers))
378 |
379 | def remove(self):
380 | """
381 | Remove the AVX codegen lifter.
382 | """
383 | ida_hexrays.install_microcode_filter(self, False)
384 | print("Removed AVX lifter...")
385 |
386 | #--------------------------------------------------------------------------
387 | # Compare Instructions
388 | #--------------------------------------------------------------------------
389 |
390 | #
391 | # the intel manual states that all of these comparison instructions are
392 | # effectively identical to their SSE counterparts. because of this, we
393 | # simply twiddle the decoded insn to make it appear as SSE and bail.
394 | #
395 | # since the decompiler appears to operate on the same decoded instruction
396 | # data that we meddled with, it will lift the instruction in the same way
397 | # it would lift the SSE version we alias each AVX one to.
398 | #
399 |
400 | def vcomiss(self, cdg, insn):
401 | """
402 | VCOMISS xmm1, xmm2/m32
403 | """
404 | insn.itype = ida_allins.NN_comiss
405 | return ida_hexrays.MERR_INSN
406 |
407 | def vucomiss(self, cdg, insn):
408 | """
409 | VUCOMISS xmm1, xmm2/m32
410 | """
411 | insn.itype = ida_allins.NN_ucomiss
412 | return ida_hexrays.MERR_INSN
413 |
414 | def vcomisd(self, cdg, insn):
415 | """
416 | VCOMISD xmm1, xmm2/m64
417 | """
418 | insn.itype = ida_allins.NN_comisd
419 | return ida_hexrays.MERR_INSN
420 |
421 | def vucomisd(self, cdg, insn):
422 | """
423 | VUCOMISD xmm1, xmm2/m64
424 | """
425 | insn.itype = ida_allins.NN_ucomisd
426 | return ida_hexrays.MERR_INSN
427 |
428 | #-------------------------------------------------------------------------
429 | # Conversion Instructions
430 | #-------------------------------------------------------------------------
431 |
432 | def vcvttss2si(self, cdg, insn):
433 | """
434 | VCVTTSS2SI r64, xmm1/m32
435 | VCVTTSS2SI r32, xmm1/m32
436 | """
437 | insn.itype = ida_allins.NN_cvttss2si
438 | return ida_hexrays.MERR_INSN
439 |
440 | def vcvtdq2ps(self, cdg, insn):
441 | """
442 | VCVTDQ2PS xmm1, xmm2/m128
443 | VCVTDQ2PS ymm1, ymm2/m256
444 | """
445 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE
446 |
447 | # op2 -- m128/m256
448 | if is_mem_op(insn.Op2):
449 | r_reg = cdg.load_operand(1)
450 |
451 | # op2 -- xmm2/ymm2
452 | else:
453 | assert is_avx_reg(insn.Op2)
454 | r_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
455 |
456 | # op1 -- xmm1/ymm1
457 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
458 |
459 | #
460 | # intrinsics:
461 | # __m128 _mm_cvtepi32_ps (__m128i a)
462 | # __m256 _mm256_cvtepi32_ps (__m256i a)
463 | #
464 |
465 | bit_size = bytes2bits(op_size)
466 | bit_str = str(bit_size) if op_size == YMM_SIZE else ""
467 | intrinsic_name = "_mm%s_cvtepi32_ps" % bit_str
468 |
469 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name)
470 | avx_intrinsic.add_argument_reg(r_reg, "__m%ui" % bit_size)
471 | avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size)
472 | avx_intrinsic.emit()
473 |
474 | # clear upper 128 bits of ymm1
475 | if op_size == XMM_SIZE:
476 | clear_upper(cdg, d_reg)
477 |
478 | return ida_hexrays.MERR_OK
479 |
480 | def vcvtsi2ss(self, cdg, insn):
481 | """
482 | VCVTSI2SS xmm1, xmm2, r/m32
483 | VCVTSI2SS xmm1, xmm2, r/m64
484 | """
485 | src_size = size_of_operand(insn.Op3)
486 |
487 | # op3 -- m32/m64
488 | if is_mem_op(insn.Op3):
489 | r_reg = cdg.load_operand(2)
490 |
491 | # op3 -- r32/r64
492 | else:
493 | assert is_reg_op(insn.Op3)
494 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
495 |
496 | # op2 -- xmm2
497 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
498 |
499 | # op1 -- xmm1
500 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
501 |
502 | # create a temp register to compute the final result into
503 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE)
504 | t0_mop = ida_hexrays.mop_t(t0_result, FLOAT_SIZE)
505 |
506 | # create a temp register to downcast a double to a float (if needed)
507 | t1_i2f = cdg.mba.alloc_kreg(src_size)
508 | t1_mop = ida_hexrays.mop_t(t1_i2f, src_size)
509 |
510 | # copy xmm2 into the temp result reg, as we need its upper 3 dwords
511 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0)
512 |
513 | # convert the integer (op3) to a float/double depending on its size
514 | cdg.emit(ida_hexrays.m_i2f, src_size, r_reg, 0, t1_i2f, 0)
515 |
516 | # reduce precision on the converted floating point value if needed (only r64/m64)
517 | cdg.emit(ida_hexrays.m_f2f, t1_mop, NO_MOP, t0_mop)
518 |
519 | # transfer the fully computed temp register to the real dest reg
520 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0)
521 | cdg.mba.free_kreg(t0_result, XMM_SIZE)
522 | cdg.mba.free_kreg(t1_i2f, src_size)
523 |
524 | # clear upper 128 bits of ymm1
525 | clear_upper(cdg, d_reg)
526 |
527 | return ida_hexrays.MERR_OK
528 |
529 | def vcvtps2pd(self, cdg, insn):
530 | """
531 | VCVTPS2PD xmm1, xmm2/m64
532 | VCVTPS2PD ymm1, ymm2/m128
533 | """
534 | src_size = QWORD_SIZE if is_xmm_reg(insn.Op1) else XMM_SIZE
535 |
536 | # op2 -- m64/m128
537 | if is_mem_op(insn.Op2):
538 | r_reg = cdg.load_operand(1)
539 |
540 | # op2 -- xmm2/ymm2
541 | else:
542 | assert is_avx_reg(insn.Op2)
543 | r_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
544 |
545 | # op1 -- xmm1/ymm1
546 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
547 |
548 | #
549 | # intrinsics:
550 | # - __m128d _mm_cvtps_pd (__m128 a)
551 | # - __m256d _mm256_cvtps_pd (__m128 a)
552 | #
553 |
554 | bit_size = bytes2bits(src_size * 2)
555 | bit_str = "256" if (src_size * 2) == YMM_SIZE else ""
556 | intrinsic_name = "_mm%s_cvtps_pd" % bit_str
557 |
558 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name)
559 | avx_intrinsic.add_argument_reg(r_reg, "__m128")
560 | avx_intrinsic.set_return_reg(d_reg, "__m%ud" % bit_size)
561 | avx_intrinsic.emit()
562 |
563 | # clear upper 128 bits of ymm1
564 | if src_size == QWORD_SIZE:
565 | clear_upper(cdg, d_reg)
566 |
567 | return ida_hexrays.MERR_OK
568 |
569 | def vcvtss2sd(self, cdg, insn):
570 | """
571 | VCVTSS2SD xmm1, xmm2, r/m32
572 | """
573 |
574 | # op3 -- m32
575 | if is_mem_op(insn.Op3):
576 | r_reg = cdg.load_operand(2)
577 |
578 | # op3 -- r32
579 | else:
580 | assert is_reg_op(insn.Op3)
581 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
582 |
583 | r_mop = ida_hexrays.mop_t(r_reg, FLOAT_SIZE)
584 |
585 | # op2 -- xmm2
586 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
587 |
588 | # op1 -- xmm1
589 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
590 |
591 | # create a temp register to compute the final result into
592 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE)
593 | t0_mop = ida_hexrays.mop_t(t0_result, DOUBLE_SIZE)
594 |
595 | # copy xmm2 into the temp result reg, as we need its upper quadword
596 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0)
597 |
598 | # convert float (op3) to a double, storing it in the lower 64 of the temp result reg
599 | cdg.emit(ida_hexrays.m_f2f, r_mop, NO_MOP, t0_mop)
600 |
601 | # transfer the fully computed temp register to the real dest reg
602 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0)
603 | cdg.mba.free_kreg(t0_result, XMM_SIZE)
604 |
605 | # clear upper 128 bits of ymm1
606 | clear_upper(cdg, d_reg)
607 |
608 | return ida_hexrays.MERR_OK
609 |
610 | #-------------------------------------------------------------------------
611 | # Mov Instructions
612 | #-------------------------------------------------------------------------
613 |
614 | def vmovss(self, cdg, insn):
615 | """
616 | VMOVSS xmm1, xmm2, xmm3
617 | VMOVSS xmm1, m32
618 | VMOVSS xmm1, xmm2, xmm3
619 | VMOVSS m32, xmm1
620 | """
621 | return self._vmov_ss_sd(cdg, insn, FLOAT_SIZE)
622 |
623 | def vmovsd(self, cdg, insn):
624 | """
625 | VMOVSD xmm1, xmm2, xmm3
626 | VMOVSD xmm1, m64
627 | VMOVSD xmm1, xmm2, xmm3
628 | VMOVSD m64, xmm1
629 | """
630 | return self._vmov_ss_sd(cdg, insn, DOUBLE_SIZE)
631 |
632 | def _vmov_ss_sd(self, cdg, insn, data_size):
633 | """
634 | Templated handler for scalar float/double mov instructions.
635 | """
636 |
637 | # op form: X, Y -- (2 operands)
638 | if insn.Op3.type == ida_ua.o_void:
639 |
640 | # op form: xmm1, m32/m64
641 | if is_xmm_reg(insn.Op1):
642 | assert is_mem_op(insn.Op2)
643 |
644 | # op2 -- m32/m64
645 | l_reg = cdg.load_operand(1)
646 | l_mop = ida_hexrays.mop_t(l_reg, data_size)
647 |
648 | # op1 -- xmm1
649 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
650 | d_mop = ida_hexrays.mop_t(d_reg, XMM_SIZE)
651 |
652 | # xmm1[:data_size] = [mem]
653 | insn = cdg.emit(ida_hexrays.m_xdu, l_mop, NO_MOP, d_mop)
654 |
655 | # clear xmm1[data_size:] bits (through ymm1)
656 | clear_upper(cdg, d_reg, data_size)
657 |
658 | return ida_hexrays.MERR_OK
659 |
660 | # op form: m32/m64, xmm1
661 | else:
662 | assert is_mem_op(insn.Op1) and is_xmm_reg(insn.Op2)
663 |
664 | # op2 -- xmm1
665 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
666 | l_mop = ida_hexrays.mop_t(l_reg, data_size)
667 |
668 | # store xmm1[:data_size] into memory at [m32/m64] (op1)
669 | insn = cdg.store_operand(0, l_mop)
670 | insn.set_fpinsn()
671 |
672 | return ida_hexrays.MERR_OK
673 |
674 | # op form: xmm1, xmm2, xmm3 -- (3 operands)
675 | else:
676 | assert is_xmm_reg(insn.Op1) and is_xmm_reg(insn.Op2) and is_xmm_reg(insn.Op3)
677 |
678 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
679 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
680 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
681 |
682 | # create a temp register to compute the final result into
683 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE)
684 |
685 | # emit the microcode for this insn
686 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0)
687 | cdg.emit(ida_hexrays.m_f2f, data_size, r_reg, 0, t0_result, 0)
688 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0)
689 | cdg.mba.free_kreg(t0_result, XMM_SIZE)
690 |
691 | # clear xmm1[data_size:] bits (through ymm1)
692 | clear_upper(cdg, d_reg, data_size)
693 |
694 | return ida_hexrays.MERR_OK
695 |
696 | # failsafe
697 |         assert False, "Unreachable..."
698 | return ida_hexrays.MERR_INSN
699 |
700 | def vmovd(self, cdg, insn):
701 | """
702 | VMOVD xmm1, r32/m32
703 | VMOVD r32/m32, xmm1
704 | """
705 | return self._vmov(cdg, insn, DWORD_SIZE)
706 |
707 | def vmovq(self, cdg, insn):
708 | """
709 | VMOVQ xmm1, r64/m64
710 | VMOVQ r64/m64, xmm1
711 | """
712 | return self._vmov(cdg, insn, QWORD_SIZE)
713 |
714 | def _vmov(self, cdg, insn, data_size):
715 | """
716 | Templated handler for dword/qword mov instructions.
717 | """
718 |
719 | # op form: xmm1, rXX/mXX
720 | if is_xmm_reg(insn.Op1):
721 |
722 | # op2 -- m32/m64
723 | if is_mem_op(insn.Op2):
724 | l_reg = cdg.load_operand(1)
725 |
726 | # op2 -- r32/r64
727 | else:
728 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
729 |
730 | # wrap the source micro-reg as a micro-operand of the specified size
731 | l_mop = ida_hexrays.mop_t(l_reg, data_size)
732 |
733 | # op1 -- xmm1
734 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
735 | d_mop = ida_hexrays.mop_t(d_reg, XMM_SIZE)
736 |
737 | # emit the microcode for this insn
738 | cdg.emit(ida_hexrays.m_xdu, l_mop, NO_MOP, d_mop)
739 |
740 | # clear upper 128 bits of ymm1
741 | clear_upper(cdg, d_reg)
742 |
743 | return ida_hexrays.MERR_OK
744 |
745 | # op form: rXX/mXX, xmm1
746 | else:
747 | assert is_xmm_reg(insn.Op2)
748 |
749 | # op2 -- xmm1
750 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
751 | l_mop = ida_hexrays.mop_t(l_reg, data_size)
752 |
753 | # op1 -- m32/m64
754 | if is_mem_op(insn.Op1):
755 | cdg.store_operand(0, l_mop)
756 |
757 | # op1 -- r32/r64
758 | else:
759 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
760 | d_mop = ida_hexrays.mop_t(d_reg, data_size)
761 | cdg.emit(ida_hexrays.m_mov, l_mop, NO_MOP, d_mop)
762 |
763 | #
764 |             # TODO: the Intel manual doesn't make it entirely clear
765 |             # whether the upper bits of an r32 destination need to be cleared
766 | #
767 |
768 | return ida_hexrays.MERR_OK
769 |
770 | # failsafe
771 |         assert False, "Unreachable..."
772 | return ida_hexrays.MERR_INSN
773 |
774 | def v_mov_ps_dq(self, cdg, insn):
775 | """
776 | VMOVAPS xmm1, xmm2/m128
777 | VMOVAPS ymm1, ymm2/m256
778 | VMOVAPS xmm2/m128, xmm1
779 | VMOVAPS ymm2/m256, ymm1
780 |
781 | VMOVUPS xmm1, xmm2/m128
782 | VMOVUPS ymm1, ymm2/m256
783 | VMOVUPS xmm2/m128, xmm1
784 | VMOVUPS ymm2/m256, ymm1
785 |
786 | VMOVDQA xmm1, xmm2/m128
787 | VMOVDQA xmm2/m128, xmm1
788 | VMOVDQA ymm1, ymm2/m256
789 | VMOVDQA ymm2/m256, ymm1
790 |
791 | VMOVDQU xmm1, xmm2/m128
792 | VMOVDQU xmm2/m128, xmm1
793 | VMOVDQU ymm1, ymm2/m256
794 | VMOVDQU ymm2/m256, ymm1
795 | """
796 |
797 | # op form: reg, [mem]
798 | if is_avx_reg(insn.Op1):
799 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE
800 |
801 | # op2 -- m128/m256
802 | if is_mem_op(insn.Op2):
803 | l_reg = cdg.load_operand(1)
804 |
805 | # op2 -- xmm1/ymm1
806 | else:
807 | assert is_avx_reg(insn.Op2)
808 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
809 |
810 | # wrap the source micro-reg as a micro-operand
811 | l_mop = ida_hexrays.mop_t(l_reg, op_size)
812 |
813 | # op1 -- xmmX/ymmX
814 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
815 | d_mop = ida_hexrays.mop_t(d_reg, op_size)
816 |
817 | # emit the microcode for this insn
818 | cdg.emit(ida_hexrays.m_mov, l_mop, NO_MOP, d_mop)
819 |
820 | # clear upper 128 bits of ymm1
821 | if op_size == XMM_SIZE:
822 | clear_upper(cdg, d_reg)
823 |
824 | return ida_hexrays.MERR_OK
825 |
826 | # op form: [mem], reg
827 | else:
828 | assert is_mem_op(insn.Op1) and is_avx_reg(insn.Op2)
829 | op_size = XMM_SIZE if is_xmm_reg(insn.Op2) else YMM_SIZE
830 |
831 | # op1 -- xmm1/ymm1
832 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
833 | l_mop = ida_hexrays.mop_t(l_reg, op_size)
834 |
835 | # [m128/m256] = xmm1/ymm1
836 | cdg.store_operand(0, l_mop)
837 | return ida_hexrays.MERR_OK
838 |
839 | # failsafe
840 |         assert False, "Unreachable..."
841 | return ida_hexrays.MERR_INSN
842 |
843 | #-------------------------------------------------------------------------
844 | # Bitwise Instructions
845 | #-------------------------------------------------------------------------
846 |
847 | def v_bitwise_ps(self, cdg, insn):
848 | """
849 | VORPS xmm1, xmm2, xmm3/m128
850 | VORPS ymm1, ymm2, ymm3/m256
851 |
852 | VXORPS xmm1, xmm2, xmm3/m128
853 | VXORPS ymm1, ymm2, ymm3/m256
854 |
855 | VANDPS xmm1, xmm2, xmm3/m128
856 | VANDPS ymm1, ymm2, ymm3/m256
857 | """
858 | assert is_avx_reg(insn.Op1) and is_avx_reg(insn.Op2)
859 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE
860 |
861 | # op3 -- m128/m256
862 | if is_mem_op(insn.Op3):
863 | r_reg = cdg.load_operand(2)
864 |
865 | # op3 -- xmm3/ymm3
866 | else:
867 | assert is_avx_reg(insn.Op3)
868 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
869 |
870 | itype2mcode = \
871 | {
872 | ida_allins.NN_vorps: ida_hexrays.m_or,
873 | ida_allins.NN_vandps: ida_hexrays.m_and,
874 | ida_allins.NN_vxorps: ida_hexrays.m_xor,
875 | }
876 |
877 | # get the hexrays microcode op to use for this instruction
878 | mcode_op = itype2mcode[insn.itype]
879 |
880 | # wrap the source micro-reg as a micro-operand
881 | r_mop = ida_hexrays.mop_t(r_reg, op_size)
882 |
883 | # op2 -- xmm2/ymm2
884 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
885 | l_mop = ida_hexrays.mop_t(l_reg, op_size)
886 |
887 | # op1 -- xmm1/ymm1
888 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
889 | d_mop = ida_hexrays.mop_t(d_reg, op_size)
890 |
891 | # emit the microcode for this insn
892 | cdg.emit(mcode_op, l_mop, r_mop, d_mop)
893 |
894 | # clear upper 128 bits of ymm1
895 | if op_size == XMM_SIZE:
896 | clear_upper(cdg, d_reg)
897 |
898 | return ida_hexrays.MERR_OK
899 |
900 | #-------------------------------------------------------------------------
901 | # Arithmetic Instructions
902 | #-------------------------------------------------------------------------
903 |
904 | def v_math_ss(self, cdg, insn):
905 | """
906 | VADDSS xmm1, xmm2, xmm3/m32
907 | VSUBSS xmm1, xmm2, xmm3/m32
908 | VMULSS xmm1, xmm2, xmm3/m32
909 | VDIVSS xmm1, xmm2, xmm3/m32
910 | """
911 | return self._v_math_ss_sd(cdg, insn, FLOAT_SIZE)
912 |
913 | def v_math_sd(self, cdg, insn):
914 | """
915 | VADDSD xmm1, xmm2, xmm3/m64
916 | VSUBSD xmm1, xmm2, xmm3/m64
917 | VMULSD xmm1, xmm2, xmm3/m64
918 | VDIVSD xmm1, xmm2, xmm3/m64
919 | """
920 | return self._v_math_ss_sd(cdg, insn, DOUBLE_SIZE)
921 |
922 | def _v_math_ss_sd(self, cdg, insn, op_size):
923 | """
924 | Templated handler for scalar float/double math instructions.
925 | """
926 | assert is_avx_reg(insn.Op1) and is_avx_reg(insn.Op2)
927 |
928 | # op3 -- m32/m64
929 | if is_mem_op(insn.Op3):
930 | r_reg = cdg.load_operand(2)
931 |
932 | # op3 -- xmm3
933 | else:
934 | assert is_xmm_reg(insn.Op3)
935 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
936 |
937 | # op2 -- xmm2
938 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
939 |
940 | # op1 -- xmm1
941 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
942 |
943 | itype2mcode = \
944 | {
945 | ida_allins.NN_vaddss: ida_hexrays.m_fadd,
946 | ida_allins.NN_vaddsd: ida_hexrays.m_fadd,
947 |
948 | ida_allins.NN_vsubss: ida_hexrays.m_fsub,
949 | ida_allins.NN_vsubsd: ida_hexrays.m_fsub,
950 |
951 | ida_allins.NN_vmulss: ida_hexrays.m_fmul,
952 | ida_allins.NN_vmulsd: ida_hexrays.m_fmul,
953 |
954 | ida_allins.NN_vdivss: ida_hexrays.m_fdiv,
955 | ida_allins.NN_vdivsd: ida_hexrays.m_fdiv,
956 | }
957 |
958 | # get the hexrays microcode op to use for this instruction
959 | mcode_op = itype2mcode[insn.itype]
960 | op_dtype = ida_ua.dt_float if op_size == FLOAT_SIZE else ida_ua.dt_double
961 |
962 | # create a temp register to compute the final result into
963 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE)
964 |
965 | # emit the microcode for this insn
966 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0)
967 | cdg.emit_micro_mvm(mcode_op, op_dtype, l_reg, r_reg, t0_result, 0)
968 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0)
969 |         cdg.mba.free_kreg(t0_result, XMM_SIZE)
970 |
971 | # clear upper 128 bits of ymm1
972 | assert is_xmm_reg(insn.Op1)
973 | clear_upper(cdg, d_reg)
974 |
975 | return ida_hexrays.MERR_OK
976 |
977 | def v_math_ps(self, cdg, insn):
978 | """
979 | VADDPS xmm1, xmm2, xmm3/m128
980 | VADDPS ymm1, ymm2, ymm3/m256
981 |
982 | VSUBPS xmm1, xmm2, xmm3/m128
983 | VSUBPS ymm1, ymm2, ymm3/m256
984 |
985 | VMULPS xmm1, xmm2, xmm3/m128
986 | VMULPS ymm1, ymm2, ymm3/m256
987 |
988 | VDIVPS xmm1, xmm2, xmm3/m128
989 | VDIVPS ymm1, ymm2, ymm3/m256
990 | """
991 | assert is_avx_reg(insn.Op1) and is_avx_reg(insn.Op2)
992 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE
993 |
994 | # op3 -- m128/m256
995 | if is_mem_op(insn.Op3):
996 | r_reg = cdg.load_operand(2)
997 |
998 | # op3 -- xmm3/ymm3
999 | else:
1000 | assert is_avx_reg(insn.Op3)
1001 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
1002 |
1003 | # op2 -- xmm2/ymm2
1004 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
1005 |
1006 | # op1 -- xmm1/ymm1
1007 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
1008 | d_mop = ida_hexrays.mop_t(d_reg, op_size)
1009 |
1010 |         itype2name = \
1011 |         {
1012 |             ida_allins.NN_vaddps: "_mm%s_add_ps",
1013 |             ida_allins.NN_vsubps: "_mm%s_sub_ps",
1014 |             ida_allins.NN_vmulps: "_mm%s_mul_ps",
1015 |             ida_allins.NN_vdivps: "_mm%s_div_ps",
1016 |         }
1017 |
1018 |         # create the intrinsic name (eg, '_mm_add_ps', '_mm256_add_ps')
1019 |         bit_size = bytes2bits(op_size)
1020 |         bit_str = "256" if op_size == YMM_SIZE else ""
1021 |         intrinsic_name = itype2name[insn.itype] % bit_str
1022 |
1023 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name)
1024 | avx_intrinsic.add_argument_reg(l_reg, "__m%u" % bit_size)
1025 | avx_intrinsic.add_argument_reg(r_reg, "__m%u" % bit_size)
1026 | avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size)
1027 | avx_intrinsic.emit()
1028 |
1029 | # clear upper 128 bits of ymm1
1030 | if op_size == XMM_SIZE:
1031 | clear_upper(cdg, d_reg)
1032 |
1033 | return ida_hexrays.MERR_OK
1034 |
1035 | #-------------------------------------------------------------------------
1036 | # Misc Instructions
1037 | #-------------------------------------------------------------------------
1038 |
1039 | def vsqrtss(self, cdg, insn):
1040 | """
1041 | VSQRTSS xmm1, xmm2, xmm3/m32
1042 | """
1043 | assert is_xmm_reg(insn.Op1) and is_xmm_reg(insn.Op2)
1044 |
1045 | # op3 -- xmm3
1046 | if is_xmm_reg(insn.Op3):
1047 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
1048 |
1049 | # op3 -- m32
1050 | else:
1051 | assert is_mem_op(insn.Op3)
1052 | r_reg = cdg.load_operand(2)
1053 |
1054 | # op2 - xmm2
1055 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
1056 |
1057 | # op1 - xmm1
1058 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
1059 |
1060 | # create a temp register to compute the final result into
1061 | t0_result = cdg.mba.alloc_kreg(XMM_SIZE)
1062 |
1063 |         # copy xmm2 (op2) into the temp result reg
1064 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, l_reg, 0, t0_result, 0)
1065 |
1066 | # mov.fpu call !fsqrt.4, t0_result_4.4
1067 | avx_intrinsic = AVXIntrinsic(cdg, "fsqrt")
1068 | avx_intrinsic.add_argument_reg_basic(r_reg, ida_typeinf.BT_FLOAT)
1069 | avx_intrinsic.set_return_reg_basic(t0_result, ida_typeinf.BT_FLOAT)
1070 | avx_intrinsic.emit()
1071 |
1072 | # store the fully computed result
1073 | cdg.emit(ida_hexrays.m_mov, XMM_SIZE, t0_result, 0, d_reg, 0)
1074 | cdg.mba.free_kreg(t0_result, XMM_SIZE)
1075 |
1076 | # clear upper 128 bits of ymm1
1077 | clear_upper(cdg, d_reg)
1078 |
1079 | return ida_hexrays.MERR_OK
1080 |
1081 | def vsqrtps(self, cdg, insn):
1082 | """
1083 | VSQRTPS xmm1, xmm2/m128
1084 | VSQRTPS ymm1, ymm2/m256
1085 | """
1086 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE
1087 |
1088 | # op2 -- m128/m256
1089 | if is_mem_op(insn.Op2):
1090 | r_reg = cdg.load_operand(1)
1091 |
1092 | # op2 -- xmm2/ymm2
1093 | else:
1094 | assert is_avx_reg(insn.Op2)
1095 | r_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
1096 |
1097 | # op1 -- xmm1/ymm1
1098 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
1099 |
1100 |         # intrinsics: __m128 _mm_sqrt_ps (__m128 a), __m256 _mm256_sqrt_ps (__m256 a)
1101 | bit_size = bytes2bits(op_size)
1102 | bit_str = str(bit_size) if op_size == YMM_SIZE else ""
1103 | intrinsic_name = "_mm%s_sqrt_ps" % bit_str
1104 |
1105 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name)
1106 | avx_intrinsic.add_argument_reg(r_reg, "__m%u" % bit_size)
1107 | avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size)
1108 | avx_intrinsic.emit()
1109 |
1110 | # clear upper 128 bits of ymm1
1111 | if op_size == XMM_SIZE:
1112 | clear_upper(cdg, d_reg)
1113 |
1114 | return ida_hexrays.MERR_OK
1115 |
1116 | def vshufps(self, cdg, insn):
1117 | """
1118 | VSHUFPS xmm1, xmm2, xmm3/m128, imm8
1119 | VSHUFPS ymm1, ymm2, ymm3/m256, imm8
1120 | """
1121 | op_size = XMM_SIZE if is_xmm_reg(insn.Op1) else YMM_SIZE
1122 |
1123 | # op4 -- imm8
1124 | assert insn.Op4.type == ida_ua.o_imm
1125 | mask_value = insn.Op4.value
1126 |
1127 | # op3 -- m128/m256
1128 | if is_mem_op(insn.Op3):
1129 | r_reg = cdg.load_operand(2)
1130 |
1131 | # op3 -- xmm3/ymm3
1132 | else:
1133 | assert is_avx_reg(insn.Op3)
1134 | r_reg = ida_hexrays.reg2mreg(insn.Op3.reg)
1135 |
1136 | # op2 -- xmm2/ymm2
1137 | l_reg = ida_hexrays.reg2mreg(insn.Op2.reg)
1138 |
1139 | # op1 -- xmm1/ymm1
1140 | d_reg = ida_hexrays.reg2mreg(insn.Op1.reg)
1141 |
1142 | #
1143 | # intrinsics:
1144 | # __m128 _mm_shuffle_ps (__m128 a, __m128 b, unsigned int imm8)
1145 | # __m256 _mm256_shuffle_ps (__m256 a, __m256 b, const int imm8)
1146 | #
1147 |
1148 | bit_size = bytes2bits(op_size)
1149 | bit_str = str(bit_size) if op_size == YMM_SIZE else ""
1150 | intrinsic_name = "_mm%s_shuffle_ps" % bit_str
1151 |
1152 | avx_intrinsic = AVXIntrinsic(cdg, intrinsic_name)
1153 | avx_intrinsic.add_argument_reg(l_reg, "__m%u" % bit_size)
1154 | avx_intrinsic.add_argument_reg(r_reg, "__m%u" % bit_size)
1155 | avx_intrinsic.add_argument_imm(mask_value, ida_typeinf.BT_INT8)
1156 | avx_intrinsic.set_return_reg(d_reg, "__m%u" % bit_size)
1157 | avx_intrinsic.emit()
1158 |
1159 | # clear upper 128 bits of ymm1
1160 | if op_size == XMM_SIZE:
1161 | clear_upper(cdg, d_reg)
1162 |
1163 | return ida_hexrays.MERR_OK
1164 |
1165 | #-----------------------------------------------------------------------------
1166 | # Plugin
1167 | #-----------------------------------------------------------------------------
1168 |
1169 | def PLUGIN_ENTRY():
1170 | """
1171 | Required plugin entry point for IDAPython Plugins.
1172 | """
1173 | return MicroAVX()
1174 |
1175 | class MicroAVX(ida_idaapi.plugin_t):
1176 | """
1177 | The IDA plugin stub for MicroAVX.
1178 | """
1179 |
1180 | flags = ida_idaapi.PLUGIN_PROC | ida_idaapi.PLUGIN_HIDE
1181 | comment = "AVX support for the Hex-Rays x64 Decompiler"
1182 | help = ""
1183 | wanted_name = "MicroAVX"
1184 | wanted_hotkey = ""
1185 | loaded = False
1186 |
1187 | #--------------------------------------------------------------------------
1188 | # IDA Plugin Overloads
1189 | #--------------------------------------------------------------------------
1190 |
1191 | def init(self):
1192 | """
1193 | This is called by IDA when it is loading the plugin.
1194 | """
1195 |
1196 | # only bother to load the plugin for relevant sessions
1197 | if not is_amd64_idb():
1198 | return ida_idaapi.PLUGIN_SKIP
1199 |
1200 | # ensure the x64 decompiler is loaded
1201 | ida_loader.load_plugin("hexx64")
1202 | assert ida_hexrays.init_hexrays_plugin(), "Missing Hexx64 Decompiler..."
1203 |
1204 | # initialize the AVX lifter
1205 | self.avx_lifter = AVXLifter()
1206 | self.avx_lifter.install()
1207 | sys.modules["__main__"].lifter = self.avx_lifter
1208 |
1209 | # mark the plugin as loaded
1210 | self.loaded = True
1211 | return ida_idaapi.PLUGIN_KEEP
1212 |
1213 | def run(self, arg):
1214 | """
1215 |         This is called by IDA when the user tries to run the plugin directly.
1216 | """
1217 | ida_kernwin.warning("%s cannot be run as a script in IDA." % self.wanted_name)
1218 |
1219 | def term(self):
1220 | """
1221 | This is called by IDA when it is unloading the plugin.
1222 | """
1223 | if not self.loaded:
1224 | return
1225 |
1226 | # hex-rays automatically cleans up decompiler hooks, so not much to do here...
1227 | self.avx_lifter = None
--------------------------------------------------------------------------------