├── README.md
└── glm_ucode_patch_parser.py


/README.md:
--------------------------------------------------------------------------------
 1 | 
 2 | # **Disclaimer**
 3 | 
 4 | **All information is provided for educational purposes only. Follow these instructions at your own risk. Neither the authors nor their employer are responsible for any direct or consequential damage or loss arising from any person or organization acting or failing to act on the basis of information contained in this page.**
 5 | 
 6 | # Content
 7 | [Description](#description)  
 8 | [Usage](#usage)  
 9 | [Research Team](#research-team)  
10 | [License](#license)  
11 | 
12 | # Description
13 | 
14 | The microcode patch parser for Atom Goldmont is a tool that produces a textual representation of microcode patch data. The patch itself is binary data processed by a special routine in the microcode ROM. On Goldmont this routine is located at U1ea6 in MS ROM; we named it patch_runs_load_loop. If you study the routine, you will see that a microcode patch is simply a sequence of calls to a fixed set of other routines in MS ROM, each identified by a numeric ID. Each call in the patch starts with this ID (the parser script reads it as a single byte), followed by the binary arguments for the called routine. The patch-processing routines start at address U226c in MS ROM and are spaced 4 uops apart (one uop tetrad), so the routine with ID 0 is at U226c, the routine with ID 1 is at U2270, and so on. The number of arguments varies from routine to routine. The call sequence (and the ucode patch itself) ends with the special call ID 0.
15 | 
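As a small illustration of the addressing rule described above, the sketch below maps a routine ID to its MS ROM address. The base address U226c and the step of 4 uops come from the description; the names `ROUTINE_BASE`, `UOPS_PER_TETRAD`, `routine_address` and `format_uaddr` are hypothetical and are not part of the parser.

```python
# Minimal sketch: map a patch routine ID to its MS ROM address.
# The base address and the 4-uop step are taken from the description;
# all names below are illustrative only.
ROUTINE_BASE = 0x226C    # address of the routine with ID 0
UOPS_PER_TETRAD = 4      # routines are spaced one tetrad apart


def routine_address(run_id: int) -> int:
    """Return the MS ROM address of the routine identified by run_id."""
    if run_id < 0:
        raise ValueError("routine ID must be non-negative")
    return ROUTINE_BASE + run_id * UOPS_PER_TETRAD


def format_uaddr(addr: int) -> str:
    """Format an address in the U-prefixed hex notation used in the listings."""
    return "U%04x" % addr


if __name__ == "__main__":
    for rid in (0x00, 0x01, 0x0A):
        print("ID 0x%02x -> %s" % (rid, format_uaddr(routine_address(rid))))
```

The parser script computes the same value as `(run_id << 2) + 0x226c` when it prints each call.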
16 | We identified the base set of patch-processing routines and reverse engineered the purpose of each routine and the number of its arguments. For every call in the patch, the tool writes a text description of the routine's operation together with its arguments. For routines that load micro-operations patching MS ROM (the uops sent to MS Patch RAM), the tool disassembles the uops in place, so the resulting text file shows them in text form. The parser imports our [ucode disassembler for Goldmont][4], so the disassembler must be reachable through the Python import path. In verbose mode the tool also saves, in binary form, the pcode patch data contained in the microcode patch (pcode is the firmware for the P-unit, the CPU's power management controller).
17 | 
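The call-by-call structure can be sketched as a table-driven loop. This is a simplified illustration, not the tool itself: it handles only two routine IDs (0x00 and 0x0a, with the argument layouts used by the parser script) and reads the ID as a single byte, as the script does. Stopping at ID 0 is an assumption based on the description. The names `walk_patch`, `PARSERS`, `parse_end` and `parse_ucall` are hypothetical, and the sample bytes are made up.

```python
import struct


def parse_end(data, offset):
    """ID 0x00: end of the call sequence, no arguments."""
    return "END", offset


def parse_ucall(data, offset):
    """ID 0x0a: one 16-bit little-endian microcode address."""
    (uaddr,) = struct.unpack_from("<H", data, offset)
    return "UCALL: U%04x" % uaddr, offset + 2


# Illustrative subset of the ID -> parser table.
PARSERS = {0x00: parse_end, 0x0A: parse_ucall}


def walk_patch(data):
    """Return a list of (offset, routine ID, description) tuples."""
    calls = []
    offset = 0
    while offset < len(data):
        run_id = data[offset]            # the ID is read as a single byte
        if run_id not in PARSERS:
            break                        # unknown routine: stop parsing
        text, next_offset = PARSERS[run_id](data, offset + 1)
        calls.append((offset, run_id, text))
        offset = next_offset
        if run_id == 0x00:
            break                        # assumption: ID 0 ends the sequence
    return calls


if __name__ == "__main__":
    sample = bytes([0x0A, 0xA6, 0x1E, 0x00])   # made-up sample data
    for off, rid, text in walk_patch(sample):
        print("0x%04x: 0x%02x: %s" % (off, rid, text))
```

Each parser returns the offset of the next call, so routines with different argument counts can share the same loop.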
18 | The parser script supports a verbose mode, enabled with the "-v" option. In this mode the tool accumulates all microcode-related data (the Match/Patch register values, sequence words and uops) and writes three text files (ms_array2/3/4). The [ucode disassembler][4] can use these files to produce a full ucode listing that reflects the patch: simply replace the files read from MS LDAT at runtime with the new ones.
19 | 
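To make the Match/Patch register values more concrete, here is a hedged sketch of how the parser script unpacks them: each 64-bit entry carries two 31-bit register values, and each value encodes a match address and a patch address. The bit layout is taken from `parser_rid_match_patch` in the script; the function names and the sample values are made up for illustration.

```python
def split_match_patch_pair(qword):
    """Split one 64-bit patch entry into its two 31-bit register values."""
    low = qword & 0x7FFFFFFF
    high = qword >> 31
    return low, high


def decode_match_patch(reg):
    """Return (match_addr, patch_addr) encoded in one register value."""
    match_addr = reg & 0xFFFE          # low 16 bits, bit 0 cleared
    patch_addr = (reg >> 16) << 1      # upper bits, scaled by two
    return match_addr, patch_addr


if __name__ == "__main__":
    # Made-up sample entry holding two register values.
    entry = 0x3E001234 | (0x3E020100 << 31)
    for reg in split_match_patch_pair(entry):
        match_addr, patch_addr = decode_match_patch(reg)
        print("0x%08x: U%04x -> U%04x" % (reg, match_addr, patch_addr))
```

With these sample values the patch addresses fall in the 0x7c00 range that the script treats as MS Patch RAM.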
20 | Please note that you must run the parser on patch data decoded by our [Microcode Decryptor][5], not on the original patch, which is encrypted.
21 | 
22 | # Usage
23 | ```
24 | glm_ucode_patch_parser.py
25 | Usage: glm_ucode_patch_parser <decoded_patch_path> [-v]
26 | ```
27 | 
28 | Example:
29 | ```
30 | glm_ucode_patch_parser.py cpu506C9_plat03_ver00000038_2019-01-15_PRD_99AA67D7.bin.dec -v
31 | File [cpu506C9_plat03_ver00000038_2019-01-15_PRD_99AA67D7.bin.dec] processed
32 | ```
33 | 
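The run above leaves its results next to the input file. The following sketch lists the output paths that the script derives from the input path. The suffixes are taken from the script's `main` function; the helper `output_paths` and its dictionary keys are hypothetical. Note that in verbose mode the script writes the ms_array files only when the patch contains the corresponding data.

```python
import os


def output_paths(patch_path, verbose=False):
    """Return the output file paths derived from the input patch path."""
    # splitext drops only the last extension, so "x.bin.dec" becomes "x.bin".
    base = os.path.splitext(patch_path)[0]
    paths = {"listing": base + ".txt"}
    if verbose:
        paths["pcode"] = base + ".pcode.bin"
        paths["ms_array2"] = base + ".ms_array2.txt"
        paths["ms_array3"] = base + ".ms_array3.txt"
        paths["ms_array4"] = base + ".ms_array4.txt"
    return paths


if __name__ == "__main__":
    for kind, path in output_paths("patch.bin.dec", verbose=True).items():
        print("%-10s %s" % (kind, path))
```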
34 | # Research Team
35 | 
36 | Mark Ermolov ([@\_markel___][1])
37 | 
38 | Maxim Goryachy ([@h0t_max][2])
39 | 
40 | Dmitry Sklyarov ([@_Dmit][3])
41 | 
42 | # License
43 | Copyright (c) 2021 Mark Ermolov, Dmitry Sklyarov at Positive Technologies and Maxim Goryachy (Independent Researcher)
44 | 
45 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
46 | 
47 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
48 | 
49 | [1]: https://twitter.com/_markel___
50 | [2]: https://twitter.com/h0t_max
51 | [3]: https://twitter.com/_Dmit
52 | [4]: https://github.com/chip-red-pill/uCodeDisasm
53 | [5]: https://github.com/chip-red-pill/MicrocodeDecryptor
54 | 


--------------------------------------------------------------------------------
/glm_ucode_patch_parser.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import sys
  3 | import struct
  4 | 
  5 | import glm_ucode_disasm
  6 | 
  7 | g_pcode = b''
  8 | g_match_patch_regs = ()
  9 | g_patch_match = {}
 10 | g_patch_ram = ()
 11 | g_patch_ram_seqwords = ()
 12 | 
 13 | def parser_rid_end(patch_data, offset):
 14 |     return "END", offset
 15 | 
 16 | def parser_rid_init(patch_data, offset):
 17 |     global g_match_patch_regs
 18 |     global g_patch_match
 19 |     global g_patch_ram
 20 |     global g_patch_ram_seqwords
 21 | 
 22 |     g_match_patch_regs = ()
 23 |     g_patch_match = {}
 24 |     g_patch_ram = ()
 25 |     g_patch_ram_seqwords = ()
 26 |     return "INIT", offset
 27 | 
 28 | def parser_rid_patch_ram(patch_data, offset):
 29 |     str_res = ""
 30 |     assert(len(patch_data) - offset >= 4)
 31 |     patch_ram_addr, entry_count = struct.unpack_from("<HH", patch_data, offset)
 32 |     offset += 4
 33 |     assert(len(patch_data) - offset >= entry_count * 8)
 34 | 
 35 |     global g_patch_ram
 36 |     global g_patch_ram_seqwords
 37 |     g_patch_ram = [0,] * 0x180
 38 |     g_patch_ram_seqwords = [0,] * 0x80
 39 | 
 40 |     assert(patch_ram_addr >= 0x7c00 and patch_ram_addr < 0x7e00)
 41 |     patch_ram_idx = patch_ram_addr - 0x7c00
 42 | 
 43 |     str_res = "PATCH_RAM: U%04x:\n" % patch_ram_addr
 44 |     seqw = 0
 45 |     for i in range(entry_count):
 46 |         seqw_uop, = struct.unpack_from("<Q", patch_data, offset)
 47 |         seqw |= ((seqw_uop >> 48) & 0x3ff) << ((i % 3) * 10)
 48 | 
 49 |         g_patch_ram[patch_ram_idx] = seqw_uop & 0xffffffffffff
 50 |         g_patch_ram_seqwords[patch_ram_idx // 3] = seqw
 51 |         patch_ram_idx += 1
 52 | 
 53 |         offset += 8
 54 | 
 55 |         if i % 3 == 2:
 56 |             for uop_idx in range(3):
 57 |                 uop_patch_ram_addr = 0x7c00 + (patch_ram_idx // 3 - 1) * 4 + uop_idx
 58 |                 uop = g_patch_ram[(patch_ram_idx // 3 - 1) * 3 + uop_idx]
 59 |                 str_match_patch_addr = ("U%04x: " % g_patch_match[uop_patch_ram_addr]) if uop_patch_ram_addr in g_patch_match else ""
 60 |                 str_res += " U%04x: " % uop_patch_ram_addr + str_match_patch_addr + "%012x " % uop + \
 61 |                             glm_ucode_disasm.uop_disassemble(uop, uop_patch_ram_addr) + "\n"
 62 |                 seqword_sentences, exec_flow_stop = glm_ucode_disasm.process_seqword(uop_patch_ram_addr, uop, seqw, False)
 63 |                 if len(seqword_sentences):
 64 |                     for sws_idx, seqword_sentence in enumerate(seqword_sentences):
 65 |                         str_prefix = "%20s" % ("%08x" % seqw if sws_idx == 0 else "") + " "
 66 |                         str_res += str_prefix + seqword_sentence + "\n"
 67 |             
 68 |             seqw = 0
 69 |             str_res += "\n"
 70 |     
 71 |     return str_res.rstrip("\n") + "\n", offset
 72 | 
 73 | def parser_rid_match_patch(patch_data, offset):
 74 |     str_res = ""
 75 |     assert(len(patch_data) - offset >= 2)
 76 |     entry_count, = struct.unpack_from("<H", patch_data, offset)
 77 |     offset += 2
 78 |     assert(len(patch_data) - offset >= entry_count * 8)
 79 | 
 80 |     global g_match_patch_regs
 81 |     global g_patch_match
 82 | 
 83 |     str_res = "MATCH_PATCH:\n"
 84 |     for i in range(entry_count):
 85 |         two_match_patch, = struct.unpack_from("<Q", patch_data, offset)
 86 |         for match_patch in (two_match_patch & 0x7fffffff, two_match_patch >> 0x1f):
 87 |             g_match_patch_regs += match_patch,
 88 |             match_addr = match_patch & 0xfffe
 89 |             patch_addr = (match_patch >> 16) << 1
 90 |             g_patch_match[patch_addr] = match_addr
 91 |             str_match_patch = ": U%04x -> U%04x" % (match_addr, patch_addr)
 92 |             str_res += " 0x%08x" % match_patch + (str_match_patch if match_patch else "") + "\n"
 93 |         offset += 8
 94 |     return str_res.rstrip("\n"), offset
 95 | 
 96 | def parser_rid_rmw_stg_buf(patch_data, offset):
 97 |     str_res = ""
 98 |     assert(len(patch_data) - offset >= 2)
 99 |     entry_count, = struct.unpack_from("<H", patch_data, offset)
100 |     offset += 2
101 |     assert(len(patch_data) - offset >= entry_count * 0x12)
102 | 
103 |     str_res = "RMW STAGING BUF:\n"
104 |     for i in range(entry_count):
105 |         iospecial_addr, and_val, or_val = struct.unpack_from("<HQQ", patch_data, offset)
106 |         str_res += " 0x%03x: AND=0x%016x: OR=0x%016x\n" % (iospecial_addr, and_val, or_val)
107 |         offset += 0x12
108 |     return str_res.rstrip("\n"), offset
109 | 
110 | def parser_rid_rmw_creg(patch_data, offset):
111 |     str_res = ""
112 |     assert(len(patch_data) - offset >= 2)
113 |     entry_count, = struct.unpack_from("<H", patch_data, offset)
114 |     offset += 2
115 |     assert(len(patch_data) - offset >= entry_count * 0x14)
116 | 
117 |     str_res = "RMW CREG:\n"
118 |     for i in range(entry_count):
119 |         creg_addr, and_val, or_val = struct.unpack_from("<LQQ", patch_data, offset)
120 |         str_res += " 0x%03x: AND=0x%016x: OR=0x%016x\n" % (creg_addr, and_val, or_val)
121 |         offset += 0x14
122 |     return str_res.rstrip("\n"), offset
123 | 
124 | def parser_rid_rmw_uram(patch_data, offset):
125 |     str_res = ""
126 |     assert(len(patch_data) - offset >= 2)
127 |     entry_count, = struct.unpack_from("<H", patch_data, offset)
128 |     offset += 2
129 |     assert(len(patch_data) - offset >= entry_count * 0x14)
130 | 
131 |     str_res = "RMW URAM:\n"
132 |     for i in range(entry_count):
133 |         uram_addr, and_val, or_val = struct.unpack_from("<LQQ", patch_data, offset)
134 |         str_res += " 0x%03x: AND=0x%016x: OR=0x%016x\n" % (uram_addr, and_val, or_val)
135 |         offset += 0x14
136 |     return str_res.rstrip("\n"), offset
137 | 
138 | def parser_rid_rmw_creg_sync(patch_data, offset):
139 |     str_res = ""
140 |     assert(len(patch_data) - offset >= 2)
141 |     entry_count, = struct.unpack_from("<H", patch_data, offset)
142 |     offset += 2
143 |     assert(len(patch_data) - offset >= entry_count * 0x14)
144 | 
145 |     str_res = "RMW CREG SYNC:\n"
146 |     for i in range(entry_count):
147 |         creg_addr, and_val, or_val = struct.unpack_from("<LQQ", patch_data, offset)
148 |         str_res += " 0x%03x: AND=0x%016x: OR=0x%016x\n" % (creg_addr, and_val, or_val)
149 |         offset += 0x14
150 |     return str_res.rstrip("\n"), offset
151 | 
152 | def parser_rid_ucall(patch_data, offset):
153 |     str_res = ""
154 |     assert(len(patch_data) - offset >= 2)
155 |     uaddr, = struct.unpack_from("<H", patch_data, offset)
156 |     str_res = "UCALL: U%04x" % uaddr
157 |     return str_res, offset + 2
158 | 
159 | def parser_rid_skip_for_pcu_mbox_op_01(patch_data, offset):
160 |     str_res = ""
161 |     assert(len(patch_data) - offset >= 8)
162 |     mbox_op_res, def_skip_size, = struct.unpack_from("<LL", patch_data, offset)
163 |     offset += 8
164 |     assert(len(patch_data) - offset >= def_skip_size)
165 |     str_res = "SKIP FOR PCU MBOX OP_01: 0x%08x: 0x%08x" % \
166 |               (mbox_op_res, offset + def_skip_size)
167 |     return str_res, offset
168 | 
169 | def parser_rid_halt_pcu(patch_data, offset):
170 |     return "HALT PCU", offset
171 | 
172 | def parser_rid_resume_pcu(patch_data, offset):
173 |     return "RESUME PCU", offset
174 | 
175 | def parser_rid_write_pcu_ldat(patch_data, offset):
176 |     str_res = ""
177 |     assert(len(patch_data) - offset >= 6)
178 |     sdat, pdat, entry_count, = struct.unpack_from("<HHH", patch_data, offset)
179 |     offset += 6
180 |     assert(len(patch_data) - offset >= entry_count * 8)
181 | 
182 |     global g_pcode
183 |     paddr = (((sdat >> 2) & 0xf) * 0x1000 + pdat) * 8
184 |     if len(g_pcode) < paddr:
185 |         g_pcode = g_pcode + b'\x00' * (paddr - len(g_pcode))
186 |     
187 |     str_res = "WRITE PCU LDAT: PDAT=0x%04x: SDAT=0x%04x\n"% (pdat, sdat)
188 |     for i in range(entry_count):
189 |         outval, = struct.unpack_from("<Q", patch_data, offset)
190 |         str_res += " 0x%016x\n" % outval
191 |         g_pcode = g_pcode + struct.pack("<Q", outval)
192 |         offset += 8
193 |     return str_res.rstrip("\n"), offset
194 | 
195 | def parser_rid_rmw_pcu_mbox_op_05(patch_data, offset):
196 |     str_res = ""
197 |     assert(len(patch_data) - offset >= 2)
198 |     entry_count, = struct.unpack_from("<H", patch_data, offset)
199 |     offset += 2
200 |     assert(len(patch_data) - offset >= entry_count * 0x0a)
201 | 
202 |     str_res = "RMW PCU MBOX OP_05:\n"
203 |     for i in range(entry_count):
204 |         op_data, and_val, or_val = struct.unpack_from("<HLL", patch_data, offset)
205 |         str_res += " 0x%04x: AND=0x%08x: OR=0x%08x\n" % (op_data, and_val, or_val)
206 |         offset += 0x0a
207 |     return str_res.rstrip("\n"), offset
208 | 
209 | def parser_rid_pcu_mbox(patch_data, offset):
210 |     str_res = ""
211 |     assert(len(patch_data) - offset >= 5)
212 |     opcode, data = struct.unpack_from("<BL", patch_data, offset)
213 |     offset += 5
214 |     str_res = "PCU MBOX: OP=0x%02x, DATA=0x%08x" % (opcode, data)
215 |     return str_res, offset
216 | 
217 | def parser_rid_skip_for_mode_c000(patch_data, offset):
218 |     str_res = ""
219 |     assert(len(patch_data) - offset >= 4)
220 |     skip_size, = struct.unpack_from("<L", patch_data, offset)
221 |     offset += 4
222 |     assert(len(patch_data) - offset >= skip_size)
223 |     str_res = "SKIP FOR MODE 0xc000: 0x%08x" % (offset + skip_size)
224 |     return str_res, offset
225 | 
226 | def parser_rid_skip_for_mode_4000(patch_data, offset):
227 |     str_res = ""
228 |     assert(len(patch_data) - offset >= 4)
229 |     skip_size, = struct.unpack_from("<L", patch_data, offset)
230 |     offset += 4
231 |     assert(len(patch_data) - offset >= skip_size)
232 |     str_res = "SKIP FOR MODE 0x4000: 0x%08x" % (offset + skip_size)
233 |     return str_res, offset
234 | 
235 | g_parsers = {0x00: parser_rid_end,
236 |              0x01: parser_rid_init,
237 |              0x02: parser_rid_patch_ram,
238 |              0x03: parser_rid_match_patch,
239 |              0x05: parser_rid_rmw_stg_buf,
240 |              0x06: parser_rid_rmw_creg,
241 |              0x07: parser_rid_rmw_uram,
242 |              0x08: parser_rid_rmw_creg_sync,
243 |              0x0a: parser_rid_ucall,
244 |              0x0c: parser_rid_skip_for_pcu_mbox_op_01,
245 |              0x0d: parser_rid_halt_pcu,
246 |              0x0e: parser_rid_resume_pcu,
247 |              0x0f: parser_rid_write_pcu_ldat,
248 |              0x10: parser_rid_rmw_pcu_mbox_op_05,
249 |              0x11: parser_rid_pcu_mbox,
250 |              0x1d: parser_rid_skip_for_mode_c000,
251 |              0x1e: parser_rid_skip_for_mode_4000}
252 | 
253 | def parse_ucode_patch(patch_data):
254 |     str_res = ""
255 |     offset = 0
256 |     while (offset < len(patch_data)):
257 |         run_id, = struct.unpack_from("<B", patch_data, offset)
258 |         #assert(run_id in g_parsers)
259 |         if run_id not in g_parsers:
260 |             break
261 |         parse_res = g_parsers[run_id](patch_data, offset + 1)
262 |         str_res += "0x%04x: 0x%02x(U%04x): %s\n" % (offset, run_id, (run_id << 2) + 0x226c, parse_res[0])
263 |         offset = parse_res[1] 
264 |     return str_res
265 | 
266 | def save_ms_array(array_idx, ms_array_data, file_path):
267 |     fo = open(file_path, "w")
268 |     fo.write("array %02x:" % array_idx)
269 |     for addr, data_item in enumerate(ms_array_data):
270 |         if addr % 4 == 0:
271 |             fo.write("\n%04x: " % addr)
272 |         fo.write(" %012x" % data_item)
273 |     for i in range(-len(ms_array_data) % 4):  # pad the last row to four entries
274 |         fo.write(" %012x" % 0)
275 |     fo.write("\n")
276 |     fo.close()
277 | 
278 | def main():
279 |     if len(sys.argv) < 2:
280 |         print("Usage: glm_ucode_patch_parser <decoded_patch_path> [-v]")
281 |         return -1
282 |     
283 |     patch_path = sys.argv[1]
284 |     # Read the whole decoded patch into memory
285 |     with open(patch_path, "rb") as fi:
286 |         patch_data = fi.read()
287 | 
288 |     out_file_path = os.path.splitext(patch_path)[0] + ".txt"
289 |     parsed_data = parse_ucode_patch(patch_data)
290 |     fo = open(out_file_path, "w")
291 |     fo.write(parsed_data)
292 |     fo.close()
293 | 
294 |     verbose = len(sys.argv) > 2 and sys.argv[2] == "-v"
295 |     if verbose: 
296 |         pcode_out_file_path = os.path.splitext(patch_path)[0] + ".pcode.bin"
297 |         fo = open(pcode_out_file_path, "wb")
298 |         fo.write(g_pcode)
299 |         fo.close()
300 | 
301 |         global g_match_patch_regs
302 |         if len(g_match_patch_regs):
303 |             g_match_patch_regs += (0,) * (0x40 - len(g_match_patch_regs))
304 |             save_ms_array(3, g_match_patch_regs, os.path.splitext(patch_path)[0] + ".ms_array3.txt")
305 |     
306 |         if len(g_patch_ram):
307 |             assert(len(g_patch_ram_seqwords))
308 |             save_ms_array(2, g_patch_ram_seqwords, os.path.splitext(patch_path)[0] + ".ms_array2.txt")
309 | 
310 |             patch_ram_array_data = [0,] * 0x200
311 |             for idx, patch_ram_item in enumerate(g_patch_ram):
312 |                 patch_ram_array_data[(idx // 3) + (idx % 3) * 0x80] = patch_ram_item
313 |             save_ms_array(4, patch_ram_array_data, os.path.splitext(patch_path)[0] + ".ms_array4.txt")
314 |     
315 |     print("File [%s] processed" % patch_path)
316 |     return 0
317 | 
318 | if __name__ == "__main__":
319 |     sys.exit(main())


--------------------------------------------------------------------------------