├── .gitignore ├── README.md └── u4pak.py /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | .* 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | u4pak 2 | ===== 3 | 4 | Unpack, pack, list, test and mount Unreal Engine 4 .pak archives. 5 | 6 | **NOTE:** I've written an [alternative version](https://github.com/panzi/rust-u4pak) 7 | of this in Rust and compiled a [self-contained binary](https://github.com/panzi/rust-u4pak/releases) 8 | for Windows users. So there is no hassle with installing Python, plus it adds a 9 | way to supply command line arguments via a file that you can associate with the 10 | binary so you only need to double-click that. It is also faster, mainly because 11 | it uses multi-threading. (Note that its command line arguments are slightly 12 | different.) 13 | 14 | Basic usage: 15 | 16 | u4pak.py info - print archive summary info 17 | u4pak.py list - list contents of .pak archive 18 | u4pak.py test - test archive integrity 19 | u4pak.py unpack - extract .pak archive 20 | u4pak.py pack - create .pak archive 21 | u4pak.py mount - mount archive as read-only file system 22 | 23 | For unpacking, only unencrypted archives of versions 1, 2, 3, 4, and 7 are supported; 24 | for packing, only versions 1, 2, and 3. 25 | 26 | **NOTE:** If you know (cheap) games that use other archive versions please tell me! 27 | Especially if it's 5 or 6. There is a change in how certain offsets are handled at 28 | some point, but since I only have an example file of version 7 I don't know if it 29 | happened in version 5, 6, or 7. 30 | 31 | The `mount` command depends on the [llfuse](https://code.google.com/p/python-llfuse/) 32 | Python package. If it's not available the rest still works. 33 | 34 | This script is compatible with Python 3.7. 35 | 36 | If you get errors saying anything about `'utf8' codec can't decode byte [...]` try to 37 | use another encoding by passing `--encoding=iso-8859-1` or similar. 38 | 39 | If you get an error message about an illegal file magic try to pass `--ignore-magic`. 40 | If you get an error message about the archive version being 0 try to pass 41 | `--force-version=1` (or a higher number). 42 | 43 | File Format 44 | ----------- 45 | 46 | Byte order is little endian and the character encoding of file names seems to be 47 | ASCII (or ISO-8859-1/UTF-8 that coincidentally only uses ASCII compatible 48 | characters). 49 | 50 | Offsets and sizes seem to be 64bit or at least unsigned 32bit integers. If 51 | interpreted as 32bit integers all sizes (except the size of file names) and offsets 52 | are followed by another 32bit integer of the value 0, which makes me guess these 53 | are 64bit values. Also some values exceed the range of signed 32bit integers, so 54 | they have to be at least unsigned 32bit integers. This information was reverse 55 | engineered from the Elemental [Demo](https://wiki.unrealengine.com/Linux_Demos) 56 | for Linux (which contains a 2.5 GB .pak file). 57 | 58 | Basic layout: 59 | 60 | * Data Records 61 | * Index 62 | * Index Header 63 | * Index Records 64 | * Footer 65 | 66 | In order to parse a file you need to read the footer first. The footer contains 67 | an offset pointer to the start of the index records. The index records then 68 | contain offset pointers to the data records. 69 |
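To make that parsing order concrete, here is a minimal sketch of reading the footer with plain Python. It is illustrative only and not part of u4pak.py (the `read_footer` name is made up), but the field layout follows the Footer table below:

    import struct

    def read_footer(path):
        # The footer is the last 44 bytes of the archive; all fields are little endian.
        with open(path, "rb") as stream:
            stream.seek(-44, 2)  # 44 bytes back from the end of the file
            magic, version, index_offset, index_size, index_sha1 = \
                struct.unpack("<IIQQ20s", stream.read(44))
        if magic != 0x5A6F12E1:
            raise ValueError("not a .pak archive (bad file magic)")
        return version, index_offset, index_size, index_sha1

With `index_offset` and `index_size` in hand you can seek to the index, read the mount point and record count, and then walk the index records described below.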
70 | Some games seem to zero out parts of the file. In particular the footer, which 71 | makes it pretty much impossible to read the file without manual analysis and 72 | guessing. I suspect these games have the footer included somewhere in the game 73 | binary. If it's not obfuscated one might be able to find it by searching for the 74 | file magic (assuming the file magic is even included). 75 | 76 | ### Record 77 | 78 | Offset Size Type Description 79 | 0 8 uint64_t offset 80 | 8 8 uint64_t size (N) 81 | 16 8 uint64_t uncompressed size 82 | 24 4 uint32_t compression method: 83 | 0x00 ... none 84 | 0x01 ... zlib 85 | 0x10 ... bias memory 86 | 0x20 ... bias speed 87 | if version <= 1 88 | 28 8 uint64_t timestamp 89 | end 90 | ? 20 uint8_t[20] data sha1 hash 91 | if version >= 3 92 | if compression method != 0x00 93 | ?+20 4 uint32_t block count (M) 94 | ?+24 M*16 CB[M] compression blocks 95 | end 96 | ? 1 uint8_t is encrypted 97 | ?+1 4 uint32_t The uncompressed size of each compression block. 98 | end The last block can be smaller, of course. 99 | 100 | ### Compression Block (CB) 101 | 102 | Size: 16 bytes 103 | 104 | Offset Size Type Description 105 | 0 8 uint64_t compressed data block start offset. 106 | version <= 4: offset is absolute to the file 107 | version 7: offset is relative to the offset 108 | field in the corresponding Record 109 | 8 8 uint64_t compressed data block end offset. 110 | There may or may not be a gap between blocks. 111 | version <= 4: offset is absolute to the file 112 | version 7: offset is relative to the offset 113 | field in the corresponding Record 114 | 115 | ### Data Record 116 | 117 | Offset Size Type Description 118 | 0 ? Record file metadata (offset field is 0, N = compressed_size) 119 | ? N uint8_t[N] file data 120 | 121 | ### Index Record 122 | 123 | Offset Size Type Description 124 | 0 4 uint32_t file name size (S) 125 | 4 S char[S] file name (includes terminating null byte) 126 | 4+S ? Record file metadata 127 | 128 | ### Index 129 | 130 | Offset Size Type Description 131 | 0 4 uint32_t mount point size (S) 132 | 4 S char[S] mount point (includes terminating null byte) 133 | S+4 4 uint32_t record count (N) 134 | S+8 ?
IndexRecord[N] records 135 | 136 | ### Footer 137 | 138 | Size: 44 bytes 139 | 140 | Offset Size Type Description 141 | 0 4 uint32_t magic: 0x5A6F12E1 142 | 4 4 uint32_t version: 1, 2, 3, 4, or 7 143 | 8 8 uint64_t index offset 144 | 16 8 uint64_t index size 145 | 24 20 uint8_t[20] index sha1 hash 146 | 147 | Related Projects 148 | ---------------- 149 | 150 | * [fezpak](https://github.com/panzi/fezpak): pack, unpack, list and mount FEZ .pak archives 151 | * [psypkg](https://github.com/panzi/psypkg): pack, unpack, list and mount Psychonauts .pkg archives 152 | * [bgebf](https://github.com/panzi/bgebf): unpack, list and mount Beyond Good and Evil .bf archives 153 | * [unvpk](https://github.com/panzi/unvpk): extract, list, check and mount Valve .vpk archives 154 | * [rust-vpk](https://github.com/panzi/rust-vpk/): Rust rewrite of the above 155 | * [t2fbq](https://github.com/panzi/t2fbq): unpack, list and mount Trine 2 .fbq archives 156 | * [rust-u4pak](https://github.com/panzi/rust-u4pak): not yet finished Rust rewrite of this script 157 | 158 | BSD License 159 | ----------- 160 | Copyright (c) 2014-2019 Mathias Panzenböck 161 | 162 | Permission is hereby granted, free of charge, to any person obtaining a copy 163 | of this software and associated documentation files (the "Software"), to deal 164 | in the Software without restriction, including without limitation the rights 165 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 166 | copies of the Software, and to permit persons to whom the Software is 167 | furnished to do so, subject to the following conditions: 168 | 169 | The above copyright notice and this permission notice shall be included in 170 | all copies or substantial portions of the Software. 171 | 172 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 173 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 174 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 175 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 176 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 177 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 178 | THE SOFTWARE. 179 | -------------------------------------------------------------------------------- /u4pak.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding=UTF-8 3 | # 4 | # Copyright (c) 2014 Mathias Panzenböck 5 | # 6 | # Permission is hereby granted, free of charge, to any person obtaining a copy 7 | # of this software and associated documentation files (the "Software"), to deal 8 | # in the Software without restriction, including without limitation the rights 9 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | # copies of the Software, and to permit persons to whom the Software is 11 | # furnished to do so, subject to the following conditions: 12 | # 13 | # The above copyright notice and this permission notice shall be included in 14 | # all copies or substantial portions of the Software. 15 | # 16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 19 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 22 | # THE SOFTWARE. 23 | 24 | from __future__ import annotations, with_statement, division, print_function 25 | 26 | import os 27 | import io 28 | import sys 29 | import hashlib 30 | import zlib 31 | import math 32 | import argparse 33 | 34 | from struct import unpack as st_unpack, pack as st_pack 35 | from collections import OrderedDict 36 | from io import DEFAULT_BUFFER_SIZE 37 | from binascii import hexlify 38 | from typing import NamedTuple, Optional, Tuple, List, Dict, Set, Iterable, Iterator, Callable, IO, Any, Union 39 | 40 | try: 41 | import llfuse # type: ignore 42 | except ImportError: 43 | HAS_LLFUSE = False 44 | else: 45 | HAS_LLFUSE = True 46 | 47 | HAS_STAT_NS = hasattr(os.stat_result, 'st_atime_ns') 48 | 49 | __all__ = 'read_index', 'pack' 50 | 51 | # for Python < 3.3 and Windows 52 | def highlevel_sendfile(outfile: io.BufferedWriter, infile: io.BufferedReader, offset: int, size: int) -> None: 53 | infile.seek(offset,0) 54 | buf_size = DEFAULT_BUFFER_SIZE 55 | buf = bytearray(buf_size) 56 | while size > 0: 57 | if size >= buf_size: 58 | n = infile.readinto(buf) or 0 59 | if n < buf_size: 60 | raise IOError("unexpected end of file") 61 | outfile.write(buf) 62 | size -= buf_size 63 | else: 64 | data = infile.read(size) or b'' 65 | if len(data) < size: 66 | raise IOError("unexpected end of file") 67 | outfile.write(data) 68 | size = 0 69 | 70 | if hasattr(os, 'sendfile'): 71 | def os_sendfile(outfile: io.BufferedWriter, infile: io.BufferedReader, offset: int, size: int) -> None: 72 | try: 73 | out_fd = outfile.fileno() 74 | in_fd = infile.fileno() 75 | except: 76 | highlevel_sendfile(outfile, infile, offset, size) 77 | else: 78 | # size == 0 has special meaning for some sendfile implentations 79 | if size > 0: 80 | os.sendfile(out_fd, in_fd, offset, size) 81 | sendfile = os_sendfile 82 | else: 83 | sendfile = highlevel_sendfile 84 | 85 | def raise_check_error(ctx: Optional[Record], message: str) -> None: 86 | if ctx is None: 87 | raise ValueError(message) 88 | 89 | elif isinstance(ctx, Record): 90 | raise ValueError("%s: %s" % (ctx.filename, message)) 91 | 92 | else: 93 | raise ValueError("%s: %s" % (ctx, message)) 94 | 95 | class FragInfo(object): 96 | __slots__ = '__frags', '__size' 97 | 98 | __size: int 99 | __frags: List[Tuple[int, int]] 100 | 101 | def __init__(self, size: int, frags: Optional[List[Tuple[int, int]]] = None) -> None: 102 | self.__size = size 103 | self.__frags = [] 104 | if frags: 105 | for start, end in frags: 106 | self.add(start, end) 107 | 108 | @property 109 | def size(self) -> int: 110 | return self.__size 111 | 112 | def __iter__(self) -> Iterator[Tuple[int, int]]: 113 | return iter(self.__frags) 114 | 115 | def __len__(self) -> int: 116 | return len(self.__frags) 117 | 118 | def __repr__(self) -> str: 119 | return 'FragInfo(%r,%r)' % (self.__size, self.__frags) 120 | 121 | def add(self, new_start: int, new_end: int) -> None: 122 | if new_start >= new_end: 123 | return 124 | 125 | elif new_start >= self.__size or new_end > self.__size: 126 | raise IndexError("range out of bounds: (%r, %r]" % (new_start, new_end)) 127 | 128 | frags = self.__frags 129 | for i, (start, end) in enumerate(frags): 130 | if new_end < start: 131 | frags.insert(i, (new_start, new_end)) 132 | return 133 | 134 | 
elif new_start <= start: 135 | if new_end <= end: 136 | frags[i] = (new_start, end) 137 | return 138 | 139 | elif new_start <= end: 140 | if new_end > end: 141 | new_start = start 142 | else: 143 | continue 144 | 145 | j = i+1 146 | n = len(frags) 147 | while j < n: 148 | next_start, next_end = frags[j] 149 | if next_start <= new_end: 150 | j += 1 151 | if next_end > new_end: 152 | new_end = next_end 153 | break 154 | else: 155 | break 156 | 157 | frags[i:j] = [(new_start, new_end)] 158 | return 159 | 160 | frags.append((new_start, new_end)) 161 | 162 | def invert(self) -> FragInfo: 163 | inverted = FragInfo(self.__size) 164 | append = inverted.__frags.append 165 | prev_end = 0 166 | 167 | for start, end in self.__frags: 168 | if start > prev_end: 169 | append((prev_end, start)) 170 | prev_end = end 171 | 172 | if self.__size > prev_end: 173 | append((prev_end, self.__size)) 174 | 175 | return inverted 176 | 177 | def free(self) -> int: 178 | free = 0 179 | prev_end = 0 180 | 181 | for start, end in self.__frags: 182 | free += start - prev_end 183 | prev_end = end 184 | 185 | free += self.__size - prev_end 186 | 187 | return free 188 | 189 | class Pak(object): 190 | __slots__ = ('version', 'index_offset', 'index_size', 'footer_offset', 'index_sha1', 'mount_point', 'records') 191 | 192 | version: int 193 | index_offset: int 194 | index_size: int 195 | footer_offset: int 196 | index_sha1: bytes 197 | mount_point: Optional[str] 198 | records: List[Record] 199 | 200 | def __init__(self, version: int, index_offset: int, index_size: int, footer_offset: int, index_sha1: bytes, mount_point: Optional[str] = None, records: Optional[List[Record]] = None) -> None: 201 | self.version = version 202 | self.index_offset = index_offset 203 | self.index_size = index_size 204 | self.footer_offset = footer_offset 205 | self.index_sha1 = index_sha1 206 | self.mount_point = mount_point 207 | self.records = records or [] 208 | 209 | def __len__(self) -> int: 210 | return len(self.records) 211 | 212 | def __iter__(self) -> Iterator[Record]: 213 | return iter(self.records) 214 | 215 | def __repr__(self) -> str: 216 | return 'Pak(version=%r, index_offset=%r, index_size=%r, footer_offset=%r, index_sha1=%r, mount_point=%r, records=%r)' % ( 217 | self.version, self.index_offset, self.index_size, self.footer_offset, self.index_sha1, self.mount_point, self.records) 218 | 219 | def check_integrity(self, stream: io.BufferedReader, callback: Callable[[Optional[Record], str], None] = raise_check_error, ignore_null_checksums: bool = False) -> None: 220 | index_offset = self.index_offset 221 | buf = bytearray(DEFAULT_BUFFER_SIZE) 222 | 223 | read_record: Callable[[io.BufferedReader, str], Record] 224 | if self.version == 1: 225 | read_record = read_record_v1 226 | 227 | elif self.version == 2: 228 | read_record = read_record_v2 229 | 230 | elif self.version == 3: 231 | read_record = read_record_v3 232 | 233 | elif self.version == 4: 234 | read_record = read_record_v4 235 | 236 | elif self.version == 7: 237 | read_record = read_record_v7 238 | 239 | else: 240 | raise ValueError(f'unsupported version: {self.version}') 241 | 242 | def check_data(ctx, offset, size, sha1): 243 | if ignore_null_checksums and sha1 == b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00': 244 | return 245 | 246 | hasher = hashlib.sha1() 247 | stream.seek(offset, 0) 248 | 249 | while size > 0: 250 | if size >= DEFAULT_BUFFER_SIZE: 251 | size -= stream.readinto(buf) 252 | hasher.update(buf) 253 | else: 254 | rest = 
stream.read(size) 255 | assert rest is not None 256 | hasher.update(rest) 257 | size = 0 258 | 259 | if hasher.digest() != sha1: 260 | callback(ctx, 261 | 'checksum missmatch:\n' 262 | '\tgot: %s\n' 263 | '\texpected: %s' % ( 264 | hasher.hexdigest(), 265 | hexlify(sha1).decode('latin1'))) 266 | 267 | # test index sha1 sum 268 | check_data("", index_offset, self.index_size, self.index_sha1) 269 | 270 | for r1 in self: 271 | stream.seek(r1.offset, 0) 272 | r2 = read_record(stream, r1.filename) 273 | 274 | # test index metadata 275 | if r2.offset != 0: 276 | callback(r2, 'data record offset field is not 0 but %d' % r2.offset) 277 | 278 | if not same_metadata(r1, r2): 279 | callback(r1, 'metadata missmatch:\n%s' % metadata_diff(r1, r2)) 280 | 281 | if r1.compression_method not in COMPR_METHODS: 282 | callback(r1, 'unknown compression method: 0x%02x' % r1.compression_method) 283 | 284 | if r1.compression_method == COMPR_NONE and r1.compressed_size != r1.uncompressed_size: 285 | callback(r1, 'file is not compressed but compressed size (%d) differes from uncompressed size (%d)' % 286 | (r1.compressed_size, r1.uncompressed_size)) 287 | 288 | if r1.data_offset + r1.compressed_size > index_offset: 289 | callback(None, 'data bleeds into index') 290 | 291 | # test file sha1 sum 292 | if ignore_null_checksums and r1.sha1 == b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00': 293 | pass 294 | elif r1.compression_blocks is None: 295 | check_data(r1, r1.data_offset, r1.compressed_size, r1.sha1) 296 | else: 297 | hasher = hashlib.sha1() 298 | base_offset = r1.base_offset 299 | for start_offset, end_offset in r1.compression_blocks: 300 | block_size = end_offset - start_offset 301 | stream.seek(base_offset + start_offset, 0) 302 | data = stream.read(block_size) 303 | hasher.update(data) 304 | 305 | if hasher.digest() != r1.sha1: 306 | callback(r1, 307 | 'checksum missmatch:\n' 308 | '\tgot: %s\n' 309 | '\texpected: %s' % ( 310 | hasher.hexdigest(), 311 | hexlify(r1.sha1).decode('latin1'))) 312 | 313 | def unpack(self, stream: io.BufferedReader, outdir: str=".", callback: Callable[[str], None] = lambda name: None) -> None: 314 | for record in self: 315 | record.unpack(stream, outdir, callback) 316 | 317 | def unpack_only(self, stream: io.BufferedReader, files: Iterable[str], outdir: str = ".", callback: Callable[[str], None] = lambda name: None) -> None: 318 | for record in self: 319 | if shall_unpack(files, record.filename): 320 | record.unpack(stream, outdir, callback) 321 | 322 | def frag_info(self) -> FragInfo: 323 | frags = FragInfo(self.footer_offset + 44) 324 | frags.add(self.index_offset, self.index_offset + self.index_size) 325 | frags.add(self.footer_offset, frags.size) 326 | 327 | for record in self.records: 328 | frags.add(record.offset, record.data_offset + record.compressed_size) 329 | 330 | return frags 331 | 332 | def print_list(self, details: bool = False, human: bool = False, delim: str = "\n", sort_key_func: Optional[Callable[[Record], Any]] = None, out: IO[str] = sys.stdout) -> None: 333 | records = self.records 334 | 335 | if sort_key_func: 336 | records = sorted(records, key=sort_key_func) 337 | 338 | if details: 339 | size_to_str: Callable[[int], str] 340 | if human: 341 | size_to_str = human_size 342 | else: 343 | size_to_str = str 344 | 345 | count = 0 346 | sum_size = 0 347 | out.write(" Offset Size Compr-Method Compr-Size SHA1 Name%s" % delim) 348 | for record in records: 349 | size = size_to_str(record.uncompressed_size) 350 | sha1 = 
hexlify(record.sha1).decode('latin1') 351 | cmeth = record.compression_method 352 | 353 | if cmeth == COMPR_NONE: 354 | out.write("%10u %10s - - %s %s%s" % ( 355 | record.data_offset, size, sha1, record.filename, delim)) 356 | else: 357 | out.write("%10u %10s %12s %10s %s %s%s" % ( 358 | record.data_offset, size, COMPR_METHOD_NAMES[cmeth], 359 | size_to_str(record.compressed_size), sha1, 360 | record.filename, delim)) 361 | count += 1 362 | sum_size += record.uncompressed_size 363 | out.write("%d file(s) (%s) %s" % (count, size_to_str(sum_size), delim)) 364 | else: 365 | for record in records: 366 | out.write("%s%s" % (record.filename, delim)) 367 | 368 | def print_info(self, human: bool = False, out: IO[str] = sys.stdout) -> None: 369 | size_to_str: Callable[[int], str] 370 | if human: 371 | size_to_str = human_size 372 | else: 373 | size_to_str = str 374 | 375 | csize = 0 376 | size = 0 377 | for record in self.records: 378 | csize += record.compressed_size 379 | size += record.uncompressed_size 380 | 381 | frags = self.frag_info() 382 | 383 | out.write("Pak Version: %d\n" % self.version) 384 | out.write("Index SHA1: %s\n" % hexlify(self.index_sha1).decode('latin1')) 385 | out.write("Mount Point: %s\n" % self.mount_point) 386 | out.write("File Count: %d\n" % len(self.records)) 387 | out.write("Archive Size: %10s\n" % size_to_str(frags.size)) 388 | out.write("Unallocated Bytes: %10s\n" % size_to_str(frags.free())) 389 | out.write("Sum Compr. Files Size: %10s\n" % size_to_str(csize)) 390 | out.write("Sum Uncompr. Files Size: %10s\n" % size_to_str(size)) 391 | out.write("\n") 392 | out.write("Fragments (%d):\n" % len(frags)) 393 | 394 | for start, end in frags: 395 | out.write("\t%10s ... %10s (%10s)\n" % (start, end, size_to_str(end - start))) 396 | 397 | def mount(self, stream: io.BufferedReader, mountpt: str, foreground: bool = False, debug: bool = False) -> None: 398 | mountpt = os.path.abspath(mountpt) 399 | ops = Operations(stream, self) 400 | args = ['fsname=u4pak', 'subtype=u4pak', 'ro'] 401 | 402 | if debug: 403 | foreground = True 404 | args.append('debug') 405 | 406 | if not foreground: 407 | deamonize() 408 | 409 | llfuse.init(ops, mountpt, args) 410 | try: 411 | llfuse.main() 412 | finally: 413 | llfuse.close() 414 | 415 | # compare all metadata except for the filename 416 | def same_metadata(r1: Record, r2: Record) -> bool: 417 | # data records always have offset == 0 it seems, so skip that 418 | return \ 419 | r1.compressed_size == r2.compressed_size and \ 420 | r1.uncompressed_size == r2.uncompressed_size and \ 421 | r1.compression_method == r2.compression_method and \ 422 | r1.timestamp == r2.timestamp and \ 423 | r1.sha1 == r2.sha1 and \ 424 | r1.compression_blocks == r2.compression_blocks and \ 425 | r1.encrypted == r2.encrypted and \ 426 | r1.compression_block_size == r2.compression_block_size 427 | 428 | def metadata_diff(r1: Record, r2: Record) -> str: 429 | diff = [] 430 | 431 | for attr in ['compressed_size', 'uncompressed_size', 'timestamp', 'encrypted', 'compression_block_size']: 432 | v1 = getattr(r1,attr) 433 | v2 = getattr(r2,attr) 434 | if v1 != v2: 435 | diff.append('\t%s: %r != %r' % (attr, v1, v2)) 436 | 437 | if r1.sha1 != r2.sha1: 438 | diff.append('\tsha1: %s != %s' % (hexlify(r1.sha1).decode('latin1'), hexlify(r2.sha1).decode('latin1'))) 439 | 440 | if r1.compression_blocks != r2.compression_blocks: 441 | diff.append('\tcompression_blocks:\n\t\t%r\n\t\t\t!=\n\t\t%r' % (r1.compression_blocks, r2.compression_blocks)) 442 | 443 | return '\n'.join(diff) 444 
| 445 | COMPR_NONE = 0x00 446 | COMPR_ZLIB = 0x01 447 | COMPR_BIAS_MEMORY = 0x10 448 | COMPR_BIAS_SPEED = 0x20 449 | 450 | COMPR_METHODS: Set[int] = {COMPR_NONE, COMPR_ZLIB, COMPR_BIAS_MEMORY, COMPR_BIAS_SPEED} 451 | 452 | COMPR_METHOD_NAMES: Dict[int, str] = { 453 | COMPR_NONE: 'none', 454 | COMPR_ZLIB: 'zlib', 455 | COMPR_BIAS_MEMORY: 'bias memory', 456 | COMPR_BIAS_SPEED: 'bias speed' 457 | } 458 | 459 | class Record(NamedTuple): 460 | filename: str 461 | offset: int 462 | compressed_size: int 463 | uncompressed_size: int 464 | compression_method: int 465 | timestamp: Optional[int] 466 | sha1: bytes 467 | compression_blocks: Optional[List[Tuple[int, int]]] 468 | encrypted: bool 469 | compression_block_size: Optional[int] 470 | 471 | def sendfile(self, outfile: io.BufferedWriter, infile: io.BufferedReader) -> None: 472 | if self.compression_method == COMPR_NONE: 473 | sendfile(outfile, infile, self.data_offset, self.uncompressed_size) 474 | elif self.compression_method == COMPR_ZLIB: 475 | if self.encrypted: 476 | raise NotImplementedError('zlib decompression with encryption is not implemented yet') 477 | assert self.compression_blocks is not None 478 | base_offset = self.base_offset 479 | for start_offset, end_offset in self.compression_blocks: 480 | block_size = end_offset - start_offset 481 | infile.seek(base_offset + start_offset) 482 | block_content = infile.read(block_size) 483 | assert block_content is not None 484 | block_decompress = zlib.decompress(block_content) 485 | outfile.write(block_decompress) 486 | else: 487 | raise NotImplementedError('decompression is not implemented yet') 488 | 489 | @property 490 | def base_offset(self): 491 | return 0 492 | 493 | def read(self, data: Union[memoryview, bytes, mmap.mmap], offset: int, size: int) -> Union[bytes, bytearray]: 494 | if self.encrypted: 495 | raise NotImplementedError('decryption is not supported') 496 | 497 | if self.compression_method == COMPR_NONE: 498 | uncompressed_size = self.uncompressed_size 499 | 500 | if offset >= uncompressed_size: 501 | return b'' 502 | 503 | i = self.data_offset + offset 504 | j = i + min(uncompressed_size - offset, size) 505 | return data[i:j] 506 | elif self.compression_method == COMPR_ZLIB: 507 | assert self.compression_blocks is not None 508 | base_offset = self.base_offset 509 | buffer = bytearray() 510 | end_offset = offset + size 511 | 512 | compression_block_size = self.compression_block_size 513 | assert compression_block_size 514 | start_block_index = offset // compression_block_size 515 | end_block_index = end_offset // compression_block_size 516 | 517 | current_offset = compression_block_size * start_block_index 518 | for block_start_offset, block_end_offset in self.compression_blocks[start_block_index:end_block_index + 1]: 519 | block_size = block_end_offset - block_start_offset 520 | 521 | block_content = data[base_offset + block_start_offset:base_offset + block_end_offset] 522 | block_decompress = zlib.decompress(block_content) 523 | 524 | next_offset = current_offset + len(block_decompress) 525 | if current_offset >= offset: 526 | buffer.extend(block_decompress[:end_offset - current_offset]) 527 | else: 528 | buffer.extend(block_decompress[offset - current_offset:end_offset - current_offset]) 529 | 530 | current_offset = next_offset 531 | return buffer 532 | else: 533 | raise NotImplementedError(f'decompression method {self.compression_method} is not supported') 534 | 535 | def unpack(self, stream: io.BufferedReader, outdir: str = ".", callback: Callable[[str], None] = lambda 
name: None) -> None: 536 | prefix, name = os.path.split(self.filename) 537 | prefix = os.path.join(outdir,prefix) 538 | if not os.path.exists(prefix): 539 | os.makedirs(prefix) 540 | name = os.path.join(prefix,name) 541 | callback(name) 542 | fp: io.BufferedWriter 543 | with open(name, "wb") as fp: # type: ignore 544 | self.sendfile(fp, stream) 545 | 546 | @property 547 | def data_offset(self) -> int: 548 | return self.offset + self.header_size 549 | 550 | @property 551 | def alloc_size(self) -> int: 552 | return self.header_size + self.compressed_size 553 | 554 | @property 555 | def index_size(self) -> int: 556 | name_size = 4 + len(self.filename.replace(os.path.sep,'/').encode('utf-8')) + 1 557 | return name_size + self.header_size 558 | 559 | @property 560 | def header_size(self) -> int: 561 | raise NotImplementedError 562 | 563 | class RecordV1(Record): 564 | __slots__ = () 565 | 566 | def __new__(cls, filename: str, offset: int, compressed_size: int, uncompressed_size: int, compression_method: int, timestamp: Optional[int], sha1: bytes) -> RecordV1: 567 | return Record.__new__(cls, filename, offset, compressed_size, uncompressed_size, 568 | compression_method, timestamp, sha1, None, False, None) # type: ignore 569 | 570 | @property 571 | def header_size(self) -> int: 572 | return 56 573 | 574 | class RecordV2(Record): 575 | __slots__ = () 576 | 577 | def __new__(cls, filename: str, offset: int, compressed_size: int, uncompressed_size: int, compression_method: int, sha1: bytes) -> RecordV2: 578 | return Record.__new__(cls, filename, offset, compressed_size, uncompressed_size, 579 | compression_method, None, sha1, None, False, None) # type: ignore 580 | 581 | @property 582 | def header_size(self): 583 | return 48 584 | 585 | class RecordV3(Record): 586 | __slots__ = () 587 | 588 | def __new__(cls, filename: str, offset: int, compressed_size: int, uncompressed_size: int, compression_method: int, sha1: bytes, 589 | compression_blocks: Optional[List[Tuple[int, int]]], encrypted: bool, compression_block_size: Optional[int]) -> RecordV3: 590 | return Record.__new__(cls, filename, offset, compressed_size, uncompressed_size, 591 | compression_method, None, sha1, compression_blocks, encrypted, 592 | compression_block_size) # type: ignore 593 | 594 | @property 595 | def header_size(self) -> int: 596 | size = 53 597 | if self.compression_method != COMPR_NONE: 598 | assert self.compression_blocks is not None 599 | size += len(self.compression_blocks) * 16 600 | return size 601 | 602 | # XXX: Don't know at which version exactly the change happens. 603 | # Only know 4 is relative, 7 is absolute. 
604 | class RecordV7(RecordV3): 605 | @property 606 | def base_offset(self): 607 | return self.offset 608 | 609 | def read_path(stream: io.BufferedReader, encoding: str = 'utf-8') -> str: 610 | path_len, = st_unpack(' bytes: 618 | encoded_path = path.replace(os.path.sep, '/').encode('utf-8') + b'\0' 619 | return st_pack(' bytes: 622 | data = pack_path(path,encoding) 623 | stream.write(data) 624 | return data 625 | 626 | def read_record_v1(stream: io.BufferedReader, filename: str) -> RecordV1: 627 | return RecordV1(filename, *st_unpack(' RecordV2: 630 | return RecordV2(filename, *st_unpack(' RecordV3: 633 | offset, compressed_size, uncompressed_size, compression_method, sha1 = \ 634 | st_unpack(' RecordV3: 652 | offset, compressed_size, uncompressed_size, compression_method, sha1 = \ 653 | st_unpack(' Tuple[int, bytes]: 676 | if compression_method != COMPR_NONE: 677 | raise NotImplementedError("compression is not implemented") 678 | 679 | if encrypted: 680 | raise NotImplementedError("encryption is not implemented") 681 | 682 | buf_size = DEFAULT_BUFFER_SIZE 683 | buf = bytearray(buf_size) 684 | bytes_left = size 685 | hasher = hashlib.sha1() 686 | while bytes_left > 0: 687 | data: Union[bytes, bytearray] 688 | if bytes_left >= buf_size: 689 | n = fh.readinto(buf) 690 | data = buf 691 | if n is None or n < buf_size: 692 | raise IOError('unexpected end of file') 693 | else: 694 | opt_data = fh.read(bytes_left) 695 | assert opt_data is not None 696 | n = len(opt_data) 697 | if n < bytes_left: 698 | raise IOError('unexpected end of file') 699 | data = opt_data 700 | bytes_left -= n 701 | hasher.update(data) 702 | archive.write(data) 703 | 704 | return size, hasher.digest() 705 | 706 | def write_data_zlib( 707 | archive: io.BufferedWriter, 708 | fh: io.BufferedReader, 709 | size: int, 710 | compression_method: int = COMPR_NONE, 711 | encrypted: bool = False, 712 | compression_block_size: int = 65536 713 | ) -> Tuple[int, bytes, int, List[int]]: 714 | if encrypted: 715 | raise NotImplementedError("encryption is not implemented") 716 | 717 | buf_size = compression_block_size 718 | block_count = int(math.ceil(size / compression_block_size)) 719 | base_offset = archive.tell() 720 | 721 | archive.write(st_pack(' 0: 739 | n: int 740 | if bytes_left >= buf_size: 741 | n = fh.readinto(buf) or 0 742 | data = zlib.compress(memoryview(buf)) 743 | 744 | compressed_size += len(data) 745 | compress_blocks[compress_block_no * 2] = cur_offset 746 | cur_offset += len(data) 747 | compress_blocks[compress_block_no * 2 + 1] = cur_offset 748 | compress_block_no += 1 749 | 750 | if n < buf_size: 751 | raise IOError('unexpected end of file') 752 | else: 753 | data = fh.read(bytes_left) or b'' 754 | n = len(data) 755 | 756 | data = zlib.compress(data) 757 | compressed_size += len(data) 758 | compress_blocks[compress_block_no * 2] = cur_offset 759 | cur_offset += len(data) 760 | compress_blocks[compress_block_no * 2 + 1] = cur_offset 761 | compress_block_no += 1 762 | 763 | if n < bytes_left: 764 | raise IOError('unexpected end of file') 765 | bytes_left -= n 766 | hasher.update(data) 767 | archive.write(data) 768 | 769 | cur_offset = archive.tell() 770 | 771 | archive.seek(base_offset + 4, 0) 772 | archive.write(st_pack('<%dQ' % (block_count * 2), *compress_blocks)) 773 | archive.seek(cur_offset, 0) 774 | 775 | return compressed_size, hasher.digest(), block_count, compress_blocks 776 | 777 | def write_record_v1( 778 | archive: io.BufferedWriter, 779 | fh: io.BufferedReader, 780 | compression_method: int = COMPR_NONE, 
781 | encrypted: bool = False, 782 | compression_block_size: int = 0) -> bytes: 783 | if encrypted: 784 | raise ValueError('version 1 does not support encryption') 785 | 786 | record_offset = archive.tell() 787 | 788 | st = os.fstat(fh.fileno()) 789 | size = st.st_size 790 | # XXX: timestamp probably needs multiplication with some factor? 791 | record = st_pack('<16xQIQ20x',size,compression_method,int(st.st_mtime)) 792 | archive.write(record) 793 | 794 | compressed_size, sha1 = write_data(archive,fh,size,compression_method,encrypted,compression_block_size) 795 | data_end = archive.tell() 796 | 797 | archive.seek(record_offset+8, 0) 798 | archive.write(st_pack(' bytes: 813 | if encrypted: 814 | raise ValueError('version 2 does not support encryption') 815 | 816 | record_offset = archive.tell() 817 | 818 | st = os.fstat(fh.fileno()) 819 | size = st.st_size 820 | record = st_pack('<16xQI20x',size,compression_method) 821 | archive.write(record) 822 | 823 | compressed_size, sha1 = write_data(archive,fh,size,compression_method,encrypted,compression_block_size) 824 | data_end = archive.tell() 825 | 826 | archive.seek(record_offset+8, 0) 827 | archive.write(st_pack(' bytes: 842 | if compression_method != COMPR_NONE and compression_method != COMPR_ZLIB: 843 | raise NotImplementedError("compression is not implemented") 844 | 845 | record_offset = archive.tell() 846 | 847 | if compression_block_size == 0 and compression_method == COMPR_ZLIB: 848 | compression_block_size = 65536 849 | 850 | st = os.fstat(fh.fileno()) 851 | size = st.st_size 852 | record = st_pack('<16xQI20x',size,compression_method) 853 | archive.write(record) 854 | 855 | if compression_method == COMPR_ZLIB: 856 | compressed_size, sha1, block_count, blocks = write_data_zlib(archive,fh,size,compression_method,encrypted,compression_block_size) 857 | else: 858 | record = st_pack(' Pak: 883 | stream.seek(-44, 2) 884 | footer_offset = stream.tell() 885 | footer = stream.read(44) 886 | magic, version, index_offset, index_size, index_sha1 = st_unpack(' footer_offset: 914 | raise ValueError('illegal index offset/size') 915 | 916 | stream.seek(index_offset, 0) 917 | 918 | mount_point = read_path(stream, encoding) 919 | entry_count = st_unpack(' footer_offset: 929 | raise ValueError('index bleeds into footer') 930 | 931 | if check_integrity: 932 | pak.check_integrity(stream, ignore_null_checksums=ignore_null_checksums) 933 | 934 | return pak 935 | 936 | def _pack_callback(name: str, files: List[str]) -> None: 937 | pass 938 | 939 | def pack(stream: io.BufferedWriter, files_or_dirs: List[str], mount_point: str, version: int = 3, compression_method: int = COMPR_NONE, 940 | encrypted: bool = False, compression_block_size: int = 0, callback: Callable[[str, List[str]], None] = _pack_callback, 941 | encoding: str='utf-8') -> None: 942 | if version == 1: 943 | write_record = write_record_v1 944 | 945 | elif version == 2: 946 | write_record = write_record_v2 947 | 948 | elif version == 3: 949 | write_record = write_record_v3 950 | 951 | else: 952 | raise ValueError('version not supported: %d' % version) 953 | 954 | files: List[str] = [] 955 | for name in files_or_dirs: 956 | if os.path.isdir(name): 957 | for dirpath, dirnames, filenames in os.walk(name): 958 | for filename in filenames: 959 | files.append(os.path.join(dirpath,filename)) 960 | else: 961 | files.append(name) 962 | 963 | files.sort() 964 | 965 | records: List[Tuple[str, bytes]] = [] 966 | for filename in files: 967 | callback(filename, files) 968 | fh: io.BufferedReader 969 | with 
open(filename, "rb") as fh: # type: ignore 970 | record = write_record(stream, fh, compression_method, encrypted, compression_block_size) 971 | records.append((filename, record)) 972 | 973 | write_index(stream,version,mount_point,records,encoding) 974 | 975 | def write_index(stream: IO[bytes], version: int, mount_point: str, records: List[Tuple[str, bytes]], encoding: str = 'utf-8') -> None: 976 | hasher = hashlib.sha1() 977 | index_offset = stream.tell() 978 | 979 | index_header = pack_path(mount_point, encoding) + st_pack(' RecordV1: 998 | st = os.stat(filename) 999 | size = st.st_size 1000 | return RecordV1(filename, -1, size, size, COMPR_NONE, int(st.st_mtime), b'') # type: ignore 1001 | 1002 | def make_record_v2(filename: str) -> RecordV2: 1003 | size = os.path.getsize(filename) 1004 | return RecordV2(filename, -1, size, size, COMPR_NONE, b'') # type: ignore 1005 | 1006 | def make_record_v3(filename: str) -> RecordV3: 1007 | size = os.path.getsize(filename) 1008 | return RecordV3(filename, -1, size, size, COMPR_NONE, b'', None, False, 0) # type: ignore 1009 | 1010 | # TODO: untested! 1011 | # removes, inserts and updates files, rewrites index, truncates archive if neccesarry 1012 | def update(stream: io.BufferedRandom, mount_point: str, insert: Optional[List[str]] = None, remove: Optional[List[str]] = None, compression_method: int = COMPR_NONE, 1013 | encrypted: bool = False, compression_block_size: int = 0, callback: Callable[[str], None] = lambda name: None, 1014 | ignore_magic: bool = False, encoding: str = 'utf-8', force_version: Optional[int] = None): 1015 | if compression_method != COMPR_NONE: 1016 | raise NotImplementedError("compression is not implemented") 1017 | 1018 | if encrypted: 1019 | raise NotImplementedError("encryption is not implemented") 1020 | 1021 | pak = read_index(stream, False, ignore_magic, encoding, force_version) 1022 | 1023 | make_record: Callable[[str], Record] 1024 | if pak.version == 1: 1025 | write_record = write_record_v1 1026 | make_record = make_record_v1 1027 | 1028 | elif pak.version == 2: 1029 | write_record = write_record_v2 1030 | make_record = make_record_v2 1031 | 1032 | elif pak.version == 3: 1033 | write_record = write_record_v3 1034 | make_record = make_record_v3 1035 | 1036 | else: 1037 | raise ValueError('version not supported: %d' % pak.version) 1038 | 1039 | # build directory tree of existing files 1040 | root = Dir(-1) 1041 | root.parent = root 1042 | for record in pak: 1043 | path = record.filename.split(os.path.sep) 1044 | path, name = path[:-1], path[-1] 1045 | 1046 | parent = root 1047 | for i, comp in enumerate(path): 1048 | comp_encoded = comp.encode(encoding) 1049 | try: 1050 | entry = parent.children[comp_encoded] 1051 | except KeyError: 1052 | entry = parent.children[comp_encoded] = Dir(-1, parent=parent) 1053 | 1054 | if not isinstance(entry, Dir): 1055 | raise ValueError("name conflict in archive: %r is not a directory" % os.path.join(*path[:i+1])) 1056 | 1057 | parent = entry 1058 | 1059 | if name in parent.children: 1060 | raise ValueError("doubled name in archive: %s" % record.filename) 1061 | 1062 | parent.children[name.encode(encoding)] = File(-1, record, parent) 1063 | 1064 | # find files to remove 1065 | if remove: 1066 | for filename in remove: 1067 | path = filename.split(os.path.sep) 1068 | path, name = path[:-1], path[-1] 1069 | 1070 | parent = root 1071 | for i, comp in enumerate(path): 1072 | comp_encoded = comp.encode(encoding) 1073 | try: 1074 | entry = parent.children[comp_encoded] 1075 | except KeyError: 
1076 | entry = parent.children[comp_encoded] = Dir(-1, parent=parent) 1077 | 1078 | if not isinstance(entry, Dir): 1079 | # TODO: maybe option to ignore this? 1080 | raise ValueError("file not in archive: %s" % filename) 1081 | 1082 | parent = entry 1083 | 1084 | if name not in parent.children: 1085 | raise ValueError("file not in archive: %s" % filename) 1086 | 1087 | name_encoded = name.encode(encoding) 1088 | entry = parent.children[name_encoded] 1089 | del parent.children[name_encoded] 1090 | 1091 | # find files to insert 1092 | if insert: 1093 | files = [] 1094 | for name in insert: 1095 | if os.path.isdir(name): 1096 | for dirpath, dirnames, filenames in os.walk(name): 1097 | for filename in filenames: 1098 | files.append(os.path.join(dirpath,filename)) 1099 | else: 1100 | files.append(name) 1101 | 1102 | for filename in files: 1103 | path = filename.split(os.path.sep) 1104 | path, name = path[:-1], path[-1] 1105 | 1106 | parent = root 1107 | for i, comp in enumerate(path): 1108 | comp_encoded = comp.encode(encoding) 1109 | try: 1110 | entry = parent.children[comp_encoded] 1111 | except KeyError: 1112 | entry = parent.children[comp_encoded] = Dir(-1, parent=parent) 1113 | 1114 | if not isinstance(entry, Dir): 1115 | raise ValueError("name conflict in archive: %r is not a directory" % os.path.join(*path[:i+1])) 1116 | 1117 | parent = entry 1118 | 1119 | if name in parent.children: 1120 | raise ValueError("doubled name in archive: %s" % filename) 1121 | 1122 | parent.children[name.encode(encoding)] = File(-1, make_record(filename), parent) 1123 | 1124 | # build new allocations 1125 | existing_records: List[Record] = [] 1126 | new_records: List[Record] = [] 1127 | 1128 | for record in root.allrecords(): 1129 | if record.offset == -1: 1130 | new_records.append(record) 1131 | else: 1132 | existing_records.append(record) 1133 | 1134 | # try to build new allocations in a way that needs a minimal amount of reads/writes 1135 | allocations = [] 1136 | new_records.sort(key=lambda r: (r.compressed_size, r.filename),reverse=True) 1137 | arch_size = 0 1138 | for record in existing_records: 1139 | size = record.alloc_size 1140 | offset = record.offset 1141 | if offset > arch_size: 1142 | # find new records that fit the hole in order to reduce shifts 1143 | # but never cause a shift torwards the end of the file 1144 | # this is done so the rewriting/shifting code below is simpler 1145 | i = 0 1146 | while i < len(new_records) and arch_size < offset: 1147 | new_record = new_records[i] 1148 | new_size = new_record.alloc_size 1149 | if arch_size + new_size <= offset: 1150 | allocations.append((arch_size, new_record)) 1151 | del new_records[i] 1152 | arch_size += new_size 1153 | else: 1154 | i += 1 1155 | 1156 | allocations.append((arch_size, record)) 1157 | arch_size += size 1158 | 1159 | # add remaining records at the end 1160 | new_records.sort(key=lambda r: r.filename) 1161 | for record in new_records: 1162 | allocations.append((arch_size,record)) 1163 | arch_size += record.alloc_size 1164 | 1165 | index_offset = arch_size 1166 | for offset, record in allocations: 1167 | arch_size += record.index_size 1168 | 1169 | footer_offset = arch_size 1170 | arch_size += 44 1171 | 1172 | current_size = os.fstat(stream.fileno()).st_size 1173 | diff_size = arch_size - current_size 1174 | # minimize chance of corrupting archive 1175 | if diff_size > 0 and hasattr(os,'statvfs'): 1176 | st = os.statvfs(stream.name) 1177 | free = st.f_frsize * st.f_bfree 1178 | if free - diff_size < DEFAULT_BUFFER_SIZE: 1179 | raise 
ValueError("filesystem not big enough") 1180 | 1181 | index_records = [] 1182 | for offset, record in reversed(allocations): 1183 | if record.offset == -1: 1184 | # new record 1185 | filename = record.filename 1186 | callback("+" + filename) 1187 | fh: io.BufferedReader 1188 | with open(filename, "rb") as fh: # type: ignore 1189 | record_bytes = write_record(stream, fh, record.compression_method, record.encrypted, record.compression_block_size or 0) 1190 | elif offset != record.offset: 1191 | assert offset > record.offset 1192 | callback(" "+filename) 1193 | fshift(stream, record.offset, offset, record.alloc_size) 1194 | stream.seek(offset, 0) 1195 | record_bytes = stream.read(record.header_size) 1196 | index_records.append((filename, record_bytes)) 1197 | 1198 | write_index(stream,pak.version,mount_point,index_records,encoding) 1199 | 1200 | if diff_size < 0: 1201 | stream.truncate(arch_size) 1202 | 1203 | def fshift(stream: io.BufferedRandom, src: int, dst: int, size: int) -> None: 1204 | assert src < dst 1205 | buf_size = DEFAULT_BUFFER_SIZE 1206 | buf = bytearray(buf_size) 1207 | 1208 | while size > 0: 1209 | data: Union[bytes, bytearray] 1210 | if size >= buf_size: 1211 | stream.seek(src + size - buf_size, 0) 1212 | stream.readinto(buf) 1213 | data = buf 1214 | size -= buf_size 1215 | else: 1216 | stream.seek(src, 0) 1217 | data = stream.read(size) or b'' 1218 | size = 0 1219 | 1220 | stream.seek(dst + size, 0) 1221 | stream.write(data) 1222 | 1223 | def shall_unpack(paths: Iterable[str], name: str) -> bool: 1224 | path = name.split(os.path.sep) 1225 | for i in range(1, len(path) + 1): 1226 | prefix = os.path.join(*path[0:i]) 1227 | if prefix in paths: 1228 | return True 1229 | return False 1230 | 1231 | def human_size(size: int) -> str: 1232 | if size < 2 ** 10: 1233 | return str(size) 1234 | 1235 | elif size < 2 ** 20: 1236 | str_size = "%.1f" % (size / 2 ** 10) 1237 | unit = "K" 1238 | 1239 | elif size < 2 ** 30: 1240 | str_size = "%.1f" % (size / 2 ** 20) 1241 | unit = "M" 1242 | 1243 | elif size < 2 ** 40: 1244 | str_size = "%.1f" % (size / 2 ** 30) 1245 | unit = "G" 1246 | 1247 | elif size < 2 ** 50: 1248 | str_size = "%.1f" % (size / 2 ** 40) 1249 | unit = "T" 1250 | 1251 | elif size < 2 ** 60: 1252 | str_size = "%.1f" % (size / 2 ** 50) 1253 | unit = "P" 1254 | 1255 | elif size < 2 ** 70: 1256 | str_size = "%.1f" % (size / 2 ** 60) 1257 | unit = "E" 1258 | 1259 | elif size < 2 ** 80: 1260 | str_size = "%.1f" % (size / 2 ** 70) 1261 | unit = "Z" 1262 | 1263 | else: 1264 | str_size = "%.1f" % (size / 2 ** 80) 1265 | unit = "Y" 1266 | 1267 | if str_size.endswith(".0"): 1268 | str_size = str_size[:-2] 1269 | 1270 | return str_size + unit 1271 | 1272 | SORT_ALIASES: Dict[str, str] = { 1273 | "s": "size", 1274 | "S": "-size", 1275 | "z": "zsize", 1276 | "Z": "-zsize", 1277 | "o": "offset", 1278 | "O": "-offset", 1279 | "n": "name" 1280 | } 1281 | 1282 | KEY_FUNCS: Dict[str, Callable[[Record], Union[str, int]]] = { 1283 | "size": lambda rec: rec.uncompressed_size, 1284 | "-size": lambda rec: -rec.uncompressed_size, 1285 | 1286 | "zsize": lambda rec: rec.compressed_size, 1287 | "-zsize": lambda rec: -rec.compressed_size, 1288 | 1289 | "offset": lambda rec: rec.offset, 1290 | "-offset": lambda rec: -rec.offset, 1291 | 1292 | "name": lambda rec: rec.filename.lower(), 1293 | } 1294 | 1295 | def sort_key_func(sort: str) -> Callable[[Record], Tuple[Union[str, int], ...]]: 1296 | key_funcs = [] 1297 | for key in sort.split(","): 1298 | key = SORT_ALIASES.get(key,key) 1299 | try: 1300 | func 
= KEY_FUNCS[key] 1301 | except KeyError: 1302 | raise ValueError("unknown sort key: "+key) 1303 | key_funcs.append(func) 1304 | 1305 | return lambda rec: tuple(key_func(rec) for key_func in key_funcs) 1306 | 1307 | class Entry(object): 1308 | __slots__ = 'inode', '_parent', 'stat', '__weakref__' 1309 | 1310 | inode: int 1311 | _parent: Optional[weakref.ref[Dir]] 1312 | stat: Optional[os.stat_result] 1313 | 1314 | def __init__(self, inode: int, parent: Optional[Dir] = None) -> None: 1315 | self.inode = inode 1316 | self.parent = parent 1317 | self.stat = None 1318 | 1319 | @property 1320 | def parent(self) -> Optional[Dir]: 1321 | return self._parent() if self._parent is not None else None 1322 | 1323 | @parent.setter 1324 | def parent(self, parent: Optional[Dir]) -> None: 1325 | self._parent = weakref.ref(parent) if parent is not None else None 1326 | 1327 | class Dir(Entry): 1328 | __slots__ = 'children', 1329 | 1330 | children: OrderedDict[bytes, Union[Dir, File]] 1331 | 1332 | def __init__(self, inode: int, children: Optional[OrderedDict[bytes, Union[Dir, File]]] = None, parent: Optional[Dir] = None) -> None: 1333 | Entry.__init__(self,inode,parent) 1334 | if children is None: 1335 | self.children = OrderedDict() 1336 | else: 1337 | self.children = children 1338 | for child in children.values(): 1339 | child.parent = self 1340 | 1341 | def __repr__(self) -> str: 1342 | return 'Dir(%r, %r)' % (self.inode, self.children) 1343 | 1344 | def allrecords(self) -> Iterable[Record]: 1345 | for child in self.children.values(): 1346 | if isinstance(child, Dir): 1347 | for record in child.allrecords(): 1348 | yield record 1349 | else: 1350 | yield child.record 1351 | 1352 | class File(Entry): 1353 | __slots__ = 'record', 1354 | 1355 | record: Record 1356 | 1357 | def __init__(self, inode: int, record: Record, parent: Optional[Dir] = None) -> None: 1358 | Entry.__init__(self, inode, parent) 1359 | self.record = record 1360 | 1361 | def __repr__(self) -> str: 1362 | return 'File(%r, %r)' % (self.inode, self.record) 1363 | 1364 | if HAS_LLFUSE: 1365 | import errno 1366 | import weakref 1367 | import stat 1368 | import mmap 1369 | 1370 | DIR_SELF = '.'.encode(sys.getfilesystemencoding()) 1371 | DIR_PARENT = '..'.encode(sys.getfilesystemencoding()) 1372 | 1373 | class Operations(llfuse.Operations): 1374 | __slots__ = 'archive', 'root', 'inodes', 'arch_st', 'data' 1375 | 1376 | archive: io.BufferedReader 1377 | inodes: Dict[int, Union[Dir, File]] 1378 | root: Dir 1379 | arch_st: os.stat_result 1380 | data: mmap.mmap 1381 | 1382 | def __init__(self, archive: io.BufferedReader, pak: Pak) -> None: 1383 | llfuse.Operations.__init__(self) 1384 | self.archive = archive 1385 | self.arch_st = os.fstat(archive.fileno()) 1386 | self.root = Dir(llfuse.ROOT_INODE) 1387 | self.inodes = {self.root.inode: self.root} 1388 | self.root.parent = self.root 1389 | 1390 | encoding = sys.getfilesystemencoding() 1391 | inode = self.root.inode + 1 1392 | for record in pak: 1393 | path = record.filename.split(os.path.sep) 1394 | path, name = path[:-1], path[-1] 1395 | enc_name = name.encode(encoding) 1396 | name, ext = os.path.splitext(name) 1397 | 1398 | parent = self.root 1399 | for i, comp in enumerate(path): 1400 | comp_encoded = comp.encode(encoding) 1401 | try: 1402 | entry = parent.children[comp_encoded] 1403 | except KeyError: 1404 | entry = parent.children[comp_encoded] = self.inodes[inode] = Dir(inode, parent=parent) 1405 | inode += 1 1406 | 1407 | if not isinstance(entry, Dir): 1408 | raise ValueError("name conflict in 
archive: %r is not a directory" % os.path.join(*path[:i+1])) 1409 | 1410 | parent = entry 1411 | 1412 | i = 0 1413 | while enc_name in parent.children: 1414 | sys.stderr.write("Warning: doubled name in archive: %s\n" % record.filename) 1415 | i += 1 1416 | enc_name = ("%s~%d%s" % (name, i, ext)).encode(encoding) 1417 | 1418 | parent.children[enc_name] = self.inodes[inode] = File(inode, record, parent) 1419 | inode += 1 1420 | 1421 | archive.seek(0, 0) 1422 | self.data = mmap.mmap(archive.fileno(), 0, access=mmap.ACCESS_READ) 1423 | 1424 | # cache entry attributes 1425 | for inode in self.inodes: 1426 | entry = self.inodes[inode] 1427 | entry.stat = self._getattr(entry) 1428 | 1429 | def destroy(self) -> None: 1430 | self.data.close() 1431 | self.archive.close() 1432 | 1433 | def lookup(self, parent_inode: int, name: bytes, ctx) -> os.stat_result: 1434 | try: 1435 | entry = self.inodes[parent_inode] 1436 | if name == DIR_SELF: 1437 | pass 1438 | 1439 | elif name == DIR_PARENT: 1440 | parent = entry.parent 1441 | if parent is not None: 1442 | entry = parent 1443 | 1444 | else: 1445 | if not isinstance(entry, Dir): 1446 | raise llfuse.FUSEError(errno.ENOTDIR) 1447 | 1448 | entry = entry.children[name] 1449 | 1450 | except KeyError: 1451 | raise llfuse.FUSEError(errno.ENOENT) 1452 | else: 1453 | stat = entry.stat 1454 | assert stat is not None 1455 | return stat 1456 | 1457 | def _getattr(self, entry: Union[Dir, File]) -> llfuse.EntryAttributes: 1458 | attrs = llfuse.EntryAttributes() 1459 | 1460 | attrs.st_ino = entry.inode 1461 | attrs.st_rdev = 0 1462 | attrs.generation = 0 1463 | attrs.entry_timeout = 300 1464 | attrs.attr_timeout = 300 1465 | 1466 | if isinstance(entry, Dir): 1467 | nlink = 2 if entry is not self.root else 1 1468 | size = 5 1469 | 1470 | for name, child in entry.children.items(): 1471 | size += len(name) + 1 1472 | if type(child) is Dir: 1473 | nlink += 1 1474 | 1475 | attrs.st_mode = stat.S_IFDIR | 0o555 1476 | attrs.st_nlink = nlink 1477 | attrs.st_size = size 1478 | else: 1479 | attrs.st_nlink = 1 1480 | attrs.st_mode = stat.S_IFREG | 0o444 1481 | attrs.st_size = entry.record.uncompressed_size 1482 | 1483 | arch_st = self.arch_st 1484 | attrs.st_uid = arch_st.st_uid 1485 | attrs.st_gid = arch_st.st_gid 1486 | attrs.st_blksize = arch_st.st_blksize 1487 | attrs.st_blocks = 1 + ((attrs.st_size - 1) // attrs.st_blksize) if attrs.st_size != 0 else 0 1488 | if HAS_STAT_NS: 1489 | attrs.st_atime_ns = arch_st.st_atime_ns 1490 | attrs.st_mtime_ns = arch_st.st_mtime_ns 1491 | attrs.st_ctime_ns = arch_st.st_ctime_ns 1492 | else: 1493 | attrs.st_atime_ns = int(arch_st.st_atime * 1000) 1494 | attrs.st_mtime_ns = int(arch_st.st_mtime * 1000) 1495 | attrs.st_ctime_ns = int(arch_st.st_ctime * 1000) 1496 | 1497 | return attrs 1498 | 1499 | def getattr(self, inode: int, ctx) -> os.stat_result: 1500 | try: 1501 | entry = self.inodes[inode] 1502 | except KeyError: 1503 | raise llfuse.FUSEError(errno.ENOENT) 1504 | else: 1505 | stat = entry.stat 1506 | assert stat is not None 1507 | return stat 1508 | 1509 | def getxattr(self, inode: int, name: bytes, ctx) -> bytes: 1510 | try: 1511 | entry = self.inodes[inode] 1512 | except KeyError: 1513 | raise llfuse.FUSEError(errno.ENOENT) 1514 | else: 1515 | if not isinstance(entry, File): 1516 | raise llfuse.FUSEError(errno.ENODATA) 1517 | 1518 | if name == b'user.u4pak.sha1': 1519 | return hexlify(entry.record.sha1) 1520 | 1521 | elif name == b'user.u4pak.compressed_size': 1522 | return str(entry.record.compressed_size).encode('ascii') 1523 | 1524 
| elif name == b'user.u4pak.compression_method': 1525 | return COMPR_METHOD_NAMES[entry.record.compression_method].encode('ascii') 1526 | 1527 | elif name == b'user.u4pak.compression_block_size': 1528 | return str(entry.record.compression_block_size).encode('ascii') 1529 | 1530 | elif name == b'user.u4pak.encrypted': 1531 | return str(entry.record.encrypted).encode('ascii') 1532 | 1533 | else: 1534 | raise llfuse.FUSEError(errno.ENODATA) 1535 | 1536 | def listxattr(self, inode: int, ctx) -> List[bytes]: 1537 | try: 1538 | entry = self.inodes[inode] 1539 | except KeyError: 1540 | raise llfuse.FUSEError(errno.ENOENT) 1541 | else: 1542 | if type(entry) is Dir: 1543 | return [] 1544 | 1545 | else: 1546 | return [b'user.u4pak.sha1', b'user.u4pak.compressed_size', 1547 | b'user.u4pak.compression_method', b'user.u4pak.compression_block_size', 1548 | b'user.u4pak.encrypted'] 1549 | 1550 | def access(self, inode: int, mode: int, ctx) -> bool: 1551 | try: 1552 | entry = self.inodes[inode] 1553 | except KeyError: 1554 | raise llfuse.FUSEError(errno.ENOENT) 1555 | else: 1556 | st_mode = 0o555 if type(entry) is Dir else 0o444 1557 | return (st_mode & mode) == mode 1558 | 1559 | def opendir(self, inode: int, ctx): 1560 | try: 1561 | entry = self.inodes[inode] 1562 | except KeyError: 1563 | raise llfuse.FUSEError(errno.ENOENT) 1564 | else: 1565 | if type(entry) is not Dir: 1566 | raise llfuse.FUSEError(errno.ENOTDIR) 1567 | 1568 | return inode 1569 | 1570 | def readdir(self, inode: int, offset: int) -> Iterable[Tuple[bytes, os.stat_result, int]]: 1571 | try: 1572 | entry = self.inodes[inode] 1573 | except KeyError: 1574 | raise llfuse.FUSEError(errno.ENOENT) 1575 | else: 1576 | if not isinstance(entry, Dir): 1577 | raise llfuse.FUSEError(errno.ENOTDIR) 1578 | 1579 | names = list(entry.children)[offset:] if offset > 0 else entry.children 1580 | for name in names: 1581 | child = entry.children[name] 1582 | stat = child.stat 1583 | assert stat is not None 1584 | yield name, stat, child.inode 1585 | 1586 | def releasedir(self, fh: int) -> None: 1587 | pass 1588 | 1589 | def statfs(self, ctx) -> os.stat_result: 1590 | attrs = llfuse.StatvfsData() 1591 | 1592 | arch_st = self.arch_st 1593 | attrs.f_bsize = arch_st.st_blksize 1594 | attrs.f_frsize = arch_st.st_blksize 1595 | attrs.f_blocks = arch_st.st_blocks 1596 | attrs.f_bfree = 0 1597 | attrs.f_bavail = 0 1598 | 1599 | attrs.f_files = len(self.inodes) 1600 | attrs.f_ffree = 0 1601 | attrs.f_favail = 0 1602 | 1603 | return attrs 1604 | 1605 | def open(self, inode: int, flags: int, ctx) -> int: 1606 | try: 1607 | entry = self.inodes[inode] 1608 | except KeyError: 1609 | raise llfuse.FUSEError(errno.ENOENT) 1610 | else: 1611 | if type(entry) is Dir: 1612 | raise llfuse.FUSEError(errno.EISDIR) 1613 | 1614 | if flags & 3 != os.O_RDONLY: 1615 | raise llfuse.FUSEError(errno.EACCES) 1616 | 1617 | return inode 1618 | 1619 | def read(self, fh: int, offset: int, length: int) -> bytes: 1620 | try: 1621 | entry = self.inodes[fh] 1622 | except KeyError: 1623 | raise llfuse.FUSEError(errno.ENOENT) 1624 | 1625 | if not isinstance(entry, File): 1626 | raise llfuse.FUSEError(errno.EISDIR) 1627 | 1628 | try: 1629 | return entry.record.read(self.data, offset, length) 1630 | except NotImplementedError: 1631 | raise llfuse.FUSEError(errno.ENOSYS) 1632 | 1633 | def release(self, fh): 1634 | pass 1635 | 1636 | # based on http://code.activestate.com/recipes/66012/ 1637 | def deamonize(stdout: str = '/dev/null', stderr: Optional[str] = None, stdin: str = '/dev/null') -> None: 1638 | 
1638 |     # Do first fork.
1639 |     try:
1640 |         pid = os.fork()
1641 |         if pid > 0:
1642 |             sys.exit(0) # Exit first parent.
1643 |     except OSError as e:
1644 |         sys.stderr.write("fork #1 failed: (%d) %s\n" % (e.errno, e.strerror))
1645 |         sys.exit(1)
1646 |
1647 |     # Decouple from parent environment.
1648 |     os.chdir("/")
1649 |     os.umask(0)
1650 |     os.setsid()
1651 |
1652 |     # Do second fork.
1653 |     try:
1654 |         pid = os.fork()
1655 |         if pid > 0:
1656 |             sys.exit(0) # Exit second parent.
1657 |     except OSError as e:
1658 |         sys.stderr.write("fork #2 failed: (%d) %s\n" % (e.errno, e.strerror))
1659 |         sys.exit(1)
1660 |
1661 |     # Open file descriptors.
1662 |     if not stderr:
1663 |         stderr = stdout
1664 |
1665 |     si = open(stdin, 'r')
1666 |     so = open(stdout, 'a+')
1667 |     se = open(stderr, 'a+')
1668 |
1669 |     # Redirect standard file descriptors.
1670 |     sys.stdout.flush()
1671 |     sys.stderr.flush()
1672 |
1673 |     os.close(sys.stdin.fileno())
1674 |     os.close(sys.stdout.fileno())
1675 |     os.close(sys.stderr.fileno())
1676 |
1677 |     os.dup2(si.fileno(), sys.stdin.fileno())
1678 |     os.dup2(so.fileno(), sys.stdout.fileno())
1679 |     os.dup2(se.fileno(), sys.stderr.fileno())
1680 |
1681 | def main(argv: List[str]) -> None:
1682 |     parser = argparse.ArgumentParser(description='unpack, pack, list, test and mount Unreal Engine 4 .pak archives')
1683 |     parser.set_defaults(print0=False,verbose=False,progress=False,zlib=False,command=None,no_sendfile=False,global_debug=False)
1684 |     add_debug_arg(parser)
1685 |
1686 |     subparsers = parser.add_subparsers(metavar='command')
1687 |
1688 |     unpack_parser = subparsers.add_parser('unpack',aliases=('x',),help='unpack archive')
1689 |     unpack_parser.set_defaults(command='unpack',check_integrity=False,ignore_null_checksums=False)
1690 |     unpack_parser.add_argument('-C','--dir',type=str,default='.',
1691 |         help='directory to write unpacked files')
1692 |     unpack_parser.add_argument('-p','--progress',action='store_true',default=False,
1693 |         help='show progress')
1694 |     add_hack_args(unpack_parser)
1695 |     add_common_args(unpack_parser)
1696 |     add_no_sendfile_arg(unpack_parser)
1697 |     unpack_parser.add_argument('files', metavar='file', nargs='*', help='files and directories to unpack')
1698 |
1699 |     pack_parser = subparsers.add_parser('pack',aliases=('c',),help="pack archive")
1700 |     pack_parser.set_defaults(command='pack')
1701 |     pack_parser.add_argument('--archive-version',type=int,choices=[1,2,3],default=3,help='archive file format version')
1702 |     pack_parser.add_argument('--mount-point',type=str,default=os.path.join('..','..','..',''),help='archive mount point relative to its path')
1703 |     pack_parser.add_argument('-z', '--zlib',action='store_true',default=False,help='use zlib compression')
1704 |     pack_parser.add_argument('-p', '--progress',action='store_true',default=False,
1705 |         help='show progress')
1706 |     add_print0_arg(pack_parser)
1707 |     add_verbose_arg(pack_parser)
1708 |     add_archive_arg(pack_parser)
1709 |     add_encoding_arg(pack_parser)
1710 |     pack_parser.add_argument('files', metavar='file', nargs='+', help='files and directories to pack')
1711 |
1712 |     list_parser = subparsers.add_parser('list',aliases=('l',),help='list archive contents')
1713 |     list_parser.set_defaults(command='list',check_integrity=False,ignore_null_checksums=False)
1714 |     add_human_arg(list_parser)
1715 |     list_parser.add_argument('-d','--details',action='store_true',default=False,
1716 |         help='print file offsets and sizes')
1717 |     list_parser.add_argument('-s','--sort',dest='sort_key_func',metavar='KEYS',type=sort_key_func,default=None,
1718 |         help='sort file list. Comma-separated list of sort keys. Keys are "size", "zsize", "offset", and "name". '
1719 |              'Prepend "-" to a key name to sort in descending order (descending order is not supported for "name").')
1720 |     add_hack_args(list_parser)
1721 |     add_common_args(list_parser)
1722 |
1723 |     info_parser = subparsers.add_parser('info',aliases=('i',),help='print archive summary info')
1724 |     info_parser.set_defaults(command='info',check_integrity=False,ignore_null_checksums=False)
1725 |     add_human_arg(info_parser)
1726 |     add_integrity_arg(info_parser)
1727 |     add_archive_arg(info_parser)
1728 |     add_hack_args(info_parser)
1729 |
1730 |     check_parser = subparsers.add_parser('test',aliases=('t',),help='test archive integrity')
1731 |     check_parser.set_defaults(command='test',ignore_null_checksums=False)
1732 |     add_print0_arg(check_parser)
1733 |     add_archive_arg(check_parser)
1734 |     add_hack_args(check_parser)
1735 |
1736 |     mount_parser = subparsers.add_parser('mount',aliases=('m',),help='fuse mount archive')
1737 |     mount_parser.set_defaults(command='mount',check_integrity=False,ignore_null_checksums=False)
1738 |     mount_parser.add_argument('-d','--debug',action='store_true',default=False,
1739 |         help='print debug output (implies -f)')
1740 |     mount_parser.add_argument('-f','--foreground',action='store_true',default=False,
1741 |         help='foreground operation')
1742 |     mount_parser.add_argument('archive', help='Unreal Engine 4 .pak archive')
1743 |     mount_parser.add_argument('mountpt', help='mount point')
1744 |     add_integrity_arg(mount_parser)
1745 |     add_hack_args(mount_parser)
1746 |
1747 |     args = parser.parse_args(argv)
1748 |
1749 |     if args.command is None:
1750 |         parser.print_help()
1751 |
1752 |     elif args.global_debug:
1753 |         _main(args)
1754 |
1755 |     else:
1756 |         try:
1757 |             _main(args)
1758 |         except (ValueError, NotImplementedError, IOError) as exc:
1759 |             sys.stderr.write("%s\n" % exc)
1760 |             sys.exit(1)
1761 |
1762 | def _main(args: argparse.Namespace) -> None:
1763 |     delim = '\0' if args.print0 else '\n'
1764 |
1765 |     stream: io.BufferedReader
1766 |     wstream: io.BufferedWriter
1767 |
1768 |     if args.command == 'list':
1769 |         with open(args.archive, "rb") as stream: # type: ignore
1770 |             pak = read_index(stream, args.check_integrity, args.ignore_magic, args.encoding, args.force_version, args.ignore_null_checksums)
1771 |             pak.print_list(args.details,args.human,delim,args.sort_key_func,sys.stdout)
1772 |
1773 |     elif args.command == 'info':
1774 |         with open(args.archive, "rb") as stream: # type: ignore
1775 |             pak = read_index(stream, args.check_integrity, args.ignore_magic, args.encoding, args.force_version, args.ignore_null_checksums)
1776 |             pak.print_info(args.human,sys.stdout)
1777 |
1778 |     elif args.command == 'test':
1779 |         error_count = 0
1780 |
1781 |         def check_callback(ctx: Optional[Record], message: str) -> None:
1782 |             nonlocal error_count
1783 |             error_count += 1
1784 |
1785 |             if ctx is None:
1786 |                 sys.stdout.write("%s%s" % (message, delim))
1787 |
1788 |             elif isinstance(ctx, Record):
1789 |                 sys.stdout.write("%s: %s%s" % (ctx.filename, message, delim))
1790 |
1791 |             else:
1792 |                 sys.stdout.write("%s: %s%s" % (ctx, message, delim))
1793 |
1794 |         with open(args.archive, "rb") as stream: # type: ignore
1795 |             pak = read_index(stream, False, args.ignore_magic, args.encoding, args.force_version, args.ignore_null_checksums)
1796 |             pak.check_integrity(stream, check_callback, args.ignore_null_checksums)
1797 |
1798 |         if error_count == 0:
1799 |             sys.stdout.write('All ok%s' % delim)
1800 |         else:
1801 |             sys.stdout.write('Found %d error(s)%s' % (error_count, delim))
1802 |             sys.exit(1)
1803 |
1804 |     elif args.command == 'unpack':
1805 |         if args.no_sendfile:
1806 |             global sendfile
1807 |             sendfile = highlevel_sendfile
1808 |
1809 |         if args.verbose:
1810 |             def unpack_callback(name: str) -> None:
1811 |                 sys.stdout.write("%s%s" % (name, delim))
1812 |
1813 |         elif args.progress:
1814 |             nDecompOffset = 0
1815 |             def unpack_callback(name: str) -> None:
1816 |                 nonlocal nDecompOffset
1817 |                 nDecompOffset = nDecompOffset + 1
1818 |                 if nDecompOffset % 10 == 0:
1819 |                     print("Decompressing %3.02f%%" % (round(nDecompOffset/len(pak)*100,2)), end="\r")
1820 |         else:
1821 |             def unpack_callback(name: str) -> None:
1822 |                 pass
1823 |
1824 |         with open(args.archive, "rb") as stream: # type: ignore
1825 |             pak = read_index(stream, args.check_integrity, args.ignore_magic, args.encoding, args.force_version, args.ignore_null_checksums)
1826 |             if args.files:
1827 |                 pak.unpack_only(stream, set(name.strip(os.path.sep) for name in args.files), args.dir, unpack_callback)
1828 |             else:
1829 |                 pak.unpack(stream, args.dir, unpack_callback)
1830 |
1831 |     elif args.command == 'pack':
1832 |         if args.verbose:
1833 |             def pack_callback(name: str, files: List[str]) -> None:
1834 |                 sys.stdout.write("%s%s" % (name, delim))
1835 |         elif args.progress:
1836 |             nCompOffset = 0
1837 |             def pack_callback(name: str, files: List[str]) -> None:
1838 |                 nonlocal nCompOffset
1839 |                 nCompOffset = nCompOffset + 1
1840 |                 print("Compressing %3.02f%%" % (round(nCompOffset/len(files)*100,2)), end="\r")
1841 |         else:
1842 |             def pack_callback(name: str, files: List[str]) -> None:
1843 |                 pass
1844 |
1845 |         compFmt = COMPR_NONE
1846 |         if args.zlib: compFmt = COMPR_ZLIB
1847 |
1848 |         with open(args.archive, "wb") as wstream: # type: ignore
1849 |             pack(wstream, args.files, args.mount_point, args.archive_version, compFmt,
1850 |                 callback=pack_callback, encoding=args.encoding)
1851 |
1852 |     elif args.command == 'mount':
1853 |         if not HAS_LLFUSE:
1854 |             raise ValueError('the llfuse python module is needed for this feature')
1855 |
1856 |         with open(args.archive, "rb") as stream: # type: ignore
1857 |             pak = read_index(stream, args.check_integrity, args.ignore_magic, args.encoding, args.force_version, args.ignore_null_checksums)
1858 |             pak.mount(stream, args.mountpt, args.foreground, args.debug)
1859 |
1860 |     else:
1861 |         raise ValueError('unknown command: %s' % args.command)
1862 |
1863 | def add_integrity_arg(parser: argparse.ArgumentParser) -> None:
1864 |     parser.add_argument('-i','--check-integrity',action='store_true',default=False,
1865 |         help='meta-data sanity check and verify checksums')
1866 |     parser.add_argument('--ignore-null-checksums',action='store_true',default=False,
1867 |         help='ignore checksums that are all nulls')
1868 |
1869 | def add_archive_arg(parser: argparse.ArgumentParser) -> None:
1870 |     parser.add_argument('archive', help='Unreal Engine 4 .pak archive')
1871 |
1872 | def add_print0_arg(parser: argparse.ArgumentParser) -> None:
1873 |     parser.add_argument('-0','--print0',action='store_true',default=False,
1874 |         help='separate file names with null bytes')
1875 |
1876 | def add_verbose_arg(parser: argparse.ArgumentParser) -> None:
1877 |     parser.add_argument('-v','--verbose',action='store_true',default=False,
1878 |         help='print verbose output')
1879 |
1880 | def add_human_arg(parser: argparse.ArgumentParser) -> None:
1881 |     parser.add_argument('-u','--human-readable',dest='human',action='store_true',default=False,
1882 |         help='print human-readable file sizes')
1883 |
1884 | def add_encoding_arg(parser: argparse.ArgumentParser) -> None:
1885 |     parser.add_argument('--encoding',type=str,default='UTF-8',
1886 |         help='character encoding of file names to use (default: UTF-8)')
1887 |
1888 | def add_hack_args(parser: argparse.ArgumentParser) -> None:
1889 |     add_encoding_arg(parser)
1890 |     parser.add_argument('--ignore-magic',action='store_true',default=False,
1891 |         help="don't error out if the file magic mismatches")
1892 |     parser.add_argument('--force-version',type=int,default=None,
1893 |         help='use this format version when parsing the file instead of the version read from the archive')
1894 |
1895 | def add_no_sendfile_arg(parser: argparse.ArgumentParser) -> None:
1896 |     parser.add_argument('--no-sendfile',action='store_true',default=False,
1897 |         help="don't use the sendfile system call. Try this if you get an IOError during unpacking.")
1898 |
1899 | def add_debug_arg(parser: argparse.ArgumentParser) -> None:
1900 |     parser.add_argument('-d', '--debug',action='store_true',default=False,dest='global_debug',
1901 |         help="print a stack trace on error")
1902 |
1903 | def add_common_args(parser: argparse.ArgumentParser) -> None:
1904 |     add_print0_arg(parser)
1905 |     add_verbose_arg(parser)
1906 |     add_integrity_arg(parser)
1907 |     add_archive_arg(parser)
1908 |
1909 | if __name__ == '__main__':
1910 |     main(sys.argv[1:])
1911 |
--------------------------------------------------------------------------------