├── .gitignore
├── CHANGELOG
├── LICENSE
├── README.md
├── requirements.txt
└── ufdr2dir.py


/.gitignore:
--------------------------------------------------------------------------------
1 | UFDRConvert
2 | data/
3 | files/
4 | 


--------------------------------------------------------------------------------
/CHANGELOG:
--------------------------------------------------------------------------------
 1 | # Changelog
 2 | All notable changes to this project will be documented in this file.
 3 | 
 4 | The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 5 | and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 6 | 
 7 | ## [Unreleased]
 8 | 
 9 | ## [0.1.10] - 2022-03-21
10 | ### Added
11 | - Additional checks for correct file name and paths
12 | - Clean working directory 'files' on startup/exit
13 | 
14 | ### Fixed
15 | - Missing zip folder arguments for extraction.
16 | - Output file name was based on local path and not original path
17 | 
18 | ## [0.1.9] - 2022-03-09
19 | ### Fixed
20 | - alive_progress failed import exits program.
21 |   
22 | ### Added
23 | - function that shows progress if alive_it is found
24 | - function that loops with no progress
25 | 
26 | ## [0.1.8] - 2022-02-05
27 | ### Added
28 | - Catch sigint and exit properly
29 | 
30 | ## [0.1.7] - 2022-02-05
31 | ### Added
32 | - Notice of paths changes for Windows users
33 | - Check write permission error and exit
34 | - NotADirectoryError catch during extraction
35 | - general error catch during file extraction
36 | 
37 | ### Fixed
38 | - Windows illegal character better strip
39 | 
40 | ## [0.1.6] - 2022-02-05
41 | ### Added
42 | - TryCatch for directory creation
43 | 
44 | ### Fixed
45 | - Fixed Windows path illegal char strip
46 | - Check for leading path slash before strip
47 | 
48 | ## [0.1.5] - 2022-02-05
49 | ### Fixed
50 | - Fixed output variable error
51 | 
52 | ## [0.1.4] - 2022-02-05
53 | ### Fixed
54 | - Fixed Windows check issue
55 | - Fixed output selection option
56 | 
57 | ## [0.1.3] - 2022-02-05
58 | ### Changed
59 | - Strip illegal path chars in Windows
60 | 
61 | ## [0.1.2] - 2022-02-05
62 | ### Changed
63 | - Changed zip file extraction bug in Windows
64 | - Temp remove set output directory
65 | 
66 | ## [0.1.1] - 2022-02-04
67 | ### Added
68 | - Original directory structure creation function
69 | - Local to original file move function
70 | - Status bar for long process time
71 | - python requirements file
72 | 
73 | ## [0.1.0] - 2022-02-03
74 | ### Added
75 | - This CHANGELOG based on the standard from https://keepachangelog.com/en/1.0.0/
76 | - README about the project description and goals
77 | - LICENSE - MIT
78 | - ufdr2dir.py initial structure and plan


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2022 DFIRScience
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | ## UFDR2DIR
 2 | 
 3 | A script to convert a Cellebrite UFDR to its original file and directory structure.
 4 | 
 5 | ## Why??
 6 | 
 7 | Cellebrite Reader files (.ufdr) are processed mobile device images. They are compressed (zip) files that contain a ```report.xml``` file in the root, and files sorted into directories by category.
 8 | 
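The container layout described above can be checked with Python's standard `zipfile` module. This is a minimal sketch, not part of UFDR2DIR: the helper name `summarize_ufdr` is hypothetical, and it only assumes what the README states (a zip archive with ```report.xml``` in the root and the remaining files grouped into directories).

```python
import zipfile
from pathlib import PurePosixPath


def summarize_ufdr(ufdr_path):
    """Return (has_report, top_level_dirs) for a UFDR-style zip archive.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(ufdr_path) as zf:
        names = zf.namelist()
    # report.xml is expected to sit in the archive root
    has_report = "report.xml" in names
    # Collect the first path component of every member stored in a directory
    top_level_dirs = sorted({PurePosixPath(n).parts[0] for n in names if "/" in n})
    return has_report, top_level_dirs
```

Calling `summarize_ufdr("filename.ufdr")` before conversion is a quick way to confirm the file really is a UFDR package.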
 9 | The UFDR has the original subject data, but does not keep the original file path structure. This means that tools such as [ALEAPP](https://github.com/abrignoni/ALEAPP) have [poor results](https://dfir.science/2022/02/How-to-extract-files-from-Cellebrite-Reader-UFDR-for-ALEAPPiLEAPP) when run against the package.
10 | 
11 | UFDR2DIR converts the categorized data back into the original directory structure. This will allow tools that do not support UFDR to load the data as a directory.
12 | 
13 | ## Install and Run
14 | 
15 | Make sure you have [Python 3](https://www.python.org/) installed. Download the repository.
16 | From a command prompt run:
17 | 
18 | ```bash
19 | pip3 install -r requirements.txt
20 | python3 ufdr2dir.py filename.ufdr
21 | ```
22 | 
23 | This will create an output folder named ```UFDRConvert``` in the current working directory. You can specify where you want to output to with ```-o [OUTDIR]```; the directory must already exist.
24 | 
25 | The output directory will mirror what was recorded in ```report.xml```. You can point tools like ALEAPP directly at the resulting folder.
26 | 
27 | ## Note
28 | 
29 | Cellebrite apparently does some deleted data recovery. These files are currently **not** being extracted if they lack path information.
30 | 
31 | Physical Analyzer also extracts archive files. When recovering the structure we write the original archive file in the original path. Any extracted files are currently disregarded.
32 | 
33 | Most UFDR files are probably going to be from Android and iOS. Windows, however, has a lot of illegal file path characters. If you extract the UFDR on Windows/NTFS, illegal characters in the file path will be replaced with ```-```. Be aware that some paths may be slightly different from the original on Windows.
34 | 
35 | **Example:** ```com.facebook.katana:dash``` <-- ":" is an illegal path character in NTFS (thanks, alternate data streams!). As such, UFDR2DIR extracts it as ```com.facebook.katana-dash``` on Windows. Linux and macOS are unaffected.
36 | 
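The Windows path handling can be sketched in isolation. The character class below mirrors the one passed to `re.sub` in `ufdr2dir.py`, which substitutes a hyphen for each illegal character; the function name `sanitize_for_windows` and the `replacement` parameter are hypothetical and exist only for this example.

```python
import re

# Same character class that ufdr2dir.py uses for Windows-illegal path characters
WINDOWS_ILLEGAL = re.compile(r'[:*?"<>|]')


def sanitize_for_windows(path, replacement="-"):
    """Replace characters that NTFS does not allow in a path (illustrative helper)."""
    return WINDOWS_ILLEGAL.sub(replacement, path)


print(sanitize_for_windows("com.facebook.katana:dash"))  # com.facebook.katana-dash
```

Forward slashes are left alone, so the directory structure itself is unchanged.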
37 | ## Bug reports and suggestions
38 | 
39 | Pull requests considered! Otherwise create an issue or message me on [Twitter](https://twitter.com/dfirscience) if you find any bugs or have some recommendations.
40 | 
41 | ### TODO
42 | 
43 | * Determine archive or archive extraction directory
44 | * Allow user to select keep or disregard extractions
45 | * UFDR2ZIP -> Create a directory structure in ZIP archive preserving meta-data
46 | * report.xml zip file name is based on read order -> fragile
47 | * Allow user to select file exists overwrite, skip or fail.
48 | * Optimize:
49 |   * Extract then copy is slow
50 |   * Read report.xml by line is slow
51 | 
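One possible direction for the "Extract then copy is slow" item is to stream each archive member straight to its original path, skipping the intermediate extraction into the working directory. The sketch below is an assumption about how that could look, not the current implementation; `extract_member_to` is a hypothetical helper built on the standard `ZipFile.open` and `shutil.copyfileobj` calls.

```python
import shutil
import zipfile
from pathlib import Path


def extract_member_to(zf, member, dest):
    """Stream one zip member directly to dest, creating parent directories.

    Hypothetical helper: raises KeyError if the member is not in the archive.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy the decompressed stream to the target file in chunks
    with zf.open(member) as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dest
```

With an open `ZipFile`, a call such as `extract_member_to(zf, local_path, out_dir / original_path)` would write the file once instead of extracting and then renaming it.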
52 | ### Testing
53 | 
54 | Tested on:
55 | 
56 | * Linux Mint 20.3
57 | * (light testing) Windows 11
58 | 
59 | If you have issues or experience on other platforms, please let me know how it went.
60 | 
61 | ## Thank you
62 | 
63 | Thanks to [Josh Hickman](https://thebinaryhick.blog/) for the public data sets that this script was tested on.
64 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | alive_progress>=2.2.0


--------------------------------------------------------------------------------
/ufdr2dir.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | 
  3 | """
  4 | Convert a Cellebrite Reader UFDR file to its original directory structure.
  5 | MIT License.
  6 | """
  7 | 
  8 | # Imports
  9 | import argparse
 10 | import logging
 11 | import re, io, os
 12 | import signal
 13 | import platform
 14 | import shutil
 15 | 
 16 | from pathlib import Path
 17 | from pathlib import PurePath
 18 | from zipfile import ZipFile
 19 | 
 20 | __software__ = 'UFDR2DIR'
 21 | __author__ = 'Joshua James'
 22 | __copyright__ = 'Copyright 2022, UFDR2DIR'
 23 | __credits__ = []
 24 | __license__ = 'MIT'
 25 | __version__ = '0.1.10'
 26 | __maintainer__ = 'Joshua James'
 27 | __email__ = 'joshua+github@dfirscience.org'
 28 | __status__ = 'active'
 29 | 
 30 | PROGRESSLIB = True
 31 | 
 32 | # Some users had trouble importing alive_progress on Windows
 33 | try: from alive_progress import alive_it
 34 | except ImportError:
 35 |     print('[E] Could not find alive_progress library. Will not show progress.')
 36 |     PROGRESSLIB = False
 37 | 
 38 | # Set logging level and format
 39 | def setLogging(debug):
 40 |     fmt = "[%(levelname)s] %(asctime)s %(message)s"
 41 |     LOGLEVEL = logging.INFO if debug is False else logging.DEBUG
 42 |     logging.basicConfig(level=LOGLEVEL, format=fmt, datefmt='%Y-%m-%dT%H:%M:%S')  # %m is month; %M is minute
 43 | 
 44 | # Argparser config and argument setup
 45 | def setArgs():
 46 |     parser = argparse.ArgumentParser(description=__copyright__)
 47 |     parser.add_argument('ufdr', help="Cellebrite Reader UFDR file")
 48 |     parser.add_argument('-o', '--out', required=False, action='store', dest="out", help='Output directory path')
 49 |     parser.add_argument('--debug', required=False, action='store_true', help='Set the log level to DEBUG')
 50 |     return(parser.parse_args())
 51 | 
 52 | def getZipReportXML(ufdr, OUTD):
 53 |     logging.info("Extracting report.xml...")
 54 |     with ZipFile(ufdr, 'r') as zip:
 55 |         with io.TextIOWrapper(zip.open("report.xml"), encoding="utf-8") as f:
 56 |             logging.info("Creating original directory structure...")
 57 |             if PROGRESSLIB: extractProgress(zip, OUTD, f)
 58 |             else: extractNoProgress(zip, OUTD, f)
 59 | 
 60 | # Function to show progress if lib exists
 61 | # Optimize with progress functions instead of alive_it
 62 | def extractProgress(zip, OUTD, f):
 63 |     ORIGF = ""
 64 |     LOCALF = ""
 65 |     for l in alive_it(f): # Run through each line... is lxml faster?
 66 |         if '<file fs' in l:
 67 |             result = re.search('path="(.*?)" ', l) # This gets original path / FN
 68 |             if result:
 69 |                 ORIGF = result.group(1)
 70 |                 if platform.system() == "Windows": ORIGF = re.sub('[:*?"<>|]', '-', ORIGF)
 71 |                 logging.debug(f'Original: {ORIGF}')
 72 |                 # Create the original file directory structure
 73 |                 makeDirStructure(ORIGF, OUTD)
 74 |         elif 'name="Local Path"' in l:
 75 |             result = re.search(r'CDATA\[(.*?)\]\]', l) # This gets local path / FN
 76 |             if result:
 77 |                 LOCALF = result.group(1).replace("\\", "/")
 78 |                 logging.debug(f'Local: {LOCALF}')
 79 |                 extractToDir(zip, LOCALF, ORIGF, OUTD)
 80 | 
 81 | def extractNoProgress(zip, OUTD, f):
 82 |     ORIGF = ""
 83 |     LOCALF = ""
 84 |     for l in f: # Run through each line... is lxml faster?
 85 |         if '<file fs' in l:
 86 |             result = re.search('path="(.*?)" ', l) # This gets original path / FN
 87 |             if result:
 88 |                 ORIGF = result.group(1)
 89 |                 if platform.system() == "Windows": ORIGF = re.sub('[:*?"<>|]', '-', ORIGF)
 90 |                 logging.debug(f'Original: {ORIGF}')
 91 |                 # Create the original file directory structure
 92 |                 makeDirStructure(ORIGF, OUTD)
 93 |         elif 'name="Local Path"' in l:
 94 |             result = re.search(r'CDATA\[(.*?)\]\]', l) # This gets local path / FN
 95 |             if result:
 96 |                 LOCALF = result.group(1).replace("\\", "/")
 97 |                 logging.debug(f'Local: {LOCALF}')
 98 |                 extractToDir(zip, LOCALF, ORIGF, OUTD)
 99 |     Path('files').rename(f'{OUTD}/UFDR-Files') # Move remaining archive structure to output
100 | 
101 | def extractToDir(zip, LOCALP, ORIGP, OUTD):
102 |     if ORIGP[:1] == "/": # Sometimes the first slash is missing in report.xml
103 |         ORIGP = ORIGP[1:len(ORIGP)]
104 |     #OUTPATH = PurePath(Path(OUTD), Path(ORIGP).parent)
105 |     OUTPATH = PurePath(Path(OUTD), Path(ORIGP))
106 |     logging.debug(f'Extracting {LOCALP} to {OUTPATH}')
107 |     try:
108 |         zip.extract(LOCALP)
109 |     except KeyError as e:
110 |         logging.debug(e)
111 |     except NotADirectoryError as e:
112 |         logging.debug(f'Error writing to directory: {e}')
113 |     except PermissionError as e:
114 |         print(f'Cannot write to the out directory. Check permissions: {e}')
115 |         exit(1) # Non-zero status: the extraction failed
116 |     except Exception:
117 |         logging.debug('General error extracting file to path.')
118 |     else:
119 |         try:
120 |             Path(LOCALP).rename(OUTPATH) # Move from CWD to original path + FN
121 |         except IsADirectoryError: # If an archive is found but the dir already exists
122 |             logging.debug('An original archive was found. Renaming the directory...')
123 |             Path(OUTPATH).rename(f'{OUTPATH}.extract')
124 |             Path(LOCALP).rename(OUTPATH)
125 |         except NotADirectoryError: # If an archive file already exists and extraction is found
126 |             logging.debug('Archive extraction found but archive already exists. Skipping...')
127 |         except FileExistsError:
128 |             logging.debug('File already exists. Skipping...')
129 |         except OSError as e:
130 |             logging.debug(f'Error writing file: {e}') # Probably file name too long
131 | 
132 | # This might not be necessary if we can extract directly.                    
133 | def makeDirStructure(FP, OUTD): # FP is a string
134 |     OUTPATH = PurePath(Path(OUTD), Path(FP[1:] if FP[:1] == "/" else FP).parent) # Strip the leading slash only if present
135 |     logging.debug(f'Outpath set to: {OUTPATH}')
136 |     try:
137 |         Path(OUTPATH).mkdir(parents=True, exist_ok=True)
138 |     except NotADirectoryError as e:
139 |         logging.debug(f'Error creating directory: {e}')
140 |     except PermissionError as e:
141 |         print(f'Cannot write to the out directory. Check permissions: {e}')
142 |         exit(1)
143 |     except FileExistsError:
144 |         logging.debug(f"The file {OUTPATH} already exists. Skipping...")
145 |     #except:
146 |     #    logging.debug(f'General error creating file path.')
147 | 
148 | def windowsWarning():
149 |     print("Note: Windows paths are not POSIX compliant.")
150 |     print("      Illegal original-path characters will be replaced with \"-\".")
151 | 
152 | def exitHandler(sig, frame):
153 |     logging.info('Process terminated by user.')
154 |     cleanWorking()
155 |     if platform.system() == "Windows": os._exit(1) # os._exit requires a status code
156 |     else: os.kill(os.getpid(), signal.SIGINT)
157 | 
158 | def cleanWorking():
159 |     try: shutil.rmtree('files')
160 |     except: logging.debug('Files working directory not found')
161 | 
162 | def main():
163 |     signal.signal(signal.SIGINT, exitHandler)
164 |     args = setArgs()
165 |     UFDR = Path(args.ufdr)
166 |     OUTD = Path.cwd().joinpath("UFDRConvert")
167 |     setLogging(args.debug)
168 |     print(f"{__software__} v{__version__} - Use ctrl+c to exit")
169 |     if platform.system() == "Windows":
170 |         windowsWarning()
171 |     if Path.is_file(UFDR):
172 |         logging.debug(f'UFDR set to {args.ufdr}')
173 |     if args.out and Path.is_dir(Path(args.out)):
174 |         logging.debug(f'Output directory set to {args.out}')
175 |         OUTD = Path(args.out).joinpath("UFDRConvert") # Join with a separator instead of concatenating strings
176 |     cleanWorking()
177 |     getZipReportXML(UFDR, OUTD)
178 | 
179 | if __name__ == '__main__':
180 |     main()


--------------------------------------------------------------------------------