├── .gitignore ├── CHANGELOG.md ├── Docker └── Dockerfile ├── LICENSE ├── README.md ├── docs ├── Example Test.png ├── FSF Overview.png ├── FSF Process.png ├── INSTALL.md ├── JQ_EXAMPLES.md ├── JQ_FILTERS.md ├── MODULES.md ├── Test.json └── Test.zip ├── fsf-client ├── conf │ ├── __init__.py │ └── config.py └── fsf_client.py └── fsf-server ├── conf ├── __init__.py ├── config.py └── disposition.py ├── daemon.py ├── jq ├── embedded_sfx_rar_w_exe.jq ├── exe_in_zip.jq ├── fresh_vt_scan.jq ├── macro_gt_five_suspicious.jq ├── many_objects.jq ├── more_than_ten_yara.jq ├── no_yara_hits.jq ├── one_module.jq ├── pe_recently_compiled.jq ├── vt_broadbased_detections_found.jq ├── vt_exploit_detections_found.jq ├── vt_match_found.jq └── vt_match_not_found.jq ├── main.py ├── modules ├── EXTRACT_CAB.py ├── EXTRACT_EMBEDDED.py ├── EXTRACT_GZIP.py ├── EXTRACT_HEXASCII_PE.py ├── EXTRACT_RAR.py ├── EXTRACT_RTF_OBJ.py ├── EXTRACT_SWF.py ├── EXTRACT_TAR.py ├── EXTRACT_UPX.py ├── EXTRACT_VBA_MACRO.py ├── EXTRACT_ZIP.py ├── META_BASIC_INFO.py ├── META_ELF.py ├── META_JAVA_CLASS.py ├── META_MACHO.py ├── META_OLECF.py ├── META_OOXML.py ├── META_PDF.py ├── META_PE.py ├── META_PE_SIGNATURE.py ├── META_VT_INSPECT.py ├── SCAN_YARA.py ├── __init__.py └── template.py ├── processor.py ├── scanner.py └── yara ├── ft_cab.yara ├── ft_elf.yara ├── ft_exe.yara ├── ft_gzip.yara ├── ft_jar.yara ├── ft_java_class.yara ├── ft_macho.yara ├── ft_office_open_xml.yara ├── ft_ole_cf.yara ├── ft_pdf.yara ├── ft_rar.yara ├── ft_rtf.yara ├── ft_swf.yara ├── ft_tar.yara ├── ft_zip.yara ├── misc_compressed_exe.yara ├── misc_hexascii_pe_in_html.yara ├── misc_no_dosmode_header.yara ├── misc_ooxml_core_properties.yara ├── misc_pe_signature.yara ├── misc_upx_packed_binary.yara └── rules.yara /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | 3/16/2017 2 | --------- 3 | * Solved Issue #49. A recent change in in the MachoLibre Module on or around abdb9c9a4378a1ff261525bbb75d7062eff95e5b changed the packaged structure 4 | 5 | 3/15/2017 6 | --------- 7 | * Merged PR#50 which explicitly casts values in the META_JAVA_CLASS module to strings 8 | 9 | 2/27/2017 10 | --------- 11 | * Merge PR#47 which addresses META_PE output inconsistencies during module exceptions. This should increase consistency in FSF outputs and remove barriers to indexing / storage in document oriented databases. 12 | 13 | 2/25/2017 14 | ---------- 15 | * Merged PR#46 which is minor tweak to the misc_hexascii_pe_in_html comments to help avoid some AV vendors flagging the rule file as malware. 16 | 17 | 2/09/2017 18 | --------- 19 | * Merged PR#43 which moves the pidfile path (formerly hard coded into fsf-server.main) to the fsf-server.conf.config. This allows for more flexible deployment of FSF across multiple platforms. 20 | 21 | 2/08/2017 22 | --------- 23 | 24 | * Merged PR #41 to fix issue #40 where the META_JAVA class was returning a tuple in one of its sub values. This was causing issues with external systems that had strict json interperters. Fix was to convert the tuple to a python dictionary / json sub-document. 25 | 26 | 27 | 1/10/2017 28 | --------- 29 | 30 | * Moving CLI arg input check for archive type out of the fsf-client module to the main section to make the client code easier to re-use. 31 | 32 | 33 | 12/20/2016 34 | --------- 35 | 36 | * Added new module META_MACHO - Collect data on Mach-o binaries (thanks zcatbear!) 
37 | 38 | 12/07/2016 39 | --------- 40 | 41 | * Better error output when an export directory cannot be created or written to. 42 | 43 | 08/28/2016 44 | --------- 45 | 46 | * Small bug fix in how connection attempts are made from client. 47 | 48 | 08/17/2016 49 | --------- 50 | 51 | * Merged pull request from spartan782. Allow fail over incase of multiple servers. 52 | 53 | 07/13/2016 54 | --------- 55 | 56 | * Small fix to make fsf virtualenv compatible 57 | 58 | 04/27/2016 59 | ---------- 60 | 61 | * Added new module: 62 | * EXTRACT_HEXASCII_PE - Snag encoded PE files inside of files (example in source) 63 | 64 | * Added new Yara signatures: 65 | * misc_hexascii_pe_in_html 66 | * misc_no_dosmode_header 67 | 68 | 02/11/2016 69 | ---------- 70 | 71 | * Formal 1.0 stable release :) 72 | 73 | * Removal of '--interactive' and '--not-interactive' modes from client/server 74 | 75 | * Introduction of new flags to the client to support more flexibility 76 | * Added '--source', to specify the source of the input. Useful when scaling up to larger operations or supporting multiple sources; such as integrating with a sensor grid or other network defense solutions. Defaults to 'Analyst' as submission source 77 | * Added '--delete' to remove file from client after sent to FSF server. Data can be archived later on server depending on selected options 78 | * Added '--archive' to specify how the file submission should be stored on the server (if at all) 79 | * The most common option is 'none' which will tell the server not to archive for this submission (default) 80 | * 'file-on-alert' will archive the file only if the alert flag is set 81 | * 'all-on-alert' will archive the file and all sub objects if the alert flag is set 82 | * 'all-the-files' will archive all the files sent to the scanner regardless of the alert flag 83 | * 'all-the-things' will archive the file and all sub objects regardless of the alert flag 84 | * Added '--suppress-report', don't return a JSON report back to the client and log client-side errors to the locally configured log directory. Choosing this will log scan results server-side only. Needed for automated scanning use cases when sending large amount of files for bulk collection. Set to false by default. 85 | 86 | * Updated documentation: 87 | * New process flow diagram to reflect changes 88 | * New overview picture to get rid of old 'interactive modes' 89 | * Updated [modules](https://github.com/EmersonElectricCo/fsf/blob/master/docs/MODULES.md) to reflect removal of 'interactive' modes and addition of new flags 90 | * Added a few usage notes to the readme based on the recent changes 91 | * fsf_client -h output 92 | * Added a module matrix to give an overview of capabilities 93 | 94 | 02/03/2016 95 | ---------- 96 | * Docker image updated (thanks wzod!) 97 | 98 | 02/01/2016 99 | ---------- 100 | 101 | * Updated documentation: 102 | * README update to include post-processing capability 103 | * Added documentation on incorporating jq filters for post-processing (JQ_FILTERS.md) 104 | * Updated FSF process diagram 105 | * Updated install documents to include new requirements: 106 | * Python modules: pyelftools, javatools, requests 107 | * Tools: jq 108 | 109 | * Introduced the addition of report post processing capability using jq filters! 
110 | * Observations informed by jq filters are now added to the FSF report summary 111 | * Check out the [documentation](https://github.com/EmersonElectricCo/fsf/blob/master/docs/JQ_FILTERS.md) 112 | 113 | * Added new modules: 114 | * META_ELF - Extract metadata contents inside ELF files 115 | * META_JAVA_CLASS - Expose requirements, capabilities, and other metadata inside Java class files 116 | * META_VT_INSPECT - Query VT for AV assessment on various files (public/private API key required) 117 | 118 | * Bug fixes: 119 | * Spacing issue with a few lines in fsf_client.py 120 | * UnicodeDecodeError with some kinds of macro files, adjusted EXTRACT_VBA_MACRO to accommodate 121 | 122 | * Added some starter jq filters: 123 | * embedded_sfx_rar_w_exe.jq 124 | * macro_gt_five_suspicious.jq 125 | * no_yara_hits.jq 126 | * vt_broadbased_detections_found.jq 127 | * vt_match_not_found.jq 128 | * exe_in_zip.jq 129 | * many_objects.jq 130 | * one_module.jq 131 | * vt_exploit_detections_found.jq 132 | * fresh_vt_scan.jq 133 | * more_than_ten_yara.jq 134 | * pe_recently_compiled.jq 135 | * vt_match_found.jq 136 | 137 | * Added new Yara signatures: 138 | * ft_elf.yara 139 | * ft_java_class.yara 140 | 141 | 01/09/2016 142 | ---------- 143 | * Docker image updated (thanks wzod!) 144 | 145 | 01/08/2016 146 | ---------- 147 | 148 | * Updated installation docs to include cabextract and latest pefile module 149 | 150 | * Added new module: 151 | * EXTRACT_CAB - Extract contents and metadata of MS CAB files. Requires installation of cabextract utility 152 | 153 | * Improved some modules: 154 | * META_PE - Now includes metadata for the entry point, image base, and import hash. Requires latest pefile module (>= 1.2.10-139) 155 | * META_BASIC_INFO - Made this an ordered dictionary for display reasons 156 | 157 | * Core changes to address some minor bugs. 158 | * Added server side timeout condition in off chance where client terminates connection mid transfer 159 | * Added small sanity check to verify input is from a true FSF client 160 | 161 | * Added new Yara signatures: 162 | * ft_cab.yara 163 | * ft_jar.yara 164 | 165 | 11/23/2015 166 | ---------- 167 | 168 | * NOTE - please ensure you have the OpenSSL development libraries installed (openssl-devel for RH distros, libssl-dev for Debian) before installing Yara. Otherwise signatures like the newly added `misc_pe_signature.yara` will not work! If you don't have these, please install them and then reinstall Yara. This has been captured in [Yara Issue #378](https://github.com/plusvic/yara/issues/378). 169 | 170 | * Updated installation requirements to include Python modules pyasn1 and pyasn1-modules. This is necessary to use META_PE_SIGNATURE. 171 | 172 | * Added new modules: 173 | * EXTRACT_RTF_OBJ - Get embedded, hexascii encoded, OLE objects within RTFs. 174 | * EXTRACT_TAR - Get metadata and embedded objects within a TAR file and extract them. Some interesting goodies in TAR metadata, you should check it out! 175 | * EXTRACT_GZIP - Get embedded file within GZIP archive and extract it 176 | * META_PE_SIGNATURE - Get certificate metadata from PE files. Long overdue and really useful I hope! 177 | 178 | * Improved some modules: 179 | * META_PE - Now delivers information on PE imports and export entries as appropriate, also provides version info 180 | * EXTRACT_ZIP - More generous on corrupt ZIP files. 
It will now process embedded archives the best it can; if one is corrupt, it will move on to the next instead of failing entirely 181 | * EXTRACT_RAR - Removed StringIO module in imports, was unnecessary 182 | 183 | * Added a section on jq tips for help interacting with FSF JSON output in docs 184 | * Filter out multiple nodes from JSON output 185 | * Show results from only one module 186 | * Contribute your own creative jq-fu! 187 | 188 | * Updated Test.json to accommodate output from module additions 189 | 190 | * Updated README.md with notes on jq use with FSF data 191 | 192 | * Added new Yara signatures: 193 | * ft_gzip.yara 194 | * ft_rtf.yara 195 | * ft_tar.yara 196 | * misc_pe_signature.yara 197 | 198 | * Docker image updated (thanks wzod!) 199 | 200 | 11/09/2015 201 | ---------- 202 | * Added detailed step-by-step installation instructions for Ubuntu and CentOS platforms. (thanks for the nudge cfossace!) 203 | 204 | 11/06/2015 205 | ---------- 206 | * Docker image updated (thanks wzod!) 207 | 208 | 11/05/2015 209 | ---------- 210 | * Changes to core code to accommodate the following: 211 | * Point client to more than one FSF server if desired 212 | * Added option for analyst to dump all subobjects returned to client 213 | * Added summary key value pairs for list of unique Yara signature hits as well as modules run with results. Helps to better digest output. 214 | 215 | * Added new modules: 216 | * EXTRACT_UPX - Unpack UPX packed binaries 217 | * EXTRACT_VBA_MACRO - Extract and scan macro for anomalies to include in report using oletools.olevba module 218 | 219 | * Added new Yara sig: 220 | * misc_upx_packed_binary.yara 221 | 222 | * Documentation updates: 223 | * Updated module howto and readme documentation to incorporate recent core changes 224 | * Added visual graphic of test.zip along with sample file and JSON output to help with understanding 225 | 226 | 10/14/2015 227 | ---------- 228 | * Minor grammar and usage clarifications (thanks mkayoh!) 229 | 230 | 09/28/2015 231 | ---------- 232 | * Docker image added (thanks wzod!)
233 | 234 | 08/28/2015 235 | ---------- 236 | * Added new modules: 237 | * EXTRACT_EMBEDDED - if hachoir subfile detects embedded content, rip it out and feed it back in for scanning 238 | * EXTRACT_SWF - return basic metadata about SWF, but also deflate LZMA or ZLib compressed SWF files 239 | * META_OLECF - Return basic metadata concerning an OLE document (hachoir again for the heavy lifting) 240 | * META_OOXML - Parse the core.xml file for various properties of a file 241 | * META_PDF - Return basic metadata on PDF files 242 | 243 | * Added new Yara sigs: 244 | * ft_office_open_xml.yara 245 | * ft_ole_cf.yara 246 | * ft_pdf.yara 247 | * ft_swf.yara 248 | * misc_compressed_exe.yara 249 | * misc_ooxml_core_properties.yara 250 | 251 | 08/05/2015 252 | ---------- 253 | * Initial commit 254 | -------------------------------------------------------------------------------- /Docker/Dockerfile: -------------------------------------------------------------------------------- 1 | # This Docker image encapsulates the File Scanning Framework (FSF) by 2 | # Emerson Electric Company from https://github.com/EmersonElectricCo/fsf 3 | # 4 | # To run this image after installing Docker using a standalone instance, use a command like 5 | # the following, replacing “~/fsf-workdir" with the path to the location of your FSF 6 | # working directory: 7 | # 8 | # sudo docker run --rm -it -v ~/fsf-workdir:/home/nonroot/workdir wzod/fsf 9 | # 10 | # To run this image using a networked instance, use a command like this after installing 11 | # FSF on the host system: 12 | # 13 | # sudo docker run --rm -it -p 5800:5800 -v ~/fsf-workdir:/home/nonroot/workdir wzod/fsf 14 | # 15 | # Before running FSF, create the ~/fsf-workdir and make it world-accessible 16 | # (“chmod a+xwr"). 17 | # 18 | # Licensed under the Apache License, Version 2.0 (the "License"); 19 | # you may not use this file except in compliance with the License. 20 | # You may obtain a copy of the License at 21 | # 22 | # http://www.apache.org/licenses/LICENSE-2.0 23 | # 24 | # Unless required by applicable law or agreed to in writing, software 25 | # distributed under the License is distributed on an "AS IS" BASIS, 26 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 27 | # See the License for the specific language governing permissions and 28 | # limitations under the License. 
29 | # 30 | 31 | FROM ubuntu:16.04 32 | MAINTAINER Zod (@wzod) 33 | 34 | ENV DEBIAN_FRONTEND noninteractive 35 | 36 | USER root 37 | RUN apt-get update && \ 38 | apt-get -y install software-properties-common && \ 39 | apt-add-repository -y multiverse && \ 40 | apt-get -qq update && apt-get install -y --fix-missing \ 41 | autoconf \ 42 | automake \ 43 | build-essential \ 44 | cabextract \ 45 | dh-autoreconf \ 46 | git \ 47 | jq \ 48 | libffi-dev \ 49 | libfuzzy-dev \ 50 | libpython2.7-stdlib \ 51 | libssl-dev \ 52 | libtool \ 53 | make \ 54 | net-tools \ 55 | python-dev \ 56 | python-minimal \ 57 | python-pip \ 58 | python-setuptools \ 59 | ssdeep \ 60 | unrar \ 61 | unzip \ 62 | upx-ucl \ 63 | vim \ 64 | wget && \ 65 | 66 | # Update setuptools 67 | pip install --upgrade setuptools 68 | 69 | # Retrieve current version of Yara via wget, verify known good hash and install Yara 70 | RUN cd /tmp && \ 71 | wget -O yara.v3.5.0.tar.gz "https://github.com/VirusTotal/yara/archive/v3.5.0.tar.gz" && \ 72 | echo 4bc72ee755db85747f7e856afb0e817b788a280ab5e73dee42f159171a9b5299\ \ yara.v3.5.0.tar.gz > sha256sum-yara && \ 73 | sha256sum -c sha256sum-yara && \ 74 | 75 | tar vxzf yara.v3.5.0.tar.gz && \ 76 | cd yara-3.5.0/ && \ 77 | ./bootstrap.sh && \ 78 | ./configure && \ 79 | make && \ 80 | make install && \ 81 | cd /tmp && \ 82 | 83 | # Retrieve yara-python from the project's site using recursive option and install yara-python 84 | git clone --recursive https://github.com/VirusTotal/yara-python && \ 85 | cd yara-python/ && \ 86 | python setup.py build && \ 87 | python setup.py install && \ 88 | cd /tmp && \ 89 | 90 | # Retrieve current version of pefile via wget, verify known good hash and install pefile 91 | wget -O pefile-1.2.10-139.tar.gz "https://github.com/erocarrera/pefile/archive/pefile-1.2.10-139.tar.gz" && \ 92 | echo 3297cb72e6a51befefc3d9b27ec7690b743ee826538629ecf68f4eee64f331ab\ \ pefile-1.2.10-139.tar.gz > sha256sum-pefile && \ 93 | sha256sum -c sha256sum-pefile && \ 94 | 95 | tar vxzf pefile-1.2.10-139.tar.gz && \ 96 | cd pefile-pefile-1.2.10-139/ && \ 97 | sed -i s/1\.2\.10.*/1\.2\.10\.139\'/ pefile.py && \ 98 | python setup.py build && \ 99 | python setup.py install && \ 100 | cd /tmp && \ 101 | 102 | # Retrieve current version of jq via wget, verify known good hash and move to /usr/local/bin 103 | wget -O jq "https://github.com/stedolan/jq/releases/download/jq-1.5/jq-linux64" && \ 104 | echo c6b3a7d7d3e7b70c6f51b706a3b90bd01833846c54d32ca32f0027f00226ff6d\ \ jq > sha256sum-jq && \ 105 | sha256sum -c sha256sum-jq && \ 106 | chmod 755 jq && \ 107 | mv jq /usr/local/bin/ 108 | 109 | # Install additional dependencies 110 | RUN pip install czipfile \ 111 | hachoir-parser \ 112 | hachoir-core \ 113 | hachoir-regex \ 114 | hachoir-metadata \ 115 | hachoir-subfile \ 116 | ConcurrentLogHandler \ 117 | pypdf2 \ 118 | xmltodict \ 119 | rarfile \ 120 | pylzma \ 121 | oletools \ 122 | pyasn1_modules \ 123 | pyasn1 \ 124 | pyelftools \ 125 | javatools \ 126 | requests \ 127 | git+https://github.com/aaronst/macholibre.git && \ 128 | 129 | BUILD_LIB=1 pip install ssdeep 130 | 131 | # Add nonroot user, clone repo and setup environment 132 | RUN groupadd -r nonroot && \ 133 | useradd -r -g nonroot -d /home/nonroot -s /sbin/nologin -c "Nonroot User" nonroot && \ 134 | mkdir /home/nonroot && \ 135 | chown -R nonroot:nonroot /home/nonroot && \ 136 | echo "/usr/local/lib" >> /etc/ld.so.conf.d/yara.conf 137 | 138 | USER nonroot 139 | RUN mkdir -pv /home/nonroot/workdir && \ 140 | cd /home/nonroot && \ 141 | 
git clone https://github.com/EmersonElectricCo/fsf.git && \ 142 | cd fsf/ && \ 143 | sed -i 's/\/FULL\/PATH\/TO/\/home\/nonroot/' fsf-server/conf/config.py && \ 144 | sed -i "/^SCANNER\_CONFIG/ s/\/tmp/\/home\/nonroot\/workdir/" fsf-server/conf/config.py 145 | 146 | USER root 147 | RUN ldconfig && \ 148 | ln -f -s /home/nonroot/fsf/fsf-server/main.py /usr/local/bin/ && \ 149 | ln -f -s /home/nonroot/fsf/fsf-client/fsf_client.py /usr/local/bin/ && \ 150 | apt-get remove -y --purge automake build-essential libtool && \ 151 | apt-get autoremove -y --purge && \ 152 | apt-get clean -y && \ 153 | rm -rf /var/lib/apt/lists/* 154 | 155 | USER nonroot 156 | ENV HOME /home/nonroot 157 | ENV USER nonroot 158 | WORKDIR /home/nonroot/workdir 159 | 160 | ENTRYPOINT sed -i "/^SERVER_CONFIG/ s/127\.0\.0\.1/$(hostname -i)/" /home/nonroot/fsf/fsf-client/conf/config.py && main.py start && printf "\n\n" && echo "<----->" && echo "FSF server daemonized!" && echo "<----->" && printf "\n" && echo "Invoke fsf_client.py by giving it a file as an argument:" && printf "\n" && echo "fsf_client.py " && printf "\n" && echo "Alternatively, Invoke fsf_client.py by giving it a file as an argument and pass to jq so you can interact extensively with the JSON output:" && printf "\n" && echo "fsf_client.py | jq -C . | less -r" && printf "\n" && echo "To access all of the subobjects that are recursively processed, simply add --full when invoking fsf_client.py:" && printf "\n" && echo "fsf_client.py --full" && printf "\n" && /bin/bash 161 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 
39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright {yyyy} {name of copyright owner} 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | 203 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | File Scanning Framework (FSF) v1.1 2 | ============== 3 | 4 | Introduction 5 | ------------ 6 | 7 | __What is the ‘file scanning framework’?__ 8 | 9 | Network defenders should be empowered to drive capabilities forward how they see fit. This is the philosophy upon which, FSF was designed. 10 | 11 | FSF is a modular, recursive file scanning solution. 
FSF enables analysts to extend the utility of the Yara signatures they write and define actionable intelligence within a file. This is accomplished by recursively scanning a file and looking for opportunities to extract file objects using a combination of Yara signatures (to define opportunities) and programmable logic (to define what to do with the opportunity). 12 | The framework allows you to build out your intelligence capability by empowering you to apply observations wrought out of the analytical process… 13 | 14 | Okay that’s a mouthful – but think about it – if you see some pattern (maybe a string or a byte sequence) that represents some concept or behavior, then through the use of the framework you are positioned to capture that observation and apply it to certain file types that meet your criteria. The goal is to help extend the utility of observations from malware analysis and reverse engineering efforts. 15 | 16 | Some examples might be: 17 | * Uncompressing ZIP files and scanning their contents. 18 | * Decoding a malware config file that matches a specific signature, then parsing the metadata. 19 | * General metadata enrichment for any file type. 20 | * Logging the compile time for any EXE 21 | * Logging the author field for office documents 22 | * So much more... 23 | 24 | You can extend and define what’s important by writing modules that expose pieces of metadata that inform analysis and expose new sub objects of a file! These sub objects are recursively scanned through the same gauntlet, further enhancing both Yara and module utility. 25 | 26 | Once that is complete, you can add jq filters using the post-processing feature to capture certain items of interest from FSF output. Both Yara and jq may be used to capture observations and drive innovative detections! 27 | 28 | __If we alert on a signature, how will we know?__ 29 | 30 | This decision is left up to you since there are many ways to do this. One suggestion might be to aggregate and index the scan.log data using something like [Splunk](http://www.splunk.com/) or an [ELK Stack](http://brewhouse.io/blog/2014/11/04/big-data-with-elk-stack.html). You can then build your alerting into the capability. 31 | 32 | __Is there a way I can take action on a specific rule hit from within the FSF? Like print out metadata for certain file types?__ 33 | 34 | This is precisely what modules are for! Module development driven by analyst observations is a cornerstone of the FSF! 35 | 36 | __What if I want to capture high level observations and even detect on relationships between files that FSF exposes?__ 37 | 38 | This is all done via a post-processing feature that is driven in large part by jq (a JSON interpreter). To learn more about how to write jq filters that work with the FSF post-processor, check out [docs/jq_filters.md](https://github.com/EmersonElectricCo/fsf/blob/master/docs/JQ_FILTERS.md). 39 | 40 | __This is pretty cool – but I don’t really know that much about Yara or jq?__ 41 | 42 | Check out the [Yara official documentation](http://yara.readthedocs.org/) for more information and examples for Yara. 43 | 44 | The official [jq](https://stedolan.github.io/jq/) website contains great tutorials and documentation as well. 45 | 46 | __What are the tool’s limitations?__ 47 | 48 | * Since we recursively process objects, a configurable `MAX_DEPTH` value is enforced to bound how deep recursion goes. 49 | * A `TIMEOUT` value is imposed on each module run; a module that exceeds it is terminated (a sketch of this idea follows below).
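To make the timeout limitation concrete, the snippet below is a minimal, illustrative sketch (not FSF's actual code) of how a per-module time budget can be enforced by running a module in a child process and abandoning it once the budget expires; the function and variable names here are hypothetical.

```
# Illustrative sketch only -- not FSF's actual implementation.
# Runs a module in a child process and gives up on it once the
# per-module time budget (the TIMEOUT concept above) expires.
import multiprocessing

def _invoke(module_func, scanner, buff, queue):
    # Child process: run the module and hand its results back.
    queue.put(module_func(scanner, buff))

def run_with_timeout(module_func, scanner, buff, timeout_seconds):
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_invoke,
                                   args=(module_func, scanner, buff, queue))
    proc.start()
    proc.join(timeout_seconds)
    if proc.is_alive():
        proc.terminate()   # module ran too long; abandon it
        proc.join()
        return {}          # empty results are dropped from the report
    return queue.get() if not queue.empty() else {}
```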
50 | 51 | __Is there a general process flow that can help me understand what's going on?__ 52 | 53 | Yes. For a complete process flow, refer to the graphic found at [docs/FSF Process.png](https://github.com/EmersonElectricCo/fsf/blob/master/docs/FSF%20Process.png). You may also find the high level overview graphic at [docs/FSF Overview.png](https://github.com/EmersonElectricCo/fsf/blob/master/docs/FSF%20Overview.png) helpful. 54 | 55 | __Is there helpful documentation on how to write modules?__ 56 | 57 | Absolutely. Check out [docs/modules.md](https://github.com/EmersonElectricCo/fsf/blob/master/docs/MODULES.md) for a great primer on how to get started. 58 | 59 | __What kind of modules are written and what do they do?__ 60 | 61 | The table below provides this information: 62 | 63 | |Module|Description| 64 | |---|---| 65 | |SCAN_YARA|Scan the incoming object against a series of Yara signatures.| 66 | |EXTRACT_EMBEDDED|Use hachoir library to extract embedded files and process them.| 67 | |META_BASIC_INFO|Get basic information about an object to display; size, MD5, SHA1, etc...| 68 | |META_PE|Get as much metadata about an EXE file as possible.| 69 | |EXTRACT_ZIP|Get metadata on embedded objects within a ZIP file and extract them.| 70 | |EXTRACT_RAR|Get metadata on embedded objects within a RAR file and extract them.| 71 | |EXTRACT_SWF|Get metadata on embedded objects within ZWS, CWS, or FWS files and extract them.| 72 | |META_OLECF|Get metadata from OLECF files (legacy Office documents); creation date, modification, author name, etc...| 73 | |META_OOXML|Get metadata from OOXML files (modern Office documents); creation date, modification, author name, etc...| 74 | |META_PDF|Get metadata from PDF files; creation date, modification, author name, etc...| 75 | |EXTRACT_VBA_MACRO|Extract macros from OLE document, scan and capture suspicious attributes.| 76 | |EXTRACT_UPX|Automatically unpack UPX compressed binaries.| 77 | |EXTRACT_RTF_OBJ|Get embedded hexascii objects within RTF files.| 78 | |EXTRACT_GZIP|Get embedded object within a GZIP file and extract it.| 79 | |EXTRACT_TAR|Get metadata on embedded objects within a TAR file and extract them.| 80 | |META_PE_SIGNATURE|Get certificate metadata from PE files.| 81 | |EXTRACT_CAB|Uncompress MS CAB files.| 82 | |META_ELF|Expose metadata within ELF binaries.| 83 | |META_JAVA_CLASS|Expose requirements, capabilities, and other metadata inside Java class files.| 84 | |META_VT_INSPECT|Get VirusTotal info concerning a specific file MD5. (Requires Public or Private API Key)| 85 | |EXTRACT_HEXASCII_PE|Get encoded PE elements out of files and convert to binary.| 86 | |META_MACHO|Expose metadata within Mach-O binaries.| 87 | 88 | __How does this scale up if I want to 'scan all the things'?__ 89 | 90 | The server is parallelized and supports running multiple jobs at the same time. As an example, I've provided one possible way you can accomplish this by integrating with Bro, extracting files, and sending them over to the FSF server. You can find this at the bottom of [docs/modules.md](https://github.com/EmersonElectricCo/fsf/blob/master/docs/MODULES.md) under the heading 'Automated File Extraction'. A minimal sketch of what such a bulk submission loop can look like is shown below.
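The loop below is an illustrative sketch only: the extraction directory and source name are hypothetical, and the flags used are simply the documented client flags (`--source`, `--archive`, `--delete`, `--suppress-report`).

```
# Illustrative sketch of a bulk submission loop (hypothetical paths/names).
import glob
import subprocess

EXTRACTED_DIR = '/data/extracted_files'   # hypothetical sensor output directory

for path in glob.glob(EXTRACTED_DIR + '/*'):
    # Suppress the report and delete the local copy once sent; let the
    # server archive anything that trips an alert.
    subprocess.call(['fsf_client.py', path,
                     '--source', 'sensor_grid',
                     '--archive', 'file-on-alert',
                     '--delete',
                     '--suppress-report'])
```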
91 | 92 | Some key advantages to Bro integration are: 93 | 94 | * Ability to direct files to a given FSF scanner node on a per-sensor basis 95 | * Use of the Bro scripting language to help optimize inputs, some examples might include: 96 | * Limit sending of a file we've already seen for a certain time interval to avoid redundancy (based on MD5, etc) 97 | * Limit the size of the files you extract if desired 98 | * Control over MIME types you care to pass on to FSF 99 | 100 | __What if I want to do load balancing across several FSF servers?__ 101 | 102 | You can easily integrate different load balancing solutions with FSF if you wish. Doing so, combined with the server's parallel processing of each request, has many performance and reliability benefits. It also gives you the flexibility to do load balancing the way you want to, such as equal distribution, grouping, failover, some combination, and more... 103 | 104 | For example, you can use the popular utility [Balance](https://www.inlab.de/balance.html) to configure simple load balancing between FSF nodes with a single command. 105 | 106 | `balance -f 5800 10.0.3.5 10.0.3.6` 107 | 108 | The above tells balance to run in the foreground on port 5800, and equally distribute requests between the two hosts specified (10.0.3.5 and 10.0.3.6). By default, the requests will be forwarded on port 5800 as well unless otherwise specified. Now we can just point our FSF clients to our load balancer and let it do the work for us. 109 | 110 | Of course, you can use a different load balancing solution if you'd like; this is just a quick example. You can even specify multiple FSF servers/balancers using the client config file if desired. When doing this, the FSF server for each request is chosen at random, allowing for some rudimentary balancing. 111 | 112 | 113 | __How can I get access to the subobjects that are recursively processed?__ 114 | 115 | Ah, so are you tired of using `hachoir-subfile` + `dd` to carve out files during static analysis? Or perhaps running `unzip` or `unrar` to get decompressed files, `upx -d` to get unpacked files, or `OfficeMalScan` to get macros over and over is getting old? 116 | 117 | Well, you can certainly use FSF to do the heavy lifting if you'd like. It incorporates the components that make the above tools so helpful into the framework. For other use cases, all you need is to ensure the intelligence to do what you want is built into the framework (Yara + Module)! Several open source modules included with the package help with this. 118 | 119 | To support analysts submitting files using the client, the --full option will return all the subobjects collected in a new directory. 120 | 121 | A word of caution however: make sure you understand how to do it the hard way first! 122 | 123 | ``` 124 | fsf_client.py macro_test --full 125 | ...normal report information... 126 | Subobjects of macro_test successfully written to: fsf_dump_1446676465_6ba593d8d5defd6fbaa96a1ef2bc601d 127 | ``` 128 | 129 | If you want to collect sub objects on a grander scale server-side, look into the --archive option. You have five built-in choices which allow you to determine how aggressively you want to capture extracted data. 130 | 131 | __Okay I think I understand, but I'd like a visual representation of what a 'report' looks like?__ 132 | 133 | Take a look at the following graphic in [docs/Example Test.png](https://github.com/EmersonElectricCo/fsf/blob/master/docs/Example%20Test.png).
That represents the file `test.zip` which may be found in [docs/Test.zip](https://github.com/EmersonElectricCo/fsf/blob/master/docs/Test.zip). That file, when recursively processed using FSF, outputs what's found in [docs/Test.json](https://github.com/EmersonElectricCo/fsf/blob/master/docs/Test.json). 134 | 135 | Each object within this file represents an opportunity to collect/enrich intelligence to drive more informed detections, adversary awareness, correlations, and overall analytical tradecraft. 136 | 137 | __There's a lot of JSON output here... What tools exist to help me interact with this data effectively over the command line?__ 138 | 139 | [Jq](https://stedolan.github.io/jq/) is a great utility to help work with JSON data. You might find yourself wanting to filter out certain modules when reviewing FSF JSON output for intel gain. Please refer to [docs/jq_examples.md](https://github.com/EmersonElectricCo/fsf/blob/master/docs/JQ_EXAMPLES.md) for some helpful 'FSF specific' examples to accommodate such inquiries. I'd also suggest taking a peek at the [jq Cookbook](https://github.com/stedolan/jq/wiki/Cookbook) for more great examples. 140 | 141 | Finally, don't be afraid to check out some of the jq filters we've open sourced as part of the post-processing feature! 142 | 143 | Installation 144 | ------------ 145 | 146 | FSF has been tested to work successfully on CentOS and Ubuntu distributions. 147 | 148 | Please refer to [docs/INSTALL.md](https://github.com/EmersonElectricCo/fsf/blob/master/docs/INSTALL.md) for a detailed, step-by-step guide on how to get started with either platform. 149 | 150 | Alternatively, you can check out our [Dockerfile](https://github.com/EmersonElectricCo/fsf/blob/master/Docker/Dockerfile) if you'd like. 151 | 152 | Setup 153 | ----- 154 | 155 | Check your configuration settings: 156 | * __Server-side__ - In [fsf-server/conf/config.py](https://github.com/EmersonElectricCo/fsf/blob/master/fsf-server/conf/config.py) 157 | * Make sure you are pointing to your master Yara signature file using the full path. See [fsf-server/yara/rules.yara](https://github.com/EmersonElectricCo/fsf/blob/master/fsf-server/yara/rules.yara) 158 | * Set the logging directory; make sure it exists and ensure you have permissions to write to it 159 | * In [fsf-server](https://github.com/EmersonElectricCo/fsf/tree/master/fsf-server), start up the server using `./main.py start` and it will daemonize 160 | * __Client-side__ - In [fsf-client/conf/config.py](https://github.com/EmersonElectricCo/fsf/blob/master/fsf-client/conf/config.py) 161 | * Point to your server(s) being used to scan files 162 | * Submit a file with `fsf_client.py `; you can use a wildcard to scan all of the files in a directory 163 | 164 | The client may be invoked with the following flags: 165 | 166 | ``` 167 | usage: fsf_client [-h] [--delete] [--source [SOURCE]] [--archive [ARCHIVE]] 168 | [--suppress-report] [--full] 169 | [file [file ...]] 170 | 171 | Uploads files to scanner server and returns the results to the user if 172 | desired. Results will always be written to a server side log file. Default 173 | options for each flag are designed to accommodate easy analyst interaction. 174 | Adjustments can be made to accommodate larger operations. Read the 175 | documentation for more details! 176 | 177 | positional arguments: 178 | file Full path to file(s) to be processed.
179 | 180 | optional arguments: 181 | -h, --help show this help message and exit 182 | --delete Remove file from client after sending to the FSF 183 | server. Data can be archived later on server depending 184 | on selected options. 185 | --source [SOURCE] Specify the source of the input. Useful when scaling up 186 | to larger operations or supporting multiple input 187 | sources, such as; integrating with a sensor grid or 188 | other network defense solutions. Defaults to 'Analyst' 189 | as submission source. 190 | --archive [ARCHIVE] Specify the archive option to use. The most common 191 | option is 'none' which will tell the server not to 192 | archive for this submission (default). 'file-on-alert' 193 | will archive the file only if the alert flag is set. 194 | 'all-on-alert' will archive the file and all sub 195 | objects if the alert flag is set. 'all-the-files' will 196 | archive all the files sent to the scanner regardless of 197 | the alert flag. 'all-the-things' will archive the file 198 | and all sub objects regardless of the alert flag. 199 | --suppress-report Don't return a JSON report back to the client and log 200 | client-side errors to the locally configured log 201 | directory. Choosing this will log scan results server- 202 | side only. Needed for automated scanning use cases when 203 | sending large amount of files for bulk collection. Set 204 | to false by default. 205 | --full Dump all sub objects of submitted file to current 206 | directory of the client. Format or directory name is 207 | 'fsf_dump_[epoch time]_[md5 hash of scan results]'. 208 | Only supported when suppress-report option is false 209 | (default). 210 | ``` 211 | -------------------------------------------------------------------------------- /docs/Example Test.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/EmersonElectricCo/fsf/15303aa298414397f9aa5d19ca343040a0fe0bbd/docs/Example Test.png -------------------------------------------------------------------------------- /docs/FSF Overview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/EmersonElectricCo/fsf/15303aa298414397f9aa5d19ca343040a0fe0bbd/docs/FSF Overview.png -------------------------------------------------------------------------------- /docs/FSF Process.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/EmersonElectricCo/fsf/15303aa298414397f9aa5d19ca343040a0fe0bbd/docs/FSF Process.png -------------------------------------------------------------------------------- /docs/INSTALL.md: -------------------------------------------------------------------------------- 1 | Install Guide 2 | ============= 3 | 4 | The following step-by-step instructions were tested against Ubuntu Server 14.04.3 and CentOS 7. 5 | 6 | Required Packages 7 | ------------------ 8 | 9 | Install the following required packages. Once you complete this step, the rest of the installation is the same for either platform. 10 | 11 | ### Ubuntu ### 12 | 13 | ``` 14 | sudo apt-get install autoconf dh-autoreconf python-dev libpython2.7-stdlib python-pip libffi-dev ssdeep upx unrar libfuzzy-dev unzip wget vim libssl-dev net-tools cabextract 15 | ``` 16 | 17 | ### CentOS ### 18 | 19 | `sudo yum install autoconf python-devel automake wget vim libtool openssl openssl-devel net-tools` 20 | 21 | Turn on EPEL repo. 
22 | 23 | `sudo yum install epel-release` 24 | 25 | Turn on RPMForge repo. 26 | ``` 27 | wget http://ftp.tu-chemnitz.de/pub/linux/dag/redhat/el7/en/x86_64/rpmforge/RPMS/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm 28 | rpm -Uvh rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm 29 | ``` 30 | Get remaining packages. 31 | 32 | `sudo yum install python-argparse python-pip ssdeep-devel libffi-devel unrar upx unzip cabextract` 33 | 34 | Installing Yara 35 | ------------------ 36 | 37 | Make sure you are getting the latest and greatest version of Yara... 38 | ``` 39 | wget https://github.com/plusvic/yara/archive/v3.4.0.tar.gz 40 | tar -xvzf v3.4.0.tar.gz 41 | cd yara-3.4.0/ 42 | ./bootstrap.sh 43 | ./configure 44 | make 45 | sudo make install 46 | ``` 47 | 48 | Python Yara module install. 49 | ``` 50 | cd yara-python/ 51 | python setup.py build 52 | sudo python setup.py install 53 | ``` 54 | Ensure those new libraries can be found. 55 | 56 | `sudo vim /etc/ld.so.conf.d/yara.conf` 57 | 58 | Add the line `/usr/local/lib`. 59 | 60 | Reload necessary libraries. 61 | 62 | `sudo ldconfig` 63 | 64 | Installing JQ 65 | ------------- 66 | Get the latest JQ package, set the right perms and move it to a known path: 67 | ``` 68 | wget https://github.com/stedolan/jq/releases/download/jq-1.5/jq-linux64 -O jq 69 | chmod 755 jq 70 | sudo mv jq /usr/local/bin/ 71 | ``` 72 | 73 | Python Modules 74 | -------------- 75 | 76 | Install the following Python modules using `pip`. 77 | 78 | ``` 79 | sudo easy_install -U setuptools 80 | sudo pip install czipfile pefile hachoir-parser hachoir-core hachoir-regex hachoir-metadata hachoir-subfile ConcurrentLogHandler pypdf2 xmltodict rarfile ssdeep pylzma oletools pyasn1_modules pyasn1 pyelftools javatools requests 81 | ``` 82 | NOTE: Ensure pefile is at least version pefile-1.2.10-139. On some distros an older version is installed, which means you will need to build from source. To do this, simply follow the instructions below... 83 | 84 | ``` 85 | wget https://github.com/erocarrera/pefile/files/192316/pefile-2016.3.28.tar.gz 86 | tar -xvzf pefile-2016.3.28.tar.gz 87 | cd pefile-2016.3.28 88 | python setup.py build 89 | sudo python setup.py install 90 | ``` 91 | 92 | Install macholibre for Python 2. 93 | ``` 94 | git clone https://github.com/aaronst/macholibre.git 95 | cd macholibre 96 | git checkout python2 97 | python setup.py build 98 | python setup.py install 99 | ``` 100 | 101 | Install FSF 102 | ------------ 103 | 104 | Retrieve latest version of master. 105 | 106 | ``` 107 | cd ~ 108 | wget https://github.com/EmersonElectricCo/fsf/archive/master.zip 109 | unzip master.zip 110 | vim fsf-master/fsf-server/conf/config.py 111 | ``` 112 | Point `YARA_PATH` to the full path to `rules.yara`, in our case `/home/_username_/fsf-master/fsf-server/yara/rules.yara`. 113 | 114 | Start the daemon. 115 | ``` 116 | cd fsf-master/fsf-server 117 | ./main.py start 118 | ``` 119 | 120 | Check how it is being locally hosted with `netstat -na | grep 5800`. By default it is 127.0.0.1, but sometimes that needs to change, like here :) 121 | ``` 122 | netstat -na | grep 5800 123 | tcp 0 0 127.0.1.1:5800 0.0.0.0:* LISTEN 124 | ``` 125 | 126 | If necessary, change `IP_ADDRESS` in client config. 127 | 128 | `vim ../fsf-client/conf/config.py` 129 | 130 | Finally, test it out! 131 | ``` 132 | cd ../fsf-client/ 133 | ./fsf_client.py ~/fsf-master/docs/Test.zip 134 | ``` 135 | 136 | Get all subobjects!
137 | 138 | `./fsf_client.py ~/fsf-master/docs/Test.zip --full` 139 | 140 | You should get a bunch of pretty JSON and a dump of subobjects if you use `--full`. 141 | 142 | Problems? Check out `/tmp/daemon.log` and/or `/tmp/dbg.log`. 143 | 144 | Success? Awesome! If you have any ideas or desire to contribute modules or Yara signatures, please share them! 145 | 146 | Extra Stuff 147 | ----------- 148 | 149 | Users scanning at a large scale will likely want to have some level of log rotation baked into the deployment. Below is a simple example using logrotate. 150 | 151 | Create the following file _/etc/logrotate.d/scanner_ and give it the following configuration options... 152 | 153 | ``` 154 | compress 155 | copytruncate 156 | 157 | /YOUR/LOG/PATH/*.log { 158 | weekly 159 | create 0664 YOUR_USER YOUR_GROUP 160 | rotate 5 161 | } 162 | ``` 163 | The above will compress log files on a weekly basis in your directory. It will assign the permissions to the user and group you supply, and logs will rotate off after five weeks. The _copytruncate_ option is important to ensure logs like _daemon.log_ will continue logging data after it is rotated. 164 | -------------------------------------------------------------------------------- /docs/JQ_EXAMPLES.md: -------------------------------------------------------------------------------- 1 | This page is meant to help folks interested in using JQ to interact with the JSON data produced by FSF. 2 | 3 | Remove JSON Nodes 4 | ----------------- 5 | 6 | Create the following JQ script 7 | 8 | ``` 9 | vim fsf_module_filter.jq 10 | def post_recurse(f): 11 | def r: 12 | (f | select(. != null) | r), .; 13 | r; 14 | def post_recurse: 15 | post_recurse(.[]?); 16 | (post_recurse | objects) |= reduce $delete[] as $d (.; delpaths([[ $d ]])) 17 | ``` 18 | 19 | Invocation with multiple nodes, using the sample [Test.json](https://github.com/EmersonElectricCo/fsf/blob/master/docs/Test.json) from FSF. 20 | 21 | ``` 22 | cat Test.json | jq --argjson delete '["META_BASIC_INFO","SCAN_YARA"]' -f fsf_module_filter.jq | less 23 | ``` 24 | 25 | Show Select JSON Nodes 26 | ---------------------- 27 | 28 | Show results from only one module 29 | 30 | ``` 31 | cat Test.json | jq '..|.SCAN_YARA? | select(type != "null")' 32 | ``` 33 | -------------------------------------------------------------------------------- /docs/JQ_FILTERS.md: -------------------------------------------------------------------------------- 1 | Jq Filter Integration 2 | ===================== 3 | 4 | Purpose 5 | ------- 6 | 7 | Examining FSF output can be quite cumbersome due to how rich some of the output can be. Additionally, there are also scenarios where an analyst might wish to capture a unique relationship between an object and a sub-object that FSF exposes. For example, you might think it interesting that an executable with high entropy (as measured by a Yara signature) came from a RAR or a ZIP file. Unfortunately, capturing these observations is a bit of a chicken and egg problem. How can one know the relationships between files and various metadata elements until they have been fully processed? 8 | 9 | To overcome this gap, the post-processing engine was added. Much like one would use Yara to capture observations on a file, the post-processor uses a similar approach, but instead of using Yara, uses jq. Jq is a mature and very powerful JSON interpreter and may be extended to capture unique observations on FSF data. You can even develop detections based on relationships seen within FSF JSON output!
In this paradigm, we can think of jq filters as jq signatures. 10 | 11 | Interested? Read on for more on how post-processing has been implemented. 12 | 13 | Fundamentals 14 | ------------ 15 | 16 | As with everything in FSF, it's all about exposing intelligence. In the post-processing paradigm, we can expose intelligence concerning relationships from one file to another or one metadata attribute to another. Certain relationships are more noteworthy than others. Some are worth capturing but not alerting on; others might drive such a detection! 17 | 18 | Use Cases 19 | --------- 20 | 21 | Why would someone want to write jq filters on FSF output? Here are some use cases you might find interesting... 22 | 23 | * The presence of certain filetypes within a file; such as an SCR/EXE within a ZIP, RAR, Office document, etc... 24 | * When the compile time for an executable is within a certain time frame, say < 24 hours old. 25 | * When the number of _suspicious_ macros exceeds a certain threshold. 26 | 27 | In FSF, these observations are captured in a summary dictionary. Below is a brief snippet of multiple jq filters triggering different conditions from a large report. 28 | 29 | ``` 30 | "Observations": [ 31 | "An executable was found inside a ZIP file.", 32 | "An embedded file contained a self-extracting RAR that itself contained an executable payload.", 33 | "More than 10 unique objects were observed in this file.", 34 | "There were no matches found when VirusTotal was queried.", 35 | "More than 10 unique Yara signatures fired when processing this file!" 36 | ] 37 | ``` 38 | 39 | Implementation 40 | -------------- 41 | 42 | Jq filters designed to analyze FSF data __MUST__ return a boolean result. Testing whether or not one will work is as simple as piping FSF output to the jq interpreter. Once you are confident in your approach, simply do the following. 43 | 44 | * Add your jq script to the _jq_ directory within the fsf-server folder. 45 | * Add a tuple entry to the _disposition.py_ list entitled _post_processor_. 46 | * The first element is your jq signature name. 47 | * The second is the observation you want to capture. 48 | * The last is whether or not you want to set the alert flag based on the observation. 49 | 50 | The following is an example of what this would look like within the _disposition.py_ file: 51 | 52 | ``` 53 | # STRUCTURE: List of tuples such that... 54 | # Types: [('string', 'string', boolean), ...] 55 | # Variables: [('jq script', 'observation' , 'is archivable'), ...] 56 | 57 | post_processor = [('one_module.jq', 'Only one kind of module was run on for this report.', False), 58 | ('no_yara_hits.jq', 'There doesn\'t appear to be any Yara signature hits for this scan.', False), 59 | ``` 60 | -------------------------------------------------------------------------------- /docs/MODULES.md: -------------------------------------------------------------------------------- 1 | Writing Modules 2 | ============== 3 | 4 | Purpose 5 | ------------ 6 | 7 | This documentation will go over the process of making contributions to the File Scanning Framework. Modules are intended to be very easy to write and contribute. They can even be dynamically updated on a scanning service while the daemon is running. 8 | 9 | Fundamentals 10 | ------------ 11 | 12 | The following is a bulleted list of important files within the framework and their purpose. 13 | 14 | * `conf/config.py` - Configuration file for the server.
Used to define the IP address and port to listen on, the timeout value for each module, where the central Yara file with all the includes lives, where to export files that trigger an alert, and how deep to recursively process a single object. 15 | * `conf/disposition.py` - Configuration file used to define any actions that should be taken on files being processed. Drives alerting decisions on files that match Yara signatures and defines the module(s) to run as a result of a signature hit. 16 | * `modules/` - Add your modules here to incorporate them into the framework by editing the `__init__.py` file. Ensure your module is in the _modules_ directory. 17 | 18 | The scanner can be invoked using a variety of different options. By default they accommodate submissions by an analyst. However, they can easily be tuned to support larger operations, such as a sensor grid or other automated sources. 19 | 20 | Module Overview 21 | ------------ 22 | All modules are stored in the _modules_ directory and follow a loosely defined naming convention: the META prefix is reserved for modules that return metadata from a parsed buffer, and EXTRACT denotes modules that do some level of decoding or decompression, perhaps in addition to returning metadata. 23 | 24 | There is a `modules/template.py` file in the modules directory that serves as a simple starting point. 25 | 26 | ### Module Requirements ### 27 | 28 | * By convention, your module must have a function with your module's name. (Example: a module named META_TEST.py should have a function named META_TEST.) 29 | * This is what FSF will call when your module is plugged in 30 | * This function must accept two parameters, a scanner object and a buffer to process 31 | * The main function must return a dictionary 32 | * Empty dictionary objects are deleted before being displayed 33 | 34 | ### Scanner Object ### 35 | 36 | Modules are granted access to a scanner object which has the following attributes: 37 | * Filename - (String) - Name of the initial file being analyzed 38 | * Source - (String) - Name given to the submission source (analyst by default) 39 | * Archive - (String) - Archival criteria sent from the submitter 40 | * Suppress report - (Boolean) - Indicates whether the client expects a report sent back 41 | * File - (List) - Buffer of the initial file being analyzed 42 | * Yara Rule Path - (String) - Path to the central Yara file with all includes 43 | * Export Path - (String) - Where files should be written to 44 | * Log Path - (String) - Where logs should be written to 45 | * Max Depth - (Int) - How deep we should recurse through each object tree 46 | * Debug Log - (Logger Object) - Used for writing to the debug log file 47 | * Scan Log - (Logger Object) - Used for writing to the scan log file 48 | * Timeout - (Int) - How long each module has before it is forced to exit 49 | * Alert - (Boolean) - Value that sets the alert key 50 | * Full - (Boolean) - Indicates whether the user wants all subobjects returned 51 | * Sub objects - (List) - Storage for the different subobjects returned from modules 52 | 53 | ### File Recursion ### 54 | 55 | Returned buffers are processed recursively within the framework. The convention for doing this is simply to assign the buffer you plan on returning to a dictionary key named _Buffer_. Doing so will cause the processor script to iterate through the assigned values and run modules on them as defined in the _conf/disposition.py_ file.
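For example, a minimal sketch of this convention might look like the following (the module name EXTRACT_SINGLE_PAYLOAD and its fictional four-byte header are purely hypothetical and not part of FSF):

```
import sys

def EXTRACT_SINGLE_PAYLOAD(s, buff):
    # The function shares the module's name and must return a dictionary
    EXTRACT_SINGLE_PAYLOAD = {}

    # Hypothetical decode step: strip a fictional four-byte header
    decoded = buff[4:]

    # Assigning the decoded data to the 'Buffer' key tells the processor
    # to recurse over it with the modules defined in conf/disposition.py
    EXTRACT_SINGLE_PAYLOAD['Buffer'] = decoded

    return EXTRACT_SINGLE_PAYLOAD

if __name__ == '__main__':
    # For testing, s object can be None type if unused in function
    print EXTRACT_SINGLE_PAYLOAD(None, sys.stdin.read())
```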
56 | 57 | If you need to return multiple buffers for whatever reason, you might want to consider a parent/child hierarchy, where your parent dictionary is assigned an object identifier for a key and a child dictionary with your _Buffer_ key and value pair. The modules EXTRACT_RAR and EXTRACT_ZIP are good examples of this. 58 | 59 | ### Adding a Module ### 60 | 61 | The following steps need to be followed when adding a module to the framework. 62 | 63 | * Ensure the `modules/__init__.py` file is updated to contain the new modules name. 64 | * Add logic in the `conf/dispositioner.py` file to ensure your module is run 65 | * Can choose to have your module run all the time (default modules list) 66 | * Can choose to have module run only when a Yara signature hits 67 | * If necessary, create the Yara signature that triggers it 68 | 69 | Your First Module 70 | ------------ 71 | 72 | Let's write a module that processes a new file type we're interested in. This file is defined by the 'JXB' header and we want to parse our fictional file which is defined by the following pseudo-structure. 73 | 74 | ``` 75 | struct my_test 76 | { 77 | char header[3]; 78 | BYTE xorkey; 79 | char secret[10]; 80 | } 81 | ``` 82 | 83 | Our module should be invoked when our file type is encoded, and then uses the xorkey we derive to decode the secret message. 84 | 85 | Lets use the following command to generate our test file. 86 | 87 | `echo -ne 'JXB\x51\x3e\x24\x23\x71\x37\x38\x23\x22\x25\x71\x3c\x3e\x35\x24\x3d\x34' > test_file` 88 | 89 | A Yara rule that would flag on a file like this might be as follows... 90 | 91 | ``` 92 | rule my_test 93 | { 94 | meta: 95 | author = "[your name]" 96 | lastmod = "20150729" 97 | desc = "[description of signature]" 98 | 99 | strings: 100 | $magic = "JXB" 101 | 102 | condition: 103 | $magic at 0 104 | } 105 | ``` 106 | 107 | We need to ensure the include for this signature is added to where ever our chief Yara file the server side scanner is configured to point to. 108 | 109 | Example code to test out our module would be as follows... 110 | 111 | ``` 112 | import sys 113 | 114 | def META_TEST_DECODE(s, buff): 115 | TEST = {} 116 | 117 | xor_key = ord(buff[3]) 118 | decode = [] 119 | 120 | for i in buff[4:]: 121 | decode.append(chr(ord(i) ^ xor_key)) 122 | 123 | TEST['Message'] = ''.join(decode) 124 | TEST['XOR Key'] = hex(xor_key) 125 | 126 | return TEST 127 | 128 | if __name__ == '__main__': 129 | print META_TEST_DECODE(None, sys.stdin.read()) 130 | ``` 131 | 132 | Testing this outside the framework produces our expected result. 133 | 134 | ``` 135 | cat test_file | python META_TEST_DECODE.py 136 | {'Message': 'our first module', 'XOR Key': '0x51'} 137 | ``` 138 | 139 | Now to integrate within our framework. 140 | 141 | Edit the `modules/__init__.py` file to add the module and then edit the `disposition.py` file in `conf/` to add our signature, and the module we want run, we also want to set the alert key to True. 142 | 143 | ` ('my_test', ['META_TEST_DECODE'], True),` 144 | 145 | Finally, start the scanner server daemon in fsf-server. Ensure the server configuration file is setup properly before proceeding. 146 | 147 | ` ./main.py start` 148 | 149 | Next, move over to the fsf-client and ensure the `conf/config.py` file is pointing to your server and other parameters are set. Once things are set up right, invoke the client and you should get back a JSON report inclusive of your module, congrats! 
150 | 151 | ``` 152 | ./fsf_client.py test_file 153 | { 154 | "Scan Time": "2015-07-29 12:38:18.095262", 155 | "Source": "Analyst", 156 | "Filename": "test_file", 157 | "Object": { 158 | "META_BASIC_INFO": { 159 | "SHA1": "fc9ed5d80e1d5170b2a6c17673ddbf5bd7dd579e", 160 | "MD5": "379ff2d43a6aa065c0bae65108815d20", 161 | "ssdeep": "3:WbE40B8:WbE47", 162 | "SHA256": "6755b15031b263127dfa38b0275bf1e901b6711b636905245338a2e9835f9ed2", 163 | "SHA512": "d51021afd7de53fd3546d1cd9a5aba1bacdafa1a94377ab2d10b90943a6bf708a821a20decd08311c19d5dc3a3b701a972bd5db1e1881b16b6ea1d046fdce5bb", 164 | "Size": "20 bytes" 165 | }, 166 | "SCAN_YARA": { 167 | "my_test": { 168 | "desc": "[description of signature]", 169 | "lastmod": "20150729", 170 | "author": "[your name]" 171 | } 172 | }, 173 | "META_TEST_DECODE": { 174 | "Message": "our first module", 175 | "XOR Key": "0x51" 176 | } 177 | }, 178 | "Interactive": true, 179 | "Alert": true, 180 | "Summary": { 181 | "Yara": [ 182 | "my_test" 183 | ], 184 | "Modules": [ 185 | "META_BASIC_INFO", 186 | "META_TEST_DECODE", 187 | "SCAN_YARA" 188 | ], 189 | "Observations": [], 190 | } 191 | } 192 | ``` 193 | 194 | Debugging 195 | ------------ 196 | 197 | All modules are passed both a scanner object and a buffer in the form of a list directly. It is suggested practice to begin debugging by feeding your module an example file directly and reading that from STDIN and printing the dictionary output. 198 | 199 | Once you've achieved success getting the desired output in the form of a returned dictionary, you can plug the module in to the framework by following the above instructions, and attempt to run the `fsf-client.py` script against the configured server. Your returned output should be a JSON object including your modules returned data. 200 | 201 | Areas to troubleshoot for difficulties running at this level on the server side are the `dbg.log` file and the `daemon.log` file. 202 | 203 | Automated File Extraction 204 | ------------ 205 | 206 | ### Bro ### 207 | 208 | The following Bro script was compiled and tested with Bro 2.4. After simply adding it to the ''local.bro'' file and deploying, you should be all set! This script aids in the automatic extraction of files, and the sending of those files to an FSF server. 209 | 210 | ``` 211 | # Jason Batchelor 212 | # Extract files over various protocols 213 | # 6/19/2015 214 | 215 | export 216 | { 217 | # Define the file types we are interested in extracting 218 | const ext_map: table[string] of string = { 219 | ["application/x-dosexec"] = "exe", 220 | ... ADD MIME TYPES TO EXTRACT HERE ... 221 | } &redef &default=""; 222 | } 223 | 224 | # Set extraction folder 225 | redef FileExtract::prefix = "WHERE FILES ARE WRITTEN"; 226 | 227 | event file_sniff(f: fa_file, meta: fa_metadata) 228 | { 229 | local ext = ""; 230 | 231 | if ( meta?$mime_type ) 232 | { 233 | ext = ext_map[meta$mime_type]; 234 | } 235 | 236 | if ( ext == "" ) 237 | { 238 | return; 239 | } 240 | # Hash the file for good measure 241 | Files::add_analyzer(f, Files::ANALYZER_MD5); 242 | 243 | local fname = fmt("%s-%s-%s", f$source, f$id, ext); 244 | Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename=fname, $extract_limit=FILE LIMIT]); 245 | } 246 | 247 | event file_state_remove(f: fa_file) 248 | { 249 | if ( f$info?$extracted ) 250 | { 251 | # Invoke the scanner using the pre-defined options. 
Files will be deleted off client once sent, this is a fail open operation 252 | local scan_cmd = fmt("%s %s/%s", "PATH/fsf_client.py --delete --source EVision --suppress-report --archive all-on-alert", FileExtract::prefix, f$info$extracted); 253 | system(scan_cmd); 254 | } 255 | } 256 | ``` 257 | 258 | To ensure things are going smoothly, check the client_dbg.log file to see if there are any errors being generated. Next tail the scanner log file and hopefully you will begin seeing JSON reports of all the files being written. You can aggregate these reports to your favorite indexer or SIMS! 259 | 260 | -------------------------------------------------------------------------------- /docs/Test.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/EmersonElectricCo/fsf/15303aa298414397f9aa5d19ca343040a0fe0bbd/docs/Test.zip -------------------------------------------------------------------------------- /fsf-client/conf/__init__.py: -------------------------------------------------------------------------------- 1 | __all__ = ['config'] 2 | -------------------------------------------------------------------------------- /fsf-client/conf/config.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Basic configuration attributes for scanner client. 4 | # 5 | 6 | # 'IP Address' is a list. It can contain one element, or more. 7 | # If you put multiple FSF servers in, the one your client chooses will 8 | # be done at random. A rudimentary way to distribute tasks. 9 | SERVER_CONFIG = { 'IP_ADDRESS' : ['127.0.0.1',], 10 | 'PORT' : 5800 } 11 | 12 | # Full path to debug file if run with --suppress-report 13 | CLIENT_CONFIG = { 'LOG_FILE' : '/tmp/client_dbg.log' } 14 | -------------------------------------------------------------------------------- /fsf-client/fsf_client.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # FSF Client for sending information and generating a report 4 | # 5 | # Jason Batchelor 6 | # Emerson Corporation 7 | # 02/09/2016 8 | ''' 9 | Copyright 2016 Emerson Electric Co. 10 | 11 | Licensed under the Apache License, Version 2.0 (the "License"); 12 | you may not use this file except in compliance with the License. 13 | You may obtain a copy of the License at 14 | 15 | http://www.apache.org/licenses/LICENSE-2.0 16 | 17 | Unless required by applicable law or agreed to in writing, software 18 | distributed under the License is distributed on an "AS IS" BASIS, 19 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | See the License for the specific language governing permissions and 21 | limitations under the License. 
22 | ''' 23 | 24 | import os 25 | import sys 26 | import socket 27 | import argparse 28 | import struct 29 | import json 30 | import time 31 | import hashlib 32 | import random 33 | from conf import config 34 | from datetime import datetime as dt 35 | 36 | class FSFClient: 37 | def __init__(self, fullpath, filename, delete, source, archive, suppress_report, full, file): 38 | 39 | self.fullpath = fullpath 40 | self.filename = filename 41 | self.delete = delete 42 | self.source = source 43 | self.archive = archive 44 | self.suppress_report = suppress_report 45 | self.full = full 46 | self.file = file 47 | # will hold host after verifying connection to server 48 | self.host = '' 49 | self.port = config.SERVER_CONFIG['PORT'] 50 | self.logfile = config.CLIENT_CONFIG['LOG_FILE'] 51 | self.server_list = config.SERVER_CONFIG['IP_ADDRESS'] 52 | 53 | # Test connection to randomized server and rudimentary fail over 54 | def initiate_submission(self): 55 | 56 | sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) 57 | random.shuffle(self.server_list) 58 | attempts = 0 59 | 60 | for server in self.server_list: 61 | success = 1 62 | try: 63 | sock.connect((server, self.port)) 64 | except: 65 | warning ='%s There was a problem connecting to %s on port %s. Trying another server. \n' % (dt.now(), server, self.port) 66 | self.issue_error(warning) 67 | success = 0 68 | attempts += 1 69 | if success: 70 | self.host = server 71 | self.process_files(sock) 72 | break 73 | elif attempts == len(self.server_list): 74 | e = sys.exc_info()[0] 75 | error = '%s There are not servers available to send files too. Error: %s\n' % (dt.now(), e) 76 | self.issue_error(error) 77 | 78 | 79 | # Send files to server for processing and await results 80 | def process_files(self, sock): 81 | 82 | msg = '%sFSF_RPC%sFSF_RPC%sFSF_RPC%sFSF_RPC%sFSF_RPC%s' % (self.filename, self.source, self.archive, self.suppress_report, self.full, self.file) 83 | buffer = struct.pack('>I', len(msg)) + 'FSF_RPC' + msg 84 | 85 | try: 86 | sock.sendall(buffer) 87 | except: 88 | e = sys.exc_info()[0] 89 | error = '%s There was a problem sending file %s to %s on port %s. Error: %s\n' % (dt.now(), self.filename, self.host, self.port, e) 90 | self.issue_error(error) 91 | 92 | finally: 93 | 94 | if self.delete: 95 | os.remove(self.fullpath) 96 | 97 | if not self.suppress_report: 98 | self.process_results(sock) 99 | 100 | sock.close() 101 | 102 | # Process the results sent back from the FSF server 103 | def process_results(self, sock): 104 | 105 | try: 106 | raw_msg_len = sock.recv(4) 107 | msg_len = struct.unpack('>I', raw_msg_len)[0] 108 | data = '' 109 | 110 | while len(data) < msg_len: 111 | recv_buff = sock.recv(msg_len - len(data)) 112 | data += recv_buff 113 | 114 | print data 115 | 116 | # Does the user want all sub objects? 117 | if self.full: 118 | # Generate dirname by calculating epoch time and hash of results 119 | dirname = 'fsf_dump_%s_%s' % (int(time.time()), hashlib.md5(data).hexdigest()) 120 | self.dump_subobjects(sock, dirname) 121 | 122 | except: 123 | e = sys.exc_info()[0] 124 | error = '%s There was a problem getting data for %s from %s on port %s. Error: %s' % (dt.now(), self.filename, self.host, self.port, e) 125 | self.issue_error(error) 126 | 127 | # Dump all subobjects returned by the scanner server 128 | def dump_subobjects(self, sock, dirname): 129 | 130 | sub_status = sock.recv(4) 131 | if sub_status == 'Null': 132 | print 'No subobjects were returned from scanner for %s.' 
% self.filename 133 | return 134 | 135 | os.mkdir(dirname) 136 | 137 | while self.full: 138 | raw_sub_count = sock.recv(4) 139 | sub_count = struct.unpack('>I', raw_sub_count)[0] 140 | raw_msg_len = sock.recv(4) 141 | msg_len = struct.unpack('>I', raw_msg_len)[0] 142 | data = '' 143 | 144 | while len(data) < msg_len: 145 | recv_buff = sock.recv(msg_len - len(data)) 146 | data += recv_buff 147 | 148 | fname = hashlib.md5(data).hexdigest() 149 | with open('%s/%s' % (dirname, fname), 'w') as f: 150 | f.write(data) 151 | f.close 152 | 153 | if sub_count == 0: 154 | self.full = False 155 | 156 | print 'Sub objects of %s successfully written to: %s' % (self.filename, dirname) 157 | 158 | # Either log to log file or print to stdout depending on flags used 159 | def issue_error(self, error): 160 | 161 | if self.suppress_report: 162 | with open(self.logfile, 'a') as f: 163 | f.write(error) 164 | f.close() 165 | else: 166 | print error 167 | 168 | if __name__ == '__main__': 169 | 170 | parser = argparse.ArgumentParser(prog='fsf_client', description='Uploads files to scanner server and returns the results to the user if desired. Results will always be written to a server side log file. Default options for each flag are designed to accommodate easy analyst interaction. Adjustments can be made to accommodate larger operations. Read the documentation for more details!') 171 | parser.add_argument('file', nargs='*', type=argparse.FileType('r'), help='Full path to file(s) to be processed.') 172 | parser.add_argument('--delete', default=False, action='store_true', help='Remove file from client after sending to the FSF server. Data can be archived later on server depending on selected options.') 173 | parser.add_argument('--source', nargs='?', type=str, default='Analyst', help='Specify the source of the input. Useful when scaling up to larger operations or supporting multiple input sources, such as; integrating with a sensor grid or other network defense solutions. Defaults to \'Analyst\' as submission source.') 174 | parser.add_argument('--archive', nargs='?', type=str, default='none', help='Specify the archive option to use. The most common option is \'none\' which will tell the server not to archive for this submission (default). \'file-on-alert\' will archive the file only if the alert flag is set. \'all-on-alert\' will archive the file and all sub objects if the alert flag is set. \'all-the-files\' will archive all the files sent to the scanner regardless of the alert flag. \'all-the-things\' will archive the file and all sub objects regardless of the alert flag.') 175 | parser.add_argument('--suppress-report', default=False, action='store_true', help='Don\'t return a JSON report back to the client and log client-side errors to the locally configured log directory. Choosing this will log scan results server-side only. Needed for automated scanning use cases when sending large amount of files for bulk collection. Set to false by default.') 176 | parser.add_argument('--full', default=False, action='store_true', help='Dump all sub objects of submitted file to current directory of the client. Format or directory name is \'fsf_dump_[epoch time]_[md5 hash of scan results]\'. Only supported when suppress-report option is false (default).') 177 | 178 | if len(sys.argv) == 1: 179 | parser.print_help() 180 | sys.exit(1) 181 | 182 | try: 183 | args = parser.parse_args() 184 | except IOError: 185 | e = sys.exc_info()[1] 186 | print 'The file provided could not be found. 
Error: %s' % e 187 | sys.exit(1) 188 | 189 | if len(args.file) == 0: 190 | print 'A file to scan needs to be provided!' 191 | 192 | archive_options = ['none', 'file-on-alert', 'all-on-alert', 'all-the-files', 'all-the-things'] 193 | if args.archive not in archive_options: 194 | print 'Please specify a valid archive option: \'none\', \'file-on-alert\', \'all-on-alert\', \'all-the-files\' or \'all-the-things\'.' 195 | sys.exit(1) 196 | 197 | for f in args.file: 198 | filename = os.path.basename(f.name) 199 | file = f.read() 200 | fsf = FSFClient(f.name, filename, args.delete, args.source, args.archive, args.suppress_report, args.full, file) 201 | fsf.initiate_submission() 202 | -------------------------------------------------------------------------------- /fsf-server/conf/__init__.py: -------------------------------------------------------------------------------- 1 | __all__ = ['config', 'disposition'] 2 | -------------------------------------------------------------------------------- /fsf-server/conf/config.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Basic configuration attributes for scanner. Used as default 4 | # unless the user overrides them. 5 | # 6 | 7 | import socket 8 | 9 | SCANNER_CONFIG = { 'LOG_PATH' : '/tmp', 10 | 'YARA_PATH' : '/FULL/PATH/TO/fsf/fsf-server/yara/rules.yara', 11 | 'PID_PATH' : '/tmp/scanner.pid', 12 | 'EXPORT_PATH' : '/tmp', 13 | 'TIMEOUT' : 60, 14 | 'MAX_DEPTH' : 10 } 15 | 16 | SERVER_CONFIG = { 'IP_ADDRESS' : socket.gethostname(), 17 | 'PORT' : 5800 } 18 | -------------------------------------------------------------------------------- /fsf-server/conf/disposition.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # This is the Python 'module' that contains the 4 | # disposition criteria for Yara and jq filters the scanner framework 5 | # will work on. Each member is the name of a 6 | # high fidelity detection. 7 | # 8 | # default - Modules that are always run on a returned buffer value 9 | # triggers - List of tuples that are configured to drive the flow of execution 10 | # as the file itself it scanned recursively. They consist of Yara rule names 11 | # that (if evaluated to true) may then run zero, one or more modules and optionally 12 | # set the alert flag. 13 | # post_processor - List of tuples that are configured to capture observations 14 | # concerning the JSON output. These consist of jq filters that ultimately produce 15 | # a boolean value dictating if a given condition is true. If 'true' then the 16 | # observation is captured and the alert flag is optionally set. 17 | default = ['META_BASIC_INFO', 18 | 'EXTRACT_EMBEDDED', 19 | 'SCAN_YARA'] 20 | 21 | # STRUCTURE: List of tuples such that... 22 | # Types: [('string', 'list', boolean'), ...] 23 | # Variables: [('rule name', ['module_1' , 'module_2'] , 'alert_flag'), ...] 
24 | triggers = [('ft_zip', ['EXTRACT_ZIP'], False), 25 | ('ft_exe', ['META_PE'], False), 26 | ('ft_rar', ['EXTRACT_RAR'], False), 27 | ('ft_ole_cf', ['META_OLECF', 'EXTRACT_VBA_MACRO'], False), 28 | ('ft_pdf', ['META_PDF'], False), 29 | ('misc_ooxml_core_properties', ['META_OOXML'], False), 30 | ('ft_swf', ['EXTRACT_SWF'], False), 31 | ('misc_upx_packed_binary', ['EXTRACT_UPX'], False), 32 | ('ft_rtf', ['EXTRACT_RTF_OBJ'], False), 33 | ('ft_tar', ['EXTRACT_TAR'], False), 34 | ('ft_gzip', ['EXTRACT_GZIP'], False), 35 | ('misc_pe_signature', ['META_PE_SIGNATURE'], False), 36 | ('ft_cab', ['EXTRACT_CAB'], False), 37 | ('ft_elf', ['META_ELF'], False), 38 | ('ft_java_class', ['META_JAVA_CLASS'], False), 39 | ('misc_hexascii_pe_in_html', ['EXTRACT_HEXASCII_PE'], False), 40 | ('misc_no_dosmode_header', '', False), 41 | ('ft_macho', ['META_MACHO'], False), 42 | ] 43 | 44 | # STRUCTURE: List of tuples such that... 45 | # Types: [('string', 'string', boolean'), ...] 46 | # Variables: [('jq script', 'observation' , 'alert_flag'), ...] 47 | post_processor = [('one_module.jq', 'Only one kind of module was run on for this report.', False), 48 | ('no_yara_hits.jq', 'There doesn\'t appear to be any Yara signature hits for this scan.', False), 49 | ('exe_in_zip.jq', 'An executable was found inside a ZIP file.', False), 50 | ('embedded_sfx_rar_w_exe.jq', 'An embedded file contained a self-extracting RAR that itself contained an executable payload.', False), 51 | ('many_objects.jq', 'More than 10 unique objects were observed in this file.', False), 52 | ('vt_match_found.jq', 'At least one file was found to have results in VirusTotal\'s database.', False), 53 | ('vt_match_not_found.jq', 'There were no matches found when VirusTotal was queried.', False), 54 | ('macro_gt_five_suspicious.jq', 'A macro was found with more than five suspicious traits.', False), 55 | ('vt_broadbased_detections_found.jq', 'Some AV products have detected this as a PUP threat.', False), 56 | ('vt_exploit_detections_found.jq', 'Some AV products have detected this as an exploit.', False), 57 | ('more_than_ten_yara.jq', 'More than 10 unique Yara signatures fired when processing this file!', False), 58 | ('fresh_vt_scan.jq', 'One of the VirusTotal results contains an object that was scanned less than 24 hours ago.', False), 59 | ('pe_recently_compiled.jq', 'An executable has a compile time less than a week old.', False), 60 | ] 61 | -------------------------------------------------------------------------------- /fsf-server/daemon.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # All credit for this class goes to Sander Marechal, 2009-05-31 4 | # Reference: http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/ 5 | # 6 | # 7 | import sys, os, time, atexit 8 | from signal import SIGTERM 9 | 10 | class Daemon: 11 | """ 12 | A generic daemon class. 
13 | 14 | Usage: subclass the Daemon class and override the run() method 15 | """ 16 | def __init__(self, pidfile, stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'): 17 | self.stdin = stdin 18 | self.stdout = stdout 19 | self.stderr = stderr 20 | self.pidfile = pidfile 21 | 22 | def daemonize(self): 23 | """ 24 | do the UNIX double-fork magic, see Stevens' "Advanced 25 | Programming in the UNIX Environment" for details (ISBN 0201563177) 26 | http://www.erlenstar.demon.co.uk/unix/faq_2.html#SEC16 27 | """ 28 | try: 29 | pid = os.fork() 30 | if pid > 0: 31 | # exit first parent 32 | sys.exit(0) 33 | except OSError, e: 34 | sys.stderr.write("fork #1 failed: %d (%s)\n" % (e.errno, e.strerror)) 35 | sys.exit(1) 36 | 37 | # decouple from parent environment 38 | os.chdir("/") 39 | os.setsid() 40 | os.umask(0) 41 | 42 | # do second fork 43 | try: 44 | pid = os.fork() 45 | if pid > 0: 46 | # exit from second parent 47 | sys.exit(0) 48 | except OSError, e: 49 | sys.stderr.write("fork #2 failed: %d (%s)\n" % (e.errno, e.strerror)) 50 | sys.exit(1) 51 | 52 | # redirect standard file descriptors 53 | sys.stdout.flush() 54 | sys.stderr.flush() 55 | si = file(self.stdin, 'r') 56 | so = file(self.stdout, 'a+') 57 | se = file(self.stderr, 'a+', 0) 58 | os.dup2(si.fileno(), sys.stdin.fileno()) 59 | os.dup2(so.fileno(), sys.stdout.fileno()) 60 | os.dup2(se.fileno(), sys.stderr.fileno()) 61 | 62 | # write pidfile 63 | atexit.register(self.delpid) 64 | pid = str(os.getpid()) 65 | file(self.pidfile,'w+').write("%s\n" % pid) 66 | 67 | def delpid(self): 68 | os.remove(self.pidfile) 69 | 70 | def start(self): 71 | """ 72 | Start the daemon 73 | """ 74 | # Check for a pidfile to see if the daemon already runs 75 | try: 76 | pf = file(self.pidfile,'r') 77 | pid = int(pf.read().strip()) 78 | pf.close() 79 | except IOError: 80 | pid = None 81 | 82 | if pid: 83 | message = "pidfile %s already exists. Daemon already running?\n" 84 | sys.stderr.write(message % self.pidfile) 85 | sys.exit(1) 86 | 87 | # Start the daemon 88 | self.daemonize() 89 | self.run() 90 | 91 | def stop(self): 92 | """ 93 | Stop the daemon 94 | """ 95 | # Get the pid from the pidfile 96 | try: 97 | pf = file(self.pidfile,'r') 98 | pid = int(pf.read().strip()) 99 | pf.close() 100 | except IOError: 101 | pid = None 102 | 103 | if not pid: 104 | message = "pidfile %s does not exist. Daemon not running?\n" 105 | sys.stderr.write(message % self.pidfile) 106 | return # not an error in a restart 107 | 108 | # Try killing the daemon process 109 | try: 110 | while 1: 111 | os.kill(pid, SIGTERM) 112 | time.sleep(0.1) 113 | except OSError, err: 114 | err = str(err) 115 | if err.find("No such process") > 0: 116 | if os.path.exists(self.pidfile): 117 | os.remove(self.pidfile) 118 | else: 119 | print str(err) 120 | sys.exit(1) 121 | 122 | def restart(self): 123 | """ 124 | Restart the daemon 125 | """ 126 | self.stop() 127 | self.start() 128 | 129 | def run(self): 130 | """ 131 | You should override this method when you subclass Daemon. It will be called after the process has been 132 | daemonized by start() or restart(). 133 | """ 134 | -------------------------------------------------------------------------------- /fsf-server/jq/embedded_sfx_rar_w_exe.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Check if an embedded file contained a RAR, which itself contained an EXE 4 | 5 | path(..) | join(" "?) | match("EXTRACT_EMBEDDED Object_.*? 
EXTRACT_RAR Object_.*? SCAN_YARA ft_exe") | .length > 0 6 | -------------------------------------------------------------------------------- /fsf-server/jq/exe_in_zip.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Check if a ZIP contains an EXE 4 | 5 | path(..) | join(" "?) | match("EXTRACT_ZIP Object_.*? SCAN_YARA ft_exe") | .length > 0 6 | -------------------------------------------------------------------------------- /fsf-server/jq/fresh_vt_scan.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Signature to see if any VT results contain submissions less than 24 hours old. 4 | 5 | (now - 86400) < (map(..|.META_VT_INSPECT?.scan_date|select(. != null)) | .[] | strptime("%Y-%m-%d %H:%M:%S") | mktime) 6 | -------------------------------------------------------------------------------- /fsf-server/jq/macro_gt_five_suspicious.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: More than five suspicious macro attributes 4 | 5 | map(..|.EXTRACT_VBA_MACRO?|..|.Suspicious?|select(. != null)| length > 5) | .[] 6 | -------------------------------------------------------------------------------- /fsf-server/jq/many_objects.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Check if an FSF run produced more than ten unique objects 4 | 5 | map(..|.SHA256?)| del(.[] | nulls) | unique | length >= 10 6 | -------------------------------------------------------------------------------- /fsf-server/jq/more_than_ten_yara.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Simple JQ to see if more than ten Yara signatures hit on something. 4 | 5 | .Summary.Yara | length > 10 6 | -------------------------------------------------------------------------------- /fsf-server/jq/no_yara_hits.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Simple JQ to see if no Yara signatures hit. 4 | 5 | .Summary.Yara | length == 0 6 | -------------------------------------------------------------------------------- /fsf-server/jq/one_module.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Simple JQ to see if only one module was kicked off. 4 | 5 | .Summary.Modules | length == 1 6 | -------------------------------------------------------------------------------- /fsf-server/jq/pe_recently_compiled.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Check if output contains EXE compiled in the past week. 4 | 5 | (now - 604800) < (map(..|.META_PE?.Compiled|select(. 
!= null)) | .[] | strptime("%a %b %d %H:%M:%S %Y UTC") | mktime) 6 | -------------------------------------------------------------------------------- /fsf-server/jq/vt_broadbased_detections_found.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Inspect AV output for trace elements of PUP detection names 4 | map(..|.META_VT_INSPECT?.scans|.[]?.result|select(. != null)) | join(" ") | test("Riskware|PUP|Adware|Toolbar") 5 | -------------------------------------------------------------------------------- /fsf-server/jq/vt_exploit_detections_found.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Inspect AV output for exploit names. 4 | map(..|.META_VT_INSPECT?.scans|.[]?.result|select(. != null)) | join(" ") | test("CVE|Exploit") 5 | -------------------------------------------------------------------------------- /fsf-server/jq/vt_match_found.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Check of VT query contained a match at some level. 4 | 5 | map(..|.META_VT_INSPECT?|.response_code) | del(.[] | nulls) | unique | .[] > 0 6 | -------------------------------------------------------------------------------- /fsf-server/jq/vt_match_not_found.jq: -------------------------------------------------------------------------------- 1 | # Author: Jason Batchelor 2 | # Company: Emerson 3 | # Description: Check to see of no VT matches were observed when queried 4 | map(..|.META_VT_INSPECT?|.response_code|select(type=="number")) | all (. == 0) and length > 0 5 | -------------------------------------------------------------------------------- /fsf-server/main.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Listen, accept, and disposition incomming files. 4 | # 5 | # Jason Batchelor 6 | # Emerson Corporation 7 | # 02/06/2016 8 | ''' 9 | Copyright 2016 Emerson Electric Co. 10 | 11 | Licensed under the Apache License, Version 2.0 (the "License"); 12 | you may not use this file except in compliance with the License. 13 | You may obtain a copy of the License at 14 | 15 | http://www.apache.org/licenses/LICENSE-2.0 16 | 17 | Unless required by applicable law or agreed to in writing, software 18 | distributed under the License is distributed on an "AS IS" BASIS, 19 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | See the License for the specific language governing permissions and 21 | limitations under the License. 22 | ''' 23 | 24 | import sys 25 | import struct 26 | import threading 27 | import SocketServer 28 | import json 29 | from daemon import Daemon 30 | from conf import config 31 | from datetime import datetime as dt 32 | 33 | class ScannerDaemon(Daemon): 34 | 35 | def run(self): 36 | HOST = config.SERVER_CONFIG['IP_ADDRESS'] 37 | PORT = config.SERVER_CONFIG['PORT'] 38 | 39 | try: 40 | self.fsf_server = ForkingTCPServer((HOST, PORT), ForkingTCPRequestHandler) 41 | except: 42 | print 'Could not initialize server... am I already running?' 
43 | sys.exit(2) 44 | 45 | fsf_server_thread = threading.Thread(target=self.fsf_server.serve_forever) 46 | fsf_server_thread.start() 47 | 48 | class ForkingTCPRequestHandler(SocketServer.BaseRequestHandler): 49 | 50 | def handle(self): 51 | 52 | from scanner import Scanner 53 | 54 | s = Scanner() 55 | s.check_directories() 56 | s.initialize_logger() 57 | s.check_yara_file() 58 | 59 | # Check in case client terminates connection mid xfer 60 | self.request.settimeout(10.0) 61 | 62 | try: 63 | raw_msg_len = self.request.recv(4) 64 | msg_len = struct.unpack('>I', raw_msg_len)[0] 65 | data = '' 66 | 67 | # Basic client request integrity check, not fullproof 68 | proto_check = self.request.recv(7) 69 | if proto_check != 'FSF_RPC': 70 | s.dbg_h.error('%s Client request integrity check failed. Invalid FSF protocol used.' % dt.now()) 71 | raise ValueError() 72 | 73 | while len(data) < msg_len: 74 | recv_buff = self.request.recv(msg_len - len(data)) 75 | data += recv_buff 76 | 77 | self.request.settimeout(None) 78 | self.process_data(data, s) 79 | 80 | except: 81 | e = sys.exc_info()[0] 82 | s.dbg_h.error('%s There was a problem processing the connection request from %s. Error: %s' % (dt.now(), self.request.getpeername()[0], e)) 83 | finally: 84 | self.request.close() 85 | 86 | def process_data(self, data, s): 87 | # Get data for initial report generation 88 | try: 89 | s.filename, s.source, s.archive, s.suppress_report, s.full, s.file = data.split('FSF_RPC') 90 | results = s.scan_file() 91 | 92 | if s.suppress_report == 'True': 93 | s.scan_h.info(json.dumps(results, sort_keys=False)) 94 | else: 95 | s.scan_h.info(json.dumps(results, sort_keys=False)) 96 | msg = json.dumps(results, indent=4, sort_keys=False) 97 | buffer = struct.pack('>I', len(msg)) + msg 98 | self.request.sendall(buffer) 99 | if s.full == 'True': 100 | self.process_subobjects(s) 101 | 102 | except: 103 | e = sys.exc_info()[0] 104 | s.dbg_h.error('%s There was an error generating scanner results. Error: %s' % (dt.now(), e)) 105 | 106 | def process_subobjects(self, s): 107 | # If client requests full dump of subobjects, we should have them ready here 108 | try: 109 | if len(s.sub_objects) > 0: 110 | sub_status = 'Data' 111 | self.request.sendall(sub_status) 112 | for i in xrange(len(s.sub_objects)-1, -1, -1): 113 | sub_count = struct.pack('>I', i) 114 | obj_size = struct.pack('>I', len(s.sub_objects[i])) 115 | buffer = sub_count + obj_size + s.sub_objects[i] 116 | self.request.sendall(buffer) 117 | elif len(s.sub_objects) == 0: 118 | sub_status = 'Null' 119 | self.request.sendall(sub_status) 120 | 121 | except: 122 | e = sys.exc_info()[0] 123 | s.dbg_h.error('%s There was an error dumping sub object data. 
Error: %s' % (dt.now(), e)) 124 | 125 | class ForkingTCPServer(SocketServer.ForkingMixIn, SocketServer.TCPServer): 126 | pass 127 | 128 | if __name__ == "__main__": 129 | 130 | daemon_logger = '%s/%s' % (config.SCANNER_CONFIG['LOG_PATH'], 'daemon.log') 131 | 132 | if len(sys.argv) != 2: 133 | print "usage: %s start|stop|restart" % sys.argv[0] 134 | sys.exit(2) 135 | 136 | with open(daemon_logger, 'a') as fh: 137 | fh.write('%s Daemon given %s command\n' % (dt.now(), sys.argv[1])) 138 | 139 | daemon = ScannerDaemon(config.SCANNER_CONFIG['PID_PATH'], stdin=daemon_logger, stdout=daemon_logger, stderr=daemon_logger) 140 | 141 | if 'start' == sys.argv[1]: 142 | daemon.start() 143 | elif 'stop' == sys.argv[1]: 144 | daemon.stop() 145 | elif 'restart' == sys.argv[1]: 146 | daemon.restart() 147 | else: 148 | print "Unknown command" 149 | sys.exit(2) 150 | sys.exit(0) 151 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_CAB.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Unpack CAB using cabextract as a helper. 5 | # Basically returns a stream of all the uncompressed contents, 6 | # multiple files are lumped together for displacement by other modules. 7 | # Date: 12/02/2015 8 | # Reference: http://download.microsoft.com/download/5/0/1/501ED102-E53F-4CE0-AA6B-B0F93629DDC6/Exchange/[MS-CAB].pdf 9 | ''' 10 | Copyright 2016 Emerson Electric Co. 11 | 12 | Licensed under the Apache License, Version 2.0 (the "License"); 13 | you may not use this file except in compliance with the License. 14 | You may obtain a copy of the License at 15 | 16 | http://www.apache.org/licenses/LICENSE-2.0 17 | 18 | Unless required by applicable law or agreed to in writing, software 19 | distributed under the License is distributed on an "AS IS" BASIS, 20 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 21 | See the License for the specific language governing permissions and 22 | limitations under the License. 
23 | ''' 24 | 25 | import sys 26 | import os 27 | import subprocess 28 | from datetime import datetime as dt 29 | from tempfile import mkstemp 30 | from distutils.spawn import find_executable 31 | from struct import pack, unpack 32 | from collections import OrderedDict 33 | 34 | def get_flag_enums(value): 35 | 36 | db = {} 37 | db['cfhdrPREV_CABINET'] = True if value == 0x1 else False 38 | db['cfhdrNEXT_CABINET'] = True if value == 0x2 else False 39 | db['cfhdrRESERVE_PRESENT'] = True if value == 0x4 else False 40 | return db 41 | 42 | def get_compression_type(value): 43 | 44 | if value == 0x0: return 'None' 45 | if value == 0x1: return 'MSZIP' 46 | if value == 0x2: return 'QUANTUM' 47 | if value == 0x3: return 'LZX' 48 | return 'Unknown' 49 | 50 | def last_modified(date, time): 51 | 52 | year = (date >> 9) + 1980 53 | month = (date >> 5) & 0xf 54 | day = date & 0x1f 55 | hour = time >> 11 56 | minute = (time >> 5) & 0x3f 57 | second = (time << 1) & 0x3e 58 | return dt(year, month, day, hour, minute, second).__str__() 59 | 60 | def get_attributes(attribs): 61 | 62 | attributes = [] 63 | if attribs & 0x1: attributes.append('Read-only file') 64 | if attribs & 0x2: attributes.append('Hidden file') 65 | if attribs & 0x4: attributes.append('System file') 66 | if attribs & 0x20: attributes.append('Modified since last backup') 67 | if attribs & 0x40: attributes.append('Run after extraction') 68 | if attribs & 0x80: attributes.append('Name contains UTF') 69 | return attributes 70 | 71 | # Use cabextract as a helper to get the data from various MS compression formats 72 | def collect_cab(cabname, tmpfile): 73 | 74 | cabextract_location = find_executable('cabextract') 75 | args = [cabextract_location, '-F', cabname, '-p', tmpfile] 76 | 77 | proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT) 78 | 79 | decompressed = proc.stdout.read() 80 | proc.communicate() 81 | 82 | # CAB will return 0 if successful 83 | if proc.returncode: 84 | s.dbg_h.error('%s There was a problem getting data from the cab file...' % dt.now()) 85 | 86 | return decompressed 87 | 88 | def parse_cab(buff, tmpfile): 89 | 90 | # CFHEADER structure 91 | magic = buff[0:4] 92 | reserved1 = buff[4:8] 93 | cbCabinet = unpack('= last_end: 60 | EXTRACT_FILES['Object_%s' % counter] = OrderedDict([('Start', '%s bytes' % start), 61 | ('End', '%s bytes' % end), 62 | ('Description', parser.description), 63 | ('Buffer', buff[start:end])]) 64 | counter += 1 65 | last_start = start 66 | last_end = end 67 | 68 | subfile.current_offset += subfile.slice_size 69 | if subfile.next_offset: 70 | subfile.current_offset = max(subfile.current_offset, subfile.next_offset) 71 | subfile.current_offset = min(subfile.current_offset, subfile.size) 72 | 73 | return EXTRACT_FILES 74 | 75 | if __name__ == '__main__': 76 | # For testing, s object can be None type if unused in function 77 | print EXTRACT_EMBEDDED(None, sys.stdin.read()) 78 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_GZIP.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Extract and process file contained within a GZIP archive 5 | # Date: 11/16/2015 6 | ''' 7 | Copyright 2015 Emerson Electric Co. 8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 
11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 20 | ''' 21 | 22 | import sys 23 | import gzip 24 | from StringIO import StringIO 25 | 26 | def EXTRACT_GZIP(s, buff): 27 | EXTRACT_GZIP = {} 28 | 29 | # Only one file within a GZIP, you'll never have multiple ones 30 | # For that, you'll likely see something like GZ+TAR 31 | gzf = gzip.GzipFile(fileobj=StringIO(buff), mode='rb') 32 | EXTRACT_GZIP['Buffer'] = gzf.read() 33 | 34 | return EXTRACT_GZIP 35 | 36 | if __name__ == '__main__': 37 | print EXTRACT_GZIP(None, sys.stdin.read()) 38 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_HEXASCII_PE.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Binary convert on hexascii printables contained within a stream 5 | # believed to represent an executable file. 6 | # Date: 03/07/2016 7 | ''' 8 | Copyright 2016 Carnegie Mellon University 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 21 | ''' 22 | 23 | import sys 24 | import re 25 | import binascii 26 | 27 | def EXTRACT_HEXASCII_PE(s, buff): 28 | # Function must return a dictionary 29 | SUB_OBJ = {} 30 | counter = 0 31 | 32 | for m in re.finditer(r"4[dD]5[aA][0-9A-Fa-f]+", buff): 33 | SUB_OBJ.update( { 'Object_%s' % counter : { 'Buffer' : binascii.unhexlify(m.group(0)) } } ) 34 | counter += 1 35 | 36 | return SUB_OBJ 37 | 38 | if __name__ == '__main__': 39 | # For testing, s object can be None type if unused in function 40 | print EXTRACT_HEXASCII_PE(None, sys.stdin.read()) 41 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_RAR.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Extract RAR files and get metadata 6 | # Date: 05/19/2015 7 | ''' 8 | Copyright 2015 Emerson Electric Co. 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 
21 | ''' 22 | 23 | import sys 24 | import shutil 25 | import rarfile 26 | import os 27 | from tempfile import mkstemp 28 | from datetime import datetime 29 | from collections import OrderedDict 30 | 31 | def get_compression_method(field_value): 32 | 33 | map_val = 'Unknown' 34 | 35 | # Enum list from: http://forensicswiki.org/wiki/RAR 36 | if field_value == 0x30: map_val = 'Storing' 37 | elif field_value == 0x31: map_val = 'Fastest Compression' 38 | elif field_value == 0x32: map_val = 'Fast Compression' 39 | elif field_value == 0x33: map_val = 'Normal Compression' 40 | elif field_value == 0x34: map_val = 'Good Compression' 41 | elif field_value == 0x35: map_val = 'Best Compression' 42 | 43 | return map_val 44 | 45 | def get_system_mapping(field_value): 46 | 47 | map_val = 'Unknown' 48 | 49 | # Enum list from: http://forensicswiki.org/wiki/RAR 50 | if field_value == 0: map_val = 'MS-DOS' 51 | elif field_value == 1: map_val = 'OS/2' 52 | elif field_value == 2: map_val = 'Windows' 53 | elif field_value == 3: map_val = 'UNIX' 54 | elif field_value == 4: map_val = 'Macintosh' 55 | elif field_value == 5: map_val = 'BeOS' 56 | 57 | return map_val 58 | 59 | def get_rar_info(tmpfile, PARENT_BIN): 60 | 61 | file_num = 0 62 | password_required = False 63 | 64 | rf = rarfile.RarFile(tmpfile) 65 | 66 | if rf.needs_password(): 67 | password_required = True 68 | 69 | for r in rf.infolist(): 70 | CHILD_BIN = OrderedDict([('Filename', r.filename), 71 | ('Last Modified', datetime(*r.date_time).strftime("%Y-%m-%d %H:%M:%S")), 72 | ('Comment', r.comment), 73 | ('CRC', hex(r.CRC)), 74 | ('Compressed Size', '%s bytes' % r.compress_size), 75 | ('Uncompressed Size', '%s bytes' % r.file_size), 76 | ('Compress Type', get_compression_method(r.compress_type)), 77 | ('Create System', get_system_mapping(r.host_os)), 78 | ('Password Required', password_required)]) 79 | 80 | if not password_required and r.file_size != 0: 81 | CHILD_BIN['Buffer'] = rf.read(r) 82 | 83 | PARENT_BIN['Object_%s' % file_num] = CHILD_BIN 84 | file_num += 1 85 | 86 | rf.close() 87 | 88 | return PARENT_BIN 89 | 90 | def EXTRACT_RAR(s, buff): 91 | 92 | EXTRACT_RAR = { } 93 | tmpfd, tmpfile = mkstemp(suffix='.rar') 94 | tmpf = os.fdopen(tmpfd, 'wb') 95 | 96 | try: 97 | tmpf.write(buff) 98 | tmpf.close() 99 | EXTRACT_RAR = get_rar_info(tmpfile, EXTRACT_RAR) 100 | finally: 101 | os.remove(tmpfile) 102 | 103 | return EXTRACT_RAR 104 | 105 | if __name__ == '__main__': 106 | # For testing, s object can be None type if unused in function 107 | print EXTRACT_RAR(None, sys.stdin.read()) 108 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_RTF_OBJ.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Extract RTF objects 5 | # Date: 11/12/2015 6 | ''' 7 | Copyright 2015 Emerson Electric Co. 8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 
20 | ''' 21 | 22 | import os 23 | import sys 24 | import oletools.rtfobj as rtfobj 25 | from tempfile import mkstemp 26 | 27 | def EXTRACT_RTF_OBJ(s, buff): 28 | 29 | PARENT_RTF_OBJS = {} 30 | counter = 0 31 | tmpfd, tmpfile = mkstemp(suffix='.rtf') 32 | tmpf = os.fdopen(tmpfd, 'wb') 33 | 34 | try: 35 | tmpf.write(buff) 36 | tmpf.close() 37 | objs = rtfobj.rtf_iter_objects(tmpfile) 38 | 39 | for index, orig_len, data in objs: 40 | CHILD_OBJ = {'Index' : index, 41 | 'Buffer' : data } 42 | PARENT_RTF_OBJS['Object_%s' % counter] = CHILD_OBJ 43 | counter += 1 44 | 45 | finally: 46 | os.remove(tmpfile) 47 | 48 | return PARENT_RTF_OBJS 49 | 50 | if __name__ == '__main__': 51 | print EXTRACT_RTF_OBJ(None, sys.stdin.read()) 52 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_SWF.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Process compressed SWF files 5 | # Date: 07/29/2015 6 | ''' 7 | Copyright 2015 Emerson Electric Co. 8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 20 | ''' 21 | 22 | import sys 23 | import zlib 24 | import pylzma 25 | 26 | def EXTRACT_SWF(s, buff): 27 | 28 | SWF = {} 29 | 30 | magic = buff[:3] 31 | data = '' 32 | 33 | if magic == 'CWS': 34 | SWF['Buffer'] = 'FWS' + buff[3:8] + zlib.decompress(buff[8:]) 35 | elif magic == 'ZWS': 36 | SWF['Buffer'] = 'FWS' + buff[3:8] + pylzma.decompress(buff[12:]) 37 | elif magic == 'FWS': 38 | SWF['Version'] = ord(buff[3]) 39 | 40 | return SWF 41 | 42 | if __name__ == '__main__': 43 | # For testing, s object can be None type if unused in function 44 | print EXTRACT_SWF(None, sys.stdin.read()) 45 | 46 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_TAR.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Extract files from TAR archive file 5 | # Date: 11/16/2015 6 | ''' 7 | Copyright 2015 Emerson Electric Co. 8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 
20 | ''' 21 | 22 | import sys 23 | import tarfile 24 | from datetime import datetime 25 | from StringIO import StringIO 26 | from collections import OrderedDict 27 | 28 | # For security reasons, we will only allow our module to process a max of twenty identified files 29 | MAX_FILES = 20 30 | 31 | def get_tar_type(ti): 32 | 33 | type = 'Unknown' 34 | 35 | if ti.isfile(): type = 'File' 36 | elif ti.isdir(): type = 'Directory' 37 | elif ti.issym(): type = 'Sym Link' 38 | elif ti.islnk(): type = 'Hard Link' 39 | elif ischr(): type = 'Character device' 40 | elif isblk(): type = 'Block device' 41 | elif isfifo(): type = 'FIFO' 42 | 43 | return type 44 | 45 | def EXTRACT_TAR(s, buff): 46 | 47 | EXTRACT_TAR = {} 48 | file_num = 0 49 | 50 | tarf = tarfile.TarFile(fileobj=StringIO(buff), mode='r') 51 | 52 | for ti in tarf: 53 | 54 | if file_num >= MAX_FILES: 55 | tarf.close() 56 | EXTRACT_TAR['Object_%s' % file_num] = { 'Error' : 'Max number of archived files reached' } 57 | return EXTRACT_TAR 58 | 59 | CHILD_TAR = OrderedDict([('Name', ti.name), 60 | ('Last modified', datetime.fromtimestamp(ti.mtime).strftime("%Y-%m-%d %H:%M:%S")), 61 | ('Type', get_tar_type(ti)), 62 | ('UID', ti.uid ), 63 | ('GID', ti.gid ), 64 | ('Username', ti.uname), 65 | ('Groupname', ti.gname)]) 66 | 67 | if ti.isfile(): 68 | 69 | try: 70 | f = tarf.extractfile(ti) 71 | CHILD_TAR['Buffer'] = f.read() 72 | f.close() 73 | except: 74 | CHILD_TAR['Buffer'] = 'Failed to extract this specific archive. Invalid or corrupt?' 75 | 76 | EXTRACT_TAR['Object_%s' % file_num] = CHILD_TAR 77 | 78 | file_num += 1 79 | 80 | tarf.close() 81 | 82 | return EXTRACT_TAR 83 | 84 | if __name__ == '__main__': 85 | print EXTRACT_TAR(None, sys.stdin.read()) 86 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_UPX.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Unpack UPX packed binaries 5 | # Date: 8/28/2015 6 | ''' 7 | Copyright 2015 Emerson Electric Co. 8 | Licensed under the Apache License, Version 2.0 (the "License"); 9 | you may not use this file except in compliance with the License. 10 | You may obtain a copy of the License at 11 | http://www.apache.org/licenses/LICENSE-2.0 12 | Unless required by applicable law or agreed to in writing, software 13 | distributed under the License is distributed on an "AS IS" BASIS, 14 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | See the License for the specific language governing permissions and 16 | limitations under the License. 17 | ''' 18 | 19 | import sys 20 | import os 21 | import subprocess 22 | from datetime import datetime as dt 23 | from tempfile import mkstemp 24 | from distutils.spawn import find_executable 25 | 26 | def EXTRACT_UPX(s, buff): 27 | 28 | UNPACKED = {} 29 | tmpfd, tmpfile = mkstemp(suffix='.upx') 30 | tmpf = os.fdopen(tmpfd, 'wb') 31 | 32 | upx_location = find_executable("upx") 33 | outfile = tmpfile + ".out" 34 | args = [upx_location, "-q", "-d", tmpfile, "-o", outfile] 35 | 36 | try: 37 | tmpf.write(buff) 38 | tmpf.close() 39 | 40 | proc = subprocess.Popen(args, stdout=subprocess.PIPE, 41 | stderr=subprocess.STDOUT) 42 | 43 | proc.communicate() 44 | # UPX will return 0 if successful 45 | if not proc.returncode: 46 | f = open(outfile, 'rb') 47 | UNPACKED['Buffer'] = f.read() 48 | f.close() 49 | else: 50 | s.dbg_h.error('%s There was a problem unpacking the file...' 
% dt.now()) 51 | raise ValueError() 52 | finally: 53 | os.remove(tmpfile) 54 | os.remove(outfile) 55 | 56 | return UNPACKED 57 | 58 | if __name__ == '__main__': 59 | # For testing, s object can be None type if unused in function 60 | print EXTRACT_UPX(None, sys.stdin.read()) 61 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_VBA_MACRO.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Extract metadata for Office documents 5 | # Date: 02/01/2016 6 | ''' 7 | Copyright 2016 Emerson Electric Co. 8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 20 | ''' 21 | import sys 22 | import itertools 23 | import operator 24 | import logging 25 | from collections import OrderedDict 26 | from oletools.olevba import VBA_Parser, VBA_Scanner 27 | 28 | def scan_macro(vba_code): 29 | 30 | SCAN = {} 31 | 32 | vba_scanner = VBA_Scanner(vba_code) 33 | results = vba_scanner.scan(include_decoded_strings=False) 34 | for key, subiter in itertools.groupby(results, operator.itemgetter(0)): 35 | groups = [] 36 | groups.extend(['%s: %s' % (desc[1], desc[2]) for desc in subiter]) 37 | SCAN['%s' % key] = groups 38 | 39 | if not SCAN: 40 | return 'No results from scan' 41 | 42 | return SCAN 43 | 44 | def EXTRACT_VBA_MACRO(s, buff): 45 | 46 | EXTRACT_MACRO = {} 47 | counter = 0 48 | 49 | ### TODO: REMOVE THIS WORKAROUND ONCE MODULE AUTHOR FIXES CODE ### 50 | ### Reference: http://stackoverflow.com/questions/32261679/strange-issue-using-logging-module-in-python/32264445#32264445 51 | ### Reference: https://bitbucket.org/decalage/oletools/issues/26/use-of-logger 52 | ### /dev/null used instead of NullHandler for 2.6 compatibility 53 | logging.getLogger('workaround').root.addHandler(logging.FileHandler('/dev/null')) 54 | ### 55 | 56 | vba = VBA_Parser('None', data=buff) 57 | 58 | if not vba.detect_vba_macros(): 59 | return EXTRACT_MACRO 60 | 61 | for (filename, stream_path, vba_filename, vba_code) in vba.extract_macros(): 62 | 63 | CHILD_MACRO = OrderedDict([('OLE Stream', stream_path), 64 | ('VBA Filename', vba_filename.decode('ascii', 'ignore')), 65 | ('Scan', scan_macro(vba_code)), 66 | ('Buffer', vba_code)]) 67 | 68 | EXTRACT_MACRO['Object_%s' % counter] = CHILD_MACRO 69 | counter += 1 70 | 71 | return EXTRACT_MACRO 72 | 73 | if __name__ == '__main__': 74 | print EXTRACT_VBA_MACRO(None, sys.stdin.read()) 75 | -------------------------------------------------------------------------------- /fsf-server/modules/EXTRACT_ZIP.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Simple module to run on zip files and return metadata as a dictionary. 6 | # Date: 12/16/2014 7 | ''' 8 | Copyright 2015 Emerson Electric Co. 
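One note on get_tar_type in EXTRACT_TAR above: the last three branches need the ti receiver (ti.ischr(), ti.isblk(), ti.isfifo()); as written they would raise a NameError if a character, block, or FIFO member were ever encountered. A corrected sketch of the helper, exercised against a small in-memory archive:

import tarfile
from StringIO import StringIO

def get_tar_type(ti):
    tar_type = 'Unknown'
    if ti.isfile(): tar_type = 'File'
    elif ti.isdir(): tar_type = 'Directory'
    elif ti.issym(): tar_type = 'Sym Link'
    elif ti.islnk(): tar_type = 'Hard Link'
    elif ti.ischr(): tar_type = 'Character device'
    elif ti.isblk(): tar_type = 'Block device'
    elif ti.isfifo(): tar_type = 'FIFO'
    return tar_type

# Build a tiny archive in memory to exercise the helper
mem = StringIO()
tout = tarfile.TarFile(fileobj=mem, mode='w')
info = tarfile.TarInfo('payload.txt')
info.size = len('hello fsf')
tout.addfile(info, StringIO('hello fsf'))
tout.close()

tarf = tarfile.TarFile(fileobj=StringIO(mem.getvalue()), mode='r')
for ti in tarf:
    print ti.name, '->', get_tar_type(ti)   # payload.txt -> File
tarf.close()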
9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 21 | ''' 22 | 23 | import sys 24 | import czipfile 25 | from datetime import datetime 26 | from StringIO import StringIO 27 | from collections import OrderedDict 28 | 29 | # For security reasons, we will only allow our module to process a max of twenty identified files 30 | MAX_FILES = 20 31 | 32 | def get_system_mapping(field_value): 33 | 34 | map_val = 'Unknown' 35 | 36 | # Enum list from: http://www.pkware.com/documents/casestudies/APPNOTE.TXT 37 | if field_value == 0: map_val = 'MS-DOS' 38 | elif field_value == 1: map_val = 'Amiga' 39 | elif field_value == 2: map_val = 'OpenVMS' 40 | elif field_value == 3: map_val = 'UNIX' 41 | elif field_value == 4: map_val = 'VM/CMS' 42 | elif field_value == 5: map_val = 'Atari ST' 43 | elif field_value == 6: map_val = 'OS/2 H.P.F.S.' 44 | elif field_value == 7: map_val = 'Macintosh' 45 | elif field_value == 8: map_val = 'Z-System' 46 | elif field_value == 9: map_val = 'CP/M' 47 | elif field_value == 10: map_val = 'Windows NTFS' 48 | elif field_value == 11: map_val = 'MVS (OS/390 - Z/OS)' 49 | elif field_value == 12: map_val = 'VSE' 50 | elif field_value == 13: map_val = 'Acorn Risc' 51 | elif field_value == 14: map_val = 'VFAT' 52 | elif field_value == 15: map_val = 'Alternate MVS' 53 | elif field_value == 16: map_val = 'BeOS' 54 | elif field_value == 17: map_val = 'Tandem' 55 | elif field_value == 18: map_val = 'OS/400' 56 | elif field_value == 19: map_val = 'OS X (Darwin)' 57 | 58 | return map_val 59 | 60 | def get_compression_method(field_value): 61 | 62 | map_val = 'Unknown' 63 | 64 | # Enum list from: http://www.pkware.com/documents/casestudies/APPNOTE.TXT 65 | if field_value == 0: map_val = 'The file is stored (no compression)' 66 | elif field_value == 1: map_val = 'The file is Shrunk' 67 | elif field_value == 2: map_val = 'The file is Reduced with compression factor 1' 68 | elif field_value == 3: map_val = 'The file is Reduced with compression factor 2' 69 | elif field_value == 4: map_val = 'The file is Reduced with compression factor 3' 70 | elif field_value == 5: map_val = 'The file is Reduced with compression factor 4' 71 | elif field_value == 6: map_val = 'The file is Imploded' 72 | elif field_value == 7: map_val = 'Tokenizing compression algorithm' 73 | elif field_value == 8: map_val = 'Standard compression algorithm' 74 | elif field_value == 9: map_val = 'Enhanced Deflating using Deflate64(tm)' 75 | elif field_value == 10: map_val = 'PKWARE Data Compression Library Imploding (old IBM TERSE)' 76 | elif field_value == 12: map_val = 'File is compressed using BZIP2 algorithm' 77 | elif field_value == 14: map_val = 'LZMA (EFS)' 78 | elif field_value == 18: map_val = 'File is compressed using IBM TERSE (new)' 79 | elif field_value == 19: map_val = 'IBM LZ77 z Architecture (PFS)' 80 | elif field_value == 97: map_val = 'WavPack compressed data' 81 | elif field_value == 98: map_val = 'PPMd version I, Rev 1' 82 | 83 | return map_val 84 | 85 | def EXTRACT_ZIP(s, buff): 86 | 87 | 
EXTRACT_ZIP = { } 88 | file_num = 0 89 | password_required = False 90 | 91 | zf = czipfile.ZipFile(StringIO(buff)) 92 | 93 | for z in zf.namelist(): 94 | 95 | if file_num >= MAX_FILES: 96 | zf.close() 97 | EXTRACT_ZIP['Object_%s' % file_num] = { 'Error' : 'Max number of compressed files reached' } 98 | return EXTRACT_ZIP 99 | 100 | zi_child = zf.getinfo(z) 101 | 102 | # Test if content is encrypted 103 | if zi_child.flag_bits & 0x1: 104 | password_required = True 105 | 106 | CHILD_ZIP = OrderedDict([('Name', zi_child.filename), 107 | ('Last modified', datetime(*zi_child.date_time).strftime("%Y-%m-%d %H:%M:%S")), 108 | ('Comment', zi_child.comment), 109 | ('CRC', hex(zi_child.CRC)), 110 | ('Compressed Size', '%s bytes' % zi_child.compress_size), 111 | ('Uncompressed Size', '%s bytes' % zi_child.file_size), 112 | ('Compress Type', get_compression_method(zi_child.compress_type)), 113 | ('Create System', get_system_mapping(zi_child.create_system)), 114 | ('Password Required', password_required)]) 115 | 116 | if not password_required and zi_child.file_size != 0: 117 | 118 | try: 119 | f = zf.open(z, 'r') 120 | CHILD_ZIP['Buffer'] = f.read() 121 | f.close() 122 | except: 123 | CHILD_ZIP['Buffer'] = 'Failed to extract this specific archive. Invalid or corrupt?' 124 | 125 | EXTRACT_ZIP['Object_%s' % file_num] = CHILD_ZIP 126 | 127 | file_num += 1 128 | 129 | zf.close() 130 | 131 | return EXTRACT_ZIP 132 | 133 | if __name__ == '__main__': 134 | # For testing, s object can be None type if unused in function 135 | print EXTRACT_ZIP(None, sys.stdin.read()) 136 | 137 | -------------------------------------------------------------------------------- /fsf-server/modules/META_BASIC_INFO.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Module that is applied to all files being scanned. Generate core metadata. 6 | # Date: 12/10/2015 7 | ''' 8 | Copyright 2016 Emerson Electric Co. 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 
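The interesting details in EXTRACT_ZIP above are the per-member checks: bit 0 of flag_bits marks an encrypted entry, and date_time is a six-element tuple that feeds straight into datetime. czipfile mirrors the standard zipfile interface for the calls the module makes (ZipFile, namelist, getinfo, open), so a quick look with the standard library behaves the same way; a minimal sketch against an archive built in memory:

import zipfile
from StringIO import StringIO
from datetime import datetime

# Build a small in-memory archive to inspect
mem = StringIO()
zout = zipfile.ZipFile(mem, 'w', zipfile.ZIP_DEFLATED)
zout.writestr('notes/readme.txt', 'hello fsf')
zout.close()

zf = zipfile.ZipFile(StringIO(mem.getvalue()))
for name in zf.namelist():
    zi = zf.getinfo(name)
    print name
    print '  Password Required:', bool(zi.flag_bits & 0x1)
    print '  Last modified:', datetime(*zi.date_time).strftime('%Y-%m-%d %H:%M:%S')
    print '  CRC:', hex(zi.CRC)
    print '  %s bytes compressed, %s bytes uncompressed' % (zi.compress_size, zi.file_size)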
21 | ''' 22 | 23 | import sys 24 | import hashlib 25 | import ssdeep 26 | from collections import OrderedDict 27 | 28 | def META_BASIC_INFO(s, buff): 29 | 30 | BASIC_INFO = OrderedDict([('MD5', hashlib.md5(buff).hexdigest()), 31 | ('SHA1', hashlib.sha1(buff).hexdigest()), 32 | ('SHA256', hashlib.sha256(buff).hexdigest()), 33 | ('SHA512', hashlib.sha512(buff).hexdigest()), 34 | ('ssdeep' , ssdeep.hash(buff)), 35 | ('Size', '%s bytes' % len(buff))]) 36 | 37 | return BASIC_INFO 38 | 39 | if __name__ == '__main__': 40 | # For testing, s object can be None type if unused in function 41 | print META_BASIC_INFO(None, sys.stdin.read()) 42 | -------------------------------------------------------------------------------- /fsf-server/modules/META_ELF.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Extract metadata associated with ELF payloads 6 | # Date: 01/26/2016 7 | ''' 8 | Copyright 2016 Emerson Electric Co. 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 21 | ''' 22 | import sys 23 | from StringIO import StringIO 24 | from elftools.elf.elffile import ELFFile 25 | from elftools.elf.sections import SymbolTableSection 26 | 27 | import pprint 28 | 29 | def get_die_entries(elffile): 30 | die_entries = [] 31 | 32 | # Get name of Debug Info Entries (DIE) 33 | if elffile.has_dwarf_info(): 34 | dwarfinfo = elffile.get_dwarf_info() 35 | 36 | for cu in dwarfinfo.iter_CUs(): 37 | die_entries.append(cu.get_top_DIE().get_full_path()) 38 | 39 | return die_entries 40 | 41 | def get_section_names(elffile): 42 | section_names = [] 43 | symbol_names = [] 44 | 45 | for section in elffile.iter_sections(): 46 | 47 | # Get names of all sections in ELF file 48 | if len(section.name) > 0: 49 | section_names.append(section.name) 50 | 51 | # If symbol tables exist for the section, take inventory 52 | if isinstance(section, SymbolTableSection): 53 | for i in range(0, section.num_symbols()): 54 | if len(section.get_symbol(i).name) > 0: 55 | symbol_names.append(section.get_symbol(i).name) 56 | 57 | return section_names, symbol_names 58 | 59 | def META_ELF(s, buff): 60 | elffile = ELFFile(StringIO(buff)) 61 | 62 | META_ELF = { 'Arch' : elffile.get_machine_arch(), 63 | 'Debug Entries' : get_die_entries(elffile) } 64 | 65 | META_ELF['Section Names'], META_ELF['Symbol Names'] = get_section_names(elffile) 66 | 67 | return META_ELF 68 | 69 | if __name__ == '__main__': 70 | pprint.pprint(META_ELF(None, sys.stdin.read())) 71 | -------------------------------------------------------------------------------- /fsf-server/modules/META_JAVA_CLASS.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Get metadata concerning Java class files 5 | # Date: 02/07/2017 (updated) 6 | ''' 7 | Copyright 2017 Emerson Electric Co. 
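META_ELF above hands pyelftools a StringIO wrapper around the scanned buffer; the same walk works against any ELF opened from disk. A minimal sketch (the path is only an example, any ELF binary will do):

from elftools.elf.elffile import ELFFile
from elftools.elf.sections import SymbolTableSection

with open('/bin/ls', 'rb') as f:
    elffile = ELFFile(f)
    print 'Arch:', elffile.get_machine_arch()
    print 'Has DWARF info:', elffile.has_dwarf_info()
    for section in elffile.iter_sections():
        if isinstance(section, SymbolTableSection):
            print '%s holds %d symbols' % (section.name, section.num_symbols())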
8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 20 | ''' 21 | import sys 22 | from javatools import classinfo 23 | from javatools import unpack_class 24 | 25 | # Stub class to interface with classinfo 26 | class classinfo_options: 27 | 28 | def __init__(self): 29 | self.class_provides = True 30 | self.class_requires = True 31 | self.constpool = True 32 | self.api_ignore = '' 33 | self.show = 'SHOW_PRIVATE' 34 | 35 | # Needs to be the filename of your module 36 | def META_JAVA_CLASS(s, buff): 37 | # Function must return a dictionary 38 | META_DICT = {} 39 | 40 | options = classinfo_options() 41 | info = unpack_class(buff) 42 | META_DICT = classinfo.cli_simplify_classinfo(options, info) 43 | _constants_pool = [] 44 | for x in META_DICT['constants_pool']: 45 | _constants_pool.append({"index": x[0], "type": x[1], "value": str(x[2])}) 46 | META_DICT["constants_pool"] = _constants_pool 47 | return META_DICT 48 | 49 | if __name__ == '__main__': 50 | # For testing, s object can be None type if unused in function 51 | print META_JAVA_CLASS(None, sys.stdin.read()) 52 | -------------------------------------------------------------------------------- /fsf-server/modules/META_MACHO.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jamie Ford 4 | # Description: Parses mach-o files using the Macholibre library by Aaron Stevens 5 | # Returns various metadata about the file 6 | # Date: 09/08/2016 7 | # Updated: 3/16/17 8 | ''' 9 | Copyright 2016 BroEZ 10 | 11 | Licensed under the Apache License, Version 2.0 (the "License"); 12 | you may not use this file except in compliance with the License. 13 | You may obtain a copy of the License at 14 | 15 | http://www.apache.org/licenses/LICENSE-2.0 16 | 17 | Unless required by applicable law or agreed to in writing, software 18 | distributed under the License is distributed on an "AS IS" BASIS, 19 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | See the License for the specific language governing permissions and 21 | limitations under the License. 
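The list comprehension at the end of META_JAVA_CLASS above exists to keep the report JSON-friendly: the constant pool comes back as (index, type, value) tuples with mixed value types, and strict JSON consumers downstream cope better with a list of small string-valued documents. A sketch with made-up pool entries (the type names and values are illustrative stand-ins, not real classinfo output):

import json

# Illustrative stand-ins for constant pool entries: (index, type, value)
constants_pool = [(1, 'CONSTANT_Class', 10),
                  (2, 'CONSTANT_Utf8', 'java/lang/Object')]

flattened = [{'index': c[0], 'type': c[1], 'value': str(c[2])} for c in constants_pool]
print json.dumps(flattened, indent=2)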
22 | ''' 23 | import sys 24 | import macholibre 25 | 26 | import os 27 | from tempfile import mkstemp 28 | 29 | def META_MACHO(s, buff): 30 | tmpfd, tmpfile = mkstemp() 31 | tmpf = os.fdopen(tmpfd, 'wb') 32 | 33 | try: 34 | #Writing the buffer to a temp file 35 | tmpf.write(buff) 36 | tmpf.close() 37 | dictionary = macholibre.parse(tmpfile) 38 | finally: 39 | #Remove it to save space 40 | os.remove(tmpfile) 41 | 42 | if dictionary.has_key('name'): 43 | #The name doesn't make sense with the temp file 44 | dictionary.pop('name') 45 | 46 | # META_BASIC_INFO already hs this informaton 47 | if dictionary.has_key('hashes'): 48 | dictionary.pop('hashes') 49 | if dictionary.has_key('size'): 50 | dictionary.pop('size') 51 | dictionary['architecutures'] = [] 52 | 53 | 54 | #Macholibre either has macho or universal 55 | if dictionary.has_key('macho'): 56 | popMachoKeys(dictionary['macho']) 57 | dictionary['Universal'] = False 58 | macho = dictionary.pop('macho') 59 | #I need it twice so there's no point in searching 60 | 61 | # Makes the key Macho + the cputype from the macho dictionary, if it has that key. Also replaces spaces with '_' 62 | machoKey = "macho_" + macho['cputype'].replace(' ', '_') if macho.has_key('cputype') else 'macho' 63 | dictionary['machos'] = [{machoKey: macho}] 64 | dictionary['architecutures'].append(macho['subtype'] if macho.has_key('subtype') else '') 65 | 66 | del macho, machoKey 67 | 68 | elif dictionary.has_key('universal'): 69 | #Universal has embedded machos 70 | if dictionary['universal'].has_key('machos'): 71 | dictionary['Universal'] = True 72 | dictionary['machos'] = [] 73 | 74 | for index, macho in enumerate(dictionary['universal']['machos']): 75 | popMachoKeys(macho) 76 | hasCPU = macho.has_key('cputype') 77 | # Does the same thing but make sure not to overwrite the indexes if neither has 'cputype' as a key 78 | machoKey = "macho_" + macho['cputype'].replace(' ', '_') if hasCPU else 'macho_' + str(index) 79 | if macho.has_key('subtype'): 80 | dictionary['architecutures'].append(macho['subtype']) 81 | dictionary['machos'].append({machoKey: macho}) 82 | dictionary.pop('universal') 83 | 84 | 85 | 86 | return dictionary 87 | 88 | def popMachoKeys(macho): 89 | #Keys to keep to prevent too much printout (These can be added and removed by just adding them to the list) 90 | keepKeys = ['filetype', 'signature', 'flags', 'offset', 'cputype', 'minos', 'dylibs', 'subtype'] 91 | for key in macho.keys(): 92 | if (key not in keepKeys): 93 | macho.pop(key) 94 | 95 | 96 | if __name__ == '__main__': 97 | print(META_MACHO(None, sys.stdin.read())) 98 | -------------------------------------------------------------------------------- /fsf-server/modules/META_OLECF.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Return dictionary of metadata attributes from an OLE CF file 6 | # Date: 07/29/2015 7 | ''' 8 | Copyright 2015 Emerson Electric Co. 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
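Two habits in META_MACHO above are worth calling out: popMachoKeys trims each parsed Mach-O down to a short allow-list of keys, and every per-architecture sub-document is re-keyed as 'macho_' plus its CPU type so universal binaries do not collide. A sketch with made-up macholibre output (the field values here are invented for illustration):

# Made-up stand-in for one 'macho' sub-document from macholibre.parse()
macho = {'cputype': 'X86 64', 'subtype': 'ALL', 'filetype': 'EXECUTE',
         'entropy': 7.2, 'imports': ['_dlopen']}

keep_keys = ['filetype', 'signature', 'flags', 'offset', 'cputype', 'minos', 'dylibs', 'subtype']
for key in macho.keys():
    if key not in keep_keys:
        macho.pop(key)

macho_key = 'macho_' + macho['cputype'].replace(' ', '_') if 'cputype' in macho else 'macho'
print macho_key   # macho_X86_64
print macho       # only the allow-listed keys survive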
19 | See the License for the specific language governing permissions and 20 | limitations under the License. 21 | ''' 22 | 23 | import sys 24 | from StringIO import StringIO 25 | from hachoir_metadata import extractMetadata 26 | from hachoir_parser import guessParser 27 | from hachoir_core.stream import InputIOStream 28 | 29 | def META_OLECF(s, buff): 30 | 31 | META_DICT = { } 32 | 33 | try: 34 | stream = InputIOStream(StringIO(buff)) 35 | parser = guessParser(stream) 36 | meta = extractMetadata(parser) 37 | except: 38 | return META_DICT 39 | 40 | for data in sorted(meta): 41 | if data.values: 42 | if len(data.values) == 1: 43 | META_DICT['%s' % data.key] = data.values[0].text 44 | else: 45 | values = [] 46 | for value in data.values: 47 | values.append(value.text) 48 | META_DICT['%s' % data.key] = values 49 | 50 | return META_DICT 51 | 52 | if __name__ == '__main__': 53 | # For testing, s object can be None type if unused in function 54 | print META_OLECF(None, sys.stdin.read()) 55 | 56 | -------------------------------------------------------------------------------- /fsf-server/modules/META_OOXML.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Show metadata from parsing OOXML core properties files 6 | # Date: 5/5/2015 7 | ''' 8 | Copyright 2015 Emerson Electric Co. 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 21 | ''' 22 | 23 | import sys 24 | import xmltodict 25 | 26 | def META_OOXML(s, buff): 27 | 28 | CORE_PROP = xmltodict.parse(buff) 29 | 30 | # We don't care about keys for XML namespaces 31 | xmlns = "@xmlns" 32 | 33 | try: 34 | for key, child_dict in CORE_PROP.items(): 35 | for k, v in child_dict.items(): 36 | if xmlns in k: 37 | del child_dict[k] 38 | continue 39 | 40 | if 'dcterms:' in k: 41 | child_dict[k[k.index(':')+1:]] = child_dict[k]['#text'] 42 | del child_dict[k] 43 | continue 44 | 45 | if 'cp:' in k or 'dc:' in k: 46 | child_dict[k[k.index(':')+1:]] = v 47 | del child_dict[k] 48 | 49 | except: 50 | pass 51 | 52 | return CORE_PROP 53 | 54 | if __name__ == '__main__': 55 | # For testing, s object can be None type if unused in function 56 | print META_OOXML(None, sys.stdin.read()) 57 | 58 | -------------------------------------------------------------------------------- /fsf-server/modules/META_PDF.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Get metadata from PDF and return dict of metadata 6 | # Date: 12/30/2014 7 | ''' 8 | Copyright 2015 Emerson Electric Co. 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 
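META_OOXML above leans on xmltodict behavior: XML attributes come back as '@'-prefixed keys (which is how the '@xmlns' declarations get dropped) and element names keep their 'cp:'/'dc:'/'dcterms:' prefixes until the module strips them. A minimal sketch against a trimmed-down core.xml:

import xmltodict

core_xml = ('<cp:coreProperties '
            'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
            'xmlns:dc="http://purl.org/dc/elements/1.1/">'
            '<dc:creator>analyst</dc:creator>'
            '<cp:lastModifiedBy>analyst</cp:lastModifiedBy>'
            '</cp:coreProperties>')

props = xmltodict.parse(core_xml)['cp:coreProperties']
for k in list(props.keys()):
    if k.startswith('@xmlns'):
        del props[k]
    elif ':' in k:
        props[k.split(':', 1)[1]] = props.pop(k)

print dict(props)   # {'creator': 'analyst', 'lastModifiedBy': 'analyst'}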
12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 21 | ''' 22 | 23 | import sys 24 | from PyPDF2 import PdfFileReader 25 | from StringIO import StringIO 26 | 27 | def META_PDF(s, buff): 28 | 29 | META_PDF = { } 30 | pdfinfo = PdfFileReader(StringIO(buff)).documentInfo 31 | 32 | for i in pdfinfo: 33 | META_PDF['%s' % i] = pdfinfo[i] 34 | 35 | return META_PDF 36 | 37 | -------------------------------------------------------------------------------- /fsf-server/modules/META_PE.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Company: Emerson 5 | # Description: Extract metadata associated with executable filetypes 6 | # Date: 01/02/2015 7 | ''' 8 | Copyright 2016 Emerson Electric Co. 9 | 10 | Licensed under the Apache License, Version 2.0 (the "License"); 11 | you may not use this file except in compliance with the License. 12 | You may obtain a copy of the License at 13 | 14 | http://www.apache.org/licenses/LICENSE-2.0 15 | 16 | Unless required by applicable law or agreed to in writing, software 17 | distributed under the License is distributed on an "AS IS" BASIS, 18 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 19 | See the License for the specific language governing permissions and 20 | limitations under the License. 21 | ''' 22 | 23 | import sys 24 | import pefile 25 | import time 26 | from collections import OrderedDict 27 | 28 | def enum_resources(id): 29 | # Reference: http://msdn.microsoft.com/en-us/library/ms648009%28v=vs.85%29.aspx 30 | type = '' 31 | 32 | if id == 9: type = "RT_ACCELERATOR" 33 | if id == 21: type = "RT_ANICURSOR" 34 | if id == 22: type = "RT_ANIICON" 35 | if id == 2: type = "RT_BITMAP" 36 | if id == 1: type = "RT_CURSOR" 37 | if id == 5: type = "RT_DIALOG" 38 | if id == 17: type = "RT_DLGINCLUDE" 39 | if id == 8: type = "RT_FONT" 40 | if id == 7: type = "RT_FONTDIR" 41 | if id == 12: type = "RT_GROUP_CURSOR" 42 | if id == 14: type = "RT_GROUP_ICON" 43 | if id == 23: type = "RT_HTML" 44 | if id == 3: type = "RT_ICON" 45 | if id == 24: type = "RT_MANIFEST" 46 | if id == 4: type = "RT_MENU" 47 | if id == 11: type = "RT_MESSAGETABLE" 48 | if id == 19: type = "RT_PLUGPLAY" 49 | if id == 10: type = "RT_RCDATA" 50 | if id == 6: type = "RT_STRING" 51 | if id == 16: type = "RT_VERSION" 52 | if id == 20: type = "RT_VXD" 53 | 54 | return type 55 | 56 | 57 | def get_image_hdr_characteristics(pe): 58 | 59 | myChars = pe.FILE_HEADER.Characteristics 60 | 61 | HDR_CHARS = {} 62 | 63 | # Reference: http://msdn.microsoft.com/en-us/library/windows/desktop/ms680313(v=vs.85).aspx 64 | IMAGE_FILE_EXECUTABLE_IMAGE = 0x2 65 | IMAGE_FILE_SYSTEM = 0x1000 66 | IMAGE_FILE_DLL = 0x2000 67 | 68 | HDR_CHARS['EXE'] = 'True' if myChars & IMAGE_FILE_EXECUTABLE_IMAGE else 'False' 69 | HDR_CHARS['SYSTEM'] = 'True' if myChars & IMAGE_FILE_SYSTEM else 'False' 70 | HDR_CHARS['DLL'] = 'True' if myChars & IMAGE_FILE_DLL else 'False' 71 | 72 | return HDR_CHARS 73 | 74 | def get_crc(pe): 75 | 76 | crc = [] 77 | crc.append('Claimed: 0x%x' % pe.OPTIONAL_HEADER.CheckSum) 78 | crc.append('Actual: 0x%x' % 
pe.generate_checksum()) 79 | 80 | return crc 81 | 82 | def get_machine(pe): 83 | 84 | # Reference: http://msdn.microsoft.com/en-us/library/windows/desktop/ms680313(v=vs.85).aspx 85 | IMAGE_FILE_MACHINE_I386 = 0x014c 86 | IMAGE_FILE_MACHINE_IA64 = 0x0200 87 | IMAGE_FILE_MACHINE_AMD64 = 0x8664 88 | 89 | machine = pe.FILE_HEADER.Machine 90 | 91 | if machine & IMAGE_FILE_MACHINE_I386: 92 | return 'x86' 93 | 94 | if machine & IMAGE_FILE_MACHINE_IA64: 95 | return 'Intel Itanium' 96 | 97 | if machine & IMAGE_FILE_MACHINE_AMD64: 98 | return 'x64' 99 | 100 | return 'Unknown' 101 | 102 | def get_sections(pe): 103 | 104 | sections = [] 105 | for section in pe.sections: 106 | name = section.Name.strip('\0') 107 | sections.append(name.decode('ascii', 'ignore')) 108 | return sections 109 | 110 | def get_dllcharacteristics(pe): 111 | 112 | myChars = pe.OPTIONAL_HEADER.DllCharacteristics 113 | 114 | DLL_CHARS = {} 115 | 116 | # Reference: http://msdn.microsoft.com/en-us/library/windows/desktop/ms680339(v=vs.85).aspx 117 | DYNAMICBASE_FLAG = 0x0040 118 | NXCOMPAT_FLAG = 0x0100 119 | NO_SEH_FLAG = 0x0400 120 | WDM_DRIVER = 0x2000 121 | NO_ISOLATION = 0x200 122 | FORCE_INTEGRITY = 0x80 123 | TERMINAL_SERVER_AWARE = 0x8000 124 | 125 | DLL_CHARS['ASLR'] = 'Enabled' if myChars & DYNAMICBASE_FLAG else 'Disabled' 126 | DLL_CHARS['DEP'] = 'Enabled' if myChars & NXCOMPAT_FLAG else 'Disabled' 127 | DLL_CHARS['SEH'] = 'Disabled' if myChars & NO_SEH_FLAG else 'Enabled' 128 | DLL_CHARS['WDM_DRIVER'] = 'Enabled' if myChars & WDM_DRIVER else 'Disabled' 129 | DLL_CHARS['NO_ISOLATION'] = 'Enabled' if myChars & NO_ISOLATION else 'Disabled' 130 | DLL_CHARS['FORCE_INTEGRITY'] = 'Enabled' if myChars & FORCE_INTEGRITY else 'Disabled' 131 | DLL_CHARS['TERMINAL_SERVER_AWARE'] = 'Enabled' if myChars & TERMINAL_SERVER_AWARE else 'Disabled' 132 | 133 | return DLL_CHARS 134 | 135 | def get_resource_names(pe): 136 | 137 | resource_names = [] 138 | 139 | try: 140 | pe.DIRECTORY_ENTRY_RESOURCE.entries 141 | except: 142 | return resource_names 143 | 144 | for res in pe.DIRECTORY_ENTRY_RESOURCE.entries: 145 | if res.name is not None: 146 | resource_names.append(res.name.__str__()) 147 | return resource_names 148 | 149 | def get_resource_types(pe): 150 | 151 | resource_types = [] 152 | 153 | try: 154 | pe.DIRECTORY_ENTRY_RESOURCE.entries 155 | except: 156 | return resource_types 157 | 158 | for res in pe.DIRECTORY_ENTRY_RESOURCE.entries: 159 | if res.id is not None: 160 | resource_types.append(enum_resources(res.id)) 161 | return resource_types 162 | 163 | def get_exports(pe): 164 | 165 | my_exports = [] 166 | 167 | try: 168 | pe.DIRECTORY_ENTRY_EXPORT.symbols 169 | except: 170 | return my_exports 171 | 172 | for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols: 173 | my_exports.append(exp.name) 174 | 175 | return my_exports 176 | 177 | def get_imports(pe): 178 | 179 | IMPORTS = {} 180 | 181 | try: 182 | pe.DIRECTORY_ENTRY_IMPORT 183 | except: 184 | return IMPORTS 185 | 186 | for entry in pe.DIRECTORY_ENTRY_IMPORT: 187 | my_imports = [] 188 | for imp in entry.imports: 189 | my_imports.append(imp.name) 190 | IMPORTS['%s' % entry.dll.upper().split('.')[0]] = my_imports 191 | 192 | return IMPORTS 193 | 194 | def get_stringfileinfo(pe): 195 | 196 | STRINGFILEINFO = {} 197 | 198 | try: 199 | pe.FileInfo 200 | except: 201 | return STRINGFILEINFO 202 | 203 | for fi in pe.FileInfo: 204 | if fi.Key == 'StringFileInfo': 205 | for st in fi.StringTable: 206 | for entry in st.entries.items(): 207 | k = entry[0].encode('ascii','backslashreplace') 208 | v = 
entry[1].encode('ascii','backslashreplace') 209 | STRINGFILEINFO['%s' % k] = v 210 | 211 | return STRINGFILEINFO 212 | 213 | def META_PE(s, buff): 214 | 215 | pe = pefile.PE(data=buff) 216 | 217 | META_PE = OrderedDict([('File Type', get_image_hdr_characteristics(pe)), 218 | ('CRC', get_crc(pe)), 219 | ('Compiled', '%s UTC' % time.asctime(time.gmtime(pe.FILE_HEADER.TimeDateStamp))), 220 | ('Architecture', get_machine(pe)), 221 | ('EntryPoint', hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint)), 222 | ('ImageBase', hex(pe.OPTIONAL_HEADER.ImageBase)), 223 | ('Characteristics', get_dllcharacteristics(pe)), 224 | ('Sections', get_sections(pe)), 225 | ('Resource Names', get_resource_names(pe)), 226 | ('Resource Types', get_resource_types(pe)), 227 | ('Export Functions', get_exports(pe)), 228 | ('Import DLLs', get_imports(pe)), 229 | ('Import Hash', pe.get_imphash()), 230 | ('StringFileInfo', get_stringfileinfo(pe))]) 231 | 232 | return META_PE 233 | 234 | if __name__ == '__main__': 235 | print META_PE(None, sys.stdin.read()) 236 | 237 | -------------------------------------------------------------------------------- /fsf-server/modules/META_PE_SIGNATURE.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Get metadata on the signature used to sign a PE file 5 | # Date: 11/17/2015 6 | # 7 | # Good resources: 8 | # * https://www.cs.auckland.ac.nz/~pgut001/pubs/authenticode.txt 9 | # * http://erny-rev.blogspot.com/2013/10/parsing-x509v3-certificates-and-pkcs7.html 10 | # * http://pyasn1.sourceforge.net/ 11 | # * https://msdn.microsoft.com/en-us/windows/hardware/gg463180.aspx 12 | ''' 13 | Copyright 2015 Emerson Electric Co. 14 | 15 | Licensed under the Apache License, Version 2.0 (the "License"); 16 | you may not use this file except in compliance with the License. 17 | You may obtain a copy of the License at 18 | 19 | http://www.apache.org/licenses/LICENSE-2.0 20 | 21 | Unless required by applicable law or agreed to in writing, software 22 | distributed under the License is distributed on an "AS IS" BASIS, 23 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 24 | See the License for the specific language governing permissions and 25 | limitations under the License. 
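Most of META_PE above reduces to masking documented flag values out of the PE headers. A minimal sketch of the same checks against hand-picked example values (0x2102 and 0x0140 are illustrative, not taken from a specific sample):

# File header Characteristics
IMAGE_FILE_EXECUTABLE_IMAGE = 0x2
IMAGE_FILE_DLL = 0x2000
# Optional header DllCharacteristics
DYNAMICBASE_FLAG = 0x0040   # ASLR
NXCOMPAT_FLAG = 0x0100      # DEP
NO_SEH_FLAG = 0x0400

file_chars = 0x2102   # valid image + DLL
dll_chars = 0x0140    # ASLR + DEP, SEH not stripped

print 'EXE :', 'True' if file_chars & IMAGE_FILE_EXECUTABLE_IMAGE else 'False'
print 'DLL :', 'True' if file_chars & IMAGE_FILE_DLL else 'False'
print 'ASLR:', 'Enabled' if dll_chars & DYNAMICBASE_FLAG else 'Disabled'
print 'DEP :', 'Enabled' if dll_chars & NXCOMPAT_FLAG else 'Disabled'
print 'SEH :', 'Disabled' if dll_chars & NO_SEH_FLAG else 'Enabled'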
26 | ''' 27 | 28 | import sys 29 | import pefile 30 | import struct 31 | from datetime import datetime 32 | from pyasn1.codec.der.decoder import decode 33 | from pyasn1_modules import rfc2315 34 | 35 | # Reference: https://msdn.microsoft.com/en-us/library/ff635603.aspx 36 | def hash_alg_oid_mapping(): 37 | 38 | db = {} 39 | db['1.2.840.113549.1.1.5'] = 'sha1RSA' 40 | db['1.2.840.113549.1.1.4'] = 'md5RSA' 41 | db['1.2.840.10040.4.3'] = 'sha1DSA' 42 | db['1.3.14.3.2.29'] = 'sha1RSA' 43 | db['1.3.14.3.2.15'] = 'shaRSA' 44 | db['1.3.14.3.2.3'] = 'md5RSA' 45 | db['1.2.840.113549.1.1.2'] = 'md2RSA' 46 | db['1.2.840.113549.1.1.3'] = 'md4RSA' 47 | db['1.3.14.3.2.2'] = 'md4RSA' 48 | db['1.3.14.3.2.4'] = 'md4RSA' 49 | db['1.3.14.7.2.3.1'] = 'md2RSA' 50 | db['1.3.14.3.2.13'] = 'sha1DSA' 51 | db['1.3.14.3.2.27'] = 'dsaSHA1' 52 | db['2.16.840.1.101.2.1.1.19'] = 'mosaicUpdatedSig' 53 | db['1.3.14.3.2.26'] = 'sha1NoSign' 54 | db['1.2.840.113549.2.5'] = 'md5NoSign' 55 | db['2.16.840.1.101.3.4.2.1'] = 'sha256NoSign' 56 | db['2.16.840.1.101.3.4.2.2'] = 'sha384NoSign' 57 | db['2.16.840.1.101.3.4.2.3'] = 'sha512NoSign' 58 | db['1.2.840.113549.1.1.11'] = 'sha256RSA' 59 | db['1.2.840.113549.1.1.12'] = 'sha384RSA' 60 | db['1.2.840.113549.1.1.13'] = 'sha512RSA' 61 | db['1.2.840.113549.1.1.10'] = 'RSASSA-PSS' 62 | db['1.2.840.10045.4.1'] = 'sha1ECDSA' 63 | db['1.2.840.10045.4.3.2'] = 'sha256ECDSA' 64 | db['1.2.840.10045.4.3.3'] = 'sha384ECDSA' 65 | db['1.2.840.10045.4.3.4'] = 'sha512ECDSA' 66 | db['1.2.840.10045.4.3'] = 'specifiedECDSA' 67 | 68 | return db 69 | 70 | # Reference: https://msdn.microsoft.com/en-us/library/windows/desktop/aa386991(v=vs.85).aspx 71 | def rdn_oid_mapping(): 72 | 73 | db = {} 74 | db['2.5.4.3'] = 'CN' 75 | db['2.5.4.5'] = 'DeviceSerialNumber' 76 | db['2.5.4.6'] = 'C' 77 | db['2.5.4.7'] = 'L' 78 | db['2.5.4.8'] = 'ST' 79 | db['2.5.4.10'] = 'O' 80 | db['2.5.4.11'] = 'OU' 81 | db['1.2.840.113549.1.9.1'] = 'E' 82 | 83 | return db 84 | 85 | def get_cert_info(signed_data): 86 | 87 | PARENT_CERT_INFO = {} 88 | rdn_mapping = rdn_oid_mapping() 89 | hash_mapping = hash_alg_oid_mapping() 90 | cert_count = 0 91 | 92 | for c in signed_data['certificates']: 93 | 94 | CERT_INFO = {} 95 | cer = c['certificate']['tbsCertificate'] 96 | 97 | CERT_INFO['Version'] = cer['version'].prettyPrint()[1:-1] # the [1:-1] is a fun way to get rid of double quotes 98 | 99 | CERT_INFO['Algorithm'] = hash_mapping[cer['signature']['algorithm'].prettyPrint()] 100 | 101 | # Had do get creative here with the formatting.. 
102 | serial = '%.02x' % int(cer['serialNumber'].prettyPrint()) 103 | # Append a zero to the front if we have an odd number of hex digits 104 | serial = '0' + serial if len(serial) % 2 != 0 else serial 105 | # Finally, apply our colon in between the hex bytes 106 | serial = ':'.join(serial[i:i+2] for i in range(0, len(serial), 2)) 107 | CERT_INFO['Serial'] = serial 108 | 109 | CERT_INFO['Validity'] = { 'Not Before' : datetime.strptime(str(cer['validity']['notBefore']['utcTime']), '%y%m%d%H%M%SZ').strftime("%Y-%m-%d %H:%M:%S UTC"), 110 | 'Not After' : datetime.strptime(str(cer['validity']['notAfter']['utcTime']), '%y%m%d%H%M%SZ').strftime("%Y-%m-%d %H:%M:%S UTC") } 111 | 112 | subject = cer['subject'] 113 | issuer = cer['issuer'] 114 | 115 | rdnsequence = subject[0] 116 | CERT_INFO['Subject'] = [] 117 | for rdn in rdnsequence: 118 | oid, value = rdn[0] 119 | if oid.prettyPrint() in rdn_mapping: 120 | CERT_INFO['Subject'].append('%s=%s' % (rdn_mapping[oid.prettyPrint()], str(value[2:]))) 121 | 122 | rdnsequence = issuer[0] 123 | CERT_INFO['Issuer'] = [] 124 | for rdn in rdnsequence: 125 | oid, value = rdn[0] 126 | if oid.prettyPrint() in rdn_mapping: 127 | CERT_INFO['Issuer'].append('%s=%s' % (rdn_mapping[oid.prettyPrint()], str(value[2:]))) 128 | 129 | PARENT_CERT_INFO['Cert_%s' % cert_count] = CERT_INFO 130 | cert_count += 1 131 | 132 | return PARENT_CERT_INFO 133 | 134 | def META_PE_SIGNATURE(s, buff): 135 | 136 | sig_buff = [] 137 | 138 | pe = pefile.PE(data=buff) 139 | 140 | address = pe.OPTIONAL_HEADER.DATA_DIRECTORY[pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_SECURITY']].VirtualAddress 141 | size = pe.OPTIONAL_HEADER.DATA_DIRECTORY[pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_SECURITY']].Size 142 | 143 | # Eight bytes in due to the struct spec 144 | # typedef struct _WIN_CERTIFICATE 145 | # { 146 | # DWORD dwLength; 147 | # WORD wRevision; 148 | # WORD wCertificateType; 149 | # BYTE bCertificate[ANYSIZE_ARRAY]; 150 | # } WIN_CERTIFICATE, *LPWIN_CERTIFICATE; 151 | sig_buff = buff[address + 8 : address + 8 + size] 152 | # Remove sequence and objid structures, 19 bytes 153 | signed_data, rest = decode(sig_buff[19:], asn1Spec=rfc2315.SignedData()) 154 | 155 | return get_cert_info(signed_data) 156 | 157 | if __name__ == '__main__': 158 | print META_PE_SIGNATURE(None, sys.stdin.read()) 159 | -------------------------------------------------------------------------------- /fsf-server/modules/META_VT_INSPECT.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: Jason Batchelor 4 | # Description: Search VirusTotal database based on computed hash of buffer 5 | # Can be used as a default module (if desired) or more tactically applied 6 | # (ie only EXE files with high entropy, etc). It depends on you and your API 7 | # usage limits. When you are set up, just add this to the dispositioner file. 8 | # Date: 01/06/2016 9 | ''' 10 | Copyright 2016 Emerson Electric Co. 11 | 12 | Licensed under the Apache License, Version 2.0 (the "License"); 13 | you may not use this file except in compliance with the License. 14 | You may obtain a copy of the License at 15 | 16 | http://www.apache.org/licenses/LICENSE-2.0 17 | 18 | Unless required by applicable law or agreed to in writing, software 19 | distributed under the License is distributed on an "AS IS" BASIS, 20 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
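The slicing in META_PE_SIGNATURE above follows the WIN_CERTIFICATE layout quoted in its comment: a four-byte dwLength, a two-byte wRevision, and a two-byte wCertificateType, then the DER-encoded PKCS#7 blob, which is why the module starts eight bytes past the security directory offset. A sketch against a hand-built header (the certificate bytes are dummy filler):

import struct

cert_body = '\x30\x82\x01\x00' + '\x00' * 16   # stand-in for DER-encoded SignedData
win_cert = struct.pack('<IHH', 8 + len(cert_body), 0x0200, 0x0002) + cert_body

dw_length, w_revision, w_cert_type = struct.unpack('<IHH', win_cert[:8])
print 'dwLength=%d wRevision=0x%04x wCertificateType=0x%04x' % (dw_length, w_revision, w_cert_type)

# Everything past the 8-byte header is what gets handed to the ASN.1 decoder
der_blob = win_cert[8:]
print '%d bytes of certificate data' % len(der_blob)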
21 | See the License for the specific language governing permissions and 22 | limitations under the License. 23 | ''' 24 | import sys 25 | import hashlib 26 | import requests 27 | 28 | def META_VT_INSPECT(s, buff): 29 | 30 | md5 = hashlib.md5(buff).hexdigest() 31 | params = {'apikey' : 'YOUR API KEY HERE', 32 | 'resource' : md5 } 33 | base_uri = 'https://www.virustotal.com/vtapi/v2' 34 | response = requests.get('%s/%s' % (base_uri, 'file/report'), params=params) 35 | response_json = response.json() 36 | 37 | return response_json 38 | 39 | if __name__ == '__main__': 40 | print META_VT_INSPECT(None, sys.stdin.read()) 41 | -------------------------------------------------------------------------------- /fsf-server/modules/SCAN_YARA.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Jason Batchelor 4 | # Module used to scan files against our Yara signatures 5 | # 04/21/2015 6 | ''' 7 | Copyright 2015 Emerson Electric Co. 8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 20 | ''' 21 | 22 | import sys 23 | import yara 24 | 25 | def SCAN_YARA(s, buff): 26 | 27 | rules = yara.compile(s.yara_rule_path) 28 | 29 | results = { } 30 | if rules: 31 | matches = rules.match(data=buff) 32 | if matches: 33 | for m in matches: 34 | if m.meta: 35 | results['%s' % m.rule] = m.meta 36 | else: 37 | results['%s' % m.rule] = 'No Meta Provided' 38 | return results 39 | -------------------------------------------------------------------------------- /fsf-server/modules/__init__.py: -------------------------------------------------------------------------------- 1 | __all__ = ['META_BASIC_INFO', 2 | 'SCAN_YARA', 3 | 'EXTRACT_ZIP', 4 | 'META_PE', 5 | 'EXTRACT_EMBEDDED', 6 | 'EXTRACT_RAR', 7 | 'META_PDF', 8 | 'META_OOXML', 9 | 'EXTRACT_SWF', 10 | 'META_OLECF', 11 | 'EXTRACT_VBA_MACRO', 12 | 'EXTRACT_UPX', 13 | 'EXTRACT_RTF_OBJ', 14 | 'EXTRACT_GZIP', 15 | 'EXTRACT_TAR', 16 | 'META_PE_SIGNATURE', 17 | 'EXTRACT_CAB', 18 | 'META_JAVA_CLASS', 19 | 'META_ELF', 20 | 'META_VT_INSPECT', 21 | 'EXTRACT_HEXASCII_PE', 22 | 'META_MACHO' 23 | ] 24 | -------------------------------------------------------------------------------- /fsf-server/modules/template.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Author: 4 | # Description: 5 | # Date: 6 | ''' 7 | Copyright 2015 Emerson Electric Co. 8 | 9 | Licensed under the Apache License, Version 2.0 (the "License"); 10 | you may not use this file except in compliance with the License. 11 | You may obtain a copy of the License at 12 | 13 | http://www.apache.org/licenses/LICENSE-2.0 14 | 15 | Unless required by applicable law or agreed to in writing, software 16 | distributed under the License is distributed on an "AS IS" BASIS, 17 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 18 | See the License for the specific language governing permissions and 19 | limitations under the License. 
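SCAN_YARA above is the pivot point for the whole framework: it compiles the rule file configured on the scanner object and maps each matching rule name to its meta block, and those rule names are what the dispositioner keys off. A self-contained sketch that compiles a toy rule from a string instead of a file:

import yara

rules = yara.compile(source='''
rule ft_pdf_demo
{
    meta:
        desc = "Toy file-type rule for illustration"
    strings:
        $pdf = "%PDF"
    condition:
        $pdf in (0 .. 1024)
}
''')

results = {}
for m in rules.match(data='%PDF-1.5 ... fake document body ...'):
    results[m.rule] = m.meta if m.meta else 'No Meta Provided'

print results   # {'ft_pdf_demo': {'desc': 'Toy file-type rule for illustration'}}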
20 | ''' 21 | 22 | import sys 23 | 24 | def MODULE_NAME(s, buff): 25 | # Function must return a dictionary 26 | MY_DICTIONARY = {} 27 | 28 | return MY_DICTIONARY 29 | 30 | if __name__ == '__main__': 31 | # For testing, s object can be None type if unused in function 32 | print MODULE_NAME(None, sys.stdin.read()) 33 | -------------------------------------------------------------------------------- /fsf-server/processor.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Process objects and sub objects according to disposition criteria. 4 | # Log all the results. 5 | # 6 | # Jason Batchelor 7 | # Emerson Corporation 8 | # 02/09/2016 9 | ''' 10 | Copyright 2016 Emerson Electric Co. 11 | 12 | Licensed under the Apache License, Version 2.0 (the "License"); 13 | you may not use this file except in compliance with the License. 14 | You may obtain a copy of the License at 15 | 16 | http://www.apache.org/licenses/LICENSE-2.0 17 | 18 | Unless required by applicable law or agreed to in writing, software 19 | distributed under the License is distributed on an "AS IS" BASIS, 20 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 21 | See the License for the specific language governing permissions and 22 | limitations under the License. 23 | ''' 24 | 25 | import sys 26 | import os 27 | import logging 28 | import signal 29 | import json 30 | import time 31 | import hashlib 32 | from datetime import datetime as dt 33 | from collections import OrderedDict 34 | from distutils.spawn import find_executable 35 | from subprocess import Popen, PIPE, STDOUT 36 | # Ensure concurrent logging 37 | from cloghandler import ConcurrentRotatingFileHandler 38 | # Configurations 39 | from conf import disposition 40 | # Custom modules 41 | from modules import * 42 | 43 | # Global counter, helps keep track of object depth 44 | COUNTER = 0 45 | 46 | # List of modules that ran and returned results - used in summary generation 47 | MODULES_RUN = [] 48 | 49 | # List of Yara rules that fired - used in summary generation 50 | YARA_RULES = [] 51 | 52 | # Determine if we need to save sub object buffers 53 | SUB_TRACKER = False 54 | 55 | # Result: Recurse through dictionary to identify and process returned buffers. 56 | # When complete, we update the dictionary with new module information and remove the buffer. 57 | def recurse_dictionary(s, myDict): 58 | 59 | for key, value in myDict.items(): 60 | if isinstance(value, dict): 61 | recurse_dictionary(s, value) 62 | if key == 'Buffer': 63 | # Process any new buffers from a module 64 | myDict.update(process_buffer(s, value)) 65 | # Keep track of sub objects if client wants them 66 | if SUB_TRACKER: 67 | s.sub_objects.append(value) 68 | # We don't care to display/log the buffer after processing 69 | del myDict[key] 70 | # Multiple buffers can be found at the same depth 71 | # We don't want to increment depth if we are just at the same spot 72 | global COUNTER 73 | COUNTER -= 1 74 | 75 | return myDict 76 | 77 | def invoke_module(s, module, buff, myDict): 78 | 79 | def timer(*args): 80 | 81 | s.dbg_h.error('%s The scanner timeout threshold has been triggered...' % dt.now()) 82 | raise Exception() 83 | 84 | # Set timeout for processing of data 85 | signal.signal(signal.SIGALRM, timer) 86 | signal.alarm(s.timeout) 87 | 88 | m = sys.modules['modules.%s' % module] 89 | try: 90 | module_result = getattr(m, module)(s, buff) 91 | # Are you a dictionary? 
92 | if isinstance(module_result, dict): 93 | # Do you have something for me? 94 | if module_result: 95 | myDict['%s' % module] = recurse_dictionary(s, module_result) 96 | MODULES_RUN.append(module) 97 | except: 98 | e = sys.exc_info()[0] 99 | s.dbg_h.error('%s Failed to run module %s on %s byte buffer supplied for file %s. Error: %s' \ 100 | % (dt.now(), module, len(buff), s.filename, e)) 101 | 102 | return myDict 103 | 104 | # Result: Logs any scan hits on the file and sends an alert if a signature prefix match is observed 105 | def process_buffer(s, buff): 106 | 107 | myDict = OrderedDict() 108 | 109 | global COUNTER 110 | COUNTER += 1 111 | 112 | if COUNTER >= s.max_depth: 113 | myDict['Error'] = 'Max depth of %s exceeded' % s.max_depth 114 | return myDict 115 | 116 | for module in disposition.default: 117 | myDict.update(invoke_module(s, module, buff, myDict)) 118 | 119 | # Yara helps drive execution of modules and alerting, no Yara = nothing more to do for buffer 120 | if 'SCAN_YARA' not in myDict: 121 | return myDict 122 | 123 | results = myDict['SCAN_YARA'].keys() 124 | YARA_RULES.extend(results) 125 | 126 | # Are there opportunities to run modules or set alert flag? 127 | for rule, modules, alert in disposition.triggers: 128 | if rule in results and alert: 129 | s.alert = True 130 | 131 | if rule in results and modules is not None: 132 | for module in modules: 133 | myDict.update(invoke_module(s, module, buff, myDict)) 134 | 135 | return myDict 136 | 137 | # Result: Return post processing observations back 138 | def post_processor(s, report): 139 | 140 | observations = [] 141 | 142 | jq_location = find_executable('jq') 143 | if jq_location == None: 144 | s.dbg_h.error('%s Unable to find JQ, aborting post-processing routine...' % dt.now()) 145 | return 146 | 147 | for script, observation, alert in disposition.post_processor: 148 | args = [jq_location, '-f', '%s/%s/%s' % (os.path.dirname(os.path.realpath(__file__)), 'jq', script)] 149 | proc = Popen(args, stdin=PIPE, stdout=PIPE, stderr=STDOUT) 150 | results = proc.communicate(input=json.dumps(report))[0].split('\n') 151 | 152 | if proc.returncode: 153 | s.dbg_h.error('%s There was a problem executing the JSON interpreter...' % dt.now()) 154 | return 155 | 156 | for r in results: 157 | if r == 'true': 158 | observations.append(observation) 159 | # Allow ourselves to alert on certain observations 160 | if alert: 161 | s.alert = True 162 | 163 | break 164 | 165 | return observations 166 | 167 | # Result: copy file, return export path to user 168 | def archive_file(s): 169 | 170 | path = '%s/%s' % (s.export_path, s.filename) 171 | 172 | try: 173 | with open(path, 'w') as f: 174 | f.write(s.file) 175 | f.close() 176 | except: 177 | e = sys.exc_info()[0] 178 | s.dbg_h.error('%s There was an error writing to the export directory. 
Error: %s' % (dt.now(), e)) 179 | 180 | return path 181 | 182 | # Result: archive file and associated sub objects, return export path to user 183 | def archive_all(s, report): 184 | 185 | report_dump = json.dumps(report) 186 | # Generate dirname by calculating epoch time and hash of results 187 | dirname = '%s/fsf_dump_%s_%s' % (s.export_path, int(time.time()), hashlib.md5(report_dump).hexdigest()) 188 | 189 | try: 190 | 191 | os.mkdir(dirname) 192 | 193 | # Archive the base file 194 | with open ('%s/%s' % (dirname, s.filename), 'w') as f: 195 | f.write(s.file) 196 | f.close() 197 | 198 | # Archive all sub objects 199 | for data in s.sub_objects: 200 | fname = hashlib.md5(data).hexdigest() 201 | with open('%s/%s' % (dirname, fname), 'w') as f: 202 | f.write(data) 203 | f.close 204 | except: 205 | e = sys.exc_info()[0] 206 | s.dbg_h.error('%s There was an error writing to the export directory. Error: %s' % (dt.now(), e)) 207 | 208 | return dirname 209 | 210 | # Result: Process object and sub objects, review results, pass dictionary back 211 | def scan_file(s): 212 | 213 | # Determine if we need to save sub object buffers 214 | if s.full == 'True' or \ 215 | s.archive == 'all-the-things' or \ 216 | s.archive == 'all-on-alert': 217 | global SUB_TRACKER 218 | SUB_TRACKER = True 219 | 220 | # Scan and process the results 221 | root_dict = OrderedDict([('Scan Time', '%s' % dt.now()), 222 | ('Filename', s.filename), 223 | ('Source', s.source), 224 | ('Object', process_buffer(s, s.file))]) 225 | 226 | root_dict['Summary'] = { 'Modules' : sorted(set(MODULES_RUN)), 227 | 'Yara' : sorted(set(YARA_RULES)) } 228 | 229 | # Allow post processor to add observations on output 230 | root_dict['Summary']['Observations'] = post_processor(s, root_dict) 231 | 232 | if s.alert: 233 | root_dict['Alert'] = True 234 | # Archive file on alert 235 | if s.archive == 'file-on-alert': 236 | root_dict['Export'] = archive_file(s) 237 | # Archive file and sub objects on alert 238 | if s.archive == 'all-on-alert': 239 | root_dict['Export'] = archive_all(s, root_dict) 240 | else: 241 | root_dict['Alert'] = False 242 | 243 | # Archive all the files, regardless of alert status 244 | if s.archive == 'all-the-files': 245 | root_dict['Export'] = archive_file(s) 246 | 247 | # Archive all the things, regardless of alert status 248 | if s.archive == 'all-the-things': 249 | root_dict['Export'] = archive_all(s, root_dict) 250 | 251 | return root_dict 252 | -------------------------------------------------------------------------------- /fsf-server/scanner.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # 3 | # Base class for scanner framework. 4 | # 5 | # Jason Batchelor 6 | # Emerson Corporation 7 | # 02/10/2016 8 | ''' 9 | Copyright 2016 Emerson Electric Co. 10 | 11 | Licensed under the Apache License, Version 2.0 (the "License"); 12 | you may not use this file except in compliance with the License. 13 | You may obtain a copy of the License at 14 | 15 | http://www.apache.org/licenses/LICENSE-2.0 16 | 17 | Unless required by applicable law or agreed to in writing, software 18 | distributed under the License is distributed on an "AS IS" BASIS, 19 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 20 | See the License for the specific language governing permissions and 21 | limitations under the License. 
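processor.py above consumes three structures from conf/disposition.py: a default list of modules run on every buffer, triggers as (yara rule, module list, alert flag) tuples, and post_processor as (jq script, observation, alert flag) tuples. The entries below are hypothetical stand-ins shaped to match those loops, not the project's shipped configuration:

# Hypothetical disposition entries -- the shapes match processor.py's loops
default = ['META_BASIC_INFO', 'SCAN_YARA']

triggers = [('ft_zip', ['EXTRACT_ZIP'], False),
            ('ft_exe', ['META_PE'], False),
            ('misc_upx_packed_binary', ['EXTRACT_UPX'], False)]

post_processor = [('no_yara_hits.jq', 'No yara hits on root object', False)]

# The trigger loop in process_buffer() then reduces to:
results = ['ft_zip']   # pretend SCAN_YARA fired ft_zip on this buffer
for rule, modules, alert in triggers:
    if rule in results and modules is not None:
        print 'Yara rule %s fired; would invoke %s' % (rule, ', '.join(modules))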
22 | ''' 23 | 24 | import os 25 | import sys 26 | import argparse 27 | import logging 28 | import processor 29 | from cloghandler import ConcurrentRotatingFileHandler 30 | from conf import config 31 | from datetime import datetime as dt 32 | 33 | class Scanner: 34 | def __init__(self): 35 | 36 | self.filename = "" 37 | self.source = "" 38 | self.archive = "" 39 | self.suppress_report = "" 40 | self.file = "" 41 | self.yara_rule_path = config.SCANNER_CONFIG['YARA_PATH'] 42 | self.export_path = config.SCANNER_CONFIG['EXPORT_PATH'] 43 | self.log_path = config.SCANNER_CONFIG['LOG_PATH'] 44 | self.max_depth = config.SCANNER_CONFIG['MAX_DEPTH'] 45 | self.dbg_h = "" 46 | self.scan_h = "" 47 | self.timeout = config.SCANNER_CONFIG['TIMEOUT'] 48 | self.alert = False 49 | self.full = "" 50 | self.sub_objects = [] 51 | 52 | def check_directories(self): 53 | 54 | # Create log dir if it does not exist 55 | if not os.path.isdir(self.log_path): 56 | try: 57 | os.makedirs(self.log_path) 58 | except: 59 | print 'Unable to create logging directory: %s. Check permissions?' \ 60 | % self.log_path 61 | sys.exit(2) 62 | 63 | # Create export dir if it does not exist 64 | if not os.path.isdir(self.export_path): 65 | try: 66 | os.makedirs(self.export_path) 67 | except: 68 | e = sys.exc_info()[0] 69 | print 'Unable to create export directory: %s. Check permissions?' \ 70 | % self.export_path 71 | sys.exit(2) 72 | 73 | def initialize_logger(self): 74 | 75 | # Invoke logging with a concurrent logging module since many of these 76 | # processes will likely be writing to scan.log at the same time 77 | self.dbg_h = logging.getLogger('dbg_log') 78 | dbglog = '%s/%s' % (self.log_path, 'dbg.log') 79 | dbg_rotateHandler = ConcurrentRotatingFileHandler(dbglog, "a") 80 | self.dbg_h.addHandler(dbg_rotateHandler) 81 | self.dbg_h.setLevel(logging.ERROR) 82 | 83 | self.scan_h = logging.getLogger('scan_log') 84 | scanlog = '%s/%s' % (self.log_path, 'scan.log') 85 | scan_rotateHandler = ConcurrentRotatingFileHandler(scanlog, "a") 86 | self.scan_h.addHandler(scan_rotateHandler) 87 | self.scan_h.setLevel(logging.INFO) 88 | 89 | def check_yara_file(self): 90 | 91 | # Ensure Yara rule file exists before proceeding 92 | if not os.path.isfile(self.yara_rule_path): 93 | self.dbg_h.error('%s Could not load Yara rule file. File %s, does not exist!' 
% (dt.now(), self.yara_rule_path)) 94 | sys.exit(2) 95 | 96 | def scan_file(self): 97 | return processor.scan_file(self) 98 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_cab.yara: -------------------------------------------------------------------------------- 1 | rule ft_cab 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20150723" 7 | desc = "File magic for CABs (Microsoft Cabinet Files)" 8 | 9 | strings: 10 | $cab = { 4D 53 43 46 } 11 | 12 | condition: 13 | $cab at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_elf.yara: -------------------------------------------------------------------------------- 1 | rule ft_elf 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20160121" 7 | desc = "File magic for ELF files" 8 | 9 | strings: 10 | $magic = { 7f 45 4c 46 } 11 | 12 | condition: 13 | $magic at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_exe.yara: -------------------------------------------------------------------------------- 1 | rule ft_exe 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20141217" 7 | desc = "Simple signature to trigger on PE files." 8 | 9 | strings: 10 | $mz = "MZ" 11 | 12 | condition: 13 | $mz at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_gzip.yara: -------------------------------------------------------------------------------- 1 | rule ft_gzip 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20151116" 7 | desc = "Trigger on magic of GZip compressed files" 8 | 9 | strings: 10 | $magic = { 1f 8b 08 } 11 | 12 | condition: 13 | $magic at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_jar.yara: -------------------------------------------------------------------------------- 1 | rule ft_jar 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20150810" 7 | desc = "Signature to detect JAR files" 8 | 9 | strings: 10 | $pk_header = { 50 4B 03 04 } 11 | $jar = "META-INF/MANIFEST.MF" 12 | 13 | condition: 14 | $pk_header at 0 and $jar 15 | } 16 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_java_class.yara: -------------------------------------------------------------------------------- 1 | rule ft_java_class 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20160126" 7 | desc = "File magic for detecting a Java bytecode file." 8 | 9 | strings: 10 | $class = { CA FE BA BE } 11 | 12 | condition: 13 | $class at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_macho.yara: -------------------------------------------------------------------------------- 1 | rule ft_macho 2 | { 3 | meta: 4 | author = "Jamie Ford" 5 | company = "BroEZ" 6 | lastmod = "September 5 2016" 7 | desc = "Signature to trigger on mach-o file format." 
8 | 9 | strings: 10 | $MH_CIGAM_64 = { CF FA ED FE } 11 | $MH_MAGIC_64 = { FE ED FA CF } 12 | $MH_MAGIC_32 = { FE ED FA CE } 13 | $MH_CIGAM_32 = { CE FA ED FE } 14 | $FAT_MAGIC = { CA FE BA BE } 15 | $FAT_CIGAM = { BE BA FE CA } 16 | 17 | condition: 18 | ($MH_CIGAM_64 at 0) or ($MH_MAGIC_64 at 0) or ($MH_CIGAM_32 at 0) or ($MH_MAGIC_32 at 0) or ($FAT_MAGIC at 0) or ($FAT_CIGAM at 0) 19 | } -------------------------------------------------------------------------------- /fsf-server/yara/ft_office_open_xml.yara: -------------------------------------------------------------------------------- 1 | // References: 2 | // http://www.garykessler.net/library/file_sigs.html 3 | // https://issues.apache.org/jira/browse/TIKA-257 4 | 5 | rule ft_office_open_xml 6 | { 7 | meta: 8 | author = "Jason Batchelor" 9 | company = "Emerson" 10 | lastmod = "20140915" 11 | desc = "Simple metadata attribute indicative of Office Open XML format. Commonly seen in modern office files." 12 | 13 | strings: 14 | $OOXML = "[Content_Types].xml" 15 | 16 | condition: 17 | $OOXML at 30 18 | } 19 | 20 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_ole_cf.yara: -------------------------------------------------------------------------------- 1 | rule ft_ole_cf 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20141202" 7 | desc = "Detect file magic indicative of OLE CF files (commonly used by early versions of MS Office)." 8 | 9 | strings: 10 | $magic = { D0 CF 11 E0 A1 B1 1A E1 } 11 | 12 | condition: 13 | $magic at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_pdf.yara: -------------------------------------------------------------------------------- 1 | rule ft_pdf 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20141230" 7 | desc = "Signature to trigger on PDF file magic." 8 | 9 | strings: 10 | $pdf = "%PDF" 11 | 12 | condition: 13 | $pdf in (0 .. 
1024) 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_rar.yara: -------------------------------------------------------------------------------- 1 | rule ft_rar 2 | { 3 | meta: 4 | author = "James Ferrer" 5 | company = "Emerson" 6 | lastmod = "20150107" 7 | desc = "File type signature for basic .rar files" 8 | 9 | strings: 10 | $Rar = {52 61 72 21 1A 07} 11 | 12 | condition: 13 | 14 | $Rar at 0 15 | } 16 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_rtf.yara: -------------------------------------------------------------------------------- 1 | rule ft_rtf 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20141204" 7 | desc = "Hit on RTF files by triggering on RTF file magic" 8 | 9 | strings: 10 | $rtf = { 7B 5C 72 74 66 } 11 | 12 | condition: 13 | $rtf at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_swf.yara: -------------------------------------------------------------------------------- 1 | rule ft_swf_cws 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20150318" 7 | desc = "File type signature for regular compressed SWF files" 8 | 9 | strings: 10 | $cws = "CWS" 11 | 12 | condition: 13 | $cws at 0 14 | } 15 | 16 | rule ft_swf_fws 17 | { 18 | meta: 19 | author = "Jason Batchelor" 20 | company = "Emerson" 21 | lastmod = "20150318" 22 | desc = "File type signature for basic SWF files." 23 | 24 | strings: 25 | $fws = "FWS" 26 | 27 | condition: 28 | $fws at 0 29 | } 30 | 31 | rule ft_swf_zws 32 | { 33 | meta: 34 | author = "Jason Batchelor" 35 | company = "Emerson" 36 | lastmod = "20150318" 37 | desc = "File type signature for SWF files compressed with LZMA compression, uncommonly observed" 38 | 39 | strings: 40 | $zws = "ZWS" 41 | 42 | condition: 43 | $zws at 0 44 | } 45 | 46 | rule ft_swf 47 | { 48 | condition: 49 | ft_swf_zws or ft_swf_fws or ft_swf_cws 50 | } 51 | 52 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_tar.yara: -------------------------------------------------------------------------------- 1 | rule ft_tar 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20151116" 7 | desc = "Signature to detect on TAR archive files" 8 | 9 | strings: 10 | $magic = { 75 73 74 61 72 } 11 | 12 | condition: 13 | $magic at 257 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/ft_zip.yara: -------------------------------------------------------------------------------- 1 | rule ft_zip 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20141217" 7 | desc = "File type signature for basic ZIP files." 
8 | 9 | strings: 10 | $pk = { 50 4B 03 04 } 11 | 12 | condition: 13 | $pk at 0 14 | } 15 | -------------------------------------------------------------------------------- /fsf-server/yara/misc_compressed_exe.yara: -------------------------------------------------------------------------------- 1 | // Spec reference: http://forensicswiki.org/wiki/RAR#Format 2 | rule compressed_exe_in_rar 3 | { 4 | meta: 5 | author = "Jason Batchelor" 6 | company = "Emerson" 7 | lastmod = "20150813" 8 | desc = "Detect on evidence of a compressed executable within a RAR" 9 | 10 | strings: 11 | $rar = { 52 61 72 21 1A 07 00 } 12 | $file_header_part = { 74 [12] ( 00 | 01 | 02 | 03 | 04 | 05 ) [9] ( 30 | 31 | 32 | 33 | 34 | 35 ) } 13 | $exe_ext = ".exe" 14 | 15 | condition: 16 | $rar at 0 and for any r in (1..#file_header_part): 17 | // see if .exe is within the offset of the file archive header and however long the file name size is 18 | // file name begins 30 bytes away from start of header 19 | // file name size is specified 24 bytes from the start 20 | // limitation is if the HIGH_PACK_SIZE or HIGH_UNP_SIZE optional values are set, accuracy will be affected 21 | ($exe_ext in (@file_header_part[r] + 30..@file_header_part[r] + 30 + uint16(@file_header_part[r] + 24))) 22 | } 23 | 24 | // Spec reference: https://en.wikipedia.org/wiki/Zip_(file_format)#File_headers 25 | rule compressed_exe_in_zip 26 | { 27 | meta: 28 | author = "Jason Batchelor" 29 | company = "Emerson" 30 | lastmod = "20150813" 31 | desc = "Detect on evidence of a compressed executable within a ZIP" 32 | 33 | strings: 34 | $pk = { 50 4B 03 04 } 35 | $exe_ext = ".exe" 36 | 37 | condition: 38 | $pk at 0 and for any p in (1..#pk): 39 | // see if .exe is within the offset of the local file header and however long the file name size is 40 | // file name begins 30 bytes away from the start of the local file header 41 | // file name size is specified 26 bytes from the start 42 | ($exe_ext in (@pk[p] + 30..@pk[p] + 30 + uint16(@pk[p] + 26))) 43 | } 44 | 45 | rule misc_compressed_exe 46 | { 47 | condition: 48 | compressed_exe_in_zip or compressed_exe_in_rar 49 | } 50 | 51 | -------------------------------------------------------------------------------- /fsf-server/yara/misc_hexascii_pe_in_html.yara: -------------------------------------------------------------------------------- 1 | /* 2 | Example target... 3 | 4 | 5 | 6 | 7 | ... 8 | 9 | 10 | 25 | 26 | Source: http://pastebin.com/raw/mkDzzjEv 27 | */ 28 | rule misc_hexascii_pe_in_html : encoding html suspicious 29 | { 30 | meta: 31 | author = "Jason Batchelor" 32 | created = "2016-03-02" 33 | modified = "2016-03-02" 34 | university = "Carnegie Mellon University" 35 | description = "Detect on presence of hexascii encoded executable inside scripted code section of html file" 36 | 37 | strings: 38 | $html_start = "<html>" ascii nocase // HTML tags 39 | $html_end = "</html>" ascii nocase 40 | $mz = "4d5a" ascii nocase // MZ header constant 41 | $pe = "50450000" ascii nocase // PE header constant 42 | 43 | condition: 44 | all of ($html*) and $pe in (@mz[1] .. filesize) 45 | } 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | -------------------------------------------------------------------------------- /fsf-server/yara/misc_no_dosmode_header.yara: -------------------------------------------------------------------------------- 1 | // Source: http://yara.readthedocs.org/en/v3.4.0/writingrules.html#conditions 2 | private rule ft_strict_exe 3 | { 4 | condition: 5 | // MZ signature at offset 0 and 
6 | uint16(0) == 0x5A4D and 7 | // ... PE signature at offset stored in MZ header at 0x3C 8 | uint32(uint32(0x3C)) == 0x00004550 9 | } 10 | 11 | /* 12 | Example target... 13 | 00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 |MZ..............| 14 | 00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 |........@.......| 15 | 00000020 20 20 20 20 00 00 00 00 00 00 00 00 00 00 00 00 |    ............| 16 | 00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 |................| 17 | 00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 18 | 00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 19 | 00000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 20 | 00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 21 | 00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 22 | 00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 23 | 000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 24 | 000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 25 | 000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 26 | 000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 27 | 000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 28 | 000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 29 | 00000100 50 45 00 00 4c 01 03 00 bc 7c b1 47 00 00 00 00 |PE..L....|.G....| 30 | 00000110 00 00 00 00 e0 00 0f 01 0b 01 07 04 00 e0 00 00 |................| 31 | */ 32 | 33 | rule misc_no_dosmode_header : suspicious 34 | { 35 | meta: 36 | author = "Jason Batchelor" 37 | created = "2016-03-02" 38 | modified = "2016-03-02" 39 | university = "Carnegie Mellon University" 40 | description = "Detect on absence of 'DOS Mode' header between MZ and PE boundaries" 41 | 42 | strings: 43 | $dosmode = "This program cannot be run in DOS mode." 44 | 45 | condition: 46 | // (0x3C .. (uint32(0x3C))) = between end of MZ and start of PE headers 47 | // 0x3C = e_lfanew = offset of PE header 48 | ft_strict_exe and not $dosmode in (0x3C .. 
(uint32(0x3C))) 49 | } 50 | 51 | 52 | -------------------------------------------------------------------------------- /fsf-server/yara/misc_ooxml_core_properties.yara: -------------------------------------------------------------------------------- 1 | rule misc_ooxml_core_properties 2 | { 3 | meta: 4 | author = "Jason Batchelor" 5 | company = "Emerson" 6 | lastmod = "20150505" 7 | desc = "Identify meta xml content within OOXML documents" 8 | 9 | strings: 10 | $xml = " 0 13 | } 14 | -------------------------------------------------------------------------------- /fsf-server/yara/misc_upx_packed_binary.yara: -------------------------------------------------------------------------------- 1 | import "pe" 2 | 3 | rule misc_upx_packed_binary 4 | { 5 | meta: 6 | author = "Jason Batchelor" 7 | company = "Emerson" 8 | lastmod = "20150520" 9 | desc = "Detect section names indicative of UPX packed PE files" 10 | 11 | condition: 12 | (pe.sections[0].name == "UPX0" and pe.sections[1].name == "UPX1") 13 | } 14 | -------------------------------------------------------------------------------- /fsf-server/yara/rules.yara: -------------------------------------------------------------------------------- 1 | // File Magic Signatures 2 | include "ft_exe.yara" 3 | include "ft_rar.yara" 4 | include "ft_zip.yara" 5 | include "ft_pdf.yara" 6 | include "ft_ole_cf.yara" 7 | include "ft_swf.yara" 8 | include "ft_office_open_xml.yara" 9 | include "ft_rtf.yara" 10 | include "ft_tar.yara" 11 | include "ft_gzip.yara" 12 | include "ft_jar.yara" 13 | include "ft_cab.yara" 14 | include "ft_elf.yara" 15 | include "ft_java_class.yara" 16 | include "ft_macho.yara" 17 | 18 | // Misc Signatures 19 | include "misc_ooxml_core_properties.yara" 20 | include "misc_compressed_exe.yara" 21 | include "misc_upx_packed_binary.yara" 22 | include "misc_pe_signature.yara" 23 | include "misc_hexascii_pe_in_html.yara" 24 | include "misc_no_dosmode_header.yara" 25 | --------------------------------------------------------------------------------
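
The rules.yara file above is the single entry point that the scanner loads via SCANNER_CONFIG['YARA_PATH'] (see scanner.py earlier in this listing); its include directives pull in every ft_* and misc_* signature in this directory. As a rough sketch only, assuming the yara-python bindings are installed, the snippet below shows how such an include-based ruleset can be compiled and run against a sample. The rules path and sample path are placeholders, and this is not the project's SCAN_YARA module.

```python
# Sketch only: compile the aggregate ruleset and print matching rule names.
# Assumptions: yara-python is installed; both paths below are placeholders.
import yara

# yara.compile() follows the 'include' directives in rules.yara by default,
# so the file-type and misc signatures above are compiled together.
rules = yara.compile(filepath='fsf-server/yara/rules.yara')

# Each match exposes the rule name, tags, and meta fields (author, desc, ...).
for match in rules.match('/tmp/sample.bin'):
    print('%s: %s' % (match.rule, match.meta.get('desc', '')))
```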
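
The offset arithmetic commented in misc_compressed_exe.yara above (file name length read at +26 and the name field starting at +30 of a ZIP local file header; +24 and +30 relative to the matched RAR file-header bytes) can be sanity-checked outside YARA. The sketch below mirrors only the ZIP half of that logic in plain Python; the function name and the in-memory buffer are illustrative and not part of FSF.

```python
import struct

def zip_member_names_with_exe(buf):
    # Mirror compressed_exe_in_zip: for each ZIP local file header
    # (50 4B 03 04), read the 2-byte file name length at offset 26 and
    # look for the literal '.exe' in the name field starting at offset 30.
    hits = []
    offset = buf.find(b'PK\x03\x04')
    while offset != -1:
        if offset + 30 > len(buf):
            break
        (name_len,) = struct.unpack_from('<H', buf, offset + 26)
        name = buf[offset + 30:offset + 30 + name_len]
        if b'.exe' in name:
            hits.append(name)
        offset = buf.find(b'PK\x03\x04', offset + 4)
    return hits
```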