├── .dockerignore ├── .github └── workflows │ └── ci.yml ├── .gitignore ├── CHANGELOG ├── COMPAT.md ├── LICENSE ├── MANIFEST.in ├── README.md ├── magic ├── __init__.py ├── __init__.pyi ├── compat.py ├── loader.py └── py.typed ├── ruff.toml ├── setup.cfg ├── setup.py ├── stdeb.cfg ├── test ├── README ├── __init__.py ├── docker │ ├── alpine │ ├── archlinux │ ├── bionic │ ├── centos7 │ ├── centos8 │ ├── focal │ └── xenial ├── libmagic_test.py ├── python_magic_test.py ├── run_all_docker_test.sh └── testdata │ ├── elf-NetBSD-x86_64-echo │ ├── keep-going.jpg │ ├── lambda │ ├── magic._pyc_ │ ├── name_use.jpg │ ├── pgpunicode │ ├── test.gz │ ├── test.json │ ├── test.pdf │ ├── test.snappy.parquet │ ├── text-iso8859-1.txt │ └── text.txt ├── tox.ini └── upload.sh /.dockerignore: -------------------------------------------------------------------------------- 1 | .gitignore -------------------------------------------------------------------------------- /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | name: ci 2 | on: [push, pull_request] 3 | jobs: 4 | ci: 5 | strategy: 6 | fail-fast: false 7 | matrix: 8 | os: ["ubuntu-latest"] 9 | python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13"] 10 | include: 11 | - os: macos-latest 12 | python-version: "3.13" 13 | # - os: windows-latest # TODO: Fix the Windows test that runs in an infinite loop 14 | # python-version: '3.13' 15 | runs-on: ${{ matrix.os }} 16 | steps: 17 | - uses: actions/checkout@v4 18 | - uses: actions/setup-python@v5 19 | with: 20 | python-version: ${{ matrix.python-version }} 21 | allow-prereleases: true 22 | - run: pip install --upgrade pip 23 | - run: pip install --upgrade pytest 24 | - run: pip install --editable . 
25 | - if: runner.os == 'macOS' 26 | run: brew install libmagic 27 | - if: runner.os == 'Windows' 28 | run: pip install python-magic-bin 29 | - run: LC_ALL=en_US.UTF-8 pytest 30 | shell: bash 31 | timeout-minutes: 15 # Limit Windows infinite loop. 32 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .coverage* 2 | .tox/ 3 | bin/ 4 | deb_dist 5 | htmlcov/ 6 | lib/ 7 | **/__pycache__ 8 | python_magic.egg-info 9 | pip-selfcheck.json 10 | pyvenv.cfg 11 | *.pyc 12 | *~ 13 | dist/ 14 | .vscode/ 15 | -------------------------------------------------------------------------------- /CHANGELOG: -------------------------------------------------------------------------------- 1 | Changes to 0.4.29: 2 | 3 | - support MAGIC_SYMLINK (via follow_symlink flag on Magic constructor) 4 | - correctly throw FileNotFoundException depending on flag 5 | 6 | Changes to 0.4.28: 7 | 8 | - support "magic-1.dll" on Windows, which is produced by vcpkg 9 | - add python 3.10 to tox config 10 | - update test for upstream gzip extensions 11 | 12 | Changes to 0.4.27: 13 | 14 | - remove spurious pyproject.toml that breaks source builds 15 | 16 | Changes to 0.4.26: 17 | 18 | - Use tox for all multi-version testing 19 | - Fix use of pytest, use it via tox 20 | 21 | Changes to 0.4.25: 22 | 23 | - Support os.PathLike values in Magic.from_file and magic.from_file 24 | - Handle some versions of libmagic that return mime string without charset 25 | - Fix tests for file 5.41 26 | - Include typing stub in package 27 | 28 | Changes to 0.4.24: 29 | 30 | - Fix regression in library loading on some Alpine docker images. 
31 | 32 | Changes to 0.4.23 33 | 34 | - Include a `py.typed` sentinel to enable type checking 35 | - Improve fix for attribute error during destruction 36 | - Cleanup library loading logic 37 | - Add new homebrew library dir for OSX 38 | 39 | Changes to 0.4.21, 0.4.22 40 | 41 | - Unify dll loader between the standard and compat library, fixing load 42 | failures on some previously supported platforms. 43 | 44 | Changes to 0.4.20 45 | 46 | - merge in a compatibility layer for the upstream libmagic python binding. 47 | Since both this package and that one are called 'magic', this compat layer 48 | removes a very common source of runtime errors. Use of that libmagic API will 49 | produce a deprecation warning. 50 | 51 | - support python 3.9 in tests and pypi metadata 52 | 53 | - add support for magic_descriptor functions, which take a file descriptor 54 | rather than a filename. 55 | 56 | - sometimes the returned description includes snippets of the file, e.g a title 57 | for MS Word docs. Since this is in an unknown encoding, we would throw a 58 | unicode decode error trying to decode. Now, it decodes with 59 | 'backslashreplace' to handle this more gracefully. The undecodable characters 60 | are replaced with hex escapes. 61 | 62 | - add support for MAGIC_EXTENSION, to return possible file extensions. 63 | 64 | - add mypy typing stubs file, for type checking 65 | 66 | Changes in 0.4.18 67 | 68 | - Make bindings for magic\_[set|get]param optional, and throw NotImplementedError 69 | if they are used but not supported. Only call setparam() in the constructor if 70 | it's supported. This prevents breakage on CentOS7 which uses an old version of 71 | libmagic. 72 | 73 | - Add tests for CentOS 7 & 8 74 | 75 | Changes in 0.4.16 and 0.4.17 76 | 77 | - add MAGIC_MIME_TYPE constant, use that in preference to MAGIC_MIME internally. 78 | This sets up for a breaking change in a future major version bump where 79 | MAGIC_MIME will change to match magic.h. 
80 | - add magic.version() function to return library version 81 | - add setparam/getparam to control internal behavior 82 | - increase internal limits with setparam to prevent spurious error on some jpeg files 83 | - various setup.py improvements to declare modern python support 84 | - support MSYS2 magic dlls 85 | - fix warning about using 'is' on an int in python 3.8 86 | - include tests in source distribution 87 | 88 | - many test improvements: 89 | -- tox runner support 90 | -- remove deprecated test_suite field from setup.py 91 | -- docker tests that cover all LTS ubuntu versions 92 | -- add test for snappy file identification 93 | 94 | - doc improvements 95 | -- document dependency install process for debian 96 | -- various typos 97 | -- document test running process 98 | -------------------------------------------------------------------------------- /COMPAT.md: -------------------------------------------------------------------------------- 1 | There are two python modules named 'magic' that do the same thing, but 2 | with incompatible APIs. One of these ships with libmagic, and the other (this one) is 3 | distributed through PyPI. Both have been around for many years and have 4 | substantial user bases. This incompatibility is a major source of pain for 5 | users, and of bug reports for me. 6 | 7 | To mitigate this pain, python-magic has added a compatibility layer that exports 8 | the libmagic python API in parallel with the existing one.
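As a rough illustration of how such a layer can sit beside the native API, the sketch below wraps a function so that every call emits a `PendingDeprecationWarning` before delegating. It is modelled on the `deprecation_wrapper` helper in `magic/__init__.py`, but it is only a sketch: `detect_from_content_stub` is a hypothetical stand-in that never touches libmagic, and `functools.wraps` and `stacklevel` are additions made here for illustration.

```python
import warnings
from functools import wraps


def deprecation_wrapper(fn):
    """Return a wrapper that warns, then delegates to fn unchanged."""

    @wraps(fn)
    def wrapped(*args, **kwargs):
        warnings.warn(
            "Using compatibility mode with libmagic's python binding.",
            PendingDeprecationWarning,
            stacklevel=2,
        )
        return fn(*args, **kwargs)

    return wrapped


# Hypothetical stand-in for a libmagic-style function; it does not call libmagic.
def detect_from_content_stub(data):
    return "stub result for %d bytes" % len(data)


compat_detect = deprecation_wrapper(detect_from_content_stub)

# Record warnings so the effect of the wrapper is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = compat_detect(b"%PDF-1.2")

print(result)
print([w.category.__name__ for w in caught])
```

Because the wrapper only warns and forwards its arguments, code written against the libmagic binding keeps working while its authors are nudged toward the native functions.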
9 | 10 | The mapping between the libmagic and python-magic functions is: 11 | 12 | detect_from_filename => from_file 13 | detect_from_content => from_buffer 14 | detect_from_fobj => from_descriptor(f.fileno()) 15 | open => Magic() 16 | 17 | 18 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2001-2014 Adam Hupp 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | 23 | 24 | ==== 25 | 26 | Portions of this package (magic/compat.py and test/libmagic_test.py) 27 | are distributed under the following copyright notice: 28 | 29 | 30 | $File: LEGAL.NOTICE,v 1.15 2006/05/03 18:48:33 christos Exp $ 31 | Copyright (c) Ian F. Darwin 1986, 1987, 1989, 1990, 1991, 1992, 1994, 1995. 32 | Software written by Ian F. Darwin and others; 33 | maintained 1994- Christos Zoulas. 
34 | 35 | This software is not subject to any export provision of the United States 36 | Department of Commerce, and may be exported to any country or planet. 37 | 38 | Redistribution and use in source and binary forms, with or without 39 | modification, are permitted provided that the following conditions 40 | are met: 41 | 1. Redistributions of source code must retain the above copyright 42 | notice immediately at the beginning of the file, without modification, 43 | this list of conditions, and the following disclaimer. 44 | 2. Redistributions in binary form must reproduce the above copyright 45 | notice, this list of conditions and the following disclaimer in the 46 | documentation and/or other materials provided with the distribution. 47 | 48 | THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND 49 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 50 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 51 | ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR 52 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 53 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 54 | OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 55 | HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 56 | LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 57 | OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 58 | SUCH DAMAGE. 
59 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include *.py 2 | include LICENSE 3 | graft tests 4 | global-exclude __pycache__ 5 | global-exclude *.py[co] 6 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # python-magic 2 | [![PyPI version](https://badge.fury.io/py/python-magic.svg)](https://badge.fury.io/py/python-magic) 3 | [![ci](https://github.com/ahupp/python-magic/actions/workflows/ci.yml/badge.svg)](https://github.com/ahupp/python-magic/actions/workflows/ci.yml) 4 | [![Join the chat at https://gitter.im/ahupp/python-magic](https://badges.gitter.im/ahupp/python-magic.svg)](https://gitter.im/ahupp/python-magic?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) 5 | 6 | python-magic is a Python interface to the libmagic file type 7 | identification library. libmagic identifies file types by checking 8 | their headers according to a predefined list of file types. This 9 | functionality is exposed to the command line by the Unix command 10 | `file`. 11 | 12 | ## Usage 13 | 14 | ```python 15 | >>> import magic 16 | >>> magic.from_file("testdata/test.pdf") 17 | 'PDF document, version 1.2' 18 | # recommend using at least the first 2048 bytes, as less can produce incorrect identification 19 | >>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048)) 20 | 'PDF document, version 1.2' 21 | >>> magic.from_file("testdata/test.pdf", mime=True) 22 | 'application/pdf' 23 | ``` 24 | 25 | There is also a `Magic` class that provides more direct control, 26 | including overriding the magic database file and turning on character 27 | encoding detection. This is not recommended for general use. 
In 28 | particular, it's not safe for sharing across multiple threads and 29 | will throw if this is attempted. 30 | 31 | ```python 32 | >>> f = magic.Magic(uncompress=True) 33 | >>> f.from_file('testdata/test.gz') 34 | 'ASCII text (gzip compressed data, was "test", last modified: Sat Jun 28 35 | 21:32:52 2008, from Unix)' 36 | ``` 37 | 38 | You can also combine the flag options: 39 | 40 | ```python 41 | >>> f = magic.Magic(mime=True, uncompress=True) 42 | >>> f.from_file('testdata/test.gz') 43 | 'text/plain' 44 | ``` 45 | 46 | ## Installation 47 | 48 | The current stable version of python-magic is available on PyPI and 49 | can be installed by running `pip install python-magic`. 50 | 51 | Other sources: 52 | 53 | - PyPI: http://pypi.python.org/pypi/python-magic/ 54 | - GitHub: https://github.com/ahupp/python-magic 55 | 56 | This module is a simple wrapper around the libmagic C library, which 57 | must be installed as well: 58 | 59 | ### Debian/Ubuntu 60 | 61 | ``` 62 | sudo apt-get install libmagic1 63 | ``` 64 | 65 | ### OSX 66 | 67 | - When using Homebrew: `brew install libmagic` 68 | - When using macports: `port install file` 69 | 70 | If python-magic fails to load the library, it may be in a non-standard location, in which case you can set the environment variable `DYLD_LIBRARY_PATH` to point to it. 71 | 72 | ### SmartOS: 73 | - Install libmagic from source https://github.com/threatstack/libmagic/ 74 | - Depending on your ./configure --prefix settings set your LD_LIBRARY_PATH to /lib 75 | 76 | ### Troubleshooting 77 | 78 | - 'MagicException: could not find any magic files!': some 79 | installations of libmagic do not correctly point to their magic 80 | database file. Try specifying the path to the file explicitly in the 81 | constructor: `magic.Magic(magic_file="path_to_magic_file")`.
82 | 83 | - 'WindowsError: [Error 193] %1 is not a valid Win32 application': 84 | Attempting to run the 32-bit libmagic DLL in a 64-bit build of 85 | python will fail with this error. Here are 64-bit builds of libmagic for windows: https://github.com/pidydx/libmagicwin64. 86 | Newer version can be found here: https://github.com/nscaife/file-windows. 87 | 88 | - 'WindowsError: exception: access violation writing 0x00000000 ' This may indicate you are mixing 89 | Windows Python and Cygwin Python. Make sure your libmagic and python builds are consistent. 90 | 91 | 92 | ## Bug Reports 93 | 94 | python-magic is a thin layer over the libmagic C library. 95 | Historically, most bugs that have been reported against python-magic 96 | are actually bugs in libmagic; libmagic bugs can be reported on their 97 | tracker here: https://bugs.astron.com/my_view_page.php. If you're not 98 | sure where the bug lies feel free to file an issue on GitHub and I can 99 | triage it. 100 | 101 | ## Running the tests 102 | 103 | We use the `tox` test runner which can be installed with `python -m pip install tox`. 104 | 105 | To run tests locally across all available python versions: 106 | 107 | ``` 108 | python -m tox 109 | ``` 110 | 111 | Or to run just against a single version: 112 | 113 | ``` 114 | python -m tox py 115 | ``` 116 | To run the tests across a variety of linux distributions (depends on Docker): 117 | 118 | ``` 119 | ./test/run_all_docker_test.sh 120 | ``` 121 | 122 | ## libmagic python API compatibility 123 | 124 | The python bindings shipped with libmagic use a module name that conflicts with this package. To work around this, python-magic includes a compatibility layer for the libmagic API. See [COMPAT.md](COMPAT.md) for a guide to libmagic / python-magic compatibility. 125 | 126 | ## Versioning 127 | 128 | Minor version bumps should be backwards compatible. Major bumps are not. 
129 | 130 | ## Author 131 | 132 | Written by Adam Hupp in 2001 for a project that never got off the 133 | ground. It originally used SWIG for the C library bindings, but 134 | switched to ctypes once that was part of the python standard library. 135 | 136 | You can contact me via my [website](http://hupp.org/adam) or 137 | [GitHub](http://github.com/ahupp). 138 | 139 | ## License 140 | 141 | python-magic is distributed under the MIT license. See the included 142 | LICENSE file for details. 143 | 144 | I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook). 145 | -------------------------------------------------------------------------------- /magic/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | magic is a wrapper around the libmagic file identification library. 3 | 4 | See README for more information. 5 | 6 | Usage: 7 | 8 | >>> import magic 9 | >>> magic.from_file("testdata/test.pdf") 10 | 'PDF document, version 1.2' 11 | >>> magic.from_file("testdata/test.pdf", mime=True) 12 | 'application/pdf' 13 | >>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048)) 14 | 'PDF document, version 1.2' 15 | >>> 16 | 17 | """ 18 | 19 | import sys 20 | import os 21 | import glob 22 | import ctypes 23 | import ctypes.util 24 | import threading 25 | import logging 26 | 27 | from ctypes import c_char_p, c_int, c_size_t, c_void_p, byref, POINTER 28 | 29 | 30 | class MagicException(Exception): 31 | def __init__(self, message): 32 | super(MagicException, self).__init__(message) 33 | self.message = message 34 | 35 | 36 | class Magic: 37 | """ 38 | Magic is a wrapper around the libmagic C library.
39 | """ 40 | 41 | def __init__( 42 | self, 43 | mime=False, 44 | magic_file=None, 45 | mime_encoding=False, 46 | keep_going=False, 47 | uncompress=False, 48 | raw=False, 49 | extension=False, 50 | follow_symlinks=False, 51 | check_tar=True, 52 | check_soft=True, 53 | check_apptype=True, 54 | check_elf=True, 55 | check_text=True, 56 | check_cdf=True, 57 | check_csv=True, 58 | check_encoding=True, 59 | check_json=True, 60 | check_simh=True, 61 | ): 62 | """ 63 | Create a new libmagic wrapper. 64 | 65 | mime - if True, mimetypes are returned instead of textual descriptions 66 | mime_encoding - if True, codec is returned 67 | magic_file - use a mime database other than the system default 68 | keep_going - don't stop at the first match, keep going 69 | uncompress - Try to look inside compressed files. 70 | raw - Do not try to decode "non-printable" chars. 71 | extension - Print a slash-separated list of valid extensions for the file type found. 72 | """ 73 | self.flags = MAGIC_NONE 74 | if mime: 75 | self.flags |= MAGIC_MIME_TYPE 76 | if mime_encoding: 77 | self.flags |= MAGIC_MIME_ENCODING 78 | if keep_going: 79 | self.flags |= MAGIC_CONTINUE 80 | if uncompress: 81 | self.flags |= MAGIC_COMPRESS 82 | if raw: 83 | self.flags |= MAGIC_RAW 84 | if extension: 85 | self.flags |= MAGIC_EXTENSION 86 | 87 | if follow_symlinks: 88 | self.flags |= MAGIC_SYMLINK 89 | 90 | if not check_tar: 91 | self.flags |= MAGIC_NO_CHECK_TAR 92 | if not check_soft: 93 | self.flags |= MAGIC_NO_CHECK_SOFT 94 | if not check_apptype: 95 | self.flags |= MAGIC_NO_CHECK_APPTYPE 96 | if not check_elf: 97 | self.flags |= MAGIC_NO_CHECK_ELF 98 | if not check_text: 99 | self.flags |= MAGIC_NO_CHECK_TEXT 100 | if not check_cdf: 101 | self.flags |= MAGIC_NO_CHECK_CDF 102 | if not check_csv: 103 | self.flags |= MAGIC_NO_CHECK_CSV 104 | if not check_encoding: 105 | self.flags |= MAGIC_NO_CHECK_ENCODING 106 | if not check_json: 107 | self.flags |= MAGIC_NO_CHECK_JSON 108 | if not check_simh: 109 | self.flags 
|= MAGIC_NO_CHECK_SIMH 110 | 111 | self.cookie = magic_open(self.flags) 112 | self.lock = threading.Lock() 113 | 114 | magic_load(self.cookie, magic_file) 115 | 116 | # MAGIC_EXTENSION was added in 523 or 524, so bail if 117 | # it doesn't appear to be available 118 | if extension and (not _has_version or version() < 524): 119 | raise NotImplementedError( 120 | "MAGIC_EXTENSION is not supported in this version of libmagic" 121 | ) 122 | 123 | # For https://github.com/ahupp/python-magic/issues/190 124 | # libmagic has fixed internal limits that some files exceed, causing 125 | # an error. We can avoid this (at least for the sample file given) 126 | # by bumping the limit up. It's not clear if this is a general solution 127 | # or whether other internal limits should be increased, but given 128 | # the lack of other reports I'll assume this is rare. 129 | if _has_param: 130 | try: 131 | self.setparam(MAGIC_PARAM_NAME_MAX, 64) 132 | except MagicException as e: 133 | # some versions of libmagic fail this call, 134 | # so rather than fail hard just use default behavior 135 | pass 136 | 137 | def from_buffer(self, buf): 138 | """ 139 | Identify the contents of `buf` 140 | """ 141 | with self.lock: 142 | try: 143 | # if we're on python3, convert buf to bytes 144 | # otherwise this string is passed as wchar* 145 | # which is not what libmagic expects 146 | # NEXTBREAK: only take bytes 147 | if type(buf) == str and str != bytes: 148 | buf = buf.encode("utf-8", errors="replace") 149 | return maybe_decode(magic_buffer(self.cookie, buf)) 150 | except MagicException as e: 151 | return self._handle509Bug(e) 152 | 153 | def from_file(self, filename): 154 | # raise FileNotFoundException or IOError if the file does not exist 155 | os.stat(filename, follow_symlinks=self.flags & MAGIC_SYMLINK) 156 | 157 | with self.lock: 158 | try: 159 | return maybe_decode(magic_file(self.cookie, filename)) 160 | except MagicException as e: 161 | return self._handle509Bug(e) 162 | 163 | def 
from_descriptor(self, fd): 164 | with self.lock: 165 | try: 166 | return maybe_decode(magic_descriptor(self.cookie, fd)) 167 | except MagicException as e: 168 | return self._handle509Bug(e) 169 | 170 | def _handle509Bug(self, e): 171 | # libmagic 5.09 has a bug where it might fail to identify the 172 | # mimetype of a file and returns null from magic_file (and 173 | # likely _buffer), but also does not return an error message. 174 | if e.message is None and (self.flags & MAGIC_MIME_TYPE): 175 | return "application/octet-stream" 176 | else: 177 | raise e 178 | 179 | def setparam(self, param, val): 180 | return magic_setparam(self.cookie, param, val) 181 | 182 | def getparam(self, param): 183 | return magic_getparam(self.cookie, param) 184 | 185 | def __del__(self): 186 | # no _thread_check here because there can be no other 187 | # references to this object at this point. 188 | 189 | # during shutdown magic_close may have been cleared already so 190 | # make sure it exists before using it. 191 | 192 | # the self.cookie check should be unnecessary and was an 193 | # incorrect fix for a threading problem, however I'm leaving 194 | # it in because it's harmless and I'm slightly afraid to 195 | # remove it. 196 | if hasattr(self, "cookie") and self.cookie and magic_close: 197 | magic_close(self.cookie) 198 | self.cookie = None 199 | 200 | 201 | _instances = {} 202 | 203 | 204 | def _get_magic_type(mime): 205 | i = _instances.get(mime) 206 | if i is None: 207 | i = _instances[mime] = Magic(mime=mime) 208 | return i 209 | 210 | 211 | def from_file(filename, mime=False): 212 | """ 213 | Accepts a filename and returns the detected filetype. Return 214 | value is the mimetype if mime=True, otherwise a human readable 215 | name. 
216 | 217 | >>> magic.from_file("testdata/test.pdf", mime=True) 218 | 'application/pdf' 219 | """ 220 | m = _get_magic_type(mime) 221 | return m.from_file(filename) 222 | 223 | 224 | def from_buffer(buffer, mime=False): 225 | """ 226 | Accepts a binary string and returns the detected filetype. Return 227 | value is the mimetype if mime=True, otherwise a human readable 228 | name. 229 | 230 | >>> magic.from_buffer(open("testdata/test.pdf").read(1024)) 231 | 'PDF document, version 1.2' 232 | """ 233 | m = _get_magic_type(mime) 234 | return m.from_buffer(buffer) 235 | 236 | 237 | def from_descriptor(fd, mime=False): 238 | """ 239 | Accepts a file descriptor and returns the detected filetype. Return 240 | value is the mimetype if mime=True, otherwise a human readable 241 | name. 242 | 243 | >>> f = open("testdata/test.pdf") 244 | >>> magic.from_descriptor(f.fileno()) 245 | 'PDF document, version 1.2' 246 | """ 247 | m = _get_magic_type(mime) 248 | return m.from_descriptor(fd) 249 | 250 | 251 | from . import loader 252 | 253 | libmagic = loader.load_lib() 254 | 255 | magic_t = ctypes.c_void_p 256 | 257 | 258 | def errorcheck_null(result, func, args): 259 | if result is None: 260 | err = magic_error(args[0]) 261 | raise MagicException(err) 262 | else: 263 | return result 264 | 265 | 266 | def errorcheck_negative_one(result, func, args): 267 | if result == -1: 268 | err = magic_error(args[0]) 269 | raise MagicException(err) 270 | else: 271 | return result 272 | 273 | 274 | # return str on python3. 
Don't want to unconditionally 275 | # decode because that results in unicode on python2 276 | def maybe_decode(s): 277 | # NEXTBREAK: remove 278 | if str == bytes: 279 | return s 280 | else: 281 | # backslashreplace here because sometimes libmagic will return metadata in the charset 282 | # of the file, which is unknown to us (e.g the title of a Word doc) 283 | return s.decode("utf-8", "backslashreplace") 284 | 285 | 286 | try: 287 | from os import PathLike 288 | 289 | def unpath(filename): 290 | if isinstance(filename, PathLike): 291 | return filename.__fspath__() 292 | else: 293 | return filename 294 | except ImportError: 295 | 296 | def unpath(filename): 297 | return filename 298 | 299 | 300 | def coerce_filename(filename): 301 | if filename is None: 302 | return None 303 | 304 | filename = unpath(filename) 305 | 306 | # ctypes will implicitly convert unicode strings to bytes with 307 | # .encode('ascii'). If you use the filesystem encoding 308 | # then you'll get inconsistent behavior (crashes) depending on the user's 309 | # LANG environment variable 310 | # NEXTBREAK: remove 311 | is_unicode = (sys.version_info[0] <= 2 and isinstance(filename, unicode)) or ( 312 | sys.version_info[0] >= 3 and isinstance(filename, str) 313 | ) 314 | if is_unicode: 315 | return filename.encode("utf-8", "surrogateescape") 316 | else: 317 | return filename 318 | 319 | 320 | magic_open = libmagic.magic_open 321 | magic_open.restype = magic_t 322 | magic_open.argtypes = [c_int] 323 | 324 | magic_close = libmagic.magic_close 325 | magic_close.restype = None 326 | magic_close.argtypes = [magic_t] 327 | 328 | magic_error = libmagic.magic_error 329 | magic_error.restype = c_char_p 330 | magic_error.argtypes = [magic_t] 331 | 332 | magic_errno = libmagic.magic_errno 333 | magic_errno.restype = c_int 334 | magic_errno.argtypes = [magic_t] 335 | 336 | _magic_file = libmagic.magic_file 337 | _magic_file.restype = c_char_p 338 | _magic_file.argtypes = [magic_t, c_char_p] 339 | 
_magic_file.errcheck = errorcheck_null 340 | 341 | 342 | def magic_file(cookie, filename): 343 | return _magic_file(cookie, coerce_filename(filename)) 344 | 345 | 346 | _magic_buffer = libmagic.magic_buffer 347 | _magic_buffer.restype = c_char_p 348 | _magic_buffer.argtypes = [magic_t, c_void_p, c_size_t] 349 | _magic_buffer.errcheck = errorcheck_null 350 | 351 | 352 | def magic_buffer(cookie, buf): 353 | return _magic_buffer(cookie, buf, len(buf)) 354 | 355 | 356 | magic_descriptor = libmagic.magic_descriptor 357 | magic_descriptor.restype = c_char_p 358 | magic_descriptor.argtypes = [magic_t, c_int] 359 | magic_descriptor.errcheck = errorcheck_null 360 | 361 | _magic_descriptor = libmagic.magic_descriptor 362 | _magic_descriptor.restype = c_char_p 363 | _magic_descriptor.argtypes = [magic_t, c_int] 364 | _magic_descriptor.errcheck = errorcheck_null 365 | 366 | 367 | def magic_descriptor(cookie, fd): 368 | return _magic_descriptor(cookie, fd) 369 | 370 | 371 | _magic_load = libmagic.magic_load 372 | _magic_load.restype = c_int 373 | _magic_load.argtypes = [magic_t, c_char_p] 374 | _magic_load.errcheck = errorcheck_negative_one 375 | 376 | 377 | def magic_load(cookie, filename): 378 | return _magic_load(cookie, coerce_filename(filename)) 379 | 380 | 381 | magic_setflags = libmagic.magic_setflags 382 | magic_setflags.restype = c_int 383 | magic_setflags.argtypes = [magic_t, c_int] 384 | 385 | magic_check = libmagic.magic_check 386 | magic_check.restype = c_int 387 | magic_check.argtypes = [magic_t, c_char_p] 388 | 389 | magic_compile = libmagic.magic_compile 390 | magic_compile.restype = c_int 391 | magic_compile.argtypes = [magic_t, c_char_p] 392 | 393 | _has_param = False 394 | if hasattr(libmagic, "magic_setparam") and hasattr(libmagic, "magic_getparam"): 395 | _has_param = True 396 | _magic_setparam = libmagic.magic_setparam 397 | _magic_setparam.restype = c_int 398 | _magic_setparam.argtypes = [magic_t, c_int, POINTER(c_size_t)] 399 | _magic_setparam.errcheck = 
errorcheck_negative_one 400 | 401 | _magic_getparam = libmagic.magic_getparam 402 | _magic_getparam.restype = c_int 403 | _magic_getparam.argtypes = [magic_t, c_int, POINTER(c_size_t)] 404 | _magic_getparam.errcheck = errorcheck_negative_one 405 | 406 | 407 | def magic_setparam(cookie, param, val): 408 | if not _has_param: 409 | raise NotImplementedError("magic_setparam not implemented") 410 | v = c_size_t(val) 411 | return _magic_setparam(cookie, param, byref(v)) 412 | 413 | 414 | def magic_getparam(cookie, param): 415 | if not _has_param: 416 | raise NotImplementedError("magic_getparam not implemented") 417 | val = c_size_t() 418 | _magic_getparam(cookie, param, byref(val)) 419 | return val.value 420 | 421 | 422 | _has_version = False 423 | if hasattr(libmagic, "magic_version"): 424 | _has_version = True 425 | magic_version = libmagic.magic_version 426 | magic_version.restype = c_int 427 | magic_version.argtypes = [] 428 | 429 | 430 | def version(): 431 | if not _has_version: 432 | raise NotImplementedError("magic_version not implemented") 433 | return magic_version() 434 | 435 | 436 | MAGIC_NONE = 0x000000 # No flags 437 | MAGIC_DEBUG = 0x000001 # Turn on debugging 438 | MAGIC_SYMLINK = 0x000002 # Follow symlinks 439 | MAGIC_COMPRESS = 0x000004 # Check inside compressed files 440 | MAGIC_DEVICES = 0x000008 # Look at the contents of devices 441 | MAGIC_MIME_TYPE = 0x000010 # Return a mime string 442 | MAGIC_MIME_ENCODING = 0x000400 # Return the MIME encoding 443 | # TODO: should be 444 | # MAGIC_MIME = MAGIC_MIME_TYPE | MAGIC_MIME_ENCODING 445 | MAGIC_MIME = 0x000010 # Return a mime string 446 | MAGIC_EXTENSION = 0x1000000 # Return a /-separated list of extensions 447 | 448 | MAGIC_CONTINUE = 0x000020 # Return all matches 449 | MAGIC_CHECK = 0x000040 # Print warnings to stderr 450 | MAGIC_PRESERVE_ATIME = 0x000080 # Restore access time on exit 451 | MAGIC_RAW = 0x000100 # Don't translate unprintable chars 452 | MAGIC_ERROR = 0x000200 # Handle ENOENT etc as real 
errors 453 | 454 | MAGIC_NO_CHECK_COMPRESS = 0x001000 # Don't check for compressed files 455 | MAGIC_NO_CHECK_TAR = 0x002000 # Don't check for tar files 456 | MAGIC_NO_CHECK_SOFT = 0x004000 # Don't check magic entries 457 | MAGIC_NO_CHECK_APPTYPE = 0x008000 # Don't check application type 458 | MAGIC_NO_CHECK_ELF = 0x010000 # Don't check for elf details 459 | MAGIC_NO_CHECK_TEXT = 0x020000 # Don't check for ascii files 460 | MAGIC_NO_CHECK_ASCII = 0x020000 # Deprecated alias for MAGIC_NO_CHECK_TEXT 461 | MAGIC_NO_CHECK_TROFF = 0x040000 # Don't check ascii/troff (deprecated) 462 | MAGIC_NO_CHECK_FORTRAN = 0x080000 # Don't check ascii/fortran (deprecated) 463 | MAGIC_NO_CHECK_TOKENS = 0x100000 # Don't check ascii/tokens (deprecated) 464 | MAGIC_NO_CHECK_CDF = 0x0040000 # Don't check for CDF files 465 | MAGIC_NO_CHECK_CSV = 0x0080000 # Don't check for CSV files 466 | MAGIC_NO_CHECK_ENCODING = 0x0200000 # Don't check text encodings 467 | MAGIC_NO_CHECK_JSON = 0x0400000 # Don't check for JSON files 468 | MAGIC_NO_CHECK_SIMH = 0x0800000 # Don't check for SIMH tape files 469 | 470 | MAGIC_PARAM_INDIR_MAX = 0 # Recursion limit for indirect magic 471 | MAGIC_PARAM_NAME_MAX = 1 # Use count limit for name/use magic 472 | MAGIC_PARAM_ELF_PHNUM_MAX = 2 # Max ELF program headers processed 473 | MAGIC_PARAM_ELF_SHNUM_MAX = 3 # Max ELF sections processed 474 | MAGIC_PARAM_ELF_NOTES_MAX = 4 # Max ELF notes processed 475 | MAGIC_PARAM_REGEX_MAX = 5 # Length limit for regex searches 476 | MAGIC_PARAM_BYTES_MAX = 6 # Max number of bytes to read from file 477 | 478 | 479 | # This package name conflicts with the one provided by upstream 480 | # libmagic. This is a common source of confusion for users. To 481 | # resolve this, we ship a copy of that module and expose its functions 482 | # wrapped in deprecation warnings.
483 | def _add_compat(to_module): 484 | import warnings, re 485 | from magic import compat 486 | 487 | def deprecation_wrapper(fn): 488 | def _(*args, **kwargs): 489 | warnings.warn( 490 | "Using compatibility mode with libmagic's python binding. " 491 | "See https://github.com/ahupp/python-magic/blob/master/COMPAT.md for details.", 492 | PendingDeprecationWarning, 493 | ) 494 | 495 | return fn(*args, **kwargs) 496 | 497 | return _ 498 | 499 | fn = ["detect_from_filename", "detect_from_content", "detect_from_fobj", "open"] 500 | for fname in fn: 501 | to_module[fname] = deprecation_wrapper(compat.__dict__[fname]) 502 | 503 | # copy constants over, ensuring there's no conflicts 504 | is_const_re = re.compile("^[A-Z_]+$") 505 | allowed_inconsistent = set(["MAGIC_MIME"]) 506 | for name, value in compat.__dict__.items(): 507 | if is_const_re.match(name): 508 | if name in to_module: 509 | if name in allowed_inconsistent: 510 | continue 511 | if to_module[name] != value: 512 | raise Exception("inconsistent value for " + name) 513 | else: 514 | continue 515 | else: 516 | to_module[name] = value 517 | 518 | 519 | _add_compat(globals()) 520 | -------------------------------------------------------------------------------- /magic/__init__.pyi: -------------------------------------------------------------------------------- 1 | import ctypes.util 2 | import threading 3 | from typing import Any, Text, Optional, Union 4 | from os import PathLike 5 | 6 | class MagicException(Exception): 7 | message: Any = ... 8 | def __init__(self, message: Any) -> None: ... 9 | 10 | class Magic: 11 | flags: int = ... 12 | cookie: Any = ... 13 | lock: threading.Lock = ... 
14 | def __init__( 15 | self, 16 | mime: bool = ..., 17 | magic_file: Optional[Any] = ..., 18 | mime_encoding: bool = ..., 19 | keep_going: bool = ..., 20 | uncompress: bool = ..., 21 | raw: bool = ..., 22 | extension: bool = ..., 23 | follow_symlinks: bool = ..., 24 | check_tar: bool = ..., 25 | check_soft: bool = ..., 26 | check_apptype: bool = ..., 27 | check_elf: bool = ..., 28 | check_text: bool = ..., 29 | check_encoding: bool = ..., 30 | check_json: bool = ..., 31 | check_simh: bool = ..., 32 | ) -> None: ... 33 | def from_buffer(self, buf: Union[bytes, str]) -> Text: ... 34 | def from_file(self, filename: Union[bytes, str, PathLike]) -> Text: ... 35 | def from_descriptor(self, fd: int, mime: bool = ...) -> Text: ... 36 | def setparam(self, param: Any, val: Any): ... 37 | def getparam(self, param: Any): ... 38 | def __del__(self) -> None: ... 39 | 40 | def from_file(filename: Union[bytes, str, PathLike], mime: bool = ...) -> Text: ... 41 | def from_buffer(buffer: Union[bytes, str], mime: bool = ...) -> Text: ... 42 | def from_descriptor(fd: int, mime: bool = ...) -> Text: ... 43 | 44 | libmagic: Any 45 | dll: Any 46 | windows_dlls: Any 47 | platform_to_lib: Any 48 | platform: Any 49 | magic_t = ctypes.c_void_p 50 | 51 | def errorcheck_null(result: Any, func: Any, args: Any): ... 52 | def errorcheck_negative_one(result: Any, func: Any, args: Any): ... 53 | def maybe_decode(s: Union[bytes, str]) -> str: ... 54 | def coerce_filename(filename: Any): ... 55 | 56 | magic_open: Any 57 | magic_close: Any 58 | magic_error: Any 59 | magic_errno: Any 60 | 61 | def magic_file(cookie: Any, filename: Any): ... 62 | def magic_buffer(cookie: Any, buf: Any): ... 63 | def magic_descriptor(cookie: Any, fd: int): ... 64 | def magic_load(cookie: Any, filename: Any): ... 65 | 66 | magic_setflags: Any 67 | magic_check: Any 68 | magic_compile: Any 69 | 70 | def magic_setparam(cookie: Any, param: Any, val: Any): ... 71 | def magic_getparam(cookie: Any, param: Any): ... 
72 | 73 | magic_version: Any 74 | 75 | def version(): ... 76 | 77 | MAGIC_NONE: int 78 | MAGIC_DEBUG: int 79 | MAGIC_SYMLINK: int 80 | MAGIC_COMPRESS: int 81 | MAGIC_DEVICES: int 82 | MAGIC_MIME_TYPE: int 83 | MAGIC_MIME_ENCODING: int 84 | MAGIC_MIME: int 85 | MAGIC_CONTINUE: int 86 | MAGIC_CHECK: int 87 | MAGIC_PRESERVE_ATIME: int 88 | MAGIC_RAW: int 89 | MAGIC_ERROR: int 90 | MAGIC_NO_CHECK_COMPRESS: int 91 | MAGIC_NO_CHECK_TAR: int 92 | MAGIC_NO_CHECK_SOFT: int 93 | MAGIC_NO_CHECK_APPTYPE: int 94 | MAGIC_NO_CHECK_ELF: int 95 | MAGIC_NO_CHECK_TEXT: int 96 | MAGIC_NO_CHECK_ASCII: int 97 | MAGIC_NO_CHECK_TROFF: int 98 | MAGIC_NO_CHECK_FORTRAN: int 99 | MAGIC_NO_CHECK_CDF: int 100 | MAGIC_NO_CHECK_CSV: int 101 | MAGIC_NO_CHECK_TOKENS: int 102 | MAGIC_NO_CHECK_ENCODING: int 103 | MAGIC_NO_CHECK_JSON: int 104 | MAGIC_NO_CHECK_SIMH: int 105 | MAGIC_PARAM_INDIR_MAX: int 106 | MAGIC_PARAM_NAME_MAX: int 107 | MAGIC_PARAM_ELF_PHNUM_MAX: int 108 | MAGIC_PARAM_ELF_SHNUM_MAX: int 109 | MAGIC_PARAM_ELF_NOTES_MAX: int 110 | MAGIC_PARAM_REGEX_MAX: int 111 | MAGIC_PARAM_BYTES_MAX: int 112 | -------------------------------------------------------------------------------- /magic/compat.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | ''' 4 | Python bindings for libmagic 5 | ''' 6 | 7 | import threading 8 | from collections import namedtuple 9 | 10 | from ctypes import * 11 | from ctypes.util import find_library 12 | 13 | from . 
import loader 14 | 15 | _libraries = {} 16 | _libraries['magic'] = loader.load_lib() 17 | 18 | # Flag constants for open and setflags 19 | MAGIC_NONE = NONE = 0 20 | MAGIC_DEBUG = DEBUG = 1 21 | MAGIC_SYMLINK = SYMLINK = 2 22 | MAGIC_COMPRESS = COMPRESS = 4 23 | MAGIC_DEVICES = DEVICES = 8 24 | MAGIC_MIME_TYPE = MIME_TYPE = 16 25 | MAGIC_CONTINUE = CONTINUE = 32 26 | MAGIC_CHECK = CHECK = 64 27 | MAGIC_PRESERVE_ATIME = PRESERVE_ATIME = 128 28 | MAGIC_RAW = RAW = 256 29 | MAGIC_ERROR = ERROR = 512 30 | MAGIC_MIME_ENCODING = MIME_ENCODING = 1024 31 | MAGIC_MIME = MIME = 1040 # MIME_TYPE + MIME_ENCODING 32 | MAGIC_APPLE = APPLE = 2048 33 | 34 | MAGIC_NO_CHECK_COMPRESS = NO_CHECK_COMPRESS = 4096 35 | MAGIC_NO_CHECK_TAR = NO_CHECK_TAR = 8192 36 | MAGIC_NO_CHECK_SOFT = NO_CHECK_SOFT = 16384 37 | MAGIC_NO_CHECK_APPTYPE = NO_CHECK_APPTYPE = 32768 38 | MAGIC_NO_CHECK_ELF = NO_CHECK_ELF = 65536 39 | MAGIC_NO_CHECK_TEXT = NO_CHECK_TEXT = 131072 40 | MAGIC_NO_CHECK_CDF = NO_CHECK_CDF = 262144 41 | MAGIC_NO_CHECK_TOKENS = NO_CHECK_TOKENS = 1048576 42 | MAGIC_NO_CHECK_ENCODING = NO_CHECK_ENCODING = 2097152 43 | 44 | MAGIC_NO_CHECK_BUILTIN = NO_CHECK_BUILTIN = 4173824 45 | 46 | MAGIC_PARAM_INDIR_MAX = PARAM_INDIR_MAX = 0 47 | MAGIC_PARAM_NAME_MAX = PARAM_NAME_MAX = 1 48 | MAGIC_PARAM_ELF_PHNUM_MAX = PARAM_ELF_PHNUM_MAX = 2 49 | MAGIC_PARAM_ELF_SHNUM_MAX = PARAM_ELF_SHNUM_MAX = 3 50 | MAGIC_PARAM_ELF_NOTES_MAX = PARAM_ELF_NOTES_MAX = 4 51 | MAGIC_PARAM_REGEX_MAX = PARAM_REGEX_MAX = 5 52 | MAGIC_PARAM_BYTES_MAX = PARAM_BYTES_MAX = 6 53 | 54 | FileMagic = namedtuple('FileMagic', ('mime_type', 'encoding', 'name')) 55 | 56 | 57 | class magic_set(Structure): 58 | pass 59 | magic_set._fields_ = [] 60 | magic_t = POINTER(magic_set) 61 | 62 | _open = _libraries['magic'].magic_open 63 | _open.restype = magic_t 64 | _open.argtypes = [c_int] 65 | 66 | _close = _libraries['magic'].magic_close 67 | _close.restype = None 68 | _close.argtypes = [magic_t] 69 | 70 | _file = 
_libraries['magic'].magic_file 71 | _file.restype = c_char_p 72 | _file.argtypes = [magic_t, c_char_p] 73 | 74 | _descriptor = _libraries['magic'].magic_descriptor 75 | _descriptor.restype = c_char_p 76 | _descriptor.argtypes = [magic_t, c_int] 77 | 78 | _buffer = _libraries['magic'].magic_buffer 79 | _buffer.restype = c_char_p 80 | _buffer.argtypes = [magic_t, c_void_p, c_size_t] 81 | 82 | _error = _libraries['magic'].magic_error 83 | _error.restype = c_char_p 84 | _error.argtypes = [magic_t] 85 | 86 | _setflags = _libraries['magic'].magic_setflags 87 | _setflags.restype = c_int 88 | _setflags.argtypes = [magic_t, c_int] 89 | 90 | _load = _libraries['magic'].magic_load 91 | _load.restype = c_int 92 | _load.argtypes = [magic_t, c_char_p] 93 | 94 | _compile = _libraries['magic'].magic_compile 95 | _compile.restype = c_int 96 | _compile.argtypes = [magic_t, c_char_p] 97 | 98 | _check = _libraries['magic'].magic_check 99 | _check.restype = c_int 100 | _check.argtypes = [magic_t, c_char_p] 101 | 102 | _list = _libraries['magic'].magic_list 103 | _list.restype = c_int 104 | _list.argtypes = [magic_t, c_char_p] 105 | 106 | _errno = _libraries['magic'].magic_errno 107 | _errno.restype = c_int 108 | _errno.argtypes = [magic_t] 109 | 110 | _getparam = _libraries['magic'].magic_getparam 111 | _getparam.restype = c_int 112 | _getparam.argtypes = [magic_t, c_int, c_void_p] 113 | 114 | _setparam = _libraries['magic'].magic_setparam 115 | _setparam.restype = c_int 116 | _setparam.argtypes = [magic_t, c_int, c_void_p] 117 | 118 | 119 | class Magic(object): 120 | def __init__(self, ms): 121 | self._magic_t = ms 122 | 123 | def close(self): 124 | """ 125 | Closes the magic database and deallocates any resources used. 
126 | """ 127 | _close(self._magic_t) 128 | 129 | @staticmethod 130 | def __tostr(s): 131 | if s is None: 132 | return None 133 | if isinstance(s, str): 134 | return s 135 | try: # keep Python 2 compatibility 136 | return str(s, 'utf-8') 137 | except TypeError: 138 | return str(s) 139 | 140 | @staticmethod 141 | def __tobytes(b): 142 | if b is None: 143 | return None 144 | if isinstance(b, bytes): 145 | return b 146 | try: # keep Python 2 compatibility 147 | return bytes(b, 'utf-8') 148 | except TypeError: 149 | return bytes(b) 150 | 151 | def file(self, filename): 152 | """ 153 | Returns a textual description of the contents of the argument passed 154 | as a filename or None if an error occurred and the MAGIC_ERROR flag 155 | is set. A call to errno() will return the numeric error code. 156 | """ 157 | return Magic.__tostr(_file(self._magic_t, Magic.__tobytes(filename))) 158 | 159 | def descriptor(self, fd): 160 | """ 161 | Returns a textual description of the contents of the argument passed 162 | as a file descriptor or None if an error occurred and the MAGIC_ERROR 163 | flag is set. A call to errno() will return the numeric error code. 164 | """ 165 | return Magic.__tostr(_descriptor(self._magic_t, fd)) 166 | 167 | def buffer(self, buf): 168 | """ 169 | Returns a textual description of the contents of the argument passed 170 | as a buffer or None if an error occurred and the MAGIC_ERROR flag 171 | is set. A call to errno() will return the numeric error code. 172 | """ 173 | return Magic.__tostr(_buffer(self._magic_t, buf, len(buf))) 174 | 175 | def error(self): 176 | """ 177 | Returns a textual explanation of the last error or None 178 | if there was no error. 179 | """ 180 | return Magic.__tostr(_error(self._magic_t)) 181 | 182 | def setflags(self, flags): 183 | """ 184 | Set flags on the magic object which determine how magic checking 185 | behaves; a bitwise OR of the flags described in libmagic(3), but 186 | without the MAGIC_ prefix. 
187 | 188 | Returns -1 on systems that don't support utime(2) or utimes(2) 189 | when PRESERVE_ATIME is set. 190 | """ 191 | return _setflags(self._magic_t, flags) 192 | 193 | def load(self, filename=None): 194 | """ 195 | Must be called to load entries in the colon separated list of database 196 | files passed as argument or the default database file if no argument 197 | before any magic queries can be performed. 198 | 199 | Returns 0 on success and -1 on failure. 200 | """ 201 | return _load(self._magic_t, Magic.__tobytes(filename)) 202 | 203 | def compile(self, dbs): 204 | """ 205 | Compile entries in the colon separated list of database files 206 | passed as argument or the default database file if no argument. 207 | The compiled files created are named from the basename(1) of each file 208 | argument with ".mgc" appended to it. 209 | 210 | Returns 0 on success and -1 on failure. 211 | """ 212 | return _compile(self._magic_t, Magic.__tobytes(dbs)) 213 | 214 | def check(self, dbs): 215 | """ 216 | Check the validity of entries in the colon separated list of 217 | database files passed as argument or the default database file 218 | if no argument. 219 | 220 | Returns 0 on success and -1 on failure. 221 | """ 222 | return _check(self._magic_t, Magic.__tobytes(dbs)) 223 | 224 | def list(self, dbs): 225 | """ 226 | Dump, in a human readable format, the entries in the colon separated 227 | list of database files passed as argument or the default database 228 | file if no argument. 229 | 230 | Returns 0 on success and -1 on failure. 231 | """ 232 | return _list(self._magic_t, Magic.__tobytes(dbs)) 233 | 234 | def errno(self): 235 | """ 236 | Returns a numeric error code. If return value is 0, an internal 237 | magic error occurred. If return value is non-zero, the value is 238 | an OS error code. The errno module or os.strerror() can be used 239 | to provide detailed error information. 
240 | """ 241 | return _errno(self._magic_t) 242 | 243 | def getparam(self, param): 244 | """ 245 | Returns the param value if successful and -1 if the parameter 246 | was unknown. 247 | """ 248 | v = c_size_t() # libmagic writes a size_t through this pointer 249 | i = _getparam(self._magic_t, param, byref(v)) 250 | if i == -1: 251 | return -1 252 | return v.value 253 | 254 | def setparam(self, param, value): 255 | """ 256 | Returns 0 if successful and -1 if the parameter was unknown. 257 | """ 258 | v = c_size_t(value) # libmagic reads a size_t through this pointer 259 | return _setparam(self._magic_t, param, byref(v)) 260 | 261 | 262 | def open(flags): 263 | """ 264 | Returns a magic object on success and None on failure. 265 | Flags argument as for setflags. 266 | """ 267 | magic_t = _open(flags) 268 | if not magic_t: # a NULL POINTER(...) result is falsy, never None 269 | return None 270 | return Magic(magic_t) 271 | 272 | 273 | # Objects used by `detect_from_` functions 274 | class error(Exception): 275 | pass 276 | 277 | class MagicDetect(object): 278 | def __init__(self): 279 | self.mime_magic = open(MAGIC_MIME) 280 | if self.mime_magic is None: 281 | raise error 282 | if self.mime_magic.load() == -1: 283 | self.mime_magic.close() 284 | self.mime_magic = None 285 | raise error 286 | self.none_magic = open(MAGIC_NONE) 287 | if self.none_magic is None: 288 | self.mime_magic.close() 289 | self.mime_magic = None 290 | raise error 291 | if self.none_magic.load() == -1: 292 | self.none_magic.close() 293 | self.none_magic = None 294 | self.mime_magic.close() 295 | self.mime_magic = None 296 | raise error 297 | 298 | def __del__(self): 299 | if self.mime_magic is not None: 300 | self.mime_magic.close() 301 | if self.none_magic is not None: 302 | self.none_magic.close() 303 | 304 | threadlocal = threading.local() 305 | 306 | def _detect_make(): 307 | v = getattr(threadlocal, "magic_instance", None) 308 | if v is None: 309 | v = MagicDetect() 310 | setattr(threadlocal, "magic_instance", v) 311 | return v 312 | 313 | def _create_filemagic(mime_detected, type_detected): 314 | try: 315 | mime_type, mime_encoding = 
mime_detected.split('; ') 316 | except ValueError: 317 | raise ValueError(mime_detected) 318 | 319 | return FileMagic(name=type_detected, mime_type=mime_type, 320 | encoding=mime_encoding.replace('charset=', '')) 321 | 322 | 323 | def detect_from_filename(filename): 324 | '''Detect mime type, encoding and file type from a filename 325 | 326 | Returns a `FileMagic` namedtuple. 327 | ''' 328 | x = _detect_make() 329 | return _create_filemagic(x.mime_magic.file(filename), 330 | x.none_magic.file(filename)) 331 | 332 | 333 | def detect_from_fobj(fobj): 334 | '''Detect mime type, encoding and file type from file-like object 335 | 336 | Returns a `FileMagic` namedtuple. 337 | ''' 338 | 339 | file_descriptor = fobj.fileno() 340 | x = _detect_make() 341 | return _create_filemagic(x.mime_magic.descriptor(file_descriptor), 342 | x.none_magic.descriptor(file_descriptor)) 343 | 344 | 345 | def detect_from_content(byte_content): 346 | '''Detect mime type, encoding and file type from bytes 347 | 348 | Returns a `FileMagic` namedtuple. 349 | ''' 350 | 351 | x = _detect_make() 352 | return _create_filemagic(x.mime_magic.buffer(byte_content), 353 | x.none_magic.buffer(byte_content)) 354 | -------------------------------------------------------------------------------- /magic/loader.py: -------------------------------------------------------------------------------- 1 | from ctypes.util import find_library 2 | import ctypes 3 | import sys 4 | import glob 5 | import os.path 6 | import logging 7 | 8 | logger = logging.getLogger(__name__) 9 | 10 | 11 | def _lib_candidates_linux(): 12 | """Yield possible libmagic library names on Linux. 
13 | 14 | This fallback is needed because find_library can fail on Alpine 15 | """ 16 | yield "libmagic.so.1" 17 | 18 | 19 | def _lib_candidates_macos(): 20 | """Yield possible libmagic library names on macOS.""" 21 | paths = [ 22 | "/opt/homebrew/lib", 23 | "/opt/local/lib", 24 | "/usr/local/lib", 25 | ] + glob.glob("/usr/local/Cellar/libmagic/*/lib") 26 | for path in paths: 27 | yield os.path.join(path, "libmagic.dylib") 28 | 29 | 30 | def _lib_candidates_windows(): 31 | """Yield possible libmagic library names on Windows.""" 32 | prefixes = ( 33 | "libmagic", 34 | "magic1", 35 | "magic-1", 36 | "cygmagic-1", 37 | "libmagic-1", 38 | "msys-magic-1", 39 | ) 40 | for prefix in prefixes: 41 | # find_library searches in %PATH% but not the current directory, 42 | # so look for both 43 | yield "./%s.dll" % (prefix,) 44 | yield find_library(prefix) 45 | 46 | 47 | def _lib_candidates(): 48 | yield find_library("magic") 49 | 50 | func = { 51 | "cygwin": _lib_candidates_windows, 52 | "darwin": _lib_candidates_macos, 53 | "linux": _lib_candidates_linux, 54 | "win32": _lib_candidates_windows, 55 | "sunos5": _lib_candidates_linux, 56 | }.get(sys.platform) 57 | if func is None: 58 | raise ImportError("python-magic: Unsupported platform: " + sys.platform) 59 | # When we drop legacy Python, we can just `yield from func()` 60 | for path in func(): 61 | yield path 62 | 63 | 64 | def load_lib(): 65 | exc = [] 66 | for lib in _lib_candidates(): 67 | # find_library returns None when lib not found 68 | if lib is None: 69 | continue 70 | 71 | try: 72 | return ctypes.CDLL(lib) 73 | except OSError as e: 74 | exc.append(e) 75 | 76 | msg = "\n".join([str(e) for e in exc]) 77 | 78 | # It is better to raise an ImportError since we are importing the magic module 79 | raise ImportError( 80 | "python-magic: failed to find libmagic. 
Check your installation: \n" + msg 81 | ) 82 | -------------------------------------------------------------------------------- /magic/py.typed: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/magic/py.typed -------------------------------------------------------------------------------- /ruff.toml: -------------------------------------------------------------------------------- 1 | exclude = ["magic/compat.py"] 2 | 3 | 4 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [global] 2 | command_packages=stdeb.command 3 | 4 | [bdist_wheel] 5 | universal = 1 6 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | import setuptools 5 | import io 6 | import os 7 | 8 | 9 | def read(file_name): 10 | """Read a text file and return the content as a string.""" 11 | with io.open( 12 | os.path.join(os.path.dirname(__file__), file_name), encoding="utf-8" 13 | ) as f: 14 | return f.read() 15 | 16 | 17 | setuptools.setup( 18 | name="python-magic", 19 | description="File type identification using libmagic", 20 | author="Adam Hupp", 21 | author_email="adam@hupp.org", 22 | url="http://github.com/ahupp/python-magic", 23 | version="0.4.28", 24 | long_description=read("README.md"), 25 | long_description_content_type="text/markdown", 26 | packages=["magic"], 27 | package_data={ 28 | "magic": ["py.typed", "*.pyi", "**/*.pyi"], 29 | }, 30 | keywords="mime magic file", 31 | license="MIT", 32 | python_requires=">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*", 33 | classifiers=[ 34 | "Intended Audience :: Developers", 35 | "License :: OSI Approved :: MIT 
License", 36 | "Programming Language :: Python", 37 | "Programming Language :: Python :: 2.7", 38 | "Programming Language :: Python :: 3", 39 | "Programming Language :: Python :: 3.5", 40 | "Programming Language :: Python :: 3.6", 41 | "Programming Language :: Python :: 3.7", 42 | "Programming Language :: Python :: 3.8", 43 | "Programming Language :: Python :: 3.9", 44 | "Programming Language :: Python :: 3.10", 45 | "Programming Language :: Python :: 3.11", 46 | "Programming Language :: Python :: 3.12", 47 | "Programming Language :: Python :: 3.13", 48 | "Programming Language :: Python :: Implementation :: CPython", 49 | ], 50 | ) 51 | -------------------------------------------------------------------------------- /stdeb.cfg: -------------------------------------------------------------------------------- 1 | [python-magic] 2 | Depends: libmagic1 3 | Conflicts: python-magic 4 | -------------------------------------------------------------------------------- /test/README: -------------------------------------------------------------------------------- 1 | There are a few ways to run the python-magic tests 2 | 3 | 1. `tox` will run the tests against all installed versions of python 4 | 2. `./test/run_all_docker_test.sh` will run against a variety of different Linux distributions, using docker. 5 | -------------------------------------------------------------------------------- /test/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/__init__.py -------------------------------------------------------------------------------- /test/docker/alpine: -------------------------------------------------------------------------------- 1 | FROM python:3.8-alpine3.12 2 | RUN apk add python3 python2 libmagic 3 | WORKDIR /python-magic 4 | COPY . . 
5 | RUN python3 -m pip install tox 6 | -------------------------------------------------------------------------------- /test/docker/archlinux: -------------------------------------------------------------------------------- 1 | FROM archlinux:latest 2 | RUN yes | pacman -Syyu --overwrite '*' 3 | RUN yes | pacman -S python python-pip file which 4 | WORKDIR /python-magic 5 | COPY . . 6 | RUN python3 -m pip install tox 7 | -------------------------------------------------------------------------------- /test/docker/bionic: -------------------------------------------------------------------------------- 1 | FROM ubuntu:bionic 2 | RUN apt-get update 3 | RUN apt-get -y install python python3 locales python3-pip libmagic1 4 | RUN locale-gen en_US.UTF-8 5 | 6 | WORKDIR /python-magic 7 | COPY . . 8 | RUN python3 -m pip install tox 9 | -------------------------------------------------------------------------------- /test/docker/centos7: -------------------------------------------------------------------------------- 1 | FROM centos:7 2 | RUN yum -y update 3 | RUN yum -y install file-devel python3 python2 which 4 | ENV SKIP_FROM_DESCRIPTOR=1 5 | 6 | WORKDIR /python-magic 7 | COPY . . 8 | RUN python3 -m pip install tox 9 | -------------------------------------------------------------------------------- /test/docker/centos8: -------------------------------------------------------------------------------- 1 | FROM centos:8 2 | RUN yum -y update 3 | RUN yum -y install file-libs python3 python2 which glibc-locale-source 4 | RUN yum reinstall glibc-common -y && \ 5 | localedef -i en_US -f UTF-8 en_US.UTF-8 && \ 6 | echo "LANG=en_US.UTF-8" > /etc/locale.conf 7 | 8 | WORKDIR /python-magic 9 | COPY . . 
10 | RUN python3 -m pip install tox 11 | -------------------------------------------------------------------------------- /test/docker/focal: -------------------------------------------------------------------------------- 1 | FROM ubuntu:focal 2 | RUN apt-get update 3 | RUN apt-get -y install python python3 locales python3-pip libmagic1 4 | RUN locale-gen en_US.UTF-8 5 | 6 | WORKDIR /python-magic 7 | COPY . . 8 | RUN python3 -m pip install tox 9 | 10 | 11 | -------------------------------------------------------------------------------- /test/docker/xenial: -------------------------------------------------------------------------------- 1 | FROM ubuntu:xenial 2 | RUN apt-get update 3 | RUN apt-get -y install python python3 locales python3-pip libmagic1 4 | RUN locale-gen en_US.UTF-8 5 | 6 | WORKDIR /python-magic 7 | COPY . . 8 | RUN python3 -m pip install tox 9 | -------------------------------------------------------------------------------- /test/libmagic_test.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import unittest 4 | import os 5 | import magic 6 | import os.path 7 | 8 | # magic_descriptor is broken (?) 
in centos 7, so don't run those tests 9 | SKIP_FROM_DESCRIPTOR = bool(os.environ.get("SKIP_FROM_DESCRIPTOR")) 10 | 11 | TESTDATA_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), "testdata")) 12 | 13 | 14 | class MagicTestCase(unittest.TestCase): 15 | filename = os.path.join(TESTDATA_DIR, "test.pdf") 16 | expected_mime_type = "application/pdf" 17 | expected_encoding = "us-ascii" 18 | expected_name = ( 19 | "PDF document, version 1.2", 20 | "PDF document, version 1.2, 2 pages", 21 | "PDF document, version 1.2, 2 page(s)", 22 | ) 23 | 24 | def assert_result(self, result): 25 | self.assertEqual(result.mime_type, self.expected_mime_type) 26 | self.assertEqual(result.encoding, self.expected_encoding) 27 | self.assertIn(result.name, self.expected_name) 28 | 29 | def test_detect_from_filename(self): 30 | result = magic.detect_from_filename(self.filename) 31 | self.assert_result(result) 32 | 33 | def test_detect_from_fobj(self): 34 | if SKIP_FROM_DESCRIPTOR: 35 | self.skipTest("magic_descriptor is broken in this version of libmagic") 36 | 37 | with open(self.filename) as fobj: 38 | result = magic.detect_from_fobj(fobj) 39 | self.assert_result(result) 40 | 41 | def test_detect_from_content(self): 42 | # differ from upstream by opening file in binary mode, 43 | # this avoids hitting a bug in python3+libfile bindings 44 | # see https://github.com/ahupp/python-magic/issues/152 45 | # for a similar issue 46 | with open(self.filename, "rb") as fobj: 47 | result = magic.detect_from_content(fobj.read(4096)) 48 | self.assert_result(result) 49 | 50 | 51 | if __name__ == "__main__": 52 | unittest.main() 53 | -------------------------------------------------------------------------------- /test/python_magic_test.py: -------------------------------------------------------------------------------- 1 | from dataclasses import dataclass 2 | from enum import Enum 3 | import os 4 | import os.path 5 | import shutil 6 | import sys 7 | import tempfile 8 | from typing import List, 
Union 9 | import unittest 10 | 11 | import pytest 12 | 13 | # for output which reports a local time 14 | os.environ["TZ"] = "GMT" 15 | 16 | if os.environ.get("LC_ALL", "") != "en_US.UTF-8": 17 | # this ensures we're in a UTF-8 default filesystem encoding, which is 18 | # necessary for some tests 19 | raise Exception("must run `export LC_ALL=en_US.UTF-8` before running test suite") 20 | 21 | import magic 22 | 23 | 24 | @dataclass 25 | class TestFile: 26 | file_name: str 27 | mime_results: List[str] 28 | text_results: List[str] 29 | no_check_elf_results: Union[List[str], None] 30 | buf_equals_file: bool = True 31 | 32 | 33 | # magic_descriptor is broken (?) in centos 7, so don't run those tests 34 | SKIP_FROM_DESCRIPTOR = bool(os.environ.get("SKIP_FROM_DESCRIPTOR")) 35 | 36 | 37 | COMMON_PLAIN = [{}] 38 | NO_SOFT = [{"check_soft": False}] 39 | COMMON_MIME = [{"mime": True}] 40 | 41 | CASES = { 42 | b"magic._pyc_": [ 43 | ( 44 | COMMON_MIME, 45 | [ 46 | "application/octet-stream", 47 | "text/x-bytecode.python", 48 | "application/x-bytecode.python", 49 | ], 50 | ), 51 | (COMMON_PLAIN, ["python 2.4 byte-compiled"]), 52 | (NO_SOFT, ["data"]), 53 | ], 54 | b"test.pdf": [ 55 | (COMMON_MIME, ["application/pdf"]), 56 | ( 57 | COMMON_PLAIN, 58 | [ 59 | "PDF document, version 1.2", 60 | "PDF document, version 1.2, 2 pages", 61 | "PDF document, version 1.2, 2 page(s)", 62 | ], 63 | ), 64 | (NO_SOFT, ["ASCII text"]), 65 | ], 66 | b"test.gz": [ 67 | (COMMON_MIME, ["application/gzip", "application/x-gzip"]), 68 | ( 69 | COMMON_PLAIN, 70 | [ 71 | 'gzip compressed data, was "test", from Unix, last modified: Sun Jun 29 01:32:52 2008', 72 | 'gzip compressed data, was "test", last modified: Sun Jun 29 01:32:52 2008, from Unix', 73 | 'gzip compressed data, was "test", last modified: Sun Jun 29 01:32:52 2008, from Unix, original size 15', 74 | 'gzip compressed data, was "test", last modified: Sun Jun 29 01:32:52 2008, from Unix, original size modulo 2^32 15', 75 | 'gzip compressed data, 
was "test", last modified: Sun Jun 29 01:32:52 2008, from Unix, truncated', 76 | ], 77 | ), 78 | ( 79 | [{"extension": True}], 80 | [ 81 | # some versions return '' for the extensions of a gz file, 82 | # including w/ the command line. Who knows... 83 | "gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz", 84 | "gz/tgz/tpz/zabw/svgz", 85 | "", 86 | "???", 87 | ], 88 | ), 89 | (NO_SOFT, ["data"]), 90 | ], 91 | b"test.snappy.parquet": [ 92 | (COMMON_MIME, ["application/octet-stream"]), 93 | (COMMON_PLAIN, ["Apache Parquet", "Par archive data"]), 94 | (NO_SOFT, ["data"]), 95 | ], 96 | b"test.json": [ 97 | (COMMON_MIME, ["application/json"]), 98 | (COMMON_PLAIN, ["JSON text data"]), 99 | ( 100 | [{"mime": True, "check_json": False}], 101 | [ 102 | "text/plain", 103 | ], 104 | ), 105 | (NO_SOFT, ["JSON text data"]), 106 | ], 107 | b"elf-NetBSD-x86_64-echo": [ 108 | # TODO: soft, no elf 109 | ( 110 | COMMON_PLAIN, 111 | [ 112 | "ELF 64-bit LSB shared object, x86-64, version 1 (SYSV)", 113 | "ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /libexec/ld.elf_so, for NetBSD 8.0, not stripped", 114 | ], 115 | ), 116 | ( 117 | COMMON_MIME, 118 | [ 119 | "application/x-pie-executable", 120 | "application/x-sharedlib", 121 | ], 122 | ), 123 | ( 124 | [{"check_elf": False}], 125 | [ 126 | "ELF 64-bit LSB shared object, x86-64, version 1 (SYSV)", 127 | ], 128 | ), 129 | # TODO: sometimes 130 | # "ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /libexec/ld.elf_so, for NetBSD 8.0, not stripped", 131 | (NO_SOFT, ["data"]), 132 | ], 133 | b"text.txt": [ 134 | (COMMON_MIME, ["text/plain"]), 135 | (COMMON_PLAIN, ["ASCII text"]), 136 | ( 137 | [{"mime_encoding": True}], 138 | [ 139 | "us-ascii", 140 | ], 141 | ), 142 | (NO_SOFT, ["ASCII text"]), 143 | ], 144 | b"text-iso8859-1.txt": [ 145 | ( 146 | [{"mime_encoding": True}], 147 | [ 148 | "iso-8859-1", 149 | ], 150 | ), 151 | ], 152 | b"\xce\xbb": [ 153 | (COMMON_MIME, 
["text/plain"]), 154 | ], 155 | b"name_use.jpg": [ 156 | ([{"extension": True}], ["jpeg/jpg/jpe/jfif"]), 157 | ], 158 | b"keep-going.jpg": [ 159 | (COMMON_MIME, ["image/jpeg"]), 160 | ( 161 | [{"mime": True, "keep_going": True}], 162 | [ 163 | "image/jpeg\\012- application/octet-stream", 164 | ], 165 | ), 166 | ], 167 | b"../../magic/loader.py": [ 168 | ( 169 | COMMON_MIME, 170 | [ 171 | "text/x-python", 172 | "text/x-script.python", 173 | ], 174 | ) 175 | ], 176 | } 177 | 178 | 179 | class MagicTest(unittest.TestCase): 180 | TESTDATA_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), "testdata")) 181 | 182 | def test_version(self): 183 | try: 184 | self.assertTrue(magic.version() > 0) 185 | except NotImplementedError: 186 | pass 187 | 188 | def test_fs_encoding(self): 189 | self.assertEqual("utf-8", sys.getfilesystemencoding().lower()) 190 | 191 | def test_from_file_str_and_bytes(self): 192 | filename = os.path.join(self.TESTDATA_DIR, "test.pdf") 193 | 194 | self.assertEqual("application/pdf", magic.from_file(filename, mime=True)) 195 | self.assertEqual( 196 | "application/pdf", magic.from_file(filename.encode("utf-8"), mime=True) 197 | ) 198 | 199 | def test_all_cases(self): 200 | # TODO: 201 | # * MAGIC_EXTENSION not supported 202 | # * keep_going not supported 203 | # * buffer checks 204 | dest = os.path.join(MagicTest.TESTDATA_DIR, b"\xce\xbb".decode("utf-8")) 205 | shutil.copyfile(os.path.join(MagicTest.TESTDATA_DIR, "lambda"), dest) 206 | os.environ["TZ"] = "UTC" 207 | try: 208 | for filename, cases in CASES.items(): 209 | filename = os.path.join(self.TESTDATA_DIR.encode("utf-8"), filename) 210 | print("test case ", filename, file=sys.stderr) 211 | for flag_variants, outputs in cases: 212 | for flags in flag_variants: 213 | print("flags", flags, file=sys.stderr) 214 | m = magic.Magic(**flags) 215 | with open(filename) as f: 216 | self.assertIn(m.from_descriptor(f.fileno()), outputs) 217 | 218 | self.assertIn(m.from_file(filename), outputs) 219 | 
220 | fname_str = filename.decode("utf-8") 221 | self.assertIn(m.from_file(fname_str), outputs) 222 | 223 | with open(filename, "rb") as f: 224 | buf_result = m.from_buffer(f.read(1024)) 225 | self.assertIn(buf_result, outputs) 226 | finally: 227 | del os.environ["TZ"] 228 | os.unlink(dest) 229 | 230 | def test_unicode_result_nonraw(self): 231 | m = magic.Magic(raw=False) 232 | src = os.path.join(MagicTest.TESTDATA_DIR, "pgpunicode") 233 | result = m.from_file(src) 234 | # NOTE: This check is added as otherwise some magic files don't identify the test case as a PGP key. 235 | if "PGP" in result: 236 | assert r"PGP\011Secret Sub-key -" == result 237 | else: 238 | raise unittest.SkipTest("Magic file doesn't return expected type.") 239 | 240 | def test_unicode_result_raw(self): 241 | m = magic.Magic(raw=True) 242 | src = os.path.join(MagicTest.TESTDATA_DIR, "pgpunicode") 243 | result = m.from_file(src) 244 | if "PGP" in result: 245 | assert b"PGP\tSecret Sub-key -" == result.encode("utf-8") 246 | else: 247 | raise unittest.SkipTest("Magic file doesn't return expected type.") 248 | 249 | def test_errors(self): 250 | m = magic.Magic() 251 | self.assertRaises(IOError, m.from_file, "nonexistent") 252 | self.assertRaises(magic.MagicException, magic.Magic, magic_file="nonexistent") 253 | os.environ["MAGIC"] = "nonexistent" 254 | try: 255 | self.assertRaises(magic.MagicException, magic.Magic) 256 | finally: 257 | del os.environ["MAGIC"] 258 | 259 | def test_rethrow(self): 260 | old = magic.magic_buffer 261 | try: 262 | 263 | def t(x, y): 264 | raise magic.MagicException("passthrough") 265 | 266 | magic.magic_buffer = t 267 | 268 | with self.assertRaises(magic.MagicException): 269 | magic.from_buffer("hello", True) 270 | finally: 271 | magic.magic_buffer = old 272 | 273 | def test_getparam(self): 274 | m = magic.Magic(mime=True) 275 | try: 276 | m.setparam(magic.MAGIC_PARAM_INDIR_MAX, 1) 277 | self.assertEqual(m.getparam(magic.MAGIC_PARAM_INDIR_MAX), 1) 278 | except 
NotImplementedError: 279 | pass 280 | 281 | def test_name_count(self): 282 | m = magic.Magic() 283 | with open(os.path.join(self.TESTDATA_DIR, "name_use.jpg"), "rb") as f: 284 | m.from_buffer(f.read()) 285 | 286 | def test_pathlike(self): 287 | if sys.version_info < (3, 6): 288 | return 289 | from pathlib import Path 290 | 291 | path = Path(self.TESTDATA_DIR, "test.pdf") 292 | m = magic.Magic(mime=True) 293 | self.assertEqual("application/pdf", m.from_file(path)) 294 | 295 | def test_symlink(self): 296 | # TODO: 3.0 297 | if not hasattr(tempfile, "TemporaryDirectory"): 298 | return 299 | 300 | with tempfile.TemporaryDirectory() as tmp: 301 | tmp_link = os.path.join(tmp, "test_link") 302 | tmp_broken = os.path.join(tmp, "nonexistent") 303 | 304 | os.symlink( 305 | os.path.join(self.TESTDATA_DIR, "test.pdf"), 306 | tmp_link, 307 | ) 308 | 309 | os.symlink("/nonexistent", tmp_broken) 310 | 311 | m = magic.Magic() 312 | m_follow = magic.Magic(follow_symlinks=True) 313 | self.assertTrue(m.from_file(tmp_link).startswith("symbolic link to ")) 314 | self.assertTrue(m_follow.from_file(tmp_link).startswith("PDF document")) 315 | 316 | self.assertTrue( 317 | m.from_file(tmp_broken).startswith( 318 | "broken symbolic link to /nonexistent" 319 | ) 320 | ) 321 | 322 | self.assertRaises(IOError, m_follow.from_file, tmp_broken) 323 | 324 | 325 | if __name__ == "__main__": 326 | unittest.main() 327 | -------------------------------------------------------------------------------- /test/run_all_docker_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | set -e 4 | set -x 5 | 6 | ROOT=$(dirname $0)/.. 7 | cd $ROOT 8 | 9 | for f in test/docker/*; do 10 | H=$(docker build -q -f ${f} .) 
11 | docker run --rm $H python3 -m tox 12 | done 13 | 14 | -------------------------------------------------------------------------------- /test/testdata/elf-NetBSD-x86_64-echo: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/elf-NetBSD-x86_64-echo -------------------------------------------------------------------------------- /test/testdata/keep-going.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/keep-going.jpg -------------------------------------------------------------------------------- /test/testdata/lambda: -------------------------------------------------------------------------------- 1 | test 2 | -------------------------------------------------------------------------------- /test/testdata/magic._pyc_: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/magic._pyc_ -------------------------------------------------------------------------------- /test/testdata/name_use.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/name_use.jpg -------------------------------------------------------------------------------- /test/testdata/pgpunicode: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/pgpunicode -------------------------------------------------------------------------------- /test/testdata/test.gz: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/test.gz -------------------------------------------------------------------------------- /test/testdata/test.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "one": 2, 4 | "three": null, 5 | "four": [5, "six", false] 6 | } 7 | ] 8 | -------------------------------------------------------------------------------- /test/testdata/test.pdf: -------------------------------------------------------------------------------- 1 | %PDF-1.2 2 | 7 0 obj 3 | [5 0 R/XYZ 111.6 757.86] 4 | endobj 5 | 13 0 obj 6 | << 7 | /Title(About this document) 8 | /A<< 9 | /S/GoTo 10 | /D(subsection.1.1) 11 | >> 12 | /Parent 12 0 R 13 | /Next 14 0 R 14 | >> 15 | endobj 16 | 15 0 obj 17 | << 18 | /Title(Compiling with GHC) 19 | /A<< 20 | /S/GoTo 21 | /D(subsubsection.1.2.1) 22 | >> 23 | /Parent 14 0 R 24 | /Next 16 0 R 25 | >> 26 | endobj 27 | 16 0 obj 28 | << 29 | /Title(Compiling with Hugs) 30 | /A<< 31 | /S/GoTo 32 | /D(subsubsection.1.2.2) 33 | >> 34 | /Parent 14 0 R 35 | /Prev 15 0 R 36 | >> 37 | endobj 38 | 14 0 obj 39 | << 40 | /Title(Compatibility) 41 | /A<< 42 | /S/GoTo 43 | /D(subsection.1.2) 44 | >> 45 | /Parent 12 0 R 46 | /Prev 13 0 R 47 | /First 15 0 R 48 | /Last 16 0 R 49 | /Count -2 50 | /Next 17 0 R 51 | >> 52 | endobj 53 | 17 0 obj 54 | << 55 | /Title(Reporting bugs) 56 | /A<< 57 | /S/GoTo 58 | /D(subsection.1.3) 59 | >> 60 | /Parent 12 0 R 61 | /Prev 14 0 R 62 | /Next 18 0 R 63 | >> 64 | endobj 65 | 18 0 obj 66 | << 67 | /Title(History) 68 | /A<< 69 | /S/GoTo 70 | /D(subsection.1.4) 71 | >> 72 | /Parent 12 0 R 73 | /Prev 17 0 R 74 | /Next 19 0 R 75 | >> 76 | endobj 77 | 19 0 obj 78 | << 79 | /Title(License) 80 | /A<< 81 | /S/GoTo 82 | /D(subsection.1.5) 83 | >> 84 | /Parent 12 0 R 85 | /Prev 18 0 R 86 | >> 87 | endobj 88 | 12 0 obj 89 | 
<< 90 | /Title(Introduction) 91 | /A<< 92 | /S/GoTo 93 | /D(section.1) 94 | >> 95 | /Parent 11 0 R 96 | /First 13 0 R 97 | /Last 19 0 R 98 | /Count -5 99 | /Next 20 0 R 100 | >> 101 | endobj 102 | 21 0 obj 103 | << 104 | /Title(Running a parser) 105 | /A<< 106 | /S/GoTo 107 | /D(subsection.2.1) 108 | >> 109 | /Parent 20 0 R 110 | /Next 22 0 R 111 | >> 112 | endobj 113 | 22 0 obj 114 | << 115 | /Title(Sequence and choice) 116 | /A<< 117 | /S/GoTo 118 | /D(subsection.2.2) 119 | >> 120 | /Parent 20 0 R 121 | /Prev 21 0 R 122 | /Next 23 0 R 123 | >> 124 | endobj 125 | 23 0 obj 126 | << 127 | /Title(Predictive parsers) 128 | /A<< 129 | /S/GoTo 130 | /D(subsection.2.3) 131 | >> 132 | /Parent 20 0 R 133 | /Prev 22 0 R 134 | /Next 24 0 R 135 | >> 136 | endobj 137 | 24 0 obj 138 | << 139 | /Title(Adding semantics) 140 | /A<< 141 | /S/GoTo 142 | /D(subsection.2.4) 143 | >> 144 | /Parent 20 0 R 145 | /Prev 23 0 R 146 | /Next 25 0 R 147 | >> 148 | endobj 149 | 25 0 obj 150 | << 151 | /Title(Sequences and seperators) 152 | /A<< 153 | /S/GoTo 154 | /D(subsection.2.5) 155 | >> 156 | /Parent 20 0 R 157 | /Prev 24 0 R 158 | /Next 26 0 R 159 | >> 160 | endobj 161 | 26 0 obj 162 | << 163 | /Title(Improving error messages) 164 | /A<< 165 | /S/GoTo 166 | /D(subsection.2.6) 167 | >> 168 | /Parent 20 0 R 169 | /Prev 25 0 R 170 | /Next 27 0 R 171 | >> 172 | endobj 173 | 27 0 obj 174 | << 175 | /Title(Expressions) 176 | /A<< 177 | /S/GoTo 178 | /D(subsection.2.7) 179 | >> 180 | /Parent 20 0 R 181 | /Prev 26 0 R 182 | /Next 28 0 R 183 | >> 184 | endobj 185 | 28 0 obj 186 | << 187 | /Title(Lexical analysis) 188 | /A<< 189 | /S/GoTo 190 | /D(subsection.2.8) 191 | >> 192 | /Parent 20 0 R 193 | /Prev 27 0 R 194 | /Next 29 0 R 195 | >> 196 | endobj 197 | 30 0 obj 198 | << 199 | /Title(Lexeme parsers -------------------------------------------------------------------------------- /test/testdata/test.snappy.parquet: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/test.snappy.parquet -------------------------------------------------------------------------------- /test/testdata/text-iso8859-1.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ahupp/python-magic/62bd3c6a562b26e4005a012c30a0e86428b8defc/test/testdata/text-iso8859-1.txt -------------------------------------------------------------------------------- /test/testdata/text.txt: -------------------------------------------------------------------------------- 1 | Hello, World! 2 | 3 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | [tox] 2 | envlist = 3 | py27, 4 | py35, 5 | py36, 6 | py37, 7 | py38, 8 | py39, 9 | py310, 10 | py311, 11 | py312, 12 | py313, 13 | mypy 14 | 15 | [testenv] 16 | commands = 17 | coverage run -m pytest 18 | 19 | setenv = 20 | COVERAGE_FILE=.coverage.{envname} 21 | LC_ALL=en_US.UTF-8 22 | deps = 23 | .[test] 24 | coverage 25 | pytest 26 | 27 | [testenv:coverage-clean] 28 | deps = coverage 29 | setenv = 30 | COVERAGE_FILE=.coverage 31 | skip_install = true 32 | commands = coverage erase 33 | 34 | [testenv:coverage-report] 35 | deps = coverage 36 | setenv = 37 | COVERAGE_FILE=.coverage 38 | skip_install = true 39 | commands = 40 | coverage combine 41 | coverage report 42 | coverage html 43 | coverage 44 | 45 | [testenv:mypy] 46 | deps = mypy 47 | skip_install = true 48 | commands = 49 | mypy -p magic 50 | 51 | -------------------------------------------------------------------------------- /upload.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | python3 setup.py clean --all 4 | python3 setup.py sdist bdist_wheel 5 | #python3 -m twine upload dist/* 6 | 7 | 
--------------------------------------------------------------------------------