10 | :Date: 2011-11-19
11 | :Copyright: 2006-2011 Brett Smith and others
12 |
13 | :Manual section: 1
14 |
15 | SYNOPSIS
16 | ========
17 |
18 | dtrx [OPTIONS] ARCHIVE [ARCHIVE ...]
19 |
20 | DESCRIPTION
21 | ===========
22 |
23 | dtrx extracts archives in a number of different formats; it currently
24 | supports tar, zip (including self-extracting .exe files), cpio, rpm, deb,
25 | gem, 7z, cab, rar, lzh, arj, and InstallShield files. It can also decompress
26 | files compressed with gzip, bzip2, lzma, xz, lrzip, lzip, or compress.
27 |
28 | In addition to providing one command to handle many different archive
29 | types, dtrx also aids the user by extracting contents consistently. By
30 | default, everything will be written to a dedicated directory that's named
31 | after the archive. dtrx will also change the permissions to ensure that the
32 | owner can read and write all those files.
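
The permission fix described above can be illustrated with a minimal Python sketch. The helper name `ensure_owner_rw` is hypothetical and is not taken from dtrx's source. Granting the owner search (execute) permission on directories is an assumption, made here so that extracted directories can be entered.

```python
import os
import stat


def ensure_owner_rw(root):
    """Add owner read/write bits under root, leaving all other bits alone."""

    def add_bits(path, bits):
        # Skip symlinks: os.chmod would change the link target instead.
        if os.path.islink(path):
            return
        mode = stat.S_IMODE(os.lstat(path).st_mode)
        if mode & bits != bits:
            os.chmod(path, mode | bits)

    # Assumption: directories also get the owner execute bit so they can be entered.
    dir_bits = stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
    file_bits = stat.S_IRUSR | stat.S_IWUSR

    add_bits(root, dir_bits)
    # Top-down walk: each subdirectory is fixed before os.walk descends into it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            add_bits(os.path.join(dirpath, name), dir_bits)
        for name in filenames:
            add_bits(os.path.join(dirpath, name), file_bits)
```

Only bits that are missing get added, so group and world permissions stay as the archive recorded them.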
33 |
34 | To run dtrx, simply call it with the archive(s) you wish to extract as
35 | arguments. For example::
36 |
37 | $ dtrx coreutils-5.*.tar.gz
38 |
39 | You may specify URLs as arguments as well. If you do, dtrx will use `wget
40 | -c` to download the URL to the current directory and then extract what it
41 | downloads. This may fail if you already have a file in the current
42 | directory with the same name as the file you're trying to download.
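
The URL handling described above might look roughly like the following Python sketch. The function names and the list of accepted schemes are assumptions for illustration only. The one detail taken from the text is that the download uses `wget -c`.

```python
from urllib.parse import urlparse

# Assumed set of schemes; the real tool may recognise a different set.
DOWNLOAD_SCHEMES = {"http", "https", "ftp"}


def is_url(argument):
    """Return True if a command-line argument looks like a downloadable URL."""
    parsed = urlparse(argument)
    return parsed.scheme in DOWNLOAD_SCHEMES and bool(parsed.netloc)


def download_command(url):
    """Build the wget invocation; -c asks wget to continue a partial download."""
    return ["wget", "-c", url]
```

An argument such as `coreutils-5.0.tar.gz` has no scheme, so it would be treated as a local file rather than a download.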
43 |
44 | OPTIONS
45 | =======
46 |
47 | dtrx supports a number of options to mandate specific behavior:
48 |
49 | -r, --recursive
50 | With this option, dtrx will search inside the archives you specify to see
51 | if any of the contents are themselves archives, and extract those as
52 | well.
53 |
54 | --one, --one-entry
55 | Normally, if an archive only contains one file or directory with a name
56 | that doesn't match the archive's, dtrx will ask you how to handle it.
57 | With this option, you can specify ahead of time what should happen.
58 | Possible values are:
59 |
60 | inside
61 | Extract the file/directory inside another directory named after the
62 | archive. This is the default.
63 |
64 | rename
65 | Extract the file/directory in the current directory, and then rename
66 | it to match the name of the archive.
67 |
68 | here
69 | Extract the file/directory in the current directory.
70 |
71 | -o, --overwrite
72 | Normally, dtrx will avoid extracting into a directory that already exists,
73 | and instead try to find an alternative name to use. If this option is
74 | listed, dtrx will use the default directory name no matter what.
75 |
76 | -f, --flat
77 | Extract all archive contents into the current directory, instead of
78 | their own dedicated directory. This is handy if you have multiple
79 | archive files which all need to be extracted into the same directory
80 | structure. Note that existing files may be overwritten with this
81 | option.
82 |
83 | -n, --noninteractive
84 | dtrx will normally ask the user how to handle certain corner cases, such
85 | as how to handle an archive that only contains one file. This option
86 | suppresses those questions; dtrx will instead use sane, conservative
87 | defaults.
88 |
89 | -l, -t, --list, --table
90 | Don't extract the archives; just list their contents on standard output.
91 |
92 | -m, --metadata
93 | Extract the metadata from .deb and .gem archives, instead of their normal
94 | contents.
95 |
96 | -q, --quiet
97 | Suppress warning messages. List this option twice to make dtrx silent.
98 |
99 | -v, --verbose
100 | Show the files that are being extracted. List this option twice to
101 | print debugging information.
102 |
103 | --help
104 | Display basic help.
105 |
106 | --version
107 | Display dtrx's version, copyright, and license information.
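
The description of `-o, --overwrite` says that dtrx looks for an alternative name when the target directory already exists. The text does not specify the naming scheme, so the numeric suffix in this Python sketch is purely an assumption, and `pick_directory_name` is a hypothetical helper.

```python
import os


def pick_directory_name(base, overwrite=False):
    """Choose the directory to extract into.

    With overwrite=True the default name is used no matter what.
    Otherwise, if the name is taken, try base.1, base.2, ... (assumed scheme).
    """
    if overwrite or not os.path.exists(base):
        return base
    counter = 1
    while os.path.exists("%s.%d" % (base, counter)):
        counter += 1
    return "%s.%d" % (base, counter)
```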
108 |
109 | LICENSE
110 | =======
111 |
112 | dtrx 7.1 is copyright © 2006-2011 Brett Smith and others. Feel free to
113 | send comments, bug reports, patches, and so on. You can find the latest
114 | version of dtrx on its home page at
115 | .
116 |
117 | dtrx is free software; you can redistribute it and/or modify it under the
118 | terms of the GNU General Public License as published by the Free Software
119 | Foundation; either version 3 of the License, or (at your option) any
120 | later version.
121 |
122 | This program is distributed in the hope that it will be useful, but
123 | WITHOUT ANY WARRANTY; without even the implied warranty of
124 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
125 | Public License for more details.
126 |
127 | You should have received a copy of the GNU General Public License along
128 | with this program; if not, see .
129 |
--------------------------------------------------------------------------------
/archived/web/index.html:
--------------------------------------------------------------------------------
1 |
3 |
4 | dtrx: Intelligent archive extraction
5 |
6 |
7 |
8 |
9 | dtrx: Intelligent archive extraction
10 |
11 | Introduction
12 |
13 | dtrx stands for “Do The Right
14 | Extraction.” It's a tool for Unix-like systems that takes all the
15 | hassle out of extracting archives. Here's an example of how you use
16 | it:
17 |
18 | $ dtrx linux-3.0.1.tar.bz2
19 |
20 | That's basically the same thing as:
21 |
22 | $ tar -jxf linux-3.0.1.tar.bz2
23 |
24 | But there's more to it than that. You know those really annoying files
25 | that don't put everything in a dedicated directory, and have the
26 | permissions all wrong?
27 |
28 | $ tar -zvxf random-tarball.tar.gz
29 | foo
30 | bar
31 | data/
32 | data/text
33 | $ cd data/
34 | cd: permission denied: data
35 |
36 | dtrx takes care of all those problems for
37 | you, too:
38 |
39 | $ dtrx random-tarball.tar.gz
40 | $ cd random-tarball/data
41 | $ cat text
42 | This all works properly.
43 |
44 | dtrx is simple and powerful. Just use the
45 | same command for all your archive files, and they'll never frustrate you
46 | again.
47 |
48 | Features
49 |
50 |
51 |
52 | Handles many archive types: You only need to remember
53 | one simple command to extract
54 |
55 | tar,
56 | zip,
57 | cpio,
58 | deb,
59 | rpm,
60 | gem,
61 | 7z,
62 | cab,
63 | lzh,
64 | rar,
65 | arj,
66 | gz,
67 | bz2,
68 | lzma,
69 | xz,
70 | lrzip,
71 | lzip,
72 | and many kinds of
73 | exe files, including Microsoft Cabinet archives,
74 | InstallShield archives, and self-extracting zip
75 | files.
76 |
77 | If they have any extra compression, like tar.bz2
78 | files, dtrx will take care of that for you,
79 | too.
80 |
81 | Keeps everything organized: dtrx will make sure that archives are extracted into
83 | their own dedicated directories.
84 |
85 | Sane permissions: dtrx makes
86 | sure you can read and write all the files you just extracted, while leaving
87 | the rest of the permissions intact.
88 |
89 | Recursive extraction: dtrx can
90 | find archives inside the archive and extract those too.
91 |
92 |
93 |
94 | Download
95 |
96 | Download dtrx
97 | 7.1. The SHA1 checksum for this file
98 | is 05cfe705a04a8b84571b0a5647cd2648720791a4. Improvements in this
99 | release include:
100 |
101 |
102 |
103 | Support for LZH archives.
104 | Minor bug fixes in handling recursive extraction and empty
105 | archives.
106 |
107 |
108 |
109 | If you would like to try the latest development version—or maybe do some
110 | work on it yourself—you can check out the
111 | project's Git
112 | repository. A
113 | web repository is
114 | available, or you can just run:
115 |
116 | $ git clone git://gitorious.org/dtrx/dtrx.git
117 |
118 | Requirements
119 |
120 | If you have Python 2.4 or greater, this should work out of the box. If
121 | you're stuck on Python 2.3, you can use this if you install
122 | the subprocess
123 | module. You'll need the usual tools for the archive types you want to
124 | extract: for example, if you're extracting zip
125 | files, you'll need zipinfo
126 | and unzip. See the INSTALL file included
127 | with dtrx for a complete list of necessary
128 | utilities.
129 |
130 | Installation
131 |
132 | You can just put scripts/dtrx wherever is
133 | convenient for you, but if you want to install the program system-wide, you
134 | can also run the following command as root or equivalent:
135 |
136 | python setup.py install --prefix=/usr/local
137 |
138 | See the included INSTALL file for more information.
139 |
140 |
141 |
142 |
--------------------------------------------------------------------------------
/uv.lock:
--------------------------------------------------------------------------------
1 | version = 1
2 | revision = 3
3 | requires-python = ">=3.10"
4 |
5 | [[package]]
6 | name = "docutils"
7 | version = "0.16"
8 | source = { registry = "https://pypi.org/simple" }
9 | sdist = { url = "https://files.pythonhosted.org/packages/2f/e0/3d435b34abd2d62e8206171892f174b180cd37b09d57b924ca5c2ef2219d/docutils-0.16.tar.gz", hash = "sha256:c2de3a60e9e7d07be26b7f2b00ca0309c207e06c100f9cc2a94931fc75a478fc", size = 1962041, upload-time = "2020-01-12T13:55:25.917Z" }
10 | wheels = [
11 | { url = "https://files.pythonhosted.org/packages/81/44/8a15e45ffa96e6cf82956dd8d7af9e666357e16b0d93b253903475ee947f/docutils-0.16-py2.py3-none-any.whl", hash = "sha256:0c5b78adfbf7762415433f5515cd5c9e762339e23369dbe8000d84a4bf4ab3af", size = 548181, upload-time = "2020-01-12T13:55:21.393Z" },
12 | ]
13 |
14 | [[package]]
15 | name = "dtrx"
16 | version = "8.7.1"
17 | source = { editable = "." }
18 | dependencies = [
19 | { name = "unsupported-python", marker = "sys_platform == 'win32'" },
20 | ]
21 |
22 | [package.dev-dependencies]
23 | dev = [
24 | { name = "docutils" },
25 | { name = "pyyaml" },
26 | { name = "ruff" },
27 | ]
28 |
29 | [package.metadata]
30 | requires-dist = [{ name = "unsupported-python", marker = "sys_platform == 'win32'", specifier = "==1.0.0" }]
31 |
32 | [package.metadata.requires-dev]
33 | dev = [
34 | { name = "docutils", specifier = "==0.16" },
35 | { name = "pyyaml", specifier = "==5.3.1" },
36 | { name = "ruff", specifier = ">=0.14.3" },
37 | ]
38 |
39 | [[package]]
40 | name = "pyyaml"
41 | version = "5.3.1"
42 | source = { registry = "https://pypi.org/simple" }
43 | sdist = { url = "https://files.pythonhosted.org/packages/64/c2/b80047c7ac2478f9501676c988a5411ed5572f35d1beff9cae07d321512c/PyYAML-5.3.1.tar.gz", hash = "sha256:b8eac752c5e14d3eca0e6dd9199cd627518cb5ec06add0de9d32baeee6fe645d", size = 269377, upload-time = "2020-03-18T21:41:21.618Z" }
44 |
45 | [[package]]
46 | name = "ruff"
47 | version = "0.14.3"
48 | source = { registry = "https://pypi.org/simple" }
49 | sdist = { url = "https://files.pythonhosted.org/packages/75/62/50b7727004dfe361104dfbf898c45a9a2fdfad8c72c04ae62900224d6ecf/ruff-0.14.3.tar.gz", hash = "sha256:4ff876d2ab2b161b6de0aa1f5bd714e8e9b4033dc122ee006925fbacc4f62153", size = 5558687, upload-time = "2025-10-31T00:26:26.878Z" }
50 | wheels = [
51 | { url = "https://files.pythonhosted.org/packages/ce/8e/0c10ff1ea5d4360ab8bfca4cb2c9d979101a391f3e79d2616c9bf348cd26/ruff-0.14.3-py3-none-linux_armv6l.whl", hash = "sha256:876b21e6c824f519446715c1342b8e60f97f93264012de9d8d10314f8a79c371", size = 12535613, upload-time = "2025-10-31T00:25:44.302Z" },
52 | { url = "https://files.pythonhosted.org/packages/d3/c8/6724f4634c1daf52409fbf13fefda64aa9c8f81e44727a378b7b73dc590b/ruff-0.14.3-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:b6fd8c79b457bedd2abf2702b9b472147cd860ed7855c73a5247fa55c9117654", size = 12855812, upload-time = "2025-10-31T00:25:47.793Z" },
53 | { url = "https://files.pythonhosted.org/packages/de/03/db1bce591d55fd5f8a08bb02517fa0b5097b2ccabd4ea1ee29aa72b67d96/ruff-0.14.3-py3-none-macosx_11_0_arm64.whl", hash = "sha256:71ff6edca490c308f083156938c0c1a66907151263c4abdcb588602c6e696a14", size = 11944026, upload-time = "2025-10-31T00:25:49.657Z" },
54 | { url = "https://files.pythonhosted.org/packages/0b/75/4f8dbd48e03272715d12c87dc4fcaaf21b913f0affa5f12a4e9c6f8a0582/ruff-0.14.3-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:786ee3ce6139772ff9272aaf43296d975c0217ee1b97538a98171bf0d21f87ed", size = 12356818, upload-time = "2025-10-31T00:25:51.949Z" },
55 | { url = "https://files.pythonhosted.org/packages/ec/9b/506ec5b140c11d44a9a4f284ea7c14ebf6f8b01e6e8917734a3325bff787/ruff-0.14.3-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:cd6291d0061811c52b8e392f946889916757610d45d004e41140d81fb6cd5ddc", size = 12336745, upload-time = "2025-10-31T00:25:54.248Z" },
56 | { url = "https://files.pythonhosted.org/packages/c7/e1/c560d254048c147f35e7f8131d30bc1f63a008ac61595cf3078a3e93533d/ruff-0.14.3-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a497ec0c3d2c88561b6d90f9c29f5ae68221ac00d471f306fa21fa4264ce5fcd", size = 13101684, upload-time = "2025-10-31T00:25:56.253Z" },
57 | { url = "https://files.pythonhosted.org/packages/a5/32/e310133f8af5cd11f8cc30f52522a3ebccc5ea5bff4b492f94faceaca7a8/ruff-0.14.3-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:e231e1be58fc568950a04fbe6887c8e4b85310e7889727e2b81db205c45059eb", size = 14535000, upload-time = "2025-10-31T00:25:58.397Z" },
58 | { url = "https://files.pythonhosted.org/packages/a2/a1/7b0470a22158c6d8501eabc5e9b6043c99bede40fa1994cadf6b5c2a61c7/ruff-0.14.3-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:469e35872a09c0e45fecf48dd960bfbce056b5db2d5e6b50eca329b4f853ae20", size = 14156450, upload-time = "2025-10-31T00:26:00.889Z" },
59 | { url = "https://files.pythonhosted.org/packages/0a/96/24bfd9d1a7f532b560dcee1a87096332e461354d3882124219bcaff65c09/ruff-0.14.3-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3d6bc90307c469cb9d28b7cfad90aaa600b10d67c6e22026869f585e1e8a2db0", size = 13568414, upload-time = "2025-10-31T00:26:03.291Z" },
60 | { url = "https://files.pythonhosted.org/packages/a7/e7/138b883f0dfe4ad5b76b58bf4ae675f4d2176ac2b24bdd81b4d966b28c61/ruff-0.14.3-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0e2f8a0bbcffcfd895df39c9a4ecd59bb80dca03dc43f7fb63e647ed176b741e", size = 13315293, upload-time = "2025-10-31T00:26:05.708Z" },
61 | { url = "https://files.pythonhosted.org/packages/33/f4/c09bb898be97b2eb18476b7c950df8815ef14cf956074177e9fbd40b7719/ruff-0.14.3-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:678fdd7c7d2d94851597c23ee6336d25f9930b460b55f8598e011b57c74fd8c5", size = 13539444, upload-time = "2025-10-31T00:26:08.09Z" },
62 | { url = "https://files.pythonhosted.org/packages/9c/aa/b30a1db25fc6128b1dd6ff0741fa4abf969ded161599d07ca7edd0739cc0/ruff-0.14.3-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:1ec1ac071e7e37e0221d2f2dbaf90897a988c531a8592a6a5959f0603a1ecf5e", size = 12252581, upload-time = "2025-10-31T00:26:10.297Z" },
63 | { url = "https://files.pythonhosted.org/packages/da/13/21096308f384d796ffe3f2960b17054110a9c3828d223ca540c2b7cc670b/ruff-0.14.3-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:afcdc4b5335ef440d19e7df9e8ae2ad9f749352190e96d481dc501b753f0733e", size = 12307503, upload-time = "2025-10-31T00:26:12.646Z" },
64 | { url = "https://files.pythonhosted.org/packages/cb/cc/a350bac23f03b7dbcde3c81b154706e80c6f16b06ff1ce28ed07dc7b07b0/ruff-0.14.3-py3-none-musllinux_1_2_i686.whl", hash = "sha256:7bfc42f81862749a7136267a343990f865e71fe2f99cf8d2958f684d23ce3dfa", size = 12675457, upload-time = "2025-10-31T00:26:15.044Z" },
65 | { url = "https://files.pythonhosted.org/packages/cb/76/46346029fa2f2078826bc88ef7167e8c198e58fe3126636e52f77488cbba/ruff-0.14.3-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a65e448cfd7e9c59fae8cf37f9221585d3354febaad9a07f29158af1528e165f", size = 13403980, upload-time = "2025-10-31T00:26:17.81Z" },
66 | { url = "https://files.pythonhosted.org/packages/9f/a4/35f1ef68c4e7b236d4a5204e3669efdeefaef21f0ff6a456792b3d8be438/ruff-0.14.3-py3-none-win32.whl", hash = "sha256:f3d91857d023ba93e14ed2d462ab62c3428f9bbf2b4fbac50a03ca66d31991f7", size = 12500045, upload-time = "2025-10-31T00:26:20.503Z" },
67 | { url = "https://files.pythonhosted.org/packages/03/15/51960ae340823c9859fb60c63301d977308735403e2134e17d1d2858c7fb/ruff-0.14.3-py3-none-win_amd64.whl", hash = "sha256:d7b7006ac0756306db212fd37116cce2bd307e1e109375e1c6c106002df0ae5f", size = 13594005, upload-time = "2025-10-31T00:26:22.533Z" },
68 | { url = "https://files.pythonhosted.org/packages/b7/73/4de6579bac8e979fca0a77e54dec1f1e011a0d268165eb8a9bc0982a6564/ruff-0.14.3-py3-none-win_arm64.whl", hash = "sha256:26eb477ede6d399d898791d01961e16b86f02bc2486d0d1a7a9bb2379d055dc1", size = 12590017, upload-time = "2025-10-31T00:26:24.52Z" },
69 | ]
70 |
71 | [[package]]
72 | name = "unsupported-python"
73 | version = "1.0.0"
74 | source = { registry = "https://pypi.org/simple" }
75 | sdist = { url = "https://files.pythonhosted.org/packages/e1/ef/e6f8232d0f7f3c402b05b186aacdfc632bca5ad33d114375cf9eefd749ef/unsupported-python-1.0.0.tar.gz", hash = "sha256:64e4eb0c4d99b9ad17247dc2c34e0a9b8be2b21dc1c28079ab625457a74a5abe", size = 2414, upload-time = "2023-04-18T22:47:26.691Z" }
76 |
--------------------------------------------------------------------------------
/tests/test-1.23.tar:
--------------------------------------------------------------------------------
1 | test-1.23/ 0000755 0001750 0001750 00000000000 10520721673 011443 5 ustar brett brett test-1.23/1/ 0000755 0001750 0001750 00000000000 10520721657 011605 5 ustar brett brett test-1.23/1/2/ 0000755 0001750 0001750 00000000000 10520721664 011744 5 ustar brett brett test-1.23/1/2/3 0000644 0001750 0001750 00000000000 10520721664 012017 0 ustar brett brett test-1.23/a/ 0000755 0001750 0001750 00000000000 10520721671 011661 5 ustar brett brett test-1.23/a/b 0000644 0001750 0001750 00000000000 10520721671 012013 0 ustar brett brett test-1.23/foobar 0000644 0001750 0001750 00000000000 10520721673 012624 0 ustar brett brett
--------------------------------------------------------------------------------
/archived/NEWS:
--------------------------------------------------------------------------------
1 | Changes in dtrx
2 | ===============
3 |
4 | Version 7.2
5 | -----------
6 |
7 | Thanks to Ville Skyllä, who contributed most of the new features and
8 | enhancements in this release.
9 |
10 | New features
11 | ~~~~~~~~~~~~
12 |
13 | * dtrx now supports the arj archive, lrzip encoding, and several specific
14 | file extensions.
15 |
16 | * If unar is available, dtrx can try to use it to extract rar archives.
17 |
18 | Bug fixes
19 | ~~~~~~~~~
20 |
21 | * dtrx will get correct file magic information for archives it's reading
22 | through symbolic links.
23 |
24 | * File listings for rar archives now include full paths.
25 |
26 | Development changes
27 | ~~~~~~~~~~~~~~~~~~~
28 |
29 | * dtrx development is now `hosted on Gitorious`_.
30 |
31 | .. _hosted on Gitorious: http://gitorious.org/dtrx
32 |
33 | * The test script can run specific tests, and has improved output on
34 | interactive terminals.
35 |
36 | Version 7.1
37 | -----------
38 |
39 | New features
40 | ~~~~~~~~~~~~
41 |
42 | * LZH archives are now supported.
43 |
44 | Bug fixes
45 | ~~~~~~~~~
46 |
47 | * dtrx will no longer offer to extract the zero archive files found in a
48 | zero-file archive.
49 |
50 | * Temporary directories will be cleaned up after extracting an empty
51 | archive.
52 |
53 | Version 7.0
54 | -----------
55 |
56 | At this point, I consider dtrx to be mature software. It's maybe a little
57 | too interactive, but otherwise it does everything I want, and it does it
58 | very well. Expect new releases to be few and far between going forward.
59 |
60 | New features
61 | ~~~~~~~~~~~~
62 |
63 | * If any of dtrx's command line arguments are URLs, it will automatically
64 | download them with `wget -c` in the current directory before extracting
65 | them. See the documentation for more information about this feature.
66 | Note that there might be trouble if there's already a file in the
67 | directory where wget would normally save the download.
68 |
69 | Enhancements
70 | ~~~~~~~~~~~~
71 |
72 | * dtrx will try to extract ZIP files with 7z if unzip is not successful.
73 | Thanks to Edward H for reporting this bug.
74 |
75 | * dtrx will be smarter about removing extensions from filenames when
76 | extracting to a new directory or file.
77 |
78 | * dtrx will not ask you if you want to recurse through an archive if
79 | the number of archives inside the original file is small.
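
The fallback behaviour noted above (trying 7z when unzip fails) can be sketched in Python as a loop over candidate commands. This is an illustration under stated assumptions, not dtrx's actual implementation: `run_first_success` is a hypothetical helper, and the way errors are collected is invented for the example.

```python
import subprocess


def run_first_success(commands):
    """Try each command in order; stop at the first one that exits with 0.

    Returns (index_of_successful_command_or_None, list_of_error_messages).
    """
    errors = []
    for index, command in enumerate(commands):
        try:
            result = subprocess.run(
                command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
            )
        except FileNotFoundError:
            # The underlying extraction tool is not installed.
            errors.append("%s: command not found" % command[0])
            continue
        if result.returncode == 0:
            return index, errors
        errors.append(result.stderr.decode("utf-8", errors="replace").strip())
    return None, errors
```

A caller could pass something like `[["unzip", "-q", "a.zip"], ["7z", "x", "a.zip"]]` and report the collected errors only when the returned index is `None`.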
80 |
81 | Version 6.6
82 | -----------
83 |
84 | Enhancements
85 | ~~~~~~~~~~~~
86 |
87 | * dtrx can now handle `xz compression`_.
88 |
89 | .. _xz compression: http://tukaani.org/xz/
90 |
91 | Other changes
92 | ~~~~~~~~~~~~~
93 |
94 | * The tests now use the PyYAML library, instead of the abandoned Syck.
95 | Thanks to Miguelangel Jose Freitas Loreto for a patch.
96 |
97 | Version 6.5
98 | -----------
99 |
100 | Enhancements
101 | ~~~~~~~~~~~~
102 |
103 | * When you list archive contents with -l or -t, dtrx will start printing
104 | results much faster than it used to. There's a small chance that it
105 | will print some incorrect listings if it misdetects the archive type of
106 | a given file, but it will show you an error message when that happens.
107 |
108 | * dtrx recognizes more kinds of compressed tar archives by their
109 | extension.
110 |
111 | * You can now extract newer .deb packages that are compressed with bzip2
112 | or lzma.
113 |
114 | Bug fixes
115 | ~~~~~~~~~
116 |
117 | * When extracting an archive that contained a file with a mismatched
118 | filename, the prompt would offer you a chance to "rename the directory"
119 | instead of "rename the file." This wording has been fixed, along with
120 | some other wording adjustments in the prompts generally.
121 |
122 | * Perform more reliable detection of the terminal size, and improve word
123 | wrapping on prompts.
124 |
125 | Other changes
126 | ~~~~~~~~~~~~~
127 |
128 | * The README is now written like a man page, and can be converted to a man
129 | page by using rst2man_.
130 |
131 | .. _rst2man: http://docutils.sourceforge.net/sandbox/manpage-writer/
132 |
133 | Version 6.4
134 | -----------
135 |
136 | Enhancements
137 | ~~~~~~~~~~~~
138 |
139 | * Support detection of LZMA archives by magic.
140 |
141 | * Interactive prompts are wrapped much more cleanly.
142 |
143 | Bug fixes
144 | ~~~~~~~~~
145 |
146 | * Fix a bug where dtrx would crash when extracting an archive with no
147 | files inside it.
148 |
149 | Version 6.3
150 | -----------
151 |
152 | New features
153 | ~~~~~~~~~~~~
154 |
155 | * Add support for RAR archives. Thanks to Peter Kelemen for the patch.
156 |
157 | Bug fixes
158 | ~~~~~~~~~
159 |
160 | * Previous versions of dtrx would fail to extract certain archive types
161 | with the ``-v`` option specified. This has been fixed.
162 |
163 | * dtrx 6.3 no longer imports the sets module unless it's running under a
164 | very old version of Python, to avoid deprecation warnings under Python
165 | 2.6.
166 |
167 | Version 6.2
168 | -----------
169 |
170 | New features
171 | ~~~~~~~~~~~~
172 |
173 | * --one-entry option: Normally, if an archive only contains one file or
174 | directory with a name that doesn't match the archive's, dtrx will ask
175 | you how to handle it. With this option, you can specify ahead of time
176 | what should happen.
177 |
178 | Bug fixes
179 | ~~~~~~~~~
180 |
181 | * Since version 6.0, when you extracted or listed the contents of a cpio
182 | archive, dtrx would display a warning that simply said "1234 blocks."
183 | dtrx 6.2 suppresses this message.
184 |
185 | * When you try to list the contents of an archive, dtrx will now cope with
186 | misnamed files more gracefully, giving more accurate results and showing
187 | fewer error messages.
188 |
189 | * dtrx 6.2 will only show you error messages from archive extraction if it
190 | is completely unable to extract the file. If one of its extraction
191 | methods succeeds, it will no longer show you the error messages from
192 | previous extraction attempts.
193 |
194 | * dtrx is now better about cleaning up partially extracted archives when
195 | it encounters an error or signal.
196 |
197 | * Users will no longer see error messages about broken pipes from dtrx.
198 |
199 | Version 6.1
200 | -----------
201 |
202 | New features
203 | ~~~~~~~~~~~~
204 |
205 | * Add support for InstallShield archives, using the unshield command.
206 |
207 | * The wording of many of the interactive prompts has been adjusted,
208 | hopefully to be clearer and provide more information to the user
209 | immediately.
210 |
211 | Bug fixes
212 | ~~~~~~~~~
213 |
214 | * dtrx 6.1 does a better job protecting against race conditions when
215 | extracting a single file.
216 |
217 | * If you used the -f option, and extracted an archive that only contained
218 | one file or directory, dtrx 6.0 would still prompt you to ask how it
219 | should be extracted. dtrx 6.1 fixes this, extracting the contents to
220 | the current directory as -f requires.
221 |
222 | * Recursive extraction would not work well in dtrx 6.0 when the contents
223 | of the original archive were a single file. This has been fixed in dtrx
224 | 6.1.
225 |
226 | Version 6.0
227 | -----------
228 |
229 | New features
230 | ~~~~~~~~~~~~
231 |
232 | * When you specify -v at the command line, dtrx will display the files it
233 | extracts, much like tar.
234 |
235 | * When dtrx prompts you about how to handle recursive archives, you now
236 | have the option of listing what those archives contain before making a decision.
237 |
238 | * dtrx will now provide more information about why a particular extraction
239 | attempt failed. It will show you error messages from all the attempts
240 | it made, rather than only the last error it got. It will also detect
241 | and warn you when one of the underlying extraction tools, like
242 | cabextract, cannot be found.
243 |
244 | * dtrx does a better job of cleaning up after itself. It wouldn't always
245 | clean up temporary files after certain errors; that has been fixed. It
246 | also catches SIGINT and SIGTERM and cleans up before finishing
247 | execution.
248 |
249 | Bug fixes
250 | ~~~~~~~~~
251 |
252 | * Version 5.0 introduced a regression such that dtrx would not offer to
253 | extract recursive archives that were hidden under subdirectories.
254 | Version 6.0 fixes that.
255 |
256 | * dtrx would not properly extract recursive archives when the original
257 | archive contained a single directory. This has been fixed.
258 |
259 | Version 5.1
260 | -----------
261 |
262 | Bug fixes
263 | ~~~~~~~~~
264 |
265 | * Version 5.0 did not work with Python 2.3; it used a new language
266 | feature. This release fixes that.
267 |
268 | Version 5.0
269 | -----------
270 |
271 | New features
272 | ~~~~~~~~~~~~
273 |
274 | * dtrx can now extract Ruby gems, 7z archives, and Microsoft Cabinet
275 | archives. It can also handle files compressed with lzma, and extract
276 | the metadata from Debian packages and Ruby gems.
277 |
278 | * dtrx will now use several strategies to try to figure out what kind of
279 | file you have, and extract it accordingly. If one doesn't work, it'll
280 | try something else if it can.
281 |
282 | * dtrx now displays more helpful errors when things go wrong.
283 |
284 | * Previous versions of dtrx would look at what files were included in an
285 | archive, and then make a decision about how to extract it. Now, it
286 | always extracts files to a temporary directory, and figures out what to
287 | do with that directory afterward. This should be slightly faster and
288 | nicer to the system.
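
The extract-then-decide approach from the last item above can be sketched in Python as follows. `settle_extraction` is a hypothetical helper, and the rule it applies (move a single matching entry out, otherwise keep the whole temporary directory under the archive's name) is a simplified assumption. It also assumes the target name does not exist yet.

```python
import os
import shutil


def settle_extraction(tmpdir, archive_name, dest_parent):
    """Decide where already-extracted contents in tmpdir should end up."""
    entries = os.listdir(tmpdir)
    target = os.path.join(dest_parent, archive_name)
    if len(entries) == 1 and entries[0] == archive_name:
        # The archive already had a well-named top-level entry: move it out
        # and drop the now-empty temporary directory.
        shutil.move(os.path.join(tmpdir, entries[0]), target)
        os.rmdir(tmpdir)
    else:
        # Loose or mismatched contents: the temporary directory itself
        # becomes the dedicated directory named after the archive.
        shutil.move(tmpdir, target)
    return target
```

Creating the temporary directory inside the destination's parent (for example with `tempfile.mkdtemp(dir=dest_parent)`) keeps the final move on one filesystem.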
289 |
290 | Version 4.0
291 | -----------
292 |
293 | New features
294 | ~~~~~~~~~~~~
295 |
296 | * dtrx is now interactive. If the archive only contains one item, or
297 | contains other archives, dtrx will ask you how you would like to handle
298 | it. You can turn these questions off with the -n option.
299 |
300 | * There is a new -l option, which simply lists the archive's contents
301 | rather than extracting them.
302 |
--------------------------------------------------------------------------------
/tests/compare.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | #
4 | # compare.py -- High-level tests for dtrx.
5 | # Copyright © 2006-2009 Brett Smith .
6 | #
7 | # This program is free software; you can redistribute it and/or modify it
8 | # under the terms of the GNU General Public License as published by the
9 | # Free Software Foundation; either version 3 of the License, or (at your
10 | # option) any later version.
11 | #
12 | # This program is distributed in the hope that it will be useful, but
13 | # WITHOUT ANY WARRANTY; without even the implied warranty of
14 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
15 | # Public License for more details.
16 | #
17 | # You should have received a copy of the GNU General Public License along
18 | # with this program; if not, see .
19 |
20 | from __future__ import print_function
21 |
22 | import fcntl
23 | import os
24 | import re
25 | import struct
26 |
27 | try:
28 | import subprocess32 as subprocess
29 | except ImportError:
30 | import subprocess
31 |
32 | import sys
33 | import tempfile
34 | import termios
35 |
36 | import yaml
37 |
38 | try:
39 | set
40 | except NameError:
41 | from sets import Set as set
42 |
43 | if os.path.exists("dtrx/dtrx.py") and os.path.exists("tests"):
44 | os.chdir("tests")
45 | elif os.path.exists("../dtrx/dtrx.py") and os.path.exists("../tests"):
46 | pass
47 | else:
48 | print("ERROR: Can't run tests in this directory!")
49 | sys.exit(2)
50 |
51 | DTRX_SCRIPT = os.path.realpath("../dtrx/dtrx.py")
52 | SHELL_CMD = ["sh", "-se"]
53 | ROOT_DIR = os.path.realpath(os.curdir)
54 | NUM_TESTS = 0
55 |
56 |
57 | class ExtractorTestError(Exception):
58 | pass
59 |
60 |
61 | class StatusWriter(object):
62 | def __init__(self):
63 | try:
64 | size = fcntl.ioctl(
65 | sys.stdout.fileno(), termios.TIOCGWINSZ, struct.pack("HHHH", 0, 0, 0, 0)
66 | )
67 | except IOError:
68 | self.show = self.show_file
69 | else:
70 | self.width = struct.unpack("HHHH", size)[1] - 1
71 | self.last_width = self.width
72 | self.show = self.show_term
73 |
74 | def show_term(self, message):
75 | sys.stdout.write(message.ljust(self.last_width) + "\r")
76 | sys.stdout.flush()
77 | self.last_width = max(self.width, len(message))
78 |
79 | def show_file(self, message):
80 | if message:
81 | print(message)
82 |
83 | def clear(self):
84 | self.show("")
85 |
86 |
87 | class ExtractorTest(object):
88 | status_writer = StatusWriter()
89 |
90 | def __init__(self, **kwargs):
91 | global NUM_TESTS
92 | NUM_TESTS += 1
93 | self.test_num = NUM_TESTS
94 | self.name = kwargs["name"]
95 | self.options = kwargs.get("options", "-n").split()
96 | self.filenames = kwargs.get("filenames", "").split()
97 | for key in (
98 | "directory",
99 | "prerun",
100 | "posttest",
101 | "baseline",
102 | "error",
103 | "input",
104 | "output",
105 | "cleanup",
106 | ):
107 | setattr(self, key, kwargs.get(key, None))
108 | for key in ("grep", "antigrep"):
109 | value = kwargs.get(key, [])
110 | if isinstance(value, str):
111 | value = [value]
112 | setattr(self, key, value)
113 | if self.input and (not self.input.endswith("\n")):
114 | self.input = self.input + "\n"
115 |
116 | def start_proc(self, command, stdin=None, output=None):
117 | process = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=output, stderr=output)
118 | if stdin:
119 | process.stdin.write(bytes(str(stdin).encode("utf-8")))
120 | process.stdin.close()
121 | return process
122 |
123 | def get_results(self, command, stdin=None):
124 | print("Output from %s:" % (" ".join(command),), file=self.outbuffer)
125 | self.outbuffer.flush()
126 | status = self.start_proc(command, stdin, self.outbuffer).wait(5)
127 | process = subprocess.Popen(["find"], stdout=subprocess.PIPE)
128 | output = process.stdout.read(-1).decode("ascii", errors="ignore")
129 | process.stdout.close()
130 | process.wait()
131 | return status, set(output.split("\n"))
132 |
133 |     def run_script(self, key):
134 |         commands = getattr(self, key)
135 |         if commands is not None:
136 |             if self.directory:
137 |                 directory_hint = "../"
138 |             else:
139 |                 directory_hint = ""
140 |             self.start_proc(SHELL_CMD + [directory_hint], commands).wait()
141 |
142 |     def get_shell_results(self):
143 |         self.run_script("prerun")
144 |         return self.get_results(SHELL_CMD + self.filenames, self.baseline)
145 |
146 |     def get_extractor_results(self):
147 |         self.run_script("prerun")
148 |         # run with the current python interpreter, rather than relying on the
149 |         # hashbang.
150 |         return self.get_results(
151 |             [sys.executable, DTRX_SCRIPT] + self.options + self.filenames, self.input
152 |         )
153 |
154 |     def get_posttest_result(self):
155 |         if not self.posttest:
156 |             return 0
157 |         return self.start_proc(SHELL_CMD, self.posttest).wait()
158 |
159 |     def clean(self):
160 |         self.run_script("cleanup")
161 |         if self.directory:
162 |             target = os.path.join(ROOT_DIR, self.directory)
163 |             extra_options = []
164 |         else:
165 |             target = ROOT_DIR
166 |             extra_options = [
167 |                 "(",
168 |                 "(",
169 |                 "-type",
170 |                 "d",
171 |                 "!",
172 |                 "-name",
173 |                 "CVS",
174 |                 "!",
175 |                 "-name",
176 |                 ".svn",
177 |                 ")",
178 |                 "-or",
179 |                 "-name",
180 |                 "test-text",
181 |                 "-or",
182 |                 "-name",
183 |                 "test-onefile",
184 |                 ")",
185 |             ]
186 |         status = subprocess.call(
187 |             ["find", target, "-mindepth", "1", "-maxdepth", "1"]
188 |             + extra_options
189 |             + ["-exec", "rm", "-rf", "{}", ";"]
190 |         )
191 |         if status != 0:
192 |             raise ExtractorTestError("cleanup exited with status code %s" % (status,))
193 |
194 |     def show_pass(self):
195 |         self.status_writer.show("Passed %i/%i: %s" % (self.test_num, NUM_TESTS, self.name))
196 |         return "passed"
197 |
198 |     def show_report(self, status, message=None):
199 |         self.status_writer.clear()
200 |         self.outbuffer.seek(0, 0)
201 |         sys.stdout.write(self.outbuffer.read(-1))
202 |         if message is None:
203 |             last_part = ""
204 |         else:
205 |             last_part = ": %s" % (message,)
206 |         print("%s: %s%s\n" % (status, self.name, last_part))
207 |         return status.lower()
208 |
209 |     def compare_results(self, actual):
210 |         posttest_result = self.get_posttest_result()
211 |         self.clean()
212 |         status, expected = self.get_shell_results()
213 |         self.clean()
214 |         if expected != actual:
215 |             print("Only in baseline results:", file=self.outbuffer)
216 |             print("\n".join(expected.difference(actual)), file=self.outbuffer)
217 |             print("Only in actual results:", file=self.outbuffer)
218 |             print("\n".join(actual.difference(expected)), file=self.outbuffer)
219 |             return self.show_report("FAILED")
220 |         elif posttest_result != 0:
221 |             print("Posttest gave status code", posttest_result, file=self.outbuffer)
222 |             return self.show_report("FAILED")
223 |         return self.show_pass()
224 |
225 |     def have_error_mismatch(self, status):
226 |         if self.error and (status == 0):
227 |             return "dtrx did not return expected error"
228 |         elif (not self.error) and (status != 0):
229 |             return "dtrx returned error code %s" % (status,)
230 |         return None
231 |
232 |     def grep_output(self, output):
233 |         for pattern in self.grep:
234 |             if not re.search(pattern.replace(" ", "\\s+"), output, re.MULTILINE):
235 |                 return "output did not match %s" % (pattern,)
236 |         for pattern in self.antigrep:
237 |             if re.search(pattern.replace(" ", "\\s+"), output, re.MULTILINE):
238 |                 return "output matched antigrep %s" % (pattern,)
239 |         return None
240 |
241 |     def check_output(self, output):
242 |         if (self.output is not None) and (self.output.strip() != output.strip()):
243 |             return "output did not match provided text:\n{}\nVS:\n{}".format(
244 |                 repr(self.output), repr(output)
245 |             )
246 |         return None
247 |
248 |     def check_results(self):
249 |         self.clean()
250 |         status, actual = self.get_extractor_results()
251 |         self.outbuffer.seek(0, 0)
252 |         self.outbuffer.readline()
253 |         output = self.outbuffer.read(-1)
254 |         problem = (
255 |             self.have_error_mismatch(status)
256 |             or self.check_output(output)
257 |             or self.grep_output(output)
258 |         )
259 |         if problem:
260 |             return self.show_report("FAILED", problem)
261 |         if self.baseline is not None:
262 |             return self.compare_results(actual)
263 |         else:
264 |             self.clean()
265 |             return self.show_pass()
266 |
267 |     def run(self):
268 |         self.outbuffer = tempfile.TemporaryFile(mode="w+")
269 |         if self.directory:
270 |             os.mkdir(self.directory)
271 |             os.chdir(self.directory)
272 |         try:
273 |             result = self.check_results()
274 |         except ExtractorTestError as error:
275 |             result = self.show_report("ERROR", error)
276 |         self.outbuffer.close()
277 |         if self.directory:
278 |             os.chdir(ROOT_DIR)
279 |             subprocess.call(["chmod", "-R", "700", self.directory])
280 |             subprocess.call(["rm", "-rf", self.directory])
281 |         return result
282 |
283 |
284 | class TestsRunner(object):
285 |     outcomes = ["error", "failed", "passed"]
286 |
287 |     def __init__(self):
288 |         with open("tests.yml", "rb") as test_db:
289 |             self.test_data = yaml.load(
290 |                 test_db.read(-1).decode("utf-8", errors="ignore"),
291 |                 Loader=yaml.FullLoader,
292 |             )
293 |         self.name_regexps = [re.compile(s) for s in sys.argv[1:]]
294 |         self.tests = [ExtractorTest(**data) for data in self.test_data if self.wanted_test(data)]
295 |         self.add_subdir_tests()
296 |
297 |     def wanted_test(self, data):
298 |         if not self.name_regexps:
299 |             return True
300 |         return any([r.search(data["name"]) for r in self.name_regexps])
301 |
302 |     def add_subdir_tests(self):
303 |         for odata in self.test_data:
304 |             if (not self.wanted_test(odata)) or "directory" in odata or ("baseline" not in odata):
305 |                 continue
306 |             data = odata.copy()
307 |             data["name"] += " in .."
308 |             data["directory"] = "inside-dir"
309 |             data["filenames"] = " ".join([
310 |                 "../%s" % filename for filename in data.get("filenames", "").split()
311 |             ])
312 |             self.tests.append(ExtractorTest(**data))
313 |
314 |     def run(self):
315 |         results = {}
316 |         for outcome in self.outcomes:
317 |             results[outcome] = 0
318 |         for test in self.tests:
319 |             results[test.run()] += 1
320 |         if self.tests:
321 |             self.tests[-1].status_writer.clear()
322 |         print(
323 |             "Totals:",
324 |             ", ".join(["%s %s" % (results[key], key) for key in self.outcomes]),
325 |         )
326 |         return (results["error"] + results["failed"]) == 0
327 |
328 |
329 | runner = TestsRunner()
330 | if not runner.run():
331 |     sys.exit(1)
332 |
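
The `grep` and `antigrep` checks in `grep_output` above widen each literal space in a pattern to `\s+` and then call `re.search` with `re.MULTILINE`. A minimal sketch of that matching rule, kept separate from the harness; the helper name `pattern_matches` and the sample strings are illustrative assumptions, not part of the test script:

```python
import re


def pattern_matches(pattern, output):
    """Return True if pattern is found in output, treating each space as \\s+."""
    widened = pattern.replace(" ", "\\s+")
    return re.search(widened, output, re.MULTILINE) is not None


# A space in the pattern tolerates padding or a line break in the output.
print(pattern_matches("one directory", "contains one\n   directory"))
# Anchors apply per line because of re.MULTILINE.
print(pattern_matches("^1/2/3$", "a/b\n1/2/3\nfoobar"))
# "." never matches a newline, so an antigrep of "." only passes on blank output.
print(pattern_matches(".", "\n\n"))
```

This is why a test entry can use `antigrep: "."` to assert that dtrx printed nothing at all.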
--------------------------------------------------------------------------------
/tests/tests.yml:
--------------------------------------------------------------------------------
1 | # tests.yml -- Whole-program comparison tests for dtrx
2 | # Copyright © 2006-2011 Brett Smith
3 | # Copyright © 2011 Ville Skyttä
4 | #
5 | # This program is free software; you can redistribute it and/or modify it
6 | # under the terms of the GNU General Public License as published by the
7 | # Free Software Foundation; either version 3 of the License, or (at your
8 | # option) any later version.
9 | #
10 | # This program is distributed in the hope that it will be useful, but
11 | # WITHOUT ANY WARRANTY; without even the implied warranty of
12 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
13 | # Public License for more details.
14 | #
15 | # You should have received a copy of the GNU General Public License along
16 | # with this program; if not, see <http://www.gnu.org/licenses/>.
17 |
18 | - name: basic .tar
19 | filenames: test-1.23.tar
20 | baseline: |
21 | tar -xf $1
22 |
23 | - name: basic .tar.gz
24 | filenames: test-1.23.tar.gz
25 | baseline: |
26 | tar -zxf $1
27 |
28 | - name: basic .tar.bz2
29 | filenames: test-1.23.tar.bz2
30 | baseline: |
31 | mkdir test-1.23
32 | cd test-1.23
33 | tar -jxf ../$1
34 |
35 | - name: basic .tar.lrz
36 | filenames: test-1.23.tar.lrz
37 | baseline: |
38 | lrzcat -Q $1 | tar -xf -
39 |
40 | - name: basic .zip
41 | filenames: test-1.23.zip
42 | baseline: |
43 | mkdir test-1.23
44 | cd test-1.23
45 | unzip -q ../$1
46 |
47 | - name: basic .lzh
48 | filenames: test-1.23.lzh
49 | baseline: |
50 | mkdir test-1.23
51 | cd test-1.23
52 | lha xq ../$1
53 |
54 | - name: basic .deb
55 | filenames: test-1.23_all.deb
56 | baseline: |
57 | mkdir test-1.23
58 | cd test-1.23
59 | ar p ../$1 data.tar.gz | tar -zx
60 |
61 | - name: .deb with LZMA compression
62 | filenames: test-2_all.deb
63 | baseline: |
64 | mkdir test-2
65 | cd test-2
66 | ar p ../$1 data.tar.lzma | lzcat | tar -x
67 |
68 | - name: basic .gem
69 | filenames: test-1.23.gem
70 | baseline: |
71 | mkdir test-1.23
72 | cd test-1.23
73 | tar -xOf ../$1 data.tar.gz | tar -zx
74 |
75 | - name: basic .7z
76 | filenames: test-1.23.7z
77 | baseline: |
78 | 7z x $1
79 |
80 | - name: basic .lzma
81 | filenames: test-1.23.tar.lzma
82 | baseline: |
83 | lzcat $1 | tar -x
84 |
85 | - name: basic .cpio
86 | filenames: test-1.23.cpio
87 | baseline: |
88 | cpio -i --make-directories <$1
89 | antigrep: blocks?
90 |
91 | - name: basic .rar
92 | filenames: test-1.23.rar
93 | baseline: |
94 | mkdir test-1.23
95 | cd test-1.23
96 | unar -D ../$1 || unrar x ../$1
97 |
98 | - name: many files .rar
99 | filenames: test-lots-files.rar
100 | baseline: |
101 | mkdir test-lots-files
102 | cd test-lots-files
103 | unar -D ../$1 || unrar x ../$1
104 |
105 | - name: basic .arj
106 | filenames: test-1.23.arj
107 | baseline: |
108 | mkdir test-1.23
109 | cd test-1.23
110 | arj x -y ../$1
111 |
112 | - name: .deb metadata
113 | filenames: test-1.23_all.deb
114 | options: --metadata
115 | baseline: |
116 | mkdir test-1.23
117 | cd test-1.23
118 | ar p ../$1 control.tar.gz | tar -zx
119 |
120 | - name: .gem metadata
121 | filenames: test-1.23.gem
122 | options: -m
123 | baseline: |
124 | tar -xOf $1 metadata.gz | zcat > test-1.23.gem-metadata.txt
125 | cleanup: rm -f test-1.23.gem-metadata.txt
126 | posttest: |
127 | exec [ "$(cat test-1.23.gem-metadata.txt)" = "hi" ]
128 |
129 | - name: recursion and permissions
130 | filenames: test-recursive-badperms.tar.bz2
131 | options: -n -r
132 | baseline: |
133 | extract() {
134 | mkdir "$1"
135 | cd "$1"
136 | tar "-${3}xf" "../$2"
137 | }
138 | extract test-recursive-badperms "$1" j
139 | extract test-badperms test-badperms.tar
140 | chmod 700 testdir
141 | posttest: |
142 | exec [ "$(cat test-recursive-badperms/test-badperms/testdir/testfile)" = \
143 | "hey" ]
144 |
145 | - name: decompressing gz, not interactive
146 | directory: inside-dir
147 | filenames: ../test-text.gz
148 | options: ""
149 | antigrep: "."
150 | baseline: |
151 | zcat $1 >test-text
152 | posttest: |
153 | exec [ "$(cat test-text)" = "hi" ]
154 |
155 | - name: decompressing bz2, not interactive
156 | directory: inside-dir
157 | filenames: ../test-text.bz2
158 | options: ""
159 | antigrep: "."
160 | baseline: |
161 | bzcat $1 >test-text
162 | posttest: |
163 | exec [ "$(cat test-text)" = "hi" ]
164 |
165 | - name: decompressing xz, not interactive
166 | directory: inside-dir
167 | filenames: ../test-text.xz
168 | options: ""
169 | antigrep: "."
170 | baseline: |
171 | xzcat $1 >test-text
172 | posttest: |
173 | exec [ "$(cat test-text)" = "hi" ]
174 |
175 | - name: decompressing lrzip, not interactive
176 | directory: inside-dir
177 | filenames: ../test-text.lrz
178 | options: ""
179 | antigrep: "."
180 | baseline: |
181 | lrzcat -Q $1 >test-text
182 | posttest: |
183 | exec [ "$(cat test-text)" = "hi" ]
184 |
185 | - name: decompressing lzip, not interactive
186 | directory: inside-dir
187 | filenames: ../test-text.lz
188 | options: ""
189 | antigrep: "."
190 | baseline: |
191 | lzip -cd <$1 >test-text
192 | posttest: |
193 | exec [ "$(cat test-text)" = "hi" ]
194 |
195 | - name: decompression with -r
196 | directory: inside-dir
197 | filenames: ../test-text.gz
198 | options: -n -r
199 | baseline: |
200 | zcat $1 >test-text
201 |
202 | - name: decompression with -fr
203 | directory: inside-dir
204 | filenames: ../test-text.gz
205 | options: -n -fr
206 | baseline: |
207 | zcat $1 >test-text
208 |
209 | - name: overwrite protection
210 | filenames: test-1.23.tar.bz2
211 | baseline: |
212 | mkdir test-1.23.1
213 | cd test-1.23.1
214 | tar -jxf ../$1
215 | prerun: |
216 | mkdir test-1.23
217 |
218 | - name: overwrite option
219 | filenames: test-1.23.tar.bz2
220 | options: -n -o
221 | baseline: |
222 | cd test-1.23
223 | tar -jxf ../$1
224 | prerun: |
225 | mkdir test-1.23
226 |
227 | - name: flat option
228 | directory: inside-dir
229 | filenames: ../test-1.23.tar.bz2
230 | options: -n -f
231 | baseline: |
232 | tar -jxf $1
233 |
234 | - name: flat recursion and permissions
235 | directory: inside-dir
236 | filenames: ../test-recursive-badperms.tar.bz2
237 | options: -n -fr
238 | baseline: |
239 | tar -jxf $1
240 | tar -xf test-badperms.tar
241 | chmod 700 testdir
242 | posttest: |
243 | exec [ "$(cat testdir/testfile)" = "hey" ]
244 |
245 | - name: no files
246 | error: true
247 | grep: "[Uu]sage"
248 |
249 | - name: bad file
250 | error: true
251 | filenames: nonexistent-file.tar
252 |
253 | - name: not an archive
254 | error: true
255 | filenames: tests.yml
256 |
257 | - name: bad options
258 | options: -n --nonexistent-option
259 | filenames: test-1.23.tar
260 | error: true
261 |
262 | - name: --version
263 | options: -n --version
264 | grep: ersion \d+\.\d+
265 | filenames: test-1.23.tar
266 | baseline: |
267 | exit 0
268 |
269 | - name: one good archive of many
270 | filenames: tests.yml test-1.23.tar nonexistent-file.tar
271 | error: true
272 | baseline: |
273 | tar -xf $2
274 |
275 | - name: silence
276 | filenames: tests.yml
277 | options: -n -qq
278 | error: true
279 | antigrep: "."
280 |
281 | - name: can't write to directory
282 | directory: inside-dir
283 | filenames: ../test-1.23.tar
284 | error: true
285 | grep: ERROR
286 | antigrep: Traceback
287 | prerun: |
288 | chmod 500 .
289 |
290 | - name: list contents of one file
291 | options: -n -l
292 | filenames: test-1.23.tar
293 | output: |
294 | test-1.23.tar:
295 | test-1.23/
296 | test-1.23/1/
297 | test-1.23/1/2/
298 | test-1.23/1/2/3
299 | test-1.23/a/
300 | test-1.23/a/b
301 | test-1.23/foobar
302 |
303 | - name: list contents of LZH
304 | options: -n -l
305 | filenames: test-1.23.lzh
306 | output: |
307 | test-1.23.lzh:
308 | 1/
309 | 1/2/
310 | 1/2/3
311 | a/
312 | a/b
313 | foobar
314 |
315 | - name: list contents of 7z
316 | options: -n -l
317 | filenames: test-1.23.7z
318 | output: |
319 | test-1.23.7z:
320 | test-1.23/1/2/3
321 | test-1.23/a/b
322 | test-1.23/foobar
323 | test-1.23/a
324 | test-1.23/1/2
325 | test-1.23/1
326 | test-1.23
327 |
328 | - name: list contents of .arj
329 | options: -n -l
330 | filenames: test-1.23.arj
331 | output: |
332 | test-1.23.arj:
333 | a/b
334 | 1/2/3
335 | foobar
336 |
337 | - name: list contents of .cpio
338 | options: -n -l
339 | filenames: test-1.23.cpio
340 | grep: ^test-1\.23/1/2/3$
341 | antigrep: blocks?
342 |
343 | - name: list contents of multiple files
344 | options: -n --table
345 | filenames: test-1.23_all.deb test-1.23.zip
346 | output: |
347 | test-1.23_all.deb:
348 | 1/
349 | 1/2/
350 | 1/2/3
351 | a/
352 | a/b
353 | foobar
354 |
355 | test-1.23.zip:
356 | 1/2/3
357 | a/b
358 | foobar
359 |
360 | - name: list contents of compressed file
361 | options: -n -t
362 | filenames: test-text.gz
363 | output: |
364 | test-text.gz:
365 | test-text
366 |
367 | - name: default behavior with one directory (gz)
368 | options: -n
369 | filenames: test-onedir.tar.gz
370 | baseline: |
371 | mkdir test-onedir
372 | cd test-onedir
373 | tar -zxf ../$1
374 |
375 | - name: one directory extracted inside another interactively (gz)
376 | options: ""
377 | filenames: test-onedir.tar.gz
378 | grep: one directory
379 | input: i
380 | baseline: |
381 | mkdir test-onedir
382 | cd test-onedir
383 | tar -zxf ../$1
384 |
385 | - name: one directory extracted with rename interactively (gz)
386 | options: ""
387 | filenames: test-onedir.tar.gz
388 | input: r
389 | baseline: |
390 | tar -zxf $1
391 | mv test test-onedir
392 |
393 | - name: one directory extracted here interactively (gz)
394 | options: ""
395 | filenames: test-onedir.tar.gz
396 | input: h
397 | baseline: |
398 | tar -zxf $1
399 |
400 | - name: --one=inside
401 | options: "--one=inside -n"
402 | filenames: test-onedir.tar.gz
403 | baseline: |
404 | mkdir test-onedir
405 | cd test-onedir
406 | tar -zxf ../$1
407 |
408 | - name: --one=rename
409 | options: "--one-entry=rename -n"
410 | filenames: test-onedir.tar.gz
411 | baseline: |
412 | tar -zxf $1
413 | mv test test-onedir
414 |
415 | - name: --one=here
416 | options: "--one=here -n"
417 | filenames: test-onedir.tar.gz
418 | baseline: |
419 | tar -zxf $1
420 |
421 | - name: default behavior with one directory (bz2)
422 | options: -n
423 | filenames: test-onedir.tar.gz
424 | baseline: |
425 | mkdir test-onedir
426 | cd test-onedir
427 | tar -zxf ../$1
428 |
429 | - name: one directory extracted inside another (bz2)
430 | options: ""
431 | filenames: test-onedir.tar.gz
432 | input: i
433 | baseline: |
434 | mkdir test-onedir
435 | cd test-onedir
436 | tar -zxf ../$1
437 |
438 | - name: one directory extracted with rename (bz2)
439 | options: ""
440 | filenames: test-onedir.tar.gz
441 | input: r
442 | baseline: |
443 | tar -zxf $1
444 | mv test test-onedir
445 |
446 | - name: one directory extracted here (bz2)
447 | options: ""
448 | filenames: test-onedir.tar.gz
449 | input: h
450 | baseline: |
451 | tar -zxf $1
452 |
453 | - name: default behavior with one file
454 | options: -n
455 | filenames: test-onefile.tar.gz
456 | baseline: |
457 | mkdir test-onefile
458 | cd test-onefile
459 | tar -zxf ../$1
460 |
461 | - name: one file extracted inside a directory
462 | options: ""
463 | filenames: test-onefile.tar.gz
464 | input: i
465 | grep: one file
466 | baseline: |
467 | mkdir test-onefile
468 | cd test-onefile
469 | tar -zxf ../$1
470 |
471 | - name: prompt wording with one file
472 | options: ""
473 | filenames: test-onefile.tar.gz
474 | input: i
475 | grep: file _I_nside
476 |
477 | - name: one file extracted with rename, with Expected text
478 | options: ""
479 | filenames: test-onefile.tar.gz
480 | input: r
481 | grep: "Expected: test-onefile"
482 | baseline: |
483 | tar -zxOf $1 >test-onefile
484 |
485 | - name: one file extracted here, with Actual text
486 | options: ""
487 | filenames: test-onefile.tar.gz
488 | input: h
489 | grep: " Actual: test-text"
490 | baseline: |
491 | tar -zxf $1
492 |
493 | - name: bomb with preceding dot in the table
494 | filenames: test-dot-first-bomb.tar.gz
495 | options: ""
496 | antigrep: one
497 | baseline: |
498 | mkdir test-dot-first-bomb
499 | cd test-dot-first-bomb
500 | tar -zxf ../$1
501 |
502 | - name: one directory preceded by dot in the table
503 | filenames: test-dot-first-onedir.tar.gz
504 | options: ""
505 | grep: "Actual: (./)?dir/"
506 | input: h
507 | baseline: |
508 | tar -zxf $1
509 |
510 | - name: two one-item archives with different answers
511 | filenames: test-onedir.tar.gz test-onedir.tar.gz
512 | options: ""
513 | input: |
514 | h
515 | r
516 | baseline: |
517 | tar -zxf $1
518 | mv test test-onedir
519 | tar -zxf $1
520 |
521 | - name: interactive recursion (always)
522 | filenames: test-recursive-badperms.tar.bz2 test-recursive-badperms.tar.bz2
523 | options: ""
524 | input: |
525 | i
526 | a
527 | i
528 | baseline: |
529 | extract() {
530 | mkdir test-recursive-badperms$2
531 | cd test-recursive-badperms$2
532 | tar -jxf ../$1
533 | mkdir test-badperms
534 | cd test-badperms
535 | tar -xf ../test-badperms.tar
536 | chmod 700 testdir
537 | cd ../..
538 | }
539 | extract $1
540 | extract $1 .1
541 |
542 | - name: interactive recursion (once)
543 | filenames: test-recursive-badperms.tar.bz2 test-recursive-badperms.tar.bz2
544 | options: ""
545 | input: |
546 | i
547 | o
548 | i
549 | n
550 | baseline: |
551 | extract() {
552 | mkdir "$1"
553 | cd "$1"
554 | tar "-${3}xf" "../$2"
555 | }
556 | extract test-recursive-badperms "$1" j
557 | extract test-badperms test-badperms.tar
558 | chmod 700 testdir
559 | cd ../..
560 | extract test-recursive-badperms.1 "$1" j
561 |
562 | - name: interactive recursion (never)
563 | filenames: test-recursive-badperms.tar.bz2 test-recursive-badperms.tar.bz2
564 | options: ""
565 | input: |
566 | i
567 | v
568 | i
569 | baseline: |
570 | extract() {
571 | mkdir test-recursive-badperms$2
572 | cd test-recursive-badperms$2
573 | tar -jxf ../$1
574 | cd ..
575 | }
576 | extract $1
577 | extract $1 .1
578 |
579 | - name: recursion in subdirectories here
580 | filenames: test-deep-recursion.tar
581 | options: ""
582 | input: |
583 | h
584 | o
585 | grep: 'contains 2 other archive file\(s\), out of 2 file\(s\)'
586 | baseline: |
587 | tar -xf $1
588 | cd subdir
589 | zcat test-text.gz > test-text
590 | cd subsubdir
591 | zcat test-text.gz > test-text
592 |
593 | - name: recursion in subdirectories with rename
594 | filenames: test-deep-recursion.tar
595 | options: ""
596 | input: |
597 | r
598 | o
599 | grep: "contains 2"
600 | baseline: |
601 | tar -xf $1
602 | mv subdir test-deep-recursion
603 | cd test-deep-recursion
604 | zcat test-text.gz > test-text
605 | cd subsubdir
606 | zcat test-text.gz > test-text
607 |
608 | - name: recursion in subdirectories inside new dir
609 | filenames: test-deep-recursion.tar
610 | options: ""
611 | input: |
612 | i
613 | o
614 | grep: "contains 2"
615 | baseline: |
616 | mkdir test-deep-recursion
617 | cd test-deep-recursion
618 | tar -xf ../$1
619 | cd subdir
620 | zcat test-text.gz > test-text
621 | cd subsubdir
622 | zcat test-text.gz > test-text
623 |
624 | - name: extracting file with bad extension
625 | filenames: test-1.23.bin
626 | prerun: cp ${1}test-1.23.tar.gz ${1}test-1.23.bin
627 | cleanup: rm -f ${1}test-1.23.bin
628 | baseline: |
629 | tar -zxf $1
630 |
631 | - name: extracting file with misleading extension
632 | filenames: trickery.tar.gz
633 | prerun: cp ${1}test-1.23.zip ${1}trickery.tar.gz
634 | cleanup: rm -f ${1}trickery.tar.gz
635 | antigrep: "."
636 | baseline: |
637 | mkdir trickery
638 | cd trickery
639 | unzip -q ../$1
640 |
641 | - name: listing file with misleading extension
642 | options: -l
643 | filenames: trickery.tar.gz
644 | prerun: cp ${1}test-1.23.zip ${1}trickery.tar.gz
645 | cleanup: rm -f ${1}trickery.tar.gz
646 | grep: "^1/2/3$"
647 | antigrep: "^dtrx:"
648 |
649 | - name: listing multiple file with misleading extensions
650 | options: -l
651 | filenames: trickery.tar.gz trickery.tar.gz
652 | prerun: cp ${1}test-1.23.zip ${1}trickery.tar.gz
653 | cleanup: rm -f ${1}trickery.tar.gz
654 | output: |
655 | trickery.tar.gz:
656 | 1/2/3
657 | a/b
658 | foobar
659 |
660 | trickery.tar.gz:
661 | 1/2/3
662 | a/b
663 | foobar
664 |
665 | - name: non-archive error
666 | filenames: /dev/null
667 | error: true
668 | grep: "not a known archive type"
669 |
670 | - name: no such file error
671 | filenames: nonexistent-file.tar.gz
672 | error: true
673 | grep: "[Nn]o such file"
674 |
675 | - name: no such file error with no extension
676 | filenames: nonexistent-file
677 | error: true
678 | grep: "[Nn]o such file"
679 |
680 | - name: try to extract a directory error
681 | filenames: test-directory
682 | prerun: mkdir test-directory
683 | error: true
684 | grep: "cannot work with a directory"
685 |
686 | - name: permission denied error
687 | filenames: unreadable-file.tar.gz
688 | prerun: |
689 | touch unreadable-file.tar.gz
690 | chmod 000 unreadable-file.tar.gz
691 | cleanup: rm -f unreadable-file.tar.gz
692 | error: true
693 | grep: "[Pp]ermission denied"
694 |
695 | - name: permission denied no-pipe file error
696 | filenames: unreadable-file.zip
697 | prerun: |
698 | touch unreadable-file.zip
699 | chmod 000 unreadable-file.zip
700 | cleanup: rm -f unreadable-file.zip
701 | error: true
702 | grep: "[Pp]ermission denied"
703 |
704 | - name: bad file error
705 | filenames: bogus-file.tar.gz
706 | prerun: |
707 | touch bogus-file.tar.gz
708 | cleanup: rm -f bogus-file.tar.gz
709 | error: true
710 | grep: "returned status code [^0]"
711 |
712 | - name: try to extract in unwritable directory
713 | directory: unwritable-dir
714 | filenames: ../test-1.23.tar.gz
715 | prerun: chmod 500 .
716 | error: true
717 | grep: "cannot extract here: [Pp]ermission denied"
718 |
719 | - name: recursive listing is a no-op
720 | options: -rl
721 | filenames: test-recursive-badperms.tar.bz2
722 | grep: test-badperms.tar
723 | antigrep: testdir/
724 |
725 | - name: graceful coping when many extraction directories are taken
726 | directory: busydir
727 | prerun: |
728 | mkdir test-1.23
729 | for i in $(seq 1 10); do mkdir test-1.23.$i; done
730 | filenames: ../test-1.23.tar.gz
731 | grep: "WARNING: extracting"
732 |
733 | - name: graceful coping when many decompression targets are taken
734 | directory: busydir
735 | prerun: |
736 | touch test-text
737 | for i in $(seq 1 10); do touch test-text.$i; done
738 | filenames: ../test-text.gz
739 | grep: "WARNING: extracting"
740 |
741 | - name: output filenames with -v
742 | options: -v -n
743 | filenames: test-onedir.tar.gz test-text.gz
744 | output: |
745 | test-onedir.tar.gz:
746 | test-onedir/
747 | test-onedir/test/
748 | test-onedir/test/foobar
749 | test-onedir/test/quux
750 |
751 | test-text.gz:
752 | test-text
753 |
754 | - name: output filenames with -v and -f
755 | options: -nvf
756 | directory: busydir
757 | filenames: ../test-onedir.tar.gz
758 | output: |
759 | ../test-onedir.tar.gz:
760 | test/
761 | test/foobar
762 | test/quux
763 |
764 | - name: list recursive archives
765 | options: ""
766 | filenames: test-deep-recursion.tar
767 | input: |
768 | r
769 | l
770 | n
771 | grep: '^test-deep-recursion/subsubdir/test-text\.gz$'
772 |
773 | - name: partly failed extraction
774 | options: -n
775 | filenames: test-tar-with-node.tar.gz
776 | baseline: |
777 | mkdir test-tar-with-node
778 | cd test-tar-with-node
779 | tar -zxf ../$1
780 | grep: Cannot mknod
781 |
782 | - name: flat extraction of one-file archive
783 | directory: inside-dir
784 | options: -f
785 | filenames: ../test-onefile.tar.gz
786 | baseline: tar -zxf $1
787 | antigrep: "contains"
788 |
789 | - name: test recursive extraction of one archive
790 | directory: inside-dir
791 | options: ""
792 | filenames: ../test-one-archive.tar.gz
793 | baseline: |
794 | tar -zxf $1
795 | zcat test-text.gz >test-text
796 | input: |
797 | h
798 | o
799 |
800 | - name: extracting empty archive
801 | filenames: test-empty.tar.bz2
802 | options: ""
803 | baseline: ""
804 | antigrep: "."
805 |
806 | - name: listing empty archive
807 | filenames: test-empty.tar.bz2
808 | options: -l
809 | baseline: |
810 | test-empty.tar.bz2:
811 |
812 | - name: download and extract
813 | filenames: https://raw.githubusercontent.com/dtrx-py/dtrx/master/tests/test-text.gz
814 | directory: inside-dir
815 | baseline: |
816 | wget "$1"
817 | zcat test-text.gz >test-text
818 | cleanup: rm -f test-text.gz test-text
819 |
820 | - name: recursive archive without prompt
821 | filenames: test-recursive-no-prompt.tar.bz2
822 | options: ""
823 | baseline: |
824 | mkdir test-recursive-no-prompt
825 | cd test-recursive-no-prompt
826 | tar -jxf ../$1
827 | antigrep: "."
828 |
829 | - name: uncompressed dmg
830 | filenames: test.dmg
831 | baseline: |
832 | mkdir test
833 | cd test
834 | 7z x ../$1
835 |
836 | - name: compressed dmg
837 | filenames: test.compressed.dmg
838 | baseline: |
839 | mkdir test.compressed
840 | cd test.compressed
841 | 7z x ../$1
842 |
843 | - name: zstd compressed tar
844 | filenames: test.tar.zst
845 | # note: the --zstd flag for tar requires a fairly recent (~2021) version of
846 | # tar
847 | baseline: |
848 | tar --zstd -xf $1
849 |
850 | - name: zstd compressed single file
851 | filenames: test-text.zst
852 | baseline: |
853 | zstd -d $1 -o test-text
854 |
855 | - name: cpio false extension match recursive extract
856 | filenames: test-cpio.tar.gz
857 | options: "-rn"
858 | baseline: |
859 | mkdir test-cpio
860 | cd test-cpio
861 | tar -zxf ../$1
862 |
863 | # this has issues because zip uses a pty to read the password without echo;
864 | # see https://pexpect.readthedocs.io/en/latest/FAQ.html#whynotpipe
865 | # disabled for now
866 | # - name: password zip interactive
867 | # filenames: test-pw.zip
868 | # input: yolo
869 | # baseline: |
870 | # mkdir test-pw
871 | # cd test-pw
872 | # unzip -P yolo -q ../$1
873 |
874 | - name: password zip noninteractive
875 | filenames: test-pw.zip
876 | options: "-n"
877 | error: true
878 | grep: "cannot extract encrypted archive"
879 | cleanup: stty -F /dev/stdout echo
880 |
881 | - name: password zip noninteractive with password
882 | filenames: test-pw.zip
883 | options: "-n -p yolo"
884 | baseline: |
885 | mkdir test-pw
886 | cd test-pw
887 | unzip -P yolo -q $1
888 | posttest: |
889 | exec [ "$(cat test-pw)" = "test-pw" ]
890 | cleanup: rm -rf test-pw
891 |
892 | - name: password 7z noninteractive with password
893 | filenames: test-pw.7z
894 | options: "-n -p yolo"
895 | baseline: |
896 | mkdir test-pw
897 | cd test-pw
898 | 7z x $1 -pyolo -q
899 | posttest: |
900 | exec [ "$(cat test-pw)" = "test-pw" ]
901 | cleanup: rm -rf test-pw
902 |
903 | - name: password rar noninteractive with password
904 | filenames: test-pw.rar
905 | options: "-n -p yolo"
906 | baseline: |
907 | mkdir test-pw
908 | cd test-pw
909 | unrar e -pyolo -q $1
910 | posttest: |
911 | exec [ "$(cat test-pw)" = "test-pw" ]
912 | cleanup: rm -rf test-pw
913 |
914 | - name: password arj noninteractive with password
915 | filenames: test-pw.7z
916 | options: "-n -p yolo"
917 | baseline: |
918 | mkdir test-pw
919 | cd test-pw
920 | arj x -gyolo $1
921 | posttest: |
922 | exec [ "$(cat test-pw)" = "test-pw" ]
923 | cleanup: rm -rf test-pw
924 |
925 | - name: brotli
926 | filenames: test-text.br
927 | baseline: |
928 | brotli --decompress $1 --output=test-text
929 |
930 | - name: list extensions
931 | options: "--list-extensions"
932 | # note: this pattern is directly passed to python's regex .search function
933 | grep: |
934 | 7z
935 | Z
936 | arj
937 | br
938 | bz2
939 | cab
940 | cpio
941 | crx
942 | deb
943 | dmg
944 | epub
945 | gem
946 | gz
947 | hdr
948 | jar
949 | lha
950 | lrz
951 | lz
952 | lzh
953 | lzma
954 | msi
955 | rar
956 | rpm
957 | tar
958 | tar.Z
959 | tar.bz2
960 | tar.gz
961 | tar.lrz
962 | tar.lz
963 | tar.lzma
964 | tar.xz
965 | tar.zst
966 | taz
967 | tb2
968 | tbz
969 | tbz2
970 | tgz
971 | tlz
972 | txz
973 | xpi
974 | xz
975 | zip
976 | zst
977 | zstd
978 |
979 | - name: rpm
980 | filenames: test.rpm
981 | baseline: |
982 | mkdir test
983 | cd test
984 | rpm2cpio ../$1 | cpio -i --make-directories --quiet --no-absolute-filenames
985 |
986 | - name: crx
987 | filenames: getting-started.crx
988 | baseline: |
989 | unzip -q $1 -d getting-started
990 |
991 | - name: whl
992 | filenames: test-1.23.whl
993 | baseline: |
994 | mkdir test-1.23
995 | cd test-1.23
996 | unzip -q ../$1
997 |
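
Each entry above that has a `baseline` and no `directory` is run a second time by `TestsRunner.add_subdir_tests`, from inside a scratch directory with `../` prepended to every filename. A small sketch of that derivation, assuming a hypothetical standalone helper `subdir_variant` (the runner does this inline and also applies its name filter first):

```python
def subdir_variant(entry):
    """Return the 'in ..' copy of a test entry, or None if it does not qualify."""
    if "directory" in entry or "baseline" not in entry:
        return None
    variant = entry.copy()
    variant["name"] += " in .."
    variant["directory"] = "inside-dir"
    variant["filenames"] = " ".join(
        "../%s" % name for name in entry.get("filenames", "").split()
    )
    return variant


basic_tar = {"name": "basic .tar", "filenames": "test-1.23.tar", "baseline": "tar -xf $1\n"}
print(subdir_variant(basic_tar))
# Entries without a baseline (such as "no files") are not duplicated.
print(subdir_variant({"name": "no files", "error": True}))
```

The copy leaves the original entry untouched, so both the in-place test and its subdirectory twin are run.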
--------------------------------------------------------------------------------
/COPYING:
--------------------------------------------------------------------------------
1 | GNU GENERAL PUBLIC LICENSE
2 | Version 3, 29 June 2007
3 |
4 | Copyright (C) 2007 Free Software Foundation, Inc.
5 | Everyone is permitted to copy and distribute verbatim copies
6 | of this license document, but changing it is not allowed.
7 |
8 | Preamble
9 |
10 | The GNU General Public License is a free, copyleft license for
11 | software and other kinds of works.
12 |
13 | The licenses for most software and other practical works are designed
14 | to take away your freedom to share and change the works. By contrast,
15 | the GNU General Public License is intended to guarantee your freedom to
16 | share and change all versions of a program--to make sure it remains free
17 | software for all its users. We, the Free Software Foundation, use the
18 | GNU General Public License for most of our software; it applies also to
19 | any other work released this way by its authors. You can apply it to
20 | your programs, too.
21 |
22 | When we speak of free software, we are referring to freedom, not
23 | price. Our General Public Licenses are designed to make sure that you
24 | have the freedom to distribute copies of free software (and charge for
25 | them if you wish), that you receive source code or can get it if you
26 | want it, that you can change the software or use pieces of it in new
27 | free programs, and that you know you can do these things.
28 |
29 | To protect your rights, we need to prevent others from denying you
30 | these rights or asking you to surrender the rights. Therefore, you have
31 | certain responsibilities if you distribute copies of the software, or if
32 | you modify it: responsibilities to respect the freedom of others.
33 |
34 | For example, if you distribute copies of such a program, whether
35 | gratis or for a fee, you must pass on to the recipients the same
36 | freedoms that you received. You must make sure that they, too, receive
37 | or can get the source code. And you must show them these terms so they
38 | know their rights.
39 |
40 | Developers that use the GNU GPL protect your rights with two steps:
41 | (1) assert copyright on the software, and (2) offer you this License
42 | giving you legal permission to copy, distribute and/or modify it.
43 |
44 | For the developers' and authors' protection, the GPL clearly explains
45 | that there is no warranty for this free software. For both users' and
46 | authors' sake, the GPL requires that modified versions be marked as
47 | changed, so that their problems will not be attributed erroneously to
48 | authors of previous versions.
49 |
50 | Some devices are designed to deny users access to install or run
51 | modified versions of the software inside them, although the manufacturer
52 | can do so. This is fundamentally incompatible with the aim of
53 | protecting users' freedom to change the software. The systematic
54 | pattern of such abuse occurs in the area of products for individuals to
55 | use, which is precisely where it is most unacceptable. Therefore, we
56 | have designed this version of the GPL to prohibit the practice for those
57 | products. If such problems arise substantially in other domains, we
58 | stand ready to extend this provision to those domains in future versions
59 | of the GPL, as needed to protect the freedom of users.
60 |
61 | Finally, every program is threatened constantly by software patents.
62 | States should not allow patents to restrict development and use of
63 | software on general-purpose computers, but in those that do, we wish to
64 | avoid the special danger that patents applied to a free program could
65 | make it effectively proprietary. To prevent this, the GPL assures that
66 | patents cannot be used to render the program non-free.
67 |
68 | The precise terms and conditions for copying, distribution and
69 | modification follow.
70 |
71 | TERMS AND CONDITIONS
72 |
73 | 0. Definitions.
74 |
75 | "This License" refers to version 3 of the GNU General Public License.
76 |
77 | "Copyright" also means copyright-like laws that apply to other kinds of
78 | works, such as semiconductor masks.
79 |
80 | "The Program" refers to any copyrightable work licensed under this
81 | License. Each licensee is addressed as "you". "Licensees" and
82 | "recipients" may be individuals or organizations.
83 |
84 | To "modify" a work means to copy from or adapt all or part of the work
85 | in a fashion requiring copyright permission, other than the making of an
86 | exact copy. The resulting work is called a "modified version" of the
87 | earlier work or a work "based on" the earlier work.
88 |
89 | A "covered work" means either the unmodified Program or a work based
90 | on the Program.
91 |
92 | To "propagate" a work means to do anything with it that, without
93 | permission, would make you directly or secondarily liable for
94 | infringement under applicable copyright law, except executing it on a
95 | computer or modifying a private copy. Propagation includes copying,
96 | distribution (with or without modification), making available to the
97 | public, and in some countries other activities as well.
98 |
99 | To "convey" a work means any kind of propagation that enables other
100 | parties to make or receive copies. Mere interaction with a user through
101 | a computer network, with no transfer of a copy, is not conveying.
102 |
103 | An interactive user interface displays "Appropriate Legal Notices"
104 | to the extent that it includes a convenient and prominently visible
105 | feature that (1) displays an appropriate copyright notice, and (2)
106 | tells the user that there is no warranty for the work (except to the
107 | extent that warranties are provided), that licensees may convey the
108 | work under this License, and how to view a copy of this License. If
109 | the interface presents a list of user commands or options, such as a
110 | menu, a prominent item in the list meets this criterion.
111 |
112 | 1. Source Code.
113 |
114 | The "source code" for a work means the preferred form of the work
115 | for making modifications to it. "Object code" means any non-source
116 | form of a work.
117 |
118 | A "Standard Interface" means an interface that either is an official
119 | standard defined by a recognized standards body, or, in the case of
120 | interfaces specified for a particular programming language, one that
121 | is widely used among developers working in that language.
122 |
123 | The "System Libraries" of an executable work include anything, other
124 | than the work as a whole, that (a) is included in the normal form of
125 | packaging a Major Component, but which is not part of that Major
126 | Component, and (b) serves only to enable use of the work with that
127 | Major Component, or to implement a Standard Interface for which an
128 | implementation is available to the public in source code form. A
129 | "Major Component", in this context, means a major essential component
130 | (kernel, window system, and so on) of the specific operating system
131 | (if any) on which the executable work runs, or a compiler used to
132 | produce the work, or an object code interpreter used to run it.
133 |
134 | The "Corresponding Source" for a work in object code form means all
135 | the source code needed to generate, install, and (for an executable
136 | work) run the object code and to modify the work, including scripts to
137 | control those activities. However, it does not include the work's
138 | System Libraries, or general-purpose tools or generally available free
139 | programs which are used unmodified in performing those activities but
140 | which are not part of the work. For example, Corresponding Source
141 | includes interface definition files associated with source files for
142 | the work, and the source code for shared libraries and dynamically
143 | linked subprograms that the work is specifically designed to require,
144 | such as by intimate data communication or control flow between those
145 | subprograms and other parts of the work.
146 |
147 | The Corresponding Source need not include anything that users
148 | can regenerate automatically from other parts of the Corresponding
149 | Source.
150 |
151 | The Corresponding Source for a work in source code form is that
152 | same work.
153 |
154 | 2. Basic Permissions.
155 |
156 | All rights granted under this License are granted for the term of
157 | copyright on the Program, and are irrevocable provided the stated
158 | conditions are met. This License explicitly affirms your unlimited
159 | permission to run the unmodified Program. The output from running a
160 | covered work is covered by this License only if the output, given its
161 | content, constitutes a covered work. This License acknowledges your
162 | rights of fair use or other equivalent, as provided by copyright law.
163 |
164 | You may make, run and propagate covered works that you do not
165 | convey, without conditions so long as your license otherwise remains
166 | in force. You may convey covered works to others for the sole purpose
167 | of having them make modifications exclusively for you, or provide you
168 | with facilities for running those works, provided that you comply with
169 | the terms of this License in conveying all material for which you do
170 | not control copyright. Those thus making or running the covered works
171 | for you must do so exclusively on your behalf, under your direction
172 | and control, on terms that prohibit them from making any copies of
173 | your copyrighted material outside their relationship with you.
174 |
175 | Conveying under any other circumstances is permitted solely under
176 | the conditions stated below. Sublicensing is not allowed; section 10
177 | makes it unnecessary.
178 |
179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180 |
181 | No covered work shall be deemed part of an effective technological
182 | measure under any applicable law fulfilling obligations under article
183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184 | similar laws prohibiting or restricting circumvention of such
185 | measures.
186 |
187 | When you convey a covered work, you waive any legal power to forbid
188 | circumvention of technological measures to the extent such circumvention
189 | is effected by exercising rights under this License with respect to
190 | the covered work, and you disclaim any intention to limit operation or
191 | modification of the work as a means of enforcing, against the work's
192 | users, your or third parties' legal rights to forbid circumvention of
193 | technological measures.
194 |
195 | 4. Conveying Verbatim Copies.
196 |
197 | You may convey verbatim copies of the Program's source code as you
198 | receive it, in any medium, provided that you conspicuously and
199 | appropriately publish on each copy an appropriate copyright notice;
200 | keep intact all notices stating that this License and any
201 | non-permissive terms added in accord with section 7 apply to the code;
202 | keep intact all notices of the absence of any warranty; and give all
203 | recipients a copy of this License along with the Program.
204 |
205 | You may charge any price or no price for each copy that you convey,
206 | and you may offer support or warranty protection for a fee.
207 |
208 | 5. Conveying Modified Source Versions.
209 |
210 | You may convey a work based on the Program, or the modifications to
211 | produce it from the Program, in the form of source code under the
212 | terms of section 4, provided that you also meet all of these conditions:
213 |
214 | a) The work must carry prominent notices stating that you modified
215 | it, and giving a relevant date.
216 |
217 | b) The work must carry prominent notices stating that it is
218 | released under this License and any conditions added under section
219 | 7. This requirement modifies the requirement in section 4 to
220 | "keep intact all notices".
221 |
222 | c) You must license the entire work, as a whole, under this
223 | License to anyone who comes into possession of a copy. This
224 | License will therefore apply, along with any applicable section 7
225 | additional terms, to the whole of the work, and all its parts,
226 | regardless of how they are packaged. This License gives no
227 | permission to license the work in any other way, but it does not
228 | invalidate such permission if you have separately received it.
229 |
230 | d) If the work has interactive user interfaces, each must display
231 | Appropriate Legal Notices; however, if the Program has interactive
232 | interfaces that do not display Appropriate Legal Notices, your
233 | work need not make them do so.
234 |
235 | A compilation of a covered work with other separate and independent
236 | works, which are not by their nature extensions of the covered work,
237 | and which are not combined with it such as to form a larger program,
238 | in or on a volume of a storage or distribution medium, is called an
239 | "aggregate" if the compilation and its resulting copyright are not
240 | used to limit the access or legal rights of the compilation's users
241 | beyond what the individual works permit. Inclusion of a covered work
242 | in an aggregate does not cause this License to apply to the other
243 | parts of the aggregate.
244 |
245 | 6. Conveying Non-Source Forms.
246 |
247 | You may convey a covered work in object code form under the terms
248 | of sections 4 and 5, provided that you also convey the
249 | machine-readable Corresponding Source under the terms of this License,
250 | in one of these ways:
251 |
252 | a) Convey the object code in, or embodied in, a physical product
253 | (including a physical distribution medium), accompanied by the
254 | Corresponding Source fixed on a durable physical medium
255 | customarily used for software interchange.
256 |
257 | b) Convey the object code in, or embodied in, a physical product
258 | (including a physical distribution medium), accompanied by a
259 | written offer, valid for at least three years and valid for as
260 | long as you offer spare parts or customer support for that product
261 | model, to give anyone who possesses the object code either (1) a
262 | copy of the Corresponding Source for all the software in the
263 | product that is covered by this License, on a durable physical
264 | medium customarily used for software interchange, for a price no
265 | more than your reasonable cost of physically performing this
266 | conveying of source, or (2) access to copy the
267 | Corresponding Source from a network server at no charge.
268 |
269 | c) Convey individual copies of the object code with a copy of the
270 | written offer to provide the Corresponding Source. This
271 | alternative is allowed only occasionally and noncommercially, and
272 | only if you received the object code with such an offer, in accord
273 | with subsection 6b.
274 |
275 | d) Convey the object code by offering access from a designated
276 | place (gratis or for a charge), and offer equivalent access to the
277 | Corresponding Source in the same way through the same place at no
278 | further charge. You need not require recipients to copy the
279 | Corresponding Source along with the object code. If the place to
280 | copy the object code is a network server, the Corresponding Source
281 | may be on a different server (operated by you or a third party)
282 | that supports equivalent copying facilities, provided you maintain
283 | clear directions next to the object code saying where to find the
284 | Corresponding Source. Regardless of what server hosts the
285 | Corresponding Source, you remain obligated to ensure that it is
286 | available for as long as needed to satisfy these requirements.
287 |
288 | e) Convey the object code using peer-to-peer transmission, provided
289 | you inform other peers where the object code and Corresponding
290 | Source of the work are being offered to the general public at no
291 | charge under subsection 6d.
292 |
293 | A separable portion of the object code, whose source code is excluded
294 | from the Corresponding Source as a System Library, need not be
295 | included in conveying the object code work.
296 |
297 | A "User Product" is either (1) a "consumer product", which means any
298 | tangible personal property which is normally used for personal, family,
299 | or household purposes, or (2) anything designed or sold for incorporation
300 | into a dwelling. In determining whether a product is a consumer product,
301 | doubtful cases shall be resolved in favor of coverage. For a particular
302 | product received by a particular user, "normally used" refers to a
303 | typical or common use of that class of product, regardless of the status
304 | of the particular user or of the way in which the particular user
305 | actually uses, or expects or is expected to use, the product. A product
306 | is a consumer product regardless of whether the product has substantial
307 | commercial, industrial or non-consumer uses, unless such uses represent
308 | the only significant mode of use of the product.
309 |
310 | "Installation Information" for a User Product means any methods,
311 | procedures, authorization keys, or other information required to install
312 | and execute modified versions of a covered work in that User Product from
313 | a modified version of its Corresponding Source. The information must
314 | suffice to ensure that the continued functioning of the modified object
315 | code is in no case prevented or interfered with solely because
316 | modification has been made.
317 |
318 | If you convey an object code work under this section in, or with, or
319 | specifically for use in, a User Product, and the conveying occurs as
320 | part of a transaction in which the right of possession and use of the
321 | User Product is transferred to the recipient in perpetuity or for a
322 | fixed term (regardless of how the transaction is characterized), the
323 | Corresponding Source conveyed under this section must be accompanied
324 | by the Installation Information. But this requirement does not apply
325 | if neither you nor any third party retains the ability to install
326 | modified object code on the User Product (for example, the work has
327 | been installed in ROM).
328 |
329 | The requirement to provide Installation Information does not include a
330 | requirement to continue to provide support service, warranty, or updates
331 | for a work that has been modified or installed by the recipient, or for
332 | the User Product in which it has been modified or installed. Access to a
333 | network may be denied when the modification itself materially and
334 | adversely affects the operation of the network or violates the rules and
335 | protocols for communication across the network.
336 |
337 | Corresponding Source conveyed, and Installation Information provided,
338 | in accord with this section must be in a format that is publicly
339 | documented (and with an implementation available to the public in
340 | source code form), and must require no special password or key for
341 | unpacking, reading or copying.
342 |
343 | 7. Additional Terms.
344 |
345 | "Additional permissions" are terms that supplement the terms of this
346 | License by making exceptions from one or more of its conditions.
347 | Additional permissions that are applicable to the entire Program shall
348 | be treated as though they were included in this License, to the extent
349 | that they are valid under applicable law. If additional permissions
350 | apply only to part of the Program, that part may be used separately
351 | under those permissions, but the entire Program remains governed by
352 | this License without regard to the additional permissions.
353 |
354 | When you convey a copy of a covered work, you may at your option
355 | remove any additional permissions from that copy, or from any part of
356 | it. (Additional permissions may be written to require their own
357 | removal in certain cases when you modify the work.) You may place
358 | additional permissions on material, added by you to a covered work,
359 | for which you have or can give appropriate copyright permission.
360 |
361 | Notwithstanding any other provision of this License, for material you
362 | add to a covered work, you may (if authorized by the copyright holders of
363 | that material) supplement the terms of this License with terms:
364 |
365 | a) Disclaiming warranty or limiting liability differently from the
366 | terms of sections 15 and 16 of this License; or
367 |
368 | b) Requiring preservation of specified reasonable legal notices or
369 | author attributions in that material or in the Appropriate Legal
370 | Notices displayed by works containing it; or
371 |
372 | c) Prohibiting misrepresentation of the origin of that material, or
373 | requiring that modified versions of such material be marked in
374 | reasonable ways as different from the original version; or
375 |
376 | d) Limiting the use for publicity purposes of names of licensors or
377 | authors of the material; or
378 |
379 | e) Declining to grant rights under trademark law for use of some
380 | trade names, trademarks, or service marks; or
381 |
382 | f) Requiring indemnification of licensors and authors of that
383 | material by anyone who conveys the material (or modified versions of
384 | it) with contractual assumptions of liability to the recipient, for
385 | any liability that these contractual assumptions directly impose on
386 | those licensors and authors.
387 |
388 | All other non-permissive additional terms are considered "further
389 | restrictions" within the meaning of section 10. If the Program as you
390 | received it, or any part of it, contains a notice stating that it is
391 | governed by this License along with a term that is a further
392 | restriction, you may remove that term. If a license document contains
393 | a further restriction but permits relicensing or conveying under this
394 | License, you may add to a covered work material governed by the terms
395 | of that license document, provided that the further restriction does
396 | not survive such relicensing or conveying.
397 |
398 | If you add terms to a covered work in accord with this section, you
399 | must place, in the relevant source files, a statement of the
400 | additional terms that apply to those files, or a notice indicating
401 | where to find the applicable terms.
402 |
403 | Additional terms, permissive or non-permissive, may be stated in the
404 | form of a separately written license, or stated as exceptions;
405 | the above requirements apply either way.
406 |
407 | 8. Termination.
408 |
409 | You may not propagate or modify a covered work except as expressly
410 | provided under this License. Any attempt otherwise to propagate or
411 | modify it is void, and will automatically terminate your rights under
412 | this License (including any patent licenses granted under the third
413 | paragraph of section 11).
414 |
415 | However, if you cease all violation of this License, then your
416 | license from a particular copyright holder is reinstated (a)
417 | provisionally, unless and until the copyright holder explicitly and
418 | finally terminates your license, and (b) permanently, if the copyright
419 | holder fails to notify you of the violation by some reasonable means
420 | prior to 60 days after the cessation.
421 |
422 | Moreover, your license from a particular copyright holder is
423 | reinstated permanently if the copyright holder notifies you of the
424 | violation by some reasonable means, this is the first time you have
425 | received notice of violation of this License (for any work) from that
426 | copyright holder, and you cure the violation prior to 30 days after
427 | your receipt of the notice.
428 |
429 | Termination of your rights under this section does not terminate the
430 | licenses of parties who have received copies or rights from you under
431 | this License. If your rights have been terminated and not permanently
432 | reinstated, you do not qualify to receive new licenses for the same
433 | material under section 10.
434 |
435 | 9. Acceptance Not Required for Having Copies.
436 |
437 | You are not required to accept this License in order to receive or
438 | run a copy of the Program. Ancillary propagation of a covered work
439 | occurring solely as a consequence of using peer-to-peer transmission
440 | to receive a copy likewise does not require acceptance. However,
441 | nothing other than this License grants you permission to propagate or
442 | modify any covered work. These actions infringe copyright if you do
443 | not accept this License. Therefore, by modifying or propagating a
444 | covered work, you indicate your acceptance of this License to do so.
445 |
446 | 10. Automatic Licensing of Downstream Recipients.
447 |
448 | Each time you convey a covered work, the recipient automatically
449 | receives a license from the original licensors, to run, modify and
450 | propagate that work, subject to this License. You are not responsible
451 | for enforcing compliance by third parties with this License.
452 |
453 | An "entity transaction" is a transaction transferring control of an
454 | organization, or substantially all assets of one, or subdividing an
455 | organization, or merging organizations. If propagation of a covered
456 | work results from an entity transaction, each party to that
457 | transaction who receives a copy of the work also receives whatever
458 | licenses to the work the party's predecessor in interest had or could
459 | give under the previous paragraph, plus a right to possession of the
460 | Corresponding Source of the work from the predecessor in interest, if
461 | the predecessor has it or can get it with reasonable efforts.
462 |
463 | You may not impose any further restrictions on the exercise of the
464 | rights granted or affirmed under this License. For example, you may
465 | not impose a license fee, royalty, or other charge for exercise of
466 | rights granted under this License, and you may not initiate litigation
467 | (including a cross-claim or counterclaim in a lawsuit) alleging that
468 | any patent claim is infringed by making, using, selling, offering for
469 | sale, or importing the Program or any portion of it.
470 |
471 | 11. Patents.
472 |
473 | A "contributor" is a copyright holder who authorizes use under this
474 | License of the Program or a work on which the Program is based. The
475 | work thus licensed is called the contributor's "contributor version".
476 |
477 | A contributor's "essential patent claims" are all patent claims
478 | owned or controlled by the contributor, whether already acquired or
479 | hereafter acquired, that would be infringed by some manner, permitted
480 | by this License, of making, using, or selling its contributor version,
481 | but do not include claims that would be infringed only as a
482 | consequence of further modification of the contributor version. For
483 | purposes of this definition, "control" includes the right to grant
484 | patent sublicenses in a manner consistent with the requirements of
485 | this License.
486 |
487 | Each contributor grants you a non-exclusive, worldwide, royalty-free
488 | patent license under the contributor's essential patent claims, to
489 | make, use, sell, offer for sale, import and otherwise run, modify and
490 | propagate the contents of its contributor version.
491 |
492 | In the following three paragraphs, a "patent license" is any express
493 | agreement or commitment, however denominated, not to enforce a patent
494 | (such as an express permission to practice a patent or covenant not to
495 | sue for patent infringement). To "grant" such a patent license to a
496 | party means to make such an agreement or commitment not to enforce a
497 | patent against the party.
498 |
499 | If you convey a covered work, knowingly relying on a patent license,
500 | and the Corresponding Source of the work is not available for anyone
501 | to copy, free of charge and under the terms of this License, through a
502 | publicly available network server or other readily accessible means,
503 | then you must either (1) cause the Corresponding Source to be so
504 | available, or (2) arrange to deprive yourself of the benefit of the
505 | patent license for this particular work, or (3) arrange, in a manner
506 | consistent with the requirements of this License, to extend the patent
507 | license to downstream recipients. "Knowingly relying" means you have
508 | actual knowledge that, but for the patent license, your conveying the
509 | covered work in a country, or your recipient's use of the covered work
510 | in a country, would infringe one or more identifiable patents in that
511 | country that you have reason to believe are valid.
512 |
513 | If, pursuant to or in connection with a single transaction or
514 | arrangement, you convey, or propagate by procuring conveyance of, a
515 | covered work, and grant a patent license to some of the parties
516 | receiving the covered work authorizing them to use, propagate, modify
517 | or convey a specific copy of the covered work, then the patent license
518 | you grant is automatically extended to all recipients of the covered
519 | work and works based on it.
520 |
521 | A patent license is "discriminatory" if it does not include within
522 | the scope of its coverage, prohibits the exercise of, or is
523 | conditioned on the non-exercise of one or more of the rights that are
524 | specifically granted under this License. You may not convey a covered
525 | work if you are a party to an arrangement with a third party that is
526 | in the business of distributing software, under which you make payment
527 | to the third party based on the extent of your activity of conveying
528 | the work, and under which the third party grants, to any of the
529 | parties who would receive the covered work from you, a discriminatory
530 | patent license (a) in connection with copies of the covered work
531 | conveyed by you (or copies made from those copies), or (b) primarily
532 | for and in connection with specific products or compilations that
533 | contain the covered work, unless you entered into that arrangement,
534 | or that patent license was granted, prior to 28 March 2007.
535 |
536 | Nothing in this License shall be construed as excluding or limiting
537 | any implied license or other defenses to infringement that may
538 | otherwise be available to you under applicable patent law.
539 |
540 | 12. No Surrender of Others' Freedom.
541 |
542 | If conditions are imposed on you (whether by court order, agreement or
543 | otherwise) that contradict the conditions of this License, they do not
544 | excuse you from the conditions of this License. If you cannot convey a
545 | covered work so as to satisfy simultaneously your obligations under this
546 | License and any other pertinent obligations, then as a consequence you may
547 | not convey it at all. For example, if you agree to terms that obligate you
548 | to collect a royalty for further conveying from those to whom you convey
549 | the Program, the only way you could satisfy both those terms and this
550 | License would be to refrain entirely from conveying the Program.
551 |
552 | 13. Use with the GNU Affero General Public License.
553 |
554 | Notwithstanding any other provision of this License, you have
555 | permission to link or combine any covered work with a work licensed
556 | under version 3 of the GNU Affero General Public License into a single
557 | combined work, and to convey the resulting work. The terms of this
558 | License will continue to apply to the part which is the covered work,
559 | but the special requirements of the GNU Affero General Public License,
560 | section 13, concerning interaction through a network will apply to the
561 | combination as such.
562 |
563 | 14. Revised Versions of this License.
564 |
565 | The Free Software Foundation may publish revised and/or new versions of
566 | the GNU General Public License from time to time. Such new versions will
567 | be similar in spirit to the present version, but may differ in detail to
568 | address new problems or concerns.
569 |
570 | Each version is given a distinguishing version number. If the
571 | Program specifies that a certain numbered version of the GNU General
572 | Public License "or any later version" applies to it, you have the
573 | option of following the terms and conditions either of that numbered
574 | version or of any later version published by the Free Software
575 | Foundation. If the Program does not specify a version number of the
576 | GNU General Public License, you may choose any version ever published
577 | by the Free Software Foundation.
578 |
579 | If the Program specifies that a proxy can decide which future
580 | versions of the GNU General Public License can be used, that proxy's
581 | public statement of acceptance of a version permanently authorizes you
582 | to choose that version for the Program.
583 |
584 | Later license versions may give you additional or different
585 | permissions. However, no additional obligations are imposed on any
586 | author or copyright holder as a result of your choosing to follow a
587 | later version.
588 |
589 | 15. Disclaimer of Warranty.
590 |
591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599 |
600 | 16. Limitation of Liability.
601 |
602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610 | SUCH DAMAGES.
611 |
612 | 17. Interpretation of Sections 15 and 16.
613 |
614 | If the disclaimer of warranty and limitation of liability provided
615 | above cannot be given local legal effect according to their terms,
616 | reviewing courts shall apply local law that most closely approximates
617 | an absolute waiver of all civil liability in connection with the
618 | Program, unless a warranty or assumption of liability accompanies a
619 | copy of the Program in return for a fee.
620 |
621 | END OF TERMS AND CONDITIONS
622 |
623 | How to Apply These Terms to Your New Programs
624 |
625 | If you develop a new program, and you want it to be of the greatest
626 | possible use to the public, the best way to achieve this is to make it
627 | free software which everyone can redistribute and change under these terms.
628 |
629 | To do so, attach the following notices to the program. It is safest
630 | to attach them to the start of each source file to most effectively
631 | state the exclusion of warranty; and each file should have at least
632 | the "copyright" line and a pointer to where the full notice is found.
633 |
634 |
635 | Copyright (C)
636 |
637 | This program is free software: you can redistribute it and/or modify
638 | it under the terms of the GNU General Public License as published by
639 | the Free Software Foundation, either version 3 of the License, or
640 | (at your option) any later version.
641 |
642 | This program is distributed in the hope that it will be useful,
643 | but WITHOUT ANY WARRANTY; without even the implied warranty of
644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
645 | GNU General Public License for more details.
646 |
647 | You should have received a copy of the GNU General Public License
648 | along with this program. If not, see .
649 |
650 | Also add information on how to contact you by electronic and paper mail.
651 |
652 | If the program does terminal interaction, make it output a short
653 | notice like this when it starts in an interactive mode:
654 |
655 | Copyright (C)
656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657 | This is free software, and you are welcome to redistribute it
658 | under certain conditions; type `show c' for details.
659 |
660 | The hypothetical commands `show w' and `show c' should show the appropriate
661 | parts of the General Public License. Of course, your program's commands
662 | might be different; for a GUI interface, you would use an "about box".
663 |
664 | You should also get your employer (if you work as a programmer) or school,
665 | if any, to sign a "copyright disclaimer" for the program, if necessary.
666 | For more information on this, and how to apply and follow the GNU GPL, see
667 | .
668 |
669 | The GNU General Public License does not permit incorporating your program
670 | into proprietary programs. If your program is a subroutine library, you
671 | may consider it more useful to permit linking proprietary applications with
672 | the library. If this is what you want to do, use the GNU Lesser General
673 | Public License instead of this License. But first, please read
674 | .
675 |
--------------------------------------------------------------------------------
/dtrx/dtrx.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | #
4 | # dtrx -- Intelligently extract various archive types.
5 | # Copyright © 2006-2011 Brett Smith
6 | # Copyright © 2008 Peter Kelemen
7 | # Copyright © 2011 Ville Skyttä
8 | #
9 | # This program is free software; you can redistribute it and/or modify it
10 | # under the terms of the GNU General Public License as published by the
11 | # Free Software Foundation; either version 3 of the License, or (at your
12 | # option) any later version.
13 | #
14 | # This program is distributed in the hope that it will be useful, but
15 | # WITHOUT ANY WARRANTY; without even the implied warranty of
16 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
17 | # Public License for more details.
18 | #
19 | # You should have received a copy of the GNU General Public License along
20 | # with this program; if not, see .
21 |
22 | from __future__ import absolute_import, print_function
23 |
24 | import errno
25 | import fcntl
26 | import importlib.metadata
27 | import itertools
28 | import logging
29 | import mimetypes
30 | import optparse
31 | import os
32 | import re
33 | import shutil
34 | import signal
35 | import stat
36 | import struct
37 | import subprocess
38 | import sys
39 | import tempfile
40 | import termios
41 | import textwrap
42 | import traceback
43 | import urllib.parse as urlparse
44 | from functools import cmp_to_key, total_ordering
45 |
46 |
47 | def cmp(a, b):
48 |     return (a > b) - (a < b)
49 |
50 |
51 | try:
52 |     VERSION = importlib.metadata.version("dtrx")
53 | except importlib.metadata.PackageNotFoundError:
54 |     VERSION = "DEVELOPMENT"
55 | VERSION_BANNER = """dtrx version %s
56 | Copyright © 2006-2011 Brett Smith
57 | Copyright © 2008 Peter Kelemen
58 | Copyright © 2011 Ville Skyttä
59 |
60 | This program is free software; you can redistribute it and/or modify it
61 | under the terms of the GNU General Public License as published by the
62 | Free Software Foundation; either version 3 of the License, or (at your
63 | option) any later version.
64 |
65 | This program is distributed in the hope that it will be useful, but
66 | WITHOUT ANY WARRANTY; without even the implied warranty of
67 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
68 | Public License for more details.""" % (VERSION,)
69 |
70 | MATCHING_DIRECTORY = 1
71 | ONE_ENTRY_KNOWN = 2
72 | BOMB = 3
73 | EMPTY = 4
74 | ONE_ENTRY_FILE = "file"
75 | ONE_ENTRY_DIRECTORY = "directory"
76 |
77 | ONE_ENTRY_UNKNOWN = [ONE_ENTRY_FILE, ONE_ENTRY_DIRECTORY]
78 |
79 | EXTRACT_HERE = 1
80 | EXTRACT_WRAP = 2
81 | EXTRACT_RENAME = 3
82 |
83 | RECURSE_ALWAYS = 1
84 | RECURSE_ONCE = 2
85 | RECURSE_NOT_NOW = 3
86 | RECURSE_NEVER = 4
87 | RECURSE_LIST = 5
88 |
89 | mimetypes.encodings_map.setdefault(".bz2", "bzip2")
90 | mimetypes.encodings_map.setdefault(".lzma", "lzma")
91 | mimetypes.encodings_map.setdefault(".xz", "xz")
92 | mimetypes.encodings_map.setdefault(".lz", "lzip")
93 | mimetypes.encodings_map.setdefault(".lrz", "lrzip")
94 | mimetypes.encodings_map.setdefault(".zst", "zstd")
95 | mimetypes.encodings_map.setdefault(".zstd", "zstd")
96 | mimetypes.types_map.setdefault(".gem", "application/x-ruby-gem")
97 |
98 | logger = logging.getLogger("dtrx-log")
99 |
100 |
101 | class FilenameChecker(object):
102 |     free_func = os.open
103 |     free_args = (os.O_CREAT | os.O_EXCL,)
104 |     free_close = os.close
105 |
106 |     def __init__(self, original_name):
107 |         self.original_name = original_name
108 |
109 |     def is_free(self, filename):
110 |         try:
111 |             result = self.free_func(filename, *self.free_args)
112 |         except OSError as error:
113 |             if error.errno == errno.EEXIST:
114 |                 return False
115 |             raise
116 |         if self.free_close:
117 |             self.free_close(result)
118 |         return True
119 |
120 |     def create(self):
121 |         fd, filename = tempfile.mkstemp(prefix=self.original_name + ".", dir=".")
122 |         os.close(fd)
123 |         return filename
124 |
125 |     def check(self):
126 |         for suffix in [""] + [".%s" % (x,) for x in range(1, 10)]:
127 |             filename = "%s%s" % (self.original_name, suffix)
128 |             if self.is_free(filename):
129 |                 return filename
130 |         return self.create()
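
The checker above claims a name by attempting an exclusive create and falling back to numbered suffixes, then to a random temporary name. A minimal standalone sketch of that idea follows. `reserve_name` and its `attempts` parameter are hypothetical names chosen for illustration and are not part of dtrx.

```python
import errno
import os
import tempfile


def reserve_name(base, attempts=10):
    """Claim `base`, or `base.1` .. `base.<attempts - 1>`, as a new empty file.

    Hypothetical helper: O_CREAT | O_EXCL makes the kernel refuse the open
    when the name already exists, so check-and-create happen in one step.
    """
    for suffix in [""] + [".%d" % n for n in range(1, attempts)]:
        candidate = base + suffix
        try:
            fd = os.open(candidate, os.O_CREAT | os.O_EXCL)
        except OSError as error:
            if error.errno == errno.EEXIST:
                continue  # name is taken; try the next suffix
            raise
        os.close(fd)
        return candidate
    # Every numbered candidate is taken: let tempfile pick a unique name.
    directory, name = os.path.split(base)
    fd, candidate = tempfile.mkstemp(prefix=name + ".", dir=directory or ".")
    os.close(fd)
    return candidate
```

Because the create itself is the test, two processes racing for the same name cannot both succeed, which a separate `os.path.exists` check would not guarantee.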
131 |
132 |
133 | class DirectoryChecker(FilenameChecker):
134 |     free_func = os.mkdir
135 |     free_args = ()
136 |     free_close = None
137 |
138 |     def create(self):
139 |         dirname = tempfile.mkdtemp(prefix=self.original_name + ".", dir=".")
140 |         # We want the directory to be relative to the current directory
141 |         dirname = os.path.join(".", os.path.relpath(dirname))
142 |         return dirname
143 |
144 |
145 | class NonblockingRead(object):
146 |     iostream = None
147 |
148 |     def __init__(self, iostream):
149 |         self.iostream = iostream
150 |
151 |         fd = iostream.fileno()
152 |         flags = fcntl.fcntl(fd, fcntl.F_GETFL)
153 |         fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
154 |
155 |     def readlines(self):
156 |         out = self.iostream.readlines()
157 |         return [line.decode("ascii", "ignore") for line in out]
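
The class above switches a stream to non-blocking mode so that polling for a password prompt never stalls the caller. The sketch below shows the same `fcntl` technique on a raw file descriptor. `set_nonblocking` and `read_available` are hypothetical names used only here, and the approach is POSIX-specific.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to the descriptor's existing status flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def read_available(fd, limit=4096):
    """Return the bytes that are ready on `fd`, or b"" if none are.

    Note that os.read also returns b"" at end of file, so this sketch
    does not distinguish "nothing yet" from "writer closed".
    """
    try:
        return os.read(fd, limit)
    except BlockingIOError:
        return b""
```

A caller can poll `read_available` between timeouts and look for a prompt string in whatever has arrived so far.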
158 |
159 |
160 | class ExtractorError(Exception):
161 |     pass
162 |
163 |
164 | class ExtractorUnusable(Exception):
165 |     pass
166 |
167 |
168 | EXTRACTION_ERRORS = (ExtractorError, ExtractorUnusable, OSError, IOError)
169 |
170 |
171 | class BaseExtractor(object):
172 |     decoders = {
173 |         "bzip2": ["bzcat"],
174 |         "gzip": ["zcat"],
175 |         "compress": ["zcat"],
176 |         "lzma": ["lzcat"],
177 |         "xz": ["xzcat"],
178 |         "lzip": ["lzip", "-cd"],
179 |         "zstd": ["zstd", "-d"],
180 |         "br": ["br", "--decompress"],
181 |     }
182 |     name_checker = DirectoryChecker
183 |
184 |     def __init__(self, filename, encoding):
185 |         # bit of a hack, if we're doing lrzip, need to set the correct quiet
186 |         # option based on what's supported, since this behavior changed
187 |         if encoding in ("lrzip", "lrz"):
188 |             # need to check if this version of lrzip supports the -Q option
189 |             output = subprocess.check_output("lrzip --help", stderr=subprocess.STDOUT, shell=True)
190 |             if b"-Q" in output:
191 |                 decoder = ["lrzcat", "-Q"]
192 |             else:
193 |                 decoder = ["lrzcat", "-q"]
194 |             self.decoders["lrz"] = decoder
195 |             self.decoders["lrzip"] = decoder
196 |
197 |         if encoding and (encoding not in self.decoders):
198 |             raise ValueError("unrecognized encoding %s" % (encoding,))
199 |         self.filename = os.path.realpath(filename)
200 |         self.encoding = encoding
201 |         self.ignore_pw = False
202 |         self.password = None
203 |         self.file_count = 0
204 |         self.included_archives = []
205 |         self.target = None
206 |         self.content_type = None
207 |         self.content_name = None
208 |         self.pipes = []
209 |         self.user_stdin = False
210 |         self.stderr = ""
211 |         self.pw_prompted = False
212 |         self.exit_codes = []
213 |         try:
214 |             self.archive = open(filename, "r")
215 |         except (IOError, OSError) as error:
216 |             raise ExtractorError("could not open %s: %s" % (filename, error.strerror))
217 |         if encoding:
218 |             self.pipe(self.decoders[encoding], "decoding")
219 |         self.prepare()
220 |
221 |     def pipe(self, command, description="extraction"):
222 |         self.pipes.append((command, description))
223 |
224 |     def add_process(self, processes, command, stdin, stdout):
225 |         try:
226 |             logger.debug("running command: {}".format(command))
227 |             processes.append(
228 |                 subprocess.Popen(command, stdin=stdin, stdout=stdout, stderr=subprocess.PIPE)
229 |             )
230 |         except OSError as error:
231 |             if error.errno == errno.ENOENT:
232 |                 raise ExtractorUnusable("could not run %s" % (command[0],))
233 |             raise
234 |
235 |     def timeout_check(self, pipe):
236 |         pass
237 |
238 |     def wait_for_exit(self, pipe):
239 |         while True:
240 |             try:
241 |                 return pipe.wait(timeout=1)
242 |             except subprocess.TimeoutExpired:
243 |                 logger.debug("timeout hit...")
244 |                 self.timeout_check(pipe)
245 |                 # Verify that we're not waiting for a password in non-interactive mode
246 |                 if self.pw_prompted and self.ignore_pw:
247 |                     pipe.kill()
248 |                     # Whatever extractor we're using probably left the
249 |                     # terminal hiding output..
250 |                     os.system("stty echo")
251 |                     # Clean up the error output
252 |                     self.stderr = ""
253 |                     raise ExtractorError(
254 |                         "cannot extract encrypted archive '%s' in non-interactive mode"
255 |                         " without a password" % (self.filename)
256 |                     )
257 |
258 |     def send_stdout_to_dev_null(self):
259 |         return True
260 |
261 |     def run_pipes(self, final_stdout=None):
262 |         has_output_target = True if final_stdout or self.send_stdout_to_dev_null() else False
263 |         if not self.pipes:
264 |             return
265 |         elif final_stdout is None:
266 |             final_stdout = open("/dev/null", "w")
267 |         num_pipes = len(self.pipes)
268 |         last_pipe = num_pipes - 1
269 |         processes = []
270 |         for index, command in enumerate([pipe[0] for pipe in self.pipes]):
271 |             if index == 0:
272 |                 stdin = None if self.user_stdin else self.archive
273 |             else:
274 |                 stdin = processes[-1].stdout
275 |             if index == last_pipe and has_output_target:
276 |                 stdout = final_stdout
277 |             else:
278 |                 stdout = subprocess.PIPE
279 |             self.add_process(processes, command, stdin, stdout)
280 |         self.exit_codes = [self.wait_for_exit(pipe) for pipe in processes]
281 |         for pipe in processes:
282 |             # Grab any remaining error messages
283 |             errs = pipe.stderr.readlines()
284 |             self.stderr += b"".join(errs).decode("ascii", "ignore")
285 |         self.archive.close()
286 |         for index in range(last_pipe):
287 |             processes[index].stdout.close()
288 |         self.archive = final_stdout
289 |
290 |     def prepare(self):
291 |         pass
292 |
293 |     def check_included_archives(self):
294 |         if (self.content_name is None) or (not self.content_name.endswith("/")):
295 |             self.included_root = "./"
296 |         else:
297 |             self.included_root = self.content_name
298 |         start_index = len(self.included_root)
299 |         for path, _dirname, filenames in os.walk(self.included_root):
300 |             self.file_count += len(filenames)
301 |             path = path[start_index:]
302 |             for filename in filenames:
303 |                 if ExtractorBuilder.try_by_mimetype(filename) or ExtractorBuilder.try_by_extension(
304 |                     filename
305 |                 ):
306 |                     self.included_archives.append(os.path.join(path, filename))
307 |
308 |     def check_contents(self):
309 |         if not self.contents:
310 |             self.content_type = EMPTY
311 |         elif len(self.contents) == 1:
312 |             if self.basename() == self.contents[0]:
313 |                 self.content_type = MATCHING_DIRECTORY
314 |             elif os.path.isdir(self.contents[0]):
315 |                 self.content_type = ONE_ENTRY_DIRECTORY
316 |             else:
317 |                 self.content_type = ONE_ENTRY_FILE
318 |             self.content_name = self.contents[0]
319 |             if os.path.isdir(self.contents[0]):
320 |                 self.content_name += "/"
321 |         else:
322 |             self.content_type = BOMB
323 |         self.check_included_archives()
324 |
325 |     def basename(self):
326 |         pieces = os.path.basename(self.filename).split(".")
327 |         orig_len = len(pieces)
328 |         extension = "." + pieces[-1]
329 |         # This is maybe a little more clever than it ought to be.
330 |         # We're trying to be conservative about what we remove, but also DTRT
331 |         # in cases like .tar.gz, and also do something reasonable if we
332 |         # encounter some completely off-the-wall extension. So that means:
333 |         # 1. First remove any compression extension.
334 |         # 2. Then remove any commonly known extension that remains.
335 |         # 3. If neither of those did anything, remove anything that looks
336 |         #    like it's almost certainly an extension (less than 5 chars).
337 |         if extension in mimetypes.encodings_map:
338 |             pieces.pop()
339 |             extension = "." + pieces[-1]
340 |         if (
341 |             extension in mimetypes.types_map
342 |             or extension in mimetypes.common_types
343 |             or extension in mimetypes.suffix_map
344 |         ):
345 |             pieces.pop()
346 |         if (orig_len == len(pieces)) and (orig_len > 1) and (len(pieces[-1]) < 5):
347 |             pieces.pop()
348 |         return ".".join(pieces)
349 |
350 |     def is_fatal_error(self, status):
351 |         return False
352 |
353 |     def first_bad_exit_code(self):
354 |         for index, code in enumerate(self.exit_codes):
355 |             if code > 0:
356 |                 return index, code
357 |         return None, None
358 |
359 |     def check_success(self, got_files):
360 |         error_index, error_code = self.first_bad_exit_code()
361 |         logger.debug("success results: %s %s %s" % (got_files, error_index, self.exit_codes))
362 |         if self.is_fatal_error(error_code) or ((not got_files) and (error_code is not None)):
363 |             command = " ".join(self.pipes[error_index][0])
364 |             self.pw_prompt = False  # Don't silently fail with wrong password
365 |             raise ExtractorError(
366 |                 "%s error: '%s' returned status code %s"
367 |                 % (self.pipes[error_index][1], command, error_code)
368 |             )
369 |
370 |     def extract_archive(self):
371 |         self.pipe(self.extract_pipe)
372 |         self.run_pipes()
373 |
374 |     def extract(self, ignore_passwd=False, password=None):
375 |         self.ignore_pw = ignore_passwd
376 |         self.password = password
377 |         try:
378 |             dirname = tempfile.mkdtemp(prefix=".dtrx-", dir=".")
379 |             # We want the directory to be relative to the current directory
380 |             dirname = os.path.join(".", os.path.relpath(dirname))
381 |             self.target = dirname
382 |
383 |         except (OSError, IOError) as error:
384 |             raise ExtractorError("cannot extract here: %s" % (error.strerror,))
385 |         old_path = os.path.realpath(os.curdir)
386 |         os.chdir(self.target)
387 |         try:
388 |             self.archive.seek(0, 0)
389 |             self.extract_archive()
390 |             self.contents = os.listdir(".")
391 |             self.check_contents()
392 |             self.check_success(self.content_type != EMPTY)
393 |         except EXTRACTION_ERRORS:
394 |             self.archive.close()
395 |             os.chdir(old_path)
396 |             shutil.rmtree(self.target, ignore_errors=True)
397 |             raise
398 |         self.archive.close()
399 |         os.chdir(old_path)
400 |
401 |     def get_filenames(self, internal=False):
402 |         if not internal:
403 |             self.pipe(self.list_pipe, "listing")
404 |         processes = []
405 |         stdin = self.archive
406 |         for command in [pipe[0] for pipe in self.pipes]:
407 |             self.add_process(processes, command, stdin, subprocess.PIPE)
408 |             stdin = processes[-1].stdout
409 |         get_output_line = processes[-1].stdout.readline
410 |         while True:
411 |             line = get_output_line().decode("ascii", errors="ignore")
412 |             if not line:
413 |                 break
414 |             yield line.rstrip("\n")
415 |         self.exit_codes = [pipe.wait() for pipe in processes]
416 |         self.archive.close()
417 |         for process in processes:
418 |             process.stdout.close()
419 |         self.check_success(False)
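
The three-step extension heuristic described in the comments of `BaseExtractor.basename` can be tried in isolation with the sketch below. `strip_archive_extensions` is a hypothetical standalone function. Unlike the method, it guards against names that contain no dot, which is an assumption added for this sketch.

```python
import mimetypes


def strip_archive_extensions(filename):
    """Guess a directory name for an archive by trimming its extensions."""
    pieces = filename.split(".")
    original_count = len(pieces)
    # 1. Remove a compression extension such as ".gz".
    if len(pieces) > 1 and "." + pieces[-1] in mimetypes.encodings_map:
        pieces.pop()
    # 2. Remove one commonly known extension such as ".tar" or ".zip".
    extension = "." + pieces[-1]
    if len(pieces) > 1 and (
        extension in mimetypes.types_map
        or extension in mimetypes.common_types
        or extension in mimetypes.suffix_map
    ):
        pieces.pop()
    # 3. If nothing matched, drop a short trailing piece (under 5 characters).
    if original_count == len(pieces) and original_count > 1 and len(pieces[-1]) < 5:
        pieces.pop()
    return ".".join(pieces)
```

For example, `strip_archive_extensions("coreutils-5.0.tar.gz")` drops `.gz` and then `.tar` but keeps the `.0` of the version number, because step 3 only runs when the first two steps removed nothing.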
420 |
421 |
422 | class CompressionExtractor(BaseExtractor):
423 |     file_type = "compressed file"
424 |     name_checker = FilenameChecker
425 |
426 |     def basename(self):
427 |         pieces = os.path.basename(self.filename).split(".")
428 |         extension = "." + pieces[-1]
429 |         if extension in mimetypes.encodings_map:
430 |             pieces.pop()
431 |         return ".".join(pieces)
432 |
433 |     def get_filenames(self):
434 |         # This code used to just immediately yield the basename, under the
435 |         # assumption that that would be the filename. However, if that
436 |         # happens, dtrx -l will report this as a valid result for files with
437 |         # compression extensions, even if those files shouldn't actually be
438 |         # handled this way. So, we call out to the file command to do a quick
439 |         # check and make sure this actually looks like a compressed file.
440 |         if "compress" not in [match[0] for match in ExtractorBuilder.try_by_magic(self.filename)]:
441 |             raise ExtractorError("doesn't look like a compressed file")
442 |         yield self.basename()
443 |
444 |     def extract(self, ignore_passwd=False, password=None):
445 |         self.ignore_pw = ignore_passwd
446 |         self.password = password
447 |         self.content_type = ONE_ENTRY_KNOWN
448 |         self.content_name = self.basename()
449 |         self.contents = None
450 |         self.file_count = 1
451 |         self.included_root = "./"
452 |         try:
453 |             output_fd, self.target = tempfile.mkstemp(prefix=".dtrx-", dir=".")
454 |         except (OSError, IOError) as error:
455 |             raise ExtractorError("cannot extract here: %s" % (error.strerror,))
456 |         self.run_pipes(output_fd)
457 |         os.close(output_fd)
458 |         try:
459 |             self.check_success(os.stat(self.target)[stat.ST_SIZE] > 0)
460 |         except EXTRACTION_ERRORS:
461 |             os.unlink(self.target)
462 |             raise
463 |
464 |
465 | class TarExtractor(BaseExtractor):
466 |     file_type = "tar file"
467 |     extract_pipe = ["tar", "-x"]
468 |     list_pipe = ["tar", "-t"]
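
`TarExtractor` only names its commands; the chaining of a decoder into an extractor is done by `run_pipes`, where each stage reads the previous stage's stdout. The sketch below shows that chaining pattern on its own. `run_chain` is a hypothetical helper, and it feeds the first stage from an in-memory byte string rather than an open archive file, so it is only suitable for small inputs.

```python
import subprocess
import sys


def run_chain(commands, data):
    """Pass `data` (bytes) through `commands`, connecting stdout to stdin."""
    processes = []
    for index, command in enumerate(commands):
        stdin = subprocess.PIPE if index == 0 else processes[-1].stdout
        processes.append(subprocess.Popen(command, stdin=stdin, stdout=subprocess.PIPE))
        if index > 0:
            # Drop the parent's copy of the pipe so only the child reads it.
            processes[-2].stdout.close()
    processes[0].stdin.write(data)
    processes[0].stdin.close()
    output = processes[-1].stdout.read()
    exit_codes = [process.wait() for process in processes]
    return output, exit_codes


if __name__ == "__main__":
    # Two stand-in stages written in Python: upper-case, then reverse.
    upper = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    reverse = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read()[::-1])"]
    print(run_chain([upper, reverse], b"abc"))
```

Collecting one exit code per stage, as here, is what lets a caller report which command in the chain failed.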
469 |
470 |
471 | class CpioExtractor(BaseExtractor):
472 |     file_type = "cpio file"
473 |     extract_pipe = [
474 |         "cpio",
475 |         "-i",
476 |         "--make-directories",
477 |         "--quiet",
478 |         "--no-absolute-filenames",
479 |     ]
480 |     list_pipe = ["cpio", "-t", "--quiet"]
481 |
482 |
483 | class RPMExtractor(CpioExtractor):
484 |     file_type = "RPM"
485 |
486 |     def prepare(self):
487 |         self.pipe(["rpm2cpio", "-"], "rpm2cpio")
488 |
489 |     def basename(self):
490 |         pieces = os.path.basename(self.filename).split(".")
491 |         if len(pieces) == 1:
492 |             return pieces[0]
493 |         elif pieces[-1] != "rpm":
494 |             return BaseExtractor.basename(self)
495 |         pieces.pop()
496 |         if len(pieces) == 1:
497 |             return pieces[0]
498 |         elif len(pieces[-1]) < 8:
499 |             pieces.pop()
500 |         return ".".join(pieces)
501 |
502 |     def check_contents(self):
503 |         self.check_included_archives()
504 |         self.content_type = BOMB
505 |
506 |
507 | class DebExtractor(TarExtractor):
508 |     file_type = "Debian package"
509 |     data_re = re.compile(r"^data\.tar\.[a-z0-9]+$")
510 |
511 |     def prepare(self):
512 |         self.pipe(["ar", "t", self.filename], "finding package data file")
513 |         for filename in self.get_filenames(internal=True):
514 |             if self.data_re.match(filename):
515 |                 data_filename = filename
516 |                 break
517 |         else:
518 |             raise ExtractorError(".deb contains no data.tar file")
519 |         self.archive.seek(0, 0)
520 |         self.pipes.pop()
521 |         # self.pipes = start_pipes
522 |         encoding = mimetypes.guess_type(data_filename)[1]
523 |         if not encoding:
524 |             raise ExtractorError("data.tar file has unrecognized encoding")
525 |         self.pipe(["ar", "p", self.filename, data_filename], "extracting data.tar from .deb")
526 |         self.pipe(self.decoders[encoding], "decoding data.tar")
527 |
528 |     def basename(self):
529 |         pieces = os.path.basename(self.filename).split("_")
530 |         if len(pieces) == 1:
531 |             return pieces[0]
532 |         last_piece = pieces.pop()
533 |         if (len(last_piece) > 10) or (not last_piece.endswith(".deb")):
534 |             return BaseExtractor.basename(self)
535 |         return "_".join(pieces)
536 |
537 |     def check_contents(self):
538 |         self.check_included_archives()
539 |         self.content_type = BOMB
540 |
541 |
542 | class DebMetadataExtractor(DebExtractor):
543 |     def prepare(self):
544 |         self.pipe(["ar", "p", self.filename, "control.tar.gz"], "control.tar.gz extraction")
545 |         self.pipe(["zcat"], "control.tar.gz decompression")
546 |
547 |
548 | class GemExtractor(TarExtractor):
549 |     file_type = "Ruby gem"
550 |
551 |     def prepare(self):
552 |         self.pipe(["tar", "-xO", "data.tar.gz"], "data.tar.gz extraction")
553 |         self.pipe(["zcat"], "data.tar.gz decompression")
554 |
555 |     def check_contents(self):
556 |         self.check_included_archives()
557 |         self.content_type = BOMB
558 |
559 |
560 | class GemMetadataExtractor(CompressionExtractor):
561 |     file_type = "Ruby gem"
562 |
563 |     def prepare(self):
564 |         self.pipe(["tar", "-xO", "metadata.gz"], "metadata.gz extraction")
565 |         self.pipe(["zcat"], "metadata.gz decompression")
566 |
567 |     def basename(self):
568 |         return os.path.basename(self.filename) + "-metadata.txt"
569 |
570 |
571 | class NoPipeExtractor(BaseExtractor):
572 |     # Some extraction tools won't accept the archive from stdin. With
573 |     # these, the piping infrastructure we normally set up generally doesn't
574 |     # work, at least at first. We can still use most of it; we just don't
575 |     # want to seed self.archive with the archive file, since that sucks up
576 |     # memory. So instead we seed it with /dev/null, and specify the
577 |     # filename on the command line as necessary. We also open the actual
578 |     # file with os.open, to make sure we can actually do it (permissions
579 |     # are good, etc.). This class doesn't do anything by itself; it's just
580 |     # meant to be a base class for extractors that rely on these dumb
581 |     # tools.
582 |     def __init__(self, filename, encoding):
583 |         os.close(os.open(filename, os.O_RDONLY))
584 |         BaseExtractor.__init__(self, "/dev/null", None)
585 |         self.filename = os.path.realpath(filename)
586 |         self.user_stdin = True
587 |
588 |     def extract_archive(self):
589 |         # the commands provided by the child class have optional format codes
590 |         # that will be replaced here
591 |         extract_fmt_args = {
592 |             "OUTPUT_FILE": os.path.splitext(os.path.basename(self.filename))[0],
593 |         }
594 |         formatted_extract_commands = [x.format(**extract_fmt_args) for x in self.extract_command]
595 |
596 |         self.extract_pipe = formatted_extract_commands + [self.filename]
597 |         BaseExtractor.extract_archive(self)
598 |
599 |     def get_filenames(self):
600 |         self.list_pipe = self.list_command + [self.filename]
601 |         return BaseExtractor.get_filenames(self)
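
`NoPipeExtractor.extract_archive` treats each element of a subclass's `extract_command` as a `str.format` template and then appends the archive path. The sketch below isolates that substitution step. `build_command` is a hypothetical name, and the `/tmp/notes.txt.br` path in the usage line is made up for illustration.

```python
import os


def build_command(template, archive_path):
    """Fill {OUTPUT_FILE} placeholders, then append the archive path."""
    values = {
        # The archive's file name with its last extension removed.
        "OUTPUT_FILE": os.path.splitext(os.path.basename(archive_path))[0],
    }
    return [part.format(**values) for part in template] + [archive_path]


if __name__ == "__main__":
    print(build_command(["brotli", "--decompress", "--output={OUTPUT_FILE}"], "/tmp/notes.txt.br"))
```

One consequence of using `str.format` this way is that a command element containing a literal `{` or `}` would need the braces doubled, since every element is formatted whether or not it has a placeholder.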
602 |
603 |
604 | class ZipExtractor(NoPipeExtractor):
605 |     file_type = "Zip file"
606 |     list_command = ["zipinfo", "-1"]
607 |
608 |     @property
609 |     def extract_command(self):
610 |         """
611 |         Returns the extraction command and adds a password if given.
612 |         """
613 |         cmd = ["unzip", "-q"]
614 |         if self.password:
615 |             cmd.extend(["-P", self.password])
616 |         return cmd
617 |
618 |     def is_fatal_error(self, status):
619 |         return (status or 0) > 1
620 |
621 |     def timeout_check(self, pipe):
622 |         nbs = NonblockingRead(pipe.stderr)
623 |         errs = nbs.readlines()
624 |
625 |         self.stderr += "".join(errs)
626 |
627 |         # pass through the password prompt, if unzip sent one
628 |         if errs and "password" in errs[-1]:
629 |             sys.stdout.write("\n" + errs[-1])
630 |             sys.stdout.flush()
631 |             self.pw_prompted = True
632 |
633 |
634 | class LZHExtractor(ZipExtractor):
635 |     file_type = "LZH file"
636 |     extract_command = ["lha", "xq"]
637 |     list_command = ["lha", "l"]
638 |
639 |     def border_line_file_index(self, line):
640 |         last_space_index = None
641 |         for index, char in enumerate(line):
642 |             if char == " ":
643 |                 last_space_index = index
644 |             elif char != "-":
645 |                 return None
646 |         if last_space_index is None:
647 |             return None
648 |         return last_space_index + 1
649 |
650 |     def get_filenames(self):
651 |         filenames = NoPipeExtractor.get_filenames(self)
652 |         for line in filenames:
653 |             fn_index = self.border_line_file_index(line)
654 |             if fn_index is not None:
655 |                 break
656 |         for line in filenames:
657 |             if self.border_line_file_index(line):
658 |                 break
659 |             else:
660 |                 yield line[fn_index:]
661 |         self.archive.close()
662 |
663 |
664 | class SevenExtractor(NoPipeExtractor):
665 |     file_type = "7z file"
666 |     list_command = ["7z", "l", "-ba"]
667 |     border_re = re.compile("^[- ]+$")
668 |     space_re = re.compile(" ")
669 |
670 |     @property
671 |     def extract_command(self):
672 |         """
673 |         Returns the extraction command and adds a password if given.
674 |         """
675 |         cmd = ["7z", "x"]
676 |         if self.password:
677 |             cmd.append("-p%s" % (self.password,))
678 |         return cmd
679 |
680 |     def get_filenames(self):
681 |         for line in NoPipeExtractor.get_filenames(self):
682 |             if " " in line:
683 |                 pos = line.rindex(" ") + 1
684 |                 yield line[pos:]
685 |         self.archive.close()
686 |
687 |     def send_stdout_to_dev_null(self):
688 |         return False
689 |
690 |     def timeout_check(self, pipe):
691 |         nbs = NonblockingRead(pipe.stdout)
692 |         errs = nbs.readlines()
693 |
694 |         self.stderr += "".join(errs)
695 |
696 |         # pass through the password prompt, if 7z sent one
697 |         if errs and "password" in errs[-1]:
698 |             sys.stdout.write("\n" + errs[-1])
699 |             sys.stdout.flush()
700 |             self.pw_prompted = True
701 |
702 |
703 | class ZstandardExtractor(NoPipeExtractor):
704 |     file_type = "zstd file"
705 |     extract_command = ["zstd", "-d"]
706 |     list_command = ["zstd", "-l"]
707 |     border_re = re.compile("^[- ]+$")
708 |
709 |     def get_filenames(self):
710 |         fn_index = None
711 |         for line in NoPipeExtractor.get_filenames(self):
712 |             if self.border_re.match(line):
713 |                 if fn_index is not None:
714 |                     break
715 |                 else:
716 |                     fn_index = line.rindex(" ") + 1
717 |             elif fn_index is not None:
718 |                 yield line[fn_index:]
719 |         self.archive.close()
720 |
721 |
722 | class BrotliExtractor(NoPipeExtractor):
723 |     file_type = "brotli file"
724 |     extract_command = ["brotli", "--decompress", "--output={OUTPUT_FILE}"]
725 |     # brotli command line doesn't support this mode
726 |     list_command = ["false"]
727 |
728 |     def get_filenames(self):
729 |         # just raise an error, this is not supported
730 |         raise ExtractorError
731 |
732 |
733 | class CABExtractor(NoPipeExtractor):
734 |     file_type = "CAB archive"
735 |     extract_command = ["cabextract", "-q"]
736 |     list_command = ["cabextract", "-l"]
737 |     border_re = re.compile(r"^[-\+]+$")
738 |
739 |     def get_filenames(self):
740 |         filenames = NoPipeExtractor.get_filenames(self)
741 |         for line in filenames:
742 |             if self.border_re.match(line):
743 |                 break
744 |         for line in filenames:
745 |             try:
746 |                 yield line.split(" | ", 2)[2]
747 |             except IndexError:
748 |                 break
749 |         self.archive.close()
750 |
751 |
752 | class ShieldExtractor(NoPipeExtractor):
753 |     file_type = "InstallShield archive"
754 |     extract_command = ["unshield", "x"]
755 |     list_command = ["unshield", "l"]
756 |     prefix_re = re.compile(r"^\s+\d+\s+")
757 |     end_re = re.compile(r"^\s+-+\s+-+\s*$")
758 |
759 |     def get_filenames(self):
760 |         for line in NoPipeExtractor.get_filenames(self):
761 |             if self.end_re.match(line):
762 |                 break
763 |             else:
764 |                 match = self.prefix_re.match(line)
765 |                 if match:
766 |                     yield line[match.end() :]
767 |         self.archive.close()
768 |
769 |     def basename(self):
770 |         result = NoPipeExtractor.basename(self)
771 |         if result.endswith(".hdr"):
772 |             result = result[:-4]
773 |         return result
774 |
775 |
776 | class RarExtractor(NoPipeExtractor):
777 |     file_type = "RAR archive"
778 |     list_command = ["unrar", "v"]
779 |     border_re = re.compile("^-+$")
780 |
781 |     @property
782 |     def extract_command(self):
783 |         """
784 |         Returns the extraction command and adds a password if given.
785 |         """
786 |         cmd = ["unrar", "x"]
787 |         if self.password:
788 |             cmd.append("-p%s" % (self.password,))
789 |         return cmd
790 |
791 |     def get_filenames(self):
792 |         inside = False
793 |         isfile = True
794 |         for line in NoPipeExtractor.get_filenames(self):
795 |             if self.border_re.match(line):
796 |                 if inside:
797 |                     break
798 |                 else:
799 |                     inside = True
800 |             elif inside:
801 |                 if isfile:
802 |                     yield line.strip()
803 |                 isfile = not isfile
804 |         self.archive.close()
805 |
806 |     def timeout_check(self, pipe):
807 |         nbs = NonblockingRead(pipe.stderr)
808 |         errs = nbs.readlines()
809 |
810 |         self.stderr += "".join(errs)
811 |
812 |         # pass through the password prompt, if unrar sent one
813 |         if errs and "password" in errs[-1]:
814 |             sys.stdout.write("\n" + "".join(errs))
815 |             sys.stdout.flush()
816 |             self.pw_prompted = True
817 |
818 |
819 | class UnarchiverExtractor(NoPipeExtractor):
820 | file_type = "RAR archive"
821 | list_command = ["lsar"]
822 |
823 | @property
824 | def extract_command(self):
825 | """
826 | Returns the extraction command and adds a password if given.
827 | """
828 | cmd = ["unar", "-D"]
829 | if self.password:
830 | cmd.extend(["-p", self.password])
831 | return cmd
832 |
833 | def get_filenames(self):
834 | output = NoPipeExtractor.get_filenames(self)
835 | next(output, None)  # skip the lsar header line; tolerate empty output
836 | for line in output:
837 | end_index = line.rfind("(")
838 | yield line[:end_index].strip()
839 |
840 |
841 | class ArjExtractor(NoPipeExtractor):
842 | file_type = "ARJ archive"
843 | list_command = ["arj", "v"]
844 | prefix_re = re.compile(r"^\d+\)\s+")
845 |
846 | @property
847 | def extract_command(self):
848 | """
849 | Returns the extraction command and adds a password if given.
850 | """
851 | cmd = ["arj", "x", "-y"]
852 | if self.password:
853 | cmd.append("-g%s" % (self.password,))
854 | return cmd
855 |
856 | def get_filenames(self):
857 | for line in NoPipeExtractor.get_filenames(self):
858 | match = self.prefix_re.match(line)
859 | if match:
860 | yield line[match.end() :]
861 | self.archive.close()
862 |
863 |
864 | class BaseHandler(object):
865 | def __init__(self, extractor, options):
866 | self.extractor = extractor
867 | self.options = options
868 | self.target = None
869 |
870 | def handle(self):
871 | command = "find"
872 | status = subprocess.call([
873 | "find",
874 | self.extractor.target,
875 | "-type",
876 | "d",
877 | "-exec",
878 | "chmod",
879 | "u+rwx",
880 | "{}",
881 | ";",
882 | ])
883 | if status == 0:
884 | command = "chmod"
885 | status = subprocess.call(["chmod", "-R", "u+rwX", self.extractor.target])
886 | if status != 0:
887 | return "%s returned with exit status %s" % (command, status)
888 | return self.organize()
889 |
890 | def set_target(self, target, checker):
891 | self.target = checker(target).check()
892 | if self.target != target:
893 | logger.warning("extracting %s to %s" % (self.extractor.filename, self.target))
894 |
895 |
896 | # The "where to extract" table, with options and archive types.
897 | # This dictates the contents of each can_handle method.
898 | #
899 | #          Flat       Overwrite   None
900 | # File     basename   basename    FilenameChecked
901 | # Match    .          .           tempdir + checked
902 | # Bomb     .          basename    DirectoryChecked
903 |
904 |
905 | class FlatHandler(BaseHandler):
906 | @staticmethod
907 | def can_handle(contents, options):
908 | return (options.flat and (contents != ONE_ENTRY_KNOWN)) or (
909 | options.overwrite and (contents == MATCHING_DIRECTORY)
910 | )
911 |
912 | def organize(self):
913 | self.target = "."
914 | for curdir, _dirs, filenames in os.walk(self.extractor.target, topdown=False):
915 | path_parts = curdir.split(os.sep)
916 | if path_parts[0] == ".":
917 | del path_parts[1]
918 | else:
919 | del path_parts[0]
920 | newdir = os.path.join(*path_parts)
921 | if not os.path.isdir(newdir):
922 | os.makedirs(newdir)
923 | for filename in filenames:
924 | os.rename(os.path.join(curdir, filename), os.path.join(newdir, filename))
925 | os.rmdir(curdir)
926 |
927 |
928 | class OverwriteHandler(BaseHandler):
929 | @staticmethod
930 | def can_handle(contents, options):
931 | return (options.flat and (contents == ONE_ENTRY_KNOWN)) or (
932 | options.overwrite and (contents != MATCHING_DIRECTORY)
933 | )
934 |
935 | def organize(self):
936 | self.target = self.extractor.basename()
937 | if os.path.isdir(self.target):
938 | shutil.rmtree(self.target)
939 | os.rename(self.extractor.target, self.target)
940 |
941 |
942 | class MatchHandler(BaseHandler):
943 | @staticmethod
944 | def can_handle(contents, options):
945 | return (contents == MATCHING_DIRECTORY) or (
946 | (contents in ONE_ENTRY_UNKNOWN) and options.one_entry_policy.ok_for_match()
947 | )
948 |
949 | def organize(self):
950 | source = os.path.join(self.extractor.target, os.listdir(self.extractor.target)[0])
951 | if os.path.isdir(source):
952 | checker = DirectoryChecker
953 | else:
954 | checker = FilenameChecker
955 | if self.options.one_entry_policy == EXTRACT_HERE:
956 | destination = self.extractor.content_name.rstrip("/")
957 | else:
958 | destination = self.extractor.basename()
959 | self.set_target(destination, checker)
960 | if os.path.isdir(self.extractor.target):
961 | os.rename(source, self.target)
962 | os.rmdir(self.extractor.target)
963 | else:
964 | os.rename(self.extractor.target, self.target)
965 | self.extractor.included_root = "./"
966 |
967 |
968 | class EmptyHandler(object):
969 | target = ""
970 |
971 | @staticmethod
972 | def can_handle(contents, options):
973 | return contents == EMPTY
974 |
975 | def __init__(self, extractor, options):
976 | os.rmdir(extractor.target)
977 |
978 | def handle(self):
979 | pass
980 |
981 |
982 | class BombHandler(BaseHandler):
983 | @staticmethod
984 | def can_handle(contents, options):
985 | return True
986 |
987 | def organize(self):
988 | basename = self.extractor.basename()
989 | self.set_target(basename, self.extractor.name_checker)
990 | os.rename(self.extractor.target, self.target)
991 |
992 |
993 | @total_ordering
994 | class BasePolicy(object):
995 | try:
996 | size = fcntl.ioctl(sys.stdout.fileno(), termios.TIOCGWINSZ, struct.pack("HHHH", 0, 0, 0, 0))
997 | width = struct.unpack("HHHH", size)[1]
998 | except IOError:
999 | width = 80
1000 | width = width - 1
1001 | choice_wrapper = textwrap.TextWrapper(
1002 | width=width,
1003 | initial_indent=" * ",
1004 | subsequent_indent=" ",
1005 | break_long_words=False,
1006 | )
1007 |
1008 | def __init__(self, options):
1009 | self.current_policy = None
1010 | if options.batch:
1011 | self.permanent_policy = self.answers[""]
1012 | else:
1013 | self.permanent_policy = None
1014 |
1015 | def ask_question(self, question):
1016 | question = question + ["You can:"]
1017 | for choice in self.choices:
1018 | question.extend(self.choice_wrapper.wrap(choice))
1019 | while True:
1020 | print("\n".join(question))
1021 | try:
1022 | answer = input(self.prompt)
1023 | except EOFError:
1024 | return self.answers[""]
1025 | try:
1026 | return self.answers[answer.lower()]
1027 | except KeyError:
1028 | print()
1029 |
1030 | def wrap(self, question, *args):
1031 | words = question.split()
1032 | for arg in args:
1033 | words[words.index("%s")] = arg
1034 | result = [words.pop(0)]
1035 | for word in words:
1036 | extend = "%s %s" % (result[-1], word)
1037 | if len(extend) > self.width:
1038 | result.append(word)
1039 | else:
1040 | result[-1] = extend
1041 | return result
1042 |
1043 | def __eq__(self, other):
1044 | return self.current_policy == other
1045 |
1046 | def __lt__(self, other):
1047 | return self.current_policy < other
1048 |
1049 |
1050 | class OneEntryPolicy(BasePolicy):
1051 | answers = {
1052 | "h": EXTRACT_HERE,
1053 | "i": EXTRACT_WRAP,
1054 | "r": EXTRACT_RENAME,
1055 | "": EXTRACT_WRAP,
1056 | }
1057 | choice_template = [
1058 | "extract the %s _I_nside a new directory named %s",
1059 | "extract the %s and _R_ename it %s",
1060 | "extract the %s _H_ere",
1061 | ]
1062 | prompt = "What do you want to do? (I/r/h) "
1063 |
1064 | def __init__(self, options):
1065 | BasePolicy.__init__(self, options)
1066 | if options.flat:
1067 | default = "h"
1068 | elif options.one_entry_default is not None:
1069 | default = options.one_entry_default.lower()
1070 | else:
1071 | return
1072 | if "here".startswith(default):
1073 | self.permanent_policy = EXTRACT_HERE
1074 | elif "rename".startswith(default):
1075 | self.permanent_policy = EXTRACT_RENAME
1076 | elif "inside".startswith(default):
1077 | self.permanent_policy = EXTRACT_WRAP
1078 | elif default is not None:
1079 | raise ValueError("bad value %s for default policy" % (default,))
1080 |
1081 | def prep(self, archive_filename, extractor):
1082 | question = self.wrap(
1083 | "%s contains one %s but its name doesn't match.",
1084 | archive_filename,
1085 | extractor.content_type,
1086 | )
1087 | question.append(" Expected: " + extractor.basename())
1088 | question.append(" Actual: " + extractor.content_name)
1089 | choice_vars = (extractor.content_type, extractor.basename())
1090 | self.choices = [text % choice_vars[: text.count("%s")] for text in self.choice_template]
1091 | self.current_policy = self.permanent_policy or self.ask_question(question)
1092 |
1093 | def ok_for_match(self):
1094 | return self.current_policy in (EXTRACT_RENAME, EXTRACT_HERE)
1095 |
1096 |
1097 | class RecursionPolicy(BasePolicy):
1098 | answers = {
1099 | "o": RECURSE_ONCE,
1100 | "a": RECURSE_ALWAYS,
1101 | "n": RECURSE_NOT_NOW,
1102 | "v": RECURSE_NEVER,
1103 | "l": RECURSE_LIST,
1104 | "": RECURSE_NOT_NOW,
1105 | }
1106 | choices = [
1107 | "_A_lways extract included archives during this session",
1108 | "extract included archives this _O_nce",
1109 | "choose _N_ot to extract included archives this once",
1110 | "ne_V_er extract included archives during this session",
1111 | "_L_ist included archives",
1112 | ]
1113 | prompt = "What do you want to do? (a/o/N/v/l) "
1114 |
1115 | def __init__(self, options):
1116 | BasePolicy.__init__(self, options)
1117 | if options.show_list:
1118 | self.permanent_policy = RECURSE_NEVER
1119 | elif options.recursive:
1120 | self.permanent_policy = RECURSE_ALWAYS
1121 |
1122 | def prep(self, current_filename, target, extractor):
1123 | archive_count = len(extractor.included_archives)
1124 | if (self.permanent_policy is not None) or ((archive_count * 10) <= extractor.file_count):
1125 | self.current_policy = self.permanent_policy or RECURSE_NOT_NOW
1126 | return
1127 | question = self.wrap(
1128 | "%s contains %s other archive file(s), out of %s file(s) total.",
1129 | current_filename,
1130 | archive_count,
1131 | extractor.file_count,
1132 | )
1133 | if target == ".":
1134 | target = ""
1135 | included_root = extractor.included_root
1136 | if included_root == "./":
1137 | included_root = ""
1138 | while True:
1139 | self.current_policy = self.ask_question(question)
1140 | if self.current_policy != RECURSE_LIST:
1141 | break
1142 | print(
1143 | "\n%s\n"
1144 | % "\n".join([
1145 | os.path.join(target, included_root, filename)
1146 | for filename in extractor.included_archives
1147 | ])
1148 | )
1149 | if self.current_policy in (RECURSE_ALWAYS, RECURSE_NEVER):
1150 | self.permanent_policy = self.current_policy
1151 |
1152 | def ok_to_recurse(self):
1153 | return self.current_policy in (RECURSE_ALWAYS, RECURSE_ONCE)
1154 |
1155 |
1156 | class ExtractorBuilder(object):
1157 | extractor_map = {
1158 | "tar": {
1159 | "extractors": (TarExtractor,),
1160 | "mimetypes": ("x-tar",),
1161 | "extensions": ("tar",),
1162 | "magic": ("POSIX tar archive",),
1163 | },
1164 | "zip": {
1165 | "extractors": (ZipExtractor, SevenExtractor),
1166 | "mimetypes": ("zip",),
1167 | "extensions": ("zip", "jar", "epub", "xpi", "crx"),
1168 | "magic": ("(Zip|ZIP self-extracting) archive",),
1169 | },
1170 | "lzh": {
1171 | "extractors": (LZHExtractor,),
1172 | "mimetypes": ("x-lzh", "x-lzh-compressed"),
1173 | "extensions": ("lzh", "lha"),
1174 | "magic": (r"LHa [\d\.\?]+ archive",),
1175 | },
1176 | "rpm": {
1177 | "extractors": (RPMExtractor,),
1178 | "mimetypes": ("x-redhat-package-manager", "x-rpm"),
1179 | "extensions": ("rpm",),
1180 | "magic": ("RPM",),
1181 | },
1182 | "deb": {
1183 | "extractors": (DebExtractor,),
1184 | "metadata": (DebMetadataExtractor,),
1185 | "mimetypes": ("x-debian-package",),
1186 | "extensions": ("deb",),
1187 | "magic": ("Debian binary package",),
1188 | },
1189 | "cpio": {
1190 | "extractors": (CpioExtractor,),
1191 | "mimetypes": ("x-cpio",),
1192 | "extensions": ("cpio",),
1193 | "magic": ("cpio archive",),
1194 | },
1195 | "gem": {
1196 | "extractors": (GemExtractor,),
1197 | "metadata": (GemMetadataExtractor,),
1198 | "mimetypes": ("x-ruby-gem",),
1199 | "extensions": ("gem",),
1200 | },
1201 | "7z": {
1202 | "extractors": (SevenExtractor,),
1203 | "mimetypes": ("x-7z-compressed",),
1204 | "extensions": ("7z",),
1205 | "magic": ("7-zip archive",),
1206 | },
1207 | "cab": {
1208 | "extractors": (CABExtractor,),
1209 | "mimetypes": ("x-cab",),
1210 | "extensions": ("cab",),
1211 | "magic": ("Microsoft Cabinet Archive",),
1212 | },
1213 | "rar": {
1214 | "extractors": (RarExtractor, UnarchiverExtractor),
1215 | "mimetypes": ("rar",),
1216 | "extensions": ("rar",),
1217 | "magic": ("RAR archive",),
1218 | },
1219 | "arj": {
1220 | "extractors": (ArjExtractor,),
1221 | "mimetypes": ("arj",),
1222 | "extensions": ("arj",),
1223 | "magic": ("ARJ archive",),
1224 | },
1225 | "shield": {
1226 | "extractors": (ShieldExtractor,),
1227 | "mimetypes": ("x-cab",),
1228 | "extensions": ("cab", "hdr"),
1229 | "magic": ("InstallShield CAB",),
1230 | },
1231 | "msi": {
1232 | "extractors": (SevenExtractor,),
1233 | "mimetypes": ("x-msi", "x-ole-storage"),
1234 | "extensions": ("msi",),
1235 | "magic": ("Application: Windows Installer",),
1236 | },
1237 | "dmg": {
1238 | "extractors": (SevenExtractor,),
1239 | "mimetypes": ("x-apple-diskimage",),
1240 | "extensions": ("dmg",),
1241 | "magic": (
1242 | "ISO 9660 CD-ROM filesystem data",
1243 | "zlib compressed data",
1244 | ),
1245 | },
1246 | "zst": {
1247 | "extractors": (ZstandardExtractor,),
1248 | "mimetypes": ("application/zstd",),
1249 | "extensions": (
1250 | "zst",
1251 | "zstd",
1252 | ),
1253 | "magic": ("Zstandard compressed data",),
1254 | },
1255 | "brotli": {
1256 | "extractors": (BrotliExtractor,),
1257 | "extensions": ("br",),
1258 | },
1259 | "compress": {"extractors": (CompressionExtractor,)},
1260 | }
1261 |
1262 | mimetype_map = {}
1263 | magic_mime_map = {}
1264 | extension_map = {}
1265 | for ext_name, ext_info in extractor_map.items():
1266 | for mimetype in ext_info.get("mimetypes", ()):
1267 | if "/" not in mimetype:
1268 | mimetype = "application/" + mimetype
1269 | mimetype_map[mimetype] = ext_name
1270 | for magic_re in ext_info.get("magic", ()):
1271 | magic_mime_map[re.compile(magic_re)] = ext_name
1272 | for extension in ext_info.get("extensions", ()):
1273 | extension_map.setdefault(extension, []).append((ext_name, None))
1274 |
1275 | for mapping in (
1276 | ("tar", "bzip2", "tar.bz2", "tbz2", "tb2", "tbz"),
1277 | ("tar", "gzip", "tar.gz", "tgz"),
1278 | ("tar", "lzma", "tar.lzma", "tlz"),
1279 | ("tar", "xz", "tar.xz", "txz"),
1280 | ("tar", "lzip", "tar.lz"),
1281 | ("tar", "compress", "tar.Z", "taz"),
1282 | ("tar", "lrz", "tar.lrz"),
1283 | ("tar", "zstd", "tar.zst"),
1284 | ("compress", "gzip", "Z", "gz"),
1285 | ("compress", "bzip2", "bz2"),
1286 | ("compress", "lzma", "lzma"),
1287 | ("compress", "xz", "xz"),
1288 | ("compress", "lrzip", "lrz"),
1289 | ):
1290 | for extension in mapping[2:]:
1291 | extension_map.setdefault(extension, []).append(mapping[:2])
1292 |
1293 | magic_encoding_map = {}
1294 | for mapping in (
1295 | ("bzip2", "bzip2 compressed"),
1296 | ("gzip", "gzip compressed"),
1297 | ("lzma", "LZMA compressed"),
1298 | ("lzip", "lzip compressed"),
1299 | ("lrzip", "LRZIP compressed"),
1300 | ("zstd", "Zstandard compressed"),
1301 | ("xz", "xz compressed"),
1302 | ):
1303 | for pattern in mapping[1:]:
1304 | magic_encoding_map[re.compile(pattern)] = mapping[0]
1305 |
1306 | def __init__(self, filename, options):
1307 | self.filename = filename
1308 | self.options = options
1309 |
1310 | def build_extractor(self, archive_type, encoding):
1311 | type_info = self.extractor_map[archive_type]
1312 | if self.options.metadata and "metadata" in type_info:
1313 | extractors = type_info["metadata"]
1314 | else:
1315 | extractors = type_info["extractors"]
1316 | for extractor in extractors:
1317 | yield extractor(self.filename, encoding)
1318 |
1319 | def get_extractor(self):
1320 | tried_types = set()
1321 | # As smart as it is, the magic test can't go first, because at least
1322 | # on my system it just recognizes gem files as tar files. I guess
1323 | # it's possible for the opposite problem to occur -- where the mimetype
1324 | # or extension suggests something less than ideal -- but it seems less
1325 | # likely so I'm sticking with this.
1326 | for func_name in ("mimetype", "extension", "magic"):
1327 | logger.debug("getting extractors by %s" % (func_name,))
1328 | extractor_types = getattr(self, "try_by_" + func_name)(self.filename)
1329 | logger.debug("done getting extractors")
1330 | for ext_args in extractor_types:
1331 | if ext_args in tried_types:
1332 | continue
1333 | tried_types.add(ext_args)
1334 | logger.debug("trying %s extractor from %s" % (ext_args, func_name))
1335 | for extractor in self.build_extractor(*ext_args):
1336 | yield extractor
1337 |
1338 | def try_by_mimetype(self, filename):
1339 | mimetype, encoding = mimetypes.guess_type(filename)
1340 | try:
1341 | return [(self.mimetype_map[mimetype], encoding)]
1342 | except KeyError:
1343 | if encoding:
1344 | return [("compress", encoding)]
1345 | return []
1346 |
1347 | try_by_mimetype = classmethod(try_by_mimetype)
1348 |
1349 | def magic_map_matches(self, output, magic_map):
1350 | return [result for regexp, result in magic_map.items() if regexp.search(output)]
1351 |
1352 | magic_map_matches = classmethod(magic_map_matches)
1353 |
1354 | def try_by_magic(self, filename):
1355 | try:
1356 | result = subprocess.run(
1357 | ["file", "-zL", filename], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
1358 | )
1359 | if result.returncode != 0:
1360 | return []
1361 | # if output contains 'ERROR:[', there was an error unzipping the
1362 | # first archive entry. re-run without -z.
1363 | output = result.stdout.split("\n")[0]
1364 | if "ERROR:[" in output:
1365 | result = subprocess.run(
1366 | ["file", "-L", filename],
1367 | stdout=subprocess.PIPE,
1368 | stderr=subprocess.PIPE,
1369 | text=True,
1370 | )
1371 | if result.returncode != 0:
1372 | return []
1373 | output = result.stdout.split("\n")[0]
1374 |
1375 | except FileNotFoundError:
1376 | logger.error("'file' command not found, skipping magic test")
1377 | return []
1378 | if output.startswith("%s: " % filename):
1379 | output = output[len(filename) + 2 :]
1380 | mimes = self.magic_map_matches(output, self.magic_mime_map)
1381 | encodings = self.magic_map_matches(output, self.magic_encoding_map)
1382 | if mimes and not encodings:
1383 | encodings = [None]
1384 | elif encodings and not mimes:
1385 | mimes = ["compress"]
1386 | return [(m, e) for m in mimes for e in encodings]
1387 |
1388 | try_by_magic = classmethod(try_by_magic)
1389 |
1390 | def try_by_extension(self, filename):
1391 | parts = filename.split(".")[-2:]
1392 | results = []
1393 | if len(parts) == 1:
1394 | return results
1395 | while parts:
1396 | results.extend(self.extension_map.get(".".join(parts), []))
1397 | del parts[0]
1398 | return results
1399 |
1400 | try_by_extension = classmethod(try_by_extension)
1401 |
1402 |
1403 | class BaseAction(object):
1404 | def __init__(self, options, filenames):
1405 | self.options = options
1406 | self.filenames = filenames
1407 | self.target = None
1408 | self.do_print = False
1409 |
1410 | def report(self, function, *args):
1411 | try:
1412 | error = function(*args)
1413 | except EXTRACTION_ERRORS as exception:
1414 | error = str(exception)
1415 | logger.debug("".join(traceback.format_exception(*sys.exc_info())))
1416 | return error
1417 |
1418 | def show_filename(self, filename):
1419 | if len(self.filenames) < 2:
1420 | return
1421 | elif self.do_print:
1422 | print()
1423 | else:
1424 | self.do_print = True
1425 | print("%s:" % (filename,))
1426 |
1427 |
1428 | class ExtractionAction(BaseAction):
1429 | handlers = [FlatHandler, OverwriteHandler, MatchHandler, EmptyHandler, BombHandler]
1430 |
1431 | def get_handler(self, extractor):
1432 | if extractor.content_type in ONE_ENTRY_UNKNOWN:
1433 | self.options.one_entry_policy.prep(self.current_filename, extractor)
1434 | for handler in self.handlers:
1435 | if handler.can_handle(extractor.content_type, self.options):
1436 | logger.debug("using %s handler" % (handler.__name__,))
1437 | self.current_handler = handler(extractor, self.options)
1438 | break
1439 |
1440 | def show_extraction(self, extractor):
1441 | if self.options.log_level > logging.INFO:
1442 | return
1443 | self.show_filename(self.current_filename)
1444 | if extractor.contents is None:
1445 | print(self.current_handler.target)
1446 | return
1447 |
1451 | if self.current_handler.target == ".":
1452 | filenames = extractor.contents
1453 | filenames = sorted(filenames, reverse=True)
1454 | else:
1455 | filenames = [self.current_handler.target]
1456 | pathjoin = os.path.join
1457 | isdir = os.path.isdir
1458 | while filenames:
1459 | filename = filenames.pop()
1460 | if isdir(filename):
1461 | print("%s/" % (filename,))
1462 | new_filenames = os.listdir(filename)
1463 | new_filenames = sorted(new_filenames, reverse=True)
1464 | filenames.extend([
1465 | pathjoin(filename, new_filename) for new_filename in new_filenames
1466 | ])
1467 | else:
1468 | print(filename)
1469 |
1470 | def run(self, filename, extractor):
1471 | self.current_filename = filename
1472 | error = (
1473 | self.report(extractor.extract, self.options.batch, self.options.password)
1474 | or self.report(self.get_handler, extractor)
1475 | or self.report(self.current_handler.handle)
1476 | or self.report(self.show_extraction, extractor)
1477 | )
1478 | if not error:
1479 | self.target = self.current_handler.target
1480 | return error
1481 |
1482 |
1483 | class ListAction(BaseAction):
1484 | def list_filenames(self, extractor, filename):
1485 | # We get a line first to make sure there's not going to be some
1486 | # basic error before we show what filename we're listing.
1487 | filename_lister = extractor.get_filenames()
1488 | try:
1489 | first_line = next(filename_lister)
1490 | except StopIteration:
1491 | self.show_filename(filename)
1492 | else:
1493 | self.did_list = True
1494 | self.show_filename(filename)
1495 | print(first_line)
1496 | for line in filename_lister:
1497 | print(line)
1498 |
1499 | def run(self, filename, extractor):
1500 | self.did_list = False
1501 | error = self.report(self.list_filenames, extractor, filename)
1502 | if error and self.did_list:
1503 | logger.error("lister failed: ignore above listing for %s" % (filename,))
1504 | return error
1505 |
1506 |
1507 | class ExtractorApplication(object):
1508 | def __init__(self, arguments):
1509 | for signal_num in (signal.SIGINT, signal.SIGTERM):
1510 | signal.signal(signal_num, self.abort)
1511 | signal.signal(signal.SIGPIPE, signal.SIG_DFL)
1512 | self.parse_options(arguments)
1513 | self.setup_logger()
1514 | self.successes = []
1515 | self.failures = []
1516 |
1517 | def clean_destination(self, dest_name):
1518 | try:
1519 | os.unlink(dest_name)
1520 | except OSError as error:
1521 | if error.errno == errno.EISDIR:
1522 | shutil.rmtree(dest_name, ignore_errors=True)
1523 |
1524 | def abort(self, signal_num, frame):
1525 | signal.signal(signal_num, signal.SIG_IGN)
1526 | print()
1527 | logger.debug("traceback:\n" + "".join(traceback.format_stack(frame)).rstrip())
1528 | logger.debug("got signal %s" % (signal_num,))
1529 | try:
1530 | basename = self.current_extractor.target
1531 | except AttributeError:
1532 | basename = None
1533 | if basename is not None:
1534 | logger.debug("cleaning up %s" % (basename,))
1535 | clean_targets = set([os.path.realpath(".")])
1536 | if hasattr(self, "current_directory"):
1537 | clean_targets.add(os.path.realpath(self.current_directory))
1538 | for directory in clean_targets:
1539 | self.clean_destination(os.path.join(directory, basename))
1540 | sys.exit(1)
1541 |
1542 | @staticmethod
1543 | def get_supported_extensions():
1544 | """
1545 | Return a sorted list of the file extensions that dtrx recognizes.
1546 | """
1547 | # get the lists of built-in extensions and combine them
1548 | ext_map_base = set(ExtractorBuilder.extension_map.keys())
1549 | ext_map = set(
1550 | itertools.chain(*[
1551 | x["extensions"]
1552 | for x in ExtractorBuilder.extractor_map.values()
1553 | if "extensions" in x
1554 | ])
1555 | )
1556 | ext_map = ext_map_base.union(ext_map)
1557 |
1558 | # get the list of extensions supplied by mimetypes
1559 | mimetypes_encodings_map = set([x.lstrip(".") for x in mimetypes.encodings_map])
1560 | # dtrx only supports a subset of the total types_map set, so filter it
1561 | mimetypes_exts = filter(
1562 | lambda x: mimetypes.types_map[x] in ExtractorBuilder.mimetype_map,
1563 | mimetypes.types_map,
1564 | )
1565 | mimetypes_exts = set([x.lstrip(".") for x in mimetypes_exts])
1566 | mimetypes_exts = mimetypes_encodings_map.union(mimetypes_exts)
1567 |
1568 | # sort the output for consistent order
1569 | return sorted(ext_map.union(mimetypes_exts))
1570 |
1571 | def parse_options(self, arguments):
1572 | parser = optparse.OptionParser(
1573 | usage="%prog [options] archive [archive2 ...]",
1574 | description="Intelligent archive extractor",
1575 | version=VERSION_BANNER,
1576 | )
1577 | parser.add_option(
1578 | "-l",
1579 | "-t",
1580 | "--list",
1581 | "--table",
1582 | dest="show_list",
1583 | action="store_true",
1584 | default=False,
1585 | help="list contents of archives on standard output",
1586 | )
1587 | parser.add_option(
1588 | "-m",
1589 | "--metadata",
1590 | dest="metadata",
1591 | action="store_true",
1592 | default=False,
1593 | help="extract metadata from a .deb/.gem",
1594 | )
1595 | parser.add_option(
1596 | "-r",
1597 | "--recursive",
1598 | dest="recursive",
1599 | action="store_true",
1600 | default=False,
1601 | help="extract archives contained in the ones listed",
1602 | )
1603 | parser.add_option(
1604 | "--one",
1605 | "--one-entry",
1606 | dest="one_entry_default",
1607 | default=None,
1608 | help="specify extraction policy for one-entry archives: inside/rename/here",
1609 | )
1610 | parser.add_option(
1611 | "-n",
1612 | "--noninteractive",
1613 | dest="batch",
1614 | action="store_true",
1615 | default=False,
1616 | help="don't ask how to handle special cases",
1617 | )
1618 | parser.add_option(
1619 | "-p",
1620 | "--password",
1621 | dest="password",
1622 | default=None,
1623 | help="provide a password for password-protected archives",
1624 | )
1625 | parser.add_option(
1626 | "-o",
1627 | "--overwrite",
1628 | dest="overwrite",
1629 | action="store_true",
1630 | default=False,
1631 | help="overwrite any existing target output",
1632 | )
1633 | parser.add_option(
1634 | "-f",
1635 | "--flat",
1636 | "--no-directory",
1637 | dest="flat",
1638 | action="store_true",
1639 | default=False,
1640 | help="extract everything to the current directory",
1641 | )
1642 |
1643 | def list_extensions(option, opt, value, parser, *args, **kwargs):
1644 | """callback for optparse to list supported extensions"""
1645 | print("\n".join(ExtractorApplication.get_supported_extensions()))
1646 | sys.exit()
1647 |
1648 | parser.add_option(
1649 | "--list-extensions",
1650 | action="callback",
1651 | callback=list_extensions,
1652 | help=(
1653 | "list supported filetypes by extension. note that these are the"
1654 | " filetypes recognized by dtrx, but extraction still relies on the"
1655 | " appropriate tool to be installed. also note that this is not a"
1656 | " comprehensive list; dtrx will fall back on the 'file' command if the"
1657 | " extension is unknown"
1658 | ),
1659 | )
1660 | parser.add_option(
1661 | "-v",
1662 | "--verbose",
1663 | dest="verbose",
1664 | action="count",
1665 | default=0,
1666 | help="be verbose/print debugging information",
1667 | )
1668 | parser.add_option(
1669 | "-q",
1670 | "--quiet",
1671 | dest="quiet",
1672 | action="count",
1673 | default=3,
1674 | help="suppress warning/error messages",
1675 | )
1676 | self.options, filenames = parser.parse_args(arguments)
1677 | if not filenames:
1678 | parser.error("you did not list any archives")
1679 | # This makes WARNING the default.
1680 | self.options.log_level = 10 * (self.options.quiet - self.options.verbose)
1681 | try:
1682 | self.options.one_entry_policy = OneEntryPolicy(self.options)
1683 | except ValueError:
1684 | parser.error("invalid value for --one-entry option")
1685 | self.options.recursion_policy = RecursionPolicy(self.options)
1686 | self.archives = {os.path.realpath(os.curdir): filenames}
1687 |
1688 | def setup_logger(self):
1689 | logging.getLogger().setLevel(self.options.log_level)
1690 | handler = logging.StreamHandler()
1691 | handler.setLevel(self.options.log_level)
1692 | formatter = logging.Formatter("dtrx: %(levelname)s: %(message)s")
1693 | handler.setFormatter(formatter)
1694 | logger.addHandler(handler)
1695 | logger.debug("logger is set up")
1696 |
1697 | def recurse(self, filename, extractor, action):
1698 | self.options.recursion_policy.prep(filename, action.target, extractor)
1699 | if self.options.recursion_policy.ok_to_recurse():
1700 | for filename in extractor.included_archives:
1701 | logger.debug("recursing with %s archive" % (extractor.content_type,))
1702 | tail_path, basename = os.path.split(filename)
1703 | path_args = [self.current_directory, extractor.included_root, tail_path]
1704 | logger.debug("included root: %s" % (extractor.included_root,))
1705 | logger.debug("tail path: %s" % (tail_path,))
1706 | if os.path.isdir(action.target):
1707 | logger.debug("action target: %s" % (action.target,))
1708 | path_args.insert(1, action.target)
1709 | directory = os.path.join(*path_args)
1710 | self.archives.setdefault(directory, []).append(basename)
1711 |
1712 | def check_file(self, filename):
1713 | try:
1714 | result = os.stat(filename)
1715 | except OSError as error:
1716 | return error.strerror
1717 | if stat.S_ISDIR(result.st_mode):
1718 | return "cannot work with a directory"
1719 |
1720 | def show_stderr(self, logger_func, stderr):
1721 | if stderr:
1722 | logger_func("Error output from this process:\n" + stderr.rstrip("\n"))
1723 |
1724 | def try_extractors(self, filename, builder):
1725 | errors = []
1726 | for extractor in builder:
1727 | self.current_extractor = extractor # For the abort() method.
1728 | error = self.action.run(filename, extractor)
1729 | if error:
1730 | errors.append((extractor.file_type, extractor.encoding, error, extractor.stderr))
1731 | if extractor.target is not None:
1732 | self.clean_destination(extractor.target)
1733 | else:
1734 | logfunc = logger.warning
1735 | if extractor.pw_prompted:
1736 | # Normally stderr contains actual errors. If the archive was
1737 | # password-protected, stderr is full of prompts, which are only
1738 | # relevant when debugging.
1739 | logfunc = logger.debug
1740 | self.show_stderr(logfunc, extractor.stderr)
1741 | self.recurse(filename, extractor, self.action)
1742 | return
1743 | logger.error("could not handle %s" % (filename,))
1744 | if not errors:
1745 | logger.error("not a known archive type")
1746 | return True
1747 | for file_type, encoding, error, stderr in errors:
1748 | message = ["treating as", file_type, "failed:", error]
1749 | if encoding:
1750 | message.insert(1, "%s-encoded" % (encoding,))
1751 | logger.error(" ".join(message))
1752 | self.show_stderr(logger.error, stderr)
1753 | return True
1754 |
1755 | def download(self, filename):
1756 | url = filename.lower()
1757 | for protocol in "http", "https", "ftp":
1758 | if url.startswith(protocol + "://"):
1759 | break
1760 | else:
1761 | return filename, None
1762 | # FIXME: This can fail if there's already a file in the directory
1763 | # that matches the basename of the URL.
1764 | status = subprocess.call(["wget", "-c", filename], stdin=subprocess.PIPE)
1765 | if status != 0:
1766 | return None, "wget returned status code %s" % (status,)
1767 | return os.path.basename(urlparse.urlparse(filename)[2]), None
1768 |
1769 | def run(self):
1770 | if self.options.show_list:
1771 | action = ListAction
1772 | else:
1773 | action = ExtractionAction
1774 | self.action = action(self.options, list(self.archives.values())[0])
1775 | while self.archives:
1776 | self.current_directory, self.filenames = self.archives.popitem()
1777 | os.chdir(self.current_directory)
1778 | for filename in self.filenames:
1779 | filename, error = self.download(filename)
1780 | if not error:
1781 | builder = ExtractorBuilder(filename, self.options)
1782 | error = self.check_file(filename) or self.try_extractors(
1783 | filename, builder.get_extractor()
1784 | )
1785 | if error:
1786 | if error is not True:
1787 | logger.error("%s: %s" % (filename, error))
1788 | self.failures.append(filename)
1789 | else:
1790 | self.successes.append(filename)
1791 | self.options.one_entry_policy.permanent_policy = EXTRACT_WRAP
1792 | if self.failures:
1793 | return 1
1794 | return 0
1795 |
1796 |
1797 | def main():
1798 | app = ExtractorApplication(sys.argv[1:])
1799 | sys.exit(app.run())
1800 |
1801 |
1802 | if __name__ == "__main__":
1803 | main()
1804 |
--------------------------------------------------------------------------------