├── .gitignore
├── AUTHORS
├── HISTORY.rdoc
├── LICENSE
├── README.md
├── Rakefile
├── lib
│   ├── xz.rb
│   └── xz
│       ├── fiddle_helper.rb
│       ├── lib_lzma.rb
│       ├── stream.rb
│       ├── stream_reader.rb
│       ├── stream_writer.rb
│       └── version.rb
├── ruby-xz.gemspec
└── test
    ├── common.rb
    ├── test-data
    │   ├── iso88591.txt.xz
    │   ├── lorem_ipsum.txt
    │   └── lorem_ipsum.txt.xz
    ├── test_stream_reader.rb
    ├── test_stream_writer.rb
    ├── test_tarball.rb
    └── test_xz.rb


/.gitignore:
--------------------------------------------------------------------------------
1 | /*.gem
2 | doc
3 | test/test-data/lorem2.txt.xz
4 | pkg/
5 | 


--------------------------------------------------------------------------------
/AUTHORS:
--------------------------------------------------------------------------------
1 | = List of contributors
2 | 
3 | All the people who worked on this project, in alphabetical order.
4 | 
5 | * Marvin Gülker (Quintus) <m-guelker@phoenixmail.de>
6 | * Christoph Plank (chrisistuff)
7 | 


--------------------------------------------------------------------------------
/HISTORY.rdoc:
--------------------------------------------------------------------------------
 1 | = Version history
 2 | 
 3 | == 1.0.0 (2018-05-20)
 4 | 
 5 | * *BreakingChange* The XZ module's methods now take any parameters
 6 |   beyond the IO object as real Ruby keyword arguments rather than
 7 |   a long argument list.
 8 | * *BreakingChange* XZ.decompress_stream now honours Ruby's
 9 |   external and internal encoding concept instead of just
10 |   returning BINARY-tagged strings.
11 | * *BreakingChange* Remove deprecated API on stream reader/writer
12 |   class and instead sync the API with Ruby's zlib library
13 |   (Ticket #12 by me).
14 | * *BreakingChange* StreamWriter.new and StreamReader.new do not accept
15 |   a block anymore. This is part of syncing with Ruby's zlib API.
16 | * *BreakingChange* StreamReader.open and StreamWriter.open always
17 |   return the new instance, even if a block is given to the method
18 |   (previous behaviour was to return the return value of the block).
19 |   This is part of the syncing with Ruby's zlib API.
20 | * *BreakingChange* StreamReader.new and StreamWriter.new as well as
21 |   the ::open variants take additional arguments as real Ruby keyword
22 |   arguments now instead of a long parameter list plus options hash.
23 |   This is different from Ruby's own zlib API as that one takes both
24 |   a long parameter list and a hash of additional options. ruby-xz
25 |   is meant to follow zlib's semantics mostly, but not as a drop-in
26 |   replacement, so this divergence from zlib's API is okay (also
27 |   given that it isn't possible to replicate all possible options
28 |   1:1 anyway, since liblzma simply accepts different options than
29 |   libz). If you've never used these methods' optional arguments,
30 |   you should be fine.
31 | * *BreakingChange* Stream#close now returns nil instead of the
32 |   number of bytes written. This syncs Stream#close with Ruby's
33 |   own IO#close, which also returns nil.
34 | * *BreakingChange* Remove Stream#pos=, Stream#seek, Stream#stat. These
35 |   methods irritated the minitar gem, which doesn't expect them to
36 |   raise NotImplementedError, but directly to be missing if the object
37 |   does not support seeking.
38 | * *BreakingChange* StreamReader and StreamWriter now honour Ruby's
39 |   encoding system instead of returning only BINARY-tagged strings.
40 | * *Dependency* Remove dependency on ffi. ruby-xz now uses fiddle from
41 |   the stdlib instead.
42 | * *Dependency* Remove dependency on io-like. ruby-xz now implements
43 |   all the IO mechanics itself. (Ticket #10 by me)
44 | * *Dependency* Bump required Ruby version to 2.3.0.
45 | * *Fix* liblzma.dylib not being found on OS X (Ticket #15 by
46 |   s0nspark).
47 | 
48 | == 0.2.3 (2015-12-29)
49 | 
50 | * *Fix* documentation of XZ module (a :nodoc: was causing havoc
51 |   in the XZ module so it appeared to have no methods).
52 | * No other changes this release.
53 | 
54 | == 0.2.2 (2015-12-27)
55 | 
56 | * *Add* XZ.disable_deprecation_notices
57 | * *Deprecate* use of XZ::StreamReader.open with an IO argument
58 | * *Deprecate* use of XZ::StreamReader.new with a filename argument
59 | * *Deprecate* use of XZ::StreamWriter.open with an IO argument
60 | * *Deprecate* use of XZ::StreamWriter.new with a filename argument
61 | * *Deprecate* nonautomatic IO close in XZ::StreamReader#close
62 | * *Deprecate* nonautomatic IO close in XZ::StreamWriter#close
63 | * *Fix* incompatibility with Resolv.getaddress() in Ruby 2.2 (Ticket #13
64 |   by Ken Simon)
65 | * Goal of these deprecations is to sync the API with Ruby’s own
66 |   Zlib::GzipWriter and Zlib::GzipReader mostly.
67 | * Add required versions to gemspec.
68 | * Comment format cleanup, results in better docs.
69 | * Internal code cleanup
70 | * Add more tests.
71 | 
72 | == 0.2.1 (2014-02-08)
73 | 
74 | * Build the gem properly on Ruby 2.0+ (PR #8 by Nana Sakisaka (saki7))
75 | * Release the GIL when interfacing with liblzma (PR #7 by Lars Christensen (larsch))
76 | 
77 | == 0.2.0 (2013-06-23)
78 | 
79 | * Fix #6 (errors on JRuby) by Ben Nagy
80 | * <b>Remove 1.8 compatibility</b>
81 | 
82 | == 0.1.0 (2013-02-17)
83 | 
84 | * <b>Add XZ::StreamReader and XZ::StreamWriter for io-like behaviour.</b>
85 | * New dependency on the +io-like+ gem.
86 | * <b>Add Ruby 1.8 compatibility.</b> Thanks to Christoph Plank.
87 | * We now have proper unit tests.
88 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | Copyright © 2011-2018 Marvin Gülker et al.
 2 | 
 3 | See AUTHORS for the full list of contributors.
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a
 6 | copy of this software and associated documentation files (the ‘Software’),
 7 | to deal in the Software without restriction, including without limitation
 8 | the rights to use, copy, modify, merge, publish, distribute, sublicense,
 9 | and/or sell copies of the Software, and to permit persons to whom the Software
10 | is furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21 | THE SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | Project has new maintainer
  2 | ==========================
  3 | 
  4 | As of 2021, I needed to abandon this project. I moved on to other
  5 | things and was unable to maintain it any further. Luckily, GitHub user
  6 | @win93 has kindly taken over maintenance of ruby-xz. The new
  7 | repository for ruby-xz is here: https://github.com/win93/ruby-xz
  8 | Please file any tickets or pull requests at that repository. @win93 has
  9 | also taken over the **ruby-xz** RubyGem with my permission.
 10 | 
 11 | This repository only remains for historical and archival purposes.
 12 | Thanks to @win93 for taking over **ruby-xz.**
 13 | 
 14 | ruby-xz
 15 | =======
 16 | 
 17 | **ruby-xz** is a basic binding to the famous [liblzma library][1],
 18 | best known for the extreme compression ratio its native *XZ* format
 19 | achieves. ruby-xz gives you the possibility of creating and extracting
 20 | XZ archives on any platform where liblzma is installed. No compilation
 21 | is needed, because ruby-xz is written on top of Ruby's “fiddle” library
 22 | (part of the standard library). ruby-xz does not have any dependencies
 23 | other than Ruby itself.
 24 | 
 25 | ruby-xz supports both “intuitive” (de)compression by providing methods to
 26 | directly operate on strings and files, but also allows you to operate
 27 | directly on IO streams (see the various methods of the XZ module). On top
 28 | of that, ruby-xz offers an advanced interface that allows you to treat
 29 | XZ-compressed data as IO streams, both for reading and for writing. See the
 30 | XZ::StreamReader and XZ::StreamWriter classes for more information on this.
 31 | 
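The stream-oriented methods read their input in fixed-size pieces rather than loading it whole, which is why the block forms stay memory-friendly. The sketch below illustrates only that reading pattern, in plain Ruby and without liblzma. `each_chunk` is a hypothetical helper, not part of ruby-xz, and its default chunk size merely mirrors `XZ::CHUNK_SIZE` (4096) from `lib/xz.rb`.

``` ruby
require "stringio"

# Hypothetical helper (not part of ruby-xz): reads +io+ in pieces of
# +chunk_size+ bytes, yields each piece, and returns the total number
# of bytes handed to the block.
def each_chunk(io, chunk_size = 4096)
  total = 0
  # IO#read(n) returns nil once the end of the input is reached.
  while (chunk = io.read(chunk_size))
    total += chunk.bytesize
    yield chunk
  end
  total
end

io    = StringIO.new("x" * 10_000)
sizes = []
total = each_chunk(io) { |chunk| sizes << chunk.bytesize }
```

At no point does more than one chunk have to be held in memory, provided the block itself does not accumulate the data.
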
 32 | **Note**: Version 1.0.0 breaks the API quite heavily. Refer to
 33 | HISTORY.rdoc for details.
 34 | 
 35 | Installation
 36 | ------------
 37 | 
 38 | Install it the way you install all your gems.
 39 | 
 40 | ```
 41 | $ gem install ruby-xz
 42 | ```
 43 | 
 44 | Alternatively, you can clone the repository and build the most recent
 45 | code yourself:
 46 | 
 47 | ```
 48 | $ git clone git://git.guelker.eu/ruby-xz.git
 49 | $ cd ruby-xz
 50 | $ rake gem
 51 | $ gem install pkg/ruby-xz-*.gem
 52 | ```
 53 | 
 54 | Usage
 55 | -----
 56 | 
 57 | The XZ module is well documented, and you should be able to find
 58 | everything you need to use ruby-xz there. The library is small but powerful:
 59 | You can create and extract whole archive files, compress or decompress
 60 | streams of data or just plain strings.
 61 | 
 62 | You can read the documentation on your local gemserver, or browse it [online][2].
 63 | 
 64 | ### Require ###
 65 | 
 66 | You have to require the “xz.rb” file:
 67 | 
 68 | ``` ruby
 69 | require "xz"
 70 | ```
 71 | 
 72 | ### Examples ###
 73 | 
 74 | ``` ruby
 75 | # Compress a file
 76 | XZ.compress_file("myfile.txt", "myfile.txt.xz")
 77 | # Decompress it
 78 | XZ.decompress_file("myfile.txt.xz", "myfile.txt")
 79 | 
 80 | # Compress everything you get from a socket (note that there HAS to be an EOF
 81 | # at some point, otherwise this will run forever)
 82 | XZ.compress_stream(socket){|chunk| opened_file.write(chunk)}
 83 | 
 84 | # Compress a string
 85 | comp = XZ.compress("Mydata")
 86 | # Decompress it
 87 | data = XZ.decompress(comp)
 88 | ```
 89 | 
 90 | Have a look at the XZ module's documentation for an in-depth description of
 91 | what is possible.
 92 | 
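When decompressing in block form, keep in mind that a chunk may end in the middle of a multibyte character (see the Remarks of `XZ.decompress_stream`). The following sketch reproduces the effect with plain Ruby strings and no liblzma at all. The split position and the UTF-16LE target encoding are arbitrary assumptions chosen for illustration.

``` ruby
# Stand-in for decompressed output; "\u00FC" (ü) occupies two bytes in UTF-8.
text   = "G\u00FClker"
bytes  = text.b                       # BINARY-tagged copy of the same bytes
chunks = [bytes[0, 2], bytes[2..-1]]  # assumed split in the middle of "ü"

# Tagging a single chunk with the external encoding leaves it invalid ...
first = chunks[0].dup.force_encoding(Encoding::UTF_8)
first_valid = first.valid_encoding?

# ... and transcoding that chunk on its own raises.
transcode_failed =
  begin
    first.encode(Encoding::UTF_16LE)
    false
  rescue Encoding::InvalidByteSequenceError
    true
  end

# Workaround: collect the raw chunks first, then tag and transcode once.
buffer = String.new                   # String.new without arguments is BINARY
chunks.each { |chunk| buffer << chunk }
result = buffer.force_encoding(Encoding::UTF_8).encode(Encoding::UTF_16LE)
```

This mirrors the advice in the library documentation: do not request internal-encoding conversion together with the block form, but transcode after all data has been decompressed.
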
 93 | ### Usage with the minitar gem ###
 94 | 
 95 | ruby-xz can be used together with the [minitar][3] library (formerly
 96 | “archive-tar-minitar”) to create XZ-compressed tarballs. This works by
 97 | employing the IO-like classes XZ::StreamReader and XZ::StreamWriter
 98 | analogous to how one would use Ruby's “zlib” library together with
 99 | “minitar”. Example:
100 | 
101 | ``` ruby
102 | require "xz"
103 | require "minitar"
104 | 
105 | # Create an XZ-compressed tarball
106 | XZ::StreamWriter.open("tarball.tar.xz") do |txz|
107 |   Minitar.pack("path/to/directory", txz)
108 | end
109 | 
110 | # Unpack it again
111 | XZ::StreamReader.open("tarball.tar.xz") do |txz|
112 |   Minitar.unpack(txz, "path/to/target/directory")
113 | end
114 | ```
115 | 
116 | Links
117 | -----
118 | 
119 | * Online documentation: https://rubydoc.info/gems/ruby-xz
120 | * Code repository: https://github.com/Quintus/ruby-xz
121 | * Issue tracker: https://github.com/Quintus/ruby-xz/issues
122 | 
123 | License
124 | -------
125 | 
126 | MIT license; see LICENSE for the full license text.
127 | 
128 | [1]: https://tukaani.org/xz/
129 | [2]: https://mg.guelker.eu/projects/ruby-xz/doc
130 | [3]: https://github.com/halostatue/minitar
131 | 


--------------------------------------------------------------------------------
/Rakefile:
--------------------------------------------------------------------------------
 1 | # -*- mode: ruby; coding: utf-8 -*-
 2 | =begin
 3 | Basic liblzma-bindings for Ruby.
 4 | 
 5 | Copyright © 2011-2018 Marvin Gülker et al.
 6 | 
 7 | See AUTHORS for the full list of contributors.
 8 | 
 9 | Permission is hereby granted, free of charge, to any person obtaining a
10 | copy of this software and associated documentation files (the ‘Software’),
11 | to deal in the Software without restriction, including without limitation
12 | the rights to use, copy, modify, merge, publish, distribute, sublicense,
13 | and/or sell copies of the Software, and to permit persons to whom the Software
14 | is furnished to do so, subject to the following conditions:
15 | 
16 | The above copyright notice and this permission notice shall be included in all
17 | copies or substantial portions of the Software.
18 | 
19 | THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
25 | THE SOFTWARE.
26 | =end
27 | 
28 | require "rake/testtask"
29 | require "rubygems/package_task"
30 | require "rdoc/task"
31 | 
32 | load "ruby-xz.gemspec"
33 | 
34 | Gem::PackageTask.new(GEMSPEC).define
35 | 
36 | Rake::RDocTask.new do |rd|
37 |   rd.rdoc_files.include("lib/**/*.rb", "*.md", "**/*.rdoc", "LICENSE", "AUTHORS")
38 |   rd.title = "ruby-xz RDocs"
39 |   rd.main = "README.md"
40 |   rd.generator = "hanna"
41 |   rd.rdoc_dir = "doc"
42 | end
43 | 
44 | Rake::TestTask.new do |t|
45 |   t.test_files = FileList["test/test_*.rb"]
46 |   t.warning = true
47 | end
48 | 


--------------------------------------------------------------------------------
/lib/xz.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | require "pathname"
 29 | require "fiddle"
 30 | require "fiddle/import"
 31 | require "stringio"
 32 | require "forwardable"
 33 | 
 34 | # The namespace and main module of this library. Each method of this
 35 | # module may raise exceptions of class XZ::LZMAError, which is not
 36 | # mentioned again in the individual methods' documentation.
 37 | module XZ
 38 | 
 39 |   # Number of bytes read in one chunk.
 40 |   CHUNK_SIZE = 4096
 41 | 
 42 |   class << self
 43 | 
 44 |     # Force ruby-xz to be silent about deprecations. Using this is
 45 |     # discouraged so that you are aware of upcoming changes to the
 46 |     # API. However, if your standard error stream is closed,
 47 |     # outputting the deprecation notices might result in an exception,
 48 |     # so this method allows you to suppress these notices. Ensure you
 49 |     # read the HISTORY.rdoc file carefully instead.
 50 |     def disable_deprecation_notices=(bool)
 51 |       @disable_deprecation_notices = bool
 52 |     end
 53 | 
 54 |     # Output a deprecation notice.
 55 |     def deprecate(msg) # :nodoc:
 56 |       @disable_deprecation_notices ||= false
 57 | 
 58 |       unless @disable_deprecation_notices
 59 |         $stderr.puts("DEPRECATION NOTICE: #{msg}\n#{caller.drop(1).join("\n\t")}")
 60 |       end
 61 |     end
 62 | 
 63 |     # call-seq:
 64 |     #   decompress_stream(io [, kw ] )                → a_string
 65 |     #   decompress_stream(io [, kw ] ){|chunk| ... }  → an_integer
 66 |     #   decode_stream(io [, kw ] )                    → a_string
 67 |     #   decode_stream(io [, kw ] ){|chunk| ... }      → an_integer
 68 |     #
 69 |     # Decompresses a stream containing XZ-compressed data.
 70 |     #
 71 |     # === Parameters
 72 |     # ==== Positional parameters
 73 |     #
 74 |     # [io]
 75 |     #   The IO to read from. It must be opened for reading in
 76 |     #   binary mode.
 77 |     # [chunk (Block argument)]
 78 |     #   One piece of decompressed data. See Remarks section below
 79 |     #   for information about its encoding.
 80 |     #
 81 |     # ==== Keyword arguments
 82 |     #
 83 |     # [memory_limit (+UINT64_MAX+)]
 84 |     #   If not XZ::LibLZMA::UINT64_MAX, makes liblzma
 85 |     #   use no more memory than +memory_limit+ bytes.
 86 |     #
 87 |     # [flags (<tt>[:tell_unsupported_check]</tt>)]
 88 |     #   Additional flags
 89 |     #   passed to liblzma (an array). Possible flags are:
 90 |     #
 91 |     #   [:tell_no_check]
 92 |     #     Spit out a warning if the archive has no
 93 |     #     integrity checksum.
 94 |     #   [:tell_unsupported_check]
 95 |     #     Spit out a warning if the archive
 96 |     #     has an unsupported checksum type.
 97 |     #   [:concatenated]
 98 |     #     Decompress concatenated archives.
 99 |     # [external_encoding (Encoding.default_external)]
100 |     #   Assume the decompressed data inside the compressed data
101 |     #   has this encoding. See Remarks section.
102 |     # [internal_encoding (Encoding.default_internal)]
103 |     #   Request transcoding of the decompressed data into this
104 |     #   encoding if not nil. Note that Encoding.default_internal
105 |     #   is nil by default. See Remarks section.
106 |     #
107 |     # === Return value
108 |     #
109 |     # If a block was given, returns the number of bytes
110 |     # written. Otherwise, returns the decompressed data as a
111 |     # string in the encoding described in the Remarks section.
112 |     #
113 |     # === Raises
114 |     #
115 |     # [Encoding::InvalidByteSequenceError]
116 |     #   1. You requested an “internal encoding” conversion
117 |     #      and the archive contains invalid byte sequences
118 |     #      in the external encoding.
119 |     #   2. You requested an “internal encoding” conversion, used
120 |     #      the block form of this method, and liblzma decided
121 |     #      to cut the decompressed data into chunks in the middle of
122 |     #      a multibyte character. See Remarks section for an
123 |     #      explanation.
124 |     #
125 |     # === Example
126 |     #
127 |     #   data = File.open("archive.xz", "rb"){|f| f.read}
128 |     #   io = StringIO.new(data)
129 |     #
130 |     #   XZ.decompress_stream(io) #=> "I AM THE DATA"
131 |     #   io.rewind
132 |     #
133 |     #   str = ""
134 |     #   XZ.decompress_stream(io, memory_limit: XZ::LibLZMA::UINT64_MAX, flags: [:tell_no_check]){|c| str << c} #=> 13
135 |     #   str #=> "I AM THE DATA"
136 |     #
137 |     # === Remarks
138 |     #
139 |     # The block form is *much* better on memory usage, because it
140 |     # doesn't have to load everything into RAM at once. If you don't
141 |     # know how big your data gets or if you want to decompress much
142 |     # data, use the block form. Of course you shouldn't store the data
143 |     # you read in RAM then as in the example above.
144 |     #
145 |     # This method honours Ruby's external and internal encoding concept.
146 |     # All documentation about this applies to this method, with the
147 |     # exception that the external encoding does not refer to the data
148 |     # on the hard disk (that's compressed XZ data, it's always binary),
149 |     # but to the data inside the XZ container, i.e. to the *decompressed*
150 |     # data. Any strings you receive from this method (regardless of
151 |     # whether via return value or via the +chunk+ block argument) will
152 |     # first be tagged with the external encoding. If you set an internal
153 |     # encoding (either via the +internal_encoding+ parameter or via
154 |     # Ruby's default internal encoding) that string will be transcoded
155 |     # from the external encoding to the internal encoding before you
156 |     # even see it; in that case, the return value or chunk block argument
157 |     # will be encoded in the internal encoding. Internal encoding is
158 |     # disabled in Ruby by default and the argument for this method also
159 |     # defaults to nil.
160 |     #
161 |     # Due to the external encoding being applied, it can happen that
162 |     # +chunk+ contains an incomplete multibyte character causing
163 |     # <tt>valid_encoding?</tt> to return false if called on +chunk+,
164 |     # because liblzma doesn't know about encodings. The rest of the
165 |     # character will be yielded to the block in the next iteration
166 |     # then as liblzma progresses with the decompression from the XZ
167 |     # format. In other words, be prepared that +chunk+ can contain
168 |     # incomplete multibyte chars.
169 |     #
170 |     # This can have nasty side effects if you requested automatic
171 |     # transcoding into an internal encoding and used the block form. Since
172 |     # this method applies the internal encoding transcoding before the
173 |     # chunk is yielded to the block, String#encode gets the incomplete
174 |     # multibyte character. In that case, you will receive an
175 |     # Encoding::InvalidByteSequenceError exception even though your
176 |     # data is perfectly well-formed inside the XZ data. It's just
177 |     # that liblzma during decompression cut the chunks at an
178 |     # unfortunate place. To avoid this, do not request internal encoding
179 |     # conversion when using the block form, but instead transcode
180 |     # the data manually after you have decompressed the entire data.
181 |     def decompress_stream(io, memory_limit: LibLZMA::UINT64_MAX, flags: [:tell_unsupported_check], external_encoding: nil, internal_encoding: nil, &block)
182 |       raise(ArgumentError, "Invalid memory limit set!") unless memory_limit > 0 && memory_limit <= LibLZMA::UINT64_MAX
183 |       raise(ArgumentError, "external_encoding must be set if internal_encoding transcoding is requested") if internal_encoding && !external_encoding
184 | 
185 |       # The ArgumentError above is only about the concrete arguments
186 |       # (to sync with Ruby's IO API), not about the implied internal
187 |       # encoding, which might still kick in (and does, see below).
188 |       external_encoding ||= Encoding.default_external
189 |       internal_encoding ||= Encoding.default_internal
190 | 
191 |       # bit-or all flags
192 |       allflags = flags.inject(0) do |val, flag|
193 |         flag = LibLZMA::LZMA_DECODE_FLAGS[flag] || raise(ArgumentError, "Unknown flag #{flag}!")
194 |         val | flag
195 |       end
196 | 
197 |       stream = LibLZMA::LZMAStream.malloc
198 |       LibLZMA.LZMA_STREAM_INIT(stream)
199 |       res = LibLZMA.lzma_stream_decoder(stream.to_ptr,
200 |                                         memory_limit,
201 |                                         allflags)
202 | 
203 |       LZMAError.raise_if_necessary(res)
204 | 
205 |       res = ""
206 |       res.encode!(Encoding::BINARY)
207 |       if block_given?
208 |         res = lzma_code(io, stream) do |chunk|
209 |           chunk = chunk.dup # Do not write somewhere into the fiddle pointer while encoding (-> can segfault)
210 |           chunk.force_encoding(external_encoding) if external_encoding
211 |           chunk.encode!(internal_encoding)        if internal_encoding
212 |           yield(chunk)
213 |         end
214 |       else
215 |         lzma_code(io, stream){|chunk| res << chunk}
216 |         res.force_encoding(external_encoding) if external_encoding
217 |         res.encode!(internal_encoding)        if internal_encoding
218 |       end
219 | 
220 |       LibLZMA.lzma_end(stream.to_ptr)
221 | 
222 |       block_given? ? stream.total_out : res
223 |     end
224 |     alias decode_stream decompress_stream
225 | 
226 |     # call-seq:
227 |     #   compress_stream(io [, kw ] ) → a_string
228 |     #   compress_stream(io [, kw ] ){|chunk| ... } → an_integer
229 |     #   encode_stream(io [, kw ] ) → a_string
230 |     #   encode_stream(io [, kw ] ){|chunk| ... } → an_integer
231 |     #
232 |     # Compresses a stream of data into XZ-compressed data.
233 |     #
234 |     # === Parameters
235 |     # ==== Positional arguments
236 |     #
237 |     # [io]
238 |     #   The IO to read the data from. Must be opened for
239 |     #   reading.
240 |     # [chunk (Block argument)]
241 |     #   One piece of compressed data. This is always tagged
242 |     #   as a BINARY string, since it's compressed binary data.
243 |     #
244 |     # ==== Keyword arguments
245 |     # All keyword arguments are optional.
246 |     #
247 |     # [level (6)]
248 |     #   Compression strength. Higher values indicate a
249 |     #   smaller result, but longer compression time. Maximum
250 |     #   is 9.
251 |     #
252 |     # [check (:crc64)]
253 |     #   The checksum algorithm to use for verifying
254 |     #   the data inside the archive. Possible values are:
255 |     #   * :none
256 |     #   * :crc32
257 |     #   * :crc64
258 |     #   * :sha256
259 |     #
260 |     # [extreme (false)]
261 |     #   Tries to get the last bit out of the
262 |     #   compression. This may succeed, but you can end
263 |     #   up with *very* long computation times.
264 |     #
265 |     # === Return value
266 |     #
267 |     # If a block was given, returns the number of bytes
268 |     # written. Otherwise, returns the compressed data as a
269 |     # BINARY-encoded string.
270 |     #
271 |     # === Example
272 |     #   data = File.read("file.txt")
273 |     #   i = StringIO.new(data)
274 |     #   XZ.compress_stream(i) #=> Some binary blob
275 |     #
276 |     #   i.rewind
277 |     #   str = ""
278 |     #
279 |     #   XZ.compress_stream(i, level: 4, check: :sha256) do |c|
280 |     #     str << c
281 |     #   end #=> 123
282 |     #   str #=> Some binary blob
283 |     #
284 |     # === Remarks
285 |     #
286 |     # The block form is *much* better on memory usage, because it
287 |     # doesn't have to load everything into RAM at once. If you don't
288 |     # know how big your data gets or if you want to compress much
289 |     # data, use the block form. Of course you shouldn't store the data
290 |     # you read in RAM then as in the example above.
291 |     #
292 |     # For the +io+ object passed Ruby's normal external and internal
293 |     # encoding rules apply while it is read from by this method. These
294 |     # encodings are not changed on +io+ by this method. The data you
295 |     # receive in the block (+chunk+) above is binary data (compressed
296 |     # data) and as such encoded as BINARY.
297 |     def compress_stream(io, level: 6, check: :crc64, extreme: false, &block)
298 |       raise(ArgumentError, "Invalid compression level!") unless (0..9).include?(level)
299 |       raise(ArgumentError, "Invalid checksum specified!") unless [:none, :crc32, :crc64, :sha256].include?(check)
300 | 
301 |       level |= LibLZMA::LZMA_PRESET_EXTREME if extreme
302 | 
303 |       stream = LibLZMA::LZMAStream.malloc
304 |       LibLZMA::LZMA_STREAM_INIT(stream)
305 |       res = LibLZMA.lzma_easy_encoder(stream.to_ptr,
306 |                                       level,
307 |                                       LibLZMA.const_get(:"LZMA_CHECK_#{check.upcase}"))
308 | 
309 |       LZMAError.raise_if_necessary(res)
310 | 
311 |       res = ""
312 |       res.encode!(Encoding::BINARY)
313 |       if block_given?
314 |         res = lzma_code(io, stream, &block)
315 |       else
316 |         lzma_code(io, stream){|chunk| res << chunk}
317 |       end
318 | 
319 |       LibLZMA.lzma_end(stream.to_ptr)
320 | 
321 |       block_given? ? stream.total_out : res
322 |     end
323 |     alias encode_stream compress_stream
324 | 
325 |     # Compresses +in_file+ and writes the result to +out_file+.
326 |     #
327 |     # === Parameters
328 |     #
329 |     # [in_file]
330 |     #   The path to the file to read from.
331 |     # [out_file]
332 |     #   The path of the file to write to. If it exists, it will be
333 |     #   overwritten.
334 |     #
335 |     # For the keyword parameters, see the ::compress_stream method.
336 |     #
337 |     # === Return value
338 |     #
339 |     # The number of bytes written, i.e. the size of the archive.
340 |     #
341 |     # === Example
342 |     #
343 |     #   XZ.compress_file("myfile.txt", "myfile.txt.xz")
344 |     #   XZ.compress_file("myarchive.tar", "myarchive.tar.xz")
345 |     #
346 |     # === Remarks
347 |     #
348 |     # This method is safe to use with big files, because files are not
349 |     # loaded into memory completely at once.
350 |     def compress_file(in_file, out_file, **args)
351 |       File.open(in_file, "rb") do |i_file|
352 |         File.open(out_file, "wb") do |o_file|
353 |           compress_stream(i_file, **args) do |chunk|
354 |             o_file.write(chunk)
355 |           end
356 |         end
357 |       end
358 |     end
359 | 
360 |     # Compresses arbitrary data using the XZ algorithm.
361 |     #
362 |     # === Parameters
363 |     #
364 |     # [str] The data to compress.
365 |     #
366 |     # For the keyword parameters, see the #compress_stream method.
367 |     #
368 |     # === Return value
369 |     #
370 |     # The compressed data as a BINARY-encoded string.
371 |     #
372 |     # === Example
373 |     #
374 |     #   data = "I love Ruby"
375 |     #   comp = XZ.compress(data) #=> binary blob
376 |     #
377 |     # === Remarks
378 |     #
379 |     # Don't use this method for big amounts of data--you may run out
380 |     # of memory. Use compress_file or compress_stream instead.
381 |     def compress(str, **args)
382 |       s = StringIO.new(str)
383 |       compress_stream(s, **args)
384 |     end
385 | 
386 |     # Decompresses data in XZ format.
387 |     #
388 |     # === Parameters
389 |     #
390 |     # [str] The data to decompress.
391 |     #
392 |     # For the keyword parameters, see the decompress_stream method.
393 |     #
394 |     # === Return value
395 |     #
396 |     # The decompressed data as a string. See Remarks for its encoding.
397 |     #
398 |     # === Example
399 |     #
400 |     #   comp = File.open("data.xz", "rb"){|f| f.read}
401 |     #   data = XZ.decompress(comp) #=> "I love Ruby"
402 |     #
403 |     # === Remarks
404 |     #
405 |     # Don't use this method for big amounts of data--you may run out
406 |     # of memory. Use decompress_file or decompress_stream instead.
407 |     #
408 |     # Read #decompress_stream's Remarks section for notes on the
409 |     # return value's encoding.
410 |     def decompress(str, **args)
411 |       s = StringIO.new(str)
412 |       decompress_stream(s, **args)
413 |     end
414 | 
415 |     # Decompresses +in_file+ and writes the result to +out_file+.
416 |     #
417 |     # === Parameters
418 |     #
419 |     # [in_file]
420 |     #   The path to the file to read from.
421 |     # [out_file]
422 |     #   The path of the file to write to. If it exists, it will
423 |     #   be overwritten.
424 |     #
425 |     # For the keyword parameters, see the decompress_stream method.
426 |     #
427 |     # === Return value
428 |     #
429 |     # The number of bytes written, i.e. the size of the uncompressed
430 |     # data.
431 |     #
432 |     # === Example
433 |     #
434 |     #   XZ.decompress_file("myfile.txt.xz", "myfile.txt")
435 |     #   XZ.decompress_file("myarchive.tar.xz", "myarchive.tar")
436 |     #
437 |     # === Remarks
438 |     #
439 |     # This method is safe to use with big files, because files are not
440 |     # loaded into memory completely at once.
441 |     def decompress_file(in_file, out_file, **args)
442 |       File.open(in_file, "rb") do |i_file|
443 |         File.open(out_file, "wb") do |o_file|
444 |           decompress_stream(i_file, internal_encoding: nil, external_encoding: Encoding::BINARY, **args) do |chunk|
445 |             o_file.write(chunk)
446 |           end
447 |         end
448 |       end
449 |     end
450 | 
451 |     private
452 | 
453 |     # This method does the heavy work of (de-)compressing a stream. It
454 |     # takes an IO object to read data from (that means the IO must be
455 |     # opened for reading) and a XZ::LibLZMA::LZMAStream object that is used to
456 |     # (de-)compress the data. Furthermore this method takes a block
457 |     # which gets passed the (de-)compressed data in chunks one at a
458 |     # time--this is needed to allow (de-)compressing of very large
459 |     # files that can't be loaded fully into memory.
460 |     def lzma_code(io, stream)
461 |       input_buffer_p  = Fiddle::Pointer.malloc(CHUNK_SIZE) # automatically freed by fiddle on GC
462 |       output_buffer_p = Fiddle::Pointer.malloc(CHUNK_SIZE) # automatically freed by fiddle on GC
463 | 
464 |       while str = io.read(CHUNK_SIZE)
465 |         input_buffer_p[0, str.bytesize] = str
466 | 
467 |         # Set the data for compressing
468 |         stream.next_in  = input_buffer_p
469 |         stream.avail_in = str.bytesize
470 | 
471 |         # Now loop until we gathered all the data in
472 |         # stream.next_out. Depending on the amount of data, this may
473 |         # not fit into the buffer, meaning that we have to provide a
474 |         # pointer to a "new" buffer that liblzma can write into. Since
475 |         # liblzma already set stream.avail_in to 0 in the first
476 |         # iteration, the extra call to the lzma_code() function
477 |         # doesn't hurt (indeed the pipe_comp example from liblzma
478 |         # handles it this way too). Sometimes it happens that the
479 |         # compressed data is bigger than the original (notably when
480 |         # the amount of data to compress is small).
481 |         loop do
482 |           # Prepare for getting the compressed_data
483 |           stream.next_out  = output_buffer_p
484 |           stream.avail_out = CHUNK_SIZE
485 | 
486 |           # Compress the data
487 |           res = if io.eof?
488 |             LibLZMA.lzma_code(stream.to_ptr, LibLZMA::LZMA_FINISH)
489 |           else
490 |             LibLZMA.lzma_code(stream.to_ptr, LibLZMA::LZMA_RUN)
491 |           end
492 |           check_lzma_code_retval(res)
493 | 
494 |           # Write the compressed data
495 |           # Note: avail_out gives how much space is left after the new data
496 |           data = output_buffer_p[0, CHUNK_SIZE - stream.avail_out]
497 |           yield(data)
498 | 
499 |           # If the buffer is completely filled, it's likely that there
500 |           # is more data liblzma wants to hand to us. Start a new
501 |           # iteration, but don't provide new input data.
502 |           break unless stream.avail_out == 0
503 |         end #loop
504 |       end #while
505 |     end #lzma_code
506 | 
507 |     # Checks for errors and warnings that can be derived from the
508 |     # return value of the lzma_code() function and shows them if
509 |     # necessary.
510 |     def check_lzma_code_retval(code)
511 |       case code
512 |       when LibLZMA::LZMA_NO_CHECK then warn("Couldn't verify archive integrity--archive has no integrity checksum.")
513 |       when LibLZMA::LZMA_UNSUPPORTED_CHECK then warn("Couldn't verify archive integrity--archive has an unsupported integrity checksum.")
514 |       when LibLZMA::LZMA_GET_CHECK then nil # This isn't useful. It indicates that the checksum type is now known.
515 |       else
516 |         LZMAError.raise_if_necessary(code)
517 |       end
518 |     end
519 | 
520 |   end #class << self
521 | 
522 | end
523 | 
524 | require_relative "xz/version"
525 | require_relative "xz/fiddle_helper"
526 | require_relative "xz/lib_lzma"
527 | require_relative "xz/stream"
528 | require_relative "xz/stream_writer"
529 | require_relative "xz/stream_reader"
530 | 
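
The chunked read loop that the private lzma_code method describes can be tried without liblzma. The following is a minimal sketch, not the library's code: `each_chunk_with_action` is a hypothetical helper, and the `:run` / `:finish` symbols merely stand in for `LibLZMA::LZMA_RUN` and `LibLZMA::LZMA_FINISH`. It shows how `io.eof?` marks the last chunk so that the final call can ask the coder to finish.

```ruby
require "stringio"

# Hypothetical helper, not part of ruby-xz: yields every chunk read from +io+
# together with the action a caller would hand to the coder for that chunk.
def each_chunk_with_action(io, chunk_size)
  while (chunk = io.read(chunk_size))
    # io.eof? becomes true as soon as the chunk just read was the last one.
    action = io.eof? ? :finish : :run
    yield(chunk, action)
  end
end

steps = []
each_chunk_with_action(StringIO.new("a" * 10), 4) do |chunk, action|
  steps << [chunk.bytesize, action]
end
# steps is now [[4, :run], [4, :run], [2, :finish]]
```

Note that with an empty input `io.read` returns nil right away, so the block is never called; a caller that needs a finishing step for empty input has to handle that case separately.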


--------------------------------------------------------------------------------
/lib/xz/fiddle_helper.rb:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | #--
 3 | # Basic liblzma-bindings for Ruby.
 4 | #
 5 | # Copyright © 2011-2018 Marvin Gülker et al.
 6 | #
 7 | # See AUTHORS for the full list of contributors.
 8 | #
 9 | # Permission is hereby granted, free of charge, to any person obtaining a
10 | # copy of this software and associated documentation files (the ‘Software’),
11 | # to deal in the Software without restriction, including without limitation
12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
13 | # and/or sell copies of the Software, and to permit persons to whom the Software
14 | # is furnished to do so, subject to the following conditions:
15 | #
16 | # The above copyright notice and this permission notice shall be included in all
17 | # copies or substantial portions of the Software.
18 | #
19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
25 | # THE SOFTWARE.
26 | #++
27 | 
28 | module XZ
29 | 
30 |   # This is an internal API not meant for users of ruby-xz.
 31 |   # This mixin module defines some helper functions on top
32 |   # of Fiddle's functionality.
33 |   module FiddleHelper # :nodoc:
34 | 
 35 |     # Define constants that have numeric values assigned as if
 36 |     # it were a C enum definition. You can specify values explicitly
 37 |     # or rely on implicit incrementing; the first implicit value
 38 |     # is zero.
39 |     #
40 |     # Example:
41 |     #
42 |     #   enum :FOO, :BAR, 5, :BAZ
43 |     #
44 |     # This defines a constant FOO with value 0, BAR with value 5, BAZ
45 |     # with value 6.
46 |     def enum(*args)
47 |       @next_enum_val = 0 # First value of an enum is 0 in C
48 | 
49 |       args.each_cons(2) do |val1, val2|
50 |         next if val1.respond_to?(:to_int)
51 | 
52 |         if val2.respond_to?(:to_int)
53 |           const_set(val1, val2.to_int)
54 |           @next_enum_val = val2.to_int + 1
55 |         else
56 |           const_set(val1, @next_enum_val)
57 |           @next_enum_val += 1
58 |         end
59 |       end
60 | 
61 |       # Cater for the last element in case it is not an explicit
62 |       # value that has already been assigned above.
63 |       unless args.last.respond_to?(:to_int)
64 |         const_set(args.last, @next_enum_val)
65 |       end
66 | 
67 |       @next_enum_val = 0
68 |       nil
69 |     end
70 | 
71 |     # Try loading any of the given names as a shared
72 |     # object. Raises Fiddle::DLError if none can
73 |     # be opened.
74 |     def dlloadanyof(*names)
75 |       names.each do |name|
76 |         begin
77 |           dlload(name)
78 |         rescue Fiddle::DLError
79 |           # Continue with next one
80 |         else
81 |           # Success
82 |           return name
83 |         end
84 |       end
85 | 
86 |       raise Fiddle::DLError, "Failed to open any of these shared object files: #{names.join(', ')}"
87 |     end
88 | 
89 |   end
90 | 
91 | end
92 | 
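
The enum semantics documented above (implicit values start at zero, and an explicit value resets the counter) can be checked in isolation. This is a stand-alone sketch that does not load Fiddle; `EnumSketch` and `Demo` are hypothetical names, and the loop is written with `each_with_index` instead of the helper's `each_cons`, so it is an equivalent illustration rather than the helper itself.

```ruby
# Stand-alone illustration of C-like enum constants; not part of ruby-xz.
module EnumSketch
  def enum(*args)
    next_val = 0 # the first implicit value of a C enum is 0
    args.each_with_index do |arg, i|
      # Explicit integer values are consumed together with the preceding name.
      next if arg.respond_to?(:to_int)

      following = args[i + 1]
      value = following.respond_to?(:to_int) ? following.to_int : next_val
      const_set(arg, value)
      next_val = value + 1
    end
    nil
  end
end

module Demo
  extend EnumSketch
  enum :FOO, :BAR, 5, :BAZ
end

# Demo::FOO is 0, Demo::BAR is 5 and Demo::BAZ is 6.
```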


--------------------------------------------------------------------------------
/lib/xz/lib_lzma.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | module XZ
 29 | 
 30 |   # This module wraps functions and enums provided by liblzma.
 31 |   # It contains the direct mapping to the underlying C functions;
  32 |   # you should never have to use this. It's the low-level API
 33 |   # the other methods provided by ruby-xz are based on.
 34 |   module LibLZMA
 35 |     extend Fiddle::Importer
 36 |     extend XZ::FiddleHelper
 37 | 
 38 |     dlloadanyof 'liblzma.so.5', 'liblzma.so', 'liblzma.5.dylib', 'liblzma.dylib', 'liblzma'
 39 | 
 40 |     typealias "uint32_t", "unsigned int"
 41 |     typealias "uint64_t", "unsigned long long"
 42 | 
 43 |     # lzma_ret enum
 44 |     enum :LZMA_OK, 0, :LZMA_STREAM_END, 1, :LZMA_NO_CHECK, 2,
 45 |          :LZMA_UNSUPPORTED_CHECK, 3, :LZMA_GET_CHECK, 4,
 46 |          :LZMA_MEM_ERROR, 5, :LZMA_MEMLIMIT_ERROR, 6,
 47 |          :LZMA_FORMAT_ERROR, 7, :LZMA_OPTIONS_ERROR, 8,
 48 |          :LZMA_DATA_ERROR, 9, :LZMA_BUF_ERROR, 10,
 49 |          :LZMA_PROG_ERROR, 11
 50 | 
 51 |     # lzma_action enum
 52 |     enum :LZMA_RUN, 0, :LZMA_SYNC_FLUSH, 1,
 53 |          :LZMA_FULL_FLUSH, 2, :LZMA_FULL_BARRIER, 4,
 54 |          :LZMA_FINISH, 3
 55 | 
 56 |     # The maximum value of an uint64_t, as defined by liblzma.
 57 |     # Should be the same as
 58 |     #   (2 ** 64) - 1
 59 |     UINT64_MAX = 18446744073709551615
 60 | 
 61 |     # Activates extreme compression. Same as xz's "-e" commandline switch.
 62 |     LZMA_PRESET_EXTREME = 1 << 31
 63 | 
 64 |     LZMA_TELL_NO_CHECK          = 0x01
 65 |     LZMA_TELL_UNSUPPORTED_CHECK = 0x02
 66 |     LZMA_TELL_ANY_CHECK         = 0x04
 67 |     LZMA_CONCATENATED           = 0x08
 68 |     LZMA_IGNORE_CHECK           = 0x10
 69 | 
 70 |     # For access convenience of the above flags.
 71 |     LZMA_DECODE_FLAGS = {
 72 |       :tell_no_check          => LZMA_TELL_NO_CHECK,
 73 |       :tell_unsupported_check => LZMA_TELL_UNSUPPORTED_CHECK,
 74 |       :tell_any_check         => LZMA_TELL_ANY_CHECK,
 75 |       :concatenated           => LZMA_CONCATENATED,
 76 |       :ignore_check           => LZMA_IGNORE_CHECK
 77 |     }.freeze
 78 | 
 79 |     # Placeholder enum used by liblzma for later additions.
 80 |     enum :LZMA_RESERVED_ENUM, 0
 81 | 
 82 |     # lzma_check enum
 83 |     enum :LZMA_CHECK_NONE, 0, :LZMA_CHECK_CRC32, 1,
 84 |          :LZMA_CHECK_CRC64, 4, :LZMA_CHECK_SHA256, 10
 85 | 
 86 |     # Aliases for the enums as fiddle only understands plain int
 87 |     typealias "lzma_ret", "int"
 88 |     typealias "lzma_check", "int"
 89 |     typealias "lzma_action", "int"
 90 |     typealias "lzma_reserved_enum", "int"
 91 | 
 92 |     # lzma_stream struct. When creating one with ::malloc, use
 93 |     # ::LZMA_STREAM_INIT to make it ready for use.
 94 |     #
 95 |     # This is a Fiddle::CStruct. As such, this has a class method
 96 |     # ::malloc for allocating an instance of it on the heap, and
 97 |     # instances of it have a #to_ptr method that returns a
 98 |     # Fiddle::Pointer. That pointer needs to be freed with
 99 |     # Fiddle::free if the instance was created with ::malloc.
100 |     # To wrap an existing instance, call ::new with the
101 |     # Fiddle::Pointer to wrap as an argument.
102 |     LZMAStream = struct [
103 |       "uint8_t* next_in",
104 |       "size_t avail_in",
105 |       "uint64_t total_in",
106 |       "uint8_t* next_out",
107 |       "size_t avail_out",
108 |       "uint64_t total_out",
109 |       "void* allocator",
110 |       "void* internal",
111 |       "void* reserved_ptr1",
112 |       "void* reserved_ptr2",
113 |       "void* reserved_ptr3",
114 |       "void* reserved_ptr4",
115 |       "uint64_t reserved_int1",
116 |       "uint64_t reserved_int2",
117 |       "size_t reserved_int3",
118 |       "size_t reserved_int4",
119 |       "lzma_reserved_enum reserved_enum1",
120 |       "lzma_reserved_enum reserved_enum2"
121 |     ]
122 | 
123 |     # This method does basically the same thing as the
124 |     # LZMA_STREAM_INIT macro of liblzma. Pass it an instance of
125 |     # LZMAStream that has not been initialised for use.
126 |     # The intended use of this method is:
127 |     #
128 |     #   stream = LibLZMA::LZMAStream.malloc # ::malloc is provided by fiddle
129 |     #   LibLZMA.LZMA_STREAM_INIT(stream)
130 |     #   # ...do something with the stream...
131 |     #   Fiddle.free(stream.to_ptr)
132 |     def self.LZMA_STREAM_INIT(stream)
133 |       stream.next_in        = nil
134 |       stream.avail_in       = 0
135 |       stream.total_in       = 0
136 |       stream.next_out       = nil
137 |       stream.avail_out      = 0
138 |       stream.total_out      = 0
139 |       stream.allocator      = nil
140 |       stream.internal       = nil
141 |       stream.reserved_ptr1  = nil
142 |       stream.reserved_ptr2  = nil
143 |       stream.reserved_ptr3  = nil
144 |       stream.reserved_ptr4  = nil
145 |       stream.reserved_int1  = 0
146 |       stream.reserved_int2  = 0
147 |       stream.reserved_int3  = 0
148 |       stream.reserved_int4  = 0
149 |       stream.reserved_enum1 = LZMA_RESERVED_ENUM
150 |       stream.reserved_enum2 = LZMA_RESERVED_ENUM
151 |       stream
152 |     end
153 | 
154 |     extern "lzma_ret lzma_easy_encoder(lzma_stream*, uint32_t, lzma_check)"
155 |     extern "lzma_ret lzma_code(lzma_stream*, lzma_action)"
156 |     extern "lzma_ret lzma_stream_decoder(lzma_stream*, uint64_t, uint32_t)"
157 |     extern "void lzma_end(lzma_stream*)"
158 | 
159 |   end
160 | 
161 |   # The class of the error that this library raises.
162 |   class LZMAError < StandardError
163 | 
164 |     # Raises an appropriate exception if +val+ isn't a liblzma success code.
165 |     def self.raise_if_necessary(val)
166 |       case val
167 |       when LibLZMA::LZMA_MEM_ERROR      then raise(self, "Couldn't allocate memory!")
168 |       when LibLZMA::LZMA_MEMLIMIT_ERROR then raise(self, "Decoder ran out of (allowed) memory!")
169 |       when LibLZMA::LZMA_FORMAT_ERROR   then raise(self, "Unrecognized file format!")
170 |       when LibLZMA::LZMA_OPTIONS_ERROR  then raise(self, "Invalid options passed!")
171 |       when LibLZMA::LZMA_DATA_ERROR     then raise(self, "Archive is corrupt.")
172 |       when LibLZMA::LZMA_BUF_ERROR      then raise(self, "Buffer unusable!")
173 |       when LibLZMA::LZMA_PROG_ERROR     then raise(self, "Program error--if you're sure your code is correct, you may have found a bug in liblzma.")
174 |       end
175 |     end
176 | 
177 |   end
178 | 
179 | end
180 | 
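
The decoder flags are bit values, so several of them can be combined into the single integer that liblzma's lzma_stream_decoder() expects as its flags argument. The sketch below assumes a caller wants to select flags by symbol; `decode_flag_mask` is a hypothetical helper, and the table is copied here from LZMA_DECODE_FLAGS so that the example runs without loading liblzma.

```ruby
# Values copied from the LZMA_DECODE_FLAGS table for a self-contained example.
DECODE_FLAGS = {
  tell_no_check:          0x01,
  tell_unsupported_check: 0x02,
  tell_any_check:         0x04,
  concatenated:           0x08,
  ignore_check:           0x10
}.freeze

# Hypothetical helper, not part of ruby-xz: ORs the named flags together.
def decode_flag_mask(*names)
  names.inject(0) do |mask, name|
    mask | DECODE_FLAGS.fetch(name) { raise ArgumentError, "Unknown flag: #{name.inspect}" }
  end
end

mask = decode_flag_mask(:tell_any_check, :concatenated)
# mask is 0x0C (decimal 12)
```

Using `fetch` with a block makes a misspelled flag name fail loudly instead of being ignored silently.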


--------------------------------------------------------------------------------
/lib/xz/stream.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | # The base class for XZ::StreamReader and XZ::StreamWriter.  This is
 29 | # an abstract class that is not meant to be used directly. You can,
 30 | # however, test against this class in <tt>kind_of?</tt> tests.
 31 | #
 32 | # XZ::StreamReader and XZ::StreamWriter are IO-like classes that allow
 33 | # you to access XZ-compressed data the same way you access an
  34 | # IO object, so you can pass them to other libraries that expect IO
  35 | # objects. The most notable example of this may be reading and
 36 | # writing XZ-compressed tarballs using the minitar
 37 | # RubyGem; see the README.md file for an example.
 38 | #
 39 | # Most of IO's methods are implemented in this class or one of the
 40 | # subclasses. The most notable exception is that it is not possible
 41 | # to seek in XZ archives (#seek and #pos= are not defined).
 42 | # Many methods that are not expressly documented in the RDoc
 43 | # still exist; this class uses Ruby's Forwardable module to forward
 44 | # them to the underlying IO object.
 45 | #
 46 | # Stream and its subclasses honour Ruby's external+internal encoding
 47 | # system just like Ruby's own IO does. All of what the Ruby docs say
 48 | # about external and internal encodings applies to this class with one
 49 | # important difference. The "external encoding" does not refer to the
 50 | # encoding of the file on the hard disk (this file is always a binary
 51 | # file as it's compressed data), but to the encoding of the
 52 | # decompressed data inside the compressed file.
 53 | #
 54 | # As with Ruby's IO class, instances of this class and its subclasses
 55 | # default their external encoding to Encoding.default_external and
 56 | # their internal encoding to Encoding.default_internal. You can use
 57 | # #set_encoding or pass appropriate arguments to the +new+ method to
 58 | # change these encodings per-instance.
 59 | class XZ::Stream
 60 |   extend Forwardable
 61 | 
 62 |   def_delegator :@delegate_io, :"autoclose="
 63 |   def_delegator :@delegate_io, :"autoclose?"
 64 |   def_delegator :@delegate_io, :binmode
 65 |   def_delegator :@delegate_io, :"binmode?"
 66 |   def_delegator :@delegate_io, :"close_on_exec="
 67 |   def_delegator :@delegate_io, :"close_on_exec?"
 68 |   def_delegator :@delegate_io, :fcntl
 69 |   def_delegator :@delegate_io, :fdatasync
 70 |   def_delegator :@delegate_io, :fileno
 71 |   def_delegator :@delegate_io, :to_i
 72 |   def_delegator :@delegate_io, :flush # TODO: liblzma might have its own flush method that should be used
 73 |   def_delegator :@delegate_io, :fsync
 74 |   def_delegator :@delegate_io, :ioctl
 75 |   def_delegator :@delegate_io, :isatty
 76 |   def_delegator :@delegate_io, :pid
 77 |   #def_delegator :@delegate_io, :stat # If this is available the minitar gem thinks it's a File and wants to seek it O_o
 78 |   def_delegator :@delegate_io, :sync # TODO: use liblzma's own syncing functionality?
 79 |   def_delegator :@delegate_io, :"sync=" # TODO: use liblzma's own syncing functionality?
 80 |   def_delegator :@delegate_io, :"tty?"
 81 | 
 82 |   # Like IO#lineno and IO#lineno=.
 83 |   attr_accessor :lineno
 84 | 
 85 |   # Returns the encoding used inside the compressed data stream.
 86 |   # Like IO#external_encoding.
 87 |   attr_reader :external_encoding
 88 | 
 89 |   # When compressed data is read, the decompressed data is transcoded
 90 |   # from the external_encoding to this encoding. If this encoding is
 91 |   # nil, no transcoding happens.
 92 |   attr_reader :internal_encoding
 93 | 
 94 |   # Private API only for use by subclasses.
 95 |   def initialize(delegate_io) # :nodoc:
 96 |     @delegate_io = delegate_io
 97 |     @lzma_stream = XZ::LibLZMA::LZMAStream.malloc
 98 |     XZ::LibLZMA::LZMA_STREAM_INIT(@lzma_stream)
 99 | 
100 |     @finished = false
101 |     @lineno = 0
102 |     @pos = 0
103 |     @external_encoding = Encoding.default_external
104 |     @internal_encoding = Encoding.default_internal
105 |     @transcode_options = {}
106 |     @input_buffer_p  = Fiddle::Pointer.malloc(XZ::CHUNK_SIZE)
107 |     @output_buffer_p = Fiddle::Pointer.malloc(XZ::CHUNK_SIZE)
108 |   end
109 | 
110 |   # Pass the given +str+ into liblzma's lzma_code() function.
111 |   # +action+ is either LibLZMA::LZMA_RUN (still working) or
112 |   # LibLZMA::LZMA_FINISH (this is the last piece).
113 |   def lzma_code(str, action) # :nodoc:
114 |     previous_encoding = str.encoding
115 |     str.force_encoding(Encoding::BINARY) # Need to operate on bytes now
116 | 
117 |     begin
118 |       pos = 0
119 |       until pos > str.bytesize # Do not use >=, that conflicts with #lzma_finish
120 |         substr = str[pos, XZ::CHUNK_SIZE]
121 |         @input_buffer_p[0, substr.bytesize] = substr
122 |         pos += XZ::CHUNK_SIZE
123 | 
124 |         @lzma_stream.next_in  = @input_buffer_p
125 |         @lzma_stream.avail_in = substr.bytesize
126 | 
127 |         loop do
128 |           @lzma_stream.next_out  = @output_buffer_p
129 |           @lzma_stream.avail_out = XZ::CHUNK_SIZE
130 |           res = XZ::LibLZMA.lzma_code(@lzma_stream.to_ptr, action)
131 |           XZ.send :check_lzma_code_retval, res # call package-private method
132 | 
133 |           data = @output_buffer_p[0, XZ::CHUNK_SIZE - @lzma_stream.avail_out]
134 |           yield(data)
135 | 
136 |           break unless @lzma_stream.avail_out == 0
137 |         end
138 |       end
139 |     ensure
140 |       str.force_encoding(previous_encoding)
141 |     end
142 |   end
143 | 
144 |   # Partial implementation of +rewind+ abstracting common operations.
145 |   # The subclasses implement the rest.
146 |   def rewind # :nodoc:
147 |     # Free the current lzma stream and rewind the underlying IO.
148 |     # It is required to call #rewind before allocating a new lzma
149 |     # stream, because if #rewind raises an exception (because the
150 |     # underlying IO is not rewindable), a memory leak would occur
151 |     # with regard to an allocated-but-never-freed lzma stream.
152 |     finish
153 |     @delegate_io.rewind
154 | 
155 |     # Reset internal state
156 |     @pos = @lineno = 0
157 |     @finished = false
158 | 
159 |     # Allocate a new lzma stream (subclasses will configure it).
160 |     @lzma_stream = XZ::LibLZMA::LZMAStream.malloc
161 |     XZ::LibLZMA::LZMA_STREAM_INIT(@lzma_stream)
162 | 
163 |     0 # Mimic IO#rewind's return value
164 |   end
165 | 
166 |   # You can mostly treat this as if it were an IO object.
167 |   # At least for subclasses. This class itself is abstract,
168 |   # you shouldn't be using it directly at all.
169 |   #
170 |   # Returns the receiver.
171 |   def to_io
172 |     self
173 |   end
174 | 
175 |   # Overridden in StreamReader to be like IO#eof?.
176 |   # This abstract implementation only raises IOError.
177 |   def eof?
178 |     raise(IOError, "Stream not opened for reading")
179 |   end
180 | 
181 |   # Alias for #eof?
182 |   def eof
183 |     eof?
184 |   end
185 | 
186 |   # True if the delegate IO has been closed.
187 |   def closed?
188 |     @delegate_io.closed?
189 |   end
190 | 
191 |   # True if liblzma's internal memory has been freed. For writer
192 |   # instances, receiving true from this method also means that all
193 |   # of liblzma's compressed data has been flushed to the underlying
194 |   # IO object.
195 |   def finished?
196 |     @finished
197 |   end
198 | 
199 |   # Free internal liblzma memory. This needs to be called before
200 |   # you leave this object for the GC. If you used a block-form
201 |   # initializer, this is done automatically for you.
202 |   #
203 |   # Subsequent calls to #read or #write will cause an IOError.
204 |   #
205 |   # Returns the underlying IO object. This allows you to retrieve
206 |   # the File instance that was automatically created when using
207 |   # the +open+ method's block form.
208 |   def finish
209 |     return if @finished
210 | 
211 |     # Clean up the lzma_stream structure's internal memory.
212 |     # This would belong into a destructor if Ruby had that.
213 |     XZ::LibLZMA.lzma_end(@lzma_stream)
214 |     @finished = true
215 | 
216 |     @delegate_io
217 |   end
218 | 
219 | 
220 |   # If not done yet, call #finish. Then close the delegate IO.
221 |   # The latter action is going to cause the delegate IO to
222 |   # flush its buffer. After this method returns, it is guaranteed
223 |   # that all pending data has been flushed to the OS' kernel.
224 |   def close
225 |     finish unless @finished
226 |     @delegate_io.close unless @delegate_io.closed?
227 |     nil
228 |   end
229 | 
230 |   # Always raises IOError, because XZ streams can never be duplex.
231 |   def close_read
232 |     raise(IOError, "Not a duplex I/O stream")
233 |   end
234 | 
235 |   # Always raises IOError, because XZ streams can never be duplex.
236 |   def close_write
237 |     raise(IOError, "Not a duplex I/O stream")
238 |   end
239 | 
240 |   # Overridden in StreamReader to be like IO#read.
241 |   # This abstract implementation only raises IOError.
242 |   def read(*args)
243 |     raise(IOError, "Stream not opened for reading")
244 |   end
245 | 
246 |   # Overridden in StreamWriter to be like IO#write.
247 |   # This abstract implementation only raises IOError.
248 |   def write(*args)
249 |     raise(IOError, "Stream not opened for writing")
250 |   end
251 | 
252 |   # Returns the position in the *decompressed* data (regardless of
253 |   # whether this is a reader or a writer instance).
254 |   def pos
255 |     @pos
256 |   end
257 |   alias tell pos
258 | 
259 |   # Like IO#set_encoding.
260 |   def set_encoding(*args)
261 |     if args.count < 1 || args.count > 3
262 |       raise ArgumentError, "Wrong number of arguments: Expected 1-3, got #{args.count}"
263 |     end
264 | 
265 |     # Clean `args' to [external_encoding, internal_encoding],
266 |     # and @transcode_options.
267 |     return set_encoding($`, $', *args[1..-1]) if args[0].respond_to?(:to_str) && args[0].to_str =~ /:/
268 |     @transcode_options = args.delete_at(-1) if args[-1].kind_of?(Hash)
269 | 
270 |     # `args' is always [external, internal] or [external] at this point
271 |     @external_encoding = args[0].kind_of?(Encoding) ? args[0] : Encoding.find(args[0])
272 |     if args[1]
273 |       @internal_encoding = args[1].kind_of?(Encoding) ? args[1] : Encoding.find(args[1])
274 |     else
275 |       @internal_encoding = Encoding.default_internal # Encoding.default_internal defaults to nil
276 |     end
277 | 
278 |     self
279 |   end
280 | 
281 |   # Do not define #pos= and #seek, not even to throw NotImplementedError.
282 |   # Reason: The minitar gem then thinks it can use these methods and provokes
283 |   # the NotImplementedError exception.
284 | 
285 |   # Like IO#<<.
286 |   def <<(obj)
287 |     write(obj.to_s)
288 |   end
289 | 
290 |   # Like IO#advise. No-op, because not meaningful on compressed data.
291 |   def advise(*args)
292 |     nil
293 |   end
294 | 
295 |   # Like IO#getbyte. Note this method isn't exactly performant,
296 |   # because it actually reads compressed data as a string and then
297 |   # needs to figure out the bytes from that again.
298 |   def getbyte
299 |     return nil if eof?
300 |     read(1).bytes.first
301 |   end
302 | 
303 |   # Like IO#readbyte.
304 |   def readbyte
305 |     getbyte || raise(EOFError, "End of stream reached")
306 |   end
307 | 
308 |   # Like IO#getc.
309 |   def getc
310 |     str = String.new
311 | 
312 |     # Read byte-by-byte until a valid character in the external
313 |     # encoding was built.
314 |     loop do
315 |       str.force_encoding(Encoding::BINARY)
316 |       str << read(1)
317 |       str.force_encoding(@external_encoding)
318 | 
319 |       break if str.valid_encoding? || eof?
320 |     end
321 | 
322 |     # Transcode to internal encoding if one was requested
323 |     if @internal_encoding
324 |       str.encode(@internal_encoding)
325 |     else
326 |       str
327 |     end
328 |   end
329 | 
330 |   # Like IO#readchar.
331 |   def readchar
332 |     getc || raise(EOFError, "End of stream reached")
333 |   end
334 | 
335 |   # Like IO#gets.
336 |   def gets(separator = $/, limit = nil)
337 |     return nil if eof?
338 |     @lineno += 1
339 | 
340 |     # Mirror IO#gets' weird call-seq
341 |     if separator.respond_to?(:to_int)
342 |       limit = separator.to_int
343 |       separator = $/
344 |     end
345 | 
346 |     buf = String.new
347 |     buf.force_encoding(target_encoding)
348 |     until eof? || (limit && buf.length >= limit)
349 |       buf << getc
350 |       return buf if buf[-1] == separator
351 |     end
352 | 
353 |     buf
354 |   end
355 | 
356 |   # Like IO#readline.
357 |   def readline(*args)
358 |     gets(*args) || raise(EOFError, "End of stream reached")
359 |   end
360 | 
361 |   # Like IO#each.
362 |   def each(*args)
363 |     return enum_for __method__ unless block_given?
364 | 
365 |     while line = gets(*args)
366 |       yield(line)
367 |     end
368 |   end
369 |   alias each_line each
370 | 
371 |   # Like IO#each_byte.
372 |   def each_byte
373 |     return enum_for __method__ unless block_given?
374 | 
375 |     while byte = getbyte
376 |       yield(byte)
377 |     end
378 |   end
379 | 
380 |   # Like IO#each_char.
381 |   def each_char
382 |     return enum_for __method__ unless block_given?
383 | 
384 |     while char = getc
385 |       yield(char)
386 |     end
387 |   end
388 | 
389 |   # Like IO#each_codepoint.
390 |   def each_codepoint
391 |     return enum_for __method__ unless block_given?
392 | 
393 |     each_char{|c| yield(c.ord)}
394 |   end
395 | 
396 |   # Like IO#printf.
397 |   def printf(*args)
398 |     write(sprintf(*args))
399 |     nil
400 |   end
401 | 
402 |   # Like IO#putc.
403 |   def putc(obj)
404 |     if obj.respond_to? :chr
405 |       write(obj.chr)
406 |     elsif obj.respond_to? :to_str
407 |       write(obj.to_str)
408 |     else
409 |       raise(TypeError, "Can only #putc strings and numbers")
410 |     end
411 |   end
412 | 
413 |   def puts(*objs)
414 |     if objs.empty?
415 |       write("\n")
416 |       return nil
417 |     end
418 | 
419 |     objs.each do |obj|
420 |       if obj.respond_to? :to_ary
421 |         puts(*obj.to_ary)
422 |       else
423 |         # Don't squeeze multiple subsequent trailing newlines in `obj'
424 |         obj = obj.to_s
425 |         if obj.end_with?("\n".encode(obj.encoding))
426 |           write(obj)
427 |         else
428 |           write(obj + "\n".encode(obj.encoding))
429 |         end
430 |       end
431 |     end
432 |     nil
433 |   end
434 | 
435 |   # Like IO#print.
436 |   def print(*objs)
437 |     if objs.empty?
438 |       write($_)
439 |     else
440 |       objs.each_with_index do |obj, i|
441 |         write($,) if $, && i > 0 # IO#print puts $, between objects only
442 |         write(obj.to_s)
443 |       end
444 |     end
445 | 
446 |     write($\) if $\
447 |     nil
448 |   end
449 | 
450 |   # It is not possible to reopen an lzma stream, hence this
451 |   # method always raises NotImplementedError.
452 |   def reopen(*args)
453 |     raise(NotImplementedError, "Can't reopen an lzma stream")
454 |   end
455 | 
456 |   private
457 | 
458 |   def target_encoding
459 |     if @internal_encoding
460 |       @internal_encoding
461 |     else
462 |       @external_encoding
463 |     end
464 |   end
465 | 
466 | end
467 | 
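The character-reading loop in stream.rb above (append one byte, re-tag, test `valid_encoding?`) can be tried in isolation. The following is a minimal sketch, not code from ruby-xz: `read_char` is a hypothetical helper, and a StringIO stands in for the decompressed stream so that only the standard library is needed.

```ruby
require "stringio"

# Hypothetical helper (not part of ruby-xz): reads one character from a
# binary IO by appending single bytes until the buffer is valid in
# +encoding+ or the input is exhausted.
def read_char(io, encoding)
  return nil if io.eof?

  str = String.new
  until io.eof?
    # Append in BINARY so that a half-finished multibyte sequence
    # never clashes with the encoding of the incoming byte.
    str.force_encoding(Encoding::BINARY)
    str << io.read(1)
    str.force_encoding(encoding)
    break if str.valid_encoding?
  end
  str
end

io    = StringIO.new("Gü".b) # three bytes: "G" plus a two-byte UTF-8 sequence
chars = []
while (c = read_char(io, Encoding::UTF_8))
  chars << c
end
```

A truncated multibyte sequence at the end of the input is returned as-is, with an invalid encoding. This matches the `|| eof?` escape in the loop above.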


--------------------------------------------------------------------------------
/lib/xz/stream_reader.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | # An IO-like reader class for XZ-compressed data, allowing you to
 29 | # access XZ-compressed data as if it was a normal IO object, but
 30 | # please note you can’t seek in the data--this doesn’t make much
 31 | # sense anyway. Where would you want to seek? The plain or the XZ
 32 | # data?
 33 | #
 34 | # A StreamReader object actually wraps another IO object it reads
 35 | # the compressed data from; you can either pass this IO object directly
 36 | # to the ::new method, effectively allowing you to pass any IO-like thing
 37 | # you can imagine (just ensure it is readable), or you can pass a path
 38 | # to a file to ::open, in which case StreamReader will open the path
 39 | # using Ruby's File class internally. If you use ::open's block form,
 40 | # the method will take care of closing both the liblzma
 41 | # stream and the File instance properly.
 42 | class XZ::StreamReader < XZ::Stream
 43 | 
 44 |   # The memory limit configured for this lzma decoder.
 45 |   attr_reader :memory_limit
 46 | 
 47 |   # call-seq:
 48 |   #   open(filename [, kw]) → stream_reader
 49 |   #   open(filename [, kw]){|sr| ...} → stream_reader
 50 |   #
 51 |   # Open the given file and wrap a new instance around it with ::new.
 52 |   # If you use the block form, both the internally created File instance
 53 |   # and the liblzma stream will be closed automatically for you.
 54 |   #
 55 |   # === Parameters
 56 |   # [filename]
 57 |   #   Path to the file to open.
 58 |   # [sr (block argument)]
 59 |   #   The created StreamReader instance.
 60 |   #
 61 |   # See ::new for a description of the keyword parameters.
 62 |   #
 63 |   # === Return value
 64 |   # The newly created instance.
 65 |   #
 66 |   # === Remarks
 67 |   # Starting with version 1.0.0, the block form also returns the newly
 68 |   # created instance rather than the block's return value. This is
 69 |   # in line with Ruby's own GzipReader.open API.
 70 |   #
 71 |   # === Example
 72 |   #     # Normal usage
 73 |   #     XZ::StreamReader.open("myfile.txt.xz") do |xz|
 74 |   #       puts xz.read #=> I love Ruby
 75 |   #     end
 76 |   #
 77 |   #     # If you really need the File instance created internally:
 78 |   #     file = nil
 79 |   #     XZ::StreamReader.open("myfile.txt.xz") do |xz|
 80 |   #       puts xz.read #=> I love Ruby
 81 |   #       file = xz.finish # prevents closing
 82 |   #     end
 83 |   #     file.close # Now close it manually
 84 |   #
 85 |   #     # Or just don't use the block form:
 86 |   #     xz = XZ::StreamReader.open("myfile.txt.xz")
 87 |   #     puts xz.read #=> I love Ruby
 88 |   #     file = xz.finish
 89 |   #     file.close # Don't forget to close it manually (or use xz.close instead of xz.finish above).
 90 |   def self.open(filename, **args)
 91 |     file = File.open(filename, "rb")
 92 |     reader = new(file, **args)
 93 | 
 94 |     if block_given?
 95 |       begin
 96 |         yield(reader)
 97 |       ensure
 98 |         # Close both delegate IO and reader.
 99 |         reader.close unless reader.finished?
100 |       end
101 |     end
102 | 
103 |     reader
104 |   end
105 | 
106 |   # Creates a new instance that is wrapped around the given IO object.
107 |   #
108 |   # === Parameters
109 |   # ==== Positional parameters
110 |   # [delegate_io]
111 |   #   The underlying IO object to read the compressed data from.
112 |   #   This IO object has to have been opened in binary mode,
113 |   #   otherwise you are likely to receive exceptions indicating
114 |   #   that the compressed data is corrupt.
115 |   #
116 |   # ==== Keyword arguments
117 |   # [memory_limit (+UINT64_MAX+)]
118 |   #   If not XZ::LibLZMA::UINT64_MAX, makes liblzma
119 |   #   use no more memory than +memory_limit+ bytes.
120 |   # [flags (<tt>[:tell_unsupported_check]</tt>)]
121 |   #   Additional flags passed to liblzma (an array).
122 |   #   Possible flags are:
123 |   #
124 |   #   [:tell_no_check]
125 |   #     Spit out a warning if the archive has no
126 |   #     integrity checksum.
127 |   #   [:tell_unsupported_check]
128 |   #     Spit out a warning if the archive
129 |   #     has an unsupported checksum type.
130 |   #   [:concatenated]
131 |   #     Decompress concatenated archives.
132 |   # [external_encoding (Encoding.default_external)]
133 |   #   Assume the decompressed data inside the XZ is encoded in
134 |   #   this encoding. Defaults to Encoding.default_external,
135 |   #   which in turn defaults to the environment.
136 |   # [internal_encoding (Encoding.default_internal)]
137 |   #   Request that the data found in the XZ file (which is assumed
138 |   #   to be in the encoding specified by +external_encoding+)
139 |   #   be transcoded into this encoding. Defaults to Encoding.default_internal,
140 |   #   which defaults to nil, which means to not transcode anything.
141 |   #
142 |   # === Return value
143 |   # The newly created instance.
144 |   #
145 |   # === Remarks
146 |   # The strings returned from the reader will be in the encoding specified
147 |   # by the +internal_encoding+ parameter. If that parameter is nil (default),
148 |   # then they will be in the encoding specified by +external_encoding+.
149 |   #
150 |   # This method used to accept a block in earlier versions. Since version 1.0.0,
151 |   # this behaviour has been removed to synchronise the API with Ruby's own
152 |   # GzipReader.new.
153 |   #
154 |   # This method doesn't close the underlying IO or the liblzma stream.
155 |   # You need to call #finish or #close manually; see ::open for a method
156 |   # that takes a block to automate this.
157 |   #
158 |   # === Example
159 |   #     file = File.open("compressed.txt.xz", "rb") # Note binary mode
160 |   #     xz = XZ::StreamReader.new(file)
161 |   #     puts xz.read #=> I love Ruby
162 |   #     xz.close # closes both `xz' and `file'
163 |   #
164 |   #     file = File.open("compressed.txt.xz", "rb") # Note binary mode
165 |   #     xz = XZ::StreamReader.new(file)
166 |   #     puts xz.read #=> I love Ruby
167 |   #     xz.finish # closes only `xz'
168 |   #     file.close # Now close `file' manually
169 |   def initialize(delegate_io, memory_limit: XZ::LibLZMA::UINT64_MAX, flags: [:tell_unsupported_check], external_encoding: nil, internal_encoding: nil)
170 |     super(delegate_io)
171 |     raise(ArgumentError, "When specifying the internal encoding, the external encoding must also be specified") if internal_encoding && !external_encoding
172 |     raise(ArgumentError, "Memory limit out of range") unless memory_limit > 0 && memory_limit <= XZ::LibLZMA::UINT64_MAX
173 | 
174 |     @memory_limit = memory_limit
175 |     @readbuf = String.new
176 |     @readbuf.force_encoding(Encoding::BINARY)
177 | 
178 |     if external_encoding
179 |       encargs = []
180 |       encargs << external_encoding
181 |       encargs << internal_encoding if internal_encoding
182 |       set_encoding(*encargs)
183 |     end
184 | 
185 |     @allflags = flags.reduce(0) do |val, flag|
186 |       flag = XZ::LibLZMA::LZMA_DECODE_FLAGS[flag] || raise(ArgumentError, "Unknown flag #{flag}")
187 |       val | flag
188 |     end
189 | 
190 |     res = XZ::LibLZMA.lzma_stream_decoder(@lzma_stream.to_ptr,
191 |                                       @memory_limit,
192 |                                       @allflags)
193 |     XZ::LZMAError.raise_if_necessary(res)
194 |   end
195 | 
196 |   # Mostly like IO#read. The +length+ parameter refers to the amount
197 |   # of decompressed bytes to read, not the amount of bytes to read
198 |   # from the compressed data. That is, if you request a read of 50
199 |   # bytes, you will receive a string with a maximum length of 50
200 |   # bytes, regardless of how many bytes this was in compressed form.
201 |   #
202 |   # Return values are as per IO#read.
203 |   def read(length = nil, outbuf = String.new)
204 |     return "".force_encoding(Encoding::BINARY) if length == 0 # Shortcut; retval as per IO#read.
205 | 
206 |     # Note: Querying the underlying IO as early as possible lets
207 |     # Ruby's own IO exceptions bubble up.
208 |     if length
209 |       return nil if eof? # In line with IO#read
210 |       outbuf.force_encoding(Encoding::BINARY) # As per IO#read docs
211 | 
212 |       # The user's request is in decompressed bytes, so it doesn't matter
213 |       # how much is actually read from the compressed file.
214 |       if @delegate_io.eof?
215 |         data   = ""
216 |         action = XZ::LibLZMA::LZMA_FINISH
217 |       else
218 |         data   = @delegate_io.read(XZ::CHUNK_SIZE)
219 |         action = @delegate_io.eof? ? XZ::LibLZMA::LZMA_FINISH : XZ::LibLZMA::LZMA_RUN
220 |       end
221 | 
222 |       lzma_code(data, action) { |decompressed| @readbuf << decompressed }
223 | 
224 |       # If the requested amount has been read, return it.
225 |       # Also return if EOF has been reached. Note that
226 |       # String#slice! will clear the string to an empty one
227 |       # if `length' is greater than the string length.
228 |       # If EOF is not yet reached, try reading and decompressing
229 |       # more data.
230 |       if @readbuf.bytesize >= length || @delegate_io.eof?
231 |         result = @readbuf.slice!(0, length)
232 |         @pos += result.bytesize
233 |         return outbuf.replace(result)
234 |       else
235 |         return read(length, outbuf)
236 |       end
237 |     else
238 |       # Read the entire file and decompress it into memory, returning it.
239 |       while chunk = @delegate_io.read(XZ::CHUNK_SIZE)
240 |         action = @delegate_io.eof? ? XZ::LibLZMA::LZMA_FINISH : XZ::LibLZMA::LZMA_RUN
241 |         lzma_code(chunk, action) { |decompressed| @readbuf << decompressed }
242 |       end
243 | 
244 |       @pos += @readbuf.bytesize
245 | 
246 |       # Apply encoding conversion.
247 |       # First, tag the read data with the external encoding.
248 |       @readbuf.force_encoding(@external_encoding)
249 | 
250 |       # Now, transcode it to the internal encoding if that was requested.
251 |       # Otherwise return it with the external encoding as-is.
252 |       if @internal_encoding
253 |         @readbuf.encode!(@internal_encoding, @transcode_options)
254 |         outbuf.force_encoding(@internal_encoding)
255 |       else
256 |         outbuf.force_encoding(@external_encoding)
257 |       end
258 | 
259 |       outbuf.replace(@readbuf)
260 |       @readbuf.clear
261 |       @readbuf.force_encoding(Encoding::BINARY) # Back to binary mode for further reading
262 | 
263 |       return outbuf
264 |     end
265 |   end
266 | 
267 |   # Abort the current decompression process and reset everything
268 |   # to the start so that reading from this reader will start over
269 |   # from the beginning of the compressed data.
270 |   #
271 |   # The delegate IO has to support the #rewind method. Otherwise
272 |   # like IO#rewind.
273 |   def rewind
274 |     super
275 | 
276 |     @readbuf.clear
277 |     res = XZ::LibLZMA.lzma_stream_decoder(@lzma_stream.to_ptr,
278 |                                       @memory_limit,
279 |                                       @allflags)
280 |     XZ::LZMAError.raise_if_necessary(res)
281 | 
282 |     0 # Mimic IO#rewind's return value
283 |   end
284 | 
285 |   # Like IO#ungetbyte.
286 |   def ungetbyte(obj)
287 |     if obj.respond_to? :chr
288 |       @readbuf.prepend(obj.chr)
289 |     else
290 |       @readbuf.prepend(obj.to_s)
291 |     end
292 |   end
293 | 
294 |   # Like IO#ungetc.
295 |   def ungetc(str)
296 |     @readbuf.prepend(str)
297 |   end
298 | 
299 |   # Returns true if:
300 |   #
301 |   # 1. The underlying IO has reached EOF, and
302 |   # 2. liblzma has returned everything it could make out of that.
303 |   def eof?
304 |     @delegate_io.eof? && @readbuf.empty?
305 |   end
306 | 
307 |   # Human-readable description
308 |   def inspect
309 |     "<#{self.class} pos=#{@pos} bufsize=#{@readbuf.bytesize} finished=#{@finished} closed=#{closed?} io=#{@delegate_io.inspect}>"
310 |   end
311 | 
312 | end
313 | 
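The `read(length)` branch in stream_reader.rb above keeps pulling compressed chunks into a buffer until enough decompressed bytes are available, then slices the requested amount off the front. Below is a minimal sketch of that buffering pattern. It is an illustration under assumptions, not the library's code: `BufferedInflateReader` is a hypothetical class, and `Zlib::Inflate` is swapped in for liblzma so the example runs with the standard library alone. It uses a loop where the original uses recursion.

```ruby
require "stringio"
require "zlib"

# Hypothetical stand-in for the read(length) logic of XZ::StreamReader.
# Zlib::Inflate replaces liblzma here; the buffering idea is the same.
class BufferedInflateReader
  CHUNK_SIZE = 16 # deliberately tiny so that several chunks are needed

  def initialize(io)
    @io      = io
    @inflate = Zlib::Inflate.new
    @readbuf = String.new(encoding: Encoding::BINARY)
  end

  # True once the source is exhausted and the buffer has been drained.
  def eof?
    @io.eof? && @readbuf.empty?
  end

  # Returns up to +length+ decompressed bytes, or nil at EOF.
  def read(length)
    return nil if eof?

    # Feed compressed chunks until enough plain bytes are buffered
    # or the compressed source has nothing left.
    until @readbuf.bytesize >= length || @io.eof?
      @readbuf << @inflate.inflate(@io.read(CHUNK_SIZE))
    end

    @readbuf.slice!(0, length)
  end
end

plain  = "I love Ruby. " * 20
reader = BufferedInflateReader.new(StringIO.new(Zlib::Deflate.deflate(plain)))
parts  = []
while (part = reader.read(50))
  parts << part
end
```

The requested length always refers to decompressed bytes, so the number of compressed chunks consumed per call varies with the compression ratio.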

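The `flags` keyword of StreamReader.new is folded into a single integer with `reduce` and bitwise OR before it is handed to liblzma. The sketch below shows that reduction on its own. `DECODE_FLAGS` and `combine_flags` are hypothetical names; the numeric values are the ones defined in liblzma's `container.h`, and it is an assumption that `XZ::LibLZMA::LZMA_DECODE_FLAGS` maps the symbols the same way.

```ruby
# Hypothetical copy of the decoder flag table (values from liblzma's
# container.h); the real mapping lives in XZ::LibLZMA::LZMA_DECODE_FLAGS.
DECODE_FLAGS = {
  tell_no_check:          0x01,
  tell_unsupported_check: 0x02,
  concatenated:           0x08
}.freeze

# Combines a list of flag names into one bitmask, rejecting unknown names.
def combine_flags(flags)
  flags.reduce(0) do |mask, name|
    mask | DECODE_FLAGS.fetch(name) { raise ArgumentError, "Unknown flag #{name}" }
  end
end

mask = combine_flags([:tell_unsupported_check, :concatenated])
```

An empty flag list reduces to 0, which corresponds to passing no decoder flags at all.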

--------------------------------------------------------------------------------
/lib/xz/stream_writer.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | # An IO-like writer class for XZ-compressed data, allowing you to
 29 | # write uncompressed data to a stream which ends up as compressed data
 30 | # in a wrapped stream such as a file.
 31 | #
 32 | # A StreamWriter object actually wraps another IO object it writes the
 33 | # XZ-compressed data to. Here’s an ASCII art image to demonstrate how
 34 | # data flows when using StreamWriter to write to a compressed file:
 35 | #
 36 | #           +-----------------+  +------------+
 37 | #   YOUR  =>|StreamWriter's   |=>|Wrapped IO's|=> ACTUAL
 38 | #   DATA  =>|(liblzma) buffers|=>|buffers     |=>  FILE
 39 | #           +-----------------+  +------------+
 40 | #
 41 | # This graphic also illustrates why it is unlikely to see written data
 42 | # directly appear in the file on your harddisk; the data is cached at
 43 | # least twice before it actually gets written out. Regarding file
 44 | # closing that means that before you can be sure any pending data has
 45 | # been written to the file you have to close both the StreamWriter
 46 | # instance and then the wrapped IO object (in *exactly* that order,
 47 | # otherwise data loss and unexpected exceptions may occur!).
 48 | #
 49 | # Calling the #close method closes both the XZ writer and the
 50 | # underlying IO object in the correct order. This is akin to the
 51 | # behaviour exposed by Ruby's own Zlib::GzipWriter class. If you
 52 | # expressly don't want to close the underlying IO instance, you need
 53 | # to manually call StreamWriter#finish and never call
 54 | # StreamWriter#close. Instead, you then close your IO object manually
 55 | # using IO#close once you're done with it.
 56 | #
 57 | # *NOTE*: Using #finish inside the +open+ method's block allows
 58 | # you to continue using that writer's File instance as it is
 59 | # returned by #finish.
 60 | class XZ::StreamWriter < XZ::Stream
 61 | 
 62 |   # Compression level used for this writer (set on instantiation).
 63 |   attr_reader :level
 64 |   # Checksum algorithm in use.
 65 |   attr_reader :check
 66 | 
 67 |   # call-seq:
 68 |   #   open(filename [, kw]) → stream_writer
 69 |   #   open(filename [, kw]){|sw| ...} → stream_writer
 70 |   #
 71 |   # Creates a new instance for writing to a compressed file. The File
 72 |   # instance is opened internally and then wrapped via ::new. The
 73 |   # block form automatically closes both the liblzma stream and the
 74 |   # internal File instance in the correct order. The non-block form
 75 |   # does neither, leaving it to you to call #finish or #close later.
 76 |   #
 77 |   # === Parameters
 78 |   # [filename]
 79 |   #   The file to open.
 80 |   # [sw (block argument)]
 81 |   #   The created StreamWriter instance.
 82 |   #
 83 |   # See ::new for the other parameters.
 84 |   #
 85 |   # === Return value
 86 |   # Returns the newly created instance.
 87 |   #
 88 |   # === Remarks
 89 |   # Starting with version 1.0.0, the block form also returns the newly
 90 |   # created instance rather than the block's return value. This is
 91 |   # in line with Ruby's own GzipWriter.open API.
 92 |   #
 93 |   # === Example
 94 |   #     # Normal usage
 95 |   #     XZ::StreamWriter.open("myfile.txt.xz") do |xz|
 96 |   #       xz.puts "Compress this line"
 97 |   #       xz.puts "And this line as well"
 98 |   #     end
 99 |   #
100 |   #     # If for whatever reason you want to do something else with
101 |   #     # the internally opened file:
102 |   #     file = nil
103 |   #     XZ::StreamWriter.open("myfile.txt.xz") do |xz|
104 |   #       xz.puts "Compress this line"
105 |   #       xz.puts "And this line as well"
106 |   #       file = xz.finish
107 |   #     end
108 |   #     # At this point, the liblzma stream has been closed, but `file'
109 |   #     # now contains the internally created File instance, which is
110 |   #     # still open. Don't forget to close it yourself at some point
111 |   #     # to flush it.
112 |   #     file.close
113 |   #
114 |   #     # Or just don't use the block form:
115 |   #     xz = XZ::StreamWriter.open("myfile.txt.xz")
116 |   #     xz.puts "Compress this line"
117 |   #     xz.puts "And this line as well"
118 |   #     file = xz.finish
119 |   #     file.close # Don't forget to close it manually (or use xz.close instead of xz.finish above)
120 |   def self.open(filename, **args)
121 |     file = File.open(filename, "wb")
122 |     writer = new(file, **args)
123 | 
124 |     if block_given?
125 |       begin
126 |         yield(writer)
127 |       ensure
128 |         # Close both writer and delegate IO via writer.close
129 |         # unless the writer has manually been finished (usually
130 |         # not closing the delegate IO then).
131 |         writer.close unless writer.finished?
132 |       end
133 |     end
134 | 
135 |     writer
136 |   end
137 | 
138 |   # Creates a new instance that is wrapped around the given IO instance.
139 |   #
140 |   # === Parameters
141 |   # ==== Positional parameters
142 |   # [delegate_io]
143 |   #   The IO instance to wrap. It has to be opened in binary mode,
144 |   #   otherwise the data it writes to the hard disk will be corrupt.
145 |   #
146 |   # ==== Keyword arguments
147 |   # [level (6)]
148 |   #   Compression strength. Higher values indicate a
149 |   #   smaller result, but longer compression time. Maximum
150 |   #   is 9.
151 |   # [:check (:crc64)]
152 |   #   The checksum algorithm to use for verifying
153 |   #   the data inside the archive. Possible values are:
154 |   #   * :none
155 |   #   * :crc32
156 |   #   * :crc64
157 |   #   * :sha256
158 |   # [:extreme (false)]
159 |   #   Tries to get the last bit out of the
160 |   #   compression. This may succeed, but you can end
161 |   #   up with *very* long computation times.
162 |   # [:external_encoding (Encoding.default_external)]
163 |   #   Transcode to this encoding when writing. Defaults
164 |   #   to Encoding.default_external, which by default is
165 |   #   set from the environment.
166 |   #
167 |   # === Return value
168 |   # Returns the newly created instance.
169 |   #
170 |   # === Remarks
171 |   # This method does not close the underlying IO nor does it automatically
172 |   # flush liblzma. You'll need to do that manually using #close or #finish.
173 |   # See ::open for a method that supports a block with auto-closing.
174 |   #
175 |   # This method used to accept a block in earlier versions. This
176 |   # behaviour has been removed in version 1.0.0 to synchronise the API
177 |   # with Ruby's own GzipWriter.new.
178 |   #
179 |   # === Example
180 |   #     # Normal usage:
181 |   #     file = File.open("myfile.txt.xz", "wb") # Note binary mode
182 |   #     xz = XZ::StreamWriter.new(file)
183 |   #     xz.puts("Compress this line")
184 |   #     xz.puts("And this second line")
185 |   #     xz.close # Closes both the liblzma stream and `file'
186 |   #
187 |   #     # Expressly closing the delegate IO manually:
188 |   #     File.open("myfile.txt.xz", "wb") do |file| # Note binary mode
189 |   #       xz = XZ::StreamWriter.new(file)
190 |   #       xz.puts("Compress this line")
191 |   #       xz.puts("And this second line")
192 |   #       xz.finish # Flushes liblzma, but keeps `file' open.
193 |   #     end # Here, `file' is closed.
194 |   def initialize(delegate_io, level: 6, check: :crc64, extreme: false, external_encoding: nil)
195 |     super(delegate_io)
196 | 
197 |     raise(ArgumentError, "Invalid compression level!")  unless (0..9).include?(level)
198 |     raise(ArgumentError, "Invalid checksum specified!") unless [:none, :crc32, :crc64, :sha256].include?(check)
199 | 
200 |     set_encoding(external_encoding) if external_encoding
201 | 
202 |     @check  = check
203 |     @level  = level
204 |     @level |= XZ::LibLZMA::LZMA_PRESET_EXTREME if extreme
205 | 
206 |     res = XZ::LibLZMA.lzma_easy_encoder(@lzma_stream.to_ptr,
207 |                                     @level,
208 |                                     XZ::LibLZMA.const_get(:"LZMA_CHECK_#{@check.upcase}"))
209 |     XZ::LZMAError.raise_if_necessary(res)
210 |   end
211 | 
212 |   # Mostly like IO#write. Additionally it raises an IOError
213 |   # if #finish has been called previously.
214 |   def write(*args)
215 |     raise(IOError, "Cannot write to a finished liblzma stream") if @finished
216 | 
217 |     origpos = @pos
218 | 
219 |     args.each do |arg|
220 |       @pos += arg.to_s.bytesize
221 | 
222 |       # Apply external encoding if requested
223 |       if @external_encoding && @external_encoding != Encoding::BINARY
224 |         arg = arg.to_s.encode(@external_encoding)
225 |       end
226 | 
227 |       lzma_code(arg.to_s, XZ::LibLZMA::LZMA_RUN) do |compressed|
228 |         @delegate_io.write(compressed)
229 |       end
230 |     end
231 | 
232 |     @pos - origpos # Return number of bytes consumed from input
233 |   end
234 | 
235 |   # Like superclass' method, but also ensures liblzma flushes all
236 |   # compressed data to the delegate IO.
237 |   def finish
238 |     lzma_code("", XZ::LibLZMA::LZMA_FINISH) { |compressed| @delegate_io.write(compressed) }
239 |     super
240 |   end
241 | 
242 |   # Abort the current compression process and reset everything
243 |   # to the start. Writing into this writer will cause existing data
244 |   # on the underlying IO to be overwritten after this method has been
245 |   # called.
246 |   #
247 |   # The delegate IO has to support the #rewind method. Otherwise like
248 |   # IO#rewind.
249 |   def rewind
250 |     super
251 | 
252 |     res = XZ::LibLZMA.lzma_easy_encoder(@lzma_stream.to_ptr,
253 |                                     @level,
254 |                                     XZ::LibLZMA.const_get(:"LZMA_CHECK_#{@check.upcase}"))
255 |     XZ::LZMAError.raise_if_necessary(res)
256 | 
257 |     0 # Mimic IO#rewind's return value
258 |   end
259 | 
260 |   # Human-readable description
261 |   def inspect
262 |     "<#{self.class} pos=#{@pos} finished=#{@finished} closed=#{closed?} io=#{@delegate_io.inspect}>"
263 |   end
264 | 
265 | end
266 | 
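The StreamWriter documentation above describes its #finish and #close split as akin to Ruby's own Zlib::GzipWriter. The sketch below demonstrates that split with Zlib::GzipWriter itself, because it ships with Ruby. It illustrates the semantics only: it is an assumption that StreamWriter behaves identically in every detail, and the variable names are made up for the example.

```ruby
require "stringio"
require "zlib"

# finish: flush the compressor and hand back the wrapped IO, still open.
io = StringIO.new(String.new(encoding: Encoding::BINARY))
gz = Zlib::GzipWriter.new(io)
gz.puts "Compress this line"
returned          = gz.finish
open_after_finish = !io.closed?
compressed        = io.string.dup
io.close # the caller closes the wrapped IO

# close: flush the compressor and close the wrapped IO in one step.
io2 = StringIO.new(String.new(encoding: Encoding::BINARY))
gz2 = Zlib::GzipWriter.new(io2)
gz2.puts "Compress this line"
gz2.close
closed_after_close = io2.closed?

# The data only becomes a complete archive once finish or close has run.
round_trip = Zlib::GzipReader.new(StringIO.new(compressed)).read
```

In both cases the compressor is flushed before the wrapped IO is touched, which is the ordering the class documentation insists on.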


--------------------------------------------------------------------------------
/lib/xz/version.rb:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | #--
 3 | # Basic liblzma-bindings for Ruby.
 4 | #
 5 | # Copyright © 2011-2018 Marvin Gülker et al.
 6 | #
 7 | # See AUTHORS for the full list of contributors.
 8 | #
 9 | # Permission is hereby granted, free of charge, to any person obtaining a
10 | # copy of this software and associated documentation files (the ‘Software’),
11 | # to deal in the Software without restriction, including without limitation
12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
13 | # and/or sell copies of the Software, and to permit persons to whom the Software
14 | # is furnished to do so, subject to the following conditions:
15 | #
16 | # The above copyright notice and this permission notice shall be included in all
17 | # copies or substantial portions of the Software.
18 | #
19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
25 | # THE SOFTWARE.
26 | #++
27 | 
28 | module XZ
29 | 
30 |   # The version of this library.
31 |   VERSION = "1.0.0".freeze
32 | 
33 | end
34 | 


--------------------------------------------------------------------------------
/ruby-xz.gemspec:
--------------------------------------------------------------------------------
 1 | # -*- mode: ruby; coding: utf-8 -*-
 2 | #--
 3 | # Basic liblzma-bindings for Ruby.
 4 | #
 5 | # Copyright © 2011-2018 Marvin Gülker et al.
 6 | #
 7 | # See AUTHORS for the full list of contributors.
 8 | #
 9 | # Permission is hereby granted, free of charge, to any person obtaining a
10 | # copy of this software and associated documentation files (the ‘Software’),
11 | # to deal in the Software without restriction, including without limitation
12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
13 | # and/or sell copies of the Software, and to permit persons to whom the Software
14 | # is furnished to do so, subject to the following conditions:
15 | #
16 | # The above copyright notice and this permission notice shall be included in all
17 | # copies or substantial portions of the Software.
18 | #
19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
25 | # THE SOFTWARE.
26 | #++
27 | 
28 | require_relative "lib/xz/version"
29 | 
30 | GEMSPEC = Gem::Specification.new do |spec|
31 |   spec.name        = "ruby-xz"
32 |   spec.summary     = "XZ compression via liblzma for Ruby, using fiddle."
33 |   spec.description =<<DESCRIPTION
34 | These are simple Ruby bindings for the liblzma library
35 | (http://tukaani.org/xz/), which is best known for the
36 | extreme compression ratio its native XZ format achieves.
37 | Since fiddle is used to implement the bindings, no compilation
38 | is needed.
39 | DESCRIPTION
40 |   spec.version               = XZ::VERSION.gsub("-", ".")
41 |   spec.author                = "Marvin Gülker"
42 |   spec.email                 = "m-guelker@phoenixmail.de"
43 |   spec.license               = "MIT"
44 |   spec.homepage              = "https://mg.guelker.eu/projects/ruby-xz/"
45 |   spec.platform              = Gem::Platform::RUBY
46 |   spec.required_ruby_version = ">=2.3.0"
47 |   spec.add_development_dependency("minitar", "~> 0.6")
48 |   spec.files.concat(Dir["lib/**/*.rb"])
49 |   spec.files.concat(Dir["**/*.rdoc"])
50 |   spec.files << "README.md" << "LICENSE" << "AUTHORS"
51 |   spec.has_rdoc         = true
52 |   spec.extra_rdoc_files = %w[README.md HISTORY.rdoc LICENSE AUTHORS]
53 |   spec.rdoc_options << "-t" << "ruby-xz RDocs" << "-m" << "README.md"
54 |   spec.post_install_message = "Version 1.0.0 of ruby-xz breaks the API. Read HISTORY.rdoc and adapt your code to the new API."
55 | end
56 | 


--------------------------------------------------------------------------------
/test/common.rb:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | #--
 3 | # Basic liblzma-bindings for Ruby.
 4 | #
 5 | # Copyright © 2011-2018 Marvin Gülker et al.
 6 | #
 7 | # See AUTHORS for the full list of contributors.
 8 | #
 9 | # Permission is hereby granted, free of charge, to any person obtaining a
10 | # copy of this software and associated documentation files (the ‘Software’),
11 | # to deal in the Software without restriction, including without limitation
12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
13 | # and/or sell copies of the Software, and to permit persons to whom the Software
14 | # is furnished to do so, subject to the following conditions:
15 | #
16 | # The above copyright notice and this permission notice shall be included in all
17 | # copies or substantial portions of the Software.
18 | #
19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
25 | # THE SOFTWARE.
26 | #++
27 | 
28 | require_relative "../lib/xz"
29 | 
30 | require "pathname"
31 | require "tempfile"
32 | require "minitest/autorun"
33 | 
34 | # This is the absolute path to the test/ directory.
35 | TEST_DIR = Pathname.new(__FILE__).dirname
36 | 


--------------------------------------------------------------------------------
/test/test-data/iso88591.txt.xz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Quintus/ruby-xz/4d4f142884a39300cb9adb8bfe82829379c9660b/test/test-data/iso88591.txt.xz


--------------------------------------------------------------------------------
/test/test-data/lorem_ipsum.txt:
--------------------------------------------------------------------------------
1 | Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
2 | 


--------------------------------------------------------------------------------
/test/test-data/lorem_ipsum.txt.xz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Quintus/ruby-xz/4d4f142884a39300cb9adb8bfe82829379c9660b/test/test-data/lorem_ipsum.txt.xz


--------------------------------------------------------------------------------
/test/test_stream_reader.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | require_relative "common"
 29 | 
 30 | class StreamReaderTest < Minitest::Test
 31 | 
 32 |   TEST_DATA_DIR    = Pathname.new(__FILE__).dirname + "test-data"
 33 |   PLAIN_TEXT_FILE  = TEST_DATA_DIR + "lorem_ipsum.txt"
 34 |   XZ_TEXT_FILE     = TEST_DATA_DIR + "lorem_ipsum.txt.xz"
 35 |   XZ_ISO_TEXT_FILE = TEST_DATA_DIR + "iso88591.txt.xz"
 36 | 
 37 |   def test_new
 38 |     File.open(XZ_TEXT_FILE, "rb") do |file|
 39 |       reader = XZ::StreamReader.new(file)
 40 | 
 41 |       assert_equal("Lorem ipsum", reader.read(11))
 42 |       assert_equal(" dolor sit amet", reader.read(15))
 43 | 
 44 |       rest = reader.read
 45 |       assert_equal("Lorem ipsum dolor sit amet.\n", rest[-28..-1])
 46 |       assert_equal("", reader.read) # We’re at EOF
 47 |       assert(reader.eof?, "EOF is not EOF!")
 48 | 
 49 |       reader.close
 50 |     end
 51 |   end
 52 | 
 53 |   def test_file_closing
 54 |     File.open(XZ_TEXT_FILE, "rb") do |file|
 55 |       reader = XZ::StreamReader.new(file)
 56 |       reader.read
 57 |       reader.close
 58 |       assert(file.closed?, "Did not close file although expected!")
 59 |     end
 60 | 
 61 |     File.open(XZ_TEXT_FILE, "rb") do |file|
 62 |       reader = XZ::StreamReader.new(file)
 63 |       reader.read
 64 |       reader.finish
 65 |       assert(!file.closed?, "Closed file although not expected!")
 66 |     end
 67 | 
 68 |     reader = XZ::StreamReader.open(XZ_TEXT_FILE){|r| r.read}
 69 |     assert(reader.finished?, "Didn't finish stream!")
 70 |     assert(reader.instance_variable_get(:@delegate_io).closed?, "Didn't close internally created file!")
 71 | 
 72 |     reader = XZ::StreamReader.open(XZ_TEXT_FILE)
 73 |     reader.read
 74 |     reader.close
 75 |     assert(reader.instance_variable_get(:@delegate_io).closed?, "Didn't close internally created file!")
 76 | 
 77 |     reader = XZ::StreamReader.open(XZ_TEXT_FILE)
 78 |     reader.read
 79 |     reader.finish
 80 |     assert(!reader.instance_variable_get(:@delegate_io).closed?, "Closed internally created file although not expected!")
 81 | 
 82 |     File.open(XZ_TEXT_FILE, "rb") do |file|
 83 |       r = XZ::StreamReader.new(file)
 84 |       r.read(10)
 85 |       r.rewind
 86 |       assert(!file.closed?, "Closed handed IO during rewind!")
 87 |     end
 88 | 
 89 |     XZ::StreamReader.open(XZ_TEXT_FILE) do |r|
 90 |       r.read(10)
 91 |       r.rewind
 92 |       assert(!r.instance_variable_get(:@delegate_io).closed?, "Closed internal file during rewind")
 93 |     end
 94 | 
 95 |     # Test double closing (this should not raise)
 96 |     XZ::StreamReader.open(XZ_TEXT_FILE) do |r|
 97 |       r.close
 98 |     end
 99 | 
100 |   end
101 | 
102 |   def test_finish
103 |     File.open(XZ_TEXT_FILE, "rb") do |file|
104 |       r = XZ::StreamReader.new(file)
105 |       r.read
106 |       assert_equal file, r.finish
107 | 
108 |       assert r.finished?, "Didn't finish despite #finish"
109 |       assert !file.closed?, "Closed wrapped file despite #finish!"
110 |     end
111 | 
112 |     file = nil
113 |     XZ::StreamReader.open(XZ_TEXT_FILE){|r| r.read; file = r.finish}
114 |     assert_kind_of File, file # Return value of #finish
115 |     assert !file.closed?, "Closed wrapped file despite #finish!"
116 |     file.close # cleanup
117 | 
118 |     reader = XZ::StreamReader.open(XZ_TEXT_FILE)
119 |     reader.read
120 |     file = reader.finish
121 |     assert_kind_of File, file
122 |     assert !file.closed?, "Closed wrapped file despite #finish!"
123 |     file.close # cleanup
124 |   end
125 | 
126 |   def test_open
127 |     XZ::StreamReader.open(XZ_TEXT_FILE) do |reader|
128 |       assert_equal(File.read(PLAIN_TEXT_FILE), reader.read)
129 |     end
130 | 
131 |     File.open(XZ_TEXT_FILE, "rb") do |file|
132 |       reader = XZ::StreamReader.new(file)
133 |       assert_equal(File.read(PLAIN_TEXT_FILE), reader.read)
134 |       reader.close
135 |     end
136 |   end
137 | 
138 |   def test_pos
139 |     text = File.read(PLAIN_TEXT_FILE)
140 |     XZ::StreamReader.open(XZ_TEXT_FILE) do |reader|
141 |       reader.read
142 |       assert_equal(text.bytes.count, reader.pos)
143 |     end
144 |   end
145 | 
146 |   def test_rewind
147 |     # Non-block form
148 |     File.open(XZ_TEXT_FILE, "rb") do |file|
149 |       reader = XZ::StreamReader.new(file)
150 |       text = reader.read(10)
151 |       reader.rewind
152 |       assert_equal(text, reader.read(10))
153 |     end
154 | 
155 |     # Block form
156 |     XZ::StreamReader.open(XZ_TEXT_FILE) do |reader|
157 |       text = reader.read(10)
158 |       reader.rewind
159 |       assert_equal(text, reader.read(10))
160 |     end
161 |   end
162 | 
163 |   def test_encodings
164 |     enc1 = Encoding.default_external
165 |     enc2 = Encoding.default_internal
166 |     verb = $VERBOSE
167 |     $VERBOSE = nil # Disable warnings, because setting
168 |     # Encoding.default_{internal,external} generates a
169 |     # warning. However, setting these is required to test
170 |     # if they're properly honoured by ruby-xz.
171 |     begin
172 |       Encoding.default_external = Encoding::ISO_8859_1
173 | 
174 |       # Forced binary read must always yield BINARY
175 |       XZ::StreamReader.open(XZ_ISO_TEXT_FILE) do |reader|
176 |         str = reader.read(255)
177 |         assert_equal Encoding::BINARY, str.encoding
178 |       end
179 | 
180 |       # Now the external encoding needs to be detected
181 |       XZ::StreamReader.open(XZ_ISO_TEXT_FILE) do |reader|
182 |         str = reader.read
183 |         assert_equal Encoding::ISO_8859_1, str.encoding
184 |         assert str.valid_encoding?
185 |       end
186 | 
187 |       # Request transcode
188 |       XZ::StreamReader.open(XZ_ISO_TEXT_FILE, external_encoding: "ISO-8859-1", internal_encoding: "UTF-8") do |reader|
189 |         str = reader.read
190 |         assert_equal Encoding::UTF_8, str.encoding
191 |         assert str.valid_encoding?
192 |       end
193 | 
194 |       # Request transcode via default internal encoding
195 |       Encoding.default_internal = Encoding::UTF_8
196 |       XZ::StreamReader.open(XZ_ISO_TEXT_FILE) do |reader|
197 |         str = reader.read
198 |         assert_equal Encoding::UTF_8, str.encoding
199 |         assert str.valid_encoding?
200 |       end
201 | 
202 |       # Ensure getc does what it should when asked for multibyte chars
203 |       XZ::StreamReader.open(XZ_ISO_TEXT_FILE) do |reader|
204 |         assert_equal "B", reader.getc
205 |         assert_equal "ä", reader.getc
206 |         assert_equal "r", reader.getc
207 |       end
208 |     ensure
209 |       # Reset to normal for further tests
210 |       Encoding.default_external = enc1
211 |       Encoding.default_internal = enc2
212 |       $VERBOSE = verb
213 |     end
214 |   end
215 | 
216 |   def test_set_encoding
217 |     reader = XZ::StreamReader.open(XZ_ISO_TEXT_FILE)
218 | 
219 |     reader.set_encoding "UTF-8"
220 |     assert_equal Encoding::UTF_8, reader.external_encoding
221 |     assert_nil reader.internal_encoding
222 | 
223 |     reader.set_encoding "ISO-8859-1:UTF-8"
224 |     assert_equal Encoding::ISO_8859_1, reader.external_encoding
225 |     assert_equal Encoding::UTF_8, reader.internal_encoding
226 | 
227 |     reader.set_encoding Encoding::UTF_8
228 |     assert_equal Encoding::UTF_8, reader.external_encoding
229 |     assert_nil reader.internal_encoding
230 | 
231 |     reader.set_encoding Encoding::UTF_8, Encoding::ISO_8859_1
232 |     assert_equal Encoding::UTF_8, reader.external_encoding
233 |     assert_equal Encoding::ISO_8859_1, reader.internal_encoding
234 | 
235 |     reader.set_encoding "ISO-8859-1", {:invalid => :replace, :replace => "?"}
236 |     assert_equal Encoding::ISO_8859_1, reader.external_encoding
237 |     assert_nil reader.internal_encoding
238 | 
239 |     reader.set_encoding "ISO-8859-1", "UTF-8", {:invalid => :replace, :replace => "?"}
240 |     assert_equal Encoding::ISO_8859_1, reader.external_encoding
241 |     assert_equal Encoding::UTF_8, reader.internal_encoding
242 | 
243 |     reader.set_encoding "ISO-8859-1:UTF-8", {:invalid => :replace, :replace => "?"}
244 |     assert_equal Encoding::ISO_8859_1, reader.external_encoding
245 |     assert_equal Encoding::UTF_8, reader.internal_encoding
246 | 
247 |   end
248 | 
249 | end
250 | 


--------------------------------------------------------------------------------
/test/test_stream_writer.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | require_relative "common"
 29 | 
 30 | # For this testcase, please note that it isn’t possible to check
 31 | # whether the compressed string is equal to some other
 32 | # compressed string containing the same original text due to
 33 | # different compression options and/or different versions of
 34 | # liblzma. Hence, I can only test whether the re-decompressed
 35 | # result is equal to what I originally had.
 36 | class StreamWriterTest < Minitest::Test
 37 | 
 38 |   TEST_DATA_DIR   = Pathname.new(__FILE__).dirname + "test-data"
 39 |   PLAIN_TEXT_FILE = TEST_DATA_DIR + "lorem_ipsum.txt"
 40 |   XZ_TEXT_FILE    = TEST_DATA_DIR + "lorem_ipsum.txt.xz"
 41 |   LIVE_TEST_FILE  = TEST_DATA_DIR + "lorem2.txt.xz"
 42 | 
 43 |   def test_stream_writer_new
 44 |     text   = File.read(PLAIN_TEXT_FILE)
 45 |     text1  = text[0...10]
 46 |     text2  = text[10..-1]
 47 | 
 48 |     File.open(LIVE_TEST_FILE, "wb") do |file|
 49 |       writer = XZ::StreamWriter.new(file)
 50 | 
 51 |       assert_equal(text1.bytes.count, writer.write(text1))
 52 |       assert_equal(text2.bytes.count, writer.write(text2))
 53 |       writer.close
 54 | 
 55 |       assert(text.bytesize > File.stat(LIVE_TEST_FILE).size, "Compression did not compress")
 56 |       assert(writer.finished?, "Didn't finish writer")
 57 |       assert(writer.closed?, "Didn't close writer")
 58 |       assert_raises(IOError){writer.write("foo")}
 59 |     end
 60 | 
 61 |     assert_equal(text, XZ.decompress(File.open(LIVE_TEST_FILE, "rb"){|f| f.read}))
 62 |   end
 63 | 
 64 |   def test_file_closing
 65 |     File.open(LIVE_TEST_FILE, "wb") do |file|
 66 |       w = XZ::StreamWriter.new(file)
 67 |       w.write("Foo")
 68 |       w.finish
 69 |       assert(!file.closed?, "Closed file although not expected!")
 70 |     end
 71 | 
 72 |     File.open(LIVE_TEST_FILE, "wb") do |file|
 73 |       w = XZ::StreamWriter.new(file)
 74 |       w.write("Foo")
 75 |       w.close
 76 |       assert(w.finished?, "Didn't finish writer although expected!")
 77 |       assert(file.closed?, "Didn't close file although expected!")
 78 |     end
 79 | 
 80 |     writer = XZ::StreamWriter.open(LIVE_TEST_FILE){|w| w.write("Foo")}
 81 |     assert(writer.finished?, "Didn't finish writer")
 82 |     assert(writer.instance_variable_get(:@delegate_io).closed?, "Didn't close internally opened file!")
 83 | 
 84 |     writer = XZ::StreamWriter.new(File.open(LIVE_TEST_FILE, "wb"))
 85 |     writer.write("Foo")
 86 |     writer.close
 87 |     assert(writer.finished?, "Didn't finish writer")
 88 |     assert(writer.instance_variable_get(:@delegate_io).closed?, "Didn't close internally opened file!")
 89 | 
 90 |     # Test double closing (this should not raise)
 91 |     XZ::StreamWriter.open(LIVE_TEST_FILE) do |w|
 92 |       w.write("Foo")
 93 |       w.close
 94 |     end
 95 |   end
 96 | 
 97 |   def test_finish
 98 |     File.open(LIVE_TEST_FILE, "wb") do |file|
 99 |       w = XZ::StreamWriter.new(file)
100 |       w.write("Foo")
101 | 
102 |       assert_equal file, w.finish
103 |       assert !file.closed?, "Closed wrapped file despite #finish!"
104 |     end
105 | 
106 |     file = nil
107 |     XZ::StreamWriter.open(LIVE_TEST_FILE){|w| w.write("Foo"); file = w.finish}
108 |     assert_kind_of File, file # Return value of #finish
109 |     assert !file.closed?, "Closed wrapped file despite #finish!"
110 |     file.close # cleanup
111 | 
112 |     writer = XZ::StreamWriter.open(LIVE_TEST_FILE)
113 |     writer.write("Foo")
114 |     file = writer.finish
115 |     assert_kind_of File, file
116 |     assert !file.closed?, "Closed wrapped file despite #finish!"
117 |     file.close # cleanup
118 |   end
119 | 
120 |   def test_stream_writer_open
121 |     text = File.read(PLAIN_TEXT_FILE)
122 | 
123 |     XZ::StreamWriter.open(LIVE_TEST_FILE) do |file|
124 |       file.write(text)
125 |     end
126 | 
127 |     assert_equal(text, XZ.decompress(File.open(LIVE_TEST_FILE, "rb"){|f| f.read}))
128 |   end
129 | 
130 | end
131 | 


--------------------------------------------------------------------------------
/test/test_tarball.rb:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | #--
 3 | # Basic liblzma-bindings for Ruby.
 4 | #
 5 | # Copyright © 2011-2018 Marvin Gülker et al.
 6 | #
 7 | # See AUTHORS for the full list of contributors.
 8 | #
 9 | # Permission is hereby granted, free of charge, to any person obtaining a
10 | # copy of this software and associated documentation files (the ‘Software’),
11 | # to deal in the Software without restriction, including without limitation
12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
13 | # and/or sell copies of the Software, and to permit persons to whom the Software
14 | # is furnished to do so, subject to the following conditions:
15 | #
16 | # The above copyright notice and this permission notice shall be included in all
17 | # copies or substantial portions of the Software.
18 | #
19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
25 | # THE SOFTWARE.
26 | #++
27 | 
28 | require "minitar"
29 | require_relative "common"
30 | 
31 | # Create XZ-compressed tarballs and unpack them with the system's
32 | # tar(1) utility, and vice-versa. This ensures our library interacts
33 | # with the environment as one expects it to.
34 | class TarballTest < Minitest::Test
35 | 
36 |   def test_pack_tarball
37 |     filename = TEST_DIR + "testtarball.tar.xz"
38 |     content  = File.read(TEST_DIR + "test-data/lorem_ipsum.txt")
39 | 
40 |     XZ::StreamWriter.open(filename) do |txz|
41 |       Dir.chdir(TEST_DIR) do # proper file path parts
42 |         Minitar.pack("test-data/lorem_ipsum.txt", txz)
43 |       end
44 |     end
45 | 
46 |     Dir.mktmpdir("testtarball") do |dir|
47 |       Dir.chdir(dir) do
48 |         system("tar -xJf '#{filename}'")
49 |         assert File.exist?("test-data/lorem_ipsum.txt"), "compressed file missing!"
50 |         assert_equal content, File.read("test-data/lorem_ipsum.txt")
51 |       end
52 |     end
53 |   ensure
54 |     File.unlink(filename) if File.exist?(filename)
55 |   end
56 | 
57 |   def test_unpack_tarball
58 |     filename = TEST_DIR + "testtarball.tar.xz"
59 |     content  = File.read(TEST_DIR + "test-data/lorem_ipsum.txt")
60 | 
61 |     Dir.chdir(TEST_DIR) do # proper file path parts
62 |       system("tar -cJf '#{filename}' test-data/lorem_ipsum.txt")
63 |     end
64 | 
65 |     Dir.mktmpdir("testtarball") do |dir|
66 |       Dir.chdir(dir) do
67 |         XZ::StreamReader.open(filename) do |txz|
68 |           Minitar.unpack(txz, ".")
69 |         end
70 | 
71 |         assert File.exist?("test-data/lorem_ipsum.txt"), "compressed file missing!"
72 |         assert_equal content, File.read("test-data/lorem_ipsum.txt")
73 |       end
74 |     end
75 |   ensure
76 |     File.unlink(filename) if File.exist?(filename)
77 |   end
78 | 
79 | end
80 | 


--------------------------------------------------------------------------------
/test/test_xz.rb:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | #--
  3 | # Basic liblzma-bindings for Ruby.
  4 | #
  5 | # Copyright © 2011-2018 Marvin Gülker et al.
  6 | #
  7 | # See AUTHORS for the full list of contributors.
  8 | #
  9 | # Permission is hereby granted, free of charge, to any person obtaining a
 10 | # copy of this software and associated documentation files (the ‘Software’),
 11 | # to deal in the Software without restriction, including without limitation
 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 13 | # and/or sell copies of the Software, and to permit persons to whom the Software
 14 | # is furnished to do so, subject to the following conditions:
 15 | #
 16 | # The above copyright notice and this permission notice shall be included in all
 17 | # copies or substantial portions of the Software.
 18 | #
 19 | # THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 24 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 25 | # THE SOFTWARE.
 26 | #++
 27 | 
 28 | require_relative "common"
 29 | 
 30 | class TestXZ < Minitest::Test
 31 | 
 32 |   TEST_XZ = "\3757zXZ\000\000\004\346\326\264F\002\000!\001\026\000\000\000t/" +
 33 |   "\345\243\340\000\023\000\020]\000\030\fB\222jg\274\016\32132a\326|\000\000" +
 34 |   "\000\017:\376\373\"1\270\266\000\001,\024\370\nm\003\037\266\363}\001\000" +
 35 |   "\000\000\000\004YZ"
 36 | 
 37 |   def test_decompress
 38 |     assert_equal('01234567890123456789', XZ.decompress(TEST_XZ))
 39 |     assert_equal(Encoding.default_external, XZ.decompress(TEST_XZ).encoding)
 40 | 
 41 |     str = XZ.decompress(TEST_XZ, external_encoding: "ISO-8859-1")
 42 |     assert_equal Encoding::ISO_8859_1, str.encoding
 43 | 
 44 |     str = XZ.decompress(TEST_XZ, external_encoding: "ISO-8859-1", internal_encoding: "UTF-16LE")
 45 |     assert_equal Encoding::UTF_16LE, str.encoding
 46 | 
 47 |     # Must specify an external encoding if transcoding is requested
 48 |     assert_raises(ArgumentError){XZ.decompress(TEST_XZ, internal_encoding: "UTF-8")}
 49 |   end
 50 | 
 51 |   def test_corrupt_archive
 52 |     corrupt_xz = TEST_XZ.dup
 53 |     corrupt_xz[20] = "\023"
 54 |     assert_raises(XZ::LZMAError) { XZ.decompress(corrupt_xz) }
 55 |   end
 56 | 
 57 |   def test_compress
 58 |     str = '01234567890123456789'
 59 |     tmp = XZ.compress(str)
 60 |     assert_equal("\3757zXZ".bytes.to_a, tmp[0, 5].bytes.to_a)
 61 | 
 62 |     # Ensure it interacts with upstream xz properly
 63 |     IO.popen("xzcat", "w+") do |io|
 64 |       io.write(tmp)
 65 |       io.close_write
 66 |       assert_equal(str, io.read)
 67 |     end
 68 |   end
 69 | 
 70 |   def test_compression_levels
 71 |     str = "Once upon a time, there was..."
 72 | 
 73 |     0.upto(9) do |i|
 74 |       [true, false].each do |extreme|
 75 |         compressed = XZ.compress(str, level: i, extreme: extreme)
 76 |         assert_equal(str, XZ.decompress(compressed))
 77 |       end
 78 |     end
 79 | 
 80 |     # Maximum compression level is 9.
 81 |     assert_raises(ArgumentError){XZ.compress("foo", level: 15)}
 82 |   end
 83 | 
 84 |   def test_roundtrip
 85 |     str = "Once upon a time, there was..."
 86 |     assert_equal(str, XZ.decompress(XZ.compress(str)))
 87 |   end
 88 | 
 89 |   def test_compress_file
 90 |     Tempfile.open('in') do |infile|
 91 |       infile.write('01234567890123456789')
 92 |       infile.close
 93 | 
 94 |       Tempfile.open('out') do |outfile|
 95 |         outfile.close
 96 | 
 97 |         XZ.compress_file(infile.path, outfile.path)
 98 | 
 99 |         outfile.open
100 |         assert_equal("\3757zXZ".bytes.to_a, outfile.read[0, 5].bytes.to_a)
101 |       end
102 |     end
103 |   end
104 | 
105 |   def test_decompress_file
106 |     Tempfile.open('in') do |infile|
107 |       infile.write(TEST_XZ)
108 |       infile.close
109 | 
110 |       Tempfile.open('out') do |outfile|
111 |         outfile.close
112 | 
113 |         XZ.decompress_file(infile.path, outfile.path)
114 | 
115 |         outfile.open
116 |         assert_equal('01234567890123456789', outfile.read)
117 |       end
118 |     end
119 |   end
120 | 
121 | end
122 | 
123 | 


--------------------------------------------------------------------------------