├── .gitignore
├── LICENSE
├── LICENSE.cc0.md
├── LICENSE.zlib.md
├── Makefile
├── README.md
├── VS2017
│   ├── .gitignore
│   ├── lz4ultra.sln
│   ├── lz4ultra.vcxproj
│   ├── lz4ultra.vcxproj.filters
│   └── lz4ultra.vcxproj.user
├── Xcode
│   └── lz4ultra.xcodeproj
│       └── project.pbxproj
└── src
    ├── dictionary.c
    ├── dictionary.h
    ├── expand_block.c
    ├── expand_block.h
    ├── expand_inmem.c
    ├── expand_inmem.h
    ├── expand_streaming.c
    ├── expand_streaming.h
    ├── format.h
    ├── frame.c
    ├── frame.h
    ├── lib.c
    ├── lib.h
    ├── libdivsufsort
    │   ├── .gitignore
    │   ├── CHANGELOG.md
    │   ├── CMakeLists.txt
    │   ├── CMakeModules
    │   │   ├── AppendCompilerFlags.cmake
    │   │   ├── CheckFunctionKeywords.cmake
    │   │   ├── CheckLFS.cmake
    │   │   ├── ProjectCPack.cmake
    │   │   └── cmake_uninstall.cmake.in
    │   ├── LICENSE
    │   ├── README.md
    │   ├── VERSION.cmake
    │   ├── examples
    │   │   ├── CMakeLists.txt
    │   │   ├── bwt.c
    │   │   ├── mksary.c
    │   │   ├── sasearch.c
    │   │   ├── suftest.c
    │   │   └── unbwt.c
    │   ├── include
    │   │   ├── CMakeLists.txt
    │   │   ├── config.h.cmake
    │   │   ├── divsufsort.h
    │   │   ├── divsufsort.h.cmake
    │   │   ├── divsufsort_config.h
    │   │   ├── divsufsort_private.h
    │   │   └── lfs.h.cmake
    │   ├── lib
    │   │   ├── CMakeLists.txt
    │   │   ├── divsufsort.c
    │   │   ├── divsufsort_utils.c
    │   │   ├── sssort.c
    │   │   └── trsort.c
    │   └── pkgconfig
    │       ├── CMakeLists.txt
    │       └── libdivsufsort.pc.cmake
    ├── lz4ultra.c
    ├── matchfinder.c
    ├── matchfinder.h
    ├── shrink_block.c
    ├── shrink_block.h
    ├── shrink_context.c
    ├── shrink_context.h
    ├── shrink_inmem.c
    ├── shrink_inmem.h
    ├── shrink_streaming.c
    ├── shrink_streaming.h
    ├── stream.c
    ├── stream.h
    └── xxhash
        ├── LICENSE.txt
        ├── xxhash.c
        └── xxhash.h
/.gitignore:
--------------------------------------------------------------------------------
1 | obj
2 | lz4ultra
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The lz4ultra code is available under the Zlib license, except for src/matchfinder.c which is placed under the Creative Commons CC0 license.
2 |
3 | Please consult LICENSE.zlib.md and LICENSE.cc0.md for more information.
4 |
--------------------------------------------------------------------------------
/LICENSE.cc0.md:
--------------------------------------------------------------------------------
1 | ## creative commons
2 |
3 | # CC0 1.0 Universal
4 |
5 | CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER.
6 |
7 | ### Statement of Purpose
8 |
9 | The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work").
10 |
11 | Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others.
12 |
13 | For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights.
14 |
15 | 1. __Copyright and Related Rights.__ A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following:
16 |
17 | i. the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work;
18 |
19 | ii. moral rights retained by the original author(s) and/or performer(s);
20 |
21 | iii. publicity and privacy rights pertaining to a person's image or likeness depicted in a Work;
22 |
23 | iv. rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below;
24 |
25 | v. rights protecting the extraction, dissemination, use and reuse of data in a Work;
26 |
27 | vi. database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and
28 |
29 | vii. other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof.
30 |
31 | 2. __Waiver.__ To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose.
32 |
33 | 3. __Public License Fallback.__ Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose. In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose.
34 |
35 | 4. __Limitations and Disclaimers.__
36 |
37 | a. No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document.
38 |
39 | b. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law.
40 |
41 | c. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work.
42 |
43 | d. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work.
44 |
--------------------------------------------------------------------------------
/LICENSE.zlib.md:
--------------------------------------------------------------------------------
1 | Copyright (c) 2019 Emmanuel Marty
2 |
3 | This software is provided 'as-is', without any express or implied warranty. In
4 | no event will the authors be held liable for any damages arising from the use of
5 | this software.
6 |
7 | Permission is granted to anyone to use this software for any purpose, including
8 | commercial applications, and to alter it and redistribute it freely, subject to
9 | the following restrictions:
10 |
11 | 1. The origin of this software must not be misrepresented; you must not claim
12 | that you wrote the original software. If you use this software in a product,
13 | an acknowledgment in the product documentation would be appreciated but is
14 | not required.
15 |
16 | 2. Altered source versions must be plainly marked as such, and must not be
17 | misrepresented as being the original software.
18 |
19 | 3. This notice may not be removed or altered from any source distribution.
20 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | CC=clang
2 | CFLAGS=-O3 -fomit-frame-pointer -Isrc/libdivsufsort/include -Isrc/xxhash -Isrc
3 | OBJDIR=obj
4 | LDFLAGS=
5 | STRIP=strip
6 |
7 | $(OBJDIR)/%.o: src/../%.c
8 | 	@mkdir -p '$(@D)'
9 | 	$(CC) $(CFLAGS) -c $< -o $@
10 |
11 | APP := lz4ultra
12 |
13 | OBJS := $(OBJDIR)/src/lz4ultra.o
14 | OBJS += $(OBJDIR)/src/dictionary.o
15 | OBJS += $(OBJDIR)/src/expand_block.o
16 | OBJS += $(OBJDIR)/src/expand_inmem.o
17 | OBJS += $(OBJDIR)/src/expand_streaming.o
18 | OBJS += $(OBJDIR)/src/frame.o
19 | OBJS += $(OBJDIR)/src/lib.o
20 | OBJS += $(OBJDIR)/src/matchfinder.o
21 | OBJS += $(OBJDIR)/src/shrink_block.o
22 | OBJS += $(OBJDIR)/src/shrink_context.o
23 | OBJS += $(OBJDIR)/src/shrink_inmem.o
24 | OBJS += $(OBJDIR)/src/shrink_streaming.o
25 | OBJS += $(OBJDIR)/src/stream.o
26 | OBJS += $(OBJDIR)/src/libdivsufsort/lib/divsufsort.o
27 | OBJS += $(OBJDIR)/src/libdivsufsort/lib/divsufsort_utils.o
28 | OBJS += $(OBJDIR)/src/libdivsufsort/lib/sssort.o
29 | OBJS += $(OBJDIR)/src/libdivsufsort/lib/trsort.o
30 | OBJS += $(OBJDIR)/src/xxhash/xxhash.o
31 |
32 | all: $(APP)
33 |
34 | $(APP): $(OBJS)
35 | 	@mkdir -p ../../bin/posix
36 | 	$(CC) $^ $(LDFLAGS) -o $(APP)
37 | 	$(STRIP) $(APP)
38 |
39 | clean:
40 | 	@rm -rf $(APP) $(OBJDIR)
41 |
42 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | lz4ultra -- Optimal LZ4 packer with faster decompression
2 | ========================================================
3 |
4 | lz4ultra is a command-line optimal compression utility that produces compressed files in the [lz4](https://github.com/lz4/lz4) format created by Yann Collet.
5 |
6 | The tool creates optimally compressed files, like lz4 in optimal compression mode ("lz4hc"), smallz4, blz4 and lz4x.
7 |
8 | With enwik9 (1,000,000,000 bytes):
9 |
10 |                                         Compr.size     Tokens        Decomp.time (μs, Core i7-6700)
11 |     lz4 1.9.2 -12 (favor ratio)         372,443,347    95,698,349    505,804
12 |     smallz4 1.5 -9                      371,680,328    93,172,985    348,018
13 |     lz4ultra 1.3.0 (favor ratio)        371,680,323    93,165,899    347,936
14 |     lz4 1.9.2 -12 --favor-decSpeed      377,175,400    92,080,802    457,141
15 |     lz4ultra 1.3.0 --favor-decSpeed     376,118,079    88,521,993    296,972
16 |
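As a reading aid for the table above, the two figures of interest (compressed size relative to the input, and relative decompression time) can be derived from any pair of rows. This is a minimal sketch; the helper names `sketch_ratio` and `sketch_speedup` are illustrative assumptions and are not part of lz4ultra.

```c
/* Compressed size as a fraction of the original size (hypothetical helper). */
double sketch_ratio(double compressed_bytes, double original_bytes) {
    return compressed_bytes / original_bytes;
}

/* How many times faster the candidate decodes than the baseline,
 * given both decompression times in the same unit (hypothetical helper). */
double sketch_speedup(double baseline_time, double candidate_time) {
    return baseline_time / candidate_time;
}
```

For example, `sketch_ratio(371680323.0, 1e9)` gives about 0.372 for lz4ultra in favor-ratio mode, and `sketch_speedup(457141.0, 296972.0)` gives about 1.54 for lz4ultra against lz4 when both use `--favor-decSpeed`.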
17 | The produced files are meant to be decompressed with the lz4 tool and library. While lz4ultra includes a decompressor, it is mostly meant to verify the output of the compressor and isn't as optimized as Yann Collet's lz4 proper.
18 |
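Because the output is standard LZ4, any conforming decoder can read it. The sketch below illustrates the LZ4 block layout that makes this possible: a token byte whose high nibble is the literal count and whose low nibble is the match length minus 4, optional 255-terminated length extensions, and a 2-byte little-endian match offset. It is a simplified illustration under those assumptions, not lz4ultra's decompressor and not liblz4's; `sketch_lz4_decode_block` and `read_extended_len` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Read an extended length: keep adding bytes while each byte equals 255.
 * Returns 0 on success, -1 if the input runs out. */
static int read_extended_len(const unsigned char **pp, const unsigned char *end, size_t *len) {
    unsigned char b;
    do {
        if (*pp >= end) return -1;
        b = *(*pp)++;
        *len += b;
    } while (b == 255);
    return 0;
}

/* Illustrative LZ4 block decoder (hypothetical helper).
 * Returns the number of bytes written to dst, or -1 on malformed input. */
int sketch_lz4_decode_block(const unsigned char *src, size_t src_len,
                            unsigned char *dst, size_t dst_cap) {
    const unsigned char *in = src, *in_end = src + src_len;
    size_t out = 0;

    while (in < in_end) {
        unsigned token = *in++;
        size_t lit = token >> 4;      /* high nibble: literal count */
        size_t mlen = token & 0x0f;   /* low nibble: match length - 4 */

        if (lit == 15 && read_extended_len(&in, in_end, &lit) < 0) return -1;
        if (lit > (size_t)(in_end - in) || lit > dst_cap - out) return -1;
        memcpy(dst + out, in, lit);
        in += lit;
        out += lit;

        if (in == in_end) break;      /* the last sequence carries literals only */

        if (in_end - in < 2) return -1;
        size_t offset = (size_t)in[0] | ((size_t)in[1] << 8);   /* little-endian */
        in += 2;
        if (mlen == 15 && read_extended_len(&in, in_end, &mlen) < 0) return -1;
        mlen += 4;                    /* minimum match length */
        if (offset == 0 || offset > out || mlen > dst_cap - out) return -1;

        /* Copy byte by byte: source and destination overlap when offset < mlen. */
        for (size_t i = 0; i < mlen; i++, out++) dst[out] = dst[out - offset];
    }
    return (int)out;
}
```

The byte-by-byte match copy is chosen for clarity; production decoders use wider copies with extra bounds margins, which is why the real lz4 library remains the recommended way to decompress.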
19 | The tool defaults to 4 MB blocks with inter-block dependencies. Command-line switches can select any of the LZ4 block sizes (64 KB to 4 MB), the legacy LZ4 encoding with 8 MB blocks, or independent blocks.
20 |
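The block sizes mentioned above correspond to the "block maximum size" IDs defined by the LZ4 frame format, where IDs 4 to 7 select 64 KB, 256 KB, 1 MB and 4 MB, and the ID is stored in bits 6-4 of the frame descriptor's BD byte. The sketch below shows that mapping. It follows the published LZ4 frame format rather than lz4ultra's own `frame.c` (not shown here), and every `sketch_` / `SKETCH_` name is a hypothetical label.

```c
#include <stdint.h>

#define SKETCH_LZ4_FRAME_MAGIC        0x184D2204u  /* standard LZ4 frame */
#define SKETCH_LZ4_LEGACY_FRAME_MAGIC 0x184C2102u  /* legacy frame, fixed 8 MB blocks */

/* Map a block maximum size ID (4..7) to a size in bytes.
 * Returns 0 for IDs the frame format does not define. */
uint32_t sketch_block_max_size(int id) {
    if (id < 4 || id > 7) return 0;
    return (uint32_t)1 << (8 + 2 * id);  /* 4: 64 KB, 5: 256 KB, 6: 1 MB, 7: 4 MB */
}

/* Build the BD byte of a frame descriptor: the ID sits in bits 6-4,
 * all other bits are reserved and left at zero. */
uint8_t sketch_bd_byte(int id) {
    return (uint8_t)((id & 7) << 4);
}
```

With the default 4 MB blocks, `sketch_bd_byte(7)` yields `0x70`; the legacy encoding has no BD byte at all, since its block size is fixed.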
21 | lz4ultra is developed by Emmanuel Marty with the help of spke.
22 |
--------------------------------------------------------------------------------
/VS2017/.gitignore:
--------------------------------------------------------------------------------
1 | .vs
2 | Debug
3 | Release
4 | bin
5 |
--------------------------------------------------------------------------------
/VS2017/lz4ultra.sln:
--------------------------------------------------------------------------------
1 |
2 | Microsoft Visual Studio Solution File, Format Version 12.00
3 | # Visual Studio 15
4 | VisualStudioVersion = 15.0.28307.489
5 | MinimumVisualStudioVersion = 10.0.40219.1
6 | Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "lz4ultra", "lz4ultra.vcxproj", "{3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}"
7 | EndProject
8 | Global
9 | GlobalSection(SolutionConfigurationPlatforms) = preSolution
10 | Debug|x64 = Debug|x64
11 | Debug|x86 = Debug|x86
12 | Release|x64 = Release|x64
13 | Release|x86 = Release|x86
14 | EndGlobalSection
15 | GlobalSection(ProjectConfigurationPlatforms) = postSolution
16 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Debug|x64.ActiveCfg = Debug|x64
17 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Debug|x64.Build.0 = Debug|x64
18 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Debug|x86.ActiveCfg = Debug|Win32
19 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Debug|x86.Build.0 = Debug|Win32
20 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Release|x64.ActiveCfg = Release|x64
21 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Release|x64.Build.0 = Release|x64
22 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Release|x86.ActiveCfg = Release|Win32
23 | {3F30FEE8-63C5-4D39-A175-EDD7EA93E9B8}.Release|x86.Build.0 = Release|Win32
24 | EndGlobalSection
25 | GlobalSection(SolutionProperties) = preSolution
26 | HideSolutionNode = FALSE
27 | EndGlobalSection
28 | GlobalSection(ExtensibilityGlobals) = postSolution
29 | SolutionGuid = {A1E1655C-AA9F-41F0-80C9-18DD0B859D7C}
30 | EndGlobalSection
31 | EndGlobal
32 |
--------------------------------------------------------------------------------
/VS2017/lz4ultra.vcxproj.filters:
--------------------------------------------------------------------------------
1 | {4FC737F1-C7A5-4376-A066-2A32D752A2FF}  cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx
2 | {93995380-89BD-4b04-88EB-625FBE52EBFB}  h;hh;hpp;hxx;hm;inl;inc;ipp;xsd
3 | {67DA6AB6-F800-4c08-8B7A-83BB121AAD01}  rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms
4 | {a858de66-bef8-44b2-aaba-99ab69a3a806}
5 | {8ffd119e-b205-4e17-8c23-b945711b5e16}
6 | {7b58e9ea-8419-4a92-b23b-52a66da2bca3}
7 | {178a6577-0784-4aa4-8a35-9c443a088e23}
8 | Filters in use: Fichiers d'en-tête; Fichiers sources; Fichiers sources\libdivsufsort\include; Fichiers sources\libdivsufsort\lib; Fichiers sources\xxhash
--------------------------------------------------------------------------------
/VS2017/lz4ultra.vcxproj.user:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 | $(TargetPath)
5 | -c -v corpus/zxspectrum/graphics/bfox-dont_go_away_(2010).mg1 packed_lz4ultra/zxspectrum/graphics/bfox-dont_go_away_(2010).mg.lzs
6 | WindowsLocalDebugger
7 | $(ProjectDir)..\
8 |
9 |
10 | $(TargetPath)
11 | -c -v corpus/zxspectrum/graphics/bfox-dont_go_away_(2010).mg1 packed_lz4ultra/zxspectrum/graphics/bfox-dont_go_away_(2010).mg.lzs
12 | WindowsLocalDebugger
13 | $(ProjectDir)..\
14 |
15 |
16 | $(TargetPath)
17 | -c -v corpus/zxspectrum/graphics/bfox-dont_go_away_(2010).mg1 packed_lz4ultra/zxspectrum/graphics/bfox-dont_go_away_(2010).mg.lzs
18 | WindowsLocalDebugger
19 | $(ProjectDir)..\
20 |
21 |
22 | $(TargetPath)
23 | -c -v corpus/zxspectrum/graphics/bfox-dont_go_away_(2010).mg1 packed_lz4ultra/zxspectrum/graphics/bfox-dont_go_away_(2010).mg.lzs
24 | WindowsLocalDebugger
25 | $(ProjectDir)..\
26 |
27 |
--------------------------------------------------------------------------------
/src/dictionary.c:
--------------------------------------------------------------------------------
1 | /*
2 | * dictionary.c - dictionary implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 | #include <stdio.h>
33 | #include <stdlib.h>
34 | #include "dictionary.h"
35 | #include "format.h"
36 | #include "lib.h"
37 |
38 | /**
39 | * Load dictionary contents
40 | *
41 | * @param pszDictionaryFilename name of dictionary file, or NULL for none
42 | * @param ppDictionaryData pointer to returned dictionary contents, or NULL for none
43 | * @param pDictionaryDataSize pointer to returned size of dictionary contents, or 0
44 | *
45 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
46 | */
47 | int lz4ultra_dictionary_load(const char *pszDictionaryFilename, void **ppDictionaryData, int *pDictionaryDataSize) {
48 |    unsigned char *pDictionaryData = NULL;
49 |    int nDictionaryDataSize = 0;
50 |
51 |    if (pszDictionaryFilename) {
52 |       pDictionaryData = (unsigned char *)malloc(HISTORY_SIZE);
53 |       if (!pDictionaryData) {
54 |          return LZ4ULTRA_ERROR_MEMORY;
55 |       }
56 |
57 |       FILE *f_dictionary = fopen(pszDictionaryFilename, "rb");
58 |       if (!f_dictionary) {
59 |          free(pDictionaryData);
60 |          pDictionaryData = NULL;
61 |
62 |          return LZ4ULTRA_ERROR_DICTIONARY;
63 |       }
64 |
65 |       fseek(f_dictionary, 0, SEEK_END);
66 | #ifdef _WIN32
67 |       __int64 nDictionaryFileSize = _ftelli64(f_dictionary);
68 | #else
69 |       off_t nDictionaryFileSize = ftello(f_dictionary);
70 | #endif
71 |       if (nDictionaryFileSize > HISTORY_SIZE) {
72 |          /* Use the last HISTORY_SIZE bytes of the dictionary */
73 |          fseek(f_dictionary, -HISTORY_SIZE, SEEK_END);
74 |       }
75 |       else {
76 |          fseek(f_dictionary, 0, SEEK_SET);
77 |       }
78 |
79 |       nDictionaryDataSize = (int)fread(pDictionaryData, 1, HISTORY_SIZE, f_dictionary);
80 |       if (nDictionaryDataSize < 0)
81 |          nDictionaryDataSize = 0;
82 |
83 |       fclose(f_dictionary);
84 |       f_dictionary = NULL;
85 |    }
86 |
87 |    *ppDictionaryData = pDictionaryData;
88 |    *pDictionaryDataSize = nDictionaryDataSize;
89 |    return LZ4ULTRA_OK;
90 | }
91 |
92 | /**
93 | * Free dictionary contents
94 | *
95 | * @param ppDictionaryData pointer to pointer to dictionary contents
96 | */
97 | void lz4ultra_dictionary_free(void **ppDictionaryData) {
98 |    if (*ppDictionaryData) {
99 |       free(*ppDictionaryData);
100 |       *ppDictionaryData = NULL;
101 |    }
102 | }
103 |
--------------------------------------------------------------------------------
/src/dictionary.h:
--------------------------------------------------------------------------------
1 | /*
2 | * dictionary.h - dictionary definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _DICTIONARY_H
34 | #define _DICTIONARY_H
35 |
36 | /**
37 | * Load dictionary contents
38 | *
39 | * @param pszDictionaryFilename name of dictionary file, or NULL for none
40 | * @param ppDictionaryData pointer to returned dictionary contents, or NULL for none
41 | * @param pDictionaryDataSize pointer to returned size of dictionary contents, or 0
42 | *
43 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
44 | */
45 | int lz4ultra_dictionary_load(const char *pszDictionaryFilename, void **ppDictionaryData, int *pDictionaryDataSize);
46 |
47 | /**
48 | * Free dictionary contents
49 | *
50 | * @param ppDictionaryData pointer to pointer to dictionary contents
51 | */
52 | void lz4ultra_dictionary_free(void **ppDictionaryData);
53 |
54 | #endif /* _DICTIONARY_H */
55 |
--------------------------------------------------------------------------------
/src/expand_block.c:
--------------------------------------------------------------------------------
1 | /*
2 | * expand_block.c - block decompressor implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | /* This code is mostly here to verify the compressor's output. You should use the real, optimized lz4 decompressor to decompress your data. */
34 |
35 | #include <stdlib.h>
36 | #include <string.h>
37 | #include "format.h"
38 | #include "expand_block.h"
39 |
40 | #if defined(__GNUC__) || defined(__clang__)
41 | #define likely(x) __builtin_expect(!!(x), 1)
42 | #define unlikely(x) __builtin_expect(!!(x), 0)
43 | #else
44 | #define likely(x) (x)
45 | #define unlikely(x) (x)
46 | #endif
47 |
48 | #define LZ4ULTRA_DECOMPRESSOR_BUILD_LEN(__len) { \
49 |    unsigned int byte; \
50 |    do { \
51 |       if (unlikely(pInBlock >= pInBlockEnd)) return -1; \
52 |       byte = (unsigned int)*pInBlock++; \
53 |       __len += byte; \
54 |    } while (unlikely(byte == 255)); \
55 | }
56 |
57 | /**
58 | * Decompress one data block
59 | *
60 | * @param pInBlock pointer to compressed data
61 | * @param nBlockSize size of compressed data, in bytes
62 | * @param pOutData pointer to output decompression buffer (previously decompressed bytes + room for decompressing this block)
63 | * @param nOutDataOffset starting index of where to store decompressed bytes in output buffer (and size of previously decompressed bytes)
64 | * @param nBlockMaxSize maximum number of bytes that may be written at nOutDataOffset for this block
65 | *
66 | * @return size of decompressed data in bytes, or -1 for error
67 | */
68 | int lz4ultra_decompressor_expand_block(const unsigned char *pInBlock, int nBlockSize, unsigned char *pOutData, int nOutDataOffset, int nBlockMaxSize) {
69 |    const unsigned char *pInBlockEnd = pInBlock + nBlockSize;
70 |    unsigned char *pCurOutData = pOutData + nOutDataOffset;
71 |    const unsigned char *pOutDataEnd = pCurOutData + nBlockMaxSize;
72 |    const unsigned char *pOutDataFastEnd = pOutDataEnd - 18;
73 |
74 |    while (likely(pInBlock < pInBlockEnd)) {
75 |       const unsigned int token = (unsigned int)*pInBlock++;
76 |       unsigned int nLiterals = ((token & 0xf0) >> 4);
77 |
78 |       if (nLiterals != LITERALS_RUN_LEN && pCurOutData <= pOutDataFastEnd && (pInBlock + 16) <= pInBlockEnd) {
79 |          memcpy(pCurOutData, pInBlock, 16);
80 |       }
81 |       else {
82 |          if (likely(nLiterals == LITERALS_RUN_LEN))
83 |             LZ4ULTRA_DECOMPRESSOR_BUILD_LEN(nLiterals);
84 |
85 |          if (unlikely((pInBlock + nLiterals) > pInBlockEnd)) return -1;
86 |          if (unlikely((pCurOutData + nLiterals) > pOutDataEnd)) return -1;
87 |
88 |          memcpy(pCurOutData, pInBlock, nLiterals);
89 |       }
90 |
91 |       pInBlock += nLiterals;
92 |       pCurOutData += nLiterals;
93 |
94 |       if (likely((pInBlock + 2) <= pInBlockEnd)) {
95 |          unsigned int nMatchOffset;
96 |
97 |          nMatchOffset = (unsigned int)*pInBlock++;
98 |          nMatchOffset |= ((unsigned int)*pInBlock++) << 8;
99 |
100 |          unsigned int nMatchLen = (token & 0x0f);
101 |
102 |          nMatchLen += MIN_MATCH_SIZE;
103 |          if (nMatchLen != (MATCH_RUN_LEN + MIN_MATCH_SIZE) && nMatchOffset >= 8 && pCurOutData <= pOutDataFastEnd) {
104 |             const unsigned char *pSrc = pCurOutData - nMatchOffset;
105 |
106 |             if (unlikely(pSrc < pOutData)) return -1;
107 |
108 |             memcpy(pCurOutData, pSrc, 8);
109 |             memcpy(pCurOutData + 8, pSrc + 8, 8);
110 |             memcpy(pCurOutData + 16, pSrc + 16, 2);
111 |
112 |             pCurOutData += nMatchLen;
113 |          }
114 |          else {
115 |             if (likely(nMatchLen == (MATCH_RUN_LEN + MIN_MATCH_SIZE)))
116 |                LZ4ULTRA_DECOMPRESSOR_BUILD_LEN(nMatchLen);
117 |
118 |             if (unlikely((pCurOutData + nMatchLen) > pOutDataEnd)) return -1;
119 |
120 |             const unsigned char *pSrc = pCurOutData - nMatchOffset;
121 |             if (unlikely(pSrc < pOutData)) return -1;
122 |
123 |             if (nMatchOffset >= 16 && (pCurOutData + nMatchLen) <= pOutDataFastEnd) {
124 |                const unsigned char *pCopySrc = pSrc;
125 |                unsigned char *pCopyDst = pCurOutData;
126 |                const unsigned char *pCopyEndDst = pCurOutData + nMatchLen;
127 |
128 |                do {
129 |                   memcpy(pCopyDst, pCopySrc, 16);
130 |                   pCopySrc += 16;
131 |                   pCopyDst += 16;
132 |                } while (pCopyDst < pCopyEndDst);
133 |
134 |                pCurOutData += nMatchLen;
135 |             }
136 |             else {
137 |                while (nMatchLen--) {
138 |                   *pCurOutData++ = *pSrc++;
139 |                }
140 |             }
141 |          }
142 |       }
143 |    }
144 |
145 |    return (int)(pCurOutData - (pOutData + nOutDataOffset));
146 | }
147 |
--------------------------------------------------------------------------------
/src/expand_block.h:
--------------------------------------------------------------------------------
1 | /*
2 | * expand_block.h - block decompressor definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _EXPAND_BLOCK_H
34 | #define _EXPAND_BLOCK_H
35 |
36 | /**
37 | * Decompress one data block
38 | *
39 | * @param pInBlock pointer to compressed data
40 | * @param nBlockSize size of compressed data, in bytes
41 | * @param pOutData pointer to output decompression buffer (previously decompressed bytes + room for decompressing this block)
42 | * @param nOutDataOffset starting index of where to store decompressed bytes in output buffer (and size of previously decompressed bytes)
43 | * @param nBlockMaxSize total size of output decompression buffer, in bytes
44 | *
45 | * @return size of decompressed data in bytes, or -1 for error
46 | */
47 | int lz4ultra_decompressor_expand_block(const unsigned char *pInBlock, int nBlockSize, unsigned char *pOutData, int nOutDataOffset, int nBlockMaxSize);
48 |
49 | #endif /* _EXPAND_BLOCK_H */
50 |
--------------------------------------------------------------------------------
/src/expand_inmem.c:
--------------------------------------------------------------------------------
1 | /*
2 | * expand_inmem.c - in-memory decompression implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #include <stdlib.h>
34 | #include <string.h>
35 | #include "expand_inmem.h"
36 | #include "lib.h"
37 | #include "frame.h"
38 |
39 | /**
40 | * Get maximum decompressed size of compressed data
41 | *
42 | * @param pFileData compressed data
43 | * @param nFileSize compressed size in bytes
44 | *
45 | * @return maximum decompressed size, or (size_t)-1 for error
46 | */
47 | size_t lz4ultra_inmem_get_max_decompressed_size(const unsigned char *pFileData, size_t nFileSize) {
48 | const unsigned char *pCurFileData = pFileData;
49 | const unsigned char *pEndFileData = pCurFileData + nFileSize;
50 | int nBlockMaxCode = 0;
51 | unsigned int nFlags = 0;
52 | int nBlockMaxBits, nBlockMaxSize;
53 | size_t nMaxDecompressedSize = 0;
54 |
55 | /* Check header */
56 | if ((pCurFileData + LZ4ULTRA_HEADER_SIZE) > pEndFileData)
57 | return -1;
58 |
59 | int nExtraHeaderSize = lz4ultra_check_header(pCurFileData, LZ4ULTRA_HEADER_SIZE);
60 | if (nExtraHeaderSize < 0)
61 | return -1;
62 |
63 | if (((pCurFileData + LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize) > pEndFileData) ||
64 | lz4ultra_decode_header(pCurFileData, LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize, &nBlockMaxCode, &nFlags) != LZ4ULTRA_DECODE_OK)
65 | return -1;
66 |
67 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES)
68 | nBlockMaxBits = 23;
69 | else
70 | nBlockMaxBits = 8 + (nBlockMaxCode << 1);
71 | nBlockMaxSize = 1 << nBlockMaxBits;
72 |
73 | pCurFileData += (LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize);
74 |
75 | while (pCurFileData < pEndFileData) {
76 | unsigned int nBlockDataSize = 0;
77 | int nIsUncompressed = 0;
78 |
79 | /* Decode frame header */
80 | if ((pCurFileData + LZ4ULTRA_FRAME_SIZE) > pEndFileData ||
81 | lz4ultra_decode_frame(pCurFileData, LZ4ULTRA_FRAME_SIZE, nFlags, &nBlockDataSize, &nIsUncompressed) != LZ4ULTRA_DECODE_OK)
82 | return -1;
83 | pCurFileData += LZ4ULTRA_FRAME_SIZE;
84 |
85 | if (!nBlockDataSize)
86 | break;
87 |
88 | /* Add one potentially full block to the decompressed size */
89 | nMaxDecompressedSize += nBlockMaxSize;
90 |
91 | if ((pCurFileData + nBlockDataSize) > pEndFileData)
92 | return -1;
93 |
94 | pCurFileData += nBlockDataSize;
95 | }
96 |
97 | return nMaxDecompressedSize;
98 | }
99 |
100 | /**
101 | * Decompress data in memory
102 | *
103 | * @param pFileData compressed data
104 | * @param pOutBuffer buffer for decompressed data
105 | * @param nFileSize compressed size in bytes
106 | * @param nMaxOutBufferSize maximum capacity of decompression buffer
107 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
108 | *
109 | * @return actual decompressed size, or -1 for error
110 | */
111 | size_t lz4ultra_decompress_inmem(const unsigned char *pFileData, unsigned char *pOutBuffer, size_t nFileSize, size_t nMaxOutBufferSize, unsigned int nFlags) {
112 | const unsigned char *pCurFileData = pFileData;
113 | const unsigned char *pEndFileData = pCurFileData + nFileSize;
114 | unsigned char *pCurOutBuffer = pOutBuffer;
115 | const unsigned char *pEndOutBuffer = pCurOutBuffer + nMaxOutBufferSize;
116 | int nBlockMaxCode = 0;
117 | int nBlockMaxBits, nBlockMaxSize, nPreviousBlockSize;
118 |
119 | if (nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) {
120 | return (size_t)lz4ultra_decompressor_expand_block(pFileData, (int)nFileSize - 2 /* EOD marker */, pOutBuffer, 0, (int)nMaxOutBufferSize);
121 | }
122 |
123 | /* Check header */
124 | if ((pCurFileData + LZ4ULTRA_HEADER_SIZE) > pEndFileData)
125 | return -1;
126 |
127 | int nExtraHeaderSize = lz4ultra_check_header(pCurFileData, LZ4ULTRA_HEADER_SIZE);
128 | if (nExtraHeaderSize < 0)
129 | return -1;
130 |
131 | if (((pCurFileData + LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize) > pEndFileData) ||
132 | lz4ultra_decode_header(pCurFileData, LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize, &nBlockMaxCode, &nFlags) != LZ4ULTRA_DECODE_OK)
133 | return -1;
134 |
135 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES)
136 | nBlockMaxBits = 23;
137 | else
138 | nBlockMaxBits = 8 + (nBlockMaxCode << 1);
139 | nBlockMaxSize = 1 << nBlockMaxBits;
140 |
141 | pCurFileData += (LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize);
142 | nPreviousBlockSize = 0;
143 |
144 | while (pCurFileData < pEndFileData) {
145 | unsigned int nBlockDataSize = 0;
146 | int nIsUncompressed = 0;
147 |
148 | /* Decode frame header */
149 | if ((pCurFileData + LZ4ULTRA_FRAME_SIZE) > pEndFileData ||
150 | lz4ultra_decode_frame(pCurFileData, LZ4ULTRA_FRAME_SIZE, nFlags, &nBlockDataSize, &nIsUncompressed) != LZ4ULTRA_DECODE_OK)
151 | return -1;
152 | pCurFileData += LZ4ULTRA_FRAME_SIZE;
153 |
154 | if (!nBlockDataSize)
155 | break;
156 |
157 | if (!nIsUncompressed) {
158 | int nDecompressedSize;
159 |
160 | /* Decompress block */
161 | if ((pCurFileData + nBlockDataSize) > pEndFileData)
162 | return -1;
163 |
164 | if ((nFlags & LZ4ULTRA_FLAG_INDEP_BLOCKS) || (nPreviousBlockSize == 0))
165 | nDecompressedSize = lz4ultra_decompressor_expand_block(pCurFileData, nBlockDataSize, pCurOutBuffer, 0, (int)(pEndOutBuffer - pCurOutBuffer));
166 | else
167 | nDecompressedSize = lz4ultra_decompressor_expand_block(pCurFileData, nBlockDataSize, pCurOutBuffer - nPreviousBlockSize, nPreviousBlockSize, (int)(pEndOutBuffer - pCurOutBuffer + nPreviousBlockSize));
168 | if (nDecompressedSize < 0)
169 | return -1;
170 |
171 | pCurOutBuffer += nDecompressedSize;
172 | nPreviousBlockSize = nDecompressedSize;
173 | }
174 | else {
175 | /* Copy uncompressed block */
176 | if ((pCurFileData + nBlockDataSize) > pEndFileData)
177 | return -1;
178 | if ((pCurOutBuffer + nBlockDataSize) > pEndOutBuffer)
179 | return -1;
180 | memcpy(pCurOutBuffer, pCurFileData, nBlockDataSize);
181 | pCurOutBuffer += nBlockDataSize;
182 | }
183 |
184 | pCurFileData += nBlockDataSize;
185 | }
186 |
187 | return (size_t)(pCurOutBuffer - pOutBuffer);
188 | }
189 |
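Both functions in this file derive the maximum block size from the header's block size code as 8 + (code << 1) bits, or a fixed 23 bits when LZ4ULTRA_FLAG_LEGACY_FRAMES is set. The mapping can be sketched on its own as below. The function name block_max_size_from_code and its nIsLegacy parameter are hypothetical, introduced only for illustration; they are not part of lz4ultra.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an lz4ultra function): convert a block size code
 * into a maximum block size in bytes, following the expression used above. */
size_t block_max_size_from_code(int nBlockMaxCode, int nIsLegacy) {
    int nBlockMaxBits;

    /* The header documentation gives 4-7 as the valid code range;
     * the code is ignored for legacy frames. */
    assert(nIsLegacy || (nBlockMaxCode >= 4 && nBlockMaxCode <= 7));

    if (nIsLegacy)
        nBlockMaxBits = 23;                        /* fixed 8 MB blocks */
    else
        nBlockMaxBits = 8 + (nBlockMaxCode << 1);  /* 16, 18, 20 or 22 bits */

    return (size_t)1 << nBlockMaxBits;
}
```

Under these assumptions, codes 4 through 7 give 64 KB, 256 KB, 1 MB and 4 MB, and legacy frames always give 8 MB.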
--------------------------------------------------------------------------------
/src/expand_inmem.h:
--------------------------------------------------------------------------------
1 | /*
2 | * expand_inmem.h - in-memory decompression definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _EXPAND_INMEM_H
34 | #define _EXPAND_INMEM_H
35 |
36 | #include <stdlib.h>
37 |
38 | /**
39 | * Get maximum decompressed size of compressed data
40 | *
41 | * @param pFileData compressed data
42 | * @param nFileSize compressed size in bytes
43 | *
44 | * @return maximum decompressed size, or (size_t)-1 for error
45 | */
46 | size_t lz4ultra_inmem_get_max_decompressed_size(const unsigned char *pFileData, size_t nFileSize);
47 |
48 | /**
49 | * Decompress data in memory
50 | *
51 | * @param pFileData compressed data
52 | * @param pOutBuffer buffer for decompressed data
53 | * @param nFileSize compressed size in bytes
54 | * @param nMaxOutBufferSize maximum capacity of decompression buffer
55 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
56 | *
57 | * @return actual decompressed size, or -1 for error
58 | */
59 | size_t lz4ultra_decompress_inmem(const unsigned char *pFileData, unsigned char *pOutBuffer, size_t nFileSize, size_t nMaxOutBufferSize, unsigned int nFlags);
60 |
61 | #endif /* _EXPAND_INMEM_H */
62 |
--------------------------------------------------------------------------------
/src/expand_streaming.c:
--------------------------------------------------------------------------------
1 | /*
2 | * expand_streaming.c - streaming decompression implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #include <stdlib.h>
34 | #include <string.h>
35 | #include "expand_streaming.h"
36 | #include "format.h"
37 | #include "frame.h"
38 | #include "lib.h"
39 |
40 | /*-------------- File API -------------- */
41 |
42 | /**
43 | * Decompress file
44 | *
45 | * @param pszInFilename name of input(compressed) file to decompress
46 | * @param pszOutFilename name of output(decompressed) file to generate
47 | * @param pszDictionaryFilename name of dictionary file, or NULL for none
48 | * @param nFlags compression flags (LZ4ULTRA_FLAG_RAW_BLOCK to decompress a raw block, or 0)
49 | * @param pOriginalSize pointer to returned output(decompressed) size, updated when this function is successful
50 | * @param pCompressedSize pointer to returned input(compressed) size, updated when this function is successful
51 | *
52 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
53 | */
54 | lz4ultra_status_t lz4ultra_decompress_file(const char *pszInFilename, const char *pszOutFilename, const char *pszDictionaryFilename, const unsigned int nFlags,
55 | long long *pOriginalSize, long long *pCompressedSize) {
56 | lz4ultra_stream_t inStream, outStream;
57 | void *pDictionaryData = NULL;
58 | int nDictionaryDataSize = 0;
59 | lz4ultra_status_t nStatus;
60 |
61 | if (lz4ultra_filestream_open(&inStream, pszInFilename, "rb") < 0) {
62 | return LZ4ULTRA_ERROR_SRC;
63 | }
64 |
65 | if (lz4ultra_filestream_open(&outStream, pszOutFilename, "wb") < 0) {
66 | inStream.close(&inStream);
67 | return LZ4ULTRA_ERROR_DST;
68 | }
69 |
70 | nStatus = lz4ultra_dictionary_load(pszDictionaryFilename, &pDictionaryData, &nDictionaryDataSize);
71 | if (nStatus) {
72 | outStream.close(&outStream);
73 | inStream.close(&inStream);
74 |
75 | return nStatus;
76 | }
77 |
78 | nStatus = lz4ultra_decompress_stream(&inStream, &outStream, pDictionaryData, nDictionaryDataSize, nFlags, pOriginalSize, pCompressedSize);
79 |
80 | lz4ultra_dictionary_free(&pDictionaryData);
81 | outStream.close(&outStream);
82 | inStream.close(&inStream);
83 |
84 | return nStatus;
85 | }
86 |
87 | /*-------------- Streaming API -------------- */
88 |
89 | /**
90 | * Decompress stream
91 | *
92 | * @param pInStream input(compressed) stream to decompress
93 | * @param pOutStream output(decompressed) stream to write to
94 | * @param pDictionaryData dictionary contents, or NULL for none
95 | * @param nDictionaryDataSize size of dictionary contents, or 0
96 | * @param pOriginalSize pointer to returned output(decompressed) size, updated when this function is successful
97 | * @param pCompressedSize pointer to returned input(compressed) size, updated when this function is successful
98 | *
99 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
100 | */
101 | lz4ultra_status_t lz4ultra_decompress_stream(lz4ultra_stream_t *pInStream, lz4ultra_stream_t *pOutStream, const void *pDictionaryData, int nDictionaryDataSize, unsigned int nFlags,
102 | long long *pOriginalSize, long long *pCompressedSize) {
103 | long long nOriginalSize = 0LL;
104 | long long nCompressedSize = 0LL;
105 | int nBlockMaxCode = 7;
106 | unsigned char cFrameData[16];
107 | unsigned char *pInBlock;
108 | unsigned char *pOutData;
109 |
110 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) == 0) {
111 | memset(cFrameData, 0, 16);
112 |
113 | if (pInStream->read(pInStream, cFrameData, LZ4ULTRA_HEADER_SIZE) != LZ4ULTRA_HEADER_SIZE) {
114 | return LZ4ULTRA_ERROR_SRC;
115 | }
116 |
117 | int nExtraHeaderSize = lz4ultra_check_header(cFrameData, LZ4ULTRA_HEADER_SIZE);
118 | if (nExtraHeaderSize < 0)
119 | return LZ4ULTRA_ERROR_FORMAT;
120 |
121 | if (pInStream->read(pInStream, cFrameData + LZ4ULTRA_HEADER_SIZE, nExtraHeaderSize) != nExtraHeaderSize) {
122 | return LZ4ULTRA_ERROR_SRC;
123 | }
124 |
125 | int nSuccess = lz4ultra_decode_header(cFrameData, LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize, &nBlockMaxCode, &nFlags);
126 | if (nSuccess < 0) {
127 | if (nSuccess == LZ4ULTRA_DECODE_ERR_SUM)
128 | return LZ4ULTRA_ERROR_CHECKSUM;
129 | else
130 | return LZ4ULTRA_ERROR_FORMAT;
131 | }
132 |
133 | nCompressedSize += (long long)(LZ4ULTRA_HEADER_SIZE + nExtraHeaderSize);
134 | }
135 |
136 | int nBlockMaxBits;
137 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES)
138 | nBlockMaxBits = 23;
139 | else
140 | nBlockMaxBits = 8 + (nBlockMaxCode << 1);
141 | int nBlockMaxSize = 1 << nBlockMaxBits;
142 |
143 | pInBlock = (unsigned char*)malloc(nBlockMaxSize);
144 | if (!pInBlock) {
145 | return LZ4ULTRA_ERROR_MEMORY;
146 | }
147 |
148 | pOutData = (unsigned char*)malloc(nBlockMaxSize + HISTORY_SIZE);
149 | if (!pOutData) {
150 | free(pInBlock);
151 | pInBlock = NULL;
152 |
153 | return LZ4ULTRA_ERROR_MEMORY;
154 | }
155 |
156 | int nDecompressionError = 0;
157 | int nPrevDecompressedSize = 0;
158 | int nNumBlocks = 0;
159 |
160 | while (!pInStream->eof(pInStream) && !nDecompressionError) {
161 | unsigned int nBlockSize = 0;
162 | int nIsUncompressed = 0;
163 |
164 | if (nPrevDecompressedSize != 0) {
165 | memcpy(pOutData + HISTORY_SIZE - nPrevDecompressedSize, pOutData + HISTORY_SIZE + (nBlockMaxSize - nPrevDecompressedSize), nPrevDecompressedSize);
166 | }
167 | else if (nDictionaryDataSize != 0) {
168 | memcpy(pOutData + HISTORY_SIZE - nDictionaryDataSize, pDictionaryData, nDictionaryDataSize);
169 | nPrevDecompressedSize = nDictionaryDataSize;
170 |
171 | if (!(nFlags & LZ4ULTRA_FLAG_INDEP_BLOCKS))
172 | nDictionaryDataSize = 0;
173 | }
174 |
175 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) == 0) {
176 | memset(cFrameData, 0, 16);
177 | if (pInStream->read(pInStream, cFrameData, LZ4ULTRA_FRAME_SIZE) == LZ4ULTRA_FRAME_SIZE) {
178 | int nSuccess = lz4ultra_decode_frame(cFrameData, LZ4ULTRA_FRAME_SIZE, nFlags, &nBlockSize, &nIsUncompressed);
179 | if (nSuccess < 0)
180 | nBlockSize = 0;
181 |
182 | nCompressedSize += (long long)LZ4ULTRA_FRAME_SIZE;
183 | }
184 | else {
185 | nBlockSize = 0;
186 | }
187 | }
188 | else {
189 | if (!nNumBlocks)
190 | nBlockSize = nBlockMaxSize;
191 | else
192 | nBlockSize = 0;
193 | }
194 |
195 | if (nBlockSize != 0) {
196 | int nDecompressedSize = 0;
197 |
198 | if ((int)nBlockSize > nBlockMaxSize) {
199 | nDecompressionError = LZ4ULTRA_ERROR_FORMAT;
200 | break;
201 | }
202 | size_t nReadBytes = pInStream->read(pInStream, pInBlock, nBlockSize);
203 | if (nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) {
204 | if (nReadBytes > 2)
205 | nReadBytes -= 2;
206 | else
207 | nReadBytes = 0;
208 | nBlockSize = (unsigned int)nReadBytes;
209 | }
210 |
211 | if (nReadBytes == nBlockSize) {
212 | nCompressedSize += (long long)nReadBytes;
213 |
214 | if (nIsUncompressed) {
215 | memcpy(pOutData + HISTORY_SIZE, pInBlock, nBlockSize);
216 | nDecompressedSize = nBlockSize;
217 | }
218 | else {
219 | nDecompressedSize = lz4ultra_decompressor_expand_block(pInBlock, nBlockSize, pOutData, HISTORY_SIZE, nBlockMaxSize);
220 | if (nDecompressedSize < 0) {
221 | nDecompressionError = LZ4ULTRA_ERROR_DECOMPRESSION;
222 | break;
223 | }
224 | }
225 |
226 | if (nDecompressedSize != 0) {
227 | nOriginalSize += (long long)nDecompressedSize;
228 |
229 | if (pOutStream->write(pOutStream, pOutData + HISTORY_SIZE, nDecompressedSize) != nDecompressedSize)
230 | nDecompressionError = LZ4ULTRA_ERROR_DST;
231 |
232 | if (!(nFlags & LZ4ULTRA_FLAG_INDEP_BLOCKS)) {
233 | nPrevDecompressedSize = nDecompressedSize;
234 | if (nPrevDecompressedSize > HISTORY_SIZE)
235 | nPrevDecompressedSize = HISTORY_SIZE;
236 | }
237 | else {
238 | nPrevDecompressedSize = 0;
239 | }
240 | nDecompressedSize = 0;
241 | }
242 | }
243 | else {
244 | break;
245 | }
246 |
247 | nNumBlocks++;
248 | }
249 | else {
250 | break;
251 | }
252 | }
253 |
254 | free(pOutData);
255 | pOutData = NULL;
256 |
257 | free(pInBlock);
258 | pInBlock = NULL;
259 |
260 | *pOriginalSize = nOriginalSize;
261 | *pCompressedSize = nCompressedSize;
262 | return nDecompressionError;
263 | }
264 |
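The streaming loop above keeps a history window in front of the block output area, so that matches in a dependent block can reach back into previously decompressed bytes. The sketch below shows that sliding step in isolation. It is a variant, not the source's exact code: it takes the actual decompressed size of the previous block instead of nBlockMaxSize, and it uses memmove rather than memcpy. The name slide_history and the explicit nHistorySize parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an lz4ultra function).
 * Buffer layout: [ nHistorySize bytes of history | decompressed block ... ]
 * Moves the tail of the last decompressed block into the history area and
 * returns how many history bytes are now valid. */
size_t slide_history(unsigned char *pBuf, size_t nHistorySize, size_t nDecompressedSize) {
    size_t nKeep;

    assert(pBuf != NULL);

    /* Keep at most one full history window */
    nKeep = (nDecompressedSize < nHistorySize) ? nDecompressedSize : nHistorySize;

    if (nKeep != 0) {
        /* Source: last nKeep bytes of the block; destination: end of history */
        memmove(pBuf + nHistorySize - nKeep,
                pBuf + nHistorySize + nDecompressedSize - nKeep,
                nKeep);
    }

    return nKeep;
}
```

The returned count plays the role of nPrevDecompressedSize in the loop above: it tells the next block how far back a match offset may reach.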
--------------------------------------------------------------------------------
/src/expand_streaming.h:
--------------------------------------------------------------------------------
1 | /*
2 | * expand_streaming.h - streaming decompression definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _EXPAND_STREAMING_H
34 | #define _EXPAND_STREAMING_H
35 |
36 | #include "stream.h"
37 |
38 | /* Forward declaration */
39 | typedef enum _lz4ultra_status_t lz4ultra_status_t;
40 |
41 | /*-------------- File API -------------- */
42 |
43 | /**
44 | * Decompress file
45 | *
46 | * @param pszInFilename name of input(compressed) file to decompress
47 | * @param pszOutFilename name of output(decompressed) file to generate
48 | * @param pszDictionaryFilename name of dictionary file, or NULL for none
49 | * @param nFlags compression flags (LZ4ULTRA_FLAG_RAW_BLOCK to decompress a raw block, or 0)
50 | * @param pOriginalSize pointer to returned output(decompressed) size, updated when this function is successful
51 | * @param pCompressedSize pointer to returned input(compressed) size, updated when this function is successful
52 | *
53 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
54 | */
55 | lz4ultra_status_t lz4ultra_decompress_file(const char *pszInFilename, const char *pszOutFilename, const char *pszDictionaryFilename, const unsigned int nFlags,
56 | long long *pOriginalSize, long long *pCompressedSize);
57 |
58 | /*-------------- Streaming API -------------- */
59 |
60 | /**
61 | * Decompress stream
62 | *
63 | * @param pInStream input(compressed) stream to decompress
64 | * @param pOutStream output(decompressed) stream to write to
65 | * @param pDictionaryData dictionary contents, or NULL for none
66 | * @param nDictionaryDataSize size of dictionary contents, or 0
67 | * @param nFlags compression flags (LZ4ULTRA_FLAG_RAW_BLOCK to decompress a raw block, or 0)
68 | * @param pOriginalSize pointer to returned output(decompressed) size, updated when this function is successful
69 | * @param pCompressedSize pointer to returned input(compressed) size, updated when this function is successful
70 | *
71 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
72 | */
73 | lz4ultra_status_t lz4ultra_decompress_stream(lz4ultra_stream_t *pInStream, lz4ultra_stream_t *pOutStream, const void *pDictionaryData, int nDictionaryDataSize, unsigned int nFlags,
74 | long long *pOriginalSize, long long *pCompressedSize);
75 |
76 | #endif /* _EXPAND_STREAMING_H */
77 |
--------------------------------------------------------------------------------
/src/format.h:
--------------------------------------------------------------------------------
1 | /*
2 | * format.h - byte stream format definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _FORMAT_H
34 | #define _FORMAT_H
35 |
36 | #define MIN_MATCH_SIZE 4
37 | #define MIN_OFFSET 1
38 | #define MAX_OFFSET 0xffff
39 | #define HISTORY_SIZE 65536
40 | #define LITERALS_RUN_LEN 15
41 | #define MATCH_RUN_LEN 15
42 |
43 | #endif /* _FORMAT_H */
44 |
--------------------------------------------------------------------------------
/src/frame.c:
--------------------------------------------------------------------------------
1 | /*
2 | * frame.c - lz4 frame implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #include "frame.h"
34 | #include "lib.h"
35 | #include "xxhash.h"
36 |
37 | /**
38 | * Encode compressed stream header
39 | *
40 | * @param pFrameData encoding buffer
41 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
42 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
43 | * @param nBlockMaxCode max block size code (4-7)
44 | *
45 | * @return number of encoded bytes, or -1 for failure
46 | */
47 | int lz4ultra_encode_header(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags, int nBlockMaxCode) {
48 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES) {
49 | if (nMaxFrameDataSize >= 4) {
50 | pFrameData[0] = 0x02; /* Legacy magic number: 0x184C2102 */
51 | pFrameData[1] = 0x21;
52 | pFrameData[2] = 0x4C;
53 | pFrameData[3] = 0x18;
54 |
55 | return 4;
56 | }
57 | else {
58 | return LZ4ULTRA_ENCODE_ERR;
59 | }
60 | }
61 | else {
62 | if (nMaxFrameDataSize >= 7) {
63 | pFrameData[0] = 0x04; /* Magic number: 0x184D2204 */
64 | pFrameData[1] = 0x22;
65 | pFrameData[2] = 0x4D;
66 | pFrameData[3] = 0x18;
67 |
68 | pFrameData[4] = 0b01000000; /* Version.Hi Version.Lo B.Indep B.Checksum Content.Size Content.Checksum Reserved DictID */
69 | if (nFlags & LZ4ULTRA_FLAG_INDEP_BLOCKS)
70 | pFrameData[4] |= 0b00100000; /* B.Indep */
71 | pFrameData[5] = nBlockMaxCode << 4; /* Block MaxSize */
72 |
73 | XXH32_hash_t headerSum = XXH32(pFrameData + 4, 2, 0);
74 | pFrameData[6] = (headerSum >> 8) & 0xff; /* Header checksum */
75 |
76 | return 7;
77 | }
78 | else {
79 | return LZ4ULTRA_ENCODE_ERR;
80 | }
81 | }
82 | }
83 |
84 | /**
85 | * Encode compressed block frame header
86 | *
87 | * @param pFrameData encoding buffer
88 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
89 | * @param nFlags compression flags
90 | * @param nBlockDataSize compressed block's data size, in bytes
91 | *
92 | * @return number of encoded bytes, or -1 for failure
93 | */
94 | int lz4ultra_encode_compressed_block_frame(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags, const int nBlockDataSize) {
95 | if (nMaxFrameDataSize >= 4 && (nBlockDataSize & 0x80000000) == 0) {
96 | pFrameData[0] = nBlockDataSize & 0xff;
97 | pFrameData[1] = (nBlockDataSize >> 8) & 0xff;
98 | pFrameData[2] = (nBlockDataSize >> 16) & 0xff;
99 | pFrameData[3] = (nBlockDataSize >> 24) & 0x7f; /* Compressed block */
100 | return 4;
101 | }
102 | else {
103 | return LZ4ULTRA_ENCODE_ERR;
104 | }
105 | }
106 |
107 | /**
108 | * Encode uncompressed block frame header
109 | *
110 | * @param pFrameData encoding buffer
111 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
112 | * @param nFlags compression flags
113 | * @param nBlockDataSize uncompressed block's data size, in bytes
114 | *
115 | * @return number of encoded bytes, or -1 for failure
116 | */
117 | int lz4ultra_encode_uncompressed_block_frame(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags, const int nBlockDataSize) {
118 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES)
119 | return LZ4ULTRA_ERROR_RAW_UNCOMPRESSED;
120 |
121 | if (nMaxFrameDataSize >= 4 && (nBlockDataSize & 0x80000000) == 0) {
122 | pFrameData[0] = nBlockDataSize & 0xff;
123 | pFrameData[1] = (nBlockDataSize >> 8) & 0xff;
124 | pFrameData[2] = (nBlockDataSize >> 16) & 0xff;
125 | pFrameData[3] = ((nBlockDataSize >> 24) & 0x7f) | 0x80; /* Uncompressed block */
126 | return 4;
127 | }
128 | else {
129 | return LZ4ULTRA_ENCODE_ERR;
130 | }
131 | }
132 |
133 | /**
134 | * Encode terminal frame header
135 | *
136 | * @param pFrameData encoding buffer
137 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
138 | * @param nFlags compression flags
139 | *
140 | * @return number of encoded bytes, or -1 for failure
141 | */
142 | int lz4ultra_encode_footer_frame(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags) {
143 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES)
144 | return 0;
145 |
146 | if (nMaxFrameDataSize >= 4) {
147 | pFrameData[0] = 0x00; /* EOD frame */
148 | pFrameData[1] = 0x00;
149 | pFrameData[2] = 0x00;
150 | pFrameData[3] = 0x00;
151 | return 4;
152 | }
153 | else {
154 | return LZ4ULTRA_ENCODE_ERR;
155 | }
156 | }
157 |
158 | /**
159 | * Check compressed stream header
160 | *
161 | * @param pFrameData data bytes
162 | * @param nFrameDataSize number of bytes to check
163 | *
164 | * @return the number of extra header bytes to read for decoding, or LZ4ULTRA_DECODE_ERR_xxx for failure
165 | */
166 | int lz4ultra_check_header(const unsigned char *pFrameData, const int nFrameDataSize) {
167 | if (nFrameDataSize == 4) {
168 | if (pFrameData[0] == 0x04 &&
169 | pFrameData[1] == 0x22 &&
170 | pFrameData[2] == 0x4D &&
171 | pFrameData[3] == 0x18) {
172 | /* LZ4 magic number */
173 | return 3;
174 | }
175 |
176 | if (pFrameData[0] == 0x02 &&
177 | pFrameData[1] == 0x21 &&
178 | pFrameData[2] == 0x4C &&
179 | pFrameData[3] == 0x18) {
180 | /* Legacy magic number */
181 | return 0;
182 | }
183 | }
184 |
185 | return LZ4ULTRA_DECODE_ERR_FORMAT;
186 | }
187 |
188 | /**
189 | * Decode compressed stream header
190 | *
191 | * @param pFrameData data bytes
192 | * @param nFrameDataSize number of bytes to decode
193 | * @param nBlockMaxCode pointer to max block size code (4-7, or 0 for legacy frames), updated if this function succeeds
194 | * @param nFlags returned compression flags
195 | *
196 | * @return LZ4ULTRA_DECODE_OK for success, or LZ4ULTRA_DECODE_ERR_xxx for failure
197 | */
198 | int lz4ultra_decode_header(const unsigned char *pFrameData, const int nFrameDataSize, int *nBlockMaxCode, unsigned int *nFlags) {
199 | if (nFrameDataSize == 7) {
200 | if (pFrameData[0] != 0x04 ||
201 | pFrameData[1] != 0x22 ||
202 | pFrameData[2] != 0x4D ||
203 | pFrameData[3] != 0x18 ||
204 | (pFrameData[4] & 0xc0) != 0x40 || /* version bits must be 01; hex avoids the non-standard 0b literal */
205 | (pFrameData[5] & 0x0f) != 0) {
206 | return LZ4ULTRA_DECODE_ERR_FORMAT;
207 | }
208 |
209 | XXH32_hash_t headerSum = XXH32(pFrameData + 4, 2, 0);
210 | if (((headerSum >> 8) & 0xff) != pFrameData[6]) {
211 | return LZ4ULTRA_DECODE_ERR_SUM;
212 | }
213 |
214 | *nFlags = (pFrameData[4] & 0x20) ? LZ4ULTRA_FLAG_INDEP_BLOCKS : 0;
215 | *nBlockMaxCode = (pFrameData[5] >> 4);
216 |
217 | return LZ4ULTRA_DECODE_OK;
218 | }
219 | else if (nFrameDataSize == 4) {
220 | if (pFrameData[0] != 0x02 ||
221 | pFrameData[1] != 0x21 ||
222 | pFrameData[2] != 0x4C ||
223 | pFrameData[3] != 0x18) {
224 | return LZ4ULTRA_DECODE_ERR_FORMAT;
225 | }
226 |
227 | *nFlags = LZ4ULTRA_FLAG_LEGACY_FRAMES;
228 | *nBlockMaxCode = 0;
229 |
230 | return LZ4ULTRA_DECODE_OK;
231 | }
232 | else {
233 | return LZ4ULTRA_DECODE_ERR_FORMAT;
234 | }
235 | }
236 |
237 | /**
238 | * Decode frame header
239 | *
240 | * @param pFrameData data bytes
241 | * @param nFrameDataSize number of bytes to decode
242 | * @param nFlags compression flags
243 | * @param nBlockSize pointer to block size, updated if this function succeeds (set to 0 if this is the terminal frame)
244 | * @param nIsUncompressed pointer to uncompressed block flag (1 if the block is stored uncompressed, 0 if it is compressed), updated if this function succeeds
245 | *
246 | * @return LZ4ULTRA_DECODE_OK for success, or LZ4ULTRA_DECODE_ERR_FORMAT for failure
247 | */
248 | int lz4ultra_decode_frame(const unsigned char *pFrameData, const int nFrameDataSize, const unsigned int nFlags, unsigned int *nBlockSize, int *nIsUncompressed) {
249 | if (nFrameDataSize == 4) {
250 | *nBlockSize = ((unsigned int)pFrameData[0]) |
251 | (((unsigned int)pFrameData[1]) << 8) |
252 | (((unsigned int)pFrameData[2]) << 16) |
253 | (((unsigned int)pFrameData[3]) << 24);
254 |
255 | *nIsUncompressed = ((*nBlockSize) & 0x80000000) ? 1 : 0;
256 | (*nBlockSize) &= 0x7fffffff;
257 | return LZ4ULTRA_DECODE_OK;
258 | }
259 | else {
260 | return LZ4ULTRA_DECODE_ERR_FORMAT;
261 | }
262 | }
263 |
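The block frame functions above all share one 4-byte little-endian layout: the low 31 bits hold the block size and the top bit marks an uncompressed block. The following is a minimal, self-contained sketch of that layout for illustration only. The `sketch_` names are hypothetical and are not part of lz4ultra; the real functions also take `nFlags` and handle legacy frames, which this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FRAME_SIZE 4   /* hypothetical name; same value as LZ4ULTRA_FRAME_SIZE */

/* Pack a 31-bit block size and the "uncompressed" flag into 4 little-endian bytes.
 * Returns the number of bytes written, or -1 if the buffer is too small or the
 * size does not fit in 31 bits. */
int sketch_encode_block_frame(unsigned char *pOut, int nMaxSize,
                              uint32_t nBlockSize, int nIsUncompressed) {
   assert(pOut);
   if (nMaxSize < SKETCH_FRAME_SIZE || (nBlockSize & 0x80000000u) != 0)
      return -1;

   pOut[0] = (unsigned char)(nBlockSize & 0xff);
   pOut[1] = (unsigned char)((nBlockSize >> 8) & 0xff);
   pOut[2] = (unsigned char)((nBlockSize >> 16) & 0xff);
   /* Top bit of the last byte carries the uncompressed flag */
   pOut[3] = (unsigned char)(((nBlockSize >> 24) & 0x7f) | (nIsUncompressed ? 0x80 : 0x00));
   return SKETCH_FRAME_SIZE;
}

/* Unpack 4 bytes into a block size and the "uncompressed" flag.
 * A decoded size of 0 corresponds to the terminal (end of data) frame. */
void sketch_decode_block_frame(const unsigned char *pIn,
                               uint32_t *pBlockSize, int *pIsUncompressed) {
   uint32_t nValue;

   assert(pIn && pBlockSize && pIsUncompressed);
   nValue = ((uint32_t)pIn[0]) |
            (((uint32_t)pIn[1]) << 8) |
            (((uint32_t)pIn[2]) << 16) |
            (((uint32_t)pIn[3]) << 24);

   *pIsUncompressed = (nValue & 0x80000000u) ? 1 : 0;
   *pBlockSize = nValue & 0x7fffffffu;
}
```

Encoding a size and decoding the resulting 4 bytes should give back the same size and flag, which is a convenient property to check in a unit test.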
--------------------------------------------------------------------------------
/src/frame.h:
--------------------------------------------------------------------------------
1 | /*
2 | * frame.h - lz4 frame definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _FRAME_H
34 | #define _FRAME_H
35 |
36 | #include
37 |
38 | #define LZ4ULTRA_HEADER_SIZE 4
39 | #define LZ4ULTRA_MAX_HEADER_SIZE 7
40 | #define LZ4ULTRA_FRAME_SIZE 4
41 |
42 | #define LZ4ULTRA_ENCODE_ERR (-1)
43 |
44 | #define LZ4ULTRA_DECODE_OK 0
45 | #define LZ4ULTRA_DECODE_ERR_FORMAT (-1)
46 | #define LZ4ULTRA_DECODE_ERR_SUM (-2)
47 |
48 | /**
49 | * Encode compressed stream header
50 | *
51 | * @param pFrameData encoding buffer
52 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
53 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
54 | * @param nBlockMaxCode max block size code (4-7)
55 | *
56 | * @return number of encoded bytes, or -1 for failure
57 | */
58 | int lz4ultra_encode_header(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags, int nBlockMaxCode);
59 |
60 | /**
61 | * Encode compressed block frame header
62 | *
63 | * @param pFrameData encoding buffer
64 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
65 | * @param nFlags compression flags
66 | * @param nBlockDataSize compressed block's data size, in bytes
67 | *
68 | * @return number of encoded bytes, or -1 for failure
69 | */
70 | int lz4ultra_encode_compressed_block_frame(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags, const int nBlockDataSize);
71 |
72 | /**
73 | * Encode uncompressed block frame header
74 | *
75 | * @param pFrameData encoding buffer
76 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
77 | * @param nFlags compression flags
78 | * @param nBlockDataSize uncompressed block's data size, in bytes
79 | *
80 | * @return number of encoded bytes, -1 for failure, or LZ4ULTRA_ERROR_RAW_UNCOMPRESSED if nFlags requests legacy frames
81 | */
82 | int lz4ultra_encode_uncompressed_block_frame(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags, const int nBlockDataSize);
83 |
84 | /**
85 | * Encode terminal frame header
86 | *
87 | * @param pFrameData encoding buffer
88 | * @param nMaxFrameDataSize max encoding buffer size, in bytes
89 | * @param nFlags compression flags
90 | *
91 | * @return number of encoded bytes (0 if nFlags requests legacy frames), or -1 for failure
92 | */
93 | int lz4ultra_encode_footer_frame(unsigned char *pFrameData, const int nMaxFrameDataSize, const unsigned int nFlags);
94 |
95 | /**
96 | * Check compressed stream header
97 | *
98 | * @param pFrameData data bytes
99 | * @param nFrameDataSize number of bytes to check
100 | *
101 | * @return the number of extra header bytes to read for decoding, or LZ4ULTRA_DECODE_ERR_xxx for failure
102 | */
103 | int lz4ultra_check_header(const unsigned char *pFrameData, const int nFrameDataSize);
104 |
105 | /**
106 | * Decode compressed stream header
107 | *
108 | * @param pFrameData data bytes
109 | * @param nFrameDataSize number of bytes to decode
110 | * @param nBlockMaxCode pointer to max block size code (4-7, or 0 for legacy frames), updated if this function succeeds
111 | * @param nFlags returned compression flags
112 | *
113 | * @return LZ4ULTRA_DECODE_OK for success, or LZ4ULTRA_DECODE_ERR_xxx for failure
114 | */
115 | int lz4ultra_decode_header(const unsigned char *pFrameData, const int nFrameDataSize, int *nBlockMaxCode, unsigned int *nFlags);
116 |
117 | /**
118 | * Decode frame header
119 | *
120 | * @param pFrameData data bytes
121 | * @param nFrameDataSize number of bytes to decode
122 | * @param nFlags compression flags
123 | * @param nBlockSize pointer to block size, updated if this function succeeds (set to 0 if this is the terminal frame)
124 | * @param nIsUncompressed pointer to uncompressed block flag (1 if the block is stored uncompressed, 0 if it is compressed), updated if this function succeeds
125 | *
126 | * @return LZ4ULTRA_DECODE_OK for success, or LZ4ULTRA_DECODE_ERR_FORMAT for failure
127 | */
128 | int lz4ultra_decode_frame(const unsigned char *pFrameData, const int nFrameDataSize, const unsigned int nFlags, unsigned int *nBlockSize, int *nIsUncompressed);
129 |
130 | #endif /* _FRAME_H */
131 |
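The stream header declared above carries two descriptor bytes after the magic number: a flags byte (version and block independence) and a block descriptor byte (max block size code). The following is a rough sketch of how those two bytes can be interpreted, assuming the standard LZ4 frame mapping of codes 4-7 to 64 KB, 256 KB, 1 MB and 4 MB. The `sketch_` names are hypothetical, the header checksum is deliberately left out because it needs XXH32, and the 4-7 range check is an addition of this sketch rather than something `lz4ultra_decode_header` does.

```c
#include <assert.h>

/* Hypothetical result type, not part of lz4ultra */
typedef struct {
   int nIndependentBlocks;   /* 1 if blocks do not reference earlier blocks */
   int nBlockMaxCode;        /* 4..7 */
   long nMaxBlockSize;       /* in bytes, derived from nBlockMaxCode */
} sketch_header_info;

/* Interpret the FLG and BD descriptor bytes of a modern LZ4 frame header.
 * Returns 0 on success, -1 if the bytes do not describe a supported header. */
int sketch_parse_flg_bd(unsigned char nFlg, unsigned char nBd, sketch_header_info *pInfo) {
   int nCode;

   assert(pInfo);

   if ((nFlg & 0xc0) != 0x40)   /* version bits must be 01 */
      return -1;
   if ((nBd & 0x0f) != 0)       /* low nibble of BD is reserved */
      return -1;

   nCode = nBd >> 4;
   if (nCode < 4 || nCode > 7)  /* assumption: reject codes outside 4-7 */
      return -1;

   pInfo->nIndependentBlocks = (nFlg & 0x20) ? 1 : 0;
   pInfo->nBlockMaxCode = nCode;
   pInfo->nMaxBlockSize = 1L << (8 + 2 * nCode);   /* 4 -> 64 KB ... 7 -> 4 MB */
   return 0;
}
```

A caller would typically use the resulting maximum block size to allocate its input and output buffers before reading the first block frame.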
--------------------------------------------------------------------------------
/src/lib.c:
--------------------------------------------------------------------------------
1 | /*
2 | * lib.c - lz4ultra library implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #include
34 | #include
35 | #include
36 | #include "lib.h"
37 | #include "frame.h"
38 | #include "format.h"
39 |
--------------------------------------------------------------------------------
/src/lib.h:
--------------------------------------------------------------------------------
1 | /*
2 | * lib.h - lz4ultra library definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _LIB_H
34 | #define _LIB_H
35 |
36 | #include "stream.h"
37 | #include "dictionary.h"
38 | #include "shrink_context.h"
39 | #include "shrink_streaming.h"
40 | #include "shrink_inmem.h"
41 | #include "expand_block.h"
42 | #include "expand_streaming.h"
43 | #include "expand_inmem.h"
44 |
45 | /** High level status for compression and decompression */
46 | typedef enum _lz4ultra_status_t {
47 | LZ4ULTRA_OK = 0, /**< Success */
48 | LZ4ULTRA_ERROR_SRC, /**< Error reading input */
49 | LZ4ULTRA_ERROR_DST, /**< Error writing output */
50 | LZ4ULTRA_ERROR_DICTIONARY, /**< Error reading dictionary */
51 | LZ4ULTRA_ERROR_MEMORY, /**< Out of memory */
52 |
53 | /* Compression-specific status codes */
54 | LZ4ULTRA_ERROR_COMPRESSION, /**< Internal compression error */
55 | LZ4ULTRA_ERROR_RAW_TOOLARGE, /**< Input is too large to be compressed to a raw block */
56 | LZ4ULTRA_ERROR_RAW_UNCOMPRESSED, /**< Input is incompressible and raw blocks don't support uncompressed data */
57 |
58 | /* Decompression-specific status codes */
59 | LZ4ULTRA_ERROR_FORMAT, /**< Invalid input format or magic number when decompressing */
60 | LZ4ULTRA_ERROR_CHECKSUM, /**< Invalid checksum when decompressing */
61 | LZ4ULTRA_ERROR_DECOMPRESSION, /**< Internal decompression error */
62 | } lz4ultra_status_t;
63 |
64 | /* Compression flags */
65 | #define LZ4ULTRA_FLAG_FAVOR_RATIO (1<<0) /**< 1 to compress with the best ratio, 0 to trade some compression ratio for extra decompression speed */
66 | #define LZ4ULTRA_FLAG_RAW_BLOCK (1<<1) /**< 1 to emit raw block */
67 | #define LZ4ULTRA_FLAG_INDEP_BLOCKS (1<<2) /**< 1 if blocks are independent, 0 if using inter-block back references */
68 | #define LZ4ULTRA_FLAG_LEGACY_FRAMES (1<<3) /**< 1 if using the legacy frames format, 0 if using the modern lz4 frame format */
69 |
70 | #endif /* _LIB_H */
71 |
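Callers of the library usually want to turn a status code into a readable message. The following is a small sketch of such a mapping, with the message text taken from the comments in the enum above. The `sketch_` enum and function are hypothetical stand-ins so the example compiles on its own; real code would switch on `lz4ultra_status_t` directly.

```c
#include <string.h>

/* Hypothetical mirror of lz4ultra_status_t, in the same order */
typedef enum {
   SKETCH_OK = 0,
   SKETCH_ERROR_SRC,
   SKETCH_ERROR_DST,
   SKETCH_ERROR_DICTIONARY,
   SKETCH_ERROR_MEMORY,
   SKETCH_ERROR_COMPRESSION,
   SKETCH_ERROR_RAW_TOOLARGE,
   SKETCH_ERROR_RAW_UNCOMPRESSED,
   SKETCH_ERROR_FORMAT,
   SKETCH_ERROR_CHECKSUM,
   SKETCH_ERROR_DECOMPRESSION
} sketch_status_t;

/* Map a status code to a short human-readable message */
const char *sketch_status_message(sketch_status_t nStatus) {
   switch (nStatus) {
      case SKETCH_OK:                     return "success";
      case SKETCH_ERROR_SRC:              return "error reading input";
      case SKETCH_ERROR_DST:              return "error writing output";
      case SKETCH_ERROR_DICTIONARY:       return "error reading dictionary";
      case SKETCH_ERROR_MEMORY:           return "out of memory";
      case SKETCH_ERROR_COMPRESSION:      return "internal compression error";
      case SKETCH_ERROR_RAW_TOOLARGE:     return "input is too large to be compressed to a raw block";
      case SKETCH_ERROR_RAW_UNCOMPRESSED: return "input is incompressible and raw blocks don't support uncompressed data";
      case SKETCH_ERROR_FORMAT:           return "invalid input format or magic number";
      case SKETCH_ERROR_CHECKSUM:         return "invalid checksum";
      case SKETCH_ERROR_DECOMPRESSION:    return "internal decompression error";
      default:                            return "unknown status";
   }
}
```

Keeping a `default` branch means a status value added to the enum later still produces a message instead of undefined behavior.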
--------------------------------------------------------------------------------
/src/libdivsufsort/.gitignore:
--------------------------------------------------------------------------------
1 | # Object files
2 | *.o
3 | *.ko
4 | *.obj
5 | *.elf
6 |
7 | # Precompiled Headers
8 | *.gch
9 | *.pch
10 |
11 | # Libraries
12 | *.lib
13 | *.a
14 | *.la
15 | *.lo
16 |
17 | # Shared objects (inc. Windows DLLs)
18 | *.dll
19 | *.so
20 | *.so.*
21 | *.dylib
22 |
23 | # Executables
24 | *.exe
25 | *.out
26 | *.app
27 | *.i*86
28 | *.x86_64
29 | *.hex
30 |
31 | # CMake files/directories
32 | build/
33 |
--------------------------------------------------------------------------------
/src/libdivsufsort/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | # libdivsufsort Change Log
2 |
3 | See full changelog at: https://github.com/y-256/libdivsufsort/commits
4 |
5 | ## [2.0.1] - 2010-11-11
6 | ### Fixed
7 | * Wrong variable used in `divbwt` function
8 | * Enclose some string variables with double quotation marks in include/CMakeLists.txt
9 | * Fix typo in include/CMakeLists.txt
10 |
11 | ## 2.0.0 - 2008-08-23
12 | ### Changed
13 | * Switch the build system to [CMake](http://www.cmake.org/)
14 | * Improve the performance of the suffix-sorting algorithm
15 |
16 | ### Added
17 | * OpenMP support
18 | * 64-bit version of divsufsort
19 |
20 | [Unreleased]: https://github.com/y-256/libdivsufsort/compare/2.0.1...HEAD
21 | [2.0.1]: https://github.com/y-256/libdivsufsort/compare/2.0.0...2.0.1
22 |
--------------------------------------------------------------------------------
/src/libdivsufsort/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | ### cmake file for building libdivsufsort Package ###
2 | cmake_minimum_required(VERSION 2.4.4)
3 | set(CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/CMakeModules")
4 | include(AppendCompilerFlags)
5 |
6 | ## Project information ##
7 | project(libdivsufsort C)
8 | set(PROJECT_VENDOR "Yuta Mori")
9 | set(PROJECT_CONTACT "yuta.256@gmail.com")
10 | set(PROJECT_URL "https://github.com/y-256/libdivsufsort")
11 | set(PROJECT_DESCRIPTION "A lightweight suffix sorting library")
12 | include(VERSION.cmake)
13 |
14 | ## CPack configuration ##
15 | set(CPACK_GENERATOR "TGZ;TBZ2;ZIP")
16 | set(CPACK_SOURCE_GENERATOR "TGZ;TBZ2;ZIP")
17 | include(ProjectCPack)
18 |
19 | ## Project options ##
20 | option(BUILD_SHARED_LIBS "Set to OFF to build static libraries" ON)
21 | option(BUILD_EXAMPLES "Build examples" ON)
22 | option(BUILD_DIVSUFSORT64 "Build libdivsufsort64" OFF)
23 | option(USE_OPENMP "Use OpenMP for parallelization" OFF)
24 | option(WITH_LFS "Enable Large File Support" ON)
25 |
26 | ## Installation directories ##
27 | set(LIB_SUFFIX "" CACHE STRING "Define suffix of directory name (32 or 64)")
28 |
29 | set(CMAKE_INSTALL_RUNTIMEDIR "" CACHE PATH "Specify the output directory for dll runtimes (default is bin)")
30 | if(NOT CMAKE_INSTALL_RUNTIMEDIR)
31 | set(CMAKE_INSTALL_RUNTIMEDIR "${CMAKE_INSTALL_PREFIX}/bin")
32 | endif(NOT CMAKE_INSTALL_RUNTIMEDIR)
33 |
34 | set(CMAKE_INSTALL_LIBDIR "" CACHE PATH "Specify the output directory for libraries (default is lib)")
35 | if(NOT CMAKE_INSTALL_LIBDIR)
36 | set(CMAKE_INSTALL_LIBDIR "${CMAKE_INSTALL_PREFIX}/lib${LIB_SUFFIX}")
37 | endif(NOT CMAKE_INSTALL_LIBDIR)
38 |
39 | set(CMAKE_INSTALL_INCLUDEDIR "" CACHE PATH "Specify the output directory for header files (default is include)")
40 | if(NOT CMAKE_INSTALL_INCLUDEDIR)
41 | set(CMAKE_INSTALL_INCLUDEDIR "${CMAKE_INSTALL_PREFIX}/include")
42 | endif(NOT CMAKE_INSTALL_INCLUDEDIR)
43 |
44 | set(CMAKE_INSTALL_PKGCONFIGDIR "" CACHE PATH "Specify the output directory for pkgconfig files (default is lib/pkgconfig)")
45 | if(NOT CMAKE_INSTALL_PKGCONFIGDIR)
46 | set(CMAKE_INSTALL_PKGCONFIGDIR "${CMAKE_INSTALL_LIBDIR}/pkgconfig")
47 | endif(NOT CMAKE_INSTALL_PKGCONFIGDIR)
48 |
49 | ## Build type ##
50 | if(NOT CMAKE_BUILD_TYPE)
51 | set(CMAKE_BUILD_TYPE "Release")
52 | elseif(CMAKE_BUILD_TYPE STREQUAL "Debug")
53 | set(CMAKE_VERBOSE_MAKEFILE ON)
54 | endif(NOT CMAKE_BUILD_TYPE)
55 |
56 | ## Compiler options ##
57 | if(MSVC)
58 | append_c_compiler_flags("/W4" "VC" CMAKE_C_FLAGS)
59 | append_c_compiler_flags("/Oi;/Ot;/Ox;/Oy" "VC" CMAKE_C_FLAGS_RELEASE)
60 | if(USE_OPENMP)
61 | append_c_compiler_flags("/openmp" "VC" CMAKE_C_FLAGS)
62 | endif(USE_OPENMP)
63 | elseif(BORLAND)
64 | append_c_compiler_flags("-w" "BCC" CMAKE_C_FLAGS)
65 | append_c_compiler_flags("-Oi;-Og;-Os;-Ov;-Ox" "BCC" CMAKE_C_FLAGS_RELEASE)
66 | else(MSVC)
67 | if(CMAKE_COMPILER_IS_GNUCC)
68 | append_c_compiler_flags("-Wall" "GCC" CMAKE_C_FLAGS)
69 | append_c_compiler_flags("-fomit-frame-pointer" "GCC" CMAKE_C_FLAGS_RELEASE)
70 | if(USE_OPENMP)
71 | append_c_compiler_flags("-fopenmp" "GCC" CMAKE_C_FLAGS)
72 | endif(USE_OPENMP)
73 | else(CMAKE_COMPILER_IS_GNUCC)
74 | append_c_compiler_flags("-Wall" "UNKNOWN" CMAKE_C_FLAGS)
75 | append_c_compiler_flags("-fomit-frame-pointer" "UNKNOWN" CMAKE_C_FLAGS_RELEASE)
76 | if(USE_OPENMP)
77 | append_c_compiler_flags("-fopenmp;-openmp;-omp" "UNKNOWN" CMAKE_C_FLAGS)
78 | endif(USE_OPENMP)
79 | endif(CMAKE_COMPILER_IS_GNUCC)
80 | endif(MSVC)
81 |
82 | ## Add definitions ##
83 | add_definitions(-DHAVE_CONFIG_H=1 -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS)
84 |
85 | ## Add subdirectories ##
86 | add_subdirectory(pkgconfig)
87 | add_subdirectory(include)
88 | add_subdirectory(lib)
89 | if(BUILD_EXAMPLES)
90 | add_subdirectory(examples)
91 | endif(BUILD_EXAMPLES)
92 |
93 | ## Add 'uninstall' target ##
94 | CONFIGURE_FILE(
95 | "${CMAKE_CURRENT_SOURCE_DIR}/CMakeModules/cmake_uninstall.cmake.in"
96 | "${CMAKE_CURRENT_BINARY_DIR}/CMakeModules/cmake_uninstall.cmake"
97 | IMMEDIATE @ONLY)
98 | ADD_CUSTOM_TARGET(uninstall
99 | "${CMAKE_COMMAND}" -P "${CMAKE_CURRENT_BINARY_DIR}/CMakeModules/cmake_uninstall.cmake")
100 |
--------------------------------------------------------------------------------
/src/libdivsufsort/CMakeModules/AppendCompilerFlags.cmake:
--------------------------------------------------------------------------------
1 | include(CheckCSourceCompiles)
2 | include(CheckCXXSourceCompiles)
3 |
4 | macro(append_c_compiler_flags _flags _name _result)
5 | set(SAFE_CMAKE_REQUIRED_FLAGS ${CMAKE_REQUIRED_FLAGS})
6 | string(REGEX REPLACE "[-+/ ]" "_" cname "${_name}")
7 | string(TOUPPER "${cname}" cname)
8 | foreach(flag ${_flags})
9 | string(REGEX REPLACE "^[-+/ ]+(.*)[-+/ ]*$" "\\1" flagname "${flag}")
10 | string(REGEX REPLACE "[-+/ ]" "_" flagname "${flagname}")
11 | string(TOUPPER "${flagname}" flagname)
12 | set(have_flag "HAVE_${cname}_${flagname}")
13 | set(CMAKE_REQUIRED_FLAGS "${flag}")
14 | check_c_source_compiles("int main() { return 0; }" ${have_flag})
15 | if(${have_flag})
16 | set(${_result} "${${_result}} ${flag}")
17 | endif(${have_flag})
18 | endforeach(flag)
19 | set(CMAKE_REQUIRED_FLAGS ${SAFE_CMAKE_REQUIRED_FLAGS})
20 | endmacro(append_c_compiler_flags)
21 |
22 | macro(append_cxx_compiler_flags _flags _name _result)
23 | set(SAFE_CMAKE_REQUIRED_FLAGS ${CMAKE_REQUIRED_FLAGS})
24 | string(REGEX REPLACE "[-+/ ]" "_" cname "${_name}")
25 | string(TOUPPER "${cname}" cname)
26 | foreach(flag ${_flags})
27 | string(REGEX REPLACE "^[-+/ ]+(.*)[-+/ ]*$" "\\1" flagname "${flag}")
28 | string(REGEX REPLACE "[-+/ ]" "_" flagname "${flagname}")
29 | string(TOUPPER "${flagname}" flagname)
30 | set(have_flag "HAVE_${cname}_${flagname}")
31 | set(CMAKE_REQUIRED_FLAGS "${flag}")
32 | check_cxx_source_compiles("int main() { return 0; }" ${have_flag})
33 | if(${have_flag})
34 | set(${_result} "${${_result}} ${flag}")
35 | endif(${have_flag})
36 | endforeach(flag)
37 | set(CMAKE_REQUIRED_FLAGS ${SAFE_CMAKE_REQUIRED_FLAGS})
38 | endmacro(append_cxx_compiler_flags)
39 |
--------------------------------------------------------------------------------
/src/libdivsufsort/CMakeModules/CheckFunctionKeywords.cmake:
--------------------------------------------------------------------------------
1 | include(CheckCSourceCompiles)
2 |
3 | macro(check_function_keywords _wordlist)
4 | set(${_result} "")
5 | foreach(flag ${_wordlist})
6 | string(REGEX REPLACE "[-+/ ()]" "_" flagname "${flag}")
7 | string(TOUPPER "${flagname}" flagname)
8 | set(have_flag "HAVE_${flagname}")
9 | check_c_source_compiles("${flag} void func(); void func() { } int main() { func(); return 0; }" ${have_flag})
10 | if(${have_flag} AND NOT ${_result})
11 | set(${_result} "${flag}")
12 | # break()
13 | endif(${have_flag} AND NOT ${_result})
14 | endforeach(flag)
15 | endmacro(check_function_keywords)
16 |
--------------------------------------------------------------------------------
/src/libdivsufsort/CMakeModules/CheckLFS.cmake:
--------------------------------------------------------------------------------
1 | ## Checks for large file support ##
2 | include(CheckIncludeFile)
3 | include(CheckSymbolExists)
4 | include(CheckTypeSize)
5 |
6 | macro(check_lfs _isenable)
7 | set(LFS_OFF_T "")
8 | set(LFS_FOPEN "")
9 | set(LFS_FSEEK "")
10 | set(LFS_FTELL "")
11 | set(LFS_PRID "")
12 |
13 | if(${_isenable})
14 | set(SAFE_CMAKE_REQUIRED_DEFINITIONS "${CMAKE_REQUIRED_DEFINITIONS}")
15 | set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS}
16 | -D_LARGEFILE_SOURCE -D_LARGE_FILES -D_FILE_OFFSET_BITS=64
17 | -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS)
18 |
19 | check_include_file("sys/types.h" HAVE_SYS_TYPES_H)
20 | check_include_file("inttypes.h" HAVE_INTTYPES_H)
21 | check_include_file("stddef.h" HAVE_STDDEF_H)
22 | check_include_file("stdint.h" HAVE_STDINT_H)
23 |
24 | # LFS type1: 8 <= sizeof(off_t), fseeko, ftello
25 | check_type_size("off_t" SIZEOF_OFF_T)
26 | if(SIZEOF_OFF_T GREATER 7)
27 | check_symbol_exists("fseeko" "stdio.h" HAVE_FSEEKO)
28 | check_symbol_exists("ftello" "stdio.h" HAVE_FTELLO)
29 | if(HAVE_FSEEKO AND HAVE_FTELLO)
30 | set(LFS_OFF_T "off_t")
31 | set(LFS_FOPEN "fopen")
32 | set(LFS_FSEEK "fseeko")
33 | set(LFS_FTELL "ftello")
34 | check_symbol_exists("PRIdMAX" "inttypes.h" HAVE_PRIDMAX)
35 | if(HAVE_PRIDMAX)
36 | set(LFS_PRID "PRIdMAX")
37 | else(HAVE_PRIDMAX)
38 | check_type_size("long" SIZEOF_LONG)
39 | check_type_size("int" SIZEOF_INT)
40 | if(SIZEOF_OFF_T GREATER SIZEOF_LONG)
41 | set(LFS_PRID "\"lld\"")
42 | elseif(SIZEOF_LONG GREATER SIZEOF_INT)
43 | set(LFS_PRID "\"ld\"")
44 | else(SIZEOF_OFF_T GREATER SIZEOF_LONG)
45 | set(LFS_PRID "\"d\"")
46 | endif(SIZEOF_OFF_T GREATER SIZEOF_LONG)
47 | endif(HAVE_PRIDMAX)
48 | endif(HAVE_FSEEKO AND HAVE_FTELLO)
49 | endif(SIZEOF_OFF_T GREATER 7)
50 |
51 | # LFS type2: 8 <= sizeof(off64_t), fopen64, fseeko64, ftello64
52 | if(NOT LFS_OFF_T)
53 | check_type_size("off64_t" SIZEOF_OFF64_T)
54 | if(SIZEOF_OFF64_T GREATER 7)
55 | check_symbol_exists("fopen64" "stdio.h" HAVE_FOPEN64)
56 | check_symbol_exists("fseeko64" "stdio.h" HAVE_FSEEKO64)
57 | check_symbol_exists("ftello64" "stdio.h" HAVE_FTELLO64)
58 | if(HAVE_FOPEN64 AND HAVE_FSEEKO64 AND HAVE_FTELLO64)
59 | set(LFS_OFF_T "off64_t")
60 | set(LFS_FOPEN "fopen64")
61 | set(LFS_FSEEK "fseeko64")
62 | set(LFS_FTELL "ftello64")
63 | check_symbol_exists("PRIdMAX" "inttypes.h" HAVE_PRIDMAX)
64 | if(HAVE_PRIDMAX)
65 | set(LFS_PRID "PRIdMAX")
66 | else(HAVE_PRIDMAX)
67 | check_type_size("long" SIZEOF_LONG)
68 | check_type_size("int" SIZEOF_INT)
69 | if(SIZEOF_OFF64_T GREATER SIZEOF_LONG)
70 | set(LFS_PRID "\"lld\"")
71 | elseif(SIZEOF_LONG GREATER SIZEOF_INT)
72 | set(LFS_PRID "\"ld\"")
73 | else(SIZEOF_OFF64_T GREATER SIZEOF_LONG)
74 | set(LFS_PRID "\"d\"")
75 | endif(SIZEOF_OFF64_T GREATER SIZEOF_LONG)
76 | endif(HAVE_PRIDMAX)
77 | endif(HAVE_FOPEN64 AND HAVE_FSEEKO64 AND HAVE_FTELLO64)
78 | endif(SIZEOF_OFF64_T GREATER 7)
79 | endif(NOT LFS_OFF_T)
80 |
81 | # LFS type3: 8 <= sizeof(__int64), _fseeki64, _ftelli64
82 | if(NOT LFS_OFF_T)
83 | check_type_size("__int64" SIZEOF___INT64)
84 | if(SIZEOF___INT64 GREATER 7)
85 | check_symbol_exists("_fseeki64" "stdio.h" HAVE__FSEEKI64)
86 | check_symbol_exists("_ftelli64" "stdio.h" HAVE__FTELLI64)
87 | if(HAVE__FSEEKI64 AND HAVE__FTELLI64)
88 | set(LFS_OFF_T "__int64")
89 | set(LFS_FOPEN "fopen")
90 | set(LFS_FSEEK "_fseeki64")
91 | set(LFS_FTELL "_ftelli64")
92 | set(LFS_PRID "\"I64d\"")
93 | endif(HAVE__FSEEKI64 AND HAVE__FTELLI64)
94 | endif(SIZEOF___INT64 GREATER 7)
95 | endif(NOT LFS_OFF_T)
96 |
97 | set(CMAKE_REQUIRED_DEFINITIONS "${SAFE_CMAKE_REQUIRED_DEFINITIONS}")
98 | endif(${_isenable})
99 |
100 | if(NOT LFS_OFF_T)
101 | ## not found
102 | set(LFS_OFF_T "long")
103 | set(LFS_FOPEN "fopen")
104 | set(LFS_FSEEK "fseek")
105 | set(LFS_FTELL "ftell")
106 | set(LFS_PRID "\"ld\"")
107 | endif(NOT LFS_OFF_T)
108 |
109 | endmacro(check_lfs)
110 |
--------------------------------------------------------------------------------
/src/libdivsufsort/CMakeModules/ProjectCPack.cmake:
--------------------------------------------------------------------------------
1 | # If the cmake version includes cpack, use it
2 | IF(EXISTS "${CMAKE_ROOT}/Modules/CPack.cmake")
3 | SET(CPACK_PACKAGE_DESCRIPTION_SUMMARY "${PROJECT_DESCRIPTION}")
4 | SET(CPACK_PACKAGE_VENDOR "${PROJECT_VENDOR}")
5 | SET(CPACK_PACKAGE_DESCRIPTION_FILE "${CMAKE_CURRENT_SOURCE_DIR}/README.md")
6 | SET(CPACK_RESOURCE_FILE_LICENSE "${CMAKE_CURRENT_SOURCE_DIR}/LICENSE")
7 | SET(CPACK_PACKAGE_VERSION_MAJOR "${PROJECT_VERSION_MAJOR}")
8 | SET(CPACK_PACKAGE_VERSION_MINOR "${PROJECT_VERSION_MINOR}")
9 | SET(CPACK_PACKAGE_VERSION_PATCH "${PROJECT_VERSION_PATCH}")
10 | # SET(CPACK_PACKAGE_INSTALL_DIRECTORY "${PROJECT_NAME} ${PROJECT_VERSION}")
11 | SET(CPACK_SOURCE_PACKAGE_FILE_NAME "${PROJECT_NAME}-${PROJECT_VERSION_FULL}")
12 |
13 | IF(NOT DEFINED CPACK_SYSTEM_NAME)
14 | SET(CPACK_SYSTEM_NAME "${CMAKE_SYSTEM_NAME}-${CMAKE_SYSTEM_PROCESSOR}")
15 | ENDIF(NOT DEFINED CPACK_SYSTEM_NAME)
16 |
17 | IF(${CPACK_SYSTEM_NAME} MATCHES Windows)
18 | IF(CMAKE_CL_64)
19 | SET(CPACK_SYSTEM_NAME win64-${CMAKE_SYSTEM_PROCESSOR})
20 | ELSE(CMAKE_CL_64)
21 | SET(CPACK_SYSTEM_NAME win32-${CMAKE_SYSTEM_PROCESSOR})
22 | ENDIF(CMAKE_CL_64)
23 | ENDIF(${CPACK_SYSTEM_NAME} MATCHES Windows)
24 |
25 | IF(NOT DEFINED CPACK_PACKAGE_FILE_NAME)
26 | SET(CPACK_PACKAGE_FILE_NAME "${CPACK_SOURCE_PACKAGE_FILE_NAME}-${CPACK_SYSTEM_NAME}")
27 | ENDIF(NOT DEFINED CPACK_PACKAGE_FILE_NAME)
28 |
29 | SET(CPACK_PACKAGE_CONTACT "${PROJECT_CONTACT}")
30 | IF(UNIX)
31 | SET(CPACK_STRIP_FILES "")
32 | SET(CPACK_SOURCE_STRIP_FILES "")
33 | # SET(CPACK_PACKAGE_EXECUTABLES "ccmake" "CMake")
34 | ENDIF(UNIX)
35 | SET(CPACK_SOURCE_IGNORE_FILES "/CVS/" "/build/" "/\\\\.build/" "/\\\\.svn/" "~$")
36 | # include CPack module once all variables are set
37 | INCLUDE(CPack)
38 | ENDIF(EXISTS "${CMAKE_ROOT}/Modules/CPack.cmake")
39 |
--------------------------------------------------------------------------------
/src/libdivsufsort/CMakeModules/cmake_uninstall.cmake.in:
--------------------------------------------------------------------------------
1 | IF(NOT EXISTS "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")
2 | MESSAGE(FATAL_ERROR "Cannot find install manifest: \"@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt\"")
3 | ENDIF(NOT EXISTS "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")
4 |
5 | FILE(READ "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt" files)
6 | STRING(REGEX REPLACE "\n" ";" files "${files}")
7 |
8 | SET(NUM 0)
9 | FOREACH(file ${files})
10 | IF(EXISTS "$ENV{DESTDIR}${file}")
11 | MESSAGE(STATUS "Looking for \"$ENV{DESTDIR}${file}\" - found")
12 | SET(UNINSTALL_CHECK_${NUM} 1)
13 | ELSE(EXISTS "$ENV{DESTDIR}${file}")
14 | MESSAGE(STATUS "Looking for \"$ENV{DESTDIR}${file}\" - not found")
15 | SET(UNINSTALL_CHECK_${NUM} 0)
16 | ENDIF(EXISTS "$ENV{DESTDIR}${file}")
17 | MATH(EXPR NUM "1 + ${NUM}")
18 | ENDFOREACH(file)
19 |
20 | SET(NUM 0)
21 | FOREACH(file ${files})
22 | IF(${UNINSTALL_CHECK_${NUM}})
23 | MESSAGE(STATUS "Uninstalling \"$ENV{DESTDIR}${file}\"")
24 | EXEC_PROGRAM(
25 | "@CMAKE_COMMAND@" ARGS "-E remove \"$ENV{DESTDIR}${file}\""
26 | OUTPUT_VARIABLE rm_out
27 | RETURN_VALUE rm_retval
28 | )
29 | IF(NOT "${rm_retval}" STREQUAL 0)
30 | MESSAGE(FATAL_ERROR "Problem when removing \"$ENV{DESTDIR}${file}\"")
31 | ENDIF(NOT "${rm_retval}" STREQUAL 0)
32 | ENDIF(${UNINSTALL_CHECK_${NUM}})
33 | MATH(EXPR NUM "1 + ${NUM}")
34 | ENDFOREACH(file)
35 |
36 | FILE(REMOVE "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")
37 |
--------------------------------------------------------------------------------
/src/libdivsufsort/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2003 Yuta Mori All rights reserved.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/src/libdivsufsort/README.md:
--------------------------------------------------------------------------------
1 | # libdivsufsort
2 |
3 | libdivsufsort is a software library that implements a lightweight suffix array construction algorithm.
4 |
5 | ## News
6 | * 2015-03-21: The project has moved from [Google Code](http://code.google.com/p/libdivsufsort/) to [GitHub](https://github.com/y-256/libdivsufsort)
7 |
8 | ## Introduction
9 | This library provides a simple and efficient C API to construct a suffix array and a Burrows-Wheeler transformed string from a given string over a constant-size alphabet.
10 | The algorithm runs in O(n log n) worst-case time using only 5n+O(1) bytes of memory space, where n is the length of
11 | the string.
12 |
13 | ## Build requirements
14 | * An ANSI C Compiler (e.g. GNU GCC)
15 | * [CMake](http://www.cmake.org/ "CMake") version 2.4.2 or newer
16 | * CMake-supported build tool
17 |
18 | ## Building on GNU/Linux
19 | 1. Get the source code from GitHub. You can either
20 | * use git to clone the repository
21 | ```
22 | git clone https://github.com/y-256/libdivsufsort.git
23 | ```
24 | * or download a [zip file](../../archive/master.zip) directly
25 | 2. Create a `build` directory in the package source directory.
26 | ```shell
27 | $ cd libdivsufsort
28 | $ mkdir build
29 | $ cd build
30 | ```
31 | 3. Configure the package for your system.
32 | If you want to install to a different location, change the -DCMAKE_INSTALL_PREFIX option.
33 | ```shell
34 | $ cmake -DCMAKE_BUILD_TYPE="Release" \
35 | -DCMAKE_INSTALL_PREFIX="/usr/local" ..
36 | ```
37 | 4. Compile the package.
38 | ```shell
39 | $ make
40 | ```
41 | 5. (Optional) Install the library and header files.
42 | ```shell
43 | $ sudo make install
44 | ```
45 |
46 | ## API
47 | ```c
48 | /* Data types */
49 | typedef int32_t saint_t;
50 | typedef int32_t saidx_t;
51 | typedef uint8_t sauchar_t;
52 |
53 | /*
54 | * Constructs the suffix array of a given string.
55 | * @param T[0..n-1] The input string.
56 | * @param SA[0..n-1] The output array of suffixes.
57 | * @param n The length of the given string.
58 | * @return 0 if no error occurred, -1 or -2 otherwise.
59 | */
60 | saint_t
61 | divsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n);
62 |
63 | /*
64 | * Constructs the Burrows-Wheeler transformed string of a given string.
65 | * @param T[0..n-1] The input string.
66 | * @param U[0..n-1] The output string. (can be T)
67 | * @param A[0..n-1] The temporary array. (can be NULL)
68 | * @param n The length of the given string.
69 | * @return The primary index if no error occurred, -1 or -2 otherwise.
70 | */
71 | saidx_t
72 | divbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n);
73 | ```
74 |
75 | ## Example Usage
76 | ```c
77 | #include <stdio.h>
78 | #include <stdlib.h>
79 | #include <string.h>
80 |
81 | #include <divsufsort.h>
82 |
83 | int main() {
84 | // input data
85 | char *Text = "abracadabra";
86 | int n = strlen(Text);
87 | int i, j;
88 |
89 | // allocate
90 | int *SA = (int *)malloc(n * sizeof(int));
91 |
92 | // sort
93 | divsufsort((unsigned char *)Text, SA, n);
94 |
95 | // output
96 | for(i = 0; i < n; ++i) {
97 | printf("SA[%2d] = %2d: ", i, SA[i]);
98 | for(j = SA[i]; j < n; ++j) {
99 | printf("%c", Text[j]);
100 | }
101 | printf("$\n");
102 | }
103 |
104 | // deallocate
105 | free(SA);
106 |
107 | return 0;
108 | }
109 | ```
110 | See the [examples](examples) directory for a few other examples.
111 |
112 | ## Benchmarks
113 | See [Benchmarks](https://github.com/y-256/libdivsufsort/blob/wiki/SACA_Benchmarks.md) page for details.
114 |
115 | ## License
116 | libdivsufsort is released under the [MIT license](LICENSE "MIT license").
117 | > The MIT License (MIT)
118 | >
119 | > Copyright (c) 2003 Yuta Mori All rights reserved.
120 | >
121 | > Permission is hereby granted, free of charge, to any person obtaining a copy
122 | > of this software and associated documentation files (the "Software"), to deal
123 | > in the Software without restriction, including without limitation the rights
124 | > to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
125 | > copies of the Software, and to permit persons to whom the Software is
126 | > furnished to do so, subject to the following conditions:
127 | >
128 | > The above copyright notice and this permission notice shall be included in all
129 | > copies or substantial portions of the Software.
130 | >
131 | > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
132 | > IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
133 | > FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
134 | > AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
135 | > LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
136 | > OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
137 | > SOFTWARE.
138 |
139 | ## Author
140 | * Yuta Mori
141 |
--------------------------------------------------------------------------------
/src/libdivsufsort/VERSION.cmake:
--------------------------------------------------------------------------------
1 | set(PROJECT_VERSION_MAJOR "2")
2 | set(PROJECT_VERSION_MINOR "0")
3 | set(PROJECT_VERSION_PATCH "2")
4 | set(PROJECT_VERSION_EXTRA "-1")
5 | set(PROJECT_VERSION "${PROJECT_VERSION_MAJOR}.${PROJECT_VERSION_MINOR}")
6 | set(PROJECT_VERSION_FULL "${PROJECT_VERSION_MAJOR}.${PROJECT_VERSION_MINOR}.${PROJECT_VERSION_PATCH}${PROJECT_VERSION_EXTRA}")
7 |
8 | set(LIBRARY_VERSION "3.0.1")
9 | set(LIBRARY_SOVERSION "3")
10 |
11 | ## Git revision number ##
12 | if(EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/.git")
13 | execute_process(COMMAND git describe --tags HEAD
14 | WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}"
15 | OUTPUT_VARIABLE GIT_DESCRIBE_TAGS ERROR_QUIET)
16 | if(GIT_DESCRIBE_TAGS)
17 | string(REGEX REPLACE "^v(.*)" "\\1" GIT_REVISION "${GIT_DESCRIBE_TAGS}")
18 | string(STRIP "${GIT_REVISION}" GIT_REVISION)
19 | if(GIT_REVISION)
20 | set(PROJECT_VERSION_FULL "${GIT_REVISION}")
21 | endif(GIT_REVISION)
22 | endif(GIT_DESCRIBE_TAGS)
23 | endif(EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/.git")
24 |
--------------------------------------------------------------------------------
/src/libdivsufsort/examples/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | ## Add definitions ##
2 | add_definitions(-D_LARGEFILE_SOURCE -D_LARGE_FILES -D_FILE_OFFSET_BITS=64)
3 |
4 | ## Targets ##
5 | include_directories("${CMAKE_CURRENT_SOURCE_DIR}/../include"
6 | "${CMAKE_CURRENT_BINARY_DIR}/../include")
7 | link_directories("${CMAKE_CURRENT_BINARY_DIR}/../lib")
8 | foreach(src suftest mksary sasearch bwt unbwt)
9 | add_executable(${src} ${src}.c)
10 | target_link_libraries(${src} divsufsort)
11 | endforeach(src)
12 |
--------------------------------------------------------------------------------
/src/libdivsufsort/examples/bwt.c:
--------------------------------------------------------------------------------
1 | /*
2 | * bwt.c for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #if HAVE_CONFIG_H
28 | # include "config.h"
29 | #endif
30 | #include <stdio.h>
31 | #if HAVE_STRING_H
32 | # include <string.h>
33 | #endif
34 | #if HAVE_STDLIB_H
35 | # include <stdlib.h>
36 | #endif
37 | #if HAVE_MEMORY_H
38 | # include <memory.h>
39 | #endif
40 | #if HAVE_STDDEF_H
41 | # include <stddef.h>
42 | #endif
43 | #if HAVE_STRINGS_H
44 | # include <strings.h>
45 | #endif
46 | #if HAVE_SYS_TYPES_H
47 | # include <sys/types.h>
48 | #endif
49 | #if HAVE_IO_H && HAVE_FCNTL_H
50 | # include <io.h>
51 | # include <fcntl.h>
52 | #endif
53 | #include <time.h>
54 | #include <divsufsort.h>
55 | #include "lfs.h"
56 |
57 |
58 | static
59 | size_t
60 | write_int(FILE *fp, saidx_t n) {
61 | unsigned char c[4];
62 | c[0] = (unsigned char)((n >> 0) & 0xff), c[1] = (unsigned char)((n >> 8) & 0xff),
63 | c[2] = (unsigned char)((n >> 16) & 0xff), c[3] = (unsigned char)((n >> 24) & 0xff);
64 | return fwrite(c, sizeof(unsigned char), 4, fp);
65 | }
66 |
67 | static
68 | void
69 | print_help(const char *progname, int status) {
70 | fprintf(stderr,
71 | "bwt, a burrows-wheeler transform program, version %s.\n",
72 | divsufsort_version());
73 | fprintf(stderr, "usage: %s [-b num] INFILE OUTFILE\n", progname);
74 | fprintf(stderr, " -b num set block size to num MiB [1..512] (default: 32)\n\n");
75 | exit(status);
76 | }
77 |
78 | int
79 | main(int argc, const char *argv[]) {
80 | FILE *fp, *ofp;
81 | const char *fname, *ofname;
82 | sauchar_t *T;
83 | saidx_t *SA;
84 | LFS_OFF_T n;
85 | size_t m;
86 | saidx_t pidx;
87 | clock_t start,finish;
88 | saint_t i, blocksize = 32, needclose = 3;
89 |
90 | /* Check arguments. */
91 | if((argc == 1) ||
92 | (strcmp(argv[1], "-h") == 0) ||
93 | (strcmp(argv[1], "--help") == 0)) { print_help(argv[0], EXIT_SUCCESS); }
94 | if((argc != 3) && (argc != 5)) { print_help(argv[0], EXIT_FAILURE); }
95 | i = 1;
96 | if(argc == 5) {
97 | if(strcmp(argv[i], "-b") != 0) { print_help(argv[0], EXIT_FAILURE); }
98 | blocksize = atoi(argv[i + 1]);
99 | if(blocksize < 0) { blocksize = 1; }
100 | else if(512 < blocksize) { blocksize = 512; }
101 | i += 2;
102 | }
103 | blocksize <<= 20;
104 |
105 | /* Open a file for reading. */
106 | if(strcmp(argv[i], "-") != 0) {
107 | #if HAVE_FOPEN_S
108 | if(fopen_s(&fp, fname = argv[i], "rb") != 0) {
109 | #else
110 | if((fp = LFS_FOPEN(fname = argv[i], "rb")) == NULL) {
111 | #endif
112 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], fname);
113 | perror(NULL);
114 | exit(EXIT_FAILURE);
115 | }
116 | } else {
117 | #if HAVE__SETMODE && HAVE__FILENO
118 | if(_setmode(_fileno(stdin), _O_BINARY) == -1) {
119 | fprintf(stderr, "%s: Cannot set mode: ", argv[0]);
120 | perror(NULL);
121 | exit(EXIT_FAILURE);
122 | }
123 | #endif
124 | fp = stdin;
125 | fname = "stdin";
126 | needclose ^= 1;
127 | }
128 | i += 1;
129 |
130 | /* Open a file for writing. */
131 | if(strcmp(argv[i], "-") != 0) {
132 | #if HAVE_FOPEN_S
133 | if(fopen_s(&ofp, ofname = argv[i], "wb") != 0) {
134 | #else
135 | if((ofp = LFS_FOPEN(ofname = argv[i], "wb")) == NULL) {
136 | #endif
137 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], ofname);
138 | perror(NULL);
139 | exit(EXIT_FAILURE);
140 | }
141 | } else {
142 | #if HAVE__SETMODE && HAVE__FILENO
143 | if(_setmode(_fileno(stdout), _O_BINARY) == -1) {
144 | fprintf(stderr, "%s: Cannot set mode: ", argv[0]);
145 | perror(NULL);
146 | exit(EXIT_FAILURE);
147 | }
148 | #endif
149 | ofp = stdout;
150 | ofname = "stdout";
151 | needclose ^= 2;
152 | }
153 |
154 | /* Get the file size. */
155 | if(LFS_FSEEK(fp, 0, SEEK_END) == 0) {
156 | n = LFS_FTELL(fp);
157 | rewind(fp);
158 | if(n < 0) {
159 | fprintf(stderr, "%s: Cannot ftell `%s': ", argv[0], fname);
160 | perror(NULL);
161 | exit(EXIT_FAILURE);
162 | }
163 | if(0x20000000L < n) { n = 0x20000000L; }
164 | if((blocksize == 0) || (n < blocksize)) { blocksize = (saidx_t)n; }
165 | } else if(blocksize == 0) { blocksize = 32 << 20; }
166 |
167 | /* Allocate 5 * blocksize bytes of memory (blocksize for T, 4 * blocksize for SA). */
168 | T = (sauchar_t *)malloc(blocksize * sizeof(sauchar_t));
169 | SA = (saidx_t *)malloc(blocksize * sizeof(saidx_t));
170 | if((T == NULL) || (SA == NULL)) {
171 | fprintf(stderr, "%s: Cannot allocate memory.\n", argv[0]);
172 | exit(EXIT_FAILURE);
173 | }
174 |
175 | /* Write the blocksize. */
176 | if(write_int(ofp, blocksize) != 4) {
177 | fprintf(stderr, "%s: Cannot write to `%s': ", argv[0], ofname);
178 | perror(NULL);
179 | exit(EXIT_FAILURE);
180 | }
181 |
182 | fprintf(stderr, " BWT (blocksize %" PRIdSAINT_T ") ... ", blocksize);
183 | start = clock();
184 | for(n = 0; 0 < (m = fread(T, sizeof(sauchar_t), blocksize, fp)); n += m) {
185 | /* Burrows-Wheeler Transform. */
186 | pidx = divbwt(T, T, SA, m);
187 | if(pidx < 0) {
188 | fprintf(stderr, "%s (bw_transform): %s.\n",
189 | argv[0],
190 | (pidx == -1) ? "Invalid arguments" : "Cannot allocate memory");
191 | exit(EXIT_FAILURE);
192 | }
193 |
194 | /* Write the bwted data. */
195 | if((write_int(ofp, pidx) != 4) ||
196 | (fwrite(T, sizeof(sauchar_t), m, ofp) != m)) {
197 | fprintf(stderr, "%s: Cannot write to `%s': ", argv[0], ofname);
198 | perror(NULL);
199 | exit(EXIT_FAILURE);
200 | }
201 | }
202 | if(ferror(fp)) {
203 | fprintf(stderr, "%s: Cannot read from `%s': ", argv[0], fname);
204 | perror(NULL);
205 | exit(EXIT_FAILURE);
206 | }
207 | finish = clock();
208 | fprintf(stderr, "%" PRIdOFF_T " bytes: %.4f sec\n",
209 | n, (double)(finish - start) / (double)CLOCKS_PER_SEC);
210 |
211 | /* Close files */
212 | if(needclose & 1) { fclose(fp); }
213 | if(needclose & 2) { fclose(ofp); }
214 |
215 | /* Deallocate memory. */
216 | free(SA);
217 | free(T);
218 |
219 | return 0;
220 | }
221 |
--------------------------------------------------------------------------------
/src/libdivsufsort/examples/mksary.c:
--------------------------------------------------------------------------------
1 | /*
2 | * mksary.c for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #if HAVE_CONFIG_H
28 | # include "config.h"
29 | #endif
30 | #include <stdio.h>
31 | #if HAVE_STRING_H
32 | # include <string.h>
33 | #endif
34 | #if HAVE_STDLIB_H
35 | # include <stdlib.h>
36 | #endif
37 | #if HAVE_MEMORY_H
38 | # include <memory.h>
39 | #endif
40 | #if HAVE_STDDEF_H
41 | # include <stddef.h>
42 | #endif
43 | #if HAVE_STRINGS_H
44 | # include <strings.h>
45 | #endif
46 | #if HAVE_SYS_TYPES_H
47 | # include <sys/types.h>
48 | #endif
49 | #if HAVE_IO_H && HAVE_FCNTL_H
50 | # include <io.h>
51 | # include <fcntl.h>
52 | #endif
53 | #include <time.h>
54 | #include <divsufsort.h>
55 | #include "lfs.h"
56 |
57 |
58 | static
59 | void
60 | print_help(const char *progname, int status) {
61 | fprintf(stderr,
62 | "mksary, a simple suffix array builder, version %s.\n",
63 | divsufsort_version());
64 | fprintf(stderr, "usage: %s INFILE OUTFILE\n\n", progname);
65 | exit(status);
66 | }
67 |
68 | int
69 | main(int argc, const char *argv[]) {
70 | FILE *fp, *ofp;
71 | const char *fname, *ofname;
72 | sauchar_t *T;
73 | saidx_t *SA;
74 | LFS_OFF_T n;
75 | clock_t start, finish;
76 | saint_t needclose = 3;
77 |
78 | /* Check arguments. */
79 | if((argc == 1) ||
80 | (strcmp(argv[1], "-h") == 0) ||
81 | (strcmp(argv[1], "--help") == 0)) { print_help(argv[0], EXIT_SUCCESS); }
82 | if(argc != 3) { print_help(argv[0], EXIT_FAILURE); }
83 |
84 | /* Open a file for reading. */
85 | if(strcmp(argv[1], "-") != 0) {
86 | #if HAVE_FOPEN_S
87 | if(fopen_s(&fp, fname = argv[1], "rb") != 0) {
88 | #else
89 | if((fp = LFS_FOPEN(fname = argv[1], "rb")) == NULL) {
90 | #endif
91 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], fname);
92 | perror(NULL);
93 | exit(EXIT_FAILURE);
94 | }
95 | } else {
96 | #if HAVE__SETMODE && HAVE__FILENO
97 | if(_setmode(_fileno(stdin), _O_BINARY) == -1) {
98 | fprintf(stderr, "%s: Cannot set mode: ", argv[0]);
99 | perror(NULL);
100 | exit(EXIT_FAILURE);
101 | }
102 | #endif
103 | fp = stdin;
104 | fname = "stdin";
105 | needclose ^= 1;
106 | }
107 |
108 | /* Open a file for writing. */
109 | if(strcmp(argv[2], "-") != 0) {
110 | #if HAVE_FOPEN_S
111 | if(fopen_s(&ofp, ofname = argv[2], "wb") != 0) {
112 | #else
113 | if((ofp = LFS_FOPEN(ofname = argv[2], "wb")) == NULL) {
114 | #endif
115 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], ofname);
116 | perror(NULL);
117 | exit(EXIT_FAILURE);
118 | }
119 | } else {
120 | #if HAVE__SETMODE && HAVE__FILENO
121 | if(_setmode(_fileno(stdout), _O_BINARY) == -1) {
122 | fprintf(stderr, "%s: Cannot set mode: ", argv[0]);
123 | perror(NULL);
124 | exit(EXIT_FAILURE);
125 | }
126 | #endif
127 | ofp = stdout;
128 | ofname = "stdout";
129 | needclose ^= 2;
130 | }
131 |
132 | /* Get the file size. */
133 | if(LFS_FSEEK(fp, 0, SEEK_END) == 0) {
134 | n = LFS_FTELL(fp);
135 | rewind(fp);
136 | if(n < 0) {
137 | fprintf(stderr, "%s: Cannot ftell `%s': ", argv[0], fname);
138 | perror(NULL);
139 | exit(EXIT_FAILURE);
140 | }
141 | if(0x7fffffff <= n) {
142 | fprintf(stderr, "%s: Input file `%s' is too big.\n", argv[0], fname);
143 | exit(EXIT_FAILURE);
144 | }
145 | } else {
146 | fprintf(stderr, "%s: Cannot fseek `%s': ", argv[0], fname);
147 | perror(NULL);
148 | exit(EXIT_FAILURE);
149 | }
150 |
151 | /* Allocate 5n bytes of memory (n for T, 4n for SA). */
152 | T = (sauchar_t *)malloc((size_t)n * sizeof(sauchar_t));
153 | SA = (saidx_t *)malloc((size_t)n * sizeof(saidx_t));
154 | if((T == NULL) || (SA == NULL)) {
155 | fprintf(stderr, "%s: Cannot allocate memory.\n", argv[0]);
156 | exit(EXIT_FAILURE);
157 | }
158 |
159 | /* Read n bytes of data. */
160 | if(fread(T, sizeof(sauchar_t), (size_t)n, fp) != (size_t)n) {
161 | fprintf(stderr, "%s: %s `%s': ",
162 | argv[0],
163 | (ferror(fp) || !feof(fp)) ? "Cannot read from" : "Unexpected EOF in",
164 | fname);
165 | perror(NULL);
166 | exit(EXIT_FAILURE);
167 | }
168 | if(needclose & 1) { fclose(fp); }
169 |
170 | /* Construct the suffix array. */
171 | fprintf(stderr, "%s: %" PRIdOFF_T " bytes ... ", fname, n);
172 | start = clock();
173 | if(divsufsort(T, SA, (saidx_t)n) != 0) {
174 | fprintf(stderr, "%s: Cannot allocate memory.\n", argv[0]);
175 | exit(EXIT_FAILURE);
176 | }
177 | finish = clock();
178 | fprintf(stderr, "%.4f sec\n", (double)(finish - start) / (double)CLOCKS_PER_SEC);
179 |
180 | /* Write the suffix array. */
181 | if(fwrite(SA, sizeof(saidx_t), (size_t)n, ofp) != (size_t)n) {
182 | fprintf(stderr, "%s: Cannot write to `%s': ", argv[0], ofname);
183 | perror(NULL);
184 | exit(EXIT_FAILURE);
185 | }
186 | if(needclose & 2) { fclose(ofp); }
187 |
188 | /* Deallocate memory. */
189 | free(SA);
190 | free(T);
191 |
192 | return 0;
193 | }
194 |
--------------------------------------------------------------------------------
/src/libdivsufsort/examples/sasearch.c:
--------------------------------------------------------------------------------
1 | /*
2 | * sasearch.c for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #if HAVE_CONFIG_H
28 | # include "config.h"
29 | #endif
30 | #include <stdio.h>
31 | #if HAVE_STRING_H
32 | # include <string.h>
33 | #endif
34 | #if HAVE_STDLIB_H
35 | # include <stdlib.h>
36 | #endif
37 | #if HAVE_MEMORY_H
38 | # include <memory.h>
39 | #endif
40 | #if HAVE_STDDEF_H
41 | # include <stddef.h>
42 | #endif
43 | #if HAVE_STRINGS_H
44 | # include <strings.h>
45 | #endif
46 | #if HAVE_SYS_TYPES_H
47 | # include <sys/types.h>
48 | #endif
49 | #if HAVE_IO_H && HAVE_FCNTL_H
50 | # include <io.h>
51 | # include <fcntl.h>
52 | #endif
53 | #include <divsufsort.h>
54 | #include "lfs.h"
55 |
56 |
57 | static
58 | void
59 | print_help(const char *progname, int status) {
60 | fprintf(stderr,
61 | "sasearch, a simple SA-based full-text search tool, version %s\n",
62 | divsufsort_version());
63 | fprintf(stderr, "usage: %s PATTERN FILE SAFILE\n\n", progname);
64 | exit(status);
65 | }
66 |
67 | int
68 | main(int argc, const char *argv[]) {
69 | FILE *fp;
70 | const char *P;
71 | sauchar_t *T;
72 | saidx_t *SA;
73 | LFS_OFF_T n;
74 | size_t Psize;
75 | saidx_t i, size, left;
76 |
77 | if((argc == 1) ||
78 | (strcmp(argv[1], "-h") == 0) ||
79 | (strcmp(argv[1], "--help") == 0)) { print_help(argv[0], EXIT_SUCCESS); }
80 | if(argc != 4) { print_help(argv[0], EXIT_FAILURE); }
81 |
82 | P = argv[1];
83 | Psize = strlen(P);
84 |
85 | /* Open a file for reading. */
86 | #if HAVE_FOPEN_S
87 | if(fopen_s(&fp, argv[2], "rb") != 0) {
88 | #else
89 | if((fp = LFS_FOPEN(argv[2], "rb")) == NULL) {
90 | #endif
91 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], argv[2]);
92 | perror(NULL);
93 | exit(EXIT_FAILURE);
94 | }
95 |
96 | /* Get the file size. */
97 | if(LFS_FSEEK(fp, 0, SEEK_END) == 0) {
98 | n = LFS_FTELL(fp);
99 | rewind(fp);
100 | if(n < 0) {
101 | fprintf(stderr, "%s: Cannot ftell `%s': ", argv[0], argv[2]);
102 | perror(NULL);
103 | exit(EXIT_FAILURE);
104 | }
105 | } else {
106 | fprintf(stderr, "%s: Cannot fseek `%s': ", argv[0], argv[2]);
107 | perror(NULL);
108 | exit(EXIT_FAILURE);
109 | }
110 |
111 | /* Allocate 5n bytes of memory. */
112 | T = (sauchar_t *)malloc((size_t)n * sizeof(sauchar_t));
113 | SA = (saidx_t *)malloc((size_t)n * sizeof(saidx_t));
114 | if((T == NULL) || (SA == NULL)) {
115 | fprintf(stderr, "%s: Cannot allocate memory.\n", argv[0]);
116 | exit(EXIT_FAILURE);
117 | }
118 |
119 | /* Read n bytes of data. */
120 | if(fread(T, sizeof(sauchar_t), (size_t)n, fp) != (size_t)n) {
121 | fprintf(stderr, "%s: %s `%s': ",
122 | argv[0],
123 | (ferror(fp) || !feof(fp)) ? "Cannot read from" : "Unexpected EOF in",
124 | argv[2]);
125 | perror(NULL);
126 | exit(EXIT_FAILURE);
127 | }
128 | fclose(fp);
129 |
130 | /* Open the SA file for reading. */
131 | #if HAVE_FOPEN_S
132 | if(fopen_s(&fp, argv[3], "rb") != 0) {
133 | #else
134 | if((fp = LFS_FOPEN(argv[3], "rb")) == NULL) {
135 | #endif
136 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], argv[3]);
137 | perror(NULL);
138 | exit(EXIT_FAILURE);
139 | }
140 |
141 | /* Read n * sizeof(saidx_t) bytes of data. */
142 | if(fread(SA, sizeof(saidx_t), (size_t)n, fp) != (size_t)n) {
143 | fprintf(stderr, "%s: %s `%s': ",
144 | argv[0],
145 | (ferror(fp) || !feof(fp)) ? "Cannot read from" : "Unexpected EOF in",
146 | argv[3]);
147 | perror(NULL);
148 | exit(EXIT_FAILURE);
149 | }
150 | fclose(fp);
151 |
152 | /* Search and print */
153 | size = sa_search(T, (saidx_t)n,
154 | (const sauchar_t *)P, (saidx_t)Psize,
155 | SA, (saidx_t)n, &left);
156 | for(i = 0; i < size; ++i) {
157 | fprintf(stdout, "%" PRIdSAIDX_T "\n", SA[left + i]);
158 | }
159 |
160 | /* Deallocate memory. */
161 | free(SA);
162 | free(T);
163 |
164 | return 0;
165 | }
166 |
--------------------------------------------------------------------------------
/src/libdivsufsort/examples/suftest.c:
--------------------------------------------------------------------------------
1 | /*
2 | * suftest.c for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #if HAVE_CONFIG_H
28 | # include "config.h"
29 | #endif
30 | #include <stdio.h>
31 | #if HAVE_STRING_H
32 | # include <string.h>
33 | #endif
34 | #if HAVE_STDLIB_H
35 | # include <stdlib.h>
36 | #endif
37 | #if HAVE_MEMORY_H
38 | # include <memory.h>
39 | #endif
40 | #if HAVE_STDDEF_H
41 | # include <stddef.h>
42 | #endif
43 | #if HAVE_STRINGS_H
44 | # include <strings.h>
45 | #endif
46 | #if HAVE_SYS_TYPES_H
47 | # include <sys/types.h>
48 | #endif
49 | #if HAVE_IO_H && HAVE_FCNTL_H
50 | # include <io.h>
51 | # include <fcntl.h>
52 | #endif
53 | #include <time.h>
54 | #include <divsufsort.h>
55 | #include "lfs.h"
56 |
57 |
58 | static
59 | void
60 | print_help(const char *progname, int status) {
61 | fprintf(stderr,
62 | "suftest, a suffixsort tester, version %s.\n",
63 | divsufsort_version());
64 | fprintf(stderr, "usage: %s FILE\n\n", progname);
65 | exit(status);
66 | }
67 |
68 | int
69 | main(int argc, const char *argv[]) {
70 | FILE *fp;
71 | const char *fname;
72 | sauchar_t *T;
73 | saidx_t *SA;
74 | LFS_OFF_T n;
75 | clock_t start, finish;
76 | saint_t needclose = 1;
77 |
78 | /* Check arguments. */
79 | if((argc == 1) ||
80 | (strcmp(argv[1], "-h") == 0) ||
81 | (strcmp(argv[1], "--help") == 0)) { print_help(argv[0], EXIT_SUCCESS); }
82 | if(argc != 2) { print_help(argv[0], EXIT_FAILURE); }
83 |
84 | /* Open a file for reading. */
85 | if(strcmp(argv[1], "-") != 0) {
86 | #if HAVE_FOPEN_S
87 | if(fopen_s(&fp, fname = argv[1], "rb") != 0) {
88 | #else
89 | if((fp = LFS_FOPEN(fname = argv[1], "rb")) == NULL) {
90 | #endif
91 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], fname);
92 | perror(NULL);
93 | exit(EXIT_FAILURE);
94 | }
95 | } else {
96 | #if HAVE__SETMODE && HAVE__FILENO
97 | if(_setmode(_fileno(stdin), _O_BINARY) == -1) {
98 | fprintf(stderr, "%s: Cannot set mode: ", argv[0]);
99 | perror(NULL);
100 | exit(EXIT_FAILURE);
101 | }
102 | #endif
103 | fp = stdin;
104 | fname = "stdin";
105 | needclose = 0;
106 | }
107 |
108 | /* Get the file size. */
109 | if(LFS_FSEEK(fp, 0, SEEK_END) == 0) {
110 | n = LFS_FTELL(fp);
111 | rewind(fp);
112 | if(n < 0) {
113 | fprintf(stderr, "%s: Cannot ftell `%s': ", argv[0], fname);
114 | perror(NULL);
115 | exit(EXIT_FAILURE);
116 | }
117 | if(0x7fffffff <= n) {
118 | fprintf(stderr, "%s: Input file `%s' is too big.\n", argv[0], fname);
119 | exit(EXIT_FAILURE);
120 | }
121 | } else {
122 | fprintf(stderr, "%s: Cannot fseek `%s': ", argv[0], fname);
123 | perror(NULL);
124 | exit(EXIT_FAILURE);
125 | }
126 |
127 | /* Allocate 5n bytes of memory. */
128 | T = (sauchar_t *)malloc((size_t)n * sizeof(sauchar_t));
129 | SA = (saidx_t *)malloc((size_t)n * sizeof(saidx_t));
130 | if((T == NULL) || (SA == NULL)) {
131 | fprintf(stderr, "%s: Cannot allocate memory.\n", argv[0]);
132 | exit(EXIT_FAILURE);
133 | }
134 |
135 | /* Read n bytes of data. */
136 | if(fread(T, sizeof(sauchar_t), (size_t)n, fp) != (size_t)n) {
137 | fprintf(stderr, "%s: %s `%s': ",
138 | argv[0],
139 | (ferror(fp) || !feof(fp)) ? "Cannot read from" : "Unexpected EOF in",
140 | fname);
141 | perror(NULL);
142 | exit(EXIT_FAILURE);
143 | }
144 | if(needclose & 1) { fclose(fp); }
145 |
146 | /* Construct the suffix array. */
147 | fprintf(stderr, "%s: %" PRIdOFF_T " bytes ... ", fname, n);
148 | start = clock();
149 | if(divsufsort(T, SA, (saidx_t)n) != 0) {
150 | fprintf(stderr, "%s: Cannot allocate memory.\n", argv[0]);
151 | exit(EXIT_FAILURE);
152 | }
153 | finish = clock();
154 | fprintf(stderr, "%.4f sec\n", (double)(finish - start) / (double)CLOCKS_PER_SEC);
155 |
156 | /* Check the suffix array. */
157 | if(sufcheck(T, SA, (saidx_t)n, 1) != 0) { exit(EXIT_FAILURE); }
158 |
159 | /* Deallocate memory. */
160 | free(SA);
161 | free(T);
162 |
163 | return 0;
164 | }
165 |
--------------------------------------------------------------------------------
/src/libdivsufsort/examples/unbwt.c:
--------------------------------------------------------------------------------
1 | /*
2 | * unbwt.c for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #if HAVE_CONFIG_H
28 | # include "config.h"
29 | #endif
30 | #include <stdio.h>
31 | #if HAVE_STRING_H
32 | # include <string.h>
33 | #endif
34 | #if HAVE_STDLIB_H
35 | # include <stdlib.h>
36 | #endif
37 | #if HAVE_MEMORY_H
38 | # include <memory.h>
39 | #endif
40 | #if HAVE_STDDEF_H
41 | # include <stddef.h>
42 | #endif
43 | #if HAVE_STRINGS_H
44 | # include <strings.h>
45 | #endif
46 | #if HAVE_SYS_TYPES_H
47 | # include <sys/types.h>
48 | #endif
49 | #if HAVE_IO_H && HAVE_FCNTL_H
50 | # include <io.h>
51 | # include <fcntl.h>
52 | #endif
53 | #include <time.h>
54 | #include <divsufsort.h>
55 | #include "lfs.h"
56 |
57 |
58 | static
59 | size_t
60 | read_int(FILE *fp, saidx_t *n) {
61 | unsigned char c[4];
62 | size_t m = fread(c, sizeof(unsigned char), 4, fp);
63 | if(m == 4) {
64 | *n = (c[0] << 0) | (c[1] << 8) |
65 | (c[2] << 16) | (c[3] << 24);
66 | }
67 | return m;
68 | }
69 |
70 | static
71 | void
72 | print_help(const char *progname, int status) {
73 | fprintf(stderr,
74 | "unbwt, an inverse burrows-wheeler transform program, version %s.\n",
75 | divsufsort_version());
76 | fprintf(stderr, "usage: %s INFILE OUTFILE\n\n", progname);
77 | exit(status);
78 | }
79 |
80 | int
81 | main(int argc, const char *argv[]) {
82 | FILE *fp, *ofp;
83 | const char *fname, *ofname;
84 | sauchar_t *T;
85 | saidx_t *A;
86 | LFS_OFF_T n;
87 | size_t m;
88 | saidx_t pidx;
89 | clock_t start, finish;
90 | saint_t err, blocksize, needclose = 3;
91 |
92 | /* Check arguments. */
93 | if((argc == 1) ||
94 | (strcmp(argv[1], "-h") == 0) ||
95 | (strcmp(argv[1], "--help") == 0)) { print_help(argv[0], EXIT_SUCCESS); }
96 | if(argc != 3) { print_help(argv[0], EXIT_FAILURE); }
97 |
98 | /* Open a file for reading. */
99 | if(strcmp(argv[1], "-") != 0) {
100 | #if HAVE_FOPEN_S
101 | if(fopen_s(&fp, fname = argv[1], "rb") != 0) {
102 | #else
103 | if((fp = LFS_FOPEN(fname = argv[1], "rb")) == NULL) {
104 | #endif
105 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], fname);
106 | perror(NULL);
107 | exit(EXIT_FAILURE);
108 | }
109 | } else {
110 | #if HAVE__SETMODE && HAVE__FILENO
111 | if(_setmode(_fileno(stdin), _O_BINARY) == -1) {
112 | fprintf(stderr, "%s: Cannot set mode: ", argv[0]);
113 | perror(NULL);
114 | exit(EXIT_FAILURE);
115 | }
116 | #endif
117 | fp = stdin;
118 | fname = "stdin";
119 | needclose ^= 1;
120 | }
121 |
122 | /* Open a file for writing. */
123 | if(strcmp(argv[2], "-") != 0) {
124 | #if HAVE_FOPEN_S
125 | if(fopen_s(&ofp, ofname = argv[2], "wb") != 0) {
126 | #else
127 | if((ofp = LFS_FOPEN(ofname = argv[2], "wb")) == NULL) {
128 | #endif
129 | fprintf(stderr, "%s: Cannot open file `%s': ", argv[0], ofname);
130 | perror(NULL);
131 | exit(EXIT_FAILURE);
132 | }
133 | } else {
134 | #if HAVE__SETMODE && HAVE__FILENO
135 | if(_setmode(_fileno(stdout), _O_BINARY) == -1) {
136 | fprintf(stderr, "%s: Cannot set mode: ", argv[0]);
137 | perror(NULL);
138 | exit(EXIT_FAILURE);
139 | }
140 | #endif
141 | ofp = stdout;
142 | ofname = "stdout";
143 | needclose ^= 2;
144 | }
145 |
146 | /* Read the blocksize. */
147 | if(read_int(fp, &blocksize) != 4) {
148 | fprintf(stderr, "%s: Cannot read from `%s': ", argv[0], fname);
149 | perror(NULL);
150 | exit(EXIT_FAILURE);
151 | }
152 |
153 | /* Allocate 5 * blocksize bytes of memory (assuming a 4-byte saidx_t). */
154 | T = (sauchar_t *)malloc(blocksize * sizeof(sauchar_t));
155 | A = (saidx_t *)malloc(blocksize * sizeof(saidx_t));
156 | if((T == NULL) || (A == NULL)) {
157 | fprintf(stderr, "%s: Cannot allocate memory.\n", argv[0]);
158 | exit(EXIT_FAILURE);
159 | }
160 |
161 | fprintf(stderr, "UnBWT (blocksize %" PRIdSAINT_T ") ... ", blocksize);
162 | start = clock();
163 | for(n = 0; (m = read_int(fp, &pidx)) != 0; n += m) {
164 | /* Read blocksize bytes of data. */
165 | if((m != 4) || ((m = fread(T, sizeof(sauchar_t), blocksize, fp)) == 0)) {
166 | fprintf(stderr, "%s: %s `%s': ",
167 | argv[0],
168 | (ferror(fp) || !feof(fp)) ? "Cannot read from" : "Unexpected EOF in",
169 | fname);
170 | perror(NULL);
171 | exit(EXIT_FAILURE);
172 | }
173 |
174 | /* Inverse Burrows-Wheeler Transform. */
175 | if((err = inverse_bw_transform(T, T, A, m, pidx)) != 0) {
176 | fprintf(stderr, "%s (reverseBWT): %s.\n",
177 | argv[0],
178 | (err == -1) ? "Invalid data" : "Cannot allocate memory");
179 | exit(EXIT_FAILURE);
180 | }
181 |
182 | /* Write m bytes of data. */
183 | if(fwrite(T, sizeof(sauchar_t), m, ofp) != m) {
184 | fprintf(stderr, "%s: Cannot write to `%s': ", argv[0], ofname);
185 | perror(NULL);
186 | exit(EXIT_FAILURE);
187 | }
188 | }
189 | if(ferror(fp)) {
190 | fprintf(stderr, "%s: Cannot read from `%s': ", argv[0], fname);
191 | perror(NULL);
192 | exit(EXIT_FAILURE);
193 | }
194 | finish = clock();
195 | fprintf(stderr, "%" PRIdOFF_T " bytes: %.4f sec\n",
196 | n, (double)(finish - start) / (double)CLOCKS_PER_SEC);
197 |
198 | /* Close files */
199 | if(needclose & 1) { fclose(fp); }
200 | if(needclose & 2) { fclose(ofp); }
201 |
202 | /* Deallocate memory. */
203 | free(A);
204 | free(T);
205 |
206 | return 0;
207 | }
208 |
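The `read_int` helper in unbwt.c above decodes the 4-byte little-endian fields of the input stream (the blocksize, then one primary index per block). A minimal sketch of the same decoding follows, written with unsigned arithmetic so that a top byte of 0x80 or more does not shift into the sign bit of an `int`. The names `decode_le32`, `encode_le32` and `read_le32` are hypothetical and are not part of libdivsufsort.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode 4 bytes stored least significant byte first.
 * Each byte is widened to uint32_t before shifting, so the shift by 24
 * is well defined even when c[3] >= 0x80. */
uint32_t decode_le32(const unsigned char c[4]) {
  return ((uint32_t)c[0] << 0)  | ((uint32_t)c[1] << 8) |
         ((uint32_t)c[2] << 16) | ((uint32_t)c[3] << 24);
}

/* Inverse of decode_le32: store v least significant byte first. */
void encode_le32(uint32_t v, unsigned char c[4]) {
  c[0] = (unsigned char)(v & 0xFFu);
  c[1] = (unsigned char)((v >> 8) & 0xFFu);
  c[2] = (unsigned char)((v >> 16) & 0xFFu);
  c[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Read one little-endian 32-bit value from fp.
 * Returns the number of bytes actually read; *out is written only when
 * all 4 bytes were available, mirroring the contract of read_int. */
size_t read_le32(FILE *fp, uint32_t *out) {
  unsigned char c[4];
  size_t m = fread(c, 1, sizeof c, fp);
  if (m == sizeof c) { *out = decode_le32(c); }
  return m;
}
```

A caller would treat a return value of 0 as a clean end of input and any value from 1 to 3 as a truncated field, which is how the main loop of unbwt.c distinguishes the two cases.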
--------------------------------------------------------------------------------
/src/libdivsufsort/include/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | include(CheckIncludeFiles)
2 | include(CheckIncludeFile)
3 | include(CheckSymbolExists)
4 | include(CheckTypeSize)
5 | include(CheckFunctionKeywords)
6 | include(CheckLFS)
7 |
8 | ## Checks for header files ##
9 | check_include_file("inttypes.h" HAVE_INTTYPES_H)
10 | check_include_file("memory.h" HAVE_MEMORY_H)
11 | check_include_file("stddef.h" HAVE_STDDEF_H)
12 | check_include_file("stdint.h" HAVE_STDINT_H)
13 | check_include_file("stdlib.h" HAVE_STDLIB_H)
14 | check_include_file("string.h" HAVE_STRING_H)
15 | check_include_file("strings.h" HAVE_STRINGS_H)
16 | check_include_file("sys/types.h" HAVE_SYS_TYPES_H)
17 | if(HAVE_INTTYPES_H)
18 | set(INCFILE "#include <inttypes.h>")
19 | elseif(HAVE_STDINT_H)
20 | set(INCFILE "#include <stdint.h>")
21 | else(HAVE_INTTYPES_H)
22 | set(INCFILE "")
23 | endif(HAVE_INTTYPES_H)
24 |
25 | ## create configuration files from .cmake file ##
26 | if(BUILD_EXAMPLES)
27 | ## Checks for WinIO ##
28 | if(WIN32)
29 | check_include_file("io.h" HAVE_IO_H)
30 | check_include_file("fcntl.h" HAVE_FCNTL_H)
31 | check_symbol_exists("_setmode" "io.h;fcntl.h" HAVE__SETMODE)
32 | if(NOT HAVE__SETMODE)
33 | check_symbol_exists("setmode" "io.h;fcntl.h" HAVE_SETMODE)
34 | endif(NOT HAVE__SETMODE)
35 | check_symbol_exists("_fileno" "stdio.h" HAVE__FILENO)
36 | check_symbol_exists("fopen_s" "stdio.h" HAVE_FOPEN_S)
37 | check_symbol_exists("_O_BINARY" "fcntl.h" HAVE__O_BINARY)
38 | endif(WIN32)
39 |
40 | ## Checks for large file support ##
41 | check_lfs(WITH_LFS)
42 | configure_file("${CMAKE_CURRENT_SOURCE_DIR}/lfs.h.cmake" "${CMAKE_CURRENT_BINARY_DIR}/lfs.h" @ONLY)
43 | endif(BUILD_EXAMPLES)
44 |
45 | ## generate config.h ##
46 | check_function_keywords("inline;__inline;__inline__;__declspec(dllexport);__declspec(dllimport)")
47 | if(HAVE_INLINE)
48 | set(INLINE "inline")
49 | elseif(HAVE___INLINE)
50 | set(INLINE "__inline")
51 | elseif(HAVE___INLINE__)
52 | set(INLINE "__inline__")
53 | else(HAVE_INLINE)
54 | set(INLINE "")
55 | endif(HAVE_INLINE)
56 | configure_file("${CMAKE_CURRENT_SOURCE_DIR}/config.h.cmake" "${CMAKE_CURRENT_BINARY_DIR}/config.h")
57 |
58 | ## Checks for types ##
59 | # sauchar_t (8bit)
60 | check_type_size("uint8_t" UINT8_T)
61 | if(HAVE_UINT8_T)
62 | set(SAUCHAR_TYPE "uint8_t")
63 | else(HAVE_UINT8_T)
64 | check_type_size("unsigned char" SIZEOF_UNSIGNED_CHAR)
65 | if("${SIZEOF_UNSIGNED_CHAR}" STREQUAL "1")
66 | set(SAUCHAR_TYPE "unsigned char")
67 | else("${SIZEOF_UNSIGNED_CHAR}" STREQUAL "1")
68 | message(FATAL_ERROR "Cannot find unsigned 8-bit integer type")
69 | endif("${SIZEOF_UNSIGNED_CHAR}" STREQUAL "1")
70 | endif(HAVE_UINT8_T)
71 | # saint_t (32bit)
72 | check_type_size("int32_t" INT32_T)
73 | if(HAVE_INT32_T)
74 | set(SAINT32_TYPE "int32_t")
75 | check_symbol_exists("PRId32" "inttypes.h" HAVE_PRID32)
76 | if(HAVE_PRID32)
77 | set(SAINT32_PRId "PRId32")
78 | else(HAVE_PRID32)
79 | set(SAINT32_PRId "\"d\"")
80 | endif(HAVE_PRID32)
81 | else(HAVE_INT32_T)
82 | check_type_size("int" SIZEOF_INT)
83 | check_type_size("long" SIZEOF_LONG)
84 | check_type_size("short" SIZEOF_SHORT)
85 | check_type_size("__int32" SIZEOF___INT32)
86 | if("${SIZEOF_INT}" STREQUAL "4")
87 | set(SAINT32_TYPE "int")
88 | set(SAINT32_PRId "\"d\"")
89 | elseif("${SIZEOF_LONG}" STREQUAL "4")
90 | set(SAINT32_TYPE "long")
91 | set(SAINT32_PRId "\"ld\"")
92 | elseif("${SIZEOF_SHORT}" STREQUAL "4")
93 | set(SAINT32_TYPE "short")
94 | set(SAINT32_PRId "\"d\"")
95 | elseif("${SIZEOF___INT32}" STREQUAL "4")
96 | set(SAINT32_TYPE "__int32")
97 | set(SAINT32_PRId "\"d\"")
98 | else("${SIZEOF_INT}" STREQUAL "4")
99 | message(FATAL_ERROR "Cannot find 32-bit integer type")
100 | endif("${SIZEOF_INT}" STREQUAL "4")
101 | endif(HAVE_INT32_T)
102 | # saint64_t (64bit)
103 | if(BUILD_DIVSUFSORT64)
104 | check_type_size("int64_t" INT64_T)
105 | if(HAVE_INT64_T)
106 | set(SAINT64_TYPE "int64_t")
107 | check_symbol_exists("PRId64" "inttypes.h" HAVE_PRID64)
108 | if(HAVE_PRID64)
109 | set(SAINT64_PRId "PRId64")
110 | else(HAVE_PRID64)
111 | set(SAINT64_PRId "\"lld\"")
112 | endif(HAVE_PRID64)
113 | else(HAVE_INT64_T)
114 | check_type_size("int" SIZEOF_INT)
115 | check_type_size("long" SIZEOF_LONG)
116 | check_type_size("long long" SIZEOF_LONG_LONG)
117 | check_type_size("__int64" SIZEOF___INT64)
118 | if("${SIZEOF_INT}" STREQUAL "8")
119 | set(SAINT64_TYPE "int")
120 | set(SAINT64_PRId "\"d\"")
121 | elseif("${SIZEOF_LONG}" STREQUAL "8")
122 | set(SAINT64_TYPE "long")
123 | set(SAINT64_PRId "\"ld\"")
124 | elseif("${SIZEOF_LONG_LONG}" STREQUAL "8")
125 | set(SAINT64_TYPE "long long")
126 | set(SAINT64_PRId "\"lld\"")
127 | elseif("${SIZEOF___INT64}" STREQUAL "8")
128 | set(SAINT64_TYPE "__int64")
129 | set(SAINT64_PRId "\"I64d\"")
130 | else("${SIZEOF_INT}" STREQUAL "8")
131 | message(SEND_ERROR "Cannot find 64-bit integer type")
132 | set(BUILD_DIVSUFSORT64 OFF)
133 | endif("${SIZEOF_INT}" STREQUAL "8")
134 | endif(HAVE_INT64_T)
135 | endif(BUILD_DIVSUFSORT64)
136 |
137 | ## generate divsufsort.h ##
138 | set(DIVSUFSORT_IMPORT "")
139 | set(DIVSUFSORT_EXPORT "")
140 | if(BUILD_SHARED_LIBS)
141 | if(HAVE___DECLSPEC_DLLIMPORT_)
142 | set(DIVSUFSORT_IMPORT "__declspec(dllimport)")
143 | endif(HAVE___DECLSPEC_DLLIMPORT_)
144 | if(HAVE___DECLSPEC_DLLEXPORT_)
145 | set(DIVSUFSORT_EXPORT "__declspec(dllexport)")
146 | endif(HAVE___DECLSPEC_DLLEXPORT_)
147 | endif(BUILD_SHARED_LIBS)
148 | set(W64BIT "")
149 | set(SAINDEX_TYPE "${SAINT32_TYPE}")
150 | set(SAINDEX_PRId "${SAINT32_PRId}")
151 | set(SAINT_PRId "${SAINT32_PRId}")
152 | configure_file("${CMAKE_CURRENT_SOURCE_DIR}/divsufsort.h.cmake"
153 | "${CMAKE_CURRENT_BINARY_DIR}/divsufsort.h" @ONLY)
154 | install(FILES "${CMAKE_CURRENT_BINARY_DIR}/divsufsort.h" DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
155 | if(BUILD_DIVSUFSORT64)
156 | set(W64BIT "64")
157 | set(SAINDEX_TYPE "${SAINT64_TYPE}")
158 | set(SAINDEX_PRId "${SAINT64_PRId}")
159 | configure_file("${CMAKE_CURRENT_SOURCE_DIR}/divsufsort.h.cmake"
160 | "${CMAKE_CURRENT_BINARY_DIR}/divsufsort64.h" @ONLY)
161 | install(FILES "${CMAKE_CURRENT_BINARY_DIR}/divsufsort64.h" DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
162 | endif(BUILD_DIVSUFSORT64)
163 |
--------------------------------------------------------------------------------
/src/libdivsufsort/include/config.h.cmake:
--------------------------------------------------------------------------------
1 | /*
2 | * config.h for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #ifndef _CONFIG_H
28 | #define _CONFIG_H 1
29 |
30 | #ifdef __cplusplus
31 | extern "C" {
32 | #endif /* __cplusplus */
33 |
34 | /** Define to the version of this package. **/
35 | #cmakedefine PROJECT_VERSION_FULL "${PROJECT_VERSION_FULL}"
36 |
37 | /** Define to 1 if you have the header files. **/
38 | #cmakedefine HAVE_INTTYPES_H 1
39 | #cmakedefine HAVE_STDDEF_H 1
40 | #cmakedefine HAVE_STDINT_H 1
41 | #cmakedefine HAVE_STDLIB_H 1
42 | #cmakedefine HAVE_STRING_H 1
43 | #cmakedefine HAVE_STRINGS_H 1
44 | #cmakedefine HAVE_MEMORY_H 1
45 | #cmakedefine HAVE_SYS_TYPES_H 1
46 |
47 | /** for WinIO **/
48 | #cmakedefine HAVE_IO_H 1
49 | #cmakedefine HAVE_FCNTL_H 1
50 | #cmakedefine HAVE__SETMODE 1
51 | #cmakedefine HAVE_SETMODE 1
52 | #cmakedefine HAVE__FILENO 1
53 | #cmakedefine HAVE_FOPEN_S 1
54 | #cmakedefine HAVE__O_BINARY 1
55 | #ifndef HAVE__SETMODE
56 | # if HAVE_SETMODE
57 | # define _setmode setmode
58 | # define HAVE__SETMODE 1
59 | # endif
60 | # if HAVE__SETMODE && !HAVE__O_BINARY
61 | # define _O_BINARY 0
62 | # define HAVE__O_BINARY 1
63 | # endif
64 | #endif
65 |
66 | /** for inline **/
67 | #ifndef INLINE
68 | # define INLINE @INLINE@
69 | #endif
70 |
71 | /** for VC++ warning **/
72 | #ifdef _MSC_VER
73 | #pragma warning(disable: 4127)
74 | #endif
75 |
76 |
77 | #ifdef __cplusplus
78 | } /* extern "C" */
79 | #endif /* __cplusplus */
80 |
81 | #endif /* _CONFIG_H */
82 |
--------------------------------------------------------------------------------
/src/libdivsufsort/include/divsufsort.h:
--------------------------------------------------------------------------------
1 | /*
2 | * divsufsort.h for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #ifndef _DIVSUFSORT_H
28 | #define _DIVSUFSORT_H 1
29 |
30 | #ifdef __cplusplus
31 | extern "C" {
32 | #endif /* __cplusplus */
33 |
34 | #define DIVSUFSORT_API
35 |
36 | /*- Datatypes -*/
37 | #ifndef SAUCHAR_T
38 | #define SAUCHAR_T
39 | typedef unsigned char sauchar_t;
40 | #endif /* SAUCHAR_T */
41 | #ifndef SAINT_T
42 | #define SAINT_T
43 | typedef int saint_t;
44 | #endif /* SAINT_T */
45 | #ifndef SAIDX_T
46 | #define SAIDX_T
47 | typedef int saidx_t;
48 | #endif /* SAIDX_T */
49 | #ifndef PRIdSAIDX_T
50 | #define PRIdSAIDX_T "d"
51 | #endif
52 |
53 | /*- divsufsort context -*/
54 | typedef struct _divsufsort_ctx_t {
55 | saidx_t *bucket_A;
56 | saidx_t *bucket_B;
57 | } divsufsort_ctx_t;
58 |
59 | /*- Prototypes -*/
60 |
61 | /**
62 | * Initialize suffix array context
63 | *
64 | * @return 0 for success, or non-zero in case of an error
65 | */
66 | int divsufsort_init(divsufsort_ctx_t *ctx);
67 |
68 | /**
69 | * Destroy suffix array context
70 | *
71 | * @param ctx suffix array context to destroy
72 | */
73 | void divsufsort_destroy(divsufsort_ctx_t *ctx);
74 |
75 | /**
76 | * Constructs the suffix array of a given string.
77 | * @param ctx suffix array context
78 | * @param T[0..n-1] The input string.
79 | * @param SA[0..n-1] The output array of suffixes.
80 | * @param n The length of the given string.
81 | * @return 0 if no error occurred, -1 or -2 otherwise.
82 | */
83 | DIVSUFSORT_API
84 | saint_t divsufsort_build_array(divsufsort_ctx_t *ctx, const sauchar_t *T, saidx_t *SA, saidx_t n);
85 |
86 | #if 0
87 | /**
88 | * Constructs the burrows-wheeler transformed string of a given string.
89 | * @param T[0..n-1] The input string.
90 | * @param U[0..n-1] The output string. (can be T)
91 | * @param A[0..n-1] The temporary array. (can be NULL)
92 | * @param n The length of the given string.
93 | * @return The primary index if no error occurred, -1 or -2 otherwise.
94 | */
95 | DIVSUFSORT_API
96 | saidx_t
97 | divbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n);
98 |
99 | /**
100 | * Returns the version of the divsufsort library.
101 | * @return The version number string.
102 | */
103 | DIVSUFSORT_API
104 | const char *
105 | divsufsort_version(void);
106 |
107 |
108 | /**
109 | * Constructs the burrows-wheeler transformed string of a given string and suffix array.
110 | * @param T[0..n-1] The input string.
111 | * @param U[0..n-1] The output string. (can be T)
112 | * @param SA[0..n-1] The suffix array. (can be NULL)
113 | * @param n The length of the given string.
114 | * @param idx The output primary index.
115 | * @return 0 if no error occurred, -1 or -2 otherwise.
116 | */
117 | DIVSUFSORT_API
118 | saint_t
119 | bw_transform(const sauchar_t *T, sauchar_t *U,
120 | saidx_t *SA /* can be NULL */,
121 | saidx_t n, saidx_t *idx);
122 |
123 | /**
124 | * Inverse BW-transforms a given BWTed string.
125 | * @param T[0..n-1] The input string.
126 | * @param U[0..n-1] The output string. (can be T)
127 | * @param A[0..n-1] The temporary array. (can be NULL)
128 | * @param n The length of the given string.
129 | * @param idx The primary index.
130 | * @return 0 if no error occurred, -1 or -2 otherwise.
131 | */
132 | DIVSUFSORT_API
133 | saint_t
134 | inverse_bw_transform(const sauchar_t *T, sauchar_t *U,
135 | saidx_t *A /* can be NULL */,
136 | saidx_t n, saidx_t idx);
137 |
138 | /**
139 | * Checks the correctness of a given suffix array.
140 | * @param T[0..n-1] The input string.
141 | * @param SA[0..n-1] The input suffix array.
142 | * @param n The length of the given string.
143 | * @param verbose The verbose mode.
144 | * @return 0 if no error occurred.
145 | */
146 | DIVSUFSORT_API
147 | saint_t
148 | sufcheck(const sauchar_t *T, const saidx_t *SA, saidx_t n, saint_t verbose);
149 |
150 | /**
151 | * Search for the pattern P in the string T.
152 | * @param T[0..Tsize-1] The input string.
153 | * @param Tsize The length of the given string.
154 | * @param P[0..Psize-1] The input pattern string.
155 | * @param Psize The length of the given pattern string.
156 | * @param SA[0..SAsize-1] The input suffix array.
157 | * @param SAsize The length of the given suffix array.
158 | * @param left The output index.
159 | * @return The count of matches if no error occurred, -1 otherwise.
160 | */
161 | DIVSUFSORT_API
162 | saidx_t
163 | sa_search(const sauchar_t *T, saidx_t Tsize,
164 | const sauchar_t *P, saidx_t Psize,
165 | const saidx_t *SA, saidx_t SAsize,
166 | saidx_t *left);
167 |
168 | /**
169 | * Search for the character c in the string T.
170 | * @param T[0..Tsize-1] The input string.
171 | * @param Tsize The length of the given string.
172 | * @param SA[0..SAsize-1] The input suffix array.
173 | * @param SAsize The length of the given suffix array.
174 | * @param c The input character.
175 | * @param left The output index.
176 | * @return The count of matches if no error occurred, -1 otherwise.
177 | */
178 | DIVSUFSORT_API
179 | saidx_t
180 | sa_simplesearch(const sauchar_t *T, saidx_t Tsize,
181 | const saidx_t *SA, saidx_t SAsize,
182 | saint_t c, saidx_t *left);
183 | #endif
184 |
185 | #ifdef __cplusplus
186 | } /* extern "C" */
187 | #endif /* __cplusplus */
188 |
189 | #endif /* _DIVSUFSORT_H */
190 |
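The header above documents `divsufsort_build_array` as filling `SA[0..n-1]` with the suffixes of `T[0..n-1]` in sorted order. To make that output contract concrete, here is a deliberately naive sketch that produces the same kind of array with `qsort`. It is not the algorithm used by libdivsufsort (which relies on the `sssort`/`trsort` machinery and bucket arrays), and `naive_suffix_array` is a hypothetical name used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* qsort has no context argument, so the text is passed through file-scope
 * variables. This makes the sketch non-reentrant; fine for illustration. */
static const unsigned char *g_text;
static int g_len;

/* Order two suffix start positions by the lexicographic order of the
 * suffixes they denote. */
static int cmp_suffix(const void *a, const void *b) {
  int i = *(const int *)a, j = *(const int *)b;
  int li = g_len - i, lj = g_len - j;
  int r = memcmp(g_text + i, g_text + j, (size_t)(li < lj ? li : lj));
  if (r != 0) { return r; }
  return li - lj; /* when one suffix is a prefix of the other, shorter first */
}

/* Fill SA[0..n-1] with the start positions of the suffixes of T[0..n-1]
 * in sorted order. Returns 0 on success, -1 on invalid arguments.
 * Worst-case cost is O(n^2 log n), far slower than a real construction. */
int naive_suffix_array(const unsigned char *T, int *SA, int n) {
  int i;
  if (T == NULL || SA == NULL || n < 0) { return -1; }
  for (i = 0; i < n; ++i) { SA[i] = i; }
  g_text = T;
  g_len = n;
  qsort(SA, (size_t)n, sizeof(int), cmp_suffix);
  return 0;
}
```

For the text `banana` the sorted suffixes are `a`, `ana`, `anana`, `banana`, `na`, `nana`, so the resulting array is 5, 3, 1, 0, 4, 2.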
--------------------------------------------------------------------------------
/src/libdivsufsort/include/divsufsort.h.cmake:
--------------------------------------------------------------------------------
1 | /*
2 | * divsufsort@W64BIT@.h for libdivsufsort@W64BIT@
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #ifndef _DIVSUFSORT@W64BIT@_H
28 | #define _DIVSUFSORT@W64BIT@_H 1
29 |
30 | #ifdef __cplusplus
31 | extern "C" {
32 | #endif /* __cplusplus */
33 |
34 | @INCFILE@
35 |
36 | #ifndef DIVSUFSORT_API
37 | # ifdef DIVSUFSORT_BUILD_DLL
38 | # define DIVSUFSORT_API @DIVSUFSORT_EXPORT@
39 | # else
40 | # define DIVSUFSORT_API @DIVSUFSORT_IMPORT@
41 | # endif
42 | #endif
43 |
44 | /*- Datatypes -*/
45 | #ifndef SAUCHAR_T
46 | #define SAUCHAR_T
47 | typedef @SAUCHAR_TYPE@ sauchar_t;
48 | #endif /* SAUCHAR_T */
49 | #ifndef SAINT_T
50 | #define SAINT_T
51 | typedef @SAINT32_TYPE@ saint_t;
52 | #endif /* SAINT_T */
53 | #ifndef SAIDX@W64BIT@_T
54 | #define SAIDX@W64BIT@_T
55 | typedef @SAINDEX_TYPE@ saidx@W64BIT@_t;
56 | #endif /* SAIDX@W64BIT@_T */
57 | #ifndef PRIdSAINT_T
58 | #define PRIdSAINT_T @SAINT_PRId@
59 | #endif /* PRIdSAINT_T */
60 | #ifndef PRIdSAIDX@W64BIT@_T
61 | #define PRIdSAIDX@W64BIT@_T @SAINDEX_PRId@
62 | #endif /* PRIdSAIDX@W64BIT@_T */
63 |
64 |
65 | /*- Prototypes -*/
66 |
67 | /**
68 | * Constructs the suffix array of a given string.
69 | * @param T[0..n-1] The input string.
70 | * @param SA[0..n-1] The output array of suffixes.
71 | * @param n The length of the given string.
72 | * @return 0 if no error occurred, -1 or -2 otherwise.
73 | */
74 | DIVSUFSORT_API
75 | saint_t
76 | divsufsort@W64BIT@(const sauchar_t *T, saidx@W64BIT@_t *SA, saidx@W64BIT@_t n);
77 |
78 | /**
79 | * Constructs the burrows-wheeler transformed string of a given string.
80 | * @param T[0..n-1] The input string.
81 | * @param U[0..n-1] The output string. (can be T)
82 | * @param A[0..n-1] The temporary array. (can be NULL)
83 | * @param n The length of the given string.
84 | * @return The primary index if no error occurred, -1 or -2 otherwise.
85 | */
86 | DIVSUFSORT_API
87 | saidx@W64BIT@_t
88 | divbwt@W64BIT@(const sauchar_t *T, sauchar_t *U, saidx@W64BIT@_t *A, saidx@W64BIT@_t n);
89 |
90 | /**
91 | * Returns the version of the divsufsort library.
92 | * @return The version number string.
93 | */
94 | DIVSUFSORT_API
95 | const char *
96 | divsufsort@W64BIT@_version(void);
97 |
98 |
99 | /**
100 | * Constructs the burrows-wheeler transformed string of a given string and suffix array.
101 | * @param T[0..n-1] The input string.
102 | * @param U[0..n-1] The output string. (can be T)
103 | * @param SA[0..n-1] The suffix array. (can be NULL)
104 | * @param n The length of the given string.
105 | * @param idx The output primary index.
106 | * @return 0 if no error occurred, -1 or -2 otherwise.
107 | */
108 | DIVSUFSORT_API
109 | saint_t
110 | bw_transform@W64BIT@(const sauchar_t *T, sauchar_t *U,
111 | saidx@W64BIT@_t *SA /* can be NULL */,
112 | saidx@W64BIT@_t n, saidx@W64BIT@_t *idx);
113 |
114 | /**
115 | * Inverse BW-transforms a given BWTed string.
116 | * @param T[0..n-1] The input string.
117 | * @param U[0..n-1] The output string. (can be T)
118 | * @param A[0..n-1] The temporary array. (can be NULL)
119 | * @param n The length of the given string.
120 | * @param idx The primary index.
121 | * @return 0 if no error occurred, -1 or -2 otherwise.
122 | */
123 | DIVSUFSORT_API
124 | saint_t
125 | inverse_bw_transform@W64BIT@(const sauchar_t *T, sauchar_t *U,
126 | saidx@W64BIT@_t *A /* can be NULL */,
127 | saidx@W64BIT@_t n, saidx@W64BIT@_t idx);
128 |
129 | /**
130 | * Checks the correctness of a given suffix array.
131 | * @param T[0..n-1] The input string.
132 | * @param SA[0..n-1] The input suffix array.
133 | * @param n The length of the given string.
134 | * @param verbose The verbose mode.
135 | * @return 0 if no error occurred.
136 | */
137 | DIVSUFSORT_API
138 | saint_t
139 | sufcheck@W64BIT@(const sauchar_t *T, const saidx@W64BIT@_t *SA, saidx@W64BIT@_t n, saint_t verbose);
140 |
141 | /**
142 | * Search for the pattern P in the string T.
143 | * @param T[0..Tsize-1] The input string.
144 | * @param Tsize The length of the given string.
145 | * @param P[0..Psize-1] The input pattern string.
146 | * @param Psize The length of the given pattern string.
147 | * @param SA[0..SAsize-1] The input suffix array.
148 | * @param SAsize The length of the given suffix array.
149 | * @param left The output index.
150 | * @return The count of matches if no error occurred, -1 otherwise.
151 | */
152 | DIVSUFSORT_API
153 | saidx@W64BIT@_t
154 | sa_search@W64BIT@(const sauchar_t *T, saidx@W64BIT@_t Tsize,
155 | const sauchar_t *P, saidx@W64BIT@_t Psize,
156 | const saidx@W64BIT@_t *SA, saidx@W64BIT@_t SAsize,
157 | saidx@W64BIT@_t *left);
158 |
159 | /**
160 | * Search for the character c in the string T.
161 | * @param T[0..Tsize-1] The input string.
162 | * @param Tsize The length of the given string.
163 | * @param SA[0..SAsize-1] The input suffix array.
164 | * @param SAsize The length of the given suffix array.
165 | * @param c The input character.
166 | * @param left The output index.
167 | * @return The count of matches if no error occurred, -1 otherwise.
168 | */
169 | DIVSUFSORT_API
170 | saidx@W64BIT@_t
171 | sa_simplesearch@W64BIT@(const sauchar_t *T, saidx@W64BIT@_t Tsize,
172 | const saidx@W64BIT@_t *SA, saidx@W64BIT@_t SAsize,
173 | saint_t c, saidx@W64BIT@_t *left);
174 |
175 |
176 | #ifdef __cplusplus
177 | } /* extern "C" */
178 | #endif /* __cplusplus */
179 |
180 | #endif /* _DIVSUFSORT@W64BIT@_H */
181 |
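The `sa_search` prototype in the template above is documented as returning the count of matches for a pattern and writing an output index through `left`. One common way to obtain both values from a suffix array is a pair of binary searches that bracket the run of suffixes starting with the pattern. The sketch below illustrates that idea under stated assumptions: `count_matches` and `cmp_prefix` are hypothetical names, plain `int` stands in for `saidx_t`, and the code is not taken from the library's own implementation.

```c
#include <stddef.h>
#include <string.h>

/* Compare the suffix of T[0..n-1] starting at pos with pattern P[0..m-1],
 * looking at no more than m bytes. Returns <0, 0 or >0. A suffix that is
 * shorter than the pattern and matches it so far sorts before the pattern. */
static int cmp_prefix(const unsigned char *T, int n, int pos,
                      const unsigned char *P, int m) {
  int len = n - pos;
  int k = (len < m) ? len : m;
  int r = memcmp(T + pos, P, (size_t)k);
  if (r != 0) { return r; }
  return (len < m) ? -1 : 0;
}

/* Count the occurrences of P in T using its suffix array SA[0..n-1].
 * If left is not NULL, it receives the first index in SA whose suffix
 * starts with P (meaningful only when the returned count is positive). */
int count_matches(const unsigned char *T, int n, const int *SA,
                  const unsigned char *P, int m, int *left) {
  int lo = 0, hi = n, first;

  /* Lower bound: first suffix that is not smaller than P. */
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;
    if (cmp_prefix(T, n, SA[mid], P, m) < 0) { lo = mid + 1; } else { hi = mid; }
  }
  first = lo;

  /* Upper bound: first suffix whose leading bytes are greater than P. */
  hi = n;
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;
    if (cmp_prefix(T, n, SA[mid], P, m) <= 0) { lo = mid + 1; } else { hi = mid; }
  }

  if (left != NULL) { *left = first; }
  return lo - first;
}
```

With `T = "banana"` and its suffix array 5, 3, 1, 0, 4, 2, searching for `ana` yields a count of 2 with `left` set to 1, because the matching suffixes `ana` and `anana` occupy positions 1 and 2 of the array.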
--------------------------------------------------------------------------------
/src/libdivsufsort/include/divsufsort_config.h:
--------------------------------------------------------------------------------
1 | #define HAVE_STRING_H 1
2 | #define HAVE_STDLIB_H 1
3 | #define HAVE_MEMORY_H 1
4 | #define HAVE_STDINT_H 1
5 | #define INLINE inline
6 |
7 | #ifdef _MSC_VER
8 | #pragma warning( disable : 4244 )
9 | #endif /* _MSC_VER */
10 |
--------------------------------------------------------------------------------
/src/libdivsufsort/include/divsufsort_private.h:
--------------------------------------------------------------------------------
1 | /*
2 | * divsufsort_private.h for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #ifndef _DIVSUFSORT_PRIVATE_H
28 | #define _DIVSUFSORT_PRIVATE_H 1
29 |
30 | #ifdef __cplusplus
31 | extern "C" {
32 | #endif /* __cplusplus */
33 |
34 | #include "divsufsort_config.h"
35 | #include <assert.h>
36 | #include <stdio.h>
37 | #if HAVE_STRING_H
38 | # include <string.h>
39 | #endif
40 | #if HAVE_STDLIB_H
41 | # include <stdlib.h>
42 | #endif
43 | #if HAVE_MEMORY_H
44 | # include <memory.h>
45 | #endif
46 | #if HAVE_STDDEF_H
47 | # include <stddef.h>
48 | #endif
49 | #if HAVE_STRINGS_H
50 | # include <strings.h>
51 | #endif
52 | #if HAVE_INTTYPES_H
53 | # include <inttypes.h>
54 | #else
55 | # if HAVE_STDINT_H
56 | # include <stdint.h>
57 | # endif
58 | #endif
59 | #if defined(BUILD_DIVSUFSORT64)
60 | # include "divsufsort64.h"
61 | # ifndef SAIDX_T
62 | # define SAIDX_T
63 | # define saidx_t saidx64_t
64 | # endif /* SAIDX_T */
65 | # ifndef PRIdSAIDX_T
66 | # define PRIdSAIDX_T PRIdSAIDX64_T
67 | # endif /* PRIdSAIDX_T */
68 | # define divsufsort divsufsort64
69 | # define divbwt divbwt64
70 | # define divsufsort_version divsufsort64_version
71 | # define bw_transform bw_transform64
72 | # define inverse_bw_transform inverse_bw_transform64
73 | # define sufcheck sufcheck64
74 | # define sa_search sa_search64
75 | # define sa_simplesearch sa_simplesearch64
76 | # define sssort sssort64
77 | # define trsort trsort64
78 | #else
79 | # include "divsufsort.h"
80 | #endif
81 |
82 |
83 | /*- Constants -*/
84 | #if !defined(UINT8_MAX)
85 | # define UINT8_MAX (255)
86 | #endif /* UINT8_MAX */
87 | #if defined(ALPHABET_SIZE) && (ALPHABET_SIZE < 1)
88 | # undef ALPHABET_SIZE
89 | #endif
90 | #if !defined(ALPHABET_SIZE)
91 | # define ALPHABET_SIZE (UINT8_MAX + 1)
92 | #endif
93 | /* for divsufsort.c */
94 | #define BUCKET_A_SIZE (ALPHABET_SIZE)
95 | #define BUCKET_B_SIZE (ALPHABET_SIZE * ALPHABET_SIZE)
96 | /* for sssort.c */
97 | #if defined(SS_INSERTIONSORT_THRESHOLD)
98 | # if SS_INSERTIONSORT_THRESHOLD < 1
99 | # undef SS_INSERTIONSORT_THRESHOLD
100 | # define SS_INSERTIONSORT_THRESHOLD (1)
101 | # endif
102 | #else
103 | # define SS_INSERTIONSORT_THRESHOLD (8)
104 | #endif
105 | #if defined(SS_BLOCKSIZE)
106 | # if SS_BLOCKSIZE < 0
107 | # undef SS_BLOCKSIZE
108 | # define SS_BLOCKSIZE (0)
109 | # elif 32768 <= SS_BLOCKSIZE
110 | # undef SS_BLOCKSIZE
111 | # define SS_BLOCKSIZE (32767)
112 | # endif
113 | #else
114 | # define SS_BLOCKSIZE (1024)
115 | #endif
116 | /* minstacksize = log(SS_BLOCKSIZE) / log(3) * 2 */
117 | #if SS_BLOCKSIZE == 0
118 | # if defined(BUILD_DIVSUFSORT64)
119 | # define SS_MISORT_STACKSIZE (96)
120 | # else
121 | # define SS_MISORT_STACKSIZE (64)
122 | # endif
123 | #elif SS_BLOCKSIZE <= 4096
124 | # define SS_MISORT_STACKSIZE (16)
125 | #else
126 | # define SS_MISORT_STACKSIZE (24)
127 | #endif
128 | #if defined(BUILD_DIVSUFSORT64)
129 | # define SS_SMERGE_STACKSIZE (64)
130 | #else
131 | # define SS_SMERGE_STACKSIZE (32)
132 | #endif
133 | /* for trsort.c */
134 | #define TR_INSERTIONSORT_THRESHOLD (8)
135 | #if defined(BUILD_DIVSUFSORT64)
136 | # define TR_STACKSIZE (96)
137 | #else
138 | # define TR_STACKSIZE (64)
139 | #endif
140 |
141 |
142 | /*- Macros -*/
143 | #ifndef SWAP
144 | # define SWAP(_a, _b) do { t = (_a); (_a) = (_b); (_b) = t; } while(0)
145 | #endif /* SWAP */
146 | #ifndef MIN
147 | # define MIN(_a, _b) (((_a) < (_b)) ? (_a) : (_b))
148 | #endif /* MIN */
149 | #ifndef MAX
150 | # define MAX(_a, _b) (((_a) > (_b)) ? (_a) : (_b))
151 | #endif /* MAX */
152 | #define STACK_PUSH(_a, _b, _c, _d)\
153 | do {\
154 | assert(ssize < STACK_SIZE);\
155 | stack[ssize].a = (_a), stack[ssize].b = (_b),\
156 | stack[ssize].c = (_c), stack[ssize++].d = (_d);\
157 | } while(0)
158 | #define STACK_PUSH5(_a, _b, _c, _d, _e)\
159 | do {\
160 | assert(ssize < STACK_SIZE);\
161 | stack[ssize].a = (_a), stack[ssize].b = (_b),\
162 | stack[ssize].c = (_c), stack[ssize].d = (_d), stack[ssize++].e = (_e);\
163 | } while(0)
164 | #define STACK_POP(_a, _b, _c, _d)\
165 | do {\
166 | assert(0 <= ssize);\
167 | if(ssize == 0) { return; }\
168 | (_a) = stack[--ssize].a, (_b) = stack[ssize].b,\
169 | (_c) = stack[ssize].c, (_d) = stack[ssize].d;\
170 | } while(0)
171 | #define STACK_POP5(_a, _b, _c, _d, _e)\
172 | do {\
173 | assert(0 <= ssize);\
174 | if(ssize == 0) { return; }\
175 | (_a) = stack[--ssize].a, (_b) = stack[ssize].b,\
176 | (_c) = stack[ssize].c, (_d) = stack[ssize].d, (_e) = stack[ssize].e;\
177 | } while(0)
178 | /* for divsufsort.c */
179 | #define BUCKET_A(_c0) bucket_A[(_c0)]
180 | #if ALPHABET_SIZE == 256
181 | #define BUCKET_B(_c0, _c1) (bucket_B[((_c1) << 8) | (_c0)])
182 | #define BUCKET_BSTAR(_c0, _c1) (bucket_B[((_c0) << 8) | (_c1)])
183 | #else
184 | #define BUCKET_B(_c0, _c1) (bucket_B[(_c1) * ALPHABET_SIZE + (_c0)])
185 | #define BUCKET_BSTAR(_c0, _c1) (bucket_B[(_c0) * ALPHABET_SIZE + (_c1)])
186 | #endif
187 |
188 |
189 | /*- Private Prototypes -*/
190 | /* sssort.c */
191 | void
192 | sssort(const sauchar_t *Td, const saidx_t *PA,
193 | saidx_t *first, saidx_t *last,
194 | saidx_t *buf, saidx_t bufsize,
195 | saidx_t depth, saidx_t n, saint_t lastsuffix);
196 | /* trsort.c */
197 | void
198 | trsort(saidx_t *ISA, saidx_t *SA, saidx_t n, saidx_t depth);
199 |
200 |
201 | #ifdef __cplusplus
202 | } /* extern "C" */
203 | #endif /* __cplusplus */
204 |
205 | #endif /* _DIVSUFSORT_PRIVATE_H */
206 |
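
The `BUCKET_B` and `BUCKET_BSTAR` macros above address one shared array of `ALPHABET_SIZE * ALPHABET_SIZE` entries with the two characters swapped, and for a 256-symbol alphabet the multiplication is replaced by a shift. The following is a minimal sketch of that equivalence. The helper names (`bucket_b_index`, `bucket_bstar_index` and their `_256` variants) are hypothetical and are not part of libdivsufsort.

```c
#include <assert.h>

#define ALPHABET_SIZE 256

/* Hypothetical helper: offset computed by the generic BUCKET_B(c0, c1). */
static int bucket_b_index(int c0, int c1) {
    assert(0 <= c0 && c0 < ALPHABET_SIZE);
    assert(0 <= c1 && c1 < ALPHABET_SIZE);
    return c1 * ALPHABET_SIZE + c0;
}

/* Hypothetical helper: offset computed by the generic BUCKET_BSTAR(c0, c1). */
static int bucket_bstar_index(int c0, int c1) {
    assert(0 <= c0 && c0 < ALPHABET_SIZE);
    assert(0 <= c1 && c1 < ALPHABET_SIZE);
    return c0 * ALPHABET_SIZE + c1;
}

/* Shift/or forms, as selected when ALPHABET_SIZE == 256. */
static int bucket_b_index_256(int c0, int c1)     { return (c1 << 8) | c0; }
static int bucket_bstar_index_256(int c0, int c1) { return (c0 << 8) | c1; }
```

Because both macros map into the same array, `BUCKET_BSTAR(c0, c1)` and `BUCKET_B(c1, c0)` name the same slot; only the argument order differs.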
--------------------------------------------------------------------------------
/src/libdivsufsort/include/lfs.h.cmake:
--------------------------------------------------------------------------------
1 | /*
2 | * lfs.h for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #ifndef _LFS_H
28 | #define _LFS_H 1
29 |
30 | #ifdef __cplusplus
31 | extern "C" {
32 | #endif /* __cplusplus */
33 |
34 | #ifndef __STRICT_ANSI__
35 | # define LFS_OFF_T @LFS_OFF_T@
36 | # define LFS_FOPEN @LFS_FOPEN@
37 | # define LFS_FTELL @LFS_FTELL@
38 | # define LFS_FSEEK @LFS_FSEEK@
39 | # define LFS_PRId @LFS_PRID@
40 | #else
41 | # define LFS_OFF_T long
42 | # define LFS_FOPEN fopen
43 | # define LFS_FTELL ftell
44 | # define LFS_FSEEK fseek
45 | # define LFS_PRId "ld"
46 | #endif
47 | #ifndef PRIdOFF_T
48 | # define PRIdOFF_T LFS_PRId
49 | #endif
50 |
51 |
52 | #ifdef __cplusplus
53 | } /* extern "C" */
54 | #endif /* __cplusplus */
55 |
56 | #endif /* _LFS_H */
57 |
--------------------------------------------------------------------------------
/src/libdivsufsort/lib/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | include_directories("${CMAKE_CURRENT_SOURCE_DIR}/../include"
2 | "${CMAKE_CURRENT_BINARY_DIR}/../include")
3 |
4 | set(divsufsort_SRCS divsufsort.c sssort.c trsort.c divsufsort_utils.c)
5 |
6 | ## libdivsufsort ##
7 | add_library(divsufsort ${divsufsort_SRCS})
8 | install(TARGETS divsufsort
9 | RUNTIME DESTINATION ${CMAKE_INSTALL_RUNTIMEDIR}
10 | LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
11 | ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR})
12 | set_target_properties(divsufsort PROPERTIES
13 | VERSION "${LIBRARY_VERSION}"
14 | SOVERSION "${LIBRARY_SOVERSION}"
15 | DEFINE_SYMBOL DIVSUFSORT_BUILD_DLL
16 | RUNTIME_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/../examples")
17 |
18 | ## libdivsufsort64 ##
19 | if(BUILD_DIVSUFSORT64)
20 | add_library(divsufsort64 ${divsufsort_SRCS})
21 | install(TARGETS divsufsort64
22 | RUNTIME DESTINATION ${CMAKE_INSTALL_RUNTIMEDIR}
23 | LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
24 | ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR})
25 | set_target_properties(divsufsort64 PROPERTIES
26 | VERSION "${LIBRARY_VERSION}"
27 | SOVERSION "${LIBRARY_SOVERSION}"
28 | DEFINE_SYMBOL DIVSUFSORT_BUILD_DLL
29 | COMPILE_FLAGS "-DBUILD_DIVSUFSORT64"
30 | RUNTIME_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/../examples")
31 | endif(BUILD_DIVSUFSORT64)
32 |
--------------------------------------------------------------------------------
/src/libdivsufsort/lib/divsufsort_utils.c:
--------------------------------------------------------------------------------
1 | /*
2 | * utils.c for libdivsufsort
3 | * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 | *
5 | * Permission is hereby granted, free of charge, to any person
6 | * obtaining a copy of this software and associated documentation
7 | * files (the "Software"), to deal in the Software without
8 | * restriction, including without limitation the rights to use,
9 | * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | * copies of the Software, and to permit persons to whom the
11 | * Software is furnished to do so, subject to the following
12 | * conditions:
13 | *
14 | * The above copyright notice and this permission notice shall be
15 | * included in all copies or substantial portions of the Software.
16 | *
17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 | * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 | * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 | * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 | * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 | * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 | * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 | * OTHER DEALINGS IN THE SOFTWARE.
25 | */
26 |
27 | #include "divsufsort_private.h"
28 |
29 |
30 | /*- Private Function -*/
31 |
32 | #if 0
33 | /* Binary search for inverse bwt. */
34 | static
35 | saidx_t
36 | binarysearch_lower(const saidx_t *A, saidx_t size, saidx_t value) {
37 | saidx_t half, i;
38 | for(i = 0, half = size >> 1;
39 | 0 < size;
40 | size = half, half >>= 1) {
41 | if(A[i + half] < value) {
42 | i += half + 1;
43 | half -= (size & 1) ^ 1;
44 | }
45 | }
46 | return i;
47 | }
48 |
49 |
50 | /*- Functions -*/
51 |
52 | /* Burrows-Wheeler transform. */
53 | saint_t
54 | bw_transform(const sauchar_t *T, sauchar_t *U, saidx_t *SA,
55 | saidx_t n, saidx_t *idx) {
56 | saidx_t *A, i, j, p, t;
57 | saint_t c;
58 |
59 | /* Check arguments. */
60 | if((T == NULL) || (U == NULL) || (n < 0) || (idx == NULL)) { return -1; }
61 | if(n <= 1) {
62 | if(n == 1) { U[0] = T[0]; }
63 | *idx = n;
64 | return 0;
65 | }
66 |
67 | if((A = SA) == NULL) {
68 | i = divbwt(T, U, NULL, n);
69 | if(0 <= i) { *idx = i; i = 0; }
70 | return (saint_t)i;
71 | }
72 |
73 | /* BW transform. */
74 | if(T == U) {
75 | t = n;
76 | for(i = 0, j = 0; i < n; ++i) {
77 | p = t - 1;
78 | t = A[i];
79 | if(0 <= p) {
80 | c = T[j];
81 | U[j] = (j <= p) ? T[p] : (sauchar_t)A[p];
82 | A[j] = c;
83 | j++;
84 | } else {
85 | *idx = i;
86 | }
87 | }
88 | p = t - 1;
89 | if(0 <= p) {
90 | c = T[j];
91 | U[j] = (j <= p) ? T[p] : (sauchar_t)A[p];
92 | A[j] = c;
93 | } else {
94 | *idx = i;
95 | }
96 | } else {
97 | U[0] = T[n - 1];
98 | for(i = 0; A[i] != 0; ++i) { U[i + 1] = T[A[i] - 1]; }
99 | *idx = i + 1;
100 | for(++i; i < n; ++i) { U[i] = T[A[i] - 1]; }
101 | }
102 |
103 | if(SA == NULL) {
104 | /* Deallocate memory. */
105 | free(A);
106 | }
107 |
108 | return 0;
109 | }
110 |
111 | /* Inverse Burrows-Wheeler transform. */
112 | saint_t
113 | inverse_bw_transform(const sauchar_t *T, sauchar_t *U, saidx_t *A,
114 | saidx_t n, saidx_t idx) {
115 | saidx_t C[ALPHABET_SIZE];
116 | sauchar_t D[ALPHABET_SIZE];
117 | saidx_t *B;
118 | saidx_t i, p;
119 | saint_t c, d;
120 |
121 | /* Check arguments. */
122 | if((T == NULL) || (U == NULL) || (n < 0) || (idx < 0) ||
123 | (n < idx) || ((0 < n) && (idx == 0))) {
124 | return -1;
125 | }
126 | if(n <= 1) { return 0; }
127 |
128 | if((B = A) == NULL) {
129 | /* Allocate n*sizeof(saidx_t) bytes of memory. */
130 | if((B = (saidx_t *)malloc((size_t)n * sizeof(saidx_t))) == NULL) { return -2; }
131 | }
132 |
133 | /* Inverse BW transform. */
134 | for(c = 0; c < ALPHABET_SIZE; ++c) { C[c] = 0; }
135 | for(i = 0; i < n; ++i) { ++C[T[i]]; }
136 | for(c = 0, d = 0, i = 0; c < ALPHABET_SIZE; ++c) {
137 | p = C[c];
138 | if(0 < p) {
139 | C[c] = i;
140 | D[d++] = (sauchar_t)c;
141 | i += p;
142 | }
143 | }
144 | for(i = 0; i < idx; ++i) { B[C[T[i]]++] = i; }
145 | for( ; i < n; ++i) { B[C[T[i]]++] = i + 1; }
146 | for(c = 0; c < d; ++c) { C[c] = C[D[c]]; }
147 | for(i = 0, p = idx; i < n; ++i) {
148 | U[i] = D[binarysearch_lower(C, d, p)];
149 | p = B[p - 1];
150 | }
151 |
152 | if(A == NULL) {
153 | /* Deallocate memory. */
154 | free(B);
155 | }
156 |
157 | return 0;
158 | }
159 |
160 | /* Checks the suffix array SA of the string T. */
161 | saint_t
162 | sufcheck(const sauchar_t *T, const saidx_t *SA,
163 | saidx_t n, saint_t verbose) {
164 | saidx_t C[ALPHABET_SIZE];
165 | saidx_t i, p, q, t;
166 | saint_t c;
167 |
168 | if(verbose) { fprintf(stderr, "sufcheck: "); }
169 |
170 | /* Check arguments. */
171 | if((T == NULL) || (SA == NULL) || (n < 0)) {
172 | if(verbose) { fprintf(stderr, "Invalid arguments.\n"); }
173 | return -1;
174 | }
175 | if(n == 0) {
176 | if(verbose) { fprintf(stderr, "Done.\n"); }
177 | return 0;
178 | }
179 |
180 | /* check range: [0..n-1] */
181 | for(i = 0; i < n; ++i) {
182 | if((SA[i] < 0) || (n <= SA[i])) {
183 | if(verbose) {
184 | fprintf(stderr, "Out of the range [0,%" PRIdSAIDX_T "].\n"
185 | " SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "\n",
186 | n - 1, i, SA[i]);
187 | }
188 | return -2;
189 | }
190 | }
191 |
192 | /* check first characters. */
193 | for(i = 1; i < n; ++i) {
194 | if(T[SA[i - 1]] > T[SA[i]]) {
195 | if(verbose) {
196 | fprintf(stderr, "Suffixes in wrong order.\n"
197 | " T[SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "]=%d"
198 | " > T[SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "]=%d\n",
199 | i - 1, SA[i - 1], T[SA[i - 1]], i, SA[i], T[SA[i]]);
200 | }
201 | return -3;
202 | }
203 | }
204 |
205 | /* check suffixes. */
206 | for(i = 0; i < ALPHABET_SIZE; ++i) { C[i] = 0; }
207 | for(i = 0; i < n; ++i) { ++C[T[i]]; }
208 | for(i = 0, p = 0; i < ALPHABET_SIZE; ++i) {
209 | t = C[i];
210 | C[i] = p;
211 | p += t;
212 | }
213 |
214 | q = C[T[n - 1]];
215 | C[T[n - 1]] += 1;
216 | for(i = 0; i < n; ++i) {
217 | p = SA[i];
218 | if(0 < p) {
219 | c = T[--p];
220 | t = C[c];
221 | } else {
222 | c = T[p = n - 1];
223 | t = q;
224 | }
225 | if((t < 0) || (p != SA[t])) {
226 | if(verbose) {
227 | fprintf(stderr, "Suffix in wrong position.\n"
228 | " SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T " or\n"
229 | " SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "\n",
230 | t, (0 <= t) ? SA[t] : -1, i, SA[i]);
231 | }
232 | return -4;
233 | }
234 | if(t != q) {
235 | ++C[c];
236 | if((n <= C[c]) || (T[SA[C[c]]] != c)) { C[c] = -1; }
237 | }
238 | }
239 |
240 | if(1 <= verbose) { fprintf(stderr, "Done.\n"); }
241 | return 0;
242 | }
243 |
244 |
245 | static
246 | int
247 | _compare(const sauchar_t *T, saidx_t Tsize,
248 | const sauchar_t *P, saidx_t Psize,
249 | saidx_t suf, saidx_t *match) {
250 | saidx_t i, j;
251 | saint_t r;
252 | for(i = suf + *match, j = *match, r = 0;
253 | (i < Tsize) && (j < Psize) && ((r = T[i] - P[j]) == 0); ++i, ++j) { }
254 | *match = j;
255 | return (r == 0) ? -(j != Psize) : r;
256 | }
257 |
258 | /* Search for the pattern P in the string T. */
259 | saidx_t
260 | sa_search(const sauchar_t *T, saidx_t Tsize,
261 | const sauchar_t *P, saidx_t Psize,
262 | const saidx_t *SA, saidx_t SAsize,
263 | saidx_t *idx) {
264 | saidx_t size, lsize, rsize, half;
265 | saidx_t match, lmatch, rmatch;
266 | saidx_t llmatch, lrmatch, rlmatch, rrmatch;
267 | saidx_t i, j, k;
268 | saint_t r;
269 |
270 | if(idx != NULL) { *idx = -1; }
271 | if((T == NULL) || (P == NULL) || (SA == NULL) ||
272 | (Tsize < 0) || (Psize < 0) || (SAsize < 0)) { return -1; }
273 | if((Tsize == 0) || (SAsize == 0)) { return 0; }
274 | if(Psize == 0) { if(idx != NULL) { *idx = 0; } return SAsize; }
275 |
276 | for(i = j = k = 0, lmatch = rmatch = 0, size = SAsize, half = size >> 1;
277 | 0 < size;
278 | size = half, half >>= 1) {
279 | match = MIN(lmatch, rmatch);
280 | r = _compare(T, Tsize, P, Psize, SA[i + half], &match);
281 | if(r < 0) {
282 | i += half + 1;
283 | half -= (size & 1) ^ 1;
284 | lmatch = match;
285 | } else if(r > 0) {
286 | rmatch = match;
287 | } else {
288 | lsize = half, j = i, rsize = size - half - 1, k = i + half + 1;
289 |
290 | /* left part */
291 | for(llmatch = lmatch, lrmatch = match, half = lsize >> 1;
292 | 0 < lsize;
293 | lsize = half, half >>= 1) {
294 | lmatch = MIN(llmatch, lrmatch);
295 | r = _compare(T, Tsize, P, Psize, SA[j + half], &lmatch);
296 | if(r < 0) {
297 | j += half + 1;
298 | half -= (lsize & 1) ^ 1;
299 | llmatch = lmatch;
300 | } else {
301 | lrmatch = lmatch;
302 | }
303 | }
304 |
305 | /* right part */
306 | for(rlmatch = match, rrmatch = rmatch, half = rsize >> 1;
307 | 0 < rsize;
308 | rsize = half, half >>= 1) {
309 | rmatch = MIN(rlmatch, rrmatch);
310 | r = _compare(T, Tsize, P, Psize, SA[k + half], &rmatch);
311 | if(r <= 0) {
312 | k += half + 1;
313 | half -= (rsize & 1) ^ 1;
314 | rlmatch = rmatch;
315 | } else {
316 | rrmatch = rmatch;
317 | }
318 | }
319 |
320 | break;
321 | }
322 | }
323 |
324 | if(idx != NULL) { *idx = (0 < (k - j)) ? j : i; }
325 | return k - j;
326 | }
327 |
328 | /* Search for the character c in the string T. */
329 | saidx_t
330 | sa_simplesearch(const sauchar_t *T, saidx_t Tsize,
331 | const saidx_t *SA, saidx_t SAsize,
332 | saint_t c, saidx_t *idx) {
333 | saidx_t size, lsize, rsize, half;
334 | saidx_t i, j, k, p;
335 | saint_t r;
336 |
337 | if(idx != NULL) { *idx = -1; }
338 | if((T == NULL) || (SA == NULL) || (Tsize < 0) || (SAsize < 0)) { return -1; }
339 | if((Tsize == 0) || (SAsize == 0)) { return 0; }
340 |
341 | for(i = j = k = 0, size = SAsize, half = size >> 1;
342 | 0 < size;
343 | size = half, half >>= 1) {
344 | p = SA[i + half];
345 | r = (p < Tsize) ? T[p] - c : -1;
346 | if(r < 0) {
347 | i += half + 1;
348 | half -= (size & 1) ^ 1;
349 | } else if(r == 0) {
350 | lsize = half, j = i, rsize = size - half - 1, k = i + half + 1;
351 |
352 | /* left part */
353 | for(half = lsize >> 1;
354 | 0 < lsize;
355 | lsize = half, half >>= 1) {
356 | p = SA[j + half];
357 | r = (p < Tsize) ? T[p] - c : -1;
358 | if(r < 0) {
359 | j += half + 1;
360 | half -= (lsize & 1) ^ 1;
361 | }
362 | }
363 |
364 | /* right part */
365 | for(half = rsize >> 1;
366 | 0 < rsize;
367 | rsize = half, half >>= 1) {
368 | p = SA[k + half];
369 | r = (p < Tsize) ? T[p] - c : -1;
370 | if(r <= 0) {
371 | k += half + 1;
372 | half -= (rsize & 1) ^ 1;
373 | }
374 | }
375 |
376 | break;
377 | }
378 | }
379 |
380 | if(idx != NULL) { *idx = (0 < (k - j)) ? j : i; }
381 | return k - j;
382 | }
383 | #endif
384 |
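
The disabled `binarysearch_lower` above is a lower-bound search: it returns the first index whose element is not less than `value`. Below is a standalone sketch of the same loop, using `int` in place of `saidx_t` so that it compiles without the library headers (that substitution and the name `lower_bound_sketch` are assumptions made for illustration).

```c
#include <assert.h>
#include <stddef.h>

/* Returns the first index i in the sorted range A[0..size) with
 * A[i] >= value, or size if every element is smaller. */
static int lower_bound_sketch(const int *A, int size, int value) {
    int half, i;
    assert(A != NULL || size == 0);
    for (i = 0, half = size >> 1; 0 < size; size = half, half >>= 1) {
        if (A[i + half] < value) {
            i += half + 1;           /* drop the left part and the probed element */
            half -= (size & 1) ^ 1;  /* the right part is one shorter when size is even */
        }
    }
    return i;
}
```

With the sorted array `{1, 3, 5, 7}`, searching for 5 returns index 2 and searching for 6 returns index 3, the position where 6 would be inserted.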
--------------------------------------------------------------------------------
/src/libdivsufsort/pkgconfig/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | ## generate libdivsufsort.pc ##
2 | set(W64BIT "")
3 | configure_file("${CMAKE_CURRENT_SOURCE_DIR}/libdivsufsort.pc.cmake" "${CMAKE_CURRENT_BINARY_DIR}/libdivsufsort.pc" @ONLY)
4 | install(FILES "${CMAKE_CURRENT_BINARY_DIR}/libdivsufsort.pc" DESTINATION ${CMAKE_INSTALL_PKGCONFIGDIR})
5 | if(BUILD_DIVSUFSORT64)
6 | set(W64BIT "64")
7 | configure_file("${CMAKE_CURRENT_SOURCE_DIR}/libdivsufsort.pc.cmake" "${CMAKE_CURRENT_BINARY_DIR}/libdivsufsort64.pc" @ONLY)
8 | install(FILES "${CMAKE_CURRENT_BINARY_DIR}/libdivsufsort64.pc" DESTINATION ${CMAKE_INSTALL_PKGCONFIGDIR})
9 | endif(BUILD_DIVSUFSORT64)
10 |
--------------------------------------------------------------------------------
/src/libdivsufsort/pkgconfig/libdivsufsort.pc.cmake:
--------------------------------------------------------------------------------
1 | prefix=@CMAKE_INSTALL_PREFIX@
2 | exec_prefix=${prefix}
3 | libdir=@CMAKE_INSTALL_LIBDIR@
4 | includedir=@CMAKE_INSTALL_INCLUDEDIR@
5 |
6 | Name: @PROJECT_NAME@@W64BIT@
7 | Description: @PROJECT_DESCRIPTION@
8 | Version: @PROJECT_VERSION_FULL@
9 | URL: @PROJECT_URL@
10 | Libs: -L${libdir} -ldivsufsort@W64BIT@
11 | Cflags: -I${includedir}
12 |
--------------------------------------------------------------------------------
/src/matchfinder.c:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/emmanuel-marty/lz4ultra/f80965cb37a4586cdbe109f02d1ee676f8408889/src/matchfinder.c
--------------------------------------------------------------------------------
/src/matchfinder.h:
--------------------------------------------------------------------------------
1 | /*
2 | * matchfinder.h - LZ match finder implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _MATCHFINDER_H
34 | #define _MATCHFINDER_H
35 |
36 | /* Forward declarations */
37 | typedef struct _lz4ultra_match lz4ultra_match;
38 | typedef struct _lz4ultra_compressor lz4ultra_compressor;
39 |
40 | /**
41 | * Parse input data, build suffix array and overlaid data structures to speed up match finding
42 | *
43 | * @param pCompressor compression context
44 | * @param pInWindow pointer to input data window (previously compressed bytes + bytes to compress)
45 | * @param nInWindowSize total input size in bytes (previously compressed bytes + bytes to compress)
46 | *
47 | * @return 0 for success, non-zero for failure
48 | */
49 | int lz4ultra_build_suffix_array(lz4ultra_compressor *pCompressor, const unsigned char *pInWindow, const int nInWindowSize);
50 |
51 | /**
52 | * Skip previously compressed bytes
53 | *
54 | * @param pCompressor compression context
55 | * @param nStartOffset current offset in input window (typically 0)
56 | * @param nEndOffset offset to skip to in input window (typically the number of previously compressed bytes)
57 | */
58 | void lz4ultra_skip_matches(lz4ultra_compressor *pCompressor, const int nStartOffset, const int nEndOffset);
59 |
60 | /**
61 | * Find all matches for the data to be compressed.
62 | *
63 | * @param pCompressor compression context
64 | * @param nStartOffset current offset in input window (typically the number of previously compressed bytes)
65 | * @param nEndOffset offset to end finding matches at (typically the size of the total input window in bytes)
66 | */
67 | void lz4ultra_find_all_matches(lz4ultra_compressor *pCompressor, const int nStartOffset, const int nEndOffset);
68 |
69 | #endif /* _MATCHFINDER_H */
70 |
--------------------------------------------------------------------------------
/src/shrink_block.h:
--------------------------------------------------------------------------------
1 | /*
2 | * shrink_block.h - optimal LZ4 block compressor definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _SHRINK_BLOCK_H
34 | #define _SHRINK_BLOCK_H
35 |
36 | /* Forward declarations */
37 | typedef struct _lz4ultra_compressor lz4ultra_compressor;
38 |
39 | /**
40 | * Select the optimal matches, reduce the token count if possible, and then emit a block of compressed LZ4 data
41 | *
42 | * @param pCompressor compression context
43 | * @param pInWindow pointer to input data window (previously compressed bytes + bytes to compress)
44 | * @param nPreviousBlockSize number of previously compressed bytes (or 0 for none)
45 | * @param nInDataSize number of input bytes to compress
46 | * @param pOutData pointer to output buffer
47 | * @param nMaxOutDataSize maximum size of output buffer, in bytes
48 | *
49 | * @return size of compressed data in output buffer, or -1 if the data is uncompressible
50 | */
51 | int lz4ultra_optimize_and_write_block(lz4ultra_compressor *pCompressor, const unsigned char *pInWindow, const int nPreviousBlockSize, const int nInDataSize, unsigned char *pOutData, const int nMaxOutDataSize);
52 |
53 | #endif /* _SHRINK_BLOCK_H */
54 |
--------------------------------------------------------------------------------
/src/shrink_context.c:
--------------------------------------------------------------------------------
1 | /*
2 | * shrink_context.c - compression context implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #include <stdlib.h>
34 | #include <string.h>
35 | #include "shrink_context.h"
36 | #include "shrink_block.h"
37 | #include "matchfinder.h"
38 |
39 | /**
40 | * Initialize compression context
41 | *
42 | * @param pCompressor compression context to initialize
43 | * @param nMaxWindowSize maximum size of input data window (previously compressed bytes + bytes to compress)
44 | * @param nFlags compression flags
45 | *
46 | * @return 0 for success, non-zero for failure
47 | */
48 | int lz4ultra_compressor_init(lz4ultra_compressor *pCompressor, const int nMaxWindowSize, const int nFlags) {
49 | int nResult;
50 |
51 | nResult = divsufsort_init(&pCompressor->divsufsort_context);
52 | pCompressor->intervals = NULL;
53 | pCompressor->pos_data = NULL;
54 | pCompressor->open_intervals = NULL;
55 | pCompressor->match = NULL;
56 | pCompressor->flags = nFlags;
57 | pCompressor->num_commands = 0;
58 |
59 | if (!nResult) {
60 | pCompressor->intervals = (unsigned long long *)malloc(nMaxWindowSize * sizeof(unsigned long long));
61 |
62 | if (pCompressor->intervals) {
63 | pCompressor->pos_data = (unsigned long long *)malloc(nMaxWindowSize * sizeof(unsigned long long));
64 |
65 | if (pCompressor->pos_data) {
66 | pCompressor->open_intervals = (unsigned long long *)malloc((LCP_MAX + 1) * sizeof(unsigned long long));
67 |
68 | if (pCompressor->open_intervals) {
69 | pCompressor->match = (lz4ultra_match *)malloc(nMaxWindowSize * sizeof(lz4ultra_match));
70 |
71 | if (pCompressor->match)
72 | return 0;
73 | }
74 | }
75 | }
76 | }
77 |
78 | lz4ultra_compressor_destroy(pCompressor);
79 | return 100;
80 | }
81 |
82 | /**
83 | * Clean up compression context and free up any associated resources
84 | *
85 | * @param pCompressor compression context to clean up
86 | */
87 | void lz4ultra_compressor_destroy(lz4ultra_compressor *pCompressor) {
88 | divsufsort_destroy(&pCompressor->divsufsort_context);
89 |
90 | if (pCompressor->match) {
91 | free(pCompressor->match);
92 | pCompressor->match = NULL;
93 | }
94 |
95 | if (pCompressor->open_intervals) {
96 | free(pCompressor->open_intervals);
97 | pCompressor->open_intervals = NULL;
98 | }
99 |
100 | if (pCompressor->pos_data) {
101 | free(pCompressor->pos_data);
102 | pCompressor->pos_data = NULL;
103 | }
104 |
105 | if (pCompressor->intervals) {
106 | free(pCompressor->intervals);
107 | pCompressor->intervals = NULL;
108 | }
109 | }
110 |
111 | /**
112 | * Compress one block of data
113 | *
114 | * @param pCompressor compression context
115 | * @param pInWindow pointer to input data window (previously compressed bytes + bytes to compress)
116 | * @param nPreviousBlockSize number of previously compressed bytes (or 0 for none)
117 | * @param nInDataSize number of input bytes to compress
118 | * @param pOutData pointer to output buffer
119 | * @param nMaxOutDataSize maximum size of output buffer, in bytes
120 | *
121 | * @return size of compressed data in output buffer, or -1 if the suffix array cannot be built or the data is uncompressible
122 | */
123 | int lz4ultra_compressor_shrink_block(lz4ultra_compressor *pCompressor, const unsigned char *pInWindow, const int nPreviousBlockSize, const int nInDataSize, unsigned char *pOutData, const int nMaxOutDataSize) {
124 | if (lz4ultra_build_suffix_array(pCompressor, pInWindow, nPreviousBlockSize + nInDataSize))
125 | return -1;
126 | if (nPreviousBlockSize) {
127 | lz4ultra_skip_matches(pCompressor, 0, nPreviousBlockSize);
128 | }
129 | lz4ultra_find_all_matches(pCompressor, nPreviousBlockSize, nPreviousBlockSize + nInDataSize);
130 | return lz4ultra_optimize_and_write_block(pCompressor, pInWindow, nPreviousBlockSize, nInDataSize, pOutData, nMaxOutDataSize);
131 | }
132 |
133 | /**
134 | * Get the number of compression commands issued in compressed data blocks
135 | *
136 | * @return number of commands
137 | */
138 | int lz4ultra_compressor_get_command_count(lz4ultra_compressor *pCompressor) {
139 | return pCompressor->num_commands;
140 | }
141 |
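
`lz4ultra_compressor_init` above clears every pointer before the first allocation, so its single failure path can call the destroy function safely no matter which `malloc` failed. The following is a reduced sketch of that pattern. `demo_ctx`, `demo_init` and `demo_destroy` are hypothetical names, the two buffers stand in for the real context's arrays, and the size check is an addition that the original does not perform.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the compression context. */
typedef struct {
    unsigned long long *intervals;
    unsigned long long *pos_data;
} demo_ctx;

/* Free whatever was allocated; safe on a partially initialized context. */
static void demo_destroy(demo_ctx *ctx) {
    assert(ctx != NULL);
    free(ctx->pos_data);    /* free(NULL) is a no-op */
    ctx->pos_data = NULL;
    free(ctx->intervals);
    ctx->intervals = NULL;
}

/* Returns 0 for success, non-zero for failure. */
static int demo_init(demo_ctx *ctx, int max_window_size) {
    assert(ctx != NULL);

    /* Clear all pointers first so demo_destroy() never sees garbage. */
    ctx->intervals = NULL;
    ctx->pos_data = NULL;

    /* Assumption for this sketch: reject non-positive sizes up front. */
    if (max_window_size <= 0)
        return 100;

    ctx->intervals = malloc((size_t)max_window_size * sizeof *ctx->intervals);
    if (ctx->intervals) {
        ctx->pos_data = malloc((size_t)max_window_size * sizeof *ctx->pos_data);
        if (ctx->pos_data)
            return 0;
    }

    /* Single cleanup path for every failure above. */
    demo_destroy(ctx);
    return 100;
}
```

Resetting each pointer to `NULL` after `free` also makes a second call to the destroy function harmless.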
--------------------------------------------------------------------------------
/src/shrink_context.h:
--------------------------------------------------------------------------------
1 | /*
2 | * shrink_context.h - compression context definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _SHRINK_CONTEXT_H
34 | #define _SHRINK_CONTEXT_H
35 |
36 | #include "divsufsort.h"
37 |
38 | #define LCP_BITS 15
39 | #define LCP_MAX (1LL<<(LCP_BITS - 1))
40 | #define LCP_SHIFT (39-LCP_BITS)
41 | #define LCP_MASK (((1ULL<
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #include <stdlib.h>
34 | #include <string.h>
35 | #include "shrink_inmem.h"
36 | #include "frame.h"
37 | #include "format.h"
38 | #include "lib.h"
39 |
40 | /**
41 | * Get maximum compressed size of input(source) data
42 | *
43 | * @param nInputSize input(source) size in bytes
44 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
45 | * @param nBlockMaxCode maximum block size code (4..7 for 64 KB..4 MB)
46 | *
47 | * @return maximum compressed size
48 | */
49 | size_t lz4ultra_get_max_compressed_size_inmem(size_t nInputSize, unsigned int nFlags, int nBlockMaxCode) {
50 | int nBlockMaxBits;
51 | int nBlockMaxSize;
52 |
53 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES) {
54 | nBlockMaxBits = 23;
55 | }
56 | else {
57 | nBlockMaxBits = 8 + (nBlockMaxCode << 1);
58 | }
59 | nBlockMaxSize = 1 << nBlockMaxBits;
60 |
61 | if (nInputSize < nBlockMaxSize && (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES) == 0) {
62 | /* If the entire input data is shorter than the specified block size, try to reduce the
63 | * block size until it is the smallest one that can fit the data */
64 |
65 | do {
66 | nBlockMaxBits = 8 + (nBlockMaxCode << 1);
67 | nBlockMaxSize = 1 << nBlockMaxBits;
68 |
69 | int nPrevBlockMaxBits = 8 + ((nBlockMaxCode - 1) << 1);
70 | int nPrevBlockMaxSize = 1 << nPrevBlockMaxBits;
71 | if (nBlockMaxCode > 4 && nPrevBlockMaxSize > nInputSize) {
72 | nBlockMaxCode--;
73 | }
74 | else
75 | break;
76 | } while (1);
77 | }
78 |
79 | return LZ4ULTRA_MAX_HEADER_SIZE + ((nInputSize + (nBlockMaxSize - 1)) >> nBlockMaxBits) * LZ4ULTRA_FRAME_SIZE + nInputSize + LZ4ULTRA_FRAME_SIZE /* footer */;
80 | }
81 |
82 | /**
83 | * Compress memory
84 | *
85 | * @param pInputData pointer to input(source) data to compress
86 | * @param pOutBuffer buffer for compressed data
87 | * @param nInputSize input(source) size in bytes
88 | * @param nMaxOutBufferSize maximum capacity of compression buffer
89 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
90 | * @param nBlockMaxCode maximum block size code (4..7 for 64 KB..4 MB)
91 | *
92 | * @return actual compressed size, or (size_t)-1 for error
93 | */
94 | size_t lz4ultra_compress_inmem(const unsigned char *pInputData, unsigned char *pOutBuffer, size_t nInputSize, size_t nMaxOutBufferSize, unsigned int nFlags, int nBlockMaxCode) {
95 | lz4ultra_compressor compressor;
96 | size_t nOriginalSize = 0L;
97 | size_t nCompressedSize = 0L;
98 | int nBlockMaxBits;
99 | int nBlockMaxSize;
100 | int nResult;
101 | int nError = 0;
102 |
103 | if (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES) {
104 | nBlockMaxBits = 23;
105 | nFlags |= LZ4ULTRA_FLAG_INDEP_BLOCKS;
106 | }
107 | else {
108 | nBlockMaxBits = 8 + (nBlockMaxCode << 1);
109 | }
110 | nBlockMaxSize = 1 << nBlockMaxBits;
111 |
112 | if (nInputSize < nBlockMaxSize && (nFlags & LZ4ULTRA_FLAG_LEGACY_FRAMES) == 0) {
113 | /* If the entire input data is shorter than the specified block size, try to reduce the
114 | * block size until it is the smallest one that can fit the data */
115 |
116 | do {
117 | nBlockMaxBits = 8 + (nBlockMaxCode << 1);
118 | nBlockMaxSize = 1 << nBlockMaxBits;
119 |
120 | int nPrevBlockMaxBits = 8 + ((nBlockMaxCode - 1) << 1);
121 | int nPrevBlockMaxSize = 1 << nPrevBlockMaxBits;
122 | if (nBlockMaxCode > 4 && nPrevBlockMaxSize > nInputSize) {
123 | nBlockMaxCode--;
124 | }
125 | else
126 | break;
127 | } while (1);
128 | }
129 |
130 | nResult = lz4ultra_compressor_init(&compressor, nBlockMaxSize + HISTORY_SIZE, nFlags);
131 | if (nResult != 0) {
132 | return -1; /* use the documented error return; a status code would be read as a size */
133 | }
134 |
135 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) == 0) {
136 | int nHeaderSize = lz4ultra_encode_header(pOutBuffer + nCompressedSize, (int)(nMaxOutBufferSize - nCompressedSize), nFlags, nBlockMaxCode);
137 | if (nHeaderSize < 0)
138 | nError = LZ4ULTRA_ERROR_COMPRESSION;
139 | else {
140 | nCompressedSize += nHeaderSize;
141 | }
142 | }
143 |
144 | int nPreviousBlockSize = 0;
145 | int nNumBlocks = 0;
146 |
147 | while (nOriginalSize < nInputSize && !nError) {
148 | int nInDataSize;
149 |
150 | nInDataSize = (int)(nInputSize - nOriginalSize);
151 | if (nInDataSize > nBlockMaxSize)
152 | nInDataSize = nBlockMaxSize;
153 |
154 | if (nInDataSize > 0) {
155 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) != 0 && (nNumBlocks || nInDataSize > 0x400000)) {
156 | nError = LZ4ULTRA_ERROR_RAW_TOOLARGE;
157 | break;
158 | }
159 |
160 | int nOutDataSize;
161 | int nOutDataEnd = (int)(nMaxOutBufferSize - LZ4ULTRA_FRAME_SIZE - LZ4ULTRA_FRAME_SIZE /* footer */ - nCompressedSize);
162 | int nHeaderOffset = LZ4ULTRA_FRAME_SIZE;
163 |
164 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) != 0) {
165 | nHeaderOffset = 0;
166 | nOutDataEnd = (int)(nMaxOutBufferSize - nCompressedSize);
167 | }
168 |
169 | if (nOutDataEnd > nBlockMaxSize)
170 | nOutDataEnd = nBlockMaxSize;
171 |
172 | nOutDataSize = lz4ultra_compressor_shrink_block(&compressor, pInputData + nOriginalSize - nPreviousBlockSize, nPreviousBlockSize, nInDataSize, pOutBuffer + nHeaderOffset + nCompressedSize, nOutDataEnd);
173 | if (nOutDataSize >= 0) {
174 | int nFrameHeaderSize = 0;
175 |
176 | /* Compressed block */
177 |
178 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) == 0) {
179 | nFrameHeaderSize = lz4ultra_encode_compressed_block_frame(pOutBuffer + nCompressedSize, (int)(nMaxOutBufferSize - nCompressedSize), nFlags, nOutDataSize);
180 | if (nFrameHeaderSize < 0)
181 | nError = LZ4ULTRA_ERROR_COMPRESSION;
182 | }
183 |
184 | if (!nError) {
185 | nOriginalSize += nInDataSize;
186 | nCompressedSize += nFrameHeaderSize + nOutDataSize;
187 | }
188 | }
189 | else {
190 | /* Write uncompressible, literal block */
191 |
192 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) != 0) {
193 | /* Uncompressible data isn't supported by raw blocks */
194 | nError = LZ4ULTRA_ERROR_RAW_UNCOMPRESSED;
195 | break;
196 | }
197 |
198 | int nFrameHeaderSize;
199 |
200 | nFrameHeaderSize = lz4ultra_encode_uncompressed_block_frame(pOutBuffer + nCompressedSize, (int)(nMaxOutBufferSize - nCompressedSize), nFlags, nInDataSize);
201 | if (nFrameHeaderSize < 0)
202 | nError = LZ4ULTRA_ERROR_COMPRESSION;
203 | else {
204 | if (nInDataSize > (nMaxOutBufferSize - (nCompressedSize + nFrameHeaderSize)))
205 | nError = LZ4ULTRA_ERROR_DST;
206 | else {
207 | memcpy(pOutBuffer + nFrameHeaderSize + nCompressedSize, pInputData + nOriginalSize, nInDataSize);
208 | nOriginalSize += nInDataSize;
209 | nCompressedSize += nFrameHeaderSize + (long long)nInDataSize;
210 | }
211 | }
212 | }
213 |
214 | if (!(nFlags & LZ4ULTRA_FLAG_INDEP_BLOCKS)) {
215 | nPreviousBlockSize = nInDataSize;
216 | if (nPreviousBlockSize > HISTORY_SIZE)
217 | nPreviousBlockSize = HISTORY_SIZE;
218 | }
219 | else {
220 | nPreviousBlockSize = 0;
221 | }
222 |
223 | nNumBlocks++;
224 | }
225 | }
226 |
227 | int nFooterSize;
228 |
229 | if ((nFlags & LZ4ULTRA_FLAG_RAW_BLOCK) != 0) {
230 | nFooterSize = 0;
231 | }
232 | else {
233 | nFooterSize = lz4ultra_encode_footer_frame(pOutBuffer + nCompressedSize, (int)(nMaxOutBufferSize - nCompressedSize), nFlags);
234 | if (nFooterSize < 0)
235 | nError = LZ4ULTRA_ERROR_COMPRESSION;
236 | }
237 |
238 | if (!nError) {
239 | nCompressedSize += nFooterSize;
240 | }
241 |
242 |
243 | lz4ultra_compressor_destroy(&compressor);
244 |
245 | if (nError) {
246 | return -1;
247 | }
248 | else {
249 | return nCompressedSize;
250 | }
251 | }
252 |
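The block size handling in shrink_inmem.c can be tried out in isolation. The sketch below is not part of the library: `block_size_from_code` and `smallest_fitting_code` are hypothetical helper names, and they only restate the `8 + (nBlockMaxCode << 1)` formula and the reduction loop from the listing above for the non-legacy frame case.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a block size code (4..7) to a block size in bytes, using the
 * same formula as the listing: bits = 8 + 2 * code. */
static size_t block_size_from_code(int nBlockMaxCode) {
   assert(nBlockMaxCode >= 4 && nBlockMaxCode <= 7);
   return (size_t)1 << (8 + (nBlockMaxCode << 1));
}

/* Hypothetical helper: lower the code while the next smaller block size is
 * still strictly larger than the input, never going below code 4. */
static int smallest_fitting_code(size_t nInputSize, int nBlockMaxCode) {
   while (nBlockMaxCode > 4 && block_size_from_code(nBlockMaxCode - 1) > nInputSize)
      nBlockMaxCode--;
   return nBlockMaxCode;
}
```

Because the comparison is strict, an input of exactly 65536 bytes keeps code 5 (256 KB blocks) rather than dropping to code 4, which matches the `nPrevBlockMaxSize > nInputSize` test in the original loop.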
--------------------------------------------------------------------------------
/src/shrink_inmem.h:
--------------------------------------------------------------------------------
1 | /*
2 | * shrink_inmem.h - in-memory compression definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _SHRINK_INMEM_H
34 | #define _SHRINK_INMEM_H
35 |
36 | #include <stdlib.h>
37 |
38 | /**
39 | * Get maximum compressed size of input(source) data
40 | *
41 | * @param nInputSize input(source) size in bytes
42 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
43 | * @param nBlockMaxCode maximum block size code (4..7 for 64 KB..4 MB)
44 | *
45 | * @return maximum compressed size
46 | */
47 | size_t lz4ultra_get_max_compressed_size_inmem(size_t nInputSize, unsigned int nFlags,
48 | int nBlockMaxCode);
49 |
50 | /**
51 | * Compress memory
52 | *
53 | * @param pInputData pointer to input(source) data to compress
54 | * @param pOutBuffer buffer for compressed data
55 | * @param nInputSize input(source) size in bytes
56 | * @param nMaxOutBufferSize maximum capacity of compression buffer
57 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
58 | * @param nBlockMaxCode maximum block size code (4..7 for 64 KB..4 MB)
59 | *
60 | * @return actual compressed size, or (size_t)-1 for error
61 | */
62 | size_t lz4ultra_compress_inmem(const unsigned char *pInputData, unsigned char *pOutBuffer, size_t nInputSize, size_t nMaxOutBufferSize, unsigned int nFlags,
63 | int nBlockMaxCode);
64 |
65 | #endif /* _SHRINK_INMEM_H */
66 |
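The worst-case bound returned by lz4ultra_get_max_compressed_size_inmem() is one header, one frame marker per block, the input itself (stored uncompressed in the worst case), and a footer frame. The sketch below restates that arithmetic as a standalone function. `max_compressed_size` is a hypothetical name, and the header and frame sizes are passed in as parameters because the real LZ4ULTRA_MAX_HEADER_SIZE and LZ4ULTRA_FRAME_SIZE values are defined elsewhere in the project and are not assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Worst-case output size, mirroring the return expression of
 * lz4ultra_get_max_compressed_size_inmem(). nMaxHeaderSize and nFrameSize
 * stand in for the library constants and are assumptions of this sketch. */
static size_t max_compressed_size(size_t nInputSize, int nBlockMaxBits,
                                  size_t nMaxHeaderSize, size_t nFrameSize) {
   size_t nBlockMaxSize = (size_t)1 << nBlockMaxBits;

   /* Number of blocks, rounded up */
   size_t nNumBlocks = (nInputSize + (nBlockMaxSize - 1)) >> nBlockMaxBits;

   return nMaxHeaderSize + nNumBlocks * nFrameSize + nInputSize + nFrameSize /* footer */;
}
```

With illustrative values of 7 bytes for the header and 4 bytes per frame, 100000 bytes of input in 64 KB blocks (16 bits) needs two blocks, so the bound is 7 + 2 * 4 + 100000 + 4 = 100019 bytes.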
--------------------------------------------------------------------------------
/src/shrink_streaming.h:
--------------------------------------------------------------------------------
1 | /*
2 | * shrink_streaming.h - streaming compression definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _SHRINK_STREAMING_H
34 | #define _SHRINK_STREAMING_H
35 |
36 | #include "stream.h"
37 |
38 | /* Forward declaration */
39 | typedef enum _lz4ultra_status_t lz4ultra_status_t;
40 |
41 | /*-------------- File API -------------- */
42 |
43 | /**
44 | * Compress file
45 | *
46 | * @param pszInFilename name of input(source) file to compress
47 | * @param pszOutFilename name of output(compressed) file to generate
48 | * @param pszDictionaryFilename name of dictionary file, or NULL for none
49 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
50 | * @param nBlockMaxCode maximum block size code (4..7 for 64 KB..4 MB)
51 | * @param start start function, called when the max block size is finalized and compression is about to start, or NULL for none
52 | * @param progress progress function, called after compressing each block, or NULL for none
53 | * @param pOriginalSize pointer to returned input(source) size, updated when this function is successful
54 | * @param pCompressedSize pointer to returned output(compressed) size, updated when this function is successful
55 | * @param pCommandCount pointer to returned token(compression commands) count, updated when this function is successful
56 | *
57 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
58 | */
59 | lz4ultra_status_t lz4ultra_compress_file(const char *pszInFilename, const char *pszOutFilename, const char *pszDictionaryFilename, const unsigned int nFlags,
60 | int nBlockMaxCode,
61 | void(*start)(int nBlockMaxCode, const unsigned int nFlags),
62 | void(*progress)(long long nOriginalSize, long long nCompressedSize), long long *pOriginalSize, long long *pCompressedSize, int *pCommandCount);
63 |
64 | /*-------------- Streaming API -------------- */
65 |
66 | /**
67 | * Compress stream
68 | *
69 | * @param pInStream input(source) stream to compress
70 | * @param pOutStream output(compressed) stream to write to
71 | * @param pDictionaryData dictionary contents, or NULL for none
72 | * @param nDictionaryDataSize size of dictionary contents, or 0
73 | * @param nFlags compression flags (LZ4ULTRA_FLAG_xxx)
74 | * @param nBlockMaxCode maximum block size code (4..7 for 64 KB..4 MB)
75 | * @param start start function, called when the max block size is finalized and compression is about to start, or NULL for none
76 | * @param progress progress function, called after compressing each block, or NULL for none
77 | * @param pOriginalSize pointer to returned input(source) size, updated when this function is successful
78 | * @param pCompressedSize pointer to returned output(compressed) size, updated when this function is successful
79 | * @param pCommandCount pointer to returned token(compression commands) count, updated when this function is successful
80 | *
81 | * @return LZ4ULTRA_OK for success, or an error value from lz4ultra_status_t
82 | */
83 | lz4ultra_status_t lz4ultra_compress_stream(lz4ultra_stream_t *pInStream, lz4ultra_stream_t *pOutStream, const void *pDictionaryData, int nDictionaryDataSize, unsigned int nFlags,
84 | int nBlockMaxCode,
85 | void(*start)(int nBlockMaxCode, const unsigned int nFlags),
86 | void(*progress)(long long nOriginalSize, long long nCompressedSize), long long *pOriginalSize, long long *pCompressedSize, int *pCommandCount);
87 |
88 | #endif /* _SHRINK_STREAMING_H */
89 |
--------------------------------------------------------------------------------
/src/stream.c:
--------------------------------------------------------------------------------
1 | /*
2 | * stream.c - streaming I/O implementation
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #include <stdlib.h>
34 | #include <stdio.h>
35 | #include <string.h>
36 | #include "stream.h"
37 |
38 | /**
39 | * Close file stream
40 | *
41 | * @param stream stream
42 | */
43 | static void lz4ultra_filestream_close(lz4ultra_stream_t *stream) {
44 | if (stream->obj) {
45 | fclose((FILE*)stream->obj);
46 | stream->obj = NULL;
47 | stream->read = NULL;
48 | stream->write = NULL;
49 | stream->eof = NULL;
50 | stream->close = NULL;
51 | }
52 | }
53 |
54 | /**
55 | * Read from file stream
56 | *
57 | * @param stream stream
58 | * @param ptr buffer to read into
59 | * @param size number of bytes to read
60 | *
61 | * @return number of bytes read
62 | */
63 | static size_t lz4ultra_filestream_read(lz4ultra_stream_t *stream, void *ptr, size_t size) {
64 | return fread(ptr, 1, size, (FILE*)stream->obj);
65 | }
66 |
67 | /**
68 | * Write to file stream
69 | *
70 | * @param stream stream
71 | * @param ptr buffer to write from
72 | * @param size number of bytes to write
73 | *
74 | * @return number of bytes written
75 | */
76 | static size_t lz4ultra_filestream_write(lz4ultra_stream_t *stream, void *ptr, size_t size) {
77 | return fwrite(ptr, 1, size, (FILE*)stream->obj);
78 | }
79 |
80 | /**
81 | * Check if file stream has reached the end of the data
82 | *
83 | * @param stream stream
84 | *
85 | * @return nonzero if the end of the data has been reached, 0 if there is more data
86 | */
87 | static int lz4ultra_filestream_eof(lz4ultra_stream_t *stream) {
88 | return feof((FILE*)stream->obj);
89 | }
90 |
91 | /**
92 | * Open file and create an I/O stream from it
93 | *
94 | * @param stream stream to fill out
95 | * @param pszInFilename filename
96 | * @param pszMode open mode, as with fopen()
97 | *
98 | * @return 0 for success, nonzero for failure
99 | */
100 | int lz4ultra_filestream_open(lz4ultra_stream_t *stream, const char *pszInFilename, const char *pszMode) {
101 | stream->obj = (void*)fopen(pszInFilename, pszMode);
102 | if (stream->obj) {
103 | stream->read = lz4ultra_filestream_read;
104 | stream->write = lz4ultra_filestream_write;
105 | stream->eof = lz4ultra_filestream_eof;
106 | stream->close = lz4ultra_filestream_close;
107 | return 0;
108 | }
109 | else
110 | return -1;
111 | }
112 |
--------------------------------------------------------------------------------
/src/stream.h:
--------------------------------------------------------------------------------
1 | /*
2 | * stream.h - streaming I/O definitions
3 | *
4 | * Copyright (C) 2019 Emmanuel Marty
5 | *
6 | * This software is provided 'as-is', without any express or implied
7 | * warranty. In no event will the authors be held liable for any damages
8 | * arising from the use of this software.
9 | *
10 | * Permission is granted to anyone to use this software for any purpose,
11 | * including commercial applications, and to alter it and redistribute it
12 | * freely, subject to the following restrictions:
13 | *
14 | * 1. The origin of this software must not be misrepresented; you must not
15 | * claim that you wrote the original software. If you use this software
16 | * in a product, an acknowledgment in the product documentation would be
17 | * appreciated but is not required.
18 | * 2. Altered source versions must be plainly marked as such, and must not be
19 | * misrepresented as being the original software.
20 | * 3. This notice may not be removed or altered from any source distribution.
21 | */
22 |
23 | /*
24 | * Uses the libdivsufsort library Copyright (c) 2003-2008 Yuta Mori
25 | *
26 | * Inspired by LZ4 by Yann Collet. https://github.com/lz4/lz4
27 | * With help, ideas, optimizations and speed measurements by spke
28 | * With ideas from Lizard by Przemyslaw Skibinski and Yann Collet. https://github.com/inikep/lizard
29 | * Also with ideas from smallz4 by Stephan Brumme. https://create.stephan-brumme.com/smallz4/
30 | *
31 | */
32 |
33 | #ifndef _STREAM_H
34 | #define _STREAM_H
35 |
36 | /* Forward declaration */
37 | typedef struct _lz4ultra_stream_t lz4ultra_stream_t;
38 |
39 | /* I/O stream */
40 | typedef struct _lz4ultra_stream_t {
41 | /** Opaque stream-specific pointer */
42 | void *obj;
43 |
44 | /**
45 | * Read from stream
46 | *
47 | * @param stream stream
48 | * @param ptr buffer to read into
49 | * @param size number of bytes to read
50 | *
51 | * @return number of bytes read
52 | */
53 | size_t(*read)(lz4ultra_stream_t *stream, void *ptr, size_t size);
54 |
55 | /**
56 | * Write to stream
57 | *
58 | * @param stream stream
59 | * @param ptr buffer to write from
60 | * @param size number of bytes to write
61 | *
62 | * @return number of bytes written
63 | */
64 | size_t(*write)(lz4ultra_stream_t *stream, void *ptr, size_t size);
65 |
66 |
67 | /**
68 | * Check if stream has reached the end of the data
69 | *
70 | * @param stream stream
71 | *
72 | * @return nonzero if the end of the data has been reached, 0 if there is more data
73 | */
74 | int(*eof)(lz4ultra_stream_t *stream);
75 |
76 | /**
77 | * Close stream
78 | *
79 | * @param stream stream
80 | */
81 | void(*close)(lz4ultra_stream_t *stream);
82 | } lz4ultra_stream_t;
83 |
84 | /**
85 | * Open file and create an I/O stream from it
86 | *
87 | * @param stream stream to fill out
88 | * @param pszInFilename filename
89 | * @param pszMode open mode, as with fopen()
90 | *
91 | * @return 0 for success, nonzero for failure
92 | */
93 | int lz4ultra_filestream_open(lz4ultra_stream_t *stream, const char *pszInFilename, const char *pszMode);
94 |
95 | #endif /* _STREAM_H */
96 |
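Since lz4ultra_stream_t is a table of function pointers plus an opaque `obj` pointer, backends other than files can be plugged in. The sketch below shows one possible read-only memory backend. The struct layout is copied from stream.h so the example compiles on its own. `memstream_t`, `memstream_open` and the other `memstream_*` functions are hypothetical names and not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copy of the interface declared in stream.h */
typedef struct _lz4ultra_stream_t lz4ultra_stream_t;
struct _lz4ultra_stream_t {
   void *obj;
   size_t (*read)(lz4ultra_stream_t *stream, void *ptr, size_t size);
   size_t (*write)(lz4ultra_stream_t *stream, void *ptr, size_t size);
   int (*eof)(lz4ultra_stream_t *stream);
   void (*close)(lz4ultra_stream_t *stream);
};

/* Hypothetical backing state for a read-only memory stream */
typedef struct {
   const unsigned char *pData;
   size_t nSize;
   size_t nOffset;
} memstream_t;

/* Copy up to 'size' bytes from the buffer and advance the read offset */
static size_t memstream_read(lz4ultra_stream_t *stream, void *ptr, size_t size) {
   memstream_t *m = (memstream_t *)stream->obj;
   size_t nAvail = m->nSize - m->nOffset;
   if (size > nAvail) size = nAvail;
   if (size > 0) {
      memcpy(ptr, m->pData + m->nOffset, size);
      m->nOffset += size;
   }
   return size;
}

/* Read-only backend: report that nothing was written */
static size_t memstream_write(lz4ultra_stream_t *stream, void *ptr, size_t size) {
   (void)stream; (void)ptr; (void)size;
   return 0;
}

/* Nonzero once every byte of the buffer has been consumed */
static int memstream_eof(lz4ultra_stream_t *stream) {
   memstream_t *m = (memstream_t *)stream->obj;
   return m->nOffset >= m->nSize;
}

/* Detach the callbacks, as lz4ultra_filestream_close() does; the caller owns the buffer */
static void memstream_close(lz4ultra_stream_t *stream) {
   stream->obj = NULL;
   stream->read = NULL;
   stream->write = NULL;
   stream->eof = NULL;
   stream->close = NULL;
}

/* Fill out 'stream' so that it reads from pData[0..nSize) */
static void memstream_open(lz4ultra_stream_t *stream, memstream_t *m, const void *pData, size_t nSize) {
   assert(stream != NULL && m != NULL);
   m->pData = (const unsigned char *)pData;
   m->nSize = nSize;
   m->nOffset = 0;
   stream->obj = m;
   stream->read = memstream_read;
   stream->write = memstream_write;
   stream->eof = memstream_eof;
   stream->close = memstream_close;
}
```

One difference from the file backend: feof() only reports end of file after a read has hit it, whereas this sketch reports it as soon as the last byte has been consumed. A caller that loops on `eof` before reading would see slightly different timing between the two backends.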
--------------------------------------------------------------------------------
/src/xxhash/LICENSE.txt:
--------------------------------------------------------------------------------
1 | xxHash Library
2 | Copyright (c) 2012-2014, Yann Collet
3 | All rights reserved.
4 |
5 | Redistribution and use in source and binary forms, with or without modification,
6 | are permitted provided that the following conditions are met:
7 |
8 | * Redistributions of source code must retain the above copyright notice, this
9 | list of conditions and the following disclaimer.
10 |
11 | * Redistributions in binary form must reproduce the above copyright notice, this
12 | list of conditions and the following disclaimer in the documentation and/or
13 | other materials provided with the distribution.
14 |
15 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
16 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
17 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
18 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
19 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
20 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
21 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
22 | ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
23 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
24 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
25 |
--------------------------------------------------------------------------------