├── Azure_enwik9_benchmark.png
├── Benchmarks.md
├── CHANGES
├── CMakeLists.txt
├── LICENSE
├── README.md
├── VERSION
├── include
│   ├── libsais.h
│   ├── libsais16.h
│   ├── libsais16x64.h
│   └── libsais64.h
└── src
    ├── libsais.c
    ├── libsais16.c
    ├── libsais16x64.c
    └── libsais64.c

/Azure_enwik9_benchmark.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IlyaGrebnov/libsais/5695d4f14e68bfd71a53fa1d90549a2f44788a75/Azure_enwik9_benchmark.png

--------------------------------------------------------------------------------
/Benchmarks.md:
--------------------------------------------------------------------------------
1 | # Specifications
2 | * CPU: Intel Core i7-9700K Processor (12M Cache, 5 GHz all cores)
3 | * RAM: 2x8 GB dual-channel DDR4 (4133 MHz, 17-17-17-37-400-2T with tight subtimings)
4 | * OS: Microsoft Windows 10 Pro 64 Bit (with MEM_LARGE_PAGES support enabled)
5 | * Compiler: Clang 11.0.0 '-Ofast -march=skylake -fopenmp -DNDEBUG'
6 |
7 | The times are the minimum of five runs measuring **single-threaded (ST)** and **multi-threaded (MT)** performance of suffix array construction.
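The throughput and speedup columns in the tables of this file can be cross-checked from the size and time columns. The following is a minimal sketch in C, not part of the library: the helper names are hypothetical, and the formulas (throughput as size over time, speedup as `(baseline time / faster time - 1) * 100`) are inferred from the published numbers rather than stated in the file. The megabyte size is left as a parameter because the file does not state the unit; 10^6 bytes is what reproduces the enwik9 row.

```c
#include <stdint.h>

/* Throughput in MB/s for an input of size_bytes processed in `seconds`.
   bytes_per_mb is a parameter because the unit is not stated in the file;
   1e6 reproduces the enwik9 row (an inference, not a documented fact). */
double throughput_mb_per_s(uint64_t size_bytes, double seconds, double bytes_per_mb)
{
    return ((double)size_bytes / bytes_per_mb) / seconds;
}

/* Speedup in percent of a faster run relative to a baseline run,
   e.g. libsais time versus divsufsort time on the same file. */
double speedup_percent(double baseline_seconds, double faster_seconds)
{
    return (baseline_seconds / faster_seconds - 1.0) * 100.0;
}
```

For the single-threaded enwik9 row (1000000000 bytes, 40.718 sec for libsais, 82.407 sec for divsufsort), `throughput_mb_per_s(1000000000, 40.718, 1e6)` is about 24.56 and `speedup_percent(82.407, 40.718)` is about 102.38, in line with the reported 24.56 MB/s and +102.39%. The published times are rounded to milliseconds, so the last digit can differ.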
8 | 9 | ### [Silesia Corpus](https://www.data-compression.info/Corpora/SilesiaCorpus/index.html) ### 10 | 11 | | file | size | libsais 2.1.0 (ST) | divsufsort 2.0.2 (ST) |speedup (ST)| libsais 2.1.0 (MT) | divsufsort 2.0.2 (MT) |speedup (MT)| 12 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:| 13 | | dickens | 10192446 | 0.204 sec ( 50.00 MB/s) | 0.446 sec ( 22.87 MB/s) |**+118.64%**| 0.120 sec ( 84.98 MB/s) | 0.296 sec ( 34.42 MB/s) |**+146.88%**| 14 | | mozilla | 51220480 | 1.105 sec ( 46.34 MB/s) | 1.819 sec ( 28.15 MB/s) | **+64.60%**| 0.712 sec ( 71.96 MB/s) | 1.206 sec ( 42.47 MB/s) | **+69.44%**| 15 | | mr | 9970564 | 0.187 sec ( 53.21 MB/s) | 0.370 sec ( 26.92 MB/s) | **+97.66%**| 0.131 sec ( 76.03 MB/s) | 0.261 sec ( 38.22 MB/s) | **+98.95%**| 16 | | nci | 33553445 | 0.641 sec ( 52.38 MB/s) | 1.151 sec ( 29.16 MB/s) | **+79.59%**| 0.364 sec ( 92.20 MB/s) | 1.036 sec ( 32.39 MB/s) |**+184.65%**| 17 | | ooffice | 6152192 | 0.116 sec ( 53.23 MB/s) | 0.180 sec ( 34.26 MB/s) | **+55.38%**| 0.082 sec ( 75.06 MB/s) | 0.108 sec ( 57.08 MB/s) | **+31.51%**| 18 | | osdb | 10085684 | 0.224 sec ( 45.02 MB/s) | 0.361 sec ( 27.94 MB/s) | **+61.12%**| 0.154 sec ( 65.61 MB/s) | 0.272 sec ( 37.07 MB/s) | **+77.00%**| 19 | | reymont | 6627202 | 0.125 sec ( 53.09 MB/s) | 0.249 sec ( 26.61 MB/s) | **+99.49%**| 0.076 sec ( 87.44 MB/s) | 0.183 sec ( 36.22 MB/s) |**+141.39%**| 20 | | samba | 21606400 | 0.401 sec ( 53.82 MB/s) | 0.661 sec ( 32.70 MB/s) | **+64.59%**| 0.274 sec ( 79.00 MB/s) | 0.467 sec ( 46.27 MB/s) | **+70.74%**| 21 | | sao | 7251944 | 0.181 sec ( 40.02 MB/s) | 0.232 sec ( 31.24 MB/s) | **+28.10%**| 0.132 sec ( 55.02 MB/s) | 0.143 sec ( 50.63 MB/s) | **+8.67%**| 22 | | webster | 41458703 | 1.020 sec ( 40.64 MB/s) | 2.190 sec ( 18.93 MB/s) |**+114.70%**| 0.595 sec ( 69.72 MB/s) | 1.653 sec ( 25.08 MB/s) |**+177.98%**| 23 | | x-ray | 
8474240 | 0.215 sec ( 39.35 MB/s) | 0.337 sec ( 25.13 MB/s) | **+56.58%**| 0.153 sec ( 55.40 MB/s) | 0.184 sec ( 46.10 MB/s) | **+20.19%**| 24 | | xml | 5345280 | 0.079 sec ( 67.54 MB/s) | 0.138 sec ( 38.87 MB/s) | **+73.78%**| 0.050 sec ( 106.50 MB/s) | 0.108 sec ( 49.52 MB/s) |**+115.06%**| 25 | 26 | 27 | 28 | ### [Large Canterbury Corpus](https://www.data-compression.info/Corpora/CanterburyCorpus/) ### 29 | 30 | | file | size | libsais 2.1.0 (ST) | divsufsort 2.0.2 (ST) |speedup (ST)| libsais 2.1.0 (MT) | divsufsort 2.0.2 (MT) |speedup (MT)| 31 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:| 32 | | bible.txt | 4047392 | 0.066 sec ( 61.40 MB/s) | 0.140 sec ( 28.95 MB/s) |**+112.14%**| 0.038 sec ( 106.74 MB/s) | 0.100 sec ( 40.58 MB/s) |**+163.00%**| 33 | | E.coli | 4638690 | 0.079 sec ( 58.74 MB/s) | 0.199 sec ( 23.35 MB/s) |**+151.59%**| 0.038 sec ( 121.18 MB/s) | 0.154 sec ( 30.15 MB/s) |**+301.95%**| 34 | | world192.txt | 2473400 | 0.037 sec ( 66.72 MB/s) | 0.075 sec ( 32.92 MB/s) |**+102.65%**| 0.023 sec ( 108.02 MB/s) | 0.050 sec ( 49.58 MB/s) |**+117.85%**| 35 | 36 | 37 | 38 | ### [Manzini Corpus](https://people.unipmn.it/manzini/lightweight/corpus/) ### 39 | 40 | | file | size | libsais 2.1.0 (ST) | divsufsort 2.0.2 (ST) |speedup (ST)| libsais 2.1.0 (MT) | divsufsort 2.0.2 (MT) |speedup (MT)| 41 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:| 42 | | chr22.dna | 34553758 | 0.778 sec ( 44.43 MB/s) | 2.005 sec ( 17.23 MB/s) |**+157.85%**| 0.436 sec ( 79.27 MB/s) | 1.676 sec ( 20.61 MB/s) |**+284.61%**| 43 | | etext99 | 105277340 | 3.174 sec ( 33.17 MB/s) | 7.083 sec ( 14.86 MB/s) |**+123.15%**| 1.739 sec ( 60.53 MB/s) | 5.345 sec ( 19.70 MB/s) |**+207.33%**| 44 | | gcc-3.0.tar | 86630400 | 1.930 sec ( 
44.89 MB/s) | 3.458 sec ( 25.05 MB/s) | **+79.18%**| 1.188 sec ( 72.90 MB/s) | 2.508 sec ( 34.54 MB/s) |**+111.04%**| 45 | | howto | 39422105 | 0.912 sec ( 43.24 MB/s) | 1.884 sec ( 20.93 MB/s) |**+106.60%**| 0.557 sec ( 70.80 MB/s) | 1.279 sec ( 30.83 MB/s) |**+129.64%**| 46 | | jdk13c | 69728899 | 1.630 sec ( 42.79 MB/s) | 3.019 sec ( 23.10 MB/s) | **+85.25%**| 1.008 sec ( 69.20 MB/s) | 2.490 sec ( 28.00 MB/s) |**+147.11%**| 47 | | linux-2.4.5.tar | 116254720 | 2.688 sec ( 43.25 MB/s) | 4.980 sec ( 23.34 MB/s) | **+85.27%**| 1.651 sec ( 70.40 MB/s) | 3.540 sec ( 32.84 MB/s) |**+114.34%**| 48 | | rctail96 | 114711151 | 3.162 sec ( 36.27 MB/s) | 6.370 sec ( 18.01 MB/s) |**+101.41%**| 1.813 sec ( 63.27 MB/s) | 5.009 sec ( 22.90 MB/s) |**+176.26%**| 49 | | rfc | 116421901 | 2.929 sec ( 39.75 MB/s) | 5.689 sec ( 20.46 MB/s) | **+94.23%**| 1.681 sec ( 69.24 MB/s) | 4.182 sec ( 27.84 MB/s) |**+148.70%**| 50 | | sprot34.dat | 109617186 | 2.983 sec ( 36.75 MB/s) | 6.324 sec ( 17.33 MB/s) |**+112.03%**| 1.712 sec ( 64.03 MB/s) | 4.509 sec ( 24.31 MB/s) |**+163.38%**| 51 | | w3c2 | 104201579 | 2.450 sec ( 42.53 MB/s) | 4.426 sec ( 23.54 MB/s) | **+80.67%**| 1.565 sec ( 66.57 MB/s) | 3.570 sec ( 29.19 MB/s) |**+128.10%**| 52 | 53 | 54 | 55 | ### [Large Text Compression Benchmark Corpus](https://www.mattmahoney.net/dc/textdata.html) ### 56 | 57 | | file | size | libsais 2.1.0 (ST) | divsufsort 2.0.2 (ST) |speedup (ST)| libsais 2.1.0 (MT) | divsufsort 2.0.2 (MT) |speedup (MT)| 58 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:| 59 | | enwik8 | 100000000 | 2.841 sec ( 35.20 MB/s) | 6.209 sec ( 16.11 MB/s) |**+118.58%**| 1.604 sec ( 62.36 MB/s) | 4.446 sec ( 22.49 MB/s) |**+177.24%**| 60 | | enwik9 | 1000000000 | 40.718 sec ( 24.56 MB/s) | 82.407 sec ( 12.13 MB/s) |**+102.39%**| 19.138 sec ( 52.25 MB/s) | 63.373 sec ( 15.78 MB/s) |**+231.14%**| 61 | 62 
| 63 | 64 | ### [The Gauntlet Corpus](https://github.com/michaelmaniscalco/gauntlet_corpus) ### 65 | 66 | | file | size | libsais 2.1.0 (ST) | divsufsort 2.0.2 (ST) |speedup (ST)| libsais 2.1.0 (MT) | divsufsort 2.0.2 (MT) |speedup (MT)| 67 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:| 68 | | abac | 200000 | 0.003 sec ( 73.23 MB/s) | 0.002 sec ( 130.08 MB/s) | -43.71% | 0.002 sec ( 94.65 MB/s) | 0.002 sec ( 125.78 MB/s) | -24.75% | 69 | | abba | 10500596 | 0.169 sec ( 62.02 MB/s) | 0.431 sec ( 24.36 MB/s) |**+154.64%**| 0.090 sec ( 116.53 MB/s) | 0.442 sec ( 23.76 MB/s) |**+390.38%**| 70 | | book1x20 | 15375420 | 0.328 sec ( 46.85 MB/s) | 0.720 sec ( 21.36 MB/s) |**+119.34%**| 0.195 sec ( 79.02 MB/s) | 0.505 sec ( 30.47 MB/s) |**+159.34%**| 71 | | fib_s14930352 | 14930352 | 0.341 sec ( 43.80 MB/s) | 1.070 sec ( 13.95 MB/s) |**+213.96%**| 0.173 sec ( 86.06 MB/s) | 1.059 sec ( 14.10 MB/s) |**+510.23%**| 72 | | fss10 | 12078908 | 0.239 sec ( 50.54 MB/s) | 0.811 sec ( 14.89 MB/s) |**+239.49%**| 0.135 sec ( 89.17 MB/s) | 0.793 sec ( 15.24 MB/s) |**+485.07%**| 73 | | fss9 | 2851443 | 0.041 sec ( 68.79 MB/s) | 0.123 sec ( 23.12 MB/s) |**+197.52%**| 0.025 sec ( 116.35 MB/s) | 0.121 sec ( 23.59 MB/s) |**+393.28%**| 74 | | houston | 3839141 | 0.037 sec ( 103.92 MB/s) | 0.024 sec ( 162.18 MB/s) | -35.92% | 0.018 sec ( 209.58 MB/s) | 0.023 sec ( 164.72 MB/s) | **+27.24%**| 75 | | paper5x80 | 956322 | 0.014 sec ( 69.13 MB/s) | 0.024 sec ( 40.00 MB/s) | **+72.82%**| 0.008 sec ( 112.87 MB/s) | 0.020 sec ( 47.87 MB/s) |**+135.81%**| 76 | | test1 | 2097152 | 0.039 sec ( 53.45 MB/s) | 0.084 sec ( 24.82 MB/s) |**+115.35%**| 0.020 sec ( 106.42 MB/s) | 0.080 sec ( 26.25 MB/s) |**+305.39%**| 77 | | test2 | 2097152 | 0.039 sec ( 53.87 MB/s) | 0.058 sec ( 36.02 MB/s) | **+49.56%**| 0.019 sec ( 109.39 MB/s) | 0.052 sec ( 40.23 MB/s) |**+171.92%**| 78 | | test3 
| 2097088 | 0.037 sec ( 56.56 MB/s) | 0.046 sec ( 45.93 MB/s) | **+23.13%**| 0.031 sec ( 67.13 MB/s) | 0.047 sec ( 44.81 MB/s) | **+49.79%**| 79 | 80 | 81 | 82 | ### [Pizza & Chilli Corpus](https://pizzachili.dcc.uchile.cl/texts.html) ### 83 | 84 | | file | size | libsais 2.1.0 (ST) | divsufsort 2.0.2 (ST) |speedup (ST)| libsais 2.1.0 (MT) | divsufsort 2.0.2 (MT) |speedup (MT)| 85 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:| 86 | | dblp.xml | 296135874 | 8.428 sec ( 35.14 MB/s) | 16.001 sec ( 18.51 MB/s) | **+89.85%**| 4.749 sec ( 62.36 MB/s) | 12.932 sec ( 22.90 MB/s) |**+172.29%**| 87 | | dna | 403927746 | 14.257 sec ( 28.33 MB/s) | 36.002 sec ( 11.22 MB/s) |**+152.53%**| 6.462 sec ( 62.51 MB/s) | 32.014 sec ( 12.62 MB/s) |**+395.42%**| 88 | | english.1024MB | 1073741824 | 51.568 sec ( 20.82 MB/s) | 104.407 sec ( 10.28 MB/s) |**+102.47%**| 22.777 sec ( 47.14 MB/s) | 84.357 sec ( 12.73 MB/s) |**+270.37%**| 89 | | pitches | 55832855 | 1.222 sec ( 45.70 MB/s) | 2.380 sec ( 23.46 MB/s) | **+94.78%**| 0.820 sec ( 68.10 MB/s) | 1.422 sec ( 39.26 MB/s) | **+73.45%**| 90 | | proteins | 1184051855 | 60.105 sec ( 19.70 MB/s) | 124.688 sec ( 9.50 MB/s) |**+107.45%**| 25.981 sec ( 45.57 MB/s) | 79.149 sec ( 14.96 MB/s) |**+204.65%**| 91 | | sources | 210866607 | 5.381 sec ( 39.19 MB/s) | 10.210 sec ( 20.65 MB/s) | **+89.74%**| 3.223 sec ( 65.42 MB/s) | 7.350 sec ( 28.69 MB/s) |**+128.03%**| 92 | 93 | 94 | 95 | ### [Pizza & Chilli Repetitive Corpus](https://pizzachili.dcc.uchile.cl/repcorpus.html) ### 96 | 97 | | file | size | libsais 2.1.0 (ST) | divsufsort 2.0.2 (ST) |speedup (ST)| libsais 2.1.0 (MT) | divsufsort 2.0.2 (MT) |speedup (MT)| 98 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:| 99 | | cere | 461286644 | 
16.184 sec ( 28.50 MB/s) | 34.422 sec ( 13.40 MB/s) |**+112.69%**| 7.349 sec ( 62.77 MB/s) | 30.502 sec ( 15.12 MB/s) |**+315.04%**| 100 | | coreutils | 205281778 | 5.264 sec ( 39.00 MB/s) | 10.506 sec ( 19.54 MB/s) | **+99.58%**| 3.090 sec ( 66.43 MB/s) | 8.170 sec ( 25.13 MB/s) |**+164.39%**| 101 | | einstein.de.txt | 92758441 | 2.449 sec ( 37.87 MB/s) | 4.582 sec ( 20.25 MB/s) | **+87.06%**| 1.381 sec ( 67.19 MB/s) | 3.744 sec ( 24.78 MB/s) |**+171.18%**| 102 | | einstein.en.txt | 467626544 | 15.279 sec ( 30.61 MB/s) | 31.504 sec ( 14.84 MB/s) |**+106.20%**| 7.902 sec ( 59.18 MB/s) | 27.152 sec ( 17.22 MB/s) |**+243.60%**| 103 | |Escherichia_Coli | 112689515 | 3.360 sec ( 33.54 MB/s) | 7.495 sec ( 15.03 MB/s) |**+123.06%**| 1.683 sec ( 66.94 MB/s) | 6.283 sec ( 17.94 MB/s) |**+273.25%**| 104 | | influenza | 154808555 | 3.816 sec ( 40.57 MB/s) | 8.982 sec ( 17.24 MB/s) |**+135.38%**| 2.209 sec ( 70.07 MB/s) | 7.757 sec ( 19.96 MB/s) |**+251.11%**| 105 | | kernel | 257961616 | 7.091 sec ( 36.38 MB/s) | 13.511 sec ( 19.09 MB/s) | **+90.54%**| 3.866 sec ( 66.72 MB/s) | 10.158 sec ( 25.40 MB/s) |**+162.74%**| 106 | | para | 429265758 | 15.407 sec ( 27.86 MB/s) | 32.821 sec ( 13.08 MB/s) |**+113.03%**| 6.905 sec ( 62.17 MB/s) | 29.143 sec ( 14.73 MB/s) |**+322.06%**| 107 | | world_leaders | 46968181 | 0.880 sec ( 53.35 MB/s) | 1.301 sec ( 36.11 MB/s) | **+47.73%**| 0.460 sec ( 102.16 MB/s) | 1.054 sec ( 44.58 MB/s) |**+129.18%**| 108 | |dblp.xml.00001.1 | 104857600 | 4.445 sec ( 23.59 MB/s) | 5.913 sec ( 17.73 MB/s) | **+33.03%**| 1.976 sec ( 53.07 MB/s) | 5.130 sec ( 20.44 MB/s) |**+159.62%**| 109 | |dblp.xml.00001.2 | 104857600 | 4.423 sec ( 23.71 MB/s) | 6.050 sec ( 17.33 MB/s) | **+36.78%**| 1.967 sec ( 53.32 MB/s) | 5.217 sec ( 20.10 MB/s) |**+165.27%**| 110 | | dblp.xml.0001.1 | 104857600 | 3.062 sec ( 34.24 MB/s) | 5.534 sec ( 18.95 MB/s) | **+80.70%**| 1.427 sec ( 73.50 MB/s) | 4.664 sec ( 22.48 MB/s) |**+226.91%**| 111 | | dblp.xml.0001.2 | 104857600 | 3.118 
sec ( 33.63 MB/s) | 5.592 sec ( 18.75 MB/s) | **+79.32%**| 1.664 sec ( 63.00 MB/s) | 4.763 sec ( 22.01 MB/s) |**+186.21%**|
112 | | dna.001.1 | 104857600 | 3.090 sec ( 33.94 MB/s) | 6.368 sec ( 16.47 MB/s) |**+106.10%**| 1.539 sec ( 68.13 MB/s) | 5.338 sec ( 19.64 MB/s) |**+246.87%**|
113 | | english.001.2 | 104857600 | 3.445 sec ( 30.44 MB/s) | 5.920 sec ( 17.71 MB/s) | **+71.87%**| 1.745 sec ( 60.08 MB/s) | 4.268 sec ( 24.57 MB/s) |**+144.55%**|
114 | | proteins.001.1 | 104857600 | 3.406 sec ( 30.79 MB/s) | 5.884 sec ( 17.82 MB/s) | **+72.76%**| 1.797 sec ( 58.36 MB/s) | 3.667 sec ( 28.59 MB/s) |**+104.12%**|
115 | | sources.001.2 | 104857600 | 2.916 sec ( 35.96 MB/s) | 4.828 sec ( 21.72 MB/s) | **+65.55%**| 1.599 sec ( 65.58 MB/s) | 3.695 sec ( 28.38 MB/s) |**+131.08%**|
116 | | fib41 | 267914296 | 8.390 sec ( 31.93 MB/s) | 37.397 sec ( 7.16 MB/s) |**+345.72%**| 3.794 sec ( 70.61 MB/s) | 37.158 sec ( 7.21 MB/s) |**+879.30%**|
117 | | rs.13 | 216747218 | 6.425 sec ( 33.74 MB/s) | 28.994 sec ( 7.48 MB/s) |**+351.30%**| 2.807 sec ( 77.21 MB/s) | 31.609 sec ( 6.86 MB/s) |**+1025.97%**|
118 | | tm29 | 268435456 | 8.931 sec ( 30.06 MB/s) | 31.267 sec ( 8.59 MB/s) |**+250.09%**| 3.729 sec ( 71.99 MB/s) | 31.384 sec ( 8.55 MB/s) |**+741.69%**|
119 |
120 |
121 |
122 | ## Large pages and multi-core systems support
123 |
124 | Large pages and OpenMP improve libsais performance. Here is an example of such improvements on the Manzini Corpus (LP = large pages; "LP w Nc" = large pages with N cores).
125 |
126 | | file | size | baseline| LP | LP w 2c | LP w 3c | LP w 4c | LP w 5c | LP w 6c | LP w 7c | LP w 8c |
127 | |:---------------:|:---------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|
128 | | chr22.dna | 34553758 |43.50MB/s|50.18MB/s|61.20MB/s|73.66MB/s|78.91MB/s|81.20MB/s|81.49MB/s|81.52MB/s|80.42MB/s|
129 | | etext99 | 105277340 |32.96MB/s|40.73MB/s|50.19MB/s|59.34MB/s|62.97MB/s|64.06MB/s|62.83MB/s|63.08MB/s|62.49MB/s|
130 | | gcc-3.0.tar | 86630400 |44.32MB/s|50.13MB/s|58.51MB/s|68.85MB/s|73.82MB/s|75.76MB/s|76.14MB/s|75.85MB/s|75.24MB/s|
131 | | howto | 39422105 |42.78MB/s|48.10MB/s|57.38MB/s|67.75MB/s|71.91MB/s|73.67MB/s|73.61MB/s|73.17MB/s|72.38MB/s|
132 | | jdk13c | 69728899 |42.70MB/s|47.77MB/s|54.50MB/s|64.85MB/s|69.63MB/s|71.66MB/s|72.15MB/s|71.96MB/s|71.24MB/s|
133 | | linux-2.4.5.tar | 116254720 |42.46MB/s|48.85MB/s|57.60MB/s|67.92MB/s|72.29MB/s|73.88MB/s|74.11MB/s|73.59MB/s|73.27MB/s|
134 | | rctail96 | 114711151 |36.39MB/s|43.19MB/s|50.96MB/s|60.60MB/s|64.33MB/s|65.43MB/s|65.79MB/s|65.78MB/s|65.18MB/s|
135 | | rfc | 116421901 |39.81MB/s|46.76MB/s|55.92MB/s|66.48MB/s|70.79MB/s|71.68MB/s|72.21MB/s|71.92MB/s|71.06MB/s|
136 | | sprot34.dat | 109617186 |36.09MB/s|45.06MB/s|53.26MB/s|61.60MB/s|59.69MB/s|62.25MB/s|67.20MB/s|66.84MB/s|66.38MB/s|
137 | | w3c2 | 104201579 |42.97MB/s|47.09MB/s|54.01MB/s|63.79MB/s|67.67MB/s|69.84MB/s|69.94MB/s|69.65MB/s|68.86MB/s|
138 |
139 | Note that multi-core scalability is limited by RAM bandwidth; adding more RAM channels improves performance:
140 | ![enwik9 BWT throughput in MB/s on Azure DS14 v2 (Intel Xeon Platinum 8171M)](Azure_enwik9_benchmark.png?raw=true "enwik9 BWT throughput in MB/s on Azure DS14 v2 (Intel Xeon Platinum 8171M)")
141 |
142 | ## libsais64 for inputs larger than 2GB
143 |
144 | Starting from version 2.2.0, libsais64 can process inputs larger than 2GB.
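Whether an input needs the 64-bit variant can be decided from its length alone. The following is a minimal sketch, assuming that the 32-bit API stores suffix positions as signed 32-bit integers, so that its limit is `INT32_MAX` bytes. The text here only says "larger than 2GB", so both the exact bound and the helper name `needs_libsais64` are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when an input of n bytes cannot be indexed with signed
   32-bit suffix positions. The INT32_MAX bound is an assumption made
   for this sketch; check the library headers for the real limit. */
bool needs_libsais64(uint64_t n)
{
    return n > (uint64_t)INT32_MAX;
}
```

Under this assumption, enwik9 (1000000000 bytes) still fits the 32-bit variant, while the 2210395553-byte `english` file benchmarked below does not.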
145 |
146 | The times below are the minimum of five runs measuring **multi-threaded (MT)** performance of suffix array construction on Azure DS14 v2 (Intel Xeon Platinum 8171M).
147 |
148 | | file | size | libsais64 2.2.0 (MT) | divsufsort64 2.0.2 (MT) |speedup (MT)|
149 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|
150 | | english | 2210395553 | 61.499 sec ( 34.28 MB/s) | 435.199 sec ( 4.84 MB/s) |**+607.65%**|
151 | | GRCh38.p13.fa | 3321586957 | 84.068 sec ( 37.68 MB/s) | 782.938 sec ( 4.05 MB/s) |**+831.32%**|
152 | | enwik10 | 10000000000 | 303.542 sec ( 31.42 MB/s) |1927.351 sec ( 4.95 MB/s) |**+534.95%**|
153 |
154 | ## Additional memory
155 |
156 | libsais reuses the space allocated for the suffix array during construction. Occasionally this free space is not sufficient for the most efficient algorithm, and libsais has to fall back to a less efficient one (libsais has 4 algorithms with different break-points: 6k, 4k, 2k and 1k, where k is the alphabet size). To improve performance in those cases, you can allocate additional space at the end of the suffix array.
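The break-points above can be read as thresholds on the free space left in the suffix array buffer. The sketch below only illustrates that reading and is not the library's actual selection logic: it assumes free space is counted in suffix array entries and picks the largest of 6k, 4k, 2k and 1k that fits. `pick_breakpoint` is a hypothetical helper.

```c
#include <stdint.h>

/* Returns the multiplier (6, 4, 2 or 1) of the largest break-point m*k
   that fits into free_entries, or 0 if not even 1k fits.
   Illustrative only: how libsais accounts for free space is not
   described in this section. */
int pick_breakpoint(int64_t free_entries, int64_t k)
{
    static const int multipliers[] = { 6, 4, 2, 1 };

    if (k <= 0 || free_entries < 0) { return 0; }

    for (int i = 0; i < 4; ++i)
    {
        /* Divide instead of multiplying to avoid overflow for large k. */
        if (free_entries / multipliers[i] >= k) { return multipliers[i]; }
    }

    return 0;
}
```

Under this reading, reserving at least 6k extra entries at the end of the suffix array would leave room for the first tier regardless of how much of the buffer is otherwise free.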
157 |
158 | | file | size | libsais + O(n) (ST) | libsais + O(1) (ST) |speedup (ST)| libsais + O(n) (MT) | libsais + O(1) (MT) |speedup (MT)|
159 | |:---------------:|:-----------:|:--------------------------:|:--------------------------:|:----------:|:--------------------------:|:--------------------------:|:----------:|
160 | | osdb | 10085684 | 0.222 sec ( 45.52 MB/s) | 0.228 sec ( 44.20 MB/s) | **+2.97%**| 0.150 sec ( 67.30 MB/s) | 0.162 sec ( 62.25 MB/s) | **+8.11%**|
161 | | x-ray | 8474240 | 0.190 sec ( 44.52 MB/s) | 0.217 sec ( 39.11 MB/s) | **+13.82%**| 0.122 sec ( 69.46 MB/s) | 0.156 sec ( 54.16 MB/s) | **+28.25%**|
162 | | sao | 7251944 | 0.175 sec ( 41.48 MB/s) | 0.182 sec ( 39.75 MB/s) | **+4.37%**| 0.127 sec ( 57.26 MB/s) | 0.140 sec ( 51.87 MB/s) | **+10.39%**|
163 | | ooffice | 6152192 | 0.113 sec ( 54.55 MB/s) | 0.117 sec ( 52.45 MB/s) | **+4.01%**| 0.081 sec ( 76.38 MB/s) | 0.088 sec ( 70.30 MB/s) | **+8.65%**|
164 | | abac | 200000 | 0.002 sec ( 84.36 MB/s) | 0.003 sec ( 73.63 MB/s) | **+14.56%**| 0.002 sec ( 105.08 MB/s) | 0.002 sec ( 86.64 MB/s) | **+21.27%**|
165 | | test3 | 2097088 | 0.034 sec ( 61.54 MB/s) | 0.037 sec ( 56.45 MB/s) | **+9.03%**| 0.028 sec ( 75.76 MB/s) | 0.032 sec ( 64.93 MB/s) | **+16.68%**|
166 |
167 | > * All other files from [Benchmarks](#benchmarks) above do not suffer from these fallbacks.
168 |

--------------------------------------------------------------------------------
/CHANGES:
--------------------------------------------------------------------------------
1 | Changes in 2.10.1 (May 11, 2025)
2 | - No functional changes, slightly improved performance.
3 |
4 | Changes in 2.10.0 (April 12, 2025)
5 | - Improved performance, with noticeable gains on ARM architecture.
6 | - Fixed compiler warnings and addressed undefined behavior.
7 |
8 | Changes in 2.9.1 (March 19, 2025)
9 | - No functional changes, resolved compiler warnings & undefined behavior.
10 |
11 | Changes in 2.9.0 (March 16, 2025)
12 | - Support for generalized suffix array (GSA) construction.
13 | - Support for longest common prefix array (LCP) construction for generalized suffix array (GSA).
14 |
15 | Changes in 2.8.7 (January 16, 2025)
16 | - Restore the input array after suffix array construction (libsais64 & libsais16x64).
17 |
18 | Changes in 2.8.6 (November 18, 2024)
19 | - Fixed out-of-bound memory access issue for large inputs.
20 |
21 | Changes in 2.8.5 (July 31, 2024)
22 | - Miscellaneous changes to reduce compiler warnings about implicit functions.
23 |
24 | Changes in 2.8.4 (June 13, 2024)
25 | - Additional OpenMP acceleration (libsais16 & libsais16x64).
26 |
27 | Changes in 2.8.3 (June 11, 2024)
28 | - Implemented suffix array construction of a long 16-bit array (libsais16x64).
29 |
30 | Changes in 2.8.2 (May 27, 2024)
31 | - Implemented suffix array construction of a long 64-bit array (libsais64).
32 |
33 | Changes in 2.8.1 (April 5, 2024)
34 | - Fixed out-of-bound memory access issue for large inputs (libsais64).
35 |
36 | Changes in 2.8.0 (March 3, 2024)
37 | - Implemented permuted longest common prefix array (PLCP) construction of an integer array.
38 | - Fixed compilation error when compiling the library with OpenMP enabled.
39 |
40 | Changes in 2.7.5 (February 26, 2024)
41 | - Improved performance of suffix array and Burrows-Wheeler transform construction on degenerate inputs.
42 |
43 | Changes in 2.7.4 (February 23, 2024)
44 | - Resolved a strict aliasing violation that resulted in invalid code generation by the Intel compiler.
45 |
46 | Changes in 2.7.3 (April 21, 2023)
47 | - CMake script for library build and integration with other projects.
48 |
49 | Changes in 2.7.2 (April 18, 2023)
50 | - Fixed out-of-bound memory access issue for large inputs (libsais64).
51 |
52 | Changes in 2.7.1 (June 19, 2022)
53 | - Improved cache coherence for ARMv8 architecture.
54 |
55 | Changes in 2.7.0 (April 12, 2022)
56 | - Support for longest common prefix array (LCP) construction.
57 |
58 | Changes in 2.6.5 (January 1, 2022)
59 | - Exposed functions to construct suffix array of a given integer array.
60 | - Improved detection of various compiler intrinsics.
61 | - Capped free space parameter to avoid crashing due to 32-bit integer overflow.
62 |
63 | Changes in 2.6.0 (October 21, 2021)
64 | - libsais16 for 16-bit inputs.
65 |
66 | Changes in 2.5.0 (October 15, 2021)
67 | - Support for optional symbol frequency tables.
68 |
69 | Changes in 2.4.0 (July 14, 2021)
70 | - Reverse Burrows-Wheeler transform.
71 |
72 | Changes in 2.3.0 (June 23, 2021)
73 | - Burrows-Wheeler transform with auxiliary indexes.
74 |
75 | Changes in 2.2.0 (April 27, 2021)
76 | - libsais64 for inputs larger than 2GB.
77 |
78 | Changes in 2.1.0 (April 19, 2021)
79 | - Additional OpenMP acceleration.
80 |
81 | Changes in 2.0.0 (April 4, 2021)
82 | - OpenMP acceleration.
83 |
84 | Changes in 1.0.0 (February 23, 2021)
85 | - Initial Release.
86 |

--------------------------------------------------------------------------------
/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | cmake_minimum_required(VERSION 3.10)
2 |
3 | project(
4 |     libsais
5 |     VERSION 2.10.1
6 |     LANGUAGES C
7 |     DESCRIPTION "The libsais library provides fast linear-time construction of suffix array (SA), generalized suffix array (GSA), longest common prefix (LCP) array, permuted LCP (PLCP) array, Burrows-Wheeler transform (BWT) and inverse BWT based on the induced sorting algorithm with optional OpenMP support for multi-core parallel construction."
8 | ) 9 | 10 | set(CMAKE_C_STANDARD 99) 11 | set(CMAKE_C_STANDARD_REQUIRED ON) 12 | set(CMAKE_C_EXTENSIONS OFF) 13 | 14 | option(LIBSAIS_USE_OPENMP "Use OpenMP for parallelization" OFF) 15 | option(LIBSAIS_BUILD_SHARED_LIB "Build libsais as a shared library" OFF) 16 | 17 | if(LIBSAIS_BUILD_SHARED_LIB) 18 | set(LIBSAIS_LIBRARY_TYPE SHARED) 19 | else() 20 | set(LIBSAIS_LIBRARY_TYPE STATIC) 21 | endif() 22 | 23 | add_library(libsais ${LIBSAIS_LIBRARY_TYPE}) 24 | 25 | set_target_properties(libsais PROPERTIES PREFIX "" IMPORT_PREFIX "") 26 | 27 | target_sources(libsais PRIVATE 28 | include/libsais.h 29 | include/libsais16.h 30 | include/libsais16x64.h 31 | include/libsais64.h 32 | src/libsais.c 33 | src/libsais16.c 34 | src/libsais16x64.c 35 | src/libsais64.c 36 | ) 37 | 38 | if(LIBSAIS_USE_OPENMP) 39 | find_package(OpenMP REQUIRED) 40 | endif() 41 | 42 | if(LIBSAIS_USE_OPENMP AND OpenMP_C_FOUND) 43 | target_compile_definitions(libsais PUBLIC LIBSAIS_OPENMP) 44 | target_link_libraries(libsais PRIVATE OpenMP::OpenMP_C) 45 | endif() 46 | 47 | if(LIBSAIS_BUILD_SHARED_LIB) 48 | target_compile_definitions(libsais PUBLIC LIBSAIS_SHARED) 49 | target_compile_definitions(libsais PRIVATE LIBSAIS_EXPORTS) 50 | endif() 51 | 52 | target_include_directories(libsais PUBLIC 53 | $ 54 | $ 55 | ) 56 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 
15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 
48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # libsais 2 | 3 | The libsais library provides fast (see [Benchmarks](#benchmarks) below) linear-time construction of suffix array (SA), generalized suffix array (GSA), longest common prefix (LCP) array, permuted LCP (PLCP) array, Burrows-Wheeler transform (BWT) and inverse BWT, based on the induced sorting algorithm described in the following papers (with optional OpenMP support for multi-core parallel construction): 4 | * Ge Nong, Sen Zhang, Wai Hong Chan *Two Efficient Algorithms for Linear Suffix Array Construction*, 2009 5 | * Juha Karkkainen, Giovanni Manzini, Simon J. 
Puglisi *Permuted Longest-Common-Prefix Array*, 2009 6 | * Nataliya Timoshevskaya, Wu-chun Feng *SAIS-OPT: On the characterization and optimization of the SA-IS algorithm for suffix array construction*, 2014 7 | * Jing Yi Xie, Ge Nong, Bin Lao, Wentao Xu *Scalable Suffix Sorting on a Multicore Machine*, 2020 8 | 9 | Copyright (c) 2021-2025 Ilya Grebnov 10 | 11 | >The libsais is inspired by [libdivsufsort](https://github.com/y-256/libdivsufsort), [sais](https://sites.google.com/site/yuta256/sais) libraries by Yuta Mori and [msufsort](https://github.com/michaelmaniscalco/msufsort) by Michael Maniscalco. 12 | 13 | ## libcubwt 14 | If you are looking for even faster construction times, you can try [libcubwt](https://github.com/IlyaGrebnov/libcubwt), a library for GPU-based suffix array, inverse suffix array and Burrows-Wheeler transform construction. 15 | 16 | ## Introduction 17 | The libsais provides a simple C99 API to construct the suffix array and the Burrows-Wheeler transformed string of a given string over a constant-size alphabet. The algorithm runs in linear time using typically only ~16KB of extra memory (with 2n bytes as the absolute worst case, where n is the length of the string). OpenMP acceleration uses 200KB of additional memory per thread. 18 | 19 | > * The libsais works with compilers from GNU, Microsoft and Intel, but I recommend Clang for best performance. 20 | > * The libsais is sensitive to fast memory and software prefetching and might not be suitable for some workloads. Please benchmark yourself. 21 | 22 | ## License 23 | The libsais is released under the [Apache License Version 2.0](LICENSE "Apache license") 24 | 25 | ## Changes 26 | * May 11, 2025 (2.10.1) 27 | * No functional changes, slightly improved performance. 28 | * April 12, 2025 (2.10.0) 29 | * Improved performance, with noticeable gains on ARM architecture. 30 | * Fixed compiler warnings and addressed undefined behavior.
31 | * March 19, 2025 (2.9.1) 32 | * No functional changes, resolved compiler warnings & undefined behavior. 33 | * March 16, 2025 (2.9.0) 34 | * Support for generalized suffix array (GSA) construction. 35 | * Support for longest common prefix array (LCP) construction for generalized suffix array (GSA). 36 | * January 16, 2025 (2.8.7) 37 | * Restore the input array after suffix array construction (libsais64 & libsais16x64). 38 | * November 18, 2024 (2.8.6) 39 | * Fixed out-of-bound memory access issue for large inputs. 40 | * July 31, 2024 (2.8.5) 41 | * Miscellaneous changes to reduce compiler warnings about implicit functions. 42 | * June 13, 2024 (2.8.4) 43 | * Additional OpenMP acceleration (libsais16 & libsais16x64). 44 | * June 11, 2024 (2.8.3) 45 | * Implemented suffix array construction of a long 16-bit array (libsais16x64). 46 | * May 27, 2024 (2.8.2) 47 | * Implemented suffix array construction of a long 64-bit array (libsais64). 48 | * April 5, 2024 (2.8.1) 49 | * Fixed out-of-bound memory access issue for large inputs (libsais64). 50 | * March 3, 2024 (2.8.0) 51 | * Implemented permuted longest common prefix array (PLCP) construction of an integer array. 52 | * Fixed compilation error when compiling the library with OpenMP enabled. 53 | * February 26, 2024 (2.7.5) 54 | * Improved performance of suffix array and Burrows-Wheeler transform construction on degenerate inputs. 55 | * February 23, 2024 (2.7.4) 56 | * Resolved strict aliasing violation that resulted in invalid code generation by Intel compiler. 57 | * April 21, 2023 (2.7.3) 58 | * CMake script for library build and integration with other projects. 59 | * April 18, 2023 (2.7.2) 60 | * Fixed out-of-bound memory access issue for large inputs (libsais64). 61 | * June 19, 2022 (2.7.1) 62 | * Improved cache coherence for ARMv8 architecture. 63 | * April 12, 2022 (2.7.0) 64 | * Support for longest common prefix array (LCP) construction.
65 | * January 1, 2022 (2.6.5) 66 | * Exposed functions to construct suffix array of a given integer array. 67 | * Improved detection of various compiler intrinsics. 68 | * Capped free space parameter to avoid crashing due to 32-bit integer overflow. 69 | * October 21, 2021 (2.6.0) 70 | * libsais16 for 16-bit inputs. 71 | * October 15, 2021 (2.5.0) 72 | * Support for optional symbol frequency tables. 73 | * July 14, 2021 (2.4.0) 74 | * Reverse Burrows-Wheeler transform. 75 | * June 23, 2021 (2.3.0) 76 | * Burrows-Wheeler transform with auxiliary indexes. 77 | * April 27, 2021 (2.2.0) 78 | * libsais64 for inputs larger than 2GB. 79 | * April 19, 2021 (2.1.0) 80 | * Additional OpenMP acceleration. 81 | * April 4, 2021 (2.0.0) 82 | * OpenMP acceleration. 83 | * February 23, 2021 (1.0.0) 84 | * Initial release. 85 | 86 | ## Versions of the libsais 87 | * [libsais.c](src/libsais.c) (and corresponding [libsais.h](include/libsais.h)) is for suffix array, GSA, PLCP, LCP, forward BWT and reverse BWT construction over 8-bit inputs smaller than 2GB (2147483648 bytes). 88 | * [libsais64.c](src/libsais64.c) (and corresponding [libsais64.h](include/libsais64.h)) is an optional extension of the library for inputs larger than or equal to 2GB (2147483648 bytes). 89 | * These versions of the library could also be used to construct the suffix array of an integer array (with the caveat that the input array must be mutable). 90 | * [libsais16.c](src/libsais16.c) + [libsais16x64.c](src/libsais16x64.c) (and corresponding [libsais16.h](include/libsais16.h) + [libsais16x64.h](include/libsais16x64.h)) is an independent version of the library for 16-bit inputs. 91 | * This version of the library could also be used to construct the suffix array and BWT of a set of strings by adding a unique end-of-string symbol to each string and then computing the result for the concatenated string.
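The set-of-strings approach mentioned above can be prepared with a small packing step. The sketch below is not part of libsais: `pack_strings_u16` is a hypothetical helper, and the symbol mapping (terminators `0..count-1`, input bytes shifted up by `count`) is only one assumed way to give each string a unique end-of-string symbol within 16 bits. The packed array is what would then be handed to the 16-bit version of the library.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libsais): packs `count` byte strings into one
 * 16-bit symbol array so that every string ends with its own unique terminator.
 * Terminators take the values 0..count-1 and every input byte b becomes b + count,
 * so terminators never collide with real symbols.
 * Returns the number of symbols written, or 0 if the arguments do not fit.
 */
size_t pack_strings_u16(const uint8_t * const * strings, const size_t * lengths,
                        size_t count, uint16_t * out, size_t out_capacity)
{
    size_t total = 0, i, j;

    /* The shifted alphabet needs count + 256 distinct values in 16 bits. */
    if (count == 0 || count > 65536u - 256u) { return 0; }

    for (i = 0; i < count; ++i)
    {
        /* Room is needed for the string plus its terminator. */
        if (lengths[i] >= out_capacity - total) { return 0; }

        for (j = 0; j < lengths[i]; ++j)
        {
            out[total++] = (uint16_t)(strings[i][j] + count);
        }

        out[total++] = (uint16_t)i; /* unique end-of-string symbol */
    }

    return total;
}
```

Because each terminator is unique, no suffix comparison can run past the end of one string into the next; other mappings work equally well as long as the terminators stay unique.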
92 | 93 | ## Examples of APIs (see [libsais.h](include/libsais.h), [libsais16.h](include/libsais16.h), [libsais16x64.h](include/libsais16x64.h) and [libsais64.h](include/libsais64.h) for complete APIs list) 94 | ```c 95 | /** 96 | * Constructs the suffix array of a given string. 97 | * @param T [0..n-1] The input string. 98 | * @param SA [0..n-1+fs] The output array of suffixes. 99 | * @param n The length of the given string. 100 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 101 | * @param freq [0..255] The output symbol frequency table (can be NULL). 102 | * @return 0 if no error occurred, -1 or -2 otherwise. 103 | */ 104 | int32_t libsais(const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 105 | 106 | /** 107 | * Constructs the suffix array of a given integer array. 108 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 109 | * @param T [0..n-1] The input integer array. 110 | * @param SA [0..n-1+fs] The output array of suffixes. 111 | * @param n The length of the integer array. 112 | * @param k The alphabet size of the input integer array. 113 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 114 | * @return 0 if no error occurred, -1 or -2 otherwise. 115 | */ 116 | int32_t libsais_int(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs); 117 | 118 | /** 119 | * Constructs the burrows-wheeler transformed string (BWT) of a given string. 120 | * @param T [0..n-1] The input string. 121 | * @param U [0..n-1] The output string (can be T). 122 | * @param A [0..n-1+fs] The temporary array. 123 | * @param n The length of the given string. 124 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 125 | * @param freq [0..255] The output symbol frequency table (can be NULL). 
126 | * @return The primary index if no error occurred, -1 or -2 otherwise. 127 | */ 128 | int32_t libsais_bwt(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq); 129 | 130 | /** 131 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with primary index. 132 | * @param T [0..n-1] The input string. 133 | * @param U [0..n-1] The output string (can be T). 134 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 135 | * @param n The length of the given string. 136 | * @param freq [0..255] The input symbol frequency table (can be NULL). 137 | * @param i The primary index. 138 | * @return 0 if no error occurred, -1 or -2 otherwise. 139 | */ 140 | int32_t libsais_unbwt(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i); 141 | 142 | /** 143 | * Constructs the permuted longest common prefix array (PLCP) of a given string and a suffix array. 144 | * @param T [0..n-1] The input string. 145 | * @param SA [0..n-1] The input suffix array. 146 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 147 | * @param n The length of the string and the suffix array. 148 | * @return 0 if no error occurred, -1 otherwise. 
149 | */ 150 | int32_t libsais_plcp(const uint8_t * T, const int32_t * SA, int32_t * PLCP, int32_t n); 151 | ``` 152 | 153 | ## Example installation using [CPM](https://github.com/cpm-cmake/CPM.cmake) 154 | ```cmake 155 | CPMAddPackage( 156 | NAME libsais 157 | GITHUB_REPOSITORY IlyaGrebnov/libsais 158 | GIT_TAG v2.8.5 159 | OPTIONS 160 | "LIBSAIS_USE_OPENMP OFF" 161 | "LIBSAIS_BUILD_SHARED_LIB OFF" 162 | ) 163 | 164 | target_link_libraries( libsais) 165 | ``` 166 | 167 | # Algorithm description 168 | The libsais uses the SA-IS (Suffix Array Induced Sorting) algorithm to construct both the suffix array and the Burrows-Wheeler transform through recursive decomposition and induced sorting: 169 | * Initially, the algorithm classifies each position in a string as either an S-type or an L-type, based on whether the suffix starting at that position is lexicographically smaller or larger than the suffix at the adjacent right position. Positions identified as S-type, which have an adjacent left L-type position, are further categorized as LMS-type (Leftmost S-type) positions. Next, the algorithm splits the input string into LMS substrings, which start at an LMS-type position and extend up to the next adjacent LMS-type position. These LMS substrings are then lexicographically sorted through induced sorting and subsequently replaced in the input string with their corresponding sorted ranks, thus forming a new, compacted string. This compacted string reduces the problem size, enabling the algorithm to perform a recursive decomposition in which it is reapplied to construct the suffix array for the compacted string. And at the end of the recursive call, the suffix array for the input string is constructed from the suffix array of the compacted string using another round of induced sorting. 
170 | * Induced sorting is a core mechanism of the SA-IS algorithm and is employed twice during each recursive call: initially before the recursive call to establish the order of LMS substrings, and subsequently after the recursive call to finalize the order of the suffixes of the string. This process involves two sequential scans: a left-to-right scan that determines the order of L-type positions based on the LMS-type positions, followed by a right-to-left scan that establishes the order of S-type positions based on L-type positions. These scans efficiently extend the ordering from LMS-type positions to all positions in the string. 171 | 172 | The SA-IS algorithm is quite elegant, yet implementing it efficiently presents multiple challenges. The primary challenge is that the SA-IS algorithm exhibits random memory access patterns, which can significantly decrease efficiency due to cache misses. Another significant challenge is that the SA-IS algorithm is not a lightweight construction algorithm; it requires additional memory to support position classification, induced sorting, compacted string representations, and recursive decomposition. To address these challenges, the libsais implements careful optimizations that are worth highlighting: 173 | * The libsais is meticulously designed from the ground up to leverage the capabilities of modern microprocessors, aiming to minimize various stalls and enhance throughput through instruction-level parallelism. The library employs sophisticated techniques such as manual loop unrolling, software prefetching, and branch elimination to achieve this goal. Moreover, it strives to minimize the number of passes over the data by combining multiple operations into a single function. A prime example of these techniques can be observed in the initialization phase of the SA-IS algorithm.
In this phase, the entire logic required to classify positions, count symbols into various buckets, and segment the string into LMS substrings is executed through a single, completely branch-less loop: 174 | ```c 175 | for (i = m - 1, j = omp_block_start + prefetch_distance + 3; i >= j; i -= 4) 176 | { 177 | libsais_prefetchr(&T[i - 2 * prefetch_distance]); 178 | 179 | libsais_prefetchw(&buckets[BUCKETS_INDEX4(T[i - prefetch_distance - 0], 0)]); 180 | libsais_prefetchw(&buckets[BUCKETS_INDEX4(T[i - prefetch_distance - 1], 0)]); 181 | libsais_prefetchw(&buckets[BUCKETS_INDEX4(T[i - prefetch_distance - 2], 0)]); 182 | libsais_prefetchw(&buckets[BUCKETS_INDEX4(T[i - prefetch_distance - 3], 0)]); 183 | 184 | c1 = T[i - 0]; f1 = (fast_uint_t)(c1 > (c0 - (fast_sint_t)(f0))); SA[m] = (sa_sint_t)(i + 1); m -= (f1 & ~f0); 185 | buckets[BUCKETS_INDEX4((fast_uint_t)c0, f0 + f0 + f1)]++; 186 | 187 | c0 = T[i - 1]; f0 = (fast_uint_t)(c0 > (c1 - (fast_sint_t)(f1))); SA[m] = (sa_sint_t)(i - 0); m -= (f0 & ~f1); 188 | buckets[BUCKETS_INDEX4((fast_uint_t)c1, f1 + f1 + f0)]++; 189 | 190 | c1 = T[i - 2]; f1 = (fast_uint_t)(c1 > (c0 - (fast_sint_t)(f0))); SA[m] = (sa_sint_t)(i - 1); m -= (f1 & ~f0); 191 | buckets[BUCKETS_INDEX4((fast_uint_t)c0, f0 + f0 + f1)]++; 192 | 193 | c0 = T[i - 3]; f0 = (fast_uint_t)(c0 > (c1 - (fast_sint_t)(f1))); SA[m] = (sa_sint_t)(i - 2); m -= (f0 & ~f1); 194 | buckets[BUCKETS_INDEX4((fast_uint_t)c1, f1 + f1 + f0)]++; 195 | } 196 | ``` 197 | * To sort LMS substrings lexicographically and compute their ranks, the libsais algorithm begins by gathering LMS-type positions as they appear in the string, placing them at the end of the suffix array. The library then employs two passes of induced sorting, which concludes with these same LMS-type positions ordered lexicographically at the beginning of the suffix array. 
Once all LMS-type positions are sorted, the ranks of the LMS substrings are computed by inspecting each pair of adjacent positions to determine if the corresponding LMS substrings are identical. If they are the same, they receive the same rank; otherwise, the rank is incremented by one. 198 | * The first challenge of induced sorting is that, during passes over the suffix array, we need to examine each value to determine if it represents a valid position or an empty space, whether this position is not the beginning of the string (and thus could induce another position), and if the induced position is going to be of the necessary type (for example, during a left-to-right scan, we are only inducing L-type positions). This process can cause branch mispredictions and corresponding microprocessor stalls. To address this challenge, libsais employs the following techniques. Firstly, the library uses two pointers per induction bucket, each pointing to different sections of the suffix array depending on the type of position each entry will induce next. This approach allows for the separation of LS-type (meaning S-type, which induces L-type; this is the same as LMS-type) and LL-type positions needed for the left-to-right scan from SL-type and SS-type positions needed for the right-to-left scan. Secondly, by understanding the distribution of symbols based on their position types and the types they induce (i.e., SS, SL, LS, LL), we can pre-calculate pointers for each bucket, leaving no empty spaces. Thirdly, by removing the first LMS position and all positions left of it from the initial gathering and distribution, we eliminate the need to check whether a position is not the beginning of the string. These techniques not only result in a completely branch-less loop for each induced sorting pass but also eliminate redundant scanning and the final gathering of LMS-type positions at the beginning of the suffix array.
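To make the three checks described above concrete, here is the textbook form of the left-to-right pass, i.e. the branchy loop that libsais is designed to avoid. This is an illustrative sketch, not libsais code: `classify_types` and `induce_l_textbook` are hypothetical names, the stored `is_l` type array and the `-1` marker for empty slots are assumptions of the sketch, and the last position is treated as L-type as if a sentinel smaller than every symbol followed the string.

```c
#include <stddef.h>
#include <stdint.h>

/* Marks each position as L-type (1) or S-type (0), scanning right to left. */
void classify_types(const uint8_t * T, size_t n, uint8_t * is_l)
{
    int c0, c1, f0, f1; /* c0/f0: symbol and type of the position to the right */
    size_t i;

    if (n == 0) { return; }

    c0 = T[n - 1]; f0 = 1; is_l[n - 1] = 1;

    for (i = n - 1; i-- > 0; )
    {
        c1 = T[i];
        /* L-type iff T[i] > T[i+1], or T[i] == T[i+1] and position i+1 is L-type. */
        f1 = (c1 > (c0 - f0));
        is_l[i] = (uint8_t)f1;
        c0 = c1; f0 = f1;
    }
}

/*
 * Textbook left-to-right pass. On entry SA holds the sorted LMS positions at the
 * ends of their buckets and -1 everywhere else; bucket_head[c] is the index of the
 * first slot of the bucket for symbol c and is advanced in place.
 */
void induce_l_textbook(const uint8_t * T, const uint8_t * is_l,
                       int32_t * SA, size_t n, int32_t * bucket_head)
{
    size_t i;

    if (n == 0) { return; }

    /* The last suffix is L-type and is seeded first, as if induced by the sentinel. */
    SA[bucket_head[T[n - 1]]++] = (int32_t)(n - 1);

    for (i = 0; i < n; ++i)
    {
        int32_t p = SA[i];

        if (p < 0)        { continue; } /* empty slot */
        if (p == 0)       { continue; } /* beginning of the string: nothing to induce */
        if (!is_l[p - 1]) { continue; } /* this pass induces L-type positions only */

        SA[bucket_head[T[p - 1]]++] = p - 1;
    }
}
```

For `banana`, with the sorted LMS positions 3 and 1 placed at the end of the `a` bucket, this pass fills in positions 5, 4, 2 and 0. Each `continue` is one of the data-dependent branches discussed above.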
199 | * The second challenge arises after induced sorting when we need to compute the ranks of LMS substrings. The straightforward approach is to first calculate and store the lengths of LMS substrings and then inspect each pair of adjacent LMS-type positions to determine if the corresponding LMS substrings are identical. This comparison starts with their lengths, and if they are the same, proceeds to compare the substrings themselves. Such operations exhibit random memory access patterns, which can significantly decrease efficiency due to cache misses. However, libsais avoids this inefficient logic by incorporating the ranking of LMS substrings as part of the induced sorting process itself. The library achieves this by marking the most significant bit (MSB) of positions in the suffix array that start new ranking groups. Each time a position is processed during induced sorting, the library checks the MSB and increments the current rank if the beginning of a new ranking group is encountered. Additionally, for each pointer in an induction bucket, the rank of the previously induced position is maintained. Whenever another position is induced, this previous rank is used to determine whether to mark the newly induced position as the beginning of a new ranking group. All the logic to update the ranks and mark the beginnings of new ranking groups is implemented using bit manipulation and is completely branch-less.
200 | ```c 201 | for (i = omp_block_start, j = omp_block_start + omp_block_size - 2 * prefetch_distance - 1; i < j; i += 2) 202 | { 203 | libsais_prefetchr(&SA[i + 3 * prefetch_distance]); 204 | 205 | libsais_prefetchr(&T[SA[i + 2 * prefetch_distance + 0] & SAINT_MAX] - 1); 206 | libsais_prefetchr(&T[SA[i + 2 * prefetch_distance + 0] & SAINT_MAX] - 2); 207 | libsais_prefetchr(&T[SA[i + 2 * prefetch_distance + 1] & SAINT_MAX] - 1); 208 | libsais_prefetchr(&T[SA[i + 2 * prefetch_distance + 1] & SAINT_MAX] - 2); 209 | 210 | sa_sint_t p0 = SA[i + prefetch_distance + 0] & SAINT_MAX; sa_sint_t v0 = BUCKETS_INDEX4(T[p0 - (p0 > 0)], 0); libsais_prefetchw(&buckets[v0]); 211 | sa_sint_t p1 = SA[i + prefetch_distance + 1] & SAINT_MAX; sa_sint_t v1 = BUCKETS_INDEX4(T[p1 - (p1 > 0)], 0); libsais_prefetchw(&buckets[v1]); 212 | 213 | sa_sint_t p2 = SA[i + 0]; d += (p2 < 0); p2 &= SAINT_MAX; sa_sint_t v2 = BUCKETS_INDEX4(T[p2 - 1], T[p2 - 2] >= T[p2 - 1]); 214 | SA[buckets[v2]++] = (p2 - 1) | (sa_sint_t)((sa_uint_t)(buckets[2 + v2] != d) << (SAINT_BIT - 1)); buckets[2 + v2] = d; 215 | 216 | sa_sint_t p3 = SA[i + 1]; d += (p3 < 0); p3 &= SAINT_MAX; sa_sint_t v3 = BUCKETS_INDEX4(T[p3 - 1], T[p3 - 2] >= T[p3 - 1]); 217 | SA[buckets[v3]++] = (p3 - 1) | (sa_sint_t)((sa_uint_t)(buckets[2 + v3] != d) << (SAINT_BIT - 1)); buckets[2 + v3] = d; 218 | } 219 | ``` 220 | * In the SA-IS algorithm, after induced sorting, the ranks of LMS substrings are computed in suffix order. These ranks then need to be scattered to reorder them in string order before being gathered again to form the compacted string for recursion. At this point, some LMS substrings may be unique, meaning they don't share their rank with any other LMS substring. Being unique, these substrings are essentially already sorted, and their position relative to other LMS substrings is already determined. 
However, these unique LMS substrings may still be necessary for sorting other, non-unique LMS substrings during recursion, unless a unique LMS substring is immediately followed by another unique LMS substring in the string. In such cases, the rank of any subsequent unique LMS substrings becomes redundant in the compacted string, as it will not be utilized. Leveraging this insight, libsais employs a strategy to further reduce the size of the compacted string by omitting such redundant LMS substring ranks. This process involves a few steps. First, unique LMS substrings are identified by looking ahead while scanning LMS-positions in the suffix array during the ranking and scattering phase. When scattering LMS substring ranks to form the compacted string, the most significant bit (MSB) of the rank is used to mark that this rank is unique. Next, as the library scans the ranks in string order and detects tandems of unique ranks using the MSB, it recalculates the MSB for ranks which are redundant, thus marking them for removal from the compacted string. Subsequently, the libsais rescans the LMS-positions in suffix order to recompute the ranks, now focusing only on the ranks of the remaining LMS substrings. The library also uses the MSB of the first symbol of an LMS substring to mark that the substring has been removed from the compacted string. Finally, the library builds the compacted string based on the newly recalculated ranks for the remaining LMS substrings, while also saving the final positions for the removed LMS substrings before proceeding with recursion. This reduction process not only further decreases the size of the compacted string but also reduces the alphabet size of the reduced string and creates additional free space in the suffix array, which can be utilized during recursion.
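The idea of dropping redundant unique ranks can be illustrated on ranks that are already laid out in string order. The sketch below is a simplification under stated assumptions, not the libsais implementation: `flag_redundant_ranks` is a hypothetical name, uniqueness is found with a separate counting array rather than MSB tagging, and a unique rank is treated as redundant when the rank immediately before it in string order is also unique, which is one reading of the tandem rule described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Simplified illustration (not libsais code): given LMS substring ranks in string
 * order, flags the ranks that could be dropped from the compacted string.
 * A rank is unique when it occurs exactly once; a unique rank is flagged when the
 * rank immediately before it in string order is unique as well.
 * Every rank must lie in [0, k). Returns the number of flagged ranks, or
 * (size_t)-1 if the temporary counting array cannot be allocated.
 */
size_t flag_redundant_ranks(const int32_t * ranks, size_t m, int32_t k, uint8_t * redundant)
{
    size_t flagged = 0, i;
    int prev_unique = 0;
    int32_t * count = (int32_t *)calloc((size_t)(k > 0 ? k : 1), sizeof(int32_t));

    if (count == NULL) { return (size_t)-1; }

    /* First pass: how often does each rank occur? */
    for (i = 0; i < m; ++i) { count[ranks[i]]++; }

    /* Second pass: detect tandems of unique ranks in string order. */
    for (i = 0; i < m; ++i)
    {
        int unique = (count[ranks[i]] == 1);

        redundant[i] = (uint8_t)(unique & prev_unique);
        flagged += redundant[i];
        prev_unique = unique;
    }

    free(count);
    return flagged;
}
```

In a run of consecutive unique ranks only the first one is kept, since it already decides any comparison that reaches it.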
221 | * The SA-IS algorithm, while robust for suffix array construction, is not considered lightweight due to its need for additional memory for tasks such as position classification, induced sorting, the creation of compacted string representations, and recursive decomposition. To mitigate this, libsais optimizes memory usage by not storing position classifications and striving to reuse the memory space allocated for the suffix array for induced sorting, compacted string representations, and recursive decomposition processes. Since position classifications are not stored, the library recalculates them as needed, typically involving checks of adjacent symbols for a given position. Although this approach may seem straightforward, it introduces the challenge of random memory access. Nevertheless, libsais manages these accesses in a manner that either avoids unnecessary memory fetches or minimizes cache penalties. In situations where avoiding cache penalties is unfeasible, the library leverages the most significant bits (MSBs) for computations, as branch mispredictions on modern microprocessors generally incur lower penalties than cache misses. Memory reuse for the suffix array, despite appearing straightforward, also presents hidden challenges related to implementation complexity. In certain cases, the available space in the suffix array may not suffice for the optimal algorithm implementation mentioned above. Although such instances are rare, the library aims to deliver the best achievable performance without additional memory allocation by resorting to a less efficient variant of induced sorting. To accommodate various scenarios, libsais includes four distinct implementations tailored to different breakpoints based on alphabet size (denoted by 'k'): 6k, 4k, 2k, and 1k, with each implementation optimized to ensure performance efficiency.
Extensive efforts have been dedicated to refining these implementations, including significant time invested in using various sanitizers to confirm the correctness of the algorithms. Ultimately, while there are specific inputs under which libsais might require additional memory (most of which tend to be synthetic tests designed specifically to challenge the SA-IS algorithm), such instances are relatively rare. In these exceptional cases, the library is designed to allocate only the minimum necessary amount of memory while still delivering the best possible performance. 222 | * The libsais library was initially developed for constructing suffix arrays, but has broadened its scope to include the calculation of the longest common prefix (LCP) array and both the forward and inverse Burrows-Wheeler Transform (BWT). Considerable effort has been dedicated to refining these algorithms to ensure they deliver maximum performance and maintain correctness. An illustrative example is the forward BWT, whose performance is nearly identical to that of suffix array construction. This is achieved by integrating a modified version of the induced sorting implementation within the final stage of the SA-IS algorithm. Rather than inducing suffix positions at this stage, the library induces the Burrows-Wheeler Transform directly. This approach also supports in-place transformation, maintaining a memory usage of 5n, making it suitable for data compression applications. Similarly, the inverse BWT is fine-tuned to operate in-place, adhering to the same memory efficiency of 5n, with an additional optimization: a bi-gram LF-mapping technique, which decodes two symbols at a time and effectively reduces the number of cache misses during the inversion of the Burrows-Wheeler Transform. 223 | 224 | # Benchmarks 225 | 226 | The full list of benchmarks has been moved to its own [Benchmarks.md](Benchmarks.md) file.
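As a reference point for the forward and inverse BWT discussed in the algorithm description, here is a deliberately naive sketch that can serve as a correctness oracle on small inputs, for example before timing larger runs. It is not how libsais works: `compare_rotations`, `naive_bwt` and `naive_unbwt` are hypothetical names, the forward transform sorts cyclic rotations in quadratic time or worse, the inverse uses a plain one-symbol LF-mapping rather than the bi-gram variant, and the primary index here is the sorted row of the original string, which is an assumption of the sketch and may differ from the convention used by `libsais_bwt`.

```c
#include <stddef.h>
#include <stdint.h>

/* Compares the cyclic rotations of T that start at positions a and b. */
int compare_rotations(const uint8_t * T, size_t n, size_t a, size_t b)
{
    size_t i;

    for (i = 0; i < n; ++i)
    {
        uint8_t ca = T[(a + i) % n], cb = T[(b + i) % n];
        if (ca != cb) { return ca < cb ? -1 : 1; }
    }

    return 0;
}

/* Writes the last column of the sorted rotations to U (U must not alias T);
 * A is a temporary array of n entries. Returns the row of the original string. */
size_t naive_bwt(const uint8_t * T, uint8_t * U, int32_t * A, size_t n)
{
    size_t i, j, primary = 0;

    /* Insertion sort of the rotation start positions. */
    for (i = 0; i < n; ++i)
    {
        j = i;
        while (j > 0 && compare_rotations(T, n, (size_t)A[j - 1], i) > 0) { A[j] = A[j - 1]; --j; }
        A[j] = (int32_t)i;
    }

    for (i = 0; i < n; ++i)
    {
        size_t r = (size_t)A[i];

        U[i] = T[(r + n - 1) % n];
        if (r == 0) { primary = i; }
    }

    return primary;
}

/* Rebuilds the original string into T from the last column L (T must not alias L);
 * LF is a temporary array of n entries. Returns 0 on success, -1 on a bad index. */
int naive_unbwt(const uint8_t * L, uint8_t * T, int32_t * LF, size_t n, size_t primary)
{
    size_t count[256] = { 0 }, start[256], i, p, sum = 0;
    int c;

    if (n == 0) { return 0; }
    if (primary >= n) { return -1; }

    for (i = 0; i < n; ++i) { count[L[i]]++; }
    for (c = 0; c < 256; ++c) { start[c] = sum; sum += count[c]; }

    /* LF[i]: row whose rotation starts with the symbol occurrence L[i]. */
    for (i = 0; i < n; ++i) { LF[i] = (int32_t)start[L[i]]++; }

    /* Walk backwards from the row of the original string. */
    for (p = primary, i = n; i-- > 0; ) { T[i] = L[p]; p = (size_t)LF[p]; }

    return 0;
}
```

On `banana` the forward sketch yields `nnbaaa` with primary index 3, and the inverse recovers `banana` from that pair.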
227 | -------------------------------------------------------------------------------- /VERSION: -------------------------------------------------------------------------------- 1 | 2.10.1 2 | -------------------------------------------------------------------------------- /include/libsais.h: -------------------------------------------------------------------------------- 1 | /*-- 2 | 3 | This file is a part of libsais, a library for linear time suffix array, 4 | longest common prefix array and burrows wheeler transform construction. 5 | 6 | Copyright (c) 2021-2025 Ilya Grebnov 7 | 8 | Licensed under the Apache License, Version 2.0 (the "License"); 9 | you may not use this file except in compliance with the License. 10 | You may obtain a copy of the License at 11 | 12 | http://www.apache.org/licenses/LICENSE-2.0 13 | 14 | Unless required by applicable law or agreed to in writing, software 15 | distributed under the License is distributed on an "AS IS" BASIS, 16 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 17 | See the License for the specific language governing permissions and 18 | limitations under the License. 19 | 20 | Please see the file LICENSE for full copyright information. 21 | 22 | --*/ 23 | 24 | #ifndef LIBSAIS_H 25 | #define LIBSAIS_H 1 26 | 27 | #define LIBSAIS_VERSION_MAJOR 2 28 | #define LIBSAIS_VERSION_MINOR 10 29 | #define LIBSAIS_VERSION_PATCH 1 30 | #define LIBSAIS_VERSION_STRING "2.10.1" 31 | 32 | #ifdef _WIN32 33 | #ifdef LIBSAIS_SHARED 34 | #ifdef LIBSAIS_EXPORTS 35 | #define LIBSAIS_API __declspec(dllexport) 36 | #else 37 | #define LIBSAIS_API __declspec(dllimport) 38 | #endif 39 | #else 40 | #define LIBSAIS_API 41 | #endif 42 | #else 43 | #define LIBSAIS_API 44 | #endif 45 | 46 | #ifdef __cplusplus 47 | extern "C" { 48 | #endif 49 | 50 | #include <stdint.h> 51 | 52 | /** 53 | * Creates the libsais context that allows reusing allocated memory with each libsais operation.
54 | * In multi-threaded environments, use one context per thread for parallel executions. 55 | * @return the libsais context, NULL otherwise. 56 | */ 57 | LIBSAIS_API void * libsais_create_ctx(void); 58 | 59 | #if defined(LIBSAIS_OPENMP) 60 | /** 61 | * Creates the libsais context that allows reusing allocated memory with each parallel libsais operation using OpenMP. 62 | * In multi-threaded environments, use one context per thread for parallel executions. 63 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 64 | * @return the libsais context, NULL otherwise. 65 | */ 66 | LIBSAIS_API void * libsais_create_ctx_omp(int32_t threads); 67 | #endif 68 | 69 | /** 70 | * Destroys the libsais context and frees previously allocated memory. 71 | * @param ctx The libsais context (can be NULL). 72 | */ 73 | LIBSAIS_API void libsais_free_ctx(void * ctx); 74 | 75 | /** 76 | * Constructs the suffix array of a given string. 77 | * @param T [0..n-1] The input string. 78 | * @param SA [0..n-1+fs] The output array of suffixes. 79 | * @param n The length of the given string. 80 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 81 | * @param freq [0..255] The output symbol frequency table (can be NULL). 82 | * @return 0 if no error occurred, -1 or -2 otherwise. 83 | */ 84 | LIBSAIS_API int32_t libsais(const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 85 | 86 | /** 87 | * Constructs the generalized suffix array (GSA) of given string set. 88 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 89 | * @param SA [0..n-1+fs] The output array of suffixes. 90 | * @param n The length of the given string set. 91 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 92 | * @param freq [0..255] The output symbol frequency table (can be NULL). 93 | * @return 0 if no error occurred, -1 or -2 otherwise.
94 | */ 95 | LIBSAIS_API int32_t libsais_gsa(const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 96 | 97 | /** 98 | * Constructs the suffix array of a given integer array. 99 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 100 | * @param T [0..n-1] The input integer array. 101 | * @param SA [0..n-1+fs] The output array of suffixes. 102 | * @param n The length of the integer array. 103 | * @param k The alphabet size of the input integer array. 104 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 105 | * @return 0 if no error occurred, -1 or -2 otherwise. 106 | */ 107 | LIBSAIS_API int32_t libsais_int(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs); 108 | 109 | /** 110 | * Constructs the suffix array of a given string using libsais context. 111 | * @param ctx The libsais context. 112 | * @param T [0..n-1] The input string. 113 | * @param SA [0..n-1+fs] The output array of suffixes. 114 | * @param n The length of the given string. 115 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 116 | * @param freq [0..255] The output symbol frequency table (can be NULL). 117 | * @return 0 if no error occurred, -1 or -2 otherwise. 118 | */ 119 | LIBSAIS_API int32_t libsais_ctx(const void * ctx, const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 120 | 121 | /** 122 | * Constructs the generalized suffix array (GSA) of given string set using libsais context. 123 | * @param ctx The libsais context. 124 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 125 | * @param SA [0..n-1+fs] The output array of suffixes. 126 | * @param n The length of the given string set. 127 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 
128 | * @param freq [0..255] The output symbol frequency table (can be NULL). 129 | * @return 0 if no error occurred, -1 or -2 otherwise. 130 | */ 131 | LIBSAIS_API int32_t libsais_gsa_ctx(const void * ctx, const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 132 | 133 | #if defined(LIBSAIS_OPENMP) 134 | /** 135 | * Constructs the suffix array of a given string in parallel using OpenMP. 136 | * @param T [0..n-1] The input string. 137 | * @param SA [0..n-1+fs] The output array of suffixes. 138 | * @param n The length of the given string. 139 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 140 | * @param freq [0..255] The output symbol frequency table (can be NULL). 141 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 142 | * @return 0 if no error occurred, -1 or -2 otherwise. 143 | */ 144 | LIBSAIS_API int32_t libsais_omp(const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq, int32_t threads); 145 | 146 | /** 147 | * Constructs the generalized suffix array (GSA) of given string set in parallel using OpenMP. 148 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 149 | * @param SA [0..n-1+fs] The output array of suffixes. 150 | * @param n The length of the given string set. 151 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 152 | * @param freq [0..255] The output symbol frequency table (can be NULL). 153 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 154 | * @return 0 if no error occurred, -1 or -2 otherwise. 155 | */ 156 | LIBSAIS_API int32_t libsais_gsa_omp(const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq, int32_t threads); 157 | 158 | /** 159 | * Constructs the suffix array of a given integer array in parallel using OpenMP. 
160 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 161 | * @param T [0..n-1] The input integer array. 162 | * @param SA [0..n-1+fs] The output array of suffixes. 163 | * @param n The length of the integer array. 164 | * @param k The alphabet size of the input integer array. 165 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 166 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 167 | * @return 0 if no error occurred, -1 or -2 otherwise. 168 | */ 169 | LIBSAIS_API int32_t libsais_int_omp(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs, int32_t threads); 170 | #endif 171 | 172 | /** 173 | * Constructs the burrows-wheeler transformed string (BWT) of a given string. 174 | * @param T [0..n-1] The input string. 175 | * @param U [0..n-1] The output string (can be T). 176 | * @param A [0..n-1+fs] The temporary array. 177 | * @param n The length of the given string. 178 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 179 | * @param freq [0..255] The output symbol frequency table (can be NULL). 180 | * @return The primary index if no error occurred, -1 or -2 otherwise. 181 | */ 182 | LIBSAIS_API int32_t libsais_bwt(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq); 183 | 184 | /** 185 | * Constructs the burrows-wheeler transformed string (BWT) of a given string with auxiliary indexes. 186 | * @param T [0..n-1] The input string. 187 | * @param U [0..n-1] The output string (can be T). 188 | * @param A [0..n-1+fs] The temporary array. 189 | * @param n The length of the given string. 190 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 191 | * @param freq [0..255] The output symbol frequency table (can be NULL). 
192 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 193 | * @param I [0..(n-1)/r] The output auxiliary indexes. 194 | * @return 0 if no error occurred, -1 or -2 otherwise. 195 | */ 196 | LIBSAIS_API int32_t libsais_bwt_aux(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t r, int32_t * I); 197 | 198 | /** 199 | * Constructs the burrows-wheeler transformed string (BWT) of a given string using libsais context. 200 | * @param ctx The libsais context. 201 | * @param T [0..n-1] The input string. 202 | * @param U [0..n-1] The output string (can be T). 203 | * @param A [0..n-1+fs] The temporary array. 204 | * @param n The length of the given string. 205 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 206 | * @param freq [0..255] The output symbol frequency table (can be NULL). 207 | * @return The primary index if no error occurred, -1 or -2 otherwise. 208 | */ 209 | LIBSAIS_API int32_t libsais_bwt_ctx(const void * ctx, const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq); 210 | 211 | /** 212 | * Constructs the burrows-wheeler transformed string (BWT) of a given string with auxiliary indexes using libsais context. 213 | * @param ctx The libsais context. 214 | * @param T [0..n-1] The input string. 215 | * @param U [0..n-1] The output string (can be T). 216 | * @param A [0..n-1+fs] The temporary array. 217 | * @param n The length of the given string. 218 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 219 | * @param freq [0..255] The output symbol frequency table (can be NULL). 220 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 221 | * @param I [0..(n-1)/r] The output auxiliary indexes. 222 | * @return 0 if no error occurred, -1 or -2 otherwise. 
223 | */ 224 | LIBSAIS_API int32_t libsais_bwt_aux_ctx(const void * ctx, const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t r, int32_t * I); 225 | 226 | #if defined(LIBSAIS_OPENMP) 227 | /** 228 | * Constructs the burrows-wheeler transformed string (BWT) of a given string in parallel using OpenMP. 229 | * @param T [0..n-1] The input string. 230 | * @param U [0..n-1] The output string (can be T). 231 | * @param A [0..n-1+fs] The temporary array. 232 | * @param n The length of the given string. 233 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 234 | * @param freq [0..255] The output symbol frequency table (can be NULL). 235 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 236 | * @return The primary index if no error occurred, -1 or -2 otherwise. 237 | */ 238 | LIBSAIS_API int32_t libsais_bwt_omp(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t threads); 239 | 240 | /** 241 | * Constructs the burrows-wheeler transformed string (BWT) of a given string with auxiliary indexes in parallel using OpenMP. 242 | * @param T [0..n-1] The input string. 243 | * @param U [0..n-1] The output string (can be T). 244 | * @param A [0..n-1+fs] The temporary array. 245 | * @param n The length of the given string. 246 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 247 | * @param freq [0..255] The output symbol frequency table (can be NULL). 248 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 249 | * @param I [0..(n-1)/r] The output auxiliary indexes. 250 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 251 | * @return 0 if no error occurred, -1 or -2 otherwise. 
252 | */ 253 | LIBSAIS_API int32_t libsais_bwt_aux_omp(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t r, int32_t * I, int32_t threads); 254 | #endif 255 | 256 | /** 257 | * Creates the libsais reverse BWT context that allows reusing allocated memory with each libsais_unbwt_* operation. 258 | * In multi-threaded environments, use one context per thread for parallel executions. 259 | * @return the libsais context, NULL otherwise. 260 | */ 261 | LIBSAIS_API void * libsais_unbwt_create_ctx(void); 262 | 263 | #if defined(LIBSAIS_OPENMP) 264 | /** 265 | * Creates the libsais reverse BWT context that allows reusing allocated memory with each parallel libsais_unbwt_* operation using OpenMP. 266 | * In multi-threaded environments, use one context per thread for parallel executions. 267 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 268 | * @return the libsais context, NULL otherwise. 269 | */ 270 | LIBSAIS_API void * libsais_unbwt_create_ctx_omp(int32_t threads); 271 | #endif 272 | 273 | /** 274 | * Destroys the libsais reverse BWT context and frees previously allocated memory. 275 | * @param ctx The libsais context (can be NULL). 276 | */ 277 | LIBSAIS_API void libsais_unbwt_free_ctx(void * ctx); 278 | 279 | /** 280 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with primary index. 281 | * @param T [0..n-1] The input string. 282 | * @param U [0..n-1] The output string (can be T). 283 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 284 | * @param n The length of the given string. 285 | * @param freq [0..255] The input symbol frequency table (can be NULL). 286 | * @param i The primary index. 287 | * @return 0 if no error occurred, -1 or -2 otherwise.
288 | */ 289 | LIBSAIS_API int32_t libsais_unbwt(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i); 290 | 291 | /** 292 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with primary index using libsais reverse BWT context. 293 | * @param ctx The libsais reverse BWT context. 294 | * @param T [0..n-1] The input string. 295 | * @param U [0..n-1] The output string (can be T). 296 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 297 | * @param n The length of the given string. 298 | * @param freq [0..255] The input symbol frequency table (can be NULL). 299 | * @param i The primary index. 300 | * @return 0 if no error occurred, -1 or -2 otherwise. 301 | */ 302 | LIBSAIS_API int32_t libsais_unbwt_ctx(const void * ctx, const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i); 303 | 304 | /** 305 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with auxiliary indexes. 306 | * @param T [0..n-1] The input string. 307 | * @param U [0..n-1] The output string (can be T). 308 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 309 | * @param n The length of the given string. 310 | * @param freq [0..255] The input symbol frequency table (can be NULL). 311 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 312 | * @param I [0..(n-1)/r] The input auxiliary indexes. 313 | * @return 0 if no error occurred, -1 or -2 otherwise. 314 | */ 315 | LIBSAIS_API int32_t libsais_unbwt_aux(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t r, const int32_t * I); 316 | 317 | /** 318 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with auxiliary indexes using libsais reverse BWT context. 319 | * @param ctx The libsais reverse BWT context. 320 | * @param T [0..n-1] The input string. 
321 | * @param U [0..n-1] The output string (can be T). 322 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 323 | * @param n The length of the given string. 324 | * @param freq [0..255] The input symbol frequency table (can be NULL). 325 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 326 | * @param I [0..(n-1)/r] The input auxiliary indexes. 327 | * @return 0 if no error occurred, -1 or -2 otherwise. 328 | */ 329 | LIBSAIS_API int32_t libsais_unbwt_aux_ctx(const void * ctx, const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t r, const int32_t * I); 330 | 331 | #if defined(LIBSAIS_OPENMP) 332 | /** 333 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with primary index in parallel using OpenMP. 334 | * @param T [0..n-1] The input string. 335 | * @param U [0..n-1] The output string (can be T). 336 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 337 | * @param n The length of the given string. 338 | * @param freq [0..255] The input symbol frequency table (can be NULL). 339 | * @param i The primary index. 340 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 341 | * @return 0 if no error occurred, -1 or -2 otherwise. 342 | */ 343 | LIBSAIS_API int32_t libsais_unbwt_omp(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i, int32_t threads); 344 | 345 | /** 346 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with auxiliary indexes in parallel using OpenMP. 347 | * @param T [0..n-1] The input string. 348 | * @param U [0..n-1] The output string (can be T). 349 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 350 | * @param n The length of the given string. 351 | * @param freq [0..255] The input symbol frequency table (can be NULL). 
352 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 353 | * @param I [0..(n-1)/r] The input auxiliary indexes. 354 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 355 | * @return 0 if no error occurred, -1 or -2 otherwise. 356 | */ 357 | LIBSAIS_API int32_t libsais_unbwt_aux_omp(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t r, const int32_t * I, int32_t threads); 358 | #endif 359 | 360 | /** 361 | * Constructs the permuted longest common prefix array (PLCP) of a given string and a suffix array. 362 | * @param T [0..n-1] The input string. 363 | * @param SA [0..n-1] The input suffix array. 364 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 365 | * @param n The length of the string and the suffix array. 366 | * @return 0 if no error occurred, -1 otherwise. 367 | */ 368 | LIBSAIS_API int32_t libsais_plcp(const uint8_t * T, const int32_t * SA, int32_t * PLCP, int32_t n); 369 | 370 | /** 371 | * Constructs the permuted longest common prefix array (PLCP) of a given string set and a generalized suffix array (GSA). 372 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 373 | * @param SA [0..n-1] The input generalized suffix array. 374 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 375 | * @param n The length of the string set and the generalized suffix array. 376 | * @return 0 if no error occurred, -1 otherwise. 377 | */ 378 | LIBSAIS_API int32_t libsais_plcp_gsa(const uint8_t * T, const int32_t * SA, int32_t * PLCP, int32_t n); 379 | 380 | /** 381 | * Constructs the permuted longest common prefix array (PLCP) of a given integer array and a suffix array. 382 | * @param T [0..n-1] The input integer array. 383 | * @param SA [0..n-1] The input suffix array. 384 | * @param PLCP [0..n-1] The output permuted longest common prefix array.
385 | * @param n The length of the integer array and the suffix array. 386 | * @return 0 if no error occurred, -1 otherwise. 387 | */ 388 | LIBSAIS_API int32_t libsais_plcp_int(const int32_t * T, const int32_t * SA, int32_t * PLCP, int32_t n); 389 | 390 | /** 391 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array. 392 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 393 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 394 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 395 | * @param n The length of the permuted longest common prefix array and the suffix array. 396 | * @return 0 if no error occurred, -1 otherwise. 397 | */ 398 | LIBSAIS_API int32_t libsais_lcp(const int32_t * PLCP, const int32_t * SA, int32_t * LCP, int32_t n); 399 | 400 | #if defined(LIBSAIS_OPENMP) 401 | /** 402 | * Constructs the permuted longest common prefix array (PLCP) of a given string and a suffix array in parallel using OpenMP. 403 | * @param T [0..n-1] The input string. 404 | * @param SA [0..n-1] The input suffix array. 405 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 406 | * @param n The length of the string and the suffix array. 407 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 408 | * @return 0 if no error occurred, -1 otherwise. 409 | */ 410 | LIBSAIS_API int32_t libsais_plcp_omp(const uint8_t * T, const int32_t * SA, int32_t * PLCP, int32_t n, int32_t threads); 411 | 412 | /** 413 | * Constructs the permuted longest common prefix array (PLCP) of a given string set and a generalized suffix array (GSA) in parallel using OpenMP. 414 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 415 | * @param SA [0..n-1] The input generalized suffix array. 416 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 
417 | * @param n The length of the string set and the generalized suffix array. 418 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 419 | * @return 0 if no error occurred, -1 otherwise. 420 | */ 421 | LIBSAIS_API int32_t libsais_plcp_gsa_omp(const uint8_t * T, const int32_t * SA, int32_t * PLCP, int32_t n, int32_t threads); 422 | 423 | /** 424 | * Constructs the permuted longest common prefix array (PLCP) of a given integer array and a suffix array in parallel using OpenMP. 425 | * @param T [0..n-1] The input integer array. 426 | * @param SA [0..n-1] The input suffix array. 427 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 428 | * @param n The length of the integer array and the suffix array. 429 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 430 | * @return 0 if no error occurred, -1 otherwise. 431 | */ 432 | LIBSAIS_API int32_t libsais_plcp_int_omp(const int32_t * T, const int32_t * SA, int32_t * PLCP, int32_t n, int32_t threads); 433 | 434 | /** 435 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array in parallel using OpenMP. 436 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 437 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 438 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 439 | * @param n The length of the permuted longest common prefix array and the suffix array. 440 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 441 | * @return 0 if no error occurred, -1 otherwise. 
442 | */ 443 | LIBSAIS_API int32_t libsais_lcp_omp(const int32_t * PLCP, const int32_t * SA, int32_t * LCP, int32_t n, int32_t threads); 444 | #endif 445 | 446 | #ifdef __cplusplus 447 | } 448 | #endif 449 | 450 | #endif 451 | -------------------------------------------------------------------------------- /include/libsais16.h: -------------------------------------------------------------------------------- 1 | /*-- 2 | 3 | This file is a part of libsais, a library for linear time suffix array, 4 | longest common prefix array and burrows wheeler transform construction. 5 | 6 | Copyright (c) 2021-2025 Ilya Grebnov 7 | 8 | Licensed under the Apache License, Version 2.0 (the "License"); 9 | you may not use this file except in compliance with the License. 10 | You may obtain a copy of the License at 11 | 12 | http://www.apache.org/licenses/LICENSE-2.0 13 | 14 | Unless required by applicable law or agreed to in writing, software 15 | distributed under the License is distributed on an "AS IS" BASIS, 16 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 17 | See the License for the specific language governing permissions and 18 | limitations under the License. 19 | 20 | Please see the file LICENSE for full copyright information. 
21 | 22 | --*/ 23 | 24 | #ifndef LIBSAIS16_H 25 | #define LIBSAIS16_H 1 26 | 27 | #define LIBSAIS16_VERSION_MAJOR 2 28 | #define LIBSAIS16_VERSION_MINOR 10 29 | #define LIBSAIS16_VERSION_PATCH 1 30 | #define LIBSAIS16_VERSION_STRING "2.10.1" 31 | 32 | #ifdef _WIN32 33 | #ifdef LIBSAIS_SHARED 34 | #ifdef LIBSAIS_EXPORTS 35 | #define LIBSAIS16_API __declspec(dllexport) 36 | #else 37 | #define LIBSAIS16_API __declspec(dllimport) 38 | #endif 39 | #else 40 | #define LIBSAIS16_API 41 | #endif 42 | #else 43 | #define LIBSAIS16_API 44 | #endif 45 | 46 | #ifdef __cplusplus 47 | extern "C" { 48 | #endif 49 | 50 | #include <stdint.h> 51 | 52 | /** 53 | * Creates the libsais16 context that allows reusing allocated memory with each libsais16 operation. 54 | * In multi-threaded environments, use one context per thread for parallel executions. 55 | * @return the libsais16 context, NULL otherwise. 56 | */ 57 | LIBSAIS16_API void * libsais16_create_ctx(void); 58 | 59 | #if defined(LIBSAIS_OPENMP) 60 | /** 61 | * Creates the libsais16 context that allows reusing allocated memory with each parallel libsais16 operation using OpenMP. 62 | * In multi-threaded environments, use one context per thread for parallel executions. 63 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 64 | * @return the libsais16 context, NULL otherwise. 65 | */ 66 | LIBSAIS16_API void * libsais16_create_ctx_omp(int32_t threads); 67 | #endif 68 | 69 | /** 70 | * Destroys the libsais16 context and frees previously allocated memory. 71 | * @param ctx The libsais16 context (can be NULL). 72 | */ 73 | LIBSAIS16_API void libsais16_free_ctx(void * ctx); 74 | 75 | /** 76 | * Constructs the suffix array of a given 16-bit string. 77 | * @param T [0..n-1] The input 16-bit string. 78 | * @param SA [0..n-1+fs] The output array of suffixes. 79 | * @param n The length of the given 16-bit string. 80 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases).
81 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 82 | * @return 0 if no error occurred, -1 or -2 otherwise. 83 | */ 84 | LIBSAIS16_API int32_t libsais16(const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 85 | 86 | /** 87 | * Constructs the generalized suffix array (GSA) of given 16-bit string set. 88 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 89 | * @param SA [0..n-1+fs] The output array of suffixes. 90 | * @param n The length of the given 16-bit string set. 91 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 92 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 93 | * @return 0 if no error occurred, -1 or -2 otherwise. 94 | */ 95 | LIBSAIS16_API int32_t libsais16_gsa(const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 96 | 97 | /** 98 | * Constructs the suffix array of a given integer array. 99 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 100 | * @param T [0..n-1] The input integer array. 101 | * @param SA [0..n-1+fs] The output array of suffixes. 102 | * @param n The length of the integer array. 103 | * @param k The alphabet size of the input integer array. 104 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 105 | * @return 0 if no error occurred, -1 or -2 otherwise. 106 | */ 107 | LIBSAIS16_API int32_t libsais16_int(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs); 108 | 109 | /** 110 | * Constructs the suffix array of a given 16-bit string using libsais16 context. 111 | * @param ctx The libsais16 context. 112 | * @param T [0..n-1] The input 16-bit string. 113 | * @param SA [0..n-1+fs] The output array of suffixes. 114 | * @param n The length of the given 16-bit string. 
115 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 116 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 117 | * @return 0 if no error occurred, -1 or -2 otherwise. 118 | */ 119 | LIBSAIS16_API int32_t libsais16_ctx(const void * ctx, const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 120 | 121 | /** 122 | * Constructs the generalized suffix array (GSA) of given 16-bit string set using libsais16 context. 123 | * @param ctx The libsais16 context. 124 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 125 | * @param SA [0..n-1+fs] The output array of suffixes. 126 | * @param n The length of the given 16-bit string set. 127 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 128 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 129 | * @return 0 if no error occurred, -1 or -2 otherwise. 130 | */ 131 | LIBSAIS16_API int32_t libsais16_gsa_ctx(const void * ctx, const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq); 132 | 133 | #if defined(LIBSAIS_OPENMP) 134 | /** 135 | * Constructs the suffix array of a given 16-bit string in parallel using OpenMP. 136 | * @param T [0..n-1] The input 16-bit string. 137 | * @param SA [0..n-1+fs] The output array of suffixes. 138 | * @param n The length of the given 16-bit string. 139 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 140 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 141 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 142 | * @return 0 if no error occurred, -1 or -2 otherwise. 
143 | */ 144 | LIBSAIS16_API int32_t libsais16_omp(const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq, int32_t threads); 145 | 146 | /** 147 | * Constructs the generalized suffix array (GSA) of given 16-bit string set in parallel using OpenMP. 148 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 149 | * @param SA [0..n-1+fs] The output array of suffixes. 150 | * @param n The length of the given 16-bit string set. 151 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 152 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 153 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 154 | * @return 0 if no error occurred, -1 or -2 otherwise. 155 | */ 156 | LIBSAIS16_API int32_t libsais16_gsa_omp(const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq, int32_t threads); 157 | 158 | /** 159 | * Constructs the suffix array of a given integer array in parallel using OpenMP. 160 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 161 | * @param T [0..n-1] The input integer array. 162 | * @param SA [0..n-1+fs] The output array of suffixes. 163 | * @param n The length of the integer array. 164 | * @param k The alphabet size of the input integer array. 165 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 166 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 167 | * @return 0 if no error occurred, -1 or -2 otherwise. 168 | */ 169 | LIBSAIS16_API int32_t libsais16_int_omp(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs, int32_t threads); 170 | #endif 171 | 172 | /** 173 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string. 
174 | * @param T [0..n-1] The input 16-bit string. 175 | * @param U [0..n-1] The output 16-bit string (can be T). 176 | * @param A [0..n-1+fs] The temporary array. 177 | * @param n The length of the given 16-bit string. 178 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 179 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 180 | * @return The primary index if no error occurred, -1 or -2 otherwise. 181 | */ 182 | LIBSAIS16_API int32_t libsais16_bwt(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq); 183 | 184 | /** 185 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string with auxiliary indexes. 186 | * @param T [0..n-1] The input 16-bit string. 187 | * @param U [0..n-1] The output 16-bit string (can be T). 188 | * @param A [0..n-1+fs] The temporary array. 189 | * @param n The length of the given 16-bit string. 190 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 191 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 192 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 193 | * @param I [0..(n-1)/r] The output auxiliary indexes. 194 | * @return 0 if no error occurred, -1 or -2 otherwise. 195 | */ 196 | LIBSAIS16_API int32_t libsais16_bwt_aux(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t r, int32_t * I); 197 | 198 | /** 199 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string using libsais16 context. 200 | * @param ctx The libsais16 context. 201 | * @param T [0..n-1] The input 16-bit string. 202 | * @param U [0..n-1] The output 16-bit string (can be T). 203 | * @param A [0..n-1+fs] The temporary array. 204 | * @param n The length of the given 16-bit string. 
205 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 206 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 207 | * @return The primary index if no error occurred, -1 or -2 otherwise. 208 | */ 209 | LIBSAIS16_API int32_t libsais16_bwt_ctx(const void * ctx, const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq); 210 | 211 | /** 212 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string with auxiliary indexes using libsais16 context. 213 | * @param ctx The libsais16 context. 214 | * @param T [0..n-1] The input 16-bit string. 215 | * @param U [0..n-1] The output 16-bit string (can be T). 216 | * @param A [0..n-1+fs] The temporary array. 217 | * @param n The length of the given 16-bit string. 218 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 219 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 220 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 221 | * @param I [0..(n-1)/r] The output auxiliary indexes. 222 | * @return 0 if no error occurred, -1 or -2 otherwise. 223 | */ 224 | LIBSAIS16_API int32_t libsais16_bwt_aux_ctx(const void * ctx, const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t r, int32_t * I); 225 | 226 | #if defined(LIBSAIS_OPENMP) 227 | /** 228 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string in parallel using OpenMP. 229 | * @param T [0..n-1] The input 16-bit string. 230 | * @param U [0..n-1] The output 16-bit string (can be T). 231 | * @param A [0..n-1+fs] The temporary array. 232 | * @param n The length of the given 16-bit string. 233 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 
234 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 235 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 236 | * @return The primary index if no error occurred, -1 or -2 otherwise. 237 | */ 238 | LIBSAIS16_API int32_t libsais16_bwt_omp(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t threads); 239 | 240 | /** 241 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string with auxiliary indexes in parallel using OpenMP. 242 | * @param T [0..n-1] The input 16-bit string. 243 | * @param U [0..n-1] The output 16-bit string (can be T). 244 | * @param A [0..n-1+fs] The temporary array. 245 | * @param n The length of the given 16-bit string. 246 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 247 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 248 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 249 | * @param I [0..(n-1)/r] The output auxiliary indexes. 250 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 251 | * @return 0 if no error occurred, -1 or -2 otherwise. 252 | */ 253 | LIBSAIS16_API int32_t libsais16_bwt_aux_omp(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq, int32_t r, int32_t * I, int32_t threads); 254 | #endif 255 | 256 | /** 257 | * Creates the libsais16 reverse BWT context that allows reusing allocated memory with each libsais16_unbwt_* operation. 258 | * In multi-threaded environments, use one context per thread for parallel executions. 259 | * @return the libsais16 context, NULL otherwise. 
260 | */ 261 | LIBSAIS16_API void * libsais16_unbwt_create_ctx(void); 262 | 263 | #if defined(LIBSAIS_OPENMP) 264 | /** 265 | * Creates the libsais16 reverse BWT context that allows reusing allocated memory with each parallel libsais16_unbwt_* operation using OpenMP. 266 | * In multi-threaded environments, use one context per thread for parallel executions. 267 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 268 | * @return the libsais16 context, NULL otherwise. 269 | */ 270 | LIBSAIS16_API void * libsais16_unbwt_create_ctx_omp(int32_t threads); 271 | #endif 272 | 273 | /** 274 | * Destroys the libsais16 reverse BWT context and frees previously allocated memory. 275 | * @param ctx The libsais16 context (can be NULL). 276 | */ 277 | LIBSAIS16_API void libsais16_unbwt_free_ctx(void * ctx); 278 | 279 | /** 280 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with primary index. 281 | * @param T [0..n-1] The input 16-bit string. 282 | * @param U [0..n-1] The output 16-bit string (can be T). 283 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 284 | * @param n The length of the given 16-bit string. 285 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 286 | * @param i The primary index. 287 | * @return 0 if no error occurred, -1 or -2 otherwise. 288 | */ 289 | LIBSAIS16_API int32_t libsais16_unbwt(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i); 290 | 291 | /** 292 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with primary index using libsais16 reverse BWT context. 293 | * @param ctx The libsais16 reverse BWT context. 294 | * @param T [0..n-1] The input 16-bit string. 295 | * @param U [0..n-1] The output 16-bit string (can be T). 296 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size).
297 | * @param n The length of the given 16-bit string. 298 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 299 | * @param i The primary index. 300 | * @return 0 if no error occurred, -1 or -2 otherwise. 301 | */ 302 | LIBSAIS16_API int32_t libsais16_unbwt_ctx(const void * ctx, const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i); 303 | 304 | /** 305 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with auxiliary indexes. 306 | * @param T [0..n-1] The input 16-bit string. 307 | * @param U [0..n-1] The output 16-bit string (can be T). 308 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 309 | * @param n The length of the given 16-bit string. 310 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 311 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 312 | * @param I [0..(n-1)/r] The input auxiliary indexes. 313 | * @return 0 if no error occurred, -1 or -2 otherwise. 314 | */ 315 | LIBSAIS16_API int32_t libsais16_unbwt_aux(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t r, const int32_t * I); 316 | 317 | /** 318 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with auxiliary indexes using libsais16 reverse BWT context. 319 | * @param ctx The libsais16 reverse BWT context. 320 | * @param T [0..n-1] The input 16-bit string. 321 | * @param U [0..n-1] The output 16-bit string (can be T). 322 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 323 | * @param n The length of the given 16-bit string. 324 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 325 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 326 | * @param I [0..(n-1)/r] The input auxiliary indexes. 
327 | * @return 0 if no error occurred, -1 or -2 otherwise. 328 | */ 329 | LIBSAIS16_API int32_t libsais16_unbwt_aux_ctx(const void * ctx, const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t r, const int32_t * I); 330 | 331 | #if defined(LIBSAIS_OPENMP) 332 | /** 333 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with primary index in parallel using OpenMP. 334 | * @param T [0..n-1] The input 16-bit string. 335 | * @param U [0..n-1] The output 16-bit string (can be T). 336 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 337 | * @param n The length of the given 16-bit string. 338 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 339 | * @param i The primary index. 340 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 341 | * @return 0 if no error occurred, -1 or -2 otherwise. 342 | */ 343 | LIBSAIS16_API int32_t libsais16_unbwt_omp(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i, int32_t threads); 344 | 345 | /** 346 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with auxiliary indexes in parallel using OpenMP. 347 | * @param T [0..n-1] The input 16-bit string. 348 | * @param U [0..n-1] The output 16-bit string (can be T). 349 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 350 | * @param n The length of the given 16-bit string. 351 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 352 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 353 | * @param I [0..(n-1)/r] The input auxiliary indexes. 354 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 355 | * @return 0 if no error occurred, -1 or -2 otherwise. 
356 | */ 357 | LIBSAIS16_API int32_t libsais16_unbwt_aux_omp(const uint16_t * T, uint16_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t r, const int32_t * I, int32_t threads); 358 | #endif 359 | 360 | /** 361 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string and a suffix array. 362 | * @param T [0..n-1] The input 16-bit string. 363 | * @param SA [0..n-1] The input suffix array. 364 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 365 | * @param n The length of the 16-bit string and the suffix array. 366 | * @return 0 if no error occurred, -1 otherwise. 367 | */ 368 | LIBSAIS16_API int32_t libsais16_plcp(const uint16_t * T, const int32_t * SA, int32_t * PLCP, int32_t n); 369 | 370 | /** 371 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string set and a generalized suffix array (GSA). 372 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 373 | * @param SA [0..n-1] The input generalized suffix array. 374 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 375 | * @param n The length of the string set and the generalized suffix array. 376 | * @return 0 if no error occurred, -1 otherwise. 377 | */ 378 | LIBSAIS16_API int32_t libsais16_plcp_gsa(const uint16_t * T, const int32_t * SA, int32_t * PLCP, int32_t n); 379 | 380 | /** 381 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array. 382 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 383 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 384 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 385 | * @param n The length of the permuted longest common prefix array and the suffix array. 386 | * @return 0 if no error occurred, -1 otherwise. 
387 | */ 388 | LIBSAIS16_API int32_t libsais16_lcp(const int32_t * PLCP, const int32_t * SA, int32_t * LCP, int32_t n); 389 | 390 | #if defined(LIBSAIS_OPENMP) 391 | /** 392 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string and a suffix array in parallel using OpenMP. 393 | * @param T [0..n-1] The input 16-bit string. 394 | * @param SA [0..n-1] The input suffix array. 395 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 396 | * @param n The length of the 16-bit string and the suffix array. 397 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 398 | * @return 0 if no error occurred, -1 otherwise. 399 | */ 400 | LIBSAIS16_API int32_t libsais16_plcp_omp(const uint16_t * T, const int32_t * SA, int32_t * PLCP, int32_t n, int32_t threads); 401 | 402 | /** 403 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string set and a generalized suffix array (GSA) in parallel using OpenMP. 404 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 405 | * @param SA [0..n-1] The input generalized suffix array. 406 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 407 | * @param n The length of the string set and the generalized suffix array. 408 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 409 | * @return 0 if no error occurred, -1 otherwise. 410 | */ 411 | LIBSAIS16_API int32_t libsais16_plcp_gsa_omp(const uint16_t * T, const int32_t * SA, int32_t * PLCP, int32_t n, int32_t threads); 412 | 413 | /** 414 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array in parallel using OpenMP. 415 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 416 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 
417 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 418 | * @param n The length of the permuted longest common prefix array and the suffix array. 419 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 420 | * @return 0 if no error occurred, -1 otherwise. 421 | */ 422 | LIBSAIS16_API int32_t libsais16_lcp_omp(const int32_t * PLCP, const int32_t * SA, int32_t * LCP, int32_t n, int32_t threads); 423 | #endif 424 | 425 | #ifdef __cplusplus 426 | } 427 | #endif 428 | 429 | #endif 430 | -------------------------------------------------------------------------------- /include/libsais16x64.h: -------------------------------------------------------------------------------- 1 | /*-- 2 | 3 | This file is a part of libsais, a library for linear time suffix array, 4 | longest common prefix array and burrows wheeler transform construction. 5 | 6 | Copyright (c) 2021-2025 Ilya Grebnov 7 | 8 | Licensed under the Apache License, Version 2.0 (the "License"); 9 | you may not use this file except in compliance with the License. 10 | You may obtain a copy of the License at 11 | 12 | http://www.apache.org/licenses/LICENSE-2.0 13 | 14 | Unless required by applicable law or agreed to in writing, software 15 | distributed under the License is distributed on an "AS IS" BASIS, 16 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 17 | See the License for the specific language governing permissions and 18 | limitations under the License. 19 | 20 | Please see the file LICENSE for full copyright information. 
21 | 22 | --*/ 23 | 24 | #ifndef LIBSAIS16X64_H 25 | #define LIBSAIS16X64_H 1 26 | 27 | #define LIBSAIS16X64_VERSION_MAJOR 2 28 | #define LIBSAIS16X64_VERSION_MINOR 10 29 | #define LIBSAIS16X64_VERSION_PATCH 1 30 | #define LIBSAIS16X64_VERSION_STRING "2.10.1" 31 | 32 | #ifdef _WIN32 33 | #ifdef LIBSAIS_SHARED 34 | #ifdef LIBSAIS_EXPORTS 35 | #define LIBSAIS16X64_API __declspec(dllexport) 36 | #else 37 | #define LIBSAIS16X64_API __declspec(dllimport) 38 | #endif 39 | #else 40 | #define LIBSAIS16X64_API 41 | #endif 42 | #else 43 | #define LIBSAIS16X64_API 44 | #endif 45 | 46 | #ifdef __cplusplus 47 | extern "C" { 48 | #endif 49 | 50 | #include <stdint.h> 51 | 52 | /** 53 | * Constructs the suffix array of a given 16-bit string. 54 | * @param T [0..n-1] The input 16-bit string. 55 | * @param SA [0..n-1+fs] The output array of suffixes. 56 | * @param n The length of the given 16-bit string. 57 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 58 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 59 | * @return 0 if no error occurred, -1 or -2 otherwise. 60 | */ 61 | LIBSAIS16X64_API int64_t libsais16x64(const uint16_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq); 62 | 63 | /** 64 | * Constructs the generalized suffix array (GSA) of a given 16-bit string set. 65 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 66 | * @param SA [0..n-1+fs] The output array of suffixes. 67 | * @param n The length of the given 16-bit string set. 68 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 69 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 70 | * @return 0 if no error occurred, -1 or -2 otherwise.
71 | */ 72 | LIBSAIS16X64_API int64_t libsais16x64_gsa(const uint16_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq); 73 | 74 | /** 75 | * Constructs the suffix array of a given integer array. 76 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 77 | * @param T [0..n-1] The input integer array. 78 | * @param SA [0..n-1+fs] The output array of suffixes. 79 | * @param n The length of the integer array. 80 | * @param k The alphabet size of the input integer array. 81 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 82 | * @return 0 if no error occurred, -1 or -2 otherwise. 83 | */ 84 | LIBSAIS16X64_API int64_t libsais16x64_long(int64_t * T, int64_t * SA, int64_t n, int64_t k, int64_t fs); 85 | 86 | #if defined(LIBSAIS_OPENMP) 87 | /** 88 | * Constructs the suffix array of a given 16-bit string in parallel using OpenMP. 89 | * @param T [0..n-1] The input 16-bit string. 90 | * @param SA [0..n-1+fs] The output array of suffixes. 91 | * @param n The length of the given 16-bit string. 92 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 93 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 94 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 95 | * @return 0 if no error occurred, -1 or -2 otherwise. 96 | */ 97 | LIBSAIS16X64_API int64_t libsais16x64_omp(const uint16_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq, int64_t threads); 98 | 99 | /** 100 | * Constructs the generalized suffix array (GSA) of a given 16-bit string set in parallel using OpenMP. 101 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 102 | * @param SA [0..n-1+fs] The output array of suffixes. 103 | * @param n The length of the given 16-bit string set. 
104 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 105 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 106 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 107 | * @return 0 if no error occurred, -1 or -2 otherwise. 108 | */ 109 | LIBSAIS16X64_API int64_t libsais16x64_gsa_omp(const uint16_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq, int64_t threads); 110 | 111 | /** 112 | * Constructs the suffix array of a given integer array in parallel using OpenMP. 113 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 114 | * @param T [0..n-1] The input integer array. 115 | * @param SA [0..n-1+fs] The output array of suffixes. 116 | * @param n The length of the integer array. 117 | * @param k The alphabet size of the input integer array. 118 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 119 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 120 | * @return 0 if no error occurred, -1 or -2 otherwise. 121 | */ 122 | LIBSAIS16X64_API int64_t libsais16x64_long_omp(int64_t * T, int64_t * SA, int64_t n, int64_t k, int64_t fs, int64_t threads); 123 | #endif 124 | 125 | /** 126 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string. 127 | * @param T [0..n-1] The input 16-bit string. 128 | * @param U [0..n-1] The output 16-bit string (can be T). 129 | * @param A [0..n-1+fs] The temporary array. 130 | * @param n The length of the given 16-bit string. 131 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 132 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 133 | * @return The primary index if no error occurred, -1 or -2 otherwise. 
134 | */ 135 | LIBSAIS16X64_API int64_t libsais16x64_bwt(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq); 136 | 137 | /** 138 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string with auxiliary indexes. 139 | * @param T [0..n-1] The input 16-bit string. 140 | * @param U [0..n-1] The output 16-bit string (can be T). 141 | * @param A [0..n-1+fs] The temporary array. 142 | * @param n The length of the given 16-bit string. 143 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 144 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 145 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 146 | * @param I [0..(n-1)/r] The output auxiliary indexes. 147 | * @return 0 if no error occurred, -1 or -2 otherwise. 148 | */ 149 | LIBSAIS16X64_API int64_t libsais16x64_bwt_aux(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq, int64_t r, int64_t * I); 150 | 151 | #if defined(LIBSAIS_OPENMP) 152 | /** 153 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string in parallel using OpenMP. 154 | * @param T [0..n-1] The input 16-bit string. 155 | * @param U [0..n-1] The output 16-bit string (can be T). 156 | * @param A [0..n-1+fs] The temporary array. 157 | * @param n The length of the given 16-bit string. 158 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 159 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 160 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 161 | * @return The primary index if no error occurred, -1 or -2 otherwise. 
162 | */ 163 | LIBSAIS16X64_API int64_t libsais16x64_bwt_omp(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq, int64_t threads); 164 | 165 | /** 166 | * Constructs the burrows-wheeler transformed 16-bit string (BWT) of a given 16-bit string with auxiliary indexes in parallel using OpenMP. 167 | * @param T [0..n-1] The input 16-bit string. 168 | * @param U [0..n-1] The output 16-bit string (can be T). 169 | * @param A [0..n-1+fs] The temporary array. 170 | * @param n The length of the given 16-bit string. 171 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 172 | * @param freq [0..65535] The output 16-bit symbol frequency table (can be NULL). 173 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 174 | * @param I [0..(n-1)/r] The output auxiliary indexes. 175 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 176 | * @return 0 if no error occurred, -1 or -2 otherwise. 177 | */ 178 | LIBSAIS16X64_API int64_t libsais16x64_bwt_aux_omp(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq, int64_t r, int64_t * I, int64_t threads); 179 | #endif 180 | 181 | /** 182 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with primary index. 183 | * @param T [0..n-1] The input 16-bit string. 184 | * @param U [0..n-1] The output 16-bit string (can be T). 185 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 186 | * @param n The length of the given 16-bit string. 187 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 188 | * @param i The primary index. 189 | * @return 0 if no error occurred, -1 or -2 otherwise. 
190 | */ 191 | LIBSAIS16X64_API int64_t libsais16x64_unbwt(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t i); 192 | 193 | /** 194 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with auxiliary indexes. 195 | * @param T [0..n-1] The input 16-bit string. 196 | * @param U [0..n-1] The output 16-bit string (can be T). 197 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 198 | * @param n The length of the given 16-bit string. 199 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 200 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 201 | * @param I [0..(n-1)/r] The input auxiliary indexes. 202 | * @return 0 if no error occurred, -1 or -2 otherwise. 203 | */ 204 | LIBSAIS16X64_API int64_t libsais16x64_unbwt_aux(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t r, const int64_t * I); 205 | 206 | #if defined(LIBSAIS_OPENMP) 207 | /** 208 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with primary index in parallel using OpenMP. 209 | * @param T [0..n-1] The input 16-bit string. 210 | * @param U [0..n-1] The output 16-bit string (can be T). 211 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 212 | * @param n The length of the given 16-bit string. 213 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 214 | * @param i The primary index. 215 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 216 | * @return 0 if no error occurred, -1 or -2 otherwise. 
217 | */ 218 | LIBSAIS16X64_API int64_t libsais16x64_unbwt_omp(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t i, int64_t threads); 219 | 220 | /** 221 | * Constructs the original 16-bit string from a given burrows-wheeler transformed 16-bit string (BWT) with auxiliary indexes in parallel using OpenMP. 222 | * @param T [0..n-1] The input 16-bit string. 223 | * @param U [0..n-1] The output 16-bit string (can be T). 224 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 225 | * @param n The length of the given 16-bit string. 226 | * @param freq [0..65535] The input 16-bit symbol frequency table (can be NULL). 227 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 228 | * @param I [0..(n-1)/r] The input auxiliary indexes. 229 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 230 | * @return 0 if no error occurred, -1 or -2 otherwise. 231 | */ 232 | LIBSAIS16X64_API int64_t libsais16x64_unbwt_aux_omp(const uint16_t * T, uint16_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t r, const int64_t * I, int64_t threads); 233 | #endif 234 | 235 | /** 236 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string and a suffix array. 237 | * @param T [0..n-1] The input 16-bit string. 238 | * @param SA [0..n-1] The input suffix array. 239 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 240 | * @param n The length of the 16-bit string and the suffix array. 241 | * @return 0 if no error occurred, -1 otherwise. 242 | */ 243 | LIBSAIS16X64_API int64_t libsais16x64_plcp(const uint16_t * T, const int64_t * SA, int64_t * PLCP, int64_t n); 244 | 245 | /** 246 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string set and a generalized suffix array (GSA). 247 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 
248 | * @param SA [0..n-1] The input generalized suffix array. 249 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 250 | * @param n The length of the string set and the generalized suffix array. 251 | * @return 0 if no error occurred, -1 otherwise. 252 | */ 253 | LIBSAIS16X64_API int64_t libsais16x64_plcp_gsa(const uint16_t * T, const int64_t * SA, int64_t * PLCP, int64_t n); 254 | 255 | /** 256 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array. 257 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 258 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 259 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 260 | * @param n The length of the permuted longest common prefix array and the suffix array. 261 | * @return 0 if no error occurred, -1 otherwise. 262 | */ 263 | LIBSAIS16X64_API int64_t libsais16x64_lcp(const int64_t * PLCP, const int64_t * SA, int64_t * LCP, int64_t n); 264 | 265 | #if defined(LIBSAIS_OPENMP) 266 | /** 267 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string and a suffix array in parallel using OpenMP. 268 | * @param T [0..n-1] The input 16-bit string. 269 | * @param SA [0..n-1] The input suffix array. 270 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 271 | * @param n The length of the 16-bit string and the suffix array. 272 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 273 | * @return 0 if no error occurred, -1 otherwise. 274 | */ 275 | LIBSAIS16X64_API int64_t libsais16x64_plcp_omp(const uint16_t * T, const int64_t * SA, int64_t * PLCP, int64_t n, int64_t threads); 276 | 277 | /** 278 | * Constructs the permuted longest common prefix array (PLCP) of a given 16-bit string set and a generalized suffix array (GSA) in parallel using OpenMP. 
279 | * @param T [0..n-1] The input 16-bit string set using 0 as separators (T[n-1] must be 0). 280 | * @param SA [0..n-1] The input generalized suffix array. 281 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 282 | * @param n The length of the string set and the generalized suffix array. 283 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 284 | * @return 0 if no error occurred, -1 otherwise. 285 | */ 286 | LIBSAIS16X64_API int64_t libsais16x64_plcp_gsa_omp(const uint16_t * T, const int64_t * SA, int64_t * PLCP, int64_t n, int64_t threads); 287 | 288 | /** 289 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array in parallel using OpenMP. 290 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 291 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 292 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 293 | * @param n The length of the permuted longest common prefix array and the suffix array. 294 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 295 | * @return 0 if no error occurred, -1 otherwise. 296 | */ 297 | LIBSAIS16X64_API int64_t libsais16x64_lcp_omp(const int64_t * PLCP, const int64_t * SA, int64_t * LCP, int64_t n, int64_t threads); 298 | #endif 299 | 300 | #ifdef __cplusplus 301 | } 302 | #endif 303 | 304 | #endif 305 | -------------------------------------------------------------------------------- /include/libsais64.h: -------------------------------------------------------------------------------- 1 | /*-- 2 | 3 | This file is a part of libsais, a library for linear time suffix array, 4 | longest common prefix array and burrows wheeler transform construction. 
5 | 6 | Copyright (c) 2021-2025 Ilya Grebnov 7 | 8 | Licensed under the Apache License, Version 2.0 (the "License"); 9 | you may not use this file except in compliance with the License. 10 | You may obtain a copy of the License at 11 | 12 | http://www.apache.org/licenses/LICENSE-2.0 13 | 14 | Unless required by applicable law or agreed to in writing, software 15 | distributed under the License is distributed on an "AS IS" BASIS, 16 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 17 | See the License for the specific language governing permissions and 18 | limitations under the License. 19 | 20 | Please see the file LICENSE for full copyright information. 21 | 22 | --*/ 23 | 24 | #ifndef LIBSAIS64_H 25 | #define LIBSAIS64_H 1 26 | 27 | #define LIBSAIS64_VERSION_MAJOR 2 28 | #define LIBSAIS64_VERSION_MINOR 10 29 | #define LIBSAIS64_VERSION_PATCH 1 30 | #define LIBSAIS64_VERSION_STRING "2.10.1" 31 | 32 | #ifdef _WIN32 33 | #ifdef LIBSAIS_SHARED 34 | #ifdef LIBSAIS_EXPORTS 35 | #define LIBSAIS64_API __declspec(dllexport) 36 | #else 37 | #define LIBSAIS64_API __declspec(dllimport) 38 | #endif 39 | #else 40 | #define LIBSAIS64_API 41 | #endif 42 | #else 43 | #define LIBSAIS64_API 44 | #endif 45 | 46 | #ifdef __cplusplus 47 | extern "C" { 48 | #endif 49 | 50 | #include <stdint.h> 51 | 52 | /** 53 | * Constructs the suffix array of a given string. 54 | * @param T [0..n-1] The input string. 55 | * @param SA [0..n-1+fs] The output array of suffixes. 56 | * @param n The length of the given string. 57 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 58 | * @param freq [0..255] The output symbol frequency table (can be NULL). 59 | * @return 0 if no error occurred, -1 or -2 otherwise. 60 | */ 61 | LIBSAIS64_API int64_t libsais64(const uint8_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq); 62 | 63 | /** 64 | * Constructs the generalized suffix array (GSA) of a given string set.
65 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 66 | * @param SA [0..n-1+fs] The output array of suffixes. 67 | * @param n The length of the given string set. 68 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 69 | * @param freq [0..255] The output symbol frequency table (can be NULL). 70 | * @return 0 if no error occurred, -1 or -2 otherwise. 71 | */ 72 | LIBSAIS64_API int64_t libsais64_gsa(const uint8_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq); 73 | 74 | /** 75 | * Constructs the suffix array of a given integer array. 76 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 77 | * @param T [0..n-1] The input integer array. 78 | * @param SA [0..n-1+fs] The output array of suffixes. 79 | * @param n The length of the integer array. 80 | * @param k The alphabet size of the input integer array. 81 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 82 | * @return 0 if no error occurred, -1 or -2 otherwise. 83 | */ 84 | LIBSAIS64_API int64_t libsais64_long(int64_t * T, int64_t * SA, int64_t n, int64_t k, int64_t fs); 85 | 86 | #if defined(LIBSAIS_OPENMP) 87 | /** 88 | * Constructs the suffix array of a given string in parallel using OpenMP. 89 | * @param T [0..n-1] The input string. 90 | * @param SA [0..n-1+fs] The output array of suffixes. 91 | * @param n The length of the given string. 92 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 93 | * @param freq [0..255] The output symbol frequency table (can be NULL). 94 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 95 | * @return 0 if no error occurred, -1 or -2 otherwise. 
96 | */ 97 | LIBSAIS64_API int64_t libsais64_omp(const uint8_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq, int64_t threads); 98 | 99 | /** 100 | * Constructs the generalized suffix array (GSA) of given string set in parallel using OpenMP. 101 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 102 | * @param SA [0..n-1+fs] The output array of suffixes. 103 | * @param n The length of the given string set. 104 | * @param fs The extra space available at the end of SA array (0 should be enough for most cases). 105 | * @param freq [0..255] The output symbol frequency table (can be NULL). 106 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 107 | * @return 0 if no error occurred, -1 or -2 otherwise. 108 | */ 109 | LIBSAIS64_API int64_t libsais64_gsa_omp(const uint8_t * T, int64_t * SA, int64_t n, int64_t fs, int64_t * freq, int64_t threads); 110 | 111 | /** 112 | * Constructs the suffix array of a given integer array in parallel using OpenMP. 113 | * Note, during construction input array will be modified, but restored at the end if no errors occurred. 114 | * @param T [0..n-1] The input integer array. 115 | * @param SA [0..n-1+fs] The output array of suffixes. 116 | * @param n The length of the integer array. 117 | * @param k The alphabet size of the input integer array. 118 | * @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance). 119 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 120 | * @return 0 if no error occurred, -1 or -2 otherwise. 121 | */ 122 | LIBSAIS64_API int64_t libsais64_long_omp(int64_t * T, int64_t * SA, int64_t n, int64_t k, int64_t fs, int64_t threads); 123 | #endif 124 | 125 | /** 126 | * Constructs the burrows-wheeler transformed string (BWT) of a given string. 127 | * @param T [0..n-1] The input string. 
128 | * @param U [0..n-1] The output string (can be T). 129 | * @param A [0..n-1+fs] The temporary array. 130 | * @param n The length of the given string. 131 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 132 | * @param freq [0..255] The output symbol frequency table (can be NULL). 133 | * @return The primary index if no error occurred, -1 or -2 otherwise. 134 | */ 135 | LIBSAIS64_API int64_t libsais64_bwt(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq); 136 | 137 | /** 138 | * Constructs the burrows-wheeler transformed string (BWT) of a given string with auxiliary indexes. 139 | * @param T [0..n-1] The input string. 140 | * @param U [0..n-1] The output string (can be T). 141 | * @param A [0..n-1+fs] The temporary array. 142 | * @param n The length of the given string. 143 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 144 | * @param freq [0..255] The output symbol frequency table (can be NULL). 145 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 146 | * @param I [0..(n-1)/r] The output auxiliary indexes. 147 | * @return 0 if no error occurred, -1 or -2 otherwise. 148 | */ 149 | LIBSAIS64_API int64_t libsais64_bwt_aux(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq, int64_t r, int64_t * I); 150 | 151 | #if defined(LIBSAIS_OPENMP) 152 | /** 153 | * Constructs the burrows-wheeler transformed string (BWT) of a given string in parallel using OpenMP. 154 | * @param T [0..n-1] The input string. 155 | * @param U [0..n-1] The output string (can be T). 156 | * @param A [0..n-1+fs] The temporary array. 157 | * @param n The length of the given string. 158 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 159 | * @param freq [0..255] The output symbol frequency table (can be NULL). 
160 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 161 | * @return The primary index if no error occurred, -1 or -2 otherwise. 162 | */ 163 | LIBSAIS64_API int64_t libsais64_bwt_omp(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq, int64_t threads); 164 | 165 | /** 166 | * Constructs the burrows-wheeler transformed string (BWT) of a given string with auxiliary indexes in parallel using OpenMP. 167 | * @param T [0..n-1] The input string. 168 | * @param U [0..n-1] The output string (can be T). 169 | * @param A [0..n-1+fs] The temporary array. 170 | * @param n The length of the given string. 171 | * @param fs The extra space available at the end of A array (0 should be enough for most cases). 172 | * @param freq [0..255] The output symbol frequency table (can be NULL). 173 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 174 | * @param I [0..(n-1)/r] The output auxiliary indexes. 175 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 176 | * @return 0 if no error occurred, -1 or -2 otherwise. 177 | */ 178 | LIBSAIS64_API int64_t libsais64_bwt_aux_omp(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, int64_t fs, int64_t * freq, int64_t r, int64_t * I, int64_t threads); 179 | #endif 180 | 181 | /** 182 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with primary index. 183 | * @param T [0..n-1] The input string. 184 | * @param U [0..n-1] The output string (can be T). 185 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 186 | * @param n The length of the given string. 187 | * @param freq [0..255] The input symbol frequency table (can be NULL). 188 | * @param i The primary index. 189 | * @return 0 if no error occurred, -1 or -2 otherwise. 
190 | */ 191 | LIBSAIS64_API int64_t libsais64_unbwt(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t i); 192 | 193 | /** 194 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with auxiliary indexes. 195 | * @param T [0..n-1] The input string. 196 | * @param U [0..n-1] The output string (can be T). 197 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 198 | * @param n The length of the given string. 199 | * @param freq [0..255] The input symbol frequency table (can be NULL). 200 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 201 | * @param I [0..(n-1)/r] The input auxiliary indexes. 202 | * @return 0 if no error occurred, -1 or -2 otherwise. 203 | */ 204 | LIBSAIS64_API int64_t libsais64_unbwt_aux(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t r, const int64_t * I); 205 | 206 | #if defined(LIBSAIS_OPENMP) 207 | /** 208 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with primary index in parallel using OpenMP. 209 | * @param T [0..n-1] The input string. 210 | * @param U [0..n-1] The output string (can be T). 211 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 212 | * @param n The length of the given string. 213 | * @param freq [0..255] The input symbol frequency table (can be NULL). 214 | * @param i The primary index. 215 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 216 | * @return 0 if no error occurred, -1 or -2 otherwise. 217 | */ 218 | LIBSAIS64_API int64_t libsais64_unbwt_omp(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t i, int64_t threads); 219 | 220 | /** 221 | * Constructs the original string from a given burrows-wheeler transformed string (BWT) with auxiliary indexes in parallel using OpenMP. 
222 | * @param T [0..n-1] The input string. 223 | * @param U [0..n-1] The output string (can be T). 224 | * @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size). 225 | * @param n The length of the given string. 226 | * @param freq [0..255] The input symbol frequency table (can be NULL). 227 | * @param r The sampling rate for auxiliary indexes (must be power of 2). 228 | * @param I [0..(n-1)/r] The input auxiliary indexes. 229 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 230 | * @return 0 if no error occurred, -1 or -2 otherwise. 231 | */ 232 | LIBSAIS64_API int64_t libsais64_unbwt_aux_omp(const uint8_t * T, uint8_t * U, int64_t * A, int64_t n, const int64_t * freq, int64_t r, const int64_t * I, int64_t threads); 233 | #endif 234 | 235 | /** 236 | * Constructs the permuted longest common prefix array (PLCP) of a given string and a suffix array. 237 | * @param T [0..n-1] The input string. 238 | * @param SA [0..n-1] The input suffix array. 239 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 240 | * @param n The length of the string and the suffix array. 241 | * @return 0 if no error occurred, -1 otherwise. 242 | */ 243 | LIBSAIS64_API int64_t libsais64_plcp(const uint8_t * T, const int64_t * SA, int64_t * PLCP, int64_t n); 244 | 245 | /** 246 | * Constructs the permuted longest common prefix array (PLCP) of a given string set and a generalized suffix array (GSA). 247 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 248 | * @param SA [0..n-1] The input generalized suffix array. 249 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 250 | * @param n The length of the string set and the generalized suffix array. 251 | * @return 0 if no error occurred, -1 otherwise. 
252 | */ 253 | LIBSAIS64_API int64_t libsais64_plcp_gsa(const uint8_t * T, const int64_t * SA, int64_t * PLCP, int64_t n); 254 | 255 | /** 256 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array. 257 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 258 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 259 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 260 | * @param n The length of the permuted longest common prefix array and the suffix array. 261 | * @return 0 if no error occurred, -1 otherwise. 262 | */ 263 | LIBSAIS64_API int64_t libsais64_lcp(const int64_t * PLCP, const int64_t * SA, int64_t * LCP, int64_t n); 264 | 265 | #if defined(LIBSAIS_OPENMP) 266 | /** 267 | * Constructs the permuted longest common prefix array (PLCP) of a given string and a suffix array in parallel using OpenMP. 268 | * @param T [0..n-1] The input string. 269 | * @param SA [0..n-1] The input suffix array. 270 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 271 | * @param n The length of the string and the suffix array. 272 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 273 | * @return 0 if no error occurred, -1 otherwise. 274 | */ 275 | LIBSAIS64_API int64_t libsais64_plcp_omp(const uint8_t * T, const int64_t * SA, int64_t * PLCP, int64_t n, int64_t threads); 276 | 277 | /** 278 | * Constructs the permuted longest common prefix array (PLCP) of a given string set and a generalized suffix array (GSA) in parallel using OpenMP. 279 | * @param T [0..n-1] The input string set using 0 as separators (T[n-1] must be 0). 280 | * @param SA [0..n-1] The input generalized suffix array. 281 | * @param PLCP [0..n-1] The output permuted longest common prefix array. 282 | * @param n The length of the string set and the generalized suffix array. 
283 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 284 | * @return 0 if no error occurred, -1 otherwise. 285 | */ 286 | LIBSAIS64_API int64_t libsais64_plcp_gsa_omp(const uint8_t * T, const int64_t * SA, int64_t * PLCP, int64_t n, int64_t threads); 287 | 288 | /** 289 | * Constructs the longest common prefix array (LCP) of a given permuted longest common prefix array (PLCP) and a suffix array in parallel using OpenMP. 290 | * @param PLCP [0..n-1] The input permuted longest common prefix array. 291 | * @param SA [0..n-1] The input suffix array or generalized suffix array (GSA). 292 | * @param LCP [0..n-1] The output longest common prefix array (can be SA). 293 | * @param n The length of the permuted longest common prefix array and the suffix array. 294 | * @param threads The number of OpenMP threads to use (can be 0 for OpenMP default). 295 | * @return 0 if no error occurred, -1 otherwise. 296 | */ 297 | LIBSAIS64_API int64_t libsais64_lcp_omp(const int64_t * PLCP, const int64_t * SA, int64_t * LCP, int64_t n, int64_t threads); 298 | #endif 299 | 300 | #ifdef __cplusplus 301 | } 302 | #endif 303 | 304 | #endif 305 | --------------------------------------------------------------------------------
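The header above documents a three-step pipeline: build SA from T, derive PLCP from T and SA, then derive LCP from PLCP and SA. As a rough illustration of what those arrays contain, here is a naive reference sketch. It is not the library's implementation: the helper names (`naive_sa`, `naive_plcp`, `naive_lcp`) are hypothetical, the algorithms are quadratic or worse, and the PLCP convention used (PLCP[SA[i]] is the common prefix length of the suffixes at SA[i-1] and SA[i], with 0 for the first entry) is the usual textbook one, assumed here rather than taken from the header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; none of these are libsais functions. */

/* Lexicographically compares the suffixes T[a..n-1] and T[b..n-1]. */
static int naive_suffix_cmp(const uint8_t *T, int64_t n, int64_t a, int64_t b)
{
    int64_t la = n - a, lb = n - b;
    int c = memcmp(T + a, T + b, (size_t)(la < lb ? la : lb));
    if (c != 0) { return c; }
    /* One suffix is a prefix of the other: the shorter one sorts first. */
    return (la < lb) ? -1 : (la > lb);
}

/* Builds SA[0..n-1] by insertion sort over suffix start positions. */
static void naive_sa(const uint8_t *T, int64_t *SA, int64_t n)
{
    assert(T != NULL && SA != NULL && n >= 0);
    for (int64_t i = 0; i < n; ++i) {
        int64_t j = i;
        while (j > 0 && naive_suffix_cmp(T, n, SA[j - 1], i) > 0) {
            SA[j] = SA[j - 1];
            --j;
        }
        SA[j] = i;
    }
}

/* PLCP is indexed by text position: PLCP[SA[i]] = lcp(suffix SA[i-1], suffix SA[i]). */
static void naive_plcp(const uint8_t *T, const int64_t *SA, int64_t *PLCP, int64_t n)
{
    for (int64_t i = 0; i < n; ++i) {
        int64_t l = 0;
        if (i > 0) {
            int64_t a = SA[i - 1], b = SA[i];
            while (a + l < n && b + l < n && T[a + l] == T[b + l]) { ++l; }
        }
        PLCP[SA[i]] = l;
    }
}

/* LCP is indexed by suffix array rank: LCP[i] = PLCP[SA[i]]. */
static void naive_lcp(const int64_t *PLCP, const int64_t *SA, int64_t *LCP, int64_t n)
{
    for (int64_t i = 0; i < n; ++i) { LCP[i] = PLCP[SA[i]]; }
}
```

With the library itself, the corresponding calls would be `libsais64`, `libsais64_plcp` and `libsais64_lcp`, using the signatures declared in the header. Each returns 0 on success, so a caller should check the return value before passing SA or PLCP to the next step.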