├── .github └── workflows │ └── go.yml ├── LICENSE ├── README.md ├── errors.go ├── export_test.go ├── fuzzbuzz.yaml ├── go.mod ├── go.sum ├── hash.go ├── hash_amd64.go ├── hash_amd64.s ├── hash_arm64.go ├── hash_arm64.s ├── hash_fuzz_test.go ├── hash_test.go ├── hash_unsupported.go ├── sha256_1_generic.go ├── sha256_1_sse.go-bak └── sha256_1_sse.s-bak /.github/workflows/go.yml: -------------------------------------------------------------------------------- 1 | name: Go 2 | 3 | on: 4 | push: 5 | branches: [ main ] 6 | pull_request: 7 | branches: [ '*' ] 8 | 9 | jobs: 10 | supported: 11 | strategy: 12 | matrix: 13 | go: [ '1.21', '1.22' ] 14 | runner: [ 'ubuntu-latest', 'ubuntu-24.04-arm' ] 15 | runs-on: ${{ matrix.runner }} 16 | name: Go ${{ matrix.go }} ${{ matrix.runner }} supported test 17 | steps: 18 | - name: Set up Go 1.x 19 | uses: actions/setup-go@v4 20 | with: 21 | go-version: ${{ matrix.go }} 22 | 23 | - name: Check out code into the Go module directory 24 | uses: actions/checkout@v3 25 | 26 | - name: Get dependencies 27 | run: go get -v -t -d ./... 28 | 29 | - name: Build 30 | run: go build -v ./... 31 | 32 | - name: Test 33 | run: go test -v ./... 34 | 35 | unsupported: 36 | strategy: 37 | matrix: 38 | go: [ '1.21', '1.22' ] 39 | runs-on: ubuntu-latest 40 | name: Go ${{ matrix.go }} unsupported test 41 | steps: 42 | - name: Update package index 43 | run: sudo apt-get update 44 | 45 | - name: install qemu 46 | run: sudo apt install --yes qemu-user-static 47 | 48 | - name: Set up Go 1.x 49 | uses: actions/setup-go@v4 50 | with: 51 | go-version: ${{ matrix.go }} 52 | 53 | - name: Check out code into the Go module directory 54 | uses: actions/checkout@v3 55 | 56 | - name: Get dependencies 57 | run: go get -v -t -d ./... 58 | 59 | - name: Build 60 | run: GOARCH=riscv64 go build -v ./... 61 | 62 | - name: Test 63 | run: | 64 | GOARCH=riscv64 go test -v ./... 
-c -o test.riscv64 65 | qemu-riscv64-static test.riscv64 -test.v 66 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022-2025 Prysmatic Labs 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Go Hashtree 2 | 3 | GoHashtree is a SHA256 library highly optimized for Merkle tree computation. It is based on [Intel's implementation](https://github.com/intel/intel-ipsec-mb) with a few modifications like hardcoding the scheduled words of the padding block. It is written in Go Assembly instead of its native assembly counterpart [hashtree](https://github.com/prysmaticlabs/hashtree). 
4 | 5 | # Using the library 6 | 7 | The main entry point of the library is the function 8 | ``` 9 | func Hash(digests [][32]byte, chunks [][32]byte) error 10 | ``` 11 | This function hashes each consecutive pair of 32-byte blocks from `chunks` and writes the corresponding digest to `digests`. It performs runtime detection of supported CPU features. The function returns an error if `digests` is not allocated to hold at least `len(chunks)/2` digests or if an odd number of chunks is given. 12 | 13 | Most vectorized implementations exploit the fact that independent branches in the Merkle tree can be hashed in "parallel" within one CPU. To take advantage of this, 14 | Merkleization algorithms that loop over consecutive tree layers hashing two blocks at a time need to be updated to pass the entire layer, or all consecutive blocks. A naive example of how to accomplish this can be found in [this document](https://hackmd.io/80mJ75A5QeeRcrNmqcuU-g?view). 15 | 16 | # Running tests and benchmarks 17 | - Run the tests 18 | ```shell 19 | $ cd gohashtree 20 | $ go test . 21 | ok github.com/prysmaticlabs/gohashtree 0.002s 22 | ``` 23 | 24 | - Some benchmarks on ARM with crypto extensions 25 | ``` 26 | $ cd gohashtree 27 | $ go test . -bench=. 28 | goos: darwin 29 | goarch: arm64 30 | pkg: github.com/prysmaticlabs/gohashtree 31 | BenchmarkHash_1_minio-10 8472337 122.9 ns/op 32 | BenchmarkHash_1-10 27011082 42.99 ns/op 33 | BenchmarkHash_4_minio-10 2419328 500.1 ns/op 34 | BenchmarkHash_4-10 6900236 172.1 ns/op 35 | BenchmarkHash_8_minio-10 1217845 985.6 ns/op 36 | BenchmarkHash_8-10 3471864 344.0 ns/op 37 | BenchmarkHash_16_minio-10 597896 1974 ns/op 38 | BenchmarkHash_16-10 1721486 689.2 ns/op 39 | BenchmarkHashLargeList_minio-10 38 28401697 ns/op 40 | BenchmarkHashList-10 138 8619502 ns/op 41 | PASS 42 | ok github.com/prysmaticlabs/gohashtree 16.854s 43 | ``` 44 | - Some benchmarks on a Raspberry-Pi without crypto extensions 45 | ``` 46 | $ cd gohashtree 47 | $ go test . -bench=. 
48 | goos: linux 49 | goarch: arm64 50 | pkg: github.com/prysmaticlabs/gohashtree 51 | BenchmarkHash_1_minio-4 338904 3668 ns/op 52 | BenchmarkHash_1-4 1000000 1087 ns/op 53 | BenchmarkHash_4_minio-4 82258 15537 ns/op 54 | BenchmarkHash_4-4 380631 3216 ns/op 55 | BenchmarkHash_8_minio-4 41265 34344 ns/op 56 | BenchmarkHash_8-4 181153 6569 ns/op 57 | BenchmarkHash_16_minio-4 16635 67142 ns/op 58 | BenchmarkHash_16-4 75922 13351 ns/op 59 | BenchmarkHashLargeList_minio-4 2 826262074 ns/op 60 | BenchmarkHashList-4 7 176396035 ns/op 61 | PASS 62 | ``` 63 | - Some benchmarks on a Xeon with AVX-512 64 | ``` 65 | $ cd gohashtree 66 | $ go test . -bench=. 67 | goos: linux 68 | goarch: amd64 69 | pkg: github.com/prysmaticlabs/gohashtree 70 | cpu: Intel(R) Xeon(R) CPU @ 2.80GHz 71 | BenchmarkHash_1_minio-2 2462506 473.1 ns/op 72 | BenchmarkHash_1-2 3040208 391.3 ns/op 73 | BenchmarkHash_4_minio-2 577078 1959 ns/op 74 | BenchmarkHash_4-2 1954473 604.9 ns/op 75 | BenchmarkHash_8_minio-2 298208 3896 ns/op 76 | BenchmarkHash_8-2 1882191 624.8 ns/op 77 | BenchmarkHash_16_minio-2 147230 7933 ns/op 78 | BenchmarkHash_16-2 557485 1988 ns/op 79 | BenchmarkHashLargeList_minio-2 10 105404666 ns/op 80 | BenchmarkHashList-2 45 25368532 ns/op 81 | PASS 82 | ok github.com/prysmaticlabs/gohashtree 13.969s 83 | ``` 84 | 85 | -------------------------------------------------------------------------------- /errors.go: -------------------------------------------------------------------------------- 1 | package gohashtree 2 | 3 | import "errors" 4 | 5 | var ( 6 | // ErrOddChunks is returned when the number of chunks is odd. 7 | ErrOddChunks = errors.New("odd number of chunks") 8 | // ErrNotEnoughDigests is returned when the number of digests is not enough. 9 | ErrNotEnoughDigests = errors.New("not enough digest length") 10 | // ErrChunksNotMultipleOf64 is returned when the chunks are not multiple of 64 bytes. 
11 | ErrChunksNotMultipleOf64 = errors.New("chunks not multiple of 64 bytes") 12 | // ErrDigestsNotMultipleOf32 is returned when the digests are not multiple of 32 bytes. 13 | ErrDigestsNotMultipleOf32 = errors.New("digests not multiple of 32 bytes") 14 | ) 15 | -------------------------------------------------------------------------------- /export_test.go: -------------------------------------------------------------------------------- 1 | /* 2 | MIT License 3 | 4 | Copyright (c) 2021 Prysmatic Labs 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 23 | */ 24 | package gohashtree 25 | 26 | // Export internal functions for testing. 
27 | 28 | var Sha256_1_generic = sha256_1_generic 29 | -------------------------------------------------------------------------------- /fuzzbuzz.yaml: -------------------------------------------------------------------------------- 1 | gohashtree: 2 | language: go 3 | -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module github.com/prysmaticlabs/gohashtree 2 | 3 | go 1.22 4 | 5 | toolchain go1.22.4 6 | 7 | require ( 8 | github.com/klauspost/cpuid/v2 v2.0.9 9 | github.com/minio/sha256-simd v1.0.0 10 | ) 11 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | github.com/klauspost/cpuid/v2 v2.0.4/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg= 2 | github.com/klauspost/cpuid/v2 v2.0.9 h1:lgaqFMSdTdQYdZ04uHyN2d/eKdOMyi2YLSvlQIBFYa4= 3 | github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg= 4 | github.com/minio/sha256-simd v1.0.0 h1:v1ta+49hkWZyvaKwrQB8elexRqm6Y0aMLjCNsrYxo6g= 5 | github.com/minio/sha256-simd v1.0.0/go.mod h1:OuYzVNI5vcoYIAmbIvHPl3N3jUzVedXbKy5RFepssQM= 6 | -------------------------------------------------------------------------------- /hash.go: -------------------------------------------------------------------------------- 1 | /* 2 | MIT License 3 | 4 | # Copyright (c) 2021-2025 Prysmatic Labs 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | 
The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 23 | */ 24 | package gohashtree 25 | 26 | import ( 27 | "fmt" 28 | "unsafe" 29 | ) 30 | 31 | // Hash hashes the chunks two at a time and writes the digests to the first 32 | // argument. It checks the lengths of the inputs. 33 | func Hash(digests [][32]byte, chunks [][32]byte) error { 34 | if len(chunks) == 0 { 35 | return nil 36 | } 37 | 38 | if len(chunks)%2 == 1 { 39 | return ErrOddChunks 40 | } 41 | if len(digests) < len(chunks)/2 { 42 | return fmt.Errorf("%w: need at least %v, got %v", ErrNotEnoughDigests, len(chunks)/2, len(digests)) 43 | } 44 | if supportedCPU { 45 | _hash(&digests[0][0], chunks, uint32(len(chunks)/2)) 46 | } else { 47 | sha256_1_generic(digests, chunks) 48 | } 49 | return nil 50 | } 51 | 52 | // HashChunks is the same as Hash, but does not check the lengths of the slices. 53 | func HashChunks(digests [][32]byte, chunks [][32]byte) { 54 | if supportedCPU { 55 | _hash(&digests[0][0], chunks, uint32(len(chunks)/2)) 56 | } else { 57 | sha256_1_generic(digests, chunks) 58 | } 59 | } 60 | 61 | func HashByteSlice(digests []byte, chunks []byte) error { 62 | if len(chunks) == 0 { 63 | return nil 64 | } 65 | 66 | if len(chunks)%64 != 0 { 67 | return ErrChunksNotMultipleOf64 68 | } 69 | 70 | if len(digests)%32 != 0 { 71 | return ErrDigestsNotMultipleOf32 72 | } 73 | 74 | if len(digests) < len(chunks)/2 { 75 | 
return fmt.Errorf("%w: need at least %v, got %v", ErrNotEnoughDigests, len(chunks)/2, len(digests)) 76 | } 77 | // We use an unsafe pointer to cast []byte to [][32]byte. The length and 78 | // capacity of the slice need to be divided accordingly by 32. 79 | sizeChunks := (len(chunks) >> 5) 80 | chunkedChunks := unsafe.Slice((*[32]byte)(unsafe.Pointer(&chunks[0])), sizeChunks) 81 | 82 | sizeDigests := (len(digests) >> 5) 83 | chunkedDigest := unsafe.Slice((*[32]byte)(unsafe.Pointer(&digests[0])), sizeDigests) 84 | if supportedCPU { 85 | Hash(chunkedDigest, chunkedChunks) 86 | } else { 87 | sha256_1_generic(chunkedDigest, chunkedChunks) 88 | } 89 | return nil 90 | } 91 | -------------------------------------------------------------------------------- /hash_amd64.go: -------------------------------------------------------------------------------- 1 | //go:build amd64 2 | // +build amd64 3 | 4 | /* 5 | MIT License 6 | 7 | Copyright (c) 2021-2025 Prysmatic Labs 8 | 9 | Permission is hereby granted, free of charge, to any person obtaining a copy 10 | of this software and associated documentation files (the "Software"), to deal 11 | in the Software without restriction, including without limitation the rights 12 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 13 | copies of the Software, and to permit persons to whom the Software is 14 | furnished to do so, subject to the following conditions: 15 | 16 | The above copyright notice and this permission notice shall be included in all 17 | copies or substantial portions of the Software. 18 | 19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 21 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 22 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 23 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 24 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 25 | SOFTWARE. 26 | */ 27 | package gohashtree 28 | 29 | import ( 30 | "github.com/klauspost/cpuid/v2" 31 | ) 32 | 33 | var hasAVX512 = cpuid.CPU.Supports(cpuid.AVX512F, cpuid.AVX512VL) 34 | var hasAVX2 = cpuid.CPU.Supports(cpuid.AVX2, cpuid.BMI2) 35 | var hasShani = cpuid.CPU.Supports(cpuid.SHA, cpuid.AVX) 36 | var supportedCPU = hasAVX2 || hasShani || hasAVX512 37 | 38 | func _hash(digests *byte, p [][32]byte, count uint32) 39 | -------------------------------------------------------------------------------- /hash_arm64.go: -------------------------------------------------------------------------------- 1 | //go:build arm64 2 | // +build arm64 3 | 4 | /* 5 | MIT License 6 | 7 | Copyright (c) 2021-2025 Prysmatic Labs 8 | 9 | Permission is hereby granted, free of charge, to any person obtaining a copy 10 | of this software and associated documentation files (the "Software"), to deal 11 | in the Software without restriction, including without limitation the rights 12 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 13 | copies of the Software, and to permit persons to whom the Software is 14 | furnished to do so, subject to the following conditions: 15 | 16 | The above copyright notice and this permission notice shall be included in all 17 | copies or substantial portions of the Software. 18 | 19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 21 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 22 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 23 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 24 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 25 | SOFTWARE. 26 | */ 27 | package gohashtree 28 | 29 | import ( 30 | "github.com/klauspost/cpuid/v2" 31 | ) 32 | 33 | var hasShani = cpuid.CPU.Supports(cpuid.SHA2) 34 | var supportedCPU = true 35 | 36 | func _hash(digests *byte, p [][32]byte, count uint32) 37 | -------------------------------------------------------------------------------- /hash_arm64.s: -------------------------------------------------------------------------------- 1 | /* 2 | MIT License 3 | 4 | Copyright (c) 2021 Prysmatic Labs 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 
23 | 24 | This code is based on Intel's implementation found in 25 | https://github.com/intel/intel-ipsec-mb 26 | Copied parts are 27 | Copyright (c) 2012-2021, Intel Corporation 28 | */ 29 | 30 | #include "textflag.h" 31 | 32 | #define OUTPUT_PTR R0 33 | #define DATA_PTR R1 34 | #define NUM_BLKS R2 35 | #define last R2 36 | 37 | #define digest R19 38 | #define k256 R20 39 | #define padding R21 40 | 41 | #define VR0 V0 42 | #define VR1 V1 43 | #define VR2 V2 44 | #define VR3 V3 45 | #define VTMP0 V4 46 | #define VTMP1 V5 47 | #define VTMP2 V6 48 | #define VTMP3 V7 49 | #define VTMP4 V17 50 | #define VTMP5 V18 51 | #define VTMP6 V19 52 | #define KV0 V20 53 | #define KV1 V21 54 | #define KV2 V22 55 | #define KV3 V23 56 | #define KQ0 F20 57 | #define KQ1 F21 58 | #define KQ2 F22 59 | #define KQ3 F23 60 | #define VZ V16 61 | 62 | #define A_ R3 63 | #define B_ R4 64 | #define C_ R5 65 | #define D_ R6 66 | #define E_ R7 67 | #define F_ R9 68 | #define G_ R10 69 | #define H_ R11 70 | #define T1 R12 71 | #define T2 R13 72 | #define T3 R14 73 | #define T4 R15 74 | #define T5 R22 75 | 76 | #define round1_sched(A, B, C, D, E, F, G, H, VV0, VV1, VV2, VV3) \ 77 | VEXT $4, VV3.B16, VV2.B16, VTMP0.B16; \ 78 | RORW $6, E, T1; \ 79 | MOVWU (RSP), T3; \ 80 | RORW $2, A, T2; \ 81 | RORW $13, A, T4; \ 82 | VEXT $4, VV1.B16, VV0.B16, VTMP1.B16; \ 83 | EORW T4, T2, T2; \ 84 | ADDW T3, H, H; \ 85 | RORW $11, E, T3; \ 86 | VADD VV0.S4, VTMP0.S4, VTMP0.S4; \ 87 | EORW T3, T1, T1; \ 88 | RORW $25, E, T3; \ 89 | RORW $22, A, T4; \ 90 | VUSHR $7, VTMP1.S4, VTMP2.S4; \ 91 | EORW T3, T1, T1; \ 92 | EORW T4, T2, T2; \ 93 | EORW G, F, T3; \ 94 | VSHL $(32-7), VTMP1.S4, VTMP3.S4; \ 95 | EORW C, A, T4; \ 96 | ANDW E, T3, T3; \ 97 | ANDW B, T4, T4; \ 98 | EORW G, T3, T3; \ 99 | VUSHR $18, VTMP1.S4, VTMP4.S4; \ 100 | ADDW T3, T1, T1; \ 101 | ANDW C, A, T3; \ 102 | ADDW T1, H, H; \ 103 | VORR VTMP2.B16, VTMP3.B16, VTMP3.B16; \ 104 | EORW T3, T4, T4; \ 105 | ADDW H, D, D; \ 106 | ADDW T4, T2, T2; \ 107 
| VUSHR $3, VTMP1.S4, VTMP2.S4; \ 108 | ADDW T2, H, H 109 | 110 | #define round2_sched(A, B, C, D, E, F, G, H, VV3) \ 111 | MOVWU 4(RSP), T3; \ 112 | RORW $6, E, T1; \ 113 | VSHL $(32-18), VTMP1.S4, VTMP1.S4; \ 114 | RORW $2, A, T2; \ 115 | RORW $13, A, T4; \ 116 | ADDW T3, H, H; \ 117 | VEOR VTMP2.B16, VTMP3.B16, VTMP3.B16; \ 118 | RORW $11, E, T3; \ 119 | EORW T4, T2, T2; \ 120 | EORW T3, T1, T1; \ 121 | VEOR VTMP1.B16, VTMP4.B16, VTMP1.B16; \ 122 | RORW $25, E, T3; \ 123 | RORW $22, A, T4; \ 124 | EORW T3, T1, T1; \ 125 | VZIP2 VV3.S4, VV3.S4, VTMP5.S4; \ 126 | EORW T4, T2, T2; \ 127 | EORW G, F, T3; \ 128 | EORW C, A, T4; \ 129 | VEOR VTMP1.B16, VTMP3.B16, VTMP1.B16; \ 130 | ANDW E, T3, T3; \ 131 | ANDW B, T4, T4; \ 132 | EORW G, T3, T3; \ 133 | VUSHR $10, VTMP5.S4, VTMP6.S4; \ 134 | ADDW T3, T1, T1; \ 135 | ANDW C, A, T3; \ 136 | ADDW T1, H, H; \ 137 | VUSHR $19, VTMP5.D2, VTMP3.D2; \ 138 | EORW T3, T4, T4; \ 139 | ADDW H, D, D; \ 140 | ADDW T4, T2, T2; \ 141 | VUSHR $17, VTMP5.D2, VTMP2.D2; \ 142 | ADDW T2, H, H 143 | 144 | #define round3_sched(A, B, C, D, E, F, G, H) \ 145 | MOVWU 8(RSP), T3; \ 146 | RORW $6, E, T1; \ 147 | VEOR VTMP6.B16, VTMP3.B16, VTMP3.B16; \ 148 | RORW $2, A, T2; \ 149 | RORW $13, A, T4; \ 150 | ADDW T3, H, H; \ 151 | VADD VTMP1.S4, VTMP0.S4, VTMP0.S4; \ 152 | RORW $11, E, T3; \ 153 | EORW T4, T2, T2; \ 154 | EORW T3, T1, T1; \ 155 | VEOR VTMP2.B16, VTMP3.B16, VTMP1.B16; \ 156 | RORW $25, E, T3; \ 157 | RORW $22, A, T4; \ 158 | EORW T3, T1, T1; \ 159 | WORD $0xea128a5; \ 160 | EORW T4, T2, T2; \ 161 | EORW G, F, T3; \ 162 | EORW C, A, T4; \ 163 | VADD VTMP1.S4, VTMP0.S4, VTMP0.S4; \ 164 | ANDW E, T3, T3; \ 165 | ANDW B, T4, T4; \ 166 | EORW G, T3, T3; \ 167 | VZIP1 VTMP0.S4, VTMP0.S4, VTMP2.S4; \ 168 | ADDW T3, T1, T1; \ 169 | ANDW C, A, T3; \ 170 | ADDW T1, H, H; \ 171 | EORW T3, T4, T4; \ 172 | ADDW H, D, D; \ 173 | ADDW T4, T2, T2; \ 174 | VUSHR $10, VTMP2.S4, VTMP1.S4; \ 175 | ADDW T2, H, H 176 | 177 | #define round4_sched(A, B, C, 
D, E, F, G, H, VV0) \ 178 | MOVWU 12(RSP), T3; \ 179 | RORW $6, E, T1; \ 180 | RORW $2, A, T2; \ 181 | VUSHR $19, VTMP2.D2, VTMP3.D2; \ 182 | RORW $13, A, T4; \ 183 | ADDW T3, H, H; \ 184 | RORW $11, E, T3; \ 185 | EORW T4, T2, T2; \ 186 | VUSHR $17, VTMP2.D2, VTMP2.D2; \ 187 | EORW T3, T1, T1; \ 188 | RORW $25, E, T3; \ 189 | RORW $22, A, T4; \ 190 | EORW T3, T1, T1; \ 191 | VEOR VTMP3.B16, VTMP1.B16, VTMP1.B16; \ 192 | EORW T4, T2, T2; \ 193 | EORW G, F, T3; \ 194 | EORW C, A, T4; \ 195 | VEOR VTMP2.B16, VTMP1.B16, VTMP1.B16; \ 196 | ANDW E, T3, T3; \ 197 | ANDW B, T4, T4; \ 198 | EORW G, T3, T3; \ 199 | VUZP1 VTMP1.S4, VZ.S4, VTMP1.S4; \ 200 | ADDW T3, T1, T1; \ 201 | ANDW C, A, T3; \ 202 | ADDW T1, H, H; \ 203 | EORW T3, T4, T4; \ 204 | ADDW H, D, D; \ 205 | ADDW T4, T2, T2; \ 206 | VADD VTMP0.S4, VTMP1.S4, VV0.S4; \ 207 | ADDW T2, H, H 208 | 209 | #define four_rounds_sched(A, B, C, D, E, F, G, H, VV0, VV1, VV2, VV3) \ 210 | round1_sched(A, B, C, D, E, F, G, H, VV0, VV1, VV2, VV3); \ 211 | round2_sched(H, A, B, C, D, E, F, G, VV3); \ 212 | round3_sched(G, H, A, B, C, D, E, F); \ 213 | round4_sched(F, G, H, A, B, C, D, E, VV0) 214 | 215 | #define one_round(A, B, C, D, E, F, G, H, ptr, offset) \ 216 | MOVWU offset(ptr), T3; \ 217 | RORW $6, E, T1; \ 218 | RORW $2, A, T2; \ 219 | RORW $13, A, T4; \ 220 | ADDW T3, H, H; \ 221 | RORW $11, E, T3; \ 222 | EORW T4, T2, T2; \ 223 | EORW T3, T1, T1; \ 224 | RORW $25, E, T3; \ 225 | RORW $22, A, T4; \ 226 | EORW T3, T1, T1; \ 227 | EORW T4, T2, T2; \ 228 | EORW G, F, T3; \ 229 | EORW C, A, T4; \ 230 | ANDW E, T3, T3; \ 231 | ANDW B, T4, T4; \ 232 | EORW G, T3, T3; \ 233 | ADDW T3, T1, T1; \ 234 | ANDW C, A, T3; \ 235 | ADDW T1, H, H; \ 236 | EORW T3, T4, T4; \ 237 | ADDW H, D, D; \ 238 | ADDW T4, T2, T2; \ 239 | ADDW T2, H, H 240 | 241 | #define four_rounds(A, B, C, D, E, F, G, H, ptr, offset) \ 242 | one_round(A, B, C, D, E, F, G, H, ptr, offset); \ 243 | one_round(H, A, B, C, D, E, F, G, ptr, offset + 4); \ 244 | 
one_round(G, H, A, B, C, D, E, F, ptr, offset + 8); \ 245 | one_round(F, G, H, A, B, C, D, E, ptr, offset + 12) 246 | 247 | // Definitions for ASIMD version 248 | #define digest2 R6 249 | #define post64 R7 250 | #define postminus176 R9 251 | #define post32 R10 252 | #define postminus80 R11 253 | #define M1 V16 254 | #define M2 V17 255 | #define M3 V18 256 | #define M4 V19 257 | #define MQ1 F16 258 | #define MQ2 F17 259 | #define MQ3 F18 260 | #define MQ4 F19 261 | #define NVR1 V24 262 | #define NVR2 V25 263 | #define NVR3 V26 264 | #define NVR4 V27 265 | #define QR2 F25 266 | #define QR4 F27 267 | #define TV1 V28 268 | #define TV2 V29 269 | #define TV3 V30 270 | #define TV4 V31 271 | #define TV5 V20 272 | #define TV6 V21 273 | #define TV7 V22 274 | #define TV8 V23 275 | #define TQ4 F31 276 | #define TQ5 F20 277 | #define TQ6 F21 278 | #define TQ7 F22 279 | 280 | #define round_4(A, B, C, D, E, F, G, H, MV, MQ, bicword, offset) \ 281 | VUSHR $6, E.S4, TV1.S4; \ 282 | VSHL $(32-6), E.S4, TV2.S4; \ 283 | VUSHR $11, E.S4, NVR2.S4; \ 284 | VSHL $(32-11), E.S4, NVR1.S4; \ 285 | VAND F.B16, E.B16, TV3.B16; \ 286 | WORD bicword; \ 287 | VORR TV2.B16, TV1.B16, TV1.B16; \ 288 | VUSHR $25, E.S4, TV2.S4; \ 289 | FMOVQ offset(k256), QR4; \ 290 | VSHL $(32-25), E.S4, NVR3.S4; \ 291 | VORR NVR1.B16, NVR2.B16, NVR1.B16; \ 292 | VEOR TV4.B16, TV3.B16, TV3.B16; \ 293 | VORR NVR3.B16, TV2.B16, TV2.B16; \ 294 | VEOR C.B16, A.B16, NVR3.B16; \ 295 | VEOR NVR1.B16, TV1.B16, TV1.B16; \ 296 | VADD NVR4.S4, MV.S4, TV4.S4; \ 297 | VADD TV3.S4, H.S4, H.S4; \ 298 | VUSHR $2, A.S4, TV3.S4; \ 299 | VAND B.B16, NVR3.B16, NVR3.B16; \ 300 | VSHL $(32-2), A.S4, NVR4.S4; \ 301 | VEOR TV2.B16, TV1.B16, TV1.B16; \ 302 | VUSHR $13, A.S4, TV2.S4; \ 303 | VSHL $(32-13), A.S4, NVR1.S4; \ 304 | VADD TV4.S4, H.S4, H.S4; \ 305 | VORR NVR4.B16, TV3.B16, TV3.B16; \ 306 | VAND C.B16, A.B16, NVR4.B16; \ 307 | VUSHR $22, A.S4, TV4.S4; \ 308 | VSHL $(32 - 22), A.S4, NVR2.S4 ; \ 309 | VORR NVR1.B16, TV2.B16, TV2.B16; 
\ 310 | VADD TV1.S4, H.S4, H.S4; \ 311 | VEOR NVR4.B16, NVR3.B16, NVR3.B16; \ 312 | VORR NVR2.B16, TV4.B16, TV4.B16; \ 313 | VEOR TV3.B16, TV2.B16, TV2.B16; \ 314 | VADD H.S4, D.S4, D.S4; \ 315 | VADD NVR3.S4, H.S4, H.S4; \ 316 | VEOR TV4.B16, TV2.B16, TV2.B16; \ 317 | FMOVQ MQ, offset(RSP); \ 318 | VADD TV2.S4, H.S4, H.S4 319 | 320 | #define eight_4_roundsA(A, B, C, D, E, F, G, H, MV1, MV2, MV3, MV4, MQ1, MQ2, MQ3, MQ4, offset) \ 321 | round_4(A, B, C, D, E, F, G, H, MV1, MQ1, $0x4e641cdf, offset); \ 322 | round_4(H, A, B, C, D, E, F, G, MV2, MQ2, $0x4e631cbf, offset + 16); \ 323 | round_4(G, H, A, B, C, D, E, F, MV3, MQ3, $0x4e621c9f, offset + 32); \ 324 | round_4(F, G, H, A, B, C, D, E, MV4, MQ4, $0x4e611c7f, offset + 48) 325 | 326 | #define eight_4_roundsB(A, B, C, D, E, F, G, H, MV1, MV2, MV3, MV4, MQ1, MQ2, MQ3, MQ4, offset) \ 327 | round_4(A, B, C, D, E, F, G, H, MV1, MQ1, $0x4e601c5f, offset); \ 328 | round_4(H, A, B, C, D, E, F, G, MV2, MQ2, $0x4e671c3f, offset + 16); \ 329 | round_4(G, H, A, B, C, D, E, F, MV3, MQ3, $0x4e661c1f, offset + 32); \ 330 | round_4(F, G, H, A, B, C, D, E, MV4, MQ4, $0x4e651cff, offset + 48) 331 | 332 | #define round_4_and_sched(A, B, C, D, E, F, G, H, bicword, offset) \ 333 | FLDPQ (offset-256)(RSP), (TQ6, TQ5); \ 334 | VUSHR $6, E.S4, TV1.S4; \ 335 | VSHL $(32-6), E.S4, TV2.S4; \ 336 | VUSHR $11, E.S4, NVR2.S4; \ 337 | VSHL $(32-11), E.S4, NVR1.S4; \ 338 | VAND F.B16, E.B16, TV3.B16; \ 339 | WORD bicword; \ 340 | VUSHR $7, TV5.S4, M1.S4; \ 341 | FMOVQ (offset-32)(RSP), TQ7; \ 342 | VSHL $(32-7), TV5.S4, M2.S4; \ 343 | VORR TV2.B16, TV1.B16, TV1.B16; \ 344 | VUSHR $25, E.S4, TV2.S4; \ 345 | VSHL $(32-25), E.S4, NVR3.S4; \ 346 | VORR NVR1.B16, NVR2.B16, NVR1.B16; \ 347 | VEOR TV4.B16, TV3.B16, TV3.B16; \ 348 | FMOVQ offset(k256), QR4; \ 349 | VORR M2.B16, M1.B16, M1.B16; \ 350 | VUSHR $17, TV7.S4, M3.S4; \ 351 | VSHL $(32-17), TV7.S4, M4.S4; \ 352 | VUSHR $18, TV5.S4, M2.S4; \ 353 | VSHL $(32-18), TV5.S4, TV8.S4; \ 354 | VORR 
NVR3.B16, TV2.B16, TV2.B16; \ 355 | VEOR C.B16, A.B16, NVR3.B16; \ 356 | VORR M4.B16, M3.B16, M3.B16; \ 357 | FMOVQ (offset-112)(RSP), TQ4; \ 358 | VUSHR $19, TV7.S4, M4.S4; \ 359 | VSHL $(32-19), TV7.S4, NVR2.S4; \ 360 | VORR TV8.B16, M2.B16, M2.B16; \ 361 | VUSHR $3, TV5.S4, TV8.S4; \ 362 | VORR NVR2.B16, M4.B16, M4.B16; \ 363 | VEOR NVR1.B16, TV1.B16, TV1.B16; \ 364 | VEOR M2.B16, M1.B16, M1.B16; \ 365 | VUSHR $10, TV7.S4, M2.S4; \ 366 | VEOR M4.B16, M3.B16, M3.B16; \ 367 | VADD TV3.S4, H.S4, H.S4; \ 368 | VEOR TV8.B16, M1.B16, M1.B16; \ 369 | VADD TV4.S4, TV6.S4, TV6.S4; \ 370 | VEOR M2.B16, M3.B16, M3.B16; \ 371 | VUSHR $2, A.S4, TV3.S4; \ 372 | VAND B.B16, NVR3.B16, NVR3.B16; \ 373 | VADD TV6.S4, M1.S4, M1.S4; \ 374 | VSHL $(32-2), A.S4, TV6.S4; \ 375 | VEOR TV2.B16, TV1.B16, TV1.B16; \ 376 | VUSHR $13, A.S4, TV2.S4; \ 377 | VADD M3.S4, M1.S4, M1.S4; \ 378 | VADD TV1.S4, H.S4, H.S4; \ 379 | VSHL $(32-13), A.S4, NVR1.S4; \ 380 | VORR TV6.B16, TV3.B16, TV3.B16; \ 381 | VADD NVR4.S4, M1.S4, TV5.S4; \ 382 | FMOVQ MQ1, offset(RSP); \ 383 | VAND C.B16, A.B16, NVR4.B16; \ 384 | VUSHR $22, A.S4, TV4.S4; \ 385 | VSHL $(32-22), A.S4, NVR2.S4; \ 386 | VADD TV5.S4, H.S4, H.S4; \ 387 | VORR NVR1.B16, TV2.B16, TV2.B16; \ 388 | VEOR NVR4.B16, NVR3.B16, NVR3.B16; \ 389 | VORR NVR2.B16, TV4.B16, TV4.B16; \ 390 | VEOR TV3.B16, TV2.B16, TV2.B16; \ 391 | VADD H.S4, D.S4, D.S4; \ 392 | VADD NVR3.S4, H.S4, H.S4; \ 393 | VEOR TV4.B16, TV2.B16, TV2.B16; \ 394 | VADD TV2.S4, H.S4, H.S4 395 | 396 | #define eight_4_rounds_and_sched(A, B, C, D, E, F, G, H, offset) \ 397 | round_4_and_sched(A, B, C, D, E, F, G, H, $0x4e641cdf, offset + 0*16); \ 398 | round_4_and_sched(H, A, B, C, D, E, F, G, $0x4e631cbf, offset + 1*16); \ 399 | round_4_and_sched(G, H, A, B, C, D, E, F, $0x4e621c9f, offset + 2*16); \ 400 | round_4_and_sched(F, G, H, A, B, C, D, E, $0x4e611c7f, offset + 3*16); \ 401 | round_4_and_sched(E, F, G, H, A, B, C, D, $0x4e601c5f, offset + 4*16); \ 402 | round_4_and_sched(D, E, F, 
G, H, A, B, C, $0x4e671c3f, offset + 5*16); \ 403 | round_4_and_sched(C, D, E, F, G, H, A, B, $0x4e661c1f, offset + 6*16); \ 404 | round_4_and_sched(B, C, D, E, F, G, H, A, $0x4e651cff, offset + 7*16) 405 | 406 | #define round_4_padding(A, B, C, D, E, F, G, H, bicword, offset) \ 407 | VUSHR $6, E.S4, TV1.S4; \ 408 | VSHL $(32-6), E.S4, TV2.S4; \ 409 | VUSHR $11, E.S4, NVR2.S4; \ 410 | VSHL $(32-11), E.S4, NVR1.S4; \ 411 | VAND F.B16, E.B16, TV3.B16; \ 412 | WORD bicword; \ 413 | VORR TV2.B16, TV1.B16, TV1.B16; \ 414 | VUSHR $25, E.S4, TV2.S4; \ 415 | VSHL $(32-25), E.S4, NVR3.S4; \ 416 | VORR NVR1.B16, NVR2.B16, NVR1.B16; \ 417 | VEOR TV4.B16, TV3.B16, TV3.B16; \ 418 | VORR NVR3.B16, TV2.B16, TV2.B16; \ 419 | VEOR C.B16, A.B16, NVR3.B16; \ 420 | VEOR NVR1.B16, TV1.B16, TV1.B16; \ 421 | VADD TV3.S4, H.S4, H.S4; \ 422 | VUSHR $2, A.S4, TV3.S4; \ 423 | FMOVQ offset(padding), QR2; \ 424 | VAND B.B16, NVR3.B16, NVR3.B16; \ 425 | VSHL $(32-2), A.S4, NVR4.S4; \ 426 | VEOR TV2.B16, TV1.B16, TV1.B16; \ 427 | VUSHR $13, A.S4, TV2.S4; \ 428 | VSHL $(32-13), A.S4, NVR1.S4; \ 429 | VADD NVR2.S4, H.S4, H.S4; \ 430 | VORR NVR4.B16, TV3.B16, TV3.B16; \ 431 | VAND C.B16, A.B16, NVR4.B16; \ 432 | VUSHR $22, A.S4, TV4.S4; \ 433 | VSHL $(32-22), A.S4, NVR2.S4; \ 434 | VORR NVR1.B16, TV2.B16, TV2.B16; \ 435 | VADD TV1.S4, H.S4, H.S4; \ 436 | VEOR NVR4.B16, NVR3.B16, NVR3.B16; \ 437 | VORR NVR2.B16, TV4.B16, TV4.B16; \ 438 | VEOR TV3.B16, TV2.B16, TV2.B16; \ 439 | VADD H.S4, D.S4, D.S4; \ 440 | VADD NVR3.S4, H.S4, H.S4; \ 441 | VEOR TV4.B16, TV2.B16, TV2.B16; \ 442 | VADD TV2.S4, H.S4, H.S4 443 | 444 | #define eight_4_rounds_padding(A, B, C, D, E, F, G, H, offset) \ 445 | round_4_padding(A, B, C, D, E, F, G, H, $0x4e641cdf, offset + 0*16); \ 446 | round_4_padding(H, A, B, C, D, E, F, G, $0x4e631cbf, offset + 1*16); \ 447 | round_4_padding(G, H, A, B, C, D, E, F, $0x4e621c9f, offset + 2*16); \ 448 | round_4_padding(F, G, H, A, B, C, D, E, $0x4e611c7f, offset + 3*16); \ 449 | 
round_4_padding(E, F, G, H, A, B, C, D, $0x4e601c5f, offset + 4*16); \ 450 | round_4_padding(D, E, F, G, H, A, B, C, $0x4e671c3f, offset + 5*16); \ 451 | round_4_padding(C, D, E, F, G, H, A, B, $0x4e661c1f, offset + 6*16); \ 452 | round_4_padding(B, C, D, E, F, G, H, A, $0x4e651cff, offset + 7*16) 453 | 454 | // Definitions for SHA-2 455 | #define check_shani R19 456 | 457 | #define HASHUPDATE(word) \ 458 | SHA256H word, V3, V2; \ 459 | SHA256H2 word, V8, V3; \ 460 | VMOV V2.B16, V8.B16 461 | 462 | TEXT ·_hash(SB), 0, $1024-36 463 | MOVD digests+0(FP), OUTPUT_PTR 464 | MOVD p_base+8(FP), DATA_PTR 465 | MOVWU count+32(FP), NUM_BLKS 466 | 467 | MOVBU ·hasShani(SB), check_shani 468 | CBNZ check_shani, shani 469 | 470 | arm_x4: 471 | CMPW $4, NUM_BLKS 472 | BLO arm_x1 473 | 474 | MOVD $_PADDING_4<>(SB), padding 475 | MOVD $_K256_4<>(SB), k256 476 | MOVD $_DIGEST_4<>(SB), digest 477 | ADD $64, digest, digest2 478 | MOVD $64, post64 479 | MOVD $32, post32 480 | MOVD $-80, postminus80 481 | MOVD $-176, postminus176 482 | 483 | arm_x4_loop: 484 | CMPW $4, NUM_BLKS 485 | BLO arm_x1 486 | VLD1 (digest), [V0.S4, V1.S4, V2.S4, V3.S4] 487 | VLD1 (digest2), [V4.S4, V5.S4, V6.S4, V7.S4] 488 | 489 | // First 16 rounds 490 | WORD $0xde7a030 491 | WORD $0xde7b030 492 | WORD $0x4de7a030 493 | WORD $0x4de9b030 494 | VREV32 M1.B16, M1.B16 495 | VREV32 M2.B16, M2.B16 496 | VREV32 M3.B16, M3.B16 497 | VREV32 M4.B16, M4.B16 498 | eight_4_roundsA(V0, V1, V2, V3, V4, V5, V6, V7, M1, M2, M3, M4, MQ1, MQ2, MQ3, MQ4, 0x00) 499 | 500 | WORD $0xde7a030 501 | WORD $0xde7b030 502 | WORD $0x4de7a030 503 | WORD $0x4de9b030 504 | VREV32 M1.B16, M1.B16 505 | VREV32 M2.B16, M2.B16 506 | VREV32 M3.B16, M3.B16 507 | VREV32 M4.B16, M4.B16 508 | eight_4_roundsB(V4, V5, V6, V7, V0, V1, V2, V3, M1, M2, M3, M4, MQ1, MQ2, MQ3, MQ4, 0x40) 509 | 510 | WORD $0xde7a030 511 | WORD $0xde7b030 512 | WORD $0x4de7a030 513 | WORD $0x4de9b030 514 | VREV32 M1.B16, M1.B16 515 | VREV32 M2.B16, M2.B16 516 | VREV32 M3.B16, 
M3.B16 517 | VREV32 M4.B16, M4.B16 518 | eight_4_roundsA(V0, V1, V2, V3, V4, V5, V6, V7, M1, M2, M3, M4, MQ1, MQ2, MQ3, MQ4, 0x80) 519 | 520 | WORD $0xde7a030 521 | WORD $0xde7b030 522 | WORD $0x4de7a030 523 | WORD $0x4de9b030 524 | VREV32 M1.B16, M1.B16 525 | VREV32 M2.B16, M2.B16 526 | VREV32 M3.B16, M3.B16 527 | VREV32 M4.B16, M4.B16 528 | eight_4_roundsB(V4, V5, V6, V7, V0, V1, V2, V3, M1, M2, M3, M4, MQ1, MQ2, MQ3, MQ4, 0xc0) 529 | 530 | eight_4_rounds_and_sched(V0, V1, V2, V3, V4, V5, V6, V7, 0x100) 531 | eight_4_rounds_and_sched(V0, V1, V2, V3, V4, V5, V6, V7, 0x180) 532 | eight_4_rounds_and_sched(V0, V1, V2, V3, V4, V5, V6, V7, 0x200) 533 | eight_4_rounds_and_sched(V0, V1, V2, V3, V4, V5, V6, V7, 0x280) 534 | eight_4_rounds_and_sched(V0, V1, V2, V3, V4, V5, V6, V7, 0x300) 535 | eight_4_rounds_and_sched(V0, V1, V2, V3, V4, V5, V6, V7, 0x380) 536 | 537 | 538 | // add previous digest 539 | VLD1 (digest), [M1.S4, M2.S4, M3.S4, M4.S4] 540 | VLD1 (digest2), [TV5.S4, TV6.S4, TV7.S4, TV8.S4] 541 | VADD M1.S4, V0.S4, V0.S4 542 | VADD M2.S4, V1.S4, V1.S4 543 | VADD M3.S4, V2.S4, V2.S4 544 | VADD M4.S4, V3.S4, V3.S4 545 | VADD TV5.S4, V4.S4, V4.S4 546 | VADD TV6.S4, V5.S4, V5.S4 547 | VADD TV7.S4, V6.S4, V6.S4 548 | VADD TV8.S4, V7.S4, V7.S4 549 | 550 | // save state 551 | VMOV V0.B16, M1.B16 552 | VMOV V1.B16, M2.B16 553 | VMOV V2.B16, M3.B16 554 | VMOV V3.B16, M4.B16 555 | VMOV V4.B16, TV5.B16 556 | VMOV V5.B16, TV6.B16 557 | VMOV V6.B16, TV7.B16 558 | VMOV V7.B16, TV8.B16 559 | 560 | // rounds with padding 561 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, V7, 0x000) 562 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, V7, 0x080) 563 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, V7, 0x100) 564 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, V7, 0x180) 565 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, V7, 0x200) 566 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, V7, 0x280) 567 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, 
V7, 0x300) 568 | eight_4_rounds_padding(V0, V1, V2, V3, V4, V5, V6, V7, 0x380) 569 | 570 | // add previous digest 571 | VADD M1.S4, V0.S4, V0.S4 572 | VADD M2.S4, V1.S4, V1.S4 573 | VADD M3.S4, V2.S4, V2.S4 574 | VADD M4.S4, V3.S4, V3.S4 575 | VADD TV5.S4, V4.S4, V4.S4 576 | VADD TV6.S4, V5.S4, V5.S4 577 | VADD TV7.S4, V6.S4, V6.S4 578 | VADD TV8.S4, V7.S4, V7.S4 579 | 580 | // change endianness, transpose and store 581 | VREV32 V0.B16, V0.B16 582 | VREV32 V1.B16, V1.B16 583 | VREV32 V2.B16, V2.B16 584 | VREV32 V3.B16, V3.B16 585 | VREV32 V4.B16, V4.B16 586 | VREV32 V5.B16, V5.B16 587 | VREV32 V6.B16, V6.B16 588 | VREV32 V7.B16, V7.B16 589 | 590 | WORD $0xdaaa000 591 | WORD $0xdaab000 592 | WORD $0x4daaa000 593 | WORD $0x4dabb000 594 | WORD $0xdaaa004 595 | WORD $0xdaab004 596 | WORD $0x4daaa004 597 | WORD $0x4dbfb004 598 | 599 | ADD $192, DATA_PTR, DATA_PTR 600 | SUBW $4, NUM_BLKS, NUM_BLKS 601 | JMP arm_x4_loop 602 | 603 | arm_x1: 604 | VMOV ZR, VZ.S4 // Golang guarantees this is zero 605 | MOVD $_DIGEST_1<>(SB), digest 606 | MOVD $_PADDING_1<>(SB), padding 607 | ADD NUM_BLKS<<5, OUTPUT_PTR, last 608 | 609 | arm_x1_loop: 610 | CMP OUTPUT_PTR, last 611 | BEQ epilog 612 | 613 | // Load one block 614 | VLD1.P 64(DATA_PTR), [VR0.S4, VR1.S4, VR2.S4, VR3.S4] 615 | MOVD $_K256_1<>(SB), k256 616 | 617 | // change endianness 618 | VREV32 VR0.B16, VR0.B16 619 | VREV32 VR1.B16, VR1.B16 620 | VREV32 VR2.B16, VR2.B16 621 | VREV32 VR3.B16, VR3.B16 622 | 623 | // load initial digest 624 | LDPW (digest), (A_, B_) 625 | LDPW 8(digest), (C_, D_) 626 | LDPW 16(digest), (E_, F_) 627 | LDPW 24(digest), (G_, H_) 628 | 629 | // First 48 rounds 630 | VLD1.P 64(k256), [KV0.S4, KV1.S4, KV2.S4, KV3.S4] 631 | VADD VR0.S4, KV0.S4, KV0.S4 632 | FMOVQ KQ0, (RSP) 633 | four_rounds_sched(A_, B_, C_, D_, E_, F_, G_, H_, VR0, VR1, VR2, VR3) 634 | 635 | VADD VR1.S4, KV1.S4, KV1.S4 636 | FMOVQ KQ1, (RSP) 637 | four_rounds_sched(E_, F_, G_, H_, A_, B_, C_, D_, VR1, VR2, VR3, VR0) 638 | 639 | VADD
VR2.S4, KV2.S4, KV2.S4 640 | FMOVQ KQ2, (RSP) 641 | four_rounds_sched(A_, B_, C_, D_, E_, F_, G_, H_, VR2, VR3, VR0, VR1) 642 | 643 | VADD VR3.S4, KV3.S4, KV3.S4 644 | FMOVQ KQ3, (RSP) 645 | four_rounds_sched(E_, F_, G_, H_, A_, B_, C_, D_, VR3, VR0, VR1, VR2) 646 | 647 | VLD1.P 64(k256), [KV0.S4, KV1.S4, KV2.S4, KV3.S4] 648 | VADD VR0.S4, KV0.S4, KV0.S4 649 | FMOVQ KQ0, (RSP) 650 | four_rounds_sched(A_, B_, C_, D_, E_, F_, G_, H_, VR0, VR1, VR2, VR3) 651 | 652 | VADD VR1.S4, KV1.S4, KV1.S4 653 | FMOVQ KQ1, (RSP) 654 | four_rounds_sched(E_, F_, G_, H_, A_, B_, C_, D_, VR1, VR2, VR3, VR0) 655 | 656 | VADD VR2.S4, KV2.S4, KV2.S4 657 | FMOVQ KQ2, (RSP) 658 | four_rounds_sched(A_, B_, C_, D_, E_, F_, G_, H_, VR2, VR3, VR0, VR1) 659 | 660 | VADD VR3.S4, KV3.S4, KV3.S4 661 | FMOVQ KQ3, (RSP) 662 | four_rounds_sched(E_, F_, G_, H_, A_, B_, C_, D_, VR3, VR0, VR1, VR2) 663 | 664 | VLD1.P 64(k256), [KV0.S4, KV1.S4, KV2.S4, KV3.S4] 665 | VADD VR0.S4, KV0.S4, KV0.S4 666 | FMOVQ KQ0, (RSP) 667 | four_rounds_sched(A_, B_, C_, D_, E_, F_, G_, H_, VR0, VR1, VR2, VR3) 668 | 669 | VADD VR1.S4, KV1.S4, KV1.S4 670 | FMOVQ KQ1, (RSP) 671 | four_rounds_sched(E_, F_, G_, H_, A_, B_, C_, D_, VR1, VR2, VR3, VR0) 672 | 673 | VADD VR2.S4, KV2.S4, KV2.S4 674 | FMOVQ KQ2, (RSP) 675 | four_rounds_sched(A_, B_, C_, D_, E_, F_, G_, H_, VR2, VR3, VR0, VR1) 676 | 677 | VADD VR3.S4, KV3.S4, KV3.S4 678 | FMOVQ KQ3, (RSP) 679 | four_rounds_sched(E_, F_, G_, H_, A_, B_, C_, D_, VR3, VR0, VR1, VR2) 680 | 681 | // last 16 rounds 682 | VLD1.P 64(k256), [KV0.S4, KV1.S4, KV2.S4, KV3.S4] 683 | VADD VR0.S4, KV0.S4, KV0.S4 684 | FMOVQ KQ0, (RSP) 685 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, RSP, 0) 686 | 687 | VADD VR1.S4, KV1.S4, KV1.S4 688 | FMOVQ KQ1, (RSP) 689 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, RSP, 0) 690 | 691 | VADD VR2.S4, KV2.S4, KV2.S4 692 | FMOVQ KQ2, (RSP) 693 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, RSP, 0) 694 | 695 | VADD VR3.S4, KV3.S4, KV3.S4 696 | FMOVQ KQ3, (RSP) 697 | 
four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, RSP, 0) 698 | 699 | // rounds with padding 700 | LDPW (digest), (T1, T2) 701 | LDPW 8(digest), (T3, T4) 702 | 703 | ADDW T1, A_, A_ 704 | ADDW T2, B_, B_ 705 | ADDW T3, C_, C_ 706 | ADDW T4, D_, D_ 707 | LDPW 16(digest), (T1, T2) 708 | STPW (A_, B_), (RSP) 709 | STPW (C_, D_), 8(RSP) 710 | LDPW 24(digest), (T3, T4) 711 | ADDW T1, E_, E_ 712 | ADDW T2, F_, F_ 713 | ADDW T3, G_, G_ 714 | STPW (E_, F_), 16(RSP) 715 | ADDW T4, H_, H_ 716 | STPW (G_, H_), 24(RSP) 717 | 718 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0x00) 719 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0x10) 720 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0x20) 721 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0x30) 722 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0x40) 723 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0x50) 724 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0x60) 725 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0x70) 726 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0x80) 727 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0x90) 728 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0xa0) 729 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0xb0) 730 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0xc0) 731 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0xd0) 732 | four_rounds(A_, B_, C_, D_, E_, F_, G_, H_, padding, 0xe0) 733 | four_rounds(E_, F_, G_, H_, A_, B_, C_, D_, padding, 0xf0) 734 | 735 | LDPW (RSP), (T1, T2) 736 | LDPW 8(RSP), (T3, T4) 737 | ADDW T1, A_, A_ 738 | ADDW T2, B_, B_ 739 | REV32 A_, A_ 740 | REV32 B_, B_ 741 | ADDW T3, C_, C_ 742 | ADDW T4, D_, D_ 743 | STPW.P (A_, B_), 8(OUTPUT_PTR) 744 | LDPW 16(RSP), (T1, T2) 745 | 746 | REV32 C_, C_ 747 | REV32 D_, D_ 748 | STPW.P (C_, D_), 8(OUTPUT_PTR) 749 | LDPW 24(RSP), (T3, T4) 750 | ADDW T1, E_, E_ 751 | ADDW T2, F_, F_ 752 | REV32 E_, E_ 753 | REV32 F_, F_ 
754 | ADDW T3, G_, G_ 755 | ADDW T4, H_, H_ 756 | REV32 G_, G_ 757 | REV32 H_, H_ 758 | STPW.P (E_, F_), 8(OUTPUT_PTR) 759 | STPW.P (G_, H_), 8(OUTPUT_PTR) 760 | 761 | JMP arm_x1_loop 762 | 763 | shani: 764 | MOVD $_DIGEST_1<>(SB), digest 765 | MOVD $_PADDING_1<>(SB), padding 766 | MOVD $_K256_1<>(SB), k256 767 | ADD NUM_BLKS<<5, OUTPUT_PTR, last 768 | 769 | // load incoming digest 770 | VLD1 (digest), [V0.S4, V1.S4] 771 | 772 | shani_loop: 773 | CMP OUTPUT_PTR, last 774 | BEQ epilog 775 | 776 | 777 | // load all K constants 778 | VLD1.P 64(k256), [V16.S4, V17.S4, V18.S4, V19.S4] 779 | VLD1.P 64(k256), [V20.S4, V21.S4, V22.S4, V23.S4] 780 | VLD1.P 64(k256), [V24.S4, V25.S4, V26.S4, V27.S4] 781 | VLD1 (k256), [V28.S4, V29.S4, V30.S4, V31.S4] 782 | SUB $192, k256, k256 783 | 784 | // load one block 785 | VLD1.P 64(DATA_PTR), [V4.S4, V5.S4, V6.S4, V7.S4] 786 | VMOV V0.B16, V2.B16 787 | VMOV V1.B16, V3.B16 788 | VMOV V2.B16, V8.B16 789 | 790 | // reverse endianness 791 | VREV32 V4.B16, V4.B16 792 | VREV32 V5.B16, V5.B16 793 | VREV32 V6.B16, V6.B16 794 | VREV32 V7.B16, V7.B16 795 | 796 | VADD V16.S4, V4.S4, V9.S4 797 | SHA256SU0 V5.S4, V4.S4 798 | HASHUPDATE(V9.S4) 799 | 800 | VADD V17.S4, V5.S4, V9.S4 801 | SHA256SU0 V6.S4, V5.S4 802 | SHA256SU1 V7.S4, V6.S4, V4.S4 803 | HASHUPDATE(V9.S4) 804 | 805 | VADD V18.S4, V6.S4, V9.S4 806 | SHA256SU0 V7.S4, V6.S4 807 | SHA256SU1 V4.S4, V7.S4, V5.S4 808 | HASHUPDATE(V9.S4) 809 | 810 | VADD V19.S4, V7.S4, V9.S4 811 | SHA256SU0 V4.S4, V7.S4 812 | SHA256SU1 V5.S4, V4.S4, V6.S4 813 | HASHUPDATE(V9.S4) 814 | 815 | VADD V20.S4, V4.S4, V9.S4 816 | SHA256SU0 V5.S4, V4.S4 817 | SHA256SU1 V6.S4, V5.S4, V7.S4 818 | HASHUPDATE(V9.S4) 819 | 820 | VADD V21.S4, V5.S4, V9.S4 821 | SHA256SU0 V6.S4, V5.S4 822 | SHA256SU1 V7.S4, V6.S4, V4.S4 823 | HASHUPDATE(V9.S4) 824 | 825 | VADD V22.S4, V6.S4, V9.S4 826 | SHA256SU0 V7.S4, V6.S4 827 | SHA256SU1 V4.S4, V7.S4, V5.S4 828 | HASHUPDATE(V9.S4) 829 | 830 | VADD V23.S4, V7.S4, V9.S4 831 | SHA256SU0 
V4.S4, V7.S4 832 | SHA256SU1 V5.S4, V4.S4, V6.S4 833 | HASHUPDATE(V9.S4) 834 | 835 | VADD V24.S4, V4.S4, V9.S4 836 | SHA256SU0 V5.S4, V4.S4 837 | SHA256SU1 V6.S4, V5.S4, V7.S4 838 | HASHUPDATE(V9.S4) 839 | 840 | VADD V25.S4, V5.S4, V9.S4 841 | SHA256SU0 V6.S4, V5.S4 842 | SHA256SU1 V7.S4, V6.S4, V4.S4 843 | HASHUPDATE(V9.S4) 844 | 845 | VADD V26.S4, V6.S4, V9.S4 846 | SHA256SU0 V7.S4, V6.S4 847 | SHA256SU1 V4.S4, V7.S4, V5.S4 848 | HASHUPDATE(V9.S4) 849 | 850 | VADD V27.S4, V7.S4, V9.S4 851 | SHA256SU0 V4.S4, V7.S4 852 | SHA256SU1 V5.S4, V4.S4, V6.S4 853 | HASHUPDATE(V9.S4) 854 | 855 | VADD V28.S4, V4.S4, V9.S4 856 | HASHUPDATE(V9.S4) 857 | SHA256SU1 V6.S4, V5.S4, V7.S4 858 | 859 | VADD V29.S4, V5.S4, V9.S4 860 | HASHUPDATE(V9.S4) 861 | 862 | VADD V30.S4, V6.S4, V9.S4 863 | HASHUPDATE(V9.S4) 864 | 865 | VADD V31.S4, V7.S4, V9.S4 866 | HASHUPDATE(V9.S4) 867 | 868 | 869 | // Add initial digest 870 | VADD V2.S4, V0.S4, V2.S4 871 | VADD V3.S4, V1.S4, V3.S4 872 | 873 | // Back it up 874 | VMOV V2.B16, V10.B16 875 | VMOV V3.B16, V11.B16 876 | 877 | // Rounds with padding 878 | 879 | // load prescheduled constants 880 | VLD1.P 64(padding), [V16.S4, V17.S4, V18.S4, V19.S4] 881 | VLD1.P 64(padding), [V20.S4, V21.S4, V22.S4, V23.S4] 882 | VMOV V2.B16, V8.B16 883 | VLD1.P 64(padding), [V24.S4, V25.S4, V26.S4, V27.S4] 884 | VLD1 (padding), [V28.S4, V29.S4, V30.S4, V31.S4] 885 | SUB $192, padding, padding 886 | 887 | HASHUPDATE(V16.S4) 888 | HASHUPDATE(V17.S4) 889 | HASHUPDATE(V18.S4) 890 | HASHUPDATE(V19.S4) 891 | HASHUPDATE(V20.S4) 892 | HASHUPDATE(V21.S4) 893 | HASHUPDATE(V22.S4) 894 | HASHUPDATE(V23.S4) 895 | HASHUPDATE(V24.S4) 896 | HASHUPDATE(V25.S4) 897 | HASHUPDATE(V26.S4) 898 | HASHUPDATE(V27.S4) 899 | HASHUPDATE(V28.S4) 900 | HASHUPDATE(V29.S4) 901 | HASHUPDATE(V30.S4) 902 | HASHUPDATE(V31.S4) 903 | 904 | // add backed up digest 905 | VADD V2.S4, V10.S4, V2.S4 906 | VADD V3.S4, V11.S4, V3.S4 907 | 908 | 909 | VREV32 V2.B16, V2.B16 910 | VREV32 V3.B16, V3.B16 911 | 912 
| VST1.P [V2.S4, V3.S4], 32(OUTPUT_PTR) 913 | JMP shani_loop 914 | 915 | epilog: 916 | RET 917 | 918 | // Data section 919 | DATA _K256_1<>+0x00(SB)/4, $0x428a2f98 920 | DATA _K256_1<>+0x04(SB)/4, $0x71374491 921 | DATA _K256_1<>+0x08(SB)/4, $0xb5c0fbcf 922 | DATA _K256_1<>+0x0c(SB)/4, $0xe9b5dba5 923 | DATA _K256_1<>+0x10(SB)/4, $0x3956c25b 924 | DATA _K256_1<>+0x14(SB)/4, $0x59f111f1 925 | DATA _K256_1<>+0x18(SB)/4, $0x923f82a4 926 | DATA _K256_1<>+0x1c(SB)/4, $0xab1c5ed5 927 | DATA _K256_1<>+0x20(SB)/4, $0xd807aa98 928 | DATA _K256_1<>+0x24(SB)/4, $0x12835b01 929 | DATA _K256_1<>+0x28(SB)/4, $0x243185be 930 | DATA _K256_1<>+0x2c(SB)/4, $0x550c7dc3 931 | DATA _K256_1<>+0x30(SB)/4, $0x72be5d74 932 | DATA _K256_1<>+0x34(SB)/4, $0x80deb1fe 933 | DATA _K256_1<>+0x38(SB)/4, $0x9bdc06a7 934 | DATA _K256_1<>+0x3c(SB)/4, $0xc19bf174 935 | DATA _K256_1<>+0x40(SB)/4, $0xe49b69c1 936 | DATA _K256_1<>+0x44(SB)/4, $0xefbe4786 937 | DATA _K256_1<>+0x48(SB)/4, $0x0fc19dc6 938 | DATA _K256_1<>+0x4c(SB)/4, $0x240ca1cc 939 | DATA _K256_1<>+0x50(SB)/4, $0x2de92c6f 940 | DATA _K256_1<>+0x54(SB)/4, $0x4a7484aa 941 | DATA _K256_1<>+0x58(SB)/4, $0x5cb0a9dc 942 | DATA _K256_1<>+0x5c(SB)/4, $0x76f988da 943 | DATA _K256_1<>+0x60(SB)/4, $0x983e5152 944 | DATA _K256_1<>+0x64(SB)/4, $0xa831c66d 945 | DATA _K256_1<>+0x68(SB)/4, $0xb00327c8 946 | DATA _K256_1<>+0x6c(SB)/4, $0xbf597fc7 947 | DATA _K256_1<>+0x70(SB)/4, $0xc6e00bf3 948 | DATA _K256_1<>+0x74(SB)/4, $0xd5a79147 949 | DATA _K256_1<>+0x78(SB)/4, $0x06ca6351 950 | DATA _K256_1<>+0x7c(SB)/4, $0x14292967 951 | DATA _K256_1<>+0x80(SB)/4, $0x27b70a85 952 | DATA _K256_1<>+0x84(SB)/4, $0x2e1b2138 953 | DATA _K256_1<>+0x88(SB)/4, $0x4d2c6dfc 954 | DATA _K256_1<>+0x8c(SB)/4, $0x53380d13 955 | DATA _K256_1<>+0x90(SB)/4, $0x650a7354 956 | DATA _K256_1<>+0x94(SB)/4, $0x766a0abb 957 | DATA _K256_1<>+0x98(SB)/4, $0x81c2c92e 958 | DATA _K256_1<>+0x9c(SB)/4, $0x92722c85 959 | DATA _K256_1<>+0xa0(SB)/4, $0xa2bfe8a1 960 | DATA _K256_1<>+0xa4(SB)/4, 
$0xa81a664b 961 | DATA _K256_1<>+0xa8(SB)/4, $0xc24b8b70 962 | DATA _K256_1<>+0xac(SB)/4, $0xc76c51a3 963 | DATA _K256_1<>+0xb0(SB)/4, $0xd192e819 964 | DATA _K256_1<>+0xb4(SB)/4, $0xd6990624 965 | DATA _K256_1<>+0xb8(SB)/4, $0xf40e3585 966 | DATA _K256_1<>+0xbc(SB)/4, $0x106aa070 967 | DATA _K256_1<>+0xc0(SB)/4, $0x19a4c116 968 | DATA _K256_1<>+0xc4(SB)/4, $0x1e376c08 969 | DATA _K256_1<>+0xc8(SB)/4, $0x2748774c 970 | DATA _K256_1<>+0xcc(SB)/4, $0x34b0bcb5 971 | DATA _K256_1<>+0xd0(SB)/4, $0x391c0cb3 972 | DATA _K256_1<>+0xd4(SB)/4, $0x4ed8aa4a 973 | DATA _K256_1<>+0xd8(SB)/4, $0x5b9cca4f 974 | DATA _K256_1<>+0xdc(SB)/4, $0x682e6ff3 975 | DATA _K256_1<>+0xe0(SB)/4, $0x748f82ee 976 | DATA _K256_1<>+0xe4(SB)/4, $0x78a5636f 977 | DATA _K256_1<>+0xe8(SB)/4, $0x84c87814 978 | DATA _K256_1<>+0xec(SB)/4, $0x8cc70208 979 | DATA _K256_1<>+0xf0(SB)/4, $0x90befffa 980 | DATA _K256_1<>+0xf4(SB)/4, $0xa4506ceb 981 | DATA _K256_1<>+0xf8(SB)/4, $0xbef9a3f7 982 | DATA _K256_1<>+0xfc(SB)/4, $0xc67178f2 983 | GLOBL _K256_1<>(SB),(NOPTR+RODATA),$256 984 | 985 | DATA _PADDING_1<>+0x00(SB)/4, $0xc28a2f98 986 | DATA _PADDING_1<>+0x04(SB)/4, $0x71374491 987 | DATA _PADDING_1<>+0x08(SB)/4, $0xb5c0fbcf 988 | DATA _PADDING_1<>+0x0c(SB)/4, $0xe9b5dba5 989 | DATA _PADDING_1<>+0x10(SB)/4, $0x3956c25b 990 | DATA _PADDING_1<>+0x14(SB)/4, $0x59f111f1 991 | DATA _PADDING_1<>+0x18(SB)/4, $0x923f82a4 992 | DATA _PADDING_1<>+0x1c(SB)/4, $0xab1c5ed5 993 | DATA _PADDING_1<>+0x20(SB)/4, $0xd807aa98 994 | DATA _PADDING_1<>+0x24(SB)/4, $0x12835b01 995 | DATA _PADDING_1<>+0x28(SB)/4, $0x243185be 996 | DATA _PADDING_1<>+0x2c(SB)/4, $0x550c7dc3 997 | DATA _PADDING_1<>+0x30(SB)/4, $0x72be5d74 998 | DATA _PADDING_1<>+0x34(SB)/4, $0x80deb1fe 999 | DATA _PADDING_1<>+0x38(SB)/4, $0x9bdc06a7 1000 | DATA _PADDING_1<>+0x3c(SB)/4, $0xc19bf374 1001 | DATA _PADDING_1<>+0x40(SB)/4, $0x649b69c1 1002 | DATA _PADDING_1<>+0x44(SB)/4, $0xf0fe4786 1003 | DATA _PADDING_1<>+0x48(SB)/4, $0x0fe1edc6 1004 | DATA 
_PADDING_1<>+0x4c(SB)/4, $0x240cf254 1005 | DATA _PADDING_1<>+0x50(SB)/4, $0x4fe9346f 1006 | DATA _PADDING_1<>+0x54(SB)/4, $0x6cc984be 1007 | DATA _PADDING_1<>+0x58(SB)/4, $0x61b9411e 1008 | DATA _PADDING_1<>+0x5c(SB)/4, $0x16f988fa 1009 | DATA _PADDING_1<>+0x60(SB)/4, $0xf2c65152 1010 | DATA _PADDING_1<>+0x64(SB)/4, $0xa88e5a6d 1011 | DATA _PADDING_1<>+0x68(SB)/4, $0xb019fc65 1012 | DATA _PADDING_1<>+0x6c(SB)/4, $0xb9d99ec7 1013 | DATA _PADDING_1<>+0x70(SB)/4, $0x9a1231c3 1014 | DATA _PADDING_1<>+0x74(SB)/4, $0xe70eeaa0 1015 | DATA _PADDING_1<>+0x78(SB)/4, $0xfdb1232b 1016 | DATA _PADDING_1<>+0x7c(SB)/4, $0xc7353eb0 1017 | DATA _PADDING_1<>+0x80(SB)/4, $0x3069bad5 1018 | DATA _PADDING_1<>+0x84(SB)/4, $0xcb976d5f 1019 | DATA _PADDING_1<>+0x88(SB)/4, $0x5a0f118f 1020 | DATA _PADDING_1<>+0x8c(SB)/4, $0xdc1eeefd 1021 | DATA _PADDING_1<>+0x90(SB)/4, $0x0a35b689 1022 | DATA _PADDING_1<>+0x94(SB)/4, $0xde0b7a04 1023 | DATA _PADDING_1<>+0x98(SB)/4, $0x58f4ca9d 1024 | DATA _PADDING_1<>+0x9c(SB)/4, $0xe15d5b16 1025 | DATA _PADDING_1<>+0xa0(SB)/4, $0x007f3e86 1026 | DATA _PADDING_1<>+0xa4(SB)/4, $0x37088980 1027 | DATA _PADDING_1<>+0xa8(SB)/4, $0xa507ea32 1028 | DATA _PADDING_1<>+0xac(SB)/4, $0x6fab9537 1029 | DATA _PADDING_1<>+0xb0(SB)/4, $0x17406110 1030 | DATA _PADDING_1<>+0xb4(SB)/4, $0x0d8cd6f1 1031 | DATA _PADDING_1<>+0xb8(SB)/4, $0xcdaa3b6d 1032 | DATA _PADDING_1<>+0xbc(SB)/4, $0xc0bbbe37 1033 | DATA _PADDING_1<>+0xc0(SB)/4, $0x83613bda 1034 | DATA _PADDING_1<>+0xc4(SB)/4, $0xdb48a363 1035 | DATA _PADDING_1<>+0xc8(SB)/4, $0x0b02e931 1036 | DATA _PADDING_1<>+0xcc(SB)/4, $0x6fd15ca7 1037 | DATA _PADDING_1<>+0xd0(SB)/4, $0x521afaca 1038 | DATA _PADDING_1<>+0xd4(SB)/4, $0x31338431 1039 | DATA _PADDING_1<>+0xd8(SB)/4, $0x6ed41a95 1040 | DATA _PADDING_1<>+0xdc(SB)/4, $0x6d437890 1041 | DATA _PADDING_1<>+0xe0(SB)/4, $0xc39c91f2 1042 | DATA _PADDING_1<>+0xe4(SB)/4, $0x9eccabbd 1043 | DATA _PADDING_1<>+0xe8(SB)/4, $0xb5c9a0e6 1044 | DATA _PADDING_1<>+0xec(SB)/4, $0x532fb63c 
1045 | DATA _PADDING_1<>+0xf0(SB)/4, $0xd2c741c6 1046 | DATA _PADDING_1<>+0xf4(SB)/4, $0x07237ea3 1047 | DATA _PADDING_1<>+0xf8(SB)/4, $0xa4954b68 1048 | DATA _PADDING_1<>+0xfc(SB)/4, $0x4c191d76 1049 | GLOBL _PADDING_1<>(SB),(NOPTR+RODATA),$256 1050 | 1051 | DATA _DIGEST_1<>+0(SB)/4, $0x6a09e667 1052 | DATA _DIGEST_1<>+4(SB)/4, $0xbb67ae85 1053 | DATA _DIGEST_1<>+8(SB)/4, $0x3c6ef372 1054 | DATA _DIGEST_1<>+12(SB)/4, $0xa54ff53a 1055 | DATA _DIGEST_1<>+16(SB)/4, $0x510e527f 1056 | DATA _DIGEST_1<>+20(SB)/4, $0x9b05688c 1057 | DATA _DIGEST_1<>+24(SB)/4, $0x1f83d9ab 1058 | DATA _DIGEST_1<>+28(SB)/4, $0x5be0cd19 1059 | GLOBL _DIGEST_1<>(SB),(NOPTR+RODATA),$32 1060 | 1061 | DATA _DIGEST_4<>+0(SB)/8, $0x6a09e6676a09e667 1062 | DATA _DIGEST_4<>+8(SB)/8, $0x6a09e6676a09e667 1063 | DATA _DIGEST_4<>+16(SB)/8, $0xbb67ae85bb67ae85 1064 | DATA _DIGEST_4<>+24(SB)/8, $0xbb67ae85bb67ae85 1065 | DATA _DIGEST_4<>+32(SB)/8, $0x3c6ef3723c6ef372 1066 | DATA _DIGEST_4<>+40(SB)/8, $0x3c6ef3723c6ef372 1067 | DATA _DIGEST_4<>+48(SB)/8, $0xa54ff53aa54ff53a 1068 | DATA _DIGEST_4<>+56(SB)/8, $0xa54ff53aa54ff53a 1069 | DATA _DIGEST_4<>+64(SB)/8, $0x510e527f510e527f 1070 | DATA _DIGEST_4<>+72(SB)/8, $0x510e527f510e527f 1071 | DATA _DIGEST_4<>+80(SB)/8, $0x9b05688c9b05688c 1072 | DATA _DIGEST_4<>+88(SB)/8, $0x9b05688c9b05688c 1073 | DATA _DIGEST_4<>+96(SB)/8, $0x1f83d9ab1f83d9ab 1074 | DATA _DIGEST_4<>+104(SB)/8, $0x1f83d9ab1f83d9ab 1075 | DATA _DIGEST_4<>+112(SB)/8, $0x5be0cd195be0cd19 1076 | DATA _DIGEST_4<>+120(SB)/8, $0x5be0cd195be0cd19 1077 | GLOBL _DIGEST_4<>(SB),(NOPTR+RODATA),$128 1078 | 1079 | 1080 | DATA _PADDING_4<>+0(SB)/8, $0xc28a2f98c28a2f98 1081 | DATA _PADDING_4<>+8(SB)/8, $0xc28a2f98c28a2f98 1082 | DATA _PADDING_4<>+16(SB)/8, $0x7137449171374491 1083 | DATA _PADDING_4<>+24(SB)/8, $0x7137449171374491 1084 | DATA _PADDING_4<>+32(SB)/8, $0xb5c0fbcfb5c0fbcf 1085 | DATA _PADDING_4<>+40(SB)/8, $0xb5c0fbcfb5c0fbcf 1086 | DATA _PADDING_4<>+48(SB)/8, $0xe9b5dba5e9b5dba5 1087 | DATA 
_PADDING_4<>+56(SB)/8, $0xe9b5dba5e9b5dba5 1088 | DATA _PADDING_4<>+64(SB)/8, $0x3956c25b3956c25b 1089 | DATA _PADDING_4<>+72(SB)/8, $0x3956c25b3956c25b 1090 | DATA _PADDING_4<>+80(SB)/8, $0x59f111f159f111f1 1091 | DATA _PADDING_4<>+88(SB)/8, $0x59f111f159f111f1 1092 | DATA _PADDING_4<>+96(SB)/8, $0x923f82a4923f82a4 1093 | DATA _PADDING_4<>+104(SB)/8, $0x923f82a4923f82a4 1094 | DATA _PADDING_4<>+112(SB)/8, $0xab1c5ed5ab1c5ed5 1095 | DATA _PADDING_4<>+120(SB)/8, $0xab1c5ed5ab1c5ed5 1096 | DATA _PADDING_4<>+128(SB)/8, $0xd807aa98d807aa98 1097 | DATA _PADDING_4<>+136(SB)/8, $0xd807aa98d807aa98 1098 | DATA _PADDING_4<>+144(SB)/8, $0x12835b0112835b01 1099 | DATA _PADDING_4<>+152(SB)/8, $0x12835b0112835b01 1100 | DATA _PADDING_4<>+160(SB)/8, $0x243185be243185be 1101 | DATA _PADDING_4<>+168(SB)/8, $0x243185be243185be 1102 | DATA _PADDING_4<>+176(SB)/8, $0x550c7dc3550c7dc3 1103 | DATA _PADDING_4<>+184(SB)/8, $0x550c7dc3550c7dc3 1104 | DATA _PADDING_4<>+192(SB)/8, $0x72be5d7472be5d74 1105 | DATA _PADDING_4<>+200(SB)/8, $0x72be5d7472be5d74 1106 | DATA _PADDING_4<>+208(SB)/8, $0x80deb1fe80deb1fe 1107 | DATA _PADDING_4<>+216(SB)/8, $0x80deb1fe80deb1fe 1108 | DATA _PADDING_4<>+224(SB)/8, $0x9bdc06a79bdc06a7 1109 | DATA _PADDING_4<>+232(SB)/8, $0x9bdc06a79bdc06a7 1110 | DATA _PADDING_4<>+240(SB)/8, $0xc19bf374c19bf374 1111 | DATA _PADDING_4<>+248(SB)/8, $0xc19bf374c19bf374 1112 | DATA _PADDING_4<>+256(SB)/8, $0x649b69c1649b69c1 1113 | DATA _PADDING_4<>+264(SB)/8, $0x649b69c1649b69c1 1114 | DATA _PADDING_4<>+272(SB)/8, $0xf0fe4786f0fe4786 1115 | DATA _PADDING_4<>+280(SB)/8, $0xf0fe4786f0fe4786 1116 | DATA _PADDING_4<>+288(SB)/8, $0x0fe1edc60fe1edc6 1117 | DATA _PADDING_4<>+296(SB)/8, $0x0fe1edc60fe1edc6 1118 | DATA _PADDING_4<>+304(SB)/8, $0x240cf254240cf254 1119 | DATA _PADDING_4<>+312(SB)/8, $0x240cf254240cf254 1120 | DATA _PADDING_4<>+320(SB)/8, $0x4fe9346f4fe9346f 1121 | DATA _PADDING_4<>+328(SB)/8, $0x4fe9346f4fe9346f 1122 | DATA _PADDING_4<>+336(SB)/8, $0x6cc984be6cc984be 
1123 | DATA _PADDING_4<>+344(SB)/8, $0x6cc984be6cc984be 1124 | DATA _PADDING_4<>+352(SB)/8, $0x61b9411e61b9411e 1125 | DATA _PADDING_4<>+360(SB)/8, $0x61b9411e61b9411e 1126 | DATA _PADDING_4<>+368(SB)/8, $0x16f988fa16f988fa 1127 | DATA _PADDING_4<>+376(SB)/8, $0x16f988fa16f988fa 1128 | DATA _PADDING_4<>+384(SB)/8, $0xf2c65152f2c65152 1129 | DATA _PADDING_4<>+392(SB)/8, $0xf2c65152f2c65152 1130 | DATA _PADDING_4<>+400(SB)/8, $0xa88e5a6da88e5a6d 1131 | DATA _PADDING_4<>+408(SB)/8, $0xa88e5a6da88e5a6d 1132 | DATA _PADDING_4<>+416(SB)/8, $0xb019fc65b019fc65 1133 | DATA _PADDING_4<>+424(SB)/8, $0xb019fc65b019fc65 1134 | DATA _PADDING_4<>+432(SB)/8, $0xb9d99ec7b9d99ec7 1135 | DATA _PADDING_4<>+440(SB)/8, $0xb9d99ec7b9d99ec7 1136 | DATA _PADDING_4<>+448(SB)/8, $0x9a1231c39a1231c3 1137 | DATA _PADDING_4<>+456(SB)/8, $0x9a1231c39a1231c3 1138 | DATA _PADDING_4<>+464(SB)/8, $0xe70eeaa0e70eeaa0 1139 | DATA _PADDING_4<>+472(SB)/8, $0xe70eeaa0e70eeaa0 1140 | DATA _PADDING_4<>+480(SB)/8, $0xfdb1232bfdb1232b 1141 | DATA _PADDING_4<>+488(SB)/8, $0xfdb1232bfdb1232b 1142 | DATA _PADDING_4<>+496(SB)/8, $0xc7353eb0c7353eb0 1143 | DATA _PADDING_4<>+504(SB)/8, $0xc7353eb0c7353eb0 1144 | DATA _PADDING_4<>+512(SB)/8, $0x3069bad53069bad5 1145 | DATA _PADDING_4<>+520(SB)/8, $0x3069bad53069bad5 1146 | DATA _PADDING_4<>+528(SB)/8, $0xcb976d5fcb976d5f 1147 | DATA _PADDING_4<>+536(SB)/8, $0xcb976d5fcb976d5f 1148 | DATA _PADDING_4<>+544(SB)/8, $0x5a0f118f5a0f118f 1149 | DATA _PADDING_4<>+552(SB)/8, $0x5a0f118f5a0f118f 1150 | DATA _PADDING_4<>+560(SB)/8, $0xdc1eeefddc1eeefd 1151 | DATA _PADDING_4<>+568(SB)/8, $0xdc1eeefddc1eeefd 1152 | DATA _PADDING_4<>+576(SB)/8, $0x0a35b6890a35b689 1153 | DATA _PADDING_4<>+584(SB)/8, $0x0a35b6890a35b689 1154 | DATA _PADDING_4<>+592(SB)/8, $0xde0b7a04de0b7a04 1155 | DATA _PADDING_4<>+600(SB)/8, $0xde0b7a04de0b7a04 1156 | DATA _PADDING_4<>+608(SB)/8, $0x58f4ca9d58f4ca9d 1157 | DATA _PADDING_4<>+616(SB)/8, $0x58f4ca9d58f4ca9d 1158 | DATA _PADDING_4<>+624(SB)/8, 
$0xe15d5b16e15d5b16 1159 | DATA _PADDING_4<>+632(SB)/8, $0xe15d5b16e15d5b16 1160 | DATA _PADDING_4<>+640(SB)/8, $0x007f3e86007f3e86 1161 | DATA _PADDING_4<>+648(SB)/8, $0x007f3e86007f3e86 1162 | DATA _PADDING_4<>+656(SB)/8, $0x3708898037088980 1163 | DATA _PADDING_4<>+664(SB)/8, $0x3708898037088980 1164 | DATA _PADDING_4<>+672(SB)/8, $0xa507ea32a507ea32 1165 | DATA _PADDING_4<>+680(SB)/8, $0xa507ea32a507ea32 1166 | DATA _PADDING_4<>+688(SB)/8, $0x6fab95376fab9537 1167 | DATA _PADDING_4<>+696(SB)/8, $0x6fab95376fab9537 1168 | DATA _PADDING_4<>+704(SB)/8, $0x1740611017406110 1169 | DATA _PADDING_4<>+712(SB)/8, $0x1740611017406110 1170 | DATA _PADDING_4<>+720(SB)/8, $0x0d8cd6f10d8cd6f1 1171 | DATA _PADDING_4<>+728(SB)/8, $0x0d8cd6f10d8cd6f1 1172 | DATA _PADDING_4<>+736(SB)/8, $0xcdaa3b6dcdaa3b6d 1173 | DATA _PADDING_4<>+744(SB)/8, $0xcdaa3b6dcdaa3b6d 1174 | DATA _PADDING_4<>+752(SB)/8, $0xc0bbbe37c0bbbe37 1175 | DATA _PADDING_4<>+760(SB)/8, $0xc0bbbe37c0bbbe37 1176 | DATA _PADDING_4<>+768(SB)/8, $0x83613bda83613bda 1177 | DATA _PADDING_4<>+776(SB)/8, $0x83613bda83613bda 1178 | DATA _PADDING_4<>+784(SB)/8, $0xdb48a363db48a363 1179 | DATA _PADDING_4<>+792(SB)/8, $0xdb48a363db48a363 1180 | DATA _PADDING_4<>+800(SB)/8, $0x0b02e9310b02e931 1181 | DATA _PADDING_4<>+808(SB)/8, $0x0b02e9310b02e931 1182 | DATA _PADDING_4<>+816(SB)/8, $0x6fd15ca76fd15ca7 1183 | DATA _PADDING_4<>+824(SB)/8, $0x6fd15ca76fd15ca7 1184 | DATA _PADDING_4<>+832(SB)/8, $0x521afaca521afaca 1185 | DATA _PADDING_4<>+840(SB)/8, $0x521afaca521afaca 1186 | DATA _PADDING_4<>+848(SB)/8, $0x3133843131338431 1187 | DATA _PADDING_4<>+856(SB)/8, $0x3133843131338431 1188 | DATA _PADDING_4<>+864(SB)/8, $0x6ed41a956ed41a95 1189 | DATA _PADDING_4<>+872(SB)/8, $0x6ed41a956ed41a95 1190 | DATA _PADDING_4<>+880(SB)/8, $0x6d4378906d437890 1191 | DATA _PADDING_4<>+888(SB)/8, $0x6d4378906d437890 1192 | DATA _PADDING_4<>+896(SB)/8, $0xc39c91f2c39c91f2 1193 | DATA _PADDING_4<>+904(SB)/8, $0xc39c91f2c39c91f2 1194 | DATA 
_PADDING_4<>+912(SB)/8, $0x9eccabbd9eccabbd 1195 | DATA _PADDING_4<>+920(SB)/8, $0x9eccabbd9eccabbd 1196 | DATA _PADDING_4<>+928(SB)/8, $0xb5c9a0e6b5c9a0e6 1197 | DATA _PADDING_4<>+936(SB)/8, $0xb5c9a0e6b5c9a0e6 1198 | DATA _PADDING_4<>+944(SB)/8, $0x532fb63c532fb63c 1199 | DATA _PADDING_4<>+952(SB)/8, $0x532fb63c532fb63c 1200 | DATA _PADDING_4<>+960(SB)/8, $0xd2c741c6d2c741c6 1201 | DATA _PADDING_4<>+968(SB)/8, $0xd2c741c6d2c741c6 1202 | DATA _PADDING_4<>+976(SB)/8, $0x07237ea307237ea3 1203 | DATA _PADDING_4<>+984(SB)/8, $0x07237ea307237ea3 1204 | DATA _PADDING_4<>+992(SB)/8, $0xa4954b68a4954b68 1205 | DATA _PADDING_4<>+1000(SB)/8, $0xa4954b68a4954b68 1206 | DATA _PADDING_4<>+1008(SB)/8, $0x4c191d764c191d76 1207 | DATA _PADDING_4<>+1016(SB)/8, $0x4c191d764c191d76 1208 | GLOBL _PADDING_4<>(SB),(NOPTR+RODATA),$1024 1209 | 1210 | DATA _K256_4<>+0(SB)/8, $0x428a2f98428a2f98 1211 | DATA _K256_4<>+8(SB)/8, $0x428a2f98428a2f98 1212 | DATA _K256_4<>+16(SB)/8, $0x7137449171374491 1213 | DATA _K256_4<>+24(SB)/8, $0x7137449171374491 1214 | DATA _K256_4<>+32(SB)/8, $0xb5c0fbcfb5c0fbcf 1215 | DATA _K256_4<>+40(SB)/8, $0xb5c0fbcfb5c0fbcf 1216 | DATA _K256_4<>+48(SB)/8, $0xe9b5dba5e9b5dba5 1217 | DATA _K256_4<>+56(SB)/8, $0xe9b5dba5e9b5dba5 1218 | DATA _K256_4<>+64(SB)/8, $0x3956c25b3956c25b 1219 | DATA _K256_4<>+72(SB)/8, $0x3956c25b3956c25b 1220 | DATA _K256_4<>+80(SB)/8, $0x59f111f159f111f1 1221 | DATA _K256_4<>+88(SB)/8, $0x59f111f159f111f1 1222 | DATA _K256_4<>+96(SB)/8, $0x923f82a4923f82a4 1223 | DATA _K256_4<>+104(SB)/8, $0x923f82a4923f82a4 1224 | DATA _K256_4<>+112(SB)/8, $0xab1c5ed5ab1c5ed5 1225 | DATA _K256_4<>+120(SB)/8, $0xab1c5ed5ab1c5ed5 1226 | DATA _K256_4<>+128(SB)/8, $0xd807aa98d807aa98 1227 | DATA _K256_4<>+136(SB)/8, $0xd807aa98d807aa98 1228 | DATA _K256_4<>+144(SB)/8, $0x12835b0112835b01 1229 | DATA _K256_4<>+152(SB)/8, $0x12835b0112835b01 1230 | DATA _K256_4<>+160(SB)/8, $0x243185be243185be 1231 | DATA _K256_4<>+168(SB)/8, $0x243185be243185be 1232 | DATA 
_K256_4<>+176(SB)/8, $0x550c7dc3550c7dc3 1233 | DATA _K256_4<>+184(SB)/8, $0x550c7dc3550c7dc3 1234 | DATA _K256_4<>+192(SB)/8, $0x72be5d7472be5d74 1235 | DATA _K256_4<>+200(SB)/8, $0x72be5d7472be5d74 1236 | DATA _K256_4<>+208(SB)/8, $0x80deb1fe80deb1fe 1237 | DATA _K256_4<>+216(SB)/8, $0x80deb1fe80deb1fe 1238 | DATA _K256_4<>+224(SB)/8, $0x9bdc06a79bdc06a7 1239 | DATA _K256_4<>+232(SB)/8, $0x9bdc06a79bdc06a7 1240 | DATA _K256_4<>+240(SB)/8, $0xc19bf174c19bf174 1241 | DATA _K256_4<>+248(SB)/8, $0xc19bf174c19bf174 1242 | DATA _K256_4<>+256(SB)/8, $0xe49b69c1e49b69c1 1243 | DATA _K256_4<>+264(SB)/8, $0xe49b69c1e49b69c1 1244 | DATA _K256_4<>+272(SB)/8, $0xefbe4786efbe4786 1245 | DATA _K256_4<>+280(SB)/8, $0xefbe4786efbe4786 1246 | DATA _K256_4<>+288(SB)/8, $0x0fc19dc60fc19dc6 1247 | DATA _K256_4<>+296(SB)/8, $0x0fc19dc60fc19dc6 1248 | DATA _K256_4<>+304(SB)/8, $0x240ca1cc240ca1cc 1249 | DATA _K256_4<>+312(SB)/8, $0x240ca1cc240ca1cc 1250 | DATA _K256_4<>+320(SB)/8, $0x2de92c6f2de92c6f 1251 | DATA _K256_4<>+328(SB)/8, $0x2de92c6f2de92c6f 1252 | DATA _K256_4<>+336(SB)/8, $0x4a7484aa4a7484aa 1253 | DATA _K256_4<>+344(SB)/8, $0x4a7484aa4a7484aa 1254 | DATA _K256_4<>+352(SB)/8, $0x5cb0a9dc5cb0a9dc 1255 | DATA _K256_4<>+360(SB)/8, $0x5cb0a9dc5cb0a9dc 1256 | DATA _K256_4<>+368(SB)/8, $0x76f988da76f988da 1257 | DATA _K256_4<>+376(SB)/8, $0x76f988da76f988da 1258 | DATA _K256_4<>+384(SB)/8, $0x983e5152983e5152 1259 | DATA _K256_4<>+392(SB)/8, $0x983e5152983e5152 1260 | DATA _K256_4<>+400(SB)/8, $0xa831c66da831c66d 1261 | DATA _K256_4<>+408(SB)/8, $0xa831c66da831c66d 1262 | DATA _K256_4<>+416(SB)/8, $0xb00327c8b00327c8 1263 | DATA _K256_4<>+424(SB)/8, $0xb00327c8b00327c8 1264 | DATA _K256_4<>+432(SB)/8, $0xbf597fc7bf597fc7 1265 | DATA _K256_4<>+440(SB)/8, $0xbf597fc7bf597fc7 1266 | DATA _K256_4<>+448(SB)/8, $0xc6e00bf3c6e00bf3 1267 | DATA _K256_4<>+456(SB)/8, $0xc6e00bf3c6e00bf3 1268 | DATA _K256_4<>+464(SB)/8, $0xd5a79147d5a79147 1269 | DATA _K256_4<>+472(SB)/8, 
$0xd5a79147d5a79147 1270 | DATA _K256_4<>+480(SB)/8, $0x06ca635106ca6351 1271 | DATA _K256_4<>+488(SB)/8, $0x06ca635106ca6351 1272 | DATA _K256_4<>+496(SB)/8, $0x1429296714292967 1273 | DATA _K256_4<>+504(SB)/8, $0x1429296714292967 1274 | DATA _K256_4<>+512(SB)/8, $0x27b70a8527b70a85 1275 | DATA _K256_4<>+520(SB)/8, $0x27b70a8527b70a85 1276 | DATA _K256_4<>+528(SB)/8, $0x2e1b21382e1b2138 1277 | DATA _K256_4<>+536(SB)/8, $0x2e1b21382e1b2138 1278 | DATA _K256_4<>+544(SB)/8, $0x4d2c6dfc4d2c6dfc 1279 | DATA _K256_4<>+552(SB)/8, $0x4d2c6dfc4d2c6dfc 1280 | DATA _K256_4<>+560(SB)/8, $0x53380d1353380d13 1281 | DATA _K256_4<>+568(SB)/8, $0x53380d1353380d13 1282 | DATA _K256_4<>+576(SB)/8, $0x650a7354650a7354 1283 | DATA _K256_4<>+584(SB)/8, $0x650a7354650a7354 1284 | DATA _K256_4<>+592(SB)/8, $0x766a0abb766a0abb 1285 | DATA _K256_4<>+600(SB)/8, $0x766a0abb766a0abb 1286 | DATA _K256_4<>+608(SB)/8, $0x81c2c92e81c2c92e 1287 | DATA _K256_4<>+616(SB)/8, $0x81c2c92e81c2c92e 1288 | DATA _K256_4<>+624(SB)/8, $0x92722c8592722c85 1289 | DATA _K256_4<>+632(SB)/8, $0x92722c8592722c85 1290 | DATA _K256_4<>+640(SB)/8, $0xa2bfe8a1a2bfe8a1 1291 | DATA _K256_4<>+648(SB)/8, $0xa2bfe8a1a2bfe8a1 1292 | DATA _K256_4<>+656(SB)/8, $0xa81a664ba81a664b 1293 | DATA _K256_4<>+664(SB)/8, $0xa81a664ba81a664b 1294 | DATA _K256_4<>+672(SB)/8, $0xc24b8b70c24b8b70 1295 | DATA _K256_4<>+680(SB)/8, $0xc24b8b70c24b8b70 1296 | DATA _K256_4<>+688(SB)/8, $0xc76c51a3c76c51a3 1297 | DATA _K256_4<>+696(SB)/8, $0xc76c51a3c76c51a3 1298 | DATA _K256_4<>+704(SB)/8, $0xd192e819d192e819 1299 | DATA _K256_4<>+712(SB)/8, $0xd192e819d192e819 1300 | DATA _K256_4<>+720(SB)/8, $0xd6990624d6990624 1301 | DATA _K256_4<>+728(SB)/8, $0xd6990624d6990624 1302 | DATA _K256_4<>+736(SB)/8, $0xf40e3585f40e3585 1303 | DATA _K256_4<>+744(SB)/8, $0xf40e3585f40e3585 1304 | DATA _K256_4<>+752(SB)/8, $0x106aa070106aa070 1305 | DATA _K256_4<>+760(SB)/8, $0x106aa070106aa070 1306 | DATA _K256_4<>+768(SB)/8, $0x19a4c11619a4c116 1307 | DATA 
_K256_4<>+776(SB)/8, $0x19a4c11619a4c116 1308 | DATA _K256_4<>+784(SB)/8, $0x1e376c081e376c08 1309 | DATA _K256_4<>+792(SB)/8, $0x1e376c081e376c08 1310 | DATA _K256_4<>+800(SB)/8, $0x2748774c2748774c 1311 | DATA _K256_4<>+808(SB)/8, $0x2748774c2748774c 1312 | DATA _K256_4<>+816(SB)/8, $0x34b0bcb534b0bcb5 1313 | DATA _K256_4<>+824(SB)/8, $0x34b0bcb534b0bcb5 1314 | DATA _K256_4<>+832(SB)/8, $0x391c0cb3391c0cb3 1315 | DATA _K256_4<>+840(SB)/8, $0x391c0cb3391c0cb3 1316 | DATA _K256_4<>+848(SB)/8, $0x4ed8aa4a4ed8aa4a 1317 | DATA _K256_4<>+856(SB)/8, $0x4ed8aa4a4ed8aa4a 1318 | DATA _K256_4<>+864(SB)/8, $0x5b9cca4f5b9cca4f 1319 | DATA _K256_4<>+872(SB)/8, $0x5b9cca4f5b9cca4f 1320 | DATA _K256_4<>+880(SB)/8, $0x682e6ff3682e6ff3 1321 | DATA _K256_4<>+888(SB)/8, $0x682e6ff3682e6ff3 1322 | DATA _K256_4<>+896(SB)/8, $0x748f82ee748f82ee 1323 | DATA _K256_4<>+904(SB)/8, $0x748f82ee748f82ee 1324 | DATA _K256_4<>+912(SB)/8, $0x78a5636f78a5636f 1325 | DATA _K256_4<>+920(SB)/8, $0x78a5636f78a5636f 1326 | DATA _K256_4<>+928(SB)/8, $0x84c8781484c87814 1327 | DATA _K256_4<>+936(SB)/8, $0x84c8781484c87814 1328 | DATA _K256_4<>+944(SB)/8, $0x8cc702088cc70208 1329 | DATA _K256_4<>+952(SB)/8, $0x8cc702088cc70208 1330 | DATA _K256_4<>+960(SB)/8, $0x90befffa90befffa 1331 | DATA _K256_4<>+968(SB)/8, $0x90befffa90befffa 1332 | DATA _K256_4<>+976(SB)/8, $0xa4506ceba4506ceb 1333 | DATA _K256_4<>+984(SB)/8, $0xa4506ceba4506ceb 1334 | DATA _K256_4<>+992(SB)/8, $0xbef9a3f7bef9a3f7 1335 | DATA _K256_4<>+1000(SB)/8, $0xbef9a3f7bef9a3f7 1336 | DATA _K256_4<>+1008(SB)/8, $0xc67178f2c67178f2 1337 | DATA _K256_4<>+1016(SB)/8, $0xc67178f2c67178f2 1338 | GLOBL _K256_4<>(SB),(NOPTR+RODATA),$1024 1339 | -------------------------------------------------------------------------------- /hash_fuzz_test.go: -------------------------------------------------------------------------------- 1 | //go:build go1.18 2 | // +build go1.18 3 | 4 | package gohashtree_test 5 | 6 | import ( 7 | "testing" 8 | 9 | 
"github.com/prysmaticlabs/gohashtree" 10 | ) 11 | 12 | func convertRawChunks(raw []byte) [][32]byte { 13 | var chunks [][32]byte 14 | for i := 32; i <= len(raw); i += 32 { 15 | var c [32]byte 16 | copy(c[:], raw[i-32:i]) 17 | chunks = append(chunks, c) 18 | } 19 | return chunks 20 | } 21 | 22 | func FuzzHash(f *testing.F) { 23 | for i := 1; i <= 10; i++ { 24 | f.Add(make([]byte, 64*i)) 25 | } 26 | f.Fuzz(func(t *testing.T, chunksRaw []byte) { 27 | if len(chunksRaw) < 64 || len(chunksRaw)%64 != 0 { 28 | return // No chunks and odd number of chunks are invalid 29 | } 30 | chunks := convertRawChunks(chunksRaw) 31 | digests := make([][32]byte, len(chunks)/2) 32 | if err := gohashtree.Hash(digests, chunks); err != nil { 33 | t.Fatal(err) 34 | } 35 | }) 36 | } 37 | 38 | func FuzzHash_Differential_Minio(f *testing.F) { 39 | for i := uint(0); i < 128; i++ { 40 | d := make([]byte, 64) 41 | for j := 0; j < 64; j++ { 42 | d[j] = byte(i) 43 | } 44 | f.Add(d) 45 | } 46 | f.Fuzz(func(t *testing.T, chunksRaw []byte) { 47 | if len(chunksRaw) < 64 || len(chunksRaw)%64 != 0 { 48 | return // No chunks and odd number of chunks are invalid 49 | } 50 | chunks := convertRawChunks(chunksRaw) 51 | digests := make([][32]byte, len(chunks)/2) 52 | if err := gohashtree.Hash(digests, chunks); err != nil { 53 | t.Fatal(err) 54 | } 55 | for i := 64; i <= len(chunksRaw); i += 64 { 56 | a := OldHash(chunksRaw[i-64 : i]) 57 | b := digests[(i/64)-1] 58 | if a != b { 59 | t.Error("minio.Hash() != gohashtree.Hash()") 60 | } 61 | } 62 | }) 63 | } 64 | -------------------------------------------------------------------------------- /hash_test.go: -------------------------------------------------------------------------------- 1 | /* 2 | MIT License 3 | 4 | # Copyright (c) 2021 Prysmatic Labs 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, 
including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 23 | */ 24 | package gohashtree_test 25 | 26 | import ( 27 | "errors" 28 | "reflect" 29 | "testing" 30 | 31 | "github.com/minio/sha256-simd" 32 | "github.com/prysmaticlabs/gohashtree" 33 | ) 34 | 35 | var _test_32_block = [][32]byte{ 36 | {0x7a, 0xee, 0xd5, 0xc9, 0x66, 0x17, 0x59, 0x7f, 0x89, 0xd6, 0xd9, 0xe8, 0xa8, 0xa7, 0x01, 0x47, 0x60, 0xc6, 0x88, 0xfd, 0x2a, 0x7a, 0xf6, 0x1d, 0x10, 0x20, 0x62, 0x7e, 0x7c, 0xd0, 0x1a, 0x0b}, 37 | {0xd4, 0x1f, 0xa7, 0x89, 0x8c, 0xf9, 0x05, 0xfc, 0x1e, 0xb0, 0x04, 0xd7, 0xaa, 0x56, 0x35, 0xec, 0x36, 0xf5, 0x0d, 0x41, 0x75, 0x64, 0x34, 0x71, 0xf0, 0x3b, 0x5b, 0xb2, 0xcc, 0xfa, 0x8c, 0xca}, 38 | {0xf8, 0xd9, 0x9e, 0xa7, 0x9c, 0xa1, 0xe0, 0x3a, 0x19, 0x4f, 0xd3, 0x2d, 0xbd, 0x40, 0x3a, 0xa3, 0x28, 0xe8, 0xa4, 0x27, 0x58, 0x44, 0x12, 0xf7, 0x69, 0x01, 0x66, 0xfa, 0xf1, 0x97, 0x30, 0xfe}, 39 | {0x99, 0x7c, 0x24, 0x0e, 0xed, 0x31, 0x0a, 0xda, 0x12, 0x16, 0x0e, 0x06, 0x44, 0xb8, 0x3f, 0xa2, 0x40, 0x52, 0xbc, 0x2d, 0xaf, 0x97, 0x00, 0x01, 0x5d, 0xbb, 0x0d, 0x06, 0x66, 0xb1, 0x59, 0xf2}, 40 | {0x99, 0x43, 0x52, 0x77, 0x28, 0x39, 0x6b, 0xeb, 0x03, 0x51, 0xc4, 
0x5f, 0x7d, 0xd3, 0xe1, 0x41, 0x17, 0x66, 0x7b, 0x0e, 0xc9, 0x51, 0x01, 0xa7, 0x39, 0xf3, 0xc8, 0x63, 0x95, 0xa5, 0x92, 0x6b}, 41 | {0xce, 0x6e, 0xab, 0xd2, 0xe8, 0xad, 0x90, 0xad, 0xbe, 0xe5, 0x94, 0x96, 0xa9, 0x98, 0xe7, 0x83, 0x07, 0xa4, 0x0f, 0x8e, 0xe5, 0xb3, 0x5a, 0x05, 0xcd, 0xfd, 0xae, 0x9c, 0x07, 0xad, 0x26, 0xaa}, 42 | {0xf5, 0xee, 0x66, 0x87, 0x00, 0xed, 0xeb, 0x8b, 0xc2, 0x7d, 0x97, 0x52, 0x2d, 0xfc, 0x0a, 0x2a, 0x32, 0x0e, 0x92, 0xd2, 0x91, 0xd1, 0x69, 0x29, 0x9d, 0xb1, 0x3a, 0x65, 0x9f, 0x8e, 0x7e, 0x2a}, 43 | {0x88, 0x4a, 0xc8, 0x81, 0xdb, 0xa6, 0x79, 0x36, 0x54, 0xe9, 0x15, 0x5c, 0xff, 0x06, 0x35, 0x8b, 0x6e, 0x0d, 0xaa, 0x3e, 0x7a, 0x82, 0x7c, 0x4a, 0xfe, 0x8a, 0x91, 0xb4, 0x34, 0xed, 0xe3, 0x17}, 44 | {0xe7, 0x92, 0xa4, 0x91, 0xdc, 0x1d, 0x83, 0xc8, 0x72, 0x5a, 0xd1, 0x27, 0x17, 0x78, 0x2b, 0xc7, 0x67, 0xe9, 0x56, 0xf2, 0xb4, 0x37, 0x51, 0xa1, 0x6b, 0x23, 0x8c, 0xc9, 0x03, 0x3d, 0x90, 0x1e}, 45 | {0xc4, 0x1f, 0xcc, 0x5e, 0xcb, 0x5e, 0x7d, 0x02, 0x12, 0x3f, 0x15, 0x9f, 0x35, 0xf4, 0x49, 0x55, 0xba, 0xc6, 0x47, 0xd2, 0x85, 0x85, 0x61, 0x69, 0xa5, 0x60, 0x7a, 0x32, 0x7f, 0x8e, 0x09, 0x5f}, 46 | {0x60, 0xb6, 0xab, 0xb5, 0x6b, 0x4d, 0xce, 0x6f, 0x1d, 0x77, 0x2e, 0x9b, 0x0d, 0x60, 0x76, 0xe3, 0xcb, 0x79, 0xbc, 0x40, 0x2d, 0x16, 0xf6, 0xa3, 0x06, 0x12, 0x36, 0x71, 0xda, 0xfd, 0x28, 0x89}, 47 | {0x67, 0xdd, 0x7f, 0x26, 0x6d, 0x2e, 0xf3, 0xef, 0x13, 0xb6, 0x09, 0x73, 0x82, 0xbc, 0x73, 0x25, 0x83, 0xc0, 0x34, 0x90, 0xe8, 0xad, 0xf0, 0x17, 0x8d, 0xed, 0xad, 0x29, 0xf7, 0x78, 0x9c, 0x28}, 48 | {0x00, 0xb0, 0xd5, 0xd0, 0x8e, 0x9b, 0xe5, 0xf0, 0x46, 0x8e, 0x60, 0x25, 0x95, 0xe5, 0x3a, 0x46, 0xb1, 0x07, 0x74, 0x97, 0xed, 0x0a, 0x2f, 0x9a, 0x3f, 0xf3, 0x94, 0x2f, 0xb3, 0x12, 0xa1, 0x91}, 49 | {0x8d, 0x36, 0x16, 0xc6, 0x00, 0x88, 0xd6, 0x69, 0xb4, 0x5a, 0x71, 0x18, 0x41, 0xe5, 0x4d, 0xb2, 0xd9, 0x00, 0x7a, 0x17, 0x63, 0x6a, 0x9b, 0x2e, 0x22, 0x12, 0x5b, 0xa3, 0x74, 0x7c, 0x95, 0xc9}, 50 | {0x4e, 0xfc, 0x5c, 0x18, 0xd1, 0x8a, 0x5b, 0x57, 0x7c, 0x86, 0x3e, 0xe2, 
0x75, 0x91, 0xf2, 0xb3, 0x5f, 0xd0, 0x92, 0xbc, 0x77, 0xbe, 0x1b, 0xef, 0x1a, 0x7c, 0xe2, 0xd8, 0x8d, 0x7b, 0xef, 0xf7}, 51 | {0xb7, 0x80, 0xc2, 0x31, 0xe6, 0x75, 0x0c, 0xad, 0x0f, 0xe8, 0xed, 0x59, 0x34, 0xdb, 0xfb, 0x41, 0xd4, 0x38, 0x73, 0x7a, 0x47, 0x01, 0xb8, 0xea, 0xea, 0x2e, 0x01, 0x8e, 0x4f, 0x09, 0x64, 0x82}, 52 | {0x99, 0x43, 0x52, 0x77, 0x28, 0x39, 0x6b, 0xeb, 0x03, 0x51, 0xc4, 0x5f, 0x7d, 0xd3, 0xe1, 0x41, 0x17, 0x66, 0x7b, 0x0e, 0xc9, 0x51, 0x01, 0xa7, 0x39, 0xf3, 0xc8, 0x63, 0x95, 0xa5, 0x92, 0x6b}, 53 | {0xce, 0x6e, 0xab, 0xd2, 0xe8, 0xad, 0x90, 0xad, 0xbe, 0xe5, 0x94, 0x96, 0xa9, 0x98, 0xe7, 0x83, 0x07, 0xa4, 0x0f, 0x8e, 0xe5, 0xb3, 0x5a, 0x05, 0xcd, 0xfd, 0xae, 0x9c, 0x07, 0xad, 0x26, 0xaa}, 54 | {0xf5, 0xee, 0x66, 0x87, 0x00, 0xed, 0xeb, 0x8b, 0xc2, 0x7d, 0x97, 0x52, 0x2d, 0xfc, 0x0a, 0x2a, 0x32, 0x0e, 0x92, 0xd2, 0x91, 0xd1, 0x69, 0x29, 0x9d, 0xb1, 0x3a, 0x65, 0x9f, 0x8e, 0x7e, 0x2a}, 55 | {0x88, 0x4a, 0xc8, 0x81, 0xdb, 0xa6, 0x79, 0x36, 0x54, 0xe9, 0x15, 0x5c, 0xff, 0x06, 0x35, 0x8b, 0x6e, 0x0d, 0xaa, 0x3e, 0x7a, 0x82, 0x7c, 0x4a, 0xfe, 0x8a, 0x91, 0xb4, 0x34, 0xed, 0xe3, 0x17}, 56 | {0xe7, 0x92, 0xa4, 0x91, 0xdc, 0x1d, 0x83, 0xc8, 0x72, 0x5a, 0xd1, 0x27, 0x17, 0x78, 0x2b, 0xc7, 0x67, 0xe9, 0x56, 0xf2, 0xb4, 0x37, 0x51, 0xa1, 0x6b, 0x23, 0x8c, 0xc9, 0x03, 0x3d, 0x90, 0x1e}, 57 | {0xc4, 0x1f, 0xcc, 0x5e, 0xcb, 0x5e, 0x7d, 0x02, 0x12, 0x3f, 0x15, 0x9f, 0x35, 0xf4, 0x49, 0x55, 0xba, 0xc6, 0x47, 0xd2, 0x85, 0x85, 0x61, 0x69, 0xa5, 0x60, 0x7a, 0x32, 0x7f, 0x8e, 0x09, 0x5f}, 58 | {0x60, 0xb6, 0xab, 0xb5, 0x6b, 0x4d, 0xce, 0x6f, 0x1d, 0x77, 0x2e, 0x9b, 0x0d, 0x60, 0x76, 0xe3, 0xcb, 0x79, 0xbc, 0x40, 0x2d, 0x16, 0xf6, 0xa3, 0x06, 0x12, 0x36, 0x71, 0xda, 0xfd, 0x28, 0x89}, 59 | {0x67, 0xdd, 0x7f, 0x26, 0x6d, 0x2e, 0xf3, 0xef, 0x13, 0xb6, 0x09, 0x73, 0x82, 0xbc, 0x73, 0x25, 0x83, 0xc0, 0x34, 0x90, 0xe8, 0xad, 0xf0, 0x17, 0x8d, 0xed, 0xad, 0x29, 0xf7, 0x78, 0x9c, 0x28}, 60 | {0x00, 0xb0, 0xd5, 0xd0, 0x8e, 0x9b, 0xe5, 0xf0, 0x46, 0x8e, 0x60, 0x25, 0x95, 
0xe5, 0x3a, 0x46, 0xb1, 0x07, 0x74, 0x97, 0xed, 0x0a, 0x2f, 0x9a, 0x3f, 0xf3, 0x94, 0x2f, 0xb3, 0x12, 0xa1, 0x91}, 61 | {0x8d, 0x36, 0x16, 0xc6, 0x00, 0x88, 0xd6, 0x69, 0xb4, 0x5a, 0x71, 0x18, 0x41, 0xe5, 0x4d, 0xb2, 0xd9, 0x00, 0x7a, 0x17, 0x63, 0x6a, 0x9b, 0x2e, 0x22, 0x12, 0x5b, 0xa3, 0x74, 0x7c, 0x95, 0xc9}, 62 | {0x4e, 0xfc, 0x5c, 0x18, 0xd1, 0x8a, 0x5b, 0x57, 0x7c, 0x86, 0x3e, 0xe2, 0x75, 0x91, 0xf2, 0xb3, 0x5f, 0xd0, 0x92, 0xbc, 0x77, 0xbe, 0x1b, 0xef, 0x1a, 0x7c, 0xe2, 0xd8, 0x8d, 0x7b, 0xef, 0xf7}, 63 | {0xcd, 0x78, 0x15, 0x64, 0x2c, 0x78, 0x57, 0x74, 0x2b, 0xb7, 0xdb, 0x74, 0xe2, 0xab, 0x82, 0xbb, 0x61, 0x32, 0x3e, 0xe4, 0xb1, 0x00, 0xde, 0xb2, 0x35, 0x1e, 0x3e, 0x1c, 0x91, 0x9d, 0x87, 0xde}, 64 | {0x17, 0xcc, 0x52, 0x5c, 0x60, 0x9e, 0xd8, 0xd4, 0xf4, 0x56, 0x28, 0x16, 0xde, 0xde, 0x73, 0xfe, 0xd9, 0x92, 0xb7, 0x99, 0x15, 0x24, 0x1b, 0x40, 0xb0, 0xda, 0x9a, 0xf8, 0x24, 0x38, 0x13, 0xbd}, 65 | {0xd0, 0x45, 0x9b, 0xe3, 0x9a, 0xae, 0x78, 0x41, 0xcd, 0x12, 0x9a, 0x6b, 0x91, 0x58, 0x29, 0x75, 0xae, 0x21, 0xd3, 0xf2, 0x5e, 0x98, 0xab, 0x09, 0xb0, 0xaa, 0x62, 0x96, 0x35, 0x64, 0x18, 0x48}, 66 | {0xd2, 0x5b, 0x10, 0xf1, 0x35, 0xaa, 0x04, 0x49, 0x4e, 0x51, 0x30, 0x0d, 0xb6, 0xbf, 0xa0, 0x9b, 0xa0, 0xf5, 0x66, 0x5f, 0x28, 0xc7, 0x8d, 0xa8, 0x3e, 0x0f, 0xe4, 0xa7, 0xc9, 0xd4, 0x0f, 0x7d}, 67 | {0xb7, 0x80, 0xc2, 0x31, 0xe6, 0x75, 0x0c, 0xad, 0x0f, 0xe8, 0xed, 0x59, 0x34, 0xdb, 0xfb, 0x41, 0xd4, 0x38, 0x73, 0x7a, 0x47, 0x01, 0xb8, 0xea, 0xea, 0x2e, 0x01, 0x8e, 0x4f, 0x09, 0x64, 0x82}, 68 | {0xe4, 0x8b, 0x12, 0xd3, 0xd0, 0x78, 0xb5, 0x5f, 0x3e, 0x9d, 0x94, 0x7f, 0x93, 0x84, 0x77, 0x77, 0xdb, 0x78, 0x41, 0xe8, 0x91, 0xfb, 0x6d, 0x0d, 0xef, 0x00, 0x30, 0x8e, 0x0a, 0xe4, 0x7b, 0xec}, 69 | {0xe7, 0xb2, 0x76, 0xe7, 0x6c, 0xba, 0x8f, 0x8c, 0x0b, 0xf2, 0xa3, 0xad, 0xc2, 0x2d, 0x92, 0xb4, 0xd5, 0xf2, 0x83, 0x42, 0x65, 0x02, 0xd6, 0x67, 0x9a, 0x78, 0x6a, 0xc1, 0xca, 0x91, 0x87, 0x7c}, 70 | {0x16, 0x99, 0x13, 0xf8, 0xa9, 0x20, 0x62, 0x2e, 0xc1, 0x84, 0xc0, 0x25, 0xdc, 0x35, 
0x1f, 0xe6, 0x32, 0x49, 0x37, 0x79, 0x78, 0xfb, 0xf5, 0xf7, 0x34, 0xf4, 0xa5, 0x49, 0x9f, 0xc8, 0xfa, 0x8e}, 71 | {0x28, 0x9b, 0x27, 0xae, 0x21, 0x12, 0x14, 0x57, 0x56, 0xf6, 0x9d, 0x7f, 0x0d, 0x28, 0x03, 0xbd, 0x05, 0xd0, 0x11, 0x9e, 0xf1, 0x98, 0x8e, 0x1c, 0xbe, 0xc1, 0x83, 0xdb, 0x1a, 0x65, 0x08, 0x0d}, 72 | {0xef, 0x42, 0x3a, 0x0b, 0x2f, 0xea, 0xdf, 0xfe, 0xeb, 0xd9, 0x72, 0x9a, 0xcf, 0x5a, 0xac, 0x19, 0x09, 0x75, 0x25, 0x64, 0x61, 0x19, 0xf5, 0xcd, 0xdb, 0x9d, 0xcf, 0x4a, 0xa9, 0xf5, 0x48, 0x2c}, 73 | {0x47, 0x69, 0xaa, 0x80, 0x3f, 0xd3, 0x02, 0x67, 0xe9, 0x8b, 0x82, 0xa8, 0x02, 0xe8, 0xcf, 0x60, 0x66, 0xaa, 0xcf, 0x05, 0x0a, 0x85, 0xeb, 0x3d, 0x87, 0x21, 0xcc, 0xe2, 0xdd, 0x6c, 0x42, 0x54}, 74 | {0xd8, 0xb4, 0x39, 0x4f, 0x78, 0xce, 0xd8, 0xad, 0x57, 0xbe, 0xda, 0x18, 0x8f, 0x4a, 0x9b, 0x41, 0xfe, 0x58, 0x9d, 0xa1, 0xd4, 0x71, 0x6e, 0x2f, 0x04, 0xaf, 0x37, 0xa0, 0x29, 0x60, 0x6f, 0x9d}, 75 | {0x84, 0x4a, 0x39, 0x0a, 0x5e, 0x24, 0x81, 0x2e, 0x63, 0xc9, 0xb6, 0xde, 0xc3, 0xf1, 0x82, 0x7b, 0x82, 0x14, 0x07, 0xde, 0x46, 0x03, 0x25, 0x27, 0x4d, 0x09, 0x6b, 0x7e, 0xb9, 0x82, 0x98, 0x41}, 76 | {0x68, 0xf8, 0x98, 0x04, 0xb2, 0x61, 0x78, 0xbf, 0x8a, 0x69, 0x4d, 0xc7, 0x83, 0x4a, 0xe7, 0x77, 0xf7, 0x4b, 0x00, 0x28, 0x34, 0xe6, 0x36, 0xca, 0xa2, 0x58, 0x37, 0x61, 0x60, 0x95, 0x0d, 0xa6}, 77 | {0x20, 0x00, 0x7e, 0x29, 0xa8, 0x6e, 0xca, 0xb8, 0x1b, 0xbc, 0x94, 0x29, 0x2b, 0x18, 0xaa, 0x56, 0x0f, 0x4c, 0x38, 0x1a, 0x7a, 0x16, 0xe8, 0xbb, 0x51, 0xb7, 0xb3, 0xe3, 0x22, 0x8e, 0x9c, 0x05}, 78 | {0xa8, 0x0f, 0x08, 0x4d, 0xf1, 0xd1, 0xd8, 0x2c, 0xac, 0xe8, 0x73, 0x43, 0xcc, 0x73, 0x6b, 0x03, 0x40, 0x21, 0x85, 0x9b, 0x9d, 0x63, 0xa8, 0x44, 0x6a, 0x6c, 0x23, 0xe3, 0x4e, 0x76, 0xb1, 0x51}, 79 | {0x90, 0x61, 0x31, 0xfe, 0xf7, 0x4a, 0x8f, 0x06, 0x9e, 0x75, 0x6a, 0x5a, 0x66, 0xdd, 0xa2, 0xe4, 0x9b, 0x8f, 0x98, 0xbb, 0x18, 0x9a, 0x96, 0x84, 0xfa, 0xe4, 0x3c, 0xd2, 0x2c, 0x96, 0x61, 0xd8}, 80 | {0x96, 0xb4, 0x84, 0xa8, 0x8b, 0x6f, 0xeb, 0xc5, 0x3e, 0xa3, 0x48, 0xd5, 0x00, 0x95, 0x47, 
0xda, 0xc1, 0x2d, 0x95, 0x68, 0x49, 0x29, 0x15, 0xb9, 0x36, 0x59, 0x4c, 0x0b, 0x77, 0xdc, 0x01, 0x06}, 81 | {0x58, 0x37, 0xa7, 0x03, 0x40, 0x70, 0x91, 0xee, 0x29, 0x75, 0x10, 0xd4, 0xec, 0x01, 0x87, 0x5f, 0x2e, 0xb5, 0x56, 0xc6, 0x2d, 0xe9, 0x2b, 0xb4, 0xab, 0x95, 0x82, 0x1f, 0x11, 0xf2, 0xb8, 0xc9}, 82 | {0x81, 0xbf, 0xb0, 0x58, 0xcc, 0xdd, 0x0e, 0xf1, 0x9c, 0x17, 0x6b, 0xa0, 0xe6, 0x42, 0x8c, 0x1a, 0x3c, 0x9c, 0x20, 0x18, 0x0b, 0x52, 0x66, 0x5a, 0xc1, 0xe5, 0xc5, 0x66, 0x35, 0xe5, 0x26, 0x4f}, 83 | {0xca, 0x73, 0xe0, 0x95, 0x2c, 0xc7, 0xa9, 0x22, 0x58, 0x68, 0x49, 0xb3, 0x68, 0xdc, 0x34, 0xe1, 0x3b, 0x17, 0x67, 0xaa, 0x82, 0xa1, 0xb6, 0xbd, 0x69, 0x9b, 0xf6, 0x00, 0x71, 0x51, 0x08, 0xca}, 84 | {0xce, 0x06, 0x68, 0x95, 0x13, 0x37, 0x8b, 0x32, 0xc9, 0x62, 0x38, 0xc9, 0x78, 0x90, 0x89, 0x0e, 0x3a, 0x5d, 0x85, 0x50, 0x1c, 0x4c, 0xd6, 0x80, 0xcc, 0x5f, 0x63, 0xf0, 0xc9, 0xfe, 0x7a, 0xb5}, 85 | {0x79, 0x78, 0x8d, 0x38, 0x13, 0xdf, 0xb7, 0x37, 0x18, 0x78, 0xbd, 0x2f, 0x3e, 0xc7, 0x2c, 0x46, 0xd2, 0x74, 0x01, 0xe9, 0xa1, 0x3f, 0xfe, 0x46, 0x11, 0xb0, 0x85, 0x2f, 0x6d, 0x4b, 0x4b, 0x8e}, 86 | {0x11, 0xce, 0x55, 0xe4, 0xba, 0xf7, 0x11, 0xcd, 0xe8, 0xa8, 0x04, 0x33, 0xbd, 0x19, 0xe8, 0xbe, 0xa1, 0x00, 0xd3, 0x28, 0xca, 0x78, 0x56, 0x6d, 0xde, 0xe5, 0x71, 0x13, 0xc2, 0xbd, 0xd8, 0xc2}, 87 | {0x04, 0x64, 0xdb, 0xdb, 0x8b, 0x4f, 0x73, 0x0e, 0x0a, 0x9e, 0xfe, 0xd0, 0x5d, 0x92, 0x3e, 0xf8, 0xf4, 0x8b, 0xef, 0xb6, 0x6f, 0x42, 0xc9, 0xea, 0x73, 0xfb, 0xb6, 0x8e, 0x37, 0x74, 0xae, 0x39}, 88 | {0x91, 0x1e, 0x40, 0x74, 0x23, 0xa7, 0xa8, 0x00, 0xfc, 0xa1, 0x16, 0xed, 0xcf, 0xff, 0xce, 0xea, 0x3f, 0x31, 0x54, 0xad, 0x19, 0x98, 0xcb, 0x5d, 0xfd, 0x82, 0xe2, 0x48, 0xbf, 0xc3, 0x74, 0x71}, 89 | {0x5f, 0x45, 0x5f, 0xba, 0x82, 0x5d, 0xc4, 0x20, 0x12, 0x67, 0x65, 0x0d, 0x8b, 0x14, 0x45, 0x20, 0xd3, 0xbc, 0xb4, 0x23, 0x26, 0x98, 0xfc, 0x05, 0x8f, 0xa5, 0x99, 0xe2, 0x78, 0x74, 0x72, 0x71}, 90 | {0xda, 0xa5, 0x2a, 0xc1, 0x13, 0xa4, 0x3b, 0xeb, 0x41, 0x51, 0x1b, 0x96, 0xa3, 0xa0, 0x5b, 0xd8, 
0xed, 0x5e, 0x69, 0x67, 0xfb, 0xc5, 0x27, 0x66, 0x56, 0x8a, 0xb2, 0x1e, 0x93, 0xbf, 0xb0, 0x36}, 91 | {0x54, 0xb8, 0x17, 0xb6, 0xd2, 0x26, 0x22, 0x93, 0xdc, 0xb5, 0xd5, 0x32, 0x1b, 0x76, 0x3c, 0xfa, 0x24, 0x04, 0xcb, 0xa0, 0x1b, 0xcb, 0xa3, 0x12, 0x20, 0x60, 0x3b, 0x59, 0xe5, 0xdf, 0xf7, 0xbf}, 92 | {0x41, 0x42, 0x6c, 0xbf, 0xfa, 0x23, 0xcc, 0xee, 0x3e, 0xf6, 0xf3, 0xbf, 0xa1, 0x39, 0x9b, 0x6e, 0x7f, 0xfb, 0x2c, 0x7f, 0x4e, 0xf5, 0x35, 0x78, 0xb5, 0x5e, 0x77, 0x02, 0x40, 0x2a, 0xbc, 0x77}, 93 | {0x9b, 0xc5, 0x2f, 0xb6, 0xa1, 0x3d, 0x5a, 0xc0, 0x9a, 0x23, 0xce, 0xbf, 0x9b, 0x94, 0xad, 0xd4, 0xe4, 0x6f, 0x0f, 0x0a, 0x64, 0x55, 0x22, 0x26, 0xbc, 0x8b, 0xba, 0xdf, 0xb9, 0x04, 0x3a, 0x5b}, 94 | {0x7b, 0x66, 0x20, 0xcf, 0x63, 0xeb, 0x29, 0xb9, 0x11, 0xc5, 0x5e, 0x18, 0x98, 0x15, 0x2f, 0x69, 0x60, 0xa7, 0xf1, 0x0c, 0xc1, 0x6b, 0x6f, 0xba, 0xd3, 0x2c, 0x83, 0x7d, 0x9d, 0x8e, 0x2b, 0x74}, 95 | {0x7b, 0x9b, 0xcd, 0x1a, 0xe3, 0xfd, 0xd9, 0xd4, 0x74, 0x2e, 0x0d, 0xbc, 0xe1, 0x3c, 0x54, 0x2c, 0xc1, 0x81, 0xb5, 0x0b, 0xa0, 0xf9, 0xd5, 0xe1, 0xca, 0x18, 0x00, 0xf9, 0xb5, 0x84, 0x85, 0xca}, 96 | {0xe7, 0xc9, 0xe2, 0xc8, 0x33, 0x41, 0x31, 0x15, 0xb3, 0x84, 0x3f, 0x79, 0x18, 0xe9, 0x98, 0x5a, 0x51, 0x60, 0xf0, 0x5a, 0x5b, 0xf8, 0x7f, 0x5f, 0xdd, 0x70, 0x27, 0xe3, 0x8f, 0xe3, 0x39, 0xf4}, 97 | {0x36, 0x0d, 0x5b, 0xa8, 0x0e, 0x59, 0xe2, 0x82, 0xa2, 0x39, 0xdf, 0x28, 0x34, 0x4d, 0x4f, 0x74, 0xee, 0xd8, 0x6b, 0xa0, 0xd8, 0x9d, 0xe7, 0x88, 0x05, 0x4e, 0xba, 0x6b, 0x50, 0x03, 0x89, 0xa2}, 98 | {0x89, 0xd6, 0x81, 0x5f, 0x68, 0x39, 0x36, 0x6c, 0x25, 0xad, 0xb6, 0x43, 0xff, 0x6b, 0x5e, 0x19, 0x63, 0xd3, 0xff, 0xd0, 0xce, 0x1a, 0xa7, 0x8c, 0x7f, 0xeb, 0x5a, 0x6e, 0x99, 0xf1, 0xb4, 0xdb}, 99 | {0x1f, 0x36, 0x6f, 0x27, 0xc8, 0x2f, 0x23, 0x81, 0xfc, 0x02, 0x80, 0x4f, 0x8b, 0x8d, 0xa8, 0x2f, 0x3d, 0x35, 0x91, 0xe3, 0x60, 0x90, 0x7c, 0x57, 0x03, 0xc3, 0xa9, 0xed, 0xb1, 0x72, 0x3e, 0x3e}, 100 | } 101 | 102 | var _test_32_digests = [][32]byte{ 103 | {0x22, 0xd8, 0x35, 0x89, 0xe6, 0x42, 0xe1, 0xb1, 
0x40, 0xed, 0x1b, 0x48, 0x48, 0x5b, 0x44, 0xc7, 0x07, 0x9d, 0xf3, 0xb2, 0x04, 0xbe, 0x48, 0x69, 0x42, 0x1d, 0x45, 0x49, 0xf3, 0x9e, 0x2c, 0xc7}, 104 | {0xac, 0xfe, 0x28, 0x1d, 0x11, 0x77, 0x7c, 0x1e, 0x22, 0xe0, 0xb7, 0x16, 0x0f, 0x01, 0x66, 0x92, 0xa7, 0xb3, 0xb5, 0x69, 0xed, 0x12, 0x8d, 0x93, 0xcf, 0xce, 0x27, 0x49, 0xfd, 0x1c, 0x85, 0x01}, 105 | {0xbc, 0xb2, 0xa2, 0x0b, 0x95, 0x58, 0x91, 0x64, 0x1f, 0x3a, 0x5d, 0x80, 0xaa, 0x11, 0x49, 0xa5, 0x1b, 0xac, 0xb7, 0x1e, 0x06, 0x62, 0x45, 0x34, 0xa5, 0x66, 0xd1, 0xc7, 0x5a, 0xa9, 0x68, 0xc9}, 106 | {0x4d, 0xe2, 0xaa, 0x4b, 0xc4, 0x6c, 0x1c, 0x3d, 0x42, 0x65, 0x34, 0x8a, 0x2c, 0x7a, 0x64, 0xa8, 0xd9, 0x8a, 0x82, 0xe4, 0x8b, 0x9c, 0xc9, 0x3c, 0x3c, 0xcd, 0x34, 0x4d, 0x71, 0x76, 0xda, 0x69}, 107 | {0x1e, 0x00, 0xd3, 0xc6, 0x59, 0x37, 0x27, 0x6a, 0x6a, 0xae, 0xa7, 0xd8, 0x37, 0x51, 0xac, 0x74, 0x2d, 0xe0, 0xb6, 0x7e, 0xc5, 0xa8, 0xa7, 0x56, 0x5b, 0x0f, 0x10, 0xba, 0x8a, 0x40, 0xe2, 0x1c}, 108 | {0x30, 0x96, 0xdb, 0x9d, 0xcf, 0xa9, 0x5c, 0xf4, 0xa4, 0xc4, 0xc9, 0xd5, 0xa0, 0x1e, 0xd4, 0x30, 0xe5, 0xe8, 0xad, 0x9d, 0xaa, 0x8e, 0x79, 0x1c, 0x5d, 0x6c, 0xac, 0x1a, 0xb3, 0x65, 0xb5, 0x14}, 109 | {0x7a, 0xee, 0xd5, 0xc9, 0x66, 0x17, 0x59, 0x7f, 0x89, 0xd6, 0xd9, 0xe8, 0xa8, 0xa7, 0x01, 0x47, 0x60, 0xc6, 0x88, 0xfd, 0x2a, 0x7a, 0xf6, 0x1d, 0x10, 0x20, 0x62, 0x7e, 0x7c, 0xd0, 0x1a, 0x0b}, 110 | {0xce, 0x0c, 0x94, 0xa7, 0x41, 0x25, 0xa5, 0xe3, 0x96, 0x77, 0xd6, 0xbd, 0x91, 0xca, 0xe6, 0x06, 0xf3, 0x90, 0xe0, 0x37, 0xcc, 0xc1, 0x2c, 0x7d, 0x97, 0x97, 0xf3, 0x56, 0xf0, 0xbd, 0x66, 0x43}, 111 | {0xbc, 0xb2, 0xa2, 0x0b, 0x95, 0x58, 0x91, 0x64, 0x1f, 0x3a, 0x5d, 0x80, 0xaa, 0x11, 0x49, 0xa5, 0x1b, 0xac, 0xb7, 0x1e, 0x06, 0x62, 0x45, 0x34, 0xa5, 0x66, 0xd1, 0xc7, 0x5a, 0xa9, 0x68, 0xc9}, 112 | {0x4d, 0xe2, 0xaa, 0x4b, 0xc4, 0x6c, 0x1c, 0x3d, 0x42, 0x65, 0x34, 0x8a, 0x2c, 0x7a, 0x64, 0xa8, 0xd9, 0x8a, 0x82, 0xe4, 0x8b, 0x9c, 0xc9, 0x3c, 0x3c, 0xcd, 0x34, 0x4d, 0x71, 0x76, 0xda, 0x69}, 113 | {0x1e, 0x00, 0xd3, 0xc6, 0x59, 0x37, 0x27, 0x6a, 
0x6a, 0xae, 0xa7, 0xd8, 0x37, 0x51, 0xac, 0x74, 0x2d, 0xe0, 0xb6, 0x7e, 0xc5, 0xa8, 0xa7, 0x56, 0x5b, 0x0f, 0x10, 0xba, 0x8a, 0x40, 0xe2, 0x1c}, 114 | {0x30, 0x96, 0xdb, 0x9d, 0xcf, 0xa9, 0x5c, 0xf4, 0xa4, 0xc4, 0xc9, 0xd5, 0xa0, 0x1e, 0xd4, 0x30, 0xe5, 0xe8, 0xad, 0x9d, 0xaa, 0x8e, 0x79, 0x1c, 0x5d, 0x6c, 0xac, 0x1a, 0xb3, 0x65, 0xb5, 0x14}, 115 | {0x7a, 0xee, 0xd5, 0xc9, 0x66, 0x17, 0x59, 0x7f, 0x89, 0xd6, 0xd9, 0xe8, 0xa8, 0xa7, 0x01, 0x47, 0x60, 0xc6, 0x88, 0xfd, 0x2a, 0x7a, 0xf6, 0x1d, 0x10, 0x20, 0x62, 0x7e, 0x7c, 0xd0, 0x1a, 0x0b}, 116 | {0xd4, 0x1f, 0xa7, 0x89, 0x8c, 0xf9, 0x05, 0xfc, 0x1e, 0xb0, 0x04, 0xd7, 0xaa, 0x56, 0x35, 0xec, 0x36, 0xf5, 0x0d, 0x41, 0x75, 0x64, 0x34, 0x71, 0xf0, 0x3b, 0x5b, 0xb2, 0xcc, 0xfa, 0x8c, 0xca}, 117 | {0xf8, 0xd9, 0x9e, 0xa7, 0x9c, 0xa1, 0xe0, 0x3a, 0x19, 0x4f, 0xd3, 0x2d, 0xbd, 0x40, 0x3a, 0xa3, 0x28, 0xe8, 0xa4, 0x27, 0x58, 0x44, 0x12, 0xf7, 0x69, 0x01, 0x66, 0xfa, 0xf1, 0x97, 0x30, 0xfe}, 118 | {0x99, 0x7c, 0x24, 0x0e, 0xed, 0x31, 0x0a, 0xda, 0x12, 0x16, 0x0e, 0x06, 0x44, 0xb8, 0x3f, 0xa2, 0x40, 0x52, 0xbc, 0x2d, 0xaf, 0x97, 0x00, 0x01, 0x5d, 0xbb, 0x0d, 0x06, 0x66, 0xb1, 0x59, 0xf2}, 119 | {0x99, 0x43, 0x52, 0x77, 0x28, 0x39, 0x6b, 0xeb, 0x03, 0x51, 0xc4, 0x5f, 0x7d, 0xd3, 0xe1, 0x41, 0x17, 0x66, 0x7b, 0x0e, 0xc9, 0x51, 0x01, 0xa7, 0x39, 0xf3, 0xc8, 0x63, 0x95, 0xa5, 0x92, 0x6b}, 120 | {0xce, 0x6e, 0xab, 0xd2, 0xe8, 0xad, 0x90, 0xad, 0xbe, 0xe5, 0x94, 0x96, 0xa9, 0x98, 0xe7, 0x83, 0x07, 0xa4, 0x0f, 0x8e, 0xe5, 0xb3, 0x5a, 0x05, 0xcd, 0xfd, 0xae, 0x9c, 0x07, 0xad, 0x26, 0xaa}, 121 | {0xf5, 0xee, 0x66, 0x87, 0x00, 0xed, 0xeb, 0x8b, 0xc2, 0x7d, 0x97, 0x52, 0x2d, 0xfc, 0x0a, 0x2a, 0x32, 0x0e, 0x92, 0xd2, 0x91, 0xd1, 0x69, 0x29, 0x9d, 0xb1, 0x3a, 0x65, 0x9f, 0x8e, 0x7e, 0x2a}, 122 | {0x88, 0x4a, 0xc8, 0x81, 0xdb, 0xa6, 0x79, 0x36, 0x54, 0xe9, 0x15, 0x5c, 0xff, 0x06, 0x35, 0x8b, 0x6e, 0x0d, 0xaa, 0x3e, 0x7a, 0x82, 0x7c, 0x4a, 0xfe, 0x8a, 0x91, 0xb4, 0x34, 0xed, 0xe3, 0x17}, 123 | {0xe7, 0x92, 0xa4, 0x91, 0xdc, 0x1d, 0x83, 0xc8, 
0x72, 0x5a, 0xd1, 0x27, 0x17, 0x78, 0x2b, 0xc7, 0x67, 0xe9, 0x56, 0xf2, 0xb4, 0x37, 0x51, 0xa1, 0x6b, 0x23, 0x8c, 0xc9, 0x03, 0x3d, 0x90, 0x1e}, 124 | {0xc4, 0x1f, 0xcc, 0x5e, 0xcb, 0x5e, 0x7d, 0x02, 0x12, 0x3f, 0x15, 0x9f, 0x35, 0xf4, 0x49, 0x55, 0xba, 0xc6, 0x47, 0xd2, 0x85, 0x85, 0x61, 0x69, 0xa5, 0x60, 0x7a, 0x32, 0x7f, 0x8e, 0x09, 0x5f}, 125 | {0x60, 0xb6, 0xab, 0xb5, 0x6b, 0x4d, 0xce, 0x6f, 0x1d, 0x77, 0x2e, 0x9b, 0x0d, 0x60, 0x76, 0xe3, 0xcb, 0x79, 0xbc, 0x40, 0x2d, 0x16, 0xf6, 0xa3, 0x06, 0x12, 0x36, 0x71, 0xda, 0xfd, 0x28, 0x89}, 126 | {0x67, 0xdd, 0x7f, 0x26, 0x6d, 0x2e, 0xf3, 0xef, 0x13, 0xb6, 0x09, 0x73, 0x82, 0xbc, 0x73, 0x25, 0x83, 0xc0, 0x34, 0x90, 0xe8, 0xad, 0xf0, 0x17, 0x8d, 0xed, 0xad, 0x29, 0xf7, 0x78, 0x9c, 0x28}, 127 | {0x00, 0xb0, 0xd5, 0xd0, 0x8e, 0x9b, 0xe5, 0xf0, 0x46, 0x8e, 0x60, 0x25, 0x95, 0xe5, 0x3a, 0x46, 0xb1, 0x07, 0x74, 0x97, 0xed, 0x0a, 0x2f, 0x9a, 0x3f, 0xf3, 0x94, 0x2f, 0xb3, 0x12, 0xa1, 0x91}, 128 | {0x8d, 0x36, 0x16, 0xc6, 0x00, 0x88, 0xd6, 0x69, 0xb4, 0x5a, 0x71, 0x18, 0x41, 0xe5, 0x4d, 0xb2, 0xd9, 0x00, 0x7a, 0x17, 0x63, 0x6a, 0x9b, 0x2e, 0x22, 0x12, 0x5b, 0xa3, 0x74, 0x7c, 0x95, 0xc9}, 129 | {0x4e, 0xfc, 0x5c, 0x18, 0xd1, 0x8a, 0x5b, 0x57, 0x7c, 0x86, 0x3e, 0xe2, 0x75, 0x91, 0xf2, 0xb3, 0x5f, 0xd0, 0x92, 0xbc, 0x77, 0xbe, 0x1b, 0xef, 0x1a, 0x7c, 0xe2, 0xd8, 0x8d, 0x7b, 0xef, 0xf7}, 130 | {0xcd, 0x78, 0x15, 0x64, 0x2c, 0x78, 0x57, 0x74, 0x2b, 0xb7, 0xdb, 0x74, 0xe2, 0xab, 0x82, 0xbb, 0x61, 0x32, 0x3e, 0xe4, 0xb1, 0x00, 0xde, 0xb2, 0x35, 0x1e, 0x3e, 0x1c, 0x91, 0x9d, 0x87, 0xde}, 131 | {0x17, 0xcc, 0x52, 0x5c, 0x60, 0x9e, 0xd8, 0xd4, 0xf4, 0x56, 0x28, 0x16, 0xde, 0xde, 0x73, 0xfe, 0xd9, 0x92, 0xb7, 0x99, 0x15, 0x24, 0x1b, 0x40, 0xb0, 0xda, 0x9a, 0xf8, 0x24, 0x38, 0x13, 0xbd}, 132 | {0xd0, 0x45, 0x9b, 0xe3, 0x9a, 0xae, 0x78, 0x41, 0xcd, 0x12, 0x9a, 0x6b, 0x91, 0x58, 0x29, 0x75, 0xae, 0x21, 0xd3, 0xf2, 0x5e, 0x98, 0xab, 0x09, 0xb0, 0xaa, 0x62, 0x96, 0x35, 0x64, 0x18, 0x48}, 133 | {0xd2, 0x5b, 0x10, 0xf1, 0x35, 0xaa, 0x04, 0x49, 
0x4e, 0x51, 0x30, 0x0d, 0xb6, 0xbf, 0xa0, 0x9b, 0xa0, 0xf5, 0x66, 0x5f, 0x28, 0xc7, 0x8d, 0xa8, 0x3e, 0x0f, 0xe4, 0xa7, 0xc9, 0xd4, 0x0f, 0x7d}, 134 | {0xb7, 0x80, 0xc2, 0x31, 0xe6, 0x75, 0x0c, 0xad, 0x0f, 0xe8, 0xed, 0x59, 0x34, 0xdb, 0xfb, 0x41, 0xd4, 0x38, 0x73, 0x7a, 0x47, 0x01, 0xb8, 0xea, 0xea, 0x2e, 0x01, 0x8e, 0x4f, 0x09, 0x64, 0x82}, 135 | } 136 | 137 | func TestHash(t *testing.T) { 138 | tests := []struct { 139 | name string 140 | count uint32 141 | }{ 142 | { 143 | name: "hash 1 block", 144 | count: 1, 145 | }, 146 | { 147 | name: "hash 4 blocks", 148 | count: 4, 149 | }, 150 | { 151 | name: "hash 8 blocks", 152 | count: 8, 153 | }, 154 | { 155 | name: "hash 16 blocks", 156 | count: 16, 157 | }, 158 | { 159 | name: "hash 18 blocks", 160 | count: 18, 161 | }, 162 | { 163 | name: "hash 24 blocks", 164 | count: 24, 165 | }, 166 | { 167 | name: "hash 32 blocks", 168 | count: 32, 169 | }, 170 | { 171 | name: "hash 31 blocks", 172 | count: 31, 173 | }, 174 | } 175 | for _, tt := range tests { 176 | t.Run(tt.name, func(t *testing.T) { 177 | digests := make([][32]byte, tt.count) 178 | err := gohashtree.Hash(digests, _test_32_block[:2*tt.count]) 179 | if err != nil { 180 | t.Log(err) 181 | t.Fail() 182 | } 183 | if !reflect.DeepEqual(digests, _test_32_digests[:tt.count]) { 184 | t.Logf("Digests are different\n Expected: %x\n Produced: %x\n", 185 | _test_32_digests[:tt.count], digests) 186 | t.Fail() 187 | } 188 | digests2 := make([][32]byte, tt.count) 189 | gohashtree.Sha256_1_generic(digests2, _test_32_block[:2*tt.count]) 190 | if err != nil { 191 | t.Log(err) 192 | t.Fail() 193 | } 194 | if !reflect.DeepEqual(digests2, _test_32_digests[:tt.count]) { 195 | t.Logf("Digests are different\n Expected: %x\n Produced: %x\n", 196 | _test_32_digests[:tt.count], digests2) 197 | t.Fail() 198 | } 199 | }) 200 | } 201 | } 202 | 203 | func TestHashByteSlice(t *testing.T) { 204 | tests := []struct { 205 | name string 206 | count uint32 207 | }{ 208 | { 209 | name: "hash 1 
block", 210 | count: 1, 211 | }, 212 | { 213 | name: "hash 4 blocks", 214 | count: 4, 215 | }, 216 | { 217 | name: "hash 8 blocks", 218 | count: 8, 219 | }, 220 | { 221 | name: "hash 16 blocks", 222 | count: 16, 223 | }, 224 | { 225 | name: "hash 18 blocks", 226 | count: 18, 227 | }, 228 | { 229 | name: "hash 24 blocks", 230 | count: 24, 231 | }, 232 | { 233 | name: "hash 32 blocks", 234 | count: 32, 235 | }, 236 | { 237 | name: "hash 31 blocks", 238 | count: 31, 239 | }, 240 | } 241 | for _, tt := range tests { 242 | t.Run(tt.name, func(t *testing.T) { 243 | digests := make([]byte, 32*tt.count) 244 | chunks := make([]byte, 64*tt.count) 245 | for i := 0; i < int(2*tt.count); i += 2 { 246 | if n := copy(chunks[32*i:32*i+32], _test_32_block[i][:]); n != 32 { 247 | t.Logf("copied wrong number of bytes") 248 | t.Fail() 249 | } 250 | if n := copy(chunks[32*i+32:32*i+64], _test_32_block[i+1][:]); n != 32 { 251 | t.Logf("copied wrong number of bytes") 252 | t.Fail() 253 | } 254 | } 255 | 256 | err := gohashtree.HashByteSlice(digests, chunks) 257 | if err != nil { 258 | t.Log(err) 259 | t.Fail() 260 | } 261 | for i := 0; i < int(tt.count); i++ { 262 | if !reflect.DeepEqual(digests[32*i:32*i+32], _test_32_digests[i][:]) { 263 | t.Logf("Digests are different\n Expected: %x\n Produced: %x\n", 264 | _test_32_digests[i][:], digests[32*i:32*i+32]) 265 | t.Fail() 266 | } 267 | } 268 | }) 269 | } 270 | } 271 | 272 | func TestOddChunks(t *testing.T) { 273 | digests := make([][32]byte, 1) 274 | chunks := make([][32]byte, 1) 275 | err := gohashtree.Hash(digests, chunks) 276 | if err == nil || err.Error() != "odd number of chunks" { 277 | t.Logf("expected error: \"odd number of chunks\", got: \"%v\"", err) 278 | t.Fail() 279 | } 280 | } 281 | 282 | func TestNotAllocatedDigest(t *testing.T) { 283 | digests := make([][32]byte, 1) 284 | chunks := make([][32]byte, 4) 285 | err := gohashtree.Hash(digests, chunks) 286 | expected := "not enough digest length, need at least 2, got 1" 287 | if 
!errors.Is(err, gohashtree.ErrNotEnoughDigests) { 288 | t.Logf("expected error: \"%s\", got: \"%s\"", expected, err) 289 | t.Fail() 290 | } 291 | } 292 | 293 | func OldHash(data []byte) [32]byte { 294 | h := sha256.New() 295 | h.Reset() 296 | var b [32]byte 297 | h.Write(data) 298 | h.Sum(b[:0]) 299 | return b 300 | } 301 | 302 | func BenchmarkHash_1_minio(b *testing.B) { 303 | chunks := [64]byte{'A'} 304 | digests := make([][32]byte, 1) 305 | b.ResetTimer() 306 | for i := 0; i < b.N; i++ { 307 | digests[0] = OldHash(chunks[:]) 308 | } 309 | } 310 | 311 | func BenchmarkHash_1(b *testing.B) { 312 | chunks := make([][32]byte, 2) 313 | digests := make([][32]byte, 1) 314 | b.ResetTimer() 315 | for i := 0; i < b.N; i++ { 316 | gohashtree.Hash(digests, chunks) 317 | } 318 | } 319 | 320 | func BenchmarkHash_4_minio(b *testing.B) { 321 | chunks := [64 * 4]byte{'A'} 322 | digests := make([][32]byte, 4) 323 | b.ResetTimer() 324 | for i := 0; i < b.N; i++ { 325 | for j := 0; j < 4; j++ { 326 | digests[j] = OldHash(chunks[j*64 : j*64+64]) 327 | } 328 | } 329 | } 330 | 331 | func BenchmarkHash_4(b *testing.B) { 332 | chunks := make([][32]byte, 8) 333 | digests := make([][32]byte, 4) 334 | b.ResetTimer() 335 | for i := 0; i < b.N; i++ { 336 | gohashtree.Hash(digests, chunks) 337 | } 338 | } 339 | 340 | func BenchmarkHash_8_minio(b *testing.B) { 341 | chunks := [64 * 8]byte{'A'} 342 | digests := make([][32]byte, 8) 343 | b.ResetTimer() 344 | for i := 0; i < b.N; i++ { 345 | for j := 0; j < 8; j++ { 346 | digests[j] = OldHash(chunks[j*64 : j*64+64]) 347 | } 348 | } 349 | } 350 | 351 | func BenchmarkHash_8(b *testing.B) { 352 | chunks := make([][32]byte, 16) 353 | digests := make([][32]byte, 8) 354 | b.ResetTimer() 355 | for i := 0; i < b.N; i++ { 356 | gohashtree.Hash(digests, chunks) 357 | } 358 | } 359 | 360 | func BenchmarkHash_16_minio(b *testing.B) { 361 | chunks := [64 * 16]byte{'A'} 362 | digests := make([][32]byte, 16) 363 | b.ResetTimer() 364 | for i := 0; i < b.N; i++ { 
365 | for j := 0; j < 16; j++ { 366 | digests[j] = OldHash(chunks[j*64 : j*64+64]) 367 | } 368 | } 369 | } 370 | 371 | func BenchmarkHash_16(b *testing.B) { 372 | chunks := make([][32]byte, 32) 373 | digests := make([][32]byte, 16) 374 | b.ResetTimer() 375 | for i := 0; i < b.N; i++ { 376 | gohashtree.Hash(digests, chunks) 377 | } 378 | } 379 | 380 | func BenchmarkHashLargeList_minio(b *testing.B) { 381 | balances := make([][32]byte, 400000) 382 | for i := 0; i < len(balances); i++ { 383 | balances[i] = [32]byte{'A'} 384 | } 385 | digests := make([][32]byte, 200000) 386 | b.ResetTimer() 387 | for i := 0; i < b.N; i++ { 388 | for j := 0; j < 200000; j++ { 389 | batchedRT := append(balances[2*j][:], balances[2*j+1][:]...) 390 | digests[j] = OldHash(batchedRT) 391 | } 392 | } 393 | } 394 | 395 | func BenchmarkHashList(b *testing.B) { 396 | balances := make([][32]byte, 400000) 397 | for i := 0; i < len(balances); i++ { 398 | balances[i] = [32]byte{'A'} 399 | } 400 | digests := make([][32]byte, 200000) 401 | b.ResetTimer() 402 | for i := 0; i < b.N; i++ { 403 | gohashtree.Hash(digests, balances) 404 | } 405 | } 406 | -------------------------------------------------------------------------------- /hash_unsupported.go: -------------------------------------------------------------------------------- 1 | //go:build !amd64 && !arm64 2 | 3 | /* 4 | MIT License 5 | 6 | Copyright (c) 2025 Prysmatic Labs 7 | 8 | Permission is hereby granted, free of charge, to any person obtaining a copy 9 | of this software and associated documentation files (the "Software"), to deal 10 | in the Software without restriction, including without limitation the rights 11 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | copies of the Software, and to permit persons to whom the Software is 13 | furnished to do so, subject to the following conditions: 14 | 15 | The above copyright notice and this permission notice shall be included in all 16 | copies or substantial 
portions of the Software. 17 | 18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 21 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 24 | SOFTWARE. 25 | */ 26 | package gohashtree 27 | 28 | var supportedCPU = false 29 | 30 | func _hash(digests *byte, p [][32]byte, count uint32) {} 31 | -------------------------------------------------------------------------------- /sha256_1_generic.go: -------------------------------------------------------------------------------- 1 | /* 2 | MIT License 3 | 4 | Copyright (c) 2021-2022 Prysmatic Labs 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 23 | */ 24 | package gohashtree 25 | 26 | import ( 27 | "encoding/binary" 28 | "math/bits" 29 | ) 30 | 31 | const ( 32 | init0 = uint32(0x6A09E667) 33 | init1 = uint32(0xBB67AE85) 34 | init2 = uint32(0x3C6EF372) 35 | init3 = uint32(0xA54FF53A) 36 | init4 = uint32(0x510E527F) 37 | init5 = uint32(0x9B05688C) 38 | init6 = uint32(0x1F83D9AB) 39 | init7 = uint32(0x5BE0CD19) 40 | ) 41 | 42 | var _P = []uint32{ 43 | 0xc28a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 44 | 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 45 | 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 46 | 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf374, 47 | 0x649b69c1, 0xf0fe4786, 0x0fe1edc6, 0x240cf254, 48 | 0x4fe9346f, 0x6cc984be, 0x61b9411e, 0x16f988fa, 49 | 0xf2c65152, 0xa88e5a6d, 0xb019fc65, 0xb9d99ec7, 50 | 0x9a1231c3, 0xe70eeaa0, 0xfdb1232b, 0xc7353eb0, 51 | 0x3069bad5, 0xcb976d5f, 0x5a0f118f, 0xdc1eeefd, 52 | 0x0a35b689, 0xde0b7a04, 0x58f4ca9d, 0xe15d5b16, 53 | 0x007f3e86, 0x37088980, 0xa507ea32, 0x6fab9537, 54 | 0x17406110, 0x0d8cd6f1, 0xcdaa3b6d, 0xc0bbbe37, 55 | 0x83613bda, 0xdb48a363, 0x0b02e931, 0x6fd15ca7, 56 | 0x521afaca, 0x31338431, 0x6ed41a95, 0x6d437890, 57 | 0xc39c91f2, 0x9eccabbd, 0xb5c9a0e6, 0x532fb63c, 58 | 0xd2c741c6, 0x07237ea3, 0xa4954b68, 0x4c191d76, 59 | } 60 | 61 | var _K = []uint32{ 62 | 0x428a2f98, 63 | 0x71374491, 64 | 0xb5c0fbcf, 65 | 0xe9b5dba5, 66 | 0x3956c25b, 67 | 0x59f111f1, 68 | 0x923f82a4, 69 | 0xab1c5ed5, 70 | 0xd807aa98, 71 | 0x12835b01, 72 | 0x243185be, 73 | 0x550c7dc3, 74 | 0x72be5d74, 75 | 0x80deb1fe, 76 | 0x9bdc06a7, 77 | 0xc19bf174, 78 | 0xe49b69c1, 79 | 0xefbe4786, 80 | 0x0fc19dc6, 81 | 0x240ca1cc, 82 | 0x2de92c6f, 83 | 0x4a7484aa, 84 | 0x5cb0a9dc, 85 | 0x76f988da, 86 | 0x983e5152, 87 
| 0xa831c66d, 88 | 0xb00327c8, 89 | 0xbf597fc7, 90 | 0xc6e00bf3, 91 | 0xd5a79147, 92 | 0x06ca6351, 93 | 0x14292967, 94 | 0x27b70a85, 95 | 0x2e1b2138, 96 | 0x4d2c6dfc, 97 | 0x53380d13, 98 | 0x650a7354, 99 | 0x766a0abb, 100 | 0x81c2c92e, 101 | 0x92722c85, 102 | 0xa2bfe8a1, 103 | 0xa81a664b, 104 | 0xc24b8b70, 105 | 0xc76c51a3, 106 | 0xd192e819, 107 | 0xd6990624, 108 | 0xf40e3585, 109 | 0x106aa070, 110 | 0x19a4c116, 111 | 0x1e376c08, 112 | 0x2748774c, 113 | 0x34b0bcb5, 114 | 0x391c0cb3, 115 | 0x4ed8aa4a, 116 | 0x5b9cca4f, 117 | 0x682e6ff3, 118 | 0x748f82ee, 119 | 0x78a5636f, 120 | 0x84c87814, 121 | 0x8cc70208, 122 | 0x90befffa, 123 | 0xa4506ceb, 124 | 0xbef9a3f7, 125 | 0xc67178f2, 126 | } 127 | 128 | func sha256_1_generic(digests [][32]byte, p [][32]byte) { 129 | var w [16]uint32 130 | for k := 0; k < len(p)/2; k++ { 131 | // First 16 rounds 132 | a, b, c, d, e, f, g, h := init0, init1, init2, init3, init4, init5, init6, init7 133 | for i := 0; i < 8; i++ { 134 | j := i * 4 135 | w[i] = uint32(p[2*k][j])<<24 | uint32(p[2*k][j+1])<<16 | uint32(p[2*k][j+2])<<8 | uint32(p[2*k][j+3]) 136 | t1 := h + ((bits.RotateLeft32(e, -6)) ^ (bits.RotateLeft32(e, -11)) ^ (bits.RotateLeft32(e, -25))) + ((e & f) ^ (^e & g)) + _K[i] + w[i] 137 | 138 | t2 := ((bits.RotateLeft32(a, -2)) ^ (bits.RotateLeft32(a, -13)) ^ (bits.RotateLeft32(a, -22))) + ((a & b) ^ (a & c) ^ (b & c)) 139 | 140 | h = g 141 | g = f 142 | f = e 143 | e = d + t1 144 | d = c 145 | c = b 146 | b = a 147 | a = t1 + t2 148 | } 149 | for i := 8; i < 16; i++ { 150 | j := (i - 8) * 4 151 | w[i] = uint32(p[2*k+1][j])<<24 | uint32(p[2*k+1][j+1])<<16 | uint32(p[2*k+1][j+2])<<8 | uint32(p[2*k+1][j+3]) 152 | t1 := h + ((bits.RotateLeft32(e, -6)) ^ (bits.RotateLeft32(e, -11)) ^ (bits.RotateLeft32(e, -25))) + ((e & f) ^ (^e & g)) + _K[i] + w[i] 153 | 154 | t2 := ((bits.RotateLeft32(a, -2)) ^ (bits.RotateLeft32(a, -13)) ^ (bits.RotateLeft32(a, -22))) + ((a & b) ^ (a & c) ^ (b & c)) 155 | 156 | h = g 157 | g = f 158 | f = e 159 | e 
= d + t1 160 | d = c 161 | c = b 162 | b = a 163 | a = t1 + t2 164 | } 165 | // Last 48 rounds 166 | for i := 16; i < 64; i++ { 167 | v1 := w[(i-2)%16] 168 | t1 := (bits.RotateLeft32(v1, -17)) ^ (bits.RotateLeft32(v1, -19)) ^ (v1 >> 10) 169 | v2 := w[(i-15)%16] 170 | t2 := (bits.RotateLeft32(v2, -7)) ^ (bits.RotateLeft32(v2, -18)) ^ (v2 >> 3) 171 | w[i%16] += t1 + w[(i-7)%16] + t2 172 | 173 | t1 = h + ((bits.RotateLeft32(e, -6)) ^ (bits.RotateLeft32(e, -11)) ^ (bits.RotateLeft32(e, -25))) + ((e & f) ^ (^e & g)) + _K[i] + w[i%16] 174 | t2 = ((bits.RotateLeft32(a, -2)) ^ (bits.RotateLeft32(a, -13)) ^ (bits.RotateLeft32(a, -22))) + ((a & b) ^ (a & c) ^ (b & c)) 175 | h = g 176 | g = f 177 | f = e 178 | e = d + t1 179 | d = c 180 | c = b 181 | b = a 182 | a = t1 + t2 183 | } 184 | // Add original digest 185 | a += init0 186 | b += init1 187 | c += init2 188 | d += init3 189 | e += init4 190 | f += init5 191 | g += init6 192 | h += init7 193 | 194 | h0, h1, h2, h3, h4, h5, h6, h7 := a, b, c, d, e, f, g, h 195 | // Rounds with padding 196 | for i := 0; i < 64; i++ { 197 | t1 := h + ((bits.RotateLeft32(e, -6)) ^ (bits.RotateLeft32(e, -11)) ^ (bits.RotateLeft32(e, -25))) + ((e & f) ^ (^e & g)) + _P[i] 198 | 199 | t2 := ((bits.RotateLeft32(a, -2)) ^ (bits.RotateLeft32(a, -13)) ^ (bits.RotateLeft32(a, -22))) + ((a & b) ^ (a & c) ^ (b & c)) 200 | 201 | h = g 202 | g = f 203 | f = e 204 | e = d + t1 205 | d = c 206 | c = b 207 | b = a 208 | a = t1 + t2 209 | } 210 | 211 | h0 += a 212 | h1 += b 213 | h2 += c 214 | h3 += d 215 | h4 += e 216 | h5 += f 217 | h6 += g 218 | h7 += h 219 | 220 | var dig [32]byte 221 | binary.BigEndian.PutUint32(dig[0:4], h0) 222 | binary.BigEndian.PutUint32(dig[4:8], h1) 223 | binary.BigEndian.PutUint32(dig[8:12], h2) 224 | binary.BigEndian.PutUint32(dig[12:16], h3) 225 | binary.BigEndian.PutUint32(dig[16:20], h4) 226 | binary.BigEndian.PutUint32(dig[20:24], h5) 227 | binary.BigEndian.PutUint32(dig[24:28], h6) 228 | 
binary.BigEndian.PutUint32(dig[28:32], h7) 229 | (digests)[k] = dig 230 | } 231 | } 232 | -------------------------------------------------------------------------------- /sha256_1_sse.go-bak: -------------------------------------------------------------------------------- 1 | //go:build amd64 2 | // +build amd64 3 | 4 | /* 5 | MIT License 6 | 7 | Copyright (c) 2021 Prysmatic Labs 8 | 9 | Permission is hereby granted, free of charge, to any person obtaining a copy 10 | of this software and associated documentation files (the "Software"), to deal 11 | in the Software without restriction, including without limitation the rights 12 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 13 | copies of the Software, and to permit persons to whom the Software is 14 | furnished to do so, subject to the following conditions: 15 | 16 | The above copyright notice and this permission notice shall be included in all 17 | copies or substantial portions of the Software. 18 | 19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 21 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 22 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 23 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 24 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 25 | SOFTWARE. 
26 | */ 27 | package gohashtree 28 | 29 | //go:noescape 30 | 31 | func sha256_1_sse(digests *byte, p [][32]byte, count uint32) 32 | -------------------------------------------------------------------------------- /sha256_1_sse.s-bak: -------------------------------------------------------------------------------- 1 | /* 2 | MIT License 3 | 4 | Copyright (c) 2021 Prysmatic Labs 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 
23 | */ 24 | #define MSGSCHEDULE0(index) \ 25 | MOVL (index*4)(SI), AX; \ 26 | BSWAPL AX; \ 27 | MOVL AX, (index*4)(BP) 28 | 29 | // Wt = SIGMA1(Wt-2) + Wt-7 + SIGMA0(Wt-15) + Wt-16; for 16 <= t <= 63 30 | // SIGMA0(x) = ROTR(7,x) XOR ROTR(18,x) XOR SHR(3,x) 31 | // SIGMA1(x) = ROTR(17,x) XOR ROTR(19,x) XOR SHR(10,x) 32 | #define MSGSCHEDULE1(index) \ 33 | MOVL ((index-2)*4)(BP), AX; \ 34 | MOVL AX, DI; \ 35 | RORL $17, AX; \ 36 | MOVL DI, DX; \ 37 | RORL $19, DI; \ 38 | SHRL $10, DX; \ 39 | MOVL ((index-15)*4)(BP), BX; \ 40 | XORL DI, AX; \ 41 | MOVL BX, DI; \ 42 | XORL DX, AX; \ 43 | RORL $7, BX; \ 44 | MOVL DI, DX; \ 45 | SHRL $3, DX; \ 46 | RORL $18, DI; \ 47 | ADDL ((index-7)*4)(BP), AX; \ 48 | XORL DI, BX; \ 49 | XORL DX, BX; \ 50 | ADDL ((index-16)*4)(BP), BX; \ 51 | ADDL BX, AX; \ 52 | MOVL AX, ((index)*4)(BP) 53 | 54 | // Calculate T1 and T2, then e = d + T1 and a = T1 + T2. Wt+Kt is passed in AX. 55 | // The values for e and a are stored in d and h, ready for rotation. 56 | #define SHA256ROUND(a, b, c, d, e, f, g, h) \ 57 | MOVL e, BX; \ 58 | RORL $14, BX; \ 59 | MOVL a, DX; \ 60 | RORL $9, DX; \ 61 | XORL e, BX; \ 62 | MOVL f, DI; \ 63 | RORL $5, BX; \ 64 | XORL a, DX; \ 65 | XORL g, DI; \ 66 | XORL e, BX; \ 67 | ANDL e, DI; \ 68 | RORL $11, DX; \ 69 | XORL a, DX; \ 70 | RORL $6, BX; \ 71 | XORL g, DI; \ 72 | RORL $2, DX; \ 73 | ADDL BX, DI; \ 74 | ADDL AX, DI; \ 75 | MOVL a, BX; \ 76 | ADDL DI, h; \ 77 | MOVL a, DI; \ 78 | ORL c, BX; \ 79 | ADDL h, d; \ 80 | ANDL c, DI; \ 81 | ANDL b, BX; \ 82 | ADDL DX, h; \ 83 | ORL DI, BX; \ 84 | ADDL BX, h 85 | 86 | #define SHA256ROUND0(index, const, a, b, c, d, e, f, g, h) \ 87 | MSGSCHEDULE0(index); \ 88 | ADDL $const, AX; \ 89 | SHA256ROUND(a, b, c, d, e, f, g, h) 90 | 91 | #define SHA256ROUND1(index, const, a, b, c, d, e, f, g, h) \ 92 | MSGSCHEDULE1(index); \ 93 | ADDL $const, AX; \ 94 | SHA256ROUND(a, b, c, d, e, f, g, h) 95 | 96 | #define PADDSHA256ROUND(const, a, b, c, d, e, f, g, h) \ 97 | MOVL e, BX; \ 98 
| RORL $14, BX; \ 99 | MOVL a, DX; \ 100 | RORL $9, DX; \ 101 | XORL e, BX; \ 102 | MOVL f, DI; \ 103 | RORL $5, BX; \ 104 | XORL a, DX; \ 105 | XORL g, DI; \ 106 | XORL e, BX; \ 107 | ANDL e, DI; \ 108 | RORL $11, DX; \ 109 | XORL a, DX; \ 110 | RORL $6, BX; \ 111 | XORL g, DI; \ 112 | RORL $2, DX; \ 113 | ADDL BX, DI; \ 114 | ADDL $const, DI; \ 115 | MOVL a, BX; \ 116 | ADDL DI, h; \ 117 | MOVL a, DI; \ 118 | ORL c, BX; \ 119 | ADDL h, d; \ 120 | ANDL c, DI; \ 121 | ANDL b, BX; \ 122 | ADDL DX, h; \ 123 | ORL DI, BX; \ 124 | ADDL BX, h 125 | 126 | TEXT ·sha256_1_sse(SB), 0, $296-36 127 | 128 | MOVQ digests+0(FP), CX // digests *byte 129 | MOVQ p_base+8(FP), SI // p [][32]byte 130 | MOVL count+32(FP), DX // count uint32 131 | SHLQ $6, DX 132 | 133 | LEAQ (SI)(DX*1), DI 134 | MOVQ DI, 256(SP) 135 | CMPQ SI, DI 136 | JEQ end 137 | 138 | MOVQ SP, BP 139 | 140 | loop: 141 | MOVL $0x6A09E667, R8 // a = H0 142 | MOVL $0xBB67AE85, R9 // b = H1 143 | MOVL $0x3C6EF372, R10 // c = H2 144 | MOVL $0xA54FF53A, R11 // d = H3 145 | MOVL $0x510E527F, R12 // e = H4 146 | MOVL $0x9B05688C, R13 // f = H5 147 | MOVL $0x1F83D9AB, R14 // g = H6 148 | MOVL $0x5BE0CD19, R15 // h = H7 149 | 150 | 151 | SHA256ROUND0(0, 0x428a2f98, R8, R9, R10, R11, R12, R13, R14, R15) 152 | SHA256ROUND0(1, 0x71374491, R15, R8, R9, R10, R11, R12, R13, R14) 153 | SHA256ROUND0(2, 0xb5c0fbcf, R14, R15, R8, R9, R10, R11, R12, R13) 154 | SHA256ROUND0(3, 0xe9b5dba5, R13, R14, R15, R8, R9, R10, R11, R12) 155 | SHA256ROUND0(4, 0x3956c25b, R12, R13, R14, R15, R8, R9, R10, R11) 156 | SHA256ROUND0(5, 0x59f111f1, R11, R12, R13, R14, R15, R8, R9, R10) 157 | SHA256ROUND0(6, 0x923f82a4, R10, R11, R12, R13, R14, R15, R8, R9) 158 | SHA256ROUND0(7, 0xab1c5ed5, R9, R10, R11, R12, R13, R14, R15, R8) 159 | SHA256ROUND0(8, 0xd807aa98, R8, R9, R10, R11, R12, R13, R14, R15) 160 | SHA256ROUND0(9, 0x12835b01, R15, R8, R9, R10, R11, R12, R13, R14) 161 | SHA256ROUND0(10, 0x243185be, R14, R15, R8, R9, R10, R11, R12, R13) 162 |
SHA256ROUND0(11, 0x550c7dc3, R13, R14, R15, R8, R9, R10, R11, R12) 163 | SHA256ROUND0(12, 0x72be5d74, R12, R13, R14, R15, R8, R9, R10, R11) 164 | SHA256ROUND0(13, 0x80deb1fe, R11, R12, R13, R14, R15, R8, R9, R10) 165 | SHA256ROUND0(14, 0x9bdc06a7, R10, R11, R12, R13, R14, R15, R8, R9) 166 | SHA256ROUND0(15, 0xc19bf174, R9, R10, R11, R12, R13, R14, R15, R8) 167 | 168 | SHA256ROUND1(16, 0xe49b69c1, R8, R9, R10, R11, R12, R13, R14, R15) 169 | SHA256ROUND1(17, 0xefbe4786, R15, R8, R9, R10, R11, R12, R13, R14) 170 | SHA256ROUND1(18, 0x0fc19dc6, R14, R15, R8, R9, R10, R11, R12, R13) 171 | SHA256ROUND1(19, 0x240ca1cc, R13, R14, R15, R8, R9, R10, R11, R12) 172 | SHA256ROUND1(20, 0x2de92c6f, R12, R13, R14, R15, R8, R9, R10, R11) 173 | SHA256ROUND1(21, 0x4a7484aa, R11, R12, R13, R14, R15, R8, R9, R10) 174 | SHA256ROUND1(22, 0x5cb0a9dc, R10, R11, R12, R13, R14, R15, R8, R9) 175 | SHA256ROUND1(23, 0x76f988da, R9, R10, R11, R12, R13, R14, R15, R8) 176 | SHA256ROUND1(24, 0x983e5152, R8, R9, R10, R11, R12, R13, R14, R15) 177 | SHA256ROUND1(25, 0xa831c66d, R15, R8, R9, R10, R11, R12, R13, R14) 178 | SHA256ROUND1(26, 0xb00327c8, R14, R15, R8, R9, R10, R11, R12, R13) 179 | SHA256ROUND1(27, 0xbf597fc7, R13, R14, R15, R8, R9, R10, R11, R12) 180 | SHA256ROUND1(28, 0xc6e00bf3, R12, R13, R14, R15, R8, R9, R10, R11) 181 | SHA256ROUND1(29, 0xd5a79147, R11, R12, R13, R14, R15, R8, R9, R10) 182 | SHA256ROUND1(30, 0x06ca6351, R10, R11, R12, R13, R14, R15, R8, R9) 183 | SHA256ROUND1(31, 0x14292967, R9, R10, R11, R12, R13, R14, R15, R8) 184 | SHA256ROUND1(32, 0x27b70a85, R8, R9, R10, R11, R12, R13, R14, R15) 185 | SHA256ROUND1(33, 0x2e1b2138, R15, R8, R9, R10, R11, R12, R13, R14) 186 | SHA256ROUND1(34, 0x4d2c6dfc, R14, R15, R8, R9, R10, R11, R12, R13) 187 | SHA256ROUND1(35, 0x53380d13, R13, R14, R15, R8, R9, R10, R11, R12) 188 | SHA256ROUND1(36, 0x650a7354, R12, R13, R14, R15, R8, R9, R10, R11) 189 | SHA256ROUND1(37, 0x766a0abb, R11, R12, R13, R14, R15, R8, R9, R10) 190 | SHA256ROUND1(38, 
0x81c2c92e, R10, R11, R12, R13, R14, R15, R8, R9) 191 | SHA256ROUND1(39, 0x92722c85, R9, R10, R11, R12, R13, R14, R15, R8) 192 | SHA256ROUND1(40, 0xa2bfe8a1, R8, R9, R10, R11, R12, R13, R14, R15) 193 | SHA256ROUND1(41, 0xa81a664b, R15, R8, R9, R10, R11, R12, R13, R14) 194 | SHA256ROUND1(42, 0xc24b8b70, R14, R15, R8, R9, R10, R11, R12, R13) 195 | SHA256ROUND1(43, 0xc76c51a3, R13, R14, R15, R8, R9, R10, R11, R12) 196 | SHA256ROUND1(44, 0xd192e819, R12, R13, R14, R15, R8, R9, R10, R11) 197 | SHA256ROUND1(45, 0xd6990624, R11, R12, R13, R14, R15, R8, R9, R10) 198 | SHA256ROUND1(46, 0xf40e3585, R10, R11, R12, R13, R14, R15, R8, R9) 199 | SHA256ROUND1(47, 0x106aa070, R9, R10, R11, R12, R13, R14, R15, R8) 200 | SHA256ROUND1(48, 0x19a4c116, R8, R9, R10, R11, R12, R13, R14, R15) 201 | SHA256ROUND1(49, 0x1e376c08, R15, R8, R9, R10, R11, R12, R13, R14) 202 | SHA256ROUND1(50, 0x2748774c, R14, R15, R8, R9, R10, R11, R12, R13) 203 | SHA256ROUND1(51, 0x34b0bcb5, R13, R14, R15, R8, R9, R10, R11, R12) 204 | SHA256ROUND1(52, 0x391c0cb3, R12, R13, R14, R15, R8, R9, R10, R11) 205 | SHA256ROUND1(53, 0x4ed8aa4a, R11, R12, R13, R14, R15, R8, R9, R10) 206 | SHA256ROUND1(54, 0x5b9cca4f, R10, R11, R12, R13, R14, R15, R8, R9) 207 | SHA256ROUND1(55, 0x682e6ff3, R9, R10, R11, R12, R13, R14, R15, R8) 208 | SHA256ROUND1(56, 0x748f82ee, R8, R9, R10, R11, R12, R13, R14, R15) 209 | SHA256ROUND1(57, 0x78a5636f, R15, R8, R9, R10, R11, R12, R13, R14) 210 | SHA256ROUND1(58, 0x84c87814, R14, R15, R8, R9, R10, R11, R12, R13) 211 | SHA256ROUND1(59, 0x8cc70208, R13, R14, R15, R8, R9, R10, R11, R12) 212 | SHA256ROUND1(60, 0x90befffa, R12, R13, R14, R15, R8, R9, R10, R11) 213 | SHA256ROUND1(61, 0xa4506ceb, R11, R12, R13, R14, R15, R8, R9, R10) 214 | SHA256ROUND1(62, 0xbef9a3f7, R10, R11, R12, R13, R14, R15, R8, R9) 215 | SHA256ROUND1(63, 0xc67178f2, R9, R10, R11, R12, R13, R14, R15, R8) 216 | 217 | // Add initial digest and save it 218 | ADDL $0x6A09E667, R8 // H0 = a + H0 219 | MOVL R8, (0*4)(CX) 220 | ADDL 
$0xBB67AE85, R9 // H1 = b + H1 221 | MOVL R9, (1*4)(CX) 222 | ADDL $0x3C6EF372, R10 // H2 = c + H2 223 | MOVL R10, (2*4)(CX) 224 | ADDL $0xA54FF53A, R11 // H3 = d + H3 225 | MOVL R11, (3*4)(CX) 226 | ADDL $0x510E527F, R12 // H4 = e + H4 227 | MOVL R12, (4*4)(CX) 228 | ADDL $0x9B05688C, R13 // H5 = f + H5 229 | MOVL R13, (5*4)(CX) 230 | ADDL $0x1F83D9AB, R14 // H6 = g + H6 231 | MOVL R14, (6*4)(CX) 232 | ADDL $0x5BE0CD19, R15 // H7 = h + H7 233 | MOVL R15, (7*4)(CX) 234 | 235 | // Rounds with padding 236 | // Rounds 0 - 15 237 | PADDSHA256ROUND(0xc28a2f98, R8, R9, R10, R11, R12, R13, R14, R15) 238 | PADDSHA256ROUND(0x71374491, R15, R8, R9, R10, R11, R12, R13, R14) 239 | PADDSHA256ROUND(0xb5c0fbcf, R14, R15, R8, R9, R10, R11, R12, R13) 240 | PADDSHA256ROUND(0xe9b5dba5, R13, R14, R15, R8, R9, R10, R11, R12) 241 | PADDSHA256ROUND(0x3956c25b, R12, R13, R14, R15, R8, R9, R10, R11) 242 | PADDSHA256ROUND(0x59f111f1, R11, R12, R13, R14, R15, R8, R9, R10) 243 | PADDSHA256ROUND(0x923f82a4, R10, R11, R12, R13, R14, R15, R8, R9) 244 | PADDSHA256ROUND(0xab1c5ed5, R9, R10, R11, R12, R13, R14, R15, R8) 245 | PADDSHA256ROUND(0xd807aa98, R8, R9, R10, R11, R12, R13, R14, R15) 246 | PADDSHA256ROUND(0x12835b01, R15, R8, R9, R10, R11, R12, R13, R14) 247 | PADDSHA256ROUND(0x243185be, R14, R15, R8, R9, R10, R11, R12, R13) 248 | PADDSHA256ROUND(0x550c7dc3, R13, R14, R15, R8, R9, R10, R11, R12) 249 | PADDSHA256ROUND(0x72be5d74, R12, R13, R14, R15, R8, R9, R10, R11) 250 | PADDSHA256ROUND(0x80deb1fe, R11, R12, R13, R14, R15, R8, R9, R10) 251 | PADDSHA256ROUND(0x9bdc06a7, R10, R11, R12, R13, R14, R15, R8, R9) 252 | PADDSHA256ROUND(0xc19bf374, R9, R10, R11, R12, R13, R14, R15, R8) 253 | 254 | // Rounds 16 - 31 255 | PADDSHA256ROUND(0x649b69c1, R8, R9, R10, R11, R12, R13, R14, R15) 256 | PADDSHA256ROUND(0xf0fe4786, R15, R8, R9, R10, R11, R12, R13, R14) 257 | PADDSHA256ROUND(0x0fe1edc6, R14, R15, R8, R9, R10, R11, R12, R13) 258 | PADDSHA256ROUND(0x240cf254, R13, R14, R15, R8, R9, R10, R11, R12) 
259 | PADDSHA256ROUND(0x4fe9346f, R12, R13, R14, R15, R8, R9, R10, R11) 260 | PADDSHA256ROUND(0x6cc984be, R11, R12, R13, R14, R15, R8, R9, R10) 261 | PADDSHA256ROUND(0x61b9411e, R10, R11, R12, R13, R14, R15, R8, R9) 262 | PADDSHA256ROUND(0x16f988fa, R9, R10, R11, R12, R13, R14, R15, R8) 263 | PADDSHA256ROUND(0xf2c65152, R8, R9, R10, R11, R12, R13, R14, R15) 264 | PADDSHA256ROUND(0xa88e5a6d, R15, R8, R9, R10, R11, R12, R13, R14) 265 | PADDSHA256ROUND(0xb019fc65, R14, R15, R8, R9, R10, R11, R12, R13) 266 | PADDSHA256ROUND(0xb9d99ec7, R13, R14, R15, R8, R9, R10, R11, R12) 267 | PADDSHA256ROUND(0x9a1231c3, R12, R13, R14, R15, R8, R9, R10, R11) 268 | PADDSHA256ROUND(0xe70eeaa0, R11, R12, R13, R14, R15, R8, R9, R10) 269 | PADDSHA256ROUND(0xfdb1232b, R10, R11, R12, R13, R14, R15, R8, R9) 270 | PADDSHA256ROUND(0xc7353eb0, R9, R10, R11, R12, R13, R14, R15, R8) 271 | 272 | // Rounds 32 - 47 273 | PADDSHA256ROUND(0x3069bad5, R8, R9, R10, R11, R12, R13, R14, R15) 274 | PADDSHA256ROUND(0xcb976d5f, R15, R8, R9, R10, R11, R12, R13, R14) 275 | PADDSHA256ROUND(0x5a0f118f, R14, R15, R8, R9, R10, R11, R12, R13) 276 | PADDSHA256ROUND(0xdc1eeefd, R13, R14, R15, R8, R9, R10, R11, R12) 277 | PADDSHA256ROUND(0x0a35b689, R12, R13, R14, R15, R8, R9, R10, R11) 278 | PADDSHA256ROUND(0xde0b7a04, R11, R12, R13, R14, R15, R8, R9, R10) 279 | PADDSHA256ROUND(0x58f4ca9d, R10, R11, R12, R13, R14, R15, R8, R9) 280 | PADDSHA256ROUND(0xe15d5b16, R9, R10, R11, R12, R13, R14, R15, R8) 281 | PADDSHA256ROUND(0x007f3e86, R8, R9, R10, R11, R12, R13, R14, R15) 282 | PADDSHA256ROUND(0x37088980, R15, R8, R9, R10, R11, R12, R13, R14) 283 | PADDSHA256ROUND(0xa507ea32, R14, R15, R8, R9, R10, R11, R12, R13) 284 | PADDSHA256ROUND(0x6fab9537, R13, R14, R15, R8, R9, R10, R11, R12) 285 | PADDSHA256ROUND(0x17406110, R12, R13, R14, R15, R8, R9, R10, R11) 286 | PADDSHA256ROUND(0x0d8cd6f1, R11, R12, R13, R14, R15, R8, R9, R10) 287 | PADDSHA256ROUND(0xcdaa3b6d, R10, R11, R12, R13, R14, R15, R8, R9) 288 |
PADDSHA256ROUND(0xc0bbbe37, R9, R10, R11, R12, R13, R14, R15, R8) 289 | 290 | // Rounds 48 - 63 291 | PADDSHA256ROUND(0x83613bda, R8, R9, R10, R11, R12, R13, R14, R15) 292 | PADDSHA256ROUND(0xdb48a363, R15, R8, R9, R10, R11, R12, R13, R14) 293 | PADDSHA256ROUND(0x0b02e931, R14, R15, R8, R9, R10, R11, R12, R13) 294 | PADDSHA256ROUND(0x6fd15ca7, R13, R14, R15, R8, R9, R10, R11, R12) 295 | PADDSHA256ROUND(0x521afaca, R12, R13, R14, R15, R8, R9, R10, R11) 296 | PADDSHA256ROUND(0x31338431, R11, R12, R13, R14, R15, R8, R9, R10) 297 | PADDSHA256ROUND(0x6ed41a95, R10, R11, R12, R13, R14, R15, R8, R9) 298 | PADDSHA256ROUND(0x6d437890, R9, R10, R11, R12, R13, R14, R15, R8) 299 | PADDSHA256ROUND(0xc39c91f2, R8, R9, R10, R11, R12, R13, R14, R15) 300 | PADDSHA256ROUND(0x9eccabbd, R15, R8, R9, R10, R11, R12, R13, R14) 301 | PADDSHA256ROUND(0xb5c9a0e6, R14, R15, R8, R9, R10, R11, R12, R13) 302 | PADDSHA256ROUND(0x532fb63c, R13, R14, R15, R8, R9, R10, R11, R12) 303 | PADDSHA256ROUND(0xd2c741c6, R12, R13, R14, R15, R8, R9, R10, R11) 304 | PADDSHA256ROUND(0x07237ea3, R11, R12, R13, R14, R15, R8, R9, R10) 305 | PADDSHA256ROUND(0xa4954b68, R10, R11, R12, R13, R14, R15, R8, R9) 306 | PADDSHA256ROUND(0x4c191d76, R9, R10, R11, R12, R13, R14, R15, R8) 307 | 308 | // Add previous digest and save it 309 | ADDL (0*4)(CX), R8 // H0 = a + H0 310 | BSWAPL R8 311 | MOVL R8, (0*4)(CX) 312 | ADDL (1*4)(CX), R9 // H1 = b + H1 313 | BSWAPL R9 314 | MOVL R9, (1*4)(CX) 315 | ADDL (2*4)(CX), R10 // H2 = c + H2 316 | BSWAPL R10 317 | MOVL R10, (2*4)(CX) 318 | ADDL (3*4)(CX), R11 // H3 = d + H3 319 | BSWAPL R11 320 | MOVL R11, (3*4)(CX) 321 | ADDL (4*4)(CX), R12 // H4 = e + H4 322 | BSWAPL R12 323 | MOVL R12, (4*4)(CX) 324 | ADDL (5*4)(CX), R13 // H5 = f + H5 325 | BSWAPL R13 326 | MOVL R13, (5*4)(CX) 327 | ADDL (6*4)(CX), R14 // H6 = g + H6 328 | BSWAPL R14 329 | MOVL R14, (6*4)(CX) 330 | ADDL (7*4)(CX), R15 // H7 = h + H7 331 | BSWAPL R15 332 | MOVL R15, (7*4)(CX) 333 | 334 | ADDQ $64, SI 335 | ADDQ
$32, CX 336 | CMPQ SI, 256(SP) 337 | JB loop 338 | 339 | end: 340 | RET 341 | --------------------------------------------------------------------------------
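The README notes that the scheduled words of the padding block are hardcoded, and both `sha256_1_generic.go` (the `_P` table) and `sha256_1_sse.s-bak` (the `PADDSHA256ROUND` constants) carry those values. The sketch below shows one way such a table could be derived, assuming the input is always exactly 64 bytes so the second SHA-256 block is the fixed padding block (a single `0x80` byte, zeros, and the bit length 512). The names `roundK` and `paddingSchedule` are hypothetical and do not appear in the repository.

```go
package main

import (
	"fmt"
	"math/bits"
)

// roundK holds the 64 SHA-256 round constants (the same values as _K above).
// The name is illustrative only.
var roundK = [64]uint32{
	0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
	0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
	0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
	0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
	0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc,
	0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
	0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7,
	0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
	0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13,
	0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
	0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3,
	0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
	0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5,
	0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
	0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208,
	0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
}

// paddingSchedule returns K[i] + W[i] for the padding block that follows
// a 64-byte message. Because that block never changes, the sums can be
// computed once and used as per-round constants.
func paddingSchedule() [64]uint32 {
	var w [64]uint32
	w[0] = 0x80000000 // the mandatory 1 bit, followed by zero bits
	w[15] = 64 * 8    // message length in bits (512)

	// Standard SHA-256 message schedule expansion for rounds 16..63.
	for i := 16; i < 64; i++ {
		s0 := bits.RotateLeft32(w[i-15], -7) ^ bits.RotateLeft32(w[i-15], -18) ^ (w[i-15] >> 3)
		s1 := bits.RotateLeft32(w[i-2], -17) ^ bits.RotateLeft32(w[i-2], -19) ^ (w[i-2] >> 10)
		w[i] = s1 + w[i-7] + s0 + w[i-16]
	}

	var p [64]uint32
	for i := range p {
		p[i] = roundK[i] + w[i]
	}
	return p
}

func main() {
	p := paddingSchedule()
	// Print a few entries for comparison with the _P table.
	for _, i := range []int{0, 15, 16, 17} {
		fmt.Printf("_P[%d] = 0x%08x\n", i, p[i])
	}
}
```

Entries 0 and 15 differ from `_K` only by the two non-zero padding words (`0x80000000` and `0x200`), which is why `_P[0]` is `0xc28a2f98` and `_P[15]` is `0xc19bf374`. Folding `W[i]` into the constant is what lets the second compression skip the message schedule entirely.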