├── .github └── workflows │ └── ci.yml ├── LICENSE ├── README.md ├── test ├── test.sh ├── uuid_generate_v7.test.sql └── uuid_generate_v7.test.sql.expected └── uuid_generate_v7.sql /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | name: CI 2 | 3 | on: push 4 | 5 | jobs: 6 | test: 7 | name: Run tests with PostgreSQL ${{ matrix.postgres-version }} 8 | runs-on: ubuntu-latest 9 | strategy: 10 | fail-fast: false 11 | matrix: 12 | postgres-version: ['12', '13', '14', '15'] 13 | services: 14 | postgres: 15 | image: postgres:${{ matrix.postgres-version }} 16 | env: 17 | POSTGRES_PASSWORD: postgres 18 | options: 19 | --health-cmd pg_isready 20 | --health-interval 10s 21 | --health-retries 5 22 | --health-timeout 5s 23 | ports: 24 | - 5432:5432 25 | env: 26 | PGHOST: localhost 27 | PGPORT: 5432 28 | PGDATABASE: test 29 | PGUSER: postgres 30 | PGPASSWORD: postgres 31 | steps: 32 | - name: Check out branch 33 | uses: actions/checkout@v3 34 | 35 | - name: Install PostgreSQL 15 client 36 | run: | 37 | echo '::group::Add PostgreSQL APT repository' 38 | echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list 39 | echo '::endgroup::' 40 | 41 | echo '::group::Install PostgreSQL GPG key' 42 | wget --quiet --output-document - -- https://www.postgresql.org/media/keys/ACCC4CF8.asc \ 43 | | gpg --dearmor \ 44 | | sudo tee /etc/apt/trusted.gpg.d/postgresql.gpg > /dev/null 45 | echo '::endgroup::' 46 | 47 | echo '::group::Update package list and install the PostgreSQL 15 client (psql)' 48 | sudo apt-get update 49 | sudo apt-get --yes install -- postgresql-client-15 50 | echo '::endgroup::' 51 | 52 | - name: Run tests 53 | run: test/test.sh 54 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2023 Betterment Holdings Inc. 2 | Copyright (c) 2023 Kyle Hubert (https://github.com/kjmph) 3 | 4 | Permission is hereby granted, free of charge, to any person obtaining a copy 5 | of this software and associated documentation files (the "Software"), to deal 6 | in the Software without restriction, including without limitation the rights 7 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 | copies of the Software, and to permit persons to whom the Software is 9 | furnished to do so, subject to the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be included in all 12 | copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 20 | SOFTWARE. 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # uuid_generate_v7 2 | 3 | This is a PostgreSQL PL/pgSQL function for creating v7 UUIDs, designed in line with the latest [v7 UUID specification](https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-01.html#name-uuidv7-layout-and-bit-order). 4 | 5 | ## Overview 6 | 7 | The `uuid_generate_v7` function is a tool for generating v7-like UUIDs in PostgreSQL. It merges the current UNIX timestamp in milliseconds with 10 random bytes to create unique identifiers, complying with the [UUID RFC 4122 specification](https://datatracker.ietf.org/doc/html/rfc4122#section-4). 8 | 9 | ## Benefits 10 | 11 | A v7 UUID has a distinct advantage over v4 because the timestamp prefix allows them to be partially sequential. This allows better indexing performance in comparison to completely random UUIDs (v4). This is particularly beneficial for databases that frequently insert and search records. 12 | 13 | ## Collision Risk 14 | 15 | The chance of collision is extremely low due to the large size and randomness of UUIDs. UUID uniqueness is derived from the combination of the current UNIX timestamp in milliseconds and 10 random bytes. The risk of collision further decreases with the use of a cryptographically secure random number generator, `gen_random_bytes`. 16 | 17 | To give you a sense of the collision risk: assuming a perfect random number generator and generating 1 billion UUIDs every second, you'd need to keep generating for about 100 years to have a 1 in a billion chance of a single collision. 18 | 19 | ## Testing 20 | 21 | You can run the tests using this command:† 22 | 23 | ```bash 24 | test/test.sh 25 | 26 | # Successful Output: 27 | # 1..1 28 | # ok 1 - uuid_generate_v7 test 29 | ``` 30 | 31 | The test environment and golden test files will be set up and validated. Differences between the test output and the golden file will cause the test to fail. 32 | 33 | † Tests require PostgreSQL 12+. 34 | -------------------------------------------------------------------------------- /test/test.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | # Reference: 4 | # http://www.davidpashley.com/articles/writing-robust-shell-scripts/ 5 | # http://kvz.io/blog/2013/11/21/bash-best-practices/ 6 | # http://redsymbol.net/articles/unofficial-bash-strict-mode/ 7 | 8 | set -o errexit # Exit when an expression fails 9 | set -o noclobber # Disable automatic file overwriting 10 | set -o noglob # Disable shell globbing 11 | set -o nounset # Exit when an undefined variable is used 12 | set -o pipefail # Exit when a command in a pipeline fails 13 | set -o posix # Ensure posix semantics 14 | 15 | IFS=$'\n\t' # Set default field separator to not split on spaces 16 | 17 | umask 0077 18 | 19 | export PGDATABASE=${PGDATABASE:-test} 20 | 21 | readonly TEST_FILE=test/uuid_generate_v7.test.sql 22 | readonly GOLDEN_TEST_FILE=$TEST_FILE.expected 23 | 24 | create_db_if_not_exists() { 25 | psql --field-separator $'\t' --list --no-align --no-psqlrc --tuples-only \ 26 | | cut -f 1 \ 27 | | grep --line-regexp --quiet "$PGDATABASE" \ 28 | || createdb --encoding UTF8 29 | } 30 | 31 | setup_db() { 32 | psql --no-psqlrc --quiet --set ON_ERROR_STOP=1 <| "$GOLDEN_TEST_FILE" 45 | } 46 | 47 | reset_golden_test() { 48 | git checkout --quiet -- "$GOLDEN_TEST_FILE" 2> /dev/null 49 | } 50 | 51 | create_db_if_not_exists 52 | setup_db 53 | 54 | echo '1..1' # We are going to run 1 test 55 | 56 | regenerate_golden_test 57 | 58 | # Successful if test passes 59 | if git diff --exit-code --quiet -- "$GOLDEN_TEST_FILE" 60 | then 61 | echo 'ok 1 - uuid_generate_v7 test' 62 | reset_golden_test 63 | else 64 | echo 'not ok 1 - uuid_generate_v7 test' 65 | exit 1 66 | fi 67 | -------------------------------------------------------------------------------- /test/uuid_generate_v7.test.sql: -------------------------------------------------------------------------------- 1 | \set ECHO none 2 | \out /dev/null 3 | 4 | BEGIN ISOLATION LEVEL SERIALIZABLE; 5 | 6 | -- Add the real uuid_generate_v7 function to allow it to be tested 7 | \i uuid_generate_v7.sql 8 | 9 | -- Temporary clock_timestamp() function for tests 10 | CREATE FUNCTION 11 | pg_temp.clock_timestamp() 12 | RETURNS 13 | timestamptz 14 | LANGUAGE 15 | sql 16 | AS $$ 17 | SELECT 18 | clock_timestamp() 19 | $$ 20 | ; 21 | 22 | -- Temporarily move built-in functions out of public schema 23 | ALTER FUNCTION 24 | gen_random_bytes(count integer) 25 | SET SCHEMA 26 | test 27 | ; 28 | 29 | ALTER FUNCTION 30 | clock_timestamp 31 | SET SCHEMA 32 | test 33 | ; 34 | 35 | -- Create stub built-in functions 36 | CREATE FUNCTION 37 | gen_random_bytes(count integer) 38 | RETURNS 39 | bytea 40 | LANGUAGE 41 | sql 42 | AS $$ 43 | SELECT 44 | '\x0fffffffffffffffffff'::bytea 45 | $$ 46 | ; 47 | 48 | CREATE FUNCTION 49 | extract_timestamp_uuidv7(uuid uuid) 50 | RETURNS 51 | timestamptz 52 | LANGUAGE 53 | sql 54 | IMMUTABLE 55 | PARALLEL SAFE 56 | STRICT 57 | LEAKPROOF 58 | AS $$ 59 | SELECT 60 | to_timestamp 61 | ( ('x0000' || left(replace(uuid::text, '-', ''), 12))::bit(64)::bigint / 1000::numeric 62 | ) 63 | $$ 64 | ; 65 | 66 | /* 67 | 68 | Big endian byte order for v7 uuid 69 | 70 | +-------+-----------+------------+ 71 | | byte | bit range | label | 72 | +-------+-----------+------------+ 73 | | 0 | 0 | 7 | unix_ts_ms | 74 | | 1 | 8 | 15 | | 75 | | 2 | 16 | 23 | | 76 | | 3 | 24 | 31 | | 77 | | 4 | 32 | 39 | | 78 | | 5 | 40 | 47 | | 79 | | 6 | 48 | 51 | ver | 80 | | 6 | 52 | 55 | rand_a | 81 | | 7 | 56 | 63 | | 82 | | 8 | 64 | 65 | var | 83 | | 8 | 66 | 71 | rand_b | 84 | | 9 | 72 | 79 | | 85 | | 10 | 80 | 87 | | 86 | | 11 | 88 | 97 | | 87 | | 12 | 96 | 103 | | 88 | | 13 | 104 | 111 | | 89 | | 14 | 112 | 119 | | 90 | | 15 | 120 | 127 | | 91 | +-------+-----+-----+------------+ 92 | */ 93 | 94 | CREATE TEMPORARY VIEW 95 | test 96 | AS 97 | /* 98 | get_bit uses little-endian order, while a uuid is in big-endian. The 99 | position argument 0 is the right-most bit within each byte, but the 100 | specification assumes version and variant start at the left-most bit. 101 | 102 | In order to display the correct version and variant we need to access each 103 | bit in reverse order, from right to left within each byte. 104 | */ 105 | SELECT 106 | uuid_generate_v7 107 | , pg_typeof(uuid_generate_v7) expected_type 108 | , substr(uuid_generate_v7::text, 15, 1) expected_version 109 | , get_bit(bytes, 55)::bit(1) || 110 | get_bit(bytes, 54)::bit(1) || 111 | get_bit(bytes, 53)::bit(1) || 112 | get_bit(bytes, 52)::bit(1) expected_version_binary 113 | , get_bit(bytes, 71)::bit(1) || 114 | get_bit(bytes, 70)::bit(1) expected_variant_binary 115 | , pg_temp.clock_timestamp() input_timestamp 116 | , extract_timestamp_uuidv7(uuid_generate_v7) serialized_timestamp 117 | FROM 118 | uuid_generate_v7() 119 | CROSS JOIN 120 | uuid_send(uuid_generate_v7) _ (bytes) 121 | ; 122 | 123 | SAVEPOINT test_setup; 124 | 125 | \out 126 | \df uuid_generate_v7 127 | \out /dev/null 128 | 129 | \echo -- With a clock_timestamp having the smallest serializable timestamp 130 | \echo -- (1 row) 131 | \echo 132 | 133 | CREATE FUNCTION 134 | clock_timestamp() 135 | RETURNS 136 | timestamptz 137 | LANGUAGE 138 | sql 139 | AS $$ 140 | SELECT 141 | '1970-01-01 00:00:00'::timestamptz 142 | $$ 143 | ; 144 | 145 | -- The serialized timestamp is unchanged 146 | SELECT 147 | * 148 | FROM 149 | test 150 | \gx /dev/stdout 151 | 152 | ROLLBACK TO test_setup; 153 | 154 | \echo 155 | \echo -- With a clock_timestamp having the largest serializable timestamp 156 | \echo -- (1 row) 157 | \echo 158 | 159 | CREATE FUNCTION 160 | clock_timestamp() 161 | RETURNS 162 | timestamptz 163 | LANGUAGE 164 | sql 165 | AS $$ 166 | SELECT 167 | '10889-08-02 05:31:50.65504'::timestamptz 168 | $$ 169 | ; 170 | 171 | -- The serialized timestamp is unchanged 172 | SELECT 173 | * 174 | FROM 175 | test 176 | \gx /dev/stdout 177 | 178 | ROLLBACK TO test_setup; 179 | 180 | \echo 181 | \echo -- With a clock_timestamp with 999 milliseconds 182 | \echo -- (1 row) 183 | \echo 184 | 185 | CREATE FUNCTION 186 | clock_timestamp() 187 | RETURNS 188 | timestamptz 189 | LANGUAGE 190 | sql 191 | AS $$ 192 | SELECT 193 | '1970-01-01 00:00:00.999'::timestamptz 194 | $$ 195 | ; 196 | 197 | -- The serialized timestamp is unchanged 198 | SELECT 199 | * 200 | FROM 201 | test 202 | \gx /dev/stdout 203 | 204 | ROLLBACK TO test_setup; 205 | 206 | \echo 207 | \echo -- With a clock_timestamp with 9994 milliseconds 208 | \echo -- (1 row) 209 | \echo 210 | 211 | CREATE FUNCTION 212 | clock_timestamp() 213 | RETURNS 214 | timestamptz 215 | LANGUAGE 216 | sql 217 | AS $$ 218 | SELECT 219 | '1970-01-01 00:00:00.9994'::timestamptz 220 | $$ 221 | ; 222 | 223 | -- The serialized timestamp is rounded down to 999 milliseconds 224 | SELECT 225 | * 226 | FROM 227 | test 228 | \gx /dev/stdout 229 | 230 | ROLLBACK TO test_setup; 231 | 232 | \echo 233 | \echo -- With a clock_timestamp with 9995 milliseconds 234 | \echo -- (1 row) 235 | \echo 236 | 237 | CREATE FUNCTION 238 | clock_timestamp() 239 | RETURNS 240 | timestamptz 241 | LANGUAGE 242 | sql 243 | AS $$ 244 | SELECT 245 | '1970-01-01 00:00:00.9995'::timestamptz 246 | $$ 247 | ; 248 | 249 | -- The serialized timestamp is rounded up to next second 250 | SELECT 251 | * 252 | FROM 253 | test 254 | \gx /dev/stdout 255 | 256 | \echo 257 | \echo -- With a clock_timestamp for 2022-02-22T14:22:22.00-05:00 258 | \echo -- with rand_a of 0xcc3 259 | \echo -- with rand_b of 0x18c4dc0c0c07398f 260 | \echo -- (1 row) 261 | \echo 262 | 263 | CREATE OR REPLACE FUNCTION 264 | clock_timestamp() 265 | RETURNS 266 | timestamptz 267 | LANGUAGE 268 | sql 269 | AS $$ 270 | SELECT 271 | '2022-02-22T14:22:22.00-05:00'::timestamptz 272 | $$ 273 | ; 274 | 275 | -- Set random bytes equivalent to UUIDv7 example: 276 | -- https://uuid6.github.io/uuid6-ietf-draft/#name-example-of-a-uuidv7-value 277 | CREATE OR REPLACE FUNCTION 278 | gen_random_bytes(count integer) 279 | RETURNS 280 | bytea 281 | LANGUAGE 282 | sql 283 | AS $$ 284 | SELECT 285 | '\x0cc3'::bytea || -- rand_a 286 | '\x18c4dc0c0c07398f'::bytea -- rand_b 287 | $$ 288 | ; 289 | 290 | SELECT 291 | * 292 | FROM 293 | test 294 | \gx /dev/stdout 295 | 296 | ROLLBACk; 297 | -------------------------------------------------------------------------------- /test/uuid_generate_v7.test.sql.expected: -------------------------------------------------------------------------------- 1 | List of functions 2 | Schema | Name | Result data type | Argument data types | Type 3 | --------+------------------+------------------+---------------------+------ 4 | public | uuid_generate_v7 | uuid | | func 5 | (1 row) 6 | 7 | -- With a clock_timestamp having the smallest serializable timestamp 8 | -- (1 row) 9 | 10 | -[ RECORD 1 ]-----------+------------------------------------- 11 | uuid_generate_v7 | 00000000-0000-7fff-bfff-ffffffffffff 12 | expected_type | uuid 13 | expected_version | 7 14 | expected_version_binary | 0111 15 | expected_variant_binary | 10 16 | input_timestamp | 1970-01-01 00:00:00+00 17 | serialized_timestamp | 1970-01-01 00:00:00+00 18 | 19 | 20 | -- With a clock_timestamp having the largest serializable timestamp 21 | -- (1 row) 22 | 23 | -[ RECORD 1 ]-----------+------------------------------------- 24 | uuid_generate_v7 | ffffffff-ffff-7fff-bfff-ffffffffffff 25 | expected_type | uuid 26 | expected_version | 7 27 | expected_version_binary | 0111 28 | expected_variant_binary | 10 29 | input_timestamp | 10889-08-02 05:31:50.65504+00 30 | serialized_timestamp | 10889-08-02 05:31:50.65504+00 31 | 32 | 33 | -- With a clock_timestamp with 999 milliseconds 34 | -- (1 row) 35 | 36 | -[ RECORD 1 ]-----------+------------------------------------- 37 | uuid_generate_v7 | 00000000-03e7-7fff-bfff-ffffffffffff 38 | expected_type | uuid 39 | expected_version | 7 40 | expected_version_binary | 0111 41 | expected_variant_binary | 10 42 | input_timestamp | 1970-01-01 00:00:00.999+00 43 | serialized_timestamp | 1970-01-01 00:00:00.999+00 44 | 45 | 46 | -- With a clock_timestamp with 9994 milliseconds 47 | -- (1 row) 48 | 49 | -[ RECORD 1 ]-----------+------------------------------------- 50 | uuid_generate_v7 | 00000000-03e7-7fff-bfff-ffffffffffff 51 | expected_type | uuid 52 | expected_version | 7 53 | expected_version_binary | 0111 54 | expected_variant_binary | 10 55 | input_timestamp | 1970-01-01 00:00:00.9994+00 56 | serialized_timestamp | 1970-01-01 00:00:00.999+00 57 | 58 | 59 | -- With a clock_timestamp with 9995 milliseconds 60 | -- (1 row) 61 | 62 | -[ RECORD 1 ]-----------+------------------------------------- 63 | uuid_generate_v7 | 00000000-03e8-7fff-bfff-ffffffffffff 64 | expected_type | uuid 65 | expected_version | 7 66 | expected_version_binary | 0111 67 | expected_variant_binary | 10 68 | input_timestamp | 1970-01-01 00:00:00.9995+00 69 | serialized_timestamp | 1970-01-01 00:00:01+00 70 | 71 | 72 | -- With a clock_timestamp for 2022-02-22T14:22:22.00-05:00 73 | -- with rand_a of 0xcc3 74 | -- with rand_b of 0x18c4dc0c0c07398f 75 | -- (1 row) 76 | 77 | -[ RECORD 1 ]-----------+------------------------------------- 78 | uuid_generate_v7 | 017f22e2-79b0-7cc3-98c4-dc0c0c07398f 79 | expected_type | uuid 80 | expected_version | 7 81 | expected_version_binary | 0111 82 | expected_variant_binary | 10 83 | input_timestamp | 2022-02-22 19:22:22+00 84 | serialized_timestamp | 2022-02-22 19:22:22+00 85 | -------------------------------------------------------------------------------- /uuid_generate_v7.sql: -------------------------------------------------------------------------------- 1 | CREATE OR REPLACE FUNCTION 2 | uuid_generate_v7() 3 | RETURNS 4 | uuid 5 | LANGUAGE 6 | plpgsql 7 | PARALLEL SAFE 8 | AS $$ 9 | DECLARE 10 | -- The current UNIX timestamp in milliseconds 11 | unix_time_ms CONSTANT bytea NOT NULL DEFAULT substring(int8send((extract(epoch FROM clock_timestamp()) * 1000)::bigint) from 3); 12 | 13 | -- The buffer used to create the UUID, starting with the UNIX timestamp and followed by random bytes 14 | buffer bytea NOT NULL DEFAULT unix_time_ms || gen_random_bytes(10); 15 | BEGIN 16 | -- Set most significant 4 bits of 7th byte to 7 (for UUID v7), keeping the last 4 bits unchanged 17 | buffer = set_byte(buffer, 6, (b'0111' || get_byte(buffer, 6)::bit(4))::bit(8)::int); 18 | 19 | -- Set most significant 2 bits of 9th byte to 2 (the UUID variant specified in RFC 4122), keeping the last 6 bits unchanged 20 | buffer = set_byte(buffer, 8, (b'10' || get_byte(buffer, 8)::bit(6))::bit(8)::int); 21 | 22 | RETURN encode(buffer, 'hex'); 23 | END 24 | $$ 25 | ; 26 | --------------------------------------------------------------------------------