├── .gitmodules
├── initial-sync
│   └── py
│       ├── .gitignore
│       ├── requirements.txt
│       ├── README.md
│       ├── convert-schema.py
│       ├── import_csv_to_sqlite.py
│       └── export_to_csv.py
├── .gitignore
├── NOTES.md
└── README.md

--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/initial-sync/py/.gitignore:
--------------------------------------------------------------------------------
exported/

--------------------------------------------------------------------------------
/initial-sync/py/requirements.txt:
--------------------------------------------------------------------------------
sqlalchemy
psycopg2-binary

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.dev.vars
node_modules/
dist/
dump.sql
test.db
test.sql
.DS_Store

--------------------------------------------------------------------------------
/NOTES.md:
--------------------------------------------------------------------------------
1. Connect to PG
2. Begin a TX
3. Create a snapshot name: `SELECT pg_export_snapshot();`
4. Get the LSN: `SELECT pg_current_wal_lsn();`
5. Dump it: `pg_dump --snapshot=_snapshotname`
6. Commit the TX
7. Import it: https://github.com/scratchmex/pgdump2sqlite?tab=readme-ov-file
8. Start logical replication at the LSN

---

Could do all the `COPY`s in a single `TX` and report the `SNAPSHOT` name as well as the `WAL LSN`, so we can resume replication after importing the export.
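Steps 2–5 above can be sketched in Python. This is a hypothetical sketch, not part of the repo: `capture_sync_point` and `pg_dump_argv` are made-up helper names, and the cursor is assumed to be psycopg2-style with an already-open transaction; `pg_dump` must run while that transaction is still open.

```python
def capture_sync_point(cursor):
    # Steps 3-4: inside an open transaction, export a snapshot name and
    # record the current WAL LSN so replication can resume from it later.
    cursor.execute("SELECT pg_export_snapshot();")
    snapshot = cursor.fetchone()[0]
    cursor.execute("SELECT pg_current_wal_lsn();")
    lsn = cursor.fetchone()[0]
    return snapshot, lsn

def pg_dump_argv(conn_str, snapshot, out_file):
    # Step 5: dump against the exported snapshot (run before committing the TX)
    return ["pg_dump", f"--snapshot={snapshot}", "-f", out_file, conn_str]
```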
--------------------------------------------------------------------------------
/initial-sync/py/README.md:
--------------------------------------------------------------------------------
# PostgreSQL to SQLite Schema Converter

This project provides a script to convert PostgreSQL table schemas to SQLite-compatible schemas using SQLAlchemy.

## Prerequisites

- Python 3.x
- PostgreSQL database

## Installation

1. Clone this repository:
   ```sh
   git clone https://your-repo-url/pg_to_sqlite.git
   cd pg_to_sqlite
   ```

2. Install the required Python packages:
   ```sh
   pip install -r requirements.txt
   ```

## Usage

1. Run the script with the PostgreSQL connection string and the output schema file name as arguments:
   ```sh
   python convert-schema.py <postgres_conn_str> <output_schema_file>
   ```

   Example:
   ```sh
   python convert-schema.py "postgresql://username:password@localhost/database_name" "schema.sql"
   ```

2. The SQLite-compatible schema will be saved to the output file you specified.
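The generated schema file then needs to be applied to the target SQLite database. A minimal sketch using the standard-library `sqlite3` module (the `apply_schema` helper is hypothetical, not part of this repo):

```python
import sqlite3

def apply_schema(sqlite_db_file, schema_file):
    # Hypothetical helper: read the generated DDL and apply it in one shot.
    with open(schema_file) as f:
        ddl = f.read()
    conn = sqlite3.connect(sqlite_db_file)
    try:
        conn.executescript(ddl)  # runs every statement in the file
        conn.commit()
    finally:
        conn.close()
```

Equivalently, `sqlite3 my.db < schema.sql` from the shell does the same thing.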
--------------------------------------------------------------------------------
/initial-sync/py/convert-schema.py:
--------------------------------------------------------------------------------
import sys
from sqlalchemy import create_engine, MetaData
from sqlalchemy.schema import CreateTable
from sqlalchemy.pool import StaticPool

def convert_schemas(postgres_conn_str, sqlite_schema_file):
    # Connect to PostgreSQL and reflect all tables
    postgres_engine = create_engine(postgres_conn_str)
    metadata = MetaData()
    metadata.reflect(bind=postgres_engine)

    # In-memory SQLite engine, used only to compile DDL in the SQLite dialect
    sqlite_engine = create_engine('sqlite://', poolclass=StaticPool)

    # Generate and write SQLite-compatible CREATE TABLE statements
    with open(sqlite_schema_file, 'w') as f:
        for table in metadata.tables.values():
            sqlite_table = CreateTable(table).compile(sqlite_engine)
            f.write(str(sqlite_table).strip() + ';\n')

    print(f'Schema has been converted and saved to {sqlite_schema_file}')

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python convert-schema.py <postgres_conn_str> <output_schema_file>")
        sys.exit(1)

    postgres_conn_str = sys.argv[1]
    sqlite_schema_file = sys.argv[2]

    convert_schemas(postgres_conn_str, sqlite_schema_file)
--------------------------------------------------------------------------------
/initial-sync/py/import_csv_to_sqlite.py:
--------------------------------------------------------------------------------
import os
import sys
import subprocess

def import_csv_to_sqlite(csv_directory, sqlite_db_file):
    # List all CSV files in the directory
    csv_files = [f for f in os.listdir(csv_directory) if f.endswith('.csv')]

    # Iterate over each CSV file and import it into SQLite
    for csv_file in csv_files:
        table_name = os.path.splitext(csv_file)[0]
        csv_file_path = os.path.join(csv_directory, csv_file)

        # Construct the SQLite import command
        sqlite_command = (
            f'sqlite3 {sqlite_db_file} "PRAGMA foreign_keys = 0" '
            f'".mode csv" ".import {csv_file_path} {table_name}"'
        )
        # Note: if the table already exists, .import also inserts the CSV
        # header row; with sqlite3 >= 3.32 you can add --skip 1 to drop it.

        try:
            # Run the SQLite import command
            subprocess.run(sqlite_command, shell=True, check=True)
            print(f"Imported {csv_file} into {table_name} table.")
        except subprocess.CalledProcessError as e:
            print(f"Error importing {csv_file}: {e}")

    print("All CSV files have been imported.")

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python import_csv_to_sqlite.py <csv_directory> <sqlite_db_file>")
        sys.exit(1)

    csv_directory = sys.argv[1]
    sqlite_db_file = sys.argv[2]

    import_csv_to_sqlite(csv_directory, sqlite_db_file)
--------------------------------------------------------------------------------
/initial-sync/py/export_to_csv.py:
--------------------------------------------------------------------------------
import os
import sys
import psycopg2

def export_tables_to_csv(postgres_conn_str, output_dir):
    # Connect to PostgreSQL
    conn = psycopg2.connect(postgres_conn_str)
    # REPEATABLE READ so every COPY below reads from the same snapshot
    conn.set_session(isolation_level='REPEATABLE READ', autocommit=False)
    cursor = conn.cursor()

    cursor.execute("SELECT pg_current_wal_lsn();")
    wal_lsn = cursor.fetchone()[0]
    print(f"Current WAL LSN: {wal_lsn}")

    # Fetch all table names (views are excluded; COPY only works on tables)
    cursor.execute("""
        SELECT table_name
        FROM information_schema.tables
        WHERE table_schema = 'public'
          AND table_type = 'BASE TABLE'
    """)
    tables = cursor.fetchall()

    # Create output directory if it does not exist
    os.makedirs(output_dir, exist_ok=True)

    # Export each table to a CSV file
    for table in tables:
        table_name = table[0]
        print(f"Exporting {table_name}...")
        output_file = os.path.join(output_dir, f"{table_name}.csv")
        with open(output_file, 'w') as f:
            cursor.copy_expert(f'COPY "{table_name}" TO STDOUT WITH CSV HEADER', f)
        print(f"{table_name} exported to {output_file}")

    conn.commit()
    cursor.close()
    conn.close()
    print("All tables have been exported.")

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python export_to_csv.py <postgres_conn_str> <output_directory>")
        sys.exit(1)

    postgres_conn_str = sys.argv[1]
    output_dir = sys.argv[2]

    export_tables_to_csv(postgres_conn_str, output_dir)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# PG to SQLite

After doing a bunch of research on the topic, I didn't find any satisfactory approaches.

Other approaches:

1. https://github.com/caiiiycuk/postgresql-to-sqlite
2. https://github.com/scratchmex/pgdump2sqlite
3. https://stackoverflow.com/questions/6148421/how-to-convert-a-postgres-database-to-sqlite/69293251#69293251

All of these involve hacky tricks to fix up the `pg_dump` output _and_ they're slow. Admittedly this could be faster too, but it is fast enough for me at the moment: currently ~18 seconds to convert a 1 GB DB with over 3 million rows.

SQLite has an `.import` command and Postgres has a `COPY` command. We can use the two in conjunction to eliminate any need to post-process the data.

These scripts:

1. Use `COPY` on PG to create CSVs of each table
2. Use `.import` on SQLite to import those CSVs

with no extra data-cleaning steps.

A script is included to convert Postgres schemas to SQLite-compatible schemas using SQLAlchemy, a well-maintained project that speaks many dialects of SQL. This step is not strictly required, though, as SQLite will happily create default schemas during the `.import` step.

After importing these dumps into SQLite you can then replicate from PG to SQLite, as the export script reports the `WAL LSN` at the time of the export.

# Usage:

There are three scripts (under `initial-sync/py`) meant to be used in turn.
1. `convert-schema.py <postgres_conn_str> <output_schema_file>`
   1. After this step, apply the generated schema (e.g. `out.sql`) to your sqlite db
2. `export_to_csv.py <postgres_conn_str> <output_dir>`
   1. This will report the `WAL LSN` if you intend to sync with logical replication after the initial import.
3. `import_csv_to_sqlite.py <csv_directory> <sqlite_db_file>`
--------------------------------------------------------------------------------
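The three steps can be chained by a small driver. This is hypothetical glue, not part of the repo: the script paths and the default `exported`/`out.sql` names are assumptions matching the layout above, and applying the schema between steps 1 and 2 (e.g. `sqlite3 my.db < out.sql`) is still up to you.

```python
import subprocess
import sys

def sync_commands(pg_conn_str, sqlite_db, csv_dir="exported", schema_file="out.sql"):
    # Hypothetical helper: the three script invocations, in order.
    # Between steps 1 and 2 the generated schema must be applied to the
    # SQLite file if you want typed columns instead of default schemas.
    return [
        ["python", "initial-sync/py/convert-schema.py", pg_conn_str, schema_file],
        ["python", "initial-sync/py/export_to_csv.py", pg_conn_str, csv_dir],
        ["python", "initial-sync/py/import_csv_to_sqlite.py", csv_dir, sqlite_db],
    ]

if __name__ == "__main__" and len(sys.argv) == 3:
    for argv in sync_commands(sys.argv[1], sys.argv[2]):
        subprocess.run(argv, check=True)
```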