├── .gitignore ├── Persistent_Tables_Materialization.md ├── README.md ├── dbt_project.yml ├── doc └── images │ ├── before_full_refresh.png │ ├── contact_table_after_backup_refresh.png │ ├── contact_table_before_backup_refresh.jpg │ ├── contacts_after_refresh_with_data.jpg │ ├── dbt_product.jpg │ ├── dbtlog_alter_table_add_column.png │ ├── dbtlog_alter_table_drop_column.jpg │ ├── persistent_table_materialization.jpg │ ├── table_structure_alter_table_add_column.png │ └── table_structure_alter_table_drop_column.jpg ├── models ├── PUBLIC │ └── CONTACT.sql └── SNOWFLAKE_SAMPLE_DATA │ └── schema.yml ├── plugins └── snowflake │ └── dbt │ └── include │ └── snowflake │ └── macros │ ├── materializations │ ├── database_definition_materialization.sql │ ├── file_format_definition_materialization.sql │ ├── persistent_table_materialization.sql │ ├── procedure_definition_materialization.sql │ ├── schema_definition_materialization.sql │ └── stage_definition_materialization.sql │ └── snowflake_helper_macros.sql └── profiles.yml /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | .idea 3 | target 4 | logs 5 | dbt_modules -------------------------------------------------------------------------------- /Persistent_Tables_Materialization.md: -------------------------------------------------------------------------------- 1 | # Persistent Tables materialization for creating tables in DBT 2 | 3 | ## DBT 4 | 5 | [DBT](https://www.getdbt.com/product/) is a command line tool based of SQL primarily 6 | used by Analyst to do data-transformation. In other words it does the 7 | 'T' in ELT. 8 | 9 | It facilitates in writing modular SELECT SQLs and takes care of dependency 10 | ,compilation ,materialization in run time. 11 | 12 | ![DBT product](./doc/images/dbt_product.svg) 13 | 14 | ### Reference 15 | If you are interested in learning more about DBT here are some links 16 | - Read the [dbt viewpoint](https://dbt.readme.io/docs/viewpoint) 17 | - [What is dbt](https://dbt.readme.io/docs/overview) 18 | 19 | ### Source table 20 | In DBT, a "Source Table" holds data upon which data transformations are done. 21 | The transformation are SELECT SQL statements which are joined together and 22 | then materialized into tables. 23 | 24 | When developing scripts or models with DBT the main statements is SELECT 25 | SQL dialect. There are no create or replace statements written in Model statements. 26 | This means that DBT does not offer methods for issuing CREATE TABLE statements 27 | which can be used for source table. Hence it is upto the user to define 28 | these outside of DBT. 29 | 30 | ### Macros 31 | But if you closely look at how DBT offers customization or enhancement to 32 | be developed using [Macros](https://docs.getdbt.com/docs/macros); you will 33 | realize that these are pretty much Jinja templates. And if you search around 34 | the [code base in github](https://github.com/fishtown-analytics/dbt), you 35 | would come across common macros like 36 | - create_view_as 37 | - get_columns_in_relation 38 | - drop_relation_if_exists 39 | - alter_column_type 40 | - truncate_relation 41 | - ... 42 | 43 | ### Materialization 44 | [Materializations](https://docs.getdbt.com/docs/materializations) are strategies 45 | for persisting dbt models in a warehouse. There are four types of materializations 46 | built into dbt. They are: 47 | - table 48 | - view 49 | - incremental 50 | - ephemeral 51 | 52 | DBT also offers facility to develop [Custom Materialization](https://docs.getdbt.com/docs/creating-new-materializations) 53 | too. This led me to investigate if we can use these functionalities to develop 54 | a custom materialization with the following functionalities 55 | - Able to process a model file which contains a "CREATE TABLE" statement. 56 | - Identify if a column has been added/updated/dropped in the definition 57 | and issue an alter statement accordingly. 58 | - Do a complete full refresh of the data. 59 | - Backup the table before doing any modifications. 60 | - Migrate the data after the table has been modified. 61 | 62 | It’s possible to achieve this and I start by explaining as below. Also 63 | I am using snowflake as the choice of database, you might need to change 64 | it accordingly. 65 | 66 | #### Defined Macros 67 | I have defined the following macros in [snowflake_helper_macros](./plugins/snowflake/dbt/include/snowflake/macros/snowflake_helper_macros.sql) 68 | 69 | | Macro name | Description | 70 | | ----------- | ----------- | 71 | | clone_table_relation_if_exists | Used to clone a relation | 72 | | copyof_table_relation_if_exists | Used for copy data from one relation to another | 73 | | create_table_stmt_fromfile | copies the create table from the model file | 74 | 75 | #### Persistent Table Materialization 76 | I have defined the custom materialization [persistent_table_materialization](./plugins/snowflake/dbt/include/snowflake/macros/materializations/persistent_table_materialization.sql) 77 | to handle the above defined needs. In short the implementation has the 78 | following logic: 79 | 80 | ![Persistent table logic flow](./doc/images/persistent_table_materialization.jpg) 81 | 82 | #### Model 83 | Below is an example of the model file which now could be materialized by 84 | DBT. The example is here [CONTACT](./models/PUBLIC/CONTACT.sql) 85 | 86 | ` 87 | {{ config(materialized='persistent_table' 88 | ,retain_previous_version_flg=false 89 | ,migrate_data_over_flg=true 90 | )}} 91 | 92 | CREATE OR REPLACE TABLE "{{ database }}"."{{ schema }}"."CONTACT" ( 93 | FIRST_NAME VARCHAR(100), 94 | LAST_NAME VARCHAR(100), 95 | EMAIL VARCHAR(100), 96 | STREETADDRESS VARCHAR(100), 97 | CITY VARCHAR(100) 98 | ); 99 | 100 | ` 101 | 102 | | - | Description | 103 | | ----------- | ----------- | 104 | | ` materialized='persistent_table' ` | Indicates to use the persistent table materialization | 105 | | `retain_previous_version_flg` | Flag to indicate if a copy of the existing table should be kept as a backup. Each run will have a backup table with suffix _DBT_BACKUP_YYYYMMDDHHMMSS | 106 | | `migrate_data_over_flg` | In case of a full refresh, this flag indicates if the data should be migrated over. | 107 | | `"{{ database }}"."{{ schema }}"` | Placeholder for database name and schema name | 108 | 109 | 110 | #### Example 111 | To see this materialization in action; here is a walkthrough with screenshots 112 | on the various facilities. 113 | 114 | ###### - full-refresh 115 | We start of with no table defined in the database. A "full-refresh" flag 116 | would mean to create the table as if nothing existed. Should the table did 117 | exist it will recreate the table (due to 'CREATE OR REPLACE' in the model). 118 | 119 | ![before_full_refresh](./doc/images/before_full_refresh.png) 120 | 121 | ` 122 | dbt -d run -m CONTACT --full-refresh 123 | ` 124 | 125 | The table got created and I have inserted some dummy records manually. 126 | Here is the screenshot: 127 | 128 | ![contacts_after_refresh_with_data](./doc/images/contacts_after_refresh_with_data.jpg) 129 | 130 | 131 | ###### full-refresh with migration data enabled 132 | 133 | Lets do a full-refresh with the 'migrate_data_over_flg' set to true 134 | 135 | ` 136 | config(materialized='persistent_table' 137 | ,retain_previous_version_flg=false 138 | ,migrate_data_over_flg=true 139 | ) 140 | ` 141 | 142 | command to issue 143 | 144 | ` 145 | dbt -d run -m CONTACT --full-refresh 146 | ` 147 | 148 | Again the table gets re-created and I have inserted some dummy records manually. 149 | 150 | ###### Backup previous version 151 | 152 | Lets go through an example of how to retain the previous copy and see what 153 | happens after migration. Screenshot below reflect the various CONTACT table 154 | as in INFORMATION_SCHEMA.TABLES before refresh 155 | 156 | ![contact_table_before_backup_refresh](./doc/images/contact_table_before_backup_refresh.jpg) 157 | 158 | For this we set the flag 'retain_previous_version_flg' 159 | 160 | ` 161 | config(materialized='persistent_table' 162 | ,retain_previous_version_flg=true 163 | ,migrate_data_over_flg=true 164 | ) 165 | ` 166 | 167 | we issue the command to do a full-refresh as usual 168 | 169 | ` 170 | dbt -d run -m CONTACT --full-refresh 171 | ` 172 | Screenshot below reflect the various CONTACT table as in INFORMATION_SCHEMA.TABLES 173 | after refresh: 174 | 175 | ![contact_table_after_backup_refresh](./doc/images/contact_table_after_backup_refresh.png) 176 | 177 | As you see it backed the table 'CONTACT_DBT_BACKUP_20191006125145387106' 178 | and also retained the rows (look at row count). Due to the 'migrate_data_over_flg' 179 | it has also migrated the previous set of data over. 180 | 181 | ###### Add column 182 | 183 | Now I want to add a column 'LAST_UPDATED' to the definition. 184 | 185 | ` 186 | CREATE OR REPLACE TABLE "{{ database }}"."{{ schema }}"."CONTACT" ( 187 | FIRST_NAME VARCHAR(100), 188 | LAST_NAME VARCHAR(100), 189 | EMAIL VARCHAR(100), 190 | STREETADDRESS VARCHAR(100), 191 | CITY VARCHAR(100), 192 | LAST_UPDATED DATE 193 | ) 194 | ` 195 | 196 | Hmm; did you observe that I am not doing an 'ALTER TABLE' statement. Through 197 | DBT we are going to see this happen. We issue the command, do not set the full-refresh flag 198 | 199 | ` 200 | dbt -d run -m CONTACT 201 | ` 202 | 203 | This result in DBT issuing a 'ALTER TABLE', as in below log: 204 | 205 | ![dbtlog_alter_table_add_column](./doc/images/dbtlog_alter_table_add_column.png) 206 | 207 | The screenshot reflects the table structure after this update: 208 | 209 | ![table_structure_alter_table_add_column](./doc/images/table_structure_alter_table_add_column.png) 210 | 211 | You could also see the existing records are not deleted; as this was an 212 | alter statement. 213 | 214 | ###### Drop column 215 | 216 | Now lets removed the 'LAST_UPDATED' column. 217 | 218 | ` 219 | CREATE OR REPLACE TABLE "{{ database }}"."{{ schema }}"."CONTACT" ( 220 | FIRST_NAME VARCHAR(100), 221 | LAST_NAME VARCHAR(100), 222 | EMAIL VARCHAR(100), 223 | STREETADDRESS VARCHAR(100), 224 | CITY VARCHAR(100) 225 | ) 226 | ` 227 | 228 | Hmm; did you observe that I am not doing an 'ALTER TABLE' statement. Through 229 | DBT we are going to see this happen. We issue the command, do not set the full-refresh flag 230 | 231 | ` 232 | dbt -d run -m CONTACT 233 | ` 234 | 235 | This result in DBT issuing a 'ALTER TABLE', as in below log: 236 | 237 | ![dbtlog_alter_table_drop_column](./doc/images/dbtlog_alter_table_drop_column.jpg) 238 | 239 | The database structure as below: 240 | 241 | ![table_structure_alter_table_drop_column](./doc/images/table_structure_alter_table_drop_column.jpg) 242 | 243 | #### Where to now 244 | With this functionality in place I can see the DBT as being adopted for 245 | CI/CD pipeline for database objects. 246 | 247 | #### Limitations 248 | I can think of the following as a limitation 249 | - You cannot use this model in a 'source' or a 'ref' call 250 | - Do not ask DBT to do a 'run' across all the model. This might result in 251 | recreating the tables accidentally, of course backup could exist if the flags were set. 252 | 253 | 254 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | ## Date : 14-Nov-2019 3 | 4 | - Added materialization support for 5 | - Database 6 | - Schema 7 | - Stages 8 | - File Formats 9 | - Procedures 10 | 11 | 12 | ## Please refer to Branch 13 | 14 | - PERSISTENT_TABLE_MATERIALIZATION 15 | -------------------------------------------------------------------------------- /dbt_project.yml: -------------------------------------------------------------------------------- 1 | 2 | # Name your package! Package names should contain only lowercase characters 3 | # and underscores. A good package name should reflect your organization's 4 | # name or the intended use of these models 5 | name: 'dbt_hacks' 6 | version: '1.0' 7 | 8 | # This setting configures which "profile" dbt uses for this project. Profiles contain 9 | # database connection information, and should be configured in the ~/.dbt/profiles.yml file 10 | profile: 'default' 11 | 12 | # These configurations specify where dbt should look for different types of files. 13 | # The `source-paths` config, for example, states that source models can be found 14 | # in the "models/" directory. You probably won't need to change these! 15 | source-paths: ["models"] 16 | #analysis-paths: ["analysis"] 17 | #test-paths: ["tests"] 18 | #data-paths: ["data"] 19 | macro-paths: ["plugins"] 20 | 21 | target-path: "target" # directory which will store compiled SQL files 22 | clean-targets: # directories to be removed by `dbt clean` 23 | - "target" 24 | - "dbt_modules" 25 | - "logs" 26 | 27 | quoting: 28 | database: true 29 | identifier: true 30 | schema: true 31 | 32 | models: 33 | enabled: true # configs defined here are applied to _all_ packages 34 | materialized: ephemeral # but can be overridden in more specific configs below 35 | -------------------------------------------------------------------------------- /doc/images/before_full_refresh.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/before_full_refresh.png -------------------------------------------------------------------------------- /doc/images/contact_table_after_backup_refresh.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/contact_table_after_backup_refresh.png -------------------------------------------------------------------------------- /doc/images/contact_table_before_backup_refresh.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/contact_table_before_backup_refresh.jpg -------------------------------------------------------------------------------- /doc/images/contacts_after_refresh_with_data.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/contacts_after_refresh_with_data.jpg -------------------------------------------------------------------------------- /doc/images/dbt_product.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/dbt_product.jpg -------------------------------------------------------------------------------- /doc/images/dbtlog_alter_table_add_column.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/dbtlog_alter_table_add_column.png -------------------------------------------------------------------------------- /doc/images/dbtlog_alter_table_drop_column.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/dbtlog_alter_table_drop_column.jpg -------------------------------------------------------------------------------- /doc/images/persistent_table_materialization.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/persistent_table_materialization.jpg -------------------------------------------------------------------------------- /doc/images/table_structure_alter_table_add_column.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/table_structure_alter_table_add_column.png -------------------------------------------------------------------------------- /doc/images/table_structure_alter_table_drop_column.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/venkatra/dbt_hacks/52c7081288718b114359c6e8238ad22615ae1f2c/doc/images/table_structure_alter_table_drop_column.jpg -------------------------------------------------------------------------------- /models/PUBLIC/CONTACT.sql: -------------------------------------------------------------------------------- 1 | {{ 2 | config(materialized='persistent_table' 3 | ,retain_previous_version_flg=false 4 | ,migrate_data_over_flg=true 5 | ) 6 | }} 7 | 8 | CREATE OR REPLACE TABLE "{{ database }}"."{{ schema }}"."CONTACT" ( 9 | FIRST_NAME VARCHAR(100), 10 | LAST_NAME VARCHAR(100), 11 | EMAIL VARCHAR(100), 12 | STREETADDRESS VARCHAR(100), 13 | CITY VARCHAR(100) 14 | ) 15 | 16 | -------------------------------------------------------------------------------- /models/SNOWFLAKE_SAMPLE_DATA/schema.yml: -------------------------------------------------------------------------------- 1 | 2 | version: 2 3 | 4 | sources: 5 | - name: SNOWFLAKE_SAMPLE_DATA 6 | description: the snowflake sample dataset. 7 | database: SNOWFLAKE_SAMPLE_DATA 8 | schema: TPCDS_SF100TCL 9 | tables: 10 | - name: CUSTOMER 11 | - name: CUSTOMER_ADDRESS 12 | -------------------------------------------------------------------------------- /plugins/snowflake/dbt/include/snowflake/macros/materializations/database_definition_materialization.sql: -------------------------------------------------------------------------------- 1 | 2 | 3 | {% materialization database_definition, adapter='snowflake' %} 4 | {%- set retain_previous_version_flg = config.get('retain_previous_version_flg', default=True) -%} -- indicate if the backup copy of previous version is to be retained. 5 | {%- set full_refresh_mode = (flags.FULL_REFRESH == True) -%} 6 | {%- set identifier = model['alias'] -%} 7 | 8 | {%- set current_relation = adapter.get_relation(database=database, schema='public', identifier=identifier) -%} 9 | {%- set target_relation = api.Relation.create(database=database, 10 | schema='information_schema', 11 | identifier='tables', 12 | type='table') -%} 13 | 14 | -------------------------------------------------------------------------------------------------------------------- 15 | 16 | -- setup 17 | {{ run_hooks(pre_hooks, inside_transaction=False) }} 18 | 19 | -- `BEGIN` happens here: 20 | {{ run_hooks(pre_hooks, inside_transaction=True) }} 21 | 22 | 23 | -------------------------------------------------------------------------------------------------------------------- 24 | 25 | {% if retain_previous_version_flg == False %} 26 | -- TODO: drop previous schema 27 | {%- else -%} 28 | -- TODO: backup the existing schema 29 | {% endif %} 30 | 31 | -- build model 32 | {%- call statement('main') -%} 33 | {{ log("Creating database " ~ database) }} 34 | 35 | {{ create_stmt_fromfile(sql) }} 36 | {%- endcall -%} 37 | 38 | -------------------------------------------------------------------------------------------------------------------- 39 | 40 | {{ run_hooks(post_hooks, inside_transaction=True) }} 41 | 42 | -- `COMMIT` happens here 43 | {{ adapter.commit() }} 44 | 45 | {{ run_hooks(post_hooks, inside_transaction=False) }} 46 | 47 | {{ return({'relations': [target_relation] }) }} 48 | 49 | {%- endmaterialization %} 50 | 51 | 52 | -------------------------------------------------------------------------------- /plugins/snowflake/dbt/include/snowflake/macros/materializations/file_format_definition_materialization.sql: -------------------------------------------------------------------------------- 1 | /* 2 | This materialization is used for creating stage objects. 3 | The idea behind this materialization is for ability to define CREATE STAGE statements and have DBT the necessary logic 4 | of deploying the table in a consistent manner and logic. 5 | 6 | */ 7 | {% materialization file_format_definition, adapter='snowflake' %} 8 | {%- set full_refresh_mode = (flags.FULL_REFRESH == True) -%} 9 | {%- set identifier = model['alias'] -%} 10 | 11 | 12 | -------------------------------------------------------------------------------------------------------------------- 13 | 14 | -- setup 15 | {{ run_hooks(pre_hooks, inside_transaction=False) }} 16 | 17 | -- `BEGIN` happens here: 18 | {{ run_hooks(pre_hooks, inside_transaction=True) }} 19 | 20 | 21 | -------------------------------------------------------------------------------------------------------------------- 22 | 23 | -- build model 24 | {%- call statement('main') -%} 25 | {{ create_stmt_fromfile(sql) }} 26 | {%- endcall -%} 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------------------------------------------- 31 | 32 | {{ run_hooks(post_hooks, inside_transaction=True) }} 33 | 34 | -- `COMMIT` happens here 35 | {{ adapter.commit() }} 36 | 37 | {{ run_hooks(post_hooks, inside_transaction=False) }} 38 | 39 | {{ return({'file format': [identifier] }) }} 40 | 41 | {%- endmaterialization %} 42 | 43 | 44 | -------------------------------------------------------------------------------- /plugins/snowflake/dbt/include/snowflake/macros/materializations/persistent_table_materialization.sql: -------------------------------------------------------------------------------- 1 | /* 2 | This materialization is used for creating persistent table. 3 | The idea behind this materialization is for ability to define CREATE TABLE statements and have DBT the necessary logic 4 | of deploying the table in a consistent manner and logic. Some concepts have been borrowed from 'incremental' materialization: 5 | 6 | - https://github.com/fishtown-analytics/dbt/blob/0.14.latest/plugins/snowflake/dbt/include/snowflake/macros/materializations/incremental.sql 7 | 8 | Please read the markdown 'Persistent_Tables_Materialization.md' for a better reasoning behind this materialization. 9 | 10 | */ 11 | {% materialization persistent_table, adapter='snowflake' %} 12 | {%- set retain_previous_version_flg = config.get('retain_previous_version_flg', default=True) -%} -- indicate if the backup copy of previous version is to be retained. 13 | {%- set migrate_data_over_flg = config.get('migrate_data_over_flg', default=true) -%} -- indicate if the data needs to be migrated over to the newly defined table. 14 | 15 | {%- set unique_key = config.get('unique_key') -%} 16 | {%- set full_refresh_mode = (flags.FULL_REFRESH == True) -%} 17 | {%- set identifier = model['alias'] -%} 18 | 19 | {%- set current_relation = adapter.get_relation(database=database, schema=schema, identifier=identifier) -%} 20 | 21 | {%- set backup_suffix_dt = py_current_timestring() -%} 22 | {%- set backup_table_suffix = config.get('backup_table_suffix', default='_DBT_BACKUP_') -%} 23 | {%- set backup_identifier = model['name'] + backup_table_suffix + backup_suffix_dt -%} 24 | {%- set backup_relation = api.Relation.create(database=database, 25 | schema=schema, 26 | identifier=backup_identifier, 27 | type='table') -%} 28 | 29 | {%- set target_relation = api.Relation.create(database=database, 30 | schema=schema, 31 | identifier=identifier, 32 | type='table') -%} 33 | {%- set tmp_relation = make_temp_relation(target_relation ,'_DBT_TMP') %} 34 | 35 | {%- set current_relation_exists_as_table = (current_relation is not none and current_relation.is_table) -%} 36 | {%- set current_relation_exists_as_view = (current_relation is not none and current_relation.is_view) -%} 37 | 38 | -------------------------------------------------------------------------------------------------------------------- 39 | 40 | -- setup 41 | {{ run_hooks(pre_hooks, inside_transaction=False) }} 42 | 43 | -- `BEGIN` happens here: 44 | {{ run_hooks(pre_hooks, inside_transaction=True) }} 45 | 46 | {# -- If the destination is a view, then we have no choice but to drop it #} 47 | {% if current_relation_exists_as_view %} 48 | {{ log("Dropping relation " ~ current_relation ~ " because it is a view and this model is a table.") }} 49 | {{ adapter.drop_relation(current_relation) }} 50 | {% set current_relation = none %} 51 | {% endif %} 52 | 53 | -------------------------------------------------------------------------------------------------------------------- 54 | 55 | -- backup the existing table 56 | {% if current_relation_exists_as_table %} 57 | {{ clone_table_relation_if_exists(current_relation ,backup_relation) }} 58 | {% endif %} 59 | 60 | -- build model 61 | {% if full_refresh_mode or current_relation is none -%} 62 | -- drop the relation incase if the stmt happens to be CREATE IF NOT EXISTS 63 | {{ adapter.drop_relation(current_relation) }} 64 | 65 | {%- call statement('main') -%} 66 | {{ create_table_stmt_fromfile(target_relation, sql) }} 67 | {%- endcall -%} 68 | 69 | -- migrate the data over 70 | {% if migrate_data_over_flg and current_relation is not none %} 71 | {{ log("Migrating data from " ~ backup_relation ~ " to " ~ target_relation) }} 72 | {% set dest_columns = adapter.get_columns_in_relation(backup_relation) %} 73 | {%- call statement('merge', fetch_result=False , auto_begin=False) -%} 74 | {{ get_merge_sql(target_relation, backup_relation, unique_key, dest_columns) }} 75 | {% endcall %} 76 | {%- endif %} 77 | 78 | {%- else -%} 79 | {%- call statement('main') -%} 80 | {% set tmpsql = sql.replace(identifier ,tmp_relation.identifier) %} 81 | {{ log("Tmp sql " ~ tmpsql) }} 82 | {{ create_table_stmt_fromfile(tmp_relation, tmpsql) }} 83 | 84 | {%- endcall -%} 85 | 86 | {%- set new_cols = adapter.get_missing_columns(tmp_relation, current_relation) %} 87 | {%- set dropped_cols = adapter.get_missing_columns(current_relation ,tmp_relation) %} 88 | 89 | {% if new_cols|length > 0 -%} 90 | -- CASE 1 : New columns were added 91 | -- https://docs.getdbt.com/docs/adapter#section-get_missing_columns 92 | {%- set new_cols_csv = new_cols | map(attribute="name") | join(', ') -%} 93 | {{ log("COL_ADDED : " ~ new_cols_csv )}} 94 | {% call statement('add_cols') %} 95 | {% for col in new_cols %} 96 | alter table {{current_relation}} add column "{{col.name}}" {{col.data_type}}; 97 | {% endfor %} 98 | {%- endcall %} 99 | {%- endif %} 100 | 101 | {% if dropped_cols|length > 0 -%} 102 | -- CASE 2 : Columns were dropped 103 | {%- set dropped_cols_csv = dropped_cols | map(attribute="name") | join(', ') -%} 104 | {{ log("COLUMNS TO BE DROPPED : " ~ dropped_cols_csv )}} 105 | {% call statement('drop_cols') %} 106 | {% for col in dropped_cols %} 107 | alter table {{current_relation}} drop column "{{col.name}}"; 108 | {% endfor %} 109 | {%- endcall %} 110 | {%- endif %} 111 | 112 | -- CASE 3 : Columns were renamed 113 | -- This is equivalent of dropped and renamed hence no additional logic needed 114 | 115 | -- CASE 4 : Column data type changed 116 | -- TODO identify and log if datatype are detected 117 | -- get_columns_in_relation 118 | -- alter_column_type 119 | -- https://github.com/fishtown-analytics/dbt/blob/f9c8442260e48bdd8bb7805b2e7541ab91492bb1/plugins/snowflake/dbt/include/snowflake/macros/adapters.sql 120 | {{ adapter.expand_target_column_types(from_relation=tmp_relation, 121 | to_relation=current_relation) }} 122 | 123 | {{ adapter.drop_relation(tmp_relation) }} 124 | {%- endif %} 125 | 126 | 127 | {% if retain_previous_version_flg == False %} 128 | {{ adapter.drop_relation(backup_relation) }} 129 | {% endif %} 130 | 131 | -------------------------------------------------------------------------------------------------------------------- 132 | 133 | {{ run_hooks(post_hooks, inside_transaction=True) }} 134 | 135 | -- `COMMIT` happens here 136 | {{ adapter.commit() }} 137 | 138 | {{ run_hooks(post_hooks, inside_transaction=False) }} 139 | 140 | {{ return({'relations': [target_relation] ,'backup_relation': [backup_relation] }) }} 141 | 142 | {%- endmaterialization %} -------------------------------------------------------------------------------- /plugins/snowflake/dbt/include/snowflake/macros/materializations/procedure_definition_materialization.sql: -------------------------------------------------------------------------------- 1 | /* 2 | This materialization is used for creating stage objects. 3 | The idea behind this materialization is for ability to define CREATE STAGE statements and have DBT the necessary logic 4 | of deploying the table in a consistent manner and logic. 5 | 6 | */ 7 | {% materialization procedure_definition, adapter='snowflake' %} 8 | {%- set full_refresh_mode = (flags.FULL_REFRESH == True) -%} 9 | {%- set identifier = model['alias'] -%} 10 | 11 | 12 | -------------------------------------------------------------------------------------------------------------------- 13 | 14 | -- setup 15 | {{ run_hooks(pre_hooks, inside_transaction=False) }} 16 | 17 | -- `BEGIN` happens here: 18 | {{ run_hooks(pre_hooks, inside_transaction=True) }} 19 | 20 | 21 | -------------------------------------------------------------------------------------------------------------------- 22 | 23 | -- build model 24 | {%- call statement('main') -%} 25 | {{ create_stmt_fromfile(sql) }} 26 | {%- endcall -%} 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------------------------------------------- 31 | 32 | {{ run_hooks(post_hooks, inside_transaction=True) }} 33 | 34 | -- `COMMIT` happens here 35 | {{ adapter.commit() }} 36 | 37 | {{ run_hooks(post_hooks, inside_transaction=False) }} 38 | 39 | {{ return({'procedure': [identifier] }) }} 40 | 41 | {%- endmaterialization %} 42 | 43 | 44 | -------------------------------------------------------------------------------- /plugins/snowflake/dbt/include/snowflake/macros/materializations/schema_definition_materialization.sql: -------------------------------------------------------------------------------- 1 | 2 | 3 | {% materialization schema_definition, adapter='snowflake' %} 4 | {%- set retain_previous_version_flg = config.get('retain_previous_version_flg', default=True) -%} -- indicate if the backup copy of previous version is to be retained. 5 | {%- set full_refresh_mode = (flags.FULL_REFRESH == True) -%} 6 | {%- set identifier = model['alias'] -%} 7 | 8 | 9 | -------------------------------------------------------------------------------------------------------------------- 10 | 11 | -- setup 12 | {{ run_hooks(pre_hooks, inside_transaction=False) }} 13 | 14 | -- `BEGIN` happens here: 15 | {{ run_hooks(pre_hooks, inside_transaction=True) }} 16 | 17 | 18 | -------------------------------------------------------------------------------------------------------------------- 19 | 20 | {% if retain_previous_version_flg == False %} 21 | -- TODO: drop previous schema 22 | {%- else -%} 23 | -- TODO: backup the existing schema 24 | {% endif %} 25 | 26 | -- build model 27 | {%- call statement('main') -%} 28 | {{ log("Creating schema " ~ schema) }} 29 | {{ create_stmt_fromfile(sql) }} 30 | {%- endcall -%} 31 | 32 | 33 | 34 | -------------------------------------------------------------------------------------------------------------------- 35 | 36 | {{ run_hooks(post_hooks, inside_transaction=True) }} 37 | 38 | -- `COMMIT` happens here 39 | {{ adapter.commit() }} 40 | 41 | {{ run_hooks(post_hooks, inside_transaction=False) }} 42 | 43 | {{ return({'schema': [identifier] }) }} 44 | 45 | {%- endmaterialization %} 46 | -------------------------------------------------------------------------------- /plugins/snowflake/dbt/include/snowflake/macros/materializations/stage_definition_materialization.sql: -------------------------------------------------------------------------------- 1 | /* 2 | This materialization is used for creating stage objects. 3 | The idea behind this materialization is for ability to define CREATE STAGE statements and have DBT the necessary logic 4 | of deploying the table in a consistent manner and logic. 5 | 6 | */ 7 | {% materialization stage_definition, adapter='snowflake' %} 8 | {%- set full_refresh_mode = (flags.FULL_REFRESH == True) -%} 9 | {%- set identifier = model['alias'] -%} 10 | 11 | 12 | -------------------------------------------------------------------------------------------------------------------- 13 | 14 | -- setup 15 | {{ run_hooks(pre_hooks, inside_transaction=False) }} 16 | 17 | -- `BEGIN` happens here: 18 | {{ run_hooks(pre_hooks, inside_transaction=True) }} 19 | 20 | 21 | -------------------------------------------------------------------------------------------------------------------- 22 | 23 | -- build model 24 | {%- call statement('main') -%} 25 | {{ create_stage_stmt_fromfile(sql) }} 26 | {%- endcall -%} 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------------------------------------------- 31 | 32 | {{ run_hooks(post_hooks, inside_transaction=True) }} 33 | 34 | -- `COMMIT` happens here 35 | {{ adapter.commit() }} 36 | 37 | {{ run_hooks(post_hooks, inside_transaction=False) }} 38 | 39 | {{ return({'stage': [identifier] }) }} 40 | 41 | {%- endmaterialization %} 42 | 43 | 44 | -------------------------------------------------------------------------------- /plugins/snowflake/dbt/include/snowflake/macros/snowflake_helper_macros.sql: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | cloning a table relation 4 | */ 5 | {% macro clone_table_relation_if_exists(old_relation ,clone_relation) %} 6 | {% if old_relation is not none %} 7 | {{ log("Cloning existing relation " ~ old_relation ~ " as a backup with name " ~ clone_relation) }} 8 | {% call statement('clone_relation', auto_begin=False) -%} 9 | CREATE OR REPLACE TABLE {{ clone_relation }} 10 | CLONE {{ old_relation }} 11 | {%- endcall %} 12 | {% endif %} 13 | {% endmacro %} 14 | 15 | /* 16 | Backing up (Copy of) a (transient) table relation 17 | */ 18 | {% macro copyof_table_relation_if_exists(old_relation ,clone_relation) %} 19 | {% if old_relation is not none %} 20 | {{ log("Copying of existing relation " ~ old_relation ~ " as a backup with name " ~ clone_relation) }} 21 | {% call statement('clone_relation', auto_begin=False) -%} 22 | CREATE OR REPLACE TABLE {{ clone_relation }} 23 | AS SELECT * FROM {{ old_relation }} 24 | {%- endcall %} 25 | {% endif %} 26 | {% endmacro %} 27 | 28 | 29 | {%- macro create_table_stmt_fromfile(relation, sql) -%} 30 | {{ log("Creating table " ~ relation) }} 31 | 32 | {{ sql.upper() }} 33 | ; 34 | 35 | {%- endmacro -%} 36 | 37 | 38 | {%- macro create_stmt_fromfile(sql) -%} 39 | {{ sql.upper() }} 40 | ; 41 | 42 | {%- endmacro -%} 43 | 44 | {%- macro create_stage_stmt_fromfile(sql) -%} 45 | {{ log("Creating stage " ~ relation) }} 46 | {{ sql }} 47 | {%- endmacro -%} 48 | 49 | 50 | -------------------------------------------------------------------------------- /profiles.yml: -------------------------------------------------------------------------------- 1 | 2 | # For more information on how to configure this file, please see: 3 | # https://docs.getdbt.com/docs/profile 4 | 5 | #Ref : https://docs.getdbt.com/docs/configure-your-profile 6 | #dbt run --profiles-dir path/to/directory 7 | #export DBT_PROFILES_DIR=path/to/directory 8 | 9 | 10 | default: 11 | target: dev 12 | outputs: 13 | dev: 14 | type: snowflake 15 | account: "{{ env_var('SNOWSQL_ACCOUNT') }}" 16 | user: "{{ env_var('SNOWSQL_USER') }}" 17 | password: "{{ env_var('DBT_PASSWORD') }}" 18 | role: "{{ env_var('SNOWSQL_ROLE') }}" 19 | database: "{{ env_var('SNOWSQL_DATABASE') }}" 20 | warehouse: "{{ env_var('SNOWSQL_WAREHOUSE') }}" 21 | schema: PUBLIC 22 | threads: 1 23 | client_session_keep_alive: False 24 | --------------------------------------------------------------------------------