├── .gitignore ├── LICENSE.md ├── README.md ├── analysis └── periscope_dashboards │ ├── README.md │ ├── cumulative_spend_to_date.sql │ ├── month_to_date.sql │ ├── monthly_spend_by_warehouse.sql │ ├── mtd_spend_by_warehouse.sql │ ├── periscope_snowflake_spend.png │ ├── snowflake_spend.sql │ └── year_to_date.sql ├── dbt_project.yml ├── models ├── base │ ├── schema.yml │ ├── snowflake_amortized_rates.sql │ ├── snowflake_docs.md │ ├── snowflake_query_history.sql │ ├── snowflake_warehouse_metering.sql │ └── sources.yml └── xf │ ├── schema.yml │ └── snowflake_warehouse_metering_xf.sql └── packages.yml /.gitignore: -------------------------------------------------------------------------------- 1 | /target 2 | /dbt_modules 3 | /logs -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright GitLab B.V. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in 13 | all copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 21 | THE SOFTWARE. 22 | 23 | For all third party components incorporated into the GitLab Software, those 24 | components are licensed under the original license provided by the owner of the 25 | applicable component. 26 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Snowflake Spend dbt Package 2 | 3 | This is a [dbt](http://getdbt.com) package for understanding the cost your [Snowflake Data Warehouse](https://www.snowflake.com) is accruing. 4 | 5 | To get started with this package, you will need to have access to the appropriate databases. 6 | 7 | To grant appropriate roles for these tables the following command was run: 8 | ``` 9 | GRANT IMPORTED PRIVILEGES ON DATABASE snowflake TO ROLE ; 10 | ``` 11 | 12 | [Learn more about the appropriate permissions](https://docs.snowflake.net/manuals/user-guide/data-share-consumers.html#granting-privileges-on-a-shared-database). 13 | 14 | dbt has great [docs on package management](https://docs.getdbt.com/docs/package-management). 15 | We are working to get this package on the [dbt hub site](http://hub.getdbt.com). 16 | In the mean time, you can install it using the git package syntax, which the GitLab data team uses in our [`packages.yml`](https://gitlab.com/gitlab-data/analytics/blob/master/transform/snowflake-dbt/packages.yml) file 17 | 18 | ``` 19 | packages: 20 | - git: https://gitlab.com/gitlab-data/snowflake_spend.git 21 | revision: v1.2.0 22 | ``` 23 | 24 | You will need to update your `dbt_project.yml` to enable this package. 25 | You can see [how the GitLab data team has this configured](https://gitlab.com/gitlab-data/analytics/blob/master/transform/snowflake-dbt/dbt_project.yml). 26 | 27 | ``` 28 | snowflake_spend: 29 | enabled: true 30 | ``` 31 | 32 | You will need a csv called `snowflake_contract_rates.csv` which has two columns: effective date and rate. The effective date is the day the new contracted rate started and it should be in YYYY-MM-DD format. The rate is the per credit price for the given time period. You can see how the data team configures [their csv file](https://gitlab.com/gitlab-data/analytics/blob/master/transform/snowflake-dbt/data/snowflake_contract_rates.csv). You will need to run `dbt seed` for the csv to be loaded as a table and for the model to run succesfully. 33 | 34 | These models are documented and tested. 35 | If you'd like to see what these look like live, you can see them in [the GitLab Data Team's public dbt docs](https://dbt.gitlabdata.com/#!/model/model.snowflake_spend.snowflake_amortized_rates). 36 | 37 | Maintainers of this projects are @m_walker and @jjstark. 38 | Reviewers are @mpeychet_. 39 | 40 | We include sample Sisense (formerly Periscope) dashboards in the `/analytics` folder. 41 | 42 | This dbt package is made available by the GitLab Data Team under an MIT License. 43 | 44 | ## Troubleshooting 45 | 46 | **Error: _Found duplicate project dbt_utils. This occurs when a dependency has the same project name as some other dependency._** 47 | 48 | You are most likely referencing dbt-utils using the git/revision syntax. Use the dbt Hub package syntax instead in your `packages.yml`, eg: 49 | 50 | ``` 51 | packages: 52 | 53 | # Avoid this 54 | - git: "https://github.com/fishtown-analytics/dbt-utils.git" 55 | revision: 0.2.1 56 | 57 | # Use this instead 58 | - package: fishtown-analytics/dbt_utils 59 | version: 0.2.1 60 | ``` 61 | 62 | ## How this Package Gets Released -- For Maintainers Only 63 | 64 | In order to cut a new release of this package: 65 | 1. Create a new tag at the commit that is to be released. Incrementing either the major version, minor version, or bug version of the previously released tag. The new tag name should follow the pattern `v#.#.#`. Push the tag to origin/master, if the tag is created locally. 66 | 1. With a GitLab API-enabled private token create a new release with a command similar to this: 67 | 68 | ``` 69 | curl --header 'Content-Type: application/json' --header "PRIVATE-TOKEN: " \ 70 | --data '{ "name": "dbt: snowflake_spend #.#.#", "tag_name": "v#.#.#", "description": "Initial tagged release"}' \ 71 | --request POST https://gitlab.com/api/v4/projects/12955687/releases 72 | ``` 73 | 74 | 1. Update the release notes by going to your [tag](https://gitlab.com/gitlab-data/snowflake_spend/-/tags) and click on the "edit release notes" pencil. The release notes should follow the same general form as the notes for v1.1.0 and include a bulleted summary of merged MRs and a thank you to contributors. 75 | 1. Create a blog post like [this](https://about.gitlab.com/blog/2020/04/08/snowflake-spend-dbt-package-release/) with all of the details of the release. 76 | -------------------------------------------------------------------------------- /analysis/periscope_dashboards/README.md: -------------------------------------------------------------------------------- 1 | Below is the Periscope dashboard that the GitLab data team uses to monitor our Snowflake Spend. 2 | 3 | The underlying queries are available in this folder. 4 | When you run `dbt compile`, the `{ref('')}` statements will get populated with your schema and table name. 5 | 6 | We welcome suggestions for improvement. 7 | 8 | ![periscope_snowflake_spend.png](periscope_snowflake_spend.png) 9 | -------------------------------------------------------------------------------- /analysis/periscope_dashboards/cumulative_spend_to_date.sql: -------------------------------------------------------------------------------- 1 | SELECT usage_month, 2 | date_part('dayofmonth', end_time) as spend_day, 3 | sum(dollars_spent) as spend, 4 | sum(spend) over(partition by usage_month order by spend_day rows between UNBOUNDED PRECEDING and current row) as cumulative 5 | FROM {{ref('snowflake_warehouse_metering_xf')}} 6 | WHERE date_trunc('year', usage_month)::date = date_trunc('year', CURRENT_TIMESTAMP)::date 7 | GROUP BY 1, 2 8 | ORDER BY 1 DESC 9 | -------------------------------------------------------------------------------- /analysis/periscope_dashboards/month_to_date.sql: -------------------------------------------------------------------------------- 1 | SELECT sum(dollars_spent) AS dollars_spend 2 | FROM {{ref('snowflake_warehouse_metering_xf')}} 3 | WHERE usage_month = date_trunc('month', CURRENT_TIMESTAMP)::date 4 | -------------------------------------------------------------------------------- /analysis/periscope_dashboards/monthly_spend_by_warehouse.sql: -------------------------------------------------------------------------------- 1 | SELECT usage_month, 2 | warehouse_name, 3 | sum(dollars_spent) as spend 4 | FROM {{ref('snowflake_warehouse_metering_xf')}} 5 | WHERE [usage_month=daterange] 6 | AND usage_month < date_trunc('month', CURRENT_TIMESTAMP)::date 7 | AND [warehouse_name=warehouse_name] 8 | GROUP BY 1, 2 9 | ORDER BY 1 DESC 10 | -------------------------------------------------------------------------------- /analysis/periscope_dashboards/mtd_spend_by_warehouse.sql: -------------------------------------------------------------------------------- 1 | SELECT warehouse_name, 2 | sum(dollars_spent) as spend 3 | FROM {{ref('snowflake_warehouse_metering_xf')}} 4 | WHERE usage_month = date_trunc('month', CURRENT_TIMESTAMP)::date 5 | AND [warehouse_name=warehouse_name] 6 | GROUP BY 1 7 | ORDER BY 1 DESC 8 | -------------------------------------------------------------------------------- /analysis/periscope_dashboards/periscope_snowflake_spend.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gitlabhq/snowflake_spend/e49af81ebcad0f7958b4d957682ca575bc45d1ff/analysis/periscope_dashboards/periscope_snowflake_spend.png -------------------------------------------------------------------------------- /analysis/periscope_dashboards/snowflake_spend.sql: -------------------------------------------------------------------------------- 1 | SELECT usage_month, 2 | sum(dollars_spent) AS spend 3 | FROM {{ref('snowflake_warehouse_metering_xf')}} 4 | WHERE [usage_month=daterange] 5 | AND usage_month < date_trunc('month', CURRENT_TIMESTAMP)::DATE 6 | AND [warehouse_name=warehouse_name] 7 | GROUP BY 1 8 | ORDER BY 1 DESC 9 | -------------------------------------------------------------------------------- /analysis/periscope_dashboards/year_to_date.sql: -------------------------------------------------------------------------------- 1 | SELECT sum(dollars_spent) AS dollars_spent 2 | FROM {{ref('snowflake_warehouse_metering_xf')}} 3 | WHERE date_trunc('year', usage_month) = date_trunc('year', CURRENT_TIMESTAMP)::date 4 | -------------------------------------------------------------------------------- /dbt_project.yml: -------------------------------------------------------------------------------- 1 | name: 'snowflake_spend' 2 | version: '1.0' 3 | config-version: 2 4 | require-dbt-version: ">=0.17.0" 5 | 6 | model-paths: ["models"] # paths with source code to compile 7 | analysis-paths: ["analysis"] # path with analysis files which are compiled, but not run 8 | target-path: "target" # path for compiled code 9 | clean-targets: ["target"] # directories removed by the clean task 10 | test-paths: ["test"] # where to store test results 11 | seed-paths: ["data"] # load CSVs from this directory with `dbt seed` -------------------------------------------------------------------------------- /models/base/schema.yml: -------------------------------------------------------------------------------- 1 | version: 2 2 | 3 | models: 4 | - name: snowflake_amortized_rates 5 | description: '{{ doc("snowflake_amortized_rates") }}' 6 | columns: 7 | - name: effective_start_date 8 | tests: 9 | - not_null 10 | - name: effective_end_date 11 | - name: rate 12 | tests: 13 | - not_null 14 | - name: date_day 15 | tests: 16 | - not_null 17 | - unique 18 | - name: snowflake_query_history 19 | description: '{{ doc("snowflake_query_history") }}' 20 | columns: 21 | - name: snowflake_query_id 22 | tests: 23 | - not_null 24 | - unique 25 | - name: database_id 26 | - name: snowflake_session_id 27 | - name: warehouse_id 28 | - name: database_name 29 | - name: query_text 30 | tests: 31 | - not_null 32 | - name: snowflake_role_name 33 | - name: snowflake_user_name 34 | tests: 35 | - not_null 36 | - name: query_result_rows_produced 37 | description: 'The number of rows produced as a result of the query output. This is a good proxy of the vol of output' 38 | - name: schema_name 39 | - name: warehouse_name 40 | - name: query_end_time 41 | tests: 42 | - not_null 43 | - name: query_start_time 44 | tests: 45 | - not_null 46 | - name: query_spillover_local 47 | - name: query_spillover_remote 48 | - name: query_bytes_scanned 49 | - name: snowflake_warehouse_metering 50 | description: '{{ doc("snowflake_warehouse_metering") }}' 51 | columns: 52 | - name: warehouse_id 53 | - name: warehouse_name 54 | - name: start_time 55 | - name: end_time 56 | - name: credits_used 57 | -------------------------------------------------------------------------------- /models/base/snowflake_amortized_rates.sql: -------------------------------------------------------------------------------- 1 | WITH contract_rates AS ( 2 | 3 | SELECT * 4 | FROM {{ ref('snowflake_contract_rates')}} 5 | 6 | ), contract_rate_rework AS ( 7 | 8 | SELECT 9 | effective_date AS effective_start_date, 10 | DATEADD(day, -1, LEAD(effective_date, 1, '2059-01-01') OVER ( 11 | ORDER BY effective_date ASC 12 | )) AS effective_end_date, 13 | rate 14 | FROM contract_rates 15 | 16 | ), date_spine AS ( 17 | 18 | {{ dbt_utils.date_spine( 19 | start_date="to_date('11/01/2009', 'mm/dd/yyyy')", 20 | datepart="day", 21 | end_date="dateadd(year, 40, current_date)" 22 | ) 23 | }} 24 | 25 | ), date_details AS ( 26 | 27 | SELECT 28 | date_day, 29 | DATE_PART(month, date_day) AS month_actual, 30 | DATE_PART(year, date_day) AS year_actual, 31 | FIRST_VALUE(date_day) OVER ( 32 | PARTITION BY year_actual, month_actual ORDER BY date_day 33 | ) AS first_day_of_month 34 | FROM date_spine 35 | 36 | ), rate_amortized AS ( 37 | 38 | SELECT 39 | contract_rate_rework.*, 40 | date_details.date_day 41 | FROM contract_rate_rework 42 | LEFT JOIN date_details 43 | ON date_details.date_day >= contract_rate_rework.effective_start_date 44 | AND date_details.date_day <= contract_rate_rework.effective_end_date 45 | 46 | ) 47 | 48 | SELECT * 49 | FROM rate_amortized 50 | -------------------------------------------------------------------------------- /models/base/snowflake_docs.md: -------------------------------------------------------------------------------- 1 | {% docs snowflake_warehouse_metering %} 2 | 3 | This is the base model for the Snowflake Warehouse Metering History table https://docs.snowflake.net/manuals/sql-reference/account-usage/warehouse_metering_history.html. 4 | 5 | {% enddocs %} 6 | 7 | 8 | {% docs snowflake_query_history %} 9 | 10 | This is the base model for the Snowflake Query History table https://docs.snowflake.com/en/sql-reference/functions/query_history.html. 11 | 12 | {% enddocs %} 13 | 14 | {% docs snowflake_amortized_rates %} 15 | 16 | This builds on top of seeded data and results in a model where you have the rate per day. 17 | 18 | {% enddocs %} 19 | -------------------------------------------------------------------------------- /models/base/snowflake_query_history.sql: -------------------------------------------------------------------------------- 1 | {{ 2 | config( 3 | materialized='incremental', 4 | unique_key='snowflake_query_id' 5 | ) 6 | }} 7 | 8 | 9 | WITH source AS ( 10 | 11 | SELECT * 12 | FROM {{ source('snowflake','query_history') }} 13 | QUALIFY ROW_NUMBER() OVER (PARTITION BY query_id ORDER BY query_id) = 1 14 | 15 | ), renamed AS ( 16 | 17 | SELECT 18 | query_id AS snowflake_query_id, 19 | 20 | -- Foreign Keys 21 | database_id AS database_id, 22 | schema_id AS schema_id, 23 | session_id AS snowflake_session_id, 24 | warehouse_id AS warehouse_id, 25 | 26 | -- Logical Info 27 | database_name AS database_name, 28 | query_text AS query_text, 29 | role_name AS snowflake_role_name, 30 | rows_produced AS query_result_rows_produced, 31 | schema_name AS schema_name, 32 | user_name AS snowflake_user_name, 33 | warehouse_name AS warehouse_name, 34 | 35 | -- metadata 36 | end_time AS query_end_time, 37 | start_time AS query_start_time, 38 | bytes_spilled_to_local_storage AS query_bytes_spillover_local, 39 | bytes_spilled_to_remote_storage AS query_bytes_spillover_remote, 40 | bytes_scanned AS query_bytes_scanned 41 | 42 | FROM source 43 | {% if is_incremental() %} 44 | 45 | -- this filter will only be applied on an incremental run 46 | WHERE query_start_time > (SELECT MAX(query_start_time) FROM {{ this }}) 47 | 48 | {% endif %} 49 | 50 | ) 51 | 52 | SELECT * 53 | FROM renamed -------------------------------------------------------------------------------- /models/base/snowflake_warehouse_metering.sql: -------------------------------------------------------------------------------- 1 | WITH base AS ( 2 | 3 | SELECT * 4 | FROM {{ source('snowflake','warehouse_metering_history') }} 5 | 6 | ) 7 | 8 | SELECT 9 | warehouse_id, 10 | warehouse_name, 11 | start_time, 12 | end_time, 13 | credits_used 14 | FROM base 15 | -------------------------------------------------------------------------------- /models/base/sources.yml: -------------------------------------------------------------------------------- 1 | version: 2 2 | 3 | sources: 4 | - name: snowflake 5 | database: snowflake 6 | schema: account_usage 7 | 8 | quoting: 9 | database: false 10 | schema: false 11 | identifier: false 12 | 13 | tables: 14 | - name: query_history 15 | - name: warehouse_metering_history 16 | -------------------------------------------------------------------------------- /models/xf/schema.yml: -------------------------------------------------------------------------------- 1 | version: 2 2 | 3 | models: 4 | - name: snowflake_warehouse_metering_xf 5 | description: This is the transformed model for the Snowflake Warehouse Metering History table https://docs.snowflake.net/manuals/sql-reference/account-usage/warehouse_metering_history.html. 6 | columns: 7 | - name: warehouse_id 8 | - name: warehouse_name 9 | - name: start_time 10 | - name: end_time 11 | - name: credits_used 12 | - name: usage_month 13 | description: Uses the end date to define the month the usage occurred. 14 | - name: usage_length 15 | description: Determines the number of hours between the start and end times. 16 | - name: credits_used 17 | - name: dollars_spent 18 | description: Calculates the dollar value based on the credits consumed. -------------------------------------------------------------------------------- /models/xf/snowflake_warehouse_metering_xf.sql: -------------------------------------------------------------------------------- 1 | WITH base AS ( 2 | 3 | SELECT * 4 | FROM {{ ref('snowflake_warehouse_metering') }} 5 | 6 | ), contract_rates AS ( 7 | 8 | SELECT * 9 | FROM {{ ref('snowflake_amortized_rates') }} 10 | 11 | ), usage AS ( 12 | 13 | SELECT 14 | warehouse_id, 15 | warehouse_name, 16 | 17 | start_time, 18 | end_time, 19 | DATE_TRUNC('month', end_time)::DATE AS usage_month, 20 | DATE_TRUNC('day', end_time)::DATE AS usage_day, 21 | DATEDIFF(hour, start_time, end_time) AS usage_length, 22 | contract_rates.rate AS credit_rate, 23 | ROUND(credits_used * contract_rates.rate, 2) AS dollars_spent 24 | FROM base 25 | LEFT JOIN contract_rates 26 | ON DATE_TRUNC('day', end_time) = contract_rates.date_day 27 | 28 | ) 29 | 30 | SELECT * 31 | FROM usage 32 | -------------------------------------------------------------------------------- /packages.yml: -------------------------------------------------------------------------------- 1 | packages: 2 | - package: dbt-labs/dbt_utils 3 | version: '>=0.1.25' 4 | --------------------------------------------------------------------------------