# Mazur's LookML Style Guide

Howdy! I'm [Matt Mazur](https://mattmazur.com/) and I'm a data analyst who has worked at several companies to help them use data to grow their businesses. I've been working in [Looker](http://looker.com/) almost daily since June 2017 and over time have developed various preferences for how to write clean, maintainable LookML. This guide is an attempt to document those preferences.

My goal with this guide is twofold: first, to share what I've learned so that it may help other LookML developers. Second, I'd love to get feedback from you all about how to improve how I write LookML. I'll incorporate your feedback here and give you credit where appropriate. You can find me on Twitter at [@mhmazur](https://twitter.com/mhmazur) (where I'll also be tweeting about updates to this guide) or by email at matthew.h.mazur@gmail.com.

If you're interested in this topic, you may also enjoy my [SQL Style Guide](https://github.com/mattm/sql-style-guide), my [Matt On Analytics](http://eepurl.com/dITJS9) newsletter, and [my blog](https://mattmazur.com/category/analytics/) where I write about analytics and data analysis.

## Guidelines

### All visible dimensions should have a description

It's tempting to omit descriptions for dimensions whose purpose seems obvious from the name, but what is obvious to you may not be obvious to other analysts. You can almost always add clarity by adding a meaningful description to a dimension.

```lookml
# Good
dimension: promotion_description {
  description: "A description of any promotion that was applied to this billing. For example, 'Plus Plan 25%' or 'Non-profit Discount'. A billing can only have 1 promotion max."
}

# Bad: A description that restates the dimension name
dimension: promotion_description {
  description: "A description of the promotion"
}

# Bad: No description
dimension: promotion_description {}
```

The only exception is dimensions that won't be visible to the user when performing analyses, for example primary or foreign keys that do not need to be exposed to users. For these types of dimensions, descriptions are optional.

### Remove unused fields and views

If you've taken advantage of Looker's "Create View from Table" feature, you may have wound up with a lot of unused views and dimensions. It's tempting to leave them in place because you may need them one day, but my preference is to remove them entirely. The cost of creating new views and dimensions is small compared to the cognitive overhead resulting from lots of unused views and hidden fields that aren't used in any analyses.

### Omit unnecessary parameters

When possible, I try to take advantage of Looker's default parameter values to minimize how much LookML I have to write. I think the LookML looks cleaner that way, even if it sometimes comes at the cost of being less explicit. A few examples:

Dimensions will default to referencing a column that matches the name of the dimension, so you can leave the `sql` parameter out in a lot of cases:

```lookml
# Good
dimension: is_first_billing {
  description: "..."
  type: yesno
}

# Bad
dimension: is_first_billing {
  description: "..."
  type: yesno
  sql: ${TABLE}.is_first_billing ;;
}
```

Similarly, there's no need to include a `label` most of the time because Looker will automatically title-case the dimension name:

```lookml
# Good
dimension: is_first_billing {
  description: "..."
  type: yesno
}

# Bad
dimension: is_first_billing {
  label: "Is First Billing"
  description: "..."
  type: yesno
}
```

There's also no need to include `type: string`, which is the default type:

```lookml
# Good
dimension: name {
  description: "..."
}

# Bad
dimension: name {
  description: "..."
  type: string
}
```

Another example is excluding `type: left_outer` from joins because left joining is the default behavior:

```lookml
# Good
explore: companies {
  join: billings {
    relationship: one_to_many
    sql_on: ${companies.company_id} = ${billings.company_id} ;;
  }
}

# Bad
explore: companies {
  join: billings {
    relationship: one_to_many
    type: left_outer
    sql_on: ${companies.company_id} = ${billings.company_id} ;;
  }
}
```

### Yes/No dimensions should be exposed as case/when string dimensions

Rather than use a dimension like `is_paying` in an analysis, it's better to hide it and make a derived `paying_status` string dimension available. This makes it abundantly clear what the yes and no values represent, especially when displayed in visualizations (pivoting a measure on a yes/no dimension and seeing Yes and No in a legend is often unclear).

```lookml
# Good: Hiding the yes/no dimension and creating a LookML case/when dimension based on it
dimension: is_paying {
  type: yesno
  hidden: yes
}

dimension: paying_status {
  description: "Whether the company is currently paying or non-paying"
  case: {
    when: {
      sql: ${is_paying} ;;
      label: "Paying"
    }
    else: "Non-Paying"
  }
}

# Bad: Using SQL to determine the string values

# Use a LookML case/when so that Looker displays a dropdown that users can select from when filtering.
# Otherwise Looker will simply display a text field, which is harder to use.

dimension: is_paying {
  type: yesno
  hidden: yes
}

dimension: paying_status {
  description: "Whether the company is currently paying or non-paying"
  sql: if(${is_paying}, "Paying", "Non-Paying") ;;
}

# Bad: Exposing the yes/no dimension to end-users
dimension: is_paying {
  description: "Whether the company is currently a paying customer"
  type: yesno
}
```

### Alphabetize dimensions, then alphabetize measures

Looker doesn't care about the order of the fields within a view, but it's easier to find specific fields if you list dimensions alphabetically, then list measures alphabetically:

```lookml
# Good
view: companies {
  dimension: has_closed {
    description: "..."
    type: yesno
  }

  dimension: id {
    description: "..."
    type: number
  }

  dimension: name {
    description: "..."
  }

  measure: closed_count {
    description: "..."
    type: count
    filters: {
      field: has_closed
      value: "yes"
    }
  }

  measure: company_count {
    description: "..."
    type: count
  }
}

# Bad
view: companies {
  measure: closed_count {
    description: "..."
    type: count
    filters: {
      field: has_closed
      value: "yes"
    }
  }

  dimension: id {
    description: "..."
    type: number
  }

  dimension: name {
    description: "..."
  }

  measure: company_count {
    description: "..."
    type: count
  }

  dimension: has_closed {
    description: "..."
    type: yesno
  }
}
```

### Every view should have a primary key defined

It's [required](https://docs.looker.com/data-modeling/learning-lookml/working-with-joins#primary_keys_required) for [symmetric aggregates](https://discourse.looker.com/t/symmetric-aggregates/261) to work correctly.

```lookml
# Good
view: companies {
  dimension: company_id {
    description: "..."
    primary_key: yes
  }
}
```

If the table you're working with does not have a primary key, you can always create a primary key dimension by concatenating other columns together as outlined in [the docs](https://docs.looker.com/reference/field-params/primary_key). Ideally though you would transform the table prior to analyzing it in Looker using a tool like [dbt](http://getdbt.com/) and create a primary key there if needed, eliminating the need to concatenate fields in Looker (more on dbt in a future tip).

### Naming count measures

Use a [singular noun](https://twitter.com/jgkite/status/1171845537311707136) representing whatever thing you're measuring suffixed with `_count`. For example, `company_count`, `user_count`, `beacon_count`, `purchase_count`, etc. Avoid the default `count` measure name that's created when you use Looker's "Create View From Table" feature. This helps avoid ambiguity that might arise when people see just "Count" in analyses and visualizations.
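
As a rough sketch of where that ambiguity can come from, assume a hypothetical `users` view joined to `companies` (the `users` view, its `user_id` and `company_id` columns, and the join itself are illustrative assumptions, not taken from a real model). If both views kept the generated `count` measure, the explore would contain two measures with the same field name; naming each one after the thing it counts keeps them distinguishable:

```lookml
# Sketch only: `users` and its columns are hypothetical.
# Views and explores usually live in separate files; they are shown together for brevity.
view: companies {
  dimension: company_id {
    description: "..."
    primary_key: yes
  }

  measure: company_count {
    description: "..."
    type: count
  }
}

view: users {
  dimension: company_id {
    hidden: yes
  }

  dimension: user_id {
    description: "..."
    primary_key: yes
  }

  measure: user_count {
    description: "..."
    type: count
  }
}

explore: companies {
  join: users {
    relationship: one_to_many
    sql_on: ${companies.company_id} = ${users.company_id} ;;
  }
}
```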

```lookml
# Good
view: companies {
  measure: company_count {
    type: count
  }
}

# Bad
view: companies {
  measure: count {
    type: count
  }
}
```

### Naming sum measures

For sums, prefix the measure name with `total_`. For example, `total_payments`, `total_taxes`, etc.:

```lookml
# Good
view: payments {
  measure: total_payments {
    type: sum
    sql: ${TABLE}.amount ;;
  }
}
```

One exception is when you're summing a column that already represents a count. For example, if you have a `companies` table with a `users` column that represents the number of users a company has, you should name the measure like you would a count, because the measure represents the number of users even though it's using a sum behind the scenes.

```lookml
# Good
view: companies {
  measure: user_count {
    type: sum
    sql: ${TABLE}.users ;;
  }
}

# Bad
view: companies {
  measure: total_users {
    type: sum
    sql: ${TABLE}.users ;;
  }
}
```

### Naming time-based dimensions

If you use Looker's "Create View From Table" feature to import a table that includes a column like `created_at`, Looker will create the dimension like so:

```lookml
# OK
dimension_group: created {
  type: time
  timeframes: [
    raw,
    time,
    date,
    week,
    month,
    quarter,
    year
  ]
  sql: ${TABLE}.created_at ;;
}
```

In an explore, you'll wind up with "Created Date", "Created Week", etc.

This is alright, but because I advocate for [omitting all unnecessary parameters](https://github.com/mattm/lookml-style-guide#omit-unnecessary-parameters), I prefer to name the dimension group after the column so that the `sql` parameter can be dropped.
Otherwise, dimension groups would be the only fields with a `sql` parameter (the other dimensions omit it because the dimension name matches the column name).

```lookml
# Good
dimension_group: created_at {
  label: "Created"
  type: time
  timeframes: [
    raw,
    time,
    date,
    week,
    month,
    quarter,
    year
  ]
}
```

The dimension will appear the same in an explore ("Created Date", "Created Week", etc.), but your LookML will be cleaner because you've omitted the `sql` parameter like all of the other dimensions. As an added benefit, if you wind up referencing this dimension elsewhere, you'll reference something like `created_at_date`, which makes it clear that the date is being derived from a date+time column (because the [column name is suffixed](https://github.com/mattm/sql-style-guide#column-name-conventions) with `_at`).

### Pay attention to white space

* Include a single space after colons (`type: sum`)
* Include a single space before double semicolons (`sql: ${TABLE}.amount ;;`)
* Include a space between the field name and the opening curly brace (`measure: total_payments {`)

```lookml
# Good
measure: total_payments {
  type: sum
  sql: ${TABLE}.amount ;;
}

# Bad
measure:total_payments{
  type: sum
  sql:${TABLE}.amount;;
}
```

## Credits

A lot of these preferences were developed in conjunction with others during my time at Help Scout and Automattic. Huge thanks in particular to Eli Overbey, Simon Ouderkirk, Anna Elek, and Jen Wilson for the many discussions that have influenced my thinking on writing LookML.