├── .gitignore ├── LICENSE ├── MANIFEST.in ├── Makefile ├── README.md ├── ezgif-3-a935fdcb7415.gif ├── notion ├── __init__.py ├── block.py ├── client.py ├── collection.py ├── logger.py ├── maps.py ├── markdown.py ├── monitor.py ├── operations.py ├── records.py ├── settings.py ├── smoke_test.py ├── space.py ├── store.py ├── user.py └── utils.py ├── requirements.txt ├── run_smoke_test.py └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | __pycache__ 3 | build/ 4 | dist/ 5 | *.egg-info 6 | *_testing.py 7 | 8 | # pipenv 9 | .env 10 | .venv 11 | Pipfile 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2018 Jamie Alexandre 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy 4 | of this software and associated documentation files (the "Software"), to deal 5 | in the Software without restriction, including without limitation the rights 6 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 7 | copies of the Software, and to permit persons to whom the Software is 8 | furnished to do so, subject to the following conditions: 9 | 10 | The above copyright notice and this permission notice shall be included in all 11 | copies or substantial portions of the Software. 12 | 13 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 14 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 15 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 16 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 17 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 18 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 19 | SOFTWARE. -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include requirements.txt 2 | include LICENSE 3 | include README.md 4 | 5 | global-exclude *.pyc 6 | global-exclude __pycache__ -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | 2 | build: clean 3 | python3 setup.py sdist bdist_wheel 4 | 5 | release: build 6 | twine upload -s dist/* 7 | 8 | clean: 9 | find . -name "*.pyc" -print0 | xargs -0 rm -rf 10 | rm -rf build 11 | rm -rf dist 12 | rm -rf notion.egg-info 13 | 14 | install: 15 | python setup.py install -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # notion-py 2 | 3 | Unofficial Python 3 client for Notion.so API v3. 
4 | 5 | - Object-oriented interface (mapping database tables to Python classes/attributes) 6 | - Automatic conversion between internal Notion formats and appropriate Python objects 7 | - Local cache of data in a unified data store *(Note: disk cache now disabled by default; to enable, add `enable_caching=True` when initializing `NotionClient`)* 8 | - Real-time reactive two-way data binding (changing Python object -> live updating of Notion UI, and vice-versa) *(Note: Notion->Python automatic updating is currently broken and hence disabled by default; call `my_block.refresh()` to update, in the meantime, while monitoring is being fixed)* 9 | - Callback system for responding to changes in Notion (e.g. for triggering actions, updating another API, etc) 10 | 11 | ![](https://raw.githubusercontent.com/jamalex/notion-py/master/ezgif-3-a935fdcb7415.gif) 12 | 13 | [Read more about Notion and Notion-py on Jamie's blog](https://medium.com/@jamiealexandre/introducing-notion-py-an-unofficial-python-api-wrapper-for-notion-so-603700f92369) 14 | 15 | # Usage 16 | 17 | ## Quickstart 18 | 19 | Note: the latest version of **notion-py** requires Python 3.5 or greater. 20 | 21 | `pip install notion` 22 | 23 | ```Python 24 | from notion.client import NotionClient 25 | 26 | # Obtain the `token_v2` value by inspecting your browser cookies on a logged-in (non-guest) session on Notion.so 27 | client = NotionClient(token_v2="") 28 | 29 | # Replace this URL with the URL of the page you want to edit 30 | page = client.get_block("https://www.notion.so/myorg/Test-c0d20a71c0944985ae96e661ccc99821") 31 | 32 | print("The old title is:", page.title) 33 | 34 | # Note: You can use Markdown! We convert on-the-fly to Notion's internal formatted text data structure. 35 | page.title = "The title has now changed, and has *live-updated* in the browser!" 36 | ``` 37 | 38 | ## Concepts and notes 39 | 40 | - We map tables in the Notion database into Python classes (subclassing `Record`), with each instance of a class representing a particular record. Some fields from the records (like `title` in the example above) have been mapped to model properties, allowing for easy, instantaneous read/write of the record. Other fields can be read with the `get` method, and written with the `set` method, but then you'll need to make sure to match the internal structures exactly. 41 | - The tables we currently support are **block** (via [`Block` class and its subclasses](https://github.com/jamalex/notion-py/blob/c65c9b14ed5dcd6d9326264f2e888ab343d2b831/notion/block.py#L143), corresponding to different `type` of blocks), **space** (via [`Space` class](https://github.com/jamalex/notion-py/blob/c65c9b14ed5dcd6d9326264f2e888ab343d2b831/notion/space.py#L6)), **collection** (via [`Collection` class](https://github.com/jamalex/notion-py/blob/c65c9b14ed5dcd6d9326264f2e888ab343d2b831/notion/collection.py#L91)), **collection_view** (via [`CollectionView` and subclasses](https://github.com/jamalex/notion-py/blob/c65c9b14ed5dcd6d9326264f2e888ab343d2b831/notion/collection.py#L175)), and **notion_user** (via [`User` class](https://github.com/jamalex/notion-py/blob/master/notion/user.py)). 42 | - Data for all tables are stored in a central [`RecordStore`](https://github.com/jamalex/notion-py/blob/c65c9b14ed5dcd6d9326264f2e888ab343d2b831/notion/store.py#L69), with the `Record` instances not storing state internally, but always referring to the data in the central `RecordStore`. 
Many API operations return updated versions of a large number of associated records, which we use to update the store, so the data in `Record` instances may sometimes update without being explicitly requested. You can also call the `refresh` method on a `Record` to trigger an update, or pass `force_update` to methods like `get`. 43 | - The API doesn't have strong validation of most data, so be careful to maintain the structures Notion is expecting. You can view the full internal structure of a record by calling `myrecord.get()` with no arguments. 44 | - When you call `client.get_block`, you can pass in either an ID, or the URL of a page. Note that pages themselves are just `blocks`, as are all the chunks of content on the page. You can get the URL for a block within a page by clicking "Copy Link" in the context menu for the block, and pass that URL into `get_block` as well. 45 | 46 | ## Updating records 47 | 48 | We keep a local cache of all data that passes through. When you reference an attribute on a `Record`, we first look to that cache to retrieve the value. If the value isn't found in the cache, it's retrieved from the server. You can also manually refresh the data for a `Record` by calling the `refresh` method on it. If we instantiate `NotionClient` with `monitor=True` (and start monitoring, either by passing `start_monitoring=True` or by calling `client.start_monitoring()`), we also [subscribe to long-polling updates](https://github.com/jamalex/notion-py/blob/master/notion/monitor.py) for any instantiated `Record`, so the local cache data for these `Records` should be automatically live-updated shortly after any data changes on the server. The long-polling happens in a background daemon thread. 49 | 50 | ## Example: Traversing the block tree 51 | 52 | ```Python 53 | for child in page.children: 54 | print(child.title) 55 | 56 | print("Parent of {} is {}".format(page.id, page.parent.id)) 57 | ``` 58 | 59 | ## Example: Adding a new node 60 | 61 | ```Python 62 | from notion.block import TodoBlock 63 | 64 | newchild = page.children.add_new(TodoBlock, title="Something to get done") 65 | newchild.checked = True 66 | ``` 67 | 68 | ## Example: Deleting nodes 69 | 70 | ```Python 71 | # soft-delete 72 | page.remove() 73 | 74 | # hard-delete 75 | page.remove(permanently=True) 76 | ``` 77 | 78 | ## Example: Create an embedded content type (iframe, video, etc) 79 | 80 | ```Python 81 | from notion.block import VideoBlock 82 | 83 | video = page.children.add_new(VideoBlock, width=200) 84 | # sets "property.source" to the URL, and "format.display_source" to the embedly-converted URL 85 | video.set_source_url("https://www.youtube.com/watch?v=oHg5SJYRHA0") 86 | ``` 87 | 88 | ## Example: Create a new embedded collection view block 89 | 90 | ```Python 91 | collection = client.get_collection(COLLECTION_ID) # get an existing collection 92 | cvb = page.children.add_new(CollectionViewBlock, collection=collection) 93 | view = cvb.views.add_new(view_type="table") 94 | 95 | # Before the view can be browsed in Notion, 96 | # the filters and format options on the view should be set as desired. 97 | # 98 | # for example: 99 | # view.set("query", ...) 100 | # view.set("format.board_groups", ...) 101 | # view.set("format.board_properties", ...) 
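# (A practical way to see what these values should look like: configure a similar
# view by hand in the Notion UI, then inspect its full internal structure with
# `view.get()`, as described in "Concepts and notes" above.)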
102 | ``` 103 | 104 | ## Example: Moving blocks around 105 | 106 | ```Python 107 | # move my block to after the video 108 | my_block.move_to(video, "after") 109 | 110 | # move my block to the end of otherblock's children 111 | my_block.move_to(otherblock, "last-child") 112 | 113 | # (you can also use "before" and "first-child") 114 | ``` 115 | 116 | ## Example: Subscribing to updates 117 | 118 | *(Note: Notion->Python automatic updating is currently broken and hence disabled by default; call `my_block.refresh()` to update, in the meantime, while monitoring is being fixed)* 119 | 120 | We can "watch" a `Record` so that we get a [callback](https://github.com/jamalex/notion-py/blob/master/notion/store.py) whenever it changes. Combined with the live-updating of records based on long-polling, this allows for a "reactive" design, where actions in our local application can be triggered in response to interactions with the Notion interface. 121 | 122 | ```Python 123 | 124 | # define a callback (note: all arguments are optional, just include the ones you care about) 125 | def my_callback(record, difference): 126 | print("The record's title is now:", record.title) 127 | print("Here's what was changed:") 128 | print(difference) 129 | 130 | # register the callback on the block we want to watch 131 | my_block.add_callback(my_callback) 132 | ``` 133 | 134 | ## Example: Working with databases, aka "collections" (tables, boards, etc) 135 | 136 | Here's how things fit together: 137 | - Main container block: `CollectionViewBlock` (inline) / `CollectionViewPageBlock` (full-page) 138 | - `Collection` (holds the schema, and is parent to the database rows themselves) 139 | - `CollectionRowBlock` 140 | - `CollectionRowBlock` 141 | - ... (more database records) 142 | - `CollectionView` (holds filters/sort/etc about each specific view) 143 | 144 | Note: For convenience, we automatically map the database "columns" (aka properties), based on the schema defined in the `Collection`, into getter/setter attributes on the `CollectionRowBlock` instances. The attribute name is a "slugified" version of the name of the column. So if you have a column named "Estimated value", you can read and write it via `myrowblock.estimated_value`. Some basic validation may be conducted, and the value will be converted into the appropriate internal format. For columns of type "Person", we expect a `User` instance, or a list of them, and for a "Relation" we expect a single instance, or a list of instances, of a subclass of `Block`. 
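If you're not sure which slugified attribute name a column will get, you can list the collection's schema directly. Here's a minimal sketch (it assumes you already have a collection view `cv`, as obtained in the example just below):

```Python
# Print the slug and type of each column ("property") defined in the collection's schema
for prop in cv.collection.get_schema_properties():
    print(prop["slug"], prop["type"])
```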
145 | 146 | ```Python 147 | # Access a database using the URL of the database page or the inline block 148 | cv = client.get_collection_view("https://www.notion.so/myorg/8511b9fc522249f79b90768b832599cc?v=8dee2a54f6b64cb296c83328adba78e1") 149 | 150 | # List all the records with "Bob" in them 151 | for row in cv.collection.get_rows(search="Bob"): 152 | print("We estimate the value of '{}' at {}".format(row.name, row.estimated_value)) 153 | 154 | # Add a new record 155 | row = cv.collection.add_row() 156 | row.name = "Just some data" 157 | row.is_confirmed = True 158 | row.estimated_value = 399 159 | row.files = ["https://www.birdlife.org/sites/default/files/styles/1600/public/slide.jpg"] 160 | row.person = client.current_user 161 | row.tags = ["A", "C"] 162 | row.where_to = "https://learningequality.org" 163 | 164 | # Run a filtered/sorted query using a view's default parameters 165 | result = cv.default_query().execute() 166 | for row in result: 167 | print(row) 168 | 169 | # Run an "aggregation" query 170 | aggregate_params = [{ 171 | "property": "estimated_value", 172 | "aggregator": "sum", 173 | "id": "total_value", 174 | }] 175 | result = cv.build_query(aggregate=aggregate_params).execute() 176 | print("Total estimated value:", result.get_aggregate("total_value")) 177 | 178 | # Run a "filtered" query (inspect network tab in browser for examples, on queryCollection calls) 179 | filter_params = { 180 | "filters": [{ 181 | "filter": { 182 | "value": { 183 | "type": "exact", 184 | "value": {"table": "notion_user", "id": client.current_user.id} 185 | }, 186 | "operator": "person_contains" 187 | }, 188 | "property": "assigned_to" 189 | }], 190 | "operator": "and" 191 | } 192 | result = cv.build_query(filter=filter_params).execute() 193 | print("Things assigned to me:", result) 194 | 195 | # Run a "sorted" query 196 | sort_params = [{ 197 | "direction": "descending", 198 | "property": "estimated_value", 199 | }] 200 | result = cv.build_query(sort=sort_params).execute() 201 | print("Sorted results, showing most valuable first:", result) 202 | ``` 203 | 204 | Note: You can combine `filter`, `aggregate`, and `sort`. See more examples of queries by setting up complex views in Notion, and then inspecting the full query: `cv.get("query2")`. 205 | 206 | You can also see [more examples in action in the smoke test runner](https://github.com/jamalex/notion-py/blob/master/notion/smoke_test.py). 
Run it using: 207 | 208 | ```sh 209 | python run_smoke_test.py --page [YOUR_NOTION_PAGE_URL] --token [YOUR_NOTION_TOKEN_V2] 210 | ``` 211 | 212 | ## Example: Lock/Unlock A Page 213 | ```Python 214 | from notion.client import NotionClient 215 | 216 | # Obtain the `token_v2` value by inspecting your browser cookies on a logged-in session on Notion.so 217 | client = NotionClient(token_v2="") 218 | 219 | # Replace this URL with the URL of the page or database you want to edit 220 | page = client.get_block("https://www.notion.so/myorg/Test-c0d20a71c0944985ae96e661ccc99821") 221 | 222 | # The "locked" property is available on PageBlock and CollectionViewBlock objects 223 | # Set it to True to lock the page/database 224 | page.locked = True 225 | # and False to unlock it again 226 | page.locked = False 227 | ``` 228 | 229 | ## Example: Set the current user for multi-account user 230 | 231 | ```python 232 | from notion.client import NotionClient 233 | client = NotionClient(token_v2="") 234 | 235 | # The initial current_user of a multi-account user may be an unwanted user 236 | print(client.current_user.email) # → not_the_desired@email.co.jp 237 | 238 | # Set current_user to the desired user 239 | client.set_user_by_email('desired@email.com') 240 | print(client.current_user.email) # → desired@email.com 241 | 242 | # You can also set the current_user by uid. 243 | client.set_user_by_uid('') 244 | print(client.current_user.email) # → desired@email.com 245 | ``` 246 | 247 | # _Quick plug: Learning Equality needs your support!_ 248 | 249 | If you'd like to support notion-py development, please consider [donating to my open-source nonprofit, Learning Equality](https://learningequality.org/donate/), since when I'm not working on notion-py, it probably means I'm heads-down fundraising for our global education work (bringing resources like Khan Academy to communities with no Internet). COVID has further amplified needs, with over a billion kids stuck at home, and over half of them without the connectivity they need for distance learning. You can now also [support our work via GitHub Sponsors](https://github.com/sponsors/learningequality)! 250 | 251 | # Related Projects 252 | 253 | - [md2notion](https://github.com/Cobertos/md2notion): import Markdown files to Notion 254 | - [notion-export-ics](https://github.com/evertheylen/notion-export-ics): Export Notion Databases to ICS calendar files 255 | - [notion-tqdm](https://github.com/shunyooo/notion-tqdm): Progress Bar displayed in Notion like tqdm 256 | 257 | # TODO 258 | 259 | * Cloning pages hierarchically 260 | * Debounce cache-saving? 
261 | * Support inline "user" and "page" links, and reminders, in markdown conversion 262 | * Utilities to support updating/creating collection schemas 263 | * Utilities to support updating/creating collection_view queries 264 | * Support for easily managing page permissions 265 | * Websocket support for live block cache updating 266 | * "Render full page to markdown" mode 267 | * "Import page from html" mode 268 | -------------------------------------------------------------------------------- /ezgif-3-a935fdcb7415.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jamalex/notion-py/c3c9c251484f78102d93048719a31005c91217cd/ezgif-3-a935fdcb7415.gif -------------------------------------------------------------------------------- /notion/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jamalex/notion-py/c3c9c251484f78102d93048719a31005c91217cd/notion/__init__.py -------------------------------------------------------------------------------- /notion/block.py: -------------------------------------------------------------------------------- 1 | import io 2 | import mimetypes 3 | import os 4 | import random 5 | import requests 6 | import time 7 | import uuid 8 | import zipfile 9 | 10 | from cached_property import cached_property 11 | from copy import deepcopy 12 | 13 | from .logger import logger 14 | from .maps import property_map, field_map, mapper 15 | from .markdown import plaintext_to_notion, notion_to_plaintext 16 | from .operations import build_operation 17 | from .records import Record 18 | from .settings import S3_URL_PREFIX, BASE_URL 19 | from .utils import ( 20 | extract_id, 21 | now, 22 | get_embed_link, 23 | get_embed_data, 24 | add_signed_prefix_as_needed, 25 | remove_signed_prefix_as_needed, 26 | get_by_path, 27 | ) 28 | 29 | 30 | class Children(object): 31 | 32 | child_list_key = "content" 33 | 34 | def __init__(self, parent): 35 | self._parent = parent 36 | self._client = parent._client 37 | 38 | def shuffle(self): 39 | content = self._content_list() 40 | random.shuffle(content) 41 | self._parent.set(self.child_list_key, content) 42 | 43 | def filter(self, type=None): 44 | kids = list(self) 45 | if type: 46 | if isinstance(type, str): 47 | type = BLOCK_TYPES.get(type, Block) 48 | kids = [kid for kid in kids if isinstance(kid, type)] 49 | return kids 50 | 51 | def _content_list(self): 52 | return self._parent.get(self.child_list_key) or [] 53 | 54 | def _get_block(self, id): 55 | 56 | block = self._client.get_block(id) 57 | 58 | # TODO: this is needed because there seems to be a server-side race condition with setting and getting data 59 | # (sometimes the data previously sent hasn't yet propagated to all DB nodes, perhaps? 
so it fails to load here) 60 | i = 0 61 | while block is None: 62 | i += 1 63 | if i > 20: 64 | return None 65 | time.sleep(0.1) 66 | block = self._client.get_block(id) 67 | 68 | if block.get("parent_id") != self._parent.id: 69 | block._alias_parent = self._parent.id 70 | 71 | return block 72 | 73 | def __repr__(self): 74 | if not len(self): 75 | return "[]" 76 | rep = "[\n" 77 | for child in self: 78 | rep += " {},\n".format(repr(child)) 79 | rep += "]" 80 | return rep 81 | 82 | def __len__(self): 83 | return len(self._content_list() or []) 84 | 85 | def __getitem__(self, key): 86 | result = self._content_list()[key] 87 | if isinstance(result, list): 88 | return [self._get_block(id) for id in result] 89 | else: 90 | return self._get_block(result) 91 | 92 | def __delitem__(self, key): 93 | self._get_block(self._content_list()[key]).remove() 94 | 95 | def __iter__(self): 96 | return iter(self._get_block(id) for id in self._content_list()) 97 | 98 | def __reversed__(self): 99 | return reversed(iter(self)) 100 | 101 | def __contains__(self, item): 102 | if isinstance(item, str): 103 | item_id = extract_id(item) 104 | elif isinstance(item, Block): 105 | item_id = item.id 106 | else: 107 | return False 108 | return item_id in self._content_list() 109 | 110 | def add_new(self, block_type, child_list_key=None, **kwargs): 111 | """ 112 | Create a new block, add it as the last child of this parent block, and return the corresponding Block instance. 113 | `block_type` can be either a type string, or a Block subclass. 114 | """ 115 | 116 | # determine the block type string from the Block class, if that's what was provided 117 | if ( 118 | isinstance(block_type, type) 119 | and issubclass(block_type, Block) 120 | and hasattr(block_type, "_type") 121 | ): 122 | block_type = block_type._type 123 | elif not isinstance(block_type, str): 124 | raise Exception( 125 | "block_type must be a string or a Block subclass with a _type attribute" 126 | ) 127 | 128 | block_id = self._client.create_record( 129 | table="block", 130 | parent=self._parent, 131 | type=block_type, 132 | child_list_key=child_list_key, 133 | ) 134 | 135 | block = self._get_block(block_id) 136 | 137 | if kwargs: 138 | with self._client.as_atomic_transaction(): 139 | for key, val in kwargs.items(): 140 | if hasattr(block, key): 141 | setattr(block, key, val) 142 | else: 143 | logger.warning( 144 | "{} does not have attribute '{}' to be set; skipping.".format( 145 | block, key 146 | ) 147 | ) 148 | 149 | return block 150 | 151 | def add_alias(self, block): 152 | """ 153 | Adds an alias to the provided `block`, i.e. adds the block's ID to the parent's content list, 154 | but doesn't change the block's parent_id. 155 | """ 156 | 157 | # add the block to the content list of the parent 158 | self._client.submit_transaction( 159 | build_operation( 160 | id=self._parent.id, 161 | path=[self.child_list_key], 162 | args={"id": block.id}, 163 | command="listAfter", 164 | ) 165 | ) 166 | 167 | return self._get_block(block.id) 168 | 169 | 170 | class Block(Record): 171 | """ 172 | Most data in Notion is stored as a "block" (including pages, and all the individual elements within a page). 173 | These blocks have different types, and in some cases we create subclasses of this class to represent those types. 174 | Attributes on the Block are mapped to useful attributes of the server-side data structure, as properties, so you can 175 | get and set values on the API just by reading/writing attributes on these classes. 
We store a shared local cache on 176 | the `NotionClient` object of all block data, and reference that as needed from here. Data can be refreshed from the 177 | server using the `refresh` method. 178 | """ 179 | 180 | _table = "block" 181 | 182 | # we'll mark it as an alias if we load the Block as a child of a page that is not its parent 183 | _alias_parent = None 184 | 185 | child_list_key = "content" 186 | 187 | type = field_map("type") 188 | alive = field_map("alive") 189 | 190 | def get_browseable_url(self): 191 | if "page" in self._type: 192 | return BASE_URL + self.id.replace("-", "") 193 | else: 194 | return self.parent.get_browseable_url() + "#" + self.id.replace("-", "") 195 | 196 | @property 197 | def children(self): 198 | if not hasattr(self, "_children"): 199 | children_ids = self.get("content", []) 200 | self._client.refresh_records(block=children_ids) 201 | self._children = Children(parent=self) 202 | return self._children 203 | 204 | @property 205 | def parent(self): 206 | 207 | if not self.is_alias: 208 | parent_id = self.get("parent_id") 209 | parent_table = self.get("parent_table") 210 | else: 211 | parent_id = self._alias_parent 212 | parent_table = "block" 213 | 214 | if parent_table == "block": 215 | return self._client.get_block(parent_id) 216 | elif parent_table == "collection": 217 | return self._client.get_collection(parent_id) 218 | elif parent_table == "space": 219 | return self._client.get_space(parent_id) 220 | else: 221 | return None 222 | 223 | @property 224 | def space_info(self): 225 | return self._client.post("getPublicPageData", {"blockId": self.id}).json() 226 | 227 | def _str_fields(self): 228 | """ 229 | Determines the list of fields to include in the __str__ representation. Override and extend this in subclasses. 
230 | """ 231 | fields = super()._str_fields() 232 | # if this is a generic Block instance, include what type of block it is 233 | if type(self) is Block: 234 | fields.append("type") 235 | return fields 236 | 237 | @property 238 | def is_alias(self): 239 | return not (self._alias_parent is None) 240 | 241 | def _get_mappers(self): 242 | mappers = {} 243 | for name in dir(self.__class__): 244 | field = getattr(self.__class__, name) 245 | if isinstance(field, mapper): 246 | mappers[name] = field 247 | return mappers 248 | 249 | def _convert_diff_to_changelist(self, difference, old_val, new_val): 250 | 251 | mappers = self._get_mappers() 252 | changed_fields = set() 253 | changes = [] 254 | remaining = [] 255 | content_changed = False 256 | 257 | for d in deepcopy(difference): 258 | operation, path, values = d 259 | 260 | # normalize path 261 | path = path if path else [] 262 | path = path.split(".") if isinstance(path, str) else path 263 | if operation in ["add", "remove"]: 264 | path.append(values[0][0]) 265 | while isinstance(path[-1], int): 266 | path.pop() 267 | path = ".".join(map(str, path)) 268 | 269 | # check whether it was content that changed 270 | if path == "content": 271 | content_changed = True 272 | continue 273 | 274 | # check whether the value changed matches one of our mapped fields/properties 275 | fields = [ 276 | (name, field) 277 | for name, field in mappers.items() 278 | if path.startswith(field.path) 279 | ] 280 | if fields: 281 | changed_fields.add(fields[0]) 282 | continue 283 | 284 | remaining.append(d) 285 | 286 | if content_changed: 287 | 288 | old = deepcopy(old_val.get("content", [])) 289 | new = deepcopy(new_val.get("content", [])) 290 | 291 | # track what's been added and removed 292 | removed = set(old) - set(new) 293 | added = set(new) - set(old) 294 | for id in removed: 295 | changes.append(("content_removed", "content", id)) 296 | for id in added: 297 | changes.append(("content_added", "content", id)) 298 | 299 | # ignore the added/removed items, and see whether order has changed 300 | for id in removed: 301 | old.remove(id) 302 | for id in added: 303 | new.remove(id) 304 | if old != new: 305 | changes.append(("content_reordered", "content", (old, new))) 306 | 307 | for name, field in changed_fields: 308 | old = field.api_to_python(get_by_path(field.path, old_val)) 309 | new = field.api_to_python(get_by_path(field.path, new_val)) 310 | changes.append(("changed_field", name, (old, new))) 311 | 312 | return changes + super()._convert_diff_to_changelist( 313 | remaining, old_val, new_val 314 | ) 315 | 316 | def remove(self, permanently=False): 317 | """ 318 | Removes the node from its parent, and marks it as inactive. This corresponds to what happens in the 319 | Notion UI when you delete a block. Note that it doesn't *actually* delete it, just orphan it, unless 320 | `permanently` is set to True, in which case we make an extra call to hard-delete. 
321 | """ 322 | 323 | if not self.is_alias: 324 | 325 | # If it's not an alias, we actually remove the block 326 | with self._client.as_atomic_transaction(): 327 | 328 | # Mark the block as inactive 329 | self._client.submit_transaction( 330 | build_operation( 331 | id=self.id, path=[], args={"alive": False}, command="update" 332 | ) 333 | ) 334 | 335 | # Remove the block's ID from a list on its parent, if needed 336 | if self.parent.child_list_key: 337 | self._client.submit_transaction( 338 | build_operation( 339 | id=self.get("parent_id"), 340 | path=[self.parent.child_list_key], 341 | args={"id": self.id}, 342 | command="listRemove", 343 | table=self.get("parent_table"), 344 | ) 345 | ) 346 | 347 | if permanently: 348 | block_id = self.id 349 | self._client.post( 350 | "deleteBlocks", {"blockIds": [block_id], "permanentlyDelete": True} 351 | ) 352 | del self._client._store._values["block"][block_id] 353 | 354 | else: 355 | 356 | # Otherwise, if it's an alias, we only remove it from the alias parent's content list 357 | self._client.submit_transaction( 358 | build_operation( 359 | id=self._alias_parent, 360 | path=["content"], 361 | args={"id": self.id}, 362 | command="listRemove", 363 | ) 364 | ) 365 | 366 | def move_to(self, target_block, position="last-child"): 367 | assert isinstance( 368 | target_block, Block 369 | ), "target_block must be an instance of Block or one of its subclasses" 370 | assert position in ["first-child", "last-child", "before", "after"] 371 | 372 | if "child" in position: 373 | new_parent_id = target_block.id 374 | new_parent_table = "block" 375 | else: 376 | new_parent_id = target_block.get("parent_id") 377 | new_parent_table = target_block.get("parent_table") 378 | 379 | if position in ["first-child", "before"]: 380 | list_command = "listBefore" 381 | else: 382 | list_command = "listAfter" 383 | 384 | list_args = {"id": self.id} 385 | if position in ["before", "after"]: 386 | list_args[position] = target_block.id 387 | 388 | with self._client.as_atomic_transaction(): 389 | 390 | # First, remove the node, before we re-insert and re-activate it at the target location 391 | self.remove() 392 | 393 | if not self.is_alias: 394 | # Set the parent_id of the moving block to the new parent, and mark it as active again 395 | self._client.submit_transaction( 396 | build_operation( 397 | id=self.id, 398 | path=[], 399 | args={ 400 | "alive": True, 401 | "parent_id": new_parent_id, 402 | "parent_table": new_parent_table, 403 | }, 404 | command="update", 405 | ) 406 | ) 407 | else: 408 | self._alias_parent = new_parent_id 409 | 410 | # Add the moving block's ID to the "content" list of the new parent 411 | self._client.submit_transaction( 412 | build_operation( 413 | id=new_parent_id, 414 | path=["content"], 415 | args=list_args, 416 | command=list_command, 417 | ) 418 | ) 419 | 420 | # update the local block cache to reflect the updates 421 | self._client.refresh_records( 422 | block=[ 423 | self.id, 424 | self.get("parent_id"), 425 | target_block.id, 426 | target_block.get("parent_id"), 427 | ] 428 | ) 429 | 430 | def extract_markdown(self): 431 | task_id = self._client.post("https://www.notion.so/api/v3/enqueueTask", { 432 | "task": { 433 | "eventName": "exportBlock", 434 | "request": { 435 | "block": { 436 | "id": self.id, 437 | "spaceId": self._client.current_space.id, 438 | }, 439 | "recursive": False, 440 | "exportOptions": { 441 | "exportType": "markdown", 442 | "timeZone": "America/Los_Angeles", 443 | "locale": "en", 444 | "collectionViewExportType": 
"currentView", 445 | "includeContents": "no_files", 446 | "preferredViewMap": {} 447 | }, 448 | "shouldExportComments": False 449 | } 450 | } 451 | }).json()["taskId"] 452 | 453 | for i in range(5000): 454 | response = self._client.post("https://www.notion.so/api/v3/getTasks", { 455 | "taskIds": [task_id] 456 | }).json() 457 | if response["results"][0]["state"] == "success": 458 | break 459 | time.sleep(0.25) 460 | 461 | zip_url = response["results"][0]["status"]["exportURL"] 462 | 463 | response = self._client.session.get(zip_url) 464 | response.raise_for_status() 465 | 466 | # unzip the contents in memory and read the contents of the file inside (there should only be one file, but check the name) 467 | with zipfile.ZipFile(io.BytesIO(response.content)) as z: 468 | names = z.namelist() 469 | assert len(names) == 1, "Expected exactly one file in the zip" 470 | with z.open(names[0]) as f: 471 | return f.read().decode("utf-8") 472 | 473 | 474 | class DividerBlock(Block): 475 | 476 | _type = "divider" 477 | 478 | 479 | class ColumnListBlock(Block): 480 | """ 481 | Must contain only ColumnBlocks as children. 482 | """ 483 | 484 | _type = "column_list" 485 | 486 | def evenly_space_columns(self): 487 | with self._client.as_atomic_transaction(): 488 | for child in self.children: 489 | child.column_ratio = 1 / len(self.children) 490 | 491 | 492 | class ColumnBlock(Block): 493 | """ 494 | Should be added as children of a ColumnListBlock. 495 | """ 496 | 497 | column_ratio = field_map("format.column_ratio") 498 | 499 | _type = "column" 500 | 501 | 502 | class BasicBlock(Block): 503 | 504 | title = property_map("title") 505 | title_plaintext = property_map( 506 | "title", 507 | python_to_api=plaintext_to_notion, 508 | api_to_python=notion_to_plaintext, 509 | markdown=False, 510 | ) 511 | color = field_map("format.block_color") 512 | 513 | def convert_to_type(self, new_type): 514 | """ 515 | Convert this block into another type of BasicBlock. Returns a new instance of the appropriate class. 516 | """ 517 | assert new_type in BLOCK_TYPES and issubclass( 518 | BLOCK_TYPES[new_type], BasicBlock 519 | ), "Target type must correspond to a subclass of BasicBlock" 520 | self.type = new_type 521 | return self._client.get_block(self.id) 522 | 523 | def _str_fields(self): 524 | return super()._str_fields() + ["title"] 525 | 526 | 527 | class TodoBlock(BasicBlock): 528 | 529 | _type = "to_do" 530 | 531 | checked = property_map( 532 | "checked", 533 | python_to_api=lambda x: "Yes" if x else "No", 534 | api_to_python=lambda x: x == "Yes", 535 | ) 536 | 537 | def _str_fields(self): 538 | return super()._str_fields() + ["checked"] 539 | 540 | 541 | class CodeBlock(BasicBlock): 542 | 543 | _type = "code" 544 | 545 | language = property_map("language") 546 | wrap = field_map("format.code_wrap") 547 | 548 | 549 | class FactoryBlock(BasicBlock): 550 | """ 551 | Also known as a "Template Button". The title is the button text, and the children are the templates to clone. 
552 | """ 553 | 554 | _type = "factory" 555 | 556 | 557 | class HeaderBlock(BasicBlock): 558 | 559 | _type = "header" 560 | 561 | 562 | class SubheaderBlock(BasicBlock): 563 | 564 | _type = "sub_header" 565 | 566 | 567 | class SubsubheaderBlock(BasicBlock): 568 | 569 | _type = "sub_sub_header" 570 | 571 | 572 | class PageBlock(BasicBlock): 573 | 574 | _type = "page" 575 | 576 | icon = field_map( 577 | "format.page_icon", 578 | api_to_python=add_signed_prefix_as_needed, 579 | python_to_api=remove_signed_prefix_as_needed, 580 | ) 581 | 582 | cover = field_map( 583 | "format.page_cover", 584 | api_to_python=add_signed_prefix_as_needed, 585 | python_to_api=remove_signed_prefix_as_needed, 586 | ) 587 | 588 | locked = field_map("format.block_locked") 589 | 590 | def get_backlinks(self): 591 | """ 592 | Returns a list of blocks that referencing the current PageBlock. Note that only PageBlocks support backlinks. 593 | """ 594 | data = self._client.post("getBacklinksForBlock", {"blockId": self.id}).json() 595 | backlinks = [] 596 | for block in data.get("backlinks") or []: 597 | mention = block.get("mentioned_from") 598 | if not mention: 599 | continue 600 | block_id = mention.get("block_id") or mention.get("parent_block_id") 601 | if block_id: 602 | backlinks.append(self._client.get_block(block_id)) 603 | return backlinks 604 | 605 | 606 | class BulletedListBlock(BasicBlock): 607 | 608 | _type = "bulleted_list" 609 | 610 | 611 | class NumberedListBlock(BasicBlock): 612 | 613 | _type = "numbered_list" 614 | 615 | 616 | class ToggleBlock(BasicBlock): 617 | 618 | _type = "toggle" 619 | 620 | 621 | class QuoteBlock(BasicBlock): 622 | 623 | _type = "quote" 624 | 625 | 626 | class TextBlock(BasicBlock): 627 | 628 | _type = "text" 629 | 630 | 631 | class EquationBlock(BasicBlock): 632 | 633 | latex = field_map( 634 | ["properties", "title"], 635 | python_to_api=lambda x: [[x]], 636 | api_to_python=lambda x: x[0][0], 637 | ) 638 | 639 | _type = "equation" 640 | 641 | 642 | class MediaBlock(Block): 643 | 644 | caption = property_map("caption") 645 | 646 | def _str_fields(self): 647 | return super()._str_fields() + ["caption"] 648 | 649 | 650 | class EmbedBlock(MediaBlock): 651 | 652 | _type = "embed" 653 | 654 | display_source = field_map( 655 | "format.display_source", 656 | api_to_python=add_signed_prefix_as_needed, 657 | python_to_api=remove_signed_prefix_as_needed, 658 | ) 659 | source = property_map( 660 | "source", 661 | api_to_python=add_signed_prefix_as_needed, 662 | python_to_api=remove_signed_prefix_as_needed, 663 | ) 664 | height = field_map("format.block_height") 665 | full_width = field_map("format.block_full_width") 666 | page_width = field_map("format.block_page_width") 667 | width = field_map("format.block_width") 668 | 669 | def set_source_url(self, url): 670 | self.source = remove_signed_prefix_as_needed(url) 671 | self.display_source = get_embed_link(self.source) 672 | 673 | def _str_fields(self): 674 | return super()._str_fields() + ["source"] 675 | 676 | 677 | class EmbedOrUploadBlock(EmbedBlock): 678 | 679 | file_id = field_map(["file_ids", 0]) 680 | 681 | def upload_file(self, path): 682 | 683 | mimetype = mimetypes.guess_type(path)[0] or "text/plain" 684 | filename = os.path.split(path)[-1] 685 | 686 | data = self._client.post( 687 | "getUploadFileUrl", 688 | {"bucket": "secure", "name": filename, "contentType": mimetype}, 689 | ).json() 690 | 691 | with open(path, "rb") as f: 692 | response = requests.put( 693 | data["signedPutUrl"], data=f, headers={"Content-type": mimetype} 694 | 
) 695 | response.raise_for_status() 696 | 697 | self.display_source = data["url"] 698 | self.source = data["url"] 699 | self.file_id = data["url"][len(S3_URL_PREFIX) :].split("/")[0] 700 | 701 | 702 | class VideoBlock(EmbedOrUploadBlock): 703 | 704 | _type = "video" 705 | 706 | 707 | class FileBlock(EmbedOrUploadBlock): 708 | 709 | size = property_map("size") 710 | title = property_map("title") 711 | 712 | _type = "file" 713 | 714 | 715 | class AudioBlock(EmbedOrUploadBlock): 716 | 717 | _type = "audio" 718 | 719 | 720 | class PDFBlock(EmbedOrUploadBlock): 721 | 722 | _type = "pdf" 723 | 724 | 725 | class ImageBlock(EmbedOrUploadBlock): 726 | 727 | _type = "image" 728 | 729 | 730 | class BookmarkBlock(EmbedBlock): 731 | 732 | _type = "bookmark" 733 | 734 | bookmark_cover = field_map("format.bookmark_cover") 735 | bookmark_icon = field_map("format.bookmark_icon") 736 | description = property_map("description") 737 | link = property_map("link") 738 | title = property_map("title") 739 | 740 | def set_new_link(self, url): 741 | self._client.post("setBookmarkMetadata", {"blockId": self.id, "url": url}) 742 | self.refresh() 743 | 744 | 745 | class LinkToCollectionBlock(MediaBlock): 746 | 747 | _type = "link_to_collection" 748 | # TODO: add custom fields 749 | 750 | 751 | class BreadcrumbBlock(MediaBlock): 752 | 753 | _type = "breadcrumb" 754 | 755 | 756 | class CollectionViewBlock(MediaBlock): 757 | 758 | _type = "collection_view" 759 | 760 | @property 761 | def collection(self): 762 | collection_id = self.get("collection_id") 763 | if not collection_id: 764 | return None 765 | if not hasattr(self, "_collection"): 766 | self._collection = self._client.get_collection(collection_id) 767 | return self._collection 768 | 769 | @collection.setter 770 | def collection(self, val): 771 | if hasattr(self, "_collection"): 772 | del self._collection 773 | self.set("collection_id", val.id) 774 | 775 | @property 776 | def views(self): 777 | if not hasattr(self, "_views"): 778 | self._views = CollectionViewBlockViews(parent=self) 779 | return self._views 780 | 781 | @property 782 | def title(self): 783 | return self.collection.name 784 | 785 | @title.setter 786 | def title(self, val): 787 | self.collection.name = val 788 | 789 | @property 790 | def description(self): 791 | return self.collection.description 792 | 793 | @description.setter 794 | def description(self, val): 795 | self.collection.description = val 796 | 797 | locked = field_map("format.block_locked") 798 | 799 | def _str_fields(self): 800 | return super()._str_fields() + ["title", "collection"] 801 | 802 | 803 | class CollectionViewBlockViews(Children): 804 | 805 | child_list_key = "view_ids" 806 | 807 | def _get_block(self, view_id): 808 | 809 | view = self._client.get_collection_view( 810 | view_id, collection=self._parent.collection 811 | ) 812 | 813 | i = 0 814 | while view is None: 815 | i += 1 816 | if i > 20: 817 | return None 818 | time.sleep(0.1) 819 | view = self._client.get_collection_view( 820 | view_id, collection=self._parent.collection 821 | ) 822 | 823 | return view 824 | 825 | def add_new(self, view_type="table"): 826 | if not self._parent.collection: 827 | raise Exception( 828 | "Collection view block does not have an associated collection: {}".format( 829 | self._parent 830 | ) 831 | ) 832 | 833 | record_id = self._client.create_record( 834 | table="collection_view", parent=self._parent, type=view_type 835 | ) 836 | view = self._client.get_collection_view( 837 | record_id, collection=self._parent._collection 838 | ) 839 | 
view.set("collection_id", self._parent._collection.id) 840 | view_ids = self._parent.get(CollectionViewBlockViews.child_list_key, []) 841 | view_ids.append(view.id) 842 | self._parent.set(CollectionViewBlockViews.child_list_key, view_ids) 843 | 844 | # At this point, the view does not see to be completely initialized yet. 845 | # Hack: wait a bit before e.g. setting a query. 846 | # Note: temporarily disabling this sleep to see if the issue reoccurs. 847 | # time.sleep(3) 848 | return view 849 | 850 | 851 | class CollectionViewPageBlock(CollectionViewBlock): 852 | 853 | icon = field_map( 854 | "format.page_icon", 855 | api_to_python=add_signed_prefix_as_needed, 856 | python_to_api=remove_signed_prefix_as_needed, 857 | ) 858 | 859 | cover = field_map( 860 | "format.page_cover", 861 | api_to_python=add_signed_prefix_as_needed, 862 | python_to_api=remove_signed_prefix_as_needed, 863 | ) 864 | 865 | _type = "collection_view_page" 866 | 867 | 868 | class FramerBlock(EmbedBlock): 869 | 870 | _type = "framer" 871 | 872 | 873 | class TweetBlock(EmbedBlock): 874 | 875 | _type = "tweet" 876 | 877 | 878 | class GistBlock(EmbedBlock): 879 | 880 | _type = "gist" 881 | 882 | 883 | class DriveBlock(EmbedBlock): 884 | 885 | _type = "drive" 886 | 887 | 888 | class FigmaBlock(EmbedBlock): 889 | 890 | _type = "figma" 891 | 892 | 893 | class LoomBlock(EmbedBlock): 894 | 895 | _type = "loom" 896 | 897 | 898 | class TypeformBlock(EmbedBlock): 899 | 900 | _type = "typeform" 901 | 902 | 903 | class CodepenBlock(EmbedBlock): 904 | 905 | _type = "codepen" 906 | 907 | 908 | class MapsBlock(EmbedBlock): 909 | 910 | _type = "maps" 911 | 912 | 913 | class InvisionBlock(EmbedBlock): 914 | 915 | _type = "invision" 916 | 917 | 918 | class CalloutBlock(BasicBlock): 919 | 920 | icon = field_map("format.page_icon") 921 | 922 | _type = "callout" 923 | 924 | 925 | BLOCK_TYPES = { 926 | cls._type: cls 927 | for cls in locals().values() 928 | if type(cls) == type and issubclass(cls, Block) and hasattr(cls, "_type") 929 | } 930 | -------------------------------------------------------------------------------- /notion/client.py: -------------------------------------------------------------------------------- 1 | import hashlib 2 | import json 3 | import re 4 | import uuid 5 | 6 | from requests import Session, HTTPError 7 | from requests.cookies import cookiejar_from_dict 8 | from urllib.parse import urljoin 9 | from requests.adapters import HTTPAdapter 10 | from requests.packages.urllib3.util.retry import Retry 11 | from getpass import getpass 12 | 13 | from .block import Block, BLOCK_TYPES 14 | from .collection import ( 15 | Collection, 16 | CollectionView, 17 | CollectionRowBlock, 18 | COLLECTION_VIEW_TYPES, 19 | TemplateBlock, 20 | ) 21 | from .logger import logger 22 | from .monitor import Monitor 23 | from .operations import operation_update_last_edited, build_operation 24 | from .settings import API_BASE_URL 25 | from .space import Space 26 | from .store import RecordStore 27 | from .user import User 28 | from .utils import extract_id, now 29 | 30 | 31 | def create_session(client_specified_retry=None): 32 | """ 33 | retry on 502 34 | """ 35 | session = Session() 36 | if client_specified_retry: 37 | retry = client_specified_retry 38 | else: 39 | retry = Retry( 40 | total=10, 41 | backoff_factor=1, 42 | status_forcelist=(429, 502, 503, 504), 43 | # CAUTION: adding 'POST' to this list which is not technically idempotent 44 | allowed_methods=( 45 | "POST", 46 | "HEAD", 47 | "TRACE", 48 | "GET", 49 | "PUT", 50 | "OPTIONS", 51 | 
"DELETE", 52 | ), 53 | ) 54 | adapter = HTTPAdapter(max_retries=retry) 55 | session.mount("https://", adapter) 56 | return session 57 | 58 | 59 | class NotionClient(object): 60 | """ 61 | This is the entry point to using the API. Create an instance of this class, passing it the value of the 62 | "token_v2" cookie from a logged-in browser session on Notion.so. Most of the methods on here are primarily 63 | for internal use -- the main one you'll likely want to use is `get_block`. 64 | """ 65 | 66 | def __init__( 67 | self, 68 | token_v2=None, 69 | monitor=False, 70 | start_monitoring=False, 71 | enable_caching=False, 72 | cache_key=None, 73 | email=None, 74 | password=None, 75 | client_specified_retry=None, 76 | ): 77 | self.session = create_session(client_specified_retry) 78 | if token_v2: 79 | self.session.cookies = cookiejar_from_dict({"token_v2": token_v2}) 80 | else: 81 | self._set_token(email=email, password=password) 82 | 83 | if enable_caching: 84 | cache_key = cache_key or hashlib.sha256(token_v2.encode()).hexdigest() 85 | self._store = RecordStore(self, cache_key=cache_key) 86 | else: 87 | self._store = RecordStore(self) 88 | if monitor: 89 | self._monitor = Monitor(self) 90 | if start_monitoring: 91 | self.start_monitoring() 92 | else: 93 | self._monitor = None 94 | 95 | self._update_user_info() 96 | 97 | def start_monitoring(self): 98 | self._monitor.poll_async() 99 | 100 | def _fetch_guest_space_data(self, records): 101 | """ 102 | guest users have an empty `space` dict, so get the space_id from the `space_view` dict instead, 103 | and fetch the space data from the getPublicSpaceData endpoint. 104 | 105 | Note: This mutates the records dict 106 | """ 107 | space_id = list(records["space_view"].values())[0]["value"]["space_id"] 108 | 109 | space_data = self.post( 110 | "getPublicSpaceData", {"type": "space-ids", "spaceIds": [space_id]} 111 | ).json() 112 | 113 | records["space"] = { 114 | space["id"]: {"value": space} for space in space_data["results"] 115 | } 116 | 117 | 118 | def _set_token(self, email=None, password=None): 119 | if not email: 120 | email = input("Enter your Notion email address:\n") 121 | if not password: 122 | password = getpass("Enter your Notion password:\n") 123 | self.post("loginWithEmail", {"email": email, "password": password}).json() 124 | 125 | def _update_user_info(self): 126 | records = self.post("loadUserContent", {}).json()["recordMap"] 127 | if not records["space"]: 128 | self._fetch_guest_space_data(records) 129 | 130 | self._store.store_recordmap(records) 131 | self.current_user = self.get_user(list(records["notion_user"].keys())[0]) 132 | self.current_space = self.get_space(list(records["space"].keys())[0]) 133 | return records 134 | 135 | def get_email_uid(self): 136 | response = self.post("getSpaces", {}).json() 137 | return { 138 | response[uid]["notion_user"][uid]["value"]["email"]: uid 139 | for uid in response.keys() 140 | } 141 | 142 | def set_user_by_uid(self, user_id): 143 | self.session.headers.update({"x-notion-active-user-header": user_id}) 144 | self._update_user_info() 145 | 146 | def set_user_by_email(self, email): 147 | email_uid_dict = self.get_email_uid() 148 | uid = email_uid_dict.get(email) 149 | if not uid: 150 | raise Exception( 151 | "Requested email address {email} not found; available addresses: {available}".format( 152 | email=email, available=list(email_uid_dict) 153 | ) 154 | ) 155 | self.set_user_by_uid(uid) 156 | 157 | def get_top_level_pages(self): 158 | records = self._update_user_info() 159 | return 
[self.get_block(bid) for bid in records["block"].keys()] 160 | 161 | def get_record_data(self, table, id, force_refresh=False, limit=100): 162 | return self._store.get(table, id, force_refresh=force_refresh, limit=limit) 163 | 164 | def get_block(self, url_or_id, force_refresh=False, limit=100): 165 | """ 166 | Retrieve an instance of a subclass of Block that maps to the block/page identified by the URL or ID passed in. 167 | """ 168 | block_id = extract_id(url_or_id) 169 | block = self.get_record_data("block", block_id, force_refresh=force_refresh, limit=limit) 170 | if not block: 171 | return None 172 | if block.get("parent_table") == "collection": 173 | if block.get("is_template"): 174 | block_class = TemplateBlock 175 | else: 176 | block_class = CollectionRowBlock 177 | else: 178 | block_class = BLOCK_TYPES.get(block.get("type", ""), Block) 179 | return block_class(self, block_id) 180 | 181 | def get_collection(self, collection_id, force_refresh=False): 182 | """ 183 | Retrieve an instance of Collection that maps to the collection identified by the ID passed in. 184 | """ 185 | coll = self.get_record_data( 186 | "collection", collection_id, force_refresh=force_refresh 187 | ) 188 | return Collection(self, collection_id) if coll else None 189 | 190 | def get_user(self, user_id, force_refresh=False): 191 | """ 192 | Retrieve an instance of User that maps to the notion_user identified by the ID passed in. 193 | """ 194 | user = self.get_record_data("notion_user", user_id, force_refresh=force_refresh) 195 | return User(self, user_id) if user else None 196 | 197 | def get_space(self, space_id, force_refresh=False): 198 | """ 199 | Retrieve an instance of Space that maps to the space identified by the ID passed in. 200 | """ 201 | space = self.get_record_data("space", space_id, force_refresh=force_refresh) 202 | return Space(self, space_id) if space else None 203 | 204 | def get_collection_view(self, url_or_id, collection=None, force_refresh=False): 205 | """ 206 | Retrieve an instance of a subclass of CollectionView that maps to the appropriate type. 207 | The `url_or_id` argument can either be the URL for a database page, or the ID of a collection_view (in which case 208 | you must also pass the collection) 209 | """ 210 | # if it's a URL for a database page, try extracting the collection and view IDs 211 | if url_or_id.startswith("http"): 212 | match = re.search("([a-f0-9]{32})\?v=([a-f0-9]{32})", url_or_id) 213 | if not match: 214 | raise Exception("Invalid collection view URL") 215 | block_id, view_id = match.groups() 216 | collection = self.get_block( 217 | block_id, force_refresh=force_refresh 218 | ).collection 219 | else: 220 | view_id = url_or_id 221 | assert ( 222 | collection is not None 223 | ), "If 'url_or_id' is an ID (not a URL), you must also pass the 'collection'" 224 | 225 | view = self.get_record_data( 226 | "collection_view", view_id, force_refresh=force_refresh 227 | ) 228 | 229 | return ( 230 | COLLECTION_VIEW_TYPES.get(view.get("type", ""), CollectionView)( 231 | self, view_id, collection=collection 232 | ) 233 | if view 234 | else None 235 | ) 236 | 237 | def refresh_records(self, **kwargs): 238 | """ 239 | The keyword arguments map table names into lists of (or singular) record IDs to load for that table. 240 | Use `True` instead of a list to refresh all known records for that table. 
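For example: `client.refresh_records(block=[block_id1, block_id2], notion_user=True)`.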
241 | """ 242 | self._store.call_get_record_values(**kwargs) 243 | 244 | def refresh_collection_rows(self, collection_id): 245 | row_ids = [row.id for row in self.get_collection(collection_id).get_rows()] 246 | self._store.set_collection_rows(collection_id, row_ids) 247 | 248 | def post(self, endpoint, data): 249 | """ 250 | All API requests on Notion.so are done as POSTs (except the websocket communications). 251 | """ 252 | url = urljoin(API_BASE_URL, endpoint) 253 | response = self.session.post(url, json=data) 254 | if response.status_code == 400: 255 | logger.error( 256 | "Got 400 error attempting to POST to {}, with data: {}".format( 257 | endpoint, json.dumps(data, indent=2) 258 | ) 259 | ) 260 | raise HTTPError( 261 | response.json().get( 262 | "message", "There was an error (400) submitting the request." 263 | ) 264 | ) 265 | response.raise_for_status() 266 | return response 267 | 268 | def submit_transaction(self, operations, update_last_edited=True): 269 | 270 | if not operations: 271 | return 272 | 273 | if isinstance(operations, dict): 274 | operations = [operations] 275 | 276 | if update_last_edited: 277 | updated_blocks = set( 278 | [op["id"] for op in operations if op["table"] == "block"] 279 | ) 280 | operations += [ 281 | operation_update_last_edited(self.current_user.id, block_id) 282 | for block_id in updated_blocks 283 | ] 284 | 285 | # if we're in a transaction, just add these operations to the list; otherwise, execute them right away 286 | if self.in_transaction(): 287 | self._transaction_operations += operations 288 | else: 289 | data = {"operations": operations} 290 | self.post("submitTransaction", data) 291 | self._store.run_local_operations(operations) 292 | 293 | def query_collection(self, *args, **kwargs): 294 | return self._store.call_query_collection(*args, **kwargs) 295 | 296 | def as_atomic_transaction(self): 297 | """ 298 | Returns a context manager that buffers up all calls to `submit_transaction` and sends them as one big transaction 299 | when the context manager exits. 300 | """ 301 | return Transaction(client=self) 302 | 303 | def in_transaction(self): 304 | """ 305 | Returns True if we're currently in a transaction, otherwise False. 
306 | """ 307 | return hasattr(self, "_transaction_operations") 308 | 309 | def search_pages_with_parent(self, parent_id, search="", limit=100): 310 | data = { 311 | "query": search, 312 | "parentId": parent_id, 313 | "limit": limit, 314 | "spaceId": self.current_space.id, 315 | } 316 | response = self.post("searchPagesWithParent", data).json() 317 | self._store.store_recordmap(response["recordMap"]) 318 | return response["results"] 319 | 320 | def search_blocks(self, search, limit=25): 321 | return self.search(query=search, limit=limit) 322 | 323 | def search( 324 | self, 325 | query="", 326 | search_type="BlocksInSpace", 327 | limit=100, 328 | sort="Relevance", 329 | source="quick_find", 330 | isDeletedOnly=False, 331 | excludeTemplates=False, 332 | isNavigableOnly=False, 333 | requireEditPermissions=False, 334 | ancestors=[], 335 | createdBy=[], 336 | editedBy=[], 337 | lastEditedTime={}, 338 | createdTime={}, 339 | ): 340 | data = { 341 | "type": search_type, 342 | "query": query, 343 | "spaceId": self.current_space.id, 344 | "limit": limit, 345 | "filters": { 346 | "isDeletedOnly": isDeletedOnly, 347 | "excludeTemplates": excludeTemplates, 348 | "isNavigableOnly": isNavigableOnly, 349 | "requireEditPermissions": requireEditPermissions, 350 | "ancestors": ancestors, 351 | "createdBy": createdBy, 352 | "editedBy": editedBy, 353 | "lastEditedTime": lastEditedTime, 354 | "createdTime": createdTime, 355 | }, 356 | "sort": sort, 357 | "source": source, 358 | } 359 | response = self.post("search", data).json() 360 | self._store.store_recordmap(response["recordMap"]) 361 | return [self.get_block(result["id"]) for result in response["results"]] 362 | 363 | def create_record(self, table, parent, **kwargs): 364 | 365 | # make up a new UUID; apparently we get to choose our own! 
366 | record_id = str(uuid.uuid4()) 367 | 368 | child_list_key = kwargs.get("child_list_key") or parent.child_list_key 369 | 370 | args = { 371 | "id": record_id, 372 | "version": 1, 373 | "alive": True, 374 | "created_by_id": self.current_user.id, 375 | "created_by_table": "notion_user", 376 | "created_time": now(), 377 | "parent_id": parent.id, 378 | "parent_table": parent._table, 379 | } 380 | 381 | args.update(kwargs) 382 | 383 | with self.as_atomic_transaction(): 384 | 385 | # create the new record 386 | self.submit_transaction( 387 | build_operation( 388 | args=args, command="set", id=record_id, path=[], table=table 389 | ) 390 | ) 391 | 392 | # add the record to the content list of the parent, if needed 393 | if child_list_key: 394 | self.submit_transaction( 395 | build_operation( 396 | id=parent.id, 397 | path=[child_list_key], 398 | args={"id": record_id}, 399 | command="listAfter", 400 | table=parent._table, 401 | ) 402 | ) 403 | 404 | return record_id 405 | 406 | 407 | class Transaction(object): 408 | 409 | is_dummy_nested_transaction = False 410 | 411 | def __init__(self, client): 412 | self.client = client 413 | 414 | def __enter__(self): 415 | 416 | if hasattr(self.client, "_transaction_operations"): 417 | # client is already in a transaction, so we'll just make this one a nullop and let the outer one handle it 418 | self.is_dummy_nested_transaction = True 419 | return 420 | 421 | self.client._transaction_operations = [] 422 | self.client._pages_to_refresh = [] 423 | self.client._blocks_to_refresh = [] 424 | 425 | def __exit__(self, exc_type, exc_value, traceback): 426 | 427 | if self.is_dummy_nested_transaction: 428 | return 429 | 430 | operations = self.client._transaction_operations 431 | del self.client._transaction_operations 432 | 433 | # only actually submit the transaction if there was no exception 434 | if not exc_type: 435 | self.client.submit_transaction(operations) 436 | 437 | self.client._store.handle_post_transaction_refreshing() 438 | -------------------------------------------------------------------------------- /notion/collection.py: -------------------------------------------------------------------------------- 1 | from cached_property import cached_property 2 | from copy import deepcopy 3 | from datetime import datetime, date 4 | from tzlocal import get_localzone 5 | from uuid import uuid1 6 | 7 | from .block import Block, PageBlock, Children, CollectionViewBlock 8 | from .logger import logger 9 | from .maps import property_map, field_map 10 | from .markdown import markdown_to_notion, notion_to_markdown 11 | from .operations import build_operation 12 | from .records import Record 13 | from .utils import ( 14 | add_signed_prefix_as_needed, 15 | extract_id, 16 | remove_signed_prefix_as_needed, 17 | slugify, 18 | ) 19 | 20 | 21 | class NotionDate(object): 22 | 23 | start = None 24 | end = None 25 | timezone = None 26 | reminder = None 27 | 28 | def __init__(self, start, end=None, timezone=None, reminder=None): 29 | self.start = start 30 | self.end = end 31 | self.timezone = timezone 32 | self.reminder = reminder 33 | 34 | @classmethod 35 | def from_notion(cls, obj): 36 | if isinstance(obj, dict): 37 | data = obj 38 | elif isinstance(obj, list): 39 | data = obj[0][1][0][1] 40 | else: 41 | return None 42 | start = cls._parse_datetime(data.get("start_date"), data.get("start_time")) 43 | end = cls._parse_datetime(data.get("end_date"), data.get("end_time")) 44 | timezone = data.get("time_zone") 45 | reminder = data.get("reminder") 46 | return cls(start, end=end, 
timezone=timezone, reminder=reminder) 47 | 48 | @classmethod 49 | def _parse_datetime(cls, date_str, time_str): 50 | if not date_str: 51 | return None 52 | if time_str: 53 | return datetime.strptime(date_str + " " + time_str, "%Y-%m-%d %H:%M") 54 | else: 55 | return datetime.strptime(date_str, "%Y-%m-%d").date() 56 | 57 | def _format_datetime(self, date_or_datetime): 58 | if not date_or_datetime: 59 | return None, None 60 | if isinstance(date_or_datetime, datetime): 61 | return ( 62 | date_or_datetime.strftime("%Y-%m-%d"), 63 | date_or_datetime.strftime("%H:%M"), 64 | ) 65 | else: 66 | return date_or_datetime.strftime("%Y-%m-%d"), None 67 | 68 | def type(self): 69 | name = "date" 70 | if isinstance(self.start, datetime): 71 | name += "time" 72 | if self.end: 73 | name += "range" 74 | return name 75 | 76 | def to_notion(self): 77 | if self.end: 78 | self.start, self.end = sorted([self.start, self.end]) 79 | 80 | start_date, start_time = self._format_datetime(self.start) 81 | end_date, end_time = self._format_datetime(self.end) 82 | reminder = self.reminder 83 | 84 | if not start_date: 85 | return [] 86 | 87 | data = {"type": self.type(), "start_date": start_date} 88 | 89 | if end_date: 90 | data["end_date"] = end_date 91 | 92 | if reminder: 93 | data["reminder"] = reminder 94 | 95 | if "time" in data["type"]: 96 | data["time_zone"] = str(self.timezone or get_localzone()) 97 | data["start_time"] = start_time or "00:00" 98 | if end_date: 99 | data["end_time"] = end_time or "00:00" 100 | 101 | return [["‣", [["d", data]]]] 102 | 103 | 104 | class NotionSelect(object): 105 | valid_colors = [ 106 | "default", 107 | "gray", 108 | "brown", 109 | "orange", 110 | "yellow", 111 | "green", 112 | "blue", 113 | "purple", 114 | "pink", 115 | "red", 116 | ] 117 | id = None 118 | color = "default" 119 | value = None 120 | 121 | def __init__(self, value, color="default"): 122 | self.id = str(uuid1()) 123 | self.color = self.set_color(color) 124 | self.value = value 125 | 126 | def set_color(self, color): 127 | if color not in self.valid_colors: 128 | if self.color: 129 | return self.color 130 | return "default" 131 | return color 132 | 133 | def to_dict(self): 134 | return {"id": self.id, "value": self.value, "color": self.color} 135 | 136 | 137 | class Collection(Record): 138 | """ 139 | A "collection" corresponds to what's sometimes called a "database" in the Notion UI. 140 | """ 141 | 142 | _table = "collection" 143 | 144 | name = field_map( 145 | "name", api_to_python=notion_to_markdown, python_to_api=markdown_to_notion 146 | ) 147 | description = field_map( 148 | "description", 149 | api_to_python=notion_to_markdown, 150 | python_to_api=markdown_to_notion, 151 | ) 152 | cover = field_map("cover") 153 | 154 | @property 155 | def templates(self): 156 | if not hasattr(self, "_templates"): 157 | template_ids = self.get("template_pages", []) 158 | self._client.refresh_records(block=template_ids) 159 | self._templates = Templates(parent=self) 160 | return self._templates 161 | 162 | def get_schema_properties(self): 163 | """ 164 | Fetch a flattened list of all properties in the collection's schema. 
165 | """ 166 | properties = [] 167 | schema = self.get("schema") 168 | for id, item in schema.items(): 169 | prop = {"id": id, "slug": slugify(item["name"])} 170 | prop.update(item) 171 | properties.append(prop) 172 | return properties 173 | 174 | def check_schema_select_options(self, prop, values): 175 | """ 176 | Check and update the prop dict with new values 177 | """ 178 | schema_update = False 179 | current_options = list([p["value"].lower() for p in prop["options"]]) 180 | if not isinstance(values, list): 181 | values = [values] 182 | for v in values: 183 | if v and v.lower() not in current_options: 184 | schema_update = True 185 | prop["options"].append(NotionSelect(v).to_dict()) 186 | return schema_update, prop 187 | 188 | def get_schema_property(self, identifier): 189 | """ 190 | Look up a property in the collection's schema, by "property id" (generally a 4-char string), 191 | or name (human-readable -- there may be duplicates, so we pick the first match we find). 192 | """ 193 | for prop in self.get_schema_properties(): 194 | if identifier == prop["id"] or slugify(identifier) == prop["slug"]: 195 | return prop 196 | if identifier == "title" and prop["type"] == "title": 197 | return prop 198 | return None 199 | 200 | def add_row(self, update_views=True, **kwargs): 201 | """ 202 | Create a new empty CollectionRowBlock under this collection, and return the instance. 203 | """ 204 | 205 | row_id = self._client.create_record("block", self, type="page") 206 | row = CollectionRowBlock(self._client, row_id) 207 | 208 | with self._client.as_atomic_transaction(): 209 | for key, val in kwargs.items(): 210 | setattr(row, key, val) 211 | 212 | if update_views: 213 | # make sure the new record is inserted at the end of each view 214 | for view in self.parent.views: 215 | if view is None or isinstance(view, CalendarView): 216 | continue 217 | view.set("page_sort", view.get("page_sort", []) + [row_id]) 218 | 219 | return row 220 | 221 | @property 222 | def parent(self): 223 | assert self.get("parent_table") == "block" 224 | return self._client.get_block(self.get("parent_id")) 225 | 226 | def _get_a_collection_view(self): 227 | """ 228 | Get an arbitrary collection view for this collection, to allow querying. 229 | """ 230 | parent = self.parent 231 | assert isinstance(parent, CollectionViewBlock) 232 | assert len(parent.views) > 0 233 | return parent.views[0] 234 | 235 | def query(self, **kwargs): 236 | return CollectionQuery( 237 | self, self._get_a_collection_view(), space_id=self.get("space_id"), **kwargs 238 | ).execute() 239 | 240 | def get_rows(self, **kwargs): 241 | return self.query(**kwargs) 242 | 243 | def _convert_diff_to_changelist(self, difference, old_val, new_val): 244 | 245 | changes = [] 246 | remaining = [] 247 | 248 | for operation, path, values in difference: 249 | 250 | if path == "rows": 251 | changes.append((operation, path, values)) 252 | else: 253 | remaining.append((operation, path, values)) 254 | 255 | return changes + super()._convert_diff_to_changelist( 256 | remaining, old_val, new_val 257 | ) 258 | 259 | 260 | class CollectionView(Record): 261 | """ 262 | A "view" is a particular visualization of a collection, with a "type" (board, table, list, etc) 263 | and filters, sort, etc. 
264 |     """
265 | 
266 |     _table = "collection_view"
267 | 
268 |     name = field_map("name")
269 |     type = field_map("type")
270 | 
271 |     @property
272 |     def parent(self):
273 |         assert self.get("parent_table") == "block"
274 |         return self._client.get_block(self.get("parent_id"))
275 | 
276 |     def __init__(self, *args, collection, **kwargs):
277 |         self.collection = collection
278 |         super().__init__(*args, **kwargs)
279 | 
280 |     def build_query(self, **kwargs):
281 |         return CollectionQuery(
282 |             collection=self.collection,
283 |             collection_view=self,
284 |             space_id=self.get("space_id"),
285 |             **kwargs
286 |         )
287 | 
288 |     def default_query(self):
289 |         return self.build_query(**self.get("query", {}))
290 | 
291 | 
292 | class BoardView(CollectionView):
293 | 
294 |     _type = "board"
295 | 
296 |     group_by = field_map("query.group_by")
297 | 
298 | 
299 | class TableView(CollectionView):
300 | 
301 |     _type = "table"
302 | 
303 | 
304 | class ListView(CollectionView):
305 | 
306 |     _type = "list"
307 | 
308 | 
309 | class CalendarView(CollectionView):
310 | 
311 |     _type = "calendar"
312 | 
313 |     def build_query(self, **kwargs):
314 |         calendar_by = self._client.get_record_data("collection_view", self._id)[
315 |             "query"
316 |         ]["calendar_by"]
317 |         return super().build_query(calendar_by=calendar_by, **kwargs)
318 | 
319 | 
320 | class GalleryView(CollectionView):
321 | 
322 |     _type = "gallery"
323 | 
324 | 
325 | def _normalize_property_name(prop_name, collection):
326 |     if not prop_name:
327 |         return ""
328 |     else:
329 |         prop = collection.get_schema_property(prop_name)
330 |         if not prop:
331 |             return ""
332 |         return prop["id"]
333 | 
334 | 
335 | def _normalize_query_data(data, collection, recursing=False):
336 |     if not recursing:
337 |         data = deepcopy(data)
338 |     if isinstance(data, list):
339 |         return [
340 |             _normalize_query_data(item, collection, recursing=True) for item in data
341 |         ]
342 |     elif isinstance(data, dict):
343 |         # convert slugs to property ids
344 |         if "property" in data:
345 |             data["property"] = _normalize_property_name(data["property"], collection)
346 |         # convert any instantiated objects into their ids
347 |         if "value" in data:
348 |             if hasattr(data["value"], "id"):
349 |                 data["value"] = data["value"].id
350 |         for key in data:
351 |             data[key] = _normalize_query_data(data[key], collection, recursing=True)
352 |     return data
353 | 
354 | 
355 | class CollectionQuery(object):
356 |     def __init__(
357 |         self,
358 |         collection,
359 |         collection_view,
360 |         space_id,
361 |         search="",
362 |         type="table",
363 |         aggregate=[],
364 |         aggregations=[],
365 |         filter=[],
366 |         sort=[],
367 |         calendar_by="",
368 |         group_by="",
369 |         limit=100
370 |     ):
371 |         assert not (
372 |             aggregate and aggregations
373 |         ), "Use only one of `aggregate` or `aggregations` (old vs new format)"
374 |         self.collection = collection
375 |         self.collection_view = collection_view
376 |         self.space_id = space_id
377 |         self.search = search
378 |         self.type = type
379 |         self.aggregate = _normalize_query_data(aggregate, collection)
380 |         self.aggregations = _normalize_query_data(aggregations, collection)
381 |         self.filter = _normalize_query_data(filter, collection)
382 |         self.sort = _normalize_query_data(sort, collection)
383 |         self.calendar_by = _normalize_property_name(calendar_by, collection)
384 |         self.group_by = _normalize_property_name(group_by, collection)
385 |         self.limit = limit
386 |         self._client = collection._client
387 | 
388 |     def execute(self):
389 | 
390 |         result_class = QUERY_RESULT_TYPES.get(self.type, QueryResult)
391 | 
392 |         kwargs = {
393 | 
'collection_id':self.collection.id, 394 | 'collection_view_id':self.collection_view.id, 395 | 'space_id':self.space_id, 396 | 'search':self.search, 397 | 'type':self.type, 398 | 'aggregate':self.aggregate, 399 | 'aggregations':self.aggregations, 400 | 'filter':self.filter, 401 | 'sort':self.sort, 402 | 'calendar_by':self.calendar_by, 403 | 'group_by':self.group_by, 404 | 'limit':0 405 | } 406 | 407 | if self.limit == -1: 408 | # fetch remote total 409 | result = self._client.query_collection( 410 | **kwargs 411 | ) 412 | self.limit = result.get("total",-1) 413 | 414 | kwargs['limit'] = self.limit 415 | 416 | return result_class( 417 | self.collection, 418 | self._client.query_collection( 419 | **kwargs 420 | ), 421 | self, 422 | ) 423 | 424 | 425 | class CollectionRowBlock(PageBlock): 426 | @property 427 | def is_template(self): 428 | return self.get("is_template") 429 | 430 | @cached_property 431 | def collection(self): 432 | return self._client.get_collection(self.get("parent_id")) 433 | 434 | @property 435 | def schema(self): 436 | return [ 437 | prop 438 | for prop in self.collection.get_schema_properties() 439 | if prop["type"] not in ["formula", "rollup"] 440 | ] 441 | 442 | def __getattr__(self, attname): 443 | return self.get_property(attname) 444 | 445 | def __setattr__(self, attname, value): 446 | if attname.startswith("_"): 447 | # we only allow setting of new non-property attributes that start with "_" 448 | super().__setattr__(attname, value) 449 | elif attname in self._get_property_slugs(): 450 | self.set_property(attname, value) 451 | elif slugify(attname) in self._get_property_slugs(): 452 | self.set_property(slugify(attname), value) 453 | elif hasattr(self, attname): 454 | super().__setattr__(attname, value) 455 | else: 456 | raise AttributeError("Unknown property: '{}'".format(attname)) 457 | 458 | def _get_property_slugs(self): 459 | slugs = [prop["slug"] for prop in self.schema] 460 | if "title" not in slugs: 461 | slugs.append("title") 462 | return slugs 463 | 464 | def __dir__(self): 465 | return self._get_property_slugs() + super().__dir__() 466 | 467 | def get_property(self, identifier): 468 | 469 | prop = self.collection.get_schema_property(identifier) 470 | if prop is None: 471 | raise AttributeError( 472 | "Object does not have property '{}'".format(identifier) 473 | ) 474 | 475 | val = self.get(["properties", prop["id"]]) 476 | 477 | return self._convert_notion_to_python(val, prop) 478 | 479 | def _convert_diff_to_changelist(self, difference, old_val, new_val): 480 | 481 | changed_props = set() 482 | changes = [] 483 | remaining = [] 484 | 485 | for d in difference: 486 | operation, path, values = d 487 | path = path.split(".") if isinstance(path, str) else path 488 | if path and path[0] == "properties": 489 | if len(path) > 1: 490 | changed_props.add(path[1]) 491 | else: 492 | for item in values: 493 | changed_props.add(item[0]) 494 | else: 495 | remaining.append(d) 496 | 497 | for prop_id in changed_props: 498 | prop = self.collection.get_schema_property(prop_id) 499 | old = self._convert_notion_to_python( 500 | old_val.get("properties", {}).get(prop_id), prop 501 | ) 502 | new = self._convert_notion_to_python( 503 | new_val.get("properties", {}).get(prop_id), prop 504 | ) 505 | changes.append(("prop_changed", prop["slug"], (old, new))) 506 | 507 | return changes + super()._convert_diff_to_changelist( 508 | remaining, old_val, new_val 509 | ) 510 | 511 | def _convert_notion_to_python(self, val, prop): 512 | 513 | if prop["type"] in ["title", "text"]: 514 | val 
= notion_to_markdown(val) if val else "" 515 | if prop["type"] in ["number"]: 516 | if val is not None: 517 | val = val[0][0] 518 | if "." in val: 519 | val = float(val) 520 | else: 521 | val = int(val) 522 | if prop["type"] in ["select"]: 523 | val = val[0][0] if val else None 524 | if prop["type"] in ["multi_select"]: 525 | val = [v.strip() for v in val[0][0].split(",")] if val else [] 526 | if prop["type"] in ["person"]: 527 | val = ( 528 | [self._client.get_user(item[1][0][1]) for item in val if item[0] == "‣"] 529 | if val 530 | else [] 531 | ) 532 | if prop["type"] in ["email", "phone_number", "url"]: 533 | val = val[0][0] if val else "" 534 | if prop["type"] in ["date"]: 535 | val = NotionDate.from_notion(val) 536 | if prop["type"] in ["file"]: 537 | val = ( 538 | [ 539 | add_signed_prefix_as_needed( 540 | item[1][0][1], client=self._client, id=self.id 541 | ) 542 | for item in val 543 | if item[0] != "," 544 | ] 545 | if val 546 | else [] 547 | ) 548 | if prop["type"] in ["checkbox"]: 549 | val = val[0][0] == "Yes" if val else False 550 | if prop["type"] in ["relation"]: 551 | val = ( 552 | [ 553 | self._client.get_block(item[1][0][1]) 554 | for item in val 555 | if item[0] == "‣" 556 | ] 557 | if val 558 | else [] 559 | ) 560 | if prop["type"] in ["created_time", "last_edited_time"]: 561 | val = self.get(prop["type"]) 562 | val = datetime.utcfromtimestamp(val / 1000) 563 | if prop["type"] in ["created_by", "last_edited_by"]: 564 | val = self.get(prop["type"] + "_id") 565 | val = self._client.get_user(val) 566 | 567 | return val 568 | 569 | def get_all_properties(self): 570 | allprops = {} 571 | for prop in self.schema: 572 | propid = slugify(prop["name"]) 573 | allprops[propid] = self.get_property(propid) 574 | return allprops 575 | 576 | def set_property(self, identifier, val): 577 | 578 | prop = self.collection.get_schema_property(identifier) 579 | if prop is None: 580 | raise AttributeError( 581 | "Object does not have property '{}'".format(identifier) 582 | ) 583 | if prop["type"] in ["select"] or prop["type"] in ["multi_select"]: 584 | schema_update, prop = self.collection.check_schema_select_options(prop, val) 585 | if schema_update: 586 | self.collection.set( 587 | "schema.{}.options".format(prop["id"]), prop["options"] 588 | ) 589 | 590 | path, val = self._convert_python_to_notion(val, prop, identifier=identifier) 591 | 592 | self.set(path, val) 593 | 594 | def _convert_python_to_notion(self, val, prop, identifier=""): 595 | 596 | if prop["type"] in ["title", "text"]: 597 | if not val: 598 | val = "" 599 | if not isinstance(val, str): 600 | raise TypeError( 601 | "Value passed to property '{}' must be a string.".format(identifier) 602 | ) 603 | val = markdown_to_notion(val) 604 | if prop["type"] in ["number"]: 605 | if val is not None: 606 | if not isinstance(val, float) and not isinstance(val, int): 607 | raise TypeError( 608 | "Value passed to property '{}' must be an int or float.".format( 609 | identifier 610 | ) 611 | ) 612 | val = [[str(val)]] 613 | if prop["type"] in ["select"]: 614 | if not val: 615 | val = None 616 | else: 617 | valid_options = [p["value"].lower() for p in prop["options"]] 618 | val = val.split(",")[0] 619 | if val.lower() not in valid_options: 620 | raise ValueError( 621 | "Value '{}' not acceptable for property '{}' (valid options: {})".format( 622 | val, identifier, valid_options 623 | ) 624 | ) 625 | val = [[val]] 626 | if prop["type"] in ["multi_select"]: 627 | if not val: 628 | val = [] 629 | valid_options = [p["value"].lower() for p in 
prop["options"]] 630 | if not isinstance(val, list): 631 | val = [val] 632 | for v in val: 633 | if v and v.lower() not in valid_options: 634 | raise ValueError( 635 | "Value '{}' not acceptable for property '{}' (valid options: {})".format( 636 | v, identifier, valid_options 637 | ) 638 | ) 639 | val = [[",".join(val)]] 640 | if prop["type"] in ["person"]: 641 | userlist = [] 642 | if not isinstance(val, list): 643 | val = [val] 644 | for user in val: 645 | user_id = user if isinstance(user, str) else user.id 646 | userlist += [["‣", [["u", user_id]]], [","]] 647 | val = userlist[:-1] 648 | if prop["type"] in ["email", "phone_number", "url"]: 649 | val = [[val, [["a", val]]]] 650 | if prop["type"] in ["date"]: 651 | if isinstance(val, date) or isinstance(val, datetime): 652 | val = NotionDate(val) 653 | if isinstance(val, NotionDate): 654 | val = val.to_notion() 655 | else: 656 | val = [] 657 | if prop["type"] in ["file"]: 658 | filelist = [] 659 | if not isinstance(val, list): 660 | val = [val] 661 | for url in val: 662 | url = remove_signed_prefix_as_needed(url) 663 | filename = url.split("/")[-1] 664 | filelist += [[filename, [["a", url]]], [","]] 665 | val = filelist[:-1] 666 | if prop["type"] in ["checkbox"]: 667 | if not isinstance(val, bool): 668 | raise TypeError( 669 | "Value passed to property '{}' must be a bool.".format(identifier) 670 | ) 671 | val = [["Yes" if val else "No"]] 672 | if prop["type"] in ["relation"]: 673 | pagelist = [] 674 | if not isinstance(val, list): 675 | val = [val] 676 | for page in val: 677 | if isinstance(page, str): 678 | page = self._client.get_block(page) 679 | pagelist += [["‣", [["p", page.id]]], [","]] 680 | val = pagelist[:-1] 681 | if prop["type"] in ["created_time", "last_edited_time"]: 682 | val = int(val.timestamp() * 1000) 683 | return prop["type"], val 684 | if prop["type"] in ["created_by", "last_edited_by"]: 685 | val = val if isinstance(val, str) else val.id 686 | return prop["type"], val 687 | 688 | return ["properties", prop["id"]], val 689 | 690 | def remove(self): 691 | # Mark the block as inactive 692 | self._client.submit_transaction( 693 | build_operation( 694 | id=self.id, path=[], args={"alive": False}, command="update" 695 | ) 696 | ) 697 | 698 | 699 | class TemplateBlock(CollectionRowBlock): 700 | @property 701 | def is_template(self): 702 | return self.get("is_template") 703 | 704 | @is_template.setter 705 | def is_template(self, val): 706 | assert val is True, "Templates must have 'is_template' set to True." 
707 |         self.set("is_template", True)
708 | 
709 | 
710 | class Templates(Children):
711 | 
712 |     child_list_key = "template_pages"
713 | 
714 |     def _content_list(self):
715 |         return self._parent.get(self.child_list_key) or []
716 | 
717 |     def add_new(self, **kwargs):
718 | 
719 |         kwargs["block_type"] = "page"
720 |         kwargs["child_list_key"] = self.child_list_key
721 |         kwargs["is_template"] = True
722 | 
723 |         return super().add_new(**kwargs)
724 | 
725 | 
726 | class QueryResult(object):
727 |     def __init__(self, collection, result, query):
728 |         self.collection = collection
729 |         self._client = collection._client
730 |         self._block_ids = self._get_block_ids(result)
731 |         self.total = result.get("total", -1)
732 |         self.aggregates = result.get("aggregationResults", [])
733 |         self.aggregate_ids = [
734 |             agg.get("id") for agg in (query.aggregate or query.aggregations)
735 |         ]
736 |         self.query = query
737 | 
738 |     def _get_block_ids(self, result):
739 |         return result["reducerResults"]["collection_group_results"]["blockIds"]
740 | 
741 |     def _get_block(self, id):
742 |         block = CollectionRowBlock(self._client, id)
743 |         block.__dict__["collection"] = self.collection
744 |         return block
745 | 
746 |     def get_aggregate(self, id):
747 |         for agg_id, agg in zip(self.aggregate_ids, self.aggregates):
748 |             if id == agg_id:
749 |                 return agg["value"]
750 |         return None
751 | 
752 |     def __repr__(self):
753 |         if not len(self):
754 |             return "[]"
755 |         rep = "[\n"
756 |         for child in self:
757 |             rep += "  {},\n".format(repr(child))
758 |         rep += "]"
759 |         return rep
760 | 
761 |     def __len__(self):
762 |         return len(self._block_ids)
763 | 
764 |     def __getitem__(self, key):
765 |         return list(iter(self))[key]
766 | 
767 |     def __iter__(self):
768 |         return iter(self._get_block(id) for id in self._block_ids)
769 | 
770 |     def __reversed__(self):
771 |         return reversed(list(self))
772 | 
773 |     def __contains__(self, item):
774 |         if isinstance(item, str):
775 |             item_id = extract_id(item)
776 |         elif isinstance(item, Block):
777 |             item_id = item.id
778 |         else:
779 |             return False
780 |         return item_id in self._block_ids
781 | 
782 | class TableQueryResult(QueryResult):
783 | 
784 |     _type = "table"
785 | 
786 | 
787 | class BoardQueryResult(QueryResult):
788 | 
789 |     _type = "board"
790 | 
791 | 
792 | class CalendarQueryResult(QueryResult):
793 | 
794 |     _type = "calendar"
795 | 
796 |     def _get_block_ids(self, result):
797 |         block_ids = []
798 |         for week in result["weeks"]:
799 |             block_ids += week["items"]
800 |         return block_ids
801 | 
802 | 
803 | class ListQueryResult(QueryResult):
804 | 
805 |     _type = "list"
806 | 
807 | 
808 | class GalleryQueryResult(QueryResult):
809 | 
810 |     _type = "gallery"
811 | 
812 | 
813 | COLLECTION_VIEW_TYPES = {
814 |     cls._type: cls
815 |     for cls in locals().values()
816 |     if type(cls) == type and issubclass(cls, CollectionView) and hasattr(cls, "_type")
817 | }
818 | 
819 | QUERY_RESULT_TYPES = {
820 |     cls._type: cls
821 |     for cls in locals().values()
822 |     if type(cls) == type and issubclass(cls, QueryResult) and hasattr(cls, "_type")
823 | }
824 | 
--------------------------------------------------------------------------------
/notion/logger.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import os
3 | 
4 | from .settings import LOG_FILE
5 | 
6 | 
7 | NOTIONPY_LOG_LEVEL = os.environ.get("NOTIONPY_LOG_LEVEL", "warning").lower()
8 | 
9 | logger = logging.getLogger("notion")
10 | 
11 | 
12 | def enable_debugging():
13 |     set_log_level(logging.DEBUG)
14 | 
15 | 
16 | def 
set_log_level(level): 17 | logger.setLevel(level) 18 | handler.setLevel(level) 19 | 20 | 21 | if NOTIONPY_LOG_LEVEL == "disabled": 22 | handler = logging.NullHandler() 23 | logger.addHandler(handler) 24 | else: 25 | handler = logging.FileHandler(LOG_FILE) 26 | formatter = logging.Formatter("\n%(asctime)s - %(levelname)s - %(message)s") 27 | handler.setFormatter(formatter) 28 | logger.addHandler(handler) 29 | 30 | if NOTIONPY_LOG_LEVEL == "debug": 31 | set_log_level(logging.DEBUG) 32 | elif NOTIONPY_LOG_LEVEL == "info": 33 | set_log_level(logging.INFO) 34 | elif NOTIONPY_LOG_LEVEL == "warning": 35 | set_log_level(logging.WARNING) 36 | elif NOTIONPY_LOG_LEVEL == "error": 37 | set_log_level(logging.ERROR) 38 | else: 39 | raise Exception( 40 | "Invalid value for environment variable NOTIONPY_LOG_LEVEL: {}".format( 41 | NOTIONPY_LOG_LEVEL 42 | ) 43 | ) 44 | -------------------------------------------------------------------------------- /notion/maps.py: -------------------------------------------------------------------------------- 1 | from inspect import signature 2 | 3 | from .logger import logger 4 | from .markdown import markdown_to_notion, notion_to_markdown 5 | 6 | 7 | class mapper(property): 8 | def __init__(self, path, python_to_api, api_to_python, *args, **kwargs): 9 | self.python_to_api = python_to_api 10 | self.api_to_python = api_to_python 11 | self.path = ( 12 | ".".join(map(str, path)) 13 | if isinstance(path, list) or isinstance(path, tuple) 14 | else path 15 | ) 16 | super().__init__(*args, **kwargs) 17 | 18 | 19 | def field_map(path, python_to_api=lambda x: x, api_to_python=lambda x: x): 20 | """ 21 | Returns a property that maps a Block attribute onto a field in the API data structures. 22 | 23 | - `path` can either be a top-level field-name, a list that specifies the key names to traverse, 24 | or a dot-delimited string representing the same traversal. 25 | 26 | - `python_to_api` is a function that converts values as given in the Python layer into the 27 | internal representation to be sent along in the API request. 28 | 29 | - `api_to_python` is a function that converts what is received from the API into an internal 30 | representation to be returned to the Python layer. 31 | """ 32 | 33 | if isinstance(path, str): 34 | path = path.split(".") 35 | 36 | def fget(self): 37 | kwargs = {} 38 | if ( 39 | "client" in signature(api_to_python).parameters 40 | and "id" in signature(api_to_python).parameters 41 | ): 42 | kwargs["client"] = self._client 43 | kwargs["id"] = self.id 44 | return api_to_python(self.get(path), **kwargs) 45 | 46 | def fset(self, value): 47 | kwargs = {} 48 | if "client" in signature(python_to_api).parameters: 49 | kwargs["client"] = self._client 50 | self.set(path, python_to_api(value, **kwargs)) 51 | 52 | return mapper( 53 | fget=fget, 54 | fset=fset, 55 | path=path, 56 | python_to_api=python_to_api, 57 | api_to_python=api_to_python, 58 | ) 59 | 60 | 61 | def property_map( 62 | name, python_to_api=lambda x: x, api_to_python=lambda x: x, markdown=True 63 | ): 64 | """ 65 | Similar to `field_map`, except it works specifically with the data under the "properties" field 66 | in the API's block table, and just takes a single name to specify which subkey to reference. 67 | Also, these properties all seem to use a special "embedded list" format that breaks the text 68 | up into a sequence of chunks and associated format metadata. If `markdown` is True, we convert 69 | this representation into commonmark-compatible markdown, and back again when saving. 
70 | """ 71 | 72 | def py2api(x, client=None): 73 | kwargs = {} 74 | if "client" in signature(python_to_api).parameters: 75 | kwargs["client"] = client 76 | x = python_to_api(x, **kwargs) 77 | if markdown: 78 | x = markdown_to_notion(x) 79 | return x 80 | 81 | def api2py(x, client=None, id=""): 82 | x = x or [[""]] 83 | if markdown: 84 | x = notion_to_markdown(x) 85 | kwargs = {} 86 | params = signature(api_to_python).parameters 87 | if "client" in params: 88 | kwargs["client"] = client 89 | if "id" in params: 90 | kwargs["id"] = id 91 | return api_to_python(x, **kwargs) 92 | 93 | return field_map(["properties", name], python_to_api=py2api, api_to_python=api2py) 94 | 95 | 96 | def joint_map(*mappings): 97 | """ 98 | Combine multiple `field_map` and `property_map` instances together to map an attribute to multiple API fields. 99 | Note: when "getting", the first one will be used. When "setting", they will all be set in parallel. 100 | """ 101 | 102 | def fget(self): 103 | return mappings[0].fget(self) 104 | 105 | def fset(self, value): 106 | for m in mappings: 107 | m.fset(self, value) 108 | 109 | return property(fget=fget, fset=fset) 110 | -------------------------------------------------------------------------------- /notion/markdown.py: -------------------------------------------------------------------------------- 1 | import commonmark 2 | import re 3 | import html 4 | from xml.dom import minidom 5 | 6 | from commonmark.dump import prepare 7 | 8 | 9 | delimiters = { 10 | "!", 11 | '"', 12 | "#", 13 | "$", 14 | "%", 15 | "&", 16 | "'", 17 | "(", 18 | ")", 19 | "*", 20 | "+", 21 | ",", 22 | "-", 23 | ".", 24 | "/", 25 | ":", 26 | ";", 27 | "<", 28 | "=", 29 | ">", 30 | "?", 31 | "@", 32 | "[", 33 | "\\", 34 | "]", 35 | "^", 36 | "_", 37 | "`", 38 | "{", 39 | "|", 40 | "}", 41 | "~", 42 | "☃", 43 | " ", 44 | "\t", 45 | "\n", 46 | "\x0b", 47 | "\x0c", 48 | "\r", 49 | "\x1c", 50 | "\x1d", 51 | "\x1e", 52 | "\x1f", 53 | "\x85", 54 | "\xa0", 55 | "\u1680", 56 | "\u2000", 57 | "\u2001", 58 | "\u2002", 59 | "\u2003", 60 | "\u2004", 61 | "\u2005", 62 | "\u2006", 63 | "\u2007", 64 | "\u2008", 65 | "\u2009", 66 | "\u200a", 67 | "\u2028", 68 | "\u2029", 69 | "\u202f", 70 | "\u205f", 71 | "\u3000", 72 | } 73 | 74 | _NOTION_TO_MARKDOWN_MAPPER = {"i": "☃", "b": "☃☃", "s": "~~", "c": "`", "e": "$$"} 75 | 76 | FORMAT_PRECEDENCE = ["s", "b", "i", "a", "c", "e"] 77 | 78 | 79 | def _extract_text_and_format_from_ast(item): 80 | 81 | if item["type"] == "html_inline": 82 | if item.get("literal", "") == "": 83 | return "", ("s",) 84 | if item.get("literal", "").startswith("" 87 | ).documentElement 88 | equation = elem.attributes["equation"].value 89 | return "", ("e", equation) 90 | 91 | if item["type"] == "emph": 92 | return item.get("literal", ""), ("i",) 93 | 94 | if item["type"] == "strong": 95 | return item.get("literal", ""), ("b",) 96 | 97 | if item["type"] == "code": 98 | return item.get("literal", ""), ("c",) 99 | 100 | if item["type"] == "link": 101 | return item.get("literal", ""), ("a", item.get("destination", "#")) 102 | 103 | return item.get("literal", ""), () 104 | 105 | 106 | def _get_format(notion_segment, as_set=False): 107 | if len(notion_segment) == 1: 108 | if as_set: 109 | return set() 110 | else: 111 | return [] 112 | else: 113 | if as_set: 114 | return set([tuple(f) for f in notion_segment[1]]) 115 | else: 116 | return notion_segment[1] 117 | 118 | 119 | def markdown_to_notion(markdown): 120 | 121 | if not isinstance(markdown, str): 122 | markdown = str(markdown) 123 | 124 | # 
commonmark doesn't support strikethrough, so we need to handle it ourselves 125 | while markdown.count("~~") >= 2: 126 | markdown = markdown.replace("~~", "", 1) 127 | markdown = markdown.replace("~~", "", 1) 128 | 129 | # commonmark doesn't support latex blocks, so we need to handle it ourselves 130 | def handle_latex(match): 131 | return '\u204d'.format( 132 | html.escape(match.group(0)[2:-2]) 133 | ) 134 | 135 | markdown = re.sub( 136 | r"(?": 168 | format.remove(("s",)) 169 | literal = "" 170 | 171 | if item["type"] == "html_inline" and literal == "": 172 | for f in filter(lambda f: f[0] == "e", format): 173 | format.remove(f) 174 | break 175 | literal = "" 176 | 177 | if item["type"] == "softbreak": 178 | literal = "\n" 179 | 180 | if literal: 181 | notion.append( 182 | [literal, [list(f) for f in sorted(format)]] 183 | if format 184 | else [literal] 185 | ) 186 | 187 | # in the ast format, code blocks are meant to be immediately self-closing 188 | if ("c",) in format: 189 | format.remove(("c",)) 190 | 191 | # remove any trailing newlines from automatic closing paragraph markers 192 | if notion: 193 | notion[-1][0] = notion[-1][0].rstrip("\n") 194 | 195 | # consolidate any adjacent text blocks with identical styles 196 | consolidated = [] 197 | for item in notion: 198 | if consolidated and _get_format(consolidated[-1], as_set=True) == _get_format( 199 | item, as_set=True 200 | ): 201 | consolidated[-1][0] += item[0] 202 | elif item[0]: 203 | consolidated.append(item) 204 | 205 | return cleanup_dashes(consolidated) 206 | 207 | 208 | def cleanup_dashes(thing): 209 | regex_pattern = re.compile("⸻|%E2%B8%BB") 210 | if type(thing) is list: 211 | for counter, value in enumerate(thing): 212 | thing[counter] = cleanup_dashes(value) 213 | elif type(thing) is str: 214 | return regex_pattern.sub("-", thing) 215 | 216 | return thing 217 | 218 | 219 | def notion_to_markdown(notion): 220 | 221 | markdown_chunks = [] 222 | 223 | use_underscores = True 224 | 225 | for item in notion or []: 226 | 227 | markdown = "" 228 | 229 | text = item[0] 230 | format = item[1] if len(item) == 2 else [] 231 | 232 | match = re.match( 233 | "^(?P\s*)(?P(\s|.)*?)(?P\s*)$", text 234 | ) 235 | if not match: 236 | raise Exception("Unable to extract text from: %r" % text) 237 | 238 | leading_whitespace = match.groupdict()["leading"] 239 | stripped = match.groupdict()["stripped"] 240 | trailing_whitespace = match.groupdict()["trailing"] 241 | 242 | markdown += leading_whitespace 243 | 244 | sorted_format = sorted( 245 | format, 246 | key=lambda x: FORMAT_PRECEDENCE.index(x[0]) 247 | if x[0] in FORMAT_PRECEDENCE 248 | else -1, 249 | ) 250 | 251 | for f in sorted_format: 252 | if f[0] in _NOTION_TO_MARKDOWN_MAPPER: 253 | if stripped: 254 | markdown += _NOTION_TO_MARKDOWN_MAPPER[f[0]] 255 | if f[0] == "a": 256 | markdown += "[" 257 | 258 | # Check wheter a format modifies the content 259 | content_changed = False 260 | for f in sorted_format: 261 | if f[0] == "e": 262 | markdown += f[1] 263 | content_changed = True 264 | 265 | if not content_changed: 266 | markdown += stripped 267 | 268 | for f in reversed(sorted_format): 269 | if f[0] in _NOTION_TO_MARKDOWN_MAPPER: 270 | if stripped: 271 | markdown += _NOTION_TO_MARKDOWN_MAPPER[f[0]] 272 | if f[0] == "a": 273 | markdown += "]({})".format(f[1]) 274 | 275 | markdown += trailing_whitespace 276 | 277 | # to make it parseable, add a space after if it combines code/links and emphasis formatting 278 | format_types = [f[0] for f in format] 279 | if ( 280 | ("c" in format_types or 
"a" in format_types) 281 | and ("b" in format_types or "i" in format_types) 282 | and not trailing_whitespace 283 | ): 284 | markdown += " " 285 | 286 | markdown_chunks.append(markdown) 287 | 288 | # use underscores as needed to separate adjacent chunks to avoid ambiguous runs of asterisks 289 | full_markdown = "" 290 | last_used_underscores = False 291 | for i in range(len(markdown_chunks)): 292 | prev = markdown_chunks[i - 1] if i > 0 else "" 293 | curr = markdown_chunks[i] 294 | next = markdown_chunks[i + 1] if i < len(markdown_chunks) - 1 else "" 295 | prev_ended_in_delimiter = not prev or prev[-1] in delimiters 296 | next_starts_with_delimiter = not next or next[0] in delimiters 297 | if ( 298 | prev_ended_in_delimiter 299 | and next_starts_with_delimiter 300 | and not last_used_underscores 301 | and curr.startswith("☃") 302 | and curr.endswith("☃") 303 | ): 304 | if curr[1] == "☃": 305 | count = 2 306 | else: 307 | count = 1 308 | curr = "_" * count + curr[count:-count] + "_" * count 309 | last_used_underscores = True 310 | else: 311 | last_used_underscores = False 312 | 313 | final_markdown = curr.replace("☃", "*") 314 | 315 | # to make it parseable, convert emphasis/strong combinations to use a mix of _ and * 316 | if "***" in final_markdown: 317 | final_markdown = final_markdown.replace("***", "**_", 1) 318 | final_markdown = final_markdown.replace("***", "_**", 1) 319 | 320 | full_markdown += final_markdown 321 | 322 | return full_markdown 323 | 324 | 325 | def notion_to_plaintext(notion, client=None): 326 | 327 | plaintext = "" 328 | 329 | for item in notion or []: 330 | 331 | text = item[0] 332 | formats = item[1] if len(item) == 2 else [] 333 | 334 | if text == "‣": 335 | 336 | for f in formats: 337 | if f[0] == "p": # page link 338 | if client is None: 339 | plaintext += "page:" + f[1] 340 | else: 341 | plaintext += client.get_block(f[1]).title_plaintext 342 | elif f[0] == "u": # user link 343 | if client is None: 344 | plaintext += "user:" + f[1] 345 | else: 346 | plaintext += client.get_user(f[1]).full_name 347 | 348 | continue 349 | 350 | plaintext += text 351 | 352 | return plaintext 353 | 354 | 355 | def plaintext_to_notion(plaintext): 356 | 357 | return [[plaintext]] 358 | -------------------------------------------------------------------------------- /notion/monitor.py: -------------------------------------------------------------------------------- 1 | import json 2 | import re 3 | import requests 4 | import threading 5 | import time 6 | import uuid 7 | 8 | from collections import defaultdict 9 | from inspect import signature 10 | from requests import HTTPError 11 | 12 | from .collection import Collection 13 | from .logger import logger 14 | from .records import Record 15 | 16 | 17 | class Monitor(object): 18 | 19 | thread = None 20 | 21 | def __init__(self, client, root_url="https://msgstore.www.notion.so/primus/"): 22 | self.client = client 23 | self.session_id = str(uuid.uuid4()) 24 | self.root_url = root_url 25 | self._subscriptions = set() 26 | self.initialize() 27 | 28 | def _decode_numbered_json_thing(self, thing): 29 | 30 | thing = thing.decode().strip() 31 | 32 | for ping in re.findall('\d+:\d+"primus::ping::\d+"', thing): 33 | logger.debug("Received ping: {}".format(ping)) 34 | self.post_data(ping.replace("::ping::", "::pong::")) 35 | 36 | results = [] 37 | for blob in re.findall("\d+:\d+(\{.*?\})(?=\d|$)", thing): 38 | results.append(json.loads(blob)) 39 | if thing and not results and "::ping::" not in thing: 40 | logger.debug("Could not parse monitoring 
response: {}".format(thing)) 41 | return results 42 | 43 | def _encode_numbered_json_thing(self, data): 44 | assert isinstance(data, list) 45 | results = "" 46 | for obj in data: 47 | msg = str(len(obj)) + json.dumps(obj, separators=(",", ":")) 48 | msg = "{}:{}".format(len(msg), msg) 49 | results += msg 50 | return results.encode() 51 | 52 | def initialize(self): 53 | 54 | logger.debug("Initializing new monitoring session.") 55 | 56 | response = self.client.session.get( 57 | "{}?sessionId={}&EIO=3&transport=polling".format( 58 | self.root_url, self.session_id 59 | ) 60 | ) 61 | 62 | self.sid = self._decode_numbered_json_thing(response.content)[0]["sid"] 63 | 64 | logger.debug("New monitoring session ID is: {}".format(self.sid)) 65 | 66 | # resubscribe to any existing subscriptions if we're reconnecting 67 | old_subscriptions, self._subscriptions = self._subscriptions, set() 68 | self.subscribe(old_subscriptions) 69 | 70 | def subscribe(self, records): 71 | 72 | if isinstance(records, set): 73 | records = list(records) 74 | 75 | if not isinstance(records, list): 76 | records = [records] 77 | 78 | sub_data = [] 79 | 80 | for record in records: 81 | 82 | if record not in self._subscriptions: 83 | 84 | logger.debug( 85 | "Subscribing new record to the monitoring watchlist: {}/{}".format( 86 | record._table, record.id 87 | ) 88 | ) 89 | 90 | # add the record to the list of records to restore if we're disconnected 91 | self._subscriptions.add(record) 92 | 93 | # subscribe to changes to the record itself 94 | sub_data.append( 95 | { 96 | "type": "/api/v1/registerSubscription", 97 | "requestId": str(uuid.uuid4()), 98 | "key": "versions/{}:{}".format(record.id, record._table), 99 | "version": record.get("version", -1), 100 | } 101 | ) 102 | 103 | # if it's a collection, subscribe to changes to its children too 104 | if isinstance(record, Collection): 105 | sub_data.append( 106 | { 107 | "type": "/api/v1/registerSubscription", 108 | "requestId": str(uuid.uuid4()), 109 | "key": "collection/{}".format(record.id), 110 | "version": -1, 111 | } 112 | ) 113 | 114 | data = self._encode_numbered_json_thing(sub_data) 115 | 116 | self.post_data(data) 117 | 118 | def post_data(self, data): 119 | 120 | if not data: 121 | return 122 | 123 | logger.debug("Posting monitoring data: {}".format(data)) 124 | 125 | self.client.session.post( 126 | "{}?sessionId={}&transport=polling&sid={}".format( 127 | self.root_url, self.session_id, self.sid 128 | ), 129 | data=data, 130 | ) 131 | 132 | def poll(self, retries=10): 133 | logger.debug("Starting new long-poll request") 134 | try: 135 | response = self.client.session.get( 136 | "{}?sessionId={}&EIO=3&transport=polling&sid={}".format( 137 | self.root_url, self.session_id, self.sid 138 | ) 139 | ) 140 | response.raise_for_status() 141 | except HTTPError as e: 142 | try: 143 | message = "{} / {}".format(response.content, e) 144 | except: 145 | message = "{}".format(e) 146 | logger.warn( 147 | "Problem with submitting polling request: {} (will retry {} more times)".format( 148 | message, retries 149 | ) 150 | ) 151 | time.sleep(0.1) 152 | if retries <= 0: 153 | raise 154 | if retries <= 5: 155 | logger.error( 156 | "Persistent error submitting polling request: {} (will retry {} more times)".format( 157 | message, retries 158 | ) 159 | ) 160 | # if we're close to giving up, also try reinitializing the session 161 | self.initialize() 162 | self.poll(retries=retries - 1) 163 | 164 | self._refresh_updated_records( 165 | self._decode_numbered_json_thing(response.content) 166 | ) 
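Once `poll()` (or `poll_async()`) is running, any record whose remote version is newer than the local cache gets refreshed, and callbacks registered on that record fire with a computed change list. The following is a minimal usage sketch, not part of the library's own source: it assumes the client exposes `monitor`/`start_monitoring` keyword arguments to enable this machinery, and the token and page URL are placeholders.

```Python
from notion.client import NotionClient

# Placeholder values -- substitute a real token_v2 cookie and a page you can edit.
# The monitor/start_monitoring flags are assumed here to be accepted by NotionClient.
client = NotionClient(token_v2="<token_v2>", monitor=True, start_monitoring=True)
page = client.get_block("<page URL or id>")


def on_change(record, difference, changes):
    # The store passes only the keyword arguments this signature accepts;
    # `changes` is produced by Record._convert_diff_to_changelist().
    print("Record {} changed: {}".format(record.id, changes))


page.add_callback(on_change)
```

Callbacks are dispatched on their own daemon thread (see `Callback.__call__` in `store.py` later in this listing), so a slow handler will not block the polling loop.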
167 | 168 | def _refresh_updated_records(self, events): 169 | 170 | records_to_refresh = defaultdict(list) 171 | 172 | for event in events: 173 | 174 | logger.debug( 175 | "Received the following event from the remote server: {}".format(event) 176 | ) 177 | 178 | if not isinstance(event, dict): 179 | continue 180 | 181 | if event.get("type", "") == "notification": 182 | 183 | key = event.get("key") 184 | 185 | if key.startswith("versions/"): 186 | 187 | match = re.match("versions/([^\:]+):(.+)", key) 188 | if not match: 189 | continue 190 | 191 | record_id, record_table = match.groups() 192 | 193 | local_version = self.client._store.get_current_version( 194 | record_table, record_id 195 | ) 196 | if event["value"] > local_version: 197 | logger.debug( 198 | "Record {}/{} has changed; refreshing to update from version {} to version {}".format( 199 | record_table, record_id, local_version, event["value"] 200 | ) 201 | ) 202 | records_to_refresh[record_table].append(record_id) 203 | else: 204 | logger.debug( 205 | "Record {}/{} already at version {}, not trying to update to version {}".format( 206 | record_table, record_id, local_version, event["value"] 207 | ) 208 | ) 209 | 210 | if key.startswith("collection/"): 211 | 212 | match = re.match("collection/(.+)", key) 213 | if not match: 214 | continue 215 | 216 | collection_id = match.groups()[0] 217 | 218 | self.client.refresh_collection_rows(collection_id) 219 | row_ids = self.client._store.get_collection_rows(collection_id) 220 | 221 | logger.debug( 222 | "Something inside collection {} has changed; refreshing all {} rows inside it".format( 223 | collection_id, len(row_ids) 224 | ) 225 | ) 226 | 227 | records_to_refresh["block"] += row_ids 228 | 229 | self.client.refresh_records(**records_to_refresh) 230 | 231 | def poll_async(self): 232 | if self.thread: 233 | # Already polling async; no need to have two threads 234 | return 235 | self.thread = threading.Thread(target=self.poll_forever, daemon=True) 236 | self.thread.start() 237 | 238 | def poll_forever(self): 239 | while True: 240 | try: 241 | self.poll() 242 | except Exception as e: 243 | logger.error("Encountered error during polling!") 244 | logger.error(e, exc_info=True) 245 | time.sleep(1) 246 | -------------------------------------------------------------------------------- /notion/operations.py: -------------------------------------------------------------------------------- 1 | from .utils import now 2 | 3 | 4 | def build_operation(id, path, args, command="set", table="block"): 5 | """ 6 | Data updates sent to the submitTransaction endpoint consist of a sequence of "operations". This is a helper 7 | function that constructs one of these operations. 8 | """ 9 | 10 | if isinstance(path, str): 11 | path = path.split(".") 12 | 13 | return {"id": id, "path": path, "args": args, "command": command, "table": table} 14 | 15 | 16 | def operation_update_last_edited(user_id, block_id): 17 | """ 18 | When transactions are submitted from the web UI, it also includes an operation to update the "last edited" 19 | fields, so we want to send those too, for consistency -- this convenience function constructs the operation. 
20 | """ 21 | return { 22 | "args": { 23 | "last_edited_by_id": user_id, 24 | "last_edited_by_table": "notion_user", 25 | "last_edited_time": now(), 26 | }, 27 | "command": "update", 28 | "id": block_id, 29 | "path": [], 30 | "table": "block", 31 | } 32 | -------------------------------------------------------------------------------- /notion/records.py: -------------------------------------------------------------------------------- 1 | from copy import deepcopy 2 | 3 | from .logger import logger 4 | from .operations import build_operation 5 | from .utils import extract_id, get_by_path 6 | 7 | 8 | class Record(object): 9 | 10 | # if a subclass has a list of ids that should be update when child records are removed, it should specify the key here 11 | child_list_key = None 12 | 13 | def __init__(self, client, id, *args, **kwargs): 14 | self._client = client 15 | self._id = extract_id(id) 16 | self._callbacks = [] 17 | if self._client._monitor is not None: 18 | self._client._monitor.subscribe(self) 19 | 20 | @property 21 | def id(self): 22 | return self._id 23 | 24 | @property 25 | def role(self): 26 | return self._client._store.get_role(self._table, self.id) 27 | 28 | def _str_fields(self): 29 | """ 30 | Determines the list of fields to include in the __str__ representation. Override and extend this in subclasses. 31 | """ 32 | return ["id"] 33 | 34 | def __str__(self): 35 | return ", ".join( 36 | [ 37 | "{}={}".format(field, repr(getattr(self, field))) 38 | for field in self._str_fields() 39 | if getattr(self, field, "") 40 | ] 41 | ) 42 | 43 | def __repr__(self): 44 | return "<{} ({})>".format(self.__class__.__name__, self) 45 | 46 | def refresh(self): 47 | """ 48 | Update the cached data for this record from the server (data for other records may be updated as a side effect). 49 | """ 50 | self._get_record_data(force_refresh=True) 51 | 52 | def _convert_diff_to_changelist(self, difference, old_val, new_val): 53 | changed_values = set() 54 | for operation, path, values in deepcopy(difference): 55 | path = path.split(".") if isinstance(path, str) else path 56 | if operation in ["add", "remove"]: 57 | path.append(values[0][0]) 58 | while isinstance(path[-1], int): 59 | path.pop() 60 | changed_values.add(".".join(map(str, path))) 61 | return [ 62 | ( 63 | "changed_value", 64 | path, 65 | (get_by_path(path, old_val), get_by_path(path, new_val)), 66 | ) 67 | for path in changed_values 68 | ] 69 | 70 | def add_callback(self, callback, callback_id=None, extra_kwargs={}): 71 | assert callable( 72 | callback 73 | ), "The callback must be a 'callable' object, such as a function." 
74 | callback_obj = self._client._store.add_callback( 75 | self, callback, callback_id=callback_id, extra_kwargs=extra_kwargs 76 | ) 77 | self._callbacks.append(callback_obj) 78 | return callback_obj 79 | 80 | def remove_callbacks(self, callback_or_callback_id_prefix=None): 81 | if callback_or_callback_id_prefix is None: 82 | for callback_obj in list(self._callbacks): 83 | self._client._store.remove_callbacks( 84 | self._table, self.id, callback_or_callback_id_prefix=callback_obj 85 | ) 86 | self._callbacks = [] 87 | else: 88 | self._client._store.remove_callbacks( 89 | self._table, 90 | self.id, 91 | callback_or_callback_id_prefix=callback_or_callback_id_prefix, 92 | ) 93 | if callback_or_callback_id_prefix in self._callbacks: 94 | self._callbacks.remove(callback_or_callback_id_prefix) 95 | 96 | def _get_record_data(self, force_refresh=False): 97 | return self._client.get_record_data( 98 | self._table, self.id, force_refresh=force_refresh 99 | ) 100 | 101 | def get(self, path=[], default=None, force_refresh=False): 102 | """ 103 | Retrieve cached data for this record. The `path` is a list (or dot-delimited string) the specifies the field 104 | to retrieve the value for. If no path is supplied, return the entire cached data structure for this record. 105 | If `force_refresh` is set to True, we force_refresh the data cache from the server before reading the values. 106 | """ 107 | return get_by_path( 108 | path, self._get_record_data(force_refresh=force_refresh), default=default 109 | ) 110 | 111 | def set(self, path, value): 112 | """ 113 | Set a specific `value` (under the specific `path`) on the record's data structure on the server. 114 | """ 115 | self._client.submit_transaction( 116 | build_operation(id=self.id, path=path, args=value, table=self._table) 117 | ) 118 | 119 | def __eq__(self, other): 120 | return self.id == other.id 121 | 122 | def __ne__(self, other): 123 | return self.id != other.id 124 | 125 | def __hash__(self): 126 | return hash(self.id) 127 | -------------------------------------------------------------------------------- /notion/settings.py: -------------------------------------------------------------------------------- 1 | import os 2 | from pathlib import Path 3 | 4 | BASE_URL = "https://www.notion.so/" 5 | API_BASE_URL = BASE_URL + "api/v3/" 6 | SIGNED_URL_PREFIX = "https://www.notion.so/signed/" 7 | S3_URL_PREFIX = "https://s3-us-west-2.amazonaws.com/secure.notion-static.com/" 8 | S3_URL_PREFIX_ENCODED = "https://s3.us-west-2.amazonaws.com/secure.notion-static.com/" 9 | DATA_DIR = os.environ.get( 10 | "NOTION_DATA_DIR", str(Path(os.path.expanduser("~")).joinpath(".notion-py")) 11 | ) 12 | CACHE_DIR = str(Path(DATA_DIR).joinpath("cache")) 13 | LOG_FILE = str(Path(DATA_DIR).joinpath("notion.log")) 14 | 15 | try: 16 | os.makedirs(DATA_DIR) 17 | except FileExistsError: 18 | pass 19 | 20 | try: 21 | os.makedirs(CACHE_DIR) 22 | except FileExistsError: 23 | pass 24 | -------------------------------------------------------------------------------- /notion/smoke_test.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | 3 | from .client import * 4 | from .block import * 5 | from .collection import NotionDate 6 | 7 | 8 | def run_live_smoke_test(token_v2, parent_page_url_or_id): 9 | 10 | client = NotionClient(token_v2=token_v2) 11 | 12 | parent_page = client.get_block(parent_page_url_or_id) 13 | 14 | page = parent_page.children.add_new( 15 | PageBlock, 16 | title="Smoke test at 
{}".format(datetime.now().strftime("%Y-%m-%d %H:%M:%S")), 17 | ) 18 | 19 | print("Created base smoke test page at:", page.get_browseable_url()) 20 | 21 | col_list = page.children.add_new(ColumnListBlock) 22 | col1 = col_list.children.add_new(ColumnBlock) 23 | col2 = col_list.children.add_new(ColumnBlock) 24 | col1kid = col1.children.add_new( 25 | TextBlock, title="Some formatting: *italic*, **bold**, ***both***!" 26 | ) 27 | assert ( 28 | col1kid.title.replace("_", "*") 29 | == "Some formatting: *italic*, **bold**, ***both***!" 30 | ) 31 | assert col1kid.title_plaintext == "Some formatting: italic, bold, both!" 32 | col2.children.add_new(TodoBlock, title="I should be unchecked") 33 | col2.children.add_new(TodoBlock, title="I should be checked", checked=True) 34 | 35 | page.children.add_new(HeaderBlock, title="The finest music:") 36 | video = page.children.add_new(VideoBlock, width=100) 37 | video.set_source_url("https://www.youtube.com/watch?v=oHg5SJYRHA0") 38 | 39 | assert video in page.children 40 | assert col_list in page.children 41 | assert video in page.children.filter(VideoBlock) 42 | assert col_list not in page.children.filter(VideoBlock) 43 | 44 | # check that the parent does not yet consider this page to be backlinking 45 | assert page not in parent_page.get_backlinks() 46 | 47 | page.children.add_new(SubheaderBlock, title="A link back to where I came from:") 48 | alias = page.children.add_alias(parent_page) 49 | assert alias.is_alias 50 | assert not page.is_alias 51 | page.children.add_new( 52 | QuoteBlock, 53 | title="Clicking [here]({}) should take you to the same place...".format( 54 | page.parent.get_browseable_url() 55 | ), 56 | ) 57 | 58 | # check that the parent now knows about the backlink 59 | assert page in parent_page.get_backlinks() 60 | 61 | # ensure __repr__ methods are not breaking 62 | repr(page) 63 | repr(page.children) 64 | for child in page.children: 65 | repr(child) 66 | 67 | page.children.add_new( 68 | SubheaderBlock, title="The order of the following should be alphabetical:" 69 | ) 70 | 71 | B = page.children.add_new(BulletedListBlock, title="B") 72 | D = page.children.add_new(BulletedListBlock, title="D") 73 | C2 = page.children.add_new(BulletedListBlock, title="C2") 74 | C1 = page.children.add_new(BulletedListBlock, title="C1") 75 | C = page.children.add_new(BulletedListBlock, title="C") 76 | A = page.children.add_new(BulletedListBlock, title="A") 77 | 78 | D.move_to(C, "after") 79 | A.move_to(B, "before") 80 | C2.move_to(C) 81 | C1.move_to(C, "first-child") 82 | 83 | page.children.add_new(CalloutBlock, title="I am a callout", icon="🤞") 84 | 85 | cvb = page.children.add_new(CollectionViewBlock) 86 | cvb.collection = client.get_collection( 87 | client.create_record("collection", parent=cvb, schema=get_collection_schema()) 88 | ) 89 | cvb.title = "My data!" 90 | view = cvb.views.add_new(view_type="table") 91 | 92 | special_code = uuid.uuid4().hex[:8] 93 | 94 | # add a row 95 | row1 = cvb.collection.add_row() 96 | assert row1.person == [] 97 | row1.name = "Just some data" 98 | row1.title = "Can reference 'title' field too! 
" + special_code 99 | assert row1.name == row1.title 100 | row1.check_yo_self = True 101 | row1.estimated_value = None 102 | row1.estimated_value = 42 103 | row1.files = [ 104 | "https://www.birdlife.org/sites/default/files/styles/1600/public/slide.jpg" 105 | ] 106 | row1.tags = None 107 | row1.tags = [] 108 | row1.tags = ["A", "C"] 109 | row1.where_to = "https://learningequality.org" 110 | row1.category = "A" 111 | row1.category = "" 112 | row1.category = None 113 | row1.category = "B" 114 | 115 | start = datetime.strptime("2020-01-01 09:30", "%Y-%m-%d %H:%M") 116 | end = datetime.strptime("2020-01-05 20:45", "%Y-%m-%d %H:%M") 117 | timezone = "America/Los_Angeles" 118 | reminder = {"unit": "minute", "value": 30} 119 | row1.some_date = NotionDate(start, end=end, timezone=timezone, reminder=reminder) 120 | 121 | # add another row 122 | row2 = cvb.collection.add_row(person=client.current_user, title="Metallic penguins") 123 | assert row2.person == [client.current_user] 124 | assert row2.name == "Metallic penguins" 125 | row2.check_yo_self = False 126 | row2.estimated_value = 22 127 | row2.files = [ 128 | "https://www.picclickimg.com/d/l400/pict/223603662103_/Vintage-Small-Monet-and-Jones-JNY-Enamel-Metallic.jpg" 129 | ] 130 | row2.tags = ["A", "B"] 131 | row2.where_to = "https://learningequality.org" 132 | row2.category = "C" 133 | 134 | # check that options "C" have been added to the schema 135 | for prop in ["=d{|", "=d{q"]: 136 | assert cvb.collection.get("schema.{}.options.2.value".format(prop)) == "C" 137 | 138 | # check that existing options "A" haven't been affected 139 | for prop in ["=d{|", "=d{q"]: 140 | assert ( 141 | cvb.collection.get("schema.{}.options.0.id".format(prop)) 142 | == get_collection_schema()[prop]["options"][0]["id"] 143 | ) 144 | 145 | # Run a filtered/sorted query using the view's default parameters 146 | result = view.default_query().execute() 147 | assert row1 == result[0] 148 | assert row2 == result[1] 149 | assert len(result) == 2 150 | 151 | # query the collection directly 152 | assert row1 in cvb.collection.get_rows(search=special_code) 153 | assert row2 not in cvb.collection.get_rows(search=special_code) 154 | assert row1 not in cvb.collection.get_rows(search="penguins") 155 | assert row2 in cvb.collection.get_rows(search="penguins") 156 | 157 | # search the entire space 158 | assert row1 in client.search_blocks(search=special_code) 159 | assert row1 not in client.search_blocks(search="penguins") 160 | assert row2 not in client.search_blocks(search=special_code) 161 | assert row2 in client.search_blocks(search="penguins") 162 | 163 | # Run an "aggregation" query 164 | aggregations = [ 165 | {"property": "estimated_value", "aggregator": "sum", "id": "total_value"} 166 | ] 167 | result = view.build_query(aggregations=aggregations).execute() 168 | assert result.get_aggregate("total_value") == 64 169 | 170 | # Run a "filtered" query 171 | filter_params = { 172 | "filters": [ 173 | { 174 | "filter": { 175 | "value": { 176 | "type": "exact", 177 | "value": {"table": "notion_user", "id": client.current_user.id}, 178 | }, 179 | "operator": "person_does_not_contain", 180 | }, 181 | "property": "person", 182 | } 183 | ], 184 | "operator": "and", 185 | } 186 | result = view.build_query(filter=filter_params).execute() 187 | assert row1 in result 188 | assert row2 not in result 189 | 190 | # Run a "sorted" query 191 | sort_params = [{"direction": "ascending", "property": "estimated_value"}] 192 | result = view.build_query(sort=sort_params).execute() 193 | assert row1 
== result[1] 194 | assert row2 == result[0] 195 | 196 | # Test that reminders and time zone's work properly 197 | row1.refresh() 198 | assert row1.some_date.start == start 199 | assert row1.some_date.end == end 200 | assert row1.some_date.timezone == timezone 201 | assert row1.some_date.reminder == reminder 202 | 203 | print( 204 | "Check it out and make sure it looks good, then press any key here to delete it..." 205 | ) 206 | input() 207 | 208 | _delete_page_fully(page) 209 | 210 | 211 | def _delete_page_fully(page): 212 | 213 | id = page.id 214 | 215 | parent_page = page.parent 216 | 217 | assert page.get("alive") == True 218 | assert page in parent_page.children 219 | page.remove() 220 | assert page.get("alive") == False 221 | assert page not in parent_page.children 222 | 223 | assert ( 224 | page.space_info 225 | ), "Page {} was fully deleted prematurely, as we can't get space info about it anymore".format( 226 | id 227 | ) 228 | 229 | page.remove(permanently=True) 230 | 231 | time.sleep(1) 232 | 233 | assert ( 234 | not page.space_info 235 | ), "Page {} was not really fully deleted, as we can still get space info about it".format( 236 | id 237 | ) 238 | 239 | 240 | def get_collection_schema(): 241 | return { 242 | "%9:q": {"name": "Check Yo'self", "type": "checkbox"}, 243 | "=d{|": { 244 | "name": "Tags", 245 | "type": "multi_select", 246 | "options": [ 247 | { 248 | "color": "orange", 249 | "id": "79560dab-c776-43d1-9420-27f4011fcaec", 250 | "value": "A", 251 | }, 252 | { 253 | "color": "default", 254 | "id": "002c7016-ac57-413a-90a6-64afadfb0c44", 255 | "value": "B", 256 | }, 257 | ], 258 | }, 259 | "=d{q": { 260 | "name": "Category", 261 | "type": "select", 262 | "options": [ 263 | { 264 | "color": "orange", 265 | "id": "59560dab-c776-43d1-9420-27f4011fcaec", 266 | "value": "A", 267 | }, 268 | { 269 | "color": "default", 270 | "id": "502c7016-ac57-413a-90a6-64afadfb0c44", 271 | "value": "B", 272 | }, 273 | ], 274 | }, 275 | "LL[(": {"name": "Person", "type": "person"}, 276 | "4Jv$": {"name": "Estimated value", "type": "number"}, 277 | "OBcJ": {"name": "Where to?", "type": "url"}, 278 | "TwR:": {"name": "Some Date", "type": "date"}, 279 | "dV$q": {"name": "Files", "type": "file"}, 280 | "title": {"name": "Name", "type": "title"}, 281 | } 282 | -------------------------------------------------------------------------------- /notion/space.py: -------------------------------------------------------------------------------- 1 | from .logger import logger 2 | from .maps import property_map, field_map 3 | from .records import Record 4 | 5 | 6 | class Space(Record): 7 | 8 | _table = "space" 9 | 10 | child_list_key = "pages" 11 | 12 | name = field_map("name") 13 | domain = field_map("domain") 14 | icon = field_map("icon") 15 | 16 | @property 17 | def pages(self): 18 | # The page list includes pages the current user might not have permissions on, so it's slow to query. 19 | # Instead, we just filter for pages with the space as the parent. 
20 | return self._client.search_pages_with_parent(self.id) 21 | 22 | @property 23 | def users(self): 24 | user_ids = [permission["user_id"] for permission in self.get("permissions")] 25 | self._client.refresh_records(notion_user=user_ids) 26 | return [self._client.get_user(user_id) for user_id in user_ids] 27 | 28 | def _str_fields(self): 29 | return super()._str_fields() + ["name", "domain"] 30 | 31 | def add_page(self, title, type="page", shared=False): 32 | assert type in [ 33 | "page", 34 | "collection_view_page", 35 | ], "'type' must be one of 'page' or 'collection_view_page'" 36 | if shared: 37 | permissions = [{"role": "editor", "type": "space_permission"}] 38 | else: 39 | permissions = [ 40 | { 41 | "role": "editor", 42 | "type": "user_permission", 43 | "user_id": self._client.current_user.id, 44 | } 45 | ] 46 | page_id = self._client.create_record( 47 | "block", self, type=type, permissions=permissions 48 | ) 49 | page = self._client.get_block(page_id) 50 | page.title = title 51 | return page 52 | -------------------------------------------------------------------------------- /notion/store.py: -------------------------------------------------------------------------------- 1 | import datetime 2 | import json 3 | import threading 4 | import uuid 5 | 6 | from collections import defaultdict 7 | from copy import deepcopy 8 | from dictdiffer import diff 9 | from inspect import signature 10 | from threading import Lock 11 | from pathlib import Path 12 | from tzlocal import get_localzone 13 | 14 | from .logger import logger 15 | from .settings import CACHE_DIR 16 | from .utils import extract_id 17 | 18 | 19 | class MissingClass(object): 20 | def __bool__(self): 21 | return False 22 | 23 | 24 | Missing = MissingClass() 25 | 26 | 27 | class Callback(object): 28 | def __init__( 29 | self, callback, record, callback_id=None, extra_kwargs={}, watch_children=True 30 | ): 31 | self.callback = callback 32 | self.record = record 33 | self.callback_id = callback_id or str(uuid.uuid4()) 34 | self.extra_kwargs = extra_kwargs 35 | 36 | def __call__(self, difference, old_val, new_val): 37 | kwargs = {} 38 | kwargs.update(self.extra_kwargs) 39 | kwargs["record"] = self.record 40 | kwargs["callback_id"] = self.callback_id 41 | kwargs["difference"] = difference 42 | kwargs["changes"] = self.record._convert_diff_to_changelist( 43 | difference, old_val, new_val 44 | ) 45 | 46 | logger.debug("Firing callback {} with kwargs: {}".format(self.callback, kwargs)) 47 | 48 | # trim down the parameters we'll be passing, to include only those the callback will accept 49 | params = signature(self.callback).parameters 50 | if not any(["**" in str(param) for param in params.values()]): 51 | # there's no "**kwargs" in the callback signature, so remove any unaccepted params 52 | for arg in list(kwargs.keys()): 53 | if arg not in params: 54 | del kwargs[arg] 55 | 56 | # perform the callback, gracefully handling any exceptions 57 | try: 58 | # trigger the callback within its own thread, so it won't block others if it's long-running 59 | threading.Thread(target=self.callback, kwargs=kwargs, daemon=True).start() 60 | except Exception as e: 61 | logger.error( 62 | "Error while processing callback for {}: {}".format( 63 | repr(self.record), repr(e) 64 | ) 65 | ) 66 | 67 | def __eq__(self, val): 68 | if isinstance(val, str): 69 | return self.callback_id.startswith(val) 70 | elif isinstance(val, Callback): 71 | return self.callback_id == val.callback_id 72 | else: 73 | return False 74 | 75 | 76 | class RecordStore(object): 77 | 
def __init__(self, client, cache_key=None): 78 | self._mutex = Lock() 79 | self._client = client 80 | self._cache_key = cache_key 81 | self._values = defaultdict(lambda: defaultdict(dict)) 82 | self._role = defaultdict(lambda: defaultdict(str)) 83 | self._collection_row_ids = {} 84 | self._callbacks = defaultdict(lambda: defaultdict(list)) 85 | self._records_to_refresh = {} 86 | self._pages_to_refresh = [] 87 | with self._mutex: 88 | self._load_cache() 89 | 90 | def _get(self, table, id): 91 | return self._values[table].get(id, Missing) 92 | 93 | def add_callback(self, record, callback, callback_id=None, extra_kwargs={}): 94 | assert callable( 95 | callback 96 | ), "The callback must be a 'callable' object, such as a function." 97 | self.remove_callbacks(record._table, record.id, callback_id) 98 | callback_obj = Callback( 99 | callback, record, callback_id=callback_id, extra_kwargs=extra_kwargs 100 | ) 101 | self._callbacks[record._table][record.id].append(callback_obj) 102 | return callback_obj 103 | 104 | def remove_callbacks(self, table, id, callback_or_callback_id_prefix=""): 105 | """ 106 | Remove all callbacks for the record specified by `table` and `id` that have a callback_id 107 | starting with the string `callback_or_callback_id_prefix`, or are equal to the provided callback. 108 | """ 109 | if callback_or_callback_id_prefix is None: 110 | return 111 | callbacks = self._callbacks[table][id] 112 | while callback_or_callback_id_prefix in callbacks: 113 | callbacks.remove(callback_or_callback_id_prefix) 114 | 115 | def _get_cache_path(self, attribute): 116 | return str( 117 | Path(CACHE_DIR).joinpath("{}{}.json".format(self._cache_key, attribute)) 118 | ) 119 | 120 | def _load_cache(self, attributes=("_values", "_role", "_collection_row_ids")): 121 | if not self._cache_key: 122 | return 123 | for attr in attributes: 124 | try: 125 | with open(self._get_cache_path(attr)) as f: 126 | if attr == "_collection_row_ids": 127 | self._collection_row_ids.update(json.load(f)) 128 | else: 129 | for k, v in json.load(f).items(): 130 | getattr(self, attr)[k].update(v) 131 | except (FileNotFoundError, ValueError): 132 | pass 133 | 134 | def set_collection_rows(self, collection_id, row_ids): 135 | 136 | if collection_id in self._collection_row_ids: 137 | old_ids = set(self._collection_row_ids[collection_id]) 138 | new_ids = set(row_ids) 139 | added = new_ids - old_ids 140 | removed = old_ids - new_ids 141 | for id in added: 142 | self._trigger_callbacks( 143 | "collection", 144 | collection_id, 145 | [("row_added", "rows", id)], 146 | old_ids, 147 | new_ids, 148 | ) 149 | for id in removed: 150 | self._trigger_callbacks( 151 | "collection", 152 | collection_id, 153 | [("row_removed", "rows", id)], 154 | old_ids, 155 | new_ids, 156 | ) 157 | self._collection_row_ids[collection_id] = row_ids 158 | self._save_cache("_collection_row_ids") 159 | 160 | def get_collection_rows(self, collection_id): 161 | return self._collection_row_ids.get(collection_id, []) 162 | 163 | def _save_cache(self, attribute): 164 | if not self._cache_key: 165 | return 166 | with open(self._get_cache_path(attribute), "w") as f: 167 | json.dump(getattr(self, attribute), f) 168 | 169 | def _trigger_callbacks(self, table, id, difference, old_val, new_val): 170 | for callback_obj in self._callbacks[table][id]: 171 | callback_obj(difference, old_val, new_val) 172 | 173 | def get_role(self, table, id, force_refresh=False): 174 | self.get(table, id, force_refresh=force_refresh) 175 | return self._role[table].get(id, None) 176 | 177 | 
def get(self, table, id, force_refresh=False, limit=100):
178 |         id = extract_id(id)
179 |         # look up the record in the current local dataset
180 |         result = self._get(table, id)
181 |         # if it's not found, try refreshing the record from the server
182 |         if result is Missing or force_refresh:
183 |             if table == "block":
184 |                 self.call_load_page_chunk(id, limit=limit)
185 |             else:
186 |                 self.call_get_record_values(**{table: id})
187 |             result = self._get(table, id)
188 |         return result if result is not Missing else None
189 | 
190 |     def _update_record(self, table, id, value=None, role=None):
191 | 
192 |         callback_queue = []
193 | 
194 |         with self._mutex:
195 |             if role:
196 |                 logger.debug("Updating 'role' for {}/{} to {}".format(table, id, role))
197 |                 self._role[table][id] = role
198 |                 self._save_cache("_role")
199 |             if value:
200 |                 logger.debug(
201 |                     "Updating 'value' for {}/{} to {}".format(table, id, value)
202 |                 )
203 |                 old_val = self._values[table][id]
204 |                 difference = list(
205 |                     diff(
206 |                         old_val,
207 |                         value,
208 |                         ignore=["version", "last_edited_time", "last_edited_by"],
209 |                         expand=True,
210 |                     )
211 |                 )
212 |                 self._values[table][id] = value
213 |                 self._save_cache("_values")
214 |                 if old_val and difference:
215 |                     logger.debug("Value changed! Difference: {}".format(difference))
216 |                     callback_queue.append((table, id, difference, old_val, value))
217 | 
218 |         # run callbacks outside the mutex to avoid lockups
219 |         for cb in callback_queue:
220 |             self._trigger_callbacks(*cb)
221 | 
222 |     def call_get_record_values(self, **kwargs):
223 |         """
224 |         Call the server's getRecordValues endpoint to update the local record store. The keyword arguments map
225 |         table names into lists of (or singular) record IDs to load for that table. Use True to refresh all known
226 |         records for that table.
227 | """ 228 | 229 | requestlist = [] 230 | 231 | for table, ids in kwargs.items(): 232 | 233 | # ensure "ids" is a proper list 234 | if ids is True: 235 | ids = list(self._values.get(table, {}).keys()) 236 | if isinstance(ids, str): 237 | ids = [ids] 238 | 239 | # if we're in a transaction, add the requested IDs to a queue to refresh when the transaction completes 240 | if self._client.in_transaction(): 241 | self._records_to_refresh[table] = list( 242 | set(self._records_to_refresh.get(table, []) + ids) 243 | ) 244 | continue 245 | 246 | requestlist += [{"table": table, "id": extract_id(id)} for id in ids] 247 | 248 | if requestlist: 249 | logger.debug( 250 | "Calling 'getRecordValues' endpoint for requests: {}".format( 251 | requestlist 252 | ) 253 | ) 254 | results = self._client.post( 255 | "getRecordValues", {"requests": requestlist} 256 | ).json()["results"] 257 | for request, result in zip(requestlist, results): 258 | self._update_record( 259 | request["table"], 260 | request["id"], 261 | value=result.get("value"), 262 | role=result.get("role"), 263 | ) 264 | 265 | def get_current_version(self, table, id): 266 | values = self._get(table, id) 267 | if values and "version" in values: 268 | return values["version"] 269 | else: 270 | return -1 271 | 272 | def call_load_page_chunk(self, page_id, limit=100): 273 | 274 | if self._client.in_transaction(): 275 | self._pages_to_refresh.append(page_id) 276 | return 277 | 278 | data = { 279 | "pageId": page_id, 280 | "limit": limit, 281 | "cursor": {"stack": []}, 282 | "chunkNumber": 0, 283 | "verticalColumns": False, 284 | } 285 | 286 | recordmap = self._client.post("loadPageChunk", data).json()["recordMap"] 287 | 288 | self.store_recordmap(recordmap) 289 | 290 | def store_recordmap(self, recordmap): 291 | for table, records in recordmap.items(): 292 | if not isinstance(records, dict): 293 | continue 294 | for id, record in records.items(): 295 | if not isinstance(record, dict): 296 | continue 297 | self._update_record( 298 | table, id, value=record.get("value"), role=record.get("role") 299 | ) 300 | 301 | def call_query_collection( 302 | self, 303 | collection_id, 304 | collection_view_id, 305 | space_id, 306 | search="", 307 | type="table", 308 | aggregate=[], 309 | aggregations=[], 310 | filter={}, 311 | sort=[], 312 | calendar_by="", 313 | group_by="", 314 | limit=50 315 | ): 316 | 317 | assert not ( 318 | aggregate and aggregations 319 | ), "Use only one of `aggregate` or `aggregations` (old vs new format)" 320 | 321 | # convert singletons into lists if needed 322 | if isinstance(aggregate, dict): 323 | aggregate = [aggregate] 324 | if isinstance(sort, dict): 325 | sort = [sort] 326 | 327 | data = { 328 | "collectionView": { 329 | "id": collection_view_id, 330 | "spaceId": space_id 331 | }, 332 | "loader": { 333 | "reducers": { 334 | "collection_group_results": { 335 | "type": "results", 336 | "limit": limit, 337 | }, 338 | }, 339 | "sort": sort, 340 | "searchQuery": search, 341 | "userId": self._client.current_user.id, 342 | "userTimeZone": str(get_localzone()), 343 | }, 344 | "source": { 345 | "id": collection_id, 346 | "spaceId": space_id, 347 | "type": "collection" 348 | } 349 | } 350 | 351 | if filter: 352 | data["loader"]["filter"] = filter 353 | 354 | if aggregate: 355 | data["loader"]["aggregate"] = aggregate 356 | 357 | if aggregations: 358 | data["loader"]["aggregations"] = aggregations 359 | 360 | response = self._client.post("queryCollection", data).json() 361 | 362 | self.store_recordmap(response["recordMap"]) 363 | 364 | 
return response["result"] 365 | 366 | def handle_post_transaction_refreshing(self): 367 | 368 | for block_id in self._pages_to_refresh: 369 | self.call_load_page_chunk(block_id) 370 | self._pages_to_refresh = [] 371 | 372 | self.call_get_record_values(**self._records_to_refresh) 373 | self._records_to_refresh = {} 374 | 375 | def run_local_operations(self, operations): 376 | """ 377 | Called to simulate the results of running the operations on the server, to keep the record store in sync 378 | even when we haven't completed a refresh (or we did a refresh but the database hadn't actually updated yet...) 379 | """ 380 | for operation in operations: 381 | self.run_local_operation(**operation) 382 | 383 | def run_local_operation(self, table, id, path, command, args): 384 | 385 | with self._mutex: 386 | path = deepcopy(path) 387 | new_val = deepcopy(self._values[table][id]) 388 | 389 | ref = new_val 390 | 391 | # loop and descend down the path until it's consumed, or if we're doing a "set", there's one key left 392 | while (len(path) > 1) or (path and command != "set"): 393 | comp = path.pop(0) 394 | if comp not in ref: 395 | ref[comp] = [] if "list" in command else {} 396 | ref = ref[comp] 397 | 398 | if command == "update": 399 | assert isinstance(ref, dict) 400 | ref.update(args) 401 | elif command == "set": 402 | assert isinstance(ref, dict) 403 | if path: 404 | ref[path[0]] = args 405 | else: 406 | # this is the case of "setting the top level" (i.e. creating a record) 407 | ref.clear() 408 | ref.update(args) 409 | elif command == "listAfter": 410 | assert isinstance(ref, list) 411 | if "after" in args: 412 | ref.insert(ref.index(args["after"]) + 1, args["id"]) 413 | else: 414 | ref.append(args["id"]) 415 | elif command == "listBefore": 416 | assert isinstance(ref, list) 417 | if "before" in args: 418 | ref.insert(ref.index(args["before"]), args["id"]) 419 | else: 420 | ref.insert(0, args["id"]) 421 | elif command == "listRemove": 422 | try: 423 | ref.remove(args["id"]) 424 | except ValueError: 425 | pass 426 | 427 | self._update_record(table, id, value=new_val) 428 | -------------------------------------------------------------------------------- /notion/user.py: -------------------------------------------------------------------------------- 1 | from .logger import logger 2 | from .maps import property_map, field_map 3 | from .records import Record 4 | 5 | 6 | class User(Record): 7 | 8 | _table = "notion_user" 9 | 10 | given_name = field_map("given_name") 11 | family_name = field_map("family_name") 12 | email = field_map("email") 13 | locale = field_map("locale") 14 | time_zone = field_map("time_zone") 15 | 16 | @property 17 | def full_name(self): 18 | return " ".join([self.given_name or "", self.family_name or ""]).strip() 19 | 20 | def _str_fields(self): 21 | return super()._str_fields() + ["email", "full_name"] 22 | -------------------------------------------------------------------------------- /notion/utils.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import uuid 3 | 4 | from bs4 import BeautifulSoup 5 | from urllib.parse import urlparse, parse_qs, quote_plus, unquote_plus 6 | from datetime import datetime 7 | from slugify import slugify as _dash_slugify 8 | 9 | from .settings import BASE_URL, SIGNED_URL_PREFIX, S3_URL_PREFIX, S3_URL_PREFIX_ENCODED 10 | 11 | 12 | class InvalidNotionIdentifier(Exception): 13 | pass 14 | 15 | 16 | def now(): 17 | return int(datetime.now().timestamp() * 1000) 18 | 19 | 20 | def 
extract_id(url_or_id): 21 | """ 22 | Extract the block/page ID from a Notion.so URL -- if it's a bare page URL, it will be the 23 | ID of the page. If there's a hash with a block ID in it (from clicking "Copy Link") on a 24 | block in a page), it will instead be the ID of that block. If it's already in ID format, 25 | it will be passed right through. 26 | """ 27 | input_value = url_or_id 28 | if url_or_id.startswith(BASE_URL): 29 | url_or_id = ( 30 | url_or_id.split("#")[-1] 31 | .split("/")[-1] 32 | .split("&p=")[-1] 33 | .split("?")[0] 34 | .split("-")[-1] 35 | ) 36 | try: 37 | return str(uuid.UUID(url_or_id)) 38 | except ValueError: 39 | raise InvalidNotionIdentifier(input_value) 40 | 41 | 42 | def get_embed_data(source_url): 43 | 44 | return requests.get( 45 | "https://api.embed.ly/1/oembed?key=421626497c5d4fc2ae6b075189d602a2&url={}".format( 46 | source_url 47 | ) 48 | ).json() 49 | 50 | 51 | def get_embed_link(source_url): 52 | 53 | data = get_embed_data(source_url) 54 | 55 | if "html" not in data: 56 | return source_url 57 | 58 | url = list(BeautifulSoup(data["html"], "html.parser").children)[0]["src"] 59 | 60 | return parse_qs(urlparse(url).query)["src"][0] 61 | 62 | 63 | def add_signed_prefix_as_needed(url, client=None, id=""): 64 | 65 | if url is None: 66 | return 67 | 68 | if url.startswith(S3_URL_PREFIX): 69 | url = SIGNED_URL_PREFIX + quote_plus(url) + "?table=block&id=" + id 70 | if client: 71 | url = client.session.head(url).headers.get("Location") 72 | 73 | return url 74 | 75 | 76 | def remove_signed_prefix_as_needed(url): 77 | if url is None: 78 | return 79 | if url.startswith(SIGNED_URL_PREFIX): 80 | return unquote_plus(url[len(S3_URL_PREFIX) :]) 81 | elif url.startswith(S3_URL_PREFIX_ENCODED): 82 | parsed = urlparse(url.replace(S3_URL_PREFIX_ENCODED, S3_URL_PREFIX)) 83 | return "{}://{}{}".format(parsed.scheme, parsed.netloc, parsed.path) 84 | else: 85 | return url 86 | 87 | 88 | def slugify(original): 89 | return _dash_slugify(original).replace("-", "_") 90 | 91 | 92 | def get_by_path(path, obj, default=None): 93 | 94 | if isinstance(path, str): 95 | path = path.split(".") 96 | 97 | value = obj 98 | 99 | # try to traverse down the sequence of keys defined in the path, to get the target value if it exists 100 | try: 101 | for key in path: 102 | if isinstance(value, list): 103 | key = int(key) 104 | value = value[key] 105 | except (KeyError, TypeError, IndexError): 106 | value = default 107 | 108 | return value 109 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | requests 2 | commonmark 3 | bs4 4 | tzlocal 5 | python-slugify 6 | dictdiffer 7 | cached-property -------------------------------------------------------------------------------- /run_smoke_test.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import sys 4 | 5 | from notion.smoke_test import run_live_smoke_test 6 | 7 | # Following code is a sample. 
Run it from the terminal with your own Notion page URL and token_v2, for example:
8 | # python3 run_smoke_test.py --page https://www.notion.so/fitcuration/Myam-Myam-Love-a0d22196f58f4efb8a38bcf9b3e06459 --token e26b797ce5beaf4170f2699fdab0b6be375175fa6ca66c9d1a06ca08bc70d578ae2203f408bbbc38554c20357876387a9942152d868ac7c98240be964fd88496257bf0fbe8372de88db5a41c106a
9 | 
10 | if __name__ == "__main__":
11 |     description = "Run notion-py client smoke tests"
12 |     parser = argparse.ArgumentParser(description=description)
13 |     parser.add_argument(
14 |         "--page", dest="page", help="page URL or ID", required=True, type=str
15 |     )
16 |     parser.add_argument("--token", dest="token", help="token_v2", type=str)
17 |     args = parser.parse_args()
18 | 
19 |     token = args.token
20 |     if not token:
21 |         # to avoid putting the token on the command line, set it in the NOTION_TOKEN environment variable instead
22 |         token = os.environ.get("NOTION_TOKEN")
23 |     if not token:
24 |         print(
25 |             "Must either pass --token option or set NOTION_TOKEN environment variable"
26 |         )
27 |         sys.exit(1)
28 | 
29 |     run_live_smoke_test(token, args.page)
30 | 
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import setuptools
3 | 
4 | with open("README.md", "r", encoding='utf-8') as fh:
5 |     long_description = fh.read()
6 | 
7 | 
8 | def get_requirements(fname):
9 |     "Takes requirements from requirements.txt and returns a list."
10 |     with open(fname) as fp:
11 |         reqs = list()
12 |         for lib in fp.read().split("\n"):
13 |             # Ignore blank lines, pypi flags, and comments
14 |             if lib.strip() and not lib.startswith(("-", "#")):
15 |                 reqs.append(lib.strip())
16 |     return reqs
17 | 
18 | 
19 | install_requires = get_requirements("requirements.txt")
20 | 
21 | setuptools.setup(
22 |     name="notion",
23 |     version="0.0.28",
24 |     author="Jamie Alexandre",
25 |     author_email="jamalex+python@gmail.com",
26 |     description="Unofficial Python API client for Notion.so",
27 |     long_description=long_description,
28 |     long_description_content_type="text/markdown",
29 |     url="https://github.com/jamalex/notion-py",
30 |     install_requires=install_requires,
31 |     include_package_data=True,
32 |     packages=setuptools.find_packages(),
33 |     python_requires=">=3.5",
34 |     classifiers=[
35 |         "Programming Language :: Python :: 3",
36 |         "License :: OSI Approved :: MIT License",
37 |         "Operating System :: OS Independent",
38 |     ],
39 | )
40 | 
--------------------------------------------------------------------------------
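As a quick illustration of the helpers defined in `notion/utils.py` above, here is a minimal sketch (not part of the repository) exercising `extract_id`, `slugify`, and `get_by_path`. The URL, IDs, and nested dict are made-up examples, and the URL case assumes `BASE_URL` in `notion/settings.py` points at `https://www.notion.so/`.

```Python
from notion.utils import extract_id, get_by_path, slugify

# extract_id() accepts either a bare 32-character ID or a full page URL
# (assuming BASE_URL is "https://www.notion.so/") and returns a canonical UUID string
assert extract_id("c0d20a71c0944985ae96e661ccc99821") == "c0d20a71-c094-4985-ae96-e661ccc99821"
assert (
    extract_id("https://www.notion.so/myorg/Test-c0d20a71c0944985ae96e661ccc99821")
    == "c0d20a71-c094-4985-ae96-e661ccc99821"
)

# slugify() turns labels like the schema names above into snake_case, consistent with the
# attribute names used in smoke_test.py (e.g. "Estimated value" -> row.estimated_value)
assert slugify("Estimated value") == "estimated_value"
assert slugify("Where to?") == "where_to"

# get_by_path() walks a dotted path through nested dicts/lists and returns a default on a miss,
# mirroring calls like cvb.collection.get("schema.=d{q.options.2.value") in the smoke test
record = {"schema": {"=d{q": {"options": [{"value": "A"}, {"value": "B"}]}}}
assert get_by_path("schema.=d{q.options.1.value", record) == "B"
assert get_by_path("schema.=d{q.options.5.value", record, default="missing") == "missing"
```

Note that if `BASE_URL` were set to a different host, the URL form above would not be recognized and `extract_id` would fall through to `uuid.UUID()` and raise `InvalidNotionIdentifier`.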