├── .github
└── ISSUE_TEMPLATE
│ ├── bug_report.md
│ └── feature_request.md
├── .gitignore
├── LICENSE
├── MANIFEST.in
├── README.md
├── k2n.code-workspace
├── kindle2notion
├── __init__.py
├── __main__.py
├── exporting.py
├── parsing.py
└── reading.py
├── requirements-dev.txt
├── requirements.txt
├── setup.cfg
├── setup.py
└── tests
├── test_data
└── Test Clippings.txt
├── test_exporting.py
├── test_parsing.py
└── test_reading.py
/.github/ISSUE_TEMPLATE/bug_report.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Bug report
3 | about: Create a report to help us improve
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 | **Describe the bug**
11 | A clear and concise description of what the bug is.
12 |
13 | **To Reproduce**
14 | Steps to reproduce the behaviour:
15 | 1. Go to '...'
16 | 2. Click on '....'
17 | 3. Scroll down to '....'
18 | 4. See error
19 |
20 | **Expected behaviour**
21 | A clear and concise description of what you expected to happen.
22 |
23 | **Screenshots**
24 | If applicable, add screenshots to help explain your problem.
25 |
26 | **Desktop (please complete the following information):**
27 | - OS: [e.g. iOS]
28 | - Browser [e.g. chrome, safari]
29 | - Version [e.g. 22]
30 |
31 | **Additional context**
32 | Add any other context about the problem here.
33 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/feature_request.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Feature request
3 | about: Suggest an idea for this project
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 | **Is your feature request related to a problem? Please describe.**
11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
12 |
13 | **Describe the solution you'd like**
14 | A clear and concise description of what you want to happen.
15 |
16 | **Describe alternatives you've considered**
17 | A clear and concise description of any alternative solutions or features you've considered.
18 |
19 | **Additional context**
20 | Add any other context or screenshots about the feature request here.
21 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Created by https://www.toptal.com/developers/gitignore/api/python
2 | # Edit at https://www.toptal.com/developers/gitignore?templates=python
3 |
4 | ### Python ###
5 | # Byte-compiled / optimized / DLL files
6 | __pycache__/
7 | *.py[cod]
8 | *$py.class
9 |
10 | # C extensions
11 | *.so
12 |
13 | # Distribution / packaging
14 | .Python
15 | build/
16 | develop-eggs/
17 | dist/
18 | downloads/
19 | eggs/
20 | .eggs/
21 | parts/
22 | sdist/
23 | var/
24 | wheels/
25 | pip-wheel-metadata/
26 | share/python-wheels/
27 | *.egg-info/
28 | .installed.cfg
29 | *.egg
30 | MANIFEST
31 |
32 | # PyInstaller
33 | # Usually these files are written by a python script from a template
34 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
35 | *.manifest
36 | *.spec
37 |
38 | # Installer logs
39 | pip-log.txt
40 | pip-delete-this-directory.txt
41 |
42 | # Unit test / coverage reports
43 | htmlcov/
44 | .tox/
45 | .nox/
46 | .coverage
47 | .coverage.*
48 | .cache
49 | nosetests.xml
50 | coverage.xml
51 | *.cover
52 | *.py,cover
53 | .hypothesis/
54 | .pytest_cache/
55 | pytestdebug.log
56 |
57 | # Translations
58 | *.mo
59 | *.pot
60 |
61 | # Django stuff:
62 | *.log
63 | local_settings.py
64 | db.sqlite3
65 | db.sqlite3-journal
66 |
67 | # Flask stuff:
68 | instance/
69 | .webassets-cache
70 |
71 | # Scrapy stuff:
72 | .scrapy
73 |
74 | # Sphinx documentation
75 | docs/_build/
76 | doc/_build/
77 |
78 | # PyBuilder
79 | target/
80 |
81 | # Jupyter Notebook
82 | .ipynb_checkpoints
83 |
84 | # IPython
85 | profile_default/
86 | ipython_config.py
87 |
88 | # pyenv
89 | .python-version
90 |
91 | # pipenv
92 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
93 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
94 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
95 | # install all needed dependencies.
96 | #Pipfile.lock
97 |
98 | # poetry
99 | #poetry.lock
100 |
101 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
102 | __pypackages__/
103 |
104 | # Celery stuff
105 | celerybeat-schedule
106 | celerybeat.pid
107 |
108 | # SageMath parsed files
109 | *.sage.py
110 |
111 | # Environments
112 | # .env
113 | .env
114 | .venv/
115 | env/
116 | venv/
117 | ENV/
118 | env.bak/
119 | venv.bak/
120 | pythonenv*
121 |
122 | # Spyder project settings
123 | .spyderproject
124 | .spyproject
125 |
126 | # Rope project settings
127 | .ropeproject
128 |
129 | # mkdocs documentation
130 | /site
131 |
132 | # mypy
133 | .mypy_cache/
134 | .dmypy.json
135 | dmypy.json
136 |
137 | # Pyre type checker
138 | .pyre/
139 |
140 | # pytype static type analyzer
141 | .pytype/
142 |
143 | # operating system-related files
144 | # file properties cache/storage on macOS
145 | *.DS_Store
146 | # thumbnail cache on Windows
147 | Thumbs.db
148 |
149 | # profiling data
150 | .prof
151 |
152 |
153 | # End of https://www.toptal.com/developers/gitignore/api/python
154 |
155 |
156 | # IDE
157 | .vscode/
158 | .idea/
159 | .code-workspace
160 | # custom project files
161 | MyClippings.txt
162 | dist/
163 | images/
164 | my_kindle_clippings.json
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 Jeffrey Jacob
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include requirements.txt
2 | include requirements-dev.txt
3 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 | A program to copy all your Kindle highlights and notes to a page in Notion.
9 |
10 | Explore the docs
11 | ·
12 | File issues and feature requests here
13 |
14 |
15 | If you found this script helpful or appreciate my work, you can support me here:
16 |
17 |
18 |
19 |
20 |
21 | [![Downloads][downloads-shield]][downloads-url]
22 | [![Contributors][contributors-shield]][contributors-url]
23 | [![Forks][forks-shield]][forks-url]
24 | [![Stargazers][stars-shield]][stars-url]
25 | [![Issues][issues-shield]][issues-url]
26 | [![MIT License][license-shield]][license-url]
27 | [![LinkedIn][linkedin-shield]][linkedin-url]
28 |
29 |
30 | ## Table of Contents
31 |
32 | - [Table of Contents](#table-of-contents)
33 | - [About The Project](#about-the-project)
34 | - [Getting Started](#getting-started)
35 | - [Prerequisites](#prerequisites)
36 | - [Installation & Setup](#installation--setup)
37 | - [Usage](#usage)
38 | - [Roadmap](#roadmap)
39 | - [Contributing](#contributing)
40 | - [License](#license)
41 | - [Contact](#contact)
42 |
43 |
44 |
45 |
46 | ## About The Project
47 |
48 | ![Kindle2Notion Demo][product-demo]
49 |
50 | A Python package to export all the clippings from your Kindle device to a page in Notion. Run this script whenever you plug in your Kindle device to your PC.
51 |
52 | A key inspiration behind this project was the notes saving feature on Google Play Books, which automatically syncs all your highlights from a book hosted on the service to a Google Doc in real time. I wanted a similar feature for my Kindle and this project is one step towards a solution for this problem.
53 |
54 | **Intended for**
55 | - Avid readers who would want to browse through their prior reads and highlights anytime anywhere.
56 | - For those who like to take notes alongside their highlights.
57 |
58 |
59 |
60 | ## Getting Started
61 |
62 |
63 | > **NOTE**
64 | > Need a step-by-step guide to setting this package up? Click [here](https://kindle2notion.notion.site/Kindle2Notion-8a9683c9b19546c3b1cf42a68aceebee) for the full guide.
65 |
66 | To get a local copy up and running follow these simple steps:
67 |
68 | ### Prerequisites
69 |
70 | * A Kindle device.
71 | * A Notion account to store your links.
72 | * Python 3 on your system to run the code.
73 |
74 | ### Installation & Setup
75 |
76 | > **NOTE**
77 | > As of 10-07-2022, the latest update to this package relies on the offical Notion API for sending API requests. This requires you to create an integration token from [here](https://www.notion.so/my-integrations). For old users, you'd have to switch to this method as well since `notion-py` isn't being maintained anymore.
78 |
79 | 1. Install the library.
80 | ```sh
81 | pip install kindle2notion
82 | ```
83 | 2. Create an integration on Notion.
84 | 1. Duplicate this [database template](https://kindle2notion.notion.site/6d26062e3bb04dd89b988806978c1fe7?v=0d394a8162cc481280966b35a37465c2) to your the workspace you want to use for storing your Kindle clippings.
85 | 2. Open _Settings & Members_ from the left navigation bar.
86 | 3. Select the _Integrations_ option listed under _Workspaces_ in the settings modal.
87 | 4. Click on _Develop your own integrations_ to redirect to the integrations page.
88 | 5. On the integrations page, select the _New integration_ option and enter the name of the integration and the workspace you want to use it with. Hit submit and your internal integration token will be generated.
89 | 3. Go back to your database page and click on the _Share_ button on the top right corner. Use the selector to find your integration by its name and then click _Invite_. Your integration now has the requested permissions on the new database.
90 |
91 |
92 |
93 | ## Usage
94 |
95 | 1. Plug in your Kindle device to your PC.
96 |
97 | 2. You need the following three arguments in hand before running the code:
98 | 1. Take `your_notion_auth_token` from the secret key bearer token provided.
99 | 2. Find `your_notion_database_id` from the URL of the database you have copied to your workspace. For reference,
100 | ```
101 | https://www.notion.so/myworkspace/a8aec43384f447ed84390e8e42c2e089?v=...
102 | |--------- Database ID --------|
103 | ```
104 | 3. `your_kindle_clippings_file` is the path to your `My Clippings File.txt` on your Kindle.
105 |
106 | 3. Additionally, you may modify some default parameters of the command-line with the following options of the CLI:
107 | - ```--enable_highlight_date``` Set to False if you don't want to see the "Date Added" information in Notion.
108 | - ```--enable_book_cover``` Set to False if you don't want to store the book cover in Notion.
109 |
110 | 4. Export your Kindle highlights and notes to Notion!
111 | - On MacOS and UNIX,
112 | ```sh
113 | kindle2notion 'your_notion_auth_token' 'your_notion_table_id' 'your_kindle_clippings_file'
114 | ```
115 | - On Windows
116 | ```sh
117 | python -m kindle2notion 'your_notion_auth_token' 'your_notion_table_id' 'your_kindle_clippings_file'
118 | ```
119 | You may also avail help with the following command:
120 | ```sh
121 | kindle2notion --help
122 | python -m kindle2notion --help
123 | ```
124 |
125 | > **NOTE**
126 | > This code has been tested on a 4th Gen Kindle Paperwhite on both MacOS and Windows.
127 |
128 |
129 |
130 | ## Roadmap
131 |
132 | See the [open issues](https://github.com/paperboi/Kindle2Notion/issues) for a list of proposed features (and known issues).
133 |
134 |
135 |
136 |
137 | ## Contributing
138 |
139 |
140 | Any contributions you make are **greatly appreciated**.
141 |
142 | 1. Fork the Project
143 | 2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
144 | 3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
145 | 4. Push to the Branch (`git push origin feature/AmazingFeature`)
146 | 5. Open a Pull Request
147 |
148 |
149 |
150 |
151 | ## License
152 |
153 | Distributed under the MIT License. See [LICENSE][license-url] for more information.
154 |
155 |
156 |
157 |
158 | ## Contact
159 |
160 | Jeffrey Jacob ([Twitter](https://twitter.com/jeffreysamjacob) | [Email](mailto:jeffreysamjacob@gmail.com) | [LinkedIn](https://www.linkedin.com/in/jeffreysamjacob/))
161 |
162 |
163 | [downloads-shield]: https://pepy.tech/badge/kindle2notion
164 | [downloads-url]: https://pepy.tech/project/kindle2notion
165 | [contributors-shield]: https://img.shields.io/github/contributors/paperboi/Kindle2Notion.svg?style=flat-square
166 | [contributors-url]: https://github.com/paperboi/Kindle2Notion/graphs/contributors
167 | [forks-shield]: https://img.shields.io/github/forks/paperboi/Kindle2Notion.svg?style=flat-square
168 | [forks-url]: https://github.com/paperboi/Kindle2Notion/network/members
169 | [stars-shield]: https://img.shields.io/github/stars/paperboi/Kindle2Notion.svg?style=flat-square
170 | [stars-url]: https://github.com/paperboi/Kindle2Notion/stargazers
171 | [issues-shield]: https://img.shields.io/github/issues/paperboi/Kindle2Notion.svg?style=flat-square
172 | [issues-url]: https://github.com/paperboi/Kindle2Notion/issues
173 | [license-shield]: https://img.shields.io/github/license/paperboi/Kindle2Notion.svg?style=flat-square
174 | [license-url]: https://github.com/paperboi/kindle2notion/blob/master/LICENSE
175 | [linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=flat-square&logo=linkedin&colorB=555
176 | [linkedin-url]: https://www.linkedin.com/in/jeffreysamjacob/
177 | [product-demo]: https://i.imgur.com/IlDmEOy.gif
178 |
--------------------------------------------------------------------------------
/k2n.code-workspace:
--------------------------------------------------------------------------------
1 | {
2 | "folders": [
3 | {
4 | "path": "."
5 | }
6 | ]
7 | }
--------------------------------------------------------------------------------
/kindle2notion/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/paperboi/kindle2notion/c9ca527d2e98167e327893652e8d153908e52fd4/kindle2notion/__init__.py
--------------------------------------------------------------------------------
/kindle2notion/__main__.py:
--------------------------------------------------------------------------------
1 | import json
2 |
3 | import click
4 | import notional
5 |
6 | from kindle2notion.exporting import export_to_notion
7 | from kindle2notion.parsing import parse_raw_clippings_text
8 | from kindle2notion.reading import read_raw_clippings
9 |
10 |
11 | @click.command()
12 | @click.argument("notion_api_auth_token")
13 | @click.argument("notion_database_id")
14 | @click.argument("clippings_file")
15 | @click.option(
16 | "--enable_location",
17 | default=True,
18 | help='Set to False if you don\'t want to see the "Location" and "Page" information in Notion.'
19 | )
20 | @click.option(
21 | "--enable_highlight_date",
22 | default=True,
23 | help='Set to False if you don\'t want to see the "Date Added" information in Notion.',
24 | )
25 | @click.option(
26 | "--enable_book_cover",
27 | default=True,
28 | help="Set to False if you don't want to store the book cover in Notion.",
29 | )
30 | @click.option(
31 | "--separate_blocks",
32 | default=False,
33 | help='Set to True to separate each clipping into a separate quote block. Enabling this option significantly decreases upload speed.'
34 | )
35 |
36 | def main(
37 | notion_api_auth_token,
38 | notion_database_id,
39 | clippings_file,
40 | enable_location,
41 | enable_highlight_date,
42 | enable_book_cover,
43 | separate_blocks
44 | ):
45 | notion = notional.connect(auth=notion_api_auth_token)
46 | db = notion.databases.retrieve(notion_database_id)
47 |
48 | if db:
49 | print("Notion page is found. Analyzing clippings file...")
50 |
51 | # Open the clippings text file and load it into all_clippings
52 | all_clippings = read_raw_clippings(clippings_file)
53 |
54 | # Parse all_clippings file and format the content to be sent tp the Notion DB into all_books
55 | all_books = parse_raw_clippings_text(all_clippings)
56 |
57 | # Export all the contents in all_books into the Notion DB.
58 | export_to_notion(
59 | all_books,
60 | enable_location,
61 | enable_highlight_date,
62 | enable_book_cover,
63 | separate_blocks,
64 | notion_api_auth_token,
65 | notion_database_id
66 | )
67 |
68 | with open("my_kindle_clippings.json", "w") as out_file:
69 | json.dump(all_books, out_file, indent=4)
70 |
71 | print("Transfer complete... Exiting script...")
72 | else:
73 | print(
74 | "Notion page not found! Please check whether the Notion database ID is assigned properly."
75 | )
76 |
77 |
78 | if __name__ == "__main__":
79 | main()
80 |
--------------------------------------------------------------------------------
/kindle2notion/exporting.py:
--------------------------------------------------------------------------------
1 | from datetime import datetime
2 | from typing import Dict, List, Tuple
3 |
4 | import notional
5 | from notional.blocks import Paragraph, TextObject, Quote
6 | from notional.query import TextCondition
7 | from notional.types import Date, ExternalFile, Number, RichText, Title
8 | from requests import get
9 |
10 | # from notional.text import Annotations
11 |
12 | # from more_itertools import grouper
13 |
14 |
15 | NO_COVER_IMG = "https://via.placeholder.com/150x200?text=No%20Cover"
16 |
17 |
18 | def export_to_notion(
19 | all_books: Dict,
20 | enable_location: bool,
21 | enable_highlight_date: bool,
22 | enable_book_cover: bool,
23 | separate_blocks: bool,
24 | notion_api_auth_token: str,
25 | notion_database_id: str,
26 | ) -> None:
27 | print("Initiating transfer...\n")
28 |
29 | for title in all_books:
30 | each_book = all_books[title]
31 | author = each_book["author"]
32 | clippings = each_book["highlights"]
33 | clippings_count = len(clippings)
34 | (
35 | formatted_clippings,
36 | last_date,
37 | ) = _prepare_aggregated_text_for_one_book(clippings, enable_location, enable_highlight_date)
38 | message = _add_book_to_notion(
39 | title,
40 | author,
41 | clippings_count,
42 | formatted_clippings,
43 | last_date,
44 | notion_api_auth_token,
45 | notion_database_id,
46 | enable_book_cover,
47 | separate_blocks,
48 | )
49 | if message != "None to add":
50 | print("✓", message)
51 |
52 |
53 | def _prepare_aggregated_text_for_one_book(
54 | clippings: List, enable_location: bool, enable_highlight_date: bool
55 | ) -> Tuple[str, str]:
56 | # TODO: Special case for books with len(clippings) >= 100 characters. Character limit in a Paragraph block in Notion is 100
57 | formatted_clippings = []
58 | for each_clipping in clippings:
59 | aggregated_text = ""
60 | text = each_clipping[0]
61 | page = each_clipping[1]
62 | location = each_clipping[2]
63 | date = each_clipping[3]
64 | is_note = each_clipping[4]
65 | if is_note == True:
66 | aggregated_text += "> " + "NOTE: \n"
67 |
68 | aggregated_text += text + "\n"
69 | if enable_location:
70 | if page != "":
71 | aggregated_text += "Page: " + page + ", "
72 | if location != "":
73 | aggregated_text += "Location: " + location
74 | if enable_highlight_date and (date != ""):
75 | aggregated_text += ", Date Added: " + date
76 |
77 | aggregated_text = aggregated_text.strip() + "\n\n"
78 | formatted_clippings.append(aggregated_text)
79 | last_date = date
80 | return formatted_clippings, last_date
81 |
82 |
83 | def _add_book_to_notion(
84 | title: str,
85 | author: str,
86 | clippings_count: int,
87 | formatted_clippings: list,
88 | last_date: str,
89 | notion_api_auth_token: str,
90 | notion_database_id: str,
91 | enable_book_cover: bool,
92 | separate_blocks: bool,
93 | ):
94 | notion = notional.connect(auth=notion_api_auth_token)
95 | last_date = datetime.strptime(last_date, "%A, %d %B %Y %I:%M:%S %p")
96 |
97 | # Condition variables
98 | title_exists = False
99 | current_clippings_count = 0
100 |
101 | query = (
102 | notion.databases.query(notion_database_id)
103 | .filter(property="Title", rich_text=TextCondition(equals=title))
104 | .limit(1)
105 | )
106 | data = query.first()
107 |
108 | if data:
109 | title_exists = True
110 | block_id = data.id
111 | block = notion.pages.retrieve(block_id)
112 | if block["Highlights"] == None:
113 | block["Highlights"] = Number[0]
114 | elif block["Highlights"] == clippings_count: # if no change in clippings
115 | title_and_author = str(block["Title"]) + " (" + str(block["Author"]) + ")"
116 | print(title_and_author)
117 | print("-" * len(title_and_author))
118 | return "None to add.\n"
119 |
120 | title_and_author = title + " (" + str(author) + ")"
121 | print(title_and_author)
122 | print("-" * len(title_and_author))
123 |
124 | # Add a new book to the database
125 | if not title_exists:
126 | new_page = notion.pages.create(
127 | parent=notion.databases.retrieve(notion_database_id),
128 | properties={
129 | "Title": Title[title],
130 | "Author": RichText[author],
131 | "Highlights": Number[clippings_count],
132 | "Last Highlighted": Date[last_date.isoformat()],
133 | "Last Synced": Date[datetime.now().isoformat()],
134 | },
135 | children=[],
136 | )
137 | # page_content = _update_book_with_clippings(formatted_clippings)
138 |
139 |
140 | if separate_blocks:
141 | for formatted_clipping in formatted_clippings:
142 | page_content = Quote[formatted_clipping.strip()]
143 | notion.blocks.children.append(new_page, page_content)
144 | else:
145 | page_content = Paragraph["".join(formatted_clippings)]
146 | notion.blocks.children.append(new_page, page_content)
147 |
148 | block_id = new_page.id
149 | if enable_book_cover:
150 | # Fetch a book cover from Google Books if the cover for the page is not set
151 | if new_page.cover is None:
152 | result = _get_book_cover_uri(title, author)
153 |
154 | if result is None:
155 | # Set the page cover to a placeholder image
156 | cover = ExternalFile[NO_COVER_IMG]
157 | print(
158 | "× Book cover couldn't be found. "
159 | "Please replace the placeholder image with the original book cover manually."
160 | )
161 | else:
162 | # Set the page cover to that of the book
163 | cover = ExternalFile[result]
164 | print("✓ Added book cover.")
165 |
166 | notion.pages.set(new_page, cover=cover)
167 | else:
168 | # update a book that already exists in the database
169 | page = notion.pages.retrieve(block_id)
170 | # page_content = _update_book_with_clippings(formatted_clippings)
171 | page_content = Paragraph["".join(formatted_clippings)]
172 | notion.blocks.children.append(page, page_content)
173 | # TODO: Delete existing page children (or figure out how to find changes to be made by comparing it with local json file.)
174 | current_clippings_count = int(float(str(page["Highlights"])))
175 | page["Highlights"] = Number[clippings_count]
176 | page["Last Highlighted"] = Date[last_date.isoformat()]
177 | page["Last Synced"] = Date[datetime.now().isoformat()]
178 |
179 | # Logging the changes made
180 | diff_count = (
181 | clippings_count - current_clippings_count
182 | if clippings_count > current_clippings_count
183 | else clippings_count
184 | )
185 | message = str(diff_count) + " notes/highlights added successfully.\n"
186 |
187 | return message
188 |
189 |
190 | # def _create_rich_text_object(text):
191 | # if "Note: " in text:
192 | # # Bold text
193 | # nested = TextObject._NestedData(content=text)
194 | # rich = TextObject(text=nested, plain_text=text, annotations=Annotations(bold=True))
195 | # elif any(item in text for item in ["Page: ", "Location: ", "Date Added: "]):
196 | # # Italic text
197 | # nested = TextObject._NestedData(content=text)
198 | # rich = TextObject(text=nested, plain_text=text, annotations=Annotations(italic=True))
199 | # else:
200 | # # Plain text
201 | # nested = TextObject._NestedData(content=text)
202 | # rich = TextObject(text=nested, plain_text=text)
203 | # return rich
204 |
205 |
206 | # def _update_book_with_clippings(formatted_clippings):
207 | # rtf = []
208 | # for each_clipping in formatted_clippings:
209 | # each_clipping_list = each_clipping.split("*")
210 | # each_clipping_list = list(filter(None, each_clipping_list))
211 | # for each_line in each_clipping_list:
212 | # rtf.append(_create_rich_text_object(each_line))
213 | # print(len(rtf))
214 | # content = Paragraph._NestedData(rich_text=rtf)
215 | # para = Paragraph(paragraph=content)
216 | # return para
217 |
218 |
219 | def _get_book_cover_uri(title: str, author: str):
220 | req_uri = "https://www.googleapis.com/books/v1/volumes?q="
221 |
222 | if title is None:
223 | return
224 | req_uri += "intitle:" + title
225 |
226 | if author is not None:
227 | req_uri += "+inauthor:" + author
228 |
229 | response = get(req_uri).json().get("items", [])
230 | if len(response) > 0:
231 | for x in response:
232 | if x.get("volumeInfo", {}).get("imageLinks", {}).get("thumbnail"):
233 | return (
234 | x.get("volumeInfo", {})
235 | .get("imageLinks", {})
236 | .get("thumbnail")
237 | .replace("http://", "https://")
238 | )
239 | return
240 |
--------------------------------------------------------------------------------
/kindle2notion/parsing.py:
--------------------------------------------------------------------------------
1 | from re import findall
2 | from typing import Dict, List, Tuple
3 |
4 | from dateparser import parse
5 |
6 | BOOKS_WO_AUTHORS = []
7 |
8 | ACADEMIC_TITLES = [
9 | "A.A.",
10 | "A.S.",
11 | "A.A.A.",
12 | "A.A.S.",
13 | "A.B.",
14 | "A.D.N.",
15 | "A.M.",
16 | "A.M.T.",
17 | "C.E.",
18 | "Ch.E.",
19 | "D.A.",
20 | "D.A.S.",
21 | "D.B.A.",
22 | "D.C.",
23 | "D.D.",
24 | "D.Ed.",
25 | "D.L.S.",
26 | "D.M.D.",
27 | "D.M.S.",
28 | "D.P.A.",
29 | "D.P.H.",
30 | "D.R.E.",
31 | "D.S.W.",
32 | "D.Sc.",
33 | "D.V.M.",
34 | "Ed.D.",
35 | "Ed.S.",
36 | "E.E.",
37 | "E.M.",
38 | "E.Met.",
39 | "I.E.",
40 | "J.D.",
41 | "J.S.D.",
42 | "L.H.D.",
43 | "Litt.B.",
44 | "Litt.M.",
45 | "LL.B.",
46 | "LL.D.",
47 | "LL.M.",
48 | "M.A.",
49 | "M.Aero.E.",
50 | "M.B.A.",
51 | "M.C.S.",
52 | "M.D.",
53 | "M.Div.",
54 | "M.E.",
55 | "M.Ed.",
56 | "M.Eng.",
57 | "M.F.A.",
58 | "M.H.A.",
59 | "M.L.S.",
60 | "M.Mus.",
61 | "M.N.",
62 | "M.P.A.",
63 | "M.S.",
64 | "M.S.Ed.",
65 | "M.S.W.",
66 | "M.Th.",
67 | "Nuc.E.",
68 | "O.D.",
69 | "Pharm.D.",
70 | "Ph.B.",
71 | "Ph.D.",
72 | "S.B.",
73 | "Sc.D.",
74 | "S.J.D.",
75 | "S.Sc.D.",
76 | "Th.B.",
77 | "Th.D.",
78 | "Th.M.",
79 | ]
80 |
81 | DELIMITERS = ["; ", " & ", " and "]
82 |
83 |
84 | def parse_raw_clippings_text(raw_clippings_text: str) -> Dict:
85 | raw_clippings_list = raw_clippings_text.split("==========")
86 | print(f"Found {len(raw_clippings_list)} notes and highlights.\n")
87 |
88 | all_books = {}
89 | passed_clippings_count = 0
90 |
91 | for each_raw_clipping in raw_clippings_list:
92 | raw_clipping_list = each_raw_clipping.strip().split("\n")
93 |
94 | if _is_valid_clipping(raw_clipping_list):
95 | author, title = _parse_author_and_title(raw_clipping_list)
96 | page, location, date, is_note = _parse_page_location_date_and_note(
97 | raw_clipping_list
98 | )
99 | highlight = raw_clipping_list[3]
100 |
101 | all_books = _add_parsed_items_to_all_books_dict(
102 | all_books, title, author, highlight, page, location, date, is_note
103 | )
104 | else:
105 | passed_clippings_count += 1
106 |
107 | print(f"× Passed {passed_clippings_count} bookmarks or unsupported clippings.\n")
108 | return all_books
109 |
110 |
111 | def _is_valid_clipping(raw_clipping_list: List) -> bool:
112 | return len(raw_clipping_list) >= 3
113 |
114 |
115 | def _parse_author_and_title(raw_clipping_list: List) -> Tuple[str, str]:
116 | author, title = _parse_raw_author_and_title(raw_clipping_list)
117 | author, title = _deal_with_exceptions_in_author_name(author, title)
118 | title = _deal_with_exceptions_in_title(title)
119 | return author, title
120 |
121 |
122 | def _parse_page_location_date_and_note(
123 | raw_clipping_list: List,
124 | ) -> Tuple[str, str, str, bool]:
125 | second_line = raw_clipping_list[1]
126 | second_line_as_list = second_line.strip().split(" | ")
127 | page = location = date = ""
128 | is_note = False
129 |
130 | for element in second_line_as_list:
131 | element = element.lower()
132 | if "note" in element:
133 | is_note = True
134 | if "page" in element:
135 | page = element[element.find("page") :].replace("page", "").strip()
136 | if "location" in element:
137 | location = (
138 | element[element.find("location") :].replace("location", "").strip()
139 | )
140 | if "added on" in element:
141 | date = parse(
142 | element[element.find("added on") :].replace("added on", "").strip()
143 | )
144 | date = date.strftime("%A, %d %B %Y %I:%M:%S %p")
145 |
146 | return page, location, date, is_note
147 |
148 |
149 | def _add_parsed_items_to_all_books_dict(
150 | all_books: Dict,
151 | title: str,
152 | author: str,
153 | highlight: str,
154 | page: str,
155 | location: str,
156 | date: str,
157 | is_note: bool,
158 | ) -> Dict:
159 | if title not in all_books:
160 | all_books[title] = {"author": author, "highlights": []}
161 | all_books[title]["highlights"].append((highlight, page, location, date, is_note))
162 | return all_books
163 |
164 |
165 | def _parse_raw_author_and_title(raw_clipping_list: List) -> Tuple[str, str]:
166 | author = ""
167 | title = raw_clipping_list[0]
168 |
169 | if findall(r"\(.*?\)", raw_clipping_list[0]):
170 | author = (findall(r"\(.*?\)", raw_clipping_list[0]))[-1]
171 | author = author.removeprefix("(").removesuffix(")")
172 | else:
173 | if title not in BOOKS_WO_AUTHORS:
174 | BOOKS_WO_AUTHORS.append(title)
175 | print(
176 | f"{title} - No author found. You can manually add the author in the Notion database."
177 | )
178 |
179 | title = raw_clipping_list[0].replace(author, "").strip().replace(" ()", "")
180 |
181 | return author, title
182 |
183 |
184 | def _deal_with_exceptions_in_author_name(author: str, title: str) -> Tuple[str, str]:
185 | if "(" in author:
186 | author = author + ")"
187 | title = title.removesuffix(")")
188 |
189 | if ", " in author and all(x not in author for x in DELIMITERS):
190 | if (author.split(", "))[1] not in ACADEMIC_TITLES:
191 | author = " ".join(reversed(author.split(", ")))
192 |
193 | if "; " in author:
194 | authorList = author.split("; ")
195 | author = ""
196 | for ele in authorList:
197 | author += " ".join(reversed(ele.split(", "))) + ", "
198 | author = author.removesuffix(", ")
199 | return author, title
200 |
201 |
202 | def _deal_with_exceptions_in_title(title: str) -> str:
203 | if ", The" in title:
204 | title = "The " + title.replace(", The", "")
205 | return title
206 |
--------------------------------------------------------------------------------
/kindle2notion/reading.py:
--------------------------------------------------------------------------------
1 | from pathlib import Path
2 |
3 |
4 | def read_raw_clippings(clippings_file_path: Path) -> str:
5 | try:
6 | with open(clippings_file_path, "r", encoding="utf-8-sig") as raw_clippings_file:
7 | raw_clippings_text = raw_clippings_file.read()
8 | raw_clippings_text = raw_clippings_text.replace(u"\ufeff", "")
9 | raw_clippings_text_decoded = raw_clippings_text.encode(
10 | "ascii", errors="ignore"
11 | ).decode()
12 | except UnicodeEncodeError as e:
13 | print(e)
14 |
15 | return raw_clippings_text_decoded
16 |
--------------------------------------------------------------------------------
/requirements-dev.txt:
--------------------------------------------------------------------------------
1 | flake8>=3.9.2
2 | pytest>=6.2.4
3 | pytest-cov>=2.12.0
4 | black>=21.5b2
5 | isort>=5.10.1
6 | interrogate>=1.5.0
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | click>=8.0.0
2 | notional>=0.4.1
3 | pathlib
4 | dateparser>=1.0.0
5 | requests>=2.25.0
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [flake8]
2 | exclude =
3 | .git,
4 | __pycache__,
5 | venv,
6 | idea,
7 | .venv
8 | max-line-length = 120
9 | inline-quotes = single
10 | multiline-quotes = '''
11 | avoid-escape = True
12 | ignore=E203,E225,W293,W503
13 |
14 | [tool:pytest]
15 | testpaths = tests/
16 | norecursedirs = .git venv/ .pytest_cache/ main/
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import find_packages, setup
2 |
3 | with open("README.md", "r", encoding="utf-8") as f:
4 | long_description = f.read()
5 |
6 | with open("requirements.txt", "r", encoding="utf-8") as f:
7 | requirements = f.read()
8 |
9 | with open("requirements-dev.txt", "r", encoding="utf-8") as f:
10 | requirements_dev = f.read()
11 |
12 | setup(
13 | name="kindle2notion",
14 | version="1.0.1",
15 | author="Jeffrey Jacob",
16 | author_email="jeffreysamjacob@gmail.com",
17 | description="Export all the clippings from your Kindle device to a database in Notion.",
18 | long_description=long_description,
19 | long_description_content_type="text/markdown",
20 | url="https://github.com/paperboi/kindle2notion",
21 | classifiers=[
22 | "Programming Language :: Python :: 3",
23 | "License :: OSI Approved :: MIT License",
24 | "Operating System :: OS Independent",
25 | ],
26 | packages=find_packages(),
27 | install_requires=requirements,
28 | extras_require={"dev": requirements_dev},
29 | python_requires=">=3.9",
30 | entry_points={
31 | "console_scripts": [
32 | "kindle2notion = kindle2notion.__main__:main",
33 | ],
34 | },
35 | )
36 |
--------------------------------------------------------------------------------
/tests/test_data/Test Clippings.txt:
--------------------------------------------------------------------------------
1 | Title 1: A Great Book (Horowitz, Ben)
2 | - Your Highlight on page 11 | Location 111-114 | Added on Tuesday, September 22, 2020 9:23:48 AM
3 |
4 | This is test highlight 1.
5 | ==========
6 | Title 1: A Great Book (Horowitz, Ben)
7 | - Your Highlight on page 11 | Location 111-114 | Added on Tuesday, September 22, 2020 9:24:04 AM
8 |
9 | This is test highlight 2.
10 | ==========
11 | Title 2 Is Good Too (Bryar, Colin)
12 | - Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM
13 |
14 | This is test highlight 3.
15 | ==========
16 | Title 2 Is Good Too (Bryar, Colin)
17 | - Your Highlight on page 34 | Location 682-684 | Added on Friday, April 30, 2021 3:14:33 PM
18 |
19 | This is test highlight 4.
20 | ==========
21 | Title 3 Is Clean (Robert C. Martin Series) (C., Martin Robert)
22 | - Your Highlight on page 22 | Location 559-560 | Added on Saturday, May 15, 2021 10:25:42 PM
23 |
24 | This is test highlight 5.
25 | ==========
26 | Title 3 Is Clean (Robert C. Martin Series) (C., Martin Robert)
27 | - Your Highlight on page 22 | Location 564-565 | Added on Saturday, May 15, 2021 10:26:26 PM
28 |
29 | This is test highlight 6.
30 | ==========
--------------------------------------------------------------------------------
/tests/test_exporting.py:
--------------------------------------------------------------------------------
1 | from kindle2notion.exporting import _prepare_aggregated_text_for_one_book
2 |
3 |
4 | def test_prepare_aggregated_text_for_one_book_should_return_the_aggregated_text_when_highlight_date_is_disabled():
5 | # Given
6 | highlights = [
7 | (
8 | "This is an example highlight.",
9 | "1",
10 | "100",
11 | "Thursday, 29 April 2021 12:31:29 AM",
12 | False,
13 | ),
14 | (
15 | "This is a second example highlight.",
16 | "2",
17 | "200",
18 | "Friday, 30 April 2021 12:31:29 AM",
19 | True,
20 | ),
21 | ]
22 |
23 | expected = (
24 | [
25 | "This is an example highlight.\n* Page: 1, Location: 100\n\n",
26 | "> NOTE: \nThis is a second example highlight.\n* Page: 2, Location: 200\n\n",
27 | ],
28 | "Friday, 30 April 2021 12:31:29 AM",
29 | )
30 |
31 | # When
32 | actual = _prepare_aggregated_text_for_one_book(
33 | highlights, enable_highlight_date=False
34 | )
35 | print(actual)
36 | # Then
37 | assert expected == actual
38 |
39 |
40 | def test_prepare_aggregated_text_for_one_book_should_return_the_aggregated_text_when_highlight_date_is_enabled():
41 | # Given
42 | highlights = [
43 | (
44 | "This is an example highlight.",
45 | "1",
46 | "100",
47 | "Thursday, 29 April 2021 12:31:29 AM",
48 | False,
49 | ),
50 | (
51 | "This is a second example highlight.",
52 | "2",
53 | "200",
54 | "Friday, 30 April 2021 12:31:29 AM",
55 | True,
56 | ),
57 | ]
58 |
59 | expected = (
60 | [
61 | "This is an example highlight.\n* Page: 1, Location: 100, Date Added: Thursday, 29 April 2021 12:31:29 AM\n\n",
62 | "> NOTE: \nThis is a second example highlight.\n* Page: 2, Location: 200, Date Added: Friday, 30 April 2021 12:31:29 AM\n\n",
63 | ],
64 | "Friday, 30 April 2021 12:31:29 AM",
65 | )
66 |
67 | # When
68 | actual = _prepare_aggregated_text_for_one_book(
69 | highlights, enable_highlight_date=True
70 | )
71 | print(actual)
72 | # Then
73 | assert expected == actual
74 |
--------------------------------------------------------------------------------
/tests/test_parsing.py:
--------------------------------------------------------------------------------
1 | from datetime import datetime
2 | from pathlib import Path
3 |
4 | from kindle2notion.parsing import (
5 | parse_raw_clippings_text,
6 | _parse_author_and_title,
7 | _parse_page_location_date_and_note,
8 | _add_parsed_items_to_all_books_dict,
9 | )
10 | from kindle2notion.reading import read_raw_clippings
11 |
12 |
13 | def test_parse_raw_clippings_text_should_return_a_dict_with_all_the_parsed_information():
14 | # Given
15 | test_clippings_file_path = (
16 | Path(__file__).parent.absolute() / "test_data/Test Clippings.txt"
17 | )
18 | raw_clippings_text = read_raw_clippings(test_clippings_file_path)
19 |
20 | expected = {
21 | "Title 1: A Great Book": {
22 | "author": "Ben Horowitz",
23 | "highlights": [
24 | (
25 | "This is test highlight 1.",
26 | "11",
27 | "111-114",
28 | "Tuesday, 22 September 2020 09:23:48 AM",
29 | False,
30 | ),
31 | (
32 | "This is test highlight 2.",
33 | "11",
34 | "111-114",
35 | "Tuesday, 22 September 2020 09:24:04 AM",
36 | False,
37 | ),
38 | ],
39 | },
40 | "Title 2 Is Good Too": {
41 | "author": "Colin Bryar",
42 | "highlights": [
43 | (
44 | "This is test highlight 3.",
45 | "3",
46 | "184-185",
47 | "Friday, 30 April 2021 12:31:29 AM",
48 | False,
49 | ),
50 | (
51 | "This is test highlight 4.",
52 | "34",
53 | "682-684",
54 | "Friday, 30 April 2021 03:14:33 PM",
55 | False,
56 | ),
57 | ],
58 | },
59 | "Title 3 Is Clean (Robert C. Martin Series)": {
60 | "author": "Martin Robert C.",
61 | "highlights": [
62 | (
63 | "This is test highlight 5.",
64 | "22",
65 | "559-560",
66 | "Saturday, 15 May 2021 10:25:42 PM",
67 | False,
68 | ),
69 | (
70 | "This is test highlight 6.",
71 | "22",
72 | "564-565",
73 | "Saturday, 15 May 2021 10:26:26 PM",
74 | False,
75 | ),
76 | ],
77 | },
78 | }
79 |
80 | # When
81 | actual = parse_raw_clippings_text(raw_clippings_text)
82 |
83 | # Then
84 | assert expected == actual
85 |
86 |
87 | def test_parse_author_and_title_case_should_parse_the_author_and_title_when_the_author_name_is_formatted_with_a_comma():
88 | # Given
89 | raw_clipping_list = [
90 | "Relativity (Einstein, Albert)",
91 | "- Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM",
92 | "",
93 | "This is a test highlight.",
94 | False,
95 | ]
96 | expected = ("Albert Einstein", "Relativity")
97 |
98 | # When
99 | actual = _parse_author_and_title(raw_clipping_list)
100 |
101 | # Then
102 | assert expected == actual
103 |
104 |
105 | def test_parse_author_and_title_case_should_parse_the_author_and_title_when_the_author_name_is_first_name_last_name():
106 | # Given
107 | raw_clipping_list = [
108 | "Relativity (Albert Einstein)",
109 | "- Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM",
110 | "",
111 | "This is a test highlight.",
112 | False,
113 | ]
114 | expected = ("Albert Einstein", "Relativity")
115 |
116 | # When
117 | actual = _parse_author_and_title(raw_clipping_list)
118 |
119 | # Then
120 | assert expected == actual
121 |
122 |
123 | def test_parse_author_and_title_case_should_parse_the_author_and_title_when_there_are_parentheses_in_the_author_name():
124 | # Given
125 | raw_clipping_list = [
126 | "Candide (Voltaire (François-Marie Arouet))",
127 | "- Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM",
128 | "",
129 | "This is a test highlight.",
130 | False,
131 | ]
132 | expected = ("Voltaire (François-Marie Arouet)", "Candide")
133 |
134 | # When
135 | actual = _parse_author_and_title(raw_clipping_list)
136 |
137 | # Then
138 | assert expected == actual
139 |
140 |
141 | def test_parse_author_and_title_case_should_parse_the_author_and_title_when_there_is_a_The_at_the_end_of_the_title():
142 | # Given
143 | raw_clipping_list = [
144 | "Age of Louis XIV, The (Voltaire (François-Marie Arouet))",
145 | "- Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM",
146 | "",
147 | "This is a test highlight.",
148 | False,
149 | ]
150 | expected = ("Voltaire (François-Marie Arouet)", "The Age of Louis XIV")
151 |
152 | # When
153 | actual = _parse_author_and_title(raw_clipping_list)
154 |
155 | # Then
156 | assert expected == actual
157 |
158 |
159 | def test_parse_author_and_title_case_should_parse_the_author_and_title_when_there_are_parentheses_in_the_title():
160 | # Given
161 | raw_clipping_list = [
162 | "The Mysterious Disappearance of Leon (I Mean Noel) (Ellen Raskin)",
163 | "- Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM",
164 | "",
165 | "This is a test highlight.",
166 | False,
167 | ]
168 | expected = ("Ellen Raskin", "The Mysterious Disappearance of Leon (I Mean Noel)")
169 |
170 | # When
171 | actual = _parse_author_and_title(raw_clipping_list)
172 |
173 | # Then
174 | assert expected == actual
175 |
176 |
177 | def test_parse_page_location_date_and_note_should_parse_the_page_location_and_date_when_there_are_all_three():
178 | # Given
179 | raw_clipping_list = [
180 | "Relativity (Albert Einstein)",
181 | "- Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM",
182 | "",
183 | "This is a test highlight.",
184 | False,
185 | ]
186 | expected = ("3", "184-185", "Friday, 30 April 2021 12:31:29 AM", False)
187 |
188 | # When
189 | actual = _parse_page_location_date_and_note(raw_clipping_list)
190 |
191 | # Then
192 | assert expected == actual
193 |
194 |
195 | def test_parse_page_location_date_and_note_should_parse_the_page_and_location_when_there_is_no_date():
196 | # Given
197 | raw_clipping_list = [
198 | "Relativity (Albert Einstein)",
199 | "- Your Highlight on page 3 | Location 184-185",
200 | "",
201 | "This is a test highlight.",
202 | False,
203 | ]
204 | expected = ("3", "184-185", "", False)
205 |
206 | # When
207 | actual = _parse_page_location_date_and_note(raw_clipping_list)
208 |
209 | # Then
210 | assert expected == actual
211 |
212 |
213 | def test_parse_page_location_date_and_note_should_parse_the_location_and_date_when_there_is_no_page():
214 | # Given
215 | raw_clipping_list = [
216 | "Relativity (Albert Einstein)",
217 | "Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM",
218 | "",
219 | "This is a test highlight.",
220 | False,
221 | ]
222 | expected = ("", "184-185", "Friday, 30 April 2021 12:31:29 AM", False)
223 |
224 | # When
225 | actual = _parse_page_location_date_and_note(raw_clipping_list)
226 |
227 | # Then
228 | assert expected == actual
229 |
230 |
231 | def test_parse_page_location_date_and_note_should_parse_the_page_and_date_when_there_is_no_location():
232 | # Given
233 | raw_clipping_list = [
234 | "Relativity (Albert Einstein)",
235 | "- Your Highlight on page 3 | Added on Friday, April 30, 2021 12:31:29 AM",
236 | "",
237 | "This is a test highlight.",
238 | ]
239 | expected = ("3", "", "Friday, 30 April 2021 12:31:29 AM", False)
240 |
241 | # When
242 | actual = _parse_page_location_date_and_note(raw_clipping_list)
243 |
244 | # Then
245 | assert expected == actual
246 |
247 |
248 | def test_add_parsed_items_to_books_dict_should_add_the_parsed_items_when_the_book_is_not_already_in_the_books_dict():
249 | # Given
250 | books = {}
251 | title = "Relativity"
252 | author = "Albert Einstein"
253 | highlight = "This is a first highlight."
254 | page = "1"
255 | location = "100"
256 | date = datetime(2021, 4, 30, 0, 31, 29)
257 | is_note = False
258 |
259 | expected = {
260 | "Relativity": {
261 | "author": "Albert Einstein",
262 | "highlights": [
263 | (
264 | "This is a first highlight.",
265 | "1",
266 | "100",
267 | datetime(2021, 4, 30, 0, 31, 29),
268 | False,
269 | )
270 | ],
271 | }
272 | }
273 |
274 | # When
275 | actual = _add_parsed_items_to_all_books_dict(
276 | books, title, author, highlight, page, location, date, is_note
277 | )
278 |
279 | # Then
280 | assert expected == actual
281 |
282 |
283 | def test_add_parsed_items_to_books_dict_should_add_the_parsed_items_when_the_book_is_already_in_the_books_dict():
284 | # Given
285 | books = {
286 | "Relativity": {
287 | "author": "Albert Einstein",
288 | "highlights": [
289 | (
290 | "This is a first highlight.",
291 | "1",
292 | "100",
293 | datetime(2021, 4, 30, 0, 31, 29),
294 | False,
295 | )
296 | ],
297 | }
298 | }
299 | title = "Relativity"
300 | author = "Albert Einstein"
301 | highlight = "This is a second highlight."
302 | page = "2"
303 | location = "200"
304 | date = datetime(2021, 5, 1, 0, 31, 29)
305 | is_note = False
306 |
307 | expected = {
308 | "Relativity": {
309 | "author": "Albert Einstein",
310 | "highlights": [
311 | (
312 | "This is a first highlight.",
313 | "1",
314 | "100",
315 | datetime(2021, 4, 30, 0, 31, 29),
316 | False,
317 | ),
318 | (
319 | "This is a second highlight.",
320 | "2",
321 | "200",
322 | datetime(2021, 5, 1, 0, 31, 29),
323 | False,
324 | ),
325 | ],
326 | }
327 | }
328 |
329 | # When
330 | actual = _add_parsed_items_to_all_books_dict(
331 | books, title, author, highlight, page, location, date, is_note
332 | )
333 |
334 | # Then
335 | assert expected == actual
336 |
--------------------------------------------------------------------------------
/tests/test_reading.py:
--------------------------------------------------------------------------------
1 | from pathlib import Path
2 |
3 | from kindle2notion.reading import read_raw_clippings
4 |
5 |
6 | def test_read_raw_clippings_should_return_all_clippings_data_as_string():
7 | # Given
8 | test_clippings_file_path = (
9 | Path(__file__).parent.absolute() / "test_data/Test Clippings.txt"
10 | )
11 |
12 | expected = """Title 1: A Great Book (Horowitz, Ben)
13 | - Your Highlight on page 11 | Location 111-114 | Added on Tuesday, September 22, 2020 9:23:48 AM
14 |
15 | This is test highlight 1.
16 | ==========
17 | Title 1: A Great Book (Horowitz, Ben)
18 | - Your Highlight on page 11 | Location 111-114 | Added on Tuesday, September 22, 2020 9:24:04 AM
19 |
20 | This is test highlight 2.
21 | ==========
22 | Title 2 Is Good Too (Bryar, Colin)
23 | - Your Highlight on page 3 | Location 184-185 | Added on Friday, April 30, 2021 12:31:29 AM
24 |
25 | This is test highlight 3.
26 | ==========
27 | Title 2 Is Good Too (Bryar, Colin)
28 | - Your Highlight on page 34 | Location 682-684 | Added on Friday, April 30, 2021 3:14:33 PM
29 |
30 | This is test highlight 4.
31 | ==========
32 | Title 3 Is Clean (Robert C. Martin Series) (C., Martin Robert)
33 | - Your Highlight on page 22 | Location 559-560 | Added on Saturday, May 15, 2021 10:25:42 PM
34 |
35 | This is test highlight 5.
36 | ==========
37 | Title 3 Is Clean (Robert C. Martin Series) (C., Martin Robert)
38 | - Your Highlight on page 22 | Location 564-565 | Added on Saturday, May 15, 2021 10:26:26 PM
39 |
40 | This is test highlight 6.
41 | =========="""
42 |
43 | # When
44 | actual = raw_clippings_text = read_raw_clippings(test_clippings_file_path)
45 |
46 | # Then
47 | assert expected == actual
48 |
--------------------------------------------------------------------------------