├── roam_to_git ├── __init__.py ├── fs.py ├── formatter.py ├── __main__.py └── scrapping.py ├── img ├── gitlab-done.png ├── gitlab-add-key.png ├── gitlab-schedule.png ├── gitlab-start-job.png └── gitlab-add-variable.png ├── .gitignore ├── setup.cfg ├── requirements.txt ├── env.template ├── .github ├── workflows │ ├── python-linting.yml │ └── test.yml └── ISSUE_TEMPLATE │ ├── feature_request.md │ └── bug_report.md ├── LICENSE.txt ├── setup.py ├── tests.py └── README.md /roam_to_git/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /img/gitlab-done.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Stvad/roam-to-git/master/img/gitlab-done.png -------------------------------------------------------------------------------- /img/gitlab-add-key.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Stvad/roam-to-git/master/img/gitlab-add-key.png -------------------------------------------------------------------------------- /img/gitlab-schedule.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Stvad/roam-to-git/master/img/gitlab-schedule.png -------------------------------------------------------------------------------- /img/gitlab-start-job.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Stvad/roam-to-git/master/img/gitlab-start-job.png -------------------------------------------------------------------------------- /img/gitlab-add-variable.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Stvad/roam-to-git/master/img/gitlab-add-variable.png 
-------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | logs 2 | notes 3 | downloads 4 | venv/ 5 | env/ 6 | **.pyc 7 | .env 8 | .mypy_cache/ 9 | MANIFEST 10 | dist/ 11 | *.ipynb 12 | *.log 13 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | # Inside of setup.cfg 2 | [metadata] 3 | description-file = README.md 4 | 5 | [flake8] 6 | max-line-length = 100 7 | extend-exclude = env/*,notes/*,.mypy_cache/* 8 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # WARNING: don't forget to update setup.py 2 | gitpython==3.1.20 3 | loguru==0.4.1 4 | pathvalidate==2.5.2 5 | python-dotenv==0.20 6 | psutil==5.9.5 7 | selenium==3.13.0 -------------------------------------------------------------------------------- /env.template: -------------------------------------------------------------------------------- 1 | # Copy this file to ".env" and fill the values, or configure it on Github secrets if using Github actions 2 | ROAMRESEARCH_USER="YOUR_EMAIL" 3 | ROAMRESEARCH_PASSWORD="YOUR_PASSWORD" 4 | # find it here https://user-images.githubusercontent.com/656694/84388282-98136800-abf4-11ea-84c1-85ffc59b30b0.png 5 | ROAMRESEARCH_DATABASE="YOUR_DATABASE" 6 | -------------------------------------------------------------------------------- /.github/workflows/python-linting.yml: -------------------------------------------------------------------------------- 1 | name: Flake8 Lint 2 | 3 | on: [push, pull_request] 4 | 5 | jobs: 6 | flake8-lint: 7 | runs-on: ubuntu-latest 8 | name: Lint 9 | steps: 10 | - name: Check out source repository 11 | uses: actions/checkout@v2 12 | - name: Set up Python environment 13 | 
uses: actions/setup-python@v1 14 | with: 15 | python-version: "3.8" 16 | - name: flake8 Lint 17 | uses: py-actions/flake8@v1 -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: enhancement 6 | assignees: MatthieuBizien 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 
21 | -------------------------------------------------------------------------------- /.github/workflows/test.yml: -------------------------------------------------------------------------------- 1 | name: "roam-to-git tests.py" 2 | 3 | on: [push, pull_request] 4 | 5 | 6 | jobs: 7 | test: 8 | strategy: 9 | fail-fast: false 10 | matrix: 11 | os: [ macos-latest, ubuntu-20.04, windows-latest] 12 | python: [ 3.6, 3.7, 3.8, 3.9 ] 13 | 14 | env: 15 | OS: ${{ matrix.os }} 16 | PYTHON: ${{ matrix.python }} 17 | 18 | runs-on: ${{ matrix.os }} 19 | name: Test 20 | timeout-minutes: 15 21 | 22 | steps: 23 | - uses: actions/checkout@v3.5.2 24 | 25 | - name: Set up Python 26 | uses: actions/setup-python@v4.6.0 27 | with: 28 | python-version: ${{ matrix.python }} 29 | 30 | - name: Setup dependencies 31 | run: | 32 | python --version 33 | python -m pip install -r requirements.txt 34 | python -m pip install mypy 35 | 36 | - name: Run backup 37 | run: ./tests.py -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | Copyright (c) 2018 YOUR NAME 3 | Permission is hereby granted, free of charge, to any person obtaining a copy 4 | of this software and associated documentation files (the "Software"), to deal 5 | in the Software without restriction, including without limitation the rights 6 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 7 | copies of the Software, and to permit persons to whom the Software is 8 | furnished to do so, subject to the following conditions: 9 | The above copyright notice and this permission notice shall be included in all 10 | copies or substantial portions of the Software. 11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 13 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 14 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 15 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 16 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 17 | SOFTWARE. 18 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: bug 6 | assignees: MatthieuBizien 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Traceback** 24 | Please use http://gist.github.com/ or similar, and report the last line here. 25 | 26 | **Run `roam-to-git --debug notes/` and report what you get.** 27 | It should open a browser front-end and do the scraping. The repository content will not be modified. If applicable, add screenshots to help explain your problem. 28 | 29 | **Please complete the following information:** 30 | - OS: [e.g. macOS, Linux] 31 | - Do you use GitHub Actions? 32 | - Do you use multiple Roam databases? 33 | - Did roam-to-git use to work for you? When precisely did it stop working? 34 | - Are some backup runs still working? 35 | 36 | **Additional context** 37 | Add any other context about the problem here. 
38 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from distutils.core import setup 2 | 3 | from pkg_resources import parse_requirements 4 | 5 | setup( 6 | name='roam_to_git', 7 | packages=['roam_to_git'], 8 | version='0.2', 9 | license='MIT', 10 | description='Automatic RoamResearch backup to Git', 11 | author='Matthieu Bizien', # Type in your name 12 | author_email='oao2005@gmail.com', # Type in your E-Mail 13 | url='https://github.com/MatthieuBizien/roam-to-git', 14 | download_url='https://github.com/MatthieuBizien/roam-to-git/archive/v0.1.tar.gz', 15 | keywords=['ROAMRESEARCH', 'GIT', 'BACKUP'], 16 | install_requires=[str(requirement) for requirement in 17 | parse_requirements(open("requirements.txt"))], 18 | classifiers=[ 19 | 'Development Status :: 3 - Alpha', 20 | 'Intended Audience :: Developers', 21 | 'Topic :: Internet :: WWW/HTTP :: Dynamic Content :: Wiki', 22 | 'License :: OSI Approved :: MIT License', 23 | 'Programming Language :: Python :: 3', 24 | 'Programming Language :: Python :: 3.6', 25 | 'Programming Language :: Python :: 3.7', 26 | 'Programming Language :: Python :: 3.8', 27 | ], 28 | entry_points={ 29 | 'console_scripts': ['roam-to-git=roam_to_git.__main__:main'], 30 | } 31 | ) 32 | -------------------------------------------------------------------------------- /roam_to_git/fs.py: -------------------------------------------------------------------------------- 1 | import contextlib 2 | import datetime 3 | import json 4 | import platform 5 | import tempfile 6 | import zipfile 7 | from pathlib import Path 8 | from typing import Dict, List 9 | from subprocess import Popen, PIPE, STDOUT 10 | 11 | import git 12 | import pathvalidate 13 | from loguru import logger 14 | 15 | 16 | def get_zip_path(zip_dir_path: Path) -> Path: 17 | """Return the path to the single zip file in a directory, and fail if there is not one 
single 18 | zip file""" 19 | zip_files = list(zip_dir_path.iterdir()) 20 | zip_files = [f for f in zip_files if f.name.endswith(".zip")] 21 | assert len(zip_files) == 1, (zip_files, zip_dir_path) 22 | zip_path, = zip_files 23 | return zip_path 24 | 25 | 26 | def reset_git_directory(git_path: Path, skip=(".git",)): 27 | """Remove all files in a git directory""" 28 | to_remove: List[Path] = [] 29 | for file in git_path.glob("**/*"): 30 | if any(skip_item in file.parts for skip_item in skip): 31 | continue 32 | to_remove.append(file) 33 | # Now we remove starting from the end to remove children before parents 34 | to_remove = sorted(set(to_remove))[::-1] 35 | for file in to_remove: 36 | if file.is_file(): 37 | file.unlink() 38 | elif file.is_dir(): 39 | if list(file.iterdir()): 40 | logger.debug("Impossible to remove directory {}", file) 41 | else: 42 | file.rmdir() 43 | 44 | 45 | def unzip_archive(zip_dir_path: Path): 46 | logger.debug("Unzipping {}", zip_dir_path) 47 | zip_path = get_zip_path(zip_dir_path) 48 | with zipfile.ZipFile(zip_path) as zip_file: 49 | contents = {file.filename: zip_file.read(file.filename).decode() 50 | for file in zip_file.infolist() 51 | if not file.is_dir()} 52 | return contents 53 | 54 | 55 | def save_files(save_format: str, directory: Path, contents: Dict[str, str]): 56 | logger.debug("Saving {} to {}", save_format, directory) 57 | for file_name, content in contents.items(): 58 | dest = get_clean_path(directory, file_name) 59 | dest.parent.mkdir(parents=True, exist_ok=True) # Needed if a new directory is used 60 | # We have to specify the encoding because crontab on Mac doesn't use UTF-8 61 | # https://stackoverflow.com/questions/11735363/python3-unicodeencodeerror-crontab 62 | with dest.open("w", encoding="utf-8") as f: 63 | if save_format == 'json': 64 | json.dump(json.loads(content), f, sort_keys=True, indent=2, ensure_ascii=True) 65 | else: # markdown, formatted, edn 66 | if save_format == 'edn': 67 | try: 68 | jet = Popen( 69 | ["jet", 
"--edn-reader-opts", "{:default tagged-literal}", "--pretty"], 70 | stdout=PIPE, stdin=PIPE, stderr=STDOUT) 71 | jet_stdout, _ = jet.communicate(input=str.encode(content)) 72 | content = jet_stdout.decode() 73 | except IOError: 74 | logger.debug("Jet not installed, skipping EDN pretty printing") 75 | 76 | f.write(content) 77 | 78 | 79 | def unzip_and_save_archive(save_format: str, zip_dir_path: Path, directory: Path): 80 | logger.debug("Saving {} to {}", save_format, directory) 81 | contents = unzip_archive(zip_dir_path) 82 | save_files(save_format, directory, contents) 83 | 84 | 85 | def commit_git_directory(repo: git.Repo): 86 | """Add an automatic commit in a git directory if it has changed, and push it""" 87 | if not repo.is_dirty() and not repo.untracked_files: 88 | # No change, nothing to do 89 | return 90 | logger.debug("Committing git repository {}", repo.git_dir) 91 | repo.git.add(A=True) # https://github.com/gitpython-developers/GitPython/issues/292 92 | repo.index.commit(f"Automatic commit {datetime.datetime.now().isoformat()}") 93 | 94 | 95 | def push_git_repository(repo: git.Repo): 96 | logger.debug("Pushing to origin") 97 | origin = repo.remote(name='origin') 98 | origin.push() 99 | 100 | 101 | def get_clean_path(directory: Path, file_name: str) -> Path: 102 | """Remove any special characters on the file name""" 103 | out = directory 104 | for name in file_name.split("/"): 105 | if name == "..": 106 | continue 107 | out = out / pathvalidate.sanitize_filename(name, platform=platform.system()) 108 | return out 109 | 110 | 111 | @contextlib.contextmanager 112 | def create_temporary_directory(autodelete=True): 113 | if autodelete: 114 | with tempfile.TemporaryDirectory() as directory: 115 | yield directory 116 | else: 117 | now = datetime.datetime.now().isoformat().replace(":", "-") 118 | directory = Path("/tmp") / "roam-to-git" / now 119 | directory.mkdir(parents=True) 120 | yield directory 121 | # No clean-up 122 | 
-------------------------------------------------------------------------------- /tests.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import os 3 | import unittest 4 | import mypy.api 5 | from typing import List 6 | 7 | from roam_to_git.formatter import extract_links, format_link, format_to_do 8 | 9 | 10 | class TestFormatTodo(unittest.TestCase): 11 | def test_empty(self): 12 | self.assertEqual(format_to_do(""), "") 13 | 14 | def test_no_link(self): 15 | self.assertEqual(format_to_do("string"), "string") 16 | 17 | def test_to_do(self): 18 | self.assertEqual(format_to_do("a\n- {{[[TODO]]}}string"), "a\n- [ ] string") 19 | 20 | def test_done(self): 21 | self.assertEqual(format_to_do("a\n- {{[[DONE]]}}string"), "a\n- [x] string") 22 | 23 | def test_something_else(self): 24 | self.assertEqual(format_to_do("a\n- {{[[ZZZ]]}}string"), "a\n- {{[[ZZZ]]}}string") 25 | 26 | 27 | class TestFormatLinks(unittest.TestCase): 28 | """Test that we correctly format the links""" 29 | 30 | def test_empty(self): 31 | self.assertEqual(format_link(""), "") 32 | 33 | def test_no_link(self): 34 | self.assertEqual(format_link("string"), "string") 35 | 36 | def test_one_link(self): 37 | self.assertEqual(format_link("string [[link]]."), "string [link](<link.md>).") 38 | 39 | def test_one_link_prefix(self): 40 | self.assertEqual(format_link("string [[link]].", link_prefix="../../"), 41 | "string [link](<../../link.md>).") 42 | 43 | def test_two_links(self): 44 | self.assertEqual(format_link("[[link]] [[other]]"), 45 | "[link](<link.md>) [other](<other.md>)") 46 | 47 | def test_one_hashtag(self): 48 | self.assertEqual(format_link("string #link."), "string [link](<link.md>).") 49 | 50 | def test_two_hashtag(self): 51 | self.assertEqual(format_link("#link #other"), 52 | "[link](<link.md>) [other](<other.md>)") 53 | 54 | def test_attribute(self): 55 | self.assertEqual(format_link(" - string:: link"), " - **[string](<string.md>):** link") 56 | 57 | def test_attribute_then_attribute_like(self): 58 
| self.assertEqual(format_link("- attrib:: string:: val"), 59 | "- **[attrib](<attrib.md>):** string:: val") 60 | 61 | def test_attribute_with_colon(self): 62 | self.assertEqual(format_link("- attrib:is:: string"), 63 | "- **[attrib:is](<attrib:is.md>):** string") 64 | 65 | def test_attribute_new_line(self): 66 | self.assertEqual(format_link(" - attrib:: string\n " 67 | "- attrib:: string"), 68 | " - **[attrib](<attrib.md>):** string\n " 69 | " - **[attrib](<attrib.md>):** string") 70 | 71 | 72 | def _extract_links(string) -> List[str]: 73 | return [m.group(1) for m in extract_links(string)] 74 | 75 | 76 | class TestExtractLinks(unittest.TestCase): 77 | """Test that we correctly extract the links, for backreference""" 78 | def test_empty(self): 79 | self.assertEqual(_extract_links(""), []) 80 | 81 | def test_no_link(self): 82 | self.assertEqual(_extract_links("string"), []) 83 | 84 | def test_one_link(self): 85 | self.assertEqual(_extract_links("string [[link]]."), ["link"]) 86 | 87 | def test_two_links(self): 88 | self.assertEqual(_extract_links("[[link]] [[other]]"), ["link", "other"]) 89 | 90 | def test_one_hashtag(self): 91 | self.assertEqual(_extract_links("string [[link]]."), ["link"]) 92 | 93 | def test_two_hashtag(self): 94 | self.assertEqual(_extract_links("[[link]] [[other]]"), ["link", "other"]) 95 | 96 | def test_no_attribute(self): 97 | self.assertEqual(_extract_links(" - string: link"), []) 98 | 99 | def test_attribute(self): 100 | self.assertEqual(_extract_links(" - attrib:: link"), ["attrib"]) 101 | 102 | def test_attribute_then_attribute_like(self): 103 | self.assertEqual(_extract_links("- attrib:: link:: val"), ["attrib"]) 104 | 105 | def test_attribute_with_colon(self): 106 | self.assertEqual(_extract_links("- attrib:is:: link"), ["attrib:is"]) 107 | 108 | def test_attribute_new_line(self): 109 | self.assertEqual(_extract_links(" - attrib:: link\n " 110 | "- attrib2:: link"), 111 | ["attrib", "attrib2"]) 112 | 113 | 114 | class TestMypy(unittest.TestCase): 115 | def _test_mypy(self, files: 
List[str]): 116 | stdout, stderr, exit_status = mypy.api.run(["--ignore-missing-imports", *files]) 117 | self.assertEqual(exit_status, 0) 118 | 119 | def test_mypy_rtg(self): 120 | self._test_mypy(["roam_to_git"]) 121 | 122 | def test_mypy_rtg_and_tests(self): 123 | self._test_mypy(["roam_to_git", "tests.py"]) 124 | 125 | def test_mypy_all(self): 126 | self._test_mypy([str(f) for f in os.listdir(".") 127 | if os.path.isfile(f) and f.endswith(".py") 128 | and not f.startswith("setup")]) 129 | 130 | 131 | if __name__ == "__main__": 132 | unittest.main() 133 | -------------------------------------------------------------------------------- /roam_to_git/formatter.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | from collections import defaultdict 4 | from itertools import takewhile 5 | from pathlib import Path 6 | from typing import Dict, List, Match, Tuple 7 | 8 | 9 | def read_markdown_directory(raw_directory: Path) -> Dict[str, str]: 10 | contents = {} 11 | for file in raw_directory.iterdir(): 12 | if file.is_dir(): 13 | # We recursively add the content of sub-directories. 14 | # They exist when there is a / in the note name. 
15 | for child_name, content in read_markdown_directory(file).items(): 16 | contents[f"{file.name}/{child_name}"] = content 17 | if not file.is_file(): 18 | continue 19 | content = file.read_text(encoding="utf-8") 20 | parts = file.parts[len(raw_directory.parts):] 21 | file_name = os.path.join(*parts) 22 | contents[file_name] = content 23 | return contents 24 | 25 | 26 | def get_back_links(contents: Dict[str, str]) -> Dict[str, List[Tuple[str, Match]]]: 27 | # Extract backlinks from the markdown 28 | forward_links = {file_name: extract_links(content) for file_name, content in contents.items()} 29 | back_links: Dict[str, List[Tuple[str, Match]]] = defaultdict(list) 30 | for file_name, links in forward_links.items(): 31 | for link in links: 32 | back_links[f"{link.group(1)}.md"].append((file_name, link)) 33 | return back_links 34 | 35 | 36 | def format_markdown(contents: Dict[str, str]) -> Dict[str, str]: 37 | back_links = get_back_links(contents) 38 | # Format and write the markdown files 39 | out = {} 40 | for file_name, content in contents.items(): 41 | # We add the backlinks first, because they use the position of the characters 42 | # of the regex matches 43 | content = add_back_links(content, back_links[file_name]) 44 | 45 | # Format content. Backlinks content will be formatted automatically. 
46 | content = format_to_do(content) 47 | link_prefix = "../" * sum("/" in char for char in file_name) 48 | content = format_link(content, link_prefix=link_prefix) 49 | if len(content) > 0: 50 | out[file_name] = content 51 | 52 | return out 53 | 54 | 55 | def format_to_do(contents: str): 56 | contents = re.sub(r"{{\[\[TODO\]\]}} *", r"[ ] ", contents) 57 | contents = re.sub(r"{{\[\[DONE\]\]}} *", r"[x] ", contents) 58 | return contents 59 | 60 | 61 | def extract_links(string: str) -> List[Match]: 62 | out = list(re.finditer(r"\[\[" 63 | r"([^\]\n]+)" 64 | r"\]\]", string)) 65 | # Match attributes 66 | out.extend(re.finditer(r"(?:^|\n) *- " 67 | r"((?:[^:\n]|:[^:\n])+)" # Match everything except :: 68 | r"::", string)) 69 | return out 70 | 71 | 72 | def add_back_links(content: str, back_links: List[Tuple[str, Match]]) -> str: 73 | if not back_links: 74 | return content 75 | files = sorted(set((file_name[:-3], match) for file_name, match in back_links), 76 | key=lambda e: (e[0], e[1].start())) 77 | new_lines = [] 78 | file_before = None 79 | for file, match in files: 80 | if file != file_before: 81 | new_lines.append(f"## [{file}](<{file}.md>)") 82 | file_before = file 83 | 84 | start_context_ = list(takewhile(lambda c: c != "\n", match.string[:match.start()][::-1])) 85 | start_context = "".join(start_context_[::-1]) 86 | 87 | middle_context = match.string[match.start():match.end()] 88 | 89 | end_context_ = takewhile(lambda c: c != "\n", match.string[match.end():]) 90 | end_context = "".join(end_context_) 91 | 92 | context = (start_context + middle_context + end_context).strip() 93 | new_lines.extend([context, ""]) 94 | backlinks_str = "\n".join(new_lines) 95 | return f"{content}\n# Backlinks\n{backlinks_str}\n" 96 | 97 | 98 | def format_link(string: str, link_prefix="") -> str: 99 | """Transform a RoamResearch-like link to a Markdown link. 100 | 101 | @param link_prefix: Add the given prefix before all links. 102 | WARNING: not robust to special characters. 
103 | """ 104 | # Regex are read-only and can't parse [[[[recursive]] [[links]]]], but they do the job. 105 | # We use a special syntax for links that can have SPACES in them 106 | # Format internal reference: [[mynote]] 107 | string = re.sub(r"\[\[" # We start with [[ 108 | # TODO: manage a single ] in the tag 109 | r"([^\]\n]+)" # Everything except ] 110 | r"\]\]", 111 | rf"[\1](<{link_prefix}\1.md>)", 112 | string, flags=re.MULTILINE) 113 | 114 | # Format hashtags: #mytag 115 | string = re.sub(r"#([a-zA-Z0-9_-]+)", 116 | rf"[\1](<{link_prefix}\1.md>)", 117 | string, flags=re.MULTILINE) 118 | 119 | # Format attributes 120 | string = re.sub(r"(^ *- )" # Match the beginning, like ' - ' 121 | r"(([^:\n]|:[^:\n])+)" # Match everything except :: 122 | r"::", 123 | rf"\1**[\2](<{link_prefix}\2.md>):**", # Format Markdown link 124 | string, flags=re.MULTILINE) 125 | return string 126 | -------------------------------------------------------------------------------- /roam_to_git/__main__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import os 4 | import sys 5 | import time 6 | from pathlib import Path 7 | 8 | import git 9 | from dotenv import load_dotenv 10 | from loguru import logger 11 | 12 | from roam_to_git.formatter import read_markdown_directory, format_markdown 13 | from roam_to_git.fs import reset_git_directory, save_files, unzip_and_save_archive, \ 14 | commit_git_directory, push_git_repository, create_temporary_directory 15 | from roam_to_git.scrapping import scrap, Config, ROAM_FORMATS 16 | 17 | CUSTOM_FORMATS = ("formatted",) 18 | ALL_FORMATS = ROAM_FORMATS + CUSTOM_FORMATS 19 | DEFAULT_FORMATS = ROAM_FORMATS[:2] + CUSTOM_FORMATS # exclude EDN from default formats 20 | 21 | 22 | # https://stackoverflow.com/a/41153081/3262054 23 | # Extend action is only available in Python 3.8+ 24 | class ExtendAction(argparse.Action): 25 | 26 | def __call__(self, parser, namespace, 
values, option_string=None): 27 | items = getattr(namespace, self.dest) or [] 28 | items.extend(values) 29 | setattr(namespace, self.dest, items) 30 | 31 | 32 | @logger.catch(reraise=True) 33 | def main(): 34 | logger.trace("Entrypoint of roam-to-git") 35 | 36 | parser = argparse.ArgumentParser() 37 | parser.add_argument("directory", default=None, nargs="?", 38 | help="Directory where your notes are stored. Defaults to notes/") 39 | parser.add_argument("--debug", action="store_true", 40 | help="Activate various debug-oriented modes") 41 | parser.add_argument("--gui", action="store_true", 42 | help="Help debug by opening the browser in the foreground.") 43 | parser.add_argument("--database", default=None, 44 | help="If you have multiple Roam databases, select the one you want to save. " 45 | "Can also be configured with env variable ROAMRESEARCH_DATABASE.") 46 | parser.add_argument("--skip-git", action="store_true", 47 | help="Consider the repository as just a directory, and don't do any " 48 | "git-related action.") 49 | parser.add_argument("--skip-push", action="store_true", 50 | help="Don't git push after commit.") 51 | parser.add_argument("--sleep-duration", type=float, default=2., 52 | help="Duration to wait for the interface. We wait 100x that duration for " 53 | "Roam to load. Increase it if Roam servers are slow, but be careful " 54 | "with the free tier of Github Actions.") 55 | parser.add_argument("--browser", default="firefox", 56 | help="Browser to use for scraping in Selenium.") 57 | parser.add_argument("--browser-arg", 58 | help="Flags to pass through to launched browser.", 59 | action='append') 60 | parser.add_argument("--formats", "-f", action=ExtendAction, nargs="+", type=str, 61 | help="Which formats to save. Options include json, markdown, formatted, " 62 | "and edn. Note that if only formatted is specified, the markdown " 63 | "directory will be converted to a formatted directory skipping " 64 | "fetching entirely. 
Also note that if jet is installed, the edn " 65 | "output will be pretty printed allowing for cleaner git diffs.") 66 | args = parser.parse_args() 67 | 68 | if args.directory is None: 69 | git_path = Path("notes").absolute() 70 | else: 71 | git_path = Path(args.directory).absolute() 72 | 73 | if (git_path / ".env").exists(): 74 | logger.info("Loading secrets from {}", git_path / ".env") 75 | load_dotenv(git_path / ".env", override=True) 76 | else: 77 | logger.debug("No secret found at {}", git_path / ".env") 78 | if "ROAMRESEARCH_USER" not in os.environ or "ROAMRESEARCH_PASSWORD" not in os.environ: 79 | logger.error("Please define ROAMRESEARCH_USER and ROAMRESEARCH_PASSWORD, " 80 | "in the .env file of your notes repository, or in environment variables") 81 | sys.exit(1) 82 | config = Config(database=args.database, 83 | debug=args.debug, 84 | gui=args.gui, 85 | sleep_duration=float(args.sleep_duration), 86 | browser=args.browser, 87 | browser_args=args.browser_arg) 88 | 89 | if args.skip_git: 90 | repo = None 91 | else: 92 | repo = git.Repo(git_path) 93 | assert not repo.bare # Fail fast if it's not a repo 94 | 95 | if args.formats is None or len(args.formats) == 0: 96 | args.formats = DEFAULT_FORMATS 97 | 98 | if any(f not in ALL_FORMATS for f in args.formats): 99 | logger.error("The format values must be one of {}.", ALL_FORMATS) 100 | sys.exit(1) 101 | 102 | # reset all directories to be modified 103 | for f in args.formats: 104 | reset_git_directory(git_path / f) 105 | 106 | # check if we need to fetch a format from roam 107 | roam_formats = [f for f in args.formats if f in ROAM_FORMATS] 108 | if len(roam_formats) > 0: 109 | with create_temporary_directory(autodelete=not config.debug) as root_zip_path: 110 | root_zip_path = Path(root_zip_path) 111 | scrap(root_zip_path, roam_formats, config) 112 | if config.debug: 113 | logger.debug("waiting for the download...") 114 | time.sleep(20) 115 | return 116 | # Unzip and save all the downloaded files. 
117 | for f in roam_formats: 118 | unzip_and_save_archive(f, root_zip_path / f, git_path / f) 119 | 120 | if "formatted" in args.formats: 121 | formatted = format_markdown(read_markdown_directory(git_path / "markdown")) 122 | save_files("formatted", git_path / "formatted", formatted) 123 | 124 | if repo is not None: 125 | commit_git_directory(repo) 126 | if not args.skip_push: 127 | push_git_repository(repo) 128 | 129 | 130 | if __name__ == "__main__": 131 | main() 132 | -------------------------------------------------------------------------------- /roam_to_git/scrapping.py: -------------------------------------------------------------------------------- 1 | import atexit 2 | import os 3 | import pdb 4 | import sys 5 | import time 6 | from pathlib import Path 7 | from typing import List, Optional 8 | 9 | import psutil 10 | from loguru import logger 11 | from selenium import webdriver 12 | from selenium.webdriver.common.keys import Keys 13 | from selenium.common.exceptions import NoSuchElementException, StaleElementReferenceException 14 | 15 | ROAM_FORMATS = ("json", "markdown", "edn") 16 | 17 | 18 | class Browser: 19 | FIREFOX = "Firefox" 20 | PHANTOMJS = "PhantomJS" 21 | CHROME = "Chrome" 22 | 23 | def __init__(self, browser, output_directory, headless=True, debug=False): 24 | if browser == Browser.FIREFOX: 25 | logger.trace("Configure Firefox Profile Firefox") 26 | firefox_profile = webdriver.FirefoxProfile() 27 | 28 | firefox_profile.set_preference("browser.download.folderList", 2) 29 | firefox_profile.set_preference("browser.download.manager.showWhenStarting", False) 30 | firefox_profile.set_preference("browser.download.dir", str(output_directory)) 31 | firefox_profile.set_preference( 32 | "browser.helperApps.neverAsk.saveToDisk", "application/zip") 33 | 34 | logger.trace("Configure Firefox Profile Options") 35 | firefox_options = webdriver.FirefoxOptions() 36 | if headless: 37 | logger.trace("Set Firefox as headless") 38 | firefox_options.headless = True 39 
| 40 | logger.trace("Start Firefox") 41 | self.browser = webdriver.Firefox(firefox_profile=firefox_profile, 42 | firefox_options=firefox_options) 43 | elif browser == Browser.PHANTOMJS: 44 | raise NotImplementedError() 45 | # TODO configure 46 | # self.browser = webdriver.PhantomJS() 47 | elif browser == Browser.CHROME: 48 | raise NotImplementedError() 49 | # TODO configure 50 | # self.browser = webdriver.Chrome() 51 | else: 52 | raise ValueError(f"Invalid browser '{browser}'") 53 | 54 | self.debug = debug 55 | 56 | def get(self, url): 57 | if self.debug: 58 | try: 59 | self.browser.get(url) 60 | except Exception: 61 | pdb.set_trace() 62 | else: 63 | self.browser.get(url) 64 | 65 | def find_element_by_css_selector(self, css_selector, check=True) -> "HTMLElement": 66 | try: 67 | element = self.browser.find_element_by_css_selector(css_selector) 68 | except NoSuchElementException: 69 | if self.debug and check: 70 | pdb.set_trace() 71 | raise 72 | # Always wrap the result so debug and non-debug calls return the same type 73 | return HTMLElement(element, debug=self.debug) 74 | 75 | def find_element_by_link_text(self, text) -> "HTMLElement": 76 | elements = self.browser.find_elements_by_link_text(text) 77 | if len(elements) != 1: 78 | if self.debug: 79 | pdb.set_trace() 80 | elements_str = [e.text for e in elements] 81 | raise ValueError( 82 | f"Got {len(elements)} elements, expected 1 for {text}: {elements_str}") 83 | element, = elements 84 | return HTMLElement(element, debug=self.debug) 85 | 86 | def close(self): 87 | self.browser.close() 88 | 89 | 90 | class HTMLElement: 91 | def __init__(self, html_element: webdriver.remote.webelement.WebElement, debug=False): 92 | self.html_element = html_element 93 | self.debug = debug 94 | 95 | def click(self): 96 | if self.debug: 97 | try: 98 | return self.html_element.click() 99 | except Exception: 100 | pdb.set_trace() 101 | else: 102 | return self.html_element.click() 103 | 104 | def send_keys(self, keys: str): 105 | if self.debug: 106 | 
try: 107 | return self.html_element.send_keys(keys) 108 | except Exception: 109 | pdb.set_trace() 110 | else: 111 | return self.html_element.send_keys(keys) 112 | 113 | @property 114 | def text(self) -> str: 115 | return self.html_element.text 116 | 117 | 118 | class Config: 119 | def __init__(self, 120 | browser: str, 121 | database: Optional[str], 122 | debug: bool, 123 | gui: bool, 124 | sleep_duration: float = 2., 125 | browser_args: Optional[List[str]] = None): 126 | self.user = os.environ["ROAMRESEARCH_USER"] 127 | self.password = os.environ["ROAMRESEARCH_PASSWORD"] 128 | assert self.user 129 | assert self.password 130 | if database: 131 | self.database: Optional[str] = database 132 | else: 133 | self.database = os.environ["ROAMRESEARCH_DATABASE"] 134 | assert self.database, "Please define the Roam database you want to backup." 135 | self.debug = debug 136 | self.gui = gui 137 | self.sleep_duration = sleep_duration 138 | self.browser = getattr(Browser, browser.upper()) 139 | self.browser_args = (browser_args or []) 140 | 141 | 142 | def download_rr_archive(output_type: str, 143 | output_directory: Path, 144 | config: Config, 145 | slow_motion=10, 146 | ): 147 | logger.debug("Creating browser") 148 | browser = Browser(browser=config.browser, 149 | headless=not config.gui, 150 | debug=config.debug, 151 | output_directory=output_directory) 152 | 153 | if config.debug: 154 | pass 155 | try: 156 | return _download_rr_archive(browser, output_type, output_directory, config) 157 | except (KeyboardInterrupt, SystemExit): 158 | logger.debug("Closing browser on interrupt {}", output_type) 159 | browser.close() 160 | logger.debug("Closed browser {}", output_type) 161 | raise 162 | finally: 163 | logger.debug("Closing browser {}", output_type) 164 | browser.close() 165 | logger.debug("Closed browser {}", output_type) 166 | 167 | 168 | def _download_rr_archive(browser: Browser, 169 | output_type: str, 170 | output_directory: Path, 171 | config: Config, 172 | ): 173 | 
"""Download an archive in RoamResearch. 174 | 175 | :param output_type: Download JSON or Markdown or EDN 176 | :param output_directory: Directory where to stock the outputs 177 | """ 178 | signin(browser, config, sleep_duration=config.sleep_duration) 179 | 180 | if config.database: 181 | go_to_database(browser, config.database) 182 | 183 | logger.debug("Wait for interface to load") 184 | dot_button = None 185 | for _ in range(100): 186 | # Starting is a little bit slow, so we wait for the button that signal it's ok 187 | time.sleep(config.sleep_duration) 188 | try: 189 | dot_button = browser.find_element_by_css_selector(".bp3-icon-more", check=False) 190 | break 191 | except NoSuchElementException: 192 | pass 193 | 194 | # If we have multiple databases, we will be stuck. Let's detect that. 195 | time.sleep(config.sleep_duration) 196 | try: 197 | strong = browser.find_element_by_css_selector("strong", check=False) 198 | except NoSuchElementException: 199 | continue 200 | if "database's you are an admin of" == strong.text.lower(): 201 | logger.error( 202 | "You seems to have multiple databases. Please select it with the option " 203 | "--database") 204 | sys.exit(1) 205 | 206 | assert dot_button is not None, "All roads leads to Roam, but that one is too long. Try " \ 207 | "again when Roam servers are faster." 
208 | 209 | # Click on something empty to remove the possible popup 210 | # "Sync Quick Capture Notes with Workspace" 211 | # TODO browser.mouse.click(0, 0) 212 | 213 | dot_button.click() 214 | 215 | logger.debug("Launch download popup") 216 | export_all = browser.find_element_by_link_text("Export All") 217 | export_all.click() 218 | time.sleep(config.sleep_duration) 219 | 220 | # Configure download type 221 | dropdown_button = browser.find_element_by_css_selector(".bp3-dialog .bp3-button-text") 222 | if output_type.lower() != dropdown_button.text.lower(): 223 | logger.debug("Changing output type to {}", output_type) 224 | dropdown_button.click() 225 | output_type_elem = browser.find_element_by_link_text(output_type.upper()) 226 | output_type_elem.click() 227 | 228 | # defensive check 229 | assert dropdown_button.text.lower() == output_type.lower(), (dropdown_button.text, output_type) 230 | 231 | # Click on "Export All" 232 | export_all_confirm = browser.find_element_by_css_selector(".bp3-intent-primary") 233 | export_all_confirm.click() 234 | 235 | logger.debug("Waiting for download of {} to {}", output_type, output_directory) 236 | for i in range(1, 60 * 10): 237 | time.sleep(1) 238 | if i % 60 == 0: 239 | logger.debug("Keep waiting for {}, {}s elapsed", output_type, i) 240 | for file in output_directory.iterdir(): 241 | if file.name.endswith(".zip"): 242 | logger.debug("File {} found for {}", file, output_type) 243 | time.sleep(1) 244 | return 245 | logger.debug("Waited too long for {}", output_type) 246 | raise FileNotFoundError(f"Impossible to download {output_type} in {output_directory}") 247 | 248 | 249 | def signin(browser: Browser, config: Config, sleep_duration=1.): 250 | """Sign in to Roam""" 251 | logger.debug("Opening signin page") 252 | browser.get('https://roamresearch.com/#/signin') 253 | 254 | logger.debug("Waiting for email and password fields") 255 | while True: 256 | try: 257 | email_elem = browser.find_element_by_css_selector("input[name='email']", 
check=False) 258 | passwd_elem = browser.find_element_by_css_selector("input[name='password']") 259 | 260 | logger.debug("Fill email '{}'", config.user) 261 | email_elem.send_keys(config.user) 262 | 263 | logger.debug("Fill password") 264 | passwd_elem.send_keys(config.password) 265 | 266 | logger.debug("Defensive check: verify that the user input field is correct") 267 | time.sleep(sleep_duration) 268 | email_elem = browser.find_element_by_css_selector("input[name='email']", check=False) 269 | if email_elem.html_element.get_attribute('value') != config.user: 270 | continue 271 | 272 | logger.debug("Activating sign-in") 273 | passwd_elem.send_keys(Keys.RETURN) 274 | break 275 | except NoSuchElementException: 276 | logger.trace("NoSuchElementException: Retry getting the email field") 277 | time.sleep(1) 278 | except StaleElementReferenceException: 279 | logger.trace("StaleElementReferenceException: Retry getting the email field") 280 | time.sleep(1) 281 | 282 | 283 | def go_to_database(browser, database): 284 | """Go to the database page""" 285 | url = f'https://roamresearch.com/#/app/{database}' 286 | logger.debug(f"Load database from url '{url}'") 287 | browser.get(url) 288 | 289 | 290 | def _kill_child_process(timeout=50): 291 | procs = psutil.Process().children(recursive=True) 292 | if not procs: 293 | return 294 | logger.debug("Terminate child process {}", procs) 295 | for p in procs: 296 | try: 297 | p.terminate() 298 | except psutil.NoSuchProcess: 299 | pass 300 | gone, still_alive = psutil.wait_procs(procs, timeout=timeout) 301 | if still_alive: 302 | logger.warning(f"Kill child process {still_alive} that was still alive after " 303 | f"'timeout={timeout}' from 'terminate()' command") 304 | for p in still_alive: 305 | try: 306 | p.kill() 307 | except psutil.NoSuchProcess: 308 | pass 309 | 310 | 311 | def scrap(zip_path: Path, formats: List[str], config: Config): 312 | # Register to always kill child process when the script close, to not have zombie process. 
313 | # TODO: is it still needed with Selenium? 314 | if not config.debug: 315 | atexit.register(_kill_child_process) 316 | 317 | for f in formats: 318 | format_zip_path = zip_path / f 319 | format_zip_path.mkdir(exist_ok=True) 320 | download_rr_archive(f, format_zip_path, config=config) 321 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Automatic RoamResearch backup 2 | 3 | [![Roam Research backup](https://github.com/MatthieuBizien/roam-to-git-demo/workflows/Roam%20Research%20backup/badge.svg)](https://github.com/MatthieuBizien/roam-to-git-demo/actions) 4 | [![roam-to-git tests.py](https://github.com/MatthieuBizien/roam-to-git/workflows/roam-to-git%20tests.py/badge.svg)](https://github.com/MatthieuBizien/roam-to-git/actions) 5 | 6 | This script helps you back up your [RoamResearch](https://roamresearch.com/) graphs! 7 | 8 | This script automatically: 9 | - Downloads a **markdown archive** of your RoamResearch workspace 10 | - Downloads a **json archive** of your RoamResearch workspace 11 | - Downloads the full **EDN** of your RoamResearch workspace 12 | - Unzips them to your git directory 13 | - Formats your markdown, including **backlinks** 14 | - **Commits and pushes** the difference to GitHub 15 | 16 | # What's new 17 | 18 | **V.02:** 19 | - Uses the Selenium library; roam-to-git seems to be much faster and more stable 🔥 20 | - Downloads the EDN archive 21 | 22 | # Demo 23 | [See it in action!](https://github.com/MatthieuBizien/roam-to-git-demo). This repo is updated using roam-to-git. 24 | 25 | 🚀🚀 **NEW** 🚀🚀 : The [Unofficial backup](https://github.com/MatthieuBizien/RoamResearch-offical-help/) 26 | of the official [RoamResearch Help Database](https://roamresearch.com/#/app/help) 27 | 28 | # Why use it 29 | 30 | - You have a **backup** if RoamResearch loses some of your data. 31 | - You have a **history** of your notes. 
32 | - You can **browse** your GitHub repository easily with a mobile device 33 | 34 | 35 | # Use it with GitHub Actions (recommended) 36 | 37 | **Note**: [Erik Newhard's guide](https://eriknewhard.com/blog/backup-roam-in-github) shows an easy way of setting up GitHub Actions without using the CLI. 38 | 39 | ## Create a (private) GitHub repository for all your notes 40 | 41 | With [gh](https://github.com/cli/cli): `gh repo create notes` (yes, it's private) 42 | 43 | Or [manually](https://help.github.com/en/github/getting-started-with-github/create-a-repo) 44 | 45 | ## Configure GitHub secrets 46 | 47 | - Go to github.com/your/repository/settings/secrets 48 | 49 | 50 | ##### Regarding Google Account Authorization 51 | 52 | Due to the limitations of OAuth and complexities with tokens, we are unable to snapshot accounts that are set up with the *Login with Google* option as of now. 53 | 54 | To set up backup in this case, you will need to *create*(not exactly) a native account from your old Google Account, which is as simple as using the reset password link found in Roam. 55 | 56 | ![image](https://user-images.githubusercontent.com/46789005/99179188-24482f00-2741-11eb-9c24-df7bb8707709.png) 57 | 58 | Once you've reset your password, use the following steps to finish setting up your backup! 
59 | 60 | 61 | ### Configuring GitHub Secrets 62 | 63 | Add 3 (separate) secrets where the names are 64 | 65 | `ROAMRESEARCH_USER` 66 | 67 | `ROAMRESEARCH_PASSWORD` 68 | 69 | `ROAMRESEARCH_DATABASE` 70 | 71 | - Refer to [env.template](env.template) for more information 72 | 73 | - when inserting the information, there is no need for quotations or assignments 74 | 75 | ![image](https://user-images.githubusercontent.com/173090/90904133-2cf1c900-e3cf-11ea-960d-71d0543b8158.png) 76 | 77 | 78 | ## Add GitHub action 79 | 80 | ``` 81 | cd notes 82 | mkdir -p .github/workflows/ 83 | curl https://raw.githubusercontent.com/MatthieuBizien/roam-to-git-demo/master/.github/workflows/main.yml > \ 84 | .github/workflows/main.yml 85 | git add .github/workflows/main.yml 86 | git commit -m "Add github/workflows/main.yml" 87 | git push --set-upstream origin master 88 | ``` 89 | 90 | ## Check that the GitHub Action works 91 | 92 | - Go to github.com/your/repository/actions 93 | - Your CI job should start in a few seconds 94 | 95 | ### Note: 96 | 97 | If the backup does not automatically start, try pushing to the repository again 98 | 99 | # Use with GitLab CI 100 | 101 | This section is based on this article on GitLab blog: 102 | https://about.gitlab.com/blog/2017/11/02/automating-boring-git-operations-gitlab-ci/ 103 | 104 | ## Create a project 105 | 106 | Create a project for your notes. We will refer to it as 107 | `YOUR_USER/YOUR_PROJECT`. 108 | 109 | ## Create key pair for pushing commits 110 | 111 | Generate a new key pair that will be used by the CI job to push the new commits. 112 | 113 | ```bash 114 | $ ssh-keygen -f gitlab-ci-commit 115 | Generating public/private rsa key pair. 
116 | Enter passphrase (empty for no passphrase): 117 | Enter same passphrase again: 118 | Your identification has been saved in gitlab-ci-commit 119 | Your public key has been saved in gitlab-ci-commit.pub 120 | The key fingerprint is: 121 | SHA256:HoQUcbUPJU2Ur78EineqA6IVljk8ZD9XIxiGFUrBues agentydragon@pop-os 122 | The key's randomart image is: 123 | +---[RSA 3072]----+ 124 | | .o=O*..o++. | 125 | | .*+.o. o+o | 126 | | +.=. .oo. . | 127 | | X o.. o . | 128 | | . = oS o. | 129 | | + .. o ... | 130 | | + . .o o ... | 131 | | . E .. o .. | 132 | | .o. .. | 133 | +----[SHA256]-----+ 134 | ``` 135 | 136 | DO NOT commit the private key (`gitlab-ci-commit`). 137 | 138 | ## Add the public key as a deploy key 139 | 140 | This step allows the CI job to push when identified by the public key. 141 | 142 | Go to Project Settings → Repository (`https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/settings/repository`) → Deploy Keys. 143 | 144 | Paste the content of the public key file (`gitlab-ci-commit.pub`), and 145 | enable write access for the public key. 146 | 147 | ![](img/gitlab-add-key.png) 148 | 149 | Click "Add Key". 150 | 151 | ## Add the private key as a CI variable 152 | 153 | This step gives the CI job the private key so it can authenticate against 154 | GitLab. 155 | 156 | Go to Project Settings → Pipelines (`https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/settings/ci_cd`) → Variables. 157 | 158 | Click "Add variable", with name `GIT_SSH_PRIV_KEY`, and paste in the content 159 | of the private key file (`gitlab-ci-commit`). You probably want to mark 160 | "Protect". You might want to look up GitLab docs on protected branches. 161 | 162 | ![](img/gitlab-add-variable.png) 163 | 164 | Click "Add variable". 
165 | 166 | Also add the following variables with appropriate values: 167 | 168 | * `ROAMRESEARCH_USER` 169 | * `ROAMRESEARCH_PASSWORD` 170 | * `ROAMRESEARCH_DATABASE` 171 | 172 | ## Create a `gitlab_known_hosts` file 173 | 174 | In your repo, create and commit a `gitlab_known_hosts` file containing the 175 | needed SSH `known_hosts` entry/entries for the GitLab instance. It will be used 176 | by the CI job to check that it's talking to the right server. 177 | 178 | This should work as of 2021-01-04: 179 | 180 | ``` 181 | # Generated by @agentydragon at 2021-01-04 at by `git fetch`ing a GitLab repo 182 | # with an empty ~/.ssh/known_hosts file. 183 | 184 | |1|zIQlCRxv+s9xhVCAfGL2nvaZqdY=|jbPpD9GNaS/9Z4iJzE9gw2XCo20= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY= 185 | |1|uj60xYhsW2vAM8BpQ+xZz51ZarQ=|BNIJlvu4rNcmrxd60fkqpChrf9A= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY= 186 | ``` 187 | 188 | ## Create `.gitlab-ci.yml` 189 | 190 | Create a `.gitlab-ci.yml` file: 191 | 192 | ```yaml 193 | backup: 194 | when: manual 195 | before_script: 196 | - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )' 197 | - eval $(ssh-agent -s) 198 | - ssh-add <(echo "$GIT_SSH_PRIV_KEY") 199 | - git config --global user.email "roam-to-git@example.org" 200 | - git config --global user.name "roam-to-git automated backup" 201 | - mkdir -p ~/.ssh 202 | - cat gitlab_known_hosts >> ~/.ssh/known_hosts 203 | 204 | # (Taken from: https://github.com/buildkite/docker-puppeteer/blob/master/Dockerfile) 205 | # We install Chrome to get all the OS level dependencies, but Chrome itself 206 | # is not actually used as it's packaged in the pyppeteer library. 
207 | # Alternatively, we could include the entire dep list ourselves 208 | # (https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md#chrome-headless-doesnt-launch-on-unix) 209 | # but that seems too easy to get out of date. 210 | - apt-get install -y wget gnupg ca-certificates 211 | - wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - 212 | - echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list 213 | - apt-get update 214 | - apt-get install google-chrome-stable libxss1 python3-pip -y 215 | 216 | - pip3 install git+https://github.com/MatthieuBizien/roam-to-git.git 217 | 218 | # TODO(agentydragon): Create and publish Docker image with all deps already 219 | # installed. 220 | script: 221 | # Need to clone the repo again over SSH, since by default GitLab clones 222 | # the repo for CI over HTTPS, for which we cannot authenticate pushes via 223 | # pubkey. 224 | - git clone --depth=1 git@gitlab.com:YOUR_USER/YOUR_PROJECT 225 | - cd YOUR_PROJECT 226 | 227 | # --no-sandbox needed because Chrome refuses to run as root without it. 228 | - roam-to-git --browser-arg=--no-sandbox . 229 | ``` 230 | 231 | Commit and push. 232 | 233 | ## To run the pipeline 234 | 235 | Go to `https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/jobs`, 236 | press the Play button: 237 | 238 | ![](img/gitlab-start-job.png) 239 | 240 | Shortly after you should see an automated commit added to master. 241 | 242 | ![](img/gitlab-done.png) 243 | 244 | ## Scheduled backups 245 | 246 | To run the script, say, every hour, go to the project's 247 | CI / CD → Schedules (`https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/pipeline_schedules`). 
248 | Click "New schedule", fill out and submit (like this for every 15 minutes): 249 | 250 | ![](img/gitlab-schedule.png) 251 | 252 | # Use it locally 253 | 254 | **Note**: if your file system is not case-sensitive, you will not backup notes that have the same name in different 255 | cases 256 | 257 | ## Install Roam-To-Git 258 | With [pipx](https://github.com/pipxproject/pipx) 259 | (if you don't know pipx, you should look at it, it's wonderful!) 260 | 261 | `pipx install git+https://github.com/MatthieuBizien/roam-to-git.git` 262 | 263 | ## Create a (private) GitHub repository for all your notes 264 | 265 | With [gh](https://github.com/cli/cli): `gh repo create notes` (yes, it's private) 266 | 267 | Or [manually](https://help.github.com/en/github/getting-started-with-github/create-a-repo) 268 | 269 | Then run `git push --set-upstream origin master` 270 | 271 | ## Configure environment variables 272 | 273 | - `curl https://raw.githubusercontent.com/MatthieuBizien/roam-to-git/master/env.template > notes/.env` 274 | - Fill the .env file: `vi .env` 275 | - Ignore it: `echo .env > notes/.gitignore; cd notes; git add .gitignore; git commit -m "Initial commit"` 276 | 277 | ## Manual backup 278 | 279 | - Run the script: `roam-to-git notes/` 280 | - Check your GitHub repository, it should be filled with your notes :) 281 | 282 | ## Automatic backup 283 | 284 | One-liner to run it with a [cron](https://en.wikipedia.org/wiki/Cron) every hour: 285 | `echo "0 * * * * '$(which roam-to-git)' '$(pwd)/notes'" | crontab -` 286 | 287 | NB: there are [issues](https://github.com/MatthieuBizien/roam-to-git/issues/43) on Mac with a crontab. 288 | 289 | # Debug 290 | 291 | Making `roam-to-git` foolproof is hard, as it depends on Roam, on GitHub Action or the local environment, 292 | on software not very stable (`pyppeteer` we still love you 😉 ) 293 | and on the correct user configuration. 
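
The user-configuration part can be ruled out before anything else runs. Here is a minimal sketch, assuming the three variable names listed in `env.template`; the helper name `missing_roam_variables` is hypothetical and not part of roam-to-git, and it only inspects the mapping it is given (the process environment by default), so it does not read a `.env` file.

```python
import os
from typing import List, Mapping

# Variable names taken from env.template
REQUIRED_VARIABLES = (
    "ROAMRESEARCH_USER",
    "ROAMRESEARCH_PASSWORD",
    "ROAMRESEARCH_DATABASE",
)


def missing_roam_variables(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in `env`.

    Hypothetical helper for illustration only; roam-to-git does not ship it.
    """
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

Calling `missing_roam_variables()` with no argument checks the current process environment; an empty list means all three variables are present. Note that roam-to-git itself can also take the database from the command line, so a missing `ROAMRESEARCH_DATABASE` is not necessarily fatal.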
294 | 295 | For debugging, please try the following: 296 | 297 | - Check that the environment variables `ROAMRESEARCH_USER`, `ROAMRESEARCH_PASSWORD`, `ROAMRESEARCH_DATABASE` are set up correctly 298 | - Log in to Roam using the username and the password. 299 | You may want to request a new password if you have enabled Google Login, as it solved some user problems. 300 | - Run `roam-to-git --debug` to check that the authentication and download work 301 | - Look at the traceback 302 | - Look for similar issues 303 | - If nothing else works, create a new issue with as many details as possible. 304 | I will try my best to understand and help you, no SLA promised 😇 305 | 306 | # Task list 307 | 308 | ## Backup all RoamResearch data 309 | 310 | - [x] Download automatically from RoamResearch 311 | - [x] Create Cron 312 | - [x] Write detailed README 313 | - [x] Publish the repository on GitHub 314 | - [ ] Download images (they are currently visible in GitHub, but not in the archive so not saved in the repository 😕) 315 | 316 | ## Format the backup to have a good UI 317 | 318 | ### Link formatting to be compatible with GitHub markdown 319 | - [x] Format `[[links]]` 320 | - [x] Format `#links` 321 | - [x] Format `attribute::` 322 | - [ ] Format `[[ [[link 1]] [[link 2]] ]]` 323 | - [ ] Format `((link))` 324 | 325 | ### Backlink formatting 326 | - [x] Add backlinks reference to the notes files 327 | - [x] Integrate the context into the backlink 328 | - [x] Manage `/` in file names 329 | 330 | ### Other formatting 331 | - [x] Format `{{TODO}}` to be compatible with GitHub markdown 332 | - [ ] Format `{{query}}` 333 | 334 | ## Make it for others 335 | - [x] Push it to GitHub 336 | - [x] Add example repository 337 | - [x] Make the backup directory configurable 338 | - [ ] Publicize it 339 | - [x] [RoamResearch Slack](https://roamresearch.slack.com/) [thread](https://roamresearch.slack.com/archives/CN5MK4D2M/p1588670473431200) 340 | - [ ] [RoamResearch 
Reddit](https://www.reddit.com/r/RoamResearch/) 341 | - [ ] Twitter 342 | 343 | ## Some ideas, I don't need it, but PR welcome 😀 344 | - [ ] Test it/make it work on Windows 345 | - [x] Pre-configure a CI server so it can run every hour without a computer 346 | Thanks @Stvad for [#4](https://github.com/MatthieuBizien/roam-to-git/issues/4)! 347 | --------------------------------------------------------------------------------