├── sem_layers.png
├── sem_layers.graffle
├── Use Cases
    └── README.md
├── Requirements
    └── README.md
├── Tools
    └── README.md
├── LICENSE
├── .gitignore
└── README.md


/sem_layers.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ceteri/pkg/HEAD/sem_layers.png


--------------------------------------------------------------------------------
/sem_layers.graffle:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ceteri/pkg/HEAD/sem_layers.graffle


--------------------------------------------------------------------------------
/Use Cases/README.md:
--------------------------------------------------------------------------------
1 | ## Current Use Cases for the Group
2 | 
3 |   * Personal knowledge collection
4 |   * Sharing links to add context to group discussion
5 |   * Topical conversation on KG-related subjects
6 |   * Collaborating on specific projects
7 |   * Publishing tutorials
8 | 


--------------------------------------------------------------------------------
/Requirements/README.md:
--------------------------------------------------------------------------------
 1 | ## What are the PKG group’s requirements/needs from a tool stack?
 2 | 
 3 |   * Asynchronous project chat
 4 |   * Shared document for note taking
 5 |   * Creating semantic markups
 6 |   * Shared storage
 7 |   * Versioning
 8 |   * Publishing on the web
 9 |   * Knowledge graph
10 | 


--------------------------------------------------------------------------------
/Tools/README.md:
--------------------------------------------------------------------------------
 1 | ## Some tools that can be useful for PKGs
 2 | 
 3 | **Points to think about:**
 4 | * Properties
 5 | * Where on the stack each tool fits.
 6 | 
 7 | **List**
 8 | * [Athens Research](https://github.com/athensresearch/athens)
 9 | * [Dendron](https://dendron.so/)
10 | * [Dokieli](https://dokie.li/)
11 | * [Foam](https://foambubble.github.io/foam/)
12 | * [logseq](https://logseq.com/)
13 | * [Obsidian](https://obsidian.md/)
14 | * [org-roam](https://www.orgroam.com/)
15 | * [Remnote](https://www.remnote.io/)
16 | * [Roam Research](https://roamresearch.com/)
17 | * [tiddlyroam](https://tiddlyroam.org/)
18 | 
19 | See also [this list](https://www.notion.so/Artificial-Brain-Networked-with-linear-notebook-app-a131b468fc6f43218fb8105430304709)
20 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2021 Paco Nathan
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | build/
 12 | develop-eggs/
 13 | dist/
 14 | downloads/
 15 | eggs/
 16 | .eggs/
 17 | lib/
 18 | lib64/
 19 | parts/
 20 | sdist/
 21 | var/
 22 | wheels/
 23 | pip-wheel-metadata/
 24 | share/python-wheels/
 25 | *.egg-info/
 26 | .installed.cfg
 27 | *.egg
 28 | MANIFEST
 29 | 
 30 | # PyInstaller
 31 | #  Usually these files are written by a python script from a template
 32 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 33 | *.manifest
 34 | *.spec
 35 | 
 36 | # Installer logs
 37 | pip-log.txt
 38 | pip-delete-this-directory.txt
 39 | 
 40 | # Unit test / coverage reports
 41 | htmlcov/
 42 | .tox/
 43 | .nox/
 44 | .coverage
 45 | .coverage.*
 46 | .cache
 47 | nosetests.xml
 48 | coverage.xml
 49 | *.cover
 50 | *.py,cover
 51 | .hypothesis/
 52 | .pytest_cache/
 53 | 
 54 | # Translations
 55 | *.mo
 56 | *.pot
 57 | 
 58 | # Django stuff:
 59 | *.log
 60 | local_settings.py
 61 | db.sqlite3
 62 | db.sqlite3-journal
 63 | 
 64 | # Flask stuff:
 65 | instance/
 66 | .webassets-cache
 67 | 
 68 | # Scrapy stuff:
 69 | .scrapy
 70 | 
 71 | # Sphinx documentation
 72 | docs/_build/
 73 | 
 74 | # PyBuilder
 75 | target/
 76 | 
 77 | # Jupyter Notebook
 78 | .ipynb_checkpoints
 79 | 
 80 | # IPython
 81 | profile_default/
 82 | ipython_config.py
 83 | 
 84 | # pyenv
 85 | .python-version
 86 | 
 87 | # pipenv
 88 | #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 89 | #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 90 | #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 91 | #   install all needed dependencies.
 92 | #Pipfile.lock
 93 | 
 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
 95 | __pypackages__/
 96 | 
 97 | # Celery stuff
 98 | celerybeat-schedule
 99 | celerybeat.pid
100 | 
101 | # SageMath parsed files
102 | *.sage.py
103 | 
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 | 
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 | 
117 | # Rope project settings
118 | .ropeproject
119 | 
120 | # mkdocs documentation
121 | /site
122 | 
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 | 
128 | # Pyre type checker
129 | .pyre/
130 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Personal Knowledge Graph
  2 | 
  3 | Collaboration on descriptions used in personal knowledge graph (PKG) practices.
  4 | 
  5 | 
  6 | ## Semantic Layers
  7 | 
  8 | The following layers are found among processes for developing *knowledge graphs*.
  9 | Ostensibly these descriptions can apply across the range of available tools.
 10 | 
 11 | **Discussion about these layers and iteration on their descriptions is needed.**
 12 | 
 13 | ![semantic layers](https://raw.githubusercontent.com/ceteri/pkg/main/sem_layers.png)
 14 | 
 15 | ### Layer 1: remote storage
 16 | 
 17 | For example, the popular *storage grids* such as Amazon S3, Azure Storage, Google GCS, etc., are at **Layer 1**.
 18 | These are amazingly robust and cost-effective, although relatively "raw" in the sense that they are neither file systems nor databases.
 19 | Also, they are mostly designed for programmers (or applications) to use.
 20 | 
 21 | Use of *remote storage* distinguishes a PKG use case from the trivial case of one person merely using *local storage* on their local computer.
 22 | Its use implies capabilities for collaboration, publishing, disaster recovery, etc.
 23 | 
 24 | ### Layer 2: versioning
 25 | 
 26 | Services such as GitHub, GitLab, etc., are at **Layer 2**.
 27 | These typically bundle the versioning semantics of a tool such as *git* along with a storage grid, then provide ways to publish (e.g., jumping all the way up to **Layer 9**).
 28 | Graph-based data and metadata can be difficult to version – or rather, there are specialized methods that *git* doesn't necessarily understand.
 29 | 
 30 | This work is by definition *transactional* in nature.
 31 | 
 32 | ### Layer 3: markdown
 33 | 
 34 | The *markdown* at **Layer 3** is one among many popular formats.
 35 | It has the benefits of being relatively human-readable, even in its raw form.
 36 | It's also native in Jupyter notebooks, as well as one of the most popular formats for documenting open source projects, and increasingly used among technical publishers as well.
 37 | 
 38 | ### Layer 4: semantic markup
 39 | 
 40 | The *semantic markup* at **Layer 4** begins to add some semantic properties to content formatted in markdown.
 41 | For example, it provides means for adding links and other metadata.
 42 | Services such as [Obsidian](https://obsidian.md/) and [Roam](https://roamresearch.com/) are largely at this layer.
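
As a minimal sketch of what "adding links" can look like in practice, the snippet below pulls `[[wikilink]]` references out of a markdown note and turns them into (source, target) edges. The `[[Target|alias]]` syntax is the convention used by tools such as Obsidian and Roam; the note name, the sample text, and the `extract_links` helper are assumptions made for illustration.

```python
import re

# Matches [[Target]] and [[Target|alias]]; the alias part is optional.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def extract_links(note_name: str, text: str) -> list[tuple[str, str]]:
    """Return (source, target) edges for every wikilink found in `text`."""
    return [(note_name, m.group(1).strip()) for m in WIKILINK.finditer(text)]


# Hypothetical note content, for illustration only.
note = "See [[Knowledge Graph]] and [[RDF|the RDF model]] for background."
edges = extract_links("pkg-notes", note)
print(edges)
```

Each edge is a small piece of metadata layered over otherwise plain markdown, which is the sense in which this layer is "semantic".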
 43 | 
 44 | ### Layer 5: shared editing
 45 | 
 46 | **Layer 5** shared editing is what people commonly associate with Google Docs or Box.
 47 | There's a programming technique called *append-only logs* which makes collaborative editing feasible to manage online.
 48 | These services typically bundle **Layer 1** storage, along with some aspects of **Layer 2** versioning.
 49 | Generally these services lack much awareness about markdown formats specifically, and tend to be MS Word lookalikes.
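
To make the *append-only log* idea concrete, here is a rough sketch, not a description of how any particular service works. Every edit is appended as an immutable record, and the document at any version is rebuilt by replaying the log. The `Op` and `EditLog` names are hypothetical, and the sketch assumes edits arrive one at a time; real collaborative editors also have to reconcile concurrent edits.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Op:
    author: str
    kind: str        # "insert" or "delete"
    pos: int         # character offset at the time of the edit
    text: str = ""   # text to insert (unused for deletes)
    length: int = 0  # number of characters to delete (unused for inserts)


@dataclass
class EditLog:
    ops: list[Op] = field(default_factory=list)

    def append(self, op: Op) -> int:
        """Record an edit; earlier entries are never modified. Returns the new version."""
        self.ops.append(op)
        return len(self.ops)

    def replay(self, version: Optional[int] = None) -> str:
        """Rebuild the document text as of `version` (default: latest)."""
        doc = ""
        for op in self.ops[:version]:
            if op.kind == "insert":
                doc = doc[:op.pos] + op.text + doc[op.pos:]
            elif op.kind == "delete":
                doc = doc[:op.pos] + doc[op.pos + op.length:]
        return doc


log = EditLog()
log.append(Op("alice", "insert", 0, text="Hello world"))
log.append(Op("bob", "insert", 5, text=","))
log.append(Op("alice", "delete", 6, length=6))
print(log.replay())    # latest version
print(log.replay(2))   # document as it stood after the second edit
```

Because nothing is overwritten, the log doubles as a simple version history, which is one way these services end up covering some aspects of **Layer 2**.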
 50 | 
 51 | This work is by definition *transactional* in nature.
 52 | 
 53 | ### Layer 6: shared vocabulary
 54 | 
 55 | The *shared vocabulary* in **Layer 6** is where a project attempts to harmonize its semantic markup with commonly used *controlled vocabularies*, such that the metadata references shared definitions.
 56 | Examples include [DCMI](https://dublincore.org/specifications/dublin-core/dcmi-terms/#) and [Schema.org](https://schema.org/) among many others.
 57 | 
 58 | In aggregate, this is where *ontology* gets described.
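
One way to picture this harmonization, as a sketch under assumptions: local metadata keys are rewritten to term IRIs from DCMI Terms and Schema.org, and any key without a shared definition is reported. The local key names, the sample record, and the `harmonize` helper are hypothetical.

```python
# Hypothetical local metadata keys mapped to terms from shared vocabularies.
VOCAB = {
    "title": "http://purl.org/dc/terms/title",
    "created": "http://purl.org/dc/terms/created",
    "author": "https://schema.org/author",
}


def harmonize(metadata: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Rewrite known keys to shared-vocabulary IRIs; report the keys left unmapped."""
    mapped = {VOCAB[k]: v for k, v in metadata.items() if k in VOCAB}
    unmapped = sorted(k for k in metadata if k not in VOCAB)
    return mapped, unmapped


# Hypothetical note metadata, for illustration only.
mapped, unmapped = harmonize(
    {"title": "PKG notes", "author": "A. Writer", "mood": "curious"}
)
print(mapped)
print(unmapped)
```

The unmapped keys are where a group would need to either adopt an existing term or define its own.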
 59 | 
 60 | ### Layer 7: persistent identifiers
 61 | 
 62 | **Layer 7** uses *persistent identifiers* to "populate" the semantic markup such that content can be referenced globally using unique identifiers.
 63 | Each class of persistent identifier will have some "authority" backing it.
 64 | For example, there are *ISBN* for books, *ISSN* for periodicals, *ORCID* for researchers, *ROR* for organizations, *DOI* for articles, etc.
 65 | Using a *URL* is perhaps the simplest case.
 66 | 
 67 | Alternatively, a given organization may be its own "authority", i.e., it may construct and publish its own identifiers specific to its context.
 68 | This can be performed using [URN](https://en.wikipedia.org/wiki/Uniform_Resource_Name) that are composed of some local identifiers.
 69 | For example, each article posted on LinkedIn has a URN specific to LinkedIn, which gets exposed in public as part of that article's URL (web link).
 70 | In the link `https://www.linkedin.com/feed/update/urn:li:activity:6774890201442582528/`, the URN portion is `urn:li:activity:6774890201442582528`, where the trailing number is a kind of [*uuid*](https://en.wikipedia.org/wiki/Universally_unique_identifier) defined within the context of LinkedIn.
 71 | The other elements composing the URN's structure help clarify the semantics about its domain and interpretation.
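
The structure of such a URN can be taken apart mechanically. The sketch below assumes the general `urn:<namespace>:<namespace-specific string>` layout and finds the URN inside the path of the LinkedIn link quoted above; the `extract_urn` helper and the dictionary keys are illustrative names, not part of any LinkedIn API.

```python
from urllib.parse import urlparse


def extract_urn(url: str) -> dict[str, str]:
    """Find the first path segment starting with 'urn:' and split it into parts."""
    segments = urlparse(url).path.split("/")
    urn = next((s for s in segments if s.lower().startswith("urn:")), None)
    if urn is None:
        raise ValueError(f"no URN found in {url!r}")
    # Split into at most three parts: the 'urn' scheme, the namespace
    # identifier, and the namespace-specific string (which may contain colons).
    scheme, nid, nss = urn.split(":", 2)
    return {"urn": urn, "scheme": scheme, "nid": nid, "nss": nss}


parts = extract_urn(
    "https://www.linkedin.com/feed/update/urn:li:activity:6774890201442582528/"
)
print(parts)
```

Here the namespace (`li`) names the authority, and the remainder (`activity:…`) carries the kind of thing plus its local identifier.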
 72 | 
 73 | While semantic conventions such as [`dct:identifier`](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/identifier) may be used as "catch-alls" for representing persistent identifiers, unless a KG represents each class of identifier uniquely then it probably won't be able to support effective queries, inference, validation, embedding, search, etc.
 74 | In other words, "good enough" representation does not guarantee effective inference down the road in the KG use cases.
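
A small sketch of the difference, using placeholder values rather than real identifiers: when each identifier carries its class, a query such as "everything that has an ORCID" is a simple filter, whereas a catch-all list of strings leaves the class to be guessed from the string's shape. The `Identifier` type, the entity names, and the `find_by_scheme` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    scheme: str  # class of identifier, e.g. "doi", "orcid", "isbn"
    value: str


# Placeholder records: each entity carries identifiers tagged by class.
typed = {
    "paper-1": [Identifier("doi", "10.1234/example.5678")],
    "person-1": [Identifier("orcid", "0000-0000-0000-0000")],
    "book-1": [Identifier("isbn", "000-0-00-000000-0")],
}

# The same data as a "catch-all": the class of each identifier is lost.
catch_all = {name: [i.value for i in ids] for name, ids in typed.items()}


def find_by_scheme(scheme: str) -> list[str]:
    """Return the entities that have at least one identifier of the given class."""
    return sorted(
        name for name, ids in typed.items() if any(i.scheme == scheme for i in ids)
    )


print(find_by_scheme("orcid"))
print(catch_all["person-1"])  # just a string; nothing says it is an ORCID
```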
 75 | 
 76 | ### Layer 8: knowledge graph
 77 | 
 78 | Within **Layer 8** is where the *knowledge graph* work happens.
 79 | Of course, this has its own internal layering: RDF for triples/quads, then RDFS schema for defining properties, then OWL for machine interpretability, and so on.
 80 | 
 81 | Both the **Layer 6** ontology and the **Layer 7** identifiers must exist as "overlays" atop the content – in other words, as semantic annotation – for the **Layer 8** knowledge graph usage to make any sense.
 82 | Sometimes this is described using the relatively dated terms [*TBox*](https://en.wikipedia.org/wiki/Tbox) and [*ABox*](https://en.wikipedia.org/wiki/Abox) respectively, although these may introduce some distorted interpretation.
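
The two overlays can be illustrated with plain tuples standing in for RDF triples; this is a toy sketch and not a substitute for an RDF library. Schema-level statements play the TBox-like role, statements about identified things play the ABox-like role, and a query needs both. The `ex:` prefix and every value below are placeholders.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS = "rdfs:subClassOf"

# Layer 6 overlay: schema-level statements (TBox-like).
tbox = {
    ("ex:Article", RDFS_SUBCLASS, "ex:Document"),
}

# Layer 7 overlay: statements about identified things (ABox-like).
abox = {
    ("doi:10.1234/example", RDF_TYPE, "ex:Article"),
    ("doi:10.1234/example", "dct:creator", "orcid:0000-0000-0000-0000"),
}


def instances_of(cls: str) -> list[str]:
    """Return subjects typed as `cls` or as any (transitive) subclass of `cls`."""
    classes = {cls}
    changed = True
    while changed:  # repeat until no new subclasses are discovered
        changed = False
        for s, p, o in tbox:
            if p == RDFS_SUBCLASS and o in classes and s not in classes:
                classes.add(s)
                changed = True
    return sorted(s for s, p, o in abox if p == RDF_TYPE and o in classes)


print(instances_of("ex:Document"))
```

Without the subclass statement the article would not be found as a document, and without the identifier there would be nothing to find.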
 83 | 
 84 | ### Layer 9: publishing
 85 | 
 86 | Finally, there's a *presentation* layer at the top – roughly similar to network layer models – where *publishing* the KGs occurs.
 87 | Since the W3C standards emerged from the *World Wide Web*, many of their notions (Solid, LDP, etc.) tend to fixate at this layer, without being especially mindful about practical details for some of the underlying foundations.
 88 | 
 89 | This layer is where many KG use cases provide features for public access.
 90 | Publishing may be a matter of:
 91 | 
 92 |   * web-based rendering
 93 |   * search and query capabilities
 94 |   * API access
 95 | 
 96 | ### Misc. Notes
 97 | 
 98 | While the marketing departments of technology vendors tend to promise "all things to all people", in reality few, if any, commercial offerings provide support across the entire stack of these layers.
 99 | 
100 | Effective practices in industry tend to:
101 | 
102 |   * leverage a [*middle-out* strategy](https://answers.knowledgegraph.tech/t/whats-the-difference-between-a-bottom-up-and-a-top-down-ontology-modeling-approach/5064) in lieu of *top-down* EKG practices, where PKG practices may evolve into major components of *middle-out* projects
103 |   * integrate multiple libraries, tools, and services to provide coverage across the stack, depending on the needs of their use cases
104 | 
105 | Notably, the *graph database* vendors tend to focus on **Layer 1** and the *query* aspects (mixed into either **Layer 8** or **Layer 9**) of search, while not providing especially effective solutions for the other layers.
106 | 
107 | ---
108 | 
109 | ## What are the PKG group’s requirements/needs from a tool stack?
110 | 
111 | Check the [Requirements section](https://github.com/ceteri/pkg/tree/main/Requirements)
112 | 
113 | ## Tools
114 | 
115 | Check the [Tools section](https://github.com/ceteri/pkg/tree/main/Tools)
116 | 
117 | 
118 | ## Current Use Cases for the Group
119 | 
120 | Check the [Use Cases section](https://github.com/ceteri/pkg/tree/main/Use%20Cases)
121 | 


--------------------------------------------------------------------------------