├── .github
│   └── workflows
│       └── main.yml
├── .gitignore
├── ARCHITECTURE.md
├── README.md
├── dashboard.py
├── fossil_mastodon
│   ├── __init__.py
│   ├── algorithm.py
│   ├── app
│   │   ├── index.js
│   │   ├── static
│   │   │   ├── htmx.js
│   │   │   ├── logo-light.svg
│   │   │   ├── page.js
│   │   │   ├── style.css
│   │   │   └── work-in-progress.gif
│   │   └── templates
│   │       ├── bad_plugin.html
│   │       ├── base
│   │       │   ├── admin.html
│   │       │   └── page.html
│   │       ├── index.html
│   │       ├── no_algorithm.html
│   │       ├── settings.html
│   │       ├── toot.html
│   │       ├── toot_clusters.html
│   │       └── toot_list.html
│   ├── config.py
│   ├── core.py
│   ├── migrations.py
│   ├── plugin_impl
│   │   ├── __init__.py
│   │   ├── toot_debug.py
│   │   └── topic_cluster.py
│   ├── plugins.py
│   ├── science.py
│   ├── server.py
│   └── ui.py
├── index.html
├── make.sh
├── poetry.lock
└── pyproject.toml
/.github/workflows/main.yml:
--------------------------------------------------------------------------------
1 | name: Upload Python Package to PyPI
2 |
3 | on:
4 |   release:
5 |     types: [published]
6 |
7 | permissions:
8 |   contents: read
9 |
10 | jobs:
11 |   deploy:
12 |
13 |     runs-on: ubuntu-latest
14 |
15 |     steps:
16 |     - uses: actions/checkout@v3
17 |     - name: Set up Python
18 |       uses: actions/setup-python@v3
19 |       with:
20 |         python-version: '3.x'
21 |     - name: Install dependencies
22 |       run: |
23 |         python -m pip install --upgrade pip
24 |         pip install build
25 |     - name: Build package
26 |       run: python -m build
27 |     - name: Publish package
28 |       uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
29 |       with:
30 |         user: __token__
31 |         password: ${{ secrets.PYPI_API_TOKEN }}
32 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__/
2 | .env
3 | fossil.db
4 | dist/
5 |
--------------------------------------------------------------------------------
/ARCHITECTURE.md:
--------------------------------------------------------------------------------
1 |
2 | # It's a *Client*
3 |
4 | Fossil is a Mastodon **client**. While there is a web server that keeps state, it only downloads your home timeline and redisplays it. It doesn't touch posts from accounts or hashtags that you don't follow.
5 |
6 | ```mermaid
7 | graph TD
8 | subgraph fediverse
9 | me[My Mastodon Server]
10 | them[Mastodon Server]
11 | them2[Mastodon Server]
12 | end
13 | them2-->me
14 | them-->me
15 | me-->fossil["Fossil (on my laptop)"]-->phone[My Phone]
16 | ```
17 |
18 | ## How Do I Connect My Phone?
19 |
20 | Traditionally, you'd do this by deploying fossil into a cloud environment. When it runs on my laptop it doesn't have a public IP address, whereas in the cloud
21 | I can leverage infrastructure in AWS or Azure to serve it with a public IP address and/or a full domain name. The options below let your phone reach fossil on your laptop without a cloud deployment.
22 |
23 | ### Option 1: Tailscale
24 | [Tailscale](https://tailscale.com/kb/1017/install?slug=kb&slug=1017&slug=install) sets up a mesh network. It's similar to a corporate VPN, except that it's so easy to install that you can trivially pop it
25 | onto your phone, laptop, home lab, cloud lab, IoT devices, etc.
26 |
27 | The setup is:
28 |
29 | 1. Follow tailscale's directions
30 | 2. Install tailscale on your laptop
31 | 3. Install tailscale on your phone
32 | 4. Run fossil on your laptop, remember the port
33 | 5. Look up your laptop's hostname
34 | 6. Open up safari/chrome/firefox/etc. and paste in the hostname and port: `http://{hostname}:{port}`
35 |
36 | This works seamlessly for yourself, and potentially for sharing with a small number of friends and family.
37 |
38 | ### Option 2: ngrok
39 | [Ngrok](https://ngrok.com/docs/getting-started/) works a bit differently from tailscale. It gives you a public domain name to reach your app, and then tunnels traffic through to your laptop.
40 |
41 | ```mermaid
42 | graph LR
43 | Phone
44 | subgraph internet
45 | ngrok[ngrok domain name]
46 | end
47 | Phone-->ngrok
48 | fossil[fossil on laptop]
49 | ngrok --> fossil
50 | ```
51 |
52 | This works better if you want to share it with a lot of people. You can easily tack on authentication, like OAuth.
53 |
54 |
55 | # Code Architecture
56 | It's a fairly standard htmx-on-python arrangement:
57 |
58 | ```mermaid
59 | graph TD
60 | SQLite[(SQLite)]-->FastAPI[FastAPI on Python]
61 | llm-->FastAPI
62 | FastAPI-->HTML[HTML w/ htmx tags]
63 | ```
64 |
65 | SQLite stores:
66 | - Toots
67 |     - `id`: This is an internal auto-incrementing ID. Not the same as `toot_id`
68 |     - Some other fields parsed from JSON
69 |     - `embedding`: The embedding vector, stored as a BLOB. In memory it's kept as a numpy array.
70 |     - `orig_json`: The full unaltered JSON that the mastodon server sent us, stored as TEXT
71 | - Session
72 |     - `id`: The session ID. This is stored in an HTTP cookie when sent to the browser, so all requests can correspond to a session.
73 |     - `algorithm_spec`: A JSON object (stored as TEXT) describing the module & class name of the algorithm currently in use.
74 |     - `algorithm`: The algorithm serialized via [pickle](https://docs.python.org/3/library/pickle.html), stored as a BLOB. This enables
75 |       pluggable algorithms to keep their own state persistently.
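
As a rough illustration of the session row described above, the sketch below stores a JSON spec in a TEXT column and a pickled object in a BLOB column, then reads both back. The column names follow the list above, but the exact schema, the `save_session` and `load_session` helpers, and the example values are assumptions made for illustration, not fossil's actual code.

```python
import json
import pickle
import sqlite3


def save_session(conn, session_id, spec, algorithm):
    """Store the algorithm spec as JSON text and the algorithm as a pickled BLOB."""
    conn.execute(
        "INSERT OR REPLACE INTO sessions (id, algorithm_spec, algorithm) VALUES (?, ?, ?)",
        (session_id, json.dumps(spec), pickle.dumps(algorithm)),
    )
    conn.commit()


def load_session(conn, session_id):
    """Return (spec, algorithm) for a session, or None if the session is unknown."""
    row = conn.execute(
        "SELECT algorithm_spec, algorithm FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    if row is None:
        return None
    return json.loads(row[0]), pickle.loads(row[1])


# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, algorithm_spec TEXT, algorithm BLOB)"
)

# A plain dict stands in for a trained algorithm object (hypothetical values).
save_session(conn, "abc123", {"module": "my_plugin", "class_name": "MyAlgorithm"}, {"n_clusters": 15})
restored = load_session(conn, "abc123")
```

Because `pickle.loads` can execute arbitrary code, this pattern is only appropriate for data the application wrote itself.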
76 |
77 | HTTP Cookies
78 | - `fossil_session_id`: The primary key of the `sessions` table. Created whenever it's empty, never expires.
79 |
80 | ## Code Layout
81 |
82 | - `core.py`: Database access, downloading toots, etc.
83 | - `config.py`: Configuration & wrappers around configuration mechanisms. All config should have either a constant or simple function.
84 | - [DEPRECATED] `science.py`: Functionality here has been moved to `plugin_impl/topic_cluster.py` and made more pluggable.
85 | - `server.py`: Entry point. FastAPI app with all core HTTP operations defined. Operations return either a jinja template or a literal HTML response.
86 | - `ui.py`: partially deprecated (it contains old streamlit code).
87 | - `algorithm.py`: Base classes and utilities needed for building algorithm plugins. All algorithms are installed as plugins, even standard ones.
88 | - `plugin_impl/`: Plugins that ship with fossil.
89 |     - `topic_cluster.py`: A standard algorithm, implementing the base classes from `algorithm.py`.
90 | - `app/`
91 |     - `static/`: various CSS & JavaScript files
92 |         - `style.css`: the only CSS we're writing manually
93 |         - `page.js`: The only JS we're writing manually. No pre or post processing pipeline, it's downloaded literally as it's stored in Git, comments and all.
94 |         - Other files: Things I downloaded
95 |     - `templates/`: Jinja templates
96 |         - `index.html`: Returned by `GET /`
97 |         - `settings.html`: Returned by `GET /settings`
98 |         - `toot*.html`: Different sub-templates included into `index.html` or returned from XHR endpoints. You can use these for building plugins.
99 |         - `base/`
100 |             - `page.html`: Base template that is inherited by both `index.html` and `settings.html`
101 |
102 |
103 |
104 | # Plugin Architecture
105 | ## Making an Algorithm Plugin
106 | An algorithm plugin involves:
107 |
108 | 1. Algorithm class
109 | 2. [Optional] Renderer class
110 | 3. [Optional] Jinja templates for displaying
111 |
112 | All algorithms are plugins, so you can use [`topic_cluster.py`](https://github.com/tkellogg/fossil/blob/main/fossil_mastodon/algorithm/topic_cluster.py)
113 | as a guide.
114 |
115 | ### Algorithm Class
116 | Use `algorithm.BaseAlgorithm` as a base class and implement these methods:
117 |
118 | - `render(toots, context)`: Convert a list of toots into a Renderable object (which converts to an HTTP response)
119 | - `train(context, http_args)`: A class method that produces an instance of your algorithm class. The assumption is that you're training
120 | some sort of model, e.g. topic_cluster trains a sklearn `KMeans` model and stores it in a field of the `TopicCluster`
121 | object. Storing it in a field ensures that the model is serialized to and from the database along with the algorithm.
122 |
123 | ### [Optional] Renderer Class
124 | You might not need to do this if you can find a different template & renderer that works for you. This should be very easy to implement; it's just a
125 | matter of capturing the data you need and then passing it to a template.
126 |
127 | Use `algorithm.Renderable` as a base class and implement this method:
128 |
129 | - `render()`: Returns a FastAPI response. Typically you're going to return a TemplateResponse.
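
To make the shape of a plugin concrete, here is a minimal sketch of an algorithm class paired with a renderer. It follows the method signatures declared in `fossil_mastodon/algorithm.py`, but the two base classes are stdlib stand-ins so that the example runs on its own. The `TootList` and `LongestFirst` classes, the `min_length` argument, and the use of plain strings as toots are all hypothetical, and a real renderer would return a FastAPI response rather than a string.

```python
import abc
import html
from dataclasses import dataclass


class Renderable(abc.ABC):
    """Stand-in for the renderer base class (assumption, not the real one)."""

    @abc.abstractmethod
    def render(self) -> str:
        ...


class BaseAlgorithm(abc.ABC):
    """Stand-in for the algorithm base class (assumption, not the real one)."""

    @abc.abstractmethod
    def render(self, toots, context) -> Renderable:
        ...

    @classmethod
    @abc.abstractmethod
    def train(cls, context, http_args) -> "BaseAlgorithm":
        ...


@dataclass
class TootList(Renderable):
    """Captures the data to display; a real renderer would fill a Jinja template."""
    toots: list

    def render(self) -> str:
        items = "".join(f"<li>{html.escape(t)}</li>" for t in self.toots)
        return f"<ul>{items}</ul>"


@dataclass
class LongestFirst(BaseAlgorithm):
    """Toy algorithm: "training" only records a length cutoff from the form args."""
    min_length: int

    @classmethod
    def train(cls, context, http_args):
        # State kept in fields is what gets serialized with the algorithm.
        return cls(min_length=int(http_args.get("min_length", "0")))

    def render(self, toots, context):
        kept = [t for t in toots if len(t) >= self.min_length]
        return TootList(sorted(kept, key=len, reverse=True))


algo = LongestFirst.train(context=None, http_args={"min_length": "5"})
page = algo.render(["hi", "hello world", "fossil"], context=None).render()
```

The split keeps the algorithm responsible for choosing and ordering toots, while the renderer only decides how the result is displayed.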
130 |
131 |
132 |
133 |
134 |
135 |
136 |
137 |
138 |
139 |
140 |
141 |
142 |
143 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Fossil, a Mastodon Client for Reading
2 |
3 | A mastodon client optimized for reading, with a configurable and
4 | hackable timeline algorithm powered by Simon Willison's [llm](https://llm.datasette.io/en/stable/index.html) tool. Try making your own algorithm!
5 |
6 |
7 | Sneak peek:
8 |
9 | 
10 |
11 |
12 | # Installing & Running
13 |
14 | ## From PyPI
15 |
16 | I highly suggest not installing any Python app directly into your global Python. Create a virtual environment:
17 |
18 | ```
19 | python -m venv fossil
20 | ```
21 |
22 | And then activate it (see [here](https://docs.python.org/3/library/venv.html))
23 |
24 | ```
25 | source fossil/bin/activate
26 | ```
27 |
28 | With the environment active, install fossil with `pip install fossil-mastodon`. Alternatively, **use [`pipx`](https://pipx.pypa.io/stable/installation/)**:
29 |
30 | ```
31 | pip install pipx
32 | pipx install fossil-mastodon
33 | ```
34 |
35 | ## From Source
36 |
37 | Clone this repo:
38 |
39 | ```
40 | git clone https://github.com/tkellogg/fossil.git
41 | ```
42 |
43 | And then `cd fossil` to get into the correct directory.
44 |
45 |
46 | ## Configure the `.env` file
47 |
48 | Before running the server, you'll need a `.env` file with these keys:
49 |
50 | ```
51 | ACCESS_TOKEN=
52 | ```
53 |
54 | Alternatively, you can set them as environment variables. All available keys are here:
55 |
56 | | Variable | Required? | Value |
57 | | --- | --- | --- |
58 | | OPENAI_API_BASE | no | eg. https://api.openai.com/v1 |
59 | | MASTO_BASE | no? | eg. https://hachyderm.io |
60 | | ACCESS_TOKEN | yes | In your mastodon UI, create a new "app" and copy the access token here |
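
The lookup order can be sketched as follows. This mirrors the `get_config_var` helper in `fossil_mastodon/config.py`, which prefers a `.env` value, then the process environment, then a default. The `parse_env_text` function below is a simplified stand-in for `python-dotenv`, and the sample values are made up for illustration.

```python
import os


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def get_config_var(name, default, dotenv, environ=os.environ):
    """Prefer the .env value, then the process environment, then the default."""
    return dotenv.get(name, environ.get(name, default))


# Hypothetical .env contents and environment, for illustration only.
dotenv = parse_env_text("ACCESS_TOKEN=secret-token\n# MASTO_BASE is left unset here\n")
fake_environ = {"MASTO_BASE": "https://example.social", "ACCESS_TOKEN": "from-env"}

token = get_config_var("ACCESS_TOKEN", "", dotenv, fake_environ)
base = get_config_var("MASTO_BASE", "https://hachyderm.io", dotenv, fake_environ)
api = get_config_var("OPENAI_API_BASE", "https://api.openai.com/v1", dotenv, fake_environ)
```

Here `ACCESS_TOKEN` comes from the `.env` text even though the environment also defines it, `MASTO_BASE` falls back to the environment, and `OPENAI_API_BASE` falls back to its default.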
61 |
62 | ### Connecting to Mastodon
63 |
64 | To get `MASTO_BASE` and `ACCESS_TOKEN`:
65 |
66 | 1. Go to Mastodon web UI
67 | 2. Preferences -> Development
68 | 3. Click "New Application"
69 | 4. Set the name
70 | 5. Set "Redirect URI" to `urn:ietf:wg:oauth:2.0:oob`
71 | 6. Set scopes to all `read` and `write` (contribution idea: figure out what's strictly necessary and send a pull request to update this)
72 | 7. Click Submit
73 | 8. Copy your access token into `ACCESS_TOKEN` in the `.env` file.
74 | 9. Set `MASTO_BASE`. You should be able to copy the URL from your browser and then remove the entire path (everything from the first `/` after the hostname onward).
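
If you'd rather not trim the URL by hand, the last step can be sketched with the standard library. The `masto_base` helper and the example URL are hypothetical; the idea is only to keep the scheme and host and drop everything else.

```python
from urllib.parse import urlsplit


def masto_base(browser_url):
    """Keep only the scheme and host of a URL copied from the browser."""
    parts = urlsplit(browser_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {browser_url!r}")
    return f"{parts.scheme}://{parts.netloc}"


# Example URL is made up; any page on your Mastodon server works.
base = masto_base("https://hachyderm.io/settings/applications/12345")
```

The resulting value is what goes into `MASTO_BASE` in the `.env` file.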
75 |
76 | ## Usage
77 | 1. Ensure the settings are correct
78 | 2. "Load More" to populate the database with toots
79 | 3. "Re-Train Algorithm" to categorize and label those toots.
80 |
81 | # Configure Models
82 | Models can be configured and/or added via `llm`.
83 |
84 | ## OpenAI
85 | Here's how to set your OpenAI API key, which gives you access to OpenAI models:
86 |
87 | ```
88 | $ llm keys set openai
89 | Enter key: ...
90 | ```
91 | ## Local (Experimental)
92 | You will need to install an embedding model and a large language model. The instructions here use the `llm-sentence-transformers` and `llm-gpt4all` plugins to do so.
93 |
94 | ```sh
95 | $ llm install llm-sentence-transformers # An Embedding Model Plugin
96 | $ llm install llm-gpt4all # A Large Language Model Plugin
97 | $ llm sentence-transformers register all-mpnet-base-v2 --alias mpnet # Download/Register one of the Embedding Models
98 | ```
99 |
100 | ### Notes
101 | - A full list of possible embedding models is composed of [the default list](https://www.sbert.net/docs/pretrained_models.html) and [these models from huggingface](https://huggingface.co/models?library=sentence-transformers).
102 | - The [llm-gpt4all](https://github.com/simonw/llm-gpt4all) README gives a list of models and their requirements
103 | - The first time you use a model, `llm` will need to download it. This will add to the overall processing time.
104 | - The "Re-Train Algorithm" step will take a long time depending on your hardware; a progress bar is shown in the console window
105 | - The quality of the categorization and labels is not guaranteed
106 |
107 | ## Run the server
108 |
109 | If you installed from PyPI:
110 |
111 | ```
112 | uvicorn --host 0.0.0.0 --port 8888 fossil_mastodon.server:app
113 | ```
114 |
115 | If you installed from source:
116 |
117 | ```
118 | poetry run uvicorn --host 0.0.0.0 --port 8888 --reload fossil_mastodon.server:app
119 | ```
120 |
121 | If you're working on CSS or HTML files, you should include them:
122 |
123 | ```
124 | poetry run uvicorn --host 0.0.0.0 --port 8888 --reload --reload-include '*.html' --reload-include '*.css' fossil_mastodon.server:app
125 | ```
126 |
127 | (Note that `--reload` makes it much easier to develop, but is generally unnecessary if you're not developing.)
--------------------------------------------------------------------------------
/dashboard.py:
--------------------------------------------------------------------------------
1 | import datetime
2 | import random
3 |
4 | import streamlit as st
5 | from fossil_mastodon import config, core, science, ui
6 |
7 | st.title("fossil")
8 | link_style = ui.LinkStyle()
9 |
10 | @st.cache_data
11 | def default_date():
12 |     return datetime.datetime.utcnow() - datetime.timedelta(days=1)
13 |
14 | @st.cache_data
15 | def get_toots(_cache_key: int, timeline_since, n_clusters) -> list[core.Toot]:
16 |     print("get_toots", _cache_key, st.session_state.cache_key, "since=", datetime.datetime.utcnow() - timeline_since)
17 |     toots = core.Toot.get_toots_since(datetime.datetime.utcnow() - timeline_since)
18 |     if len(toots) > 0:
19 |         ui.all_toot_summary(toots)
20 |         science.assign_clusters(st.session_state['id'], toots, n_clusters=n_clusters)
21 |     return toots
22 |
23 | # Refresh button
24 | latest_date = core.Toot.get_latest_date()
25 | if latest_date is None:
26 |     is_refreshing = st.button("Download toots")
27 |     if is_refreshing:
28 |         with st.spinner("Downloading toots..."):
29 |             core.create_database()
30 |             core.download_timeline(datetime.datetime.utcnow() - datetime.timedelta(days=1), st.session_state['id'])
31 |             latest_date = core.Toot.get_latest_date()
32 |         st.session_state.cache_key = random.randint(0, 10000)
33 | else:
34 |     is_refreshing = st.button("Refresh toots")
35 |     if is_refreshing:
36 |         with st.spinner("Downloading toots..."):
37 |             core.create_database()
38 |             core.download_timeline(latest_date, st.session_state['id'])
39 |         st.session_state.cache_key = random.randint(0, 10000)
40 |
41 | # customize timeline segment to analyze
42 | timeline_since = ui.get_time_frame()
43 |
44 | # customize clustering algo
45 | n_clusters = st.slider("Number of clusters", 2, 20, 15)
46 |
47 | if "cache_key" not in st.session_state:
48 |     print("init cache_key", st.session_state)
49 |     st.session_state.cache_key = random.randint(0, 10000)
50 |
51 | if st.button("Show"):
52 |     st.session_state.cache_key = random.randint(0, 10000)
53 |
54 | print(f"state: {st.session_state.cache_key}")
55 |
56 | toots = get_toots(st.session_state.cache_key, timeline_since, n_clusters)
57 | clusters = sorted(list({t.cluster for t in toots if t.cluster}))
58 | if len(toots) == 0:
59 |     st.markdown("No toots found. Try clicking **Download toots** or **Refresh toots** above and then click **Show**.")
60 | else:
61 |     for cluster in clusters:
62 |         cluster_count = len([t for t in toots if t.cluster == cluster])
63 |         with st.expander(f"{cluster} ({cluster_count} toots)"):
64 |             for toot in toots:
65 |                 if toot.cluster == cluster:
66 |                     ui.display_toot(toot, link_style)
67 |
--------------------------------------------------------------------------------
/fossil_mastodon/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tkellogg/fossil/89db2fdbea96666f101e6e16e287fa23fcee0b9d/fossil_mastodon/__init__.py
--------------------------------------------------------------------------------
/fossil_mastodon/algorithm.py:
--------------------------------------------------------------------------------
1 | import abc
2 | import datetime
3 | import pickle
4 | import sqlite3
5 | import typing
6 |
7 | import pydantic
8 | from fastapi import Response, responses
9 |
10 | from fossil_mastodon import config, core
11 | if typing.TYPE_CHECKING:
12 |     from fossil_mastodon import plugins
13 |
14 |
15 | class Renderable(abc.ABC):
16 |     """
17 |     A base class for a "shape" of data to be rendered as HTML.
18 |     """
19 |     @abc.abstractmethod
20 |     def render(self, **response_args) -> Response:
21 |         """
22 |         Render this object as a FastAPI Response.
23 |
24 |         :param response_args: Additional arguments to pass to the Response constructor.
25 |         """
26 |         raise NotImplementedError()
27 |
28 |
29 | class TrainContext(pydantic.BaseModel):
30 |     """
31 |     A context object for training a model. This is passed to train().
32 |     """
33 |     end_time: datetime.datetime
34 |     timedelta: datetime.timedelta
35 |     session_id: str
36 |
37 |     def get_toots(self) -> list[core.Toot]:
38 |         return core.Toot.get_toots_since(self.end_time - self.timedelta)
39 |
40 |     def sqlite_connection(self) -> sqlite3.Connection:
41 |         return config.ConfigHandler.open_db()
42 |
43 |
44 | class BaseAlgorithm(abc.ABC):
45 |     """
46 |     Base class for algorithms that render your timeline. You should implement
47 |     this class to create your own algorithm.
48 |
49 |     Abstract methods:
50 |     - render: Run the model
51 |     - train: Train the model
52 |
53 |     Additionally, you may want to override this method to provide a custom UI for
54 |     your algorithm:
55 |
56 |     - render_model_params
57 |
58 |     Note that objects of this class must be serializable, via pickle. However, you
59 |     can control how serialization works by overriding these methods:
60 |
61 |     - serialize
62 |     - deserialize
63 |     """
64 |
65 |     @abc.abstractmethod
66 |     def render(self, toots: list[core.Toot], context: "plugins.RenderContext") -> Renderable:
67 |         """
68 |         Run the model and return a Renderable object. This object is typically
69 |         deserialized before this method is called.
70 |
71 |         :param toots: The toots to run the model on. This is typically 1 day of toots,
72 |         or 6 hours, or whatever the user (you) has selected.
73 |
74 |         :param context: A RenderContext object that you can use to render HTML. This
75 |         is generally just passed to the Renderable object you return.
76 |         """
77 |         raise NotImplementedError()
78 |
79 |     @classmethod
80 |     @abc.abstractmethod
81 |     def train(cls, context: TrainContext, http_args: dict[str, str]) -> "BaseAlgorithm":
82 |         """
83 |         Create an instance of this algorithm, and train it on the given toots.
84 |
85 |         :param context: Context object where training data can be obtained.
86 |         """
87 |         raise NotImplementedError()
88 |
89 |     @classmethod
90 |     def render_model_params(cls, context: "plugins.RenderContext") -> Response:
91 |         """
92 |         Optionally, you can render HTML input elements that capture http_args passed
93 |         to train(). This is useful if your algorithm has hyperparameters that you want
94 |         to experiment with.
95 |         """
96 |         return responses.HTMLResponse("")
97 |
98 |     def serialize(self) -> bytes:
99 |         return pickle.dumps(self)
100 |
101 |     @staticmethod
102 |     def deserialize(data: bytes) -> "BaseAlgorithm":
103 |         return pickle.loads(data)
--------------------------------------------------------------------------------
/fossil_mastodon/app/index.js:
--------------------------------------------------------------------------------
1 | import '@polymer/app-layout/app-layout.js';
2 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/static/htmx.js:
--------------------------------------------------------------------------------
1 | Found. Redirecting to /htmx.org@1.9.10
--------------------------------------------------------------------------------
/fossil_mastodon/app/static/logo-light.svg:
--------------------------------------------------------------------------------
1 |
2 |
4 |
149 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/static/page.js:
--------------------------------------------------------------------------------
1 |
2 | const stickyElm = document.querySelector('.cluster .title')
3 |
4 | const observer = new IntersectionObserver(
5 |   ([e]) => e.target.classList.toggle('isSticky', e.intersectionRatio < 1),
6 |   {threshold: [1]}
7 | );
8 | // Pages without a cluster title have nothing to observe; observe(null) would throw
9 | if (stickyElm) observer.observe(stickyElm)
--------------------------------------------------------------------------------
/fossil_mastodon/app/static/style.css:
--------------------------------------------------------------------------------
1 | html {
2 | font-size: 100%;
3 | }
4 |
5 | body {
6 | background-color: #262222;
7 | color: #ddd;
8 | font-family: "Arial", sans-serif;
9 | max-width: 80rem;
10 | }
11 |
12 | a {
13 | color: #ddd;
14 | /* text-decoration: none; */
15 | }
16 |
17 | .decl {
18 | background-color: #3f3544;
19 | border-radius: 0.5rem;
20 | border-width: 1px;
21 | margin-bottom: 1rem;
22 | padding: 0.5rem;
23 | }
24 |
25 | .model-param-label {
26 | padding-top: 2rem;
27 | padding-left: 2rem;
28 | }
29 |
30 | .row {
31 | display: flex;
32 | flex-direction: row;
33 | align-items: center;
34 | }
35 | .row * {
36 | margin: 0 0.5rem;
37 | }
38 |
39 | select {
40 | margin: 0.5rem;
41 | font-size: 1rem;
42 | padding: 0.5rem;
43 | padding-right: 1rem;
44 | border-radius: 4px;
45 | border: 1px solid #888888;
46 | background-color: #333333;
47 | color: #ffffff;
48 | }
49 |
50 | input[type="text"],input[type="password"] {
51 | margin: 0.5rem;
52 | font-size: 1rem;
53 | padding: 0.5rem;
54 | padding-right: 1rem;
55 | border-radius: 4px;
56 | border: 1px solid #888888;
57 | background-color: #333333;
58 | color: #ffffff;
59 | }
60 |
61 | h1.nav {
62 | display: flex;
63 | align-items: center;
64 | justify-content: space-between; /* This will keep the items spaced apart */
65 | width: 100%;
66 | }
67 |
68 | .nav img {
69 | height: 3rem;
70 | width: 7rem;
71 | }
72 |
73 | /********
74 | Navigation tabs
75 | *********/
76 | .nav-tabs {
77 | margin-bottom: 2rem;
78 | border-bottom: 2px solid #555;
79 | }
80 | .nav-tab {
81 | font-size: 1.5rem;
82 | padding: 0.5rem;
83 | margin: 0rem 0.25rem 0rem 0.25rem;
84 | border-top-left-radius: 0.5rem;
85 | border-top-right-radius: 0.5rem;
86 | }
87 | .nav-tab:hover {
88 | background-color: #555;
89 | }
90 |
91 | .nav-tab a {
92 | text-decoration: none;
93 | }
94 | .nav-tabs div.active {
95 | background-color: #555555;
96 | }
97 | .nav-tabs .back-button:hover {
98 | border-top-left-radius: 0.5rem;
99 | border-top-right-radius: 0.5rem;
100 | background-color: #555;
101 | }
102 |
103 | .back-button {
104 | text-decoration: none;
105 | padding-right: 1rem;
106 | }
107 |
108 | .back-button::before {
109 | display: inline-block;
110 | width: 0.92rem;
111 | height: 1.8rem;
112 | vertical-align: -0.125rem;
113 | padding: 0.5rem 0.5rem 0.5rem 0rem;
114 | /* edit the fill=%23... to change color */
115 | content: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 1472 1600'%3E%3Cpath fill='%23ccc' d='M1472 736v128q0 53-32.5 90.5T1355 992H651l293 294q38 36 38 90t-38 90l-75 76q-37 37-90 37q-52 0-91-37L37 890Q0 853 0 800q0-52 37-91L688 59q38-38 91-38q52 0 90 38l75 74q38 38 38 91t-38 91L651 608h704q52 0 84.5 37.5T1472 736'/%3E%3C/svg%3E");
116 | }
117 |
118 |
119 | /********
120 | Toot display
121 | *********/
122 | .author img {
123 | width: 2rem;
124 | height: 2rem;
125 | border-radius: 50%;
126 | margin-right: 0.5rem;
127 | }
128 |
129 | .toot {
130 | background-color: #333333;
131 | padding: 1rem;
132 | border-radius: 4px;
133 | }
134 |
135 | .toot:not(:last-child) {
136 | margin-bottom: 1rem;
137 | }
138 |
139 | .toot .button-bar {
140 | display: flex;
141 | justify-content: flex-end;
142 | margin-top: 0.5rem;
143 | }
144 |
145 | .toot .button-bar button {
146 | background-color: transparent;
147 | margin-left: 0.5rem;
148 | }
149 |
150 | .toot .button-bar button:hover {
151 | background-color: #888888;
152 | margin-left: 0.5rem;
153 | }
154 |
155 | .toot .content img {
156 | max-width: 100%;
157 | /* margin-bottom: 10px; */
158 | border-radius: 0.5rem;
159 | }
160 |
161 | .toot .content a {
162 | /* long links cause horizontal scroll */
163 | word-wrap: break-word;
164 | }
165 |
166 | #toots {
167 | margin: 2.5rem 0.5rem;
168 | }
169 |
170 | /********
171 | Cluster
172 | *********/
173 |
174 | .cluster {
175 | border-width: 2px;
176 | border-radius: 4px;
177 | margin: 1rem 0rem;
178 | }
179 |
180 | .cluster .title {
181 | padding: 0.5rem;
182 | font-size: 1.5rem;
183 | font-weight: bold;
184 | background-color: #555555;
185 | cursor: pointer;
186 | position: sticky;
187 | top: 0;
188 |
189 | /* HACK: I'd rather only control the height when sticky, but that seems hard */
190 | max-height: 5em;
191 | overflow: scroll;
192 | }
193 |
194 | .cluster[data-open="false"] .title {
195 | border-radius: 4px;
196 | }
197 |
198 | .cluster[data-open="true"] .title {
199 | border-bottom: 2px solid #888888;
200 | border-radius: 4px 0px;
201 | }
202 |
203 | .cluster[data-open="false"] .content {
204 | display: none;
205 | }
206 |
207 | .cluster .content button {
208 | display: flex;
209 | justify-content: flex-end;
210 | }
211 |
212 | /********
213 | Button
214 | *********/
215 |
216 | button {
217 | background-color: #666666;
218 | color: #ffffff;
219 | border: none;
220 | padding: 0.5rem 1rem;
221 | font-size: 1rem;
222 | font-weight: bold;
223 | border-radius: 4px;
224 | cursor: pointer;
225 | }
226 |
227 | button:hover {
228 | background-color: #888888;
229 | }
230 |
231 | button:focus {
232 | outline: none;
233 | box-shadow: 0 0 0 2px #ffffff;
234 | }
235 |
236 | /********
237 | Radio
238 | *********/
239 |
240 | .radio {
241 | margin: 0.6rem 0px 0.6rem 0px;
242 | }
243 |
244 | .radio div {
245 | display: inline-block;
246 | position: relative;
247 | padding-left: 0px;
248 | margin: 0px 0px 0px 0px;
249 | cursor: pointer;
250 | font-size: 16px;
251 | font-weight: bold;
252 | }
253 |
254 | .radio div input[type="radio"] {
255 | display: none;
256 | }
257 |
258 | .radio div label {
259 | display: inline-block;
260 | padding: 0.5rem 1rem;
261 | margin: 0px -4px 0px -4px;
262 | background-color: #666666;
263 | color: #ffffff;
264 | cursor: pointer;
265 | }
266 |
267 | .radio div input[type="radio"]:checked + label {
268 | background-color: #888888;
269 | }
270 |
271 | .radio div:first-child label {
272 | border-top-left-radius: 4px;
273 | border-bottom-left-radius: 4px;
274 | }
275 |
276 | .radio div:last-child label {
277 | border-top-right-radius: 4px;
278 | border-bottom-right-radius: 4px;
279 | }
280 |
281 | /********
282 | Slider
283 | *********/
284 |
285 | input[type="range"] {
286 | width: 100%;
287 | max-width: 400px;
288 | margin: 0.5rem 0px 0.5rem 0px;
289 | }
290 |
291 | /********
292 | Spinner
293 | *********/
294 |
295 | .spinner:not(.htmx-request) {
296 | display: inline-block;
297 | vertical-align: middle;
298 | width: 2rem;
299 | height: 2rem;
300 | border-radius: 100%;
301 | visibility: hidden;
302 | }
303 |
304 | .spinner.htmx-request {
305 | display: inline-block;
306 | vertical-align: middle;
307 | width: 2rem;
308 | height: 2rem;
309 | border-radius: 100%;
310 | visibility: visible;
311 | opacity: 1;
312 | }
313 |
314 | /********
315 | Hamburger
316 | *********/
317 |
318 | .hamburger-launch {
319 | background-color: transparent;
320 | }
321 | .hamburger-launch:hover {
322 | background-color: #444444;
323 | }
324 | .hamburger-launch svg {
325 | fill: #ffffff;
326 | }
327 |
328 | .hamburger {
329 | display: none;
330 | cursor: pointer;
331 | }
332 | .hamburger.open {
333 | display: flex;
334 | flex-direction: column;
335 | justify-content: flex-start;
336 | align-items: flex-end;
337 | right: 0;
338 | top: 5rem;
339 | min-width: 8rem;
340 | position: absolute;
341 | z-index: 10;
342 | background-color: #333333;
343 | padding: 1rem 0rem;
344 | border-radius: 4px;
345 | }
346 | .hamburger.open a {
347 | padding: 0.5rem 1rem;
348 | text-decoration: none;
349 | }
350 | .hamburger.open a:hover {
351 | background-color: #444444;
352 | }
353 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/static/work-in-progress.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tkellogg/fossil/89db2fdbea96666f101e6e16e287fa23fcee0b9d/fossil_mastodon/app/static/work-in-progress.gif
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/bad_plugin.html:
--------------------------------------------------------------------------------
1 | {% block content %}
2 |
3 | Error in function signature!
4 |
5 | - Plugin: {{ ex.plugin.name }}
6 | - Function: {{ ex.function_name }}
7 | - Signature:
{{ ex.signature }}
8 | - Expected signature:
{{ ex.expected_signature }}
9 |
10 |
11 | This can happen for a variety of reasons. Check the logs for more information.
12 |
13 |
14 | {% endblock %}
15 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/base/admin.html:
--------------------------------------------------------------------------------
1 | {% extends "base/page.html" %}
2 |
3 | {% block content %}
4 |
5 |
6 |
7 | {% for menu_item in extra_menu_items() %}
8 |
13 | {% endfor %}
14 |
15 |
16 | {% block core_content %}
17 |
18 | {% endblock %}
19 | {% endblock %}
20 |
21 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/base/page.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 | fossil
5 |
6 |
7 |
8 |
9 |
10 |
11 | {% for item in head_html() %}
12 | {% autoescape false %}
13 | {{ item }}
14 | {% endautoescape %}
15 | {% endfor %}
16 |
17 |
18 |
19 |
20 |

21 |
22 |
32 |
33 |
34 |
Home
35 |
Settings
36 | {% for menu_item in extra_menu_items() %}
37 | {% autoescape false %}
38 | {{ menu_item.html }}
39 | {% endautoescape %}
40 | {% endfor %}
41 |
42 |
43 | {% block content %}
44 |
45 | {% endblock %}
46 |
47 |
48 |
51 |
52 |
53 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/index.html:
--------------------------------------------------------------------------------
1 | {% extends "base/page.html" %}
2 |
3 | {% block content %}
4 |
5 |
6 |
7 |
8 |
9 |
75 |
76 |
77 |
78 |
79 |
80 | {% endblock %}
81 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/no_algorithm.html:
--------------------------------------------------------------------------------
1 | {% extends "base/page.html" %}
2 |
3 | {% block content %}
4 |
5 | No algorithms installed!
6 |
7 | This can happen for a variety of reasons. Check the logs for more information.
8 |
9 |
10 | {% endblock %}
11 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/settings.html:
--------------------------------------------------------------------------------
1 | {% extends "base/admin.html" %}
2 |
3 | {% block core_content %}
4 |
49 | {% endblock %}
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/toot.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
{{ toot.author }}
4 | {{ toot.created_at | rel_date }}
5 |
6 |
7 | {% autoescape false %}
8 | {{ toot.content }}
9 | {% endautoescape %}
10 |
11 | {% for attachment in toot.media_attachments %}
12 |

13 | {% else %}
14 | {% if toot.card_url %}
15 |

16 | {% endif %}
17 | {% endfor %}
18 |
19 |
20 |
28 |
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/toot_clusters.html:
--------------------------------------------------------------------------------
1 |
14 |
15 |
16 | {{ clusters.num_toots }} toots from {{ clusters.min_date | rel_date }} to {{ clusters.max_date | rel_date }}
17 |
18 |
19 | {% for cluster in clusters.clusters %}
20 |
21 |
22 | {{ cluster.name }} ({{ cluster.toots | length }} Toots)
23 |
24 |
25 | {% set toots = cluster.toots %}
26 | {% include 'toot_list.html' %}
27 |
28 |
29 | {% endfor %}
--------------------------------------------------------------------------------
/fossil_mastodon/app/templates/toot_list.html:
--------------------------------------------------------------------------------
1 | {% for toot in toots %}
2 | {% include 'toot.html' %}
3 | {% endfor %}
--------------------------------------------------------------------------------
/fossil_mastodon/config.py:
--------------------------------------------------------------------------------
1 | import atexit
2 | import json
3 | import os
4 | import pathlib
5 | import random
6 | import shutil
7 | import sqlite3
8 | import string
9 | from collections import defaultdict
10 |
11 | import llm
12 | import pydantic
13 | from dotenv import dotenv_values
14 |
15 |
16 | def get_config_var(var_name: str, default):
17 | return dotenv_values().get(var_name, os.environ.get(var_name, default))
18 |
19 | class Model(pydantic.BaseModel):
20 | name: str
21 | context_length: int
22 |
23 | class _ConfigValueNotFound():
24 | pass
25 |
26 | ConfigValueNotFound = _ConfigValueNotFound()
27 |
28 | class _ConfigHandler():
29 | # Default fallbacks for variables defined in either .env or environment
30 | _config_var_defaults = {
31 | "DATABASE_PATH": "fossil.db",
32 | "OPENAI_KEY": "",
33 | "OPENAI_API_BASE": "https://api.openai.com/v1",
34 | "MASTO_BASE": "https://hachyderm.io",
35 | }
36 |
37 | _model_lengths = defaultdict(
38 | lambda: 2048,
39 | {"gpt-3.5-turbo": 4097, "ada-002": 8191}
40 | )
41 |
42 | _model_cache = {}
43 |
44 | def __getattr__(self, item: str):
45 | c_val = get_config_var(item, self._config_var_defaults.get(item, ConfigValueNotFound))
46 |
47 | if isinstance(c_val, _ConfigValueNotFound):
48 | raise AttributeError(f"{item} is not defined in either the environment or .env file")
49 | return c_val
50 |
51 | def _get_from_session(self, session_id: str| None, item: str) -> str:
52 | if not session_id:
53 | return ""
54 | with self.open_db() as conn:
55 | c = conn.cursor()
56 | c.execute('SELECT settings FROM sessions WHERE id = ?', [session_id])
57 | row = c.fetchone()
58 | try:
59 | return json.loads(row[0]).get(item, "")
60 | except (json.decoder.JSONDecodeError, IndexError, TypeError):
61 | return ""
62 |
63 | def open_db(self) -> sqlite3.Connection:
64 | return sqlite3.connect(self.DATABASE_PATH)
65 |
66 | def EMBEDDING_MODEL(self, session_id: str|None = None) -> Model:
67 | c_val = self._get_from_session(session_id, "embedding_model")
68 | if not c_val:
69 | c_val = get_config_var("EMBEDDING_MODEL", "ada-002")
70 |
71 | if c_val not in self._model_cache:
72 | self._model_cache[c_val] = Model(name=c_val, context_length=self._model_lengths[c_val])
73 |
74 | return self._model_cache[c_val]
75 |
76 | def SUMMARIZE_MODEL(self, session_id: str|None = None) -> Model:
77 | c_val = self._get_from_session(session_id, "summarize_model")
78 | if not c_val:
79 | c_val = get_config_var("SUMMARIZE_MODEL", "gpt-3.5-turbo")
80 |
81 | if c_val not in self._model_cache:
82 | self._model_cache[c_val] = Model(name=c_val, context_length=self._model_lengths[c_val])
83 |
84 | return self._model_cache[c_val]
85 |
86 |
87 | ConfigHandler = _ConfigHandler()
88 |
89 |
90 | def headers():
91 | return {"Authorization": f"Bearer {ConfigHandler.ACCESS_TOKEN}"}
92 |
93 | def get_installed_llms() -> set[str]:
94 | return {m.model.model_id for m in llm.get_models_with_aliases()}
95 |
96 | def get_installed_embedding_models() -> set[str]:
97 | return {m.model.model_id for m in llm.get_embedding_models_with_aliases()}
98 |
99 |
100 | # Static files
101 | class StaticFiles(pydantic.BaseModel):
102 | """
103 | This manages static files so that the user can `pip install fossil-mastodon` and it runs
104 | fine.
105 |
106 | This copies all files into a temp directory and then deletes them as the program exits. This
107 | seems to work fine even in the dev workflow, since this module gets re-run every time
108 | uvicorn reloads the server.
109 | """
110 | class Config:
111 | arbitrary_types_allowed = True
112 | base_path: pathlib.Path
113 | assets_path: pathlib.Path
114 | templates_path: pathlib.Path
115 |
116 | # HACK: Alright, I admit it, this is crazy. Here's the thing: we need to use shutil.rmtree in
117 | # the destructor, but the destructor runs at a very weird time. I observed it running after
118 | # the shutil module had been unloaded, so I was getting NullType for the module. Obvs the
119 | # simple solution is this — make the function live longer than this object by capturing a reference.
120 | rmtree = shutil.rmtree
121 |
122 | @classmethod
123 | def from_env(cls) -> "StaticFiles":
124 | src_path = pathlib.Path(__file__).parent / "app"
125 | # I used to use tempfile, but macOS deletes temp files every 3 days, so I needed to move
126 | # to a more permanent location.
127 | dst_path = pathlib.Path(os.path.expanduser(f"~/.cache/fossil-mastodon/{''.join(random.choices(string.ascii_lowercase, k=10))}"))
128 | dst_path.mkdir(parents=True)
129 | shutil.copytree(src_path / "static", dst_path / "static")
130 | shutil.copytree(src_path / "templates", dst_path / "templates")
131 |
132 | obj = cls(
133 | base_path=dst_path,
134 | assets_path=dst_path / "static",
135 | templates_path=dst_path / "templates",
136 | )
137 |
138 | atexit.register(obj.cleanup)
139 |
140 | return obj
141 |
142 | def add_dir(self, path: pathlib.Path, mount_path: str):
143 | shutil.copytree(path, self.base_path / mount_path, dirs_exist_ok=True)
144 |
145 | def cleanup(self):
146 | self.rmtree(self.assets_path.parent)
147 |
148 | def __del__(self):
149 | self.cleanup()
150 |
151 |
152 | ASSETS = StaticFiles.from_env()
153 |
154 | def get_db_path(conn: sqlite3.Connection) -> str:
155 | return conn.execute("PRAGMA database_list").fetchone()[2]
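
The lookup order used by `get_config_var` above (the `.env` file first, then the process environment, then the supplied default) can be sketched with plain dictionaries. This is a minimal illustration only: the `lookup` helper and its parameters are hypothetical, and plain dicts stand in for `dotenv_values()` and `os.environ`.

```python
from typing import Any, Mapping


def lookup(name: str, dotenv_vals: Mapping[str, Any], environ: Mapping[str, str], default: Any) -> Any:
    """Hypothetical stand-in for get_config_var: .env wins, then the environment, then the default."""
    return dotenv_vals.get(name, environ.get(name, default))


# A value in the .env mapping shadows the same name in the environment.
from_dotenv = lookup("MASTO_BASE", {"MASTO_BASE": "https://a.example"}, {"MASTO_BASE": "https://b.example"}, "fallback")
# With no .env entry, the environment is consulted next.
from_environ = lookup("MASTO_BASE", {}, {"MASTO_BASE": "https://b.example"}, "fallback")
# With neither, the default is returned.
from_default = lookup("MASTO_BASE", {}, {}, "fallback")
```

Under this ordering, a stale entry in `.env` overrides whatever is exported in the shell, which is worth keeping in mind when a setting appears not to take effect.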
--------------------------------------------------------------------------------
/fossil_mastodon/core.py:
--------------------------------------------------------------------------------
1 | import datetime
2 | import functools
3 | import importlib
4 | import json
5 | import logging
6 | import random
7 | import sqlite3
8 | import string
9 | import traceback
10 | import typing
11 | from typing import Optional, Type
12 |
13 | import html2text
14 | import llm
15 | import numpy as np
16 | from pydantic import BaseModel
17 | import requests
18 | import tiktoken
19 |
20 | from fossil_mastodon import config, migrations
21 |
22 | if typing.TYPE_CHECKING:
23 | from fossil_mastodon import algorithm
24 |
25 |
26 | logger = logging.getLogger(__name__)
27 |
28 |
29 | @functools.lru_cache()
30 | def _get_json(toot: "Toot") -> dict:
31 | # meh, this isn't great, but it works
32 | import json
33 | return json.loads(toot.orig_json)
34 |
35 |
36 | class MediaAttatchment(BaseModel):
37 | type: str | None
38 | preview_url: str | None
39 | url: str | None
40 |
41 |
42 | class Toot(BaseModel):
43 | class Config:
44 | arbitrary_types_allowed = True
45 | id: int | None = None
46 | content: str | None
47 | author: str | None
48 | url: str | None
49 | created_at: datetime.datetime
50 | embedding: np.ndarray | None = None
51 | orig_json: str | None = None
52 | cluster: str | None = None # Added cluster property
53 |
54 | @property
55 | def orig_dict(self) -> dict:
56 | return _get_json(self)
57 |
58 | @property
59 | def avatar_url(self) -> str | None:
60 | return self.orig_dict.get("account", {}).get("avatar")
61 |
62 | @property
63 | def profile_url(self) -> str | None:
64 | return self.orig_dict.get("account", {}).get("url")
65 |
66 | @property
67 | def display_name(self) -> str | None:
68 | return self.orig_dict.get("account", {}).get("display_name")
69 |
70 | @property
71 | def toot_id(self) -> str | None:
72 | return self.orig_dict.get("id")
73 |
74 | @property
75 | def is_reply(self) -> bool:
76 | return self.orig_dict.get("in_reply_to_id") is not None
77 |
78 | @property
79 | def media_attachments(self) -> list[MediaAttatchment]:
80 | return [MediaAttatchment(type=m.get("type"), url=m.get("url"), preview_url=m.get("preview_url"))
81 | for m in self.orig_dict.get("media_attachments", [])]
82 |
83 | @property
84 | def card_preview_url(self) -> str | None:
85 | return self.orig_dict.get("card", {}).get("image")
86 |
87 | @property
88 | def card_url(self) -> str | None:
89 | return self.orig_dict.get("card", {}).get("url")
90 |
91 | def __hash__(self):
92 | return hash(self.url)
93 |
94 | def __eq__(self, other):
95 | return self.url == other.url
96 |
97 | def save(self, init_conn: sqlite3.Connection | None = None) -> bool:
98 | try:
99 | if init_conn is None:
100 | conn = config.ConfigHandler.open_db()
101 | else:
102 | conn = init_conn
103 | migrations.create_database()
104 | c = conn.cursor()
105 |
106 | # Check if the URL already exists
107 | c.execute('''
108 | SELECT COUNT(*) FROM toots WHERE url = ? and embedding is not null
109 | ''', (self.url,))
110 |
111 | result = c.fetchone()
112 | url_exists = result[0] > 0
113 |
114 | if url_exists:
115 | # A toot with this URL and an embedding is already stored; skip it
116 | return False
117 |
118 | c.execute('''
119 | DELETE FROM toots WHERE url = ?
120 | ''', (self.url,))
121 |
122 | embedding = self.embedding.tobytes() if self.embedding is not None else bytes()
123 | c.execute('''
124 | INSERT INTO toots (content, author, url, created_at, embedding, orig_json, cluster)
125 | VALUES (?, ?, ?, ?, ?, ?, ?)
126 | ''', (self.content, self.author, self.url, self.created_at, embedding, self.orig_json, self.cluster))
127 |
128 | except:
129 | conn.rollback()
130 | raise
131 | finally:
132 | if init_conn is None:
133 | conn.commit()
134 | return True
135 |
136 | @classmethod
137 | def get_toots_since(cls, since: datetime.datetime) -> list["Toot"]:
138 | migrations.create_database()
139 | with config.ConfigHandler.open_db() as conn:
140 | c = conn.cursor()
141 |
142 | c.execute('''
143 | SELECT
144 | id, content, author, url, created_at, embedding, orig_json, cluster
145 | FROM toots WHERE created_at >= ?
146 | ''', (since,))
147 |
148 | rows = c.fetchall()
149 | toots = []
150 | for row in rows:
151 | toot = cls(
152 | id=row[0],
153 | content=row[1],
154 | author=row[2],
155 | url=row[3],
156 | created_at=row[4],
157 | embedding=np.frombuffer(row[5]) if row[5] else None,
158 | orig_json=row[6],
159 | cluster=row[7] # Added cluster property
160 | )
161 | toots.append(toot)
162 |
163 | return toots
164 |
165 | @classmethod
166 | def get_by_id(cls, id: int) -> Optional["Toot"]:
167 | migrations.create_database()
168 | with config.ConfigHandler.open_db() as conn:
169 | c = conn.cursor()
170 |
171 | c.execute('''
172 | SELECT
173 | id, content, author, url, created_at, embedding, orig_json, cluster
174 | FROM toots WHERE id = ?
175 | ''', (id,))
176 |
177 | row = c.fetchone()
178 | if row:
179 | toot = cls(
180 | id=row[0],
181 | content=row[1],
182 | author=row[2],
183 | url=row[3],
184 | created_at=row[4],
185 | embedding=np.frombuffer(row[5]) if row[5] else None,
186 | orig_json=row[6],
187 | cluster=row[7], # Added cluster property
188 | )
189 | return toot
190 | return None
191 |
192 | @staticmethod
193 | def get_latest_date() -> datetime.datetime | None:
194 | migrations.create_database()
195 | with config.ConfigHandler.open_db() as conn:
196 | c = conn.cursor()
197 |
198 | c.execute('''
199 | SELECT MAX(created_at) FROM toots
200 | -- fix issue where only part of the timeline is downloaded after an error
201 | WHERE embedding IS NOT NULL
202 | ''')
203 |
204 | result = c.fetchone()
205 | latest_date = result[0] if result[0] else None
206 |
207 | if isinstance(latest_date, str):
208 | try:
209 | latest_date = datetime.datetime.strptime(latest_date, "%Y-%m-%d %H:%M:%S.%f")
210 | except ValueError:
211 | latest_date = datetime.datetime.strptime(latest_date, "%Y-%m-%d %H:%M:%S")
212 | return latest_date
213 |
214 | @classmethod
215 | def from_dict(cls, data):
216 | import json
217 |
218 | if data.get("reblog"):
219 | return cls.from_dict(data["reblog"])
220 |
221 | return cls(
222 | content=data.get("content"),
223 | author=data.get("account", {}).get("acct"),
224 | url=data.get("url"),
225 | created_at=datetime.datetime.strptime(data.get("created_at"), "%Y-%m-%dT%H:%M:%S.%fZ"),
226 | orig_json=json.dumps(data),
227 | )
228 |
229 | def do_star(self):
230 | print("star", self.url)
231 |
232 | def do_boost(self):
233 | print("boost", self.url)
234 |
235 |
236 | def get_toots_since(since: datetime.datetime, session_id: str):
237 | assert isinstance(since, datetime.datetime), type(since)
238 | migrations.create_database()
239 | download_timeline(since, session_id)
240 | return Toot.get_toots_since(since)
241 |
242 |
243 | def download_timeline(since: datetime.datetime, session_id: str):
244 | last_date = Toot.get_latest_date()
245 | logger.info(f"last toot date: {last_date}")
246 | last_date = last_date or since
247 | earliest_date = None
248 | buffer: list[Toot] = []
249 | last_id = ""
250 | curr_url = f"{config.ConfigHandler.MASTO_BASE}/api/v1/timelines/home?limit=40"
251 | while not earliest_date or earliest_date > last_date:
252 | response = requests.get(curr_url, headers=config.headers())
253 | response.raise_for_status()
254 | json = response.json()
255 | if not json:
256 | logger.info("No more toots")
257 | break
258 | if len(json) > 1:
259 | last_id = json[-1]["id"]
260 | logger.info(f"Got {len(json)} toots; earliest={earliest_date.isoformat() if earliest_date else None}, last_id={last_id}")
261 | for toot_dict in json:
262 | toot = Toot.from_dict(toot_dict)
263 | earliest_date = toot.created_at if not earliest_date else min(earliest_date, datetime.datetime.strptime(toot_dict["created_at"], "%Y-%m-%dT%H:%M:%S.%fZ"))
264 | buffer.append(toot)
265 |
266 | if "next" in response.links:
267 | curr_url = response.links["next"]["url"]
268 | else:
269 | break
270 | logger.info(f"done with toots; earliest={earliest_date.isoformat() if earliest_date else None}, last_date: {last_date.isoformat() if last_date else None}")
271 |
272 | page_size = 50
273 | if len(buffer) > 0:
274 | num_pages = (len(buffer) + page_size - 1) // page_size  # ceiling division; avoids an empty trailing page
275 | else:
276 | num_pages = 0
277 | for page in range(num_pages):
278 | start_index = page * page_size
279 | end_index = start_index + page_size
280 | page_toots = buffer[start_index:end_index]
281 |
282 | # Create embeddings for this page of toots, then save them
283 | _create_embeddings(page_toots, session_id)
284 | with config.ConfigHandler.open_db() as conn:
285 | for toot in page_toots:
286 | toot.save(init_conn=conn)
287 |
288 | def _prepare_text(text: str) -> str:
289 | return html2text.html2text(text)[:1000]
290 |
291 | def _create_embeddings(toots: list[Toot], session_id: str):
292 | # Skip toots that have no content (nothing to embed)
293 | toots = [t for t in toots if t.content]
294 |
295 | # Call the llm embedding API to create embeddings
296 | # bugfix: The overall batch size seems to exceed the model's limit, so we need to split the batch into smaller chunks
297 | emb_model = llm.get_embedding_model(config.ConfigHandler.EMBEDDING_MODEL(session_id).name)
298 | total_size = 0
299 | batch = []
300 | embeddings = []
301 | measure = tiktoken.encoding_for_model("gpt-3.5-turbo")
302 | for toot in toots:
303 | text = _prepare_text(toot.content)
304 | new_tokens = len(measure.encode(text))
305 | if total_size + new_tokens > 8000:
306 | embeddings.extend(emb_model.embed_batch(batch))
307 | batch.clear()
308 | total_size = 0
309 |
310 | batch.append(text)
311 | total_size += new_tokens
312 | if len(batch) > 0:
313 | embeddings.extend(emb_model.embed_batch(batch))
314 | batch.clear()
315 |
316 | # Attach each embedding to its toot (same order as the filtered list)
317 | print(f"got {len(embeddings)} embeddings")
318 | for i, toot in enumerate(toots):
319 | toot.embedding = np.array(embeddings[i])
320 |
321 | # Return the toots, now with embeddings attached
322 | return toots
323 |
324 |
325 | class Settings(BaseModel):
326 | embedding_model: str | None = None
327 | summarize_model: str | None = None
328 |
329 |
330 | class Session(BaseModel):
331 | id: str
332 | algorithm_spec: str | None = None
333 | algorithm: bytes | None = None
334 | ui_settings: str | None = None
335 | settings: Settings
336 | name: str
337 |
338 | def set_ui_settings(self, ui_settings: dict[str, str]):
339 | self.ui_settings = json.dumps(ui_settings)
340 | self.save()
341 |
342 | def get_ui_settings(self) -> dict[str, str]:
343 | return json.loads(self.ui_settings or "{}")
344 |
345 | def get_algorithm_type(self) -> Type["algorithm.BaseAlgorithm"] | None:
346 | try:
347 | spec = json.loads(self.algorithm_spec) if self.algorithm_spec else {}
348 | if "module" in spec and "class_name" in spec:
349 | mod = importlib.import_module(spec["module"])
350 | return getattr(mod, spec["class_name"])
351 | return None
352 | except ModuleNotFoundError:
353 | traceback.print_exc()
354 | return None
355 |
356 | @classmethod
357 | def get_by_id(cls, id: str) -> Optional["Session"]:
358 | migrations.create_database()
359 | migrations.create_session_table()
360 | with config.ConfigHandler.open_db() as conn:
361 | print(f"Getting session; path={config.get_db_path(conn)}")
362 | c = conn.cursor()
363 |
364 | c.execute('''
365 | SELECT id, algorithm_spec, algorithm, ui_settings, settings, name FROM sessions WHERE id = ?
366 | ''', (id,))
367 |
368 | row = c.fetchone()
369 | if row:
370 | session = cls(
371 | id=row[0],
372 | algorithm_spec=row[1],
373 | algorithm=row[2],
374 | ui_settings=row[3],
375 | settings=Settings(**json.loads(row[4] or "{}")),
376 | name=row[5],
377 | )
378 | return session
379 | return None
380 |
381 | @classmethod
382 | def get_or_create(cls, name: str = "Main") -> "Session":
383 | migrations.create_database()
384 | migrations.create_session_table()
385 | with config.ConfigHandler.open_db() as conn:
386 | c = conn.cursor()
387 | c.execute(""" SELECT id FROM sessions WHERE name IS NOT NULL ORDER BY name DESC LIMIT 1 """)
388 | row = c.fetchone()
389 | if row:
390 | # this is dumb. the ai did it. it's also brilliant. slightly inefficient, but whatever. clean.
391 | obj = cls.get_by_id(row[0])
392 | assert obj is not None
393 | return obj
394 | else:
395 | rand_str = "".join(random.choice(string.ascii_lowercase) for _ in range(32))
396 | obj = cls(id=rand_str, settings=Settings(), name=name)
397 | obj.save(init_conn=conn)
398 | return obj
399 |
400 | def save(self, init_conn: sqlite3.Connection | None = None) -> bool:
401 | try:
402 | if init_conn is None:
403 | conn = config.ConfigHandler.open_db()
404 | else:
405 | conn = init_conn
406 | migrations.create_database()
407 | migrations.create_session_table()
408 | c = conn.cursor()
409 |
410 | c.execute('''
411 | INSERT INTO sessions (id, algorithm_spec, algorithm, ui_settings, settings, name)
412 | VALUES (?, ?, ?, ?, ?, ?)
413 | ON CONFLICT(id) DO UPDATE
414 | SET algorithm_spec = excluded.algorithm_spec
415 | , algorithm = excluded.algorithm
416 | , ui_settings = excluded.ui_settings
417 | , settings = excluded.settings
418 | , name = excluded.name
419 | ''', (self.id, self.algorithm_spec, self.algorithm, self.ui_settings, self.settings.model_dump_json(), self.name))
420 |
421 | if init_conn is None:
422 | conn.commit()
423 | except:
424 | conn.rollback()
425 | raise
426 | return True
427 |
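
The batching in `_create_embeddings` above (accumulate texts until the next one would push the running token count over a budget, then flush) can be sketched on its own. This is a minimal sketch under assumptions: `batch_by_budget` and its parameters are hypothetical names, and `len` stands in for the tiktoken-based token count. Unlike the original loop, it never flushes an empty batch.

```python
from typing import Callable, Iterable, Iterator


def batch_by_budget(texts: Iterable[str], measure: Callable[[str], int], budget: int) -> Iterator[list[str]]:
    """Yield consecutive batches whose total measured size stays within budget."""
    batch: list[str] = []
    total = 0
    for text in texts:
        size = measure(text)
        # Flush before adding a text that would push the batch over budget.
        if batch and total + size > budget:
            yield batch
            batch, total = [], 0
        batch.append(text)
        total += size
    if batch:
        yield batch


# Character counts stand in for token counts here.
batches = list(batch_by_budget(["aa", "bbb", "c", "dddd"], len, 5))
```

A single text larger than the budget still gets a batch of its own; the sketch does not split or truncate individual items.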
--------------------------------------------------------------------------------
/fossil_mastodon/migrations.py:
--------------------------------------------------------------------------------
1 | """
2 | Migration scripts that update the SQLite schema, run at the last possible moment.
3 | There are no version numbers, so each script is responsible for "knowing" when it needs
4 | to run itself.
5 |
6 | Typically you should have a @lru_cache on each function to prevent unnecessary invocations,
7 | but also know that it'll get re-invoked every time the server restarts.
8 | """
9 | import functools
10 | import random
11 | import sqlite3
12 | import string
13 |
14 | from fossil_mastodon import config
15 |
16 | class migration:
17 | """
18 | Decorator that tracks all migration functions.
19 | """
20 | all: list["migration"] = []
21 | __counter = 0
22 |
23 | def __init__(self, func: callable):
24 | self.func = func
25 | self.cached = functools.lru_cache()(func)
26 | migration.__counter += 1
27 | self.id = migration.__counter
28 | migration.all.append(self)
29 |
30 | def __call__(self, *args, **kwargs):
31 | return self.cached(*args, **kwargs)
32 |
33 |
34 | @migration
35 | def create_database():
36 | with config.ConfigHandler.open_db() as conn:
37 | c = conn.cursor()
38 |
39 | # Create the toots table if it doesn't exist
40 | c.execute('''
41 | CREATE TABLE IF NOT EXISTS toots (
42 | id INTEGER PRIMARY KEY AUTOINCREMENT,
43 | content TEXT,
44 | author TEXT,
45 | url TEXT,
46 | created_at DATETIME,
47 | embedding BLOB,
48 | orig_json TEXT,
49 | cluster TEXT -- Added cluster column
50 | )
51 | ''')
52 |
53 | conn.commit()
54 |
55 |
56 | @migration
57 | def create_session_table():
58 | create_database()
59 | with config.ConfigHandler.open_db() as conn:
60 | c = conn.cursor()
61 |
62 | # Create the sessions table if it doesn't exist
63 | c.execute('''
64 | CREATE TABLE IF NOT EXISTS sessions (
65 | id TEXT PRIMARY KEY,
66 | algorithm_spec TEXT,
67 | algorithm BLOB,
68 | ui_settings TEXT
69 | )
70 | ''')
71 |
72 | try:
73 | c.execute('''
74 | ALTER TABLE sessions ADD COLUMN settings TEXT
75 | ''')
76 | except sqlite3.OperationalError:
77 | pass
78 |
79 | # Add session name
80 | try:
81 | c.execute('''
82 | ALTER TABLE sessions ADD COLUMN name TEXT
83 | ''')
84 | except sqlite3.OperationalError:
85 | pass
86 |
87 | c.execute("DELETE FROM sessions WHERE name IS NULL")
88 |
89 | c2 = conn.cursor()
90 | c2.execute("SELECT COUNT(*) FROM sessions")
91 | row_count = c2.fetchone()[0]
92 | if row_count == 0:
93 | rand_str = "".join(random.choice(string.ascii_lowercase) for _ in range(32))
94 | c2.execute("""
95 | INSERT INTO sessions (id, name, settings)
96 | VALUES (?, ?, '{}')
97 | """, (rand_str, "Main"))
98 |
99 | conn.commit()
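
The pattern used above (a cached, self-detecting migration that tolerates being re-run) can be tried in isolation against an in-memory database. This is a minimal sketch, not the project's code: `add_name_column` and the one-column `sessions` table are illustrative assumptions.

```python
import functools
import sqlite3

conn = sqlite3.connect(":memory:")


@functools.lru_cache()
def add_name_column() -> None:
    """Hypothetical migration: safe to call repeatedly, even after the cache is cleared."""
    conn.execute("CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY)")
    try:
        conn.execute("ALTER TABLE sessions ADD COLUMN name TEXT")
    except sqlite3.OperationalError:
        # SQLite raises this when the column already exists; treat it as "already migrated".
        pass
    conn.commit()


add_name_column()              # runs the DDL
add_name_column()              # served from the lru_cache, no SQL executed
add_name_column.cache_clear()  # simulates a server restart
add_name_column()              # runs again; the duplicate-column error is swallowed

columns = [row[1] for row in conn.execute("PRAGMA table_info(sessions)")]
```

Catching `OperationalError` broadly is convenient but also hides unrelated failures such as a locked database, so a stricter variant might inspect `PRAGMA table_info` before altering the table.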
--------------------------------------------------------------------------------
/fossil_mastodon/plugin_impl/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tkellogg/fossil/89db2fdbea96666f101e6e16e287fa23fcee0b9d/fossil_mastodon/plugin_impl/__init__.py
--------------------------------------------------------------------------------
/fossil_mastodon/plugin_impl/toot_debug.py:
--------------------------------------------------------------------------------
1 | from fastapi import responses
2 |
3 | from fossil_mastodon import plugins, core
4 |
5 |
6 | plugin = plugins.Plugin(
7 | name="Toot Debug Button",
8 | description="Adds a button to toots that prints the toot's JSON to the server's console.",
9 | )
10 |
11 |
12 | @plugin.api_operation.post("/plugins/toot_debug/{id}")
13 | async def toots_debug(id: int):
14 | toot = core.Toot.get_by_id(id)
15 | if toot is not None:
16 | import json
17 | print(json.dumps(toot.orig_dict, indent=2))
18 | return responses.HTMLResponse("💯")
19 |
20 |
21 | @plugin.toot_display_button
22 | def get_response(toot: core.Toot, context: plugins.RenderContext) -> responses.Response:
23 | return responses.HTMLResponse(f"""
24 |
25 | """)
--------------------------------------------------------------------------------
/fossil_mastodon/plugin_impl/topic_cluster.py:
--------------------------------------------------------------------------------
1 | import functools
2 | import random
3 | import string
4 | import llm
5 | import numpy as np
6 | import pydantic
7 | import tiktoken
8 | from fastapi import Response, responses
9 | from sklearn.cluster import KMeans
10 | from tqdm import trange
11 |
12 | from fossil_mastodon import algorithm, config, core, migrations, plugins, ui
13 |
14 |
15 | plugin = plugins.Plugin(
16 | name="Topic Cluster",
17 | description="Cluster toots by topic",
18 | )
19 |
20 |
21 | class ClusterRenderer(algorithm.Renderable, pydantic.BaseModel):
22 | clusters: list[ui.TootCluster]
23 | context: plugins.RenderContext
24 |
25 | def render(self, **response_args) -> Response:
26 | toot_clusters = ui.TootClusters(clusters=self.clusters)
27 | return self.context.templates.TemplateResponse("toot_clusters.html", {
28 | "clusters": toot_clusters,
29 | **self.context.template_args(),
30 | },
31 | **response_args)
32 |
33 |
34 | @migrations.migration
35 | def _create_table():
36 | with config.ConfigHandler.open_db() as conn:
37 | c = conn.cursor()
38 |
39 | # Create the topic_cluster_toots table if it doesn't exist
40 | c.execute('''
41 | CREATE TABLE IF NOT EXISTS topic_cluster_toots (
42 | id INTEGER PRIMARY KEY AUTOINCREMENT,
43 | toot_id INTEGER NOT NULL,
44 | model_version TEXT NOT NULL,
45 | cluster_id INTEGER NOT NULL,
46 | updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
47 | )
48 | ''')
49 |
50 | conn.commit()
51 |
52 |
53 | class TootModel(pydantic.BaseModel):
54 | """
55 | Cache for the cluster id of a toot. The model_version is used to invalidate the cache if
56 | the model is retrained, since that would lead to an incompatible set of clusters.
57 |
58 | We can't store this inside the model because it's dynamic and created after the model is
59 | trained.
60 | """
61 | id: int | None
62 | toot_id: int
63 | model_version: str
64 | cluster_id: int | None
65 |
66 | @classmethod
67 | def for_toots(cls, toots: list[core.Toot], model_version: str) -> list["TootModel"]:
68 | _create_table()
69 | with config.ConfigHandler.open_db() as conn:
70 | c = conn.cursor()
71 | c.execute('''
72 | SELECT id, toot_id, model_version, cluster_id
73 | FROM topic_cluster_toots
74 | WHERE model_version = ?
75 | ''', (model_version, ))
76 | from_db = {row[1]: cls(id=row[0], toot_id=row[1], model_version=row[2], cluster_id=row[3]) for row in c.fetchall()}
77 | return [
78 | from_db.get(
79 | toot.id,
80 | cls(id=None, toot_id=toot.id, model_version=model_version, cluster_id=None),
81 | )
82 | for toot in toots
83 | ]
84 |
85 | def save(self):
86 | _create_table()
87 | if self.cluster_id is None:
88 | raise ValueError("Cannot save a toot model without a cluster_id")
89 |
90 | if isinstance(self.cluster_id, np.number):
91 | raise ValueError("cluster_id must be an int, not a numpy type")
92 |
93 | with config.ConfigHandler.open_db() as conn:
94 | c = conn.cursor()
95 | if self.id is None:
96 | c.execute('''
97 | INSERT INTO topic_cluster_toots (toot_id, model_version, cluster_id)
98 | VALUES (?, ?, ?)
99 | ''', (self.toot_id, self.model_version, self.cluster_id))
100 | self.id = c.lastrowid
101 | else:
102 | c.execute('''
103 | UPDATE topic_cluster_toots
104 | SET cluster_id = ?, updated_at = CURRENT_TIMESTAMP
105 | WHERE id = ?
106 | ''', (self.cluster_id, self.id))
107 | conn.commit()
108 |
109 |
110 | @plugin.algorithm
111 | class TopicCluster(algorithm.BaseAlgorithm):
112 | def __init__(self, kmeans: KMeans, labels: dict[int, str], model_version: str | None = None):
113 | self.kmeans = kmeans
114 | self.labels = labels
115 | self.model_version = model_version
116 |
117 | def render(self, toots: list[core.Toot], context: plugins.RenderContext) -> ClusterRenderer:
118 | before = len(toots)
119 | toots = [toot for toot in toots if toot.embedding is not None]
120 | toot_models = TootModel.for_toots(toots, model_version=self.model_version)
121 | print("Removed", before - len(toots), "toots with no embedding (probably image-only).", f"{len(toots)} toots remaining.")
122 |
123 | # Assign clusters to the uncached toots
124 | unassigned = [toot for toot, toot_model in zip(toots, toot_models) if toot_model.cluster_id is None]
125 | if len(unassigned) > 0:
126 | unassigned_models = [toot_model for toot_model in toot_models if toot_model.cluster_id is None]
127 | cluster_indices = self.kmeans.predict(np.array([toot.embedding for toot in unassigned]))
128 | print(f"Assigning clusters for {len(unassigned)} toots; model_version={self.model_version}")
129 | for toot, cluster_index, toot_model in zip(unassigned, cluster_indices, unassigned_models):
130 | toot.cluster = self.labels[cluster_index]
131 | toot_model.cluster_id = int(cluster_index)
132 | toot_model.save()
133 |
134 | toot_clusters = ui.TootClusters(
135 | clusters=[
136 | ui.TootCluster(
137 | id=i_cluster,
138 | name=cluster_label,
139 | toots=[toot for toot, toot_model in zip(toots, toot_models) if toot_model.cluster_id == i_cluster],
140 | )
141 | for i_cluster, cluster_label in self.labels.items()
142 | ]
143 | )
144 | return ClusterRenderer(clusters=toot_clusters.clusters, context=context)
145 |
146 | @classmethod
147 | def train(cls, context: algorithm.TrainContext, args: dict[str, str]) -> "TopicCluster":
148 | toots = [toot for toot in context.get_toots() if toot.embedding is not None]
149 |
150 | n_clusters = int(args["num_clusters"])
151 | if len(toots) < n_clusters:
152 | return cls(kmeans=NoopKMeans(n_clusters=1), labels={0: "All toots"})
153 |
154 | embeddings = np.array([toot.embedding for toot in toots])
155 | kmeans = KMeans(n_clusters=n_clusters)
156 | cluster_labels = kmeans.fit_predict(embeddings)
157 |
158 | labels: dict[int, str] = {}
159 | model = llm.get_model(config.ConfigHandler.SUMMARIZE_MODEL(context.session_id).name)
160 | for i_clusters in trange(n_clusters):
161 | clustered_toots = [toot for toot, cluster_label in zip(toots, cluster_labels) if cluster_label == i_clusters]
162 | combined_text = "\n\n".join([toot.content for toot in clustered_toots])
163 |
164 | # Use the summarizing model to summarize the combined text
165 | prompt = f"Create a single label that describes all of these related tweets, make it succinct but descriptive. The label should describe all {len(clustered_toots)} of these\n\n{combined_text}"
166 | summary = model.prompt(reduce_size(context.session_id, prompt)).text().strip()
167 | labels[int(i_clusters)] = summary
168 |
169 | model_version = "".join(random.choice(string.ascii_lowercase) for _ in range(12))
170 | return cls(kmeans=kmeans, labels=labels, model_version=model_version)
171 |
172 | @staticmethod
173 | def render_model_params(context: plugins.RenderContext) -> Response:
174 | default = context.session.get_ui_settings().get("num_clusters", "15")
175 | return responses.HTMLResponse(f"""
176 |
177 |
178 | {default} clusters
179 |
180 | """)
181 |
182 | def get_encoding(session_id: str):
183 | try:
184 | return tiktoken.encoding_for_model(config.ConfigHandler.SUMMARIZE_MODEL(session_id).name)
185 | except KeyError:
186 | encoding_name = tiktoken.list_encoding_names()[-1]
187 | return tiktoken.get_encoding(encoding_name)
188 |
189 | def reduce_size(session_id: str, text: str, model_limit: int = -1, est_output_size: int = 500) -> str:
190 | if model_limit < 0:
191 | model_limit = config.ConfigHandler.SUMMARIZE_MODEL(session_id).context_length
192 | tokens = get_encoding(session_id).encode(text)
193 | return get_encoding(session_id).decode(tokens[:model_limit - est_output_size])
194 |
195 |
196 | class NoopKMeans(KMeans):
197 | def predict(self, X, y=None, sample_weight=None):
198 | return np.zeros(len(X), dtype=int)
199 |
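
The truncation in `reduce_size` above keeps only as many prompt tokens as fit in the model's context window after reserving room for the reply. A minimal sketch of that arithmetic follows, assuming a hypothetical `truncate_to_budget` helper that works on an already-tokenized list; it also clamps the budget at zero, which the original slice does not do.

```python
def truncate_to_budget(tokens: list[int], model_limit: int, est_output_size: int) -> list[int]:
    """Keep the leading tokens that fit once est_output_size tokens are reserved for output."""
    # Clamp at zero: a negative slice bound would otherwise drop tokens from the end
    # instead of returning nothing.
    budget = max(model_limit - est_output_size, 0)
    return tokens[:budget]


kept = truncate_to_budget(list(range(10)), model_limit=8, est_output_size=3)
nothing = truncate_to_budget(list(range(10)), model_limit=2, est_output_size=5)
```

With the defaults in the source (a 2048-token fallback context length and 500 reserved tokens) the budget stays positive, so the clamp only matters for unusually small `model_limit` values.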
--------------------------------------------------------------------------------
/fossil_mastodon/plugins.py:
--------------------------------------------------------------------------------
1 | import abc
2 | import contextlib
3 | import functools
4 | import inspect
5 | import logging
6 | import pathlib
7 | import re
8 | import sys
9 | import traceback
10 | from typing import Callable, Type, TYPE_CHECKING
11 |
12 | from fastapi import FastAPI, Request, responses, templating
13 | import pkg_resources
14 | import pydantic
15 |
16 | from fossil_mastodon import algorithm, config, ui, core
17 |
18 | if TYPE_CHECKING:
19 | from fossil_mastodon import server
20 |
21 |
22 | logger = logging.getLogger(__name__)
23 |
24 |
25 | def title_case_to_spaced(string):
26 | # The regex pattern looks for any lowercase letter followed by an uppercase letter
27 | # and inserts a space between them
28 | return re.sub(r'(?<=[a-z])(?=[A-Z])', ' ', string)
29 |
30 |
31 | TootDisplayFn = Callable[[core.Toot, "RenderContext"], responses.Response]
32 | class TootDisplayPlugin(pydantic.BaseModel):
33 | fn: TootDisplayFn
34 | fn_name: str
35 |
36 | def render_str(self, toot: core.Toot, context: "RenderContext") -> str:
37 | obj = self.fn(toot, context)
38 | content = obj.body.decode("utf-8")
39 | return content
40 |
41 |
42 | class RenderContext(pydantic.BaseModel):
43 | """
44 | A context object for rendering a template.
45 | """
46 | class Config:
47 | arbitrary_types_allowed = True
48 | templates: templating.Jinja2Templates
49 | request: Request
50 | link_style: ui.LinkStyle
51 | session: core.Session
52 |
53 | def template_args(self) -> dict:
54 | return {
55 | "request": self.request,
56 | "link_style": self.link_style,
57 | "ctx": self,
58 | }
59 |
60 | def render_toot_display_plugins(self, toot: core.Toot) -> str:
61 | return "".join(
62 | plugin.render_str(toot, self)
63 | for plugin in get_toot_display_plugins()
64 | )
65 |
66 |
67 | _app: FastAPI | None = None
68 |
69 | class _MenuItem(pydantic.BaseModel):
70 | html: str
71 | url: str
72 |
73 |
74 | class Plugin(pydantic.BaseModel):
75 | """
76 | Plugin registration API
77 |
78 | Example:
79 |
80 | plugin = Plugin(name="My Plugin", description="Add button to toot that triggers an API POST operation")
81 |
82 | @plugin.api_operation.post("/my_plugin")
83 | def my_plugin(request: Request):
84 | return responses.HTMLResponse("💯")
85 |
86 | @plugin.toot_display_button
87 | def my_toot_display(toot: core.Toot, context: RenderContext):
88 | return responses.HTMLResponse("💯")
89 |
90 | """
91 | name: str
92 | display_name: str | None = None
93 | description: str | None = None
94 | author: str | None = None
95 | author_url: str | None = None
96 | enabled_by_default: bool = True
97 | _toot_display_buttons: list[TootDisplayPlugin] = pydantic.PrivateAttr(default_factory=list)
98 | _algorithms: list[Type[algorithm.BaseAlgorithm]] = pydantic.PrivateAttr(default_factory=list)
99 | _lifecycle_hooks: list[callable] = pydantic.PrivateAttr(default_factory=list)
100 | _menu_items: list[_MenuItem] = pydantic.PrivateAttr(default_factory=list)
101 | _extra_nav: list[str] = pydantic.PrivateAttr(default_factory=list)
102 | _head_html: list[str] = pydantic.PrivateAttr(default_factory=list)
103 |
104 | @pydantic.validator("display_name", always=True)
105 | def _set_display_name(cls, v, values):
106 | return v or values["name"]
107 |
108 | @property
109 | def api_operation(self) -> FastAPI:
110 | assert _app is not None
111 | return _app
112 |
113 | @property
114 | def TemplateResponse(self) -> Type["server.templates.TemplateResponse"]:
115 | from fossil_mastodon import server
116 | return server.templates.TemplateResponse
117 |
118 | def toot_display_button(self, impl: TootDisplayFn) -> TootDisplayFn:
119 | """
120 | Decorator for adding a button to the toot display UI. This function should return a
121 | fastapi.responses.Response object. The result will be extracted and inserted into the
122 | toot display UI.
123 | """
124 | name = impl.__name__
125 |
126 | @functools.wraps(impl)
127 | def wrapper(toot: core.Toot, context: RenderContext):
128 | try:
129 | return impl(toot, context)
130 | except TypeError as e:
131 | raise BadPluginFunction(self, impl, "example_function(toot: fossil_mastodon.core.Toot, context: fossil_mastodon.plugins.RenderContext)") from e
132 | except Exception as e:
133 | import inspect
134 | print(inspect.signature(impl))
135 | raise RuntimeError(f"Error in toot display plugin '{self.name}', function '{name}'") from e
136 |
137 | self._toot_display_buttons.append(TootDisplayPlugin(fn=wrapper, fn_name=name))
138 | return wrapper
139 |
140 | def algorithm(self, algo: Type[algorithm.BaseAlgorithm]) -> Type[algorithm.BaseAlgorithm]:
141 | """
142 | Decorator for adding an algorithm class.
143 | """
144 | if not issubclass(algo, algorithm.BaseAlgorithm):
145 | raise ValueError(f"Algorithm {algo} is not a subclass of algorithm.BaseAlgorithm")
146 | self._algorithms.append(algo)
147 | algo.plugin = self
148 | return algo
149 |
150 | def lifecycle_hook(self, fn: callable) -> callable:
151 | """
152 | Decorator for adding a lifecycle hook. Lifecycle hooks are called when the server starts
153 | up, and can be used to perform initialization tasks.
154 | """
155 | self._lifecycle_hooks.append(fn)
156 | return fn
157 |
158 | def add_templates_dir(self, path: pathlib.Path):
159 | """
160 | Add a directory of templates to the plugin. These will be accessible from FastAPI response
161 | objects. For example, if you add a directory of templates at `/templates`, then you
162 | can return a template from a FastAPI route like this:
163 |
164 | @plugin.api_operation.get("/my_route")
165 | def my_route(request: Request):
166 | return plugin.TemplateResponse("my_template.html", {"request": request})
167 | """
168 | config.ASSETS.add_dir(path, "templates")
169 |
170 | def add_static_dir(self, path: pathlib.Path):
171 | """
172 | Add a directory of static files to the plugin. These will be downloadable by the browser at
173 | the path `GET /static/example.css`, assuming the example.css exists at `/example.css`
174 | as a local path.
175 | """
176 | config.ASSETS.add_dir(path, "static")
177 |
178 | def add_menu_item(self, raw_html: str, url="#"):
179 | self._menu_items.append(_MenuItem(html=raw_html, url=url))
180 |
181 | def add_extra_nav(self, raw_html: str):
182 | self._extra_nav.append(raw_html)
183 |
184 | def add_head_html(self, raw_html: str):
185 | self._head_html.append(raw_html)
186 |
187 |
188 | def init_plugins(app: FastAPI):
189 | global _app
190 | _app = app
191 | get_plugins()
192 |
193 |
194 | @functools.lru_cache
195 | def get_plugins() -> list[Plugin]:
196 | if _app is None:
197 | raise RuntimeError("Plugins not initialized")
198 |
199 | plugins = []
200 | for entry_point in pkg_resources.iter_entry_points("fossil_mastodon.plugins"):
201 | print("Loading plugin", entry_point.name)
202 | try:
203 | plugin = entry_point.load()
204 | if isinstance(plugin, Plugin):
205 | plugins.append(plugin)
206 | else:
207 | print(f"Error loading plugin '{entry_point.name}': not an instance of Plugin")
208 | except Exception:
209 | print(f"Error loading plugin {entry_point.name}")
210 | traceback.print_exc()
211 | return plugins
212 |
213 |
214 | def get_toot_display_plugins() -> list[TootDisplayPlugin]:
215 | return [
216 | b
217 | for p in get_plugins()
218 | for b in p._toot_display_buttons
219 | ]
220 |
221 |
222 | def get_algorithms() -> list[Type[algorithm.BaseAlgorithm]]:
223 | return [
224 | algo
225 | for p in get_plugins()
226 | for algo in p._algorithms
227 | ]
228 |
229 |
230 | def get_menu_items() -> list[_MenuItem]:
231 | return [
232 | item
233 | for p in get_plugins()
234 | for item in p._menu_items
235 | ]
236 |
237 |
238 | def get_extra_nav() -> list[str]:
239 | return [
240 | algo
241 | for p in get_plugins()
242 | for algo in p._extra_nav
243 | ]
244 |
245 |
246 | def get_head_html() -> list[str]:
247 | return [
248 | algo
249 | for p in get_plugins()
250 | for algo in p._head_html
251 | ]
252 |
253 |
254 | def get_lifecycle_hooks() -> list[callable]:
255 | return [
256 | contextlib.contextmanager(hook)
257 | for p in get_plugins()
258 | for hook in p._lifecycle_hooks
259 | ]
260 |
261 | @contextlib.asynccontextmanager
262 | async def lifespan(app: FastAPI):
263 | hooks = get_lifecycle_hooks()
264 |
265 | objects = []
266 | for hook in hooks:
267 | try:
268 | obj = hook(app)
269 | obj.__enter__()
270 | objects.append(obj)
271 | except:
272 | logger.exception(f"Error running lifecycle hook {hook}")
273 |
274 | yield
275 |
276 | exc_info = sys.exc_info()
277 | exc = exc_info[1] if exc_info else None
278 | exc_type = exc_info[0] if exc_info else None
279 | tb = exc_info[2] if exc_info else None
280 | for obj in objects:
281 | try:
282 | obj.__exit__(exc_type, exc, tb)
283 | except:
284 | logger.exception(f"Error exiting lifecycle hook {obj}")
285 |
286 |
287 | class BadPluginFunction(Exception):
288 | def __init__(self, plugin: Plugin, function: callable, expected_signature: str):
289 | super().__init__(f"Bad function call: {plugin.name}.{function.__name__} should have signature {expected_signature}")
290 | self.plugin = plugin
291 | self.function = function
292 | self.signature = inspect.signature(function)
293 | self.expected_signature = expected_signature
294 | self.function_name = function.__name__
295 |
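The lifecycle hooks above are generator functions: the code before `yield` runs at server startup and the code after it runs at shutdown. The following is a minimal standalone sketch of that pattern, not the project's implementation. It swaps the manual `__enter__`/`__exit__` bookkeeping for `contextlib.ExitStack`, which exits hooks in reverse order and does not swallow hook errors. The hook names and the `events` list are hypothetical.

```python
import contextlib
from typing import Callable, Iterator

# Records the order in which hook code runs (illustration only).
events: list[str] = []


def cache_hook(app: object) -> Iterator[None]:
    # Hypothetical hook: setup before the yield, teardown after it.
    events.append("cache:start")
    yield
    events.append("cache:stop")


def metrics_hook(app: object) -> Iterator[None]:
    # Second hypothetical hook, used to show the ordering.
    events.append("metrics:start")
    yield
    events.append("metrics:stop")


def run_hooks(app: object, hooks: list[Callable[[object], Iterator[None]]]) -> None:
    # ExitStack enters each hook in order and exits them in reverse order.
    with contextlib.ExitStack() as stack:
        for hook in hooks:
            stack.enter_context(contextlib.contextmanager(hook)(app))
        # The application would serve requests at this point.
        events.append("serving")


run_hooks(app=None, hooks=[cache_hook, metrics_hook])
print(events)
```

Exiting in reverse order matters when a later hook depends on a resource opened by an earlier one.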
--------------------------------------------------------------------------------
/fossil_mastodon/science.py:
--------------------------------------------------------------------------------
1 | import llm
2 | import numpy as np
3 | import openai
4 | import tiktoken
5 | from sklearn.cluster import KMeans
6 |
7 | from . import config, core
8 |
9 |
10 | def assign_clusters(session_id: str, toots: list[core.Toot], n_clusters: int = 5):
11 | # meh, ignore toots without content. I think this might be just an image, not sure
12 | toots = [toot for toot in toots if toot.embedding is not None]
13 |
14 | # Perform k-means clustering on the embeddings
15 | embeddings = np.array([toot.embedding for toot in toots])
16 | kmeans = KMeans(n_clusters=n_clusters)
17 | cluster_labels = kmeans.fit_predict(embeddings)
18 |
19 | client = openai.OpenAI(api_key=config.ConfigHandler.OPENAI_KEY)
20 | for i_clusters in range(n_clusters):
21 | clustered_toots = [toot for toot, cluster_label in zip(toots, cluster_labels) if cluster_label == i_clusters]
22 | combined_text = "\n\n".join([toot.content for toot in clustered_toots])
23 |
24 | # Use the configured summarizing model to summarize the combined text
25 | prompt = f"Create a single label that describes all of these related tweets, make it succinct but descriptive. The label should describe all {len(clustered_toots)} of these\n\n{combined_text}"
26 | model = llm.get_model(config.ConfigHandler.SUMMARIZE_MODEL(session_id).name)
27 | summary = model.prompt(prompt).text()
28 |
29 | # Do something with the summary
30 | for toot, cluster_label in zip(toots, cluster_labels):
31 | if cluster_label == i_clusters:
32 | toot.cluster = summary
33 |
34 | def get_encoding(session_id: str):
35 | try:
36 | return tiktoken.encoding_for_model(config.ConfigHandler.SUMMARIZE_MODEL(session_id).name)
37 | except KeyError:
38 | encoding_name = tiktoken.list_encoding_names()[-1]
39 | return tiktoken.get_encoding(encoding_name)
40 |
41 | def reduce_size(session_id: str, text: str, model_limit: int = -1, est_output_size: int = 500) -> str:
42 | if model_limit < 0:
43 | model_limit = config.ConfigHandler.SUMMARIZE_MODEL(session_id).context_length
44 | tokens = get_encoding(session_id).encode(text)
45 | return get_encoding(session_id).decode(tokens[:model_limit - est_output_size])
46 |
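`reduce_size` keeps a prompt within the model's context window by truncating it to `model_limit - est_output_size` tokens, leaving room for the reply. The sketch below illustrates only that budgeting arithmetic. It assumes a whitespace tokenizer as a stand-in for tiktoken, and it clamps the budget at zero, which the function above does not do. The name `reduce_size_sketch` is hypothetical.

```python
def reduce_size_sketch(text: str, model_limit: int, est_output_size: int = 500) -> str:
    # Stand-in tokenizer (assumption): one token per whitespace-separated word.
    tokens = text.split()
    # Reserve est_output_size tokens for the model's reply; never go negative,
    # because a negative slice bound would trim from the end instead.
    budget = max(model_limit - est_output_size, 0)
    return " ".join(tokens[:budget])


# A limit of 8 tokens with 5 reserved for output leaves 3 tokens of prompt.
short = reduce_size_sketch("a b c d e f", model_limit=8, est_output_size=5)
print(short)
```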
--------------------------------------------------------------------------------
/fossil_mastodon/server.py:
--------------------------------------------------------------------------------
1 | """
2 | A FastAPI HTML server.
3 |
4 | The streamlit version had issues around state management and was generally slow
5 | and inflexible. This gives us a lot more control.
6 | """
7 | import datetime
8 | import importlib
9 | import json
10 | import logging
11 | import random
12 | import string
13 | from typing import Annotated, Type
14 |
15 | import llm
16 | import requests
17 | from fastapi import FastAPI, Form, HTTPException, Request, responses, staticfiles, templating
18 |
19 | from fossil_mastodon import algorithm, config, core, migrations, plugins, ui
20 |
21 |
22 | logger = logging.getLogger(__name__)
23 |
24 |
25 | app = FastAPI(lifespan=plugins.lifespan)
26 |
27 |
28 | app.mount("/static", staticfiles.StaticFiles(directory=config.ASSETS.assets_path), name="static")
29 | templates = templating.Jinja2Templates(directory=config.ASSETS.templates_path)
30 | print("using template directory", config.ASSETS.templates_path)
31 | templates.env.filters["rel_date"] = ui.time_ago
32 |
33 |
34 | @app.middleware("http")
35 | async def session_middleware(request: Request, call_next):
36 | """
37 | Called before each request. Sets up the session and saves it to the database.
38 | """
39 | session_id = request.cookies.get("fossil_session_id")
40 | session = core.Session.get_by_id(session_id) if session_id else None
41 | if session is None:
42 | session = core.Session.get_or_create()
43 | session.save()
44 | request.state.session = session
45 | response = await call_next(request)
46 | response.set_cookie("fossil_session_id", session.id)
47 | return response
48 | else:
49 | request.state.session = session
50 | return await call_next(request)
51 |
52 |
53 | @app.get("/")
54 | async def root(request: Request):
55 | session: core.Session = request.state.session
56 | ctx = plugins.RenderContext(
57 | templates=templates,
58 | request=request,
59 | link_style=ui.LinkStyle("Desktop"),
60 | session=session,
61 | )
62 |
63 | # GUARD: ensure some algorithms are installed
64 | algo_list = plugins.get_algorithms()
65 | if len(algo_list) == 0:
66 | print(f"No algorithms found (num plugins: {len(plugins.get_plugins())})")
67 | for plugin in plugins.get_plugins():
68 | print(f"Plugin ({plugin.name})", plugin)
69 | return templates.TemplateResponse("no_algorithm.html", {
70 | "request": request,
71 | })
72 |
73 | # Render the UI
74 | algo = session.get_algorithm_type() or algo_list[0]
75 | return templates.TemplateResponse("index.html", {
76 | "request": request,
77 | "model_params": algo.render_model_params(ctx).body.decode("utf-8"),
78 | "ui_settings": session.get_ui_settings(),
79 | "selected_algorithm": algo,
80 | "algorithms": [
81 | {"name": algo.plugin.name, "display_name": algo.plugin.display_name}
82 | for algo in plugins.get_algorithms()
83 | ],
84 | })
85 |
86 |
87 | @app.get("/toots")
88 | async def toots():
89 | return staticfiles.FileResponse("public/toots.html")
90 |
91 |
92 | @app.post("/toots/download")
93 | async def toots_download(request: Request):
94 | # init
95 | migrations.create_database()
96 | session: core.Session = request.state.session
97 | algorithm_spec: dict = json.loads(session.algorithm_spec) if session.algorithm_spec else {}
98 |
99 | # first page load calls this with display-only=true to load what was loaded last time
100 | if request.query_params.get("display-only", "") != "true":
101 | # download
102 | core.download_timeline(datetime.datetime.utcnow() - datetime.timedelta(days=1), session.id)
103 |
104 | # render
105 | body_params: dict[str, str] = dict((await request.form()))
106 | session.set_ui_settings(body_params)
107 | print("algorithm_spec", algorithm_spec)
108 | if "module" in algorithm_spec and "class_name" in algorithm_spec:
109 | mod = importlib.import_module(algorithm_spec["module"])
110 | model_class: Type[algorithm.BaseAlgorithm] = getattr(mod, algorithm_spec["class_name"])
111 | model: algorithm.BaseAlgorithm = model_class.deserialize(session.algorithm)
112 | timespan = ui.timedelta(body_params["time_span"])
113 | timeline = core.Toot.get_toots_since(datetime.datetime.utcnow() - timespan)
114 | renderable = model.render(timeline, plugins.RenderContext(
115 | templates=templates,
116 | request=request,
117 | link_style=ui.LinkStyle(body_params["link_style"] if "link_style" in body_params else "Desktop"),
118 | session=session,
119 | ))
120 | return renderable.render()
121 | else:
122 | return responses.HTMLResponse("No Toots 😥")
123 |
124 |
125 | @app.post("/toots/train")
126 | async def toots_train(
127 | link_style: Annotated[str, Form()],
128 | time_span: Annotated[str, Form()],
129 | request: Request,
130 | ):
131 | context = algorithm.TrainContext(
132 | end_time=datetime.datetime.utcnow(),
133 | timedelta=ui.timedelta(time_span),
134 | session_id=request.state.session.id
135 | )
136 |
137 | algo_kwargs = {k: v for k, v in dict((await request.form())).items()
138 | if k not in {"link_style", "time_span"}}
139 | print("Algorithm kwargs:", algo_kwargs)
140 |
141 | # train
142 | session: core.Session = request.state.session
143 | algo = session.get_algorithm_type() or plugins.get_algorithms()[0]
144 | algo.model_version = "".join(random.choices(string.ascii_letters + string.digits, k=12))
145 | model = algo.train(context, algo_kwargs)
146 | session.algorithm = model.serialize()
147 | session.algorithm_spec = json.dumps({
148 | "module": model.__class__.__module__,
149 | "class_name": model.__class__.__qualname__,
150 | "kwargs": algo_kwargs,
151 | })
152 | session.save()
153 |
154 | # render
155 | timeline = core.Toot.get_toots_since(datetime.datetime.utcnow() - ui.timedelta(time_span))
156 | renderable = model.render(timeline, plugins.RenderContext(
157 | templates=templates,
158 | request=request,
159 | link_style=ui.LinkStyle(link_style),
160 | session=session,
161 | ))
162 | try:
163 | return renderable.render()
164 | except plugins.BadPluginFunction as ex:
165 | return templates.TemplateResponse("bad_plugin.html", { "request": request, "ex": ex })
166 |
167 |
168 | @app.get("/algorithm/{name}/form")
169 | async def algorithm_form(name: str, request: Request):
170 | session: core.Session = request.state.session
171 | algo_type = session.get_algorithm_type() or plugins.get_algorithms()[0]
172 | ctx = plugins.RenderContext(
173 | templates=templates,
174 | request=request,
175 | link_style=ui.LinkStyle(session.get_ui_settings().get("link_style", "Desktop")),
176 | session=session,
177 | )
178 | return algo_type.render_model_params(ctx)
179 |
180 |
181 | @app.get("/settings")
182 | async def get_settings(request: Request):
183 | session: core.Session = request.state.session
184 | keys = {"openai": "", **llm.load_keys()}
185 | return templates.TemplateResponse("settings.html", {
186 | "request": request,
187 | "settings": session.settings,
188 | "embedding_models": config.get_installed_embedding_models(),
189 | "embedding_model": session.settings.embedding_model,
190 | "summarize_models": config.get_installed_llms(),
191 | "summarize_model": session.settings.summarize_model,
192 | "keys": keys,
193 | })
194 |
195 | @app.post("/settings")
196 | async def post_settings(settings: core.Settings, request: Request):
197 | session: core.Session = request.state.session
198 | session.settings = settings
199 | session.save()
200 | return responses.HTMLResponse("👍")
201 |
202 | @app.post("/keys")
203 | async def post_keys(request: Request):
204 | body_params: dict[str, str] = dict((await request.form()))
205 | key_path = llm.user_dir() / "keys.json"
206 | key_path.write_text(json.dumps(body_params))
206 | return responses.HTMLResponse("👍")
208 |
209 |
210 | @app.post("/toots/{id}/debug")
211 | async def toots_debug(id: int):
212 | toot = core.Toot.get_by_id(id)
213 | if toot is not None:
214 | import json
215 | print(json.dumps(toot.orig_dict, indent=2))
216 | return responses.HTMLResponse("💯")
217 |
218 | @app.post("/toots/{id}/boost")
219 | async def toots_boost(id: int):
220 | toot = core.Toot.get_by_id(id)
221 | if toot is not None:
222 | url = f'{config.ConfigHandler.MASTO_BASE}/api/v1/statuses/{toot.toot_id}/reblog'
223 | data = {
224 | 'visibility': 'public'
225 | }
226 | response = requests.post(url, json=data, headers=config.headers())
227 | try:
228 | response.raise_for_status()
229 | return responses.HTMLResponse("🚀")
230 | except:
231 | print("ERROR:", response.json())
232 | raise
233 | raise HTTPException(status_code=404, detail="Toot not found")
234 |
235 | @app.post("/toots/{id}/favorite")
236 | async def toots_favorite(id: int):
237 | toot = core.Toot.get_by_id(id)
238 | if toot is not None:
239 | url = f'{config.ConfigHandler.MASTO_BASE}/api/v1/statuses/{toot.toot_id}/favourite'
240 | response = requests.post(url, headers=config.headers())
241 | try:
242 | response.raise_for_status()
243 | return responses.HTMLResponse("💫")
244 | except:
245 | print("ERROR:", response.json())
246 | raise
247 | raise HTTPException(status_code=404, detail="Toot not found")
248 |
249 | templates.env.globals["extra_menu_items"] = plugins.get_menu_items
250 | templates.env.globals["head_html"] = plugins.get_head_html
251 | templates.env.globals["extra_nav"] = plugins.get_extra_nav
252 |
253 | # this should always be the last line of this file
254 | plugins.init_plugins(app)
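`toots_train` stores the trained model's class as a JSON `algorithm_spec` (module plus qualified class name), and `toots_download` resolves it again with `importlib`. The round trip can be sketched in isolation as below. The helpers `to_spec` and `from_spec` are hypothetical names, and `fractions.Fraction` stands in for an algorithm class.

```python
import fractions
import importlib
import json


def to_spec(cls: type, kwargs: dict) -> str:
    # Record where the class lives so it can be re-imported later.
    return json.dumps({
        "module": cls.__module__,
        "class_name": cls.__qualname__,
        "kwargs": kwargs,
    })


def from_spec(spec_json: str) -> type:
    # Import the module by name, then look the class up as an attribute.
    spec = json.loads(spec_json)
    mod = importlib.import_module(spec["module"])
    return getattr(mod, spec["class_name"])


# fractions.Fraction is only a stand-in for a real algorithm class.
spec = to_spec(fractions.Fraction, {"num_clusters": "15"})
restored = from_spec(spec)
print(restored)
```

A single `getattr` only resolves top-level classes. For a nested class, `__qualname__` contains dots and the lookup would fail, so this scheme assumes algorithm classes are defined at module level.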
--------------------------------------------------------------------------------
/fossil_mastodon/ui.py:
--------------------------------------------------------------------------------
1 | import datetime
2 | import re
3 | import urllib.parse
4 |
5 | import pydantic
6 | import streamlit as st
7 |
8 | from . import config, core
9 |
10 |
11 | def get_time_frame() -> datetime.timedelta:
12 | time_frame = st.radio("Show last:", ["6 hours", "day", "week"], horizontal=True)
13 |
14 | if time_frame == "6 hours":
15 | return datetime.timedelta(hours=6)
16 | elif time_frame == "day":
17 | return datetime.timedelta(days=1)
18 | elif time_frame == "week":
19 | return datetime.timedelta(weeks=1)
20 | raise ValueError("Invalid time frame")
21 |
22 |
23 | def time_ago(dt: datetime.datetime) -> str:
24 | current_time = datetime.datetime.utcnow()
25 | time_ago = current_time - dt
26 |
27 | # Convert the time difference to a readable string
28 | if time_ago < datetime.timedelta(minutes=1):
29 | time_ago_str = "just now"
30 | elif time_ago < datetime.timedelta(hours=1):
31 | minutes = int(time_ago.total_seconds() / 60)
32 | time_ago_str = f"{minutes} minutes ago"
33 | elif time_ago < datetime.timedelta(days=1):
34 | hours = int(time_ago.total_seconds() / 3600)
35 | time_ago_str = f"{hours} hours ago"
36 | else:
37 | days = time_ago.days
38 | time_ago_str = f"{days} days ago"
39 |
40 | return time_ago_str
41 |
42 |
43 | def timedelta(time_span: str) -> datetime.timedelta:
44 | hour_pattern = re.compile(r"(\d+)h")
45 | day_pattern = re.compile(r"(\d+)d")
46 | week_pattern = re.compile(r"(\d+)w")
47 | if m := hour_pattern.match(time_span):
48 | return datetime.timedelta(hours=int(m.group(1)))
49 | elif m := day_pattern.match(time_span):
50 | return datetime.timedelta(days=int(m.group(1)))
51 | elif m := week_pattern.match(time_span):
52 | return datetime.timedelta(weeks=int(m.group(1)))
53 | raise ValueError("Invalid time frame")
54 |
55 |
56 | class LinkStyle:
57 | def __init__(self, scheme: str | None = None):
58 | # ivory://acct/openURL?url=
59 | # {config.ConfigHandler.MASTO_BASE}/deck/@{toot.author}/{toot.toot_id}
60 | if scheme is None:
61 | self.scheme = st.radio("Link scheme:", ["Desktop", "Ivory", "Original"], index=1, horizontal=True)
62 | else:
63 | self.scheme = scheme
64 |
65 | def toot_url(self, toot: core.Toot) -> str:
66 | if self.scheme == "Desktop":
67 | return f"{config.ConfigHandler.MASTO_BASE}/@{toot.author}/{toot.toot_id}"
68 | elif self.scheme == "Ivory":
69 | encoded_url = urllib.parse.quote(toot.url)
70 | return f"ivory://acct/openURL?url={encoded_url}"
71 | elif self.scheme == "Original":
72 | return toot.url
73 | raise ValueError("Invalid scheme")
74 |
75 | def profile_url(self, toot: core.Toot) -> str:
76 | if self.scheme == "Desktop":
77 | return f"{config.ConfigHandler.MASTO_BASE}/@{toot.author}"
78 | elif self.scheme == "Ivory":
79 | # return f"ivory://@{toot.author}/profile"
80 | return f"ivory://acct/openURL?url={toot.profile_url}"
81 | elif self.scheme == "Original":
82 | return toot.profile_url
83 | raise ValueError("Invalid scheme")
84 |
85 |
86 |
87 | class TootCluster(pydantic.BaseModel):
88 | id: int
89 | name: str
90 | toots: list[core.Toot]
91 |
92 |
93 | class TootClusters(pydantic.BaseModel):
94 | clusters: list[TootCluster]
95 |
96 | @property
97 | def num_toots(self) -> int:
98 | return sum(len(c.toots) for c in self.clusters)
99 |
100 | @property
101 | def max_date(self) -> datetime.datetime:
102 | seq = [t.created_at for c in self.clusters for t in c.toots]
103 | return max(seq) if len(seq) > 0 else datetime.datetime.utcnow()
104 |
105 | @property
106 | def min_date(self) -> datetime.datetime:
107 | seq = [t.created_at for c in self.clusters for t in c.toots]
108 | return min(seq) if len(seq) > 0 else datetime.datetime.utcnow()
109 |
110 |
111 | def display_toot(toot: core.Toot, link_style: LinkStyle):
112 | with st.container(border=True):
113 | reply = "↩" if toot.is_reply else ""
114 | st.markdown(f"""
115 | {reply}
{toot.display_name} @{toot.author} ({time_ago(toot.created_at)})
116 | {toot.content}
117 | """, unsafe_allow_html=True)
118 |
119 | attachments = [f'
' for a in toot.media_attachments]
120 | st.markdown(" ".join(attachments), unsafe_allow_html=True)
121 |
122 | cols = st.columns(4)
123 | with cols[0]:
124 | st.markdown(f"""🔗""", unsafe_allow_html=True)
125 | with cols[1]:
126 | if st.button("⭐️", key=f"star-{toot.id}"):
127 | toot.do_star()
128 | with cols[2]:
129 | if st.button("️🔁", key=f"boost-{toot.id}"):
130 | toot.do_boost()
131 | with cols[3]:
132 | if st.button("🪲", key=f"delete-{toot.id}"):
133 | import json
134 | print(json.dumps(toot.orig_dict, indent=2))
135 |
136 |
137 | def all_toot_summary(toots: list[core.Toot]):
138 | latest_date = max(t.created_at for t in toots)
139 | earliest_date = min(t.created_at for t in toots)
140 | now = datetime.datetime.utcnow()
141 | msg = f"{len(toots)} toots from {time_ago(earliest_date)} to {time_ago(latest_date)}"
142 | if latest_date > now:
143 | st.warning(msg)
144 | else:
145 | st.info(msg)
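The `timedelta` helper above turns form values such as `6h`, `1d`, or `2w` into `datetime.timedelta` objects. The following is a compact standalone sketch of the same parsing, under one stated difference: it uses `re.fullmatch`, so trailing characters are rejected, whereas the function above uses `match` and accepts them. The name `parse_time_span` is hypothetical.

```python
import datetime
import re

# Maps the unit suffix to the matching datetime.timedelta keyword argument.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_time_span(time_span: str) -> datetime.timedelta:
    # Accept only "<digits><unit>", where unit is h, d, or w.
    m = re.fullmatch(r"(\d+)([hdw])", time_span)
    if m is None:
        raise ValueError(f"Invalid time frame: {time_span!r}")
    amount, unit = int(m.group(1)), m.group(2)
    return datetime.timedelta(**{_UNITS[unit]: amount})


print(parse_time_span("6h"), parse_time_span("2w"))
```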
--------------------------------------------------------------------------------
/index.html:
--------------------------------------------------------------------------------
1 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 | My App
19 |
20 |
21 |
31 |
32 |
33 |
34 |
35 |
36 |
37 |
38 |
39 |
40 |
41 |
42 |
43 |
44 |
45 |
46 |
47 |
48 |
49 |
50 |
51 |
52 |
53 |
54 |
55 |
56 |
57 |
58 |
59 |
60 |
61 |
62 |
86 |
87 |
88 |
89 |
90 |
91 |
92 |
93 |
94 |
103 |
104 |
105 |
106 |
109 |
110 |
111 |
112 |
--------------------------------------------------------------------------------
/make.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | function update_deps() {
4 | curl 'https://unpkg.com/htmx.org@latest' -o fossil_mastodon/app/static/htmx.js
5 | }
6 |
7 | function run() {
8 | poetry run uvicorn --host 0.0.0.0 --port 8888 --reload --reload-include '*.html' --reload-include '*.css' fossil_mastodon.server:app
9 | }
10 |
11 |
12 | "$@"
13 |
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [tool.poetry]
2 | name = "fossil-mastodon"
3 | version = "0.4.0-dev"
4 | description = "A mastodon reader client that uses embeddings to present a consolidated view of my mastodon timeline"
5 | authors = ["Tim Kellogg "]
6 | license = "MIT"
7 | readme = "README.md"
8 | include = ["**/*.css", "**/*.js", "**/*.html"]
9 |
10 | [tool.poetry.dependencies]
11 | python = "^3.10"
12 | requests = "^2.31.0"
13 | streamlit = "^1.29.0"
14 | scikit-learn = "^1.3.2"
15 | html2text = "^2020.1.16"
16 | tiktoken = "^0.5.2"
17 | python-dotenv = "^1.0.0"
18 | fastapi = "^0.105.0"
19 | jinja2 = "^3.1.2"
20 | uvicorn = "^0.25.0"
21 | python-multipart = "^0.0.6"
22 | llm = "^0.12"
23 |
24 | # You can use this same format for installing your own plugins from a different project
25 | [tool.poetry.plugins."fossil_mastodon.plugins"]
26 | topic_cluster = "fossil_mastodon.plugin_impl.topic_cluster:plugin"
27 | debug_button = "fossil_mastodon.plugin_impl.toot_debug:plugin"
28 |
29 |
30 | [tool.poetry.group.dev.dependencies]
31 | watchdog = "^3.0.0"
32 | watchfiles = "^0.21.0"
33 |
34 |
35 | [[tool.poetry.source]]
36 | name = "PyPI"
37 | priority = "primary"
38 |
39 | [build-system]
40 | requires = ["poetry-core"]
41 | build-backend = "poetry.core.masonry.api"
42 |
--------------------------------------------------------------------------------