├── .gitignore ├── LICENSE ├── README.md ├── cmd └── apollo.go ├── docs ├── Screen Shot 2021-07-25 at 4.36.15 PM.png ├── apollo.png └── architecture.png ├── go.mod ├── go.sum ├── pkg └── apollo │ ├── backend │ ├── api.go │ ├── searcher.go │ └── tokenizer.go │ ├── schema │ ├── crawler.go │ └── schema.go │ ├── server.go │ └── sources │ ├── athena.go │ ├── kindle.go │ ├── podcast.go │ ├── source.go │ ├── utils.go │ └── zeus.go ├── static ├── css │ └── stylesheet.css ├── img │ ├── about.png │ ├── add.png │ └── home.png ├── index.html ├── js │ ├── main.js │ └── poseidon.min.js └── search.xml └── tests └── main_test.go /.gitignore: -------------------------------------------------------------------------------- 1 | *.DS_Store 2 | static/CNAME 3 | *.json 4 | data/ 5 | kindle/ 6 | .env -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 Amir Bolous 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Apollo 💎 2 | ### A Unix-style personal search engine and web crawler for your digital footprint 3 |
4 | apollo demo 5 |
6 | 7 | ## Demo 8 | 9 | https://user-images.githubusercontent.com/7995105/126933240-b8176047-7cc4-4b22-91dc-aee7490476ed.mp4 10 | 11 | 12 | 13 | ## Contents 14 | [Background](#background) 15 | [Thesis](#thesis) 16 | [Design](#design) 17 | [Architecture](#architecture) 18 | [Data Schema](#data-schema) 19 | [Workflows](#workflows) 20 | [Document Storage](#document-storage) 21 | [Shut up, how can I use it?](#shut-up-how-can-i-use-it) 22 | [Notes](#notes) 23 | [Future](#future) 24 | [Inspirations](#inspirations) 25 | 26 | 27 | ## Background 28 | Apollo is a different type of search engine. Traditional search engines (like Google) are great for **discovery** when you're trying to find the answer to a question, but you don't know what you're looking for. 29 | 30 | However, they're very poor at **recall and synthesis** when you've seen something before on the internet somewhere but can't remember where. Trying to find it becomes a nightmare - how can you synthesize the great material on the internet when you've forgotten where it even was? I've wasted many an hour combing through Google and my search history to look up a good article, blog post, or just something I've seen before. 31 | 32 | Even with built-in systems to store some of my favorite [articles](https://zeus.amirbolous.com/articles), [podcasts](https://zeus.amirbolous.com/podcasts), and other stuff, I forget things **all the time**. 33 | 34 | ## Thesis 35 | Screw finding a needle in the haystack. Let's create a new type of search to **choose which gem you're looking for**. 36 | 37 | Apollo is a search engine and web crawler to digest **your digital footprint**. What this means is that **you choose what to put in it**. When you come across something that looks interesting, be it an article, blog post, website, whatever, you **manually add it** (with built-in systems to make doing so easy). If you always want to pull in data from a certain data source, like your notes or something else, you can do that too.
This tackles one of the biggest problems of recall in search engines - returning a lot of irrelevant information - because with Apollo, **the signal to noise ratio is very high**. You've chosen **exactly what to put in it**. 38 | 39 | Apollo is not necessarily built for raw discovery (although it certainly supports rediscovery), it's built for knowledge compression and transformation - that is, looking up things that you've previously deemed to be cool. 40 | 41 | ## Design 42 | The first thing you might notice is that the design is reminiscent of the old digital computer age, back in the Unix days. This is intentional for many reasons. In addition to paying homage to the greats of the past, this design makes me feel like I'm searching through something that is authentically my own. When I search for stuff, I genuinely feel like I'm **travelling through the past**. 43 | 44 | ## Architecture 45 | ![architecture](docs/architecture.png) 46 | Apollo's client side is written in [Poseidon](https://github.com/amirgamil/poseidon). The client side interacts with the backend via a REST-like API which provides endpoints for searching data and adding a new entry. 47 | 48 | The backend is written in Go and is composed of a couple of important components: 49 | 1. The web server which serves the endpoints 50 | 2. A tokenizer and stemmer used during search queries and when building the inverted index on the data 51 | 3. A simple web crawler for scraping links to articles/blog posts/YouTube videos 52 | 4. The actual search engine which takes a query, tokenizes and stems it, finds the relevant results from the inverted index using those stemmed tokens, 53 | then ranks results with [TF-IDF](https://monkeylearn.com/blog/what-is-tf-idf/#:~:text=TF%2DIDF%20is%20a%20statistical,across%20a%20set%20of%20documents.) 54 | 5. A package which pulls in data from a couple of different sources - if you want to pull data from a custom data source, this is where you should add it.
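The ranking step in component 4 can be sketched as follows. This is a minimal, self-contained illustration of tokenizing a query and scoring documents with TF-IDF over an inverted index - not the actual code in `pkg/apollo/backend` (the function names here are hypothetical, and the real backend also stems each token before lookup):

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// tokenize lower-cases text and splits it on non-letter runes.
// (The real backend also stems each token with a snowball stemmer.)
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return r < 'a' || r > 'z'
	})
}

// score ranks document IDs for a query using TF-IDF over an
// inverted index shaped as token -> docID -> term frequency.
func score(index map[string]map[string]int, totalDocs int, query string) []string {
	scores := map[string]float64{}
	for _, tok := range tokenize(query) {
		postings, ok := index[tok]
		if !ok {
			continue // token appears in no document
		}
		// tokens that appear in fewer documents carry more weight
		idf := math.Log(float64(totalDocs) / float64(len(postings)))
		for doc, tf := range postings {
			scores[doc] += float64(tf) * idf
		}
	}
	ranked := make([]string, 0, len(scores))
	for doc := range scores {
		ranked = append(ranked, doc)
	}
	sort.Slice(ranked, func(i, j int) bool {
		return scores[ranked[i]] > scores[ranked[j]]
	})
	return ranked
}

func main() {
	// toy index over a 3-document collection
	index := map[string]map[string]int{
		"search": {"doc1": 3, "doc2": 1},
		"engine": {"doc1": 1},
	}
	fmt.Println(score(index, 3, "search engine")) // doc1 ranks first
}
```

The nested-map shape mirrors the `TokenFrequency` field of the `Record` schema below: each record already carries its own token counts, so scoring a query is just a walk over the postings of the query's tokens.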
55 | 56 | ## Data Schema 57 | We use two schemas: the first parses the data into an encoded, intermediate format. 58 | This does not get stored, it's purely an intermediate before we transform it into a record for our inverted index. 59 | Why is this important? 60 | - Because any data gets parsed into this standardized format, you can link **any data source** you want. If you build your own tool, or you store a lot of data in some existing one, you don't have to manually add everything - you can pull in data from any data source provided you give the API data in this format. 61 | ```go 62 | type Data struct { 63 | title string //a title of the record, self-explanatory 64 | link string //links to the source of a record, e.g. a blog post, website, podcast etc. 65 | content string //actual content of the record, must be text data 66 | tags []string //list of potential high-level document tags you want to add that will be 67 | //indexed in addition to the raw data contained 68 | } 69 | ``` 70 | 71 | ```go 72 | //smallest unit of data that we store in the database 73 | //this will store each "item" in our search engine with all of the necessary information 74 | //for the inverted index 75 | type Record struct { 76 | //unique identifier 77 | ID string `json:"id"` 78 | //title 79 | Title string `json:"title"` 80 | //potential link to the source if applicable 81 | Link string `json:"link"` 82 | //text content to display on results page 83 | Content string `json:"content"` 84 | //map of tokens to their frequency 85 | TokenFrequency map[string]int `json:"tokenFrequency"` 86 | } 87 | ``` 88 | 89 | 90 | 91 | ## Workflows 92 | 93 | Data comes in many forms, and the more varied those forms are, the harder it is to write reliable software to deal with it. If everything I wanted to index was just stuff I wrote, life would be easy.
All of my notes would probably live in one place, so I would just have to grab the data from that data source and chill. 94 | 95 | The problem is I don't take a lot of notes, and not everything I want to index is something I'd take notes on. 96 | 97 | So what to do? 98 | 99 | Apollo can't handle all types of data, it's not designed to. However, in building a search engine to index stuff, there are a couple of things I focused on: 100 | 101 | 1. Any data that comes from a specific platform can be integrated. If you want to index all your Twitter data for example, 102 | this is possible since all of the data can be absorbed in a consistent format, converted into the compatible Apollo format, and sent off. 103 | So data sources can be easily integrated; this is by design in case I want to pull in data from personal tools. 104 | 2. The harder thing is what about just, what I will call, "writing on the internet." I read a lot of stuff on the Internet, much of which I'd like to be able to index, without necessarily having to take notes on everything I read, because I'm lazy. The dream would be to just be able to drop a link and have Apollo intelligently try to fetch the content; then I can index it without having to go to the post and copy the content, which would be painful and too slow. 105 | This was a large motivation for the web crawler component of the project. 106 | - If it's writing on the Internet, you should be able to post a link and autofill pwd 107 | - If it's a podcast episode or any YouTube video, download the text transcription, e.g. [this](https://github.com/moizahmedd/youtube-video-search) 108 | - If you want to pull data from a custom data source, add it as a file in the `pkg/apollo/sources` folder, following the same rules as some of the examples, and make sure to add it in the `GetData()` method of the `source.go` file in this package 109 | 110 | ## Document storage 111 | Local records and data from data sources are stored in separate JSON files.
This is for convenience. 112 | 113 | I also personally store my Kindle highlights as a JSON file - I use [read.amazon.com](https://read.amazon.com/) and a readwise [extension](https://readwise.io/bookcision) to download the exported highlights for a book. I put any new book JSON files in a kindle folder in the outer directory, and every time the inverted index is recomputed, the kindle file takes any new book highlights, integrates them into the main `kindle.json` file stored in the `data` folder, then deletes the old file. 114 | 115 | ## Shut up, how can I use it? 116 | Although I built Apollo first and foremost for myself, I also wanted other people to be able to use it if they found it valuable. To use Apollo locally: 117 | 1. Clone the repo: `git clone ....` 118 | 2. Make sure you have `Go` installed, as well as `youtube-dl`, which is how we download the subtitles of a video. You can use [this](https://ostechnix.com/youtube-dl-tutorial-with-examples-for-beginners/) to install it. 119 | 3. Navigate to the root directory of the project: `cd apollo`. 120 | Note: since Apollo syncs from some personal data sources, you'll want to remove them, add your own, or build stuff on top of them. Otherwise the terminal will complain if you attempt to run it, so: 121 | 4. Navigate to `pkg/apollo/sources` in your preferred editor and replace the body of the `GetData` function with `return make(map[string]schema.Data)` 122 | 5. Create a folder `data` in the outer directory 123 | 6. Create a `.env` file in the outermost directory (i.e. in the same directory as the `README.md`) and add `PASSWORD=<password>` where `<password>` is whatever password you want. This is necessary for adding or scraping data: you'll want to "prove you're Amir", i.e. authenticate yourself, once, and then you won't need to do it again in the future. If this is not making sense, try adding some data on `apollo.amirbolous.com/add` and see what happens. 124 | 7.
Go back to the outer directory (meaning you should see the files the way GitHub is displaying them right now) and run `go run cmd/apollo.go` in the terminal. 125 | 8. Navigate to `127.0.0.1:8993` in your browser 126 | 9. It should be working! You can add data and index data from the database. 127 | If you run into problems, open an issue or DM me on [Twitter](https://twitter.com/amirbolous) 128 | ### A little more information on the `Add Data` section 129 | - In order to add data, you'll first need to authenticate yourself - enter your password once in the "Please prove you're Amir" box, and if you see a `Hooray!` popup then that means you were authenticated successfully. You only need to do this once since we use `localStorage` to save whether you've been authenticated or not. 130 | - In order to `scrape` a website, you'll want to paste a link in the link textbox, then click on the `scrape` button. Note this **does not add the website/content** - you still need to click the `add` button if you want to save it. The web crawler works reliably *most of the time* if you're dealing with written content on a web page or a YouTube video. We use a Go-ported version of [readability](https://github.com/mozilla/readability) to scrape the main contents from a page if it's written content, and [youtube-dl](https://ytdl-org.github.io/youtube-dl/index.html) to get the transcript of a video. In the future, I'd like to make this web crawler more robust, but it works well enough most of the time for now. 131 | 132 | As a side note, although I want others to be able to use Apollo, this is not a "commercial product", so feel free to open a feature request if you'd like, but it's unlikely I will get to it unless it becomes something I personally want to use.
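At a high level, the `scrape` step described above has to make one dispatch decision: YouTube links get their transcript via `youtube-dl`, and everything else goes through the readability extractor. A minimal sketch of that decision (the helper name `classifyLink` is hypothetical, not the repo's actual function):

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// classifyLink shows the crawler's dispatch at a high level:
// YouTube links get a transcript via youtube-dl, everything else
// goes through the readability extractor. (Hypothetical helper,
// not the actual code in pkg/apollo/schema.)
func classifyLink(link string) string {
	u, err := url.Parse(link)
	if err != nil || u.Hostname() == "" {
		return "invalid"
	}
	host := strings.TrimPrefix(u.Hostname(), "www.")
	if host == "youtube.com" || host == "youtu.be" {
		return "youtube-transcript"
	}
	return "readability"
}

func main() {
	fmt.Println(classifyLink("https://www.youtube.com/watch?v=abc")) // youtube-transcript
	fmt.Println(classifyLink("https://example.com/blog/post"))       // readability
}
```

Either path ends the same way: the extracted title and text are filled into the add form, and nothing is saved until you click `add`.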
133 | 134 | ## Notes 135 | - The inverted index is re-generated once every n days (currently n = 3) 136 | - Since this is not a commercial product, I will not be running your *version of this* (if you find it useful) on my server. However, although I designed this first and foremost for myself, I want other people to be able to use it if it's useful to them - refer to [How can I use this](#shut-up-how-can-i-use-it) 137 | - I had the choice between using Go's `gob` package and `JSON` for the database/inverted index. The `gob` package is definitely faster; however, it's only native to Go, so I decided to go with `JSON` to keep the data available for potentially any non-Go integrations in the future, and to be able to switch the infrastructure completely if I want to. 138 | - I use a Go port of the snowball algorithm for my stemmer. Although I would have liked to build my own stemmer, implementing a robust one (which is what I wanted) was not the focus of the project. Since the algorithm for a stemmer does not need to be maintained like other types of software, I decided to use one out of the box. If I write my own in the future, I'll swap it out. 139 | 140 | ## Future 141 | - Improve the search algorithm - maybe something more like Elasticsearch when the data grows a lot?
142 | - Improve the web crawler - make it more robust like [mercury parser](https://github.com/postlight/mercury-parser), maybe write my own 143 | - Speed up search 144 | 145 | 146 | ## Inspirations 147 | - [Monocle](https://github.com/thesephist/monocle) for the idea 148 | - [Serenity OS](https://github.com/SerenityOS/serenity) for the design 149 | -------------------------------------------------------------------------------- /cmd/apollo.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "log" 5 | "time" 6 | 7 | "github.com/amirgamil/apollo/pkg/apollo" 8 | "github.com/amirgamil/apollo/pkg/apollo/backend" 9 | ) 10 | 11 | func main() { 12 | // sources.ReadXMLFile() 13 | backend.InitializeFilesAndData() 14 | //we call ticker to refresh inverted index regularly once every 3 days however 15 | //for convenience we often want to do a refresh "on start" so we add this here too 16 | backend.RefreshInvertedIndex() 17 | log.Println("Refreshing inverted index on launch: ") 18 | //once every three days, take all the records, 19 | //pull from the data sources, and rebuild the inverted index 20 | ticker := time.NewTicker(3 * 24 * time.Hour) 21 | done := make(chan bool) 22 | go func() { 23 | for { 24 | select { 25 | case <-done: 26 | return 27 | case t := <-ticker.C: 28 | log.Println("Refreshing inverted index at: ", t) 29 | backend.RefreshInvertedIndex() 30 | } 31 | } 32 | }() 33 | //server and the pipeline should run on concurrent threads, called regularly, for now manually do it 34 | //start the server on a concurrent thread so that when we need to refresh the inverted index, this happens on 35 | //different threads 36 | // backend.RefreshInvertedIndex() 37 | apollo.Start() 38 | 39 | } 40 | -------------------------------------------------------------------------------- /docs/Screen Shot 2021-07-25 at 4.36.15 PM.png: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/amirgamil/apollo/c34d6d11efc4d049d199fa1ff1f6df15f1063e70/docs/Screen Shot 2021-07-25 at 4.36.15 PM.png -------------------------------------------------------------------------------- /docs/apollo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/amirgamil/apollo/c34d6d11efc4d049d199fa1ff1f6df15f1063e70/docs/apollo.png -------------------------------------------------------------------------------- /docs/architecture.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/amirgamil/apollo/c34d6d11efc4d049d199fa1ff1f6df15f1063e70/docs/architecture.png -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module github.com/amirgamil/apollo 2 | 3 | go 1.14 4 | 5 | require ( 6 | github.com/PuerkitoBio/goquery v1.7.1 7 | github.com/go-shiori/go-readability v0.0.0-20210627123243-82cc33435520 8 | github.com/gorilla/mux v1.7.4 9 | github.com/joho/godotenv v1.3.0 10 | github.com/json-iterator/go v1.1.11 11 | github.com/kljensen/snowball v0.5.0 12 | golang.org/x/net v0.0.0-20210614182718-04defd469f4e 13 | ) 14 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= 2 | github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= 3 | github.com/OneOfOne/xxhash v1.2.2/go.mod h1:HSdplMjZKSmBqAxg5vPj2TmRDmfkzw+cTzAElWljhcU= 4 | github.com/PuerkitoBio/goquery v1.7.1 h1:oE+T06D+1T7LNrn91B4aERsRIeCLJ/oPSa6xB9FPnz4= 5 | github.com/PuerkitoBio/goquery v1.7.1/go.mod h1:XY0pP4kfraEmmV1O7Uf6XyjoslwsneBbgeDjLYuN8xY= 6 | github.com/alecthomas/template 
v0.0.0-20160405071501-a0175ee3bccc/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc= 7 | github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0= 8 | github.com/andybalholm/cascadia v1.2.0 h1:vuRCkM5Ozh/BfmsaTm26kbjm0mIOM3yS5Ek/F5h18aE= 9 | github.com/andybalholm/cascadia v1.2.0/go.mod h1:YCyR8vOZT9aZ1CHEd8ap0gMVm2aFgxBp0T0eFw1RUQY= 10 | github.com/armon/consul-api v0.0.0-20180202201655-eb2c6b5be1b6/go.mod h1:grANhF5doyWs3UAsr3K4I6qtAmlQcZDesFNEHPZAzj8= 11 | github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973/go.mod h1:Dwedo/Wpr24TaqPxmxbtue+5NUziq4I4S80YR8gNf3Q= 12 | github.com/beorn7/perks v1.0.0/go.mod h1:KWe93zE9D1o94FZ5RNwFwVgaQK1VOXiVxmqh+CedLV8= 13 | github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc= 14 | github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= 15 | github.com/coreos/bbolt v1.3.2/go.mod h1:iRUV2dpdMOn7Bo10OQBFzIJO9kkE559Wcmn+qkEiiKk= 16 | github.com/coreos/etcd v3.3.10+incompatible/go.mod h1:uF7uidLiAD3TWHmW31ZFd/JWoc32PjwdhPthX9715RE= 17 | github.com/coreos/go-semver v0.2.0/go.mod h1:nnelYz7RCh+5ahJtPPxZlU+153eP4D4r3EedlOD2RNk= 18 | github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e/go.mod h1:F5haX7vjVVG0kc13fIWeqUViNPyEJxv/OmvnBo0Yme4= 19 | github.com/coreos/pkg v0.0.0-20180928190104-399ea9e2e55f/go.mod h1:E3G3o1h8I7cfcXa63jLwjI0eiQQMgzzUDFVpN/nH/eA= 20 | github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= 21 | github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 22 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 23 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 24 | github.com/dgrijalva/jwt-go v3.2.0+incompatible/go.mod h1:E3ru+11k8xSBh+hMPgOLZmtrrCbhqsmaPHjLKYnJCaQ= 25 | github.com/dgryski/go-sip13 
v0.0.0-20181026042036-e10d5fee7954/go.mod h1:vAd38F8PWV+bWy6jNmig1y/TA+kYO4g3RSRF0IAv0no= 26 | github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo= 27 | github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04= 28 | github.com/go-kit/kit v0.8.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as= 29 | github.com/go-logfmt/logfmt v0.3.0/go.mod h1:Qt1PoO58o5twSAckw1HlFXLmHsOX5/0LbT9GBnD5lWE= 30 | github.com/go-logfmt/logfmt v0.4.0/go.mod h1:3RMwSq7FuexP4Kalkev3ejPJsZTpXXBr9+V4qmtdjCk= 31 | github.com/go-shiori/dom v0.0.0-20210627111528-4e4722cd0d65 h1:zx4B0AiwqKDQq+AgqxWeHwbbLJQeidq20hgfP+aMNWI= 32 | github.com/go-shiori/dom v0.0.0-20210627111528-4e4722cd0d65/go.mod h1:NPO1+buE6TYOWhUI98/hXLHHJhunIpXRuvDN4xjkCoE= 33 | github.com/go-shiori/go-readability v0.0.0-20210627123243-82cc33435520 h1:clFLTuh2YyNzQuxKiDTNLZB2N47pwVGI/ZlQxxilLoE= 34 | github.com/go-shiori/go-readability v0.0.0-20210627123243-82cc33435520/go.mod h1:LTRGsNyO3/Y6u3ERbz17OiXy2qO1Y+/8QjXpg2ViyEY= 35 | github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY= 36 | github.com/gogo/protobuf v1.1.1/go.mod h1:r8qH/GZQm5c6nD/R0oafs1akxWv10x8SbQlK7atdtwQ= 37 | github.com/gogo/protobuf v1.2.1/go.mod h1:hp+jE20tsWTFYpLwKvXlhS1hjn+gTNwPg2I6zVXpSg4= 38 | github.com/gogs/chardet v0.0.0-20191104214054-4b6791f73a28 h1:gBeyun7mySAKWg7Fb0GOcv0upX9bdaZScs8QcRo8mEY= 39 | github.com/gogs/chardet v0.0.0-20191104214054-4b6791f73a28/go.mod h1:Pcatq5tYkCW2Q6yrR2VRHlbHpZ/R4/7qyL1TCF7vl14= 40 | github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q= 41 | github.com/golang/groupcache v0.0.0-20190129154638-5b532d6fd5ef/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= 42 | github.com/golang/mock v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A= 43 | github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 44 | 
github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 45 | github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= 46 | github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M= 47 | github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= 48 | github.com/gorilla/mux v1.7.4 h1:VuZ8uybHlWmqV03+zRzdwKL4tUnIp1MAQtp1mIFE1bc= 49 | github.com/gorilla/mux v1.7.4/go.mod h1:DVbg23sWSpFRCP0SfiEN6jmj59UnW/n46BH5rLB71So= 50 | github.com/gorilla/websocket v1.4.0/go.mod h1:E7qHFY5m1UJ88s3WnNqhKjPHQ0heANvMoAMk2YaljkQ= 51 | github.com/grpc-ecosystem/go-grpc-middleware v1.0.0/go.mod h1:FiyG127CGDf3tlThmgyCl78X/SZQqEOJBCDaAfeWzPs= 52 | github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0/go.mod h1:8NvIoxWQoOIhqOTXgfV/d3M/q6VIi02HzZEHgUlZvzk= 53 | github.com/grpc-ecosystem/grpc-gateway v1.9.0/go.mod h1:vNeuVxBJEsws4ogUvrchl83t/GYV9WGTSLVdBhOQFDY= 54 | github.com/hashicorp/hcl v1.0.0/go.mod h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ= 55 | github.com/inconshreveable/mousetrap v1.0.0/go.mod h1:PxqpIevigyE2G7u3NXJIT2ANytuPF1OarO4DADm73n8= 56 | github.com/joho/godotenv v1.3.0 h1:Zjp+RcGpHhGlrMbJzXTrZZPrWj+1vfm90La1wgB6Bhc= 57 | github.com/joho/godotenv v1.3.0/go.mod h1:7hK45KPybAkOC6peb+G5yklZfMxEjkZhHbwpqxOKXbg= 58 | github.com/jonboulle/clockwork v0.1.0/go.mod h1:Ii8DK3G1RaLaWxj9trq07+26W01tbo22gdxWY5EU2bo= 59 | github.com/json-iterator/go v1.1.11 h1:uVUAXhF2To8cbw/3xN3pxj6kk7TYKs98NIrTqPlMWAQ= 60 | github.com/json-iterator/go v1.1.11/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4= 61 | github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7VTCxuUUipMqKk8s4w= 62 | github.com/kisielk/errcheck v1.1.0/go.mod h1:EZBBE59ingxPouuu3KfxchcWSUPOHkagtvWXihfKN4Q= 63 | github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck= 64 | github.com/kljensen/snowball v0.5.0 h1:iZ6Wfi7FhyUTAXra4wz4XbU0zYwGZ51xNIY+u4SmrRE= 
65 | github.com/kljensen/snowball v0.5.0/go.mod h1:27N7E8fVU5H68RlUmnWwZCfxgt4POBJfENGMvNRhldw= 66 | github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= 67 | github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc= 68 | github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo= 69 | github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= 70 | github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= 71 | github.com/magiconair/properties v1.8.0/go.mod h1:PppfXfuXeibc/6YijjN8zIbojt8czPbwD3XqdrwzmxQ= 72 | github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0= 73 | github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0= 74 | github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y= 75 | github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421 h1:ZqeYNhU3OHLH3mGKHDcjJRFFRrJa6eAM5H+CtDdOsPc= 76 | github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 77 | github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742 h1:Esafd1046DLDQ0W1YjYsBW+p8U2u7vzgW2SQVmlNazg= 78 | github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= 79 | github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U= 80 | github.com/oklog/ulid v1.3.1/go.mod h1:CirwcVhetQ6Lv90oh/F+FBtV6XMibvdAFo93nm5qn4U= 81 | github.com/pelletier/go-toml v1.2.0/go.mod h1:5z9KED0ma1S8pY6P1sdut58dfprrGBbd/94hg7ilaic= 82 | github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= 83 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 84 | 
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 85 | github.com/prometheus/client_golang v0.9.1/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw= 86 | github.com/prometheus/client_golang v0.9.3/go.mod h1:/TN21ttK/J9q6uSwhBd54HahCDft0ttaMvbicHlPoso= 87 | github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910/go.mod h1:MbSGuTsp3dbXC40dX6PRTWyKYBIrTGTE9sqQNg2J8bo= 88 | github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA= 89 | github.com/prometheus/common v0.0.0-20181113130724-41aa239b4cce/go.mod h1:daVV7qP5qjZbuso7PdcryaAu0sAZbrN9i7WWcTMWvro= 90 | github.com/prometheus/common v0.4.0/go.mod h1:TNfzLD0ON7rHzMJeJkieUDPYmFC7Snx/y86RQel1bk4= 91 | github.com/prometheus/procfs v0.0.0-20181005140218-185b4288413d/go.mod h1:c3At6R/oaqEKCNdg8wHV1ftS6bRYblBhIjjI8uT2IGk= 92 | github.com/prometheus/procfs v0.0.0-20190507164030-5867b95ac084/go.mod h1:TjEm7ze935MbeOT/UhFTIMYKhuLP4wbCsTZCD3I8kEA= 93 | github.com/prometheus/tsdb v0.7.1/go.mod h1:qhTCs0VvXwvX/y3TZrWD7rabWM+ijKTux40TwIPHuXU= 94 | github.com/rogpeppe/fastuuid v0.0.0-20150106093220-6724a57986af/go.mod h1:XWv6SoW27p1b0cqNHllgS5HIMJraePCO15w5zCzIWYg= 95 | github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= 96 | github.com/sergi/go-diff v1.1.0/go.mod h1:STckp+ISIX8hZLjrqAeVduY0gWCT9IjLuqbuNXdaHfM= 97 | github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc= 98 | github.com/sirupsen/logrus v1.2.0/go.mod h1:LxeOpSwHxABJmUn/MG1IvRgCAasNZTLOkJPxbbu5VWo= 99 | github.com/sirupsen/logrus v1.8.1 h1:dJKuHgqk1NNQlqoA6BTlM1Wf9DOH3NBjQyu0h9+AZZE= 100 | github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0= 101 | github.com/soheilhy/cmux v0.1.4/go.mod h1:IM3LyeVVIOuxMH7sFAkER9+bJ4dT7Ms6E4xg4kGIyLM= 102 | github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod 
h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA= 103 | github.com/spf13/afero v1.1.2/go.mod h1:j4pytiNVoe2o6bmDsKpLACNPDBIoEAkihy7loJ1B0CQ= 104 | github.com/spf13/cast v1.3.0/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkUJE= 105 | github.com/spf13/cobra v1.0.0/go.mod h1:/6GTrnGXV9HjY+aR4k0oJ5tcvakLuG6EuKReYlHNrgE= 106 | github.com/spf13/jwalterweatherman v1.0.0/go.mod h1:cQK4TGJAtQXfYWX+Ddv3mKDzgVb68N+wFjFa4jdeBTo= 107 | github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= 108 | github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= 109 | github.com/spf13/viper v1.4.0/go.mod h1:PTJ7Z/lr49W6bUbkmS1V3by4uWynFiR9p7+dSq/yZzE= 110 | github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 111 | github.com/stretchr/objx v0.1.1/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 112 | github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs= 113 | github.com/stretchr/testify v1.3.0 h1:TivCn/peBQ7UY8ooIcPgZFpTNSz0Q2U6UrFlUfqbe0Q= 114 | github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= 115 | github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= 116 | github.com/tmc/grpc-websocket-proxy v0.0.0-20190109142713-0ad062ec5ee5/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U= 117 | github.com/ugorji/go v1.1.4/go.mod h1:uQMGLiO92mf5W77hV/PUCpI3pbzQx3CRekS0kk+RGrc= 118 | github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU= 119 | github.com/xordataexchange/crypt v0.0.3-0.20170626215501-b2862e3d0a77/go.mod h1:aYKd//L2LvnjZzWKhF00oedf4jCCReLcmhLdhm1A27Q= 120 | go.etcd.io/bbolt v1.3.2/go.mod h1:IbVyRI1SCnLcuJnV2u8VeU0CEYM7e686BmAb1XKL+uU= 121 | go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE= 122 | go.uber.org/multierr v1.1.0/go.mod h1:wR5kodmAFQ0UK8QlbwjlSNy0Z68gJhDJUG5sjR94q/0= 123 
| go.uber.org/zap v1.10.0/go.mod h1:vwi/ZaCAaUcBkycHslxD9B2zi4UTXhF60s6SWpuDF0Q= 124 | golang.org/x/crypto v0.0.0-20180904163835-0709b304e793/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4= 125 | golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w= 126 | golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= 127 | golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= 128 | golang.org/x/net v0.0.0-20180218175443-cbe0f9307d01/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 129 | golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 130 | golang.org/x/net v0.0.0-20181114220301-adae6a3d119a/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 131 | golang.org/x/net v0.0.0-20181220203305-927f97764cc3/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 132 | golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= 133 | golang.org/x/net v0.0.0-20190522155817-f3200d17e092/go.mod h1:HSz+uSET+XFnRR8LxR5pz3Of3rY3CfYBVs4xY44aLks= 134 | golang.org/x/net v0.0.0-20210505214959-0714010a04ed/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= 135 | golang.org/x/net v0.0.0-20210614182718-04defd469f4e h1:XpT3nA5TvE525Ne3hInMh6+GETgn27Zfm9dxsThnX2Q= 136 | golang.org/x/net v0.0.0-20210614182718-04defd469f4e/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= 137 | golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= 138 | golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 139 | golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 140 | golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod 
h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 141 | golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 142 | golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 143 | golang.org/x/sys v0.0.0-20181107165924-66b7b1311ac8/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 144 | golang.org/x/sys v0.0.0-20181116152217-5ac8a444bdc5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 145 | golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 146 | golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 147 | golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 148 | golang.org/x/sys v0.0.0-20210423082822-04245dca01da/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 149 | golang.org/x/sys v0.0.0-20210514084401-e8d321eab015 h1:hZR0X1kPW+nwyJ9xRxqZk1vx5RUObAPBdKVvXPDUH/E= 150 | golang.org/x/sys v0.0.0-20210514084401-e8d321eab015/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 151 | golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= 152 | golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= 153 | golang.org/x/text v0.3.6 h1:aRYxNxv6iGQlyVaZmk6ZgYEDa+Jg18DxebPSrd6bg1M= 154 | golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= 155 | golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= 156 | golang.org/x/tools v0.0.0-20180221164845-07fd8470d635/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 157 | golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 158 | golang.org/x/tools v0.0.0-20190114222345-bf090417da8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 159 
| golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs= 160 | google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM= 161 | google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc= 162 | google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c= 163 | google.golang.org/grpc v1.21.0/go.mod h1:oYelfM1adQP15Ek0mdvEgi9Df8B9CZIaU1084ijfRaM= 164 | gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw= 165 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 166 | gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 167 | gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 168 | gopkg.in/resty.v1 v1.12.0/go.mod h1:mDo4pnntr5jdWRML875a/NmxYqAlA73dVijT2AXvQQo= 169 | gopkg.in/yaml.v2 v2.0.0-20170812160011-eb3733d160e7/go.mod h1:JAlM8MvJe8wmxCU4Bli9HhUf9+ttbYbLASfIpnQbh74= 170 | gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 171 | gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 172 | gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 173 | honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= 174 | -------------------------------------------------------------------------------- /pkg/apollo/backend/api.go: -------------------------------------------------------------------------------- 1 | package backend 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "os" 7 | 8 | "github.com/amirgamil/apollo/pkg/apollo/schema" 9 | "github.com/amirgamil/apollo/pkg/apollo/sources" 10 | jsoniter "github.com/json-iterator/go" 11 | ) 12 | 13 | //assume for now that new data that has 
not been built into the inverted index gets stored 14 | //in some JSON file that is available locally 15 | 16 | //TODO: fix all error handling 17 | 18 | //maps tokens to the records that contain them 19 | //i.e. maps strings (tokens) to an array of record ids 20 | var globalInvertedIndex map[string][]string 21 | 22 | //global list of all records which are stored locally 23 | //maps strings which are unique ids of each record to the record 24 | var localRecordList map[string]schema.Record 25 | 26 | //global list of records pulled in from data sources 27 | var sourcesRecordList map[string]schema.Record 28 | 29 | //database of inverted index for ALL of the data 30 | //maps strings (i.e. tokens) to string ids 31 | const invertedIndexPath = "./data/index.json" 32 | 33 | //database of all records stored locally 34 | //all ids start with lc 35 | const localRecordsPath = "./data/local.json" 36 | 37 | //database of the records we compute from the sources 38 | //all ids start with sr 39 | const sourcesPath = "./data/sources.json" 40 | 41 | func createFile(path string) { 42 | f, errCreating := os.Create(path) 43 | if errCreating != nil { 44 | log.Fatal("Error, could not create database for path: ", path, " with: ", errCreating) 45 | return 46 | } 47 | f.Close() 48 | } 49 | 50 | //called at the global start 51 | func ensureDataExists(path string) { 52 | jsonFile, err := os.Open(path) 53 | if err != nil { 54 | createFile(path) 55 | } else { 56 | defer jsonFile.Close() 57 | } 58 | } 59 | 60 | //helper function which should be called when the program is initialized so that the necessary files and paths 61 | //exist in our database 62 | func InitializeFilesAndData() { 63 | ensureDataExists(sourcesPath) 64 | ensureDataExists(invertedIndexPath) 65 | ensureDataExists(localRecordsPath) 66 | globalInvertedIndex = make(map[string][]string) 67 | localRecordList = make(map[string]schema.Record) 68 | sourcesRecordList = make(map[string]schema.Record) 69 | loadGlobals() 70 | } 71 | 72 | //loads
the inverted index from disk into memory 73 | //opt to use a more optimized JSON decoding and encoding library than Go's native one as our inverted index JSON files grow in size and cloud money ain't free 74 | func loadInvertedIndex() { 75 | jsonFile, err := os.Open(invertedIndexPath) 76 | if err != nil { 77 | fmt.Println("Error, could not load the inverted index") 78 | return 79 | } 80 | defer jsonFile.Close() 81 | //TODO: not sure if we can decode into pointers? 82 | jsoniter.NewDecoder(jsonFile).Decode(&globalInvertedIndex) 83 | } 84 | 85 | func loadRecordsList(path string, list map[string]schema.Record) { 86 | jsonFile, err := os.Open(path) 87 | if err != nil { 88 | fmt.Println("Error, could not load the records list") 89 | return 90 | } 91 | defer jsonFile.Close() 92 | jsoniter.NewDecoder(jsonFile).Decode(&list) 93 | } 94 | 95 | //takes a slice of tokens and returns a map of each token to its frequency 96 | func countFrequencyTokens(tokens []string) map[string]int { 97 | frequencyWords := make(map[string]int) 98 | for _, token := range tokens { 99 | _, isInMap := frequencyWords[token] 100 | if isInMap { 101 | frequencyWords[token] += 1 102 | } else { 103 | frequencyWords[token] = 1 104 | } 105 | } 106 | return frequencyWords 107 | } 108 | 109 | //helper method which writes the current inverted index to disk 110 | func writeIndexToDisk() { 111 | //flags we pass here are important, need to replace the entire file 112 | jsonFile, err := os.OpenFile(invertedIndexPath, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0755) 113 | if err != nil { 114 | fmt.Println("Error trying to write the new inverted index to disk"); return 115 | } 116 | defer jsonFile.Close() 117 | jsoniter.NewEncoder(jsonFile).Encode(globalInvertedIndex) 118 | } 119 | 120 | //helper method which writes the current record list to disk 121 | //parameters determine which record list we write 122 | func writeRecordListToDisk(path string, list map[string]schema.Record) { 123 | //flags we pass here are important,
need to replace the entire file 124 | jsonFile, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0755) 125 | if err != nil { 126 | fmt.Println("Error trying to write the record list to disk"); return 127 | } 128 | defer jsonFile.Close() 129 | jsoniter.NewEncoder(jsonFile).Encode(list) 130 | } 131 | 132 | //highest-level function that is called at regular intervals to recompute the ENTIRE inverted index to integrate 133 | //new data added via Apollo, resync data from the data sources, and include any saved records to Apollo 134 | func RefreshInvertedIndex() { 135 | //loads the globals we need including the new data, previous records, and our current inverted index 136 | loadGlobals() 137 | //clean inverted index 138 | globalInvertedIndex = make(map[string][]string) 139 | //Order is important here 140 | //Step 1: Write local stored records to the inverted index 141 | flushSavedRecordsIntoInvertedIndex(localRecordList) 142 | //Step 2: Write "old" data from data sources to the inverted index 143 | //"old" = we've previously done work to retrieve and process 144 | flushSavedRecordsIntoInvertedIndex(sourcesRecordList) 145 | //Step 3a: resync data from data sources i.e. get all data again 146 | sourceData := sources.GetData(sourcesRecordList) 147 | //Step 3b: flush new data from data sources into inverted index, note we DO NOT save these records locally since they are stored 148 | //in the origin of where we pulled them from.
Since we get ALL of the data from our data sources each time this method is called, this 149 | //prevents creating additional copies in our inverted index 150 | flushDataSourcesIntoInvertedIndex(sourceData) 151 | 152 | //write data to disk in inverted index and record JSON file 153 | writeIndexToDisk() 154 | writeRecordListToDisk(localRecordsPath, localRecordList) 155 | writeRecordListToDisk(sourcesPath, sourcesRecordList) 156 | } 157 | 158 | func loadGlobals() { 159 | loadInvertedIndex() 160 | loadRecordsList(localRecordsPath, localRecordList) 161 | loadRecordsList(sourcesPath, sourcesRecordList) 162 | } 163 | 164 | //takes all of the saved records and puts them in our inverted index 165 | func flushSavedRecordsIntoInvertedIndex(recordList map[string]schema.Record) { 166 | //we already have token frequency data precomputed and saved, so just add it to inverted index directly 167 | for key, record := range recordList { 168 | writeTokenFrequenciesToInvertedIndex(record.TokenFrequency, key) 169 | } 170 | } 171 | 172 | func GetRecordFromData(currData schema.Data, uniqueID string) schema.Record { 173 | 174 | //tokenize, stem, and filter 175 | tokens := Analyze(currData.Content) 176 | 177 | //count frequency and create `Record` 178 | frequencyOfTokens := countFrequencyTokens(tokens) 179 | 180 | //adds meta level tags defined into the data - how do we set the frequency? 
Since these are global tags 181 | //we push some more probability on them since the user said these were important to index by 182 | //use a simple heuristic of pushing ~20% of "counts" on them 183 | //TODO: is there a more intelligent heuristic we can use here 184 | frequencyToAdd := len(tokens) / 5 185 | for _, metaTag := range currData.Tags { 186 | _, metaTagInMap := frequencyOfTokens[metaTag] 187 | if metaTagInMap { 188 | frequencyOfTokens[metaTag] += frequencyToAdd 189 | } else { 190 | frequencyOfTokens[metaTag] = frequencyToAdd 191 | } 192 | } 193 | 194 | //create the record with its precomputed token frequencies 195 | record := schema.Record{ID: uniqueID, Title: currData.Title, Link: currData.Link, Content: currData.Content, TokenFrequency: frequencyOfTokens} 196 | return record 197 | } 198 | 199 | //method takes data and flushes it into our inverted index 200 | //Note: since these records come from our data sources, they get stored in sourcesRecordList, not in the local record list 201 | func flushDataSourcesIntoInvertedIndex(data map[string]schema.Data) { 202 | for uniqueID, currData := range data { 203 | record := GetRecordFromData(currData, uniqueID) 204 | sourcesRecordList[uniqueID] = record 205 | writeTokenFrequenciesToInvertedIndex(record.TokenFrequency, uniqueID) 206 | } 207 | } 208 | 209 | //write a map of tokens to their counts in our inverted index 210 | func writeTokenFrequenciesToInvertedIndex(frequencyOfTokens map[string]int, uniqueID string) { 211 | //loop through final frequencyOfTokens and add it to our inverted index database 212 | for key := range frequencyOfTokens { 213 | _, keyInInvertedIndex := globalInvertedIndex[key] 214 | if keyInInvertedIndex { 215 | globalInvertedIndex[key] = append(globalInvertedIndex[key], uniqueID) 216 | } else { 217 | globalInvertedIndex[key] = []string{uniqueID} 218 | } 219 | } 220 | } 221 | 222 | func AddNewEntryToLocalData(data schema.Data) { 223 | key := fmt.Sprintf("lc%d", len(localRecordList)) 224 | record := GetRecordFromData(data, key) 225 | localRecordList[key] = record 226 | writeRecordListToDisk(localRecordsPath,
localRecordList) 227 | } 228 | -------------------------------------------------------------------------------- /pkg/apollo/backend/searcher.go: -------------------------------------------------------------------------------- 1 | package backend 2 | 3 | import ( 4 | "errors" 5 | "fmt" 6 | "math" 7 | "regexp" 8 | "sort" 9 | "strings" 10 | "time" 11 | 12 | "github.com/amirgamil/apollo/pkg/apollo/schema" 13 | ) 14 | 15 | //TODO: should search titles too (and put high probability mass on those tokens) 16 | 17 | //given a query string and a search type (AND / OR), returns a list of matches ordered by relevance 18 | func Search(query string, searchType string, currentSearchResults map[string]string) (schema.Payload, error) { 19 | //1. Get the results of the query 20 | //keep it in a Go map that acts as a set 21 | startTime := time.Now() 22 | results := make(map[string]bool) 23 | //2. Apply same analysis as when ingesting data i.e. tokenizing and stemming 24 | queries := Analyze(query) 25 | if len(queries) == 0 { 26 | return schema.Payload{}, errors.New("no valid queries") 27 | } 28 | //Support for AND / OR (TODO: eventually add NOT) 29 | if searchType == "AND" { 30 | //3.
Get list of relevant records from the invertedIndex 31 | //temp set holding records we've matched so far for convenience 32 | //avoid quadratic complexity by sequentially removing records which don't accumulate matches as we move 33 | //through the queries 34 | tempRecords := make(map[string]bool) 35 | //get records for first query 36 | recordsFirstQueryMatch := globalInvertedIndex[queries[0]] 37 | for _, recordID := range recordsFirstQueryMatch { 38 | tempRecords[recordID] = true 39 | } 40 | for recordID, _ := range tempRecords { 41 | record := getRecordFromID(recordID) 42 | for i := 1; i < len(queries); i++ { 43 | _, tokenInRecord := record.TokenFrequency[queries[i]] 44 | if !tokenInRecord { 45 | //token from our intersection does not exist in this record, so remove it, don't need to keep checking 46 | delete(tempRecords, recordID) 47 | break 48 | } 49 | } 50 | } 51 | //now have all of the records which match all of the queries 52 | for recordID, _ := range tempRecords { 53 | results[recordID] = true 54 | } 55 | } else if searchType == "OR" { 56 | //3. Get list of relevant records from the invertedIndex 57 | for _, query := range queries { 58 | recordsWithQuery := globalInvertedIndex[query] 59 | for _, recordID := range recordsWithQuery { 60 | _, inMap := results[recordID] 61 | if !inMap { 62 | results[recordID] = true 63 | } 64 | } 65 | } 66 | } 67 | 68 | //4. 
Sort by relevance - assign a score to each record reflecting how relevant it is 69 | //Use the inverse document frequency 70 | records := rank(results, queries, currentSearchResults) 71 | //elapsed search time in milliseconds 72 | elapsedTime := time.Since(startTime).Milliseconds() 73 | return schema.Payload{Time: elapsedTime, Data: records, Query: queries, Length: len(records)}, nil 74 | 75 | } 76 | 77 | //helper method which returns a record from the associated id 78 | func getRecordFromID(id string) schema.Record { 79 | if strings.HasPrefix(id, "lc") { 80 | return localRecordList[id] 81 | } else { 82 | return sourcesRecordList[id] 83 | } 84 | } 85 | 86 | //idf = log(total number of documents / number of documents that contain term) - ensures tokens which are rarer get a higher score 87 | func idf(token string) float64 { 88 | return math.Log10(float64(len(localRecordList)+len(sourcesRecordList)) / float64(len(globalInvertedIndex[token]))) 89 | } 90 | 91 | //ranks an unordered list of records based on relevance, uses the inverse document frequency which is a 92 | //document-level statistic that scores how well a document (record in our case) matches our query 93 | //then multiply by the number of times the token appears in the record 94 | //returns an ordered list of records from most to least relevant 95 | func rank(results map[string]bool, queries []string, currentSearchResults map[string]string) []schema.SearchResult { 96 | type recordRank struct { 97 | result schema.SearchResult 98 | score float64 99 | } 100 | //allocating fixed-size slices up front is faster and more memory efficient 101 | rankedResults := make([]schema.SearchResult, len(results)) 102 | unsortedResults := make([]recordRank, len(results)) 103 | i := 0 104 | queriesChained := strings.Join(queries, " ") 105 | fmt.Println(queriesChained) 106 | regex, _ := regexp.Compile(queriesChained) 107 | for recordID := range results { 108 | record := getRecordFromID(recordID) 109 | score := float64(0) 110 | for _, token := range
queries { 111 | idfVal := idf(token) 112 | score += idfVal * float64(record.TokenFrequency[token]) 113 | } 114 | content := getSurroundingText(regex, record.Content) 115 | fmt.Println(strings.ReplaceAll(record.Title, " ", "!")) 116 | //add a regex-highlighted version of the full content which is readily available when a user clicks on an item to view details 117 | //this way, we don't need to send every single record's contents and can speed up searches 118 | currentSearchResults[record.Title] = regex.ReplaceAllString(record.Content, fmt.Sprintf("<mark>%s</mark>", queriesChained)) 119 | unsortedResults[i] = recordRank{result: schema.SearchResult{Title: record.Title, Link: record.Link, Content: content}, score: score} 120 | i += 1 121 | } 122 | //sort by highest score to lowest 123 | sort.Slice(unsortedResults, func(i, j int) bool { 124 | return unsortedResults[i].score > unsortedResults[j].score 125 | }) 126 | 127 | i = 0 128 | //put sorted records into needed format and return 129 | for _, val := range unsortedResults { 130 | rankedResults[i] = val.result 131 | i += 1 132 | } 133 | return rankedResults 134 | } 135 | 136 | //helper method to get a small window of the matching result 137 | //don't send the full text back to the client because this is too slow 138 | func getSurroundingText(regexp *regexp.Regexp, content string) string { 139 | indices := regexp.FindStringIndex(strings.ToLower(content)) 140 | //TODO? make greedy?
match different variations 141 | //if we find no match, then we've matched a token whose stem is not included 142 | //in the actual text, so just return the first section 143 | if indices == nil { 144 | if len(content) > 150 { 145 | return content[:150] 146 | } 147 | return content 148 | } 149 | //want to get a small window around the highlighted content 150 | start := indices[0] - 15 151 | end := indices[1] + 100 152 | if start < 0 && end >= len(content) { 153 | //if the entire content is smaller than the window, then just display all of the content 154 | start = 0 155 | end = len(content) 156 | } else if start < 0 { 157 | //if the match is nearer to the front, shift the window "to the right" and display more on the tail end 158 | start = 0 159 | } else if end >= len(content) { 160 | //if the match is nearer to the end, shift the window "to the left" and display more on the front 161 | end = len(content) 162 | } 163 | return content[start:end] 164 | } 165 | -------------------------------------------------------------------------------- /pkg/apollo/backend/tokenizer.go: -------------------------------------------------------------------------------- 1 | package backend 2 | 3 | import ( 4 | "fmt" 5 | s "strings" 6 | 7 | "github.com/kljensen/snowball" 8 | ) 9 | 10 | var punctuation map[string]bool 11 | var stopWords map[string]bool 12 | 13 | //helper function to add a word to the token list 14 | func addWord(sb *s.Builder, tokens *[]string) { 15 | currWord := s.ToLower(sb.String()) 16 | //make sure it's not a stop word before we append it to our list of tokens 17 | //use a heuristic that any one-length words are probably missing apostrophes so don't append (only I & a are one-letter English 18 | //words, both of which should not be added anyway, so no collateral damage missing anything important) 19 | if _, isStopWord := stopWords[currWord]; !isStopWord && sb.Len() != 1 { 20 | *tokens = append(*tokens, currWord) 21 | } 22 | //"empty" string builder or remove
current word 23 | sb.Reset() 24 | } 25 | 26 | //tokenizes a source of text into a list of lowercase tokens with stop words and punctuation removed 27 | func splitByWhiteSpace(source string) []string { 28 | tokens := make([]string, 0) 29 | var sb s.Builder 30 | 31 | for index := 0; index < len(source); index++ { 32 | char := string(source[index]) 33 | _, isPunc := punctuation[char] 34 | if char == " " { 35 | addWord(&sb, &tokens) 36 | } else if source[index] == 10 { 37 | //check if this is a newline character, have to check the raw byte without converting it into a string since that causes issues 38 | addWord(&sb, &tokens) 39 | sb.Reset() 40 | } else if isPunc { 41 | // continue to next iteration, don't write the string 42 | //check if this is an apostrophe since we should treat contractions as two words 43 | if sb.Len() != 0 && char == "'" { 44 | addWord(&sb, &tokens) 45 | //add ' into the new word to represent the contraction 46 | sb.Reset() 47 | sb.WriteString("'") 48 | } 49 | continue 50 | } else { 51 | //not white space or a punctuation mark, so add it to the current word 52 | //write it as a byte and not a string for speed 53 | sb.WriteByte(source[index]) 54 | } 55 | } 56 | //tokenize last word 57 | if sb.Len() != 0 { 58 | addWord(&sb, &tokens) 59 | } 60 | return tokens 61 | } 62 | 63 | func Analyze(source string) []string { 64 | tokens := Tokenize(source) 65 | stemmedTokens := stem(tokens) 66 | return stemmedTokens 67 | } 68 | 69 | //takes in a source of text and converts it into an array of lowercase tokens (filtering out stop words and punctuation); stemming happens in Analyze 70 | //This gets called when ingesting new data and when searching 71 | //TODO: or is it better to just "generateAllPossibleVariations" of a word on the client side, then wouldn't need to stem on the backend? 72 | func Tokenize(source string) []string { 73 | //careful of single quotes vs.
apostrophe 74 | if len(punctuation) == 0 || len(stopWords) == 0 { 75 | initConstants() 76 | } 77 | return splitByWhiteSpace(source) 78 | } 79 | 80 | //I use a Go port of the Snowball stemming algorithm (a Porter2 variant) here. Although I would have preferred to write my own stemmer, 81 | //writing a good robust stemmer was not the focus of this project, you pick your battles :( If at some point in the future, 82 | //I become interested in learning about stemmers and write my own, I promise I'll substitute my own implementation here :) 83 | 84 | //stem takes an array of tokens (free of punctuation and stop words) and returns an array of tokens with each token representing its stem 85 | func stem(tokens []string) []string { 86 | for i := 0; i < len(tokens); i++ { 87 | stemmed, err := snowball.Stem(tokens[i], "english", false) 88 | if err != nil { 89 | fmt.Println("Error stemming a token!"); continue 90 | } 91 | tokens[i] = stemmed 92 | } 93 | return tokens 94 | } 95 | 96 | //load the sets for the first time to prevent repeated work 97 | func initConstants() { 98 | punct := []string{".", "?", "!", ",", ":", ";", "-", "(", ")", "\"", "'", "{", "}", "[", "]", "#", "<", ">", "\\", 99 | "~", "*", "_", "|", "%", "/"} 100 | stop := []string{"i", "me", "my", "myself", "we", "our", "ours", "ourselves", "you", "your", "'re", "yours", "yourself", "yourselves", "he", "him", 101 | "his", "himself", "she", "her", "hers", "herself", "it", "its", "itself", "they", "them", "their", "theirs", "themselves", 102 | "what", "which", "who", "whom", "this", "that", "these", "those", "am", "is", "are", "was", "were", "be", "been", "being", 103 | "have", "has", "had", "having", "do", "does", "did", "doing", "a", "an", "the", "and", "but", "if", "or", "because", "as", 104 | "until", "while", "of", "at", "by", "for", "with", "about", "against", "between", "into", "through", "during", "before", 105 | "after", "above", "below", "to", "from", "up", "down", "in", "out", "on", "off", "over", "under",
"again", "further", "then", 106 | "once", "here", "there", "when", "where", "why", "how", "all", "any", "both", "each", "few", "more", "most", "other", "some", 107 | "such", "no", "nor", "not", "'t", "'nt", "only", "own", "same", "so", "than", "too", "very", "s", "t", "can", "will", "just", "don", 108 | "should", "now"} 109 | punctuation = make(map[string]bool) 110 | stopWords = make(map[string]bool) 111 | //convert array into a set-like structure for fast-lookups 112 | for _, each := range punct { 113 | punctuation[each] = true 114 | } 115 | 116 | for _, each := range stop { 117 | stopWords[each] = true 118 | } 119 | } 120 | -------------------------------------------------------------------------------- /pkg/apollo/schema/crawler.go: -------------------------------------------------------------------------------- 1 | package schema 2 | 3 | import ( 4 | "bytes" 5 | "errors" 6 | "io/ioutil" 7 | "log" 8 | "os" 9 | "os/exec" 10 | "regexp" 11 | "strings" 12 | "time" 13 | 14 | readability "github.com/go-shiori/go-readability" 15 | ) 16 | 17 | func Scrape(link string) (Data, error) { 18 | log.Println(strings.Contains(link, "www.youtube.com")) 19 | //handle YouTube videos 20 | if strings.Contains(link, "www.youtube.com") { 21 | return HandleYouTubeVideo(link) 22 | } 23 | article, err := readability.FromURL(link, 30*time.Second) 24 | //add goquery and if it fails, return Text()? 
25 | if err != nil { 26 | return Data{}, err 27 | } 28 | regex, _ := regexp.Compile("(<[^>]+>)") 29 | cleanContent := regex.ReplaceAllString(article.TextContent, "") 30 | return Data{Title: article.Title, Link: link, Content: cleanContent, Tags: make([]string, 0)}, nil 31 | } 32 | 33 | func HandleYouTubeVideo(link string) (Data, error) { 34 | command := "youtube-dl -o '%(title)s' --write-srt --sub-lang en --skip-download " + link 35 | cmd := exec.Command("bash", "-c", command) 36 | var out bytes.Buffer 37 | cmd.Stdout = &out 38 | err := cmd.Run() 39 | log.Println(out.String()) 40 | if err != nil { 41 | log.Println("Error running bash script: ", err) 42 | return Data{}, errors.New("Error downloading the youtube video!") 43 | } 44 | content, title, err := readFromSubtitlesFile() 45 | if err != nil { 46 | return Data{}, errors.New("Error loading the subtitles of the video!") 47 | } 48 | return Data{Title: title, Link: link, Content: content, Tags: make([]string, 0)}, nil 49 | } 50 | 51 | func readFromSubtitlesFile() (string, string, error) { 52 | files, err := ioutil.ReadDir("./") 53 | if err != nil { 54 | log.Println("Error reading the files of the YouTube video: ", err) 55 | return "", "", err 56 | } 57 | for _, file := range files { 58 | //find the vtt file which is the format of the downloaded subtitles 59 | r, err := regexp.MatchString("\\.vtt$", file.Name()) 60 | if err == nil && r { 61 | //found file 62 | output, err := readVTTFile(file.Name()) 63 | removeVTTFile(file.Name()) 64 | if err != nil { 65 | return "", "", err 66 | } 67 | regexTitle, _ := regexp.Compile(`(\.en)?\.vtt`) 68 | title := regexTitle.ReplaceAllString(file.Name(), "") 69 | return output, title, nil 70 | } 71 | } 72 | return "", "", errors.New("Error reading subtitles!") 73 | } 74 | 75 | func removeVTTFile(name string) error { 76 | err := os.Remove(name) 77 | if err != nil { 78 | log.Println("Error removing file: ", err) 79 | return err 80 | } 81 | return nil 82 | } 83 | 84 | func
readVTTFile(fileName string) (string, error) { 85 | file, err := os.Open(fileName) 86 | if err != nil { 87 | log.Println("Error opening the VTT file: ", err) 88 | return "", err 89 | } 90 | defer file.Close(); output, err := ioutil.ReadAll(file) 91 | if err != nil { 92 | log.Println("Error reading the VTT file: ", err) 93 | return "", err 94 | } 95 | //rule to remove everything but the text of the vtt file 96 | regexRule := "\n?([0-9]+):([0-9]+):([0-9]+).([0-9]+) --> ([0-9]+):([0-9]+):([0-9]+).([0-9]+)" 97 | r, _ := regexp.Compile(regexRule) 98 | if r.Match(output) { 99 | textVideo := r.ReplaceAllString(string(output), "") 100 | textVideo = strings.ReplaceAll(textVideo, "WEBVTT\nKind: captions\nLanguage: en", "") 101 | return textVideo, nil 102 | } else { 103 | log.Println("Error: could not match the VTT timestamp regex") 104 | return "", errors.New("could not parse the VTT file") 105 | } 106 | } 107 | -------------------------------------------------------------------------------- /pkg/apollo/schema/schema.go: -------------------------------------------------------------------------------- 1 | package schema 2 | 3 | //smallest unit of data that we store in the database 4 | //this will store each "item" in our search engine with all of the necessary information 5 | //for the inverted index 6 | type Record struct { 7 | //unique identifier 8 | ID string `json:"id"` 9 | //title 10 | Title string `json:"title"` 11 | //potential link to the source if applicable 12 | Link string `json:"link"` 13 | //text content to display on results page 14 | Content string `json:"content"` 15 | //map of tokens to their frequency 16 | TokenFrequency map[string]int `json:"tokenFrequency"` 17 | } 18 | 19 | type SearchResult struct { 20 | Title string `json:"title"` 21 | Link string `json:"link"` 22 | Content string `json:"content"` 23 | } 24 | 25 | //represents raw data that we parse sources into before it has been transformed into records 26 | //and stored in our database 27 | type Data struct { 28 | Title string `json:"title"` 29 | Link string
`json:"link"` 30 | Content string `json:"content"` 31 | Tags []string `json:"tags"` 32 | //TODO: add metadata, should be able to search based on type of record, document, podcast, personal etc. 33 | } 34 | 35 | //how we send back the result of a search query to the client 36 | type Payload struct { 37 | Time int64 `json:"time"` 38 | Length int `json:"length"` 39 | Query []string `json:"query"` 40 | Data []SearchResult `json:"data"` 41 | } 42 | -------------------------------------------------------------------------------- /pkg/apollo/server.go: -------------------------------------------------------------------------------- 1 | package apollo 2 | 3 | import ( 4 | "encoding/json" 5 | "fmt" 6 | "io" 7 | "log" 8 | "net/http" 9 | "os" 10 | "time" 11 | 12 | "github.com/amirgamil/apollo/pkg/apollo/backend" 13 | "github.com/amirgamil/apollo/pkg/apollo/schema" 14 | "github.com/gorilla/mux" 15 | "github.com/joho/godotenv" 16 | jsoniter "github.com/json-iterator/go" 17 | ) 18 | 19 | func check(e error) { 20 | if e != nil { 21 | panic(e) 22 | } 23 | } 24 | 25 | //records which are stored locally, which have been added via Apollo directly 26 | const localRecordsPath = "./data/local.json" 27 | 28 | //global used to quickly access details when searching 29 | var currentSearchResults map[string]string 30 | 31 | func index(w http.ResponseWriter, r *http.Request) { 32 | indexFile, err := os.Open("./static/index.html") 33 | if err != nil { 34 | io.WriteString(w, "error reading index") 35 | return 36 | } 37 | defer indexFile.Close() 38 | 39 | io.Copy(w, indexFile) 40 | } 41 | 42 | func scrape(w http.ResponseWriter, r *http.Request) { 43 | linkToScrape := r.FormValue("q") 44 | w.Header().Set("Content-Type", "application/json") 45 | result, err := schema.Scrape(linkToScrape) 46 | if err != nil { 47 | log.Println("Error trying to scrape a digital artifact!") 48 | w.WriteHeader(http.StatusExpectationFailed) 49 | } else { 50 | json.NewEncoder(w).Encode(result) 51 | } 52 | } 53 | 54 | func
addData(w http.ResponseWriter, r *http.Request) { 55 | var newData schema.Data 56 | err := jsoniter.NewDecoder(r.Body).Decode(&newData) 57 | if err != nil { 58 | w.WriteHeader(http.StatusBadRequest) 59 | } else { 60 | backend.AddNewEntryToLocalData(newData) 61 | w.WriteHeader(http.StatusAccepted) 62 | } 63 | } 64 | 65 | func search(w http.ResponseWriter, r *http.Request) { 66 | searchQuery := r.FormValue("q") 67 | //"erase" current results in preparation for the new search 68 | currentSearchResults = make(map[string]string) 69 | w.Header().Set("Content-Type", "application/json") 70 | fmt.Println(searchQuery) 71 | //TODO: add logic for OR 72 | results, err := backend.Search(searchQuery, "AND", currentSearchResults) 73 | if err != nil { 74 | w.WriteHeader(http.StatusNoContent) 75 | return 76 | } 77 | jsoniter.NewEncoder(w).Encode(results) 78 | } 79 | 80 | //get the full text of a record when expanded for detail 81 | func getRecord(w http.ResponseWriter, r *http.Request) { 82 | recordTitle := r.FormValue("q") 83 | record, inMap := currentSearchResults[recordTitle] 84 | if len(currentSearchResults) == 0 || !inMap {
w.WriteHeader(http.StatusBadRequest) 110 | } else { 111 | jsoniter.NewEncoder(w).Encode(record) 112 | } 113 | } 114 | 115 | func authenticatePassword(w http.ResponseWriter, r *http.Request) { 116 | type Request struct { 117 | Password string `json:"password"` 118 | } 119 | var request Request 120 | json.NewDecoder(r.Body).Decode(&request) 121 | if isValidPassword(request.Password) { 122 | w.WriteHeader(http.StatusAccepted) 123 | } else { 124 | w.WriteHeader(http.StatusBadRequest) 125 | } 126 | } 127 | 128 | func isValidPassword(password string) bool { 129 | err := godotenv.Load() 130 | check(err) 131 | truePass := os.Getenv("PASSWORD") 132 | return truePass == password 133 | } 134 | 135 | func Start() { 136 | r := mux.NewRouter() 137 | currentSearchResults = make(map[string]string) 138 | srv := &http.Server{ 139 | Handler: r, 140 | Addr: "0.0.0.0:8993", 141 | WriteTimeout: 60 * time.Second, 142 | ReadTimeout: 60 * time.Second, 143 | } 144 | 145 | //will need some kind of API call to ingest data 146 | r.Methods("POST").Path("/search").HandlerFunc(search) 147 | r.Methods("POST").Path("/scrape").HandlerFunc(scrape) 148 | r.Methods("POST").Path("/addData").HandlerFunc(addData) 149 | r.Methods("POST").Path("/authenticate").HandlerFunc(authenticatePassword) 150 | r.Methods("POST").Path("/getRecordDetail").HandlerFunc(getRecord) 151 | r.PathPrefix("/static/").Handler(http.StripPrefix("/static/", http.FileServer(http.Dir("./static")))) 152 | r.PathPrefix("/").HandlerFunc(index) 153 | log.Printf("Server listening on %s\n", srv.Addr) 154 | log.Fatal(srv.ListenAndServe()) 155 | } 156 | -------------------------------------------------------------------------------- /pkg/apollo/sources/athena.go: -------------------------------------------------------------------------------- 1 | package sources 2 | 3 | import ( 4 | "encoding/json" 5 | "errors" 6 | "fmt" 7 | "log" 8 | "os" 9 | 10 | "github.com/amirgamil/apollo/pkg/apollo/schema" 11 | ) 12 | 13 | const athenaPath
= "../athena/data.json" 14 | 15 | type thought struct { 16 | H string `json:"h"` 17 | B string `json:"b"` 18 | T []string `json:"t"` 19 | } 20 | 21 | func getAthena() map[string]schema.Data { 22 | data, err := loadAthenaData() 23 | if err != nil { 24 | log.Println(err) 25 | return make(map[string]schema.Data) 26 | } 27 | dataToIndex := convertToReqFormat(data) 28 | return dataToIndex 29 | } 30 | 31 | func loadAthenaData() ([]thought, error) { 32 | var data []thought 33 | file, err := os.Open(athenaPath) 34 | if err != nil { 35 | return []thought{}, errors.New("Error loading data from Athena!") 36 | } 37 | defer file.Close() 38 | if err := json.NewDecoder(file).Decode(&data); err != nil { 39 | return []thought{}, err 40 | } 41 | return data, nil 42 | } 43 | 44 | //takes a list of thoughts and converts it into the required data struct we need for the API 45 | func convertToReqFormat(data []thought) map[string]schema.Data { 46 | dataToIndex := make(map[string]schema.Data) 47 | for i, thought := range data { 48 | //check if we've computed the data for this already 49 | keyInMap := fmt.Sprintf("srat%d", i) 50 | if _, isInMap := sources[keyInMap]; !isInMap { 51 | dataToIndex[keyInMap] = schema.Data{Title: thought.H, Content: thought.B, Link: "https://athena.amirbolous.com", Tags: thought.T} 52 | } 53 | } 54 | return dataToIndex 55 | } 56 | -------------------------------------------------------------------------------- /pkg/apollo/sources/kindle.go: -------------------------------------------------------------------------------- 1 | package sources 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "os" 7 | 8 | "github.com/amirgamil/apollo/pkg/apollo/schema" 9 | jsoniter "github.com/json-iterator/go" 10 | ) 11 | 12 | //define the schemas here because they're only applicable to the kindle file 13 | type Book struct { 14 | ASIN string `json:"asin"` 15 | Authors string `json:"authors"` 16 | Highlights []Highlight `json:"highlights"` 17 | Title string `json:"title"` 18 | } 19 | 20 | type Highlight struct { 21 | Text string `json:"text"`
22 | IsNoteOnly bool `json:"isNoteOnly"` 23 | Location Location `json:"location"` 24 | Note interface{} `json:"note"` 25 | } 26 | 27 | type Location struct { 28 | URL string `json:"url"` 29 | Value int `json:"value"` 30 | } 31 | 32 | var kindleGlobal map[string]Book 33 | 34 | const kindlePath = "./data/kindle.json" 35 | 36 | //where new books will be placed 37 | const newBooksPath = "./kindle/" 38 | 39 | //Kindle does not directly provide a way to get highlights except via https://read.amazon.com/ 40 | //I use a readwise extension to download my highlights into JSON https://readwise.io/bookcision 41 | //I then put this JSON file in a directory called kindle; when it comes time to sync data, i.e. recompute the invertedIndex, 42 | //the getKindle method will check for files in this directory and, if any exist, will take them 43 | //and consolidate them into the kindle database. 44 | 45 | //note: we save the kindle db in its original state as opposed to saving it in schema.Data format (which would reduce repeated work) 46 | //in case I want to use the data as-is in the future 47 | 48 | func getKindle() map[string]schema.Data { 49 | //ensure our kindle database exists 50 | ensureFileExists(kindlePath) 51 | //load our kindle file 52 | kindleGlobal = make(map[string]Book) 53 | loadKindle() 54 | //first check for new files in the kindle folder 55 | newBooks, err := checkForNewBooks() 56 | if err != nil { 57 | return make(map[string]schema.Data) 58 | } 59 | addNewBooksToDb(newBooks) 60 | err = writeKindleDbToDisk() 61 | if err != nil { 62 | log.Println(err) 63 | } else { 64 | //if we successfully write the books to disk, we delete all of the files since 65 | //we've stored them and no longer need them 66 | deleteFiles(newBooksPath, "kindle") 67 | } 68 | bookData := convertBooksToData() 69 | return bookData 70 | } 71 | 72 | func convertBooksToData() map[string]schema.Data { 73 | //save each highlight as its own entry as opposed to each book as its own entry 74 | data =
make(map[string]schema.Data) 75 | for _, book := range kindleGlobal { 76 | //iterate through the highlights 77 | for index, highlight := range book.Highlights { 78 | //check if this highlight is already saved 79 | keyInMap := fmt.Sprintf("srkd%s%d", book.ASIN, index) 80 | if _, isInMap := sources[keyInMap]; !isInMap { 81 | note := "" 82 | if highlight.Note != nil { 83 | highlightString, isString := highlight.Note.(string) 84 | if isString { 85 | note = highlightString 86 | } 87 | } 88 | content := fmt.Sprintf("Highlight: \n\n %s\n\nNote: %s", highlight.Text, note) 89 | data[keyInMap] = schema.Data{Title: book.Title, Link: highlight.Location.URL, Content: content, Tags: make([]string, 0)} 90 | } 91 | } 92 | } 93 | return data 94 | } 95 | 96 | func writeKindleDbToDisk() error { 97 | //O_TRUNC so a shorter database doesn't leave stale bytes at the end of the file 98 | jsonFile, err := os.OpenFile(kindlePath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0755) 99 | if err != nil { 100 | return err 101 | } 102 | defer jsonFile.Close() 103 | err = jsoniter.NewEncoder(jsonFile).Encode(kindleGlobal) 104 | return err 105 | } 106 | 107 | func addNewBooksToDb(books []Book) { 108 | for _, book := range books { 109 | kindleGlobal[book.Title] = book 110 | } 111 | } 112 | 113 | func loadKindle() { 114 | file, err := os.Open(kindlePath) 115 | if err != nil { 116 | log.Println("Error trying to load the kindle database: ", err) 117 | return 118 | } 119 | defer file.Close() 120 | jsoniter.NewDecoder(file).Decode(&kindleGlobal) 121 | } 122 | 123 | func checkForNewBooks() ([]Book, error) { 124 | files := getFilesInFolder(newBooksPath, "kindle") 125 | books := make([]Book, 0) 126 | for _, f := range files { 127 | if f.Name() == ".DS_Store" { 128 | continue 129 | } 130 | //open the file 131 | file, err := os.Open(newBooksPath + f.Name()) 132 | if err != nil { 133 | log.Println("Error trying to open kindle file: ", f.Name(), " with err: ", err) 134 | return []Book{}, err 135 | } 136 | var newBook Book 137 | err = jsoniter.NewDecoder(file).Decode(&newBook) 138 | file.Close() 139 | if err != nil { 140 | log.Println("Uh oh, error decoding book at: ",
f.Name(), " with: err: ", err) 141 | return []Book{}, err 142 | } else { 143 | books = append(books, newBook) 144 | } 145 | } 146 | return books, nil 147 | } 148 | -------------------------------------------------------------------------------- /pkg/apollo/sources/podcast.go: -------------------------------------------------------------------------------- 1 | package sources 2 | 3 | import ( 4 | "encoding/xml" 5 | "errors" 6 | "fmt" 7 | "io/ioutil" 8 | "log" 9 | "net/http" 10 | "os" 11 | "regexp" 12 | "strings" 13 | 14 | "github.com/amirgamil/apollo/pkg/apollo/schema" 15 | ) 16 | 17 | type RSS struct { 18 | Title string `xml:"channel>title"` 19 | Description string `xml:"channel>description"` 20 | Episodes []EpisodeXML `xml:"channel>item"` 21 | } 22 | 23 | type EpisodeXML struct { 24 | Title string `xml:"title"` 25 | Description string `xml:"description"` 26 | Link string `xml:"link"` 27 | } 28 | 29 | //This source pulls data from a personal podcast I host with my friend 30 | //Check it out: https://tinyurl.com/theconversationlab 31 | 32 | const newEpisodesPath = "./podcast/" 33 | 34 | var podcastsGlobal map[string]schema.Data 35 | 36 | //helper used to manage which episodes we delete once we've confirmed they've been saved 37 | var episodesToDelete []string 38 | 39 | //follows a similar approach to Kindle: put new txt files in the podcast folder 40 | func getPodcast() map[string]schema.Data { 41 | episodesToDelete = make([]string, 0) 42 | podcastsGlobal = make(map[string]schema.Data) 43 | rss, err := readXMLFile() 44 | if err != nil { 45 | return make(map[string]schema.Data) 46 | } 47 | newEpisodes, err := checkForNewEpisodes(rss) 48 | if err != nil { 49 | log.Println(err) 50 | return make(map[string]schema.Data) 51 | } 52 | addNewEpisodesToDb(newEpisodes) 53 | //delete transcript files ourselves instead of using the util's deleteFiles to handle the special case 54 | //where a podcast transcript exists in our folders but the RSS feed has not updated yet, so we don't have 55 | //access to the podcast metadata; in that case we cannot delete the file because we haven't saved it yet 56 | for _, episode := range episodesToDelete { 57 | err := os.Remove(episode) 58 | if err != nil { 59 | log.Println("Error deleting podcast transcript file: ", episode, " with err: ", err) 60 | } 61 | } 62 | return podcastsGlobal 63 | } 64 | 65 | func readXMLFile() (RSS, error) { 66 | resp, err := http.Get("https://media.rss.com/theconversationlab/feed.xml") 67 | if err != nil { 68 | log.Println("Error getting the XML file: ", err) 69 | return RSS{}, err 70 | } 71 | defer resp.Body.Close() 72 | var podcastXML RSS 73 | err = xml.NewDecoder(resp.Body).Decode(&podcastXML) 74 | if err != nil { 75 | log.Println("Error parsing the XML file: ", err) 76 | return RSS{}, err 77 | } 78 | return podcastXML, nil 79 | } 80 | 81 | var episodeNotFound = errors.New("Episode not found in the RSS feed!") 82 | 83 | func findEpisodeInRSSWithName(name string, rssFeed RSS) (EpisodeXML, error) { 84 | for _, episode := range rssFeed.Episodes { 85 | if strings.HasPrefix(episode.Title, name) { 86 | return episode, nil 87 | } 88 | } 89 | return EpisodeXML{}, episodeNotFound 90 | } 91 | 92 | func addNewEpisodesToDb(episodes []schema.Data) { 93 | regex, _ := regexp.Compile("[0-9]+") 94 | for _, episode := range episodes { 95 | //trim any leading or trailing spaces 96 | episode.Title = strings.Trim(episode.Title, " ") 97 | episodeNumber := regex.FindString(episode.Title) 98 | keyInMap := fmt.Sprintf("srpd%s", episodeNumber) 99 | podcastsGlobal[keyInMap] = episode 100 | } 101 | } 102 | 103 | func checkForNewEpisodes(rssFeed RSS) ([]schema.Data, error) { 104 | files := getFilesInFolder(newEpisodesPath, "podcast") 105 | newEpisodes := make([]schema.Data, 0) 106 | for _, f := range files { 107 | if f.Name() == ".DS_Store" { 108 | continue 109 | } 110 | regex, _ := regexp.Compile("Episode [0-9]+") 111 | if
regex.MatchString(f.Name()) { 117 | //grab the episode name e.g. "Episode 1" 118 | episodeTitle := regex.FindString(f.Name()) 119 | //check for corresponding episode in the RSS feed 120 | episode, err := findEpisodeInRSSWithName(episodeTitle, rssFeed) 121 | if err == episodeNotFound { 122 | //skip and leave file as is, will refresh once the RSS feed shows it 123 | continue 124 | } else if err != nil { 125 | return []schema.Data{}, err 126 | } 127 | //confirmed we have an episode and it's in the RSS feed so grab the transcript from the file 128 | //open the file 129 | path := newEpisodesPath + f.Name() 130 | file, err := os.Open(path) 131 | if err != nil { 132 | return []schema.Data{}, err 133 | } 134 | fileBody, err := ioutil.ReadAll(file) 135 | if err != nil { 136 | return []schema.Data{}, err 137 | } 138 | transcript := string(fileBody) 139 | title := fmt.Sprintf("The Conversation Lab - %s", episode.Title) 140 | newEpisodes = append(newEpisodes, schema.Data{Title: title, Link: episode.Link, Content: transcript, Tags: make([]string, 0)}) 141 | episodesToDelete = append(episodesToDelete, path) 142 | 143 | } 144 | } 145 | return newEpisodes, nil 146 | } 147 | -------------------------------------------------------------------------------- /pkg/apollo/sources/source.go: -------------------------------------------------------------------------------- 1 | package sources 2 | 3 | import ( 4 | "github.com/amirgamil/apollo/pkg/apollo/schema" 5 | ) 6 | 7 | var sources map[string]schema.Record 8 | 9 | //TODO: make sourcesMap a global so we don't keep passing large maps in parameters 10 | //TODO: should return map[string]schema.Data so we have control over the IDs 11 | func GetData(sourcesMap map[string]schema.Record) map[string]schema.Data { 12 | sources = sourcesMap 13 | //pass in number of sources 14 | sourcesNewData := make([]map[string]schema.Data, 4) 15 | data := make(map[string]schema.Data) 16 | athena := getAthena() 17 | sourcesNewData[0] = athena 18 | zeus := 
getZeus() 19 | sourcesNewData[1] = zeus 20 | kindle := getKindle() 21 | sourcesNewData[2] = kindle 22 | podcast := getPodcast() 23 | sourcesNewData[3] = podcast 24 | //add all data 25 | for _, sourceData := range sourcesNewData { 26 | for ID, newData := range sourceData { 27 | data[ID] = newData 28 | } 29 | } 30 | return data 31 | } 32 | -------------------------------------------------------------------------------- /pkg/apollo/sources/utils.go: -------------------------------------------------------------------------------- 1 | package sources 2 | 3 | import ( 4 | "io/ioutil" 5 | "log" 6 | "os" 7 | ) 8 | 9 | func ensureFileExists(path string) { 10 | jsonFile, err := os.Open(path) 11 | if err != nil { 12 | file, err := os.Create(path) 13 | if err != nil { 14 | log.Println("Error creating the file at: ", path, " with err: ", err) 15 | return 16 | } 17 | file.Close() 18 | } else { 19 | defer jsonFile.Close() 20 | } 21 | } 22 | 23 | func getFilesInFolder(path string, folderName string) []os.FileInfo { 24 | files, err := ioutil.ReadDir(path) 25 | if err != nil { 26 | err := os.Mkdir(folderName, 0755) 27 | if err != nil { 28 | log.Println("Error creating the directory: ", folderName, " with err: ", err) 29 | } 30 | } 31 | return files 32 | } 33 | 34 | func deleteFiles(path string, folderName string) { 35 | files := getFilesInFolder(path, folderName) 36 | for _, f := range files { 37 | err := os.Remove(path + f.Name()) 38 | if err != nil { 39 | log.Println("Error deleting file: ", f.Name()) 40 | } 41 | } 42 | } 43 | -------------------------------------------------------------------------------- /pkg/apollo/sources/zeus.go: -------------------------------------------------------------------------------- 1 | package sources 2 | 3 | import ( 4 | "encoding/gob" 5 | "fmt" 6 | "log" 7 | "os" 8 | "strings" 9 | 10 | "github.com/PuerkitoBio/goquery" 11 | "github.com/amirgamil/apollo/pkg/apollo/schema" 12 | ) 13 | 14 | const zeusPath = "../zeus/db.gob" 15 | 16 | type List struct { 17 | Key string
`json:"key"` 18 | Data []string `json:"data"` 19 | //rule represents markdown of what a unit of the list looks like 20 | Rule string `json:"rule"` 21 | } 22 | 23 | func getZeus() map[string]schema.Data { 24 | //set of paths to ignore 25 | ignore := map[string]bool{"podcasts": true, "startups": true, "directory": true} 26 | cache := make(map[string]*List) 27 | dataToIndex := make(map[string]schema.Data) 28 | file, err := os.Open(zeusPath) 29 | if err != nil { 30 | //log (don't Fatal) so a missing zeus db doesn't kill the sync 31 | log.Println("Error loading data from zeus: ", err) 32 | return dataToIndex 33 | } 34 | defer file.Close() 35 | if err := gob.NewDecoder(file).Decode(&cache); err != nil { 36 | log.Println("Error decoding data from zeus: ", err) 37 | return dataToIndex 38 | } 39 | for key, list := range cache { 40 | _, toIgnore := ignore[key] 41 | if !toIgnore { 42 | //in zeus, new data is appended to the front of the list, so we need to iterate from the back of the array to the front, 43 | //otherwise we would not know whether an element is new and needs to be saved in apollo or not 44 | keyID := 0 45 | for index := len(list.Data) - 1; index >= 0; index -= 1 { 46 | //have to set our own "backwards index" to maintain correct order 47 | data := list.Data[index] 48 | //check if this is an item we've already scraped / retrieved data for, in which case ignore to prevent repeated work 49 | keyInMap := fmt.Sprintf("srzs%s%d", list.Key, keyID) 50 | _, isInMap := sources[keyInMap] 51 | if !isInMap { 52 | //pass in our "true index" 53 | newData, err := getDataFromList(data, list.Key, keyID) 54 | if err != nil { 55 | log.Println(err) 56 | } else { 57 | dataToIndex[keyInMap] = newData 58 | } 59 | } else { 60 | //TODO: add some additional logic to handle if elements change, should update, besides deleting everything 61 | } 62 | keyID += 1 63 | } 64 | } 65 | } 66 | return dataToIndex 67 | } 68 | 69 | func getDataFromList(listData string, listKey string, index int) (schema.Data, error) { 70 | //create model of the document first - recall items in Zeus are stored as rendered markdown which means HTML 71 | listDoc, err :=
goquery.NewDocumentFromReader(strings.NewReader(listData)) 72 | if err != nil { 73 | log.Println("Error parsing item in list component!") 74 | return schema.Data{}, err 75 | } 76 | var newItem schema.Data 77 | //use some heuristics to decide whether we should `scrape` a link or 78 | //just put it raw in our database 79 | //need to navigate to the `body` of the parsed HTML since goquery automatically populates html, head, and body 80 | body := listDoc.Find("body") 81 | //guard against empty items so indexing below can't panic 82 | if len(body.Children().Nodes) == 0 { 83 | return schema.Data{}, fmt.Errorf("empty item in list component") 84 | } 85 | firstChild := body.Children().Nodes[0] 86 | secondChild := firstChild.FirstChild 87 | //If we only have an a tag or one inside another tag, this is probably an item we want to scrape (e.g. /articles) 88 | if firstChild.Data == "a" || (secondChild != nil && secondChild.Data == "a") { 89 | newItem, err = scrapeLink(listDoc) 90 | if err != nil { 91 | log.Println("Error parsing link in list: ", listData, " defaulting to use link") 92 | return schema.Data{}, err 93 | } 94 | } else { 95 | //otherwise, there's other content which we assume will (hopefully) be indexable, may be adapted to be more intelligent 96 | newItem = schema.Data{Title: fmt.Sprintf("%s %d", listKey, index), Link: "zeus.amirbolous.com/" + listKey, Content: body.Text(), Tags: make([]string, 0)} 97 | } 98 | 99 | //if it fails, send back the link, using tag words from the link 100 | return newItem, nil 101 | } 102 | 103 | //takes a document which is suspected to be an article or something that's scrapable and attempts to scrape it 104 | func scrapeLink(listDoc *goquery.Document) (schema.Data, error) { 105 | var data schema.Data 106 | var err error 107 | listDoc.Find("a").Each(func(i int, s *goquery.Selection) { 108 | link, hasLink := s.Attr("href") 109 | if hasLink { 110 | data, err = schema.Scrape(link) //TODO: check regex, scrape w. Text()?
104 | if err != nil { 105 | //add URL directly as data; to have our tokenizer extract something meaningful, we try to replace 106 | //as many symbols as we might find in URLs with spaces so the tokenizer can extract a couple of meaningful words 107 | //from the title 108 | cleanedUpData := strings.ReplaceAll(link, "/", " ") 109 | cleanedUpData = strings.ReplaceAll(cleanedUpData, "-", " ") 110 | //Throw in the parent's title as well which might be useful, since most links are of the form <a href="...">title</a>

111 | cleanedUpData += s.Parent().Text() 112 | data = schema.Data{Title: s.Parent().Text(), Content: cleanedUpData, Link: link, Tags: make([]string, 0)} 113 | } 114 | } else { 115 | data = schema.Data{} 116 | } 117 | }) 118 | return data, err 119 | } 120 | -------------------------------------------------------------------------------- /static/css/stylesheet.css: -------------------------------------------------------------------------------- 1 | html, body { 2 | width: 100%; 3 | min-height: 100vh; 4 | margin: 0 auto; 5 | --fgHover: rgb(32, 32, 32, 0.1); 6 | --fg: #222; 7 | --nav: #D4D0C8; 8 | --navDarker: #aeaba6; 9 | --bg: white; 10 | font-family: 'JetBrains Mono', monospace; 11 | font-weight: 300; 12 | background-color: black; 13 | overflow-y: auto; 14 | overflow-x: hidden; 15 | color: var(--fg); 16 | } 17 | 18 | body.dark { 19 | --fg: #fafafa; 20 | --fgHover: rgb(250,250,250, 0.1); 21 | --nav: #8a8986; 22 | --bg: black; /*prev: #222 and rgb(32,32,32)*/ 23 | } 24 | 25 | .content { 26 | color: var(--fg); 27 | flex-grow: 1; 28 | min-height: 0; 29 | position: relative; 30 | margin: 0 5px 0 5px; 31 | padding: 0; 32 | } 33 | 34 | main { 35 | max-width: 804px; 36 | min-height: 100vh; 37 | margin: 0 auto; 38 | background: var(--bg); 39 | outline: 4px solid var(--nav); 40 | border: 3px solid black; 41 | display: flex; 42 | flex-direction: column; 43 | width: calc(100% - 4px); 44 | overflow-x: hidden; 45 | } 46 | 47 | .littlePadding { 48 | padding-top: 0.5em; 49 | padding-bottom: 2em; 50 | } 51 | 52 | footer { 53 | width: 100%; 54 | flex-shrink: 0; 55 | padding: 8px 2px 8px 2px; 56 | text-align: center; 57 | font-size: 1em; 58 | color: var(--fg); 59 | } 60 | 61 | .colWrapper { 62 | width: 100%; 63 | margin: 0 auto; 64 | display: flex; 65 | flex-direction: column; 66 | justify-content: center; 67 | } 68 | 69 | strong { 70 | font-weight: 700 !important; 71 | } 72 | 73 | .textbox { 74 | width: 100%; 75 | position: relative; 76 | } 77 | 78 | a { 79 | color: #9d7ac1; 80 | } 
81 | 82 | a:hover { 83 | text-decoration: underline; 84 | } 85 | 86 | textarea, .p-heights, input { 87 | box-sizing: border-box; 88 | margin: 10px 0 10px 0; 89 | width: 100%; 90 | color: var(--fg); 91 | font-size: 1em; 92 | min-height: 2em; 93 | line-height: 1.5em; 94 | font-family: 'JetBrains Mono', monospace; 95 | background: var(--bg); 96 | border: 2px solid var(--fg); 97 | border-top-width: 4px; 98 | border-left-width: 4px; 99 | word-wrap: break-word; 100 | outline: none; 101 | white-space: pre-wrap; 102 | } 103 | 104 | textarea { 105 | top: 0; 106 | left: 0; 107 | bottom: 0; 108 | right: 0; 109 | overflow: hidden; 110 | resize: none; 111 | position: absolute; 112 | } 113 | 114 | .p-heights { 115 | visibility: hidden; 116 | } 117 | 118 | 119 | button:hover { 120 | background: var(--fgHover); 121 | } 122 | .action { 123 | border: 2px solid var(--fg); 124 | border-top-width: 4px; 125 | border-left-width: 4px; 126 | margin-bottom: 20px; 127 | padding: 5px; 128 | } 129 | 130 | 131 | button { 132 | width: fit-content; 133 | align-self: center; 134 | font-family: 'JetBrains Mono', monospace; 135 | background: var(--nav); 136 | color: var(--fg); 137 | font-size: 1em; 138 | } 139 | 140 | .icon::after { 141 | position: absolute; 142 | pointer-events: none; 143 | white-space: nowrap; 144 | content: 'Add new page'; 145 | padding: 5px; 146 | background: var(--fgHover); 147 | border-radius: 6px; 148 | font-size: 14px; 149 | box-shadow: 0 2px 4px rgb(0 0 0 / 20%); 150 | opacity: 0; 151 | transform: translate(-63%, 40px); 152 | transition: opacity .2s, transform .2s; 153 | transition-delay: 0s; 154 | } 155 | 156 | .icon:hover::after { 157 | transition-delay: 0.5s; 158 | opacity: 1 !important; 159 | } 160 | 161 | 162 | .engine { 163 | width: 100%; 164 | margin: 0 auto; 165 | display: flex; 166 | flex-direction: column; 167 | align-items: flex-start; 168 | } 169 | 170 | .time { 171 | color: var(--fg); 172 | opacity: 0.4; 173 | } 174 | 175 | .datacontent { 176 | width: 100%; 177 
| position: relative; 178 | display: flex; 179 | 180 | } 181 | 182 | .result { 183 | display: flex; 184 | padding-top: 5px; 185 | padding-bottom: 5px; 186 | } 187 | 188 | .result:hover, .hoverShow { 189 | background: var(--fgHover); 190 | } 191 | 192 | .rowWrapper { 193 | display: flex; 194 | justify-content: space-around; 195 | } 196 | 197 | 198 | 199 | nav { 200 | width: 100%; 201 | height: 6.5625em; 202 | background: var(--nav); 203 | display: flex; 204 | flex-direction: column; 205 | } 206 | 207 | .titleNav, .welcomeNav { 208 | margin: 3px 1px 3px 1px; 209 | } 210 | 211 | .titleNav { 212 | background-color: #e5e5f7; 213 | opacity: 0.8; 214 | background-size: 10px 10px; 215 | background-image: repeating-linear-gradient(45deg, var(--navDarker), var(--navDarker) 1px, #e5e5f7 0, #e5e5f7 50%); 216 | color: white; 217 | text-align: center; 218 | } 219 | 220 | 221 | .cover { 222 | background: #6165EE; 223 | } 224 | 225 | .welcomeNav { 226 | width: 11em; 227 | font-weight: 400; 228 | box-shadow: 3px 2px 1px 1px black; 229 | z-index: 10 !important; 230 | border-top: 2px solid var(--navDarker); 231 | border-left: 2px solid var(--navDarker); 232 | padding-left: 3px; 233 | padding-right: 3px; 234 | } 235 | 236 | .welcomeNav:hover { 237 | transform: translate(2px, 2px); 238 | } 239 | 240 | 241 | .navSubar { 242 | display: flex; 243 | align-items: center; 244 | z-index: 2; 245 | background-color: var(--nav); 246 | justify-content: space-between; 247 | border-bottom-width: 0; 248 | box-shadow: 3px 2px 1px 1px black; 249 | border-left: 2px solid var(--navDarker); 250 | } 251 | 252 | .windowBar { 253 | display: flex; 254 | flex-direction: row; 255 | height: 25px; 256 | justify-content: space-evenly; 257 | background: linear-gradient(to right, #90441D, #ECD1B5); 258 | color: white; 259 | } 260 | 261 | .navPattern { 262 | flex-grow: 1; 263 | opacity: 0.8; 264 | margin-top: 3px; 265 | margin-bottom: 3px; 266 | background: transparent; 267 | background-image: linear-gradient(0deg,
transparent 50%, black 50%); 268 | background-size: 2px 2px; 269 | } 270 | 271 | .navSubar button { 272 | outline: none; 273 | border: none; 274 | height: fit-content; 275 | } 276 | 277 | .navInput { 278 | background: white; 279 | flex-grow: 1; 280 | border-radius: 0; 281 | border: 1px solid black; 282 | border-top-width: 3px; 283 | margin-right: 10px; 284 | border-left-width: 3px; 285 | } 286 | 287 | .modal { 288 | position: fixed; 289 | z-index: 5; 290 | background-color: var(--bg); 291 | overflow: auto; 292 | opacity: 0.95; 293 | top: 0; 294 | left: 0; 295 | right: 0; 296 | bottom: 0; 297 | display: flex; 298 | align-items: center; 299 | justify-content: space-around; 300 | } 301 | 302 | .modalBody { 303 | padding: 20px; 304 | white-space: pre-line; 305 | } 306 | 307 | .modalContent { 308 | max-width: 750px; 309 | position: relative; 310 | top: 15px; 311 | margin: auto; 312 | display: flex; 313 | flex-direction: column; 314 | border: 5px solid var(--navDarker); 315 | } 316 | 317 | .modalNavTitle { 318 | margin: 0 5px 0 5px; 319 | padding: 0; 320 | font-size: 0.7em; 321 | font-weight: 700; 322 | align-self: center; 323 | } 324 | 325 | .closeModal { 326 | margin-right: 5px; 327 | margin-left: 5px; 328 | padding: 0; 329 | width: 20px; 330 | height: 20px; 331 | line-height: 0; 332 | } 333 | 334 | 335 | .highlighted { 336 | background: yellow; 337 | color: black; 338 | } -------------------------------------------------------------------------------- /static/img/about.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/amirgamil/apollo/c34d6d11efc4d049d199fa1ff1f6df15f1063e70/static/img/about.png -------------------------------------------------------------------------------- /static/img/add.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/amirgamil/apollo/c34d6d11efc4d049d199fa1ff1f6df15f1063e70/static/img/add.png 
-------------------------------------------------------------------------------- /static/img/home.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/amirgamil/apollo/c34d6d11efc4d049d199fa1ff1f6df15f1063e70/static/img/home.png -------------------------------------------------------------------------------- /static/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | Apollo 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 19 | 20 | 21 | 22 | 23 | 24 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | -------------------------------------------------------------------------------- /static/js/main.js: -------------------------------------------------------------------------------- 1 | // only fire fn once it hasn't been called in delay ms 2 | const debounce = (fn, delay) => { 3 | let to = null; 4 | return (...args) => { 5 | const bfn = () => fn(...args); 6 | clearTimeout(to); 7 | to = setTimeout(bfn, delay); 8 | } 9 | } 10 | 11 | class Data extends Atom { 12 | 13 | } 14 | 15 | //takes a string of content and returns 16 | //a text with HTML tags injected for key query words 17 | const highlightContent = (text, query) => { 18 | const regex = new RegExp(query.join(' ')); 19 | return text.replace(regex, `<span class="highlighted">${query[0]}</span>`); 20 | } 21 | 22 | 23 | class SearchResults extends CollectionStoreOf(Data) { 24 | fetch(query) { 25 | return fetch("/search?q=" + encodeURIComponent(query), 26 | { 27 | method: "POST", 28 | mode: "no-cors", 29 | body: JSON.stringify() 30 | }) 31 | .then(response => { 32 | if (response.ok) { 33 | return response.json(); 34 | } else { 35 | return Promise.reject(response); 36 | } 37 | }).then(result => { 38 | if (result) { 39 | //time comes back in nanoseconds 40 | this.time = result.time * 0.000001; 41 | this.query = result.query; 42 |
this.setStore(result.data.map((element, id) => { 46 | element["selected"] = id === 0 ? true : false; 47 | element["content"] = highlightContent(element["content"], this.query); 48 | return new Data(element); 49 | })); 50 | } else { 51 | this.setStore([]); 52 | } 53 | }) 54 | .catch(ex => { 55 | console.log("Exception occurred trying to fetch the result of a request: ", ex); 56 | }) 57 | } 58 | } 59 | 60 | class Result extends Component { 61 | init(data, removeCallBack) { 62 | this.data = data; 63 | this.removeCallBack = removeCallBack; 64 | this.displayDetails = false; 65 | this.loadPreview = this.loadPreview.bind(this); 66 | this.closeModal = this.closeModal.bind(this); 67 | this.bind(data); 68 | } 69 | 70 | loadPreview() { 71 | //fetch the full text 72 | fetch("/getRecordDetail?q=" + encodeURIComponent(this.data.get("title")), { 73 | method: "POST", 74 | mode: "no-cors", 75 | body: JSON.stringify() 76 | }).then(data => data.json()) 77 | .then(res => { 78 | this.displayDetails = true; 79 | //add highlighting 80 | this.data.update({"fullContent": res}); 81 | }).catch(ex => { 82 | console.log("Error fetching details of item: ", ex); 83 | }) 84 | } 85 | 86 | closeModal(evt) { 87 | //stop bubbling up the DOM, which would cancel this action by loading the preview 88 | evt.stopPropagation(); 89 | this.displayDetails = false; 90 | this.render(); 91 | } 92 | 93 | 94 | create({title, link, content, selected, fullContent}) { 95 | const contentToDisplay = content + "..."; 96 | return html`
97 | evt.stopPropagation()} href=${link}>${title} 98 |

99 | ${this.displayDetails ? html`` : null} 115 |
` 116 | } 117 | } 118 | 119 | class SearchResultsList extends ListOf(Result) { 120 | create() { 121 | return html`
122 | ${this.nodes} 123 |
`; 124 | } 125 | } 126 | 127 | 128 | class SearchEngine extends Component { 129 | init(router, query) { 130 | this.router = router; 131 | this.query = query; 132 | this.searchInput = ""; 133 | this.searchData = new SearchResults(); 134 | this.searchResultsList = new SearchResultsList(this.searchData); 135 | this.handleInput = this.handleInput.bind(this); 136 | this.loading = false; 137 | 138 | //used to change selected results based on arrow keys 139 | this.selected = 0; 140 | this.time = ""; 141 | //add a little bit of delay before we search because too many network requests 142 | //will slow down retrieval of search results, especially as the user is typing their desired query 143 | //each time the user lifts up their finger from the keyboard, debounce will fire, which will 144 | //check if 500ms has elapsed; if it has, it will query and load the search results, 145 | //otherwise if it's called again, rinse and repeat 146 | this.loadSearchResults = debounce(this.loadSearchResults.bind(this), 500); 147 | this.setSearchInput = this.setSearchInput.bind(this); 148 | this.handleKeydown = this.handleKeydown.bind(this); 149 | this.toggleSelected = this.toggleSelected.bind(this); 150 | //if we have a query on initialization, navigate to it directly 151 | if (this.query) { 152 | this.setSearchInput(this.query); 153 | this.loadSearchResults(this.query); 154 | } 155 | } 156 | 157 | //TODO: add pagination into API to return e.g.
20 results and load more for speed 158 | loadSearchResults(evt) { 159 | if (evt.key === "ArrowDown" || evt.key === "ArrowUp" || evt.key === "Enter" || evt.key === "Escape") { 160 | return; 161 | } 162 | this.searchData.fetch(this.searchInput) 163 | .then(() => { 164 | this.loading = false; 165 | this.render(); 166 | }) 167 | .catch(ex => { 168 | //if an error occurred, the page won't render, so we need to call render to update it with the error message 169 | this.render(); 170 | }) 171 | } 172 | 173 | setSearchInput(value) { 174 | this.searchInput = value; 175 | } 176 | 177 | handleInput(evt) { 178 | this.setSearchInput(evt.target.value); 179 | this.router.navigate("/search?q=" + encodeURIComponent(evt.target.value)); 180 | this.loading = true; 181 | this.render(); 182 | //get search results 183 | // this.loadSearchResults(this.searchInput); 184 | } 185 | 186 | styles() { 187 | return css` 188 | .engineTitle { 189 | align-self: center; 190 | } 191 | .blue { 192 | color: #2A63BF; 193 | } 194 | 195 | .red { 196 | color: #E34133; 197 | } 198 | .yellow { 199 | color: #F3B828; 200 | } 201 | .green { 202 | color: #32A556; 203 | } 204 | ` 205 | } 206 | 207 | toggleSelected(state) { 208 | const listSize = this.searchResultsList.size; 209 | switch (state) { 210 | case "ArrowDown": 211 | this.selected += 1; 212 | if (this.selected < listSize) { 213 | window.scrollBy(0, 100); 214 | this.searchResultsList.nodes[this.selected - 1].data.update({"selected": false}); 215 | this.searchResultsList.nodes[this.selected].data.update({"selected": true}); 216 | } else { 217 | window.scrollTo(0, 0); 218 | this.selected = 0; 219 | this.searchResultsList.nodes[this.selected].data.update({"selected": true}); 220 | this.searchResultsList.nodes[listSize - 1].data.update({"selected": false}); 221 | } 222 | break; 223 | case "ArrowUp": 224 | this.selected -= 1; 225 | if (this.selected >= 0) { 226 | window.scrollBy(0, -100); 227 | this.searchResultsList.nodes[this.selected + 1].data.update({"selected": false});
228 | this.searchResultsList.nodes[this.selected].data.update({"selected": true}); 229 | } else { 230 | window.scrollBy(0, document.body.scrollHeight); 231 | this.selected = listSize - 1; 232 | this.searchResultsList.nodes[0].data.update({"selected": false}); 233 | this.searchResultsList.nodes[this.selected].data.update({"selected": true}); 234 | } 235 | 236 | } 237 | this.searchResultsList.nodes[this.selected].render(); 238 | } 239 | 240 | handleKeydown(evt) { 241 | //TODO: handle cmd+a followed by backspace, which should empty all search results 242 | if (evt.key === "ArrowDown" || evt.key === "ArrowUp") { 243 | //change the selected attribute 244 | evt.preventDefault(); 245 | this.toggleSelected(evt.key); 246 | } else if (evt.key === "Enter") { 247 | evt.preventDefault(); 248 | this.searchResultsList.nodes[this.selected].loadPreview(); 249 | } else if (evt.key === "Escape") { 250 | evt.preventDefault(); 251 | this.searchResultsList.nodes[this.selected].displayDetails = false; 252 | this.searchResultsList.nodes[this.selected].render(); 253 | } 254 | } 255 | 256 | create() { 257 | const time = this.searchData.time ? this.searchData.time.toFixed(2) : 0; 258 | return html`
259 |

Apollo

260 | 261 |

${this.searchInput ? "About " + this.searchData.size + " results (" + time + "ms)" : html`

To navigate with your keyboard: Arrow keys move up and down results, Enter opens the result in detail, Escape 262 | closes the detail view

`}

263 | ${this.loading ? html`

loading...

` : this.searchResultsList.node} 264 |
` 265 | } 266 | } 267 | 268 | //where we add data for now, probably going to change 269 | class DigitalFootPrint extends Component { 270 | init() { 271 | //initialize stuff here 272 | this.data = new Data({title: "", link: "", content: "", tags: ""}); 273 | this.handleInput = this.handleInput.bind(this); 274 | this.handleTitle = (evt) => this.handleInput("title", evt); 275 | this.handleLink = (evt) => this.handleInput("link", evt); 276 | this.showModal = false; 277 | this.modalText = ""; 278 | this.handleContent = (evt) => this.handleInput("content", evt); 279 | this.handleTags = (evt) => this.handleInput("tags", evt); 280 | this.addData = this.addData.bind(this); 281 | this.scrapeData = this.scrapeData.bind(this); 282 | this.closeModal = this.closeModal.bind(this); 283 | this.password = ""; 284 | this.isAuthenticated = window.localStorage.getItem("authenticated") === "true"; 285 | this.authenticatePassword = this.authenticatePassword.bind(this); 286 | this.showAuthError = this.showAuthError.bind(this); 287 | this.updatePassword = this.updatePassword.bind(this); 288 | this.bind(this.data); 289 | } 290 | 291 | authenticatePassword() { 292 | fetch("/authenticate", { 293 | method: "POST", 294 | mode: "no-cors", 295 | headers: { 296 | "Content-Type": "application/json" 297 | }, 298 | body: JSON.stringify({ 299 | "password": this.password 300 | }) 301 | }).then(response => { 302 | if (response.ok) { 303 | window.localStorage.setItem("authenticated", "true"); 304 | this.modalText = "Hooray!";
305 | this.showModal = true; 306 | this.render(); 307 | } else { 308 | window.localStorage.setItem("authenticated", "false"); 309 | return Promise.reject(response); 310 | } 311 | }).catch(e => { 312 | this.showAuthError(); 313 | return; 314 | }) 315 | } 316 | 317 | showAuthError() { 318 | this.modalText = "You're not Amir :("; 319 | this.showModal = true; 320 | this.render(); 321 | } 322 | 323 | closeModal() { 324 | this.showModal = false; 325 | this.render(); 326 | } 327 | 328 | updatePassword(evt) { 329 | this.password = evt.target.value; 330 | this.render(); 331 | } 332 | 333 | 334 | scrapeData(evt) { 335 | if (!this.isAuthenticated) { 336 | this.showAuthError(); 337 | return; 338 | } 339 | this.showModal = true; 340 | this.modalText = "Hold on, doing some magic..."; 341 | this.render(); 342 | fetch("/scrape?q=" + encodeURIComponent(this.data.get("link")), { 343 | method: "POST", 344 | mode: "no-cors", 345 | headers: { 346 | "Content-Type" : "application/json" 347 | }, 348 | }).then(response => { 349 | if (response.ok) { 350 | return response.json(); 351 | } else { 352 | return Promise.reject(response); 353 | } 354 | }).then(data => { 355 | this.showModal = false; 356 | this.data.update({title: data["title"], content: data["content"]}); 357 | // window.scrollBy(0, document.body.scrollHeight); 358 | }).catch(ex => { 359 | console.log("Exception trying to fetch the article: ", ex); 360 | this.modalText = "Error scraping, sorry!"; 361 | this.render(); 362 | }) 363 | } 364 | 365 | getTagArrayFromString(tagString) { 366 | //remove whitespace 367 | tagString = tagString.replace(/\s/g, ""); 368 | let tags = tagString.split('#'); 369 | tags = tags.length > 1 ?
tags.slice(1) : []; 370 | return tags; 371 | } 372 | 373 | addData() { 374 | if (!this.isAuthenticated) { 375 | this.showAuthError(); 376 | return; 377 | } 378 | //create array from text tags 379 | let tags = this.getTagArrayFromString(this.data.get("tags")); 380 | fetch("/addData", { 381 | method: "POST", 382 | mode: "no-cors", 383 | headers: { 384 | "Content-Type":"application/json" 385 | }, 386 | body: JSON.stringify({ 387 | title: this.data.get("title"), 388 | link: this.data.get("link"), 389 | content: this.data.get("content"), 390 | tags: tags 391 | }) 392 | }).then(response => { 393 | if (response.ok) { 394 | //TODO: change to actually display 395 | this.showModal = true; 396 | this.modalText = "Success!"; 397 | this.render(); 398 | } else { 399 | return Promise.reject(response); 400 | } 401 | }).catch(ex => { 402 | console.log("Error adding to the db: ", ex); 403 | this.showModal = true; 404 | this.modalText = "Error adding, sorry!"; 405 | this.render(); 406 | }) 407 | } 408 | 409 | handleInput(el, evt) { 410 | this.data.update({[el]: evt.target.value}); 411 | } 412 | 413 | create({title, link, content, tags}) { 414 | 415 | return html`
416 |

Add some data

417 | 418 | 419 | 420 |
421 | 422 |
${content}
423 |
424 |
425 | 426 | 427 |
428 |

Are you Amir? Please prove yourself

429 | 430 | 431 | ${this.showModal ? html`` : null} 443 |
` 444 | } 445 | } 446 | 447 | const about = html`
448 |

About

449 |

Apollo is an attempt at making something that has felt impersonal for the longest time, personal again. 450 | 451 |

452 | 453 |

454 | The computer revolution produced 455 | personal computers yet impersonal search engines. So what's Apollo? It's a Unix-style search engine 456 | for your digital footprint. The design authentically steals from the past. This is intentional. When I use Apollo, I want to feel like I'm 457 | travelling through the past. 458 |

459 | 460 |

How do I define digital footprint? There are many possible definitions here; I define it as anything 461 | digital I come across that I want to remember in the future. 462 | 463 |

464 |

465 | It's like an indexable database or search engine for anything interesting I come across on the web. There are also some personal data 466 | sources I pull from like Athena for my thoughts or 467 | Zeus for curated resources or Kindle Highlights. 468 | This is in addition to any interesting thing I come across on the web, which I can add directly via the web crawler. 469 |

470 | 471 |

The web crawler can scrape any article or blog post and reliably get the text - so you can index the entire post without even 472 | having to copy it! Once again, this is intentional. I read a lot of stuff on the Internet but don't take notes (because I'm lazy). Now I can 473 | index anything interesting I come across and don't have to feel guilty about not having made notes. So just to be clear, 474 | I'm not indexing just the name of an article - I'm indexing the entire contents! If that's not cool, I don't know what is. 475 |

476 | 477 |

I no longer have to rely on my memory to index anything interesting I come across. And now you don't have to either.

478 | 479 |

P.S. I put a lot of ❤️ in this project, I hope you like it :)

480 | 481 |
` 482 | 483 | class App extends Component { 484 | init() { 485 | this.router = new Router(3); 486 | this.footprint = new DigitalFootPrint(); 487 | this.router.on({ 488 | route: "/search", 489 | handler: (route, params) => { 490 | this.engine = new SearchEngine(this.router, params["q"]); 491 | this.route = route; 492 | this.render(); 493 | } 494 | }); 495 | 496 | this.router.on({ 497 | route: ["/about", "/add"], 498 | handler: (route) => { 499 | this.route = route; 500 | this.render(); 501 | } 502 | }) 503 | 504 | this.router.on({ 505 | route: "/", 506 | handler: (route) => { 507 | this.engine = new SearchEngine(this.router); 508 | this.route = route; 509 | this.render(); 510 | } 511 | }); 512 | } 513 | 514 | create() { 515 | const hour = new Date().getHours(); 516 | if (hour > 19 || hour < 7) { 517 | document.body.classList.add('dark'); 518 | document.documentElement.style.color = '#222'; 519 | } else { 520 | document.body.classList.remove('dark'); 521 | document.documentElement.style.color = '#fafafa'; 522 | } 523 | return html`
524 | 536 |
537 | ${() => { 538 | switch (this.route) { 539 | case "/add": 540 | return this.footprint.node; 541 | case "/about": 542 | return about; 543 | default: 544 | return this.engine.node; 545 | } 546 | }} 547 |
548 | 549 |
` 550 | } 551 | } 552 | 553 | const app = new App(); 554 | document.body.appendChild(app.node); -------------------------------------------------------------------------------- /static/js/poseidon.min.js: -------------------------------------------------------------------------------- 1 | //prints vDOM tree to compare with DOM 2 | //tabs is used as helper to print the DOM in a readable format 3 | const printDOMTree = (node, tabs = "") => { 4 | if (node.children === undefined) { 5 | return tabs + node[0].tag 6 | } else { 7 | prettyPrint = "" 8 | node.children.forEach(node => { 9 | prettyPrint += tabs + node.tag + "\n" + printDOMTree(node.children, tabs + "\t") + "\n"; 10 | }) 11 | return prettyPrint; 12 | } 13 | } 14 | 15 | function updateDOMProperties(node, prevVNode, nextVNode) { 16 | //if this is a text node, update the text value 17 | if (prevVNode.tag == "TEXT_ELEMENT" && nextVNode.tag == "TEXT_ELEMENT") { 18 | //set the data attribute in our DOM node instead of nodeValue for speed and for better error detection 19 | //(that we should not be setting this value for HTML tags that don't implement the CharacterData interface) 20 | node.data = nextVNode.nodeValue; 21 | } 22 | //add/remove attributes, event listeners 23 | //remove attributes 24 | Object.keys(prevVNode.attributes || []) 25 | .forEach((key, _) => { 26 | node.removeAttribute(key); 27 | }); 28 | 29 | //remove old event listeners 30 | Object.keys(prevVNode.events || []) 31 | .forEach((key, _) => { 32 | //remove event listener and set the value of the associated key to null 33 | node.removeEventListener(key, prevVNode.events[key]); 34 | }); 35 | 36 | 37 | //add attributes 38 | var attributes = nextVNode.attributes || [] 39 | //helper method that sets an attribute 40 | const setAttributeHelper = (key, val) => { 41 | //check if an ISL attribute was already mutated from DOM manipulation, in which case don't set it 42 | //otherwise may produce unintended DOM side-effects (e.g. 
changing the value of selectionStart) 43 | if (key && node[key] === val) { 44 | return; 45 | } 46 | //otherwise modify the attribute if it already exists and set element otherwise 47 | if (key in node) { 48 | node[key] = val; 49 | } else { 50 | // node[key] = val; 51 | node.setAttribute(key, val); 52 | } 53 | } 54 | //note if nextVNode is a fully rendered DOM node, .attributes will return a named node map 55 | //or we have a fully fledged DOM node where .attributes returns a NamedNodeMap 56 | //check this is a vdom node before applying attributes 57 | if (!(attributes.length) || attributes.length === 0) { 58 | //this means nextVNode is a vdom node 59 | Object.keys(attributes) 60 | .forEach((key, ) => { 61 | setAttributeHelper(key, nextVNode.attributes[key]); 62 | }); 63 | } 64 | //add event listeners 65 | Object.keys(nextVNode.events || []) 66 | .forEach((key, _) => { 67 | node.addEventListener(key, nextVNode.events[key]); 68 | }); 69 | } 70 | 71 | 72 | const isEvent = key => key.startsWith("on"); 73 | const isDOM = node => node.nodeType !== Node.TEXT_NODE; 74 | //instantiate a virtual DOM node to an actual DOM node 75 | const instantiate = (vNode) => { 76 | if (!vNode.tag) { 77 | //if no tag, then this is already a rendered DOM node, 78 | if (vNode.node) { 79 | return vNode.node 80 | } 81 | return vNode; 82 | } else { 83 | const domNode = vNode.tag !== "TEXT_ELEMENT" ? 
document.createElement(vNode.tag) : document.createTextNode(vNode.nodeValue); 84 | updateDOMProperties(domNode, normalize(null), vNode); 85 | //create children 86 | const childrenV = vNode.children || []; 87 | const childrenDOM = childrenV.map(instantiate); 88 | childrenDOM.forEach(child => { 89 | domNode.appendChild(child); 90 | }); 91 | return domNode; 92 | } 93 | } 94 | 95 | 96 | //Tags 97 | const APPEND = 1; 98 | const DELETE = 2; 99 | const REPLACE = 3; 100 | const UPDATE = 4; 101 | 102 | //queue to manage all updates to the DOM 103 | //List of {op: , details: {}} 104 | const updateQueue = []; 105 | 106 | //used to apply DOM operations from the queue 107 | const performWork = () => { 108 | var node = null; 109 | for (let i = 0; i < updateQueue.length; i++) { 110 | //process the item at index i 111 | const item = updateQueue[i]; 112 | switch (item.op) { 113 | case APPEND: 114 | parent = item.details.parent; 115 | child = item.details.node; 116 | if (parent) { 117 | parent.appendChild(child); 118 | } 119 | break; 120 | case REPLACE: 121 | dom = item.details.dom; 122 | prev = item.details.previous; 123 | //note calling instantiate also sets DOM properties 124 | next = instantiate(item.details.node); 125 | dom.replaceWith(next); 126 | node = next; 127 | break; 128 | case DELETE: 129 | parent = item.details.parent; 130 | toRemove = item.details.node; 131 | parent.removeChild(toRemove); 132 | break; 133 | case UPDATE: 134 | dom = item.details.dom; 135 | prev = item.details.prev; 136 | newNode = item.details.new; 137 | updateDOMProperties(dom, prev, newNode); 138 | break; 139 | } 140 | } 141 | //reset `updateQueue` now that we've dequeued everything (this will empty the queue) 142 | updateQueue.length = 0; 143 | return node; 144 | } 145 | 146 | //used to normalize vDOM nodes to prevent constantly checking if nodes are undefined before accessing properties 147 | const normalize = (vNode) => { 148 | if (!vNode) { 149 | return {tag: "", children: [], events: {},
attributes: {}}; 150 | } 151 | if (!(vNode.children)) { 152 | vNode.children = []; 153 | } 154 | if (!(vNode.events)) { 155 | vNode.events = {}; 156 | } 157 | 158 | if (!(vNode.attributes)) { 159 | vNode.attributes = {}; 160 | } 161 | return vNode; 162 | } 163 | 164 | //main render method for reconciliation 165 | //newVNode: is new vDOM node to be rendered, 166 | //prevVNode: is old vDOM node that was previously rendered 167 | //nodeDOM: is the corresponding node in the DOM 168 | const renderVDOM = (newVNode, prevVNode, nodeDOM) => { 169 | //if we have an empty node, return 170 | if (!newVNode && !prevVNode) { 171 | return; 172 | } 173 | const sameType = prevVNode && newVNode && newVNode.tag === prevVNode.tag; 174 | prevVNode = normalize(prevVNode); 175 | newVNode = normalize(newVNode); 176 | var node = normalize(null); 177 | //same node, only update properties 178 | if (sameType) { 179 | //means we have an element loaded in a list node since list nodes hand over fully rendered DOM nodes 180 | if (newVNode.tag === undefined) { 181 | updateQueue.push({op: REPLACE, details: {dom: nodeDOM, previous: prevVNode, node: newVNode}}); 182 | node = newVNode; 183 | } else { 184 | updateQueue.push({op: UPDATE, details: {dom: nodeDOM, prev: prevVNode, new: newVNode}}); 185 | //render children 186 | if (newVNode.children) { 187 | const count = Math.max(newVNode.children.length, prevVNode.children.length); 188 | const domChildren = nodeDOM ? nodeDOM.childNodes : []; 189 | for (let i = 0; i < count; i++) { 190 | newChild = newVNode.children[i]; 191 | prev = prevVNode.children[i]; 192 | //note there are two cases to consider here, either we have a child in our DOM tree (that is domChildren[i] is NOT 193 | //undefined) or we don't. If we don't have a DOM child, there are two subcases a) newVNode doesn't exist 194 | //or b) prevVnode doesn't exist.
195 | domChild = domChildren[i] || true; 196 | child = renderVDOM(newChild, prev, domChild); 197 | //only append node if it's new 198 | if (child && !prev) { 199 | updateQueue.push({op: APPEND, details: {parent: nodeDOM, node: child}}); 200 | } 201 | } 202 | } 203 | node = nodeDOM; 204 | //if we are only updating the value of two text nodes, defer doing the `performWork` operation 205 | //until the caller finishes. This reduces unwanted side effects when we are diffing deep nested 206 | //trees since we don't change the DOM until we've finished looking at all levels of the tree 207 | //as opposed to altering the DOM one level deep into the tree when we haven't yet looked at 208 | //the other levels 209 | if (prevVNode.tag === "TEXT_ELEMENT" && newVNode.tag === "TEXT_ELEMENT") { 210 | return node; 211 | } 212 | } 213 | } else if (newVNode.tag == "") { 214 | //node is no longer present so remove previous present virtual node 215 | //note if the DOM node is true (line 179), then that node has already been handled i.e. removed or added in a previous iteration 216 | if (nodeDOM !== true) { 217 | updateQueue.push({op: DELETE, details: {parent: nodeDOM.parentNode, node: nodeDOM}}); 218 | //Note we want to return here (i.e. not perform any work yet) to avoid removing DOM nodes before 219 | //we have processed all of the children (to avoid indexing issues at line 168 causing us to skip nodes). This means we defer the 220 | //`performWork` operation to be called by the parent. Note there is no scenario where we would encounter 221 | //an empty newVNode that reaches this block without being called by a parent.
222 | return node; 223 | } 224 | 225 | } else if (prevVNode.tag == "") { 226 | //we have a new node that is currently not in the DOM 227 | node = instantiate(newVNode); 228 | if (nodeDOM) { 229 | //return child, parent will handle the add to the queue 230 | return node; 231 | } 232 | //otherwise adding a node to a currently empty DOM tree 233 | updateQueue.push({op: APPEND, details: {parent: null, node: node}}); 234 | } else { 235 | //node has changed, so replace 236 | //note we use a similar heuristic to the React diffing algorithm here - since the nodes are different 237 | //we rebuild the entire tree at this node 238 | updateQueue.push({op: REPLACE, details: {dom: nodeDOM, previous: prevVNode, node: newVNode}}); 239 | //if we have operations in our queue (i.e. length is greater than 1 since we just pushed an op) we have not yet performed then we defer 240 | //performing the work until we have processed all of the children to reduce side-effects of altering the DOM. 241 | //Note we don't need to do this if there are no operations we need to perform since if the queue 242 | //is empty, we can be confident there are no past operations that will introduce side-effects by altering the current state of the DOM 243 | if (updateQueue.length !== 1) { 244 | return node; 245 | } 246 | } 247 | 248 | //Done diffing so we can now render the updates 249 | const res = performWork(); 250 | //one edge cases that arises is when we attempt to replace the entire DOM tree (i.e. on first iteration) - we push to the queue 251 | //but never assign node which we initialize to `normalize(null)`. 
This would result in incorrectly updating the DOM to null so we check 252 | //for this case here 253 | if (res && node.tag === "") node = res; 254 | return node; 255 | } 256 | 257 | 258 | 259 | //This is the internal representation of a vDOM node in Poseidon that we will then render onto the DOM 260 | //note we don't use the type and props approach of React because we're going to be creating our own virtual DOM representation 261 | const node = { 262 | //tag i.e. h1, p etc. 263 | tag: '', 264 | children: [], 265 | //any events it's listening to e.g. onclick, onmousedown etc. - maps event names to their handlers 266 | events: {}, 267 | //map of attributes to values (e.g. {class: "...", id: "../"}) 268 | attributes: {} 269 | }; 270 | 271 | 272 | //pointer to global stylesheet to be used in subsequent reloads 273 | let globalStyleSheet; 274 | //maps components to class-names, used to check if styles for a component have already been declared 275 | //e.g. when initializing different elements of a list 276 | const CSS_CACHE = new Map(); 277 | //global rule index to insert CSS rules sequentially 278 | var ruleIndex = 0; 279 | //helper method used to convert the JSON object the `css` template literal returns into 280 | //a set of styles - this function is recursive and resolves nested JSON CSS objects 281 | //the logic may seem confusing but we need to wrap a list of nested JSON CSS objects 282 | //and an array of CSS rules into a flat structure that resolves the selectors 283 | //To do this, we distinguish between the rules for a given nested selector and nested objects.
//We add rules for a given selector at the end once we've guaranteed there are no more 285 | //nested JSON objects to parse 286 | const parseCSSJSON = (JSONCSS, containerHash, styleRules, specialTag = false) => { 287 | const {tag, rules} = JSONCSS; 288 | //represents the overall text of our CSS 289 | var text = ""; 290 | var cssTag; 291 | //the specialTag boolean parameter marks whether we need to handle the text differently when appending to the 292 | //stylesheet 293 | 294 | //if this is a special tag that contains @keyframes or @media, we need to remove 295 | //the inner references to the container component nesting 296 | if (tag.includes("@keyframes") || tag.includes("@media")) { 297 | specialTag = true; 298 | cssTag = tag; 299 | text += tag + "{\n\n"; 300 | } else { 301 | //replace references to the container component which was unknown at the time of generating 302 | //the CSS set of JSON rules 303 | cssTag = tag.replace("", containerHash); 304 | } 305 | //represents the set of rules for the current selector at this level of our tree 306 | var textForCurrentSelector = cssTag + " { \n"; 307 | 308 | 309 | 310 | 311 | rules.forEach((item, _) => { 312 | //check if this is a rule or a nested CSS JSON object 313 | if (item.key) { 314 | const {key, value} = item; 315 | textForCurrentSelector += "\t" + key + ":" + value + ";\n"; 316 | } else { 317 | //then this is a nested JSON tag so we need to recurse 318 | text += parseCSSJSON(item, containerHash, styleRules, specialTag); 319 | 320 | } 321 | }); 322 | if (specialTag && !text) { 323 | return textForCurrentSelector + "}"; 324 | } 325 | //if text is not empty, we are adding a special rule like @media or @keyframes 326 | if (text) { 327 | styleRules.push(text + "}"); 328 | } else { 329 | //add the rules for the current level now that we've finished parsing all of the nested rules 330 |
styleRules.push(textForCurrentSelector + "}"); 331 | } 332 | return ""; 333 | } 334 | 335 | const initStyleSheet = (userJSONStyles, name, rules) => { 336 | const containerHash = CSS_CACHE.get(name); 337 | //create style tag 338 | const cssNode = document.createElement('style'); 339 | cssNode.type = 'text/css'; 340 | //identify poseidon set of css rules with a unique data attribute 341 | cssNode.setAttribute("data-poseidon", "true"); 342 | document.head.appendChild(cssNode); 343 | globalStyleSheet = cssNode.sheet; 344 | //add . before class for the css stylesheet 345 | parseCSSJSON(userJSONStyles, "." + containerHash, rules); 346 | } 347 | const generateUniqueHash = (string) => { 348 | var hashedString = string; 349 | // Math.random should be unique because of its seeding algorithm. 350 | // Convert it to base 36 (numbers + letters), and grab the first 9 characters 351 | // after the decimal. 352 | hashedString += Math.random().toString(36).substr(2, 9); 353 | return hashedString; 354 | } 355 | 356 | const injectStyles = (rules) => { 357 | //add the rules to our stylesheet 358 | for (const rule of rules) { 359 | globalStyleSheet.insertRule(rule); 360 | } 361 | } 362 | 363 | //unit of UI 364 | class Component { 365 | constructor(...args) { 366 | //initialize stuff 367 | //vdom from create 368 | this.vdom = null; 369 | if (this.init !== undefined) { 370 | this.init(...args); 371 | } 372 | //store object of {source, handler} to remove when taking down a component 373 | //note, intentionally only store one source and handler for encapsulation 374 | this.event = {}; 375 | //`this.data` is a reserved property for passing into create to reduce side-effects and allow components to create UI without 376 | //having to rely on getting the data from elsewhere (can define in it in `init` method of a user-defined component) 377 | //call render if a component has not already been initialized with a fully-fledged, ready DOM node 378 | //(e.g. 
individual elements in a List) 379 | if (this.node === undefined) { 380 | this.render(this.data); 381 | } 382 | } 383 | 384 | //bind allows us to bind data to listen to and trigger an action when data changes. Similar to useState in React, which 385 | //triggers a re-render when data changes 386 | bind(source, handler) { 387 | if (source instanceof Listening) { 388 | //if no handler passed in, we assume the callback is just a re-render of the UI because of a change in state 389 | //handler passed in should be a JS callback that takes data and does something (data = new updated data) 390 | if (handler === undefined) { 391 | const defaultHandler = (data) => this.render(data); 392 | source.addHandler(defaultHandler); 393 | this.events = {source, handler: defaultHandler}; 394 | } else { 395 | source.addHandler(handler); 396 | this.events = {source, handler}; 397 | } 398 | } else { 399 | throw 'Attempting to bind to an unknown object!'; 400 | } 401 | } 402 | 403 | //method for adding inline css styling to components via css template literal, should be added in relevant component 404 | //by returning a css template literal 405 | // styles() { 406 | // return null; 407 | // } 408 | 409 | //helper method for adding component-defined styles 410 | addStyle(vdom) { 411 | //call only proceeds if we have custom-defined styles, for efficiency 412 | //this obviates the need for having a separate Styled component - any component 413 | //that does not implement styles() will not call any of this method's logic 414 | //and any component can use the styles() API to apply CSS styles on its elements 415 | if (!this.styles) return; 416 | 417 | //check if we have a class attribute, otherwise, create one 418 | if (!vdom.attributes["class"]) { 419 | vdom.attributes["class"] = ""; 420 | } 421 | //in order to make sure the styles only get applied to elements in the current component 422 | //generate a unique class name - note we don't use a unique ID since we may want to use the same styles 423 | //for
different instances of the same component e.g. different elements of a list
424 | //first check if the class is not in our CSS_CACHE
425 | if (!CSS_CACHE.has(this.constructor.name)) {
426 | const uniqueID = generateUniqueHash(this.constructor.name);
427 | vdom.attributes["class"] += " " + uniqueID;
428 | CSS_CACHE.set(this.constructor.name, uniqueID);
429 | } else {
430 | vdom.attributes["class"] += " " + CSS_CACHE.get(this.constructor.name);
431 | }
432 | 
433 | //if we don't already have a reference to the globalStyleSheet, we need to create it and populate it with our
434 | //css rules
435 | if (!globalStyleSheet) {
436 | this.regenerateStyleSheet();
437 | }
438 | //note: by design we don't check whether state has changed and re-generate/re-inject all of the styles automatically;
439 | //per Poseidon's API, any such state should be bound to regenerateStyleSheet below
440 | }
441 | 
442 | //generates a new stylesheet and injects all of the styles into the page. This operation is expensive
443 | //and should be called infrequently - only if state required to load css changes. As with Poseidon's API,
444 | //any such state should be bound to this method to automatically trigger a re-injection when the styles change
445 | regenerateStyleSheet() {
446 | const rules = [];
447 | const name = this.constructor.name;
448 | //get the JSON object of CSS rules
449 | const userJSONStyles = this.styles();
450 | initStyleSheet(userJSONStyles, name, rules);
451 | injectStyles(rules);
452 | }
453 | 
454 | //performs any cleanup before a component is removed, such as invalidating timers, canceling network requests, or cleaning any
455 | //bindings that were made in init
456 | remove() {
457 | //remove the handler this component registered on its bound data source
458 | const {source, handler} = this.events;
459 | if (source) source.removeHandler(handler);
460 | //reset `this.events`
461 | this.events = {};
462 | }
463 | 
464 | //create allows us to compose our unit of component
465 | //should be deterministic and have no side-effects (i.e.
should be rendered declaratively)
466 | create(data) {
467 | //eventually will need to do manipulation to convert a template string into this format, but start simple for now
468 | return null;
469 | }
470 | 
471 | //converts the internal representation of the vDOM to a DOM node
472 | //used to render a component again if something changes - ONLY if necessary
473 | render(dataIn) {
474 | var data = dataIn;
475 | //only fall back to `this.data` if no parameter is passed in; a passed-in parameter takes precedence
476 | if (this.data !== undefined && !data) {
477 | //if we set this.data when initializing a component, it should also
478 | //load the data in a manual call to render
479 | data = this.data;
480 | }
481 | if (data instanceof Atom) {
482 | data = data.state;
483 | }
484 | //create virtual DOM node
485 | const newVdom = this.create(data);
486 | //TODO: fix this, can't use insertRule if the element is not already in the DOM
487 | //apply any user-defined styles if applicable (do this before we render in case any user-generated styles
488 | //need to add any properties to the outer vDOM node e.g.
a unique id)
489 | this.addStyle(newVdom);
490 | //call the reconciliation algorithm to diff the changes and render the new DOM tree, which we save
491 | this.node = renderVDOM(newVdom, this.vdom, this.node);
492 | //return an empty comment if no valid DOM node is returned
493 | if (!this.node) this.node = document.createComment('');
494 | this.vdom = newVdom;
495 | return this.node;
496 | }
497 | }
498 | //the Listening class is used to connect handlers
499 | //to data/models for evented data stores (like in Torus)
500 | class Listening {
501 | constructor() {
502 | this.handlers = new Set();
503 | //represents the current state of the data,
504 | //used to determine when a change has happened and execute the corresponding handlers
505 | this.state = null;
506 | }
507 | //return a summary of the state
508 | summarize() {
509 | return null;
510 | }
511 | 
512 | //used to notify subscribers: executes every registered handler with the current state
513 | fire() {
514 | this.handlers.forEach(handler => {
515 | //call handler with the new state
516 | //since we pass in the state, we have direct access to an atom's data (aka state) in the handler
517 | //(including a call to render)
518 | handler(this.state);
519 | })
520 | }
521 | 
522 | //called when an atom is taken down to remove all subscribed event handlers
523 | remove() {
524 | this.handlers.forEach(handler => {
525 | //remove handler
526 | this.removeHandler(handler)
527 | })
528 | }
529 | //add a new handler
530 | addHandler(handler) {
531 | this.handlers.add(handler);
532 | handler(this.state);
533 | }
534 | //remove a handler
535 | removeHandler(handler) {
536 | this.handlers.delete(handler);
537 | }
538 | }
539 | 
540 | 
541 | //an atom is the smallest unit of data, similar to a record in Torus
542 | class Atom extends Listening {
543 | constructor(object) {
544 | super();
545 | this.state = object;
546 | }
547 | 
548 | summarize() {
549 | return this.state;
550 | }
551 | 
552 | //the default comparator, which should be overridden for custom
functionality in implementing atom classes
553 | get comparator() {
554 | return null;
555 | }
556 | 
557 | //all children of atoms should include a method that returns their type (a base implementation is provided for the general Atom)
558 | //but it should be specific to the implementing atom class
559 | get type() {
560 | return Atom;
561 | }
562 | 
563 | 
564 | //called to update the state of an atom of data
565 | //takes in an object of keys to values
566 | update(object) {
567 | for (const prop in object){
568 | this.state[prop] = object[prop];
569 | }
570 | //a change has been made to the data, so fire the handlers
571 | this.fire();
572 | 
573 | }
574 | 
575 | //used to return a property defined in an atom
576 | get(key) {
577 | return this.state[key];
578 | }
579 | 
580 | //convert the data to JSON (potentially for a persistent store, etc.)
581 | serialize() {
582 | return JSON.stringify(this.state);
583 | }
584 | 
585 | }
586 | 
587 | //Lists are backed by collection data stores (the middle man between the database and the UI) to map collections to the UI
588 | class List extends Component {
589 | //TODO: fix constructor with args
590 | constructor(item, store, remove, ...args) {
591 | //call super method
592 | super(...args);
593 | this.initList(item, store, remove);
594 | this._atomClass = store.atomClass;
595 | }
596 | 
597 | //helper method which initializes the list nodes
598 | initList(item, store, remove) {
599 | if (!(store instanceof CollectionStore)) throw 'Error: unknown data store provided, please use a CollectionStore!'
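//As a quick illustration of how these pieces fit together (hypothetical names, not part of
//the library): given an atom class `Task`, a component `TaskItem` that renders one task, and
//a store of tasks, a list is typically assembled like this:
//
//  const Tasks = CollectionStoreOf(Task);
//  const tasks = new Tasks([{title: 'write docs'}, {title: 'ship'}]);
//  const TaskList = ListOf(TaskItem);
//  const list = new TaskList(tasks); //list.node can then be mounted into the DOM
//
//`item` here is the component class used to render each element (TaskItem above) and `store`
//is the backing CollectionStore (tasks above), which is what the guard above validates.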
600 | this.store = store;
601 | //check if no remove callback is passed in, in which case we default to using the native `remove` method
602 | //provided by the store
603 | if (remove) {
604 | this.remove = remove;
605 | } else {
606 | this.remove = (data) => store.remove(data);
607 | }
608 | //domElement is the unit of component that will render each individual element of a list
609 | this.domElement = item;
610 | //backed by a Javascript Map since it maintains order and implements the iterable interface, allowing easy manipulation when looping
611 | //this map has atoms as keys and component instances as values. This prevents us from having to re-render all DOM list elements;
612 | //we only re-render the elements that have changed or the ones that need to be added
613 | this.items = new Map();
614 | this.nodes = [];
615 | //will initialize the map on the first call of itemsChanged() -> binding calls the handler the first time
616 | this.bind(store, () => this.itemsChanged());
617 | }
618 | 
619 | itemsChanged() {
620 | //loop over the store and add new elements
621 | this.store.data.forEach((element) => {
622 | if (!this.items.has(element)) {
623 | //pass in the atom to the newly initialized component, as well as the callback to remove an item from the store,
624 | //so that each component can remove its own atomic data
625 | //recall that initializing a new element will call render the first time, meaning
626 | //we will be able to access the DOM node of this new element below
627 | const domNode = new this.domElement(element, this.remove);
628 | //set the value in our items map to an instance of the actual (Poseidon) component
629 | //This allows us to grab specific components and update them in a higher-order component
630 | this.items.set(element, domNode);
631 | }
632 | })
633 | //loop over the map and remove old elements
634 | for (let [key, value] of this.items) {
635 | if (!this.store.has(key)) {
636 | this.items.delete(key);
637 | }
638 | }
639 | 
640 | //note although we create an array from
instances of our list item components, recall that
641 | //when we instantiated them above, each will have made a call to render, so it will have access
642 | //to its predefined DOM node. In our rendering logic, when we see this, we return the
643 | //DOM node directly, as opposed to trying to create a DOM node from our vDOM.
644 | //This is an important subtlety because if we were to do the latter, we (i.e. a Poseidon component) would not have
645 | //a reference to the DOM node locally, and thus would not be able to update any changes (on the web page)
646 | //reflected in its state (and a goal of Poseidon is to have self-managing components,
647 | //so a component should be able to display changes to its atomic data directly)
648 | this.nodes = Array.from(this.items.values())
649 | this.render(this.nodes);
650 | }
651 | 
652 | get type() {
653 | return this._atomClass;
654 | }
655 | 
656 | get size() {
657 | return this.items.size;
658 | }
659 | 
660 | create(data) {
661 | //default implementation is to return a <div> of all of the individual nodes; should be overridden if custom rendering
662 | //needs to be specified
663 | return html`<div>
664 | ${this.nodes}
665 | </div>`
666 | }
667 | 
668 | }
669 | 
670 | function ListOf(itemOf) {
671 | return class extends List {
672 | constructor(...args) {
673 | super(itemOf, ...args);
674 | }
675 | };
676 | }
677 | 
678 | //middle man between the database and the UI. Used to store collections and interface with the UI
679 | //similar to Store in Torus and Collections in Backbone
680 | class CollectionStore extends Listening {
681 | constructor(data, atomClass) {
682 | super();
683 | this._atomClass = atomClass;
684 | this.setStore(data);
685 | }
686 | 
687 | //will typically have a fetch and save method to cache data locally from the database to load the UI and save upon rewrites
688 | //setStore provides a flexible way to initialize a store with data (either via the constructor or e.g. an internal fetch method)
689 | setStore(data) {
690 | //4 possible configurations for initializing a store with data
691 | //1. Pass in plain objects with an Atom class
692 | //2. Pass in initialized atoms as an array with no type (we're responsible for inferring it)
693 | //3. 1 but via CollectionStoreOf
694 | //4.
2 but via CollectionStoreOf 695 | if (data !== undefined && data !== null && data.length > 0) { 696 | //assume all data is the same type if no atom class is provided (meaning we can infer it directly, since just list of atoms) 697 | if (this._atomClass === undefined) { 698 | this.data = new Set(data); 699 | //use the first element from the provided list as a heuristic for the type of atomic data of the data source 700 | this._atomClass = data[0].type; 701 | } else { 702 | if (data[0] instanceof Atom) { 703 | this.data = new Set(data); 704 | } else { 705 | this.data = new Set(data.map(el => new this._atomClass(el))); 706 | } 707 | } 708 | } else { 709 | this.data = new Set(); 710 | } 711 | //should emit an event for any handlers to act on 712 | this.fire(); 713 | } 714 | 715 | summarize() { 716 | return JSON.stringify(this.data); 717 | } 718 | 719 | add(newData) { 720 | if (newData instanceof Atom) { 721 | this.data.add(newData); 722 | if (this._atomClass === undefined) { 723 | this._atomClass = newData.type; 724 | } 725 | } else { 726 | if (!this._atomClass) throw "Error, adding a non-atom object without a defined atom class!" 727 | this.data.add(new this._atomClass(newData)); 728 | } 729 | //trigger any event handlers that are subscribed to the store for an update 730 | this.fire(); 731 | } 732 | 733 | has(value) { 734 | return this.data.has(value); 735 | } 736 | 737 | remove(oldData) { 738 | //remove atom from the store 739 | this.data.delete(oldData); 740 | //call atom's remove to remove all subscribed event handlers 741 | oldData.remove(); 742 | //trigger any event handlers that are subscribed to the store for an update e.g. 
a re-render if the store was bound
743 | //to a component
744 | this.fire();
745 | 
746 | }
747 | 
748 | //return JSON serialized data sorted by comparator
749 | serialize() {
750 | //creates an array with spread syntax, then sorts
751 | //not cross-compatible with some older browsers like IE11
752 | const sorted = [...this.data];
753 | sorted.sort((a, b) => {
754 | return a.comparator - b.comparator;
755 | });
756 | return JSON.stringify(sorted);
757 | }
758 | 
759 | get size() {
760 | return this.data.size;
761 | }
762 | 
763 | get atomClass() {
764 | return this._atomClass;
765 | }
766 | 
767 | //define a custom iterator interface so we can loop over stores directly
768 | //take advantage of the iterator values() returns, since data is a Javascript Set
769 | [Symbol.iterator]() {
770 | return this.data.values();
771 | }
772 | }
773 | 
774 | //Higher-order component pattern like in Torus for defining a CollectionStore of a specific record
775 | function CollectionStoreOf(classOf) {
776 | return class extends CollectionStore {
777 | constructor(data) {
778 | super(data, classOf);
779 | }
780 | };
781 | }
782 | 
783 | //helper method to convert passed-in paths into executable regex values to match against incoming routes
784 | const getRegexFromRouteString = (route) => {
785 | let match;
786 | let paramNames = []
787 | //construct a new regex by replacing param names as defined in the route e.g. /:user
788 | //with corresponding regex bits to match any possible values
789 | route = route.replace(/[:*](\w+)/g, (full, paramName, _) => {
790 | paramNames.push(paramName);
791 | //replace any param name with a regex to match any value (since any value can be passed in as a parameter e.g. any user!)
792 | //matches any character that is not a /
793 | return '([^\/]+)'
794 | });
795 | //this may be the end of the route, or there may be more to match, so add a regex to capture this
796 | route += '(?:\/|$)'
797 | return [new RegExp(route), paramNames];
798 | }
799 | 
800 | 
801 | //NOTE: this is a client-side router. This means that when the URL changes, it looks at the client-side
802 | //Javascript code to determine what to render. This means that if you're using any kind of web server
803 | //and serving static content from there, the web server must allow either ALL routes or the routes you'd like to
804 | //define on the client side. If you don't do this, nothing will be served once you navigate to a route,
805 | //even if you've specified what to render through Poseidon.
806 | class Router {
807 | //client-side, match-based router (i.e. builds a routing table)
808 | //the constructor takes the number of routes that will be registered, so the router knows when registration is complete
809 | //when passing routes, make sure to pass more general routes later, since Poseidon will match them
810 | //in that order
811 | constructor(numRoutes = 0) {
812 | this.routes = new Map();
813 | //set the pathname of the current route
814 | this.currentPath = window.location.pathname;
815 | this.options = {
816 | context: window,
817 | startListening: true
818 | }
819 | this.numRoutes = numRoutes;
820 | this.numRegistered = 0;
821 | this.matchHelper = () => {
822 | return this.match(window.location.pathname);
823 | }
824 | //used to detect when the URL changes and execute a handler accordingly
825 | window.addEventListener('popstate', this.matchHelper);
826 | }
827 | 
828 | get currentRoute() {
829 | return this.currentPath;
830 | }
831 | 
832 | //gets the query parameters from a route
833 | getQueryParameters(queryParameters, routeParams) {
834 | const urlSearchParams = new URLSearchParams(queryParameters);
835 | const dictParams = Object.fromEntries(urlSearchParams.entries());
836 | 
Object.keys(dictParams).forEach(key => {
837 | routeParams[key] = dictParams[key];
838 | });
839 | }
840 | 
841 | //route-matching algorithm
842 | //listener method for when the URL or hash changes to map to the new appropriate view
843 | match(route) {
844 | //match the route against our dictionary of defined paths and their relevant attributes
845 | for (let [path, {pathRoute, handler, params}] of this.routes) {
846 | const match = pathRoute.exec(route);
847 | //each route will be associated with a handler
848 | //this handler will handle all of the rendering associated with a new change
849 | if (match !== null) {
850 | //slice off the first element, which is the full regex match, leaving just the captured groups,
851 | //then loop through the values and add each value under its associated parameter name
852 | const routeParams = match.slice(1).
853 | reduce((allParams, value, index) => {
854 | allParams[params[index]] = value;
855 | return allParams;
856 | }, {});
857 | //parse query parameters of the form ?varName=varVal
858 | this.currentPath = path;
859 | //check if we have any query parameters to parse
860 | if (window.location.search) {
861 | this.getQueryParameters(window.location.search, routeParams);
862 | }
863 | handler(route, routeParams);
864 | //we don't want to execute more than one handler once we've matched a route,
865 | //even if it could match multiple patterns, so we break
866 | break;
867 | }
868 | 
869 | }
870 | }
871 | 
872 | //navigate method provided for convenience in events like button actions
873 | navigate(path, {replace = false} = {}) {
874 | if (window.location.pathname != path) {
875 | if (replace) {
876 | this.options.context.history.replaceState(null, document.title, path);
877 | } else {
878 | //add an entry to the browser's session history stack (will set the location's hash)
879 | this.options.context.history.pushState(null, document.title, path);
880 | }
881 | this.match(path);
882 | }
883 | }
884 | 
885 | //used to map paths to handler functions which will get executed when navigated to
886 | on(...pageRoutes) {
887 | //increment the number of
routes registered
888 | this.numRegistered += 1;
889 | for (const {route, handler} of pageRoutes) {
890 | if (Array.isArray(route)) {
891 | for (const path of route) {
892 | const [regPath, params] = getRegexFromRouteString(path);
893 | this.routes.set(path, {pathRoute: regPath, handler: handler, params: params});
894 | }
895 | } else {
896 | const [regPath, params] = getRegexFromRouteString(route);
897 | this.routes.set(route, {pathRoute: regPath, handler: handler, params: params})
898 | }
899 | }
900 | //route the current url once we've registered all the handlers
901 | if (this.numRegistered === this.numRoutes) {
902 | this.match(window.location.pathname);
903 | }
904 | }
905 | }
906 | 
907 | //jsx-like parser written in Javascript for Poseidon's vdom
908 | 
909 | //Reader class to abstract lexing and scanning of a vdom template string
910 | class Reader {
911 | constructor(string, specialCharacters) {
912 | //store the raw string to scan; any backslashes must already be escaped by the caller to be rendered correctly
913 | this.string = string;
914 | this.index = 0;
915 | this.length = string.length;
916 | //set of special characters to stop at when getNextWord is called
917 | this.specialCharacters = new Set(specialCharacters);
918 | }
919 | 
920 | peek() {
921 | if (this.index < this.length - 1) {
922 | return this.string[this.index + 1];
923 | }
924 | return null;
925 | }
926 | 
927 | //gets the next word: keeps moving forward until it encounters one of the special tags or a closing '/>'
928 | //takes a positional parameter that by default will only return values inside of quotes, as opposed to
929 | //the entire string with quotes.
Can pass true to get the entire string with quotes to override this
930 | //behavior
931 | getNextWord(includeQuotes = false) {
932 | var currIndex = this.index;
933 | var finalIndex = currIndex;
934 | var quoteCount = 0;
935 | //keep looping while we don't encounter a special character, or if we're inside a quote
936 | while ((this.index < this.length) && (!this.specialCharacters.has(this.currentChar) || (!includeQuotes && quoteCount === 1))) {
937 | //if we have quotes, skip them
938 | //TODO: add more robust checking that we have the same type of quote
939 | if (!includeQuotes && (this.currentChar === '"' || this.currentChar === "'")) {
940 | //adjust the starting point of the returned word if we encounter an opening quote
941 | if (quoteCount === 0) {
942 | quoteCount += 1;
943 | currIndex = this.index + 1;
944 | } else if (quoteCount === 1) {
945 | finalIndex = this.index - 1;
946 | quoteCount += 1;
947 | }
948 | } else if (this.currentChar === '/') {
949 | //handle the special case where the next word might be adjacent to a /> tag, in which case return the word before
950 | //this tag
951 | //otherwise this is just a normal '/' character
952 | if (this.peek() === '>') break
953 | } else {
954 | finalIndex = this.index;
955 | }
956 | this.consume();
957 | }
958 | if (quoteCount == 1) {
959 | throw 'Error parsing quotes as values!';
960 | }
961 | 
962 | //skip any trailing spaces for future reads
963 | this.skipSpaces();
964 | return this.string.substring(currIndex, finalIndex + 1);
965 | }
966 | 
967 | get currentChar() {
968 | return this.string[this.index];
969 | }
970 | 
971 | //skip all white spaces and new line characters
972 | skipSpaces() {
973 | while (this.currentChar === " " || this.currentChar === '\n') {
974 | this.consume();
975 | }
976 | }
977 | 
978 | consume() {
979 | return this.string[this.index++];
980 | }
981 | 
982 | //combination of consume and skipping white space, since this pattern crops up frequently
983 | skipToNextChar() {
984 | this.consume();
985 | this.skipSpaces();
986 | }
987 | 
988 | //helper
method to keep moving the pointer until the current char is the provided one
989 | getUntilChar(char) {
990 | const currIndex = this.index;
991 | var finalIndex = currIndex;
992 | while (this.currentChar != char && this.index < this.length) {
993 | this.consume();
994 | finalIndex = this.index;
995 | }
996 | return this.string.substring(currIndex, finalIndex);
997 | }
998 | //adapted helper method of the above to keep moving the pointer until the current word is the provided one
999 | getUntilWord(word) {
1000 | var found = false;
1001 | //edge case where there are no spaces between '<!--' and '-->'
1002 | if (this.currentChar === '>') {
1003 | found = true;
1004 | }
1005 | while (!found && this.index < this.length) {
1006 | this.getUntilChar(word[0]);
1007 | //note getUntilChar does not consume the character we pass in, so we start comparing each character of the word
1008 | //at index 0
1009 | for (let i = 1; i < word.length; i++) {
1010 | this.consume();
1011 | if (this.currentChar === word[i]) {
1012 | found = true;
1013 | } else {
1014 | //exit the for loop and go back to the while loop
1015 | found = false;
1016 | break
1017 | }
1018 | }
1019 | }
1020 | this.skipToNextChar();
1021 | }
1022 | 
1023 | //keep moving the pointer forward until AFTER we encounter a char (i.e. the pointer then points to the character after the provided one)
1024 | skipPastChar(char) {
1025 | var text = this.getUntilChar(char);
1026 | text += this.consume();
1027 | return text;
1028 | }
1029 | }
1030 | 
1031 | 
1032 | //recursively loop, checking children
1033 | const parseChildren = (closingTag, reader, values) => {
1034 | try {
1035 | let children = [];
1036 | //check for the scenario where we have an empty HTML node with no children
1037 | if (foundClosingTag(closingTag, reader)) {
1038 | return children;
1039 | }
1040 | var nextChild = parseTag(reader, values);
1041 | while (nextChild !== CLOSED_TAG) {
1042 | //only append the child if it's not null or undefined
1043 | if (nextChild) {
1044 | //check if this is the result of
returning an array (e.g. if a map operation is called), in which case we set children
1045 | //to the result; otherwise we'd introduce nesting, which would cause issues when trying to render
1046 | if (Array.isArray(nextChild)) children = nextChild
1047 | else children.push(nextChild);
1048 | }
1049 | if (foundClosingTag(closingTag, reader)) break;
1050 | nextChild = parseTag(reader, values);
1051 | }
1052 | return children;
1053 | } catch (e) {
1054 | throw e;
1055 | }
1056 | }
1057 | 
1058 | 
1059 | //helper method to check if we've encountered the closing tag of a node
1060 | //returns true if we have and false if we have not encountered the closing tag
1061 | const foundClosingTag = (closingTag, reader) => {
1062 | if (reader.currentChar === '<' && reader.peek() === '/') {
1063 | //if we encounter a closing tag i.e. '</div>', parse its name and compare it against the expected closing tag
1064 | reader.skipPastChar('/');
1065 | const nextTag = reader.getNextWord();
1066 | reader.skipPastChar('>');
1067 | if (nextTag !== closingTag) throw 'Error parsing the body of an HTML node!'
1068 | return true;
1069 | }
1070 | return false
1071 | }
1072 | 
1073 | //method which parses JS expressions in our template literal vdom string
1074 | //takes a reader, the list of values from the template string, and an optional attribute flag that indicates whether this expression
1075 | //should return a node (i.e.
call parseTag) or return a value associated with some key (e.g. an attribute)
1076 | const parseJSExpr = (reader, values, attribute) => {
1077 | //return the Javascript expression
1078 | //TODO: find a cleaner way of doing this
1079 | var val = values.shift();
1080 | //if the value is null we don't want to render anything
1081 | if (val) {
1082 | //if this is a JSX expression associated with some key, return the value obtained directly instead of parsing it as an HTML node
1083 | if (attribute) {
1084 | reader.skipSpaces();
1085 | //if the val is either a function or an object which was generated
1086 | //by a nested vdom template literal, we return it directly
1087 | //otherwise, we cast any other non-string primitives to a string to prevent unnecessary computations
1088 | if (typeof val === 'function' || typeof val === 'object') return val;
1089 | else if (typeof val !== 'string') val = String(val);
1090 | return val;
1091 | }
1092 | //Not DRY, but the alternative is some hard-to-understand gymnastics
1093 | if (typeof val === 'object' || typeof val === 'function') {
1094 | reader.skipSpaces();
1095 | //if an anonymous function is passed in as a body, execute it
1096 | if (typeof val === 'function') {
1097 | return val();
1098 | } else {
1099 | return val;
1100 | }
1101 | } else if (typeof val !== 'string') {
1102 | val = String(val)
1103 | }
1104 | //To prevent executing any HTML from Javascript variables, which would expose
1105 | //a risk of cross-site scripting (XSS) attacks, if there's any HTML content in our string, we don't parse it into HTML nodes
1106 | //but return it as text instead.
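//For instance (a hypothetical illustration, not code from the library): if a component returns
//html`<p>${userInput}</p>` and userInput happens to contain markup such as '<script>…</script>',
//the branch below returns the string as a plain text value, so the markup is rendered as
//literal text rather than being parsed into live HTML nodes.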
1107 | //Note we only need to check if a string starts with < because if the string starts with any other text,
1108 | //then `parseTag` will assume it's a text token and consume characters until it finds an opening <, at which point it stops.
1109 | //This means a string like `test<script>...</script>` would not cause any issues, because the recursive call
1110 | //would stop as soon as it hits the opening < of the script tag, effectively ignoring any other HTML, and thus malicious content
1111 | if (val.startsWith("<")) {
1112 | console.log('Warning, attempting to inject HTML into the page, this is a security risk and should be avoided. ', val);
1113 | return val;
1114 | }
1115 | 
1116 | //notice this set-up nicely allows for nested vdom expressions (e.g. we can return another vdom template literal based on some
1117 | //Javascript expression within another vdom)
1118 | const readerNewExpression = new Reader(val, reader.specialCharacters);
1119 | return parseTag(readerNewExpression, values);
1120 | } else {
1121 | return null;
1122 | }
1123 | }
1124 | 
1125 | 
1126 | //parse a complete HTML node tag
1127 | const parseTag = (reader, values) => {
1128 | //if the current char is not a < tag, then either we've finished parsing valid tags or this is a text node
1129 | if (reader.currentChar !== '<') {
1130 | const word = reader.getUntilChar('<');
1131 | //we've reached the end of parsing
1132 | if (!word) return null;
1133 | //otherwise, we've found a text node!
1134 | return {tag: "TEXT_ELEMENT", nodeValue: word};
1135 | } else if (reader.peek() === '/') {
1136 | //just encountered a '</' closing tag, so signal to the caller that this node's children are done
1137 | return CLOSED_TAG;
1138 | }
1139 | //skip the opening '<'
1140 | reader.skipToNextChar();
1141 | //get the name of the tag
1142 | const name = reader.getNextWord();
1143 | //check if this tag is a placeholder for an interpolated JS expression i.e. '<__vdomPlaceholder...>'
1144 | if (name === VDOM_PLACEHOLDER) {
1145 | //skip the closing '>' of the placeholder
1146 | reader.skipToNextChar();
1147 | return parseJSExpr(reader, values);
1148 | }
1149 | //check if this is an HTML comment i.e. an '<!--' opening tag
1150 | if (name.startsWith('!--')) {
1151 | //skip everything up to and past the closing '-->' tag
1152 | reader.getUntilWord('-->');
1153 | return parseTag(reader, values);
1154 | }
1155 | const node = {
1156 | tag: name,
1157 | children: [],
1158 | attributes: {},
1159 | events: {}
1160 | }
1161 | //boolean variable to handle special self-closing HTML nodes like <br/>
1162 | var specialChar = false;
1163 | //Match key-value pairs in the initial node definition (i.e.
from first < to first > tag; recall the closing node tag is '</name>')
1164 | 
1165 | while (reader.currentChar !== '>' && reader.index < reader.length) {
1166 | const key = reader.getNextWord();
1167 | //handle special self-closing tags like <img/>
and <br/>
1168 | if (key === '/' && reader.peek() === '>') {
1169 | reader.consume();
1170 | specialChar = true;
1171 | break;
1172 | }
1173 | //a key on its own is still valid, so check if we need to map it to a specific value
1174 | if (reader.currentChar !== '=') {
1175 | node.attributes[key] = true;
1176 | continue;
1177 | }
1178 | //skip the equals sign
1179 | reader.skipToNextChar();
1180 | //get the value associated with this key
1181 | let value = reader.getNextWord();
1182 | //getNextWord stops at some special characters, one of which is <, which is the start of the VDOM_JSX_NODE placeholder
1183 | //so check if this is a placeholder before parsing the JS expression to get the value associated with this key
1184 | if (value === '<') {
1185 | //skip the < tag and check if this is a valid placeholder
1186 | reader.consume();
1187 | if (reader.getNextWord() === VDOM_PLACEHOLDER) value = parseJSExpr(reader, values, true);
1188 | else throw "Error trying to parse the key-value pairs of a node, unexpected < found!"
1189 | //skip the closing tag
1190 | reader.skipToNextChar();
1191 | } else {
1192 | //replace any template literals inside the string value, if they exist, with their corresponding values
1193 | while (value.includes(VDOM_JSX_NODE)) {
1194 | value = value.replace(VDOM_JSX_NODE, parseJSExpr(reader, values, true));
1195 | }
1196 | }
1197 | //if the key starts with "on", this is an event, so we should save it accordingly
1198 | if (key.startsWith("on")) {
1199 | //note keys of events in JS don't include "on", so we drop this part of the string when assigning it
1200 | node.events[key.substring(2)] = value;
1201 | } else {
1202 | //otherwise, this is an attribute, so add it there
1203 | node.attributes[key] = value;
1204 | }
1205 | }
1206 | //skip the closing > of the node definition and any spaces/new lines
1207 | reader.consume();
1208 | //match the actual body of the node if this is not a self-closing HTML tag like <img/>
1209 | if (!specialChar) node.children = parseChildren(name, reader, values);
1210 | 
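//As a concrete (hypothetical) illustration of what this parser produces: a template like
//  html`<button class="primary" onclick=${save}>Save</button>`
//is parsed into roughly the following JSON-formatted vdom node (assuming `save` is a function
//defined by the caller; this sketch is not code from the library itself):
//  {
//    tag: 'button',
//    attributes: {class: 'primary'},
//    events: {click: save},
//    children: [{tag: 'TEXT_ELEMENT', nodeValue: 'Save'}]
//  }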
//return the JSON-formatted vdom node
1211 | return node;
1212 | }
1213 | 
1214 | //Regular expression to match all expressions (i.e. JS code) inside a dom template string
1215 | //This lazily matches (lazily meaning as few characters as possible) any '${}' expressions within a template string
1216 | const VDOM_EXPRESSIONS = new RegExp('\${.*?}', 'g');
1217 | //use the current Date or time to ensure we have a unique placeholder in our template strings, which will replace
1218 | //all Javascript expressions (i.e. ${}) that need to be executed and which we refer to during the parsing phase
1219 | const VDOM_PLACEHOLDER = `__vdomPlaceholder${Date.now()}`;
1220 | //we wrap the placeholder in opening and closing tags to avoid checking extra edge cases in our parser which would introduce
1221 | //extra, unnecessary computations
1222 | const VDOM_JSX_NODE = '<' + VDOM_PLACEHOLDER + ">"
1223 | //constant used when parsing children nodes to check whether we've finished parsing all child nodes and have found the closing parent
1224 | const CLOSED_TAG = '</'
1225 | 
1226 | //the `html` function is the jsx-like tagged template function used to build a vdom node
1227 | //from a template literal, e.g. html`<div>${data}</div>`
1228 | //`templates` holds the literal's string fragments and `values` holds the evaluated results
1229 | //of the interpolated ${} Javascript expressions
1230 | //the string fragments are joined with the unique placeholder node defined above and the
1231 | //resulting string is then parsed into a JSON-formatted vdom tree, with the values substituted
1232 | //back in as the placeholders are encountered during parsing
1233 | //if the markup is malformed, the error is logged and null is returned so that rendering can
1234 | //fail gracefully
1235 | 
1236 | 
1237 | const html = (templates, ...values) => {
1238 | //create the string and interpolate all of the ${} expressions with our constructed placeholder node
1239 | const vdomString = templates.join(VDOM_JSX_NODE);
1240 | //HTML parsing
1241 | const reader = new Reader(vdomString, [' ', '=', '<', '>']);
1242 | try {
1243 | reader.skipSpaces();
1244 | const node = parseTag(reader, values);
1245 | return node;
1246 | } catch (e) {
1247 | console.error(e);
1248 | return null;
1249 | }
1250 | }
1251 | 
1252 | //parses the body of the CSS and returns a dictionary of type {tag: `string`, rules: []} with arbitrary nesting of other
1253 | //css objects or {key: value} pairs representing a CSS rule
1254 | const parseCSStringToDict = (reader, dict, selector, values) => {
1255 | dict["tag"] = selector;
1256 | dict["rules"] = [];
1257 | while (reader.index < reader.length) {
1258 | var word = reader.getNextWord();
1259 | //to prevent an infinite loop and fail gracefully, check if the word is a special character
1260 | //which we don't check later on 1261 | if (word === ';') { 1262 | console.log(reader.string.substring(reader.index)); 1263 | throw 'Error, unexpected end of expression found!'; 1264 | } 1265 | //check if this is a JS expression 1266 | if (word === '{') { 1267 | //consume the { token 1268 | reader.consume(); 1269 | const placeholder = reader.getNextWord(); 1270 | //found a JS expression which is a function call, likely to be a call to another css template literal 1271 | if (placeholder === CSS_PLACEHOLDER) { 1272 | const res = values.shift(); 1273 | //if the value returned from a function call is null, ignore it 1274 | if (res) { 1275 | //since this is a nested call to the css function, we need to "unwrap" it because a call from css 1276 | //wraps the outer JSON in an object corresponding to the current component we are in. 1277 | //In this case we are already nesting styles in 1278 | //the component wrapper so we can get rid of it 1279 | res["rules"].forEach((objectStyles, _) => { 1280 | //need to append the current selector to any nested rules 1281 | //note we remove ` from the result of the nested css template function 1282 | //call to prevent duplicates in our selector 1283 | objectStyles["tag"] = selector + objectStyles["tag"].replace("", ""); 1284 | //add the styles 1285 | dict["rules"].push(objectStyles); 1286 | }); 1287 | } 1288 | reader.skipToNextChar(); 1289 | continue; 1290 | } else { 1291 | throw 'Invalid curly brace found in css template literal!';
1292 | } 1293 | } 1294 | 1295 | //we don't directly use the reader's currentChar variable since there are some edge 1296 | //cases where we need to do some lookahead operations and will need to adjust it on the fly 1297 | //to execute the correct logic 1298 | var char = reader.currentChar; 1299 | 1300 | //may be a key-value pair or a selector, need to look ahead 1301 | if (reader.currentChar === ':') { 1302 | reader.skipToNextChar(); 1303 | //some css selectors have `:` in them e.g. :hover or ::before, so we need to check if this is a selector 1304 | //or a key-value pair 1305 | //first check if we have the complete word, or if this is a special :: selector case 1306 | if (reader.currentChar === ':') { 1307 | word += "::"; //directly add the ::s, the first one that we skipped and the current one 1308 | //consume the second : 1309 | reader.consume(); 1310 | word += reader.getNextWord(); 1311 | reader.skipSpaces(); 1312 | //this must be a css selector and not a key-value pair so reset char 1313 | char = reader.currentChar; 1314 | } 1315 | if (char !== '{') { 1316 | //make sure to get the result with quotes in case any values rely on them 1317 | //to correctly render CSS e.g. the content property 1318 | var value = reader.getNextWord(true); 1319 | //check if we have a JS expression as the value for a key 1320 | if (value === '{') { 1321 | //skip the { 1322 | reader.consume(); 1323 | const constant = reader.getNextWord(); 1324 | if (constant !== CSS_PLACEHOLDER) throw 'Invalid JS expression while trying to parse the value of a key!'; 1325 | value = values.shift(); 1326 | //skip past the } of the expression 1327 | reader.skipToNextChar(); 1328 | } 1329 | //check if this is a css selector with a specific colon like :before, in which case the reader would be 1330 | //pointing to a { 1331 | if (reader.currentChar === '{') { 1332 | word += ":" + value; 1333 | char = reader.currentChar; //adjust char to a { so we correctly parse it as a selector in the block below 1334 | } else if (reader.currentChar === ':') { 1335 | //this is a media rule or a css selector with two colons e.g. @media and (min-width: 800px) and (max-width: 800px) 1336 | reader.consume(); 1337 | const next = reader.getNextWord(); 1338 | //trim for consistency 1339 | word += ":" + value + ":" + next.trimStart(); 1340 | char = reader.currentChar; 1341 | } else { 1342 | //otherwise, we've encountered a key-value pair 1343 | dict.rules.push({key: word, value: value}); 1344 | //consume the ; at the end of a rule and skip any spaces 1345 | reader.skipToNextChar(); 1346 | } 1347 | } 1348 | } 1349 | //this is a selector with some associated css rules i.e. {key1: rule1, ...} 1350 | if (char === '{') { 1351 | reader.skipToNextChar(); 1352 | //nested tag, recursive call here 1353 | const nestedTagDict = {}; 1354 | dict.rules.push(nestedTagDict); 1355 | //TODO: standardize spacing, necessary?
1356 | var newSelector = selector + " " + word; 1357 | //if the tag or next selector is a keyframes or media rule, we don't want to append the previous selector 1358 | //since these are special tags which should be handled differently 1359 | if (word.includes("@keyframes") || word.includes("@media") || 1360 | dict["tag"].includes("@keyframes") || dict["tag"].includes("@media")) { 1361 | newSelector = word; 1362 | } 1363 | //note for the new selector, we append the current selector (i.e. child) to the parent 1364 | //to capture all descendants of the parent that correspond to this specific child. 1365 | //This prevents us from having to do this logic ad-hoc when we parse our dict into 1366 | //our eventual stylesheet 1367 | parseCSStringToDict(reader, nestedTagDict, newSelector, values); 1368 | //skip closing } and any spaces 1369 | reader.skipToNextChar(); 1370 | } 1371 | //check if we've reached the end of a block-scoped {} of key-value pairs 1372 | if (reader.currentChar === '}') { 1373 | //note we don't consume the '}' since we delegate the responsibility to the caller to do that 1374 | //allows us to more reliably manage our position / prevents inconsistencies with multiple nested tags on the same level 1375 | break; 1376 | } 1377 | } 1378 | return dict; 1379 | } 1380 | 1381 | const css = (templates, ...values) => { 1382 | //create string and interpolate all of the ${} expressions with our constructed placeholder node 1383 | const cssString = templates.join(CSS_JSX_NODE); 1384 | //remove any comments 1385 | const cssCommentsRegex = new RegExp('(\\/\\*[\\s\\S]*?\\*\\/)', 'gi'); 1386 | const cssWithoutComments = cssString.replace(cssCommentsRegex, ''); 1387 | const reader = new Reader(cssWithoutComments, [';', '{', '}', ':']); 1388 | try { 1389 | reader.skipSpaces(); 1390 | const dict = {}; 1391 | parseCSStringToDict(reader, dict, "", values); 1392 | return dict; 1393 | } catch (e) { 1394 | console.error(e); 1395 | return null; 1396 | } 1397 | }
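The `css` tag above (like its vdom counterpart) rests on one trick: a tagged template literal receives the static string pieces in `templates` and the evaluated `${}` expressions in `values`; joining the pieces with a unique placeholder yields a single string the parser can scan, pulling the next entry off `values` each time it hits the marker. A minimal, self-contained sketch of that technique follows — the `tag` function and `PLACEHOLDER` name here are illustrative, not part of the library's API:

```javascript
// Unique marker, unlikely to collide with user content (same idea as
// VDOM_PLACEHOLDER / CSS_PLACEHOLDER in the file above).
const PLACEHOLDER = `__placeholder${Date.now()}`;

const tag = (templates, ...values) => {
    // one string with every ${} expression replaced by the unique marker
    const joined = templates.join(PLACEHOLDER);
    // a real parser (like parseCSStringToDict) would shift() a value each
    // time it encounters the marker; here we simply splice them back in order
    return joined
        .split(PLACEHOLDER)
        .reduce((acc, piece) => acc + values.shift() + piece);
};

const color = 'red';
const result = tag`div { color: ${color}; }`;
// result === 'div { color: red; }'
```

The point of the round-trip is that the parser only ever sees plain text plus a recognizable sentinel, so it never has to evaluate JavaScript itself; the already-evaluated `values` are consumed positionally, which is why the library's parsers call `values.shift()` exactly once per placeholder hit.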
-------------------------------------------------------------------------------- /static/search.xml: -------------------------------------------------------------------------------- 1 | 2 |