├── .gitignore
├── LICENSE
├── NOTES.md
├── README.md
├── examples
    └── full-aws-test.clj
├── project.clj
└── src
    └── crux
        └── aws.clj

/.gitignore:
--------------------------------------------------------------------------------
/target
/classes
/checkouts
profiles.clj
pom.xml
pom.xml.asc
*.jar
*.class
/.lein-*
/.nrepl-port
.hgignore
.hg/
.idea/
*.iml
scripts/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright © 2019 Casey Marshall.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

--------------------------------------------------------------------------------
/NOTES.md:
--------------------------------------------------------------------------------
## AWS Services

### S3

* High durability.
* Low cost (orders-of-magnitude less than any other storage option on AWS).
* Keyspace partitioning by prefix (e.g. pseudo-directories).
* Sorted keyspace retrieval.

### DynamoDB

* Conditional puts and updates -- allows for compare-and-set.
* Ordered traversal via indexes.

## Design Ideas

### Naive Approaches

* Implement crux's KV store on S3, and use a Moberg TxLog atop that. Use SQS with
  S3 notifications to pause a polling event consumer. Based on the "standalone"
  topology in crux.

* Implement an atomic counter on DynamoDB, and use this to push events into S3,
  indexed by event ID generated by this counter. Again use SQS to pause a polling
  event consumer.

### Trees in S3 + DynamoDB tx-log

* Store transacted docs in DynamoDB, with sortable, increasing RANGE keys.
* Once the tx-log grows large enough, flush the tx-log into a large B-tree
  stored as objects in S3. This can be a persistent data structure, so old
  segments just stay as they are.

#### Challenges

* Still need an atomic counter for tx-ids in DynamoDB (unless there is a different
  approach that can give us good ordering of docs, but I don't think there is).
* Would like a blocking notification when new txes are added to the log (right now
  it just polls DynamoDB).
  * Possibly use DynamoDB streams -> lambda -> SNS?

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# crux-aws

[crux](https://juxt.pro/crux/) atop S3 and DynamoDB.

Experimental.

## Quick Start:

For a full AWS experience:

```clojure
(def node (crux.api/start-node {:crux.node/topology crux.aws/topology
                                :crux.dynamodb/table-name "your-dynamodb-tx-table"
                                :crux.s3/bucket "your-s3-doc-bucket"
                                :crux.kv.hitchhiker-tree/konserve crux.kv.hitchhiker-tree.konserve.ddb-s3/ddb-s3-backend
                                :crux.kv.hitchhiker-tree.konserve.ddb-s3/bucket "your-s3-kv-bucket"
                                :crux.kv.hitchhiker-tree.konserve.ddb-s3/table "your-dynamodb-kv-table"
                                :crux.kv.hitchhiker-tree.konserve.ddb-s3/region "us-west-2"}))
```

**NOTE** the hitchhiker-tree KV store with the konserve-ddb-s3 backend is
currently broken; use a different KV store for now.

Subprojects you should look at as well:

* [crux-dynamodb](https://github.com/csm/crux-dynamodb) -- TxLog on DynamoDB.
* [crux-ddb-s3](https://github.com/csm/crux-ddb-s3) -- TxLog spread between DynamoDB and S3.
* [crux-hitchhiker-tree](https://github.com/csm/crux-kv-hitchhiker-tree) -- KV store on hitchhiker-tree.


## TODO

The KV store needs to handle concurrent writes better than it does.

The KV store should take updates from the remote store periodically (or on changes
to dynamodb) so it keeps in sync.

The transaction log could wait for changes triggered

--------------------------------------------------------------------------------
/examples/full-aws-test.clj:
--------------------------------------------------------------------------------
(in-ns 'crux.aws.repl)

(.setLevel (org.slf4j.LoggerFactory/getLogger "software") ch.qos.logback.classic.Level/INFO)
(.setLevel (org.slf4j.LoggerFactory/getLogger "io.netty") ch.qos.logback.classic.Level/INFO)
(.setLevel (org.slf4j.LoggerFactory/getLogger "crux") ch.qos.logback.classic.Level/INFO)
(require 'crux.aws
         '[crux.api :as crux]
         'crux.kv.hitchhiker-tree.konserve.ddb-s3)

(def node (crux/start-node {:crux.node/topology crux.aws/topology
                            :crux.dynamodb/table-name "csm-crux-test"
                            :crux.s3/bucket "csm-crux-test"
                            :crux.kv.hitchhiker-tree/konserve crux.kv.hitchhiker-tree.konserve.ddb-s3/ddb-s3-backend
                            :crux.kv.hitchhiker-tree.konserve.ddb-s3/bucket "csm-crux-kv-test"
                            :crux.kv.hitchhiker-tree.konserve.ddb-s3/table "csm-crux-kv-test"
                            :crux.kv.hitchhiker-tree.konserve.ddb-s3/region "us-west-2"}))

(def node2 (crux/start-node {:crux.node/topology crux.aws/topology-ddb-s3
                             :crux.ddb-s3/table "csm-crux-tx-test"
                             :crux.ddb-s3/bucket "csm-crux-tx-test"
                             :crux.s3/bucket "csm-crux-tx-test"
                             :crux.s3/prefix "doc/"}))
                             ; hitchhiker-tree konserve serialization currently broken
                             ;:crux.kv.hitchhiker-tree/konserve crux.kv.hitchhiker-tree.konserve.ddb-s3/ddb-s3-backend
                             ;:crux.kv.hitchhiker-tree.konserve.ddb-s3/bucket "csm-crux-kv-test"
                             ;:crux.kv.hitchhiker-tree.konserve.ddb-s3/table "csm-crux-kv-test"
                             ;:crux.kv.hitchhiker-tree.konserve.ddb-s3/region "us-west-2"}))

--------------------------------------------------------------------------------
/project.clj:
--------------------------------------------------------------------------------
(defproject crux-aws "0.1.0-SNAPSHOT"
  :description "Crux atop DynamoDB and S3. Experimental."
  :url "https://github.com/csm/crux-aws"
  :license {:name "The MIT License"
            :url "http://opensource.org/licenses/MIT"}
  :dependencies [[org.clojure/clojure "1.10.1"]
                 [org.clojure/core.cache "0.7.2"]
                 [org.clojure/core.memoize "0.7.2"]
                 [juxt/crux-core "20.06-1.9.1-beta"]
                 [juxt/crux-s3 "20.06-1.9.1-beta" :exclusions [software.amazon.awssdk/s3]]
                 [com.github.csm/crux-dynamodb "0.1.1"]
                 [com.github.csm/konserve-ddb-s3 "0.1.4"]
                 [com.github.csm/crux-hitchhiker-tree "0.1.1"]
                 [com.github.csm/crux-ddb-s3 "0.1.0"]
                 [ch.qos.logback/logback-classic "1.2.3"]
                 [ch.qos.logback/logback-core "1.2.3"]]
  :repl-options {:init-ns crux.aws.repl}
  :profiles {:repl {:source-paths ["examples"]}})

--------------------------------------------------------------------------------
/src/crux/aws.clj:
--------------------------------------------------------------------------------
(ns crux.aws
  (:require [crux.node :as n]))

(def topology
  ['crux.node/base-topology
   'crux.dynamodb/dynamodb-tx-log
   'crux.s3/s3-doc-store
   'crux.kv.hitchhiker-tree/kv])

(def topology-ddb-s3
  ['crux.node/base-topology
   'crux.ddb-s3/ddb-s3-tx-log
   'crux.s3/s3-doc-store])
   ;'crux.kv.hitchhiker-tree/kv])

--------------------------------------------------------------------------------
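
NOTES.md mentions that DynamoDB's conditional puts allow compare-and-set, and that an atomic counter is still needed for tx-ids. The retry loop behind such a counter might look roughly like the sketch below. It is not code from this repository and makes no AWS calls: a Clojure atom stands in for the DynamoDB table, and `conditional-put!` and `next-tx-id!` are hypothetical names chosen for illustration.

```clojure
;; Hypothetical sketch: an atom holding a map stands in for a DynamoDB table.
;; Nothing here is part of crux-aws or of any AWS SDK.

(defn conditional-put!
  "Store `new-value` under `k` only if the current value equals `expected`
  (nil means the item is absent). Returns true if the write was applied."
  [table k expected new-value]
  (let [[before _] (swap-vals! table
                               (fn [items]
                                 (if (= (get items k) expected)
                                   (assoc items k new-value)
                                   items)))]
    ;; `before` is the state the successful swap was computed from, so the
    ;; write happened exactly when it still held the expected value.
    (= (get before k) expected)))

(defn next-tx-id!
  "Claim the next tx-id, retrying whenever another writer wins the race."
  [table]
  (loop []
    (let [current  (get @table :tx-id)
          proposed (inc (or current 0))]
      (if (conditional-put! table :tx-id current proposed)
        proposed
        (recur)))))

(def table (atom {}))
(next-tx-id! table) ;; => 1
(next-tx-id! table) ;; => 2
```

Against a real table the conditional write would be a network round trip, so a contended counter like this could become the throughput bottleneck for the tx-log.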