├── .gitignore ├── .travis.yml ├── LICENSE ├── README.md ├── Setup.hs ├── System ├── TrailDB.hs └── TrailDB │ ├── Error.hs │ └── Internal.hs ├── bench-traildb └── Main.hs ├── cbits └── shim.c ├── examples └── tutorial_simple_traildb.hs ├── stack.yaml ├── traildb.cabal ├── traildb_logo_512.png └── wikipedia-benchmark └── Main.hs /.gitignore: -------------------------------------------------------------------------------- 1 | .stack-work/ 2 | *.swp 3 | *~ 4 | 5 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: haskell 2 | 3 | ghc: 4 | - 7.8 5 | 6 | before_install: 7 | - sudo apt-get -qq update 8 | - sudo apt-get install libjudy-dev libarchive-dev pkg-config libtool autoconf automake make git clang 9 | - git clone https://github.com/traildb/traildb && cd traildb && mkdir m4 && ./autogen.sh && CC=clang ./configure --prefix=/usr && make && sudo make install && cd .. 10 | 11 | install: 12 | - cabal install --only-dependencies 13 | 14 | script: 15 | - cabal configure && cabal build 16 | 17 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | Copyright (c) 2016 AdRoll Inc 3 | 4 | Permission is hereby granted, free of charge, to any person obtaining a copy of 5 | this software and associated documentation files (the "Software"), to deal in 6 | the Software without restriction, including without limitation the rights to 7 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 8 | of the Software, and to permit persons to whom the Software is furnished to do 9 | so, subject to the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be included in all 12 | copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 20 | SOFTWARE. 21 | 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![Hackage](https://img.shields.io/hackage/v/traildb.svg)](https://hackage.haskell.org/package/traildb) 2 | [![Travis CI](https://travis-ci.org/traildb/traildb-haskell.svg?branch=master)](https://travis-ci.org/traildb/traildb-haskell) 3 | 4 | TrailDB Haskell bindings 5 | ======================== 6 | 7 | ![TrailDB logo](traildb_logo_512.png?raw=true) 8 | 9 | These are the official Haskell bindings to [TrailDB](http://traildb.io/). Most 10 | of the API is covered (with the notable exception of filters). Check out 11 | `System.TrailDB` module for some examples and documentation. 12 | 13 | How to build 14 | ------------ 15 | 16 | These bindings can be installed using `cabal-install` or `stack` on Linux. At 17 | least GHC 7.8 is required. 18 | 19 | Fetch the code in some way. You can clone 20 | [https://github.com/traildb/traildb-haskell]. Then follow these 21 | instructions. 22 | 23 | You need at least `traildb` and `Judy` libraries installed to compile and use 24 | these bindings. `Judy` is a dependency of TrailDB itself so if you have TrailDB 25 | the C library installed properly, then most likely you don't need to do 26 | anything else regarding dependencies. 27 | 28 | ### cabal-install 29 | 30 | Cabal is usually in the package repositories of your distribution. 31 | 32 | $ apt-get install cabal-install # Debian/Mint/Ubuntu 33 | $ dnf install ghc cabal-install # Fedora 22 34 | $ pacman -S ghc cabal-install # Arch Linux 35 | $ pkg install hs-cabal-install # FreeBSD 36 | 37 | $ cabal install # Run this in the root of traildb-haskell 38 | 39 | # Test it out! 40 | 41 | $ cd examples 42 | $ ghc tutorial_simple_traildb.hs -o tutorial 43 | $ ./tutorial 44 | 45 | After this, bindings should be usable in other Haskell projects by requiring `traildb`. 46 | 47 | ### stack 48 | 49 | Stack is a new fancy Haskell build tool. Because it's new and fancy, it doesn't 50 | quite have the same level of presence in Linux package repositories than 51 | `cabal-install`. 52 | 53 | You can manually download `stack` from [http://www.haskellstack.org/] if it's 54 | not in your repositories. 55 | 56 | # Once you have `stack` in your PATH: 57 | 58 | $ stack setup # May be optional if you have GHC already installed and it can be used 59 | $ stack install 60 | 61 | # Test it out! 62 | 63 | $ cd examples 64 | $ stack ghc -- tutorial_simple_traildb.hs -o tutorial 65 | $ ./tutorial 66 | 67 | License 68 | ------- 69 | 70 | These bindings are licensed under the MIT license. 71 | 72 | How to contribute or report bugs 73 | -------------------------------- 74 | 75 | Use our [GitHub page](https://github.com/traildb/traildb-haskell/) to 76 | report issues or to open pull requests. 77 | 78 | Example program 79 | --------------- 80 | 81 | Check out `examples/tutorial_simple_traildb.hs` in this repository. 82 | 83 | -------------------------------------------------------------------------------- /Setup.hs: -------------------------------------------------------------------------------- 1 | import Distribution.Simple 2 | main = defaultMain 3 | -------------------------------------------------------------------------------- /System/TrailDB.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE GeneralizedNewtypeDeriving #-} 2 | {-# LANGUAGE MultiParamTypeClasses #-} 3 | {-# LANGUAGE TypeSynonymInstances #-} 4 | {-# LANGUAGE ScopedTypeVariables #-} 5 | {-# LANGUAGE DeriveDataTypeable #-} 6 | {-# LANGUAGE DeriveTraversable #-} 7 | {-# LANGUAGE OverloadedStrings #-} 8 | {-# LANGUAGE FlexibleInstances #-} 9 | {-# LANGUAGE FlexibleContexts #-} 10 | {-# LANGUAGE PatternSynonyms #-} 11 | {-# LANGUAGE EmptyDataDecls #-} 12 | {-# LANGUAGE DeriveFoldable #-} 13 | {-# LANGUAGE DeriveGeneric #-} 14 | {-# LANGUAGE DeriveFunctor #-} 15 | {-# LANGUAGE TypeFamilies #-} 16 | {-# LANGUAGE ViewPatterns #-} 17 | {-# LANGUAGE BangPatterns #-} 18 | {-# LANGUAGE LambdaCase #-} 19 | {-# LANGUAGE MultiWayIf #-} 20 | {-# LANGUAGE RankNTypes #-} 21 | 22 | -- | Haskell bindings to trailDB library. 23 | -- 24 | -- Minimal program that lists a TrailDB: 25 | -- 26 | -- @ 27 | -- import qualified Data.ByteString as B 28 | -- import System.TrailDB 29 | -- 30 | -- main :: IO () 31 | -- main = do 32 | -- tdb <- openTrailDB "wikipedia-history-small.tdb" 33 | -- forEachTrailID tdb $ \tid -> print =<< getTrailBytestring tdb tid 34 | -- @ 35 | -- 36 | -- Example program that reads a TrailDB using low-level (faster) cursor API: 37 | -- 38 | -- @ 39 | -- import qualified Data.ByteString as B 40 | -- import qualified Data.Vector.Unboxed as V 41 | -- import System.TrailDB 42 | -- 43 | -- main :: IO () 44 | -- main = do 45 | -- tdb <- openTrailDB "some-trail-db" 46 | -- number_of_trails <- getNumTrails tdb 47 | -- 48 | -- let arbitrarily_chosen_trail_id = 12345 \`mod\` number_of_trails 49 | -- 50 | -- cursor <- makeCursor tdb 51 | -- setCursor cursor arbitrarily_chosen_trail_id 52 | -- 53 | -- -- Read the first event in the arbitrary chosen trail 54 | -- crumb <- stepCursor cursor 55 | -- case crumb of 56 | -- Nothing -> putStrLn "Cannot find this particular trail." 57 | -- Just (timestamp, features) -> 58 | -- V.forM_ features $ \feature -> do 59 | -- field_name <- getFieldName tdb (feature^.field) 60 | -- putStr "Field: " 61 | -- B.putStr field_name 62 | -- putStr " contains value " 63 | -- value <- getValue tdb feature 64 | -- B.putStrLn value 65 | -- 66 | -- @ 67 | -- 68 | -- Another example program that writes a TrailDB: 69 | -- 70 | -- @ 71 | -- {-\# LANGUAGE OverloadedStrings \#-} 72 | -- 73 | -- import System.TrailDB 74 | -- 75 | -- main :: IO () 76 | -- main = do 77 | -- cons <- newTrailDBCons "some-trail-db" (["currency", "order_amount", "item"] :: [String]) 78 | -- addTrail cons ("aaaaaaaaaaaaaa00") -- UUIDs are 16 bytes in length 79 | -- 1457049455 -- This is timestamp 80 | -- [\"USD\", "10.14", "Bacon & Cheese" :: String] 81 | -- addTrail cons ("aaaaaaaaaaaaaa00") -- Same UUID as above, same customer ordered more 82 | -- 1457051221 83 | -- [\"USD\", "8.90", "Avocado Sandwich" :: String] 84 | -- addTrail cons ("aaaaaaaaaaaaaa02") 85 | -- 1457031239 86 | -- [\"JPY\", "2900", "Sun Lotion" :: String] 87 | -- closeTrailDBCons cons 88 | -- 89 | -- -- TrailDB has been written to 'some-trail-db' 90 | -- @ 91 | -- 92 | 93 | module System.TrailDB 94 | ( 95 | -- * Constructing new TrailDBs 96 | newTrailDBCons 97 | , closeTrailDBCons 98 | , withTrailDBCons 99 | , addTrail 100 | , appendTdbToTdbCons 101 | , finalizeTrailDBCons 102 | -- ** Supplying values to construction 103 | , ToTdbRow(..) 104 | , ToTdbRowField(..) 105 | , TdbConsRow(..) 106 | , TdbShowable(..) 107 | , pattern TShow 108 | -- * Opening existing TrailDBs 109 | , openTrailDB 110 | , closeTrailDB 111 | , getTdbVersion 112 | , dontneedTrailDB 113 | , willneedTrailDB 114 | , withTrailDB 115 | -- * Accessing TrailDBs 116 | -- ** High-level, slow, access 117 | , FromTrail(..) 118 | , getTrail 119 | , getTrailBytestring 120 | , decodeCrumbBytestring 121 | -- ** Lowerish-level, fast, access 122 | , makeCursor 123 | , stepCursor 124 | , stepCursorList 125 | , setCursor 126 | -- ** Iterating over TrailDB 127 | , forEachTrailID 128 | , forEachTrailIDUUID 129 | , traverseEachTrailID 130 | , traverseEachTrailIDUUID 131 | , forEachEvent 132 | , forEachEventUUID 133 | , traverseEachEvent 134 | , traverseEachEventUUID 135 | , foldEachEvent 136 | , foldEachEventUUID 137 | , foldTrailDB 138 | , foldTrailDBUUID 139 | -- ** Basic querying 140 | , getNumTrails 141 | , getNumEvents 142 | , getNumFields 143 | , getMinTimestamp 144 | , getMaxTimestamp 145 | -- ** UUID handling 146 | , getUUID 147 | , getTrailID 148 | -- ** Fields 149 | , getFieldName 150 | , getFieldID 151 | , getItemByField 152 | , getValue 153 | , getItem 154 | -- * Time handling 155 | , utcTimeToUnixTime 156 | , posixSecondsToUnixTime 157 | , dayToUnixTime 158 | -- * C interop 159 | , withRawTdb 160 | , getRawTdb 161 | , touchTdb 162 | , withRawTdbCons 163 | , getRawTdbCons 164 | , touchTdbCons 165 | , TdbRaw 166 | , TdbConsRaw 167 | -- ** Taking apart `Feature` 168 | , field 169 | , value 170 | , (^.) 171 | -- * Data types 172 | , UUID 173 | , TrailID 174 | , FieldID 175 | , Crumb 176 | , Feature() 177 | , TdbField 178 | , TdbVal 179 | , TdbVersion 180 | , FieldName 181 | , FieldNameLike(..) 182 | , featureWord 183 | , featureTdbVal 184 | , Cursor() 185 | , TdbCons() 186 | , Tdb() 187 | -- ** Time 188 | , UnixTime 189 | , getUnixTime 190 | -- * Exceptions 191 | , TrailDBException(..) 192 | -- * Multiple TrailDBs 193 | -- 194 | -- | Operations in this section are conveniences that may be useful if you 195 | -- have many TrailDB directories you want to query. 196 | , findTrailDBs 197 | , filterTrailDBDirectories ) 198 | where 199 | 200 | import Control.Applicative 201 | import Control.Concurrent 202 | import Control.Monad 203 | import Control.Monad.Catch 204 | import Control.Monad.IO.Class 205 | import Control.Monad.Primitive 206 | import Control.Monad.Trans.State.Strict 207 | import Data.Bits 208 | import qualified Data.ByteString as B 209 | import qualified Data.ByteString.Unsafe as B 210 | import qualified Data.ByteString.Lazy as BL 211 | import Data.Coerce 212 | import Data.Foldable ( for_, foldlM, Foldable ) 213 | import Data.Data 214 | import Data.IORef 215 | import qualified Data.Map.Strict as M 216 | import Data.Monoid 217 | import Data.Profunctor 218 | import qualified Data.Set as S 219 | import qualified Data.Text as T 220 | import qualified Data.Text.Encoding as T 221 | import qualified Data.Text.Lazy as TL 222 | import Data.Time 223 | import Data.Time.Clock.POSIX 224 | import qualified Data.Vector as VS 225 | import qualified Data.Vector.Generic as VG 226 | import qualified Data.Vector.Generic.Mutable as VGM 227 | import qualified Data.Vector.Unboxed as V 228 | import qualified Data.Vector.Unboxed.Mutable as VM 229 | import Data.Traversable 230 | import Data.Word 231 | import Foreign.C.String 232 | import Foreign.C.Types 233 | import Foreign.ForeignPtr 234 | import Foreign.Marshal.Alloc 235 | import Foreign.Marshal.Array 236 | import Foreign.Ptr 237 | import Foreign.Storable 238 | import GHC.Generics 239 | import System.Directory hiding ( isSymbolicLink ) 240 | import System.IO.Error 241 | import System.Posix.Files.ByteString 242 | 243 | import System.TrailDB.Error 244 | import System.TrailDB.Internal 245 | 246 | -- Raw types (tdb_types.h) 247 | type TdbField = Word32 248 | -- | `FieldID` is indexes a field number. 249 | type FieldID = TdbField 250 | type TdbVal = Word64 251 | type TdbItem = Word64 252 | 253 | foreign import ccall unsafe tdb_error_str 254 | :: CInt -> IO (Ptr CChar) 255 | foreign import ccall unsafe tdb_cons_init 256 | :: IO (Ptr TdbConsRaw) 257 | foreign import ccall unsafe tdb_cons_open 258 | :: Ptr TdbConsRaw 259 | -> Ptr CChar 260 | -> Ptr (Ptr CChar) 261 | -> Word64 262 | -> IO CInt 263 | foreign import ccall unsafe tdb_cons_close 264 | :: Ptr TdbConsRaw -> IO () 265 | foreign import ccall unsafe tdb_cons_add 266 | :: Ptr TdbConsRaw 267 | -> Ptr Word8 268 | -> Word64 269 | -> Ptr (Ptr CChar) 270 | -> Ptr Word64 271 | -> IO CInt 272 | foreign import ccall safe tdb_cons_finalize 273 | :: Ptr TdbConsRaw 274 | -> Word64 275 | -> IO CInt 276 | foreign import ccall safe tdb_cons_append 277 | :: Ptr TdbConsRaw 278 | -> Ptr TdbRaw 279 | -> IO CInt 280 | foreign import ccall unsafe tdb_dontneed 281 | :: Ptr TdbRaw 282 | -> IO () 283 | foreign import ccall unsafe tdb_willneed 284 | :: Ptr TdbRaw 285 | -> IO () 286 | foreign import ccall unsafe tdb_init 287 | :: IO (Ptr TdbRaw) 288 | foreign import ccall safe tdb_open 289 | :: Ptr TdbRaw 290 | -> Ptr CChar 291 | -> IO CInt 292 | foreign import ccall safe tdb_close 293 | :: Ptr TdbRaw 294 | -> IO () 295 | foreign import ccall unsafe tdb_get_trail_id 296 | :: Ptr TdbRaw 297 | -> Ptr Word8 298 | -> Ptr Word64 299 | -> IO CInt 300 | foreign import ccall unsafe tdb_get_uuid 301 | :: Ptr TdbRaw 302 | -> Word64 303 | -> IO (Ptr Word8) 304 | foreign import ccall unsafe tdb_num_trails 305 | :: Ptr TdbRaw -> IO Word64 306 | foreign import ccall unsafe tdb_num_events 307 | :: Ptr TdbRaw -> IO Word64 308 | foreign import ccall unsafe tdb_num_fields 309 | :: Ptr TdbRaw -> IO Word64 310 | foreign import ccall unsafe tdb_min_timestamp 311 | :: Ptr TdbRaw -> IO Word64 312 | foreign import ccall unsafe tdb_max_timestamp 313 | :: Ptr TdbRaw -> IO Word64 314 | foreign import ccall unsafe tdb_version 315 | :: Ptr TdbRaw -> IO Word64 316 | foreign import ccall unsafe tdb_get_field 317 | :: Ptr TdbRaw 318 | -> Ptr CChar 319 | -> Ptr TdbField 320 | -> IO CInt 321 | foreign import ccall unsafe tdb_get_field_name 322 | :: Ptr TdbRaw 323 | -> TdbField 324 | -> IO (Ptr CChar) 325 | foreign import ccall unsafe tdb_get_item_value 326 | :: Ptr TdbRaw 327 | -> TdbItem 328 | -> Ptr Word64 329 | -> IO (Ptr CChar) 330 | foreign import ccall unsafe tdb_get_item 331 | :: Ptr TdbRaw 332 | -> TdbField 333 | -> Ptr CChar 334 | -> Word64 335 | -> IO TdbItem 336 | foreign import ccall unsafe tdb_get_trail 337 | :: Ptr TdbCursorRaw 338 | -> Word64 339 | -> IO CInt 340 | foreign import ccall unsafe tdb_cursor_new 341 | :: Ptr TdbRaw 342 | -> IO (Ptr TdbCursorRaw) 343 | foreign import ccall unsafe tdb_cursor_free 344 | :: Ptr TdbCursorRaw 345 | -> IO () 346 | foreign import ccall unsafe shim_tdb_cursor_next 347 | :: Ptr TdbCursorRaw 348 | -> IO (Ptr TdbEventRaw) 349 | 350 | data TdbCursorRaw 351 | data TdbEventRaw 352 | 353 | -- | UUIDs should be 16 bytes in size. It can be converted to `TrailID` within a traildb. 354 | type UUID = B.ByteString 355 | -- | Fields names are bytestring and can contain nulls. 356 | type FieldName = B.ByteString 357 | -- | The type of time used in traildbs. 358 | type UnixTime = Word64 359 | -- | `TrailID` indexes a trail in a traildb. It can be converted to and back to 360 | -- `UUID` within a traildb. 361 | type TrailID = Word64 362 | -- | TrailDB version 363 | type TdbVersion = Word64 364 | 365 | -- | A single crumb is some event at certain time. 366 | -- 367 | -- The vector always has length as told by `getNumFields`. 368 | type Crumb = (UnixTime, V.Vector Feature) 369 | 370 | -- | `Feature` is a value in traildb. `getValue` can turn it into a 371 | -- human-readable value within a traildb. 372 | newtype Feature = Feature TdbItem 373 | deriving ( Eq, Ord, Show, Read, Typeable, Data, Generic, Storable ) 374 | 375 | -- | Type synonym from lens package. Defined here so we can avoid lens dependency. 376 | type Iso s t a b = forall p f. (Profunctor p, Functor f) => p a (f b) -> p s (f t) 377 | type Iso' s a = Iso s s a a 378 | 379 | iso :: (s -> a) -> (b -> t) -> Iso s t a b 380 | iso sa bt = dimap sa (fmap bt) 381 | {-# INLINE iso #-} 382 | 383 | -- | Type synonym from lens package. 384 | type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t 385 | type Lens' s a = Lens s s a a 386 | 387 | lens :: (s -> a) -> (s -> b -> t) -> Lens s t a b 388 | lens sa sbt afb s = sbt s <$> afb (sa s) 389 | {-# INLINE lens #-} 390 | 391 | 392 | 393 | (^.) :: s -> ((a -> Const a a) -> s -> Const a s) -> a 394 | s ^. l = getConst (l Const s) 395 | {-# INLINE (^.) #-} 396 | 397 | featureWord :: Iso' Feature Word64 398 | featureWord = iso (\(Feature w) -> w) Feature 399 | {-# INLINE featureWord #-} 400 | 401 | featureTdbVal :: Iso' Feature TdbVal 402 | featureTdbVal = featureWord 403 | {-# INLINE featureTdbVal #-} 404 | 405 | newtype instance V.Vector Feature = V_Feature (V.Vector Word64) 406 | newtype instance VM.MVector s Feature = VM_Feature (VM.MVector s Word64) 407 | 408 | instance VGM.MVector VM.MVector Feature where 409 | {-# INLINE basicLength #-} 410 | basicLength (VM_Feature w64) = VGM.basicLength w64 411 | {-# INLINE basicUnsafeSlice #-} 412 | basicUnsafeSlice a b (VM_Feature w64) = coerce $ 413 | VGM.basicUnsafeSlice a b w64 414 | {-# INLINE basicOverlaps #-} 415 | basicOverlaps (VM_Feature w1) (VM_Feature w2) = VGM.basicOverlaps w1 w2 416 | {-# INLINE basicUnsafeNew #-} 417 | basicUnsafeNew sz = do 418 | result <- VGM.basicUnsafeNew sz 419 | return $ VM_Feature result 420 | {-# INLINE basicUnsafeRead #-} 421 | basicUnsafeRead (VM_Feature w64) i = do 422 | result <- VGM.basicUnsafeRead w64 i 423 | return $ coerce result 424 | {-# INLINE basicUnsafeWrite #-} 425 | basicUnsafeWrite (VM_Feature w64) i v = 426 | VGM.basicUnsafeWrite w64 i (coerce v) 427 | 428 | basicInitialize (VM_Feature w64) = 429 | VGM.basicInitialize w64 430 | {-# INLINE basicInitialize #-} 431 | 432 | instance VG.Vector V.Vector Feature where 433 | {-# INLINE basicLength #-} 434 | basicLength (V_Feature w64) = VG.basicLength w64 435 | {-# INLINE basicUnsafeFreeze #-} 436 | basicUnsafeFreeze (VM_Feature w64) = do 437 | result <- VG.basicUnsafeFreeze w64 438 | return $ coerce (result :: V.Vector Word64) 439 | {-# INLINE basicUnsafeThaw #-} 440 | basicUnsafeThaw (V_Feature w64) = do 441 | result <- VG.basicUnsafeThaw w64 442 | return $ coerce result 443 | {-# INLINE basicUnsafeIndexM #-} 444 | basicUnsafeIndexM (V_Feature w64) idx = do 445 | result <- VG.basicUnsafeIndexM w64 idx 446 | return $ coerce result 447 | {-# INLINE basicUnsafeSlice #-} 448 | basicUnsafeSlice i1 i2 (V_Feature w64) = 449 | coerce $ VG.basicUnsafeSlice i1 i2 w64 450 | 451 | -- | `Feature` is isomorphic to `Word64` so it's safe to coerce between them. (see `featureWord`). 452 | instance V.Unbox Feature 453 | 454 | -- | A helper function to get the current unix time. 455 | -- 456 | -- May be useful when building TrailDBs if you don't have timestamps already. 457 | getUnixTime :: MonadIO m => m UnixTime 458 | getUnixTime = liftIO $ do 459 | now <- getPOSIXTime 460 | let t = floor now 461 | return t 462 | {-# INLINE getUnixTime #-} 463 | 464 | -- | Converts `UTCTime` to `UnixTime` 465 | utcTimeToUnixTime :: UTCTime -> UnixTime 466 | utcTimeToUnixTime utc = floor $ utcTimeToPOSIXSeconds utc 467 | 468 | -- | Converts `POSIXTime` to `UnixTime` 469 | posixSecondsToUnixTime :: POSIXTime -> UnixTime 470 | posixSecondsToUnixTime = floor 471 | 472 | -- | Converts `Day` to `UnixTime`. 473 | -- 474 | -- The time will be the first second of the `Day`. 475 | dayToUnixTime :: Day -> UnixTime 476 | dayToUnixTime day = utcTimeToUnixTime (UTCTime day 0) 477 | 478 | field :: Lens' Feature FieldID 479 | field = lens get_it set_it 480 | where 481 | get_it (Feature f) = fromIntegral $ f .&. 0x7f 482 | set_it (Feature original) new = Feature $ (original .&. 0xffffffffffffff80) .|. (fromIntegral $ new .&. 0x7f) 483 | {-# INLINE field #-} 484 | 485 | value :: Lens' Feature TdbVal 486 | value = lens get_it set_it 487 | where 488 | get_it (Feature f) = 489 | -- is it 32 bit item? 490 | if f .&. 128 == 0 491 | then (f `shiftR` 8) .&. 0xffffffff 492 | else f `shiftR` 16 493 | set_it (Feature original) new = 494 | Feature $ if original .&. 128 == 0 495 | then (original .&. 0x00000000000000ff) .|. (((new .&. 0xffffffff) `shiftL` 8)) 496 | else (original .&. 0x000000000000ffff) .|. (new `shiftL` 16) 497 | {-# INLINE value #-} 498 | 499 | -- | Class of things that can be used as a field name. 500 | -- 501 | -- The strict bytestring is the native type. Other types are converted, 502 | -- encoding with UTF-8. 503 | class FieldNameLike a where 504 | encodeToFieldName :: a -> B.ByteString 505 | 506 | instance FieldNameLike String where 507 | encodeToFieldName = T.encodeUtf8 . T.pack 508 | 509 | instance FieldNameLike B.ByteString where 510 | encodeToFieldName = id 511 | {-# INLINE encodeToFieldName #-} 512 | 513 | instance FieldNameLike BL.ByteString where 514 | encodeToFieldName = BL.toStrict 515 | 516 | instance FieldNameLike T.Text where 517 | encodeToFieldName = T.encodeUtf8 518 | 519 | instance FieldNameLike TL.Text where 520 | encodeToFieldName = T.encodeUtf8 . TL.toStrict 521 | 522 | newtype TdbCons = TdbCons (MVar (Maybe (Ptr TdbConsRaw))) 523 | deriving ( Typeable, Generic ) 524 | 525 | tdbThrowIfError :: MonadIO m => m CInt -> m () 526 | tdbThrowIfError action = do 527 | result <- action 528 | unless (result == 0) $ liftIO $ do 529 | err_string <- peekCString =<< tdb_error_str result 530 | throwM $ TrailDBError result err_string 531 | 532 | -- | Create a new TrailDB and return TrailDB construction handle. 533 | -- 534 | -- Close it with `closeTrailDBCons`. Garbage collector will close it eventually 535 | -- if you didn't do it yourself. You won't be receiving `FinalizationFailure` 536 | -- exception though if that fails when using the garbage collector. 537 | newTrailDBCons :: (FieldNameLike a, MonadIO m) 538 | => FilePath 539 | -> [a] 540 | -> m TdbCons 541 | newTrailDBCons filepath fields' = liftIO $ mask_ $ 542 | withCString filepath $ \root -> 543 | withBytestrings fields $ \fields_ptr -> do 544 | tdb_cons <- tdb_cons_init 545 | when (tdb_cons == nullPtr) $ 546 | throwM CannotAllocateTrailDBCons 547 | 548 | flip onException (tdb_cons_close tdb_cons) $ do 549 | tdbThrowIfError $ tdb_cons_open 550 | tdb_cons 551 | root 552 | fields_ptr 553 | (fromIntegral $ length fields) 554 | 555 | -- MVar will protect the handle from being used in multiple threads 556 | -- simultaneously. 557 | mvar <- newMVar (Just tdb_cons) 558 | 559 | -- Make garbage collector close it if it wasn't already. 560 | void $ mkWeakMVar mvar $ modifyMVar_ mvar $ \case 561 | Nothing -> return Nothing 562 | Just ptr -> do 563 | void $ tdb_cons_finalize ptr 0 564 | tdb_cons_close ptr 565 | return Nothing 566 | 567 | return $ TdbCons mvar 568 | where 569 | fields = fmap encodeToFieldName fields' 570 | 571 | -- | Runs an `IO` action with an opened `TdbCons`. The `TdbCons` is closed 572 | -- after the action has been executed. 573 | withTrailDBCons :: (FieldNameLike a, MonadIO m, MonadMask m) 574 | => FilePath 575 | -> [a] 576 | -> (TdbCons -> m b) 577 | -> m b 578 | withTrailDBCons filepath fields action = mask $ \restore -> do 579 | cons <- newTrailDBCons filepath fields 580 | finally (restore $ action cons) (closeTrailDBCons cons) 581 | 582 | withBytestrings :: forall a. [B.ByteString] -> (Ptr (Ptr CChar) -> IO a) -> IO a 583 | withBytestrings [] action = action nullPtr 584 | withBytestrings listing action = 585 | allocaArray (length listing) $ \bs_ptr -> loop_it bs_ptr listing 0 586 | where 587 | loop_it :: Ptr (Ptr CChar) -> [B.ByteString] -> Int -> IO a 588 | loop_it bs_ptr (bs:rest) idx = 589 | B.useAsCString bs $ \string_ptr -> do 590 | pokeElemOff bs_ptr idx string_ptr 591 | loop_it bs_ptr rest (idx+1) 592 | loop_it bs_ptr [] _ = action bs_ptr 593 | 594 | -- | Class of things that can be turned into rows and added with `addTrail`. 595 | -- 596 | -- The native type inside traildb is the `ByteString`. The use of this 597 | -- typeclass can eliminate some noise when converting values to `ByteString`. 598 | class ToTdbRow r where 599 | toTdbRow :: r -> [B.ByteString] 600 | 601 | -- | Class of things that can be turned into a field in a TrailDB. 602 | class ToTdbRowField f where 603 | toTdbField :: f -> B.ByteString 604 | 605 | instance (ToTdbRowField f1, ToTdbRowField f2) => ToTdbRow (f1, f2) where 606 | toTdbRow (f1, f2) = toTdbRow [toTdbField f1, toTdbField f2] 607 | 608 | instance (ToTdbRowField f1, ToTdbRowField f2, ToTdbRowField f3) => ToTdbRow (f1, f2, f3) where 609 | toTdbRow (f1, f2, f3) = toTdbRow [toTdbField f1, toTdbField f2, toTdbField f3] 610 | 611 | instance (ToTdbRowField f1, ToTdbRowField f2, ToTdbRowField f3, ToTdbRowField f4) => ToTdbRow (f1, f2, f3, f4) where 612 | toTdbRow (f1, f2, f3, f4) = toTdbRow [toTdbField f1, toTdbField f2, toTdbField f3, toTdbField f4] 613 | 614 | instance (ToTdbRowField f1, ToTdbRowField f2, ToTdbRowField f3, ToTdbRowField f4, ToTdbRowField f5) => ToTdbRow (f1, f2, f3, f4, f5) where 615 | toTdbRow (f1, f2, f3, f4, f5) = toTdbRow [toTdbField f1, toTdbField f2, toTdbField f3, toTdbField f4, toTdbField f5] 616 | 617 | instance ToTdbRowField B.ByteString where 618 | toTdbField = id 619 | {-# INLINE toTdbField #-} 620 | 621 | instance ToTdbRowField BL.ByteString where 622 | toTdbField = BL.toStrict 623 | {-# INLINE toTdbField #-} 624 | 625 | instance ToTdbRowField String where 626 | toTdbField = T.encodeUtf8 . T.pack 627 | {-# INLINE toTdbField #-} 628 | 629 | instance ToTdbRowField T.Text where 630 | toTdbField = T.encodeUtf8 631 | {-# INLINE toTdbField #-} 632 | 633 | instance ToTdbRowField TL.Text where 634 | toTdbField = T.encodeUtf8 . TL.toStrict 635 | {-# INLINE toTdbField #-} 636 | 637 | instance ToTdbRowField f => ToTdbRow [f] where 638 | toTdbRow = fmap toTdbField 639 | {-# INLINE toTdbRow #-} 640 | 641 | -- | Convenience type that lets you arbitrarily make heterogenous list of 642 | -- things that implement `ToTdbRowField` and subsequently `ToTdbRow`. Use this 643 | -- if plain lists are not suitable (because they are monotyped) or your tuples 644 | -- are too long to implement `ToTdbRow`. 645 | data TdbConsRow a b = (:.) a b 646 | 647 | infixr 7 :. 648 | 649 | instance (ToTdbRowField a, ToTdbRow b) 650 | => ToTdbRow (TdbConsRow a b) where 651 | toTdbRow (a :. b) = toTdbField a:toTdbRow b 652 | {-# INLINE toTdbRow #-} 653 | 654 | -- | Convenience newtype to put things that you can `Show` in TrailDB. It 655 | -- implements `ToTdbRowField`. 656 | newtype TdbShowable a = TdbShowable a 657 | deriving ( Functor, Foldable, Traversable, Typeable, Generic, Eq, Ord, Show, Read ) 658 | 659 | -- | Short-cut pattern synonym for `TdbShowable` 660 | pattern TShow a = TdbShowable a 661 | 662 | instance Show a => ToTdbRowField (TdbShowable a) where 663 | toTdbField (TdbShowable thing) = toTdbField $ show thing 664 | {-# INLINE toTdbField #-} 665 | 666 | -- | Add a cookie with timestamp and values to `TdbCons`. 667 | addTrail :: (MonadIO m, ToTdbRow r) 668 | => TdbCons 669 | -> UUID 670 | -> UnixTime 671 | -> r 672 | -> m () 673 | addTrail _ cookie _ _ | B.length cookie /= 16 = 674 | error "addTrail: cookie must be 16 bytes in length." 675 | addTrail (TdbCons mvar) cookie epoch (toTdbRow -> values) = liftIO $ withMVar mvar $ \case 676 | Nothing -> error "addTrail: tdb_cons is closed." 677 | Just ptr -> 678 | B.unsafeUseAsCString cookie $ \cookie_ptr -> 679 | withBytestrings values $ \values_ptr -> 680 | withArray (fmap (fromIntegral . B.length) values) $ \values_length_ptr -> 681 | tdbThrowIfError $ tdb_cons_add 682 | ptr 683 | (castPtr cookie_ptr) 684 | epoch 685 | values_ptr 686 | values_length_ptr 687 | {-# INLINE addTrail #-} 688 | 689 | -- | Finalizes a `TdbCons`. 690 | -- 691 | -- You usually don't need to call this manually because this is called 692 | -- automatically by `closeTrailDBCons` before actually closing `TdbCons`. 693 | finalizeTrailDBCons :: MonadIO m => TdbCons -> m () 694 | finalizeTrailDBCons (TdbCons mvar) = liftIO $ withMVar mvar $ \case 695 | Nothing -> error "finalizeTrailDBCons: tdb_cons is closed." 696 | Just ptr -> do 697 | result <- tdb_cons_finalize ptr 0 698 | unless (result == 0) $ throwM FinalizationFailure 699 | 700 | -- | Close a `TdbCons` 701 | -- 702 | -- Does nothing if it's closed already. 703 | closeTrailDBCons :: MonadIO m => TdbCons -> m () 704 | closeTrailDBCons (TdbCons mvar) = liftIO $ mask_ $ modifyMVar_ mvar $ \case 705 | Nothing -> return Nothing 706 | Just ptr -> do 707 | result <- tdb_cons_finalize ptr 0 708 | unless (result == 0) $ throwM FinalizationFailure 709 | tdb_cons_close ptr >> return Nothing 710 | 711 | -- | Appends a `Tdb` to an open `TdbCons`. 712 | appendTdbToTdbCons :: MonadIO m 713 | => Tdb 714 | -> TdbCons 715 | -> m () 716 | appendTdbToTdbCons (Tdb mvar_tdb) (TdbCons mvar_tdb_cons) = liftIO $ 717 | withMVar mvar_tdb_cons $ \case 718 | Nothing -> error "appendTdbToTdbCons: tdb_cons is closed." 719 | Just tdb_cons_ptr -> withCVar mvar_tdb $ \case 720 | Nothing -> error "appendTdbToTdbCons: tdb is closed." 721 | Just (tdbPtr -> tdb_ptr) -> do 722 | result <- tdb_cons_append tdb_cons_ptr tdb_ptr 723 | unless (result == 0) $ 724 | error "appendTdbToTdbCons: tdb_cons_append() failed." 725 | 726 | -- | Opens an existing TrailDB. 727 | -- 728 | -- It can open file TrailDBs and also directory TrailDBs. 729 | -- 730 | -- In case of files you can use format \"traildb.tdb\" or just \"traildb\". 731 | openTrailDB :: MonadIO m 732 | => FilePath 733 | -> m Tdb 734 | openTrailDB root = liftIO $ mask_ $ 735 | withCString root $ \root_str -> do 736 | tdb <- tdb_init 737 | flip onException (when (tdb /= nullPtr) $ tdb_close tdb) $ do 738 | when (tdb == nullPtr) $ throwM CannotAllocateTrailDB 739 | 740 | tdbThrowIfError $ tdb_open tdb root_str 741 | 742 | buf <- mallocForeignPtrArray 1 743 | 744 | -- Get all field names; later on we won't need to allocate 745 | -- new bytestrings for functions like `getFieldName`. 746 | num_fields <- tdb_num_fields tdb 747 | field_names <- for [0..num_fields-1] $ \field_id -> do 748 | result <- tdb_get_field_name tdb $ fromIntegral field_id 749 | when (result == nullPtr) $ throwM CannotAllocateTrailDB 750 | B.packCString result 751 | 752 | -- Protect concurrent access and attach a tdb_close to garbage collector 753 | mvar <- newCVar (Just TdbState { 754 | tdbPtr = tdb 755 | , decodeBuffer = buf 756 | , decodeBufferSize = 1 757 | , fieldNames = VS.fromList field_names 758 | }) 759 | 760 | void $ mkWeakCVar mvar $ modifyCVar_ mvar $ \case 761 | Nothing -> return Nothing 762 | Just (tdbPtr -> ptr) -> tdb_close ptr >> return Nothing 763 | 764 | return $ Tdb mvar 765 | 766 | -- | Hints that `Tdb` will not be accessed in near future. 767 | -- 768 | -- Internally may invoke system call \'madvise\' behind the scenes to operating system. 769 | -- 770 | -- This has no effect on semantics, only performance. 771 | dontneedTrailDB :: MonadIO m 772 | => Tdb 773 | -> m () 774 | dontneedTrailDB tdb = withTdb tdb "dontneedTrailDB" tdb_dontneed 775 | 776 | -- | Hints that `Tdb` will be walked over in near future. 777 | -- 778 | -- Internally may invoke system call \'madvise\' behind the scenes to operating system. 779 | -- 780 | -- This has no effect on semantics, only performance. 781 | willneedTrailDB :: MonadIO m 782 | => Tdb 783 | -> m () 784 | willneedTrailDB tdb = withTdb tdb "willneedTrailDB" tdb_willneed 785 | 786 | -- | Closes a TrailDB. 787 | -- 788 | -- Does nothing if `Tdb` is already closed. 789 | closeTrailDB :: MonadIO m 790 | => Tdb 791 | -> m () 792 | closeTrailDB (Tdb mvar) = liftIO $ mask_ $ modifyCVar_ mvar $ \case 793 | Nothing -> return Nothing 794 | Just (tdbPtr -> ptr) -> tdb_close ptr >> return Nothing 795 | 796 | -- | Cursors are used to read a TrailDB. 797 | -- 798 | -- It's permissible to make more than one cursor and use them on the same 799 | -- `Tdb` concurrently from many threads. 800 | -- 801 | -- However, you should not use the same cursor across threads. 802 | -- 803 | -- Most operations on `Tdb` lock it so it cannot be used concurrenly. The 804 | -- operations `setCursor`, `stepCursor` and `stepCursorList` don't lock anything. 805 | data Cursor = Cursor {-# UNPACK #-} !(Ptr TdbCursorRaw) 806 | !(IORef ()) 807 | !(MVar (Maybe TdbState)) 808 | deriving ( Eq, Typeable, Generic ) 809 | 810 | -- | Creates a cursor to a trailDB. 811 | makeCursor :: MonadIO m 812 | => Tdb 813 | -> m Cursor 814 | makeCursor (Tdb mvar) = liftIO $ withCVar mvar $ \case 815 | Nothing -> error "makeCursor: tdb is closed." 816 | Just (tdbPtr -> tdb_ptr) -> mask_ $ do 817 | cursor <- tdb_cursor_new tdb_ptr 818 | cursor_fin <- newIORef () 819 | void $ mkWeakIORef cursor_fin $ tdb_cursor_free cursor 820 | return $ Cursor cursor cursor_fin mvar 821 | {-# NOINLINE makeCursor #-} 822 | 823 | -- | Class of things that can be result from `getTrail`. 824 | class FromTrail a where 825 | -- | Makes a result from a list of trails. 826 | -- 827 | -- One crumb of a trail has timestamp and associative list of fields. 828 | -- 829 | -- @ 830 | -- [(timestamp, [(fieldname1, value1), (fieldname2, value2), ...] 831 | -- , ... ] 832 | -- @ 833 | fromBytestringList :: [(UnixTime, [(B.ByteString, B.ByteString)])] -> a 834 | 835 | instance FromTrail [(UnixTime, [(B.ByteString, B.ByteString)])] where 836 | fromBytestringList = id 837 | {-# INLINE fromBytestringList #-} 838 | 839 | -- | Throws away timestamps. 840 | instance FromTrail [M.Map B.ByteString B.ByteString] where 841 | fromBytestringList = fmap (M.fromList . snd) 842 | 843 | -- | Throws away timestamps and field names. 844 | instance FromTrail [[B.ByteString]] where 845 | fromBytestringList = fmap (fmap snd . snd) 846 | 847 | -- | `Vector` version. 848 | instance FromTrail (VS.Vector (M.Map B.ByteString B.ByteString)) where 849 | fromBytestringList = VS.fromList . fmap (M.fromList . snd) 850 | 851 | -- | Set of all values that appear in the trail. 852 | instance FromTrail (S.Set B.ByteString) where 853 | fromBytestringList = S.fromList . concatMap (fmap snd . snd) 854 | 855 | -- | Convenience function that runs a function for each `TrailID` in TrailDB. 856 | forEachTrailID :: (Applicative m, MonadIO m) => Tdb -> (TrailID -> m ()) -> m () 857 | forEachTrailID tdb action = do 858 | num_trails <- getNumTrails tdb 859 | for_ [0..num_trails-1] $ \tid -> action tid 860 | {-# INLINEABLE forEachTrailID #-} 861 | 862 | -- | Same as `forEachTrailID` but passes UUID as well. 863 | forEachTrailIDUUID :: (Applicative m, MonadIO m) => Tdb -> (TrailID -> UUID -> m ()) -> m () 864 | forEachTrailIDUUID tdb action = do 865 | num_trails <- getNumTrails tdb 866 | for_ [0..num_trails-1] $ \tid -> do 867 | uuid <- getUUID tdb tid 868 | action tid uuid 869 | 870 | -- | Convenience function that goes through every single event in the TrailDB. 871 | forEachEvent :: (Applicative m, MonadIO m) => Tdb -> (TrailID -> Crumb -> m ()) -> m () 872 | forEachEvent tdb action = do 873 | num_trails <- getNumTrails tdb 874 | cursor <- makeCursor tdb 875 | for_ [0..num_trails-1] $ \tid -> do 876 | setCursor cursor tid 877 | 878 | let go = stepCursor cursor >>= \case 879 | Nothing -> return () 880 | Just crumb -> action tid crumb >> go 881 | 882 | go 883 | {-# INLINEABLE forEachEvent #-} 884 | 885 | -- | Convenience function that goes through every single event in the TrailDB, with UUID included. 886 | forEachEventUUID :: (Applicative m, MonadIO m) => Tdb -> (TrailID -> Crumb -> UUID -> m ()) -> m () 887 | forEachEventUUID tdb action = do 888 | num_trails <- getNumTrails tdb 889 | cursor <- makeCursor tdb 890 | for_ [0..num_trails-1] $ \tid -> do 891 | setCursor cursor tid 892 | uuid <- getUUID tdb tid 893 | 894 | let go = stepCursor cursor >>= \case 895 | Nothing -> return () 896 | Just crumb -> action tid crumb uuid >> go 897 | 898 | go 899 | {-# INLINEABLE forEachEventUUID #-} 900 | 901 | -- | Same as `forEachEvent` but arguments flipped. 902 | traverseEachEvent :: (Applicative m, MonadIO m) => (TrailID -> Crumb -> m ()) -> Tdb -> m () 903 | traverseEachEvent action tdb = forEachEvent tdb action 904 | {-# INLINE traverseEachEvent #-} 905 | 906 | -- | Same as `forEachEventUUID` but arguments flipped. 907 | traverseEachEventUUID :: (Applicative m, MonadIO m) => (TrailID -> Crumb -> UUID -> m ()) -> Tdb -> m () 908 | traverseEachEventUUID action tdb = forEachEventUUID tdb action 909 | {-# INLINE traverseEachEventUUID #-} 910 | 911 | -- | Convenience function that goes through every single event in the TrailDB. 912 | foldEachEvent :: (Applicative m, MonadIO m) => Tdb -> (a -> TrailID -> Crumb -> m a) -> a -> m a 913 | foldEachEvent tdb action initial = do 914 | num_trails <- getNumTrails tdb 915 | cursor <- makeCursor tdb 916 | goTrails cursor 0 num_trails initial 917 | where 918 | goTrails _ index num_trails !accum | index >= num_trails = pure accum 919 | goTrails cursor index num_trails !accum = do 920 | setCursor cursor index 921 | 922 | let go accum = stepCursor cursor >>= \case 923 | Nothing -> pure accum 924 | Just crumb -> action accum index crumb >>= go 925 | 926 | go accum >>= goTrails cursor (index+1) num_trails 927 | {-# INLINEABLE foldEachEvent #-} 928 | 929 | -- | Convenience function that goes through every single event in the TrailDB, includes UUID. 930 | foldEachEventUUID :: (Applicative m, MonadIO m) => Tdb -> (a -> TrailID -> Crumb -> UUID -> m a) -> a -> m a 931 | foldEachEventUUID tdb action initial = do 932 | num_trails <- getNumTrails tdb 933 | cursor <- makeCursor tdb 934 | goTrails cursor 0 num_trails initial 935 | where 936 | goTrails _ index num_trails !accum | index >= num_trails = pure accum 937 | goTrails cursor index num_trails !accum = do 938 | setCursor cursor index 939 | uuid <- getUUID tdb index 940 | 941 | let go accum = stepCursor cursor >>= \case 942 | Nothing -> pure accum 943 | Just crumb -> action accum index crumb uuid >>= go 944 | 945 | go accum >>= goTrails cursor (index+1) num_trails 946 | {-# INLINEABLE foldEachEventUUID #-} 947 | 948 | -- | Same as `forEachTrailID` but arguments flipped. 949 | traverseEachTrailID :: (Applicative m, MonadIO m) => (TrailID -> m ()) -> Tdb -> m () 950 | traverseEachTrailID action tdb = forEachTrailID tdb action 951 | {-# INLINE traverseEachTrailID #-} 952 | 953 | -- | Same as `traverseEachTrailID` but passes UUID as well. 954 | traverseEachTrailIDUUID :: (Applicative m, MonadIO m) => (TrailID -> UUID -> m ()) -> Tdb -> m () 955 | traverseEachTrailIDUUID action tdb = forEachTrailIDUUID tdb action 956 | {-# INLINE traverseEachTrailIDUUID #-} 957 | 958 | -- | Fold TrailDB for each `TrailID`. 959 | -- 960 | -- This is like `traverseEachTrailID` but lets you carry a folding value. 961 | foldTrailDB :: MonadIO m => (a -> TrailID -> m a) -> a -> Tdb -> m a 962 | foldTrailDB action initial tdb = do 963 | num_trails <- getNumTrails tdb 964 | foldlM action initial [0..num_trails-1] 965 | {-# INLINEABLE foldTrailDB #-} 966 | 967 | -- | Same as `foldTrailDB` but passes UUID as well. 968 | foldTrailDBUUID :: MonadIO m => (a -> TrailID -> UUID -> m a) -> a -> Tdb -> m a 969 | foldTrailDBUUID action initial tdb = do 970 | num_trails <- getNumTrails tdb 971 | foldlM (\accum tid -> do 972 | uuid <- getUUID tdb tid 973 | action accum tid uuid) 974 | initial [0..num_trails-1] 975 | {-# INLINEABLE foldTrailDBUUID #-} 976 | 977 | -- | Given a crumb, decodes it into list of strict bytestring with field, value pairs. 978 | {-# INLINE decodeCrumbBytestring #-} 979 | decodeCrumbBytestring :: MonadIO m => Tdb -> Crumb -> m (UnixTime, [(B.ByteString, B.ByteString)]) 980 | decodeCrumbBytestring (Tdb mvar) (unixtime, features) = liftIO $ withMVar mvar $ \case 981 | Nothing -> error "decodeCrumbBytestring: traildb is closed." 982 | Just tdbstate -> do 983 | valuenames <- alloca $ \len_ptr -> do 984 | for (V.toList features) $ \(Feature feat) -> do 985 | cstr <- tdb_get_item_value (tdbPtr tdbstate) feat len_ptr 986 | when (cstr == nullPtr) $ throwM NoSuchValue 987 | len <- peek len_ptr 988 | B.packCStringLen (cstr, fromIntegral len) 989 | pure (unixtime, zip (tail $ VS.toList $ fieldNames tdbstate) valuenames) 990 | 991 | -- | Convenience function that returns a full trail in human-readable format. 992 | -- 993 | -- This is quite a bit slower than using a cursor (`makeCursor`, `setCursor`, 994 | -- `stepCursor`) but is simpler. If you don't need to go through bazillions of 995 | -- data really fast you might want to use this one. 996 | -- 997 | -- See `FromTrail` for things that can be taken as a result. 998 | getTrail :: (FromTrail a, MonadIO m) 999 | => Tdb 1000 | -> TrailID 1001 | -> m a 1002 | getTrail tdb tid = liftIO $ do 1003 | cursor <- makeCursor tdb 1004 | setCursor cursor tid 1005 | fromBytestringList <$> exhaustCursor cursor 1006 | where 1007 | exhaustCursor :: Cursor -> IO [(UnixTime, [(B.ByteString, B.ByteString)])] 1008 | exhaustCursor cursor = 1009 | stepCursor cursor >>= \case 1010 | Nothing -> return [] 1011 | Just (unixtime, V.toList -> features) -> do 1012 | let field_ids = features <&> (^.field) 1013 | fieldnames <- for field_ids $ getFieldName tdb 1014 | valuenames <- for features $ getValue tdb 1015 | ((unixtime, zip fieldnames valuenames):) <$> exhaustCursor cursor 1016 | 1017 | (<&>) = flip (<$>) 1018 | {-# INLINEABLE getTrail #-} 1019 | 1020 | -- | Same as `getTrail` but not polymorphic in output. 1021 | -- 1022 | -- Meant to be used for very quick throw-away programs where you don't want to 1023 | -- spell out the type to be used (e.g. if you all you do is `print` the trail). 1024 | getTrailBytestring :: MonadIO m => Tdb -> TrailID -> m [(UnixTime, [(B.ByteString, B.ByteString)])] 1025 | getTrailBytestring = getTrail 1026 | {-# INLINE getTrailBytestring #-} 1027 | 1028 | -- | Steps cursor forward in its trail. 1029 | -- 1030 | -- Returns `Nothing` if there are no more crumbs in the trail. 1031 | stepCursor :: MonadIO m 1032 | => Cursor 1033 | -> m (Maybe Crumb) 1034 | stepCursor (Cursor cursor mvar finalizer) = liftIO $ do 1035 | event_ptr <- shim_tdb_cursor_next cursor 1036 | if event_ptr /= nullPtr 1037 | then do let word64_ptr = castPtr event_ptr :: Ptr Word64 1038 | timestamp <- peekElemOff word64_ptr 0 1039 | num_items <- peekElemOff word64_ptr 1 1040 | let items = plusPtr word64_ptr (2*sizeOf (undefined :: Word64)) 1041 | 1042 | vec <- V.generateM (fromIntegral num_items) $ peekElemOff items 1043 | 1044 | touch mvar 1045 | touch finalizer 1046 | 1047 | return $ Just (timestamp, vec) 1048 | 1049 | else return Nothing 1050 | {-# INLINE stepCursor #-} 1051 | 1052 | -- | Same as `stepCursor` but returns the remaining trails on the cursor as a list. 1053 | -- 1054 | -- Available since 0.1.1.0 1055 | stepCursorList :: MonadIO m 1056 | => Cursor 1057 | -> m [Crumb] 1058 | stepCursorList cursor = liftIO step_loop 1059 | where 1060 | step_loop = stepCursor cursor >>= \case 1061 | Just item -> (item:) <$> step_loop 1062 | _ -> return [] 1063 | {-# INLINE stepCursorList #-} 1064 | 1065 | -- | Puts cursor at the start of some trail. 1066 | setCursor :: MonadIO m 1067 | => Cursor 1068 | -> TrailID 1069 | -> m () 1070 | setCursor (Cursor cursor mvar finalizer) trail_id = liftIO $ do 1071 | tdbThrowIfError $ tdb_get_trail cursor trail_id 1072 | touch mvar 1073 | touch finalizer 1074 | {-# INLINE setCursor #-} 1075 | 1076 | withTdb :: MonadIO m => Tdb -> String -> (Ptr TdbRaw -> IO a) -> m a 1077 | withTdb (Tdb mvar) errstring action = liftIO $ withCVar mvar $ \case 1078 | Nothing -> error $ errstring <> ": tdb is closed." 1079 | Just (tdbPtr -> ptr) -> action ptr 1080 | {-# INLINE withTdb #-} 1081 | 1082 | -- | Finds a uuid by trail ID. 1083 | getUUID :: MonadIO m => Tdb -> TrailID -> m UUID 1084 | getUUID tdb cid = withTdb tdb "getUUID" $ \ptr -> do 1085 | cptr <- tdb_get_uuid ptr cid 1086 | when (cptr == nullPtr) $ 1087 | throwM NoSuchTrailID 1088 | B.packCStringLen (castPtr cptr, 16) 1089 | {-# INLINE getUUID #-} 1090 | 1091 | -- | Finds a trail ID by uuid. 1092 | getTrailID :: MonadIO m => Tdb -> UUID -> m TrailID 1093 | getTrailID _ cookie | B.length cookie /= 16 = error "getTrailID: cookie must be 16 bytes in length." 1094 | getTrailID tdb cookie = withTdb tdb "getTrailID" $ \ptr -> 1095 | B.unsafeUseAsCString cookie $ \cookie_str -> 1096 | alloca $ \result_ptr -> do 1097 | result <- tdb_get_trail_id ptr (castPtr cookie_str) result_ptr 1098 | if result == 0 1099 | then peek result_ptr 1100 | else throwM NoSuchUUID 1101 | {-# INLINE getTrailID #-} 1102 | 1103 | -- | Returns the number of cookies in `Tdb` 1104 | getNumTrails :: MonadIO m => Tdb -> m Word64 1105 | getNumTrails tdb = withTdb tdb "getNumTrails" tdb_num_trails 1106 | 1107 | -- | Returns the number of events in `Tdb` 1108 | getNumEvents :: MonadIO m => Tdb -> m Word64 1109 | getNumEvents tdb = withTdb tdb "getNumEvents" tdb_num_events 1110 | 1111 | -- | Returns the number of fields in `Tdb` 1112 | getNumFields :: MonadIO m => Tdb -> m Word64 1113 | getNumFields tdb = withTdb tdb "getNumFields" tdb_num_fields 1114 | 1115 | -- | Returns the minimum timestamp in `Tdb` 1116 | getMinTimestamp :: MonadIO m => Tdb -> m UnixTime 1117 | getMinTimestamp tdb = withTdb tdb "getMinTimestamp" tdb_min_timestamp 1118 | 1119 | -- | Returns the maximum timestamp in `Tdb` 1120 | getMaxTimestamp :: MonadIO m => Tdb -> m UnixTime 1121 | getMaxTimestamp tdb = withTdb tdb "getMaxTimestamp" tdb_max_timestamp 1122 | 1123 | -- | Given a field ID, returns its human-readable field name. 1124 | {-# INLINE getFieldName #-} 1125 | getFieldName :: MonadIO m => Tdb -> FieldID -> m FieldName 1126 | getFieldName (Tdb mvar) (fromIntegral -> !fid) = liftIO $ withCVar mvar $ \case 1127 | Nothing -> error $ "getFieldName: tdb is closed." 1128 | Just (fieldNames -> field_names) -> 1129 | if fid < 0 || fid >= VS.length field_names 1130 | then error "getFieldName: requested field is out of range." 1131 | else return $ field_names `VS.unsafeIndex` fid 1132 | 1133 | -- | Given a field name, returns its `FieldID`. 1134 | getFieldID :: (FieldNameLike a, MonadIO m) => Tdb -> a -> m FieldID 1135 | getFieldID tdb (encodeToFieldName -> field_name) = withTdb tdb "getFieldID" $ \ptr -> 1136 | B.useAsCString field_name $ \field_name_cstr -> 1137 | alloca $ \field_ptr -> do 1138 | tdbThrowIfError $ tdb_get_field ptr field_name_cstr field_ptr 1139 | result <- peek field_ptr 1140 | return $ fromIntegral $ result-1 1141 | 1142 | -- | Given a `Feature`, returns a string that describes it. 1143 | -- 1144 | -- Values in a TrailDB are integers which need to be mapped back to strings to 1145 | -- be human-readable. 1146 | getValue :: MonadIO m => Tdb -> Feature -> m B.ByteString 1147 | getValue tdb (Feature ft) = withTdb tdb "getValue" $ \ptr -> 1148 | alloca $ \len_ptr -> do 1149 | cstr <- tdb_get_item_value ptr ft len_ptr 1150 | when (cstr == nullPtr) $ throwM NoSuchValue 1151 | len <- peek len_ptr 1152 | B.packCStringLen (cstr, fromIntegral len) 1153 | {-# INLINE getValue #-} 1154 | 1155 | -- | Given a field ID and a human-readable value, turn it into `Feature` for that field ID. 1156 | getItem :: MonadIO m => Tdb -> FieldID -> B.ByteString -> m Feature 1157 | getItem tdb fid bs = withTdb tdb "getItem" $ \ptr -> 1158 | B.unsafeUseAsCStringLen bs $ \(cstr, len) -> do 1159 | ft <- tdb_get_item ptr (fid+1) cstr (fromIntegral len) 1160 | if ft == 0 1161 | then throwM NoSuchFeature 1162 | else return $ Feature ft 1163 | 1164 | -- | Same as `getItem` but uses a resolved field name rather than raw `FieldID`. 1165 | -- 1166 | -- This is implemented in terms of `getFieldID` and `getItem` inside. 1167 | getItemByField :: (FieldNameLike a, MonadIO m) => Tdb -> a -> B.ByteString -> m Feature 1168 | getItemByField tdb (encodeToFieldName -> fid) bs = liftIO $ do 1169 | fid <- getFieldID tdb fid 1170 | getItem tdb fid bs 1171 | 1172 | -- | Returns the raw pointer to a TrailDB. 1173 | -- 1174 | -- You can pass this pointer to C code and use the C API of TrailDB to use it. 1175 | -- 1176 | -- You become responsible for ensuring Haskell doesn't clean up and close the 1177 | -- managed `Tdb` handle. You can use `touchTdb` or `withRawTdb` to deal with this. 1178 | getRawTdb :: MonadIO m => Tdb -> m (Ptr TdbRaw) 1179 | getRawTdb (Tdb cvar) = liftIO $ withCVar cvar $ \case 1180 | Nothing -> error "getRawTdb: tdb is closed." 1181 | Just tdbstate -> return (tdbPtr tdbstate) 1182 | 1183 | -- | Returns the raw pointer to a TrailDB construction handle. 1184 | -- 1185 | -- Just as with `getRawTdb`, this pointer can be passed to C and used with the 1186 | -- TrailDB C API. 1187 | -- 1188 | -- Use `touchTdbCons` or `withRawTdbCons` to ensure the pointer is not garbage 1189 | -- collected while you are using it. 1190 | getRawTdbCons :: MonadIO m => TdbCons -> m (Ptr TdbConsRaw) 1191 | getRawTdbCons (TdbCons cvar) = liftIO $ withCVar cvar $ \case 1192 | Nothing -> error "getRawTdbCons: tdb_cons is closed." 1193 | Just raw_ptr -> return raw_ptr 1194 | 1195 | -- | Touch a `TdbCons`. 1196 | -- 1197 | -- Ensures that `TdbCons` has not been garbage collected at the point 1198 | -- `touchTdbCons` is invoked. Has no other effect. 1199 | touchTdbCons :: MonadIO m => TdbCons -> m () 1200 | touchTdbCons (TdbCons cvar) = liftIO $ withCVar cvar $ \case 1201 | Nothing -> return () 1202 | Just raw_ptr -> touch raw_ptr 1203 | 1204 | -- | Run an action with a raw pointer to `TdbCons`. 1205 | -- 1206 | -- The `TdbCons` is guaranteed not to be garbage collected while the given 1207 | -- action is running. 1208 | withRawTdbCons :: MonadIO m => TdbCons -> (Ptr TdbConsRaw -> IO a) -> m a 1209 | withRawTdbCons tdb_cons action = do 1210 | ptr <- getRawTdbCons tdb_cons 1211 | liftIO $ finally (action ptr) (touch ptr) 1212 | 1213 | -- | Touch a `Tdb`. 1214 | -- 1215 | -- Ensures that `Tdb` has not been garbage collected at the point `touchTdb` is 1216 | -- invoked. Has no other effect. 1217 | touchTdb :: MonadIO m => Tdb -> m () 1218 | touchTdb (Tdb cvar) = liftIO $ void $ withCVar cvar $ \case 1219 | Nothing -> return () 1220 | Just tdbstate -> touch (tdbPtr tdbstate) 1221 | 1222 | -- | Run an action with a raw pointer to `Tdb`. 1223 | -- 1224 | -- The `Tdb` is guaranteed not to be garbage collected while the given action is running. 1225 | withRawTdb :: MonadIO m => Tdb -> (Ptr TdbRaw -> IO a) -> m a 1226 | withRawTdb tdb action = do 1227 | ptr <- getRawTdb tdb 1228 | liftIO $ finally (action ptr) (touchTdb tdb) 1229 | 1230 | -- | Opens a `Tdb` and then closes it after action is over. 1231 | withTrailDB :: (MonadIO m, MonadMask m) => FilePath -> (Tdb -> m a) -> m a 1232 | withTrailDB fpath action = mask $ \restore -> do 1233 | tdb <- openTrailDB fpath 1234 | finally (restore $ action tdb) (closeTrailDB tdb) 1235 | {-# INLINE withTrailDB #-} 1236 | 1237 | -- | Given a directory, find all valid TrailDB paths inside it, recursively. 1238 | findTrailDBs :: forall m. (MonadIO m, MonadMask m) 1239 | => FilePath 1240 | -> Bool -- ^ Follow symbolic links? 1241 | -> m [FilePath] 1242 | findTrailDBs filepath follow_symbolic_links = do 1243 | contents <- liftIO $ getDirectoryContents filepath 1244 | dirs <- execStateT (filterChildDirectories filepath contents) [filepath] 1245 | filterTrailDBDirectories dirs 1246 | where 1247 | filterChildDirectories :: FilePath -> [FilePath] -> StateT [FilePath] m () 1248 | filterChildDirectories prefix (".":rest) = filterChildDirectories prefix rest 1249 | filterChildDirectories prefix ("..":rest) = filterChildDirectories prefix rest 1250 | filterChildDirectories prefix (dir_raw:rest) = do 1251 | let dir = prefix <> "/" <> dir_raw 1252 | is_dir <- liftIO $ doesDirectoryExist dir 1253 | is_symbolic_link_maybe <- liftIO $ tryIOError $ getFileStatus (T.encodeUtf8 $ T.pack dir) 1254 | case is_symbolic_link_maybe of 1255 | Left exc | isDoesNotExistError exc -> filterChildDirectories prefix rest 1256 | Left exc -> throwM exc 1257 | Right is_symbolic_link -> 1258 | if is_dir && ((isSymbolicLink is_symbolic_link && follow_symbolic_links) || 1259 | not (isSymbolicLink is_symbolic_link)) 1260 | then modify (dir:) >> recurse dir >> filterChildDirectories prefix rest 1261 | else filterChildDirectories prefix rest 1262 | filterChildDirectories _ [] = return () 1263 | 1264 | recurse dir = do 1265 | contents <- liftIO $ getDirectoryContents dir 1266 | filterChildDirectories dir contents 1267 | 1268 | -- | Given a list of directories, filters it, returning only directories that 1269 | -- are valid TrailDB directories. 1270 | -- 1271 | -- Used internally by `findTrailDBs` but can be useful in general so we export it. 1272 | filterTrailDBDirectories :: (MonadIO m, MonadMask m) => [FilePath] -> m [FilePath] 1273 | filterTrailDBDirectories = filterM $ \dir -> do 1274 | result <- try $ openTrailDB dir 1275 | case result of 1276 | Left CannotAllocateTrailDB -> return False 1277 | Left exc -> throwM exc 1278 | Right ok -> closeTrailDB ok >> return True 1279 | 1280 | getTdbVersion :: MonadIO m => Tdb -> m TdbVersion 1281 | getTdbVersion tdb = withTdb tdb "getTdbVersion" tdb_version 1282 | 1283 | -------------------------------------------------------------------------------- /System/TrailDB/Error.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveDataTypeable #-} 2 | {-# LANGUAGE DeriveGeneric #-} 3 | 4 | module System.TrailDB.Error 5 | ( TrailDBException(..) ) 6 | where 7 | 8 | import Control.Exception 9 | import Data.Data 10 | import Foreign.C.Types 11 | import GHC.Generics 12 | 13 | -- | Exceptions that may happen with TrailDBs. 14 | -- 15 | -- Some programming errors may throw with `error` instead. 16 | data TrailDBException 17 | = CannotAllocateTrailDBCons -- ^ Failed to allocate `TdbCons`. 18 | | CannotAllocateTrailDB -- ^ Failed to allocate `Tdb`. 19 | | TrailDBError !CInt String -- ^ Errors reported by error code from TrailDB C library. 20 | -- includes numerical error and human-readable error. 21 | | NoSuchTrailID -- ^ A `TrailID` was used that doesn't exist in `Tdb`. 22 | | NoSuchUUID -- ^ A `UUID` was used that doesn't exist in `Tdb`. 23 | | NoSuchFieldID -- ^ A `FieldID` was used that doesn't exist in `Tdb`. 24 | | NoSuchValue -- ^ A `Feature` was used that doesn't contain a valid value. 25 | | NoSuchFeature -- ^ Attempted to find `Feature` for human readable name that doesn't exist. 26 | | FinalizationFailure -- ^ For some reason, finalizing a `TdbCons` failed. 27 | deriving ( Eq, Ord, Show, Read, Typeable, Generic ) 28 | 29 | instance Exception TrailDBException 30 | 31 | -------------------------------------------------------------------------------- /System/TrailDB/Internal.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveDataTypeable #-} 2 | {-# LANGUAGE DeriveGeneric #-} 3 | {-# LANGUAGE CPP #-} 4 | 5 | module System.TrailDB.Internal 6 | ( Tdb(..) 7 | , TdbState(..) 8 | , TdbConsRaw 9 | , TdbRaw 10 | 11 | , CVar 12 | , newCVar 13 | , mkWeakCVar 14 | , modifyCVar_ 15 | , modifyCVar 16 | , withCVar ) 17 | where 18 | 19 | #ifdef USE_IOREF 20 | import Data.IORef 21 | #else 22 | import Control.Concurrent.MVar 23 | #endif 24 | import qualified Data.ByteString as B 25 | import Data.Data 26 | import qualified Data.Vector as V 27 | import Data.Word 28 | import Foreign.ForeignPtr 29 | import Foreign.Ptr 30 | import GHC.Generics 31 | import System.Mem.Weak 32 | 33 | #ifdef USE_IOREF 34 | type CVar = IORef 35 | #else 36 | type CVar = MVar 37 | #endif 38 | 39 | newCVar :: a -> IO (CVar a) 40 | #ifdef USE_IOREF 41 | newCVar = newIORef 42 | #else 43 | newCVar = newMVar 44 | #endif 45 | {-# INLINE newCVar #-} 46 | 47 | mkWeakCVar :: CVar a -> IO () -> IO (Weak (CVar a)) 48 | #ifdef USE_IOREF 49 | mkWeakCVar = mkWeakIORef 50 | #else 51 | mkWeakCVar = mkWeakMVar 52 | #endif 53 | {-# INLINE mkWeakCVar #-} 54 | 55 | modifyCVar_ :: CVar a -> (a -> IO a) -> IO () 56 | #ifdef USE_IOREF 57 | modifyCVar_ cvar action = do 58 | item <- readIORef cvar 59 | result <- action item 60 | writeIORef cvar result 61 | #else 62 | modifyCVar_ = modifyMVar_ 63 | #endif 64 | {-# INLINE modifyCVar_ #-} 65 | 66 | modifyCVar :: CVar a -> (a -> IO (a, b)) -> IO b 67 | #ifdef USE_IOREF 68 | modifyCVar cvar action = do 69 | item <- readIORef cvar 70 | (new_state, result) <- action item 71 | writeIORef cvar new_state 72 | return result 73 | #else 74 | modifyCVar = modifyMVar 75 | #endif 76 | {-# INLINE modifyCVar #-} 77 | 78 | withCVar :: CVar a -> (a -> IO b) -> IO b 79 | #ifdef USE_IOREF 80 | withCVar cvar action = do 81 | item <- readIORef cvar 82 | action item 83 | #else 84 | withCVar = withMVar 85 | #endif 86 | 87 | -- | Represents the raw TrailDB construction as used in C. 88 | data TdbConsRaw 89 | 90 | -- | Represents the raw TrailDB handle as used in C. 91 | data TdbRaw 92 | 93 | data TdbState = TdbState 94 | { tdbPtr :: {-# UNPACK #-} !(Ptr TdbRaw) 95 | , decodeBuffer :: {-# UNPACK #-} !(ForeignPtr Word64) 96 | , decodeBufferSize :: {-# UNPACK #-} !Word64 97 | , fieldNames :: !(V.Vector B.ByteString) } 98 | 99 | newtype Tdb = Tdb (CVar (Maybe TdbState)) 100 | deriving ( Typeable, Generic ) 101 | 102 | -------------------------------------------------------------------------------- /bench-traildb/Main.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | {-# LANGUAGE FlexibleInstances #-} 3 | {-# LANGUAGE BangPatterns #-} 4 | 5 | module Main ( main ) where 6 | 7 | import Control.Exception 8 | import Control.DeepSeq 9 | import Control.Monad 10 | import Criterion 11 | import Criterion.Main 12 | import qualified Data.ByteString as B 13 | import Data.Foldable 14 | import Data.IORef 15 | import Data.Monoid 16 | import Data.Serialize.Put 17 | import System.Directory 18 | import System.IO.Unsafe ( unsafePerformIO ) 19 | import System.Random 20 | import System.TrailDB 21 | 22 | -- This hack needed to clean up after we close 23 | createdTrailDBs :: IORef [FilePath] 24 | createdTrailDBs = unsafePerformIO $ newIORef [] 25 | {-# NOINLINE createdTrailDBs #-} 26 | 27 | main :: IO () 28 | main = flip finally clearupTrailDBs $ defaultMain 29 | [ 30 | benchReadEq 1 31 | , benchReadEq 2 32 | , benchReadEq 5 33 | , benchReadEq 20 34 | , benchReadEq 150 35 | , benchConsEq 1 36 | , benchConsEq 2 37 | , benchConsEq 5 38 | , benchConsEq 20 39 | , benchConsEq 150 40 | ] 41 | 42 | clearupTrailDBs :: IO () 43 | clearupTrailDBs = do 44 | paths <- readIORef createdTrailDBs 45 | for_ paths $ \path -> do 46 | _ <- try $ removeDirectoryRecursive path :: IO (Either SomeException ()) 47 | _ <- try $ removeFile (path <> ".tdb") :: IO (Either SomeException ()) 48 | return () 49 | 50 | instance NFData TdbCons where 51 | rnf !_ = () 52 | 53 | instance NFData Tdb where 54 | rnf !_ = () 55 | 56 | instance NFData Cursor where 57 | rnf !_ = () 58 | 59 | instance NFData (IO UUID) where 60 | rnf !_ = () 61 | 62 | benchConsEq :: Int -> Benchmark 63 | benchConsEq num_fields = env (prepareCons num_fields) $ \ ~(cons, uuidgen, _) -> bench ("addTrail" <> show num_fields) $ whnfIO $ do 64 | ug <- uuidgen 65 | addTrail cons ug 100 item 66 | where 67 | item = replicate num_fields ("hello" :: B.ByteString) 68 | 69 | benchReadEq :: Int -> Benchmark 70 | benchReadEq num_fields = env (prepareTrailDB num_fields) $ \ ~(tdb, cursor) -> bench ("seekTrail" <> show num_fields) $ whnfIO $ do 71 | idx <- randomRIO (0, 999999) 72 | setCursor cursor idx 73 | stepCursor cursor 74 | 75 | prepareTrailDB :: Int -> IO (Tdb, Cursor) 76 | prepareTrailDB num_fields = do 77 | (cons, _, name) <- prepareCons num_fields 78 | closeTrailDBCons cons 79 | 80 | tdb <- openTrailDB name 81 | cursor <- makeCursor tdb 82 | 83 | return (tdb, cursor) 84 | 85 | prepareCons :: Int -> IO (TdbCons, IO UUID, FilePath) 86 | prepareCons num_fields = do 87 | name <- replicateM 10 $ randomRIO ('a', 'z') 88 | cons <- newTrailDBCons name (fmap show [0..num_fields-1]) 89 | atomicModifyIORef' createdTrailDBs $ \old -> ( name:old, () ) 90 | 91 | uuid_ref <- newIORef 0 92 | let new_uuid = do idx <- readIORef uuid_ref 93 | modifyIORef' uuid_ref $ \old -> old+1 94 | return $ runPut $ do putWord64le idx 95 | putByteString $ B.replicate 8 0x0 96 | 97 | replicateM_ 1000000 $ do 98 | uuid <- new_uuid 99 | addTrail cons uuid 100 item 100 | 101 | return (cons, new_uuid, name) 102 | where 103 | item = replicate num_fields ("blah" :: B.ByteString) 104 | 105 | -------------------------------------------------------------------------------- /cbits/shim.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | const tdb_event* shim_tdb_cursor_next(tdb_cursor* cursor) 4 | { 5 | return tdb_cursor_next(cursor); 6 | } 7 | 8 | tdb_field shim_tdb_item_to_field(const tdb_item item) 9 | { 10 | return tdb_item_field(item); 11 | } 12 | 13 | tdb_val shim_tdb_item_to_val(const tdb_item item) 14 | { 15 | return tdb_item_val(item); 16 | } 17 | 18 | tdb_item shim_tdb_field_val_to_item(const tdb_field field, const tdb_val val) 19 | { 20 | return tdb_make_item(field, val); 21 | } 22 | 23 | -------------------------------------------------------------------------------- /examples/tutorial_simple_traildb.hs: -------------------------------------------------------------------------------- 1 | module Main ( main ) where 2 | 3 | import qualified Data.ByteString as B 4 | import qualified Data.ByteString.Lazy as BL 5 | import Data.Foldable 6 | import Data.UUID 7 | import Data.UUID.V4 8 | import Data.Time 9 | import System.TrailDB 10 | 11 | main :: IO () 12 | main = do 13 | withTrailDBCons "tiny" ["username", "action"] $ \cons -> 14 | for_ [0..2] $ \i -> do 15 | uuid <- BL.toStrict . toByteString <$> nextRandom 16 | let username = "user" ++ show i 17 | for_ (zip [0..] ["open", "save", "close"]) $ \(day, action) -> 18 | addTrail cons uuid 19 | (dayToUnixTime (fromGregorian 2016 (1+i) (1+day))) 20 | [username, action] 21 | 22 | withTrailDB "tiny" $ \tdb -> forEachTrailIDUUID tdb $ \tid uuid -> do 23 | trail <- getTrailBytestring tdb tid 24 | print (uuid, trail) 25 | 26 | -------------------------------------------------------------------------------- /stack.yaml: -------------------------------------------------------------------------------- 1 | # This file was automatically generated by 'stack init' 2 | # 3 | # Some commonly used options have been documented as comments in this file. 4 | # For advanced use and comprehensive documentation of the format, please see: 5 | # http://docs.haskellstack.org/en/stable/yaml_configuration/ 6 | 7 | # Resolver to choose a 'specific' stackage snapshot or a compiler version. 8 | # A snapshot resolver dictates the compiler version and the set of packages 9 | # to be used for project dependencies. For example: 10 | # 11 | # resolver: lts-3.5 12 | # resolver: nightly-2015-09-21 13 | # resolver: ghc-7.10.2 14 | # resolver: ghcjs-0.1.0_ghc-7.10.2 15 | # resolver: 16 | # name: custom-snapshot 17 | # location: "./custom-snapshot.yaml" 18 | resolver: lts-12.13 19 | 20 | # User packages to be built. 21 | # Various formats can be used as shown in the example below. 22 | # 23 | # packages: 24 | # - some-directory 25 | # - https://example.com/foo/bar/baz-0.0.2.tar.gz 26 | # - location: 27 | # git: https://github.com/commercialhaskell/stack.git 28 | # commit: e7b331f14bcffb8367cd58fbfc8b40ec7642100a 29 | # - location: https://github.com/commercialhaskell/stack/commit/e7b331f14bcffb8367cd58fbfc8b40ec7642100a 30 | # extra-dep: true 31 | # subdirs: 32 | # - auto-update 33 | # - wai 34 | # 35 | # A package marked 'extra-dep: true' will only be built if demanded by a 36 | # non-dependency (i.e. a user package), and its test suites and benchmarks 37 | # will not be run. This is useful for tweaking upstream packages. 38 | packages: 39 | - '.' 40 | # Dependency packages to be pulled from upstream that are not in the resolver 41 | # (e.g., acme-missiles-0.3) 42 | extra-deps: [] 43 | 44 | # Override default flag values for local packages and extra-deps 45 | flags: {} 46 | 47 | # Extra package databases containing global packages 48 | extra-package-dbs: [] 49 | 50 | # Control whether we use the GHC we find on the path 51 | # system-ghc: true 52 | # 53 | # Require a specific version of stack, using version ranges 54 | # require-stack-version: -any # Default 55 | # require-stack-version: ">=1.4" 56 | # 57 | # Override the architecture used by stack, especially useful on Windows 58 | # arch: i386 59 | # arch: x86_64 60 | # 61 | # Extra directories used by stack for building 62 | # extra-include-dirs: [/path/to/dir] 63 | # extra-lib-dirs: [/path/to/dir] 64 | # 65 | # Allow a newer minor version of GHC than the snapshot specifies 66 | # compiler-check: newer-minor 67 | -------------------------------------------------------------------------------- /traildb.cabal: -------------------------------------------------------------------------------- 1 | name: traildb 2 | version: 0.1.4.1 3 | synopsis: TrailDB bindings for Haskell 4 | description: Check out README.md for information on these bindings. 5 | . 6 | TrailDB project home page is at traildb.io 7 | license-file: LICENSE 8 | license: MIT 9 | author: Mikko Juola 10 | maintainer: mikko.juola@adroll.com 11 | copyright: AdRoll Inc (c) 2016-2017 12 | category: Database 13 | build-type: Simple 14 | cabal-version: >=1.10 15 | stability: beta 16 | extra-source-files: README.md 17 | examples/tutorial_simple_traildb.hs 18 | stack.yaml 19 | 20 | source-repository head 21 | type: git 22 | location: https://github.com/traildb/traildb-haskell 23 | 24 | flag use-ioref 25 | default: False 26 | description: Use `IORef` instead of `MVar` as 27 | the container for `Tdb`. Can improve performance (very 28 | slightly) but you lose thread safety. 29 | 30 | flag build-wikipedia-benchmark 31 | default: False 32 | description: Build the wikipedia scanning executable from 33 | https://github.com/joeyrobert/traildb-benchmark 34 | This program requires you to download the wikipedia TrailDB 35 | manually; check out the linked git repository for where to get it. 36 | 37 | library 38 | exposed-modules: System.TrailDB 39 | System.TrailDB.Error 40 | System.TrailDB.Internal 41 | build-depends: base >=4.6 && <5.0 42 | ,bytestring 43 | ,containers >=0.4 && <1.0 44 | ,directory 45 | ,exceptions 46 | ,primitive 47 | ,profunctors 48 | ,text 49 | ,time 50 | ,transformers 51 | ,unix 52 | ,vector 53 | extra-libraries: traildb, Judy 54 | ghc-options: -Wall -fno-warn-name-shadowing -O2 55 | c-sources: cbits/shim.c 56 | default-language: Haskell2010 57 | 58 | if flag(use-ioref) 59 | cpp-options: -DUSE_IOREF 60 | 61 | executable traildb-wikipedia-benchmark 62 | main-is: Main.hs 63 | if !flag(build-wikipedia-benchmark) 64 | buildable: False 65 | hs-source-dirs: wikipedia-benchmark 66 | build-depends: base >=4.6 && <5.0 67 | ,traildb 68 | ,vector 69 | ghc-options: -Wall -fno-warn-name-shadowing -O2 70 | default-language: Haskell2010 71 | 72 | benchmark bench-traildb 73 | type: exitcode-stdio-1.0 74 | main-is: Main.hs 75 | build-depends: base >=4.6 && <5.0 76 | ,bytestring 77 | ,cereal 78 | ,criterion 79 | ,deepseq 80 | ,directory 81 | ,random 82 | ,traildb 83 | ghc-options: -Wall -fno-warn-name-shadowing -O2 84 | hs-source-dirs: bench-traildb 85 | default-language: Haskell2010 86 | -------------------------------------------------------------------------------- /traildb_logo_512.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/traildb/traildb-haskell/a27a3c50920ebbfd4e879cbab8c55556e5bf2696/traildb_logo_512.png -------------------------------------------------------------------------------- /wikipedia-benchmark/Main.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | {-# LANGUAGE LambdaCase #-} 3 | 4 | module Main where 5 | 6 | import qualified Data.Vector.Unboxed as V 7 | import System.Environment 8 | import System.TrailDB 9 | 10 | data EditIpCounts = EditIpCounts !Int !Int 11 | deriving ( Show ) 12 | 13 | main :: IO () 14 | main = do 15 | args <- getArgs 16 | case args of 17 | ["fast" ] -> mainFast 18 | ["pretty"] -> mainPretty 19 | _ -> 20 | putStrLn "Please run benchmark with either 'fast' or 'pretty' argument." 21 | 22 | mainPretty :: IO () 23 | mainPretty 24 | = withTrailDB "wikipedia-history-small.tdb" 25 | $ \traildb -> print =<< foldEachEvent 26 | traildb 27 | (\(EditIpCounts edits ip_edits) _ crumb -> do 28 | (_timestamp, fields) <- decodeCrumbBytestring traildb crumb 29 | pure $ EditIpCounts 30 | (edits + if any (\pair -> pair == ("user", "")) fields then 1 else 0 31 | ) 32 | ( ip_edits 33 | + if any (\pair -> pair == ("ip", "")) fields then 1 else 0 34 | ) 35 | ) 36 | (EditIpCounts 0 0) 37 | 38 | mainFast :: IO () 39 | mainFast = withTrailDB "wikipedia-history-small.tdb" $ \traildb -> do 40 | cursor <- makeCursor traildb 41 | user_field <- fromIntegral <$> getFieldID traildb ("user" :: String) 42 | ip_field <- fromIntegral <$> getFieldID traildb ("ip" :: String) 43 | empty_user <- getItem traildb user_field "" 44 | empty_ip <- getItem traildb ip_field "" 45 | print 46 | =<< foldTrailDB 47 | (countEditsAndIPs cursor 48 | (fromIntegral user_field) 49 | (fromIntegral ip_field) 50 | empty_user 51 | empty_ip 52 | ) 53 | (EditIpCounts 0 0) 54 | traildb 55 | where 56 | countEditsAndIPs 57 | :: Cursor 58 | -> Int 59 | -> Int 60 | -> Feature 61 | -> Feature 62 | -> EditIpCounts 63 | -> TrailID 64 | -> IO EditIpCounts 65 | countEditsAndIPs cursor user_field ip_field empty_user empty_ip counts trail_id 66 | = do 67 | setCursor cursor trail_id 68 | 69 | let go accum@(EditIpCounts edits ip_edits) = stepCursor cursor >>= \case 70 | Nothing -> pure accum 71 | Just (_, event_values) -> 72 | let ip_value = event_values V.! ip_field 73 | user_value = event_values V.! user_field 74 | ip_add = if ip_value == empty_ip then 1 else 0 75 | user_add = if user_value == empty_user then 1 else 0 76 | in go $ EditIpCounts (edits + user_add) (ip_edits + ip_add) 77 | 78 | go counts 79 | --------------------------------------------------------------------------------