├── .gitignore ├── CHANGELOG.md ├── LICENSE ├── README.md ├── Setup.hs ├── circle.yml ├── example ├── LICENSE ├── hworker-example.cabal └── src │ └── Main.hs ├── hworker.cabal ├── src ├── Data │ └── Aeson │ │ └── Helpers.hs └── System │ └── Hworker.hs ├── stack.yaml └── test └── Spec.hs /.gitignore: -------------------------------------------------------------------------------- 1 | .cabal-sandbox 2 | .stack-work 3 | cabal.sandbox.config 4 | dist-stack -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | * 0.2.0 Remeike Forbes 2 | 3 | Coerce jobs to string (to work with Redis 5) 4 | 5 | * 0.1.0.1 Daniel Pattersion 2015-11-1 6 | 7 | Add example in haddocks. 8 | 9 | * 0.1.0.0 Daniel Patterson 2015-10-25 10 | 11 | Initial release. 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015 Daniel Patterson 2 | 3 | Permission to use, copy, modify, and/or distribute this software for any purpose 4 | with or without fee is hereby granted, provided that the above copyright notice 5 | and this permission notice appear in all copies. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH 8 | REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND 9 | FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, 10 | INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS 11 | OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 12 | TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF 13 | THIS SOFTWARE. 14 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## About 2 | 3 | `hworker` is a Redis-backed persistent at-least-once queue library. It 4 | is vaguely inspired by `sidekiq` for Ruby. It is intended to be a 5 | simple reliable mechanism for processing background tasks. The jobs 6 | can be created by a Haskell application or any application that can 7 | push JSON data structures of the right shape into a Redis queue. The 8 | application that processes the jobs need not be the same one as the 9 | application that creates them (they just need to be able to talk to 10 | the same Redis server, and use the same serialization to/from JSON). 11 | 12 | ## Stability 13 | 14 | This has been running in one application sending email (using 15 | `hworker-ses`) for several months. This is relatively low traffic 16 | (transactional messages) most of the time, with spikes of 10k-30k 17 | messages (mailing blasts). 18 | 19 | ## Important Note 20 | 21 | The expiration of jobs is really important. It defaults to 120 22 | seconds, which may be short depending on your application (for things 23 | like sending emails, it may be fine). **The reason why this timeout is 24 | important is that if a job ever runs longer than this, the monitor 25 | will think that the job failed** in some inexplicable way (like the 26 | server running the job died) and will add the job back to the queue to 27 | be run. 
Under the semantics of this job processor, a job running 28 | multiple times is not a failure case, but it's obviously not something 29 | you _want_ to happen, so be sure to set the timeout to something 30 | reasonable for your application. 31 | 32 | ## Overview 33 | 34 | To define jobs, you write a serializable representation of the job and 35 | a function that runs the job, which returns a status. The behavior of 36 | uncaught exceptions is defined when you create the worker - it can be 37 | either `Failure` or `Retry`. Jobs that return `Failure` are removed 38 | from the queue, whereas jobs that return `Retry` are added again. The 39 | only difference between a `Success` and a `Failure` is that a 40 | `Failure` carries a message that is logged (i.e., neither is run again). 41 | 42 | ## Example 43 | 44 | See the `example` directory in the repository. 45 | 46 | ## Semantics 47 | 48 | The behavior of this queue processor is at-least-once. 49 | 50 | We rely on the defined behavior of Redis for reliability. Once a job 51 | has been `queue`d, it is guaranteed to be run eventually, provided 52 | some worker and monitor threads exist. If the worker thread that was 53 | running a given job dies, the job will eventually be retried (if you 54 | do not want this behavior, do not start any monitor threads). Once the 55 | job completes, provided nothing kills the worker thread in the 56 | intervening time, jobs that return `Success` will not be run again, 57 | jobs that return `Failure` will have their messages logged and will 58 | not be run again, and jobs that return `Retry` will be queued 59 | again. If something kills the worker thread before these 60 | acknowledgements go through, the job will be retried. Exceptions 61 | triggered within the job cannot affect the worker thread - what they 62 | do to the job is defined at startup (they can cause either a `Failure` 63 | or a `Retry`). 64 | 65 | Any deviations from this behavior are considered bugs that will be fixed. 66 | 67 | 68 | ## Redis Operations 69 | 70 | Under the hood, we will have the following data structures in Redis 71 | (`name` is set when you create the `hworker` instance): 72 | 73 | `hworker-jobs-name`: a list of JSON-serialized job descriptions 74 | 75 | `hworker-progress-name`: a hash of jobs that are in progress, mapping to the time they were started 76 | 77 | `hworker-broken-name`: a hash of jobs that couldn't be deserialized, mapping to the time they were encountered; this most likely means you changed the serialization format with jobs still in the queue, _or_ you pointed different applications at the same queues. 78 | 79 | `hworker-failed-name`: a list of the jobs that failed (limited in size based on config). 80 | 81 | In the following pseudo-code, I'm using `MULTI`...`EXEC` to indicate 82 | atomic blocks of code. These are actually implemented with Lua and 83 | `EVAL`, but I think it's easier to read this way. If you want to see 84 | what's actually happening, just read the code - it's not very long!
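Because these are ordinary Redis structures, you can also inspect them - or even enqueue work from outside Haskell - with `redis-cli`. The commands below are an illustrative sketch that assumes an hworker instance created with the name `printer`; the exact JSON for a job depends on your `ToJSON` instance, but the payload is always a two-element array of a (UUID) job id string and the serialized job.

```
# look at pending, in-progress, and broken jobs
redis-cli LRANGE hworker-jobs-printer 0 -1
redis-cli HGETALL hworker-progress-printer
redis-cli HGETALL hworker-broken-printer

# enqueue a job from outside Haskell (illustrative UUID)
redis-cli LPUSH hworker-jobs-printer '["3f8d6c2a-5e71-4a0b-9c3d-2f61c0e4b8aa","PrintA"]'
```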
85 | 86 | When a worker wants to do work, the following happens: 87 | 88 | ``` 89 | now = TIME 90 | MULTI 91 | v = RPOP hworker-jobs-name 92 | if v 93 | HSET hworker-progress-name v now 94 | EXEC 95 | v 96 | ``` 97 | 98 | When it completes the job, it does the following: 99 | 100 | ``` 101 | v = JOB 102 | HDEL hworker-progress-name v 103 | ``` 104 | 105 | If the job returned `Retry`, the following occurs: 106 | 107 | ``` 108 | v = JOB 109 | MULTI 110 | LPUSH hworker-jobs-name v 111 | HDEL hworker-progress-name v 112 | EXEC 113 | ``` 114 | 115 | 116 | A monitor runs on another thread that will re-run jobs that stay in 117 | progress for too long (as that indicates that something unknown went 118 | wrong). The operation that it runs periodically is: 119 | 120 | ``` 121 | keys = HKEYS (or HSCAN) hworker-progress-name 122 | for keys as v: 123 | started = HGET hworker-progress-name v 124 | if started < TIME - timeout 125 | MULTI 126 | RPUSH hworker-jobs-name v 127 | HDEL hworker-progress-name v 128 | EXEC 129 | ``` 130 | 131 | Note that what the monitor does and what `Retry` does differ slightly: 132 | the monitor puts jobs on the front of the queue (so they will be picked up next), whereas `Retry` puts 133 | them on the back. 134 | 135 | ## Primary Libraries Used 136 | 137 | - hedis 138 | - aeson 139 | 140 | ## Contributors 141 | 142 | - Daniel Patterson (@dbp - dbp@dbpmail.net) 143 | 144 | ## Build Status 145 | 146 | [![Circle CI](https://circleci.com/gh/dbp/hworker.svg?style=svg&circle-token=b40a5b06c599d457cbaa4d1c00824c98d4768f2f)](https://circleci.com/gh/dbp/hworker) 147 | -------------------------------------------------------------------------------- /Setup.hs: -------------------------------------------------------------------------------- 1 | import Distribution.Simple 2 | main = defaultMain 3 | -------------------------------------------------------------------------------- /circle.yml: -------------------------------------------------------------------------------- 1 | dependencies: 2 | cache_directories: 3 | - "~/.stack" 4 | pre: 5 | - wget https://github.com/commercialhaskell/stack/releases/download/v0.1.2.0/stack-0.1.2.0-x86_64-linux.gz -O /tmp/stack.gz 6 | - gunzip /tmp/stack.gz && chmod +x /tmp/stack 7 | - sudo mv /tmp/stack /usr/bin/stack 8 | override: 9 | - stack setup 10 | - stack build 11 | - stack test ; echo 0 # We want the test deps cached... :( 12 | 13 | test: 14 | override: 15 | - stack test -------------------------------------------------------------------------------- /example/LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015 Daniel Patterson 2 | 3 | Permission to use, copy, modify, and/or distribute this software for any purpose 4 | with or without fee is hereby granted, provided that the above copyright notice 5 | and this permission notice appear in all copies. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH 8 | REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND 9 | FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, 10 | INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS 11 | OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 12 | TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF 13 | THIS SOFTWARE.
14 | -------------------------------------------------------------------------------- /example/hworker-example.cabal: -------------------------------------------------------------------------------- 1 | name: hworker-example 2 | version: 0.1.0.0 3 | synopsis: Example for Hworker. 4 | description: See README. 5 | homepage: http://github.com/dbp/hworker 6 | license: ISC 7 | license-file: LICENSE 8 | author: Daniel Patterson 9 | maintainer: dbp@dbpmail.net 10 | build-type: Simple 11 | cabal-version: >=1.10 12 | 13 | 14 | executable hworker-example 15 | hs-source-dirs: src 16 | main-is: Main.hs 17 | build-depends: base >= 4.7 && < 5 18 | , aeson 19 | , hedis >= 0.6.5 20 | , text 21 | , bytestring 22 | , time >= 1.5 23 | , attoparsec 24 | , uuid >= 1.2.6 25 | , hworker 26 | default-language: Haskell2010 -------------------------------------------------------------------------------- /example/src/Main.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveGeneric #-} 2 | {-# LANGUAGE MultiParamTypeClasses #-} 3 | {-# LANGUAGE OverloadedStrings #-} 4 | import Control.Concurrent (forkIO, threadDelay) 5 | import Control.Concurrent.MVar (MVar, newMVar, putMVar, takeMVar) 6 | import Control.Monad (forever) 7 | import Data.Aeson (FromJSON, ToJSON) 8 | import qualified Data.Text as T 9 | import GHC.Generics (Generic) 10 | import System.Hworker 11 | 12 | data PrintJob = PrintA | PrintB deriving (Generic, Show) 13 | data State = State (MVar Int) 14 | instance ToJSON PrintJob 15 | instance FromJSON PrintJob 16 | 17 | loopForever :: a 18 | loopForever = loopForever 19 | 20 | instance Job State PrintJob where 21 | job (State mvar) PrintA = 22 | do v <- takeMVar mvar 23 | if v == 0 24 | then do putMVar mvar 0 25 | putStrLn "A" >> return Success 26 | else do putMVar mvar (v - 1) 27 | error $ "Dying: " ++ show v 28 | 29 | job _ PrintB = putStrLn "B" >> return Success 30 | 31 | main = do mvar <- newMVar 3 32 | hworker <- create "printer" (State mvar) 33 | forkIO (worker hworker) 34 | forkIO (monitor hworker) 35 | forkIO (forever $ queue hworker PrintA >> threadDelay 1000000) 36 | forkIO (forever $ queue hworker PrintB >> threadDelay 500000) 37 | forever (threadDelay 1000000) 38 | -------------------------------------------------------------------------------- /hworker.cabal: -------------------------------------------------------------------------------- 1 | name: hworker 2 | version: 0.2.0 3 | synopsis: A reliable at-least-once job queue built on top of redis. 4 | description: See README. 
5 | homepage: http://github.com/positiondev/hworker 6 | license: ISC 7 | license-file: LICENSE 8 | author: Daniel Patterson 9 | maintainer: dbp@dbpmail.net 10 | build-type: Simple 11 | extra-source-files: README.md CHANGELOG.md 12 | cabal-version: >=1.10 13 | 14 | library 15 | exposed-modules: System.Hworker 16 | other-modules: Data.Aeson.Helpers 17 | build-depends: base >= 4.7 && < 5 18 | , aeson 19 | , hedis >= 0.6.5 20 | , text 21 | , bytestring 22 | , time >= 1.5 23 | , attoparsec 24 | , uuid >= 1.2.6 25 | hs-source-dirs: src 26 | default-language: Haskell2010 27 | ghc-options: -Wall 28 | 29 | Test-Suite hworker-test 30 | type: exitcode-stdio-1.0 31 | hs-source-dirs: src test 32 | main-is: Spec.hs 33 | other-modules: Data.Aeson.Helpers 34 | , System.Hworker 35 | build-depends: base >= 4.7 && < 5 36 | , aeson 37 | , hedis >= 0.6.5 38 | , text 39 | , bytestring 40 | , time >= 1.5 41 | , attoparsec 42 | , uuid >= 1.2.6 43 | , hspec >= 2 44 | , hspec-contrib 45 | , HUnit 46 | -------------------------------------------------------------------------------- /src/Data/Aeson/Helpers.hs: -------------------------------------------------------------------------------- 1 | module Data.Aeson.Helpers where 2 | 3 | import Data.Aeson 4 | import Data.Aeson.Parser (value) 5 | import Data.Attoparsec.Lazy (Parser) 6 | import qualified Data.Attoparsec.Lazy as L 7 | import qualified Data.ByteString.Lazy as L 8 | 9 | -- NOTE(dbp 2015-06-14): Taken from Data.Aeson.Parser.Internal 10 | decodeWith :: Parser Value -> (Value -> Result a) -> L.ByteString -> Maybe a 11 | decodeWith p to s = 12 | case L.parse p s of 13 | L.Done _ v -> case to v of 14 | Success a -> Just a 15 | _ -> Nothing 16 | _ -> Nothing 17 | {-# INLINE decodeWith #-} 18 | 19 | decodeValue :: FromJSON t => L.ByteString -> Maybe t 20 | decodeValue = decodeWith value fromJSON 21 | -------------------------------------------------------------------------------- /src/System/Hworker.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveGeneric #-} 2 | {-# LANGUAGE ExistentialQuantification #-} 3 | {-# LANGUAGE FunctionalDependencies #-} 4 | {-# LANGUAGE MultiParamTypeClasses #-} 5 | {-# LANGUAGE OverloadedStrings #-} 6 | {-# LANGUAGE RankNTypes #-} 7 | {-# LANGUAGE RecordWildCards #-} 8 | {-# LANGUAGE ScopedTypeVariables #-} 9 | 10 | {-| 11 | 12 | This module contains an at-least-once persistent job processing queue 13 | backed by Redis. It depends upon Redis not losing data once it has 14 | acknowledged it, and guaranteeing the atomicity that is specified for 15 | commands like EVAL (ie, that if you do several things within an EVAL, 16 | they will all happen or none will happen). Nothing has been tested 17 | with Redis clusters (and it likely will not work). 
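Since the queue is only as durable as Redis itself, if jobs must survive a Redis restart you will likely also want persistence enabled in your Redis deployment. An illustrative @redis.conf@ fragment (an assumption about your deployment, not something this library configures for you):

> appendonly yes
> appendfsync everysec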
18 | 19 | An example use is the following (see the repository for a 20 | slightly expanded version; also, the test cases in the repository are 21 | also good examples): 22 | 23 | 24 | > data PrintJob = Print deriving (Generic, Show) 25 | > data State = State (MVar Int) 26 | > instance ToJSON PrintJob 27 | > instance FromJSON PrintJob 28 | > 29 | > instance Job State PrintJob where 30 | > job (State mvar) Print = 31 | > do v <- takeMVar mvar 32 | > putMVar mvar (v + 1) 33 | > putStrLn $ "A(" ++ show v ++ ")" 34 | > return Success 35 | > 36 | > main = do mvar <- newMVar 0 37 | > hworker <- create "printer" (State mvar) 38 | > forkIO (worker hworker) 39 | > forkIO (monitor hworker) 40 | > forkIO (forever $ queue hworker Print >> threadDelay 1000000) 41 | > forever (threadDelay 1000000) 42 | 43 | 44 | -} 45 | 46 | module System.Hworker 47 | ( -- * Types 48 | Result(..) 49 | , Job(..) 50 | , Hworker 51 | , HworkerConfig(..) 52 | , ExceptionBehavior(..) 53 | , RedisConnection(..) 54 | , defaultHworkerConfig 55 | -- * Managing Workers 56 | , create 57 | , createWith 58 | , destroy 59 | , worker 60 | , monitor 61 | -- * Queuing Jobs 62 | , queue 63 | -- * Inspecting Workers 64 | , jobs 65 | , failed 66 | , broken 67 | -- * Debugging Utilities 68 | , debugger 69 | ) 70 | where 71 | 72 | import Control.Arrow (second) 73 | import Control.Concurrent (forkIO, threadDelay) 74 | import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar) 75 | import Control.Exception (SomeException, catch) 76 | import Control.Monad (forM, forever, void, when) 77 | import Data.Aeson (FromJSON, ToJSON) 78 | import qualified Data.Aeson as A 79 | import Data.Aeson.Helpers 80 | import Data.ByteString (ByteString) 81 | import qualified Data.ByteString.Char8 as B8 82 | import qualified Data.ByteString.Lazy as LB 83 | import Data.Either (isRight) 84 | import Data.Maybe (fromJust, mapMaybe) 85 | import Data.Monoid ((<>)) 86 | import Data.Text (Text) 87 | import qualified Data.Text as T 88 | import qualified Data.Text.Encoding as T 89 | import Data.Time.Calendar (Day (..)) 90 | import Data.Time.Clock (NominalDiffTime, UTCTime (..), 91 | diffUTCTime, getCurrentTime) 92 | import qualified Data.UUID as UUID 93 | import qualified Data.UUID.V4 as UUID 94 | import qualified Database.Redis as R 95 | import GHC.Generics (Generic) 96 | 97 | -- | Jobs can return 'Success', 'Retry' (with a message), or 'Failure' 98 | -- (with a message). Jobs that return 'Failure' are stored in the 99 | -- 'failed' queue and are not re-run. Jobs that return 'Retry' are re-run. 100 | data Result = Success 101 | | Retry Text 102 | | Failure Text 103 | deriving (Generic, Show) 104 | instance ToJSON Result 105 | instance FromJSON Result 106 | 107 | -- | Each Worker that you create will be responsible for one type of 108 | -- job, defined by a 'Job' instance. 109 | -- 110 | -- The job can do many different things (as the value can be a 111 | -- variant), but be careful not to break deserialization if you add 112 | -- new things it can do. 113 | -- 114 | -- The job will take some state (passed as the `s` parameter), which 115 | -- does not vary based on the job, and the actual job data 116 | -- structure. The data structure (the `t` parameter) will be stored 117 | -- and copied a few times in Redis while in the lifecycle, so 118 | -- generally it is a good idea for it to be relatively small (and have 119 | -- it be able to look up data that it needs while the job in running). 
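--
-- For instance (a hypothetical sketch - @UserId@, @AppState@,
-- @lookupUser@, and @sendWelcomeEmail@ are stand-ins for whatever your
-- application provides), queue a small identifier and load the heavy
-- data inside 'job':
--
-- > data EmailJob = SendWelcome UserId deriving (Generic, Show)
-- > -- (plus ToJSON/FromJSON instances for EmailJob, as in the example above)
-- >
-- > instance Job AppState EmailJob where
-- >   job st (SendWelcome uid) =
-- >     do user <- lookupUser st uid
-- >        sendWelcomeEmail user
-- >        return Success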
120 | -- 121 | -- Finally, while deriving FromJSON and ToJSON instances automatically 122 | -- might seem like a good idea, you will most likely be better off 123 | -- defining them manually, so you can make sure they are backwards 124 | -- compatible if you change them, as any jobs that can't be 125 | -- deserialized will not be run (and will end up in the 'broken' 126 | -- queue). This will only happen if the queue is non-empty when you 127 | -- replce the running application version, but this is obviously 128 | -- possible and could be likely depending on your use. 129 | class (FromJSON t, ToJSON t, Show t) => Job s t | s -> t where 130 | job :: s -> t -> IO Result 131 | 132 | data JobData t = JobData UTCTime t 133 | 134 | -- | What should happen when an unexpected exception is thrown in a 135 | -- job - it can be treated as either a 'Failure' (the default) or a 136 | -- 'Retry' (if you know the only exceptions are triggered by 137 | -- intermittent problems). 138 | data ExceptionBehavior = RetryOnException | FailOnException 139 | 140 | hwlog :: Show a => Hworker s t -> a -> IO () 141 | hwlog hw a = hworkerLogger hw (hworkerName hw, a) 142 | 143 | -- | The worker data type - it is parametrized be the worker 144 | -- state (the `s`) and the job type (the `t`). 145 | data Hworker s t = 146 | Hworker { hworkerName :: ByteString 147 | , hworkerState :: s 148 | , hworkerConnection :: R.Connection 149 | , hworkerExceptionBehavior :: ExceptionBehavior 150 | , hworkerLogger :: forall a. Show a => a -> IO () 151 | , hworkerJobTimeout :: NominalDiffTime 152 | , hworkerFailedQueueSize :: Int 153 | , hworkerDebug :: Bool 154 | } 155 | 156 | -- | When configuring a worker, you can tell it to use an existing 157 | -- redis connection pool (which you may have for the rest of your 158 | -- application). Otherwise, you can specify connection info. By 159 | -- default, hworker tries to connect to localhost, which may not be 160 | -- true for your production application. 161 | data RedisConnection = RedisConnectInfo R.ConnectInfo 162 | | RedisConnection R.Connection 163 | 164 | -- | The main configuration for workers. 165 | -- 166 | -- Each pool of workers should have a unique `hwconfigName`, as the 167 | -- queues are set up by that name, and if you have different types of 168 | -- data written in, they will likely be unable to be deserialized (and 169 | -- thus could end up in the 'broken' queue). 170 | -- 171 | -- The 'hwconfigLogger' defaults to writing to stdout, so you will 172 | -- likely want to replace that with something appropriate (like from a 173 | -- logging package). 174 | -- 175 | -- The `hwconfigTimeout` is really important. It determines the length 176 | -- of time after a job is started before the 'monitor' will decide 177 | -- that the job must have died and will restart it. If it is shorter 178 | -- than the length of time that a normal job takes to complete, the 179 | -- jobs _will_ be run multiple times. This is _semantically_ okay, as 180 | -- this is an at-least-once processor, but obviously won't be 181 | -- desirable. It defaults to 120 seconds. 182 | -- 183 | -- The 'hwconfigExceptionBehavior' controls what happens when an 184 | -- exception is thrown within a job. 185 | -- 186 | -- 'hwconfigFailedQueueSize' controls how many 'failed' jobs will be 187 | -- kept. It defaults to 1000. 
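--
-- For example (a sketch; @appState@ stands in for whatever state your
-- 'Job' instance needs), overriding a couple of the defaults:
--
-- > hw <- createWith (defaultHworkerConfig "email" appState)
-- >                    { hwconfigTimeout = 600
-- >                    , hwconfigFailedQueueSize = 5000
-- >                    }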
188 | data HworkerConfig s = 189 | HworkerConfig { 190 | hwconfigName :: Text 191 | , hwconfigState :: s 192 | , hwconfigRedisConnectInfo :: RedisConnection 193 | , hwconfigExceptionBehavior :: ExceptionBehavior 194 | , hwconfigLogger :: forall a. Show a => a -> IO () 195 | , hwconfigTimeout :: NominalDiffTime 196 | , hwconfigFailedQueueSize :: Int 197 | , hwconfigDebug :: Bool 198 | } 199 | 200 | -- | The default worker config - it needs a name and a state (as those 201 | -- will always be unique). 202 | defaultHworkerConfig :: Text -> s -> HworkerConfig s 203 | defaultHworkerConfig name state = 204 | HworkerConfig name 205 | state 206 | (RedisConnectInfo R.defaultConnectInfo) 207 | FailOnException 208 | print 209 | 120 210 | 1000 211 | False 212 | 213 | -- | Create a new worker with the default 'HworkerConfig'. 214 | -- 215 | -- Note that you must create at least one 'worker' and 'monitor' for 216 | -- the queue to actually process jobs (and for it to retry ones that 217 | -- time-out). 218 | create :: Job s t => Text -> s -> IO (Hworker s t) 219 | create name state = createWith (defaultHworkerConfig name state) 220 | 221 | -- | Create a new worker with a specified 'HworkerConfig'. 222 | -- 223 | -- Note that you must create at least one 'worker' and 'monitor' for 224 | -- the queue to actually process jobs (and for it to retry ones that 225 | -- time-out). 226 | createWith :: Job s t => HworkerConfig s -> IO (Hworker s t) 227 | createWith HworkerConfig{..} = 228 | do conn <- case hwconfigRedisConnectInfo of 229 | RedisConnectInfo c -> R.connect c 230 | RedisConnection c -> return c 231 | return $ Hworker (T.encodeUtf8 hwconfigName) 232 | hwconfigState 233 | conn 234 | hwconfigExceptionBehavior 235 | hwconfigLogger 236 | hwconfigTimeout 237 | hwconfigFailedQueueSize 238 | hwconfigDebug 239 | 240 | -- | Destroy a worker. This will delete all the queues, clearing out 241 | -- all existing 'jobs', the 'broken' and 'failed' queues. There is no need 242 | -- to do this in normal applications (and most likely, you won't want to). 243 | destroy :: Job s t => Hworker s t -> IO () 244 | destroy hw = void $ R.runRedis (hworkerConnection hw) $ 245 | R.del [ jobQueue hw 246 | , progressQueue hw 247 | , brokenQueue hw 248 | , failedQueue hw 249 | ] 250 | 251 | jobQueue :: Hworker s t -> ByteString 252 | jobQueue hw = "hworker-jobs-" <> hworkerName hw 253 | 254 | progressQueue :: Hworker s t -> ByteString 255 | progressQueue hw = "hworker-progress-" <> hworkerName hw 256 | 257 | brokenQueue :: Hworker s t -> ByteString 258 | brokenQueue hw = "hworker-broken-" <> hworkerName hw 259 | 260 | failedQueue :: Hworker s t -> ByteString 261 | failedQueue hw = "hworker-failed-" <> hworkerName hw 262 | 263 | -- | Adds a job to the queue. Returns whether the operation succeeded. 264 | queue :: Job s t => Hworker s t -> t -> IO Bool 265 | queue hw j = 266 | do job_id <- UUID.toString <$> UUID.nextRandom 267 | isRight <$> R.runRedis (hworkerConnection hw) 268 | (R.lpush (jobQueue hw) [LB.toStrict $ A.encode (job_id, j)]) 269 | 270 | -- | Creates a new worker thread. This is blocking, so you will want to 271 | -- 'forkIO' this into a thread. You can have any number of these (and 272 | -- on any number of servers); the more there are, the faster jobs will 273 | -- be processed. 
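--
-- For example (a sketch), to run four workers in a single process,
-- using 'Control.Monad.replicateM_' and 'Control.Concurrent.forkIO':
--
-- > replicateM_ 4 (forkIO (worker hw))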
274 | worker :: Job s t => Hworker s t -> IO () 275 | worker hw = 276 | do now <- getCurrentTime 277 | r <- R.runRedis (hworkerConnection hw) $ 278 | R.eval "local job = redis.call('rpop',KEYS[1])\n\ 279 | \if job ~= nil then\n\ 280 | \ redis.call('hset', KEYS[2], tostring(job), ARGV[1])\n\ 281 | \ return job\n\ 282 | \else\n\ 283 | \ return nil\n\ 284 | \end" 285 | [jobQueue hw, progressQueue hw] 286 | [LB.toStrict $ A.encode now] 287 | case r of 288 | Left err -> hwlog hw err >> delayAndRun 289 | Right Nothing -> delayAndRun 290 | Right (Just t) -> 291 | do when (hworkerDebug hw) $ hwlog hw ("WORKER RUNNING", t) 292 | case decodeValue (LB.fromStrict t) of 293 | Nothing -> do hwlog hw ("BROKEN JOB", t) 294 | now <- getCurrentTime 295 | withNil hw (R.eval "local del = redis.call('hdel', KEYS[1], ARGV[1])\n\ 296 | \if del == 1 then\n\ 297 | \ redis.call('hset', KEYS[2], ARGV[1], ARGV[2])\n\ 298 | \end\n\ 299 | \return nil" 300 | [progressQueue hw, brokenQueue hw] 301 | [t, LB.toStrict $ A.encode now]) 302 | delayAndRun 303 | Just (_ :: String, j) -> do 304 | result <- runJob (job (hworkerState hw) j) 305 | case result of 306 | Success -> 307 | do when (hworkerDebug hw) $ hwlog hw ("JOB COMPLETE", t) 308 | delete_res <- R.runRedis (hworkerConnection hw) 309 | (R.hdel (progressQueue hw) [t]) 310 | case delete_res of 311 | Left err -> hwlog hw err >> delayAndRun 312 | Right 1 -> justRun 313 | Right n -> do hwlog hw ("Job done: did not delete 1, deleted " <> show n) 314 | delayAndRun 315 | Retry msg -> 316 | do hwlog hw ("Retry: " <> msg) 317 | withNil hw 318 | (R.eval "local del = redis.call('hdel', KEYS[1], ARGV[1])\n\ 319 | \if del == 1 then\n\ 320 | \ redis.call('lpush', KEYS[2], ARGV[1])\n\ 321 | \end\n\ 322 | \return nil" 323 | [progressQueue hw, jobQueue hw] 324 | [t]) 325 | delayAndRun 326 | Failure msg -> 327 | do hwlog hw ("Failure: " <> msg) 328 | withNil hw 329 | (R.eval "local del = redis.call('hdel', KEYS[1], ARGV[1])\n\ 330 | \if del == 1 then\n\ 331 | \ redis.call('lpush', KEYS[2], ARGV[1])\n\ 332 | \ redis.call('ltrim', KEYS[2], 0, ARGV[2])\n\ 333 | \end\n\ 334 | \return nil" 335 | [progressQueue hw, failedQueue hw] 336 | [t, B8.pack (show (hworkerFailedQueueSize hw - 1))]) 337 | void $ R.runRedis (hworkerConnection hw) 338 | (R.hdel (progressQueue hw) [t]) 339 | delayAndRun 340 | where delayAndRun = threadDelay 10000 >> worker hw 341 | justRun = worker hw 342 | runJob v = 343 | do x <- newEmptyMVar 344 | jt <- forkIO (catch (v >>= putMVar x . Right) 345 | (\(e::SomeException) -> 346 | putMVar x (Left e))) 347 | res <- takeMVar x 348 | case res of 349 | Left e -> 350 | let b = case hworkerExceptionBehavior hw of 351 | RetryOnException -> Retry 352 | FailOnException -> Failure in 353 | return (b ("Exception raised: " <> (T.pack . show) e)) 354 | Right r -> return r 355 | 356 | 357 | -- | Start a monitor. Like 'worker', this is blocking, so should be 358 | -- started in a thread. This is responsible for retrying jobs that 359 | -- time out (which can happen if the processing thread is killed, for 360 | -- example). You need to have at least one of these running to have 361 | -- the retry happen, but it is safe to have any number running. 
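--
-- For example (a sketch), started alongside the workers:
--
-- > _ <- forkIO (worker hw)
-- > _ <- forkIO (monitor hw)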
362 | monitor :: Job s t => Hworker s t -> IO () 363 | monitor hw = 364 | forever $ 365 | do now <- getCurrentTime 366 | withList hw (R.hkeys (progressQueue hw)) 367 | (\jobs -> 368 | void $ forM jobs $ \job -> 369 | withMaybe hw (R.hget (progressQueue hw) job) 370 | (\start -> 371 | when (diffUTCTime now (fromJust $ decodeValue (LB.fromStrict start)) > hworkerJobTimeout hw) $ 372 | do n <- 373 | withInt hw 374 | (R.eval "local del = redis.call('hdel', KEYS[2], ARGV[1])\n\ 375 | \if del == 1 then\ 376 | \ redis.call('rpush', KEYS[1], ARGV[1])\n\ \end\n\ 377 | \return del" 378 | [jobQueue hw, progressQueue hw] 379 | [job]) 380 | when (hworkerDebug hw) $ hwlog hw ("MONITOR RV", n) 381 | when (hworkerDebug hw && n == 1) $ hwlog hw ("MONITOR REQUEUED", job))) 382 | -- NOTE(dbp 2015-07-25): We check every 1/10th of timeout. 383 | threadDelay (floor $ 100000 * hworkerJobTimeout hw) 384 | 385 | -- | Returns the jobs that could not be deserialized, most likely 386 | -- because you changed the 'ToJSON'/'FromJSON' instances for you job 387 | -- in a way that resulted in old jobs not being able to be converted 388 | -- back from json. Another reason for jobs to end up here (and much 389 | -- worse) is if you point two instances of 'Hworker', with different 390 | -- job types, at the same queue (ie, you re-use the name). Then 391 | -- anytime a worker from one queue gets a job from the other it would 392 | -- think it is broken. 393 | broken :: Hworker s t -> IO [(ByteString, UTCTime)] 394 | broken hw = do r <- R.runRedis (hworkerConnection hw) (R.hgetall (brokenQueue hw)) 395 | case r of 396 | Left err -> hwlog hw err >> return [] 397 | Right xs -> return (map (second parseTime) xs) 398 | where parseTime = fromJust . decodeValue . LB.fromStrict 399 | 400 | jobsFromQueue :: Job s t => Hworker s t -> ByteString -> IO [t] 401 | jobsFromQueue hw queue = 402 | do r <- R.runRedis (hworkerConnection hw) (R.lrange queue 0 (-1)) 403 | case r of 404 | Left err -> hwlog hw err >> return [] 405 | Right [] -> return [] 406 | Right xs -> return $ mapMaybe (fmap (\(_::String, x) -> x) . decodeValue . LB.fromStrict) xs 407 | 408 | -- | Returns all pending jobs. 409 | jobs :: Job s t => Hworker s t -> IO [t] 410 | jobs hw = jobsFromQueue hw (jobQueue hw) 411 | 412 | -- | Returns all failed jobs. This is capped at the most recent 413 | -- 'hworkerconfigFailedQueueSize' jobs that returned 'Failure' (or 414 | -- threw an exception when 'hworkerconfigExceptionBehavior' is 415 | -- 'FailOnException'). 416 | failed :: Job s t => Hworker s t -> IO [t] 417 | failed hw = jobsFromQueue hw (failedQueue hw) 418 | 419 | -- | Logs the contents of the jobqueue and the inprogress queue at 420 | -- `microseconds` intervals. 
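--
-- For example (a sketch), to log the queue contents once a second:
--
-- > _ <- forkIO (debugger 1000000 hw)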
421 | debugger :: Job s t => Int -> Hworker s t -> IO () 422 | debugger microseconds hw = 423 | forever $ 424 | do withList hw (R.hkeys (progressQueue hw)) 425 | (\running -> 426 | withList hw (R.lrange (jobQueue hw) 0 (-1)) 427 | (\queued -> hwlog hw ("DEBUG", queued, running))) 428 | threadDelay microseconds 429 | 430 | -- Redis helpers follow 431 | withList hw a f = 432 | do r <- R.runRedis (hworkerConnection hw) a 433 | case r of 434 | Left err -> hwlog hw err 435 | Right [] -> return () 436 | Right xs -> f xs 437 | 438 | withMaybe hw a f = 439 | do r <- R.runRedis (hworkerConnection hw) a 440 | case r of 441 | Left err -> hwlog hw err 442 | Right Nothing -> return () 443 | Right (Just v) -> f v 444 | 445 | withNil hw a = 446 | do r <- R.runRedis (hworkerConnection hw) a 447 | case r of 448 | Left err -> hwlog hw err 449 | Right (Just ("" :: ByteString)) -> return () 450 | Right _ -> return () 451 | 452 | withInt :: Hworker s t -> R.Redis (Either R.Reply Integer) -> IO Integer 453 | withInt hw a = 454 | do r <- R.runRedis (hworkerConnection hw) a 455 | case r of 456 | Left err -> hwlog hw err >> return (-1) 457 | Right n -> return n 458 | 459 | withIgnore :: Hworker s t -> R.Redis (Either R.Reply a) -> IO () 460 | withIgnore hw a = 461 | do r <- R.runRedis (hworkerConnection hw) a 462 | case r of 463 | Left err -> hwlog hw err 464 | Right _ -> return () 465 | -------------------------------------------------------------------------------- /stack.yaml: -------------------------------------------------------------------------------- 1 | flags: {} 2 | packages: 3 | - '.' 4 | - 'example' 5 | extra-deps: [] 6 | resolver: lts-3.1 7 | -------------------------------------------------------------------------------- /test/Spec.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveGeneric #-} 2 | {-# LANGUAGE MultiParamTypeClasses #-} 3 | {-# LANGUAGE OverloadedStrings #-} 4 | import Control.Concurrent (forkIO, killThread, threadDelay) 5 | import Control.Concurrent.MVar (MVar, modifyMVarMasked_, newMVar, 6 | readMVar, takeMVar) 7 | import Control.Monad (replicateM_) 8 | import Data.Aeson (FromJSON, ToJSON) 9 | import qualified Data.Text as T 10 | import GHC.Generics (Generic) 11 | import System.Hworker 12 | import System.IO 13 | 14 | import Test.Hspec 15 | import Test.Hspec.Contrib.HUnit 16 | import Test.HUnit 17 | 18 | data SimpleJob = SimpleJob deriving (Generic, Show, Eq) 19 | data SimpleState = SimpleState { unSimpleState :: MVar Int } 20 | instance ToJSON SimpleJob 21 | instance FromJSON SimpleJob 22 | instance Job SimpleState SimpleJob where 23 | job (SimpleState mvar) SimpleJob = 24 | do modifyMVarMasked_ mvar (return . (+1)) 25 | return Success 26 | 27 | data ExJob = ExJob deriving (Generic, Show) 28 | data ExState = ExState { unExState :: MVar Int } 29 | instance ToJSON ExJob 30 | instance FromJSON ExJob 31 | instance Job ExState ExJob where 32 | job (ExState mvar) ExJob = 33 | do modifyMVarMasked_ mvar (return . (+1)) 34 | v <- readMVar mvar 35 | if v > 1 36 | then return Success 37 | else error "ExJob: failing badly!" 38 | 39 | data RetryJob = RetryJob deriving (Generic, Show) 40 | data RetryState = RetryState { unRetryState :: MVar Int } 41 | instance ToJSON RetryJob 42 | instance FromJSON RetryJob 43 | instance Job RetryState RetryJob where 44 | job (RetryState mvar) RetryJob = 45 | do modifyMVarMasked_ mvar (return . 
(+1)) 46 | v <- readMVar mvar 47 | if v > 1 48 | then return Success 49 | else return (Retry "RetryJob retries") 50 | 51 | data FailJob = FailJob deriving (Eq, Generic, Show) 52 | data FailState = FailState { unFailState :: MVar Int } 53 | instance ToJSON FailJob 54 | instance FromJSON FailJob 55 | instance Job FailState FailJob where 56 | job (FailState mvar) FailJob = 57 | do modifyMVarMasked_ mvar (return . (+1)) 58 | v <- readMVar mvar 59 | if v > 1 60 | then return Success 61 | else return (Failure "FailJob fails") 62 | 63 | data AlwaysFailJob = AlwaysFailJob deriving (Eq, Generic, Show) 64 | data AlwaysFailState = AlwaysFailState { unAlwaysFailState :: MVar Int } 65 | instance ToJSON AlwaysFailJob 66 | instance FromJSON AlwaysFailJob 67 | instance Job AlwaysFailState AlwaysFailJob where 68 | job (AlwaysFailState mvar) AlwaysFailJob = 69 | do modifyMVarMasked_ mvar (return . (+1)) 70 | return (Failure "AlwaysFailJob fails") 71 | 72 | data TimedJob = TimedJob Int deriving (Generic, Show, Eq) 73 | data TimedState = TimedState { unTimedState :: MVar Int } 74 | instance ToJSON TimedJob 75 | instance FromJSON TimedJob 76 | instance Job TimedState TimedJob where 77 | job (TimedState mvar) (TimedJob delay) = 78 | do threadDelay delay 79 | modifyMVarMasked_ mvar (return . (+1)) 80 | return Success 81 | 82 | data BigJob = BigJob T.Text deriving (Generic, Show, Eq) 83 | data BigState = BigState { unBigState :: MVar Int } 84 | instance ToJSON BigJob 85 | instance FromJSON BigJob 86 | instance Job BigState BigJob where 87 | job (BigState mvar) (BigJob _) = 88 | do modifyMVarMasked_ mvar (return . (+1)) 89 | return Success 90 | 91 | nullLogger :: Show a => a -> IO () 92 | nullLogger = const (return ()) 93 | 94 | print' :: Show a => a -> IO () 95 | print' a = do print a 96 | hFlush stdout 97 | 98 | conf n s = (defaultHworkerConfig n s) { 99 | hwconfigLogger = nullLogger 100 | , hwconfigExceptionBehavior = FailOnException 101 | , hwconfigTimeout = 4 102 | } 103 | 104 | main :: IO () 105 | main = hspec $ 106 | do describe "Simple" $ 107 | do it "should run and increment counter" $ 108 | do mvar <- newMVar 0 109 | hworker <- createWith (conf "simpleworker-1" 110 | (SimpleState mvar)) 111 | wthread <- forkIO (worker hworker) 112 | queue hworker SimpleJob 113 | threadDelay 30000 114 | killThread wthread 115 | destroy hworker 116 | v <- takeMVar mvar 117 | assertEqual "State should be 1 after job runs" 1 v 118 | it "queueing 2 jobs should increment twice" $ 119 | do mvar <- newMVar 0 120 | hworker <- createWith (conf "simpleworker-2" 121 | (SimpleState mvar)) 122 | wthread <- forkIO (worker hworker) 123 | queue hworker SimpleJob 124 | queue hworker SimpleJob 125 | threadDelay 40000 126 | killThread wthread 127 | destroy hworker 128 | v <- takeMVar mvar 129 | assertEqual "State should be 2 after 2 jobs run" 2 v 130 | it "queueing 1000 jobs should increment 1000" $ 131 | do mvar <- newMVar 0 132 | hworker <- createWith (conf "simpleworker-3" 133 | (SimpleState mvar)) 134 | wthread <- forkIO (worker hworker) 135 | replicateM_ 1000 (queue hworker SimpleJob) 136 | threadDelay 2000000 137 | killThread wthread 138 | destroy hworker 139 | v <- takeMVar mvar 140 | assertEqual "State should be 1000 after 1000 job runs" 1000 v 141 | it "should work with multiple workers" $ 142 | -- NOTE(dbp 2015-07-12): This probably won't run faster, because 143 | -- they are all blocking on the MVar, but that's not the point. 
144 | do mvar <- newMVar 0 145 | hworker <- createWith (conf "simpleworker-4" 146 | (SimpleState mvar)) 147 | wthread1 <- forkIO (worker hworker) 148 | wthread2 <- forkIO (worker hworker) 149 | wthread3 <- forkIO (worker hworker) 150 | wthread4 <- forkIO (worker hworker) 151 | replicateM_ 1000 (queue hworker SimpleJob) 152 | threadDelay 1000000 153 | killThread wthread1 154 | killThread wthread2 155 | killThread wthread3 156 | killThread wthread4 157 | destroy hworker 158 | v <- takeMVar mvar 159 | assertEqual "State should be 1000 after 1000 job runs" 1000 v 160 | 161 | describe "Exceptions" $ 162 | do it "should be able to have exceptions thrown in jobs and retry the job" $ 163 | do mvar <- newMVar 0 164 | hworker <- createWith (conf "exworker-1" 165 | (ExState mvar)) { 166 | hwconfigExceptionBehavior = 167 | RetryOnException 168 | } 169 | wthread <- forkIO (worker hworker) 170 | queue hworker ExJob 171 | threadDelay 40000 172 | killThread wthread 173 | destroy hworker 174 | v <- takeMVar mvar 175 | assertEqual "State should be 2, since the first run failed" 2 v 176 | it "should not retry if mode is FailOnException" $ 177 | do mvar <- newMVar 0 178 | hworker <- createWith (conf "exworker-2" 179 | (ExState mvar)) 180 | wthread <- forkIO (worker hworker) 181 | queue hworker ExJob 182 | threadDelay 30000 183 | killThread wthread 184 | destroy hworker 185 | v <- takeMVar mvar 186 | assertEqual "State should be 1, since failing run wasn't retried" 1 v 187 | 188 | describe "Retry" $ 189 | do it "should be able to return Retry and get run again" $ 190 | do mvar <- newMVar 0 191 | hworker <- createWith (conf "retryworker-1" 192 | (RetryState mvar)) 193 | wthread <- forkIO (worker hworker) 194 | queue hworker RetryJob 195 | threadDelay 50000 196 | destroy hworker 197 | v <- takeMVar mvar 198 | assertEqual "State should be 2, since it got retried" 2 v 199 | 200 | describe "Fail" $ 201 | do it "should not retry a job that Fails" $ 202 | do mvar <- newMVar 0 203 | hworker <- createWith (conf "failworker-1" 204 | (FailState mvar)) 205 | wthread <- forkIO (worker hworker) 206 | queue hworker FailJob 207 | threadDelay 30000 208 | destroy hworker 209 | v <- takeMVar mvar 210 | assertEqual "State should be 1, since failing run wasn't retried" 1 v 211 | it "should put a failed job into the failed queue" $ 212 | do mvar <- newMVar 0 213 | hworker <- createWith (conf "failworker-2" 214 | (FailState mvar)) 215 | wthread <- forkIO (worker hworker) 216 | queue hworker FailJob 217 | threadDelay 30000 218 | jobs <- failed hworker 219 | destroy hworker 220 | assertEqual "Should have failed job" [FailJob] jobs 221 | it "should only store failedQueueSize failed jobs" $ 222 | do mvar <- newMVar 0 223 | hworker <- createWith (conf "failworker-3" 224 | (AlwaysFailState mvar)) { 225 | hwconfigFailedQueueSize = 2 226 | } 227 | wthread <- forkIO (worker hworker) 228 | queue hworker AlwaysFailJob 229 | queue hworker AlwaysFailJob 230 | queue hworker AlwaysFailJob 231 | queue hworker AlwaysFailJob 232 | threadDelay 100000 233 | jobs <- failed hworker 234 | destroy hworker 235 | v <- takeMVar mvar 236 | assertEqual "State should be 4, since all jobs were run" 4 v 237 | assertEqual "Should only have stored 2" 238 | [AlwaysFailJob,AlwaysFailJob] jobs 239 | describe "Monitor" $ 240 | do it "should add job back after timeout" $ 241 | -- NOTE(dbp 2015-07-12): The timing on this test is somewhat 242 | -- tricky. 
We want to get the job started with one worker, 243 | -- then kill the worker, then start a new worker, and have 244 | -- the monitor put the job back in the queue and have the 245 | -- second worker finish it. It's important that the job 246 | -- takes less time to complete than the timeout for the 247 | -- monitor, or else it'll queue it forever. 248 | -- 249 | -- The timeout is 5 seconds. The job takes 1 seconds to run. 250 | -- The worker is killed after 0.5 seconds, which should be 251 | -- plenty of time for it to have started the job. Then after 252 | -- the second worker is started, we wait 10 seconds, which 253 | -- should be plenty; we expect the total run to take around 11. 254 | do mvar <- newMVar 0 255 | hworker <- createWith (conf "timedworker-1" 256 | (TimedState mvar)) { 257 | hwconfigTimeout = 5 258 | } 259 | wthread1 <- forkIO (worker hworker) 260 | mthread <- forkIO (monitor hworker) 261 | queue hworker (TimedJob 1000000) 262 | threadDelay 500000 263 | killThread wthread1 264 | wthread2 <- forkIO (worker hworker) 265 | threadDelay 10000000 266 | destroy hworker 267 | v <- takeMVar mvar 268 | assertEqual "State should be 2, since monitor thinks it failed" 2 v 269 | it "should add back multiple jobs after timeout" $ 270 | -- NOTE(dbp 2015-07-23): Similar to the above test, but we 271 | -- have multiple jobs started, multiple workers killed. 272 | -- then one worker will finish both interrupted jobs. 273 | do mvar <- newMVar 0 274 | hworker <- createWith (conf "timedworker-2" 275 | (TimedState mvar)) { 276 | hwconfigTimeout = 5 277 | } 278 | wthread1 <- forkIO (worker hworker) 279 | wthread2 <- forkIO (worker hworker) 280 | mthread <- forkIO (monitor hworker) 281 | queue hworker (TimedJob 1000000) 282 | queue hworker (TimedJob 1000000) 283 | threadDelay 500000 284 | killThread wthread1 285 | killThread wthread2 286 | wthread3 <- forkIO (worker hworker) 287 | threadDelay 10000000 288 | destroy hworker 289 | v <- takeMVar mvar 290 | assertEqual "State should be 4, since monitor thinks first 2 failed" 4 v 291 | it "should work with multiple monitors" $ 292 | do mvar <- newMVar 0 293 | hworker <- createWith (conf "timedworker-3" 294 | (TimedState mvar)) { 295 | hwconfigTimeout = 5 296 | } 297 | wthread1 <- forkIO (worker hworker) 298 | wthread2 <- forkIO (worker hworker) 299 | -- NOTE(dbp 2015-07-24): This might seem silly, but it 300 | -- was actually sufficient to expose a race condition. 301 | mthread1 <- forkIO (monitor hworker) 302 | mthread2 <- forkIO (monitor hworker) 303 | mthread3 <- forkIO (monitor hworker) 304 | mthread4 <- forkIO (monitor hworker) 305 | mthread5 <- forkIO (monitor hworker) 306 | mthread6 <- forkIO (monitor hworker) 307 | queue hworker (TimedJob 1000000) 308 | queue hworker (TimedJob 1000000) 309 | threadDelay 500000 310 | killThread wthread1 311 | killThread wthread2 312 | wthread3 <- forkIO (worker hworker) 313 | threadDelay 30000000 314 | destroy hworker 315 | v <- takeMVar mvar 316 | assertEqual "State should be 4, since monitor thinks first 2 failed" 4 v 317 | -- NOTE(dbp 2015-07-24): It would be really great to have a 318 | -- test that went after a race between the retry logic and 319 | -- the monitors (ie, assume that the job completed with 320 | -- Retry, and it happened to complete right at the timeout 321 | -- period). I'm not sure if I could get that sort of 322 | -- precision without adding other delay mechanisms, or 323 | -- something to make it more deterministic. 
324 | describe "Broken jobs" $ 325 | it "should store broken jobs" $ 326 | do -- NOTE(dbp 2015-08-09): The more common way this could 327 | -- happen is that you change your serialization format. But 328 | -- we can abuse this by creating two different workers 329 | -- pointing to the same queue, and submit jobs in one, try 330 | -- to run them in another, where the types are different. 331 | mvar <- newMVar 0 332 | hworker1 <- createWith (conf "broken-1" 333 | (TimedState mvar)) { 334 | hwconfigTimeout = 5 335 | } 336 | hworker2 <- createWith (conf "broken-1" 337 | (SimpleState mvar)) { 338 | hwconfigTimeout = 5 339 | } 340 | wthread <- forkIO (worker hworker1) 341 | queue hworker2 SimpleJob 342 | threadDelay 100000 343 | jobs <- broken hworker2 344 | killThread wthread 345 | destroy hworker1 346 | v <- takeMVar mvar 347 | assertEqual "State should be 0, as nothing should have happened" 0 v 348 | assertEqual "Should be one broken job, as serialization is wrong" 1 (length jobs) 349 | describe "Dump jobs" $ do 350 | it "should return the job that was queued" $ 351 | do mvar <- newMVar 0 352 | hworker <- createWith (conf "dump-1" 353 | (SimpleState mvar)) { 354 | hwconfigTimeout = 5 355 | } 356 | queue hworker SimpleJob 357 | res <- jobs hworker 358 | destroy hworker 359 | assertEqual "Should be [SimpleJob]" [SimpleJob] res 360 | it "should return jobs in order (most recently added at front; worker pulls from back)" $ 361 | do mvar <- newMVar 0 362 | hworker <- createWith (conf "dump-2" 363 | (TimedState mvar)) { 364 | hwconfigTimeout = 5 365 | } 366 | queue hworker (TimedJob 1) 367 | queue hworker (TimedJob 2) 368 | res <- jobs hworker 369 | destroy hworker 370 | assertEqual "Should by [TimedJob 2, TimedJob 1]" [TimedJob 2, TimedJob 1] res 371 | describe "Large jobs" $ do 372 | it "should be able to deal with lots of large jobs" $ 373 | do mvar <- newMVar 0 374 | hworker <- createWith (conf "big-1" 375 | (BigState mvar)) 376 | wthread1 <- forkIO (worker hworker) 377 | wthread2 <- forkIO (worker hworker) 378 | wthread3 <- forkIO (worker hworker) 379 | wthread4 <- forkIO (worker hworker) 380 | let content = T.intercalate "\n" (take 1000 (repeat "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")) 381 | replicateM_ 5000 (queue hworker (BigJob content)) 382 | threadDelay 10000000 383 | killThread wthread1 384 | killThread wthread2 385 | killThread wthread3 386 | killThread wthread4 387 | destroy hworker 388 | v <- takeMVar mvar 389 | assertEqual "Should have processed 5000" 5000 v 390 | --------------------------------------------------------------------------------