├── .gitignore ├── CHANGELOG.md ├── LICENSE ├── README.md ├── Setup.hs ├── circle.yml ├── example ├── LICENSE ├── hworker-example.cabal └── src │ └── Main.hs ├── hworker.cabal ├── src ├── Data │ └── Aeson │ │ └── Helpers.hs └── System │ └── Hworker.hs ├── stack.yaml └── test └── Spec.hs /.gitignore: -------------------------------------------------------------------------------- 1 | .cabal-sandbox 2 | .stack-work 3 | cabal.sandbox.config 4 | dist-stack -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | * 0.2.0 Remeike Forbes 2 | 3 | Coerce jobs to string (to work with Redis 5) 4 | 5 | * 0.1.0.1 Daniel Pattersion 2015-11-1 6 | 7 | Add example in haddocks. 8 | 9 | * 0.1.0.0 Daniel Patterson 2015-10-25 10 | 11 | Initial release. 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015 Daniel Patterson 2 | 3 | Permission to use, copy, modify, and/or distribute this software for any purpose 4 | with or without fee is hereby granted, provided that the above copyright notice 5 | and this permission notice appear in all copies. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH 8 | REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND 9 | FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, 10 | INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS 11 | OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 12 | TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF 13 | THIS SOFTWARE. 14 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## About 2 | 3 | `hworker` is a Redis-backed persistent at-least-once queue library. It 4 | is vaguely inspired by `sidekiq` for Ruby. It is intended to be a 5 | simple reliable mechanism for processing background tasks. The jobs 6 | can be created by a Haskell application or any application that can 7 | push JSON data structures of the right shape into a Redis queue. The 8 | application that processes the jobs need not be the same one as the 9 | application that creates them (they just need to be able to talk to 10 | the same Redis server, and use the same serialization to/from JSON). 11 | 12 | ## Stability 13 | 14 | This has been running in one application sending email (using 15 | `hworker-ses`) for several months. This is relatively low traffic 16 | (transactional messages) most of the time, with spikes of 10k-30k 17 | messages (mailing blasts). 18 | 19 | ## Important Note 20 | 21 | The expiration of jobs is really important. It defaults to 120 22 | seconds, which may be short depending on your application (for things 23 | like sending emails, it may be fine). **The reason why this timeout is 24 | important is that if a job ever runs longer than this, the monitor 25 | will think that the job failed** in some inexplicable way (like the 26 | server running the job died) and will add the job back to the queue to 27 | be run. 
Under the semantics of this job processor, a job running 28 | multiple times is not a failure case, but it's obviously not something 29 | you _want_ to happen, so be sure to set the timeout to something 30 | reasonable for your application. 31 | 32 | ## Overview 33 | 34 | To define jobs, you write a serializable representation of the job and 35 | a function that runs the job, which returns a status. The behavior of 36 | uncaught exceptions is defined when you create the worker - it can be 37 | either `Failure` or `Retry`. Jobs that return `Failure` are removed 38 | from the queue, whereas jobs that return `Retry` are added again. The 39 | only difference between a `Success` and a `Failure` is that a 40 | `Failure` carries a message that is logged (i.e., neither is run again). 41 | 42 | ## Example 43 | 44 | See the `example` directory in the repository. 45 | 46 | ## Semantics 47 | 48 | The behavior of this queue processor is at-least-once. 49 | 50 | We rely on the defined behavior of Redis for reliability. Once a job 51 | has been `queue`d, it is guaranteed to be run eventually, provided 52 | some worker and monitor threads exist. If the worker thread that was 53 | running a given job dies, the job will eventually be retried (if you 54 | do not want this behavior, do not start any monitor threads). Once the 55 | job completes, provided nothing kills the worker thread in the 56 | intervening time, jobs that return `Success` will not be run again, 57 | jobs that return `Failure` will have their messages logged and will 58 | not be run again, and jobs that return `Retry` will be queued 59 | again. If something kills the worker thread before these 60 | acknowledgements go through, the job will be retried. Exceptions 61 | triggered within the job cannot affect the worker thread - what they 62 | do to the job is defined at startup (they can cause either a `Failure` 63 | or a `Retry`). 64 | 65 | Any deviations from this behavior are considered bugs that will be fixed. 66 | 67 | 68 | ## Redis Operations 69 | 70 | Under the hood, we will have the following data structures in Redis 71 | (`name` is set when you create the `hworker` instance): 72 | 73 | `hworker-jobs-name`: a list of JSON-serialized job descriptions 74 | 75 | `hworker-progress-name`: a hash of jobs that are in progress, mapping to the time they were started 76 | 77 | `hworker-broken-name`: a hash of jobs that couldn't be deserialized, mapping to the time they were encountered; this most likely means you changed the serialization format with jobs still in the queue, _or_ you pointed different applications at the same queues. 78 | 79 | `hworker-failed-name`: a list of the jobs that failed (limited in size based on config). 80 | 81 | In the following pseudo-code, I'm using `MULTI`...`EXEC` to indicate 82 | atomic blocks of code. These are actually implemented with Lua and 83 | `EVAL`, but I think it's easier to read this way. If you want to see 84 | what's actually happening, just read the code - it's not very long!
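Because these are ordinary Redis structures, you can also inspect them - or even enqueue work from outside Haskell - with `redis-cli`. The commands below are an illustrative sketch that assumes an hworker instance created with the name `printer`; the exact JSON for a job depends on your `ToJSON` instance, but the payload is always a two-element array of a (UUID) job id string and the serialized job.

```
# look at pending, in-progress, and broken jobs
redis-cli LRANGE hworker-jobs-printer 0 -1
redis-cli HGETALL hworker-progress-printer
redis-cli HGETALL hworker-broken-printer

# enqueue a job from outside Haskell (illustrative UUID)
redis-cli LPUSH hworker-jobs-printer '["3f8d6c2a-5e71-4a0b-9c3d-2f61c0e4b8aa","PrintA"]'
```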
85 | 86 | When a worker wants to do work, the following happens: 87 | 88 | ``` 89 | now = TIME 90 | MULTI 91 | v = RPOP hworker-jobs-name 92 | if v 93 | HSET hworker-progress-name v now 94 | EXEC 95 | v 96 | ``` 97 | 98 | When it completes the job, it does the following: 99 | 100 | ``` 101 | v = JOB 102 | HDEL hworker-progress-name v 103 | ``` 104 | 105 | If the job returned `Retry`, the following occurs: 106 | 107 | ``` 108 | v = JOB 109 | MULTI 110 | LPUSH hworker-jobs-name v 111 | HDEL hworker-progress-name v 112 | EXEC 113 | ``` 114 | 115 | 116 | A monitor runs on another thread that will re-run jobs that stay in 117 | progress for too long (as that indicates that something unknown went 118 | wrong). The operation that it runs periodically is: 119 | 120 | ``` 121 | keys = HKEYS (or HSCAN) hworker-progress-name 122 | for keys as v: 123 | started = HGET hworker-progress-name v 124 | if started < TIME - timeout 125 | MULTI 126 | RPUSH hworker-jobs-name v 127 | HDEL hworker-progress-name v 128 | EXEC 129 | ``` 130 | 131 | Note that what the monitor does and what `Retry` does differ slightly: 132 | the monitor puts jobs on the front of the queue (so they will be picked up next), whereas `Retry` puts 133 | them on the back. 134 | 135 | ## Primary Libraries Used 136 | 137 | - hedis 138 | - aeson 139 | 140 | ## Contributors 141 | 142 | - Daniel Patterson (@dbp - dbp@dbpmail.net) 143 | 144 | ## Build Status 145 | 146 | [![Circle CI](https://circleci.com/gh/dbp/hworker.svg?style=svg&circle-token=b40a5b06c599d457cbaa4d1c00824c98d4768f2f)](https://circleci.com/gh/dbp/hworker) 147 | -------------------------------------------------------------------------------- /Setup.hs: -------------------------------------------------------------------------------- 1 | import Distribution.Simple 2 | main = defaultMain 3 | -------------------------------------------------------------------------------- /circle.yml: -------------------------------------------------------------------------------- 1 | dependencies: 2 | cache_directories: 3 | - "~/.stack" 4 | pre: 5 | - wget https://github.com/commercialhaskell/stack/releases/download/v0.1.2.0/stack-0.1.2.0-x86_64-linux.gz -O /tmp/stack.gz 6 | - gunzip /tmp/stack.gz && chmod +x /tmp/stack 7 | - sudo mv /tmp/stack /usr/bin/stack 8 | override: 9 | - stack setup 10 | - stack build 11 | - stack test ; echo 0 # We want the test deps cached... :( 12 | 13 | test: 14 | override: 15 | - stack test -------------------------------------------------------------------------------- /example/LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015 Daniel Patterson 2 | 3 | Permission to use, copy, modify, and/or distribute this software for any purpose 4 | with or without fee is hereby granted, provided that the above copyright notice 5 | and this permission notice appear in all copies. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH 8 | REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND 9 | FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, 10 | INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS 11 | OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 12 | TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF 13 | THIS SOFTWARE.
14 | -------------------------------------------------------------------------------- /example/hworker-example.cabal: -------------------------------------------------------------------------------- 1 | name: hworker-example 2 | version: 0.1.0.0 3 | synopsis: Example for Hworker. 4 | description: See README. 5 | homepage: http://github.com/dbp/hworker 6 | license: ISC 7 | license-file: LICENSE 8 | author: Daniel Patterson 9 | maintainer: dbp@dbpmail.net 10 | build-type: Simple 11 | cabal-version: >=1.10 12 | 13 | 14 | executable hworker-example 15 | hs-source-dirs: src 16 | main-is: Main.hs 17 | build-depends: base >= 4.7 && < 5 18 | , aeson 19 | , hedis >= 0.6.5 20 | , text 21 | , bytestring 22 | , time >= 1.5 23 | , attoparsec 24 | , uuid >= 1.2.6 25 | , hworker 26 | default-language: Haskell2010 -------------------------------------------------------------------------------- /example/src/Main.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveGeneric #-} 2 | {-# LANGUAGE MultiParamTypeClasses #-} 3 | {-# LANGUAGE OverloadedStrings #-} 4 | import Control.Concurrent (forkIO, threadDelay) 5 | import Control.Concurrent.MVar (MVar, newMVar, putMVar, takeMVar) 6 | import Control.Monad (forever) 7 | import Data.Aeson (FromJSON, ToJSON) 8 | import qualified Data.Text as T 9 | import GHC.Generics (Generic) 10 | import System.Hworker 11 | 12 | data PrintJob = PrintA | PrintB deriving (Generic, Show) 13 | data State = State (MVar Int) 14 | instance ToJSON PrintJob 15 | instance FromJSON PrintJob 16 | 17 | loopForever :: a 18 | loopForever = loopForever 19 | 20 | instance Job State PrintJob where 21 | job (State mvar) PrintA = 22 | do v <- takeMVar mvar 23 | if v == 0 24 | then do putMVar mvar 0 25 | putStrLn "A" >> return Success 26 | else do putMVar mvar (v - 1) 27 | error $ "Dying: " ++ show v 28 | 29 | job _ PrintB = putStrLn "B" >> return Success 30 | 31 | main = do mvar <- newMVar 3 32 | hworker <- create "printer" (State mvar) 33 | forkIO (worker hworker) 34 | forkIO (monitor hworker) 35 | forkIO (forever $ queue hworker PrintA >> threadDelay 1000000) 36 | forkIO (forever $ queue hworker PrintB >> threadDelay 500000) 37 | forever (threadDelay 1000000) 38 | -------------------------------------------------------------------------------- /hworker.cabal: -------------------------------------------------------------------------------- 1 | name: hworker 2 | version: 0.2.0 3 | synopsis: A reliable at-least-once job queue built on top of redis. 4 | description: See README. 
5 | homepage: http://github.com/positiondev/hworker 6 | license: ISC 7 | license-file: LICENSE 8 | author: Daniel Patterson 9 | maintainer: dbp@dbpmail.net 10 | build-type: Simple 11 | extra-source-files: README.md CHANGELOG.md 12 | cabal-version: >=1.10 13 | 14 | library 15 | exposed-modules: System.Hworker 16 | other-modules: Data.Aeson.Helpers 17 | build-depends: base >= 4.7 && < 5 18 | , aeson 19 | , hedis >= 0.6.5 20 | , text 21 | , bytestring 22 | , time >= 1.5 23 | , attoparsec 24 | , uuid >= 1.2.6 25 | hs-source-dirs: src 26 | default-language: Haskell2010 27 | ghc-options: -Wall 28 | 29 | Test-Suite hworker-test 30 | type: exitcode-stdio-1.0 31 | hs-source-dirs: src test 32 | main-is: Spec.hs 33 | other-modules: Data.Aeson.Helpers 34 | , System.Hworker 35 | build-depends: base >= 4.7 && < 5 36 | , aeson 37 | , hedis >= 0.6.5 38 | , text 39 | , bytestring 40 | , time >= 1.5 41 | , attoparsec 42 | , uuid >= 1.2.6 43 | , hspec >= 2 44 | , hspec-contrib 45 | , HUnit 46 | -------------------------------------------------------------------------------- /src/Data/Aeson/Helpers.hs: -------------------------------------------------------------------------------- 1 | module Data.Aeson.Helpers where 2 | 3 | import Data.Aeson 4 | import Data.Aeson.Parser (value) 5 | import Data.Attoparsec.Lazy (Parser) 6 | import qualified Data.Attoparsec.Lazy as L 7 | import qualified Data.ByteString.Lazy as L 8 | 9 | -- NOTE(dbp 2015-06-14): Taken from Data.Aeson.Parser.Internal 10 | decodeWith :: Parser Value -> (Value -> Result a) -> L.ByteString -> Maybe a 11 | decodeWith p to s = 12 | case L.parse p s of 13 | L.Done _ v -> case to v of 14 | Success a -> Just a 15 | _ -> Nothing 16 | _ -> Nothing 17 | {-# INLINE decodeWith #-} 18 | 19 | decodeValue :: FromJSON t => L.ByteString -> Maybe t 20 | decodeValue = decodeWith value fromJSON 21 | -------------------------------------------------------------------------------- /src/System/Hworker.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveGeneric #-} 2 | {-# LANGUAGE ExistentialQuantification #-} 3 | {-# LANGUAGE FunctionalDependencies #-} 4 | {-# LANGUAGE MultiParamTypeClasses #-} 5 | {-# LANGUAGE OverloadedStrings #-} 6 | {-# LANGUAGE RankNTypes #-} 7 | {-# LANGUAGE RecordWildCards #-} 8 | {-# LANGUAGE ScopedTypeVariables #-} 9 | 10 | {-| 11 | 12 | This module contains an at-least-once persistent job processing queue 13 | backed by Redis. It depends upon Redis not losing data once it has 14 | acknowledged it, and guaranteeing the atomicity that is specified for 15 | commands like EVAL (ie, that if you do several things within an EVAL, 16 | they will all happen or none will happen). Nothing has been tested 17 | with Redis clusters (and it likely will not work). 
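Since the queue is only as durable as Redis itself, if jobs must survive a Redis restart you will likely also want persistence enabled in your Redis deployment. An illustrative @redis.conf@ fragment (an assumption about your deployment, not something this library configures for you):

> appendonly yes
> appendfsync everysec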
18 | 19 | An example use is the following (see the repository for a 20 | slightly expanded version; also, the test cases in the repository are 21 | also good examples): 22 | 23 | 24 | > data PrintJob = Print deriving (Generic, Show) 25 | > data State = State (MVar Int) 26 | > instance ToJSON PrintJob 27 | > instance FromJSON PrintJob 28 | > 29 | > instance Job State PrintJob where 30 | > job (State mvar) Print = 31 | > do v <- takeMVar mvar 32 | > putMVar mvar (v + 1) 33 | > putStrLn $ "A(" ++ show v ++ ")" 34 | > return Success 35 | > 36 | > main = do mvar <- newMVar 0 37 | > hworker <- create "printer" (State mvar) 38 | > forkIO (worker hworker) 39 | > forkIO (monitor hworker) 40 | > forkIO (forever $ queue hworker Print >> threadDelay 1000000) 41 | > forever (threadDelay 1000000) 42 | 43 | 44 | -} 45 | 46 | module System.Hworker 47 | ( -- * Types 48 | Result(..) 49 | , Job(..) 50 | , Hworker 51 | , HworkerConfig(..) 52 | , ExceptionBehavior(..) 53 | , RedisConnection(..) 54 | , defaultHworkerConfig 55 | -- * Managing Workers 56 | , create 57 | , createWith 58 | , destroy 59 | , worker 60 | , monitor 61 | -- * Queuing Jobs 62 | , queue 63 | -- * Inspecting Workers 64 | , jobs 65 | , failed 66 | , broken 67 | -- * Debugging Utilities 68 | , debugger 69 | ) 70 | where 71 | 72 | import Control.Arrow (second) 73 | import Control.Concurrent (forkIO, threadDelay) 74 | import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar) 75 | import Control.Exception (SomeException, catch) 76 | import Control.Monad (forM, forever, void, when) 77 | import Data.Aeson (FromJSON, ToJSON) 78 | import qualified Data.Aeson as A 79 | import Data.Aeson.Helpers 80 | import Data.ByteString (ByteString) 81 | import qualified Data.ByteString.Char8 as B8 82 | import qualified Data.ByteString.Lazy as LB 83 | import Data.Either (isRight) 84 | import Data.Maybe (fromJust, mapMaybe) 85 | import Data.Monoid ((<>)) 86 | import Data.Text (Text) 87 | import qualified Data.Text as T 88 | import qualified Data.Text.Encoding as T 89 | import Data.Time.Calendar (Day (..)) 90 | import Data.Time.Clock (NominalDiffTime, UTCTime (..), 91 | diffUTCTime, getCurrentTime) 92 | import qualified Data.UUID as UUID 93 | import qualified Data.UUID.V4 as UUID 94 | import qualified Database.Redis as R 95 | import GHC.Generics (Generic) 96 | 97 | -- | Jobs can return 'Success', 'Retry' (with a message), or 'Failure' 98 | -- (with a message). Jobs that return 'Failure' are stored in the 99 | -- 'failed' queue and are not re-run. Jobs that return 'Retry' are re-run. 100 | data Result = Success 101 | | Retry Text 102 | | Failure Text 103 | deriving (Generic, Show) 104 | instance ToJSON Result 105 | instance FromJSON Result 106 | 107 | -- | Each Worker that you create will be responsible for one type of 108 | -- job, defined by a 'Job' instance. 109 | -- 110 | -- The job can do many different things (as the value can be a 111 | -- variant), but be careful not to break deserialization if you add 112 | -- new things it can do. 113 | -- 114 | -- The job will take some state (passed as the `s` parameter), which 115 | -- does not vary based on the job, and the actual job data 116 | -- structure. The data structure (the `t` parameter) will be stored 117 | -- and copied a few times in Redis while in the lifecycle, so 118 | -- generally it is a good idea for it to be relatively small (and have 119 | -- it be able to look up data that it needs while the job in running). 
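--
-- For instance (a hypothetical sketch - @UserId@, @AppState@,
-- @lookupUser@, and @sendWelcomeEmail@ are stand-ins for whatever your
-- application provides), queue a small identifier and load the heavy
-- data inside 'job':
--
-- > data EmailJob = SendWelcome UserId deriving (Generic, Show)
-- > -- (plus ToJSON/FromJSON instances for EmailJob, as in the example above)
-- >
-- > instance Job AppState EmailJob where
-- >   job st (SendWelcome uid) =
-- >     do user <- lookupUser st uid
-- >        sendWelcomeEmail user
-- >        return Success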
120 | -- 121 | -- Finally, while deriving FromJSON and ToJSON instances automatically 122 | -- might seem like a good idea, you will most likely be better off 123 | -- defining them manually, so you can make sure they are backwards 124 | -- compatible if you change them, as any jobs that can't be 125 | -- deserialized will not be run (and will end up in the 'broken' 126 | -- queue). This will only happen if the queue is non-empty when you 127 | -- replce the running application version, but this is obviously 128 | -- possible and could be likely depending on your use. 129 | class (FromJSON t, ToJSON t, Show t) => Job s t | s -> t where 130 | job :: s -> t -> IO Result 131 | 132 | data JobData t = JobData UTCTime t 133 | 134 | -- | What should happen when an unexpected exception is thrown in a 135 | -- job - it can be treated as either a 'Failure' (the default) or a 136 | -- 'Retry' (if you know the only exceptions are triggered by 137 | -- intermittent problems). 138 | data ExceptionBehavior = RetryOnException | FailOnException 139 | 140 | hwlog :: Show a => Hworker s t -> a -> IO () 141 | hwlog hw a = hworkerLogger hw (hworkerName hw, a) 142 | 143 | -- | The worker data type - it is parametrized be the worker 144 | -- state (the `s`) and the job type (the `t`). 145 | data Hworker s t = 146 | Hworker { hworkerName :: ByteString 147 | , hworkerState :: s 148 | , hworkerConnection :: R.Connection 149 | , hworkerExceptionBehavior :: ExceptionBehavior 150 | , hworkerLogger :: forall a. Show a => a -> IO () 151 | , hworkerJobTimeout :: NominalDiffTime 152 | , hworkerFailedQueueSize :: Int 153 | , hworkerDebug :: Bool 154 | } 155 | 156 | -- | When configuring a worker, you can tell it to use an existing 157 | -- redis connection pool (which you may have for the rest of your 158 | -- application). Otherwise, you can specify connection info. By 159 | -- default, hworker tries to connect to localhost, which may not be 160 | -- true for your production application. 161 | data RedisConnection = RedisConnectInfo R.ConnectInfo 162 | | RedisConnection R.Connection 163 | 164 | -- | The main configuration for workers. 165 | -- 166 | -- Each pool of workers should have a unique `hwconfigName`, as the 167 | -- queues are set up by that name, and if you have different types of 168 | -- data written in, they will likely be unable to be deserialized (and 169 | -- thus could end up in the 'broken' queue). 170 | -- 171 | -- The 'hwconfigLogger' defaults to writing to stdout, so you will 172 | -- likely want to replace that with something appropriate (like from a 173 | -- logging package). 174 | -- 175 | -- The `hwconfigTimeout` is really important. It determines the length 176 | -- of time after a job is started before the 'monitor' will decide 177 | -- that the job must have died and will restart it. If it is shorter 178 | -- than the length of time that a normal job takes to complete, the 179 | -- jobs _will_ be run multiple times. This is _semantically_ okay, as 180 | -- this is an at-least-once processor, but obviously won't be 181 | -- desirable. It defaults to 120 seconds. 182 | -- 183 | -- The 'hwconfigExceptionBehavior' controls what happens when an 184 | -- exception is thrown within a job. 185 | -- 186 | -- 'hwconfigFailedQueueSize' controls how many 'failed' jobs will be 187 | -- kept. It defaults to 1000. 
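--
-- For example (a sketch; @appState@ stands in for whatever state your
-- 'Job' instance needs), overriding a couple of the defaults:
--
-- > hw <- createWith (defaultHworkerConfig "email" appState)
-- >                    { hwconfigTimeout = 600
-- >                    , hwconfigFailedQueueSize = 5000
-- >                    }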
188 | data HworkerConfig s = 189 | HworkerConfig { 190 | hwconfigName :: Text 191 | , hwconfigState :: s 192 | , hwconfigRedisConnectInfo :: RedisConnection 193 | , hwconfigExceptionBehavior :: ExceptionBehavior 194 | , hwconfigLogger :: forall a. Show a => a -> IO () 195 | , hwconfigTimeout :: NominalDiffTime 196 | , hwconfigFailedQueueSize :: Int 197 | , hwconfigDebug :: Bool 198 | } 199 | 200 | -- | The default worker config - it needs a name and a state (as those 201 | -- will always be unique). 202 | defaultHworkerConfig :: Text -> s -> HworkerConfig s 203 | defaultHworkerConfig name state = 204 | HworkerConfig name 205 | state 206 | (RedisConnectInfo R.defaultConnectInfo) 207 | FailOnException 208 | print 209 | 120 210 | 1000 211 | False 212 | 213 | -- | Create a new worker with the default 'HworkerConfig'. 214 | -- 215 | -- Note that you must create at least one 'worker' and 'monitor' for 216 | -- the queue to actually process jobs (and for it to retry ones that 217 | -- time-out). 218 | create :: Job s t => Text -> s -> IO (Hworker s t) 219 | create name state = createWith (defaultHworkerConfig name state) 220 | 221 | -- | Create a new worker with a specified 'HworkerConfig'. 222 | -- 223 | -- Note that you must create at least one 'worker' and 'monitor' for 224 | -- the queue to actually process jobs (and for it to retry ones that 225 | -- time-out). 226 | createWith :: Job s t => HworkerConfig s -> IO (Hworker s t) 227 | createWith HworkerConfig{..} = 228 | do conn <- case hwconfigRedisConnectInfo of 229 | RedisConnectInfo c -> R.connect c 230 | RedisConnection c -> return c 231 | return $ Hworker (T.encodeUtf8 hwconfigName) 232 | hwconfigState 233 | conn 234 | hwconfigExceptionBehavior 235 | hwconfigLogger 236 | hwconfigTimeout 237 | hwconfigFailedQueueSize 238 | hwconfigDebug 239 | 240 | -- | Destroy a worker. This will delete all the queues, clearing out 241 | -- all existing 'jobs', the 'broken' and 'failed' queues. There is no need 242 | -- to do this in normal applications (and most likely, you won't want to). 243 | destroy :: Job s t => Hworker s t -> IO () 244 | destroy hw = void $ R.runRedis (hworkerConnection hw) $ 245 | R.del [ jobQueue hw 246 | , progressQueue hw 247 | , brokenQueue hw 248 | , failedQueue hw 249 | ] 250 | 251 | jobQueue :: Hworker s t -> ByteString 252 | jobQueue hw = "hworker-jobs-" <> hworkerName hw 253 | 254 | progressQueue :: Hworker s t -> ByteString 255 | progressQueue hw = "hworker-progress-" <> hworkerName hw 256 | 257 | brokenQueue :: Hworker s t -> ByteString 258 | brokenQueue hw = "hworker-broken-" <> hworkerName hw 259 | 260 | failedQueue :: Hworker s t -> ByteString 261 | failedQueue hw = "hworker-failed-" <> hworkerName hw 262 | 263 | -- | Adds a job to the queue. Returns whether the operation succeeded. 264 | queue :: Job s t => Hworker s t -> t -> IO Bool 265 | queue hw j = 266 | do job_id <- UUID.toString <$> UUID.nextRandom 267 | isRight <$> R.runRedis (hworkerConnection hw) 268 | (R.lpush (jobQueue hw) [LB.toStrict $ A.encode (job_id, j)]) 269 | 270 | -- | Creates a new worker thread. This is blocking, so you will want to 271 | -- 'forkIO' this into a thread. You can have any number of these (and 272 | -- on any number of servers); the more there are, the faster jobs will 273 | -- be processed. 
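--
-- For example (a sketch), to run four workers in a single process,
-- using 'Control.Monad.replicateM_' and 'Control.Concurrent.forkIO':
--
-- > replicateM_ 4 (forkIO (worker hw))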
274 | worker :: Job s t => Hworker s t -> IO () 275 | worker hw = 276 | do now <- getCurrentTime 277 | r <- R.runRedis (hworkerConnection hw) $ 278 | R.eval "local job = redis.call('rpop',KEYS[1])\n\ 279 | \if job ~= nil then\n\ 280 | \ redis.call('hset', KEYS[2], tostring(job), ARGV[1])\n\ 281 | \ return job\n\ 282 | \else\n\ 283 | \ return nil\n\ 284 | \end" 285 | [jobQueue hw, progressQueue hw] 286 | [LB.toStrict $ A.encode now] 287 | case r of 288 | Left err -> hwlog hw err >> delayAndRun 289 | Right Nothing -> delayAndRun 290 | Right (Just t) -> 291 | do when (hworkerDebug hw) $ hwlog hw ("WORKER RUNNING", t) 292 | case decodeValue (LB.fromStrict t) of 293 | Nothing -> do hwlog hw ("BROKEN JOB", t) 294 | now <- getCurrentTime 295 | withNil hw (R.eval "local del = redis.call('hdel', KEYS[1], ARGV[1])\n\ 296 | \if del == 1 then\n\ 297 | \ redis.call('hset', KEYS[2], ARGV[1], ARGV[2])\n\ 298 | \end\n\ 299 | \return nil" 300 | [progressQueue hw, brokenQueue hw] 301 | [t, LB.toStrict $ A.encode now]) 302 | delayAndRun 303 | Just (_ :: String, j) -> do 304 | result <- runJob (job (hworkerState hw) j) 305 | case result of 306 | Success -> 307 | do when (hworkerDebug hw) $ hwlog hw ("JOB COMPLETE", t) 308 | delete_res <- R.runRedis (hworkerConnection hw) 309 | (R.hdel (progressQueue hw) [t]) 310 | case delete_res of 311 | Left err -> hwlog hw err >> delayAndRun 312 | Right 1 -> justRun 313 | Right n -> do hwlog hw ("Job done: did not delete 1, deleted " <> show n) 314 | delayAndRun 315 | Retry msg -> 316 | do hwlog hw ("Retry: " <> msg) 317 | withNil hw 318 | (R.eval "local del = redis.call('hdel', KEYS[1], ARGV[1])\n\ 319 | \if del == 1 then\n\ 320 | \ redis.call('lpush', KEYS[2], ARGV[1])\n\ 321 | \end\n\ 322 | \return nil" 323 | [progressQueue hw, jobQueue hw] 324 | [t]) 325 | delayAndRun 326 | Failure msg -> 327 | do hwlog hw ("Failure: " <> msg) 328 | withNil hw 329 | (R.eval "local del = redis.call('hdel', KEYS[1], ARGV[1])\n\ 330 | \if del == 1 then\n\ 331 | \ redis.call('lpush', KEYS[2], ARGV[1])\n\ 332 | \ redis.call('ltrim', KEYS[2], 0, ARGV[2])\n\ 333 | \end\n\ 334 | \return nil" 335 | [progressQueue hw, failedQueue hw] 336 | [t, B8.pack (show (hworkerFailedQueueSize hw - 1))]) 337 | void $ R.runRedis (hworkerConnection hw) 338 | (R.hdel (progressQueue hw) [t]) 339 | delayAndRun 340 | where delayAndRun = threadDelay 10000 >> worker hw 341 | justRun = worker hw 342 | runJob v = 343 | do x <- newEmptyMVar 344 | jt <- forkIO (catch (v >>= putMVar x . Right) 345 | (\(e::SomeException) -> 346 | putMVar x (Left e))) 347 | res <- takeMVar x 348 | case res of 349 | Left e -> 350 | let b = case hworkerExceptionBehavior hw of 351 | RetryOnException -> Retry 352 | FailOnException -> Failure in 353 | return (b ("Exception raised: " <> (T.pack . show) e)) 354 | Right r -> return r 355 | 356 | 357 | -- | Start a monitor. Like 'worker', this is blocking, so should be 358 | -- started in a thread. This is responsible for retrying jobs that 359 | -- time out (which can happen if the processing thread is killed, for 360 | -- example). You need to have at least one of these running to have 361 | -- the retry happen, but it is safe to have any number running. 
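--
-- For example (a sketch), started alongside the workers:
--
-- > _ <- forkIO (worker hw)
-- > _ <- forkIO (monitor hw)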
362 | monitor :: Job s t => Hworker s t -> IO () 363 | monitor hw = 364 | forever $ 365 | do now <- getCurrentTime 366 | withList hw (R.hkeys (progressQueue hw)) 367 | (\jobs -> 368 | void $ forM jobs $ \job -> 369 | withMaybe hw (R.hget (progressQueue hw) job) 370 | (\start -> 371 | when (diffUTCTime now (fromJust $ decodeValue (LB.fromStrict start)) > hworkerJobTimeout hw) $ 372 | do n <- 373 | withInt hw 374 | (R.eval "local del = redis.call('hdel', KEYS[2], ARGV[1])\n\ 375 | \if del == 1 then\ 376 | \ redis.call('rpush', KEYS[1], ARGV[1])\n\ \end\n\ 377 | \return del" 378 | [jobQueue hw, progressQueue hw] 379 | [job]) 380 | when (hworkerDebug hw) $ hwlog hw ("MONITOR RV", n) 381 | when (hworkerDebug hw && n == 1) $ hwlog hw ("MONITOR REQUEUED", job))) 382 | -- NOTE(dbp 2015-07-25): We check every 1/10th of timeout. 383 | threadDelay (floor $ 100000 * hworkerJobTimeout hw) 384 | 385 | -- | Returns the jobs that could not be deserialized, most likely 386 | -- because you changed the 'ToJSON'/'FromJSON' instances for you job 387 | -- in a way that resulted in old jobs not being able to be converted 388 | -- back from json. Another reason for jobs to end up here (and much 389 | -- worse) is if you point two instances of 'Hworker', with different 390 | -- job types, at the same queue (ie, you re-use the name). Then 391 | -- anytime a worker from one queue gets a job from the other it would 392 | -- think it is broken. 393 | broken :: Hworker s t -> IO [(ByteString, UTCTime)] 394 | broken hw = do r <- R.runRedis (hworkerConnection hw) (R.hgetall (brokenQueue hw)) 395 | case r of 396 | Left err -> hwlog hw err >> return [] 397 | Right xs -> return (map (second parseTime) xs) 398 | where parseTime = fromJust . decodeValue . LB.fromStrict 399 | 400 | jobsFromQueue :: Job s t => Hworker s t -> ByteString -> IO [t] 401 | jobsFromQueue hw queue = 402 | do r <- R.runRedis (hworkerConnection hw) (R.lrange queue 0 (-1)) 403 | case r of 404 | Left err -> hwlog hw err >> return [] 405 | Right [] -> return [] 406 | Right xs -> return $ mapMaybe (fmap (\(_::String, x) -> x) . decodeValue . LB.fromStrict) xs 407 | 408 | -- | Returns all pending jobs. 409 | jobs :: Job s t => Hworker s t -> IO [t] 410 | jobs hw = jobsFromQueue hw (jobQueue hw) 411 | 412 | -- | Returns all failed jobs. This is capped at the most recent 413 | -- 'hworkerconfigFailedQueueSize' jobs that returned 'Failure' (or 414 | -- threw an exception when 'hworkerconfigExceptionBehavior' is 415 | -- 'FailOnException'). 416 | failed :: Job s t => Hworker s t -> IO [t] 417 | failed hw = jobsFromQueue hw (failedQueue hw) 418 | 419 | -- | Logs the contents of the jobqueue and the inprogress queue at 420 | -- `microseconds` intervals. 
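--
-- For example (a sketch), to log the queue contents once a second:
--
-- > _ <- forkIO (debugger 1000000 hw)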
421 | debugger :: Job s t => Int -> Hworker s t -> IO () 422 | debugger microseconds hw = 423 | forever $ 424 | do withList hw (R.hkeys (progressQueue hw)) 425 | (\running -> 426 | withList hw (R.lrange (jobQueue hw) 0 (-1)) 427 | (\queued -> hwlog hw ("DEBUG", queued, running))) 428 | threadDelay microseconds 429 | 430 | -- Redis helpers follow 431 | withList hw a f = 432 | do r <- R.runRedis (hworkerConnection hw) a 433 | case r of 434 | Left err -> hwlog hw err 435 | Right [] -> return () 436 | Right xs -> f xs 437 | 438 | withMaybe hw a f = 439 | do r <- R.runRedis (hworkerConnection hw) a 440 | case r of 441 | Left err -> hwlog hw err 442 | Right Nothing -> return () 443 | Right (Just v) -> f v 444 | 445 | withNil hw a = 446 | do r <- R.runRedis (hworkerConnection hw) a 447 | case r of 448 | Left err -> hwlog hw err 449 | Right (Just ("" :: ByteString)) -> return () 450 | Right _ -> return () 451 | 452 | withInt :: Hworker s t -> R.Redis (Either R.Reply Integer) -> IO Integer 453 | withInt hw a = 454 | do r <- R.runRedis (hworkerConnection hw) a 455 | case r of 456 | Left err -> hwlog hw err >> return (-1) 457 | Right n -> return n 458 | 459 | withIgnore :: Hworker s t -> R.Redis (Either R.Reply a) -> IO () 460 | withIgnore hw a = 461 | do r <- R.runRedis (hworkerConnection hw) a 462 | case r of 463 | Left err -> hwlog hw err 464 | Right _ -> return () 465 | -------------------------------------------------------------------------------- /stack.yaml: -------------------------------------------------------------------------------- 1 | flags: {} 2 | packages: 3 | - '.' 4 | - 'example' 5 | extra-deps: [] 6 | resolver: lts-3.1 7 | -------------------------------------------------------------------------------- /test/Spec.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE DeriveGeneric #-} 2 | {-# LANGUAGE MultiParamTypeClasses #-} 3 | {-# LANGUAGE OverloadedStrings #-} 4 | import Control.Concurrent (forkIO, killThread, threadDelay) 5 | import Control.Concurrent.MVar (MVar, modifyMVarMasked_, newMVar, 6 | readMVar, takeMVar) 7 | import Control.Monad (replicateM_) 8 | import Data.Aeson (FromJSON, ToJSON) 9 | import qualified Data.Text as T 10 | import GHC.Generics (Generic) 11 | import System.Hworker 12 | import System.IO 13 | 14 | import Test.Hspec 15 | import Test.Hspec.Contrib.HUnit 16 | import Test.HUnit 17 | 18 | data SimpleJob = SimpleJob deriving (Generic, Show, Eq) 19 | data SimpleState = SimpleState { unSimpleState :: MVar Int } 20 | instance ToJSON SimpleJob 21 | instance FromJSON SimpleJob 22 | instance Job SimpleState SimpleJob where 23 | job (SimpleState mvar) SimpleJob = 24 | do modifyMVarMasked_ mvar (return . (+1)) 25 | return Success 26 | 27 | data ExJob = ExJob deriving (Generic, Show) 28 | data ExState = ExState { unExState :: MVar Int } 29 | instance ToJSON ExJob 30 | instance FromJSON ExJob 31 | instance Job ExState ExJob where 32 | job (ExState mvar) ExJob = 33 | do modifyMVarMasked_ mvar (return . (+1)) 34 | v <- readMVar mvar 35 | if v > 1 36 | then return Success 37 | else error "ExJob: failing badly!" 38 | 39 | data RetryJob = RetryJob deriving (Generic, Show) 40 | data RetryState = RetryState { unRetryState :: MVar Int } 41 | instance ToJSON RetryJob 42 | instance FromJSON RetryJob 43 | instance Job RetryState RetryJob where 44 | job (RetryState mvar) RetryJob = 45 | do modifyMVarMasked_ mvar (return . 
(+1)) 46 | v <- readMVar mvar 47 | if v > 1 48 | then return Success 49 | else return (Retry "RetryJob retries") 50 | 51 | data FailJob = FailJob deriving (Eq, Generic, Show) 52 | data FailState = FailState { unFailState :: MVar Int } 53 | instance ToJSON FailJob 54 | instance FromJSON FailJob 55 | instance Job FailState FailJob where 56 | job (FailState mvar) FailJob = 57 | do modifyMVarMasked_ mvar (return . (+1)) 58 | v <- readMVar mvar 59 | if v > 1 60 | then return Success 61 | else return (Failure "FailJob fails") 62 | 63 | data AlwaysFailJob = AlwaysFailJob deriving (Eq, Generic, Show) 64 | data AlwaysFailState = AlwaysFailState { unAlwaysFailState :: MVar Int } 65 | instance ToJSON AlwaysFailJob 66 | instance FromJSON AlwaysFailJob 67 | instance Job AlwaysFailState AlwaysFailJob where 68 | job (AlwaysFailState mvar) AlwaysFailJob = 69 | do modifyMVarMasked_ mvar (return . (+1)) 70 | return (Failure "AlwaysFailJob fails") 71 | 72 | data TimedJob = TimedJob Int deriving (Generic, Show, Eq) 73 | data TimedState = TimedState { unTimedState :: MVar Int } 74 | instance ToJSON TimedJob 75 | instance FromJSON TimedJob 76 | instance Job TimedState TimedJob where 77 | job (TimedState mvar) (TimedJob delay) = 78 | do threadDelay delay 79 | modifyMVarMasked_ mvar (return . (+1)) 80 | return Success 81 | 82 | data BigJob = BigJob T.Text deriving (Generic, Show, Eq) 83 | data BigState = BigState { unBigState :: MVar Int } 84 | instance ToJSON BigJob 85 | instance FromJSON BigJob 86 | instance Job BigState BigJob where 87 | job (BigState mvar) (BigJob _) = 88 | do modifyMVarMasked_ mvar (return . (+1)) 89 | return Success 90 | 91 | nullLogger :: Show a => a -> IO () 92 | nullLogger = const (return ()) 93 | 94 | print' :: Show a => a -> IO () 95 | print' a = do print a 96 | hFlush stdout 97 | 98 | conf n s = (defaultHworkerConfig n s) { 99 | hwconfigLogger = nullLogger 100 | , hwconfigExceptionBehavior = FailOnException 101 | , hwconfigTimeout = 4 102 | } 103 | 104 | main :: IO () 105 | main = hspec $ 106 | do describe "Simple" $ 107 | do it "should run and increment counter" $ 108 | do mvar <- newMVar 0 109 | hworker <- createWith (conf "simpleworker-1" 110 | (SimpleState mvar)) 111 | wthread <- forkIO (worker hworker) 112 | queue hworker SimpleJob 113 | threadDelay 30000 114 | killThread wthread 115 | destroy hworker 116 | v <- takeMVar mvar 117 | assertEqual "State should be 1 after job runs" 1 v 118 | it "queueing 2 jobs should increment twice" $ 119 | do mvar <- newMVar 0 120 | hworker <- createWith (conf "simpleworker-2" 121 | (SimpleState mvar)) 122 | wthread <- forkIO (worker hworker) 123 | queue hworker SimpleJob 124 | queue hworker SimpleJob 125 | threadDelay 40000 126 | killThread wthread 127 | destroy hworker 128 | v <- takeMVar mvar 129 | assertEqual "State should be 2 after 2 jobs run" 2 v 130 | it "queueing 1000 jobs should increment 1000" $ 131 | do mvar <- newMVar 0 132 | hworker <- createWith (conf "simpleworker-3" 133 | (SimpleState mvar)) 134 | wthread <- forkIO (worker hworker) 135 | replicateM_ 1000 (queue hworker SimpleJob) 136 | threadDelay 2000000 137 | killThread wthread 138 | destroy hworker 139 | v <- takeMVar mvar 140 | assertEqual "State should be 1000 after 1000 job runs" 1000 v 141 | it "should work with multiple workers" $ 142 | -- NOTE(dbp 2015-07-12): This probably won't run faster, because 143 | -- they are all blocking on the MVar, but that's not the point. 
144 | do mvar <- newMVar 0 145 | hworker <- createWith (conf "simpleworker-4" 146 | (SimpleState mvar)) 147 | wthread1 <- forkIO (worker hworker) 148 | wthread2 <- forkIO (worker hworker) 149 | wthread3 <- forkIO (worker hworker) 150 | wthread4 <- forkIO (worker hworker) 151 | replicateM_ 1000 (queue hworker SimpleJob) 152 | threadDelay 1000000 153 | killThread wthread1 154 | killThread wthread2 155 | killThread wthread3 156 | killThread wthread4 157 | destroy hworker 158 | v <- takeMVar mvar 159 | assertEqual "State should be 1000 after 1000 job runs" 1000 v 160 | 161 | describe "Exceptions" $ 162 | do it "should be able to have exceptions thrown in jobs and retry the job" $ 163 | do mvar <- newMVar 0 164 | hworker <- createWith (conf "exworker-1" 165 | (ExState mvar)) { 166 | hwconfigExceptionBehavior = 167 | RetryOnException 168 | } 169 | wthread <- forkIO (worker hworker) 170 | queue hworker ExJob 171 | threadDelay 40000 172 | killThread wthread 173 | destroy hworker 174 | v <- takeMVar mvar 175 | assertEqual "State should be 2, since the first run failed" 2 v 176 | it "should not retry if mode is FailOnException" $ 177 | do mvar <- newMVar 0 178 | hworker <- createWith (conf "exworker-2" 179 | (ExState mvar)) 180 | wthread <- forkIO (worker hworker) 181 | queue hworker ExJob 182 | threadDelay 30000 183 | killThread wthread 184 | destroy hworker 185 | v <- takeMVar mvar 186 | assertEqual "State should be 1, since failing run wasn't retried" 1 v 187 | 188 | describe "Retry" $ 189 | do it "should be able to return Retry and get run again" $ 190 | do mvar <- newMVar 0 191 | hworker <- createWith (conf "retryworker-1" 192 | (RetryState mvar)) 193 | wthread <- forkIO (worker hworker) 194 | queue hworker RetryJob 195 | threadDelay 50000 196 | destroy hworker 197 | v <- takeMVar mvar 198 | assertEqual "State should be 2, since it got retried" 2 v 199 | 200 | describe "Fail" $ 201 | do it "should not retry a job that Fails" $ 202 | do mvar <- newMVar 0 203 | hworker <- createWith (conf "failworker-1" 204 | (FailState mvar)) 205 | wthread <- forkIO (worker hworker) 206 | queue hworker FailJob 207 | threadDelay 30000 208 | destroy hworker 209 | v <- takeMVar mvar 210 | assertEqual "State should be 1, since failing run wasn't retried" 1 v 211 | it "should put a failed job into the failed queue" $ 212 | do mvar <- newMVar 0 213 | hworker <- createWith (conf "failworker-2" 214 | (FailState mvar)) 215 | wthread <- forkIO (worker hworker) 216 | queue hworker FailJob 217 | threadDelay 30000 218 | jobs <- failed hworker 219 | destroy hworker 220 | assertEqual "Should have failed job" [FailJob] jobs 221 | it "should only store failedQueueSize failed jobs" $ 222 | do mvar <- newMVar 0 223 | hworker <- createWith (conf "failworker-3" 224 | (AlwaysFailState mvar)) { 225 | hwconfigFailedQueueSize = 2 226 | } 227 | wthread <- forkIO (worker hworker) 228 | queue hworker AlwaysFailJob 229 | queue hworker AlwaysFailJob 230 | queue hworker AlwaysFailJob 231 | queue hworker AlwaysFailJob 232 | threadDelay 100000 233 | jobs <- failed hworker 234 | destroy hworker 235 | v <- takeMVar mvar 236 | assertEqual "State should be 4, since all jobs were run" 4 v 237 | assertEqual "Should only have stored 2" 238 | [AlwaysFailJob,AlwaysFailJob] jobs 239 | describe "Monitor" $ 240 | do it "should add job back after timeout" $ 241 | -- NOTE(dbp 2015-07-12): The timing on this test is somewhat 242 | -- tricky. 
We want to get the job started with one worker, 243 | -- then kill the worker, then start a new worker, and have 244 | -- the monitor put the job back in the queue and have the 245 | -- second worker finish it. It's important that the job 246 | -- takes less time to complete than the timeout for the 247 | -- monitor, or else it'll queue it forever. 248 | -- 249 | -- The timeout is 5 seconds. The job takes 1 seconds to run. 250 | -- The worker is killed after 0.5 seconds, which should be 251 | -- plenty of time for it to have started the job. Then after 252 | -- the second worker is started, we wait 10 seconds, which 253 | -- should be plenty; we expect the total run to take around 11. 254 | do mvar <- newMVar 0 255 | hworker <- createWith (conf "timedworker-1" 256 | (TimedState mvar)) { 257 | hwconfigTimeout = 5 258 | } 259 | wthread1 <- forkIO (worker hworker) 260 | mthread <- forkIO (monitor hworker) 261 | queue hworker (TimedJob 1000000) 262 | threadDelay 500000 263 | killThread wthread1 264 | wthread2 <- forkIO (worker hworker) 265 | threadDelay 10000000 266 | destroy hworker 267 | v <- takeMVar mvar 268 | assertEqual "State should be 2, since monitor thinks it failed" 2 v 269 | it "should add back multiple jobs after timeout" $ 270 | -- NOTE(dbp 2015-07-23): Similar to the above test, but we 271 | -- have multiple jobs started, multiple workers killed. 272 | -- then one worker will finish both interrupted jobs. 273 | do mvar <- newMVar 0 274 | hworker <- createWith (conf "timedworker-2" 275 | (TimedState mvar)) { 276 | hwconfigTimeout = 5 277 | } 278 | wthread1 <- forkIO (worker hworker) 279 | wthread2 <- forkIO (worker hworker) 280 | mthread <- forkIO (monitor hworker) 281 | queue hworker (TimedJob 1000000) 282 | queue hworker (TimedJob 1000000) 283 | threadDelay 500000 284 | killThread wthread1 285 | killThread wthread2 286 | wthread3 <- forkIO (worker hworker) 287 | threadDelay 10000000 288 | destroy hworker 289 | v <- takeMVar mvar 290 | assertEqual "State should be 4, since monitor thinks first 2 failed" 4 v 291 | it "should work with multiple monitors" $ 292 | do mvar <- newMVar 0 293 | hworker <- createWith (conf "timedworker-3" 294 | (TimedState mvar)) { 295 | hwconfigTimeout = 5 296 | } 297 | wthread1 <- forkIO (worker hworker) 298 | wthread2 <- forkIO (worker hworker) 299 | -- NOTE(dbp 2015-07-24): This might seem silly, but it 300 | -- was actually sufficient to expose a race condition. 301 | mthread1 <- forkIO (monitor hworker) 302 | mthread2 <- forkIO (monitor hworker) 303 | mthread3 <- forkIO (monitor hworker) 304 | mthread4 <- forkIO (monitor hworker) 305 | mthread5 <- forkIO (monitor hworker) 306 | mthread6 <- forkIO (monitor hworker) 307 | queue hworker (TimedJob 1000000) 308 | queue hworker (TimedJob 1000000) 309 | threadDelay 500000 310 | killThread wthread1 311 | killThread wthread2 312 | wthread3 <- forkIO (worker hworker) 313 | threadDelay 30000000 314 | destroy hworker 315 | v <- takeMVar mvar 316 | assertEqual "State should be 4, since monitor thinks first 2 failed" 4 v 317 | -- NOTE(dbp 2015-07-24): It would be really great to have a 318 | -- test that went after a race between the retry logic and 319 | -- the monitors (ie, assume that the job completed with 320 | -- Retry, and it happened to complete right at the timeout 321 | -- period). I'm not sure if I could get that sort of 322 | -- precision without adding other delay mechanisms, or 323 | -- something to make it more deterministic. 
324 | describe "Broken jobs" $ 325 | it "should store broken jobs" $ 326 | do -- NOTE(dbp 2015-08-09): The more common way this could 327 | -- happen is that you change your serialization format. But 328 | -- we can abuse this by creating two different workers 329 | -- pointing to the same queue, and submit jobs in one, try 330 | -- to run them in another, where the types are different. 331 | mvar <- newMVar 0 332 | hworker1 <- createWith (conf "broken-1" 333 | (TimedState mvar)) { 334 | hwconfigTimeout = 5 335 | } 336 | hworker2 <- createWith (conf "broken-1" 337 | (SimpleState mvar)) { 338 | hwconfigTimeout = 5 339 | } 340 | wthread <- forkIO (worker hworker1) 341 | queue hworker2 SimpleJob 342 | threadDelay 100000 343 | jobs <- broken hworker2 344 | killThread wthread 345 | destroy hworker1 346 | v <- takeMVar mvar 347 | assertEqual "State should be 0, as nothing should have happened" 0 v 348 | assertEqual "Should be one broken job, as serialization is wrong" 1 (length jobs) 349 | describe "Dump jobs" $ do 350 | it "should return the job that was queued" $ 351 | do mvar <- newMVar 0 352 | hworker <- createWith (conf "dump-1" 353 | (SimpleState mvar)) { 354 | hwconfigTimeout = 5 355 | } 356 | queue hworker SimpleJob 357 | res <- jobs hworker 358 | destroy hworker 359 | assertEqual "Should be [SimpleJob]" [SimpleJob] res 360 | it "should return jobs in order (most recently added at front; worker pulls from back)" $ 361 | do mvar <- newMVar 0 362 | hworker <- createWith (conf "dump-2" 363 | (TimedState mvar)) { 364 | hwconfigTimeout = 5 365 | } 366 | queue hworker (TimedJob 1) 367 | queue hworker (TimedJob 2) 368 | res <- jobs hworker 369 | destroy hworker 370 | assertEqual "Should by [TimedJob 2, TimedJob 1]" [TimedJob 2, TimedJob 1] res 371 | describe "Large jobs" $ do 372 | it "should be able to deal with lots of large jobs" $ 373 | do mvar <- newMVar 0 374 | hworker <- createWith (conf "big-1" 375 | (BigState mvar)) 376 | wthread1 <- forkIO (worker hworker) 377 | wthread2 <- forkIO (worker hworker) 378 | wthread3 <- forkIO (worker hworker) 379 | wthread4 <- forkIO (worker hworker) 380 | let content = T.intercalate "\n" (take 1000 (repeat "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")) 381 | replicateM_ 5000 (queue hworker (BigJob content)) 382 | threadDelay 10000000 383 | killThread wthread1 384 | killThread wthread2 385 | killThread wthread3 386 | killThread wthread4 387 | destroy hworker 388 | v <- takeMVar mvar 389 | assertEqual "Should have processed 5000" 5000 v 390 | --------------------------------------------------------------------------------