├── SpaceLeak.hs
└── README.md


/SpaceLeak.hs:
--------------------------------------------------------------------------------

module SpaceLeak(module SpaceLeak, force, measureStack, NFData(..)) where

import Control.DeepSeq
import Control.Exception
import Control.Concurrent.Extra
import System.IO.Unsafe
import Data.IORef
import Control.Monad
import Data.List


-- Identity functions that are kept out of line by NOINLINE.
{-# NOINLINE wrapper1 #-}
wrapper1 :: a -> a
wrapper1 x = x

{-# NOINLINE wrapper2 #-}
wrapper2 :: a -> a
wrapper2 x = x

{-# NOINLINE wrapper3 #-}
wrapper3 :: a -> a
wrapper3 x = x

-- Fully evaluate the first argument, then return the second.
deepSeq :: NFData a => a -> b -> b
deepSeq a b = rnf a `seq` b


-- Evaluate the argument to WHNF on a freshly forked thread and return it.
newThread :: a -> a
newThread a = unsafePerformIO $ join $ onceFork $ return $! a

-- As newThread, but evaluate the argument to normal form.
newThread'' :: NFData a => a -> a
newThread'' = newThread . force

-- Right folds expressed as strict left folds over the reversed list.
foldr' f z xs = foldl' (flip f) z $ reverse xs
foldr'' f z xs = foldl'' (flip f) z $ reverse xs

-- A left fold that forces the accumulator to normal form at every step.
foldl'' f = foldl' (\a b -> force $ f a b)


-- Run a lazy foldr until the stack overflows, recording the last element
-- reached, which approximates how many nested evaluations fit on the stack.
measureStack :: IO Int
measureStack = do
    ref <- newIORef 0
    res <- try $ evaluate $ foldr (+) 0 $ map (\x -> unsafePerformIO $ do writeIORef ref x; return x) [1..1000000]
    case res of
        Left StackOverflow -> readIORef ref
        Right _ -> error "Stack did not overflow, so can't measure stack"

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Spaceleak detection

Every large Haskell program almost inevitably contains [space leaks](https://queue.acm.org/detail.cfm?id=2538488). Space leaks are often difficult to detect, but relatively easy to fix once detected (typically by inserting a `!`). This page gives a simple technique to detect some space leaks, along with a set of blog posts using that technique and detailing other space leaks.
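
As a rough illustration of the kind of leak and fix mentioned above, the sketch below contrasts a lazy accumulator with a strict one. The names `sumLazy` and `sumStrict` are hypothetical and do not come from this repository. Whether `sumLazy` actually leaks depends on optimisation, since GHC's strictness analysis at `-O` may already make the accumulator strict.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical example: the accumulator is lazy, so without optimisation
-- each step can wrap the previous total in another (+) thunk. The chain
-- of thunks is only evaluated at the end, which consumes stack.
sumLazy :: [Int] -> Int
sumLazy = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs

-- The same loop with a bang pattern on the accumulator: it is forced at
-- every step, so no chain of thunks builds up.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs

main :: IO ()
main = do
    -- Both functions compute the same result on a small input.
    print (sumLazy [1 .. 10] == sumStrict [1 .. 10])
    -- The strict version is the one intended for long inputs.
    print (sumStrict [1 .. 1000000])
```

The only difference between the two functions is the `!` on `acc`, which is why such fixes tend to be small once the leak has been located.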

## Spaceleak stack-limiting technique

The stack-limiting technique is useful for detecting a common kind of space leak: a situation where an excessive number of interdependent thunks are created and eventually evaluated. The technique works because thunks are evaluated on the stack, so by limiting the stack size we effectively limit the number of interdependent thunks that we can evaluate.

To find space leaks, given a program and a representative run (e.g. the test suite, a suitable input file):

* Compile the program for profiling, e.g. `ghc --make Main.hs -rtsopts -prof -fprof-auto`.
* Run the program with a specific stack size, e.g. `./Main +RTS -K100K` to run with a 100K stack.
* Increase/decrease the stack size until you have determined the minimum stack for which the program succeeds, e.g. `-K33K`.
* Reduce the stack by a small amount and rerun with `-xc`, e.g. `./Main +RTS -K32K -xc`.
* The `-xc` run will print out the stack trace on every exception. Look for the one which says `stack overflow` (likely the last one) and use its stack trace to determine roughly where the leak is.
* Attempt to fix the space leak, then confirm by rerunning with `-K32K`.
* Repeat until the test works with a small stack, typically `-K1K`.
* Add something to your test suite to ensure that if a space leak is ever introduced then it fails, e.g. `ghc-options: -with-rtsopts=-K1K` in Cabal.

This technique does not detect when an excessive number of thunks are created but never evaluated, or when a small number of thunks hold on to a large amount of live data.

Note that typically the main thread requires more than 1K of stack, and that once GHC crosses 1K it makes the stack 32K bigger. As a result, anything less than 33K (e.g. 1K) is often rounded up to 33K. To obtain a more precise stack measurement use `-kc2k`, which increases the stack in 2K chunks (but causes only half the stack to be used, so bear that in mind when scaling `-K` numbers). That said, 32K corresponds to approximately 1000 stack evaluation slots, so a program that runs within it is unlikely to have any significant space leaks.

Below are links to further information, including lots of practical examples of real space leaks caught using the method above.

### Talks

* Haskell eXchange 2016: Video https://skillsmatter.com/skillscasts/8724-plugging-space-leaks-improving-performance, slides http://ndmitchell.com/downloads/slides-plugging_space_leaks_improving_performance-06_oct_2016.pdf

### Blog tales

* QuickCheck: http://neilmitchell.blogspot.co.uk/2016/05/another-space-leak-quickcheck-edition.html
* Shake: http://neilmitchell.blogspot.co.uk/2013/02/chasing-space-leak-in-shake.html
* Detecting space leaks (leak in base): http://neilmitchell.blogspot.co.uk/2015/09/detecting-space-leaks.html
* Three space leaks (2x Hoogle, 1x Shake): http://neilmitchell.blogspot.co.uk/2015/09/three-space-leaks.html
* QuickCheck space leak: http://neilmitchell.blogspot.co.uk/2016/05/another-space-leak-quickcheck-edition.html
* Alex/Happy: http://neilmitchell.blogspot.co.uk/2016/07/more-space-leaks-alexhappy-edition.html
* Writer is a space leak: https://blog.infinitenegativeutility.com/2016/7/writer-monads-and-space-leaks (fixed by [writer-cps-transformers](https://hackage.haskell.org/package/writer-cps-transformers))
* Fixing 17 space leaks in GHCi: https://simonmar.github.io/posts/2018-06-20-Finding-fixing-space-leaks.html

### Fixes

* Making mapM take O(1) stack: http://neilmitchell.blogspot.co.uk/2015/09/making-sequencemapm-for-io-take-o1-stack.html
* Detecting space leaks (leak in base): http://neilmitchell.blogspot.co.uk/2015/09/detecting-space-leaks.html

### Code changes

* Happy: Space leaks https://github.com/simonmar/happy/pull/64 and improved code https://github.com/simonmar/happy/pull/66
* Pretty: https://github.com/haskell/pretty/pull/35
* Statistics: https://github.com/bos/statistics/pull/114

### Publications

* ACM: http://queue.acm.org/detail.cfm?id=2538488

### Other approaches

* https://well-typed.com/blog/2020/09/nothunks/ provides a library for declaring that certain values contain no thunks within them.
* http://simonmar.github.io/posts/2018-06-20-Finding-fixing-space-leaks.html describes how to use weak references to detect what memory is being retained.
* https://neilmitchell.blogspot.com/2020/05/fixing-space-leaks-in-ghcide.html describes finding space leaks in Ghcide and unordered-containers.
* https://github.com/haskell-unordered-containers/unordered-containers/issues/254#issuecomment-636387493 describes a trick of using `(# a #)` unboxed tuples as the return type of a function to keep laziness but get rid of space leaks.

### Notes

On the main thread the stack limit is less effective: requesting 1K usually gives something more like 8K. On spawned threads the limit seems to be respected much more closely. A workaround is to have the main thread always run the real work through `join . onceFork`, so that it executes on a spawned thread.

--------------------------------------------------------------------------------
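
The note about the main thread can be sketched without the `extra` package. The helper below, `runOnForkedThread`, is a hypothetical stand-in for `join . onceFork` built only from `base`. It is an assumption about how such a wrapper could look, not code from this repository: it runs an action on a spawned thread, waits for the result, and rethrows any exception (including a stack overflow) in the caller.

```haskell
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)

-- Hypothetical helper, roughly in the spirit of `join . onceFork`:
-- run the action on a freshly forked thread and block until it finishes.
runOnForkedThread :: IO a -> IO a
runOnForkedThread act = do
    done <- newEmptyMVar
    _ <- forkIO $ do
        -- Capture either the result or the exception raised by the action.
        res <- try act
        putMVar done res
    res <- takeMVar done
    -- Rethrow in the calling thread so failures are not silently lost.
    either (\e -> throwIO (e :: SomeException)) return res

-- Placeholder for the real program body.
realMain :: IO ()
realMain = print (sum [1 .. 100 :: Int])

main :: IO ()
main = runOnForkedThread realMain
```

With a wrapper like this, the `-K` limit is applied to the spawned thread's stack rather than to the main thread's.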