├── README.md ├── crash ├── Main.hs ├── Setup.hs ├── crash.c ├── crash.cabal └── crash.h └── ghd ├── Makefile ├── atos.h ├── ghd.cpp ├── ghd.h ├── rts_wrapper.h ├── rts_wrapper_noprof.c └── rts_wrapper_prof.c /README.md: -------------------------------------------------------------------------------- 1 | 2 | # Hacking GHC's Stack for Fun and Profit 3 | 4 | This document explains how to look at the stack of a [GHC](http://www.haskell.org/ghc/) compiled [Haskell](http://www.haskell.org/haskellwiki/Haskell) program, and why that might be of interest even to somebody who's not a [GHC hacker](http://www.haskell.org/ghc/contributors). Spoiler: It's a basic building block for many kinds of debugging and profiling tools. 5 | 6 | The code / explanations here are for OS X with 32bit GHC 7.6.3 and gcc 4.2.1, but I'll mention relevant differences to make this apply on a different platform / GHC / architecture as well. Just be alert about different word and pointer sizes, register names and calling conventions. 7 | 8 | ### Disclaimer 9 | 10 | I'm not what you'd call an expert on the internals of GHC and its RTS. I simply hope this will be helpful, and I'd love to be corrected and find a better way of doing and explaining all of this. 11 | 12 | ### Motivation 13 | 14 | GHC's [debugging facilities](http://www.haskell.org/haskellwiki/Debugging) are rather limited. There is a debugger inside the [GHCi](http://www.haskell.org/haskellwiki/GHC/GHCi) REPL, but that requires our program to be build and linked by its bytecode compiler. Over the years, there have been many issues preventing otherwise perfectly fine Haskell code from working within GHCi, and even if none of those affect us, we're still left with a much slower running program with significantly changed runtime behavior. Another option would be the [-xc flag](http://www.haskell.org/ghc/docs/7.6.3/html/users_guide/runtime-control.html#rts-options-debugging) of GHC's [RTS](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts), doing a stack trace for every exception. That one also has its fair share of drawbacks (for instance [Ticket #8512](https://ghc.haskell.org/trac/ghc/ticket/8512#ticket)). 15 | 16 | The situation goes from bad to worse as we move from Haskell exceptions to actual [segmentation faults](http://en.wikipedia.org/wiki/Segfault). Here we are completely on our own, as none of the facilities mentioned above work on them. Additionally, the way GHC implements Haskell's stack and generates debug information makes its compiled programs opaque to debuggers like [gdb](https://www.gnu.org/software/gdb/). For this reason, most other profiling and analysis programs will also have a rather hard time with GHC compiled code. 17 | 18 | # Debugging 19 | 20 | ### Introduction 21 | 22 | First, yes, Haskell program's do crash (segfault). There are many APIs and primitives which offer no memory safety guarantees. They are generally used to implement the higher-level, safer systems, or to interface with system APIs like POSIX or OpenGL, or simply for performance reasons (hard to be competitive with C/C++/Fortran when there's a bounds check on every memory access). And even if we ban all of those things from our codebase, we'll very likely still use Haskell libraries which will use unsafe memory operations or call into C code. And this code will inevitably have bugs, and we'll be faced with having our program exit with 23 | 24 | Segmentation fault 25 | 26 | or 27 | 28 | Bus error 29 | 30 | and not much more to go on. This can be a potentially very distressing situation. Imagine having your mission critical 1M LoC application just crash after 30min with zero helpful information. 31 | 32 | If this was a C/C++/ObjC etc. program, we could simply fire up our favorite debugger and get the full call stack of every thread, function arguments, all local and global variables, the line of code where the segfault happened, etc. Even for code obscured by the optimizer and after some memory corruption we'd still likely get some useful data. Even better, we could use tools like [Valgrind](http://valgrind.org/), pinpointing subtle memory errors before they cascade into the inevitable segfault. None of this works (well) with GHC compiled programs. Before resorting to debugging with [git bisect](https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html) or a similar measure, please read on! 33 | 34 | ### Sample Program 35 | 36 | Before we get to actually poking around the stack, we start by writing a simple crashing Haskell program so we have something to work with. Here: 37 | 38 | `Main.hs` 39 | 40 | ```haskell 41 | module Main where 42 | 43 | import Data.Vector.Mutable 44 | 45 | main :: IO () 46 | main = do 47 | someFuncA 48 | someFuncB 49 | 50 | {-# NOINLINE someFuncA #-} 51 | someFuncA :: IO () 52 | someFuncA = do 53 | v <- new 100 54 | unsafeWrite v 0 (0 :: Int) 55 | 56 | {-# NOINLINE someFuncB #-} 57 | someFuncB :: IO () 58 | someFuncB = do 59 | someFuncC 60 | someFuncC 61 | 62 | {-# NOINLINE someFuncC #-} 63 | someFuncC :: IO () 64 | someFuncC = do 65 | v <- new 100 66 | unsafeWrite v 1000000000 (0 :: Int) -- Crash (hopefully) 67 | 68 | ``` 69 | 70 | `crash.cabal` should contain 71 | 72 | ghc-options: -O2 -threaded 73 | 74 | as these flags are likely what we'd likely be using for a real program. 75 | 76 | Running `crash` yields the expected result: 77 | 78 | $ ./dist/build/crash/crash 79 | Segmentation fault 80 | 81 | The full sample program can also be found in the [crash folder](https://github.com/blitzcode/ghc-stack/tree/master/crash). 82 | 83 | ### A first try with gdb 84 | 85 | As discussed during the intro, no (direct) help from GHCi / the RTS here. Let's look at what `gdb` (documentation [here](http://www.gnu.org/software/gdb/documentation/), if you need a little refresher) tells us: 86 | 87 | $ gdb ./dist/build/crash/crash 88 | GNU gdb 6.3.50-20050815 (Apple version gdb-1515) (Sat Jan 15 08:33:48 UTC 2011) 89 | Copyright 2004 Free Software Foundation, Inc. 90 | GDB is free software, covered by the GNU General Public License, and you are 91 | welcome to change it and/or distribute copies of it under certain conditions. 92 | Type "show copying" to see the conditions. 93 | There is absolutely no warranty for GDB. Type "show warranty" for details. 94 | This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done 95 | 96 | (gdb) r 97 | Starting program: /Users/Tim/crash/dist/build/crash/crash 98 | Reading symbols for shared libraries ++. done 99 | 100 | Program received signal EXC_BAD_ACCESS, Could not access memory. 101 | Reason: KERN_PROTECTION_FAILURE at address: 0xbc0ac520 102 | 0x00001d94 in s1A7_info () 103 | (gdb) bt 104 | #0 0x00001d94 in s1A7_info () 105 | #1 0x00000000 in ?? () 106 | (gdb) 107 | 108 | Not helpful at all, at least at first sight! 109 | 110 | The cryptic symbol `s1A7_info` is named by GHC's [STG](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/GeneratedCode) phase. It's worth having a look at the link if this is the first time you've heard that acronym. It's a very informative read in any case, and we'll be getting back to it some more. Also see [Debugging/CompiledCode#Decodingsymbolnames](https://ghc.haskell.org/trac/ghc/wiki/Debugging/CompiledCode#Decodingsymbolnames) and [Compiler/SymbolNames](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/SymbolNames) for additional information on how these symbol names are assigned. 111 | 112 | We can compile our program with `-ddump-stg -ddump-cmm -ddump-to-file`, and if we're lucky we'll find the `s1A7` identifier somewhere in the dumped [C--](http://en.wikipedia.org/wiki/C--) intermediate code (inside `dist/build/crash/crash-tmp/Main.dump-*` for a cabal build of a program named 'crash'). This way we should at least be able to pin down the module of the crash location, maybe more. Better than nothing! 113 | 114 | There is much more information inside the [GHC commentary](https://ghc.haskell.org/trac/ghc/wiki/Commentary), for instance [Debugging/CompiledCode](https://ghc.haskell.org/trac/ghc/wiki/Debugging/CompiledCode) (highly recommended read!). Also see these [helpful gdb macros](https://ghc.haskell.org/trac/ghc/attachment/wiki/Debugging/CompiledCode/.gdbinit). The commentary also has page on [C--](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/CmmType). 115 | 116 | What we'd really like to see is a full stack trace, though. 117 | 118 | ### The x86 stack 119 | 120 | We start by doing a quick review how the stack works with most compiled languages on an x86 system. I'll just point to these two short articles for [x86](http://eli.thegreenplace.net/2011/02/04/where-the-top-of-the-stack-is-on-x86/) and [x86-64](http://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/). All clear? Good! 121 | 122 | Generating a stack trace for say a C/C++ program on x86 would simply involve looking at the program counter (also called 'PC' or 'instruction pointer', `eip` on x86, `rip` on x86-64) to get the top function and then following the singly linked-list of stack frames while collecting the return addresses along the way. The location of the first stack frame will be obtained from the frame pointer (`ebp` / `rbp` on x86 / x86-64). 123 | 124 | Here are three different implementations of this idea, hopefully making it very clear: 125 | 126 | * https://github.com/blitzcode/rsvp/blob/94c167bcf2a82092a8fe21eeb2e7a5a6b2d8fc77/src/prof.cpp#L1478 127 | * https://bitbucket.org/edd/dbg/src/1abb9939664c/src/osx/frames.cpp?at=default 128 | * http://opensource.apple.com/source/gdb/gdb-956/src/gdb/macosx/macosx-self-backtrace.c 129 | 130 | Frame pointers are fast and conceptually simple, but come with inherent limitations and don't work with the default code generation parameters of many modern compilers (e.g. `-fomit-frame-pointer`). Debuggers like `gdb` generally use special debug information (Call Frame Information, or CFI) or fall back on function prologue analysis. It's slower and vastly more complicated, but more reliable and doesn't require programs to be compiled with frame pointers explicitly enabled. 131 | 132 | Some information and source code regarding CFI: 133 | 134 | * http://gnu.wildebeest.org/blog/mjw/2007/08/23/stack-unwinding/ 135 | * http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/cfi_assembler.h 136 | * http://www.nongnu.org/libunwind/ 137 | 138 | For simplicity, we'll assume the presence of frame pointers (`-fno-omit-frame-pointer` for gcc and clang) and use that as out stack traversal method for non-Haskell code. 139 | 140 | ### It's not like that 141 | 142 | Please (temporarily) forget everything you just read about x86 stack traversal, as GHC has a different stack and uses a [different](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/CallingConvention) [calling convention](http://en.wikipedia.org/wiki/X86_calling_conventions). Apart from those implementation differences, there is a major conceptual difference as well. Haskell is a lazy language, and its stack is in evaluation order, not source code call order. Add [tail call optimization](http://en.wikipedia.org/wiki/Tail_call_optimization) into the mix, and it becomes clear why generating a stack trace for a GHC compiled program is a non-trivial problem. 143 | 144 | There's an ongoing discussion on how to improve GHC's diagnostic features with easily accessible stack traces (see [ExplicitCallStack](https://ghc.haskell.org/trac/ghc/wiki/ExplicitCallStack), [Ticket #3693](https://ghc.haskell.org/trac/ghc/ticket/3693) contains some very interesting work-in-progress patches to the compiler and RTS). 145 | 146 | ### Notes on RTS headers 147 | 148 | We'll soon start looking at various C header files from GHC's [RTS](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts). Links will be to HEAD versions of these files, including the in-line excerpts from them. That's perfectly fine for following the explanations here, but when it gets to actually poking around in memory you probably want to look at the header files shipping with your GHC. For me they're in `/Library/Frameworks/GHC.framework/Versions/7.6.3-i386/usr/lib/ghc-7.6.3/include/`, you might need to look elsewhere or download a [source tarball](http://www.haskell.org/ghc/download_ghc_7_6_3#sources). 149 | 150 | There's also the issue of different versions of the RTS. There are debug / no-debug, single / multi-threaded, profiling enabled / disabled and 32 / 64 bit versions of it. You'll need to know what the program you're inspecting links against. See the [Rts/Config](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Config) page for more details. 151 | 152 | Unlike most C APIs, the Haskell RTS one does not seem to be usable from C++. Among other issues, code in the headers uses identifiers like `new`. (Update: Fixed in HEAD, see [Ticket #8676](https://ghc.haskell.org/trac/ghc/ticket/8676#ticket)) 153 | 154 | Finally, [SourceTree/Includes](https://ghc.haskell.org/trac/ghc/wiki/Commentary/SourceTree/Includes) from the GHC commentary explains the general include structure. 155 | 156 | ### Locating GHC's stack 157 | 158 | How can we find GHC's stack in memory? The [STG](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/GeneratedCode) machine defines two registers of potential interest: `BaseReg` (or base pointer) and `Sp` (stack pointer). 159 | 160 | Quoting the GHC commentary: 161 | 162 | > * The STG registers 163 | > - There are rather a lot of registers here: more than can be practicably stored in actual available processor registers on most architectures. 164 | > - To deal with the lack of processor registers, most of the STG registers are actually kept on the stack in a block of memory pointed to by a special STG register called the "base register" (or BaseReg). To get or set values of registers which are not kept in processor registers, the STG machine generates an instruction to load or store from an address relative to the BaseReg. 165 | > - The most important four registers are the BaseReg, the stack pointer (Sp), the heap pointer (Hp), and the general purpose register R1 which is used for intermediate values, as well as for returning evaluated values when unwinding the stack. These are the four registers which are assigned actual processor registers when implementing the STG machine on x86. 166 | > * The STG stack 167 | > - Stores function arguments and continuations (i.e. the stack frames which are executed when a function returns) 168 | > - Grows downwards in memory 169 | > - The top of the stack is pointed to by the STG register Sp, and the maximum available stack pointer is stored in SpLim. There is no frame pointer. 170 | 171 | Also see [Rts/HaskellExecution/Registers](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/Registers). 172 | 173 | We can look at [MachRegs.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/stg/MachRegs.h) to see which architectural registers they map to. 174 | 175 | Relevant excerpt for x86: 176 | 177 | Ok, we've only got 6 general purpose registers, a frame pointer and a 178 | stack pointer. \tr{%eax} and \tr{%edx} are return values from C functions, 179 | hence they get trashed across ccalls and are caller saves. \tr{%ebx}, 180 | \tr{%esi}, \tr{%edi}, \tr{%ebp} are all callee-saves. 181 | 182 | Reg STG-Reg 183 | --------------- 184 | ebx Base 185 | ebp Sp 186 | esi R1 187 | edi Hp 188 | 189 | And for x86-64: 190 | 191 | %rax caller-saves, don't steal this one 192 | %rbx YES 193 | %rcx arg reg, caller-saves 194 | %rdx arg reg, caller-saves 195 | %rsi arg reg, caller-saves 196 | %rdi arg reg, caller-saves 197 | %rbp YES (our *prime* register) 198 | %rsp (unavailable - stack pointer) 199 | %r8 arg reg, caller-saves 200 | %r9 arg reg, caller-saves 201 | %r10 caller-saves 202 | %r11 caller-saves 203 | %r12 YES 204 | %r13 YES 205 | %r14 YES 206 | %r15 YES 207 | 208 | %xmm0-7 arg regs, caller-saves 209 | %xmm8-15 caller-saves 210 | 211 | Use the caller-saves regs for Rn, because we don't always have to 212 | save those (as opposed to Sp/Hp/SpLim etc. which always have to be 213 | saved). 214 | 215 | ### The Cost Center Stack 216 | 217 | Before we attempt to traverse the more unusual and complicated STG stack used for actually evaluating expressions, let's look at the cost center stack (CCS) which GHC's profiling RTS maintains. 218 | 219 | GHC's time profiler (`+RTS -p`) and its exception stack tracing (`+RTS -xc`) feature can show us conventional looking call stacks. This feature comes with with both time and space overhead, but it's probably in line with an unoptimized debug build of a C/C++ program. We can use this system to obtain better call stacks even in the case of a segfault. Now would be a good time to refresh your memory on GHC's profiling features (GHC Manual, [Chapter 5 - Profiling](http://www.haskell.org/ghc/docs/7.6.3/html/users_guide/profiling.html)) if you haven't looked at them in a while. 220 | 221 | We'll start by recompiling our earlier crashing program with profiling (`cabal configure --enable-executable-profiling` or just `-prof` to GHC, also add `-fprof-auto -caf-all` to the `ghc-prof-options` field / pass them to GHC). 222 | 223 | How do we get at the CCS when we're looking at a crashed program in `gdb`? Have a look at this excerpt from [Regs.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/stg/Regs.h): 224 | 225 | ```c 226 | /* 227 | * This is the table that holds shadow-locations for all the STG 228 | * registers. The shadow locations are used when: 229 | * 230 | * 1) the particular register isn't mapped to a real machine 231 | * register, probably because there's a shortage of real registers. 232 | * 2) caller-saves registers are saved across a CCall 233 | */ 234 | typedef struct { 235 | StgUnion rR1; 236 | StgUnion rR2; 237 | StgUnion rR3; 238 | StgUnion rR4; 239 | StgUnion rR5; 240 | StgUnion rR6; 241 | StgUnion rR7; 242 | StgUnion rR8; 243 | StgUnion rR9; /* used occasionally by heap/stack checks */ 244 | StgUnion rR10; /* used occasionally by heap/stack checks */ 245 | StgFloat rF1; 246 | StgFloat rF2; 247 | StgFloat rF3; 248 | StgFloat rF4; 249 | StgDouble rD1; 250 | StgDouble rD2; 251 | StgWord64 rL1; 252 | StgPtr rSp; 253 | StgPtr rSpLim; 254 | StgPtr rHp; 255 | StgPtr rHpLim; 256 | struct CostCentreStack_ * rCCCS; /* current cost-centre-stack */ 257 | struct StgTSO_ * rCurrentTSO; 258 | struct nursery_ * rNursery; 259 | struct bdescr_ * rCurrentNursery; /* Hp/HpLim point into this block */ 260 | struct bdescr_ * rCurrentAlloc; /* for allocation using allocate() */ 261 | StgWord rHpAlloc; /* number of *bytes* being allocated in heap */ 262 | StgWord rRet; /* holds the return code of the thread */ 263 | } StgRegTable; 264 | ``` 265 | 266 | This is the block of memory which `BaseReg` points at. From looking at `MachRegs.h` earlier we know this is `ebx` on x86 and `r13` on x86-x64. It contains references to everything from the heap to the STG stack (through the [TSO](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects#ThreadStateObjects)), but we're only interested in locating the CCS for now. We need to offset the base pointer so we're getting at the `rCCCS` field. If we're using the debug runtime we can directly use the structure above, but we might need to get by with an offset. To simplify this task, we can look at the special header `DerivedConstants.h` and find the offset constant we want: 267 | 268 | ```c 269 | #define OFFSET_StgRegTable_rCCCS 96 270 | ``` 271 | 272 | If you can't find this header, it's because you're looking at a clean source tarball or repository. The header is generated when GHC is compiled to match the memory layout of the target architecture. Read [SourceTree/Includes#DerivedConstants]([https://ghc.haskell.org/trac/ghc/wiki/Commentary/SourceTree/Includes#DerivedConstants) for details. 273 | 274 | Assuming we found our header and offset, let's proceed with `gdb`: 275 | 276 | (gdb) x/1a $ebx+96 277 | 0x21352c : 0x6fc578 278 | 279 | We now have a pointer to `CostCentreStack`. From [CCS.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/prof/CCS.h): 280 | 281 | ```c 282 | typedef struct CostCentreStack_ { 283 | StgInt ccsID; // unique ID, allocated by the RTS 284 | 285 | CostCentre *cc; // Cost centre at the top of the stack 286 | 287 | struct CostCentreStack_ *prevStack; // parent 288 | struct IndexTable_ *indexTable; // children 289 | struct CostCentreStack_ *root; // root of stack 290 | StgWord depth; // number of items in the stack 291 | 292 | StgWord64 scc_count; // Count of times this CCS is entered 293 | // align 8 (Note [struct alignment]) 294 | 295 | StgWord selected; // is this CCS shown in the heap 296 | // profile? (zero if excluded via -hc 297 | // -hm etc.) 298 | 299 | StgWord time_ticks; // number of time ticks accumulated by 300 | // this CCS 301 | 302 | StgWord64 mem_alloc; // mem allocated by this CCS 303 | // align 8 (Note [struct alignment]) 304 | 305 | StgWord64 inherited_alloc; // sum of mem_alloc over all children 306 | // (calculated at the end) 307 | // align 8 (Note [struct alignment]) 308 | 309 | StgWord inherited_ticks; // sum of time_ticks over all children 310 | // (calculated at the end) 311 | } CostCentreStack; 312 | ``` 313 | 314 | Lots of potentially interesting stuff, but we stay focussed and go one deeper, the pointer to `ConstCentre`: 315 | 316 | ```c 317 | typedef struct CostCentre_ { 318 | StgInt ccID; // Unique Id, allocated by the RTS 319 | 320 | char * label; 321 | char * module; 322 | char * srcloc; 323 | 324 | // used for accumulating costs at the end of the run... 325 | StgWord64 mem_alloc; // align 8 (Note [struct alignment]) 326 | StgWord time_ticks; 327 | 328 | StgInt is_caf; // non-zero for a CAF cost centre 329 | 330 | struct CostCentre_ *link; 331 | } CostCentre; 332 | ``` 333 | 334 | The `label`, `module` and `srcloc` members look just like what we need! Continuing where we left of with `gdb`, looking at `ccsID` and the actual top `CostCentre` pointer (`cc` from the `CostCentreStack` pointer we just got) 335 | 336 | (gdb) x/2a 0x6fc578 337 | 0x6fc578: 0x71 0x2095d8 338 | 339 | Diving into `ConstCentre`, now looking at `ccID`, `label`, `module`, and `srcloc` 340 | 341 | (gdb) x/4a 0x2095d8 342 | 0x2095d8 : 0x1 0x19f5a0 0x19f5ac 0x19f5b4 343 | 344 | Printing out the latter three as C strings 345 | 346 | (gdb) p (char *) 0x19f5a0 347 | $3 = 0x19f5a0 "someFuncC" 348 | (gdb) p (char *) 0x19f5ac 349 | $4 = 0x19f5ac "Main" 350 | (gdb) p (char *) 0x19f5b4 351 | $5 = 0x19f5b4 "Main.hs:(25,1)-(27,39)" 352 | 353 | Just what we wanted! We're now going up the stack one level, using `CostCentreStack`'s `prevStack` pointer (8 bytes into the structure, as specified by `#define OFFSET_CostCentreStack_prevStack 8`) 354 | 355 | (gdb) x 0x6fc578+8 356 | 0x6fc580: 0x6fc4d8 357 | (gdb) x 0x6fc4d8+4 358 | 0x6fc4dc: 0x2095fc 359 | (gdb) x/4a 0x2095fc 360 | 0x2095fc : 0x2 0x19f5cc 0x19f5d8 0x19f5e0 361 | (gdb) p (char *) 0x19f5cc 362 | $7 = 0x19f5cc "someFuncB" 363 | (gdb) p (char *) 0x19f5d8 364 | $8 = 0x19f5d8 "Main" 365 | (gdb) p (char *) 0x19f5e0 366 | $9 = 0x19f5e0 "Main.hs:(19,1)-(21,13)" 367 | 368 | And one more 369 | 370 | (gdb) x 0x6fc4d8+8 371 | 0x6fc4e0: 0x6fc398 372 | (gdb) x 0x6fc398+4 373 | 0x6fc39c: 0x209644 374 | (gdb) x/4a 0x209644 375 | 0x209644 : 0x4 0x19f624 0x19f62c 0x19f634 376 | (gdb) p (char *) 0x19f624 377 | $10 = 0x19f624 "main" 378 | (gdb) p (char *) 0x19f62c 379 | $11 = 0x19f62c "Main" 380 | (gdb) p (char *) 0x19f634 381 | $12 = 0x19f634 "Main.hs:(7,1)-(9,13)" 382 | 383 | The `prevStack` pointer now points to `0x211b40 `, we've reached the bottom. We now know the call stack was 384 | 385 | Main.someFuncC (Main.hs:(25,1)-(27,39)) 386 | Main.someFuncB (Main.hs:(19,1)-(21,13)) 387 | Main.main (Main.hs:(7, 1)-(9, 13)) 388 | 389 | If `someFuncC` was very large, we could do a second run after sprinkling in a few `{-# SCC ... #-}` annotations (documentation for the [SCC Pragma](http://www.haskell.org/ghc/docs/7.6.3/html/users_guide/profiling.html#scc-pragma)) to narrow down the location of the crash even further. Also note that we now got access to not only the stack, but all kinds of other RTS profiling data which could be scraped and displayed in realtime by an external tool. 390 | 391 | ### From Haskell into C and back 392 | 393 | One situation we're not addressing with the previous discussion of CCS inspection is how we deal with a crash inside of a [Foreign Function Interface (FFI)](http://book.realworldhaskell.org/read/interfacing-with-c-the-ffi.html) call. We can debug C/C++ code called from Haskell as usual, but how do we get the Haskell part of the stack? 394 | 395 | First, we'll modify the 'crash' test program by adding some C code. 396 | 397 | `crash.h`: 398 | 399 | ```c 400 | #ifndef CRASH_H 401 | #define CRASH_H 402 | 403 | __attribute__((noinline)) void someCFuncA(); 404 | __attribute__((noinline)) void someCFuncB(); 405 | 406 | #endif // CRASH_H 407 | ``` 408 | 409 | `crash.c`: 410 | 411 | ```c 412 | #include "crash.h" 413 | 414 | void someCFuncA() 415 | { 416 | someCFuncB(); 417 | someCFuncB(); 418 | } 419 | 420 | void someCFuncB() 421 | { 422 | * (char *) 1 = 0; 423 | } 424 | ``` 425 | 426 | Additions to `crash.cabal` 427 | 428 | ``` 429 | c-sources: crash.c 430 | cc-options: -std=c99 -g -O3 -fno-omit-frame-pointer 431 | ``` 432 | 433 | (we want debug symbols and frame pointers) 434 | 435 | Changes in `Main.hs` 436 | 437 | ```haskell 438 | -- unsafeWrite v 1000000000 (0 :: Int) -- Crash (hopefully) 439 | c_someCFuncA 440 | 441 | foreign import ccall unsafe "crash.h someCFuncA" 442 | c_someCFuncA :: IO () 443 | ``` 444 | 445 | `gdb` correctly shows the C part of the call stack 446 | 447 | $ gdb ./dist/build/crash/crash 448 | GNU gdb 6.3.50-20050815 (Apple version gdb-1515) (Sat Jan 15 08:33:48 UTC 2011) 449 | Copyright 2004 Free Software Foundation, Inc. 450 | GDB is free software, covered by the GNU General Public License, and you are 451 | welcome to change it and/or distribute copies of it under certain conditions. 452 | Type "show copying" to see the conditions. 453 | There is absolutely no warranty for GDB. Type "show warranty" for details. 454 | This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done 455 | 456 | (gdb) r 457 | Starting program: /Users/Tim/crash/dist/build/crash/crash 458 | Reading symbols for shared libraries ++. done 459 | 460 | Program received signal EXC_BAD_ACCESS, Could not access memory. 461 | Reason: KERN_PROTECTION_FAILURE at address: 0x00000001 462 | someCFuncB () at crash.c:24 463 | 24 * (char *) 1 = 0; 464 | (gdb) bt 465 | #0 someCFuncB () at crash.c:24 466 | #1 0x0000278b in someCFuncA () at crash.c:18 467 | #2 0x0000220e in s1Al_info () 468 | Previous frame inner to this frame (gdb could not unwind past this frame) 469 | 470 | The debugger could be used to walk the C part of the stack, but we're going to use frame pointers to demonstrate the method. 471 | 472 | (gdb) x/2a $ebp 473 | 0xbfffd6e8: 0xbfffd6f8 0x278b 474 | 475 | The frame pointer gives us our first frame link and return address pointers, establishing the someCFuncA -> someCFuncB call order. Moving on: 476 | 477 | (gdb) x/2a 0xbfffd6f8 478 | 0xbfffd6f8: 0x603cc8 0x220e 479 | 480 | The next frame shows we're retuning back into Haskell code (`s1Al_info`). Since Haskell does not use frame pointers, the space of the pointer to the next stack frame is simply occupied by the `Sp` register, pointing to the top of Haskell's STG stack. A detailed look at the STG stack will follow, but for now all we need to know is that the closure on the top will have a layout like this: 481 | 482 | ```c 483 | typedef struct StgClosure_ { 484 | StgHeader header; 485 | struct StgClosure_ *payload[FLEXIBLE_ARRAY]; 486 | } *StgClosurePtr; // StgClosure defined in Rts.h 487 | 488 | typedef struct { 489 | const StgInfoTable* info; 490 | #ifdef PROFILING 491 | StgProfHeader prof; 492 | #endif 493 | } StgHeader; 494 | 495 | typedef struct { 496 | CostCentreStack *ccs; 497 | union { 498 | struct _RetainerSet *rs; /* Retainer Set */ 499 | StgWord ldvw; /* Lag/Drag/Void Word */ 500 | } hp; 501 | } StgProfHeader; 502 | ``` 503 | 504 | Basically, the second word on the stack will point us to the CCS. Here: 505 | 506 | (gdb) x/2a 0x603cc8 507 | 0x603cc8: 0x21e8 0x6fc578 508 | 509 | And showing just enough of `CostCentreStack` to reveal the `CostCentre` pointer 510 | 511 | (gdb) x/2a 0x6fc578 512 | 0x6fc578: 0x71 0x2095d8 513 | 514 | We found the CCS and the cost center for someFuncC, allowing us to proceed as before. 515 | 516 | ### The STG stack 517 | 518 | So far we've only been working with the CCS, which is not available unless we've compiled our program with profiling enabled. What can we learn from looking at Haskell's native STG stack? 519 | 520 | Recall from our earlier discussion of registers (or read [Rts/HaskellExecution/Registers](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/Registers) again) that the virtual `Sp` register (`ebp` / `rbp` on x86 / x86-64) always points to the top of the stack. We can also use the `BaseReg` and get access to the current [TSO](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects#ThreadStateObjects), containing a pointer to the `StgStack` structure. In any case, we'll have a look with `gdb`: 521 | 522 | (gdb) x/32a $ebp 523 | 0x503cd8: 0x1d94 0x1e2c 0xefc6c 0x0 524 | 0x503ce8: 0x11bd82 0xf32d8 0xf2f34 0x11d79c 525 | 0x503cf8: 0x5038f0 0x503930 0x1 0x0 526 | 0x503d08: 0x11d79c 0x3 0x0 0x1 527 | 0x503d18: 0x300670 0x11e440 0x11d798 0x11d79c 528 | 0x503d28: 0x11d79c 0xec 0xf321c 0x41 529 | 0x503d38: 0x42 0x11c0d5 0x11c0d5 0x502fba 530 | 0x503d48: 0x11c0d5 0x11c0d5 0x11c0d5 0x11c0d5 The stack consists of a sequence of stack frames (also sometimes called activation records) where each frame has the same layout as a heap object: 535 | > Header Payload... 536 | 537 | and 538 | 539 | > Stack frames therefore have bitmap layout. 540 | 541 | Ok, moving on to the documentation on heap objects, i.e. [Rts/Storage/HeapObjects](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects): 542 | 543 | > All heap objects have the same basic layout, embodied by the type StgClosure in [Closures.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/Closures.h). 544 | 545 | ```c 546 | /* All closures follow the generic format */ 547 | 548 | typedef struct StgClosure_ { 549 | StgHeader header; 550 | struct StgClosure_ *payload[FLEXIBLE_ARRAY]; 551 | } *StgClosurePtr; // StgClosure defined in Rts.h 552 | ``` 553 | 554 | > A heap object always begins with a header, defined by `StgHeader` in [Closures.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/Closures.h) 555 | 556 | ```c 557 | typedef struct { 558 | const StgInfoTable* info; 559 | #ifdef PROFILING 560 | StgProfHeader prof; 561 | #endif 562 | } StgHeader; 563 | ``` 564 | 565 | When looking at these structures, for now assume `PROFILING`, `TICKY` and `DEBUG_CLOSURE` to be undefined, while assuming `TABLES_NEXT_TO_CODE` to be defined. See the [Rts/Config](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Config) page for more details. 566 | 567 | Also worth mentioning is the issue of pointer tagging, see [Rts/HaskellExecution/PointerTagging](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging). It shouldn't affect us much for our basic stack traversal, but it's important to know about when poking around further in GHC's heap. 568 | 569 | Now we know that when we're looking at the top of our stack and see `0x1d94 `, we actually have a pointer to an `StgInfoTable`, followed by a 'payload' array of closures. What are 'info tables' anyway? Have a quick look at [Rts/Storage/HeapObjects#InfoTables](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects#InfoTables) and come back. 570 | 571 | Let's look at the definitions, from [InfoTables.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/InfoTables.h): 572 | 573 | ```c 574 | /* 575 | * The "standard" part of an info table. Every info table has this bit. 576 | */ 577 | typedef struct StgInfoTable_ { 578 | 579 | #if !defined(TABLES_NEXT_TO_CODE) 580 | StgFunPtr entry; /* pointer to the entry code */ 581 | #endif 582 | 583 | #ifdef PROFILING 584 | StgProfInfo prof; 585 | #endif 586 | #ifdef TICKY 587 | /* Ticky-specific stuff would go here. */ 588 | #endif 589 | #ifdef DEBUG_CLOSURE 590 | /* Debug-specific stuff would go here. */ 591 | #endif 592 | 593 | StgClosureInfo layout; /* closure layout info (one word) */ 594 | 595 | StgHalfWord type; /* closure type */ 596 | StgHalfWord srt_bitmap; 597 | /* In a CONSTR: 598 | - the constructor tag 599 | In a FUN/THUNK 600 | - a bitmap of SRT entries 601 | */ 602 | 603 | #ifdef TABLES_NEXT_TO_CODE 604 | StgCode code[FLEXIBLE_ARRAY]; 605 | #endif 606 | } *StgInfoTablePtr; 607 | ``` 608 | 609 | Info tables for stack frames / return addresses have a special layout, though. It is defined by `StgRetInfoTable`: 610 | 611 | ```c 612 | typedef struct { 613 | #if defined(TABLES_NEXT_TO_CODE) 614 | OFFSET_FIELD(srt_offset); /* offset to the SRT table */ 615 | StgInfoTable i; 616 | #else 617 | StgInfoTable i; 618 | StgSRT *srt; /* pointer to the SRT table */ 619 | #endif 620 | } StgRetInfoTable; 621 | ``` 622 | 623 | Now, why does `StgRetInfoTable` prepend an additional field instead of append? It would seem more logical to tack it on at the end so we can first treat all info tables as an `StgInfoTable`, inspect the `type` field and then cast to the more specific structure and access additional fields. It has to do with `TABLES_NEXT_TO_CODE` (documented in [Rts/Config](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Config) and [Rts/Storage/HeapObjects#TABLES_NEXT_TO_CODE](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects#TABLES_NEXT_TO_CODE)). Have a look at this excerpt from [ClosuresMacros.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/ClosureMacros.h): 624 | 625 | ```c 626 | /* ----------------------------------------------------------------------------- 627 | Info tables are slammed up against the entry code, and the label 628 | for the info table is at the *end* of the table itself. This 629 | inline function adjusts an info pointer to point to the beginning 630 | of the table, so we can use standard C structure indexing on it. 631 | 632 | Note: this works for SRT info tables as long as you don't want to 633 | access the SRT, since they are laid out the same with the SRT 634 | pointer as the first word in the table. 635 | 636 | NOTES ABOUT MANGLED C VS. MINI-INTERPRETER: 637 | 638 | A couple of definitions: 639 | 640 | "info pointer" The first word of the closure. Might point 641 | to either the end or the beginning of the 642 | info table, depending on whether we're using 643 | the mini interpretter or not. GET_INFO(c) 644 | retrieves the info pointer of a closure. 645 | 646 | "info table" The info table structure associated with a 647 | closure. This is always a pointer to the 648 | beginning of the structure, so we can 649 | use standard C structure indexing to pull out 650 | the fields. get_itbl(c) returns a pointer to 651 | the info table for closure c. 652 | 653 | An address of the form xxxx_info points to the end of the info 654 | table or the beginning of the info table depending on whether we're 655 | mangling or not respectively. So, 656 | 657 | c->header.info = xxx_info 658 | 659 | makes absolute sense, whether mangling or not. 660 | 661 | -------------------------------------------------------------------------- */ 662 | ... 663 | #ifdef TABLES_NEXT_TO_CODE 664 | #define INFO_PTR_TO_STRUCT(info) ((StgInfoTable *)(info) - 1) 665 | #define RET_INFO_PTR_TO_STRUCT(info) ((StgRetInfoTable *)(info) - 1) 666 | ... 667 | #else 668 | #define INFO_PTR_TO_STRUCT(info) ((StgInfoTable *)info) 669 | #define RET_INFO_PTR_TO_STRUCT(info) ((StgRetInfoTable *)info) 670 | ... 671 | #endif 672 | 673 | ``` 674 | 675 | Our pointer actually points to the closure's entry code (`StgInfoTable.code`), and the info table prepends it. We need to subtract the structure size from it. We can also see this in the [gdb macros](https://ghc.haskell.org/trac/ghc/attachment/wiki/Debugging/CompiledCode/.gdbinit) mentioned earlier: 676 | 677 | define pinfo 678 | p *((StgInfoTable *)$arg0-1) 679 | end 680 | 681 | define prinfo 682 | p *((StgRetInfoTable *)$arg0-1) 683 | end 684 | 685 | GHC's runtime does 'append' the additional fields, it's just the definition of beginning and end that are reversed, like with top / bottom of the stack. 686 | 687 | To go over the stack we need to know the payload size coming after the `StgHeader`. Remember that stack frames have [bitmap layout](https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects#Bitmaplayout) (also please have a quick look at that). The `StgClosureInfo` structure from the info table tells us about the size of the payload: 688 | 689 | ```c 690 | /* 691 | * Stuff describing the closure layout. Well, actually, it might 692 | * contain the selector index for a THUNK_SELECTOR. This union is one 693 | * word long. 694 | */ 695 | typedef union { 696 | struct { /* Heap closure payload layout: */ 697 | StgHalfWord ptrs; /* number of pointers */ 698 | StgHalfWord nptrs; /* number of non-pointers */ 699 | } payload; 700 | 701 | StgWord bitmap; /* word-sized bit pattern describing */ 702 | /* a stack frame: see below */ 703 | 704 | #ifndef TABLES_NEXT_TO_CODE 705 | StgLargeBitmap* large_bitmap; /* pointer to large bitmap structure */ 706 | #else 707 | OFFSET_FIELD(large_bitmap_offset); /* offset from info table to large bitmap structure */ 708 | #endif 709 | 710 | StgWord selector_offset; /* used in THUNK_SELECTORs */ 711 | 712 | } StgClosureInfo; 713 | ``` 714 | 715 | The payload size can be extracted from the size bits from `bitmap`, preferably using the `BITMAP_SIZE` macro as the layout of the word changes between 32 / 64 bit. This will get us pretty far, but there are some closures on the stack that require special treatment. Fortunately, the RTS provides a convenient function taking care of it all. Looking at `stack_frame_sizeW` from [ClosuresMacros.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/ClosureMacros.h): 716 | 717 | ```c 718 | /* ----------------------------------------------------------------------------- 719 | Sizes of stack frames 720 | -------------------------------------------------------------------------- */ 721 | 722 | EXTERN_INLINE StgWord stack_frame_sizeW( StgClosure *frame ); 723 | EXTERN_INLINE StgWord stack_frame_sizeW( StgClosure *frame ) 724 | { 725 | StgRetInfoTable *info; 726 | 727 | info = get_ret_itbl(frame); 728 | switch (info->i.type) { 729 | 730 | case RET_DYN: 731 | { 732 | StgRetDyn *dyn = (StgRetDyn *)frame; 733 | return sizeofW(StgRetDyn) + RET_DYN_BITMAP_SIZE + 734 | RET_DYN_NONPTR_REGS_SIZE + 735 | RET_DYN_PTRS(dyn->liveness) + RET_DYN_NONPTRS(dyn->liveness); 736 | } 737 | 738 | case RET_FUN: 739 | return sizeofW(StgRetFun) + ((StgRetFun *)frame)->size; 740 | 741 | case RET_BIG: 742 | return 1 + GET_LARGE_BITMAP(&info->i)->size; 743 | 744 | case RET_BCO: 745 | return 2 + BCO_BITMAP_SIZE((StgBCO *)((P_)frame)[1]); 746 | 747 | default: 748 | return 1 + BITMAP_SIZE(info->i.layout.bitmap); 749 | } 750 | } 751 | ``` 752 | 753 | The W postfix means the function returns the size in words, not bytes. 754 | 755 | How do we know when we reached the bottom of the stack? If we have access to the `StgStack` structure (i.e. from the TSO) 756 | 757 | ```c 758 | typedef struct StgStack_ { 759 | StgHeader header; 760 | StgWord32 stack_size; // stack size in *words* 761 | StgWord32 dirty; // non-zero => dirty 762 | StgPtr sp; // current stack pointer 763 | StgWord stack[FLEXIBLE_ARRAY]; 764 | } StgStack; 765 | ``` 766 | 767 | it's rather obvious as we have the stack pointer, start and size of the current stack chunk. If not, we can look for a closure of type `STOP_FRAME` and use this as our bottom marker. 768 | 769 | One last type of stack frame we need to discuss is the underflow frame. The STG stack is implemented as a linked-list of chunks (see [Implement stack chunks and separate TSO/STACK objects](https://ghc.haskell.org/trac/ghc/changeset/f30d527344db528618f64a25250a3be557d9f287/ghc) and [this](https://ghc.haskell.org/trac/ghc/blog/stack-chunks) post from the [GHC blog](https://ghc.haskell.org/trac/ghc/blog/)). Here's the closure of [type](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/ClosureTypes.h) `UNDERFLOW_FRAME` from [Closures.h](https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/Closures.h): 770 | 771 | ```c 772 | typedef struct { 773 | const StgInfoTable* info; 774 | struct StgStack_ *next_chunk; 775 | } StgUnderflowFrame; 776 | ``` 777 | 778 | The contained stack object tells us everything we need to know to continue traversing. 779 | 780 | GHC already has some internal code for printing out various heap and stack objects, see [Printer.h](https://ghc.haskell.org/trac/ghc/browser/ghc/rts/Printer.h) and [Printer.c](https://ghc.haskell.org/trac/ghc/browser/ghc/rts/Printer.c). With functions like `printStackChunk` and `prettyPrintClosure` its basically a cookbook for how to traverse and trace the contents of RTS objects. 781 | 782 | All basic building blocks for traversing the STG stack are in place. It's still fairly difficult to extract useful information from it, but it's the best we have without a profiling RTS. 783 | 784 | ### Automatic stack capture 785 | 786 | The last remaining task is to write some code to automate the often tedious steps required to extract the call stack. There are at least three different ways we could go about this. 787 | 788 | One approach would be to automate `gdb` through its [scripting capabilities](http://stackoverflow.com/questions/10748501/automating-gdb-sessions) and [machine interface](https://sourceware.org/gdb/onlinedocs/gdb/GDB_002fMI.html). While this is certainly viable for debugging, it might be a bit slow for more frequent, performance sensitive extraction of call stack information (i.e. for a profiler). 789 | 790 | Another implementation is installing a [signal handler](http://en.wikipedia.org/wiki/Unix_signal) inside our program, catching the `SIGBUS` / `SIGSEGV` / etc. signals and performing the stack walk right there. This is complicated by the short list of functions which can be [safely called from a signal handler](https://www.securecoding.cert.org/confluence/display/seccode/SIG30-C.+Call+only+asynchronous-safe+functions+within+signal+handlers). Given the nature of our signals, the heap might already be corrupted and any information retrieval difficult to carry out. 791 | 792 | The method this document will actually describe is writing a minimal stand-alone debugger, inspecting the crashed process from the outside. A more complicated solution, but the most flexible and versatile one. 793 | 794 | ### atos 795 | 796 | Outside of `gdb`, we need a way for our program to translate addresses into symbol names. That gap will be filled by the OS X [atos](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/atos.1.html) command line utility. With a live process, it can easily resolve any address. 797 | 798 | Linux provides atos-like functionality through [`addr2line`](http://www.linuxcommand.org/man_pages/addr2line1.html), Windows has the `Sym*` APIs from [DbgHelp](http://msdn.microsoft.com/en-us/library/windows/desktop/ms679291%28v=vs.85%29.aspx). 799 | 800 | The `atos` communication, wrapping and parsing code will be handled by the [ATOS_Pipe](https://github.com/blitzcode/rsvp/blob/94c167bcf2a82092a8fe21eeb2e7a5a6b2d8fc77/src/prof.cpp#L128) and [SymbolManager](https://github.com/blitzcode/rsvp/blob/94c167bcf2a82092a8fe21eeb2e7a5a6b2d8fc77/src/prof.cpp#L181) classes from [rsvp](https://github.com/blitzcode/rsvp/). 801 | 802 | With those classes, getting the symbol name string for a program counter value is very simple: 803 | 804 | ```c 805 | symbol->SymbolIDToName(symbol->AddressToSymbolID(x86ts32.__eip)) 806 | ``` 807 | 808 | ### Inspecting another process 809 | 810 | To apply the stack traversal and symbol resolution methods described, we need to have access to the memory, threads and registers of the debugged process. We also need to be notified once it segfaults. On Linux the [`ptrace`](http://linux.die.net/man/2/ptrace) call should be able to provide everything we need. Unfortunately, OS X has a rather [gimped implementation](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/ptrace.2.html) of that API. See [uninformed.org](http://uninformed.org/index.cgi?v=4&a=3&p=14) for a discussion of workarounds for OS X, and this [Linux Journal](http://www.linuxjournal.com/article/6100) article for an explanation of how it works on Linux. 811 | 812 | Our usage of `ptrace` boils down to initiating the tracing relationship with the child like this 813 | 814 | ```c 815 | pid_t child = fork(); 816 | if (child == 0) 817 | { 818 | // Let parent trace us 819 | ptrace(PT_TRACE_ME, 0, NULL, NULL); 820 | 821 | if (execv(argv[1], &argv[1]) == -1) 822 | assert("execl() failed, can't load child executable"); 823 | } 824 | else 825 | { 826 | DebugLoop(); 827 | kill(child, SIGKILL); 828 | } 829 | ``` 830 | 831 | and then letting the child continue every time we receive a signal from it (`waitpid`) 832 | 833 | ```c 834 | void PTraceContinue(pid_t pid) 835 | { 836 | // Child process stopped after signalling us, let it proceed 837 | ptrace(PT_CONTINUE, pid, (caddr_t) 1, 0); 838 | } 839 | ``` 840 | 841 | The `DebugLoop` function handling events from the child / target process is structured as follows 842 | 843 | ```c 844 | void DebugLoop(const TargetInfo& ti) 845 | { 846 | while (true) 847 | { 848 | int status; 849 | waitpid(ti.m_pid, &status, 0); 850 | 851 | // Handle signal 852 | if (WIFEXITED(status)) 853 | break; 854 | else if (WIFSIGNALED(status)) 855 | break; 856 | else if (WIFSTOPPED(status)) 857 | { 858 | const int sig = WSTOPSIG(status); 859 | switch (sig) 860 | { 861 | // Fault? 862 | case SIGSEGV: 863 | case SIGBUS: 864 | case SIGILL: 865 | case SIGFPE: 866 | case SIGSYS: 867 | TraceProcessState(ti); 868 | return; 869 | // Continue 870 | default: 871 | PTraceContinue(ti.m_pid); 872 | break; 873 | } 874 | } 875 | } 876 | } 877 | ``` 878 | 879 | ### Mach 880 | 881 | We'll need to replace some of the missing `ptrace` functionality through [Mach kernel](http://en.wikipedia.org/wiki/Mach_%28kernel%29) functions. Mach is the underlying kernel on which OS X implements its BSD / POSIX layer. Mach in OS X is somewhat sparsely documented, but see [uninformed.org](http://uninformed.org/index.cgi?v=4&a=3), [developer.apple.com](https://developer.apple.com/library/mac/documentation/Darwin/Conceptual/KernelProgramming/About/About.html#//apple_ref/doc/uid/TP30000905-CH204-TPXREF101), [Mach IPC Interface](http://web.mit.edu/darwin/src/modules/xnu/osfmk/man/) and [Mac OS X Internals: A Systems Approach](http://www.amazon.com/Mac-OS-Internals-Systems-Approach/dp/0321278542/) (a somewhat dated but still useful book). The Mach APIs can provide us with full access to the debugged process. 882 | 883 | For Windows, similar functionality can be accessed through APIs like `CreateRemoteThread`, `SuspendThread`, `GetThreadContext`, `ResumeThread` and `StackWalk64`. 884 | 885 | ### Mach crash course 886 | 887 | A full Mach kernel API tutorial is out of scope for this document, but here are the bits we need for stack capture. 888 | 889 | The Mach headers can be found in `/usr/include/mach/` and subdirectories. Include like this 890 | 891 | ```c 892 | #include 893 | #include // for mach_vm_ instead of vm_ 894 | ``` 895 | 896 | We need a port for the task to be debugged 897 | 898 | ```c 899 | task_for_pid(mach_task_self(), pid, &m_task_port); 900 | ``` 901 | 902 | Obtain a list of threads from the target process 903 | 904 | ```c 905 | thread_act_port_array_t thread_list; 906 | mach_msg_type_number_t thread_count; 907 | task_threads(ti.m_task_port, &thread_list, &thread_count); 908 | ``` 909 | 910 | Query thread scheduling etc. information 911 | 912 | ```c 913 | thread_basic_info tbi; 914 | mach_msg_type_number_t thread_info_count = THREAD_BASIC_INFO_COUNT; 915 | thread_info(thread_list[thread_idx], 916 | THREAD_BASIC_INFO, 917 | (thread_info_t) &tbi, 918 | &thread_info_count); 919 | ``` 920 | 921 | Skip threads which were not running when we got the fault 922 | 923 | ```c 924 | if (tbi.flags & TH_FLAGS_SWAPPED) 925 | continue; 926 | ``` 927 | 928 | Query thread exception information (trap number, fault address etc.) 929 | 930 | ```c 931 | x86_exception_state32_t x86es32; 932 | mach_msg_type_number_t exception_state_count32 = x86_EXCEPTION_STATE32_COUNT; 933 | thread_get_state(thread_list[thread_idx], 934 | x86_EXCEPTION_STATE32, 935 | (thread_state_t) &x86es32, 936 | &exception_state_count32); 937 | ``` 938 | 939 | Query thread registers (program counter, frame pointer, etc.) 940 | 941 | ```c 942 | x86_thread_state32_t x86ts32; 943 | mach_msg_type_number_t thread_state_count32 = x86_THREAD_STATE32_COUNT; 944 | thread_get_state(thread_list[thread_idx], 945 | x86_THREAD_STATE32, 946 | (thread_state_t) &x86ts32, 947 | &thread_state_count32); 948 | ``` 949 | 950 | And finally, read memory from the target process 951 | 952 | ```c 953 | kern_return_t ReadMemory(mach_port_name_t task, target_ptr_t address, target_ptr_t size, void *out) 954 | { 955 | mach_vm_size_t inout_size = size; 956 | kern_return_t ret = mach_vm_read_overwrite( 957 | task, 958 | address, 959 | inout_size, 960 | (mach_vm_address_t) (uintptr_t) out, 961 | &inout_size); 962 | if (ret != KERN_SUCCESS) 963 | return ret; 964 | if (inout_size != size) 965 | return KERN_FAILURE; 966 | return KERN_SUCCESS; 967 | } 968 | ``` 969 | 970 | ### Other obstacles 971 | 972 | How do we know if our target process is using a profiling RTS or not (CCS vs STG stack traversal)? We can't just call `rts_isProfiled`, but we can use [`nm`](http://unixhelp.ed.ac.uk/CGI/man-cgi?nm) to check for certain symbols, like this: 973 | 974 | ```c++ 975 | bool DoesSymbolExist(const char *executable, const char *symbol) 976 | { 977 | // Use nm to check if a symbol exists 978 | char buf[1024]; 979 | std::snprintf(buf, sizeof(buf), "nm %s | grep %s", executable, symbol); 980 | std::FILE *pipe = popen(buf, "r"); 981 | assert(pipe != NULL); 982 | bool ret = std::fread(buf, 1, sizeof(buf), pipe) != 0; 983 | pclose(pipe); 984 | return ret; 985 | } 986 | 987 | bool IsProfilingRTS(const char *exe) { return DoesSymbolExist(exe, "_CCS_MAIN" ); } 988 | 989 | bool IsThreadedRTS (const char *exe) { return DoesSymbolExist(exe, "_createOSThread"); } 990 | ``` 991 | 992 | When our target process stops with a fault, how do we know if we're in Haskell or C etc. code? I don't really have a good answer for that, but this heuristic seems to work in pratice: 993 | 994 | ```c++ 995 | bool IsHaskellSymbol( 996 | const TargetInfo& ti, 997 | SymbolManager *symbol, 998 | uint32_t sym_id, 999 | uint16_t file_name_id) 1000 | { 1001 | // Haskell functions don't have source file debug information 1002 | if (std::strcmp(symbol->FileIDToName(file_name_id), symbol->GetUnresolvedSymbolName()) != 0) 1003 | return false; 1004 | // Assume that all Haskell functions come from our executable 1005 | if (strcasecmp(symbol->SymbolIDToModule(sym_id), ti.m_module.c_str()) != 0) 1006 | return false; 1007 | // Assume that all Haskell functions have certain tokens in their name 1008 | if (std::strstr(symbol->SymbolIDToName(sym_id), "_info") == NULL) 1009 | return false; 1010 | 1011 | return true; 1012 | } 1013 | ``` 1014 | 1015 | ### The Glorious Haskell Debugger v0.0.1 Pre-alpha 1016 | 1017 | For the finale, here is a **proof-of-concept** implementation of everything we discussed so far. **It is kinda cobbled together, has many limitations and missing features, but can serve as an example and starting point**. With our discussion of stack traversal, the references given above for `atos` / `ptrace` and the Mach APIs, you will hopefully be able to follow along. 1018 | 1019 | As with everything else so far, this is for OS X, GHC 7.6.3, 32bit. There's no support for the LLVM backend. The many hints for other platforms throughout this document should make porting fairly smooth, though. 1020 | 1021 | The full source code is in the [ghd folder](https://github.com/blitzcode/ghc-stack/tree/master/ghd). 1022 | 1023 | Build with `make`. Run with `sudo ghd ./myfaultyexecutable paramsformyfaultyexecutable`. 1024 | 1025 | ### Example stack traces 1026 | 1027 | Let's look at a few examples of stack traces produced by `ghd`. 1028 | 1029 | Our 'crash' program, error in Haskell code, profiling (CCS) enabled: 1030 | 1031 | ``` 1032 | ghd: Debugging '../crash/dist/build/crash/crash', PID 50929 1033 | ghd: RTS - Threaded: Yes | Profiling: Yes 1034 | ghd: Received signal 'SIGTRAP - trace trap' from child, continuing 1035 | ghd: Child stopped on fault 'SIGSEGV - segmentation violation' 1036 | ghd: Attempting to run stack trace 1037 | ghd: --------------- 1038 | ghd: Thread 1 of 3 1039 | ghd: --------------- 1040 | ghd: Status - Waiting 1041 | ghd: ExceptionState - trapno: 14 (EXC_I386_PGFLT - page fault), err: 6, faultvaddr: 0xeecb89bc 1042 | ghd: Registers 1043 | ghd: eax: 7325048, ebx: 2176268, ecx: 6316880, edx: 404, edi: 6314680, esi: 6316456, ebp: 0x603cc8 1044 | ghd: esp: 0xbfffd81c, ss: 31, eflags: 66070, eip: 0x21ee , cs: 23, ds: 31 1045 | ghd: es: 31, fs: 0, gs: 55 1046 | ghd: Stack Trace 1047 | ghd: 0x21ee [Haskell, switching to CCS] 1048 | ghd: CCS:0x6fc578 from Main (Main.hs:(25,1)-(27,39)) 1049 | ghd: CCS:0x6fc4d8 from Main (Main.hs:(19,1)-(21,13)) 1050 | ghd: CCS:0x6fc398
from Main (Main.hs:(7,1)-(9,13)) 1051 | ghd: CCS:0x211b60
from MAIN () 1052 | ghd: Stack trace done, exiting 1053 | ``` 1054 | 1055 | Our 'crash' program, error in C code, profiling (CCS) enabled: 1056 | 1057 | ``` 1058 | ghd: Debugging '../crash/dist/build/crash/crash', PID 50987 1059 | ghd: RTS - Threaded: Yes | Profiling: Yes 1060 | ghd: Received signal 'SIGTRAP - trace trap' from child, continuing 1061 | ghd: Child stopped on fault 'SIGBUS - bus error' 1062 | ghd: Attempting to run stack trace 1063 | ghd: --------------- 1064 | ghd: Thread 1 of 3 1065 | ghd: --------------- 1066 | ghd: Status - Waiting 1067 | ghd: ExceptionState - trapno: 14 (EXC_I386_PGFLT - page fault), err: 6, faultvaddr: 0x1 1068 | ghd: Registers 1069 | ghd: eax: 6316476, ebx: 2176268, ecx: 100, edx: 404, edi: 6314680, esi: 6316456, ebp: 0xbfffd7f8 1070 | ghd: esp: 0xbfffd7f8, ss: 31, eflags: 66182, eip: 0x2773 , cs: 23, ds: 31 1071 | ghd: es: 31, fs: 0, gs: 55 1072 | ghd: Stack Trace 1073 | ghd: 0x2773 from crash (crash.c:12) 1074 | ghd: 0x278b from crash (crash.c:8) 1075 | ghd: 0x220e [Haskell, switching to CCS] 1076 | ghd: CCS:0x6fc578 from Main (Main.hs:(25,1)-(29,16)) 1077 | ghd: CCS:0x6fc4d8 from Main (Main.hs:(19,1)-(21,13)) 1078 | ghd: CCS:0x6fc398
from Main (Main.hs:(7,1)-(9,13)) 1079 | ghd: CCS:0x211b60
from MAIN () 1080 | ghd: Stack trace done, exiting 1081 | ``` 1082 | 1083 | Our 'crash' program, error in C code, profiling (CCS) disabled: 1084 | 1085 | ``` 1086 | ghd: Debugging '../crash/dist/build/crash/crash', PID 50779 1087 | ghd: RTS - Threaded: Yes | Profiling: No 1088 | ghd: Received signal 'SIGTRAP - trace trap' from child, continuing 1089 | ghd: Child stopped on fault 'SIGBUS - bus error' 1090 | ghd: Attempting to run stack trace 1091 | ghd: --------------- 1092 | ghd: Thread 1 of 2 1093 | ghd: --------------- 1094 | ghd: Status - Waiting 1095 | ghd: ExceptionState - trapno: 14 (EXC_I386_PGFLT - page fault), err: 6, faultvaddr: 0x1 1096 | ghd: Registers 1097 | ghd: eax: 5267468, ebx: 1172556, ecx: 100, edx: 404, edi: 5263432, esi: 5267456, ebp: 0xbfffd7f8 1098 | ghd: esp: 0xbfffd7f8, ss: 31, eflags: 66182, eip: 0x1f23 , cs: 23, ds: 31 1099 | ghd: es: 31, fs: 0, gs: 55 1100 | ghd: Stack Trace 1101 | ghd: 0x1f23 from crash (crash.c:12) 1102 | ghd: 0x1f3b from crash (crash.c:8) 1103 | ghd: 0x1d90 [Haskell, switching to STG stack] 1104 | ghd: 0x503cd8 (RET_SMALL, 4b) 1105 | ghd: 0x503cdc (RET_SMALL, 4b) 1106 | ghd: 0x503ce0 (CATCH_FRAME, 12b, ) 1107 | ghd: 0x503cec (STOP_FRAME, 4b) 1108 | ghd: Stack trace done, exiting 1109 | ``` 1110 | 1111 | Our 'crash' program, error in Haskell code, profiling (CCS) disabled: 1112 | 1113 | ``` 1114 | ghd: Debugging '../crash/dist/build/crash/crash', PID 50828 1115 | ghd: RTS - Threaded: Yes | Profiling: No 1116 | ghd: Received signal 'SIGTRAP - trace trap' from child, continuing 1117 | ghd: Child stopped on fault 'SIGSEGV - segmentation violation' 1118 | ghd: Attempting to run stack trace 1119 | ghd: --------------- 1120 | ghd: Thread 1 of 2 1121 | ghd: --------------- 1122 | ghd: Status - Waiting 1123 | ghd: ExceptionState - trapno: 14 (EXC_I386_PGFLT - page fault), err: 6, faultvaddr: 0xeebb880c 1124 | ghd: Registers 1125 | ghd: eax: 5267456, ebx: 1172556, ecx: 5267872, edx: 404, edi: 5263432, esi: 5267456, ebp: 0x503cd8 1126 | ghd: esp: 0xbfffd81c, ss: 31, eflags: 66070, eip: 0x1d70 , cs: 23, ds: 31 1127 | ghd: es: 31, fs: 0, gs: 55 1128 | ghd: Stack Trace 1129 | ghd: 0x1d70 [Haskell, switching to STG stack] 1130 | ghd: 0x503cd8 (RET_SMALL, 4b) 1131 | ghd: 0x503cdc (RET_SMALL, 4b) 1132 | ghd: 0x503ce0 (CATCH_FRAME, 12b, ) 1133 | ghd: 0x503cec (STOP_FRAME, 4b) 1134 | ghd: Stack trace done, exiting 1135 | ``` 1136 | 1137 | My [Game of Life](https://github.com/blitzcode/haskell-gol/tree/master/parallel-glfwb) program, error in C++ code, profiling (CCS) disabled: 1138 | 1139 | ``` 1140 | ghd: Debugging '/Users/Tim/haskell-gol/parallel-glfwb/src/out', PID 51007 1141 | ghd: RTS - Threaded: Yes | Profiling: No 1142 | ghd: Received signal 'SIGTRAP - trace trap' from child, continuing 1143 | ghd: Child stopped on fault 'SIGBUS - bus error' 1144 | ghd: Attempting to run stack trace 1145 | ghd: --------------- 1146 | ghd: Thread 3 of 7 1147 | ghd: --------------- 1148 | ghd: Status - Waiting 1149 | ghd: ExceptionState - trapno: 14 (EXC_I386_PGFLT - page fault), err: 6, faultvaddr: 0x1 1150 | ghd: Registers 1151 | ghd: eax: 256, ebx: 16486418, ecx: 16486162, edx: 16416529, edi: 16485906, esi: 255, ebp: 0xb0100d78 1152 | ghd: esp: 0xb0100d4c, ss: 31, eflags: 66118, eip: 0x3bb44 , cs: 23, ds: 31 1153 | ghd: es: 31, fs: 31, gs: 55 1154 | ghd: Stack Trace 1155 | ghd: 0x3bb44 from out (gol.cpp:36) 1156 | ghd: 0x3bf3d from out (gol.cpp:111) 1157 | ghd: 0x365b3 [Haskell, switching to STG stack] 1158 | ghd: 0x15163400 (RET_SMALL, 36b) 1159 | ghd: 0x15163424 (RET_SMALL, 4b) 1160 | ghd: 0x15163428 (CATCH_FRAME, 12b, ) 1161 | ghd: 0x15163434 (RET_SMALL, 12b) 1162 | ghd: 0x15163440 (RET_SMALL, 4b) 1163 | ghd: 0x15163444 (RET_SMALL, 52b) 1164 | ghd: 0x15163478 (RET_SMALL, 52b) 1165 | ghd: 0x151634ac (CATCH_FRAME, 12b, ) 1166 | ghd: 0x151634b8 (STOP_FRAME, 4b) 1167 | ghd: Stack trace done, exiting 1168 | ``` 1169 | 1170 | My [twitter client](https://github.com/blitzcode/jacky), error in Haskell code, profiling 1171 | (CCS) enabled: 1172 | 1173 | ``` 1174 | ghd: Debugging './jacky', PID 47395 1175 | ghd: RTS - Threaded: Yes | Profiling: Yes 1176 | ghd: Received signal 'SIGTRAP - trace trap' from child, continuing 1177 | ghd: Received signal 'SIGCHLD - child status has changed' from child, continuing 1178 | ghd: Child stopped on fault 'SIGBUS - bus error' 1179 | ghd: Attempting to run stack trace 1180 | ghd: --------------- 1181 | ghd: Thread 1 of 8 1182 | ghd: --------------- 1183 | ghd: Status - Waiting 1184 | ghd: ExceptionState - trapno: 14 (EXC_I386_PGFLT - page fault), err: 7, faultvaddr: 0x18435000 1185 | ghd: Registers 1186 | ghd: eax: 400244736, ebx: 75534156, ecx: 1704960, edx: 435625360, edi: 71898364, esi: 71898149, ebp: 0x43edbd4 1187 | ghd: esp: 0xbfffd93c, ss: 31, eflags: 66050, eip: 0x1ad4f01 , cs: 23, ds: 31 1188 | ghd: es: 31, fs: 0, gs: 55 1189 | ghd: Stack Trace 1190 | ghd: 0x1ad4f01 [Haskell, switching to CCS] 1191 | ghd: CCS:0x19f71d90 from QuadRendering (QuadRendering.hs:379:15-49) 1192 | ghd: CCS:0x19f71b60 from QuadRendering (QuadRendering.hs:(334,1)-(423,38)) 1193 | ghd: CCS:0x19f71480 from FontRendering (FontRendering.hs:(192,83)-(211,57)) 1194 | ghd: CCS:0x46737d0 from FontRendering (FontRendering.hs:(189,1)-(212,46)) 1195 | ghd: CCS:0x4673500 from UI (UI.hs:(190,1)-(193,61)) 1196 | ghd: CCS:0x4673460 from UI (UI.hs:141:1-51) 1197 | ghd: CCS:0x4673370 from UI (UI.hs:(110,1)-(113,28)) 1198 | ghd: CCS:0x437be50 from App (App.hs:(123,1)-(162,16)) 1199 | ghd: CCS:0x4682700 from UI (UI.hs:147:1-53) 1200 | ghd: CCS:0x4682000 from UI (UI.hs:(160,1)-(161,31)) 1201 | ghd: CCS:0x4678f90 from App (App.hs:(95,9)-(120,21)) 1202 | ghd: CCS:0x4678f40 from QuadRendering (QuadRendering.hs:(181,23)-(191,40)) 1203 | ghd: CCS:0x4678ef0 from QuadRendering (QuadRendering.hs:(180,30)-(191,40)) 1204 | ghd: CCS:0x4678e50 from QuadRendering (QuadRendering.hs:(173,28)-(198,17)) 1205 | ghd: CCS:0x4678db0 from QuadRendering (QuadRendering.hs:(169,1)-(205,52)) 1206 | ghd: CCS:0x4678770 from App (App.hs:(82,1)-(120,21)) 1207 | ghd: CCS:0x467af60 from App (App.hs:(361,9)-(382,27)) 1208 | ghd: CCS:0x441eef0 from App (App.hs:(342,1)-(383,12)) 1209 | ghd: CCS:0x441ee40 from Main (Main.hs:(270,60)-(306,85)) 1210 | ghd: CCS:0x441edf0 from FontRendering (FontRendering.hs:(62,13)-(75,17)) 1211 | ghd: CCS:0x441eda0 from FontRendering (FontRendering.hs:(54,9)-(75,17)) 1212 | ghd: CCS:0x441ed50 from FontRendering (FontRendering.hs:(52,1)-(75,17)) 1213 | ghd: CCS:0x441ed00 from Main (Main.hs:(266,21)-(306,85)) 1214 | ghd: CCS:0x441ecb0 from QuadRendering (QuadRendering.hs:(77,1)-(134,28)) 1215 | ghd: CCS:0x441ec60 from Main (Main.hs:(262,19)-(306,85)) 1216 | ghd: CCS:0x441ec10 from TextureCache (TextureCache.hs:(63,9)-(77,13)) 1217 | ghd: CCS:0x441ebc0 from TextureCache (TextureCache.hs:(49,1)-(77,13)) 1218 | ghd: CCS:0x441eb70 from Main (Main.hs:(253,17)-(306,85)) 1219 | ghd: CCS:0x441eb20 from GLFWHelpers (GLFWHelpers.hs:(16,1)-(33,9)) 1220 | ghd: CCS:0x441ead0 from Main (Main.hs:(247,52)-(306,85)) 1221 | ghd: CCS:0x441ea80 from ImageCache (ImageCache.hs:130:16-19) 1222 | ghd: CCS:0x441ea30 from ImageCache (ImageCache.hs:(93,1)-(130,20)) 1223 | ghd: CCS:0x441e9e0 from Main (Main.hs:(121,1)-(152,17)) 1224 | ghd: CCS:0x441e990 from Main (Main.hs:(232,15)-(237,77)) 1225 | ghd: CCS:0x441e940 from Main (Main.hs:(219,15)-(231,36)) 1226 | ghd: CCS:0x441e700 from Main (Main.hs:(216,31)-(306,85)) 1227 | ghd: CCS:0x44f5000
from Main (Main.hs:(155,1)-(307,32)) 1228 | ghd: CCS:0x24426a0
from MAIN () 1229 | ghd: Stack trace done, exiting 1230 | ``` 1231 | 1232 | Seems to work pretty well! 1233 | 1234 | # Profiling 1235 | 1236 | How is what we've discussed so far related to profiling? A profiler often inspects a program in a similar way to a debugger. Sometimes, [your debugger is the profiler](http://poormansprofiler.org/). In case the term 'sampling profiler' is unclear, head over to [Wikipedia](http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29#Statistical_profilers). 1237 | 1238 | ### rsvp 1239 | 1240 | I previously developed a realtime profiling tool called rsvp. It provides source level profiling with call tree analysis for C, C++ and Objective-C programs. It also monitors things like memory allocations, disk I/O and thread scheduling. The source and documentation might be worth a read if you want a working example of some of the things explained here. 1241 | 1242 | ![rsvp](https://raw.github.com/blitzcode/rsvp/master/img/main.png) 1243 | 1244 | You can get it [here](https://github.com/blitzcode/rsvp). 1245 | 1246 | Unfortunately, it does not yet support GHC Haskell programs. The techniques described here should allow rsvp to apply its analysis to them, though. I haven't yet yet decided how to best approach this. Let me know if you have any suggestions or are simply interested in Haskell support for rsvp. 1247 | 1248 | # Conclusion 1249 | 1250 | I hope this was useful, perhaps even a bit fun? Maybe you got inspired and will give a shot at developing your own tools and libraries working with GHC's stack and heap, perhaps even hack on GHC itself. Please [contact me](http://www.blitzcode.net/contact.shtml) if you have any comments or spotted mistakes in this document. 1251 | 1252 | # Legal 1253 | 1254 | This program is published under the [MIT License](http://en.wikipedia.org/wiki/MIT_License). 1255 | 1256 | # Author 1257 | 1258 | Developed by Tim C. Schroeder, visit my [website](http://www.blitzcode.net) to learn more. 1259 | 1260 | -------------------------------------------------------------------------------- /crash/Main.hs: -------------------------------------------------------------------------------- 1 | 2 | module Main where 3 | 4 | import Data.Vector.Mutable 5 | 6 | main :: IO () 7 | main = do 8 | someFuncA 9 | someFuncB 10 | 11 | {-# NOINLINE someFuncA #-} 12 | someFuncA :: IO () 13 | someFuncA = do 14 | v <- new 100 15 | unsafeWrite v 0 (0 :: Int) 16 | 17 | {-# NOINLINE someFuncB #-} 18 | someFuncB :: IO () 19 | someFuncB = do 20 | someFuncC 21 | someFuncC 22 | 23 | {-# NOINLINE someFuncC #-} 24 | someFuncC :: IO () 25 | someFuncC = do 26 | v <- new 100 27 | --unsafeWrite v 1000000000 (0 :: Int) -- Crash (hopefully) 28 | unsafeWrite v 0 (0 :: Int) 29 | c_someCFuncA 30 | 31 | foreign import ccall unsafe "crash.h someCFuncA" 32 | c_someCFuncA :: IO () 33 | 34 | -------------------------------------------------------------------------------- /crash/Setup.hs: -------------------------------------------------------------------------------- 1 | import Distribution.Simple 2 | main = defaultMain 3 | -------------------------------------------------------------------------------- /crash/crash.c: -------------------------------------------------------------------------------- 1 | 2 | #include "crash.h" 3 | 4 | void someCFuncA() 5 | { 6 | someCFuncB(); 7 | someCFuncB(); 8 | } 9 | 10 | void someCFuncB() 11 | { 12 | * (char *) 1 = 0; 13 | } 14 | 15 | -------------------------------------------------------------------------------- /crash/crash.cabal: -------------------------------------------------------------------------------- 1 | -- Initial crash.cabal generated by cabal init. For further documentation, 2 | -- see http://haskell.org/cabal/users-guide/ 3 | 4 | name: crash 5 | version: 0.1.0.0 6 | -- synopsis: 7 | -- description: 8 | -- license: 9 | license-file: LICENSE 10 | author: Tim C. Schroeder 11 | maintainer: tim@blitzcode.net 12 | -- copyright: 13 | -- category: 14 | build-type: Simple 15 | -- extra-source-files: 16 | cabal-version: >=1.10 17 | 18 | executable crash 19 | main-is: Main.hs 20 | ghc-prof-options: -fprof-auto -caf-all 21 | ghc-options: -Wall -rtsopts -O2 -threaded -ddump-stg -ddump-cmm -ddump-to-file 22 | 23 | c-sources: crash.c 24 | --include-dirs: /Library/Frameworks/GHC.framework/Versions/7.6.3-i386/usr/lib/ghc-7.6.3/include/ 25 | cc-options: -std=c99 -g -O3 -fno-omit-frame-pointer -Wall -Wextra -Wno-unused-parameter 26 | 27 | 28 | -- other-modules: 29 | -- other-extensions: 30 | build-depends: base >=4.6 && <4.7, vector 31 | -- hs-source-dirs: 32 | default-language: Haskell2010 33 | -------------------------------------------------------------------------------- /crash/crash.h: -------------------------------------------------------------------------------- 1 | 2 | #ifndef CRASH_H 3 | #define CRASH_H 4 | 5 | __attribute__((noinline)) void someCFuncA(); 6 | __attribute__((noinline)) void someCFuncB(); 7 | 8 | #endif // CRASH_H 9 | 10 | -------------------------------------------------------------------------------- /ghd/Makefile: -------------------------------------------------------------------------------- 1 | 2 | # Compile all C++98 and C99 code 3 | 4 | EXECUTABLE_NAME = ghd 5 | 6 | # Default target depending on main output 7 | .PHONY : all 8 | all : $(EXECUTABLE_NAME) 9 | 10 | # General compiler and linker flags 11 | FLAGS = -m32 -g -Wall -Wextra -Wno-unused-parameter -O3 -fno-omit-frame-pointer 12 | 13 | # Flags specific to C or C++ code 14 | CFLAGS = $(FLAGS) -std=c99 \ 15 | -I/Library/Frameworks/GHC.framework/Versions/7.6.3-i386/usr/lib/ghc-7.6.3/include/ 16 | CXXFLAGS = $(FLAGS) -std=c++98 17 | 18 | # Sources and objects. We assume all .c / .cpp files in the directory are meant to 19 | # be compiled 20 | SRC = $(wildcard *.c) $(wildcard *.cpp) 21 | OBJ = $(patsubst %.cpp,%.o,$(patsubst %.c,%.o,$(SRC))) 22 | 23 | # Rule for producing the main output 24 | LD_OBJ = $(OBJ) 25 | $(EXECUTABLE_NAME) : $(LD_OBJ) 26 | $(CXX) -o $@ $(FLAGS) $(LDFLAGS) $(LD_OBJ) 27 | 28 | # Only include this if we actually build the main output, don't want to 29 | # generate dependencies for the clean targets etc. 30 | ifeq ($(MAKECMDGOALS), ) 31 | 32 | # Automatically generate dependencies with the compiler for each source file 33 | %.d: %.cpp 34 | @set -e; rm -f $@; \ 35 | $(CXX) -MM $(CXXFLAGS) $< > $@.$$$$; \ 36 | sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \ 37 | $(RM) $@.$$$$ 38 | %.d: %.c 39 | @set -e; rm -f $@; \ 40 | $(CC) -MM $(CFLAGS) $< > $@.$$$$; \ 41 | sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \ 42 | $(RM) $@.$$$$ 43 | 44 | # Include the generated dependencies for object and dependency files. Be silent 45 | # during the first compile where the .d files have not been generated yet and 46 | # everything is recompiled 47 | -include $(patsubst %.cpp,%.d,$(patsubst %.c,%.d,$(SRC))) 48 | 49 | endif 50 | 51 | # Clean by deleting final output, dependency and object files 52 | .PHONY : clean 53 | clean: 54 | $(RM) $(EXECUTABLE_NAME) $(OBJ) $(OBJ:.o=.d) 55 | 56 | -------------------------------------------------------------------------------- /ghd/atos.h: -------------------------------------------------------------------------------- 1 | 2 | #ifndef ATOS_H 3 | #define ATOS_H 4 | 5 | 6 | // 7 | // atos code reused from rsvp (https://github.com/blitzcode/rsvp) 8 | // 9 | 10 | 11 | // Wrapper around atos tool 12 | // 13 | // http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man1/atos.1.html 14 | // 15 | // Duplicating the full functionality of this tool would be rather difficult. Its source is not 16 | // available, some of the underlying APIs are poorly documented, plus it does exactly what we need 17 | // and seems reasonably fast and reliable 18 | class ATOS_Pipe 19 | { 20 | public: 21 | ATOS_Pipe(pid_t pid) 22 | { 23 | // Check for atos 24 | // 25 | // Since 10.9 calling atos directly seems to be deprecated. The warning suggest 26 | // to invoke it through xcrun, which seems to work fine on 10.6 already 27 | // 28 | // Also see here: 29 | // 30 | // http://root.cern.ch/phpBB3/viewtopic.php?f=3&t=17190&start=30 31 | // https://github.com/allending/Kiwi/pull/365 32 | if (std::system("xcrun atos 2> /dev/null") != 0) 33 | assert(!"Can't find 'atos' command line utility - dev. tools not installed?"); 34 | 35 | // TODO: The bi-directional popen() only works with this environment 36 | // variable set to avoid a deadlock due to buffering, nasty 37 | if (setenv("NSUnbufferedIO", "YES", 1) != 0) 38 | assert(!"setenv failed"); 39 | 40 | char buf[64]; 41 | std::snprintf(buf, sizeof(buf), "xcrun atos -p %i", pid); 42 | m_pipe = popen(buf, "r+"); 43 | assert(m_pipe != NULL); 44 | } 45 | 46 | ~ATOS_Pipe() 47 | { 48 | if (m_pipe != NULL) 49 | if (pclose(m_pipe ) != 0) 50 | assert(!"pclose() failed"); 51 | } 52 | 53 | void AddressToSymbol(uint64_t addr, char *buf, size_t buf_len) const 54 | { 55 | // Communicate with atos program for address resolution, needs to have 56 | // buffering disabled to not deadlock 57 | 58 | // The addresses need to be in hexadecimal, since 10.7 atos only resolves those 59 | if (std::fprintf(m_pipe, "0x%llx\n", addr) <= 0) 60 | assert(!"Writing to atos pipe failed"); 61 | 62 | if (std::fgets(buf, buf_len, m_pipe) == NULL) 63 | assert(!"Reading from atos pipe failed"); 64 | } 65 | 66 | protected: 67 | std::FILE *m_pipe; 68 | 69 | }; 70 | 71 | class SymbolManager 72 | { 73 | public: 74 | SymbolManager(pid_t pid) : 75 | m_atos(pid), 76 | m_cache_hit(0), 77 | m_cache_miss(0), 78 | m_unresolved_sym_name("(Unresolved)") 79 | { } 80 | 81 | uint32_t AddressToSymbolID( 82 | uint64_t addr, 83 | uint16_t *file_name_id_out = NULL, // Optional source & line information starts here 84 | uint16_t *line_number_out = NULL) 85 | { 86 | // Resolve an address into a symbol ID common to all addresses resolving into that symbol. 87 | // We return source and line information on the spot. We can't cache them here as they are 88 | // tied to the address and not the symbol 89 | 90 | // Check address cache 91 | CacheEntry &cache = m_cache[addr % (sizeof(m_cache) / sizeof(CacheEntry))]; 92 | if (cache.m_pc == addr && 93 | cache.m_sym_id != uint32_t(-1)) // Sometimes we get a null address and a false hit 94 | { 95 | m_cache_hit++; 96 | } 97 | else 98 | { 99 | m_cache_miss++; 100 | 101 | // Obtain symbol string from atos 102 | char symbol[8192]; 103 | m_atos.AddressToSymbol(addr, symbol, sizeof(symbol)); 104 | 105 | // Module and file name / line 106 | char module[1024]; 107 | char file_name[1024]; 108 | uint line_number; 109 | ExtractModuleAndFileName( 110 | symbol, 111 | module, 112 | sizeof(module), 113 | file_name, 114 | sizeof(file_name), 115 | &line_number); 116 | 117 | // De-mangle atos output into clean and display friendly symbol name 118 | PrettyPrintSymbol(symbol); 119 | 120 | // Just convert all hex addresses to a single unresolved token 121 | if (std::strncmp(symbol, "0x", 2) == 0) 122 | std::strncpy(symbol, GetUnresolvedSymbolName(), sizeof(symbol)); 123 | 124 | // Check if we already have that symbol name in the table 125 | const uint64_t sym_hash = BernsteinHash(symbol) ^ BernsteinHash(module); 126 | std::map::iterator it_sym = m_hash_to_sym_id.find(sym_hash); 127 | if (it_sym == m_hash_to_sym_id.end()) 128 | { 129 | // Add to symbol and module name string table 130 | uint32_t new_id = uint32_t(m_sym_table.size()); 131 | m_sym_table.push_back(SymbolName()); 132 | m_sym_table.back().m_symbol = std::string(symbol); 133 | m_sym_table.back().m_module = std::string(module); 134 | 135 | // Hash-to-ID translation entry 136 | it_sym = m_hash_to_sym_id.insert 137 | (std::map::value_type(sym_hash, new_id)).first; 138 | } 139 | 140 | // Check if we already have that file name in the table 141 | const uint64_t file_hash = BernsteinHash(file_name); 142 | std::map::iterator it_file = m_hash_to_file_name_id.find(file_hash); 143 | if (it_file == m_hash_to_file_name_id.end()) 144 | { 145 | // Add to file name string table 146 | uint16_t new_id = uint16_t(m_file_name_table.size()); 147 | m_file_name_table.push_back(std::string(file_name)); 148 | 149 | // Hash-to-ID translation entry 150 | it_file = m_hash_to_file_name_id.insert 151 | (std::map::value_type(file_hash, new_id)).first; 152 | } 153 | 154 | // Update cache 155 | cache.m_pc = addr; 156 | cache.m_sym_id = (* it_sym).second; 157 | cache.m_file_name_id = (* it_file).second; 158 | cache.m_line_number = line_number; 159 | 160 | assert(std::strcmp(symbol, SymbolIDToName ((* it_sym ).second)) == 0); 161 | assert(std::strcmp(module, SymbolIDToModule((* it_sym ).second)) == 0); 162 | assert(std::strcmp(file_name, FileIDToName ((* it_file).second)) == 0); 163 | } 164 | 165 | // Return results from cache 166 | if (file_name_id_out != NULL) 167 | (* file_name_id_out) = cache.m_file_name_id; 168 | if (line_number_out != NULL) 169 | (* line_number_out) = cache.m_line_number; 170 | const uint32_t sym_id = cache.m_sym_id; 171 | 172 | return sym_id; 173 | } 174 | 175 | const char * SymbolIDToName(uint32_t id) const 176 | { 177 | assert(id < m_sym_table.size()); 178 | return m_sym_table[id].m_symbol.c_str(); 179 | } 180 | 181 | const char * SymbolIDToModule(uint32_t id) const 182 | { 183 | assert(id < m_sym_table.size()); 184 | return m_sym_table[id].m_module.c_str(); 185 | } 186 | 187 | const char * FileIDToName(uint16_t id) const 188 | { 189 | assert(id < m_file_name_table.size()); 190 | return m_file_name_table[id].c_str(); 191 | } 192 | 193 | float GetCacheHitPercentage() const 194 | { 195 | return float(m_cache_hit) / float(m_cache_hit + m_cache_miss) * 100.0f; 196 | } 197 | 198 | const char * GetUnresolvedSymbolName() const { return m_unresolved_sym_name.c_str(); } 199 | 200 | protected: 201 | ATOS_Pipe m_atos; 202 | 203 | // Address -> Symbol ID cache 204 | uint m_cache_hit; 205 | uint m_cache_miss; 206 | struct CacheEntry 207 | { 208 | CacheEntry() : m_pc(0), m_sym_id(-1), m_file_name_id(-1), m_line_number(-1) { } 209 | uint64_t m_pc; 210 | uint32_t m_sym_id; 211 | // Have to store this here instead of the symbol table as they vary by address, not symbol 212 | uint16_t m_file_name_id; 213 | uint16_t m_line_number; 214 | } m_cache[65536 * 32]; // 32MB 215 | 216 | // Table of unique symbol names and map to translate string hash -> table location 217 | std::map m_hash_to_sym_id; 218 | struct SymbolName 219 | { 220 | std::string m_symbol; 221 | std::string m_module; 222 | }; 223 | std::vector m_sym_table; 224 | 225 | // Table of unique file names and map to translate string hash -> table location 226 | std::map m_hash_to_file_name_id; 227 | std::vector m_file_name_table; 228 | 229 | std::string m_unresolved_sym_name; 230 | 231 | uint64_t BernsteinHash(const char *str_) const 232 | { 233 | // The original Bernstein hash eventually had some collisions, this is a simple 234 | // 64 bit extension of it 235 | 236 | const uint8_t *str = reinterpret_cast (str_); 237 | uint32_t hash_a = 5381; 238 | uint32_t hash_b = 5387; 239 | int c; 240 | 241 | while ((c = *str++)) 242 | { 243 | hash_a = hash_a * 33 ^ c; 244 | hash_b = hash_b * 35 ^ c; 245 | } 246 | 247 | return uint64_t(hash_a) | (uint64_t(hash_b) << 32); 248 | } 249 | 250 | void ExtractModuleAndFileName( 251 | const char *sym, 252 | char *module, 253 | size_t module_len, 254 | char *file, 255 | size_t file_len, 256 | uint *line_number) const 257 | { 258 | // Extract the module and file / line from a symbol string. Can pass NULL for all out 259 | // parameters to skip them 260 | 261 | // Initialize with failure defaults in case we abort 262 | if (module != NULL) 263 | std::strncpy(module, GetUnresolvedSymbolName(), module_len); 264 | if (file != NULL) 265 | std::strncpy(file, GetUnresolvedSymbolName(), file_len); 266 | if (line_number != NULL) 267 | (* line_number) = 0; 268 | 269 | // Find module name part 270 | const char module_token[] = " (in "; 271 | const char *module_begin = std::strstr(sym, module_token); 272 | if (module_begin == NULL) 273 | return; // Not present 274 | module_begin += std::strlen(module_token); 275 | 276 | // Find end of module name part 277 | const char *module_end = std::strchr(module_begin, ')'); 278 | if (module_end == NULL) 279 | return; // Must be terminated by closing brace 280 | const size_t module_idx = module_end - module_begin; 281 | 282 | // Extract module name 283 | if (module != NULL) 284 | { 285 | std::strncpy(module, module_begin, module_len); 286 | module[std::min(module_idx, module_len)] = '\0'; 287 | } 288 | 289 | // Find file name part 290 | const char file_token[] = " ("; 291 | const char *file_begin = std::strstr(module_end, file_token); 292 | if (file_begin == NULL) 293 | return; // Not present 294 | file_begin += std::strlen(file_token); 295 | 296 | // Find end of file name part 297 | const char *file_end = std::strchr(file_begin, ':'); 298 | if (file_end == NULL) 299 | return; // Need colon 300 | const size_t file_idx = file_end - file_begin; 301 | 302 | // Extract file name 303 | if (file != NULL) 304 | { 305 | std::strncpy(file, file_begin, file_len); 306 | file[std::min(file_idx, file_len)] = '\0'; 307 | } 308 | 309 | // Extract line number 310 | if (line_number != NULL) 311 | std::sscanf(file_end + 1, "%i", line_number); 312 | } 313 | 314 | void PrettyPrintSymbol(char *sym) const 315 | { 316 | // Convert the output of atos into a name that is readable and compact. We inevitably throw 317 | // away some information like template arguments, function overloads and module information 318 | // etc., it's a trade-off. This function also makes certain assumptions on how atos formats 319 | // its symbols, will likely need to be tweaked if anything changes 320 | 321 | if (sym[0] == '+' || sym[0] == '-') 322 | { 323 | // Objective C. We just cut off the parameter list and everything after the brackets 324 | while (*sym++ != '\0') 325 | { 326 | if (*sym == ']' || *sym == ':') 327 | { 328 | *sym++ = ']'; 329 | *sym = '\0'; 330 | } 331 | } 332 | } 333 | else 334 | { 335 | // Assume C / C++ 336 | 337 | // Remove module, source file and offset information 338 | { 339 | char *cut = std::strstr(sym, " (in "); 340 | if (cut != NULL) 341 | *cut = '\0'; 342 | } 343 | 344 | // Remove newline 345 | if (sym[std::strlen(sym) - 1] == '\n') 346 | sym[std::strlen(sym) - 1] = '\0'; 347 | 348 | // Shorten '(anonymous namespace)' to 'anon' 349 | { 350 | char *anon = NULL; 351 | const char search[] = "(anonymous namespace)"; 352 | const size_t len_s = sizeof(search) - 1; 353 | while ((anon = std::strstr(sym, search))) 354 | { 355 | const char replace[] = "anon"; 356 | const size_t len_r = sizeof(replace) - 1; 357 | std::memcpy(anon, replace, len_r); 358 | std::memmove(anon + len_r, anon + len_s, std::strlen(sym + len_s) + 1); 359 | } 360 | } 361 | 362 | char *orig_ptr = sym; 363 | 364 | // Compact braces and angle brackets 365 | int angle_level = 0, brace_level = 0; 366 | char *angle_start = NULL, *brace_start = NULL; 367 | while (*sym != '\0') 368 | { 369 | // Angle brackets confuse our parser, skip operators which have them 370 | const char ops[][16] = 371 | { 372 | "operator<<=", "operator <<=", "operator>>=", "operator >>=", // Shift Assign 373 | "operator<<", "operator <<", "operator>>", "operator >>", // Shift 374 | "operator<", "operator <", "operator>", "operator >", // Compare 375 | "operator->", "operator ->" // Dereference 376 | }; 377 | for (uint i=0; i') 395 | if (--angle_level == 0) 396 | { 397 | std::memmove(angle_start + 1, sym, strlen(sym) + 1); 398 | sym = angle_start + 1; 399 | } 400 | assert(angle_level >= 0); 401 | } 402 | 403 | // Don't bother inside angle brackets, we just remove it all anyway 404 | if (angle_level == 0) 405 | { 406 | // Increment nesting level and store position of first open 407 | // brace so we know where to start deleting 408 | if (*sym == '(') 409 | if (brace_level++ == 0) 410 | brace_start = sym; 411 | 412 | // Decrement nesting level and delete on encountering final brace 413 | if (*sym == ')') 414 | if (--brace_level == 0) 415 | { 416 | if (sym - brace_start > 1) 417 | { 418 | std::memmove(brace_start + 1, sym, strlen(sym) + 1); 419 | sym = brace_start + 1; 420 | } 421 | } 422 | assert(brace_level >= 0); 423 | } 424 | 425 | sym++; 426 | } 427 | assert(angle_level == 0); 428 | assert(brace_level == 0); 429 | 430 | // Remove leading types and trailing qualifiers 431 | { 432 | sym = orig_ptr; 433 | 434 | // Trailing const 435 | char *const_trail = std::strstr(sym, " const"); 436 | if (const_trail != NULL) 437 | *const_trail = '\0'; 438 | 439 | // Leading types (return values) are sometimes included in the symbol, remove them 440 | while (*sym != '\0') 441 | { 442 | // Operator function names have spaces in them, don't throw them 443 | // away as leading segments 444 | const char ops[][16] = { " operator", ":operator", "$operator" }; 445 | bool break_outer = false; 446 | for (uint i=0; i 5 | #include 6 | #include 7 | #include 8 | 9 | #include 10 | 11 | #include 12 | #include 13 | #include 14 | #include 15 | 16 | #include 17 | #include 18 | #include 19 | 20 | #include "atos.h" 21 | 22 | int main(int argc, char* argv[]) 23 | { 24 | if (argc == 1) 25 | { 26 | std::printf("Usage: %s executable [arguments]\n", argv[0]); 27 | return -1; 28 | } 29 | 30 | // Are we root? 31 | // 32 | // TODO: We could be a bit smarter about this, like explained here 33 | // http://os-tres.net/blog/2010/02/17/mac-os-x-and-task-for-pid-mach-call/ 34 | // or just launch the child process with a Mach call, but for this simple 35 | // proof-of-concept it is easiest just to require running as root to get 36 | // the required access 37 | if (geteuid() != 0) 38 | assert(!"Need to run as root - forgot 'sudo'?"); 39 | 40 | // We expect to be 32bit, like our target. Supporting all combinations of 32 and 64 bit 41 | // is absolutely possible, just this proof-of-concept debugger is fixed to a single 42 | // architecture to simplify things 43 | assert(sizeof(void *) == 4); 44 | 45 | // Launch process to be debugged, forward all arguments 46 | pid_t child = fork(); 47 | if (child == 0) 48 | { 49 | // Let parent trace us 50 | // 51 | // TODO: We could alternatively also handle all of this entirely without ptrace() 52 | // and rely entirely on Mach exception handling 53 | ptrace(PT_TRACE_ME, 0, NULL, NULL); 54 | 55 | if (execv(argv[1], &argv[1]) == -1) 56 | assert("execl() failed, can't load child executable"); 57 | } 58 | else if (child == -1) 59 | assert(!"fork() failed"); 60 | else 61 | { 62 | TargetInfo ti(child, argv[1]); 63 | 64 | if (ti.m_is_64_bit) 65 | assert(!"ghd: Expected 32bit executable"); 66 | 67 | std::printf("ghd: Debugging '%s', PID %i\n", argv[1], child); 68 | 69 | std::printf("ghd: RTS - Threaded: %s | Profiling: %s\n", 70 | IsThreadedRTS(argv[1]) ? "Yes" : "No", 71 | IsProfilingRTS(argv[1]) ? "Yes" : "No"); 72 | 73 | DebugLoop(ti); 74 | kill(child, SIGKILL); 75 | } 76 | 77 | return 0; 78 | } 79 | 80 | // Since we can't call rts_isProfiled() or such, check the flavor of the RTS by checking 81 | // for the presence of symbols 82 | // 83 | // TODO: Could maybe just check for the library 'ribHSrts_thr_p.a(darwin.o)' etc.? 84 | // 85 | bool IsProfilingRTS(const char *exe) { return DoesSymbolExist(exe, "_CCS_MAIN" ); } 86 | bool IsThreadedRTS (const char *exe) { return DoesSymbolExist(exe, "_createOSThread"); } 87 | bool DoesSymbolExist(const char *executable, const char *symbol) 88 | { 89 | // Use nm to check if a symbol exists 90 | char buf[1024]; 91 | std::snprintf(buf, sizeof(buf), "nm %s | grep %s", executable, symbol); 92 | std::FILE *pipe = popen(buf, "r"); 93 | assert(pipe != NULL); 94 | bool ret = std::fread(buf, 1, sizeof(buf), pipe) != 0; 95 | pclose(pipe); 96 | return ret; 97 | } 98 | 99 | bool IsProcess64Bit(pid_t pid) 100 | { 101 | int mib[4] = { CTL_KERN, KERN_PROC, KERN_PROC_PID, 0 }; 102 | mib[3] = pid; 103 | size_t len = sizeof(kinfo_proc); 104 | kinfo_proc kp; 105 | 106 | if (sysctl(mib, 4, &kp, &len, NULL, 0) == -1) 107 | assert(!"sysctl() failed"); 108 | 109 | return kp.kp_proc.p_flag & P_LP64 != 0; 110 | } 111 | 112 | TargetInfo::TargetInfo(pid_t pid, const char *executable) 113 | { 114 | // Process 115 | m_pid = pid; 116 | m_executable = executable; 117 | // Configuration 118 | m_threaded_rts = IsThreadedRTS(executable); 119 | m_profiling_rts = IsProfilingRTS(executable); 120 | m_is_64_bit = IsProcess64Bit(pid); 121 | // Extract name of the executable module 122 | const char *module_name = &executable[std::strlen(executable) - 1]; 123 | while (module_name != executable) 124 | { 125 | if (* module_name == '/') 126 | { 127 | module_name++; 128 | break; 129 | } 130 | module_name--; 131 | } 132 | m_module = module_name; 133 | // We need a port for the task to be debugged 134 | if (task_for_pid(mach_task_self(), pid, &m_task_port) != KERN_SUCCESS) 135 | assert(!"Can't get port for task"); 136 | } 137 | 138 | TargetInfo::~TargetInfo() 139 | { 140 | if (mach_port_deallocate(mach_task_self(), m_task_port) != KERN_SUCCESS) 141 | assert(!"Can't deallocate port"); 142 | } 143 | 144 | bool TargetInfo::ReadMemoryArg(target_ptr_t address, target_ptr_t size, void *buf) const 145 | { 146 | kern_return_t ret = ::ReadMemory(m_task_port, address, size, buf); 147 | return (ret == KERN_SUCCESS); 148 | } 149 | 150 | void * TargetInfo::ReadMemoryReturn(target_ptr_t address, target_ptr_t size) const 151 | { 152 | static char buf[65536]; 153 | if (size > sizeof(buf)) 154 | return NULL; 155 | bool ret = ReadMemoryArg(address, size, buf); 156 | return ret ? buf : NULL; 157 | } 158 | 159 | target_ptr_t TargetInfo::ReadMemoryPtr(target_ptr_t address) const 160 | { 161 | // Return the pointer value at the given address. Returns 0 both in case of 162 | // a pointer to a null pointer and an actual failure to read the pointer value 163 | // in the first place 164 | target_ptr_t *ptr_val = 165 | reinterpret_cast (ReadMemoryReturn(address, sizeof(target_ptr_t))); 166 | return (ptr_val == NULL) ? 0 : (* ptr_val); 167 | } 168 | 169 | bool TargetInfo::ReadMemoryString(target_ptr_t address, char *buf_out, size_t buf_size) const 170 | { 171 | unsigned int i; 172 | for (i=0; i symbol = 258 | std::auto_ptr(new SymbolManager(ti.m_pid)); 259 | 260 | // List of threads 261 | thread_act_port_array_t thread_list; 262 | mach_msg_type_number_t thread_count; 263 | if (task_threads(ti.m_task_port, &thread_list, &thread_count) != KERN_SUCCESS) 264 | assert(!"Can't obtain thread list"); 265 | 266 | for (uint thread_idx=0; thread_idx, cs: %u, ds: %u\n" \ 374 | "ghd: es: %u, fs: %u, gs: %u\n", 375 | x86ts32.__eax, 376 | x86ts32.__ebx, 377 | x86ts32.__ecx, 378 | x86ts32.__edx, 379 | x86ts32.__edi, 380 | x86ts32.__esi, 381 | x86ts32.__ebp, 382 | x86ts32.__esp, 383 | x86ts32.__ss, 384 | x86ts32.__eflags, 385 | x86ts32.__eip, 386 | symbol->SymbolIDToName(symbol->AddressToSymbolID(x86ts32.__eip)), 387 | x86ts32.__cs, 388 | x86ts32.__ds, 389 | x86ts32.__es, 390 | x86ts32.__fs, 391 | x86ts32.__gs); 392 | 393 | // Traverse stack 394 | StackTrace(ti, symbol.get(), x86ts32.__ebp, x86ts32.__eip); 395 | } 396 | 397 | // Cleanup 398 | std::printf("ghd: Stack trace done, exiting\n"); 399 | DeallocateThreadList(thread_list, thread_count); 400 | } 401 | 402 | void Indent(unsigned int depth) 403 | { 404 | std::printf("ghd: "); 405 | for (unsigned int i=0; iAddressToSymbolID(frame.ret, &file_name_id, &line_number); 430 | 431 | // Are we in Haskell code now? 432 | if (IsHaskellSymbol(ti, symbol, sym_id, file_name_id)) 433 | { 434 | Indent(depth); 435 | std::printf("0x%x <%s> [Haskell%s]\n", 436 | frame.ret, 437 | symbol->SymbolIDToName(sym_id), 438 | ti.m_profiling_rts ? ", switching to CCS" : ", switching to STG stack"); 439 | 440 | if (ti.m_profiling_rts) 441 | { 442 | // We have a CCS, pick it up from the Sp in the frame 443 | const target_ptr_t ccs = CCSPtrFromTopOfStack(ti, frame.next); 444 | if (ccs == 0) 445 | { 446 | Indent(depth + 1); 447 | std::printf("(Can't access CCS from STG stack at Sp = 0x%x)\n", frame.next); 448 | } 449 | else 450 | DumpCCS(ti, ccs, depth + 1); 451 | } 452 | else 453 | { 454 | // No profiling RTS, proceed with the STG stack 455 | DumpSTG(ti, symbol, frame.next, depth + 1); 456 | } 457 | break; // Done, we don't handle any potential Haskell -> C transition yet 458 | } 459 | 460 | // Current frame's function 461 | Indent(depth); 462 | std::printf("0x%x <%s> from %s (%s:%i)\n", 463 | frame.ret, 464 | symbol->SymbolIDToName(sym_id), 465 | symbol->SymbolIDToModule(sym_id), 466 | symbol->FileIDToName(file_name_id), 467 | line_number); 468 | 469 | // Some sanity checks for the frame link pointer 470 | bool stop_traversal = false; 471 | if (frame.next < 1024) 472 | { 473 | Indent(depth + 1); 474 | std::printf("(Next frame near zero)\n"); 475 | stop_traversal = true; 476 | } 477 | if (frame.next % sizeof(void *) != 0) // IIRC, OS X actually requires 16b alignment 478 | { 479 | Indent(depth + 1); 480 | std::printf("(Next frame improperly aligned)\n"); 481 | stop_traversal = true; 482 | } 483 | if (frame.next == old_fp) 484 | { 485 | Indent(depth + 1); 486 | std::printf("(Next frame identical to current)\n"); 487 | stop_traversal = true; 488 | } 489 | if (frame.next < old_fp) 490 | { 491 | Indent(depth + 1); 492 | std::printf("(Next frame before current)\n"); 493 | stop_traversal = true; 494 | } 495 | if (frame.next - old_fp > 32 * 1024 * 1024) 496 | { 497 | Indent(depth + 1); 498 | std::printf("(Next frame >32MB away from current)\n"); 499 | stop_traversal = true; 500 | } 501 | 502 | // Next frame 503 | old_fp = frame.next; 504 | if (ti.ReadMemoryArg(frame.next, sizeof(StackFrame), &frame) == false) 505 | { 506 | // Likely a bad address, stop traversing 507 | Indent(depth + 1); 508 | std::printf("(Can't access next frame)\n"); 509 | stop_traversal = true; 510 | } 511 | 512 | if (stop_traversal) 513 | break; 514 | } 515 | } 516 | 517 | // Get a Cost Center Stack pointer from either the base register or the STG stack pointer 518 | target_ptr_t CCSPtrFromBaseReg(const TargetInfo& ti, target_ptr_t base_reg) 519 | { 520 | return ti.ReadMemoryPtr(base_reg + OFFSET_StgRegTable_rCCCS); 521 | } 522 | target_ptr_t CCSPtrFromTopOfStack(const TargetInfo& ti, target_ptr_t sp_reg) 523 | { 524 | return ti.ReadMemoryPtr 525 | (sp_reg + OFFSET_StgClosure_header + OFFSET_StgHeader_prof + OFFSET_StgProfHeader_ccs); 526 | } 527 | 528 | bool IsHaskellSymbol( 529 | const TargetInfo& ti, 530 | SymbolManager *symbol, 531 | uint32_t sym_id, 532 | uint16_t file_name_id) 533 | { 534 | // One challenge with inspecting a stopped target process is figuring out if we're 535 | // inside a Haskell function and need to do CCS / STG stack traversal, or inside 536 | // a C/C++/ObjC function and need to do x86 frame pointer stack traversal. 537 | // 538 | // Here we simply look at the symbol information and decide based upon that. If 539 | // our function has a matching name, comes from the executable (main module) and lacks 540 | // source file debug information, we assume it's a Haskell function. This is rather 541 | // crummy, but works well enough. We could also look at some other bits of machine 542 | // state to make the decision, i.e.: 543 | // 544 | // C <- C 545 | // ----------------------------- 546 | // ebp: Frame pointer (pointing to next + ret pointers) 547 | // esp: Top of x86 stack 548 | // 549 | // C <- Haskell 550 | // ----------------------------- 551 | // ebp: Frame pointer (pointing to Haskell Sp + ret pointers) 552 | // esp: Top of x86 stack 553 | // 554 | // Haskell <- Haskell 555 | // ----------------------------- 556 | // ebx: BaseReg 557 | // edi: Hp 558 | // ebp: Sp, Top of STG stack 559 | // esp: Top of C stack? (Not used as a STG register) 560 | // 561 | // Haskell <- C 562 | // ----------------------------- 563 | // ebx: BaseReg 564 | // edi: Hp 565 | // ebp: Sp, Top of STG stack 566 | // esp: Top of C stack? (Not used as a STG register) 567 | 568 | // Haskell functions don't have source file debug information 569 | if (std::strcmp(symbol->FileIDToName(file_name_id), symbol->GetUnresolvedSymbolName()) != 0) 570 | return false; 571 | // Assume that all Haskell functions come from our executable 572 | if (strcasecmp(symbol->SymbolIDToModule(sym_id), ti.m_module.c_str()) != 0) 573 | return false; 574 | // Assume that all Haskell functions have certain tokens in their name 575 | if (std::strstr(symbol->SymbolIDToName(sym_id), "_info") == NULL) 576 | return false; 577 | 578 | return true; 579 | } 580 | 581 | void DumpCCS(const TargetInfo& ti, target_ptr_t ccs, uint32_t starting_depth) 582 | { 583 | // Traverse and print the passed Cost Center Stack 584 | for (unsigned int depth=starting_depth; depth<64; depth++) 585 | { 586 | Indent(depth); 587 | 588 | // Get CC pointer from CCS 589 | const target_ptr_t cc = ti.ReadMemoryPtr(ccs + OFFSET_ConstCentreStack_cc); 590 | if (cc == 0) 591 | { 592 | std::printf("(Can't read CC pointer)\n"); 593 | break; 594 | } 595 | 596 | // Retrieve symbol information from CC 597 | const target_ptr_t label_ptr = ti.ReadMemoryPtr(cc + OFFSET_ConstCentre_label); 598 | const target_ptr_t module_ptr = ti.ReadMemoryPtr(cc + OFFSET_ConstCentre_module); 599 | const target_ptr_t srcloc_ptr = ti.ReadMemoryPtr(cc + OFFSET_ConstCentre_srcloc); 600 | char label[256], module[256], srcloc[256]; 601 | if (ti.ReadMemoryString(label_ptr, label, sizeof(label)) == false) 602 | std::strcpy(label, "(can't read label)"); 603 | if (ti.ReadMemoryString(module_ptr, module, sizeof(module)) == false) 604 | std::strcpy(module, "(can't read module)"); 605 | if (ti.ReadMemoryString(srcloc_ptr, srcloc, sizeof(srcloc)) == false) 606 | std::strcpy(srcloc, "(can't read srcloc)"); 607 | std::printf("CCS:0x%x <%s> from %s (%s)\n", ccs, label, module, srcloc); 608 | 609 | // Walk the CC stack 610 | ccs = ti.ReadMemoryPtr(ccs + OFFSET_ConstCentreStack_prevStack); 611 | if (ccs == 0) 612 | break; 613 | } 614 | } 615 | 616 | void DumpSTG( 617 | const TargetInfo& ti, 618 | SymbolManager *symbol, 619 | target_ptr_t sp, 620 | uint32_t starting_depth) 621 | { 622 | // TODO: See GHC's 'Printer.c' on how we could print more information about what's on 623 | // the stack 624 | 625 | // Traverse and print the passed STG stack 626 | for (unsigned int depth=starting_depth; depth<128; depth++) 627 | { 628 | // Top closure 629 | const target_ptr_t info = ti.ReadMemoryPtr(sp); 630 | const uint32_t sym_id = symbol->AddressToSymbolID(info); 631 | 632 | Indent(depth); 633 | 634 | // Let our wrapper collect the required information 635 | uint32_t closure_type, closure_size; 636 | target_ptr_t fun_ref; 637 | if (GetClosureTypeAndSize(ti.m_task_port, sp, &closure_type, &closure_size, &fun_ref) != 0) 638 | { 639 | std::printf("0x%x (Can't read stack frame)\n", sp); 640 | break; 641 | } 642 | 643 | // Referenced closure 644 | char ref_buf[256] = { 0 }; 645 | if (fun_ref != 0) 646 | { 647 | std::snprintf( 648 | ref_buf, 649 | sizeof(ref_buf), 650 | ", <%s>", 651 | symbol->SymbolIDToName(symbol->AddressToSymbolID(ti.ReadMemoryPtr(fun_ref)))); 652 | } 653 | 654 | std::printf("0x%x <%s> (%s, %ib%s)\n", 655 | sp, 656 | symbol->SymbolIDToName(sym_id), 657 | ClosureTypeToString(closure_type), 658 | closure_size, 659 | ref_buf); 660 | 661 | if (closure_type == wrapper_STOP_FRAME) 662 | break; 663 | 664 | // TODO: Handle underflow frames 665 | if (closure_type == wrapper_UNDERFLOW_FRAME) 666 | break; 667 | 668 | sp += closure_size; 669 | } 670 | } 671 | 672 | const char * SignalToString(int sig) 673 | { 674 | switch (sig) 675 | { 676 | case SIGHUP : return "SIGHUP - terminal line hangup"; 677 | case SIGINT : return "SIGINT - interrupt program"; 678 | case SIGQUIT : return "SIGQUIT - quit program"; 679 | case SIGILL : return "SIGILL - illegal instruction"; 680 | case SIGTRAP : return "SIGTRAP - trace trap"; 681 | case SIGABRT : return "SIGABRT - abort program (formerly SIGIOT)"; 682 | case SIGEMT : return "SIGEMT - emulate instruction executed"; 683 | case SIGFPE : return "SIGFPE - floating-point exception"; 684 | case SIGKILL : return "SIGKILL - kill program"; 685 | case SIGBUS : return "SIGBUS - bus error"; 686 | case SIGSEGV : return "SIGSEGV - segmentation violation"; 687 | case SIGSYS : return "SIGSYS - non-existent system call invoked"; 688 | case SIGPIPE : return "SIGPIPE - write on a pipe with no reader"; 689 | case SIGALRM : return "SIGALRM - real-time timer expired"; 690 | case SIGTERM : return "SIGTERM - software termination signal"; 691 | case SIGURG : return "SIGURG - urgent condition present on socket"; 692 | case SIGSTOP : return "SIGSTOP - stop (cannot be caught or ignored)"; 693 | case SIGTSTP : return "SIGTSTP - stop signal generated from keyboard"; 694 | case SIGCONT : return "SIGCONT - continue after stop"; 695 | case SIGCHLD : return "SIGCHLD - child status has changed"; 696 | case SIGTTIN : return "SIGTTIN - background read attempted from control terminal"; 697 | case SIGTTOU : return "SIGTTOU - background write attempted to control terminal"; 698 | case SIGIO : return "SIGIO - I/O is possible on a descriptor (see fcntl(2))"; 699 | case SIGXCPU : return "SIGXCPU - cpu time limit exceeded (see setrlimit(2))"; 700 | case SIGXFSZ : return "SIGXFSZ - file size limit exceeded (see setrlimit(2))"; 701 | case SIGVTALRM: return "SIGVTALRM - virtual time alarm (see setitimer(2))"; 702 | case SIGPROF : return "SIGPROF - profiling timer alarm (see setitimer(2))"; 703 | case SIGWINCH : return "SIGWINCH - window size change"; 704 | case SIGINFO : return "SIGINFO - status request from keyboard"; 705 | case SIGUSR1 : return "SIGUSR1 - user defined signal 1"; 706 | case SIGUSR2 : return "SIGUSR2 - user defined signal 2"; 707 | default : return "unknown"; 708 | } 709 | } 710 | 711 | void DeallocateThreadList( 712 | thread_act_port_array_t thread_list, 713 | mach_msg_type_number_t thread_count) 714 | { 715 | for (uint i=0; i 6 | 7 | #include 8 | #include // for mach_vm_ instead of vm_ 9 | 10 | class SymbolManager; 11 | 12 | #include "rts_wrapper.h" 13 | 14 | class TargetInfo 15 | { 16 | public: 17 | TargetInfo(pid_t pid, const char *executable); 18 | ~TargetInfo(); 19 | 20 | void * ReadMemoryReturn(target_ptr_t address, target_ptr_t size) const; 21 | bool ReadMemoryArg(target_ptr_t address, target_ptr_t size, void *buf) const; 22 | target_ptr_t ReadMemoryPtr(target_ptr_t address) const; 23 | bool ReadMemoryString(target_ptr_t address, char *buf_out, size_t buf_size) const; 24 | 25 | std::string m_executable; 26 | std::string m_module; // Module name for our executable 27 | bool m_threaded_rts; 28 | bool m_profiling_rts; 29 | pid_t m_pid; 30 | bool m_is_64_bit; 31 | mach_port_name_t m_task_port; 32 | 33 | protected: 34 | TargetInfo() { }; 35 | }; 36 | 37 | bool DoesSymbolExist(const char *executable, const char *symbol); 38 | bool IsProfilingRTS(const char *exe); 39 | bool IsThreadedRTS (const char *exe); 40 | bool IsProcess64Bit(pid_t pid); 41 | 42 | void DebugLoop(const TargetInfo& ti); 43 | void PTraceContinue(pid_t pid); 44 | void TraceProcessState(const TargetInfo& ti); 45 | void DeallocateThreadList( 46 | thread_act_port_array_t thread_list, 47 | mach_msg_type_number_t thread_count); 48 | const char * SignalToString(int sig); 49 | void Indent(unsigned int starting_depth); 50 | void StackTrace( 51 | const TargetInfo& ti, 52 | SymbolManager *symbol, 53 | target_ptr_t fp, 54 | target_ptr_t ip); 55 | bool IsHaskellSymbol( 56 | const TargetInfo& ti, 57 | SymbolManager *symbol, 58 | uint32_t sym_id, 59 | uint16_t file_name_id); 60 | void DumpCCS(const TargetInfo& ti, target_ptr_t ccs, uint32_t starting_depth); 61 | void DumpSTG(const TargetInfo& ti, SymbolManager *symbol, target_ptr_t sp, uint32_t starting_depth); 62 | target_ptr_t CCSPtrFromBaseReg(const TargetInfo& ti, target_ptr_t base_reg); 63 | target_ptr_t CCSPtrFromTopOfStack(const TargetInfo& ti, target_ptr_t sp_reg); 64 | 65 | #endif // GHD_H 66 | 67 | -------------------------------------------------------------------------------- /ghd/rts_wrapper.h: -------------------------------------------------------------------------------- 1 | 2 | #ifndef RTS_WRAPPER_H 3 | #define RTS_WRAPPER_H 4 | 5 | // 6 | // Since we can't include GHC's RTS headers from C++ code 7 | // (https://ghc.haskell.org/trac/ghc/ticket/8676#ticket) 8 | // we're writing a small C wrapper around the bits we need 9 | // 10 | 11 | #ifdef __cplusplus 12 | extern "C" { 13 | #endif // __cplusplus 14 | 15 | #include 16 | 17 | #include 18 | #include // for mach_vm_ instead of vm_ 19 | 20 | typedef uint32_t target_ptr_t; 21 | 22 | kern_return_t ReadMemory(mach_port_name_t task, target_ptr_t address, target_ptr_t size, void *out); 23 | 24 | const char * ClosureTypeToString(unsigned int type); 25 | int GetClosureTypeAndSize( 26 | mach_port_name_t task, 27 | target_ptr_t closure_addr, 28 | uint32_t *closure_type_out, 29 | uint32_t *closure_size_out, 30 | target_ptr_t *fun_ref_out); 31 | 32 | // Structure field offsets so we can traverse them in the memory space of the target process 33 | extern target_ptr_t OFFSET_StgRegTable_rCCCS; 34 | extern target_ptr_t OFFSET_ConstCentreStack_cc; 35 | extern target_ptr_t OFFSET_ConstCentreStack_prevStack; 36 | extern target_ptr_t OFFSET_ConstCentre_label; 37 | extern target_ptr_t OFFSET_ConstCentre_module; 38 | extern target_ptr_t OFFSET_ConstCentre_srcloc; 39 | extern target_ptr_t OFFSET_StgClosure_header; 40 | extern target_ptr_t OFFSET_StgHeader_prof; 41 | extern target_ptr_t OFFSET_StgProfHeader_ccs; 42 | 43 | // Selected closure types 44 | extern uint32_t wrapper_STOP_FRAME; 45 | extern uint32_t wrapper_UNDERFLOW_FRAME; 46 | 47 | #ifdef __cplusplus 48 | } 49 | #endif // __cplusplus 50 | 51 | #endif // RTS_WRAPPER_H 52 | 53 | -------------------------------------------------------------------------------- /ghd/rts_wrapper_noprof.c: -------------------------------------------------------------------------------- 1 | 2 | #include "rts_wrapper.h" 3 | 4 | // GHC RTS header, profiling disabled 5 | #define THREADED 6 | #include 7 | 8 | int GetClosureTypeAndSize( 9 | mach_port_name_t task, 10 | target_ptr_t closure_addr, 11 | uint32_t *closure_type_out, 12 | uint32_t *closure_size_out, 13 | target_ptr_t *fun_ref_out) // Optional closure referenced by closure_addr 14 | { 15 | // Retrieve information about a closure stack object. We use this code during STG 16 | // stack traversal, which we only do on a non-profiling RTS, hence the reason why 17 | // it is located here 18 | 19 | (* fun_ref_out) = 0; 20 | 21 | // Copy closure 22 | StgClosure closure; 23 | if (ReadMemory(task, closure_addr, sizeof(StgClosure), &closure) != KERN_SUCCESS) 24 | return 1; 25 | 26 | // Copy info table for closure 27 | StgRetInfoTable info; 28 | const target_ptr_t info_addr = (target_ptr_t) get_ret_itbl(&closure); 29 | if (ReadMemory(task, info_addr, sizeof(StgRetInfoTable), &info) != KERN_SUCCESS) 30 | return 1; 31 | 32 | // Type 33 | (* closure_type_out) = info.i.type; 34 | 35 | // Compute closure size. Adapted from this code from 'ClosureMacros.h': 36 | 37 | // EXTERN_INLINE StgWord stack_frame_sizeW( StgClosure *frame ) 38 | // { 39 | // StgRetInfoTable *info; 40 | // 41 | // info = get_ret_itbl(frame); 42 | // switch (info->i.type) { 43 | // 44 | // case RET_DYN: 45 | // { 46 | // StgRetDyn *dyn = (StgRetDyn *)frame; 47 | // return sizeofW(StgRetDyn) + RET_DYN_BITMAP_SIZE + 48 | // RET_DYN_NONPTR_REGS_SIZE + 49 | // RET_DYN_PTRS(dyn->liveness) + RET_DYN_NONPTRS(dyn->liveness); 50 | // } 51 | // 52 | // case RET_FUN: 53 | // return sizeofW(StgRetFun) + ((StgRetFun *)frame)->size; 54 | // 55 | // case RET_BIG: 56 | // return 1 + GET_LARGE_BITMAP(&info->i)->size; 57 | // 58 | // case RET_BCO: 59 | // return 2 + BCO_BITMAP_SIZE((StgBCO *)((P_)frame)[1]); 60 | // 61 | // default: 62 | // return 1 + BITMAP_SIZE(info->i.layout.bitmap); 63 | // } 64 | // } 65 | 66 | switch (info.i.type) 67 | { 68 | case RET_DYN: 69 | { 70 | // typedef struct { 71 | // const StgInfoTable* info; 72 | // StgWord liveness; 73 | // StgWord ret_addr; 74 | // StgClosure * payload[FLEXIBLE_ARRAY]; 75 | // } StgRetDyn; 76 | StgRetDyn dyn; 77 | if (ReadMemory(task, closure_addr, sizeof(StgRetDyn), &dyn) != KERN_SUCCESS) 78 | return 1; 79 | (* closure_size_out) = sizeofW(StgRetDyn) + RET_DYN_BITMAP_SIZE + 80 | RET_DYN_NONPTR_REGS_SIZE + 81 | RET_DYN_PTRS(dyn.liveness) + RET_DYN_NONPTRS(dyn.liveness); 82 | break; 83 | } 84 | 85 | case RET_FUN: 86 | { 87 | // typedef struct { 88 | // const StgInfoTable* info; 89 | // StgWord size; 90 | // StgClosure * fun; 91 | // StgClosure * payload[FLEXIBLE_ARRAY]; 92 | // } StgRetFun; 93 | StgRetFun fun; 94 | if (ReadMemory(task, closure_addr, sizeof(StgRetFun), &fun) != KERN_SUCCESS) 95 | return 1; 96 | (* closure_size_out) = sizeofW(StgRetFun) + fun.size; 97 | break; 98 | } 99 | 100 | case RET_BIG: 101 | { 102 | // #define GET_LARGE_BITMAP(info) ((StgLargeBitmap*) (((StgWord) ((info)+1)) 103 | // + (info)->layout.large_bitmap_offset)) 104 | // return 1 + GET_LARGE_BITMAP(&info->i)->size; 105 | StgLargeBitmap bm; 106 | const target_ptr_t bm_addr = 107 | (info_addr + sizeof(StgRetInfoTable)) + 108 | info.i.layout.large_bitmap_offset; 109 | if (ReadMemory(task, bm_addr, sizeof(StgLargeBitmap), &bm) != KERN_SUCCESS) 110 | return 1; 111 | (* closure_size_out) = 1 + bm.size; 112 | break; 113 | } 114 | 115 | case RET_BCO: 116 | // #define BCO_BITMAP(bco) ((StgLargeBitmap *)((StgBCO *)(bco))->bitmap) 117 | // #define BCO_BITMAP_SIZE(bco) (BCO_BITMAP(bco)->size) 118 | // 119 | // typedef struct { 120 | // StgHeader header; 121 | // StgArrWords *instrs; /* a pointer to an ArrWords */ 122 | // StgArrWords *literals; /* a pointer to an ArrWords */ 123 | // StgMutArrPtrs *ptrs; /* a pointer to a MutArrPtrs */ 124 | // StgHalfWord arity; /* arity of this BCO */ 125 | // StgHalfWord size; /* size of this BCO (in words) */ 126 | // StgWord bitmap[FLEXIBLE_ARRAY]; /* an StgLargeBitmap */ 127 | // } StgBCO; 128 | // 129 | // typedef struct { 130 | // StgWord size; 131 | // StgWord bitmap[FLEXIBLE_ARRAY]; 132 | // } StgLargeBitmap; 133 | // 134 | // (* closure_size_out) = 2 + BCO_BITMAP_SIZE((StgBCO *)((P_)frame)[1]); 135 | // 136 | (* closure_size_out) = 2; // TODO: Implement me 137 | break; 138 | 139 | default: 140 | (* closure_size_out) = 1 + BITMAP_SIZE(info.i.layout.bitmap); 141 | break; 142 | } 143 | 144 | (* closure_size_out) *= sizeof(StgWord); 145 | 146 | // TODO: We could do a lot more than just follow up a single reference 147 | switch (info.i.type) 148 | { 149 | case CATCH_FRAME: 150 | { 151 | StgCatchFrame cframe; 152 | if (ReadMemory(task, closure_addr, sizeof(StgCatchFrame), &cframe) != KERN_SUCCESS) 153 | return 1; 154 | (* fun_ref_out) = (target_ptr_t) UNTAG_CLOSURE(cframe.handler); 155 | break; 156 | } 157 | } 158 | 159 | return 0; 160 | } 161 | 162 | uint32_t wrapper_STOP_FRAME = STOP_FRAME; 163 | uint32_t wrapper_UNDERFLOW_FRAME = UNDERFLOW_FRAME; 164 | 165 | // Copied from GHC's 'Printer.c' 166 | 167 | static const char *closure_type_names[] = { 168 | [INVALID_OBJECT] = "INVALID_OBJECT", 169 | [CONSTR] = "CONSTR", 170 | [CONSTR_1_0] = "CONSTR_1_0", 171 | [CONSTR_0_1] = "CONSTR_0_1", 172 | [CONSTR_2_0] = "CONSTR_2_0", 173 | [CONSTR_1_1] = "CONSTR_1_1", 174 | [CONSTR_0_2] = "CONSTR_0_2", 175 | [CONSTR_STATIC] = "CONSTR_STATIC", 176 | [CONSTR_NOCAF_STATIC] = "CONSTR_NOCAF_STATIC", 177 | [FUN] = "FUN", 178 | [FUN_1_0] = "FUN_1_0", 179 | [FUN_0_1] = "FUN_0_1", 180 | [FUN_2_0] = "FUN_2_0", 181 | [FUN_1_1] = "FUN_1_1", 182 | [FUN_0_2] = "FUN_0_2", 183 | [FUN_STATIC] = "FUN_STATIC", 184 | [THUNK] = "THUNK", 185 | [THUNK_1_0] = "THUNK_1_0", 186 | [THUNK_0_1] = "THUNK_0_1", 187 | [THUNK_2_0] = "THUNK_2_0", 188 | [THUNK_1_1] = "THUNK_1_1", 189 | [THUNK_0_2] = "THUNK_0_2", 190 | [THUNK_STATIC] = "THUNK_STATIC", 191 | [THUNK_SELECTOR] = "THUNK_SELECTOR", 192 | [BCO] = "BCO", 193 | [AP] = "AP", 194 | [PAP] = "PAP", 195 | [AP_STACK] = "AP_STACK", 196 | [IND] = "IND", 197 | [IND_PERM] = "IND_PERM", 198 | [IND_STATIC] = "IND_STATIC", 199 | [RET_BCO] = "RET_BCO", 200 | [RET_SMALL] = "RET_SMALL", 201 | [RET_BIG] = "RET_BIG", 202 | [RET_DYN] = "RET_DYN", 203 | [RET_FUN] = "RET_FUN", 204 | [UPDATE_FRAME] = "UPDATE_FRAME", 205 | [CATCH_FRAME] = "CATCH_FRAME", 206 | [UNDERFLOW_FRAME] = "UNDERFLOW_FRAME", 207 | [STOP_FRAME] = "STOP_FRAME", 208 | [BLOCKING_QUEUE] = "BLOCKING_QUEUE", 209 | [BLACKHOLE] = "BLACKHOLE", 210 | [MVAR_CLEAN] = "MVAR_CLEAN", 211 | [MVAR_DIRTY] = "MVAR_DIRTY", 212 | [ARR_WORDS] = "ARR_WORDS", 213 | [MUT_ARR_PTRS_CLEAN] = "MUT_ARR_PTRS_CLEAN", 214 | [MUT_ARR_PTRS_DIRTY] = "MUT_ARR_PTRS_DIRTY", 215 | [MUT_ARR_PTRS_FROZEN0] = "MUT_ARR_PTRS_FROZEN0", 216 | [MUT_ARR_PTRS_FROZEN] = "MUT_ARR_PTRS_FROZEN", 217 | [MUT_VAR_CLEAN] = "MUT_VAR_CLEAN", 218 | [MUT_VAR_DIRTY] = "MUT_VAR_DIRTY", 219 | [WEAK] = "WEAK", 220 | [PRIM] = "PRIM", 221 | [MUT_PRIM] = "MUT_PRIM", 222 | [TSO] = "TSO", 223 | [STACK] = "STACK", 224 | [TREC_CHUNK] = "TREC_CHUNK", 225 | [ATOMICALLY_FRAME] = "ATOMICALLY_FRAME", 226 | [CATCH_RETRY_FRAME] = "CATCH_RETRY_FRAME", 227 | [CATCH_STM_FRAME] = "CATCH_STM_FRAME", 228 | [WHITEHOLE] = "WHITEHOLE" 229 | }; 230 | 231 | const char * ClosureTypeToString(unsigned int type) 232 | { 233 | static const char unknown[] = "(Unknown Type)"; 234 | if (type >= N_CLOSURE_TYPES) 235 | return unknown; 236 | else 237 | return closure_type_names[type]; 238 | } 239 | 240 | -------------------------------------------------------------------------------- /ghd/rts_wrapper_prof.c: -------------------------------------------------------------------------------- 1 | 2 | #include "rts_wrapper.h" 3 | 4 | // GHC RTS header, profiling enabled 5 | #define PROFILING 6 | #define THREADED 7 | #include 8 | 9 | // We use those offsets during CCS traversal, so they need to be in the implementation 10 | // file where the profiling RTS 11 | target_ptr_t OFFSET_StgRegTable_rCCCS = offsetof(StgRegTable , rCCCS ); 12 | target_ptr_t OFFSET_ConstCentreStack_cc = offsetof(CostCentreStack, cc ); 13 | target_ptr_t OFFSET_ConstCentreStack_prevStack = offsetof(CostCentreStack, prevStack); 14 | target_ptr_t OFFSET_ConstCentre_label = offsetof(CostCentre , label ); 15 | target_ptr_t OFFSET_ConstCentre_module = offsetof(CostCentre , module ); 16 | target_ptr_t OFFSET_ConstCentre_srcloc = offsetof(CostCentre , srcloc ); 17 | target_ptr_t OFFSET_StgClosure_header = offsetof(StgClosure , header ); 18 | target_ptr_t OFFSET_StgHeader_prof = offsetof(StgHeader , prof ); 19 | target_ptr_t OFFSET_StgProfHeader_ccs = offsetof(StgProfHeader , ccs ); 20 | 21 | kern_return_t ReadMemory(mach_port_name_t task, target_ptr_t address, target_ptr_t size, void *out) 22 | { 23 | mach_vm_size_t inout_size = size; 24 | kern_return_t ret = mach_vm_read_overwrite( 25 | task, 26 | address, 27 | inout_size, 28 | (mach_vm_address_t) (uintptr_t) out, 29 | &inout_size); 30 | if (ret != KERN_SUCCESS) 31 | return ret; 32 | if (inout_size != size) 33 | return KERN_FAILURE; 34 | return KERN_SUCCESS; 35 | } 36 | 37 | --------------------------------------------------------------------------------