├── .github
│   └── workflows
│       └── awesome-bot.yml
└── README.md


/.github/workflows/awesome-bot.yml:
--------------------------------------------------------------------------------
name: Awesome Bot CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - name: Set up Ruby 2.6
      uses: actions/setup-ruby@v1
      with:
        ruby-version: 2.6.x
    - name: Install Awesome Bot
      run: gem install awesome_bot
    - name: Run Awesome Bot
      run: awesome_bot README.md --white-list https://medium.engineering

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# Debugging stories

Debugging stories are fun! This is a collection of links to various debugging stories.

## Non Fiction stories

[An unprecedented look at STUXNET, the world's first digital weapon](https://www.wired.com/2014/11/countdown-to-zero-day-stuxnet/)

[Atop segfaults](http://rachelbythebay.com/w/2014/03/02/sync/)

[Breakpoint takes 15ms](https://www.jwhitham.org/2015/04/the-mystery-of-fifteen-millisecond.html)

[Bug that hides from breakpoints](https://drewdevault.com/2014/02/02/The-worst-bugs.html)

[C64 Variable Screen Position crash](http://www.linusakesson.net/scene/safevsp/index.php)

[Chrome development branch crashes with x87 FPU stack overflow](https://randomascii.wordpress.com/2016/09/16/everything-old-is-new-again-and-a-compiler-bug/)

[Crash Bandicoot memory card corruption](https://www.gamedeveloper.com/programming/my-hardest-bug-ever)

[Crashes only on Wednesdays](https://gyrovague.com/2015/07/29/crashes-only-on-wednesdays/)

[Custom Rust OS crashes](https://jvns.ca/blog/2013/12/04/day-37-how-a-keyboard-works/)

[Debian package data corruption, but only on sparc64](https://web.archive.org/web/20060612203753/http://vger.kernel.org/~davem/cgi-bin/blog.cgi/index.html)

[Debugging an evil Go runtime bug](https://marcan.st/2017/12/debugging-an-evil-go-runtime-bug/)

[Debugging Insights](http://debuggingrules.com/?page_id=46) (collection of stories on separate pages, submitted to Dave Agans, the author of "Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems")

[Digital divider with wrong output](http://danluu.com/teach-debugging/)

[Email only goes 500 miles](http://www.ibiblio.org/harris/500milemail.html)

[How I cut GTA Online loading times by 70%](https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times-by-70/)

[Impossible random number collision happened in practice](https://medium.com/@betable/tifu-by-using-math-random-f1c308c4fd9d)

[Incorrect sign masking operation](https://labs.spotify.com/2015/08/27/underflow-bug/)

[iOS file corruption](https://engineering.fb.com/ios/debugging-file-corruption-on-ios/)

[First Instance of Actual Computer Bug Being Found](https://www.computerhistory.org/tdih/september/9/)

[FIRST robotics dashboard segfaults due to finalizer](https://www.lukeshu.com/blog/java-segfault.html)

[Fixing A Bug In An 18 Year Old Shockwave Game](https://mattbruv.github.io/ccsr-bugfix/)

[FriendNet / WoW message corruption](https://blog.makandra.com/2010/04/the-greatest-bug-i-never-fixed/)

[FPGA / CPU timing bug](https://eli.thegreenplace.net/2003/10/30/hardware-debugging-is-hard)

[Gdb set follow-fork-mode child doesn't work](https://nanxiao.me/en/use-dtrace-to-diagnose-gdb-issues/)

[git - pack corruption post-mortem](http://git.661346.n2.nabble.com/pack-corruption-post-mortem-td7597765.html)

[Grep segfaults](http://blog.loadzero.com/blog/tracking-down-a-segfault-in-grep/)

[GWT test passes during normal runs but fails during coverage runs](http://ismail.badawi.io/blog/an-obscure-bug-story/)

[Java 8 webapp performance problems](https://engineering.indeedblog.com/blog/2016/09/job-search-web-app-java-8-migration/)

[JVM has 6-7 second GC pauses](https://www.evanjones.ca/jvm-mmap-pause-finding.html)

[Linux kernel CPU contention](http://notes.secretsauce.net/notes/2015/09/05_a-kernel-debugging-story.html)

[Login only works when sitting down (and other stories)](https://books.google.ca/books?id=kse_7qbWbjsC&lpg=PP1&pg=PA56). From Jon Bentley's _Programming Pearls_, section 5.10

[Mandrill stops sending messages](https://www.seancassidy.me/sherlock-holmes-debugging.html)

[Matrix transformation is 10x faster in a test harness than in the game](https://randomascii.wordpress.com/2015/01/19/knowing-where-to-type-zero/)

[Memory-corruption / threading bug](https://nanxiao.me/en/an-experience-of-fixing-a-memory-corruption-bug/)

[Mouse won't move while processes are dying](https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and-i-cant-move-my-mouse/)

[MS Money crashes](https://devblogs.microsoft.com/oldnewthing/?p=6103). Includes reverse engineered binary patch fix!

[Multicast to 237.0.0.1 craters performance](https://www.outerthoughts.com/2004/10/perfect-multicast-storm/)

[Network link instability due to DB caching policies](https://engineering.fb.com/production-engineering/solving-the-mystery-of-link-imbalance-a-metastable-failure-state-at-scale/)

[NFS bug hunting on GitLab](https://about.gitlab.com/blog/2018/11/14/how-we-spent-two-weeks-hunting-an-nfs-bug/)

[Node.js memory leak](https://www.tritondatacenter.com/blog/walmart-node-js-memory-leak)

[OpenOffice + Ubuntu|2009-04-28](https://bugs.launchpad.net/ubuntu/+source/cupsys/+bug/255161/comments/28) - "My wife has complained that OpenOffice will never print on Tuesdays" (bug is actually with the [file utility](https://bugs.launchpad.net/ubuntu/+source/file/+bug/248619))

[OS crashes on boot if machine was used recently](https://blog.valerieaurora.org/2013/12/17/heres-my-favorite-operating-systems-war-story-whats-yours/)

[PDP-11 crashes only when live cattle are being shipped from northern Ukraine and western Russia](https://www.jakepoz.com/debugging-behind-the-iron-curtain/)

[Perl can't read from TMPDIR?](https://blog.afoolishmanifesto.com/posts/investigation-into-why-perl-cant-read-from-tmpdir/) [(separate analysis of the same issue here)](https://blog.plover.com/tech/tmpdir.html)

[Polish S doesn't appear on Medium](https://medium.engineering/the-curious-case-of-disappearing-polish-s-fa398313d4df)

[PS2 SDK doesn't zero the .bss segment](https://fobes.dev/general/2024/02/29/inline-assembly-dangers.html)

[Python hashes files differently on different days](https://dpb.bitbucket.io/unexpected-behavior-from-the-python-3-built-in-hash-function.html)

[Python/Flask/S3 absurd memory usage](https://www.jamesporter.me/2015/12/09/mysterious-memory-consumption.html)

[Register allocation algorithm in JIT uses extra register](http://bitfunnel.org/debugging-nativejit/)

[Script crashes before 10 a.m.](https://darekkay.com/blog/script-crashes-before-10/)

[SIGALRM timer hangs](https://nativeguru.wordpress.com/2015/02/19/why-you-should-avoid-using-sigalrm-for-timer/)

[Signal fails only when returning from page fault handler](https://news.ycombinator.com/item?id=7684824)

[Signal handling on 286 emulator for Xenix](https://news.ycombinator.com/item?id=7684827)

[The Case of Veeerrry Slow Logons](https://techcommunity.microsoft.com/t5/windows-blog-archive/the-case-of-the-veeerrry-slow-logons/ba-p/724275)

[SSH sometimes fails from cron](http://mina.naguib.ca/blog/2012/10/22/the-little-ssh-that-sometimes-couldnt.html)

[Stack corruption when using select()](https://blogs.unity3d.com/2016/04/25/debugging-memory-corruption-who-the-hell-writes-2-into-my-stack-2/)

[Switch crashes machine despite not being connected](http://www.catb.org/jargon/html/magic-story.html)

[The Debug heap that created bugs](http://lectem.github.io/windows/heap/appverifier/detours/2020/01/02/The-debug-heap-that-created-bugs.html)

[The Wi-Fi only works when it's raining](https://predr.ag/blog/wifi-only-works-when-its-raining/)

[Tmux/vim strangeness](http://www.daniellesucher.com/2014/04/24/my-new-favorite-vim-tmux-bug/)

[Tracking down a CPU bug in Solaris across multiple OS layers](https://gist.github.com/vladak/d7edf765fe2763b698b06488230d5886)

[uint32_t gets truncated to 16 bits](http://kdc-blog.blogspot.com/2008/03/one-day-one-of-my-co-workers-stopped-me.html)

[Unit tests fail when run in Australia](https://github.com/angular/angular.js/issues/5017)

[Windows crash dump of nonsensical instruction](https://devblogs.microsoft.com/oldnewthing/?p=43293)

[HTTP/HTTPS not working inside your VM? Wait for it.](https://rachelbythebay.com/w/2016/03/22/6nat/)

[The big handle gamble](https://x64dbg.com/blog/2017/11/04/the-big-handle-gamble.html)

## Fictional stories

[Coding Machines](https://www.teamten.com/lawrence/writings/coding-machines/)

## Contributors

* Alexandros Tasos
* Arthur Axel fREW Schmidt
* André Luiz
* Bill Smith
* Brad Parker
* Bruce Dawson
* bussiere
* crazymerlyn
* Dan Luu
* Dan Nguyen
* Darek Kay
* Dima Kogan
* Dimitry (unknownexception)
* Florent Viel
* Howard Mao
* Jakub Kądziołka
* Jani Patokallio
* Jessie Weinstein
* José Manuel Díez
* Jouke Waleson
* Jyrki Vesterinen
* Kartik Agaram
* Kevin D. Clark
* Kirtan Gajjar
* Louis Vela
* Matteo Castellini
* Mark Jason Dominus
* michaelx386
* Nan Xiao
* Nathan Isom
* Nikola Brežnjak
* Pietro Menna
* Predrag Gruevski
* Rafik Draoui
* Ramon Fried
* Sean Cassidy
* Ty Lamontagne
* Vladimír Kotal
* James Porter

## TODOs:

* Improve formatting.

* Write up some interesting bugs: 40 hour hang, hardware bug caused by copying behavior of Intel engineering sample found on eBay, etc.

--------------------------------------------------------------------------------