├── .gitignore ├── COPYING ├── DIARY.txt ├── INSTALL ├── Makefile ├── README ├── rebar ├── src ├── chaos_monkey.app.src ├── chaos_monkey.erl ├── chaos_monkey_app.erl └── chaos_monkey_sup.erl └── todo.org /.gitignore: -------------------------------------------------------------------------------- 1 | ebin 2 | .emacs.desktop 3 | -------------------------------------------------------------------------------- /COPYING: -------------------------------------------------------------------------------- 1 | Copyright (c) 2012, 2013 Daniel Luna 2 | 3 | Permission to use, copy, modify, and/or distribute this software for 4 | any purpose with or without fee is hereby granted, provided that the 5 | above copyright notice and this permission notice appear in all 6 | copies. 7 | 8 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL 9 | WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED 10 | WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE 11 | AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL 12 | DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR 13 | PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 14 | TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR 15 | PERFORMANCE OF THIS SOFTWARE. 16 | -------------------------------------------------------------------------------- /DIARY.txt: -------------------------------------------------------------------------------- 1 | 20:00 2 | 3 | It starts! Yay! 4 | 5 | 20:06 6 | 7 | First commit. Skeleton files, but whatever. 8 | 9 | 20:20 10 | 11 | essen at Freenode recommended cowboy_examples as a quick to get up and 12 | running Erlang program. It rocks! clone, make, ./start.sh and it's 13 | up. Now I have something to kill... 14 | 15 | 20:23 16 | 17 | Stupid computer! It rebooted from having a USB camera connected to 18 | its USB port. Happens every now and then. 19 | 20 | Oh, well. I'll head for Subway and get some food. 
21 | 22 | 20:35 23 | 24 | Sub acquired. 25 | 26 | 21:03 27 | 28 | Program works. I can go home. :-) 29 | 30 | Now let's make it a little bit more sophisticated. 31 | 32 | 22:00 33 | 34 | It is fairly sophisticated at killing stuff. The chaos monkey ignores 35 | system processes, and will not commit suicide. Everything else is fair 36 | bait. 37 | 38 | Unfortunately it shouldn't really be allowed to kill supervisors, 39 | because that breaks one of the Rules of Erlang. Let's see if I can 40 | identify those. 41 | 42 | 22:12 43 | 44 | Well that was ten minutes of making my printouts nicer looking. Hmm. 45 | 46 | 22:13 47 | 48 | So, dear diary. I never told you what the program is supposed to do. 49 | Well it's a process killer. It walks around in your system killing 50 | random processes. Just to see if you will recover. What doesn't kill 51 | you makes you stronger and all that. Well, except in this case it's 52 | "what kills you makes you stronger". So now you know. 53 | 54 | 22:07 55 | 56 | Support has been added for not killing supervisors. Yay! 57 | 58 | And I rewrote the whole printout thing. 59 | 60 | I have no idea how to avoid killing the Erlang shell. 61 | 62 | 00:05 63 | 64 | Speed of development has slowed down. 65 | 66 | lists:filter(fun({P, App, S}) -> App =:= undefined end, [{P, application:get_application(P), pman_process:is_system_process(P)} || P <- erlang:processes()]). 67 | 68 | 00:30 69 | 70 | Freenode #spawnfest rocks. Once again very useful. I did not know 71 | about sys:get_status(whereis(chaos_monkey_sup)). before this. Super. 72 | 73 | I got tricked since I was so certain that supervisor was *not* a 74 | gen_server. Which it is of course. Even though it shouldn't be for so 75 | many reasons. 76 | 77 | 00:50 78 | 79 | I think I don't add usable code any more. Going home to sleep, and 80 | think a bit about the design. 81 | 82 | 10:30 83 | 84 | I'm back! 85 | 86 | The code is sort of doing what it should. 
There is a huge lot of 87 | improvements that can be made. I think it's time to write some 88 | documentation to see 89 | 90 | 11:13 91 | 92 | Still writing documentation about a future dream scenario. Time to 93 | accept that it won't ever be perfect, and just go back to coding some 94 | of those cool features that have just been described. 95 | 96 | 11:34 97 | 98 | Ok. So *now* the documentation is done. Let's implement it. 99 | 100 | Without adding a lot of features! Stay away from those. They are bad 101 | for you. 102 | 103 | 12:34 104 | 105 | Sweet. find_orphans/0 seems to work. 106 | 107 | 16:20 108 | 109 | I forgot all about writing here. 110 | 111 | chaos_monkey:on/0 and off/0 work as announced. This is awesome I 112 | think. If I have time I will add configuration options and stuff. 113 | For the time being it works great with some reasonable defaults. It 114 | kills a process every five seconds on average. Should be good enough 115 | to run in most production systems. 116 | 117 | I've started looking at havoc/0,1,2 which is more of a single shot 118 | thing. Aim everything at an application and see how it recovers. 119 | Going slowly, but in the right direction. 120 | 121 | Anderson Silva is fighting tonight, so there will be a fight break 122 | later today. Hope I can finish a simple version of havoc until then. 123 | 124 | 16:26 125 | 126 | I seriously need a way to identify the shell processes. What I have 127 | found so far is that there are two of them. One which is the actual 128 | shell, and the other which keeps track of the history. They don't 129 | belong to any application. 130 | 131 | It is possible to start multiple shells in the same shell session, but 132 | there are still only two shell processes which are not flagged as 133 | system processes. 134 | 135 | 17:03 136 | 137 | Ok, so I know why the shell is so elusive. When running 138 | pman_process:is_system_process in the shell, it is not a system 139 | process. 
When running the same function from somewhere else, it is. 140 | Gaah. 141 | 142 | Anyway, it seems like the answer is inside shell.erl itself. I can use 143 | group:interfaces/1 to find the shell. 144 | 145 | 17:27 146 | 147 | YAY! Freaking rocks. 148 | 149 | So the stupid shell process which is a system process when kindly 150 | standing around waiting for something to run and then stops being a 151 | system process as soon as it's doing something, has now been 152 | discovered. 153 | 154 | I now have a way of finding out whether a Pid refers to a shell or 155 | not. Let's hope that it stands up in real life. I think so. See 156 | chaos_monkey:is_shell/1 for details. 157 | 158 | 17:30 159 | 160 | Let's have a small break. 161 | 162 | 17:49 163 | 164 | Break over. 165 | 166 | 18:25 167 | 168 | Lots of cleanup of code and documentation. Time to focus on having 169 | some sort of version of havoc ready before going to bed. 170 | 171 | 20:04 172 | 173 | Leaving to watch the UFC fight. The Chaos Monkey can wreak some very 174 | minor havoc. More will have to wait until later. 175 | 176 | 00:50 177 | 178 | Anderson Silva kept his belt. And I got to eat spare ribs. Everyone 179 | is happy. 180 | 181 | 01:13 182 | 183 | The good thing with working alone is that when a commit doesn't 184 | compile, nobody complains. Took me 25 lines and five minutes to have 185 | it compiling again. I hope it works too... 186 | 187 | 01:45 188 | 189 | Major code improvements. Not really moving closer to the goal, but 190 | the code *looks* much nicer. 191 | 192 | 02:45 193 | 194 | I think I'll introduce more bugs than I fix by now. Time to sleep. 195 | See you all tomorrow. 196 | 197 | I'll write some placeholder documentation for later. 198 | 199 | 12:28 200 | 201 | I'm back! And only seven and a half hours to go. Yikes! So maybe I 202 | didn't go straight to bed when I wrote the above text. 
:-) 203 | 204 | 14:09 205 | 206 | Apparently I'm leading the commit-race, and have more than 10% of the 207 | total commits in the whole of Spawnfest. I think that by lines of 208 | code I'm down by the bottom, and by functionality I'm probably around 209 | whatever. 210 | 211 | 14:21 212 | 213 | Now I have all the information I need to do the rest of what I have 214 | planned. How come I can't even build a stupid tree! 215 | 216 | 15:40 217 | 218 | There is a tree structure. Ugly, but hopefully works well enough for 219 | my purposes. I could possibly have used digraph for this, but 220 | whatever. Next time. 221 | 222 | 15:44 223 | 224 | INSTALL written. Should explain things well enough I hope. 225 | 226 | 16:15 227 | 228 | Yikes. 3 hours and change to go. And orphans get killed by the 229 | monkey. The main issue is still a bit away. 230 | 231 | 17:07 232 | 233 | Print a count of max number of processes at which point the tree would 234 | die. 235 | 236 | 17:20 237 | 238 | Better to keep the TODO list in a TODO file. 239 | 240 | 17:44 241 | 242 | Killing 23 processes in cowboy. My system tells me everything goes 243 | down at 23, but that it probably survives 22. Why doesn't this 244 | happen? 245 | 246 | 17:57 247 | 248 | Issue explained. It had to do with the restart strategy of the app. 249 | FIXED! 250 | 251 | And with zero lines of code (ok, so maybe ten lines of debug printout, 252 | but who's counting). 253 | 254 | 18:08 255 | 256 | I'm done! 257 | 258 | Major cleanup coming up. 259 | 260 | 18:43 261 | 262 | It feels like I'm done now. 263 | 264 | Now I just have to wait for that box of Malbec wine to show up. :-) 265 | 266 | 19:16 267 | 268 | Less than an hour to go. Just added more options for on/1, so that 269 | the user can pick which applications to harass. 270 | 271 | I'm not sure I dare adding more features. It would be silly to have a 272 | broken application in there when the bell rings. 
273 | 274 | 144 commits, 510 lines of chaos_monkey (excluding _app, _sup, and 275 | app.src), 98 lines of README, 47 lines of INSTALL, and 279 lines of 276 | diary. Lots of features. Heaps of fun. 277 | 278 | All in all, two good days of awesome. 279 | -------------------------------------------------------------------------------- /INSTALL: -------------------------------------------------------------------------------- 1 | The simplest way is 2 | 3 | make 4 | 5 | or 6 | 7 | ./rebar compile 8 | 9 | It also works cleanly if you include it as a dependency in rebar. 10 | 11 | To test, find yourself a project, include the path to the code, run 12 | application:start(chaos_monkey), and then run chaos_monkey:on() or 13 | chaos_monkey:kill(), to kill processes over time or kill a single 14 | process respectively. You can also test the stability of a single 15 | application with chaos_monkey:almost_kill([AppName]). 16 | 17 | I've personally been using cowboy_examples for testing. Just because 18 | installation was so extremely quick. If your current working 19 | directory is chaos_monkey, then do: 20 | 21 | cd .. 22 | git clone https://github.com/extend/cowboy_examples.git 23 | make 24 | 25 | Add "-pa ../chaos_monkey/ebin -s chaos_monkey \" as the third line in 26 | start.sh (see below for diff). 27 | 28 | You can now run The Chaos Monkey according to the documentation in 29 | README. Note that cowboy_examples have temporary applications so over 30 | time you will kill it off and The Chaos Monkey will have nothing to 31 | do. 32 | 33 | Happy killing. 34 | 35 | 36 | diff --git a/start.sh b/start.sh 37 | index d704f86..0a3c226 100755 38 | --- a/start.sh 39 | +++ b/start.sh 40 | @@ -1,5 +1,6 @@ 41 | #!/bin/sh 42 | erl -sname cowboy_examples -pa ebin -pa deps/*/ebin -s cowboy_examples \ 43 | + -pa ../chaos_monkey/ebin -s chaos_monkey \ 44 | -eval "io:format(\"~n~nThe following examples are available:~n\")." \ 45 | -eval "io:format(\"* Hello world: http://localhost:8080~n\")." 
\ 46 | -eval "io:format(\"* Websockets: http://localhost:8080/websocket~n\")." \ 47 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | .PHONY: all 2 | all: 3 | ./rebar compile -------------------------------------------------------------------------------- /README: -------------------------------------------------------------------------------- 1 | +---------------------------------------------------------+ 2 | | Canonical repo at https://github.com/dLuna/chaos_monkey | 3 | +---------------------------------------------------------+ 4 | 5 | This is The CHAOS MONKEY. It will kill your processes. 6 | 7 | "What kills you, makes you stronger" -- The Chaos Monkey 8 | 9 | The purpose of The Chaos Monkey is to find out if your system is 10 | stable or not. What will your system do when things start to go wrong 11 | and your processes die randomly? The Chaos Monkey will show you. 12 | With a stick. 13 | 14 | Start by including the chaos_monkey amongst your applications. The 15 | Chaos Monkey sits in its cage and eats bananas. Let it out by running 16 | chaos_monkey:on() or have it kill a single process using 17 | chaos_monkey:kill(). More installation instructions can be found in 18 | INSTALL. 19 | 20 | 21 | chaos_monkey:on() -> {ok, started} | {error, already_running} 22 | chaos_monkey:on(Opts) -> 23 | {ok, started} 24 | | {error, already_running} 25 | | {error, badarg} 26 | chaos_monkey:off() -> {ok, stopped} | {error, not_running} 27 | 28 | Types: 29 | 30 | Opts :: [Opt] 31 | Opt :: {ms, non_neg_integer()} 32 | | {apps, all | all_but_otp | [atom()]} 33 | 34 | Will let The Chaos Monkey wreak reasonable havoc over time on your 35 | system. This is the command to use if you want The Chaos Monkey 36 | running all the time. 37 | 38 | It will stay away from system processes and supervisors like a good 39 | monkey. 
40 | 41 | Opts default to [{ms, 5000}, {apps, all_but_otp}] which allows The 42 | Chaos Monkey to kill one process every five seconds on average. The 43 | actual wait between kills varies by up to 30% either way (3.5 to 6.5 seconds with the default). If your restart frequency 44 | setting doesn't allow for this then you could be in for a surprise. 45 | 46 | 47 | chaos_monkey:almost_kill() -> 48 | {ok, NumberOfKilledProcesses} | {error, Error} 49 | chaos_monkey:almost_kill(Applications) -> 50 | {ok, NumberOfKilledProcesses} | {error, Error} 51 | 52 | Types: 53 | 54 | Applications :: all | all_but_otp | [application()] 55 | NumberOfKilledProcesses :: non_neg_integer() 56 | Error :: term() 57 | 58 | Synchronous. 59 | 60 | This function call will almost kill your system. If it works as 61 | published, The Chaos Monkey should stay one process away from 62 | bringing your system down. Can you recover from that? 63 | 64 | The Chaos Monkey will randomly walk through processes belonging to 65 | the list of applications and kill things. Supervisors are too 66 | strong for The Chaos Monkey, so it will kill their children instead, 67 | aiming to kill them by going above the restart threshold. The Chaos 68 | Monkey is not suicidal so it will respect restart thresholds of 69 | permanent top level supervisors. 70 | 71 | Supervisors are not the only ones spared: system processes and processes 72 | in the kernel application are too strong as well. As mentioned above, The 73 | Chaos Monkey will avoid suicide and by extension its siblings and 74 | parent. 75 | 76 | The Applications argument tells The Chaos Monkey to focus its 77 | killing spree on: 78 | 79 | all -- All applications are available for killing. 80 | 81 | all_but_otp -- The Chaos Monkey will stay away from applications 82 | in OTP. Everything else is fair bait. Default. 83 | 84 | [ListOfApps] -- Sic The Chaos Monkey on the list of applications. 85 | Remember that the Monkey will always see lonesome 86 | processes that don't have the protection of an 87 | application as available for harassment. 
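Going back to the Opts accepted by on/1: the sketch below illustrates, under stated assumptions, how user options override the defaults and what range of waiting times a given ms value leads to. It mirrors what the source in src/chaos_monkey.erl appears to do (user options are prepended to the defaults and the first match wins; the wait is 70% to 130% of ms), but the module name monkey_opts_sketch and its function names are made up for illustration and are not part of chaos_monkey. The random sample is passed in as an argument so the arithmetic can be checked by hand.

```erlang
-module(monkey_opts_sketch).
-export([effective/1, wait_time/2, wait_range/1]).

%% Defaults as documented in the README.
-define(DEFAULT_OPTS, [{ms, 5000}, {apps, all_but_otp}]).

%% User options are placed first, so lists:keyfind/3 finds them
%% before it reaches the defaults.
effective(UserOpts) ->
    Opts = UserOpts ++ ?DEFAULT_OPTS,
    {ms, Ms} = lists:keyfind(ms, 1, Opts),
    {apps, Apps} = lists:keyfind(apps, 1, Opts),
    {Ms, Apps}.

%% Wait before the next kill for a uniform sample U between 0.0 and 1.0.
%% With a variation of 0.3 this spans 70% to 130% of AvgWait.
wait_time(AvgWait, U) ->
    Var = 0.3,
    round(AvgWait * ((1 - Var) + Var * 2 * U)).

%% Shortest and longest possible wait for a given average.
wait_range(AvgWait) ->
    {wait_time(AvgWait, 0.0), wait_time(AvgWait, 1.0)}.
```

For example, monkey_opts_sketch:effective([{ms, 1000}]) gives {1000, all_but_otp}, and monkey_opts_sketch:wait_range(5000) gives {3500, 6500}. A supervisor whose restart intensity cannot absorb a kill every 3.5 seconds is the "surprise" case mentioned above.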
88 | 89 | 90 | chaos_monkey:kill() -> {ok, ProcData}. 91 | 92 | Kills a single random non-OTP process in your system. 93 | 94 | ProcData contains information about the process that was killed. 95 | 96 | 97 | chaos_monkey:find_orphans() -> [Pid] 98 | 99 | The Chaos Monkey will smell your processes and find the ones which 100 | lack protection from an application. It gladly hands them over to 101 | you to do with as you please. A well-behaved system should return 102 | [] when calling this function. 103 | -------------------------------------------------------------------------------- /rebar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dLuna/chaos_monkey/3415c81d8f1bc77ad20735f9e12333e4aab540eb/rebar -------------------------------------------------------------------------------- /src/chaos_monkey.app.src: -------------------------------------------------------------------------------- 1 | %% -*- Erlang -*- 2 | {application, chaos_monkey, 3 | [ 4 | {description, "A Monkey that Spreads Chaos"}, 5 | {vsn, git}, 6 | {modules, 7 | [chaos_monkey]}, 8 | {registered, [chaos_monkey]}, 9 | {applications, 10 | [kernel, 11 | stdlib]}, 12 | {mod, {chaos_monkey_app, []}}, 13 | {env, []} 14 | ] 15 | }. 16 | -------------------------------------------------------------------------------- /src/chaos_monkey.erl: -------------------------------------------------------------------------------- 1 | %% @author Daniel Luna 2 | %% @copyright 2012 Daniel Luna 3 | %% @doc 4 | -module(chaos_monkey). 5 | -author('Daniel Luna '). 6 | -behaviour(gen_server). 7 | 8 | %% API 9 | -export([start/0]). 10 | -export([start_link/0]). 11 | 12 | -export([find_orphans/0, 13 | almost_kill/0, 14 | almost_kill/1, 15 | kill/0, 16 | off/0, 17 | on/0, 18 | on/1]). 19 | 20 | %% gen_server callbacks 21 | -export([init/1, handle_call/3, handle_cast/2, handle_info/2, 22 | terminate/2, code_change/3]). 23 | 24 | -define(SERVER, ?MODULE). 
25 | 26 | -define(TIMER, 5000). 27 | -define(DEFAULT_OPTS, 28 | [{ms, ?TIMER}, 29 | {apps, all_but_otp}]). 30 | 31 | -record(state, { 32 | is_active = false, 33 | avg_wait, 34 | timer_ref, 35 | apps, 36 | intervals = []}). 37 | 38 | start() -> 39 | application:start(?MODULE). 40 | 41 | start_link() -> 42 | gen_server:start_link({local, ?SERVER}, ?MODULE, [], []). 43 | 44 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 45 | %% START OF EXTERNAL API 46 | 47 | almost_kill() -> 48 | do_almost_kill(all_but_otp). 49 | 50 | almost_kill(Apps) -> 51 | do_almost_kill(Apps). 52 | 53 | find_orphans() -> 54 | do_find_orphans(). 55 | 56 | kill() -> 57 | do_kill(all_but_otp). 58 | 59 | on() -> 60 | gen_server:call(?SERVER, {on, ?DEFAULT_OPTS}, infinity). 61 | 62 | on(Opts) -> 63 | gen_server:call(?SERVER, {on, Opts ++ ?DEFAULT_OPTS}, infinity). 64 | 65 | off() -> 66 | gen_server:call(?SERVER, off, infinity). 67 | 68 | %% END OF EXTERNAL API 69 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 70 | %% START OF GEN_SERVER CALLBACKS 71 | 72 | init([]) -> 73 | case application:get_env(auto_start) of 74 | undefined -> 75 | ok; 76 | {ok, false} -> 77 | ok; 78 | {ok, true} -> 79 | Opts = case application:get_env(ms) of 80 | undefined -> []; 81 | {ok, Ms} -> [{ms, Ms}] 82 | end ++ 83 | case application:get_env(apps) of 84 | undefined -> []; 85 | {ok, Apps} -> [{apps, Apps}] 86 | end ++ 87 | ?DEFAULT_OPTS, 88 | case verify_opts(Opts) of 89 | {ok, _, _} -> 90 | spawn_link(fun() -> {ok, started} = on(Opts) end); 91 | {error, Error} -> 92 | exit(Error) 93 | end 94 | end, 95 | random:seed(now()), 96 | {ok, #state{}}. 97 | 98 | handle_call({on, Opts}, _From, State = #state{is_active = false}) -> 99 | case verify_opts(Opts) of 100 | {ok, Ms, Apps} -> 101 | NewState = State#state{avg_wait = Ms, 102 | apps = Apps, 103 | is_active = true}, 104 | self() ! 
kill_something, 105 | {reply, {ok, started}, NewState}; 106 | {error, Error} -> 107 | {reply, {error, Error}, State} 108 | end; 109 | handle_call(off, _From, State = #state{is_active = true, timer_ref = Ref}) -> 110 | timer:cancel(Ref), 111 | receive kill_something -> ok 112 | after 0 -> ok 113 | end, 114 | NewState = State#state{is_active = false}, 115 | {reply, {ok, stopped}, NewState}; 116 | handle_call({on, _}, _From, State = #state{is_active = true}) -> 117 | {reply, {error, already_running}, State}; 118 | handle_call(off, _From, State = #state{is_active = false}) -> 119 | {reply, {error, not_running}, State}; 120 | 121 | handle_call(_Msg, _From, State) -> 122 | {reply, {error, unknown_call}, State}. 123 | 124 | handle_cast(_Msg, State) -> 125 | {noreply, State}. 126 | 127 | handle_info(kill_something, State = #state{avg_wait = AvgWait, apps = Apps}) -> 128 | case do_kill(Apps) of 129 | {ok, KilledInfo} -> 130 | p("Killed ~p", [KilledInfo]); 131 | {error, no_killable_processes} -> 132 | p("Warning: no killable processes.", []) 133 | end, 134 | Var = 0.3, %% I.e. 70% to 130% of Waittime 135 | WaitTime = round(AvgWait * ((1 - Var) + (Var * 2 * random:uniform()))), 136 | {ok, Ref} = timer:send_after(WaitTime, kill_something), 137 | {noreply, State#state{timer_ref = Ref}}; 138 | handle_info(Info, State) -> 139 | p("Unknown info ~p", [Info]), 140 | {noreply, State}. 141 | 142 | terminate(_Reason, _State) -> 143 | ok. 144 | 145 | code_change(_OldVsn, State, _Extra) -> 146 | {ok, State}. 147 | 148 | %% END OF GEN_SERVER CALLBACKS 149 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 150 | %% START OF DO_ FUNCTIONS 151 | 152 | do_almost_kill(AppFilter) -> 153 | All = [{case application:get_application(P) of 154 | {ok, App} -> App; %% No apps named undefined, please! 
155 | undefined -> undefined 156 | end, P} || P <- erlang:processes()], 157 | TaggedProcesses = 158 | lists:filter(fun({App, Pid}) -> 159 | is_killable(Pid, App, AppFilter, false) 160 | end, randomize(All)), 161 | ByApp = lists:foldl(fun({App, P}, [{App, Ps} | Acc]) -> 162 | [{App, [P | Ps]} | Acc]; 163 | ({App, P}, Acc) -> 164 | [{App, [P]} | Acc] 165 | end, [], lists:sort(TaggedProcesses)), 166 | %% Start off by killing everything which doesn't belong to an app 167 | {KilledNoApp, Ps1} = 168 | case lists:keytake(undefined, 1, ByApp) of 169 | {value, {undefined, Undefined}, PsNoUndefined} -> 170 | {lists:foldl( 171 | fun(Pid, N) -> 172 | case is_supervisor(Pid) of 173 | true -> 174 | p("Why is there a supervisor which " 175 | "doesn't belong to an application. " 176 | "Take a closer look at ~p", [Pid]), 177 | %% Should I handle this better? 178 | %% Probably. Because this will 179 | %% happen whenever somebody out 180 | %% there can't be bothered making 181 | %% proper app files. Which happens a 182 | %% lot. 183 | N; 184 | false -> 185 | kill(Pid), 186 | N + 1 187 | end 188 | end, 0, Undefined), PsNoUndefined}; 189 | false -> 190 | {0, ByApp} 191 | end, 192 | KilledApp = 193 | lists:sum( 194 | [begin 195 | p("About to kill things in ~p", [App]), 196 | app_killer(App, Pids) 197 | end || {App, Pids} <- randomize(Ps1)]), 198 | {ok, KilledNoApp + KilledApp}. 199 | 200 | do_find_orphans() -> 201 | Ps = [{P, 202 | application:get_application(P), 203 | is_system_process(P)} 204 | || P <- erlang:processes()], 205 | lists:zf(fun({P, undefined, false}) -> 206 | case is_shell(P) of 207 | true -> false; 208 | false -> {true, P} 209 | end; 210 | (_) -> false end, Ps). 211 | 212 | do_kill(AppFilter) -> 213 | kill_one(randomize(erlang:processes()), AppFilter). 
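Editorial sketch, not part of chaos_monkey.erl: the core of the orphan search in do_find_orphans/0 is asking application:get_application/1 about every process and keeping those it cannot place. The stand-alone module below shows only that step; the module name orphan_sketch is hypothetical, and unlike the real function it does not filter out system processes or the shell, so expect it to return more pids than chaos_monkey:find_orphans/0 would.

```erlang
-module(orphan_sketch).
-export([without_app/0]).

%% Every local process that application:get_application/1 cannot map
%% to an application. Processes that die while the list is being built
%% simply come back as undefined and are included.
without_app() ->
    [P || P <- erlang:processes(),
          application:get_application(P) =:= undefined].
```

Calling orphan_sketch:without_app() from the shell typically includes the shell's own processes, which is the problem the diary spends an afternoon on.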
214 | 215 | %% END OF DO_ FUNCTIONS 216 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 217 | %% START OF AUXILIARY FUNCTIONS 218 | 219 | verify_opts(Opts) -> 220 | try case lists:keyfind(ms, 1, Opts) of 221 | {ms, Ms} when is_integer(Ms), Ms >= 0 -> 222 | case lists:keyfind(apps, 1, Opts) of 223 | {apps, Apps} -> 224 | case Apps =:= all 225 | orelse Apps =:= all_but_otp 226 | orelse lists:all(fun(X) -> is_atom(X) end, Apps) of 227 | true when is_list(Apps) -> 228 | AllApps = application:loaded_applications(), 229 | case lists:all( 230 | fun(X) -> 231 | lists:keymember(X, 1, AllApps) 232 | end, Apps) of 233 | true -> 234 | {ok, Ms, Apps}; 235 | false -> 236 | {error, unknown_application} 237 | end; 238 | true -> 239 | {ok, Ms, Apps}; 240 | false -> 241 | {error, badly_formed_apps} 242 | end; 243 | _ -> 244 | {error, bad_apps} 245 | end; 246 | _ -> 247 | {error, bad_ms} 248 | end 249 | catch 250 | _:_ -> 251 | {error, badarg} 252 | end. 253 | 254 | randomize(Xs) -> 255 | [V || {_, V} <- lists:sort([{random:uniform(), X} || X <- Xs])]. 256 | 257 | %% random(L) -> 258 | %% lists:nth(random:uniform(length(L)), L). 259 | 260 | -record(node, 261 | {pid, 262 | type, 263 | will_die_at = 1, 264 | intensity, 265 | period, 266 | child_data = []}). 267 | 268 | app_killer(App, Pids) -> 269 | {Sups, Other} = lists:partition(fun(Pid) -> is_supervisor(Pid) end, Pids), 270 | SupStates = [supervision_state(Pid) || Pid <- Sups], 271 | {Orphans, Tree} = make_tree(SupStates, Other), 272 | KilledOrphans = 273 | case Orphans =:= [] of 274 | true -> 0; 275 | false -> 276 | p("There are processes in ~s which don't belong to a " 277 | "supervision tree. The Chaos Monkey stomps on them.", 278 | [App]), 279 | [kill(Pid) || Pid <- Orphans], 280 | length(Orphans) 281 | end, 282 | {RealTree, KilledTrees} = 283 | case Tree of 284 | [Single] -> 285 | {Single, 0}; 286 | [Single | _] -> 287 | p("There are multiple top level supervision trees for ~p. " 288 | "The Chaos Monkey picks one randomly to harass. 
At some " 289 | "point in the future it will pick the right one and kill " 290 | "the others. Hand in a feature request.", [App]), 291 | %% Look at '$ancestors' to find the *real* supervision 292 | %% tree. Easy to do but we are under a deadline 293 | %% here... 294 | {Single, 0} 295 | end, 296 | p("Supervision tree for ~p will go down at ~p kills if The Chaos " 297 | "Monkey kills it in the right order.", 298 | [App, RealTree#node.will_die_at]), 299 | TreeKilling = almost_kill_tree(RealTree), 300 | KilledOrphans + KilledTrees + TreeKilling. 301 | 302 | almost_kill_tree(#node{type = child}) -> 0; 303 | almost_kill_tree(#node{type = supervisor, 304 | child_data = Children, 305 | intensity = Intensity}) -> 306 | KillAllButOne = lists:sublist(Children, Intensity), 307 | case randomize(KillAllButOne) of 308 | [] -> 0; %% nothing to kill, we are done here 309 | L when length(L) < Intensity -> 310 | lists:sum([kill_tree(Child) || Child <- Children]); 311 | [DontKill | KillList] -> 312 | almost_kill_tree(DontKill) + 313 | lists:sum([kill_tree(Kill) || Kill <- KillList]) 314 | end. 315 | 316 | kill_tree(#node{pid = Pid, type = child}) -> 317 | kill(Pid), 318 | %% p("killed", []), 319 | 1; 320 | kill_tree(#node{type = supervisor, 321 | child_data = Children, 322 | intensity = Intensity}) -> 323 | KillAll = lists:sublist(Children, Intensity), 324 | lists:sum([kill_tree(Child) || Child <- KillAll]). 325 | 326 | %% Copied from supervisor.erl 327 | -record(child, {% pid is undefined when child is not running 328 | pid = undefined, 329 | name, 330 | mfargs, 331 | restart_type, 332 | shutdown, 333 | child_type, 334 | modules = []}). 
335 | 336 | supervision_state(Pid) -> 337 | try sys:get_status(Pid) of 338 | %% from sys.erl 339 | {status, Pid, {module, _Mod}, 340 | [_PDict, _SysState, _Parent, _Debug, FmtMisc]} -> 341 | [_, _, {data, [{"State", State}]}] = FmtMisc, 342 | %% From supervisor.erl but I already have a #state{} in 343 | %% this module so cannot copy the one from there 344 | {state, _Name, _Strategy, Children, _Dynamics, Intensity, 345 | Period, _Restarts, _Module, _Args} = State, 346 | ChildPids = [CPid || #child{pid = CPid} <- Children], 347 | {#node{pid = Pid, 348 | type = supervisor, 349 | intensity = Intensity, 350 | period = Period}, ChildPids} 351 | catch 352 | exit:timeout -> 353 | throw({supervisor_died_before_we_could_query_it, 354 | report_this_as_a_bug_or_just_rerun_the_command}) 355 | end. 356 | 357 | make_tree(SupStates, OtherPids) -> 358 | make_tree(SupStates, OtherPids, []). 359 | 360 | make_tree([], Orphans, Tree) -> 361 | {Orphans, Tree}; 362 | make_tree([{Node, Children} | SupStates], OtherPids, Completed) -> 363 | {NewSupStates, NewOtherPids, NewCompleted, NewNode} = 364 | make_tree(Children, SupStates, OtherPids, Completed, Node), 365 | make_tree(NewSupStates, NewOtherPids, [NewNode | NewCompleted]). 366 | 367 | make_tree([], SupStates, OtherPids, Completed, Node) -> 368 | Sorted = lists:sort(fun(#node{will_die_at = D1}, 369 | %% TODO: check that this comparison is in 370 | %% the right direction. I don't have a 371 | %% good test case and am too tired to do 372 | %% it by thinking. Deadline moving 373 | %% closer... 
374 | #node{will_die_at = D2}) -> D1 > D2 375 | end, Node#node.child_data), 376 | WillDieAt = 377 | lists:sum(lists:sublist([N#node.will_die_at || N <- Sorted], 378 | Node#node.intensity)), 379 | {SupStates, OtherPids, Completed, Node#node{will_die_at = WillDieAt, 380 | child_data = Sorted}}; 381 | make_tree([ChildPid | ChildPids], 382 | SupStates, 383 | OtherPids, 384 | Completed, 385 | Node) -> 386 | case lists:keytake(ChildPid, 1, SupStates) of 387 | {value, {ChildPid, {Node, ChildChildren}}, NewSupStates} -> 388 | {NewNewSupStates, NewOtherPids, NewCompleted, Child} = 389 | make_tree(ChildChildren, 390 | NewSupStates, 391 | OtherPids, 392 | Completed, 393 | Node), 394 | make_tree(ChildPids, 395 | NewNewSupStates, 396 | NewOtherPids, 397 | NewCompleted, 398 | Node#node{child_data = [Child | Node#node.child_data]}); 399 | false -> 400 | case lists:splitwith(fun(X) -> X =/= ChildPid end, OtherPids) of 401 | {Pre, [ChildPid | Post]} -> 402 | make_tree(ChildPids, 403 | SupStates, 404 | Pre ++ Post, 405 | Completed, 406 | Node#node{ 407 | child_data = [#node{pid = ChildPid, 408 | type = child} | 409 | Node#node.child_data]}); 410 | {_, []} -> 411 | case lists:keytake(ChildPid, #node.pid, Completed) of 412 | {value, CompletedChild, NewCompleted} -> 413 | make_tree(ChildPids, 414 | SupStates, 415 | OtherPids, 416 | NewCompleted, 417 | Node#node{ 418 | child_data = [CompletedChild | 419 | Node#node.child_data]}); 420 | false -> 421 | case ChildPid =:= undefined of 422 | true -> 423 | ok; 424 | false -> 425 | p("Missing child ~p, ignoring", [ChildPid]) 426 | end, 427 | make_tree(ChildPids, 428 | SupStates, 429 | OtherPids, 430 | Completed, 431 | Node) 432 | end 433 | end 434 | end. 435 | 436 | kill_one([], _AppFilter) -> {error, no_killable_processes}; 437 | kill_one([Pid | Pids], AppFilter) -> 438 | App = case application:get_application(Pid) of 439 | {ok, A} -> A; %% No apps named undefined, please! 
440 | undefined -> undefined 441 | end, 442 | case is_killable(Pid, App, AppFilter) of 443 | true -> {ok, {Pid, App, kill(Pid)}}; 444 | false -> kill_one(Pids, AppFilter) 445 | end. 446 | 447 | kill(Pid) -> 448 | erlang:monitor(process, Pid), 449 | exit(Pid, im_killing_you), 450 | receive 451 | {'DOWN', _, process, Pid, Reason} -> 452 | Reason 453 | after 500 -> 454 | exit(Pid, kill), 455 | receive 456 | {'DOWN', _, process, Pid, Reason} -> 457 | Reason 458 | end 459 | end. 460 | 461 | %% pinfo(Pid) -> [{Item, Info}] | undefined 462 | %% pinfo(Pid, Item) -> Info | undefined 463 | %% A version of process_info/1 that handles pid on remote nodes as well. 464 | pinfo({_, Pid}) -> % Handle internal process format 465 | pinfo(Pid); 466 | pinfo(Pid) when node(Pid)==node() -> 467 | process_info(Pid); 468 | pinfo(Pid) -> 469 | case rpc:call(node(Pid), erlang, process_info, [Pid]) of 470 | {badrpc, _} -> undefined; 471 | Res -> Res 472 | end. 473 | 474 | pinfo({_, Pid}, Item) -> % Handle internal process format 475 | pinfo(Pid, Item); 476 | pinfo(Pid, Item) when node(Pid)==node() -> 477 | case process_info(Pid, Item) of 478 | {Item, Info} -> Info; 479 | "" -> ""; % Item == registered_name 480 | undefined -> undefined 481 | end; 482 | pinfo(Pid, Item) -> 483 | case rpc:call(node(Pid), erlang, process_info, [Pid, Item]) of 484 | {badrpc, _} -> undefined; 485 | {Item, Info} -> Info; 486 | "" -> ""; % Item == registered_name 487 | undefined -> undefined 488 | end. 
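Editorial sketch, not part of chaos_monkey.erl: kill/1 above first sends a trappable exit signal and only falls back to exit(Pid, kill) if no 'DOWN' message arrives within 500 ms. The stand-alone module below reproduces that escalation so it can be tried in isolation. The module name kill_sketch and the demo/0 helper are hypothetical; one small deviation from the source is that the monitor reference is bound and matched in the receive, which is an assumption about what a tightened version might look like rather than the author's code.

```erlang
-module(kill_sketch).
-export([kill/1, demo/0]).

%% Ask Pid to exit, wait for the 'DOWN' message, and escalate to an
%% untrappable kill if the process is still alive after 500 ms.
%% Returns the exit reason reported by the monitor.
kill(Pid) ->
    Ref = erlang:monitor(process, Pid),
    exit(Pid, im_killing_you),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            Reason
    after 500 ->
        exit(Pid, kill),
        receive
            {'DOWN', Ref, process, Pid, Reason} ->
                Reason
        end
    end.

%% Kills one ordinary process and one that traps exits.
demo() ->
    Parent = self(),
    Plain = spawn(fun() -> receive stop -> ok end end),
    Stubborn = spawn(fun() ->
                         process_flag(trap_exit, true),
                         %% Tell the parent that trapping is switched on.
                         Parent ! ready,
                         stubborn_loop()
                     end),
    receive ready -> ok end,
    {kill(Plain), kill(Stubborn)}.

%% Swallows every message, including trapped 'EXIT' messages.
stubborn_loop() ->
    receive _ -> stubborn_loop() end.
```

kill_sketch:demo() returns {im_killing_you, killed}: the ordinary process dies from the first signal, while the one trapping exits survives it and is only brought down by the kill signal half a second later.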
489 | 490 | %% END OF AUXILIARY FUNCTIONS 491 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 492 | %% START OF FORMATTING FUNCTIONS 493 | 494 | p(Format, Data) -> 495 | catch throw(get_stacktrace), Stacktrace = erlang:get_stacktrace(), 496 | MFAInfo = hd(tl(Stacktrace)), 497 | String = 498 | case MFAInfo of 499 | {M, F, A} -> 500 | format_single_line("~p ~p:~p/~p " ++ Format, 501 | [self(), M, F, A | Data]); 502 | {M, F, A, Info} -> 503 | case lists:keysearch(line, 1, Info) of 504 | {value, {line, Line}} -> 505 | format_single_line("~p ~p:~p/~p #~p " ++ Format, 506 | [self(), M, F, A, Line | Data]); 507 | false -> 508 | format_single_line("~p ~p:~p/~p " ++ Format, 509 | [self(), M, F, A | Data]) 510 | end 511 | end, 512 | io:format("~s~n", [String]). 513 | 514 | format_single_line(Format, Data) -> 515 | oneline(lists:flatten(io_lib:format(Format, Data))). 516 | 517 | oneline([$\n | Rest]) -> [$\s | newline(Rest)]; 518 | oneline([C | Rest]) -> [C | oneline(Rest)]; 519 | oneline([]) -> []. 520 | 521 | newline([$\s | Rest]) -> newline(Rest); 522 | newline(Rest) -> oneline(Rest). 523 | 524 | %% END OF FORMATTING FUNCTIONS 525 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 526 | %% START OF BOOLEAN FUNCTIONS 527 | 528 | is_supervisor(Pid) -> 529 | %% inspired by pman_process:is_system_process2/1 which seems 530 | %% cleaner somehow than just grabbing the info from the process_info 531 | %% dictionary (this is what happens in the background anyway). 532 | Init = 533 | case erlang:process_info(Pid, initial_call) of 534 | {initial_call, I} -> I; 535 | undefined -> process_is_dead 536 | end, 537 | SortofActualInit = 538 | case Init of 539 | {proc_lib, init_p, 5} -> proc_lib:translate_initial_call(Pid); 540 | Init -> Init 541 | end, 542 | case SortofActualInit of 543 | {supervisor, _, _} -> true; 544 | _ -> false 545 | end. 
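Editorial sketch, not part of chaos_monkey.erl: is_supervisor/1 above relies on proc_lib:translate_initial_call/1 reporting {supervisor, CallbackModule, 1} for supervisor processes. The stand-alone module below starts an empty supervisor and applies the same check to it and to the calling process. The module name sup_check_sketch and demo/0 are hypothetical, and the check is slightly simplified: anything not started through proc_lib is treated as "not a supervisor" straight away.

```erlang
-module(sup_check_sketch).
-behaviour(supervisor).
-export([is_supervisor/1, demo/0]).
-export([init/1]).

%% Supervisor callback: a one_for_one supervisor without children.
init([]) ->
    {ok, {{one_for_one, 1, 5}, []}}.

%% True if Pid was started through proc_lib and its translated
%% initial call names the supervisor behaviour.
is_supervisor(Pid) ->
    case erlang:process_info(Pid, initial_call) of
        {initial_call, {proc_lib, init_p, 5}} ->
            case proc_lib:translate_initial_call(Pid) of
                {supervisor, _CallbackModule, _Arity} -> true;
                _ -> false
            end;
        _ ->
            %% Dead process, or one not started through proc_lib.
            false
    end.

%% Starts a throwaway supervisor, checks it and the caller, then
%% shuts the supervisor down again.
demo() ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    Result = {is_supervisor(Sup), is_supervisor(self())},
    unlink(Sup),
    exit(Sup, shutdown),
    Result.
```

Run from the shell, sup_check_sketch:demo() returns {true, false}.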
546 | 547 | -define(OTP_APPS, 548 | [appmon, asn1, common_test, compiler, cosEvent, 549 | cosEventDomain, cosFileTransfer, cosNotification, 550 | cosProperty, cosTime, cosTransactions, crypto, debugger, 551 | dialyzer, diameter, edoc, eldap, erl_docgen, erl_interface, 552 | erts, et, eunit, gs, hipe, ic, inets, inviso, jinterface, 553 | kernel, megaco, mnesia, observer, odbc, orber, os_mon, 554 | otp_mibs, parsetools, percept, pman, public_key, reltool, 555 | runtime_tools, sasl, snmp, ssh, ssl, stdlib, syntax_tools, 556 | test_server, toolbar, tools, tv, typer, webtool, wx, xmerl]). 557 | 558 | %% List of registered names that will make a process a "SYSTEM"-process 559 | -define(SYSTEM_REG_NAMES, 560 | [%% kernel 561 | application_controller, erl_reply, auth, boot_server, code_server, 562 | disk_log_server, disk_log_sup, erl_prim_loader, error_logger, 563 | file_server_2, fixtable_server, global_group, global_name_server, 564 | heart, inet_gethost_native, inet_gethost_native_sup, init, 565 | kernel_config, kernel_safe_sup, kernel_sup, net_kernel, net_sup, rex, 566 | user, os_server, ddll_server, erl_epmd, inet_db, pg2, 567 | 568 | %% stdlib 569 | timer_server, rsh_starter, take_over_monitor, pool_master, dets, 570 | 571 | %% sasl 572 | sasl_safe_sup, sasl_sup, alarm_handler, overload, release_handler, 573 | 574 | %% gs 575 | gs_frontend]). 576 | 577 | %% List of module:function/arity calls that will make the caller a 578 | %% "SYSTEM"-process.
579 | -define(SYSTEM_INIT_CALLS, 580 | [{application_master,init,4}, 581 | {application_master,start_it,4}, 582 | {inet_tcp_dist,accept_loop,2}, 583 | {net_kernel,ticker,2}, 584 | {supervisor_bridge,user_sup,1}, 585 | {user_drv,server,2}, 586 | {group,server,3}, 587 | {kernel_config,init,1}, 588 | {inet_tcp_dist,do_accept,6}, 589 | {inet_tcp_dist,do_setup,6}, 590 | {pman_main,init,2}, 591 | {pman_buf_printer,init,2}, 592 | {pman_buf_converter,init,2}, 593 | {pman_buf_buffer,init,1}, 594 | {gstk,init,1}, 595 | {gstk_port_handler,init,2}, 596 | {gstk,worker_init,1} 597 | ]). 598 | 599 | %% List of module:function/arity calls that will make the executing 600 | %% process a "SYSTEM"-process. 601 | -define(SYSTEM_RUNNING_CALLS, 602 | [{file_io_server,server_loop,1}, 603 | {global,loop_the_locker,1}, 604 | {global,collect_deletions,2}, 605 | {global,loop_the_registrar,0}, 606 | {gs_frontend,request,2}, 607 | {shell,get_command1,5}, 608 | {shell,eval_loop,3}, 609 | {io,wait_io_mon_reply,2}, 610 | {pman_module_info,loop,1}, 611 | {pman_options,dialog,3}, 612 | {pman_options,loop,1}, 613 | {pman_relay_server,loop,1}, 614 | {pman_shell,monitor_loop,1}, 615 | {pman_shell,safe_loop,2} 616 | ]). 617 | 618 | %% is_system_process(Pid) -> bool() 619 | %% Returns true if Pid is a "system process". 620 | %% This is a prototype version, use file configuration later. 621 | is_system_process(Pid) -> 622 | catch is_system_process2(Pid). 
623 | 624 | is_system_process2(Pid) -> 625 | 626 | %% Test if the registered name is a system registered name 627 | case pinfo(Pid, registered_name) of 628 | undefined -> ignore; 629 | "" -> ignore; 630 | Name -> 631 | case lists:member(Name, ?SYSTEM_REG_NAMES) of 632 | true -> throw(true); 633 | false -> ignore 634 | end 635 | end, 636 | 637 | %% Test if the start specification is a "system start function" 638 | MFAi = case pinfo(Pid, initial_call) of 639 | {proc_lib, init_p, 5} -> 640 | proc_lib:translate_initial_call(Pid); % {M,F,A} | Fun 641 | Res -> Res % {M,F,A} | undefined 642 | end, 643 | case lists:member(MFAi, ?SYSTEM_INIT_CALLS) of 644 | true -> throw(true); 645 | false -> ignore 646 | end, 647 | 648 | %% Test if the running specification is a "system running function" 649 | case pinfo(Pid, current_function) of 650 | undefined -> false; 651 | MFAc -> 652 | lists:member(MFAc, ?SYSTEM_RUNNING_CALLS) 653 | end. 654 | 655 | %% is_killable(Pid, App) -> 656 | %% is_killable(Pid, App, all_but_otp, true). 657 | 658 | is_killable(Pid, App, AppFilter) -> 659 | is_killable(Pid, App, AppFilter, true). 660 | 661 | is_killable(Pid, App, AppFilter, IsSupervisorKillable) 662 | when is_pid(Pid), is_atom(App), is_boolean(IsSupervisorKillable) -> 663 | (App =:= undefined 664 | orelse 665 | case AppFilter of 666 | all -> true; 667 | all_but_otp -> not(lists:member(App, ?OTP_APPS)); 668 | Apps when is_list(Apps) -> lists:member(App, Apps) 669 | end) 670 | andalso 671 | not(lists:member(App, [kernel, chaos_monkey])) 672 | andalso 673 | not(is_system_process(Pid)) 674 | andalso 675 | not(is_shell(Pid)) 676 | andalso 677 | not(Pid =:= self()) 678 | andalso 679 | (not(IsSupervisorKillable) 680 | orelse 681 | not(is_supervisor(Pid))). 682 | 683 | %% Theoretically pman_process:is_system_process/1 should say true for 684 | %% the shell. Well, it doesn't, so this is a workaround until it 685 | %% does. 686 | is_shell(Pid) -> 687 | %% The shell never belongs to any application.
To optimize, check 688 | %% that application:get_application(Pid) yields undefined before 689 | %% calling this function. 690 | case erlang:process_info(Pid, group_leader) of 691 | undefined -> false; %% process is dead 692 | {group_leader, Leader} -> 693 | case lists:keyfind(shell, 1, group:interfaces(Leader)) of 694 | {shell, Pid} -> true; 695 | {shell, Shell} -> 696 | case erlang:process_info(Shell, dictionary) of 697 | {dictionary, Dict} -> 698 | proplists:get_value(evaluator, Dict) =:= Pid; 699 | undefined -> false %% process is dead 700 | end; 701 | false -> false 702 | end 703 | end. 704 | -------------------------------------------------------------------------------- /src/chaos_monkey_app.erl: -------------------------------------------------------------------------------- 1 | %% @author Daniel Luna 2 | %% @copyright 2012 Daniel Luna 3 | %% @doc 4 | -module(chaos_monkey_app). 5 | -author('Daniel Luna '). 6 | -behaviour(application). 7 | 8 | %% Application callbacks 9 | -export([start/2, stop/1]). 10 | 11 | start(_StartType, _StartArgs) -> 12 | chaos_monkey_sup:start_link(). 13 | 14 | stop(_State) -> 15 | ok. 16 | 17 | -------------------------------------------------------------------------------- /src/chaos_monkey_sup.erl: -------------------------------------------------------------------------------- 1 | %% @author Daniel Luna 2 | %% @copyright 2012 Daniel Luna 3 | %% @doc 4 | -module(chaos_monkey_sup). 5 | -author('Daniel Luna '). 6 | -behaviour(supervisor). 7 | 8 | %% API 9 | -export([start_link/0]). 10 | 11 | %% Supervisor callbacks 12 | -export([init/1]). 13 | 14 | start_link() -> 15 | supervisor:start_link({local, ?MODULE}, ?MODULE, []). 16 | 17 | mk_spec(Type, Module, Args) -> 18 | Timeout = case Type of 19 | worker -> 5000; 20 | supervisor -> infinity 21 | end, 22 | {Module, 23 | {Module, start_link, Args}, 24 | permanent, 25 | Timeout, 26 | Type, 27 | [Module]}. 
28 | 29 | init([]) -> 30 | Children = [mk_spec(worker, chaos_monkey, [])], 31 | {ok, {{one_for_one, 5, 10}, Children}}. 32 | -------------------------------------------------------------------------------- /todo.org: -------------------------------------------------------------------------------- 1 | * TODO 'exclude apps' (instead of, or in addition to, 'include apps') 2 | suggested by German Todorov 3 | * TODO Option whether "orphan" processes should be killed or not 4 | * TODO randomly suspend gen servers for a couple of seconds to see what happens 5 | suggested by archaelus 6 | * TODO spawn(fun() -> erlang:group_leader(whereis(smemail), self()), App = application:get_application(), erlang:group_leader(whereis(user), self()), io:format("~p", [App]) end). 7 | * TODO Make small monkey logo 8 | to put on pages with the mention "battle tested with chaos_monkey!" 9 | * TODO Everything else in this file 10 | 11 | 2012-07-25 09:47 I had an idea in the back of my mind to 12 | make an app simulating a node going to 13 | the shitter. Playing with priorities, 14 | causing trouble with scheduling and CPU 15 | use, as if the node was overloaded 16 | 2012-07-25 09:47 random failures, etc. 17 | --- ... --- 18 | 2012-07-25 11:13 essen: Let me know what features could be 19 | added to the chaos monkey and I'll see 20 | what I can do. 21 | 2012-07-25 11:19 what MononcQc said, play with priorities, 22 | cause trouble with scheduling, CPU use, as 23 | if the node was overloaded 24 | 2012-07-25 11:21 call manual GCs and whatnot 25 | 2012-07-25 11:21 mock gen_tcp and friends to disallow 26 | opening connections 27 | 2012-07-25 11:21 yeah, pretending file descriptors are 28 | all used up 29 | 2012-07-25 11:21 mess with distribution 30 | 2012-07-25 11:22 call net_kernel:disconnect a few times 31 | at random 32 | 2012-07-25 11:22 and for the tool I had in mind, the 33 | 'hellhole' mode would have been to 34 | unleash everything at once. 
Random 35 | deaths, slow IO (be a slow 36 | group_leader), disconnections, high 37 | CPU, etc. 38 | 39 | %% Counters to be reported to the user: 40 | %% #dead already 41 | %% sups 42 | %% children 43 | %% neither sups nor children 44 | %% disconnected sups 45 | %% total killable with no dead sups 46 | %% kill and check status 47 | %% total killable with no dead top level sups 48 | %% kill and check status 49 | %% #Killed 50 | %% total number of processes before killing anything 51 | %% total number of processes after killing things 52 | 53 | 54 | The following can be used to print info about processes and should 55 | somehow be brought into line with what chaos_monkey:kill/0 returns. 56 | And the other way around obviously, probably adding a pp_procdata or 57 | whatnot. 58 | 59 | %% p_pidinfo(Killable, Pid, App, IsSystemProcess, IsSystemApp, IsSupervisor) -> 60 | %% FKillable = case Killable of 61 | %% true -> "About to"; 62 | %% false -> "Cannot" 63 | %% end, 64 | %% FName = case erlang:process_info(Pid, registered_name) of 65 | %% {registered_name, Name} -> 66 | %% io_lib:format(" (~s)", [Name]); 67 | %% "" -> "" 68 | %% end, 69 | %% FApp = case App of 70 | %% undefined -> ""; 71 | %% {ok, A} -> io_lib:format(" in app ~s", [A]) 72 | %% end, 73 | %% Immunities = 74 | %% [case IsSystemProcess of 75 | %% true -> " is a system process"; 76 | %% false -> no 77 | %% end, 78 | %% case IsSystemApp of 79 | %% true -> " belongs to a system app"; 80 | %% false -> no 81 | %% end, 82 | %% case IsSupervisor of 83 | %% true -> " is a supervisor"; 84 | %% false -> no 85 | %% end], 86 | %% FImmunities = 87 | %% case lists:filter(fun(X) -> X =/= no end, Immunities) of 88 | %% [] -> ""; 89 | %% Imms -> 90 | %% [" because it", string:join(Imms, " and")] 91 | %% end, 92 | %% case (App =:= undefined) orelse (Killable =:= true) of 93 | %% true -> 94 | %% p("~s kill ~p~s~s~s.", [FKillable, Pid, FName, FApp, FImmunities]); 95 | %% false -> 96 | %% ok 97 | %% end.
98 | 99 | More dead code that ties in with things to do later. See the comment 100 | about multiple trees in the same app. 101 | 102 | %% with_ancestors(Pids) -> 103 | %% [case erlang:process_info(Pid, dictionary) of 104 | %% {dictionary, PDict} -> 105 | %% case lists:keyfind('$ancestors', 1, PDict) of 106 | %% {'$ancestors', [Ancestor | _Ancestors]} -> 107 | %% {Pid, Ancestor}; 108 | %% _ -> 109 | %% {Pid, unknown} 110 | %% end; 111 | %% _ -> 112 | %% {Pid, unknown} 113 | %% end || Pid <- Pids]. 114 | --------------------------------------------------------------------------------