├── LICENSE
├── Makefile
├── README.org
├── compiled-macros
│   ├── analysis.C
│   └── steering-macro.C
├── compiled-program
│   ├── Makefile
│   ├── bin
│   │   └── main.C
│   ├── include
│   │   └── analysis.h
│   └── src
│       └── analysis.C
├── docs
│   ├── css
│   │   └── style.css
│   ├── index.html
│   └── index.pdf
└── index.org
/LICENSE:
--------------------------------------------------------------------------------
1 | The Hitchhiker's Guide to High Energy Physics is a set of documents
2 | and software herein referred to as "The Guide"
3 |
4 | The Guide is free software; you can redistribute it and/or modify
5 | it under the terms of the GNU General Public License as published by
6 | the Free Software Foundation; either version 3, or (at your option)
7 | any later version.
8 |
9 | The Guide is distributed in the hope that it will be useful,
10 | but WITHOUT ANY WARRANTY; without even the implied warranty of
11 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 | GNU General Public License for more details.
13 |
14 | You should have received a copy of the GNU General Public License
15 | along with this software. If not, see <http://www.gnu.org/licenses/>.
16 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | # Modified from https://stackoverflow.com/questions/22072773/batch-export-of-org-mode-files-from-the-command-line
2 | #edit this if you want the html put somewhere else
3 | OUT_DIR=$(PWD)
4 |
5 | #Shouldn't need to edit anything below this line
6 | HTML_FILES=$(patsubst %.org,$(OUT_DIR)/%.html,$(wildcard *.org))
7 | PDF_FILES=$(patsubst %.org,$(OUT_DIR)/%.pdf,$(wildcard *.org))
8 |
9 | .PHONY: all html pdf install clean
10 |
11 | all: pdf html
12 | pdf: $(PDF_FILES)
13 | html: $(HTML_FILES)
14 |
15 | %.html: %.org
16 | 	emacs $< --batch -f org-html-export-to-html --kill
17 |
18 | %.pdf: %.org
19 | 	emacs $< --batch -f org-latex-export-to-pdf --kill
20 |
21 | install: pdf html
22 | 	cp css/style.css ${HOME}/public/the-guide/css/style.css
23 | 	cp index.org ${HOME}/public/the-guide/the-guide.org
24 | 	cp index.pdf ${HOME}/public/the-guide/the-guide.pdf
25 |
26 |
27 | clean:
28 | 	rm -f *.html *.pdf
29 |
--------------------------------------------------------------------------------
/README.org:
--------------------------------------------------------------------------------
1 | * Guidelines for "The Guide"
2 | You can obtain a copy of the guide by cloning this repository. The
3 | online version of the guide is hosted at [[http://dmb2.github.io/hitchhikers-guide-to-hep/][Hitchhiker's guide to HEP]].
4 |
5 | ** Contributing
6 | The guide is satirical and sarcastic. If you read some portion and
7 | are upset or discouraged, please seek out other sources of
8 | information.
9 |
10 | As with all of my open-source projects, pull requests are always
11 | welcome. If you would like to contribute a section to the guide,
12 | please keep the tone intact.
13 |
14 | Some things to keep in mind:
15 | - The Guide is *pragmatic*, not pedagogical. There is a cacophony of
16 | pedagogy in HEP. It is not the intent of this document to add to
17 | that noise.
18 | - The Guide is also an introduction to hacker culture. The hacker
19 | mindset has helped me a lot with the steep learning curve of HEP.
20 | - Pull requests that only change formatting or alter the tone
21 | will be rejected.
22 | - Pull requests which add significant, pragmatic content or advice
23 | are very welcome.
24 | - Code samples are intentionally sparse. They're meant to provide a
25 | convenient substrate for hacking, they are *not* pedagogical
26 | examples of how to write analysis code.
27 |
28 | ** Contributors
29 | - Jimmy Dorff - advice for configuring ssh properly
30 | - [[https://github.com/dougphy][Doug Davis]] - Section on extending ROOT with classes inheriting from
31 | TObject
32 |
--------------------------------------------------------------------------------
/compiled-macros/analysis.C:
--------------------------------------------------------------------------------
1 | #include <cstdlib>
2 | #include "TFile.h"
3 | #include "TH1F.h"
4 |
5 |
6 | // Generates a random number between min and max
7 | double randomP(double min, double max){
8 |   return (max-min)*((double)rand())/RAND_MAX+min;
9 | }
10 |
11 | int doAnalysis(){
12 |   //Since the ROOT binary already defines a "main" an error will occur
13 |   //if you redefine another function named "main", therefore we use
14 |   //the verb "doAnalysis"
15 |
16 |   //Open files and retrieve pointers here
17 |
18 |   const int N=10000;
19 |
20 |   double jet_E=0,jet_px=0,jet_py=0,jet_pz=0;
21 |
22 |   //initialize random seed, use a fixed number to make the results
23 |   //repeatable
24 |   srand(42);
25 |
26 |   for(size_t i=0; i < N; i++ ){
27 |     jet_E=randomP(0,100);
28 |     jet_px=randomP(0,100);
29 |     jet_py=randomP(0,100);
30 |     jet_pz=randomP(0,100);
31 |   }
32 |   return 42;
33 | }
34 |
--------------------------------------------------------------------------------
/compiled-macros/steering-macro.C:
--------------------------------------------------------------------------------
1 | {
2 | //may need to load other libraries or files that depend on analysis.C
3 | gROOT->ProcessLine(".L analysis.C++");
4 | gROOT->ProcessLine("doAnalysis()");
5 | }
6 |
--------------------------------------------------------------------------------
/compiled-program/Makefile:
--------------------------------------------------------------------------------
1 | CC=$(shell root-config --cxx)
2 | # Build this library the same way root was compiled/linked
3 | INCDIR=$(PWD)/include
4 | LIBDIR:=$(shell root-config --libdir)
5 | ROOTINCDIR:=$(shell root-config --incdir)
6 | LDFLAGS:=$(shell root-config --libs) #-L ./lib #-lgcov
7 | WFLAGS= -Wextra -Wall
8 | DFLAGS=-O2 #-fprofile-arcs -ftest-coverage
9 | CXXFLAGS=$(shell root-config --cflags) -pg -I$(INCDIR) -I$(ROOTINCDIR) \
10 | $(DFLAGS) $(WFLAGS)
11 |
12 | .PHONY: all clean
13 | all: runAnalysis
14 |
15 | runAnalysis: analysis.o runAnalysis.o
16 | 	$(CC) $^ -o runAnalysis $(LDFLAGS)
17 | runAnalysis.o: ./bin/main.C
18 | 	$(CC) $(CXXFLAGS) -c $< -o $@
19 | analysis.o: ./src/analysis.C
20 | 	$(CC) $(CXXFLAGS) -c $< -o $@
21 | clean:
22 | 	-rm -f *.o runAnalysis
23 |
--------------------------------------------------------------------------------
/compiled-program/bin/main.C:
--------------------------------------------------------------------------------
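1 | #include <iostream>
2 | #include <string>
3 | #include "analysis.h"
4 | #include <vector>
5 | #include
6 |
7 | int main(const int argc, const char* argv[]){
8 |   //Parse command line arguments
9 |   std::vector<std::string> arg_list;
10 |   for(int i=0; i < argc; i++){
11 |     arg_list.push_back(std::string(argv[i]));
12 |     std::cout<<"Got argument: "<<argv[i]<<std::endl;
[...]
--------------------------------------------------------------------------------
/compiled-program/src/analysis.C:
--------------------------------------------------------------------------------
1 | #include <cstdlib>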
2 | #include "TFile.h"
3 | #include "TH1F.h"
4 | #include "TLorentzVector.h"
5 | #include <iostream>
6 | using std::cout;
7 | using std::cerr;
8 | using std::endl;
9 | double randomP(double min, double max){
10 |   return (max-min)*((double)rand())/RAND_MAX+min;
11 | }
12 |
13 | int doAnalysis(){
14 |   //This is the analysis driver, we initialize a number of events to
15 |   //process, create variables to hold relevant quantities, and execute
16 |   //the event loop. Typically there will be selection criteria
17 |   //applied at this stage. Histograms and other output are also
18 |   //usually produced here.
19 |
20 |
21 |   const int N=100;
22 |   double jet_E=0.,jet_px=0.,jet_py=0.,jet_pz=0.;
23 |
24 |   //initialize random seed, use a fixed number to make the results
25 |   //repeatable, be especially careful that you choose unique seeds if
26 |   //you are generating random numbers in batch systems!
27 |   srand(42);
28 |
29 |   TLorentzVector jet;
30 |   for(size_t i=0; i < N; i++ ){
31 |     jet_E=randomP(0,100);
32 |     jet_px=randomP(0,100);
33 |     jet_py=randomP(0,100);
34 |     jet_pz=randomP(0,100);
35 |     jet.SetPxPyPzE(jet_px,jet_py,jet_pz,jet_E);
36 |     if (jet.M() < 0){
37 |       continue;
38 |     }
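39 |     cout <<"Jet Mass: "<<jet.M()<<endl;
[...]
--------------------------------------------------------------------------------
/index.org:
--------------------------------------------------------------------------------
1 | #+TITLE: DON'T PANIC
2 | #+AUTHOR: David Bjergaard
3 | #+EMAIL: hhg-svc@ccbar.us
4 | #+OPTIONS: H:5 num:nil toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
5 | #+OPTIONS: TeX:t LaTeX:t skip:nil d:nil todo:t pri:nil tags:not-in-toc
6 | #+STYLE: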
7 | #+LaTeX_CLASS: article
8 |
9 | * A Hitchhiker's guide to High Energy Physics
10 | ** Preface
11 | Wow! I wrote this approximately eight years ago. The following is... a
12 | lot. I wrote it during the evening to unwind, and I was reading The
13 | Hitchhiker's Guide to the Galaxy, so I took every opportunity to make
14 | esoteric references to Douglas Adams. I think each individual section
15 | is still pretty readable, and a lot of the information is evergreen.
16 | The link density and trivia are insane. Distance has made me realize
17 | that it's fairly overwhelming. Now that I've been outside HEP for a few
18 | years (almost five!) I understand just how intense getting into
19 | research is. It doesn't look like it's gotten better: ROOT transitioned
20 | to cmake, most projects are on git, and a decent fraction appear to
21 | have doubled down on docker.
22 |
23 | I've updated some of the prescriptions to future-proof them and fixed
24 | some link-rot. If you're a [[mailto:hhg-svc@ccbar.us][fan say so]]! I occasionally stalk the
25 | internet and find references to it so I know there are some readers.
26 |
27 | If you're getting into this business today, emphasize learning
28 | transferable skills. Learn how to write maintainable C++, learn
29 | docker, learn how to use Jupyter notebooks, only use ROOT if you have
30 | to (the rest of Data Science uses numpy, pandas, and python).
31 |
32 | If I were to write this again today, it would sound very different,
33 | but I'm still very proud of it.
34 |
35 | ** Introduction
36 | Welcome to High Energy Physics. Like the Galaxy, HEP is a wonderful
37 | and mysterious place filled with amazing things. The intent of this
38 | guide is to provide a jump-start to HEP research. Its purpose is to
39 | be entertaining and uninformative. The information in this guide was
40 | hard won; The Editor does not intend it to be different for our dear
41 | reader. This guide is not intended to make the reader's life easier.
42 | It is not intended to teach anything. It is intended to provide the
43 | dear reader with a glossary of concepts for further research.
44 |
45 | If the reader finds this guide entertaining, but uninformative, then
46 | please send a bottle of Old Janx Spirit to The Editor. The Guide is
47 | not intended to be read linearly. Jump to the section in need, and if
48 | that section contains links to other sections, then please see above
49 | regarding edits. Prepare to be frustrated.
50 |
51 | *** Navigating the guide
52 | A note on key notation: Key sequences follow the "emacs" style for
53 | notating keys. Therefore "=C-c=" means hold down the control key,
54 | while pressing the =c= key. "=M-a=" would mean hold down the "Meta" key
55 | and press the =a= key. On most keyboards the Meta key is the key
56 | labeled "Alt". There is also the "Super" key, on most PC keyboards
57 | this key has the "Windows icon" on it. The function keys 1-12 are
58 | notated as =F1= etc. The escape key is usually notated =ESC=, and
59 | the enter key is notated =RET= for Return. A sequence is denoted by
60 | a series of keys and spaces or dashes. A dash means hold the two
61 | keys at the same time, a space means release the previous keys and
62 | continue with the next instruction. Some examples:
63 | - "=C-c C-a=" Control-C release Control-A
64 | - "=C-c a=" Control-c release A
65 | - "=C-M-f=" Press Control, then Meta, then F without releasing
66 |
67 | The online version of the guide includes a Table of Contents; simply
68 | mouse over it and a full list of topics will pop up. You can click
69 | any topic to jump to that section. If you read this in org-mode, the
70 | file will open folded. It will look like:
71 | #+BEGIN_EXAMPLE
72 | #+TITLE: DON'T PANIC
73 | #+AUTHOR: David Bjergaard
74 | #+EMAIL: hhg-svc@ccbar.us
75 | #+OPTIONS: H:5 num:nil toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
76 | #+OPTIONS: TeX:t LaTeX:t skip:nil d:nil todo:t pri:nil tags:not-in-toc
77 | #+STYLE:
78 | #+LaTeX_CLASS: article
79 |
80 | * A Hitchhiker's guide to High Energy Physics...
81 | #+END_EXAMPLE
82 | Place your cursor at the beginning of =* A Hitchhiker's...= and hit
83 | =TAB=; this will expand the topics, allowing you to see an overview
84 | of the document. Move to whichever topic you're interested in and
85 | hit =TAB= again to expand that section.
86 |
87 | If you are reading this in Vim without [[https://github.com/jceb/vim-orgmode][Vim-OrgMode]], then you will
88 | have no folding and the whole document will be expanded. Jump to
89 | various headlines by searching for "=**=" (two levels deep),
90 | "=***=" (three levels deep), "=****=" (four levels deep), etc.
91 |
92 | If you are reading this as a PDF, there is no Table of Contents. Just
93 | scroll to the section you are interested in.
94 |
95 | *NOTE* If you are reading this on GitHub, a link to another section
96 | will probably be broken due to a bug in how GitHub parses org-flavored
97 | markdown.
98 |
99 | *** Disclaimer
100 | All bottles of donated Old Janx Spirit are redirected to our lawyers,
101 | who are out enjoying them now. They insisted that we include this
102 | disclaimer:
103 |
104 | #+BEGIN_QUOTE
105 | The Hitchhiker's Guide to High Energy Physics is a set of documents
106 | and software herein referred to as "The Guide"
107 |
108 | The Guide is free software; you can redistribute it and/or modify
109 | it under the terms of the GNU General Public License as published by
110 | the Free Software Foundation; either version 3, or (at your option)
111 | any later version.
112 |
113 | The Guide is distributed in the hope that it will be useful,
114 | but WITHOUT ANY WARRANTY; without even the implied warranty of
115 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
116 | GNU General Public License for more details.
117 |
118 | You should have received a copy of the GNU General Public License
119 | along with this software. If not, see <http://www.gnu.org/licenses/>.
120 | #+END_QUOTE
121 |
122 | Also, note that the content of this guide is satirical in nature, and
123 | is intended to be sarcastic. If you have a weak heart or are easily
124 | offended, it may be better to seek out other sources of information.
125 |
126 | *** Obtaining a copy and supporting material
127 | All of the sources for the guide are hosted on GitHub.
128 |
129 | Here are some quick-links:
130 | - [[http://dmb2.github.io/hitchhikers-guide-to-hep/][The Guide]] the online version of the guide
131 | - [[./the-guide.pdf][PDF of The Guide]] if you prefer that sort of thing, though it's of
132 | limited use in printed form.
133 | - [[./the-guide.org][Org source]] of the online version
134 | - [[https://github.com/dmb2/hitchhikers-guide-to-hep][GitHub Repo]] where source code is hosted
135 | - [[https://github.com/dmb2/hitchhikers-guide-to-hep/tree/master/compiled-macros][Compiled Macros]] for Level 2 of ROOT enlightenment
136 | - [[https://github.com/dmb2/hitchhikers-guide-to-hep/tree/master/compiled-program][Compiled Programs]] for Level 3 of ROOT enlightenment
137 | - [[https://github.com/dmb2/hitchhikers-guide-to-hep/issues][Bug Reports/Feature Requests]]
138 | - [[https://github.com/dmb2/hitchhikers-guide-to-hep/pulls][Pull Requests]] for submitting patches
139 |
140 | To get a local copy:
141 | #+BEGIN_SRC sh
142 | git clone https://github.com/dmb2/hitchhikers-guide-to-hep.git
143 | #+END_SRC
144 | Then you'll have a copy of the org file, as well as the compiled
145 | macros and compiled programs.
146 |
147 | ** For Windows Hitchhikers
148 | Everyone should read [[For Linux Hitchhikers]] to understand what
149 | functionality they'll need (especially when working with or on remote
150 | machines).
151 |
152 | While it is possible to practice HEP from the comfort of Bill Gates'
153 | brain child, it is not recommended by The Editor. (He doesn't
154 | run Windows anyway, daylight scares him.) If you insist on using
155 | Windows, the following is a list of useful software.
156 | *** Software you will need
157 | - [[For Linux Hitchhikers]]
158 | - [[http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html][PuTTY (ssh client for Windows)]]: Secure SHell is the standard way of
159 | accessing *nix machines remotely. PuTTY is the Windows client for
160 | this.
161 | - [[*ROOT][ROOT]]: The industry standard for High Energy Physics analysis.
162 | Beware: this program uses an [[https://sites.google.com/site/h2g2theguide/Index/i/149246][Infinite Improbability Drive]] to
163 | perform analysis.
164 | - [[http://www.straightrunning.com/XmingNotes/][Xming an X11 server for Windows]]: This allows you to tunnel X11
165 | applications (ROOT's histogram interface) to your Windows desktop,
166 | this way your data (and ROOT) can live on a remote machine, but you
167 | can still interact with them as if they were on your desktop. (You
168 | need a *fast* internet connection to do this). Xming comes in two
169 | flavors: the "web" version, which is locked behind a paywall for
170 | people who donate to the project, and a slightly less up-to-date
171 | public version available for free. Choose the public version. See
172 | [[http://www.straightrunning.com/XmingNotes/#sect-143][Getting Started]] and [[http://www.straightrunning.com/XmingNotes/trouble.php#head-11][Trouble shooting]] for tips on getting set up.
173 | - [[https://docs.microsoft.com/en-us/windows/wsl/install][WSL]]: Windows 10 introduced the Windows Subsystem for Linux,
174 | which provides a Unix-like environment inside Windows.
175 | - [[http://cygwin.com/][Cygwin]]: Adds a substrate of the GNU system to Windows (in addition
176 | to an [[http://x.cygwin.com/][X11 server]]), you can use this to create a more Unix-like
177 | environment to work from.
178 | - [[https://www.virtualbox.org/][VirtualBox]]: Allows you to boot operating systems within operating
179 | systems (useful if you don't want to dual boot Ubuntu). See [[For Linux
180 | Hitchhikers]] after you've set up a working distro.
181 |
182 | [[https://www.wikihow.com/Install-Ubuntu-on-VirtualBox][See here]] for a nice picture-book tutorial on installing Ubuntu
183 | through VirtualBox on Windows.
184 | ** For Linux Hitchhikers
185 | *** Software you will need
186 | - [[https://www.gnu.org/software/screen/][Screen]]: This lets you pick up where you left off if your ssh
187 | connection drops; [[http://www.ibm.com/developerworks/aix/library/au-gnu_screen/][here]] is a good conceptual introduction. If you
188 | use =screen= on =lxplus=, you'll have to re-initialize your
189 | Kerberos tokens after logging in with =kinit -5=, otherwise you
190 | won't have read access to your files.
191 | - [[https://github.com/tmux/tmux/wiki][tmux]]: Offers the same features as screen.
192 | - [[*ROOT][ROOT]]: The industry standard for High Energy Physics analysis.
193 | Beware: this program uses an [[https://sites.google.com/site/h2g2theguide/Index/i/149246][Infinite Improbability Drive]] to
194 | perform analysis.
195 | - [[https://help.ubuntu.com/community/Beginners/BashScripting][BASH]]: The command shell of choice for ATLAS Physicists. You may
196 | think you could use ZSH, but it's better just to stick with
197 | what everyone else uses. CMS Physicists prefer TCSH for some
198 | weird reason.
199 | - [[*Editors][Editor]]: Choose your religion wisely, it will eventually permeate
200 | your being and change the way you approach life in general.
201 | *** The Terminal
202 | You will, regardless of which operating system you use, be typing
203 | commands into a terminal. It's inevitable, powerful, and intimidating
204 | to new users. HEP hitchhikers should feel at home. Proficiency with
205 | the command line is essential to being a functioning HEP researcher.
206 |
207 | The terminal is like the Galaxy Hitchhiker's [[https://hitchhikers.fandom.com/wiki/Towel][utility towel]]. Every
208 | hitchhiker needs a terminal, and each hitchhiker customizes his or
209 | her towel to their needs.
210 |
211 | If you've never touched a terminal before, and don't know what the
212 | command line is, there are two options:
213 | 1. The [[http://lab.demog.berkeley.edu/Docs/12important/12important.pdf][great pedagogical introduction]] (12-page PDF) by Carl Mason
214 | 2. "[[http://www.oliverelliott.org/article/computing/tut_unix/][An Introduction to Unix]]" (a comprehensive, modern take on the unix
215 | ecosystem) by Oliver Elliott
216 |
217 | You should read both. Carl Mason's tutorial should be read
218 | "cover-to-cover", whereas Oliver's is written very much in the spirit
219 | of this guide, so bookmark it and refer to it after you've read Carl
220 | Mason's tutorial.
221 | **** Line Editing
222 | On most Linux distributions the default shell is bash (recent versions
223 | of macOS default to zsh). Bash's line editor respects emacs keybindings
224 | by default: "C-a" is beginning of line, "C-e" the end, etc. You can
225 | change to vi bindings by typing:
226 | #+BEGIN_SRC sh
227 | set -o vi
228 | #+END_SRC
229 | If you forget which mode you're in, check it by typing:
230 | #+BEGIN_SRC sh
231 | set -o
232 | #+END_SRC
233 | If you want these changes to be permanent, add them to your =.bashrc=.
234 | If these commands give you an error, type:
235 | #+BEGIN_SRC sh
236 | echo $SHELL
237 | #+END_SRC
238 | And see what it says (=/bin/bash= if it's bash, may be =/bin/zsh= or
239 | =/bin/tcsh=). If it is not bash, then you need to google information
240 | for the line editor of whichever shell you are using.
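The check above can be wrapped into a small helper. This is a minimal sketch and not part of the original guide: the function name =edit_mode= is hypothetical, and it assumes a shell whose =set -o= prints one option per line with =on= or =off= in the second column (bash and dash both do). The =case= guard is a common way to restrict a =.bashrc= setting to interactive shells.

#+BEGIN_SRC sh
# Report which line-editing mode the current shell has enabled.
# (edit_mode is a hypothetical helper name, not a standard command.)
edit_mode() {
    set -o | awk '
        ($1 == "vi" || $1 == "emacs") && $2 == "on" { print $1; found = 1 }
        END { if (!found) print "none" }'
}

edit_mode

# Lines you might add to ~/.bashrc: switch to vi bindings, but only in
# interactive shells, where line editing actually matters.
case $- in
    *i*) set -o vi ;;   # use "set -o emacs" to switch back
esac
#+END_SRC

In a script (a non-interactive shell) neither mode is normally active, so the helper is mostly useful when pasted into a terminal session.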
241 | **** Managing Jobs
242 | Sometimes it will be convenient to spawn a process and continue
243 | working in the current shell. Usually this is accomplished by
244 | redirecting stdout and stderr to a file and appending =&=:
245 | #+BEGIN_SRC sh
246 | myLongRunningCommand foo bar baz 42 &> theProcess.log &
247 | #+END_SRC
248 | When you launch the command, you'll see something like:
249 | #+BEGIN_EXAMPLE
250 | [1] 19509
251 | #+END_EXAMPLE
252 | The number =19509= is the PID of the process and =[1]= is its job
253 | number. Multiple jobs can be summarized by typing =jobs=:
254 | #+BEGIN_EXAMPLE
255 | [1] Running myLongRunningCommand foo bar baz 42 &> theProcess.log &
256 | [2]- Running myLongRunningCommand foo bar baz 41 &> theProcess.log &
257 | [3]+ Running myLongRunningCommand foo bar baz 40 &> theProcess.log &
258 | #+END_EXAMPLE
259 | Occasionally you'll realize that you don't want the jobs to run
260 | anymore, so to kill them:
261 | #+BEGIN_SRC sh
262 | kill %2
263 | #+END_SRC
264 | where =%2= is the job number you are referencing. You'll see
265 | something like:
266 | #+BEGIN_EXAMPLE
267 | [2]- Terminated myLongRunningCommand foo bar baz 41 &> theProcess.log &
268 | #+END_EXAMPLE
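The launch, kill, and clean-up cycle above can be sketched end to end. This is a minimal sketch under a few assumptions: =sleep 30= stands in for a long-running command, =job.log= is a hypothetical file name, and the PID from =$!= is used instead of a job number because job numbers such as =%2= are mainly useful at an interactive prompt. Note that =&>= is a bash extension; the portable spelling is => file 2>&1=.

#+BEGIN_SRC sh
logdir=$(mktemp -d)

# Launch in the background; stdout and stderr both go to the log file.
sleep 30 > "$logdir/job.log" 2>&1 &
pid=$!   # PID of the most recent background command

# Changed our mind: send the default signal (SIGTERM) to the process.
kill "$pid"

# Reap the job and record its exit status. For a process killed by a
# signal the shell reports 128 plus the signal number.
status=0
wait "$pid" || status=$?
echo "job $pid exited with status $status"

rm -rf "$logdir"
#+END_SRC

Giving each job its own log file, rather than sharing one, keeps the output of concurrent jobs from overwriting each other.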
269 |
270 | *** Configuring SSH
271 | Many of these tips are [[http://blogs.perl.org/users/smylers/2011/08/ssh-productivity-tips.html][lifted from here]].
272 | Put this in your =~/.ssh/config= file:
273 | #+BEGIN_EXAMPLE
274 | ControlMaster auto
275 | ControlPath /tmp/ssh_mux_%h_%p_%r
276 | ControlPersist yes
277 | ServerAliveInterval 30
278 | ServerAliveCountMax 1
279 | #+END_EXAMPLE
280 | It is possible to set up ssh shorthand to route you to remote
281 | machines. The syntax (in =~/.ssh/config=) is:
282 | #+BEGIN_EXAMPLE
283 | Host shortname
284 | #expands to shortname.remote.location.edu
285 | HostName %h.remote.location.edu
286 | User username
287 | ForwardX11 yes #equivalent to ssh -X; also set ForwardX11Trusted yes for ssh -Y
288 | IdentityFile ~/.ssh/id_rsa #path to your private key
289 | #+END_EXAMPLE
290 | **** SSH Keys
291 | [[https://help.github.com/articles/generating-ssh-keys][Follow this guide]], stop at step 3.
292 | Now, when you need to start using a new machine:
293 | #+BEGIN_EXAMPLE
294 | ssh-copy-id user@remote.machine.name
295 | #+END_EXAMPLE
296 | Then enter your password. Now, when you type =ssh
297 | user@remote.machine.name= you will authenticate yourself with your
298 | newly minted RSA key, and you won't have to enter your password. The
299 | downside is that you'll have to enter your key's passphrase to unlock
300 | it. See below for a way to unlock it once per session.
301 |
302 | *NOTE* While it is cryptographically more secure to authenticate
303 | yourself with ssh keys, if your machine is compromised (i.e. stolen or
304 | hacked) your ssh keys can provide the attacker with easier access to
305 | all the machines you had access to. This means you should:
306 | 1. Use a strong pass *phrase*, not password. You need to maximize the
307 | number of bits of entropy in your key in order to make it
308 | difficult to crack should the keys fall into enemy hands.
309 | 2. Inform the admins of any machines you had access to if your
310 | machine is compromised.
311 | 3. Encrypt your ssh keys (and other sensitive information) in a
312 | private directory that only you can access
313 | 4. *NEVER EVER* store your ssh keys on a third party site (like
314 | Dropbox or similar services).
315 | **** SSH Agent
316 | If you have ssh-agent running (through the =gnome-keyring= service on
317 | Ubuntu, or directly in your .xinitrc through =ssh-agent blah=) you
318 | can type =ssh-add= when you log in and it will add your ssh key to
319 | the keyring, then you can ssh to any machine that you have copied
320 | your key to without entering the password!
321 |
322 | *NOTE* Once you've added your key to the ssh-agent, anyone can sit
323 | down at your keyboard and log into a remote machine as you! This
324 | means if you step away from your computer (even for a moment) you
325 | should lock the screen or log out.
326 |
327 | *** Version Control Systems
328 | The major version control system in HEP is Git. Git is a series of
329 | tools and utilities to allow collaboration on large pieces of
330 | software.
331 |
332 | Git also provides programmers with a convenient "paper trail" through
333 | the course of developing a piece of software. It allows them to
334 | revert the source code they are working on to any state that they've
335 | previously checked in.
336 |
337 | Git was written by Linus Torvalds, the hacker behind Linux, to
338 | manage the Linux kernel, a massive piece of software. Git's model
339 | for managing source code differs slightly from that of older,
340 | centralized systems. In Git, you maintain the entire repository in
341 | your local copy. This makes committing, managing, and branching very
342 | fast. It also means you can work with all of the advantages of a
343 | version control system without internet access. Simultaneously there
344 | is a copy of the repository on a remote server. Git handles syncing
345 | these two repositories when instructed. This can lead to confusion
346 | if you've used other versioning systems, but shouldn't be a problem
347 | if you have no expectations.
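The split between the local repository and the remote copy can be sketched with nothing but temporary directories. This is a minimal sketch under stated assumptions: a bare repository on disk stands in for the remote server, and the file name, commit message, and author identity are placeholders invented for the illustration.

#+BEGIN_SRC sh
work=$(mktemp -d)

# A bare repository stands in for the copy on a remote server.
git init -q --bare "$work/remote.git"

# Cloning gives you a complete local repository, not just a checkout.
# (git warns on stderr that the repository is empty; that is expected.)
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"

echo "42" > answer.txt
git add answer.txt
# Committing only touches the local repository; no network is needed.
git -c user.name="Arthur Dent" -c user.email="arthur@example.com" \
    commit -q -m "Add the answer"
local_commits=$(git rev-list --count HEAD)

# The two repositories are synced only when you ask for it.
git push -q origin HEAD
remote_commits=$(git --git-dir="$work/remote.git" rev-list --count --all)

echo "local: $local_commits commit(s), remote: $remote_commits commit(s)"
#+END_SRC

Until the =git push=, the commit exists only in the local repository, which is why committing works on a plane or in a detector cavern.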
348 |
349 | Some good Git tutorials:
350 | - type "man gittutorial" in the command line
351 | - [[http://git-scm.com/book][Pro Git]] (an online book, modular and comprehensive in scope)
352 | - [[http://gitimmersion.com/][Git Immersion]]
353 | - [[https://learngitbranching.js.org/?locale=en_US][Visual tutorial on branching]]
354 | - [[http://gitolite.com/gcs.html#%25281%2529][Git Concepts Simplified]] (slide show, click to advance)
355 | Intermediate or advanced topics:
356 | - [[http://sethrobertson.github.io/GitFixUm/fixup.html][Undoing, fixing, or removing commits in Git]]
357 | - [[https://blogs.atlassian.com/2014/01/simple-git-workflow-simple/][Simple Git workflow is simple]]
358 | - [[https://ochronus.com/git-tips-from-the-trenches/][Git tips from the trenches]]
359 | *** *rc Files
360 | =*rc= files are special files that are read every time a program
361 | starts. They almost exclusively live in the user's home directory,
362 | and shadow any system-wide defaults. Sometimes they have a special
363 | syntax for setting options, sometimes they are written in a scripting
364 | language. The most relevant rc files for a new hitchhiker are:
365 | - .rootrc :: the file that sets root options
366 | - .bashrc :: this is executed every time you open a terminal in bash
367 | - .tcshrc :: as above but for tcsh (hopefully you aren't using this!)
368 | - .zshrc :: as above for the zsh shell
369 | - .vimrc :: configuration options for the venerable vim editor
370 | - .screenrc :: options for gnu screen
371 | - .emacs :: written in emacs lisp, is executed on startup, breaks the
372 | rc naming scheme. Advanced emacs users have
373 | multi-thousand line rc files
374 |
375 | There are other files, if you want to know about them you can do:
376 | #+BEGIN_SRC sh
377 | ls .*rc
378 | #+END_SRC
379 | And google the ones that look interesting. Alternatively you can look
380 | at the system defaults:
381 | #+BEGIN_SRC sh
382 | ls /etc/*rc
383 | #+END_SRC
384 | Sometimes it's useful to copy the system file to your home directory
385 | and then edit it there in order to add your customizations. Some
386 | programs document their options that way.
387 |
388 | In recent years programs have begun migrating to putting their
389 | configuration files and options in the =.config= subdirectory of a user's
390 | =$HOME= directory.
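To see both generations of configuration at once, something like the following works. This is a minimal sketch: it assumes the common convention that =XDG_CONFIG_HOME= overrides the default =~/.config= location, and the variable name =config_home= is just a label chosen for the illustration.

#+BEGIN_SRC sh
# XDG_CONFIG_HOME is the conventional override; fall back to ~/.config.
config_home="${XDG_CONFIG_HOME:-$HOME/.config}"

echo "Classic rc files in $HOME:"
ls -d "$HOME"/.*rc 2>/dev/null || echo "  (none found)"

echo "Per-program configuration in $config_home:"
ls "$config_home" 2>/dev/null || echo "  (none found)"
#+END_SRC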
391 |
392 | ** For Mac OS X Hitchhikers
393 | Everyone should read [[For Linux Hitchhikers]]
394 | to understand what
395 | functionality they'll need (especially when working with or on remote
396 | machines). As a Mac user, you should also read "[[http://www.insectnation.org/blog/it-just-works-or-does-it-the-dark-side-of-macs-in-hep.html][It just works... or
397 | does it? The dark side of Macs in HEP]]" by Andy Buckley. It explains
398 | in detail issues with software development on a Mac. It is an opinion
399 | piece, so don't expect it to be balanced. Also, consider asking your
400 | supervisor for an account on a Linux box and *never look back.*
401 | *** Software you will need
402 | - [[http://xquartz.macosforge.org/landing/][XQuartz]]: Like Xming for Windows, XQuartz runs a local X11 server
403 | for tunneling X11 applications over SSH. Unlike Windows, you don't
404 | need a separate SSH program; ssh is built in.
405 | - [[https://support.apple.com/guide/terminal/open-or-quit-terminal-apd5265185d-f365-44cb-8b09-71a064a42125/mac][Terminal.app]]: This is Mac OS's default terminal emulator. It comes
406 | with Mac OS, so you shouldn't need to install it. You should be
407 | aware of it though.
408 | - [[*ROOT][ROOT]]: The industry standard for High Energy Physics analysis.
409 | Beware: this program uses an [[https://sites.google.com/site/h2g2theguide/Index/i/149246][Infinite Improbability Drive]] to
410 | perform analysis.
411 | - [[http://aquamacs.org/][Aquamacs]]: A port of Emacs that uses Aqua and behaves like a standard
412 | OS X application, integrating Emacs with the Mac OS UI. In the
413 | long history of corporate acquisitions a lot of Emacs hackers (from
414 | NeXTSTEP) ended up at Apple, so you will find that Mac OS integrates
415 | the Emacs experience much more fundamentally than any other OS in
416 | existence. (This doesn't mean you need to use Emacs if you use Mac
417 | OS, just that your muscle memory will thank you subconsciously.)
418 | - [[https://www.macports.org/install.php][MacPorts]]: A system for compiling and installing open source
419 | software on the Mac
420 | - [[http://brew.sh][Homebrew]]: A package manager for Mac OS, allowing you to install
421 | various utilities that don't necessarily come pre-installed with
422 | Mac OS.
423 | ** Editors
424 | Like the major world religions, there are also major editors. In
425 | the *nix ecosystem there are two main editors: Emacs and Vim. There are
426 | others, but they are many, and beyond the scope of this guide.
427 |
428 | The most important thing to do after [[https://stackoverflow.com/questions/1430164/differences-between-Emacs-and-vim][choosing an editor]] is to work
429 | through its corresponding tutorial. An oft heard recommendation is
430 | that "Emacs is easier to learn than vi(m)". A more accurate statement
431 | may be that it is easier to make things happen in Emacs than Vim, but
432 | the two editors are in some sense the yin and yang of text. True
433 | enlightenment in either of these editors takes roughly the same amount
434 | of time after completing the corresponding tutorial.
435 |
436 | *** Finding an editor Guru
437 | After you have finished the tutorial for your editor of choice, then
438 | it's time to find a guru. Gurus are best located by asking around.
439 | If you are talking with someone and notice they use your editor,
440 | don't be afraid to ask them how they did something. Most of the time
441 | the Guru will be flattered and may even volunteer to help you with
442 | any other editor related questions.
443 | **** Editor Guru etiquette
444 | While it is generally OK to ask your Guru any editor related
445 | question, it is best to keep questions restricted to the editor in
446 | question. Flame wars have been fought for decades over which is the
447 | "one true editor."
448 |
449 | In order to prevent a /faux pas/, it is best to make sure you know which
450 | editor your guru uses. This is especially true in the case of a
451 | vi(m) or Emacs guru.
452 |
453 | Another thing to be careful of is repeatedly asking basic questions.
454 | Again, some gurus will tolerate this at the beginning, but after a
455 | point the guru expects you to master the basics (on your own). The
456 | most valuable knowledge your guru can impart is not written in the
457 | tutorial that came with the editor.
458 | **** Keeping your Guru happy
459 | Gurus subsist mainly on a liquid diet of caffeinated beverages
460 | during the day and beer (occasionally wine) at night. It is
461 | important that your Guru remain well lubricated. It is generally
462 | considered a good gesture to offer your Guru his/her beverage of
463 | choice if you've found him/her to be especially helpful on your path
464 | to enlightenment.
465 | *** Emacs
466 | The end goal of any student of the [[http://www.jwz.org/hacks/why-cooperation-with-rms-is-impossible.mp3][Church of Emacs]] is to obtain
467 | proficiency in reprogramming the editor to solve the task at hand. This
468 | ultimately stems from the philosophy of lisp (this gift was given
469 | to us by [[http://www.stallman.org/saint.html][St. IGNUcius]], an AI hacker from MIT, where Emacs was born).
470 | In lisp, the flexibility of the language allows it to be re-written to
471 | solve the problem as clearly as possible. In Emacs, an enlightened
472 | user will write a substrate of elisp (Emacs' dialect of lisp) in order
473 | to solve the editing problem at hand.
474 |
475 | While customizing and writing your .emacs (the initialization file
476 | loaded by Emacs in your home directory) is a spiritual journey, there
477 | are those who have done their best to illuminate the path. [[http://www.dialectical-computing.de/blog/blog/2014/03/02/a-simple-emacs-configuration/][A brief
478 | guide to customization philosophies here]].
479 |
480 | The Editor finds the following packages essential:
481 | - [[info:tramp#Top][tramp]]: If you're reading this in Emacs, you can follow the link with
482 | "C-c C-o". It is *the* most important aspect of Emacs for HEP
483 | users. It allows you to "visit" files on remote machines from the
484 | Emacs running on your desktop. It does this through ssh. To visit
485 | a remote file, type "C-x C-f" and then type
486 | '/ssh:user@remote.host:~/remote/path'. Note that tab completion
487 | works remotely just the same as visiting a file locally! Tramp is
488 | also aware of ssh aliases in =~/.ssh/config=, see [[Configuring SSH]].
489 | - [[info:calc#Top][Calc]] - ""Calc" is an advanced desk calculator and mathematical tool
490 | written by Dave Gillespie that runs as part of the GNU Emacs
491 | environment." It handles barns and electron volts out of the box!
492 | - [[http://www.emacswiki.org/emacs/FillAdapt][filladapt]]: a mode for more intelligently filling text in paragraphs
493 | - [[http://www.emacswiki.org/emacs/FlySpell][flyspell]]: a spell checker that highlights misspelled words (will check
494 | in comments if in a programming mode)
495 | - [[http://www.emacswiki.org/cgi-bin/wiki/RectangleMark][rect-mark]]: Adds facilities for marking, yanking, and otherwise
496 | editing columnar formatted text.
497 | - [[info:emacs#Dired][dired]] (another info link): a directory editor for manipulating files
498 | in the Emacs way
499 | - [[http://ethanschoonover.com/solarized][solarized-theme]]: A theme by Ethan Schoonover, comes in dark and
500 | light variants that actually complement each other well, another
501 | good one is zenburn or gruvbox
502 | - [[http://www.emacswiki.org/emacs/IbufferMode][ibuffer]]: changes the buffer interface and allows you to group
503 | buffers based on various buffer attributes
504 | - [[http://www.emacswiki.org/emacs/ParEdit][paredit]]: Enhances Emacs's awareness of parenthetic structure
505 | - [[https://github.com/Fuco1/smartparens][smartparens]]: Electrically pairs and deletes delimiters when
506 | appropriate (never miss a closing brace again!)
507 | - [[http://www.emacswiki.org/emacs/AutoComplete][auto-complete]]: When set up properly, tab completes anything at any
508 | point depending on past input or names in other buffers.
509 | - [[http://www.emacswiki.org/emacs/AUCTeX][auctex]]: LaTeX editing facilities (for when org-mode doesn't quite cut
510 | it)
511 | - [[http://orgmode.org/][org-mode]]: This guide is written in org-mode. Org-mode can manage
512 | [[http://orgmode.org/worg/org-tutorials/orgtutorial_dto.html][todo lists]], [[http://orgmode.org/worg/org-web.html][write websites]], serve as a [[https://github.com/dmb2/research-log][lab notebook]], execute code
513 | for [[http://orgmode.org/worg/org-contrib/babel/][literate programming]] and many other things. More relevant for
514 | physicists is the [[http://ehneilsen.net/notebook/orgExamples/org-examples.html][org-mode cookbook]]. People switch to Emacs just to
515 | get org-mode!
516 |
517 | Init files of famous Emacs hackers are (in no order of awesomeness)
518 | [[https://github.com/magnars/.emacs.d][Magnar Sveen]], [[https://github.com/technomancy/emacs-starter-kit][Technomancy]], [[https://github.com/jwiegley/dot-emacs][John Wiegley]]. There are also software
519 | packages that aim to comprehensively improve the out-of-the-box
520 | Emacs experience. The two most famous are [[https://github.com/bbatsov/prelude][Prelude]] and
521 | [[https://github.com/overtone/emacs-live][Emacs Live]]. An example (slightly annotated) init file can be found [[https://github.com/dmb2/dotfiles/blob/master/emacs-lisp/init.org][here]].
522 |
523 | Finally, there are some Emacs gurus who post blogs on the internet.
524 | Some particularly useful ones are [[http://emacsredux.com/][Emacs Redux]],
525 | [[http://www.masteringemacs.org/][Mastering Emacs]], and [[http://emacs-fu.blogspot.com/][Emacs Fu]].
526 |
527 | Various religious texts granting Emacs users various powers (such as
528 | reading [[http://www.emacswiki.org/emacs/CategoryMail][email]], [[http://www.emacswiki.org/emacs/CategoryChatClient][chatting]], [[http://www.emacswiki.org/emacs/Twitter][tweeting]], [[http://www.emacswiki.org/emacs/CategoryGames][playing games]], [[http://www.emacswiki.org/emacs/MusicPlayers][listening to music]])
529 | can be found at the [[http://www.emacswiki.org/emacs/][Emacs Wiki]].
530 |
531 | *** Vim
532 | If Emacs is like Catholicism, then Vim is like Buddhism. Vim is the
533 | modern incarnation of vi, a modal text editor that descended from ed.
534 | The modal way of editing is by expressing in a few keystrokes how the
535 | text should be manipulated. This is in contrast to Emacs, where text
536 | is manipulated directly. This fundamental difference is the source of
537 | much confusion for new users, and is also why many people recommend
538 | Emacs as "being easier to learn." This should not deter new users from
539 | learning vi(m), as its editing facilities are substantial.
540 |
541 | A functional =.vimrc= looks like:
542 | #+BEGIN_EXAMPLE
543 | syntax on
544 | set cursorline
545 | set hlsearch
546 | set ic
547 | set incsearch
548 | set ruler
549 | set shiftwidth=4
550 | set tabstop=4
551 | set wrap
552 | #+END_EXAMPLE
553 |
554 | To learn Vim, type =vimtutor= at the command line and follow the
555 | instructions. Take your time, and repeat the tutorial once or twice
556 | over a few days. In the meantime, editors such as =gedit= or =nano=
557 | offer a more traditional experience. As your Vim skills improve, you
558 | will feel more comfortable with Vim and can stop using the less
559 | powerful editors.
560 |
561 | Some useful links include:
562 | - [[http://derekwyatt.org/vim/tutorials/][Vim Videos]] Tutorial videos by Derek Wyatt, the novice videos are
563 | must see if you are new to vi(m)
564 | - [[http://www.vimgenius.com/][Vim Genius]] a drill website for learning Vim commands
565 | - [[https://www.liquidweb.com/kb/overview-of-vim-text-editor/][New user Vim Tutorial]]
566 | - [[http://blog.sanctum.geek.nz/vim-koans/][Vim Koans]] tidbits of wisdom to ponder
567 | - [[http://www.vim.org/scripts/][A collection of extensions and plugins for Vim]]
568 | - [[http://val.markovic.io/blog/youcompleteme-a-fast-as-you-type-fuzzy-search-code-completion-engine-for-vim][YouCompleteMe]] A Vim autocompletion engine for editing.
569 | *** Others
570 | Followers of the Unix way realize that there are situations where
571 | using a set of shell commands piped together may fit the task at hand
572 | more efficiently than either of the other two editors. Tools you
573 | should be familiar with are:
574 | - [[http://www.grymoire.com/Unix/Sed.html][sed]] and [[http://sed.sourceforge.net/sed1line.txt][one-liners]]
575 | - [[http://www.grymoire.com/Unix/Awk.html][awk]] and [[http://www.pement.org/awk/awk1line.txt][one-liners]]
576 | - [[http://perl-tutorial.org/][perl]] (and its [[https://en.wikipedia.org/wiki/Black_Perl][poetry]])
577 | - [[http://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples/][grep]]
578 | - Heretics exist who exhort the use of [[http://linux.die.net/man/1/pico][pico]] or even [[http://linux.die.net/man/1/nano][nano]].
579 | [[http://regex.info/blog/2006-09-15/247][Always keep in mind]]:
580 | #+BEGIN_QUOTE
581 | Some people, when confronted with a problem, think
582 | "I know, I'll use regular expressions." Now they have two
583 | problems. -- Jamie Zawinski
584 | #+END_QUOTE
585 |
586 | ** Software Design
587 | Well designed software is a true marvel, in the same way architecture
588 | is a marvel. You are a stone mason, and you are building a cathedral.
589 | Repeat that last sentence every time you want to take a shortcut when
590 | coding. A cathedral can't stand on a flimsy foundation.
591 |
592 | In order to help you on your way, you should read the following:
593 | - [[http://aosabook.org/en/index.html][Architecture of Open Source Applications]]
594 | - [[https://github.com/Droogans/unmaintainable-code][How to Write Unmaintainable Code]] (*warning*, many physicists take this
595 | guide literally)
596 |
597 | Good software design is *very* hard, but when you have the pleasure of
598 | using well designed software, it is a true joy. Some examples of good
599 | HEP software:
600 | - [[http://rivet.hepforge.org/][Rivet]]: Robust Independent Validation of Experiment and Theory
601 | - [[http://fastjet.fr][FastJet]]: Software package for jet finding
602 |
603 | ** A brief introduction to C++
604 | *Caveat Emptor*: This information was written for C++03. It was old when
605 | I wrote it (2014), but C++11 was too shiny and new. Now in 2022, C++11 is old
606 | and crufty. Modern C++ memory management looks very different than it
607 | did a decade ago. I've been out of the game too long to know what HEP
608 | is doing... They're probably using C++11.
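
For a taste of the difference, here is a minimal sketch (not from the
original guide; =Track= is a made-up type, not a ROOT class) contrasting
manual =new= and =delete= with C++11's =std::unique_ptr=:
#+BEGIN_SRC cpp
#include <memory>
#include <string>

// Hypothetical example type; not part of ROOT or the guide.
struct Track {
  std::string name;
  double pt;
  Track(const std::string& n, double p) : name(n), pt(p) {}
};

// C++03 style: the caller owns the raw pointer and must delete it.
double ptOldStyle() {
  Track* t = new Track("muon", 42.0);
  double pt = t->pt;
  delete t;  // forget this line and the memory leaks
  return pt;
}

// C++11 style: std::unique_ptr deletes the Track when it goes out of scope.
double ptNewStyle() {
  std::unique_ptr<Track> t(new Track("muon", 42.0));
  return t->pt;  // no delete needed
}
#+END_SRC
Both functions hand back the same number; only the first one can leak
if someone forgets the =delete=.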
609 |
610 | C++ is the industry standard programming language for analysis in
611 | HEP. Even if you are fortunate enough to do most of your work in
612 | Python, you will eventually be calling C++ code, and should
613 | understand some core concepts in order to debug problems should they
614 | arise.
615 |
616 | Things to keep in mind:
617 | - This portion of the guide covers C++ at a high level. Very little
618 | [[http://www.cplusplus.com/tutorial][specific syntax]] will be covered. When you have a C++ question,
619 | google is your friend.
620 | - When writing in any language, prefer that language's idioms. Don't
621 | write Python in C++, C in C++ or C++ in Python.
622 | - C++ is a vast language; however, being familiar with its roots, C,
623 | is invaluable.
624 | - If faced with a decision between learning C++ vs Python, prefer
625 | C++. C++'s syntax is more rigid and requires more overhead. Once
626 | you know C++, Python is much easier to pick up.
627 | - There's always an exception to the rule, just make sure it's the
628 | right exception!
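
To make the point about idioms concrete, here is a small sketch of the
same sum written as "C in C++" and as idiomatic C++. The function names
are invented for illustration and the C-style version skips the
=malloc= failure check for brevity:
#+BEGIN_SRC cpp
#include <cstdlib>
#include <numeric>
#include <vector>

// "C in C++": manual allocation, index bookkeeping, manual cleanup.
double sumCStyle(int n) {
  double* e = static_cast<double*>(std::malloc(n * sizeof(double)));
  for (int i = 0; i < n; ++i) e[i] = i + 1;
  double total = 0;
  for (int i = 0; i < n; ++i) total += e[i];
  std::free(e);
  return total;
}

// Idiomatic C++: the container owns its memory, the algorithm does the loop.
double sumIdiomatic(int n) {
  std::vector<double> e(n);
  std::iota(e.begin(), e.end(), 1.0);  // fills 1, 2, ..., n
  return std::accumulate(e.begin(), e.end(), 0.0);
}
#+END_SRC
Both compute 1 + 2 + ... + n; the second has no way to leak memory or
run off the end of the array.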
629 |
630 | C++ is an imperative, object oriented language. It started out as
631 | "C with classes" but has since bolted on significant language
632 | features different from C. Proficiency with C++ should be aimed
633 | towards comfortable use of the template meta-programming features of
634 | the language, although it is entirely possible to spend an entire
635 | career writing C++ without exercising this feature (just read the ROOT
636 | source code).
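
For readers who have never met template meta-programming, the classic
first example is a factorial computed by the compiler. This is only a
sketch of the idea (the =Factorial= name is ours, and =static_assert=
needs C++11):
#+BEGIN_SRC cpp
// Compile-time factorial: the compiler evaluates the recursion.
template <unsigned N>
struct Factorial {
  static const unsigned long value = N * Factorial<N - 1>::value;
};

// Specialization that terminates the recursion.
template <>
struct Factorial<0> {
  static const unsigned long value = 1;
};

// Checked by the compiler, not at run time.
static_assert(Factorial<5>::value == 120, "5! should be 120");
#+END_SRC
Nothing here runs when the program starts; the value is already baked
into the binary.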
637 | *** Pointers
638 | [[http://www.chiark.greenend.org.uk/~sgtatham/cdescent/][Required Reading: The Descent to C]]
639 |
640 | As C++ has evolved from C, it retains parts of C's low level nature.
641 | Part of this is the need to be explicit about managing memory
642 | manually. This is in stark contrast to languages such as Java or
643 | Python where memory management is handled for the programmer.
644 |
645 | A consequence of this is the ability to address specific cells of
646 | memory (the smallest accessible unit, typically a byte). An object
647 | (=int=, =double=, =float=, =char=, =string=, etc) may span several
648 | memory cells. A pointer is the computer's representation of a memory
649 | cell's location in memory, i.e. a memory address. Ultimately the
650 | programmer is interested in the data contained in the set of memory
651 | cells "pointed to" by the pointer. The act of retrieving this data is
652 | called "/dereferencing/ a pointer".
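
In plain C++ the vocabulary above looks something like this (a minimal
sketch; the function and variable names are illustrative only):
#+BEGIN_SRC cpp
#include <cassert>

// Returns the value of the object after writing to it through a pointer.
int pointerDemo() {
  int answer = 41;   // an object living in some memory cells
  int* p = &answer;  // p holds the address of answer
  assert(*p == 41);  // dereferencing p reads the data at that address
  *p = 42;           // writing through the pointer changes answer itself
  return answer;
}
#+END_SRC
Calling =pointerDemo()= returns 42, even though =answer= was never
assigned to by name after its initialization.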
653 |
654 | As in physics, facility with manipulating pointers is best gained
655 | through experience; however, many analogies have been developed to ease
656 | confusion. One analogy is street addresses. A street address is a
657 | sequence of numbers (the pointer) which instructs someone, a mailman
658 | say, (the computer), how to find a specific location. Once at that
659 | location, it is possible to manipulate objects located at that address
660 | (deliver mail if you're the mailman, break the mailbox if you're a bored
661 | teenager, knock on the door if you are a vacuum salesman, etc).
662 |
663 | Now some syntax:
664 | #+BEGIN_SRC cpp
665 | Foo* bar = new Foo("Baz",42,"What is the question?");
666 | std::cout << "object bar lives at memory address:" << bar << std::endl;
std::cout << bar->TheAnswer() << std::endl;
#+END_SRC
[...]
730 | [...] (the =->= operator):
731 | #+BEGIN_SRC cpp
732 | Foo* bar=new Foo();
733 | if(bar->Value()==(*bar).Value()){
734 | std::cout<<"They're the same!"<<std::endl;
}
#+END_SRC
[...]
#+BEGIN_SRC cpp
1215 | [...]->Get("Hist1");
1216 | cout << hist->GetNbinsX() << endl;
#+END_SRC
[...]
1230 | - =foo.blah= and =foo->blah= are interchangeable
1231 | - a semicolon ';' at the end of a line is optional
1232 | - No need to "=#include=" headers
1233 | As you progress in writing more sophisticated C++, you will run into
1234 | CINT's shortcomings as a C++ interpreter. It is recommended that you
1235 | move to Level 2 or 3 before this happens.
1236 |
1237 | While it is possible to write complicated CINT macros (files with
1238 | multiple function definitions) it is not recommended. CINT has a
1239 | habit of keeping up the appearance of doing one thing when in reality
1240 | something entirely different is happening "behind the scenes".
1241 |
1242 | CINT is best used for quick scripts to plot histograms already saved
1243 | to disk, or to inspect a few branches from a =TTree=. More
1244 | sophisticated analyses are better served by Levels 2 and 3.
1245 |
1246 | **** Level 2: Compiled Macros
1247 | Compiled macros are full-blown C++ programs. Generally there is a
1248 | "steering macro" that handles compiling and loading the required
1249 | libraries. An example steering macro:
1250 | #+BEGIN_SRC cpp
1251 | {
1252 | //may need to load other libraries or files that depend on analysis.C
1253 | gROOT->ProcessLine(".L analysis.C++");
1254 | gROOT->ProcessLine("doAnalysis()");
1255 | }
1256 | #+END_SRC
1257 | The compiled macro itself looks more like a traditional C++ program:
1258 | #+BEGIN_SRC cpp
1259 | #include <iostream>
1260 | #include "TFile.h"
1261 | #include "TH1F.h"
1262 |
1263 | int doAnalysis(){
1264 | // your analysis goes here
1265 | return 42;
1266 | }
1267 | #+END_SRC
1268 | Since the ROOT binary already defines a "main", an error will occur
1269 | if you define another function named "main". Therefore we use
1270 | the verb "doAnalysis" instead.
1271 |
1272 | The steering macro that compiles each source file can become
1273 | arbitrarily complex. To some this may read "flexible", to others it
1274 | may read "disorganized". If your analysis grows into a multi-file
1275 | program, it's probably time to ascend to Level 3.
1276 |
1277 | **** Level 3: Compiled Programs
1278 | A compiled program is just that. Here ROOT takes the role of a rich
1279 | set of libraries for composing a C++ based analysis.
1280 |
1281 | An example program (and supporting Makefile) [[https://github.com/dmb2/hitchhikers-guide-to-hep/tree/master/compiled-program][is included here]].
1282 |
1283 | Makefiles come with their own overhead, but the =make= system is very
1284 | powerful. The [[https://www.gnu.org/software/make/manual/make.html][make manual]] is very readable with many examples.
1285 |
1286 | **** A note on Enlightenment
1287 | Master Foo, of [[http://www.catb.org/~esr//writings/unix-koans/][Rootless Root]], gives the [[http://www.catb.org/~esr//writings/unix-koans/shell-tools.html][sage advice]]:
1288 | #+BEGIN_QUOTE
1289 | "When you are hungry, eat; when you are thirsty, drink; when you are tired, sleep."
1290 | #+END_QUOTE
1291 |
1292 | To spell it out (and to prevent the reader from enlightenment), it is
1293 | wise to choose the use of ROOT which is most appropriate for the task
1294 | at hand. The practicing HEP physicist is proficient with all three
1295 | levels, and can pick and choose which approach best suits each
1296 | problem.
1297 |
1298 | *** PyROOT
1299 | PyROOT is a set of Python bindings to ROOT. It works fairly well out
1300 | of the box, but there are some things to keep in mind.
1301 | - Idiomatic Python avoids "=from ROOT import *="; prefer "=from ROOT
1302 | import blah="
1303 | - the ROOT devs know you aren't going to be idiomatic, so instead
1304 | they've implemented a lazy loading system (ROOT is huge, so "=from
1305 | ROOT import *=" would take forever). This may be confusing if you're
1306 | a Python expert and expect exploration commands like =dir()= to
1307 | work with ROOT.
1308 | - If performance matters, try to stay in C++ land (ie call C++
1309 | functions from python) as much as possible
1310 | - If performance really matters, write it in python and then port
1311 | it to C++. This is fairly advanced, but not impossible. You'll
1312 | have to generate CINT dictionaries for your source files.
1313 | If you're looking for a more "pythonic" (not my word) experience,
1314 | maybe give [[http://www.rootpy.org][rootpy]] a shot. See also [[An even briefer introduction to
1315 | Python]] for resources to learn python itself.
1316 |
1317 | *** Fitting Data with RooFit
1318 | RooFit is a shiny penny compared to the rest of the ROOT ecosystem.
1319 | It has its own quirks and idioms, but the interfaces are fairly
1320 | reasonable and the manual is well written. The latest version of the
1321 | manual and quickstart can always be found here:
1322 | - [[https://root.cern/topical/][ROOT User's guide, RooFit Manual]]
1323 | Direct links are here, though they aren't guaranteed to be current:
1324 | - [[http://root.cern.ch/download/doc/RooFit_Users_Manual_2.91-33.pdf][RooFit Manual (PDF) 2.91-33]]
1325 | - [[https://root.cern/download/doc/roofit_quickstart_3.00.pdf][RooFit Quick Start Guide (PDF) 3.00]]
1326 | *** Styling Plots
1327 | A well designed graph is truly a work of art. The path from paltry
1328 | graphics spit out by ROOT to something worthy of framing (and yes,
1329 | [[https://www.edwardtufte.com/tufte/posters][truly amazing data visualizations are routinely framed]]) is long and
1330 | fraught with naysayers who will insist that you are doing it wrong.
1331 | Ignore them, and do what needs to be done to communicate your hard won
1332 | data clearly and concisely. To get you started, give "[[https://www.researchgate.net/publication/24285628_Principles_of_Information_Display_for_Visualization_Practitioners][Principles of
1333 | Information Display for Visualization Practitioners]]" a read. It is an
1334 | executive summary of [[https://www.edwardtufte.com/tufte/][Edward Tufte's]] works on visualizing information.
1335 | If what you read resonates with you, then please read [[http://www.amazon.com/The-Visual-Display-Quantitative-Information/dp/0961392142][Tufte's books]].
1336 | They make for delightful coffee time distractions. After reading his
1337 | books you will start to see his influence in particularly nice
1338 | graphics. You will also see many sins committed by other
1339 | practitioners. Choose your role-models wisely.
1340 |
1341 | Sage advice (passed down from The Editor's first mentor):
1342 | #+BEGIN_QUOTE
1343 | When you make a plot, take the time to make it publication quality
1344 | and reproducible.
1345 | #+END_QUOTE
1346 | This means two things:
1347 | 1. Make it good enough to go into a paper
1348 | 2. Prefer generating it with C++/Python over any other format (data
1349 | inputs will frequently change at the last minute, and being able to
1350 | "hit a button" and get the plot is very useful unless you have a
1351 | room full of [[https://en.wikipedia.org/wiki/Mechanical_Turk][Mechanical Turks]] lying around)
1352 |
1353 | This also means it's probably a good idea to keep =*.root= files
1354 | containing your histograms for last minute style changes if you are
1355 | writing a presentation.
1356 |
1357 | Producing a publication quality plot can be challenging; however, ROOT
1358 | includes the concept of a "Style" which can be applied. These are
1359 | global rules for how plots should be printed. In previous versions of
1360 | ROOT, the default style was notoriously bad. In The Editor's humble
1361 | opinion, this was done to simultaneously encourage each physicist to
1362 | set their own standard, and to immediately identify ROOT newbies from
1363 | seasoned ROOT hackers.
1364 |
1365 | Now, things are better, though the idea that "each physicist set their
1366 | own standard" has stuck, and so there are many styles floating
1367 | around.
1368 | **** Example Style
1369 | An example style (probably from the CMS TDR, the details are lost to
1370 | time):
1371 | #+BEGIN_SRC cpp
1372 | {
1373 | TStyle *tdrStyle = new TStyle("tdrStyle","Style for P-TDR");
1374 |
1375 | cout << "TDR Style initialized" << endl;
1376 |
1377 | // For the canvas:
1378 | tdrStyle->SetCanvasBorderMode(0);
1379 | tdrStyle->SetCanvasColor(kWhite);
1380 | tdrStyle->SetCanvasDefH(600); //Height of canvas
1381 | tdrStyle->SetCanvasDefW(600); //Width of canvas
1382 | tdrStyle->SetCanvasDefX(0); //Position on screen
1383 | tdrStyle->SetCanvasDefY(0);
1384 |
1385 | // For the Pad:
1386 | tdrStyle->SetPadBorderMode(0);
1387 | // tdrStyle->SetPadBorderSize(Width_t size = 1);
1388 | tdrStyle->SetPadColor(kWhite);
1389 | tdrStyle->SetPadGridX(false);
1390 | tdrStyle->SetPadGridY(false);
1391 | tdrStyle->SetGridColor(0);
1392 | tdrStyle->SetGridStyle(3);
1393 | tdrStyle->SetGridWidth(1);
1394 |
1395 | // For the frame:
1396 | tdrStyle->SetFrameBorderMode(0);
1397 | tdrStyle->SetFrameBorderSize(1);
1398 | tdrStyle->SetFrameFillColor(0);
1399 | tdrStyle->SetFrameFillStyle(0);
1400 | tdrStyle->SetFrameLineColor(1);
1401 | tdrStyle->SetFrameLineStyle(1);
1402 | tdrStyle->SetFrameLineWidth(1);
1403 |
1404 | // For the histo:
1405 | // tdrStyle->SetHistFillColor(1);
1406 | // tdrStyle->SetHistFillStyle(0);
1407 | tdrStyle->SetHistLineColor(1);
1408 | tdrStyle->SetHistLineStyle(0);
1409 | tdrStyle->SetHistLineWidth(1);
1410 |
1411 | tdrStyle->SetEndErrorSize(2);
1412 | //tdrStyle->SetErrorMarker(20);
1413 | tdrStyle->SetErrorX(0.);
1414 |
1415 | tdrStyle->SetMarkerStyle(20);
1416 |
1417 | //For the fit/function:
1418 | tdrStyle->SetOptFit(1);
1419 | tdrStyle->SetFitFormat("5.4g");
1420 | tdrStyle->SetFuncColor(2);
1421 | tdrStyle->SetFuncStyle(1);
1422 | tdrStyle->SetFuncWidth(1);
1423 |
1424 | //For the date:
1425 | tdrStyle->SetOptDate(0);
1426 | // tdrStyle->SetDateX(Float_t x = 0.01);
1427 | // tdrStyle->SetDateY(Float_t y = 0.01);
1428 |
1429 | // For the statistics box:
1430 | tdrStyle->SetOptFile(0);
1431 | tdrStyle->SetOptStat(0); // To display the mean and RMS: SetOptStat("mr");
1432 | tdrStyle->SetStatColor(kWhite);
1433 | tdrStyle->SetStatFont(42);
1434 | tdrStyle->SetStatFontSize(0.025);
1435 | tdrStyle->SetStatTextColor(1);
1436 | tdrStyle->SetStatFormat("6.4g");
1437 | tdrStyle->SetStatBorderSize(1);
1438 | tdrStyle->SetStatH(0.1);
1439 | tdrStyle->SetStatW(0.15);
1440 | // tdrStyle->SetStatStyle(Style_t style = 1001);
1441 | // tdrStyle->SetStatX(Float_t x = 0);
1442 | // tdrStyle->SetStatY(Float_t y = 0);
1443 |
1444 | // Margins:
1445 | tdrStyle->SetPadTopMargin(0.15);
1446 | tdrStyle->SetPadBottomMargin(0.13);
1447 | tdrStyle->SetPadLeftMargin(0.13);
1448 | tdrStyle->SetPadRightMargin(0.15);
1449 |
1450 | // For the Global title:
1451 |
1452 | // tdrStyle->SetOptTitle(0);
1453 | tdrStyle->SetTitleFont(42);
1454 | tdrStyle->SetTitleColor(1);
1455 | tdrStyle->SetTitleTextColor(1);
1456 | tdrStyle->SetTitleFillColor(10);
1457 | tdrStyle->SetTitleFontSize(0.05);
1458 | // tdrStyle->SetTitleH(0); // Set the height of the title box
1459 | // tdrStyle->SetTitleW(0); // Set the width of the title box
1460 | // tdrStyle->SetTitleX(0); // Set the position of the title box
1461 | // tdrStyle->SetTitleY(0.985); // Set the position of the title box
1462 | // tdrStyle->SetTitleStyle(Style_t style = 1001);
1463 | // tdrStyle->SetTitleBorderSize(2);
1464 |
1465 | // For the axis titles:
1466 |
1467 | tdrStyle->SetTitleColor(1, "XYZ");
1468 | tdrStyle->SetTitleFont(42, "XYZ");
1469 | tdrStyle->SetTitleSize(0.06, "XYZ");
1470 | // The inconsistency is great!
1471 | tdrStyle->SetTitleXOffset(1.0);
1472 | tdrStyle->SetTitleOffset(1.5, "Y");
1473 |
1474 | // For the axis labels:
1475 |
1476 | tdrStyle->SetLabelColor(1, "XYZ");
1477 | tdrStyle->SetLabelFont(42, "XYZ");
1478 | tdrStyle->SetLabelOffset(0.007, "XYZ");
1479 | tdrStyle->SetLabelSize(0.05, "XYZ");
1480 |
1481 | // For the axis:
1482 |
1483 | tdrStyle->SetAxisColor(1, "XYZ");
1484 | tdrStyle->SetStripDecimals(kTRUE);
1485 | tdrStyle->SetTickLength(0.03, "XYZ");
1486 | tdrStyle->SetNdivisions(510, "XYZ");
1487 | tdrStyle->SetPadTickX(1); // To get tick marks on the opposite side of the frame
1488 | tdrStyle->SetPadTickY(1);
1489 |
1490 | // Change for log plots:
1491 | tdrStyle->SetOptLogx(0);
1492 | tdrStyle->SetOptLogy(0);
1493 | tdrStyle->SetOptLogz(0);
1494 |
1495 | tdrStyle->SetPalette(1,0);
1496 | tdrStyle->cd();
1497 | }
1498 | #+END_SRC
1499 | If you're working with one of the major experiments, they'll most
1500 | likely have a style for you to use (It will invariably be 95% the
1501 | same as above, but the 5% will make *all* the difference).
1502 |
1503 | **** Transparent Plots
1504 | Add this to your =~/.rootrc= (or create it if it doesn't exist):
1505 | #+BEGIN_EXAMPLE
1506 | # Flag to set CanvasPreferGL via gStyle
1507 | OpenGL.CanvasPreferGL: 1
1508 | #+END_EXAMPLE
1509 |
1510 | Now, in your plotting code:
1511 | #+BEGIN_SRC cpp
1512 | TColor* color = gROOT->GetColor(TColor::GetColor(red,green,blue));//Use ints from 0 to 255
1513 | color->SetAlpha(0.5);//0 is fully transparent, 1 fully opaque
1514 | hist->SetFillColor(color->GetNumber());
1515 | #+END_SRC
1516 | Is this a clean interface? No, but it can be just what your graphic
1517 | needs to remain clear without cluttering the canvas with hatching.
1518 |
1519 | Two warnings:
1520 | 1. As of this writing, this is only supported for "popular" output
1521 | formats (pdf, svg, gif, jpg, and png), though notably *not*
1522 | postscript (i.e. ps).
1523 | 2. It's very easy to create a shade that cannot be properly rendered
1524 | on a projector, making the transparent component of your plots
1525 | invisible.
1526 |
1527 | *** Extending ROOT with custom classes
1528 | Sometimes it is useful to add your own developed classes to ROOT (so
1529 | they can be used in ROOT macros/PyROOT scripts or so they can be
1530 | stored in a ROOT file). For example you may have a class of the form
1531 | (in a file called CustomEvent.h, for example):
1532 | #+BEGIN_SRC cpp
1533 | #ifndef CustomEvent_h
1534 | #define CustomEvent_h
1535 |
1536 | #include "TObject.h"
1537 |
1538 | class CustomEvent : public TObject {
1539 | ClassDef(CustomEvent,1);
1540 | private:
1541 | int _eventID;
1542 | double _eventEnergy;
1543 | public:
1544 | CustomEvent(); // defined in CustomEvent.cxx
1545 | virtual ~CustomEvent(); // defined in CustomEvent.cxx
1546 |
1547 | void set_eventID(const int eid);
1548 | void set_eventEnergy(const double e);
1549 |
1550 | int eventID() const { return _eventID; }
1551 | double eventEnergy() const { return _eventEnergy; }
1552 | };
1553 |
1554 | #endif
1555 | #+END_SRC
1556 | It is a good idea to inherit from TObject if you require reading or
1557 | writing objects to disk. Official documentation of the ins and outs
1558 | of adding classes can be found in Chapter 15 of the [[https://root.cern.ch/drupal/content/root-users-guide-534][User's Guide]], as
1559 | well as information here: [[https://root.cern.ch/drupal/content/interacting-shared-libraries-rootcint][cint]] and [[https://root.cern.ch/drupal/content/adding-your-class-root-classdef][ClassDef]]. We also need
1560 | CustomEvent.cxx:
1561 | #+BEGIN_SRC cpp
1562 | #include "CustomEvent.h"
1563 |
1564 | CustomEvent::CustomEvent() : _eventID(0), _eventEnergy(0.) {} // give members defined values
1565 | CustomEvent::~CustomEvent() {}
1566 |
1567 | void CustomEvent::set_eventID(const int eid) { _eventID = eid; }
1568 | void CustomEvent::set_eventEnergy(const double e) { _eventEnergy = e; }
1569 |
1570 | #+END_SRC
1571 | To make this class accessible within (Py)ROOT you must create a LinkDef.h file
1572 | of the form:
1573 | #+BEGIN_SRC cpp
1574 | #ifdef __CINT__
1575 | #pragma link off all globals;
1576 | #pragma link off all classes;
1577 | #pragma link off all functions;
1578 |
1579 | #pragma link C++ class CustomEvent+;
1580 |
1581 | #endif
1582 | #+END_SRC
1583 | Now you must generate a dictionary for the class:
1584 | #+BEGIN_SRC sh
1585 | rootcint -f CustomEventDictionary.cxx -c CustomEvent.h LinkDef.h
1586 | #+END_SRC
1587 | This generates =CustomEventDictionary.cxx= and
1588 | =CustomEventDictionary.h=. Now you can compile the source for your
1589 | class and the new dictionary source. Then finally link them together
1590 | in a library:
1591 | #+BEGIN_SRC sh
1592 | g++ -fPIC -c CustomEvent.cxx `root-config --cflags`
1593 | g++ -fPIC -c CustomEventDictionary.cxx `root-config --cflags`
1594 | g++ -shared -o libCustomEvent.so CustomEvent.o CustomEventDictionary.o `root-config --glibs`
1595 | #+END_SRC
1596 | Now by loading the library libCustomEvent.so you can use your class
1597 | within ROOT. For example, the macro:
1598 | #+BEGIN_SRC cpp
1599 | {
1600 | gSystem->Load("libCustomEvent");
1601 | CustomEvent a;
1602 | a.set_eventID(42);
1603 | std::cout << a.eventID() << std::endl;
1604 | }
1605 | #+END_SRC
1606 | *** Important Gotchas
1607 | At some point you'll get stuck. Hopefully you'll be stuck on a good
1608 | problem, but more often than not you'll be stuck on some quirk that
1609 | ROOT has. Remember ROOT's mantra: "[[http://www.jargon.net/jargonfile/f/feature.html][It's not a bug, it's a feature]]!"
1610 |
1611 | ROOT's object protocol is very strange. The [[https://root.cern/TaligentDocs/TaligentOnline/DocumentRoot/1.0/Docs/books/WM/WM_63.html#HEADING77][naming schema]] is based on
1612 | an industry standard written for C++ before namespaces were
1613 | available. The result is very confusing for new users (very good for
1614 | HEP). Every object in ROOT that can be written to disk (i.e. saved in a
1615 | ROOT file) derives from a =TObject= base class. This base class
1616 | defines a protocol for objects: all objects can print themselves,
1617 | have a name, have a title, have a class name, etc. This makes it
1618 | possible to have a list of disparate objects (as long as it's a list of
1619 | =TObject*=). As you gain more experience with ROOT, this becomes
1620 | a power tool. Like any power tool ([[https://en.wikipedia.org/wiki/List_of_Home_Improvement_characters#Tim_Taylor][as Tim Taylor can attest]]), this
1621 | can be abused to no end.
1622 |
1623 | **** TTrees
1624 | ***** Drawing trees
1625 | When you call =TTree::Draw= to draw multi-dimensional histograms, the
1626 | order of the axes is "z:y:x" rather than the expected "x:y:z".
1627 | ***** Caching trees
1628 | Enable a =TTreeCache= when looping over trees instead of relying on the
1629 | bare =TTree::GetEntry(i)= idiom. A =TTreeCache= learns which branches you
1630 | access most often and caches them, speeding up your processing time
1631 | significantly. [[http://root.cern.ch/root/html/TTreeCache.html][Documentation here]]. Since you won't read to the end
1632 | (no one does...) here are the docs for when *not* to use a
1633 | =TTreeCache=:
1634 | #+BEGIN_EXAMPLE
1635 | SPECIAL CASES WHERE TreeCache should not be activated
1636 |
1637 |
1638 | When reading only a small fraction of all entries such that not all branch
1639 | buffers are read, it might be faster to run without a cache.
1640 |
1641 |
1642 | HOW TO VERIFY That the TreeCache has been used and check its performance
1643 |
1644 |
1645 | Once your analysis loop has terminated, you can access/print the number
1646 | of effective system reads for a given file with a code like
1647 | (where TFile* f is a pointer to your file)
1648 |
1649 | printf("Reading %lld bytes in %d transactions\n",f->GetBytesRead(), f->GetReadCalls());
1650 | #+END_EXAMPLE
1651 | ***** Splitting Trees
1652 | If you want to split a TTree into $n$ statistically independent
1653 | parts use something like:
1654 | #+BEGIN_SRC cpp
1655 | TTree* outTree=tree->CopyTree(TString::Format("Entry$%%%d==%d",n,i)); // %% is a literal %
1656 | #+END_SRC
1657 | Here, =n= is the number of parts requested, =i= is the index of the part (0 to n-1). If
1658 | you're just splitting it in half, a full blown macro (from the
1659 | trenches) would look like:
1660 | #+BEGIN_SRC cpp
1661 | // A macro to split a tree
1662 | {
1663 | TFile* file=TFile::Open("./merged_dijets.root");
1664 | TTree* tree=(TTree*)file->Get("micro");
1665 | TFile* fileA=new TFile("UnfoldingStudy.dijets-pt1.root","RECREATE");
1666 | fileA->cd();
1667 | TTree* treeA=tree->CopyTree("Entry$%2==0");
1668 | treeA->Write();
1669 | fileA->Write();
1670 | fileA->Close();
1671 | TFile* fileB=new TFile("UnfoldingStudy.dijets-pt2.root","RECREATE");
1672 | fileB->cd();
1673 | TTree* treeB=tree->CopyTree("Entry$%2==1");
1674 | treeB->Write();
1675 | fileB->Write();
1676 | fileB->Close();
1677 |
1678 | }
1679 | #+END_SRC
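To see which entries end up in which part, here is a small ROOT-free sketch of the same modulo selection. The function =entries_in_part= is a hypothetical helper written for this illustration; it is not part of ROOT.
#+BEGIN_SRC cpp
#include <cassert>
#include <vector>

// Entry numbers that a selection of the form "Entry$%n_parts==part"
// would keep for a tree holding n_entries entries.
std::vector<long long> entries_in_part(long long n_entries, int n_parts, int part) {
  assert(n_parts > 0 && part >= 0 && part < n_parts);
  std::vector<long long> selected;
  for (long long entry = 0; entry < n_entries; ++entry) {
    if (entry % n_parts == part) selected.push_back(entry);
  }
  return selected;
}
#+END_SRC
For a 10-entry tree split three ways, part 0 holds entries 0, 3, 6 and 9, while parts 1 and 2 hold three entries each. Every entry lands in exactly one part and the part sizes differ by at most one.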
1680 | **** TH1
1681 | Despite the name, TH1 is the base class for all histograms. This can
1682 | lead to much [[http://dwarffortresswiki.org/index.php/DF2012:Losing][!FUN!]]. Be extra wary of null pointers when handling
1683 | TH1's of unknown origin.
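A minimal sketch of the defensive pattern: check the result of =dynamic_cast= before dereferencing it. The classes =Object=, =Histogram= and =Graph= are made-up stand-ins, not the real ROOT hierarchy, so that the example compiles without ROOT.
#+BEGIN_SRC cpp
#include <cassert>

// Hypothetical stand-ins for a polymorphic class hierarchy (not ROOT classes).
struct Object { virtual ~Object() = default; };
struct Histogram : Object { int entries = 0; };
struct Graph : Object {};

// Returns the number of entries, or -1 if obj is null or not a Histogram.
int safe_entries(const Object* obj) {
  // dynamic_cast yields nullptr for a null input or a mismatched type
  const Histogram* hist = dynamic_cast<const Histogram*>(obj);
  if (!hist) return -1;
  return hist->entries;
}
#+END_SRC
The same check applies whenever a histogram pointer comes from a lookup or a cast that you did not write yourself.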
1684 | **** TH2
1685 | ***** Splitting a 2D
1686 | One would expect an interface method like =TH2D::Split()=, but instead
1687 | you need to use the appropriate THStack [[http://root.cern.ch/root/html/THStack.html#THStack:THStack@2][constructor]]:
1688 | =THStack(const TH1* hist, Option_t* axis = "x", ...)=
1689 |
1690 | Then call =THStack::GetHists= to get a TList of the histograms. Of
1691 | course, you'll have to use [[http://root.cern.ch/root/html/TList.html][ROOT's idioms for iterating over lists]].
1692 |
1693 | Another, slightly more direct option is =TH2::ProjectionX()= and
1694 | =TH2::ProjectionY()=, used in the following fashion:
1695 | #+BEGIN_SRC cpp
1696 | std::vector<TH1D*> split_list;
1697 | for(int i = 1; i <= Hist2D->GetNbinsX(); i++) {
1698 |   split_list.push_back(Hist2D->ProjectionY("_py",i,i,"e")); // project x-bin i only
1699 | }
1700 | #+END_SRC
1701 | Depending on your ROOT version, you may have to change the ="_py"=
1702 | string to be something unique. This can be done as follows:
1703 | #+BEGIN_SRC cpp
1704 | std::vector<TH1D*> split_list;
1705 | char buff[256];
1706 | for(int i = 1; i <= Hist2D->GetNbinsX(); i++) {
1707 |   snprintf(buff,sizeof(buff),"%s_%d_py",Hist2D->GetName(),i); // unique name per slice
1708 |   split_list.push_back(Hist2D->ProjectionY(buff,i,i,"e"));    // project x-bin i only
1709 | }
1710 | #+END_SRC
1711 | User Beware: not using a unique name can cause unexpected behavior
1712 | when drawing the list of slices.
1713 | **** THStack
1714 | =THStack= is not mostly harmless. In fact, if you want to do reasonable
1715 | things with =THStack=, it probably won't work, and if it does, it may
1716 | work once and not the second time. If you can avoid =THStack=, then do
1717 | it.
1718 | ***** Get Sum of Stack
1719 | To get a histogram representing the sum of a stack use
1720 | =THStack::GetStack()->Last()=
1721 | ***** A pointer here, a pointer there, who am I?
1722 | If you want to iterate over the stack, there are two ways to get the
1723 | underlying objects. They are not equal. Option 1:
1724 | #+BEGIN_SRC cpp
1725 | THStack* stack = new THStack("Stack","A stack of histograms");
1726 | // later on ...
1727 | TIter next(stack->GetHists());
1728 | TH1* hist = NULL;
1729 | while((hist=dynamic_cast<TH1*>(next()))){
1730 | // Do something with *original* histograms added to stack
1731 | }
1732 | #+END_SRC
1733 | Option 2:
1734 | #+BEGIN_SRC cpp
1735 | TH1* hist = NULL;
1736 | for(int i = 0; i < stack->GetStack()->GetEntries(); i++){
1737 | hist = dynamic_cast<TH1*>(stack->GetStack()->At(i));
1738 | // Do something with histograms to be painted on the canvas
1739 | }
1740 | #+END_SRC
1741 | These two methods appear equal but are not. The =TList= method from
1742 | =THStack::GetHists= gives you the original pointers (i.e. those that you
1743 | added with =THStack::Add=). The =TObjArray= method from
1744 | =THStack::GetStack= gives you the internal histograms which will be
1745 | used to draw the histograms on the canvas.
1746 | **** TFile
1747 | TFiles are greedy about object ownership. In fact, object ownership
1748 | in ROOT is a very common +bug+ feature. Many times you'll own objects
1749 | you thought you didn't (memory leak), or you'll delete objects you
1750 | thought you owned (double free, core dump).
1751 |
1752 | The rule of thumb to keep in mind is "TObjects declared after a file
1753 | is opened are owned by the most recently opened file".
1754 |
1755 | Contrast:
1756 | #+BEGIN_SRC cpp
1757 | TH1F hist("hist","Higgs Discovery Plot", 50,0,200);
1758 | TFile output("discovery.root","RECREATE");
1759 | output.Write(); output.Close(); // writes only objects attached to the file
1760 | #+END_SRC
1761 | with:
1762 | #+BEGIN_SRC cpp
1763 | TFile output("discovery.root","RECREATE");
1764 | TH1F hist("hist","Higgs Discovery Plot", 50,0,200);
1765 | output.Write(); output.Close(); // writes only objects attached to the file
1766 | #+END_SRC
1767 |
1768 | In the former, =hist= is not attached to the file, so =discovery.root=
1769 | will be empty. In the latter, =output.Write()= saves a copy of =hist=.
1770 |
1771 | This can get really hairy when you're dealing with pointers.
1772 | Therefore (instead of being a responsible programmer), the best
1773 | approach to managing memory in ROOT is to not manage it until you
1774 | have to.
1775 | ***** Recreate, create, new, update
1776 | From the ROOT docs:
1777 | #+BEGIN_EXAMPLE
1778 | If option = NEW or CREATE create a new file and open it for writing,
1779 | if the file already exists the file is
1780 | not opened.
1781 | = RECREATE create a new file, if the file already
1782 | exists it will be overwritten.
1783 | = UPDATE open an existing file for writing.
1784 | if no file exists, it is created.
1785 | = READ open an existing file for reading (default).
1786 | #+END_EXAMPLE
1787 | *Important*: Recreating will destroy the file if it exists. BE
1788 | CAREFUL when you use this option!
1789 | **** Extracting ACLiC's compilation steps
1790 | Here's a tip you hope you never need. The use case is when you are
1791 | writing a standalone program and you know you need to invoke rootcling
1792 | to generate dictionaries. When you follow the [[https://root.cern/manual/io_custom_classes/#using-rootcling][correct prescription]],
1793 | the produced dictionary fails even when linked to your code. What you
1794 | can do is follow the "[[https://root.cern/manual/root_macros_and_shared_libraries/][loader.C prescription]]" and then dump what ACLiC
1795 | is doing under the hood with:
1796 | #+BEGIN_EXAMPLE
1797 | root [7] gDebug=7
1798 | (const int)7
1799 | root [8] .L libLoader.C++
1800 | #+END_EXAMPLE
1801 | This will dump all of the g++ and rootcling calls that are required to
1802 | generate a working dictionary.
1803 | *** Debugging with ROOT
1804 | Eventually you'll encounter a segmentation fault or segfault in ROOT.
1805 | (They can also go under the name core dump). This happens when you
1806 | try to read, write, or otherwise abuse a part of memory that doesn't
1807 | belong to you. The result is that the program (ROOT usually)
1808 | crashes. ROOT has gotten pretty good at realizing that this has
1809 | happened, and printing information about what was going on when the
1810 | crash happened.
1811 | **** A "crash" course on reading a stack trace
1812 | Here's a stack trace from a real-live analysis program (SFrame in this
1813 | case)
1814 | #+BEGIN_EXAMPLE
1815 |
1816 |
1817 |
1818 | ===========================================================
1819 | There was a crash.
1820 | This is the entire stack trace of all threads:
1821 | ===========================================================
1822 | #0 0x0000003ba7e9a075 in waitpid () from /lib64/libc.so.6
1823 | #1 0x0000003ba7e3c741 in do_system () from /lib64/libc.so.6
1824 | #2 0x00002b4f86156256 in TUnixSystem::StackTrace() ()
1825 | from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libCore.so
1826 | #3 0x00002b4f86155b2c in TUnixSystem::DispatchSignals(ESignals) ()
1827 | from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libCore.so
1828 | #4  <signal handler called>
1829 | #5 0x00002b4f93f9a47d in UnfoldingStudy::ExecuteEvent(SInputData const&, double) () from /home/dmb60/bFrame/SFrame/lib/libMiniReaders.so
1830 | #6 0x00002b4f85c33616 in SCycleBaseExec::Process(long long) ()
1831 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1832 | #7 0x00002b4f8a62e1e0 in TTreePlayer::Process(TSelector*, char const*, long long, long long) ()
1833 | from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libTreePlayer.so
1834 | #8 0x00002b4f85c47ce8 in SCycleController::ExecuteNextCycle() ()
1835 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1836 | #9 0x00002b4f85c43872 in SCycleController::ExecuteAllCycles() ()
1837 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1838 | #10 0x000000000040226c in main ()
1839 | ===========================================================
1840 |
1841 |
1842 | The lines below might hint at the cause of the crash.
1843 | If they do not help you then please submit a bug report at
1844 | http://root.cern.ch/bugs. Please post the ENTIRE stack trace
1845 | from above as an attachment in addition to anything else
1846 | that might help us fixing this issue.
1847 | ===========================================================
1848 | #5 0x00002b4f93f9a47d in UnfoldingStudy::ExecuteEvent(SInputData const&, double) () from /home/dmb60/bFrame/SFrame/lib/libMiniReaders.so
1849 | #6 0x00002b4f85c33616 in SCycleBaseExec::Process(long long) ()
1850 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1851 | #7 0x00002b4f8a62e1e0 in TTreePlayer::Process(TSelector*, char const*, long long, long long) ()
1852 | from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libTreePlayer.so
1853 | #8 0x00002b4f85c47ce8 in SCycleController::ExecuteNextCycle() ()
1854 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1855 | #9 0x00002b4f85c43872 in SCycleController::ExecuteAllCycles() ()
1856 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1857 | #10 0x000000000040226c in main ()
1858 | ===========================================================
1859 |
1860 |
1861 |
1862 | #+END_EXAMPLE
1863 | The numbered lines followed by the memory address (the 64-bit hex
1864 | numbers) represent the order in which each function was called. The
1865 | most recent call is at the top of the list. Since this code was
1866 | running a single thread, there is only one stack trace. If there
1867 | were multiple threads, there would be a trace for each
1868 | thread. Typically the fastest route to a user-defined function is to
1869 | look at the portion:
1870 | #+BEGIN_EXAMPLE
1871 | The lines below might hint at the cause of the crash.
1872 | If they do not help you then please submit a bug report at
1873 | http://root.cern.ch/bugs. Please post the ENTIRE stack trace
1874 | from above as an attachment in addition to anything else
1875 | that might help us fixing this issue.
1876 | ===========================================================
1877 | #5 0x00002b4f93f9a47d in UnfoldingStudy::ExecuteEvent(SInputData const&, double) () from /home/dmb60/bFrame/SFrame/lib/libMiniReaders.so
1878 | #6 0x00002b4f85c33616 in SCycleBaseExec::Process(long long) ()
1879 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1880 | #7 0x00002b4f8a62e1e0 in TTreePlayer::Process(TSelector*, char const*, long long, long long) ()
1881 | from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libTreePlayer.so
1882 | #8 0x00002b4f85c47ce8 in SCycleController::ExecuteNextCycle() ()
1883 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1884 | #9 0x00002b4f85c43872 in SCycleController::ExecuteAllCycles() ()
1885 | from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
1886 | #10 0x000000000040226c in main ()
1887 | ===========================================================
1888 | #+END_EXAMPLE
1889 |
1890 | This strips out the system calls that clutter the full trace, and the
1891 | top frame (#5 in this case) is the last function called that was
1892 | defined in user code (=UnfoldingStudy::ExecuteEvent=). This means that somewhere in
1893 | that function, someone tried to access memory they shouldn't have.
1894 |
1895 | You can trace the whole program from the =main()= invocation. To
1896 | gain more insight into all the information contained in the stack
1897 | trace, it is very useful to go through a =gdb= tutorial.
1898 |
1899 | Also, see [[Getting Help]] for more problem solving strategies before
1900 | filing a bug report. Remember: you probably just found a feature,
1901 | not a bug.
1902 |
1903 | **** Using gdb
1904 | Here are some good gdb (GNU Debugger) tutorials:
1905 | - [[https://www.cs.cmu.edu/~gilpin/tutorial/][Debugging under Unix: gdb Tutorial]]
1906 | - [[http://www.unknownroad.com/rtfm/gdbtut/gdbtoc.html][RMS's gdb Debugger Tutorial]] (not the same [[http://stallman.org/][RMS]])
1907 | - [[https://www.gnu.org/software/gdb/documentation/][GDB manual]] from GNU.org
1908 |
1909 | **** Valgrind
1910 | If you have memory related problems, you should be aware of
1911 | [[http://valgrind.org][valgrind]]. The [[http://valgrind.org/docs/manual/QuickStart.html][quick-start is here]].
1912 |
1913 | ** Physics
1914 | At some point (not necessarily right away) a hitchhiker will want to
1915 | better understand the physics underlying the research they are
1916 | doing. There are numerous textbooks on the subject, but a few stand
1917 | out as particularly good.
1918 | - [[http://www.amazon.com/Introduction-Elementary-Particles-David-Griffiths/dp/3527406018/ref%3Dpd_sim_b_1?ie%3DUTF8&refRID%3D1S6X949W6P9EM9E95V8A][Introduction to Elementary Particles 2nd Ed.]] - David Griffiths
1919 | - [[http://www.amazon.com/Quarks-Leptons-Introductory-Particle-Physics/dp/0471887412][Quarks and Leptons: An Introductory Course in Modern Particle Physics]] - Francis Halzen and Alan D. Martin
1921 | *** Study Guide For Griffiths
1922 | If you've never touched HEP or heard of Particle Physics, read
1923 | chapter 1 and 2. Otherwise here's a rough path through the book (with
1924 | suggested exercises to work):
1925 | - Chapter 3 (Problems 3.4, 3.14 (for fun), 3.15, 3.16, 3.25, 3.26
1926 | (last two cover Mandelstam variables, which are very useful tools))
1927 | - Chapter 6 (Problems 6.8 or 6.9, 6.12, 6.13, 6.14, 6.15)
1928 | - Chapter 7 (Problems 7.6, 7.8 (optional), 7.14, 7.23, 7.30, 7.36,
1929 | 7.37, 7.39, 7.51)
1930 | - Chapter 8 (Problems 8.14, 8.15, 8.16, 8.19, 8.23, 8.28)
1931 | - Chapter 9 (Problems 9.2, 9.3, 9.6, 9.14, 9.17, 9.20, 9.23, 9.25,
1932 | 9.31, 9.32)
1933 | - Chapter 10 (Problems 10.4, 10.5, 10.6, 10.15, 10.16, 10.21, 10.23)
1934 |
1935 | Halzen and Martin has a better treatment of Quantum Chromodynamics,
1936 | but that just adds another book to your library.
1937 |
1938 | This assumes the reader has covered the material in chapter 4 in an
1939 | undergraduate quantum course (at the level of Griffiths). Chapter 4
1940 | is too much of a review to be useful as a primary source. Griffiths's
1941 | quantum book covers it, but Townsend's "A Modern Approach to Quantum
1942 | Mechanics" is a better text to have on your bookshelf.
1943 |
1944 | *** Coordinate Systems used in HEP
1945 | HEP uses cylindrical coordinates (well, a combination of cylindrical
1946 | and spherical really) with the z axis oriented along the beam line, R
1947 | radially "up" and \phi curling right-handed around the beam axis.
1948 |
1949 | In addition, HEP physicists think in terms of a variable called
1950 | "pseudorapidity" denoted \eta. This is defined as
1951 | $$
1952 | \eta = - \ln\left(\tan\frac{\theta}{2}\right)
1953 | $$
1954 | where \theta is the polar angle from the beam axis. Why use this
1955 | crazy coordinate system? Well, it's related to rapidity, the
1956 | relativistic counterpart to speed. More important to HEP experiments,
1957 | /differences/ in pseudorapidity are (for effectively massless particles)
1958 | invariant under Lorentz boosts /along the beam axis/.
1959 |
1960 | If you want to keep a mental map of angles (from [[https://en.wikipedia.org/wiki/Pseudorapidity][wikipedia]]):
1961 | |--------+-------------------------------------------|
1962 | | \eta | Location |
1963 | |--------+-------------------------------------------|
1964 | | 0 | "Straight up" (\theta=\pi/2) |
1965 | | 0.88 | "Forty Five Degrees" (\theta=\pi/4) |
1966 | | 4.05 | "Along the Beam Pipe" (\theta=2$^\circ$) |
1967 | | \infty | "Beam Axis" (\theta=0) |
1968 | |--------+-------------------------------------------|
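
As a rough sketch of the mapping above, the following standalone C++ (no ROOT required) converts between the polar angle and pseudorapidity. The names =pseudorapidity=, =polar_angle=, =degrees_to_radians= and =kPi= are made up for this illustration.
#+BEGIN_SRC cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Convert an angle from degrees to radians.
double degrees_to_radians(double deg) { return deg * kPi / 180.0; }

// Pseudorapidity for a polar angle theta (radians) measured from the beam axis.
// Defined for 0 < theta < pi; eta grows without bound as theta approaches 0.
double pseudorapidity(double theta) {
  assert(theta > 0.0 && theta < kPi);
  return -std::log(std::tan(theta / 2.0));
}

// Inverse mapping: polar angle (radians) for a given pseudorapidity.
double polar_angle(double eta) {
  return 2.0 * std::atan(std::exp(-eta));
}
#+END_SRC
For example, =pseudorapidity(degrees_to_radians(45.0))= evaluates to roughly 0.88, and =polar_angle= undoes the conversion.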
1969 |
1970 |
1971 |
1972 |
1973 | *** Monte Carlo Event Weights
1974 | This is taken from The Editor's Lab Notebook (which is locked by a
1975 | password, and hence will only be read by one person):
1976 |
1977 | Someone generates gobs of MC for use in any analysis for some process
1978 | ( $W\rightarrow \mu\nu+p$ where $p$ is a parton). I want to study what
1979 | a variable will look like in data, so I run my analysis over the MC
1980 | and get a plot out. Then I run the same code over data and plot the
1981 | two on top of each other. The trouble is that the number of MC events
1982 | I ran over is not the same as the amount of data I ran over. Now the
1983 | problem becomes how to appropriately scale the Monte Carlo prediction
1984 | to match the data (the result of the experiment).
1985 |
1986 | As experimental high energy physicists, we define variables so that we
1987 | can relate the cross sections predicted by theory to what we measure
1988 | out of the beam. To this end, we can get the number of expected
1989 | events from
1990 | \begin{equation}
1991 | N_{D}=\sigma \mathcal{L}
1992 | \end{equation}
1993 | Here $\sigma$ is the cross section in barns (typically picobarns) and
1994 | $\mathcal{L}$ is the integrated luminosity collected by the
1995 | experiment. How do we compare this to the Monte Carlo prediction?
1996 | There, the computer counts up the number of events $N_{MC}$ that were
1997 | generated for a specific process. What we want is a weight $W$ such that
1998 | \begin{equation}
1999 | N_{exp}=W N_{MC}
2000 | \end{equation}
2001 | Here $N_{exp}$ is the number of events we expect to see in data. In
2002 | order to compare data to MC, $W$ must scale $N_{MC}$ so that $N_{exp}$
2003 | is directly comparable to $N_{D}$. Remember, there is a cross section calculated from
2004 | theory that the MC prediction used, therefore we can write
2005 | \begin{equation}
2006 | N_{exp}=\sigma_{MC}\mathcal{L}
2007 | \end{equation}
2008 | It is possible that $\sigma_{MC}$ is corrected to the next order. In
2009 | order to avoid recalculating everything, a $k$ factor is reported as
2010 | the ratio of $\sigma_{NLO}/\sigma_{MC}$. Then all you have to do is
2011 | multiply $\sigma_{MC}$ by the $k$ factor, and the calculation is
2012 | updated to the newest NLO prediction.
2013 |
2014 | We're interested in the weight, so
2015 | \begin{equation}
2016 | W=\frac{\sigma_{MC}k\mathcal{L}}{N_{MC}}
2017 | \end{equation}
2018 | Now, when each bin in the Monte Carlo histogram is scaled by $W$, it
2019 | will be equal to the theoretically expected yield calculated from
2020 | $\sigma_{MC}k\mathcal{L}$.
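
The weight calculation can be sketched in a few lines of standalone C++. The function names and the choice of units (pb and pb$^{-1}$) are assumptions made for this example, not part of any analysis framework.
#+BEGIN_SRC cpp
#include <cassert>

// Per-event weight W = sigma_MC * k * L / N_MC.
//   sigma_mc_pb : cross section used by the generator, in pb
//   k_factor    : sigma_NLO / sigma_MC (use 1.0 if no correction is applied)
//   lumi_ipb    : integrated luminosity of the data, in pb^-1
//   n_mc        : number of generated Monte Carlo events
double mc_event_weight(double sigma_mc_pb, double k_factor,
                       double lumi_ipb, long long n_mc) {
  assert(n_mc > 0);
  return sigma_mc_pb * k_factor * lumi_ipb / static_cast<double>(n_mc);
}

// Expected yield N_exp = W * N for N unweighted MC events (e.g. in one bin).
double expected_yield(double weight, long long n_events) {
  return weight * static_cast<double>(n_events);
}
#+END_SRC
With hypothetical inputs of $\sigma_{MC}=100$ pb, $k=1.2$, $\mathcal{L}=1000$ pb$^{-1}$ and $N_{MC}=600000$, =mc_event_weight= gives $W=0.2$, so a bin holding 50 raw MC events corresponds to 10 expected events.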
2021 |
2022 |
2023 | *** Drawing Feynman Diagrams Digitally
2024 | There are many possibilities. The most "user friendly" is probably
2025 | [[http://jaxodraw.sourceforge.net/][JaxoDraw]]. It's even got a name reminiscent of Douglas Adams!
2026 |
2027 | JaxoDraw comes as a jar file; you may want to make it more command
2028 | line friendly by making a script to invoke it (put this in
2029 | =${HOME}/local/bin/jaxodraw=):
2030 | #+BEGIN_SRC sh
2031 | #!/bin/bash
2032 | java -jar "${HOME}/local/lib/java/jaxodraw-2.1-0.jar" "$@"
2033 | #+END_SRC
2034 | Now make it executable:
2035 | #+BEGIN_SRC sh
2036 | chmod +x ${HOME}/local/bin/jaxodraw
2037 | #+END_SRC
2038 | And add it to your =PATH=. I have my =PATH= set up to search this
2039 | directory from my =.bashrc=:
2040 | #+BEGIN_SRC sh
2041 | export PATH=${HOME}/local/bin:$PATH
2042 | #+END_SRC
2043 | Now test it out by typing =jaxodraw= at the command prompt; it should
2044 | just launch. If it doesn't, try again.
2045 |
2046 | Someday you may be working on a presentation and want the figures
2047 | exported by jaxodraw to have a transparent background. Some kind soul
2048 | has hacked the conversion steps to do this for you. You can grab the
2049 | scripts here:
2050 | http://personal.psu.edu/jcc8//jd-conversions/
2051 |
2052 | If you have jaxodraw set up as above you can just dump the scripts in
2053 | =$HOME/local/bin= and run them on the xml file produced by jaxodraw.
2054 |
2055 | *** What to do if you've lost a 2\pi
2056 | Calm down, take a deep breath and read the first line of The Guide.
2057 | Then come back here. Somewhere a fellow grad student has a copy of
2058 | "Introduction to Elementary Particles (2nd Edition)" by David
2059 | Griffiths. Read Chapter 6 in its entirety, paying special attention to the
2060 | footnote on page 205.
2061 | ** Responsible Research
2062 | This is a topic better covered elsewhere. The part that overlaps
2063 | with this guide is in documenting the work you do. It is important
2064 | that you keep a traceable paper-trail of everything you do.
2065 |
2066 | *** Lab Notebooks
2067 | When keeping a lab notebook, it is important to make the barrier for
2068 | writing something down as small as possible. If it's difficult or
2069 | inconvenient, you will not be inclined to document as well as you should.
2070 | **** Pen/Paper Notebook
2071 | This is the Science standard, and it is perfectly applicable to HEP
2072 | research. Keep it open and write in it as you work. Make sure there
2073 | is a 1-to-1 mapping from what you are writing in your notebook and
2074 | what can be found on a hard drive somewhere (plots, text, code etc).
2075 | **** Org-mode Notebook
2076 | The Editor keeps an org-mode notebook. There is a [[https://github.com/dmb2/research-log][thorough write-up
2077 | here]]. (Editor's note: I always wanted to flesh this out, but it became
2078 | a joke to link to the repo and let it lie.)
2079 | **** Flat text file
2080 | One option is to keep a (well organized) text file for recording what
2081 | you're doing. Paste links and your thoughts here. Choose a markup
2082 | language and write your posts in that. This way you are forced to
2083 | keep more structure in the file, and you can export to whatever
2084 | output formats are supported by your markup language.
2085 | **** Wiki/ELog
2086 | Some research groups reserve webspace for hosting private
2087 | e-logs. These are typically wiki syntax, but they can also be
2088 | bulletin board style formatting. In any case, it's possible to use a
2089 | browser extension like "[[https://github.com/fregante/GhostText][Ghost Text]]" to use your favorite
2090 | editor. One downside to this style of notebook is that it is typically
2091 | awkward to post many plots at once.
2092 |
2093 | ** Getting Help
2094 | For better or worse, eventually you will get stuck. This is
2095 | Research! If you're not stuck half the time you're not doing it
2096 | right!
2097 |
2098 | Here's a rough strategy for tackling problems:
2099 | 1. Google it. Don't just Google specific errors; search related terms.
2100 | Eventually you will be able to make the search more and more
2101 | general until you get the answer you want. Try reading the [[http://www.googleguide.com/][Google
2102 | guide]].
2103 | 2. Read documentation. Many times the answer is buried deep inside
2104 | the program. Don't be afraid to crack open the source code and
2105 | actually try to understand what's going on (after exhausting any
2106 | manuals or user guides available).
2107 | 3. Just take a deep breath, go get a cup of coffee and come back
2108 | to the problem.
2109 | 4. If the coffee didn't help, table the problem for the moment and
2110 | work on something else. Most likely when you come back to it
2111 | (today or tomorrow) the solution will be obvious.
2112 | - Now may be an appropriate time for an email to the person you
2113 | are working with directly. Chances are they've already
2114 | encountered and solved your problem.
2115 | 5. If you're still stuck, it may be time to post to a forum or mailing
2116 | list. In general, when soliciting help from people you don't
2117 | know, it is polite to avoid contacting them directly.
2118 | - Above all else, *do not* be a [[http://slash7.com/2006/12/22/vampires/][help vampire]]. (They do exist!)
2119 |
2120 | ** FAQ
2121 | - I tried =#include <...>= in C++ but it
2122 | doesn't work, what gives?
2123 | In short: "=#include <...>=" instructs the compiler to look in the
2124 | system include directories and those added with the =-I= flag. In contrast '=#include= "=...="'
2125 | typically instructs the compiler to look first in the directory of the
2126 | including file and then fall back to the =<...>= search path. ([[http://www.cplusplus.com/forum/beginner/43444/][Further information]])
2127 | - I was typing along and all of a sudden my terminal froze!? The
2128 | only way to continue was to exit the program!
2129 | Most likely you accidentally typed "C-s" (the control key held together with
2130 | the s key). This sends the XOFF command to your terminal emulator.
2131 | To fix it (and "unfreeze" your terminal) type "C-q". [[https://en.wikipedia.org/wiki/Software_flow_control][More Info Here]].
2132 | - I have some information that would be useful for your guide, can
2133 | you use it?
2134 | Yes! Please see [[https://github.com/dmb2/hitchhikers-guide-to-hep#contributing][Contributing]].
2135 | - I'm lost and confused, your guide is overwhelming and overbearing,
2136 | but I really like HEP, what can I do?
2137 | Please don't despair; with time things will come into focus. In the
2138 | meantime, it will probably be useful to seek out other sources of
2139 | information. If you're having trouble finding material, you might
2140 | try the [[http://www.googleguide.com/][Google guide]].
2141 | - I have a question not covered by this FAQ that doesn't involve a
2142 | research problem, what's the best way to communicate it to you?
2143 | You can email me at [[mailto:hhg-svc@ccbar.us][hhg-svc@ccbar.us]]. If it is related to some
2144 | technical issue and you already have a GitHub account, please open an
2145 | issue on GitHub's issue [[https://github.com/dmb2/hitchhikers-guide-to-hep/issues][tracker for this text]].
2146 |
--------------------------------------------------------------------------------