├── .gitignore
├── README.md
└── cases
    ├── internal_only_namespaces.md
    ├── pragma_once.md
    └── return_value_optimization.md


/.gitignore:
--------------------------------------------------------------------------------
1 | .#*
2 | *~
3 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # C++ Code Style and Primer Digest
  2 | 
  3 | **Note: this is a work in progress.**
  4 | 
  5 | This is largely based on
  6 | [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).
  7 | If you have time, you should read it. C++ programmers of any level can
  8 | learn from it.
  9 | 
 10 | The [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html)
 11 | clearly states its
 12 | [goals](https://google.github.io/styleguide/cppguide.html#Goals),
 13 | which provide the **justification** for the rules in the style guide.
 14 | [Dr. Titus Winters](http://alumni.cs.ucr.edu/~titus/) gave a very good
 15 | [talk](https://www.youtube.com/watch?v=NOCElcMcFik&t=2481s) on this.
 16 | Some of the goals worth mentioning, especially for new C++
 17 | programmers, are:
 18 | 
 19 | 1.  Optimize for the **READER**, not the writer.
 20 | 
 21 |     Realize that we spend far more time reading code than writing
 22 |     it.
 23 | 2.  Avoid surprising or dangerous constructs.
 24 | 
 25 |     Black magic may look awesome, but it tends to increase the risk
 26 |     of bugs and incorrectness (not only while you are developing the
 27 |     code, but also when you or someone else is maintaining it).
 28 |     "Don't be clever".
 29 | 3.  Be **consistent** (with existing code).
 30 | 
 31 |     This contributes to both the readability and the possibility of
 32 |     having automation tools.
 33 |     
 34 | We will refrain from saying much about formatting, such as
 35 | indentation. It is not as important, and you should make good use of
 36 | your favorite code editor and the
 37 | handy [clang-format](http://clang.llvm.org/docs/ClangFormat.html).
 38 | See [C++ Programmer's Toolbox](#c-programmers-toolbox) for details.
 39 | 
 40 | 
 41 |     
 42 | ## Header Files
 43 | 
 44 | 1.  **Header guards** vs **#pragma once**
 45 | 
 46 |     The Google style guide requires
 47 |     [header guards](https://google.github.io/styleguide/cppguide.html#The__define_Guard).
 48 |     We should prefer `#pragma once` instead.
 49 |     *   `#pragma once` is not part of the standard. However, all
 50 |         mainstream compilers have supported it for years.
 51 |     *   Some codebases may require header guards for **consistency**,
 52 |         but a new codebase may not have that constraint.
 53 |     *   `#pragma once` reduces the possibility of bugs (e.g. when
 54 |         moving files around), and is arguably more readable.
 55 |     *   [Details and example](cases/pragma_once.md).
 56 | 2.  **Forward Declarations**
 57 | 
 58 |     See [Google Style Guide](https://google.github.io/styleguide/cppguide.html#Forward_Declarations).
 59 | 3.  **Inline Functions**
 60 |     
 61 |     **Rule of thumb**: do not inline a function if it is more than 10
 62 |     lines long.
 63 | 4.  **#include order**
 64 |     ```c++
 65 |     #include "foo/server/fooserver.h"  // corresponding header
 66 | 
 67 |     #include <sys/types.h>  // c system headers
 68 |     #include <unistd.h>
 69 | 
 70 |     #include <unordered_map>  // c++ system headers
 71 |     #include <vector>
 72 | 
 73 |     #include "base/basictypes.h" // other headers
 74 |     #include "base/commandlineflags.h"
 75 |     #include "foo/server/bar.h"
 76 |     ```
 77 |     
 78 |     Within each group, headers are listed in alphabetical order.
 79 |     
 80 |     *   Easy to maintain/read. The maintainer/reader can find the
 81 |         corresponding header file quickly.
 82 |     *   We do not have to stick to this rule strictly, although we
 83 |         need to be **consistent** about the order in our project.
 84 |     
 85 | ## Namespaces
 86 | 
 87 | 1.  **Unnamed Namespace**
 88 |     *   Use them in `.cpp` files to hide functions or variables you do
 89 |         not want to expose.
 90 |     *   **Do not** use them in `.h` files.
 91 |     *   Sometimes you may want to expose internal functions or
 92 |         variables just for testing purposes. In this case it is better
 93 |         to declare them in
 94 |         [internal-only namespaces](cases/internal_only_namespaces.md).
 95 | 2.  Never do **using namespace foo;**
 96 |     
 97 |     *   This pollutes the namespace, and can lead to hard-to-resolve
 98 |         compiler errors and bugs.
 99 |     *   If you really have something like
100 |         `a::really::lengthy::nested::name::space`, you can use a
101 |         [namespace alias](http://en.cppreference.com/w/cpp/language/namespace_alias)
102 |         in a `.cpp` file:
103 |         
104 |         ```
105 |         namespace short_name = a::really::lengthy::nested::name::space;
106 |         ```
107 |         
108 |         Do not use namespace aliases in header files except in
109 |         [internal-only namespaces](cases/internal_only_namespaces.md).
110 |         This is because such aliases will affect every file that
111 |         includes this header file.
112 | 3.  Avoid nested namespaces that match well-known top-level namespaces.
113 |     
114 |     Namespace collision may happen if you use `util` to refer to
115 |     `my_library::util` within `namespace my_library`, while there is
116 |     an existing top-level namespace called `util`. Even if no
117 |     collision happens, this confuses the reader. Therefore,
118 |     
119 |     *   Try not to name your namespace `my_library::util` in this
120 |         case.
121 |     *   Refer to the top-level `util` as `::util` to explicitly say
122 |         "top-level".
123 |         
124 |     Common top-level namespace names that are prone to this are
125 |     `util`, `base`, `aux`, etc.
126 |     
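The lookup behavior described above can be seen in a small sketch. All names here (`util::Name`, `my_library::Unqualified`, `my_library::Qualified`) are hypothetical and exist only for illustration.

```cpp
#include <string>

// A top-level namespace named `util` (hypothetical).
namespace util {
inline std::string Name() { return "top-level util"; }
}  // namespace util

namespace my_library {

// A nested namespace that shadows the top-level `util` inside my_library.
namespace util {
inline std::string Name() { return "my_library::util"; }
}  // namespace util

// Unqualified `util` is looked up from the innermost scope outwards,
// so this finds my_library::util.
inline std::string Unqualified() { return util::Name(); }

// The leading `::` forces lookup to start at the global namespace.
inline std::string Qualified() { return ::util::Name(); }

}  // namespace my_library
```

Inside `my_library`, the two calls resolve to different functions even though both mention `util`, which is exactly the confusion the rule tries to avoid.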
127 | ## Classes
128 | 
129 | 1.  **Copy Constructor and Move Constructor**
130 |     *   If you provide a copy constructor and/or a move constructor,
131 |         provide the corresponding `operator=` overload as well.
132 | 2.  **Inheritance**
133 |     *   All inheritance should be **public**. If you want to do
134 |         private inheritance, you should be including an instance of
135 |         the base class as a member instead.
136 |     *   Do not overuse implementation inheritance. Composition is
137 |         often more appropriate. Try to restrict use of inheritance to
138 |         the "is-a" case: Bar subclasses Foo if it can reasonably be
139 |         said that Bar "is a kind of" Foo.
140 |     *   If you find yourself wanting to use multiple inheritance,
141 |         **THINK TWICE**.
142 | 3.  **Access Control**
143 |     *   Data members are **private**, except when they are static
144 |         const.
145 |     *   For technical reasons, we allow data members of a test fixture
146 |         class to be protected when using Google Test.
147 | 4.  **Declaration Order**
148 |     *   `public`, `protected` and then `private`.
149 |     *   In each section, group similar declarations together, prefer
150 |         the order:
151 |         *   `typedef` and `using`
152 |         *   `struct` and `class`
153 |         *   factory functions
154 |         *   constructors
155 |         *   `operator=`
156 |         *   destructors
157 |         *   methods
158 |         *   data members
159 |             
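As a rough sketch of the copy/move rule and the declaration order above, the hypothetical class below (`Samples` is not from any real library) pairs each constructor with its matching `operator=` and keeps its data member private.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class used only for illustration.
class Samples {
 public:
  // Constructors, including the copy and move constructors.
  Samples() = default;
  explicit Samples(std::vector<int> values) : values_(std::move(values)) {}
  Samples(const Samples& other) : values_(other.values_) {}
  Samples(Samples&& other) noexcept : values_(std::move(other.values_)) {}

  // The matching copy and move assignment operators.
  Samples& operator=(const Samples& other) {
    if (this != &other) {
      values_ = other.values_;
    }
    return *this;
  }
  Samples& operator=(Samples&& other) noexcept {
    if (this != &other) {
      values_ = std::move(other.values_);
    }
    return *this;
  }

  // Methods.
  std::size_t size() const { return values_.size(); }

 private:
  // Data members are private.
  std::vector<int> values_;
};
```

For a class this simple the compiler-generated operations would do the same job; the explicit versions are spelled out here only to show the pairing.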
160 | ## Functions
161 | 
162 | 1.  **Parameters**
163 |     *   Ordering: inputs first, then outputs.
164 |     *   All **reference** parameters must be **const**; this is the
165 |         recommended way to pass input parameters.
166 |     *   For output parameters, pass pointers.
167 | 2.  **Default Argument**
168 |     *   Not recommended, for readability reasons. **Do not use** unless
169 |         you have to.
170 | 3.  **Trailing Return Type Syntax**
171 |     *   Use it only for lambdas. Period.
172 | 4.  **Return Value of Complicated Type**
173 |     
174 |     Returning values of simple types such as `int`, `int64_t`, `bool`
175 |     may involve a **copy**, but we hardly care. However, it is
176 |     important to **avoid copies** when returning values of complicated
177 |     types such as `std::vector<int>`, because copying them can be expensive.
178 |     
179 |     One common pattern is to pass in a `std::vector<int>`
180 |     pointer as a parameter and fill it in place as below:
181 |     
182 |     ```c++
183 |     void DoWork(..., std::vector<int> *result) {
184 |         ...
185 |         result->push_back(...);
186 |         ...
187 |         result->push_back(...);
188 |         ...
189 |     }
190 |     ```
191 |     
192 |     I think this pattern should be discouraged in favor
193 |     of
194 |     [return value optimization](cases/return_value_optimization.md),
195 |     because the latter is not only less error-prone, but also **much
196 |     more readable**.
197 |     
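The parameter rules above (inputs first as const references, outputs last as pointers) might look like this in practice. `SplitKeyValue` is a hypothetical helper written only for this sketch.

```cpp
#include <string>

// Hypothetical helper: splits "key=value" into its two parts.
// The input comes first as a const reference; the outputs follow as
// pointers. Returns false (and leaves the outputs untouched) if the
// line contains no '='.
bool SplitKeyValue(const std::string& line, std::string* key,
                   std::string* value) {
  const std::string::size_type pos = line.find('=');
  if (pos == std::string::npos) {
    return false;
  }
  *key = line.substr(0, pos);
  *value = line.substr(pos + 1);
  return true;
}
```

At the call site, the `&` in `SplitKeyValue(line, &key, &value)` makes it visible which arguments may be modified.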
198 | ## Exceptions
199 | 
200 | 1.  **DO NOT** throw exceptions. Return error codes, and let the
201 |     upper-level caller handle them.
202 |     *   The benefit is that we will be forced to handle every error,
203 |         and be free of **surprise** exits. Such exits are especially
204 |         bad in multi-threaded or multi-process programs.
205 | 2.  **Constructors** should avoid any code that can **fail**.
206 |     *   As stated above, we need to return an error code on failure.
207 |         However, constructors cannot do that.
208 |     *   **Solution**: Use factory functions, or use an `Init()`
209 |         function to do the heavy lifting.
210 |         
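One possible shape for the factory-function solution is sketched below. The `Connection` class and its validation rules are assumptions made up for this example; returning `nullptr` is just one way to signal failure without throwing.

```cpp
#include <memory>
#include <string>

// Hypothetical class: a connection that needs a non-empty host and a
// valid port.
class Connection {
 public:
  // Factory function: returns nullptr instead of throwing when the
  // arguments are invalid, so the caller has to check the result.
  static std::unique_ptr<Connection> Create(const std::string& host,
                                            int port) {
    if (host.empty() || port <= 0 || port > 65535) {
      return nullptr;
    }
    return std::unique_ptr<Connection>(new Connection(host, port));
  }

  const std::string& host() const { return host_; }
  int port() const { return port_; }

 private:
  // The constructor only stores values that were already validated,
  // so it contains no code that can fail.
  Connection(const std::string& host, int port) : host_(host), port_(port) {}

  std::string host_;
  int port_;
};
```

Because the constructor is private, `Connection::Create` is the only way to obtain an instance, and every caller sees the failure path.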
211 | ## Tricks
212 | 
213 | 1.  **Thread-safe Local Static Initialization**
214 | 
215 |     In the following function:
216 |     
217 |     ```cpp
218 |     void SomeFunction() {
219 |       static SomeType some_static_variable;
220 |       ...
221 |     }
222 |     ```
223 |     
224 |     If multiple threads enter the function concurrently, is
225 |     there a risk of a **race** condition?
226 |     
227 |     The answer is **no**,
228 |     if
229 |     [compiled with C++11](http://stackoverflow.com/questions/8102125/is-local-static-variable-initialization-thread-safe-in-c11) or later.
230 |     
231 |     In fact, if control enters the **declaration** concurrently while
232 |     the variable is being initialized, the concurrent execution shall
233 |     wait for **completion** of the initialization.
234 |     
235 |     This property can be used to perform more complicated (lazy)
236 |     object initialization in a thread-safe way, in combination
237 |     with
238 |     [lambda functions](http://en.cppreference.com/w/cpp/language/lambda).
239 |     For example, the following code initializes a local static vector
240 |     in a thread-safe way.
241 |     
242 |     ```cpp
243 |     void SomeFunction() {
244 |       static std::unique_ptr<std::vector<int>> my_list([]() {
245 |         // Note that this lambda function is called within the declaration,
246 |         // and is therefore thread-safe.
247 |         std::vector<int> *tmp_list = new std::vector<int>();
248 |         tmp_list->push_back(1);
249 |         tmp_list->push_back(2);
250 |         ...
251 |         return tmp_list;
252 |       }());
253 |     }
254 |     ```
255 |         
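A self-contained variation of the trick is sketched below. It stores the vector by value instead of through `std::unique_ptr`, and the counter `g_init_calls` and the function `SmallPrimes` are hypothetical names added only to make the one-time initialization observable.

```cpp
#include <vector>

// Counts how many times the initializer lambda runs (illustration only).
int g_init_calls = 0;

const std::vector<int>& SmallPrimes() {
  // The lambda runs only the first time control passes through this
  // declaration; since C++11 that initialization is also thread-safe.
  static const std::vector<int> primes = []() {
    ++g_init_calls;
    std::vector<int> tmp;
    tmp.push_back(2);
    tmp.push_back(3);
    tmp.push_back(5);
    return tmp;
  }();
  return primes;
}
```

Every call to `SmallPrimes()` returns a reference to the same vector, and the lambda body runs once no matter how many calls are made.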
256 | ## Naming
257 | 
258 | I think for naming we should stick
259 | to
260 | [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html#Naming).
261 | 
262 | ## C++ Programmer's Toolbox
263 | 
264 | There are many available tools for C++, and it is always good to have
265 | them in your pocket. Some of them help enforce the rules
266 | automatically. Use them to free yourself from worrying about the
267 | format and focus on more important stuff.
268 | 
269 | *   [clang](http://clang.llvm.org/)
270 | *   [clang-format](http://clang.llvm.org/docs/ClangFormat.html)
271 | *   [asan](https://github.com/google/sanitizers/wiki/AddressSanitizer) (Address Sanitizer)
272 | *   [rtags](https://github.com/Andersbakken/rtags)
273 | 
274 | ## Other
275 | 
276 | 1.  Prefer standard library facilities over third-party
277 |     implementations (e.g. boost) if you have a choice. This helps
278 |     reduce both runtime dependencies and compile-time dependencies,
279 |     and it promotes readability.
280 | 1.  Use `const` whenever it makes sense. With C++11, `constexpr` is a
281 |     better choice for some uses of `const`.
282 | 1.  `<stdint.h>` defines types like `int16_t`, `uint32_t`, `int64_t`,
283 |     etc. You should always use those in preference to `short`,
284 |     `unsigned long long` and the like, when you need a guarantee on
285 |     the size of an integer.
286 | 1.  Macros damage readability. **Do not use** them unless the benefit
287 |     is huge enough to compensate for the loss.
288 | 1.  `auto` is permitted when it promotes readability.
289 | 1.  `switch` cases may have scopes:
290 | 
291 |     ```c++
292 |     switch (var) {
293 |       case 0: {  // 2 space indent
294 |         ...      // 4 space indent
295 |         break;
296 |       }
297 |       case 1: {
298 |         ...
299 |         break;
300 |       }
301 |       default: {
302 |         assert(false);
303 |       }
304 |     }
305 |     ```
306 | 
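The `constexpr` and fixed-width integer items above can be combined in a short sketch. The names `kBytesPerKiB` and `KiBToBytes` are hypothetical and used only for illustration.

```cpp
#include <climits>
#include <cstdint>

// constexpr allows these to be evaluated at compile time.
constexpr std::int64_t kBytesPerKiB = 1024;

constexpr std::int64_t KiBToBytes(std::int64_t kib) {
  return kib * kBytesPerKiB;
}

// Fixed-width types have exactly the width their names promise.
static_assert(sizeof(std::int16_t) * CHAR_BIT == 16, "int16_t is 16 bits");
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");
static_assert(sizeof(std::int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

// The function result is usable in a constant expression.
static_assert(KiBToBytes(4) == 4096, "evaluated at compile time");
```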


--------------------------------------------------------------------------------
/cases/internal_only_namespaces.md:
--------------------------------------------------------------------------------
 1 | # Internal-only Namespaces
 2 | 
 3 | Internal-only namespaces are useful when you want something in your
 4 | header file that you do not want to leak to the outside of the header
 5 | file. 
 6 | 
 7 | 1.  An internal helper function that needs to be unit tested.
 8 | 
 9 |     Usually if we do not want to expose an internal helper function, we
10 |     can put it in an
11 |     [unnamed namespace](http://en.cppreference.com/w/cpp/language/namespace#Unnamed_namespaces).
12 |     However, functions in unnamed namespaces cannot be exposed to unit
13 |     tests either, which is bad in case we do want to test them.
14 |     
15 |     The internal-only namespace solution is as below:
16 |     
17 |     ```c++
18 |     namespace my_library_namespace {
19 |     
20 |     // The internal space prevents its content from leaking to 
21 |     // my_library_namespace. Technically it is still exposed, but
22 |     // its name "internal" explicitly tells that the users of the 
23 |     // library are not supposed to use it.
24 |     //
25 |     // In this way, the unit tests are still able to reference the function
26 |     // by my_library_namespace::internal::MyHelperFunction.
27 |     namespace internal {
28 |     void MyHelperFunction();
29 |     }  // namespace internal
30 |     
31 |     }  // namespace my_library_namespace
32 |     ```
33 | 2.  A namespace shorthand that should only be used internally.
34 | 
35 |     ```c++
36 |     namespace my_library_namespace {
37 |     
38 |     // You can use impl::short_name to refer to the namespace
39 |     // ::really::lengthy::name_space within the library. The name
40 |     // "impl" clearly tells users of the library to avoid doing so.
41 |     namespace impl {
42 |     namespace short_name = ::really::lengthy::name_space;
43 |     }  // namespace impl
44 |     
45 |     }  // namespace my_library_namespace
46 |     ```
47 | 
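To make the first case more concrete, here is a minimal sketch of a header whose helper lives in an `internal` namespace. The functions `ClampToPercent` and `ProgressPercent` are hypothetical and exist only for this example.

```cpp
// Contents of a hypothetical header, e.g. my_library.h.
namespace my_library_namespace {

namespace internal {
// Helper that is not part of the public API but still needs a unit test.
inline int ClampToPercent(int value) {
  if (value < 0) return 0;
  if (value > 100) return 100;
  return value;
}
}  // namespace internal

// Public API built on top of the helper.
inline int ProgressPercent(int done, int total) {
  if (total <= 0) return 0;
  return internal::ClampToPercent(done * 100 / total);
}

}  // namespace my_library_namespace
```

A unit test can call `my_library_namespace::internal::ClampToPercent(150)` directly, while the namespace name signals to library users that the helper is off limits.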


--------------------------------------------------------------------------------
/cases/pragma_once.md:
--------------------------------------------------------------------------------
 1 | # Header Guards and #pragma once
 2 | 
 3 | ## How **#pragma once** works
 4 | 
 5 | The directive `#pragma once` is used to ensure that the same file
 6 | (usually a C++ header file) is included only once. This actually
 7 | requires the compiler to be able to determine whether two files are
 8 | the **same**, even in the presence of hard links and symbolic
 9 | links.
10 | 
11 | ## Disadvantages of Header Guards
12 | 
13 | Header guards are more error-prone than `#pragma once`.
14 | 
15 | 1.  A programmer can never be sure that a guard is not already used
16 |     in another header file. Suppose we have two files with the same
17 |     guard:
18 |     
19 |     *   file 1:
20 |     
21 |         ```c++
22 |         #ifndef MY_VERY_UNIQUE_HEADER_GUARD_H
23 |         #define MY_VERY_UNIQUE_HEADER_GUARD_H
24 |         ...
25 |         #endif  // MY_VERY_UNIQUE_HEADER_GUARD_H
26 |         ```
27 |     *   file 2:
28 |     
29 |         ```c++
30 |         #ifndef MY_VERY_UNIQUE_HEADER_GUARD_H
31 |         #define MY_VERY_UNIQUE_HEADER_GUARD_H
32 |         ...
33 |         #endif  // MY_VERY_UNIQUE_HEADER_GUARD_H
34 |         ```
35 | 
36 |     This is called a name clash. Only the first one included takes
37 |     effect, and the other is dropped **SILENTLY**. This is bad.
38 |     
39 | 2.  Although you can follow some rules to dramatically lower the risk
40 |     of a name clash, it can still happen. For example:
41 |     
42 |     *   Usually the header guard follows the relative path of the
43 |         header file within the project. When the header file is
44 |         moved, the programmer has to remember to modify the guard
45 |         accordingly, which does not always happen.
46 |     *   Sometimes the header (and therefore the guard as well) is even
47 |         generated by tools, such as the Protobuf compiler.
48 |         
49 | ## The decision
50 | 
51 | Use `#pragma once` unless constrained by a consistency requirement.
52 |     
53 | 
54 | 


--------------------------------------------------------------------------------
/cases/return_value_optimization.md:
--------------------------------------------------------------------------------
 1 | # Return Value Optimization and Copy Elision
 2 | 
 3 | ## Problem
 4 | 
 5 | Suppose we want to write a function that returns a vector containing
 6 | the numbers from `0` to `n - 1`. The simplest version is
 7 | **return-by-value**:
 8 | 
 9 | ```c++
10 | std::vector<int> Range(int n) {
11 |     std::vector<int> result;
12 |     result.reserve(n);
13 |     for (int i = 0; i < n; ++i) {
14 |         result.push_back(i);
15 |     }
16 |     return result;
17 | }
18 | ```
19 | And we can call it like:
20 | 
21 | ```c++
22 | std::vector<int> x = Range(1000000);
23 | ```
24 | 
25 | A valid question is: does this involve a copy or a move of the huge
26 | ``std::vector``?
27 | 
28 | ## The Answer
29 | 
30 | If you are using a modern compiler such
31 | as [gcc](https://gcc.gnu.org/) or [clang](http://clang.llvm.org/),
32 | the answer is **no** and **no**. A mechanism called **copy elision**
33 | will be applied here as a **return value optimization**.
34 | 
35 | The standard describes in detail when **copy elision** is allowed,
36 | but in general it happens when one (or both) of the following
37 | conditions is met (there are other cases, but they are not important
38 | enough to be highlighted here):
39 | 
40 | 1.  Initializing a variable from a temporary of the same type.
41 | 2.  Returning a local variable right before it goes out of scope.
42 | 
43 | When copy elision happens, the variable gets the value of the
44 | temporary without calling either the copy constructor or the move
45 | constructor. In effect, it claims the temporary and becomes it.
46 | 
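One way to observe this is to count constructor calls. The sketch below uses a hypothetical `Tracked` type and `MakeTracked` function that mirror the shape of `Range()`; neither is part of any library.

```cpp
// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
  static int copies;
  static int moves;
  Tracked() = default;
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a named local variable by value, like Range() above.
Tracked MakeTracked() {
  Tracked result;
  return result;
}
```

After `Tracked t = MakeTracked();`, `Tracked::copies` stays at 0. With gcc and clang `Tracked::moves` is typically 0 as well, although eliding the move of a named local is an optimization the standard permits rather than requires.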
47 | ## How about the Passing-A-Pointer Pattern
48 | 
49 | Another pattern that is widely used for this is pass-a-pointer:
50 | 
51 | ```c++
52 | void Range(int n, std::vector<int> *result) {
53 |     result->reserve(n);
54 |     for (int i = 0; i < n; ++i) {
55 |         result->push_back(i);
56 |     }
57 | }
58 | ```
59 | 
60 | This is acceptable, but I would say it is not as good because:
61 | 
62 | 1.  Do we assume the pointer `result` is initialized? A bad assumption
63 |     can lead to a core dump here.
64 | 2.  It is far less readable, and it requires a certain amount of mental
65 |     work to realize that we are actually **returning** a vector.
66 |     
67 | ## Conclusion
68 | 
69 | Stick to the copy elision approach and let the compiler do the heavy
70 | lifting. Do not try to outsmart the compiler by making your code less
71 | readable.
72 | 
73 | 
74 | 


--------------------------------------------------------------------------------