├── .gitignore
├── README.md
└── cases
    ├── internal_only_namespaces.md
    ├── pragma_once.md
    └── return_value_optimization.md


/.gitignore:
--------------------------------------------------------------------------------
.#*
*~

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# C++ Code Style and Primer Digest

**Note: this is a work in progress.**

This is largely based on the
[Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).
If you have time, you should read it. C++ programmers of any level can
learn from it.

The [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html)
clearly states its
[goals](https://google.github.io/styleguide/cppguide.html#Goals),
which provide the **justification** for the rules in the style guide.
[Dr. Titus Winters](http://alumni.cs.ucr.edu/~titus/) gave a very good
[talk](https://www.youtube.com/watch?v=NOCElcMcFik&t=2481s) on this.
Some of the goals that are worth mentioning, especially for new C++
programmers, are:

1. Optimize for the **READER**, not the writer.

    Most of our time is spent reading code rather than writing it.
2. Avoid surprising or dangerous constructs.

    Black magic may look awesome, but it tends to increase the risk
    of bugs and incorrectness (not only when you are developing the
    code, but also when you or someone else is maintaining it). "Don't
    be clever".
3. Be **consistent** (with existing code).

    This contributes to both readability and the possibility of
    using automation tools.

We will refrain from talking much about formatting such as
indentation. It is not as important, and seriously, you should make
good use of your favorite code editor and the
handy [clang-format](http://clang.llvm.org/docs/ClangFormat.html).
See [C++ Programmer's Toolbox](#c-programmers-toolbox) for details.

## Header Files

1. **Header guards** vs **#pragma once**

    The Google style guide requires
    [header guards](https://google.github.io/styleguide/cppguide.html#The__define_Guard).
    We prefer `#pragma once` instead.
    * `#pragma once` is not part of the standard. However, all
      mainstream compilers have supported it for years.
    * Some codebases may require header guards for **consistency**,
      but a new codebase may not have that constraint.
    * `#pragma once` reduces the possibility of bugs (e.g. when
      moving files around), and is arguably more readable.
    * [Details and example](cases/pragma_once.md).
2. **Forward Declarations**

    See [Google Style Guide](https://google.github.io/styleguide/cppguide.html#Forward_Declarations).
3. **Inline Functions**

    **Rule of thumb**: do not inline a function if it is more than 10
    lines long.
4. **#include order**
    ```c++
    #include "foo/server/fooserver.h"  // corresponding header

    #include <sys/types.h>             // c system headers
    #include <unistd.h>

    #include <string>                  // c++ system headers
    #include <vector>

    #include "base/basictypes.h"       // other headers
    #include "base/commandlineflags.h"
    #include "foo/server/bar.h"
    ```

    All in alphabetical order within each group.

    * Easy to maintain/read. The maintainer/reader can find the
      corresponding header file quickly.
    * We do not have to stick to this rule strictly, although we
      need to be **consistent** about the order in our project.

## Namespaces

1. **Unnamed Namespace**
    * Use them in `.cpp` files to hide functions or variables you do
      not want to expose.
    * **Do not** use them in `.h` files.
    * Sometimes you may want to expose internal functions or
      variables just for testing purposes. In this case it is better
      to declare them in
      [internal-only namespaces](cases/internal_only_namespaces.md).
2. Never do **using namespace foo;**

    * This pollutes the namespace, and can lead to hard-to-resolve
      compiler errors and bugs.
    * If you really have something like
      `a::really::lengthy::nested::name::space`, you can probably use a
      [namespace alias](http://en.cppreference.com/w/cpp/language/namespace_alias)
      in a `.cpp` file:

      ```
      namespace short_name = a::really::lengthy::nested::name::space;
      ```

      Do not use namespace aliases in header files except in
      [internal-only namespaces](cases/internal_only_namespaces.md).
      This is because such aliases affect every file that
      includes the header file.
3. Avoid nested namespaces that match well-known top-level namespaces.

    A namespace collision may happen if you use `util` to refer to
    `my_library::util` within `namespace my_library`, while there is
    an existing top-level namespace called `util`. Even if no
    collision happens, this confuses the reader. Therefore,

    * Try not to name your namespace `my_library::util` in this
      case.
    * Refer to the top-level `util` as `::util` to explicitly say
      "top-level".

    Common top-level namespace names that are prone to this are
    `util`, `base`, `aux`, etc.

## Classes

1. **Copy Constructor and Move Constructor**
    * If you provide a copy constructor and/or a move constructor,
      provide the corresponding `operator=` overload as well.
2. **Inheritance**
    * All inheritance should be **public**. If you want to do
      private inheritance, you should include an instance of
      the base class as a member instead.
    * Do not overuse implementation inheritance. Composition is
      often more appropriate. Try to restrict use of inheritance to
      the "is-a" case: Bar subclasses Foo if it can reasonably be
      said that Bar "is a kind of" Foo.
    * If you find yourself wanting to use multiple inheritance,
      **THINK TWICE**.
3. **Access Control**
    * Data members are **private**, except when they are static
      const.
    * For technical reasons, we allow data members of a test fixture
      class to be protected when using Google Test.
4. **Declaration Order**
    * `public`, `protected` and then `private`.
    * In each section, group similar declarations together, preferring
      the order:
        * `typedef` and `using`
        * `struct` and `class`
        * factory functions
        * constructors
        * `operator=`
        * destructors
        * methods
        * data members

## Functions

1. **Parameters**
    * Ordering: inputs first, then outputs.
    * All **reference** parameters must be **const**; passing by const
      reference is the recommended way to pass input parameters.
    * For output parameters, pass pointers.
2. **Default Argument**
    * Not recommended, because default arguments hurt readability.
      **Do not use** them unless you have to.
3. **Trailing Return Type Syntax**
    * Use it only in lambdas. Period.
4. **Return Value of Complicated Type**

    Returning values of simple types such as `int`, `int64_t`, `bool`
    may involve a **copy**, but we hardly care. However, it is rather
    important to **avoid copies** when returning values of complicated
    types such as `std::vector`, because copying them can be expensive.

    One common pattern is to pass in a `std::vector`
    pointer as a parameter and fill it in place as below:

    ```c++
    void DoWork(..., std::vector<int> *result) {
      ...
      result->push_back(...);
      ...
      result->push_back(...);
      ...
    }
    ```

    I think this pattern should be discouraged in favor of
    [return value optimization](cases/return_value_optimization.md),
    because the latter is not only less error-prone, but also **much
    more readable**.

## Exceptions

1. **DO NOT** throw exceptions. Return an error code, and let the
    upper-level caller handle it.
    * The benefit is that we are forced to handle every error,
      and are free of **surprise** exits. Such exits are especially
      bad in multi-threaded or multi-process programs.
2. **Constructors** should avoid any code that can **fail**.
    * As stated above, we need to return an error code on failure.
      However, constructors cannot do that.
    * **Solution**: Use factory functions, or use an `Init()` function
      to do the heavy lifting.

## Tricks

1. **Thread-safe Local Static Initialization**

    In the following function:

    ```cpp
    void SomeFunction() {
      static SomeType some_static_variable;
      ...
    }
    ```

    If multiple threads enter the function concurrently, is
    there a risk of a **race** condition during initialization?

    The answer is **no**, if
    [compiled with C++11](http://stackoverflow.com/questions/8102125/is-local-static-variable-initialization-thread-safe-in-c11)
    or later.

    In fact, if control enters the **declaration** concurrently while
    the variable is being initialized, the concurrent execution
    waits for **completion** of the initialization.

    This property can be used to do more complicated (lazy) object
    initialization in a thread-safe way, together with
    [lambda functions](http://en.cppreference.com/w/cpp/language/lambda).
    For example, the following code initializes a local static vector
    in a thread-safe way.

    ```cpp
    void SomeFunction() {
      static std::unique_ptr<std::vector<int>> my_list([]() {
        // Note that this lambda function is called within the declaration,
        // and is therefore thread-safe.
        std::vector<int> *tmp_list = new std::vector<int>();
        tmp_list->push_back(1);
        tmp_list->push_back(2);
        ...
        return tmp_list;
      }());
    }
    ```

## Naming

I think for naming we should stick to the
[Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html#Naming).

## C++ Programmer's Toolbox

There are many tools available for C++, and it is always good to have
them in your pocket. Some of them help enforce the rules
automatically. Use them to free yourself from worrying about
formatting and focus on more important things.

* [clang](http://clang.llvm.org/)
* [clang-format](http://clang.llvm.org/docs/ClangFormat.html)
* [asan](https://github.com/google/sanitizers/wiki/AddressSanitizer) (Address Sanitizer)
* [rtags](https://github.com/Andersbakken/rtags)

## Other

1. Use the standard library rather than a third-party
    implementation (e.g. boost) if you have a choice. This helps
    reduce both runtime dependencies and compile-time dependencies,
    and it promotes readability.
1. Use `const` whenever it makes sense. With C++11, `constexpr` is a
    better choice for some uses of `const`.
1. `<stdint.h>` defines types like `int16_t`, `uint32_t`, `int64_t`,
    etc. You should always use those in preference to `short`,
    `unsigned long long` and the like, when you need a guarantee on
    the size of an integer.
1. Macros damage readability. **Do not use** them unless the benefit
    is large enough to compensate for the loss.
1. `auto` is permitted when it promotes readability.
1. `switch` cases may have scopes:

    ```c++
    switch (var) {
      case 0: {  // 2 space indent
        ...      // 4 space indent
        break;
      }
      case 1: {
        ...
        break;
      }
      default: {
        assert(false);
      }
    }
    ```

--------------------------------------------------------------------------------
/cases/internal_only_namespaces.md:
--------------------------------------------------------------------------------
# Internal-only Namespaces

Internal-only namespaces are useful when you want something in your
header file that you do not want to leak to the users of the header
file.

1. An internal helper function that needs to be unit tested.

    Usually if we do not want to expose an internal helper function, we
    can put it in an
    [unnamed namespace](http://en.cppreference.com/w/cpp/language/namespace#Unnamed_namespaces).
    However, functions in unnamed namespaces cannot be reached from unit
    tests either, which is bad if we do want to test them.

    The internal-only namespace solution is as below:

    ```c++
    namespace my_library_namespace {

    // The internal namespace prevents its content from leaking into
    // my_library_namespace. Technically it is still exposed, but
    // the name "internal" explicitly tells the users of the
    // library that they are not supposed to use it.
    //
    // In this way, the unit tests are still able to reference the function
    // by my_library_namespace::internal::MyHelperFunction.
    namespace internal {
    void MyHelperFunction();
    }  // namespace internal

    }  // namespace my_library_namespace
    ```
2. A namespace shorthand that should only be used internally.

    ```c++
    namespace my_library_namespace {

    // You can use impl::short_name to refer to the namespace
    // ::really::lengthy::name_space within the library, while the name
    // "impl" clearly tells the user to avoid doing so.
    namespace impl {
    namespace short_name = ::really::lengthy::name_space;
    }  // namespace impl

    }  // namespace my_library_namespace
    ```

--------------------------------------------------------------------------------
/cases/pragma_once.md:
--------------------------------------------------------------------------------
# Header Guards and #pragma once

## How **#pragma once** works

The directive `#pragma once` ensures that the same file
(usually a C++ header file) is included only once. This
requires the compiler to be able to determine whether two files are
the **same**, even in the presence of hard links and symbolic
links.

## Disadvantages of Header Guards

Header guards are more error-prone than `#pragma once`.

1. A programmer can never be sure that a guard is not already used
    in another header file. Suppose we have two files with the same
    guard:

    * file 1:

      ```c++
      #ifndef MY_VERY_UNIQUE_HEADER_GUARD_H
      #define MY_VERY_UNIQUE_HEADER_GUARD_H
      ...
      #endif  // MY_VERY_UNIQUE_HEADER_GUARD_H
      ```
    * file 2:

      ```c++
      #ifndef MY_VERY_UNIQUE_HEADER_GUARD_H
      #define MY_VERY_UNIQUE_HEADER_GUARD_H
      ...
      #endif  // MY_VERY_UNIQUE_HEADER_GUARD_H
      ```

    This is called a name clash. When both files are included in the
    same translation unit, only the first one takes effect, and
    the other one is dropped **SILENTLY**. This is bad.

2. Although you can follow some rules to dramatically lower the risk
    of name clashes, they can still happen. For example:

    * Usually the header guard follows the relative path of the
      header file within the project. When the header file is
      moved, the programmer has to remember to modify the guard
      accordingly, which does not always happen.
    * Sometimes the header (and therefore the guard as well) is even
      generated by tools, such as the Protobuf compiler.
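
The silent drop described in item 1 can be reproduced in a single translation unit. The sketch below is only an illustration: it pastes the bodies of two hypothetical headers one after the other, roughly what the preprocessor sees after two `#include`s. The names `FromFirstHeader`, `FromSecondHeader`, `SECOND_HEADER_WAS_SEEN` and `SecondHeaderWasSeen` are made up for this example.

```cpp
// --- simulated contents of "file 1" ---
#ifndef MY_VERY_UNIQUE_HEADER_GUARD_H
#define MY_VERY_UNIQUE_HEADER_GUARD_H
inline int FromFirstHeader() { return 1; }
#endif  // MY_VERY_UNIQUE_HEADER_GUARD_H

// --- simulated contents of "file 2" (same guard, by accident) ---
#ifndef MY_VERY_UNIQUE_HEADER_GUARD_H
#define MY_VERY_UNIQUE_HEADER_GUARD_H
#define SECOND_HEADER_WAS_SEEN 1
inline int FromSecondHeader() { return 2; }
#endif  // MY_VERY_UNIQUE_HEADER_GUARD_H

// Reports whether the body of "file 2" survived preprocessing.
// The guard macro was already defined by "file 1", so it did not.
inline bool SecondHeaderWasSeen() {
#ifdef SECOND_HEADER_WAS_SEEN
  return true;
#else
  return false;
#endif
}
```

Here `SecondHeaderWasSeen()` returns `false`, and any call to `FromSecondHeader()` would fail to compile as an undeclared identifier, with no diagnostic pointing at the duplicated guard. If the two real files each used `#pragma once` instead of the shared guard, both bodies would be kept.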

## The decision

Use `#pragma once` if not constrained by a consistency requirement.


--------------------------------------------------------------------------------
/cases/return_value_optimization.md:
--------------------------------------------------------------------------------
# Return Value Optimization and Copy Elision

## Problem

Suppose we want to write a function that returns a vector containing
the numbers from `0` to `n - 1`. The simplest version is
**return-by-value**:

```c++
std::vector<int> Range(int n) {
  std::vector<int> result;
  result.reserve(n);
  for (int i = 0; i < n; ++i) {
    result.push_back(i);
  }
  return result;
}
```

And we can call it like:

```c++
std::vector<int> x = Range(1000000);
```

A valid question is: does this involve a copy or move of the huge
`std::vector`?

## The Answer

If you are using a modern compiler such
as [gcc](https://gcc.gnu.org/) or [clang](http://clang.llvm.org/),
the answer is **no** and **no**. A mechanism called **copy elision**
is applied here as a **return value optimization**.

The standard describes in detail when **copy elision**
happens, but in general it happens when one (or both) of the following
conditions is met (there are other cases, but they are not important
enough to be highlighted here):

1. Initializing a variable from a temporary of the same type.
2. Returning a local variable by value right before it goes out of scope.

When copy elision happens, the variable gets the value of the
temporary without calling either the copy constructor or the move
constructor. In fact it claims the temporary and becomes it.
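
One way to check this claim yourself is to count constructor calls. The following is a minimal sketch, not part of the original example: `Tracked` and `MakeTracked` are hypothetical names, and `Tracked` simply wraps a `std::vector<int>` so that its copy and move constructors can increment counters.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper type: counts how often it is copied or moved.
struct Tracked {
  static int copies;
  static int moves;

  std::vector<int> data;

  Tracked() = default;
  Tracked(const Tracked& other) : data(other.data) { ++copies; }
  Tracked(Tracked&& other) noexcept : data(std::move(other.data)) { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

// Same shape as Range() above: build a local object, return it by value.
Tracked MakeTracked(int n) {
  Tracked result;
  result.data.reserve(n);
  for (int i = 0; i < n; ++i) {
    result.data.push_back(i);
  }
  return result;  // candidate for return value optimization
}
```

After `Tracked t = MakeTracked(1000000);`, `Tracked::copies` is `0`. With gcc and clang in their default configuration `Tracked::moves` is normally `0` as well, because the result is constructed directly in `t`. Even when a compiler skips that optimization, a returned local falls back to the move constructor, so the vector is still not deep-copied.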

## How about the Passing-A-Pointer Pattern

Another pattern that is widely used for this is pass-a-pointer:

```c++
void Range(int n, std::vector<int> *result) {
  result->reserve(n);
  for (int i = 0; i < n; ++i) {
    result->push_back(i);
  }
}
```

This is acceptable, but I would say it is not as good because:

1. Do we assume the pointer `result` is non-null and points to a valid
   vector? A bad assumption can cause a core dump here.
2. It is far less readable, and it requires a certain amount of mental
   work to realize that we are actually **returning** a vector.

## Conclusion

Stick to the copy elision approach and let the compiler do the heavy
lifting. Do not try to outsmart the compiler by making your code less
readable.


--------------------------------------------------------------------------------