├── license.md
└── readme.md
/license.md:
--------------------------------------------------------------------------------
1 | # Creative Commons Legal Code
2 |
3 | ## Attribution-ShareAlike 3.0 Unported
4 |
5 | CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS LICENSE DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE INFORMATION PROVIDED, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM ITS USE.
6 |
7 | ### License
8 |
9 | THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS CREATIVE COMMONS PUBLIC LICENSE ("CCPL" OR "LICENSE"). THE WORK IS PROTECTED BY COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED.
10 |
11 | BY EXERCISING ANY RIGHTS TO THE WORK PROVIDED HERE, YOU ACCEPT AND AGREE TO BE BOUND BY THE TERMS OF THIS LICENSE. TO THE EXTENT THIS LICENSE MAY BE CONSIDERED TO BE A CONTRACT, THE LICENSOR GRANTS YOU THE RIGHTS CONTAINED HERE IN CONSIDERATION OF YOUR ACCEPTANCE OF SUCH TERMS AND CONDITIONS.
12 |
13 | #### 1. Definitions
14 |
15 | - **a.** **"Adaptation"** means a work based upon the Work, or upon the Work and other pre-existing works, such as a translation, adaptation, derivative work, arrangement of music or other alterations of a literary or artistic work, or phonogram or performance and includes cinematographic adaptations or any other form in which the Work may be recast, transformed, or adapted including in any form recognizably derived from the original, except that a work that constitutes a Collection will not be considered an Adaptation for the purpose of this License. For the avoidance of doubt, where the Work is a musical work, performance or phonogram, the synchronization of the Work in timed-relation with a moving image ("synching") will be considered an Adaptation for the purpose of this License.
16 |
17 | - **b.** **"Collection"** means a collection of literary or artistic works, such as encyclopedias and anthologies, or performances, phonograms or broadcasts, or other works or subject matter other than works listed in Section 1(f) below, which, by reason of the selection and arrangement of their contents, constitute intellectual creations, in which the Work is included in its entirety in unmodified form along with one or more other contributions, each constituting separate and independent works in themselves, which together are assembled into a collective whole. A work that constitutes a Collection will not be considered an Adaptation (as defined below) for the purposes of this License.
18 |
19 | - **c.** **"Creative Commons Compatible License"** means a license that is listed at http://creativecommons.org/compatiblelicenses that has been approved by Creative Commons as being essentially equivalent to this License, including, at a minimum, because that license: (i) contains terms that have the same purpose, meaning and effect as the License Elements of this License; and, (ii) explicitly permits the relicensing of adaptations of works made available under that license under this License or a Creative Commons jurisdiction license with the same License Elements as this License.
20 |
21 | - **d.** **"Distribute"** means to make available to the public the original and copies of the Work or Adaptation, as appropriate, through sale or other transfer of ownership.
22 |
23 | - **e.** **"License Elements"** means the following high-level license attributes as selected by Licensor and indicated in the title of this License: Attribution, ShareAlike.
24 |
25 | - **f.** **"Licensor"** means the individual, individuals, entity or entities that offer(s) the Work under the terms of this License.
26 |
27 | - **g.** **"Original Author"** means, in the case of a literary or artistic work, the individual, individuals, entity or entities who created the Work or if no individual or entity can be identified, the publisher; and in addition (i) in the case of a performance the actors, singers, musicians, dancers, and other persons who act, sing, deliver, declaim, play in, interpret or otherwise perform literary or artistic works or expressions of folklore; (ii) in the case of a phonogram the producer being the person or legal entity who first fixes the sounds of a performance or other sounds; and, (iii) in the case of broadcasts, the organization that transmits the broadcast.
28 |
29 | - **h.** **"Work"** means the literary and/or artistic work offered under the terms of this License including without limitation any production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression including digital form, such as a book, pamphlet and other writing; a lecture, address, sermon or other work of the same nature; a dramatic or dramatico-musical work; a choreographic work or entertainment in dumb show; a musical composition with or without words; a cinematographic work to which are assimilated works expressed by a process analogous to cinematography; a work of drawing, painting, architecture, sculpture, engraving or lithography; a photographic work to which are assimilated works expressed by a process analogous to photography; a work of applied art; an illustration, map, plan, sketch or three-dimensional work relative to geography, topography, architecture or science; a performance; a broadcast; a phonogram; a compilation of data to the extent it is protected as a copyrightable work; or a work performed by a variety or circus performer to the extent it is not otherwise considered a literary or artistic work.
30 |
31 | - **i.** **"You"** means an individual or entity exercising rights under this License who has not previously violated the terms of this License with respect to the Work, or who has received express permission from the Licensor to exercise rights under this License despite a previous violation.
32 |
33 | - **j.** **"Publicly Perform"** means to perform public recitations of the Work and to communicate to the public those public recitations, by any means or process, including by wire or wireless means or public digital performances; to make available to the public Works in such a way that members of the public may access these Works from a place and at a place individually chosen by them; to perform the Work to the public by any means or process and the communication to the public of the performances of the Work, including by public digital performance; to broadcast and rebroadcast the Work by any means including signs, sounds or images.
34 |
35 | - **k.** **"Reproduce"** means to make copies of the Work by any means including without limitation by sound or visual recordings and the right of fixation and reproducing fixations of the Work, including storage of a protected performance or phonogram in digital form or other electronic medium.
36 |
37 | #### 2. Fair Dealing Rights
38 |
39 | Nothing in this License is intended to reduce, limit, or restrict any uses free from copyright or rights arising from limitations or exceptions that are provided for in connection with the copyright protection under copyright law or other applicable laws.
40 |
41 | #### 3. License Grant
42 |
43 | Subject to the terms and conditions of this License, Licensor hereby grants You a worldwide, royalty-free, non-exclusive, perpetual (for the duration of the applicable copyright) license to exercise the rights in the Work as stated below:
44 |
45 | - **a.** to Reproduce the Work, to incorporate the Work into one or more Collections, and to Reproduce the Work as incorporated in the Collections;
46 |
47 | - **b.** to create and Reproduce Adaptations provided that any such Adaptation, including any translation in any medium, takes reasonable steps to clearly label, demarcate or otherwise identify that changes were made to the original Work. For example, a translation could be marked "The original work was translated from English to Spanish," or a modification could indicate "The original work has been modified.";
48 |
49 | - **c.** to Distribute and Publicly Perform the Work including as incorporated in Collections; and,
50 |
51 | - **d.** to Distribute and Publicly Perform Adaptations.
52 |
53 | - **e.** For the avoidance of doubt:
54 |
55 | - **i.** **Non-waivable Compulsory License Schemes.** In those jurisdictions in which the right to collect royalties through any statutory or compulsory licensing scheme cannot be waived, the Licensor reserves the exclusive right to collect such royalties for any exercise by You of the rights granted under this License;
56 |
57 | - **ii.** **Waivable Compulsory License Schemes.** In those jurisdictions in which the right to collect royalties through any statutory or compulsory licensing scheme can be waived, the Licensor waives the exclusive right to collect such royalties for any exercise by You of the rights granted under this License; and,
58 |
59 | - **iii.** **Voluntary License Schemes.** The Licensor waives the right to collect royalties, whether individually or, in the event that the Licensor is a member of a collecting society that administers voluntary licensing schemes, via that society, from any exercise by You of the rights granted under this License.
60 |
61 | The above rights may be exercised in all media and formats whether now known or hereafter devised. The above rights include the right to make such modifications as are technically necessary to exercise the rights in other media and formats. Subject to Section 8(f), all rights not expressly granted by Licensor are hereby reserved.
62 |
63 | #### 4. Restrictions
64 |
65 | The license granted in Section 3 above is expressly made subject to and limited by the following restrictions:
66 |
67 | - **a.** You may Distribute or Publicly Perform the Work only under the terms of this License. You must include a copy of, or the Uniform Resource Identifier (URI) for, this License with every copy of the Work You Distribute or Publicly Perform. You may not offer or impose any terms on the Work that restrict the terms of this License or the ability of the recipient of the Work to exercise the rights granted to that recipient under the terms of the License. You may not sublicense the Work. You must keep intact all notices that refer to this License and to the disclaimer of warranties with every copy of the Work You Distribute or Publicly Perform. When You Distribute or Publicly Perform the Work, You may not impose any effective technological measures on the Work that restrict the ability of a recipient of the Work from You to exercise the rights granted to that recipient under the terms of the License. This Section 4(a) applies to the Work as incorporated in a Collection, but this does not require the Collection apart from the Work itself to be made subject to the terms of this License. If You create a Collection, upon notice from any Licensor You must, to the extent practicable, remove from the Collection any credit as required by Section 4(c), as requested. If You create an Adaptation, upon notice from any Licensor You must, to the extent practicable, remove from the Adaptation any credit as required by Section 4(c), as requested.
68 |
69 | - **b.** You may Distribute or Publicly Perform an Adaptation only under the terms of: (i) this License; (ii) a later version of this License with the same License Elements as this License; (iii) a Creative Commons jurisdiction license (either this or a later license version) that contains the same License Elements as this License (e.g., Attribution-ShareAlike 3.0 US)); (iv) a Creative Commons Compatible License. If you license the Adaptation under one of the licenses mentioned in (iv), you must comply with the terms of that license. If you license the Adaptation under the terms of any of the licenses mentioned in (i), (ii) or (iii) (the "Applicable License"), you must comply with the terms of the Applicable License generally and the following provisions: (I) You must include a copy of, or the URI for, the Applicable License with every copy of each Adaptation You Distribute or Publicly Perform; (II) You may not offer or impose any terms on the Adaptation that restrict the terms of the Applicable License or the ability of the recipient of the Adaptation to exercise the rights granted to that recipient under the terms of the Applicable License; (III) You must keep intact all notices that refer to the Applicable License and to the disclaimer of warranties with every copy of the Work as included in the Adaptation You Distribute or Publicly Perform; (IV) when You Distribute or Publicly Perform the Adaptation, You may not impose any effective technological measures on the Adaptation that restrict the ability of a recipient of the Adaptation from You to exercise the rights granted to that recipient under the terms of the Applicable License. This Section 4(b) applies to the Adaptation as incorporated in a Collection, but this does not require the Collection apart from the Adaptation itself to be made subject to the terms of the Applicable License.
70 |
71 | - **c.** If You Distribute, or Publicly Perform the Work or any Adaptations or Collections, You must, unless a request has been made pursuant to Section 4(a), keep intact all copyright notices for the Work and provide, reasonable to the medium or means You are utilizing: (i) the name of the Original Author (or pseudonym, if applicable) if supplied, and/or if the Original Author and/or Licensor designate another party or parties (e.g., a sponsor institute, publishing entity, journal) for attribution ("Attribution Parties") in Licensor's copyright notice, terms of service or by other reasonable means, the name of such party or parties; (ii) the title of the Work if supplied; (iii) to the extent reasonably practicable, the URI, if any, that Licensor specifies to be associated with the Work, unless such URI does not refer to the copyright notice or licensing information for the Work; and (iv), consistent with Section 3(b), in the case of an Adaptation, a credit identifying the use of the Work in the Adaptation (e.g., "French translation of the Work by Original Author," or "Screenplay based on original Work by Original Author"). The credit required by this Section 4(c) may be implemented in any reasonable manner; provided, however, that in the case of an Adaptation or Collection, at a minimum such credit will appear, if a credit for all contributing authors of the Adaptation or Collection appears, then as part of these credits and in a manner at least as prominent as the credits for the other contributing authors. For the avoidance of doubt, You may only use the credit required by this Section for the purpose of attribution in the manner set out above and, by exercising Your rights under this License, You may not implicitly or explicitly assert or imply any connection with, sponsorship or endorsement by the Original Author, Licensor and/or Attribution Parties, as appropriate, of You or Your use of the Work, without the separate, express prior written permission of the Original Author, Licensor and/or Attribution Parties.
72 |
73 | - **d.** Except as otherwise agreed in writing by the Licensor or as may be otherwise permitted by applicable law, if You Reproduce, Distribute or Publicly Perform the Work either by itself or as part of any Adaptations or Collections, You must not distort, mutilate, modify or take other derogatory action in relation to the Work which would be prejudicial to the Original Author's honor or reputation. Licensor agrees that in those jurisdictions (e.g. Japan), in which any exercise of the right granted in Section 3(b) of this License (the right to make Adaptations) would be deemed to be a distortion, mutilation, modification or other derogatory action prejudicial to the Original Author's honor and reputation, the Licensor will waive or not assert, as appropriate, this Section, to the fullest extent permitted by the applicable national law, to enable You to reasonably exercise Your right under Section 3(b) of this License (right to make Adaptations) but not otherwise.
74 |
75 | #### 5. Representations, Warranties and Disclaimer
76 |
77 | UNLESS OTHERWISE MUTUALLY AGREED TO BY THE PARTIES IN WRITING, LICENSOR OFFERS THE WORK AS-IS AND MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE WORK, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTIBILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OF ABSENCE OF ERRORS, WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO SUCH EXCLUSION MAY NOT APPLY TO YOU.
78 |
79 | #### 6. Limitation on Liability
80 |
81 | EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE LAW, IN NO EVENT WILL LICENSOR BE LIABLE TO YOU ON ANY LEGAL THEORY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES ARISING OUT OF THIS LICENSE OR THE USE OF THE WORK, EVEN IF LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
82 |
83 | #### 7. Termination
84 |
85 | - **a.** This License and the rights granted hereunder will terminate automatically upon any breach by You of the terms of this License. Individuals or entities who have received Adaptations or Collections from You under this License, however, will not have their licenses terminated provided such individuals or entities remain in full compliance with those licenses. Sections 1, 2, 5, 6, 7, and 8 will survive any termination of this License.
86 |
87 | - **b.** Subject to the above terms and conditions, the license granted here is perpetual (for the duration of the applicable copyright in the Work). Notwithstanding the above, Licensor reserves the right to release the Work under different license terms or to stop distributing the Work at any time; provided, however that any such election will not serve to withdraw this License (or any other license that has been, or is required to be, granted under the terms of this License), and this License will continue in full force and effect unless terminated as stated above.
88 |
89 | #### 8. Miscellaneous
90 |
91 | - **a.** Each time You Distribute or Publicly Perform the Work or a Collection, the Licensor offers to the recipient a license to the Work on the same terms and conditions as the license granted to You under this License.
92 |
93 | - **b.** Each time You Distribute or Publicly Perform an Adaptation, Licensor offers to the recipient a license to the original Work on the same terms and conditions as the license granted to You under this License.
94 |
95 | - **c.** If any provision of this License is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this License, and without further action by the parties to this agreement, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.
96 |
97 | - **d.** No term or provision of this License shall be deemed waived and no breach consented to unless such waiver or consent shall be in writing and signed by the party to be charged with such waiver or consent.
98 |
99 | - **e.** This License constitutes the entire agreement between the parties with respect to the Work licensed here. There are no understandings, agreements or representations with respect to the Work not specified here. Licensor shall not be bound by any additional provisions that may appear in any communication from You. This License may not be modified without the mutual written agreement of the Licensor and You.
100 |
101 | - **f.** The rights granted under, and the subject matter referenced, in this License were drafted utilizing the terminology of the Berne Convention for the Protection of Literary and Artistic Works (as amended on September 28, 1979), the Rome Convention of 1961, the WIPO Copyright Treaty of 1996, the WIPO Performances and Phonograms Treaty of 1996 and the Universal Copyright Convention (as revised on July 24, 1971). These rights and subject matter take effect in the relevant jurisdiction in which the License terms are sought to be enforced according to the corresponding provisions of the implementation of those treaty provisions in the applicable national law. If the standard suite of rights granted under applicable copyright law includes additional rights not granted under this License, such additional rights are deemed to be included in the License; this License is not intended to restrict the license of any rights under applicable law.
102 |
103 | ### Creative Commons Notice
104 |
105 | Creative Commons is not a party to this License, and makes no warranty whatsoever in connection with the Work. Creative Commons will not be liable to You or any party on any legal theory for any damages whatsoever, including without limitation any general, special, incidental or consequential damages arising in connection to this license. Notwithstanding the foregoing two (2) sentences, if Creative Commons has expressly identified itself as the Licensor hereunder, it shall have all rights and obligations of Licensor.
106 |
107 | Except for the limited purpose of indicating to the public that the Work is licensed under the CCPL, Creative Commons does not authorize the use by either party of the trademark "Creative Commons" or any related trademark or logo of Creative Commons without the prior written consent of Creative Commons. Any permitted use will be in compliance with Creative Commons' then-current trademark usage guidelines, as may be published on its website or otherwise made available upon request from time to time. For the avoidance of doubt, this trademark restriction does not form part of the License.
108 |
109 | Creative Commons may be contacted at .
110 |
--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
1 | # C Style
2 |
3 | These are my favorite C programming practices. Some rules are as trivial as style, while others are more intricate. I follow a few rules religiously, and others I use as a guideline. I prioritize correctness, readability, simplicity and maintainability over speed because [premature optimization is the root of all evil](http://c2.com/cgi/wiki?PrematureOptimization).
4 |
5 | **Write correct, readable, simple and maintainable software, and tune it when you're done**, with benchmarks to identify the choke points. Also, modern compilers *will* change computational complexities. Simplicity can often lead you to the best solution anyway: it's easier to write a linked list than it is to get an array to grow, but it's harder to index a list than it is to index an array.
6 |
7 | Backwards compatibility (e.g. ANSI C) is rarely important to me. In my opinion, backwards compatibility holds everyone back. I think we should use new technologies and new techniques if we can, to move everyone forward, if only a bit.
8 |
9 | If you don't agree with something here, that's perfectly fine. Pick and choose what you like, and what works for your own situation. These rules aren't intended to be universal admonitions about quality: they're just my preferences, and work well for what I do, and what I care about.
10 |
11 | Writing this guide has made me deeply consider, and reconsider, best C programming practices. I've changed my opinion multiple times on many of the rules in this document.
12 |
13 | So, I'm certain I'm wrong on even more points. This is a constant work-in-progress; issues and pull-requests are very welcome. This guide is licensed under the [Creative Commons Attribution-ShareAlike](/license.md), so I'm not liable for anything you do with this, etc.
14 |
15 | ---
16 |
17 |
18 |
19 | #### Always develop and compile with all warnings (and more) on
20 |
21 | No excuses here. Always develop and compile with warnings on. It turns out, though, that `-Wall` and `-Wextra` actually don't enable "all" warnings. There are a few others that can be really helpful:
22 |
23 | ``` make
24 | CFLAGS += -Wall -Wextra -Wpedantic \
25 | -Wformat=2 -Wno-unused-parameter -Wshadow \
26 | -Wwrite-strings -Wstrict-prototypes -Wold-style-definition \
27 | -Wredundant-decls -Wnested-externs -Wmissing-include-dirs
28 |
29 | # GCC warnings that Clang doesn't provide:
30 | ifeq ($(CC),gcc)
31 | CFLAGS += -Wjump-misses-init -Wlogical-op
32 | endif
33 | ```
34 |
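35 | Compiling with optimizations on can also help to detect errors, because some warnings (such as GCC's `-Wmaybe-uninitialized`) rely on analysis that the compiler only performs when optimizing: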
36 |
37 | ``` make
38 | CFLAGS += -O2
39 | ```
40 |
41 |
42 |
43 | #### Use GCC's and Clang's `-M` to automatically generate object file dependencies
44 |
45 | The GNU Make Manual [touches](https://www.gnu.org/software/make/manual/make.html#Automatic-Prerequisites) on how to automatically generate the dependencies of your object files from the source file's `#include`s. The example rule given in the manual is a bit complicated. Here are the rules I use:
46 |
47 | ``` make
48 | depfiles = $(objects:.o=.dep.mk)
49 |
50 | # Have the compiler output dependency files with make targets for each
51 | # of the object files. The `MT` option specifies the dependency file
52 | # itself as a target, so that it's regenerated when it should be.
53 | %.dep.mk: %.c
54 | $(CC) -M -MP -MT '$(<:.c=.o) $@' $(CPPFLAGS) $< > $@
55 |
56 | # Include each of those dependency files; Make will run the rule above
57 | # to generate each dependency file (if it needs to).
58 | -include $(depfiles)
59 | ```
60 |
61 |
62 |
63 | #### Write to the most modern standard you can
64 |
65 | C11 is better than C99, which is (far) better than C89. C11 support is still coming along in GCC and Clang, but many features are there. If you need to support other compilers in the medium-term, write to C99.
66 |
67 | Always write to a *standard*, as in `-std=c11`. Don't write to a dialect, like `gnu11`. Try to make do without non-standard language extensions: you'll thank yourself later.
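As a rough sketch of what writing to C11 gives you, the snippet below uses two C11 features, `static_assert` and `_Generic`, and should build with `-std=c11 -Wpedantic`. The `PacketHeader` type and the `type_name` macro are hypothetical names invented for this illustration, and the 8-byte size assumes a typical ABI that adds no unusual padding.

``` c
#include <assert.h> // static_assert
#include <stdint.h> // uint8_t, uint16_t, uint32_t

// Hypothetical packet header, used only for illustration.
typedef struct {
    uint8_t kind;
    uint8_t flags;
    uint16_t length;
    uint32_t checksum;
} PacketHeader;

// C11: checked at compile time, with no runtime cost. This assumes
// the ABI lays the fields out without extra padding.
static_assert( sizeof( PacketHeader ) == 8,
               "PacketHeader must be 8 bytes" );

// C11: select an expression at compile time from the type of `x`.
#define type_name( x ) _Generic( ( x ), \
    int: "int",                         \
    double: "double",                   \
    default: "other" )
```

Both features are part of the C11 standard itself, so none of this depends on the `gnu11` dialect.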
68 |
69 |
70 |
71 | #### We can't get tabs right, so use spaces everywhere
72 |
73 | The idea of tabs was that we'd use tabs for indentation levels, and spaces for alignment. This lets people choose an indentation width to their liking, without breaking alignment of columns.
74 |
75 | ``` c
76 | int main( void ) {
77 | |tab |if ( pigs_can_fly() == true ) {
78 | |tab ||tab |developers_can_use_tabs( "and align columns "
79 | |tab ||tab | "with spaces!" );
80 | |tab |}
81 | }
82 | ```
83 |
84 | But, alas, we (and our editors) rarely get it right. There are four main problems posed by using tabs and spaces:
85 |
86 | - Tabs for indentation lead to inconsistencies between opinions on line lengths. Someone who uses a tab width of 8 will hit 80 characters much sooner than someone who uses a tab width of 2. The only way to avoid this is to require a tab-width, which eliminates the benefit of tabs.
87 | - It's much harder to configure your editor to correctly handle tabs and spaces for each project, than it is to just handle spaces. See also: [Tabs vs Spaces: An Eternal Holy War](http://www.jwz.org/doc/tabs-vs-spaces.html)
88 | - It's harder to align things using only the space bar. It's much easier to hit tab twice than to hold the space bar for eight characters. A developer on your project *will* make this mistake eventually. If you use spaces for indentation and alignment, you can hit the tab key in either situation, which is quick, easy and not prone to errors.
89 | - It's easier to prevent tab/space errors on projects that use only spaces, because all they need to do is detect any tabs at all. To prevent tabs being used for alignment on a project that uses tabs, you'll need to come up with a regular expression.
90 |
91 | Cut the complexity, and use spaces everywhere. You may have to adjust to someone else's indent width every now and then. Tough luck!
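To illustrate the last point in the list above: on a spaces-only project, the whole check can be a search for one character. This is a minimal sketch, and `contains_tab` is a hypothetical helper name, not part of any existing tool.

``` c
#include <stdbool.h> // bool
#include <stddef.h>  // NULL
#include <string.h>  // strchr

// Returns true if `text` contains a tab character anywhere.
// A NULL `text` is treated as containing no tabs.
bool contains_tab( char const * const text ) {
    return text != NULL && strchr( text, '\t' ) != NULL;
}
```

A project that mixes tabs and spaces can't get away with this: it would have to tell indentation tabs apart from alignment tabs, which needs a pattern rather than a single-character search.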
92 |
93 |
94 | #### Never have more than 79 characters per line
95 |
96 | Never write lines longer than 79 characters.
97 |
98 | 80-characters-per-line is a de-facto standard for viewing code. Readers of your code who rely on that standard, and have their terminal or editor sized to 80 characters wide, can fit more on the screen by placing windows side-by-side.
99 |
100 | You should stick to a maximum of 79 characters so that there's always a space in the last column. This makes it more obvious the line doesn't continue onto the next line. It also provides a right-hand margin.
101 |
102 | If you go over 80 characters, you're making your code significantly harder to read for people who try to rely on the 80-columns standard. Either your line will wrap, which is hard to read, or your readers will have to scroll the window to the right to get the last few characters. Either of these results in code that's harder to read than if you had just worked out a line-break yourself.
103 |
104 | It's harder to read long lines because your eyes have to travel further to get to the start of the next line, and the further they have to go, the more likely you'll have to visually readjust. 100-wide and 120-wide styles are easier to write, but harder to read.
105 |
106 | It can be very tempting to let a line here or there go over 79 characters, but your readers will pay the price every time they have to read such a line. Treat 79 characters as a hard limit - no ifs or buts. Work out how best to break long lines, and your readers will thank you.
107 |
108 | Do what everyone else is doing, and write for 80-column views, and we'll all be better off.
109 |
110 | * [Emacs Wiki: Eighty Column Rule](http://www.emacswiki.org/emacs/EightyColumnRule)
111 | * [Programmers' Stack Exchange: Is the 80 character limit still relevant?](http://programmers.stackexchange.com/questions/604/is-the-80-character-limit-still-relevant-in-times-of-widescreen-monitors)
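A hard limit is easy to check mechanically. The sketch below is one possible way to do it; `first_long_line` and `MAX_LINE_LENGTH` are names made up for this illustration. It counts bytes rather than displayed columns, so it would overcount lines containing multi-byte UTF-8 characters.

``` c
#include <assert.h> // assert
#include <stddef.h> // size_t, NULL

// Hypothetical constant holding the limit from the rule above.
#define MAX_LINE_LENGTH 79

// Returns the 1-based number of the first line in `text` that is
// longer than MAX_LINE_LENGTH bytes, or 0 if every line fits.
size_t first_long_line( char const * const text ) {
    assert( text != NULL );
    size_t line = 1;
    size_t length = 0;
    for ( size_t i = 0; text[ i ] != '\0'; i += 1 ) {
        if ( text[ i ] == '\n' ) {
            line += 1;
            length = 0;
        } else {
            length += 1;
            if ( length > MAX_LINE_LENGTH ) {
                return line;
            }
        }
    }
    return 0;
}
```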
112 |
113 |
114 | #### Use `//` comments everywhere, never `/* ... */`
115 |
116 | Stick to single-line comments, and cut the complexity. Compared to single-line comments, multi-line comments:
117 |
118 | - are rarely used with a blank margin, so they're just as character-heavy
119 | - have a style, which has to be specified and adhered to
120 | - often have `*/` on its own line, so they're more line-expensive
121 | - have weird rules about embedded `/*` and `*/`
122 | - are harder/impossible to block-edit, and to extend
123 | - are more visually-cluttering than `//`
124 |
125 | You have to use `/* ... */` for inline comments in multi-line `#define`s, though:
126 |
127 | ``` c
128 | #define MAGIC( x ) \
129 | /* Voodoo magic happens here. */ \
130 | ...
131 | ```
132 |
133 | But I often prefer to just add `//` comments after the macro body describing the tricky bits. I think this makes the macro body easier to read, but still provides the (much-needed) documentation.
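Here is a small sketch of that alternative, with the notes placed after the macro body instead of inside it. `ARRAY_LENGTH` and `primes` are hypothetical names chosen for the example.

``` c
#include <stddef.h> // size_t

#define ARRAY_LENGTH( xs ) \
    ( sizeof ( xs ) / sizeof ( ( xs )[ 0 ] ) )
// - `xs` must be an actual array, not a pointer: for a pointer,
//   `sizeof ( xs )` is the size of the pointer itself.
// - Each use of `xs` is parenthesized so that an expression passed
//   as the argument keeps its meaning.

static int const primes[] = { 2, 3, 5, 7, 11 };

// Returns the number of elements in `primes`.
size_t primes_length( void ) {
    return ARRAY_LENGTH( primes );
}
```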
134 |
135 |
136 |
137 | #### Program in American English
138 |
139 | Developing in the same language, using the same spelling and vocabulary, is important. This is especially true in free-software projects with contributors from around the world. You should use the same language consistently for your project, in code, comments and documentation.
140 |
141 | So, for American English, write `color`, `flavor`, `center`, `meter`, `neighbor`, `defense`, `routing`, `sizable`, `burned`, and so on ([see more](https://en.wikipedia.org/wiki/American_and_British_English_spelling_differences)). I'm Australian, but I appreciate that most programmers will be learning and using American English. Also, American English spelling is more phonetic and more consistent than British English. I think British English tends to evolve towards American English for this reason.
142 |
143 |
144 |
145 | #### Comment non-standard-library `#include`s to say what symbols you use from them
146 |
147 | Namespaces are one of the great advances of software development. Unfortunately, C missed out (scopes aren't namespaces). But, because namespaces are so fantastic, we should try to simulate them with comments.
148 |
149 | ``` c
150 | #include // Test, tests_run
151 | #include "trie.h" // Trie, Trie_*
152 | ```
153 |
154 | This provides a few benefits:
155 |
156 | - readers aren't forced to refer to documentation or use `grep` to find out where a symbol is defined (or, if you don't follow the rule below, where it comes from): your code just tells them
157 | - developers have a hope of being able to determine which `#include`s can be removed and which can't
158 | - developers are forced to consider namespace pollution (which is otherwise ignored in most C code), and are encouraged to only provide small, well-defined headers
159 |
160 | The downside is that the `#include` comments aren't checked or enforced. I've been intending to write a checker for this for quite some time, but for now, there's nothing to stop the comments from becoming wrong - either mentioning symbols that aren't used anymore, or not mentioning symbols that are used. In your project, try to nip these problems in the bud, to stop them from spreading. You should always be able to trust your code. This maintenance is annoying, for sure, but I think `#include` comments are worth it in aggregate.
161 |
162 | Finding where things come from is always one of my main challenges when learning a codebase. It could be a whole lot easier. I've never seen any projects that write `#include` comments like this, but I'd love to see it become a thing.
163 |
164 |
165 |
166 | #### `#include` the definition of everything you use
167 |
168 | Don't depend on what your headers include. If your code uses a symbol, include the header file where that symbol is defined. Then, if your headers change their inclusions, your code won't break.
169 |
170 | Also, combined with the `#include` comment rule above, this saves your readers and fellow developers from having to follow a trail of includes just to find the definition of a symbol you're using. Your code should just tell them where it comes from.
171 |
172 |
173 |
174 | #### Avoid unified headers
175 |
176 | Unified headers are generally bad, because they relieve the library developer of the responsibility to provide loosely-coupled modules clearly separated by their purpose and abstraction. Even if the developer (thinks she) does this anyway, a unified header increases compilation time, and couples the user's program to the entire library, regardless of whether they need it. There are numerous other disadvantages, touched on in the points above.
177 |
178 | There was a good exposé on unified headers on the [Programmers' Stack Exchange](http://programmers.stackexchange.com/questions/185773/library-design-provide-a-common-header-file-or-multiple-headers). An answer mentions that it's reasonable for something like GTK+ to only provide a single header file. I agree, but I think that's due to the bad design of GTK+, and it's not intrinsic to a graphical toolkit.
179 |
180 | It's harder for users to write multiple `#include`s just like it's harder for users to write types. Bringing difficulty into it is missing the forest for the trees.
181 |
182 |
183 |
184 | #### Provide include guards for all headers to prevent double inclusion
185 |
186 | [Include guards](https://en.wikipedia.org/wiki/Include_guard) let you include a header file "twice" without it breaking compilation.
187 |
188 | ``` c
189 | // Good
190 | #ifndef INCLUDED_ALPHABET_H
191 | #define INCLUDED_ALPHABET_H
192 |
193 | ...
194 |
195 | #endif // ifndef INCLUDED_ALPHABET_H
196 | ```
197 |
198 | [Rob Pike argues against include guards](http://www.lysator.liu.se/c/pikestyle.html), saying you should just never include files in include files. He says that include guards still "result in thousands of needless lines of code passing through the lexical analyzer".
199 |
200 | In fact, [GCC will detect include guards](http://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html), and won't read such files a second time. I don't know if other compilers perform this optimization.
201 |
202 | I don't think it's a good idea to require your users include the dependencies of your header files. Your header file's dependencies shouldn't really be considered "public". It would enforce the rule "don't depend on what your header files include", but it falls apart as soon as header files are using things you don't need, like `FILE` or `bool`. Users shouldn't have to care about that if they don't need it themselves.
203 |
204 | So, always write include guards, and make your users' lives easy.
205 |
206 |
207 |
208 | #### Always comment `#endif`s of large conditional sections
209 |
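A minimal sketch of the idea, assuming a hypothetical `TRACE_ENABLED` build flag and a hypothetical `trace` function (neither comes from a real library). The trailing comment follows the same `#endif // ifndef ...` style as the include guard above:

``` c
#include <stdio.h>

// Hypothetical feature flag; a real project would set this from the
// build system.
#define TRACE_ENABLED 1

// Writes `message` to stderr when tracing is enabled.
// Returns 1 if the message was written, and 0 otherwise.
int trace( char const * const message )
{
#if TRACE_ENABLED == 1
    // Imagine this section growing to dozens of lines. The comment on
    // the `#endif` saves the reader from scrolling back up to find out
    // which condition is being closed.
    int const r = fputs( message, stderr );
    if ( r == EOF ) {
        return 0;
    }
    return 1;
#else
    ( void ) message;
    return 0;
#endif // if TRACE_ENABLED == 1
}
```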
210 |
211 |
212 | #### No global or static variables if you can help it (you probably can)
213 |
214 | Global variables are just hidden arguments to all the functions that use them. They make it really hard to understand what a function does, and how it is controlled.
215 |
216 | Mutable global variables are especially evil and should be avoided at all costs. Conceptually, a global variable assignment is a bunch of `longjmp`s to set hidden, static variables. Yuck.
217 |
218 | You should always try to design your functions to be completely controllable by their arguments. Even if you have a variable that will have to be passed around to lots of functions - if it affects their computation, it should be an argument or a member of an argument. This *always* leads to better code and better design.
219 |
220 | For example, removing global variables and constants from my [Trie.c](https://github.com/mcinglis/trie.c) project resulted in the `Alphabet` struct, which lets users tune the storage structure to their needs. It also opened up some really cool dynamic abilities, like swapping alphabets on the fly for the same trie.
221 |
222 | Static variables in functions are just global variables scoped to that function; the arguments above apply equally to them. Just like global variables, static variables are often used as an easy way out of providing modular, pure functions. They're often defended in the name of performance (benchmarks first!). You don't need static variables, just like you don't need global variables. If you need persistent state, have the function accept that state as an argument. If you need to return something persistent, allocate memory for it.
223 |
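As a sketch of "have the function accept that state as an argument", here is a function that hides its state in a `static` local, next to a version that takes the state explicitly. The `Counter` type and both function names are hypothetical, invented for this example:

``` c
// Bad: the count lives in hidden state that callers can't see, reset,
// or duplicate, so the function can't be tested in isolation.
int next_id_static( void )
{
    static int id = 0;
    id += 1;
    return id;
}


typedef struct Counter {
    int last;
} Counter;


// Good: the state is an explicit argument, so every caller can see
// and control exactly what the function depends on.
int counter_next( Counter * const counter )
{
    counter->last += 1;
    return counter->last;
}
```

With the second version, two independent `Counter` values can coexist, and a test can start from any value it likes. The `static` version offers neither.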
224 |
225 |
226 | #### Minimize what you expose; declare top-level names `static` where you can
227 |
228 | Your header files should *only* include things that users need to use your library. Internal functions or structs or macros should not be provided here; declare them in their respective source files. If it's needed among multiple source files, provide an internal header file.
229 |
230 | If a function or global variable isn't exported in the header, declare it as `static` in the source file to give it internal linkage. This eliminates the chance of name-clashes among object files, enables a few optimizations, and can improve the linking speed.
231 |
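A minimal sketch of how a source file might look under this rule. Both names are hypothetical: `celsius_from_fahrenheit` stands in for the one function the header would declare, and `scale` is an internal helper, so it's declared `static`.

``` c
// Internal helper: not declared in the header, so it gets internal
// linkage and can't clash with a `scale` in another object file.
static double scale( double const x, double const factor )
{
    return x * factor;
}


// Public: the only name that the (hypothetical) header would declare.
double celsius_from_fahrenheit( double const fahrenheit )
{
    return scale( fahrenheit - 32.0, 5.0 / 9.0 );
}
```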
232 |
233 |
234 | #### Immutability saves lives: use `const` everywhere you can
235 |
236 | `const` improves compile-time correctness. It isn't only for documenting read-only pointers. It should be used for every read-only variable and pointee.
237 |
238 | `const` helps the reader *immensely* in understanding a piece of functionality. If they can look at an initialization and be sure that that value won't change throughout the scope, they can reason about the rest of the scope much more easily. Without `const`, everything is up in the air; the reader is forced to comprehend the entire scope to understand what is and isn't being modified. If you consistently use `const`, your reader will begin to trust you, and will be able to assume that a variable that isn't qualified with `const` is a signal that it will be changed at some point in the scope.
239 |
240 | Using `const` everywhere you can also helps you, as a developer, reason about what's happening in the control flow of your program, and where mutability is spreading. It's amazing, when using `const`, how much more helpful the compiler is, especially regarding pointers and pointees. You always want the compiler on your side.
241 |
242 | The compiler will warn if a pointee loses `const`ness in a function call (because that would let the pointee be modified), but it won't complain if a pointee gains `const`ness. Thus, if you *don't* specify your pointer arguments as `const` when they're read-only anyway, you discourage your users from using `const` in their own code:
243 |
244 | ``` c
245 | // Bad: sum should define its array as const.
246 | int sum( int * xs, int n );
247 |
248 | // Because otherwise, this will be a compilation warning:
249 | int const xs[] = { 1, 2, 3 };
250 | return sum( xs, 3 );
251 | // => warning: passing argument 1 of ‘sum’ discards ‘const’
252 | // qualifier from pointer target type
253 | ```
254 |
255 | Thus, using `const` isn't really a choice, at least for function signatures. Lots of people consider it beneficial, so everyone should consider it required, whether they like it or not. If you don't use `const`, you force your users to either cast all calls to your functions (yuck), ignore `const` warnings (asking for trouble), or remove those `const` qualifiers (lose compile-time correctness).
256 |
257 | If you're forced to work with a library that ignores `const`, you can write a macro that casts for you:
258 |
259 | ``` c
260 | // `sum` will not modify the given array; casts for `const` pointers.
261 | #define sum( xs, n ) sum( ( int * )( xs ), ( n ) )
262 | ```
263 |
264 | Only provide `const` qualifiers for pointees in function prototypes - `const` for the argument names themselves is just an implementation detail.
265 |
266 | ``` c
267 | // Unnecessary
268 | bool Trie_has( Trie const, char const * const );
269 | // Good
270 | bool Trie_has( Trie, char const * );
271 | ```
272 |
273 | Unfortunately, C can't handle conversions from non-const pointee-pointees to const pointee-pointees. Thus, I recommend against making pointee-pointees `const`.
274 |
275 | ``` c
276 | char ** const xss = malloc( 3 * ( sizeof( char * ) ) );
277 | char const * const * const yss = xss;
278 | // Warning: initialization from incompatible pointer type
279 |
280 | char * const * const zss = xss;
281 | // No warning: only the pointees are const, not the pointee-pointees.
282 | ```
283 |
284 | If you can `const` the pointees of your *internal* structs, do. Non-constant pointees can cause mutability to needlessly spread, which makes it harder to glean information from the remaining `const` qualifiers. Because you have total control over your internal structs, if you need to remove the `const` in future, you can.
285 |
286 | You usually shouldn't `const` the pointees of your external structs. Flexibility is important when they're part of the public interface. Consider it carefully. An exception to this that I often make is for fields that are best assigned string literals, such as `error` fields. In this case, a `char const *` type prevents you and your users from modifying the underlying string literals, which is undefined behavior and would likely prompt a segmentation fault.
287 |
288 | While it can be reasonable to `const` the *pointees* of struct fields, it's never beneficial to `const` the struct fields themselves. For example, [it makes it painful to `malloc`](http://stackoverflow.com/questions/9691404/how-to-initialize-const-in-a-struct-in-c-with-malloc) a value of that struct. If it really makes sense to stop the fields from changing beyond their original values, just define [invariants](#document-your-struct-invariants-and-provide-invariant-checkers) that enforce whatever qualities you need. Also, you and your users can just define individual variables of that struct as `const` to get the same effect.
289 |
290 | Only make return-type pointees `const` if you need to, and after careful consideration. I've found that when the compiler is hinting to add a `const` to a return type, it often means that a `const` should be *removed* somewhere; not added. It can harm flexibility, so be careful.
291 |
292 | Finally, never use typecasts or pointers to get around `const` qualifiers - at least, for things you control. If the variable isn't constant, don't make it one.
293 |
294 |
295 |
296 | #### Always put `const` on the right and read types right-to-left
297 |
298 | ``` c
299 | const char * word; // Bad: not as const-y as it can be
300 | const char * const word; // Bad: makes types very weird to read
301 | char const* const word; // Bad: weird * placement
302 |
303 | // Good: right-to-left, word is a constant pointer to a constant char
304 | char const * const word;
305 | ```
306 |
307 | Because of this rule, you should always pad the `*` in pointer types with spaces.
308 |
309 |
310 |
311 | #### Don't write argument names in function prototypes if they just repeat the type
312 |
313 | But, always declare the name of any pointer argument to communicate if it's a pointer-to-array (plural name) or a pointer-to-value (singular name).
314 |
315 | ``` c
316 | bool trie_eq( Trie trie1, Trie trie2 ); // Bad
317 | bool trie_eq( Trie, Trie ); // Good
318 |
319 | // Bad - are these pointers for modification, nullity, or arrays?
320 | void trie_add( Trie const *, char const * );
321 |
322 | // Good
323 | void trie_add( Trie const * trie, char const * string );
324 | ```
325 |
326 |
327 |
328 | #### Use `double` rather than `float`, unless you have a specific reason otherwise
329 |
330 | From *21st Century C*, by Ben Klemens:
331 |
332 | ``` c
333 | printf( "%f\n", ( float )333334126.98 ); // 333334112.000000
334 | printf( "%f\n", ( float )333334125.31 ); // 333334112.000000
335 | ```
336 |
337 | For the vast majority of applications nowadays, space isn't an issue, but floating-point errors can still pose a threat. It's much harder for numeric drift to cause problems for `double`s than it is for `float`s. Unless you have a very specific reason to use `float`s, use `double`s instead. Don't use `float`s "because they will be faster", because without benchmarks, you can't know if it actually makes any discernible difference. Finish development, then perform benchmarks to identify the choke-points, then use `float`s in those areas, and see if it actually helps. Before then, prioritize everything else over any supposed performance improvements. Don't prematurely optimize.
338 |
339 |
340 |
341 | #### Declare variables as late as possible
342 |
343 | Declaring variables where they're used reminds the reader of the type they're working with. It also suggests where to extract a function to minimize variable scope. Furthermore, it informs the reader as to where each variable is relevant. Declaring variables when they're needed almost always leads to initialization (`int x = 1;`), rather than just declaration (`int x;`). Initializing a variable often means you can `const` it, too.
344 |
345 | To me, all declarations (i.e. non-initializations) are shifty.
346 |
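To sketch the difference, here is the same small function written both ways. The `mean_early` and `mean_late` names are hypothetical, and both versions assume `n > 0`:

``` c
// Bad: everything is declared up front and assigned later, so the
// reader has to track the state of each variable from the top.
double mean_early( int const * const xs, int const n )
{
    int i;
    int total;
    double result;
    total = 0;
    for ( i = 0; i < n; i += 1 ) {
        total += xs[ i ];
    }
    result = ( double ) total / n;
    return result;
}


// Good: each variable is declared at its first use and initialized
// immediately; `i` is scoped to the loop, and `result` is gone.
double mean_late( int const * const xs, int const n )
{
    int total = 0;
    for ( int i = 0; i < n; i += 1 ) {
        total += xs[ i ];
    }
    return ( double ) total / n;
}
```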
347 |
348 |
349 | #### Use one line per variable definition; don't bunch same types together
350 |
351 | This makes the types easier to change in future, because atomic lines are easier to edit. If you'll need to change all their types together, you should use your editor's block editing mode.
352 |
353 | I think it's alright to bunch semantically-connected struct members together, though, because struct definitions are much easier to comprehend than active code.
354 |
355 | ``` c
356 | // Fine
357 | typedef struct Color {
358 |     char r, g, b;
359 | } Color;
360 | ```
361 |
362 |
363 |
364 | #### Don't be afraid of short variable names
365 |
366 | If the scope fits on a screen, and the variable is used in a lot of places, and there would be an obvious letter or two to represent it, try it out and see if it helps readability. It probably will!
367 |
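For instance, in a small function like this hypothetical `count_char`, the whole scope fits in a few lines, so `s`, `c`, `i` and `n` arguably read better than `string`, `character`, `index` and `count` would:

``` c
// Counts the occurrences of `c` in the null-terminated string `s`.
int count_char( char const * const s, char const c )
{
    int n = 0;
    for ( int i = 0; s[ i ] != '\0'; i += 1 ) {
        if ( s[ i ] == c ) {
            n += 1;
        }
    }
    return n;
}
```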
368 |
369 |
370 | #### Be consistent in your variable names across functions
371 |
372 | Consistency helps your readers understand what's happening. Using different names for the same values in functions is suspicious, and forces your readers to reason about unimportant things.
373 |
374 |
375 |
376 | #### Use `bool` from `stdbool.h` whenever you have a boolean value
377 |
378 | ``` c
379 | int print_steps = 0; // Bad - is this counting steps?
380 | bool print_steps = false; // Good - intent is clear
381 | ```
382 |
383 |
384 |
385 | #### Explicitly compare values; don't rely on truthiness
386 |
387 | Explicit comparisons tell the reader what they're working with, because it's not always obvious in C, and it *is* always important. Are we working with counts or characters or booleans or pointers? The first thing I do when I see a variable being tested for truthiness in C is to hunt down the declaration to find its type. I really wish the programmer had just told me in the comparison.
388 |
389 | ``` c
390 | // Bad - what are these expressions actually testing for (if at all?)
391 | if ( on_fire );
392 | return !character;
393 | something( first( xs ) );
394 | while ( !at_work );
395 |
396 | // Good - informative, and eliminates ambiguity
397 | if ( on_fire > 0 );
398 | return character == NULL;
399 | something( first( xs ) != '\0' );
400 | while ( at_work == false );
401 | ```
402 |
403 | I'll often skip this rule for boolean functions named as a predicate, like `is_edible` or `has_client`. It's still not *completely* obvious what the conditional is checking for, but I usually consider the visual clutter of a `== true` or `== false` to be more of a hassle than a help to readers in this situation. Use your judgement.
404 |
405 |
406 |
407 | #### Never change state within an expression (e.g. with assignments or `++`)
408 |
409 | Readable (imperative) programs flow from top to bottom: not right to left. Unfortunately, changing state inside expressions happens way too much in C programming. I think the habit was started by *The C Programming Language*, and it has stuck with much of the culture ever since. It's a really bad habit, and makes it so much harder to follow what your program is doing. Never change state in an expression.
410 |
411 | ``` c
412 | trie_add( *child, ++word ); // Bad
413 | trie_add( *child, word + 1 ); // Good
414 |
415 | // Good, if you need to modify `word`
416 | word += 1;
417 | trie_add( *child, word );
418 |
419 | // Bad
420 | if ( ( x = calc() ) == 0 );
421 | // Good
422 | x = calc();
423 | if ( x == 0 );
424 |
425 | // Fine; technically an assignment within an expression
426 | a = b = c;
427 |
428 | while ( --atoms > 0 ); // Bad
429 | while ( atoms -= 1,              // Good
430 |         atoms > 0 );
431 |
432 | // Fine; there's no better way, without repetition
433 | int w;
434 | while ( w = calc_width( shape ),
435 |         !valid_width( w ) ) {
436 |     shape = reshape( shape, w );
437 | }
438 | ```
439 |
440 | Don't use multiple assignment unless the variables' values are semantically linked. If there are two variable assignments near each other that coincidentally have the same value, don't throw them into a multiple assignment just to save a line.
441 |
442 | Use the comma operator, as above, judiciously. Do without it if you can:
443 |
444 | ``` c
445 | // Bad
446 | for ( int i = 0, limit = get_limit( m ); i < limit; i += 1 ) {
447 |     ...
448 | }
449 |
450 | // Better
451 | int const limit = get_limit( m );
452 | for ( int i = 0; i < limit; i += 1 ) {
453 |     ...
454 | }
455 | ```
456 |
457 |
458 |
459 | #### Avoid non-pure or non-trivial function calls in expressions
460 |
461 | Assign function calls to a variable to describe what it is, even if the variable is as simple as an `int result`. This avoids surprising your readers with state changes from non-pure functions hidden inside conditional contexts. To me, it's really unnatural to think about the expression inside an `if ( ... )` changing things in the outside world. It's much clearer to assign the result of that state change to a variable, and then check that value.
462 |
463 | Even if you think it's obvious, and it will save you a line - it's not worth the potential for a slip-up. Stick to this rule, and don't think about it.
464 |
465 | If the function name is a predicate, like `is_adult` or `in_tree`, and will read naturally in a conditional context, then I think it's alright to skip assigning its result. It's also probably fine to join these kinds of functions in a boolean expression if you need to, but use your judgement. Complex boolean expressions should often be extracted to a function.
466 |
467 | ``` c
468 | // Good
469 | int r = listen( fd, backlog );
470 | if ( r == -1 ) {
471 |     perror( "listen" );
472 |     return 1;
473 | }
474 |
475 | // Good
476 | if ( is_tasty( banana ) ) {
477 |     eat( banana );
478 | }
479 | ```
480 |
481 |
482 | #### Always use brackets, even for single-statement blocks
483 |
484 | Always use brackets, because it's safer, easier to change, and easier to read because it's more consistent. For the same reasons, don't put a single-line statement on the same line as the condition.
485 |
486 | What follows is actual code from *The C Programming Language*. Don't do this:
487 |
488 | ``` c
489 | while (--argc > 0 && (*++argv)[0] == '-')
490 |     while (c = *++argv[0])
491 |         switch (c) {
492 |         ...
493 |         }
494 | if (argc != 1)
495 |     printf("Usage: find -x -n pattern\n");
496 | else
497 |     while (getline(line, MAXLINE) > 0) {
498 |         ...
499 |     }
500 | ```
501 |
502 |
503 |
504 | #### Avoid unsigned types because the integer conversion rules are complicated
505 |
506 | [CERT attempts to explain the integer conversion rules](https://www.securecoding.cert.org/confluence/display/seccode/INT02-C.+Understand+integer+conversion+rules), saying:
507 |
508 | > Misunderstanding integer conversion rules can lead to errors, which in turn can lead to exploitable vulnerabilities. Severity: medium, Likelihood: probable.
509 |
510 | *Expert C Programming* (a great book that explores the ANSI standard) also explains this in its first chapter. The takeaway is that you shouldn't declare `unsigned` variables just because they shouldn't be negative. If you want a larger maximum value, use a `long` or `long long` (the next size up).
511 |
512 | If your function will fail with a negative number, it will probably also fail with a large number - which is what it will get if passed a negative number. If your function will fail with a negative number, just assert that it's positive. Remember, lots of dynamic languages make do with a single integer type that can be either sign.
513 |
514 | Unsigned values offer no type safety; even with `-Wall` and `-Wextra`, GCC doesn't bat an eyelid at `unsigned int x = -1;`.
515 |
516 | *Expert C Programming* also provides an example for why you should cast all macros that will evaluate to an unsigned value.
517 |
518 | ``` c
519 | #define NELEM( xs ) ( ( sizeof xs ) / ( sizeof xs[0] ) )
520 | int const xs[] = { 1, 2, 3, 4, 5, 6 };
521 |
522 | int main( void )
523 | {
524 |     int const d = -1;
525 |     if ( d < NELEM( xs ) - 1 ) {
526 |         return xs[ d + 1 ];
527 |     }
528 |     return 0;
529 | }
530 | ```
531 |
532 | The `if` branch won't be executed, because `NELEM` will evaluate to a `size_t` (via `sizeof`), which is an unsigned type. So, `d` will be converted to that unsigned type. `-1` in [two's complement](https://en.wikipedia.org/wiki/Two%27s_complement) represents the largest possible unsigned value (bit-wise), so the expression will be false, and the program will return `0`. The solution in this case would be to cast the result of `NELEM`:
533 |
534 | ``` c
535 | #define NELEM( xs ) ( ( long )( ( sizeof xs ) / ( sizeof xs[ 0 ] ) ) )
536 | ```
537 |
538 | You will need to use unsigned values to provide [well-defined bit operations](http://stackoverflow.com/questions/4009885/arithmetic-bit-shift-on-a-signed-integer) and modular arithmetic overflow. But, try to keep those values contained, and don't let them interact with signed values.
539 |
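Returning to the earlier advice to assert rather than declare an `unsigned` parameter, a minimal sketch might look like this. The `sum_first` function is hypothetical, not taken from any library:

``` c
#include <assert.h>

// Returns the sum of the first `n` elements of `xs`.
// `n` is a plain `int`: a negative count is a caller bug, and the
// assertion reports it directly, where an `unsigned` parameter would
// silently turn -1 into a huge count.
int sum_first( int const * const xs, int const n )
{
    assert( n >= 0 );
    int total = 0;
    for ( int i = 0; i < n; i += 1 ) {
        total += xs[ i ];
    }
    return total;
}
```

With an `unsigned` parameter, a call like `sum_first( xs, -1 )` would compile without complaint and then read far past the end of the array.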
540 |
541 |
542 | #### Use `+= 1` and `-= 1` over `++` and `--`
543 |
544 | Actually, don't use either form if you can help it. Changing state should always be avoided (within reason). But, when you have to, `+=` and `-=` are obvious, simpler and less cryptic than `++` and `--`, and useful in other contexts and with other values. Also, there are no tricks about the evaluation of `+=` and `-=` and they don't have weird twin operators to provide alternative evaluations. Python does without `++` and `--` operators, and Douglas Crockford excluded them from the Good Parts of JavaScript, because we don't need them. Sticking to this rule also encourages you to avoid changing state within an expression.
545 |
546 |
547 |
548 | #### Use parentheses for expressions where the [operator precedence](https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Operator_precedence) isn't obvious
549 |
550 | ``` c
551 | int x = a * b + c / d; // Bad
552 | int x = ( a * b ) + ( c / d ); // Good
553 |
554 | &sockaddr->sin_addr; // Bad
555 | &( sockaddr->sin_addr ); // Good
556 | ```
557 |
558 | You can and should make exceptions for commonly-seen combinations of operations. For example, skipping the parentheses when combining the equality and boolean operators is fine, because readers are probably used to that, and are confident of the result.
559 |
560 | ``` c
561 | // Fine
562 | return hungry == true
563 |     || ( legs != NULL && fridge.empty == false );
564 | ```
565 |
566 |
567 |
568 | #### Don't use `switch`, and avoid complicated conditionals
569 |
570 | The `switch` fall-through mechanism is error-prone, and you almost never want the cases to fall through anyway, so the vast majority of `switch`es are longer than the `if` equivalent. Worse, a missing `break` will still compile: this tripped me up all the time when I used `switch`. Also, `case` values have to be an integral constant expression, so they can't match against another variable. This discourages extractions of logic to functions. Furthermore, any statement inside a `switch` can be labelled and jumped to, which fosters highly-obscure bugs if, for example, you mistype `defau1t`.
571 |
572 | If you need to map different constant values to behavior, like:
573 |
574 | ``` c
575 | switch ( x ) {
576 |     case A:
577 |         do_something_for_a( x, y, z );
578 |         break;
579 |     case B:
580 |         do_something_for_b( x, y, z );
581 |         break;
582 |     default:
583 |         error( x, y, z );
584 |         break;
585 | }
586 | // These functions might not be explicit functions (i.e. they might
587 | // just be a series of statements using some of those variables).
588 | ```
589 |
590 | A more explicit, testable and reusable approach is to define a function that uses ternary expressions to return a function pointer of the right type:
591 |
592 | ``` c
593 | action_fn get_x_action( x ) {
594 |     return ( x == A ) ? do_something_for_a
595 |          : ( x == B ) ? do_something_for_b
596 |                       : error;
597 | }
598 |
599 | action_fn action = get_x_action( x );
600 | action( x, y, z );
601 |
602 | // or just:
603 | get_x_action( x )( x, y, z );
604 |
605 | // `action` is a terrible name and is only used as an example. You
606 | // should try to think of a more-informative name for your code.
607 | ```
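To make the snippet above concrete, here is one way it could be fleshed out into something that compiles. Everything in it is an assumption for the sake of the example: the `Kind` enum, the `action_fn` signature, and the three action functions are hypothetical stand-ins for whatever `x`, `y` and `z` are in your code.

``` c
typedef enum Kind {
    KIND_a,
    KIND_b,
    KIND_other
} Kind;


// Every action shares one signature, so they're interchangeable.
typedef int ( * action_fn )( Kind, int, int );


int add_for_a( Kind const x, int const y, int const z )
{
    ( void ) x;
    return y + z;
}


int multiply_for_b( Kind const x, int const y, int const z )
{
    ( void ) x;
    return y * z;
}


int report_error( Kind const x, int const y, int const z )
{
    ( void ) x;
    ( void ) y;
    ( void ) z;
    return -1;
}


// The mapping from value to behavior is now a pure function, which
// can be tested on its own, without running any of the actions.
action_fn get_x_action( Kind const x )
{
    return ( x == KIND_a ) ? add_for_a
         : ( x == KIND_b ) ? multiply_for_b
                           : report_error;
}
```

A caller would then write `get_x_action( x )( x, y, z )`, just as in the snippet above, and a test can check the mapping by comparing the returned pointer against the expected function.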
608 |
609 | You should do a similar thing if you need to map between two sets of uncorrelated constant values, like:
610 |
611 | ``` c
612 | // Bad
613 | switch ( x ) {
614 |     case A: return X;
615 |     case B: return Y;
616 |     case C: return Z;
617 |     default: return ERR;
618 | }
619 |
620 | // Good
621 | return ( x == A ) ? X
622 |      : ( x == B ) ? Y
623 |      : ( x == C ) ? Z
624 |                   : ERR;
625 | ```
626 |
627 | Don't use a `switch` where you can just use a boolean expression:
628 |
629 | ``` c
630 | // Bad
631 | switch ( x ) {
632 |     case A: case B: case C:
633 |         return true;
634 |     default:
635 |         return false;
636 | }
637 |
638 | // Good
639 | return x == A || x == B || x == C;
640 |
641 | // Or, if the names are longer, this usually reads better:
642 | return t == JSON_TYPE_null
643 |     || t == JSON_TYPE_boolean
644 |     || t == JSON_TYPE_number;
645 | ```
646 |
647 | If you need the fall-through behavior of `switch`, like:
648 |
649 | ``` c
650 | switch ( x ) {
651 |     case A:
652 |         // A stuff, fall through to B
653 |     case B:
654 |         // B stuff
655 |         break;
656 |     default:
657 |         // default stuff
658 | }
659 | ```
660 |
661 | The equivalent `if` is much more readable and it's obvious what's going to happen and why. The "B stuff" actually applies when `x == A` too, and this is explicitly declared when you use an `if`.
662 |
663 | ``` c
664 | if ( x == A ) {
665 |     // A stuff
666 | }
667 |
668 | if ( x == A || x == B ) {
669 |     // B stuff
670 | } else {
671 |     // default stuff
672 | }
673 | ```
674 |
675 | You should only need to use `switch` for performance tuning (once you've done benchmarks to identify hotspots!). Otherwise, there's always a safer, shorter, more-testable, and reusable alternative.
676 |
677 |
678 |
679 | #### Separate functions and struct definitions with two lines
680 |
681 | If you limit yourself to a maximum of one blank line within functions, this rule provides clear visual separation of global elements. This is a habit I learned from Python's PEP8 style guide.
682 |
683 |
684 |
685 | #### Minimize the scope of variables
686 |
687 | If a few variables are only used in a contiguous sequence of lines, and only a single value is used after that sequence, then those first lines are a great candidate for extracting to a function.
688 |
689 | ``` c
690 | // Good: addr was only used in the first part of handle_request
691 | int accept_request( int const listenfd )
692 | {
693 |     struct sockaddr addr;
694 |     return accept( listenfd, &addr, &( socklen_t ){ sizeof addr } );
695 | }
696 |
697 | int handle_request( int const listenfd )
698 | {
699 |     int const reqfd = accept_request( listenfd );
700 |     // ... stuff not involving addr, but involving reqfd
701 | }
702 | ```
703 |
704 | If the body of `accept_request` were left in `handle_request`, then the `addr` variable would be in scope for the remainder of the `handle_request` function even though it's only used for getting the `reqfd`. This kind of thing adds to the cognitive load of understanding a function, and should be fixed wherever possible.
705 |
706 | Another tactic to limit the exposure of variables is to break apart complex expressions into blocks, like so:
707 |
708 | ``` c
709 | // Rather than:
710 | bool trie_has( Trie const trie, char const * const string )
711 | {
712 |     Trie const * const child = Trie_child( trie, string[ 0 ] );
713 |     return string[ 0 ] == '\0'
714 |         || ( child != NULL
715 |              && trie_has( *child, string + 1 ) );
716 | }
717 |
718 | // child is only used for the second part of the conditional, so we
719 | // can limit its exposure like so:
720 | bool trie_has( Trie const trie, char const * const string )
721 | {
722 |     if ( string[ 0 ] == '\0' ) {
723 |         return true;
724 |     } else {
725 |         Trie const * const child = Trie_child( trie, string[ 0 ] );
726 |         return child != NULL
727 |             && trie_has( *child, string + 1 );
728 |     }
729 | }
730 | ```
731 |
732 |
733 |
734 | #### Simple constant expressions can be easier to read than variables
735 |
736 | It can often help the readability of your code if you replace variables that are only assigned to constant expressions, with those expressions.
737 |
738 | Consider the `trie_has` example above - the `string[ 0 ]` expression appears twice. It would be harder to read and follow if we inserted an extra line to define a `char` variable. It's just another thing that the readers would have to keep in mind. Many programmers of other languages wouldn't think twice about repeating an array access.
739 |
740 |
741 |
742 | #### Prefer compound literals to superfluous variables
743 |
744 | This is beneficial for the same reason as minimizing the scope of variables.
745 |
746 | ``` c
747 | // Bad, if `sa` is never used again.
748 | struct sigaction sa = {
749 |     .sa_handler = sigchld_handler,
750 |     .sa_flags = SA_RESTART
751 | };
752 | sigaction( SIGCHLD, &sa, NULL );
753 |
754 | // Good
755 | sigaction( SIGCHLD, &( struct sigaction ){
756 |     .sa_handler = sigchld_handler,
757 |     .sa_flags = SA_RESTART
758 | }, NULL );
759 |
760 | // Bad
761 | int v = 1;
762 | setsockopt( fd, SOL_SOCKET, SO_REUSEADDR, &v, sizeof v );
763 |
764 | // Good
765 | setsockopt( fd, SOL_SOCKET, SO_REUSEADDR, &( int ){ 1 }, sizeof( int ) );
766 | ```
767 |
768 |
769 |
770 | #### Never use or provide macros that wrap control structures like `for`
771 |
772 | Macros that loop over the elements of a data structure are extremely confusing, because they're extra-syntactic and readers can't know the control flow without looking up the definition.
773 |
774 | To understand your program, it's crucial that your readers can understand its control flow.
775 |
776 | Don't provide control-macros even as an option. They're universally harmful, so don't enable them. Users can define their own if they really want to.
777 |
778 | ``` c
779 | // Bad
780 | #define TRIE_EACH( TRIE, INDEX ) \
781 |     for ( int INDEX = 0; INDEX < ( TRIE ).alphabet.size; INDEX += 1 )
782 |
783 | // Not at all obvious what's actually going to happen here.
784 | TRIE_EACH( trie, i ) {
785 |     Trie * const child = trie.children[ i ];
786 |     ...
787 | }
788 | ```
789 |
790 |
791 |
792 | #### Only upper-case a macro if it will act differently than a function call
793 |
794 | By "act differently", I mean if things will break when users wouldn't expect them to. If a macro just looks different (e.g. the named arguments technique), then I don't consider that justification for an upper-case name. A macro should have an upper-case name if it:
795 |
796 | - repeats its arguments in its body, because this will break for non-pure expressions. Many compilers provide [statement expressions](http://stackoverflow.com/questions/6440021/compiler-support-of-gnu-statement-expression) to prevent this, but it's non-standard. If you do use statement expressions, then you don't need to upper-case your macro name, because it's not relevant to your users.
797 | - is wrapped in blocks or a control structure, because it can't be used as an expression then.
798 | - modifies the surrounding context, e.g., with a `return` or `goto`.
799 | - takes an array literal as a named argument. ([why](http://stackoverflow.com/questions/5503362/passing-array-literal-as-macro-argument))
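
To make the first point concrete, here's a minimal sketch of a macro that repeats its arguments in its body. The names `MAX`, `next_value` and `max_evaluation_count` are hypothetical, made up for this illustration only:

```c
#include <assert.h>

// Upper-case name: each argument appears twice in the body, so an
// argument with side effects may be evaluated twice.
#define MAX( A, B ) ( ( A ) > ( B ) ? ( A ) : ( B ) )

static int num_calls = 0;

// An impure expression: every call changes `num_calls`.
static int next_value( void )
{
    num_calls += 1;
    return num_calls;
}

// Returns how many times `next_value` ran within a single `MAX`.
int max_evaluation_count( void )
{
    num_calls = 0;
    int const max = MAX( next_value(), 0 );
    // The second evaluation is the one that ends up in `max`.
    assert( max == num_calls );
    return num_calls;
}
```

A caller who reads `MAX( next_value(), 0 )` as a function call would expect one evaluation of the argument; here it is evaluated once for the comparison and once more for the result. That surprise is what the upper-case name is meant to flag.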
800 |
801 |
802 |
803 | #### If a macro is specific to a function, `#define` it in the body
804 |
805 | For the same reasons why we should always minimize the scope of our variables, if it makes sense to limit the scope of a macro, we should.
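
As a rough sketch of what this can look like (`weighted_sum` and `NUM_WEIGHTS` are hypothetical names, not taken from any real library):

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical example: a weighted sum of the first three digits.
int weighted_sum( int const * const digits )
{
    assert( digits != NULL );

    int const weights[] = { 7, 3, 1 };
    #define NUM_WEIGHTS ( sizeof weights / sizeof weights[ 0 ] )

    int sum = 0;
    for ( size_t i = 0; i < NUM_WEIGHTS; i += 1 ) {
        sum += digits[ i ] * weights[ i ];
    }

    // The preprocessor knows nothing about block scope, so the macro
    // has to be removed explicitly to stop it leaking past this point.
    #undef NUM_WEIGHTS
    return sum;
}
```

Note that the `#undef` is what actually limits the macro's reach: defining it inside the body only signals the intent to readers.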
806 |
807 |
808 |
809 | #### Initialize strings as arrays, and use `sizeof` for byte size
810 |
811 | Always initialize your string literals as arrays, because it lets you get the byte size with just `sizeof str`. If you initialize them as pointers, you have to get the byte size with `strlen( str ) + 1` - I know I've been bitten more than once by forgetting the `+ 1` there.
812 |
813 | ``` c
814 | // Good
815 | char const message[] = "always use arrays for strings!";
816 | write( output, message, sizeof message );
817 | ```
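
For comparison, here's a small sketch of the two ways of getting the byte size. The function names are hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Array initialization: `sizeof` gives the byte size, including the
// terminating '\0'.
size_t greeting_size_as_array( void )
{
    char const greeting[] = "hello";
    return sizeof greeting;
}

// Pointer initialization: `sizeof greeting` is only the size of the
// pointer, so the byte size has to be computed with `strlen`.
size_t greeting_size_as_pointer( void )
{
    char const * const greeting = "hello";
    assert( sizeof greeting == sizeof( char * ) );
    return strlen( greeting ) + 1;
}
```

Both functions return the same value, but only the pointer version leaves room to forget the `+ 1`.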
818 |
819 | Also, pointer initializations are less safe than array initializations, *unless* you compile with `-Wwrite-strings` to ensure string literals are initialized with the type `char const *`. Unfortunately, `-Wwrite-strings` isn't included in `-Wall` or `-Wextra`: you have to explicitly enable it. Without `-Wwrite-strings`, you can assign string literals to a `char *`. But modifying a string literal is undefined behavior, and your program will typically seg-fault when you assign to the elements behind that pointer.
820 |
821 | ``` c
822 | // Without -Wwrite-strings, this will compile without warnings, but
823 | // it will typically cause a segmentation fault at the second line.
824 | char * xs = "hello";
825 | xs[ 0 ] = 'c';
826 |
827 | // This program will compile and execute fine.
828 | char xs[] = "hello";
829 | xs[ 0 ] = 'c';
830 | ```
831 |
832 | The benefit of initializing string literals as pointers is that those pointers will point to read-only memory, potentially allowing some optimizations. Initializing string literals as arrays essentially creates a mutable string that is only "artificially" protected against modifications with `const` - but this can be defeated with a cast.
833 |
834 | Again, I advise against prematurely optimizing. Until you've finished development and have done benchmarks, performance should be your lowest priority. I haven't seen any tests on string literal definitions, but I'd be very surprised to see any notable speed improvements by defining string literals as pointers.
835 |
836 | As mentioned in the rule on `const`ing everything: never ever cast away a `const`. Remove the `const` instead. Don't worry about "artificial" protections. I know I'd much prefer my constant values to be protected by explicit, syntactic constructs that will warn when compiling, rather than implicit, obscure rules that will seg-fault when violated.
837 |
838 | Finally, sticking to array initializations saves you and your readers the conceptual overhead of switching between pointer initializations and array initializations, depending on if you need mutability or not.
839 |
840 | Just always initialize string literals as arrays, and keep it simple.
841 |
842 |
843 |
844 | #### Where possible, use `sizeof` on the variable; not the type
845 |
846 | Then, if you change the type of the variable later, you only have to change it once. You'll always get the correct size.
847 |
848 | ``` c
849 | // Good
850 | int * a = malloc( n * ( sizeof *a ) );
851 | ```
852 |
853 | You can't do this with compound literals, though. I think it's a worthwhile trade-off to remove a variable that's only used once.
854 |
855 | ``` c
856 | setsockopt( fd, SOL_SOCKET, SO_REUSEADDR, &( int ){ 1 }, sizeof( int ) );
857 | ```
858 |
859 |
860 | #### Never use array syntax for function argument definitions
861 |
862 | [Arrays become pointers in most expressions](http://c-faq.com/aryptr/aryptrequiv.html), including [when passed as arguments to functions](http://c-faq.com/aryptr/aryptrparam.html). Functions can never receive an array as an argument: [only a pointer to the array](http://c-faq.com/aryptr/aryptr2.html). `sizeof` won't work like an array argument declaration would suggest: it would return the size of the pointer, not the array pointed to.
863 |
864 | [Static array indices in function arguments are nice](http://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html), but only protect against very trivial situations, like when given a literal `NULL`. Also, GCC doesn't warn about their violation [yet](http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50584), only Clang. I don't consider the confusing, non-obvious syntax to be worth the small compilation check.
865 |
866 | Yeah, `[]` hints that the argument will be treated as an array, but so does a plural name like `requests`, so do that instead.
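
The `sizeof` difference can be sketched as follows. The function names here are hypothetical; the point is only that the callee sees a pointer, whatever syntax the parameter is declared with:

```c
#include <assert.h>
#include <stddef.h>

// Declaring the parameter as `int const values[ 10 ]` would give
// exactly the same function: the parameter is a pointer either way.
size_t size_seen_by_callee( int const * const values )
{
    assert( values != NULL );
    return sizeof values;
}

// In the scope where the array is defined, `sizeof` sees all of it.
size_t size_seen_by_caller( void )
{
    int const values[ 10 ] = { 0 };
    return sizeof values;
}
```

If the callee needs the number of elements, it has to be passed as a separate argument.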
867 |
868 |
869 |
870 | #### Always prefer array indexing over pointer arithmetic
871 |
872 | If you're working with an array of things, treat them as an array. Pointer arithmetic is confusing and bug-prone. Sticking to array indexing often lets you keep the important variables constant, and only the index variables non-constant.
873 |
874 | ``` c
875 | // Bad
876 | for ( ; *str != '\0'; str += 1 );
877 |
878 | // Good
879 | for ( int i = 0; str[ i ] != '\0'; i += 1 );
880 | ```
881 |
882 |
883 |
884 | #### Document your struct invariants, and provide invariant checkers
885 |
886 | > An **invariant** is a condition that can be relied upon to be true during execution of a program.
887 |
888 | For any function that takes a struct (or a pointer to a struct), all invariants of that struct should be true before and after the execution of the function. Invariants make it the caller's responsibility to provide valid data, and the function's responsibility to return valid data. Invariants save those functions from having to repeat assertions of those conditions, or worse, not even checking and working with invalid data.
889 |
890 | Provide an "invariants" comment section at the end of your struct definition, and list all the invariants you can think of. Also, implement `is_valid` and `assert_valid` functions for users to check those assertions on values of the structs they create on their own. These functions are crucial to being able to trust that the invariants hold for values of that struct. Without them, how will the users know?
891 |
892 | [Here's an example](https://github.com/mcinglis/trie.c/blob/master/alphabet.h#L10) of a struct invariant.
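
As a smaller, self-contained sketch of the same idea, assuming a hypothetical `Range` struct that isn't part of any real library:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct Range {
    int low;
    int high;
    // Invariants:
    // - low <= high
} Range;

// Lets users check a value they built themselves.
bool range_is_valid( Range const r )
{
    return r.low <= r.high;
}

// Crashes meaningfully if the invariants don't hold.
void range_assert_valid( Range const r )
{
    assert( r.low <= r.high );
}

// Relies on the invariant rather than handling `high < low` itself.
int range_length( Range const r )
{
    range_assert_valid( r );
    return r.high - r.low;
}
```

`range_length` can stay a one-liner because producing a valid `Range` is the caller's responsibility, and `range_is_valid` gives callers a way to meet it.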
893 |
894 | My university faculty is [pretty big](http://www.itee.uq.edu.au/sse/projects) on software correctness. It certainly rubbed off on me.
895 |
896 |
897 |
898 | #### Use `assert` everywhere your program would fail otherwise
899 |
900 | Write assertions to meaningfully crash your program before it does something stupid, like deleting data, or to prevent a security vulnerability, or just to prevent a segmentation fault. Good software fails fast.
901 |
902 | If a function is given a pointer it will dereference, assert that it's not null. If it's given an array index, assert that it's within bounds. Assert for any consistency that you need between arguments.
903 |
904 | That said, never depend on assertions for correctness. Your program should still work correctly when the assertion lines are removed.
905 |
906 | Don't mistake assertions for error-reporting. Assert things that you won't bother to check otherwise. If user input (not code) can invalidate an assertion, that's a bug. You should be filtering it before-hand, and reporting the errors in a readable fashion for your users.
907 |
908 |
909 |
910 | #### Repeat `assert` calls; don't `&&` them together
911 |
912 | Repeating your `assert` calls improves the assertion error reporting. If you chain assertions together with `&&`, you won't know which condition failed.
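
A minimal sketch, using a hypothetical `ints_get` function:

```c
#include <assert.h>
#include <stddef.h>

int ints_get( int const * const xs, int const size, int const index )
{
    // Rather than:
    //     assert( xs != NULL && 0 <= index && index < size );
    assert( xs != NULL );
    assert( 0 <= index );
    assert( index < size );
    return xs[ index ];
}
```

When one of these fails, the message from `assert` names the exact expression and line, so you can tell a null pointer from a bad index without reaching for a debugger.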
913 |
914 |
915 |
916 | #### Don't use variable-length arrays
917 |
918 | Variable-length arrays were introduced in C99 as a way to define dynamic-length arrays with automatic storage; no need for `malloc`. For a few reasons, they've been made optional in C11. Thus, if you want your code to compile on every C11 implementation, you'll have to write the `malloc` version anyway. Instead, just don't use variable-length arrays.
919 |
920 | I'd advise against using variable-length arrays in C99, too. First, you have to [check the values](https://www.securecoding.cert.org/confluence/display/seccode/ARR32-C.+Ensure+size+arguments+for+variable+length+arrays+are+in+a+valid+range) that control their size to protect against stack-smashing. Second, they can't be initialized. Finally, avoiding them will make it easier to upgrade to newer standards later on.
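
Here's a rough sketch of the `malloc` version of a function that might otherwise have used a variable-length array. `squares_new` is a hypothetical name, and the sketch assumes `n` is small enough that the size calculation can't overflow:

```c
#include <assert.h>
#include <stdlib.h>

// With a variable-length array this would be `int squares[ n ];`.
// Returns a heap-allocated array of the first `n` squares, or NULL if
// the allocation fails. The caller is responsible for `free`ing it.
int * squares_new( int const n )
{
    assert( n > 0 );
    int * const squares = malloc( n * ( sizeof *squares ) );
    if ( squares == NULL ) {
        return NULL;
    }
    for ( int i = 0; i < n; i += 1 ) {
        squares[ i ] = i * i;
    }
    return squares;
}
```

Unlike the variable-length array, a failed allocation is reported through the return value rather than by overflowing the stack.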
921 |
922 |
923 |
924 | #### Avoid `void *` because it harms type safety
925 |
926 | `void *` is useful for polymorphism, but polymorphism is almost never as important as type safety. Void pointers are indispensable in many situations, but you should consider other, safer alternatives first.
927 |
928 |
929 |
930 | #### If you have a `void *`, assign it to a typed variable as soon as possible
931 |
932 | Just like working with uninitialized variables is dangerous, working with void pointers is dangerous: you want the compiler on your side. So ditch the `void *` as soon as you can.
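
The comparison callback for the standard `qsort` is a common place where this comes up. In this sketch, `compare_ints` and `ints_sort` are hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>

static int compare_ints( void const * const va, void const * const vb )
{
    // Assign to typed pointers straight away; C needs no cast for
    // this, and everything below is checked by the compiler.
    int const * const a = va;
    int const * const b = vb;
    return ( *a > *b ) - ( *a < *b );
}

// Sorts the `n` elements of `xs` in ascending order.
void ints_sort( int * const xs, size_t const n )
{
    assert( xs != NULL );
    qsort( xs, n, sizeof xs[ 0 ], compare_ints );
}
```

The `void const *` parameters only exist for two lines; the rest of the function works with `int const *`.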
933 |
934 |
935 |
936 | #### Use C11's anonymous structs and unions rather than mutually-exclusive fields
937 |
938 | If only certain fields of your struct should be set when certain other fields have certain values, use C11's anonymous structs and unions:
939 |
940 | ``` c
941 | enum AUTOMATON_TYPE {
942 |     AUTOMATON_TYPE_char,
943 |     AUTOMATON_TYPE_split,
944 |     AUTOMATON_TYPE_match
945 | };
946 | #define NUM_AUTOMATON_TYPES ( 3 )
947 |
948 | typedef struct Automaton {
949 |     enum AUTOMATON_TYPE type;
950 |     union {
951 |         struct { // type = char
952 |             char c;
953 |             struct Automaton * next;
954 |         };
955 |         struct { // type = split
956 |             struct Automaton * left;
957 |             struct Automaton * right;
958 |         };
959 |     };
960 | } Automaton;
961 | ```
962 |
963 | This is much more explicit and obvious than something like:
964 |
965 | ``` c
966 | typedef struct Automaton {
967 |     enum AUTOMATON_TYPE type;
968 |     char c;
969 |     struct Automaton * left;
970 |     struct Automaton * right;
971 | } Automaton;
972 | ```
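
To show how the anonymous-union version might be consumed, here's a sketch that repeats the definition and adds a hypothetical `automaton_size` function. It assumes the automaton is a tree, and that a `NULL` pointer stands for an empty automaton:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum AUTOMATON_TYPE {
    AUTOMATON_TYPE_char,
    AUTOMATON_TYPE_split,
    AUTOMATON_TYPE_match
};

typedef struct Automaton {
    enum AUTOMATON_TYPE type;
    union {
        struct { // type = char
            char c;
            struct Automaton * next;
        };
        struct { // type = split
            struct Automaton * left;
            struct Automaton * right;
        };
    };
} Automaton;

// Counts the nodes reachable from `a`. A node reachable along two
// paths would be counted twice.
int automaton_size( Automaton const * const a )
{
    if ( a == NULL ) {
        return 0;
    }
    switch ( a->type ) {
    case AUTOMATON_TYPE_char:
        return 1 + automaton_size( a->next );
    case AUTOMATON_TYPE_split:
        return 1 + automaton_size( a->left )
                 + automaton_size( a->right );
    case AUTOMATON_TYPE_match:
        return 1;
    }
    assert( false );
    return 0;
}
```

Each `case` only touches the members that belong to its type, which is the grouping the anonymous structs make visible in the definition.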
973 |
974 |
975 |
976 | #### Don't typecast unless you have to (you probably don't)
977 |
978 | If it's valid to assign a value of one type to a variable of another type, then you don't have to cast it. You should only use typecasts when you need to, like:
979 |
980 | - performing true division (not integer division) of `int` expressions
981 | - making an array index an integer, but you can do this with assignment anyway
982 | - using compound literals for structs and arrays
983 |
984 | ``` c
985 | // This compiles fine:
986 | struct Apple * apples = malloc( sizeof *apples );
987 | ```
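
True division is one of the few cases where a cast earns its place. A minimal sketch, with a hypothetical `mean` function:

```c
#include <assert.h>

double mean( int const sum, int const count )
{
    assert( count != 0 );
    // Without the cast, `sum / count` would be integer division, and
    // the fractional part would be lost before the conversion.
    return ( double ) sum / count;
}
```

Only one operand needs the cast: the other is converted to `double` by the usual arithmetic conversions.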
988 |
989 |
990 |
991 | #### Give structs TitleCase names, and typedef them
992 |
993 | ``` c
994 | // Good
995 | typedef struct Person {
996 |     char * name;
997 |     int age;
998 | } Person;
999 | ```
1000 |
1001 | TitleCase names should be used for structs so that they're recognizable without the `struct` prefix. They also let you give struct variables the same name as their type without names clashing (e.g. a `banana` of type `Banana`). You should always define the struct name, even if you don't need to, because you might need to later (e.g. to use as an incomplete type). Also, having a name at the top helps readability when comments are inserted, or the struct definition becomes large.
1002 |
1003 | I don't typedef structs used for named arguments (see below), however, because the TitleCase naming would be weird. Anyway, if you're using a macro for named arguments, then the typedef is unnecessary and the struct definition is hidden.
1004 |
1005 | If a user dislikes this practice of typedefing structs (which is fair, because it does have drawbacks - see below), they can always use the `struct` namespace instead.
1006 |
1007 |
1008 |
1009 | #### Only typedef structs; never basic types or pointers
1010 |
1011 | ``` c
1012 | // Bad
1013 | typedef double centimeters;
1014 | typedef double inches;
1015 | typedef struct Apple * Apple;
1016 | typedef void * gpointer;
1017 | ```
1018 |
1019 | This mistake is committed by way too many codebases. It masks what's really going on, and you have to read documentation or find the `typedef` to learn how to work with it. Never do this in your own interfaces, and try to ignore the typedefs in other interfaces.
1020 |
1021 | These criticisms apply equally to struct typedefs, as advised above. In my opinion, the visual clarity achieved by removing all the `struct` declarations is worth requiring users be aware of (or realize) the convention. Also, having a consistent naming scheme for structs, with TitleCase names, helps recognizability.
1022 |
1023 | Pointer typedefs are particularly nefarious because they exclude the users from qualifying the pointee with `const`. This is a huge loss, for reasons enumerated in other rules.
1024 |
1025 | Function pointer typedefs are justified when you need to declare a function that returns a function pointer; the syntax without a typedef is unbearable. I'll also typedef a function pointer if the type is being repeated in many locations (more than three, or so). Some people like to typedef all function pointers, but this often masks what's going on and what's expected. Carefully consider if a function pointer typedef will actually help people understand what that type represents.
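
As a sketch of the first case, here's a function that returns a function pointer. The typedef name `BinaryOp` and the functions are hypothetical, chosen only for this illustration:

```c
#include <assert.h>
#include <stddef.h>

typedef int ( * BinaryOp )( int, int );

static int add( int const a, int const b ) { return a + b; }
static int multiply( int const a, int const b ) { return a * b; }

// Without the typedef, this declaration would read:
//     int ( * binary_op_for( char const symbol ) )( int, int )
// Returns NULL if `symbol` isn't a known operator.
BinaryOp binary_op_for( char const symbol )
{
    switch ( symbol ) {
    case '+': return add;
    case '*': return multiply;
    default:  return NULL;
    }
}
```

Here the typedef makes the return type readable at a glance, which is the situation where it helps rather than hides.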
1026 |
1027 |
1028 |
1029 | #### Give enums `UPPERCASE_SNAKE` names, and lowercase their values
1030 |
1031 | Because enums are mostly just integer constants, it's natural to name them the same way as `#define`d constants. The `enum` type prefix will communicate that it expects an enum value, and the lowercase value suffixes will communicate that they aren't quite integer constants.
1032 |
1033 | ``` c
1034 | enum JSON_TYPE {
1035 |     JSON_TYPE_null,
1036 |     JSON_TYPE_boolean,
1037 |     JSON_TYPE_number,
1038 |     ...
1039 | };
1040 | ```
1041 |
1042 |
1043 |
1044 | #### Define a constant for the size of every enum
1045 |
1046 | There's no versatile, future-proof way to work with loops, arrays, or bit-fields of the `enum` otherwise. Always define a constant to denote the size of the enumeration, to avoid hard-coded values (by you or your users).
1047 |
1048 | ``` c
1049 | enum SUIT {
1050 |     SUIT_hearts,
1051 |     SUIT_diamonds,
1052 |     SUIT_clubs,
1053 |     SUIT_spades
1054 | };
1055 | #define NUM_SUITS 4
1056 | ```
1057 |
1058 | I like to `#define` the size explicitly, rather than making it the last enum value. It seems natural to exclude the size of the enum from the actual enum itself - `NUM_SUITS` isn't a card suit I've ever heard of! It also protects against one of the previous enum values being explicitly set (e.g. `SUIT_hearts = 1`), which would mean the last enum value wouldn't represent the size of the enum.
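
Here's a sketch of the size constant at work in an array and a loop. The `suit_names` table and the `suits_starting_with` function are hypothetical additions for this illustration:

```c
#include <assert.h>
#include <stddef.h>

enum SUIT {
    SUIT_hearts,
    SUIT_diamonds,
    SUIT_clubs,
    SUIT_spades
};
#define NUM_SUITS 4

// Sized by the constant rather than a hard-coded 4.
static char const * const suit_names[ NUM_SUITS ] = {
    [ SUIT_hearts ]   = "hearts",
    [ SUIT_diamonds ] = "diamonds",
    [ SUIT_clubs ]    = "clubs",
    [ SUIT_spades ]   = "spades"
};

// Counts the suits whose name starts with `initial`.
int suits_starting_with( char const initial )
{
    int count = 0;
    for ( int suit = 0; suit < NUM_SUITS; suit += 1 ) {
        // Catches a suit that was added without a name.
        assert( suit_names[ suit ] != NULL );
        if ( suit_names[ suit ][ 0 ] == initial ) {
            count += 1;
        }
    }
    return count;
}
```

If a fifth suit is added and `NUM_SUITS` is bumped, the array grows and the loop covers it without any other change.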
1059 |
1060 |
1061 |
1062 | #### Never begin names with `_` or end them with `_t`: they're reserved for standards
1063 |
1064 | [Here's a list](https://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html) of the names reserved by the ISO C and POSIX standards. `types_like_this_t` and `_anything` are identifiers that are reserved for future standards (leading underscores by ISO C, the `_t` suffix by POSIX), so don't use them for your own identifiers.
1065 |
1066 | These kinds of names *could've* provided a nice way to tell which types are part of the language standard and which types are provided by libraries. Unfortunately, [it's](https://github.com/facebook/libphenom) [not](https://github.com/joyent/libuv) [hard](https://github.com/liuliu/ccv) to find popular C libraries and projects that make this mistake, which dilutes the helpfulness of such a rule.
1067 |
1068 | This mistake is made way too often: don't make the same mistake in your library!
1069 |
1070 |
1071 |
1072 | #### Only use pointers in structs for nullity, dynamic arrays or incomplete types
1073 |
1074 | Every pointer in a struct is an opportunity for a segmentation fault.
1075 |
1076 | If the would-be pointer shouldn't be NULL, isn't an array of an unknown size, and isn't of the type of the struct itself, then don't make it a pointer. Just include a member of the type itself in the struct. Don't worry about the size of the containing struct until you've done benchmarks.
1077 |
1078 |
1079 |
1080 | #### Only use pointer arguments for nullity, arrays or modifications
1081 |
1082 | This rule helps readers reason about where values are being modified. It also improves safety by making it impossible for functions that shouldn't receive `NULL` to receive it -- this is a huge benefit over languages that require pass-by-reference semantics (and thus `NULL` as a valid value almost everywhere).
1083 |
1084 | When you're reading a codebase that sticks to this rule, and its functions and types are maximally decomposed, you can often tell what a function does just by reading its prototype. This is in stark contrast to projects that pass pointers everywhere: you have no certainty anywhere.
1085 |
1086 | In C, you can pass struct values to functions, and by [pass-by-value semantics](http://c-faq.com/ptrs/passbyref.html), they'll be copied into the stack frame of the receiving function. The original struct can't be modified by that function (although it can return the modification). Like `const`, using this feature wherever you can makes it easier for your readers to reason about your program.
1087 |
1088 | Defining a "modification" gets tricky when you introduce structs with pointer members. I consider a modification to be something that affects the struct itself, or the pointees of the struct.
1089 |
1090 | If a struct will be "modified" by a function, have that function accept a pointer to that struct even if it doesn't need to. This saves the readers from having to find and memorize every relevant struct definition, to be aware of which structs have pointer members.
1091 |
1092 | ``` c
1093 | typedef struct {
1094 |     int population;
1095 | } State;
1096 |
1097 | typedef struct {
1098 |     State * states;
1099 |     int num_states;
1100 | } Country;
1101 |
1102 | // Good: takes a `Country *` even though it *could* modify the array
1103 | // pointed to by the `states` member with just a `Country` value.
1104 | void country_grow( Country const * const country, double const percent ) {
1105 |     for ( int i = 0; i < country->num_states; i += 1 ) {
1106 |         country->states[ i ].population *= percent;
1107 |     }
1108 | }
1109 | ```
1110 |
1111 | Note the const-ness of the `country` argument above: this communicates that the country itself won't be modified, but a pointee will be (although it could also be taken to suggest that the pointer is for nullity, the function name suggests otherwise). It also allows callers to pass in a pointer to a `Country const`.
1112 |
1113 | The other situation to use pointer arguments is if the function needs to accept `NULL` as a valid value (i.e. the poor man's [Maybe](http://learnyouahaskell.com/making-our-own-types-and-typeclasses)). If so, be sure to use `const` to signal that the pointer is not for modification, and so it can accept `const` arguments.
1114 |
1115 | ``` c
1116 | // Good: `NULL` represents an empty list, and list is a pointer-to-const
1117 | int list_length( List const * list ) {
1118 |     int length = 0;
1119 |     for ( ; list != NULL; list = list->next ) {
1120 |         length += 1;
1121 |     }
1122 |     return length;
1123 | }
1124 | ```
1125 |
1126 | Sticking to this rule means ditching incomplete struct types, but I don't really like them anyway. (see the "[C isn't object-oriented](#c-isnt-object-oriented-and-you-shouldnt-pretend-it-is)" rule)
1127 |
1128 |
1129 |
1130 | #### Prefer to return a value rather than modifying pointers
1131 |
1132 | This encourages immutability, cultivates [pure functions](https://en.wikipedia.org/wiki/Pure_function), and makes things simpler and easier to understand. It also improves safety by eliminating the possibility of a `NULL` argument.
1133 |
1134 | ``` c
1135 | // Bad: unnecessary mutation (probably), and unsafe
1136 | void drink_mix( Drink * const drink, Ingredient const ingr ) {
1137 |     assert( drink != NULL );
1138 |     color_blend( &( drink->color ), ingr.color );
1139 |     drink->alcohol += ingr.alcohol;
1140 | }
1141 |
1142 | // Good: immutability rocks, pure and safe functions everywhere
1143 | Drink drink_mix( Drink const drink, Ingredient const ingr ) {
1144 |     return ( Drink ){
1145 |         .color = color_blend( drink.color, ingr.color ),
1146 |         .alcohol = drink.alcohol + ingr.alcohol
1147 |     };
1148 | }
1149 | ```
1150 |
1151 | This isn't always the best way to go, but it's something you should always consider.
1152 |
1153 |
1154 |
1155 | #### Use structs to name functions' optional arguments
1156 |
1157 | ``` c
1158 | struct run_server_options {
1159 |     char * port;
1160 |     int backlog;
1161 | };
1162 |
1163 | #define run_server( ... ) \
1164 |     run_server_( ( struct run_server_options ){ \
1165 |         /* default values */ \
1166 |         .port = "45680", \
1167 |         .backlog = 5, \
1168 |         __VA_ARGS__ \
1169 |     } )
1170 |
1171 | int run_server_( struct run_server_options opts )
1172 | {
1173 |     ...
1174 | }
1175 |
1176 | int main( void )
1177 | {
1178 |     return run_server( .port = "3490", .backlog = 10 );
1179 | }
1180 | ```
1181 |
1182 | I learnt this from *21st Century C*. So many C interfaces could be improved immensely if they took advantage of this technique. The importance and value of (syntactic) named arguments is all-too-often overlooked in software development. If you're not convinced, read Bret Victor's [Learnable Programming](http://worrydream.com/LearnableProgramming/).
1183 |
1184 | Don't use named arguments everywhere. If a function's only argument happens to be a struct, that doesn't necessarily mean it should become the named arguments for that function. A good rule of thumb is that if the struct is used outside of that function, you shouldn't hide it with a macro like above.
1185 |
1186 | ``` c
1187 | // Good; the typecast here is informative and expected.
1188 | book_new( ( Author ){ .name = "Dennis Ritchie" } );
1189 | ```
1190 |
1191 |
1192 |
1193 | #### Always use designated initializers in struct literals
1194 |
1195 | ``` c
1196 | // Bad - will break if struct members are reordered, and it's not
1197 | // always clear what the values represent.
1198 | Fruit apple = { "red", "medium" };
1199 | // Good; future-proof and descriptive
1200 | Fruit watermelon = { .color = "green", .size = "large" };
1201 | ```
1202 |
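1203 | Sometimes I'll bend this rule for named arguments, by having a particular field be at the top of the struct, so that callers can call the function without having to name that single argument. Note that an undesignated value only lands on the first field if it's the first initializer in the struct literal; after designated defaults such as `.backlog = 5`, it would initialize the *next* member instead, so a macro that supports this has to be written with that in mind: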
1204 |
1205 | ``` c
1206 | run_server( "3490" );
1207 | run_server( .port = "3490", .backlog = 10 );
1208 | ```
1209 |
1210 | If you want to allow this, document it explicitly. It's then your responsibility to version your library correctly, if you change the ordering of the fields.
1211 |
1212 |
1213 |
1214 | #### If you're providing allocation and free functions only for a struct member, allocate memory for the whole struct
1215 |
1216 | If you're providing `foo_alloc` and `foo_free` functions only so you can allocate memory for a member of the `Foo` struct, you've lost the benefits and safety of automatic storage. You may as well have the allocation and free functions allocate memory for the whole struct, so users can pass it outside the scope in which it was defined (without dereferencing it), if they want.
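
A minimal sketch of what that might look like, using a hypothetical `Buffer` struct in place of `Foo`:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Buffer {
    char * bytes;
    size_t size;
} Buffer;

// Allocates the struct itself as well as its `bytes` member, which is
// zero-filled. Returns NULL if either allocation fails.
Buffer * buffer_alloc( size_t const size )
{
    assert( size > 0 );
    Buffer * const buffer = malloc( sizeof *buffer );
    if ( buffer == NULL ) {
        return NULL;
    }
    *buffer = ( Buffer ){ .bytes = calloc( size, 1 ), .size = size };
    if ( buffer->bytes == NULL ) {
        free( buffer );
        return NULL;
    }
    return buffer;
}

// Frees both allocations; accepts NULL, like `free` does.
void buffer_free( Buffer * const buffer )
{
    if ( buffer != NULL ) {
        free( buffer->bytes );
        free( buffer );
    }
}
```

Because the whole struct lives on the heap, the returned pointer stays valid after the calling function returns, until `buffer_free` is called.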
1217 |
1218 |
1219 |
1220 | #### Avoid getters and setters
1221 |
1222 | If you're seeking encapsulation in C, you're probably overcomplicating things. Encourage your users to access and set struct members directly; never prefix members with `_` to denote an access level. Declare your struct invariants, and you don't need to worry about your users breaking things - it's their responsibility to provide a valid struct.
1223 |
1224 | As advised in [another rule](#prefer-to-return-a-value-rather-than-modifying-pointers), avoid mutability wherever you can.
1225 |
1226 | ``` c
1227 | // Rather than:
1228 | void city_set_state( City * const c, char const * const state )
1229 | {
1230 |     c->state = state;
1231 |     c->country = country_of_state( state );
1232 | }
1233 |
1234 | // Always prefer immutability and purity:
1235 | City city_with_state( City c, char const * const state )
1236 | {
1237 |     c.state = state;
1238 |     c.country = country_of_state( state );
1239 |     return c;
1240 | }
1241 |
1242 | City c = { .name = "Vancouver" };
1243 | c = city_with_state( c, "BC" );
1244 | printf( "%s is in %s, did you know?\n", c.name, c.country );
1245 | ```
1246 |
1247 | But you should always provide an interface that allows for [declarative programming](https://en.wikipedia.org/wiki/Declarative_programming):
1248 |
1249 | ``` c
1250 | City const c = city_new( .name = "Boston", .state = "MA" );
1251 | printf( "I think I'm going to %s,\n"
1252 |         "Where no one changes my state\n", c.name );
1253 | ```
1254 |
1255 |
1256 |
1257 | #### C isn't object-oriented, and you shouldn't pretend it is
1258 |
1259 | C doesn't have classes, methods, inheritance, (nice) object encapsulation, or real polymorphism. Not to be rude, but: **deal with it**. C might be able to achieve crappy, complicated imitations of those things, but it's just not worth it.
1260 |
1261 | As it turns out, C already has an entirely-capable language model. In C, we define data structures, and we define functionality that uses combinations of those data structures. Data and functionality aren't intertwined in complicated contraptions, and this is a good thing.
1262 |
1263 | Haskell, at the forefront of language design, made the same decision to separate data and functionality. Learning Haskell is one of the best things a programmer can do to improve their technique, but I think it's especially beneficial for C programmers, because of the underlying similarities between C and Haskell. Yes, C doesn't have anonymous functions, and no, you won't be writing monads in C anytime soon. But by learning Haskell, you'll learn how to write good software without classes, without mutability, and with modularity. These qualities are very beneficial for good C programming.
1264 |
1265 | Embrace and appreciate what C offers, rather than attempting to graft other paradigms onto it.
1266 |
1267 |
--------------------------------------------------------------------------------