├── README.md ├── approach_to_lifetime_safety_summary.md ├── build_myscpptool.sh ├── intro_video_transcript.md └── src ├── build_scripts └── build-debug2.sh ├── checker.h ├── converter_c2validcpp.h ├── converter_mode1.h ├── makefile ├── scpptool.cpp ├── scpptool.h ├── utils1.cpp └── utils1.h /README.md: -------------------------------------------------------------------------------- 1 | 2 | Dec 2024 3 | 4 | ### Overview 5 | 6 | scpptool is a command line tool to help enforce a memory and data race safe subset of C++. It's designed to work with the [SaferCPlusPlus](https://github.com/duneroadrunner/SaferCPlusPlus) library. It analyzes the specified C++ file(s) and reports places in the code that it cannot verify to be safe. By design, the tool and the library should be able to fully ensure "lifetime", bounds and [data race](https://github.com/duneroadrunner/SaferCPlusPlus#multithreading) safety. 7 | 8 | This safety necessarily comes at some (modest) expense of either flexibility or performance. Elements with a choice of tradeoffs are provided in the SaferCPlusPlus library. The goal is not to introduce new coding paradigms, but instead to impose only the minimum restrictions and departures from traditional C++ necessary to achieve practical performant memory safety. 9 | 10 | Unfortunately even these minimized changes are not insignificant. For example, the tool considers the standard library containers "unsafe" (and will complain when they are used), and instead provides largely compatible replacement implementations. The other big restriction probably being that null values are not supported for raw pointers. 11 | 12 | A notable difference between this tool and some others in development and in other languages is that the safe subset it enforces is (like traditional C++) not "flow (or path) sensitive". 
That is, whether an operation is allowed (by the tool/compiler) or not depends only on the declaration of the elements involved, not any other preceding operations or code. This is(/was) a common property of "statically typed" languages, that [arguably](#flow-insensitive-analysis) contributes to "scalability". 13 | 14 | Some samples of conforming safe code can be found in the examples for the [provided library elements](#lifetime-annotated-elements-in-the-safercplusplus-library). 15 | 16 | Note that due to its dependency on the clang+llvm libraries this tool only supports code that's compatible with the clang compiler. 17 | 18 | Note that this tool is still in development and not well tested. (While we presume that it would be unlikely that any bugs or incomplete features would result in your code being less safe than it otherwise would have been, it would not be appropriate to rely on this project being bug free or fully complete in its ability to prevent invalid memory accesses at this time.) 19 | 20 | Quick intro video (with text-to-speech narration) (transcript with code samples [here](intro_video_transcript.md)) part 1: 21 | (Note that a chunk of the video starting at 4:00 and ending at 6:30 is dedicated to demonstrating the installation of the tool. And another chunk starting at 8:15 and ending at 10:00 demonstrates setting up the associated library. You can skip those parts if they are not of immediate interest.) 22 | 23 | https://github.com/duneroadrunner/scpptool/assets/10386072/864df66f-70fb-4e4f-a11f-ff9fee1feb7b 24 | 25 | Quick intro video (with text-to-speech narration) part 2: 26 | 27 | https://github.com/duneroadrunner/scpptool/assets/10386072/a80f55b5-19fd-4ba2-a121-d999eb5b8932 28 | 29 | By some request, a ["Rough Summary of the Approach to Lifetime Safety For Those With Some Familiarity With Rust"](approach_to_lifetime_safety_summary.md). 30 | 31 | ### Table of contents 32 | 1. [Overview](#overview) 33 | 2. [How to Build](#how-to-build) 34 | 3. 
[How to Use](#how-to-use) 35 | 4. [Local Suppression of the Checks](#local-suppression-of-the-checks) 36 | 5. [About the Enforced Subset](#about-the-enforced-subset) 37 | 1. [Restrictions on the use of native pointers and references](#restrictions-on-the-use-of-native-pointers-and-references) 38 | 2. [Referencing elements in a dynamic container](#referencing-elements-in-a-dynamic-container) 39 | 3. [Annotating lifetime constraints](#annotating-lifetime-constraints) 40 | 4.
41 | Lifetime annotated elements in the SaferCPlusPlus library 42 | 43 | 1. [Overview](#lifetime-annotated-elements-in-the-safercplusplus-library) 44 | 1. [TXSLTAPointer](#txsltapointer) 45 | 2. [TXSLTAOwnerPointer](#txsltaownerpointer) 46 | 3. [xslta_array](#xslta_array) 47 | 4. [xslta_vector, xslta_fixed_vector, xslta_borrowing_fixed_vector](#xslta_vector-xslta_fixed_vector-xslta_borrowing_fixed_vector) 48 | 5. [TXSLTARandomAccessSection, TXSLTARandomAccessConstSection](#txsltarandomaccesssection-txsltarandomaccessconstsection) 49 | 6. [TXSLTACSSSXSTERandomAccessIterator and TXSLTACSSSXSTERandomAccessSection](#txsltacsssxsterandomaccessiterator-and-txsltacsssxsterandomaccesssection) 50 | 7. [xslta_optional, xslta_fixed_optional, xslta_borrowing_fixed_optional](#xslta_optional-xslta_fixed_optional-xslta_borrowing_fixed_optional) 51 |
52 | 5. [SaferCPlusPlus elements](#safercplusplus-elements) 53 | 6. [Elements not (yet) addressed](#elements-not-yet-addressed) 54 | 6. [Autotranslation](#autotranslation) 55 | 7. [Questions and comments](#questions-and-comments) 56 | 57 | ### How to Build: 58 | 59 | Ubuntu Linux is currently the only tested platform. (But there's no intrinsic reason it shouldn't work on any platform for which clang+llvm is available.) 60 | 61 | First, the llvm+clang library used by scpptool requires some additional libraries that can be installed as follows: 62 | ``` 63 | sudo apt-get update 64 | sudo apt-get install zlib1g-dev 65 | sudo apt-get install libtinfo-dev 66 | sudo apt-get install libxml2-dev 67 | ``` 68 | scpptool also uses the `yaml-cpp` library which can be installed as follows: 69 | ``` 70 | sudo apt-get install libyaml-cpp-dev 71 | ``` 72 | 73 | Next, [download](https://github.com/duneroadrunner/scpptool/archive/master.zip) and extract the repository (or clone it). Then just run the `build_myscpptool.sh` script. 74 | 75 | If you're not running an Ubuntu x86_64 system, then it will instruct you to download the clang+llvm pre-built binaries for your system and indicate the directory where they were extracted to. (On some systems, clang+llvm may require you to install other prerequisites. Those should be indicated by link errors. At which point you can just install the requirements and rerun the build script.) 76 | 77 | The build script does not require root privileges. 78 | 79 | (For those developing on Windows, the easiest route may be to build the tool in a WSL Ubuntu distro, and then invoke the tool from windows using the `wsl -e` command.) 80 | 81 | (Note that the scpptool executable uses a clang include directory located in the relative path '../lib/clang' created by the build script. So copying or moving the scpptool executable would also require copying or moving that include directory so that the relative path remains the same.) 
82 | 83 | To uninstall: Just delete the directories where you extracted the scpptool repo and clang+llvm pre-built binaries to. Currently, the build script doesn't actually "install" anything. 84 | 85 | ### How to Use: 86 | 87 | The usage syntax is as follows: 88 | 89 | `{scpptool src directory}/scpptool {source filename(s)} -- {compiler options}` 90 | 91 | where the {scpptool src directory} is the `src` subdirectory of the repository you downloaded (or cloned). 92 | 93 | So for example: 94 | 95 | `~/dev/scpptool-master/src/scpptool hello_world.cpp -- -I./msetl -std=c++17` 96 | 97 | (Note, the double dashes `--` cannot be omitted even if there are no compiler options.) 98 | 99 | ### Local Suppression of the Checks 100 | 101 | You can use a "check suppression directive" to indicate places in the source code that the tool should not report conformance violations. For example: 102 | 103 | ```cpp 104 | { 105 | auto ptr1 = new int(5); // scpptool will complain (because `new` is not allowed) 106 | MSE_SUPPRESS_CHECK_IN_XSCOPE auto ptr2 = new int(7); // scpptool will not complain 107 | } 108 | ``` 109 | 110 | The presence of `MSE_SUPPRESS_CHECK_IN_XSCOPE` (a macro provided in the `msepointerbasics.h` include file of the SaferCPlusPlus library) indicates that checking should be suppressed for the statement that follows it. The `XSCOPE` suffix means that this directive can only be used in "execution scopes" (sometimes referred to as "blocks"). Essentially places where you can execute a function. As opposed to "declaration scopes". 
For example: 111 | 112 | ```cpp 113 | { 114 | struct A1 { 115 | int* m_ptr1 = nullptr; // scpptool will complain (because raw pointers are not allowed to be null) 116 | MSE_SUPPRESS_CHECK_IN_XSCOPE int* m_ptr2 = nullptr; // compile error 117 | MSE_SUPPRESS_CHECK_IN_DECLSCOPE int* m_ptr3 = nullptr; // this will work 118 | }; 119 | } 120 | ``` 121 | 122 | 123 | ### About the Enforced Subset 124 | 125 | The main differences between "traditional C++" and the safe subset that this tool enforces are probably the restrictions on raw pointers/references and the method for accessing elements in dynamic containers (such as vectors). (Just to be clear, scpptool also undertakes the other basic enforcement tasks required for memory safety, like ensuring that objects are initialized, casts are safe, parts of the object aren't accessed before they are initialized during construction, etc.) 126 | 127 | #### Restrictions on the use of native pointers and references 128 | 129 | First, this tool does not allow null values for raw pointers. (If you need null pointer values, you can wrap the pointer in an [`optional<>`](#xslta_optional-xslta_fixed_optional-xslta_borrowing_fixed_optional), or use one of the provided smart pointers which safely support null values.) 130 | 131 | Also, raw pointers and references are considered to be [scope](https://github.com/duneroadrunner/SaferCPlusPlus/blob/master/README.md#scope-pointers) references, which essentially means that their lifetime must be (verifiably at compile-time) bounded by an (execution) scope, and any object they target must (verifiably at compile-time) live at least to the end of that bounding scope. 
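As a rough illustration of the two rules above (no null raw pointers, and targets that must live to the end of the pointer's bounding scope), here is a minimal sketch in plain C++. The names `sum_targets` and `scope_pointer_demo` are made up for this example. The comments describe what we would expect scpptool to report based on the rules stated above; they are not captured tool output.

```cpp
#include <cassert>

// Hypothetical helper: the pointer parameters only live for the duration of the
// call, and their targets come from the caller, so the targets outlive them.
int sum_targets(const int* a_ptr, const int* b_ptr) {
    return *a_ptr + *b_ptr;
}

// Hypothetical demo function showing conforming and (commented out) non-conforming uses.
int scope_pointer_demo() {
    int outer = 5;
    int* outer_ptr = &outer; // non-null, and `outer` is declared first, so it outlives `outer_ptr`
    int result = 0;
    {
        int inner = 7;
        int* inner_ptr = &inner; // fine: `inner` outlives `inner_ptr`
        result = sum_targets(outer_ptr, inner_ptr);

        // outer_ptr = &inner;   // expected to be flagged: `inner` does not live to the
        //                       // end of the scope that bounds `outer_ptr`
    }
    // int* null_ptr = nullptr;  // expected to be flagged: null raw pointers are not allowed
    return result;
}
```

Calling `scope_pointer_demo()` simply adds the two targets. The point of the sketch is which declarations and assignments stay within the scope-reference rules, not the arithmetic.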
132 | 133 | These restrictions imposed on scope reference types are generally not super-onerous, but, for example, if you wanted a raw pointer that targets an object owned by a smart pointer with shared ownership, you would first need to instantiate a ["strong pointer store" object](https://github.com/duneroadrunner/SaferCPlusPlus/blob/master/README.md#make_xscope_strong_pointer_store) that ensures that the target object will outlive the raw pointer's bounding scope. (Btw, `std::shared_ptr<>` is not supported for other safety reasons, but [safe implementations](https://github.com/duneroadrunner/SaferCPlusPlus/blob/master/README.md#reference-counting-pointers) of smart pointers with shared ownership are available.) 134 | 135 | When the value of one pointer is assigned to another, some static analyzers will attempt to determine the lifetime of the target object based on the "most recent" value that was assigned to the source pointer. scpptool does not do this. scpptool infers a lower bound for the target object lifespan based solely on the declaration of the source pointer ([not any subsequent assignment operation](#flow-insensitive-analysis)). By default, the only assumption made is that the target object outlives the source pointer. 136 | 137 | If [lifetime annotations](#annotating-lifetime-constraints) are applied to the pointer, then it can be assumed that the target object (also) outlives the target object of the pointer's initialization value. Often [`rsv::TXSLTAPointer<>`](#txsltapointer) will be used in place of raw pointers. It acts just like a raw pointer with lifetime annotations. 138 | 139 | #### Referencing elements in a dynamic container 140 | 141 | This tool does not support directly taking (raw) references to elements in a dynamic (resizable) container (such as a vector). 
The preferred way of obtaining a (raw) reference to an element in a dynamic container is via a corresponding ["borrowing fixed (size)"](#xslta_vector-xslta_fixed_vector-xslta_borrowing_fixed_vector) (proxy) container that, while it exists, ensures that no elements are removed or relocated. 142 | 143 | Operations that could resize (or relocate) the contents of a "safe" dynamic container implementation provided by the library will incur some extra overhead. But these sorts of operations are generally avoided inside performance-critical inner loops anyway because they often incur unnecessary cost even without the extra overhead. 144 | 145 | #### Annotating lifetime constraints 146 | 147 | [*provisional*] 148 | 149 | This next topic is kind of a longer one about "lifetime annotations". It is an important topic to understand if you ever intend to store raw pointers or references, or use them as function return values. (You can theoretically avoid those situations, and this topic, by using the provided non-owning run-time-checked smart pointers rather than raw pointers/references in those situations.) This topic introduces some new syntax, but don't get hung up on the details. Most of the time you won't be using the new syntax directly. Most of the time it will be [implied by default](#lifetime-elision) or you'll be using provided [library elements](#lifetime-annotated-elements-in-the-safercplusplus-library) that already incorporate the appropriate lifetime annotations. But still, it's recommended to at least give this topic a "once over" at some point to get an idea of how things work and the principles underlying the associated imposed/enforced restrictions. (Though if you're reading this as part of your first introduction to scpptool, don't let yourself get bogged down in this section, you can always revisit it later.) 
150 | 151 | While in most cases the lifetime annotations will be implicit, the lifetime annotations presented in this section may strike some as quite verbose. (Provisional) shorter macro aliases are [provided](https://github.com/duneroadrunner/SaferCPlusPlus/blob/2f23cbb406497d42faa289cecbd76565dd3ea73b/include/mseslta.h#L103-L109). So even though all our examples use the "official" verbose syntax, it's expected that shorter macro aliases will ultimately be more commonly used. 152 | 153 | By default, this tool enforces that targets of scope (raw) pointers outlive the pointer itself. But sometimes it can be useful to enforce even more stringent restrictions on the lifespan of the target objects. Consider the following example: 154 | 155 | ```cpp 156 | typedef int* int_ptr_t; 157 | 158 | void foo1(int_ptr_t& i1_ptr_ref, int_ptr_t& i2_ptr_ref) { 159 | int_ptr_t i3_ptr = i1_ptr_ref; // clearly safe 160 | i2_ptr_ref = i1_ptr_ref; // ??? 161 | } 162 | ``` 163 | 164 | We (and the tool) can see that it is safe to assign the value of the `i1_ptr_ref` parameter to the `i3_ptr` local variable, because the target object of the pointer referred to by `i1_ptr_ref` comes from outside the function and can be assumed to outlive the function call itself and therefore any local variable within the function. 165 | 166 | But what about assigning the value of `i1_ptr_ref` to `i2_ptr_ref`? In this case we (and the tool) don't have enough information to conclude that the target of the pointer referred to by `i1_ptr_ref` would outlive the pointer that `i2_ptr_ref` refers to. 167 | 168 | ##### Annotating function interfaces 169 | 170 | Now imagine we had some way to specify in the function interface that the pointer referred to by `i1_ptr_ref`, and therefore its target object, must live at least as long as the pointer that `i2_ptr_ref` refers to. 
171 | 172 | The tool supports such specifications (referred to as "lifetime annotations"), which might look something like this: 173 | 174 | ```cpp 175 | typedef int* int_ptr_t; 176 | 177 | void foo1(int_ptr_t& i1_ptr_ref MSE_ATTR_PARAM_STR("mse::lifetime_label(42)"), int_ptr_t& i2_ptr_ref MSE_ATTR_PARAM_STR("mse::lifetime_label(99)")) 178 | MSE_ATTR_FUNC_STR("mse::lifetime_notes{ labels(42, 99); encompasses(42, 99) }") 179 | { 180 | int_ptr_t i3_ptr = i1_ptr_ref; // clearly safe 181 | i2_ptr_ref = i1_ptr_ref; // the lifetime annotations tell us that this is safe 182 | } 183 | ``` 184 | 185 | First note that the `42` and `99` are just arbitrarily chosen labels used to distinguish between the lifetimes of the two parameters. So let's go through the "annotations" we added: 186 | 187 | After the first parameter we added `MSE_ATTR_PARAM_STR("mse::lifetime_label(42)")`. `MSE_ATTR_PARAM_STR()` is just a (preprocessor) macro function defined in the SaferCPlusPlus library that lets us add these annotations in such a way that the tool can read them, without affecting compilation of the code. The `"mse::lifetime_label(42)"` just associates a label (of our choosing) to the lifespan of the object bound to the (raw) reference first parameter. So we've assigned the labels `42` and `99` to the lifespans of objects bound to the two (raw) reference parameters. 188 | 189 | After the function declaration (and before the body of the function), we added the annotation `MSE_ATTR_FUNC_STR("mse::lifetime_notes{ labels(42, 99); encompasses(42, 99) }")`. `mse::lifetime_notes{}` is used as a sometimes more compact way of expressing multiple annotations separated by semicolons, and without the `mse::lifetime_` prefix on each one. 
So for example, in place of this annotation we could have instead written `MSE_ATTR_FUNC_STR("mse::lifetime_labels(42, 99)") MSE_ATTR_FUNC_STR("mse::lifetime_encompasses(42, 99)")`, which is equivalent (and might even be preferred in cases where you want to put each annotation on its own line). 190 | 191 | Ok, so the `labels(42, 99)` annotation is just the declaration of the lifetime labels, akin to declaring variables before use. Though here we place the declarations below the place where they are used. But that's just an aesthetic choice on our part. You can place them before the function declaration if you prefer. (Note that at the time of writing the tool actually allows you to use lifetime labels without declaring them separately, but is expected to be more strict in the future.) 192 | 193 | `encompasses(42, 99)` declares a constraint on the two lifespans. Namely that the `99` lifespan must be contained within the duration of the `42` lifespan. Or, essentially, that the object associated with the `42` lifespan must outlive the object associated with the `99` lifespan. 194 | 195 | The tool will analyze every call of the `foo1()` function and complain if it cannot verify that the function call arguments satisfy the specified constraint. 
For example: 196 | 197 | ```cpp 198 | #include "msescope.h" 199 | 200 | typedef int* int_ptr_t; 201 | 202 | void foo1(int_ptr_t& i1_ptr_ref MSE_ATTR_PARAM_STR("mse::lifetime_label(42)"), int_ptr_t& i2_ptr_ref MSE_ATTR_PARAM_STR("mse::lifetime_label(99)")) 203 | MSE_ATTR_FUNC_STR("mse::lifetime_notes{ labels(42, 99); encompasses(42, 99) }") 204 | { 205 | int_ptr_t i3_ptr = i1_ptr_ref; // clearly safe 206 | i2_ptr_ref = i1_ptr_ref; // the lifetime annotations tell us that this is safe 207 | } 208 | 209 | int main(int argc, char* argv[]) { 210 | int i1 = 5; 211 | int* i_ptr1 = &i1; 212 | { 213 | int i2 = 7; 214 | int* i_ptr2 = &i2; 215 | 216 | foo1(i_ptr1, i_ptr2); // fine because i_ptr1 outlives i_ptr2 217 | 218 | foo1(i_ptr2, i_ptr1); // scpptool will complain because the first argument does not outlive the second 219 | } 220 | } 221 | ``` 222 | 223 | As noted earlier, the syntax of these lifetime annotations may seem to some to be overly verbose for common usage. But the [shorter macro aliases](https://github.com/duneroadrunner/SaferCPlusPlus/blob/2f23cbb406497d42faa289cecbd76565dd3ea73b/include/mseslta.h#L103-L109) can be used instead. So for instance, the example function could be re-expressed like so: 224 | 225 | ```cpp 226 | typedef int* int_p_t; 227 | 228 | void foo1(int_p_t& i1_p_ref LTP(42), int_p_t& i2_p_ref LTP(99)) 229 | LT(42, 99) LT_ENCOMPASSES(42, 99) 230 | { 231 | int_p_t i3_p = i1_p_ref; // clearly safe 232 | i2_p_ref = i1_p_ref; // the lifetime annotations tell us that this is safe 233 | } 234 | ``` 235 | 236 | Ok, so we've demonstrated associating labels to the lifetimes of objects bound to raw references. But actually, raw references are kind of a special case "quasi-object" in the sense that the reference itself can never be the target of another reference or pointer, and can never be reassigned to reference a different object. 
(Raw) pointers, on the other hand, provide the functionality of raw references, but additionally can be reassigned to reference (aka "point to") different objects, and can themselves be targeted by references or other pointers. So if we use pointers in place of (raw) references in our first example: 237 | 238 | ```cpp 239 | typedef int* int_ptr_t; 240 | 241 | void foo2(int_ptr_t* i1_ptr_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(42)"), int_ptr_t* i2_ptr_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(99)")) 242 | MSE_ATTR_FUNC_STR("mse::lifetime_notes{ labels(42, 99); encompasses(42, 99) }") 243 | { 244 | int_ptr_t i3_ptr = *i1_ptr_ptr; // clearly safe 245 | *i2_ptr_ptr = *i1_ptr_ptr; // the lifetime annotations tell us that this is safe 246 | } 247 | ``` 248 | 249 | It works the same way. Note that the lifetime labels refer to the lifetimes of the targets of the pointer parameters (which also happen to be pointers in this case), not the lifetime of the pointer parameters themselves. 250 | 251 | But pointers can point to different objects during the execution of the program. Does this mean that lifetime labels associated with the target of a pointer can refer to different lifetimes at different points in the execution of a program? 252 | 253 | No. (At least not currently.) Our description thus far of what a lifetime label is has maybe been a little misleading. A lifetime label actually represents the lower bound of possible lifespans of any objects that might be targeted by the associated pointer/reference. (In the case of (raw) references (or `const` pointers), since they cannot be retargeted after initialization, there is only one possible object.) This lower bound is determined at the point where the pointer or reference object is declared (at compile-time). It's determined and set from either the initialization value, specified "constraint" annotations, and/or a default value based on the location of the declaration in the code. 
You might think of lifetime labels as sort of pseudo (deduced) template parameters. 254 | 255 | Often there is not enough information to determine the lower bound precisely. In such cases, the tool will use what information is available to try to verify safety as best it can. 256 | 257 | So in the above example, in the context of the function implementation (i.e. inside the function), since we don't have access to the parameters' initialization values (i.e. the passed arguments), their lower bound lifetimes are not precisely determined. We know that, by rule, the lower bounds are at least the lifespan of the function call. And that in this case, the lower bound of the `42` lifespan is determined, from the declaration, to be at least that of the `99` lifespan (as specified in the `encompasses()` constraint annotation). 258 | 259 | On the other hand, in the context of invoking/calling the function (i.e. outside the function), the lower bound lifetimes of the function arguments often will be known precisely. Known precisely or not, those lower bound lifetimes will have to verifiably conform to the specified (`encompasses()`) constraint. 260 | 261 | So we've seen lifetime labels associated with (raw) references and (raw) pointers when used as function parameters. But lifetime labels can be associated with other types of reference objects, and not just when used as function parameters. 262 | 263 | ##### Annotating (user-defined) types 264 | 265 | By "reference object" we mean basically any object that references (ultimately via pointer) any other object(s). A simple example would be just a `struct` that has a pointer member. 
So let's look at an example of a couple of `struct`s with a pointer member, one with and one without lifetime annotation: 266 | 267 | ```cpp 268 | #include "msescope.h" 269 | 270 | struct CRefObj1 { 271 | CRefObj1(int* i_ptr) : m_i_ptr(i_ptr) {} 272 | 273 | int* m_i_ptr; 274 | }; 275 | 276 | struct CLARefObj1 { 277 | CLARefObj1(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(99)")) : m_i_ptr(i_ptr) {} 278 | 279 | int* m_i_ptr MSE_ATTR_STR("mse::lifetime_label(99)"); 280 | } MSE_ATTR_STR("mse::lifetime_label(99)"); 281 | 282 | int main(int argc, char* argv[]) { 283 | int i1 = 5; 284 | int* i_ptr1 = &i1; 285 | 286 | { 287 | int i2 = 7; 288 | int* i_ptr2 = &i2; 289 | 290 | CRefObj1 ro2{ i_ptr2 }; 291 | CLARefObj1 laro2{ i_ptr2 }; 292 | 293 | { 294 | CRefObj1 ro3{ i_ptr1 }; 295 | CLARefObj1 laro3{ i_ptr1 }; 296 | 297 | ro2.m_i_ptr = ro3.m_i_ptr; // scpptool will complain because ro2.m_i_ptr outlives ro3.m_i_ptr 298 | 299 | laro2.m_i_ptr = laro3.m_i_ptr; // fine 300 | /* because the lower bound lifespan of the target of laro3.m_i_ptr was set (in the construction 301 | of laro3) to be the lifespan of i_ptr1 (i.e. the construction argument), and the lower bound 302 | lifespan of laro2.m_i_ptr was set to be the lifespan of i_ptr2, and i_ptr1 outlives i_ptr2 */ 303 | 304 | ro3.m_i_ptr = i_ptr2; // fine, i_ptr2 outlives ro3.m_i_ptr 305 | 306 | laro3.m_i_ptr = i_ptr2; // scpptool will complain 307 | /* because i_ptr2 does not outlive the lower bound lifespan of the target of laro3.m_i_ptr 308 | (which was set to i_ptr1) */ 309 | } 310 | } 311 | } 312 | ``` 313 | 314 | So you can see the different restrictions on which objects the member pointers can point to, and how the tool uses those restrictions to determine which assignment operations it can verify to be safe. 315 | 316 | So let's walk through the application of lifetime annotations to the `CLARefObj1` `struct` and its pointer member. 
The declaration of the pointer member gets an annotation in similar fashion to the function parameter declarations in our previous examples. We also use the same lifetime label annotation on the `struct` itself to "declare" the lifetime label, just like with functions. Note that (for now at least) the tool requires any `struct` with lifetime annotation to define an (annotated) constructor (from which it can infer the lifetime values associated with the lifetime labels). So in the `CLARefObj1` `struct`, the lifetime (lower bound) value associated with lifetime label `99`, and the pointer member, is inferred from the constructor parameter associated with lifetime label `99`. 317 | 318 | ##### Annotating return values and `this` pointers 319 | 320 | Ok, but if we want to use our annotated `CLARefObj1` type as a reference type, you could imagine we might want to provide, for example, member operators like `operator*()` and `operator->()`. Let's see how we would do that: 321 | 322 | ```cpp 323 | struct CLARefObj1 { 324 | CLARefObj1(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(99)")) : m_i_ptr(i_ptr) {} 325 | 326 | int& operator*() const MSE_ATTR_FUNC_STR("mse::lifetime_notes{ return_value(99) }") { 327 | return *m_i_ptr; 328 | } 329 | int* operator->() const MSE_ATTR_FUNC_STR("mse::lifetime_notes{ return_value(99) }") { 330 | return m_i_ptr; 331 | } 332 | 333 | int* m_i_ptr MSE_ATTR_STR("mse::lifetime_label(99)"); 334 | } MSE_ATTR_STR("mse::lifetime_label(99)"); 335 | ``` 336 | 337 | Operators are just like any other functions and annotated in the same way. Our previous example functions didn't have return values, so we didn't get a chance to see how to annotate the return value. This example shows it. It seems these operators don't take any parameters so we don't have to deal with them here. But to be pedantic, since these are member operators, just like member functions, they actually take an implicit `this` pointer parameter. 
In some cases, you might need to associate a lifetime label to the implicit `this` pointer parameter. This would be done in similar fashion to the return value annotation above, but substituting the `return_value()` part with `this()`. But understand that the `this()` annotation is just associating a lifetime label to a function parameter (in this case the implicit `this` pointer parameter), and so the lifespan (lower bound) value will be inferred from the parameter, whereas the `return_value()` annotation, on the other hand, is imposing a lifespan (lower bound) value associated with a lifetime label that has already been previously inferred (often from one of the (implicit or explicit) function parameters). 338 | 339 | By adding dereference operators, we've made a reference object that kind of resembles the behavior of a pointer. But notice that, unlike the native pointer and reference types, the target of our reference object type is always constrained by the lower bound lifespan inferred from its initialization value (aka constructor argument). With native pointers and references, we have to associate the target object's lifespan with a lifetime label (by adding an annotation to the (parameter) variable or member field) in order to trigger this constraint, whereas a variable or member field of our reference object type will always have this constraint regardless. Native pointers and references are the only types that possess this "dual nature". With all other (user defined) reference object types it's either one or the other. 340 | 341 | So currently our reference object stores one pointer, but what if we wanted it to store two different pointers to two different objects with different lifetime (lower bound) constraints? 
342 | 343 | ```cpp 344 | struct CLARefObj2 { 345 | CLARefObj2(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(99)"), float* fl_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(42)")) 346 | : m_i_ptr(i_ptr), m_fl_ptr(fl_ptr) {} 347 | 348 | int* m_i_ptr MSE_ATTR_STR("mse::lifetime_label(99)"); 349 | float* m_fl_ptr MSE_ATTR_STR("mse::lifetime_label(42)"); 350 | } MSE_ATTR_STR("mse::lifetime_labels(99, 42)"); 351 | ``` 352 | 353 | A reference object can have more than one lifetime label associated with it. 354 | 355 | Ok let's say, instead of dereference operators, we want to add some member functions that return the value of each pointer member field: 356 | 357 | ```cpp 358 | struct CLARefObj2 { 359 | CLARefObj2(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(99)"), float* fl_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(42)")) 360 | : m_i_ptr(i_ptr), m_fl_ptr(fl_ptr) {} 361 | 362 | int* first() const MSE_ATTR_FUNC_STR("mse::lifetime_notes{ return_value(99) }") { 363 | return m_i_ptr; 364 | } 365 | float* second() const MSE_ATTR_FUNC_STR("mse::lifetime_notes{ return_value(42) }") { 366 | return m_fl_ptr; 367 | } 368 | 369 | int* m_i_ptr MSE_ATTR_STR("mse::lifetime_label(99)"); 370 | float* m_fl_ptr MSE_ATTR_STR("mse::lifetime_label(42)"); 371 | } MSE_ATTR_STR("mse::lifetime_labels(99, 42)"); 372 | ``` 373 | 374 | ##### Annotating template parameters 375 | 376 | Ok, now let's say that instead of the member fields being of type `int*` and `float*`, we want those types to be generic template parameters. That's a little trickier. Because we know that pointers like `int*` and `float*` each have one reference lifetime to which a lifetime label can be associated. But if a type is a generic template parameter then we wouldn't know in advance how many, if any, lifetime labels can be associated with it. In this case we'll use a generic "lifetime label alias" that maps to the set of (reference) lifetimes the template parameter type has (when the template is instantiated). 
377 | 378 | ```cpp 379 | template<typename T, typename U> 380 | struct TLARefObj2 { 381 | TLARefObj2(T val1 MSE_ATTR_PARAM_STR("mse::lifetime_label(alias_11$)") 382 | , U val2 MSE_ATTR_PARAM_STR("mse::lifetime_label(alias_12$)")) 383 | : m_val1(val1), m_val2(val2) {} 384 | 385 | T first() const MSE_ATTR_FUNC_STR("mse::lifetime_notes{ return_value(alias_11$) }") { 386 | return m_val1; 387 | } 388 | U second() const MSE_ATTR_FUNC_STR("mse::lifetime_notes{ return_value(alias_12$) }") { 389 | return m_val2; 390 | } 391 | 392 | T m_val1 MSE_ATTR_STR("mse::lifetime_label(alias_11$)"); 393 | U m_val2 MSE_ATTR_STR("mse::lifetime_label(alias_12$)"); 394 | } MSE_ATTR_STR("mse::lifetime_set_alias_from_template_parameter_by_name(T, alias_11$)") 395 | MSE_ATTR_STR("mse::lifetime_set_alias_from_template_parameter_by_name(U, alias_12$)") 396 | MSE_ATTR_STR("mse::lifetime_labels(alias_11$, alias_12$)"); 397 | ``` 398 | 399 | We use the `mse::lifetime_set_alias_from_template_parameter_by_name()` annotation to define a lifetime label alias for the set of (reference) lifetimes the specified template parameter type has (or rather, will have whenever the template is instantiated). 400 | 401 | ##### Annotating base classes 402 | 403 | Base classes are, in terms of lifetime annotations, conceptually treated just like member fields. You can use the `labels_for_base_class()` annotation (on the derived type) to assign lifetime labels to the first base class. 404 | 405 | ```cpp 406 | struct CLARefObj11 : public CLARefObj1 { 407 | CLARefObj11(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(42)")) : CLARefObj1(i_ptr) {} 408 | } MSE_ATTR_STR("mse::lifetime_label(42)") MSE_ATTR_STR("mse::lifetime_label_for_base_class(42)"); 409 | ``` 410 | Note though that (at the time of writing) lifetimes are only properly transmitted to the immediate base class. So, for example, an inherited member function of a base class of a base class (i.e. 
two levels of inheritance) that uses lifetime annotations would need to be overridden (by a member function that reexpresses the lifetime annotations).

##### Accessing sublifetimes

Now let's revisit the earlier part where we associated lifetime labels with function parameters, and consider a situation where we are interested not in the lifetime of the parameter directly, but in the lifespan (lower bound) of an object that the parameter references. We can use the `CLARefObj2` `struct` we defined earlier for this example:

```cpp
struct CLARefObj2 {
    CLARefObj2(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(99)"), float* fl_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(42)"))
        : m_i_ptr(i_ptr), m_fl_ptr(fl_ptr) {}

    int* m_i_ptr MSE_ATTR_STR("mse::lifetime_label(99)");
    float* m_fl_ptr MSE_ATTR_STR("mse::lifetime_label(42)");
} MSE_ATTR_STR("mse::lifetime_labels(99, 42)");

float* foo2(const CLARefObj2& la_ref_obj_cref MSE_ATTR_PARAM_STR("mse::lifetime_labels(42 [421, 422])"))
    MSE_ATTR_FUNC_STR("mse::lifetime_notes{ labels(42, 421, 422); return_value(422) }")
{
    return la_ref_obj_cref.m_fl_ptr;
}
```

Notice the annotation for the `foo2()` function's parameter, `mse::lifetime_labels(42 [421, 422])`. In this case, label `42` is associated with the lifespan of the object (of type `CLARefObj2`) referenced by the native reference argument, label `421` is associated with the lifespan of the first object (of type `int` in this case) referenced by that object (aka the object's first "sublifetime"), and label `422` is associated with the lifespan of the second object (of type `float` in this case) referenced by that object. In the `mse::lifetime_labels()` annotation, we can use commas and (possibly nested) square brackets to create a tree of lifetime labels that corresponds to the tree of lifespan values of the argument object.
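To illustrate the label tree a little further, here is a minimal sketch of a companion to `foo2()` that returns the `int*` member instead, so its return value is tied to the first sublifetime (`421`) rather than the second. The names `foo3()` and `sublifetime_demo()` are hypothetical, and the no-op macro definitions are stand-ins (an assumption) so that the snippet compiles without the SaferCPlusPlus headers. In real code the macros come from the library, and it is scpptool, not the compiler, that checks the annotations.

```cpp
#include <cassert>

/* Stand-in definitions (assumption): in real code these macros are provided by the
SaferCPlusPlus library. Here they expand to nothing so the sketch is self-contained. */
#ifndef MSE_ATTR_STR
#define MSE_ATTR_STR(x)
#endif
#ifndef MSE_ATTR_PARAM_STR
#define MSE_ATTR_PARAM_STR(x)
#endif
#ifndef MSE_ATTR_FUNC_STR
#define MSE_ATTR_FUNC_STR(x)
#endif

struct CLARefObj2 {
    CLARefObj2(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(99)"), float* fl_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(42)"))
        : m_i_ptr(i_ptr), m_fl_ptr(fl_ptr) {}

    int* m_i_ptr MSE_ATTR_STR("mse::lifetime_label(99)");
    float* m_fl_ptr MSE_ATTR_STR("mse::lifetime_label(42)");
} MSE_ATTR_STR("mse::lifetime_labels(99, 42)");

/* Hypothetical counterpart of foo2(): same label tree on the parameter, but the return
value is associated with label 421 (the first sublifetime) instead of 422. */
int* foo3(const CLARefObj2& la_ref_obj_cref MSE_ATTR_PARAM_STR("mse::lifetime_labels(42 [421, 422])"))
    MSE_ATTR_FUNC_STR("mse::lifetime_notes{ labels(42, 421, 422); return_value(421) }")
{
    return la_ref_obj_cref.m_i_ptr;
}

/* Plain run-time check that foo3() hands back the first stored pointer. */
void sublifetime_demo() {
    int i = 3;
    float f = 1.5f;
    CLARefObj2 obj(&i, &f);
    assert(foo3(obj) == &i);
}
```

The run-time behavior is trivial; the point is only the position of `421` inside the square brackets, which selects the first entry of the sublifetime tree. How scpptool reports on this exact snippet has not been verified here.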

##### Future syntax options

This syntax for addressing sublifetimes might be considered a little messy (and maybe error prone), but it results from the fact that, in the source text, our annotations are placed after the declarations rather than directly after the types they might correspond to. This is, in part, an artifact of a historical limitation in one of the libraries the tool uses. In the future the tool may support placing the lifetime label annotations directly after the type.

##### More lifetime constraints

In the [first example](#annotating-function-interfaces) we saw a straightforward use of the `encompasses` lifetime constraint. Now let's look at a slightly more advanced example where we demonstrate two ways of annotating a `swap` function:

```cpp
template<typename T>
void swap1(T& item1 MSE_ATTR_PARAM_STR("mse::lifetime_label(42 [alias_421$])"), T& item2 MSE_ATTR_PARAM_STR("mse::lifetime_label(99 [alias_991$])"))
    MSE_ATTR_FUNC_STR("mse::lifetime_set_alias_from_template_parameter_by_name(T, alias_421$)")
    MSE_ATTR_FUNC_STR("mse::lifetime_set_alias_from_template_parameter_by_name(T, alias_991$)")
    MSE_ATTR_FUNC_STR("mse::lifetime_labels(42, 99, alias_421$, alias_991$)")
    MSE_ATTR_FUNC_STR("mse::lifetime_notes{ encompasses(alias_421$, alias_991$); encompasses(alias_991$, alias_421$) }")
{
    MSE_SUPPRESS_CHECK_IN_XSCOPE std::swap(item1, item2);
}

template<typename T>
void swap2(T& item1 MSE_ATTR_PARAM_STR("mse::lifetime_label(42)"), T& item2 MSE_ATTR_PARAM_STR("mse::lifetime_label(99)"))
    MSE_ATTR_FUNC_STR("mse::lifetime_labels(42, 99)")
    MSE_ATTR_FUNC_STR("mse::lifetime_notes{ first_can_be_assigned_to_second(42, 99); first_can_be_assigned_to_second(99, 42) }")
{
    MSE_SUPPRESS_CHECK_IN_XSCOPE std::swap(item1, item2);
}
```

Note that the `swap` functions take (raw) reference parameters.
Determining whether or not the argument values are safe to swap does not depend on the "direct" lifetimes of the arguments, but rather on the lifetimes (lower bounds) of any objects that the arguments might refer to (aka the "sublifetimes"). So for example, swapping the value of two `int`s is always safe because `int`s don't hold any references to any other objects (i.e. `int`s have no "sublifetimes"). Swapping the value of two pointers, on the other hand, is not always safe. We can ensure that swapping the value of two pointers is safe if we can ensure that the lifetimes (lower bounds) of the objects the pointers reference (aka the "sublifetimes") are the same.

So in our annotation of the `swap1()` function we assign "lifetime set alias" labels to the sublifetimes (if any) of the arguments. Then we apply the `encompasses` constraint in both directions to ensure that none of the sublifetimes of one argument outlives the corresponding sublifetime of the other.

In the annotation of the `swap2()` function, we introduce the `first_can_be_assigned_to_second` constraint to make the annotation a little cleaner. The `first_can_be_assigned_to_second` constraint is similar to the `encompasses` constraint except that it ignores the "direct" lifetime (lower bound) and only constrains the sublifetimes (if any). Using the `first_can_be_assigned_to_second` constraint eliminates the need to assign lifetime (set alias) labels to the argument sublifetimes.

##### Lifetime elision

In certain cases we add implicit ("elided") lifetime annotations to function interfaces, generally as described in the "Lifetime elision" section of the "[[RFC] Lifetime annotations for C++](https://discourse.llvm.org/t/rfc-lifetime-annotations-for-c/61377)" document.
Quoting from that document:

> As in Rust, to avoid unnecessary annotation clutter, we allow lifetime annotations to be elided (omitted) from a function signature when they conform to certain regular patterns. Lifetime elision is merely a shorthand for these regular lifetime patterns. Elided lifetimes are treated exactly as if they had been spelled out explicitly; in particular, they are subject to lifetime verification, so they are just as safe as explicitly annotated lifetimes.
>
> We propose to use the same rules as in Rust, as these transfer naturally to C++. We call lifetimes on parameters *input lifetimes* and lifetimes on return values *output lifetimes*. (Note that all lifetimes on parameters are called input lifetimes, even if those parameters are output parameters.) Here are the rules:
>
> 1. Each input lifetime that is elided (i.e., not stated explicitly) becomes a distinct lifetime.
> 2. If there is exactly one input lifetime (whether stated explicitly or elided), that lifetime is assigned to all elided output lifetimes.
> 3. If there are multiple input lifetimes but one of them applies to the implicit `this` parameter, that lifetime is assigned to all elided output lifetimes.

Note that we add a little qualification and modification to the rules as quoted. Specifically, we will use an input lifetime that has sublifetimes, but we only apply the input lifetime to the return value if the (sub)lifetime (tree) structures of the input parameter and return types match. Alternatively, if the return type's lifetime structure matches the input parameter's with one level of indirection removed (as might happen, for example, when the input parameter is passed by reference but the corresponding return value is not returned by reference), then we will use the input lifetime with the first level of indirection removed, so that it matches the lifetime of the return type.
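To make rule 2 concrete, here is a minimal sketch contrasting an unannotated function with an explicitly annotated one. Under the quoted rules, the single elided input lifetime of `pass_through_elided()` would be assigned to its return value, which is what the explicit annotation on `pass_through_explicit()` spells out. All three function names are hypothetical, and the no-op macros are stand-ins (an assumption) for the ones the SaferCPlusPlus library provides. Whether scpptool elides in exactly this way for raw pointers should be confirmed against the tool itself.

```cpp
#include <cassert>

/* Stand-in definitions (assumption): in real code these macros are provided by the
SaferCPlusPlus library. Here they expand to nothing so the sketch is self-contained. */
#ifndef MSE_ATTR_PARAM_STR
#define MSE_ATTR_PARAM_STR(x)
#endif
#ifndef MSE_ATTR_FUNC_STR
#define MSE_ATTR_FUNC_STR(x)
#endif

/* No explicit annotations. There is exactly one input lifetime (that of the pointer
target), so per rule 2 it would be assigned to the elided output lifetime. */
int* pass_through_elided(int* i_ptr) {
    return i_ptr;
}

/* The same interface with the lifetime relationship spelled out explicitly. */
int* pass_through_explicit(int* i_ptr MSE_ATTR_PARAM_STR("mse::lifetime_label(42)"))
    MSE_ATTR_FUNC_STR("mse::lifetime_notes{ label(42); return_value(42) }")
{
    return i_ptr;
}

/* At run time the two functions behave identically; only the annotations differ. */
void elision_demo() {
    int i1 = 3;
    assert(pass_through_elided(&i1) == pass_through_explicit(&i1));
}
```

The two functions are interchangeable as far as the compiler is concerned. The annotations only matter to the analyzer, which is why elision can remove them without changing the generated code.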

##### Lifetime annotation implementation caveats

(Note that, at the time of writing, implementation of lifetime safety enforcement is not complete. Safety can be subverted through, for example, cyclic references with user-defined destructors, or using a type-erased function container to neutralize restrictions specified by lifetime annotations, etc. Also note that while the most essential elements are already available, the process of adding lifetime annotated elements to the SaferCPlusPlus library is still in progress.)

##### Third party lifetime annotations

Note that other static lifetime analyzers in development introduce their own distinct lifetime annotations (including the lifetime profile checker and [others](https://discourse.llvm.org/t/rfc-lifetime-annotations-for-c/61377)). Those analyzers may not recognize the lifetime annotations introduced here, so to be compliant with those analyzers you may have to use their lifetime annotations as well. Ideally, in the future scpptool would also support those lifetime annotations, reducing the need for redundant annotations.

#### Lifetime annotated elements in the SaferCPlusPlus library

While the most essential elements are already available, the process of adding lifetime annotated elements to the SaferCPlusPlus library is still in progress.

Since lifetime annotations require scpptool for enforcement, lifetime annotated elements generally reside in the `mse::rsv` namespace, like the other elements that require scpptool for safety enforcement. Most lifetime annotated elements are "[scope](https://github.com/duneroadrunner/SaferCPlusPlus/blob/master/README.md#scope-pointers)" elements and conform to the corresponding restrictions.

##### TXSLTAPointer

`rsv::TXSLTAPointer<>` is just a (zero-overhead) class that acts like a [lifetime annotated](#annotating-lifetime-constraints) pointer.
Like raw pointers, `rsv::TXSLTAPointer<>` is considered a [scope](https://github.com/duneroadrunner/SaferCPlusPlus/blob/master/README.md#scope-pointers) object and is subject to the restrictions of scope objects.

Also note that (though we don't use them in the example) a couple of (provisional) shorter aliases are defined:

- `xl_ptr<>` is an alias for `TXSLTAPointer<>`
- `xl_const_ptr<>` is an alias for `TXSLTAConstPointer<>`

usage example: ([link to interactive version](https://godbolt.org/z/hnb6n1Mh6))

```cpp
#include "mseslta.h"

int main(int argc, char* argv[]) {
    int i1 = 3;
    int i2 = 5;
    int i3 = 7;
    /* The (lower bound) lifetime associated with an rsv::TXSLTAPointer<> is set to the lifetime of
    its initialization value. */
    auto ilaptr1 = mse::rsv::TXSLTAPointer<int>{ &i1 };
    auto ilaptr2 = mse::rsv::TXSLTAPointer<int>{ &i2 };

    /* The (lower bound) lifetime associated with ilaptr2 does not outlive the one associated with
    ilaptr1, so assigning the value of ilaptr2 to ilaptr1 cannot be verified to be safe. */
    //ilaptr1 = ilaptr2;    // scpptool would complain
    ilaptr2 = ilaptr1;
    ilaptr2 = &i1;
    //ilaptr2 = &i3;    // scpptool would complain
}
```

##### TXSLTAOwnerPointer

`rsv::TXSLTAOwnerPointer<>` is a ["lifetime annotated"](#annotating-lifetime-constraints) version of [`TXScopeOwnerPointer<>`](https://github.com/duneroadrunner/SaferCPlusPlus/blob/master/README.md#txscopeownerpointer).

`rsv::TXSLTAOwnerPointer<>` is kind of like a `std::unique_ptr<>` whose use is restricted by the rules of scope objects. You can use it when you want to give scope lifetime to objects that are too large to be declared directly on the stack.

Instead of its constructor taking a native pointer pointing to an already allocated object, it takes an (often temporary) instance of the desired value and allocates the object itself. You may also use `rsv::make_xslta_owner<>()` to create a `TXSLTAOwnerPointer<>` in a manner akin to `std::make_unique<>()`.

##### xslta_array

`rsv::xslta_array<>` is a [lifetime annotated](#annotating-lifetime-constraints) array.

usage example: ([link to interactive version](https://godbolt.org/z/ffe3vExTM))

```cpp
#include <iostream> // for std::cout
#include "mseslta.h"
#include "msemsearray.h"
#include "msealgorithm.h"

int main(int argc, char* argv[]) {
    int i1 = 3;
    int i2 = 5;
    int i3 = 7;

    /* The lifetime (lower bound) associated with the rsv::xslta_array<>, and each of its
    contained elements, is the lower bound of all of the lifetimes of the elements in the initializer
    list. */
    auto arr2 = mse::rsv::xslta_array<mse::rsv::TXSLTAPointer<int>, 2>{ &i1, &i2 };
    auto ilaptr3 = arr2.front();
    //ilaptr3 = &i3;    // scpptool would complain
    ilaptr3 = &i1;

    /* Note that although the initializer list used in the declaration of arr3 is different than the
    initializer list used for arr2, the lower bound of the lifetimes of both initializer lists is
    the same. */
    auto arr3 = mse::rsv::xslta_array<mse::rsv::TXSLTAPointer<int>, 2>{ &i2, &i2 };

    /* Since the (lower bound) lifetime values of arr2 and arr3 are the same, their values can be
    safely swapped. */
    std::swap(arr2, arr3);
    arr2.swap(arr3);

    /* The lower bound lifetime of arr4's initializer list is not the same as that of arr2. */
    auto arr4 = mse::rsv::xslta_array<mse::rsv::TXSLTAPointer<int>, 2>{ &i3, &i1 };

    /* Since the (lower bound) lifetime values of arr2 and arr4 are not the same, their values
    cannot be safely swapped. */
    //std::swap(arr2, arr4);    // scpptool would complain
    //arr2.swap(arr4);    // scpptool would complain

    {
        /* The standard iterator operations. */
        auto xslta_iter1 = std::begin(arr2);
        auto xslta_iter2 = std::end(arr2);
        //xslta_iter1[0] = &i3;    // scpptool would complain
        xslta_iter1[0] = &i1;

        auto xslta_citer3 = std::cbegin(arr2);
        xslta_citer3 = xslta_iter1;
        xslta_citer3 = std::cbegin(arr2);
        xslta_citer3 += 1;
        auto res1 = *(*xslta_citer3);
        auto res2 = *(xslta_citer3[0]);

        std::cout << "\n";
        for (auto xslta_iter5 = xslta_iter1; xslta_iter2 != xslta_iter5; ++xslta_iter5) {
            std::cout << *(*xslta_iter5) << " ";
        }
        std::cout << "\n";
    }
    {
        /* The same iterator operations using the SaferCPlusPlus library's make_*_iterator() functions. */
        auto arr2_xsltaptr = mse::rsv::TXSLTAPointer<decltype(arr2)>{ &arr2 };
        auto xslta_iter1 = mse::rsv::make_xslta_begin_iterator(arr2_xsltaptr);
        auto xslta_iter2 = mse::rsv::make_xslta_end_iterator(arr2_xsltaptr);
        //xslta_iter1[0] = &i3;    // scpptool would complain
        xslta_iter1[0] = &i1;

        auto xslta_citer3 = mse::rsv::make_xslta_begin_const_iterator(arr2_xsltaptr);
        xslta_citer3 = xslta_iter1;
        xslta_citer3 = mse::rsv::make_xslta_begin_const_iterator(arr2_xsltaptr);
        xslta_citer3 += 1;
        auto res1 = *(*xslta_citer3);
        auto res2 = *(xslta_citer3[0]);

        std::cout << "\n";
        mse::for_each_ptr(xslta_iter1, xslta_iter2, [](auto item_ptr) {
            std::cout << *(*item_ptr) << " ";
        });
        std::cout << "\n";
    }
}
```

##### xslta_vector, xslta_fixed_vector, xslta_borrowing_fixed_vector

`rsv::xslta_vector<>` is a [lifetime annotated](#annotating-lifetime-constraints) vector.
`rsv::xslta_fixed_vector<>` is a lifetime annotated vector of a fixed size determined at construction.
`rsv::xslta_borrowing_fixed_vector<>` is a lifetime annotated vector of fixed size that (exclusively) "borrows" the contents of the specified `rsv::xslta_vector<>`.

"Dynamic containers", such as vectors, are containers whose (content's) size/structure/location can be changed during a program's execution. scpptool does not "support" direct raw references or pointers to elements of dynamic containers. So, for example, unlike their standard library counterparts, the element access operators and methods of `rsv::xslta_vector<>` do not return raw references. They instead return a "proxy reference" object that can (safely) act like a reference in some situations. But the preferred method of accessing elements of a dynamic container is via the dynamic container's "borrowing fixed" counterpart.
These "borrowing fixed" counterparts, while they exist, (exclusively) borrow (access to) the contents of the specified dynamic container and ensure that the content's size/structure/location is not changed. scpptool does support raw references and pointers to elements of "borrowing fixed" containers. (Just as with "regular non-borrowing" "fixed" containers.) 628 | 629 | Some other static safety enforcers/analyzers try to automatically and implicitly put vectors (and other dynamic containers) into a "fixed (size/structure) mode" without requiring the programmer to instantiate a "borrowing fixed" object. But such tools rely on "flow (or path) sensitive" analysis, which [arguably](#flow-insensitive-analysis) has undesirable scalability implications. 630 | 631 | Also note that (though we don't use them in the example) some (provisional) shorter aliases are defined: 632 | `xl_bf_vector<>` is an alias for `xslta_borrowing_fixed_vector<>` 633 | `make_xl_bf_vector()` is an alias for `make_xslta_borrowing_fixed_vector()` 634 | 635 | usage example: ([link to interactive version](https://godbolt.org/z/43aPqMMd1)) 636 | 637 | ```cpp 638 | #include "mseslta.h" 639 | #include "msemsevector.h" 640 | #include "msealgorithm.h" 641 | 642 | int main(int argc, char* argv[]) { 643 | int i1 = 3; 644 | int i2 = 5; 645 | int i3 = 7; 646 | 647 | /* The lifetime (lower bound) associated with the rsv::xslta_vector<>, and each of its 648 | contained elements, is the lower bound of all of the lifetimes of the elements in the initializer 649 | list. */ 650 | auto vec2 = mse::rsv::xslta_vector >{ &i1, &i2 }; 651 | 652 | /* Note that although the initializer list used in the declaration of vec3 is different than the 653 | initializer list used for vec2, the lower bound of the lifetimes of both initializer lists is 654 | the same. 
*/ 655 | auto vec3 = mse::rsv::xslta_vector >{ &i2, &i2 }; 656 | std::swap(vec2, vec3); 657 | vec2.swap(vec3); 658 | 659 | auto vec4 = mse::rsv::xslta_vector >{ &i3, &i1 }; 660 | /* Since the (lower bound) lifetime values of vec2 and vec4 are not the same, their values 661 | cannot be safely swapped.*/ 662 | //std::swap(vec2, vec4); // scpptool would complain 663 | //vec2.swap(vec4); // scpptool would complain 664 | 665 | /* Even when you want to construct an empty rsv::xslta_vector<>, if the element type has an annotated 666 | lifetime, you would still need to provide (a reference to) an initialization element object from which 667 | a lower bound lifetime can be inferred. You could just initialize the vector with a (non-empty) 668 | initializer list, then clear() the vector. Alternatively, you can pass mse::nullopt as the first 669 | constructor parameter, in which case the lower bound lifetime will be inferred from the second 670 | (otherwise unused) parameter. */ 671 | auto vec5 = mse::rsv::xslta_vector >(mse::nullopt, &i2); 672 | assert(0 == vec5.size()); 673 | //auto vec6 = mse::rsv::xslta_vector >{}; // scpptool would complain 674 | auto vec7 = mse::rsv::xslta_vector{}; // fine, the element type does not have an annotated lifetime 675 | 676 | { 677 | /* The preferred way of accessing the contents of an rsv::xslta_vector<> is via an associated 678 | rsv::xslta_borrowing_fixed_vector<> (which, while it exists, "borrows" exclusive access to the 679 | contents of the given vector and (efficiently) prevents any of the elements from being 680 | removed or relocated). 
*/ 681 | auto bf_vec2a = mse::rsv::make_xslta_borrowing_fixed_vector(&vec2); 682 | // or 683 | //auto bf_vec2a = mse::rsv::xslta_borrowing_fixed_vector(&vec2); 684 | 685 | auto& elem_ref1 = bf_vec2a[0]; 686 | elem_ref1 = &i1; 687 | //elem_ref1 = &i3; // scpptool would complain (because i3 does not live long enough) 688 | 689 | /* rsv::xslta_borrowing_fixed_vector<> has an interface largely similar to rsv::xslta_fixed_vector<>, 690 | which is essentially similar to that of std::vector<>, but without any of the methods or operators 691 | that could resize (or relocate the contents of) the container. */ 692 | } 693 | 694 | /* While not the preferred method, rsv::xslta_vector<> does (currently) have limited support for accessing 695 | its elements (pseudo-)directly. */ 696 | 697 | /* mse::rsv::xslta_vector<> element access operators and methods (like front()) do not return a raw 698 | reference. They return a "proxy reference" object that (while it exists, prevents the vector from being 699 | resized, etc. and) behaves like a (raw) reference in some situations. For example, like a reference, it 700 | can be cast to the element type. */ 701 | typename decltype(vec2)::value_type ilaptr3 = vec2.front(); 702 | ilaptr3 = &i1; 703 | //ilaptr3 = &i3; // scpptool would complain (because i3 does not live long enough) 704 | 705 | /* The returned "proxy reference" object also has limited support for assignment operations. */ 706 | vec2.front() = &i1; 707 | //vec2.front() = &i3; // scpptool would complain (because i3 does not live long enough) 708 | 709 | /* Note that these returned "proxy reference" objects are designed to be used as temporary (rvalue) objects, 710 | not as (lvalue) declared variables or stored objects. */ 711 | 712 | { 713 | /* Like the element access operators and methods, dereferencing operations of rsv::xslta_vector<>'s 714 | iterators return "proxy reference" objects rather than raw references. 
*/ 715 | auto xslta_iter1 = std::begin(vec2); 716 | auto xslta_iter2 = mse::rsv::make_xslta_end_iterator(&vec2); 717 | *xslta_iter1 = &i1; 718 | 719 | /* rsv::xslta_vector<>'s iterators can be used to specify an insertion or erasure position in the standard way. */ 720 | vec2.insert(xslta_iter1, &i1); 721 | vec2.erase(xslta_iter1); 722 | } 723 | { 724 | /* rsv::xslta_fixed_vector<> is a (lifetime annotated) vector that doesn't support any operations that 725 | could resize the vector or move its contents (subsequent to initialization). It can be initialized from 726 | an rsv::xslta_vector<>. */ 727 | auto f_vec1 = mse::rsv::xslta_fixed_vector(vec2); 728 | } 729 | } 730 | ``` 731 | 732 | ##### xslta_accessing_fixed_vector 733 | 734 | Note that the [`rsv::xslta_borrowing_fixed_vector<>`](#xslta_vector-xslta_fixed_vector-xslta_borrowing_fixed_vector) described above can only be obtained from a non-`const` pointer to the lending vector. In situations where only a `const` pointer to the vector is available, `rsv::xslta_borrowing_fixed_vector<>` has a counterpart, `rsv::xslta_accessing_fixed_vector<>`, which can be obtained from a `const` pointer to a supported vector (such as [`rsv::xslta_vector<>`](#xslta_vector-xslta_fixed_vector-xslta_borrowing_fixed_vector)). 735 | 736 | `rsv::xslta_accessing_fixed_vector<>`, like `rsv::xslta_borrowing_fixed_vector<>`, ensures, while it exists, that the vector contents are not deallocated/relocated/resized. But unlike `rsv::xslta_borrowing_fixed_vector<>`, `rsv::xslta_accessing_fixed_vector<>`'s access to the vector contents is not exclusive. So, for example, multiple `rsv::xslta_accessing_fixed_vector<>`s corresponding to the same vector can exist and be used at the same time. This lack of exclusivity results in `rsv::xslta_accessing_fixed_vector<>` being branded as ineligible to be passed to or shared with asynchronous threads. 

usage example: ([link to interactive version](https://godbolt.org/z/hzMc4rcha))

```cpp
#include "mseslta.h"
#include "msemsevector.h"

int main(int argc, char* argv[]) {
    int i1 = 3;
    int i2 = 5;
    int i3 = 7;

    /* The lifetime (lower bound) associated with the rsv::xslta_vector<>, and each of its
    contained elements, is the lower bound of all of the lifetimes of the elements in the initializer
    list. */
    auto vec2 = mse::rsv::xslta_vector<mse::rsv::TXSLTAPointer<int> >{ &i1, &i2 };

    {
        auto const& vec2_cref = vec2;

        /* Obtaining an rsv::xslta_borrowing_fixed_vector<> requires a non-const pointer to the lending vector.
        When only a const pointer is available we can instead use rsv::xslta_accessing_fixed_vector<> for supported vector types. */
        auto af_vec2a = mse::rsv::make_xslta_accessing_fixed_vector(&vec2_cref);
        // or
        //auto af_vec2a = mse::rsv::xslta_accessing_fixed_vector(&vec2_cref);

        auto& elem_ref1 = af_vec2a[0];
        int i4 = *elem_ref1;
    }
}
```

##### TXSLTARandomAccessSection, TXSLTARandomAccessConstSection

`rsv::TXSLTARandomAccessSection<>` is a lifetime annotated [`TXScopeRandomAccessSection<>`](https://github.com/duneroadrunner/SaferCPlusPlus/blob/master/README.md#txscoperandomaccesssection-txscoperandomaccessconstsection-trandomaccesssection-trandomaccessconstsection).

A "random access section" is basically a convenient interface to access a (contiguous) subsection of an existing array or vector. You construct one, using the `make_xslta_random_access_section(...)` functions, by specifying an iterator to the start of the section and the length of the section.
Random access sections support most of the member functions and operators that [std::basic_string_view](http://en.cppreference.com/w/cpp/string/basic_string_view) does, except that the "[substr()](http://en.cppreference.com/w/cpp/string/basic_string_view/substr)" member function is named "xslta_subsection()". 774 | 775 | Note that for convenience, random access sections can be constructed from just a pointer or reference to a supported container object. Also note that these `TXSLTARandomAccessSection<>`s are not the library elements that most directly correspond to `std::span<>`s, as they are "tied" to the iterator type of their template argument. [`TXSLTACSSSXSTERandomAccessSection<>`](#txsltacsssxsterandomaccessiterator-and-txsltacsssxsterandomaccesssection)s more closely correspond to `std::span<>`s. 776 | 777 | usage example: ([link to interactive version](https://godbolt.org/z/f788s7q8v)) 778 | 779 | ```cpp 780 | #include "mseslta.h" 781 | #include "msemsearray.h" //random access sections are defined in this file 782 | #include "msemsevector.h" 783 | 784 | class J { 785 | public: 786 | /* Remember that these functions will have implicit/elided lifetime annotations applied to their parameters. 
*/ 787 | template 788 | static void foo13lta(_TRASection ra_section) { 789 | for (typename _TRASection::size_type i = 0; i < ra_section.size(); i += 1) { 790 | ra_section[i] = ra_section[0]; 791 | } 792 | } 793 | template 794 | static int foo14lta(_TRAConstSection const_ra_section) { 795 | int retval = 0; 796 | for (typename _TRAConstSection::size_type i = 0; i < const_ra_section.size(); i += 1) { 797 | retval += *(const_ra_section[i]); 798 | } 799 | return retval; 800 | } 801 | template 802 | static int foo15lta(_TRAConstSection const_ra_section) { 803 | int retval = 0; 804 | for (const auto& const_item : const_ra_section) { 805 | retval += *const_item; 806 | } 807 | return retval; 808 | } 809 | }; 810 | 811 | int main(int argc, char* argv[]) { 812 | int i1 = 3; 813 | int i2 = 5; 814 | int i3 = 7; 815 | int i4 = 11; 816 | int i5 = 13; 817 | 818 | mse::rsv::xslta_array, 4> array1{ &i1, &i2, &i3, &i4 }; 819 | mse::rsv::xslta_vector > vec1{ &i1, &i2, &i3, &i4, &i5 }; 820 | auto bfvec1 = mse::rsv::make_xslta_borrowing_fixed_vector(&vec1); 821 | 822 | auto xslta_ra_section1 = mse::rsv::make_xslta_random_access_section(array1.begin(), 2); 823 | J::foo13lta(xslta_ra_section1); 824 | 825 | auto xslta_ra_const_section2 = mse::rsv::make_xslta_random_access_const_section(++bfvec1.begin(), 3); 826 | auto res6 = J::foo15lta(xslta_ra_const_section2); 827 | auto res7 = J::foo14lta(xslta_ra_const_section2); 828 | 829 | auto xslta_ra_section1_xslta_iter1 = xslta_ra_section1.begin(); 830 | auto xslta_ra_section1_xslta_iter2 = xslta_ra_section1.end(); 831 | auto xslta_ra_section1_xslta_iter1b = xslta_ra_section1.xslta_begin(); 832 | auto xslta_ra_section1_xslta_iter2b = xslta_ra_section1.xslta_end(); 833 | auto res8 = xslta_ra_section1_xslta_iter2 - xslta_ra_section1_xslta_iter1; 834 | bool res9 = (xslta_ra_section1_xslta_iter1 < xslta_ra_section1_xslta_iter2); 835 | 836 | /* The library provides the rsv::xslta_random_access_subsection() function which takes a random access section 
and a 837 | tuple containing a start index and a length and returns a random access section spanning the indicated 838 | subsection. You could use this function to implement the equivalent of a "first_half()" function like so: */ 839 | auto xslta_ra_section3 = mse::rsv::xslta_random_access_subsection(xslta_ra_section1, std::make_tuple(0, xslta_ra_section1.length() / 2)); 840 | assert(xslta_ra_section3.length() == 1); 841 | 842 | { 843 | auto vector1 = mse::rsv::xslta_fixed_vector >{ &i1, &i2, &i3 }; 844 | auto xslta_ra_csection1 = mse::rsv::make_xslta_random_access_const_section(&vector1); 845 | 846 | typedef decltype(xslta_ra_csection1) xslta_ra_csection1_t; 847 | /* You can (also) construct a TXSLTARandomAccessSection<> by passing (a reference to) the container, or a 848 | pointer to the container. (The former is for conformance with the interface of std::span<>, etc..) */ 849 | auto xslta_ra_csection1b = xslta_ra_csection1_t(vector1); 850 | auto xslta_ra_csection1c = xslta_ra_csection1_t(&vector1); 851 | 852 | class CD { 853 | public: 854 | static bool second_is_longer(xslta_ra_csection1_t xslta_ra_csection1, xslta_ra_csection1_t xslta_ra_csection2) { 855 | 856 | return (xslta_ra_csection1.size() > xslta_ra_csection2.size()) ? false : true; 857 | } 858 | 859 | /* Here we will demonstrate the creation of type-erased random access sections by instantiation with 860 | type-erased iterators. */ 861 | static bool second_is_longer_CSSSXSTE(mse::rsv::TXSLTARandomAccessConstSection > > xslta_ra_csection1 862 | , mse::rsv::TXSLTARandomAccessConstSection > > xslta_ra_csection2) { 863 | return (xslta_ra_csection1.size() > xslta_ra_csection2.size()) ? 
false : true; 864 | } 865 | }; 866 | 867 | auto res1 = CD::second_is_longer(xslta_ra_csection1, mse::rsv::make_xslta_random_access_const_section( 868 | mse::rsv::xslta_fixed_vector >{ &i1, & i2, & i3, & i4 })); 869 | 870 | auto res2 = CD::second_is_longer_CSSSXSTE(mse::rsv::make_xslta_random_access_const_section(mse::rsv::xslta_array, 3>{ &i1, & i2, & i3 }) 871 | , mse::rsv::make_xslta_random_access_const_section(mse::rsv::xslta_fixed_vector >{ &i1, & i2, & i3, & i4 })); 872 | } 873 | } 874 | ``` 875 | 876 | See also [`TXSLTACSSSXSTERandomAccessSection`](#txsltacsssxsterandomaccessiterator-and-txsltacsssxsterandomaccesssection). 877 | 878 | ##### TXSLTACSSSXSTERandomAccessIterator and TXSLTACSSSXSTERandomAccessSection 879 | 880 | `rsv::TXSLTACSSSXSTERandomAccessIterator<>` and `rsv::TXSLTACSSSXSTERandomAccessSection<>` are "type-erased" template classes that can be used to enable functions to take as arguments iterators or sections of various container types (like arrays or (fixed size) vectors) without making the functions into template functions. But in this case there are limitations on what types can be converted. In exchange for these limitations, these types require less overhead. The "CSSSXSTE" part of the typenames stands for "Contiguous Sequence, Static Structure, XSLTA, Type-Erased". So the first restriction is that the target container must be recognized as a "contiguous sequence" (basically an array or vector). It also must be recognized as having a "static structure". This essentially means that the container cannot be resized. (At least not while the `rsv::TXSLTACSSSXSTERandomAccessIterator<>` or `rsv::TXSLTACSSSXSTERandomAccessSection<>` exists.) And these iterators and sections are ["lifetime annotated"](#annotating-lifetime-constraints). 881 | 882 | `rsv::TXSLTACSSSXSTERandomAccessSection<>` might be considered, in essence, the primary safe counterpart of `std::span<>`. 
As such, (though we don't use them in the example) a couple of (provisional) shorter aliases are defined: 883 | `xl_span<>` is an alias for `TXSLTACSSSXSTERandomAccessSection<>` 884 | `xl_const_span<>` is an alias for `TXSLTACSSSXSTERandomAccessConstSection<>` 885 | 886 | usage example: ([link to interactive version](https://godbolt.org/z/E7ev9aYE6)) 887 | 888 | ```cpp 889 | #include "mseslta.h" 890 | #include "msemsearray.h" //TXSLTACSSSXSTERandomAccessIterator/Section are defined in this header 891 | #include "msemsevector.h" 892 | 893 | int main(int argc, char* argv[]) { 894 | int i1 = 3; 895 | int i2 = 5; 896 | int i3 = 7; 897 | int i4 = 11; 898 | int i5 = 13; 899 | 900 | auto array1 = mse::rsv::xslta_array, 4>{ &i1, &i2, &i3, &i4 }; 901 | auto array2 = mse::rsv::xslta_array, 5>{ &i1, &i2, &i3, &i4, &i5 }; 902 | auto vec1 = mse::rsv::xslta_vector>{ &i1, &i2, &i3, &i4, &i5 }; 903 | auto bfvec1 = mse::rsv::make_xslta_borrowing_fixed_vector(&vec1); 904 | class B { 905 | public: 906 | /* Remember that these functions will have implicit/elided lifetime annotations applied to their 907 | parameters (that don't have explicit annotations). 
*/ 908 | static void foo1(mse::rsv::TXSLTACSSSXSTERandomAccessIterator> ra_iter1) { 909 | ra_iter1[1] = ra_iter1[2]; 910 | } 911 | static mse::rsv::TXSLTAPointer foo2(mse::rsv::TXSLTACSSSXSTERandomAccessConstIterator> const_ra_iter1 MSE_ATTR_PARAM_STR("mse::lifetime_labels(_[99])")) 912 | MSE_ATTR_FUNC_STR("mse::lifetime_notes{ label(99); return_value(99) }") { 913 | 914 | const_ra_iter1 += 2; 915 | --const_ra_iter1; 916 | const_ra_iter1--; 917 | return const_ra_iter1[2]; 918 | } 919 | static void foo3(mse::rsv::TXSLTACSSSXSTERandomAccessSection> ra_section) { 920 | for (mse::rsv::TXSLTACSSSXSTERandomAccessSection>::size_type i = 0; i < ra_section.size(); i += 1) { 921 | ra_section[i] = ra_section[0]; 922 | } 923 | } 924 | static int foo4(mse::rsv::TXSLTACSSSXSTERandomAccessConstSection> const_ra_section) { 925 | int retval = 0; 926 | for (mse::rsv::TXSLTACSSSXSTERandomAccessSection>::size_type i = 0; i < const_ra_section.size(); i += 1) { 927 | retval += *(const_ra_section[i]); 928 | } 929 | return retval; 930 | } 931 | static int foo5(mse::rsv::TXSLTACSSSXSTERandomAccessConstSection> const_ra_section) { 932 | int retval = 0; 933 | for (const auto& const_item : const_ra_section) { 934 | retval += *(const_item); 935 | } 936 | return retval; 937 | } 938 | }; 939 | 940 | auto xs_array_iter1 = mse::rsv::make_xslta_begin_iterator(&array1); 941 | xs_array_iter1++; 942 | 943 | auto res1 = B::foo2(xs_array_iter1); 944 | B::foo1(xs_array_iter1); 945 | 946 | auto xs_msearray_const_iter2 = mse::rsv::make_xslta_begin_const_iterator(&array2); 947 | xs_msearray_const_iter2 += 2; 948 | auto res2 = B::foo2(xs_msearray_const_iter2); 949 | 950 | auto res3 = B::foo2(mse::rsv::make_xslta_begin_const_iterator(&bfvec1)); 951 | 952 | auto bfvec1_iter2 = ++mse::rsv::make_xslta_begin_iterator(&bfvec1); 953 | B::foo1(bfvec1_iter2); 954 | auto bfvec1_iter1 = mse::rsv::make_xslta_begin_iterator(&bfvec1); 955 | auto res4 = B::foo2(bfvec1_iter1); 956 | 957 | auto bfvec1_begin_iter1 = 
mse::rsv::make_xslta_begin_iterator(&bfvec1); 958 | mse::rsv::TXSLTACSSSXSTERandomAccessIterator> xs_te_iter1 = bfvec1_begin_iter1; 959 | auto bfvec1_end_iter1 = mse::rsv::make_xslta_end_iterator(&bfvec1); 960 | mse::rsv::TXSLTACSSSXSTERandomAccessIterator> xs_te_iter2 = bfvec1_end_iter1; 961 | auto res5 = xs_te_iter2 - xs_te_iter1; 962 | xs_te_iter2 = xs_te_iter1; 963 | 964 | mse::rsv::xslta_array, 4> array3 = mse::rsv::xslta_array, 4>({ &i1, &i2, &i3, &i4 }); 965 | auto array_scpiter3 = mse::rsv::make_xslta_begin_iterator(&array3); 966 | ++array_scpiter3; 967 | B::foo1(array_scpiter3); 968 | 969 | mse::rsv::TXSLTACSSSXSTERandomAccessSection> xscp_ra_section1(xs_array_iter1++, 2); 970 | B::foo3(xscp_ra_section1); 971 | auto xs_mstd_vec_iter1 = mse::rsv::make_xslta_begin_iterator(&bfvec1); 972 | mse::rsv::TXSLTACSSSXSTERandomAccessSection> xscp_ra_section2(++xs_mstd_vec_iter1, 3); 973 | 974 | auto res6 = B::foo5(xscp_ra_section2); 975 | B::foo3(xscp_ra_section2); 976 | auto res7 = B::foo4(xscp_ra_section2); 977 | 978 | auto xs_ra_section1 = mse::rsv::make_xslta_random_access_section(xs_array_iter1, 2); 979 | B::foo3(xs_ra_section1); 980 | auto xs_ra_const_section2 = mse::rsv::make_xslta_random_access_const_section(mse::rsv::make_xslta_begin_const_iterator(&bfvec1), 2); 981 | B::foo4(xs_ra_const_section2); 982 | 983 | auto nii_array4 = mse::rsv::xslta_array, 4>{ &i1, &i2, &i3, &i4 }; 984 | auto xscp_ra_section3 = mse::rsv::make_xslta_csssxste_random_access_section(&nii_array4); 985 | 986 | auto xscp_ra_section1_xscp_iter1 = mse::rsv::make_xslta_begin_iterator(xscp_ra_section1); 987 | auto xscp_ra_section1_xscp_iter2 = mse::rsv::make_xslta_end_iterator(xscp_ra_section1); 988 | auto res8 = xscp_ra_section1_xscp_iter2 - xscp_ra_section1_xscp_iter1; 989 | bool res9 = (xscp_ra_section1_xscp_iter1 < xscp_ra_section1_xscp_iter2); 990 | } 991 | ``` 992 | 993 | ##### xslta_optional, xslta_fixed_optional, xslta_borrowing_fixed_optional 994 | 995 | `rsv::xslta_optional<>` is a 
[lifetime annotated](#annotating-lifetime-constraints) optional. 996 | `rsv::xslta_fixed_optional<>` is a lifetime annotated optional whose empty/non-empty status is fixed and determined at construction. 997 | `rsv::xslta_borrowing_fixed_optional<>` is a lifetime annotated optional whose empty/non-empty status is fixed and (exclusively) "borrows" the contents of the specified `rsv::xslta_optional<>`. 998 | 999 | Conceptually, you can think of an optional as kind of like a [`vector<>`](#xslta_vector-xslta_fixed_vector-xslta_borrowing_fixed_vector) with at most one element. 1000 | 1001 | Also note that (though we don't use them in the example) some (provisional) shorter aliases are defined: 1002 | `xl_bf_optional<>` is an alias for `xslta_borrowing_fixed_optional<>` 1003 | `make_xl_bf_optional()` is an alias for `make_xslta_borrowing_fixed_optional()` 1004 | 1005 | usage example: ([link to interactive version](https://godbolt.org/z/z78W1xrqf)) 1006 | 1007 | ```cpp 1008 | #include "mseslta.h" 1009 | #include "mseoptional.h" 1010 | #include "msealgorithm.h" 1011 | 1012 | int main(int argc, char* argv[]) { 1013 | int i1 = 3; 1014 | int i2 = 5; 1015 | int i3 = 7; 1016 | auto ilaptr4 = mse::rsv::TXSLTAPointer{ &i2 }; 1017 | auto ilaptr5 = mse::rsv::TXSLTAPointer{ &i1 }; 1018 | 1019 | mse::rsv::xslta_optional > maybe_int_xsltaptr3(ilaptr4); 1020 | 1021 | /* Even when you want to construct an empty rsv::xslta_optional<>, if the element type has an annotated 1022 | lifetime, you would still need to provide (a reference to) an initialization element object from which 1023 | a lower bound lifetime can be inferred. You could just initialize the option with a value, then reset() 1024 | the rsv::xslta_optional<>. Alternatively, you can pass mse::nullopt as the first constructor parameter, 1025 | in which case the lower bound lifetime will be inferred from the second (otherwise unused) parameter. 
*/ 1026 | mse::rsv::xslta_optional > maybe_int_xsltaptr2(mse::nullopt, ilaptr4); 1027 | //mse::rsv::xslta_optional > maybe_int_xsltaptr; // scpptool would complain 1028 | mse::rsv::xslta_optional maybe_int; // fine, the element type does not have an annotated lifetime 1029 | 1030 | auto maybe_int_xsltaptr5 = mse::rsv::make_xslta_optional(mse::nullopt, ilaptr4); 1031 | auto maybe_int_xsltaptr6 = mse::rsv::make_xslta_optional(ilaptr4); 1032 | { 1033 | /* As with rsv::xslta_vector<>, the preferred way of accessing the contents of an rsv::xslta_optional<> 1034 | is via an associated rsv::xslta_borrowing_fixed_optional<> (which, while it exists, "borrows" exclusive 1035 | access to the contents of the given optional and (efficiently) prevents the element (if any) from being 1036 | removed). */ 1037 | auto bfmaybe_int_xsltaptr6 = mse::rsv::make_xslta_borrowing_fixed_optional(&maybe_int_xsltaptr6); 1038 | auto ilaptr26 = bfmaybe_int_xsltaptr6.value(); 1039 | std::swap(ilaptr26, ilaptr4); 1040 | 1041 | auto ilaptr7 = mse::rsv::TXSLTAPointer{ &i3 }; 1042 | auto ilaptr28 = bfmaybe_int_xsltaptr6.value_or(ilaptr7); 1043 | //std::swap(ilaptr28, ilaptr4); // scpptool would complain 1044 | std::swap(ilaptr7, ilaptr28); 1045 | } 1046 | 1047 | /* While not the preferred method, rsv::xslta_optional<> does (currently) have limited support for accessing 1048 | its element (pseudo-)directly. */ 1049 | 1050 | /* As with rsv::xslta_vector<>, rsv::xslta_optional<>'s non-const accessor methods and operators do not 1051 | return a raw reference. They return a "proxy reference" object that (while it exists, prevents the addition 1052 | or removal of a value and) behaves like a (raw) reference in some situations. For example, like a reference, 1053 | it can be cast to the element type. 
*/ 1054 | typename decltype(maybe_int_xsltaptr6)::value_type ilaptr6 = (maybe_int_xsltaptr6.value()); 1055 | ilaptr6 = &i2; 1056 | //ilaptr6 = &i3; // scpptool would complain (because i3 does not live long enough) 1057 | 1058 | /* The returned "proxy reference" object also has limited support for assignment operations. */ 1059 | maybe_int_xsltaptr6.value() = &i1; 1060 | //maybe_int_xsltaptr6.value() = &i3; // scpptool would complain (because i3 does not live long enough) 1061 | 1062 | /* Note that these returned "proxy reference" objects are designed to be used as temporary (rvalue) objects, 1063 | not as (lvalue) declared variables or stored objects. */ 1064 | 1065 | /* Note again that we've been using a non-const rsv::xslta_optional<>. Perhaps unintuitively, the contents of 1066 | an rsv::xslta_optional<> cannot be safely accessed via const reference to the optional. */ 1067 | auto const& maybe_int_xsltaptr6_cref1 = maybe_int_xsltaptr6; 1068 | //typename decltype(maybe_int_xsltaptr6)::value_type ilaptr3b = maybe_int_xsltaptr6_cref1.value(); // scpptool would complain 1069 | 1070 | { 1071 | /* rsv::xslta_fixed_optional<> is a (lifetime annotated) optional that doesn't support any operations that 1072 | could change its empty/non-empty state. */ 1073 | auto f_maybe_int_xsltaptr1 = mse::rsv::xslta_fixed_optional(maybe_int_xsltaptr6.value()); 1074 | } 1075 | } 1076 | ``` 1077 | 1078 | ##### xslta_accessing_fixed_optional 1079 | 1080 | `rsv::xslta_accessing_fixed_optional<>` is the [`rsv::xslta_accessing_fixed_vector<>`](#xslta_accessing_fixed_vector) counterpart for `optional`s. 
1081 | 1082 | usage example: ([link to interactive version](https://godbolt.org/z/Po3rdaW36)) 1083 | 1084 | ```cpp 1085 | #include "mseslta.h" 1086 | #include "mseoptional.h" 1087 | 1088 | int main(int argc, char* argv[]) { 1089 | int i1 = 3; 1090 | int i2 = 5; 1091 | int i3 = 7; 1092 | 1093 | auto opt2 = mse::rsv::xslta_optional >{ mse::rsv::TXSLTAPointer{ &i2 } }; 1094 | 1095 | { 1096 | auto const& opt2_cref = opt2; 1097 | 1098 | /* Obtaining an rsv::xslta_borrowing_fixed_optional<> requires a non-const pointer to the lending optional. 1099 | When only a const pointer is available we can instead use rsv::xslta_accessing_fixed_optional<> for supported optional types. */ 1100 | auto af_opt2a = mse::rsv::make_xslta_accessing_fixed_optional(&opt2_cref); 1101 | // or 1102 | //auto af_opt2a = mse::rsv::xslta_accessing_fixed_optional(&opt2_cref); 1103 | 1104 | auto& elem_ref1 = af_opt2a.value(); 1105 | int i4 = *elem_ref1; 1106 | } 1107 | } 1108 | ``` 1109 | 1110 | #### SaferCPlusPlus elements 1111 | 1112 | Most of the restrictions required to ensure safety of the elements in the SaferCPlusPlus library are implemented in the type system. However, some of the necessary restrictions cannot be implemented in the type system. This tool is meant to enforce those remaining restrictions. Elements requiring enforcement help are generally relegated to the `mse::rsv` namespace. One exception is the restriction that scope types (regardless of the namespace in which they reside), cannot be used as members of structs/classes that are not themselves scope types. The tool will flag any violations of this restriction. 1113 | 1114 | Note that the `mse::rsv::make_xscope_pointer_to()` function, which allows you to obtain a scope pointer to the resulting object of any eligible expression, is not listed in the documentation of the SaferCPlusPlus library, as without an enforcement helper tool like this one, it could significantly undermine safety. 
1115 | 1116 | #### Elements not (yet) addressed 1117 | 1118 | The set of potentially unsafe elements in C++, and in the standard library itself, is pretty large. This tool does not yet address them all. In particular it does not complain about the use of essential elements for which the SaferCPlusPlus library does not (yet) provide a safe alternative, such as containers like maps, sets, etc. 1119 | 1120 | ### Autotranslation 1121 | 1122 | This tool also has some ability to convert C source files to the memory safe subset of C++ it enforces and is demonstrated in the [SaferCPlusPlus-AutoTranslation2](https://github.com/duneroadrunner/SaferCPlusPlus-AutoTranslation2) project. 1123 | 1124 | ### "Flow (in)sensitive" analysis 1125 | 1126 | Some of the other C++ lifetime analyzers in development employ "flow (or path) sensitive" analysis. That is, they determine whether or not an operation is safe based, in part, on the operations that precede it in the program execution. So for example in this piece of code: 1127 | 1128 | ```cpp 1129 | int foo1(int* num_ptr) { 1130 | int retval = 0; 1131 | const bool b1 = !num_ptr; 1132 | 1133 | if (false) // condition 1 1134 | //if ((!num_ptr)) // condition 2 1135 | //if (!(!(!num_ptr))) // condition 3 1136 | //if (b1) // condition 4 1137 | { 1138 | num_ptr = &retval; 1139 | } 1140 | retval = *num_ptr; 1141 | return retval; 1142 | } 1143 | ``` 1144 | 1145 | the lifetime profile checker [complains](https://godbolt.org/z/4G1Mj8x4j) that `num_ptr` is being dereferenced when it cannot verify that its value is not null: 1146 | 1147 | ``` 1148 | :12:14: warning: dereferencing a possibly null pointer [-Wlifetime-null] 1149 | retval = *num_ptr; 1150 | ^~~~~~~~ 1151 | :1:10: note: the parameter is assumed to be potentially null. Consider using gsl::not_null<>, a reference instead of a pointer or an assert() to explicitly remove null 1152 | int foo1(int* num_ptr) { 1153 | ^~~~~~~~~~~~ 1154 | 1 warning generated.
1155 | ``` 1156 | 1157 | But if we replace "condition 1" in the example with "condition 2", this will ensure that in the event `num_ptr` has a null value it will be replaced with a valid one. The lifetime profile checker recognizes this and in this case no longer complains about the possible null value. 1158 | 1159 | "condition 3" is a slightly more complicated/obfuscated version of "condition 2", but the lifetime profile checker [is able](https://godbolt.org/z/76PGz6EdK) to recognize it as such. 1160 | 1161 | However, "condition 4" is just an indirect version of "condition 2", but the lifetime profile checker (at the time of writing) [does not](https://godbolt.org/z/M9hzY4n4f) recognize this, and will again complain about the dereference of a possible null value. 1162 | 1163 | It's a simplified example, but we're witnessing an instance of a more general phenomenon where we can't make a minor change to the code that clearly, to us, has no ill effect on the correctness of the code, without breaking the compile (or static verification). "Brittle" is the word that comes to mind to describe this kind of situation where things are fine as long as you don't try to make any changes, no matter how seemingly minor or benign. 1164 | 1165 | And making the static analyzer smarter doesn't solve the general "brittleness" problem. No matter how smart the analyzer becomes, there will, in theory, always be some safe code that it won't be able to verify as such in a timely manner. (Because "halting problem", right?) So any code that is "close to" the boundary of what the analyzer can verify as safe will be in this sort of "brittle" situation. 1166 | 1167 | Since scpptool doesn't use "flow (or path) sensitive" analysis, it doesn't suffer this kind of brittleness phenomenon. That is, no expression will fail to compile or pass static verification as a result of the modification of any code other than the expression itself or the declaration of elements participating in the expression. 
1168 | 1169 | With respect to the above example, scpptool prevents raw pointers from having null values so there would be no need to raise those errors about "dereferencing a possibly null pointer". Note that the error/warning that the lifetime profile checker raises recommends using `gsl::not_null<>` to basically achieve the same thing. 1170 | 1171 | If for some reason a null state was needed for the pointer, you could change the function to accept an `optional` of a pointer (or `gsl::not_null<>` pointer). Another alternative could be to use a non-owning smart pointer from the SaferCPlusPlus library that safely supports null values. Since both of these solutions include integrated checks for null state before dereferencing, they still don't suffer the "brittleness" phenomenon. 1172 | 1173 | So if this "brittleness" is a downside, an advantage of using "flow (or path) sensitive" analysis to maximize the set of code accepted as safe is that it reduces the modifications of pre-existing/legacy code needed to bring it into safety conformance. 1174 | 1175 | But this benefit is in a sense similarly "brittle" or unreliable. That is, some legacy code won't require modification, while some essentially equivalent versions of that same code will require modification to appease the ("flow (or path) sensitive") analyzer. 1176 | 1177 | While more code will require modification to satisfy the (non-"flow (or path) sensitive") scpptool analyzer, that modification can generally be automated. And scpptool has a [feature](https://github.com/duneroadrunner/scpptool#autotranslation) to do a lot of that automated code modification for you. Note that this auto-converted code may incur a few more run-time checks than the original code, but of course you end up with code that doesn't suffer the brittleness phenomenon. 
1178 | 1179 | In fact "flow (or path) sensitive" analyzers like the lifetime profile checker can avoid some run-time checks that a non-"flow (or path) sensitive" analyzer like scpptool can't. This is the case, for example, when it comes to ensuring that contents of dynamic containers aren't moved or deallocated when there are outstanding references to that content. But we find that in practice these run-time checks tend not to occur inside (performance sensitive) inner loops. 1180 | 1181 | So we've made an argument for non-"flow (or path) sensitive" analysis being the better trade-off, but still, it is a trade-off, and so ultimately a judgement call based on which properties one values more. 1182 | 1183 | 1184 | ### Questions and comments 1185 | If you have questions or comments you can create a post in the [discussion section](https://github.com/duneroadrunner/scpptool/discussions). email: scpptool1@f-m.fm 1186 | 1187 | -------------------------------------------------------------------------------- /approach_to_lifetime_safety_summary.md: -------------------------------------------------------------------------------- 1 | Nov 2024 2 | 3 | ### Rough Summary of the Approach to Lifetime Safety For Those With Some Familiarity With Rust 4 | 5 | Mutable aliasing can be a code correctness issue, but in most situations it's not a memory safety issue. For example, if one has (in the same thread) two non-`const` pointers to an `int` variable, or even an element of a (fixed-sized) array of `int`s, there's no memory safety issue due to the aliasing. But, for example, if I have a non-`const` pointer to an `std::vector<>` and another pointer to one of its elements, that can be a lifetime safety issue if the vector contents are cleared or relocated due to an operation instigated via the non-`const` pointer to the vector. 6 | 7 | So the premise is that there is only a limited set of situations where mutable aliasing is potentially a lifetime safety issue. 
Namely, when a mutable reference to a dynamic container (like a vector, optional, set, etc.) or a (dynamic) owning pointer (like a shared pointer or unique pointer) is used to modify the "structure" of the owned contents. (Modification includes relocation.) So in the scpptool solution, direct (raw) references to the contents of a dynamic container or owning pointer are simply not allowed. In the case of the "idiomatic" dynamic containers, for example, they don't even have any member function or operator that yields a raw reference. 8 | 9 | You can obtain a raw reference to the contents via a "borrowing" object. A borrowing object is roughly analogous to a slice in Rust. The existence of the borrowing object prevents the "structure" of the contents from being modified. This is done via run-time mechanisms and the type system, and does not rely on the static analyzer. A couple of different mechanisms are used depending on the type. These run-time mechanisms generally don't have much effect on performance as they generally aren't applied in the hottest inner loops. 10 | 11 | All dynamic container and owning pointer types (will) have a corresponding borrowing object type. (Multiple types can share the same borrowing object type.) The claim is that this completely solves the (single-threaded) lifetime safety issue due to mutable aliasing. 12 | 13 | In the case of references "in" (or accessible from) different threads, mutable aliasing does need to be prevented. The scpptool approach is to prevent the passing or sharing of "uncontrolled" references (including raw pointer/references) between threads. Objects shared among threads need to be recognized as either immutable, atomic, or "access controlled". Objects can be made to be "access controlled" by wrapping them in an "access controlled object" container (which is essentially a generalization of Rust's `RefCell<>`). 
"Access controlled" objects support (smart) references which can safely be passed to, or accessed from, other threads. 14 | 15 | The remaining lifetime safety issues are addressed by enforcing scope lifetime restrictions essentially in similar fashion to the way Rust does (or originally did). The fact that the scpptool implementation does not use flow analysis makes the enforcement a little more restrictive than Rust's, but also simpler to implement (and theoretically faster to execute). 16 | 17 | The approach does not intrinsically preclude the use of flow analysis if desired. But it isn't immediately clear that it would be a net benefit, irrespective of the additional implementation complexity. Time and experience will reveal the extent of any inconvenience resulting from the extra restrictiveness due to the lack of flow analysis, but we're fairly confident it's not a major issue. And unlike Rust, with the scpptool solution you can always resort to (non-owning) run-time checked pointers. 18 | 19 | The lifetime annotations also work in somewhat similar fashion to Rust. (This is where the majority of the implementation complexity comes from.) There are differences. Like for example how the scpptool version more reflects C++'s "duck typing" templates (i.e. you can refer to (sub)lifetimes of a generic type without any indication that the type actually has those (sub)lifetimes in advance). 
20 | -------------------------------------------------------------------------------- /build_myscpptool.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | originaldir=$PWD 4 | 5 | # obtaining the directory of this script file 6 | SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) 7 | 8 | cd "$SCRIPT_DIR" 9 | 10 | #This script checks whether it's running on an x86_64 architecture with Ubuntu as its OS 11 | arch=$(uname -m) # Get system architecture using 'uname -m' command 12 | 13 | os=$(cat /etc/*release | grep DISTRIB_ID=Ubuntu) # Check if the current distribution is Ubuntu 14 | 15 | if [ "$arch" == "x86_64" ] && [ ! -z "$os" ]; then # If architecture is x86_64 and OS is Ubuntu, print a message 16 | 17 | echo "This script is running on an x86_64 Ubuntu platform." 18 | 19 | llvmsubdir="clang+llvm-18.1.7-x86_64-linux-gnu-ubuntu-18.04" 20 | llvmdir=$PWD/$llvmsubdir 21 | if [ -d "$llvmdir" ]; then 22 | echo "The directory $llvmdir already exists." 23 | else 24 | echo "The directory $llvmdir does not seem to exist." 25 | echo "Downloading llvm prebuilt binaries..." 26 | wget "https://github.com/llvm/llvm-project/releases/download/llvmorg-18.1.7/clang+llvm-18.1.7-x86_64-linux-gnu-ubuntu-18.04.tar.xz" 27 | tar -xvf clang+llvm-18.1.7-x86_64-linux-gnu-ubuntu-18.04.tar.xz 28 | fi 29 | else 30 | 31 | echo "This script is not running on an x86_64 Ubuntu platform. Architecture: $arch" # Print the detected architecture if it's not x86_64 or OS isn't Ubuntu 32 | echo "" 33 | echo "Please download the clang+llvm-18.1.7 pre-built binaries tar file for your platform (located here: https://github.com/llvm/llvm-project/releases/tag/llvmorg-18.1.7 or in a sibling directory) and extract the contents. " 34 | 35 | read -p "Then enter the full path of the extracted directory: " dir_path 36 | 37 | dir_path=${dir_path//[\"\']} # remove any enclosing quotes 38 | 39 | if [[ ! -e $dir_path ]] || [[ !
-r $dir_path ]]; then 40 | echo "The specified path doesn't exist or isn't accessible." >&2 41 | exit 1 42 | fi 43 | 44 | llvmdir=$dir_path 45 | fi 46 | 47 | srcdir=$PWD/src 48 | 49 | cd $srcdir 50 | 51 | echo "Compiling scpptool... " 52 | 53 | make LLVM_CONF=$llvmdir/bin/llvm-config BUILD_MODE=DEBUG 54 | 55 | scpptoolfilename=scpptool # The filename we're checking for existence. 56 | if [ -f "$scpptoolfilename" ]; then # Test if $FILE exists and is a regular file. 57 | echo "File '$scpptoolfilename' found." 58 | else 59 | echo "'$scpptoolfilename' not found. The build seems to have failed." 60 | exit -1 61 | fi 62 | 63 | srcclangincludedir=$llvmdir/lib/clang/18/include 64 | 65 | mkdir ../lib 66 | mkdir ../lib/clang 67 | mkdir ../lib/clang/18 68 | cp -n -r $srcclangincludedir ../lib/clang/18 69 | 70 | echo " 71 | Build complete. 72 | 73 | 74 | Once you've verified that scpptool is working properly, you may then delete the clang+llvm-18.1.7 tar file that was downloaded manually or by this build script, and the directory extracted from it. 75 | 76 | Note that scpptool uses a clang include directory located in the relative path '../lib/clang/18' created by this script. So copying or moving the scpptool executable may also require moving that include directory so that the relative path remains the same. 
77 | 78 | 79 | The usage syntax is as follows: 80 | 81 | {scpptool src directory}/scpptool {source filename(s)} -- {compiler options} 82 | 83 | where the {scpptool src directory} is $srcdir 84 | 85 | So for example: 86 | 87 | $srcdir/scpptool hello_world.cpp -- -I./msetl -std=c++17 88 | 89 | " 90 | 91 | cd $originaldir 92 | 93 | exit 0 94 | 95 | -------------------------------------------------------------------------------- /intro_video_transcript.md: -------------------------------------------------------------------------------- 1 | May 2024 2 | 3 | Intro video transcript: 4 | 5 | ### A Quick Intro To The scpptool (Essentially) Safe Subset of C++ 6 | 7 | This video is an introduction to programming in the scpptool safe subset of C++. scpptool is a static analyzer that analyzes your C++ source code and complains about any code that it cannot verify to be memory and data race safe. 8 | 9 | Historically C++ has been famous for its lack of memory-safety and undefined behavior. In this video we assume the viewer is familiar with the C++ safety issues being addressed. 10 | 11 | Now intuitively, it doesn't seem too hard to identify a very limited subset of C++ that's essentially memory-safe. For example you can imagine a subset that just excludes all unsafe elements, like pointers and references, as well as any unchecked container accessors or iterators that could result in an "out of bounds" access. You might also need to exclude native signed integers to avoid the undefined behavior associated with integer overflow. And so on. 12 | 13 | But without safe replacements for those excluded elements, such a limited subset wouldn't really be very practical. So the idea behind this scpptool is to use static analysis to enforce a safe subset that, with the help of a companion library, is both practical and high performance. 
In fact, as far as we know, no other memory-safe language available surpasses (or arguably matches) the scpptool safe subset of C++ in terms of performance and expressive power. 14 | 15 | Unlike some other efforts, we are not trying to introduce new programming paradigms. In fact the goal is to impose only the minimum necessary departures from traditional C++. The two main differences imposed are, i) that null pointer values are not permitted for raw pointers, and ii) that raw references to elements of dynamic containers, like vectors, can only be obtained through a separate interface that ensures that the container contents are not moved or deallocated while the raw reference exists. 16 | 17 | Ok so let's start getting our hands dirty and try to write some safe code. 18 | 19 | Let's start with an example based on one that was posted online recently related to the classic C++ footgun where references to vector elements become invalid when additional elements are added to the vector: 20 | 21 | ```cpp 22 | #include <iostream> 23 | #include <vector> 24 | 25 | void print_int(const int& i) { 26 | std::cout<< i << std::endl; 27 | } 28 | 29 | int main(int argc, char* argv[]) { 30 | std::vector<int> list { 1, 2, 3 }; 31 | 32 | for(const auto i : list){ 33 | list.push_back(i); 34 | /* This is already undefined behavior because the hidden iterators of the `for` loop are now invalid. */ 35 | print_int(i); 36 | } 37 | 38 | for(const auto i : list) { 39 | print_int(i); 40 | } 41 | 42 | return 0; 43 | } 44 | ``` 45 | 46 | This example contains insidious undefined behavior because the first `for` loop is using hidden iterators whose initial values were obtained by the implicit calls `std::begin(list)` and `std::end(list)`. Those iterators are potentially invalidated when the `push_back()` method is called. 47 | 48 | We would expect the scpptool analyzer to complain about this unsafe code. Let's verify that it does. First we need to download and extract scpptool (or clone it if you prefer).
In the resulting "scpptool-master" directory you'll find a build script. We'll just go ahead and run that script. 49 | 50 | If you're running Ubuntu linux on an x86_64 platform, then the script will automatically download the appropriate clang+llvm pre-built library used by scpptool. If you're on a different platform, then the script will direct you to go to the specified webpage where you can download the clang+llvm pre-built library appropriate for your platform. It will then ask you to specify the (full path of the) directory where you extracted the clang+llvm pre-built library. 51 | 52 | It will then proceed to build the scpptool executable. The scpptool executable will be located in the "src" subdirectory of the "scpptool-master" directory. If you prefer to move the scpptool executable to another location, note that you must follow the documentation and also move the associated clang include directory so that the path relative to the scpptool executable remains the same. We'll just leave it where it is for now. (Note that installation procedures may have changed since the recording of this video, so be sure to consult the latest installation instructions.) 53 | 54 | Ok, now we're ready to use the scpptool to analyze our (unsafe) program. 55 | 56 | ``` 57 | ~/dev/clang_tooling/scpptool/src/scpptool example1.cpp -- 58 | 59 | /home/user1/dev/clang_tooling/test/scpptool/example1/example1.cpp:9:5: error: 'std::vector' is not supported (in type 'std::vector' used in this declaration). Consider using a corresponding substitute from the SaferCPlusPlus library instead. 60 | 61 | 62 | 1 verification failures. 63 | ``` 64 | 65 | Looking at the output, we see that scpptool does indeed complain about our unsafe program. scpptool considers `std::vector<>` to be irredeemably unsafe. And not just because of the issue we're experiencing here. 
For example, just the fact that `std::vector<>`'s iterators aren't (necessarily) bounds checked alone means that no degree of static analysis can salvage `std::vector<>`. So the companion SaferCPlusPlus (header-only) library provides an alternative vector with a safer implementation and interface while attempting to remain as compatible as possible with `std::vector<>`. In fact, we can almost use it as a drop-in replacement for `std::vector<>` in our example. 66 | 67 | So now let's download and extract (or clone if you prefer) the SaferCPlusPlus (header) library. The project contains documentation and a directory of usage examples, but for right now we're only interested in the "include" directory of header files that we want to include in our programs. For this demonstration, we'll just copy the include files into a local subdirectory of our project where they'll be easy to find. 68 | 69 | ```cpp 70 | #include <iostream> 71 | #include "msemsevector.h" 72 | 73 | void print_int(const int& i) { 74 | std::cout<< i << std::endl; 75 | } 76 | 77 | int main(int argc, char* argv[]) { 78 | mse::rsv::xslta_vector list { 1, 2, 3 }; 79 | 80 | for(const int i : list){ 81 | list.push_back(i); 82 | /* This is no longer undefined behavior. `mse::rsv::xslta_vector::iterator`s are not invalidated by the `push_back()`. */ 83 | print_int(i); 84 | } 85 | 86 | for(const auto i : list) { 87 | print_int(i); 88 | } 89 | 90 | return 0; 91 | } 92 | ``` 93 | So in this updated version of our program, we've replaced `std::vector<>` with `mse::rsv::xslta_vector<>`. The code is now safe and valid because `mse::rsv::xslta_vector<>`'s iterators are not invalidated by operations like `push_back()`. As some may suspect, these safe iterators store an index rather than a pointer directly targeting a vector element. A subtle difference that might be easy to miss is that we changed the loop variable to be explicitly declared as an `int` rather than `auto`.
This is because, unlike with `std::vector<>`, dereferencing the iterator does not return a raw reference to the element type, but rather a "proxy reference" object that implicitly converts to the element type. This is because, like an `std::vector<>` iterator, a raw reference to an element would be prone to becoming invalid. 94 | 95 | This does not mean that you cannot (safely) obtain a raw reference to a vector element, but doing so does require a little ceremony to ensure that the reference doesn't become invalid. First we need to instantiate a "borrowing fixed" vector. A borrowing fixed vector "borrows" the contents of a given vector and ensures that those contents are not moved or deallocated. 96 | 97 | ```cpp 98 | #include <iostream> 99 | #include "msemsevector.h" 100 | 101 | void print_int(const int& i) { 102 | std::cout<< i << std::endl; 103 | } 104 | 105 | int main(int argc, char* argv[]) { 106 | mse::rsv::xslta_vector list { 1, 2, 3 }; 107 | 108 | for(const int i : list){ 109 | list.push_back(i); 110 | /* This is no longer undefined behavior. `mse::rsv::xslta_vector::iterator`s are not invalidated by the `push_back()`. */ 111 | print_int(i); 112 | } 113 | 114 | { 115 | auto bf_vector = mse::rsv::make_xslta_borrowing_fixed_vector(&list); 116 | for(auto& i_ref : bf_vector) { 117 | i_ref *= 10; 118 | print_int(i_ref); 119 | } 120 | } 121 | 122 | for(const auto i : list) { 123 | print_int(i); 124 | } 125 | 126 | return 0; 127 | } 128 | ``` 129 | 130 | So unlike `mse::rsv::xslta_vector<>`, dereferencing the iterators of a borrowing fixed vector does yield a raw reference. This is safe because the borrowing fixed vector ensures that its contents aren't going anywhere. Note that while the borrowing vector exists, the contents of the original lending vector are not accessible via the original vector's interface. 131 | 132 | If you try to access the contents of the original lending vector, in the current implementation you'll find that the original vector is just empty.
But in the future, it's possible that attempts to access the original vector while its contents are borrowed may result in an exception. So try to avoid doing that. 133 | 134 | Borrowing fixed vectors are expected to be the primary method of accessing vector elements. At least in performance-sensitive code. 135 | 136 | Just to be clear, the value of the elements can be modified via the borrowing fixed vector, but not the number of elements. 137 | 138 | The borrow ends when the borrowing vector goes out of scope, at which point the (possibly modified) contents are returned to the original lending vector. 139 | 140 | As you can see, most of the safety enforcement of these vectors and borrowing vectors is implemented in the type system without reliance on extra static analysis. But for complete safety, the scpptool static analysis will complain if you try to misuse these elements. For example, if you try to inappropriately use a borrowing vector as a function return value. 141 | 142 | The reason we need these borrowing fixed vectors is that vectors are so-called "dynamic" containers. Containers that can deallocate or relocate their elements arbitrarily at run-time. But vectors are not the only dynamic containers. For example, strings are also dynamic containers. And optionals, which can be thought of conceptually as a kind of vector that can only contain zero or one object. "Borrowing fixed" counterparts are also provided for those containers. 143 | 144 | So we've just addressed one of the remaining big lifetime issues in "modern" C++. Now let's take a look at how pointers and references are handled: 145 | 146 | The first big restriction is that null values are not supported for raw pointers. If you need a null value, wrap the pointer in an optional. Or, if you prefer, you can use one of the provided (non-owning) smart pointers that safely support null values.
In the general case, it's simply not practically possible to safely support null pointer values without inserting run-time dereference checks. 147 | 148 | But in many cases where null pointer values are not needed, safety can be achieved without additional run-time overhead. But some restrictions apply. scpptool's strategy is to require that any object targeted by a pointer (or reference) outlives the pointer itself, regardless of how long the object will actually remain the target of the pointer. Again, we provide (non-owning) smart pointers that can target objects that don't outlive them, but any target of a raw pointer must outlive the pointer. 149 | 150 | So if we take this code: 151 | 152 | ```cpp 153 | #include <iostream> 154 | 155 | int main(int argc, char* argv[]) { 156 | { 157 | int i1 = 3; 158 | 159 | int * iptr1 = &i1; // no problem, target object outlives the pointer 160 | 161 | { 162 | int i2 = 5; 163 | iptr1 = &i2; // scpptool will complain that the target object does not outlive the pointer 164 | 165 | iptr1 = &i1; // note that promptly restoring the approved target does not suppress the original complaint 166 | } 167 | 168 | std::cout << *iptr1; 169 | } 170 | 171 | return 0; 172 | } 173 | ``` 174 | 175 | and run the scpptool analyzer: 176 | 177 | ``` 178 | ~/dev/clang_tooling/scpptool/src/scpptool example1.cpp -- 179 | 180 | /home/user1/dev/clang_tooling/test/scpptool/example1/example1.cpp:11:13: error: Unable to verify that this pointer assignment (of type 'int *') is safe and valid. (Possibly due to being unable to verify that the object(s) referenced by the new pointer live long enough.) 181 | 182 | 183 | 1 verification failures. 184 | ``` 185 | 186 | We see that scpptool complains when you try to target a pointer at an object that doesn't outlive it. And even immediately restoring the original target does not placate the analyzer.
As mentioned, if you really need to target an object that doesn't outlive the pointer, the library has non-owning smart pointers that (safely) support this. But I think you'll find this rarely necessary in practice. 187 | 188 | Ok, but what about assigning the value of one pointer variable to another? Well the analyzer knows that a pointer's target must outlive it, so therefore it would be safe to assign the value of one pointer to another if the source pointer itself outlives the pointer being assigned to. So if we take this code: 189 | 190 | ```cpp 191 | #include <iostream> 192 | 193 | int main() { 194 | { 195 | int i1 = 3; 196 | 197 | int * iptr1 = &i1; 198 | int * iptr2 = &i1; 199 | 200 | iptr2 = iptr1; // no problem because iptr1 outlives iptr2 201 | 202 | iptr1 = iptr2; // scpptool will complain because iptr2 does not outlive iptr1 203 | 204 | std::cout << *iptr1 << std::endl; 205 | } 206 | 207 | return 0; 208 | } 209 | ``` 210 | 211 | and run the scpptool analyzer: 212 | 213 | ``` 214 | ~/dev/clang_tooling/scpptool/src/scpptool example1.cpp -- 215 | 216 | /home/user1/dev/clang_tooling/test/scpptool/example1/example1.cpp:12:9: error: Unable to verify that this pointer assignment (of type 'int *') is safe and valid. (Possibly due to being unable to verify that the object(s) referenced by the new pointer live long enough.) 217 | 218 | 219 | 1 verification failures. 220 | ``` 221 | 222 | We see that it allows the first pointer-to-pointer assignment, but complains about the second one, which is just the reverse of the first one. That's because in the second assignment the source pointer does not outlive the destination pointer and so could be potentially pointing to a target object that also doesn't outlive the destination pointer. 223 | 224 | Now in this simple example, it's readily apparent to us that the target object actually outlives both pointers.
But determining how long a pointer's target object lives is not so easy in every case, so, for the sake of consistency, the restriction is based on the lifetimes of the pointers themselves, not the object they happen to be pointing to at the time. 225 | 226 | Still, we could imagine that there might be situations where we would want to be able to assign pointer values back and forth between each other. scpptool does support this, but requires the pointers to be annotated with "lifetime annotations". Adding a lifetime annotation to a pointer changes the pointer's restriction from the default one, that the pointer's target must outlive the pointer itself, to having the same restriction as the pointer value it was initialized with in its declaration. If the pointer value it was initialized with was a temporary pointer to an object, then the restriction is that the pointer's target must live at least as long as that initialization target object. This means, for example, that two different annotated pointers initialized with (a temporary pointer to) the same target object would have exactly the same restrictions on the values they can have. 227 | 228 | (This part can be a little confusing for the unfamiliar. If necessary, you may want to go back and listen to the last paragraph again carefully. The example to follow should help to clarify things.) 229 | 230 | When two pointers have exactly the same restrictions on their values, assignments between them in either direction are safe and permitted. 231 | 232 | While you can explicitly add lifetime annotations to raw pointers, we'll leave that for another discussion and note here that the library provides a pointer type template that is already defined with lifetime annotations.
So if we consider this code: 233 | 234 | ```cpp 235 | #include "mseslta.h" 236 | 237 | int main(int argc, char* argv[]) { 238 | { 239 | int i1 = 3; 240 | int i2 = 5; 241 | int i3 = 7; 242 | /* The (lower bound) lifetime associated with an rsv::TXSLTAPointer<> is set to the lifetime of 243 | its initialization value. */ 244 | auto p2a = mse::rsv::TXSLTAPointer{ &i2 }; 245 | auto p2b = mse::rsv::TXSLTAPointer{ &i2 }; 246 | 247 | p2a = p2b; 248 | p2b = p2a; 249 | std::swap(p2a, p2b); 250 | 251 | auto p1 = mse::rsv::TXSLTAPointer{ &i1 }; 252 | 253 | /* The (lower bound) lifetime associated with p2a does not outlive the one associated with 254 | p1, so assigning the value of p2a to p1 cannot be verified to be safe. */ 255 | p1 = p2a; // scpptool will complain 256 | p2a = p1; 257 | p2a = &i1; 258 | p2a = &i3; // scpptool will complain 259 | } 260 | 261 | return 0; 262 | } 263 | ``` 264 | 265 | `mse::rsv::TXSLTAPointer` is a zero-overhead pointer type with the same performance characteristics as a raw pointer. Because it is defined with lifetime annotations (that's what the LTA in the name refers to), it is restricted to pointing to objects which live at least as long as the target object it was initialized with. Since `p2a` and `p2b` are both initialized with (temporary pointers to) the same target object, they have exactly the same restrictions, so their values can be swapped or assigned to each other in either direction. 266 | 267 | Note that elements in a container, like an array, are all considered to have essentially the same lifetime. So a pointer that targets an element of a container can be assigned to target any other element in that container. 
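
To make that last point about container elements a bit more concrete, here's a small sketch in plain C++ of a pointer being retargeted among the elements of one array. Note the assumptions: the helper name `largest_of_three` is hypothetical, and the native array and raw pointer are used only so that the snippet compiles without the SaferCPlusPlus headers. It has not been run through scpptool, which may well expect the library's own array and pointer types here.

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only (not part of scpptool or the
// SaferCPlusPlus library). Returns the largest value in a 3-element array.
int largest_of_three(const int (&values)[3]) {
    // `best` starts out targeting the first element of `values`.
    const int* best = &values[0];
    for (std::size_t i = 1; i < 3; ++i) {
        if (values[i] > *best) {
            // Retarget `best` to a sibling element. All elements of the
            // array share one lifetime, so the new target lives exactly
            // as long as the one `best` was initialized with.
            best = &values[i];
        }
    }
    return *best;
}
```

The idea is just that every retargeting stays inside the same container, so the lifetime restriction established when the pointer was initialized keeps being met.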
268 | 269 | But notice that when we try to assign the value of `p2a` to `p1`, the scpptool analyzer complains even though `p2a` outlives `p1`: 270 | 271 | ``` 272 | ~/dev/clang_tooling/scpptool/src/scpptool example1.cpp -- -I ./msetl/ 273 | 274 | /home/user1/dev/clang_tooling/test/scpptool/example1/example1.cpp:21:3: error: Unable to verify that in the 'mse::rsv::TXSLTAPointer::operator=' member function call expression, the argument corresponding to a parameter with lifetime label id '99' has a lifetime (including any sublifetimes) that meets the (minimum required) lifetime set when the object was initialized. 275 | ./msetl/mseslta.h:199:4: function declared here 276 | 277 | /home/user1/dev/clang_tooling/test/scpptool/example1/example1.cpp:24:3: error: Unable to verify that in the 'mse::rsv::TXSLTAPointer::operator=' member function call expression, the argument corresponding to a parameter with lifetime label id '99' has a lifetime (including any sublifetimes) that meets the (minimum required) lifetime set when the object was initialized. 278 | ./msetl/mseslta.h:199:4: function declared here 279 | 280 | 281 | 2 verification failures. 282 | ``` 283 | 284 | That's because even though `p2a` outlives `p1`, the restriction of `p2a` is not as strict as that of `p1`. That is, the target object `p2a` was initialized with does not outlive the target object `p1` was initialized with. 285 | 286 | But this means that the reverse assignment is allowed. 287 | 288 | And similarly, `p2a` cannot be assigned `i3` as a target object. Because even though `i3` outlives `p2a`, it does not outlive `p2a`'s initialization target object, `i2`. 289 | 290 | Under scpptool's restrictions, you'll likely find lifetime annotated pointers to be significantly more useful in practice than unannotated raw pointers. (Though there are a lot of situations where either can be used.) 
Indeed, despite the restrictions we demonstrated, lifetime annotated pointers should be able to fulfill most use cases where pointers are called for. And again, for the minority of cases where they can't, the library provides essentially unrestricted run-time checked non-owning pointers. 291 | 292 | So there you have it. A quick demonstration of how scpptool ensures lifetime safety in modern C++. For more information, you can check out the documentation in the github repository. 293 | -------------------------------------------------------------------------------- /src/build_scripts/build-debug2.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | if [ -z "$LLVM_CONF" ] 4 | then 5 | if [ -f /usr/bin/llvm-config ] 6 | then 7 | LLVM_CONF=/usr/bin/llvm-config 8 | else 9 | echo "The LLVM_CONF environment variable needs to be set to the llvm-config pathname. " 10 | echo "If you're using the clang+llvm pre-built binaries, then LLVM_CONF should be set to {the clang+llvm pre-built binaries directory}/bin/llvm-config. 
" 11 | exit 12 | fi 13 | else 14 | echo " " 15 | fi 16 | echo "LLVM_CONF is set to '$LLVM_CONF'" 17 | 18 | CXXFLAGS="$($LLVM_CONF --cxxflags)" 19 | LDFLAGS1="$($LLVM_CONF --ldflags)" 20 | LDFLAGS2="$($LLVM_CONF --libs --system-libs)" 21 | 22 | set -x 23 | 24 | g++ $CXXFLAGS -fexceptions -std=c++17 -I./yaml-cpp/include -O0 -g3 -Wall -c -fmessage-length=0 -fvisibility-inlines-hidden -Wno-unused -Wno-attributes -Wno-deprecated-declarations -fPIC -MMD -MP -MF"scpptool.d" -MT"scpptool.o" -o "scpptool.o" "scpptool.cpp" 25 | 26 | g++ $CXXFLAGS -fexceptions -std=c++17 -O0 -g3 -Wall -c -fmessage-length=0 -fvisibility-inlines-hidden -Wno-unused -Wno-attributes -Wno-deprecated-declarations -fPIC -MMD -MP -MF"utils1.d" -MT"utils1.o" -o "utils1.o" "utils1.cpp" 27 | 28 | g++ scpptool.o utils1.o $LDFLAGS1 -Wl,--start-group -lclangAPINotes -lclangAST -lclangAnalysis -lclangBasic\ 29 | -lclangDriver -lclangEdit -lclangFrontend -lclangFrontendTool\ 30 | -lclangLex -lclangParse -lclangSema -lclangASTMatchers\ 31 | -lclangRewrite -lclangRewriteFrontend -lclangStaticAnalyzerFrontend\ 32 | -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore\ 33 | -lclangSerialization -lclangToolingCore -lclangTooling -lclangToolingSyntax -lstdc++\ 34 | -lLLVMRuntimeDyld -lm -Wl,--end-group $LDFLAGS2 -lclangSupport -lyaml-cpp -o scpptool 35 | 36 | -------------------------------------------------------------------------------- /src/makefile: -------------------------------------------------------------------------------- 1 | TARGET=scpptool 2 | SHELL=bash 3 | SHELL?=bash 4 | CC=gcc 5 | CC?=gcc 6 | CFLAGS=-fpic -std=c11 7 | CXX=g++ 8 | CXX?=g++ 9 | CXX_FLAGS=-fpic 10 | CXX_EXTRA?= 11 | CTAGS_I_PATH?=./ 12 | LD_FLAGS= 13 | EXTRA_LD_FLAGS?= 14 | ADD_SANITIZERS_CC= -g -fsanitize=address -fno-omit-frame-pointer 15 | ADD_SANITIZERS_LD= -g -fsanitize=address 16 | MEM_SANITIZERS_CC= -g -fsanitize=memory -fno-omit-frame-pointer 17 | MEM_SANITIZERS_LD= -g -fsanitize=memory 18 | UB_SANITIZERS_CC= -g 
-fsanitize=undefined -fno-omit-frame-pointer 19 | UB_SANITIZERS_LD= -g -fsanitize=undefined 20 | COV_CXX= -fprofile-instr-generate -fcoverage-mapping 21 | COV_LD= -fprofile-instr-generate 22 | # BUILD_MODES are=RELEASE(default), DEBUG,ADDSAN,MEMSAN,UBSAN 23 | BUILD_MODE?=RELEASE 24 | OBJ_LIST:=$(patsubst %.cpp, %.o, $(wildcard *.cpp)) 25 | ASM_LIST:=$(patsubst %.cpp, %.dis, $(wildcard *.cpp)) 26 | 27 | LLVM_CONF?=llvm-config 28 | LLVM_CXX_FLAGS=$(shell $(LLVM_CONF) --cxxflags) 29 | LLVM_CXX_FLAGS+=\ 30 | # -I$(shell $(LLVM_CONF) --src-root)/tools/clang/include\ 31 | # -I$(shell $(LLVM_CONF) --obj-root)/tools/clang/include\ 32 | -std=c++17 33 | # -stdlib=libstdc++ -std=c++17 -frtti -fexceptions 34 | ifeq ($(CXX), clang++) 35 | LLVM_CXX_FLAGS+=-stdlib=libstdc++ 36 | endif 37 | LLVM_LD_FLAGS=-Wl,--start-group -lclangAPINotes -lclangAST -lclangAnalysis -lclangBasic\ 38 | -lclangDriver -lclangEdit -lclangFrontend -lclangFrontendTool\ 39 | -lclangLex -lclangParse -lclangSema -lclangASTMatchers\ 40 | -lclangRewrite -lclangRewriteFrontend -lclangStaticAnalyzerFrontend\ 41 | -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore\ 42 | -lclangSerialization -lclangToolingCore -lclangTooling -lclangToolingSyntax -lstdc++\ 43 | -lLLVMRuntimeDyld -lm -Wl,--end-group -lclangSupport 44 | LLVM_LD_FLAGS+=$(shell $(LLVM_CONF) --ldflags --libs --system-libs) 45 | 46 | CXX_FLAGS+=$(LLVM_CXX_FLAGS) 47 | LD_FLAGS+=$(LLVM_LD_FLAGS) 48 | 49 | #MAKEFLAGS+=--warn-undefined-variables 50 | ifeq ($(BUILD_MODE), ADDSAN) 51 | ifeq ($(CXX), g++) 52 | $(error This build mode is only useable with clang++.) 53 | endif 54 | CXX_EXTRA+=$(ADD_SANITIZERS_CC) 55 | EXTRA_LD_FLAGS+=$(ADD_SANITIZERS_LD) 56 | endif 57 | 58 | ifeq ($(BUILD_MODE), MEMSAN) 59 | ifeq ($(CXX), g++) 60 | $(error This build mode is only useable with clang++.) 
61 | endif 62 | CXX_EXTRA+=$(MEM_SANITIZERS_CC) 63 | EXTRA_LD_FLAGS+=$(MEM_SANITIZERS_LD) 64 | endif 65 | 66 | ifeq ($(BUILD_MODE), UBSAN) 67 | ifeq ($(CXX), g++) 68 | $(error This build mode is only useable with clang++.) 69 | endif 70 | CXX_EXTRA+=$(UB_SANITIZERS_CC) 71 | EXTRA_LD_FLAGS+=$(UB_SANITIZERS_LD) 72 | endif 73 | 74 | ifeq ($(BUILD_MODE), DEBUG) 75 | CXX_EXTRA+=-g 76 | endif 77 | 78 | SRCS:=$(wildcard *.cpp) 79 | CXX_FLAGS+=$(CXX_EXTRA) 80 | LD_FLAGS+=$(EXTRA_LD_FLAGS) 81 | 82 | .DEFAULT:all 83 | 84 | .PHONY:all clean help ASM SO TAGS 85 | 86 | all:$(TARGET) 87 | 88 | everything:$(TARGET) ASM SO A $(TARGET)-dbg TAGS $(TARGET)-cov 89 | 90 | depend:.depend 91 | 92 | .depend:$(SRCS) 93 | rm -rf .depend 94 | $(CXX) -MM $(CXX_FLAGS) $^ > ./.depend 95 | echo $(patsubst %.o:, %.odbg:, $(shell $(CXX) -MM $(CXX_FLAGS) $^)) | sed -r 's/[A-Za-z0-9\_\-]+\.odbg/\n&/g' >> ./.depend 96 | echo $(patsubst %.o:, %.ocov:, $(shell $(CXX) -MM $(CXX_FLAGS) $^)) | sed -r 's/[A-Za-z0-9\_\-]+\.ocov/\n&/g' >> ./.depend 97 | 98 | -include ./.depend 99 | 100 | .cpp.o: 101 | $(CXX) $(CXX_FLAGS) -c $< -o $@ 102 | 103 | %.odbg:%.cpp 104 | $(CXX) $(CXX_FLAGS) -g -c $< -o $@ 105 | 106 | %.ocov:%.cpp 107 | $(CXX) $(CXX_FLAGS) $(COV_CXX) -c $< -o $@ 108 | 109 | $(TARGET): $(TARGET).o ./utils1.o 110 | $(CXX) $^ $(LD_FLAGS) -o $@ 111 | $(TARGET)-static: $(TARGET).o ./utils1.o 112 | $(CXX) $^ $(LD_FLAGS) -static -o $@ 113 | 114 | $(TARGET)-dbg: $(TARGET).odbg ./utils1.odbg 115 | $(CXX) $^ $(LD_FLAGS) -g -o $@ 116 | 117 | $(TARGET)-cov: $(TARGET).ocov ./utils1.ocov 118 | $(CXX) $^ $(LD_FLAGS) $(COV_LD) -o $@ 119 | 120 | cov: 121 | @llvm-profdata merge -sparse ./default.profraw -o ./default.profdata 122 | @llvm-cov show $(TARGET)-cov -instr-profile=default.profdata 123 | 124 | covrep: 125 | @llvm-profdata merge -sparse ./default.profraw -o ./default.profdata 126 | @llvm-cov report $(TARGET)-cov -instr-profile=default.profdata 127 | 128 | ASM:$(ASM_LIST) 129 | 130 | SO:$(TARGET).so 131 | 132 | 
A:$(TARGET).a 133 | 134 | TAGS:tags 135 | 136 | tags:$(SRCS) 137 | # $(shell $(CXX) -c $(shell $(LLVM_CONF) --cxxflags) -I$(shell $(LLVM_CONF) --src-root)/tools/clang/include -I$(shell $(LLVM_CONF) --obj-root)/tools/clang/include -I $(CTAGS_I_PATH) -M $(SRCS)|\ 138 | $(shell $(CXX) -c $(shell $(LLVM_CONF) --cxxflags) -I $(CTAGS_I_PATH) -M $(SRCS)|\ 139 | sed -e 's/[\\ ]/\n/g'|sed -e '/^$$/d' -e '/\.o:[ \t]*$$/d'|\ 140 | ctags -L - --c++-kinds=+p --fields=+iaS --extra=+q) 141 | 142 | %.dis: %.o 143 | objdump -r -d -M intel -S $< > $@ 144 | 145 | $(TARGET).so: $(TARGET).o 146 | $(CXX) $^ $(LD_FLAGS) -shared -o $@ 147 | 148 | $(TARGET).a: $(TARGET).o 149 | ar rcs $(TARGET).a $(TARGET).o 150 | 151 | clean: 152 | rm -f *.o *.dis *.odbg *.ocov *~ $(TARGET) $(TARGET).so $(TARGET)-static $(TARGET)-dbg $(TARGET).a $(TARGET)-cov 153 | 154 | deepclean: 155 | rm -f *.o *.dis *.odbg *.ocov *~ $(TARGET) $(TARGET).so tags $(TARGET)-static $(TARGET)-dbg $(TARGET).a $(TARGET)-cov FILE*.cpp FILE*.hpp 156 | rm .depend 157 | 158 | help: 159 | @echo "--all is the default target, runs $(TARGET) target" 160 | @echo "--everything will build everything" 161 | @echo "--SO will generate the so" 162 | @echo "--ASM will generate assembly files" 163 | @echo "--TAGS will generate tags file" 164 | @echo "--$(TARGET) builds the dynamically-linked executable" 165 | @echo "--$(TARGET)-dbg will generate the debug build. 
BUILD_MODE should be set to DEBUG to work" 166 | @echo "--$(TARGET)-static will statically link the executable to the libraries" 167 | @echo "--$(TARGET)-cov is the coverage build" 168 | @echo "--cov will print the line coverage report" 169 | @echo "--covrep will print the coverage report" 170 | @echo "--A will build the static library" 171 | @echo "--TAGS will build the tags file" 172 | @echo "--clean" 173 | @echo "--deepclean will clean almost everything" 174 | -------------------------------------------------------------------------------- /src/scpptool.cpp: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2019 Noah Lopez 2 | // special thanks to Farzad Sadeghi 3 | // Use, modification, and distribution is subject to the Boost Software 4 | // License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at 5 | // http://www.boost.org/LICENSE_1_0.txt) 6 | 7 | 8 | /*Standard headers*/ 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | 18 | #include 19 | #include 20 | #include 21 | 22 | #include 23 | 24 | #include "scpptool.h" 25 | #include "utils1.h" 26 | 27 | #include "checker.h" 28 | 29 | //define EXCLUDE_CONVERTER_MODE1 30 | #ifndef EXCLUDE_CONVERTER_MODE1 31 | #include "converter_mode1.h" 32 | #endif //!EXCLUDE_CONVERTER_MODE1 33 | 34 | //define EXCLUDE_C2VALIDCPP 35 | #ifndef EXCLUDE_C2VALIDCPP 36 | #include "converter_c2validcpp.h" 37 | #endif //!EXCLUDE_C2VALIDCPP 38 | 39 | /*Clang Headers*/ 40 | #include "clang/AST/AST.h" 41 | #include "clang/AST/ASTConsumer.h" 42 | #include "clang/ASTMatchers/ASTMatchers.h" 43 | #include "clang/ASTMatchers/ASTMatchFinder.h" 44 | #include "clang/Frontend/CompilerInstance.h" 45 | #include "clang/Frontend/FrontendActions.h" 46 | #include "clang/Lex/Lexer.h" 47 | #include "clang/Tooling/CommonOptionsParser.h" 48 | #include "clang/Tooling/Tooling.h" 49 | #include "clang/Rewrite/Core/Rewriter.h" 50 | 
#include "clang/Lex/Preprocessor.h" 51 | #include "clang/AST/ASTImporter.h" 52 | #include "clang/AST/ASTDiagnostic.h" 53 | #include "clang/Frontend/TextDiagnosticPrinter.h" 54 | 55 | #include "clang/Basic/SourceManager.h" 56 | 57 | /*LLVM Headers*/ 58 | #include "llvm/Support/raw_ostream.h" 59 | #include "llvm/IR/Function.h" 60 | /**********************************************************************************************************************/ 61 | /*used namespaces*/ 62 | using namespace llvm; 63 | using namespace clang; 64 | using namespace clang::ast_matchers; 65 | using namespace clang::driver; 66 | using namespace clang::tooling; 67 | /**********************************************************************************************************************/ 68 | static llvm::cl::OptionCategory MatcherSampleCategory("TBD"); 69 | 70 | cl::opt<bool> ConvertToSCPP("ConvertToSCPP", cl::desc("translate the source to a (memory) safe subset of the language"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 71 | cl::opt<bool> CTUAnalysis("CTUAnalysis", cl::desc("cross translation unit analysis"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 72 | cl::opt<bool> EnableNamespaceImport("EnableNamespaceImport", cl::desc("enable importing of namespaces from other translation units"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 73 | cl::opt<bool> SuppressPrompts("SuppressPrompts", cl::desc("suppress prompts before replacing source files"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 74 | cl::opt<bool> DoNotReplaceOriginalSource("DoNotReplaceOriginalSource", cl::desc("prevent replacement/modification of the original source files"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 75 | cl::opt<std::string> MergeCommand("MergeCommand", cl::desc("specify an alternate merge tool to be used"), cl::init(""), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 76 | cl::opt<bool> DoNotResolveMergeConflicts("DoNotResolveMergeConflicts",
cl::desc("prevent the automatic resolution of merge conflicts (by heuristic guessing)"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 77 | cl::opt<std::string> ConvertMode("ConvertMode", cl::desc("specify the code conversion technique to use: \n" 78 | "\t Dual \t- The resulting code can be compiled as either safe C++ or (potentially faster) unsafe 'plain' C or C++. \n" 79 | "\t SlowAndFlexible \t- (Default) \n" 80 | "\t FasterAndStricter \t- The resulting (safe) code should be faster, but code that is not of 'good form' may not translate properly. (Preliminary implementation only.)\n" 81 | ), cl::init(""), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 82 | cl::opt<bool> ScopeTypeFunctionParameters("ScopeTypeFunctionParameters", cl::desc("Use 'scope' types when converting pointer and iterator function parameters. \n" 83 | "\t This can result in invalid code (that may need to be fixed manually) in some cases, but the resulting \n" 84 | "\t functions may support arguments of scope type (including raw pointers). "), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 85 | cl::opt<bool> ScopeTypePointerFunctionParameters("ScopeTypePointerFunctionParameters", cl::desc("same as 'ScopeTypeFunctionParameters', but only applies to pointers, not iterators. (Not yet implemented.)"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 86 | cl::opt<bool> AddressableVars("AddressableVars", cl::desc("make variables of (safely) 'addressable' type even if they are never used as a pointer target"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 87 | cl::opt<bool> ConvertC2ValidCpp("ConvertC2ValidCpp", cl::desc("Modify C source to (more) conform to the subset supported by C++. (Preliminary implementation only.)"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 88 | cl::opt<bool> ExpandPointerMacros("ExpandPointerMacros", cl::desc("Modify source so that instantiations of macros that contain pointers are replaced by their macro expansion.
(May require multiple runs for nested macros.) (Preliminary implementation only.)"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 89 | cl::opt<bool> CheckSystemHeader("SysHeader", cl::desc("deprecated - process system headers also"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 90 | cl::opt<bool> MainFileOnly("MainOnly", cl::desc("process the main file only"), cl::init(false), cl::cat(MatcherSampleCategory), cl::ZeroOrMore); 91 | 92 | /**********************************************************************************************************************/ 93 | 94 | class MyDiagConsumer : public DiagnosticConsumer { 95 | void anchor() {} 96 | 97 | void HandleDiagnostic(DiagnosticsEngine::Level DiagLevel, const Diagnostic &Info) override { 98 | llvm::SmallVector message; 99 | Info.FormatDiagnostic(message); 100 | llvm::errs() << message << '\n'; 101 | } 102 | }; 103 | 104 | /**********************************************************************************************************************/ 105 | /*Main*/ 106 | int main(int argc, const char **argv) 107 | { 108 | int retval = -1; 109 | 110 | if (false) { 111 | /* This is just here for use in creating a "baseline" for address sanitizer errors.
*/ 112 | std::string code = "struct A{public: int i;}; void f(A & a}{}"; 113 | std::unique_ptr ast(clang::tooling::buildASTFromCode(code)); 114 | // now you have the AST for the code snippet 115 | //clang::ASTContext * pctx = &(ast->getASTContext()); 116 | //clang::TranslationUnitDecl * decl = pctx->getTranslationUnitDecl(); 117 | } 118 | 119 | #if MU_LLVM_MAJOR <= 12 120 | CommonOptionsParser op(argc, argv, MatcherSampleCategory); 121 | #elif MU_LLVM_MAJOR > 12 122 | auto op_result = CommonOptionsParser::create(argc, argv, MatcherSampleCategory); 123 | if (auto E = op_result.takeError()) { 124 | std::cerr << "\n" << toString(std::move(E)) << "\n"; 125 | exit(-1); 126 | } 127 | auto& op = *op_result; 128 | #endif /*MU_LLVM_MAJOR*/ 129 | ClangTool Tool(op.getCompilations(), op.getSourcePathList()); 130 | 131 | std::shared_ptr diag_consumer_shptr(new MyDiagConsumer()); 132 | //Tool.setDiagnosticConsumer(diag_consumer_shptr.get()); 133 | 134 | int num_exclusive_options = 0; 135 | if (ConvertToSCPP) { 136 | num_exclusive_options += 1; 137 | } 138 | if (ConvertC2ValidCpp) { 139 | num_exclusive_options += 1; 140 | } 141 | if (ExpandPointerMacros) { 142 | num_exclusive_options += 1; 143 | } 144 | 145 | if (1 < num_exclusive_options) { 146 | llvm::errs() << "More than one mutually exclusive option indicated: The ConvertToSCPP, ConvertC2ValidCpp and ExpandPointerMacros options are mutually exclusive. You may only use one at a time.." << '\n'; 147 | return retval; 148 | } 149 | 150 | if (true) { 151 | checker::Options options = { 152 | CheckSystemHeader, 153 | MainFileOnly, 154 | CTUAnalysis, 155 | EnableNamespaceImport, 156 | SuppressPrompts 157 | }; 158 | retval = checker::buildASTs_and_run(Tool, options); 159 | } 160 | 161 | if (ConvertToSCPP.getValue()) { 162 | #ifndef EXCLUDE_CONVERTER_MODE1 163 | 164 | /* The "checker" pass, among other things, determined which regions of the code are indicated 165 | to be excluded from the checks. 
The "convert" pass also needs this information. Rather than 166 | re-compute it, we'll copy it from the stored "states" of the checker pass. */ 167 | for (const auto& checker_state : checker::g_final_tu_states) { 168 | convm1::CTUState convm1_state; 169 | convm1_state.m_suppress_check_region_set = checker_state.m_suppress_check_region_set; 170 | convm1::g_prepared_initial_tu_states.push_back(convm1_state); 171 | } 172 | std::reverse(convm1::g_prepared_initial_tu_states.begin(), convm1::g_prepared_initial_tu_states.end()); 173 | 174 | convm1::Options options = { 175 | CheckSystemHeader, 176 | MainFileOnly, 177 | ConvertToSCPP, 178 | CTUAnalysis, 179 | EnableNamespaceImport, 180 | SuppressPrompts, 181 | DoNotReplaceOriginalSource, 182 | MergeCommand, 183 | DoNotResolveMergeConflicts, 184 | ConvertMode, 185 | ScopeTypeFunctionParameters, 186 | ScopeTypePointerFunctionParameters, 187 | AddressableVars 188 | }; 189 | retval = convm1::buildASTs_and_run(Tool, options); 190 | #endif //!EXCLUDE_CONVERTER_MODE1 191 | } else if (ConvertC2ValidCpp.getValue() || ExpandPointerMacros.getValue()) { 192 | #ifndef EXCLUDE_C2VALIDCPP 193 | 194 | /* The "checker" pass, among other things, determined which regions of the code are indicated 195 | to be excluded from the checks. The "convert" pass also needs this information. Rather than 196 | re-compute it, we'll copy it from the stored "states" of the checker pass. 
*/ 197 | for (const auto& checker_state : checker::g_final_tu_states) { 198 | convc2validcpp::CTUState convm1_state; 199 | convm1_state.m_suppress_check_region_set = checker_state.m_suppress_check_region_set; 200 | convc2validcpp::g_prepared_initial_tu_states.push_back(convm1_state); 201 | } 202 | std::reverse(convc2validcpp::g_prepared_initial_tu_states.begin(), convc2validcpp::g_prepared_initial_tu_states.end()); 203 | 204 | convc2validcpp::Options options = { 205 | CheckSystemHeader, 206 | MainFileOnly, 207 | ConvertC2ValidCpp, 208 | ExpandPointerMacros, 209 | CTUAnalysis, 210 | EnableNamespaceImport, 211 | SuppressPrompts, 212 | DoNotReplaceOriginalSource, 213 | MergeCommand, 214 | DoNotResolveMergeConflicts, 215 | ConvertMode, 216 | ScopeTypeFunctionParameters, 217 | ScopeTypePointerFunctionParameters, 218 | AddressableVars 219 | }; 220 | retval = convc2validcpp::buildASTs_and_run(Tool, options); 221 | #endif //!EXCLUDE_C2VALIDCPP 222 | } 223 | 224 | return retval; 225 | } 226 | /*last line intentionally left blank*/ 227 | 228 | -------------------------------------------------------------------------------- /src/scpptool.h: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2019 Noah Lopez 2 | // special thanks to Farzad Sadeghi 3 | // Use, modification, and distribution is subject to the Boost Software 4 | // License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at 5 | // http://www.boost.org/LICENSE_1_0.txt) 6 | 7 | 8 | #ifndef __SCPPTOOL_H 9 | #define __SCPPTOOL_H 10 | 11 | #if __cpp_exceptions >= 199711 12 | #define SCPPT_THROW(x) throw(x) 13 | #define SCPPT_TRY try 14 | #define SCPPT_CATCH(x) catch(x) 15 | #define SCPPT_CATCH_ANY catch(...) 16 | #define SCPPT_FUNCTION_TRY try 17 | #define SCPPT_FUNCTION_CATCH(x) catch(x) 18 | #define SCPPT_FUNCTION_CATCH_ANY catch(...) 
19 | #else // __cpp_exceptions >= 199711 20 | #define SCPPT_THROW(x) exit(-11) 21 | #define SCPPT_TRY if (true) 22 | #define SCPPT_CATCH(x) if (false) 23 | #define SCPPT_CATCH_ANY if (false) 24 | #define SCPPT_FUNCTION_TRY 25 | #define SCPPT_FUNCTION_CATCH(x) void SCPPT_placeholder_function_catch(x) 26 | #define SCPPT_FUNCTION_CATCH_ANY void SCPPT_placeholder_function_catch_any() 27 | #endif // __cpp_exceptions >= 199711 28 | 29 | #include "llvm/Config/llvm-config.h" 30 | 31 | #ifndef MU_LLVM_MAJOR 32 | #ifdef LLVM_VERSION_MAJOR 33 | #define MU_LLVM_MAJOR LLVM_VERSION_MAJOR 34 | #else /*LLVM_VERSION_MAJOR*/ 35 | #ifdef __clang_major__ 36 | #define MU_LLVM_MAJOR __clang_major__ 37 | #else /*__clang_major__*/ 38 | #define MU_LLVM_MAJOR 10 39 | #endif /*__clang_major__*/ 40 | #endif /*LLVM_VERSION_MAJOR*/ 41 | #endif /*MU_LLVM_MAJOR*/ 42 | 43 | #ifndef MSE_NAMESPACE_STR 44 | #define MSE_NAMESPACE_STR "mse" 45 | #endif //MSE_NAMESPACE_STR 46 | 47 | #endif //__SCPPTOOL_H 48 | 49 | -------------------------------------------------------------------------------- /src/utils1.cpp: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2019 Noah Lopez 2 | // special thanks to Farzad Sadeghi 3 | // Use, modification, and distribution is subject to the Boost Software 4 | // License, Version 1.0. 
(See accompanying file LICENSE_1_0.txt or copy at 5 | // http://www.boost.org/LICENSE_1_0.txt) 6 | 7 | 8 | #include "utils1.h" 9 | /*Standard headers*/ 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | #include 18 | #include 19 | 20 | #include 21 | #include 22 | #include 23 | 24 | #include 25 | 26 | /*Clang Headers*/ 27 | #include "clang/AST/AST.h" 28 | #include "clang/AST/ASTConsumer.h" 29 | #include "clang/ASTMatchers/ASTMatchers.h" 30 | #include "clang/ASTMatchers/ASTMatchFinder.h" 31 | #include "clang/Frontend/CompilerInstance.h" 32 | #include "clang/Frontend/FrontendActions.h" 33 | #include "clang/Lex/Lexer.h" 34 | #include "clang/Tooling/CommonOptionsParser.h" 35 | #include "clang/Tooling/Tooling.h" 36 | #include "clang/Rewrite/Core/Rewriter.h" 37 | #include "clang/Lex/Preprocessor.h" 38 | #include "clang/AST/ASTImporter.h" 39 | #include "clang/AST/ASTDiagnostic.h" 40 | #include "clang/Frontend/TextDiagnosticPrinter.h" 41 | 42 | #include "clang/Basic/SourceManager.h" 43 | 44 | /*LLVM Headers*/ 45 | #include "llvm/Support/raw_ostream.h" 46 | #include "llvm/IR/Function.h" 47 | /**********************************************************************************************************************/ 48 | /*used namespaces*/ 49 | using namespace llvm; 50 | using namespace clang; 51 | using namespace clang::ast_matchers; 52 | using namespace clang::driver; 53 | using namespace clang::tooling; 54 | /**********************************************************************************************************************/ 55 | 56 | /* Execute a shell command. 
*/ 57 | std::pair<std::string, bool> exec(const char* cmd) { 58 | std::array<char, 128> buffer; 59 | std::string result; 60 | std::unique_ptr<FILE, decltype(&pclose)> pipe(popen(cmd, "r"), pclose); /* unique_ptr (unlike shared_ptr) does not invoke the deleter when popen() returns null */ 61 | //if (!pipe) SCPPT_THROW( std::runtime_error("popen() failed!")); 62 | if (!pipe) { return std::pair(result, true); } 63 | while (!feof(pipe.get())) { 64 | if (fgets(buffer.data(), 128, pipe.get()) != nullptr) 65 | result += buffer.data(); 66 | } 67 | return std::pair(result, false); 68 | } 69 | 70 | clang::SourceRange instantiation_source_range(const clang::SourceRange& sr, clang::Rewriter &Rewrite) 71 | { 72 | auto& SM = Rewrite.getSourceMgr(); 73 | SourceLocation SL = sr.getBegin(); 74 | SourceLocation SLE = sr.getEnd(); 75 | 76 | if (false && (SL.isMacroID() && SLE.isMacroID()) && (!filtered_out_by_location(SM, SL))) { 77 | if ((SM.isMacroArgExpansion(SL) || SM.isMacroBodyExpansion(SL)) 78 | && (SM.isMacroArgExpansion(SLE) || SM.isMacroBodyExpansion(SLE))) { 79 | 80 | auto SL2 = SM.getExpansionLoc(SL); 81 | auto SLE2 = SM.getExpansionLoc(SLE); 82 | return { SL2, SLE2 }; 83 | } 84 | } 85 | 86 | SL = SM.getFileLoc(SL); 87 | SLE = SM.getFileLoc(SLE); 88 | 89 | if ((SL == SLE) && (sr.getBegin() != sr.getEnd())) { 90 | int q = 5; 91 | } 92 | return SourceRange(SL, SLE); 93 | } 94 | 95 | clang::SourceRange nice_source_range(const clang::SourceRange& sr, clang::Rewriter &Rewrite) 96 | { 97 | SourceLocation SL = sr.getBegin(); 98 | SourceLocation SLE = sr.getEnd(); 99 | 100 | if (SL.isMacroID() && SLE.isMacroID() && (!filtered_out_by_location(Rewrite.getSourceMgr(), SL))) { 101 | /* If the start and end locations are macro (instantiation) locations, then we'll presume that 102 | they are part of the same macro.
*/ 103 | IF_DEBUG(std::string debug_source_location_str = SL.printToString(Rewrite.getSourceMgr());) 104 | IF_DEBUG(std::string text1 = Rewrite.getRewrittenText({SL, SLE});) 105 | auto SL5 = Rewrite.getSourceMgr().getSpellingLoc(SL); 106 | auto SLE5 = Rewrite.getSourceMgr().getSpellingLoc(SLE); 107 | clang::SourceRange SR5 = { SL5, SLE5 }; 108 | 109 | if ((!(SLE5 < SL5)) && (SR5.isValid())) { 110 | auto FLSL5 = Rewrite.getSourceMgr().getFileLoc(SL5); 111 | if (!filtered_out_by_location(Rewrite.getSourceMgr(), FLSL5)) { 112 | IF_DEBUG(std::string text5 = Rewrite.getRewrittenText(SR5);) 113 | return SR5; 114 | } 115 | } else { 116 | int q = 5; 117 | } 118 | } 119 | 120 | SL = Rewrite.getSourceMgr().getFileLoc(SL); 121 | SLE = Rewrite.getSourceMgr().getFileLoc(SLE); 122 | clang::SourceRange retSR = { SL, SLE }; 123 | 124 | #ifndef NDEBUG 125 | if ((!(SLE < SL)) && (retSR.isValid())) { 126 | std::string text6 = Rewrite.getRewrittenText({SL, SLE}); 127 | int q = 5; 128 | } else { 129 | int q = 5; 130 | } 131 | #endif /*!NDEBUG*/ 132 | 133 | return retSR; 134 | } 135 | 136 | bool is_macro_instantiation(const clang::SourceRange& sr, clang::Rewriter &Rewrite) 137 | { 138 | bool retval = false; 139 | SourceLocation SL = sr.getBegin(); 140 | SourceLocation SLE = sr.getEnd(); 141 | 142 | if (SL.isMacroID() && SLE.isMacroID() && (!filtered_out_by_location(Rewrite.getSourceMgr(), SL))) { 143 | /* If the start and end locations are macro (instantiation) locations, then we'll presume that 144 | they are part of the same macro. 
*/ 145 | IF_DEBUG(std::string debug_source_location_str = SL.printToString(Rewrite.getSourceMgr());) 146 | IF_DEBUG(std::string text1 = Rewrite.getRewrittenText({SL, SLE});) 147 | auto SL5 = Rewrite.getSourceMgr().getSpellingLoc(SL); 148 | auto SLE5 = Rewrite.getSourceMgr().getSpellingLoc(SLE); 149 | clang::SourceRange SR5 = { SL5, SLE5 }; 150 | 151 | if ((!(SLE5 < SL5)) && (SR5.isValid())) { 152 | auto FLSL5 = Rewrite.getSourceMgr().getFileLoc(SL5); 153 | if (!filtered_out_by_location(Rewrite.getSourceMgr(), FLSL5)) { 154 | IF_DEBUG(std::string text5 = Rewrite.getRewrittenText(SR5);) 155 | /* This may be a macro function argument or something, but we don't think it's an 156 | actual instantiation of a macro. */ 157 | return false; 158 | } 159 | } else { 160 | int q = 5; 161 | } 162 | return true; 163 | } 164 | 165 | SL = Rewrite.getSourceMgr().getFileLoc(SL); 166 | SLE = Rewrite.getSourceMgr().getFileLoc(SLE); 167 | clang::SourceRange retSR = { SL, SLE }; 168 | 169 | #ifndef NDEBUG 170 | if ((!(SLE < SL)) && (retSR.isValid())) { 171 | std::string text6 = Rewrite.getRewrittenText({SL, SLE}); 172 | int q = 5; 173 | } else { 174 | int q = 5; 175 | } 176 | #endif /*!NDEBUG*/ 177 | 178 | return false; 179 | } 180 | 181 | bool first_is_a_subset_of_second(const clang::SourceRange& first, const clang::SourceRange& second) { 182 | bool retval = true; 183 | if ((first.getBegin() < second.getBegin()) || (second.getEnd() < first.getEnd())) { 184 | retval = false; 185 | } 186 | return retval; 187 | } 188 | 189 | bool first_is_a_proper_subset_of_second(const clang::SourceRange& first, const clang::SourceRange& second) { 190 | bool retval = true; 191 | if ((!first_is_a_subset_of_second(first, second)) || (second == first)) { 192 | retval = false; 193 | } 194 | return retval; 195 | } 196 | 197 | bool errors_suppressed_by_location(const SourceManager &SM, SourceLocation SL) { 198 | auto res1 = evaluate_filtering_by_location(SM, SL); 199 | return res1.m_suppress_errors; 200 | }
201 | bool errors_suppressed_by_location(ASTContext const& Ctx, SourceLocation SL) { 202 | const SourceManager &SM = Ctx.getSourceManager(); 203 | return errors_suppressed_by_location(SM, SL); 204 | } 205 | bool errors_suppressed_by_location(const ast_matchers::MatchFinder::MatchResult &MR, SourceLocation SL) { 206 | ASTContext *const ASTC = MR.Context; 207 | assert(MR.Context); 208 | const SourceManager &SM = ASTC->getSourceManager(); 209 | return errors_suppressed_by_location(SM, SL); 210 | } 211 | 212 | // trim from start (in place) 213 | void ltrim(std::string &s) { 214 | auto isnotspace = [](int ch) { return !std::isspace(ch); }; 215 | s.erase(s.begin(), std::find_if(s.begin(), s.end(), 216 | isnotspace)); 217 | } 218 | 219 | // trim from end (in place) 220 | void rtrim(std::string &s) { 221 | auto isnotspace = [](int ch) { return !std::isspace(ch); }; 222 | s.erase(std::find_if(s.rbegin(), s.rend(), 223 | isnotspace).base(), s.end()); 224 | } 225 | 226 | std::string with_whitespace_removed(const std::string_view str) { 227 | std::string retval; 228 | retval = str; 229 | retval.erase(std::remove_if(retval.begin(), retval.end(), isspace), retval.end()); 230 | return retval; 231 | } 232 | 233 | std::string with_newlines_removed(const std::string_view str) { 234 | std::string retval; 235 | retval = str; 236 | auto riter1 = retval.rbegin(); 237 | while (retval.rend() != riter1) { 238 | if ('\n' == *riter1) { 239 | auto riter2 = riter1; 240 | riter2++; 241 | retval.erase(riter1.base()--); 242 | while (retval.rend() != riter2) { 243 | /* look for and remove 'continued on the next line' backslash if present. */ 244 | if ('\\' == (*riter2)) { 245 | riter1++; 246 | retval.erase(riter2.base()--); 247 | break; 248 | } else if (!std::isspace(*riter2)) { 249 | break; 250 | } 251 | 252 | riter2++; 253 | } 254 | } 255 | 256 | riter1++; 257 | } 258 | 259 | return retval; 260 | } 261 | 262 | /* No longer used. 
This function extracts the text of individual declarations when multiple 263 | * pointers are declared in the same declaration statement. */ 264 | std::vector f_declared_object_strings(const std::string_view decl_stmt_str) { 265 | std::vector retval; 266 | 267 | auto nice_decl_stmt_str = with_newlines_removed(decl_stmt_str); 268 | auto semicolon_position = std::string::npos; 269 | for (size_t pos = 3; pos < nice_decl_stmt_str.size(); pos += 1) { 270 | if (';' == nice_decl_stmt_str[pos]) { 271 | semicolon_position = pos; 272 | } 273 | } 274 | if (std::string::npos == semicolon_position) { 275 | assert(false); 276 | return retval; 277 | } 278 | 279 | std::vector delimiter_positions; 280 | for (size_t pos = 3; ((pos < nice_decl_stmt_str.size()) && (pos < semicolon_position)); pos += 1) { 281 | if (',' == nice_decl_stmt_str[pos]) { 282 | delimiter_positions.push_back(pos); 283 | } 284 | } 285 | 286 | delimiter_positions.push_back(semicolon_position); 287 | auto first_delimiter_pos = delimiter_positions[0]; 288 | 289 | { 290 | auto pos1 = first_delimiter_pos - 1; 291 | auto pos2 = pos1; 292 | bool nonspace_found = false; 293 | while ((2 <= pos1) && (!nonspace_found)) { 294 | if (!std::isspace(nice_decl_stmt_str[pos1])) { 295 | pos2 = pos1 + 1; 296 | nonspace_found = true; 297 | } 298 | 299 | pos1 -= 1; 300 | } 301 | if (!nonspace_found) { 302 | assert(false); 303 | return retval; 304 | } 305 | 306 | bool space_found = false; 307 | while ((1 <= pos1) && (!space_found)) { 308 | if (std::isspace(nice_decl_stmt_str[pos1])) { 309 | space_found = true; 310 | } 311 | 312 | pos1 -= 1; 313 | } 314 | if (!space_found) { 315 | assert(false); 316 | return retval; 317 | } 318 | 319 | pos1 += 2; 320 | std::string first_declaration_string = nice_decl_stmt_str.substr(pos1, pos2 - pos1); 321 | retval.push_back(first_declaration_string); 322 | } 323 | 324 | { 325 | size_t delimiter_index = 0; 326 | while (delimiter_positions.size() > (delimiter_index + 1)) { 327 | if 
(!(delimiter_positions[delimiter_index] + 1 < delimiter_positions[(delimiter_index + 1)])) { 328 | //assert(false); 329 | } else { 330 | std::string declaration_string = nice_decl_stmt_str.substr(delimiter_positions[delimiter_index] + 1, delimiter_positions[(delimiter_index + 1)] - (delimiter_positions[delimiter_index] + 1)); 331 | retval.push_back(declaration_string); 332 | } 333 | 334 | delimiter_index += 1; 335 | } 336 | } 337 | 338 | return retval; 339 | } 340 | 341 | std::string tolowerstr(const std::string_view a) { 342 | std::string retval; 343 | for (const auto& ch : a) { 344 | retval += tolower(ch); 345 | } 346 | return retval; 347 | } 348 | 349 | bool string_begins_with(const std::string_view s1, const std::string_view prefix) { 350 | return (0 == s1.compare(0, prefix.length(), prefix)); 351 | } 352 | bool string_ends_with(const std::string_view s1, const std::string_view suffix) { 353 | if (suffix.length() > s1.length()) { 354 | return false; 355 | } 356 | return (0 == s1.compare(s1.length() - suffix.length(), suffix.length(), suffix)); 357 | } 358 | 359 | 360 | /* This function returns a list of individual declarations contained in the same declaration statement 361 | * as the given declaration. (eg.: "int a, b = 3, *c;" ) */ 362 | std::vector IndividualDeclaratorDecls(const DeclaratorDecl* DD) { 363 | /* There's probably a more efficient way to do this, but this implementation seems to work. 
*/ 364 | std::vector retval; 365 | 366 | if (!DD) { 367 | assert(false); 368 | return retval; 369 | } 370 | auto SR = DD->getSourceRange(); 371 | if (!SR.isValid()) { 372 | return retval; 373 | } 374 | SourceLocation SL = SR.getBegin(); 375 | 376 | auto decl_context = DD->getDeclContext(); 377 | if ((!decl_context) || (!SL.isValid())) { 378 | assert(false); 379 | retval.push_back(DD); 380 | } else { 381 | for (auto decl_iter = decl_context->decls_begin(); decl_iter != decl_context->decls_end(); decl_iter++) { 382 | auto decl = (*decl_iter); 383 | auto l_DD = dyn_cast(decl); 384 | if (l_DD) { 385 | auto DDSR = l_DD->getSourceRange(); 386 | if (DDSR.isValid()) { 387 | SourceLocation l_SL = DDSR.getBegin(); 388 | if (l_SL == SL) { 389 | retval.push_back(l_DD); 390 | } 391 | } 392 | } 393 | } 394 | } 395 | if (0 == retval.size()) { 396 | //assert(false); 397 | retval.push_back(DD); 398 | } 399 | 400 | return retval; 401 | } 402 | 403 | std::vector IndividualDeclaratorDecls(const DeclaratorDecl* DD, Rewriter &Rewrite) { 404 | if (!DD) { 405 | assert(false); 406 | return std::vector{}; 407 | } 408 | auto SR = nice_source_range(DD->getSourceRange(), Rewrite); 409 | std::string source_text; 410 | if (SR.isValid()) { 411 | source_text = Rewrite.getRewrittenText(SR); 412 | } 413 | SourceLocation SL = SR.getBegin(); 414 | return IndividualDeclaratorDecls(DD); 415 | } 416 | 417 | 418 | /* Determine if a given type is defined using a 'typedef'ed type of pointer type. 
*/ 419 | bool UsesPointerTypedef(clang::QualType qtype) { 420 | IF_DEBUG(std::string qtype_str = qtype.getAsString()); 421 | IF_DEBUG(std::string typeClassName = qtype->getTypeClassName();) 422 | auto TDT = clang::dyn_cast<const clang::TypedefType>(qtype.getTypePtr()); 423 | if (TDT) { 424 | return true; 425 | } else { 426 | if (qtype->isPointerType()) { 427 | return UsesPointerTypedef(qtype->getPointeeType()); 428 | } else if (qtype->isArrayType()) { 429 | if (llvm::isa<const clang::ArrayType>(qtype.getTypePtr())) { 430 | auto ATP = llvm::cast<const clang::ArrayType>(qtype.getTypePtr()); 431 | return UsesPointerTypedef(ATP->getElementType()); 432 | } else { 433 | int q = 3; 434 | } 435 | } else if (qtype->isFunctionType()) { 436 | if (llvm::isa<const clang::FunctionType>(qtype.getTypePtr())) { 437 | auto FT = llvm::cast<const clang::FunctionType>(qtype.getTypePtr()); 438 | return UsesPointerTypedef(FT->getReturnType()); 439 | } else { 440 | int q = 3; 441 | } 442 | } 443 | } 444 | return false; 445 | } 446 | --------------------------------------------------------------------------------
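The string helpers defined in utils1.cpp above (`ltrim`, `rtrim`, `string_begins_with`, `string_ends_with`) do not depend on clang, so they can be tried in isolation. The following is a minimal standalone sketch, not part of the repository. It copies the four helpers with one labeled change: the character is cast to `unsigned char` before `std::isspace`, an assumption added here to avoid undefined behavior for negative `char` values. `trimmed` and `run_examples` are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// trim from start (in place); lambda takes unsigned char (change from the source)
void ltrim(std::string &s) {
    auto isnotspace = [](unsigned char ch) { return !std::isspace(ch); };
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), isnotspace));
}

// trim from end (in place); same unsigned char change as above
void rtrim(std::string &s) {
    auto isnotspace = [](unsigned char ch) { return !std::isspace(ch); };
    s.erase(std::find_if(s.rbegin(), s.rend(), isnotspace).base(), s.end());
}

bool string_begins_with(const std::string_view s1, const std::string_view prefix) {
    // compare() clamps the count to the available length, so a prefix longer
    // than s1 simply compares unequal.
    return (0 == s1.compare(0, prefix.length(), prefix));
}

bool string_ends_with(const std::string_view s1, const std::string_view suffix) {
    if (suffix.length() > s1.length()) {
        return false;
    }
    return (0 == s1.compare(s1.length() - suffix.length(), suffix.length(), suffix));
}

// Hypothetical convenience wrapper (not in the source): trims both ends of a copy.
std::string trimmed(std::string s) {
    ltrim(s);
    rtrim(s);
    return s;
}

// Hypothetical driver (not in the source) exercising the helpers.
void run_examples() {
    assert(trimmed("  int *p;  \n") == "int *p;");
    assert(string_begins_with("mse::TXScopeFixedPointer", "mse::"));
    assert(!string_begins_with("ab", "abc"));
    assert(string_ends_with("utils1.cpp", ".cpp"));
    assert(!string_ends_with("h", ".cpp"));
}
```

Calling `run_examples()` from a `main` function built with a C++17 compiler is enough to check the behavior. The explicit length guard in `string_ends_with` matters because the starting offset would otherwise underflow when the suffix is longer than the string.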