├── LICENSE
└── README.md


/LICENSE:
--------------------------------------------------------------------------------
1 | Creative Commons Legal Code
2 |
3 | CC0 1.0 Universal
4 |
5 |     CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
6 |     LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN
7 |     ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
8 |     INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
9 |     REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS
10 |     PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM
11 |     THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED
12 |     HEREUNDER.
13 |
14 | Statement of Purpose
15 |
16 | The laws of most jurisdictions throughout the world automatically confer
17 | exclusive Copyright and Related Rights (defined below) upon the creator
18 | and subsequent owner(s) (each and all, an "owner") of an original work of
19 | authorship and/or a database (each, a "Work").
20 |
21 | Certain owners wish to permanently relinquish those rights to a Work for
22 | the purpose of contributing to a commons of creative, cultural and
23 | scientific works ("Commons") that the public can reliably and without fear
24 | of later claims of infringement build upon, modify, incorporate in other
25 | works, reuse and redistribute as freely as possible in any form whatsoever
26 | and for any purposes, including without limitation commercial purposes.
27 | These owners may contribute to the Commons to promote the ideal of a free
28 | culture and the further production of creative, cultural and scientific
29 | works, or to gain reputation or greater distribution for their Work in
30 | part through the use and efforts of others.
31 |
32 | For these and/or other purposes and motivations, and without any
33 | expectation of additional consideration or compensation, the person
34 | associating CC0 with a Work (the "Affirmer"), to the extent that he or she
35 | is an owner of Copyright and Related Rights in the Work, voluntarily
36 | elects to apply CC0 to the Work and publicly distribute the Work under its
37 | terms, with knowledge of his or her Copyright and Related Rights in the
38 | Work and the meaning and intended legal effect of CC0 on those rights.
39 |
40 | 1. Copyright and Related Rights. A Work made available under CC0 may be
41 | protected by copyright and related or neighboring rights ("Copyright and
42 | Related Rights"). Copyright and Related Rights include, but are not
43 | limited to, the following:
44 |
45 |   i. the right to reproduce, adapt, distribute, perform, display,
46 |      communicate, and translate a Work;
47 |  ii. moral rights retained by the original author(s) and/or performer(s);
48 | iii. publicity and privacy rights pertaining to a person's image or
49 |      likeness depicted in a Work;
50 |  iv. rights protecting against unfair competition in regards to a Work,
51 |      subject to the limitations in paragraph 4(a), below;
52 |   v. rights protecting the extraction, dissemination, use and reuse of data
53 |      in a Work;
54 |  vi. database rights (such as those arising under Directive 96/9/EC of the
55 |      European Parliament and of the Council of 11 March 1996 on the legal
56 |      protection of databases, and under any national implementation
57 |      thereof, including any amended or successor version of such
58 |      directive); and
59 | vii. other similar, equivalent or corresponding rights throughout the
60 |      world based on applicable law or treaty, and any national
61 |      implementations thereof.
62 |
63 | 2. Waiver.
To the greatest extent permitted by, but not in contravention
64 | of, applicable law, Affirmer hereby overtly, fully, permanently,
65 | irrevocably and unconditionally waives, abandons, and surrenders all of
66 | Affirmer's Copyright and Related Rights and associated claims and causes
67 | of action, whether now known or unknown (including existing as well as
68 | future claims and causes of action), in the Work (i) in all territories
69 | worldwide, (ii) for the maximum duration provided by applicable law or
70 | treaty (including future time extensions), (iii) in any current or future
71 | medium and for any number of copies, and (iv) for any purpose whatsoever,
72 | including without limitation commercial, advertising or promotional
73 | purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each
74 | member of the public at large and to the detriment of Affirmer's heirs and
75 | successors, fully intending that such Waiver shall not be subject to
76 | revocation, rescission, cancellation, termination, or any other legal or
77 | equitable action to disrupt the quiet enjoyment of the Work by the public
78 | as contemplated by Affirmer's express Statement of Purpose.
79 |
80 | 3. Public License Fallback. Should any part of the Waiver for any reason
81 | be judged legally invalid or ineffective under applicable law, then the
82 | Waiver shall be preserved to the maximum extent permitted taking into
83 | account Affirmer's express Statement of Purpose.
In addition, to the
84 | extent the Waiver is so judged Affirmer hereby grants to each affected
85 | person a royalty-free, non transferable, non sublicensable, non exclusive,
86 | irrevocable and unconditional license to exercise Affirmer's Copyright and
87 | Related Rights in the Work (i) in all territories worldwide, (ii) for the
88 | maximum duration provided by applicable law or treaty (including future
89 | time extensions), (iii) in any current or future medium and for any number
90 | of copies, and (iv) for any purpose whatsoever, including without
91 | limitation commercial, advertising or promotional purposes (the
92 | "License"). The License shall be deemed effective as of the date CC0 was
93 | applied by Affirmer to the Work. Should any part of the License for any
94 | reason be judged legally invalid or ineffective under applicable law, such
95 | partial invalidity or ineffectiveness shall not invalidate the remainder
96 | of the License, and in such case Affirmer hereby affirms that he or she
97 | will not (i) exercise any of his or her remaining Copyright and Related
98 | Rights in the Work or (ii) assert any associated claims and causes of
99 | action with respect to the Work, in either case contrary to Affirmer's
100 | express Statement of Purpose.
101 |
102 | 4. Limitations and Disclaimers.
103 |
104 |  a. No trademark or patent rights held by Affirmer are waived, abandoned,
105 |     surrendered, licensed or otherwise affected by this document.
106 |  b. Affirmer offers the Work as-is and makes no representations or
107 |     warranties of any kind concerning the Work, express, implied,
108 |     statutory or otherwise, including without limitation warranties of
109 |     title, merchantability, fitness for a particular purpose, non
110 |     infringement, or the absence of latent or other defects, accuracy, or
111 |     the present or absence of errors, whether or not discoverable, all to
112 |     the greatest extent permissible under applicable law.
113 |  c.
Affirmer disclaims responsibility for clearing rights of other persons
114 |     that may apply to the Work or any use thereof, including without
115 |     limitation any person's Copyright and Related Rights in the Work.
116 |     Further, Affirmer disclaims responsibility for obtaining any necessary
117 |     consents, permissions or other rights required for any use of the
118 |     Work.
119 |  d. Affirmer understands and acknowledges that Creative Commons is not a
120 |     party to this document and has no duty or obligation with respect to
121 |     this CC0 or use of the Work.
122 |


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Formally Reasoning About and Generating Optimizations for LLVM IR
2 |
3 | John Regehr, Zhengyang Liu, Nuno P. Lopes
4 |
5 | ## Introduction
6 |
7 | - What are we trying to do here?
8 |   - several projects in different stages
9 |     - making Alive2 as complete and definitive as possible
10 |     - making the LLVM implementation as formally grounded as possible [example of ongoing discussion](https://github.com/llvm/llvm-project/issues/49839)
11 |     - making it unnecessary to create optimizers by hand
12 | - Nuno's role, John's role, Zhengyang's role
13 | - parts of this talk
14 |
15 | ## Equivalence and Inequivalence
16 |
17 | - [simple equivalence](https://alive2.llvm.org/ce/z/arJCEv)
18 |   - instructions
19 |   - basic blocks
20 |   - terminators
21 |   - explicit typing
22 |   - implicit naming
23 | - [simple inequivalence](https://alive2.llvm.org/ce/z/gtT2J4)
24 |   - what does "value mismatch" mean?
25 |   - reading counterexamples can be challenging
26 |   - we're working on making them better
27 | - [another simple example](https://alive2.llvm.org/ce/z/ZdEyvR)
28 | - [simple example with overflow disallowed](https://alive2.llvm.org/ce/z/qhL7v9)
29 | - [running specific opt passes](https://alive2.llvm.org/ce/z/4H2oo5)
30 | - exercise: [udiv -> sdiv](https://alive2.llvm.org/ce/z/IsgIcg) and [answer](https://alive2.llvm.org/ce/z/LS6Hty)
31 | - changing control flow [1](https://alive2.llvm.org/ce/z/EZs-NN) and [2](https://alive2.llvm.org/ce/z/bN8iK5)
32 | - unrolling loops [C to LLVM](https://gcc.godbolt.org/z/573Mr3d7e) and [verifying the LLVM](https://alive2.llvm.org/ce/z/4QTCQ_)
33 |
34 | ## Undefined Behavior and Refinement
35 |
36 | - undefined behavior is awful in user-facing languages
37 | - however, in an IR it is a useful tool for exposing optimization opportunities
38 | - IR-level UB has little to do with the source language!
39 | - there are several kinds of UB in LLVM; it's kind of complicated
40 |
41 | - "immediate UB" in LLVM is the same as in C and C++
42 |   - the program loses all meaning when you perform this sort of operation
43 |   - it is reserved for things with serious consequences
44 |     - divide by zero, since it traps on many architectures
45 |     - OOB memory operations, since these result in corrupted storage
46 |   - immediate UB [inhibits speculation](https://alive2.llvm.org/ce/z/wyArce)
47 |     - IR-level speculation is pretty important and shows up in a lot of places, e.g. LICM
48 |     - this motivated the addition of various forms of "deferred undefined behavior" to LLVM IR
49 |
50 | - undef stands for "any legal value for the type"
51 |   - so an i32 undef can produce any value from 0 to 2^32-1
52 |   - why is undef needed?
53 |     - it's common to have paths where a variable gets no specific value
54 |     - we can materialize any value we want, such as zero, but this has a cost
55 |     - undef lets us not care about what value is there; we basically end up just
56 |       using whatever value happens to be in a register or memory cell
57 |     - there's still a cost, but it's in terms of increasing the complexity of reasoning
58 |       about the compiler!
59 |     - basically all optimizing compilers end up with this concept in one form or another
60 |   - working out a coherent set of rules for undef was hard
61 |     - for example, what happens when you branch on undef?
62 |   - even with a coherent set of rules, compiler devs have a very hard time with undef
63 |
64 | - [undef can be transformed into an arbitrary value](https://alive2.llvm.org/ce/z/ZVUpXz)
65 |   - this means that sequential code ends up having many behaviors
66 | - [undef requires reasoning about arbitrary subsets of values](https://alive2.llvm.org/ce/z/DYNQiv)
67 |   - powerset behavior leads to extraordinarily difficult solver queries
68 | - [each SSA use of undef might yield a different value](https://alive2.llvm.org/ce/z/GLtJgB)
69 |   - an optimization can reduce the number of uses of a value that might be undef
70 |   - an optimization can keep the number of uses of a maybe-undef value the same
71 |   - an optimization can almost never increase the number of uses of a maybe-undef value
72 |
73 | - now we're ready to introduce refinement!
74 |   - a transformation is a refinement if, for all program configurations, the transformed code has a subset of the original code's behaviors
75 |   - when it's a proper subset we call it a non-trivial refinement
76 |   - non-trivial refinement is ubiquitous; we have to deal with it
77 |   - checking refinement is the primary function of Alive2
78 |   - refinement checks in the presence of undef put some stress on the
79 |     solver's quantifier elimination parts
80 |     - Z3 still needs a lot of work here; until that happens, we get timeouts on some apparently very simple queries
81 |
82 | - we've seen undef and immediate undefined behavior
83 |   - these two concepts ended up being insufficient to justify all of the
84 |     optimizations that the LLVM developers wanted to write, so one more
85 |     kind of undefined behavior was added
86 |   - in C and C++, `a + b > a` and `b > 0` always have the same result for signed integers, because signed overflow is undefined
87 |   - [the analogous optimization in LLVM didn't work](https://alive2.llvm.org/ce/z/wPtEnP)
88 |     - this can't be fixed by making signed overflow return undef
89 |     - it can be fixed by immediate UB on signed overflow, but that creates too many other problems
90 |   - the solution is a "poison" value that is outside the normal space of values
91 |     - for most operations, "op x, poison" [evaluates to poison](https://alive2.llvm.org/ce/z/8NFkSZ)
92 |     - this solves the problem, but poison and undef are a persistent source of problems for compiler developers, who easily forget about one or the other of them
93 |   - [poison refines to undef but not the other way around](https://alive2.llvm.org/ce/z/3-S3NT)
94 |
95 | - we proposed a "freeze" instruction that basically makes both undef and poison pick a stable value
96 |   - we got the LLVM community to accept it
97 |   - we implemented it (this was mainly Juneyoung Lee's work)
98 | - we're trying to get rid of undef entirely, but this is a long and difficult project
99 |   - many real-world engineering
constraints due to LLVM's wide adoption
100 |
101 | ## Memory
102 |
103 | - not going into details here, since it's pretty difficult stuff and most of the work was done outside of my group; we had a PLDI paper a few years ago, and Nuno + Juneyoung [have a newer one at CAV '21](https://web.ist.utl.pt/nuno.lopes/pubs.php?id=alive2-mem-cav21)
104 | - but it pretty much just works: [C to LLVM](https://gcc.godbolt.org/z/ooP5jEao3) and [LLVM to LLVM](https://alive2.llvm.org/ce/z/emHFcp) and [better LLVM to LLVM](https://alive2.llvm.org/ce/z/9jEVvb)
105 |
106 | ## Some Bugs
107 |
108 | - [list of bugs](https://github.com/AliveToolkit/alive2/blob/master/BugList.md)
109 |
110 | ## Translation Validation for the AArch64 backend
111 |
112 | - [blog post](https://blog.regehr.org/archives/2265)
113 |
114 | ## A Current Project: Continuous Fuzzing for LLVM
115 |
116 | - [we wrote a mutation engine for LLVM IR](https://blog.regehr.org/archives/2148)
117 | - we want to take a strong machine and have it...
118 |   - build a new LLVM each day
119 |   - spend the rest of the day performing mutation-based fuzzing
120 |   - test the middle-end optimizer and also some backends
121 | - I wrote a cache to avoid redundant queries
122 | - we'll reduce bug triggers using llvm-reduce
123 | - the current roadblock is triaging the outputs of this process
124 |
125 | # Minotaur: A superoptimizer focusing on vectorized x86-64 code
126 |
127 | - Zhengyang presents!
128 |
--------------------------------------------------------------------------------
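The refinement rules from the "Undefined Behavior and Refinement" section of the README can be illustrated with a toy brute-force checker. This is only a sketch under stated assumptions, not how Alive2 works (Alive2 encodes the check as solver queries). The 3-bit width, the set-of-outcomes model, and all names (`outcomes_refine`, `is_refinement`, `mul_by_two`, `add_to_self`) are hypothetical choices made for illustration. Immediate UB, memory, and poison as an input are not modeled.

```python
from itertools import product

WIDTH = 3                      # assumed toy bit width so exhaustive search stays tiny
VALUES = range(2 ** WIDTH)
POISON = "poison"              # marker outside the normal space of values
UNDEF = frozenset(VALUES)      # undef: any legal value for the type


def outcomes_refine(src, tgt):
    """True if every target outcome is allowed by some source outcome.

    A source outcome of poison allows any target outcome.
    """
    return all(any(s == POISON or s == t for s in src) for t in tgt)


def is_refinement(src_fn, tgt_fn, inputs):
    """Check that tgt_fn refines src_fn on every input in `inputs`."""
    return all(outcomes_refine(src_fn(x), tgt_fn(x)) for x in inputs)


def mul_by_two(x):
    # One use of x: a single value is chosen if x is undef.
    return {(2 * v) % 2 ** WIDTH for v in x}


def add_to_self(x):
    # Two uses of x: each use of undef may observe a different value.
    return {(a + b) % 2 ** WIDTH for a, b in product(x, repeat=2)}


# Inputs: every concrete value (as a singleton set) plus undef.
INPUTS = [frozenset({v}) for v in VALUES] + [UNDEF]

# Reducing the number of uses of a maybe-undef value is fine.
print(is_refinement(add_to_self, mul_by_two, INPUTS))   # True
# Increasing the number of uses adds behaviors, so it is not a refinement.
print(is_refinement(mul_by_two, add_to_self, INPUTS))   # False
# Poison refines to undef, but not the other way around.
print(outcomes_refine({POISON}, UNDEF))                 # True
print(outcomes_refine(UNDEF, {POISON}))                 # False
```

With undef as the input, `mul_by_two` can only produce even values while `add_to_self` can produce every value, which is why the rewrite is sound in one direction only.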