├── .gitignore ├── CHANGES.md ├── LICENSE ├── README.md ├── TODO.md ├── examples ├── arith.asdl ├── arith.py ├── toy.asdl └── toy.py ├── iast ├── __init__.py ├── asdl │ ├── Python33.asdl │ ├── Python34.asdl │ ├── __init__.py │ ├── asdl.py │ └── asdl_test.py ├── node.py ├── pattern.py ├── python │ ├── __init__.py │ ├── default.py │ ├── native.py │ ├── pynode.py │ ├── python33.py │ ├── python34.py │ └── pyutil.py ├── util.py └── visitor.py ├── setup.py ├── tests ├── __init__.py ├── python │ ├── __init__.py │ ├── test_pynode.py │ └── test_pyutil.py ├── test_node.py ├── test_pattern.py ├── test_util.py └── test_visitor.py └── tox.ini /.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__ 2 | .tox 3 | iAST.egg-info 4 | dist 5 | -------------------------------------------------------------------------------- /CHANGES.md: -------------------------------------------------------------------------------- 1 | # Release notes 2 | 3 | ## 0.2.1 (2015-01-04) 4 | 5 | - fix packaging bug regarding missing required files 6 | 7 | ## 0.2.0 (2015-01-03) 8 | 9 | - initial release 10 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2014-2015 Jonathan Brandvein 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | 22 | 23 | -------------------------------------------------------------------------------- 24 | 25 | 26 | The following files are Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 27 | 2008, 2009, 2010, 2011, 2012, 2013, 2014 Python Software Foundation; All Rights 28 | Reserved. They are covered by the PSF license. 29 | 30 | iast/asdl/asdl.py 31 | iast/asdl/asdl_test.py 32 | iast/python/Python33.asdl 33 | iast/python/Python34.asdl 34 | 35 | The text of the PSF license follows. 36 | 37 | 38 | A. HISTORY OF THE SOFTWARE 39 | ========================== 40 | 41 | Python was created in the early 1990s by Guido van Rossum at Stichting 42 | Mathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands 43 | as a successor of a language called ABC. Guido remains Python's 44 | principal author, although it includes many contributions from others. 45 | 46 | In 1995, Guido continued his work on Python at the Corporation for 47 | National Research Initiatives (CNRI, see http://www.cnri.reston.va.us) 48 | in Reston, Virginia where he released several versions of the 49 | software. 50 | 51 | In May 2000, Guido and the Python core development team moved to 52 | BeOpen.com to form the BeOpen PythonLabs team. In October of the same 53 | year, the PythonLabs team moved to Digital Creations (now Zope 54 | Corporation, see http://www.zope.com). In 2001, the Python Software 55 | Foundation (PSF, see http://www.python.org/psf/) was formed, a 56 | non-profit organization created specifically to own Python-related 57 | Intellectual Property. Zope Corporation is a sponsoring member of 58 | the PSF. 59 | 60 | All Python releases are Open Source (see http://www.opensource.org for 61 | the Open Source Definition). Historically, most, but not all, Python 62 | releases have also been GPL-compatible; the table below summarizes 63 | the various releases. 64 | 65 | Release Derived Year Owner GPL- 66 | from compatible? (1) 67 | 68 | 0.9.0 thru 1.2 1991-1995 CWI yes 69 | 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes 70 | 1.6 1.5.2 2000 CNRI no 71 | 2.0 1.6 2000 BeOpen.com no 72 | 1.6.1 1.6 2001 CNRI yes (2) 73 | 2.1 2.0+1.6.1 2001 PSF no 74 | 2.0.1 2.0+1.6.1 2001 PSF yes 75 | 2.1.1 2.1+2.0.1 2001 PSF yes 76 | 2.1.2 2.1.1 2002 PSF yes 77 | 2.1.3 2.1.2 2002 PSF yes 78 | 2.2 and above 2.1.1 2001-now PSF yes 79 | 80 | Footnotes: 81 | 82 | (1) GPL-compatible doesn't mean that we're distributing Python under 83 | the GPL. All Python licenses, unlike the GPL, let you distribute 84 | a modified version without making your changes open source. The 85 | GPL-compatible licenses make it possible to combine Python with 86 | other software that is released under the GPL; the others don't. 87 | 88 | (2) According to Richard Stallman, 1.6.1 is not GPL-compatible, 89 | because its license has a choice of law clause. According to 90 | CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 91 | is "not incompatible" with the GPL. 92 | 93 | Thanks to the many outside volunteers who have worked under Guido's 94 | direction to make these releases possible. 95 | 96 | 97 | B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON 98 | =============================================================== 99 | 100 | PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 101 | -------------------------------------------- 102 | 103 | 1. This LICENSE AGREEMENT is between the Python Software Foundation 104 | ("PSF"), and the Individual or Organization ("Licensee") accessing and 105 | otherwise using this software ("Python") in source or binary form and 106 | its associated documentation. 107 | 108 | 2. Subject to the terms and conditions of this License Agreement, PSF hereby 109 | grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, 110 | analyze, test, perform and/or display publicly, prepare derivative works, 111 | distribute, and otherwise use Python alone or in any derivative version, 112 | provided, however, that PSF's License Agreement and PSF's notice of copyright, 113 | i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 114 | 2011, 2012, 2013 Python Software Foundation; All Rights Reserved" are retained 115 | in Python alone or in any derivative version prepared by Licensee. 116 | 117 | 3. In the event Licensee prepares a derivative work that is based on 118 | or incorporates Python or any part thereof, and wants to make 119 | the derivative work available to others as provided herein, then 120 | Licensee hereby agrees to include in any such work a brief summary of 121 | the changes made to Python. 122 | 123 | 4. PSF is making Python available to Licensee on an "AS IS" 124 | basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR 125 | IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND 126 | DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS 127 | FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT 128 | INFRINGE ANY THIRD PARTY RIGHTS. 129 | 130 | 5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON 131 | FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS 132 | A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, 133 | OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. 134 | 135 | 6. This License Agreement will automatically terminate upon a material 136 | breach of its terms and conditions. 137 | 138 | 7. Nothing in this License Agreement shall be deemed to create any 139 | relationship of agency, partnership, or joint venture between PSF and 140 | Licensee. This License Agreement does not grant permission to use PSF 141 | trademarks or trade name in a trademark sense to endorse or promote 142 | products or services of Licensee, or any third party. 143 | 144 | 8. By copying, installing or otherwise using Python, Licensee 145 | agrees to be bound by the terms and conditions of this License 146 | Agreement. 147 | 148 | 149 | BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 150 | ------------------------------------------- 151 | 152 | BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 153 | 154 | 1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an 155 | office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the 156 | Individual or Organization ("Licensee") accessing and otherwise using 157 | this software in source or binary form and its associated 158 | documentation ("the Software"). 159 | 160 | 2. Subject to the terms and conditions of this BeOpen Python License 161 | Agreement, BeOpen hereby grants Licensee a non-exclusive, 162 | royalty-free, world-wide license to reproduce, analyze, test, perform 163 | and/or display publicly, prepare derivative works, distribute, and 164 | otherwise use the Software alone or in any derivative version, 165 | provided, however, that the BeOpen Python License is retained in the 166 | Software, alone or in any derivative version prepared by Licensee. 167 | 168 | 3. BeOpen is making the Software available to Licensee on an "AS IS" 169 | basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR 170 | IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND 171 | DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS 172 | FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT 173 | INFRINGE ANY THIRD PARTY RIGHTS. 174 | 175 | 4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE 176 | SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS 177 | AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY 178 | DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. 179 | 180 | 5. This License Agreement will automatically terminate upon a material 181 | breach of its terms and conditions. 182 | 183 | 6. This License Agreement shall be governed by and interpreted in all 184 | respects by the law of the State of California, excluding conflict of 185 | law provisions. Nothing in this License Agreement shall be deemed to 186 | create any relationship of agency, partnership, or joint venture 187 | between BeOpen and Licensee. This License Agreement does not grant 188 | permission to use BeOpen trademarks or trade names in a trademark 189 | sense to endorse or promote products or services of Licensee, or any 190 | third party. As an exception, the "BeOpen Python" logos available at 191 | http://www.pythonlabs.com/logos.html may be used according to the 192 | permissions granted on that web page. 193 | 194 | 7. By copying, installing or otherwise using the software, Licensee 195 | agrees to be bound by the terms and conditions of this License 196 | Agreement. 197 | 198 | 199 | CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 200 | --------------------------------------- 201 | 202 | 1. This LICENSE AGREEMENT is between the Corporation for National 203 | Research Initiatives, having an office at 1895 Preston White Drive, 204 | Reston, VA 20191 ("CNRI"), and the Individual or Organization 205 | ("Licensee") accessing and otherwise using Python 1.6.1 software in 206 | source or binary form and its associated documentation. 207 | 208 | 2. Subject to the terms and conditions of this License Agreement, CNRI 209 | hereby grants Licensee a nonexclusive, royalty-free, world-wide 210 | license to reproduce, analyze, test, perform and/or display publicly, 211 | prepare derivative works, distribute, and otherwise use Python 1.6.1 212 | alone or in any derivative version, provided, however, that CNRI's 213 | License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) 214 | 1995-2001 Corporation for National Research Initiatives; All Rights 215 | Reserved" are retained in Python 1.6.1 alone or in any derivative 216 | version prepared by Licensee. Alternately, in lieu of CNRI's License 217 | Agreement, Licensee may substitute the following text (omitting the 218 | quotes): "Python 1.6.1 is made available subject to the terms and 219 | conditions in CNRI's License Agreement. This Agreement together with 220 | Python 1.6.1 may be located on the Internet using the following 221 | unique, persistent identifier (known as a handle): 1895.22/1013. This 222 | Agreement may also be obtained from a proxy server on the Internet 223 | using the following URL: http://hdl.handle.net/1895.22/1013". 224 | 225 | 3. In the event Licensee prepares a derivative work that is based on 226 | or incorporates Python 1.6.1 or any part thereof, and wants to make 227 | the derivative work available to others as provided herein, then 228 | Licensee hereby agrees to include in any such work a brief summary of 229 | the changes made to Python 1.6.1. 230 | 231 | 4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS" 232 | basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR 233 | IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND 234 | DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS 235 | FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT 236 | INFRINGE ANY THIRD PARTY RIGHTS. 237 | 238 | 5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON 239 | 1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS 240 | A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, 241 | OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. 242 | 243 | 6. This License Agreement will automatically terminate upon a material 244 | breach of its terms and conditions. 245 | 246 | 7. This License Agreement shall be governed by the federal 247 | intellectual property law of the United States, including without 248 | limitation the federal copyright law, and, to the extent such 249 | U.S. federal law does not apply, by the law of the Commonwealth of 250 | Virginia, excluding Virginia's conflict of law provisions. 251 | Notwithstanding the foregoing, with regard to derivative works based 252 | on Python 1.6.1 that incorporate non-separable material that was 253 | previously distributed under the GNU General Public License (GPL), the 254 | law of the Commonwealth of Virginia shall govern this License 255 | Agreement only as to issues arising under or with respect to 256 | Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this 257 | License Agreement shall be deemed to create any relationship of 258 | agency, partnership, or joint venture between CNRI and Licensee. This 259 | License Agreement does not grant permission to use CNRI trademarks or 260 | trade name in a trademark sense to endorse or promote products or 261 | services of Licensee, or any third party. 262 | 263 | 8. By clicking on the "ACCEPT" button where indicated, or by copying, 264 | installing or otherwise using Python 1.6.1, Licensee agrees to be 265 | bound by the terms and conditions of this License Agreement. 266 | 267 | ACCEPT 268 | 269 | 270 | CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 271 | -------------------------------------------------- 272 | 273 | Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, 274 | The Netherlands. All rights reserved. 275 | 276 | Permission to use, copy, modify, and distribute this software and its 277 | documentation for any purpose and without fee is hereby granted, 278 | provided that the above copyright notice appear in all copies and that 279 | both that copyright notice and this permission notice appear in 280 | supporting documentation, and that the name of Stichting Mathematisch 281 | Centrum or CWI not be used in advertising or publicity pertaining to 282 | distribution of the software without specific, written prior 283 | permission. 284 | 285 | STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO 286 | THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND 287 | FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE 288 | FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES 289 | WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN 290 | ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT 291 | OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 292 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # iAST # 2 | 3 | *(Supports Python 3.3 and 3.4)* 4 | 5 | This library provides a way of defining and transforming abstract syntax 6 | trees (ASTs) for custom languages. It can be used to help build a compiler 7 | or other program transformation system. 8 | 9 | iAST reads your language's abstract syntax from an ASDL grammar, and 10 | automatically generates node classes. A standard visitor-style framework 11 | is provided for traversing, transforming, and pattern matching over trees. 12 | Nodes are hashable, have structural equality, and support optional type 13 | checking. (Parsing is not supported and should be handled by an external 14 | parser generator.) 15 | 16 | Node definitions for the ASTs of Python 3.3 and Python 3.4 are provided 17 | out-of-the-box, along with tools for writing code templates and macros 18 | targeting Python code. However, the main framework works on ASTs for 19 | arbitrary languages. 20 | 21 | ## Examples ## 22 | 23 | See [arith.py](examples/arith.py) for basic usage and visitors/transformers. 24 | See [toy.py](examples/toy.py) for a comparison with Python's own ast module 25 | and the use of type checking. Both examples use abstract grammars from the 26 | corresponding ASDL files. 27 | 28 | ## Installation ## 29 | 30 | To install from pip/PyPI: 31 | 32 | ``` 33 | python -m pip install iast 34 | ``` 35 | 36 | To use a development version: 37 | 38 | ``` 39 | python -m pip install https://github.com/brandjon/iast/tree/tarball/develop 40 | ``` 41 | 42 | Python 3.3 and 3.4 are supported. The only dependency is 43 | [simplestruct](https://github.com/brandjon/simplestruct), which is used to 44 | define the node classes. 45 | 46 | ## Developers ## 47 | 48 | Tests can be run with `python setup.py test`, or by installing 49 | [Tox](http://testrun.org/tox/latest/) and running `python -m tox` 50 | in the project root. Tox tests both Python 3.3 and 3.4 configurations. 51 | Building a source distribution (`python setup.py sdist`) requires the 52 | setuptools extension package 53 | [setuptools-git](https://github.com/wichert/setuptools-git). 54 | 55 | ## References ## 56 | 57 | [1]: https://github.com/eliben/asdl_parser 58 | [[1]]: Eli Bendersky's rewrite of the Python ASDL parser, which powers 59 | iAST's generation of nodes from ASDL. 60 | -------------------------------------------------------------------------------- /TODO.md: -------------------------------------------------------------------------------- 1 | # Wishlist # 2 | - Wishlist empty. (All my dreams have come true?) 3 | -------------------------------------------------------------------------------- /examples/arith.asdl: -------------------------------------------------------------------------------- 1 | module arith 2 | { 3 | expr = BinOp(expr left, operator op, expr right) 4 | | Neg(expr value) 5 | | Num(object n) 6 | | Var(identifier id) 7 | 8 | operator = Add | Sub | Mult | Div 9 | } 10 | -------------------------------------------------------------------------------- /examples/arith.py: -------------------------------------------------------------------------------- 1 | """Example of a simple language of arithmetic expressions. 2 | Demonstrates parsing ASDL to create node classes, and using 3 | visitors and transformers to process the tree. 4 | """ 5 | 6 | import iast 7 | 8 | # Read and parse the abstract grammar from an ASDL file. 9 | with open('arith.asdl', 'rt') as file: 10 | absgrammar = iast.parse_asdl(file.read()) 11 | # Generate node classes and store them in a mapping by name. 12 | lang = iast.node.nodes_from_asdl(absgrammar) 13 | # Flood the global namespace with these node classes so 14 | # we can use them more easily. 15 | globals().update(lang) 16 | 17 | 18 | class Unparser(iast.NodeVisitor): 19 | 20 | """Turns AST back into an expression string.""" 21 | 22 | def process(self, tree): 23 | self.tokens = [] 24 | super().process(tree) 25 | return ''.join(self.tokens) 26 | 27 | def visit_BinOp(self, node): 28 | self.tokens.append('(') 29 | self.visit(node.left) 30 | self.tokens.append(' ') 31 | self.visit(node.op) 32 | self.tokens.append(' ') 33 | self.visit(node.right) 34 | self.tokens.append(')') 35 | 36 | def visit_Neg(self, node): 37 | self.tokens.append('-(') 38 | self.visit(node.value) 39 | self.tokens.append(')') 40 | 41 | def visit_Num(self, node): 42 | self.tokens.append(str(node.n)) 43 | 44 | def visit_Var(self, node): 45 | self.tokens.append(node.id) 46 | 47 | def op_helper(self, node): 48 | map = {'Add': '+', 'Sub': '-', 'Mult': '*', 'Div': '/'} 49 | self.tokens.append(map[node.__class__.__name__]) 50 | 51 | visit_Add = visit_Sub = visit_Mult = visit_Div = op_helper 52 | 53 | 54 | class Simplifier(iast.NodeTransformer): 55 | 56 | """Constructs a simplified AST based on a few rewriting rules.""" 57 | 58 | def visit_Neg(self, node): 59 | # Process the expression being negated first. 60 | node = self.generic_visit(node) 61 | 62 | # If the inside value is also a Neg, we cancel out. 63 | if isinstance(node.value, Neg): 64 | return node.value.value 65 | # If the inside value is zero, Neg is redundant. 66 | # (Note the use of structural equality on node instances.) 67 | elif node.value == Num(0): 68 | return Num(0) 69 | # Oh well. Be sure to return node. Returning None indicates 70 | # "no change", which would mean ignoring any rewriting done 71 | # to the inner expression by generic_visit() above. 72 | else: 73 | return node 74 | 75 | def visit_BinOp(self, node): 76 | # Process operand expressions first. 77 | node = self.generic_visit(node) 78 | 79 | # Zero-times-anything rule. 80 | if (node.op == Mult() and 81 | (node.left == Num(0) or node.right == Num(0))): 82 | # Return the AST to replace this node. 83 | return Num(0) 84 | # Zero-plus-anything rule. 85 | elif (node.op == Add() and 86 | (node.left == Num(0) or node.right == Num(0))): 87 | return node.right if node.left == Num(0) else node.left 88 | else: 89 | return node 90 | 91 | 92 | tree = (BinOp(BinOp(BinOp(Var('x'), Add(), Num(3)), Mult(), Neg(Num(0))), 93 | Add(), 94 | BinOp(Var('x'), Sub(), Neg(Neg(Num(2)))))) 95 | # This works because we flooded the global namespace. If we didn't do 96 | # that, we could have instead accessed nodes in lang by their name 97 | # (lang['BinOp'], etc.), or we could have used eval() with lang as 98 | # the namespace, as in: 99 | # 100 | # tree = eval("BinOp(Var('x'), Add(), Num(3))", lang) 101 | 102 | # ASTs get a straightforward repr() from simplestruct. 103 | print(tree) 104 | print() 105 | 106 | # We can also use a multi-line pretty printer. 107 | print(iast.dump(tree)) 108 | print() 109 | 110 | # Visitors can be used by instantiating them and calling the 111 | # process() method with the tree as the argument. As a shorthand, 112 | # the classmethod run() does both instantiation and calling process(). 113 | print(Unparser.run(tree)) 114 | print() 115 | 116 | # For transformers, the result is a new tree. Be sure to use 117 | # assignment to overwrite the old tree. 118 | tree = Simplifier.run(tree) 119 | print(Unparser.run(tree)) 120 | -------------------------------------------------------------------------------- /examples/toy.asdl: -------------------------------------------------------------------------------- 1 | module toy 2 | { 3 | program = (stmt* code) 4 | 5 | stmt = Pass() 6 | | Print(expr? value) 7 | | Assign(identifier id, expr value) 8 | | While(expr test, stmt* code) 9 | | If(expr test, stmt* code, stmt* orelse) 10 | 11 | expr = Num(object n) 12 | | Var(identifier id) 13 | } 14 | -------------------------------------------------------------------------------- /examples/toy.py: -------------------------------------------------------------------------------- 1 | """Example using an abstract grammar for a simple toy language. 2 | Demonstrates structural properties, type checking, and differences 3 | with Python's own ast module. 4 | """ 5 | 6 | import iast 7 | import ast 8 | 9 | # Generate the nodes from ASDL. 10 | # Note the flags passed to nodes_from_asdl(): 11 | # 12 | # - typed=True makes the node classes type-checked to enforce that their 13 | # children conform to the grammar in the ASDL. 14 | # 15 | # - module=__name__ explicitly sets this module as the defining module 16 | # for the node classes. This helps ensure that node instances can be 17 | # pickled, although it should still work so long as this module is 18 | # accessible via sys.modules. 19 | # 20 | with open('toy.asdl', 'rt') as file: 21 | absgrammar = iast.parse_asdl(file.read()) 22 | lang = iast.node.nodes_from_asdl(absgrammar, typed=True, module=__name__) 23 | globals().update(lang) 24 | 25 | 26 | # iAST node classes are subclasses of simplestruct.Struct, and 27 | # therefore have structural equality. 28 | node1 = Num(5) 29 | node2 = Num(5) 30 | print(node1 is node2) # False 31 | print(node1 == node2) # True 32 | 33 | # Compare this to Python's ast library. 34 | node1 = ast.Num(5) 35 | node2 = ast.Num(5) 36 | print(node1 is node2) # False 37 | print(node1 == node2) # False 38 | 39 | # iAST nodes are immutable and hashable. 40 | node1 = Num(5) 41 | node2 = Num(5) 42 | try: 43 | node1.n = 6 44 | except AttributeError as e: 45 | print(e) # Struct is immutable 46 | print(hash(node1)) # The hash values must be the same 47 | print(hash(node2)) # since they are equal. 48 | 49 | # Python's ast library's nodes are mutable. This can give it 50 | # a speed advantage for tree transformations (the fact that 51 | # they're implemented in C also helps). Personally, I find 52 | # mutability in tree transformations to be more error-prone. 53 | # 54 | # Python ast nodes are hashable, but without structural equality, 55 | # hashes aren't very useful. For instance, if you want to test 56 | # whether the AST for some expression is in a set, you need to 57 | # already have a reference to that AST -- essentially the interned 58 | # representation for that expression. It's not enough to parse the 59 | # expression and make a new AST. 60 | 61 | # iAST nodes are (optionally) type-checked. 62 | try: 63 | Var(5) 64 | except TypeError as e: 65 | print(e) # Error constructing Var (field 'id'): 66 | # Expected str; got int 67 | 68 | # Child fields marked ? in the ASLD are optionally None. 69 | # They still must be explicitly passed to the constructor. 70 | Print(Num(5)) 71 | Print(None) 72 | 73 | # Child fields marked * are sequence valued. They can be 74 | # tuples or lists (normalized to tuples). 75 | program([Pass(), Pass()]) 76 | try: 77 | program(Pass()) 78 | except TypeError as e: 79 | print(e) # Error constructing program (field 'code'): 80 | # Expected sequence of stmt; got Pass node instead 81 | -------------------------------------------------------------------------------- /iast/__init__.py: -------------------------------------------------------------------------------- 1 | """Provides tools for defining and manipulating abstract syntax trees.""" 2 | 3 | __version__ = '0.2.1' 4 | 5 | from .asdl import * 6 | from .util import * 7 | from .node import * 8 | from .visitor import * 9 | from .pattern import * 10 | -------------------------------------------------------------------------------- /iast/asdl/Python33.asdl: -------------------------------------------------------------------------------- 1 | -- ASDL's five builtin types are identifier, int, string, bytes, object 2 | 3 | module Python 4 | { 5 | mod = Module(stmt* body) 6 | | Interactive(stmt* body) 7 | | Expression(expr body) 8 | 9 | -- not really an actual node but useful in Jython's typesystem. 10 | | Suite(stmt* body) 11 | 12 | stmt = FunctionDef(identifier name, arguments args, 13 | stmt* body, expr* decorator_list, expr? returns) 14 | | ClassDef(identifier name, 15 | expr* bases, 16 | keyword* keywords, 17 | expr? starargs, 18 | expr? kwargs, 19 | stmt* body, 20 | expr* decorator_list) 21 | | Return(expr? value) 22 | 23 | | Delete(expr* targets) 24 | | Assign(expr* targets, expr value) 25 | | AugAssign(expr target, operator op, expr value) 26 | 27 | -- use 'orelse' because else is a keyword in target languages 28 | | For(expr target, expr iter, stmt* body, stmt* orelse) 29 | | While(expr test, stmt* body, stmt* orelse) 30 | | If(expr test, stmt* body, stmt* orelse) 31 | | With(withitem* items, stmt* body) 32 | 33 | | Raise(expr? exc, expr? cause) 34 | | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody) 35 | | Assert(expr test, expr? msg) 36 | 37 | | Import(alias* names) 38 | | ImportFrom(identifier? module, alias* names, int? level) 39 | 40 | | Global(identifier* names) 41 | | Nonlocal(identifier* names) 42 | | Expr(expr value) 43 | | Pass | Break | Continue 44 | 45 | -- XXX Jython will be different 46 | -- col_offset is the byte offset in the utf8 string the parser uses 47 | attributes (int lineno, int col_offset) 48 | 49 | -- BoolOp() can use left & right? 50 | expr = BoolOp(boolop op, expr* values) 51 | | BinOp(expr left, operator op, expr right) 52 | | UnaryOp(unaryop op, expr operand) 53 | | Lambda(arguments args, expr body) 54 | | IfExp(expr test, expr body, expr orelse) 55 | | Dict(expr* keys, expr* values) 56 | | Set(expr* elts) 57 | | ListComp(expr elt, comprehension* generators) 58 | | SetComp(expr elt, comprehension* generators) 59 | | DictComp(expr key, expr value, comprehension* generators) 60 | | GeneratorExp(expr elt, comprehension* generators) 61 | -- the grammar constrains where yield expressions can occur 62 | | Yield(expr? value) 63 | | YieldFrom(expr value) 64 | -- need sequences for compare to distinguish between 65 | -- x < 4 < 3 and (x < 4) < 3 66 | | Compare(expr left, cmpop* ops, expr* comparators) 67 | | Call(expr func, expr* args, keyword* keywords, 68 | expr? starargs, expr? kwargs) 69 | | Num(object n) -- a number as a PyObject. 70 | | Str(string s) -- need to specify raw, unicode, etc? 71 | | Bytes(bytes s) 72 | | Ellipsis 73 | -- other literals? bools? 74 | 75 | -- the following expression can appear in assignment context 76 | | Attribute(expr value, identifier attr, expr_context ctx) 77 | | Subscript(expr value, slice slice, expr_context ctx) 78 | | Starred(expr value, expr_context ctx) 79 | | Name(identifier id, expr_context ctx) 80 | | List(expr* elts, expr_context ctx) 81 | | Tuple(expr* elts, expr_context ctx) 82 | 83 | -- col_offset is the byte offset in the utf8 string the parser uses 84 | attributes (int lineno, int col_offset) 85 | 86 | expr_context = Load | Store | Del | AugLoad | AugStore | Param 87 | 88 | slice = Slice(expr? lower, expr? upper, expr? step) 89 | | ExtSlice(slice* dims) 90 | | Index(expr value) 91 | 92 | boolop = And | Or 93 | 94 | operator = Add | Sub | Mult | Div | Mod | Pow | LShift 95 | | RShift | BitOr | BitXor | BitAnd | FloorDiv 96 | 97 | unaryop = Invert | Not | UAdd | USub 98 | 99 | cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn 100 | 101 | comprehension = (expr target, expr iter, expr* ifs) 102 | 103 | excepthandler = ExceptHandler(expr? type, identifier? name, stmt* body) 104 | attributes (int lineno, int col_offset) 105 | 106 | arguments = (arg* args, identifier? vararg, expr? varargannotation, 107 | arg* kwonlyargs, identifier? kwarg, 108 | expr? kwargannotation, expr* defaults, 109 | expr* kw_defaults) 110 | arg = (identifier arg, expr? annotation) 111 | 112 | -- keyword arguments supplied to call 113 | keyword = (identifier arg, expr value) 114 | 115 | -- import name with optional 'as' alias. 116 | alias = (identifier name, identifier? asname) 117 | 118 | withitem = (expr context_expr, expr? optional_vars) 119 | } 120 | 121 | -------------------------------------------------------------------------------- /iast/asdl/Python34.asdl: -------------------------------------------------------------------------------- 1 | -- ASDL's six builtin types are identifier, int, string, bytes, object, singleton 2 | 3 | module Python 4 | { 5 | mod = Module(stmt* body) 6 | | Interactive(stmt* body) 7 | | Expression(expr body) 8 | 9 | -- not really an actual node but useful in Jython's typesystem. 10 | | Suite(stmt* body) 11 | 12 | stmt = FunctionDef(identifier name, arguments args, 13 | stmt* body, expr* decorator_list, expr? returns) 14 | | ClassDef(identifier name, 15 | expr* bases, 16 | keyword* keywords, 17 | expr? starargs, 18 | expr? kwargs, 19 | stmt* body, 20 | expr* decorator_list) 21 | | Return(expr? value) 22 | 23 | | Delete(expr* targets) 24 | | Assign(expr* targets, expr value) 25 | | AugAssign(expr target, operator op, expr value) 26 | 27 | -- use 'orelse' because else is a keyword in target languages 28 | | For(expr target, expr iter, stmt* body, stmt* orelse) 29 | | While(expr test, stmt* body, stmt* orelse) 30 | | If(expr test, stmt* body, stmt* orelse) 31 | | With(withitem* items, stmt* body) 32 | 33 | | Raise(expr? exc, expr? cause) 34 | | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody) 35 | | Assert(expr test, expr? msg) 36 | 37 | | Import(alias* names) 38 | | ImportFrom(identifier? module, alias* names, int? level) 39 | 40 | | Global(identifier* names) 41 | | Nonlocal(identifier* names) 42 | | Expr(expr value) 43 | | Pass | Break | Continue 44 | 45 | -- XXX Jython will be different 46 | -- col_offset is the byte offset in the utf8 string the parser uses 47 | attributes (int lineno, int col_offset) 48 | 49 | -- BoolOp() can use left & right? 50 | expr = BoolOp(boolop op, expr* values) 51 | | BinOp(expr left, operator op, expr right) 52 | | UnaryOp(unaryop op, expr operand) 53 | | Lambda(arguments args, expr body) 54 | | IfExp(expr test, expr body, expr orelse) 55 | | Dict(expr* keys, expr* values) 56 | | Set(expr* elts) 57 | | ListComp(expr elt, comprehension* generators) 58 | | SetComp(expr elt, comprehension* generators) 59 | | DictComp(expr key, expr value, comprehension* generators) 60 | | GeneratorExp(expr elt, comprehension* generators) 61 | -- the grammar constrains where yield expressions can occur 62 | | Yield(expr? value) 63 | | YieldFrom(expr value) 64 | -- need sequences for compare to distinguish between 65 | -- x < 4 < 3 and (x < 4) < 3 66 | | Compare(expr left, cmpop* ops, expr* comparators) 67 | | Call(expr func, expr* args, keyword* keywords, 68 | expr? starargs, expr? kwargs) 69 | | Num(object n) -- a number as a PyObject. 70 | | Str(string s) -- need to specify raw, unicode, etc? 71 | | Bytes(bytes s) 72 | | NameConstant(singleton value) 73 | | Ellipsis 74 | 75 | -- the following expression can appear in assignment context 76 | | Attribute(expr value, identifier attr, expr_context ctx) 77 | | Subscript(expr value, slice slice, expr_context ctx) 78 | | Starred(expr value, expr_context ctx) 79 | | Name(identifier id, expr_context ctx) 80 | | List(expr* elts, expr_context ctx) 81 | | Tuple(expr* elts, expr_context ctx) 82 | 83 | -- col_offset is the byte offset in the utf8 string the parser uses 84 | attributes (int lineno, int col_offset) 85 | 86 | expr_context = Load | Store | Del | AugLoad | AugStore | Param 87 | 88 | slice = Slice(expr? lower, expr? upper, expr? step) 89 | | ExtSlice(slice* dims) 90 | | Index(expr value) 91 | 92 | boolop = And | Or 93 | 94 | operator = Add | Sub | Mult | Div | Mod | Pow | LShift 95 | | RShift | BitOr | BitXor | BitAnd | FloorDiv 96 | 97 | unaryop = Invert | Not | UAdd | USub 98 | 99 | cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn 100 | 101 | comprehension = (expr target, expr iter, expr* ifs) 102 | 103 | excepthandler = ExceptHandler(expr? type, identifier? name, stmt* body) 104 | attributes (int lineno, int col_offset) 105 | 106 | arguments = (arg* args, arg? vararg, arg* kwonlyargs, expr* kw_defaults, 107 | arg? kwarg, expr* defaults) 108 | 109 | arg = (identifier arg, expr? annotation) 110 | attributes (int lineno, int col_offset) 111 | 112 | -- keyword arguments supplied to call 113 | keyword = (identifier arg, expr value) 114 | 115 | -- import name with optional 'as' alias. 116 | alias = (identifier name, identifier? asname) 117 | 118 | withitem = (expr context_expr, expr? optional_vars) 119 | } 120 | 121 | -------------------------------------------------------------------------------- /iast/asdl/__init__.py: -------------------------------------------------------------------------------- 1 | """This subpackage is derived from Eli Bendersky's rewrite of the 2 | Python ASDL parser, which has been incorporated into CPython. 3 | asdl.py and asdl_test.py are directly from 4 | 5 | https://github.com/eliben/asdl_parser 6 | 7 | with minor modification to work as a subpackage. They are 8 | covered by the PSF license. See also: 9 | 10 | http://bugs.python.org/issue19655 11 | 12 | Python33.asdl and Python34.asdl are from their respective versions 13 | of the CPython source distribution (Parser/Python.asdl). They are 14 | covered by the PSF license. 15 | """ 16 | 17 | 18 | __all__ = [ 19 | 'parse_asdl', 20 | 'primitive_types', 21 | 'python33_asdl', 22 | 'python34_asdl', 23 | # ... 24 | ] 25 | 26 | 27 | from os.path import join, dirname 28 | 29 | from .asdl import ASDLParser 30 | 31 | 32 | primitive_types = { 33 | 'identifier': str, 34 | 'int': int, 35 | 'string': str, 36 | 'bytes': bytes, 37 | 'object': object, 38 | 'singleton': object, 39 | } 40 | 41 | def parse_asdl(asdl_source): 42 | parser = ASDLParser() 43 | return parser.parse(asdl_source) 44 | 45 | py_asdl33_filename = join(dirname(__file__), 'Python33.asdl') 46 | py_asdl34_filename = join(dirname(__file__), 'Python34.asdl') 47 | with open(py_asdl33_filename, 'rt') as file: 48 | python33_asdl = parse_asdl(file.read()) 49 | with open(py_asdl34_filename, 'rt') as file: 50 | python34_asdl = parse_asdl(file.read()) 51 | -------------------------------------------------------------------------------- /iast/asdl/asdl.py: -------------------------------------------------------------------------------- 1 | #------------------------------------------------------------------------------- 2 | # Parser for ASDL [1] definition files. Reads in an ASDL description and parses 3 | # it into an AST that describes it. 4 | # 5 | # The EBNF we're parsing here: Figure 1 of the paper [1]. Extended to support 6 | # modules and attributes after a product. Words starting with Capital letters 7 | # are terminals. Literal tokens are in "double quotes". Others are 8 | # non-terminals. Id is either TokenId or ConstructorId. 9 | # 10 | # module ::= "module" Id "{" [definitions] "}" 11 | # definitions ::= { TypeId "=" type } 12 | # type ::= product | sum 13 | # product ::= fields ["attributes" fields] 14 | # fields ::= "(" { field, "," } field ")" 15 | # field ::= TypeId ["?" | "*"] [Id] 16 | # sum ::= constructor { "|" constructor } ["attributes" fields] 17 | # constructor ::= ConstructorId [fields] 18 | # 19 | # [1] "The Zephyr Abstract Syntax Description Language" by Wang, et. al. See 20 | # http://asdl.sourceforge.net/ 21 | #------------------------------------------------------------------------------- 22 | from collections import namedtuple 23 | import re 24 | 25 | __all__ = [ 26 | 'builtin_types', 'parse', 'AST', 'Module', 'Type', 'Constructor', 27 | 'Field', 'Sum', 'Product', 'VisitorBase', 'Check', 'check'] 28 | 29 | # The following classes define nodes into which the ASDL description is parsed. 30 | # Note: this is a "meta-AST". ASDL files (such as Python.asdl) describe the AST 31 | # structure used by a programming language. But ASDL files themselves need to be 32 | # parsed. This module parses ASDL files and uses a simple AST to represent them. 33 | # See the EBNF at the top of the file to understand the logical connection 34 | # between the various node types. 35 | 36 | builtin_types = set( 37 | ['identifier', 'string', 'bytes', 'int', 'object', 'singleton']) 38 | 39 | class AST: 40 | def __repr__(self): 41 | raise NotImplementedError 42 | 43 | class Module(AST): 44 | def __init__(self, name, dfns): 45 | self.name = name 46 | self.dfns = dfns 47 | self.types = {type.name: type.value for type in dfns} 48 | 49 | def __repr__(self): 50 | return 'Module({0.name}, {0.dfns})'.format(self) 51 | 52 | class Type(AST): 53 | def __init__(self, name, value): 54 | self.name = name 55 | self.value = value 56 | 57 | def __repr__(self): 58 | return 'Type({0.name}, {0.value})'.format(self) 59 | 60 | class Constructor(AST): 61 | def __init__(self, name, fields=None): 62 | self.name = name 63 | self.fields = fields or [] 64 | 65 | def __repr__(self): 66 | return 'Constructor({0.name}, {0.fields})'.format(self) 67 | 68 | class Field(AST): 69 | def __init__(self, type, name=None, seq=False, opt=False): 70 | self.type = type 71 | self.name = name 72 | self.seq = seq 73 | self.opt = opt 74 | 75 | def __repr__(self): 76 | if self.seq: 77 | extra = ", seq=True" 78 | elif self.opt: 79 | extra = ", opt=True" 80 | else: 81 | extra = "" 82 | if self.name is None: 83 | return 'Field({0.type}{1})'.format(self, extra) 84 | else: 85 | return 'Field({0.type}, {0.name}{1})'.format(self, extra) 86 | 87 | class Sum(AST): 88 | def __init__(self, types, attributes=None): 89 | self.types = types 90 | self.attributes = attributes or [] 91 | 92 | def __repr__(self): 93 | if self.attributes: 94 | return 'Sum({0.types}, {0.attributes})'.format(self) 95 | else: 96 | return 'Sum({0.types})'.format(self) 97 | 98 | class Product(AST): 99 | def __init__(self, fields, attributes=None): 100 | self.fields = fields 101 | self.attributes = attributes or [] 102 | 103 | def __repr__(self): 104 | if self.attributes: 105 | return 'Product({0.fields}, {0.attributes})'.format(self) 106 | else: 107 | return 'Product({0.fields})'.format(self) 108 | 109 | # A generic visitor for the meta-AST that describes ASDL. This can be used by 110 | # emitters. Note that this visitor does not provide a generic visit method, so a 111 | # subclass needs to define visit methods from visitModule to as deep as the 112 | # interesting node. 113 | # We also define a Check visitor that makes sure the parsed ASDL is well-formed. 114 | 115 | class VisitorBase: 116 | """Generic tree visitor for ASTs.""" 117 | def __init__(self): 118 | self.cache = {} 119 | 120 | def visit(self, obj, *args): 121 | klass = obj.__class__ 122 | meth = self.cache.get(klass) 123 | if meth is None: 124 | methname = "visit" + klass.__name__ 125 | meth = getattr(self, methname, None) 126 | self.cache[klass] = meth 127 | if meth: 128 | try: 129 | meth(obj, *args) 130 | except Exception as e: 131 | print("Error visiting %r: %s" % (obj, e)) 132 | raise 133 | 134 | class Check(VisitorBase): 135 | """A visitor that checks a parsed ASDL tree for correctness. 136 | 137 | Errors are printed and accumulated. 138 | """ 139 | def __init__(self): 140 | super().__init__() 141 | self.cons = {} 142 | self.errors = 0 143 | self.types = {} 144 | 145 | def visitModule(self, mod): 146 | for dfn in mod.dfns: 147 | self.visit(dfn) 148 | 149 | def visitType(self, type): 150 | self.visit(type.value, str(type.name)) 151 | 152 | def visitSum(self, sum, name): 153 | for t in sum.types: 154 | self.visit(t, name) 155 | 156 | def visitConstructor(self, cons, name): 157 | key = str(cons.name) 158 | conflict = self.cons.get(key) 159 | if conflict is None: 160 | self.cons[key] = name 161 | else: 162 | print('Redefinition of constructor {}'.format(key)) 163 | print('Defined in {} and {}'.format(conflict, name)) 164 | self.errors += 1 165 | for f in cons.fields: 166 | self.visit(f, key) 167 | 168 | def visitField(self, field, name): 169 | key = str(field.type) 170 | l = self.types.setdefault(key, []) 171 | l.append(name) 172 | 173 | def visitProduct(self, prod, name): 174 | for f in prod.fields: 175 | self.visit(f, name) 176 | 177 | def check(mod): 178 | """Check the parsed ASDL tree for correctness. 179 | 180 | Return True if success. For failure, the errors are printed out and False 181 | is returned. 182 | """ 183 | v = Check() 184 | v.visit(mod) 185 | 186 | for t in v.types: 187 | if t not in mod.types and not t in builtin_types: 188 | v.errors += 1 189 | uses = ", ".join(v.types[t]) 190 | print('Undefined type {}, used in {}'.format(t, uses)) 191 | return not v.errors 192 | 193 | # The ASDL parser itself comes next. The only interesting external interface 194 | # here is the top-level parse function. 195 | 196 | def parse(filename): 197 | """Parse ASDL from the given file and return a Module node describing it.""" 198 | with open(filename) as f: 199 | parser = ASDLParser() 200 | return parser.parse(f.read()) 201 | 202 | # Types for describing tokens in an ASDL specification. 203 | class TokenKind: 204 | """TokenKind is provides a scope for enumerated token kinds.""" 205 | (ConstructorId, TypeId, Equals, Comma, Question, Pipe, Asterisk, 206 | LParen, RParen, LBrace, RBrace) = range(11) 207 | 208 | operator_table = { 209 | '=': Equals, ',': Comma, '?': Question, '|': Pipe, '(': LParen, 210 | ')': RParen, '*': Asterisk, '{': LBrace, '}': RBrace} 211 | 212 | Token = namedtuple('Token', 'kind value lineno') 213 | 214 | class ASDLSyntaxError(Exception): 215 | def __init__(self, msg, lineno=None): 216 | self.msg = msg 217 | self.lineno = lineno or '' 218 | 219 | def __str__(self): 220 | return 'Syntax error on line {0.lineno}: {0.msg}'.format(self) 221 | 222 | def tokenize_asdl(buf): 223 | """Tokenize the given buffer. Yield Token objects.""" 224 | for lineno, line in enumerate(buf.splitlines(), 1): 225 | for m in re.finditer(r'\s*(\w+|--.*|.)', line.strip()): 226 | c = m.group(1) 227 | if c[0].isalpha(): 228 | # Some kind of identifier 229 | if c[0].isupper(): 230 | yield Token(TokenKind.ConstructorId, c, lineno) 231 | else: 232 | yield Token(TokenKind.TypeId, c, lineno) 233 | elif c[:2] == '--': 234 | # Comment 235 | break 236 | else: 237 | # Operators 238 | try: 239 | op_kind = TokenKind.operator_table[c] 240 | except KeyError: 241 | raise ASDLSyntaxError('Invalid operator %s' % c, lineno) 242 | yield Token(op_kind, c, lineno) 243 | 244 | class ASDLParser: 245 | """Parser for ASDL files. 246 | 247 | Create, then call the parse method on a buffer containing ASDL. 248 | This is a simple recursive descent parser that uses tokenize_asdl for the 249 | lexing. 250 | """ 251 | def __init__(self): 252 | self._tokenizer = None 253 | self.cur_token = None 254 | 255 | def parse(self, buf): 256 | """Parse the ASDL in the buffer and return an AST with a Module root. 257 | """ 258 | self._tokenizer = tokenize_asdl(buf) 259 | self._advance() 260 | return self._parse_module() 261 | 262 | def _parse_module(self): 263 | if self._at_keyword('module'): 264 | self._advance() 265 | else: 266 | raise ASDLSyntaxError( 267 | 'Expected "module" (found {})'.format(self.cur_token.value), 268 | self.cur_token.lineno) 269 | name = self._match(self._id_kinds) 270 | self._match(TokenKind.LBrace) 271 | defs = self._parse_definitions() 272 | self._match(TokenKind.RBrace) 273 | return Module(name, defs) 274 | 275 | def _parse_definitions(self): 276 | defs = [] 277 | while self.cur_token.kind == TokenKind.TypeId: 278 | typename = self._advance() 279 | self._match(TokenKind.Equals) 280 | type = self._parse_type() 281 | defs.append(Type(typename, type)) 282 | return defs 283 | 284 | def _parse_type(self): 285 | if self.cur_token.kind == TokenKind.LParen: 286 | # If we see a (, it's a product 287 | return self._parse_product() 288 | else: 289 | # Otherwise it's a sum. Look for ConstructorId 290 | sumlist = [Constructor(self._match(TokenKind.ConstructorId), 291 | self._parse_optional_fields())] 292 | while self.cur_token.kind == TokenKind.Pipe: 293 | # More constructors 294 | self._advance() 295 | sumlist.append(Constructor( 296 | self._match(TokenKind.ConstructorId), 297 | self._parse_optional_fields())) 298 | return Sum(sumlist, self._parse_optional_attributes()) 299 | 300 | def _parse_product(self): 301 | return Product(self._parse_fields(), self._parse_optional_attributes()) 302 | 303 | def _parse_fields(self): 304 | fields = [] 305 | self._match(TokenKind.LParen) 306 | while self.cur_token.kind == TokenKind.TypeId: 307 | typename = self._advance() 308 | is_seq, is_opt = self._parse_optional_field_quantifier() 309 | id = (self._advance() if self.cur_token.kind in self._id_kinds 310 | else None) 311 | fields.append(Field(typename, id, seq=is_seq, opt=is_opt)) 312 | if self.cur_token.kind == TokenKind.RParen: 313 | break 314 | elif self.cur_token.kind == TokenKind.Comma: 315 | self._advance() 316 | self._match(TokenKind.RParen) 317 | return fields 318 | 319 | def _parse_optional_fields(self): 320 | if self.cur_token.kind == TokenKind.LParen: 321 | return self._parse_fields() 322 | else: 323 | return None 324 | 325 | def _parse_optional_attributes(self): 326 | if self._at_keyword('attributes'): 327 | self._advance() 328 | return self._parse_fields() 329 | else: 330 | return None 331 | 332 | def _parse_optional_field_quantifier(self): 333 | is_seq, is_opt = False, False 334 | if self.cur_token.kind == TokenKind.Asterisk: 335 | is_seq = True 336 | self._advance() 337 | elif self.cur_token.kind == TokenKind.Question: 338 | is_opt = True 339 | self._advance() 340 | return is_seq, is_opt 341 | 342 | def _advance(self): 343 | """ Return the value of the current token and read the next one into 344 | self.cur_token. 345 | """ 346 | cur_val = None if self.cur_token is None else self.cur_token.value 347 | try: 348 | self.cur_token = next(self._tokenizer) 349 | except StopIteration: 350 | self.cur_token = None 351 | return cur_val 352 | 353 | _id_kinds = (TokenKind.ConstructorId, TokenKind.TypeId) 354 | 355 | def _match(self, kind): 356 | """The 'match' primitive of RD parsers. 357 | 358 | * Verifies that the current token is of the given kind (kind can 359 | be a tuple, in which the kind must match one of its members). 360 | * Returns the value of the current token 361 | * Reads in the next token 362 | """ 363 | if (isinstance(kind, tuple) and self.cur_token.kind in kind or 364 | self.cur_token.kind == kind 365 | ): 366 | value = self.cur_token.value 367 | self._advance() 368 | return value 369 | else: 370 | raise ASDLSyntaxError( 371 | 'Unmatched {} (found {})'.format(kind, self.cur_token.kind), 372 | self.cur_token.lineno) 373 | 374 | def _at_keyword(self, keyword): 375 | return (self.cur_token.kind == TokenKind.TypeId and 376 | self.cur_token.value == keyword) 377 | -------------------------------------------------------------------------------- /iast/asdl/asdl_test.py: -------------------------------------------------------------------------------- 1 | # 12/17/14: Minor modification by Jon Brandvein to run as part of a 2 | # package and refer to the sample asdl file more robustly. 3 | 4 | # Simple testing / sanity-checking for asdl.py 5 | # Assumes some things about the current Python.asdl, which is used as input. 6 | 7 | import sys, unittest, os 8 | from . import asdl 9 | 10 | 11 | class TestAsdlParser(unittest.TestCase): 12 | @classmethod 13 | def setUpClass(cls): 14 | # Parse Python.asdl into a ast.Module and run the check on it. 15 | # There's no need to do this for each test method, hence setUpClass. 16 | python_asdl = os.path.join(os.path.dirname(__file__), 17 | 'Python34.asdl') 18 | cls.mod = asdl.parse(python_asdl) 19 | cls.assertTrue(asdl.check(cls.mod), 'Module validation failed') 20 | 21 | def setUp(self): 22 | # alias stuff from the class, for convenience 23 | self.mod = TestAsdlParser.mod 24 | self.types = self.mod.types 25 | 26 | def test_module(self): 27 | self.assertEqual(self.mod.name, 'Python') 28 | self.assertIn('stmt', self.types) 29 | self.assertIn('expr', self.types) 30 | self.assertIn('mod', self.types) 31 | 32 | def test_definitions(self): 33 | defs = self.mod.dfns 34 | self.assertIsInstance(defs[0], asdl.Type) 35 | self.assertIsInstance(defs[0].value, asdl.Sum) 36 | 37 | self.assertIsInstance(self.types['withitem'], asdl.Product) 38 | self.assertIsInstance(self.types['alias'], asdl.Product) 39 | 40 | def test_product(self): 41 | alias = self.types['alias'] 42 | self.assertEqual( 43 | str(alias), 44 | 'Product([Field(identifier, name), Field(identifier, asname, opt=True)])') 45 | 46 | def test_attributes(self): 47 | stmt = self.types['stmt'] 48 | self.assertEqual(len(stmt.attributes), 2) 49 | self.assertEqual(str(stmt.attributes[0]), 'Field(int, lineno)') 50 | self.assertEqual(str(stmt.attributes[1]), 'Field(int, col_offset)') 51 | 52 | def test_constructor_fields(self): 53 | ehandler = self.types['excepthandler'] 54 | self.assertEqual(len(ehandler.types), 1) 55 | self.assertEqual(len(ehandler.attributes), 2) 56 | 57 | cons = ehandler.types[0] 58 | self.assertIsInstance(cons, asdl.Constructor) 59 | self.assertEqual(len(cons.fields), 3) 60 | 61 | f0 = cons.fields[0] 62 | self.assertEqual(f0.type, 'expr') 63 | self.assertEqual(f0.name, 'type') 64 | self.assertTrue(f0.opt) 65 | 66 | f1 = cons.fields[1] 67 | self.assertEqual(f1.type, 'identifier') 68 | self.assertEqual(f1.name, 'name') 69 | self.assertTrue(f1.opt) 70 | 71 | f2 = cons.fields[2] 72 | self.assertEqual(f2.type, 'stmt') 73 | self.assertEqual(f2.name, 'body') 74 | self.assertFalse(f2.opt) 75 | self.assertTrue(f2.seq) 76 | 77 | def test_visitor(self): 78 | class CustomVisitor(asdl.VisitorBase): 79 | def __init__(self): 80 | super().__init__() 81 | self.names_with_seq = [] 82 | 83 | def visitModule(self, mod): 84 | for dfn in mod.dfns: 85 | self.visit(dfn) 86 | 87 | def visitType(self, type): 88 | self.visit(type.value) 89 | 90 | def visitSum(self, sum): 91 | for t in sum.types: 92 | self.visit(t) 93 | 94 | def visitConstructor(self, cons): 95 | for f in cons.fields: 96 | if f.seq: 97 | self.names_with_seq.append(cons.name) 98 | 99 | v = CustomVisitor() 100 | v.visit(self.types['mod']) 101 | self.assertEqual(v.names_with_seq, ['Module', 'Interactive', 'Suite']) 102 | 103 | 104 | if __name__ == '__main__': 105 | unittest.main() 106 | -------------------------------------------------------------------------------- /iast/node.py: -------------------------------------------------------------------------------- 1 | """Framework for Struct-based AST nodes.""" 2 | 3 | 4 | __all__ = [ 5 | 'AST', 6 | 'dump', 7 | 'nodes_from_asdl', 8 | ] 9 | 10 | 11 | from collections import OrderedDict 12 | from simplestruct import Struct, Field, TypedField, MetaStruct 13 | 14 | from . import asdl 15 | 16 | 17 | class MetaAST(MetaStruct): 18 | 19 | """MetaStruct subclass for defining Struct AST nodes. 20 | 21 | Struct fields are auto-generated from the _fields tuple. 22 | """ 23 | 24 | def __new__(mcls, clsname, bases, namespace, **kargs): 25 | # Make sure the namespace is an ordered mapping 26 | # for passing the fields to MetaStruct. 27 | namespace = OrderedDict(namespace) 28 | 29 | # Create _fields if not present, ensure sequence is a tuple. 30 | fields = tuple(namespace.get('_fields', ())) 31 | namespace['_fields'] = fields 32 | 33 | # For each field, if an explicit definition is not provided, 34 | # add one, and if it is provided, put it in the right order. 35 | for fname in fields: 36 | if fname not in namespace: 37 | namespace[fname] = Field() 38 | else: 39 | namespace.move_to_end(fname) 40 | 41 | return super().__new__(mcls, clsname, bases, namespace, **kargs) 42 | 43 | class AST(Struct, metaclass=MetaAST): 44 | 45 | """Root of any Struct AST node class hierarchy.""" 46 | 47 | _meta = False 48 | """If True, this node is metasyntactic (e.g. for pattern matching) 49 | and is therefore not restricted by type constraints. 50 | """ 51 | 52 | class TypedASTField(TypedField): 53 | 54 | """Type-checked field for AST nodes. quant is one of the ASDL 55 | quantifiers: 56 | 57 | '': no type modification 58 | '*': same as passing seq=True to TypedField 59 | '?': same as passing or_none=True to TypedField 60 | 61 | If the field value is an AST node with _meta set to True, 62 | waive type checking. 63 | """ 64 | 65 | def __init__(self, kind, quant): 66 | assert quant in ['', '*', '?'] 67 | seq = quant == '*' 68 | or_none = quant == '?' 69 | super().__init__(kind, seq=seq, or_none=or_none) 70 | self.quant = quant 71 | 72 | def copy(self): 73 | return type(self)(self.kind, self.quant) 74 | 75 | def checktype(self, value, kind, **kargs): 76 | if isinstance(value, AST) and value._meta: 77 | return 78 | super().checktype(value, kind, **kargs) 79 | 80 | def checktype_seq(self, value, kind, **kargs): 81 | if isinstance(value, AST) and value._meta: 82 | return 83 | # If we get passed a singular AST by mistake, 84 | # don't allow the AST to be coerced to a sequence 85 | # via simplestruct.Struct.__iter__() (which gives 86 | # an iterator over node fields). Instead make 87 | # this an explicit error. 88 | if isinstance(value, AST): 89 | exp = self.str_kind(kind) 90 | got = self.str_valtype(value) 91 | raise TypeError('Expected sequence of {}; got {} node ' 92 | 'instead'.format(exp, got)) 93 | super().checktype_seq(value, kind, **kargs) 94 | 95 | def normalize(self, inst, value): 96 | # Without this check, we'd end up replacing a metasyntactic 97 | # node with the sequence of its fields. 98 | if isinstance(value, AST) and value._meta: 99 | return value 100 | return super().normalize(inst, value) 101 | 102 | 103 | def dump(tree, indent=0): 104 | """A multi-line Struct-AST pretty-printer. Note that this is for 105 | getting the exact tree structure, not a source-like representation. 106 | 107 | If all non-node field values in the tree can be constructed from 108 | their reprs, then the returned string can be executed to reproduce 109 | the tree. 110 | """ 111 | if isinstance(tree, AST): 112 | functor = tree.__class__.__name__ + '(' 113 | new_indent = indent + len(functor) 114 | delim = ',\n' + (' ' * new_indent) 115 | return (functor + 116 | delim.join(key + ' = ' + dump(item, len(key) + 3 + new_indent) 117 | for key, item in tree._asdict().items()) + 118 | ')') 119 | elif isinstance(tree, tuple): 120 | new_indent = indent + 1 121 | delim = ',\n' + (' ' * new_indent) 122 | end = ',)' if len(tree) == 1 else ')' 123 | return ('(' + 124 | delim.join(dump(item, new_indent) for item in tree) + 125 | end) 126 | else: 127 | return repr(tree) 128 | 129 | 130 | class ASDLImporter: 131 | 132 | """Given an ASDL structure, return an OrderedDict from each name 133 | of an AST node to a tuple of information describing it. The tuple 134 | consists of: 135 | 136 | 1) a list of field specifications, which are triples of a 137 | field name, a type name (either an ASDL primitive or 138 | another node type), and a quantifier ('', '?', or '*') 139 | 140 | 2) the name of a base node type it inherits from 141 | 142 | The dictionary order is such that each node type can be defined 143 | in terms of previous node types. Specifically, it has all left- 144 | hand sides of production rules first (i.e. Sums and Products), 145 | then all right-hand sides (Constructors), in top-to-bottom 146 | order. 147 | """ 148 | 149 | # In the style of asdl.VisitorBase. 150 | 151 | def run(self, mod): 152 | self.left_info = OrderedDict() 153 | self.right_info = OrderedDict() 154 | 155 | self.visit(mod) 156 | 157 | self.left_info.update(self.right_info) 158 | return self.left_info 159 | 160 | def visit(self, obj, *args): 161 | methname = 'visit' + obj.__class__.__name__ 162 | meth = getattr(self, methname) 163 | return meth(obj, *args) 164 | 165 | def visitModule(self, mod): 166 | for dfn in mod.dfns: 167 | self.visit(dfn) 168 | 169 | def visitType(self, type): 170 | self.visit(type.value, str(type.name)) 171 | 172 | def visitSum(self, sum, name): 173 | for t in sum.types: 174 | self.visit(t, name) 175 | self.left_info[name] = ([], 'AST') 176 | 177 | def visitConstructor(self, cons, name): 178 | fields = [] 179 | for f in cons.fields: 180 | fields.append(self.visit(f, cons.name)) 181 | self.right_info[cons.name] = (fields, name) 182 | 183 | def visitField(self, field, name): 184 | assert not (field.seq and field.opt) 185 | quant = '*' if field.seq else '?' if field.opt else '' 186 | return (field.name, field.type, quant) 187 | 188 | def visitProduct(self, prod, name): 189 | fields = [] 190 | for f in prod.fields: 191 | fields.append(self.visit(f, name)) 192 | self.left_info[name] = (fields, 'AST') 193 | 194 | def nodes_from_asdl(asdl_tree, *, module=None, typed=False, 195 | primitive_types=asdl.primitive_types): 196 | """Given an ASDL structure, return a mapping from node type 197 | names to node types. 198 | 199 | If module is given, it should be the name of a module whose 200 | global namespace will contain the returned node types. 201 | (This allows instances of the node classes to be pickled.) 202 | 203 | If typed is True, the node classes' fields will be type-checked. 204 | primitive_types can be used to override the mapping from names of 205 | primitives appearing in the ASDL to their corresponding types. 206 | """ 207 | # When not using types, we leave it to MetaAST to generate 208 | # the field descriptors from the _fields attribute. 209 | # When using types, we explicitly set each field to a 210 | # TypedField. But since the ASDL productions may be circular, 211 | # the actual type is patched in after creating all nodes. 212 | 213 | lang = {'AST': AST} 214 | info = ASDLImporter().run(asdl_tree) 215 | for name, (fields, base) in info.items(): 216 | fieldnames = tuple(fn for fn, _ft, _fq in fields) 217 | namespace = {'__module__': module, 218 | '_fields': fieldnames} 219 | if typed: 220 | for fn, _ft, fq in fields: 221 | namespace[fn] = TypedASTField(None, fq) 222 | new_node = type(name, (lang[base],), namespace) 223 | lang[name] = new_node 224 | if typed: 225 | for name, (fields, _base) in info.items(): 226 | for fn, ft, fq in fields: 227 | typ = lang[ft] if ft in lang else primitive_types[ft] 228 | desc = getattr(lang[name], fn) 229 | desc.kind = typ 230 | return lang 231 | -------------------------------------------------------------------------------- /iast/pattern.py: -------------------------------------------------------------------------------- 1 | """Pattern-matching for Struct ASTs.""" 2 | 3 | 4 | __all__ = [ 5 | 'MatchFailure', 6 | 'pattern', 7 | 'PatVar', 8 | 'Wildcard', 9 | 'raw_match', 10 | 'match', 11 | 'PatternTransformer', 12 | ] 13 | 14 | 15 | from .node import AST 16 | from .visitor import NodeVisitor, NodeTransformer 17 | 18 | 19 | class MatchFailure(Exception): 20 | """Raised on unification failure, to exit the recursion.""" 21 | 22 | 23 | class pattern(AST): 24 | """Pattern term.""" 25 | _meta = True 26 | 27 | class PatVar(pattern): 28 | """Pattern variable.""" 29 | _fields = ('id',) 30 | 31 | class Wildcard(pattern): 32 | """Wildcard pattern variable.""" 33 | _fields = () 34 | 35 | 36 | class VarExpander(NodeTransformer): 37 | 38 | """Expand pattern variables.""" 39 | 40 | def __init__(self, mapping): 41 | super().__init__() 42 | self.mapping = mapping 43 | 44 | def visit_PatVar(self, node): 45 | return self.mapping.get(node.id, node) 46 | 47 | class OccChecker(NodeVisitor): 48 | 49 | """Run an occurs-check for a variable.""" 50 | 51 | class Found(Exception): 52 | pass 53 | 54 | def __init__(self, var): 55 | super().__init__() 56 | self.var = var 57 | 58 | def process(self, tree): 59 | try: 60 | super().process(tree) 61 | except self.Found: 62 | return True 63 | else: 64 | return False 65 | 66 | def visit_PatVar(self, node): 67 | if node.id == self.var: 68 | raise self.Found 69 | 70 | 71 | def match_step(lhs, rhs): 72 | """Attempt to match lhs against rhs at the top-level. Return a 73 | list of equations that must hold for the matching to succeed, 74 | or raise MatchFailure if matching is not possible. Variable 75 | bindings are also returned, as a mapping. 76 | """ 77 | # Ignore wildcards. 78 | if isinstance(lhs, Wildcard) or isinstance(rhs, Wildcard): 79 | return [], {} 80 | 81 | # In practice, bindings is always either empty or contains 82 | # just one mapping. 83 | bindings = {} 84 | 85 | # Flip for symmetric case. 86 | if not isinstance(lhs, PatVar) and isinstance(rhs, PatVar): 87 | lhs, rhs = rhs, lhs 88 | 89 | # matching 90 | if isinstance(lhs, PatVar): 91 | eqs = [] 92 | if not (isinstance(rhs, PatVar) and rhs.id == lhs.id): 93 | if OccChecker.run(rhs, lhs.id): 94 | raise MatchFailure('Circular match on ' + lhs.id) 95 | bindings[lhs.id] = rhs 96 | 97 | # matching 98 | elif isinstance(lhs, AST): 99 | if not isinstance(rhs, AST): 100 | raise MatchFailure( 101 | 'Node {} does not match non-node {}'.format( 102 | lhs.__class__.__name__, repr(rhs))) 103 | elif not type(lhs) == type(rhs): 104 | raise MatchFailure('Node {} does not match node {}'.format( 105 | lhs.__class__.__name__, 106 | rhs.__class__.__name__)) 107 | else: 108 | eqs = [(getattr(lhs, field), getattr(rhs, field)) 109 | for field in lhs._fields] 110 | 111 | # matching 112 | elif isinstance(lhs, tuple): 113 | if not isinstance(rhs, tuple): 114 | raise MatchFailure( 115 | 'Sequence {} does not match non-sequence {}'.format( 116 | repr(lhs), repr(rhs))) 117 | elif len(lhs) != len(rhs): 118 | raise MatchFailure( 119 | 'Sequence {} and sequence {} have ' 120 | 'different lengths'.format( 121 | repr(lhs), repr(rhs))) 122 | else: 123 | eqs = list(zip(lhs, rhs)) 124 | 125 | # matching 126 | else: 127 | if lhs != rhs: 128 | raise MatchFailure( 129 | 'Constant {} does not match {}'.format( 130 | repr(lhs), repr(rhs))) 131 | eqs = [] 132 | 133 | return eqs, bindings 134 | 135 | 136 | def raw_match(tree1, tree2): 137 | """Given two trees to match, run the unification algorithm. Return 138 | a mapping from each variable to a tree, where the variable does not 139 | appear anywhere else in the mapping. Raise MatchFailure on failure. 140 | """ 141 | eqs = [(tree1, tree2)] 142 | result = {} 143 | 144 | def bindvar(var, repl): 145 | """Add a binding var -> repl. Replace var with repl in the 146 | equations list and in the other result mappings. 147 | """ 148 | result[var] = repl 149 | trans = VarExpander({var: repl}) 150 | for k in result: 151 | result[k] = trans.process(result[k]) 152 | for i, (lhs, rhs) in enumerate(eqs): 153 | eqs[i] = (trans.process(lhs), trans.process(rhs)) 154 | 155 | while len(eqs) > 0: 156 | lhs, rhs = eqs.pop() 157 | new_eqs, new_bindings = match_step(lhs, rhs) 158 | eqs.extend(new_eqs) 159 | for var, repl in new_bindings.items(): 160 | bindvar(var, repl) 161 | 162 | return result 163 | 164 | def match(tree1, tree2): 165 | """Same as raw_match(), but return None instead of raising 166 | MatchFailure. 167 | """ 168 | try: 169 | return raw_match(tree1, tree2) 170 | except MatchFailure: 171 | return None 172 | 173 | 174 | class PatternTransformer(NodeTransformer): 175 | 176 | """Apply pattern substitution rules in a bottom-up (post-traversal) 177 | manner. 178 | 179 | A rule consists of a pattern tree and a replacement function. 180 | When a rule is applied to an input tree, if the pattern matches 181 | the input, then the tree gets replaced by the result of calling 182 | the function. The function is passed keyword arguments for each 183 | of the pattern's PatVars, bound to the corresponding matching 184 | subtree of the input. 185 | 186 | The replacement function may return NotImplemented to defer to 187 | subsequent rules. None may be returned to indicate "no change" 188 | (but see NodeTransformer for information on the _nochange_none 189 | flag). 190 | 191 | As a convenience, a rule may give an AST instead of a replacement 192 | function. The AST serves as a template where PatVars get expanded 193 | according to the match. 194 | """ 195 | 196 | def normalize_repl_func(self, repl): 197 | """Normalize a value that is either a replacement function 198 | or an AST to just a replacement function. 199 | """ 200 | if isinstance(repl, AST): 201 | return lambda **mapping: VarExpander.run(repl, mapping) 202 | else: 203 | return repl 204 | 205 | rules = [] 206 | """List of rules to apply, in order of precedence. Each rule 207 | is a pair of a pattern tree and a replacement function (or 208 | AST). 209 | """ 210 | 211 | def visit(self, tree): 212 | # Process subtree first. 213 | subtree_result = super().visit(tree) 214 | 215 | for pattern, repl in self.rules: 216 | mapping = match(pattern, subtree_result) 217 | if mapping is not None: 218 | # If the match succeeded, consult the repl. 219 | repl_result = repl(**mapping) 220 | if repl_result is NotImplemented: 221 | # Defer to next rule. 222 | continue 223 | if (self._nochange_none and 224 | isinstance(tree, AST) and repl_result is None): 225 | # Normalize None. 226 | repl_result = subtree_result 227 | return repl_result 228 | else: 229 | # No matching rule found. 230 | return subtree_result 231 | -------------------------------------------------------------------------------- /iast/python/__init__.py: -------------------------------------------------------------------------------- 1 | """Subpackage for Python-specific AST definitions and utilities. 2 | 3 | All the code concerning Python 3.3's AST and Python 3.4's AST is 4 | exported from the modules python33.py and python34.py respectively. 5 | There is also default.py, which serves as an alias for whichever 6 | version corresponds to the currently executing Python interpreter. 7 | """ 8 | -------------------------------------------------------------------------------- /iast/python/default.py: -------------------------------------------------------------------------------- 1 | """Alias for python33.py or python34.py.""" 2 | 3 | 4 | __all__ = [ 5 | # ... 6 | ] 7 | 8 | 9 | import sys 10 | 11 | 12 | ver = sys.version_info 13 | if ver[:2] == (3, 3): 14 | from . import python33 as python 15 | elif ver[:2] == (3, 4): 16 | from . import python34 as python 17 | else: 18 | raise AssertionError('Unsupported Python version') 19 | 20 | __all__.extend(python.__all__) 21 | globals().update({k: python.__dict__[k] for k in python.__all__}) 22 | -------------------------------------------------------------------------------- /iast/python/native.py: -------------------------------------------------------------------------------- 1 | """Interoperability with native nodes. 2 | 3 | "Native" nodes is our term for the node classes defined in the 4 | "ast" standard library module. We support conversion between 5 | Struct nodes and native nodes, and parsing of source code into 6 | Struct nodes -- but only for the version of the grammar matching 7 | the currently executing Python interpreter. 8 | """ 9 | 10 | 11 | __all__ = [ 12 | 'native_nodes', 13 | 'pyToStruct', 14 | 'structToPy', 15 | 'parse', 16 | ] 17 | 18 | 19 | import ast 20 | import sys 21 | 22 | from ..util import trim 23 | from ..node import AST 24 | from .pynode import py33_nodes, py34_nodes 25 | 26 | 27 | # Dictionary of all node classes in the ast library. 28 | native_nodes = {nodecls.__name__: nodecls 29 | for nodecls in ast.__dict__.values() 30 | if isinstance(nodecls, type) 31 | if issubclass(nodecls, ast.AST)} 32 | 33 | 34 | # Alias for nodes dictionary matching current interpreter version. 35 | ver = sys.version_info 36 | if ver[:2] == (3, 3): 37 | py_nodes = py33_nodes 38 | elif ver[:2] == (3, 4): 39 | py_nodes = py34_nodes 40 | else: 41 | raise AssertionError('Unsupported Python version') 42 | 43 | 44 | def convert_ast(tree, to_struct): 45 | """Convert from native nodes to Struct nodes if to_struct is 46 | True; otherwise convert in the opposite direction. 47 | """ 48 | base = ast.AST if to_struct else AST 49 | mapping = py_nodes if to_struct else native_nodes 50 | seqtype = tuple if to_struct else list 51 | 52 | if isinstance(tree, base): 53 | name = tree.__class__.__name__ 54 | out_type = mapping[name] 55 | field_values = [] 56 | for field in tree._fields: 57 | fval = getattr(tree, field, None) 58 | fval = convert_ast(fval, to_struct) 59 | field_values.append(fval) 60 | new_tree = out_type(*field_values) 61 | return new_tree 62 | elif isinstance(tree, (list, tuple)): 63 | return seqtype(convert_ast(item, to_struct) for item in tree) 64 | else: 65 | return tree 66 | 67 | def pyToStruct(tree): 68 | """Convert from a native AST to a Struct AST.""" 69 | assert isinstance(tree, ast.AST) 70 | return convert_ast(tree, to_struct=True) 71 | 72 | def structToPy(tree): 73 | """Convert from a Struct AST to a native AST.""" 74 | assert isinstance(tree, AST) 75 | return convert_ast(tree, to_struct=False) 76 | 77 | 78 | def parse(source): 79 | """Like ast.parse(), but produce a Struct AST. Works with indented 80 | triple-quoted literals (via util.trim()).""" 81 | source = trim(source) 82 | tree = ast.parse(source) 83 | tree = pyToStruct(tree) 84 | return tree 85 | -------------------------------------------------------------------------------- /iast/python/pynode.py: -------------------------------------------------------------------------------- 1 | """Struct versions of Python's own AST nodes.""" 2 | 3 | 4 | __all__ = [ 5 | 'py33_nodes', 6 | 'py34_nodes', 7 | ] 8 | 9 | 10 | from ..asdl import python33_asdl, python34_asdl 11 | from ..node import nodes_from_asdl 12 | 13 | 14 | # Dictionary of all Struct classes for Python 3.3 and 3.4 node types. 15 | py33_nodes = {} 16 | py34_nodes = {} 17 | 18 | def initialize_nodetypes(): 19 | """Populate the Struct nodes dictionaries.""" 20 | assert len(py33_nodes) == len(py34_nodes) == 0 21 | 22 | # If anyone asks, these are defined in python33.py and 23 | # python34.py since they are available on those module's 24 | # namespaces. 25 | home33 = __name__[:__name__.rfind('.')] + '.python33' 26 | home34 = __name__[:__name__.rfind('.')] + '.python34' 27 | 28 | py33_nodes.update(nodes_from_asdl( 29 | python33_asdl, module=home33, 30 | typed=True)) 31 | py34_nodes.update(nodes_from_asdl( 32 | python34_asdl, module=home34, 33 | typed=True)) 34 | 35 | initialize_nodetypes() 36 | -------------------------------------------------------------------------------- /iast/python/python33.py: -------------------------------------------------------------------------------- 1 | """Export Python 3.3 nodes and utilities.""" 2 | 3 | 4 | __all__ = [ 5 | # ... 6 | ] 7 | 8 | 9 | import sys 10 | 11 | 12 | def include_dict(mapping): 13 | __all__.extend(mapping.keys()) 14 | globals().update(mapping) 15 | 16 | def include_mod(mod): 17 | # Use get_all() if defined, otherwise use module's __dict__. 18 | get_all = getattr(mod, 'get_all', None) 19 | if get_all is not None: 20 | thismod = sys.modules[__name__] 21 | entries = get_all(thismod) 22 | include_dict(entries) 23 | else: 24 | include_dict({k: mod.__dict__[k] for k in mod.__all__}) 25 | 26 | 27 | # Include node classes. 28 | from .pynode import py33_nodes as py_nodes 29 | __all__.append('py_nodes') 30 | include_dict(py_nodes) 31 | 32 | # Include native features if version matches. 33 | if sys.version_info[:2] == (3, 3): 34 | from . import native 35 | include_mod(native) 36 | 37 | # Include utils. 38 | from . import pyutil 39 | include_mod(pyutil) 40 | -------------------------------------------------------------------------------- /iast/python/python34.py: -------------------------------------------------------------------------------- 1 | """Export Python 3.4 nodes and utilities.""" 2 | 3 | 4 | __all__ = [ 5 | # ... 6 | ] 7 | 8 | 9 | import sys 10 | 11 | 12 | def include_dict(mapping): 13 | __all__.extend(mapping.keys()) 14 | globals().update(mapping) 15 | 16 | def include_mod(mod): 17 | # Use get_all() if defined, otherwise use module's __dict__. 18 | get_all = getattr(mod, 'get_all', None) 19 | if get_all is not None: 20 | thismod = sys.modules[__name__] 21 | entries = get_all(thismod) 22 | include_dict(entries) 23 | else: 24 | include_dict({k: mod.__dict__[k] for k in mod.__all__}) 25 | 26 | 27 | # Include node classes. 28 | from .pynode import py34_nodes as py_nodes 29 | __all__.append('py_nodes') 30 | include_dict(py_nodes) 31 | 32 | # Include native features if version matches. 33 | if sys.version_info[:2] == (3, 4): 34 | from . import native 35 | include_mod(native) 36 | 37 | # Include utils. 38 | from . import pyutil 39 | include_mod(pyutil) 40 | -------------------------------------------------------------------------------- /iast/python/pyutil.py: -------------------------------------------------------------------------------- 1 | """Simple Python-specific AST utilities.""" 2 | 3 | 4 | # Names are exported using get_all() instead of __all__. 5 | # This allows us to instantiate code with py33 or py34 ast 6 | # types as needed. 7 | __all__ = [ 8 | ] 9 | 10 | 11 | import sys 12 | from functools import partial, reduce, wraps 13 | import operator 14 | from inspect import signature, Parameter 15 | from simplestruct.type import checktype, checktype_seq 16 | 17 | from ..util import pairwise 18 | from ..visitor import NodeVisitor, NodeTransformer 19 | from ..pattern import PatVar, Wildcard, PatternTransformer 20 | 21 | 22 | def make_pattern(tree): 23 | """Make a pattern from an AST by replacing Name nodes with PatVars 24 | and Wildcards. Names beginning with an underscore are considered 25 | pattern vars. Names of '_' are considered wildcards. 26 | """ 27 | class NameToPatVar(NodeTransformer): 28 | def visit_Name(self, node): 29 | if node.id == '_': 30 | return Wildcard() 31 | elif node.id.startswith('_'): 32 | return PatVar(node.id) 33 | 34 | return NameToPatVar.run(tree) 35 | 36 | 37 | class ContextSetter(NodeTransformer): 38 | 39 | """Propagate context type ctx to the appropriate nodes of an 40 | expression tree. Mirrors the behavior of set_context() in 41 | the Python source tree file Python/ast.c. Specifically, nodes 42 | that have a context field get assigned a context of ctx, and 43 | Starred, List, and Tuple nodes also propagate ctx recursively. 44 | """ 45 | 46 | def __init__(self, ctx): 47 | # Type, not instance. 48 | self.ctx = ctx 49 | 50 | def basic(self, node): 51 | return node._replace(ctx=self.ctx()) 52 | 53 | def recur(self, node): 54 | node = self.generic_visit(node) 55 | node = node._replace(ctx=self.ctx()) 56 | return node 57 | 58 | visit_Attribute = basic 59 | visit_Subscript = basic 60 | visit_Name = basic 61 | 62 | visit_Starred = recur 63 | visit_List = recur 64 | visit_Tuple = recur 65 | 66 | 67 | def extract_tree(L, tree, mode=None): 68 | """Given a tree rooted at a Module node, return a subtree as 69 | selected by mode, which is one of the following strings. 70 | 71 | mod: 72 | Return the original tree, unchanged. (default) 73 | 74 | code: 75 | Get the list of top-level statements. 76 | 77 | stmt_or_blank: 78 | The one top-level statement, or None if there are 79 | no statements. 80 | 81 | stmt: 82 | The one top-level statement. 83 | 84 | expr: 85 | The one top-level expression. 86 | 87 | lval: 88 | The one top-level expression, in Store context. 89 | """ 90 | checktype(tree, L.Module) 91 | 92 | if mode == 'mod' or mode is None: 93 | pass 94 | 95 | elif mode == 'code': 96 | tree = tree.body 97 | 98 | elif mode == 'stmt_or_blank': 99 | if len(tree.body) == 0: 100 | return None 101 | elif len(tree.body) == 1: 102 | tree = tree.body[0] 103 | else: 104 | raise ValueError('Mode "{}" requires zero or one statements ' 105 | '(got {})'.format(mode, len(tree.body))) 106 | 107 | elif mode in ['stmt', 'expr', 'lval']: 108 | if len(tree.body) != 1: 109 | raise ValueError('Mode "{}" requires exactly one statement ' 110 | '(got {})'.format(mode, len(tree.body))) 111 | tree = tree.body[0] 112 | if mode in ['expr', 'lval']: 113 | if not isinstance(tree, L.Expr): 114 | raise ValueError('Mode "{}" requires Expr node (got {})' 115 | .format(mode, type(tree).__name__)) 116 | tree = tree.value 117 | 118 | if mode == 'lval': 119 | tree = ContextSetter.run(tree, L.Store) 120 | 121 | elif mode is not None: 122 | raise ValueError('Unknown parse mode "' + mode + '"') 123 | 124 | return tree 125 | 126 | 127 | class LiteralEvaluator(NodeVisitor): 128 | 129 | """Analogous to ast.literal_eval(), with similar restrictions 130 | on the allowed types of nodes. 131 | """ 132 | 133 | operator_map = { 134 | 'And': lambda a, b: a and b, 135 | 'Or': lambda a, b: a or b, 136 | 137 | 'Add': operator.add, 138 | 'Sub': operator.sub, 139 | 'Mult': operator.mul, 140 | 'Div': operator.truediv, 141 | 'Mod': operator.mod, 142 | 'Pow': operator.pow, 143 | 'LShift': operator.lshift, 144 | 'RShift': operator.rshift, 145 | 'BitOr': operator.or_, 146 | 'BitXor': operator.xor, 147 | 'BitAnd': operator.and_, 148 | 'FloorDiv': operator.floordiv, 149 | 150 | 'Invert': operator.invert, 151 | 'Not': operator.not_, 152 | 'UAdd': operator.pos, 153 | 'USub': operator.neg, 154 | 155 | 'Eq': operator.eq, 156 | 'NotEq': operator.ne, 157 | 'Lt': operator.lt, 158 | 'LtE': operator.le, 159 | 'Gt': operator.gt, 160 | 'GtE': operator.ge, 161 | 'Is': operator.is_, 162 | 'IsNot': operator.is_not, 163 | 'In': operator.contains, 164 | 'NotIn': lambda a, b: a not in b, 165 | } 166 | 167 | def seq_visit(self, seq): 168 | return seq 169 | 170 | def generic_visit(self, node): 171 | raise ValueError('Unsupported node ' + node.__class__.__name__) 172 | 173 | def visit_Num(self, node): 174 | return node.n 175 | 176 | def visit_Str(self, node): 177 | return node.s 178 | 179 | def visit_Bytes(self, node): 180 | return node.s 181 | 182 | def visit_Ellipsis(self, node): 183 | return Ellipsis 184 | 185 | def visit_Name(self, node): 186 | # This is used for the py33 grammar, which lacks NameConstant. 187 | map = {'True': True, 'False': False, 'None': None} 188 | if node.id not in map: 189 | raise ValueError("Unsupported Name node '{}'".format(node.id)) 190 | return map[node.id] 191 | 192 | def visit_NameConstant(self, node): 193 | return node.value 194 | 195 | def visit_Tuple(self, node): 196 | return tuple(self.visit(elt) for elt in node.elts) 197 | 198 | def visit_List(self, node): 199 | return list(self.visit(elt) for elt in node.elts) 200 | 201 | def visit_Set(self, node): 202 | return set(self.visit(elt) for elt in node.elts) 203 | 204 | def visit_Dict(self, node): 205 | return {self.visit(key): self.visit(value) 206 | for key, value in zip(node.keys, node.values)} 207 | 208 | def visit_BoolOp(self, node): 209 | func = self.operator_map[node.op.__class__.__name__] 210 | return reduce(func, (self.visit(value) for value in node.values)) 211 | 212 | def visit_BinOp(self, node): 213 | func = self.operator_map[node.op.__class__.__name__] 214 | return func(self.visit(node.left), self.visit(node.right)) 215 | 216 | def visit_UnaryOp(self, node): 217 | func = self.operator_map[node.op.__class__.__name__] 218 | return func(self.visit(node.operand)) 219 | 220 | def visit_Compare(self, node): 221 | values = ((self.visit(node.left),) + 222 | tuple(self.visit(c) for c in node.comparators)) 223 | cmps = pairwise(values) 224 | return all(self.operator_map[op.__class__.__name__](a, b) 225 | for ((a, b), op) in zip(cmps, node.ops)) 226 | 227 | 228 | class Templater(NodeTransformer): 229 | 230 | """Instantiate placeholders in the AST according to the given 231 | mapping. The following kinds of mappings are recognized. In 232 | all cases, the keys are strings, and None values indicate 233 | "no change". 234 | 235 | IDENT -> AST 236 | Replace Name occurrences for identifier IDENT with an 237 | arbitrary non-None expression AST. 238 | 239 | IDENT1 -> IDENT2 240 | In Name occurrences, replace IDENT1 with IDENT2 while 241 | leaving context unchanged. 242 | 243 | @ATTR1 -> ATTR2 244 | Replace uses of attribute ATTR1 with ATTR2. 245 | 246 | IDENT1 -> IDENT2 247 | In function definitions, replace the name of the defined 248 | function IDENT1 with IDENT2. 249 | 250 | IDENT -> AST 251 | Replace Name occurrences of IDENT with an arbitrary 252 | code AST (i.e. tuple of statements). 253 | 254 | If the repeat flag is given, then the names and ASTs introduced 255 | by applying the mapping will be transformed repeatedly until 256 | no rules apply (or all applicable rules map to None). This means 257 | that a cyclic set of rules can cause an infinite loop. (This holds 258 | even if the rules apply but produce an equivalent tree.) 259 | 260 | If repeat is True, then bailout is the number of substitutions 261 | to allow before failing with an exception. Set bailout to None 262 | to disable this protection. 263 | """ 264 | 265 | L = None 266 | """Stub for module reference.""" 267 | 268 | def __init__(self, subst, *, repeat=False, 269 | bailout=sys.getrecursionlimit()): 270 | super().__init__() 271 | self.subst = subst 272 | self.repeat = repeat 273 | self.bailout = bailout 274 | 275 | def fix(self, func, value): 276 | """If repeat is True, repeatedly apply func to value 277 | until a non-None result is obtained. Otherwise, apply 278 | func exactly once. In either case, return the last non- 279 | None value (or the original value if the first application 280 | was None). 281 | """ 282 | steps = 0 283 | changed = True 284 | while changed: 285 | if steps >= self.bailout: 286 | raise RuntimeError('Exceeded bailout ({}) in ' 287 | 'Templater'.format(self.bailout)) 288 | changed = False 289 | result = func(value) 290 | if result is not None: 291 | if self.repeat: 292 | changed = True 293 | value = result 294 | steps += 1 295 | return value 296 | 297 | def visit_Name(self, node): 298 | def f(node): 299 | # If we yield a non-Name AST, stop. 300 | if not isinstance(node, self.L.Name): 301 | return None 302 | # Get the mapping entry for this identifier. 303 | # Result is either a string, None, or an expression AST. 304 | result = self.subst.get(node.id, None) 305 | # Normalize string to Name node. 306 | if isinstance(result, str): 307 | result = node._replace(id=result) 308 | return result 309 | 310 | return self.fix(f, node) 311 | 312 | def visit_Attribute(self, node): 313 | # Recurse first. If we repeatedly change the attribute 314 | # name in the fixpoint loop, the node's subexpressions 315 | # won't be affected. 316 | node = self.generic_visit(node) 317 | 318 | def f(node): 319 | new_attr = self.subst.get('@' + node.attr, None) 320 | if new_attr is None: 321 | return None 322 | else: 323 | return node._replace(attr=new_attr) 324 | 325 | return self.fix(f, node) 326 | 327 | def visit_FunctionDef(self, node): 328 | # Recurse first, as above. 329 | node = self.generic_visit(node) 330 | 331 | def f(node): 332 | new_name = self.subst.get('' + node.name, None) 333 | if new_name is None: 334 | return None 335 | else: 336 | return node._replace(name=new_name) 337 | 338 | return self.fix(f, node) 339 | 340 | def visit_Expr(self, node): 341 | # Don't recurse first. We want a 'Foo' rule to take 342 | # precedence over a 'Foo' rule. 343 | # 344 | # Don't use self.fix. If we repeat, we want to recursively 345 | # apply arbitrary rules to the new substitution result. 346 | if isinstance(node.value, self.L.Name): 347 | new_code = self.subst.get('' + node.value.id, None) 348 | if new_code is not None: 349 | if self.repeat: 350 | # Note that we visit(), not generic_visit(), 351 | # so rules can apply freely at the top level 352 | # of new_code. 353 | new_code = self.visit(new_code) 354 | return new_code 355 | 356 | # If the rule didn't trigger or it said no change, 357 | # process subtree as normal. 358 | node = self.generic_visit(node) 359 | return node 360 | 361 | 362 | def astargs(L, func): 363 | """Decorator to automatically unwrap AST arguments.""" 364 | sig = signature(func) 365 | @wraps(func) 366 | def f(*args, **kargs): 367 | ba = sig.bind(*args, **kargs) 368 | for name, val in ba.arguments.items(): 369 | ann = sig.parameters[name].annotation 370 | 371 | if ann is Parameter.empty: 372 | pass 373 | 374 | elif ann == 'Str': 375 | checktype(val, L.Str) 376 | ba.arguments[name] = val.s 377 | 378 | elif ann == 'Num': 379 | checktype(val, L.Num) 380 | ba.arguments[name] = val.n 381 | 382 | elif ann == 'Name': 383 | checktype(val, L.Name) 384 | ba.arguments[name] = val.id 385 | 386 | elif ann == 'List': 387 | checktype(val, L.List) 388 | ba.arguments[name] = val.elts 389 | 390 | elif ann == 'ids': 391 | if not isinstance(val, (L.List, L.Tuple)): 392 | raise TypeError('Expected List or Tuple node') 393 | checktype_seq(val.elts, L.Name) 394 | ba.arguments[name] = tuple(v.id for v in val.elts) 395 | 396 | else: 397 | raise TypeError('Unknown astarg specifier "{}"'.format(ann)) 398 | 399 | return func(*ba.args, **ba.kwargs) 400 | 401 | return f 402 | 403 | 404 | class MacroProcessor(PatternTransformer): 405 | 406 | """Framework for substituting uses of specific functions and 407 | methods with arbitrary ASTs. Each substitution is handled 408 | innermost-first. If repeat is given, the substituted AST 409 | is also transformed. 410 | 411 | The following kinds of uses are supported: 412 | 413 | - function expressions (fe) "print(foo())" 414 | - function statements (fs) "foo()" 415 | - method expressions (me) "print(obj.foo())" 416 | - method statements (ms) "obj.foo()" 417 | - function with (fw) "with foo(): body" 418 | - method with (mw) "with obj.foo(): body" 419 | 420 | The expression patterns match calls that occur anywhere, while the 421 | statement patterns only match calls that appear at statement level, 422 | i.e. immediately inside an Expr node. 423 | 424 | Function and method patterns invoke the handler whose name 425 | corresponds to the syntactic name appearing in the call AST. 426 | For methods, this is just the attribute identifier on the method 427 | call. For functions, the pattern only matches when the function 428 | is given by a Name node, not an arbitrary expression. If the same 429 | name is used for multiple kinds of handlers, expression handlers 430 | take precedence over statement and with handlers, since the rules 431 | are applied to innermost matches first. 432 | 433 | All forms can accept keyword arguments, but not variadic arguments 434 | (*args and **kargs). 435 | 436 | Handlers are defined as methods in MacroProcessor subclasses, 437 | similar to the visit_* methods in NodeVisitor subclasses. The 438 | method names have form 439 | 440 | handle_KIND_NAME 441 | 442 | where KIND is the abbreviation for one of the six pattern types 443 | ("fe", etc.) and NAME is the syntactic function or method name to 444 | match. 445 | 446 | The handlers take in as the first argument (not counting "self") 447 | the function or method name that the pattern matched. (This allows 448 | the same handler to be reused under multiple names.) The remaining 449 | arguments are the ASTs of the arguments in the Call node. For 450 | methods, the first of these remaining arguments is the AST of the 451 | receiver of the method call (i.e. the expression to the left of the 452 | dot). If the Call node has keyword arguments, these are passed as 453 | keywords from the key to the AST of the argument value. The "with" 454 | pattern handlers take an additional '_body' keyword argument, bound 455 | to the AST (tuple of statements) of the with body. 456 | 457 | The handlers return an AST to replace the matched tree with. 458 | A return value of None indicates no change. 459 | 460 | For example, if the input AST contains an expression 461 | 462 | print(obj.foo(x + y, z=1)) 463 | 464 | then we look for the handler 'handle_me_foo()'. If it exists, it is 465 | called with positional arguments "foo", the AST for "obj", and the 466 | AST for "x + y"; and with keyword argument z = the AST for "1". 467 | If it returns the AST Num(5), our new tree is 468 | 469 | print(5) 470 | 471 | Note that a failure to match the arguments in a Call with the 472 | arguments of the handler will result in an exception, the same 473 | as when a Python function is called with the wrong signature. 474 | """ 475 | 476 | L = None 477 | """Stub for module reference.""" 478 | 479 | @property 480 | def func_expr_pattern(self): 481 | L = self.L 482 | return L.Call(L.Name(PatVar('_func'), L.Load()), 483 | PatVar('_args'), PatVar('_keywords'), 484 | PatVar('_starargs'), PatVar('_kwargs')) 485 | 486 | @property 487 | def meth_expr_pattern(self): 488 | L = self.L 489 | return L.Call(L.Attribute(PatVar('_recv'), PatVar('_func'), 490 | L.Load()), 491 | PatVar('_args'), PatVar('_keywords'), 492 | PatVar('_starargs'), PatVar('_kwargs')) 493 | 494 | @property 495 | def func_stmt_pattern(self): 496 | L = self.L 497 | return L.Expr(self.func_expr_pattern) 498 | 499 | @property 500 | def meth_stmt_pattern(self): 501 | L = self.L 502 | return L.Expr(self.meth_expr_pattern) 503 | 504 | @property 505 | def func_with_pattern(self): 506 | L = self.L 507 | return L.With((L.withitem(self.func_expr_pattern, None),), 508 | PatVar('_body')) 509 | 510 | @property 511 | def meth_with_pattern(self): 512 | L = self.L 513 | return L.With((L.withitem(self.meth_expr_pattern, None),), 514 | PatVar('_body')) 515 | 516 | def dispatch(self, prefix, kind, *, _recv=None, _body=None, 517 | _func, _args, _keywords, _starargs, _kwargs): 518 | """Dispatch helper. prefix and kind are strings that vary based 519 | on the pattern form. _recv is prepended to _args if not None. 520 | If _body is not None, ('_body', _body) is added to _keywords. 521 | """ 522 | handler = getattr(self, prefix + _func, None) 523 | if handler is None: 524 | return 525 | 526 | if not (_starargs is None and _kwargs is None): 527 | raise TypeError('Star-args and double star-args are not ' 528 | 'allowed in {} macro {}'.format( 529 | kind, _func)) 530 | 531 | if _recv is not None: 532 | _args = (_recv,) + _args 533 | _args = (_func,) + _args 534 | 535 | kwargs = {kw.arg: kw.value for kw in _keywords} 536 | if _body is not None: 537 | kwargs['_body'] = _body 538 | 539 | sig = signature(handler) 540 | ba = sig.bind(*_args, **kwargs) 541 | return handler(*ba.args, **ba.kwargs) 542 | 543 | def __init__(self): 544 | super().__init__() 545 | self.rules = [ 546 | (self.func_expr_pattern, 547 | partial(self.dispatch, prefix='handle_fe_', kind='function')), 548 | (self.func_stmt_pattern, 549 | partial(self.dispatch, prefix='handle_fs_', kind='function')), 550 | (self.meth_expr_pattern, 551 | partial(self.dispatch, prefix='handle_me_', kind='method')), 552 | (self.meth_stmt_pattern, 553 | partial(self.dispatch, prefix='handle_ms_', kind='method')), 554 | (self.func_with_pattern, 555 | partial(self.dispatch, prefix='handle_fw_', kind='with')), 556 | (self.meth_with_pattern, 557 | partial(self.dispatch, prefix='handle_mw_', kind='with')), 558 | ] 559 | 560 | 561 | def get_all(module): 562 | class _Templater(Templater): 563 | L = module 564 | class _MacroProcessor(MacroProcessor): 565 | L = module 566 | 567 | return { 568 | 'make_pattern': make_pattern, 569 | 'ContextSetter': ContextSetter, 570 | 'extract_tree': partial(extract_tree, module), 571 | 'LiteralEvaluator': LiteralEvaluator, 572 | 'literal_eval': LiteralEvaluator().process, 573 | 'Templater': _Templater, 574 | 'astargs': partial(astargs, module), 575 | 'MacroProcessor': _MacroProcessor, 576 | } 577 | -------------------------------------------------------------------------------- /iast/util.py: -------------------------------------------------------------------------------- 1 | """Miscellaneous utilities.""" 2 | 3 | 4 | __all__ = [ 5 | 'trim', 6 | 'pairwise', 7 | ] 8 | 9 | 10 | from textwrap import dedent 11 | import itertools 12 | 13 | 14 | def trim(text): 15 | """Like textwrap.dedent, but also eliminate leading and trailing 16 | lines if they are whitespace or empty. 17 | 18 | This is useful for writing code as triple-quoted multi-line 19 | strings. 20 | """ 21 | lines = text.split('\n') 22 | if len(lines) > 0: 23 | if len(lines[0]) == 0 or lines[0].isspace(): 24 | lines = lines[1 : ] 25 | if len(lines) > 0: 26 | if len(lines[-1]) == 0 or lines[-1].isspace(): 27 | lines = lines[ : -1] 28 | 29 | return dedent('\n'.join(lines)) 30 | 31 | 32 | # Taken from the documentation for the itertools module. 33 | def pairwise(iterable): 34 | "s -> (s0,s1), (s1,s2), (s2, s3), ..." 35 | a, b = itertools.tee(iterable) 36 | next(b, None) 37 | return zip(a, b) 38 | -------------------------------------------------------------------------------- /iast/visitor.py: -------------------------------------------------------------------------------- 1 | """Visitors for Struct ASTs. Analogous to the visitors in the 2 | standard library 'ast' module, with some enhancements. 3 | """ 4 | 5 | 6 | __all__ = [ 7 | 'NodeVisitor', 8 | 'AdvNodeVisitor', 9 | 'NodeTransformer', 10 | 'AdvNodeTransformer', 11 | 'ChangeCounter', 12 | ] 13 | 14 | 15 | from .node import AST 16 | 17 | 18 | class NodeVisitor: 19 | 20 | """Walk a tree, dispatching to different handlers by node type. 21 | To use, create a subclass and define or override the visit 22 | methods. 23 | 24 | When visit() is called on a node or a tuple, it recursively 25 | processes the subtree using node_visit(), and various handlers. 26 | The handler for a given node type is determined by prefixing 27 | the name of that type with 'visit_', e.g. 'visit_Foo' for node 28 | type 'Foo'. If the handler is not found, generic_visit() is used 29 | as the default. 30 | 31 | The handler is responsible for recursing over the subtree. 32 | It controls whether the tree traversal is preorder or postorder, 33 | or it may prune the traversal by not recursing at all. It should 34 | process each child by calling self.visit(child). Alternatively, 35 | it can call self.generic_visit(node) to get them all. Do not call 36 | self.visit(node), as that would create a call cycle. 37 | 38 | To invoke the visitor, call the process() method with the tree. 39 | Subclasses can override process to do initial setup/teardown 40 | actions or tweak the returned value. The run() classmethod is 41 | provided as a shorthand to combine instantiation and processing. 42 | 43 | You may have the handlers return a value. In this case, you 44 | should override generic_visit() and seq_visit() to propagate 45 | these returned values. 46 | 47 | Note that since Struct nodes are immutable, NodeTransformer must 48 | be used if you want a tree transformation. 49 | """ 50 | 51 | @classmethod 52 | def run(cls, tree, *args, **kargs): 53 | """Convenience method for instantiating the class and running 54 | the visitor on tree. args and kargs are passed on to the 55 | constructor. 56 | """ 57 | visitor = cls(*args, **kargs) 58 | result = visitor.process(tree) 59 | return result 60 | 61 | def process(self, tree): 62 | """Entry point for invoking the visitor.""" 63 | result = self.visit(tree) 64 | return result 65 | 66 | def visit(self, tree): 67 | """Dispatch on a node or sequence (tuple). Other kinds 68 | of values are returned without processing. 69 | """ 70 | if isinstance(tree, AST): 71 | return self.node_visit(tree) 72 | elif isinstance(tree, tuple): 73 | return self.seq_visit(tree) 74 | else: 75 | return tree 76 | 77 | def node_visit(self, node): 78 | """Dispatch to a particular node handler if it exists, 79 | or else to generic_visit(). 80 | """ 81 | method = 'visit_' + node.__class__.__name__ 82 | visitor = getattr(self, method, self.generic_visit) 83 | result = visitor(node) 84 | return result 85 | 86 | def seq_visit(self, seq): 87 | """Dispatch to each item of a sequence.""" 88 | for item in seq: 89 | self.visit(item) 90 | 91 | def generic_visit(self, node): 92 | """Dispatch to each field of a node.""" 93 | for field in node._fields: 94 | value = getattr(node, field) 95 | self.visit(value) 96 | 97 | 98 | class AdvNodeVisitor(NodeVisitor): 99 | 100 | """As above, but tracks context (parent) information and allows 101 | for passing arbitrary arguments to visit handlers. 102 | 103 | The stack of currently visited nodes is made available in the 104 | _visit_stack attribute. Its format is a list of tuples (most 105 | recent last) of form (node, field, index): 106 | 107 | - node is the AST object being visited for that entry 108 | 109 | - field is the name of the parent's field that contains 110 | this node as a child, or None if there is no parent 111 | 112 | - index is the location of this node in the currently 113 | visited sequence, or None if we are not in a sequence. 114 | 115 | Visitors and handlers may pass *args and **kargs, which get 116 | propagated by the default visitor methods unchanged. However, 117 | the special keyword arguments '_field' and '_index' are 118 | intercepted by node_visit() and used to help manage _visit_stack. 119 | Any override of seq_visit() or generic_visit() should pass these 120 | keyword arguments to visit(). 121 | """ 122 | 123 | def process(self, tree): 124 | """Entry point for invoking the visitor.""" 125 | self._visit_stack = [] 126 | result = super().process(tree) 127 | assert len(self._visit_stack) == 0, 'Visit stack unbalanced' 128 | return result 129 | 130 | def visit(self, tree, *args, **kargs): 131 | """Dispatch on a node or sequence (tuple). Other kinds 132 | of values are returned without processing. 133 | """ 134 | if isinstance(tree, AST): 135 | return self.node_visit(tree, *args, **kargs) 136 | elif isinstance(tree, tuple): 137 | return self.seq_visit(tree, *args, **kargs) 138 | else: 139 | return tree 140 | 141 | def node_visit(self, node, *args, _field=None, _index=None, **kargs): 142 | """Dispatch to a particular node handler if it exists, 143 | or else to generic_visit(). 144 | """ 145 | entry = (node, _field, _index) 146 | self._visit_stack.append(entry) 147 | 148 | method = 'visit_' + node.__class__.__name__ 149 | visitor = getattr(self, method, self.generic_visit) 150 | result = visitor(node, *args, **kargs) 151 | 152 | self._visit_stack.pop() 153 | return result 154 | 155 | def seq_visit(self, seq, *args, **kargs): 156 | """Dispatch to each item of a sequence.""" 157 | for i, item in enumerate(seq): 158 | self.visit(item, _index=i, *args, **kargs) 159 | 160 | def generic_visit(self, node, *args, **kargs): 161 | """Dispatch to each field of a node.""" 162 | for field in node._fields: 163 | value = getattr(node, field) 164 | self.visit(value, _field=field, *args, **kargs) 165 | 166 | 167 | class NodeTransformer(NodeVisitor): 168 | 169 | """Visitor that produces a transformed copy of the input tree. 170 | 171 | Handlers may return a replacement node, or None to indicate 172 | no change (note that this differs from ast.NodeTransformer). 173 | If the node is part of a sequence, it may also return a list or 174 | tuple (normalized to a tuple) to splice in its place; use the 175 | empty sequence to delete the node from its sequence. 176 | """ 177 | 178 | # In the handlers, "no change" is indicated by returning None 179 | # or by returning the exact same node as was given. Handlers 180 | # may assume that their subcalls to visit() will always indicate 181 | # "no change" by returning the node rather than None; this is 182 | # ensured by visit(). The same is true of generic_visit(). 183 | # 184 | # This means that it is impossible to replace a part of the tree 185 | # with an actual "None" value. For transformers that require this 186 | # capability, set the class attribute _nochange_none to False, 187 | # which disables this feature. 188 | # 189 | # Returning a node that is equal to ("==") but not identical to 190 | # ("is") the given node is considered a change. This is because 191 | # it would be costly to recognize the case where distinct trees 192 | # are equal to each other. (In fact, it would take quadratic time 193 | # in the depth of the tree.) For efficiency, avoid returning 194 | # equal but non-identical values unnecessarily. 195 | # 196 | # So long as all children return no change, seq_visit() and 197 | # generic_visit() return no change. This means that the only 198 | # nodes that need to be copied are the ones that lie along 199 | # a path from the changed node to the root of the tree, rather 200 | # than all the nodes in the tree. 201 | 202 | _nochange_none = True 203 | """If True, when None is returned by a handler it will be 204 | considered the same as if the given node were returned. 205 | """ 206 | 207 | def visit(self, tree): 208 | result = super().visit(tree) 209 | if self._nochange_none and isinstance(tree, AST) and result is None: 210 | result = tree 211 | return result 212 | 213 | def seq_visit(self, seq): 214 | changed = False 215 | new_seq = [] 216 | 217 | for item in seq: 218 | result = self.visit(item) 219 | if result is not item: 220 | changed = True 221 | if isinstance(result, (tuple, list)): 222 | new_seq.extend(result) 223 | else: 224 | new_seq.append(result) 225 | 226 | if changed: 227 | return tuple(new_seq) 228 | else: 229 | # Be sure to return the original tuple so the 230 | # identity test in generic_visit() succeeds 231 | # and we potentially avoid a copy. 232 | return seq 233 | 234 | def generic_visit(self, node): 235 | repls = {} 236 | for field in node._fields: 237 | value = getattr(node, field) 238 | result = self.visit(value) 239 | if result is not value: 240 | repls[field] = result 241 | 242 | if len(repls) == 0: 243 | return node 244 | else: 245 | return node._replace(**repls) 246 | 247 | 248 | class AdvNodeTransformer(AdvNodeVisitor): 249 | 250 | """As above but with context info and arbitrary parameters.""" 251 | 252 | _nochange_none = True 253 | 254 | def visit(self, tree, *args, **kargs): 255 | result = super().visit(tree, *args, **kargs) 256 | if self._nochange_none and isinstance(tree, AST) and result is None: 257 | result = tree 258 | return result 259 | 260 | def seq_visit(self, seq, *args, **kargs): 261 | changed = False 262 | new_seq = [] 263 | 264 | for i, item in enumerate(seq): 265 | result = self.visit(item, _index=i, *args, **kargs) 266 | if result is not item: 267 | changed = True 268 | if isinstance(result, (tuple, list)): 269 | new_seq.extend(result) 270 | else: 271 | new_seq.append(result) 272 | 273 | if changed: 274 | return tuple(new_seq) 275 | else: 276 | # Be sure to return the original tuple so the 277 | # identity test in generic_visit() succeeds 278 | # and we potentially avoid a copy. 279 | return seq 280 | 281 | def generic_visit(self, node, *args, **kargs): 282 | repls = {} 283 | for field in node._fields: 284 | value = getattr(node, field) 285 | result = self.visit(value, _field=field, *args, **kargs) 286 | if result is not value: 287 | repls[field] = result 288 | 289 | if len(repls) == 0: 290 | return node 291 | else: 292 | return node._replace(**repls) 293 | 294 | 295 | class ChangeCounter(NodeTransformer): 296 | 297 | """Transformer mixin that instruments the transformation to 298 | record how much work is being done. Updates an external dictionary 299 | with the number of new nodes visited and replaced. 300 | """ 301 | 302 | def __init__(self, instr, *args, **kargs): 303 | super().__init__(*args, **kargs) 304 | instr.setdefault('visited', 0) 305 | instr.setdefault('changed', 0) 306 | self.instr = instr 307 | 308 | def visit(self, tree): 309 | self.instr['visited'] += 1 310 | before = tree 311 | tree = super().visit(tree) 312 | if tree is not None and tree is not before: 313 | self.instr['changed'] += 1 314 | return tree 315 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | 3 | setup( 4 | name = 'iAST', 5 | version = '0.2.1', 6 | url = 'https://github.com/brandjon/iast', 7 | 8 | author = 'Jon Brandvein', 9 | author_email = 'jon.brandvein@gmail.com', 10 | license = 'MIT License', 11 | description = 'A library for defining and manipulating ASTs', 12 | 13 | classifiers = [ 14 | 'Development Status :: 3 - Alpha', 15 | 'Intended Audience :: Developers', 16 | 'License :: OSI Approved :: MIT License', 17 | 'Programming Language :: Python :: 3', 18 | 'Topic :: Software Development :: Libraries :: Python Modules', 19 | ], 20 | 21 | packages = ['iast', 'iast.asdl', 'iast.python'], 22 | package_data = {'iast.asdl': ['*.asdl']}, 23 | 24 | test_suite = 'tests', 25 | 26 | install_requires = ['simplestruct >=0.2.1'], 27 | ) 28 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | def additional_tests(): 4 | return unittest.defaultTestLoader.discover( 5 | 'iast.asdl', pattern='*_test.py') 6 | -------------------------------------------------------------------------------- /tests/python/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/brandjon/iast/23961536c3bfb5d8fce39c28214ea88b8072450c/tests/python/__init__.py -------------------------------------------------------------------------------- /tests/python/test_pynode.py: -------------------------------------------------------------------------------- 1 | """Unit tests for pynode.py.""" 2 | 3 | 4 | import unittest 5 | import ast 6 | import pickle 7 | 8 | from iast.node import AST 9 | from iast.python.default import * 10 | 11 | 12 | class NodeCase(unittest.TestCase): 13 | 14 | # The nodes used in these tests are compatible with both 15 | # Python 3.3 and Python 3.4. 16 | 17 | tree_str = "Module(body=(Expr(value=Name(id='a', ctx=Load())),))" 18 | 19 | def test_init_nodetypes(self): 20 | node = Name('a', Load()) 21 | self.assertEqual(repr(node), "Name(id='a', ctx=Load())") 22 | self.assertEqual(Name.__bases__, (expr,)) 23 | 24 | def test_import(self): 25 | tree = ast.parse('a') 26 | tree = pyToStruct(tree) 27 | self.assertTrue(isinstance(tree, AST)) 28 | self.assertEqual(str(tree), self.tree_str) 29 | 30 | def test_export(self): 31 | tree = eval(self.tree_str, py_nodes) 32 | tree = structToPy(tree) 33 | exp_str = "Module(body=[Expr(value=Name(id='a', ctx=Load()))])" 34 | self.assertTrue(isinstance(tree, ast.AST)) 35 | self.assertEqual(ast.dump(tree), exp_str) 36 | 37 | def test_pickle(self): 38 | # With the craziness of multiple node kinds and import/export 39 | # trickery, make sure pickling still works. 40 | node1 = Name('foo', Load()) 41 | s = pickle.dumps(node1) 42 | node2 = pickle.loads(s) 43 | self.assertEqual(node1, node2) 44 | 45 | 46 | if __name__ == '__main__': 47 | unittest.main() 48 | -------------------------------------------------------------------------------- /tests/python/test_pyutil.py: -------------------------------------------------------------------------------- 1 | """Unit tests for pyutil.py.""" 2 | 3 | 4 | import unittest 5 | 6 | from iast.pattern import PatVar, Wildcard 7 | from iast.python.default import * 8 | 9 | 10 | class PyUtilCase(unittest.TestCase): 11 | 12 | def pc(self, source): 13 | return extract_tree(parse(source), 'code') 14 | 15 | def ps(self, source): 16 | return extract_tree(parse(source), 'stmt') 17 | 18 | def pe(self, source): 19 | return extract_tree(parse(source), 'expr') 20 | 21 | def test_make_pattern(self): 22 | tree = parse(''' 23 | a = (_, _x) 24 | (_x, _) = b 25 | ''') 26 | tree = make_pattern(tree) 27 | exp_tree = Module((Assign((Name('a', Store()),), 28 | Tuple((Wildcard(), PatVar('_x')), 29 | Load())), 30 | Assign((Tuple((PatVar('_x'), Wildcard()), 31 | Store()),), 32 | Name('b', Load())))) 33 | self.assertEqual(tree, exp_tree) 34 | 35 | def test_extract(self): 36 | tree_in = parse('x') 37 | 38 | tree_out = extract_tree(tree_in, mode='mod') 39 | self.assertEqual(tree_out, tree_in) 40 | 41 | tree_out = extract_tree(tree_in, mode='code') 42 | exp_tree_out = (Expr(Name('x', Load())),) 43 | self.assertEqual(tree_out, exp_tree_out) 44 | 45 | tree_out = extract_tree(Module([]), mode='stmt_or_blank') 46 | self.assertEqual(tree_out, None) 47 | tree_out = extract_tree(tree_in, mode='stmt_or_blank') 48 | self.assertEqual(tree_out, Expr(Name('x', Load()))) 49 | with self.assertRaises(ValueError): 50 | extract_tree(parse('x; y'), mode='stmt_or_blank') 51 | 52 | tree_out = extract_tree(tree_in, mode='stmt') 53 | self.assertEqual(tree_out, Expr(Name('x', Load()))) 54 | with self.assertRaises(ValueError): 55 | extract_tree(Module([]), mode='stmt') 56 | with self.assertRaises(ValueError): 57 | extract_tree(parse('x; y'), mode='stmt') 58 | 59 | tree_out = extract_tree(tree_in, mode='expr') 60 | exp_tree_out = Name('x', Load()) 61 | self.assertEqual(tree_out, exp_tree_out) 62 | with self.assertRaises(ValueError): 63 | extract_tree(parse('pass'), mode='expr') 64 | 65 | tree_out = extract_tree(tree_in, mode='lval') 66 | exp_tree_out = Name('x', Store()) 67 | self.assertEqual(tree_out, exp_tree_out) 68 | 69 | def test_ctx(self): 70 | tree = self.pe('(x, [y, z], *(q, r.f))') 71 | tree = ContextSetter.run(tree, Store) 72 | exp_tree = self.ps('(x, [y, z], *(q, r.f)) = None').targets[0] 73 | self.assertEqual(tree, exp_tree) 74 | 75 | def test_liteval(self): 76 | # Basic. 77 | tree = self.pe('(1 + 2) * 5') 78 | val = literal_eval(tree) 79 | self.assertEqual(val, 15) 80 | 81 | # Comparators, names. 82 | tree = self.pe('1 < 2 == -~1 and True and None is None') 83 | val = literal_eval(tree) 84 | self.assertEqual(val, True) 85 | 86 | # Collections. 87 | tree = self.pe('[1, 2], {3, 4}, {5: "a", 6: "b"}') 88 | val = literal_eval(tree) 89 | exp_val = [1, 2], {3, 4}, {5: 'a', 6: 'b'} 90 | self.assertEqual(val, exp_val) 91 | 92 | def test_templater(self): 93 | # Name to string or AST. 94 | tree = parse('a = b + c') 95 | subst = {'a': 'a2', 'b': Name('b2', Load()), 96 | 'c': 'c2'} 97 | tree = Templater.run(tree, subst) 98 | exp_tree = parse('a2 = b2 + c2') 99 | self.assertEqual(tree, exp_tree) 100 | 101 | # Attribute name change. 102 | tree = parse('a.foo.foo') 103 | subst = {'@foo': 'bar'} 104 | tree = Templater.run(tree, subst) 105 | exp_tree = parse('a.bar.bar') 106 | self.assertEqual(tree, exp_tree) 107 | 108 | # Function name change. 109 | tree = parse('def foo(x): return foo(x)') 110 | subst = {'foo': 'bar'} 111 | tree = Templater.run(tree, subst) 112 | exp_tree = parse('def bar(x): return foo(x)') 113 | self.assertEqual(tree, exp_tree) 114 | 115 | # Code substitution. 116 | tree = parse(''' 117 | Foo 118 | Bar 119 | ''') 120 | subst = {'Foo': self.pc('pass'), 121 | 'Foo': 'Foo2', 122 | 'Bar': 'Bar2'} 123 | tree = Templater.run(tree, subst) 124 | exp_tree = parse(''' 125 | pass 126 | Bar2 127 | ''') 128 | self.assertEqual(tree, exp_tree) 129 | 130 | # Repeat substitution. 131 | tree = parse(''' 132 | def foo(): 133 | a.b = c 134 | Bar 135 | ''') 136 | tree2 = self.pc(''' 137 | for x in S: 138 | Baz 139 | ''') 140 | tree3 = self.pc('c') 141 | subst = {'c': 'c2', 'c2': 'c3', 142 | 'foo': 'foo2', 'foo2': 'foo3', 143 | '@b': 'b2', '@b2': 'b3', 144 | 'Bar': Expr(Name('Bar2', Load())), 145 | 'Bar2': tree2, 146 | 'Baz': tree3} 147 | tree = Templater.run(tree, subst, repeat=True) 148 | exp_tree = parse(''' 149 | def foo3(): 150 | a.b3 = c3 151 | for x in S: 152 | c3 153 | ''') 154 | self.assertEqual(tree, exp_tree) 155 | 156 | # Bailout limit. 157 | tree = parse('a') 158 | subst = {'a': 'a'} 159 | with self.assertRaises(RuntimeError): 160 | Templater.run(tree, subst, repeat=True) 161 | 162 | # Recursion limit error. 163 | tree = parse('a') 164 | subst = {'a': Expr(Name('a', Load()))} 165 | with self.assertRaises(RuntimeError): 166 | Templater.run(tree, subst, repeat=True) 167 | 168 | def test_ast_args(self): 169 | @astargs 170 | def foo(a, b:'Str'): 171 | return a + b 172 | res = foo('x', Str('y')) 173 | self.assertEqual(res, 'xy') 174 | with self.assertRaises(TypeError): 175 | foo('x', 'y') 176 | 177 | @astargs 178 | def foo(a:'ids', b:'Name'): 179 | return ', '.join(a) + ' : ' + b 180 | res = foo(self.pe('[a, b, c]'), self.pe('d')) 181 | self.assertEqual(res, 'a, b, c : d') 182 | 183 | @astargs 184 | def foo(a:'err'): 185 | pass 186 | with self.assertRaises(TypeError): 187 | foo(1) 188 | 189 | def test_macro(self): 190 | pe = self.pe 191 | pc = self.pc 192 | 193 | # Handlers for methods and functions, statements and expressions. 194 | class Foo(MacroProcessor): 195 | def handle_ms_foo(self, f, rec, arg): 196 | return Expr(Tuple((rec, arg), Load())) 197 | def handle_fe_bar(self, f, arg): 198 | return Num(5) 199 | 200 | tree = parse('o.foo(bar(1))') 201 | tree = Foo.run(tree) 202 | exp_tree = parse('(o, 5)') 203 | self.assertEqual(tree, exp_tree) 204 | 205 | # With handlers. 206 | class Foo(MacroProcessor): 207 | def handle_fw_baz(self, f, arg, _body): 208 | return (Pass(),) + _body 209 | 210 | tree = parse(''' 211 | with baz(1): 212 | print(5) 213 | ''') 214 | tree = Foo.run(tree) 215 | exp_tree = parse(''' 216 | pass 217 | print(5) 218 | ''') 219 | self.assertEqual(tree, exp_tree) 220 | 221 | # Precedence of expression handlers over other handlers. 222 | class Foo(MacroProcessor): 223 | def handle_fe_foo(self, f): 224 | return pe('bar1') 225 | def handle_fs_foo(self, f): 226 | return pc('bar2') 227 | def handle_fw_foo(self, f, arg, _body): 228 | return pc('bar3') 229 | 230 | tree = parse(''' 231 | foo() + 1 232 | foo() 233 | with foo(): 234 | pass 235 | ''') 236 | tree = Foo.run(tree) 237 | exp_tree = parse(''' 238 | bar1 + 1 239 | bar1 240 | with bar1: 241 | pass 242 | ''') 243 | self.assertEqual(tree, exp_tree) 244 | 245 | 246 | if __name__ == '__main__': 247 | unittest.main() 248 | -------------------------------------------------------------------------------- /tests/test_node.py: -------------------------------------------------------------------------------- 1 | """Unit tests for node.py.""" 2 | 3 | 4 | import unittest 5 | from collections import OrderedDict 6 | from simplestruct import Field 7 | 8 | from iast.util import trim 9 | from iast.asdl import parse_asdl 10 | from iast.node import * 11 | from iast.node import ASDLImporter 12 | 13 | 14 | class NodeCase(unittest.TestCase): 15 | 16 | def test_node(self): 17 | # Define, construct, and repr. 18 | class Foo(AST): 19 | _fields = ('a', 'b', 'c') 20 | b = Field() 21 | node = Foo(1, 2, 3) 22 | s = repr(node) 23 | exp_s = 'Foo(a=1, b=2, c=3)' 24 | self.assertEqual(s, exp_s) 25 | 26 | # Reconstruct the tree from repr. 27 | node2 = eval(s, locals()) 28 | self.assertEqual(node2, node) 29 | 30 | def test_dump(self): 31 | class Add(AST): 32 | _fields = ['left', 'right'] 33 | class Sum(AST): 34 | _fields = ['operands'] 35 | tree = Sum((Add(1, 2), Sum((3, 4,)), Sum((5,)), Sum(()),)) 36 | s = dump(tree) 37 | exp_s = trim(''' 38 | Sum(operands = (Add(left = 1, 39 | right = 2), 40 | Sum(operands = (3, 41 | 4)), 42 | Sum(operands = (5,)), 43 | Sum(operands = ()))) 44 | ''') 45 | self.assertEqual(s, exp_s) 46 | 47 | # Reconstruct the tree from dump. 48 | tree2 = eval(s, locals()) 49 | self.assertEqual(tree2, tree) 50 | 51 | asdl_spec = trim(''' 52 | module Dummy 53 | { 54 | expr = Sum(expr* operands) 55 | | Num(num val) 56 | | Unit() 57 | num = (int real, int? imag) 58 | } 59 | ''') 60 | 61 | def test_asdl_importer(self): 62 | asdl = parse_asdl(self.asdl_spec) 63 | info = ASDLImporter().run(asdl) 64 | 65 | exp_info = OrderedDict([ 66 | ('expr', ([], 'AST')), 67 | ('num', ([('real', 'int', ''), ('imag', 'int', '?')], 'AST')), 68 | ('Sum', ([('operands', 'expr', '*')], 'expr')), 69 | ('Num', ([('val', 'num', '')], 'expr')), 70 | ('Unit', ([], 'expr')), 71 | ]) 72 | self.assertEqual(info.items(), exp_info.items()) 73 | 74 | def test_from_asdl_untyped(self): 75 | asdl = parse_asdl(self.asdl_spec) 76 | lang = nodes_from_asdl(asdl) 77 | 78 | self.assertEqual(lang['AST'], AST) 79 | self.assertEqual(lang['Sum']._fields, ('operands',)) 80 | self.assertEqual(lang['Sum'].__bases__, (lang['expr'],)) 81 | self.assertEqual(lang['num']._fields, ('real', 'imag')) 82 | self.assertEqual(lang['num'].__bases__, (lang['AST'],)) 83 | 84 | def test_from_asdl_typed(self): 85 | asdl = parse_asdl(self.asdl_spec) 86 | lang = nodes_from_asdl(asdl, typed=True, 87 | primitive_types={'int': int}) 88 | 89 | Numcls = lang['Num'] 90 | numcls = lang['num'] 91 | 92 | numcls(1, 2) 93 | numcls(1, None) 94 | with self.assertRaises(TypeError): 95 | numcls('a', 2) 96 | with self.assertRaises(TypeError): 97 | numcls(1, 'b') 98 | 99 | Numcls(numcls(1, 2)) 100 | with self.assertRaises(TypeError): 101 | Numcls((1, 2)) 102 | 103 | Sumcls = lang['Sum'] 104 | Unitcls = lang['Unit'] 105 | with self.assertRaises(TypeError): 106 | Sumcls(Unitcls()) 107 | 108 | 109 | if __name__ == '__main__': 110 | unittest.main() 111 | -------------------------------------------------------------------------------- /tests/test_pattern.py: -------------------------------------------------------------------------------- 1 | """Unit tests for pattern.py.""" 2 | 3 | 4 | import unittest 5 | 6 | from iast.python.default import parse, make_pattern, Num, BinOp, Add, Mult 7 | from iast.pattern import * 8 | from iast.pattern import match_step 9 | 10 | 11 | class PatternCase(unittest.TestCase): 12 | 13 | def pat(self, source): 14 | return make_pattern(parse(source)) 15 | 16 | def pe(self, source): 17 | return parse(source).body[0].value 18 | 19 | def pate(self, source): 20 | return self.pat(source).body[0].value 21 | 22 | def test_match_step(self): 23 | # Simple. 24 | result = match_step(PatVar('_X'), Num(1)) 25 | exp_result = ([], {'_X': Num(1)}) 26 | self.assertEqual(result, exp_result) 27 | 28 | # Wildcard. 29 | result = match_step(Wildcard(), Num(1)) 30 | exp_result = ([], {}) 31 | self.assertEqual(result, exp_result) 32 | 33 | # Var on RHS. 34 | result = match_step(Num(1), PatVar('_X')) 35 | exp_result = ([], {'_X': Num(1)}) 36 | self.assertEqual(result, exp_result) 37 | 38 | # Redundant equation. 39 | result = match_step(PatVar('_X'), PatVar('_X')) 40 | exp_result = ([], {}) 41 | self.assertEqual(result, exp_result) 42 | 43 | # Circular equation. 44 | with self.assertRaises(MatchFailure): 45 | match_step(PatVar('_X'), BinOp(PatVar('_X'), Add(), Num(1))) 46 | 47 | # Nodes, constants. 48 | result = match_step(Num(1), Num(1)) 49 | exp_result = ([(1, 1)], {}) 50 | self.assertEqual(result, exp_result) 51 | with self.assertRaises(MatchFailure): 52 | match_step(Num(1), BinOp(Num(1), Add(), Num(2))) 53 | with self.assertRaises(MatchFailure): 54 | match_step(1, 2) 55 | 56 | # Tuples. 57 | result = match_step((1, 2), (1, 2)) 58 | exp_result = ([(1, 1), (2, 2)], {}) 59 | self.assertEqual(result, exp_result) 60 | with self.assertRaises(MatchFailure): 61 | match_step((1, 2), (1, 2, 3)) 62 | 63 | def test_match(self): 64 | result = match(self.pat('((_X, _Y), _Z + _)'), 65 | self.pat('((1, _Z), 2 + 3)')) 66 | exp_result = { 67 | '_X': Num(1), 68 | '_Y': Num(2), 69 | '_Z': Num(2), 70 | } 71 | self.assertEqual(result, exp_result) 72 | 73 | result = match(1, 2) 74 | self.assertEqual(result, None) 75 | 76 | def test_pattrans(self): 77 | class Trans(PatternTransformer): 78 | rules = [ 79 | # Constant-fold addition. 80 | (BinOp(Num(PatVar('_X')), Add(), Num(PatVar('_Y'))), 81 | lambda _X, _Y: Num(_X + _Y)), 82 | # Constant-fold left-multiplication by 0, 83 | # defer to other rules. 84 | (BinOp(Num(PatVar('_X')), Mult(), Num(PatVar('_Y'))), 85 | lambda _X, _Y: Num(0) if _X == 0 else NotImplemented), 86 | # Constant-fold right-multiplication by 0, 87 | # do not defer to other rules. 88 | (BinOp(Num(PatVar('_X')), Mult(), Num(PatVar('_Y'))), 89 | lambda _X, _Y: Num(0) if _Y == 0 else None), 90 | # Constant-fold multiplication, but never gets 91 | # to run since above rule doesn't defer. 92 | (BinOp(Num(PatVar('_X')), Mult(), Num(PatVar('_Y'))), 93 | lambda _X, _Y: Num(_X * _Y)), 94 | ] 95 | 96 | # Bottom-up; subtrees should be processed first. 97 | tree = parse('1 + (2 + 3)') 98 | tree = Trans.run(tree) 99 | exp_tree = parse('6') 100 | self.assertEqual(tree, exp_tree) 101 | 102 | # NotImplemented defers to third rule, None blocks last rule. 103 | tree = parse('(5 * 2) * ((3 * 0) - 1)') 104 | tree = Trans.run(tree) 105 | exp_tree = parse('(5 * 2) * (0 - 1)') 106 | self.assertEqual(tree, exp_tree) 107 | 108 | 109 | if __name__ == '__main__': 110 | unittest.main() 111 | -------------------------------------------------------------------------------- /tests/test_util.py: -------------------------------------------------------------------------------- 1 | """Unit tests for util.py.""" 2 | 3 | 4 | import unittest 5 | 6 | from iast.util import * 7 | 8 | 9 | class UtilCase(unittest.TestCase): 10 | 11 | def test_trim(self): 12 | text1 = trim(''' 13 | for x in foo: 14 | print(x) 15 | ''') 16 | exp_text1 = 'for x in foo:\n print(x)' 17 | 18 | self.assertEqual(text1, exp_text1) 19 | 20 | text2 = trim('') 21 | exp_text2 = '' 22 | 23 | self.assertEqual(text2, exp_text2) 24 | 25 | 26 | if __name__ == '__main__': 27 | unittest.main() 28 | -------------------------------------------------------------------------------- /tests/test_visitor.py: -------------------------------------------------------------------------------- 1 | """Unit tests for visitor.py.""" 2 | 3 | 4 | import unittest 5 | 6 | from iast.util import trim 7 | from iast.node import dump 8 | import iast.python.default as L 9 | from iast.python.default import parse 10 | from iast.visitor import * 11 | 12 | 13 | class VisitorCase(unittest.TestCase): 14 | 15 | def test_visitor(self): 16 | class Foo(NodeVisitor): 17 | def process(self, tree): 18 | self.names = set() 19 | super().process(tree) 20 | return self.names 21 | def visit_Name(self, node): 22 | self.names.add(node.id) 23 | 24 | tree = parse('a = foo(a)') 25 | result = Foo.run(tree) 26 | self.assertEqual(result, {'a', 'foo'}) 27 | 28 | def test_visitor_context(self): 29 | class Foo(AdvNodeVisitor): 30 | def process(self, tree): 31 | self.occ = [] 32 | super().process(tree) 33 | return self.occ 34 | def visit_Pass(self, node): 35 | self.occ.append(self._visit_stack[-1]) 36 | def visit_Name(self, node): 37 | self.occ.append(self._visit_stack[-1]) 38 | 39 | tree = parse(''' 40 | pass 41 | a = b 42 | ''') 43 | res = Foo.run(tree) 44 | exp_res = [ 45 | (L.Pass(), 'body', 0), 46 | (L.Name('a', L.Store()), 'targets', 0), 47 | (L.Name('b', L.Load()), 'value', None) 48 | ] 49 | self.assertEqual(res, exp_res) 50 | 51 | def test_transformer(self): 52 | # Basic functionality. 53 | 54 | class Foo(NodeTransformer): 55 | def visit_Name(self, node): 56 | if node.id == 'a': 57 | return node._replace(id='c') 58 | def visit_Expr(self, node): 59 | node = self.generic_visit(node) 60 | return [node, node] 61 | def visit_Pass(self, node): 62 | return [] 63 | 64 | tree = parse(trim(''' 65 | a 66 | pass 67 | ''')) 68 | tree = Foo.run(tree) 69 | exp_text = trim(''' 70 | Module(body = (Expr(value = Name(id = 'c', 71 | ctx = Load())), 72 | Expr(value = Name(id = 'c', 73 | ctx = Load())))) 74 | ''') 75 | self.assertEqual(dump(tree), exp_text) 76 | 77 | # Make sure None returns aren't propagated to caller. 78 | 79 | class Foo(NodeTransformer): 80 | pass 81 | 82 | tree1 = parse('pass') 83 | tree2 = Foo.run(tree1) 84 | self.assertEqual(tree1, tree2) 85 | tree1 = (parse('pass'), parse('pass')) 86 | tree2 = Foo.run(tree1) 87 | self.assertEqual(tree1, tree2) 88 | 89 | # Unless we want them to be. 90 | 91 | class Foo(NodeTransformer): 92 | _nochange_none = False 93 | def visit_Num(self, node): 94 | return None 95 | 96 | tree = parse('return 5') 97 | tree = Foo.run(tree) 98 | exp_tree = parse('return') 99 | self.assertEqual(tree, exp_tree) 100 | 101 | def test_counter(self): 102 | class Foo(ChangeCounter, NodeTransformer): 103 | def visit_Name(self, node): 104 | return node._replace(id=node.id * 2) 105 | 106 | instr = {} 107 | tree = parse('a + b + c + "s"') 108 | tree = Foo.run(tree, instr) 109 | exp_tree = parse('aa + bb + cc + "s"') 110 | 111 | self.assertEqual(tree, exp_tree) 112 | self.assertEqual(instr['visited'], 14) 113 | self.assertEqual(instr['changed'], 9) 114 | 115 | 116 | if __name__ == '__main__': 117 | unittest.main() 118 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | [tox] 2 | envlist = py33, py34 3 | 4 | [testenv] 5 | commands = python setup.py test 6 | --------------------------------------------------------------------------------