├── LICENSE ├── Makefile ├── README.md ├── demos ├── d8-3404-d8-8424.html ├── d8-3404.cc ├── d8-8424.cc ├── demo1-demo2.html ├── demo1.html ├── demo1.yin ├── demo2.html ├── demo2.yin ├── diff-s.css ├── diff.css ├── diff1-diff2.html ├── diff1.rkt ├── diff2.rkt ├── ikarus1.ss ├── ikarus2.ss ├── mk1.ss ├── mk2.ss ├── paredit20-paredit22.html ├── paredit20.el ├── paredit22-paredit23.html ├── paredit22.el ├── paredit23.el ├── pass1-pass2.html ├── simulator-arm.cc ├── simulator-mips-simulator-arm.html ├── simulator-mips.cc ├── typed-clojure1-typed-clojure2.html ├── typed-clojure1.clj └── typed-clojure2.clj ├── diff-cpp.rkt ├── diff-js.rkt ├── diff-lisp.rkt ├── diff-yin.rkt ├── diff.css ├── diff.rkt ├── htmlize.rkt ├── nav.js ├── parse-cpp.rkt ├── parse-js.rkt ├── parse-lisp.rkt ├── parse-yin.rkt ├── parsec.rkt ├── structs.rkt └── utils.rkt /Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | raco exe diff-lisp.rkt 3 | raco exe diff-cpp.rkt 4 | raco exe diff-js.rkt 5 | raco exe diff-yin.rkt 6 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ydiff 2 | ======= 3 | 4 | *a structural comparison tool for Lisp programs* 5 | 6 | 7 | ### Demo 8 | 9 | 10 | 11 | ### Features 12 | 13 | * _Language-aware_. ydiff parses programs, understands basic language constructs and will not make non-sensical comparisons. For example it will not compare a string "10000" with an integer 10000 even though they look very similar. Also, it tries to match functions with the same name before it attempts to destruct and compare functions of different names. 14 | 15 | * _Format insensitive_. The comparison result will not be affected by different number of white spaces, line breaks or indentation. For example, ydiff will not produce a large diff just because you surrounded a block of code with if (condition) {...}. 16 | 17 | * _Moved code detection_. ydiff can find refactored code -- renamed, moved, reordered, wrapped, lifted, combined or fragmented code. Refactored code can be detected however deep they are into the structures. 18 | 19 | * _Human-friendly output_. The output of ydiff is designed for human understanding. The interactive UI helps the user navigate and understand changes efficiently. 20 | 21 | 22 | ### No longer supported Languages 23 | 24 | ydiff originally aim to support multiple languages, but I found that the parsers 25 | are very boring and tedious to get right. I hate languages with complicated 26 | syntax which makes parsing hard, so I decided to stop developing parsers for 27 | languages except the Lisp family. The C++ parser here is just to demonstrate how 28 | a mostly-right C++ can be constructed within two days of work ;-) You are 29 | welcome to take the code and extend to complete parsing C++ and JavaScript, but 30 | I can't provide any assistance. 31 | 32 | 33 | ### Installation 34 | 35 | ydiff is implemented in Racket. You can get Racket from 36 | 37 | http://racket-lang.org 38 | 39 | To build ydiff, go to the directory containing the source and run 40 | 41 | make 42 | 43 | It should compile to several executables, such as `diff-lisp`, 44 | `diff-cpp`, `diff-js` etc. Copy those files to some directory in your 45 | PATH. 46 | 47 | 48 | 49 | ### Usage 50 | 51 | 1. Run commands like this example: 52 | 53 | diff-lisp demos/mk1.ss demos/mk2.ss 54 | 55 | It will produce a HTML file named "mk1-mk2.html" in the current 56 | directory. 57 | 58 | 59 | 2. Use your browser to open the HTML file. That is basically it. 60 | 61 | 62 | 63 | ### Caveats 64 | 65 | 66 | 1. The HTML file needs the support files `nav.js` and `diff.css` to be 67 | present in the same directory. You must copy those files to the 68 | directory where you ran the commands. This is not a good user 69 | experience design and may be improved later. 70 | 71 | 72 | 73 | ### Contact 74 | 75 | Yin Wang (yinwang0@gmail.com) 76 | 77 | 78 | 79 | #### License (GPLv3) 80 | 81 | ydiff - a structural comparison tool for Lisp programs 82 | 83 | Copyright (c) 2011-2014 Yin Wang 84 | 85 | This program is free software: you can redistribute it and/or modify 86 | it under the terms of the GNU General Public License as published by 87 | the Free Software Foundation, either version 3 of the License, or 88 | (at your option) any later version. 89 | 90 | This program is distributed in the hope that it will be useful, 91 | but WITHOUT ANY WARRANTY; without even the implied warranty of 92 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 93 | GNU General Public License for more details. 94 | 95 | You should have received a copy of the GNU General Public License 96 | along with this program. If not, see . 97 | -------------------------------------------------------------------------------- /demos/d8-3404.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2009 the V8 project authors. All rights reserved. 2 | // Redistribution and use in source and binary forms, with or without 3 | // modification, are permitted provided that the following conditions are 4 | // met: 5 | // 6 | // * Redistributions of source code must retain the above copyright 7 | // notice, this list of conditions and the following disclaimer. 8 | // * Redistributions in binary form must reproduce the above 9 | // copyright notice, this list of conditions and the following 10 | // disclaimer in the documentation and/or other materials provided 11 | // with the distribution. 12 | // * Neither the name of Google Inc. nor the names of its 13 | // contributors may be used to endorse or promote products derived 14 | // from this software without specific prior written permission. 15 | // 16 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 17 | // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 18 | // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 19 | // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 20 | // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 21 | // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 22 | // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 23 | // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 24 | // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 25 | // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 26 | // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 27 | 28 | 29 | #include 30 | #include 31 | 32 | #include "d8.h" 33 | #include "d8-debug.h" 34 | #include "debug.h" 35 | #include "api.h" 36 | #include "natives.h" 37 | #include "platform.h" 38 | 39 | 40 | namespace v8 { 41 | 42 | 43 | const char* Shell::kHistoryFileName = ".d8_history"; 44 | const char* Shell::kPrompt = "d8> "; 45 | 46 | 47 | LineEditor *LineEditor::first_ = NULL; 48 | 49 | 50 | LineEditor::LineEditor(Type type, const char* name) 51 | : type_(type), 52 | name_(name), 53 | next_(first_) { 54 | first_ = this; 55 | } 56 | 57 | 58 | LineEditor* LineEditor::Get() { 59 | LineEditor* current = first_; 60 | LineEditor* best = current; 61 | while (current != NULL) { 62 | if (current->type_ > best->type_) 63 | best = current; 64 | current = current->next_; 65 | } 66 | return best; 67 | } 68 | 69 | 70 | class DumbLineEditor: public LineEditor { 71 | public: 72 | DumbLineEditor() : LineEditor(LineEditor::DUMB, "dumb") { } 73 | virtual i::SmartPointer Prompt(const char* prompt); 74 | }; 75 | 76 | 77 | static DumbLineEditor dumb_line_editor; 78 | 79 | 80 | i::SmartPointer DumbLineEditor::Prompt(const char* prompt) { 81 | static const int kBufferSize = 256; 82 | char buffer[kBufferSize]; 83 | printf("%s", prompt); 84 | char* str = fgets(buffer, kBufferSize, stdin); 85 | return i::SmartPointer(str ? i::StrDup(str) : str); 86 | } 87 | 88 | 89 | CounterMap* Shell::counter_map_; 90 | i::OS::MemoryMappedFile* Shell::counters_file_ = NULL; 91 | CounterCollection Shell::local_counters_; 92 | CounterCollection* Shell::counters_ = &local_counters_; 93 | Persistent Shell::utility_context_; 94 | Persistent Shell::evaluation_context_; 95 | 96 | 97 | bool CounterMap::Match(void* key1, void* key2) { 98 | const char* name1 = reinterpret_cast(key1); 99 | const char* name2 = reinterpret_cast(key2); 100 | return strcmp(name1, name2) == 0; 101 | } 102 | 103 | 104 | // Converts a V8 value to a C string. 105 | const char* ToCString(const v8::String::Utf8Value& value) { 106 | return *value ? *value : ""; 107 | } 108 | 109 | 110 | // Executes a string within the current v8 context. 111 | bool Shell::ExecuteString(Handle source, 112 | Handle name, 113 | bool print_result, 114 | bool report_exceptions) { 115 | HandleScope handle_scope; 116 | TryCatch try_catch; 117 | if (i::FLAG_debugger) { 118 | // When debugging make exceptions appear to be uncaught. 119 | try_catch.SetVerbose(true); 120 | } 121 | Handle") 177 | (line port "") 178 | (line port ""))) 179 | 180 | 181 | 182 | (define html-footer 183 | (lambda (port) 184 | (line port "") 185 | (line port ""))) 186 | 187 | 188 | 189 | (define write-html 190 | (lambda (port text side) 191 | (line port (string-append "
")) 192 | (line port "
")
193 |     (if (string=? side "left")
194 |         (line port "")
195 |         (line port ""))
196 |     (line port text)
197 |     (line port "
") 198 | (line port "
"))) 199 | 200 | 201 | (define htmlize 202 | (lambda (changes file1 file2 text1 text2) 203 | (letv ([tags1 (change-tags changes 'left)] 204 | [tags2 (change-tags changes 'right)] 205 | [tagged-text1 (apply-tags text1 tags1)] 206 | [tagged-text2 (apply-tags text2 tags2)] 207 | [out-file (string-append (base-name file1) "-" 208 | (base-name file2) ".html")] 209 | [port (open-output-file out-file 210 | #:mode 'text 211 | #:exists 'replace)]) 212 | (html-header port) 213 | (write-html port tagged-text1 "left") 214 | (write-html port tagged-text2 "right") 215 | (html-footer port) 216 | (close-output-port port)))) 217 | 218 | -------------------------------------------------------------------------------- /nav.js: -------------------------------------------------------------------------------- 1 | // convenience function for document.getElementById(). 2 | window['$']=function(a){return document.getElementById(a)}; 3 | 4 | 5 | /////////////////////// debug flag //////////////////////// 6 | var debug = false; 7 | 8 | 9 | /////////////////////// adjustable parameters ////////////////// 10 | var minStep = 10; 11 | var nSteps = 30; 12 | var stepInterval = 10; 13 | var blockRange = 5; // how far consider one page blocked 14 | var nodeHLColor = '#C9B0A9'; 15 | var lineHLColor = '#FFFF66'; 16 | var lineBlockedColor = '#E9AB17'; 17 | var bgColor = ''; 18 | var bodyBlockedColor = '#FAF0E6'; 19 | 20 | 21 | ///////////////////////// globals //////////////////////// 22 | var eventCount = { 'left' : 0, 'right' : 0}; 23 | var moving = false; 24 | var matchId1 = 'leftstart'; 25 | var matchId2 = 'rightstart'; 26 | var matchLineId1 = -1; 27 | var matchLineId2 = -1; 28 | var cTimeout; 29 | 30 | 31 | ///////////////////////// utilities /////////////////////// 32 | 33 | // No Math.sign() in JS? 34 | function sign(x) { 35 | if (x > 0) { 36 | return 1; 37 | } else if (x < 0) { 38 | return -1; 39 | } else { 40 | return 0; 41 | } 42 | } 43 | 44 | 45 | function log(msg) { 46 | if (debug) { 47 | console.log(msg); 48 | } 49 | } 50 | 51 | 52 | 53 | function elementPosition(id) { 54 | obj = $(id); 55 | var curleft = 0, curtop = 0; 56 | 57 | if (obj && obj.offsetParent) { 58 | curleft = obj.offsetLeft; 59 | curtop = obj.offsetTop; 60 | 61 | while (obj = obj.offsetParent) { 62 | curleft += obj.offsetLeft; 63 | curtop += obj.offsetTop; 64 | } 65 | } 66 | 67 | return { x: curleft, y: curtop }; 68 | } 69 | 70 | 71 | /* 72 | * Scroll the window to relative position, detecting blocking positions. 73 | */ 74 | function scrollWithBlockCheck(container, distX, distY) { 75 | var oldTop = container.scrollTop; 76 | var oldLeft = container.scrollLeft; 77 | 78 | container.scrollTop += distY; // the ONLY place for actual scrolling 79 | container.scrollLeft += distX; 80 | 81 | var actualX = container.scrollLeft - oldLeft; 82 | var actualY = container.scrollTop - oldTop; 83 | log("distY=" + distY + ", actualY=" + actualY); 84 | log("distX=" + distX + ", actualX=" + actualX); 85 | 86 | // extra leewaw here because Chrome scrolling is horribly inacurate 87 | if ((Math.abs(distX) > blockRange && actualX === 0) 88 | || Math.abs(distY) > blockRange && actualY === 0) { 89 | log("blocked"); 90 | container.style.backgroundColor = bodyBlockedColor; 91 | return true; 92 | } else { 93 | eventCount[container.id] += 1; 94 | container.style.backgroundColor = bgColor; 95 | return false; 96 | } 97 | } 98 | 99 | 100 | function getContainer(elm) { 101 | while (elm && elm.tagName !== 'DIV') { 102 | elm = elm.parentElement || elm.parentNode; 103 | } 104 | return elm; 105 | } 106 | 107 | 108 | /* 109 | * timed animation function for scrolling the current window 110 | */ 111 | function matchWindow(linkId, targetId, n) 112 | { 113 | moving = true; 114 | 115 | var link = $(linkId); 116 | var target = $(targetId); 117 | var linkContainer = getContainer(link); 118 | var targetContainer = getContainer(target); 119 | 120 | var linkPos = elementPosition(linkId).y - linkContainer.scrollTop; 121 | var targetPos = elementPosition(targetId).y - targetContainer.scrollTop; 122 | var distY = targetPos - linkPos; 123 | var distX = linkContainer.scrollLeft - targetContainer.scrollLeft; 124 | 125 | 126 | log("matching window... " + n + " distY=" + distY + " distX=" + distX); 127 | 128 | if (distY === 0 && distX === 0) { 129 | clearTimeout(cTimeout); 130 | moving = false; 131 | } else if (n <= 1) { 132 | scrollWithBlockCheck(targetContainer, distX, distY); 133 | moving = false; 134 | } else { 135 | var stepSize = Math.floor(Math.abs(distY) / n); 136 | actualMinStep = Math.min(minStep, Math.abs(distY)); 137 | if (Math.abs(stepSize) < minStep) { 138 | var step = actualMinStep * sign(distY); 139 | } else { 140 | var step = stepSize * sign(distY); 141 | } 142 | var blocked = scrollWithBlockCheck(targetContainer, distX, step); 143 | var rest = Math.floor(distY / step) - 1; 144 | log("blocked?" + blocked + ", rest steps=" + rest); 145 | if (!blocked) { 146 | cTimeout = setTimeout(function () { 147 | return matchWindow(linkId, targetId, rest); 148 | }, stepInterval); 149 | } else { 150 | clearTimeout(cTimeout); 151 | moving = false; 152 | } 153 | } 154 | } 155 | 156 | 157 | 158 | ////////////////////////// highlighting ///////////////////////////// 159 | 160 | var highlighted = [] 161 | function putHighlight(id, color) { 162 | var elm = $(id); 163 | if (elm !== null) { 164 | elm.style.backgroundColor = color; 165 | if (color !== bgColor) { 166 | highlighted.push(id); 167 | } 168 | } 169 | } 170 | 171 | 172 | function clearHighlight() { 173 | for (i = 0; i < highlighted.length; i += 1) { 174 | putHighlight(highlighted[i], bgColor); 175 | } 176 | highlighted = []; 177 | } 178 | 179 | 180 | 181 | /* 182 | * Highlight the link, target nodes and their lines, 183 | * then start animation to move the other window to match. 184 | */ 185 | function highlight(me, linkId, targetId) 186 | { 187 | clearHighlight(); 188 | putHighlight(linkId, nodeHLColor); 189 | putHighlight(targetId, nodeHLColor); 190 | } 191 | 192 | 193 | function instantMoveOtherWindow (me) { 194 | log("me=" + me.id + ", eventcount=" + eventCount[me.id]); 195 | log("matchId1=" + matchId1 + ", matchId2=" + matchId2); 196 | 197 | me.style.backgroundColor = bgColor; 198 | 199 | if (!moving && eventCount[me.id] === 0) { 200 | if (me.id === 'left') { 201 | matchWindow(matchId1, matchId2, 1); 202 | } else { 203 | matchWindow(matchId2, matchId1, 1); 204 | } 205 | } 206 | if (eventCount[me.id] > 0) { 207 | eventCount[me.id] -= 1; 208 | } 209 | } 210 | 211 | 212 | function getTarget(x){ 213 | x = x || window.event; 214 | return x.target || x.srcElement; 215 | } 216 | 217 | 218 | window.onload = 219 | function (e) { 220 | var tags = document.getElementsByTagName("A") 221 | for (var i = 0; i < tags.length; i++) { 222 | tags[i].onmouseover = 223 | function (e) { 224 | var t = getTarget(e) 225 | var lid = t.id 226 | var tid = t.getAttribute('tid') 227 | var container = getContainer(t) 228 | highlight(container, lid, tid) 229 | } 230 | tags[i].onclick = 231 | function (e) { 232 | var t = getTarget(e) 233 | var lid = t.id 234 | var tid = t.getAttribute('tid') 235 | var container = getContainer(t) 236 | highlight(container, lid, tid) 237 | 238 | // lock left and right side 239 | if (container.id === 'left') { 240 | matchId1 = lid; 241 | matchId2 = tid; 242 | } else { 243 | matchId1 = tid; 244 | matchId2 = lid; 245 | } 246 | 247 | matchWindow(lid, tid, nSteps) 248 | } 249 | } 250 | 251 | tags = document.getElementsByTagName("DIV") 252 | for (var i = 0; i < tags.length; i++) { 253 | tags[i].onscroll = 254 | function (e) { 255 | instantMoveOtherWindow(getTarget(e)) 256 | } 257 | } 258 | 259 | } 260 | -------------------------------------------------------------------------------- /parse-cpp.rkt: -------------------------------------------------------------------------------- 1 | ;; ydiff - a language-aware tool for comparing programs 2 | ;; Copyright (C) 2011-2013 Yin Wang (yinwang0@gmail.com) 3 | 4 | ;; This program is free software: you can redistribute it and/or modify 5 | ;; it under the terms of the GNU General Public License as published by 6 | ;; the Free Software Foundation, either version 3 of the License, or 7 | ;; (at your option) any later version. 8 | 9 | ;; This program is distributed in the hope that it will be useful, 10 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | ;; GNU General Public License for more details. 13 | 14 | ;; You should have received a copy of the GNU General Public License 15 | ;; along with this program. If not, see . 16 | 17 | 18 | #lang racket 19 | 20 | (require "structs.rkt") 21 | (require "utils.rkt") 22 | (require "parsec.rkt") 23 | 24 | (provide parse-cpp) 25 | 26 | 27 | 28 | 29 | ;------------------------------------------------------------- 30 | ; scanner settings 31 | ;------------------------------------------------------------- 32 | 33 | (define set-parameters 34 | (lambda () 35 | (set-delims (list "(" ")" "[" "]" "{" "}" "," "`" ";" "#")) 36 | 37 | (set-operators 38 | (list 39 | "<<=" ">>=" "->*" "..." 40 | "&&" "||" ">>" "<<" "++" "--" 41 | "==" "!=" ">=" "<=" "+=" "-=" "*=" "/=" "^=" "&=" "|=" 42 | "->" ".*" "::" 43 | "=" "+" "-" "*" "/" "%" "<" ">" "!" ":" "?" "." 44 | "^" "|" "&" "~" 45 | )) 46 | 47 | (set-line-comment (list "//")) 48 | (set-comment-start "/*") 49 | (set-comment-end "*/") 50 | (set-quotation-marks '(#\" #\')) 51 | (set-significant-whitespaces 52 | (list #\newline #\linefeed #\u2028 #\u2029)))) 53 | 54 | 55 | 56 | 57 | ;------------------------------------------------------------- 58 | ; parsers 59 | ;------------------------------------------------------------- 60 | 61 | ;; literals 62 | (:: $id 63 | ($pred 64 | (lambda (t) 65 | (and (token? t) 66 | (id? (Node-elts t)))))) 67 | 68 | 69 | (::= $identifier 'identifier 70 | (@? ($$ "::")) 71 | (@* $id (@* $type-parameter) ($$ "::")) 72 | (@= 'id (@? ($$ "~")) $id)) 73 | 74 | 75 | ;; (::= $identifier 'identifier 76 | ;; (@? ($$ "::")) $scope-resolution (@? ($$ "~")) $id) 77 | 78 | 79 | (:: $numeral-literal 80 | ($pred 81 | (lambda (t) 82 | (and (token? t) 83 | (numeral? (Node-elts t)))))) 84 | 85 | (:: $char-literal ($pred character?)) 86 | (:: $string-literal ($pred str?)) 87 | (:: $newline ($pred newline?)) 88 | (:: $comment ($pred comment?)) 89 | 90 | 91 | ;; delimeters 92 | (:: |,| (@_ ",")) 93 | (:: |;| (@~ ";")) 94 | (:: |:| (@_ ":")) 95 | (:: |(| (@~ "(")) 96 | (:: |)| (@~ ")")) 97 | (:: |[| (@~ "[")) 98 | (:: |]| (@~ "]")) 99 | (:: |{| (@~ "{")) 100 | (:: |}| (@~ "}")) 101 | 102 | (:: |\n| ($glob^ (@*^ $newline))) 103 | (:: |;\n| (@or |;| |\n|)) 104 | 105 | 106 | (define old-seq @seq) 107 | 108 | (set-seq 109 | (lambda ps 110 | (let ([psj (join ps |\n|)]) 111 | (apply old-seq `(,|\n| ,@psj ,|\n|))))) 112 | 113 | 114 | 115 | ;; a hacky definition for macros 116 | ;; will fix later 117 | (::= $macro-defintion 'macro 118 | (@~ "#") 119 | (@*^ (old-seq (@*^ (@and (@!^ ($$ "\\")) (@!^ $newline))) ($$ "\\") (@*^ $newline))) 120 | (old-seq (@*^ (@!^ $newline)) ($glob^ $newline) ($glob^ (@*^ $newline))) 121 | ) 122 | 123 | 124 | (:: $directive 125 | (@or ($$ "ifdef") 126 | ($$ "define") 127 | ($$ "undef") 128 | ($$ "endif"))) 129 | 130 | 131 | ;;------------------ starting point ----------------- 132 | (::= $program 'program 133 | (@* $statement) 134 | ) 135 | 136 | 137 | 138 | (:: $statement 139 | (@or $macro-defintion 140 | $empty-statement 141 | $access-label 142 | $statement-block 143 | 144 | $if-statement 145 | $switch-statement 146 | $do-while-statement 147 | $while-statement 148 | $for-statement 149 | $continue-statement 150 | $break-statement 151 | 152 | $return-statement 153 | $labelled-statement 154 | $try-statement 155 | 156 | $namespace-definition 157 | $using-namespace 158 | 159 | $class-definition 160 | $function-definition 161 | $function-declaration 162 | $variable-definition 163 | $enum-declaration 164 | 165 | $extended-assembly 166 | $inline-assembly 167 | 168 | $expression-statement 169 | )) 170 | 171 | 172 | (:: $empty-statement |;|) 173 | 174 | 175 | (::= $enum-declaration 'enum 176 | (@~ "enum") (@? $identifier) 177 | |{| 178 | (@? (@.@ (@= 'name-value $identifier (@? $initializer)) |,|)) 179 | |}| 180 | |;| 181 | ) 182 | 183 | 184 | (::= $access-label 'access-label 185 | $access-specifier (@~ ":")) 186 | 187 | 188 | (::= $statement-block 'block 189 | |{| (@* $statement) |}| 190 | ) 191 | 192 | 193 | (::= $namespace-definition 'namespace 194 | ($$ "namespace") $identifier 195 | |{| (@* $statement) |}| 196 | ) 197 | 198 | (::= $using-namespace 'using-namespace 199 | ($$ "using") ($$ "namespace") $identifier) 200 | 201 | 202 | 203 | ;;-------------------------------------------- 204 | (::= $class-definition 'class 205 | 206 | (@or ($$ "class") 207 | ($$ "struct") 208 | ($$ "union")) 209 | 210 | (@* (@= 'declspec 211 | (@or ($$ "_declspec") ($$ "__declspec")) |(| $expression |)|)) 212 | 213 | (@or (@= 'name $identifier |;| ) 214 | 215 | (@... 216 | (@= 'name (@? $identifier)) (@? (@... (@_ ":") $base-clause)) 217 | (@= 'body |{| (@* $statement) |}|) ) 218 | )) 219 | 220 | 221 | (::= $base-clause 'bases 222 | (@.@ $base-specifier |,|) 223 | ) 224 | 225 | 226 | (::= $base-specifier 'base-specifier 227 | (@? $access-specifier) $identifier) 228 | 229 | 230 | (::= $access-specifier 'access-specifier 231 | (@or ($$ "public") 232 | ($$ "protected") 233 | ($$ "private") 234 | ($$ "virtual"))) 235 | 236 | 237 | ;;---------- function definition and declaration ------------ 238 | 239 | (::= $function-declaration 'function-declaration 240 | (@? ($$ "typedef")) 241 | (@? $access-specifier) (@? $modifiers) (@? $type) 242 | (@= 'name (@or $identifier 243 | (@... |(| ($$ "*") $identifier |)|)) ) 244 | $formal-parameter-list 245 | (@? ($$ "const")) 246 | (@? $initializer) 247 | ) 248 | 249 | 250 | (::= $function-definition 'function 251 | (@or (@... (@? $modifiers) $type 252 | (@= 'name $identifier ) $formal-parameter-list) 253 | 254 | (@... (@= 'name $identifier ) $formal-parameter-list)) 255 | (@? $initializer) 256 | $function-body) 257 | 258 | 259 | (::= $type 'type 260 | (@? $modifiers) (@or $primitive-type 261 | $ctype 262 | $identifier) 263 | (@* $type-parameter) (@* $ptr-suffix)) 264 | 265 | 266 | (::= $type-parameter 'type-parameter 267 | (@~ "<") (@.@ (@or $type $numeral-literal) |,|) (@~ ">")) 268 | 269 | 270 | (::= $ptr-suffix 'suffix 271 | (@or ($$ "*") 272 | ($$ "&"))) 273 | 274 | 275 | (::= $formal-parameter-list 'parameters 276 | |(| (@? (@.@ $type-variable |,|)) (@? |,| ($$ "...")) |)| 277 | ) 278 | 279 | 280 | (::= $type-variable 'type-variable 281 | (@? $modifiers) $type (@? $identifier) (@? $array-suffix)) 282 | 283 | 284 | (::= $function-body 'body 285 | |{| (@* $statement) |}| 286 | ) 287 | 288 | 289 | 290 | (::= $variable-definition 'variable-definition 291 | $variable-declaration-list |;| 292 | ) 293 | 294 | 295 | (:: $variable-declaration-list 296 | (@... (@? $modifiers) $type (@.@ $variable-declaration |,|))) 297 | 298 | 299 | (::= $variable-declaration 'variable-declaration 300 | $identifier (@? $variable-declaration-suffix) 301 | (@? $initializer)) 302 | 303 | 304 | (::= $modifiers 'modifiers 305 | (@+ (@or ($$ "const") 306 | ($$ "static") 307 | ($$ "inline")))) 308 | 309 | (:: $primitive-type 310 | (@or (@... 311 | (@or ($$ "signed") 312 | ($$ "unsigned")) 313 | (@or ($$ "int") 314 | ($$ "char") 315 | ($$ "long") 316 | ($$ "double") 317 | ($$ "float"))) 318 | (@or ($$ "signed") 319 | ($$ "unsigned")))) 320 | 321 | (::= $ctype 'ctype 322 | (@or ($$ "struct") 323 | ($$ "enum")) 324 | $identifier 325 | ) 326 | 327 | 328 | (::= $variable-declaration-suffix 'suffix 329 | (@or (@... |[| $expression |]|)) 330 | ) 331 | 332 | (::= $initializer 'initializer 333 | (@or (@... (@_ "=") $expression) 334 | (@... (@_ ":") $expression) 335 | (@... (@_ "(") $expression (@_ ")")))) 336 | 337 | 338 | 339 | 340 | (::= $if-statement 'if 341 | ($$ "if") (@= 'test |(| $expression |)|) $statement 342 | (@? (@= 'else ($$ "else") $statement)) 343 | ) 344 | 345 | 346 | 347 | (::= $do-while-statement 'do-while 348 | ($$ "do") $statement 349 | (@= 'while-do ($$ "while") (@= 'test |(| $expression |)| )) 350 | |;| 351 | ) 352 | 353 | 354 | (::= $while-statement 'while 355 | ($$ "while") (@= 'test |(| $expression |)| ) 356 | $statement 357 | ) 358 | 359 | 360 | (::= $for-statement 'for 361 | ($$ "for") (@= 'iter 362 | |(| (@? $for-initaliser) |;| 363 | (@? $expression) |;| 364 | (@? $expression) 365 | |)| 366 | ) 367 | $statement 368 | ) 369 | 370 | 371 | (::= $for-initaliser 'for-initializer 372 | (@or (@= 'variable-declaration 373 | $variable-declaration-list) 374 | 375 | $expression 376 | )) 377 | 378 | 379 | (::= $continue-statement 'continue 380 | ($$ "continue") (@= 'label (@? $identifier)) |;| 381 | ) 382 | 383 | 384 | (::= $break-statement 'break 385 | ($$ "break") (@= 'label (@? $identifier)) |;| 386 | ) 387 | 388 | 389 | (::= $return-statement 'return 390 | ($$ "return") (@= 'value (@? $expression)) |;| 391 | ) 392 | 393 | 394 | 395 | (::= $labelled-statement 'labelled-statement 396 | $identifier |:| $statement 397 | ) 398 | 399 | 400 | (::= $switch-statement 'switch 401 | ($$ "switch") |(| $expression |)| 402 | |{| (@* $case-clause) 403 | (@? (@... $default-clause 404 | (@* $case-clause))) 405 | |}| 406 | ) 407 | 408 | 409 | (::= $case-clause 'case-clause 410 | ($$ "case") $expression |:| (@* $statement) 411 | ) 412 | 413 | 414 | (::= $default-clause 'default-clause 415 | ($$ "default") |:| (@* $statement) 416 | ) 417 | 418 | 419 | ;; throw is an expression in C++ 420 | ;; (::= $throw-statement 'throw 421 | ;; ($$ "throw") $expression |;| 422 | ;; ) 423 | 424 | 425 | (::= $try-statement 'try 426 | ($$ "try") $statement-block 427 | (@or $finally-clause 428 | (@... $catch-clause (@? $finally-clause)))) 429 | 430 | 431 | (::= $catch-clause 'catch 432 | ($$ "catch") |(| $identifier |)| $statement-block) 433 | 434 | 435 | (::= $finally-clause 'finally 436 | ($$ "finally") $statement-block) 437 | 438 | 439 | (::= $expression-statement 'expression-statement 440 | $expression |;|) 441 | 442 | 443 | 444 | 445 | ;------------------------------------------------------------- 446 | ; expressions 447 | ;------------------------------------------------------------- 448 | 449 | ;; utility for constructing operators 450 | (define op 451 | (lambda (s) 452 | (@= 'op ($$ s)))) 453 | 454 | 455 | (:: $expression 456 | $comma-expression 457 | ) 458 | 459 | 460 | 461 | ;; 18. comma 462 | ;;-------------------------------------------- 463 | (::= $comma-expression 'comma 464 | (@.@ $assignment-expression |,|)) 465 | 466 | 467 | 468 | ;; 17. throw 469 | ;;-------------------------------------------- 470 | (::= $throw-expression 'throw 471 | (@or (@... (@~ "throw")) $expression 472 | $assignment-expression) 473 | ) 474 | 475 | 476 | ;; 16. assignment 477 | ;;-------------------------------------------- 478 | (:: $assignment-expression 479 | (@or (@= 'assignment 480 | $conditional-expression 481 | $assignment-operator 482 | $assignment-expression) 483 | 484 | $conditional-expression 485 | )) 486 | 487 | 488 | (:: $assignment-operator 489 | (@or (op "=") 490 | (op "*=") 491 | (op "/=") 492 | (op "%=") 493 | (op "+=") 494 | (op "-=") 495 | (op "<<=") 496 | (op ">>=") 497 | (op "&=") 498 | (op "^=") 499 | (op "|="))) 500 | 501 | 502 | 503 | ;; 15. ?: Ternary conditional 504 | ;;-------------------------------------------- 505 | (:: $conditional-expression 506 | (@or (@= 'conditional-expression 507 | (@= 'test $logical-or-expression) 508 | (@~ "?") (@= 'then $conditional-expression) 509 | (@~ ":") (@= 'else $conditional-expression)) 510 | 511 | $logical-or-expression 512 | )) 513 | 514 | 515 | ; ($eval $conditional-expression (scan "x > 0? x-1 : x")) 516 | 517 | 518 | 519 | 520 | ;; 14. || Logical OR 521 | ;;-------------------------------------------- 522 | (:: $logical-or-expression 523 | (@or (@infix-left 'binop 524 | $logical-and-expression 525 | (op "||")) 526 | 527 | $logical-and-expression 528 | )) 529 | 530 | 531 | 532 | ;; 13. && Logical AND 533 | ;;-------------------------------------------- 534 | (:: $logical-and-expression 535 | (@or (@infix-left 'binop 536 | $bitwise-or-expression 537 | (op "&&")) 538 | 539 | $bitwise-or-expression 540 | )) 541 | 542 | 543 | 544 | ;; 12. | Bitwise OR (inclusive or) 545 | ;;-------------------------------------------- 546 | (:: $bitwise-or-expression 547 | (@or (@infix-left 'binop 548 | $bitwise-xor-expression 549 | (op "|")) 550 | 551 | $bitwise-xor-expression 552 | )) 553 | 554 | 555 | 556 | ;; 11. ^ Bitwise XOR (exclusive or) 557 | ;;-------------------------------------------- 558 | (:: $bitwise-xor-expression 559 | (@or (@infix-left 'binop 560 | $bitwise-and-expression 561 | (op "^")) 562 | 563 | $bitwise-and-expression 564 | )) 565 | 566 | 567 | 568 | ;; 10. & Bitwise AND 569 | ;;-------------------------------------------- 570 | (:: $bitwise-and-expression 571 | (@or (@infix-left 'binop 572 | $equality-expression 573 | (op "&")) 574 | 575 | $equality-expression 576 | )) 577 | 578 | 579 | 580 | ;; 9. equality 581 | ;;-------------------------------------------- 582 | (:: $equality-expression 583 | (@or (@infix-left 'binop 584 | $relational-expression 585 | $equality-operator) 586 | 587 | $relational-expression 588 | )) 589 | 590 | (:: $equality-operator 591 | (@or (op "==") 592 | (op "!=") 593 | (op "!=") 594 | )) 595 | 596 | 597 | 598 | ;; 8. relational 599 | ;;-------------------------------------------- 600 | (:: $relational-expression 601 | (@or (@infix-left 'binop 602 | $bitwise-shift-expression 603 | $relational-operator) 604 | 605 | $bitwise-shift-expression 606 | )) 607 | 608 | (:: $relational-operator 609 | (@or (op "<") 610 | (op "<=") 611 | (op ">") 612 | (op ">=") 613 | (op "instanceof") 614 | (op "in") 615 | )) 616 | 617 | 618 | 619 | ;; 7. bitwise shift 620 | ;;-------------------------------------------- 621 | (:: $bitwise-shift-expression 622 | (@or (@infix-left 'binop 623 | $additive-expression 624 | $bitwise-shift-operator) 625 | 626 | $additive-expression 627 | )) 628 | 629 | 630 | (:: $bitwise-shift-operator 631 | (@or (op "<<") 632 | (op ">>") 633 | )) 634 | 635 | 636 | 637 | ;; 6. additive 638 | ;;-------------------------------------------- 639 | (:: $additive-expression 640 | (@or (@infix-left 'binop 641 | $multiplicative-expression 642 | $additive-operator) 643 | 644 | $multiplicative-expression 645 | )) 646 | 647 | 648 | (:: $additive-operator 649 | (@or (op "+") 650 | (op "-"))) 651 | 652 | 653 | ;; ($eval $additive-expression (scan "x + y + z")) 654 | 655 | 656 | 657 | 658 | ;; 5. multiplicative 659 | ;;-------------------------------------------- 660 | (:: $multiplicative-expression 661 | (@or (@infix-left 'binop 662 | $unary-expression 663 | $multiplicative-operator) 664 | 665 | $unary-expression)) 666 | 667 | (:: $multiplicative-operator 668 | (@or (op "*") 669 | (op "/") 670 | (op "%"))) 671 | 672 | 673 | 674 | 675 | ;; unary = 676 | ;; 3. prefix 677 | ;; 2. postfix 678 | ;;-------------------------------------------- 679 | (:: $unary-expression 680 | $prefix-expression) 681 | 682 | 683 | 684 | ;; 3. prefix 685 | ;;-------------------------------------------- 686 | (:: $prefix-expression 687 | (@or (@prefix 'prefix 688 | $postfix-expression 689 | $prefix-operator) 690 | $postfix-expression)) 691 | 692 | 693 | (:: $prefix-operator 694 | (@or (@= 'new (op "new") (@? $array-suffix)) 695 | (@= 'delete (op "delete") (@? $array-suffix)) 696 | (@= 'cast |(| $type |)| ) 697 | (op "void") 698 | (op "sizeof") 699 | (op "++") 700 | (op "--") 701 | (op "+") 702 | (op "-") 703 | (op "~") 704 | (op "!") 705 | (op "*") ; indirection 706 | (op "&") ; address of 707 | (op "::") 708 | )) 709 | 710 | 711 | (::= $array-suffix 'array-suffix 712 | |[| |]|) 713 | 714 | 715 | 716 | 717 | ;; 2. postfix 718 | ;;-------------------------------------------- 719 | (:: $postfix-expression 720 | (@or (@postfix 'postfix 721 | $primary-expression 722 | $postfix-operator) 723 | $primary-expression)) 724 | 725 | 726 | (:: $postfix-operator 727 | (@or (op "++") 728 | (op "--") 729 | $index-suffix 730 | $property-reference-suffix 731 | $type-parameter 732 | $arguments)) 733 | 734 | 735 | (::= $arguments 'argument 736 | |(| (@? (@.@ $expression |,|)) |)| 737 | ) 738 | 739 | 740 | (::= $index-suffix 'index 741 | |[| $expression |]| 742 | ) 743 | 744 | 745 | (::= $property-reference-suffix 'field-access 746 | (@or (@~ ".") (@~ "->")) $identifier) 747 | 748 | 749 | 750 | ;; scope resolution :: 751 | ;--------------------------------------------- 752 | (:: $scope-resolution 753 | (@or (@infix-left 'scope 754 | $id 755 | ($$ "::")) 756 | 757 | $primary-expression 758 | )) 759 | 760 | 761 | 762 | ;; 1. primary 763 | ;;-------------------------------------------- 764 | (:: $primary-expression 765 | (@or (@= 'this ($$ "this")) 766 | $type-cast 767 | $ctype ; could be used in a macro argument 768 | $identifier 769 | $literal 770 | $array-literal 771 | $object-literal 772 | (@= #f |(| $expression |)|) 773 | )) 774 | 775 | 776 | (::= $type-cast 'type-cast 777 | (@or ($$ "typeid") 778 | ($$ "const_cast") 779 | ($$ "dynamic_cast") 780 | ($$ "reinterpret_cast") 781 | ($$ "static_cast"))) 782 | 783 | 784 | 785 | ;; literal 786 | ;;-------------------------------------------- 787 | (:: $literal 788 | (@or ($$ "null") 789 | ($$ "true") 790 | ($$ "false") 791 | $string-concat 792 | $float-literal 793 | $numeral-literal 794 | $string-literal 795 | $char-literal)) 796 | 797 | 798 | (::= $array-literal 'array-literal 799 | |{| (@? (@.@ $expression |,|)) |}| 800 | ) 801 | 802 | 803 | (::= $object-literal 'object-literal 804 | |{| $property-name-value (@* (@... |,| $property-name-value)) |}| 805 | ) 806 | 807 | 808 | (::= $property-name-value 'property-name-value 809 | $property-name |:| $assignment-expression) 810 | 811 | 812 | (:: $property-name 813 | (@or $identifier 814 | $string-literal 815 | $numeral-literal)) 816 | 817 | 818 | (::= $float-literal 'float-literal 819 | $numeral-literal ($$ ".") $numeral-literal) 820 | 821 | 822 | (::= $string-concat 'string-concat 823 | $string-literal (@* (@or $string-literal $expression))) 824 | 825 | 826 | 827 | ;------------------------------------------------------------- 828 | ; inline assembly 829 | ;------------------------------------------------------------- 830 | (::= $inline-assembly 'inline-assembly 831 | (@or (@~ "asm") 832 | (@~ "__asm__")) 833 | (@? (@or ($$ "volatile") 834 | ($$ "__volatile__"))) 835 | |(| $string-concat |)| 836 | |;| 837 | ) 838 | 839 | 840 | (::= $extended-assembly 'extended-assembly 841 | (@or (@~ "asm") 842 | (@~ "__asm__")) 843 | (@? (@or ($$ "volatile") 844 | ($$ "__volatile__"))) 845 | |(| $string-concat 846 | |:| (@= 'output-operands (@* $string-literal |(| $identifier |)| )) 847 | |:| (@= 'input-operands (@* $string-literal |(| $identifier |)| )) 848 | |:| (@= 'clobbered-registers (@? (@.@ $string-literal |,|))) 849 | |)| 850 | |;| 851 | ) 852 | 853 | 854 | 855 | 856 | 857 | (define parse-cpp 858 | (lambda (s) 859 | (set-parameters) 860 | (first-val 861 | ($eval $program 862 | (filter (lambda (x) (not (comment? x))) 863 | (scan s)))))) 864 | 865 | 866 | 867 | 868 | ;------------------------------------------------------------- 869 | ; tests 870 | ;------------------------------------------------------------- 871 | 872 | ;; (test-file "tests/simulator-arm.cc" 873 | ;; "tests/simulator-mips.cc" 874 | ;; "tests/d8-3404.cc" 875 | ;; "tests/d8-8424.cc" 876 | ;; "tests/assembler-arm-2.cc" 877 | ;; "tests/assembler-arm-7.cc" 878 | ;; "tests/assembler-arm-8309.cc" 879 | ;; ) 880 | -------------------------------------------------------------------------------- /parse-js.rkt: -------------------------------------------------------------------------------- 1 | ;; ydiff - a language-aware tool for comparing programs 2 | ;; Copyright (C) 2011-2013 Yin Wang (yinwang0@gmail.com) 3 | 4 | ;; This program is free software: you can redistribute it and/or modify 5 | ;; it under the terms of the GNU General Public License as published by 6 | ;; the Free Software Foundation, either version 3 of the License, or 7 | ;; (at your option) any later version. 8 | 9 | ;; This program is distributed in the hope that it will be useful, 10 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | ;; GNU General Public License for more details. 13 | 14 | ;; You should have received a copy of the GNU General Public License 15 | ;; along with this program. If not, see . 16 | 17 | 18 | 19 | #lang racket 20 | 21 | (require "structs.rkt") 22 | (require "utils.rkt") 23 | (require "parsec.rkt") 24 | 25 | 26 | (provide parse-js) 27 | 28 | 29 | 30 | ;------------------------------------------------------------- 31 | ; scanner settings 32 | ;------------------------------------------------------------- 33 | 34 | (define set-parameters 35 | (lambda () 36 | (set-delims (list "(" ")" "[" "]" "{" "}" "," "`" ";")) 37 | 38 | (set-operators 39 | (list 40 | ">>>=" 41 | 42 | "<<=" ">>=" ">>>" "===" "!==" 43 | 44 | "==" "!=" ">=" "<=" "&&" "||" ">>" "<<" "++" "--" 45 | "+=" "-=" "*=" "/=" "%=" "&=" "^=" "|=" 46 | 47 | "=" "+" "-" "*" "/" "%" "~" "!" ":" "?" "." ">" "<" 48 | )) 49 | 50 | 51 | (set-line-comment (list "//")) 52 | (set-comment-start "/*") 53 | (set-comment-end "*/") 54 | (set-quotation-marks '(#\" #\')) 55 | (set-significant-whitespaces 56 | (list #\newline #\linefeed #\u2028 #\u2029)))) 57 | 58 | 59 | (set-alpha 60 | (predor char-alphabetic? 61 | (lambda (x) (char=? x #\$)))) 62 | 63 | 64 | 65 | 66 | ;------------------------------------------------------------- 67 | ; primitive parsers 68 | ;------------------------------------------------------------- 69 | 70 | (:: $identifier 71 | ($pred 72 | (lambda (t) 73 | (and (token? t) 74 | (id? (Node-elts t)))))) 75 | 76 | 77 | (:: $numeral-literal 78 | ($pred 79 | (lambda (t) 80 | (and (token? t) 81 | (numeral? (Node-elts t)))))) 82 | 83 | 84 | (:: $string-literal ($pred str?)) 85 | (:: $newline ($pred newline?)) 86 | (:: $comment ($pred comment?)) 87 | 88 | 89 | ;; delimeters 90 | (:: |,| (@_ ",")) 91 | (:: |;| (@~ ";")) 92 | (:: |:| (@_ ":")) 93 | (:: |(| (@~ "(")) 94 | (:: |)| (@~ ")")) 95 | (:: |[| (@~ "[")) 96 | (:: |]| (@~ "]")) 97 | (:: |{| (@~ "{")) 98 | (:: |}| (@~ "}")) 99 | 100 | 101 | ;; $glob^ and $*^ needed to define |\n|, because the 102 | ;; definition of |\n| must not contain any call to @seq 103 | ;; otherwise the parser will go into infinite loop 104 | (:: |\n| ($glob^ (@*^ $newline))) 105 | (:: |;\n| (@or |;| |\n|)) 106 | 107 | 108 | ;; Redefine sequence to get over newlines 109 | (define old-seq @seq) 110 | (set-seq 111 | (lambda ps 112 | (let ([psj (join ps |\n|)]) 113 | (apply old-seq `(,|\n| ,@psj ,|\n|))))) 114 | 115 | 116 | ;; ($eval (@seq ($$ "foo") ($$ "bar")) 117 | ;; (scan " 118 | ;; foo 119 | ;; bar ")) 120 | 121 | 122 | 123 | 124 | ;------------------------------------------------------------- 125 | ; compound parsers 126 | ;------------------------------------------------------------- 127 | 128 | (::= $program 'program 129 | (@* $statement)) 130 | 131 | 132 | 133 | (:: $statement 134 | (@or $statement-block 135 | $empty-statement 136 | $function-definition 137 | $variable-statement 138 | $with-statement 139 | 140 | $if-statement 141 | $switch-statement 142 | $do-while-statement 143 | $while-statement 144 | $for-statement 145 | $for-in-statement 146 | $continue-statement 147 | $break-statement 148 | $try-statement 149 | $throw-statement 150 | $return-statement 151 | 152 | $labelled-statement 153 | $expression-statement 154 | )) 155 | 156 | 157 | 158 | (::= $statement-block 'block 159 | |{| (@* $statement) |}| 160 | ) 161 | 162 | 163 | (:: $empty-statement |;|) 164 | 165 | 166 | (::= $function-definition 'function 167 | ($$ "function") (@= 'name $identifier) $formal-parameter-list 168 | $function-body) 169 | 170 | 171 | ;; function-expression can be unnamed 172 | (::= $function-expression 'function 173 | ($$ "function") (@= 'name (@? $identifier)) $formal-parameter-list 174 | $function-body) 175 | 176 | 177 | (::= $formal-parameter-list 'parameters 178 | (@or (@... |(| (@? (@.@ $identifier |,|)) |)| ) 179 | $identifier)) 180 | 181 | 182 | (::= $function-body 'body 183 | $statement-block 184 | ) 185 | 186 | 187 | 188 | ;;---------------- variable statement ----------------- 189 | (::= $variable-statement 'variable-declaration 190 | ($$ "var") (@.@ $variable-declaration |,|) |;\n| 191 | ) 192 | 193 | 194 | (::= $variable-declaration 'variable-declaration 195 | $identifier (@? $initializer)) 196 | 197 | 198 | (::= $initializer 'initializer 199 | (@... ($$ "=") $assignment-expression)) 200 | 201 | 202 | ;;-------------------------------------------- 203 | (::= $with-statement 'with 204 | ($$ "with") (@= 'obj |(| $expression |)|) 205 | $statement 206 | ) 207 | 208 | 209 | ;;-------------------------------------------- 210 | (::= $if-statement 'if 211 | ($$ "if") (@= 'test |(| $expression |)|) $statement 212 | (@? (@= 'else ($$ "else") $statement 213 | ))) 214 | 215 | 216 | ;;-------------------------------------------- 217 | (::= $do-while-statement 'do-while 218 | ($$ "do") $statement 219 | (@= 'while-do ($$ "while") (@= 'test |(| $expression |)| )) 220 | |;\n| 221 | ) 222 | 223 | 224 | ;;-------------------------------------------- 225 | (::= $while-statement 'while 226 | ($$ "while") (@= 'test |(| $expression |)| ) 227 | $statement 228 | ) 229 | 230 | 231 | ;;-------------------------------------------- 232 | (::= $for-statement 'for 233 | ($$ "for") (@= 'iter 234 | |(| (@? $for-initaliser) |;| 235 | (@? $expression) |;| 236 | (@? $expression) 237 | |)| 238 | ) 239 | $statement 240 | ) 241 | 242 | 243 | (::= $for-initaliser 'for-initializer 244 | (@or (@= 'variable-declaration 245 | ($$ "var") (@.@ $variable-declaration |,|)) 246 | 247 | $expression 248 | )) 249 | 250 | 251 | ;;-------------------------------------------- 252 | (::= $for-in-statement 'for-in 253 | ($$ "for") (@= 'iter 254 | |(| (@? $for-in-initalizer) ($$ "in") $expression |)|) 255 | $statement 256 | ) 257 | 258 | 259 | (::= $for-in-initalizer 'for-in-initializer 260 | (@or (@= 'variable-declaration 261 | ($$ "var") (@.@ $variable-declaration |,|)) 262 | 263 | $expression 264 | )) 265 | 266 | 267 | ;;-------------------------------------------- 268 | (::= $continue-statement 'continue 269 | ($$ "continue") (@= 'label (@? $identifier)) |;\n| 270 | ) 271 | 272 | 273 | ;;-------------------------------------------- 274 | (::= $break-statement 'break 275 | ($$ "break") (@= 'label (@? $identifier)) |;\n| 276 | ) 277 | 278 | 279 | ;;-------------------------------------------- 280 | (::= $return-statement 'return 281 | ($$ "return") (@= 'value (@? $expression)) |;\n| 282 | ) 283 | 284 | 285 | ;;-------------------------------------------- 286 | (::= $labelled-statement 'labelled-statement 287 | $identifier |:| $statement 288 | ) 289 | 290 | 291 | ;;-------------------------------------------- 292 | (::= $switch-statement 'switch-statement 293 | ($$ "switch") |(| $expression |)| 294 | |{| (@* $case-clause) 295 | (@? $default-clause 296 | (@* $case-clause)) 297 | |}| 298 | ) 299 | 300 | 301 | (::= $case-clause 'case-clause 302 | ($$ "case") $expression |:| (@* $statement) 303 | ) 304 | 305 | 306 | (::= $default-clause 'default 307 | ($$ "default") |:| (@* $statement) 308 | ) 309 | 310 | 311 | ;;-------------------------------------------- 312 | (::= $throw-statement 'throw 313 | ($$ "throw") $expression |;\n| 314 | ) 315 | 316 | 317 | ;;-------------------------------------------- 318 | (::= $try-statement 'try 319 | ($$ "try") $statement-block 320 | (@or $finally-clause 321 | (@... $catch-clause (@? $finally-clause))) 322 | ) 323 | 324 | 325 | (::= $catch-clause 'catch 326 | ($$ "catch") |(| $identifier |)| $statement-block 327 | ) 328 | 329 | 330 | (::= $finally-clause 'finally 331 | ($$ "finally") $statement-block 332 | ) 333 | 334 | 335 | ;;-------------------------------------------- 336 | (::= $expression-statement 'expression-statement 337 | $expression |;\n| 338 | ) 339 | 340 | 341 | 342 | 343 | ;------------------------------------------------------------- 344 | ; expressions 345 | ;------------------------------------------------------------- 346 | 347 | ;; utility for constructing operators 348 | (define op 349 | (lambda (s) 350 | (@= 'op ($$ s)))) 351 | 352 | 353 | 354 | (:: $expression 355 | $comma-expression) 356 | 357 | 358 | 359 | ;; 18. comma 360 | ;;-------------------------------------------- 361 | (::= $comma-expression 'comma 362 | (@.@ $assignment-expression |,|)) 363 | 364 | 365 | 366 | ;; 16. assignment 367 | ;;-------------------------------------------- 368 | (:: $assignment-expression 369 | (@or (@= 'assignment 370 | $conditional-expression 371 | $assignment-operator 372 | $assignment-expression) 373 | 374 | $conditional-expression 375 | )) 376 | 377 | 378 | (:: $assignment-operator 379 | (@or (op "=") 380 | (op "*=") 381 | (op "/=") 382 | (op "%=") 383 | (op "+=") 384 | (op "-=") 385 | (op "<<=") 386 | (op ">>=") 387 | (op ">>>=") 388 | (op "&=") 389 | (op "^=") 390 | (op "|=") 391 | )) 392 | 393 | 394 | 395 | ;; 15. ?: Ternary conditional 396 | ;;-------------------------------------------- 397 | (:: $conditional-expression 398 | (@or (@= 'conditional-expression 399 | (@= 'test $logical-or-expression) 400 | (@~ "?") (@= 'then $conditional-expression) 401 | (@~ ":") (@= 'else $conditional-expression)) 402 | 403 | $logical-or-expression 404 | )) 405 | 406 | 407 | ; ($eval $conditional-expression (scan "x > 0? x-1 : x")) 408 | 409 | 410 | 411 | 412 | ;; 14. || Logical OR 413 | ;;-------------------------------------------- 414 | (:: $logical-or-expression 415 | (@or (@infix-left 'binop 416 | $logical-and-expression 417 | (op "||")) 418 | 419 | $logical-and-expression 420 | )) 421 | 422 | ; ($eval $logical-or-expression (scan "x || y")) 423 | 424 | 425 | 426 | ;; 13. && Logical AND 427 | ;;-------------------------------------------- 428 | (:: $logical-and-expression 429 | (@or (@infix-left 'binop 430 | $bitwise-or-expression 431 | (op "&&")) 432 | 433 | $bitwise-or-expression 434 | )) 435 | 436 | 437 | ;; 12. | Bitwise OR (inclusive or) 438 | ;;-------------------------------------------- 439 | (:: $bitwise-or-expression 440 | (@or (@infix-left 'binop 441 | $bitwise-xor-expression 442 | (op "|")) 443 | 444 | $bitwise-xor-expression 445 | )) 446 | 447 | 448 | 449 | ;; 11. ^ Bitwise XOR (exclusive or) 450 | ;;-------------------------------------------- 451 | (:: $bitwise-xor-expression 452 | (@or (@infix-left 'binop 453 | $bitwise-and-expression 454 | (op "^")) 455 | 456 | $bitwise-and-expression 457 | )) 458 | 459 | 460 | 461 | ;; 10. & Bitwise AND 462 | ;;-------------------------------------------- 463 | (:: $bitwise-and-expression 464 | (@or (@infix-left 'binop 465 | $equality-expression 466 | (op "&")) 467 | 468 | $equality-expression 469 | )) 470 | 471 | 472 | 473 | ;; 9. equality 474 | ;;-------------------------------------------- 475 | (:: $equality-expression 476 | (@or (@infix-left 'binop 477 | $relational-expression 478 | $equality-operator) 479 | 480 | $relational-expression 481 | )) 482 | 483 | (:: $equality-operator 484 | (@or (op "==") 485 | (op "!=") 486 | (op "===") 487 | (op "!==") 488 | )) 489 | 490 | 491 | 492 | ;; 8. relational 493 | ;;-------------------------------------------- 494 | (:: $relational-expression 495 | (@or (@infix-left 'binop 496 | $bitwise-shift-expression 497 | $relational-operator) 498 | 499 | $bitwise-shift-expression 500 | )) 501 | 502 | (:: $relational-operator 503 | (@or (op "<") 504 | (op "<=") 505 | (op ">") 506 | (op ">=") 507 | (op "instanceof") 508 | (op "in") 509 | )) 510 | 511 | 512 | 513 | ;; 7. bitwise shift 514 | ;;-------------------------------------------- 515 | (:: $bitwise-shift-expression 516 | (@or (@infix-left 'binop 517 | $additive-expression 518 | $bitwise-shift-operator) 519 | 520 | $additive-expression 521 | )) 522 | 523 | (:: $bitwise-shift-operator 524 | (@or (op "<<") 525 | (op ">>") 526 | (op ">>>") 527 | )) 528 | 529 | 530 | 531 | ;; 6. additive 532 | ;;-------------------------------------------- 533 | (:: $additive-expression 534 | (@or (@infix-left 'binop 535 | $multiplicative-expression 536 | $additive-operator) 537 | 538 | $multiplicative-expression 539 | )) 540 | 541 | 542 | (:: $additive-operator 543 | (@or (op "+") 544 | (op "-"))) 545 | 546 | 547 | ;; ($eval $additive-expression (scan "x + y + z")) 548 | 549 | 550 | 551 | 552 | ;; 5. multiplicative 553 | ;;-------------------------------------------- 554 | (:: $multiplicative-expression 555 | (@or (@infix-left 'binop 556 | $unary-expression 557 | $multiplicative-operator) 558 | 559 | $unary-expression)) 560 | 561 | (:: $multiplicative-operator 562 | (@or (op "*") 563 | (op "/") 564 | (op "%"))) 565 | 566 | 567 | 568 | 569 | ;; 3. prefix 570 | ;; 2. postfix 571 | ;;-------------------------------------------- 572 | (:: $unary-expression 573 | $prefix-expression) 574 | 575 | 576 | (:: $prefix-expression 577 | (@or (@prefix 'prefix 578 | $postfix-expression 579 | $prefix-operator) 580 | $postfix-expression)) 581 | 582 | 583 | (:: $postfix-expression 584 | (@or (@postfix 'postfix 585 | $primary-expression 586 | $postfix-operator) 587 | $primary-expression)) 588 | 589 | 590 | (:: $prefix-operator 591 | (@or (op "new") 592 | (op "delete") 593 | (op "void") 594 | (op "typeof") 595 | (op "++") 596 | (op "--") 597 | (op "+") 598 | (op "-") 599 | (op "~") 600 | (op "!") 601 | )) 602 | 603 | 604 | (:: $postfix-operator 605 | (@or $index-suffix 606 | $property-reference-suffix 607 | $arguments 608 | (op "++") 609 | (op "--"))) 610 | 611 | 612 | (::= $index-suffix 'index 613 | |[| $expression |]| 614 | ) 615 | 616 | 617 | (::= $property-reference-suffix 'property 618 | (@_ ".") $identifier) 619 | 620 | 621 | (::= $arguments 'arguments 622 | |(| (@? (@.@ $assignment-expression |,|)) |)| 623 | ) 624 | 625 | 626 | (::= $new-expression 'new 627 | ($$ "new") $postfix-expression) 628 | 629 | 630 | 631 | ;; 1. primary 632 | ;;-------------------------------------------- 633 | (:: $primary-expression 634 | (@or $function-expression 635 | $identifier 636 | $literal 637 | (@= 'expression |(| $expression |)| ) 638 | )) 639 | 640 | 641 | 642 | 643 | ;;----------- 644 | (::= $array-literal 'array-literal 645 | |[| (@? (@.@ $assignment-expression |,|)) |]| 646 | ) 647 | 648 | 649 | 650 | ;;----------- 651 | (::= $object-literal 'object-literal 652 | |{| $property-name-value (@* |,| $property-name-value) |}| 653 | ) 654 | 655 | 656 | (::= $property-name-value 'property-name-value 657 | $property-name |:| $assignment-expression) 658 | 659 | 660 | (:: $property-name 661 | (@or $identifier 662 | $string-literal 663 | $numeral-literal)) 664 | 665 | 666 | 667 | ;;----------- 668 | (:: $literal 669 | (@or ($$ "null") 670 | ($$ "true") 671 | ($$ "false") 672 | (@= 'this ($$ "this")) 673 | $string-literal 674 | $numeral-literal 675 | $array-literal 676 | $object-literal 677 | )) 678 | 679 | 680 | 681 | 682 | 683 | ;------------------------------------------------------------- 684 | ; parse-js 685 | ;------------------------------------------------------------- 686 | 687 | (define parse-js 688 | (lambda (s) 689 | (set-parameters) 690 | (first-val 691 | ($eval $program 692 | (filter (negate comment?) (scan s)))))) 693 | 694 | 695 | 696 | 697 | ;------------------------------------------------------------- 698 | ; tests 699 | ;------------------------------------------------------------- 700 | 701 | ;; (parse-js (read-file "demos/nav1.js")) 702 | -------------------------------------------------------------------------------- /parse-lisp.rkt: -------------------------------------------------------------------------------- 1 | ;; ydiff - a language-aware tool for comparing programs 2 | ;; Copyright (C) 2011-2013 Yin Wang (yinwang0@gmail.com) 3 | 4 | ;; This program is free software: you can redistribute it and/or modify 5 | ;; it under the terms of the GNU General Public License as published by 6 | ;; the Free Software Foundation, either version 3 of the License, or 7 | ;; (at your option) any later version. 8 | 9 | ;; This program is distributed in the hope that it will be useful, 10 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | ;; GNU General Public License for more details. 13 | 14 | ;; You should have received a copy of the GNU General Public License 15 | ;; along with this program. If not, see . 16 | 17 | 18 | #lang racket 19 | 20 | (require "structs.rkt") 21 | (require "utils.rkt") 22 | (require "parsec.rkt") 23 | 24 | (provide parse-lisp) 25 | 26 | 27 | ;------------------------------------------------------------- 28 | ; scanner setttings 29 | ;------------------------------------------------------------- 30 | 31 | ; single quote is considered a delimeter in s-expression 32 | 33 | (define set-parameters 34 | (lambda () 35 | (set-delims (list "(" ")" "[" "]" "{" "}" "'" "`" "," )) 36 | (set-line-comment (list ";")) 37 | (set-comment-start "") 38 | (set-comment-end "") 39 | (set-operators '()) 40 | (set-quotation-marks '(#\")) 41 | (set-lisp-char (list "#\\" "?\\")) 42 | (set-significant-whitespaces '()))) 43 | 44 | 45 | 46 | ;------------------------------------------------------------- 47 | ; parser 48 | ;------------------------------------------------------------- 49 | 50 | (:: $open 51 | (@or (@~ "(") (@~ "["))) 52 | 53 | (:: $close 54 | (@or (@~ ")") (@~ "]"))) 55 | 56 | (:: $non-parens 57 | (@and (@! $open) (@! $close))) 58 | 59 | (::= $parens 'sexp 60 | (@seq $open (@* $sexp) $close)) 61 | 62 | (:: $sexp 63 | (@+ (@or $parens $non-parens))) 64 | 65 | (:: $program $sexp) 66 | 67 | 68 | (define parse-lisp 69 | (lambda (s) 70 | (set-parameters) 71 | (first-val ($eval $sexp (scan s))))) 72 | 73 | 74 | ;;; command line use only 75 | 76 | ;; (let ([args (current-command-line-arguments)]) 77 | ;; (cond 78 | ;; [(null? args) '()] 79 | ;; [else 80 | ;; (set-parameters) 81 | ;; (print (parse-lisp (read-file (vector-ref args 0))))])) 82 | -------------------------------------------------------------------------------- /parse-yin.rkt: -------------------------------------------------------------------------------- 1 | ;; ydiff - a language-aware tool for comparing programs 2 | ;; Copyright (C) 2011-2013 Yin Wang (yinwang0@gmail.com) 3 | 4 | ;; This program is free software: you can redistribute it and/or modify 5 | ;; it under the terms of the GNU General Public License as published by 6 | ;; the Free Software Foundation, either version 3 of the License, or 7 | ;; (at your option) any later version. 8 | 9 | ;; This program is distributed in the hope that it will be useful, 10 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | ;; GNU General Public License for more details. 13 | 14 | ;; You should have received a copy of the GNU General Public License 15 | ;; along with this program. If not, see . 16 | 17 | 18 | #lang racket 19 | 20 | (require "structs.rkt") 21 | (require "utils.rkt") 22 | (require "parsec.rkt") 23 | 24 | (provide parse-yin) 25 | 26 | 27 | ;------------------------------------------------------------- 28 | ; scanner setttings 29 | ;------------------------------------------------------------- 30 | 31 | ; single quote is considered a delimeter in s-expression 32 | 33 | (define set-parameters 34 | (lambda () 35 | (set-delims (list "(" ")" "[" "]" "{" "}" "'" "`" "," )) 36 | (set-line-comment (list "--")) 37 | (set-comment-start "") 38 | (set-comment-end "") 39 | (set-operators '()) 40 | (set-quotation-marks '(#\")) 41 | (set-lisp-char (list "#\\" "?\\")) 42 | (set-significant-whitespaces '()))) 43 | 44 | 45 | 46 | ;------------------------------------------------------------- 47 | ; parser 48 | ;------------------------------------------------------------- 49 | 50 | (:: $open 51 | (@or (@~ "(") (@~ "["))) 52 | 53 | (:: $close 54 | (@or (@~ ")") (@~ "]"))) 55 | 56 | (:: $non-parens 57 | (@and (@! $open) (@! $close))) 58 | 59 | (::= $parens 'sexp 60 | (@seq $open (@* $sexp) $close)) 61 | 62 | (:: $sexp 63 | (@+ (@or $parens $non-parens))) 64 | 65 | (:: $program $sexp) 66 | 67 | 68 | (define parse-yin 69 | (lambda (s) 70 | (set-parameters) 71 | (first-val ($eval $sexp (scan s))))) 72 | 73 | 74 | ;;; command line use only 75 | 76 | ;; (let ([args (current-command-line-arguments)]) 77 | ;; (cond 78 | ;; [(null? args) '()] 79 | ;; [else 80 | ;; (set-parameters) 81 | ;; (print (parse-yin (read-file (vector-ref args 0))))])) 82 | -------------------------------------------------------------------------------- /structs.rkt: -------------------------------------------------------------------------------- 1 | ;; ydiff - a language-aware tool for comparing programs 2 | ;; Copyright (C) 2011-2013 Yin Wang (yinwang0@gmail.com) 3 | 4 | 5 | ;; This program is free software: you can redistribute it and/or modify 6 | ;; it under the terms of the GNU General Public License as published by 7 | ;; the Free Software Foundation, either version 3 of the License, or 8 | ;; (at your option) any later version. 9 | 10 | ;; This program is distributed in the hope that it will be useful, 11 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of 12 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 13 | ;; GNU General Public License for more details. 14 | 15 | ;; You should have received a copy of the GNU General Public License 16 | ;; along with this program. If not, see . 17 | 18 | 19 | #lang racket 20 | 21 | (provide (all-defined-out)) 22 | 23 | 24 | ;------------------------------------------------------------- 25 | ; data types 26 | ;------------------------------------------------------------- 27 | 28 | 29 | ;---------------------------- Node --------------------------- 30 | (struct Node (type 31 | start 32 | end 33 | elts 34 | [size #:mutable] 35 | [ctx #:mutable]) 36 | #:transparent) 37 | 38 | 39 | (define comment? 40 | (lambda (n) 41 | (and (Node? n) (eq? 'comment (Node-type n))))) 42 | 43 | (define phantom? 44 | (lambda (n) 45 | (and (Node? n) (eq? 'phantom (Node-type n))))) 46 | 47 | (define token? 48 | (lambda (n) 49 | (and (Node? n) (eq? 'token (Node-type n))))) 50 | 51 | (define str? 52 | (lambda (n) 53 | (and (Node? n) (eq? 'str (Node-type n))))) 54 | 55 | (define character? 56 | (lambda (n) 57 | (and (Node? n) (eq? 'char (Node-type n))))) 58 | 59 | (define newline? 60 | (lambda (n) 61 | (and (Node? n) (eq? 'newline (Node-type n))))) 62 | 63 | 64 | ;----------- node size function ------------ 65 | (define node-size 66 | (lambda (node) 67 | (cond 68 | [(and (Node? node) (Node-size node)) 69 | (Node-size node)] 70 | [(pair? node) 71 | (apply + (map node-size node))] 72 | [(or (token? node) (str? node) (character? node)) 1] 73 | [(Node? node) 74 | (let ([size (node-size (Node-elts node))]) 75 | (set-Node-size! node size) 76 | size)] 77 | [else 0]))) 78 | 79 | 80 | (define node-depth 81 | (lambda (node) 82 | (cond 83 | [(null? node) 0] 84 | [(pair? node) 85 | (apply max (map node-depth node))] 86 | [(Node? node) 87 | (add1 (node-depth (Node-elts node)))] 88 | [else 0]))) 89 | 90 | 91 | ; (node-depth (parse-scheme "(lambda (x (x (y)) (y)) x)")) 92 | 93 | 94 | (define set-node-context 95 | (lambda (node ctx) 96 | (cond 97 | [(pair? node) 98 | (map (lambda (n) (set-node-context n ctx)) node)] 99 | [(Node? node) 100 | (let ([name (or (get-name node) ctx)]) 101 | (set-Node-ctx! node name) 102 | (set-node-context (Node-elts node) name))]))) 103 | 104 | 105 | ;------------------ operations on nodes --------------------- 106 | 107 | ;; "virtual function" - get definition name 108 | ;; can be overridden by individual languages 109 | (define get-name (lambda (node) #f)) 110 | 111 | (define set-get-name 112 | (lambda (fun) 113 | (set! get-name fun))) 114 | 115 | 116 | ;; "virtual function" - get node type 117 | ;; can be overridden by individual languages 118 | (define get-type Node-type) 119 | 120 | (define set-get-type 121 | (lambda (fun) 122 | (set! get-type fun))) 123 | 124 | 125 | ;; same-def? only depend on get-name, so they need not be overridden 126 | ;; by individual languages. 127 | (define same-def? 128 | (lambda (e1 e2) 129 | (cond 130 | [(not (eq? (get-type e1) (get-type e2))) 131 | #f] 132 | [else 133 | (let ([name1 (get-name e1)] 134 | [name2 (get-name e2)]) 135 | (and name1 name2 (equal? name1 name2)))]))) 136 | 137 | 138 | (define set-same-def 139 | (lambda (fun) 140 | (set! same-def? fun))) 141 | 142 | 143 | 144 | ;---------------------------- Change --------------------------- 145 | ;; Change - a change in the data structure 146 | ;; - old : the old version, #f for insertions 147 | ;; - new : the new version, #f for deletions 148 | ;; - cost : the cost of change from old to new 149 | ;; - type : insertion, deletion, or modification? 150 | (struct Change (old new cost type) #:transparent) 151 | 152 | (define ins? 153 | (lambda (c) 154 | (eq? 'ins (Change-type c)))) 155 | 156 | (define del? 157 | (lambda (c) 158 | (eq? 'del (Change-type c)))) 159 | 160 | (define mov? 161 | (lambda (c) 162 | (eq? 'mov (Change-type c)))) 163 | 164 | 165 | ;----------------- utils for creating changes ---------------- 166 | (define ins 167 | (lambda (node) 168 | (let ([size (node-size node)]) 169 | (list (Change #f node size 'ins))))) 170 | 171 | (define del 172 | (lambda (node) 173 | (let ([size (node-size node)]) 174 | (list (Change node #f size 'del))))) 175 | 176 | (define mov 177 | (lambda (node1 node2 cost) 178 | (list (Change node1 node2 cost 'mov)))) 179 | 180 | 181 | ;; create a "total change" 182 | ;; (delete node1 completely and then insert node2) 183 | (define total 184 | (lambda (node1 node2) 185 | (let ([size1 (node-size node1)] 186 | [size2 (node-size node2)]) 187 | (values (append (del node1) (ins node2)) 188 | (+ size1 size2))))) 189 | 190 | 191 | 192 | ;---------------------------- Tag --------------------------- 193 | ;; HTML tag structure used HTML generation code 194 | (struct Tag (tag idx start) #:transparent) 195 | 196 | 197 | (define get-symbol 198 | (lambda (node) 199 | (cond 200 | [(token? node) 201 | (string->symbol (Node-elts node))] 202 | [else #f]))) 203 | 204 | 205 | ;; Find the first node elements which matches a given tag. 206 | (define get-tag 207 | (lambda (node tag) 208 | (cond 209 | [(not (Node? node)) #f] 210 | [(not (pair? (Node-elts node))) #f] 211 | [(null? (Node-elts node)) #f] 212 | [else 213 | (let ([matches (filter (lambda (x) 214 | (eq? (Node-type x) tag)) 215 | (Node-elts node))]) 216 | (cond 217 | [(null? matches) #f] 218 | [else (car matches)]))]))) 219 | 220 | 221 | ;; (get-tag (car (parse1 $statement "function f(x) {}")) 222 | ;; 'name) 223 | 224 | 225 | ;; Find the first node containing a given path of tags. 226 | ;; For example: '(function parameter) could match a function's parameter 227 | 228 | (define match-tags 229 | (lambda (e tags) 230 | (cond 231 | [(not (Node? e)) #f] 232 | [(null? tags) e] 233 | [else 234 | (match-tags (get-tag e (car tags)) (cdr tags))]))) 235 | 236 | 237 | ;; (match-tags (car (parse1 $statement "function f(x) {}")) 238 | ;; '(function name)) 239 | 240 | 241 | (define uid 242 | (let ([count 1] 243 | [table (box '())]) 244 | (lambda (node) 245 | (let ([p (assq node (unbox table))]) 246 | (cond 247 | [(not p) 248 | (let ([id count]) 249 | (set! count (add1 count)) 250 | (set-box! table (cons `(,node . ,id) (unbox table))) 251 | id)] 252 | [else 253 | (cdr p)]))))) 254 | 255 | 256 | 257 | ;; similarity string from a change 258 | (define similarity 259 | (lambda (change) 260 | (let ([total (+ (node-size (Change-old change)) 261 | (node-size (Change-new change)))]) 262 | (cond 263 | [(or (= 0 total) (= 0 (Change-cost change))) 264 | "100%"] 265 | [else 266 | (string-append 267 | (real->decimal-string 268 | (* 100 (- 1.0 (/ (Change-cost change) total))) 1) 269 | "%")])))) 270 | -------------------------------------------------------------------------------- /utils.rkt: -------------------------------------------------------------------------------- 1 | ;; ydiff - a language-aware tool for comparing programs 2 | ;; Copyright (C) 2011-2013 Yin Wang (yinwang0@gmail.com) 3 | 4 | 5 | ;; This program is free software: you can redistribute it and/or modify 6 | ;; it under the terms of the GNU General Public License as published by 7 | ;; the Free Software Foundation, either version 3 of the License, or 8 | ;; (at your option) any later version. 9 | 10 | ;; This program is distributed in the hope that it will be useful, 11 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of 12 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 13 | ;; GNU General Public License for more details. 14 | 15 | ;; You should have received a copy of the GNU General Public License 16 | ;; along with this program. If not, see . 17 | 18 | 19 | #lang racket 20 | 21 | (provide (all-defined-out)) 22 | 23 | 24 | 25 | ;------------------------------------------------------------- 26 | ; utilities 27 | ;------------------------------------------------------------- 28 | 29 | (define-syntax letv 30 | (syntax-rules () 31 | [(_ () body ...) 32 | (begin body ...)] 33 | [(_ ([(e1 e2* ...) e3] bd* ...) body ...) 34 | (let-values ([(e1 e2* ...) e3]) 35 | (letv (bd* ...) body ...))] 36 | [(_ ([e1 e2] bd* ...) body ...) 37 | (let ([e1 e2]) 38 | (letv (bd* ...) body ...))])) 39 | 40 | 41 | (define-syntax first-val 42 | (syntax-rules () 43 | [(_ e) 44 | (letv ([(x y) e]) x)])) 45 | 46 | 47 | (define-syntax second-val 48 | (syntax-rules () 49 | [(_ e) 50 | (letv ([(x y) e]) y)])) 51 | 52 | 53 | (define *debug* #f) 54 | (define-syntax peek 55 | (syntax-rules () 56 | [(_ name args ...) 57 | (if *debug* 58 | (begin 59 | (printf "~s: ~s = ~s~n" name 'args args) 60 | ...) 61 | (void))])) 62 | 63 | 64 | ;; utility for error reporting 65 | (define fatal 66 | (lambda (who . args) 67 | (printf "~s: " who) 68 | (for-each display args) 69 | (display "\n") 70 | (error who ""))) 71 | 72 | 73 | ; foldl of Racket has a bug! 74 | ; (foldl (lambda (x y) x) 0 '(1 2 3 4)) 75 | ; => 4 76 | ; Don't use it! 77 | (define foldl2 78 | (lambda (f x ls) 79 | (cond 80 | [(null? ls) x] 81 | [else 82 | (foldl2 f (f x (car ls)) (cdr ls))]))) 83 | 84 | 85 | ; (foldl2 + 0 '(1 2 3 4 )) 86 | 87 | 88 | 89 | (define orf 90 | (lambda (a b) 91 | (or a b))) 92 | 93 | 94 | 95 | 96 | (define char->string string) 97 | 98 | 99 | (define read-file 100 | (lambda (filename) 101 | (let ([port (open-input-file filename #:mode 'text)]) 102 | (let loop ([line (read-line port)] 103 | [all ""]) 104 | (cond 105 | [(eof-object? line) all] 106 | [else 107 | (loop (read-line port) 108 | (string-append all line "\n"))]))))) 109 | 110 | 111 | 112 | (define new-progress 113 | (lambda (size) 114 | (let* ([counter 0] 115 | [dots 0] 116 | [print-mark 117 | (lambda (sym) 118 | (display sym) 119 | (set! dots (+ dots 1)) 120 | (cond 121 | [(= 0 (modulo dots 60)) 122 | (display "\n")]) 123 | (flush-output))]) 124 | (lambda (x) 125 | (cond 126 | [(eq? x 'reset) 127 | (set! counter 0) 128 | (set! dots 0)] 129 | [(eq? x 'get) 130 | counter] 131 | [(string? x) 132 | (print-mark x)] 133 | [(= 0 (remainder counter size)) 134 | (set! counter (+ x counter)) 135 | (print-mark ".")] 136 | [else 137 | (set! counter (+ x counter))]))))) 138 | 139 | 140 | 141 | ;;----------------- multi dimensional eq hash -------------------- 142 | 143 | (define hash-put! 144 | (lambda (hash key1 key2 v) 145 | (cond 146 | [(hash-has-key? hash key2) 147 | (let ([inner (hash-ref hash key2)]) 148 | (hash-set! inner key1 v))] 149 | [else 150 | (let ([inner (make-hasheq)]) 151 | (hash-set! inner key1 v) 152 | (hash-set! hash key2 inner))]))) 153 | 154 | (define hash-get 155 | (lambda (hash key1 key2) 156 | (cond 157 | [(hash-has-key? hash key2) 158 | (let ([inner (hash-ref hash key2)]) 159 | (cond 160 | [(hash-has-key? inner key1) 161 | (hash-ref inner key1)] 162 | [else #f]))] 163 | [else #f]))) 164 | 165 | 166 | (define hash-put2! 167 | (lambda (hash key1 key2 v) 168 | (cond 169 | [(hash-has-key? hash key2) 170 | (let ([inner (hash-ref hash key2)]) 171 | (hash-set! inner key1 v))] 172 | [else 173 | (let ([inner (make-hash)]) 174 | (hash-set! inner key1 v) 175 | (hash-set! hash key2 inner))]))) 176 | 177 | (define hash-get2 178 | (lambda (hash key1 key2) 179 | (cond 180 | [(hash-has-key? hash key2) 181 | (let ([inner (hash-ref hash key2)]) 182 | (cond 183 | [(hash-has-key? inner key1) 184 | (hash-ref inner key1)] 185 | [else #f]))] 186 | [else #f]))) 187 | 188 | 189 | (define predand 190 | (lambda preds 191 | (lambda (x) 192 | (cond 193 | [(null? preds) #t] 194 | [((car preds) x) 195 | ((apply predand (cdr preds)) x)] 196 | [else #f])))) 197 | 198 | 199 | (define predor 200 | (lambda preds 201 | (lambda (x) 202 | (cond 203 | [(null? preds) #f] 204 | [((car preds) x) #t] 205 | [else 206 | ((apply predor (cdr preds)) x)])))) 207 | 208 | 209 | (define set- 210 | (lambda (s1 s2) 211 | (cond 212 | [(null? s1) '()] 213 | [(memq (car s1) s2) 214 | (set- (cdr s1) s2)] 215 | [else 216 | (cons (car s1) (set- (cdr s1) s2))]))) 217 | 218 | 219 | 220 | (define string-join 221 | (lambda (ls sep) 222 | (cond 223 | [(null? ls) ""] 224 | [else 225 | (string-append (car ls) sep (string-join (cdr ls) sep))]))) 226 | 227 | ; (string-join (list "a" "b" "c") ",") 228 | 229 | --------------------------------------------------------------------------------