├── .hgignore
├── LICENSE
├── Makefile
├── PI
    ├── __init__.py
    ├── align.c
    ├── align.pyx
    ├── distance.py
    ├── input.py
    ├── multialign.py
    ├── output.py
    ├── phylogeny.py
    └── tree.py
├── README
├── dns_requests_and_responses.dump
├── dns_requests_only.dump
├── main.py
└── setup.py


/.hgignore:
--------------------------------------------------------------------------------
1 | syntax: glob
2 | *.class
3 | *.o
4 | *.so
5 | *.pyc
6 | *.sqlite3
7 | *.sw[op]
8 | *~
9 | .DS_Store
10 | bin-debug/*
11 | bin-release/*
12 | bin/*
13 | tags
14 | *.beam
15 | env/
16 | *egg-info*
17 | PI/numpy/
18 | build
19 | graph-*
20 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | GNU LESSER GENERAL PUBLIC LICENSE
2 | Version 2.1, February 1999
3 |
4 | Copyright (C) 1991, 1999 Free Software Foundation, Inc.
5 | 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
6 | Everyone is permitted to copy and distribute verbatim copies
7 | of this license document, but changing it is not allowed.
8 |
9 | [This is the first released version of the Lesser GPL. It also counts
10 | as the successor of the GNU Library Public License, version 2, hence
11 | the version number 2.1.]
12 |
13 | Preamble
14 |
15 | The licenses for most software are designed to take away your
16 | freedom to share and change it. By contrast, the GNU General Public
17 | Licenses are intended to guarantee your freedom to share and change
18 | free software--to make sure the software is free for all its users.
19 |
20 | This license, the Lesser General Public License, applies to some
21 | specially designated software packages--typically libraries--of the
22 | Free Software Foundation and other authors who decide to use it. You
23 | can use it too, but we suggest you first think carefully about whether
24 | this license or the ordinary General Public License is the better
25 | strategy to use in any particular case, based on the explanations below.
26 |
27 | When we speak of free software, we are referring to freedom of use,
28 | not price. Our General Public Licenses are designed to make sure that
29 | you have the freedom to distribute copies of free software (and charge
30 | for this service if you wish); that you receive source code or can get
31 | it if you want it; that you can change the software and use pieces of
32 | it in new free programs; and that you are informed that you can do
33 | these things.
34 |
35 | To protect your rights, we need to make restrictions that forbid
36 | distributors to deny you these rights or to ask you to surrender these
37 | rights. These restrictions translate to certain responsibilities for
38 | you if you distribute copies of the library or if you modify it.
39 |
40 | For example, if you distribute copies of the library, whether gratis
41 | or for a fee, you must give the recipients all the rights that we gave
42 | you. You must make sure that they, too, receive or can get the source
43 | code. If you link other code with the library, you must provide
44 | complete object files to the recipients, so that they can relink them
45 | with the library after making changes to the library and recompiling
46 | it. And you must show them these terms so they know their rights.
47 |
48 | We protect your rights with a two-step method: (1) we copyright the
49 | library, and (2) we offer you this license, which gives you legal
50 | permission to copy, distribute and/or modify the library.
51 |
52 | To protect each distributor, we want to make it very clear that
53 | there is no warranty for the free library. Also, if the library is
54 | modified by someone else and passed on, the recipients should know
55 | that what they have is not the original version, so that the original
56 | author's reputation will not be affected by problems that might be
57 | introduced by others.
58 |
59 | Finally, software patents pose a constant threat to the existence of
60 | any free program. We wish to make sure that a company cannot
61 | effectively restrict the users of a free program by obtaining a
62 | restrictive license from a patent holder. Therefore, we insist that
63 | any patent license obtained for a version of the library must be
64 | consistent with the full freedom of use specified in this license.
65 |
66 | Most GNU software, including some libraries, is covered by the
67 | ordinary GNU General Public License. This license, the GNU Lesser
68 | General Public License, applies to certain designated libraries, and
69 | is quite different from the ordinary General Public License. We use
70 | this license for certain libraries in order to permit linking those
71 | libraries into non-free programs.
72 |
73 | When a program is linked with a library, whether statically or using
74 | a shared library, the combination of the two is legally speaking a
75 | combined work, a derivative of the original library. The ordinary
76 | General Public License therefore permits such linking only if the
77 | entire combination fits its criteria of freedom. The Lesser General
78 | Public License permits more lax criteria for linking other code with
79 | the library.
80 |
81 | We call this license the "Lesser" General Public License because it
82 | does Less to protect the user's freedom than the ordinary General
83 | Public License. It also provides other free software developers Less
84 | of an advantage over competing non-free programs. These disadvantages
85 | are the reason we use the ordinary General Public License for many
86 | libraries. However, the Lesser license provides advantages in certain
87 | special circumstances.
88 |
89 | For example, on rare occasions, there may be a special need to
90 | encourage the widest possible use of a certain library, so that it becomes
91 | a de-facto standard. To achieve this, non-free programs must be
92 | allowed to use the library. A more frequent case is that a free
93 | library does the same job as widely used non-free libraries. In this
94 | case, there is little to gain by limiting the free library to free
95 | software only, so we use the Lesser General Public License.
96 |
97 | In other cases, permission to use a particular library in non-free
98 | programs enables a greater number of people to use a large body of
99 | free software. For example, permission to use the GNU C Library in
100 | non-free programs enables many more people to use the whole GNU
101 | operating system, as well as its variant, the GNU/Linux operating
102 | system.
103 |
104 | Although the Lesser General Public License is Less protective of the
105 | users' freedom, it does ensure that the user of a program that is
106 | linked with the Library has the freedom and the wherewithal to run
107 | that program using a modified version of the Library.
108 |
109 | The precise terms and conditions for copying, distribution and
110 | modification follow. Pay close attention to the difference between a
111 | "work based on the library" and a "work that uses the library". The
112 | former contains code derived from the library, whereas the latter must
113 | be combined with the library in order to run.
114 |
115 | GNU LESSER GENERAL PUBLIC LICENSE
116 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
117 |
118 | 0. This License Agreement applies to any software library or other
119 | program which contains a notice placed by the copyright holder or
120 | other authorized party saying it may be distributed under the terms of
121 | this Lesser General Public License (also called "this License").
122 | Each licensee is addressed as "you".
123 |
124 | A "library" means a collection of software functions and/or data
125 | prepared so as to be conveniently linked with application programs
126 | (which use some of those functions and data) to form executables.
127 |
128 | The "Library", below, refers to any such software library or work
129 | which has been distributed under these terms. A "work based on the
130 | Library" means either the Library or any derivative work under
131 | copyright law: that is to say, a work containing the Library or a
132 | portion of it, either verbatim or with modifications and/or translated
133 | straightforwardly into another language. (Hereinafter, translation is
134 | included without limitation in the term "modification".)
135 |
136 | "Source code" for a work means the preferred form of the work for
137 | making modifications to it. For a library, complete source code means
138 | all the source code for all modules it contains, plus any associated
139 | interface definition files, plus the scripts used to control compilation
140 | and installation of the library.
141 |
142 | Activities other than copying, distribution and modification are not
143 | covered by this License; they are outside its scope. The act of
144 | running a program using the Library is not restricted, and output from
145 | such a program is covered only if its contents constitute a work based
146 | on the Library (independent of the use of the Library in a tool for
147 | writing it). Whether that is true depends on what the Library does
148 | and what the program that uses the Library does.
149 |
150 | 1. You may copy and distribute verbatim copies of the Library's
151 | complete source code as you receive it, in any medium, provided that
152 | you conspicuously and appropriately publish on each copy an
153 | appropriate copyright notice and disclaimer of warranty; keep intact
154 | all the notices that refer to this License and to the absence of any
155 | warranty; and distribute a copy of this License along with the
156 | Library.
157 |
158 | You may charge a fee for the physical act of transferring a copy,
159 | and you may at your option offer warranty protection in exchange for a
160 | fee.
161 |
162 | 2. You may modify your copy or copies of the Library or any portion
163 | of it, thus forming a work based on the Library, and copy and
164 | distribute such modifications or work under the terms of Section 1
165 | above, provided that you also meet all of these conditions:
166 |
167 | a) The modified work must itself be a software library.
168 |
169 | b) You must cause the files modified to carry prominent notices
170 | stating that you changed the files and the date of any change.
171 |
172 | c) You must cause the whole of the work to be licensed at no
173 | charge to all third parties under the terms of this License.
174 |
175 | d) If a facility in the modified Library refers to a function or a
176 | table of data to be supplied by an application program that uses
177 | the facility, other than as an argument passed when the facility
178 | is invoked, then you must make a good faith effort to ensure that,
179 | in the event an application does not supply such function or
180 | table, the facility still operates, and performs whatever part of
181 | its purpose remains meaningful.
182 |
183 | (For example, a function in a library to compute square roots has
184 | a purpose that is entirely well-defined independent of the
185 | application. Therefore, Subsection 2d requires that any
186 | application-supplied function or table used by this function must
187 | be optional: if the application does not supply it, the square
188 | root function must still compute square roots.)
189 |
190 | These requirements apply to the modified work as a whole. If
191 | identifiable sections of that work are not derived from the Library,
192 | and can be reasonably considered independent and separate works in
193 | themselves, then this License, and its terms, do not apply to those
194 | sections when you distribute them as separate works. But when you
195 | distribute the same sections as part of a whole which is a work based
196 | on the Library, the distribution of the whole must be on the terms of
197 | this License, whose permissions for other licensees extend to the
198 | entire whole, and thus to each and every part regardless of who wrote
199 | it.
200 |
201 | Thus, it is not the intent of this section to claim rights or contest
202 | your rights to work written entirely by you; rather, the intent is to
203 | exercise the right to control the distribution of derivative or
204 | collective works based on the Library.
205 |
206 | In addition, mere aggregation of another work not based on the Library
207 | with the Library (or with a work based on the Library) on a volume of
208 | a storage or distribution medium does not bring the other work under
209 | the scope of this License.
210 |
211 | 3. You may opt to apply the terms of the ordinary GNU General Public
212 | License instead of this License to a given copy of the Library. To do
213 | this, you must alter all the notices that refer to this License, so
214 | that they refer to the ordinary GNU General Public License, version 2,
215 | instead of to this License. (If a newer version than version 2 of the
216 | ordinary GNU General Public License has appeared, then you can specify
217 | that version instead if you wish.) Do not make any other change in
218 | these notices.
219 |
220 | Once this change is made in a given copy, it is irreversible for
221 | that copy, so the ordinary GNU General Public License applies to all
222 | subsequent copies and derivative works made from that copy.
223 |
224 | This option is useful when you wish to copy part of the code of
225 | the Library into a program that is not a library.
226 |
227 | 4. You may copy and distribute the Library (or a portion or
228 | derivative of it, under Section 2) in object code or executable form
229 | under the terms of Sections 1 and 2 above provided that you accompany
230 | it with the complete corresponding machine-readable source code, which
231 | must be distributed under the terms of Sections 1 and 2 above on a
232 | medium customarily used for software interchange.
233 |
234 | If distribution of object code is made by offering access to copy
235 | from a designated place, then offering equivalent access to copy the
236 | source code from the same place satisfies the requirement to
237 | distribute the source code, even though third parties are not
238 | compelled to copy the source along with the object code.
239 |
240 | 5. A program that contains no derivative of any portion of the
241 | Library, but is designed to work with the Library by being compiled or
242 | linked with it, is called a "work that uses the Library". Such a
243 | work, in isolation, is not a derivative work of the Library, and
244 | therefore falls outside the scope of this License.
245 |
246 | However, linking a "work that uses the Library" with the Library
247 | creates an executable that is a derivative of the Library (because it
248 | contains portions of the Library), rather than a "work that uses the
249 | library". The executable is therefore covered by this License.
250 | Section 6 states terms for distribution of such executables.
251 |
252 | When a "work that uses the Library" uses material from a header file
253 | that is part of the Library, the object code for the work may be a
254 | derivative work of the Library even though the source code is not.
255 | Whether this is true is especially significant if the work can be
256 | linked without the Library, or if the work is itself a library. The
257 | threshold for this to be true is not precisely defined by law.
258 |
259 | If such an object file uses only numerical parameters, data
260 | structure layouts and accessors, and small macros and small inline
261 | functions (ten lines or less in length), then the use of the object
262 | file is unrestricted, regardless of whether it is legally a derivative
263 | work. (Executables containing this object code plus portions of the
264 | Library will still fall under Section 6.)
265 |
266 | Otherwise, if the work is a derivative of the Library, you may
267 | distribute the object code for the work under the terms of Section 6.
268 | Any executables containing that work also fall under Section 6,
269 | whether or not they are linked directly with the Library itself.
270 |
271 | 6. As an exception to the Sections above, you may also combine or
272 | link a "work that uses the Library" with the Library to produce a
273 | work containing portions of the Library, and distribute that work
274 | under terms of your choice, provided that the terms permit
275 | modification of the work for the customer's own use and reverse
276 | engineering for debugging such modifications.
277 |
278 | You must give prominent notice with each copy of the work that the
279 | Library is used in it and that the Library and its use are covered by
280 | this License. You must supply a copy of this License. If the work
281 | during execution displays copyright notices, you must include the
282 | copyright notice for the Library among them, as well as a reference
283 | directing the user to the copy of this License. Also, you must do one
284 | of these things:
285 |
286 | a) Accompany the work with the complete corresponding
287 | machine-readable source code for the Library including whatever
288 | changes were used in the work (which must be distributed under
289 | Sections 1 and 2 above); and, if the work is an executable linked
290 | with the Library, with the complete machine-readable "work that
291 | uses the Library", as object code and/or source code, so that the
292 | user can modify the Library and then relink to produce a modified
293 | executable containing the modified Library. (It is understood
294 | that the user who changes the contents of definitions files in the
295 | Library will not necessarily be able to recompile the application
296 | to use the modified definitions.)
297 |
298 | b) Use a suitable shared library mechanism for linking with the
299 | Library. A suitable mechanism is one that (1) uses at run time a
300 | copy of the library already present on the user's computer system,
301 | rather than copying library functions into the executable, and (2)
302 | will operate properly with a modified version of the library, if
303 | the user installs one, as long as the modified version is
304 | interface-compatible with the version that the work was made with.
305 |
306 | c) Accompany the work with a written offer, valid for at
307 | least three years, to give the same user the materials
308 | specified in Subsection 6a, above, for a charge no more
309 | than the cost of performing this distribution.
310 |
311 | d) If distribution of the work is made by offering access to copy
312 | from a designated place, offer equivalent access to copy the above
313 | specified materials from the same place.
314 |
315 | e) Verify that the user has already received a copy of these
316 | materials or that you have already sent this user a copy.
317 |
318 | For an executable, the required form of the "work that uses the
319 | Library" must include any data and utility programs needed for
320 | reproducing the executable from it. However, as a special exception,
321 | the materials to be distributed need not include anything that is
322 | normally distributed (in either source or binary form) with the major
323 | components (compiler, kernel, and so on) of the operating system on
324 | which the executable runs, unless that component itself accompanies
325 | the executable.
326 |
327 | It may happen that this requirement contradicts the license
328 | restrictions of other proprietary libraries that do not normally
329 | accompany the operating system. Such a contradiction means you cannot
330 | use both them and the Library together in an executable that you
331 | distribute.
332 |
333 | 7. You may place library facilities that are a work based on the
334 | Library side-by-side in a single library together with other library
335 | facilities not covered by this License, and distribute such a combined
336 | library, provided that the separate distribution of the work based on
337 | the Library and of the other library facilities is otherwise
338 | permitted, and provided that you do these two things:
339 |
340 | a) Accompany the combined library with a copy of the same work
341 | based on the Library, uncombined with any other library
342 | facilities. This must be distributed under the terms of the
343 | Sections above.
344 |
345 | b) Give prominent notice with the combined library of the fact
346 | that part of it is a work based on the Library, and explaining
347 | where to find the accompanying uncombined form of the same work.
348 |
349 | 8. You may not copy, modify, sublicense, link with, or distribute
350 | the Library except as expressly provided under this License. Any
351 | attempt otherwise to copy, modify, sublicense, link with, or
352 | distribute the Library is void, and will automatically terminate your
353 | rights under this License. However, parties who have received copies,
354 | or rights, from you under this License will not have their licenses
355 | terminated so long as such parties remain in full compliance.
356 |
357 | 9. You are not required to accept this License, since you have not
358 | signed it. However, nothing else grants you permission to modify or
359 | distribute the Library or its derivative works. These actions are
360 | prohibited by law if you do not accept this License. Therefore, by
361 | modifying or distributing the Library (or any work based on the
362 | Library), you indicate your acceptance of this License to do so, and
363 | all its terms and conditions for copying, distributing or modifying
364 | the Library or works based on it.
365 |
366 | 10. Each time you redistribute the Library (or any work based on the
367 | Library), the recipient automatically receives a license from the
368 | original licensor to copy, distribute, link with or modify the Library
369 | subject to these terms and conditions. You may not impose any further
370 | restrictions on the recipients' exercise of the rights granted herein.
371 | You are not responsible for enforcing compliance by third parties with
372 | this License.
373 |
374 | 11. If, as a consequence of a court judgment or allegation of patent
375 | infringement or for any other reason (not limited to patent issues),
376 | conditions are imposed on you (whether by court order, agreement or
377 | otherwise) that contradict the conditions of this License, they do not
378 | excuse you from the conditions of this License. If you cannot
379 | distribute so as to satisfy simultaneously your obligations under this
380 | License and any other pertinent obligations, then as a consequence you
381 | may not distribute the Library at all. For example, if a patent
382 | license would not permit royalty-free redistribution of the Library by
383 | all those who receive copies directly or indirectly through you, then
384 | the only way you could satisfy both it and this License would be to
385 | refrain entirely from distribution of the Library.
386 |
387 | If any portion of this section is held invalid or unenforceable under any
388 | particular circumstance, the balance of the section is intended to apply,
389 | and the section as a whole is intended to apply in other circumstances.
390 |
391 | It is not the purpose of this section to induce you to infringe any
392 | patents or other property right claims or to contest validity of any
393 | such claims; this section has the sole purpose of protecting the
394 | integrity of the free software distribution system which is
395 | implemented by public license practices. Many people have made
396 | generous contributions to the wide range of software distributed
397 | through that system in reliance on consistent application of that
398 | system; it is up to the author/donor to decide if he or she is willing
399 | to distribute software through any other system and a licensee cannot
400 | impose that choice.
401 |
402 | This section is intended to make thoroughly clear what is believed to
403 | be a consequence of the rest of this License.
404 |
405 | 12. If the distribution and/or use of the Library is restricted in
406 | certain countries either by patents or by copyrighted interfaces, the
407 | original copyright holder who places the Library under this License may add
408 | an explicit geographical distribution limitation excluding those countries,
409 | so that distribution is permitted only in or among countries not thus
410 | excluded. In such case, this License incorporates the limitation as if
411 | written in the body of this License.
412 |
413 | 13. The Free Software Foundation may publish revised and/or new
414 | versions of the Lesser General Public License from time to time.
415 | Such new versions will be similar in spirit to the present version,
416 | but may differ in detail to address new problems or concerns.
417 |
418 | Each version is given a distinguishing version number. If the Library
419 | specifies a version number of this License which applies to it and
420 | "any later version", you have the option of following the terms and
421 | conditions either of that version or of any later version published by
422 | the Free Software Foundation. If the Library does not specify a
423 | license version number, you may choose any version ever published by
424 | the Free Software Foundation.
425 |
426 | 14. If you wish to incorporate parts of the Library into other free
427 | programs whose distribution conditions are incompatible with these,
428 | write to the author to ask for permission. For software which is
429 | copyrighted by the Free Software Foundation, write to the Free
430 | Software Foundation; we sometimes make exceptions for this. Our
431 | decision will be guided by the two goals of preserving the free status
432 | of all derivatives of our free software and of promoting the sharing
433 | and reuse of software generally.
434 |
435 | NO WARRANTY
436 |
437 | 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
438 | WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
439 | EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
440 | OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
441 | KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
442 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
443 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
444 | LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
445 | THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
446 |
447 | 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
448 | WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
449 | AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
450 | FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
451 | CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
452 | LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
453 | RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
454 | FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
455 | SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
456 | DAMAGES.
457 |
458 | END OF TERMS AND CONDITIONS
459 |
460 | How to Apply These Terms to Your New Libraries
461 |
462 | If you develop a new library, and you want it to be of the greatest
463 | possible use to the public, we recommend making it free software that
464 | everyone can redistribute and change. You can do so by permitting
465 | redistribution under these terms (or, alternatively, under the terms of the
466 | ordinary General Public License).
467 |
468 | To apply these terms, attach the following notices to the library. It is
469 | safest to attach them to the start of each source file to most effectively
470 | convey the exclusion of warranty; and each file should have at least the
471 | "copyright" line and a pointer to where the full notice is found.
472 |
473 |
474 | Copyright (C)
475 |
476 | This library is free software; you can redistribute it and/or
477 | modify it under the terms of the GNU Lesser General Public
478 | License as published by the Free Software Foundation; either
479 | version 2.1 of the License, or (at your option) any later version.
480 |
481 | This library is distributed in the hope that it will be useful,
482 | but WITHOUT ANY WARRANTY; without even the implied warranty of
483 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
484 | Lesser General Public License for more details.
485 |
486 | You should have received a copy of the GNU Lesser General Public
487 | License along with this library; if not, write to the Free Software
488 | Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
489 |
490 | Also add information on how to contact you by electronic and paper mail.
491 |
492 | You should also get your employer (if you work as a programmer) or your
493 | school, if any, to sign a "copyright disclaimer" for the library, if
494 | necessary. Here is a sample; alter the names:
495 |
496 | Yoyodyne, Inc., hereby disclaims all copyright interest in the
497 | library `Frob' (a library for tweaking knobs) written by James Random Hacker.
498 |
499 | , 1 April 1990
500 | Ty Coon, President of Vice
501 |
502 | That's all there is to it!
503 |
504 |
505 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | all:
2 | 	python setup.py build
3 | 	find build -name align.so | xargs -J % cp % PI
4 |
5 | install:
6 | 	python setup.py install
7 |
8 | clean:
9 | 	find . -name "*.pyc" | xargs rm -f
10 | 	rm -f PI/align.c
11 | 	rm -rf build
12 |
13 |
--------------------------------------------------------------------------------
/PI/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = [ "input", "distance", "phylogeny", "multialign", "output" ]
2 |
--------------------------------------------------------------------------------
/PI/align.c:
--------------------------------------------------------------------------------
1 | /* Generated by Pyrex 0.9.8.6 on Tue Aug 30 01:03:29 2011 */
2 |
3 | #define PY_SSIZE_T_CLEAN
4 | #include "Python.h"
5 | #include "structmember.h"
6 | #ifndef PY_LONG_LONG
7 |   #define PY_LONG_LONG LONG_LONG
8 | #endif
9 | #if PY_VERSION_HEX < 0x02050000
10 |   typedef int Py_ssize_t;
11 |   #define PY_SSIZE_T_MAX INT_MAX
12 |   #define PY_SSIZE_T_MIN INT_MIN
13 |   #define PyInt_FromSsize_t(z) PyInt_FromLong(z)
14 |   #define PyInt_AsSsize_t(o) PyInt_AsLong(o)
15 | #endif
16 | #if !defined(WIN32) && !defined(MS_WINDOWS)
17 |   #ifndef __stdcall
18 |     #define __stdcall
19 |   #endif
20 |   #ifndef __cdecl
21 |     #define __cdecl
22 |   #endif
23 | #endif
24 | #ifdef __cplusplus
25 | #define __PYX_EXTERN_C extern "C"
26 | #else
27 | #define __PYX_EXTERN_C extern
28 | #endif
29 | #include
30 | #include "numpy/arrayobject.h"
31 |
32 |
33 | typedef struct {PyObject **p; int i; char *s; long n;} __Pyx_StringTabEntry; /*proto*/
34 |
35 | static PyObject *__pyx_m;
36 | static PyObject *__pyx_b;
37 | static int __pyx_lineno;
38 | static char *__pyx_filename;
39 | static char **__pyx_f;
40 |
41 | static PyObject *__Pyx_GetName(PyObject *dict, PyObject *name); /*proto*/
42 |
43 | static int __Pyx_TypeTest(PyObject *obj, PyTypeObject *type); /*proto*/
44 |
45 | static int __Pyx_InitStrings(__Pyx_StringTabEntry *t); /*proto*/
46 |
47 | static PyTypeObject *__Pyx_ImportType(char *module_name, char *class_name, long size); /*proto*/
48 |
49 | static PyObject *__Pyx_ImportModule(char *name); /*proto*/
50 |
51 | static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list); /*proto*/
52 |
53 | static void __Pyx_AddTraceback(char *funcname); /*proto*/
54 |
55 | /* Declarations from PI.align */
56 |
57 |
58 | /* Declarations from implementation of PI.align */
59 |
60 | static PyTypeObject *__pyx_ptype_2PI_5align_ndarray = 0;
61 |
62 | static char __pyx_k1[] = "numpy";
63 | static char __pyx_k2[] = "zeros";
64 | static char __pyx_k3[] = "i";
65 | static char __pyx_k4[] = "append";
66 | static char __pyx_k5[] = "range";
67 |
68 | static PyObject *__pyx_n_append;
69 | static PyObject *__pyx_n_i;
70 | static PyObject *__pyx_n_numpy;
71 | static PyObject *__pyx_n_range;
72 | static PyObject *__pyx_n_zeros;
73 |
74 |
75 | static __Pyx_StringTabEntry __pyx_string_tab[] = {
76 |   {&__pyx_n_append, 1, __pyx_k4, sizeof(__pyx_k4)},
77 |   {&__pyx_n_i, 1, __pyx_k3, sizeof(__pyx_k3)},
78 |   {&__pyx_n_numpy, 1, __pyx_k1, sizeof(__pyx_k1)},
79 |   {&__pyx_n_range, 1, __pyx_k5, sizeof(__pyx_k5)},
80 |   {&__pyx_n_zeros, 1, __pyx_k2, sizeof(__pyx_k2)},
81 |   {0, 0, 0, 0}
82 | };
83 |
84 |
85 |
86 | /* Implementation of PI.align */
87 |
88 | static PyObject *__pyx_f_2PI_5align_NeedlemanWunsch(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/
89 | static PyObject *__pyx_f_2PI_5align_NeedlemanWunsch(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {
90 |   PyObject *__pyx_v_seq1 = 0;
91 |   PyObject *__pyx_v_seq2 = 0;
92 |   PyObject *__pyx_v_S = 0;
93 |   int __pyx_v_g;
94 |   int __pyx_v_e;
95 |   int __pyx_v_M;
96 |   int __pyx_v_N;
97 |   int __pyx_v_i;
98 |   int __pyx_v_j;
99 |   int __pyx_v_t_max;
100 |   int __pyx_v_i_max;
101 |   int __pyx_v_j_max;
102 |   int __pyx_v_dir;
103 |   int __pyx_v_v1;
104 |   int __pyx_v_v2;
105 |   int __pyx_v_v3;
106 |   int __pyx_v_m;
107 |   int __pyx_v_new_len;
108 |   int __pyx_v_data;
109 |   int __pyx_v_ni;
110 |   int __pyx_v_nj;
111 |   PyArrayObject *__pyx_v_a;
112 |   int __pyx_v_nrows;
113 |   int __pyx_v_ncols;
114 |   int *__pyx_v_matrix;
115 |   PyObject *__pyx_v_edits1;
116 |   PyObject *__pyx_v_edits2;
117 |   PyObject *__pyx_v_table;
118 |   PyObject *__pyx_v_new_seq1;
119 |   PyObject *__pyx_v_new_seq2;
120 |   int __pyx_v_s1;
121 |   int __pyx_v_s2;
122 |   int __pyx_v_gaps;
123 |   PyObject *__pyx_r;
124 |   PyObject *__pyx_1 = 0;
125 |   Py_ssize_t __pyx_2;
126 |   PyObject *__pyx_3 = 0;
127 |   PyObject *__pyx_4 = 0;
128 |   PyObject *__pyx_5 = 0;
129 |   int __pyx_6;
130 |   long __pyx_7;
131 |   static char *__pyx_argnames[] = {"seq1","seq2","S","g","e",0};
132 |   if (!PyArg_ParseTupleAndKeywords(__pyx_args, __pyx_kwds, "OOOii", __pyx_argnames, &__pyx_v_seq1, &__pyx_v_seq2, &__pyx_v_S, &__pyx_v_g, &__pyx_v_e)) return 0;
133 |   Py_INCREF(__pyx_v_seq1);
134 |   Py_INCREF(__pyx_v_seq2);
135 |   Py_INCREF(__pyx_v_S);
136 |   __pyx_v_a = ((PyArrayObject *)Py_None); Py_INCREF(Py_None);
137 |   __pyx_v_edits1 = Py_None; Py_INCREF(Py_None);
138 |   __pyx_v_edits2 = Py_None; Py_INCREF(Py_None);
139 |   __pyx_v_table = Py_None; Py_INCREF(Py_None);
140 |   __pyx_v_new_seq1 = Py_None; Py_INCREF(Py_None);
141 |   __pyx_v_new_seq2 = Py_None; Py_INCREF(Py_None);
142 |
143 |   /* "/private/tmp/PI/PI/align.pyx":26 */
144 |   __pyx_1 = PyList_New(0); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 26; goto __pyx_L1;}
145 |   Py_DECREF(__pyx_v_edits1);
146 |   __pyx_v_edits1 = __pyx_1;
147 |   __pyx_1 = 0;
148 |
149 |   /* "/private/tmp/PI/PI/align.pyx":27 */
150 |   __pyx_1 = PyList_New(0); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 27; goto __pyx_L1;}
151 |   Py_DECREF(__pyx_v_edits2);
152 |   __pyx_v_edits2 = __pyx_1;
153 |   __pyx_1 = 0;
154 |
155 |   /* "/private/tmp/PI/PI/align.pyx":29 */
156 |   __pyx_2 = PyObject_Length(__pyx_v_seq1); if (__pyx_2 == -1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 29; goto __pyx_L1;}
157 |   __pyx_v_M = (__pyx_2 + 1);
158 |
159 |   /* "/private/tmp/PI/PI/align.pyx":30 */
160 |   __pyx_2 = PyObject_Length(__pyx_v_seq2); if (__pyx_2 == -1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 30; goto __pyx_L1;}
161 |   __pyx_v_N = (__pyx_2 + 1);
162 |
163 |   /* "/private/tmp/PI/PI/align.pyx":32 */
164 |   __pyx_1 = __Pyx_GetName(__pyx_m, __pyx_n_numpy); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 32; goto __pyx_L1;}
165 |   __pyx_3 = PyObject_GetAttr(__pyx_1, __pyx_n_zeros); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 32; goto __pyx_L1;}
166 |   Py_DECREF(__pyx_1); __pyx_1 = 0;
167 |   __pyx_1 = PyInt_FromLong(__pyx_v_M); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 32; goto __pyx_L1;}
168 |   __pyx_4 = PyInt_FromLong(__pyx_v_N); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 32; goto __pyx_L1;}
169 |   __pyx_5 = PyList_New(2); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 32; goto __pyx_L1;}
170 |   PyList_SET_ITEM(__pyx_5, 0, __pyx_1);
171 |   PyList_SET_ITEM(__pyx_5, 1, __pyx_4);
172 |   __pyx_1 = 0;
173 |   __pyx_4 = 0;
174 |   __pyx_1 = PyTuple_New(2); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 32; goto __pyx_L1;}
175 |   PyTuple_SET_ITEM(__pyx_1, 0, __pyx_5);
176 |   Py_INCREF(__pyx_n_i);
177 |   PyTuple_SET_ITEM(__pyx_1, 1, __pyx_n_i);
178 |   __pyx_5 = 0;
179 |   __pyx_4 = PyObject_CallObject(__pyx_3, __pyx_1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 32; goto __pyx_L1;}
180 |   Py_DECREF(__pyx_3); __pyx_3 = 0;
181 |   Py_DECREF(__pyx_1); __pyx_1 = 0;
182 |   Py_DECREF(__pyx_v_table);
183 |   __pyx_v_table = __pyx_4;
184 |   __pyx_4 = 0;
185 |
186 |   /* "/private/tmp/PI/PI/align.pyx":34 */
187 |   if (!__Pyx_TypeTest(__pyx_v_table, __pyx_ptype_2PI_5align_ndarray)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 34; goto __pyx_L1;}
188 |   Py_INCREF(__pyx_v_table);
189 |   Py_DECREF(((PyObject *)__pyx_v_a));
190 |   __pyx_v_a = ((PyArrayObject *)__pyx_v_table);
191 |
192 |   /* "/private/tmp/PI/PI/align.pyx":36 */
193 |   __pyx_v_nrows = (__pyx_v_a->dimensions[0]);
194 |
195 |   /* "/private/tmp/PI/PI/align.pyx":37 */
196 |   __pyx_v_ncols = (__pyx_v_a->dimensions[1]);
197 |
198 |   /* "/private/tmp/PI/PI/align.pyx":39 */
199 |
__pyx_v_matrix = ((int *)__pyx_v_a->data); 200 | 201 | /* "/private/tmp/PI/PI/align.pyx":42 */ 202 | for (__pyx_v_i = 1; __pyx_v_i < __pyx_v_M; ++__pyx_v_i) { 203 | for (__pyx_v_j = 1; __pyx_v_j < __pyx_v_N; ++__pyx_v_j) { 204 | __pyx_5 = PyInt_FromLong((__pyx_v_i - 1)); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 44; goto __pyx_L1;} 205 | __pyx_3 = PyObject_GetItem(__pyx_v_seq1, __pyx_5); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 44; goto __pyx_L1;} 206 | Py_DECREF(__pyx_5); __pyx_5 = 0; 207 | __pyx_1 = PyInt_FromLong((__pyx_v_j - 1)); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 44; goto __pyx_L1;} 208 | __pyx_4 = PyObject_GetItem(__pyx_v_seq2, __pyx_1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 44; goto __pyx_L1;} 209 | Py_DECREF(__pyx_1); __pyx_1 = 0; 210 | if (PyObject_Cmp(__pyx_3, __pyx_4, &__pyx_6) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 44; goto __pyx_L1;} 211 | __pyx_6 = __pyx_6 == 0; 212 | Py_DECREF(__pyx_3); __pyx_3 = 0; 213 | Py_DECREF(__pyx_4); __pyx_4 = 0; 214 | if (__pyx_6) { 215 | (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) = 1; 216 | goto __pyx_L6; 217 | } 218 | /*else*/ { 219 | (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) = 0; 220 | } 221 | __pyx_L6:; 222 | } 223 | } 224 | 225 | /* "/private/tmp/PI/PI/align.pyx":50 */ 226 | __pyx_v_i_max = 0; 227 | 228 | /* "/private/tmp/PI/PI/align.pyx":51 */ 229 | __pyx_v_j_max = 0; 230 | 231 | /* "/private/tmp/PI/PI/align.pyx":52 */ 232 | __pyx_v_t_max = 0; 233 | 234 | /* "/private/tmp/PI/PI/align.pyx":54 */ 235 | for (__pyx_v_i = 1; __pyx_v_i < __pyx_v_M; ++__pyx_v_i) { 236 | for (__pyx_v_j = 1; __pyx_v_j < __pyx_v_N; ++__pyx_v_j) { 237 | 238 | /* "/private/tmp/PI/PI/align.pyx":57 */ 239 | __pyx_v_dir = 0; 240 | 241 | /* "/private/tmp/PI/PI/align.pyx":59 */ 242 | __pyx_v_v1 = (__pyx_v_matrix[(((__pyx_v_i - 1) * __pyx_v_ncols) + (__pyx_v_j - 1))]); 243 | 244 | /* "/private/tmp/PI/PI/align.pyx":60 */ 245 | 
__pyx_v_v2 = (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + (__pyx_v_j - 1))]); 246 | 247 | /* "/private/tmp/PI/PI/align.pyx":61 */ 248 | __pyx_v_v3 = (__pyx_v_matrix[(((__pyx_v_i - 1) * __pyx_v_ncols) + __pyx_v_j)]); 249 | 250 | /* "/private/tmp/PI/PI/align.pyx":63 */ 251 | __pyx_6 = (__pyx_v_v1 > 255); 252 | if (__pyx_6) { 253 | __pyx_v_v1 = (__pyx_v_v1 >> 8); 254 | goto __pyx_L11; 255 | } 256 | __pyx_L11:; 257 | 258 | /* "/private/tmp/PI/PI/align.pyx":65 */ 259 | __pyx_6 = (__pyx_v_v2 > 255); 260 | if (__pyx_6) { 261 | __pyx_v_v2 = (__pyx_v_v2 >> 8); 262 | goto __pyx_L12; 263 | } 264 | __pyx_L12:; 265 | 266 | /* "/private/tmp/PI/PI/align.pyx":67 */ 267 | __pyx_6 = (__pyx_v_v3 > 255); 268 | if (__pyx_6) { 269 | __pyx_v_v3 = (__pyx_v_v3 >> 8); 270 | goto __pyx_L13; 271 | } 272 | __pyx_L13:; 273 | 274 | /* "/private/tmp/PI/PI/align.pyx":70 */ 275 | __pyx_v_v1 = (__pyx_v_v1 + (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)])); 276 | 277 | /* "/private/tmp/PI/PI/align.pyx":71 */ 278 | __pyx_v_v2 = (__pyx_v_v2 - __pyx_v_e); 279 | 280 | /* "/private/tmp/PI/PI/align.pyx":72 */ 281 | __pyx_v_v3 = (__pyx_v_v3 - __pyx_v_e); 282 | 283 | /* "/private/tmp/PI/PI/align.pyx":74 */ 284 | __pyx_6 = (__pyx_v_v1 > 0); 285 | if (__pyx_6) { 286 | __pyx_v_m = __pyx_v_v1; 287 | goto __pyx_L14; 288 | } 289 | /*else*/ { 290 | __pyx_v_m = 0; 291 | } 292 | __pyx_L14:; 293 | 294 | /* "/private/tmp/PI/PI/align.pyx":79 */ 295 | __pyx_6 = (__pyx_v_v2 > __pyx_v_m); 296 | if (__pyx_6) { 297 | __pyx_v_m = __pyx_v_v2; 298 | goto __pyx_L15; 299 | } 300 | __pyx_L15:; 301 | 302 | /* "/private/tmp/PI/PI/align.pyx":82 */ 303 | __pyx_6 = (__pyx_v_v3 > __pyx_v_m); 304 | if (__pyx_6) { 305 | __pyx_v_m = __pyx_v_v3; 306 | goto __pyx_L16; 307 | } 308 | __pyx_L16:; 309 | 310 | /* "/private/tmp/PI/PI/align.pyx":85 */ 311 | __pyx_6 = (__pyx_v_m == __pyx_v_v1); 312 | if (__pyx_6) { 313 | __pyx_v_dir = (__pyx_v_dir | (1 << 0)); 314 | goto __pyx_L17; 315 | } 316 | __pyx_L17:; 317 | 318 | /* 
"/private/tmp/PI/PI/align.pyx":88 */ 319 | __pyx_6 = (__pyx_v_m == __pyx_v_v2); 320 | if (__pyx_6) { 321 | __pyx_v_dir = (__pyx_v_dir | (1 << 1)); 322 | goto __pyx_L18; 323 | } 324 | __pyx_L18:; 325 | 326 | /* "/private/tmp/PI/PI/align.pyx":91 */ 327 | __pyx_6 = (__pyx_v_m == __pyx_v_v3); 328 | if (__pyx_6) { 329 | __pyx_v_dir = (__pyx_v_dir | (1 << 2)); 330 | goto __pyx_L19; 331 | } 332 | __pyx_L19:; 333 | 334 | /* "/private/tmp/PI/PI/align.pyx":94 */ 335 | __pyx_6 = (__pyx_v_m >= __pyx_v_t_max); 336 | if (__pyx_6) { 337 | 338 | /* "/private/tmp/PI/PI/align.pyx":95 */ 339 | __pyx_v_t_max = __pyx_v_m; 340 | 341 | /* "/private/tmp/PI/PI/align.pyx":96 */ 342 | __pyx_v_i_max = __pyx_v_i; 343 | 344 | /* "/private/tmp/PI/PI/align.pyx":97 */ 345 | __pyx_v_j_max = __pyx_v_j; 346 | goto __pyx_L20; 347 | } 348 | __pyx_L20:; 349 | 350 | /* "/private/tmp/PI/PI/align.pyx":99 */ 351 | __pyx_v_m = (__pyx_v_m << 8); 352 | 353 | /* "/private/tmp/PI/PI/align.pyx":100 */ 354 | __pyx_v_m = (__pyx_v_m | __pyx_v_dir); 355 | 356 | /* "/private/tmp/PI/PI/align.pyx":102 */ 357 | (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) = __pyx_v_m; 358 | } 359 | } 360 | 361 | /* "/private/tmp/PI/PI/align.pyx":105 */ 362 | __pyx_v_i = __pyx_v_i_max; 363 | 364 | /* "/private/tmp/PI/PI/align.pyx":106 */ 365 | __pyx_v_j = __pyx_v_j_max; 366 | 367 | /* "/private/tmp/PI/PI/align.pyx":108 */ 368 | __pyx_v_new_len = 0; 369 | 370 | /* "/private/tmp/PI/PI/align.pyx":110 */ 371 | while (1) { 372 | __pyx_6 = __pyx_v_i; 373 | if (__pyx_6) { 374 | __pyx_6 = __pyx_v_j; 375 | } 376 | if (!__pyx_6) break; 377 | 378 | /* "/private/tmp/PI/PI/align.pyx":111 */ 379 | __pyx_v_data = (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]); 380 | 381 | /* "/private/tmp/PI/PI/align.pyx":113 */ 382 | __pyx_7 = (__pyx_v_data & (1 << 2)); 383 | if (__pyx_7) { 384 | 385 | /* "/private/tmp/PI/PI/align.pyx":114 */ 386 | __pyx_v_ni = (__pyx_v_i - 1); 387 | 388 | /* "/private/tmp/PI/PI/align.pyx":115 */ 389 | 
__pyx_v_nj = __pyx_v_j; 390 | goto __pyx_L23; 391 | } 392 | __pyx_L23:; 393 | 394 | /* "/private/tmp/PI/PI/align.pyx":117 */ 395 | __pyx_7 = (__pyx_v_data & (1 << 1)); 396 | if (__pyx_7) { 397 | 398 | /* "/private/tmp/PI/PI/align.pyx":118 */ 399 | __pyx_v_ni = __pyx_v_i; 400 | 401 | /* "/private/tmp/PI/PI/align.pyx":119 */ 402 | __pyx_v_nj = (__pyx_v_j - 1); 403 | goto __pyx_L24; 404 | } 405 | __pyx_L24:; 406 | 407 | /* "/private/tmp/PI/PI/align.pyx":121 */ 408 | __pyx_7 = (__pyx_v_data & (1 << 0)); 409 | if (__pyx_7) { 410 | 411 | /* "/private/tmp/PI/PI/align.pyx":122 */ 412 | __pyx_v_ni = (__pyx_v_i - 1); 413 | 414 | /* "/private/tmp/PI/PI/align.pyx":123 */ 415 | __pyx_v_nj = (__pyx_v_j - 1); 416 | goto __pyx_L25; 417 | } 418 | __pyx_L25:; 419 | 420 | /* "/private/tmp/PI/PI/align.pyx":125 */ 421 | __pyx_v_new_len = (__pyx_v_new_len + 1); 422 | 423 | /* "/private/tmp/PI/PI/align.pyx":127 */ 424 | __pyx_v_i = __pyx_v_ni; 425 | 426 | /* "/private/tmp/PI/PI/align.pyx":128 */ 427 | __pyx_v_j = __pyx_v_nj; 428 | } 429 | 430 | /* "/private/tmp/PI/PI/align.pyx":130 */ 431 | __pyx_5 = __Pyx_GetName(__pyx_m, __pyx_n_numpy); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 130; goto __pyx_L1;} 432 | __pyx_1 = PyObject_GetAttr(__pyx_5, __pyx_n_zeros); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 130; goto __pyx_L1;} 433 | Py_DECREF(__pyx_5); __pyx_5 = 0; 434 | __pyx_3 = PyInt_FromLong(__pyx_v_new_len); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 130; goto __pyx_L1;} 435 | __pyx_4 = PyTuple_New(2); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 130; goto __pyx_L1;} 436 | PyTuple_SET_ITEM(__pyx_4, 0, __pyx_3); 437 | Py_INCREF(__pyx_n_i); 438 | PyTuple_SET_ITEM(__pyx_4, 1, __pyx_n_i); 439 | __pyx_3 = 0; 440 | __pyx_5 = PyObject_CallObject(__pyx_1, __pyx_4); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 130; goto __pyx_L1;} 441 | Py_DECREF(__pyx_1); __pyx_1 = 0; 442 | Py_DECREF(__pyx_4); __pyx_4 = 0; 443 | 
Py_DECREF(__pyx_v_new_seq1); 444 | __pyx_v_new_seq1 = __pyx_5; 445 | __pyx_5 = 0; 446 | 447 | /* "/private/tmp/PI/PI/align.pyx":131 */ 448 | __pyx_3 = __Pyx_GetName(__pyx_m, __pyx_n_numpy); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 131; goto __pyx_L1;} 449 | __pyx_1 = PyObject_GetAttr(__pyx_3, __pyx_n_zeros); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 131; goto __pyx_L1;} 450 | Py_DECREF(__pyx_3); __pyx_3 = 0; 451 | __pyx_4 = PyInt_FromLong(__pyx_v_new_len); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 131; goto __pyx_L1;} 452 | __pyx_5 = PyTuple_New(2); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 131; goto __pyx_L1;} 453 | PyTuple_SET_ITEM(__pyx_5, 0, __pyx_4); 454 | Py_INCREF(__pyx_n_i); 455 | PyTuple_SET_ITEM(__pyx_5, 1, __pyx_n_i); 456 | __pyx_4 = 0; 457 | __pyx_3 = PyObject_CallObject(__pyx_1, __pyx_5); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 131; goto __pyx_L1;} 458 | Py_DECREF(__pyx_1); __pyx_1 = 0; 459 | Py_DECREF(__pyx_5); __pyx_5 = 0; 460 | Py_DECREF(__pyx_v_new_seq2); 461 | __pyx_v_new_seq2 = __pyx_3; 462 | __pyx_3 = 0; 463 | 464 | /* "/private/tmp/PI/PI/align.pyx":134 */ 465 | __pyx_v_s1 = __pyx_v_new_len; 466 | __pyx_v_s2 = __pyx_v_new_len; 467 | 468 | /* "/private/tmp/PI/PI/align.pyx":135 */ 469 | __pyx_v_gaps = 0; 470 | 471 | /* "/private/tmp/PI/PI/align.pyx":137 */ 472 | __pyx_v_i = __pyx_v_i_max; 473 | 474 | /* "/private/tmp/PI/PI/align.pyx":138 */ 475 | __pyx_v_j = __pyx_v_j_max; 476 | 477 | /* "/private/tmp/PI/PI/align.pyx":140 */ 478 | while (1) { 479 | __pyx_6 = __pyx_v_i; 480 | if (__pyx_6) { 481 | __pyx_6 = __pyx_v_j; 482 | } 483 | if (!__pyx_6) break; 484 | 485 | /* "/private/tmp/PI/PI/align.pyx":141 */ 486 | __pyx_v_data = (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]); 487 | 488 | /* "/private/tmp/PI/PI/align.pyx":143 */ 489 | __pyx_7 = (__pyx_v_data & (1 << 2)); 490 | if (__pyx_7) { 491 | 492 | /* "/private/tmp/PI/PI/align.pyx":144 */ 493 | 
__pyx_v_ni = (__pyx_v_i - 1); 494 | 495 | /* "/private/tmp/PI/PI/align.pyx":145 */ 496 | __pyx_v_nj = __pyx_v_j; 497 | 498 | /* "/private/tmp/PI/PI/align.pyx":146 */ 499 | __pyx_v_dir = (1 << 2); 500 | goto __pyx_L28; 501 | } 502 | __pyx_L28:; 503 | 504 | /* "/private/tmp/PI/PI/align.pyx":148 */ 505 | __pyx_7 = (__pyx_v_data & (1 << 1)); 506 | if (__pyx_7) { 507 | 508 | /* "/private/tmp/PI/PI/align.pyx":149 */ 509 | __pyx_v_ni = __pyx_v_i; 510 | 511 | /* "/private/tmp/PI/PI/align.pyx":150 */ 512 | __pyx_v_nj = (__pyx_v_j - 1); 513 | 514 | /* "/private/tmp/PI/PI/align.pyx":151 */ 515 | __pyx_v_dir = (1 << 1); 516 | goto __pyx_L29; 517 | } 518 | __pyx_L29:; 519 | 520 | /* "/private/tmp/PI/PI/align.pyx":153 */ 521 | __pyx_7 = (__pyx_v_data & (1 << 0)); 522 | if (__pyx_7) { 523 | 524 | /* "/private/tmp/PI/PI/align.pyx":154 */ 525 | __pyx_v_ni = (__pyx_v_i - 1); 526 | 527 | /* "/private/tmp/PI/PI/align.pyx":155 */ 528 | __pyx_v_nj = (__pyx_v_j - 1); 529 | 530 | /* "/private/tmp/PI/PI/align.pyx":156 */ 531 | __pyx_v_dir = (1 << 0); 532 | goto __pyx_L30; 533 | } 534 | __pyx_L30:; 535 | 536 | /* "/private/tmp/PI/PI/align.pyx":158 */ 537 | __pyx_6 = (__pyx_v_dir == (1 << 0)); 538 | if (__pyx_6) { 539 | 540 | /* "/private/tmp/PI/PI/align.pyx":159 */ 541 | __pyx_v_s1 = (__pyx_v_s1 - 1); 542 | 543 | /* "/private/tmp/PI/PI/align.pyx":160 */ 544 | __pyx_v_s2 = (__pyx_v_s2 - 1); 545 | 546 | /* "/private/tmp/PI/PI/align.pyx":161 */ 547 | __pyx_4 = PyInt_FromLong((__pyx_v_i - 1)); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; goto __pyx_L1;} 548 | __pyx_1 = PyObject_GetItem(__pyx_v_seq1, __pyx_4); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; goto __pyx_L1;} 549 | Py_DECREF(__pyx_4); __pyx_4 = 0; 550 | __pyx_5 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; goto __pyx_L1;} 551 | if (PyObject_SetItem(__pyx_v_new_seq1, __pyx_5, __pyx_1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; goto 
__pyx_L1;} 552 | Py_DECREF(__pyx_5); __pyx_5 = 0; 553 | Py_DECREF(__pyx_1); __pyx_1 = 0; 554 | 555 | /* "/private/tmp/PI/PI/align.pyx":162 */ 556 | __pyx_3 = PyInt_FromLong((__pyx_v_j - 1)); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 162; goto __pyx_L1;} 557 | __pyx_4 = PyObject_GetItem(__pyx_v_seq2, __pyx_3); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 162; goto __pyx_L1;} 558 | Py_DECREF(__pyx_3); __pyx_3 = 0; 559 | __pyx_1 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 162; goto __pyx_L1;} 560 | if (PyObject_SetItem(__pyx_v_new_seq2, __pyx_1, __pyx_4) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 162; goto __pyx_L1;} 561 | Py_DECREF(__pyx_1); __pyx_1 = 0; 562 | Py_DECREF(__pyx_4); __pyx_4 = 0; 563 | goto __pyx_L31; 564 | } 565 | __pyx_L31:; 566 | 567 | /* "/private/tmp/PI/PI/align.pyx":164 */ 568 | __pyx_6 = (__pyx_v_dir == (1 << 1)); 569 | if (__pyx_6) { 570 | 571 | /* "/private/tmp/PI/PI/align.pyx":165 */ 572 | __pyx_v_s1 = (__pyx_v_s1 - 1); 573 | 574 | /* "/private/tmp/PI/PI/align.pyx":166 */ 575 | __pyx_v_s2 = (__pyx_v_s2 - 1); 576 | 577 | /* "/private/tmp/PI/PI/align.pyx":168 */ 578 | __pyx_5 = PyObject_GetAttr(__pyx_v_edits1, __pyx_n_append); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 168; goto __pyx_L1;} 579 | __pyx_3 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 168; goto __pyx_L1;} 580 | __pyx_4 = PyTuple_New(1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 168; goto __pyx_L1;} 581 | PyTuple_SET_ITEM(__pyx_4, 0, __pyx_3); 582 | __pyx_3 = 0; 583 | __pyx_1 = PyObject_CallObject(__pyx_5, __pyx_4); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 168; goto __pyx_L1;} 584 | Py_DECREF(__pyx_5); __pyx_5 = 0; 585 | Py_DECREF(__pyx_4); __pyx_4 = 0; 586 | Py_DECREF(__pyx_1); __pyx_1 = 0; 587 | 588 | /* "/private/tmp/PI/PI/align.pyx":169 */ 589 | __pyx_3 = PyInt_FromLong(256); if (!__pyx_3) {__pyx_filename 
= __pyx_f[0]; __pyx_lineno = 169; goto __pyx_L1;} 590 | __pyx_5 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 169; goto __pyx_L1;} 591 | if (PyObject_SetItem(__pyx_v_new_seq1, __pyx_5, __pyx_3) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 169; goto __pyx_L1;} 592 | Py_DECREF(__pyx_5); __pyx_5 = 0; 593 | Py_DECREF(__pyx_3); __pyx_3 = 0; 594 | 595 | /* "/private/tmp/PI/PI/align.pyx":170 */ 596 | __pyx_4 = PyInt_FromLong((__pyx_v_j - 1)); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 170; goto __pyx_L1;} 597 | __pyx_1 = PyObject_GetItem(__pyx_v_seq2, __pyx_4); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 170; goto __pyx_L1;} 598 | Py_DECREF(__pyx_4); __pyx_4 = 0; 599 | __pyx_3 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 170; goto __pyx_L1;} 600 | if (PyObject_SetItem(__pyx_v_new_seq2, __pyx_3, __pyx_1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 170; goto __pyx_L1;} 601 | Py_DECREF(__pyx_3); __pyx_3 = 0; 602 | Py_DECREF(__pyx_1); __pyx_1 = 0; 603 | 604 | /* "/private/tmp/PI/PI/align.pyx":171 */ 605 | __pyx_v_gaps = (__pyx_v_gaps + 1); 606 | goto __pyx_L32; 607 | } 608 | __pyx_L32:; 609 | 610 | /* "/private/tmp/PI/PI/align.pyx":173 */ 611 | __pyx_6 = (__pyx_v_dir == (1 << 2)); 612 | if (__pyx_6) { 613 | 614 | /* "/private/tmp/PI/PI/align.pyx":174 */ 615 | __pyx_v_s1 = (__pyx_v_s1 - 1); 616 | 617 | /* "/private/tmp/PI/PI/align.pyx":175 */ 618 | __pyx_v_s2 = (__pyx_v_s2 - 1); 619 | 620 | /* "/private/tmp/PI/PI/align.pyx":177 */ 621 | __pyx_5 = PyInt_FromLong((__pyx_v_i - 1)); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 177; goto __pyx_L1;} 622 | __pyx_4 = PyObject_GetItem(__pyx_v_seq1, __pyx_5); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 177; goto __pyx_L1;} 623 | Py_DECREF(__pyx_5); __pyx_5 = 0; 624 | __pyx_1 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 177; goto 
__pyx_L1;} 625 | if (PyObject_SetItem(__pyx_v_new_seq1, __pyx_1, __pyx_4) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 177; goto __pyx_L1;} 626 | Py_DECREF(__pyx_1); __pyx_1 = 0; 627 | Py_DECREF(__pyx_4); __pyx_4 = 0; 628 | 629 | /* "/private/tmp/PI/PI/align.pyx":178 */ 630 | __pyx_3 = PyInt_FromLong(256); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 178; goto __pyx_L1;} 631 | __pyx_5 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 178; goto __pyx_L1;} 632 | if (PyObject_SetItem(__pyx_v_new_seq2, __pyx_5, __pyx_3) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 178; goto __pyx_L1;} 633 | Py_DECREF(__pyx_5); __pyx_5 = 0; 634 | Py_DECREF(__pyx_3); __pyx_3 = 0; 635 | 636 | /* "/private/tmp/PI/PI/align.pyx":179 */ 637 | __pyx_4 = PyObject_GetAttr(__pyx_v_edits2, __pyx_n_append); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 179; goto __pyx_L1;} 638 | __pyx_1 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 179; goto __pyx_L1;} 639 | __pyx_3 = PyTuple_New(1); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 179; goto __pyx_L1;} 640 | PyTuple_SET_ITEM(__pyx_3, 0, __pyx_1); 641 | __pyx_1 = 0; 642 | __pyx_5 = PyObject_CallObject(__pyx_4, __pyx_3); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 179; goto __pyx_L1;} 643 | Py_DECREF(__pyx_4); __pyx_4 = 0; 644 | Py_DECREF(__pyx_3); __pyx_3 = 0; 645 | Py_DECREF(__pyx_5); __pyx_5 = 0; 646 | 647 | /* "/private/tmp/PI/PI/align.pyx":180 */ 648 | __pyx_v_gaps = (__pyx_v_gaps + 1); 649 | goto __pyx_L33; 650 | } 651 | __pyx_L33:; 652 | 653 | /* "/private/tmp/PI/PI/align.pyx":182 */ 654 | __pyx_v_i = __pyx_v_ni; 655 | 656 | /* "/private/tmp/PI/PI/align.pyx":183 */ 657 | __pyx_v_j = __pyx_v_nj; 658 | } 659 | 660 | /* "/private/tmp/PI/PI/align.pyx":185 */ 661 | __pyx_1 = PyInt_FromLong(__pyx_v_t_max); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 185; goto __pyx_L1;} 662 | __pyx_4 = 
PyInt_FromLong(__pyx_v_gaps); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 185; goto __pyx_L1;} 663 | __pyx_3 = PyTuple_New(6); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 185; goto __pyx_L1;} 664 | Py_INCREF(__pyx_v_new_seq1); 665 | PyTuple_SET_ITEM(__pyx_3, 0, __pyx_v_new_seq1); 666 | Py_INCREF(__pyx_v_new_seq2); 667 | PyTuple_SET_ITEM(__pyx_3, 1, __pyx_v_new_seq2); 668 | Py_INCREF(__pyx_v_edits1); 669 | PyTuple_SET_ITEM(__pyx_3, 2, __pyx_v_edits1); 670 | Py_INCREF(__pyx_v_edits2); 671 | PyTuple_SET_ITEM(__pyx_3, 3, __pyx_v_edits2); 672 | PyTuple_SET_ITEM(__pyx_3, 4, __pyx_1); 673 | PyTuple_SET_ITEM(__pyx_3, 5, __pyx_4); 674 | __pyx_1 = 0; 675 | __pyx_4 = 0; 676 | __pyx_r = __pyx_3; 677 | __pyx_3 = 0; 678 | goto __pyx_L0; 679 | 680 | __pyx_r = Py_None; Py_INCREF(Py_None); 681 | goto __pyx_L0; 682 | __pyx_L1:; 683 | Py_XDECREF(__pyx_1); 684 | Py_XDECREF(__pyx_3); 685 | Py_XDECREF(__pyx_4); 686 | Py_XDECREF(__pyx_5); 687 | __Pyx_AddTraceback("PI.align.NeedlemanWunsch"); 688 | __pyx_r = 0; 689 | __pyx_L0:; 690 | Py_DECREF(__pyx_v_a); 691 | Py_DECREF(__pyx_v_edits1); 692 | Py_DECREF(__pyx_v_edits2); 693 | Py_DECREF(__pyx_v_table); 694 | Py_DECREF(__pyx_v_new_seq1); 695 | Py_DECREF(__pyx_v_new_seq2); 696 | Py_DECREF(__pyx_v_seq1); 697 | Py_DECREF(__pyx_v_seq2); 698 | Py_DECREF(__pyx_v_S); 699 | return __pyx_r; 700 | } 701 | 702 | static PyObject *__pyx_f_2PI_5align_SmithWaterman(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/ 703 | static PyObject *__pyx_f_2PI_5align_SmithWaterman(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) { 704 | PyObject *__pyx_v_seq1 = 0; 705 | PyObject *__pyx_v_seq2 = 0; 706 | PyObject *__pyx_v_S = 0; 707 | int __pyx_v_g; 708 | int __pyx_v_e; 709 | int __pyx_v_M; 710 | int __pyx_v_N; 711 | int __pyx_v_i; 712 | int __pyx_v_j; 713 | int __pyx_v_t_max; 714 | int __pyx_v_i_max; 715 | int __pyx_v_j_max; 716 | int __pyx_v_dir; 717 | int __pyx_v_v1; 718 | int __pyx_v_v2; 719 
| int __pyx_v_v3; 720 | int __pyx_v_m; 721 | int __pyx_v_new_len; 722 | int __pyx_v_data; 723 | int __pyx_v_ni; 724 | int __pyx_v_nj; 725 | PyArrayObject *__pyx_v_a; 726 | int __pyx_v_nrows; 727 | int __pyx_v_ncols; 728 | int *__pyx_v_matrix; 729 | PyObject *__pyx_v_edits1; 730 | PyObject *__pyx_v_edits2; 731 | PyObject *__pyx_v_table; 732 | PyObject *__pyx_v_new_seq1; 733 | PyObject *__pyx_v_new_seq2; 734 | int __pyx_v_s1; 735 | int __pyx_v_s2; 736 | int __pyx_v_gaps; 737 | PyObject *__pyx_r; 738 | PyObject *__pyx_1 = 0; 739 | Py_ssize_t __pyx_2; 740 | PyObject *__pyx_3 = 0; 741 | PyObject *__pyx_4 = 0; 742 | PyObject *__pyx_5 = 0; 743 | int __pyx_6; 744 | long __pyx_7; 745 | static char *__pyx_argnames[] = {"seq1","seq2","S","g","e",0}; 746 | if (!PyArg_ParseTupleAndKeywords(__pyx_args, __pyx_kwds, "OOOii", __pyx_argnames, &__pyx_v_seq1, &__pyx_v_seq2, &__pyx_v_S, &__pyx_v_g, &__pyx_v_e)) return 0; 747 | Py_INCREF(__pyx_v_seq1); 748 | Py_INCREF(__pyx_v_seq2); 749 | Py_INCREF(__pyx_v_S); 750 | __pyx_v_a = ((PyArrayObject *)Py_None); Py_INCREF(Py_None); 751 | __pyx_v_edits1 = Py_None; Py_INCREF(Py_None); 752 | __pyx_v_edits2 = Py_None; Py_INCREF(Py_None); 753 | __pyx_v_table = Py_None; Py_INCREF(Py_None); 754 | __pyx_v_new_seq1 = Py_None; Py_INCREF(Py_None); 755 | __pyx_v_new_seq2 = Py_None; Py_INCREF(Py_None); 756 | 757 | /* "/private/tmp/PI/PI/align.pyx":196 */ 758 | __pyx_1 = PyList_New(0); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 196; goto __pyx_L1;} 759 | Py_DECREF(__pyx_v_edits1); 760 | __pyx_v_edits1 = __pyx_1; 761 | __pyx_1 = 0; 762 | 763 | /* "/private/tmp/PI/PI/align.pyx":197 */ 764 | __pyx_1 = PyList_New(0); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 197; goto __pyx_L1;} 765 | Py_DECREF(__pyx_v_edits2); 766 | __pyx_v_edits2 = __pyx_1; 767 | __pyx_1 = 0; 768 | 769 | /* "/private/tmp/PI/PI/align.pyx":199 */ 770 | __pyx_2 = PyObject_Length(__pyx_v_seq1); if (__pyx_2 == -1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 199; 
goto __pyx_L1;} 771 | __pyx_v_M = (__pyx_2 + 1); 772 | 773 | /* "/private/tmp/PI/PI/align.pyx":200 */ 774 | __pyx_2 = PyObject_Length(__pyx_v_seq2); if (__pyx_2 == -1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 200; goto __pyx_L1;} 775 | __pyx_v_N = (__pyx_2 + 1); 776 | 777 | /* "/private/tmp/PI/PI/align.pyx":202 */ 778 | __pyx_1 = __Pyx_GetName(__pyx_m, __pyx_n_numpy); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 202; goto __pyx_L1;} 779 | __pyx_3 = PyObject_GetAttr(__pyx_1, __pyx_n_zeros); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 202; goto __pyx_L1;} 780 | Py_DECREF(__pyx_1); __pyx_1 = 0; 781 | __pyx_1 = PyInt_FromLong(__pyx_v_M); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 202; goto __pyx_L1;} 782 | __pyx_4 = PyInt_FromLong(__pyx_v_N); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 202; goto __pyx_L1;} 783 | __pyx_5 = PyList_New(2); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 202; goto __pyx_L1;} 784 | PyList_SET_ITEM(__pyx_5, 0, __pyx_1); 785 | PyList_SET_ITEM(__pyx_5, 1, __pyx_4); 786 | __pyx_1 = 0; 787 | __pyx_4 = 0; 788 | __pyx_1 = PyTuple_New(2); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 202; goto __pyx_L1;} 789 | PyTuple_SET_ITEM(__pyx_1, 0, __pyx_5); 790 | Py_INCREF(__pyx_n_i); 791 | PyTuple_SET_ITEM(__pyx_1, 1, __pyx_n_i); 792 | __pyx_5 = 0; 793 | __pyx_4 = PyObject_CallObject(__pyx_3, __pyx_1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 202; goto __pyx_L1;} 794 | Py_DECREF(__pyx_3); __pyx_3 = 0; 795 | Py_DECREF(__pyx_1); __pyx_1 = 0; 796 | Py_DECREF(__pyx_v_table); 797 | __pyx_v_table = __pyx_4; 798 | __pyx_4 = 0; 799 | 800 | /* "/private/tmp/PI/PI/align.pyx":204 */ 801 | __pyx_5 = __Pyx_GetName(__pyx_b, __pyx_n_range); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 204; goto __pyx_L1;} 802 | __pyx_3 = PyInt_FromLong(__pyx_v_M); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 204; goto __pyx_L1;} 803 | __pyx_1 = 
PyTuple_New(1); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 204; goto __pyx_L1;} 804 | PyTuple_SET_ITEM(__pyx_1, 0, __pyx_3); 805 | __pyx_3 = 0; 806 | __pyx_4 = PyObject_CallObject(__pyx_5, __pyx_1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 204; goto __pyx_L1;} 807 | Py_DECREF(__pyx_5); __pyx_5 = 0; 808 | Py_DECREF(__pyx_1); __pyx_1 = 0; 809 | __pyx_3 = PyObject_GetIter(__pyx_4); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 204; goto __pyx_L1;} 810 | Py_DECREF(__pyx_4); __pyx_4 = 0; 811 | for (;;) { 812 | __pyx_5 = PyIter_Next(__pyx_3); 813 | if (!__pyx_5) { 814 | if (PyErr_Occurred()) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 204; goto __pyx_L1;} 815 | break; 816 | } 817 | __pyx_6 = PyInt_AsLong(__pyx_5); if (PyErr_Occurred()) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 204; goto __pyx_L1;} 818 | Py_DECREF(__pyx_5); __pyx_5 = 0; 819 | __pyx_v_i = __pyx_6; 820 | __pyx_1 = PyInt_FromLong((0 - __pyx_v_i)); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 205; goto __pyx_L1;} 821 | __pyx_4 = PyInt_FromLong(__pyx_v_i); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 205; goto __pyx_L1;} 822 | __pyx_5 = PyObject_GetItem(__pyx_v_table, __pyx_4); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 205; goto __pyx_L1;} 823 | Py_DECREF(__pyx_4); __pyx_4 = 0; 824 | __pyx_4 = PyInt_FromLong(0); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 205; goto __pyx_L1;} 825 | if (PyObject_SetItem(__pyx_5, __pyx_4, __pyx_1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 205; goto __pyx_L1;} 826 | Py_DECREF(__pyx_5); __pyx_5 = 0; 827 | Py_DECREF(__pyx_4); __pyx_4 = 0; 828 | Py_DECREF(__pyx_1); __pyx_1 = 0; 829 | } 830 | Py_DECREF(__pyx_3); __pyx_3 = 0; 831 | 832 | /* "/private/tmp/PI/PI/align.pyx":207 */ 833 | __pyx_1 = __Pyx_GetName(__pyx_b, __pyx_n_range); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 207; goto __pyx_L1;} 834 | __pyx_5 = PyInt_FromLong(__pyx_v_N); if (!__pyx_5) 
{__pyx_filename = __pyx_f[0]; __pyx_lineno = 207; goto __pyx_L1;} 835 | __pyx_4 = PyTuple_New(1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 207; goto __pyx_L1;} 836 | PyTuple_SET_ITEM(__pyx_4, 0, __pyx_5); 837 | __pyx_5 = 0; 838 | __pyx_3 = PyObject_CallObject(__pyx_1, __pyx_4); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 207; goto __pyx_L1;} 839 | Py_DECREF(__pyx_1); __pyx_1 = 0; 840 | Py_DECREF(__pyx_4); __pyx_4 = 0; 841 | __pyx_5 = PyObject_GetIter(__pyx_3); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 207; goto __pyx_L1;} 842 | Py_DECREF(__pyx_3); __pyx_3 = 0; 843 | for (;;) { 844 | __pyx_1 = PyIter_Next(__pyx_5); 845 | if (!__pyx_1) { 846 | if (PyErr_Occurred()) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 207; goto __pyx_L1;} 847 | break; 848 | } 849 | __pyx_6 = PyInt_AsLong(__pyx_1); if (PyErr_Occurred()) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 207; goto __pyx_L1;} 850 | Py_DECREF(__pyx_1); __pyx_1 = 0; 851 | __pyx_v_i = __pyx_6; 852 | __pyx_4 = PyInt_FromLong((0 - __pyx_v_i)); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 208; goto __pyx_L1;} 853 | __pyx_3 = PyInt_FromLong(0); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 208; goto __pyx_L1;} 854 | __pyx_1 = PyObject_GetItem(__pyx_v_table, __pyx_3); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 208; goto __pyx_L1;} 855 | Py_DECREF(__pyx_3); __pyx_3 = 0; 856 | __pyx_3 = PyInt_FromLong(__pyx_v_i); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 208; goto __pyx_L1;} 857 | if (PyObject_SetItem(__pyx_1, __pyx_3, __pyx_4) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 208; goto __pyx_L1;} 858 | Py_DECREF(__pyx_1); __pyx_1 = 0; 859 | Py_DECREF(__pyx_3); __pyx_3 = 0; 860 | Py_DECREF(__pyx_4); __pyx_4 = 0; 861 | } 862 | Py_DECREF(__pyx_5); __pyx_5 = 0; 863 | 864 | /* "/private/tmp/PI/PI/align.pyx":210 */ 865 | if (!__Pyx_TypeTest(__pyx_v_table, __pyx_ptype_2PI_5align_ndarray)) {__pyx_filename = __pyx_f[0]; 
__pyx_lineno = 210; goto __pyx_L1;} 866 | Py_INCREF(__pyx_v_table); 867 | Py_DECREF(((PyObject *)__pyx_v_a)); 868 | __pyx_v_a = ((PyArrayObject *)__pyx_v_table); 869 | 870 | /* "/private/tmp/PI/PI/align.pyx":212 */ 871 | __pyx_v_nrows = (__pyx_v_a->dimensions[0]); 872 | 873 | /* "/private/tmp/PI/PI/align.pyx":213 */ 874 | __pyx_v_ncols = (__pyx_v_a->dimensions[1]); 875 | 876 | /* "/private/tmp/PI/PI/align.pyx":215 */ 877 | __pyx_v_matrix = ((int *)__pyx_v_a->data); 878 | 879 | /* "/private/tmp/PI/PI/align.pyx":218 */ 880 | for (__pyx_v_i = 1; __pyx_v_i < __pyx_v_M; ++__pyx_v_i) { 881 | for (__pyx_v_j = 1; __pyx_v_j < __pyx_v_N; ++__pyx_v_j) { 882 | __pyx_4 = PyInt_FromLong((__pyx_v_i - 1)); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 220; goto __pyx_L1;} 883 | __pyx_1 = PyObject_GetItem(__pyx_v_seq1, __pyx_4); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 220; goto __pyx_L1;} 884 | Py_DECREF(__pyx_4); __pyx_4 = 0; 885 | __pyx_3 = PyInt_FromLong((__pyx_v_j - 1)); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 220; goto __pyx_L1;} 886 | __pyx_5 = PyObject_GetItem(__pyx_v_seq2, __pyx_3); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 220; goto __pyx_L1;} 887 | Py_DECREF(__pyx_3); __pyx_3 = 0; 888 | if (PyObject_Cmp(__pyx_1, __pyx_5, &__pyx_6) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 220; goto __pyx_L1;} 889 | __pyx_6 = __pyx_6 == 0; 890 | Py_DECREF(__pyx_1); __pyx_1 = 0; 891 | Py_DECREF(__pyx_5); __pyx_5 = 0; 892 | if (__pyx_6) { 893 | (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) = 2; 894 | goto __pyx_L10; 895 | } 896 | /*else*/ { 897 | (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) = (-1); 898 | } 899 | __pyx_L10:; 900 | } 901 | } 902 | 903 | /* "/private/tmp/PI/PI/align.pyx":226 */ 904 | __pyx_v_i_max = 0; 905 | 906 | /* "/private/tmp/PI/PI/align.pyx":227 */ 907 | __pyx_v_j_max = 0; 908 | 909 | /* "/private/tmp/PI/PI/align.pyx":228 */ 910 | __pyx_v_t_max = 0; 911 | 912 | /* 
"/private/tmp/PI/PI/align.pyx":230 */ 913 | for (__pyx_v_i = 1; __pyx_v_i < __pyx_v_M; ++__pyx_v_i) { 914 | for (__pyx_v_j = 1; __pyx_v_j < __pyx_v_N; ++__pyx_v_j) { 915 | 916 | /* "/private/tmp/PI/PI/align.pyx":233 */ 917 | __pyx_v_dir = 0; 918 | 919 | /* "/private/tmp/PI/PI/align.pyx":235 */ 920 | __pyx_v_v1 = (__pyx_v_matrix[(((__pyx_v_i - 1) * __pyx_v_ncols) + (__pyx_v_j - 1))]); 921 | 922 | /* "/private/tmp/PI/PI/align.pyx":236 */ 923 | __pyx_v_v2 = (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + (__pyx_v_j - 1))]); 924 | 925 | /* "/private/tmp/PI/PI/align.pyx":237 */ 926 | __pyx_v_v3 = (__pyx_v_matrix[(((__pyx_v_i - 1) * __pyx_v_ncols) + __pyx_v_j)]); 927 | 928 | /* "/private/tmp/PI/PI/align.pyx":239 */ 929 | __pyx_6 = (__pyx_v_v1 > 255); 930 | if (!__pyx_6) { 931 | __pyx_4 = PyInt_FromLong((__pyx_v_v1 & 0xffffff00)); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 239; goto __pyx_L1;} 932 | if (PyObject_Cmp(__pyx_4, Py_False, &__pyx_6) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 239; goto __pyx_L1;} 933 | __pyx_6 = __pyx_6 == 0; 934 | Py_DECREF(__pyx_4); __pyx_4 = 0; 935 | } 936 | if (__pyx_6) { 937 | __pyx_v_v1 = (__pyx_v_v1 >> 8); 938 | goto __pyx_L15; 939 | } 940 | __pyx_L15:; 941 | 942 | /* "/private/tmp/PI/PI/align.pyx":241 */ 943 | __pyx_6 = (__pyx_v_v2 > 255); 944 | if (!__pyx_6) { 945 | __pyx_3 = PyInt_FromLong((__pyx_v_v1 & 0xffffff00)); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 241; goto __pyx_L1;} 946 | if (PyObject_Cmp(__pyx_3, Py_False, &__pyx_6) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 241; goto __pyx_L1;} 947 | __pyx_6 = __pyx_6 == 0; 948 | Py_DECREF(__pyx_3); __pyx_3 = 0; 949 | } 950 | if (__pyx_6) { 951 | __pyx_v_v2 = (__pyx_v_v2 >> 8); 952 | goto __pyx_L16; 953 | } 954 | __pyx_L16:; 955 | 956 | /* "/private/tmp/PI/PI/align.pyx":243 */ 957 | __pyx_6 = (__pyx_v_v3 > 255); 958 | if (!__pyx_6) { 959 | __pyx_1 = PyInt_FromLong((__pyx_v_v1 & 0xffffff00)); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; 
__pyx_lineno = 243; goto __pyx_L1;} 960 | if (PyObject_Cmp(__pyx_1, Py_False, &__pyx_6) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 243; goto __pyx_L1;} 961 | __pyx_6 = __pyx_6 == 0; 962 | Py_DECREF(__pyx_1); __pyx_1 = 0; 963 | } 964 | if (__pyx_6) { 965 | __pyx_v_v3 = (__pyx_v_v3 >> 8); 966 | goto __pyx_L17; 967 | } 968 | __pyx_L17:; 969 | 970 | /* "/private/tmp/PI/PI/align.pyx":246 */ 971 | __pyx_v_v1 = (__pyx_v_v1 + (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)])); 972 | 973 | /* "/private/tmp/PI/PI/align.pyx":247 */ 974 | __pyx_v_v2 = (__pyx_v_v2 - 2); 975 | 976 | /* "/private/tmp/PI/PI/align.pyx":248 */ 977 | __pyx_v_v3 = (__pyx_v_v3 - 2); 978 | 979 | /* "/private/tmp/PI/PI/align.pyx":250 */ 980 | __pyx_6 = (__pyx_v_v1 > 0); 981 | if (__pyx_6) { 982 | __pyx_v_m = __pyx_v_v1; 983 | goto __pyx_L18; 984 | } 985 | /*else*/ { 986 | __pyx_v_m = 0; 987 | } 988 | __pyx_L18:; 989 | 990 | /* "/private/tmp/PI/PI/align.pyx":255 */ 991 | __pyx_6 = (__pyx_v_v2 > __pyx_v_m); 992 | if (__pyx_6) { 993 | __pyx_v_m = __pyx_v_v2; 994 | goto __pyx_L19; 995 | } 996 | __pyx_L19:; 997 | 998 | /* "/private/tmp/PI/PI/align.pyx":258 */ 999 | __pyx_6 = (__pyx_v_v3 > __pyx_v_m); 1000 | if (__pyx_6) { 1001 | __pyx_v_m = __pyx_v_v3; 1002 | goto __pyx_L20; 1003 | } 1004 | __pyx_L20:; 1005 | 1006 | /* "/private/tmp/PI/PI/align.pyx":261 */ 1007 | __pyx_6 = (__pyx_v_m == __pyx_v_v1); 1008 | if (__pyx_6) { 1009 | __pyx_v_dir = (__pyx_v_dir | (1 << 0)); 1010 | goto __pyx_L21; 1011 | } 1012 | __pyx_L21:; 1013 | 1014 | /* "/private/tmp/PI/PI/align.pyx":264 */ 1015 | __pyx_6 = (__pyx_v_m == __pyx_v_v2); 1016 | if (__pyx_6) { 1017 | __pyx_v_dir = (__pyx_v_dir | (1 << 1)); 1018 | goto __pyx_L22; 1019 | } 1020 | __pyx_L22:; 1021 | 1022 | /* "/private/tmp/PI/PI/align.pyx":267 */ 1023 | __pyx_6 = (__pyx_v_m == __pyx_v_v3); 1024 | if (__pyx_6) { 1025 | __pyx_v_dir = (__pyx_v_dir | (1 << 2)); 1026 | goto __pyx_L23; 1027 | } 1028 | __pyx_L23:; 1029 | 1030 | /* 
"/private/tmp/PI/PI/align.pyx":270 */ 1031 | __pyx_6 = (__pyx_v_m >= __pyx_v_t_max); 1032 | if (__pyx_6) { 1033 | 1034 | /* "/private/tmp/PI/PI/align.pyx":271 */ 1035 | __pyx_v_t_max = __pyx_v_m; 1036 | 1037 | /* "/private/tmp/PI/PI/align.pyx":272 */ 1038 | __pyx_v_i_max = __pyx_v_i; 1039 | 1040 | /* "/private/tmp/PI/PI/align.pyx":273 */ 1041 | __pyx_v_j_max = __pyx_v_j; 1042 | goto __pyx_L24; 1043 | } 1044 | __pyx_L24:; 1045 | 1046 | /* "/private/tmp/PI/PI/align.pyx":275 */ 1047 | __pyx_v_m = (__pyx_v_m << 8); 1048 | 1049 | /* "/private/tmp/PI/PI/align.pyx":276 */ 1050 | __pyx_v_m = (__pyx_v_m | __pyx_v_dir); 1051 | 1052 | /* "/private/tmp/PI/PI/align.pyx":278 */ 1053 | (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) = __pyx_v_m; 1054 | } 1055 | } 1056 | 1057 | /* "/private/tmp/PI/PI/align.pyx":281 */ 1058 | __pyx_v_i = __pyx_v_i_max; 1059 | 1060 | /* "/private/tmp/PI/PI/align.pyx":282 */ 1061 | __pyx_v_j = __pyx_v_j_max; 1062 | 1063 | /* "/private/tmp/PI/PI/align.pyx":284 */ 1064 | __pyx_v_new_len = 0; 1065 | 1066 | /* "/private/tmp/PI/PI/align.pyx":286 */ 1067 | while (1) { 1068 | __pyx_6 = ((__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) > 0); 1069 | if (!__pyx_6) break; 1070 | 1071 | /* "/private/tmp/PI/PI/align.pyx":287 */ 1072 | __pyx_v_data = (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]); 1073 | 1074 | /* "/private/tmp/PI/PI/align.pyx":289 */ 1075 | __pyx_7 = (__pyx_v_data & (1 << 2)); 1076 | if (__pyx_7) { 1077 | 1078 | /* "/private/tmp/PI/PI/align.pyx":290 */ 1079 | __pyx_v_ni = (__pyx_v_i - 1); 1080 | 1081 | /* "/private/tmp/PI/PI/align.pyx":291 */ 1082 | __pyx_v_nj = __pyx_v_j; 1083 | goto __pyx_L27; 1084 | } 1085 | __pyx_L27:; 1086 | 1087 | /* "/private/tmp/PI/PI/align.pyx":293 */ 1088 | __pyx_7 = (__pyx_v_data & (1 << 1)); 1089 | if (__pyx_7) { 1090 | 1091 | /* "/private/tmp/PI/PI/align.pyx":294 */ 1092 | __pyx_v_ni = __pyx_v_i; 1093 | 1094 | /* "/private/tmp/PI/PI/align.pyx":295 */ 1095 | __pyx_v_nj = (__pyx_v_j 
- 1); 1096 | goto __pyx_L28; 1097 | } 1098 | __pyx_L28:; 1099 | 1100 | /* "/private/tmp/PI/PI/align.pyx":297 */ 1101 | __pyx_7 = (__pyx_v_data & (1 << 0)); 1102 | if (__pyx_7) { 1103 | 1104 | /* "/private/tmp/PI/PI/align.pyx":298 */ 1105 | __pyx_v_ni = (__pyx_v_i - 1); 1106 | 1107 | /* "/private/tmp/PI/PI/align.pyx":299 */ 1108 | __pyx_v_nj = (__pyx_v_j - 1); 1109 | goto __pyx_L29; 1110 | } 1111 | __pyx_L29:; 1112 | 1113 | /* "/private/tmp/PI/PI/align.pyx":301 */ 1114 | __pyx_v_new_len = (__pyx_v_new_len + 1); 1115 | 1116 | /* "/private/tmp/PI/PI/align.pyx":303 */ 1117 | __pyx_v_i = __pyx_v_ni; 1118 | 1119 | /* "/private/tmp/PI/PI/align.pyx":304 */ 1120 | __pyx_v_j = __pyx_v_nj; 1121 | } 1122 | 1123 | /* "/private/tmp/PI/PI/align.pyx":306 */ 1124 | __pyx_5 = __Pyx_GetName(__pyx_m, __pyx_n_numpy); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 306; goto __pyx_L1;} 1125 | __pyx_4 = PyObject_GetAttr(__pyx_5, __pyx_n_zeros); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 306; goto __pyx_L1;} 1126 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1127 | __pyx_3 = PyInt_FromLong(__pyx_v_new_len); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 306; goto __pyx_L1;} 1128 | __pyx_1 = PyTuple_New(2); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 306; goto __pyx_L1;} 1129 | PyTuple_SET_ITEM(__pyx_1, 0, __pyx_3); 1130 | Py_INCREF(__pyx_n_i); 1131 | PyTuple_SET_ITEM(__pyx_1, 1, __pyx_n_i); 1132 | __pyx_3 = 0; 1133 | __pyx_5 = PyObject_CallObject(__pyx_4, __pyx_1); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 306; goto __pyx_L1;} 1134 | Py_DECREF(__pyx_4); __pyx_4 = 0; 1135 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1136 | Py_DECREF(__pyx_v_new_seq1); 1137 | __pyx_v_new_seq1 = __pyx_5; 1138 | __pyx_5 = 0; 1139 | 1140 | /* "/private/tmp/PI/PI/align.pyx":307 */ 1141 | __pyx_3 = __Pyx_GetName(__pyx_m, __pyx_n_numpy); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 307; goto __pyx_L1;} 1142 | __pyx_4 = PyObject_GetAttr(__pyx_3, 
__pyx_n_zeros); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 307; goto __pyx_L1;} 1143 | Py_DECREF(__pyx_3); __pyx_3 = 0; 1144 | __pyx_1 = PyInt_FromLong(__pyx_v_new_len); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 307; goto __pyx_L1;} 1145 | __pyx_5 = PyTuple_New(2); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 307; goto __pyx_L1;} 1146 | PyTuple_SET_ITEM(__pyx_5, 0, __pyx_1); 1147 | Py_INCREF(__pyx_n_i); 1148 | PyTuple_SET_ITEM(__pyx_5, 1, __pyx_n_i); 1149 | __pyx_1 = 0; 1150 | __pyx_3 = PyObject_CallObject(__pyx_4, __pyx_5); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 307; goto __pyx_L1;} 1151 | Py_DECREF(__pyx_4); __pyx_4 = 0; 1152 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1153 | Py_DECREF(__pyx_v_new_seq2); 1154 | __pyx_v_new_seq2 = __pyx_3; 1155 | __pyx_3 = 0; 1156 | 1157 | /* "/private/tmp/PI/PI/align.pyx":310 */ 1158 | __pyx_v_s1 = __pyx_v_new_len; 1159 | __pyx_v_s2 = __pyx_v_new_len; 1160 | 1161 | /* "/private/tmp/PI/PI/align.pyx":311 */ 1162 | __pyx_v_gaps = 0; 1163 | 1164 | /* "/private/tmp/PI/PI/align.pyx":313 */ 1165 | __pyx_v_i = __pyx_v_i_max; 1166 | 1167 | /* "/private/tmp/PI/PI/align.pyx":314 */ 1168 | __pyx_v_j = __pyx_v_j_max; 1169 | 1170 | /* "/private/tmp/PI/PI/align.pyx":316 */ 1171 | while (1) { 1172 | __pyx_6 = ((__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]) > 0); 1173 | if (!__pyx_6) break; 1174 | 1175 | /* "/private/tmp/PI/PI/align.pyx":317 */ 1176 | __pyx_v_data = (__pyx_v_matrix[((__pyx_v_i * __pyx_v_ncols) + __pyx_v_j)]); 1177 | 1178 | /* "/private/tmp/PI/PI/align.pyx":319 */ 1179 | __pyx_7 = (__pyx_v_data & (1 << 2)); 1180 | if (__pyx_7) { 1181 | 1182 | /* "/private/tmp/PI/PI/align.pyx":320 */ 1183 | __pyx_v_ni = (__pyx_v_i - 1); 1184 | 1185 | /* "/private/tmp/PI/PI/align.pyx":321 */ 1186 | __pyx_v_nj = __pyx_v_j; 1187 | 1188 | /* "/private/tmp/PI/PI/align.pyx":322 */ 1189 | __pyx_v_dir = (1 << 2); 1190 | goto __pyx_L32; 1191 | } 1192 | __pyx_L32:; 1193 | 1194 | /* 
"/private/tmp/PI/PI/align.pyx":324 */ 1195 | __pyx_7 = (__pyx_v_data & (1 << 1)); 1196 | if (__pyx_7) { 1197 | 1198 | /* "/private/tmp/PI/PI/align.pyx":325 */ 1199 | __pyx_v_ni = __pyx_v_i; 1200 | 1201 | /* "/private/tmp/PI/PI/align.pyx":326 */ 1202 | __pyx_v_nj = (__pyx_v_j - 1); 1203 | 1204 | /* "/private/tmp/PI/PI/align.pyx":327 */ 1205 | __pyx_v_dir = (1 << 1); 1206 | goto __pyx_L33; 1207 | } 1208 | __pyx_L33:; 1209 | 1210 | /* "/private/tmp/PI/PI/align.pyx":329 */ 1211 | __pyx_7 = (__pyx_v_data & (1 << 0)); 1212 | if (__pyx_7) { 1213 | 1214 | /* "/private/tmp/PI/PI/align.pyx":330 */ 1215 | __pyx_v_ni = (__pyx_v_i - 1); 1216 | 1217 | /* "/private/tmp/PI/PI/align.pyx":331 */ 1218 | __pyx_v_nj = (__pyx_v_j - 1); 1219 | 1220 | /* "/private/tmp/PI/PI/align.pyx":332 */ 1221 | __pyx_v_dir = (1 << 0); 1222 | goto __pyx_L34; 1223 | } 1224 | __pyx_L34:; 1225 | 1226 | /* "/private/tmp/PI/PI/align.pyx":334 */ 1227 | __pyx_6 = (__pyx_v_dir == (1 << 0)); 1228 | if (__pyx_6) { 1229 | 1230 | /* "/private/tmp/PI/PI/align.pyx":335 */ 1231 | __pyx_v_s1 = (__pyx_v_s1 - 1); 1232 | 1233 | /* "/private/tmp/PI/PI/align.pyx":336 */ 1234 | __pyx_v_s2 = (__pyx_v_s2 - 1); 1235 | 1236 | /* "/private/tmp/PI/PI/align.pyx":337 */ 1237 | __pyx_1 = PyInt_FromLong((__pyx_v_i - 1)); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 337; goto __pyx_L1;} 1238 | __pyx_4 = PyObject_GetItem(__pyx_v_seq1, __pyx_1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 337; goto __pyx_L1;} 1239 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1240 | __pyx_5 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 337; goto __pyx_L1;} 1241 | if (PyObject_SetItem(__pyx_v_new_seq1, __pyx_5, __pyx_4) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 337; goto __pyx_L1;} 1242 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1243 | Py_DECREF(__pyx_4); __pyx_4 = 0; 1244 | 1245 | /* "/private/tmp/PI/PI/align.pyx":338 */ 1246 | __pyx_3 = PyInt_FromLong((__pyx_v_j - 1)); if (!__pyx_3) 
{__pyx_filename = __pyx_f[0]; __pyx_lineno = 338; goto __pyx_L1;} 1247 | __pyx_1 = PyObject_GetItem(__pyx_v_seq2, __pyx_3); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 338; goto __pyx_L1;} 1248 | Py_DECREF(__pyx_3); __pyx_3 = 0; 1249 | __pyx_4 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 338; goto __pyx_L1;} 1250 | if (PyObject_SetItem(__pyx_v_new_seq2, __pyx_4, __pyx_1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 338; goto __pyx_L1;} 1251 | Py_DECREF(__pyx_4); __pyx_4 = 0; 1252 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1253 | goto __pyx_L35; 1254 | } 1255 | __pyx_L35:; 1256 | 1257 | /* "/private/tmp/PI/PI/align.pyx":340 */ 1258 | __pyx_6 = (__pyx_v_dir == (1 << 1)); 1259 | if (__pyx_6) { 1260 | 1261 | /* "/private/tmp/PI/PI/align.pyx":341 */ 1262 | __pyx_v_s1 = (__pyx_v_s1 - 1); 1263 | 1264 | /* "/private/tmp/PI/PI/align.pyx":342 */ 1265 | __pyx_v_s2 = (__pyx_v_s2 - 1); 1266 | 1267 | /* "/private/tmp/PI/PI/align.pyx":344 */ 1268 | __pyx_5 = PyObject_GetAttr(__pyx_v_edits1, __pyx_n_append); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 344; goto __pyx_L1;} 1269 | __pyx_3 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 344; goto __pyx_L1;} 1270 | __pyx_1 = PyTuple_New(1); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 344; goto __pyx_L1;} 1271 | PyTuple_SET_ITEM(__pyx_1, 0, __pyx_3); 1272 | __pyx_3 = 0; 1273 | __pyx_4 = PyObject_CallObject(__pyx_5, __pyx_1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 344; goto __pyx_L1;} 1274 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1275 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1276 | Py_DECREF(__pyx_4); __pyx_4 = 0; 1277 | 1278 | /* "/private/tmp/PI/PI/align.pyx":345 */ 1279 | __pyx_3 = PyInt_FromLong(256); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 345; goto __pyx_L1;} 1280 | __pyx_5 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 345; goto 
__pyx_L1;} 1281 | if (PyObject_SetItem(__pyx_v_new_seq1, __pyx_5, __pyx_3) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 345; goto __pyx_L1;} 1282 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1283 | Py_DECREF(__pyx_3); __pyx_3 = 0; 1284 | 1285 | /* "/private/tmp/PI/PI/align.pyx":346 */ 1286 | __pyx_1 = PyInt_FromLong((__pyx_v_j - 1)); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 346; goto __pyx_L1;} 1287 | __pyx_4 = PyObject_GetItem(__pyx_v_seq2, __pyx_1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 346; goto __pyx_L1;} 1288 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1289 | __pyx_3 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 346; goto __pyx_L1;} 1290 | if (PyObject_SetItem(__pyx_v_new_seq2, __pyx_3, __pyx_4) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 346; goto __pyx_L1;} 1291 | Py_DECREF(__pyx_3); __pyx_3 = 0; 1292 | Py_DECREF(__pyx_4); __pyx_4 = 0; 1293 | 1294 | /* "/private/tmp/PI/PI/align.pyx":347 */ 1295 | __pyx_v_gaps = (__pyx_v_gaps + 1); 1296 | goto __pyx_L36; 1297 | } 1298 | __pyx_L36:; 1299 | 1300 | /* "/private/tmp/PI/PI/align.pyx":349 */ 1301 | __pyx_6 = (__pyx_v_dir == (1 << 2)); 1302 | if (__pyx_6) { 1303 | 1304 | /* "/private/tmp/PI/PI/align.pyx":350 */ 1305 | __pyx_v_s1 = (__pyx_v_s1 - 1); 1306 | 1307 | /* "/private/tmp/PI/PI/align.pyx":351 */ 1308 | __pyx_v_s2 = (__pyx_v_s2 - 1); 1309 | 1310 | /* "/private/tmp/PI/PI/align.pyx":353 */ 1311 | __pyx_5 = PyInt_FromLong((__pyx_v_i - 1)); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 353; goto __pyx_L1;} 1312 | __pyx_1 = PyObject_GetItem(__pyx_v_seq1, __pyx_5); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 353; goto __pyx_L1;} 1313 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1314 | __pyx_4 = PyInt_FromLong(__pyx_v_s1); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 353; goto __pyx_L1;} 1315 | if (PyObject_SetItem(__pyx_v_new_seq1, __pyx_4, __pyx_1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 353; goto 
__pyx_L1;} 1316 | Py_DECREF(__pyx_4); __pyx_4 = 0; 1317 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1318 | 1319 | /* "/private/tmp/PI/PI/align.pyx":354 */ 1320 | __pyx_3 = PyInt_FromLong(256); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 354; goto __pyx_L1;} 1321 | __pyx_5 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 354; goto __pyx_L1;} 1322 | if (PyObject_SetItem(__pyx_v_new_seq2, __pyx_5, __pyx_3) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 354; goto __pyx_L1;} 1323 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1324 | Py_DECREF(__pyx_3); __pyx_3 = 0; 1325 | 1326 | /* "/private/tmp/PI/PI/align.pyx":355 */ 1327 | __pyx_1 = PyObject_GetAttr(__pyx_v_edits2, __pyx_n_append); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 355; goto __pyx_L1;} 1328 | __pyx_4 = PyInt_FromLong(__pyx_v_s2); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 355; goto __pyx_L1;} 1329 | __pyx_3 = PyTuple_New(1); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 355; goto __pyx_L1;} 1330 | PyTuple_SET_ITEM(__pyx_3, 0, __pyx_4); 1331 | __pyx_4 = 0; 1332 | __pyx_5 = PyObject_CallObject(__pyx_1, __pyx_3); if (!__pyx_5) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 355; goto __pyx_L1;} 1333 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1334 | Py_DECREF(__pyx_3); __pyx_3 = 0; 1335 | Py_DECREF(__pyx_5); __pyx_5 = 0; 1336 | 1337 | /* "/private/tmp/PI/PI/align.pyx":356 */ 1338 | __pyx_v_gaps = (__pyx_v_gaps + 1); 1339 | goto __pyx_L37; 1340 | } 1341 | __pyx_L37:; 1342 | 1343 | /* "/private/tmp/PI/PI/align.pyx":358 */ 1344 | __pyx_v_i = __pyx_v_ni; 1345 | 1346 | /* "/private/tmp/PI/PI/align.pyx":359 */ 1347 | __pyx_v_j = __pyx_v_nj; 1348 | } 1349 | 1350 | /* "/private/tmp/PI/PI/align.pyx":361 */ 1351 | __pyx_4 = PyInt_FromLong(__pyx_v_t_max); if (!__pyx_4) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 361; goto __pyx_L1;} 1352 | __pyx_1 = PyInt_FromLong(__pyx_v_gaps); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 361; goto 
__pyx_L1;} 1353 | __pyx_3 = PyTuple_New(6); if (!__pyx_3) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 361; goto __pyx_L1;} 1354 | Py_INCREF(__pyx_v_new_seq1); 1355 | PyTuple_SET_ITEM(__pyx_3, 0, __pyx_v_new_seq1); 1356 | Py_INCREF(__pyx_v_new_seq2); 1357 | PyTuple_SET_ITEM(__pyx_3, 1, __pyx_v_new_seq2); 1358 | Py_INCREF(__pyx_v_edits1); 1359 | PyTuple_SET_ITEM(__pyx_3, 2, __pyx_v_edits1); 1360 | Py_INCREF(__pyx_v_edits2); 1361 | PyTuple_SET_ITEM(__pyx_3, 3, __pyx_v_edits2); 1362 | PyTuple_SET_ITEM(__pyx_3, 4, __pyx_4); 1363 | PyTuple_SET_ITEM(__pyx_3, 5, __pyx_1); 1364 | __pyx_4 = 0; 1365 | __pyx_1 = 0; 1366 | __pyx_r = __pyx_3; 1367 | __pyx_3 = 0; 1368 | goto __pyx_L0; 1369 | 1370 | __pyx_r = Py_None; Py_INCREF(Py_None); 1371 | goto __pyx_L0; 1372 | __pyx_L1:; 1373 | Py_XDECREF(__pyx_1); 1374 | Py_XDECREF(__pyx_3); 1375 | Py_XDECREF(__pyx_4); 1376 | Py_XDECREF(__pyx_5); 1377 | __Pyx_AddTraceback("PI.align.SmithWaterman"); 1378 | __pyx_r = 0; 1379 | __pyx_L0:; 1380 | Py_DECREF(__pyx_v_a); 1381 | Py_DECREF(__pyx_v_edits1); 1382 | Py_DECREF(__pyx_v_edits2); 1383 | Py_DECREF(__pyx_v_table); 1384 | Py_DECREF(__pyx_v_new_seq1); 1385 | Py_DECREF(__pyx_v_new_seq2); 1386 | Py_DECREF(__pyx_v_seq1); 1387 | Py_DECREF(__pyx_v_seq2); 1388 | Py_DECREF(__pyx_v_S); 1389 | return __pyx_r; 1390 | } 1391 | 1392 | static struct PyMethodDef __pyx_methods[] = { 1393 | {"NeedlemanWunsch", (PyCFunction)__pyx_f_2PI_5align_NeedlemanWunsch, METH_VARARGS|METH_KEYWORDS, 0}, 1394 | {"SmithWaterman", (PyCFunction)__pyx_f_2PI_5align_SmithWaterman, METH_VARARGS|METH_KEYWORDS, 0}, 1395 | {0, 0, 0, 0} 1396 | }; 1397 | 1398 | static void __pyx_init_filenames(void); /*proto*/ 1399 | 1400 | PyMODINIT_FUNC initalign(void); /*proto*/ 1401 | PyMODINIT_FUNC initalign(void) { 1402 | PyObject *__pyx_1 = 0; 1403 | __pyx_init_filenames(); 1404 | __pyx_m = Py_InitModule4("align", __pyx_methods, 0, 0, PYTHON_API_VERSION); 1405 | if (!__pyx_m) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; goto __pyx_L1;}; 1406 
| Py_INCREF(__pyx_m); 1407 | __pyx_b = PyImport_AddModule("__builtin__"); 1408 | if (!__pyx_b) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; goto __pyx_L1;}; 1409 | if (PyObject_SetAttrString(__pyx_m, "__builtins__", __pyx_b) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; goto __pyx_L1;}; 1410 | if (__Pyx_InitStrings(__pyx_string_tab) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; goto __pyx_L1;}; 1411 | __pyx_ptype_2PI_5align_ndarray = __Pyx_ImportType("numpy", "ndarray", sizeof(PyArrayObject)); if (!__pyx_ptype_2PI_5align_ndarray) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 7; goto __pyx_L1;} 1412 | 1413 | /* "/private/tmp/PI/PI/align.pyx":15 */ 1414 | __pyx_1 = __Pyx_Import(__pyx_n_numpy, 0); if (!__pyx_1) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 15; goto __pyx_L1;} 1415 | if (PyObject_SetAttr(__pyx_m, __pyx_n_numpy, __pyx_1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 15; goto __pyx_L1;} 1416 | Py_DECREF(__pyx_1); __pyx_1 = 0; 1417 | 1418 | /* "/private/tmp/PI/PI/align.pyx":187 */ 1419 | return; 1420 | __pyx_L1:; 1421 | Py_XDECREF(__pyx_1); 1422 | __Pyx_AddTraceback("PI.align"); 1423 | } 1424 | 1425 | static char *__pyx_filenames[] = { 1426 | "align.pyx", 1427 | }; 1428 | 1429 | /* Runtime support code */ 1430 | 1431 | static void __pyx_init_filenames(void) { 1432 | __pyx_f = __pyx_filenames; 1433 | } 1434 | 1435 | static PyObject *__Pyx_GetName(PyObject *dict, PyObject *name) { 1436 | PyObject *result; 1437 | result = PyObject_GetAttr(dict, name); 1438 | if (!result) 1439 | PyErr_SetObject(PyExc_NameError, name); 1440 | return result; 1441 | } 1442 | 1443 | static int __Pyx_TypeTest(PyObject *obj, PyTypeObject *type) { 1444 | if (!type) { 1445 | PyErr_Format(PyExc_SystemError, "Missing type object"); 1446 | return 0; 1447 | } 1448 | if (obj == Py_None || PyObject_TypeCheck(obj, type)) 1449 | return 1; 1450 | PyErr_Format(PyExc_TypeError, "Cannot convert %s to %s", 1451 | obj->ob_type->tp_name, type->tp_name); 1452 | return 0; 1453 | } 
1454 | 1455 | static int __Pyx_InitStrings(__Pyx_StringTabEntry *t) { 1456 | while (t->p) { 1457 | *t->p = PyString_FromStringAndSize(t->s, t->n - 1); 1458 | if (!*t->p) 1459 | return -1; 1460 | if (t->i) 1461 | PyString_InternInPlace(t->p); 1462 | ++t; 1463 | } 1464 | return 0; 1465 | } 1466 | 1467 | #ifndef __PYX_HAVE_RT_ImportType 1468 | #define __PYX_HAVE_RT_ImportType 1469 | static PyTypeObject *__Pyx_ImportType(char *module_name, char *class_name, 1470 | long size) 1471 | { 1472 | PyObject *py_module = 0; 1473 | PyObject *result = 0; 1474 | 1475 | py_module = __Pyx_ImportModule(module_name); 1476 | if (!py_module) 1477 | goto bad; 1478 | result = PyObject_GetAttrString(py_module, class_name); 1479 | if (!result) 1480 | goto bad; 1481 | if (!PyType_Check(result)) { 1482 | PyErr_Format(PyExc_TypeError, 1483 | "%s.%s is not a type object", 1484 | module_name, class_name); 1485 | goto bad; 1486 | } 1487 | if (((PyTypeObject *)result)->tp_basicsize != size) { 1488 | PyErr_Format(PyExc_ValueError, 1489 | "%s.%s does not appear to be the correct type object", 1490 | module_name, class_name); 1491 | goto bad; 1492 | } 1493 | return (PyTypeObject *)result; 1494 | bad: 1495 | Py_XDECREF(result); 1496 | return 0; 1497 | } 1498 | #endif 1499 | 1500 | #ifndef __PYX_HAVE_RT_ImportModule 1501 | #define __PYX_HAVE_RT_ImportModule 1502 | static PyObject *__Pyx_ImportModule(char *name) { 1503 | PyObject *py_name = 0; 1504 | 1505 | py_name = PyString_FromString(name); 1506 | if (!py_name) 1507 | goto bad; 1508 | return PyImport_Import(py_name); 1509 | bad: 1510 | Py_XDECREF(py_name); 1511 | return 0; 1512 | } 1513 | #endif 1514 | 1515 | static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list) { 1516 | PyObject *__import__ = 0; 1517 | PyObject *empty_list = 0; 1518 | PyObject *module = 0; 1519 | PyObject *global_dict = 0; 1520 | PyObject *empty_dict = 0; 1521 | PyObject *list; 1522 | __import__ = PyObject_GetAttrString(__pyx_b, "__import__"); 1523 | if (!__import__) 
1524 | goto bad; 1525 | if (from_list) 1526 | list = from_list; 1527 | else { 1528 | empty_list = PyList_New(0); 1529 | if (!empty_list) 1530 | goto bad; 1531 | list = empty_list; 1532 | } 1533 | global_dict = PyModule_GetDict(__pyx_m); 1534 | if (!global_dict) 1535 | goto bad; 1536 | empty_dict = PyDict_New(); 1537 | if (!empty_dict) 1538 | goto bad; 1539 | module = PyObject_CallFunction(__import__, "OOOO", 1540 | name, global_dict, empty_dict, list); 1541 | bad: 1542 | Py_XDECREF(empty_list); 1543 | Py_XDECREF(__import__); 1544 | Py_XDECREF(empty_dict); 1545 | return module; 1546 | } 1547 | 1548 | #include "compile.h" 1549 | #include "frameobject.h" 1550 | #include "traceback.h" 1551 | 1552 | static void __Pyx_AddTraceback(char *funcname) { 1553 | PyObject *py_srcfile = 0; 1554 | PyObject *py_funcname = 0; 1555 | PyObject *py_globals = 0; 1556 | PyObject *empty_tuple = 0; 1557 | PyObject *empty_string = 0; 1558 | PyCodeObject *py_code = 0; 1559 | PyFrameObject *py_frame = 0; 1560 | 1561 | py_srcfile = PyString_FromString(__pyx_filename); 1562 | if (!py_srcfile) goto bad; 1563 | py_funcname = PyString_FromString(funcname); 1564 | if (!py_funcname) goto bad; 1565 | py_globals = PyModule_GetDict(__pyx_m); 1566 | if (!py_globals) goto bad; 1567 | empty_tuple = PyTuple_New(0); 1568 | if (!empty_tuple) goto bad; 1569 | empty_string = PyString_FromString(""); 1570 | if (!empty_string) goto bad; 1571 | py_code = PyCode_New( 1572 | 0, /*int argcount,*/ 1573 | 0, /*int nlocals,*/ 1574 | 0, /*int stacksize,*/ 1575 | 0, /*int flags,*/ 1576 | empty_string, /*PyObject *code,*/ 1577 | empty_tuple, /*PyObject *consts,*/ 1578 | empty_tuple, /*PyObject *names,*/ 1579 | empty_tuple, /*PyObject *varnames,*/ 1580 | empty_tuple, /*PyObject *freevars,*/ 1581 | empty_tuple, /*PyObject *cellvars,*/ 1582 | py_srcfile, /*PyObject *filename,*/ 1583 | py_funcname, /*PyObject *name,*/ 1584 | __pyx_lineno, /*int firstlineno,*/ 1585 | empty_string /*PyObject *lnotab*/ 1586 | ); 1587 | if 
(!py_code) goto bad; 1588 | py_frame = PyFrame_New( 1589 | PyThreadState_Get(), /*PyThreadState *tstate,*/ 1590 | py_code, /*PyCodeObject *code,*/ 1591 | py_globals, /*PyObject *globals,*/ 1592 | 0 /*PyObject *locals*/ 1593 | ); 1594 | if (!py_frame) goto bad; 1595 | py_frame->f_lineno = __pyx_lineno; 1596 | PyTraceBack_Here(py_frame); 1597 | bad: 1598 | Py_XDECREF(py_srcfile); 1599 | Py_XDECREF(py_funcname); 1600 | Py_XDECREF(empty_tuple); 1601 | Py_XDECREF(empty_string); 1602 | Py_XDECREF(py_code); 1603 | Py_XDECREF(py_frame); 1604 | } 1605 | -------------------------------------------------------------------------------- /PI/align.pyx: -------------------------------------------------------------------------------- 1 | cdef extern from "numpy/arrayobject.h": 2 | 3 | struct PyArray_Descr: 4 | int type_num, elsize 5 | char type 6 | 7 | ctypedef class numpy.ndarray [object PyArrayObject]: 8 | cdef char *data 9 | cdef int nd 10 | cdef int *dimensions, *strides 11 | cdef object base 12 | cdef PyArray_Descr *descr 13 | cdef int flags 14 | 15 | import numpy 16 | 17 | def NeedlemanWunsch(seq1, seq2, S, int g, int e): 18 | 19 | cdef int M, N, i, j, t_max, i_max, j_max, dir 20 | cdef int v1, v2, v3, m, new_len, data, ni, nj 21 | 22 | cdef ndarray a 23 | cdef int nrows, ncols 24 | cdef int *matrix, x 25 | 26 | edits1 = [] 27 | edits2 = [] 28 | 29 | M = len(seq1) + 1 30 | N = len(seq2) + 1 31 | 32 | table = numpy.zeros([M, N], 'i') 33 | 34 | a = table 35 | 36 | nrows = a.dimensions[0] 37 | ncols = a.dimensions[1] 38 | 39 | matrix = a.data 40 | 41 | # Iterate through matrix and score similarities 42 | for i from 1 <= i < M: 43 | for j from 1 <= j < N: 44 | if seq1[i - 1] == seq2[j - 1]: 45 | matrix[i * ncols + j] = 1 46 | else: 47 | matrix[i * ncols + j] = 0 48 | 49 | # Sum the matrix 50 | i_max = 0 51 | j_max = 0 52 | t_max = 0 53 | 54 | for i from 1 <= i < M: 55 | for j from 1 <= j < N: 56 | 57 | dir = 0 58 | 59 | v1 = matrix[(i - 1) * ncols + (j - 1)] 60 | v2 = matrix[i * 
ncols + (j - 1)] 61 | v3 = matrix[(i - 1) * ncols + j] 62 | 63 | if v1 > 255: 64 | v1 = v1 >> 8 65 | if v2 > 255: 66 | v2 = v2 >> 8 67 | if v3 > 255: 68 | v3 = v3 >> 8 69 | 70 | v1 = v1 + matrix[i * ncols + j] 71 | v2 = v2 - e 72 | v3 = v3 - e 73 | 74 | if v1 > 0: 75 | m = v1 76 | else: 77 | m = 0 78 | 79 | if v2 > m: 80 | m = v2 81 | 82 | if v3 > m: 83 | m = v3 84 | 85 | if m == v1: 86 | dir = dir | (1 << 0) 87 | 88 | if m == v2: 89 | dir = dir | (1 << 1) 90 | 91 | if m == v3: 92 | dir = dir | (1 << 2) 93 | 94 | if m >= t_max: 95 | t_max = m 96 | i_max = i 97 | j_max = j 98 | 99 | m = m << 8 100 | m = m | dir 101 | 102 | matrix[i * ncols + j] = m 103 | 104 | # Do backtrace through matrix 105 | i = i_max 106 | j = j_max 107 | 108 | new_len = 0 109 | 110 | while i and j: 111 | data = matrix[i * ncols + j] 112 | 113 | if data & (1 << 2): 114 | ni = i - 1 115 | nj = j 116 | 117 | if data & (1 << 1): 118 | ni = i 119 | nj = j - 1 120 | 121 | if data & (1 << 0): 122 | ni = i - 1 123 | nj = j - 1 124 | 125 | new_len = new_len + 1 126 | 127 | i = ni 128 | j = nj 129 | 130 | new_seq1 = numpy.zeros((new_len), 'i') 131 | new_seq2 = numpy.zeros((new_len), 'i') 132 | 133 | cdef int s1, s2, gaps 134 | s1 = s2 = new_len 135 | gaps = 0 136 | 137 | i = i_max 138 | j = j_max 139 | 140 | while i and j: 141 | data = matrix[i * ncols + j] 142 | 143 | if data & (1 << 2): 144 | ni = i - 1 145 | nj = j 146 | dir = (1 << 2) 147 | 148 | if data & (1 << 1): 149 | ni = i 150 | nj = j - 1 151 | dir = (1 << 1) 152 | 153 | if data & (1 << 0): 154 | ni = i - 1 155 | nj = j - 1 156 | dir = (1 << 0) 157 | 158 | if dir == (1 << 0): 159 | s1 = s1 - 1 160 | s2 = s2 - 1 161 | new_seq1[s1] = seq1[i - 1] 162 | new_seq2[s2] = seq2[j - 1] 163 | 164 | if dir == (1 << 1): 165 | s1 = s1 - 1 166 | s2 = s2 - 1 167 | 168 | edits1.append(s1) 169 | new_seq1[s1] = 256 # '_' 170 | new_seq2[s2] = seq2[j - 1] 171 | gaps = gaps + 1 172 | 173 | if dir == (1 << 2): 174 | s1 = s1 - 1 175 | s2 = s2 - 1 176 | 177 | 
new_seq1[s1] = seq1[i - 1] 178 | new_seq2[s2] = 256 # '_' 179 | edits2.append(s2) 180 | gaps = gaps + 1 181 | 182 | i = ni 183 | j = nj 184 | 185 | return (new_seq1, new_seq2, edits1, edits2, t_max, gaps) 186 | 187 | def SmithWaterman(seq1, seq2, S, int g, int e): 188 | 189 | cdef int M, N, i, j, t_max, i_max, j_max, dir 190 | cdef int v1, v2, v3, m, new_len, data, ni, nj 191 | 192 | cdef ndarray a 193 | cdef int nrows, ncols 194 | cdef int *matrix, x 195 | 196 | edits1 = [] 197 | edits2 = [] 198 | 199 | M = len(seq1) + 1 200 | N = len(seq2) + 1 201 | 202 | table = numpy.zeros([M, N], 'i') 203 | 204 | for i in range(M): 205 | table[i][0] = 0 - i 206 | 207 | for i in range(N): 208 | table[0][i] = 0 - i 209 | 210 | a = table 211 | 212 | nrows = a.dimensions[0] 213 | ncols = a.dimensions[1] 214 | 215 | matrix = a.data 216 | 217 | # Iterate through matrix and score similarities 218 | for i from 1 <= i < M: 219 | for j from 1 <= j < N: 220 | if seq1[i - 1] == seq2[j - 1]: 221 | matrix[i * ncols + j] = 2 222 | else: 223 | matrix[i * ncols + j] = -1 224 | 225 | # Sum the matrix 226 | i_max = 0 227 | j_max = 0 228 | t_max = 0 229 | 230 | for i from 1 <= i < M: 231 | for j from 1 <= j < N: 232 | 233 | dir = 0 234 | 235 | v1 = matrix[(i - 1) * ncols + (j - 1)] 236 | v2 = matrix[i * ncols + (j - 1)] 237 | v3 = matrix[(i - 1) * ncols + j] 238 | 239 | if v1 > 255 or (v1 & 0xffffff00) == False: 240 | v1 = v1 >> 8 241 | if v2 > 255 or (v2 & 0xffffff00) == False: 242 | v2 = v2 >> 8 243 | if v3 > 255 or (v3 & 0xffffff00) == False: 244 | v3 = v3 >> 8 245 | 246 | v1 = v1 + matrix[i * ncols + j] 247 | v2 = v2 - 2 248 | v3 = v3 - 2 249 | 250 | if v1 > 0: 251 | m = v1 252 | else: 253 | m = 0 254 | 255 | if v2 > m: 256 | m = v2 257 | 258 | if v3 > m: 259 | m = v3 260 | 261 | if m == v1: 262 | dir = dir | (1 << 0) 263 | 264 | if m == v2: 265 | dir = dir | (1 << 1) 266 | 267 | if m == v3: 268 | dir = dir | (1 << 2) 269 | 270 | if m >= t_max: 271 | t_max = m 272 | i_max = i 273 | j_max = j
274 | 275 | m = m << 8 276 | m = m | dir 277 | 278 | matrix[i * ncols + j] = m 279 | 280 | # Do backtrace through matrix 281 | i = i_max 282 | j = j_max 283 | 284 | new_len = 0 285 | 286 | while matrix[i * ncols + j] > 0: 287 | data = matrix[i * ncols + j] 288 | 289 | if data & (1 << 2): 290 | ni = i - 1 291 | nj = j 292 | 293 | if data & (1 << 1): 294 | ni = i 295 | nj = j - 1 296 | 297 | if data & (1 << 0): 298 | ni = i - 1 299 | nj = j - 1 300 | 301 | new_len = new_len + 1 302 | 303 | i = ni 304 | j = nj 305 | 306 | new_seq1 = numpy.zeros((new_len), 'i') 307 | new_seq2 = numpy.zeros((new_len), 'i') 308 | 309 | cdef int s1, s2, gaps 310 | s1 = s2 = new_len 311 | gaps = 0 312 | 313 | i = i_max 314 | j = j_max 315 | 316 | while matrix[i * ncols + j] > 0: 317 | data = matrix[i * ncols + j] 318 | 319 | if data & (1 << 2): 320 | ni = i - 1 321 | nj = j 322 | dir = (1 << 2) 323 | 324 | if data & (1 << 1): 325 | ni = i 326 | nj = j - 1 327 | dir = (1 << 1) 328 | 329 | if data & (1 << 0): 330 | ni = i - 1 331 | nj = j - 1 332 | dir = (1 << 0) 333 | 334 | if dir == (1 << 0): 335 | s1 = s1 - 1 336 | s2 = s2 - 1 337 | new_seq1[s1] = seq1[i - 1] 338 | new_seq2[s2] = seq2[j - 1] 339 | 340 | if dir == (1 << 1): 341 | s1 = s1 - 1 342 | s2 = s2 - 1 343 | 344 | edits1.append(s1) 345 | new_seq1[s1] = 256 # '_' 346 | new_seq2[s2] = seq2[j - 1] 347 | gaps = gaps + 1 348 | 349 | if dir == (1 << 2): 350 | s1 = s1 - 1 351 | s2 = s2 - 1 352 | 353 | new_seq1[s1] = seq1[i - 1] 354 | new_seq2[s2] = 256 # '_' 355 | edits2.append(s2) 356 | gaps = gaps + 1 357 | 358 | i = ni 359 | j = nj 360 | 361 | return (new_seq1, new_seq2, edits1, edits2, t_max, gaps) 362 | -------------------------------------------------------------------------------- /PI/distance.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | Distance module 4 | 5 | Find distance between sequences 6 | 7 | Written by Marshall Beddoe 8 | Copyright (c) 2004 Baseline Research 9 | 10 | 
Licensed under the LGPL 11 | """ 12 | 13 | # 14 | # Note: Gaps are denoted by the integer value 256 as to avoid '_' problems 15 | # 16 | 17 | import align, zlib 18 | from numpy import * 19 | 20 | __all__ = [ "Distance", "Entropic", "PairwiseIdentity", "LocalAlignment" ] 21 | 22 | class Distance: 23 | 24 | """Implementation of classify base class""" 25 | 26 | def __init__(self, sequences): 27 | self.sequences = sequences 28 | self.N = len(sequences) 29 | 30 | # NxN Distance matrix 31 | self.dmx = zeros((self.N, self.N), float) 32 | 33 | for i in range(len(sequences)): 34 | for j in range(len(sequences)): 35 | self.dmx[i][j] = -1 36 | 37 | self._go() 38 | 39 | def __repr__(self): 40 | return "%s" % self.dmx 41 | 42 | def __getitem__(self, i): 43 | return self.dmx[i] 44 | 45 | def __len__(self): 46 | return len(self.dmx) 47 | 48 | def _go(self): 49 | """Perform distance calculations""" 50 | pass 51 | 52 | class Entropic(Distance): 53 | 54 | """Distance calculation based off compression ratios""" 55 | 56 | def _go(self): 57 | 58 | # Similarity matrix 59 | similar = zeros((self.N, self.N), float) 60 | 61 | for i in range(self.N): 62 | for j in range(self.N): 63 | similar[i][j] = -1 64 | 65 | # 66 | # Do compression ratio calculations 67 | # 68 | for i in range(self.N): 69 | for j in range(self.N): 70 | 71 | if similar[i][j] >= 0: 72 | continue 73 | 74 | seq1 = self.sequences[i][1] 75 | seq2 = self.sequences[j][1] 76 | 77 | # Convert sequences to strings, gaps denoted by '_' 78 | seq1str = "" 79 | for x in seq1: 80 | if x == 256: 81 | seq1str += '_' 82 | else: 83 | seq1str += chr(x) 84 | 85 | seq2str = "" 86 | for x in seq2: 87 | if x == 256: 88 | seq2str += '_' 89 | else: 90 | seq2str += chr(x) 91 | 92 | comp1 = zlib.compress(seq1str) 93 | comp2 = zlib.compress(seq2str) 94 | 95 | if len(comp1) > len(comp2): 96 | score = len(comp2) * 1.0 / len(comp1) * 1.0 97 | else: 98 | score = len(comp1) * 1.0 / len(comp2) * 1.0 99 | 100 | similar[i][j] = similar[j][i] = score 101 | 
102 | # 103 | # Distance matrix 104 | # 105 | for i in range(self.N): 106 | for j in range(self.N): 107 | self.dmx[i][j] = similar[i][i] - similar[i][j] 108 | 109 | 110 | class PairwiseIdentity(Distance): 111 | 112 | """Distance through basic pairwise similarity""" 113 | 114 | def _go(self): 115 | 116 | # Similarity matrix 117 | similar = zeros((self.N, self.N), float) 118 | 119 | for i in range(self.N): 120 | for j in range(self.N): 121 | similar[i][j] = -1 122 | 123 | # 124 | # Find pairs 125 | # 126 | for i in range(self.N): 127 | for j in range(self.N): 128 | 129 | if similar[i][j] >= 0: 130 | continue 131 | 132 | seq1 = self.sequences[i][1] 133 | seq2 = self.sequences[j][1] 134 | 135 | minlen = min(len(seq1), len(seq2)) 136 | 137 | len1 = len2 = idents = 0 138 | 139 | for x in range(minlen): 140 | if seq1[x] != 256: 141 | len1 += 1.0 142 | 143 | if seq1[x] == seq2[x]: 144 | idents += 1.0 145 | 146 | if seq2[x] != 256: 147 | len2 += 1.0 148 | 149 | m = max(len1, len2) 150 | 151 | similar[i][j] = idents / m 152 | 153 | # 154 | # Distance matrix 155 | # 156 | for i in range(self.N): 157 | for j in range(self.N): 158 | self.dmx[i][j] = similar[i][i] - similar[i][j] 159 | 160 | class LocalAlignment(Distance): 161 | 162 | """Distance through local alignment similarity""" 163 | 164 | def __init__(self, sequences, smx=None): 165 | self.smx = smx 166 | 167 | # If similarity matrix is None, make a quick identity matrix 168 | if self.smx is None: 169 | 170 | self.smx = zeros((257, 257), float) 171 | 172 | for i in range(257): 173 | for j in range(257): 174 | if i == j: 175 | self.smx[i][j] = 1.0 176 | else: 177 | self.smx[i][j] = 0.0 178 | 179 | Distance.__init__(self, sequences) 180 | 181 | def _go(self): 182 | 183 | # Similarity matrix 184 | similar = zeros((self.N, self.N), float) 185 | 186 | for i in range(self.N): 187 | for j in range(self.N): 188 | similar[i][j] = -1 189 | 190 | # 191 | # Compute similarity matrix of SW scores 192 | # 193 | for i in range(self.N): 
194 | for j in range(self.N): 195 | 196 | if similar[i][j] >= 0: 197 | continue 198 | 199 | seq1 = self.sequences[i][1] 200 | seq2 = self.sequences[j][1] 201 | 202 | (nseq1, nseq2, edits1, edits2, score, gaps) = \ 203 | align.SmithWaterman(seq1, seq2, self.smx, 0, 0) 204 | 205 | similar[i][j] = similar[j][i] = score 206 | 207 | # 208 | # Compute distance matrix of SW scores 209 | # 210 | for i in range(self.N): 211 | for j in range(self.N): 212 | 213 | if self.dmx[i][j] >= 0: 214 | continue 215 | 216 | self.dmx[i][j] = 1 - (similar[i][j] / similar[i][i]) 217 | self.dmx[j][i] = self.dmx[i][j] 218 | -------------------------------------------------------------------------------- /PI/input.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | Input module 4 | 5 | Handle different input file types and digitize sequences 6 | 7 | Written by Marshall Beddoe 8 | Copyright (c) 2004 Baseline Research 9 | 10 | Licensed under the LGPL 11 | """ 12 | 13 | from pcapy import * 14 | from socket import * 15 | 16 | __all__ = ["Input", "Pcap", "ASCII" ] 17 | 18 | class Input: 19 | 20 | """Implementation of base input class""" 21 | 22 | def __init__(self, filename): 23 | """Import specified filename""" 24 | 25 | self.set = set() 26 | self.sequences = [] 27 | self.index = 0 28 | 29 | def __iter__(self): 30 | self.index = 0 31 | return self 32 | 33 | def next(self): 34 | if self.index == len(self.sequences): 35 | raise StopIteration 36 | 37 | self.index += 1 38 | 39 | return self.sequences[self.index - 1] 40 | 41 | def __len__(self): 42 | return len(self.sequences) 43 | 44 | def __repr__(self): 45 | return "%s" % self.sequences 46 | 47 | def __getitem__(self, index): 48 | return self.sequences[index] 49 | 50 | class Pcap(Input): 51 | 52 | """Handle the pcap file format""" 53 | 54 | def __init__(self, filename, offset=14): 55 | Input.__init__(self, filename) 56 | self.pktNumber = 0 57 | self.offset = offset 58 | 59 | try: 60 | pd = 
open_offline(filename) 61 | except: 62 | raise IOError 63 | 64 | pd.dispatch(-1, self.handler) 65 | 66 | def handler(self, hdr, pkt): 67 | if hdr.getlen() <= 0: 68 | return 69 | 70 | # Increment packet counter 71 | self.pktNumber += 1 72 | 73 | # Ethernet is a safe assumption 74 | offset = self.offset 75 | 76 | # Parse IP header 77 | iphdr = pkt[offset:] 78 | 79 | ip_hl = ord(iphdr[0]) & 0x0f # header length 80 | ip_len = (ord(iphdr[2]) << 8) | ord(iphdr[3]) # total length 81 | ip_p = ord(iphdr[9]) # protocol type 82 | ip_srcip = inet_ntoa(iphdr[12:16]) # source ip address 83 | ip_dstip = inet_ntoa(iphdr[16:20]) # dest ip address 84 | 85 | offset += (ip_hl * 4) 86 | 87 | # Parse TCP if applicable 88 | if ip_p == 6: 89 | tcphdr = pkt[offset:] 90 | 91 | th_sport = (ord(tcphdr[0]) << 8) | ord(tcphdr[1]) # source port 92 | th_dport = (ord(tcphdr[2]) << 8) | ord(tcphdr[3]) # dest port 93 | th_off = ord(tcphdr[12]) >> 4 # tcp offset 94 | 95 | offset += (th_off * 4) 96 | 97 | # Parse UDP if applicable 98 | elif ip_p == 17: 99 | offset += 8 100 | 101 | # Parse out application layer 102 | seq_len = (ip_len - offset) + 14 103 | 104 | if seq_len <= 0: 105 | return 106 | 107 | seq = pkt[offset:] 108 | 109 | l = len(self.set) 110 | self.set.add(seq) 111 | 112 | if len(self.set) == l: 113 | return 114 | 115 | # Digitize sequence 116 | digitalSeq = [] 117 | for c in seq: 118 | digitalSeq.append(ord(c)) 119 | 120 | self.sequences.append((self.pktNumber, digitalSeq)) 121 | 122 | class ASCII(Input): 123 | 124 | """Handle newline delimited ASCII input files""" 125 | 126 | def __init__(self, filename): 127 | Input.__init__(self, filename) 128 | 129 | try: 130 | fd = open(filename, "r") 131 | except: 132 | raise IOError 133 | 134 | lineno = 0 135 | 136 | while 1: 137 | lineno += 1 138 | line = fd.readline() 139 | 140 | if not line: 141 | break 142 | 143 | l = len(self.set) 144 | self.set.add(line) 145 | 146 | if len(self.set) == l: 147 | continue 148 | 149 | # Digitize sequence 150 | 
digitalSeq = [] 151 | for c in line: 152 | digitalSeq.append(ord(c)) 153 | 154 | self.sequences.append((lineno, digitalSeq)) 155 | -------------------------------------------------------------------------------- /PI/multialign.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | Multialign module 4 | 5 | Perform multiple sequence alignment using tree as guide 6 | 7 | Written by Marshall Beddoe 8 | Copyright (c) 2004 Baseline Research 9 | 10 | Licensed under the LGPL 11 | """ 12 | 13 | import align 14 | from numpy import * 15 | 16 | class Multialign: 17 | 18 | """Implementation of multialign base class""" 19 | 20 | def __init__(self, tree): 21 | self.tree = tree 22 | self.aligned = [] 23 | self.index = 0 24 | 25 | self._go() 26 | 27 | def _go(self): 28 | pass 29 | 30 | def __len__(self): 31 | return len(self.aligned) 32 | 33 | def __getitem__(self, index): 34 | return self.aligned[index] 35 | 36 | def __iter__(self): 37 | self.index = 0 38 | return self 39 | 40 | def next(self): 41 | if self.index == len(self.aligned): 42 | raise StopIteration 43 | 44 | self.index += 1 45 | 46 | return self.aligned[self.index - 1] 47 | 48 | class NeedlemanWunsch(Multialign): 49 | 50 | """Perform global multiple sequence alignment""" 51 | def __init__(self, tree, smx=None): 52 | self.smx = smx 53 | 54 | # If similarity matrix is None, make a quick identity matrix 55 | if self.smx is None: 56 | self.smx = zeros((257, 257), float) 57 | 58 | for i in range(257): 59 | for j in range(257): 60 | if i == j: 61 | self.smx[i][j] = 1.0 62 | else: 63 | self.smx[i][j] = 0.0 64 | 65 | Multialign.__init__(self, tree) 66 | 67 | def _go(self): 68 | 69 | self._assign(self.tree) 70 | self._alignSum(self.tree, []) 71 | 72 | def _assign(self, root): 73 | """Traverse tree and align sequences""" 74 | 75 | if root.getValue()[2] == None: 76 | if root.getLeft().getValue()[2] == None: 77 | self._assign(root.getLeft()) 78 | 79 | if root.getRight().getValue()[2] 
== None: 80 | self._assign(root.getRight()) 81 | 82 | # Get sequences 83 | seq1 = root.getLeft().getValue()[2][1] 84 | seq2 = root.getRight().getValue()[2][1] 85 | 86 | (a1, a2, e1, e2, score, gaps) = \ 87 | align.NeedlemanWunsch(seq1, seq2, self.smx, 0, 0) 88 | 89 | v1 = root.getLeft().getValue() 90 | v2 = root.getRight().getValue() 91 | 92 | nv1 = (v1[0], v1[1], v1[2], e1) 93 | nv2 = (v2[0], v2[1], v2[2], e2) 94 | 95 | root.getLeft().setValue(nv1) 96 | root.getRight().setValue(nv2) 97 | 98 | # Choose the sequence with least gaps 99 | if e1 < e2: 100 | nseq = a1 101 | else: 102 | nseq = a2 103 | 104 | v1 = root.getValue() 105 | nseq = (v1[0], nseq) 106 | nv1 = (v1[0], v1[1], nseq, v1[3]) 107 | root.setValue(nv1) 108 | 109 | def _alignSum(self, root, edits): 110 | 111 | if root.getLeft() == None and root.getRight() == None: 112 | seq1 = root.getValue()[2][1] 113 | id = root.getValue()[2][0] 114 | new = seq1 115 | 116 | for i in range(len(edits)): 117 | e = edits[i] 118 | self._applyEdits(new, e) 119 | 120 | self.aligned.append((id, new)) 121 | else: 122 | e = root.getLeft().getValue()[3] 123 | edits.insert(0, e) 124 | self._alignSum(root.getLeft(), edits) 125 | k = edits.pop(0) 126 | 127 | e = root.getRight().getValue()[3] 128 | edits.insert(0, e) 129 | self._alignSum(root.getRight(), edits) 130 | k = edits.pop(0) 131 | 132 | def _applyEdits(self, seq, edits): 133 | i = 0 134 | gap = 256 135 | 136 | edits.sort() 137 | 138 | for e in edits: 139 | seq.insert(e, gap) 140 | 141 | return seq 142 | -------------------------------------------------------------------------------- /PI/output.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | Consensus module 4 | Generate consensus based on multiple sequence alignment 5 | 6 | Written by Marshall Beddoe 7 | Copyright (c) 2004 Baseline Research 8 | 9 | Licensed under the LGPL 10 | """ 11 | 12 | from curses.ascii import * 13 | 14 | class Output: 15 | 16 | def __init__(self, 
sequences): 17 | 18 | self.sequences = sequences 19 | self.consensus = [] 20 | self._go() 21 | 22 | def _go(self): 23 | pass 24 | 25 | class Ansi(Output): 26 | 27 | def __init__(self, sequences): 28 | 29 | # Color defaults for composition 30 | self.gap = "\033[41;30m%s\033[0m" 31 | self.printable = "\033[42;30m%s\033[0m" 32 | self.space = "\033[43;30m%s\033[0m" 33 | self.binary = "\033[44;30m%s\033[0m" 34 | self.zero = "\033[45;30m%s\033[0m" 35 | self.bit = "\033[46;30m%s\033[0m" 36 | self.default = "\033[47;30m%s\033[0m" 37 | 38 | Output.__init__(self, sequences) 39 | 40 | def _go(self): 41 | 42 | seqLength = len(self.sequences[0][1]) 43 | rounds = seqLength / 18 44 | remainder = seqLength % 18 45 | l = len(self.sequences[0][1]) 46 | 47 | start = 0 48 | end = 18 49 | 50 | dtConsensus = [] 51 | mtConsensus = [] 52 | 53 | for i in range(rounds): 54 | for id, seq in self.sequences: 55 | print "%04d" % id, 56 | for byte in seq[start:end]: 57 | if byte == 256: 58 | print self.gap % "___", 59 | elif isspace(byte): 60 | print self.space % " ", 61 | elif isprint(byte): 62 | print self.printable % "x%02x" % byte, 63 | elif byte == 0: 64 | print self.zero % "x00", 65 | else: 66 | print self.default % "x%02x" % byte, 67 | print "" 68 | 69 | # Calculate datatype consensus 70 | 71 | print "DT ", 72 | for j in range(start, end): 73 | column = [] 74 | for id, seq in self.sequences: 75 | column.append(seq[j]) 76 | dt = self._dtConsensus(column) 77 | print dt, 78 | dtConsensus.append(dt) 79 | print "" 80 | 81 | print "MT ", 82 | for j in range(start, end): 83 | column = [] 84 | for id, seq in self.sequences: 85 | column.append(seq[j]) 86 | rate = self._mutationRate(column) 87 | print "%03d" % (rate * 100), 88 | mtConsensus.append(rate) 89 | print "\n" 90 | 91 | start += 18 92 | end += 18 93 | 94 | if remainder: 95 | for id, seq in self.sequences: 96 | print "%04d" % id, 97 | for byte in seq[start:start + remainder]: 98 | if byte == 256: 99 | print self.gap % "___", 100 | elif 
isspace(byte): 101 | print self.space % " ", 102 | elif isprint(byte): 103 | print self.printable % "x%02x" % byte, 104 | elif byte == 0: 105 | print self.zero % "x00", 106 | else: 107 | print self.default % "x%02x" % byte, 108 | print "" 109 | 110 | print "DT ", 111 | for j in range(start, start + remainder): 112 | column = [] 113 | for id, seq in self.sequences: 114 | column.append(seq[j]) 115 | dt = self._dtConsensus(column) 116 | print dt, 117 | dtConsensus.append(dt) 118 | print "" 119 | 120 | print "MT ", 121 | for j in range(start, start + remainder): 122 | column = [] 123 | for id, seq in self.sequences: 124 | column.append(seq[j]) 125 | rate = self._mutationRate(column) 126 | mtConsensus.append(rate) 127 | print "%03d" % (rate * 100), 128 | print "" 129 | 130 | # Calculate consensus sequence 131 | l = len(self.sequences[0][1]) 132 | 133 | for i in range(l): 134 | histogram = {} 135 | for id, seq in self.sequences: 136 | try: 137 | histogram[seq[i]] += 1 138 | except: 139 | histogram[seq[i]] = 1 140 | 141 | items = histogram.items() 142 | items.sort() 143 | 144 | m = 1 145 | v = 257 146 | for j in items: 147 | if j[1] > m: 148 | m = j[1] 149 | v = j[0] 150 | 151 | self.consensus.append(v) 152 | 153 | real = [] 154 | 155 | for i in range(len(self.consensus)): 156 | if self.consensus[i] == 256: 157 | continue 158 | real.append((self.consensus[i], dtConsensus[i], mtConsensus[i])) 159 | 160 | # 161 | # Display consensus data 162 | # 163 | totalLen = len(real) 164 | rounds = totalLen / 18 165 | remainder = totalLen % 18 166 | 167 | start = 0 168 | end = 18 169 | 170 | print "\nUngapped Consensus:" 171 | 172 | for i in range(rounds): 173 | print "CONS", 174 | for byte,type,rate in real[start:end]: 175 | if byte == 256: 176 | print self.gap % "___", 177 | elif byte == 257: 178 | print self.default % "???", 179 | elif isspace(byte): 180 | print self.space % " ", 181 | elif isprint(byte): 182 | print self.printable % "x%02x" % byte, 183 | elif byte == 0: 184 | print 
self.zero % "x00", 185 | else: 186 | print self.default % "x%02x" % byte, 187 | print "" 188 | 189 | print "DT ", 190 | for byte,type,rate in real[start:end]: 191 | print type, 192 | print "" 193 | 194 | print "MT ", 195 | for byte,type,rate in real[start:end]: 196 | print "%03d" % (rate * 100), 197 | print "\n" 198 | 199 | start += 18 200 | end += 18 201 | 202 | if remainder: 203 | print "CONS", 204 | for byte,type,rate in real[start:start + remainder]: 205 | if byte == 256: 206 | print self.gap % "___", 207 | elif byte == 257: 208 | print self.default % "???", 209 | elif isspace(byte): 210 | print self.space % " ", 211 | elif isprint(byte): 212 | print self.printable % "x%02x" % byte, 213 | elif byte == 0: 214 | print self.zero % "x00", 215 | else: 216 | print self.default % "x%02x" % byte, 217 | print "" 218 | 219 | print "DT ", 220 | for byte,type,rate in real[start:end]: 221 | print type, 222 | print "" 223 | 224 | print "MT ", 225 | for byte,type,rate in real[start:end]: 226 | print "%03d" % (rate * 100), 227 | print "" 228 | 229 | def _dtConsensus(self, data): 230 | histogram = {} 231 | 232 | for byte in data: 233 | if byte == 256: 234 | try: 235 | histogram["G"] += 1 236 | except: 237 | histogram["G"] = 1 238 | elif isspace(byte): 239 | try: 240 | histogram["S"] += 1 241 | except: 242 | histogram["S"] = 1 243 | elif isprint(byte): 244 | try: 245 | histogram["A"] += 1 246 | except: 247 | histogram["A"] = 1 248 | elif byte == 0: 249 | try: 250 | histogram["Z"] += 1 251 | except: 252 | histogram["Z"] = 1 253 | else: 254 | try: 255 | histogram["B"] += 1 256 | except: 257 | histogram["B"] = 1 258 | 259 | items = histogram.items() 260 | items.sort() 261 | 262 | m = 1 263 | v = '?' 
264 | for j in items: 265 | if j[1] > m: 266 | m = j[1] 267 | v = j[0] 268 | 269 | return v * 3 270 | 271 | def _mutationRate(self, data): 272 | 273 | histogram = {} 274 | 275 | for x in data: 276 | try: 277 | histogram[x] += 1 278 | except: 279 | histogram[x] = 1 280 | 281 | items = histogram.items() 282 | items.sort() 283 | 284 | if len(items) == 1: 285 | rate = 0.0 286 | else: 287 | rate = len(items) * 1.0 / len(data) * 1.0 288 | 289 | return rate 290 | -------------------------------------------------------------------------------- /PI/phylogeny.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | Phylogenetic tree module 4 | 5 | Implementation of multiple tree building algorithms: 6 | - UPGMA 7 | - Maximum Parsimony 8 | - Maximum Likelihood 9 | 10 | Also, contains class to perform clustering based on results of tree generation 11 | 12 | Written by Marshall Beddoe 13 | Copyright (c) 2004 Baseline Research 14 | 15 | Licensed under the LGPL 16 | """ 17 | 18 | from tree import * 19 | from pydot import * 20 | 21 | class Phylogeny: 22 | 23 | """Implementation of base phylogenetic class""" 24 | 25 | def __init__(self, sequences, dmx, minval=1.0): 26 | self.dmx = dmx 27 | self.index = 0 28 | self.tree = None 29 | self.clusters = [] 30 | self.minval = minval 31 | self.sequences = sequences 32 | 33 | self._go() 34 | 35 | def __len__(self): 36 | return len(self.clusters) 37 | 38 | def __iter__(self): 39 | self.index = 0 40 | return self 41 | 42 | def next(self): 43 | if self.index == len(self.clusters): 44 | raise StopIteration 45 | 46 | self.index += 1 47 | 48 | return self.clusters[self.index - 1] 49 | 50 | def __getitem__(self, index): 51 | return self.clusters[index] 52 | 53 | def _go(self): 54 | """Perform tree construction""" 55 | pass 56 | 57 | class UPGMA(Phylogeny): 58 | 59 | """UPGMA tree construction method""" 60 | 61 | def _go(self): 62 | 63 | # Universal Set 64 | Cu = set() 65 | 66 | # Place each sequence into 
individual tree node 67 | for i in range(len(self.sequences)): 68 | ntree = Tree() 69 | ntree.setValue((i, 0, self.sequences[i], None)) 70 | Cu.add(ntree) 71 | 72 | n = len(Cu) - 1 73 | totalNodes = len(Cu) 74 | 75 | for i in range(n): 76 | min = 10000 77 | 78 | for A in Cu: 79 | for B in Cu: 80 | if A == B: 81 | continue 82 | 83 | Dab = self._distance(A, B) 84 | 85 | # Choose closest clusters 86 | if Dab <= min: 87 | min = Dab 88 | 89 | savex = A.getValue()[0] 90 | savey = B.getValue()[0] 91 | 92 | # Create new root with clusters as children 93 | C = Tree() 94 | C.setLeft(A) 95 | C.setRight(B) 96 | 97 | A.setParent(C) 98 | B.setParent(C) 99 | 100 | C.setValue((10000 + i, min, None, None)) 101 | 102 | totalNodes += 1 103 | 104 | # Remove closest clusters from Cu and add new cluster 105 | #print "%d,%d = %f" % (savex, savey, min) 106 | Cu.remove(C.getLeft()) 107 | Cu.remove(C.getRight()) 108 | Cu.add(C) 109 | 110 | self.tree = Cu.pop() 111 | 112 | self._cluster(self.tree) 113 | 114 | def _distance(self, A, B): 115 | 116 | # If both nodes are leaves, return distance 117 | if A.getIsLeaf() and B.getIsLeaf(): 118 | return self.dmx[A.getValue()[0]][B.getValue()[0]] 119 | 120 | elif A.getIsLeaf(): 121 | d = self._distance(A, B.getLeft()) + self._distance(A, B.getRight()) 122 | return d / 2.0 123 | 124 | else: 125 | d = self._distance(A.getRight(), B) + self._distance(A.getLeft(), B) 126 | return d / 2.0 127 | 128 | def _cluster(self, root): 129 | 130 | if root.getIsLeaf(): 131 | return 132 | 133 | if root.getValue()[1] <= self.minval: 134 | self.clusters.append(root) 135 | return 136 | 137 | if root.getLeft(): 138 | self._cluster(root.getLeft()) 139 | 140 | if root.getRight(): 141 | self._cluster(root.getRight()) 142 | -------------------------------------------------------------------------------- /PI/tree.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | Binary Tree module 4 | 5 | Implementation of binary tree class. 
6 | 7 | Written by Marshall Beddoe 8 | Copyright (c) 2004 Baseline Research 9 | 10 | Licensed under the LGPL 11 | """ 12 | 13 | from pydot import * 14 | 15 | class Tree: 16 | 17 | def __init__(self): 18 | 19 | self._parentNode = None 20 | self._leftNode = None 21 | self._rightNode = None 22 | self._value = None 23 | self._height = 0 24 | 25 | def getIsLeaf(self): 26 | """Is this tree a leaf node""" 27 | 28 | if self._leftNode == None and self._rightNode == None: 29 | return True 30 | else: 31 | return False 32 | 33 | def getHeight(self): 34 | """Return height at this root""" 35 | return self._height 36 | 37 | def getParent(self): 38 | """Return parent node""" 39 | return self._parentNode 40 | 41 | def setParent(self, parentNode): 42 | """Set new parent node""" 43 | self._parentNode = parentNode 44 | 45 | def getLeft(self): 46 | """Return left child node""" 47 | return self._leftNode 48 | 49 | def setLeft(self, leftNode): 50 | """Set left child node""" 51 | self._leftNode = leftNode 52 | 53 | def getRight(self): 54 | """Return right child node""" 55 | return self._rightNode 56 | 57 | def setRight(self, rightNode): 58 | """Set right child node""" 59 | self._rightNode = rightNode 60 | 61 | def getValue(self): 62 | """Return node value""" 63 | return self._value 64 | 65 | def setValue(self, value): 66 | """Set node value""" 67 | self._value = value 68 | 69 | def graph(self, output, format="raw"): 70 | """Graph tree from root using graphviz""" 71 | 72 | self.i = 0 73 | self.graph = Dot(center="TRUE",rankdir="TB") 74 | #self.graph = Dot(size="3.5,4",page="4.5,6",center="TRUE",rankdir="TB") 75 | self.subgraph = Subgraph("subG", rank="same") 76 | 77 | self._traverse(self) 78 | 79 | self.graph.add_subgraph(self.subgraph) 80 | 81 | #if format == "raw": 82 | self.graph.write_raw(output + ".dot", prog="dot") 83 | self.graph.write_png(output + ".png", prog="dot") 84 | #elif format == "png": 85 | # self.graph.write_png(output + ".png", prog="dot") 86 | #else: 87 | # raise 
"UnknownFormat" 88 | 89 | def _traverse(self, root): 90 | 91 | if root.getParent(): 92 | 93 | if root.getIsLeaf(): 94 | v1 = root.getValue()[2][0] 95 | else: 96 | v1 = root.getValue()[0] 97 | 98 | weight = root.getValue()[1] 99 | 100 | v2 = root.getParent().getValue()[0] 101 | 102 | l = "%.02f%%" % (weight * 100.0) 103 | 104 | if v1 >= 10000: 105 | node1 = Node(v1, shape="plaintext", ratio="auto", label=l) 106 | else: 107 | node1 = Node(v1, shape="house", ratio="auto") 108 | node1.set("style", "filled") 109 | node1.set("fillcolor", "cyan") 110 | 111 | if v2 >= 10000: 112 | weight = root.getParent().getValue()[1] 113 | l = "%.02f%%" % (weight * 100.0) 114 | node2 = Node(v2, shape="plaintext", ratio="auto", label=l) 115 | 116 | if root.getIsLeaf(): 117 | self.subgraph.add_node(node1) 118 | 119 | self.graph.add_node(node1) 120 | self.graph.add_node(node2) 121 | 122 | edge = Edge(v2, v1) 123 | 124 | self.graph.add_edge(edge) 125 | 126 | if root.getLeft(): 127 | self._traverse(root.getLeft()) 128 | 129 | if root.getRight(): 130 | self._traverse(root.getRight()) 131 | -------------------------------------------------------------------------------- /README: -------------------------------------------------------------------------------- 1 | NOTE: I (@wolever) am not the original author of this software, and I understand 2 | very little about how it actually works. I have only updated it to build and run 3 | against more modern versions of Python and NumPy. It is unlikely that I will be 4 | able to answer questions. 5 | 6 | Originally mirrored from: www.4tphi.net/~awalters/PI/PI.html 7 | 8 | ---- 9 | The Protocol Informatics Framework 10 | Written by Marshall Beddoe 11 | Copyright (c) 2004 Baseline Research 12 | ---- 13 | 14 | Overview: 15 | 16 | The Protocol Informatics project is a software framework that allows for 17 | advanced sequence and protocol stream analysis by utilizing bioinformatics 18 | algorithms. 
The sole purpose of this software is to identify protocol fields in 19 | unknown or poorly documented network protocol formats. The algorithms that are 20 | utilized perform comparative analysis on a series of samples to better 21 | understand the underlying structure of the otherwise random-looking data. The 22 | PI framework was designed for experimentation through the use of a widget-based 23 | component set. 24 | 25 | Requirements: 26 | 27 | Python >= 2.4 http://www.python.org 28 | numpy http://numpy.scipy.org/ 29 | Pyrex http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/ 30 | Pcapy http://oss.coresecurity.com/projects/pcapy.html 31 | Pydot http://code.google.com/p/pydot/ 32 | 33 | These requirements are available using pip or easy_install: 34 | 35 | $ pip install numpy pydot pcapy 36 | $ make 37 | 38 | This software has been tested and works correctly under: 39 | - OpenBSD 40 | - FreeBSD 41 | - Linux 42 | - MacOSX 43 | 44 | 45 | Example usage: Analyzing the ICMP protocol 46 | 47 | ICMP is a simple fixed length protocol. 48 | Let's use the PI framework to discover the format. 49 | 50 | Step 1: Gather 100 ICMP packets using tcpdump 51 | 52 | # tcpdump -s 42 -c 100 -nl -w icmp.dump icmp 53 | 54 | Step 2: Run dump through PI prototype 55 | 56 | # ./main.py -g -p ./icmp.dump 57 | 58 | Protocol Informatics Prototype (v0.01 beta) 59 | Written by Marshall Beddoe 60 | Copyright (c) 2004 Baseline Research 61 | 62 | Found 100 unique sequences in '../dumps/icmp.out' 63 | Creating distance matrix .. complete 64 | Creating phylogenetic tree .. complete 65 | 66 | Discovered 1 clusters using a weight of 1.00 67 | Performing multiple alignment on cluster 1 .. 
complete 68 | 69 | Output of cluster 1 70 | 0097 x08 x00 xad x4b x05 xbe x00 x60 71 | 0039 x08 x00 x30 x54 x05 xbe x00 x26 72 | 0026 x08 x00 xf7 xb2 x05 xbe x00 x19 73 | 0015 x08 x00 x01 xdb x05 xbe x00 x0e 74 | 0048 x08 x00 x4f xdf x05 xbe x00 x2f 75 | 0040 x08 x00 xf8 xa4 x05 xbe x00 x27 76 | 0077 x08 x00 xe8 x28 x05 xbe x00 x4c 77 | 0017 x08 x00 xe8 x6c x05 xbe x00 x10 78 | 0027 x08 x00 xc3 xa9 x05 xbe x00 x1a 79 | 0087 x08 x00 xdd xc1 x05 xbe x00 x56 80 | 0081 x08 x00 x88 x42 x05 xbe x00 x50 81 | 0058 x08 x00 xb0 x42 x05 xbe x00 x39 82 | 0013 x08 x00 x3e x38 x05 xbe x00 83 | 0067 x08 x00 x99 x36 x05 xbe x00 x42 84 | 0055 x08 x00 x0f x56 x05 xbe x00 x36 85 | 0004 x08 x00 xe6 xda x05 xbe x00 x03 86 | 0028 x08 x00 x83 xd9 x05 xbe x00 x1b 87 | 0095 x08 x00 xc1 xd9 x05 xbe x00 x5e 88 | 0075 x08 x00 x3a x63 x05 xbe x00 x4a 89 | 0053 x08 x00 x6d x2a x05 xbe x00 x34 90 | 0021 x08 x00 x6d x8d x05 xbe x00 x14 91 | 0088 x08 x00 xa8 x07 x05 xbe x00 x57 92 | 0005 x08 x00 xa8 x8a x05 xbe x00 x04 93 | 0080 x08 x00 xa8 x62 x05 xbe x00 x4f 94 | 0023 x08 x00 x3f x18 x05 xbe x00 x16 95 | 0002 x08 x00 x3f x65 x05 xbe x00 x01 96 | 0074 x08 x00 x3f xc2 x05 xbe x00 x49 97 | 0030 x08 x00 x3f x15 x05 xbe x00 x1d 98 | 0044 x08 x00 xcc xc2 x05 xbe x00 x2b 99 | 0078 x08 x00 xcc x8a x05 xbe x00 x4d 100 | 0071 x08 x00 xd8 x18 x05 xbe x00 x46 101 | 0035 x08 x00 x9a xfd x05 xbe x00 x22 102 | 0001 x08 x00 x69 xf9 x05 xbe x00 x00 103 | 0034 x08 x00 xc5 x9e x05 xbe x00 x21 104 | 0031 x08 x00 x38 x00 x05 xbe x00 x1e 105 | 0092 x08 x00 x38 x4c x05 xbe x00 x5b 106 | 0100 x08 x00 x2b x1a x05 xbe x00 x63 107 | 0049 x08 x00 x15 x1d x05 xbe x00 x30 108 | 0008 x08 x00 x2f x64 x05 xbe x00 x07 109 | 0089 x08 x00 x80 xe5 x05 xbe x00 x58 110 | 0096 x08 x00 xb2 xb0 x05 xbe x00 x5f 111 | 0079 x08 x00 xc2 xae x05 xbe x00 x4e 112 | 0057 x08 x00 xc2 x79 x05 xbe x00 x38 113 | 0046 x08 x00 x77 x7a x05 xbe x00 x2d 114 | 0018 x08 x00 xbb xce x05 xbe x00 x11 115 | 0025 x08 x00 xfe xaa x05 xbe x00 x18 116 | 0068 x08 
x00 x50 xe3 x05 xbe x00 x43 117 | 0065 x08 x00 xe0 xb7 x05 xbe x00 x40 118 | 0011 x08 x00 x8d xd6 x05 xbe x00 119 | 0029 x08 x00 x7c xf3 x05 xbe x00 x1c 120 | 0033 x08 x00 xef xf3 x05 xbe x00 121 | 0069 x08 x00 x25 x6b x05 xbe x00 x44 122 | 0083 x08 x00 x25 xff x05 xbe x00 x52 123 | 0099 x08 x00 x56 x99 x05 xbe x00 x62 124 | 0061 x08 x00 x33 x81 x05 xbe x00 x3c 125 | 0050 x08 x00 xe9 xba x05 xbe x00 x31 126 | 0042 x08 x00 xb3 x49 x05 xbe x00 x29 127 | 0059 x08 x00 x81 x4e x05 xbe x00 x3a 128 | 0098 x08 x00 x81 xad x05 xbe x00 x61 129 | 0091 x08 x00 x42 xa0 x05 xbe x00 x5a 130 | 0054 x08 x00 x42 xd8 x05 xbe x00 x35 131 | 0037 x08 x00 x4c xe8 x05 xbe x00 x24 132 | 0041 x08 x00 xeb x4d x05 xbe x00 x28 133 | 0086 x08 x00 xe4 x53 x05 xbe x00 x55 134 | 0006 x08 x00 x71 x7b x05 xbe x00 x05 135 | 0012 x08 x00 x63 x7b x05 xbe x00 136 | 0070 x08 x00 xee x7d x05 xbe x00 x45 137 | 0051 x08 x00 xc8 x57 x05 xbe x00 x32 138 | 0066 x08 x00 xb4 x3c x05 xbe x00 x41 139 | 0014 x08 x00 x2c x26 x05 xbe x00 140 | 0062 x08 x00 x2c x7c x05 xbe x00 x3d 141 | 0016 x08 x00 xed x8e x05 xbe x00 x0f 142 | 0007 x08 x00 x47 x3d x05 xbe x00 x06 143 | 0073 x08 x00 x5e x72 x05 xbe x00 x48 144 | 0052 x08 x00 x9e x06 x05 xbe x00 x33 145 | 0072 x08 x00 x9e x9d x05 xbe x00 x47 146 | 0036 x08 x00 x6f x6e x05 xbe x00 x23 147 | 0060 x08 x00 x6c xc6 x05 xbe x00 x3b 148 | 0045 x08 x00 xa2 xf5 x05 xbe x00 x2c 149 | 0085 x08 x00 x00 x47 x05 xbe x00 x54 150 | 0076 x08 x00 x14 x85 x05 xbe x00 x4b 151 | 0020 x08 x00 xa0 x85 x05 xbe x00 x13 152 | 0019 x08 x00 xa6 x2c x05 xbe x00 x12 153 | 0003 x08 x00 x14 x2c x05 xbe x00 x02 154 | 0022 x08 x00 x44 x8c x05 xbe x00 x15 155 | 0082 x08 x00 x5d xe0 x05 xbe x00 x51 156 | 0009 x08 x00 xfc x41 x05 xbe x00 x08 157 | 0084 x08 x00 x35 x05 xbe x00 x53 158 | 0032 x08 x00 x0e x17 x05 xbe x00 x1f 159 | 0056 x08 x00 xe5 x05 xbe x00 x37 160 | 0043 x08 x00 xa1 xde x05 xbe x00 x2a 161 | 0094 x08 x00 x03 x92 x05 xbe x00 x5d 162 | 0047 x08 x00 x55 x83 x05 xbe x00 x2e 163 | 0090 x08 
x00 x55 x94 x05 xbe x00 x59 164 | 0064 x08 x00 x8f x05 xbe x00 x3f 165 | 0093 x08 x00 xb6 x05 xbe x00 x5c 166 | 0010 x08 x00 xd1 xb6 x05 xbe x00 167 | 0024 x08 x00 x11 x8f x05 xbe x00 x17 168 | 0063 x08 x00 x11 x04 x05 xbe x00 x3e 169 | 0038 x08 x00 x37 x3b x05 xbe x00 x25 170 | DT BBB ZZZ BBB BBB BBB BBB ZZZ AAA 171 | MT 000 000 081 089 000 000 000 100 172 | 173 | Ungapped Consensus: 174 | CONS x08 x00 x3f x18 x05 xbe x00 ??? 175 | DT BBB ZZZ BBB BBB BBB BBB ZZZ AAA 176 | MT 000 000 081 089 000 000 000 100 177 | 178 | Step 3: Analyze Consensus Sequence 179 | 180 | Pay attention to datatype composition and mutation rate. 181 | 182 | Offset 0: Binary data, 0% mutation rate 183 | Offset 1: Zeroed data, 0% mutation rate 184 | Offset 2: Binary data, 81% mutation rate 185 | Offset 3: Binary data, 89% mutation rate 186 | Offset 4: Binary data, 0% mutation rate 187 | Offset 5: Binary data, 0% mutation rate 188 | Offset 6: Zeroed data, 0% mutation rate 189 | Offset 7: ASCII data, 100% mutation rate 190 | 191 | Using this information we can construct the structure of the format: 192 | 193 | [ 1 byte ] [ 1 byte ] [ 2 byte ] [ 2 byte ] [ 1 byte ] [ 1 byte ] 194 | 195 | The real format of an ICMP message: 196 | 197 | [ 1 byte ] [ 1 byte ] [ 2 byte ] [ 2 byte ] [ 2 byte ] 198 | 199 | PI misidentified the last field because the last field in an ICMP packet 200 | is a 16-bit sequence identifier. 201 | We only gathered 100 packets, so the most significant byte never 202 | changed as the field incremented. 203 | 204 | Therefore, it is very important to gather data efficiently as PI is only as 205 | good as the data that is fed to it. 
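The effect described above can be reproduced without a capture file. The following is a minimal sketch, not part of the PI framework: the helper names mutation_rate and column_rates are made up for illustration, and the rate calculation is assumed to follow the same rule as _mutationRate in PI/output.py (distinct values divided by sample count, or 0.0 when a column never changes). The sample data is a hypothetical 16-bit counter, not real ICMP traffic.

```python
def mutation_rate(column):
    """Fraction of distinct values in one aligned column (0.0 if constant)."""
    distinct = len(set(column))
    if distinct == 1:
        return 0.0
    return distinct / float(len(column))


def column_rates(rows):
    """Mutation rate for each byte offset across equal-length rows."""
    return [mutation_rate(col) for col in zip(*rows)]


# Hypothetical data: a 16-bit big-endian counter taking the values 0..99,
# split into its high byte and low byte as it would appear on the wire.
rows = [((n >> 8) & 0xFF, n & 0xFF) for n in range(100)]
rates = column_rates(rows)

# With only 100 samples the high byte is constant, so it looks like a
# separate fixed field next to a fully mutating low byte.
print(["%03d" % (r * 100) for r in rates])
```

With enough samples for the counter to roll past 255, the high byte starts to mutate as well, which is the hint that both bytes belong to one field.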
--------------------------------------------------------------------------------
/dns_requests_and_responses.dump:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wolever/Protocol-Informatics/818d4635474d71eb6ae32e7f0289a6f6f320d91b/dns_requests_and_responses.dump
--------------------------------------------------------------------------------
/dns_requests_only.dump:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wolever/Protocol-Informatics/818d4635474d71eb6ae32e7f0289a6f6f320d91b/dns_requests_only.dump
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
#!/usr/bin/python -u

#
# Protocol Informatics Prototype
# Written by Marshall Beddoe
# Copyright (c) 2004 Baseline Research
#
# Licensed under the LGPL
#

from PI import *
import sys, getopt

def main():

    print "Protocol Informatics Prototype (v0.01 beta)"
    print "Written by Marshall Beddoe "
    print "Copyright (c) 2004 Baseline Research\n"

    # Defaults
    format = None
    weight = 1.0
    graph = False

    #
    # Parse command line options and do sanity checking on arguments
    #
    try:
        (opts, args) = getopt.getopt(sys.argv[1:], "pagw:")
    except:
        usage()

    for o,a in opts:
        if o in ["-p"]:
            format = "pcap"
        elif o in ["-a"]:
            format = "ascii"
        elif o in ["-w"]:
            weight = float(a)
        elif o in ["-g"]:
            graph = True
        else:
            usage()

    if len(args) == 0:
        usage()

    if weight < 0.0 or weight > 1.0:
        print "FATAL: Weight must be between 0 and 1"
        sys.exit(-1)

    file = sys.argv[len(sys.argv) - 1]

    try:
        file
    except:
        usage()

    #
    # Open file and get sequences
    #
    if format == "pcap":
        try:
            sequences = input.Pcap(file)
        except IOError:
            print "FATAL: Error opening '%s'" % file
            sys.exit(-1)
    elif format == "ascii":
        try:
            sequences = input.ASCII(file)
        except IOError:
            print "FATAL: Error opening '%s'" % file
            sys.exit(-1)
    else:
        print "FATAL: Specify file format"
        sys.exit(-1)

    if len(sequences) == 0:
        print "FATAL: No sequences found in '%s'" % file
        sys.exit(-1)
    else:
        print "Found %d unique sequences in '%s'" % (len(sequences), file)

    #
    # Create distance matrix (LocalAlignment, PairwiseIdentity, Entropic)
    #
    print "Creating distance matrix ..",
    dmx = distance.LocalAlignment(sequences)
    print "complete"

    #
    # Pass distance matrix to phylogenetic creation function
    #
    print "Creating phylogenetic tree ..",
    phylo = phylogeny.UPGMA(sequences, dmx, minval=weight)
    print "complete"

    #
    # Output some pretty graphs of each cluster
    #
    if graph:
        cnum = 1
        for cluster in phylo:
            out = "graph-%d" % cnum
            print "Creating %s .." % out,
            cluster.graph(out)
            print "complete"
            cnum += 1

    print "\nDiscovered %d clusters using a weight of %.02f" % \
        (len(phylo), weight)

    #
    # Perform progressive multiple alignment against clusters
    #
    i = 1
    alist = []
    for cluster in phylo:
        print "Performing multiple alignment on cluster %d .." % i,
        aligned = multialign.NeedlemanWunsch(cluster)
        print "complete"
        alist.append(aligned)
        i += 1
    print ""

    #
    # Display each cluster of aligned sequences
    #
    i = 1
    for seqs in alist:
        print "Output of cluster %d" % i
        output.Ansi(seqs)
        i += 1
        print ""

def usage():
    print "usage: %s [-gpa] [-w ] " % \
        sys.argv[0]
    print " -g\toutput graphviz of phylogenetic trees"
    print " -p\tpcap format"
    print " -a\tascii format"
    print " -w\tdifference weight for clustering"
    sys.exit(-1)

if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
from distutils.core import setup
from distutils.extension import Extension

import sys, os.path

# Use Pyrex
#import distutils.sysconfig
#if distutils.sysconfig.get_config_var('CC').startswith("gcc"):
#    pyrex_compile_options = []
#else:
pyrex_compile_options = []

if sys.platform == "win32" and len(sys.argv) < 2:
    sys.argv[1:] = ["bdist_wininst"]

# Compiling Pyrex modules to .c and .so
try:
    import Pyrex.Distutils
except ImportError:
    distutils_extras = {}
    pyrex_suffix = ".c"
else:
    class pyrex_build_ext(Pyrex.Distutils.build_ext):
        def pyrex_compile(self, source):
            from Pyrex.Compiler.Main import CompilationOptions, default_options
            options = CompilationOptions(default_options)
            result = Pyrex.Compiler.Main.compile(source, options)
            if result.num_errors <> 0:
                sys.exit(1)
    distutils_extras = {
        "cmdclass": {
            'build_ext': pyrex_build_ext}}
    pyrex_suffix = ".pyx"

def PIExtension(module_name):
    path = module_name.replace('.', '/')
    return Extension(module_name, [path + pyrex_suffix],
                     extra_compile_args = pyrex_compile_options)

setup(
    name = "PI",
    version = "0.01",
    url = "http://www.baselineresearch.net/PI",
    author_email = "mbeddoe@baselineresearch.net",
    description = "Protocol analysis toolkit using bioinformatics algorithms",
    packages = ["PI"],
    ext_modules=[
        PIExtension("PI.align")],**distutils_extras
)
--------------------------------------------------------------------------------