├── .gitignore ├── README.md ├── bpline.py ├── docs ├── COPYING └── metagenomics_pipeline.py ├── pipeline ├── __init__.py ├── classes │ ├── __init__.py │ ├── blastfilter.py │ ├── config.py │ ├── constants.py │ └── filter.py ├── modules │ ├── README.md │ ├── __init__.py │ ├── mod_blast.py │ ├── mod_usearch.py │ └── mod_usearch6.py └── utils │ ├── __init__.py │ ├── b6lib.py │ ├── cmdlinehandler.py │ ├── fastalib.py │ ├── logger.py │ └── utils.py └── samples ├── sample-filters-config.ini ├── sample.b6.txt ├── sample.b6.txt.png ├── sample.fa ├── sample.fa.png └── sample_userch_db.wdb /.gitignore: -------------------------------------------------------------------------------- 1 | # platform dependent python byte code files. 2 | *.pyc 3 | 4 | # temporary vi files. 5 | *.sw* 6 | 7 | #nonsense 8 | *.DS_Store 9 | *~ 10 | *_priv* 11 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Aim of This Project 2 | =================== 3 | 4 | Well. One of the problems I was dealing with while I was working with shotgun metagenomics data obtained from Illumina was to filter out certain things from millions of sequences to get them to a point where I can start analyzing them. For instance, in the context of metagenomics, what a researcher might want to do right after quality control and right before assembling longer contigs from her sequences is to filter out reads that are coming from other sources such as human genome, or viral genomes. Moreover, she might like to filter out 16S rRNA genes from this collection of reads in order to analyze taxonomy separately and/or simply to reduce the size of reads. 5 | 6 | Basically in my work-flow what happens to my metagenomic data is that it gets more and more refined by being searched against a database. 
For instance, all reads are searched against the human genome, good hits are collected from the original file, the remaining sequences are searched against a collection of viral DNAs, and so on. Obviously it is possible to generalize these steps and implement a layer of abstraction so that the computer deals with the input and output files, as well as the pesky intermediate steps; that is the intention of this small project. 7 | 8 | It is absolutely not there yet, but my aim is to develop this pipeline to a point where running it would be as easy as calling it like this: 9 | 10 | $ python bpline.py -i /path/to/sample.fa -o /path/to/output_dir -s /path/to/filters-config.ini -d SAMPLE_NAME 11 | 12 | 13 | Filters Configuration 14 | ===================== 15 | 16 | Filters are going to be defined in a configuration file. Here is a sample: 17 | 18 | [/path/to/a/search/database/human_genome.wdb] 19 | filter_name = Human 20 | module = usearch 21 | execute = clean, init, search, filter 22 | cmdparam.-id = 0.9 23 | cmdparam.-queryalnfract = 0.3 24 | rfnparam.min_alignment_length = 50 25 | rfnparam.min_identity = 90 26 | rfnparam.unique_hits = 1 27 | 28 | [/path/to/another/search/database/reference_SSU.db] 29 | filter_name = rRNA 30 | module = blast 31 | 32 | [/path/to/a/search/database/viral_genomes.wdb] 33 | filter_name = Viral 34 | module = usearch 35 | 36 | 37 | More explanation about the structure of the configuration file will be here. 38 | 39 | 40 | Flow 41 | ==== 42 | 43 | Basically this is what is going to happen when someone runs the pipeline: 44 | 45 | 1. Parse `filters-config.ini` to generate a chain of filters. 46 | 2. For the first filter, the input is the FASTA file given with `-i` (e.g. `sample.fa`). 47 | 3. Search every sequence in the input against the `database` defined in `filters-config.ini` for this filter and create a tabular output of hits (it is the output you get when you run blastall with `-m 8`). 48 | 4. 
Analyze the input file with respect to the tabular search output and, according to the criteria defined in the config, separate the sequences that have _good hits_ and store them in their own files. 49 | 5. If there is another filter, send the sequences that _did not result in a good hit for the current filter_ as the input to the next filter: go to 3. 50 | 6. Else, report everything. 51 | 52 | 53 | Contact me 54 | ========== 55 | 56 | You can reach [me](http://meren.org) via `meren / mbl.edu`. All suggestions and critiques are most welcome. 57 | 58 | Thanks. 59 | -------------------------------------------------------------------------------- /bpline.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | # Copyright (C) 2011, Marine Biological Laboratory 5 | # 6 | # This program is free software; you can redistribute it and/or modify it under 7 | # the terms of the GNU General Public License as published by the Free 8 | # Software Foundation; either version 2 of the License, or (at your option) 9 | # any later version. 10 | # 11 | # Please read the docs/COPYING file. 12 | 13 | # 14 | # bpline.py: user command line interface. 
15 | # 16 | 17 | # standard python modules 18 | import os 19 | import sys 20 | 21 | # non-standard python modules 22 | from pipeline.utils.cmdlinehandler import get_parser_obj 23 | from pipeline.utils.utils import print_config_summary 24 | from pipeline.classes.constants import Constants as c 25 | from pipeline.classes.config import Config 26 | 27 | def main(config): 28 | if config.args.dry_run: 29 | print_config_summary(config) 30 | sys.exit() 31 | 32 | for filter in config.filters: 33 | config.init_filter_files_and_directories(filter) 34 | filter.execute() 35 | 36 | return 0 37 | 38 | if __name__ == '__main__': 39 | sys.exit(main(Config(get_parser_obj().parse_args(), c()))) 40 | -------------------------------------------------------------------------------- /docs/COPYING: -------------------------------------------------------------------------------- 1 | NOTE! The GPL below is copyrighted by the Free Software Foundation, but 2 | the instance of code that it refers to (the kde programs) are copyrighted 3 | by the authors who actually wrote it. 4 | 5 | --------------------------------------------------------------------------- 6 | 7 | GNU GENERAL PUBLIC LICENSE 8 | Version 2, June 1991 9 | 10 | Copyright (C) 1989, 1991 Free Software Foundation, Inc. 11 | 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA 12 | Everyone is permitted to copy and distribute verbatim copies 13 | of this license document, but changing it is not allowed. 14 | 15 | Preamble 16 | 17 | The licenses for most software are designed to take away your 18 | freedom to share and change it. By contrast, the GNU General Public 19 | License is intended to guarantee your freedom to share and change free 20 | software--to make sure the software is free for all its users. This 21 | General Public License applies to most of the Free Software 22 | Foundation's software and to any other program whose authors commit to 23 | using it. 
(Some other Free Software Foundation software is covered by 24 | the GNU Library General Public License instead.) You can apply it to 25 | your programs, too. 26 | 27 | When we speak of free software, we are referring to freedom, not 28 | price. Our General Public Licenses are designed to make sure that you 29 | have the freedom to distribute copies of free software (and charge for 30 | this service if you wish), that you receive source code or can get it 31 | if you want it, that you can change the software or use pieces of it 32 | in new free programs; and that you know you can do these things. 33 | 34 | To protect your rights, we need to make restrictions that forbid 35 | anyone to deny you these rights or to ask you to surrender the rights. 36 | These restrictions translate to certain responsibilities for you if you 37 | distribute copies of the software, or if you modify it. 38 | 39 | For example, if you distribute copies of such a program, whether 40 | gratis or for a fee, you must give the recipients all the rights that 41 | you have. You must make sure that they, too, receive or can get the 42 | source code. And you must show them these terms so they know their 43 | rights. 44 | 45 | We protect your rights with two steps: (1) copyright the software, and 46 | (2) offer you this license which gives you legal permission to copy, 47 | distribute and/or modify the software. 48 | 49 | Also, for each author's protection and ours, we want to make certain 50 | that everyone understands that there is no warranty for this free 51 | software. If the software is modified by someone else and passed on, we 52 | want its recipients to know that what they have is not the original, so 53 | that any problems introduced by others will not reflect on the original 54 | authors' reputations. 55 | 56 | Finally, any free program is threatened constantly by software 57 | patents. 
We wish to avoid the danger that redistributors of a free 58 | program will individually obtain patent licenses, in effect making the 59 | program proprietary. To prevent this, we have made it clear that any 60 | patent must be licensed for everyone's free use or not licensed at all. 61 | 62 | The precise terms and conditions for copying, distribution and 63 | modification follow. 64 | 65 | GNU GENERAL PUBLIC LICENSE 66 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 67 | 68 | 0. This License applies to any program or other work which contains 69 | a notice placed by the copyright holder saying it may be distributed 70 | under the terms of this General Public License. The "Program", below, 71 | refers to any such program or work, and a "work based on the Program" 72 | means either the Program or any derivative work under copyright law: 73 | that is to say, a work containing the Program or a portion of it, 74 | either verbatim or with modifications and/or translated into another 75 | language. (Hereinafter, translation is included without limitation in 76 | the term "modification".) Each licensee is addressed as "you". 77 | 78 | Activities other than copying, distribution and modification are not 79 | covered by this License; they are outside its scope. The act of 80 | running the Program is not restricted, and the output from the Program 81 | is covered only if its contents constitute a work based on the 82 | Program (independent of having been made by running the Program). 83 | Whether that is true depends on what the Program does. 84 | 85 | 1. 
You may copy and distribute verbatim copies of the Program's 86 | source code as you receive it, in any medium, provided that you 87 | conspicuously and appropriately publish on each copy an appropriate 88 | copyright notice and disclaimer of warranty; keep intact all the 89 | notices that refer to this License and to the absence of any warranty; 90 | and give any other recipients of the Program a copy of this License 91 | along with the Program. 92 | 93 | You may charge a fee for the physical act of transferring a copy, and 94 | you may at your option offer warranty protection in exchange for a fee. 95 | 96 | 2. You may modify your copy or copies of the Program or any portion 97 | of it, thus forming a work based on the Program, and copy and 98 | distribute such modifications or work under the terms of Section 1 99 | above, provided that you also meet all of these conditions: 100 | 101 | a) You must cause the modified files to carry prominent notices 102 | stating that you changed the files and the date of any change. 103 | 104 | b) You must cause any work that you distribute or publish, that in 105 | whole or in part contains or is derived from the Program or any 106 | part thereof, to be licensed as a whole at no charge to all third 107 | parties under the terms of this License. 108 | 109 | c) If the modified program normally reads commands interactively 110 | when run, you must cause it, when started running for such 111 | interactive use in the most ordinary way, to print or display an 112 | announcement including an appropriate copyright notice and a 113 | notice that there is no warranty (or else, saying that you provide 114 | a warranty) and that users may redistribute the program under 115 | these conditions, and telling the user how to view a copy of this 116 | License. (Exception: if the Program itself is interactive but 117 | does not normally print such an announcement, your work based on 118 | the Program is not required to print an announcement.) 
119 | 120 | These requirements apply to the modified work as a whole. If 121 | identifiable sections of that work are not derived from the Program, 122 | and can be reasonably considered independent and separate works in 123 | themselves, then this License, and its terms, do not apply to those 124 | sections when you distribute them as separate works. But when you 125 | distribute the same sections as part of a whole which is a work based 126 | on the Program, the distribution of the whole must be on the terms of 127 | this License, whose permissions for other licensees extend to the 128 | entire whole, and thus to each and every part regardless of who wrote it. 129 | 130 | Thus, it is not the intent of this section to claim rights or contest 131 | your rights to work written entirely by you; rather, the intent is to 132 | exercise the right to control the distribution of derivative or 133 | collective works based on the Program. 134 | 135 | In addition, mere aggregation of another work not based on the Program 136 | with the Program (or with a work based on the Program) on a volume of 137 | a storage or distribution medium does not bring the other work under 138 | the scope of this License. 139 | 140 | 3. 
You may copy and distribute the Program (or a work based on it, 141 | under Section 2) in object code or executable form under the terms of 142 | Sections 1 and 2 above provided that you also do one of the following: 143 | 144 | a) Accompany it with the complete corresponding machine-readable 145 | source code, which must be distributed under the terms of Sections 146 | 1 and 2 above on a medium customarily used for software interchange; or, 147 | 148 | b) Accompany it with a written offer, valid for at least three 149 | years, to give any third party, for a charge no more than your 150 | cost of physically performing source distribution, a complete 151 | machine-readable copy of the corresponding source code, to be 152 | distributed under the terms of Sections 1 and 2 above on a medium 153 | customarily used for software interchange; or, 154 | 155 | c) Accompany it with the information you received as to the offer 156 | to distribute corresponding source code. (This alternative is 157 | allowed only for noncommercial distribution and only if you 158 | received the program in object code or executable form with such 159 | an offer, in accord with Subsection b above.) 160 | 161 | The source code for a work means the preferred form of the work for 162 | making modifications to it. For an executable work, complete source 163 | code means all the source code for all modules it contains, plus any 164 | associated interface definition files, plus the scripts used to 165 | control compilation and installation of the executable. However, as a 166 | special exception, the source code distributed need not include 167 | anything that is normally distributed (in either source or binary 168 | form) with the major components (compiler, kernel, and so on) of the 169 | operating system on which the executable runs, unless that component 170 | itself accompanies the executable. 
171 | 172 | If distribution of executable or object code is made by offering 173 | access to copy from a designated place, then offering equivalent 174 | access to copy the source code from the same place counts as 175 | distribution of the source code, even though third parties are not 176 | compelled to copy the source along with the object code. 177 | 178 | 4. You may not copy, modify, sublicense, or distribute the Program 179 | except as expressly provided under this License. Any attempt 180 | otherwise to copy, modify, sublicense or distribute the Program is 181 | void, and will automatically terminate your rights under this License. 182 | However, parties who have received copies, or rights, from you under 183 | this License will not have their licenses terminated so long as such 184 | parties remain in full compliance. 185 | 186 | 5. You are not required to accept this License, since you have not 187 | signed it. However, nothing else grants you permission to modify or 188 | distribute the Program or its derivative works. These actions are 189 | prohibited by law if you do not accept this License. Therefore, by 190 | modifying or distributing the Program (or any work based on the 191 | Program), you indicate your acceptance of this License to do so, and 192 | all its terms and conditions for copying, distributing or modifying 193 | the Program or works based on it. 194 | 195 | 6. Each time you redistribute the Program (or any work based on the 196 | Program), the recipient automatically receives a license from the 197 | original licensor to copy, distribute or modify the Program subject to 198 | these terms and conditions. You may not impose any further 199 | restrictions on the recipients' exercise of the rights granted herein. 200 | You are not responsible for enforcing compliance by third parties to 201 | this License. 202 | 203 | 7. 
If, as a consequence of a court judgment or allegation of patent 204 | infringement or for any other reason (not limited to patent issues), 205 | conditions are imposed on you (whether by court order, agreement or 206 | otherwise) that contradict the conditions of this License, they do not 207 | excuse you from the conditions of this License. If you cannot 208 | distribute so as to satisfy simultaneously your obligations under this 209 | License and any other pertinent obligations, then as a consequence you 210 | may not distribute the Program at all. For example, if a patent 211 | license would not permit royalty-free redistribution of the Program by 212 | all those who receive copies directly or indirectly through you, then 213 | the only way you could satisfy both it and this License would be to 214 | refrain entirely from distribution of the Program. 215 | 216 | If any portion of this section is held invalid or unenforceable under 217 | any particular circumstance, the balance of the section is intended to 218 | apply and the section as a whole is intended to apply in other 219 | circumstances. 220 | 221 | It is not the purpose of this section to induce you to infringe any 222 | patents or other property right claims or to contest validity of any 223 | such claims; this section has the sole purpose of protecting the 224 | integrity of the free software distribution system, which is 225 | implemented by public license practices. Many people have made 226 | generous contributions to the wide range of software distributed 227 | through that system in reliance on consistent application of that 228 | system; it is up to the author/donor to decide if he or she is willing 229 | to distribute software through any other system and a licensee cannot 230 | impose that choice. 231 | 232 | This section is intended to make thoroughly clear what is believed to 233 | be a consequence of the rest of this License. 234 | 235 | 8. 
If the distribution and/or use of the Program is restricted in 236 | certain countries either by patents or by copyrighted interfaces, the 237 | original copyright holder who places the Program under this License 238 | may add an explicit geographical distribution limitation excluding 239 | those countries, so that distribution is permitted only in or among 240 | countries not thus excluded. In such case, this License incorporates 241 | the limitation as if written in the body of this License. 242 | 243 | 9. The Free Software Foundation may publish revised and/or new versions 244 | of the General Public License from time to time. Such new versions will 245 | be similar in spirit to the present version, but may differ in detail to 246 | address new problems or concerns. 247 | 248 | Each version is given a distinguishing version number. If the Program 249 | specifies a version number of this License which applies to it and "any 250 | later version", you have the option of following the terms and conditions 251 | either of that version or of any later version published by the Free 252 | Software Foundation. If the Program does not specify a version number of 253 | this License, you may choose any version ever published by the Free Software 254 | Foundation. 255 | 256 | 10. If you wish to incorporate parts of the Program into other free 257 | programs whose distribution conditions are different, write to the author 258 | to ask for permission. For software which is copyrighted by the Free 259 | Software Foundation, write to the Free Software Foundation; we sometimes 260 | make exceptions for this. Our decision will be guided by the two goals 261 | of preserving the free status of all derivatives of our free software and 262 | of promoting the sharing and reuse of software generally. 263 | 264 | NO WARRANTY 265 | 266 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 267 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN 268 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 269 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED 270 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 271 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS 272 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE 273 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, 274 | REPAIR OR CORRECTION. 275 | 276 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 277 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR 278 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 279 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING 280 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED 281 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY 282 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 283 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 284 | POSSIBILITY OF SUCH DAMAGES. 285 | 286 | END OF TERMS AND CONDITIONS 287 | 288 | How to Apply These Terms to Your New Programs 289 | 290 | If you develop a new program, and you want it to be of the greatest 291 | possible use to the public, the best way to achieve this is to make it 292 | free software which everyone can redistribute and change under these terms. 293 | 294 | To do so, attach the following notices to the program. It is safest 295 | to attach them to the start of each source file to most effectively 296 | convey the exclusion of warranty; and each file should have at least 297 | the "copyright" line and a pointer to where the full notice is found. 
298 | 299 | 300 | Copyright (C) 19yy 301 | 302 | This program is free software; you can redistribute it and/or modify 303 | it under the terms of the GNU General Public License as published by 304 | the Free Software Foundation; either version 2 of the License, or 305 | (at your option) any later version. 306 | 307 | This program is distributed in the hope that it will be useful, 308 | but WITHOUT ANY WARRANTY; without even the implied warranty of 309 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 310 | GNU General Public License for more details. 311 | 312 | You should have received a copy of the GNU General Public License 313 | along with this program; if not, write to the Free Software 314 | Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA 315 | 316 | 317 | Also add information on how to contact you by electronic and paper mail. 318 | 319 | If the program is interactive, make it output a short notice like this 320 | when it starts in an interactive mode: 321 | 322 | Gnomovision version 69, Copyright (C) 19yy name of author 323 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 324 | This is free software, and you are welcome to redistribute it 325 | under certain conditions; type `show c' for details. 326 | 327 | The hypothetical commands `show w' and `show c' should show the appropriate 328 | parts of the General Public License. Of course, the commands you use may 329 | be called something other than `show w' and `show c'; they could even be 330 | mouse-clicks or menu items--whatever suits your program. 331 | 332 | You should also get your employer (if you work as a programmer) or your 333 | school, if any, to sign a "copyright disclaimer" for the program, if 334 | necessary. Here is a sample; alter the names: 335 | 336 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program 337 | `Gnomovision' (which makes passes at compilers) written by James Hacker. 
338 | 339 | , 1 April 1989 340 | Ty Coon, President of Vice 341 | 342 | This General Public License does not permit incorporating your program into 343 | proprietary programs. If your program is a subroutine library, you may 344 | consider it more useful to permit linking proprietary applications with the 345 | library. If this is what you want to do, use the GNU Library General 346 | Public License instead of this License. 347 | -------------------------------------------------------------------------------- /docs/metagenomics_pipeline.py: -------------------------------------------------------------------------------- 1 | # 1. eliminate sequences with N's (if one pair has an N, eliminate both pairs) and create two fasta files for each pairs. 2 | # ssh minnie 3 | # cd /storage-0/hiseq/20110912/Unaligned/HMP_metagen/merens_tmp 4 | # there you will find run scripts for this. 5 | # 6 | # 2. split first lane into smaller pieces. 7 | # examples are in /xraid2-2/g454/hmp/metagenomics 8 | # ./01_initialize_parts.sh 9 | # 10 | # 3. search R1 against human genome 11 | # 12 | # 02_search_against_HUMAN.sh 13 | # 14 | # 4. concatenate results into one b6 file and filter hits 15 | # 16 | # 03_finalize_HUMAN.sh 17 | # 18 | # 5. create four FASTA files from the resulting hits: 19 | # 1. R1_human_hits 20 | # 2. R2_human_hits (matching pairs) 21 | # 3. R1 (everything that didn't have a hit to human genome) 22 | # 4. R2 (matching pairs of R1) 23 | # 24 | # 6. search R2 against human genome, concatenate results, and filter hits (what should happen to no hits?). 25 | # 26 | # 7. take 5.3 and 5.4 and perform 1-6 on those against refSSU, create a taxonomy list, and upload it to VAMPS. 27 | # 28 | # 8. take 7.3 and 7.4 and perform everything to those against bacterial genomes. concatenate results, and filter hits. 29 | # 30 | # 9. dynamically trim reads to eliminate non-aligning pieces. 31 | # 32 | # 33 | # 10. create individual fasta files for matching genomes. 
perform assembly separately on each bin. 34 | # 35 | # 11. search individually assembled genomic bins against widely available protein databases. 36 | # 37 | # 38 | # 39 | -------------------------------------------------------------------------------- /pipeline/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meren/BLAST-filtering-pipeline/e7ae4f3e76fcdb755fda2c92d70e719ecc7148f9/pipeline/__init__.py -------------------------------------------------------------------------------- /pipeline/classes/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2011, Marine Biological Laboratory 2 | # 3 | # This program is free software; you can redistribute it and/or modify it under 4 | # the terms of the GNU General Public License as published by the Free 5 | # Software Foundation; either version 2 of the License, or (at your option) 6 | # any later version. 7 | # 8 | # Please read the docs/COPYING file. 9 | -------------------------------------------------------------------------------- /pipeline/classes/blastfilter.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 
11 | 12 | class BLASTFilter(): 13 | def __init__(self): 14 | pass 15 | -------------------------------------------------------------------------------- /pipeline/classes/config.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 11 | 12 | import os 13 | import sys 14 | import imp 15 | from ConfigParser import ConfigParser 16 | 17 | from pipeline.utils import utils 18 | from pipeline.classes.filter import Filter 19 | from pipeline.utils.logger import debug 20 | from pipeline.utils.logger import error 21 | 22 | class ConfigError(Exception): 23 | def __init__(self, e = None): 24 | Exception.__init__(self) 25 | self.e = e 26 | error(e) 27 | return 28 | def __str__(self): 29 | return 'Config Error: %s' % self.e 30 | 31 | 32 | class ConfigParserWrapper(ConfigParser): 33 | """A wrapper class around ConfigParser that overrides the 'sections' 34 | function in order to return sections in the order they appear in the 35 | config file.""" 36 | 37 | def __init__(self, config_file = None): 38 | ConfigParser.__init__(self) 39 | self.config_file = config_file 40 | 41 | def sections(self): 42 | _sections = [] 43 | for line in [l.strip() for l in open(self.config_file) if len(l.strip())]: 44 | if line[0] == '[' and line[-1] == ']': 45 | _sections.append(line[1:-1]) 46 | return _sections 47 | 48 | 49 | class Config: 50 | def __init__(self, args, constants): 51 | if args: 52 | self.args = args 53 | self.constants = constants 54 | self.base_work_dir = self.args.base_work_dir.replace(' ', '_') 55 | self.dataset_name = self.args.dataset_name.replace(' ', '_') 56 | 
self.input = self.args.input 57 | 58 | self.dataset_root_dir = os.path.join(self.base_work_dir, self.dataset_name) 59 | self.filters = [] 60 | self.modules = {} 61 | 62 | debug('Initializing configuration') 63 | self.init_modules() 64 | self.init_essential_files_and_directories() 65 | self.init_filters_config(args.filters_config) 66 | self.init_chain_of_filters() 67 | debug('Config class is initialized with %d modules and %d filters'\ 68 | % (len(self.modules), len(self.filters))) 69 | 70 | 71 | def init_modules(self): 72 | mod_base = self.constants.dirs['modules'] 73 | for file in os.listdir(mod_base): 74 | if file.startswith('mod_') and file.endswith('.py'): 75 | mod_name = file[4:-3] 76 | self.modules[mod_name] = imp.load_source(mod_name, os.path.join(mod_base, file)) 77 | debug('module "%s" found' % mod_name) 78 | 79 | 80 | def init_essential_files_and_directories(self): 81 | IS_RELATIVE = lambda d: not d.startswith('/') 82 | 83 | if len([True for item in [self.base_work_dir, self.input] if IS_RELATIVE(item)]): 84 | raise ConfigError, 'All paths should be absolute (starting with a "/").' 
85 | 86 | if not os.path.exists(self.input): 87 | raise ConfigError, 'Input file is not where it is expected to be: "%s"' % self.input 88 | 89 | utils.check_dir(self.base_work_dir, clean_dir_content = False) 90 | utils.check_dir(self.dataset_root_dir, clean_dir_content = False) 91 | 92 | 93 | def init_filters_config(self, config_file_path): 94 | filters_config = ConfigParserWrapper(config_file_path) 95 | filters_config.read(config_file_path) 96 | for section in filters_config.sections(): 97 | filter = Filter(section) 98 | filter.name = filters_config.get(section, 'filter_name').replace(' ', '_') 99 | 100 | # check if the target database, which happens to be the section name, 101 | # exists 102 | if not (os.path.exists(section) and os.access(section, os.R_OK)): 103 | raise ConfigError, 'Bad target (file not found / no read permission): "%s"' % section 104 | 105 | # assign module 106 | module_from_config = filters_config.get(section, 'module') 107 | if not self.modules.has_key(module_from_config): 108 | raise ConfigError, 'Unknown module for filter "%s": "%s".\nAvailable modules:\n%s' \ 109 | % (filter.name, module_from_config, ', '.join(self.modules.keys())) 110 | else: 111 | filter.module = self.modules[module_from_config] 112 | 113 | # check the availability of the functions and the execution order, if the default 114 | # behavior has been changed manually in the config file 115 | if filters_config.has_option(section, 'execute'): 116 | execute_list_from_config = [e.strip() for e in filters_config.get(section, 'execute').split(',')] 117 | for item in execute_list_from_config: 118 | if item not in filter.module.FUNCTIONS_ORDER: 119 | raise ConfigError, 'Unknown function for module "%s" in "%s": "%s".\nAvailable functions: %s' \ 120 | % (module_from_config, filter.name, item, ', '.join(filter.module.FUNCTIONS_ORDER)) 121 | if len(execute_list_from_config) != len(list(set(execute_list_from_config))): 122 | raise ConfigError, 'Functions cannot be executed more than 
once: %s' \ 123 | % (', '.join(execute_list_from_config)) 124 | 125 | # make sure the order is right. 126 | t = [filter.module.FUNCTIONS_ORDER.index(i) for i in execute_list_from_config] 127 | if False in [t[i] > t[i - 1] for i in range(1, len(t))]: 128 | raise ConfigError, 'Order of functions to be executed is not correct: %s\nFunctions should follow this order: %s' \ 129 | % (', '.join(execute_list_from_config), ', '.join(filter.module.FUNCTIONS_ORDER)) 130 | 131 | filter.execution_order = execute_list_from_config 132 | 133 | debug('filter module functions execution order has been set: "%s"' % (filter.execution_order)) 134 | 135 | # store command line parameters from the config file 136 | for option in [o for o in filters_config.options(section) if o.startswith('cmdparam.')]: 137 | param = '.'.join(option.split('.')[1:]) 138 | opt = filters_config.get(section, option) 139 | filter.cmdparams.append('%s %s' % (param, opt)) 140 | 141 | debug('command line params for filter "%s": %s ' % (filter.name, filter.cmdparams)) 142 | 143 | # store post-search refinement filters from the config file 144 | for option in [o for o in filters_config.options(section) if o.startswith('rfnparam.')]: 145 | param = '.'.join(option.split('.')[1:]) 146 | opt = filters_config.get(section, option) 147 | if param in filter.get_refinement_params(): 148 | filter.rfnparams[param] = filter.module.ALLOWED_RFNPARAMS[param](opt) 149 | else: 150 | raise ConfigError, 'Unknown refinement parameter for filter "%s": "%s"' \ 151 | % (filter.name, param) 152 | 153 | debug('refinement line params for filter "%s": %s ' % (filter.name, filter.rfnparams)) 154 | 155 | 156 | # take care of file paths and directories 157 | J = lambda x: os.path.join(filter.dirs['output'], x) 158 | 159 | filter.dirs['output'] = os.path.join(self.dataset_root_dir, filter.name) 160 | filter.dirs['parts'] = J('parts') 161 | filter.files['search_output'] = J('01_raw_hits.txt') 162 | filter.files['refined_search_output'] = 
J('02_refined_hits.txt') 163 | filter.files['hit_ids'] = J('03_hits.ids') 164 | filter.files['filtered_reads'] = J('04_filtered.fa') 165 | filter.files['survived_reads'] = J('05_survived.fa') 166 | 167 | self.filters.append(filter) 168 | 169 | def init_chain_of_filters(self): 170 | for i in range(0, len(self.filters)): 171 | filter = self.filters[i] 172 | 173 | if i == 0: 174 | # first filter. input should be coming from the command 175 | # line parameters: 176 | filter.files['input'] = self.input 177 | else: 178 | # any filter that is not the first one should use the previous filter's 179 | # output files as input: 180 | filter.files['input'] = self.filters[i - 1].files['survived_reads'] 181 | 182 | 183 | def init_filter_files_and_directories(self, filter): 184 | utils.check_dir(filter.dirs['parts']) 185 | 186 | if __name__ == '__main__': 187 | pass 188 | -------------------------------------------------------------------------------- /pipeline/classes/constants.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 
11 | 12 | import os 13 | import sys 14 | 15 | 16 | class Constants: 17 | def __init__(self, base_dir = None): 18 | self.dirs = {} 19 | 20 | if base_dir: 21 | self.dirs['base'] = base_dir 22 | else: 23 | self.dirs['base'] = '/'.join(os.path.dirname(os.path.abspath(__file__)).split('/')[0:-2]) 24 | 25 | self.dirs['modules'] = os.path.join(self.dirs['base'], 'pipeline/modules') 26 | -------------------------------------------------------------------------------- /pipeline/classes/filter.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 11 | 12 | import os 13 | import sys 14 | from ConfigParser import ConfigParser 15 | 16 | from pipeline.utils import utils 17 | from pipeline.utils import fastalib as u 18 | from pipeline.utils.logger import debug 19 | from pipeline.utils.logger import error 20 | 21 | 22 | class FilterError(Exception): 23 | def __init__(self, e = None): 24 | Exception.__init__(self) 25 | self.e = e 26 | error(e) 27 | return 28 | def __str__(self): 29 | return 'Filter Error: %s' % self.e 30 | 31 | class Filter: 32 | def __init__(self, target_db): 33 | self.target_db = target_db 34 | self.name = None 35 | self.module = None 36 | self.cmdparams = [] 37 | self.rfnparams = {} 38 | self.execution_order = [] 39 | self.dirs = {} 40 | self.files = {} 41 | 42 | def execute(self): 43 | if not self.execution_order: 44 | self.execution_order = self.module.FUNCTIONS_ORDER 45 | 46 | for func in self.execution_order: 47 | self.module.FUNCTION_MAP[func](self) 48 | 49 | self.split() 50 | 51 | def get_refinement_params(self): 52 | if 
hasattr(self.module, 'ALLOWED_RFNPARAMS'): 53 | return self.module.ALLOWED_RFNPARAMS.keys() 54 | else: 55 | return [] 56 | 57 | def split(self): 58 | """This function creates the 04_filtered.fa and 05_survived.fa 59 | files from the self.files['input'] file, using the IDs in 60 | self.files['hit_ids'] provided by the filter""" 61 | 62 | # FIXME: the user should be able to change the default behavior of 63 | # this function (for instance, the user may require a filter not 64 | # to split the content of the input file, so that the same input 65 | # is used by the next filter). 66 | 67 | utils.split_file(self.files['hit_ids'], 68 | self.files['input'], 69 | self.files['filtered_reads'], 70 | self.files['survived_reads']) 71 | 72 | -------------------------------------------------------------------------------- /pipeline/modules/README.md: -------------------------------------------------------------------------------- 1 | Modules 2 | ======= 3 | 4 | This file will explain how a new module can be implemented so that it can be called as a filter from the pipeline. 5 | 6 | Here is a 'FIXME' label so I don't forget to come back to this. 7 | 8 | Sample 9 | ====== 10 | 11 | An empty module frame: 12 | 13 | # -*- coding: utf-8 -*- 14 | 15 | # Copyright (C) YEAR, NAME/INSTITUTE 16 | # 17 | # This program is free software; you can redistribute it and/or modify it under 18 | # the terms of the GNU General Public License as published by the Free 19 | # Software Foundation; either version 2 of the License, or (at your option) 20 | # any later version. 21 | # 22 | # Please read the docs/COPYING file. 
23 | 24 | DESCRIPTION = "Module description line" 25 | 26 | def clean(m): 27 | """Clean the content of the module's work directory""" 28 | pass 29 | 30 | def init(m): 31 | """Initialize files and directories""" 32 | pass 33 | 34 | def run(m): 35 | """Run-time task""" 36 | pass 37 | 38 | def refine(m): 39 | """Refine search output""" 40 | pass 41 | 42 | def finalize(m): 43 | """Generate the list of IDs that will not go to the next filter""" 44 | pass 45 | 46 | # the pipeline looks these two names up on every module (see Filter.execute) 47 | FUNCTIONS_ORDER = ['clean', 'init', 'run', 'refine', 'finalize'] 48 | FUNCTION_MAP = {'clean': clean, 'init': init, 'run': run, 'refine': refine, 'finalize': finalize} 49 | 50 | -------------------------------------------------------------------------------- /pipeline/modules/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2011, Marine Biological Laboratory 2 | # 3 | # This program is free software; you can redistribute it and/or modify it under 4 | # the terms of the GNU General Public License as published by the Free 5 | # Software Foundation; either version 2 of the License, or (at your option) 6 | # any later version. 7 | # 8 | # Please read the docs/COPYING file. 9 | -------------------------------------------------------------------------------- /pipeline/modules/mod_blast.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 
11 | 12 | DESCRIPTION = "BLAST module" 13 | 14 | SEARCH_COMMAND = "" 15 | 16 | ALLOWED_RFNPARAMS = {} 17 | 18 | from pipeline.utils import utils 19 | from pipeline.utils.logger import debug 20 | from pipeline.utils.logger import error 21 | 22 | def clean(m): 23 | pass 24 | 25 | def init(m): 26 | pass 27 | 28 | def run(m): 29 | pass 30 | 31 | def refine(m): 32 | pass 33 | 34 | def finalize(m): 35 | pass 36 | 37 | # names the pipeline looks up on every module (see Filter.execute) 38 | FUNCTIONS_ORDER = ['clean', 'init', 'run', 'refine', 'finalize'] 39 | FUNCTION_MAP = {'clean': clean, 'init': init, 'run': run, 'refine': refine, 'finalize': finalize} 40 | -------------------------------------------------------------------------------- /pipeline/modules/mod_usearch.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 
11 | 12 | DESCRIPTION = "USEARCH module" 13 | 14 | SEARCH_COMMAND = "usearch -query %(input)s -blast6out %(output)s -wdb %(target)s %(cmdparams)s &> %(log)s" 15 | 16 | ALLOWED_RFNPARAMS = {'min_alignment_length': int, 17 | 'min_identity': float, 18 | 'unique_hits': int} 19 | 20 | from pipeline.utils import utils 21 | from pipeline.utils.logger import debug 22 | from pipeline.utils.logger import error 23 | 24 | 25 | class ModuleError(Exception): 26 | def __init__(self, e = None): 27 | Exception.__init__(self) 28 | self.e = e 29 | error(e) 30 | return 31 | def __str__(self): 32 | return 'Module Error: %s' % self.e 33 | 34 | def clean(m): 35 | utils.check_dir(m.dirs['parts'], clean_dir_content = True) 36 | 37 | def init(m): 38 | m.files['parts'] = utils.split_fasta_file(m.files['input'], m.dirs['parts'], prefix = 'part') 39 | 40 | def search(m): 41 | parts = m.files['parts'] 42 | for part in parts: 43 | params = {'input': part, 'output': part + '.b6', 'target': m.target_db, 44 | 'log': part + '.log', 'cmdparams': ' '.join(m.cmdparams)} 45 | debug('searching part %d/%d (log: %s)' % (parts.index(part) + 1, len(parts), params['log'])) 46 | cmdline = SEARCH_COMMAND % params 47 | utils.run_command(cmdline) 48 | 49 | dest_file = m.files['search_output'] 50 | utils.concatenate_files(dest_file, [part + '.b6' for part in m.files['parts']]) 51 | 52 | def filter(m): 53 | utils.refine_b6(m.files['search_output'], m.files['refined_search_output'], m.rfnparams) 54 | utils.store_ids_from_b6_output(m.files['refined_search_output'], m.files['hit_ids']) 55 | 56 | 57 | FUNCTIONS_ORDER = ['clean', 'init', 'search', 'filter'] 58 | FUNCTION_MAP = {'clean' : clean, 59 | 'init' : init, 60 | 'search': search, 61 | 'filter': filter} 62 | 63 | 64 | -------------------------------------------------------------------------------- /pipeline/modules/mod_usearch6.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 
2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 11 | 12 | DESCRIPTION = "USEARCH 6 module" 13 | # available from http://www.drive5.com/usearch/download.html, 14 | # tested with USEARCH version v6.0.307 15 | 16 | SEARCH_COMMAND = "usearch -usearch_global %(input)s -db %(target)s %(cmdparams)s -blast6out %(output)s -strand both &> %(log)s" 17 | 18 | ALLOWED_RFNPARAMS = {'min_alignment_length': int, 19 | 'min_identity': float, 20 | 'unique_hits': int} 21 | 22 | from pipeline.utils import utils 23 | from pipeline.utils.logger import debug 24 | from pipeline.utils.logger import error 25 | 26 | 27 | class ModuleError(Exception): 28 | def __init__(self, e = None): 29 | Exception.__init__(self) 30 | self.e = e 31 | error(e) 32 | return 33 | def __str__(self): 34 | return 'Module Error: %s' % self.e 35 | 36 | def clean(m): 37 | utils.check_dir(m.dirs['parts'], clean_dir_content = True) 38 | 39 | def init(m): 40 | m.files['parts'] = utils.split_fasta_file(m.files['input'], m.dirs['parts'], prefix = 'part') 41 | 42 | def search(m): 43 | parts = m.files['parts'] 44 | for part in parts: 45 | params = {'input': part, 'output': part + '.b6', 'target': m.target_db, 46 | 'log': part + '.log', 'cmdparams': ' '.join(m.cmdparams)} 47 | debug('searching part %d/%d (log: %s)' % (parts.index(part) + 1, len(parts), params['log'])) 48 | cmdline = SEARCH_COMMAND % params 49 | utils.run_command(cmdline) 50 | 51 | dest_file = m.files['search_output'] 52 | utils.concatenate_files(dest_file, [part + '.b6' for part in m.files['parts']]) 53 | 54 | def filter(m): 55 | utils.refine_b6(m.files['search_output'], m.files['refined_search_output'], m.rfnparams) 56 | 
utils.store_ids_from_b6_output(m.files['refined_search_output'], m.files['hit_ids']) 57 | 58 | 59 | FUNCTIONS_ORDER = ['clean', 'init', 'search', 'filter'] 60 | FUNCTION_MAP = {'clean' : clean, 61 | 'init' : init, 62 | 'search': search, 63 | 'filter': filter} 64 | 65 | 66 | -------------------------------------------------------------------------------- /pipeline/utils/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2011, Marine Biological Laboratory 2 | # 3 | # This program is free software; you can redistribute it and/or modify it under 4 | # the terms of the GNU General Public License as published by the Free 5 | # Software Foundation; either version 2 of the License, or (at your option) 6 | # any later version. 7 | # 8 | # Please read the docs/COPYING file. 9 | -------------------------------------------------------------------------------- /pipeline/utils/b6lib.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # v.112211 3 | 4 | # Copyright (C) 2011, Marine Biological Laboratory 5 | # 6 | # This program is free software; you can redistribute it and/or modify it under 7 | # the terms of the GNU General Public License as published by the Free 8 | # Software Foundation; either version 2 of the License, or (at your option) 9 | # any later version. 10 | # 11 | # Please read the docs/COPYING file. 
12 | 13 | import os 14 | import sys 15 | import numpy 16 | 17 | sys.path.append('/'.join(os.path.abspath(__file__).split('/')[:-3])) 18 | try: 19 | import pipeline.utils.utils 20 | pp = pipeline.utils.utils.pp 21 | except: 22 | pp = lambda x: x 23 | 24 | 25 | QUERY_ID, SUBJECT_ID, IDENTITY, ALIGNMENT_LENGTH,\ 26 | MISMATCHES, GAPS, Q_START, Q_END, S_START, S_END,\ 27 | E_VALUE, BIT_SCORE = range(0, 12) 28 | 29 | 30 | class B6Source: 31 | def __init__(self, b6_source, lazy_init = True): 32 | self.init() 33 | 34 | self.b6_source = b6_source 35 | self.file_pointer = open(self.b6_source) 36 | self.file_pointer.seek(0) 37 | 38 | self.conversion = [str, str, float, int, int, int, int, int, int, int, str, str] 39 | 40 | if lazy_init: 41 | self.total_seq = None 42 | else: 43 | self.total_seq = len([l for l in self.file_pointer.readlines() if not l.startswith('#')]) 44 | 45 | def init(self): 46 | self.pos = 0 47 | self.entry = None 48 | self.matrix = [] 49 | 50 | #b6 columns.. 51 | self.query_id = None 52 | self.subject_id = None 53 | self.identity = None 54 | self.alignment_length = None 55 | self.mismatches = None 56 | self.gaps = None 57 | self.q_start = None 58 | self.q_end = None 59 | self.s_start = None 60 | self.s_end = None 61 | self.e_value = None 62 | self.bit_score = None 63 | 64 | def show_progress(self, end = False): 65 | sys.stderr.write('\r[b6lib] Reading: %s' % (pp(self.pos))) 66 | sys.stderr.flush() 67 | if end: 68 | sys.stderr.write('\n') 69 | 70 | def next(self, raw = False, show_progress = False, progress_step = 10000): 71 | while 1: 72 | self.entry = self.file_pointer.readline() 73 | 74 | if self.entry == '': 75 | if show_progress: 76 | self.show_progress(end = True) 77 | return False 78 | 79 | self.entry = self.entry.strip() 80 | 81 | if not (self.entry.startswith('#') or len(self.entry) == 0): 82 | self.pos += 1 83 | break 84 | 85 | if show_progress and (self.pos == 1 or self.pos % progress_step == 0): 86 | self.show_progress() 87 | 88 | if raw == 
True: 89 | return True 90 | 91 | try: 92 | self.query_id, self.subject_id, self.identity, self.alignment_length,\ 93 | self.mismatches, self.gaps, self.q_start, self.q_end, self.s_start,\ 94 | self.s_end, self.e_value, self.bit_score =\ 95 | [self.conversion[x](self.entry.split('\t')[x]) for x in range(0, 12)] 96 | 97 | try: 98 | self.e_value = float(self.e_value) 99 | except: 100 | pass 101 | 102 | try: 103 | self.bit_score = float(self.bit_score) 104 | except: 105 | pass 106 | 107 | except: 108 | print 109 | print 'Error: There is something wrong with this entry in the B6 file' 110 | print self.entry 111 | sys.exit() 112 | 113 | self.entry += '\n' 114 | return True 115 | 116 | def reset(self): 117 | self.init() 118 | self.file_pointer.seek(0) 119 | 120 | 121 | def load_b6_matrix(self): 122 | for i in range(0, 12): 123 | self.matrix.append([]) 124 | 125 | F = lambda x, i: self.conversion[i](x) 126 | 127 | while self.next(raw = True): 128 | if self.pos % 10000 == 0 or self.pos == 1: 129 | sys.stderr.write('\r[b6_matrix] Reading: %s' % (pp(self.pos))) 130 | sys.stderr.flush() 131 | 132 | b6_columns = self.entry.split(('\t')) 133 | for i in range(0, 12): 134 | self.matrix[i].append(F(b6_columns[i], i)) 135 | 136 | sys.stderr.write('\n') 137 | return True 138 | 139 | 140 | def print_b6_file_stats(self): 141 | if self.matrix == []: 142 | self.load_b6_matrix() 143 | 144 | TABULAR = lambda x, y: sys.stdout.write('%s %s: %s\n' % (x, '.' 
* (20 - len(x)), y)) 145 | INFO = lambda x: '%-10.2f %-10.2f %-10.2f %-10.2f'\ 146 | % (numpy.mean(self.matrix[x]), 147 | numpy.std(self.matrix[x]), 148 | numpy.min(self.matrix[x]), 149 | numpy.max(self.matrix[x])) 150 | 151 | print 152 | TABULAR('Total Hits', pp(len(self.matrix[IDENTITY]))) 153 | print 154 | print ' mean std min max' 155 | print 156 | TABULAR('Identity', INFO(IDENTITY)) 157 | TABULAR('Alignment Length', INFO(ALIGNMENT_LENGTH)) 158 | TABULAR('Mismatches', INFO(MISMATCHES)) 159 | TABULAR('Gaps', INFO(GAPS)) 160 | TABULAR('Query Start', INFO(Q_START)) 161 | TABULAR('Query End', INFO(Q_END)) 162 | TABULAR('Target Start', INFO(S_START)) 163 | TABULAR('Target End', INFO(S_END)) 164 | TABULAR('E-Value', INFO(E_VALUE)) 165 | TABULAR('Bit Score', INFO(BIT_SCORE)) 166 | print 167 | 168 | def visualize_b6_output(self, title_hint, Q_LENGTH = 101): 169 | if self.matrix == []: 170 | self.load_b6_matrix() 171 | 172 | import matplotlib.pyplot as plt 173 | import matplotlib.gridspec as gridspec 174 | 175 | def _setp(b, c = 'red'): 176 | plt.setp(b['medians'], color=c) 177 | plt.setp(b['whiskers'], color='black', alpha=0.6) 178 | plt.setp(b['boxes'], color='black', alpha=0.8) 179 | plt.setp(b['caps'], color='black', alpha=0.6) 180 | plt.setp(b['fliers'], color='#EEEEEE', alpha=0.01) 181 | 182 | fig = plt.figure(figsize = (24, 12)) 183 | plt.rcParams.update({'axes.linewidth' : 0.9}) 184 | plt.rc('grid', color='0.50', linestyle='-', linewidth=0.1) 185 | 186 | gs = gridspec.GridSpec(2, 19) 187 | 188 | # 189 | # UPPER PANEL, Q_START AND Q_END 190 | # 191 | 192 | ax1 = plt.subplot(gs[0:15]) 193 | plt.grid(True) 194 | 195 | plt.subplots_adjust(left=0.03, bottom = 0.05, top = 0.92, right = 0.97) 196 | 197 | plt.title('Alignment Start / End Positions for "%s" (Number of Hits: %s)'\ 198 | % (os.path.basename(self.b6_source) if not title_hint else title_hint, pp(len(self.matrix[0])))) 199 | 200 | p1 = [0] * Q_LENGTH 201 | p2 = [0] * Q_LENGTH 202 | 203 | for i in 
self.matrix[Q_START]: 204 | p1[i - 1] += 1 205 | for i in self.matrix[Q_END]: 206 | p2[i - 1] += 1 207 | 208 | p1 = [x * 100.0 / sum(p1) for x in p1] 209 | p2 = [x * 100.0 / sum(p2) for x in p2] 210 | 211 | for i in range(0, len(p1)): 212 | plt.bar([i], [100], color='green', alpha = (p1[i] / max(p1)) * 0.8, width = 1, edgecolor='green') 213 | for i in range(0, len(p2)): 214 | plt.bar([i], [100], color='purple', alpha = (p2[i] / max(p2)) * 0.8, width = 1, linewidth = 0) 215 | 216 | ax1.plot(p1, c = 'black', linewidth = 3) 217 | ax1.plot(p1, c = 'green', label = 'Alignment Start Position') 218 | ax1.plot(p2, c = 'black', linewidth = 3) 219 | ax1.plot(p2, c = 'red', label = 'Alignment End Position') 220 | plt.fill_between(range(0, len(p1)), p1, y2 = 0, color = 'black', alpha = 0.5) 221 | plt.fill_between(range(0, len(p2)), p2, y2 = 0, color = 'black', alpha = 0.5) 222 | 223 | plt.ylabel('Percent of Hits') 224 | plt.xlabel('Position') 225 | plt.xticks(range(0, Q_LENGTH, Q_LENGTH / 100), range(1, Q_LENGTH + 1, Q_LENGTH / 100), rotation=90, size='xx-small') 226 | plt.yticks([t for t in range(0, 101, 10)], ['%s%%' % t for t in range(0, 101, 10)], size='xx-small') 227 | plt.ylim(ymin = 0, ymax = 100) 228 | plt.xlim(xmin = 0, xmax = Q_LENGTH - 1) 229 | 230 | plt.legend() 231 | 232 | 233 | #UPPER PANEL RIGHT SIDE 234 | 235 | ax1b = plt.subplot(gs[16:19]) 236 | plt.title('Percent Identity Breakdown') 237 | 238 | plt.grid(True) 239 | percent_brake_down = [] 240 | for p in range(90, 101): 241 | percent_brake_down.append(len([True for x in self.matrix[IDENTITY] if x >= p]) * 100.0 / len(self.matrix[IDENTITY])) 242 | 243 | percent_differences = [] 244 | for i in range(0, len(percent_brake_down)): 245 | if i < len(percent_brake_down) - 1: 246 | percent_differences.append(percent_brake_down[i] - percent_brake_down[i + 1]) 247 | else: 248 | percent_differences.append(percent_brake_down[i]) 249 | percent_differences.sort(reverse = True) 250 | 251 | 252 | ax1b.bar([t + .05 for t in 
range(0, 11)], percent_differences, width = .9, color = 'orange') 253 | plt.xlim(xmax = 11) 254 | plt.ylim(ymax = 100, ymin = 0) 255 | plt.xticks([t + .5 for t in range(0, 11)], ['%s%%' % t for t in range(100, 89, -1)], rotation=90, size='xx-small') 256 | plt.yticks([t for t in range(0, 101, 10)], ['%s%%' % t for t in range(0, 101, 10)], size='xx-small') 257 | plt.xlabel('Percent Identity Level') 258 | plt.ylabel('Percent of Hits') 259 | 260 | # BOX 1 261 | ax2 = plt.subplot(gs[19:22]) 262 | plt.grid(True) 263 | plt.title('Query Alignment Start / End Positions') 264 | plt.ylabel('Position in Query') 265 | b2 = ax2.boxplot([self.matrix[Q_START], self.matrix[Q_END]], positions=[0.5, 1.5], sym=',', widths=0.7) 266 | _setp(b2) 267 | plt.xticks([0.5, 1.5], ['Start', 'End']) 268 | plt.ylim(ymax = 101) 269 | 270 | # BOX 2 271 | ax3 = plt.subplot(gs[23:26]) 272 | plt.grid(True) 273 | plt.title('Target Alignment Start / End Positions') 274 | plt.ylabel('Position in Target') 275 | b3 = ax3.boxplot([self.matrix[S_START], self.matrix[S_END]], positions=[0.5, 1.5], sym=',', widths=0.7) 276 | _setp(b3) 277 | plt.xticks([0.5, 1.5], ['Start', 'End']) 278 | 279 | 280 | # BOX 3 281 | ax4 = plt.subplot(gs[27:29]) 282 | plt.grid(True) 283 | plt.title('Percent Identity to Target') 284 | plt.ylabel('Percent') 285 | b4 = ax4.boxplot(self.matrix[IDENTITY], positions=[0.5], sym=',', widths=0.7) 286 | _setp(b4, 'purple') 287 | plt.xticks([0.5], []) 288 | plt.ylim(ymax = 101, ymin = 0) 289 | 290 | 291 | # BOX 4 292 | ax5 = plt.subplot(gs[30:32]) 293 | plt.grid(True) 294 | plt.title('Alignment Length') 295 | plt.ylabel('Nucleotide') 296 | b5 = ax5.boxplot(self.matrix[ALIGNMENT_LENGTH], positions=[0.5], sym=',', widths=0.7) 297 | _setp(b5, 'orange') 298 | plt.xticks([0.5], []) 299 | plt.ylim(ymax = 101, ymin = 0) 300 | 301 | # BOX 5 302 | ax6 = plt.subplot(gs[33:35]) 303 | plt.grid(True) 304 | plt.title('Mismatches and Gaps') 305 | plt.ylabel('Number') 306 | b6 = 
ax6.boxplot([self.matrix[MISMATCHES], self.matrix[GAPS]], positions=[0.5, 1.5], sym=',', widths=0.7) 307 | _setp(b6, 'brown') 308 | plt.xticks([0.5, 1.5], ['Mismatches', 'Gaps']) 309 | 310 | # BOX 6 311 | ax7 = plt.subplot(gs[36:38]) 312 | plt.grid(True) 313 | plt.title('Bit Score') 314 | b7 = ax7.boxplot(self.matrix[BIT_SCORE], positions=[0.5], sym=',', widths=0.7) 315 | _setp(b7, 'green') 316 | plt.xticks([0.5], []) 317 | 318 | 319 | try: 320 | plt.savefig(self.b6_source + '.tiff') 321 | except: 322 | plt.savefig(self.b6_source + '.png') 323 | 324 | try: 325 | plt.show() 326 | except: 327 | pass 328 | 329 | return 330 | 331 | if __name__ == '__main__': 332 | b6_f_name = sys.argv[1] 333 | b6 = B6Source(b6_f_name) 334 | b6.visualize_b6_output(sys.argv[2] if len(sys.argv) == 3 else None) 335 | b6.print_b6_file_stats() 336 | -------------------------------------------------------------------------------- /pipeline/utils/cmdlinehandler.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 
11 | 12 | 13 | import argparse 14 | 15 | def get_parser_obj(): 16 | parser = argparse.ArgumentParser(description='Metagenomics BLAST Filtering Pipeline') 17 | parser.add_argument('-s', '--filters-config', required=True, metavar = 'CONFIG FILE PATH', 18 | help = 'File in which BLAST filtering targets are defined') 19 | parser.add_argument('-i', '--input', required=True, metavar = 'FILE PATH', 20 | help = 'Input file in FASTA format') 21 | parser.add_argument('-o', '--base-work-dir', required=True, metavar = 'DIRECTORY', 22 | help = 'Base working directory (in which new directories for datasets ' 23 | 'will be created to store output files)') 24 | parser.add_argument('-d', '--dataset-name', required=True, metavar = 'NAME', 25 | help = 'Dataset name for file and directory names') 26 | 27 | parser.add_argument('--dry-run', action = 'store_true', default = False, 28 | help = 'Print out configuration and exit') 29 | 30 | return parser 31 | 32 | -------------------------------------------------------------------------------- /pipeline/utils/fastalib.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # v.120530 3 | 4 | # Copyright (C) 2011, Marine Biological Laboratory 5 | # 6 | # This program is free software; you can redistribute it and/or modify it under 7 | # the terms of the GNU General Public License as published by the Free 8 | # Software Foundation; either version 2 of the License, or (at your option) 9 | # any later version. 10 | # 11 | # Please read the docs/COPYING file. 
12 | 13 | import os 14 | import sys 15 | import numpy 16 | import hashlib 17 | 18 | class FastaOutput: 19 | def __init__(self, output_file_path): 20 | self.output_file_path = output_file_path 21 | self.output_file_obj = open(output_file_path, 'w') 22 | 23 | def store(self, entry, split = True, store_frequencies = True): 24 | if entry.unique and store_frequencies: 25 | self.write_id('%s|%s' % (entry.id, 'frequency:%d' % len(entry.ids))) 26 | else: 27 | self.write_id(entry.id) 28 | 29 | self.write_seq(entry.seq, split) 30 | 31 | def write_id(self, id): 32 | self.output_file_obj.write('>%s\n' % id) 33 | 34 | def write_seq(self, seq, split = True): 35 | if split: 36 | seq = self.split(seq) 37 | self.output_file_obj.write('%s\n' % seq) 38 | 39 | def split(self, sequence, piece_length = 80): 40 | ticks = range(0, len(sequence), piece_length) + [len(sequence)] 41 | return '\n'.join([sequence[ticks[x]:ticks[x + 1]] for x in range(0, len(ticks) - 1)]) 42 | 43 | def close(self): 44 | self.output_file_obj.close() 45 | 46 | 47 | class ReadFasta: 48 | def __init__(self, f_name): 49 | self.ids = [] 50 | self.sequences = [] 51 | 52 | self.fasta = SequenceSource(f_name) 53 | 54 | while self.fasta.next(): 55 | if self.fasta.pos % 1000 == 0 or self.fasta.pos == 1: 56 | sys.stderr.write('\r[fastalib] Reading FASTA into memory: %s' % (self.fasta.pos)) 57 | sys.stderr.flush() 58 | self.ids.append(self.fasta.id) 59 | self.sequences.append(self.fasta.seq) 60 | 61 | sys.stderr.write('\n') 62 | 63 | def close(self): 64 | self.fasta.close() 65 | 66 | 67 | class SequenceSource(): 68 | def __init__(self, fasta_file_path, lazy_init = True, unique = False): 69 | self.fasta_file_path = fasta_file_path 70 | self.name = None 71 | self.lazy_init = lazy_init 72 | 73 | self.pos = 0 74 | self.id = None 75 | self.seq = None 76 | self.ids = [] 77 | 78 | self.unique = unique 79 | self.unique_hash_dict = {} 80 | self.unique_hash_list = [] 81 | self.unique_next_hash = 0 82 | 83 | self.file_pointer = 
open(self.fasta_file_path) 84 | self.file_pointer.seek(0) 85 | 86 | if self.lazy_init: 87 | self.total_seq = None 88 | else: 89 | self.total_seq = len([l for l in self.file_pointer.readlines() if l.startswith('>')]) 90 | self.reset() 91 | 92 | if self.unique: 93 | self.init_unique_hash() 94 | 95 | def init_unique_hash(self): 96 | while self.next_regular(): 97 | hash = hashlib.sha1(self.seq.upper()).hexdigest() 98 | if hash in self.unique_hash_dict: 99 | self.unique_hash_dict[hash]['ids'].append(self.id) 100 | self.unique_hash_dict[hash]['count'] += 1 101 | else: 102 | self.unique_hash_dict[hash] = {'id' : self.id, 103 | 'ids': [self.id], 104 | 'seq': self.seq, 105 | 'count': 1} 106 | 107 | self.unique_hash_list = [i[1] for i in sorted([(self.unique_hash_dict[hash]['count'], hash)\ 108 | for hash in self.unique_hash_dict], reverse = True)] 109 | 110 | 111 | self.total_unique = len(self.unique_hash_dict) 112 | self.reset() 113 | 114 | def next(self): 115 | if self.unique: 116 | return self.next_unique() 117 | else: 118 | return self.next_regular() 119 | 120 | def next_unique(self): 121 | if self.unique: 122 | if self.total_unique > 0 and self.pos < self.total_unique: 123 | hash_entry = self.unique_hash_dict[self.unique_hash_list[self.pos]] 124 | 125 | self.pos += 1 126 | self.seq = hash_entry['seq'] 127 | self.id = hash_entry['id'] 128 | self.ids = hash_entry['ids'] 129 | 130 | return True 131 | else: 132 | return False 133 | else: 134 | return False 135 | 136 | def next_regular(self): 137 | self.seq = None 138 | self.id = self.file_pointer.readline()[1:].strip() 139 | sequence = '' 140 | 141 | while 1: 142 | line = self.file_pointer.readline() 143 | if not line: 144 | if len(sequence): 145 | self.seq = sequence 146 | self.pos += 1 147 | return True 148 | else: 149 | return False 150 | if line.startswith('>'): 151 | self.file_pointer.seek(self.file_pointer.tell() - len(line)) 152 | break 153 | sequence += line.strip() 154 | 155 | self.seq = sequence 156 | self.pos += 
1 157 | return True 158 | 159 | 160 | def close(self): 161 | self.file_pointer.close() 162 | 163 | def reset(self): 164 | self.pos = 0 165 | self.id = None 166 | self.seq = None 167 | self.ids = [] 168 | self.file_pointer.seek(0) 169 | 170 | def visualize_sequence_length_distribution(self, title, dest = None, max_seq_len = None, xtickstep = None, ytickstep = None): 171 | import matplotlib.pyplot as plt 172 | import matplotlib.gridspec as gridspec 173 | 174 | sequence_lengths = [] 175 | 176 | self.reset() 177 | 178 | while self.next(): 179 | if self.pos % 10000 == 0 or self.pos == 1: 180 | sys.stderr.write('\r[fastalib] Reading: %s' % (self.pos)) 181 | sys.stderr.flush() 182 | sequence_lengths.append(len(self.seq)) 183 | 184 | self.reset() 185 | 186 | sys.stderr.write('\n') 187 | 188 | if not max_seq_len: 189 | max_seq_len = max(sequence_lengths) + (int(max(sequence_lengths) / 100.0) or 10) 190 | 191 | seq_len_distribution = [0] * (max_seq_len + 1) 192 | 193 | for l in sequence_lengths: 194 | seq_len_distribution[l] += 1 195 | 196 | fig = plt.figure(figsize = (16, 12)) 197 | plt.rcParams.update({'axes.linewidth' : 0.9}) 198 | plt.rc('grid', color='0.50', linestyle='-', linewidth=0.1) 199 | 200 | gs = gridspec.GridSpec(10, 1) 201 | 202 | ax1 = plt.subplot(gs[0:8]) 203 | plt.grid(True) 204 | plt.subplots_adjust(left=0.05, bottom = 0.03, top = 0.95, right = 0.98) 205 | 206 | plt.plot(seq_len_distribution, color = 'black', alpha = 0.3) 207 | plt.fill_between(range(0, max_seq_len + 1), seq_len_distribution, y2 = 0, color = 'black', alpha = 0.15) 208 | plt.ylabel('number of sequences') 209 | plt.xlabel('sequence length') 210 | 211 | if xtickstep == None: 212 | xtickstep = (max_seq_len / 50) or 1 213 | 214 | if ytickstep == None: 215 | ytickstep = max(seq_len_distribution) / 20 or 1 216 | 217 | plt.xticks(range(xtickstep, max_seq_len + 1, xtickstep), rotation=90, size='xx-small') 218 | plt.yticks(range(0, max(seq_len_distribution) + 1, ytickstep), 219 | [y for y in 
range(0, max(seq_len_distribution) + 1, ytickstep)], 220 | size='xx-small') 221 | plt.xlim(xmin = 0, xmax = max_seq_len) 222 | plt.ylim(ymin = 0, ymax = max(seq_len_distribution) + (max(seq_len_distribution) / 20.0)) 223 | 224 | plt.figtext(0.5, 0.96, '%s' % (title), weight = 'black', size = 'xx-large', ha = 'center') 225 | 226 | ax1 = plt.subplot(gs[9]) 227 | plt.rcParams.update({'axes.edgecolor' : 20}) 228 | plt.grid(False) 229 | plt.yticks([]) 230 | plt.xticks([]) 231 | plt.text(0.02, 0.5, 'total: %s / mean: %.2f / std: %.2f / min: %s / max: %s'\ 232 | % (len(sequence_lengths), 233 | numpy.mean(sequence_lengths), numpy.std(sequence_lengths),\ 234 | min(sequence_lengths),\ 235 | max(sequence_lengths)),\ 236 | va = 'center', alpha = 0.8, size = 'x-large') 237 | 238 | if dest == None: 239 | dest = self.fasta_file_path 240 | 241 | try: 242 | plt.savefig(dest + '.tiff') 243 | except: 244 | plt.savefig(dest + '.png') 245 | 246 | try: 247 | plt.show() 248 | except: 249 | pass 250 | 251 | return 252 | 253 | 254 | class QualSource: 255 | def __init__(self, quals_file_path, lazy_init = True): 256 | self.quals_file_path = quals_file_path 257 | self.name = None 258 | self.lazy_init = lazy_init 259 | 260 | self.pos = 0 261 | self.id = None 262 | self.quals = None 263 | self.quals_int = None 264 | self.ids = [] 265 | 266 | self.file_pointer = open(self.quals_file_path) 267 | self.file_pointer.seek(0) 268 | 269 | if self.lazy_init: 270 | self.total_quals = None 271 | else: 272 | self.total_quals = len([l for l in self.file_pointer.readlines() if l.startswith('>')]) 273 | self.reset() 274 | 275 | 276 | def next(self): 277 | self.id = self.file_pointer.readline()[1:].strip() 278 | self.quals = None 279 | self.quals_int = None 280 | 281 | qualscores = '' 282 | 283 | while 1: 284 | line = self.file_pointer.readline() 285 | if not line: 286 | if len(qualscores): 287 | self.quals = qualscores.strip() 288 | self.quals_int = [int(q) for q in self.quals.split()] 289 | self.pos += 1 290 
| return True 291 | else: 292 | return False 293 | if line.startswith('>'): 294 | self.file_pointer.seek(self.file_pointer.tell() - len(line)) 295 | break 296 | qualscores += ' ' + line.strip() 297 | 298 | self.quals = qualscores.strip() 299 | self.quals_int = [int(q) for q in self.quals.split()] 300 | self.pos += 1 301 | 302 | return True 303 | 304 | def close(self): 305 | self.file_pointer.close() 306 | 307 | def reset(self): 308 | self.pos = 0 309 | self.id = None 310 | self.quals = None 311 | self.quals_int = None 312 | self.ids = [] 313 | self.file_pointer.seek(0) 314 | 315 | 316 | if __name__ == '__main__': 317 | fasta = SequenceSource(sys.argv[1]) 318 | fasta.visualize_sequence_length_distribution(title = sys.argv[2] if len(sys.argv) == 3 else 'None') 319 | -------------------------------------------------------------------------------- /pipeline/utils/logger.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2010 - 2011, University of New Orleans 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the COPYING file. 11 | # 12 | # -- 13 | # 14 | # Caller aware logging. 
15 | # 16 | 17 | import os 18 | import sys 19 | import time 20 | import string 21 | 22 | def findCaller(): 23 | if string.lower(__file__[-4:]) in [".pyc", ".pyo"]: 24 | srcFile = __file__[:-4] + ".py" 25 | else: 26 | srcFile = __file__ 27 | srcFile = os.path.normcase(srcFile) 28 | 29 | f = sys._getframe().f_back 30 | while 1: 31 | co = f.f_code 32 | filename = os.path.normcase(co.co_filename) 33 | if filename == srcFile: 34 | f = f.f_back 35 | continue 36 | return ":".join([os.path.basename(filename)[:-3], str(f.f_lineno)]) 37 | 38 | 39 | def __raw(msg, f): 40 | output = "\n%s\n" % (msg) 41 | if f: 42 | open(f, "a").write(output) 43 | sys.stdout.write(output) 44 | 45 | def __log(level, caller, msg, f): 46 | output = "%s | %-6.6s| %-16.16s | %s\n" %(time.asctime(), level, caller, msg) 47 | if f: 48 | open(f, "a").write(output) 49 | sys.stdout.write(output) 50 | 51 | 52 | def debug(msg, f = None): 53 | caller = findCaller() 54 | __log("DEBUG", caller, msg, f) 55 | 56 | def error(msg, f = None): 57 | caller = findCaller() 58 | __log("ERROR", caller, msg, f) 59 | 60 | def info(msg, f = None): 61 | caller = findCaller() 62 | __log("INFO", caller, msg, f) 63 | 64 | def raw(msg, f = None): 65 | __raw(msg, f) 66 | -------------------------------------------------------------------------------- /pipeline/utils/utils.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | # Copyright (C) 2011, Marine Biological Laboratory 4 | # 5 | # This program is free software; you can redistribute it and/or modify it under 6 | # the terms of the GNU General Public License as published by the Free 7 | # Software Foundation; either version 2 of the License, or (at your option) 8 | # any later version. 9 | # 10 | # Please read the docs/COPYING file. 
11 | 12 | 13 | import gc 14 | import os 15 | import sys 16 | import shutil 17 | import inspect 18 | import subprocess 19 | 20 | from pipeline.utils.logger import debug 21 | from pipeline.utils.logger import error 22 | from pipeline.utils.fastalib import SequenceSource 23 | from pipeline.utils.b6lib import B6Source 24 | 25 | 26 | class UtilsError(Exception): 27 | def __init__(self, e = None): 28 | Exception.__init__(self) 29 | self.e = e 30 | error(e) 31 | return 32 | def __str__(self): 33 | return 'Utils Error: %s' % self.e 34 | 35 | 36 | def pp(n): 37 | """Pretty print function for very big numbers.""" 38 | ret = [] 39 | n = str(n) 40 | for i in range(len(n) - 1, -1, -1): 41 | ret.append(n[i]) 42 | if (len(n) - i) % 3 == 0: 43 | ret.append(',') 44 | ret.reverse() 45 | return ''.join(ret[1:]) if ret[0] == ',' else ''.join(ret) 46 | 47 | def p_tabular(label, value, label_length = 20, file_obj = None): 48 | info_line = "%s %s: %s" % (label, '.' * (label_length - len(label)), str(value)) 49 | if file_obj: 50 | file_obj.write(info_line + '\n') 51 | print info_line 52 | 53 | def my_name(): 54 | """a simple function that returns the name of the function from which it was called 55 | (it was written by Jerry Kindall and was copy-pasted from an online resource)""" 56 | frame = inspect.currentframe(1) 57 | code = frame.f_code 58 | globs = frame.f_globals 59 | functype = type(lambda: 0) 60 | 61 | funcs = [] 62 | 63 | for func in gc.get_referrers(code): 64 | if type(func) is functype: 65 | if getattr(func, "func_code", None) is code: 66 | if getattr(func, "func_globals", None) is globs: 67 | funcs.append(func) 68 | if len(funcs) > 1: 69 | return None 70 | 71 | return funcs[0].__name__ if funcs else None 72 | 73 | 74 | def concatenate_files(dest_file, file_list): 75 | debug('%s; dest: "%s"' % (my_name(), dest_file)) 76 | dest_file_obj = open(dest_file, 'w') 77 | for chunk_path in file_list: 78 | for line in open(chunk_path): 79 | dest_file_obj.write(line) 80 | 81 | 
return dest_file_obj.close() 82 | 83 | def split_file(ids_file, source_file, filtered_dest_file, survived_dest_file, type = 'fasta'): 84 | """splits reads in input file into two files based on ids_file 85 | 86 | for read_id in input: 87 | if read_id in list_of_ids: 88 | --> filtered_dest_file 89 | else: 90 | --> survived_dest_file 91 | 92 | """ 93 | debug('%s; src: "%s" (%s), filtered_dest: "%s", survived_dest: "%s"'\ 94 | % (my_name(), source_file, type, filtered_dest_file, survived_dest_file)) 95 | 96 | try: 97 | ids_to_filter = set([id.strip() for id in open(ids_file).readlines()]) 98 | except IOError: 99 | raise UtilsError, 'Hit IDs file missing ("%s").' \ 100 | % (ids_file) 101 | 102 | if type == 'fasta': 103 | 104 | STORE = lambda e, f: f.write('>%s\n%s\n' % (e.id, e.seq)) 105 | 106 | input = SequenceSource(source_file) 107 | filtered_output = open(filtered_dest_file, 'w') 108 | survived_output = open(survived_dest_file, 'w') 109 | filtered_count, survived_count = 0, 0 110 | 111 | while input.next(): 112 | if input.pos % 10000 == 0 or input.pos == 1: 113 | sys.stderr.write('\rSplitting FASTA file: ~ %s' % (pp(input.pos))) 114 | sys.stderr.flush() 115 | 116 | if input.id in ids_to_filter: 117 | ids_to_filter.remove(input.id) 118 | STORE(input, filtered_output) 119 | filtered_count += 1 120 | else: 121 | STORE(input, survived_output) 122 | survived_count += 1 123 | 124 | sys.stderr.write('\n') 125 | filtered_output.close() 126 | survived_output.close() 127 | 128 | debug('%s; done. 
of %s total reads, filtered: %s, survived: %s.'\ 129 | % (my_name(), pp(filtered_count + survived_count),\ 130 | pp(filtered_count), pp(survived_count))) 131 | 132 | else: 133 | raise UtilsError, "type '%s' is not implemented" % (type) 134 | 135 | return True 136 | 137 | 138 | def copy_file(source_file, dest_file): 139 | debug('%s; src: "%s", dest: "%s"' % (my_name(), source_file, dest_file)) 140 | try: 141 | return shutil.copyfile(source_file, dest_file) 142 | except IOError, e: 143 | raise UtilsError, "copy failed due to the following reason: '%s' (src: %s, dst: %s)" \ 144 | % (e, source_file, dest_file) 145 | 146 | def run_command(cmdline): 147 | debug('%s; cmd: %s' % (my_name(), cmdline)) 148 | try: 149 | retcode = subprocess.call(cmdline, shell = True) 150 | except OSError, e: 151 | raise UtilsError, "command failed for the following reason: '%s' ('%s')" % (e, cmdline) 152 | if retcode < 0: 153 | raise UtilsError, "command was terminated by signal: %d" % (-retcode) 154 | 155 | def refine_b6(source_file, dest_file, params): 156 | # FIXME: check if source_file is a valid m8 output. 
157 | debug('%s; dest: %s' % (my_name(), dest_file)) 158 | try: 159 | b6 = B6Source(source_file) 160 | except IOError, e: 161 | raise UtilsError, "open failed due to the following reason: '%s' (src: %s)" \ 162 | % (e, source_file) 163 | 164 | try: 165 | output = open(dest_file, 'w') 166 | except IOError, e: 167 | raise UtilsError, "open failed due to the following reason: '%s' (dest: %s)" \ 168 | % (e, dest_file) 169 | 170 | previous_query_id = None 171 | 172 | while b6.next(): 173 | if b6.pos % 10000 == 0 or b6.pos == 1: 174 | sys.stderr.write('\rReading B6: ~ %s' % (pp(b6.pos))) 175 | sys.stderr.flush() 176 | 177 | if params.has_key('unique_hits') and params['unique_hits'] and b6.query_id == previous_query_id: 178 | continue 179 | 180 | if params.has_key('min_alignment_length') and b6.alignment_length < params['min_alignment_length']: 181 | continue 182 | 183 | if params.has_key('min_identity') and b6.identity < params['min_identity']: 184 | continue 185 | 186 | # At this point, this entry must be what we are looking for. 187 | # We shall store it. 
188 | output.write(b6.entry) 189 | previous_query_id = b6.query_id 190 | 191 | sys.stderr.write('\n') 192 | return output.close() 193 | 194 | def store_ids_from_b6_output(source_b6_output, dest_file): 195 | debug('%s; dest: %s' % (my_name(), dest_file)) 196 | try: 197 | b6 = B6Source(source_b6_output) 198 | except IOError, e: 199 | raise UtilsError, "open failed due to the following reason: '%s' (src: %s)" \ 200 | % (e, source_b6_output) 201 | 202 | try: 203 | output = open(dest_file, 'w') 204 | except IOError, e: 205 | raise UtilsError, "open failed due to the following reason: '%s' (dest: %s)" \ 206 | % (e, dest_file) 207 | 208 | while b6.next(): 209 | output.write(b6.query_id + '\n') 210 | # after the loop: close the output so that all IDs are flushed to disk 211 | return output.close() 212 | def get_qstat_info(job_identifier): 213 | try: 214 | proc = subprocess.Popen(['qstat'], stdout=subprocess.PIPE) 215 | except OSError, e: 216 | raise UtilsError, "qstat command failed for the following reason: '%s'" % (e) 217 | 218 | qstat_state_codes = {'pending': ['qw', 'hqw', 'hRwq'], 219 | 'running': ['r', 't', 'Rr', 'Rt'], 220 | 'suspended': ['s', 'ts', 'S', 'tS', 'T', 'tT'], 221 | 'error': ['Eqw', 'Ehqw', 'EhRqw'], 222 | 'deleted': ['dr', 'dt', 'dRr', 'dRt', 'ds', 'dS', 'dT', 'dRs', 'dRS', 'dRT']} 223 | 224 | info_dict = {'pending': 0, 'running': 0, 'suspended': 0, 'error': 0, 'deleted': 0} 225 | line_no = 0 226 | 227 | while True: 228 | line = proc.stdout.readline() 229 | 230 | # skip the first two lines 231 | if line_no < 2: 232 | line_no += 1 233 | continue 234 | 235 | if line != '': 236 | id, priority, name, user, state = line.strip().split()[0:5] 237 | if name == job_identifier: 238 | found = False 239 | for s in qstat_state_codes: 240 | if state in qstat_state_codes[s]: 241 | found = True 242 | info_dict[s] += 1 243 | if not found: 244 | raise UtilsError, "unknown state for qstat: '%s' (known states: '%s')"\ 245 | % (state, ', '.join(info_dict.keys())) 246 | 247 | line_no += 1 248 | else: 249 | break 250 | 251 | return info_dict 252 | 
253 | 254 | def split_fasta_file(input_file_path, dest_dir, prefix = 'part', number_of_sequences_per_file = 20000): 255 | debug('%s; src: %s, dest dir: %s' % (my_name(), input_file_path, dest_dir)) 256 | 257 | input = SequenceSource(input_file_path) 258 | 259 | parts = [] 260 | next_part = 1 261 | part_obj = None 262 | 263 | while input.next(): 264 | if (input.pos - 1) % number_of_sequences_per_file == 0: 265 | sys.stderr.write('\rCreating part: ~ %s' % (pp(next_part))) 266 | sys.stderr.flush() 267 | 268 | if part_obj: 269 | part_obj.close() 270 | file_path = os.path.join(dest_dir, prefix + '-%08d' % next_part) 271 | parts.append(file_path) 272 | next_part += 1 273 | part_obj = open(file_path, 'w') 274 | 275 | part_obj.write('>%s\n' % input.id) 276 | part_obj.write('%s\n' % input.seq) 277 | 278 | if part_obj: 279 | part_obj.close() 280 | 281 | sys.stderr.write('\n') 282 | return parts 283 | 284 | 285 | def check_dir(dir, create=True, clean_dir_content = False): 286 | if os.path.exists(dir): 287 | pass 288 | elif create: 289 | os.makedirs(dir) 290 | else: 291 | return False 292 | 293 | if clean_dir_content: 294 | delete_files_in_dir(dir) 295 | 296 | return True 297 | 298 | 299 | def delete_files_in_dir(dir): 300 | debug('%s; removing content of "%s"' % (my_name(), dir)) 301 | for f in os.listdir(dir): 302 | os.unlink(os.path.join(dir, f)) 303 | 304 | 305 | def info(label, value, mlen = 30, file_obj = None): 306 | info_line = "%s %s: %s" % (label, '.' 
* (mlen - len(label)), str(value)) 307 | if file_obj: 308 | file_obj.write(info_line + '\n') 309 | print info_line 310 | 311 | 312 | def print_config_summary(config): 313 | print('\nSummary of filters and intended input/output destinations:\n') 314 | info('Dataset name', config.dataset_name) 315 | info('Working Directory', config.base_work_dir) 316 | print('\n') 317 | for filter in config.filters: 318 | info(' Filter name', filter.name) 319 | info(' Module', filter.module.__name__) 320 | if len(filter.execution_order): 321 | info(' Special Execution order', filter.execution_order) 322 | info(' Target DB', filter.target_db) 323 | info(' Input File', filter.files['input']) 324 | info(' Filter Output Directory', filter.dirs['output']) 325 | info(' Search Output', filter.files['search_output']) 326 | info(' Inspected Search Output', filter.files['refined_search_output']) 327 | info(' Filtered IDs', filter.files['hit_ids']) 328 | info(' Filtered Input', filter.files['filtered_reads']) 329 | info(' Output to the Next Stage', filter.files['survived_reads']) 330 | print '\n' 331 | 332 | 333 | -------------------------------------------------------------------------------- /samples/sample-filters-config.ini: -------------------------------------------------------------------------------- 1 | [/path/to/a/search/database/human_genome.wdb] 2 | filter_name = Human 3 | module = usearch 4 | execute = clean, init, search, filter 5 | cmdparam.-id = 0.9 6 | cmdparam.-queryalnfract = 0.3 7 | rfnparam.min_alignment_length = 50 8 | rfnparam.min_identity = 90 9 | rfnparam.unique_hits = 1 10 | 11 | [/path/to/another/search/database/reference_SSU.db] 12 | filter_name = rRNA 13 | module = blast 14 | 15 | [/path/to/a/search/database/viral_genomes.wdb] 16 | filter_name = Viral 17 | module = usearch 18 | -------------------------------------------------------------------------------- /samples/sample.b6.txt.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/meren/BLAST-filtering-pipeline/e7ae4f3e76fcdb755fda2c92d70e719ecc7148f9/samples/sample.b6.txt.png -------------------------------------------------------------------------------- /samples/sample.fa: -------------------------------------------------------------------------------- 1 | >D4ZHLFP1.10.1118.2273.1 2 | ACTTAAATTGATACCTTCCCTGATTACACATACCTTTCCTTGCAGCAAGCATCTTTCCTGGCCATGATCACCCCTGTCTCTTATACACATCTCTGAGCGGG 3 | >D4ZHLFP1.10.1165.2281.1 4 | ACATTATGTAGCCAAAAATCGCTATTTTTGTTATTATAGCTGAACTCACCTTTACGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCCGACCG 5 | >D4ZHLFP1.10.1173.2303.1 6 | GATGATAGTTATCTTTTGCCGCGCTTCCTCCCTGCTGATGCCAGCCGAATCCGCCACCAGGTTAATCACTGTCTCTTATACACATCTCTGAGCGGGCTGGC 7 | >D4ZHLFP1.10.1063.2305.1 8 | GCCTAATTCTGTACAGAATGCAGAAAATTTATCTTTAGGCTTAGGTGGCAAAAAAAATTTTGTTTTAGGCAATAGCTGTCTCTTATACACATCTCTGAGCG 9 | >D4ZHLFP1.10.1248.2315.1 10 | GTAGAACTCCTTCAAACGTGCCAGTTTATCTTCAAAGTTTTCCAGTTTTCCGAGATAGACAAGATGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 11 | >D4ZHLFP1.10.1084.2325.1 12 | CAACACTGGGGGCGGATTTATGGTACAACAACTTCCTATTATTGCCAGCCATCAAGTGAATACTATGTATATACCTGTCTCTTATACACATCTCTGAGCGG 13 | >D4ZHLFP1.10.1136.2326.1 14 | GAATATGAAGCCCTGAAGGACAAGGTGGCTCTGCCAAAGGACCTTCCGTCCCATGTACACGACCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 15 | >D4ZHLFP1.10.1242.2339.1 16 | GGCTGGGAGCCGTCTTCGTTTTTCACAGGATTTTATGAGTAACCGGGAAGTATGGCTGGAGGGAGTGTCTACTCTGTCTCTTATACACATCTCTGAGCGGG 17 | >D4ZHLFP1.10.1058.2361.1 18 | GGATTACATTATTGGGAGCAGGTTGCATTATTTTTGGAGTCTATCAGGCAGAGAAAAAGTGGAAGAGTAAAACCTGTCTCTTATACACATCTCTGAGCGGG 19 | >D4ZHLFP1.10.1188.2364.1 20 | ATTCTATTCTGCCGGTATTTATTATTGCAGATACATTTAATAGGGTAGTTTTTGTTTCTCAAGGATATACCTGTCTCTTATACACATCTCTGAGCGGGCTG 21 | >D4ZHLFP1.10.1153.2388.1 22 | GCTCTATTCACAGTCTTCCCCTTCTCGGGTGGAGCAACTCCATTTTTCCAGTTGTTCGGGTACAGTTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 23 | >D4ZHLFP1.10.1167.2390.1 24 | 
GAATACGCATGTAGTTGCAGGTGATGCTGATAATGGCAGGTGGAATGGTGCGTCCGATACCGTAGAATACACCCTGTCTCTTATACACATCTCTGAGCGGG 25 | >D4ZHLFP1.10.1111.2396.1 26 | GAAAAAGGGAACTGTCAGAAGAGGGCACTCTCAAACTAGGAACCCTATTCGCAAAATGAACAGGAATAGAAAAAAGCCTGTCTCTTATACACATCTCTGAG 27 | >D4ZHLFP1.10.1135.2414.1 28 | GTGCGCCACCCAGTAAACCGTGCCGTCGGTGTCCACCGAAGGATAGGAGATTTCGCTCTTGAGGTATACGCCTGTCTCTCATACACATCTCTGAGCGGGCT 29 | >D4ZHLFP1.10.1171.2447.1 30 | GTTGTCACAATGATAAACCAATGGCAGATAGGGACCTGGCTGCCATGCACCAAAGGCTGTCTCTTATACACATCTCTGAACGGGCTGGCAAGGCAGACCGG 31 | >D4ZHLFP1.10.1133.2448.1 32 | AGCTGAGGCAGGAGAATCGCTTGAACCCGGGAGGCGGAGGTTGCAGTGAGCCGAGATCGCATCAGTGCATTCCCTGTCTCTTATACACATCTCTGAGCGGG 33 | >D4ZHLFP1.10.1079.2469.1 34 | GTTCCTACACCATGTCCCTGGGGGAATAGGAATCAGACAGGGCCATACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACGGGCCAATATC 35 | >D4ZHLFP1.10.1167.2469.1 36 | GTACAGGATTATGTCACAAAGTTAATGTAGGCAGATCCTAGACAAGAGTTACATCACTTGGATGATTAGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 37 | >D4ZHLFP1.10.1229.2473.1 38 | ATTCAAAGCAGAGAGATTTATAATCTGGAGCGTTCTTTAAAACATTTCAATATTAACCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 39 | >D4ZHLFP1.10.1138.2480.1 40 | CCTTTTCCACCTTGCGGCTTTTCCAAGAAACGGGCACCTTCCTGAATAGACATACGTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 41 | >D4ZHLFP1.10.1157.2483.1 42 | GGGTACGCGAACGCTCTTCCCACAATTTCTCCGGCTGAGGCTCTTCCTTCCCGTTCACGCAAACCCGTATCGGCTGTTTAGGGAGCCGTCTCTTATACACA 43 | >D4ZHLFP1.10.1213.2488.1 44 | CTATATGCTTTAGTAGTTCTGGAAGCAGGTATAGCACCTGATTATTTTCTGGACAGTATGCAGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 45 | >D4ZHLFP1.10.1124.2489.1 46 | GACATTGGAGGTGCGGGCATCCATCCCTTGCCAAAGCTCGGGTTTCTCAAAATTCCTCCCATCAATATCTGTCTCTTATACACATCTCTGAGCGGGCGGGC 47 | >D4ZHLFP1.10.1267.2264.1 48 | CAACCTACTACAGAGGCAATCTTTAGGGGAATACAGTCAGGAAAAGTATTAGAGCTTTTTGACAAACTACAATATCTGTCTCTTATACACATCTCTGAGCG 49 | >D4ZHLFP1.10.1330.2266.1 50 | ACTTCATACCTGTGAGTATCTGCGCAGGCGGGGCATTTTATACTAGCCTGTTAGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGC 51 | >D4ZHLFP1.10.1300.2276.1 52 | 
GTTCGACATTAAAACCTCCGAAGGATTGGGAAGTGCATTGATCTTTACACTTTACGCTATAACCTGTCTTTTATACACATCTCTGAGCGGGCTGGCAAGGC 53 | >D4ZHLFP1.10.1350.2317.1 54 | CAGCAGATCATCAGTCTCGTGGTCACCGGCGGCATCGGTCCAGGCGAGGCTGGCGACCGGAGTGGTGTTATCCCTGTCTCTTATACACATCTCTGAGCGGG 55 | >D4ZHLFP1.10.1432.2321.1 56 | CCCTTTCTGGGATTATTTCATGAGTTCCTTCACGGGTAAAATAATTTGGGTTCCCATGTATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 57 | >D4ZHLFP1.10.1400.2339.1 58 | GAAGTAAGACGGCTGGGCTTTCTTTGTTTTCTTATTTCCGCATGATGCTCTGCACACAAAGGACTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 59 | >D4ZHLFP1.10.1284.2403.1 60 | GTATATAGTATAAAAGTAGCTCTTTTATAAAAAAATAAATGCCGAAATATTTGGTAGTTCAAATAAAAGCCTGTCTCTTATACACATCTCTGAGCGGGCTG 61 | >D4ZHLFP1.10.1300.2410.1 62 | GAACTGTTTCATGCGATAGTTATAATCGGCAAAAGTCAACGGATGGGAAAGGATTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGC 63 | >D4ZHLFP1.10.1443.2417.1 64 | ATTCTATTAGTTTCCATTCCATTCCATTCCCTTCCATTCCACTGTGGTTGATTCCATTCCTTTCCATTCCCTGTCTCTTATACACATCTCTGAGCGGGCTG 65 | >D4ZHLFP1.10.1382.2423.1 66 | ATACCGACCTGTTGGACAAGGCGAAACTGGAAAAGCGGATCGCATCGCTCGAAGGGGAACGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 67 | >D4ZHLFP1.10.1458.2427.1 68 | GTCTTATAGCAGGTGATATCATGGCAGCTCATGCCCATCTTTGAATCCGGCGCAATCTTATATGTCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 69 | >D4ZHLFP1.10.1480.2433.1 70 | CGTTGGAAGTGATGGTGAGCAAAGACAGAGTAGGACGGGCATTTTTGTAAACCGGAAGTTTCTTCAATTCCTGTCTCTTATACACATCCCTGAGCGGGCTG 71 | >D4ZHLFP1.10.1431.2437.1 72 | GTGTTGAACTGTTCCGAGTTAGTCTTCACTGTTTCCAGGCTAATGAATAGATCACCTGAAAGACGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 73 | >D4ZHLFP1.10.1308.2448.1 74 | GCCCAACGCGATCAGGCCGCAAGGACCGACATCGTGCATGAAGGCCGTTTCGTCAGCGGCTTTGAGGAGCTTTCGCTCAGACTCTTATCCAAGCGTCTGGG 75 | >D4ZHLFP1.10.1358.2449.1 76 | GGCTAAGGATGCGGCGGAGGAATCTTTGAAACGGAACAATGCCTTGATAGATTATATAGCTTTGGATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 77 | >D4ZHLFP1.10.1429.2484.1 78 | TTATATAGGAGGTAAGGGCATTGGCATACAAGGGCTTTCATCCAAAGCCCTGCATAATATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGA 79 | >D4ZHLFP1.10.1678.2284.1 80 | 
GTATTTGCCCCAACTACGAAGTGAACCCGATTGATGTAGCCGGACTGTATAATCTCGAAAAGTCTCTTTTCCTGTCTCTTATACACATCTCTGAGCGGGCT 81 | >D4ZHLFP1.10.1636.2285.1 82 | TTTCCCAACTGATTCCCCATTTGTCCAGAACCTGTTGCGAACCCGGACCGCTATACTGGAACATCAGTACCTGTCTCTTATACACATCTCTGAGCGGGCTG 83 | >D4ZHLFP1.10.1502.2302.1 84 | CCTTGTAACCGTTGTATTCCTTCGGATTGTGAGAAGCAGTCAGAATGATACCGCTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGG 85 | >D4ZHLFP1.10.1713.2309.1 86 | GAAAGATGTTTATTGATTTTAGTATAAAAGACCTCATCGATATCCTGCTGGTAGCCTATCTGCTTTATCCTGTCTCTTATACACATCTCTGAGCGGGCTGG 87 | >D4ZHLFP1.10.1555.2312.1 88 | ATGTTCATTTTCCAGTTCGGAAAGTTGGGCGGCAATGTCCGGCAACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCG 89 | >D4ZHLFP1.10.1656.2312.1 90 | GGACACGGAAAAATCCTCCAGTTCAAGGTCCAGAGGTATGTCTGTCTTAATGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCA 91 | >D4ZHLFP1.10.1676.2321.1 92 | CAGCGGACACTGATTCCACAGAAGTAGATGCAAACTCAAGTAATGTCAGTACTGGTCATGGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 93 | >D4ZHLFP1.10.1705.2324.1 94 | GGCAGCGGCTTCCTGTTCCTTCAGACTTTCGTCTTCTTCAAACATGTTCACCGGTTCGGGCTGGGGCTGAGACTGTCTCTTATACACATCTCTGAGCGGGC 95 | >D4ZHLFP1.10.1543.2351.1 96 | GCTCAGTACCCATGCAGGAAGAAGGACATTCATCTGCAATGCGCTGGCTCTCGGAATCCCGGCACCTGTCTCTTATACACATCTCTGAGCGGCCTGGCAAG 97 | >D4ZHLFP1.10.1587.2354.1 98 | GGTCTGGTTGATGGCATTGTGATTGCCTTCGGCTGTGAACAGGGCAGAAGAGTAACCGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 99 | >D4ZHLFP1.10.1675.2357.1 100 | GTCCGGTTGATTACACGATATCTTCCTTTTGATTTAGTTGTTTCACTTAAGTGATGAAGTTTGAGCACTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 101 | >D4ZHLFP1.10.1562.2368.1 102 | CATGATTCCATAAGCAAAGAATTGCTGACTCTTATAAAATCCTTCAATCTCTTCGATGCTTGCTCCAGCATCAAACCTGTCTCTTATACACATCTCTGAGC 103 | >D4ZHLFP1.10.1526.2386.1 104 | CTTATGCTCCACAACAACTGAAAGGATGGAGTTTCACTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGTATGCC 105 | >D4ZHLFP1.10.1717.2389.1 106 | GAGATACCATCTCATGTCAGTTAAAATGGGAAACAATAAAAAGTCAGGAAACAACAGATGCTGGAGAGGATGTGGACTGTCTCTTATACACACCTCTGAGC 107 | >D4ZHLFP1.10.1628.2390.1 108 | 
GTTCAGCAATATGGAGAAGTGCGCGGCGTCCATTTCAAGCATGAGGCTGATAAGTGCCGCCTCAAAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 109 | >D4ZHLFP1.10.1696.2390.1 110 | CTTCAATGCGCTGTGCGAGCTCTTTTGAGATAAAGATAACATCGCAATCATTTGCGCAACAGGTTGCCACTGTCTCTTATACACATCTCTGAGCGGGCTGG 111 | >D4ZHLFP1.10.1511.2396.1 112 | CAACAGACGCTTCCTGCATCGGATTGAGTTGTTCTATTTTCAACCTTTCGAGTATATCCTTCAAACCATTCATTGCCTGTCTCTTATACACATCTCTGAGC 113 | >D4ZHLFP1.10.1690.2410.1 114 | CTTTTACCAATAAGGCTGCTCGCGAAATGAAAGAGCGTATAGCCCGTCAGGTGGGCGATCAGGCACGCTATTTATGGACTGTCTCTTATACACATCTCTGA 115 | >D4ZHLFP1.10.1720.2419.1 116 | ACTTCGTCAAGTGCCTTAGCATCAGTCTTTAATATCACTTTGATTACCGGTGCAATTGCTACCTCAGCAGCTGTCTCTTATACACATCTCTGAGCGGGCTG 117 | >D4ZHLFP1.10.1620.2447.1 118 | GTACTGGATGCATAACGGGTATCTTTCTCCTTGTCGAATTTAGCAACCAAATCATGAGTAGGCGTACAATAACGTCTCCAGGCTGTCTCTTATACACATCT 119 | >D4ZHLFP1.10.1526.2449.1 120 | GCTATAGCTTGGTGTCGCAGCAGGATATCCAGGCCATCCGACGGTTGGGGGACGCCTACCGCAGGTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 121 | >D4ZHLFP1.10.1716.2470.1 122 | CGTTGCAACTGCTGGAGCAGGTTGACCACCTGCGCCTGAATTGACACGTCCAGCGCCGACACCGGCTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 123 | >D4ZHLFP1.10.1885.2294.1 124 | ATTTTATATTGCGCCAGACGTACCGGTGAACCATCCTTCTTGCGACTGTTGCGTAAAGAGTCAAGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 125 | >D4ZHLFP1.10.1763.2302.1 126 | TCTCAGCAATGGCCAGATCTTTCAGACAGAAAATCAACAAGGAAACTTCAAAGTTAAACTACACACTAGATCAAATCTGTCTCTTATACACATCTCTGAGC 127 | >D4ZHLFP1.10.1980.2313.1 128 | ACGTACAGGTGTGCTGTCAGACTTAGCTCCCTAAAACACACTTGGTTGCTCCCAAATGCGACAAAATTAAGTCCTGTCTCTTATACACATCTCTGAGCGGG 129 | >D4ZHLFP1.10.1891.2325.1 130 | AGTATAAAGAATCCATTTTAGTAATATGAAGAAATTGCTTTGTCCCCAGTGTAAAATAGCCGGTATGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 131 | >D4ZHLFP1.10.1983.2338.1 132 | CCGGTGCTCCCCGCTGTTCCACGGGAACGGCGGCAAGCTCGGCCAACCGTTCCGCCGAGGTGACGGTACGGACTGTCTCTTATACACATCTCTGAGCGGGC 133 | >D4ZHLFP1.10.1963.2355.1 134 | AATCTCTATTCTCCTTTTTATTTATTATGAAAGAAAAATATTGGAAATATTCTCTGATTATTATCATTATCGGACCTGTCTCTTATACACATCTCTGAGCG 135 | >D4ZHLFP1.10.1845.2383.1 136 | 
GTGTGGTAGTGACAGGAACCAATCAGGAGAATGCGGCCTATCCGTCCGGACTCTGTGCCGAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 137 | >D4ZHLFP1.10.1979.2388.1 138 | GAACAGGAGAACTGCCCAACCAAGAATGACGATGATAAGATAAAAAGCAAAAAACATACCCACAGTAGACGCTGTCTCTTATACACATCTCTGAGCGGGCT 139 | >D4ZHLFP1.10.1765.2408.1 140 | GGCCAGAATAATGGCAGGGCGGGGGAGCGATTCAAAGCATTGCATGACCAATCTTTCTACCCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 141 | >D4ZHLFP1.10.1968.2409.1 142 | GCTTAGTGATCCGGTGGTATGAAAGTGGGATTGCCATCGCTCAACGGATAAAAGCTACCCTGGGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 143 | >D4ZHLFP1.10.1889.2417.1 144 | ATCTTAAGGGAATCTGTGGGGATGTTCCAGAAATATTTTGAATGTTGTAACGGACTATCTGAACGGAAAACTTACCTGTCTCTTATACACATCTCTGAGCG 145 | >D4ZHLFP1.10.1820.2422.1 146 | GGGTTATCTATGGCATCTATTTTGGAAGTTTGTGAGAATGGTGCGGATATCATTGATGTCGCTATGGAGCCCTTATCACTGTCTCTTATACACATCTCTGA 147 | >D4ZHLFP1.10.1792.2431.1 148 | CATCCGCAGGGTGGTGGATTTGCCGCAGCCGGAAGGTCCCAACAGAATCAGGCGCTCTCCTTCCCTGATCCTGTCTCTTATACACATCTCTGAGCGGGCTG 149 | >D4ZHLFP1.10.1843.2440.1 150 | CAATAAGGATCGCACAAGTAAGTACCGCCATCCGAAGCACTGTACACAAAGTACTGGAACTGTGTTTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 151 | >D4ZHLFP1.10.1910.2454.1 152 | TTTACATACGGATCATCCACCGAAGGCTCATAATCATAATTCTTCTGCAAGTCCAGGTCACAGCCGGTCCTGTCTCTTATACACATCTCTGAGCGGGCTGG 153 | >D4ZHLFP1.10.1844.2486.1 154 | TTTTTAGCCAGTTCATAGGCATAGCGCATGATGCGCTCTGTCCCAAAGCGGGAGAAGACCCCGTTCTGGATCTGTCTCTTATACACATCTCTGAGCGGGCT 155 | >D4ZHLFP1.10.1922.2492.1 156 | CTGTACTACTACGACAGCGTCGGCATCTTTTCCCCCAGGGAACGCAATGAGAAAAACGGGCGCAGAAAGTATGAGTTCTGTCTCTTATACACATCTCTGAG 157 | >D4ZHLFP1.10.2150.2272.1 158 | ATCATAACCGATGTGAGGCCCGCCAATACCTTCTCCTGCCTTTCCTTTGAAGAAATAGGGGTTATCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 159 | >D4ZHLFP1.10.2102.2323.1 160 | CAAGAACCGTTTAAAACCCAATTTCTCGGCTTCACCGATACGTTGCTCTATTCGATTTACAGGACGTATCCTGTCTCTTATACACATCTCTGAGCGGGCTG 161 | >D4ZHLFP1.10.2136.2331.1 162 | CTTTCTTTCCTTTTTCCCAGCCTCTGGGTTCATAAAACTGCAGAAGCCTTTTATTCGGGGCTCCTTTGGCGGCTGTCTCTTATACACATCTCTGAGCGGGC 163 | >D4ZHLFP1.10.2167.2334.1 164 | 
CTAACGCATATGATGGTGTTCGTCGTGTATCTTCAGAAGAAGCTTTCGAACTTGCACGCGCTATCGGTGGAACCTGTCTCTTATACACATCTCTGAGCGGG 165 | >D4ZHLFP1.10.2023.2353.1 166 | ATAAGGGTTCAGAAAGCATATGAATCTTTTATGCAGGATTATATCGCCAATTATGGCGATGATATTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 167 | >D4ZHLFP1.10.2038.2358.1 168 | AGTTCGGAGTGCAATATCTGCACCATAGCTTCCGTCCGGAAGTCTCCACCTCGAAGATATTCGACAAGACCGGAGAAACCATCGAGCGGGACACGACGTAT 169 | >D4ZHLFP1.10.2016.2375.1 170 | GATTCATACAATGCTTTGTGGATTTCAGATAGCTTCAGGATATTGCTGGCTAACTGTGTACCTAAATAATGCTGTCTCTTATACACATCTCTGAGCGGGCT 171 | >D4ZHLFP1.10.2080.2376.1 172 | GAAGCCATGTGTCCGCCATACCCAACCATCAGACACAGAGAAATGTTCCGCTGGCCGCCAGGGCTGTCTCTTATACACATCTCTGAGTGGGCTGGCAAGGC 173 | >D4ZHLFP1.10.2233.2392.1 174 | ATCCAAGAGGGCCCTGAATGTTCTCCATCCCTTTGGATTTACCCCATTCTTCTACGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCG 175 | >D4ZHLFP1.10.2015.2394.1 176 | TCTCAGGATATTCGGCATTTCCCGTAAATCGCTCCATCATGGGACGCACCGACGAGGTCACCTCGTTGACCACAATGTCTGTCTCTTATACACATCTCTGA 177 | >D4ZHLFP1.10.2137.2400.1 178 | AAAAAGAAGAACGTAATGGATGCCATCCGCAGAAGTTGGGTGCTGCCATCTGTAAGATAGTGGAACGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 179 | >D4ZHLFP1.10.2226.2413.1 180 | GTGTGGTCGTAGACCAGATGACACACATACCTATTGCTTTGTTAGAGGATAGAAACGGAGAAGCACTTGATCTGTCTCTTATACACATCTCTGAGCGGGCT 181 | >D4ZHLFP1.10.2165.2427.1 182 | ACGTTGCCCAGTGTGAAATCCCCGTCCGGGAGAATGTGCCAGACGCGTTGCGAAAGATAATCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 183 | >D4ZHLFP1.10.2123.2430.1 184 | GTGAAAAAGATGAGGCCGGTCAGGCTGCGCTTCATCTCAAAACAGCGCGTTTGTATTTCATGGTAAGGGAATATGGTCTGTCTCTTATACACATCTCTGAG 185 | >D4ZHLFP1.10.2034.2441.1 186 | GTGCTAGACCTTGTAGAACGCGAGATCCATGCAGGCATGAGCACATTGGAAATAGACAAACTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 187 | >D4ZHLFP1.10.2248.2451.1 188 | ATTAAGTACTTGCCCAAAGCTGGAAAACTAAAGTCAAAAATCAAATCCAGATCTGCCTTCCTTGACATGTCATGCCCCTGTCTCTTATACACATCTCTGAG 189 | >D4ZHLFP1.10.2012.2455.1 190 | TCCCAATACTCTTTGAAGCCCCCTTGCGAGTTACCCATAGCATGAGCATATTCACATTGGATCAAAGGTTTGTCTATATTGCCCTGTCTCTTATACACATC 191 | >D4ZHLFP1.10.2189.2456.1 192 | 
ACATCACATTCGATGAAATGTTTTACTCCCGGATTTTCAAGCCCTACGCTGTTCATTACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGAC 193 | >D4ZHLFP1.10.2152.2480.1 194 | GGGTAAAACATAGTAACAAGAAAAGGAAGAACATGTAGATGGTCATTGCTTTTCAGGTGCATTGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 195 | >D4ZHLFP1.10.2218.2485.1 196 | CCTCCAGCGACTTCCCTATATAAGATGAATAAGGAGATACCCAAACGGAGGTCAACTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 197 | >D4ZHLFP1.10.2004.2492.1 198 | CTGGATATCGGAATTGATGGATGGATGGCAGACTTCGGCGAGTACCTTCCGACAGACGATATCGTTCTTCACCTGTCTCTTATACACATCTCTGAGCGGGC 199 | >D4ZHLFP1.10.2267.2306.1 200 | TATGGAACCATTACGAACATCCACCCCGTCTTTAGTAACTATATTAATCACGGCTGTCAAAGCCACACTGCTGTCTCTTATACACATCTCTGAGCGGGCTG 201 | >D4ZHLFP1.10.2408.2316.1 202 | GTGTAGGCCCGGTGAGAAAATCCACCCAGCGAAAAACTTCCATTCTTTCCTGGGCATCCCGGGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 203 | >D4ZHLFP1.10.2398.2336.1 204 | GTCCGAACGAATCTTCAGTTTATGCTTGGAAATGGCAAGAAGGTGATTTTAGTTACTTCCACAGTCAGTGGTGAAGCTGTCTCTTATACACATCTCTGAGC 205 | >D4ZHLFP1.10.2374.2401.1 206 | GGCAGAAGTGGGTGTTTCTGCCTCAATTTATCTGATAATTCAGTAATGAGAGGACGTGTACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 207 | >D4ZHLFP1.10.2411.2404.1 208 | GTTCCGTTGGGAGCAGGGCAAAATGAATATGCCAGTATTTTAGTGTCTGCTGCTCAGTTTAGTCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 209 | >D4ZHLFP1.10.2430.2405.1 210 | GTAGTAAACTGTATCTGTATTCAAGGAAGTAAAGGAAGACAAAGATTTTTAAAGGAAATATGAGAAGGACCTGTCTCTTATACACATCTCTGAGCGGGCTG 211 | >D4ZHLFP1.10.2250.2406.1 212 | GAAAGATCCCGTCGGCGCAGAAGGACAAGAATAGCAAACCAAACAAATTTTGATGATGAGAAAGATTAATGACTGTCTCTTATACACATCTCTGAGCGGGC 213 | >D4ZHLFP1.10.2372.2433.1 214 | CCTTTATATGCAATTTTATTTTCAAGGTCAATGACTGTAAATCCGTCTTTATATCCTTCATTCTTATTAGCACCTTGTTTTATATCATCCCTGTCTCTTAT 215 | >D4ZHLFP1.10.2494.2441.1 216 | CGATAAAGGGGGTGAGAAACCCCCTCGCCGAAAGACTAAGGTTTCCTGATCAACGCTAATCGGATCAGGGTCTGTCTCTTATACACATCTCTGAGCGGGCT 217 | >D4ZHLFP1.10.2331.2447.1 218 | TGTCATATTTAAATTACTGAATACTAAAGACAGGAAAGTATTTTAAAAAATAATAAAGAAAAAGATACAGCATAACCTGTCTCTTATACACATCTCTGAGC 219 | >D4ZHLFP1.10.2359.2450.1 220 | 
AGTGATGACAGGTCAGATAGGAGAACTGCCTTACATACAGAAACGGTCCAGTGGCAGCAGGCTAGTCAAGACCTGTCTCTTATACACATCTCTGAGCGGGC 221 | >D4ZHLFP1.10.2279.2450.1 222 | GTCTCGCTATCCAGTTTTGCAGGAATGGAATTAATGGGATCCCCGGACAGAGCCAGGAGGGAGAGGCCCAAGTCTGTCTCTTATACACATCTCTGAGCGGG 223 | >D4ZHLFP1.10.2320.2481.1 224 | GAAGGGAAAGGAGTTCGCCTGAAAGAGAAACAGCCCAACGGAGAGCAACTTTCTCCGATGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 225 | >D4ZHLFP1.10.2408.2490.1 226 | CCCCTTACCAGAGCTCACTGCGATAATGTTTTTCACATTGGGAAGAACTCGCAAGGTGCCTTGAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 227 | >D4ZHLFP1.10.2366.2493.1 228 | GTATTAGTTAAGCAAGGAGGTATGGAGAGCATTCTAGCTGGCGGAACAAATGGAAAGCATTCTACATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 229 | >D4ZHLFP1.10.2693.2285.1 230 | TATTGCGGGTGGCAGACCATTTCTTACGATATACCCACACCATACCACCGGCAAATAATAAGATCAATAGCTGTCTCTTATACACATCTCTGAGCGGGCTG 231 | >D4ZHLFP1.10.2715.2309.1 232 | GCTTCAAGGAGCCACACAGGGTTTAGAGATGTTCTTACGAGGCAGCGTGCTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAA 233 | >D4ZHLFP1.10.2528.2319.1 234 | GTCCCATCCATGGTCGCACCTCCAGCGGTTGTTGCGGATGACAGTCGTTTCGGTAGCGTCCCGGAATACCATTTGAGGACTGTCTCTTATACACATCTCTG 235 | >D4ZHLFP1.10.2720.2325.1 236 | GCTGTTTACGATGCGAAGACTGAAATGTGGGAACTTGGAGGAATTTATTTTTCCCGTTATTTAGGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 237 | >D4ZHLFP1.10.2562.2351.1 238 | ATGGGGGCCTGTGTTTATAGTTACTTCCAAGGTTCCGTCAGATGTAGCCCGTAATATGCAGCGTTCTACGGCTGTCTCTTATACACATCTCTGAGCGGGCT 239 | >D4ZHLFP1.10.2738.2356.1 240 | CGATAAGCTCTTGTATGAGGTTTCAGCAAAAAAACAGCAGGCTCGTCCTTATAAATACTGTCACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAGGCA 241 | >D4ZHLFP1.10.2681.2401.1 242 | GTCTTGAACTGCTGTTGTCCATGTCGAAAGCCTATGCCGACGACTCTTTGCAGAACATTGCCTGTACGCCTGTCTCTTATACACATCTCTGAGCGGGCTGG 243 | >D4ZHLFP1.10.2593.2425.1 244 | TCACTGGATAAGGAAAGAGGTAATCAGTTGTTCAGAAGCATTATTGAGTCGCTGAAGAACGAAGTGCAGATTCCTGCAGCTGTCTCTTATACACATCTCTG 245 | >D4ZHLFP1.10.2726.2432.1 246 | CCACGATACTGTATTAATAAGGTAGACAAACAGCCATAGCAACAAAACCAAAGGTAAATATACCCCTGCACGCTTCAAATTCTGTCTCTTATACACATCTC 247 | >D4ZHLFP1.10.2505.2461.1 248 | 
CCGGAACAGGAATCATGGTAGGTAGTCCGGGACAAACTCTGCCCCATCTGGTAGAAGACCTGTTGTTCATAGAACCTGTCTCTTATACACATCTCTGAGCG 249 | >D4ZHLFP1.10.2630.2491.1 250 | GCCCTGGACTGCTACCACAAGTTCAACAACGTGAAGAACAGCTTCATTCCTCTGCGAAAAATCAATGACCGGATCTGCCACTGTCTCTTATACACATCTCT 251 | >D4ZHLFP1.10.2531.2497.1 252 | CTTCAATTCTTCGCTGAAGAAATAGACCAGGTCAACACCCAGGTGGTCTACCTTGTCGCTGTCTCTTATACACATCTCCGAGCGGGCTGGCAAGGCAGACC 253 | >D4ZHLFP1.10.2787.2291.1 254 | AAGAGAAACCATCTTAACGGCTTTAAGCGCATGTATGAAGGCAGGTATGAGTACAGGCGTCTGTGCTGTTCCTGTCTCTTATACACATCTCTGAGCGGGCT 255 | >D4ZHLFP1.10.2988.2293.1 256 | ACCTTGGGGTTTAAGTTCCAGCATATGAATTTTGGAGGGACACATGCATTTCAAACTATACTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 257 | >D4ZHLFP1.10.2942.2314.1 258 | GTATATATTCGTCGTAGCCCTGCTTATCATCAACAACAGAATGTTACAGTTGGAGGCGAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 259 | >D4ZHLFP1.10.2986.2325.1 260 | TCTCTTACCAATGTAGCCTACGATTCCACACATATTATCCTTTTTTGTTTTCGCGTGCAAAGATACACAATCTATATGACACCTGTCTCTTATACACATCT 261 | >D4ZHLFP1.10.2930.2334.1 262 | TTATTCATCTGTTCGCTTTCTTTCTTAAACGTTCATGCTCAAAAAAACACGAACGTGAAAGGGACAGTGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 263 | >D4ZHLFP1.10.2998.2336.1 264 | GCTTGGGTAAGGCTGAATTCTCGGATTTAGACTTTATCCTGTCCCAATCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATAT 265 | >D4ZHLFP1.10.2864.2340.1 266 | CTATATAGTAAGATGCTGATGACAAGGAGCAAAATAACTACTAGTAAATAATTATAACCCCAATGAACTTCGTTAATCCCTGTCTCTTATACACATCTCTG 267 | >D4ZHLFP1.10.2917.2354.1 268 | GATGGAAGTGACCCGGTGTGAAATTTCACGGATCAGCTCTTTGCGGTCGGCACTCAACACTTCGTACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 269 | >D4ZHLFP1.10.2888.2355.1 270 | GTGATGAAATGGGATTCAGCCGGAGTAATTTCTACCGCAAGCTCAAGGCGGTAACAGATTTATCCCTGAACTGTCTCTTATACACATCTCTGAGCGGGCTG 271 | >D4ZHLFP1.10.2852.2372.1 272 | AACCACGACGAGCGTTATTAGTGATATTGGTAGACAACGTCATACTCCAAGGTAAAGAGATATTAGTACTTGCACCATAACCAAACTGTCTCTTATACACA 273 | >D4ZHLFP1.10.2905.2391.1 274 | CTATAGAAGATTTGAATAAAGCCAAAGAAGACATTGCAATCAATGTCACTTCCGCTTATTTGCAGGTTTTATTCCTGTCTCTTATACACATCTCTGAGCGG 275 | >D4ZHLFP1.10.2977.2396.1 276 | 
CTTCACAGGAGTTAAGCGACCGATAAAAATAATAGTCGGATTGTCATTTTTGAAATGCTCTTTATAAACCTGTCTCTTATACACATCTCTGAGCGGGCTGG 277 | >D4ZHLFP1.10.2889.2398.1 278 | GAATTATGAAAGCCCGCGTCTGCTTCTCTGCGGCCAGTGTTTCCACCACCTCCGGACAAAGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 279 | >D4ZHLFP1.10.2787.2409.1 280 | CGCCAGTGTTAACTGGATGATCAGTAATGGAATGCTAATCACCGACACCGCAATACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGG 281 | >D4ZHLFP1.10.2887.2424.1 282 | ACTTTACTGTCATATATAAACATATATCCCGACTGTTGGGAGACATTACCTAATAATGAATAAACAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 283 | >D4ZHLFP1.10.2824.2434.1 284 | GATTAGCCCTACCCAACAGGCTTTTCTGAAACATATTCTGCAAAAGCCGGTGGTGAACACACGTCGTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 285 | >D4ZHLFP1.10.2768.2456.1 286 | GGTTTTGGCACTGAATATCGGTTCTAGCGTGTATGGATGGAATCCGCTATACCAGGCTTCGGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 287 | >D4ZHLFP1.10.2941.2487.1 288 | ATTCTATTTTGTTCGTAAAAACTTGAGTATTGCAATGCCTGCCATGCAGGATGATTTGGGAATAACCAAGGCCTGTCTCTTATACACATCTCTGAGCGGGC 289 | >D4ZHLFP1.10.2773.2497.1 290 | GATTTGAGCTTTCAAAAAAATTAGAAATTTTGAAAACTTGAATCCAACCCTGGATTTTTACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGA 291 | >D4ZHLFP1.10.2913.2497.1 292 | CATTACAGCAAAGTAGAGATTGAAGCGAACGAAAAGCGTCGTTGCCCATGGTAGCATACGGTTTTCTTCCTGTCTCTTATACACATCTCTGAGCGGGCTGG 293 | >D4ZHLFP1.10.3063.2279.1 294 | GGTGTTCTGTTTCATATTAGTCGTTTTAGTCGTTTTGCAACAAAGCTTTGGGTTCATGAGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 295 | >D4ZHLFP1.10.3172.2306.1 296 | GGAGAAGCCATGAACGAACTTCTCAACATTCCTTGCGGAACGGTAAGAATAACCGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCAGC 297 | >D4ZHLFP1.10.3158.2312.1 298 | CATATCTCCAGAACCGTTATGAGATTCTGGAACGCATTGGCTCAGGCGGCATGTCCGTGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAAA 299 | >D4ZHLFP1.10.3235.2323.1 300 | CATCATAAGCTTCGGCTGCTTCCTTGAATTTTTCTTCCGCTTCTTTATCCCCCGGATTCTTGTCCGGATGATATTGTACCTGTCTCTTATACACATCTCTG 301 | >D4ZHLFP1.10.3075.2329.1 302 | ACAGCATCCTGCGCAGTTTCTACGCCTCCAATTTCCCCAAGGAGTTGCTAGAACATCACGGAAGAGATCAAGCCTGTCTCTTATACACATCTCTGAGCGGG 303 | >D4ZHLFP1.10.3141.2344.1 304 | 
GCAGAGACCCGGTTGCTTTTATGGTCAATCAACGGCAATTCTATCTTCTTGCTGGTGTTACTGTCCGGAGCTTTGTAACTGTCTCTTATACACATCTCTGA 305 | >D4ZHLFP1.10.3033.2356.1 306 | AGATTCCAATGCCTATTATTTCTGATTGTATTCTAAACTCCCTTGCTCTAGTGTCTTTTCATTGTATCCAAACACTGTCTCTTATACACATCTCTGAGCGG 307 | >D4ZHLFP1.10.3076.2357.1 308 | CCAATCATACATCTCTTTGATTTGAAAAATGATGTTTTTTCTGTGATACCTGTGGCCCCAATAATATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 309 | >D4ZHLFP1.10.3194.2363.1 310 | GCATTATTAGGTGCATAAAAACGAAAAAAGAAATCTTTGACCTCTTCCAATGTAGCATTGGCTATGTGAGACCTGTCTCTTATACACATCTCTGAGCGGGC 311 | >D4ZHLFP1.10.3006.2374.1 312 | GAGAATTACCGTCACGTTATGACAAAAGCCATCTGTGCAGCCATCCGTTCACAGATATCACTCTATGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 313 | >D4ZHLFP1.10.3142.2424.1 314 | ATGACATTCCGTATCCAGGTAGACGTTCTCGACTGTCTGGGTTGCGGTAACTGTGCTGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGAC 315 | >D4ZHLFP1.10.3017.2432.1 316 | GTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCAGGGCGTGGTGGCGGGCTCCTGTAGTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 317 | >D4ZHLFP1.10.3118.2439.1 318 | CCGATACCGTTGGGTTCTGTGATTTGCTTGATGACACCCGGGCTGGGCATGAAGTTCATTTCCGTATCTCTGTCTCTTATACACATCTCTGAGCGGGCTGG 319 | >D4ZHLFP1.10.3048.2441.1 320 | GCCCTGATGAGTTCCTGTTAAATGCAGGACAGGACTTCAGGCAGCGCGGGATTGGGGAGATATATAAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 321 | >D4ZHLFP1.10.3235.2442.1 322 | GTCGTACATCTTTTCGCTGTCTATCTTGCTGTCCAGTTTGTCCGTATCGAACGGAGTGAAGCGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 323 | >D4ZHLFP1.10.3165.2448.1 324 | GAAATACCTCTCTTTCTTATAATGCACAAAGTCACATGAACATGATTGTATAAAAAAGCTCACATTTTATATCTGTCTCTTATACACATCTCTGAGCGGGC 325 | >D4ZHLFP1.10.3122.2457.1 326 | ATTCTTTAGTCTTTCAAAGGCGTATGACCTTTACAACTATTTGTTAATGCTGATGATAGCATTGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 327 | >D4ZHLFP1.10.3059.2469.1 328 | GATTATTGCTAATGATATCATGGTGCCTGAGCACCTGCAGTCCTTGTCTTGCTCTTATAGGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 329 | >D4ZHLFP1.10.3162.2497.1 330 | GTTCCAGAGAAAACTGCCCTTCGAGGATGCGGGCCGTTCCCGTGGCCGGGATATAACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGG 331 | >D4ZHLFP1.10.3386.2270.1 332 | 
CTTCCAGGAAGTCCTCTCCCTGACCGAAAGCCCAGTCCCCCAGCGGACAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATA 333 | >D4ZHLFP1.10.3288.2278.1 334 | GGACACATTCAATTAAAAAATCTATTTACACCATTTATTGTTCGGAATATAATATCCAAATGTCTATCAATGTCTCTTATACACATCTGTGAGCGGGCTGG 335 | >D4ZHLFP1.10.3329.2291.1 336 | GAATTAGAACCGATTATGAGACATGGGGAACACCCCAAGATTCTTCCATTCAAAAATACAGAAGAAGAGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 337 | >D4ZHLFP1.10.3300.2293.1 338 | ATGCTCCGCTTTCTACTTTCTCTATTGATGTAGACGCCGCTTCTTATAGTAATATGCGTCGTTTTATCAATAAAGGTGAACTGCCTCCGGTTGATGCCTGT 339 | >D4ZHLFP1.10.3490.2293.1 340 | ATATTACACCGGACGAAATTGTGGTGGATTCAGGCGTGGTTCCCGCGCTGTCCCATCCCCTGTCACTGATCTGTCTCTTATACACATCCTGAGCGGGCTGG 341 | >D4ZHLFP1.10.3393.2295.1 342 | ACCGCCGGCTTTATTTTTGTCCATTTTTCAAAAACAAGGGCTGCTTGGTAAGCCAGATTGCGTTCATCACAAGGCAAGGCCTGTCTCTTATACACATCTCT 343 | >D4ZHLFP1.10.3318.2308.1 344 | CGGGGGGGGAATATGTTCTGGACTTTCCCTTTGAGGTAAACGGCTTTGAGTATCAGATACAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 345 | >D4ZHLFP1.10.3494.2309.1 346 | CTTTCGGGAGGGTGCTTGCGCGGATCATCGGGCGGAAGCTGTGTATGGAAAACCGGACAGTATGGGCAGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 347 | >D4ZHLFP1.10.3340.2322.1 348 | CTGTACGGCCTGTTAAGAATCATGCAGGCCCGGAAAAACGAAGATTATAAGGAGTTCCTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGA 349 | >D4ZHLFP1.10.3387.2342.1 350 | CATTCCACTCCACTTCAACCCACTCTGTTCCGTTCAATTCCATTCTTTCCCATTCCAGTCCATTGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 351 | >D4ZHLFP1.10.3436.2344.1 352 | GTCCGGTGCTGCAGTTTAGCGACCGTCTGCGCGGCACGCTCATCGCCAATGAGCCGCTGGCGCGAGAGGATTTTATTATCCACCGTCGCGACGGGCTGTTC 353 | >D4ZHLFP1.10.3291.2357.1 354 | CTATCTACGGTGAGGTGAGAAGTCATTGGAAAAAAGAGGACGGTTCGTTCTGTTGGGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 355 | >D4ZHLFP1.10.3369.2411.1 356 | CATCGGAACCGTCCACTGCCCGCAATCCAACAGGCACAATACATCGCAACCTCTTATCAACCTATGATAATCAGCCTGTCTCTCATACACATCTCTGAGCG 357 | >D4ZHLFP1.10.3413.2433.1 358 | GTGTATATCTTCCGTTCCTTTTCCAAATGTTCGATGTCCGCCGCTTGTTGTTTGGAGCAAGTTGTCAGTATATTAACTGTCTCTTATACACATCCCTGAGC 359 | >D4ZHLFP1.10.3332.2443.1 360 | 
GTATAGGTCGGATCCAGTTCGGCAATGTGCAACGTACTATTCTCTTCGGAGGAGTAGATATGGTAAGCTTTACCGCTGTCTCTTATACACATCTCTGAGCG 361 | >D4ZHLFP1.10.3279.2450.1 362 | GAAAAGATCCGTCTGACCATTCCACATCATGGAATGTATTGATACCAAAATGAATGAACATACCATACCCTGTCTCTTATACACATCTCTGAGCGGGCTGG 363 | >D4ZHLFP1.10.3489.2452.1 364 | TCCACAATCACATCAAGGATATAATACAAATCATCCTCCGTGAATTTATCTTTCAAATCCTGAGGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 365 | >D4ZHLFP1.10.3656.2265.1 366 | TTACCAATCTTAGTCAATCCGACCAATACACCCGTACCGTCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGTAT 367 | >D4ZHLFP1.10.3545.2279.1 368 | GTGCACACATATCGGCAGCAATTTCCGGCTTCCCGGCTTTACAGGCATAATAAGCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGG 369 | >D4ZHLFP1.10.3605.2290.1 370 | CATCAAAACTCTTTTTCTCTTTTGAAATGCCCAGCTTTGTCTTATCGGCATTGTAAAAGAAGAGTAATGCACGCAACGCCGTACTTTTCCTGTCTCTTATA 371 | >D4ZHLFP1.10.3530.2319.1 372 | TGGGAGTACCGCACGTTTCGCGCCGATCTTCTTCCGGAAGTAAATTTCAGCGGTACATTGCCCAGTTACAGCAAGCAGTATAACCTGTCTCTTATACACAT 373 | >D4ZHLFP1.10.3743.2320.1 374 | ATAGGAAAGCGCCTGCTGTTCAAGCTGGCGGAGCTGGCTATCAAGGAGGGCAGCTTTCAGGAGGCAGAGGATTACTACCTGTCTCTTATACACATCTCTGA 375 | >D4ZHLFP1.10.3592.2366.1 376 | AACATGCATATTTGCAAAATAATATTTTGGTGTTACAATCTTACGTAATGTAATAAGTGATTGTATTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 377 | >D4ZHLFP1.10.3671.2367.1 378 | AGGCTTCCGTAACGGTCATGTTAAGGACATCAGAAATATTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGTATG 379 | >D4ZHLFP1.10.3723.2371.1 380 | ATACAGCTCCACCTTTCATCACAACTGCTGCATAAAGCGGCATTCCGACAAAAATAACAGCAGCAATAAATCTGTCTCTTATACACATCTCTGAGCGGGCT 381 | >D4ZHLFP1.10.3517.2382.1 382 | GGGATGGCTTTCATTTTCTTGCCATCGGCAAGGATGATGACGCCCCCCTTTCCCAGACCCACATGGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 383 | >D4ZHLFP1.10.3734.2385.1 384 | ACACTTTGCCATTAATTAAGAATAATCAATCATGTATTGCATTTCCTTACTTTCATACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 385 | >D4ZHLFP1.10.3502.2386.1 386 | GCGTTGTTCGGAAGCCGTTTTGCTATGCTTGTTAAACAGATTTTCCAAAGAGAAAGTATGCATACCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 387 | >D4ZHLFP1.10.3614.2390.1 388 | 
GTCGTGAGAAAGTGTCGGGAGTACCACGTTGATAATGTTGATATCGAGTACCGACATGCATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 389 | >D4ZHLFP1.10.3565.2416.1 390 | ATATAGAGGTTTGCGTGCAGGCACTTTAAAGAAAGCAGAACCATCGGGCTGTACTTCCGTCACACCCAACACCTGTCTCTTATACACATCTCTGAGCGGGC 391 | >D4ZHLFP1.10.3744.2417.1 392 | GTTCTGCACAATATGCAGACACCACCTGAAAATGGACAGCTGCAGCACTACAGCCCCTTTTTAGGATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 393 | >D4ZHLFP1.10.3509.2429.1 394 | TTGTAACATTCGACTATTTCCAGCGTAAATCTAAAAACATGGTAGGCCCCGCCCCCGAACTACCCAATCTCTTAGGCCTGTCTCTTATACACATCTCTGAG 395 | >D4ZHLFP1.10.3637.2429.1 396 | GACCCAACCCTTCCCTCCTGGGAACCTGTCTGCCTCCTGCCACCATCACAAACACCCACAAAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 397 | >D4ZHLFP1.10.3587.2443.1 398 | ACCATGAGCCCCGTACTGAATACGATGCAGGGTGACGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGTATGCCG 399 | >D4ZHLFP1.10.3674.2475.1 400 | ACCTCTAAGTGACAGAGCCAGGATTTGAACCTAGATCTGCCAGGCCCCCAAGATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCC 401 | >D4ZHLFP1.10.3578.2476.1 402 | GATCAAATCATACAGAATGACCTACCAGGGCAGATCCGCATACTCAGCCATCTGCATAAACAGCCTAAACTGCTGTCTCTTATACACATCTCTGAGCGGGC 403 | >D4ZHLFP1.10.3625.2477.1 404 | CCTAGGTCTAAGCAATCCACCTGCCTTGGCCTCCCGAAGTGCTGGGATTACAGGGGTGAGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 405 | >D4ZHLFP1.10.3624.2497.1 406 | CCTTTGTGGATACCACCATCAAGGCTTTGAGGGAGAAGATAGGTAACGGCAGGGTGCTGTGTGCTCTGTCTGTCTCTTATACACATCTCTGAGCGGGCTGG 407 | >D4ZHLFP1.10.3903.2266.1 408 | GGGATATGTGATTGATCAAAAGCGAATGGAGAACGGTTCTTTCATTGGAGAAGACTATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 409 | >D4ZHLFP1.10.3839.2268.1 410 | GGCCCAATCTCGGCTCACTGCAAGCTCCGCCTCCTGGGTTCATGCCATTCTCCTGCCTCAGCCTCCCGAGTAACTGCTGTCTCTTATACACATCTCTGAGC 411 | >D4ZHLFP1.10.3958.2272.1 412 | CAATGAAATAGATAACATCCCGGATGGCATCGAGCCCTTCCTGAAGGGTGTAGTTTATTTCATGATGAACCGTACCTGTCTCTTATACACATCTCTGAGCG 413 | >D4ZHLFP1.10.3883.2282.1 414 | ATCCAAAGAGATAGGCGCCACATATTCAGGGTGATAAGGGACGTTGTCAGGAGAATAATCAGCCGGACCATTCTTGGTGTCTGTCTCTTATACACATCTCT 415 | >D4ZHLFP1.10.3899.2310.1 416 | 
CTCTCCATATACGTCGAATATAAATGTAGCTTTTGGTTTCTTCCCTACGGGAGTAGCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCGGACC 417 | >D4ZHLFP1.10.3824.2332.1 418 | CTCCAGCAACTCTTTACTAAACGAAGAGCAGTTAACTGCCACAAAGTTTTGCCGGGAACGCTTACTACTATAACTGTCTCTTATACACATCTCTGAGCGGG 419 | >D4ZHLFP1.10.3999.2336.1 420 | TGGGTACGTAATGCACGTACTGCTAACGCTGATCTACGTATTGCCATCAAGGCAGACGCTACGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 421 | >D4ZHLFP1.10.3969.2342.1 422 | TCTGTCATCTTCGTCTATTTTACCATCACCATTTAAATCTCTGTATTTGATATCACCGACATTAGGACGAGTGCCATACCTGTCTCTTATACACATCTCTG 423 | >D4ZHLFP1.10.3934.2357.1 424 | GCATATCAATCTCCGAGACATTCTCCATTTCATCTGCCGATAAAATAGCATAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCA 425 | >D4ZHLFP1.10.3890.2363.1 426 | CAATAAATAGTGTGGGATACCAGAGCTCGGGGCATTCTCAGCCTCCATACTAGCGTTGGCCCCCTGGTCCTCTGTCTCTTATACACATCTCTGAGCGGGCT 427 | >D4ZHLFP1.10.3803.2375.1 428 | CTTTCAAGTCGCTTAGTATGACTTTGTTTTCTGCCGGAATGATTTCTTTGACAAAGACAGCTCGGTTAAGATCCTGTCTCTTATACACATCTCTGAGCGGG 429 | >D4ZHLFP1.10.3786.2378.1 430 | ATGTAATATTGTAAAGCTGGAATCCTGAGTGGTTGCTAACGATGCCCTAGGTATATTAATGGAATCAAGGTTGTCTCTTATACACATCTCTGAGCGGGCTG 431 | >D4ZHLFP1.10.3757.2381.1 432 | ATATTCAACAATTGGACGACTATGTATGCGAACACACCTTTGAAGATCTACGTGTGACACTGATCAACAACTGTCTCTTATACACATCTCTGAGCGGGCTG 433 | >D4ZHLFP1.10.3884.2381.1 434 | CGTATGGGGTAGCCAAACAATATGGTTTCTGGATTACTAAAAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGT 435 | >D4ZHLFP1.10.3959.2390.1 436 | GACGACCACCGGCGCGACCGGTTTTTTCACCGCCTTACCCACTTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGAAGACCGGCCAATATCTCG 437 | >D4ZHLFP1.10.3818.2396.1 438 | CCCCAGGCCTGCAGTCTGGGCCAAGGTCACTGGCCCTGGATCAATCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTC 439 | >D4ZHLFP1.10.3975.2407.1 440 | ATCTAGCTTAACGTTCACTAACTTTTTAGCTTAAAGTTCACTCTTTTTAGGAAAGAGGGCAAGAAGAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 441 | >D4ZHLFP1.10.3844.2408.1 442 | GTTCCGTATCGCCAAACTTGATGACAGATGACGAATTCTTGCCAAATCCATATAATTCAAAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 443 | >D4ZHLFP1.10.3804.2414.1 444 | 
GGTATGGGGTATATCCGACATAAAAGACCATGAACAGTTTGGCAAACTAATGGCCAAACATTCCAATGACTGTCTCTTATACACATCTCTGAGCGGGCTGG 445 | >D4ZHLFP1.10.3849.2425.1 446 | TTTCTGTGCAAATGAAGAAACAACTTTAGCCACAGAAGGAATTCATTTTTGGAGGAATATGAACCATTCTTATTGCTGTCTCTTATACACATCTTGAGCGG 447 | >D4ZHLFP1.10.3832.2429.1 448 | GGGTTACAGTATAATCTCCTTTGGAAGTCACGACTCCACTAATAATGCCAGTCTTGGAATCCAGTTCCTGTCTCTTATAACACTCTGAGCGGGCTGGCAAG 449 | >D4ZHLFP1.10.3984.2440.1 450 | GCGGGTATACCATCCAGGTAGGCACATACGTGTCTTTGTCCCCGTGAATAAACAGCATCGGAAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 451 | >D4ZHLFP1.10.3760.2462.1 452 | ATGTTGCCTTTATGCTTTGCGCCGAAAAGGGAACGGAAGGGGAGAACCGCTATGGTTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGC 453 | >D4ZHLFP1.10.3816.2478.1 454 | GGTAGGGACGAGATGCTTCTCTTATTCAAGTTACTCCCTTGCTTTATGAATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAAT 455 | >D4ZHLFP1.10.3918.2481.1 456 | ACCTGGGTGTCCGGGAGTATCCATCCGTAGATAACCCTATTATTTCAGTATCCTGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGG 457 | >D4ZHLFP1.10.3767.2488.1 458 | GTTAACAGCCGCCTATCTTTCGTATGTGTGTCAATTGAAATTCGGATTCTTGCCTCCTGAGCGCAGATGGAATGCTGTCTCTTATACACATCTCTGAGCGG 459 | >D4ZHLFP1.10.4017.2275.1 460 | TCATTGAATGGACTCGAATGGAATAATCGAATGGTCTTGTGTGGAATCATCATCATATGGAATCGAATGGAATCATCAAATGCTGTCTCTTATACACATCT 461 | >D4ZHLFP1.10.4247.2297.1 462 | ATACAGATATGCAGCTATATTCTCTTTTCACAGATGAAGAGTGTTATGATCTATGGAGTTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCGG 463 | >D4ZHLFP1.10.4009.2302.1 464 | ATTGTAGCTTCTCTCCACATCAAGTCCGATGAGTTCATTCATTCTCTTTTGTATCTTTATGTACAGCTGTAATTCCTGTCTCTTATACACATCTCTGAGCG 465 | >D4ZHLFP1.10.4132.2311.1 466 | CGTTGGAAGCGTGCCGTAAGTACTGTGAATGGGGTACTGGGTGAAGCGGTAGGTCAGATGTATGTGGAGAAATATTTCTGTCTCTTATACACATCTCTGAG 467 | >D4ZHLFP1.10.4108.2311.1 468 | TTCCCTAGGAGCACTTGATTTGTTGTTTATTCAGCCAGTTTCTTGTAATAGGCGTAGCGTTCCATAGCTTCCTTCCTGTCTCTTATACACATCTCTGAGCG 469 | >D4ZHLFP1.10.4108.2337.1 470 | AATATCCCCTATTCATCGCTAATTATAACATTCATTCCCCTCTCTTATCAAAATTACGTTGAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 471 | >D4ZHLFP1.10.4016.2345.1 472 | 
TGAGAATCCTGATGCCAAATATCCTCGTTTGACTTGGCAGAATAACAGTAACAATAACCGTGAACCTGTCTCTTATACACATCTCTGAGCGGGCTGTCAAG 473 | >D4ZHLFP1.10.4091.2352.1 474 | AAGTATCATACCTGATATCATACAAAAACCTGGGAGGGATCACTTACCTGAGTTATATCCGGAACAATATCTCTGTCTCTTATACACATCTCTGAGCGGGC 475 | >D4ZHLFP1.10.4036.2372.1 476 | GTTGAGGATGGTTCCTACACGTTCCATATCTGTCTTAGCCAGTTTCCACGGTTCGGTGTCGGCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 477 | >D4ZHLFP1.10.4025.2393.1 478 | GACCTCGGCCGTGCCGGTCGCCAGCGCGGCCGCGCCGGGCGAAGGCGATGAAGTGGTCACCCGGGTGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 479 | >D4ZHLFP1.10.4159.2433.1 480 | CTTTTATAGAAAAAGAAAAAGATCCAGAAAAATTATTTTCAGGTTATTTAGCATTGGGAAATCTGTATTTACTGTCTCTTATACACATCTCTGAGCGGGCT 481 | >D4ZHLFP1.10.4127.2436.1 482 | GATTCAAGATGATTTGCCTGTCTTCAGTGGCAGCGCTGATGCTTTTATGCTGGGATGGGTGGCCTCAAGGGTCTGTCTCTTATACACATCTCTGAGCGGGC 483 | >D4ZHLFP1.10.4140.2442.1 484 | GCCTTGTTTCTTCCAGATTCTTATTTGATATTGATTTGAAAACTCACTAACCTGACCGATAGTCTATCTTAAGGCTGTCTCTTATACACATCTCTGAGCGG 485 | >D4ZHLFP1.10.4169.2452.1 486 | CCGTCGGATAGCATTGCTTTGGGGGAGCTTTCTTTTTATACGGAAGAAGGACGGATCGGGAACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 487 | >D4ZHLFP1.10.4114.2485.1 488 | CCATTATACATTGATATTTGTTTTGGGTGTGATACAGGTGATTGTTATTTATCCAGACCTTTCCATACAATGTTCTCTGTCTCTTATACACATCTCTGAGC 489 | >D4ZHLFP1.10.4025.2494.1 490 | ATAAGATGGAAGTCCACAATAACTATTCCGAGCCCTGGAAGGAGACATTGGTGGATACCGGACCCGTCTCTTATCCACATCTCTGAGCGGCCTGGCAAGGC 491 | >D4ZHLFP1.10.4146.2497.1 492 | CTATCGGTTAACGTAGAAAGAGCTCTATCAATTTCCCTCGCAAGAGACTCATTAACCAGAGAACGGTCTGCCATGGGCCTGTCTCTTATACACATCTCTGA 493 | >D4ZHLFP1.10.4289.2266.1 494 | CATCCAAACGGAGAGCCAGGTTATTGGGAATCTGACCGCCGGTAGATACAATGACTCCGTGAGGGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 495 | >D4ZHLFP1.10.4371.2267.1 496 | GGTCTTGGGTATGAAGCATTTCAAAAGTCTCATCCAAAAAATCAAAAACCCCATAGTCCGAAAACAACAACAAGGTTCTGTCTCTTATACACATCTCTGAG 497 | >D4ZHLFP1.10.4268.2282.1 498 | GTCAATAGTTGCCTGTTTAGGGGTAGACGTACAGGCAGTGAAAACAAATAAAAGAATAAATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 499 | >D4ZHLFP1.10.4284.2289.1 500 | 
CCCTTAACTTACATCCGGCGGATGGAGTAGCTTCTTATGAAGAAAAATATCCTGGATTAGCAAAAGATATGGGCTGTCTCTTATACACATCTCTGAGCGGG 501 | >D4ZHLFP1.10.4351.2292.1 502 | GTCTTTTCATCGTACCCACGTCTTGAAACTTGTTATTAAAGTAATATCCGGTAAGTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCG 503 | >D4ZHLFP1.10.4412.2300.1 504 | CTACTGTCCATGACCGGTCGTCGAAATTTCCGGTTTCAGCTCCGGGGCAATTTCCTTTATAAAAACGCCATGCAGCTGTCTCTTATACACATCTCTGAGCG 505 | >D4ZHLFP1.10.4433.2308.1 506 | TCTACCTGTACCACTCCGGCAGATGCTGCATTGTAGCTGTGAGAAGCAAGGCAAGGTTGAAGTCCCATACCCCTGTCTCTTATACACATCTCTGAGCGGGC 507 | >D4ZHLFP1.10.4262.2348.1 508 | GTATGCCGGTAGGTAGTGACTGGCGTACAGCGGTTTTAAATTCCGAACATTTAGTTTCCGTTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 509 | >D4ZHLFP1.10.4498.2366.1 510 | CGTCTATCTCTTCGATGAATTCTACTTGCCCCCGACACATCGCGCTTTCCGAACGCATACTGTAACTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 511 | >D4ZHLFP1.10.4261.2368.1 512 | GGGTATCCGTCAGAATGAGCATCCGGTCGCTATCCGTGTACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGTATG 513 | >D4ZHLFP1.10.4456.2391.1 514 | GCAGAGATTATTCTTTTAACTTCCTGAATCTGCCATTTGAAAATCTTCAATTGATAACATTTTGAATTTATTAACCTGTCTCTTATACACATCTCTGAGCG 515 | >D4ZHLFP1.10.4415.2400.1 516 | GTAGATACTAATTGAAAAGTGCGCCAATATTCCAGTTGAAAATTGCGCCACTATAGGATAAGTATAATGACCCTGTCTCTTATACACATCTCTGAGCGGGC 517 | >D4ZHLFP1.10.4498.2403.1 518 | ATCCAGACCTATGTGATGGAATACAATGAGGAGATGGTGCGTGAGGCAATCAACCGTGAGCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 519 | >D4ZHLFP1.10.4499.2427.1 520 | GTGAAGATGACAACGACGAAGAAAAAGTCGGAGATGACAAGATCAACACAACTTATATGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGAC 521 | >D4ZHLFP1.10.4356.2448.1 522 | GTAATACCGGGACGACAAACATGGTCATTGATTGTATCAGCATTCCTCCAAACGTGGGTATAGCCATCGGTACCTGTCTCTTATACACATCTCTGAGCGGG 523 | >D4ZHLFP1.10.4451.2454.1 524 | CGGTTATCCAACCGCTGGCTACCCGAACTAGATACGTTAGGCTTATATCTGTCAGGAAAAAAAGAACTACAAGAACCTGTCTCTTATACACATCTCTGAGC 525 | >D4ZHLFP1.10.4264.2476.1 526 | ACTATCAACATCGTATGTCAACGTCAATTGGGACGTGACGCCACTCAAGAAGAAATAGAGAGTATTTATCCTGTCTCTTATACACATCTCTGAGCGGGTTG 527 | >D4ZHLFP1.10.4456.2500.1 528 | 
ATGTAGTCACTTCCCTTTATTTGAATTACAGGGGCACACTTTTAGGAATTTAATTTGGGAAGGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 529 | >D4ZHLFP1.10.4677.2275.1 530 | ATCTTATTTTGCAGAGGAAGCACAACGATGGGAAGAAAAAGCAGAGTGTTTTTCGGCTATGCCTATGGCTGTCTCTTATACACACTCTGAGCGGGCTGGCA 531 | >D4ZHLFP1.10.4695.2280.1 532 | CACTATGGCAGTCTGCGTATCGGAGCTTCAGCCGGAAGCGACACGAATAAATTCCCGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCGGACCGGCC 533 | >D4ZHLFP1.10.4599.2280.1 534 | GAGTAAAGAGTAAAAGCGACGGCTATGTACGAAAGGACTTCCCGGGCGGATGGGCGGACGATAAAGAGAACTGTCTCTTATACACATCTCTGAGCGGGCTG 535 | >D4ZHLFP1.10.4549.2296.1 536 | CCTTTTGCCGTGACAGCTCCCGGTTCAGGCGATTCTTGGAACGAAGCGATTTATATCTGTCTCTTATACACATCTCTGAGCGGGGTGGCAAGGCAGACCGG 537 | >D4ZHLFP1.10.4588.2305.1 538 | GTCTGAGGCATGACGCCTTCATCCAGAGCAACCACCAGAAGCACCAGGTCGATGCCCCCAATACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 539 | >D4ZHLFP1.10.4639.2320.1 540 | TATTAATAATGGTGATCAATCTACATTTGCCATAACTAATCCTGAAGCAACAGCGTGGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGAC 541 | >D4ZHLFP1.10.4562.2321.1 542 | GTGCAAACAAGGCCGGATACTTCTCTTTTCCTGATATCAGCAACTGTCTGCGCTCCGGATGAATTCGCTCTGTCTCTTATACACATCTCTGAGCGGGCTGG 543 | >D4ZHLFP1.10.4706.2334.1 544 | GGCATATACTCAATTAGCTTCTGAAAACGTTACTTGGGAAATTGCAACAAAGCACGATATAGGTGTAGACCCTGTCTCTTATACACATCTCTGAGCGGGCT 545 | >D4ZHLFP1.10.4720.2347.1 546 | GACCTGTACTACGGGCAGGTGGGCAACACGAACCTGAAACCGGAAAGCACCACACAATACAACCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 547 | >D4ZHLFP1.10.4704.2361.1 548 | CCTTTGTACCTTAATTAAATGCGTCATTTTAAGCAAGAAAAAACATGATTCGAAAAATAATAAAGGCTCTTTGGCTGTCTCTTATACACATCTCTGAGCGG 549 | >D4ZHLFP1.10.4594.2393.1 550 | AGCCTTCCTGTTTCCACCAGGCAGCTCCTAGCTATGCTCCTTGACCCATTGCTGCTTTCACATCTTCCCCAGCCTCTGTCTCTTATACACATCTCTGAGCG 551 | >D4ZHLFP1.10.4578.2434.1 552 | CGCCAGACGAGTGATTTTAGCGGTGGATGGCGGATGCGCATCGAGCTCGCTAAGTTGCTGTTACAGAAACCGGACGTACTCCTGTCTCTTATACACATCTC 553 | >D4ZHLFP1.10.4685.2437.1 554 | CACTATACGCACCCCAGCCGTCCAACGCAATCAATACTACATGTTTTGCCTGCCGTTTAGCCGCCCAAGCCCTGTCTCTTATACACATCTCTGAGCGGGCT 555 | >D4ZHLFP1.10.4626.2458.1 556 | 
GTCCAGTGTGGAGGACAGCGAGTCCGGCAGTACGATTGGCTGTGCAGGCAGGCTCCAGGCAGGTTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 557 | >D4ZHLFP1.10.4671.2473.1 558 | ATCTTTTTGTTACCAGCGTATCACCTTCGGGAAAAGGCAGAATCCGCACGCACCCCGCCGACATAAACTCGACCCTGTCTCTTATACACATCTCTGAGCGG 559 | >D4ZHLFP1.10.4571.2481.1 560 | GTTCAGAAATTGGAAGGTGTAAATCACGAATACAAGGATTTGCTGTGAAAAGGTGTTGAAACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 561 | >D4ZHLFP1.10.4522.2491.1 562 | GCTCAGGAGTCAGGCAGCGCACGGGCAGGCCCCGCAGCGTCAGGTAGAGGCGGCAGGTGGCTTCCAGCTCCTCCGCCGCTGTCTCTTATACACATCTCTGA 563 | >D4ZHLFP1.10.4557.2495.1 564 | GAGTCAGCTATGGCGTGTCCTGTTCCGCGGATAAACGAACCACCTACGGACAACCATTTCAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 565 | >D4ZHLFP1.10.4815.2263.1 566 | ACATTATTCCCGACTTCCGGTCCTAATTCGATAAAACGTTTCTCTACCGCATTATCTTTGCGCATGGTATAAATACTGTCTCTTATACACATCTCTGAGCG 567 | >D4ZHLFP1.10.4853.2302.1 568 | TATGAAAAGACTCCGGTTTTCCGGTTATAATAATTTAATCCGCCGTCGTTCGTGCCAATCCAGAGATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 569 | >D4ZHLFP1.10.4877.2307.1 570 | GAAGGGGATCCAGCTTCATTCTTTTGCATATGAACATCCAGTTGTCCCAGCATCATTTGTCAAAAAGACCCTGTCTCTTATACACATCTCTGAGCGGGCTG 571 | >D4ZHLFP1.10.4991.2317.1 572 | GAGTAAATATGCGTCCATTGATAATTTGAGTCAACATGATAAATTATAGTATTAGAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCG 573 | >D4ZHLFP1.10.4825.2333.1 574 | GTTTCAACAGTTTTAACTTGAAGAACAACGTTTGGAGTTGCTTCGATAACTGTTACAGGGATCAATTCGCTCTGTCTCTTATACACATCTCTGAGCGGGCT 575 | >D4ZHLFP1.10.4971.2343.1 576 | ACTAAAGGGCATTGTGTAGAATCAGAGAAGTTGAAGAATATTTGCAGGAAAATAAATTGCTATTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 577 | >D4ZHLFP1.10.4866.2343.1 578 | CATCGAAATCAATGTTCCCGGTTTGCTGACGAAATACGTTTTCTTAACTTCTTGTGCCGTAGCTGCAAATAGACTGTCTCTTATACACATCTCTGAGCGGG 579 | >D4ZHLFP1.10.4814.2345.1 580 | TGTGAATATCGTAGAACTTGCCGAAGCGGGAGAATTCGGAAATGTGATTATTGACGGCCCGCTGCATGTACGTACCTGTCTCTTATACCCATCTCTGAGCG 581 | >D4ZHLFP1.10.4856.2358.1 582 | CATTTACACACACTGCGAGTGGCGACATTACAACAACATGCCTCTCTACCGATAATATGGAGTACCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 583 | >D4ZHLFP1.10.4888.2369.1 584 | 
ATTATACCCCGATTATGCTGATCCACAAAGATAAGATCGGTTGGCTGCCGCAACGCTACATCTCCATTGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 585 | >D4ZHLFP1.10.4844.2384.1 586 | TCCCCATCCAGTGACCCCCAGTGTCTGTTGTTCCCATCTTGATGTCTATGTGTCCTCAATGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 587 | >D4ZHLFP1.10.4790.2389.1 588 | ATGCAGCTCAGATGGCTGCCCAGGATTGCGCAAAGATTGCATTCGATCTTGGCCTGAGAAAGGTAAAAGCCTGTCTCTTATACACATCTCTGAGCGGGCTG 589 | >D4ZHLFP1.10.4754.2414.1 590 | GGAGGATGCCTTAAGGAGGGGAAAACGGGTGGGGGTTCCCAGGTGCATCTCCAGAGGGATTATGGAAGTACTGTCTCTTATATACATCTCTGAGCGGGCTG 591 | >D4ZHLFP1.10.4787.2417.1 592 | AGTGACAGGATTGCTTAAGCCCAGAGTTCTGGACCAGCCTGGGTGACATAGGGAGACCCTGTCTCTAACAACTGTCTCTTATACACATCTCTGAGCGGGCT 593 | >D4ZHLFP1.10.4768.2420.1 594 | CTCAAACCTTGCATGTCGAAAGCATAACTCTTGAGTTCTTCCCCAAGTCACCCCATCGTAGTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 595 | >D4ZHLFP1.10.4896.2430.1 596 | ATGCGGGCCTTGATGTAGAGATAGATGGATTTGGTAATGTTATCGGTTATAAGATTGGTACAAATCCTGATTTACTCTGTCTCTTATACACATCTCTGAGC 597 | >D4ZHLFP1.10.4805.2435.1 598 | CGGCGTCCCTGTATCAGGCCGTGCTTGGCAGGATACGGGCAGAGCAGGACGACGGGGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGAC 599 | >D4ZHLFP1.10.4994.2446.1 600 | ATATTAGGTGATACCCTTTAATAGTAGCCACCATTGCCAACCCTACACCTGTATTTCCGCTTGTCGGCTCAACCTGTCTCTTATACACATCTCTGAGCGGG 601 | >D4ZHLFP1.10.4781.2447.1 602 | AGTCGGCATAAAGCACCGGATTTTTGTATTTACCGTTTCCGAGGTCGGAAACCCATACTTCAGATACATAGTTCTTCTGCTGTCTCTTATACACATCTCTG 603 | >D4ZHLFP1.10.4995.2466.1 604 | CATTTGTTGTTGAGCAAACACAGGCAAAGTCAGTAAAGCTATCCAAATGCTAATTATAATCTGCTTCATTGTATACACTCTGTCTCTTATACACATCTCTG 605 | >D4ZHLFP1.10.4885.2488.1 606 | ACCCAAATGCAACCGGTGGATTTGACTCAGGCAGCGGAAAATTCAGTGCATGCAGTAGTTCATATAAAGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 607 | >D4ZHLFP1.10.4954.2488.1 608 | GTCTTTCATCGTATAGTGTTTCCAGGCTGTCGGTGCTCCTTCCTTATAGCAATATAGTCCTGTGGAAGAGCTGTCTCTTATACACATCTCTGAGCGGGCTG 609 | >D4ZHLFP1.10.4786.2499.1 610 | GGATATATCATCCAAGTTGTTCAAGTCACTAAACACACTGACTTTGCCAAACTTTGTCACACCGGTAAGGCTGTCTCTTATACACATCTCGAGCGGGCTGG 611 | >D4ZHLFP1.10.5217.2265.1 612 | 
CGTTTCAGCAGTTCGAGGATTCCATTCGGAGTGGCCGATACATAGCAAGGCAGTCCGATAGACATGCGTCTGTCTCTTATACACATCTCTGAGCGGGCTGG 613 | >D4ZHLFP1.10.5043.2270.1 614 | CTTTATACTTCTCACAGTAATATTTACACATCAATTGTAAACCACAGGTTTCACATTTGGGAGTGCGGGCCTGACACTGTCTCTTATACAATCTCTGAGCG 615 | >D4ZHLFP1.10.5115.2299.1 616 | GGATATTGGTGACTTATGCCGTACAAGGCGAATTTACAGAAATCAAGTGGCCCGACGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 617 | >D4ZHLFP1.10.5016.2300.1 618 | GTTACCCGACTGATCTGTGCTTCAATTTCTCCATCAGTGATAAAGGTTCACATACAGGCTGTATGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 619 | >D4ZHLFP1.10.5196.2307.1 620 | CTATTAAGTGGTCACAACATTTTAAGCAGACTATGCTAAAAGACATGCTATGACTCTACCATATTACGTCATCATATTCCTGTCTCTTATACACATCTCTG 621 | >D4ZHLFP1.10.5097.2312.1 622 | GTAAAACATATCCTATCTGCACTCCACAAGAAAGTGACGAATTTGATGGGTATACACTTTCTCCGCAGTGGCAATGGCATGCTAACATTAATGAAAAATGG 623 | >D4ZHLFP1.10.5114.2356.1 624 | TGAAGAAAGCGAATGAAACAATCTTCGAACCTTATTGTCTCGCTAAACCGGACCTGCAAGTGAAATATGTCCTGTCTCTTATACACATCTCTGAGCGGGCT 625 | >D4ZHLFP1.10.5016.2357.1 626 | CCAGTATACCTTGTAGCCGTTGTACTCAGGCGGGTTATGGCTGGCCGTGATGTTAATATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGAC 627 | >D4ZHLFP1.10.5246.2377.1 628 | TATTACAGGTTTCTGAGTAAAGGAATTCGGATTACATCGGGTACTTTCTTAGGTAATGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGAC 629 | >D4ZHLFP1.10.5055.2385.1 630 | CACGTAAGCCAGTGTACGTTTCATCTCTATCCCGGGATAGGCATTAGTGTCTTTCGCCACCATTACCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 631 | >D4ZHLFP1.10.5052.2411.1 632 | CAGACAGCCAGAGAAAGAGAGTAAGACAGAAGATAGGCACAGAAAGAGAGACAGACACAGAGAGAGACAGCTGTCTCTTATACACATCTCTGAGCGGGCTG 633 | >D4ZHLFP1.10.5165.2419.1 634 | CAATAGAATAGAGAAACTGGCTTCCTTATCAGACTGCCAGGAGATGCATTTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAAT 635 | >D4ZHLFP1.10.5107.2426.1 636 | ATGCCATGGTGGCGGCGCTGGATCTTACTACCAAAGAGTGGCGTCCGGCCAGTTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGC 637 | >D4ZHLFP1.10.5212.2431.1 638 | GGGCTGGGATACGCCACGGAAACCGTGAAAGGAGACAAGATTACCAGCGCTCTGCCCTCCAACTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 639 | >D4ZHLFP1.10.5148.2439.1 640 | 
CAGGAAGTCGTACATTTGCAATGTGTTTTTCATAGTATTAGATTTAAGGTTAACAAAGGTTGGAGCCTGTTTCTTATACACATCTCTGAGCGGGCTGGCAA 641 | >D4ZHLFP1.10.5116.2442.1 642 | CGAGCTTGCCAGTACAATCCCCACCATGCGCAGACAGGAAACGATGGTCCGGGCAATGACCATCATCAGGAGATATGCCCGGACTCTAAAGTCTGACCGGC 643 | >D4ZHLFP1.10.5225.2446.1 644 | AATGAAAGTTTTGAACAATGTGAGGAAAAAGTCGAGAATGAATACAGCGTAAATATCCCCATGCGGTACTACCTGTCTCTTATACACATCTCTGAGCGGGC 645 | >D4ZHLFP1.10.5248.2451.1 646 | TCTGTGCCCTCTGTGGTGAGTTCGTTTTGAAAGGAATTGTTATTTTCTGGCCAGACTCAGTGAATTCAGCCTGTCTCTTATACACATCTCTGAGCGGGCTG 647 | >D4ZHLFP1.10.5085.2473.1 648 | AATCAAGACGACATTAGGACGGCGTTGTTGCACGCTATCCGCTTTTGCAGCATGCCTGTAGACAGCCAACGGGCTGTCTCTTATACACATCTCTGAGCGGG 649 | >D4ZHLFP1.10.5237.2491.1 650 | CCATTAACCTCAAAATGCGCTTCAAAGAGCAAAATGGAACACGTGATGCGGGAAGTGACCAATTATATGGCTGTCTCTTATACACATCTCTGAGCGGGCTG 651 | >D4ZHLFP1.10.5171.2498.1 652 | AATACATACCGAAGTTTTGTTTTCGATCCGGAAGTATTGTCAATCAAGGCAATCCCCTTGGATCCCTTTTTGATCCCTGTCTCTTATACACATCTCTGAGC 653 | >D4ZHLFP1.10.5369.2277.1 654 | GCTGAAGCGTCAGGCGCTGAAGTATGCCTTAGGCTTGTAAAACCGGCTTCCCGGGCAGGGGCTGTTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 655 | >D4ZHLFP1.10.5333.2278.1 656 | CCGCCCACCTGCTTTAACAAATCCTGCCTTATGACGAACAGATGGCATATATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAA 657 | >D4ZHLFP1.10.5481.2291.1 658 | GTGTTCATATATTGGATTTCTTGAACTGTTCATCCTTTTAAAAAATCTTTTCCTTTCATTTTATTTTCTTTGTTCCTGTCTCTTATACACATCTCTGAGCG 659 | >D4ZHLFP1.10.5475.2319.1 660 | ACGCAACTCACGTGCATAACGTAACCGGATGGGGAAGCGCTCGCGCCCTTCTACAGTACGGGTCAACGCCATGCCACCAACAGCCTGTCTCTTATACACAT 661 | >D4ZHLFP1.10.5345.2332.1 662 | GCACAGAATTTAAGGTACAACTGCAAGGATATTGTAACAATCTTGATGGCGATCAATAAGTTGATGATTCTTTATCTGTCTCTTATACACATCTCTGAGCG 663 | >D4ZHLFP1.10.5277.2345.1 664 | AGAAAGTACTCCAATAAACTACACAATAGTTTTGTTTTACAACCTCAGAAAAGGAAAGTCCTTTCTAAGCCTGTCTCTTATACACATCTCTGAGCGGGCTG 665 | >D4ZHLFP1.10.5394.2360.1 666 | GTATCGAAGAAGCTTTGATACACTTCAACCAAGCTGATATCTGGGTAGGTGTACAAGCCCTGTCTCTTATACCCATCTCTGAGCGGGCTGGCAAGGCAGAC 667 | >D4ZHLFP1.10.5346.2364.1 668 | 
GGATATTATTAAGGTCGCCACTCGAAATTTCGATATCGTCCATCAGAATTAAAGGTGTTGCACGTCCCCCTGTCTCTTATACACATCTCTGAGCGGGCTGG 669 | >D4ZHLFP1.10.5349.2380.1 670 | CCTATATCCGGAACAGAGAACCGTATCTGTCCAGATATCTGGAGGATGGAAGATGCAGCTTTTCGAACAATCTAAGTGCTGTCTCTTATACACATCTCTGA 671 | >D4ZHLFP1.10.5443.2421.1 672 | GCCTGGGTGTGGAACTGAAAAATAATGTAAAGAGAGCCTGGTGGCAGAAGTTCATCAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 673 | >D4ZHLFP1.10.5415.2429.1 674 | ATTTTCTTTAGCAAACTCCGGTTCCATTGCCTTCTCCAGGGTGATGGGGCCGGGGGCTATAGAGAATATACCCTGTCTCTTATACACATCTCTGAGCGGGC 675 | >D4ZHLFP1.10.5481.2432.1 676 | GAGTTGTTACATTCAAGAAGAGTAGTTTGCCCGAAATTTTGACGCATCTGCCTTTGGATAGAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 677 | >D4ZHLFP1.10.5456.2452.1 678 | ATTATTCACAAAGCCAAGATATGGAAACAATCTGTCAGTGGATGAATGGATAAAGAAACTGTGGTGTATGTACCTGTCTCTTATACACATCTCTGAGCGGG 679 | >D4ZHLFP1.10.5258.2473.1 680 | AAGCAACACAGGCAGAAAAGGCCTCTGAGGATGGCGGGCAGTGGCCTGGGGGAAGAGGTGAGCCATGGCCCTGTCTCTTATACACATCTCTGAGCGGGCTG 681 | >D4ZHLFP1.10.5467.2476.1 682 | TATTTACCCAGAACAGATTATGGCTTTCAGCATCGTTCATCGGTATGCCGTTTACCGTGATGTTGATACGTGTACCGTCTGTCTCTTATACACATCTCTGA 683 | >D4ZHLFP1.10.5367.2478.1 684 | GTATAATCTTGAATGCAGGTGCGTATACACATACTTCCATAGCTTTGCAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATA 685 | >D4ZHLFP1.10.5495.2479.1 686 | ACCAAAGATTGGGTTATTCCGATCGCTCCCACTCCATTAATACCCAAACCACGTGGAATACAGTTTTTATCCTGTCTCTTATACACATCTCTAAGTGGGCT 687 | >D4ZHLFP1.10.5415.2480.1 688 | GCTCTTCGCCGTGTGAGTACCGTTTGTTCTTCATTGTGTGTAAGCCGGAGTGTTTATCTAATAAAAGATGACCGCAGATTCTGTCTCTTATACACATCTCT 689 | >D4ZHLFP1.10.5328.2486.1 690 | ATATAGCGTAGAAACTATCAGGACGGATTGACATCAACATTGATAAGGCTGAAAGCCTTCCATTATATAGCTGTCTCTTATACACATCTCTGAGCGGGCTG 691 | >D4ZHLFP1.10.5436.2493.1 692 | CCTCCGAATATGGCGGTGCCCTGCTTCTTGGCGTACGTGCCCCCTTCATCATCTGCCATGGCAGTTCCAAAGCTGTCTCTTATACACATCTCTGAGCGGGC 693 | >D4ZHLFP1.10.5699.2274.1 694 | GTACTTTGTATTTCTGTGGGATCGGTGGTGATATCCCCTTTATCATTTTTTATTGCATGTATTTGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 695 | >D4ZHLFP1.10.5678.2278.1 696 | 
TTGTTGTGCTTCGGCATTAAACTCTACCTTATTCTTAGCATACGAGAATACAAAGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGG 697 | >D4ZHLFP1.10.5517.2304.1 698 | ATCAAAGATAATCTGGTAGCTCTGTTCATTAACAGCGGTACTGCCGATTGCGAATCACTGCAAGGTATCTATGGACCTGTCTCTTATACACATCTCTGAGC 699 | >D4ZHLFP1.10.5506.2318.1 700 | CTTGTAGGCCATGCCCGCCGTCATGATGATGGCGAACAGGGCGGCGAGTGCCATGCACATCTGCAGGATCTGTCTCTTATACACATCTCTGAGCGGGCTGG 701 | >D4ZHLFP1.10.5730.2320.1 702 | GTGATCAAGTCGTTCGGTTTTATCTCGTAGTCAATCAGATTAATGGTTGCGGTCATCTTGCCTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 703 | >D4ZHLFP1.10.5700.2325.1 704 | CTTTAAAGGAATTGATTGCTAAGTTCAATATTTGCAATATTTTGGATTCAGTTGTTTTATCTGCTTTTTCTCTAGCTGTCTCTTATACACATCTCTGAGCG 705 | >D4ZHLFP1.10.5539.2337.1 706 | ACTCTACAATGCGGTTCAGTGTAGCTACGCGAACGCGGAAGAACTCGTCTAAATTGTTAGAGTAAATGCCCAGAAAGGTAAGTCTGTCTCTTATACACATC 707 | >D4ZHLFP1.10.5580.2350.1 708 | CTGCTGTGTGGTGACTGTTGGTAAAATCGTAGGATTCCAGGAGGGCGTTATCGACATGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 709 | >D4ZHLFP1.10.5739.2357.1 710 | GCTTCACCACGCTTGTGCGTATTCTCTTCAATGGTATAACAAATATCAAAAGAACGTTTGGTCTTAATATAGCCTGTCTCTTATACACATCTCTGAGCGGG 711 | >D4ZHLFP1.10.5647.2386.1 712 | GCGCAAAAGCGGGCAGAAAAGCAAGTTTGGCGGACATCAGTACGGCCTGTGCTTCTTCCACACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 713 | >D4ZHLFP1.10.5578.2389.1 714 | ATACACAGGTGCCTTCATTTGTATATTTCGCCAACTTACCTCAGTATGTGAAAGAGCCATACAAACGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 715 | >D4ZHLFP1.10.5744.2411.1 716 | CATTGTAACGGTGCGTATGGTAAATTATAAAATATCTGTTATCGGCGAAGTTACCCGTCCCGGTACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 717 | >D4ZHLFP1.10.5688.2416.1 718 | GCGTGGTGACCCGCTATCGCGTGTCGGTCGACGGCCGTACGGCGGCGGAGGGCGAATTCTCGAATATCGTCAAAAATCCGGCTGTCTCTTATACCCATCCC 719 | >D4ZHLFP1.10.5634.2418.1 720 | CTTCTCTTGGGGCCACCAGGGAGAACTTGCTTCCTCTTTTCTCTTAGGATATCATGCCTTCACCCTTTCCCCTGTCTCTTATACACATCTCTGAGCGGGCT 721 | >D4ZHLFP1.10.5659.2456.1 722 | AATTGGGACTTTACCTAATCCGGGATTGAGATGGGAAAAAACGAAAACGGCTGAAGTTGGGATGGATTCTGTCTCTTATCCACATCTCTGAGCGGGCTGGG 723 | >D4ZHLFP1.10.5711.2472.1 724 | 
GTTTTATTCGTTCCTTGGTTCCGGAACTGTCGGAAATAATGAAAATACATCCTTTTAAGGAAGGTGTGTCGCTGTCTCTTATACACATCTCTGAGCGGGCT 725 | >D4ZHLFP1.10.5589.2475.1 726 | AGATTATTTCTCGCGCTCTGCCAATAGTACCATTACAGCCAAAATCACTTATCTGTTTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 727 | >D4ZHLFP1.10.5650.2480.1 728 | GGTACACACATCACAAACAAGTTTCTGAGCATGCTTCTGTCTAGTTTTTATGCTGTCTCTTATACACATCTCTGACGGGCTGGCAAGGCAGACCGGCCAAT 729 | >D4ZHLFP1.10.5683.2489.1 730 | ACTTTATGGATTTTTATTTTTGAGAGAGGTCATCGATAAATATATCCACTGTTTGGTTACCGATGTCAGATAGACCCGTCTCTTATACACATCTCTGAGCG 731 | >D4ZHLFP1.10.5849.2278.1 732 | CTTGGGAGCCTGGCAAGTCAGGTCATTCTTTTCAAGATATAGGCAGAAATGTAGCCAATGGGGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 733 | >D4ZHLFP1.10.5915.2283.1 734 | GCCCTTTATCAGATGGATAAATTGCAAAAATTTTCTTCCATTCGGTAGGTTGCCTGTTCACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 735 | >D4ZHLFP1.10.5946.2287.1 736 | GATCAGAAAAGCCCAATTCTTGAAAGCCACCTTTGATGGCACTGCGTACATAGTCTTCATCGTCTCCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 737 | >D4ZHLFP1.10.5781.2289.1 738 | ATGCGAACCCGGTTAATTTGCTGGAAGAAACGCTGACGAAGAAACTCCCTCTTTTTACGCATATACCAGACCTGTCTCTTATACACATCTCTGAGCGGGCT 739 | >D4ZHLFP1.10.5844.2310.1 740 | AATTCATTATTTCCAGAAAGAAATAACAGATTATGATTTCCAACCACAGGCTGAGGATGATATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 741 | >D4ZHLFP1.10.5876.2312.1 742 | GGCCCATATTTCTTTTGAGAAGTTTCTGTTTAAATCTTCGAACATCTTTTAATGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCC 743 | >D4ZHLFP1.10.5989.2312.1 744 | CGGATAAACCTTGCTTTCACCGGAGAATTGCCCGGCTTCGTATCCGTCACCGGAGATTGTCCGACACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 745 | >D4ZHLFP1.10.5947.2339.1 746 | ATCAAGAACGACGTGGGCAGCCAGTTGGCAAGCATGGAATGCGTGCCGCGCCACAACACCATCGACTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 747 | >D4ZHLFP1.10.5805.2349.1 748 | CTATTAAGAATTTCACGGGCTTTGCTCCGTTGGGCATGGTGATTATCGCTATGTTCGGACTGGGAGTAGCACCTGTCTCTTATACACATCTCTGAGCGGGC 749 | >D4ZHLFP1.10.5975.2356.1 750 | GGTGATGCTGAGGACTCTGACAAGACTGCCATCGTAAGATGTGAGGAAGGTGGGGATGACGTCAAATCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 751 | >D4ZHLFP1.10.5942.2376.1 752 | 
GCATTAACCTGACGGTGAAGCCGGGCGAAGTACATGCCATTATGGGACCTAACGGTTCGGGTAAAAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 753 | >D4ZHLFP1.10.5964.2377.1 754 | CACCCAAAGATATGGCCCCTGTCTCCCAGCCTCTGTGGGAGGCTAGGAGACTAATTCTGGTAGTGCCAGTCCTGTCTCTTATACACATCTCTGAGCGGGCT 755 | >D4ZHLFP1.10.5774.2390.1 756 | TCATAGTTCGTATTATAAAGTCCGGTCGCTCCCTCTACTGCGATGCGACGTAAATCAGCATAGTAACCGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 757 | >D4ZHLFP1.10.5965.2399.1 758 | GGTGATGGACGGCATTGTGGCTATCGACAAGAACGATGTCAATTTGAGCAACGCCAACATGAGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 759 | >D4ZHLFP1.10.5876.2402.1 760 | ATTAAATGCAGCAGAAAAGATGGATAACTATTTGTTATCTAAAAAGAATGCAGATAATCCTTTTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 761 | >D4ZHLFP1.10.5935.2402.1 762 | ATGAGGTCCTTGAAGGCATCCGCCGTACCCGGATTAAAAAAGAGGAGGATGGCTGGCACCTCCGCGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 763 | >D4ZHLFP1.10.5853.2406.1 764 | CATCTAGCCTTGGCCATGTGCCCCTCCTCAAAGCAGAGTGTTCTTTTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATC 765 | >D4ZHLFP1.10.6000.2414.1 766 | CAACAAAAGTTAGCATATCAGGATTATCTTCCACTACCAGCACAGCGGGGCGATTATCTTTCTTTCCTGCCTGCTCCTGTCTCTTATACACATCTCTGAGC 767 | >D4ZHLFP1.10.5765.2419.1 768 | GGCCTTAACTTCTCTGAGCCTTATTTATAAAATGGACATAATAATTACCTCCTGCCTGTTGTGAAACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 769 | >D4ZHLFP1.10.5892.2444.1 770 | TCCTTCTACACTGGGAGACGCTTTATTGTTGCAACTGACCGATGGAAGCTATCTATTCACCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 771 | >D4ZHLFP1.10.5870.2449.1 772 | GAACAATTTCATTCAGCAGCTGAATAGCGATTGTCCGCTGGCGTAGTTTGACATCTTCGGCCACCAGCCCGGCGAATGCCTGTCTCTTATACACATCTCTG 773 | >D4ZHLFP1.10.5965.2451.1 774 | CTATTGATTATCCCATTTATCCTTTTGGCAGGTAAACTGACGGAACGGGTCAATTTCTGTCTCTTATACACATCTTGAGCGGGCTGGCAAGGCAGACCGGC 775 | >D4ZHLFP1.10.5941.2452.1 776 | AGATTGCCTTTCATGATCTGATGTACATGCCGCATAACGTGGTGATTCATTATAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGG 777 | >D4ZHLFP1.10.5782.2456.1 778 | AATGCCACCTCACTGGAACGAAGCGGCGAGATGGTAGAAGTTTCCATGGGTGAGGTTTCCTCTAAATTGCATTTACCCTGTCTCTTATACACATCTCTGAG 779 | >D4ZHLFP1.10.5836.2468.1 780 | 
CGATTGGACTATTTACCGGACGGCTCTGTTGGCGGGAGACGACCAGTTGGCCGATCTTTATAAGAAAAGGGCTGTCTCTTATACACATCTCTGAGCGGGGT 781 | >D4ZHLFP1.10.5919.2469.1 782 | CTCCGAATGTGGATGCCACGGCGTTGCATCCTCCCTCGATAGCGAGTTTCACAATGTTCCCGGGATCGAAGTAAAGCTGTCTCTTATACACATCTCTGAGC 783 | >D4ZHLFP1.10.5780.2475.1 784 | AAAACCGCCAATATCCATTGTCCAGTAAGGAATTCCACTCATCGCAAAATTCAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGC 785 | >D4ZHLFP1.10.5997.2490.1 786 | GAGAGGTATTGGAAAACGTTTACTTCAATTCATAGAGCAGCGTGCTGTTAGTGCCGGATATGTGTATATGCTGTCTCTTATACACATCTCTGAGCGGGCTG 787 | >D4ZHLFP1.10.5856.2496.1 788 | CTTCAATACTGTCGAAGAAGTTAGCCATCACCGCCGCACGAGCCTCAAAGTTCTTGGCATCTTCTGCCGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 789 | >D4ZHLFP1.10.6014.2285.1 790 | TAATAGTTGTATGTTGAAAGACCCTTTTTTGTATGATATAGGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGTA 791 | >D4ZHLFP1.10.6053.2310.1 792 | CCTGTAATCCCAGCTACTCCGCGCCGCTGTACTCCAGCCTGGGCGACAGAGCGAGACTCCCTCTCCCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 793 | >D4ZHLFP1.10.6039.2316.1 794 | CCAGAACCGTTTTAAAAATACTGACTTCTGGCCCCACCTGAATTAAAATCCCTGGAGTCAGGGCCTTGGAACTGTCTCTTATACACATCTCTGAGCGGGCT 795 | >D4ZHLFP1.10.6034.2365.1 796 | GACATAAAGAAGATGAAGCAATCAGGCAGAGTTTTAAGACAAAGGAGTAAAGAAAGGATTTCCCCAATCTTACTCCTGTCTCTTATACACATTCTGAGCGG 797 | >D4ZHLFP1.10.6011.2381.1 798 | GTTTTTACATCATGCATTTTGATGCGCTATCATTAGGTACATGCTGTTAAGGATTGTTATGTCTTCTTGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 799 | >D4ZHLFP1.10.6186.2395.1 800 | GCATTAATATAATGATCCCCAAACTTACGATCAAAATTCAACTGGAAATTAGAAAATAAATCTTCATTTCTATGATGCTGTCTCTTATACACATCTCTGAG 801 | >D4ZHLFP1.10.6217.2401.1 802 | GATATATACTGTCTGGCGGCAGGGGAACAGATTACCGGTGAATATCTTCTGGAGCGGCTCAGGGCGCTGTGCAGGAACTGTCTCTTATACACATCTCTGAG 803 | >D4ZHLFP1.10.6229.2407.1 804 | GCTCTTTTCTTCTCCCATTTTCTTTCACCTTACCTTTCTATCCTTAAATAATAAGTCCCCACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCA 805 | >D4ZHLFP1.10.6150.2411.1 806 | CAGTTGGTACGAACCCAAAAAACAGGACAAGAATGGAAACAAAGATTAAAGACATCACGGGAGCGGAAATAACTCTGTCTCTTATACACATCTCTGAGCGG 807 | >D4ZHLFP1.10.6247.2413.1 808 | 
ATATTATATAATTCTCACTGCTCCCTTATCGGCAGAGCTTACCATGCTGGCATACGCCTTCAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 809 | >D4ZHLFP1.10.6243.2453.1 810 | GGTATATAAAGTTCGCTCTGCTCACCGGCGTGACAAAGTTTGGCAAAGTCAGTGTATTCAGTGATTTGAATCTGTCTCTTATACACATCTCTGAGCGGGCT 811 | >D4ZHLFP1.10.6218.2453.1 812 | CTCTCAGACCTCTGCTGACTTCAGAGAGTTCCCACCCTAGGATCTCCTCTGAGCCCCCAATGCACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 813 | >D4ZHLFP1.10.6005.2461.1 814 | GGCCAGATCCTTAGAGCAATACTTCTGTCTCAGAACCTCTTCTCAATCCATCCGTGCTCCATCTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 815 | >D4ZHLFP1.10.6064.2466.1 816 | GGCCTATATTGAGAAGCCAATGGCGGTGACGTATGAAGAATGTACCCGAATCAACCGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCG 817 | >D4ZHLFP1.10.6144.2479.1 818 | GTTCTTACGGCAGCAACCGCGGCAGGAGCAGCAAACGCAACACCTATAATACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCGGACCGGCCAAT 819 | >D4ZHLFP1.10.6016.2497.1 820 | GGACCTTGCCATCAGAATCCTCCCACCCTGGTGGCTATCGCCGGGAGCGTATTTTGTATATGCATTGCTCTCCCTGTCTCTTATACACATCTCTGAGCGGG 821 | >D4ZHLFP1.10.6289.2266.1 822 | ACATAAGTTACTTCAAACAAAGAGTTGAATTGTTTCTTGATCTCATTCAGAGGATTATCACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGA 823 | >D4ZHLFP1.10.6401.2270.1 824 | GTATGGGAAGCAATGGTACTTATAGCTATGATGCCATATCCAACACAAACAAAAGTTGGGGCCCAAAAGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 825 | >D4ZHLFP1.10.6460.2271.1 826 | GTTACAAGTGTACGGATAAGACAGGTTATCTGGTAGGTGCCAAGCTGGTAAATGACGGCAGGGAAATCACTGTCTCTTATACACATCTCTGAGCGGGCTGG 827 | >D4ZHLFP1.10.6311.2280.1 828 | TACCAGTGTAGCGTCACCGGAGTGAACACCGGCAAATTCCACATGTTCAGAAATAGCATATTCCACCACTTCACCTGTCTCTTATACACATCTCTGAGCGG 829 | >D4ZHLFP1.10.6296.2289.1 830 | TTCCGGTTGGTTTGTCGAGGGATTATTGTCACAGACATTGATCGTACATATGATCAGGACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGA 831 | >D4ZHLFP1.10.6419.2301.1 832 | TCATGTATGTCGGCGATTATCTTGAAGAGTCGCCTCTCGGCAAGGCCGAATGGATGGTAGTGCTTTCAGACTGTCTCTTATACACATCTCTGAGCGGGCTG 833 | >D4ZHLFP1.10.6334.2309.1 834 | TATTAGGATACCCGATAAAATAGGACCTATAATTAGCGCATAAAGCCCGGCTCCACATAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGGAAGGCAG 835 | >D4ZHLFP1.10.6349.2312.1 836 | 
GGGTGGTGATGCGCTCGTTCACTTTATAGACTTGGTCCTCTACTTTAGACTCTGTACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCG 837 | >D4ZHLFP1.10.6441.2318.1 838 | CTCCACAGCAGTATGGTTTTTCGACAAGGCTTCATGGAGAAACCGCCCTAAATTCACAGCACAACTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 839 | >D4ZHLFP1.10.6394.2336.1 840 | GTCCTACTCCCCAATATGGTCCGGCATTCATGAATAGCGAAATCTTTTTTGATACAGGCATCATATACCCTATGCTGTCTCTTATACACATCTCTGAGCGG 841 | >D4ZHLFP1.10.6493.2336.1 842 | GTTTTGCTTGATGTGGTCATGCTTAACAAGAATAACTACCATCTCTATATCATCCAAAAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAG 843 | >D4ZHLFP1.10.6372.2337.1 844 | AGTCAGGGAACTTCTCTTTCAGGTTTCGGAAGAAGTTCTCGTGGATGCGGTCAAAATTGGACTGGAACATGTTCCAGTTCCTGTCTCTTATACACATCTCT 845 | >D4ZHLFP1.10.6304.2348.1 846 | GCAGGGGGACTTCTCCCACATCCAATCCTTGATGCGGCACATCTTCGGGGAACAATACGAGCTGGGTATGGACCTGTCTCTTATACACATCTCTGAGCGGG 847 | >D4ZHLFP1.10.6439.2360.1 848 | GGGTGGACATCTGACCGATTCCAGGCAGATTCTCTGGTAATCTACACGGCTATTTCTGTATCTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 849 | >D4ZHLFP1.10.6493.2363.1 850 | CTGCTACCCCACATATTAGTATAGGATACAATCTGACCGCCTACAGAGAAATCAATATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 851 | >D4ZHLFP1.10.6388.2372.1 852 | GACACACGTAATGTCAATTCGTATTCTCCCACCTCTTCAAACATATACTCCAAGCTTTTCGTCTGAGCAATTTCAGTACCTGTCTCTTATACACATCTCTG 853 | >D4ZHLFP1.10.6446.2373.1 854 | GTATATGGATGAACACTTCAAAACAAAGGCAGGGAGTAGAAAGAGAAAAAAGGAGGCATTGTCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 855 | >D4ZHLFP1.10.6355.2397.1 856 | GCTTCATGCGGGCGATCAGGTTCCTCTGTCCCTGGTTCTGGGGCATGAACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATAT 857 | >D4ZHLFP1.10.6292.2405.1 858 | CGATAAGTACGTGCTTTCTTATAGATGTTAGCTTGGATCTCTTCCAGCAGATTCTGAACATACGTCTCAATTCCTGTCTCTTATACACATCTCTGAGCGGG 859 | >D4ZHLFP1.10.6428.2425.1 860 | GTCTCACCCCTTAGAGATTCACATTTGAGGGACTAGGGCAGGCCCCTTAATGCACAGTATTAATAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 861 | >D4ZHLFP1.10.6334.2438.1 862 | GTATTACCTGAAGTTTCTGTACCTTTTGAATGGATTGACTTTTTTGCAGAACAAAGCAAGAATAACAATATTGCATCTGTCTCTTATACACATCTCTGAGC 863 | >D4ZHLFP1.10.6361.2440.1 864 | 
AAGTACAACGGAAAGAATCAGACTATTGGAAGAGGGAGCAGAAGACTATATCCTGAAACCATTCAACCTCTGTCTCTTATACACATCTCTGAGCGGGCTGG 865 | >D4ZHLFP1.10.6483.2467.1 866 | GGAATATCCTGTTTGGCATGTATTATGAACTCCCAATCATCTTTCAATGATGGCATGTAAATCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGC 867 | >D4ZHLFP1.10.6333.2478.1 868 | GTATACTGGGAGGACGGCGCCCAGTTTACGCCGCCTCATGACAAGGGCGTGACAGAGGAGGTACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 869 | >D4ZHLFP1.10.6373.2484.1 870 | ACTTCAGGACGTACAATAAAACCGAATCGATTTAACTTCTTGGTAATTGCAGTCATTTTCTCAGTAACCCTGTCTCTTATACACATCTCTGAGCGGGCTGG 871 | >D4ZHLFP1.10.6675.2265.1 872 | AAATGACGCAGTTCTGAAGAATTTCACATACAGCCGCACCTTTATGATTGGCAGCAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCG 873 | >D4ZHLFP1.10.6646.2265.1 874 | GCCGTTGGGGCAACCACTATTTATATACATCTGTATTTACATAATCTGTTGAGGCTAGAGAACCTCAAATTGTACCTGTCTCTTATACACATCTCTGAGCG 875 | >D4ZHLFP1.10.6659.2278.1 876 | CAGGAAAATCCTGCAGCAGCTGATGCACCAACCTCTTTTGGCTAACCGTTGCCGGGAGATGCTTTTTACTCTGTCTCTTATACACATCTCTGAGCGGGCTG 877 | >D4ZHLFP1.10.6630.2286.1 878 | CATTAAAATACTTTCGAATGAAAGTTAGATTGATGTGCGTCAACTGTTCAGAGAGTTTTCCCGTGATAGTCTACATTCTGTCTCTTATACACATCTCTGAG 879 | >D4ZHLFP1.10.6593.2308.1 880 | GTTCCAGTATGTCGGGTATAAGCAGGAGAGACGGGTAGTAAAATCGACCACGCTGGATGTGAAGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 881 | >D4ZHLFP1.10.6609.2310.1 882 | TTGTAGTCCCTGCCTGTAATCCCAGCCACTCAGGAGGCTGAGGCAGGAGAACCGTTTGAACCAGGGAGGCGGCTGTCTCTTATACACATCTCTGAGCGGGC 883 | >D4ZHLFP1.10.6625.2322.1 884 | GCCGTATCCTGTCGACCATATCAACCCCTTTTGAAGTTGGACCTACAATCTCGGCATACATCAACATCACATCTGTCTCTTATACACATCTCTGAGCGGGC 885 | >D4ZHLFP1.10.6544.2351.1 886 | CATTTCTTTAGAATACTGTGTTGCTTTCTGCAATACTTCGTTAGATTCACGACTCAGATACGGTTCTCCGCCTGTCTCTTATACACATCTCTGAGCGGGCT 887 | >D4ZHLFP1.10.6719.2375.1 888 | GTTGCCAGGTTGGAGTGCAGTGGCGCAATCTCAGCTCACTGCAACCTCTATCTCCCAGGTTCAAGCCTGTCTCTTATACATATCTCTGAGCGGGCTGGCAA 889 | >D4ZHLFP1.10.6612.2381.1 890 | TTCCCACCCTGCTTTCCGCCATAGGCAGGTACAGCTTCTCCATTCTCCTGGTTCACTGGCTGGTGCTGCACCCGTCTCTTATACACATCTCTTGGCGGGCT 891 | >D4ZHLFP1.10.6610.2401.1 892 | 
GTACCATTGAAAAGACAACAAAATACGACCAATTCAGAGAAGATAAAATCCGCATGGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACC 893 | >D4ZHLFP1.10.6665.2445.1 894 | CCGTGGTGTTGACCTGCGGAAACAAAGACTGGACATACACCCCCAGGCTCACCGCAAACAATGAAATCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 895 | >D4ZHLFP1.10.6513.2461.1 896 | ATATAAGTTCGTGGTTTTTATAGGGAGAAAAATGCGATTCGATGCGGTCTGTGACTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCC 897 | >D4ZHLFP1.10.6689.2461.1 898 | CAGCTAGATATTCAGGCTCAGAAAAATTTATCATTTAAAAATGATGGCCAGGTGCAGTGGCTTATGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 899 | >D4ZHLFP1.10.6716.2474.1 900 | GATTATATAGAAGCCGATCCCAGCGAATTGCTGGAAGGTCCCAAAATCGGATTTAAACTTCCGGAAGTACTGACTCTGTCTCTTATACACATCTCTGAGCG 901 | >D4ZHLFP1.10.6621.2484.1 902 | CTTTCAGGATTTCCTCTGTATTATCCCCAATTCCGCGAGTCGGTTTGTAAGTGCGGCATGCATATCGGCTAAACGCCAACGGCGGGGGTGGGAGCATGACC 903 | >D4ZHLFP1.10.6594.2484.1 904 | GAAGAAGGCAGGCTTTGTCAGGGTTTTTTTGCAGCTGAAAGCTATGGCTATCAAAAAATAAAGAGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 905 | >D4ZHLFP1.10.6692.2487.1 906 | GTACAAGGGCTCAGCAAAGACTTATTTATTCTTTTCAAATCTAAATGAGACTTGAACATTCTTTAGTGTTGAAAGATGCCCTTCCTGTCTCTTATACACAT 907 | >D4ZHLFP1.10.6554.2494.1 908 | GTATCCCGTATATCCGGCCTACGAAGTTTAGCAACCATAATGGTATCCTTCACTTCATTGGTTTCATACTGTCTCTTATACACATCTCTGAGCGGGCTGGC 909 | >D4ZHLFP1.10.6964.2284.1 910 | TTATTGGGGAATGTTCCTGTACTTTCCATTCGTTGTGAGCCATTGGAATCACTGGAGAACCGTATTATCAAGCTGTCTCTTATACACATCTCTGAGCGGGC 911 | >D4ZHLFP1.10.6876.2284.1 912 | TATATACTTTCGATGGATTAATAATATAATGCGTCAGAAAATACTGTGTCAGTTCAATCGTCTGATCTGCTGATGTCCTGTCTCTTATACACATCTCTGAG 913 | >D4ZHLFP1.10.6804.2285.1 914 | ATATTCTCCTATAGAAAGCTTTGTTTCCCTTACATTTGTAAAAAACAGAAATGCAAATGTATATGAAAAACATATTCCTGTCTCTTATACACATCTCTGAG 915 | >D4ZHLFP1.10.6920.2300.1 916 | TATATGGATACCACATCTTCCTCAGTCCAGTTGCGGTTATAGTTTTGTTTGAATTGTTCCGTCACACTTGGTCTGTCTCTTATACACATCTCTGAGCGGGC 917 | >D4ZHLFP1.10.6844.2301.1 918 | GAACAGGCATTGCAACCGCTACCGATACCCTTGCTGCCCTGAAAAAATACTATTTCGAAGAGCAAAGCCTGGATTATACCTGTCTCTTATACACATCTCTG 919 | >D4ZHLFP1.10.6889.2335.1 920 | 
CTGGAGTTGACCCCCTTAGGACAGGGCGTCGGCAGCTTTGGACTGCCAGGGGATTACACTTGCCTCTTTAACATATCTCTGAGCGGGCTGGCAGGGCGGGC 921 | >D4ZHLFP1.10.6885.2352.1 922 | CCACCATTGTTGTTTTTGTTATTTCTGTTTGGGGGCACTATGACGATAAATGACGTAAAAAAACAATGTAAGCTGTCTCTTATACACATCTCTGAGCGGGC 923 | >D4ZHLFP1.10.6849.2353.1 924 | GGTCGAGTTAGAATCTGATCAAAATTCTTTTTCATTGCATGCGGCGGCATTAGGTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGC 925 | >D4ZHLFP1.10.6800.2360.1 926 | GACTTCCCTGGTTTCTGGTAAAGGTATAAATCAAAATATGAAACCAAATAAGAAGACAGTTATGGGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 927 | >D4ZHLFP1.10.6973.2367.1 928 | GGATATGTGTGTGTTGCAAGGAGGATTGGCCACACATGGGAGTCAGCCTCCAAGGCTCTGCAAACTCTGTGGGTGGAAGTGCTGTCTCTTATACACATCTC 929 | >D4ZHLFP1.10.6985.2385.1 930 | CTTTTTCTAATGTTTATTGAGAGAGAGGTGTTTCTCAATGCCTCATTTTACTGATGGGAAATCGAGTTACCTATGGTCTGTCTCTTATACACATCTCTGAG 931 | >D4ZHLFP1.10.6934.2394.1 932 | GCCCTGCACTTACATAAATATACGGAAAGCCCAGACAGGTTTCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATATCTCGTAT 933 | >D4ZHLFP1.10.6804.2407.1 934 | GTACTAAGAAGTAATGGTTGAAATATTTTTCGAAATGGAATCTAATAACCTTTTCTATATCTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGAC 935 | >D4ZHLFP1.10.6752.2409.1 936 | ATATTAATCAAGTTCTTCATCCGTTGTGCCTGCCGGTAGATGGCTTTTAACGGCAGATATTGGCTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 937 | >D4ZHLFP1.10.6926.2414.1 938 | ACTTTAACCAAGCGGAAGAGAAAAGCAAGCGCGACCAGTATCTACGCATCAGTCAGGAGCAGTTGCAAAGGCCTGTCTCTTATACACATCTCTGAGCGGGC 939 | >D4ZHLFP1.10.6965.2422.1 940 | AGCCATAATGGTTTGATTTTAAATGAATTAATATTTAGTTAGAGTATCACGCAAATAAGAATGAGCAAAAGCTGTCTCTTATACACATCTCTGAGCGGGCT 941 | >D4ZHLFP1.10.6903.2438.1 942 | GCACTGTGAATATCTGTCACAGTCGGAACTCCGAAAGTTTCATGTACCTTTCTTAATACTTCGAGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 943 | >D4ZHLFP1.10.6773.2445.1 944 | TCATTAGCCGTTACGAAAGTCCTCCCCCGCCCCCTCCCATATTCTTTTCCGCCCATACCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGA 945 | >D4ZHLFP1.10.6900.2453.1 946 | GGATATTACGAAAATGACTTATGCTGAAATTCAGCAATATAATCTGTTGGACCGGAATGGAAGGGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAG 947 | >D4ZHLFP1.10.6779.2458.1 948 | 
ACCTCAGGTGATCCGCCCGCCTTGGCCTCTCAAAGTGCTAGGATTACAGGCCTGAGCCACCATGCCCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCA 949 | >D4ZHLFP1.10.6995.2477.1 950 | GCCAGAGCCTTGGCTATGGAGAACATCTCCGCCACGCTGAGGGTAAAGGCCAGGGAGGTGTCCTTGCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 951 | >D4ZHLFP1.10.6887.2481.1 952 | CCCTCAGGAAGCTTACATTTTAATAAAGAAAGAAAGGAAAGAAGGGAGACAGGGAGGGAGGGAAAGAAAAAATAACTGTCTCTTATACACATCTCTGAGCG 953 | >D4ZHLFP1.10.6864.2483.1 954 | GTGGTCGTCTTTTCCCATTTCAAGTCCGGATTGTATGCATTCGGGCGGTAAGTGATTCCCTCTCCCAGGATAGGATCTGTCTCTTATACACATCTCTGAGC 955 | >D4ZHLFP1.10.6785.2490.1 956 | GCATCCGACCTTCCGGAAATCATTTCGTTTCTTATATCAACATACTCAAGCATCGTCTCTTTCAACTCATCGCCAAAAGGCTGTCTCTTATACACATCTCT 957 | >D4ZHLFP1.10.7029.2272.1 958 | GGATTATAAACAATATTTCCATTGTAACGCAACTGAGTCTTTTTTACATTTCCTTCAGCCTCATACAACAAAGCCTGTCTCTTATACACATCTCTGAGCGG 959 | >D4ZHLFP1.10.7009.2280.1 960 | TCTCTACGCAATCGTGATTGAGCTCCAGGCTTGTCGGAATGTCCCGGTCCTGTCTCTTATAACATCTCTGAGCGGGCTGGCAAGGCCGACCGGCCAATATC 961 | >D4ZHLFP1.10.7068.2285.1 962 | GGCTTATAGGAATGAGCCAAGGTGTTTTTTGAGGGATGTTGTTGCTGTGTATGTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGC 963 | >D4ZHLFP1.10.7141.2307.1 964 | TTTAACGTCTGCAAAATCCAAGTCACAGTTCGTCTGTTGTTGATAGAATTCTTCCCGTGTATTAACAACTGTCTCTTATACACATCTCTGAGCGGGCTGGC 965 | >D4ZHLFP1.10.7083.2316.1 966 | TACTATTACTTCTTCTTGTACCTCGTTCCACGGCACGATACGGCATTTAAGGGTATGGTTTCCTTCTGTCTCTTATACACATCTCTGAGCGGGCCTGGCAA 967 | >D4ZHLFP1.10.7173.2327.1 968 | GTTTTCTACCTCCGGCTTATTATAAATAACAGAGCCAAAAGTTCCTATTGAGCGTTTCACAGACACAGATGGTAATCTGTCTCTTATACACATCTCTGAGC 969 | >D4ZHLFP1.10.7188.2343.1 970 | ACGTATGAAAGACCTACCTTTTGTAGTCACCAATGTCGGCAAGAAAGACGGAAAGGAATACGCTCCATGCTGTCTCTTATACACATCTCTGAGCGGGCTGG 971 | >D4ZHLFP1.10.7090.2382.1 972 | CGCTGACAGATAGGACAAAAAGCCCGATGACAATAATGGGAATGCGCCTTACCTCCCTGCACAGTATAAACACTGTCTCTTATACACATCTCTGAGCGGGC 973 | >D4ZHLFP1.10.7180.2416.1 974 | CAGTTACGCTATAATAAGATTCAGTCTCTCCGTTCTTTCCTTTGTAGTCAAGTTTGCCACCTTTAGCCAATTGAGTCTGTCTCTTATACACATCTCTGAGC 975 | >D4ZHLFP1.10.7194.2462.1 976 | 
ATCATTCTCTTTGTCCTCCTACTTATGCTTATAGCCTGTGTAAACCGTCATCGCCAGACAGAATACCATCTGTCTCTTATACACATCTCTGAGCGGGCTGG 977 | >D4ZHLFP1.10.7090.2480.1 978 | ACATAGGAATTATGTACTTCCTTCACAATTTTTCTTTTCCTTTTTTTTTTTTTTTTTTTTTAGGGAGGGGTCCTTTTTTTTTTAACACTTTCTTGGGGGGG 979 | >D4ZHLFP1.10.7192.2481.1 980 | ATTATCATTGGTGGCGGTCCTGCGGGATATACGGCTGCCGAAGCTGCCGCTAAAGGTGGTTTGAGCGTATTGCCTGTCTCTTATACACATCTCTGAGCGGG 981 | >D4ZHLFP1.10.7149.2484.1 982 | ACATTTAATATTTGAAATTCTTCTATAAAAAGGATTTGTCCTTTTTTCTCTCTTTGTTTATGTACTGTCTCTTATACACACCTCTGAGCGGGCTGGCAAGG 983 | >D4ZHLFP1.10.7416.2272.1 984 | AAGGAGAAGTGAACGACGTATTCTATTACTATCGCGAACCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCGGACCGGCCAATATCTCGTATGCC 985 | >D4ZHLFP1.10.7397.2297.1 986 | CCCTTATCTCTACCTGAATGATGCGAGAAGATACGGGAAACAAGGAAGCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGGCAGACCGGCCAATAT 987 | >D4ZHLFP1.10.7436.2303.1 988 | GTATTGTAAAGAAGGAGTCCATTTTCCTGAAGCAGTAAACGAGATTTCAACAGTTTCCCCATAAAGTGAAACAGTCTTTTCCGTTTCACTGCTAATTGCAG 989 | >D4ZHLFP1.10.7357.2307.1 990 | AAATAAAATACCAGAGGAGATACCCCAAAATAACAGATATAAAAAAGGCGGCTGAAAGCATGAACCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAA 991 | >D4ZHLFP1.10.7282.2308.1 992 | CATTATCACTGCTGTAACACAACGTTTGGAAGTTCCCGCCGAAAAGGTCATGGTTAATATTGAACGATACGGCAATACCAGCTGTCTCTTATACACATCTC 993 | >D4ZHLFP1.10.7495.2332.1 994 | CCCTTACGGTTTCACGTACTCGATAGCATTAACATACCAGCTCGCTTCCCCGGCAGGGGTATTCCTGTCTCTTATACACATCTCTGAGCGGGCTGGCAAGG 995 | >D4ZHLFP1.10.7343.2333.1 996 | ATGGTGGTCAATCACACTTCTGTTGTAGGAAGAAAGCAGGTTCATCTTCTCGATACCTTATTAAAGAGCTGTCTCTTATACACATCTCTGAGCGGGCTGGC 997 | >D4ZHLFP1.10.7478.2344.1 998 | GGTTTCAACAGGGATGTCATAACTCAAGCGAACCAGAGGTTGGGCAGCAAGATGAAAAACAATCTCAGGTTGATACCTGTCTCTTATACACATCTCTGAGC 999 | >D4ZHLFP1.10.7412.2352.1 1000 | CTACAAAGAAGTAAAAAAATAAATAAATAAAATTGGAGGATACTGTAAACTATTGGTATTTTTTTAAATAGAGGTATCATATATACCTGTCTCTTATACAC 1001 | -------------------------------------------------------------------------------- /samples/sample.fa.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/meren/BLAST-filtering-pipeline/e7ae4f3e76fcdb755fda2c92d70e719ecc7148f9/samples/sample.fa.png -------------------------------------------------------------------------------- /samples/sample_userch_db.wdb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meren/BLAST-filtering-pipeline/e7ae4f3e76fcdb755fda2c92d70e719ecc7148f9/samples/sample_userch_db.wdb --------------------------------------------------------------------------------