├── .github
│   └── workflows
│       └── upload.yaml
├── .gitignore
├── CONTRIBUTING.md
├── LICENSE
├── Makefile
├── README.md
├── data
│   ├── README.md
│   ├── bam
│   │   ├── bad
│   │   │   ├── bai_older_than_data.bam
│   │   │   ├── bai_older_than_data.bam.bai
│   │   │   ├── read_name_longer_than_254.sam
│   │   │   └── truncated.bam
│   │   └── good
│   │       ├── basic.bam
│   │       ├── basic.sam
│   │       ├── basic_unsorted.bam
│   │       ├── compressed.sam.gz
│   │       ├── indexed_bai.bam
│   │       ├── indexed_bai.bam.bai
│   │       ├── indexed_csi.bam
│   │       ├── indexed_csi.bam.csi
│   │       ├── indexed_csi.sam.gz
│   │       ├── indexed_csi.sam.gz.csi
│   │       ├── indexed_tbi.sam.gz
│   │       ├── indexed_tbi.sam.gz.tbi
│   │       └── no_mapped_reads.bam
│   ├── bed
│   │   ├── bad
│   │   │   ├── negative_coords.bed
│   │   │   ├── non_integer_coords.bed
│   │   │   ├── spaces.bed
│   │   │   └── start_greater_than_end_coords.bed
│   │   └── good
│   │       ├── basic.bed
│   │       ├── compressed.bed.gz
│   │       ├── indexed_csi.bed.gz
│   │       ├── indexed_csi.bed.gz.csi
│   │       ├── indexed_tbi.bed.gz
│   │       ├── indexed_tbi.bed.gz.tbi
│   │       └── unsorted.bed
│   ├── fasta
│   │   └── good
│   │       ├── basic_aligned.fa
│   │       ├── basic_dna.fa
│   │       ├── basic_protein.fa
│   │       ├── compressed.fa.gz
│   │       ├── duplicate_sequence_names.fa
│   │       ├── empty_lines.fa
│   │       ├── multiline.fa
│   │       └── name_contains_spaces.fa
│   ├── fastq
│   │   ├── bad
│   │   │   ├── quality_mismatch.fastq
│   │   │   ├── truncated_clean.fastq
│   │   │   └── truncated_halfway.fastq
│   │   └── good
│   │       ├── basic_R1.fastq
│   │       ├── basic_R2.fastq
│   │       ├── compressed.fastq.gz
│   │       ├── duplicate_+.fastq
│   │       ├── interleaved.fastq
│   │       ├── multiline.fastq
│   │       └── quality_@.fastq
│   └── vcf
│       ├── bad
│       │   └── missing_info_field.vcf
│       └── good
│           ├── basic.bcf
│           ├── basic.vcf
│           ├── basic_multisample.bcf
│           ├── basic_multisample.vcf
│           ├── compressed.vcf.gz
│           ├── indexed.bcf
│           ├── indexed.bcf.csi
│           ├── indexed_csi.vcf.gz
│           ├── indexed_csi.vcf.gz.csi
│           ├── indexed_tbi.vcf.gz
│           └── indexed_tbi.vcf.gz.tbi
├── metadata.csv
└── src
    ├── generate_bam.sh
    ├── generate_bed.sh
    ├── generate_fasta.sh
    ├── generate_fastq.sh
    ├── generate_vcf.sh
    └── lib.sh

/.github/workflows/upload.yaml:
-------------------------------------------------------------------------------- 1 | name: Upload data to bucket 2 | 3 | on: 4 | push: 5 | branches: [ "main" ] 6 | paths: 7 | - data/** 8 | workflow_dispatch: 9 | 10 | jobs: 11 | Upload: 12 | runs-on: ubuntu-latest 13 | steps: 14 | - name: Checkout 15 | uses: actions/checkout@master 16 | 17 | # Doesn't seem to support sync on .gz/.bcf/.bam files (always deletes and reuploads them) 18 | - name: Upload 19 | run: aws s3 sync ./data/ s3://bio-data-zoo/ 20 | env: 21 | AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }} 22 | AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }} 23 | AWS_ENDPOINT_URL: ${{ secrets.AWS_ENDPOINT_URL }} 24 | AWS_EC2_METADATA_DISABLED: true 25 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /*.bai 2 | /src/*.bai 3 | .env 4 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | ## Suggest and contribute new formats 4 | 5 | If you're interested in datasets for a file format or edge case that is not currently supported: 6 | 7 | 1. Create an issue on GitHub, describing the file format and scenario(s) to support. 8 | 2. Mention if you are interested in contributing the sample data. 9 | 3. If you want to contribute the data, wait for confirmation before you start working on it, and follow the instructions in the next section to submit a PR. 10 | 11 | 12 | ## Adding a new file 13 | 14 | The general approach is: for each file format, we fetch sample files from public cloud buckets (logic in `Makefile`'s `init` target). Those are generally called `basic_[description].[format]`. Then, we modify those original files to simulate certain scenarios (logic in `generate_[format].sh`). 
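
To make the two steps concrete, here is a minimal, self-contained sketch of the pattern. It is an illustration only, not one of the repo's scripts: the temporary directory, the `basic_example.fastq` and `truncated_example.fastq` file names, and the hand-written reads are all assumptions, and it checks the result with plain `diff` rather than the helpers from `lib.sh`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch location; the real scripts use $DIR_BASIC and $DIR_OUT.
WORK_DIR="$(mktemp -d)"
BASIC="$WORK_DIR/basic_example.fastq"
OUT="$WORK_DIR/truncated_example.fastq"

# Step 1: stand-in for the "fetch" step. Instead of downloading from a
# public bucket, write a tiny 2-record FASTQ (4 lines per record) by hand.
printf '%s\n' \
  '@read1' 'ACGT' '+' 'IIII' \
  '@read2' 'TTGA' '+' 'IIII' > "$BASIC"

# Step 2: derive a scenario from the basic file. Keeping 6 of the 8 lines
# cuts the second record off after its sequence line.
head -n 6 "$BASIC" > "$OUT"

# Step 3: confirm the derived file actually differs from the original.
if diff -q "$BASIC" "$OUT" > /dev/null; then
  echo "generation failed: output is identical to input" >&2
  exit 1
fi
echo "lines in output: $(wc -l < "$OUT" | tr -d ' ')"
```

The same shape (a basic input, one small transformation, one check that the transformation took effect) is what each scenario block in `generate_[format].sh` is expected to follow.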
15 | 16 | * If you're adding a scenario for an existing file format: 17 | * Modify `./src/generate_[format].sh` and add: 18 | 19 | ```bash 20 | # ------------------------------------------------------------------------------ 21 | # Scenario 22 | # ------------------------------------------------------------------------------ 23 | 24 | log "Creating [describe scenario]" 25 | # Add code here to generate that file. You can use $DIR_BASIC for inputs and $DIR_OUT for output 26 | head -n 10 "$DIR_BASIC" > "$DIR_OUT/new_scenario.fastq" 27 | # Write a command here within the "$()" that should return a non-empty string if the generation succeeded 28 | validate "$(diff "$DIR_BASIC" "$DIR_OUT/new_scenario.fastq")" 29 | ``` 30 | 31 | * If you're adding a new file format: 32 | * Create `./src/generate_[format].sh` and follow the structure of other similar files such as `generate_bed.sh` 33 | * Update the Makefile (we can work with you on that) 34 | * Update README to specify where the original files came from (we can work with you on that) 35 | 36 | ## Dev environment 37 | 38 | Prerequisites to develop on this repo: 39 | 40 | * `samtools` 41 | * `bedtools` 42 | * `bcftools` 43 | * `tabix` 44 | * `bgzip` 45 | * `seqtk` 46 | * `gshuf` (`brew install coreutils` on Mac) 47 | * `zcat` 48 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 OMGenomics Labs LLC 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | 
The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | URL_FASTQ = "https://42basepairs.com/download/s3/1000genomes/phase3/data/NA12878/sequence_read/ERR001268_1.filt.fastq.gz" 2 | URL_BAM = "https://42basepairs.com/download/s3/1000genomes/phase3/data/NA12878/alignment/NA12878.chrom11.ILLUMINA.bwa.CEU.low_coverage.20121211.bam" 3 | URL_BED = "https://42basepairs.com/download/s3/human-pangenomics/pangenomes/freeze/freeze1/minigraph/hprc-v1.0-minigraph-chm13.bb.bed.gz" 4 | URL_VCF = "https://42basepairs.com/download/s3/1000genomes/technical/working/20140708_previous_phase3/chrXY_v1a/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5a.20130502.sites.vcf.gz" 5 | URL_VCF_MULTISAMPLE = "https://42basepairs.com/download/s3/1000genomes/technical/working/20140708_previous_phase3/chrXY_v1a/ALL.chrY.phase3_integrated_v1a.20130502.genotypes.vcf.gz" 6 | 7 | # Non-file targets 8 | .PHONY: clean generate init 9 | 10 | # Targets 11 | clean: 12 | @rm -r ./data 13 | 14 | generate: 15 | @find ./data -type f ! -name 'basic*' ! 
-name README.md -exec rm {} \; 16 | @./src/generate_fasta.sh && \ 17 | ./src/generate_fastq.sh && \ 18 | ./src/generate_bam.sh && \ 19 | ./src/generate_vcf.sh && \ 20 | ./src/generate_bed.sh 21 | 22 | init: 23 | @mkdir -p ./data/{fasta,fastq,bam,vcf,bed}/{good,bad} 24 | @echo "Generating FASTA files..." && \ 25 | echo ">sequence1\nAATTCTCATTACTGTATCACAGCAAGTTGTATTTACAACAAAAATCCAAA\n>sequence2\nGCCTACCAGAAAACGTTGTATTTTGGCAAAGTTCAAAAAGTCAGTCCAGA\n>sequence3\nGTATAATTCACAGAGTTTCATGTGGTTGTTGTTGACTCTACATATTGTCT" > "./data/fasta/good/basic_dna.fa" && \ 26 | echo ">sequence1\nATCTACGATCGAGCTACT\n>sequence2\nATC----ATCGACCCACT" > "./data/fasta/good/basic_aligned.fa" && \ 27 | echo ">sequence1\nADHWNARNNAKFWVYSHGPLWGIMHSHFPAGLAQGKNLHEIIPSMKQCIRPEWVDYCHMF\n>sequence2\nISTTGEGMSHFVQNWVPLVWGFAVHYAQVTLFRDTRNGGYEVSVEWLGLYVSQLDASWNI\n>sequence3\nVKMHIEVVRPIWEHSQNIHFAQLTDNPAAKACDGFAPVTMKKTCGTDTIHCYHTYHACWR" > "./data/fasta/good/basic_protein.fa" 28 | 29 | @echo "Downloading FASTQ files..." && \ 30 | curl -s "$(URL_FASTQ)" | seqtk seq -l 0 | head -n 12 > "./data/fastq/good/basic_R1.fastq" && \ 31 | curl -s "$(subst _1,_2,$(URL_FASTQ))" | seqtk seq -l 0 | head -n 12 > "./data/fastq/good/basic_R2.fastq" 32 | 33 | @echo "Downloading BAM files..." && \ 34 | samtools view -b -h "$(URL_BAM)" 11:82365011-82366010 > "./data/bam/good/basic.bam" && \ 35 | samtools view -b -h "$(URL_BAM)" 11:128989445-128990444 11:82365011-82366010 > "./data/bam/good/basic_unsorted.bam" && \ 36 | samtools view -h "./data/bam/good/basic.bam" > "./data/bam/good/basic.sam" 37 | 38 | @echo "Downloading VCF files..." 
&& \ 39 | bcftools view --no-version "$(URL_VCF)" | head -n 300 > "./data/vcf/good/basic.vcf" && \ 40 | bcftools view --no-version "$(URL_VCF_MULTISAMPLE)" | head -n 150 > "./data/vcf/good/basic_multisample.vcf" && \ 41 | bcftools view --no-version "./data/vcf/good/basic.vcf" -o "./data/vcf/good/basic.bcf" && \ 42 | bcftools view --no-version "./data/vcf/good/basic_multisample.vcf" -o "./data/vcf/good/basic_multisample.bcf" 43 | 44 | @echo "Downloading BED file..." && \ 45 | curl -s "$(URL_BED)" | zcat | head -n 10 > "./data/bed/good/basic.bed" 46 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Bio Data Zoo 2 | 3 | This repo contains example data in various genomics file formats. It is intended for bioinformatics tool developers to make testing software easier. It includes examples of valid file formats, edge cases, and invalid formats. 4 | 5 | ## Browse 6 | 7 | Browse the data on 42basepairs: https://42basepairs.com/browse/r2/bio-data-zoo 8 | 9 | ## Download 10 | 11 | Download this repo as a zip file: https://github.com/omgenomics/bio-data-zoo/archive/refs/heads/main.zip 12 | 13 | ## Formats included 14 | 15 | |Format|Extensions| 16 | |--|--| 17 | |FASTA|.fa, .fa.gz| 18 | |FASTQ|.fastq, .fastq.gz| 19 | |BAM|.bam, .bam.bai, .bam.csi, .sam, .sam.gz, .sam.gz.csi, .sam.gz.tbi| 20 | |VCF|.vcf, .vcf.gz, .vcf.gz.csi, .vcf.gz.tbi, .bcf, .bcf.csi| 21 | |BED|.bed, .bed.gz, .bed.gz.csi, .bed.gz.tbi| 22 | |CRAM|**TODO**: .cram, .crai, different CRAM versions| 23 | |GFF|**TODO**: .gff3, .gtf, .gff, .gff.gz, .gff.gz.tbi| 24 | 25 | 26 | ## Data Source 27 | 28 | |Path|Source|Preview file|Download file| 29 | |--|--|--|--| 30 | | `basic_R1.fastq` | `s3://1000genomes` | [Preview on 42basepairs](https://42basepairs.com/browse/s3/1000genomes/phase3/data/NA12878/sequence_read?file=ERR001268_1.filt.fastq.gz&preview=) | 
[Download](https://42basepairs.com/download/s3/1000genomes/phase3/data/NA12878/sequence_read/ERR001268_1.filt.fastq.gz) | 31 | | `basic.bam` | `s3://1000genomes` | [Preview on 42basepairs](https://42basepairs.com/browse/s3/1000genomes/phase3/data/NA12878/alignment?file=NA12878.chrom11.ILLUMINA.bwa.CEU.low_coverage.20121211.bam&preview=) | [Download](https://42basepairs.com/download/s3/1000genomes/phase3/data/NA12878/alignment/NA12878.chrom11.ILLUMINA.bwa.CEU.low_coverage.20121211.bam) | 32 | | `basic_multisample.vcf` | `s3://1000genomes` | [Preview on 42basepairs](https://42basepairs.com/browse/s3/1000genomes/technical/working/20140708_previous_phase3/chrXY_v1a?file=ALL.chrY.phase3_integrated_v1a.20130502.genotypes.vcf.gz&preview=) | [Download](https://42basepairs.com/download/s3/1000genomes/technical/working/20140708_previous_phase3/chrXY_v1a/ALL.chrY.phase3_integrated_v1a.20130502.genotypes.vcf.gz) | 33 | | `basic.vcf` | `s3://1000genomes` | [Preview on 42basepairs](https://42basepairs.com/browse/s3/1000genomes/technical/working/20140708_previous_phase3/chrXY_v1a?file=ALL.wgs.phase3_shapeit2_mvncall_integrated_v5a.20130502.sites.vcf.gz&preview=) | [Download](https://42basepairs.com/download/s3/1000genomes/technical/working/20140708_previous_phase3/chrXY_v1a/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5a.20130502.sites.vcf.gz) | 34 | | `basic.bed` | `s3://human-pangenomics` | [Preview on 42basepairs](https://42basepairs.com/browse/s3/human-pangenomics/pangenomes/freeze/freeze1/minigraph?file=hprc-v1.0-minigraph-chm13.bb.bed.gz&preview=) | [Download](https://42basepairs.com/download/s3/human-pangenomics/pangenomes/freeze/freeze1/minigraph/hprc-v1.0-minigraph-chm13.bb.bed.gz) | 35 | 36 | 37 | ## Contributing 38 | 39 | See [CONTRIBUTING docs](./CONTRIBUTING.md). 
40 | -------------------------------------------------------------------------------- /data/README.md: -------------------------------------------------------------------------------- 1 | ### Bio Data Zoo 2 | 3 | This is a collection of sample data in various genomics file formats. It is intended for bioinformatics tool developers to test software more easily. It includes examples of valid file formats, edge cases, and invalid formats. 4 | 5 | #### Contribute 6 | 7 | Visit https://github.com/omgenomics/bio-data-zoo 8 | -------------------------------------------------------------------------------- /data/bam/bad/bai_older_than_data.bam: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/bad/bai_older_than_data.bam -------------------------------------------------------------------------------- /data/bam/bad/bai_older_than_data.bam.bai: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/bad/bai_older_than_data.bam.bai -------------------------------------------------------------------------------- /data/bam/bad/truncated.bam: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/bad/truncated.bam -------------------------------------------------------------------------------- /data/bam/good/basic.bam: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/basic.bam -------------------------------------------------------------------------------- /data/bam/good/basic_unsorted.bam: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/basic_unsorted.bam -------------------------------------------------------------------------------- /data/bam/good/compressed.sam.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/compressed.sam.gz -------------------------------------------------------------------------------- /data/bam/good/indexed_bai.bam: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_bai.bam -------------------------------------------------------------------------------- /data/bam/good/indexed_bai.bam.bai: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_bai.bam.bai -------------------------------------------------------------------------------- /data/bam/good/indexed_csi.bam: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_csi.bam -------------------------------------------------------------------------------- /data/bam/good/indexed_csi.bam.csi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_csi.bam.csi -------------------------------------------------------------------------------- /data/bam/good/indexed_csi.sam.gz: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_csi.sam.gz -------------------------------------------------------------------------------- /data/bam/good/indexed_csi.sam.gz.csi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_csi.sam.gz.csi -------------------------------------------------------------------------------- /data/bam/good/indexed_tbi.sam.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_tbi.sam.gz -------------------------------------------------------------------------------- /data/bam/good/indexed_tbi.sam.gz.tbi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/indexed_tbi.sam.gz.tbi -------------------------------------------------------------------------------- /data/bam/good/no_mapped_reads.bam: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bam/good/no_mapped_reads.bam -------------------------------------------------------------------------------- /data/bed/bad/negative_coords.bed: -------------------------------------------------------------------------------- 1 | chr1 -3634 3696 5 4 0 0 374 -1 -1 -1 s1,s367398,s434464,s2,s3 * 
CCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGA 2 | chr1 11669 11728 3 2 0 0 59 -1 -1 -1 s3,s4,s5 * GTCCCTCTGTCTCTGCCAACCAGTTAACCTGCTGCTTCCTGGAGGGAGACAGTCCCTCA 3 | chr1 18095 18095 3 2 0 0 313 -1 -1 -1 s5,s238062,s6 * GGGCCGGGTGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAGGTCAGAAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCTGGGCATGGTGGTGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGAGCGACAGAGTGAGACTCTGTCTCAAAAAAAAAAAAAAAAAAGAAGGAATAAGAC 4 | chr1 21837 21883 4 2 0 46 103 -1 -1 -1 s6,s374420,s7,s8 TTCCCAGCAGTGCAGGCCCCTCTCTAGAGCTGAGATGCTCCCGGCA CCCTCTCCAGAGCCGAGACGCTCCCGGCGGTGCAGGCCCCTCTCTAGAGCCGAGACGCTCCCAGCAATGCAGGTCCCCCTCTAGAGCCGAGACGCTCCCGGCG 5 | chr1 24110 26059 59 12526 0 22 3875 -1 -1 -1 s8,s345381,s418892,s238063,s416853,s463110,s271302,s377656,s314817,s482893,s461932,s461931,s9,s10,s11,s432618,s12,s322411,s13,s14,s434465,s398218,s398219,s478354,s443821,s443822,s359857,s15,s16,s17,s18,s19,s425113,s20,s21,s470986,s345380,s466297,s22,s23,s345379,s445459,s347465,s479666,s24,s25,s26,s27,s28,s467481,s29,s432617,s345378,s436164,s479667,s432616,s463109,s30,s31 AGCAATGAGGCACGTGTGGAAA 
GCAATGAGGCACGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGGCACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCG
ACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAA 6 | chr1 27995 28086 11 12 0 91 958 -1 -1 -1 s31,s416854,s238064,s238065,s314818,s314819,s238066,s32,s414976,s33,s34 
TGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGTCCT CGACTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCCGATCTGTATATATGTATCATGTAAACATGATTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTTCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCTCCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATGTATGTATCATGTAAACACGAGTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGTCCT 7 | chr1 29311 29311 7 4 0 0 863 -1 -1 -1 s34,s392676,s432615,s478355,s432614,s432613,s35 * 
TAGCCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCAACCCCCTCTCCATCCCCCTCTCCATCTCCCTCTCCTTTCTCCTCTCTAGCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCATCCCCCCTCCATCCCCTTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCTTCTCCATCCCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCCCCTCTCCATCTCCCTTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCCTCTCCATCCCCCCTACATCCCCCTCTCCTTTCTCCTCCCCACCCCCTCTTCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTC 8 | chr1 36616 37092 20 111 0 0 2096 -1 -1 -1 s35,s486581,s36,s238067,s458206,s37,s38,s428880,s458205,s39,s40,s238068,s238069,s238070,s307180,s426376,s41,s238071,s42,s43 * 
GAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCAGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCT
GTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGCTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAG 9 | chr1 37656 37965 20 86 0 1 1759 -1 -1 -1 s43,s486580,s44,s238072,s416855,s238073,s434466,s416856,s238074,s45,s238075,s46,s416857,s441327,s416858,s238076,s416859,s47,s48,s49 T GCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTTGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGAGTGTGACGGGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCATGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGCGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTG
TTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTG 10 | chr1 54659 54659 3 2 0 0 5874 -1 -1 -1 s49,s238077,s50 * TTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCTCAGGCTTCAACATCGAATACGCCGCAGGCCCCTTCGCCCTATTCTTCATAGCCGAATACACAAACATTATTATAATAAACACCCTCACCACTACAATCTTCCTAGGAACAACATATAACGCACTCTCCCCTGAACTCTACACAACATATTTTGTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAACTTCCTACCACTCACCCTAGCATTACTTATATGATATGTCTCCATACCCATTACAATCTCCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGTTTAAATCCCCTTATTTCTAGGACTATGAGAATCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATATCCTTCCCGTACTAATTAATCCCCTGGCCCAACCCGTCATCTACTCTACCATCTTTGCAGGCACACTCATCACAGCGCTAAGCTCGCACTGATTTTTTACCTGAGTAGGCCTAGAAATAAACATGCTAGCTTTTATTCCAGTTCTAACCAAAAAAATAAACCCTCGTTCCACAGAAGCTGCCATCAAGTATTTCCTCACGCAAGCAACCGCATCCATAATCCTTCTAATAGCTATCCTCTTCAACAATATACTCTCCGGACAATGAACCATAACCAATACCACCAATCAATACTCATCATTAATAATCATAATGGCTATAGCAATAAAACTAGGAATAGCCCCCTTTCACTTCTGAGTCCCAGAGGTTACCCAAGGCACCCCTCTGACATCCGGCCTGCTCCTTCTCACATGACAAAAACTAGCCCCCATCTCAATCATATACCAAATTTCTCCCTCATTAAACGTAAGCCTTCTCCTCACTCTTTCAATCTTATCCATCATGGCAGGCAGTTGAGGTGGATTAAACCAAACCCAACTACGCAAAATCTTAGCATACTCCTCAATTACCCACATAGGATGAATAACAGCAGTTCTACCGTACAACCCTAACATAACCATTCTTAATTTAACTATTTATATTATCCTAACTACTACCGCATTCCTACTACTCAACTTAAACTCCAGCACCACAACCCTACTACTATCTCGCACCTGAAACAAGCTAACATGACTAACACCCTTAATTCCATCCACCCTCCTCTCCCTAGGAGGCCTGCCCCCGCTAACCGGCTTTTTGCCCAAATGGGCCATTATCGAAGAATTCACAAAAAACAATAGCCTCATCATCCCCACCATCATAGCCATCATCACCCTCCTTAACCTCTACTTCTACCTGCGCCTAATCTACTCCACCTCAATCACACTACTCCCTATATCTAACAACGTAAAAATAAAATGACAGTTTGAACACACAAAACCCACCCCATTCCTCCCCACACTCATCGCCCTTACCACACTGCTCCTACCTATCTCCCCTTTTATGCTAATAATCTTATAGAAATTTAGGTTAAATACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTGCAATACTTAATTTCTGCAACAGCTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTACTAGACCAATGGGACTTAAACCCACAAACACTTAGTTAACAGCTAAGCACCCTAATCAACTGGCTTCAATCTACTTCTCCCGCCGCCGGGAAAAAAGGCGGGAGAAGCCCCGGCAGGTTTGAAGCTGCTTCT
TCGAATTTGCAATTCAATATGAAAATCACCTCAGAGCTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCACTCAGCCATTTTACCTCACCCCCACTGATGTTCGCCGACCGTTGACTATTCTCTACAAACCACAAAGACATTGGAACACTATACCTATTATTCGGCGCATGAGCTGGAGTCCTAGGCACAGCTCTAAGCCTCCTTATTCGAGCCGAACTGGGCCAGCCAGGCAACCTTCTAGGTAACGACCACATCTACAACGTTATCGTCACAGCCCATGCATTTGTAATAATCTTCTTCATAGTAATACCCATCATAATCGGAGGCTTTGGCAACTGACTAGTTCCCCTAATAATCGGTGCCCCCGATATGGCGTTTCCCCGCATAAACAACATAAGCTTCTGACTCTTACCCCCCTCTCTCCTACTCCTGCTTGCATCTGCTATAGTGGAGGCCGGCGCAGGAACAGGTTGAACAGTCTACCCTCCCTTGGCAGGGAACTACTCCCACCCTGGAGCCTCCGTAGACCTAACCATCTTCTCCTTACACCTAGCAGGTATCTCCTCTATCTTAGGAGCCATCAATTTCATCACAACAATTATTAATATAAAACCCCCTGCCATAACCCAATACCAAACGCCCCTTTTCGTCTGATCCGTCCTAATCACAGCAGTCTTACTTCTCCTATCTCTCCCAGTCCTAGCCGCTGGCATCACTATACTACTAACAGACCGTAACCTCAACACCACCTTCTTCGACCCAGCCGGAGGAGGAGACCCCATTCTATACCAACACCTATTCTGATTTTTCGGTCACCCTGAAGTTTATATTCTCATCCTACCAGGCTTCGGAATAATCTCCCATATTGTAACTTACTACTCCGGGAAAAAAAGAACCATTTGGATACATAGGTATGGTCTGAGCTATGATATCAATTGGCTTCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGAATAGACGTAGACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGTCAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTTATTTTTCTTTTCACCGTAGGTGGCCTGACTGGCATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGTAGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCTTCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATCCATTTCGCTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTATCCCGATGCATACACCACATGAAATATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTAATAATTTTCATAATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACATTCGAAGAACCCGTATACATAAAATCTAGACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCTAAATCCTATATATCTTAATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTTATCATCTTTCATGATCACGCCCTCATAATCATTTTCCTTATCTGCTTCCTAGTCCTGTACGCCCTTTTCCTAACACTCACAACAAAACTAACTAATACTAACATCTCAGACGCTCAGGAAATAGAAACCGTCTGAACTATCC
TGCCCGCCATCATCCTAGTCCTTATCGCCCTCCCATCCCTACGCATCCTTTACATAACAGACGAGGTCAACGATCCCTCCTTTACCATCAAATCAATTGGCCATCAATGGTACTGAACCTACGAATACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATCGAGTAGTACTCCCGGTTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTCTTACACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAACCAAACCACTTTCACTGCTACACGACCAGGGGTATACTACGGCCAATGCTCTGAAATCTGTGGAGCAAACCAGTTTTATGCCCATCGTCCTAGAATTAATTCCCCTAAAAATCTTTGAAATAGGGCCCGTATTTACCCTATAGCACCCCCTCTACCCCCTCTAGAGCCCACTGTAAAGCTAACTTAGCATTAACCTTTTAAGTTAAAGATTAAGAGAACCAACACCTCTTTACAGTGAAATGCCCCAACTAAATACTACCGTATGACCCACCATAATTACCCCCATACTCCTTACACTATTCCTCATCACCCAACTAAAAATATTAAATACAAATTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAACTATAACAAACCCTGAGAACCAAAATGAACGAAAATCTGTTCACTTCATTCATTGCCCCCACAATCCTAGGCCTACCCGCCGCAGTACTGATCATTCTATTTCCCCCTCTATTGATCCCCACCTCCAAATATCTCATCAACAACCGACTAATTACCACCCAACAATGACTAATCCAACTAACCTCAAAACAAATGATAGCCATACACAACACTAAGGGACGAACCTGATCTCTTATACTAGTATCCTTAATCATTTTTATTGCCACAACTAACCTCCTCGGACTCCTGCCTCACTCATTTACACCAACCACCCAACTATCTATAAACCTAGCCATGGCCATCCCCTTATGAGCGGGCGCAGTGATTATAGGCTTTCGCTCTAAGATTAAAAATGCCCTAGCCCACTTCTTACCACAAGGCACACCTACACCCCTTATCCCTATACTAGTTATTATCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATTGGAAGCGCCACACTAGCAATATCAACTATTAACCTTCCCTCTACACTTATCATCTTCACAATTCTAATTCTACTGACTATCCTAGAAATCGCTGTCGCCTTAATCCAAGCCTACGTTTTTACACTTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGGCCCCTAACAGGGGCCCTCTCAGCCCTCCTAATGACCTCCGGCCTAGCCATGTGATTTCACTTCCACTCCACAACCCTCCTCATACTAGGCCTACTAACCAACACACTAACCATATACCAATGATGGCGCGATGTAACACGAGAAAGCACATACCAAGGCCACCACACACCACCTGTCCAGAAAGGCCTTCGATACGGGATAATCCTATTTATTACCTCAGAAGTTTTTTTCTTCGCAGGATTTTTCTGAGCCTTTTACCACTCCAGCCTAGCTCCCACCCCCCAACTAGGGGGACACTGGCCCCCAACAGGCATCACCCCGCTAAATCCCCTAGAAGTCCCACTCCTAAACACATCCGTATTACTCGCATCAGGGGTATCAATCACCTGAGCTCACCATAGTCTAATAGAAAACAACCGAAACCAAATAATTCAAGCACTGCTTATTACAATTTTACTGGGTCTCTATTTTACCCTCCTACAAGCCTCAGAGTACTTCGAGGTTAAA 11 | 
-------------------------------------------------------------------------------- /data/bed/bad/non_integer_coords.bed: -------------------------------------------------------------------------------- 1 | chr1 3.63 3696 5 4 0 0 374 -1 -1 -1 s1,s367398,s434464,s2,s3 * CCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGA 2 | chr1 11669 11728 3 2 0 0 59 -1 -1 -1 s3,s4,s5 * GTCCCTCTGTCTCTGCCAACCAGTTAACCTGCTGCTTCCTGGAGGGAGACAGTCCCTCA 3 | chr1 18095 18095 3 2 0 0 313 -1 -1 -1 s5,s238062,s6 * GGGCCGGGTGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAGGTCAGAAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCTGGGCATGGTGGTGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGAGCGACAGAGTGAGACTCTGTCTCAAAAAAAAAAAAAAAAAAGAAGGAATAAGAC 4 | chr1 21837 21883 4 2 0 46 103 -1 -1 -1 s6,s374420,s7,s8 TTCCCAGCAGTGCAGGCCCCTCTCTAGAGCTGAGATGCTCCCGGCA CCCTCTCCAGAGCCGAGACGCTCCCGGCGGTGCAGGCCCCTCTCTAGAGCCGAGACGCTCCCAGCAATGCAGGTCCCCCTCTAGAGCCGAGACGCTCCCGGCG 5 | chr1 24110 26059 59 12526 0 22 3875 -1 -1 -1 s8,s345381,s418892,s238063,s416853,s463110,s271302,s377656,s314817,s482893,s461932,s461931,s9,s10,s11,s432618,s12,s322411,s13,s14,s434465,s398218,s398219,s478354,s443821,s443822,s359857,s15,s16,s17,s18,s19,s425113,s20,s21,s470986,s345380,s466297,s22,s23,s345379,s445459,s347465,s479666,s24,s25,s26,s27,s28,s467481,s29,s432617,s345378,s436164,s479667,s432616,s463109,s30,s31 AGCAATGAGGCACGTGTGGAAA 
GCAATGAGGCACGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGGCACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCG
ACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAA 6 | chr1 27995 28086 11 12 0 91 958 -1 -1 -1 s31,s416854,s238064,s238065,s314818,s314819,s238066,s32,s414976,s33,s34 
TGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGTCCT CGACTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCCGATCTGTATATATGTATCATGTAAACATGATTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTTCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCTCCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATGTATGTATCATGTAAACACGAGTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGTCCT 7 | chr1 29311 29311 7 4 0 0 863 -1 -1 -1 s34,s392676,s432615,s478355,s432614,s432613,s35 * 
TAGCCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCAACCCCCTCTCCATCCCCCTCTCCATCTCCCTCTCCTTTCTCCTCTCTAGCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCATCCCCCCTCCATCCCCTTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCTTCTCCATCCCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCCCCTCTCCATCTCCCTTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCCTCTCCATCCCCCCTACATCCCCCTCTCCTTTCTCCTCCCCACCCCCTCTTCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTC 8 | chr1 36616 37092 20 111 0 0 2096 -1 -1 -1 s35,s486581,s36,s238067,s458206,s37,s38,s428880,s458205,s39,s40,s238068,s238069,s238070,s307180,s426376,s41,s238071,s42,s43 * 
GAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCAGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCT
GTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGCTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAG 9 | chr1 37656 37965 20 86 0 1 1759 -1 -1 -1 s43,s486580,s44,s238072,s416855,s238073,s434466,s416856,s238074,s45,s238075,s46,s416857,s441327,s416858,s238076,s416859,s47,s48,s49 T GCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTTGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGAGTGTGACGGGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCATGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGCGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTG
TTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTG 10 | chr1 54659 54659 3 2 0 0 5874 -1 -1 -1 s49,s238077,s50 * TTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCTCAGGCTTCAACATCGAATACGCCGCAGGCCCCTTCGCCCTATTCTTCATAGCCGAATACACAAACATTATTATAATAAACACCCTCACCACTACAATCTTCCTAGGAACAACATATAACGCACTCTCCCCTGAACTCTACACAACATATTTTGTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAACTTCCTACCACTCACCCTAGCATTACTTATATGATATGTCTCCATACCCATTACAATCTCCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGTTTAAATCCCCTTATTTCTAGGACTATGAGAATCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATATCCTTCCCGTACTAATTAATCCCCTGGCCCAACCCGTCATCTACTCTACCATCTTTGCAGGCACACTCATCACAGCGCTAAGCTCGCACTGATTTTTTACCTGAGTAGGCCTAGAAATAAACATGCTAGCTTTTATTCCAGTTCTAACCAAAAAAATAAACCCTCGTTCCACAGAAGCTGCCATCAAGTATTTCCTCACGCAAGCAACCGCATCCATAATCCTTCTAATAGCTATCCTCTTCAACAATATACTCTCCGGACAATGAACCATAACCAATACCACCAATCAATACTCATCATTAATAATCATAATGGCTATAGCAATAAAACTAGGAATAGCCCCCTTTCACTTCTGAGTCCCAGAGGTTACCCAAGGCACCCCTCTGACATCCGGCCTGCTCCTTCTCACATGACAAAAACTAGCCCCCATCTCAATCATATACCAAATTTCTCCCTCATTAAACGTAAGCCTTCTCCTCACTCTTTCAATCTTATCCATCATGGCAGGCAGTTGAGGTGGATTAAACCAAACCCAACTACGCAAAATCTTAGCATACTCCTCAATTACCCACATAGGATGAATAACAGCAGTTCTACCGTACAACCCTAACATAACCATTCTTAATTTAACTATTTATATTATCCTAACTACTACCGCATTCCTACTACTCAACTTAAACTCCAGCACCACAACCCTACTACTATCTCGCACCTGAAACAAGCTAACATGACTAACACCCTTAATTCCATCCACCCTCCTCTCCCTAGGAGGCCTGCCCCCGCTAACCGGCTTTTTGCCCAAATGGGCCATTATCGAAGAATTCACAAAAAACAATAGCCTCATCATCCCCACCATCATAGCCATCATCACCCTCCTTAACCTCTACTTCTACCTGCGCCTAATCTACTCCACCTCAATCACACTACTCCCTATATCTAACAACGTAAAAATAAAATGACAGTTTGAACACACAAAACCCACCCCATTCCTCCCCACACTCATCGCCCTTACCACACTGCTCCTACCTATCTCCCCTTTTATGCTAATAATCTTATAGAAATTTAGGTTAAATACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTGCAATACTTAATTTCTGCAACAGCTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTACTAGACCAATGGGACTTAAACCCACAAACACTTAGTTAACAGCTAAGCACCCTAATCAACTGGCTTCAATCTACTTCTCCCGCCGCCGGGAAAAAAGGCGGGAGAAGCCCCGGCAGGTTTGAAGCTGCTTCT
TCGAATTTGCAATTCAATATGAAAATCACCTCAGAGCTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCACTCAGCCATTTTACCTCACCCCCACTGATGTTCGCCGACCGTTGACTATTCTCTACAAACCACAAAGACATTGGAACACTATACCTATTATTCGGCGCATGAGCTGGAGTCCTAGGCACAGCTCTAAGCCTCCTTATTCGAGCCGAACTGGGCCAGCCAGGCAACCTTCTAGGTAACGACCACATCTACAACGTTATCGTCACAGCCCATGCATTTGTAATAATCTTCTTCATAGTAATACCCATCATAATCGGAGGCTTTGGCAACTGACTAGTTCCCCTAATAATCGGTGCCCCCGATATGGCGTTTCCCCGCATAAACAACATAAGCTTCTGACTCTTACCCCCCTCTCTCCTACTCCTGCTTGCATCTGCTATAGTGGAGGCCGGCGCAGGAACAGGTTGAACAGTCTACCCTCCCTTGGCAGGGAACTACTCCCACCCTGGAGCCTCCGTAGACCTAACCATCTTCTCCTTACACCTAGCAGGTATCTCCTCTATCTTAGGAGCCATCAATTTCATCACAACAATTATTAATATAAAACCCCCTGCCATAACCCAATACCAAACGCCCCTTTTCGTCTGATCCGTCCTAATCACAGCAGTCTTACTTCTCCTATCTCTCCCAGTCCTAGCCGCTGGCATCACTATACTACTAACAGACCGTAACCTCAACACCACCTTCTTCGACCCAGCCGGAGGAGGAGACCCCATTCTATACCAACACCTATTCTGATTTTTCGGTCACCCTGAAGTTTATATTCTCATCCTACCAGGCTTCGGAATAATCTCCCATATTGTAACTTACTACTCCGGGAAAAAAAGAACCATTTGGATACATAGGTATGGTCTGAGCTATGATATCAATTGGCTTCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGAATAGACGTAGACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGTCAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTTATTTTTCTTTTCACCGTAGGTGGCCTGACTGGCATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGTAGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCTTCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATCCATTTCGCTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTATCCCGATGCATACACCACATGAAATATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTAATAATTTTCATAATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACATTCGAAGAACCCGTATACATAAAATCTAGACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCTAAATCCTATATATCTTAATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTTATCATCTTTCATGATCACGCCCTCATAATCATTTTCCTTATCTGCTTCCTAGTCCTGTACGCCCTTTTCCTAACACTCACAACAAAACTAACTAATACTAACATCTCAGACGCTCAGGAAATAGAAACCGTCTGAACTATCC
TGCCCGCCATCATCCTAGTCCTTATCGCCCTCCCATCCCTACGCATCCTTTACATAACAGACGAGGTCAACGATCCCTCCTTTACCATCAAATCAATTGGCCATCAATGGTACTGAACCTACGAATACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATCGAGTAGTACTCCCGGTTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTCTTACACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAACCAAACCACTTTCACTGCTACACGACCAGGGGTATACTACGGCCAATGCTCTGAAATCTGTGGAGCAAACCAGTTTTATGCCCATCGTCCTAGAATTAATTCCCCTAAAAATCTTTGAAATAGGGCCCGTATTTACCCTATAGCACCCCCTCTACCCCCTCTAGAGCCCACTGTAAAGCTAACTTAGCATTAACCTTTTAAGTTAAAGATTAAGAGAACCAACACCTCTTTACAGTGAAATGCCCCAACTAAATACTACCGTATGACCCACCATAATTACCCCCATACTCCTTACACTATTCCTCATCACCCAACTAAAAATATTAAATACAAATTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAACTATAACAAACCCTGAGAACCAAAATGAACGAAAATCTGTTCACTTCATTCATTGCCCCCACAATCCTAGGCCTACCCGCCGCAGTACTGATCATTCTATTTCCCCCTCTATTGATCCCCACCTCCAAATATCTCATCAACAACCGACTAATTACCACCCAACAATGACTAATCCAACTAACCTCAAAACAAATGATAGCCATACACAACACTAAGGGACGAACCTGATCTCTTATACTAGTATCCTTAATCATTTTTATTGCCACAACTAACCTCCTCGGACTCCTGCCTCACTCATTTACACCAACCACCCAACTATCTATAAACCTAGCCATGGCCATCCCCTTATGAGCGGGCGCAGTGATTATAGGCTTTCGCTCTAAGATTAAAAATGCCCTAGCCCACTTCTTACCACAAGGCACACCTACACCCCTTATCCCTATACTAGTTATTATCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATTGGAAGCGCCACACTAGCAATATCAACTATTAACCTTCCCTCTACACTTATCATCTTCACAATTCTAATTCTACTGACTATCCTAGAAATCGCTGTCGCCTTAATCCAAGCCTACGTTTTTACACTTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGGCCCCTAACAGGGGCCCTCTCAGCCCTCCTAATGACCTCCGGCCTAGCCATGTGATTTCACTTCCACTCCACAACCCTCCTCATACTAGGCCTACTAACCAACACACTAACCATATACCAATGATGGCGCGATGTAACACGAGAAAGCACATACCAAGGCCACCACACACCACCTGTCCAGAAAGGCCTTCGATACGGGATAATCCTATTTATTACCTCAGAAGTTTTTTTCTTCGCAGGATTTTTCTGAGCCTTTTACCACTCCAGCCTAGCTCCCACCCCCCAACTAGGGGGACACTGGCCCCCAACAGGCATCACCCCGCTAAATCCCCTAGAAGTCCCACTCCTAAACACATCCGTATTACTCGCATCAGGGGTATCAATCACCTGAGCTCACCATAGTCTAATAGAAAACAACCGAAACCAAATAATTCAAGCACTGCTTATTACAATTTTACTGGGTCTCTATTTTACCCTCCTACAAGCCTCAGAGTACTTCGAGGTTAAA 11 | 
-------------------------------------------------------------------------------- /data/bed/bad/spaces.bed: -------------------------------------------------------------------------------- 1 | chr1 3634 3696 5 4 0 0 374 -1 -1 -1 s1,s367398,s434464,s2,s3 * CCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGA 2 | chr1 11669 11728 3 2 0 0 59 -1 -1 -1 s3,s4,s5 * GTCCCTCTGTCTCTGCCAACCAGTTAACCTGCTGCTTCCTGGAGGGAGACAGTCCCTCA 3 | chr1 18095 18095 3 2 0 0 313 -1 -1 -1 s5,s238062,s6 * GGGCCGGGTGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAGGTCAGAAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCTGGGCATGGTGGTGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGAGCGACAGAGTGAGACTCTGTCTCAAAAAAAAAAAAAAAAAAGAAGGAATAAGAC 4 | chr1 21837 21883 4 2 0 46 103 -1 -1 -1 s6,s374420,s7,s8 TTCCCAGCAGTGCAGGCCCCTCTCTAGAGCTGAGATGCTCCCGGCA CCCTCTCCAGAGCCGAGACGCTCCCGGCGGTGCAGGCCCCTCTCTAGAGCCGAGACGCTCCCAGCAATGCAGGTCCCCCTCTAGAGCCGAGACGCTCCCGGCG 5 | chr1 24110 26059 59 12526 0 22 3875 -1 -1 -1 s8,s345381,s418892,s238063,s416853,s463110,s271302,s377656,s314817,s482893,s461932,s461931,s9,s10,s11,s432618,s12,s322411,s13,s14,s434465,s398218,s398219,s478354,s443821,s443822,s359857,s15,s16,s17,s18,s19,s425113,s20,s21,s470986,s345380,s466297,s22,s23,s345379,s445459,s347465,s479666,s24,s25,s26,s27,s28,s467481,s29,s432617,s345378,s436164,s479667,s432616,s463109,s30,s31 AGCAATGAGGCACGTGTGGAAA 
GCAATGAGGCACGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGGCACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCG
ACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAA 6 | chr1 27995 28086 11 12 0 91 958 -1 -1 -1 s31,s416854,s238064,s238065,s314818,s314819,s238066,s32,s414976,s33,s34 
TGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGTCCT CGACTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCCGATCTGTATATATGTATCATGTAAACATGATTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTTCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCTCCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATGTATGTATCATGTAAACACGAGTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGTCCT 7 | chr1 29311 29311 7 4 0 0 863 -1 -1 -1 s34,s392676,s432615,s478355,s432614,s432613,s35 * 
TAGCCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCAACCCCCTCTCCATCCCCCTCTCCATCTCCCTCTCCTTTCTCCTCTCTAGCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCATCCCCCCTCCATCCCCTTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCTTCTCCATCCCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCCCCTCTCCATCTCCCTTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCCTCTCCATCCCCCCTACATCCCCCTCTCCTTTCTCCTCCCCACCCCCTCTTCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTC 8 | chr1 36616 37092 20 111 0 0 2096 -1 -1 -1 s35,s486581,s36,s238067,s458206,s37,s38,s428880,s458205,s39,s40,s238068,s238069,s238070,s307180,s426376,s41,s238071,s42,s43 * 
GAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCAGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCT
GTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGCTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAG 9 | chr1 37656 37965 20 86 0 1 1759 -1 -1 -1 s43,s486580,s44,s238072,s416855,s238073,s434466,s416856,s238074,s45,s238075,s46,s416857,s441327,s416858,s238076,s416859,s47,s48,s49 T GCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTTGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGAGTGTGACGGGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCATGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGCGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTG
TTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTG 10 | chr1 54659 54659 3 2 0 0 5874 -1 -1 -1 s49,s238077,s50 * TTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCTCAGGCTTCAACATCGAATACGCCGCAGGCCCCTTCGCCCTATTCTTCATAGCCGAATACACAAACATTATTATAATAAACACCCTCACCACTACAATCTTCCTAGGAACAACATATAACGCACTCTCCCCTGAACTCTACACAACATATTTTGTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAACTTCCTACCACTCACCCTAGCATTACTTATATGATATGTCTCCATACCCATTACAATCTCCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGTTTAAATCCCCTTATTTCTAGGACTATGAGAATCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATATCCTTCCCGTACTAATTAATCCCCTGGCCCAACCCGTCATCTACTCTACCATCTTTGCAGGCACACTCATCACAGCGCTAAGCTCGCACTGATTTTTTACCTGAGTAGGCCTAGAAATAAACATGCTAGCTTTTATTCCAGTTCTAACCAAAAAAATAAACCCTCGTTCCACAGAAGCTGCCATCAAGTATTTCCTCACGCAAGCAACCGCATCCATAATCCTTCTAATAGCTATCCTCTTCAACAATATACTCTCCGGACAATGAACCATAACCAATACCACCAATCAATACTCATCATTAATAATCATAATGGCTATAGCAATAAAACTAGGAATAGCCCCCTTTCACTTCTGAGTCCCAGAGGTTACCCAAGGCACCCCTCTGACATCCGGCCTGCTCCTTCTCACATGACAAAAACTAGCCCCCATCTCAATCATATACCAAATTTCTCCCTCATTAAACGTAAGCCTTCTCCTCACTCTTTCAATCTTATCCATCATGGCAGGCAGTTGAGGTGGATTAAACCAAACCCAACTACGCAAAATCTTAGCATACTCCTCAATTACCCACATAGGATGAATAACAGCAGTTCTACCGTACAACCCTAACATAACCATTCTTAATTTAACTATTTATATTATCCTAACTACTACCGCATTCCTACTACTCAACTTAAACTCCAGCACCACAACCCTACTACTATCTCGCACCTGAAACAAGCTAACATGACTAACACCCTTAATTCCATCCACCCTCCTCTCCCTAGGAGGCCTGCCCCCGCTAACCGGCTTTTTGCCCAAATGGGCCATTATCGAAGAATTCACAAAAAACAATAGCCTCATCATCCCCACCATCATAGCCATCATCACCCTCCTTAACCTCTACTTCTACCTGCGCCTAATCTACTCCACCTCAATCACACTACTCCCTATATCTAACAACGTAAAAATAAAATGACAGTTTGAACACACAAAACCCACCCCATTCCTCCCCACACTCATCGCCCTTACCACACTGCTCCTACCTATCTCCCCTTTTATGCTAATAATCTTATAGAAATTTAGGTTAAATACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTGCAATACTTAATTTCTGCAACAGCTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTACTAGACCAATGGGACTTAAACCCACAAACACTTAGTTAACAGCTAAGCACCCTAATCAACTGGCTTCAATCTACTTCTCCCGCCGCCGGGAAAAAAGGCGGGAGAAGCCCCGGCAGGTTTGAAGCTGCTTCT
TCGAATTTGCAATTCAATATGAAAATCACCTCAGAGCTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCACTCAGCCATTTTACCTCACCCCCACTGATGTTCGCCGACCGTTGACTATTCTCTACAAACCACAAAGACATTGGAACACTATACCTATTATTCGGCGCATGAGCTGGAGTCCTAGGCACAGCTCTAAGCCTCCTTATTCGAGCCGAACTGGGCCAGCCAGGCAACCTTCTAGGTAACGACCACATCTACAACGTTATCGTCACAGCCCATGCATTTGTAATAATCTTCTTCATAGTAATACCCATCATAATCGGAGGCTTTGGCAACTGACTAGTTCCCCTAATAATCGGTGCCCCCGATATGGCGTTTCCCCGCATAAACAACATAAGCTTCTGACTCTTACCCCCCTCTCTCCTACTCCTGCTTGCATCTGCTATAGTGGAGGCCGGCGCAGGAACAGGTTGAACAGTCTACCCTCCCTTGGCAGGGAACTACTCCCACCCTGGAGCCTCCGTAGACCTAACCATCTTCTCCTTACACCTAGCAGGTATCTCCTCTATCTTAGGAGCCATCAATTTCATCACAACAATTATTAATATAAAACCCCCTGCCATAACCCAATACCAAACGCCCCTTTTCGTCTGATCCGTCCTAATCACAGCAGTCTTACTTCTCCTATCTCTCCCAGTCCTAGCCGCTGGCATCACTATACTACTAACAGACCGTAACCTCAACACCACCTTCTTCGACCCAGCCGGAGGAGGAGACCCCATTCTATACCAACACCTATTCTGATTTTTCGGTCACCCTGAAGTTTATATTCTCATCCTACCAGGCTTCGGAATAATCTCCCATATTGTAACTTACTACTCCGGGAAAAAAAGAACCATTTGGATACATAGGTATGGTCTGAGCTATGATATCAATTGGCTTCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGAATAGACGTAGACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGTCAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTTATTTTTCTTTTCACCGTAGGTGGCCTGACTGGCATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGTAGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCTTCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATCCATTTCGCTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTATCCCGATGCATACACCACATGAAATATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTAATAATTTTCATAATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACATTCGAAGAACCCGTATACATAAAATCTAGACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCTAAATCCTATATATCTTAATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTTATCATCTTTCATGATCACGCCCTCATAATCATTTTCCTTATCTGCTTCCTAGTCCTGTACGCCCTTTTCCTAACACTCACAACAAAACTAACTAATACTAACATCTCAGACGCTCAGGAAATAGAAACCGTCTGAACTATCC
TGCCCGCCATCATCCTAGTCCTTATCGCCCTCCCATCCCTACGCATCCTTTACATAACAGACGAGGTCAACGATCCCTCCTTTACCATCAAATCAATTGGCCATCAATGGTACTGAACCTACGAATACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATCGAGTAGTACTCCCGGTTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTCTTACACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAACCAAACCACTTTCACTGCTACACGACCAGGGGTATACTACGGCCAATGCTCTGAAATCTGTGGAGCAAACCAGTTTTATGCCCATCGTCCTAGAATTAATTCCCCTAAAAATCTTTGAAATAGGGCCCGTATTTACCCTATAGCACCCCCTCTACCCCCTCTAGAGCCCACTGTAAAGCTAACTTAGCATTAACCTTTTAAGTTAAAGATTAAGAGAACCAACACCTCTTTACAGTGAAATGCCCCAACTAAATACTACCGTATGACCCACCATAATTACCCCCATACTCCTTACACTATTCCTCATCACCCAACTAAAAATATTAAATACAAATTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAACTATAACAAACCCTGAGAACCAAAATGAACGAAAATCTGTTCACTTCATTCATTGCCCCCACAATCCTAGGCCTACCCGCCGCAGTACTGATCATTCTATTTCCCCCTCTATTGATCCCCACCTCCAAATATCTCATCAACAACCGACTAATTACCACCCAACAATGACTAATCCAACTAACCTCAAAACAAATGATAGCCATACACAACACTAAGGGACGAACCTGATCTCTTATACTAGTATCCTTAATCATTTTTATTGCCACAACTAACCTCCTCGGACTCCTGCCTCACTCATTTACACCAACCACCCAACTATCTATAAACCTAGCCATGGCCATCCCCTTATGAGCGGGCGCAGTGATTATAGGCTTTCGCTCTAAGATTAAAAATGCCCTAGCCCACTTCTTACCACAAGGCACACCTACACCCCTTATCCCTATACTAGTTATTATCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATTGGAAGCGCCACACTAGCAATATCAACTATTAACCTTCCCTCTACACTTATCATCTTCACAATTCTAATTCTACTGACTATCCTAGAAATCGCTGTCGCCTTAATCCAAGCCTACGTTTTTACACTTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGGCCCCTAACAGGGGCCCTCTCAGCCCTCCTAATGACCTCCGGCCTAGCCATGTGATTTCACTTCCACTCCACAACCCTCCTCATACTAGGCCTACTAACCAACACACTAACCATATACCAATGATGGCGCGATGTAACACGAGAAAGCACATACCAAGGCCACCACACACCACCTGTCCAGAAAGGCCTTCGATACGGGATAATCCTATTTATTACCTCAGAAGTTTTTTTCTTCGCAGGATTTTTCTGAGCCTTTTACCACTCCAGCCTAGCTCCCACCCCCCAACTAGGGGGACACTGGCCCCCAACAGGCATCACCCCGCTAAATCCCCTAGAAGTCCCACTCCTAAACACATCCGTATTACTCGCATCAGGGGTATCAATCACCTGAGCTCACCATAGTCTAATAGAAAACAACCGAAACCAAATAATTCAAGCACTGCTTATTACAATTTTACTGGGTCTCTATTTTACCCTCCTACAAGCCTCAGAGTACTTCGAGGTTAAA 11 | 
-------------------------------------------------------------------------------- /data/bed/bad/start_greater_than_end_coords.bed: -------------------------------------------------------------------------------- 1 | chr1 9999 3696 5 4 0 0 374 -1 -1 -1 s1,s367398,s434464,s2,s3 * CCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGA 2 | chr1 11669 11728 3 2 0 0 59 -1 -1 -1 s3,s4,s5 * GTCCCTCTGTCTCTGCCAACCAGTTAACCTGCTGCTTCCTGGAGGGAGACAGTCCCTCA 3 | chr1 18095 18095 3 2 0 0 313 -1 -1 -1 s5,s238062,s6 * GGGCCGGGTGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAGGTCAGAAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCTGGGCATGGTGGTGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGAGCGACAGAGTGAGACTCTGTCTCAAAAAAAAAAAAAAAAAAGAAGGAATAAGAC 4 | chr1 21837 21883 4 2 0 46 103 -1 -1 -1 s6,s374420,s7,s8 TTCCCAGCAGTGCAGGCCCCTCTCTAGAGCTGAGATGCTCCCGGCA CCCTCTCCAGAGCCGAGACGCTCCCGGCGGTGCAGGCCCCTCTCTAGAGCCGAGACGCTCCCAGCAATGCAGGTCCCCCTCTAGAGCCGAGACGCTCCCGGCG 5 | chr1 24110 26059 59 12526 0 22 3875 -1 -1 -1 s8,s345381,s418892,s238063,s416853,s463110,s271302,s377656,s314817,s482893,s461932,s461931,s9,s10,s11,s432618,s12,s322411,s13,s14,s434465,s398218,s398219,s478354,s443821,s443822,s359857,s15,s16,s17,s18,s19,s425113,s20,s21,s470986,s345380,s466297,s22,s23,s345379,s445459,s347465,s479666,s24,s25,s26,s27,s28,s467481,s29,s432617,s345378,s436164,s479667,s432616,s463109,s30,s31 AGCAATGAGGCACGTGTGGAAA 
GCAATGAGGCACGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGGCACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCG
ACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAA 6 | chr1 27995 28086 11 12 0 91 958 -1 -1 -1 s31,s416854,s238064,s238065,s314818,s314819,s238066,s32,s414976,s33,s34 
TGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGTCCT CGACTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCCGATCTGTATATATGTATCATGTAAACATGATTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTTCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCTCCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATGTATGTATCATGTAAACACGAGTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGTCCT 7 | chr1 29311 29311 7 4 0 0 863 -1 -1 -1 s34,s392676,s432615,s478355,s432614,s432613,s35 * 
TAGCCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCAACCCCCTCTCCATCCCCCTCTCCATCTCCCTCTCCTTTCTCCTCTCTAGCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCATCCCCCCTCCATCCCCTTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCTTCTCCATCCCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCCCCTCTCCATCTCCCTTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCCTCTCCATCCCCCCTACATCCCCCTCTCCTTTCTCCTCCCCACCCCCTCTTCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTC 8 | chr1 36616 37092 20 111 0 0 2096 -1 -1 -1 s35,s486581,s36,s238067,s458206,s37,s38,s428880,s458205,s39,s40,s238068,s238069,s238070,s307180,s426376,s41,s238071,s42,s43 * 
GAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCAGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCT
GTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGCTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAG 9 | chr1 37656 37965 20 86 0 1 1759 -1 -1 -1 s43,s486580,s44,s238072,s416855,s238073,s434466,s416856,s238074,s45,s238075,s46,s416857,s441327,s416858,s238076,s416859,s47,s48,s49 T GCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTTGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGAGTGTGACGGGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCATGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGCGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTG
TTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTG 10 | chr1 54659 54659 3 2 0 0 5874 -1 -1 -1 s49,s238077,s50 * TTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCTCAGGCTTCAACATCGAATACGCCGCAGGCCCCTTCGCCCTATTCTTCATAGCCGAATACACAAACATTATTATAATAAACACCCTCACCACTACAATCTTCCTAGGAACAACATATAACGCACTCTCCCCTGAACTCTACACAACATATTTTGTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAACTTCCTACCACTCACCCTAGCATTACTTATATGATATGTCTCCATACCCATTACAATCTCCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGTTTAAATCCCCTTATTTCTAGGACTATGAGAATCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATATCCTTCCCGTACTAATTAATCCCCTGGCCCAACCCGTCATCTACTCTACCATCTTTGCAGGCACACTCATCACAGCGCTAAGCTCGCACTGATTTTTTACCTGAGTAGGCCTAGAAATAAACATGCTAGCTTTTATTCCAGTTCTAACCAAAAAAATAAACCCTCGTTCCACAGAAGCTGCCATCAAGTATTTCCTCACGCAAGCAACCGCATCCATAATCCTTCTAATAGCTATCCTCTTCAACAATATACTCTCCGGACAATGAACCATAACCAATACCACCAATCAATACTCATCATTAATAATCATAATGGCTATAGCAATAAAACTAGGAATAGCCCCCTTTCACTTCTGAGTCCCAGAGGTTACCCAAGGCACCCCTCTGACATCCGGCCTGCTCCTTCTCACATGACAAAAACTAGCCCCCATCTCAATCATATACCAAATTTCTCCCTCATTAAACGTAAGCCTTCTCCTCACTCTTTCAATCTTATCCATCATGGCAGGCAGTTGAGGTGGATTAAACCAAACCCAACTACGCAAAATCTTAGCATACTCCTCAATTACCCACATAGGATGAATAACAGCAGTTCTACCGTACAACCCTAACATAACCATTCTTAATTTAACTATTTATATTATCCTAACTACTACCGCATTCCTACTACTCAACTTAAACTCCAGCACCACAACCCTACTACTATCTCGCACCTGAAACAAGCTAACATGACTAACACCCTTAATTCCATCCACCCTCCTCTCCCTAGGAGGCCTGCCCCCGCTAACCGGCTTTTTGCCCAAATGGGCCATTATCGAAGAATTCACAAAAAACAATAGCCTCATCATCCCCACCATCATAGCCATCATCACCCTCCTTAACCTCTACTTCTACCTGCGCCTAATCTACTCCACCTCAATCACACTACTCCCTATATCTAACAACGTAAAAATAAAATGACAGTTTGAACACACAAAACCCACCCCATTCCTCCCCACACTCATCGCCCTTACCACACTGCTCCTACCTATCTCCCCTTTTATGCTAATAATCTTATAGAAATTTAGGTTAAATACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTGCAATACTTAATTTCTGCAACAGCTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTACTAGACCAATGGGACTTAAACCCACAAACACTTAGTTAACAGCTAAGCACCCTAATCAACTGGCTTCAATCTACTTCTCCCGCCGCCGGGAAAAAAGGCGGGAGAAGCCCCGGCAGGTTTGAAGCTGCTTCT
TCGAATTTGCAATTCAATATGAAAATCACCTCAGAGCTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCACTCAGCCATTTTACCTCACCCCCACTGATGTTCGCCGACCGTTGACTATTCTCTACAAACCACAAAGACATTGGAACACTATACCTATTATTCGGCGCATGAGCTGGAGTCCTAGGCACAGCTCTAAGCCTCCTTATTCGAGCCGAACTGGGCCAGCCAGGCAACCTTCTAGGTAACGACCACATCTACAACGTTATCGTCACAGCCCATGCATTTGTAATAATCTTCTTCATAGTAATACCCATCATAATCGGAGGCTTTGGCAACTGACTAGTTCCCCTAATAATCGGTGCCCCCGATATGGCGTTTCCCCGCATAAACAACATAAGCTTCTGACTCTTACCCCCCTCTCTCCTACTCCTGCTTGCATCTGCTATAGTGGAGGCCGGCGCAGGAACAGGTTGAACAGTCTACCCTCCCTTGGCAGGGAACTACTCCCACCCTGGAGCCTCCGTAGACCTAACCATCTTCTCCTTACACCTAGCAGGTATCTCCTCTATCTTAGGAGCCATCAATTTCATCACAACAATTATTAATATAAAACCCCCTGCCATAACCCAATACCAAACGCCCCTTTTCGTCTGATCCGTCCTAATCACAGCAGTCTTACTTCTCCTATCTCTCCCAGTCCTAGCCGCTGGCATCACTATACTACTAACAGACCGTAACCTCAACACCACCTTCTTCGACCCAGCCGGAGGAGGAGACCCCATTCTATACCAACACCTATTCTGATTTTTCGGTCACCCTGAAGTTTATATTCTCATCCTACCAGGCTTCGGAATAATCTCCCATATTGTAACTTACTACTCCGGGAAAAAAAGAACCATTTGGATACATAGGTATGGTCTGAGCTATGATATCAATTGGCTTCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGAATAGACGTAGACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGTCAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTTATTTTTCTTTTCACCGTAGGTGGCCTGACTGGCATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGTAGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCTTCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATCCATTTCGCTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTATCCCGATGCATACACCACATGAAATATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTAATAATTTTCATAATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACATTCGAAGAACCCGTATACATAAAATCTAGACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCTAAATCCTATATATCTTAATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTTATCATCTTTCATGATCACGCCCTCATAATCATTTTCCTTATCTGCTTCCTAGTCCTGTACGCCCTTTTCCTAACACTCACAACAAAACTAACTAATACTAACATCTCAGACGCTCAGGAAATAGAAACCGTCTGAACTATCC
TGCCCGCCATCATCCTAGTCCTTATCGCCCTCCCATCCCTACGCATCCTTTACATAACAGACGAGGTCAACGATCCCTCCTTTACCATCAAATCAATTGGCCATCAATGGTACTGAACCTACGAATACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATCGAGTAGTACTCCCGGTTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTCTTACACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAACCAAACCACTTTCACTGCTACACGACCAGGGGTATACTACGGCCAATGCTCTGAAATCTGTGGAGCAAACCAGTTTTATGCCCATCGTCCTAGAATTAATTCCCCTAAAAATCTTTGAAATAGGGCCCGTATTTACCCTATAGCACCCCCTCTACCCCCTCTAGAGCCCACTGTAAAGCTAACTTAGCATTAACCTTTTAAGTTAAAGATTAAGAGAACCAACACCTCTTTACAGTGAAATGCCCCAACTAAATACTACCGTATGACCCACCATAATTACCCCCATACTCCTTACACTATTCCTCATCACCCAACTAAAAATATTAAATACAAATTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAACTATAACAAACCCTGAGAACCAAAATGAACGAAAATCTGTTCACTTCATTCATTGCCCCCACAATCCTAGGCCTACCCGCCGCAGTACTGATCATTCTATTTCCCCCTCTATTGATCCCCACCTCCAAATATCTCATCAACAACCGACTAATTACCACCCAACAATGACTAATCCAACTAACCTCAAAACAAATGATAGCCATACACAACACTAAGGGACGAACCTGATCTCTTATACTAGTATCCTTAATCATTTTTATTGCCACAACTAACCTCCTCGGACTCCTGCCTCACTCATTTACACCAACCACCCAACTATCTATAAACCTAGCCATGGCCATCCCCTTATGAGCGGGCGCAGTGATTATAGGCTTTCGCTCTAAGATTAAAAATGCCCTAGCCCACTTCTTACCACAAGGCACACCTACACCCCTTATCCCTATACTAGTTATTATCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATTGGAAGCGCCACACTAGCAATATCAACTATTAACCTTCCCTCTACACTTATCATCTTCACAATTCTAATTCTACTGACTATCCTAGAAATCGCTGTCGCCTTAATCCAAGCCTACGTTTTTACACTTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGGCCCCTAACAGGGGCCCTCTCAGCCCTCCTAATGACCTCCGGCCTAGCCATGTGATTTCACTTCCACTCCACAACCCTCCTCATACTAGGCCTACTAACCAACACACTAACCATATACCAATGATGGCGCGATGTAACACGAGAAAGCACATACCAAGGCCACCACACACCACCTGTCCAGAAAGGCCTTCGATACGGGATAATCCTATTTATTACCTCAGAAGTTTTTTTCTTCGCAGGATTTTTCTGAGCCTTTTACCACTCCAGCCTAGCTCCCACCCCCCAACTAGGGGGACACTGGCCCCCAACAGGCATCACCCCGCTAAATCCCCTAGAAGTCCCACTCCTAAACACATCCGTATTACTCGCATCAGGGGTATCAATCACCTGAGCTCACCATAGTCTAATAGAAAACAACCGAAACCAAATAATTCAAGCACTGCTTATTACAATTTTACTGGGTCTCTATTTTACCCTCCTACAAGCCTCAGAGTACTTCGAGGTTAAA 11 | 
-------------------------------------------------------------------------------- /data/bed/good/basic.bed: -------------------------------------------------------------------------------- 1 | chr1 3634 3696 5 4 0 0 374 -1 -1 -1 s1,s367398,s434464,s2,s3 * CCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGA 2 | chr1 11669 11728 3 2 0 0 59 -1 -1 -1 s3,s4,s5 * GTCCCTCTGTCTCTGCCAACCAGTTAACCTGCTGCTTCCTGGAGGGAGACAGTCCCTCA 3 | chr1 18095 18095 3 2 0 0 313 -1 -1 -1 s5,s238062,s6 * GGGCCGGGTGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAGGTCAGAAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCTGGGCATGGTGGTGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGAGCGACAGAGTGAGACTCTGTCTCAAAAAAAAAAAAAAAAAAGAAGGAATAAGAC 4 | chr1 21837 21883 4 2 0 46 103 -1 -1 -1 s6,s374420,s7,s8 TTCCCAGCAGTGCAGGCCCCTCTCTAGAGCTGAGATGCTCCCGGCA CCCTCTCCAGAGCCGAGACGCTCCCGGCGGTGCAGGCCCCTCTCTAGAGCCGAGACGCTCCCAGCAATGCAGGTCCCCCTCTAGAGCCGAGACGCTCCCGGCG 5 | chr1 24110 26059 59 12526 0 22 3875 -1 -1 -1 s8,s345381,s418892,s238063,s416853,s463110,s271302,s377656,s314817,s482893,s461932,s461931,s9,s10,s11,s432618,s12,s322411,s13,s14,s434465,s398218,s398219,s478354,s443821,s443822,s359857,s15,s16,s17,s18,s19,s425113,s20,s21,s470986,s345380,s466297,s22,s23,s345379,s445459,s347465,s479666,s24,s25,s26,s27,s28,s467481,s29,s432617,s345378,s436164,s479667,s432616,s463109,s30,s31 AGCAATGAGGCACGTGTGGAAA 
GCAATGAGGCACGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGGCACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCG
ACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAA 6 | chr1 27995 28086 11 12 0 91 958 -1 -1 -1 s31,s416854,s238064,s238065,s314818,s314819,s238066,s32,s414976,s33,s34 
TGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGTCCT CGACTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCCGATCTGTATATATGTATCATGTAAACATGATTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTTCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCTCCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATGTATGTATCATGTAAACACGAGTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGTCCT 7 | chr1 29311 29311 7 4 0 0 863 -1 -1 -1 s34,s392676,s432615,s478355,s432614,s432613,s35 * 
TAGCCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCAACCCCCTCTCCATCCCCCTCTCCATCTCCCTCTCCTTTCTCCTCTCTAGCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCATCCCCCCTCCATCCCCTTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCTTCTCCATCCCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCCCCTCTCCATCTCCCTTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCCTCTCCATCCCCCCTACATCCCCCTCTCCTTTCTCCTCCCCACCCCCTCTTCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTC 8 | chr1 36616 37092 20 111 0 0 2096 -1 -1 -1 s35,s486581,s36,s238067,s458206,s37,s38,s428880,s458205,s39,s40,s238068,s238069,s238070,s307180,s426376,s41,s238071,s42,s43 * 
GAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCAGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCT
GTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGCTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAG 9 | chr1 37656 37965 20 86 0 1 1759 -1 -1 -1 s43,s486580,s44,s238072,s416855,s238073,s434466,s416856,s238074,s45,s238075,s46,s416857,s441327,s416858,s238076,s416859,s47,s48,s49 T GCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTTGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGAGTGTGACGGGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCATGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGCGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTG
TTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTG 10 | chr1 54659 54659 3 2 0 0 5874 -1 -1 -1 s49,s238077,s50 * TTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCTCAGGCTTCAACATCGAATACGCCGCAGGCCCCTTCGCCCTATTCTTCATAGCCGAATACACAAACATTATTATAATAAACACCCTCACCACTACAATCTTCCTAGGAACAACATATAACGCACTCTCCCCTGAACTCTACACAACATATTTTGTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAACTTCCTACCACTCACCCTAGCATTACTTATATGATATGTCTCCATACCCATTACAATCTCCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGTTTAAATCCCCTTATTTCTAGGACTATGAGAATCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATATCCTTCCCGTACTAATTAATCCCCTGGCCCAACCCGTCATCTACTCTACCATCTTTGCAGGCACACTCATCACAGCGCTAAGCTCGCACTGATTTTTTACCTGAGTAGGCCTAGAAATAAACATGCTAGCTTTTATTCCAGTTCTAACCAAAAAAATAAACCCTCGTTCCACAGAAGCTGCCATCAAGTATTTCCTCACGCAAGCAACCGCATCCATAATCCTTCTAATAGCTATCCTCTTCAACAATATACTCTCCGGACAATGAACCATAACCAATACCACCAATCAATACTCATCATTAATAATCATAATGGCTATAGCAATAAAACTAGGAATAGCCCCCTTTCACTTCTGAGTCCCAGAGGTTACCCAAGGCACCCCTCTGACATCCGGCCTGCTCCTTCTCACATGACAAAAACTAGCCCCCATCTCAATCATATACCAAATTTCTCCCTCATTAAACGTAAGCCTTCTCCTCACTCTTTCAATCTTATCCATCATGGCAGGCAGTTGAGGTGGATTAAACCAAACCCAACTACGCAAAATCTTAGCATACTCCTCAATTACCCACATAGGATGAATAACAGCAGTTCTACCGTACAACCCTAACATAACCATTCTTAATTTAACTATTTATATTATCCTAACTACTACCGCATTCCTACTACTCAACTTAAACTCCAGCACCACAACCCTACTACTATCTCGCACCTGAAACAAGCTAACATGACTAACACCCTTAATTCCATCCACCCTCCTCTCCCTAGGAGGCCTGCCCCCGCTAACCGGCTTTTTGCCCAAATGGGCCATTATCGAAGAATTCACAAAAAACAATAGCCTCATCATCCCCACCATCATAGCCATCATCACCCTCCTTAACCTCTACTTCTACCTGCGCCTAATCTACTCCACCTCAATCACACTACTCCCTATATCTAACAACGTAAAAATAAAATGACAGTTTGAACACACAAAACCCACCCCATTCCTCCCCACACTCATCGCCCTTACCACACTGCTCCTACCTATCTCCCCTTTTATGCTAATAATCTTATAGAAATTTAGGTTAAATACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTGCAATACTTAATTTCTGCAACAGCTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTACTAGACCAATGGGACTTAAACCCACAAACACTTAGTTAACAGCTAAGCACCCTAATCAACTGGCTTCAATCTACTTCTCCCGCCGCCGGGAAAAAAGGCGGGAGAAGCCCCGGCAGGTTTGAAGCTGCTTCT
TCGAATTTGCAATTCAATATGAAAATCACCTCAGAGCTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCACTCAGCCATTTTACCTCACCCCCACTGATGTTCGCCGACCGTTGACTATTCTCTACAAACCACAAAGACATTGGAACACTATACCTATTATTCGGCGCATGAGCTGGAGTCCTAGGCACAGCTCTAAGCCTCCTTATTCGAGCCGAACTGGGCCAGCCAGGCAACCTTCTAGGTAACGACCACATCTACAACGTTATCGTCACAGCCCATGCATTTGTAATAATCTTCTTCATAGTAATACCCATCATAATCGGAGGCTTTGGCAACTGACTAGTTCCCCTAATAATCGGTGCCCCCGATATGGCGTTTCCCCGCATAAACAACATAAGCTTCTGACTCTTACCCCCCTCTCTCCTACTCCTGCTTGCATCTGCTATAGTGGAGGCCGGCGCAGGAACAGGTTGAACAGTCTACCCTCCCTTGGCAGGGAACTACTCCCACCCTGGAGCCTCCGTAGACCTAACCATCTTCTCCTTACACCTAGCAGGTATCTCCTCTATCTTAGGAGCCATCAATTTCATCACAACAATTATTAATATAAAACCCCCTGCCATAACCCAATACCAAACGCCCCTTTTCGTCTGATCCGTCCTAATCACAGCAGTCTTACTTCTCCTATCTCTCCCAGTCCTAGCCGCTGGCATCACTATACTACTAACAGACCGTAACCTCAACACCACCTTCTTCGACCCAGCCGGAGGAGGAGACCCCATTCTATACCAACACCTATTCTGATTTTTCGGTCACCCTGAAGTTTATATTCTCATCCTACCAGGCTTCGGAATAATCTCCCATATTGTAACTTACTACTCCGGGAAAAAAAGAACCATTTGGATACATAGGTATGGTCTGAGCTATGATATCAATTGGCTTCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGAATAGACGTAGACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGTCAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTTATTTTTCTTTTCACCGTAGGTGGCCTGACTGGCATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGTAGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCTTCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATCCATTTCGCTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTATCCCGATGCATACACCACATGAAATATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTAATAATTTTCATAATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACATTCGAAGAACCCGTATACATAAAATCTAGACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCTAAATCCTATATATCTTAATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTTATCATCTTTCATGATCACGCCCTCATAATCATTTTCCTTATCTGCTTCCTAGTCCTGTACGCCCTTTTCCTAACACTCACAACAAAACTAACTAATACTAACATCTCAGACGCTCAGGAAATAGAAACCGTCTGAACTATCC
TGCCCGCCATCATCCTAGTCCTTATCGCCCTCCCATCCCTACGCATCCTTTACATAACAGACGAGGTCAACGATCCCTCCTTTACCATCAAATCAATTGGCCATCAATGGTACTGAACCTACGAATACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATCGAGTAGTACTCCCGGTTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTCTTACACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAACCAAACCACTTTCACTGCTACACGACCAGGGGTATACTACGGCCAATGCTCTGAAATCTGTGGAGCAAACCAGTTTTATGCCCATCGTCCTAGAATTAATTCCCCTAAAAATCTTTGAAATAGGGCCCGTATTTACCCTATAGCACCCCCTCTACCCCCTCTAGAGCCCACTGTAAAGCTAACTTAGCATTAACCTTTTAAGTTAAAGATTAAGAGAACCAACACCTCTTTACAGTGAAATGCCCCAACTAAATACTACCGTATGACCCACCATAATTACCCCCATACTCCTTACACTATTCCTCATCACCCAACTAAAAATATTAAATACAAATTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAACTATAACAAACCCTGAGAACCAAAATGAACGAAAATCTGTTCACTTCATTCATTGCCCCCACAATCCTAGGCCTACCCGCCGCAGTACTGATCATTCTATTTCCCCCTCTATTGATCCCCACCTCCAAATATCTCATCAACAACCGACTAATTACCACCCAACAATGACTAATCCAACTAACCTCAAAACAAATGATAGCCATACACAACACTAAGGGACGAACCTGATCTCTTATACTAGTATCCTTAATCATTTTTATTGCCACAACTAACCTCCTCGGACTCCTGCCTCACTCATTTACACCAACCACCCAACTATCTATAAACCTAGCCATGGCCATCCCCTTATGAGCGGGCGCAGTGATTATAGGCTTTCGCTCTAAGATTAAAAATGCCCTAGCCCACTTCTTACCACAAGGCACACCTACACCCCTTATCCCTATACTAGTTATTATCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATTGGAAGCGCCACACTAGCAATATCAACTATTAACCTTCCCTCTACACTTATCATCTTCACAATTCTAATTCTACTGACTATCCTAGAAATCGCTGTCGCCTTAATCCAAGCCTACGTTTTTACACTTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGGCCCCTAACAGGGGCCCTCTCAGCCCTCCTAATGACCTCCGGCCTAGCCATGTGATTTCACTTCCACTCCACAACCCTCCTCATACTAGGCCTACTAACCAACACACTAACCATATACCAATGATGGCGCGATGTAACACGAGAAAGCACATACCAAGGCCACCACACACCACCTGTCCAGAAAGGCCTTCGATACGGGATAATCCTATTTATTACCTCAGAAGTTTTTTTCTTCGCAGGATTTTTCTGAGCCTTTTACCACTCCAGCCTAGCTCCCACCCCCCAACTAGGGGGACACTGGCCCCCAACAGGCATCACCCCGCTAAATCCCCTAGAAGTCCCACTCCTAAACACATCCGTATTACTCGCATCAGGGGTATCAATCACCTGAGCTCACCATAGTCTAATAGAAAACAACCGAAACCAAATAATTCAAGCACTGCTTATTACAATTTTACTGGGTCTCTATTTTACCCTCCTACAAGCCTCAGAGTACTTCGAGGTTAAA 11 | 
-------------------------------------------------------------------------------- /data/bed/good/compressed.bed.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bed/good/compressed.bed.gz -------------------------------------------------------------------------------- /data/bed/good/indexed_csi.bed.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bed/good/indexed_csi.bed.gz -------------------------------------------------------------------------------- /data/bed/good/indexed_csi.bed.gz.csi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bed/good/indexed_csi.bed.gz.csi -------------------------------------------------------------------------------- /data/bed/good/indexed_tbi.bed.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bed/good/indexed_tbi.bed.gz -------------------------------------------------------------------------------- /data/bed/good/indexed_tbi.bed.gz.tbi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/bed/good/indexed_tbi.bed.gz.tbi -------------------------------------------------------------------------------- /data/bed/good/unsorted.bed: -------------------------------------------------------------------------------- 1 | chr1 18095 18095 3 2 0 0 313 -1 -1 -1 s5,s238062,s6 * 
GGGCCGGGTGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAGGTCAGAAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCTGGGCATGGTGGTGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGAGCGACAGAGTGAGACTCTGTCTCAAAAAAAAAAAAAAAAAAGAAGGAATAAGAC 2 | chr1 29311 29311 7 4 0 0 863 -1 -1 -1 s34,s392676,s432615,s478355,s432614,s432613,s35 * TAGCCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCAACCCCCTCTCCATCCCCCTCTCCATCTCCCTCTCCTTTCTCCTCTCTAGCCCCTCTCCTTTCTCCTCTCTATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCTTTCTCCTCCCCATCCCCCTCTCCATCCCCCCTCCATCCCCTTCTCCTTTCTCCTCTCCATCCCCCTCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCTTCTCCATCCCCCTCTCCATCCCCCTCTCCTTTCTCCTCTCCATCCCCCTCTCCATCCCCCTCTCCATCTCCCTTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTCCTCTCTAGCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTGCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCTTTCTCCTCTCCTTTCTCCTCCCCATCCCCTCTCCATCCCCCCTACATCCCCCTCTCCTTTCTCCTCCCCACCCCCTCTTCTTTCTCCTCTCCATCCCCCTCTCCTTTCTCCCTCTCCACCCCCCTCTCCATCCCCCTCTCCTTTCTCCCTTTC 3 | chr1 24110 26059 59 12526 0 22 3875 -1 -1 -1 s8,s345381,s418892,s238063,s416853,s463110,s271302,s377656,s314817,s482893,s461932,s461931,s9,s10,s11,s432618,s12,s322411,s13,s14,s434465,s398218,s398219,s478354,s443821,s443822,s359857,s15,s16,s17,s18,s19,s425113,s20,s21,s470986,s345380,s466297,s22,s23,s345379,s445459,s347465,s479666,s24,s25,s26,s27,s28,s467481,s29,s432617,s345378,s436164,s479667,s432616,s463109,s30,s31 AGCAATGAGGCACGTGTGGAAA 
GCAATGAGGCACGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACATGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGTGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGGCACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCG
ACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGTGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCGGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTAGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGTCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCATCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACGCGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAACTGCGACACTCACACGGGTGCCGTCTCAGCAGCTCACGGTGTGGAAA 4 | chr1 54659 54659 3 2 0 0 5874 -1 -1 -1 s49,s238077,s50 * 
TTAGATTTGACCTTCAGCAAGGTCAAAGGGAGTCCGAACTAGTCTCAGGCTTCAACATCGAATACGCCGCAGGCCCCTTCGCCCTATTCTTCATAGCCGAATACACAAACATTATTATAATAAACACCCTCACCACTACAATCTTCCTAGGAACAACATATAACGCACTCTCCCCTGAACTCTACACAACATATTTTGTCACCAAGACCCTACTTCTGACCTCCCTGTTCTTATGAATTCGAACAGCATACCCCCGATTCCGCTACGACCAACTCATACACCTCCTATGAAAAAACTTCCTACCACTCACCCTAGCATTACTTATATGATATGTCTCCATACCCATTACAATCTCCAGCATTCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGTTTAAATCCCCTTATTTCTAGGACTATGAGAATCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATATCCTTCCCGTACTAATTAATCCCCTGGCCCAACCCGTCATCTACTCTACCATCTTTGCAGGCACACTCATCACAGCGCTAAGCTCGCACTGATTTTTTACCTGAGTAGGCCTAGAAATAAACATGCTAGCTTTTATTCCAGTTCTAACCAAAAAAATAAACCCTCGTTCCACAGAAGCTGCCATCAAGTATTTCCTCACGCAAGCAACCGCATCCATAATCCTTCTAATAGCTATCCTCTTCAACAATATACTCTCCGGACAATGAACCATAACCAATACCACCAATCAATACTCATCATTAATAATCATAATGGCTATAGCAATAAAACTAGGAATAGCCCCCTTTCACTTCTGAGTCCCAGAGGTTACCCAAGGCACCCCTCTGACATCCGGCCTGCTCCTTCTCACATGACAAAAACTAGCCCCCATCTCAATCATATACCAAATTTCTCCCTCATTAAACGTAAGCCTTCTCCTCACTCTTTCAATCTTATCCATCATGGCAGGCAGTTGAGGTGGATTAAACCAAACCCAACTACGCAAAATCTTAGCATACTCCTCAATTACCCACATAGGATGAATAACAGCAGTTCTACCGTACAACCCTAACATAACCATTCTTAATTTAACTATTTATATTATCCTAACTACTACCGCATTCCTACTACTCAACTTAAACTCCAGCACCACAACCCTACTACTATCTCGCACCTGAAACAAGCTAACATGACTAACACCCTTAATTCCATCCACCCTCCTCTCCCTAGGAGGCCTGCCCCCGCTAACCGGCTTTTTGCCCAAATGGGCCATTATCGAAGAATTCACAAAAAACAATAGCCTCATCATCCCCACCATCATAGCCATCATCACCCTCCTTAACCTCTACTTCTACCTGCGCCTAATCTACTCCACCTCAATCACACTACTCCCTATATCTAACAACGTAAAAATAAAATGACAGTTTGAACACACAAAACCCACCCCATTCCTCCCCACACTCATCGCCCTTACCACACTGCTCCTACCTATCTCCCCTTTTATGCTAATAATCTTATAGAAATTTAGGTTAAATACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTGCAATACTTAATTTCTGCAACAGCTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTACTAGACCAATGGGACTTAAACCCACAAACACTTAGTTAACAGCTAAGCACCCTAATCAACTGGCTTCAATCTACTTCTCCCGCCGCCGGGAAAAAAGGCGGGAGAAGCCCCGGCAGGTTTGAAGCTGCTTCTTCGAATTTGCAATTCAATATGAAAATCACCTCAGAGCTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCACTCAGCCATT
TTACCTCACCCCCACTGATGTTCGCCGACCGTTGACTATTCTCTACAAACCACAAAGACATTGGAACACTATACCTATTATTCGGCGCATGAGCTGGAGTCCTAGGCACAGCTCTAAGCCTCCTTATTCGAGCCGAACTGGGCCAGCCAGGCAACCTTCTAGGTAACGACCACATCTACAACGTTATCGTCACAGCCCATGCATTTGTAATAATCTTCTTCATAGTAATACCCATCATAATCGGAGGCTTTGGCAACTGACTAGTTCCCCTAATAATCGGTGCCCCCGATATGGCGTTTCCCCGCATAAACAACATAAGCTTCTGACTCTTACCCCCCTCTCTCCTACTCCTGCTTGCATCTGCTATAGTGGAGGCCGGCGCAGGAACAGGTTGAACAGTCTACCCTCCCTTGGCAGGGAACTACTCCCACCCTGGAGCCTCCGTAGACCTAACCATCTTCTCCTTACACCTAGCAGGTATCTCCTCTATCTTAGGAGCCATCAATTTCATCACAACAATTATTAATATAAAACCCCCTGCCATAACCCAATACCAAACGCCCCTTTTCGTCTGATCCGTCCTAATCACAGCAGTCTTACTTCTCCTATCTCTCCCAGTCCTAGCCGCTGGCATCACTATACTACTAACAGACCGTAACCTCAACACCACCTTCTTCGACCCAGCCGGAGGAGGAGACCCCATTCTATACCAACACCTATTCTGATTTTTCGGTCACCCTGAAGTTTATATTCTCATCCTACCAGGCTTCGGAATAATCTCCCATATTGTAACTTACTACTCCGGGAAAAAAAGAACCATTTGGATACATAGGTATGGTCTGAGCTATGATATCAATTGGCTTCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGAATAGACGTAGACACACGAGCATATTTCACCTCCGCTACCATAATCATCGCTATCCCCACCGGCGTCAAAGTATTTAGCTGACTCGCCACACTCCACGGAAGCAATATGAAATGATCTGCTGCAGTGCTCTGAGCCCTAGGATTTATTTTTCTTTTCACCGTAGGTGGCCTGACTGGCATTGTATTAGCAAACTCATCACTAGACATCGTACTACACGACACGTACTACGTTGTAGCCCACTTCCACTATGTCCTATCAATAGGAGCTGTATTTGCCATCATAGGAGGCTTCATTCACTGATTTCCCCTATTCTCAGGCTACACCCTAGACCAAACCTACGCCAAAATCCATTTCGCTATCATATTCATCGGCGTAAATCTAACTTTCTTCCCACAACACTTTCTCGGCCTATCCGGAATGCCCCGACGTTACTCGGACTATCCCGATGCATACACCACATGAAATATCCTATCATCTGTAGGCTCATTCATTTCTCTAACAGCAGTAATATTAATAATTTTCATAATTTGAGAAGCCTTCGCTTCGAAGCGAAAAGTCCTAATAGTAGAAGAACCCTCCATAAACCTGGAGTGACTATATGGATGCCCCCCACCCTACCACACATTCGAAGAACCCGTATACATAAAATCTAGACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGGCCTCCATGACTTTTTCAAAAAGATATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCTAAATCCTATATATCTTAATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTTATCATCTTTCATGATCACGCCCTCATAATCATTTTCCTTATCTGCTTCCTAGTCCTGTACGCCCTTTTCCTAACACTCACAACAAAACTAACTAATACTAACATCTCAGACGCTCAGGAAATAGAAACCGTCTGAACTATCCTGCCCGCCATCATCCTAGTCCTTATCGCCCTCCCATCCCTACGCATCCTTTACATAACAGACGAGGTCAACGATCCCTCCTTTACCATCAAATCAATT
GGCCATCAATGGTACTGAACCTACGAATACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATCGAGTAGTACTCCCGGTTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTCTTACACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAACCAAACCACTTTCACTGCTACACGACCAGGGGTATACTACGGCCAATGCTCTGAAATCTGTGGAGCAAACCAGTTTTATGCCCATCGTCCTAGAATTAATTCCCCTAAAAATCTTTGAAATAGGGCCCGTATTTACCCTATAGCACCCCCTCTACCCCCTCTAGAGCCCACTGTAAAGCTAACTTAGCATTAACCTTTTAAGTTAAAGATTAAGAGAACCAACACCTCTTTACAGTGAAATGCCCCAACTAAATACTACCGTATGACCCACCATAATTACCCCCATACTCCTTACACTATTCCTCATCACCCAACTAAAAATATTAAATACAAATTACCACCTACCTCCCTCACCAAAGCCCATAAAAATAAAAAACTATAACAAACCCTGAGAACCAAAATGAACGAAAATCTGTTCACTTCATTCATTGCCCCCACAATCCTAGGCCTACCCGCCGCAGTACTGATCATTCTATTTCCCCCTCTATTGATCCCCACCTCCAAATATCTCATCAACAACCGACTAATTACCACCCAACAATGACTAATCCAACTAACCTCAAAACAAATGATAGCCATACACAACACTAAGGGACGAACCTGATCTCTTATACTAGTATCCTTAATCATTTTTATTGCCACAACTAACCTCCTCGGACTCCTGCCTCACTCATTTACACCAACCACCCAACTATCTATAAACCTAGCCATGGCCATCCCCTTATGAGCGGGCGCAGTGATTATAGGCTTTCGCTCTAAGATTAAAAATGCCCTAGCCCACTTCTTACCACAAGGCACACCTACACCCCTTATCCCTATACTAGTTATTATCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATTGGAAGCGCCACACTAGCAATATCAACTATTAACCTTCCCTCTACACTTATCATCTTCACAATTCTAATTCTACTGACTATCCTAGAAATCGCTGTCGCCTTAATCCAAGCCTACGTTTTTACACTTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGGCCCCTAACAGGGGCCCTCTCAGCCCTCCTAATGACCTCCGGCCTAGCCATGTGATTTCACTTCCACTCCACAACCCTCCTCATACTAGGCCTACTAACCAACACACTAACCATATACCAATGATGGCGCGATGTAACACGAGAAAGCACATACCAAGGCCACCACACACCACCTGTCCAGAAAGGCCTTCGATACGGGATAATCCTATTTATTACCTCAGAAGTTTTTTTCTTCGCAGGATTTTTCTGAGCCTTTTACCACTCCAGCCTAGCTCCCACCCCCCAACTAGGGGGACACTGGCCCCCAACAGGCATCACCCCGCTAAATCCCCTAGAAGTCCCACTCCTAAACACATCCGTATTACTCGCATCAGGGGTATCAATCACCTGAGCTCACCATAGTCTAATAGAAAACAACCGAAACCAAATAATTCAAGCACTGCTTATTACAATTTTACTGGGTCTCTATTTTACCCTCCTACAAGCCTCAGAGTACTTCGAGGTTAAA 5 | chr1 3634 3696 5 4 0 0 374 -1 -1 -1 s1,s367398,s434464,s2,s3 * 
CCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGACCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCCGCCTGCTGGCAGCTAGGGACGTTGCAGGGCCCTCTTGCTCACAGTGTAGTGGCAGCACGCCCGCCTGCTGGCAGCTGGGGACACTGCCGGA 6 | chr1 27995 28086 11 12 0 91 958 -1 -1 -1 s31,s416854,s238064,s238065,s314818,s314819,s238066,s32,s414976,s33,s34 TGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGTCCT CGACTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTACTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCCGATCTGTATATATGTATCATGTAAACATGATTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTTCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCTCCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATGTATGTATCATGTAAACACGAGTTCCTACTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCCGATCTGTATATAAGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTGTAACCGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACACGAGTTCCTGCTGGCATATCTGACTATAACTGACCACCTCAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGTATATATGTATCATGTAAACATGAGTTCCTGCTGGCATATCTGTCTATAACCGACCACCTTAGGGTCCATTCTGATCTGTATATATGTATAATATATATTATATATGGACCTCAGGGTCCATTCTGATCTGCATATATGTATAATATATATTATATATGGTCCT 7 | chr1 37656 37965 20 86 0 1 1759 -1 -1 -1 s43,s486580,s44,s238072,s416855,s238073,s434466,s416856,s238074,s45,s238075,s46,s416857,s441327,s416858,s238076,s416859,s47,s48,s49 T 
GCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTTGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGAGTGTGACGGGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCATGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTGTGCTGTGTGAGAACATGTGTGTAGTGTTCACATGTCCTCTGCGCGTGAGTCCCCGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGATGCGGCGTGTGCTGTGTGAGAACGTGTGTGTAGTGTCCACATGTCCTCTGTGCGTGAGTCCCTGTGTGTGATGTTGTGTTCTCGGTGTGAGTTCATGGGTGTGACGGGGTGTG 8 | chr1 36616 37092 20 111 0 0 2096 -1 -1 -1 s35,s486581,s36,s238067,s458206,s37,s38,s428880,s458205,s39,s40,s238068,s238069,s238070,s307180,s426376,s41,s238071,s42,s43 * 
GAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCACACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGTGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCAGGCGGCCAGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGATGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAGGAGATGCCCAGGCCTGGCGGCCGGCGCACGCGGGTTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGTTCTCT
GTGGCCAGCAGGCGGCGCTGCAGGAGGGGAGATGCCCAGGCCTGGCGGCCGGCGCACGTGGGCTCTCTGTGGCCAGCAGGCGGCGCTGCAGGAGAG 9 | chr1 11669 11728 3 2 0 0 59 -1 -1 -1 s3,s4,s5 * GTCCCTCTGTCTCTGCCAACCAGTTAACCTGCTGCTTCCTGGAGGGAGACAGTCCCTCA 10 | chr1 21837 21883 4 2 0 46 103 -1 -1 -1 s6,s374420,s7,s8 TTCCCAGCAGTGCAGGCCCCTCTCTAGAGCTGAGATGCTCCCGGCA CCCTCTCCAGAGCCGAGACGCTCCCGGCGGTGCAGGCCCCTCTCTAGAGCCGAGACGCTCCCAGCAATGCAGGTCCCCCTCTAGAGCCGAGACGCTCCCGGCG 11 | -------------------------------------------------------------------------------- /data/fasta/good/basic_aligned.fa: -------------------------------------------------------------------------------- 1 | >sequence1 2 | ATCTACGATCGAGCTACT 3 | >sequence2 4 | ATC----ATCGACCCACT 5 | -------------------------------------------------------------------------------- /data/fasta/good/basic_dna.fa: -------------------------------------------------------------------------------- 1 | >sequence1 2 | AATTCTCATTACTGTATCACAGCAAGTTGTATTTACAACAAAAATCCAAA 3 | >sequence2 4 | GCCTACCAGAAAACGTTGTATTTTGGCAAAGTTCAAAAAGTCAGTCCAGA 5 | >sequence3 6 | GTATAATTCACAGAGTTTCATGTGGTTGTTGTTGACTCTACATATTGTCT 7 | -------------------------------------------------------------------------------- /data/fasta/good/basic_protein.fa: -------------------------------------------------------------------------------- 1 | >sequence1 2 | ADHWNARNNAKFWVYSHGPLWGIMHSHFPAGLAQGKNLHEIIPSMKQCIRPEWVDYCHMF 3 | >sequence2 4 | ISTTGEGMSHFVQNWVPLVWGFAVHYAQVTLFRDTRNGGYEVSVEWLGLYVSQLDASWNI 5 | >sequence3 6 | VKMHIEVVRPIWEHSQNIHFAQLTDNPAAKACDGFAPVTMKKTCGTDTIHCYHTYHACWR 7 | -------------------------------------------------------------------------------- /data/fasta/good/compressed.fa.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/fasta/good/compressed.fa.gz -------------------------------------------------------------------------------- 
/data/fasta/good/duplicate_sequence_names.fa: -------------------------------------------------------------------------------- 1 | >sequence2 2 | AATTCTCATTACTGTATCACAGCAAGTTGTATTTACAACAAAAATCCAAA 3 | >sequence2 4 | GCCTACCAGAAAACGTTGTATTTTGGCAAAGTTCAAAAAGTCAGTCCAGA 5 | >sequence3 6 | GTATAATTCACAGAGTTTCATGTGGTTGTTGTTGACTCTACATATTGTCT 7 | -------------------------------------------------------------------------------- /data/fasta/good/empty_lines.fa: -------------------------------------------------------------------------------- 1 | 2 | >sequence1 3 | AATTCTCATTACTGTATCACAGCAAGTTGTATTTACAACAAAAATCCAAA 4 | 5 | >sequence2 6 | GCCTACCAGAAAACGTTGTATTTTGGCAAAGTTCAAAAAGTCAGTCCAGA 7 | 8 | >sequence3 9 | GTATAATTCACAGAGTTTCATGTGGTTGTTGTTGACTCTACATATTGTCT 10 | -------------------------------------------------------------------------------- /data/fasta/good/multiline.fa: -------------------------------------------------------------------------------- 1 | >sequence1 2 | AATTCTCATTACTGTATCAC 3 | AGCAAGTTGTATTTACAACA 4 | AAAATCCAAA 5 | >sequence2 6 | GCCTACCAGAAAACGTTGTA 7 | TTTTGGCAAAGTTCAAAAAG 8 | TCAGTCCAGA 9 | >sequence3 10 | GTATAATTCACAGAGTTTCA 11 | TGTGGTTGTTGTTGACTCTA 12 | CATATTGTCT 13 | -------------------------------------------------------------------------------- /data/fasta/good/name_contains_spaces.fa: -------------------------------------------------------------------------------- 1 | >prefix with spaces1 2 | AATTCTCATTACTGTATCACAGCAAGTTGTATTTACAACAAAAATCCAAA 3 | >prefix with spaces2 4 | GCCTACCAGAAAACGTTGTATTTTGGCAAAGTTCAAAAAGTCAGTCCAGA 5 | >prefix with spaces3 6 | GTATAATTCACAGAGTTTCATGTGGTTGTTGTTGACTCTACATATTGTCT 7 | -------------------------------------------------------------------------------- /data/fastq/bad/quality_mismatch.fastq: -------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 2 | GAAAAGGAAATATCTTCATATAAAATCTAGGCAGAA 3 | + 4 | 
IIIIIIIIIIIIIIIIIIIIIBF29.'+'(%>( 9 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 10 | GTTTTTTTTTTTGATTCATCTGAATCCTTTATTTAA 11 | + 12 | IIIIIIIIIIII+=%I:7I.I+51;-)II:(IAB*( 13 | -------------------------------------------------------------------------------- /data/fastq/bad/truncated_clean.fastq: -------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 2 | GAAAAGGAAATATCTTCATATAAAATCTAGGCAGAA 3 | + 4 | IIIIIIIIIIIIIIIIIIIIIBF29.'+'(%>( 9 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 10 | GTTTTTTTTTTTGATTCATCTGAATCCTTTATTTAA 11 | -------------------------------------------------------------------------------- /data/fastq/bad/truncated_halfway.fastq: -------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 2 | GAAAAGGAAATATCTTCATATAAAATCTAGGCAGAA 3 | + 4 | IIIIIIIIIIIIIIIIIIIIIBF29.'+'(%>( 9 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 10 | GTTTTTTTTTTTGATTCATCTGAATCCTTTATTTAA 11 | + 12 | IIIIIIIIIIII+=%I:7I.I+51;-)II:(IAB*( 13 | -------------------------------------------------------------------------------- /data/fastq/good/basic_R2.fastq: -------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/2 2 | ACAAAGTTTCTCAGAATGCTTCTGTGTAGTTTTTGT 3 | + 4 | IIIIIIFIIIIB@I??I26>=3:<;<:/+0AE7C,) 5 | @ERR001268.2 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1090:1998/2 6 | CCTAAAGATAAATTCCATCTGTATATCTCTCCATGA 7 | + 8 | IIIIICIIIE556311,-/)2)- 9 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/2 10 | AGCAAATATTTAACAACTGCCAAATACTGGATTGCA 11 | + 12 | IIIIIEIIIIIII7IE:G029):5,040+.//0()+ 13 | -------------------------------------------------------------------------------- /data/fastq/good/compressed.fastq.gz: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/fastq/good/compressed.fastq.gz -------------------------------------------------------------------------------- /data/fastq/good/duplicate_+.fastq: -------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 2 | GAAAAGGAAATATCTTCATATAAAATCTAGGCAGAA 3 | +ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 4 | IIIIIIIIIIIIIIIIIIIIIBF29.'+'(%>( 9 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 10 | GTTTTTTTTTTTGATTCATCTGAATCCTTTATTTAA 11 | +ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 12 | IIIIIIIIIIII+=%I:7I.I+51;-)II:(IAB*( 13 | -------------------------------------------------------------------------------- /data/fastq/good/interleaved.fastq: -------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 2 | GAAAAGGAAATATCTTCATATAAAATCTAGGCAGAA 3 | + 4 | IIIIIIIIIIIIIIIIIIIIIBF2=3:<;<:/+0AE7C,) 9 | @ERR001268.2 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1090:1998/1 10 | GTTTTAATTCATTCTTGATCCCTTGCTCCAGAAATA 11 | + 12 | IIIIIIIIIIIIIIIII4C/D=HA1F>9.'+'(%>( 13 | @ERR001268.2 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1090:1998/2 14 | CCTAAAGATAAATTCCATCTGTATATCTCTCCATGA 15 | + 16 | IIIIICIIIE556311,-/)2)- 17 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 18 | GTTTTTTTTTTTGATTCATCTGAATCCTTTATTTAA 19 | + 20 | IIIIIIIIIIII+=%I:7I.I+51;-)II:(IAB*( 21 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/2 22 | AGCAAATATTTAACAACTGCCAAATACTGGATTGCA 23 | + 24 | IIIIIEIIIIIII7IE:G029):5,040+.//0()+ 25 | -------------------------------------------------------------------------------- /data/fastq/good/multiline.fastq: 
-------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 2 | GAAAAGGAAATATCTTCATA 3 | TAAAATCTAGGCAGAA 4 | + 5 | IIIIIIIIIIIIIIIIIIII 6 | IBF29.'+'(%>( 13 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 14 | GTTTTTTTTTTTGATTCATC 15 | TGAATCCTTTATTTAA 16 | + 17 | IIIIIIIIIIII+=%I:7I. 18 | I+51;-)II:(IAB*( 19 | -------------------------------------------------------------------------------- /data/fastq/good/quality_@.fastq: -------------------------------------------------------------------------------- 1 | @ERR001268.1 080821_HWI-EAS301_0002_30ALBAAXX:1:1:1115:2003/1 2 | GAAAAGGAAATATCTTCATATAAAATCTAGGCAGAA 3 | + 4 | @IIIIIIIIIIIIIIIIIIIIBF29.'+'(%>( 9 | @ERR001268.3 080821_HWI-EAS301_0002_30ALBAAXX:1:1:907:1952/1 10 | GTTTTTTTTTTTGATTCATCTGAATCCTTTATTTAA 11 | + 12 | IIIIIIIIIIII+=%I:7I.I+51;-)II:(IAB*( 13 | -------------------------------------------------------------------------------- /data/vcf/bad/missing_info_field.vcf: -------------------------------------------------------------------------------- 1 | ##fileformat=VCFv4.1 2 | ##FILTER= 3 | ##fileDate=20150218 4 | ##reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz 5 | ##source=1000GenomesPhase3Pipeline 6 | ##contig= 7 | ##contig= 8 | ##contig= 9 | ##contig= 10 | ##contig= 11 | ##contig= 12 | ##contig= 13 | ##contig= 14 | ##contig= 15 | ##contig= 16 | ##contig= 17 | ##contig= 18 | ##contig= 19 | ##contig= 20 | ##contig= 21 | ##contig= 22 | ##contig= 23 | ##contig= 24 | ##contig= 25 | ##contig= 26 | ##contig= 27 | ##contig= 28 | ##contig= 29 | ##contig= 30 | ##contig= 31 | ##contig= 32 | ##contig= 33 | ##contig= 34 | ##contig= 35 | ##contig= 36 | ##contig= 37 | ##contig= 38 | ##contig= 39 | ##contig= 40 | ##contig= 41 | ##contig= 42 | ##contig= 43 | ##contig= 44 | ##contig= 45 | ##contig= 46 | ##contig= 47 | ##contig= 48 | ##contig= 49 | 
##contig= 50 | ##contig= 51 | ##contig= 52 | ##contig= 53 | ##contig= 54 | ##contig= 55 | ##contig= 56 | ##contig= 57 | ##contig= 58 | ##contig= 59 | ##contig= 60 | ##contig= 61 | ##contig= 62 | ##contig= 63 | ##contig= 64 | ##contig= 65 | ##contig= 66 | ##contig= 67 | ##contig= 68 | ##contig= 69 | ##contig= 70 | ##contig= 71 | ##contig= 72 | ##contig= 73 | ##contig= 74 | ##contig= 75 | ##contig= 76 | ##contig= 77 | ##contig= 78 | ##contig= 79 | ##contig= 80 | ##contig= 81 | ##contig= 82 | ##contig= 83 | ##contig= 84 | ##contig= 85 | ##contig= 86 | ##contig= 87 | ##contig= 88 | ##contig= 89 | ##contig= 90 | ##contig= 91 | ##contig= 92 | ##ALT= 93 | ##ALT= 94 | ##ALT= 95 | ##ALT= 96 | ##ALT= 97 | ##ALT= 98 | ##ALT= 99 | ##ALT= 100 | ##ALT= 101 | ##ALT= 102 | ##ALT= 103 | ##ALT= 104 | ##ALT= 105 | ##ALT= 106 | ##ALT= 107 | ##ALT= 108 | ##ALT= 109 | ##ALT= 110 | ##ALT= 111 | ##ALT= 112 | ##ALT= 113 | ##ALT= 114 | ##ALT= 115 | ##ALT= 116 | ##ALT= 117 | ##ALT= 118 | ##ALT= 119 | ##ALT= 120 | ##ALT= 121 | ##ALT= 122 | ##ALT= 123 | ##ALT= 124 | ##ALT= 125 | ##ALT= 126 | ##ALT= 127 | ##ALT= 128 | ##ALT= 129 | ##ALT= 130 | ##ALT= 131 | ##ALT= 132 | ##ALT= 133 | ##ALT= 134 | ##ALT= 135 | ##ALT= 136 | ##ALT= 137 | ##ALT= 138 | ##ALT= 139 | ##ALT= 140 | ##ALT= 141 | ##ALT= 142 | ##ALT= 143 | ##ALT= 144 | ##ALT= 145 | ##ALT= 146 | ##ALT= 147 | ##ALT= 148 | ##ALT= 149 | ##ALT= 150 | ##ALT= 151 | ##ALT= 152 | ##ALT= 153 | ##ALT= 154 | ##ALT= 155 | ##ALT= 156 | ##ALT= 157 | ##ALT= 158 | ##ALT= 159 | ##ALT= 160 | ##ALT= 161 | ##ALT= 162 | ##ALT= 163 | ##ALT= 164 | ##ALT= 165 | ##ALT= 166 | ##ALT= 167 | ##ALT= 168 | ##ALT= 169 | ##ALT= 170 | ##ALT= 171 | ##ALT= 172 | ##ALT= 173 | ##ALT= 174 | ##ALT= 175 | ##ALT= 176 | ##ALT= 177 | ##ALT= 178 | ##ALT= 179 | ##ALT= 180 | ##ALT= 181 | ##ALT= 182 | ##ALT= 183 | ##ALT= 184 | ##ALT= 185 | ##ALT= 186 | ##ALT= 187 | ##ALT= 188 | ##ALT= 189 | ##ALT= 190 | ##ALT= 191 | ##ALT= 192 | ##ALT= 193 | ##ALT= 194 | ##ALT= 195 | ##ALT= 196 | ##ALT= 
197 | ##ALT= 198 | ##ALT= 199 | ##ALT= 200 | ##ALT= 201 | ##ALT= 202 | ##ALT= 203 | ##ALT= 204 | ##ALT= 205 | ##ALT= 206 | ##ALT= 207 | ##ALT= 208 | ##ALT= 209 | ##ALT= 210 | ##ALT= 211 | ##ALT= 212 | ##ALT= 213 | ##ALT= 214 | ##ALT= 215 | ##ALT= 216 | ##ALT= 217 | ##ALT= 218 | ##ALT= 219 | ##ALT= 220 | ##ALT= 221 | ##ALT= 222 | ##ALT= 223 | ##ALT= 224 | ##ALT= 225 | ##INFO= 226 | ##INFO= 227 | ##INFO= 228 | ##INFO= 229 | ##INFO= 230 | ##INFO= 231 | ##INFO= 232 | ##INFO= 233 | ##INFO= 234 | ##INFO= 235 | ##INFO= 236 | ##INFO= 237 | ##INFO= 238 | ##INFO= 239 | ##INFO= 240 | ##INFO= 241 | ##INFO= 242 | ##INFO= 243 | ##INFO= 244 | ##INFO= 245 | ##INFO= 246 | ##INFO= 247 | ##INFO= 248 | ##INFO= 249 | ##INFO= 250 | ##INFO= 251 | #CHROM POS ID REF ALT QUAL FILTER INFO 252 | 1 10177 rs367896724 A AC 100 PASS AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=|||unknown(NO_COVERAGE);VT=INDEL 253 | 1 10235 rs540431307 T TA 100 PASS AC=6;AF=0.00119808;AN=5008;NS=2504;DP=78015;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0.0051;AA=|||unknown(NO_COVERAGE);VT=INDEL 254 | 1 10352 rs555500075 T TA 100 PASS AC=2191;AF=0.4375;AN=5008;NS=2504;DP=88915;EAS_AF=0.4306;AMR_AF=0.4107;AFR_AF=0.4788;EUR_AF=0.4264;SAS_AF=0.4192;AA=|||unknown(NO_COVERAGE);VT=INDEL 255 | 1 10505 rs548419688 A T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9632;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 256 | 1 10506 rs568405545 C G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9676;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 257 | 1 10511 rs534229142 G A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9869;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 258 | 1 10539 rs537182016 C A 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=9203;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0.001;SAS_AF=0.001;AA=.|||;VT=SNP 259 | 1 10542 rs572818783 C T 100 PASS 
AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9007;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 260 | 1 10579 rs538322974 C A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=5502;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 261 | 1 10616 rs376342519 CCGCCGTTGCAAAGGCGCGCCG C 100 PASS AC=4973;AF=0.993011;AN=5008;NS=2504;DP=2365;EAS_AF=0.9911;AMR_AF=0.9957;AFR_AF=0.9894;EUR_AF=0.994;SAS_AF=0.9969;VT=INDEL 262 | 1 10642 rs558604819 G A 100 PASS AC=21;AF=0.00419329;AN=5008;NS=2504;DP=1360;EAS_AF=0.003;AMR_AF=0.0014;AFR_AF=0.0129;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 263 | 1 11008 rs575272151 C G 100 PASS AC=441;AF=0.0880591;AN=5008;NS=2504;DP=2232;EAS_AF=0.0367;AMR_AF=0.0965;AFR_AF=0.1346;EUR_AF=0.0885;SAS_AF=0.0716;AA=.|||;VT=SNP 264 | 1 11012 rs544419019 C G 100 PASS AC=441;AF=0.0880591;AN=5008;NS=2504;DP=2090;EAS_AF=0.0367;AMR_AF=0.0965;AFR_AF=0.1346;EUR_AF=0.0885;SAS_AF=0.0716;AA=.|||;VT=SNP 265 | 1 11063 rs561109771 T G 100 PASS AC=15;AF=0.00299521;AN=5008;NS=2504;DP=2834;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0106;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 266 | 1 13011 rs574746232 T G 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=35822;EAS_AF=0;AMR_AF=0;AFR_AF=0.0023;EUR_AF=0;SAS_AF=0;AA=t|||;VT=SNP 267 | 1 13110 rs540538026 G A 100 PASS AC=134;AF=0.0267572;AN=5008;NS=2504;DP=23422;EAS_AF=0.002;AMR_AF=0.036;AFR_AF=0.0053;EUR_AF=0.0567;SAS_AF=0.044;AA=g|||;VT=SNP 268 | 1 13116 rs62635286 T G 100 PASS AC=486;AF=0.0970447;AN=5008;NS=2504;DP=22340;EAS_AF=0.0248;AMR_AF=0.121;AFR_AF=0.0295;EUR_AF=0.1869;SAS_AF=0.1534;AA=t|||;VT=SNP 269 | 1 13118 rs200579949 A G 100 PASS AC=486;AF=0.0970447;AN=5008;NS=2504;DP=21395;EAS_AF=0.0248;AMR_AF=0.121;AFR_AF=0.0295;EUR_AF=0.1869;SAS_AF=0.1534;AA=a|||;VT=SNP 270 | 1 13156 rs552314247 G C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=28002;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0.001;AA=g|||;VT=SNP 271 | 1 13259 rs562993331 G A 100 PASS 
AC=2;AF=0.000399361;AN=5008;NS=2504;DP=31820;EAS_AF=0;AMR_AF=0;AFR_AF=0.0015;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 272 | 1 13273 rs531730856 G C 100 PASS AC=476;AF=0.0950479;AN=5008;NS=2504;DP=29117;EAS_AF=0.0625;AMR_AF=0.1455;AFR_AF=0.0204;EUR_AF=0.1471;SAS_AF=0.1401;AA=g|||;VT=SNP 273 | 1 13284 rs548333521 G A 100 PASS AC=7;AF=0.00139776;AN=5008;NS=2504;DP=26384;EAS_AF=0.001;AMR_AF=0;AFR_AF=0.0045;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 274 | 1 13289 rs568318295 C T 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=25361;EAS_AF=0.003;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP 275 | 1 13289 rs538791886 CCT C 100 PASS AC=20;AF=0.00399361;AN=5008;NS=2504;DP=25361;EAS_AF=0;AMR_AF=0.0043;AFR_AF=0;EUR_AF=0;SAS_AF=0.0174;VT=INDEL 276 | 1 13313 rs527952245 T G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=20943;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=t|||;VT=SNP 277 | 1 13365 rs548087592 C T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=26013;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP 278 | 1 13380 rs571093408 C G 100 PASS AC=41;AF=0.0081869;AN=5008;NS=2504;DP=28302;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0303;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP 279 | 1 13382 rs538606945 C G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=28817;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0.001;AA=c|||;VT=SNP 280 | 1 13445 rs558318514 C G 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=45168;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0.002;AA=c|||;VT=SNP 281 | 1 13453 rs568927457 T C 100 PASS AC=4;AF=0.000798722;AN=5008;NS=2504;DP=47109;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0023;EUR_AF=0;SAS_AF=0;AA=t|||;VT=SNP 282 | 1 13482 rs537951473 G C 100 PASS AC=2;AF=0.000399361;AN=5008;NS=2504;DP=51992;EAS_AF=0;AMR_AF=0;AFR_AF=0.0015;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 283 | 1 13483 rs554760071 G C 100 PASS AC=10;AF=0.00199681;AN=5008;NS=2504;DP=51968;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0068;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 284 | 1 13494 rs574697788 A G 100 PASS 
AC=7;AF=0.00139776;AN=5008;NS=2504;DP=51763;EAS_AF=0;AMR_AF=0.0029;AFR_AF=0;EUR_AF=0.003;SAS_AF=0.002;AA=a|||;VT=SNP 285 | 1 13543 rs540466151 T G 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=40768;EAS_AF=0.003;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=t|||;VT=SNP 286 | 1 13550 rs554008981 G A 100 PASS AC=17;AF=0.00339457;AN=5008;NS=2504;DP=39894;EAS_AF=0;AMR_AF=0.0101;AFR_AF=0.0008;EUR_AF=0.008;SAS_AF=0.001;AA=g|||;VT=SNP 287 | 1 14462 rs577106641 A G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=26811;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=a|||;VT=SNP 288 | 1 14464 rs546169444 A T 100 PASS AC=480;AF=0.0958466;AN=5008;NS=2504;DP=26761;EAS_AF=0.005;AMR_AF=0.1138;AFR_AF=0.0144;EUR_AF=0.1859;SAS_AF=0.1943;AA=a|||;VT=SNP 289 | 1 14564 rs562748080 G A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=34021;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 290 | 1 14599 rs531646671 T A 100 PASS AC=739;AF=0.147564;AN=5008;NS=2504;DP=32081;EAS_AF=0.0893;AMR_AF=0.1758;AFR_AF=0.121;EUR_AF=0.161;SAS_AF=0.2096;AA=t|||;VT=SNP 291 | 1 14604 rs541940975 A G 100 PASS AC=739;AF=0.147564;AN=5008;NS=2504;DP=29231;EAS_AF=0.0893;AMR_AF=0.1758;AFR_AF=0.121;EUR_AF=0.161;SAS_AF=0.2096;AA=a|||;VT=SNP 292 | 1 14674 rs561913721 G A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=26402;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=g|||;VT=SNP;EX_TARGET 293 | 1 14719 rs527865771 C A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=29713;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=c|||;VT=SNP;EX_TARGET 294 | 1 14728 rs547701710 C A 100 PASS AC=2;AF=0.000399361;AN=5008;NS=2504;DP=30785;EAS_AF=0.002;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP;EX_TARGET 295 | 1 14775 rs571121669 C T 100 PASS AC=2;AF=0.000399361;AN=5008;NS=2504;DP=33963;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=c|||;VT=SNP;EX_TARGET 296 | 1 14860 rs533499096 G A 100 PASS 
AC=1;AF=0.000199681;AN=5008;NS=2504;DP=38145;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP;EX_TARGET 297 | 1 14874 rs552113149 G C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=38095;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP;EX_TARGET 298 | 1 14930 rs75454623 A G 100 PASS AC=2415;AF=0.482228;AN=5008;NS=2504;DP=42231;EAS_AF=0.4137;AMR_AF=0.5231;AFR_AF=0.4811;EUR_AF=0.5209;SAS_AF=0.4857;AA=a|||;VT=SNP 299 | 1 14933 rs199856693 G A 100 PASS AC=142;AF=0.0283546;AN=5008;NS=2504;DP=40247;EAS_AF=0.0268;AMR_AF=0.0375;AFR_AF=0.0015;EUR_AF=0.0507;SAS_AF=0.0368;AA=g|||;VT=SNP 300 | -------------------------------------------------------------------------------- /data/vcf/good/basic.bcf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/basic.bcf -------------------------------------------------------------------------------- /data/vcf/good/basic.vcf: -------------------------------------------------------------------------------- 1 | ##fileformat=VCFv4.1 2 | ##FILTER= 3 | ##fileDate=20150218 4 | ##reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz 5 | ##source=1000GenomesPhase3Pipeline 6 | ##contig= 7 | ##contig= 8 | ##contig= 9 | ##contig= 10 | ##contig= 11 | ##contig= 12 | ##contig= 13 | ##contig= 14 | ##contig= 15 | ##contig= 16 | ##contig= 17 | ##contig= 18 | ##contig= 19 | ##contig= 20 | ##contig= 21 | ##contig= 22 | ##contig= 23 | ##contig= 24 | ##contig= 25 | ##contig= 26 | ##contig= 27 | ##contig= 28 | ##contig= 29 | ##contig= 30 | ##contig= 31 | ##contig= 32 | ##contig= 33 | ##contig= 34 | ##contig= 35 | ##contig= 36 | ##contig= 37 | ##contig= 38 | ##contig= 39 | ##contig= 40 | ##contig= 41 | ##contig= 42 | ##contig= 43 | ##contig= 44 | ##contig= 45 | ##contig= 46 | ##contig= 47 | ##contig= 48 | ##contig= 49 | 
##contig= 50 | ##contig= 51 | ##contig= 52 | ##contig= 53 | ##contig= 54 | ##contig= 55 | ##contig= 56 | ##contig= 57 | ##contig= 58 | ##contig= 59 | ##contig= 60 | ##contig= 61 | ##contig= 62 | ##contig= 63 | ##contig= 64 | ##contig= 65 | ##contig= 66 | ##contig= 67 | ##contig= 68 | ##contig= 69 | ##contig= 70 | ##contig= 71 | ##contig= 72 | ##contig= 73 | ##contig= 74 | ##contig= 75 | ##contig= 76 | ##contig= 77 | ##contig= 78 | ##contig= 79 | ##contig= 80 | ##contig= 81 | ##contig= 82 | ##contig= 83 | ##contig= 84 | ##contig= 85 | ##contig= 86 | ##contig= 87 | ##contig= 88 | ##contig= 89 | ##contig= 90 | ##contig= 91 | ##contig= 92 | ##ALT= 93 | ##ALT= 94 | ##ALT= 95 | ##ALT= 96 | ##ALT= 97 | ##ALT= 98 | ##ALT= 99 | ##ALT= 100 | ##ALT= 101 | ##ALT= 102 | ##ALT= 103 | ##ALT= 104 | ##ALT= 105 | ##ALT= 106 | ##ALT= 107 | ##ALT= 108 | ##ALT= 109 | ##ALT= 110 | ##ALT= 111 | ##ALT= 112 | ##ALT= 113 | ##ALT= 114 | ##ALT= 115 | ##ALT= 116 | ##ALT= 117 | ##ALT= 118 | ##ALT= 119 | ##ALT= 120 | ##ALT= 121 | ##ALT= 122 | ##ALT= 123 | ##ALT= 124 | ##ALT= 125 | ##ALT= 126 | ##ALT= 127 | ##ALT= 128 | ##ALT= 129 | ##ALT= 130 | ##ALT= 131 | ##ALT= 132 | ##ALT= 133 | ##ALT= 134 | ##ALT= 135 | ##ALT= 136 | ##ALT= 137 | ##ALT= 138 | ##ALT= 139 | ##ALT= 140 | ##ALT= 141 | ##ALT= 142 | ##ALT= 143 | ##ALT= 144 | ##ALT= 145 | ##ALT= 146 | ##ALT= 147 | ##ALT= 148 | ##ALT= 149 | ##ALT= 150 | ##ALT= 151 | ##ALT= 152 | ##ALT= 153 | ##ALT= 154 | ##ALT= 155 | ##ALT= 156 | ##ALT= 157 | ##ALT= 158 | ##ALT= 159 | ##ALT= 160 | ##ALT= 161 | ##ALT= 162 | ##ALT= 163 | ##ALT= 164 | ##ALT= 165 | ##ALT= 166 | ##ALT= 167 | ##ALT= 168 | ##ALT= 169 | ##ALT= 170 | ##ALT= 171 | ##ALT= 172 | ##ALT= 173 | ##ALT= 174 | ##ALT= 175 | ##ALT= 176 | ##ALT= 177 | ##ALT= 178 | ##ALT= 179 | ##ALT= 180 | ##ALT= 181 | ##ALT= 182 | ##ALT= 183 | ##ALT= 184 | ##ALT= 185 | ##ALT= 186 | ##ALT= 187 | ##ALT= 188 | ##ALT= 189 | ##ALT= 190 | ##ALT= 191 | ##ALT= 192 | ##ALT= 193 | ##ALT= 194 | ##ALT= 195 | ##ALT= 196 | ##ALT= 
197 | ##ALT= 198 | ##ALT= 199 | ##ALT= 200 | ##ALT= 201 | ##ALT= 202 | ##ALT= 203 | ##ALT= 204 | ##ALT= 205 | ##ALT= 206 | ##ALT= 207 | ##ALT= 208 | ##ALT= 209 | ##ALT= 210 | ##ALT= 211 | ##ALT= 212 | ##ALT= 213 | ##ALT= 214 | ##ALT= 215 | ##ALT= 216 | ##ALT= 217 | ##ALT= 218 | ##ALT= 219 | ##ALT= 220 | ##ALT= 221 | ##ALT= 222 | ##ALT= 223 | ##ALT= 224 | ##ALT= 225 | ##INFO= 226 | ##INFO= 227 | ##INFO= 228 | ##INFO= 229 | ##INFO= 230 | ##INFO= 231 | ##INFO= 232 | ##INFO= 233 | ##INFO= 234 | ##INFO= 235 | ##INFO= 236 | ##INFO= 237 | ##INFO= 238 | ##INFO= 239 | ##INFO= 240 | ##INFO= 241 | ##INFO= 242 | ##INFO= 243 | ##INFO= 244 | ##INFO= 245 | ##INFO= 246 | ##INFO= 247 | ##INFO= 248 | ##INFO= 249 | ##INFO= 250 | ##INFO= 251 | ##INFO= 252 | #CHROM POS ID REF ALT QUAL FILTER INFO 253 | 1 10177 rs367896724 A AC 100 PASS AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=|||unknown(NO_COVERAGE);VT=INDEL 254 | 1 10235 rs540431307 T TA 100 PASS AC=6;AF=0.00119808;AN=5008;NS=2504;DP=78015;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0.0051;AA=|||unknown(NO_COVERAGE);VT=INDEL 255 | 1 10352 rs555500075 T TA 100 PASS AC=2191;AF=0.4375;AN=5008;NS=2504;DP=88915;EAS_AF=0.4306;AMR_AF=0.4107;AFR_AF=0.4788;EUR_AF=0.4264;SAS_AF=0.4192;AA=|||unknown(NO_COVERAGE);VT=INDEL 256 | 1 10505 rs548419688 A T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9632;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 257 | 1 10506 rs568405545 C G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9676;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 258 | 1 10511 rs534229142 G A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9869;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 259 | 1 10539 rs537182016 C A 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=9203;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0.001;SAS_AF=0.001;AA=.|||;VT=SNP 260 | 1 10542 rs572818783 C T 100 PASS 
AC=1;AF=0.000199681;AN=5008;NS=2504;DP=9007;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 261 | 1 10579 rs538322974 C A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=5502;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 262 | 1 10616 rs376342519 CCGCCGTTGCAAAGGCGCGCCG C 100 PASS AC=4973;AF=0.993011;AN=5008;NS=2504;DP=2365;EAS_AF=0.9911;AMR_AF=0.9957;AFR_AF=0.9894;EUR_AF=0.994;SAS_AF=0.9969;VT=INDEL 263 | 1 10642 rs558604819 G A 100 PASS AC=21;AF=0.00419329;AN=5008;NS=2504;DP=1360;EAS_AF=0.003;AMR_AF=0.0014;AFR_AF=0.0129;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 264 | 1 11008 rs575272151 C G 100 PASS AC=441;AF=0.0880591;AN=5008;NS=2504;DP=2232;EAS_AF=0.0367;AMR_AF=0.0965;AFR_AF=0.1346;EUR_AF=0.0885;SAS_AF=0.0716;AA=.|||;VT=SNP 265 | 1 11012 rs544419019 C G 100 PASS AC=441;AF=0.0880591;AN=5008;NS=2504;DP=2090;EAS_AF=0.0367;AMR_AF=0.0965;AFR_AF=0.1346;EUR_AF=0.0885;SAS_AF=0.0716;AA=.|||;VT=SNP 266 | 1 11063 rs561109771 T G 100 PASS AC=15;AF=0.00299521;AN=5008;NS=2504;DP=2834;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0106;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP 267 | 1 13011 rs574746232 T G 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=35822;EAS_AF=0;AMR_AF=0;AFR_AF=0.0023;EUR_AF=0;SAS_AF=0;AA=t|||;VT=SNP 268 | 1 13110 rs540538026 G A 100 PASS AC=134;AF=0.0267572;AN=5008;NS=2504;DP=23422;EAS_AF=0.002;AMR_AF=0.036;AFR_AF=0.0053;EUR_AF=0.0567;SAS_AF=0.044;AA=g|||;VT=SNP 269 | 1 13116 rs62635286 T G 100 PASS AC=486;AF=0.0970447;AN=5008;NS=2504;DP=22340;EAS_AF=0.0248;AMR_AF=0.121;AFR_AF=0.0295;EUR_AF=0.1869;SAS_AF=0.1534;AA=t|||;VT=SNP 270 | 1 13118 rs200579949 A G 100 PASS AC=486;AF=0.0970447;AN=5008;NS=2504;DP=21395;EAS_AF=0.0248;AMR_AF=0.121;AFR_AF=0.0295;EUR_AF=0.1869;SAS_AF=0.1534;AA=a|||;VT=SNP 271 | 1 13156 rs552314247 G C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=28002;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0.001;AA=g|||;VT=SNP 272 | 1 13259 rs562993331 G A 100 PASS 
AC=2;AF=0.000399361;AN=5008;NS=2504;DP=31820;EAS_AF=0;AMR_AF=0;AFR_AF=0.0015;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 273 | 1 13273 rs531730856 G C 100 PASS AC=476;AF=0.0950479;AN=5008;NS=2504;DP=29117;EAS_AF=0.0625;AMR_AF=0.1455;AFR_AF=0.0204;EUR_AF=0.1471;SAS_AF=0.1401;AA=g|||;VT=SNP 274 | 1 13284 rs548333521 G A 100 PASS AC=7;AF=0.00139776;AN=5008;NS=2504;DP=26384;EAS_AF=0.001;AMR_AF=0;AFR_AF=0.0045;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 275 | 1 13289 rs568318295 C T 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=25361;EAS_AF=0.003;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP 276 | 1 13289 rs538791886 CCT C 100 PASS AC=20;AF=0.00399361;AN=5008;NS=2504;DP=25361;EAS_AF=0;AMR_AF=0.0043;AFR_AF=0;EUR_AF=0;SAS_AF=0.0174;VT=INDEL 277 | 1 13313 rs527952245 T G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=20943;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=t|||;VT=SNP 278 | 1 13365 rs548087592 C T 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=26013;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP 279 | 1 13380 rs571093408 C G 100 PASS AC=41;AF=0.0081869;AN=5008;NS=2504;DP=28302;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0303;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP 280 | 1 13382 rs538606945 C G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=28817;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0.001;AA=c|||;VT=SNP 281 | 1 13445 rs558318514 C G 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=45168;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0.002;AA=c|||;VT=SNP 282 | 1 13453 rs568927457 T C 100 PASS AC=4;AF=0.000798722;AN=5008;NS=2504;DP=47109;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0023;EUR_AF=0;SAS_AF=0;AA=t|||;VT=SNP 283 | 1 13482 rs537951473 G C 100 PASS AC=2;AF=0.000399361;AN=5008;NS=2504;DP=51992;EAS_AF=0;AMR_AF=0;AFR_AF=0.0015;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 284 | 1 13483 rs554760071 G C 100 PASS AC=10;AF=0.00199681;AN=5008;NS=2504;DP=51968;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0.0068;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 285 | 1 13494 rs574697788 A G 100 PASS 
AC=7;AF=0.00139776;AN=5008;NS=2504;DP=51763;EAS_AF=0;AMR_AF=0.0029;AFR_AF=0;EUR_AF=0.003;SAS_AF=0.002;AA=a|||;VT=SNP 286 | 1 13543 rs540466151 T G 100 PASS AC=3;AF=0.000599042;AN=5008;NS=2504;DP=40768;EAS_AF=0.003;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=t|||;VT=SNP 287 | 1 13550 rs554008981 G A 100 PASS AC=17;AF=0.00339457;AN=5008;NS=2504;DP=39894;EAS_AF=0;AMR_AF=0.0101;AFR_AF=0.0008;EUR_AF=0.008;SAS_AF=0.001;AA=g|||;VT=SNP 288 | 1 14462 rs577106641 A G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=26811;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=a|||;VT=SNP 289 | 1 14464 rs546169444 A T 100 PASS AC=480;AF=0.0958466;AN=5008;NS=2504;DP=26761;EAS_AF=0.005;AMR_AF=0.1138;AFR_AF=0.0144;EUR_AF=0.1859;SAS_AF=0.1943;AA=a|||;VT=SNP 290 | 1 14564 rs562748080 G A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=34021;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP 291 | 1 14599 rs531646671 T A 100 PASS AC=739;AF=0.147564;AN=5008;NS=2504;DP=32081;EAS_AF=0.0893;AMR_AF=0.1758;AFR_AF=0.121;EUR_AF=0.161;SAS_AF=0.2096;AA=t|||;VT=SNP 292 | 1 14604 rs541940975 A G 100 PASS AC=739;AF=0.147564;AN=5008;NS=2504;DP=29231;EAS_AF=0.0893;AMR_AF=0.1758;AFR_AF=0.121;EUR_AF=0.161;SAS_AF=0.2096;AA=a|||;VT=SNP 293 | 1 14674 rs561913721 G A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=26402;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=g|||;VT=SNP;EX_TARGET 294 | 1 14719 rs527865771 C A 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=29713;EAS_AF=0;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=c|||;VT=SNP;EX_TARGET 295 | 1 14728 rs547701710 C A 100 PASS AC=2;AF=0.000399361;AN=5008;NS=2504;DP=30785;EAS_AF=0.002;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=c|||;VT=SNP;EX_TARGET 296 | 1 14775 rs571121669 C T 100 PASS AC=2;AF=0.000399361;AN=5008;NS=2504;DP=33963;EAS_AF=0.001;AMR_AF=0;AFR_AF=0;EUR_AF=0.001;SAS_AF=0;AA=c|||;VT=SNP;EX_TARGET 297 | 1 14860 rs533499096 G A 100 PASS 
AC=1;AF=0.000199681;AN=5008;NS=2504;DP=38145;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP;EX_TARGET 298 | 1 14874 rs552113149 G C 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=38095;EAS_AF=0;AMR_AF=0.0014;AFR_AF=0;EUR_AF=0;SAS_AF=0;AA=g|||;VT=SNP;EX_TARGET 299 | 1 14930 rs75454623 A G 100 PASS AC=2415;AF=0.482228;AN=5008;NS=2504;DP=42231;EAS_AF=0.4137;AMR_AF=0.5231;AFR_AF=0.4811;EUR_AF=0.5209;SAS_AF=0.4857;AA=a|||;VT=SNP 300 | 1 14933 rs199856693 G A 100 PASS AC=142;AF=0.0283546;AN=5008;NS=2504;DP=40247;EAS_AF=0.0268;AMR_AF=0.0375;AFR_AF=0.0015;EUR_AF=0.0507;SAS_AF=0.0368;AA=g|||;VT=SNP 301 | -------------------------------------------------------------------------------- /data/vcf/good/basic_multisample.bcf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/basic_multisample.bcf -------------------------------------------------------------------------------- /data/vcf/good/compressed.vcf.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/compressed.vcf.gz -------------------------------------------------------------------------------- /data/vcf/good/indexed.bcf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/indexed.bcf -------------------------------------------------------------------------------- /data/vcf/good/indexed.bcf.csi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/indexed.bcf.csi 
-------------------------------------------------------------------------------- /data/vcf/good/indexed_csi.vcf.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/indexed_csi.vcf.gz -------------------------------------------------------------------------------- /data/vcf/good/indexed_csi.vcf.gz.csi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/indexed_csi.vcf.gz.csi -------------------------------------------------------------------------------- /data/vcf/good/indexed_tbi.vcf.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/indexed_tbi.vcf.gz -------------------------------------------------------------------------------- /data/vcf/good/indexed_tbi.vcf.gz.tbi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omgenomics/bio-data-zoo/f23be65c2d21110113d275b8364388c458e395c4/data/vcf/good/indexed_tbi.vcf.gz.tbi -------------------------------------------------------------------------------- /metadata.csv: -------------------------------------------------------------------------------- 1 | bucket_id,name,prefix,path,value 2 | 3 | BUCKET_ID,42bp_link_github,,,https://github.com/omgenomics/bio-data-zoo 4 | 5 | BUCKET_ID,42bp_tooltip,bam/bad,truncated.bam,BAM file is truncated 6 | BUCKET_ID,42bp_tooltip,bam/bad,bai_older_than_data.bam,BAI index is older than BAM by 1s (often a data transfer timing issue) 7 | BUCKET_ID,42bp_tooltip,bam/bad,read_name_longer_than_254.sam,SAM file stores a read name longer than 254 characters 8 | 9 | 
BUCKET_ID,42bp_tooltip,bam/good,basic_unsorted.bam,BAM file is not sorted by mapping position
BUCKET_ID,42bp_tooltip,bam/good,compressed.sam.gz,SAM file compressed with bgzip
BUCKET_ID,42bp_tooltip,bam/good,indexed_bai.bam,BAM file with BAI index
BUCKET_ID,42bp_tooltip,bam/good,indexed_csi.bam,BAM file with CSI index
BUCKET_ID,42bp_tooltip,bam/good,indexed_csi.sam.gz,SAM file with CSI index
BUCKET_ID,42bp_tooltip,bam/good,indexed_tbi.sam.gz,SAM file with TBI index
BUCKET_ID,42bp_tooltip,bam/good,no_mapped_reads.bam,BAM file with no mapping information

BUCKET_ID,42bp_tooltip,bed/bad,spaces.bed,BED file with spaces instead of tabs
BUCKET_ID,42bp_tooltip,bed/bad,negative_coords.bed,BED file with negative coordinates
BUCKET_ID,42bp_tooltip,bed/bad,start_greater_than_end_coords.bed,BED file with invalid range where start > end
BUCKET_ID,42bp_tooltip,bed/bad,non_integer_coords.bed,BED file with floating point coordinates instead of integers

BUCKET_ID,42bp_tooltip,bed/good,compressed.bed.gz,BED file compressed with bgzip
BUCKET_ID,42bp_tooltip,bed/good,indexed_csi.bed.gz,BED file with CSI index
BUCKET_ID,42bp_tooltip,bed/good,indexed_tbi.bed.gz,BED file with TBI index
BUCKET_ID,42bp_tooltip,bed/good,unsorted.bed,BED file is not sorted by start position

BUCKET_ID,42bp_tooltip,fasta/good,basic_aligned.fa,FASTA output by MSA tool
BUCKET_ID,42bp_tooltip,fasta/good,compressed.fa.gz,FASTA compressed with bgzip
BUCKET_ID,42bp_tooltip,fasta/good,duplicate_sequence_names.fa,FASTA with duplicate sequence names
BUCKET_ID,42bp_tooltip,fasta/good,empty_lines.fa,FASTA with empty lines between sequences
BUCKET_ID,42bp_tooltip,fasta/good,multiline.fa,FASTA with sequences split across multiple lines
BUCKET_ID,42bp_tooltip,fasta/good,name_contains_spaces.fa,FASTA with spaces in sequence name

BUCKET_ID,42bp_tooltip,fastq/bad,quality_mismatch.fastq,FASTQ where 2nd read has len(sequence) != len(quality)
BUCKET_ID,42bp_tooltip,fastq/bad,truncated_clean.fastq,FASTQ where 3rd read is truncated right after the sequence
BUCKET_ID,42bp_tooltip,fastq/bad,truncated_halfway.fastq,FASTQ where 2nd read is truncated half-way through the sequence

BUCKET_ID,42bp_tooltip,fastq/good,compressed.fastq.gz,FASTQ compressed with bgzip
BUCKET_ID,42bp_tooltip,fastq/good,duplicate_+.fastq,FASTQ where + line shows read name
BUCKET_ID,42bp_tooltip,fastq/good,interleaved.fastq,FASTQ where R1/R2 are interleaved
BUCKET_ID,42bp_tooltip,fastq/good,multiline.fastq,FASTQ file where sequence/quality are multi-line (please don't do this)
BUCKET_ID,42bp_tooltip,fastq/good,quality_@.fastq,FASTQ file where quality starts with @ (trips up simple FASTQ parsers)

BUCKET_ID,42bp_tooltip,vcf/bad,missing_info_field.vcf,VCF uses field "AN" which is not defined in the header

BUCKET_ID,42bp_tooltip,vcf/good,basic_multisample.bcf,BCF with 1200+ samples
BUCKET_ID,42bp_tooltip,vcf/good,basic_multisample.vcf,VCF with 1200+ samples
BUCKET_ID,42bp_tooltip,vcf/good,compressed.vcf.gz,VCF compressed with bgzip
BUCKET_ID,42bp_tooltip,vcf/good,indexed.bcf,BCF indexed with CSI (TBI not supported for BCF)
BUCKET_ID,42bp_tooltip,vcf/good,indexed_csi.vcf.gz,VCF indexed with CSI
BUCKET_ID,42bp_tooltip,vcf/good,indexed_tbi.vcf.gz,VCF indexed with TBI

--------------------------------------------------------------------------------
/src/generate_bam.sh:
--------------------------------------------------------------------------------
#!/bin/bash

DIR_SRC="${BASH_SOURCE%/*}"
source "$DIR_SRC/lib.sh" || exit

# ==============================================================================
# Good BAM files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/bam/good
DIR_BASIC="$DIR_OUT/basic.bam"

# ------------------------------------------------------------------------------
# SAM files
# ------------------------------------------------------------------------------

log "Creating compressed SAM files (bgzip)"
cp "$DIR_OUT/basic.sam" "$DIR_OUT/compressed.sam"
bgzip -f "$DIR_OUT/compressed.sam"
validate "ok"

log "Creating indexed SAM files (CSI, TBI)"
cp "$DIR_OUT/compressed.sam.gz" "$DIR_OUT/indexed_csi.sam.gz"
cp "$DIR_OUT/compressed.sam.gz" "$DIR_OUT/indexed_tbi.sam.gz"
tabix --csi -p sam "$DIR_OUT/indexed_csi.sam.gz"
tabix -p sam "$DIR_OUT/indexed_tbi.sam.gz"
validate "ok"

# ------------------------------------------------------------------------------
# Indexed
# ------------------------------------------------------------------------------

log "Creating indexed BAM files (BAI, CSI)"
cp "$DIR_BASIC" "$DIR_OUT/indexed_bai.bam"
cp "$DIR_BASIC" "$DIR_OUT/indexed_csi.bam"
samtools index --bai "$DIR_OUT/indexed_bai.bam"
samtools index --csi "$DIR_OUT/indexed_csi.bam"
validate "ok"

# ------------------------------------------------------------------------------
# No mapping information
# ------------------------------------------------------------------------------

log "Creating BAM with no mapped reads"
samtools reset "$DIR_BASIC" -O BAM > "$DIR_OUT/no_mapped_reads.bam"
validate "$(samtools quickcheck "$DIR_OUT/no_mapped_reads.bam" 2>&1 | grep "no targets in header")"


# ==============================================================================
# Bad BAM files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/bam/bad
mkdir -p "$DIR_OUT"

# ------------------------------------------------------------------------------
# Truncated
# ------------------------------------------------------------------------------

log "Creating truncated BAM file"
head -c 10000 "$DIR_BASIC" > "$DIR_OUT/truncated.bam"
validate "$(samtools view "$DIR_OUT/truncated.bam" 2>&1 | grep "EOF marker is absent. The input is probably truncated")"

# ------------------------------------------------------------------------------
# BAI index is older than the BAM file (might need reindexing to reflect latest BAM file contents)
# ------------------------------------------------------------------------------

log "Creating BAI index that is older than the BAM file"
cp "$DIR_BASIC" "$DIR_OUT/bai_older_than_data.bam"
samtools index "$DIR_OUT/bai_older_than_data.bam"
sleep 1  # wait before running touch to make sure the timestamp is different enough
touch "$DIR_OUT/bai_older_than_data.bam"
validate "$(samtools idxstats "$DIR_OUT/bai_older_than_data.bam" 2>&1 | grep "The index file is older than the data file")"

# ------------------------------------------------------------------------------
# Read name > 254 characters (https://github.com/samtools/samtools/issues/1081)
# ------------------------------------------------------------------------------

log "Creating SAM file where a read name is > 254 characters"
awk -v OFS='\t' 'BEGIN {
    filler = "";
    for(i = 0; i < 255; i++) {
        filler=filler"A"
    }
} {
    if($0 !~ /^@/) {
        $1 = $1 filler;
    }
    print
}' "${DIR_BASIC/.bam/.sam}" > "$DIR_OUT/read_name_longer_than_254.sam"
validate "$(samtools view "$DIR_OUT/read_name_longer_than_254.sam" 2>&1 | grep "query name too long")"

--------------------------------------------------------------------------------
/src/generate_bed.sh:
--------------------------------------------------------------------------------
#!/bin/bash

DIR_SRC="${BASH_SOURCE%/*}"
source "$DIR_SRC/lib.sh" || exit

# ==============================================================================
# Good BED files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/bed/good
DIR_BASIC="$DIR_OUT/basic.bed"

# ------------------------------------------------------------------------------
# Unsorted intervals
# ------------------------------------------------------------------------------

log "Creating unsorted BED file"
shuf --random-source=<(echo 42) "$DIR_BASIC" > "$DIR_OUT/unsorted.bed"
validate "$(diff "$DIR_OUT/unsorted.bed" <(bedtools sort -i "$DIR_OUT/unsorted.bed"))"

# ------------------------------------------------------------------------------
# Compressed
# ------------------------------------------------------------------------------
# Note that tabix requires bgzip compression. Also, using `gzip` adds a timestamp
# to the header, which makes it look different to git every time the script runs.
# ------------------------------------------------------------------------------

log "Creating compressed BED file (bgzip)"
cp "$DIR_BASIC" "$DIR_OUT/compressed.bed"
bgzip -f "$DIR_OUT/compressed.bed"
validate "$(diff "$DIR_BASIC" <(gunzip -c "$DIR_OUT/compressed.bed.gz") && echo "ok")"

log "Creating indexed BED.bgz file (TBI, CSI)"
cp "$DIR_OUT/compressed.bed.gz" "$DIR_OUT/indexed_tbi.bed.gz"
cp "$DIR_OUT/compressed.bed.gz" "$DIR_OUT/indexed_csi.bed.gz"
tabix -p bed "$DIR_OUT/indexed_tbi.bed.gz"
tabix -p bed --csi "$DIR_OUT/indexed_csi.bed.gz"
validate "ok"


# ==============================================================================
# Bad BED files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/bed/bad
mkdir -p "$DIR_OUT"

# ------------------------------------------------------------------------------
# Spaces instead of tabs
# ------------------------------------------------------------------------------

log "Creating BED file using spaces instead of tabs"
sed 's/\t/ /g' "$DIR_BASIC" > "$DIR_OUT/spaces.bed"
validate "$(bedtools merge -i "$DIR_OUT/spaces.bed" 2>&1 | grep "unable to open file or unable to determine types")"

# ------------------------------------------------------------------------------
# Invalid ranges
# ------------------------------------------------------------------------------

log "Creating BED file with negative coordinates"
sed 's/3634/-3634/' "$DIR_BASIC" > "$DIR_OUT/negative_coords.bed"
validate "$(bedtools merge -i "$DIR_OUT/negative_coords.bed" 2>&1 | grep "Invalid record in file")"

log "Creating BED file with start > end coordinates"
sed 's/3634/9999/' "$DIR_BASIC" > "$DIR_OUT/start_greater_than_end_coords.bed"
validate "$(bedtools merge -i "$DIR_OUT/start_greater_than_end_coords.bed" 2>&1 | grep "unable to open file or unable to determine types")"

log "Creating BED file with non-integer coordinates"
sed 's/3634/3.63/' "$DIR_BASIC" > "$DIR_OUT/non_integer_coords.bed"
validate "$(bedtools merge -i "$DIR_OUT/non_integer_coords.bed" 2>&1 | grep "unable to open file or unable to determine types")"

--------------------------------------------------------------------------------
/src/generate_fasta.sh:
--------------------------------------------------------------------------------
#!/bin/bash

DIR_SRC="${BASH_SOURCE%/*}"
source "$DIR_SRC/lib.sh" || exit

# ==============================================================================
# Good FASTA files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/fasta/good
DIR_BASIC="$DIR_OUT/basic_dna.fa"

# ------------------------------------------------------------------------------
# Compressed. Use bgzip because gzip adds timestamp in file, which makes git show
# diffs each time this script runs.
# ------------------------------------------------------------------------------

log "Creating compressed FASTA file"
cp "$DIR_BASIC" "$DIR_OUT/compressed.fa"
bgzip -f "$DIR_OUT/compressed.fa"
validate "$(diff "$DIR_BASIC" <(gunzip -c "$DIR_OUT/compressed.fa.gz") && echo "ok")"

# ------------------------------------------------------------------------------
# Multiline FASTA
# ------------------------------------------------------------------------------

log "Creating FASTA split across multiple lines"
seqtk seq -l 20 "$DIR_BASIC" > "$DIR_OUT/multiline.fa"
validate "$(diff <(wc -l < "$DIR_OUT/multiline.fa") <(wc -l < "$DIR_BASIC"))"

# ------------------------------------------------------------------------------
# Sequence names with spaces
# ------------------------------------------------------------------------------

log "Creating FASTA with spaces in sequence name"
seqtk rename "$DIR_BASIC" "prefix with spaces" > "$DIR_OUT/name_contains_spaces.fa"
validate "$(diff <(grep -c " " "$DIR_BASIC") <(grep -c " " "$DIR_OUT/name_contains_spaces.fa"))"

# ------------------------------------------------------------------------------
# Duplicate sequence names
# ------------------------------------------------------------------------------

log "Creating FASTA with duplicate sequence names"
sed 's/sequence1/sequence2/g' "$DIR_BASIC" > "$DIR_OUT/duplicate_sequence_names.fa"
validate "$(diff <(grep '>' "$DIR_BASIC") <(grep '>' "$DIR_OUT/duplicate_sequence_names.fa"))"

# ------------------------------------------------------------------------------
# Spaces between sequences...
# ------------------------------------------------------------------------------

log "Creating FASTA with empty lines between sequences"
sed 's/>sequence/\n>sequence/g' "$DIR_BASIC" > "$DIR_OUT/empty_lines.fa"
validate "$(diff <(wc -l < "$DIR_BASIC") <(wc -l < "$DIR_OUT/empty_lines.fa"))"

--------------------------------------------------------------------------------
/src/generate_fastq.sh:
--------------------------------------------------------------------------------
#!/bin/bash

DIR_SRC="${BASH_SOURCE%/*}"
source "$DIR_SRC/lib.sh" || exit

# ==============================================================================
# Good FASTQ files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/fastq/good
DIR_BASIC_R1="$DIR_OUT/basic_R1.fastq"
DIR_BASIC_R2="$DIR_OUT/basic_R2.fastq"

# ------------------------------------------------------------------------------
# Compressed
# ------------------------------------------------------------------------------

log "Creating compressed FASTQ file"
cp "$DIR_BASIC_R1" "$DIR_OUT/compressed.fastq"
bgzip -f "$DIR_OUT/compressed.fastq"
validate "$(diff "$DIR_BASIC_R1" <(gunzip -c "$DIR_OUT/compressed.fastq.gz") && echo "ok")"

# ------------------------------------------------------------------------------
# Quality line starts with "@", which can trip up FASTQ parsers
# ------------------------------------------------------------------------------

log "Creating FASTQ file where the quality line (line 4) starts with @"
sed '4s/./@/' "$DIR_BASIC_R1" > "$DIR_OUT/quality_@.fastq"
validate "$(diff <(echo 3) <(grep -c "^@" "$DIR_OUT/quality_@.fastq"))"

# ------------------------------------------------------------------------------
# Sequence/quality on multiple lines, which breaks parsers that assume 4 lines per read.
# This is quite rare but is technically a valid FASTQ file.
# ------------------------------------------------------------------------------

log "Creating FASTQ split across multiple lines"
seqtk seq -l 20 "$DIR_BASIC_R1" > "$DIR_OUT/multiline.fastq"
validate "$(diff <(wc -l < "$DIR_OUT/multiline.fastq") <(wc -l < "$DIR_BASIC_R1"))"

# ------------------------------------------------------------------------------
# Non-empty "+" line
# ------------------------------------------------------------------------------

log "Creating FASTQ with a + line with duplicate read names"
awk 'BEGIN { read = -1 } { if((NR-1) % 4 == 0) { read=$0; } if($0 == "+") { sub("@", "+", read); print(read) } else { print } }' "$DIR_BASIC_R1" > "$DIR_OUT/duplicate_+.fastq"
validate "$(diff "$DIR_BASIC_R1" "$DIR_OUT/duplicate_+.fastq" )"

# ------------------------------------------------------------------------------
# Interleaved R1 and R2 reads
# ------------------------------------------------------------------------------

log "Creating interleaved FASTQ (read name does not contain R1/R2)"
seqtk mergepe "$DIR_BASIC_R1" "$DIR_BASIC_R2" > "$DIR_OUT/interleaved.fastq"
# Read both files via stdin so wc prints only the line count (no file name)
validate "$(diff <(wc -l < "$DIR_BASIC_R1") <(wc -l < "$DIR_OUT/interleaved.fastq"))"


# ==============================================================================
# Bad FASTQ files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/fastq/bad
mkdir -p "$DIR_OUT"

# ------------------------------------------------------------------------------
# Truncated
# ------------------------------------------------------------------------------

log "Creating truncated FASTQ (missing lines)"
head -n 10 "$DIR_BASIC_R1" > "$DIR_OUT/truncated_clean.fastq"
validate "$(diff <(seqtk seq "$DIR_BASIC_R1" | wc -l) <(seqtk seq "$DIR_OUT/truncated_clean.fastq" | wc -l))"

log "Creating truncated FASTQ (half-way through a line)"
head -c 220 "$DIR_BASIC_R1" > "$DIR_OUT/truncated_halfway.fastq"
validate "$(diff <(seqtk seq "$DIR_BASIC_R1" | wc -l) <(seqtk seq "$DIR_OUT/truncated_halfway.fastq" | wc -l))"

# ------------------------------------------------------------------------------
# Not truncated but a read in the middle has len(sequence) != len(quality)
# ------------------------------------------------------------------------------

log "Creating FASTQ file where we're missing quality scores for some bases"
sed '8s/.....//' "$DIR_BASIC_R1" > "$DIR_OUT/quality_mismatch.fastq"
# Validate the file created above (not truncated_clean.fastq)
validate "$(diff <(seqtk seq "$DIR_BASIC_R1" | wc -l) <(seqtk seq "$DIR_OUT/quality_mismatch.fastq" | wc -l))"

--------------------------------------------------------------------------------
/src/generate_vcf.sh:
--------------------------------------------------------------------------------
#!/bin/bash

DIR_SRC="${BASH_SOURCE%/*}"
source "$DIR_SRC/lib.sh" || exit

# ==============================================================================
# Good VCF files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/vcf/good
DIR_BASIC="$DIR_OUT/basic.vcf"

# Compressed
log "Creating compressed VCF file (bgzip)"
cp "$DIR_BASIC" "$DIR_OUT/compressed.vcf"
bgzip -f "$DIR_OUT/compressed.vcf"
validate "$(diff "$DIR_BASIC" <(gunzip -c "$DIR_OUT/compressed.vcf.gz") && echo "ok")"

# Indexed VCF
log "Creating indexed VCF file (TBI, CSI)"
cp "$DIR_OUT/compressed.vcf.gz" "$DIR_OUT/indexed_csi.vcf.gz"
cp "$DIR_OUT/compressed.vcf.gz" "$DIR_OUT/indexed_tbi.vcf.gz"
bcftools index --csi "$DIR_OUT/indexed_csi.vcf.gz"
bcftools index --tbi "$DIR_OUT/indexed_tbi.vcf.gz"
validate "ok"

# Indexed BCF
log "Creating indexed BCF file (CSI)"
cp "$DIR_OUT/basic.bcf" "$DIR_OUT/indexed.bcf"
bcftools index "$DIR_OUT/indexed.bcf"
validate "ok"

# ==============================================================================
# Bad VCF files
# ==============================================================================

DIR_OUT=$DIR_SRC/../data/vcf/bad

# Missing an INFO field (bcftools outputs warnings)
log "Creating VCF with a missing INFO field"
sed "/^##INFO= "$DIR_OUT/missing_info_field.vcf"
validate "$(bcftools view "$DIR_OUT/missing_info_field.vcf" 2>&1 | grep "W::vcf_parse_info")"

--------------------------------------------------------------------------------
/src/lib.sh:
--------------------------------------------------------------------------------
#!/bin/bash

# Print a progress message without a trailing newline
function log() {
    echo -n "$1... "
}

# Report "done" if the argument is non-empty; otherwise report "failed" and stop
function validate() {
    if [[ "$1" != "" ]]; then
        echo "done";
    else
        echo "failed";
        exit 1  # exit non-zero so the failure is visible to the caller
    fi
}

--------------------------------------------------------------------------------
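
The `log`/`validate` pattern used by the `generate_[format].sh` scripts can be tried without any bioinformatics tools installed. The sketch below is illustrative only: it re-declares the two helpers so it runs on its own (a real script would `source lib.sh` instead), and the scratch directory `WORK_DIR` and the two-line BED input are assumptions made up for this example, not files from the repository.

```shell
#!/bin/bash
# Sketch only: helpers are re-declared here so the example is self-contained.

log() {
    echo -n "$1... "
}

validate() {
    if [[ "$1" != "" ]]; then
        echo "done"
    else
        echo "failed"
        exit 1
    fi
}

# Hypothetical scratch directory and input file (not part of the repository)
WORK_DIR="$(mktemp -d)"
trap 'rm -rf "$WORK_DIR"' EXIT
printf 'chr1\t100\t200\nchr1\t300\t400\n' > "$WORK_DIR/basic.bed"

log "Creating BED file using spaces instead of tabs"
tr '\t' ' ' < "$WORK_DIR/basic.bed" > "$WORK_DIR/spaces.bed"
# grep -L prints the file name only when no line contains a tab, so the
# argument is non-empty (and validate succeeds) only if every tab was replaced
validate "$(grep -L $'\t' "$WORK_DIR/spaces.bed")"
```

Because `validate` only checks whether its argument is non-empty, each check has to be phrased so that a successful step produces some output, which is why the scripts above pipe tool output through `grep` or `diff`.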