├── requirements.txt ├── .gitignore ├── templates ├── assemblers.j2 ├── processors.j2 └── header.md ├── .github └── workflows │ └── python-app.yml ├── render.py ├── data ├── assemblers.csv └── processors.csv └── README.md /requirements.txt: -------------------------------------------------------------------------------- 1 | GitPython 2 | jinja2 -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .vscode/ 2 | __pycache__/ 3 | -------------------------------------------------------------------------------- /templates/assemblers.j2: -------------------------------------------------------------------------------- 1 | ## Genome assemblers 2 | 3 | {% macro assembler_tbl(assemblers, tech) -%} 4 | | Assembler | Publication | Last update | 5 | |:----------|:------------|:------------| 6 | {%- for a in assemblers -%} 7 | {% if a.technology == tech %} 8 | | [{{a.name}}]({{a.link}}) | {{a.publication}} | {{a.last_update}} | 9 | {%- endif %} 10 | {%- endfor %} 11 | {%- endmacro -%} 12 | 13 | ### Sanger reads 14 | 15 | {{ assembler_tbl(assemblers, "Sanger") }} 16 | 17 | 18 | ### High-accuracy short reads 19 | 20 | {{ assembler_tbl(assemblers, "High-accuracy short reads") }} 21 | 22 | ### Low-accuracy long reads 23 | 24 | {{ assembler_tbl(assemblers, "Low-accuracy long reads") }} 25 | 26 | ### High-accuracy long reads 27 | 28 | {{ assembler_tbl(assemblers, "High-accuracy long reads") }} 29 | -------------------------------------------------------------------------------- /templates/processors.j2: -------------------------------------------------------------------------------- 1 | ## Assembly pre and post-processing 2 | 3 | {% macro processors_tbl(procs, task) -%} 4 | | Reads | Tool | Publication | Last update | 5 | |:------|:------|:------------| ----------- | 6 | {%- for p in procs -%} 7 | {% if p.task == task %} 8 | | {{p.reads}} | 
[{{p.name}}]({{p.link}}) | {{p.publication}} | {{p.last_update}} | 9 | {%- endif %} 10 | {%- endfor %} 11 | {%- endmacro -%} 12 | 13 | ### Long-read error correction 14 | 15 | {{ processors_tbl(processors, "Long-read error correction") }} 16 | 17 | ### Polishing 18 | 19 | {{ processors_tbl(processors, "Polishing") }} 20 | 21 | ### Haplotig purging 22 | 23 | {{ processors_tbl(processors, "Haplotig purging") }} 24 | 25 | ### Scaffolding 26 | 27 | {{ processors_tbl(processors, "Scaffolding") }} 28 | 29 | ### Gap filling 30 | 31 | {{ processors_tbl(processors, "Gap filling") }} 32 | -------------------------------------------------------------------------------- /.github/workflows/python-app.yml: -------------------------------------------------------------------------------- 1 | # This workflow installs Python dependencies and regenerates README.md with a single version of Python 2 | # For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python 3 | 4 | name: Update README 5 | 6 | # Scheduled to run once a month 7 | on: 8 | schedule: 9 | - cron: '1 0 1 * *' 10 | workflow_dispatch: 11 | 12 | permissions: 13 | contents: write 14 | 15 | jobs: 16 | build: 17 | runs-on: ubuntu-latest 18 | steps: 19 | - uses: actions/checkout@v3 20 | - name: Set up Python 3.10 21 | uses: actions/setup-python@v3 22 | with: 23 | python-version: "3.10" 24 | - name: Install dependencies 25 | run: | 26 | python -m pip install --upgrade pip 27 | if [ -f requirements.txt ]; then pip install -r requirements.txt; fi 28 | - name: Update README with last update dates for all softwares 29 | run: python render.py > README.md 30 | - uses: stefanzweifel/git-auto-commit-action@v4.14.1 31 | with: 32 | commit_message: "docs: update README with latest dates and softwares" 33 | branch: main 34 | -------------------------------------------------------------------------------- /templates/header.md: 
-------------------------------------------------------------------------------- 1 | 2 | # Genome assembly tools 3 | 4 | If you appreciate this work, please cite: "A deep dive into genome assemblies of non-vertebrate animals." Guiglielmoni N, Rivera-Vicéns R, Koszul R, Flot J-F. Peer Community Journal, 2022. [doi:10.24072/pcjournal.128](https://peercommunityjournal.org/articles/10.24072/pcjournal.128/) 5 | 6 | ## Contributing 7 | 8 | To add a tool, add a line to the corresponding CSV file: 9 | * [data/assemblers.csv](data/assemblers.csv) for genome assemblers. 10 | * [data/processors.csv](data/processors.csv) for assembly pre- or post-processing tools. 11 | 12 | Modifications to this README should be made in the template file of the corresponding section (see [templates](templates)). 13 | Every month, a GitHub Action automatically regenerates the README from the data and templates, fetching the latest commit date for each tool. 14 | 15 | ## Table of contents 16 | * [Genome assemblers](#Genome-assemblers) 17 | * [Sanger reads](#Sanger-reads) 18 | * [High-accuracy short reads](#High-accuracy-short-reads) 19 | * [Low-accuracy long reads](#Low-accuracy-long-reads) 20 | * [High-accuracy long reads](#High-accuracy-long-reads) 21 | * [Assembly pre and post-processing](#Assembly-pre-and-post-processing) 22 | * [Long-read error correction](#Long-read-error-correction) 23 | * [Polishing](#Polishing) 24 | * [Haplotig purging](#Haplotig-purging) 25 | * [Scaffolding](#Scaffolding) 26 | * [Gap filling](#Gap-filling) 27 | 28 | -------------------------------------------------------------------------------- /render.py: -------------------------------------------------------------------------------- 1 | """Dynamically renders the readme from its jinja2 template and csv of assemblers""" 2 | import os 3 | import csv 4 | from dataclasses import dataclass 5 | import logging 6 | import tempfile 7 | import time 8 | from typing import Iterable, List, Optional, Type, 
TypeVar 9 | 10 | import git 11 | from jinja2 import Environment, FileSystemLoader 12 | 13 | logging.basicConfig(format="%(asctime)s - %(message)s", level=logging.INFO) 14 | 15 | 16 | def get_last_commit_date(url: str) -> str: 17 | """Clone only the .git folder from target remote 18 | into a tempdir and retrieve the latest commit date. 19 | 20 | Parameters 21 | ---------- 22 | url: str 23 | URL of the git repository. 24 | 25 | Returns 26 | ------- 27 | str: 28 | The date of the last commit in YYYY-MM format. 29 | """ 30 | with tempfile.TemporaryDirectory() as repo_dir: 31 | cloned = git.Repo.clone_from(url, repo_dir, no_checkout=True) 32 | auth_date = cloned.head.commit.authored_datetime 33 | return auth_date.strftime("%Y-%m") 34 | 35 | 36 | @dataclass 37 | class Software: 38 | """Defines standard fields and behaviours for all softwares.""" 39 | 40 | name: str 41 | link: Optional[str] 42 | publication: Optional[str] 43 | last_update: Optional[str] 44 | 45 | def __post_init__(self): 46 | if self.link: 47 | self.set_last_commit_date() 48 | # Don't spam git server 49 | time.sleep(0.1) 50 | 51 | def set_last_commit_date(self): 52 | """Read the remote repo to find the latest commit date""" 53 | try: 54 | self.last_update = get_last_commit_date(self.link) 55 | # If this is not a git repo, do nothing 56 | except git.GitCommandError: 57 | pass 58 | 59 | 60 | @dataclass 61 | class Assembler(Software): 62 | """A software used to assemble a genome""" 63 | 64 | technology: str 65 | 66 | 67 | @dataclass 68 | class Processor(Software): 69 | """Software used to perform a pre/post-processing 70 | task on data in genome assembly.""" 71 | 72 | task: str 73 | reads: str 74 | 75 | 76 | S = TypeVar("S", Software, Assembler, Processor) 77 | 78 | 79 | def load_softwares(path: str, soft_type: Type[S]) -> List[S]: 80 | """Load a bunch of softwares from CSV file. 81 | 82 | Parameters 83 | ---------- 84 | path: str 85 | Path to CSV file containing the list of softwares. 
86 | soft_type: type Software, Assembler or Processor 87 | The class to use to represent softwares. This 88 | affects the fields available.""" 89 | softs = [] 90 | with open(path, "r", newline="") as csvfile: 91 | rows = list(csv.DictReader(csvfile)) 92 | n_softs = len(rows) 93 | for idx, row in enumerate(rows, start=1): 94 | # csv fields expand to dataclass attrs 95 | softs.append(soft_type(**row)) 96 | logging.info( 97 | f'({idx} / {n_softs}) {soft_type.__name__} Done: {row["name"]}' 98 | ) 99 | return softs 100 | 101 | 102 | def fmt_processors(procs: Iterable[Processor]) -> List[Processor]: 103 | """Format a list of processors so that: 104 | + They are sorted by task, reads 105 | + Reads field is in bold 106 | + Only the first of each read type per task has a value 107 | """ 108 | read_type = "" 109 | task = "" 110 | first = True 111 | fmt_procs = sorted(procs, key=lambda x: (x.task, x.reads)) 112 | 113 | for pr in fmt_procs: 114 | # First processor in category ? 115 | if (pr.task != task) or (pr.reads != read_type): 116 | first = True 117 | 118 | read_type = pr.reads 119 | task = pr.task 120 | 121 | if not first: 122 | pr.reads = "" 123 | first = False 124 | 125 | # Add bold markup (double underscores render as bold in Markdown) 126 | for pr in fmt_procs: 127 | if pr.reads: 128 | pr.reads = f"__{pr.reads}__" 129 | 130 | return fmt_procs 131 | 132 | 133 | env = Environment(loader=FileSystemLoader("."), autoescape=False) 134 | 135 | # Load list of softwares and render templates consecutively. 
136 | # Just dump everything to stdout to compose the readme 137 | 138 | ### HEADER ### 139 | 140 | with open("templates/header.md") as header: 141 | print(header.read()) 142 | 143 | ### ASSEMBLERS ### 144 | 145 | assemblers = load_softwares("data/assemblers.csv", Assembler) 146 | template = env.get_template("templates/assemblers.j2") 147 | print(template.render(assemblers=assemblers)) 148 | 149 | ### PRE/POST PROCESSORS ### 150 | 151 | procs = load_softwares("data/processors.csv", Processor) 152 | template = env.get_template("templates/processors.j2") 153 | # Gotta sort processors and edit them a bit for fancy md formatting 154 | fmt_procs = fmt_processors(procs) 155 | print(template.render(processors=fmt_procs)) 156 | 157 | -------------------------------------------------------------------------------- /data/assemblers.csv: -------------------------------------------------------------------------------- 1 | technology,name,link,publication,last_update 2 | Sanger,ARACHNE,,10.1101/gr.208902, 3 | Sanger,Atlas,https://www.hgsc.bcm.edu/software/atlas-2,10.1101/gr.2264004,2013 4 | Sanger,CAP3,https://faculty.sites.iastate.edu/xqhuang/cap3-assembly-program,10.1101/gr.9.9.868, 5 | Sanger,Celera,,10.1093/bioinformatics/btn074, 6 | Sanger,Euler,,10.1073/pnas.171285098, 7 | Sanger,JAZZ,,10.1126/science.1072104, 8 | Sanger,Minimus,,10.1186/1471-2105-8-64, 9 | Sanger,phrap,,10.1101/gr.8.3.186, 10 | Sanger,Phusion,,10.1101/gr.731003, 11 | Sanger,TIGR,,10.1089/gst.1995.1.9, 12 | High-accuracy short reads,ABySS,https://github.com/bcgsc/abyss,10.1101/gr.214346.116,2022 13 | High-accuracy short reads,ALLPATHS,https://software.broadinstitute.org/allpaths-lg/blog/?page_id=12,10.1101/gr.7337908,2008 14 | High-accuracy short reads,BASE,https://github.com/dhlbh/BASE,10.1186/s12864-016-2829-5,2017 15 | High-accuracy short reads,CABOG,http://wgs-assembler.sourceforge.net,10.1093/bioinformatics/btn548,2008 16 | High-accuracy short 
reads,Edena,http://www.genomic.ch/edena.php,10.1101/gr.072033.107,2013 17 | High-accuracy short reads,EPGA,https://github.com/bioinfomaticsCSU/EPGA,10.1093/bioinformatics/btu762,2017 18 | High-accuracy short reads,Euler-SR,http://web.archive.org/web/20110720080755/http://euler-assembler.ucsd.edu/euler-sr.1.1.2.tgz,10.1101/gr.7088808,2011 19 | High-accuracy short reads,Gossamer,,10.1093/bioinformatics/bts297,2012 20 | High-accuracy short reads,IDBA,https://github.com/loneknightpy/idba,10.1007/978-3-642-12683-3_28,2016 21 | High-accuracy short reads,ISEA,,10.1109/TCBB.2016.2550433, 22 | High-accuracy short reads,JR-Assembler,,10.1073/pnas.1314090110, 23 | High-accuracy short reads,LightAssembler,,10.1093/bioinformatics/btw470, 24 | High-accuracy short reads,Meraculous,,10.1371/journal.pone.0023501, 25 | High-accuracy short reads,Minia,https://github.com/GATB/minia,10.1186/1748-7188-8-22,2022 26 | High-accuracy short reads,Mira,,10.1.1.23.7465, 27 | High-accuracy short reads,Newbler,,, 28 | High-accuracy short reads,PCAP,https://faculty.sites.iastate.edu/xqhuang/pcap-assembly-program,10.1101/gr.1390403, 29 | High-accuracy short reads,PE-Assembler,,10.1093/bioinformatics/btq626, 30 | High-accuracy short reads,PERGA,,10.1371/journal.pone.0114253, 31 | High-accuracy short reads,Platanus,,10.1101/gr.170720.113, 32 | High-accuracy short reads,QSRA,,10.1186/1471-2105-10-69, 33 | High-accuracy short reads,Ray,,10.1089/cmb.2009.0238, 34 | High-accuracy short reads,Readjoiner,,10.1186/1471-2105-13-82, 35 | High-accuracy short reads,SGA,,10.1101/gr.126953.111, 36 | High-accuracy short reads,SHARCGS,,10.1101/gr.6435207, 37 | High-accuracy short reads,SOAPdenovo,,10.1101/gr.097261.109, 38 | High-accuracy short reads,SOAPdenovo2,,10.1186/2047-217X-1-18, 39 | High-accuracy short reads,SPAdes,,10.1089/cmb.2012.0021, 40 | High-accuracy short reads,SparseAssembler,,10.1186/1471-2105-13-S6-S1, 41 | High-accuracy short reads,SSAKE,,10.1093/bioinformatics/btl629, 42 | High-accuracy short 
reads,SUTTA,,10.1093/bioinformatics/btq646, 43 | High-accuracy short reads,VCAKE,,10.1093/bioinformatics/btm451, 44 | High-accuracy short reads,Velvet,,10.1002/0471250953.bi1105s31, 45 | High-accuracy short reads,Taipan,,10.1093/bioinformatics/btp374, 46 | Low-accuracy long reads,Canu,https://github.com/marbl/canu,10.1101/gr.215087.116,2021 47 | Low-accuracy long reads,FALCON,https://github.com/PacificBiosciences/FALCON,10.1038/nmeth.4035,2021 48 | Low-accuracy long reads,Flye,https://github.com/fenderglass/Flye,10.1038/s41587-019-0072-8,2022 49 | Low-accuracy long reads,GoldRush,https://github.com/bcgsc/goldrush,10.1101/2022.10.25.513734,2022 50 | Low-accuracy long reads,HINGE,https://github.com/HingeAssembler/HINGE,10.1101/gr.216465.116,2021 51 | Low-accuracy long reads,MECAT,https://github.com/xiaochuanle/MECAT,10.1038/nmeth.4432,2019 52 | Low-accuracy long reads,MECAT2,https://github.com/xiaochuanle/MECAT2,10.1038/nmeth.4432,2020 53 | Low-accuracy long reads,miniasm,https://github.com/lh3/miniasm,10.1038/nmeth.4432,2020 54 | Low-accuracy long reads,NECAT,https://github.com/xiaochuanle/NECAT,10.1038/s41467-020-20236-7,2021 55 | Low-accuracy long reads,NextDenovo,https://github.com/Nextomics/NextDenovo,10.1186/s13059-024-03252-4,2022 56 | Low-accuracy long reads,Ra,https://github.com/lbcb-sci/ra,10.1109/ISPA.2019.8868909,2020 57 | Low-accuracy long reads,Raven,https://github.com/lbcb-sci/raven,10.1038/s43588-021-00073-4,2021 58 | Low-accuracy long reads,SMARTdenovo,https://github.com/ruanjue/smartdenovo,10.20944/preprints202009.0207.v1,2021 59 | Low-accuracy long reads,wtdbg,https://github.com/ruanjue/wtdbg,,2021 60 | Low-accuracy long reads,wtdbg2,https://github.com/ruanjue/wtdbg2,10.1038/s41592-019-0669-3,2021 61 | Low-accuracy long reads,shasta,https://github.com/paoloshasta/shasta,10.1038/s41587-020-0503-6,2023 62 | High-accuracy long reads,Alice-asm,https://github.com/rolandfaure/alice-asm,,2024 63 | High-accuracy long 
reads,Flye,https://github.com/fenderglass/Flye,10.1038/s41587-019-0072-8,2022 64 | High-accuracy long reads,HiCanu,https://github.com/marbl/canu,10.1101/gr.215087.116,2021 65 | High-accuracy long reads,hifiasm,https://github.com/chhylp123/hifiasm,10.1038/s41592-020-01056-5,2022 66 | High-accuracy long reads,IPA,https://github.com/PacificBiosciences/pbipa,,2021 67 | High-accuracy long reads,LJA,https://github.com/AntonBankevich/LJA,10.1101/2020.12.10.420448,2021 68 | High-accuracy long reads,mdBG,https://github.com/ekimb/rust-mdbg/,10.1016/j.cels.2021.08.009,2021 69 | High-accuracy long reads,MBG,https://github.com/maickrau/MBG,10.1093/bioinformatics/btab004,2021 70 | High-accuracy long reads,NextDenovo,https://github.com/Nextomics/NextDenovo,10.1186/s13059-024-03252-4,2022 71 | High-accuracy long reads,PECAT,https://github.com/lemene/PECAT,10.1101/2022.09.25.509436,2024 72 | High-accuracy long reads,Peregrine,https://github.com/cschin/Peregrine,,2021 73 | High-accuracy long reads,Raven,https://github.com/lbcb-sci/raven,10.1038/s43588-021-00073-4,2021 74 | High-accuracy long reads,verkko,https://github.com/marbl/verkko,10.1101/2022.06.24.497523,2022 75 | High-accuracy long reads,wtdbg2,https://github.com/ruanjue/wtdbg2,10.1038/s41592-019-0669-3,2021 76 | -------------------------------------------------------------------------------- /data/processors.csv: -------------------------------------------------------------------------------- 1 | task,reads,name,link,publication,last_update 2 | Long-read error correction,Short reads,CoLoRMap,https://github.com/cchauve/CoLoRMap,10.1093/bioinformatics/btw463,2018 3 | Long-read error correction,Short reads,Hercules,https://github.com/BilkentCompGen/Hercules,10.1093/nar/gky724,2020 4 | Long-read error correction,Short reads,HG-CoLoR,https://github.com/morispi/HG-CoLoR,10.1093/bioinformatics/bty521,2021 5 | Long-read error correction,Short reads,Jabba,https://github.com/biointec/jabba,10.1186/s13015-016-0075-7,2021 6 | Long-read 
error correction,Short reads,LoRDEC,https://gite.lirmm.fr/lordec/lordec-releases/-/wikis/home, 10.1093/bioinformatics/btu538,2020 7 | Long-read error correction,Short reads,LoRMA,https://gite.lirmm.fr/lorma/lorma-releases/-/wikis/home,10.1093/bioinformatics/btw321,2019 8 | Long-read error correction,Short reads,NaS,https://github.com/institut-de-genomique/NaS,10.1186/s12864-015-1519-z,2018 9 | Long-read error correction,Short reads,proovread,https://github.com/BioInf-Wuerzburg/proovread,10.1093/bioinformatics/btu392,2021 10 | Long-read error correction,Short reads,Ratatosk,https://github.com/DecodeGenetics/Ratatosk,10.1186/s13059-020-02244-4,2022 11 | Long-read error correction,Long reads,Canu,https://github.com/marbl/canu,10.1101/gr.215087.116,2021 12 | Long-read error correction,Long reads,CONSENT,https://github.com/morispi/CONSENT,10.1038/s41598-020-80757-5,2022 13 | Long-read error correction,Long reads,Daccord,https://github.com/gt1/daccord,10.1101/106252,2020 14 | Long-read error correction,Long reads,FLAS,https://github.com/baoe/FLAS,10.1093/bioinformatics/btz206,2019 15 | Long-read error correction,Long reads,HALC,https://github.com/lanl001/halc,10.1186/s12859-017-1610-3,2018 16 | Long-read error correction,Long reads,MECAT,https://github.com/xiaochuanle/MECAT,10.1038/nmeth.4432,2019 17 | Long-read error correction,Long reads,MECAT2,https://github.com/xiaochuanle/MECAT2,10.1038/nmeth.4432,2020 18 | Long-read error correction,Long reads,NECAT,https://github.com/xiaochuanle/NECAT,10.1038/s41467-020-20236-7,2021 19 | Long-read error correction,Long reads,NextDenovo,https://github.com/Nextomics/NextDenovo,,2022 20 | Polishing,Short reads,ntEdit,https://github.com/bcgsc/ntEdit,10.1093/bioinformatics/btz400,2022 21 | Polishing,Short reads,Pilon,https://github.com/broadinstitute/pilon,10.1371/journal.pone.0112963,2021 22 | Polishing,Short reads,POLCA,https://github.com/alekseyzimin/masurca,10.1371/journal.pcbi.1007981,2022 23 | Polishing,Short 
reads,Apollo,https://github.com/CMU-SAFARI/Apollo,10.1093/bioinformatics/btaa179,2022 24 | Polishing,Long reads + short reads,Hapo-G,https://github.com/institut-de-genomique/HAPO-G,10.1093/nargab/lqab034,2022 25 | Polishing,Long reads + short reads,HyPo,https://github.com/kensung-lab/hypo,10.1101/2019.12.19.882506,2020 26 | Polishing,Long reads + short reads,Racon,https://github.com/isovic/racon,10.1101/gr.214270.116,2022 27 | Polishing,Long reads,Arrow,,,2014 28 | Polishing,Long reads,CONSENT,https://github.com/morispi/CONSENT,10.1038/s41598-020-80757-5,2022 29 | Polishing,Long reads,GoldRush,https://github.com/bcgsc/goldrush,10.1101/2022.10.25.513734,2022 30 | Polishing,Long reads,Quiver,,,2014 31 | Haplotig purging,Long reads,HaploMerger2,https://github.com/mapleforest/HaploMerger2,10.1093/bioinformatics/btx220,2021 32 | Haplotig purging,Long reads,purge_dups,https://github.com/dfguan/purge_dups,10.1093/bioinformatics/btaa025,2021 33 | Haplotig purging,Long reads,Purge Haplotigs,https://bitbucket.org/mroachawri/purge_haplotigs,10.1186/s12859-018-2485-7,2022 34 | Haplotig purging,Long reads + short reads,Redundans,https://github.com/lpryszcz/redundans,10.1093/nar/gkw294,2021 35 | Scaffolding,Short reads,Bambus,,10.1101/gr.1536204, 36 | Scaffolding,Mate pairs,BATISCAF,,10.1101/330472, 37 | Scaffolding,Mate pairs,BESST,,10.1186/1471-2105-15-281, 38 | Scaffolding,Mate pairs,BOSS,,10.1093/bioinformatics/btw597, 39 | Scaffolding,Mate pairs,GRASS,,10.1093/bioinformatics/bts175, 40 | Scaffolding,Mate pairs,MIP,,10.1093/bioinformatics/btr562, 41 | Scaffolding,Mate pairs,Opera,,10.1089/cmb.2011.0170, 42 | Scaffolding,Mate pairs,ScaffMatch,,10.1093/bioinformatics/btv211, 43 | Scaffolding,Mate pairs,ScaffoldScaffolder,,10.1093/bioinformatics/btv548, 44 | Scaffolding,Mate pairs,SCARPA,,10.1093/bioinformatics/bts716, 45 | Scaffolding,Mate pairs,SCOP,,10.1093/bioinformatics/bty773, 46 | Scaffolding,Mate pairs,SLIQ,,10.1089/cmb.2011.0263, 47 | Scaffolding,Mate 
pairs,SOPRA,,10.1186/1471-2105-11-345, 48 | Scaffolding,Mate pairs,SSPACE,,10.1093/bioinformatics/btq683, 49 | Scaffolding,Mate pairs,WiseScaffolder,,10.1186/s12859-015-0705-y, 50 | Scaffolding,Long reads,DENTIST,https://github.com/a-ludi/dentist,10.1093/gigascience/giab100,2022 51 | Scaffolding,Long reads,FinisherSC,https://github.com/kakitone/finishingTool,10.1093/bioinformatics/btv280,2022 52 | Scaffolding,Long reads,gapless,,10.1101/2022.03.08.483466, 53 | Scaffolding,Long reads,GoldRush,https://github.com/bcgsc/goldrush,10.1101/2022.10.25.513734,2022 54 | Scaffolding,Long reads,LINKS,https://github.com/bcgsc/LINKS,10.1186/s13742-015-0076-3,2022 55 | Scaffolding,Long reads,LRScaf,https://github.com/shingocat/lrscaf,10.1186/s12864-019-6337-2,2021 56 | Scaffolding,Long reads,npScarf,https://github.com/mdcao/npScarf,10.1038/ncomms14515,2019 57 | Scaffolding,Long reads,PBJelly,https://sourceforge.net/projects/pb-jelly/,10.1371/journal.pone.0047768,2017 58 | Scaffolding,Long reads,RAILS,https://github.com/bcgsc/RAILS,10.21105/joss.00116,2021 59 | Scaffolding,Long reads,SLR,https://github.com/luojunwei/SLR,10.1186/s12859-019-3114-9,2020 60 | Scaffolding,Long reads,msscaf,https://github.com/mzytnicki/msscaf,,2022 61 | Scaffolding,Long reads,SMIS,https://github.com/wtsi-hpag/smis,,2018 62 | Scaffolding,Long reads,SMSC,https://github.com/UTbioinf/SMSC,10.1186/s12864-017-4271-8,2019 63 | Scaffolding,Long reads,SSPACE-LongRead,,10.1186/1471-2105-15-211,2014 64 | Scaffolding,Genetic maps,ALLMAPS,https://github.com/tanghaibao/jcvi/wiki/ALLMAPS,10.1186/s13059-014-0573-1,2022 65 | Scaffolding,Optical maps,AGORA,https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-13-189/MediaObjects/12859_2012_5306_MOESM3_ESM.zip,10.1186/1471-2105-13-189,2012 66 | Scaffolding,Optical maps,BiSCoT,https://github.com/institut-de-genomique/biscot,10.7717/peerj.10150,2021 67 | Scaffolding,Optical maps,OMGS,https://github.com/ucrbioinfo/OMGS,10.1089/cmb.2019.0310,2021 68 | 
Scaffolding,Optical maps,SewingMachine,https://github.com/i5K-KINBRE-script-share/Irys-scaffolding/blob/master/KSU_bioinfo_lab/stitch/sewing_machine_LAB.md,10.1186/s12864-015-1911-8,2015 69 | Scaffolding,Optical maps,SOMA,ftp://ftp.cbcb.umd.edu/pub/software/soma,10.1093/bioinformatics/btn102,2008 70 | Scaffolding,Linked reads,ARBitR,https://github.com/markhilt/ARBitR,10.1093/bioinformatics/btaa975,2021 71 | Scaffolding,Linked reads,Architect,https://github.com/kuleshov/architect,10.1093/bioinformatics/btw267,2016 72 | Scaffolding,Linked reads,ARCS,https://github.com/bcgsc/ARCS/,10.1093/bioinformatics/btx675,2022 73 | Scaffolding,Linked reads,ARKS,https://github.com/bcgsc/arks,10.1186/s12859-018-2243-x,2019 74 | Scaffolding,Linked reads,fragScaff,https://github.com/adeylab/fragScaff,10.1101/gr.178319.114,2018 75 | Scaffolding,Linked reads,scaff10X,https://github.com/wtsi-hpag/Scaff10X,,2022 76 | Scaffolding,Linked reads,SpLitteR,https://github.com/ablab/spades/releases/tag/splitter-paper,,2022-12 77 | Scaffolding,Linked reads,msscaf,https://github.com/mzytnicki/msscaf,,2022 78 | Scaffolding,Hi-C,3D-DNA,https://github.com/aidenlab/3d-dna,10.1126/science.aal3327,2021 79 | Scaffolding,Hi-C,AutoHiC,https://github.com/Jwindler/AutoHiC,10.1101/2023.08.27.555031,2024 80 | Scaffolding,Hi-C,dnaTri,https://github.com/NoamKaplan/dna-triangulation,10.1038/nbt.2768,2016 81 | Scaffolding,Hi-C,EndHiC,https://github.com/fanagislab/EndHiC,10.48550/arXiv.2111.15411,2022 82 | Scaffolding,Hi-C,GRAAL,https://github.com/koszullab/GRAAL,10.1038/ncomms6695,2018 83 | Scaffolding,Hi-C,GreenHill,https://github.com/ShunOuchi/GreenHill,10.1186/s13059-023-03006-8, 84 | Scaffolding,Hi-C,HapHiC,https://github.com/zengxiaofei/HapHiC,10.1101/2023.11.18.567668,2024 85 | Scaffolding,Hi-C,HiCAssembler,https://github.com/maxplanck-ie/HiCAssembler,10.1101/gad.328971.119,2019 86 | Scaffolding,Hi-C,instaGRAAL,https://github.com/koszullab/instaGRAAL,10.1186/s13059-020-02041-z,2022 87 | 
Scaffolding,Hi-C,Lachesis,https://github.com/shendurelab/LACHESIS,10.1038/nbt.2727,2017 88 | Scaffolding,Hi-C,msscaf,https://github.com/mzytnicki/msscaf,,2022 89 | Scaffolding,Hi-C,pin_hic,https://github.com/dfguan/pin_hic,10.1186/s12859-021-04453-5,2021 90 | Scaffolding,Hi-C,SALSA2,https://github.com/marbl/SALSA,10.1371/journal.pcbi.1007273,2021 91 | Scaffolding,Hi-C,scaffHiC,https://github.com/wtsi-hpag/scaffHiC,,2020 92 | Scaffolding,Hi-C,YaHS,https://github.com/c-zhou/yahs,,2022 93 | Gap filling,Short reads,GapFiller,,10.1186/gb-2012-13-6-r56, 94 | Gap filling,Short reads,GAPPadder,,10.1186/s12864-019-5703-4, 95 | Gap filling,Short reads,Sealer,,10.1186/s12859-015-0663-4, 96 | Gap filling,Long reads,Cobbler,https://github.com/bcgsc/RAILS,10.21105/joss.00116,2021 97 | Gap filling,Long reads,DENTIST,https://github.com/a-ludi/dentist,10.1093/gigascience/giab100,2022 98 | Gap filling,Long reads,FGAP,https://github.com/pirovc/fgap,10.1186/1756-0500-7-371,2021 99 | Gap filling,Long reads,FinisherSC,https://github.com/kakitone/finishingTool,10.1093/bioinformatics/btv280,2022 100 | Gap filling,Long reads,gapless,,10.1101/2022.03.08.483466, 101 | Gap filling,Long reads,GMcloser,https://sourceforge.net/projects/gmcloser/,10.1093/bioinformatics/btv465,2018 102 | Gap filling,Long reads,LR_Gapcloser,https://github.com/CAFS-bioinformatics/LR_Gapcloser, 10.1093/gigascience/giy157,2018 103 | Gap filling,Long reads,PBJelly,https://sourceforge.net/projects/pb-jelly/,10.1371/journal.pone.0047768,2017 104 | Gap filling,Long reads,PGcloser,,10.1177/1176934320913859,2020 105 | Gap filling,Long reads,TGS-GapCloser,https://github.com/BGI-Qingdao/TGS-GapCloser,10.1093/gigascience/giaa094,2022 106 | Gap filling,Long reads,YAGCloser,https://github.com/merlyescalona/yagcloser,,2021 107 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # Genome assembly 
tools 3 | 4 | If you appreciate this work, please cite: "A deep dive into genome assemblies of non-vertebrate animals." Guiglielmoni N, Rivera-Vicéns R, Koszul R, Flot J-F. Peer Community Journal, 2022. [doi:10.24072/pcjournal.128](https://peercommunityjournal.org/articles/10.24072/pcjournal.128/) 5 | 6 | ## Contributing 7 | 8 | To add a tool, add a line to the corresponding CSV file: 9 | * [data/assemblers.csv](data/assemblers.csv) for genome assemblers. 10 | * [data/processors.csv](data/processors.csv) for assembly pre- or post-processing tools. 11 | 12 | Modifications to this README should be made in the template file of the corresponding section (see [templates](templates)). 13 | Every month, a GitHub Action automatically regenerates the README from the data and templates, fetching the latest commit date for each tool. 14 | 15 | ## Table of contents 16 | * [Genome assemblers](#Genome-assemblers) 17 | * [Sanger reads](#Sanger-reads) 18 | * [High-accuracy short reads](#High-accuracy-short-reads) 19 | * [Low-accuracy long reads](#Low-accuracy-long-reads) 20 | * [High-accuracy long reads](#High-accuracy-long-reads) 21 | * [Assembly pre and post-processing](#Assembly-pre-and-post-processing) 22 | * [Long-read error correction](#Long-read-error-correction) 23 | * [Polishing](#Polishing) 24 | * [Haplotig purging](#Haplotig-purging) 25 | * [Scaffolding](#Scaffolding) 26 | * [Gap filling](#Gap-filling) 27 | 28 | 29 | ## Genome assemblers 30 | 31 | ### Sanger reads 32 | 33 | | Assembler | Publication | Last update | 34 | |:----------|:------------|:------------| 35 | | [ARACHNE]() | 10.1101/gr.208902 | | 36 | | [Atlas](https://www.hgsc.bcm.edu/software/atlas-2) | 10.1101/gr.2264004 | 2013 | 37 | | [CAP3](https://faculty.sites.iastate.edu/xqhuang/cap3-assembly-program) | 10.1101/gr.9.9.868 | | 38 | | [Celera]() | 10.1093/bioinformatics/btn074 | | 39 | | [Euler]() | 10.1073/pnas.171285098 | | 40 | | [JAZZ]() | 10.1126/science.1072104 | | 41 | | 
[Minimus]() | 10.1186/1471-2105-8-64 | |
| [phrap]() | 10.1101/gr.8.3.186 | |
| [Phusion]() | 10.1101/gr.731003 | |
| [TIGR]() | 10.1089/gst.1995.1.9 | |

### High-accuracy short reads

| Assembler | Publication | Last update |
|:----------|:------------|:------------|
| [ABySS](https://github.com/bcgsc/abyss) | 10.1101/gr.214346.116 | 2025-3 |
| [ALLPATHS](https://software.broadinstitute.org/allpaths-lg/blog/?page_id=12) | 10.1101/gr.7337908 | 2008 |
| [BASE](https://github.com/dhlbh/BASE) | 10.1186/s12864-016-2829-5 | 2016-1 |
| [CABOG](http://wgs-assembler.sourceforge.net) | 10.1093/bioinformatics/btn548 | 2008 |
| [Edena](http://www.genomic.ch/edena.php) | 10.1101/gr.072033.107 | 2013 |
| [EPGA](https://github.com/bioinfomaticsCSU/EPGA) | 10.1093/bioinformatics/btu762 | 2017-4 |
| [Euler-SR](http://web.archive.org/web/20110720080755/http://euler-assembler.ucsd.edu/euler-sr.1.1.2.tgz) | 10.1101/gr.7088808 | 2011 |
| [Gossamer]() | 10.1093/bioinformatics/bts297 | 2012 |
| [IDBA](https://github.com/loneknightpy/idba) | 10.1007/978-3-642-12683-3_28 | 2016-12 |
| [ISEA]() | 10.1109/TCBB.2016.2550433 | |
| [JR-Assembler]() | 10.1073/pnas.1314090110 | |
| [LightAssembler]() | 10.1093/bioinformatics/btw470 | |
| [Meraculous]() | 10.1371/journal.pone.0023501 | |
| [Minia](https://github.com/GATB/minia) | 10.1186/1748-7188-8-22 | 2024-12 |
| [Mira]() | 10.1.1.23.7465 | |
| [Newbler]() | | |
| [PCAP](https://faculty.sites.iastate.edu/xqhuang/pcap-assembly-program) | 10.1101/gr.1390403 | |
| [PE-Assembler]() | 10.1093/bioinformatics/btq626 | |
| [PERGA]() | 10.1371/journal.pone.0114253 | |
| [Platanus]() | 10.1101/gr.170720.113 | |
| [QSRA]() | 10.1186/1471-2105-10-69 | |
| [Ray]() | 10.1089/cmb.2009.0238 | |
| [Readjoiner]() | 10.1186/1471-2105-13-82 | |
| [SGA]() | 10.1101/gr.126953.111 | |
| [SHARCGS]() | 10.1101/gr.6435207 | |
| [SOAPdenovo]() | 10.1101/gr.097261.109 | |
| [SOAPdenovo2]() | 10.1186/2047-217X-1-18 | |
| [SPAdes]() | 10.1089/cmb.2012.0021 | |
| [SparseAssembler]() | 10.1186/1471-2105-13-S6-S1 | |
| [SSAKE]() | 10.1093/bioinformatics/btl629 | |
| [SUTTA]() | 10.1093/bioinformatics/btq646 | |
| [VCAKE]() | 10.1093/bioinformatics/btm451 | |
| [Velvet]() | 10.1002/0471250953.bi1105s31 | |
| [Taipan]() | 10.1093/bioinformatics/btp374 | |

### Low-accuracy long reads

| Assembler | Publication | Last update |
|:----------|:------------|:------------|
| [Canu](https://github.com/marbl/canu) | 10.1101/gr.215087.116 | 2025-9 |
| [FALCON](https://github.com/PacificBiosciences/FALCON) | 10.1038/nmeth.4035 | 2018-4 |
| [Flye](https://github.com/fenderglass/Flye) | 10.1038/s41587-019-0072-8 | 2025-5 |
| [GoldRush](https://github.com/bcgsc/goldrush) | 10.1101/2022.10.25.513734 | 2025-10 |
| [HINGE](https://github.com/HingeAssembler/HINGE) | 10.1101/gr.216465.116 | 2019-1 |
| [MECAT](https://github.com/xiaochuanle/MECAT) | 10.1038/nmeth.4432 | 2019-2 |
| [MECAT2](https://github.com/xiaochuanle/MECAT2) | 10.1038/nmeth.4432 | 2020-4 |
| [miniasm](https://github.com/lh3/miniasm) | 10.1038/nmeth.4432 | 2025-7 |
| [NECAT](https://github.com/xiaochuanle/NECAT) | 10.1038/s41467-020-20236-7 | 2021-3 |
| [NextDenovo](https://github.com/Nextomics/NextDenovo) | 10.1186/s13059-024-03252-4 | 2024-5 |
| [Ra](https://github.com/lbcb-sci/ra) | 10.1109/ISPA.2019.8868909 | 2018-12 |
| [Raven](https://github.com/lbcb-sci/raven) | 10.1038/s43588-021-00073-4 | 2023-11 |
| [SMARTdenovo](https://github.com/ruanjue/smartdenovo) | 10.20944/preprints202009.0207.v1 | 2021-2 |
| [wtdbg](https://github.com/ruanjue/wtdbg) | | 2017-3 |
| [wtdbg2](https://github.com/ruanjue/wtdbg2) | 10.1038/s41592-019-0669-3 | 2020-12 |
| [shasta](https://github.com/paoloshasta/shasta) | 10.1038/s41587-020-0503-6 | 2025-11 |

### High-accuracy long reads

| Assembler | Publication | Last update |
|:----------|:------------|:------------|
| [Alice-asm](https://github.com/rolandfaure/alice-asm) | | 2025-10 |
| [Flye](https://github.com/fenderglass/Flye) | 10.1038/s41587-019-0072-8 | 2025-5 |
| [HiCanu](https://github.com/marbl/canu) | 10.1101/gr.215087.116 | 2025-9 |
| [hifiasm](https://github.com/chhylp123/hifiasm) | 10.1038/s41592-020-01056-5 | 2025-3 |
| [IPA](https://github.com/PacificBiosciences/pbipa) | | 2022-3 |
| [LJA](https://github.com/AntonBankevich/LJA) | 10.1101/2020.12.10.420448 | 2023-8 |
| [mdBG](https://github.com/ekimb/rust-mdbg/) | 10.1016/j.cels.2021.08.009 | 2024-9 |
| [MBG](https://github.com/maickrau/MBG) | 10.1093/bioinformatics/btab004 | 2025-9 |
| [NextDenovo](https://github.com/Nextomics/NextDenovo) | 10.1186/s13059-024-03252-4 | 2024-5 |
| [PECAT](https://github.com/lemene/PECAT) | 10.1101/2022.09.25.509436 | 2024-5 |
| [Peregrine](https://github.com/cschin/Peregrine) | | 2022-2 |
| [Raven](https://github.com/lbcb-sci/raven) | 10.1038/s43588-021-00073-4 | 2023-11 |
| [verkko](https://github.com/marbl/verkko) | 10.1101/2022.06.24.497523 | 2025-11 |
| [wtdbg2](https://github.com/ruanjue/wtdbg2) | 10.1038/s41592-019-0669-3 | 2020-12 |

## Assembly pre and post-processing

### Long-read error correction

| Reads | Tool | Publication | Last update |
|:------|:------|:------------| ----------- |
| __Long reads__ | [Canu](https://github.com/marbl/canu) | 10.1101/gr.215087.116 | 2025-9 |
| | [CONSENT](https://github.com/morispi/CONSENT) | 10.1038/s41598-020-80757-5 | 2024-2 |
| | [Daccord](https://github.com/gt1/daccord) | 10.1101/106252 | 2018-9 |
| | [FLAS](https://github.com/baoe/FLAS) | 10.1093/bioinformatics/btz206 | 2019-2 |
| | [HALC](https://github.com/lanl001/halc) | 10.1186/s12859-017-1610-3 | 2018-5 |
| | [MECAT](https://github.com/xiaochuanle/MECAT) | 10.1038/nmeth.4432 | 2019-2 |
| | [MECAT2](https://github.com/xiaochuanle/MECAT2) | 10.1038/nmeth.4432 | 2020-4 |
| | [NECAT](https://github.com/xiaochuanle/NECAT) | 10.1038/s41467-020-20236-7 | 2021-3 |
| | [NextDenovo](https://github.com/Nextomics/NextDenovo) | | 2024-5 |
| __Short reads__ | [CoLoRMap](https://github.com/cchauve/CoLoRMap) | 10.1093/bioinformatics/btw463 | 2018-3 |
| | [Hercules](https://github.com/BilkentCompGen/Hercules) | 10.1093/nar/gky724 | 2018-8 |
| | [HG-CoLoR](https://github.com/morispi/HG-CoLoR) | 10.1093/bioinformatics/bty521 | 2021-1 |
| | [Jabba](https://github.com/biointec/jabba) | 10.1186/s13015-016-0075-7 | 2024-12 |
| | [LoRDEC](https://gite.lirmm.fr/lordec/lordec-releases/-/wikis/home) | 10.1093/bioinformatics/btu538 | 2020 |
| | [LoRMA](https://gite.lirmm.fr/lorma/lorma-releases/-/wikis/home) | 10.1093/bioinformatics/btw321 | 2019 |
| | [NaS](https://github.com/institut-de-genomique/NaS) | 10.1186/s12864-015-1519-z | 2017-3 |
| | [proovread](https://github.com/BioInf-Wuerzburg/proovread) | 10.1093/bioinformatics/btu392 | 2019-5 |
| | [Ratatosk](https://github.com/DecodeGenetics/Ratatosk) | 10.1186/s13059-020-02244-4 | 2025-9 |

### Polishing

| Reads | Tool | Publication | Last update |
|:------|:------|:------------| ----------- |
| __Long reads__ | [Arrow]() | | 2014 |
| | [CONSENT](https://github.com/morispi/CONSENT) | 10.1038/s41598-020-80757-5 | 2024-2 |
| | [GoldRush](https://github.com/bcgsc/goldrush) | 10.1101/2022.10.25.513734 | 2025-10 |
| | [Quiver]() | | 2014 |
| __Long reads + short reads__ | [Hapo-G](https://github.com/institut-de-genomique/HAPO-G) | 10.1093/nargab/lqab034 | 2025-10 |
| | [HyPo](https://github.com/kensung-lab/hypo) | 10.1101/2019.12.19.882506 | 2020-2 |
| | [Racon](https://github.com/isovic/racon) | 10.1101/gr.214270.116 | 2020-8 |
| __Short reads__ | [ntEdit](https://github.com/bcgsc/ntEdit) | 10.1093/bioinformatics/btz400 | 2025-10 |
| | [Pilon](https://github.com/broadinstitute/pilon) | 10.1371/journal.pone.0112963 | 2021-1 |
| | [POLCA](https://github.com/alekseyzimin/masurca) | 10.1371/journal.pcbi.1007981 | 2025-6 |
| | [Apollo](https://github.com/CMU-SAFARI/Apollo) | 10.1093/bioinformatics/btaa179 | 2020-5 |

### Haplotig purging

| Reads | Tool | Publication | Last update |
|:------|:------|:------------| ----------- |
| __Long reads__ | [HaploMerger2](https://github.com/mapleforest/HaploMerger2) | 10.1093/bioinformatics/btx220 | 2016-12 |
| | [purge_dups](https://github.com/dfguan/purge_dups) | 10.1093/bioinformatics/btaa025 | 2025-10 |
| | [Purge Haplotigs](https://bitbucket.org/mroachawri/purge_haplotigs) | 10.1186/s12859-018-2485-7 | 2024-2 |
| __Long reads + short reads__ | [Redundans](https://github.com/lpryszcz/redundans) | 10.1093/nar/gkw294 | 2025-4 |

### Scaffolding

| Reads | Tool | Publication | Last update |
|:------|:------|:------------| ----------- |
| __Genetic maps__ | [ALLMAPS](https://github.com/tanghaibao/jcvi/wiki/ALLMAPS) | 10.1186/s13059-014-0573-1 | 2022 |
| __Hi-C__ | [3D-DNA](https://github.com/aidenlab/3d-dna) | 10.1126/science.aal3327 | 2023-11 |
| | [AutoHiC](https://github.com/Jwindler/AutoHiC) | 10.1101/2023.08.27.555031 | 2024-12 |
| | [dnaTri](https://github.com/NoamKaplan/dna-triangulation) | 10.1038/nbt.2768 | 2015-7 |
| | [EndHiC](https://github.com/fanagislab/EndHiC) | 10.48550/arXiv.2111.15411 | 2022-10 |
| | [GRAAL](https://github.com/koszullab/GRAAL) | 10.1038/ncomms6695 | 2020-1 |
| | [GreenHill](https://github.com/ShunOuchi/GreenHill) | 10.1186/s13059-023-03006-8 | 2025-3 |
| | [HapHiC](https://github.com/zengxiaofei/HapHiC) | 10.1101/2023.11.18.567668 | 2025-11 |
| | [HiCAssembler](https://github.com/maxplanck-ie/HiCAssembler) | 10.1101/gad.328971.119 | 2024-9 |
| | [instaGRAAL](https://github.com/koszullab/instaGRAAL) | 10.1186/s13059-020-02041-z | 2024-3 |
| | [Lachesis](https://github.com/shendurelab/LACHESIS) | 10.1038/nbt.2727 | 2017-12 |
| | [msscaf](https://github.com/mzytnicki/msscaf) | | 2022-10 |
| | [pin_hic](https://github.com/dfguan/pin_hic) | 10.1186/s12859-021-04453-5 | 2021-12 |
| | [SALSA2](https://github.com/marbl/SALSA) | 10.1371/journal.pcbi.1007273 | 2024-5 |
| | [scaffHiC](https://github.com/wtsi-hpag/scaffHiC) | | 2022-12 |
| | [YaHS](https://github.com/c-zhou/yahs) | | 2024-11 |
| __Linked reads__ | [ARBitR](https://github.com/markhilt/ARBitR) | 10.1093/bioinformatics/btaa975 | 2020-10 |
| | [Architect](https://github.com/kuleshov/architect) | 10.1093/bioinformatics/btw267 | 2016-10 |
| | [ARCS](https://github.com/bcgsc/ARCS/) | 10.1093/bioinformatics/btx675 | 2024-11 |
| | [ARKS](https://github.com/bcgsc/arks) | 10.1186/s12859-018-2243-x | 2019-12 |
| | [fragScaff](https://github.com/adeylab/fragScaff) | 10.1101/gr.178319.114 | 2018-11 |
| | [scaff10X](https://github.com/wtsi-hpag/Scaff10X) | | 2022-1 |
| | [SpLitteR](https://github.com/ablab/spades/releases/tag/splitter-paper) | | 2022-12 |
| | [msscaf](https://github.com/mzytnicki/msscaf) | | 2022-10 |
| __Long reads__ | [DENTIST](https://github.com/a-ludi/dentist) | 10.1093/gigascience/giab100 | 2024-2 |
| | [FinisherSC](https://github.com/kakitone/finishingTool) | 10.1093/bioinformatics/btv280 | 2016-11 |
| | [gapless]() | 10.1101/2022.03.08.483466 | |
| | [GoldRush](https://github.com/bcgsc/goldrush) | 10.1101/2022.10.25.513734 | 2025-10 |
| | [LINKS](https://github.com/bcgsc/LINKS) | 10.1186/s13742-015-0076-3 | 2025-4 |
| | [LRScaf](https://github.com/shingocat/lrscaf) | 10.1186/s12864-019-6337-2 | 2021-11 |
| | [npScarf](https://github.com/mdcao/npScarf) | 10.1038/ncomms14515 | 2019-10 |
| | [PBJelly](https://sourceforge.net/projects/pb-jelly/) | 10.1371/journal.pone.0047768 | 2017 |
| | [RAILS](https://github.com/bcgsc/RAILS) | 10.21105/joss.00116 | 2023-12 |
| | [SLR](https://github.com/luojunwei/SLR) | 10.1186/s12859-019-3114-9 | 2020-8 |
| | [msscaf](https://github.com/mzytnicki/msscaf) | | 2022-10 |
| | [SMIS](https://github.com/wtsi-hpag/smis) | | 2018-2 |
| | [SMSC](https://github.com/UTbioinf/SMSC) | 10.1186/s12864-017-4271-8 | 2019-9 |
| | [SSPACE-LongRead]() | 10.1186/1471-2105-15-211 | 2014 |
| __Mate pairs__ | [BATISCAF]() | 10.1101/330472 | |
| | [BESST]() | 10.1186/1471-2105-15-281 | |
| | [BOSS]() | 10.1093/bioinformatics/btw597 | |
| | [GRASS]() | 10.1093/bioinformatics/bts175 | |
| | [MIP]() | 10.1093/bioinformatics/btr562 | |
| | [Opera]() | 10.1089/cmb.2011.0170 | |
| | [ScaffMatch]() | 10.1093/bioinformatics/btv211 | |
| | [ScaffoldScaffolder]() | 10.1093/bioinformatics/btv548 | |
| | [SCARPA]() | 10.1093/bioinformatics/bts716 | |
| | [SCOP]() | 10.1093/bioinformatics/bty773 | |
| | [SLIQ]() | 10.1089/cmb.2011.0263 | |
| | [SOPRA]() | 10.1186/1471-2105-11-345 | |
| | [SSPACE]() | 10.1093/bioinformatics/btq683 | |
| | [WiseScaffolder]() | 10.1186/s12859-015-0705-y | |
| __Optical maps__ | [AGORA](https://static-content.springer.com/esm/art%3A10.1186%2F1471-2105-13-189/MediaObjects/12859_2012_5306_MOESM3_ESM.zip) | 10.1186/1471-2105-13-189 | 2012 |
| | [BiSCoT](https://github.com/institut-de-genomique/biscot) | 10.7717/peerj.10150 | 2020-11 |
| | [OMGS](https://github.com/ucrbioinfo/OMGS) | 10.1089/cmb.2019.0310 | 2018-11 |
| | [SewingMachine](https://github.com/i5K-KINBRE-script-share/Irys-scaffolding/blob/master/KSU_bioinfo_lab/stitch/sewing_machine_LAB.md) | 10.1186/s12864-015-1911-8 | 2015 |
| | [SOMA](ftp://ftp.cbcb.umd.edu/pub/software/soma) | 10.1093/bioinformatics/btn102 | 2008 |
| __Short reads__ | [Bambus]() | 10.1101/gr.1536204 | |

### Gap filling

| Reads | Tool | Publication | Last update |
|:------|:------|:------------| ----------- |
| __Long reads__ | [Cobbler](https://github.com/bcgsc/RAILS) | 10.21105/joss.00116 | 2023-12 |
| | [DENTIST](https://github.com/a-ludi/dentist) | 10.1093/gigascience/giab100 | 2024-2 |
| | [FGAP](https://github.com/pirovc/fgap) | 10.1186/1756-0500-7-371 | 2017-12 |
| | [FinisherSC](https://github.com/kakitone/finishingTool) | 10.1093/bioinformatics/btv280 | 2016-11 |
| | [gapless]() | 10.1101/2022.03.08.483466 | |
| | [GMcloser](https://sourceforge.net/projects/gmcloser/) | 10.1093/bioinformatics/btv465 | 2018 |
| | [LR_Gapcloser](https://github.com/CAFS-bioinformatics/LR_Gapcloser) | 10.1093/gigascience/giy157 | 2018-9 |
| | [PBJelly](https://sourceforge.net/projects/pb-jelly/) | 10.1371/journal.pone.0047768 | 2017 |
| | [PGcloser]() | 10.1177/1176934320913859 | 2020 |
| | [TGS-GapCloser](https://github.com/BGI-Qingdao/TGS-GapCloser) | 10.1093/gigascience/giaa094 | 2024-9 |
| | [YAGCloser](https://github.com/merlyescalona/yagcloser) | | 2025-11 |
| __Short reads__ | [GapFiller]() | 10.1186/gb-2012-13-6-r56 | |
| | [GAPPadder]() | 10.1186/s12864-019-5703-4 | |
| | [Sealer]() | 10.1186/s12859-015-0663-4 | |

--------------------------------------------------------------------------------