├── .gitignore ├── .travis.yml ├── CLI-workflow-executor ├── GC-lite-workflow.ga ├── README.md ├── galaxy_credentials.yml ├── galaxy_credentials.yml.sample ├── input_files.yaml ├── run_galaxy_workflow.py ├── sample_lite.fa └── wf-parameters.json ├── Dockerfile ├── FAQ.md ├── LICENSE ├── README.md ├── assets ├── img │ ├── GalaxyDocker.png │ ├── figure-pipeline_zigzag.png │ ├── figure-pipeline_zigzag.svg │ ├── graphclust_pipeline.png │ ├── graphclust_pipeline.svg │ ├── kitematic-1.png │ ├── kitematic-2.png │ ├── kitematic-3.png │ ├── kitematic-32.png │ ├── kitematic-4.png │ ├── kitematic-5.png │ ├── video-thumbnail.png │ └── workflow_early.png ├── library │ └── library_data.yaml ├── tools │ ├── graphclust_tools.yml │ ├── graphclust_tools2.yml │ └── graphclust_utils.yml ├── tours │ ├── graphclust_step_by_step.yaml │ ├── graphclust_tutorial.yaml │ └── graphclust_very_short.yaml └── welcome.html ├── data ├── CLIP-sites │ ├── Roquin1-PARCLIP-sites.fasta │ └── SLBP-Galaxy1-[peaks_l2fc4_sorted_merged_ext60_merged.fasta].fasta ├── README ├── Rfam-cliques-dataset │ ├── cliques-high-representatives.fa │ └── cliques-low-representatives.fa └── SHAPE-data │ ├── Probealign_labeled_10-10.fa │ └── Probealign_labeled_10-10.react ├── kitematic.md └── workflows ├── GraphClust-MotifFinder.ga ├── GraphClust_main_1r.ga ├── GraphClust_main_2r.ga ├── GraphClust_main_3r.ga ├── README.md ├── auxiliary-workflows ├── Cluster-conservation-filter.ga ├── Cluster-conservation-filter_and_align.ga ├── Galaxy-Workflow-compute-SP-reactivity.ga ├── MAF-to-FASTA-Collection.ga ├── MAF-to-FASTA.ga └── README.md └── extra-workflows ├── Orthology ├── Galaxy-Workflow-MotifFinder-orthlncRNA-conservation-metrics.ga └── README.md ├── README.md ├── RNAshapes ├── GraphClust_1r_brnashapes.ga ├── GraphClust_2r_brnashapes.ga └── README.md ├── SHAPE ├── GraphClust_1r_SHAPE.ga ├── GraphClust_2r_SHAPE.ga ├── GraphClust_3r_SHAPE.ga └── README.md └── with-subworkflow ├── Galaxy-Workflow-MultiRoundClustering.ga ├── 
Galaxy-Workflow-iterative_clustering.ga ├── Galaxy-Workflow-iterative_clustering_r1.ga ├── GraphClust-iterative_clustering.ga ├── README.md └── superflow-motif-finder-lncRNA-clustal.ga /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | env/ 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | *.egg-info/ 24 | .installed.cfg 25 | *.egg 26 | 27 | # PyInstaller 28 | # Usually these files are written by a python script from a template 29 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 30 | *.manifest 31 | *.spec 32 | 33 | # Installer logs 34 | pip-log.txt 35 | pip-delete-this-directory.txt 36 | 37 | # Unit test / coverage reports 38 | htmlcov/ 39 | .tox/ 40 | .coverage 41 | .coverage.* 42 | .cache 43 | nosetests.xml 44 | coverage.xml 45 | *,cover 46 | .hypothesis/ 47 | 48 | # Translations 49 | *.mo 50 | *.pot 51 | 52 | # Django stuff: 53 | *.log 54 | local_settings.py 55 | 56 | # Flask stuff: 57 | instance/ 58 | .webassets-cache 59 | 60 | # Scrapy stuff: 61 | .scrapy 62 | 63 | # Sphinx documentation 64 | docs/_build/ 65 | 66 | # PyBuilder 67 | target/ 68 | 69 | # IPython Notebook 70 | .ipynb_checkpoints 71 | 72 | # pyenv 73 | .python-version 74 | 75 | # celery beat schedule file 76 | celerybeat-schedule 77 | 78 | # dotenv 79 | .env 80 | 81 | # virtualenv 82 | venv/ 83 | ENV/ 84 | 85 | # Spyder project settings 86 | .spyderproject 87 | 88 | # Rope project settings 89 | .ropeproject 90 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | sudo: required 2 | 3 | language: python 4 | python: 2.7 5 | 6 | services: 
7 | - docker 8 | 9 | env: 10 | - TOX_ENV=py27 11 | 12 | git: 13 | submodules: false 14 | 15 | before_install: 16 | - wget https://raw.githubusercontent.com/bgruening/galaxy-flavor-testing/master/Makefile 17 | - make docker_install 18 | - travis_wait 50 make docker_build 19 | - make docker_run 20 | - sleep 80 21 | 22 | install: 23 | - make install 24 | 25 | script: 26 | - make test_api 27 | - make test_ftp 28 | - make test_bioblend 29 | #- make test_docker_in_docker 30 | -------------------------------------------------------------------------------- /CLI-workflow-executor/README.md: -------------------------------------------------------------------------------- 1 | This is based on the EBI Gene Expression Group repository [https://github.com/ebi-gene-expression-group/galaxy-workflow-executor](https://github.com/ebi-gene-expression-group/galaxy-workflow-executor) 2 | 3 | A sample invocation would be: 4 | 5 | ```python run_galaxy_workflow.py -C galaxy_credentials.yml -i input_files.yaml -W GC-lite-workflow.ga -k -H testCLI -P wf-parameters.json -G usegalaxy_eu``` 6 | 7 | For detailed instructions and technical issues regarding CLI execution, please refer to these repositories: https://github.com/ebi-gene-expression-group/galaxy-workflow-executor and https://github.com/ebi-gene-expression-group/scxa-workflows 8 | 9 | 10 | # Galaxy workflow executor 11 | 12 | This setup uses bioblend to run a Galaxy workflow through the CLI: 13 | 14 | - Inputs: 15 | - Galaxy workflow as JSON file (from share workflow -> download). 16 | - Parameters dictionary as JSON 17 | - Input files defined in YAML 18 | - Steps with allowed errors in YAML (optional) 19 | - History name (optional) 20 | 21 | # Galaxy workflow 22 | 23 | The workflow should be annotated with labels, ideally for all steps, but at least 24 | for the steps where you want to be able to set parameters through the parameters 25 | dictionary. 
It should be the JSON file resulting from Workflows (upper menu) -> Share workflow 26 | (on the drop down menu of the workflow, in the workflow list) -> Download 27 | (in the following screen). 28 | 29 | # Parameters JSON 30 | 31 | It should have the following structure: 32 | 33 | ```json 34 | { 35 | "step_label_x": { 36 | "param_name": "value", 37 | .... 38 | "nested_param_name": { 39 | "n_param_name": "n_value", 40 | .... 41 | "x_param_name": "x_value" 42 | } 43 | 44 | }, 45 | "step_label_x2": { 46 | .... 47 | }, 48 | .... 49 | "other_galaxy_setup_params": { ... } 50 | } 51 | ``` 52 | 53 | # Input files in YAML 54 | 55 | It should point to the files in the file system, set a name (which needs to match 56 | a workflow input label) and a file type (among those recognized by Galaxy). 57 | 58 | The structure of the YAML file for inputs is: 59 | 60 | ```yaml 61 | matrix: 62 | path: /path/to/E-MTAB-4850.aggregated_filtered_counts.mtx 63 | type: txt 64 | genes: 65 | path: /path/to/E-MTAB-4850.aggregated_filtered_counts.mtx_rows 66 | type: tsv 67 | barcodes: 68 | path: /path/to/E-MTAB-4850.aggregated_filtered_counts.mtx_cols 69 | type: tsv 70 | gtf: 71 | dataset_id: fe139k21xsak 72 | ``` 73 | 74 | where in this example case the Galaxy workflow should have input labels called `matrix`, 75 | `genes`, `barcodes` and `gtf`. If `path` is set within an input, the path needs to exist in the local file system. If the file is already on the Galaxy instance, the `dataset_id` of the file can be given instead of a local path, as shown for the `gtf` case here. 76 | 77 | # Steps with allowed errors 78 | 79 | This optional YAML file tells the executor which steps are allowed to fail without the overall execution being considered 80 | failed, so that result files are retrieved anyway. This accommodates the fact that, in a production setup, there might 81 | be edge conditions on datasets that could produce acceptable failures. 
82 | 83 | The structure of the file relies on the labels for steps used in the workflow and parameters files 84 | 85 | ```yaml 86 | step_label_x: 87 | - any 88 | step_label_z: 89 | - 1 90 | - 43 91 | ``` 92 | 93 | The above example means that the step with label `step_label_x` can fail with any error code, whereas step with label 94 | `step_label_z` will only be allowed to fail with codes 1 or 43. 95 | 96 | 97 | -------------------------------------------------------------------------------- /CLI-workflow-executor/galaxy_credentials.yml: -------------------------------------------------------------------------------- 1 | __default: usegalaxy_eu 2 | 3 | usegalaxy_eu: 4 | key: "paste your account API key from https://usegalaxy.eu/user/api_key here" 5 | url: "https://usegalaxy.eu" 6 | docker_cloud: 7 | key: "xx" 8 | url: "http://4.4.4.X:8088 (public/private IP docker instance here)" 9 | -------------------------------------------------------------------------------- /CLI-workflow-executor/galaxy_credentials.yml.sample: -------------------------------------------------------------------------------- 1 | __default: embassy 2 | 3 | embassy: 4 | key: "xx" 5 | url: "http://193.62.52.166:30700" 6 | ebi_cluster: 7 | key: "xx" 8 | url: "http://galaxy-gxa-001:8088" 9 | -------------------------------------------------------------------------------- /CLI-workflow-executor/input_files.yaml: -------------------------------------------------------------------------------- 1 | cliques_lite: 2 | class: File 3 | path: sample_lite.fa 4 | type: fasta 5 | -------------------------------------------------------------------------------- /CLI-workflow-executor/sample_lite.fa: -------------------------------------------------------------------------------- 1 | >RF00001_rep.0_AL096764.11/46123-46004 RF00001 2 | GUCUAUGGCCAUACCACCCUGAAUGUGCUUGAUCUCAUCUGAUCUCGUGAAGCCAAGCAGGGUGGGGCCUAGUUAGUACUUGGAUGGGAGACUUCCUGGGAAUAUAAGCUGCUGUUGGCU 3 | >RF00001_rep.1_U89919.1/939-1056 RF00001 4 | 
CUUUACGGCCACACCACCCUGAACGCACCGGAUCUCGACUGACCUUGAAAGCUAAGCAGGAUCGGGCCUGGUUAGUAUUGGGAUGGCAGACCCCCUGGAAAUACAGGGUGCUGAAGGU 5 | >RF00001_rep.2_AJ508600.1/161-58 RF00001 6 | GUCUACAGCCAUACCAUCCUGAACAUGCCAGAUCUUGUCUGACCUCUGAAGCUAAGCAGGGUCAAGCCUGGUUAGUACUUGGGAGAAGCUGGUGUGGCUAGACC 7 | >RF00005_rep.0_M15347.1/1040-968 RF00005 8 | GGCUCCAUAGCUCAGGGGUUAGAGCACUGGUCUUGUAAACCAGGGGUCGCGAGUUCAAUUCUCGCUGGGGCUU 9 | >RF00005_rep.10_X58792.1/174-245 RF00005 10 | GGUCCCAUGGUGUAAUGGUUAGCACUCUGGACUUUGAAUCCAGCGAUCCGAGUUCAAAUCUCGGUGGGACCU 11 | >RF00005_rep.11_AF346992.1/15890-15955 RF00005 12 | GUCCUUGUAGUAUAAACUAAUACACCAGUCUUGUAAACCGGAGAUGAAAACCUUUUUCCAAGGACA 13 | >RF00005_rep.12_AC108081.2/59868-59786 RF00005 14 | GUCAGGAUGGCCGAGCGGUCUAAGGCGCUGCGUUCAGGUCGCAGUCUCCCCUGGAGGCGUGGGUUCGAAUCCCACUUCUGACA 15 | >RF00005_rep.13_AC067849.6/4771-4840 RF00005 16 | CACUGUAAAGCUAACUUAGCAUUAACCUUUUAAGUUAAAGAUUAAGAGAACCAACACCUCUUUACAGUGA 17 | >RF00005_rep.14_AL021808.2/65570-65498 RF00005 18 | GCUUCUGUAGUGUAGUGGUUAUCACGUUCGCCUCACACGCGAAAGGUCCCCGGUUCGAAACCGGGCAGAAGCA 19 | >RF00005_rep.15_AC008443.10/42590-42518 RF00005 20 | GCCCGGCUAGCUCAGUCGGUAGAGCAUGAGACUCUUAAUCUCAGGGUCGUGGGUUCGAGCCCCACGUUGGGCG 21 | >RF00005_rep.16_AL133551.13/12355-12436 RF00005 22 | GCAGCGAUGGCCGAGUGGUUAAGGCGUUGGACUUGAAAUCCAAUGGGGUCUCCCCGCGCAGGUUCGAACCCUGCUCGCUGCG 23 | >RF00005_rep.17_AL021918.1/54817-54736 RF00005 24 | GUAGUCGUGGCCGAGUGGUUAAGGCGAUGGACUUGAAAUCCAUUGGGGUUUCCCCGCGCAGGUUCGAAUCCUGUCGGCUACG 25 | >RF00005_rep.18_AL021918.1/81116-81197 RF00005 26 | GUAGUCGUGGCCGAGUGGUUAAGGCGAUGGACUAGAAAUCCAUUGGGGUUUCCCCACGCAGGUUCGAAUCCUGCCGACUACG 27 | >RF00005_rep.19_AF134583.1/1816-1744 RF00005 28 | UAGAUUGAAGCCAGUUGAUUAGGGUGCUUAGCUGUUAACUAAGUGUUUGUGGGUUUAAGUCCCAUUGGUCUAG 29 | >RF00005_rep.1_AC005329.1/7043-6971 RF00005 30 | GCCGAAAUAGCUCAGUUGGGAGAGCGUUAGACUGAAGAUCUAAAGGUCCCUGGUUCGAUCCCGGGUUUCGGCA 31 | >RF00005_rep.20_AL671879.2/100356-100285 RF00005 32 | GGGGAUGUAGCUCAGUGGUAGAGCGCAUGCUUCGCAUGUAUGAGGCCCCGGGUUCGAUCCCCGGCAUCUCCA 33 | 
>RF00005_rep.21_AL355149.13/15278-15208 RF00005 34 | GCAUUGGUGGUUCAGUGGUAGAAUUCUCGCCUCCCACGCGGGAGACCCGGGUUCAAUUCCCGGCCAAUGCA 35 | >RF00005_rep.22_AL590385.23/26487-26416 RF00005 36 | GCGUUGGUGGUAUAGUGGUGAGCAUAGCUGCCUUCCAAGCAGUUGACCCGGGUUCGAUUCCCGGCCAACGCA 37 | >RF00005_rep.23_M16479.1/42-123 RF00005 38 | GGUGGGGUUCCCGAGCGGCCAAAGGGAGCAGACUCUAAAUCUGCCGUCAUCGACUUCGAAGGUUCGAAUCCUUCCCCCACCA 39 | >RF00005_rep.24_AC004941.2/32735-32806 RF00005 40 | GGGGGUAUAGCUCAGGGGUAGAGCAUUUGACUGCAGAUCAAGAGGUCCCUGGUUCAAAUCCAGGUGCCCCCU 41 | >RF00005_rep.25_AC006449.19/196857-196784 RF00005 42 | GUCUCUGUGGCGCAAUCGGUUAGCGCGUUCGGCUGUUAACCGAAAGGUUGGUGGUUCGAGCCCACCCAGGGACG 43 | >RF00005_rep.26_AF346999.1/4402-4331 RF00005 44 | UAGGAUGGGGUGUGAUAGGUGGCACGGAGAAUUUUGGAUUCUCAGGGAUGGGUUCGAUUCUCAUAGUCCUAG 45 | >RF00005_rep.27_AL352978.6/119697-119770 RF00005 46 | GGCCGGUUAGCUCAGUUGGUUAGAGCGUGGUGCUAAUAACGCCAAGGUCGCGGGUUCGAUCCCCGUACGGGCCA 47 | >RF00005_rep.28_X04779.1/1-73 RF00005 48 | CCUUCGAUAGCUCAGCUGGUAGAGCGGAGGACUGUAGAUCCUUAGGUCGCUGGUUCGAUUCCGGCUCGAAGGA 49 | >RF00005_rep.29_AF381996.2/4265-4333 RF00005 50 | AGAAAUAUGUCUGAUAAAAGAGUUACUUUGAUAGAGUAAAUAAUAGGAGCUUAAACCCCCUUAUUUCUA 51 | >RF00005_rep.2_AL662865.4/12206-12135 RF00005 52 | GGUUCCAUGGUGUAAUGGUUAGCACUCUGGACUCUGAAUCCAGCGAUCCGAGUUCAAAUCUCGGUGGAACCU 53 | >RF00005_rep.30_AL132988.4/95773-95841 RF00005 54 | AAGGGCUUAGCUUAAUUAAAGUGGCUGAUUUGCGUUCAGUUGAUGCAGAGUGGGGUUUUGCAGUCCUUA 55 | >RF00005_rep.31_AC092686.3/29631-29561 RF00005 56 | GCAUUGGUGGUUCAGUGGUAGAAUUCUCGCCUGCCACGCGGGAGGCCCGGGUUCGAUUCCCGGCCAAUGCA 57 | >RF00005_rep.32_AF347015.1/5892-5827 RF00005 58 | GGUAAAAUGGCUGAGUGAAGCAUUGGACUGUAAAUCUAAAGACAGGGGUUAGGCCUCUUUUUACCA 59 | >RF00005_rep.33_AC018638.5/4694-4623 RF00005 60 | GGCUCGUUGGUCUAGGGGUAUGAUUCUCGCUUAGGGUGCGAGAGGUCCCGGGUUCAAAUCCCGGACGAGCCC 61 | >RF00005_rep.34_AC008443.10/43006-42934 RF00005 62 | GUUUCCGUAGUGUAGUGGUUAUCACGUUCGCCUCACACGCGAAAGGUCCCCGGUUCGAAACCGGGCGGAAACA 63 | >RF00005_rep.35_AC005783.1/27398-27326 RF00005 64 | 
GUUUCCGUAGUGUAGCGGUUAUCACAUUCGCCUCACACGCGAAAGGUCCCCGGUUCGAUCCCGGGCGGAAACA 65 | >RF00005_rep.36_AC007298.17/145366-145295 RF00005 66 | UCCUCGUUAGUAUAGUGGUGAGUAUCCCCGCCUGUCACGCGGGAGACCGGGGUUCGAUUCCCCGACGGGGAG 67 | >RF00005_rep.37_AF347001.1/16015-15948 RF00005 68 | CAGAGAAUAGUUUAAAUUAGAAUCUUAGCUUUGGGUGCUAAUGGUGGAGUUAAAGACUUUUUCUCUGA 69 | >RF00005_rep.38_J00309.1/356-427 RF00005 70 | UCCCUGGUGGUCUAGUGGCUAGGAUUCGGCGCUUUCACCGCCGCGCCCCGGGUUCGAUUCCCGGCCAGGAAU 71 | >RF00005_rep.39_AL031229.2/40502-40430 RF00005 72 | GUUUCCGUAGUGUAGUGGUUAUCACGUUCGCCUAACACGCGAAAGGUCCCUGGAUCAAAACCAGGCGGAAACA 73 | >RF00005_rep.3_Z54587.1/126-45 RF00005 74 | GGUAGCGUGGCCGAGCGGUCUAAGGCGCUGGAUUUAGGCUCCAGUCUCUUCGGAGGCGUGGGUUCGAAUCCCACCGCUGCCA 75 | >RF00005_rep.40_AF382013.1/10403-10467 RF00005 76 | UGGUAUAUAGUUUAAACAAAACGAAUGAUUUCGACUCAUUAAAUUAUGAUAAUCAUAUUUACCAA 77 | >RF00005_rep.41_AC093311.2/140036-139968 RF00005 78 | GUUCUUGUAGUUGAAAUACAACGAUGGUUUUUCAUAUCAUUGGUCGUGGUUGUAGUCCGUGCGAGAAUA 79 | >RF00005_rep.42_AF347015.1/5827-5762 RF00005 80 | AGCUCCGAGGUGAUUUUCAUAUUGAAUUGCAAAUUCGAAGAAGCAGCUUCAAACCUGCCGGGGCUU 81 | >RF00005_rep.43_L23320.1/77-10 RF00005 82 | ACUCUUUUAGUAUAAAUAGUACCGUUAACUUCCAAUUAACUAGUUUUGACAACAUUCAAAAAAGAGUA 83 | >RF00005_rep.44_AC008670.6/83597-83665 RF00005 84 | GUAAAUAUAGUUUAACCAAAACAUCAGAUUGUGAAUCUGACAACAGAGGCUCACGACCCCUUAUUUACC 85 | >RF00005_rep.45_AF382005.1/581-651 RF00005 86 | GUUUAUGUAGCUUACCUCCUCAAAGCAAUACACUGAAAAUGUUUAGACGGGCUCACAUCACCCCAUAAACA 87 | >RF00005_rep.46_AF347015.1/1604-1672 RF00005 88 | CAGAGUGUAGCUUAACACAAAGCACCCAACUUACACUUAGGAGAUUUCAACUUAACUUGACCGCUCUGA 89 | >RF00005_rep.4_Z98744.2/66305-66234 RF00005 90 | AGCAGAGUGGCGCAGCGGAAGCGUGCUGGGCCCAUAACCCAGAGGUCGAUGGAUCGAAACCAUCCUCUGCUA 91 | >RF00005_rep.5_AL590385.23/26129-26058 RF00005 92 | UCCCUGGUGGUCUAGUGGUUAGGAUUCGGCGCUCUCACCGCCGCGGCCCGGGUUCGAUUCCCGGUCAGGGAA 93 | >RF00005_rep.6_X93334.1/6942-7009 RF00005 94 | AAGGUAUUAGAAAAACCAUUUCAUAACUUUGUCAAAGUUAAAUUAUAGGCUAAAUCCUAUAUAUCUUA 95 | 
>RF00005_rep.7_AF347005.1/12268-12338 RF00005 96 | ACUUUUAAAGGAUAACAGCUAUCCAUUGGUCUUAGGCCCCAAAAAUUUUGGUGCAACUCCAAAUAAAAGUA 97 | >RF00005_rep.8_AF134583.1/1599-1666 RF00005 98 | AGAAAUUUAGGUUAAAUACAGACCAAGAGCCUUCAAAGCCCUCAGUAAGUUGCAAUACUUAAUUUCUG 99 | >RF00005_rep.9_AP000442.6/2022-1950 RF00005 100 | GCCCGGAUAGCUCAGUCGGUAGAGCAUCAGACUUUUAAUCUGAGGGUCCAGGGUUCAAGUCCCUGUUCGGGCG 101 | >RF00006_rep.0_AF045145.1/1-88 RF00006 102 | GGCUGGCUUUAGCUCAGCGGUUACUUCGCGUGUCAUCAAACCACCUCUCUGGGUUGUUCGAGACCCGCGGGCGCUCUCCAGCCCUCUU 103 | >RF00006_rep.1_AC005219.1/49914-50014 RF00006 104 | GGGUCGGAGUUAGCUCAAGCGGUUACCUCCUCAUGCCGGACUUUCUAUCUGUCCAUCUCUGUGCUGGGGUUCGAGACCCGCGGGUGCUUACUGACCCUUUU 105 | >RF00006_rep.2_AF045143.1/1-98 RF00006 106 | GGCUGGCUUUAGCUCAGCGGUUACUUCGACAGUUCUUUAAUUGAAACAAGCAACCUGUCUGGGUUGUUCGAGACCCGCGGGCGCUCUCCAGUCCUUUU 107 | >RF00006_rep.3_AF045144.1/1-88 RF00006 108 | GGCUGGCUUUAGCUCAGCGGUUACUUCGAGUACAUUGUAACCACCUCUCUGGGUGGUUCGAGACCCGCGGGUGCUUUCCAGCUCUUUU 109 | >RF00019_rep.0_V00584.1/39-151 RF00019 110 | GGCUGGUCCGAAGGUAGUGAGUUAUCUCAAUUGAUUGUUCACAGUCAGUUACAGAUCGAACUCCUUGUUCUACUCUUUCCCCCCUUCUCACUACUGCACUUGACUAGUCUUUU 111 | >RF00019_rep.1_L32608.1/283-377 RF00019 112 | GGCUGGUCCGAUGGUAGUGGGUUAUCAGAACUUAUUAACAUUAGUGUCACUAAAGUUGGUAUACAACCCCCCACUGCUAAAUUUGACUGGCUUUU 113 | >RF00019_rep.2_ABBA01033605.1/1707-1808 RF00019 114 | GGCUGGUCCGAGUGCAGUGGUGUUUACAACUAAUUGAUCACAACCAGUUACAGAUUUCUUUGUUCCUUCUCCACUCCCACUGCUUCACUUGACUAGCCUUUU 115 | >RF00019_rep.3_AADD01087475.1/2469-2552 RF00019 116 | AGUUGGUCCGAGUGUUGUGGGUUAUUGUUAAGUUGAUUUAACAUUGUCUCCCCCCACAACCGCGCUUGACUAGCUUGCUGUUUU 117 | >RF00027_rep.0_AF480570.1/1-79 RF00027 118 | GUGAGGUAGUAAGUUGUAUUGUUGUGGGGUAGGGAUAUUAGGCCCCAAUUAGAAGAUAACUAUACAACUUACUACUUUC 119 | >RF00027_rep.1_AC048341.22/3536-3622 RF00027 120 | CCUGGCUGAGGUAGUAGUUUGUGCUGUUGGUCGGGUUGUGACAUUGCCCGCUGUGGAGAUAACUGCGCAAGCUACUGCCUUGCUAGU 121 | >RF00027_rep.2_AC018755.3/119936-120011 RF00027 122 | CCGGGCUGAGGUAGGAGGUUGUAUAGUUGAGGAGGACACCCAAGGAGAUCACUAUACGGCCUCCUAGCUUUCCCCA 
123 | >RF00031_rep.0_X71973.1/730-791 RF00031 124 | CCGGCACUCAUGACGGCCUGCCUGCAAACCUGCUGGUGGGGCAGACCCGAAAAUCCAGCGUG 125 | >RF00031_rep.1_U67171.1/375-442 RF00031 126 | GACGCUUCAUGAUAGGAAGGACUGAAAAGUCUUGUGGACACCUGGUCUUUCCCUGAUGUUCUCGUGGC 127 | >RF00031_rep.2_S79854.1/1605-1666 RF00031 128 | CACUGCUGAUGACGAACUAUCUCUAACUGGUCUUGACCACGAGCUAGUUCUGAAUUGCAGGG 129 | >RF00031_rep.3_X53463.1/847-903 RF00031 130 | UUCACAGAAUGAUGGCACCUUCCUAAACCCUCAUGGGUGGUGUCUGAGAGGCGUGAA 131 | >RF00031_rep.4_AF195141.1/689-759 RF00031 132 | GACUGACAUUAUGAAGGCCUGUACUGAAGACAGCAAGCUGUUAGUACAGACCAGAUGCUUUCUUGGCAGGC 133 | >RF00031_rep.5_AF093774.1/5851-5916 RF00031 134 | GUGUGCGGAUGAUAACUACUGACGAAAGAGUCAUCGACCUCAGUUAGUGGUUGGAUGUAGUCACAU 135 | >RF00031_rep.6_BC003127.1/865-928 RF00031 136 | GUCACUGCAUGAUCCGCUCUGGUCAAACCCUUCCAGGCCAGCCAGAGUGGGGAUGGUCUGUGAC -------------------------------------------------------------------------------- /CLI-workflow-executor/wf-parameters.json: -------------------------------------------------------------------------------- 1 | { 2 | "preprocessing": { 3 | "max_length": "120" 4 | } 5 | } -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | # Galaxy - GraphClust 2 | FROM quay.io/bgruening/galaxy:19.01 3 | 4 | MAINTAINER Björn A. Grüning, bjoern.gruening@gmail.com 5 | 6 | ENV GALAXY_CONFIG_BRAND GraphClust 7 | ENV ENABLE_TTS_INSTALL True 8 | 9 | # Install tools 10 | # Split into multiple layers, it seems that there is a max-layer size. 
11 | COPY ./assets/tools/graphclust_tools.yml $GALAXY_ROOT/tools.yaml 12 | RUN install-tools $GALAXY_ROOT/tools.yaml && \ 13 | /tool_deps/_conda/bin/conda clean --tarballs --yes && \ 14 | rm /export/galaxy-central/ -rf 15 | 16 | COPY ./assets/tools/graphclust_tools2.yml $GALAXY_ROOT/tools_2.yaml 17 | RUN install-tools $GALAXY_ROOT/tools_2.yaml && \ 18 | /tool_deps/_conda/bin/conda clean --tarballs --yes && \ 19 | rm /export/galaxy-central/ -rf 20 | 21 | COPY ./assets/tools/graphclust_utils.yml $GALAXY_ROOT/tools_3.yaml 22 | RUN install-tools $GALAXY_ROOT/tools_3.yaml && \ 23 | /tool_deps/_conda/bin/conda clean --tarballs --yes && \ 24 | rm /export/galaxy-central/ -rf 25 | 26 | 27 | # Add Galaxy interactive tours 28 | ADD ./assets/tours/* $GALAXY_ROOT/config/plugins/tours/ 29 | 30 | # Data libraries 31 | ADD ./assets/library/library_data.yaml $GALAXY_ROOT/library_data.yaml 32 | 33 | # Add workflows to the Docker image 34 | ADD ./workflows/*.ga $GALAXY_ROOT/workflows/ 35 | 36 | # Download training data and populate the data library 37 | RUN startup_lite && \ 38 | sleep 30 && \ 39 | . $GALAXY_VIRTUAL_ENV/bin/activate && \ 40 | workflow-install --workflow_path $GALAXY_ROOT/workflows/ -g http://localhost:8080 -u $GALAXY_DEFAULT_ADMIN_USER -p $GALAXY_DEFAULT_ADMIN_PASSWORD 41 | # && \ 42 | # setup-data-libraries -i $GALAXY_ROOT/library_data.yaml -g http://localhost:8080 -u $GALAXY_DEFAULT_ADMIN_USER -p $GALAXY_DEFAULT_ADMIN_PASSWORD 43 | 44 | # Container Style 45 | ADD ./assets/img/workflow_early.png $GALAXY_CONFIG_DIR/web/welcome_image.png 46 | ADD ./assets/welcome.html $GALAXY_CONFIG_DIR/web/welcome.html 47 | -------------------------------------------------------------------------------- /FAQ.md: -------------------------------------------------------------------------------- 1 | # Questions regarding setup and usage: 2 | 3 | 1. Q: How can I stop a running docker instance of Galaxy-GraphClust? 4 | 5 | A: If you are running the container in interactive mode (i.e. 
docker run -i) use `Ctrl+C`. To stop ALL Docker containers on your computer, run this command in a terminal: `sudo docker stop $(sudo docker ps -a -q)` 6 | 7 | 2. Q: After a few experiments and upgrades, lots of disk storage is occupied. How can I clean it up? 8 | 9 | Docker takes a conservative approach to cleaning up unnecessary data objects. Below are some options for freeing disk space, ordered by how conservative they are: 10 | 11 | * Use the `docker system prune` command; see the manual [here](https://docs.docker.com/config/pruning/). 12 | * Make sure no unintended container instance is running in the background. You can get a list of all containers with `docker ps -a` and remove them if necessary with `docker rm ID-or-NAME`. 13 | * The steps above may leave unused images behind, and images usually take up most of the space. You can use `docker images` to list them, and `docker rmi image-ID` to remove individual images. 14 | * To auto-remove a container after exiting, you can use `docker run --rm`. 15 | * A detailed tutorial about these and further ways can be found here: [https://www.tecmint.com/remove-docker-images-containers-and-volumes/](https://www.tecmint.com/remove-docker-images-containers-and-volumes/) 16 | 17 | 3. Q: I would like to customize the workflow settings but there are so many parameters there. What can I do? 18 | 19 | The GraphClust2 workflow is a collection of more than 15 tools, most of which invoke fairly complex methods. We have provided the pre-configurations that we think users will need, according to our own experience and the feedback from GraphClust2 users and collaborators. 20 | Each tool wrapper is supplemented with brief help descriptions for the arguments and/or external links to the tool's documentation. We are extending the in-Galaxy help descriptions and Galaxy tutorials; your feedback is much appreciated. 
If you would like to customize the configuration and do not know where to start, we recommend adapting the first and last steps of the workflow. 21 | 22 | GraphClust2 takes a windowing approach for folding and clustering long input sequences. The window size and overlap ratio can be adapted to the expected structural features. Starting with shorter window lengths (50-100 nt) is a good idea, especially if it is not known whether the putative structured elements cover the entire sequence or are local elements. With shorter windows, small elements such as stem-loops can be identified; afterwards you may re-run the pipeline with an increased window length to capture the complete structure. 23 | 24 | In the last step, `cluster_collection_report`, GraphClust2 assigns the elements to the best matching cluster and also aligns the best (top) matching entries of each cluster. `results_top_num` defines how many of the top entries to align; increasing this number is a good way to find covariations and identify a reliable conserved element. Usually, aligning the top 10-30 entries or more helps to identify reliable structure conservation and covariation. The other parameter to consider is the covariance model hit criterion (E-value or bit score). The E-value works very well for (and is designed for) structured non-coding RNAs with defined boundaries, like sequences in the Rfam database. We have found that switching to the CM bit score option (option `Use CM score for cutoff`) works better for identifying structured elements embedded within a sequence context. 25 | 26 | 4. Q: The workflow runs forever on my computer. Isn't the linear run-time one of the highlighted features? 27 | 28 | The practical bottleneck of the workflow, especially for local instances, is the covariance model calibration, specifically the `cmcalibrate` step integrated into the `cmbuild` Infernal wrapper. 
This calibration is necessary to compute a reliable E-value for the significance of a CM hit, but it is time-consuming because it generates about a million bases of background sequence. 29 | 30 | We suggest using the instance on our European Galaxy server, where the cmbuild step is pre-configured to use multiple processors and the server is backed by thousands of computing nodes. In the Docker instance, the calibration is performed on a single core by default. We recommend asking your Galaxy admin to configure the wrapper according to the backend hardware. Alternatively, you can reduce the length of the random sequences (`-L` in the cmbuild-cmcalibrate wrapper). Please refer to the Infernal manual, and be aware that the E-values might no longer be reliable. 31 | 32 | 5. Q: How can I do more rounds? 33 | 34 | A: To extend an existing GraphClust workflow with another round, run the workflow called Galaxy-Workflow-single_round_for_extension: [GraphClust_two](https://raw.githubusercontent.com/BackofenLab/docker-galaxy-graphclust/master/workflows/Galaxy-Workflow-single_round_for_extension.ga) 35 | The inputs for this workflow are the files generated by the GraphClust workflow. The name of each input corresponds to the name of the produced file, so you only need to choose the matching file from a dropdown selection. An important parameter for this workflow is the **round number**, which must be specified in the **NSPDK_cancidateCluster**, **pgma_graphclust** and **cluster_collection_report** tools. Alternatively, we recommend using the *with-subworkflow* flavors. 36 | 37 | 6. Q: On my Ubuntu host system the container is running but constantly reports the error: `could not connect to server: Connection refused` 38 | 39 | A0: For Ubuntu users we recommend the 16.04 LTS version, which is delivered with kernel 4.2 or higher. 40 | 41 | A1: The Docker manager is tightly coupled with the host Linux kernel. Under certain Linux kernels the Docker storage system might fail. 
42 | Please proceed with the following commands or contact your administrator (__Warning__: please be careful of potential data loss with this procedure): 43 | 44 | ``` 45 | sudo apt update; sudo apt upgrade; 46 | sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual 47 | sudo modprobe aufs 48 | sudo service docker stop 49 | sudo rm -rf /var/lib/docker/overlay2 50 | sudo service docker restart 51 | ``` 52 | For more information please check the Docker documentation: https://docs.docker.com/engine/userguide/storagedriver/aufs-driver/ 53 | 54 | 55 | # Login to the docker instance: 56 | To keep histories and workflows separate, the Galaxy server requires each user to register on first access. **By default anyone with access to the host network can register. No registration confirmation email will be sent to the given email.** So you can register with any custom (including non-existent) email address. There is also a default Admin user [described here](https://bgruening.github.io/docker-galaxy-stable/users-passwords.html). To change the default authorization settings please refer to the Galaxy Wiki section [Authentication](https://wiki.galaxyproject.org/Develop/Authentication) 57 | 58 | * To register (first time only): 59 | * At the top right of the panel go to **User→Register** 60 | * Provide a custom email address and password, confirm your password and enter a public name 61 | 62 | * To log in: 63 | * At the top right of the panel go to **User→Login** 64 | * Provide your registered email address and password 65 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 8 | 9 | GraphClust2 10 | ======================== 11 | GraphClust2 is a workflow for scalable clustering of RNAs based on sequence and secondary structure features. 
GraphClust2 is implemented within the Galaxy framework and consists of a set of integrated Galaxy tools and flavors of the linear-time clustering workflow. 12 | 13 | 14 | Table of Contents 15 | ================= 16 | * [GraphClust2](#graphclust2) 17 | * [Table of Contents](#table-of-contents) 18 | * [Availability](#availability) 19 | * [GraphClust2 on European Galaxy Server](#graphclust2-on-european-galaxy-server) 20 | * [GraphClust2 Docker 🐳 Image](#graphclust2-docker-whale-image) 21 | * [Installation and Setup](#installation-and-setup) 22 | * [Requirements](#requirements) 23 | * [Running the docker instance](#running-the-docker-instance) 24 | * [Using graphic interface (Windows/MacOS)](#using-graphic-interface-windowsmacos) 25 | * [Installation on a Galaxy instance](#installation-on-a-galaxy-instance) 26 | * [Setup support](#setup-support) 27 | * [Demo instance](#demo-instance) 28 | * [Usage - How to run GraphClust2](#usage---how-to-run-graphclust2) 29 | * [Browser access to the server](#browser-access-to-the-server) 30 | * [Public server](#public-server) 31 | * [Docker instance](#docker-instance) 32 | * [Video tutorial](#video-tutorial) 33 | * [Interactive tours](#interactive-tours) 34 | * [Import additional workflows](#import-additional-workflows) 35 | * [Workflow flavors](#workflow-flavors) 36 | * [Workflows on the running server](#workflows-on-the-running-server) 37 | * [command line support (beta)](#command-line-support-beta) 38 | * [Frequently Asked Questions](#frequently-asked-questions) 39 | * [Workflow overview](#workflow-overview) 40 | * [Input](#input) 41 | * [Output](#output) 42 | * [Support & Bug Reports](#support--bug-reports) 43 | * [References](#references) 44 | 45 | 46 | # Availability 47 | 48 | ## GraphClust2 on European Galaxy Server 49 | GraphClust2 is accessible on the European Galaxy server at: 50 | * [https://graphclust.usegalaxy.eu](https://graphclust.usegalaxy.eu) 51 | 52 | ## GraphClust2 Docker :whale: Image 53 | It is also possible to run 
GraphClust2 as a stand-alone solution using a Docker container that is a pre-configured flavor of the official [Galaxy Docker image](https://github.com/bgruening/docker-galaxy-stable). 54 | This flavor bundles the GraphClust2 tools, interactive tutorial tours and workflows. 55 | 56 | ### Installation and Setup 57 | #### Requirements 58 | 59 | For running GraphClust2 locally, the `Docker` client is required. 60 | Docker supports the three major desktop operating systems: Linux, Windows and Mac OSX. Please refer to the [Docker installation guideline](https://docs.docker.com/installation) for details. 61 | 62 | A GUI client can also be used on Windows and Mac operating systems. 63 | Please follow the graphical instructions for using the Kitematic client [here](./kitematic.md). 64 | 65 | **Hardware requirements:** 66 | * Minimum 8GB memory 67 | * Minimum 20GB free disk storage space, 100GB is recommended. 68 | 69 | **Supported operating systems** 70 | 71 | GraphClust2 has been tested on these operating systems: 72 | * *Windows* : 10 using [Kitematic](https://kitematic.com/) 73 | * *MacOSx*: 10.1x or higher using [Kitematic](https://kitematic.com/) 74 | * *Linux*: Kernel 4.2 or higher, preferably with aufs support (see [FAQ](FAQ.md)) 75 | 76 | 77 | ### Running the docker instance 78 | From the command line: 79 | 80 | ```bash 81 | docker run -i -t -p 8080:80 backofenlab/docker-galaxy-graphclust 82 | ``` 83 | 84 | For details about the docker commands please check the official guide [here](https://docs.docker.com/engine/reference/run/). Galaxy-specific run options and configuration support for computation grid systems are detailed in the Galaxy Docker [repository](https://github.com/bgruening/docker-galaxy-stable). 85 | 86 | ### Using graphic interface (Windows/MacOS) 87 | Please check this [step-by-step guide](./kitematic.md). 
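
For the command-line route, the `-p 8080:80` option of the `docker run` command maps a host port to port 80 inside the container. The following is a minimal Python sketch, not part of this repository, that builds the same command for a different host port; the function name `docker_run_command` and its defaults are illustrative assumptions, while the image name and container port are taken from the command shown above.

```python
import shlex

# Image name and container port as used in the `docker run` command above.
IMAGE = "backofenlab/docker-galaxy-graphclust"


def docker_run_command(host_port=8080, container_port=80, image=IMAGE):
    """Build the `docker run` argument list (hypothetical helper).

    The -p option maps HOST_PORT:CONTAINER_PORT, so Galaxy becomes
    reachable on the host at http://localhost:<host_port>.
    """
    if not 1 <= host_port <= 65535:
        raise ValueError("host_port must be between 1 and 65535")
    return ["docker", "run", "-i", "-t",
            "-p", f"{host_port}:{container_port}", image]


# Print the command rather than executing it, so it can be inspected first.
print(shlex.join(docker_run_command(host_port=8090)))
```

If port 8080 is already taken on the host, choosing another host port this way only changes the address used in the browser; the container side stays on port 80.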
 88 | 89 | ## Installation on a Galaxy instance 90 | GraphClust2 can be integrated into any available Galaxy server. All the GraphClust2 tools and workflows needed to run the 91 | GraphClust pipeline are listed in [workflows](./workflows/) and 92 | [tools-list](./assets/tools/). 93 | 94 | #### Setup support 95 | If you encounter problems, please use the recommended settings, check the [FAQs](./FAQ.md) or contact us via the [*Issues*](https://github.com/BackofenLab/GraphClust-2/issues) section of the repository. 96 | 97 | 98 | ## Demo instance 99 | A running demo instance of GraphClust2 is available at http://192.52.32.222:8080/. 100 | Please note that this instance is simply a Cloud instance of the provided Docker container, intended for rapid inspections and demonstration purposes. The computation 101 | capacity is limited and long-term availability is currently not planned. We recommend following the instructions above. Please contact us if you prefer to keep this service available. 102 | 103 | # Usage - How to run GraphClust2 104 | 105 | ## Browser access to the server 106 | ### Public server 107 | Please register on our European Galaxy server [https://usegalaxy.eu](https://usegalaxy.eu) and use your authentication information to access the customized sub-domain [https://graphclust.usegalaxy.eu](https://graphclust.usegalaxy.eu). Guides and tutorials are available on the server's welcome page. 108 | 109 | ### Docker instance 110 | After starting the Galaxy Docker container, a web server is available at the host IP/URL and designated port (default 8080). 
 111 | * Inside your browser go to IP/URL:PORT 112 | * With the same settings as in the previous step: 113 | * On the same (local) computer: [http://localhost:8080/](http://localhost:8080) 114 | * On other systems in the network: `http://HOSTIP:8080` 115 | 116 | ### Video tutorial 117 | You might find this [YouTube tutorial](https://www.youtube.com/watch?v=fJ6tUt_6uas) helpful as a visual introduction to setting up and running GraphClust2. 118 | 119 | 120 | [![GraphClust2 video tutorial](./assets/img/video-thumbnail.png)](https://www.youtube.com/watch?v=fJ6tUt_6uas) 121 | 122 | ### Interactive tours 123 | Interactive Tours are available for Galaxy and GraphClust2. To run a tour, go to **Help→Interactive Tours** on the top panel and click on one of the tours prefixed *GraphClust*. You can check the other tours for a more general introduction to the Galaxy interface. 124 | 125 | ### Import additional workflows 126 | 127 | To import or upload additional workflow flavors (e.g. from the [extra-workflows directory](./workflows/extra-workflows/)), go to the *Workflow* menu on the top panel. On the top right side of the screen, click on the "Upload or import workflow" button. You can either upload a workflow from your local system or provide the URL of the workflow. You must be logged in to access the workflow menu. The Docker Galaxy instance has pre-configured *easy!* login info that can be found by following the interactive tour. 
You can download workflows from the links below. 128 | 129 | ### Workflow flavors 130 | The pre-configured flavors of GraphClust2 are provided and described inside the [workflows directory](./workflows/). 131 | 132 | #### Workflows on the running server 133 | The following workflows can be accessed directly on the public server: 134 | * MotifFinder: [GraphClust-MotifFinder](https://graphclust.usegalaxy.eu/u/graphclust2/w/graphclust2--motiffinder) 135 | * Workflow main: [GraphClust_1r](https://graphclust.usegalaxy.eu/u/graphclust2/w/graphclust2--main-1r) 136 | * Workflow main, preconfigured for two rounds : [GraphClust_2r](https://graphclust.usegalaxy.eu/u/graphclust2/w/graphclust2--main-2r) 137 | 138 | ## command line support (beta) 139 | The Galaxy service is accessible via the Galaxy project's `bioblend` API library. In the future we plan to provide full integration of the bioblend API for GraphClust2. Currently, beta support for running GraphClust2 via the CLI is available. The wrapper and setup template are available inside the [CLI-workflow-executor](./CLI-workflow-executor) directory. 140 | 141 | 142 | ## [Frequently Asked Questions](FAQ.md) 143 | 144 | Workflow overview 145 | =============================== 146 | 147 | The pipeline for clustering RNA sequences and structured motif discovery is a multi-step pipeline. 
Overall it consists of three major phases: a) sequence based pre-clustering b) encoding predicted RNA structures as graph features c) iterative fast candidate clustering then refinement 148 | 149 | ![GraphClust-2 workflow overview](./assets/img/figure-pipeline_zigzag.png) 150 | 151 | 152 | Below is a coarse-grained correspondence list of GraphClust2 tool names with each step: 153 | 154 | | Stage | Galaxy Tool Name | Description| 155 | | :--------------------: | :--------------- | :----------------| 156 | |1 | [Preprocessing](https://graphclust.usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_preprocessing/preproc/0.5) | Input preprocessing (fragmentation)| 157 | |2 | [fasta_to_gspan](https://graphclust.usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_fasta_to_gspan/gspan/0.4) | Generation of structures via RNAshapes and conversion into graphs| 158 | |3 | [NSPDK_sparseVect](https://graphclust.usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/nspdk_sparse/9.2.3) | Generation of graph features via NSPDK | 159 | |4| [NSPDK_candidateClusters](https://graphclust.usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/NSPDK_candidateClust/9.2.3) | min-hash based clustering of all feature vectors, output top dense candidate clusters| 160 | |5| [PGMA_locarna](https://graphclust.usegalaxy.eu/?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_prepocessing_for_mlocarna/preMloc/0.4),[locarna](https://graphclust.usegalaxy.eu/tool_runner?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_mlocarna/locarna_best_subtree/0.4), [CMfinder](https://graphclust.usegalaxy.eu/?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_cmfinder/cmFinder/0.4) | Locarna based clustering of each candidate cluster, all-vs-all pairwise alignments, create multiple alignments along guide tree, select best subtree, and refine alignment.| 161 | |6| [Build covariance 
models](https://graphclust.usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmbuild/1.1.0.2) | Create candidate model | 162 | |7| [Search covariance models](https://graphclust.usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmsearch/1.1.0.2) | Scan full input sequences with Infernal's cmsearch to find missing cluster members | 163 | |8,9| [Report results](https://graphclust.usegalaxy.eu/?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_postprocessing/glob_report/0.5) and [conservation evaluations](https://graphclust.usegalaxy.eu/?tool_id=toolshed.g2.bx.psu.edu/repos/rnateam%2Fgraphclust_aggregate_alignments/graphclust_aggregate_alignments/0.1) | Collect final clusters and create example alignments of top cluster members| 164 | 165 | 166 | ### Input 167 | The input to the workflow is a set of putative RNA sequences in FASTA format. Inside the `data` directory you can find examples of the input format. The labeled datasets are based on Rfam annotations, with each sequence labeled with its associated RNA family. 168 | 169 | ### Output 170 | The output contains the predicted clusters, where similar putative input RNA sequences form a cluster. Additionally, the overall status of the clusters and the matching of cluster elements are reported for each cluster. 171 | 172 | 173 | 179 | 180 | 181 | # Support & Bug Reports 182 | 183 | You can file a [GitHub issue](https://github.com/BackofenLab/GraphClust-2/issues) or find our contact information on the [lab page](http://www.bioinf.uni-freiburg.de/team.html?en). 184 | 185 | # References 186 | The manuscript is currently under preparation/revision. If you find this resource useful, please cite the Zenodo DOI of the repo or contact us. 187 | 188 | * Miladi, Milad, Eteri Sokhoyan, Torsten Houwaart, Steffen Heyne, Fabrizio Costa, Bjoern Gruening, and Rolf Backofen. 
"GraphClust2: Annotation and discovery of structured RNAs with scalable and accessible integrative clustering." GigaScience, Volume 8, Issue 12, December 2019, giz150. doi: [https://doi.org/10.1093/gigascience/giz150](https://doi.org/10.1093/gigascience/giz150) 189 | * Milad Miladi, Björn Grüning, & Eteri Sokhoyan. BackofenLab/GraphClust-2: Zenodo. http://doi.org/10.5281/zenodo.1135094 190 | * GraphClust-1 methodology (S. Heyne, F. Costa, D. Rose, R. Backofen; 191 | GraphClust: alignment-free structural clustering of local RNA secondary structures; Bioinformatics, 2012) available at http://www.bioinf.uni-freiburg.de/Software/GraphClust/ 192 | 193 | -------------------------------------------------------------------------------- /assets/img/GalaxyDocker.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/GalaxyDocker.png -------------------------------------------------------------------------------- /assets/img/figure-pipeline_zigzag.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/figure-pipeline_zigzag.png -------------------------------------------------------------------------------- /assets/img/graphclust_pipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/graphclust_pipeline.png -------------------------------------------------------------------------------- /assets/img/kitematic-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/kitematic-1.png 
-------------------------------------------------------------------------------- /assets/img/kitematic-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/kitematic-2.png -------------------------------------------------------------------------------- /assets/img/kitematic-3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/kitematic-3.png -------------------------------------------------------------------------------- /assets/img/kitematic-32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/kitematic-32.png -------------------------------------------------------------------------------- /assets/img/kitematic-4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/kitematic-4.png -------------------------------------------------------------------------------- /assets/img/kitematic-5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/kitematic-5.png -------------------------------------------------------------------------------- /assets/img/video-thumbnail.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/video-thumbnail.png -------------------------------------------------------------------------------- 
/assets/img/workflow_early.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BackofenLab/GraphClust-2/c6edda2b28371d1fa6aa2a9750b890982cb32461/assets/img/workflow_early.png -------------------------------------------------------------------------------- /assets/library/library_data.yaml: -------------------------------------------------------------------------------- 1 | libraries: 2 | - name: "FASTA" 3 | files: 4 | - url: https://raw.githubusercontent.com/eteriSokhoyan/test-data/master/cliques-high-representatives.fa 5 | file_type: fasta 6 | - url: https://raw.githubusercontent.com/eteriSokhoyan/test-data/master/cliques-low-representatives.fa 7 | file_type: fasta 8 | -------------------------------------------------------------------------------- /assets/tools/graphclust_tools.yml: -------------------------------------------------------------------------------- 1 | --- 2 | # This is a sample file to be used as a reference for populating a list of 3 | # tools that you wish to install into Galaxy from a Tool Shed via the 4 | # `install_tool_shed_tools.py` script. 5 | # 6 | # For each tool you want to install, you must provide the following keys: 7 | # * name: this is is the name of the tool to install 8 | # * owner: owner of the Tool Shed repository from where the tools is being 9 | # installed 10 | # Further, you need to provide **one** of the following two keys: 11 | # * tool_panel_section_id: ID of the tool panel section where you want the 12 | # tool to be installed. The section ID can be found 13 | # in Galaxy's `shed_tool_conf.xml` config file. Note 14 | # that the specified section must exist in this file. 15 | # Otherwise, the tool will be installed outside any 16 | # section. 17 | # * tool_panel_section_label: Display label of a tool panel section where 18 | # you want the tool to be installed. 
If it does not 19 | # exist, this section will be created on the target 20 | # Galaxy instance (note that this is different than 21 | # when using the ID). 22 | # Multi-word labels need to be placed in quotes. 23 | # Each label will have a corresponding ID created; 24 | # the ID will be an all lowercase version of the 25 | # label, with multiple words joined with 26 | # underscores (e.g., 'BED tools' -> 'bed_tools'). 27 | # 28 | # You can also specify the following optional keys to further define the 29 | # installation properties: 30 | # * tool_shed_url: the URL of the Tool Shed from where the tool should be 31 | # installed. (default: https://toolshed.g2.bx.psu.edu) 32 | # * revisions: a list of revisions of the tool, all of which will attempt to 33 | # be installed. (default: latest) 34 | # * install_tool_dependencies: True or False - whether to install tool 35 | # dependencies or not. (default: True) 36 | # * install_repository_dependencies: True or False - whether to install repo 37 | # dependencies or not. 
(default: True) 38 | 39 | api_key: admin 40 | galaxy_instance: http://localhost:8080 41 | install_resolver_dependencies: True 42 | install_tool_dependencies: False 43 | tools: 44 | - name: graphclust_cmfinder 45 | owner: rnateam 46 | tool_panel_section_label: "GraphClust" 47 | 48 | - name: graphclust_postprocessing 49 | owner: rnateam 50 | tool_panel_section_label: "GraphClust" 51 | 52 | - name: graphclust_fasta_to_gspan 53 | owner: rnateam 54 | tool_panel_section_label: "GraphClust" 55 | 56 | - name: structure_to_gspan 57 | owner: rnateam 58 | tool_panel_section_label: "GraphClust" 59 | 60 | - name: graphclust_mlocarna 61 | owner: rnateam 62 | tool_panel_section_label: "GraphClust" 63 | -------------------------------------------------------------------------------- /assets/tools/graphclust_tools2.yml: -------------------------------------------------------------------------------- 1 | --- 2 | # This is a sample file to be used as a reference for populating a list of 3 | # tools that you wish to install into Galaxy from a Tool Shed via the 4 | # `install_tool_shed_tools.py` script. 5 | # 6 | # For each tool you want to install, you must provide the following keys: 7 | # * name: this is the name of the tool to install 8 | # * owner: owner of the Tool Shed repository from where the tool is being 9 | # installed 10 | # Further, you need to provide **one** of the following two keys: 11 | # * tool_panel_section_id: ID of the tool panel section where you want the 12 | # tool to be installed. The section ID can be found 13 | # in Galaxy's `shed_tool_conf.xml` config file. Note 14 | # that the specified section must exist in this file. 15 | # Otherwise, the tool will be installed outside any 16 | # section. 17 | # * tool_panel_section_label: Display label of a tool panel section where 18 | # you want the tool to be installed. 
If it does not 19 | # exist, this section will be created on the target 20 | # Galaxy instance (note that this is different than 21 | # when using the ID). 22 | # Multi-word labels need to be placed in quotes. 23 | # Each label will have a corresponding ID created; 24 | # the ID will be an all lowercase version of the 25 | # label, with multiple words joined with 26 | # underscores (e.g., 'BED tools' -> 'bed_tools'). 27 | # 28 | # You can also specify the following optional keys to further define the 29 | # installation properties: 30 | # * tool_shed_url: the URL of the Tool Shed from where the tool should be 31 | # installed. (default: https://toolshed.g2.bx.psu.edu) 32 | # * revisions: a list of revisions of the tool, all of which will attempt to 33 | # be installed. (default: latest) 34 | # * install_tool_dependencies: True or False - whether to install tool 35 | # dependencies or not. (default: True) 36 | # * install_repository_dependencies: True or False - whether to install repo 37 | # dependencies or not. 
(default: True) 38 | 39 | api_key: admin 40 | galaxy_instance: http://localhost:8080 41 | install_resolver_dependencies: True 42 | install_tool_dependencies: False 43 | tools: 44 | - name: graphclust_nspdk 45 | owner: rnateam 46 | tool_panel_section_label: "GraphClust" 47 | 48 | - name: graphclust_prepocessing_for_mlocarna 49 | owner: rnateam 50 | tool_panel_section_label: "GraphClust" 51 | 52 | - name: graphclust_preprocessing 53 | owner: rnateam 54 | tool_panel_section_label: "GraphClust" 55 | 56 | - name: graphclust_motif_finder_plot 57 | owner: rnateam 58 | tool_panel_section_label: "GraphClust" 59 | 60 | - name: graphclust_postprocessing_no_align 61 | owner: rnateam 62 | tool_panel_section_label: "GraphClust" 63 | 64 | - name: graphclust_align_cluster 65 | owner: rnateam 66 | tool_panel_section_label: "GraphClust" 67 | 68 | - name: graphclust_aggregate_alignments 69 | owner: rnateam 70 | tool_panel_section_label: "GraphClust" 71 | 72 | 73 | 74 | -------------------------------------------------------------------------------- /assets/tools/graphclust_utils.yml: -------------------------------------------------------------------------------- 1 | --- 2 | # This is a sample file to be used as a reference for populating a list of 3 | # tools that you wish to install into Galaxy from a Tool Shed via the 4 | # `install_tool_shed_tools.py` script. 5 | # 6 | # For each tool you want to install, you must provide the following keys: 7 | # * name: this is the name of the tool to install 8 | # * owner: owner of the Tool Shed repository from where the tool is being 9 | # installed 10 | # Further, you need to provide **one** of the following two keys: 11 | # * tool_panel_section_id: ID of the tool panel section where you want the 12 | # tool to be installed. The section ID can be found 13 | # in Galaxy's `shed_tool_conf.xml` config file. Note 14 | # that the specified section must exist in this file. 
 15 | # Otherwise, the tool will be installed outside any 16 | # section. 17 | # * tool_panel_section_label: Display label of a tool panel section where 18 | # you want the tool to be installed. If it does not 19 | # exist, this section will be created on the target 20 | # Galaxy instance (note that this is different than 21 | # when using the ID). 22 | # Multi-word labels need to be placed in quotes. 23 | # Each label will have a corresponding ID created; 24 | # the ID will be an all lowercase version of the 25 | # label, with multiple words joined with 26 | # underscores (e.g., 'BED tools' -> 'bed_tools'). 27 | # 28 | # You can also specify the following optional keys to further define the 29 | # installation properties: 30 | # * tool_shed_url: the URL of the Tool Shed from where the tool should be 31 | # installed. (default: https://toolshed.g2.bx.psu.edu) 32 | # * revisions: a list of revisions of the tool, all of which will attempt to 33 | # be installed. (default: latest) 34 | # * install_tool_dependencies: True or False - whether to install tool 35 | # dependencies or not. (default: True) 36 | # * install_repository_dependencies: True or False - whether to install repo 37 | # dependencies or not. 
(default: True) 38 | 39 | api_key: admin 40 | galaxy_instance: http://localhost:8080 41 | install_resolver_dependencies: True 42 | install_tool_dependencies: False 43 | tools: 44 | - name: text_processing 45 | owner: bgruening 46 | tool_panel_section_id: "textutil" 47 | 48 | - name: fasta_compute_length 49 | owner: devteam 50 | tool_panel_section_label: "FASTA manipulation tools" 51 | 52 | - name: fasta_to_tabular 53 | owner: devteam 54 | tool_panel_section_label: "FASTA manipulation tools" 55 | 56 | - name: tabular_to_fasta 57 | owner: devteam 58 | tool_panel_section_label: "FASTA manipulation tools" 59 | 60 | - name: seq_filter_by_id 61 | owner: peterjc 62 | tool_panel_section_label: "FASTA manipulation tools" 63 | 64 | - name: infernal 65 | owner: bgruening 66 | tool_panel_section_label: "GraphClust" 67 | 68 | - name: cdhit 69 | owner: bebatut 70 | tool_panel_section_label: "CD-HIT" 71 | 72 | - name: viennarna_rnafold 73 | owner: rnateam 74 | tool_panel_section_label: "GraphClust" 75 | 76 | 77 | -------------------------------------------------------------------------------- /assets/tours/graphclust_step_by_step.yaml: -------------------------------------------------------------------------------- 1 | name: GraphClust workflow step by step 2 | description: Step by step instructions for using GraphClust for clustering RNA sequences 3 | title_default: "GraphClust step by step" 4 | steps: 5 | - title: "A tutorial on GraphClust(Clustering RNA sequences)" 6 | content: "This tour will walk you through the process of GraphClust to cluster RNA sequences.

 7 | Read and follow the instructions before clicking 'Next'.

 8 | Click 'Prev' in case you missed out on any step." 9 | backdrop: true 10 | 11 | - title: "A tutorial on GraphClust" 12 | content: "Together we will go through the following steps:
13 | 14 | 15 |
  • Data Acquisition
  • 16 |
  • Pre-processing
  • 17 |
  • Creating a graph from FASTA
  • 18 |
  • Creating a sparse vector
  • 19 |
  • Computing candidate clusters
  • 20 |
  • Preprocessing data for computing best subtree
  • 21 |
  • Computing best subtree using LocaRna
  • 22 |
  • Finding consensus motifs
  • 23 |
  • Building Covariance Model using cmbuild
  • 24 |
  • Searching homologous sequences using cmsearch
  • 25 |
  • Post-processing
  • 26 |
    27 |
    " 28 | backdrop: true 29 | 30 | 31 | - title: "Data Acquisition" 32 | content: "We will start with a simple small FASTA file.

    33 | You will get one FASTA file with RNA sequences that we want to cluster.

    " 34 | backdrop: true 35 | 36 | - title: "Data Acquisition" 37 | element: ".upload-button" 38 | intro: "We will import the FASTA file into the history we just created.

    39 | Click 'Next' and the tour will take you to the Upload screen." 40 | position: "right" 41 | postclick: 42 | - ".upload-button" 43 | 44 | - title: "Data Acquisition" 45 | element: "button#btn-new" 46 | intro: "The sample training data available on github is a good place to start.

    47 | Simply click 'Next' and the links to the training data will be automatically inserted and ready for upload.

    48 | Later on, when you want to upload other data, you can do so by clicking the 'Paste/Fetch Data' button or 49 | 'Choose local file' to upload a locally stored file." 50 | position: "top" 51 | postclick: 52 | - "button#btn-new" 53 | 54 | - title: "Data Acquisition" 55 | element: ".upload-text-content:first" 56 | intro: "Links acquired!" 57 | position: "top" 58 | textinsert: 59 | https://github.com/BackofenLab/docker-galaxy-graphclust/raw/master/data/Rfam-cliques-dataset/cliques-low-representatives.fa 60 | 61 | - title: "Data Acquisition" 62 | element: "button#btn-start" 63 | intro: "Click on 'Start' to upload the data into your Galaxy history." 64 | position: "top" 65 | 66 | - title: "Data Acquisition" 67 | element: "button#btn-close" 68 | intro: "The upload may take a while.

    69 | Hit the close button when you see that the files are uploaded into your history." 70 | position: "top" 71 | 72 | - title: "Data Acquisition" 73 | element: "#current-history-panel > div.controls" 74 | intro: "You've acquired your data. Now let's start using the GraphClust tools.

    " 75 | position: "left" 76 | 77 | - title: "GraphClust" 78 | intro: "Once we have the data for analysis we can start the process of clustering RNA sequences step by step.
    79 | Navigate to the tool panel on the left side and click on GraphClust. This will open a section 80 | with the set of tools necessary for the GraphClust process.
    81 | The first step of the process is Preprocessing." 82 | position: "right" 83 | 84 | 85 | - title: "Pre-processing" 86 | intro: "This tool takes as input a file of sequences in FASTA format 87 | and creates the final input for GraphClust based on the given parameters. The parameters allow us to 88 | split long sequences into smaller fragments to enable the detection of local signals" 89 | position: "right" 90 | postclick: 91 | - "#preproc > div.toolTitle > div > a" 92 | 93 | 94 | - title: "Pre-processing" 95 | element: "#s2id_uid-18_select > a" 96 | intro: "Here we should define our input file.
    " 97 | position: "top" 98 | 99 | 100 | - title: "Pre-processing" 101 | element: "#uid-11" 102 | intro: "'Window size' defines the length of the fragments into which the input sequences will be split. 103 | In the default settings it is set to a very high number so that the sequences are not split at all.

    " 104 | position: "top" 105 | 106 | 107 | - title: "Pre-processing" 108 | element: "#uid-64 > div.ui-form-title > span" 109 | intro: "'Window shift in percent' defines the shift, in percent, for fragments of the input sequences. 110 | In the default settings it is 100% because we don't split the input sequences.

    " 111 | position: "top" 112 | 113 | - title: "Pre-processing" 114 | element: "#uid-46 > div.ui-form-field" 115 | intro: "Minimum sequence length defines the minimal length of input sequences.

    " 116 | position: "top" 117 | 118 | 119 | - title: "Pre-processing" 120 | element: "#execute" 121 | intro: "To run the tool press the 'Execute' button.
    " 122 | position: "top" 123 | 124 | 125 | - title: "Understanding the Output" 126 | element: "#current-history-panel" 127 | intro: "After the tool is executed several output files are created. 128 |
    By clicking on the 'eye' icon you can see the content of the files. 129 |

    " 130 | position: "left" 131 | 132 | - title: "Preprocessing : DONE" 133 | intro: "Once preprocessing of the data is done we can move on to the next step: creating a graph from the FASTA file.
    " 134 | position: "right" 135 | 136 | - title: "Creating a graph from FASTA" 137 | intro: "To create a graph we need to use a tool called fasta_to_gspan which you 138 | can find in the GraphClust section of the tool panel.
    " 139 | position: "right" 140 | postclick: 141 | - "#gspan > div.toolTitle > div > a" 142 | 143 | - title: "Creating a graph from FASTA" 144 | element: "#uid-71 > div.ui-form-field > div.ui-select-content > div.ui-options > div.btn-group.ui-radiobutton" 145 | intro: "As input this tool takes the pre-processed FASTA file: 'data.fasta'.
    " 146 | position: "right" 147 | 148 | 149 | - title: "Creating a graph from FASTA" 150 | intro: "A detailed description of each parameter can be found in the help section at the bottom of the page" 151 | position: "top" 152 | 153 | - title: "Creating a graph from FASTA" 154 | element: "#execute" 155 | intro: "To run the tool press the 'Execute' button.
    " 156 | position: "top" 157 | 158 | - title: "Creating a graph from FASTA" 159 | intro: "Once the tool is executed it will produce a gspan.zip file.
    " 160 | position: "top" 161 | 162 | - title: "Creating a graph from FASTA : DONE" 163 | intro: "Now we have the gspan.zip file and can move on to the next step.
    164 | The next step is to create a sparse vector using NSPDK.
    165 | From the GraphClust tools choose NSPDK_sparseVect to start the process.
    " 166 | position: "top" 167 | postclick: 168 | - "#nspdk_sparse > div.toolTitle > div > a" 169 | 170 | - title: "Creating a sparse vector" 171 | intro: "This tool will create an explicit sparse feature encoding using NSPDK.
    " 172 | position: "top" 173 | 174 | 175 | - title: "Creating a sparse vector" 176 | intro: "This tool requires 2 input files: 177 |
    178 | 179 | 180 |
  • data.fasta : from pre-processing step
  • 181 |
  • gspan.zip : from the previous step
  • 182 |
    183 |
    " 184 | position: "top" 185 | 186 | 187 | - title: "Creating a sparse vector" 188 | intro: "More information about NSPDK can be found in the help section
    " 189 | position: "top" 190 | 191 | - title: "Creating a sparse vector" 192 | element: "#execute" 193 | intro: "Run the tool by pressing the 'Execute' button.
    " 194 | position: "top" 195 | 196 | - title: "Creating a sparse vector" 197 | intro: "After the tool is executed it will produce data_svector which we will use in the next step." 198 | position: "top" 199 | 200 | - title: "Creating a sparse vector: DONE" 201 | intro: "The created data_svector file will be used in the next step for computing 202 | candidate clusters. For that click on the NSPDK_candidateClusters tool." 203 | position: "top" 204 | postclick: 205 | - "#NSPDK_candidateClust > div.toolTitle > div > a" 206 | 207 | 208 | - title: "Computing candidate clusters" 209 | intro: "During this step we will compute the global feature index and get the top dense sets. 210 | The candidate clusters are chosen as the top ranking neighborhoods provided that the 211 | size of their overlap is below a specified threshold." 212 | position: "top" 213 | 214 | 215 | 216 | - title: "Computing candidate clusters" 217 | intro: "Here we have to specify 3 input files: 218 |
    219 | 220 | 221 |
  • data_svector : from the previous step
  • 222 |
  • data_fasta
  • 223 | and 224 |
  • data_names from pre-processing step
  • 225 |
    226 |
    " 227 | position: "top" 228 | 229 | - title: "Computing candidate clusters" 230 | intro: "Another important parameter for this tool is Multiple iterations. 231 | This parameter by default is set to 'no' which means we will do only a single iteration. 232 | By setting it to 'yes' we have to define some other input files which we would get from previous 233 | iterations. But in the scope of this tutorial we will just go for a single iteration." 234 | position: "top" 235 | 236 | - title: "Computing candidate clusters" 237 | intro: "For more information about this tool check the help section.
    " 238 | position: "top" 239 | 240 | - title: "Computing candidate clusters" 241 | element: "#execute" 242 | intro: "Run the tool by pressing 'Execute' button.
    " 243 | position: "top" 244 | 245 | - title: "Computing candidate clusters" 246 | intro: "This step will produce 3 output files: 247 |
    248 | 249 | 250 |
  • fast_cluster
  • 251 |
  • fast_cluster_sim
  • 252 |
  • blacklist
  • 253 |
    254 |
    255 | fast_cluster represents the id's of the candidate clusters.
    256 | fast_cluster_sim represents the similarity scores.
    257 | blacklist contains the ids of sequences that were already clustered, so in the case 258 | of a single iteration it is empty.
    " 259 | position: "top" 260 | 261 | - title: "Computing candidate clusters : DONE" 262 | intro: "Once output files are ready we can move to the nexr step by clicking on 263 | premlocarna tool.
    " 264 | position: "top" 265 | postclick: 266 | - "#preMloc > div.toolTitle > div > a" 267 | 268 | - title: "Preprocessing data for computing best subtree" 269 | intro: "This tool will do some pre-processing for computing best subtrees.
    " 270 | position: "top" 271 | 272 | 273 | - title: "Preprocessing data for computing best subtree" 274 | intro: "This step needs 4 input files: 275 |
    276 | 277 | 278 |
  • fast_cluster : from the previous step
  • 279 |
  • fast_cluster_sim : from the previous step
  • 280 |
  • data_fasta
  • 281 | and 282 |
  • data_names from pre-processing step
  • 283 |
    284 |
    285 | " 286 | position: "top" 287 | 288 | - title: "Preprocessing data for computing best subtree" 289 | element: "#execute" 290 | intro: "Run the tool by pressing 'Execute' button.
    " 291 | position: "top" 292 | 293 | - title: "Preprocessing data for computing best subtree" 294 | intro: "Execution of this tool results in 4 datasets : 295 |
    296 | 297 | 298 |
  • centers
  • 299 |
  • trees
  • 300 |
  • tree_matrix
  • 301 |
  • cmfinder_fa
  • 302 |
  • model_tree_fa
  • 303 |
    304 |
    305 | These datasets will be used in next steps. 306 |
    " 307 | position: "top" 308 | 309 | - title: "Preprocessing data for computing best subtree : DONE" 310 | intro: "Preprocessing for the best tree computation is done, so we can now do the actual computation 311 | of the best subtree. For that we need the tool called locarna_best_subtree.
    " 312 | position: "top" 313 | postclick: 314 | - "#locarna_best_subtree > div.toolTitle > div > a" 315 | 316 | - title: "Computing best subtree using LocaRna" 317 | intro: "This step computes a multiple sequence-structure alignment of RNA sequences using LocaRna. 318 | It uses tree file (from previous step) with guide tree in NEWICK format. 319 | The given tree is used as guide tree for the progressive alignment. It saves the calculation 320 | of pairwise all-vs-all similarities and construction of the guide tree. And at the end return the best subtree
    " 321 | position: "top" 322 | 323 | 324 | - title: "Computing best subtree using LocaRna" 325 | intro: "This step takes as an input following files: 326 | 327 | 328 |
  • centers
  • 329 |
  • trees
  • 330 |
  • tree_matrix
  • 331 | from previous steps and 332 |
  • data_map from pre step
  • 333 |
    334 |
    335 |
    " 336 | position: "top" 337 | 338 | - title: "Computing best subtree using LocaRna" 339 | element: "#execute" 340 | intro: "Run the tool by pressing 'Execute' button.
    " 341 | position: "top" 342 | 343 | - title: "Computing best subtree using LocaRna" 344 | intro: "Output of this tool is model.tree.stk which will 345 | be used in next step to find consensus motives.
    " 346 | position: "top" 347 | 348 | - title: "Computing best subtree using LocaRna : DONE" 349 | intro: "Now we have model.tree.stk so we can find consensus motives using the next tool - 350 | CMFinder_v0.
    " 351 | position: "top" 352 | postclick: 353 | - "#cmFinder > div.toolTitle > div > a" 354 | 355 | - title: "Finding consensus motives" 356 | intro: "During this step conversion from CLUSTAL format files to STOCKHOLM format is done. 357 | Then using CMFinder we determine consensus motives for sequences.
    " 358 | position: "top" 359 | 360 | 361 | - title: "Finding consensus motives" 362 | intro: "This tool takes as an input the following files: 363 | 364 | 365 |
  • model_tree_stk : from previous step
  • 366 |
  • cmfinder_fa : from 'Preprocessing data for computing best subtree' step
  • 367 |
  • tree_matrix
  • 368 |
    369 |
    370 |
    " 371 | position: "top" 372 | 373 | - title: "Finding consensus motives" 374 | element: "#execute" 375 | intro: "Run the tool by pressing 'Execute' button.
    " 376 | position: "top" 377 | 378 | - title: "Finding consensus motives" 379 | intro: "Output of the tool is in STOCKHOLM format and contains the consensus structure.
    " 380 | position: "top" 381 | 382 | - title: "Finding consensus motives : DONE" 383 | intro: "Once we have consensus structure we can build covariance model with cmbuild tool. 384 | For that click on the tool named Build covariance models
    " 385 | position: "top" 386 | postclick: 387 | - "#infernal > div:nth-child(3) > div > a " 388 | 389 | - title: "Building Covariance Model using cmbuild" 390 | intro: "In this step we cm build a covariance model of an RNA multiple alignment. 391 | cmbuild uses the consensus structure to determine the architecture of the covariance model.
    " 392 | position: "top" 393 | 394 | 395 | - title: "Building Covariance Model using cmbuild" 396 | intro: "As an input for this tool we give 'model_cmfinder_stk' file containing consensus 397 | structure from the pre step.
    398 | For more information about this tool read the help section of the page.
    " 399 | position: "top" 400 | 401 | - title: "Building Covariance Model using cmbuild" 402 | element: "#execute" 403 | intro: "Run the tool by pressing 'Execute' button.
    " 404 | position: "top" 405 | 406 | - title: "Building Covariance Model using cmbuild : DONE" 407 | intro: "After covariance model is built we can move on to . 408 | Simply click on Search covariance model(s).
    " 409 | position: "top" 410 | postclick: 411 | - "#infernal > div:nth-child(4) > div > a" 412 | 413 | - title: "Searching homologous sequences using cmsearch" 414 | intro: "cmsearch allows you to make consensus RNA secondary structure profiles, 415 | and use them to search nucleic acid sequence databases for homologous RNAs.
    " 416 | position: "top" 417 | 418 | - title: "Searching homologous sequences using cmsearch" 419 | intro: "As a sequence database we choose data_fasta_scan file generated during 420 | pre-processing.
    421 | Then, to use the covariance model generated in the previous step, 422 | select the 'Covariance model from your history' option from the 'Subject covariance models' dropdown menu." 423 | position: "top" 424 | 425 | - title: "Searching homologous sequences using cmsearch" 426 | element: "#execute" 427 | intro: "Run the tool by pressing 'Execute' button.
    " 428 | position: "top" 429 | 430 | - title: "Searching homologous sequences using cmsearch : DONE" 431 | intro: "Finally we reached the last step of our workflow! 432 | Click on Report_Results to do the final step.
    " 433 | position: "top" 434 | postclick: 435 | - "#glob_report > div.toolTitle > div > a" 436 | 437 | - title: "Post-processing" 438 | intro: "Final step of our workflow is Post-processing.
    439 | In this step we will report clusters and merge them if needed." 440 | position: "top" 441 | 442 | 443 | - title: "Post-processing" 444 | intro: "This tool takes as an input the following files: 445 | 446 | 447 |
  • FASTA.zip : from Preprocessing step
  • 448 |
  • cmsearch_results : from previous step
  • 449 | and 450 |
  • model_tree_files : from 'Preprocessing data for computing best subtree' step
  • 451 |
    452 |
    453 |
    " 454 | position: "top" 455 | 456 | - title: "Post-processing" 457 | intro: "The final output:
    458 | 'cluster.final.stat' file contains general information about clusters, 459 | e.g. number of clusters, number of sequences in each cluster etc. 460 |
    By clicking on the 'eye' icon you can see the content of the file. 461 |
    462 | 'CLUSTERS' dataset collection contains one file for each cluster.
    463 | Each file contains information about sequences in that cluster. Each line in the file contains: cluster number, 464 | cm_score, sequence origin (whether it comes from model or from Infernal search) and sequence id. 465 |
    " 466 | position: "top" 467 | 468 | - title: "A tutorial on GraphClust workflow" 469 | intro: "Thank You for going through our tutorial." 470 | backdrop: true 471 | -------------------------------------------------------------------------------- /assets/tours/graphclust_tutorial.yaml: -------------------------------------------------------------------------------- 1 | name: GraphClust workflow 2 | description: Simple instructions for using GraphClust workflow for clustering RNA sequences 3 | title_default: "GraphClust" 4 | steps: 5 | - title: "A tutorial on Galaxy-GraphClust(Clustering RNA sequences)" 6 | content: "This tour will walk you through the process of GraphClust to cluster RNA sequences.

    7 | In the forthcoming windows please read and follow the instructions before clicking 'Next'.

    8 | Click 'Prev' in case you missed out on any step." 9 | backdrop: true 10 | 11 | - title: "A tutorial on GraphClust" 12 | content: "Together we will go through the following steps:
    13 |
      14 | 15 |
    1. Data Acquisition
    2. 16 |
    3. Running the Workflow
    4. 17 |
    5. Understanding the Output
    6. 18 |
      19 | " 20 | backdrop: true 21 | 22 | - title: "Log in" 23 | element: '#user > li > a' 24 | intro: " To be able to use workflows you should be logged in. So if you already have an account 25 | simply log in or otherwise register by clicking on 'User'.
      26 | Within a Docker Galaxy-GraphClust everyone can register by default. 27 |
      28 | For convenient access to the workflows, you can log in with the pre-configured username and password:
      29 | 30 | username : admin@galaxy.org
      31 | password : admin
      32 |
      " 33 | position: "left" 34 | 35 | 36 | - title: "GraphClust" 37 | intro: "Now that you are logged-in we can continue our tour" 38 | position: "left" 39 | backdrop: true 40 | 41 | 42 | - title: "Create a new history" 43 | element: '#history-options-button' 44 | intro: "Let's start by creating a new history:
      45 | (History options :: Create New)" 46 | position: "left" 47 | preclick: 48 | - '#center-panel' 49 | 50 | - title: "Rename the history" 51 | element: "#current-history-panel > div.controls" 52 | intro: "Change the name of the new history to 'GraphClust'." 53 | position: "left" 54 | 55 | - title: "Data Acquisition" 56 | content: "We start with uploading a simple small set of sequences in FASTA format.

      57 | You will get one FASTA file with RNA sequences that we want to cluster.

      " 58 | backdrop: true 59 | 60 | - title: "Data Acquisition" 61 | element: ".upload-button" 62 | intro: "We will import the FASTA file into the history we just created.

      63 | Click 'Next' and the tour will take you to the Upload screen." 64 | position: "right" 65 | postclick: 66 | - ".upload-button" 67 | 68 | - title: "Data Acquisition" 69 | element: "button#btn-new" 70 | intro: "The sample input data available on GitHub is a good place to start.

      71 | Simply click 'Next' and the links to the input data will be automatically inserted and ready for upload.

      72 | Later on, when you want to upload other data, you can do so by clicking the 'Paste/Fetch Data' button or 73 | 'Choose local file' to upload locally stored file." 74 | position: "top" 75 | postclick: 76 | - "button#btn-new" 77 | 78 | - title: "Data Acquisition" 79 | element: ".upload-text-content:first" 80 | intro: "Link acquired !

      81 | This file contains annotated RNAs of human origin from the Rfam database, drawn from a mixture of RNA families." 82 | position: "top" 83 | textinsert: 84 | https://github.com/BackofenLab/docker-galaxy-graphclust/raw/master/data/Rfam-cliques-dataset/cliques-low-representatives.fa 85 | 86 | - title: "Data Acquisition" 87 | element: "button#btn-start" 88 | intro: "Click on 'Start' to upload the data into your Galaxy history." 89 | position: "top" 90 | 91 | - title: "Data Acquisition" 92 | element: "button#btn-close" 93 | intro: "The upload may take a while.

      94 | Hit the close button when you see that the files are uploaded into your history." 95 | position: "top" 96 | 97 | - title: "Data Acquisition" 98 | element: "#current-history-panel > div.controls" 99 | intro: "You've now acquired the input data. Now let's launch a flavor of Galaxy-GraphClust Workflow.

      " 100 | position: "left" 101 | 102 | - title: "Running a Workflow" 103 | element: 'a[href$="/workflow/list_for_run"]' 104 | intro: "Click on 'All Workflows' to access your saved and pre-configured Workflows.
      105 | Alternatively you can click on 'Workflow' tab from the top panel." 106 | position: "right" 107 | 108 | 109 | - title: "Running a Workflow" 110 | element: 'a[href$="/workflow/run?id=1cd8e2f6b131e891"]' 111 | intro: "Inside your workflows list you should see variations of Galaxy-GraphClust. 112 | The round number specifies the number of iterative clusterings.

      113 | Click on GraphClust_1_round then Run, which is the faster one and better suited to 114 | the purpose of this tutorial. Then click Next." 115 | position: "top" 116 | 117 | 118 | - title: "Running a Workflow" 119 | element: "#field-uid-1 > div.btn-group.ui-radiobutton" 120 | intro: "We skip the 'History Options' section because we have already created a new history, so there is no need to create another one.

      " 121 | position: "top" 122 | 123 | - title: "Running a Workflow" 124 | element: "#uid-23 > div.portlet-header > div.portlet-title > span > b" 125 | intro: "Step 1 is the first step of our workflow. Here an input dataset must be assigned.
      126 | Input data is a set of putative RNA sequences that we want to cluster.

      127 | Please ensure the fasta file uploaded in the first step is selected." 128 | position: "right" 129 | 130 | 131 | - title: "Running a Workflow" 132 | element: 'button#uid-11' 133 | intro: "To run the workflow with default setting simply click on 'Run workflow' blue button 134 | on the top right.
      135 | For details about pipeline settings you can check the 'step-by-step' tutorial and the Galaxy-GraphClust documentation.
      " 136 | position: "left" 137 | 138 | - title: "Understanding the Output" 139 | intro: "Running the workflow takes a few minutes. The workflow is finished 140 | when all the steps inside History panel changes from gray/yellow to green.

      141 | After all the steps are done, the clustering output is ready. 142 | The results can be checked by navigating through the History panel." 143 | position: "top" 144 | 145 | - title: "Understanding the Output" 146 | element: "#current-history-panel" 147 | intro: "The 'cluster.final.stat' file contains overall information about the predicted clusters. 148 |
      By clicking on the 'eye' icon you can see the content of the file.

      149 | The first four columns specify number of clusters, cluster ids, number of sequences in each cluster." 150 | position: "left" 151 | 152 | - title: "Understanding the Output" 153 | element: "#current-history-panel" 154 | intro: "Click 'CLUSTERS' dataset collection to see clustered sequences. 155 | There is one file for each cluster.
      156 | Each file contains information about sequences in that cluster. Each line in the file contains: 157 |
        158 | 159 |
      1. CLUSTER: cluster number
      2. 160 |
      3. cm_score: the covariance model bit score, which indicates how well the sequence matches the CM 161 | model of the cluster.
      4. 162 |
      5. sequence origin (whether it originates from dense center MODEL 163 | or from Infernal CMSEARCH or preclustering CDHIT)
      6. 164 |
      7. Input FASTA sequence id separated into ORIGID and ORIGHEAD sections
      8. 165 | 166 |
      " 167 | 168 | position: "left" 169 | 170 | 171 | 172 | - title: "A tutorial on GraphClust workflow" 173 | intro: "Thank You for going through our tutorial." 174 | backdrop: true 175 | -------------------------------------------------------------------------------- /assets/tours/graphclust_very_short.yaml: -------------------------------------------------------------------------------- 1 | name: GraphClust workflow fast tutorial 2 | description: Simple and short instructions for using GraphClust workflow for clustering RNA sequences 3 | title_default: "GraphClust_short_tour" 4 | steps: 5 | - title: "A tutorial on GraphClust(Clustering RNA sequences)" 6 | content: "This tour will walk you through the process of GraphClust to cluster RNA sequences.

      7 | Read and follow the instructions before clicking 'Next'.

      8 | Click 'Prev' in case you missed out on any step." 9 | backdrop: true 10 | 11 | - title: "A tutorial on GraphClust" 12 | content: "Together we will go through the following steps:
      13 | 14 | 15 |
    7. Data Acquisition
    8. 16 |
    9. Calling the Workflow
    10. 17 |
    11. Understanding the Output
    12. 18 |
      19 |
      " 20 | backdrop: true 21 | 22 | - title: "Log in" 23 | element: '#user > li > a' 24 | intro: " To be able to use workflows you should be logged in. So if you already have an account 25 | simply log in or otherwise register by clicking on 'User'." 26 | position: "left" 27 | 28 | 29 | - title: "GraphClust" 30 | intro: "Now when you are logged in we can continue out tour" 31 | position: "left" 32 | backdrop: true 33 | 34 | 35 | 36 | - title: "Data Acquisition" 37 | content: "We will start with a simple small FASTA file.

      38 | You will get one FASTA file with RNA sequences that we want to cluster.

      " 39 | backdrop: true 40 | 41 | - title: "Data Acquisition" 42 | element: ".upload-button" 43 | intro: "We will import the FASTA file into into the history we just created.

      44 | Click 'Next' and the tour will take you to the Upload screen." 45 | position: "right" 46 | postclick: 47 | - ".upload-button" 48 | 49 | - title: "Data Acquisition" 50 | element: "button#btn-new" 51 | intro: "The sample training data available on github is a good place to start.

      52 | Simply click 'Next' and the links to the training data will be automatically inserted and ready for upload.

      53 | Later on, when you want to upload other data, you can do so by clicking the 'Paste/Fetch Data' button or 54 | 'Choose local file' to upload a locally stored file." 55 | position: "top" 56 | postclick: 57 | - "button#btn-new" 58 | 59 | - title: "Data Acquisition" 60 | element: ".upload-text-content:first" 61 | intro: "Links Acquired !" 62 | position: "top" 63 | textinsert: 64 | https://github.com/BackofenLab/docker-galaxy-graphclust/raw/master/data/Rfam-cliques-dataset/cliques-low-representatives.fa 65 | 66 | - title: "Data Acquisition" 67 | element: "button#btn-start" 68 | intro: "Click on 'Start' to upload the data into your Galaxy history." 69 | position: "top" 70 | 71 | - title: "Data Acquisition" 72 | element: "button#btn-close" 73 | intro: "The upload may take a while.

      74 | Hit the close button when you see that the files are uploaded into your history." 75 | position: "top" 76 | 77 | - title: "Data Acquisition" 78 | element: "#current-history-panel > div.controls" 79 | intro: "You've acquired your data. Now let's call the GraphClust Workflow.

      " 80 | position: "left" 81 | 82 | - title: "Running a Workflow" 83 | element: 'a[href$="/workflow/list_for_run"]' 84 | intro: "Click on 'All Workflows' to access your saved workflows.
      " 85 | position: "right" 86 | 87 | 88 | - title: "Running a Workflow" 89 | element: 'a[href$="/workflow/run?id=1cd8e2f6b131e891"]' 90 | intro: "Select simple one round iteration workflow GraphClust_1_round.

      " 91 | position: "top" 92 | 93 | 94 | - title: "Running a Workflow" 95 | element: "#field-uid-1 > div.btn-group.ui-radiobutton" 96 | intro: "If you want the output to be in a new history click 'yes' in 'History Options' otherwise just move on.

      " 97 | position: "top" 98 | 99 | - title: "Running a Workflow" 100 | element: "#uid-23 > div.portlet-header > div.portlet-title > span > b" 101 | intro: "Step 1 is the first step of our workflow.Here we should define out input dataset, 102 | which will be the uploaded FASTA file.

      " 103 | position: "right" 104 | 105 | 106 | - title: "Running a Workflow" 107 | element: 'button#uid-11' 108 | intro: "To run the workflow with default setting simply click on 'Run workflow' button 109 | on the top.

      " 110 | position: "left" 111 | 112 | - title: "Understanding the Output" 113 | intro: "Running the workflow might take a while. 114 | After all the steps are done in History panel we will see the outputs.

      " 115 | position: "top" 116 | 117 | - title: "Understanding the Output" 118 | element: "#current-history-panel" 119 | intro: "'cluste.final.stat' file contains general information about clusters, 120 | e.g. number of clusters, number of sequences in each cluster etc. 121 |
      By clicking on the 'eye' icon you can see the content of the file. 122 |

      " 123 | position: "left" 124 | 125 | - title: "Understanding the Output" 126 | element: "#current-history-panel" 127 | intro: "'CLUSTERS' dataset collection contains one file for each cluster.
      128 | Each file contains information about sequences in that cluster. Each line in the file contains: 129 | 130 | 131 |
    13. cluster number
    14. 132 |
    15. cm_score
    16. 133 |
    17. sequence origin (whether it comes from model or from Infernal search)
    18. 134 |
    19. sequence id
    20. 135 |
      136 |
      " 137 | 138 | position: "left" 139 | 140 | 141 | 142 | - title: "A tutorial on GraphClust workflow" 143 | intro: "Thank You for going through our tutorial." 144 | backdrop: true 145 | -------------------------------------------------------------------------------- /assets/welcome.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
      10 |
      11 |
      12 |

      Hello, your Galaxy GraphClust-2 Docker container is running!

      13 | GraphClust-2 is a web-based workflow for structural clustering of RNA secondary structures, built on the GraphClust methodology using the Galaxy framework. This is a running instance of the GraphClust-2 virtualized container. 14 |

      How to start:

      15 | A quick and easy way to get familiar with Galaxy and GraphClust-2 would be to visit the interactive tours provided under Help → Interactive Tours. Or simply click on the Guided Tour button below to start a tour. 16 |
      17 | Guided Tour Tutorial» 18 | 21 |

      Documentation:

      22 | For more information about the GraphClust-2 pipeline please check GraphClust-2 website:
      23 | GraphClust-2 repository
      24 | 25 | 26 |
      27 |
      28 | 29 |
      30 | 31 |
      32 |
      33 |
      34 | 35 |

      References:

      36 |
      37 |
        38 | M. Miladi, E. Sokhoyan, T. Houwaart, S. Heyne, F. Costa, R. Backofen and B. Gruening; Empowering the annotation and discovery of structured RNAs with scalable and accessible integrative clustering (under preparation/revision) 39 |
      40 |
      41 | 42 |

      About Galaxy project:

      43 |
      44 | 45 |

      46 | 47 | Galaxy is an open platform for supporting data intensive 48 | research. Galaxy is developed by The Galaxy Team 49 | with the support of many contributors. 50 | The Galaxy Docker project is supported by the University of Freiburg, part of de.NBI. 51 |

      52 | 53 | 63 | 64 |
      65 | 66 | 67 | 68 | -------------------------------------------------------------------------------- /data/README: -------------------------------------------------------------------------------- 1 | Sample input data for Galaxy-GraphClust pipeline 2 | 3 | Cliques-high and cliques-low datasets are based on Rfam 12.0 seed sequences with Human originate and different levels of similarity. 4 | -------------------------------------------------------------------------------- /data/Rfam-cliques-dataset/cliques-low-representatives.fa: -------------------------------------------------------------------------------- 1 | >RF00001_rep.0_AL096764.11/46123-46004 RF00001 2 | GUCUAUGGCCAUACCACCCUGAAUGUGCUUGAUCUCAUCUGAUCUCGUGAAGCCAAGCAGGGUGGGGCCUAGUUAGUACUUGGAUGGGAGACUUCCUGGGAAUAUAAGCUGCUGUUGGCU 3 | >RF00001_rep.1_U89919.1/939-1056 RF00001 4 | CUUUACGGCCACACCACCCUGAACGCACCGGAUCUCGACUGACCUUGAAAGCUAAGCAGGAUCGGGCCUGGUUAGUAUUGGGAUGGCAGACCCCCUGGAAAUACAGGGUGCUGAAGGU 5 | >RF00001_rep.2_AJ508600.1/161-58 RF00001 6 | GUCUACAGCCAUACCAUCCUGAACAUGCCAGAUCUUGUCUGACCUCUGAAGCUAAGCAGGGUCAAGCCUGGUUAGUACUUGGGAGAAGCUGGUGUGGCUAGACC 7 | >RF00005_rep.0_M15347.1/1040-968 RF00005 8 | GGCUCCAUAGCUCAGGGGUUAGAGCACUGGUCUUGUAAACCAGGGGUCGCGAGUUCAAUUCUCGCUGGGGCUU 9 | >RF00005_rep.10_X58792.1/174-245 RF00005 10 | GGUCCCAUGGUGUAAUGGUUAGCACUCUGGACUUUGAAUCCAGCGAUCCGAGUUCAAAUCUCGGUGGGACCU 11 | >RF00005_rep.11_AF346992.1/15890-15955 RF00005 12 | GUCCUUGUAGUAUAAACUAAUACACCAGUCUUGUAAACCGGAGAUGAAAACCUUUUUCCAAGGACA 13 | >RF00005_rep.12_AC108081.2/59868-59786 RF00005 14 | GUCAGGAUGGCCGAGCGGUCUAAGGCGCUGCGUUCAGGUCGCAGUCUCCCCUGGAGGCGUGGGUUCGAAUCCCACUUCUGACA 15 | >RF00005_rep.13_AC067849.6/4771-4840 RF00005 16 | CACUGUAAAGCUAACUUAGCAUUAACCUUUUAAGUUAAAGAUUAAGAGAACCAACACCUCUUUACAGUGA 17 | >RF00005_rep.14_AL021808.2/65570-65498 RF00005 18 | GCUUCUGUAGUGUAGUGGUUAUCACGUUCGCCUCACACGCGAAAGGUCCCCGGUUCGAAACCGGGCAGAAGCA 19 | >RF00005_rep.15_AC008443.10/42590-42518 RF00005 20 | 
GCCCGGCUAGCUCAGUCGGUAGAGCAUGAGACUCUUAAUCUCAGGGUCGUGGGUUCGAGCCCCACGUUGGGCG 21 | >RF00005_rep.16_AL133551.13/12355-12436 RF00005 22 | GCAGCGAUGGCCGAGUGGUUAAGGCGUUGGACUUGAAAUCCAAUGGGGUCUCCCCGCGCAGGUUCGAACCCUGCUCGCUGCG 23 | >RF00005_rep.17_AL021918.1/54817-54736 RF00005 24 | GUAGUCGUGGCCGAGUGGUUAAGGCGAUGGACUUGAAAUCCAUUGGGGUUUCCCCGCGCAGGUUCGAAUCCUGUCGGCUACG 25 | >RF00005_rep.18_AL021918.1/81116-81197 RF00005 26 | GUAGUCGUGGCCGAGUGGUUAAGGCGAUGGACUAGAAAUCCAUUGGGGUUUCCCCACGCAGGUUCGAAUCCUGCCGACUACG 27 | >RF00005_rep.19_AF134583.1/1816-1744 RF00005 28 | UAGAUUGAAGCCAGUUGAUUAGGGUGCUUAGCUGUUAACUAAGUGUUUGUGGGUUUAAGUCCCAUUGGUCUAG 29 | >RF00005_rep.1_AC005329.1/7043-6971 RF00005 30 | GCCGAAAUAGCUCAGUUGGGAGAGCGUUAGACUGAAGAUCUAAAGGUCCCUGGUUCGAUCCCGGGUUUCGGCA 31 | >RF00005_rep.20_AL671879.2/100356-100285 RF00005 32 | GGGGAUGUAGCUCAGUGGUAGAGCGCAUGCUUCGCAUGUAUGAGGCCCCGGGUUCGAUCCCCGGCAUCUCCA 33 | >RF00005_rep.21_AL355149.13/15278-15208 RF00005 34 | GCAUUGGUGGUUCAGUGGUAGAAUUCUCGCCUCCCACGCGGGAGACCCGGGUUCAAUUCCCGGCCAAUGCA 35 | >RF00005_rep.22_AL590385.23/26487-26416 RF00005 36 | GCGUUGGUGGUAUAGUGGUGAGCAUAGCUGCCUUCCAAGCAGUUGACCCGGGUUCGAUUCCCGGCCAACGCA 37 | >RF00005_rep.23_M16479.1/42-123 RF00005 38 | GGUGGGGUUCCCGAGCGGCCAAAGGGAGCAGACUCUAAAUCUGCCGUCAUCGACUUCGAAGGUUCGAAUCCUUCCCCCACCA 39 | >RF00005_rep.24_AC004941.2/32735-32806 RF00005 40 | GGGGGUAUAGCUCAGGGGUAGAGCAUUUGACUGCAGAUCAAGAGGUCCCUGGUUCAAAUCCAGGUGCCCCCU 41 | >RF00005_rep.25_AC006449.19/196857-196784 RF00005 42 | GUCUCUGUGGCGCAAUCGGUUAGCGCGUUCGGCUGUUAACCGAAAGGUUGGUGGUUCGAGCCCACCCAGGGACG 43 | >RF00005_rep.26_AF346999.1/4402-4331 RF00005 44 | UAGGAUGGGGUGUGAUAGGUGGCACGGAGAAUUUUGGAUUCUCAGGGAUGGGUUCGAUUCUCAUAGUCCUAG 45 | >RF00005_rep.27_AL352978.6/119697-119770 RF00005 46 | GGCCGGUUAGCUCAGUUGGUUAGAGCGUGGUGCUAAUAACGCCAAGGUCGCGGGUUCGAUCCCCGUACGGGCCA 47 | >RF00005_rep.28_X04779.1/1-73 RF00005 48 | CCUUCGAUAGCUCAGCUGGUAGAGCGGAGGACUGUAGAUCCUUAGGUCGCUGGUUCGAUUCCGGCUCGAAGGA 49 | >RF00005_rep.29_AF381996.2/4265-4333 RF00005 50 | 
AGAAAUAUGUCUGAUAAAAGAGUUACUUUGAUAGAGUAAAUAAUAGGAGCUUAAACCCCCUUAUUUCUA 51 | >RF00005_rep.2_AL662865.4/12206-12135 RF00005 52 | GGUUCCAUGGUGUAAUGGUUAGCACUCUGGACUCUGAAUCCAGCGAUCCGAGUUCAAAUCUCGGUGGAACCU 53 | >RF00005_rep.30_AL132988.4/95773-95841 RF00005 54 | AAGGGCUUAGCUUAAUUAAAGUGGCUGAUUUGCGUUCAGUUGAUGCAGAGUGGGGUUUUGCAGUCCUUA 55 | >RF00005_rep.31_AC092686.3/29631-29561 RF00005 56 | GCAUUGGUGGUUCAGUGGUAGAAUUCUCGCCUGCCACGCGGGAGGCCCGGGUUCGAUUCCCGGCCAAUGCA 57 | >RF00005_rep.32_AF347015.1/5892-5827 RF00005 58 | GGUAAAAUGGCUGAGUGAAGCAUUGGACUGUAAAUCUAAAGACAGGGGUUAGGCCUCUUUUUACCA 59 | >RF00005_rep.33_AC018638.5/4694-4623 RF00005 60 | GGCUCGUUGGUCUAGGGGUAUGAUUCUCGCUUAGGGUGCGAGAGGUCCCGGGUUCAAAUCCCGGACGAGCCC 61 | >RF00005_rep.34_AC008443.10/43006-42934 RF00005 62 | GUUUCCGUAGUGUAGUGGUUAUCACGUUCGCCUCACACGCGAAAGGUCCCCGGUUCGAAACCGGGCGGAAACA 63 | >RF00005_rep.35_AC005783.1/27398-27326 RF00005 64 | GUUUCCGUAGUGUAGCGGUUAUCACAUUCGCCUCACACGCGAAAGGUCCCCGGUUCGAUCCCGGGCGGAAACA 65 | >RF00005_rep.36_AC007298.17/145366-145295 RF00005 66 | UCCUCGUUAGUAUAGUGGUGAGUAUCCCCGCCUGUCACGCGGGAGACCGGGGUUCGAUUCCCCGACGGGGAG 67 | >RF00005_rep.37_AF347001.1/16015-15948 RF00005 68 | CAGAGAAUAGUUUAAAUUAGAAUCUUAGCUUUGGGUGCUAAUGGUGGAGUUAAAGACUUUUUCUCUGA 69 | >RF00005_rep.38_J00309.1/356-427 RF00005 70 | UCCCUGGUGGUCUAGUGGCUAGGAUUCGGCGCUUUCACCGCCGCGCCCCGGGUUCGAUUCCCGGCCAGGAAU 71 | >RF00005_rep.39_AL031229.2/40502-40430 RF00005 72 | GUUUCCGUAGUGUAGUGGUUAUCACGUUCGCCUAACACGCGAAAGGUCCCUGGAUCAAAACCAGGCGGAAACA 73 | >RF00005_rep.3_Z54587.1/126-45 RF00005 74 | GGUAGCGUGGCCGAGCGGUCUAAGGCGCUGGAUUUAGGCUCCAGUCUCUUCGGAGGCGUGGGUUCGAAUCCCACCGCUGCCA 75 | >RF00005_rep.40_AF382013.1/10403-10467 RF00005 76 | UGGUAUAUAGUUUAAACAAAACGAAUGAUUUCGACUCAUUAAAUUAUGAUAAUCAUAUUUACCAA 77 | >RF00005_rep.41_AC093311.2/140036-139968 RF00005 78 | GUUCUUGUAGUUGAAAUACAACGAUGGUUUUUCAUAUCAUUGGUCGUGGUUGUAGUCCGUGCGAGAAUA 79 | >RF00005_rep.42_AF347015.1/5827-5762 RF00005 80 | AGCUCCGAGGUGAUUUUCAUAUUGAAUUGCAAAUUCGAAGAAGCAGCUUCAAACCUGCCGGGGCUU 81 | 
>RF00005_rep.43_L23320.1/77-10 RF00005 82 | ACUCUUUUAGUAUAAAUAGUACCGUUAACUUCCAAUUAACUAGUUUUGACAACAUUCAAAAAAGAGUA 83 | >RF00005_rep.44_AC008670.6/83597-83665 RF00005 84 | GUAAAUAUAGUUUAACCAAAACAUCAGAUUGUGAAUCUGACAACAGAGGCUCACGACCCCUUAUUUACC 85 | >RF00005_rep.45_AF382005.1/581-651 RF00005 86 | GUUUAUGUAGCUUACCUCCUCAAAGCAAUACACUGAAAAUGUUUAGACGGGCUCACAUCACCCCAUAAACA 87 | >RF00005_rep.46_AF347015.1/1604-1672 RF00005 88 | CAGAGUGUAGCUUAACACAAAGCACCCAACUUACACUUAGGAGAUUUCAACUUAACUUGACCGCUCUGA 89 | >RF00005_rep.4_Z98744.2/66305-66234 RF00005 90 | AGCAGAGUGGCGCAGCGGAAGCGUGCUGGGCCCAUAACCCAGAGGUCGAUGGAUCGAAACCAUCCUCUGCUA 91 | >RF00005_rep.5_AL590385.23/26129-26058 RF00005 92 | UCCCUGGUGGUCUAGUGGUUAGGAUUCGGCGCUCUCACCGCCGCGGCCCGGGUUCGAUUCCCGGUCAGGGAA 93 | >RF00005_rep.6_X93334.1/6942-7009 RF00005 94 | AAGGUAUUAGAAAAACCAUUUCAUAACUUUGUCAAAGUUAAAUUAUAGGCUAAAUCCUAUAUAUCUUA 95 | >RF00005_rep.7_AF347005.1/12268-12338 RF00005 96 | ACUUUUAAAGGAUAACAGCUAUCCAUUGGUCUUAGGCCCCAAAAAUUUUGGUGCAACUCCAAAUAAAAGUA 97 | >RF00005_rep.8_AF134583.1/1599-1666 RF00005 98 | AGAAAUUUAGGUUAAAUACAGACCAAGAGCCUUCAAAGCCCUCAGUAAGUUGCAAUACUUAAUUUCUG 99 | >RF00005_rep.9_AP000442.6/2022-1950 RF00005 100 | GCCCGGAUAGCUCAGUCGGUAGAGCAUCAGACUUUUAAUCUGAGGGUCCAGGGUUCAAGUCCCUGUUCGGGCG 101 | >RF00006_rep.0_AF045145.1/1-88 RF00006 102 | GGCUGGCUUUAGCUCAGCGGUUACUUCGCGUGUCAUCAAACCACCUCUCUGGGUUGUUCGAGACCCGCGGGCGCUCUCCAGCCCUCUU 103 | >RF00006_rep.1_AC005219.1/49914-50014 RF00006 104 | GGGUCGGAGUUAGCUCAAGCGGUUACCUCCUCAUGCCGGACUUUCUAUCUGUCCAUCUCUGUGCUGGGGUUCGAGACCCGCGGGUGCUUACUGACCCUUUU 105 | >RF00006_rep.2_AF045143.1/1-98 RF00006 106 | GGCUGGCUUUAGCUCAGCGGUUACUUCGACAGUUCUUUAAUUGAAACAAGCAACCUGUCUGGGUUGUUCGAGACCCGCGGGCGCUCUCCAGUCCUUUU 107 | >RF00006_rep.3_AF045144.1/1-88 RF00006 108 | GGCUGGCUUUAGCUCAGCGGUUACUUCGAGUACAUUGUAACCACCUCUCUGGGUGGUUCGAGACCCGCGGGUGCUUUCCAGCUCUUUU 109 | >RF00019_rep.0_V00584.1/39-151 RF00019 110 | 
GGCUGGUCCGAAGGUAGUGAGUUAUCUCAAUUGAUUGUUCACAGUCAGUUACAGAUCGAACUCCUUGUUCUACUCUUUCCCCCCUUCUCACUACUGCACUUGACUAGUCUUUU 111 | >RF00019_rep.1_L32608.1/283-377 RF00019 112 | GGCUGGUCCGAUGGUAGUGGGUUAUCAGAACUUAUUAACAUUAGUGUCACUAAAGUUGGUAUACAACCCCCCACUGCUAAAUUUGACUGGCUUUU 113 | >RF00019_rep.2_ABBA01033605.1/1707-1808 RF00019 114 | GGCUGGUCCGAGUGCAGUGGUGUUUACAACUAAUUGAUCACAACCAGUUACAGAUUUCUUUGUUCCUUCUCCACUCCCACUGCUUCACUUGACUAGCCUUUU 115 | >RF00019_rep.3_AADD01087475.1/2469-2552 RF00019 116 | AGUUGGUCCGAGUGUUGUGGGUUAUUGUUAAGUUGAUUUAACAUUGUCUCCCCCCACAACCGCGCUUGACUAGCUUGCUGUUUU 117 | >RF00027_rep.0_AF480570.1/1-79 RF00027 118 | GUGAGGUAGUAAGUUGUAUUGUUGUGGGGUAGGGAUAUUAGGCCCCAAUUAGAAGAUAACUAUACAACUUACUACUUUC 119 | >RF00027_rep.1_AC048341.22/3536-3622 RF00027 120 | CCUGGCUGAGGUAGUAGUUUGUGCUGUUGGUCGGGUUGUGACAUUGCCCGCUGUGGAGAUAACUGCGCAAGCUACUGCCUUGCUAGU 121 | >RF00027_rep.2_AC018755.3/119936-120011 RF00027 122 | CCGGGCUGAGGUAGGAGGUUGUAUAGUUGAGGAGGACACCCAAGGAGAUCACUAUACGGCCUCCUAGCUUUCCCCA 123 | >RF00031_rep.0_X71973.1/730-791 RF00031 124 | CCGGCACUCAUGACGGCCUGCCUGCAAACCUGCUGGUGGGGCAGACCCGAAAAUCCAGCGUG 125 | >RF00031_rep.1_U67171.1/375-442 RF00031 126 | GACGCUUCAUGAUAGGAAGGACUGAAAAGUCUUGUGGACACCUGGUCUUUCCCUGAUGUUCUCGUGGC 127 | >RF00031_rep.2_S79854.1/1605-1666 RF00031 128 | CACUGCUGAUGACGAACUAUCUCUAACUGGUCUUGACCACGAGCUAGUUCUGAAUUGCAGGG 129 | >RF00031_rep.3_X53463.1/847-903 RF00031 130 | UUCACAGAAUGAUGGCACCUUCCUAAACCCUCAUGGGUGGUGUCUGAGAGGCGUGAA 131 | >RF00031_rep.4_AF195141.1/689-759 RF00031 132 | GACUGACAUUAUGAAGGCCUGUACUGAAGACAGCAAGCUGUUAGUACAGACCAGAUGCUUUCUUGGCAGGC 133 | >RF00031_rep.5_AF093774.1/5851-5916 RF00031 134 | GUGUGCGGAUGAUAACUACUGACGAAAGAGUCAUCGACCUCAGUUAGUGGUUGGAUGUAGUCACAU 135 | >RF00031_rep.6_BC003127.1/865-928 RF00031 136 | GUCACUGCAUGAUCCGCUCUGGUCAAACCCUUCCAGGCCAGCCAGAGUGGGGAUGGUCUGUGAC 137 | >RF00049_rep.0_X52138.1/3930-4011 RF00049 138 | CAGCAGUCGAUCGUCAAAAUUUCUUCGGCCUCGAAAUUCACUCGUCGAAGAGUCAAAACCGAGCUUUUUAACACUGAGUCAG 139 | >RF00049_rep.1_X97587.1/1-68 RF00049 140 | 
UUGCCAAUGAUGGUUAAGAAUUUCUUCACCGUAAUAAACCAUGUGGUCAGCAUUGCAUCUGAGGCAAA 141 | >RF00049_rep.2_X61923.1/2094-2169 RF00049 142 | UAGCAGUCGAUGUCAAAAUUUCUUGGCUCGAAAUUACUGUGAAGAGUAAAAUCGAGCUUUUUAAGAGUGAGUCAGC 143 | >RF00618_rep.0_AL135914.25/92223-92098 RF00618 144 | ACCAUCCUUUUCUUGGGGUUGCACUACUGUCCAAUGGGUACCUAGUGAGGGCAGUACUGCUAACUCCUGCACAACACACCGAAAUCAACUAGAGCUUUGCUUUGCCUUGGUGCAGUUUUUGGAGAA 145 | >RF00618_rep.1_AL161445.10/77816-77941 RF00618 146 | ACCAUCCUUUUCUUGGGGUUGCACUACUGUCCAAAGAGCAUGUAGUGAGGGCAGUACUGCUAACGUCUACACAACACACCCACCUCAACUAGAGCUUUGCUUUAGCUUGGUGUAAUUUUUGGAAAA 147 | >RF00618_rep.2_U62822.1/2-128 RF00618 148 | ACCAUCCUUUUCUUGGGGUUGCGCUACUGUCCAAUGAGCGCAUAGUGAGGGCAGUACUGCUAACGCCUGAACAACACACCCGCAUCAACUAGAGCUUUUGCUUUAUUUUGGUGCAAUUUUUGGAAAA 149 | >RF00618_rep.3_AL389925.10/20736-20611 RF00618 150 | ACCAUCCUUUUCUUGGGGUUGCACUACUGUCUAAUGAGUGCAUAAUGAGGGCAGUAUUGCUAACGCCUAUACAAUGCACCUGCAUCAACUAGAACUUUGCUUUACCUUGGUACAAUUUUUGGAAAA 151 | >RF00641_rep.0_AL132709.5/95865-95787 RF00641 152 | GGUACUUGGAGAGAGGUGGUCCGUGGCGCGUUCGCUUUAUUUAUGGCGCACAUUACACGGUCGACCUCUUUGCAGUAUC 153 | >RF00641_rep.10_AADD01141100.1/2839-2915 RF00641 154 | GGUACUCGGGGAGAGGUUACCCGAGCAACUUUGCAUCUGGACGACGAAUGUUGCUCGGUGAACCCCUUUUCGGUAUC 155 | >RF00641_rep.11_AC208187.2/25743-25822 RF00641 156 | GGUAUUUGAAGAUGCGGUUGACCAUGGUGUGUACGCUUUAUUUGUGACGUAGGACACAUGGUCUACUUCUUCUCAAUAUC 157 | >RF00641_rep.12_AL132709.5/56006-55928 RF00641 158 | GGUACUUGAAGGGAGAUCGACCGUGUUAUAUUCGCUUUAUUGACUUCGAAUAAUACAUGGUUGAUCUUUUCUCAGUAUC 159 | >RF00641_rep.13_AL132709.5/55165-55087 RF00641 160 | GGUGCCUGAGGAGAGGUGGCCUGUGUUGCAUUCACAGAAAUCAUGACACACAAGACACGAGCGGCCUCUCUUCAGUAUC 161 | >RF00641_rep.1_AL132709.5/86350-86274 RF00641 162 | GGUACUGGAGGAGAGGUUAUCUGUGUUUUUUCCCUUUAUUUAUGAUGAAAAAUAUGGUGCACUUCUAUUUGAGAAUC 163 | >RF00641_rep.2_AC208187.2/29806-29886 RF00641 164 | GAUACUCGAAGGAGAGGUUGUCCGUGUUGUCUUCUCUUUAUUUAUGAUGAAACAUACACGGGAAACCUCUUUUUUAGUAUC 165 | >RF00641_rep.3_AADD01141098.1/2614-2692 RF00641 166 | 
GCUACUUGAAGAGAGGUAAUCCUUCAUGCAUUUGCUUUACUUGCAAUGAUUAUACAAGGGCAGACUCUCUCUGGGGAGC 167 | >RF00641_rep.4_ABBA01048514.1/248685-248607 RF00641 168 | GGUACUCGAAUGGAGGUUGUCCAUGGUGUGUUCAUUUUAUUUAUGAUGAGUAUUACAUGGCCAAUCUCCUUUCGGUACU 169 | >RF00641_rep.5_AL132709.5/73699-73621 RF00641 170 | GGUGCUUAAAGAAUGGCUGUCCGUAGUAUGGUCUCUAUAUUUAUGAUGAUUAAUAUCGGACAACCAUUGUUUUAGUAUC 171 | >RF00641_rep.6_AADB02234051.1/94-172 RF00641 172 | GGUACUUAAGGGGGAGGUGGGCUUUGAAUAAUAAGUUUAUUGACGUGGAAUAUACAAGGGCAAGCUCUCUGUGAGUAUC 173 | >RF00641_rep.7_AL132709.5/54880-54803 RF00641 174 | GGUACCUGAAAUAGGUUGCCUGUGAGGUGUUCACUUUCUAUAUGAUGAAUAUUAUACAGUCAACCUCUUUCCGAUAUC 175 | >RF00641_rep.8_AL132709.5/61554-61474 RF00641 176 | GGUACUGGUGGAGAGGUCUUCCAUGAUGCAUUCGAUUUAUUUUUUGACCAAUCCUACAUAGUGGACUCUUUUGAAAGCAUU 177 | >RF00641_rep.9_AL132709.5/98273-98194 RF00641 178 | GGUACUUGGAGAGAUAGUAGACCGUAUAGCGUACGCUUUAUCUGUGACGUAUGUAACACGGUCCACUAACCCUCAGUAUC 179 | >RF00693_rep.0_AADD01093451.1/1636-1571 RF00693 180 | AAUCUUAUGGAAACAUUUCUGCACACAAUAAAGAAACUAGUGUACAGAAAUGUUUACUUGUCUAUG 181 | >RF00693_rep.1_AL031669.28/23303-23371 RF00693 182 | GGUCUAGUGGAAACAUUUCUGCACAACCAGAAUACUCAAACUAGCAUGAGGAAAGGCUUCUGUAACAGU 183 | >RF00693_rep.2_ABBA01075050.1/41249-41181 RF00693 184 | UAUUUCGUGGAUGAUUUCUACACAGACUAGGCCAUAGAAACCAGUGCGUAGAAAUGCUUCUGUUACAUG 185 | -------------------------------------------------------------------------------- /kitematic.md: -------------------------------------------------------------------------------- 1 | ### Galaxy-GraphClust 2 | ## Step-by-step setup guide with Kitematic (Windows/MacOS): ## 3 | 4 | 0. Obtain and install Kitematic from https://kitematic.com/ 5 | 6 | 1. For detailed info you may check [Kitematic guide](https://docs.docker.com/kitematic/userguide/#docker-command-line-access) 7 | 8 | 1. Run kitematic, search for `graphclust` and click on `create` button 9 | 10 | 11 | 2. Wait for image to be downloaded. With the first time run this step may take few minutes. 12 | 13 | 14 | 3. 
The Galaxy instance starts loading; wait for the message `Binding and starting galaxy control worker for main` 15 | 16 | 17 | 4. Inside Kitematic, go to the `settings` tab, then `ports`. Configure Docker port `80` to bind to host port `8080`. Save the setting and click on the bound IP for port `8080`. 18 | 19 | 20 | 5. Start browsing the Galaxy HTML interface at `IP:8080` 21 | 22 | 23 | -------------------------------------------------------------------------------- /workflows/GraphClust-MotifFinder.ga: -------------------------------------------------------------------------------- 1 | {"uuid": "e949d240-cdc2-4a8a-b8ef-0747c693951f", "tags": [], "format-version": "0.1", "name": "GraphClust-MotifFinder", "version": 1, "steps": {"0": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [{"output_name": "output", "uuid": "b62a4276-5cc2-403d-a0c0-75b2ca8e0fc3", "label": null}], "input_connections": {}, "tool_state": "{\"name\": \"Input Dataset\"}", "id": 0, "uuid": "49e67ce3-c4b2-4aec-935f-9549c9c3b20f", "errors": null, "name": "Input dataset", "label": null, "inputs": [{"name": "Input Dataset", "description": ""}], "position": {"top": 342.8726043701172, "left": 170.84130859375}, "annotation": "", "content_id": null, "type": "data_input"}, "1": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_preprocessing/preproc/0.5", "tool_version": "0.5", "outputs": [{"type": "fasta", "name": "data.fasta"}, {"type": "txt", "name": "data.map"}, {"type": "txt", "name": "data.names"}, {"type": "fasta", "name": "data.fasta.scan"}, {"type": "zip", "name": "FASTA"}, {"type": "txt", "name": "shape_data_split"}, {"type": "stockholm", "name": "alignment_data_split"}], "workflow_outputs": [], "input_connections": {"fastaFile": {"output_name": "output", "id": 0}}, "tool_state": "{\"fastaFile\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"in_winShift\": \"\\\"50\\\"\", \"__page__\": null, \"__rerun_remap_job_id__\": null, \"min_seq_length\": \"\\\"5\\\"\", 
\"max_length\": \"\\\"100\\\"\", \"AlignmentData\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"SHAPEdata\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 1, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "2a5defc09381", "name": "graphclust_preprocessing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "a427a862-cbad-4ddf-b490-973d2615b3af", "errors": null, "name": "Preprocessing", "post_job_actions": {"HideDatasetActiondata.names": {"output_name": "data.names", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionalignment_data_split": {"output_name": "alignment_data_split", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionFASTA": {"output_name": "FASTA", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionshape_data_split": {"output_name": "shape_data_split", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiondata.fasta.scan": {"output_name": "data.fasta.scan", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiondata.map": {"output_name": "data.map", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiondata.fasta": {"output_name": "data.fasta", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "AlignmentData", "description": "runtime parameter for tool Preprocessing"}, {"name": "SHAPEdata", "description": "runtime parameter for tool Preprocessing"}], "position": {"top": 428.94232177734375, "left": 299.8557434082031}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_preprocessing/preproc/0.5", "type": "tool"}, "2": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/viennarna_rnafold/viennarna_rnafold/2.2.10.4", "tool_version": "2.2.10.4", "outputs": [{"type": "input", "name": "structure_outputs"}, {"type": "input", "name": "matrix_outputs"}, {"type": "dbn", "name": "dot_bracket_stdout"}], 
"workflow_outputs": [{"output_name": "dot_bracket_stdout", "uuid": "3b78174a-f8a5-4b09-b2f5-f1c799fedfff", "label": null}], "input_connections": {"input_source|fasta_input": {"output_name": "data.fasta", "id": 1}}, "tool_state": "{\"__page__\": null, \"mea\": \"\\\"false\\\"\", \"temperature\": \"\\\"37.0\\\"\", \"dangling\": \"\\\"2\\\"\", \"__rerun_remap_job_id__\": null, \"IDs\": \"{\\\"auto_id\\\": \\\"false\\\", \\\"id_digits\\\": \\\"4\\\", \\\"id_prefix\\\": \\\"sequence\\\", \\\"id_start\\\": \\\"1\\\"}\", \"pf\": \"\\\"false\\\"\", \"meagamma\": \"\\\"\\\"\", \"input_source\": \"{\\\"__current_case__\\\": 0, \\\"fasta_input\\\": {\\\"__class__\\\": \\\"ConnectedValue\\\"}, \\\"select_fasta\\\": \\\"true\\\"}\", \"layout_type\": \"\\\"1\\\"\", \"advancedOptions\": \"{\\\"betaScale\\\": \\\"1.0\\\", \\\"bppmThreshold\\\": \\\"1e-05\\\", \\\"circular\\\": \\\"false\\\", \\\"gquad\\\": \\\"false\\\", \\\"noclosinggu\\\": \\\"true\\\", \\\"noconversion\\\": \\\"true\\\", \\\"nogu\\\": \\\"true\\\", \\\"nolp\\\": \\\"true\\\", \\\"nops\\\": \\\"true\\\", \\\"notetra\\\": \\\"true\\\", \\\"nsp\\\": \\\"\\\"}\", \"constraints\": \"{\\\"constraintLocation\\\": {\\\"__current_case__\\\": 0, \\\"constraintSelector\\\": \\\"none\\\"}, \\\"maxBPspan\\\": \\\"-1\\\", \\\"motif\\\": \\\"\\\", \\\"shapeOption\\\": {\\\"__current_case__\\\": 1, \\\"shapeSelector\\\": \\\"notUsed\\\"}}\", \"pfscale\": \"\\\"1.07\\\"\"}", "id": 2, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "bdb786715d28", "name": "viennarna_rnafold", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "0d0f77b8-533e-4034-adc5-9c3ade2fae13", "errors": null, "name": "RNAfold", "post_job_actions": {"HideDatasetActionstructure_outputs": {"output_name": "structure_outputs", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionmatrix_outputs": {"output_name": "matrix_outputs", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [], 
"position": {"top": 149.84375, "left": 328.90625}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/viennarna_rnafold/viennarna_rnafold/2.2.10.4", "type": "tool"}, "3": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/structure_to_gspan/structure_to_gspan/0.4", "tool_version": "0.4", "outputs": [{"type": "input", "name": "gspan_compressed"}], "workflow_outputs": [{"output_name": "gspan_compressed", "uuid": "012697c5-bdaf-4884-ad37-32f8493c7af1", "label": null}], "input_connections": {"dataFile": {"output_name": "dot_bracket_stdout", "id": 2}}, "tool_state": "{\"__page__\": null, \"dataFile\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"group\": \"\\\"10000\\\"\", \"inputFormat\": \"\\\"vrna-simple\\\"\", \"__rerun_remap_job_id__\": null, \"structureType\": \"\\\"MFE\\\"\", \"abstr\": \"\\\"false\\\"\", \"stack\": \"\\\"true\\\"\", \"seq_graph_t\": \"\\\"true\\\"\"}", "id": 3, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "d34ab3aa1724", "name": "structure_to_gspan", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "159603bb-48f5-4e32-ac7f-cb8cdd0a2877", "errors": null, "name": "Structure to GSPAN", "post_job_actions": {}, "label": null, "inputs": [], "position": {"top": 203.88223266601562, "left": 555.9134521484375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/structure_to_gspan/structure_to_gspan/0.4", "type": "tool"}, "4": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/nspdk_sparse/9.2.3", "tool_version": "9.2.3", "outputs": [{"type": "zip", "name": "data_svector"}], "workflow_outputs": [{"output_name": "data_svector", "uuid": "bf672ab7-fce4-4405-ae44-e206b9bf6b9e", "label": null}], "input_connections": {"data_fasta": {"output_name": "data.fasta", "id": 1}, "gspan_file": {"output_name": "gspan_compressed", "id": 3}}, "tool_state": "{\"max_dist_relations\": \"\\\"3\\\"\", \"__page__\": 0, \"__rerun_remap_job_id__\": null, \"gspan_file\": \"{\\\"__class__\\\": 
\\\"ConnectedValue\\\"}\", \"data_fasta\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"max_rad\": \"\\\"3\\\"\"}", "id": 4, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "1b142e88d068", "name": "graphclust_nspdk", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "1025f17d-4712-4365-a816-7c8dfbee06b7", "errors": null, "name": "NSPDK_sparseVect", "post_job_actions": {}, "label": null, "inputs": [], "position": {"top": 304.3509826660156, "left": 759.3870146274567}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/nspdk_sparse/9.2.3", "type": "tool"}, "5": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/NSPDK_candidateClust/9.2.3.1", "tool_version": "9.2.3.1", "outputs": [{"type": "txt", "name": "fast_cluster"}, {"type": "txt", "name": "fast_cluster_sim"}, {"type": "txt", "name": "black_list"}, {"type": "txt", "name": "fast_cluster_m"}, {"type": "txt", "name": "fast_cluster_sim_m"}, {"type": "txt", "name": "black_list_m"}], "workflow_outputs": [], "input_connections": {"data_names": {"output_name": "data.names", "id": 1}, "data_fasta": {"output_name": "data.fasta", "id": 1}, "data_svector": {"output_name": "data_svector", "id": 4}}, "tool_state": "{\"knn\": \"\\\"10\\\"\", \"max_dist_relations\": \"\\\"3\\\"\", \"nhf\": \"\\\"500\\\"\", \"noCache\": \"\\\"true\\\"\", \"__page__\": null, \"usn\": \"\\\"true\\\"\", \"__rerun_remap_job_id__\": null, \"oc\": \"\\\"true\\\"\", \"iteration_num\": \"{\\\"CI\\\": \\\"1\\\", \\\"__current_case__\\\": 1, \\\"iteration_num_selector\\\": \\\"false\\\"}\", \"ensf\": \"\\\"5\\\"\", \"data_fasta\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"max_rad\": \"\\\"3\\\"\", \"data_svector\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"nspdk_nhf_max\": \"\\\"1000\\\"\", \"nspdk_nhf_step\": \"\\\"25\\\"\", \"GLOBAL_num_clusters\": \"\\\"100\\\"\", \"data_names\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\"}", "id": 5, "tool_shed_repository": 
{"owner": "rnateam", "changeset_revision": "1b142e88d068", "name": "graphclust_nspdk", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "ad34b03e-4ef1-4d30-ba35-66154533d5dc", "errors": null, "name": "NSPDK_candidateClusters", "post_job_actions": {"HideDatasetActionfast_cluster_m": {"output_name": "fast_cluster_m", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfast_cluster_sim_m": {"output_name": "fast_cluster_sim_m", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfast_cluster": {"output_name": "fast_cluster", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionblack_list_m": {"output_name": "black_list_m", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionblack_list": {"output_name": "black_list", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfast_cluster_sim": {"output_name": "fast_cluster_sim", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [], "position": {"top": 597.8966541290283, "left": 769.8677835464478}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/NSPDK_candidateClust/9.2.3.1", "type": "tool"}, "6": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_prepocessing_for_mlocarna/preMloc/0.4", "tool_version": "0.4", "outputs": [{"type": "input", "name": "centers"}, {"type": "input", "name": "trees"}, {"type": "input", "name": "cmFa"}, {"type": "input", "name": "model_tree_fa"}, {"type": "input", "name": "tree_matrix"}], "workflow_outputs": [{"output_name": "centers", "uuid": "66e8289b-68a9-4d9d-8980-93a843dc9c4a", "label": null}], "input_connections": {"fasta_data": {"output_name": "data.fasta", "id": 1}, "fast_cluster_sim": {"output_name": "fast_cluster_sim", "id": 5}, "data_map": {"output_name": "data.map", "id": 1}, "fast_cluster": {"output_name": "fast_cluster", "id": 5}}, "tool_state": "{\"knn\": \"\\\"10\\\"\", 
\"__page__\": null, \"CI\": \"\\\"1\\\"\", \"fasta_data\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"__rerun_remap_job_id__\": null, \"fast_cluster_sim\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"data_map\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"fast_cluster\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\"}", "id": 6, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "5c92a3f7b59c", "name": "graphclust_prepocessing_for_mlocarna", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "1a9702e1-b72b-453c-b275-631e6bf0370f", "errors": null, "name": "pgma_graphclust", "post_job_actions": {"HideDatasetActioncmFa": {"output_name": "cmFa", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiontrees": {"output_name": "trees", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiontree_matrix": {"output_name": "tree_matrix", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionmodel_tree_fa": {"output_name": "model_tree_fa", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [], "position": {"top": 495.9134826660156, "left": 1009.9399108886719}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_prepocessing_for_mlocarna/preMloc/0.4", "type": "tool"}, "7": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_mlocarna/locarna_best_subtree/0.4", "tool_version": "0.4", "outputs": [{"type": "stockholm", "name": "model_tree_stk"}], "workflow_outputs": [{"output_name": "model_tree_stk", "uuid": "c62b9620-1470-49a9-9a35-8a8453db0984", "label": null}], "input_connections": {"tree_file": {"output_name": "trees", "id": 6}, "data_map": {"output_name": "data.map", "id": 1}, "center_fa_file": {"output_name": "centers", "id": 6}, "tree_matrix": {"output_name": "tree_matrix", "id": 6}}, "tool_state": "{\"__page__\": 0, \"center_fa_file\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", 
\"__rerun_remap_job_id__\": null, \"free_endgaps\": \"\\\"0\\\"\", \"tree_matrix\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"allow_overlap\": \"\\\"false\\\"\", \"data_map\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"param_type\": \"{\\\"__current_case__\\\": 1, \\\"param_type_selector\\\": \\\"locarna\\\"}\", \"tree_file\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\"}", "id": 7, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "c45eef4517d9", "name": "graphclust_mlocarna", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "096dfbc3-bc7d-4cb7-8851-5b78fc207d8f", "errors": null, "name": "locarna_graphclust", "post_job_actions": {}, "label": null, "inputs": [], "position": {"top": 531.935115814209, "left": 1251.4302978515625}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_mlocarna/locarna_best_subtree/0.4", "type": "tool"}, "8": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_cmfinder/cmFinder/0.4", "tool_version": "0.4", "outputs": [{"type": "stockholm", "name": "model_cmfinder_stk"}], "workflow_outputs": [{"output_name": "model_cmfinder_stk", "uuid": "67b6d32a-9782-4eeb-9bf1-ddf6efa84f25", "label": null}], "input_connections": {"model_tree_stk": {"output_name": "model_tree_stk", "id": 7}, "cmfinder_fa": {"output_name": "cmFa", "id": 6}}, "tool_state": "{\"__page__\": null, \"__rerun_remap_job_id__\": null, \"cmfinder_fa\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"model_tree_stk\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"gap_threshold_opts\": \"{\\\"__current_case__\\\": 0, \\\"gap\\\": \\\"0.9\\\", \\\"gap_threshold_opts_selector\\\": \\\"--g\\\"}\"}", "id": 8, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "5a1dcc77b0ce", "name": "graphclust_cmfinder", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "9848aa85-cc60-4ec9-b503-020e4453d632", "errors": null, "name": "cmfinder", "post_job_actions": {}, "label": null, "inputs": [], "position": 
{"top": 799.9399261474609, "left": 1186.9110717773438}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_cmfinder/cmFinder/0.4", "type": "tool"}, "9": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmbuild/1.1.2.0", "tool_version": "1.1.2.0", "outputs": [{"type": "text", "name": "summary_outfile"}, {"type": "cm", "name": "cmfile_outfile"}, {"type": "stockholm", "name": "refined_multiple_alignment_output"}, {"type": "txt", "name": "hfile"}, {"type": "txt", "name": "sfile"}, {"type": "txt", "name": "qqfile"}, {"type": "txt", "name": "ffile"}, {"type": "txt", "name": "xfile"}], "workflow_outputs": [{"output_name": "cmfile_outfile", "uuid": "967e5ef6-585f-4fe1-b481-52b0d5371ba0", "label": null}], "input_connections": {"alignment_infile": {"output_name": "model_cmfinder_stk", "id": 8}}, "tool_state": "{\"__page__\": null, \"controlling_filter_p7_hmm\": \"{\\\"EgfN\\\": \\\"200\\\", \\\"ElfN\\\": \\\"200\\\", \\\"EmN\\\": \\\"200\\\", \\\"EvN\\\": \\\"200\\\", \\\"p7ere\\\": \\\"0.38\\\", \\\"p7ml\\\": \\\"false\\\"}\", \"noss\": \"\\\"false\\\"\", \"is_summery_output\": \"\\\"false\\\"\", \"__rerun_remap_job_id__\": null, \"effective_opts\": \"{\\\"__current_case__\\\": 0, \\\"effective_opts_selector\\\": \\\"--enone\\\"}\", \"alignment_infile\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"refining_opts\": \"{\\\"__current_case__\\\": 0, \\\"refining_opts_selector\\\": \\\"\\\"}\", \"Calibrate\": \"{\\\"L\\\": \\\"0.5\\\", \\\"__current_case__\\\": 1, \\\"add_opts\\\": {\\\"beta\\\": \\\"1e-15\\\", \\\"gc\\\": {\\\"__class__\\\": \\\"RuntimeValue\\\"}, \\\"nonbanded\\\": \\\"false\\\", \\\"nonull3\\\": \\\"false\\\", \\\"random\\\": \\\"false\\\", \\\"seed\\\": \\\"181\\\"}, \\\"cont_exp_tails_fits\\\": {\\\"__current_case__\\\": 0, \\\"gtailn\\\": \\\"250\\\", \\\"ltailn\\\": \\\"750\\\", \\\"selector\\\": \\\"top_n\\\"}, \\\"output_options_cond\\\": {\\\"__current_case__\\\": 1, \\\"selector\\\": 
\\\"none\\\"}, \\\"selector\\\": \\\"true\\\"}\", \"model_construction_opts\": \"{\\\"__current_case__\\\": 0, \\\"model_construction_opts_selector\\\": \\\"--fast\\\", \\\"symfrac\\\": \\\"0.5\\\"}\", \"relative_weights_opts\": \"{\\\"__current_case__\\\": 0, \\\"relative_weights_opts_selector\\\": \\\"--wpb\\\"}\"}", "id": 9, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "477d829d3250", "name": "infernal", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "16bd0fad-471d-44a4-955e-31afa854e45e", "errors": null, "name": "cmbuild", "post_job_actions": {"HideDatasetActionsummary_outfile": {"output_name": "summary_outfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionrefined_multiple_alignment_output": {"output_name": "refined_multiple_alignment_output", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionffile": {"output_name": "ffile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionqqfile": {"output_name": "qqfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionxfile": {"output_name": "xfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionsfile": {"output_name": "sfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionhfile": {"output_name": "hfile", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [], "position": {"top": 801.9351196289062, "left": 1493.8942260742188}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmbuild/1.1.2.0", "type": "tool"}, "10": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmsearch/1.1.2.0", "tool_version": "1.1.2.0", "outputs": [{"type": "tabular", "name": "outfile"}, {"type": "tabular", "name": "multiple_alignment_output"}], "workflow_outputs": [{"output_name": "outfile", "uuid": "3a50686a-87a1-424c-b4bc-f519e4e19752", 
"label": null}], "input_connections": {"cm_opts|cmfile": {"output_name": "cmfile_outfile", "id": 9}, "seqdb": {"output_name": "data.fasta.scan", "id": 1}}, "tool_state": "{\"anytrunc\": \"\\\"false\\\"\", \"verbose\": \"\\\"false\\\"\", \"notrunc\": \"\\\"false\\\"\", \"smxsize\": \"\\\"128.0\\\"\", \"cm_opts\": \"{\\\"__current_case__\\\": 1, \\\"cm_opts_selector\\\": \\\"histdb\\\", \\\"cmfile\\\": {\\\"__class__\\\": \\\"ConnectedValue\\\"}}\", \"__page__\": null, \"bottomonly\": \"\\\"false\\\"\", \"seqdb\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"inclusion_thresholds_opts\": \"{\\\"__current_case__\\\": 0, \\\"inclusion_thresholds_selector\\\": \\\"\\\"}\", \"__rerun_remap_job_id__\": null, \"mxsize\": \"\\\"128.0\\\"\", \"A\": \"\\\"false\\\"\", \"acyk\": \"\\\"false\\\"\", \"acceleration_huristics\": \"{\\\"__current_case__\\\": 1, \\\"acceleration_huristics_selector\\\": \\\"--nohmm\\\"}\", \"reporting_thresholds_opts\": \"{\\\"__current_case__\\\": 0, \\\"reporting_thresholds_selector\\\": \\\"\\\"}\", \"cyk\": \"\\\"false\\\"\", \"Z\": \"\\\"\\\"\", \"model_thresholds\": \"{\\\"cut_ga\\\": \\\"false\\\", \\\"cut_nc\\\": \\\"false\\\", \\\"cut_tc\\\": \\\"false\\\"}\", \"g\": \"\\\"true\\\"\", \"nonull3\": \"\\\"false\\\"\", \"toponly\": \"\\\"true\\\"\", \"noali\": \"\\\"false\\\"\"}", "id": 10, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "477d829d3250", "name": "infernal", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "6bc118e7-7cd9-4f07-93b8-b3e6e372d8d5", "errors": null, "name": "cmsearch", "post_job_actions": {"HideDatasetActionmultiple_alignment_output": {"output_name": "multiple_alignment_output", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [], "position": {"top": 970.9495544433594, "left": 1182.4399108886719}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmsearch/1.1.2.0", "type": "tool"}, "11": {"tool_id": 
"toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_postprocessing/glob_report/0.5", "tool_version": "0.5", "outputs": [{"type": "input", "name": "clusters"}, {"type": "input", "name": "allFasta"}, {"type": "input", "name": "partitions"}, {"type": "input", "name": "topSecondaryStruct"}, {"type": "input", "name": "topDot"}, {"type": "input", "name": "rscapePlot"}, {"type": "txt", "name": "final_stats"}, {"type": "tabular", "name": "tableForEval"}, {"type": "txt", "name": "final_soft"}, {"type": "txt", "name": "final_used_cmsearch"}, {"type": "txt", "name": "evaluation"}, {"type": "txt", "name": "combined_cm_out"}, {"type": "zip", "name": "RESULTS_zip"}], "workflow_outputs": [{"output_name": "topDot", "uuid": "30b10812-3f64-4d4a-a1bd-81a6a25b22d3", "label": null}, {"output_name": "combined_cm_out", "uuid": "1ccb0cb0-7a5c-4da1-a9a5-fb12c69d9bd1", "label": null}, {"output_name": "final_stats", "uuid": "f95386a6-968e-42ac-a263-02b16ad91d60", "label": null}, {"output_name": "allFasta", "uuid": "68671773-caf5-4fe4-8de4-856b66be8982", "label": null}, {"output_name": "rscapePlot", "uuid": "6341baf6-2925-4327-b124-80700155d7e3", "label": null}, {"output_name": "topSecondaryStruct", "uuid": "54b61df8-2561-42e8-9186-8f99b0930cde", "label": null}, {"output_name": "clusters", "uuid": "6283a3a7-3167-469a-a69e-4a90ee2809c2", "label": null}, {"output_name": "evaluation", "uuid": "65fe556a-2b9b-4ed1-86e8-d839822fa1f2", "label": null}, {"output_name": "RESULTS_zip", "uuid": "11c8ce70-e90a-4c8f-b748-4613bca8b928", "label": null}], "input_connections": {"model_tree_files": {"output_name": "model_tree_fa", "id": 6}, "FASTA": {"output_name": "FASTA", "id": 1}, "cmsearch_results": {"output_name": "outfile", "id": 10}}, "tool_state": "{\"__page__\": null, \"model_tree_files\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"cm_max_eval\": \"\\\"0.01\\\"\", \"min_cluster_size\": \"\\\"5\\\"\", \"advanced_opts\": \"{\\\"__current_case__\\\": 1, \\\"advanced_opts_selector\\\": \\\"show\\\", 
\\\"param_type\\\": {\\\"__current_case__\\\": 1, \\\"param_type_selector\\\": \\\"locarna\\\"}}\", \"merge_cluster_ol\": \"\\\"0.66\\\"\", \"iteration_num\": \"{\\\"__current_case__\\\": 1, \\\"iteration_num_selector\\\": \\\"false\\\"}\", \"cm_min_bitscore\": \"\\\"20\\\"\", \"cut_type\": \"\\\"true\\\"\", \"results_top_num\": \"\\\"30\\\"\", \"merge_overlap\": \"\\\"0.51\\\"\", \"cm_bitscore_sig\": \"\\\"1\\\"\", \"__rerun_remap_job_id__\": null, \"FASTA\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"cmsearch_results\": \"{\\\"__class__\\\": \\\"ConnectedValue\\\"}\", \"cdhit\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"partition_type\": \"\\\"true\\\"\"}", "id": 11, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "f93c868203cc", "name": "graphclust_postprocessing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "5fabab28-ca99-4511-97c0-9df34ce59270", "errors": null, "name": "cluster_collection_report", "post_job_actions": {"HideDatasetActionpartitions": {"output_name": "partitions", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiontableForEval": {"output_name": "tableForEval", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfinal_used_cmsearch": {"output_name": "final_used_cmsearch", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfinal_soft": {"output_name": "final_soft", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "cdhit", "description": "runtime parameter for tool cluster_collection_report"}], "position": {"top": 945.9255065917969, "left": 1796.9110717773438}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_postprocessing/glob_report/0.5", "type": "tool"}}, "annotation": "", "a_galaxy_workflow": "true"} -------------------------------------------------------------------------------- /workflows/README.md: 
-------------------------------------------------------------------------------- 1 | **Workflow flavors** 2 | 3 | This directory contains alternative pre-configurations of GraphClust-2, provided as flavors tailored to different use-case scenarios. 4 | 5 | - Preconfigured flavors of the workflow 6 | - The *MotifFinder* workflow flavor targets the identification of a handful of local signals/motifs in the likely presence of noise and sequence context. 7 | - The pre-configured *main* workflows perform best for clustering and partitioning a set of RNA sequences with quasi-defined structure boundary signals (e.g. ncRNAs or data from genomic screens with tools such as CMfinder or RNAz). Depending on the size of the input and of the classes, up to 3 rounds of clustering are usually enough to identify the homologs. 8 | - For large datasets with thousands of sequences, further iterations of clustering can be helpful. For such cases, the *sub-workflow*-based flavors available under [extra-workflows/with-subworkflow/](./extra-workflows/with-subworkflow/) are recommended. 9 | - Auxiliary workflows 10 | - The [auxiliary workflows](./auxiliary-workflows/) provide alternative ways to cluster genomic data beyond the classic FASTA input. 11 | 12 | **Configuring the workflows** 13 | 14 | Please follow the interactive tour named `GraphClust workflow step by step`, available under `Help->Interactive Tours`, and also check the references. 15 | An intuitive tutorial highlighting the use-case scenarios and the few parameters that can be adapted to each scenario will be provided here soon.
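To compare flavors before importing them into Galaxy, it can help to list the steps of a `.ga` file. The following is a minimal sketch, not part of GraphClust-2: the helper names `summarize_workflow` and `load_workflow` are hypothetical, and the code assumes only the fields visible in the workflow files of this repository (`steps`, `id`, `name`, `tool_version`, `input_connections`). The list handling for connections is an assumption for exports that store several connections per input.

```python
import json


def summarize_workflow(workflow):
    """Return (step id, step name, tool version, upstream step ids) per step.

    Hypothetical helper; relies only on fields seen in the .ga files here.
    """
    rows = []
    for key in sorted(workflow["steps"], key=int):
        step = workflow["steps"][key]
        upstream = set()
        for conn in step.get("input_connections", {}).values():
            # Assumption: a connection is a dict, or a list of such dicts.
            conns = conn if isinstance(conn, list) else [conn]
            upstream.update(c["id"] for c in conns)
        rows.append(
            (step["id"], step["name"], step.get("tool_version"), sorted(upstream))
        )
    return rows


def load_workflow(path):
    """Read a .ga workflow file, which is plain JSON."""
    with open(path) as handle:
        return json.load(handle)


# Tiny in-memory example mirroring the first two steps of
# GraphClust-MotifFinder.ga (all other fields omitted).
example = {
    "steps": {
        "0": {
            "id": 0,
            "name": "Input dataset",
            "tool_version": None,
            "input_connections": {},
        },
        "1": {
            "id": 1,
            "name": "Preprocessing",
            "tool_version": "0.5",
            "input_connections": {
                "fastaFile": {"output_name": "output", "id": 0}
            },
        },
    }
}

for row in summarize_workflow(example):
    print(row)
```

For a file on disk, `summarize_workflow(load_workflow("GraphClust_main_1r.ga"))` would produce the same kind of listing, assuming the path is relative to this directory.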
16 | -------------------------------------------------------------------------------- /workflows/auxiliary-workflows/Cluster-conservation-filter.ga: -------------------------------------------------------------------------------- 1 | {"uuid": "70e25788-0869-4fa7-b524-d64047944551", "tags": [], "format-version": "0.1", "name": "cluster_conservation_filter_no-fasta", "version": 3, "steps": {"0": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [{"output_name": "output", "uuid": "461f5b89-7dd4-4b4e-b03a-4d950fa47b7d", "label": null}], "input_connections": {}, "tool_state": "{}", "id": 0, "uuid": "177d5ece-9582-4fd8-b4b8-20ae836ff4cb", "errors": null, "name": "Input dataset", "label": "cluster.bed", "inputs": [], "position": {"top": 200, "left": 200}, "annotation": "", "content_id": null, "type": "data_input"}, "1": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [{"output_name": "output", "uuid": "f507484f-c641-46a7-9a47-909c8084b4b3", "label": null}], "input_connections": {}, "tool_state": "{}", "id": 1, "uuid": "f19a9ea1-a1e2-48aa-9306-f4e7e3324880", "errors": null, "name": "Input dataset", "label": "phastcons.wig", "inputs": [], "position": {"top": 287, "left": 200}, "annotation": "", "content_id": null, "type": "data_input"}, "2": {"tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "tool_version": null, "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "output", "id": 0}}, "tool_state": "{\"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"option\": \"\\\"\\\"\", \"__page__\": null}", "id": 2, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "e0cec48a4695", "name": "bedtools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "758417cb-8ad2-4abf-a948-53abfad2bf3a", 
"errors": null, "name": "SortBED", "post_job_actions": {"HideDatasetActionoutput": {"output_name": "output", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool SortBED"}], "position": {"top": 200, "left": 444}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "type": "tool"}, "3": {"tool_id": "wiggle2simple1", "tool_version": null, "outputs": [{"type": "interval", "name": "out_file1"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "output", "id": 1}}, "tool_state": "{\"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"__page__\": null}", "id": 3, "uuid": "891a9165-ee93-497f-a3b5-d4f09286ae89", "errors": null, "name": "Wiggle-to-Interval", "post_job_actions": {"HideDatasetActionout_file1": {"output_name": "out_file1", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool Wiggle-to-Interval"}], "position": {"top": 334, "left": 444}, "annotation": "", "content_id": "wiggle2simple1", "type": "tool"}, "4": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/1.1.0", "tool_version": null, "outputs": [{"type": "input", "name": "outfile"}], "workflow_outputs": [], "input_connections": {"infile": {"output_name": "out_file1", "id": 3}}, "tool_state": "{\"__page__\": null, \"__rerun_remap_job_id__\": null, \"code\": \"\\\"BEGIN {OFS=\\\\\\\"\\\\\\\\t\\\\\\\"}\\\\n{print $1,$2,$3,\\\\\\\"N\\\\\\\",$5,$4}\\\"\", \"infile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 4, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "20344ce0c811", "name": "text_processing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": 
"9607a32e-7a36-46f4-81c1-4cb7aa944f17", "errors": null, "name": "Text reformatting", "post_job_actions": {"HideDatasetActionoutfile": {"output_name": "outfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "ChangeDatatypeActionoutfile": {"output_name": "outfile", "action_type": "ChangeDatatypeAction", "action_arguments": {"newtype": "bed"}}}, "label": null, "inputs": [{"name": "infile", "description": "runtime parameter for tool Text reformatting"}], "position": {"top": 200, "left": 707}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/1.1.0", "type": "tool"}, "5": {"tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "tool_version": null, "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "outfile", "id": 4}}, "tool_state": "{\"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"option\": \"\\\"\\\"\", \"__page__\": null}", "id": 5, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "e0cec48a4695", "name": "bedtools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "0699ec08-cd39-478d-a39c-b2e66a719d16", "errors": null, "name": "SortBED", "post_job_actions": {"HideDatasetActionoutput": {"output_name": "output", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool SortBED"}], "position": {"top": 200, "left": 964}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "type": "tool"}, "6": {"tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_map/2.19.0", "tool_version": null, "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [{"output_name": "output", "uuid": 
"e921a8ec-23d2-45b7-ba16-89884617acaa", "label": null}], "input_connections": {"inputB": {"output_name": "output", "id": 5}, "inputA": {"output_name": "output", "id": 2}}, "tool_state": "{\"__page__\": null, \"reciprocal\": \"\\\"false\\\"\", \"__rerun_remap_job_id__\": null, \"overlap\": \"\\\"1e-09\\\"\", \"inputB\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"header\": \"\\\"false\\\"\", \"inputA\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"genome\": \"{\\\"__current_case__\\\": 1, \\\"genome_choose\\\": \\\"false\\\"}\", \"operation\": \"\\\"median\\\"\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"col\": \"\\\"5\\\"\", \"strand\": \"\\\"\\\"\", \"split\": \"\\\"true\\\"\"}", "id": 6, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "b8348686a0b9", "name": "bedtools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "7d6b12d2-52fd-490e-a598-96e3749da237", "errors": null, "name": "MapBed", "post_job_actions": {}, "label": null, "inputs": [{"name": "inputB", "description": "runtime parameter for tool MapBed"}, {"name": "inputA", "description": "runtime parameter for tool MapBed"}], "position": {"top": 200, "left": 1220}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_map/2.19.0", "type": "tool"}, "7": {"tool_id": "Filter1", "tool_version": null, "outputs": [{"type": "input", "name": "out_file1"}], "workflow_outputs": [{"output_name": "out_file1", "uuid": "a8cc97a0-8eef-4d24-bf18-7fea6a8b5b10", "label": null}], "input_connections": {"input": {"output_name": "output", "id": 6}}, "tool_state": "{\"__page__\": null, \"__rerun_remap_job_id__\": null, \"cond\": \"\\\"c5>=0.5\\\"\", \"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"header_lines\": \"\\\"0\\\"\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\"}", "id": 7, "uuid": "5ad20907-26ea-477e-a934-f222ae91f2fa", "errors": null, "name": "Filter", 
"post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool Filter"}], "position": {"top": 200, "left": 1464}, "annotation": "", "content_id": "Filter1", "type": "tool"}}, "annotation": "", "a_galaxy_workflow": "true"} -------------------------------------------------------------------------------- /workflows/auxiliary-workflows/Cluster-conservation-filter_and_align.ga: -------------------------------------------------------------------------------- 1 | {"uuid": "cc4f6817-87c6-4197-9a11-87f10de5d645", "tags": [], "format-version": "0.1", "name": "cluster_conservation_filter_and_align", "version": 1, "steps": {"0": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [], "input_connections": {}, "tool_state": "{}", "id": 0, "uuid": "36af6bd6-facd-4fd2-b251-8a9995adc2b2", "errors": null, "name": "Input dataset", "label": "cluster.bed", "inputs": [], "position": {"top": 200, "left": 200}, "annotation": "", "content_id": null, "type": "data_input"}, "1": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [], "input_connections": {}, "tool_state": "{}", "id": 1, "uuid": "4f251c2d-1c26-4617-87e1-368dfe74f404", "errors": null, "name": "Input dataset", "label": "phastcons.wig", "inputs": [], "position": {"top": 287, "left": 200}, "annotation": "", "content_id": null, "type": "data_input"}, "2": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [], "input_connections": {}, "tool_state": "{}", "id": 2, "uuid": "7cb186df-af0c-449d-842b-adf399f8539f", "errors": null, "name": "Input dataset", "label": "cluster.fa", "inputs": [], "position": {"top": 374, "left": 200}, "annotation": "", "content_id": null, "type": "data_input"}, "3": {"tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "tool_version": "2.26.0.0", "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [], "input_connections": {"input": 
{"output_name": "output", "id": 0}}, "tool_state": "{\"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"option\": \"\\\"\\\"\", \"__page__\": null}", "id": 3, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "18aeac3cd1db", "name": "bedtools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "cceee62d-bca4-4d31-85ca-258a2ff44316", "errors": null, "name": "SortBED", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool SortBED"}], "position": {"top": 200, "left": 444}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "type": "tool"}, "4": {"tool_id": "wiggle2simple1", "tool_version": "1.0.0", "outputs": [{"type": "interval", "name": "out_file1"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "output", "id": 1}}, "tool_state": "{\"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"__page__\": null}", "id": 4, "uuid": "2a927687-b452-4de8-b1da-a20926c9797e", "errors": null, "name": "Wiggle-to-Interval", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool Wiggle-to-Interval"}], "position": {"top": 334, "left": 444}, "annotation": "", "content_id": "wiggle2simple1", "type": "tool"}, "5": {"tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/fasta_to_tabular/fasta2tab/1.1.0", "tool_version": "1.1.0", "outputs": [{"type": "tabular", "name": "output"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "output", "id": 2}}, "tool_state": "{\"__page__\": null, \"keep_first\": \"\\\"0\\\"\", \"descr_columns\": \"\\\"2\\\"\", \"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", 
\"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/?.len\\\"\", \"__rerun_remap_job_id__\": null}", "id": 5, "tool_shed_repository": {"owner": "devteam", "changeset_revision": "7e801ab2b70e", "name": "fasta_to_tabular", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "19ad5a05-d95b-497d-8837-857a488c0673", "errors": null, "name": "FASTA-to-Tabular", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool FASTA-to-Tabular"}], "position": {"top": 450, "left": 444}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/devteam/fasta_to_tabular/fasta2tab/1.1.0", "type": "tool"}, "6": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/1.1.0", "tool_version": "1.1.0", "outputs": [{"type": "input", "name": "outfile"}], "workflow_outputs": [], "input_connections": {"infile": {"output_name": "out_file1", "id": 4}}, "tool_state": "{\"__page__\": null, \"__rerun_remap_job_id__\": null, \"code\": \"\\\"BEGIN {OFS=\\\\\\\"\\\\\\\\t\\\\\\\"}\\\\n{print $1,$2,$3,\\\\\\\"N\\\\\\\",$5,$4}\\\"\", \"infile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 6, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "e39fceb6ab85", "name": "text_processing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "381483c5-6ff5-426b-a633-be9160797c13", "errors": null, "name": "Text reformatting", "post_job_actions": {"ChangeDatatypeActionoutfile": {"output_name": "outfile", "action_type": "ChangeDatatypeAction", "action_arguments": {"newtype": "bed"}}}, "label": null, "inputs": [{"name": "infile", "description": "runtime parameter for tool Text reformatting"}], "position": {"top": 200, "left": 707}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/1.1.0", "type": "tool"}, "7": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bjoern-gruening/sed_wrapper/sed_stream_editor/0.0.1", "tool_version": "0.0.1", "outputs": 
[{"type": "input", "name": "outfile"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "output", "id": 5}}, "tool_state": "{\"__page__\": null, \"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/?.len\\\"\", \"pattern\": \"\\\"s/ /\\\\\\\\t/g\\\"\"}", "id": 7, "tool_shed_repository": {"owner": "bjoern-gruening", "changeset_revision": "12ac67b5c81d", "name": "sed_wrapper", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "a907115f-6d7e-4b2f-9790-34ea7369daeb", "errors": null, "name": "Manipulation", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool Manipulation"}], "position": {"top": 316, "left": 707}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bjoern-gruening/sed_wrapper/sed_stream_editor/0.0.1", "type": "tool"}, "8": {"tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "tool_version": "2.26.0.0", "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "outfile", "id": 6}}, "tool_state": "{\"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"option\": \"\\\"\\\"\", \"__page__\": null}", "id": 8, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "18aeac3cd1db", "name": "bedtools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "ce8abd16-b051-487b-b7fc-1f4a161df990", "errors": null, "name": "SortBED", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool SortBED"}], "position": {"top": 200, "left": 964}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_sortbed/2.26.0.0", "type": "tool"}, "9": {"tool_id": 
"toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_map/2.19.0", "tool_version": "2.19.0", "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [], "input_connections": {"inputB": {"output_name": "output", "id": 8}, "inputA": {"output_name": "output", "id": 3}}, "tool_state": "{\"__page__\": null, \"reciprocal\": \"\\\"false\\\"\", \"__rerun_remap_job_id__\": null, \"overlap\": \"\\\"1e-09\\\"\", \"inputB\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"header\": \"\\\"false\\\"\", \"inputA\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"genome\": \"{\\\"__current_case__\\\": 1, \\\"genome_choose\\\": \\\"false\\\"}\", \"operation\": \"\\\"median\\\"\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", \"col\": \"\\\"5\\\"\", \"strand\": \"\\\"\\\"\", \"split\": \"\\\"true\\\"\"}", "id": 9, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "b8348686a0b9", "name": "bedtools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "8b2e7700-cfa2-43e4-ac43-f7fa8fa03fbc", "errors": null, "name": "MapBed", "post_job_actions": {}, "label": null, "inputs": [{"name": "inputB", "description": "runtime parameter for tool MapBed"}, {"name": "inputA", "description": "runtime parameter for tool MapBed"}], "position": {"top": 200, "left": 1220}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_map/2.19.0", "type": "tool"}, "10": {"tool_id": "Filter1", "tool_version": "1.1.0", "outputs": [{"type": "input", "name": "out_file1"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "output", "id": 9}}, "tool_state": "{\"__page__\": null, \"__rerun_remap_job_id__\": null, \"cond\": \"\\\"c5>=0.5\\\"\", \"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"header_lines\": \"\\\"0\\\"\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\"}", "id": 10, "uuid": "c31933be-126b-4eb4-a1a1-f202f72a1041", "errors": 
null, "name": "Filter", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool Filter"}], "position": {"top": 200, "left": 1464}, "annotation": "", "content_id": "Filter1", "type": "tool"}, "11": {"tool_id": "comp1", "tool_version": "1.0.2", "outputs": [{"type": "input", "name": "out_file1"}], "workflow_outputs": [], "input_connections": {"input2": {"output_name": "out_file1", "id": 10}, "input1": {"output_name": "outfile", "id": 7}}, "tool_state": "{\"input2\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__page__\": null, \"input1\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"field2\": \"\\\"4\\\"\", \"__rerun_remap_job_id__\": null, \"field1\": \"\\\"15\\\"\", \"mode\": \"\\\"N\\\"\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\"}", "id": 11, "uuid": "e1e9c804-f9dd-44e2-b8ab-e481532c6a77", "errors": null, "name": "Compare two Datasets", "post_job_actions": {}, "label": null, "inputs": [{"name": "input2", "description": "runtime parameter for tool Compare two Datasets"}, {"name": "input1", "description": "runtime parameter for tool Compare two Datasets"}], "position": {"top": 200, "left": 1708}, "annotation": "", "content_id": "comp1", "type": "tool"}, "12": {"tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/tabular_to_fasta/tab2fasta/1.1.0", "tool_version": "1.1.0", "outputs": [{"type": "fasta", "name": "output"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "out_file1", "id": 11}}, "tool_state": "{\"title_col\": \"[\\\"15\\\"]\", \"__page__\": null, \"seq_col\": \"\\\"16\\\"\", \"__rerun_remap_job_id__\": null, \"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\"}", "id": 12, "tool_shed_repository": {"owner": "devteam", "changeset_revision": "0b4e36026794", "name": "tabular_to_fasta", "tool_shed": "toolshed.g2.bx.psu.edu"}, 
"uuid": "755f45ab-8333-4d18-9c17-2dbf955b9044", "errors": null, "name": "Tabular-to-FASTA", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool Tabular-to-FASTA"}], "position": {"top": 200, "left": 1998}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/devteam/tabular_to_fasta/tab2fasta/1.1.0", "type": "tool"}, "13": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/mlocarna/mlocarna/1.8.12.0", "tool_version": "1.8.12.0", "outputs": [{"type": "txt", "name": "stdout"}, {"type": "clustal", "name": "clustal"}, {"type": "stockholm", "name": "stockholm"}], "workflow_outputs": [], "input_connections": {"input_data": {"output_name": "output", "id": 12}}, "tool_state": "{\"Scoring\": \"{\\\"indel\\\": \\\"-350\\\", \\\"indel_opening\\\": \\\"-500\\\", \\\"sequence_score\\\": {\\\"__current_case__\\\": 0, \\\"sequence_score_selector\\\": \\\"ribofit\\\"}, \\\"struct_weight\\\": \\\"200\\\", \\\"tau\\\": \\\"50\\\"}\", \"Heuristics\": \"{\\\"alifold_consensus_dp\\\": \\\"false\\\", \\\"max_bps_length_ratio\\\": \\\"0.0\\\", \\\"max_diff\\\": \\\"60\\\", \\\"max_diff_am\\\": \\\"30\\\", \\\"max_diff_at_am\\\": \\\"-1\\\", \\\"min_prob\\\": \\\"0.0005\\\"}\", \"Folding\": \"{\\\"plfold_span\\\": \\\"150\\\", \\\"plfold_winsize\\\": \\\"300\\\", \\\"rnafold_temperature\\\": \\\"37.0\\\"}\", \"stdout_verbosity\": \"\\\"--quiet\\\"\", \"__page__\": null, \"alignment_mode\": \"{\\\"__current_case__\\\": 0, \\\"alignment_mode_selector\\\": \\\"global_locarna\\\", \\\"free_endgaps\\\": \\\"\\\"}\", \"__rerun_remap_job_id__\": null, \"input_data\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"Other\": \"{\\\"lonely_pairs\\\": \\\"false\\\"}\", \"outputs\": \"[\\\"clustal\\\"]\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\"}", "id": 13, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "59055c49a112", "name": "mlocarna", "tool_shed": 
"toolshed.g2.bx.psu.edu"}, "uuid": "364e0d01-f484-4d80-97b5-b370760e8316", "errors": null, "name": "LocARNA", "post_job_actions": {}, "label": null, "inputs": [{"name": "input_data", "description": "runtime parameter for tool LocARNA"}], "position": {"top": 200, "left": 2254}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/mlocarna/mlocarna/1.8.12.0", "type": "tool"}, "14": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/viennarna_rnaalifold/viennarna_rnaalifold/2.2.10.0", "tool_version": "2.2.10.0", "outputs": [{"type": "txt", "name": "tabularFile"}, {"type": "tar", "name": "imagesFile"}], "workflow_outputs": [], "input_connections": {"input": {"output_name": "clustal", "id": 13}}, "tool_state": "{\"__page__\": null, \"model_options\": \"{\\\"betaScale\\\": \\\"1.0\\\", \\\"cfactor\\\": \\\"1.0\\\", \\\"dangling\\\": \\\"2\\\", \\\"endgaps\\\": \\\"false\\\", \\\"nfactor\\\": \\\"1.0\\\", \\\"noclosinggu\\\": \\\"true\\\", \\\"nogu\\\": \\\"true\\\", \\\"nolp\\\": \\\"true\\\", \\\"notetra\\\": \\\"true\\\", \\\"nsp\\\": \\\"\\\", \\\"ribosum\\\": \\\"false\\\", \\\"temperature\\\": \\\"37.0\\\"}\", \"general_options\": \"{\\\"alignment\\\": \\\"true\\\", \\\"color\\\": \\\"true\\\", \\\"layout_type\\\": \\\"1\\\", \\\"noPS\\\": \\\"true\\\", \\\"verbose\\\": \\\"false\\\"}\", \"algorithm_options\": \"{\\\"bppmThreshold\\\": \\\"1e-06\\\", \\\"circular\\\": \\\"false\\\", \\\"gquad\\\": \\\"false\\\", \\\"mea\\\": \\\"1.0\\\", \\\"mis\\\": \\\"false\\\", \\\"pf\\\": \\\"-1\\\", \\\"pfScale\\\": \\\"1.07\\\", \\\"sci\\\": \\\"false\\\", \\\"stochBT_en\\\": \\\"1\\\"}\", \"__rerun_remap_job_id__\": null, \"IDs\": \"{\\\"auto_id\\\": \\\"false\\\", \\\"continuous_ids\\\": \\\"false\\\", \\\"id_digits\\\": \\\"4\\\", \\\"id_prefix\\\": \\\"alignment\\\", \\\"id_start\\\": \\\"1\\\"}\", \"input\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg18.len\\\"\", 
\"constraints\": \"{\\\"constraintLocation\\\": {\\\"__current_case__\\\": 0, \\\"constraintSelector\\\": \\\"none\\\"}, \\\"maxBPspan\\\": \\\"-1\\\", \\\"shapeOption\\\": {\\\"__current_case__\\\": 1, \\\"shapeSelector\\\": \\\"notUsed\\\"}}\"}", "id": 14, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "3bc9bd5290c1", "name": "viennarna_rnaalifold", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "802e4682-9322-4464-abe5-44e00a24cd54", "errors": null, "name": "RNAalifold", "post_job_actions": {}, "label": null, "inputs": [{"name": "input", "description": "runtime parameter for tool RNAalifold"}], "position": {"top": 200, "left": 2507}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/viennarna_rnaalifold/viennarna_rnaalifold/2.2.10.0", "type": "tool"}}, "annotation": "", "a_galaxy_workflow": "true"} -------------------------------------------------------------------------------- /workflows/auxiliary-workflows/Galaxy-Workflow-compute-SP-reactivity.ga: -------------------------------------------------------------------------------- 1 | {"uuid": "60d3cc87-f821-4308-9090-836b47afaf87", "tags": [], "format-version": "0.1", "name": "Compute-SP-reactivity", "version": 1, "steps": {"0": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [], "input_connections": {}, "tool_state": "{\"name\": \"ftp://ftp.ensemblgenomes.org/pub/plants/current/fasta/arabidopsis_thaliana/ncrna/Arabidopsis_thaliana.TAIR10.ncrna.fa\"}", "id": 0, "uuid": "e8af18e9-9d99-4820-89f0-ad20c0587fac", "errors": null, "name": "Input dataset", "label": "ftp://ftp.ensemblgenomes.org/pub/plants/current/fasta/arabidopsis_thaliana/ncrna/Arabidopsis_thaliana.TAIR10.ncrna.fa", "inputs": [{"name": "ftp://ftp.ensemblgenomes.org/pub/plants/current/fasta/arabidopsis_thaliana/ncrna/Arabidopsis_thaliana.TAIR10.ncrna.fa", "description": ""}], "position": {"top": 100, "left": 116.5}, "annotation": "", "content_id": null, "type": "data_input"}, "1": 
{"tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fastq_dump/2.8.1.3", "tool_version": "2.8.1.3", "outputs": [{"type": "input", "name": "list_paired"}, {"type": "input", "name": "output_collection"}, {"type": "fastqsanger", "name": "output_accession"}, {"type": "fastqsanger", "name": "output_file"}], "workflow_outputs": [], "input_connections": {}, "tool_state": "{\"adv\": \"{\\\"alignments\\\": \\\"both\\\", \\\"clip\\\": \\\"false\\\", \\\"matepairDist\\\": \\\"\\\", \\\"maxID\\\": \\\"\\\", \\\"minID\\\": \\\"\\\", \\\"minlen\\\": \\\"\\\", \\\"readfilter\\\": \\\"\\\", \\\"region\\\": \\\"\\\", \\\"skip_technical\\\": \\\"false\\\", \\\"split\\\": \\\"true\\\", \\\"spotgroups\\\": \\\"\\\"}\", \"__page__\": null, \"outputformat\": \"\\\"fastqsanger.gz\\\"\", \"__rerun_remap_job_id__\": null, \"input\": \"{\\\"__current_case__\\\": 0, \\\"accession\\\": \\\"SRR933552\\\", \\\"input_select\\\": \\\"accession_number\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/?.len\\\"\"}", "id": 1, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "5e6237d58b0c", "name": "sra_tools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "2401ac0c-1edb-4e4f-89c6-6d8253b5ac71", "errors": null, "name": "Download and Extract Reads in FASTA/Q", "post_job_actions": {}, "label": null, "inputs": [], "position": {"top": 197.5, "left": 116.5}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fastq_dump/2.8.1.3", "type": "tool"}, "2": {"tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fastq_dump/2.8.1.3", "tool_version": "2.8.1.3", "outputs": [{"type": "input", "name": "list_paired"}, {"type": "input", "name": "output_collection"}, {"type": "fastqsanger", "name": "output_accession"}, {"type": "fastqsanger", "name": "output_file"}], "workflow_outputs": [], "input_connections": {}, "tool_state": "{\"adv\": \"{\\\"alignments\\\": \\\"both\\\", \\\"clip\\\": \\\"false\\\", \\\"matepairDist\\\": \\\"\\\", 
\\\"maxID\\\": \\\"\\\", \\\"minID\\\": \\\"\\\", \\\"minlen\\\": \\\"\\\", \\\"readfilter\\\": \\\"\\\", \\\"region\\\": \\\"\\\", \\\"skip_technical\\\": \\\"false\\\", \\\"split\\\": \\\"true\\\", \\\"spotgroups\\\": \\\"\\\"}\", \"__page__\": null, \"outputformat\": \"\\\"fastqsanger.gz\\\"\", \"__rerun_remap_job_id__\": null, \"input\": \"{\\\"__current_case__\\\": 0, \\\"accession\\\": \\\"SRR933551\\\", \\\"input_select\\\": \\\"accession_number\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/?.len\\\"\"}", "id": 2, "tool_shed_repository": {"owner": "iuc", "changeset_revision": "5e6237d58b0c", "name": "sra_tools", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "531a65e9-ab75-4a62-8f0b-7f85de776ef4", "errors": null, "name": "Download and Extract Reads in FASTA/Q", "post_job_actions": {}, "label": null, "inputs": [], "position": {"top": 310, "left": 116.5}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fastq_dump/2.8.1.3", "type": "tool"}, "3": {"tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.3.2.2", "tool_version": "2.3.2.2", "outputs": [{"type": "fastqsanger", "name": "output_unaligned_reads_l"}, {"type": "fastqsanger", "name": "output_aligned_reads_l"}, {"type": "fastqsanger", "name": "output_aligned_reads_r"}, {"type": "fastqsanger", "name": "output_unaligned_reads_r"}, {"type": "bam", "name": "output"}, {"type": "sam", "name": "output_sam"}, {"type": "txt", "name": "mapping_stats"}], "workflow_outputs": [], "input_connections": {"library|input_1": {"output_name": "output_accession", "id": 1}, "reference_genome|own_file": {"output_name": "output", "id": 0}}, "tool_state": "{\"__page__\": null, \"__rerun_remap_job_id__\": null, \"library\": \"{\\\"__current_case__\\\": 0, \\\"aligned_file\\\": \\\"false\\\", \\\"input_1\\\": {\\\"__class__\\\": \\\"RuntimeValue\\\"}, \\\"type\\\": \\\"single\\\", \\\"unaligned_file\\\": \\\"false\\\"}\", \"reference_genome\": 
\"{\\\"__current_case__\\\": 1, \\\"own_file\\\": {\\\"__class__\\\": \\\"RuntimeValue\\\"}, \\\"source\\\": \\\"history\\\"}\", \"rg\": \"{\\\"__current_case__\\\": 3, \\\"rg_selector\\\": \\\"do_not_set\\\"}\", \"save_mapping_stats\": \"\\\"false\\\"\", \"analysis_type\": \"{\\\"__current_case__\\\": 1, \\\"alignment_options\\\": {\\\"L\\\": \\\"22\\\", \\\"N\\\": \\\"1\\\", \\\"__current_case__\\\": 0, \\\"align_mode\\\": {\\\"__current_case__\\\": 0, \\\"align_mode_selector\\\": \\\"end-to-end\\\", \\\"score_min_ete\\\": \\\"L,-0.6,-0.6\\\"}, \\\"alignment_options_selector\\\": \\\"yes\\\", \\\"dpad\\\": \\\"15\\\", \\\"gbar\\\": \\\"4\\\", \\\"i\\\": \\\"S,1,1.15\\\", \\\"ignore_quals\\\": \\\"false\\\", \\\"n_ceil\\\": \\\"L,0,0.15\\\", \\\"no_1mm_upfront\\\": \\\"false\\\", \\\"nofw\\\": \\\"false\\\", \\\"norc\\\": \\\"false\\\"}, \\\"analysis_type_selector\\\": \\\"full\\\", \\\"effort_options\\\": {\\\"__current_case__\\\": 1, \\\"effort_options_selector\\\": \\\"no\\\"}, \\\"input_options\\\": {\\\"__current_case__\\\": 0, \\\"input_options_selector\\\": \\\"yes\\\", \\\"int_quals\\\": \\\"false\\\", \\\"qupto\\\": \\\"100000000\\\", \\\"qv_encoding\\\": \\\"--phred33\\\", \\\"skip\\\": \\\"0\\\", \\\"solexa_quals\\\": \\\"false\\\", \\\"trim3\\\": \\\"0\\\", \\\"trim5\\\": \\\"3\\\"}, \\\"other_options\\\": {\\\"__current_case__\\\": 1, \\\"other_options_selector\\\": \\\"no\\\"}, \\\"reporting_options\\\": {\\\"__current_case__\\\": 0, \\\"reporting_options_selector\\\": \\\"no\\\"}, \\\"sam_opt\\\": \\\"false\\\", \\\"sam_options\\\": {\\\"__current_case__\\\": 1, \\\"sam_options_selector\\\": \\\"no\\\"}, \\\"scoring_options\\\": {\\\"__current_case__\\\": 1, \\\"scoring_options_selector\\\": \\\"no\\\"}}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/araTha1.len\\\"\"}", "id": 3, "tool_shed_repository": {"owner": "devteam", "changeset_revision": "66f992977578", "name": "bowtie2", "tool_shed": 
"toolshed.g2.bx.psu.edu"}, "uuid": "5c68635c-a5e1-48c8-9063-0491e5d2e23a", "errors": null, "name": "Bowtie2", "post_job_actions": {}, "label": null, "inputs": [{"name": "library", "description": "runtime parameter for tool Bowtie2"}, {"name": "reference_genome", "description": "runtime parameter for tool Bowtie2"}], "position": {"top": 100, "left": 280.5}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.3.2.2", "type": "tool"}, "4": {"tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.3.2.2", "tool_version": "2.3.2.2", "outputs": [{"type": "fastqsanger", "name": "output_unaligned_reads_l"}, {"type": "fastqsanger", "name": "output_aligned_reads_l"}, {"type": "fastqsanger", "name": "output_aligned_reads_r"}, {"type": "fastqsanger", "name": "output_unaligned_reads_r"}, {"type": "bam", "name": "output"}, {"type": "sam", "name": "output_sam"}, {"type": "txt", "name": "mapping_stats"}], "workflow_outputs": [], "input_connections": {"library|input_1": {"output_name": "output_accession", "id": 2}, "reference_genome|own_file": {"output_name": "output", "id": 0}}, "tool_state": "{\"__page__\": null, \"__rerun_remap_job_id__\": null, \"library\": \"{\\\"__current_case__\\\": 0, \\\"aligned_file\\\": \\\"false\\\", \\\"input_1\\\": {\\\"__class__\\\": \\\"RuntimeValue\\\"}, \\\"type\\\": \\\"single\\\", \\\"unaligned_file\\\": \\\"false\\\"}\", \"reference_genome\": \"{\\\"__current_case__\\\": 1, \\\"own_file\\\": {\\\"__class__\\\": \\\"RuntimeValue\\\"}, \\\"source\\\": \\\"history\\\"}\", \"rg\": \"{\\\"__current_case__\\\": 3, \\\"rg_selector\\\": \\\"do_not_set\\\"}\", \"save_mapping_stats\": \"\\\"false\\\"\", \"analysis_type\": \"{\\\"__current_case__\\\": 1, \\\"alignment_options\\\": {\\\"L\\\": \\\"22\\\", \\\"N\\\": \\\"1\\\", \\\"__current_case__\\\": 0, \\\"align_mode\\\": {\\\"__current_case__\\\": 0, \\\"align_mode_selector\\\": \\\"end-to-end\\\", \\\"score_min_ete\\\": \\\"L,-0.6,-0.6\\\"}, 
\\\"alignment_options_selector\\\": \\\"yes\\\", \\\"dpad\\\": \\\"15\\\", \\\"gbar\\\": \\\"4\\\", \\\"i\\\": \\\"S,1,1.15\\\", \\\"ignore_quals\\\": \\\"false\\\", \\\"n_ceil\\\": \\\"L,0,0.15\\\", \\\"no_1mm_upfront\\\": \\\"false\\\", \\\"nofw\\\": \\\"false\\\", \\\"norc\\\": \\\"false\\\"}, \\\"analysis_type_selector\\\": \\\"full\\\", \\\"effort_options\\\": {\\\"__current_case__\\\": 1, \\\"effort_options_selector\\\": \\\"no\\\"}, \\\"input_options\\\": {\\\"__current_case__\\\": 0, \\\"input_options_selector\\\": \\\"yes\\\", \\\"int_quals\\\": \\\"false\\\", \\\"qupto\\\": \\\"100000000\\\", \\\"qv_encoding\\\": \\\"--phred33\\\", \\\"skip\\\": \\\"0\\\", \\\"solexa_quals\\\": \\\"false\\\", \\\"trim3\\\": \\\"0\\\", \\\"trim5\\\": \\\"3\\\"}, \\\"other_options\\\": {\\\"__current_case__\\\": 1, \\\"other_options_selector\\\": \\\"no\\\"}, \\\"reporting_options\\\": {\\\"__current_case__\\\": 0, \\\"reporting_options_selector\\\": \\\"no\\\"}, \\\"sam_opt\\\": \\\"false\\\", \\\"sam_options\\\": {\\\"__current_case__\\\": 1, \\\"sam_options_selector\\\": \\\"no\\\"}, \\\"scoring_options\\\": {\\\"__current_case__\\\": 1, \\\"scoring_options_selector\\\": \\\"no\\\"}}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/Arabidopsis_thaliana_TAIR10.len\\\"\"}", "id": 4, "tool_shed_repository": {"owner": "devteam", "changeset_revision": "66f992977578", "name": "bowtie2", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "eeed880a-03f0-49d0-94cd-5c0340b8c6d4", "errors": null, "name": "Bowtie2", "post_job_actions": {}, "label": null, "inputs": [{"name": "library", "description": "runtime parameter for tool Bowtie2"}, {"name": "reference_genome", "description": "runtime parameter for tool Bowtie2"}], "position": {"top": 292, "left": 280.5}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.3.2.2", "type": "tool"}, "5": {"tool_id": 
"toolshed.g2.bx.psu.edu/repos/tyty/structurefold/get_read_pipeline/1.0", "tool_version": "1.0", "outputs": [{"type": "txt", "name": "output"}], "workflow_outputs": [], "input_connections": {"lib_file": {"output_name": "output", "id": 0}, "map_file": {"output_name": "output", "id": 3}}, "tool_state": "{\"__page__\": null, \"lib_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/?.len\\\"\", \"__rerun_remap_job_id__\": null, \"map_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 5, "tool_shed_repository": {"owner": "tyty", "changeset_revision": "7bb98e9296e9", "name": "structurefold", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "75bf6826-ba87-493c-a13e-8d1b248a61a8", "errors": null, "name": "Get RT Stop Counts", "post_job_actions": {}, "label": null, "inputs": [{"name": "lib_file", "description": "runtime parameter for tool Get RT Stop Counts"}, {"name": "map_file", "description": "runtime parameter for tool Get RT Stop Counts"}], "position": {"top": 100, "left": 444.4921875}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/tyty/structurefold/get_read_pipeline/1.0", "type": "tool"}, "6": {"tool_id": "toolshed.g2.bx.psu.edu/repos/tyty/structurefold/get_read_pipeline/1.0", "tool_version": "1.0", "outputs": [{"type": "txt", "name": "output"}], "workflow_outputs": [], "input_connections": {"lib_file": {"output_name": "output", "id": 0}, "map_file": {"output_name": "output", "id": 4}}, "tool_state": "{\"__page__\": null, \"lib_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/?.len\\\"\", \"__rerun_remap_job_id__\": null, \"map_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 6, "tool_shed_repository": {"owner": "tyty", "changeset_revision": "7bb98e9296e9", "name": "structurefold", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": 
"37ad7902-2f84-4420-b4dc-cb56e2468808", "errors": null, "name": "Get RT Stop Counts", "post_job_actions": {}, "label": null, "inputs": [{"name": "lib_file", "description": "runtime parameter for tool Get RT Stop Counts"}, {"name": "map_file", "description": "runtime parameter for tool Get RT Stop Counts"}], "position": {"top": 181, "left": 444.4921875}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/tyty/structurefold/get_read_pipeline/1.0", "type": "tool"}, "7": {"tool_id": "toolshed.g2.bx.psu.edu/repos/tyty/structurefold/react_cal_pipeline/1.0", "tool_version": "1.0", "outputs": [{"type": "txt", "name": "output"}], "workflow_outputs": [], "input_connections": {"dist_file1": {"output_name": "output", "id": 5}, "seq_file": {"output_name": "output", "id": 0}, "dist_file2": {"output_name": "output", "id": 6}}, "tool_state": "{\"dist_file1\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__page__\": null, \"dist_file2\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"seq_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"flag_in\": \"\\\"true\\\"\", \"threshold\": \"\\\"7.0\\\"\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/?.len\\\"\", \"__rerun_remap_job_id__\": null, \"nt_spec\": \"\\\"AC\\\"\"}", "id": 7, "tool_shed_repository": {"owner": "tyty", "changeset_revision": "7bb98e9296e9", "name": "structurefold", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "87e823b1-def1-4af0-ad49-b4342de417c1", "errors": null, "name": "Reactivity Calculation", "post_job_actions": {}, "label": null, "inputs": [{"name": "dist_file1", "description": "runtime parameter for tool Reactivity Calculation"}, {"name": "dist_file2", "description": "runtime parameter for tool Reactivity Calculation"}, {"name": "seq_file", "description": "runtime parameter for tool Reactivity Calculation"}], "position": {"top": 100, "left": 594.5}, "annotation": "", "content_id": 
"toolshed.g2.bx.psu.edu/repos/tyty/structurefold/react_cal_pipeline/1.0", "type": "tool"}}, "annotation": "", "a_galaxy_workflow": "true"} -------------------------------------------------------------------------------- /workflows/auxiliary-workflows/MAF-to-FASTA-Collection.ga: -------------------------------------------------------------------------------- 1 | {"uuid": "45c5ed46-f54b-4ab2-978d-8150135f9e38", "tags": [], "format-version": "0.1", "name": "MAF-to-FASTA-Collection", "version": 2, "steps": {"0": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [], "input_connections": {}, "tool_state": "{\"collection_type\": \"list\"}", "id": 0, "uuid": "4c592c7e-5b79-4232-8d3b-e1ed452a42eb", "errors": null, "name": "Input dataset collection", "label": null, "inputs": [], "position": {"top": 335.9801139831543, "left": 146.9744415283203}, "annotation": "", "content_id": null, "type": "data_collection_input"}, "1": {"tool_id": "MAF_To_Fasta1", "tool_version": "1.0.1", "outputs": [{"type": "fasta", "name": "out_file1"}], "workflow_outputs": [], "input_connections": {"input1": {"output_name": "output", "id": 0}}, "tool_state": "{\"fasta_target_type\": \"{\\\"__current_case__\\\": 1, \\\"fasta_type\\\": \\\"concatenated\\\", \\\"species\\\": null}\", \"__rerun_remap_job_id__\": null, \"input1\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg38.len\\\"\", \"__page__\": null}", "id": 1, "uuid": "ba63abce-cfb5-4210-815d-67234b9375e6", "errors": null, "name": "MAF to FASTA", "post_job_actions": {"HideDatasetActionout_file1": {"output_name": "out_file1", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "input1", "description": "runtime parameter for tool MAF to FASTA"}], "position": {"top": 199.98579025268555, "left": 420.00001525878906}, "annotation": "", "content_id": "MAF_To_Fasta1", "type": "tool"}, "2": {"tool_id": 
"toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/1.1.1", "tool_version": "1.1.1", "outputs": [{"type": "input", "name": "outfile"}], "workflow_outputs": [], "input_connections": {"infile": {"output_name": "out_file1", "id": 1}}, "tool_state": "{\"__page__\": null, \"find_pattern\": \"\\\"-\\\"\", \"replace_pattern\": \"\\\"\\\"\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg38.len\\\"\", \"infile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 2, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "74a8bef53a00", "name": "text_processing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "e8e61e36-14a5-492f-af44-fd5ef6e162f1", "errors": null, "name": "Replace Text", "post_job_actions": {"HideDatasetActionoutfile": {"output_name": "outfile", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "infile", "description": "runtime parameter for tool Replace Text"}], "position": {"top": 199.98579025268555, "left": 639.9857940673828}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/1.1.1", "type": "tool"}, "3": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.0", "tool_version": "1.1.0", "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [{"output_name": "output", "uuid": "e8252d1b-2ef0-4247-b9eb-ed87f67edeab", "label": null}], "input_connections": {"infile": {"output_name": "outfile", "id": 2}}, "tool_state": "{\"adv_opts\": \"{\\\"__current_case__\\\": 1, \\\"adv_opts_selector\\\": \\\"advanced\\\", \\\"silent\\\": \\\"\\\"}\", \"__page__\": null, \"__rerun_remap_job_id__\": null, \"code\": \"\\\"s/>(.*)/>\\\\\\\\1 \\\\\\\\1/g\\\"\", \"infile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 3, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "e39fceb6ab85", "name": 
"text_processing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "9b501c03-c038-48e9-bc27-6b3625ad5f01", "errors": null, "name": "Text transformation", "post_job_actions": {"ChangeDatatypeActionoutput": {"output_name": "output", "action_type": "ChangeDatatypeAction", "action_arguments": {"newtype": "fasta"}}}, "label": null, "inputs": [{"name": "infile", "description": "runtime parameter for tool Text transformation"}], "position": {"top": 319.9999809265137, "left": 860.0000152587891}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.0", "type": "tool"}}, "annotation": "", "a_galaxy_workflow": "true"} -------------------------------------------------------------------------------- /workflows/auxiliary-workflows/MAF-to-FASTA.ga: -------------------------------------------------------------------------------- 1 | {"uuid": "ee0ed325-898d-48fc-85cc-ed7ce6f08ec0", "tags": [], "format-version": "0.1", "name": "MAF-to-FASTA-labeled", "version": 1, "steps": {"0": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [{"output_name": "output", "uuid": "67fdbb22-a899-4d6b-9f89-accff12f01ac", "label": null}], "input_connections": {}, "tool_state": "{}", "id": 0, "uuid": "afce9252-dc93-415b-b81f-96c268aa45f1", "errors": null, "name": "Input dataset", "label": "UCSC Main on Human: multiz100way", "inputs": [], "position": {"top": 135, "left": 171.9921875}, "annotation": "", "content_id": null, "type": "data_input"}, "1": {"tool_id": "MAF_To_Fasta1", "tool_version": "1.0.1", "outputs": [{"type": "fasta", "name": "out_file1"}], "workflow_outputs": [{"output_name": "out_file1", "uuid": "9369a580-3256-46d7-846d-c7d2b870224b", "label": "fasta_alignment"}], "input_connections": {"input1": {"output_name": "output", "id": 0}}, "tool_state": "{\"fasta_target_type\": \"{\\\"__current_case__\\\": 1, \\\"fasta_type\\\": \\\"concatenated\\\", \\\"species\\\": null}\", \"__rerun_remap_job_id__\": null, \"input1\": 
\"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg38.len\\\"\", \"__page__\": null}", "id": 1, "uuid": "a1416abd-5d5d-48c8-b7db-70a6bf12a367", "errors": null, "name": "MAF to FASTA", "post_job_actions": {}, "label": null, "inputs": [{"name": "input1", "description": "runtime parameter for tool MAF to FASTA"}], "position": {"top": 170, "left": 447.98828125}, "annotation": "", "content_id": "MAF_To_Fasta1", "type": "tool"}, "2": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/1.1.2", "tool_version": "1.1.2", "outputs": [{"type": "input", "name": "outfile"}], "workflow_outputs": [], "input_connections": {"infile": {"output_name": "out_file1", "id": 1}}, "tool_state": "{\"__page__\": null, \"replacements\": \"[]\", \"find_pattern\": \"\\\"-\\\"\", \"replace_pattern\": \"\\\"\\\"\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg38.len\\\"\", \"infile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 2, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "a6f147a050a2", "name": "text_processing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "1dfce274-f51a-4c03-af22-d3c2f4262190", "errors": null, "name": "Replace Text", "post_job_actions": {"HideDatasetActionoutfile": {"output_name": "outfile", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "infile", "description": "runtime parameter for tool Replace Text"}], "position": {"top": 195, "left": 718.984375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/1.1.2", "type": "tool"}, "3": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1", "tool_version": "1.1.1", "outputs": [{"type": "input", "name": "output"}], "workflow_outputs": [{"output_name": "output", "uuid": 
"ff95d35d-4dca-403b-888d-8697cd23f4b4", "label": "fasta_ungapped"}], "input_connections": {"infile": {"output_name": "outfile", "id": 2}}, "tool_state": "{\"__page__\": null, \"code\": \"\\\"s/>(.*)/>\\\\\\\\1 \\\\\\\\1/g\\\"\", \"adv_opts\": \"{\\\"__current_case__\\\": 1, \\\"adv_opts_selector\\\": \\\"advanced\\\", \\\"silent\\\": \\\"\\\"}\", \"__rerun_remap_job_id__\": null, \"chromInfo\": \"\\\"/usr/local/galaxy/galaxy-dist/tool-data/shared/ucsc/chrom/hg38.len\\\"\", \"infile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 3, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "a6f147a050a2", "name": "text_processing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "fa75dd3c-2c2c-428d-bc26-17466e7add44", "errors": null, "name": "Text transformation", "post_job_actions": {}, "label": null, "inputs": [{"name": "infile", "description": "runtime parameter for tool Text transformation"}], "position": {"top": 230, "left": 995.99609375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1", "type": "tool"}}, "annotation": "", "a_galaxy_workflow": "true"} -------------------------------------------------------------------------------- /workflows/auxiliary-workflows/README.md: -------------------------------------------------------------------------------- 1 | **Auxiliary workflows** 2 | - MAF-to-FASTA: Conversion of Multiz Alignment Format (MAF) file of a locus, e.g. from UCSC's table browser, to FASTA format for identifying structurally conserved elements in orthologous regions. 3 | - MAF-to-FASTA-Collection: Conversion of Multiz Alignment Format (MAF) files of loci, e.g. from UCSC's table browser, to FASTA format for identifying structurally conserved elements in orthologous regions in a collection of loci separately and in parallel. 4 | - Compute-SP-reactivity: Computing RNA structure reactivities from HTS structure probing experiments SHAPE/DMS, using Bowtie-2 and StructureFold. 
5 | - Cluster-conservation-filter: Filter clustering input or output genomic coordinates according to the conservation scores PhastCons/PhyloP. 6 | - Cluster-conservation-filter_and_align: Filter clustering input or output genomic coordinates according to the conservation scores PhastCons/PhyloP, then align each conserved region and predict a conserved element in it. The correspondingly ordered set of FASTA sequences is also required as input. 7 | -------------------------------------------------------------------------------- /workflows/extra-workflows/Orthology/README.md: -------------------------------------------------------------------------------- 1 | This workflow can be used to identify structurally conserved candidates from sequences of orthologous regions, which can for example be obtained from Multiz alignments. It is an extension of the Motif-Finder flavor in which the last step filters the clusters according to the structure conservation analysis metrics EvoFold, RNAz and R-scape. In a future release, the conservation analysis feature is planned to be merged into the primary MotifFinder workflow. 2 | 3 | ## Step-by-step guide for extracting locally conserved candidates 4 | The genomic coordinates of a locus are the basic input needed to identify locally conserved structure candidates from the orthologous genomic regions. Below is an example of one way to perform the orthology procedure. Specifically, we have used it to identify NEAT1 and other lncRNA candidates. 5 | 1. Locus coordinates: In the UCSC Genome Browser (human, hg38), search for NEAT1, click on the desired isoform (here the longest), open the details page and extract the coordinates. Here we get 6 | * "Position: hg38 chr11:65,422,798-65,445,540 Size: 22,743 Total Exon Count: 1 Strand: +" 7 | 2. 
MAF blocks: Either download the MAF blocks directly from the UCSC Table Browser or, preferably, go through the Galaxy-UCSC interface to load the MAF blocks into your Galaxy history: 8 | a) Inside Galaxy, search for UCSC and click "UCSC Main table browser". 9 | b) Use the settings shown in the screenshot below and enter the position from the previous step. Make sure `Galaxy` is selected. Click the `get output` and `Send query to Galaxy` buttons. 10 | ![](./tablebrowser-neat1.png) 11 | c) After a few seconds a new entry prefixed `UCSC Main on Human: multiz100way` should appear in your current Galaxy history. It contains the MAF blocks for the selected locus. 12 | d) Use the `MAF-to-FASTA` auxiliary workflow to prepare the FASTA files for GC2 input. If a subset of species is desired, you can select them in the options of workflow step 1. If the lncRNA is located on the negative strand, use `Reverse complement the sequences`; otherwise stick to the FASTA file prefixed `Text transformation on`. 13 | e) The first sequence of the FASTA file can be used as the reference human NEAT1 transcript (NEAT1_hg38.fasta). Alternatively, the sequence can be extracted from the Table Browser, for example. This sequence will be used in the next step to map the relative locations of candidates to hg38 coordinates, so it is important that it covers the full locus and is not a spliced transcript. 14 | 3. Invoking the motif-finder workflow: We now have the NEAT1 locus coordinates (step 1) and the FASTA file with one sequence per species (step 2) from the genomic alignments. The motif-finder workflow can now be applied to cluster the data and identify the candidates. We have configured the `MotifFinder-lncRNA` workflow to additionally produce the annotated genomic tracks, as also shown in the paper. 
We are dealing with orthologous sequences that are quite long and tend to have high sequence similarity. The automatically generated genomic track, annotated and filtered by EvoFold, RNAz and R-scape, can therefore be very useful for getting an intuition about the distribution of reliable candidates and for reducing false discoveries. 15 | a) Run the `MotifFinder-lncRNA` workflow. 16 | b) Pass the genomic FASTA file as `1: Input data`; it is prefixed `Text transformation` and is the output of MAF-to-FASTA from step 2. 17 | c) Pass `NEAT1_hg38.fasta` as `2: Loci reference sequence`. 18 | d) In the step `14: Align GraphClust cluster` (the one before the last), optionally replace the `transcript loci bed` default value of "chr1 0 100000 gene 0 +" with "chr11 65422798 65445540 NEAT1 0 +", taking care with the whitespace and the strand sign. This value is used to convert the relative positions of the candidates to genomic positions in the UCSC track. 19 | e) Run the workflow. This might take some minutes or a few hours depending on the available back-end capacity. 20 | f) The summarized outputs are `filtered-alignments-metrics.tsv` and `bed-cluster-locations.bed`. The bed file can be copied as a UCSC browser custom track, for example at https://genome.ucsc.edu/cgi-bin/hgCustom . Please do not copy the first row of numbered columns. 21 | g) The alignments and secondary structures are available to view and download under the collection output entries `Rscape-R2R`, `structure.png` and `alignment.png`. 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------- /workflows/extra-workflows/README.md: -------------------------------------------------------------------------------- 1 | * Here you can find alternative pre-configured flavors of the GraphClust-2 pipeline. Further descriptions can be found within the directories. 
-------------------------------------------------------------------------------- /workflows/extra-workflows/RNAshapes/GraphClust_1r_brnashapes.ga: -------------------------------------------------------------------------------- 1 | {"uuid": "73852eb6-d944-4b81-b6b8-a1e137294ec8", "tags": [], "format-version": "0.1", "name": "GraphClust-1r-brnashapes (imported from uploaded file)", "steps": {"0": {"tool_id": null, "tool_version": null, "outputs": [], "workflow_outputs": [{"output_name": "output", "uuid": "8f49a89e-563d-41ee-a7eb-4062a39b4bd0", "label": null}], "input_connections": {}, "tool_state": "{\"name\": \"Input Dataset\"}", "id": 0, "uuid": "9201528f-e47f-4b82-a34e-8cfe358a0621", "errors": null, "name": "Input dataset", "label": null, "inputs": [{"name": "Input Dataset", "description": ""}], "position": {"top": 299, "left": 203.953125}, "annotation": "", "content_id": null, "type": "data_input"}, "1": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_preprocessing/preproc/0.4", "tool_version": "0.4", "outputs": [{"type": "fasta", "name": "data.fasta"}, {"type": "txt", "name": "data.map"}, {"type": "txt", "name": "data.names"}, {"type": "fasta", "name": "data.fasta.scan"}, {"type": "zip", "name": "FASTA"}, {"type": "txt", "name": "shape_data_split"}, {"type": "stockholm", "name": "alignment_data_split"}], "workflow_outputs": [{"output_name": "data.names", "uuid": "de3c82bc-67f5-4945-bb9c-cb1738ee8c01", "label": null}, {"output_name": "data.fasta", "uuid": "fa6c584b-88f1-49f1-b8de-c9809f025062", "label": null}, {"output_name": "data.map", "uuid": "ffaf6f84-5441-4e9f-b1cb-ca920916cd3f", "label": null}, {"output_name": "shape_data_split", "uuid": "842c110c-abd5-4254-ab7d-fbbbc46dd44c", "label": null}, {"output_name": "alignment_data_split", "uuid": "b0898f44-8dc9-479b-92eb-c11704a654c8", "label": null}, {"output_name": "FASTA", "uuid": "2cd669f9-1c9b-4c84-befb-29f1bec7e5cf", "label": null}, {"output_name": "data.fasta.scan", "uuid": 
"1f9c1a31-18a0-428c-97f1-1f990edd3e5b", "label": null}], "input_connections": {"fastaFile": {"output_name": "output", "id": 0}}, "tool_state": "{\"fastaFile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"in_winShift\": \"\\\"100\\\"\", \"__page__\": null, \"__rerun_remap_job_id__\": null, \"min_seq_length\": \"\\\"5\\\"\", \"max_length\": \"\\\"10000\\\"\", \"AlignmentData\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"SHAPEdata\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 1, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "8a1786cdcf95", "name": "graphclust_preprocessing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "0e367ec4-acb3-4f38-93ba-4c900aee99d0", "errors": null, "name": "Preprocessing", "post_job_actions": {}, "label": null, "inputs": [{"name": "fastaFile", "description": "runtime parameter for tool Preprocessing"}, {"name": "AlignmentData", "description": "runtime parameter for tool Preprocessing"}, {"name": "SHAPEdata", "description": "runtime parameter for tool Preprocessing"}], "position": {"top": 408.875, "left": 437.859375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_preprocessing/preproc/0.4", "type": "tool"}, "2": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_fasta_to_gspan/gspan/0.3", "tool_version": "0.3", "outputs": [{"type": "input", "name": "gspan_compressed"}], "workflow_outputs": [], "input_connections": {"dataFasta": {"output_name": "data.fasta", "id": 1}}, "tool_state": "{\"__page__\": null, \"group\": \"\\\"50\\\"\", \"shift\": \"\\\"30\\\"\", \"__rerun_remap_job_id__\": null, \"M\": \"\\\"5\\\"\", \"dataFasta\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"i_crop_unpaired_ends\": \"\\\"false\\\"\", \"i_abstr\": \"\\\"false\\\"\", \"wins\": \"\\\"200\\\"\", \"u\": \"\\\"true\\\"\", \"rel_energy_range\": \"\\\"20\\\"\", \"seq_graph_t\": \"\\\"true\\\"\", \"i_stacks\": \"\\\"true\\\"\"}", "id": 2, "tool_shed_repository": {"owner": 
"rnateam", "changeset_revision": "57778db3211b", "name": "graphclust_fasta_to_gspan", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "810d1a17-76d0-4934-bc46-784ad623d5c6", "errors": null, "name": "fasta_to_gspan", "post_job_actions": {"HideDatasetActiongspan_compressed": {"output_name": "gspan_compressed", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "dataFasta", "description": "runtime parameter for tool fasta_to_gspan"}], "position": {"top": 292.484375, "left": 645}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_fasta_to_gspan/gspan/0.3", "type": "tool"}, "3": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/nspdk_sparse/9.2.2", "tool_version": "9.2.2", "outputs": [{"type": "zip", "name": "data_svector"}], "workflow_outputs": [{"output_name": "data_svector", "uuid": "0491c9d6-d30a-413c-a997-e86518dbf950", "label": null}], "input_connections": {"data_fasta": {"output_name": "data.fasta", "id": 1}, "gspan_file": {"output_name": "gspan_compressed", "id": 2}}, "tool_state": "{\"max_dist_relations\": \"\\\"3\\\"\", \"__page__\": 0, \"__rerun_remap_job_id__\": null, \"gspan_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"data_fasta\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"max_rad\": \"\\\"3\\\"\"}", "id": 3, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "25fd145b498a", "name": "graphclust_nspdk", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "ce2ece03-fe02-4fce-9f6f-6180daf51294", "errors": null, "name": "NSPDK_sparseVect", "post_job_actions": {}, "label": null, "inputs": [{"name": "data_fasta", "description": "runtime parameter for tool NSPDK_sparseVect"}, {"name": "gspan_file", "description": "runtime parameter for tool NSPDK_sparseVect"}], "position": {"top": 259.484375, "left": 869.484375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/nspdk_sparse/9.2.2", "type": "tool"}, "4": 
{"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/NSPDK_candidateClust/9.2.2", "tool_version": "9.2.2", "outputs": [{"type": "txt", "name": "fast_cluster"}, {"type": "txt", "name": "fast_cluster_sim"}, {"type": "txt", "name": "black_list"}, {"type": "txt", "name": "fast_cluster_m"}, {"type": "txt", "name": "fast_cluster_sim_m"}, {"type": "txt", "name": "black_list_m"}], "workflow_outputs": [], "input_connections": {"data_names": {"output_name": "data.names", "id": 1}, "data_fasta": {"output_name": "data.fasta", "id": 1}, "data_svector": {"output_name": "data_svector", "id": 3}}, "tool_state": "{\"knn\": \"\\\"10\\\"\", \"max_dist_relations\": \"\\\"3\\\"\", \"nhf\": \"\\\"500\\\"\", \"noCache\": \"\\\"true\\\"\", \"__page__\": null, \"usn\": \"\\\"true\\\"\", \"__rerun_remap_job_id__\": null, \"oc\": \"\\\"true\\\"\", \"iteration_num\": \"{\\\"iteration_num_selector\\\": \\\"false\\\", \\\"CI\\\": \\\"1\\\", \\\"__current_case__\\\": 1}\", \"ensf\": \"\\\"5\\\"\", \"data_fasta\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"max_rad\": \"\\\"3\\\"\", \"data_names\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"nspdk_nhf_max\": \"\\\"1000\\\"\", \"nspdk_nhf_step\": \"\\\"25\\\"\", \"GLOBAL_num_clusters\": \"\\\"100\\\"\", \"data_svector\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 4, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "25fd145b498a", "name": "graphclust_nspdk", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "2f46dfa9-7477-4364-88bd-4a568fdb0f4b", "errors": null, "name": "NSPDK_candidateClusters", "post_job_actions": {"HideDatasetActionfast_cluster_m": {"output_name": "fast_cluster_m", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfast_cluster_sim_m": {"output_name": "fast_cluster_sim_m", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfast_cluster": {"output_name": "fast_cluster", "action_type": "HideDatasetAction", 
"action_arguments": {}}, "HideDatasetActionblack_list_m": {"output_name": "black_list_m", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionblack_list": {"output_name": "black_list", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionfast_cluster_sim": {"output_name": "fast_cluster_sim", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "data_fasta", "description": "runtime parameter for tool NSPDK_candidateClusters"}, {"name": "data_svector", "description": "runtime parameter for tool NSPDK_candidateClusters"}, {"name": "data_names", "description": "runtime parameter for tool NSPDK_candidateClusters"}], "position": {"top": 358.921875, "left": 1149.875}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_nspdk/NSPDK_candidateClust/9.2.2", "type": "tool"}, "5": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_prepocessing_for_mlocarna/preMloc/0.3", "tool_version": "0.3", "outputs": [{"type": "input", "name": "centers"}, {"type": "input", "name": "trees"}, {"type": "input", "name": "cmFa"}, {"type": "input", "name": "model_tree_fa"}, {"type": "input", "name": "tree_matrix"}], "workflow_outputs": [], "input_connections": {"fasta_data": {"output_name": "data.fasta", "id": 1}, "fast_cluster_sim": {"output_name": "fast_cluster_sim", "id": 4}, "data_map": {"output_name": "data.map", "id": 1}, "fast_cluster": {"output_name": "fast_cluster", "id": 4}}, "tool_state": "{\"knn\": \"\\\"10\\\"\", \"__page__\": null, \"CI\": \"\\\"1\\\"\", \"fasta_data\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__rerun_remap_job_id__\": null, \"fast_cluster_sim\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"data_map\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"fast_cluster\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 5, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "cf17dd082eb4", "name": 
"graphclust_prepocessing_for_mlocarna", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "54ca3c3a-41a2-4c50-8bec-1956c879e052", "errors": null, "name": "pgma_graphclust", "post_job_actions": {"HideDatasetActioncmFa": {"output_name": "cmFa", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiontrees": {"output_name": "trees", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActioncenters": {"output_name": "centers", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiontree_matrix": {"output_name": "tree_matrix", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionmodel_tree_fa": {"output_name": "model_tree_fa", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "fasta_data", "description": "runtime parameter for tool pgma_graphclust"}, {"name": "fast_cluster_sim", "description": "runtime parameter for tool pgma_graphclust"}, {"name": "data_map", "description": "runtime parameter for tool pgma_graphclust"}, {"name": "fast_cluster", "description": "runtime parameter for tool pgma_graphclust"}], "position": {"top": 469.90625, "left": 1512.859375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_prepocessing_for_mlocarna/preMloc/0.3", "type": "tool"}, "6": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_mlocarna/locarna_best_subtree/0.3", "tool_version": "0.3", "outputs": [{"type": "stockholm", "name": "model_tree_stk"}], "workflow_outputs": [], "input_connections": {"tree_file": {"output_name": "trees", "id": 5}, "data_map": {"output_name": "data.map", "id": 1}, "center_fa_file": {"output_name": "centers", "id": 5}, "tree_matrix": {"output_name": "tree_matrix", "id": 5}}, "tool_state": "{\"tau\": \"\\\"50\\\"\", \"max_diff_am\": \"\\\"50\\\"\", \"center_fa_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"__page__\": 0, \"__rerun_remap_job_id__\": null, \"free_endgaps\": 
\"\\\"0\\\"\", \"tree_matrix\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"p\": \"\\\"0.001\\\"\", \"allow_overlap\": \"\\\"false\\\"\", \"data_map\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"param_type\": \"{\\\"tau\\\": \\\"50\\\", \\\"max_diff_am\\\": \\\"50\\\", \\\"struct_weight\\\": \\\"180\\\", \\\"p\\\": \\\"0.001\\\", \\\"plfold_winsize\\\": \\\"300\\\", \\\"alifold_consensus_dp\\\": \\\"true\\\", \\\"param_type_selector\\\": \\\"gclust\\\", \\\"indel\\\": \\\"-200\\\", \\\"__current_case__\\\": 0, \\\"max_diff\\\": \\\"100\\\", \\\"indel_opening\\\": \\\"-400\\\", \\\"plfold_minlen\\\": \\\"210\\\", \\\"plfold_span\\\": \\\"150\\\"}\", \"max_diff\": \"\\\"100\\\"\", \"tree_file\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"plfold_minlen\": \"\\\"210\\\"\"}", "id": 6, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "f416cae7fcdb", "name": "graphclust_mlocarna", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "8febdf67-f1ae-42c0-89e4-baabaf4e2e1b", "errors": null, "name": "locarna_graphclust", "post_job_actions": {"HideDatasetActionmodel_tree_stk": {"output_name": "model_tree_stk", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "center_fa_file", "description": "runtime parameter for tool locarna_graphclust"}, {"name": "tree_matrix", "description": "runtime parameter for tool locarna_graphclust"}, {"name": "data_map", "description": "runtime parameter for tool locarna_graphclust"}, {"name": "tree_file", "description": "runtime parameter for tool locarna_graphclust"}], "position": {"top": 586.921875, "left": 1788.375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_mlocarna/locarna_best_subtree/0.3", "type": "tool"}, "7": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_cmfinder/cmFinder/0.3", "tool_version": "0.3", "outputs": [{"type": "stockholm", "name": "model_cmfinder_stk"}], "workflow_outputs": [], "input_connections": 
{"model_tree_stk": {"output_name": "model_tree_stk", "id": 6}, "cmfinder_fa": {"output_name": "cmFa", "id": 5}}, "tool_state": "{\"__page__\": 0, \"__rerun_remap_job_id__\": null, \"gap_threshold_opts\": \"{\\\"gap_threshold_opts_selector\\\": \\\"--g\\\", \\\"__current_case__\\\": 0, \\\"gap\\\": \\\"1.0\\\"}\", \"model_tree_stk\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"cmfinder_fa\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\"}", "id": 7, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "68bfc8df36f8", "name": "graphclust_cmfinder", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "03a78407-d5d6-4c78-ba39-170b52702be2", "errors": null, "name": "cmfinder", "post_job_actions": {"HideDatasetActionmodel_cmfinder_stk": {"output_name": "model_cmfinder_stk", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "cmfinder_fa", "description": "runtime parameter for tool cmfinder"}, {"name": "model_tree_stk", "description": "runtime parameter for tool cmfinder"}], "position": {"top": 934.90625, "left": 2048.875}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_cmfinder/cmFinder/0.3", "type": "tool"}, "8": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmbuild/1.1.2.0", "tool_version": "1.1.2.0", "outputs": [{"type": "txt", "name": "summary_outfile"}, {"type": "cm", "name": "cmfile_outfile"}, {"type": "stockholm", "name": "refined_multiple_alignment_output"}, {"type": "txt", "name": "hfile"}, {"type": "txt", "name": "sfile"}, {"type": "txt", "name": "qqfile"}, {"type": "txt", "name": "ffile"}, {"type": "txt", "name": "xfile"}], "workflow_outputs": [{"output_name": "sfile", "uuid": "319ce79f-4475-4db8-9196-60e8ce59dfc9", "label": null}, {"output_name": "ffile", "uuid": "d80c97ff-259c-4831-aeb2-64ec9d6f8811", "label": null}, {"output_name": "xfile", "uuid": "0a25a25c-8136-44ba-8ac5-b3c6448303b8", "label": null}, {"output_name": "qqfile", "uuid": 
"4b3ed1e1-402a-432f-9168-f882cff1caac", "label": null}, {"output_name": "hfile", "uuid": "04eb30d4-2214-4a84-bdc0-bdd8bcbf6e69", "label": null}], "input_connections": {"alignment_infile": {"output_name": "model_cmfinder_stk", "id": 7}}, "tool_state": "{\"__page__\": 0, \"controlling_filter_p7_hmm\": \"{\\\"p7ml\\\": \\\"false\\\", \\\"p7ere\\\": \\\"0.38\\\", \\\"EvN\\\": \\\"200\\\", \\\"ElfN\\\": \\\"200\\\", \\\"EmN\\\": \\\"200\\\", \\\"EgfN\\\": \\\"200\\\"}\", \"noss\": \"\\\"false\\\"\", \"is_summery_output\": \"\\\"false\\\"\", \"__rerun_remap_job_id__\": null, \"effective_opts\": \"{\\\"__current_case__\\\": 0, \\\"effective_opts_selector\\\": \\\"--enone\\\"}\", \"alignment_infile\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"refining_opts\": \"{\\\"__current_case__\\\": 0, \\\"refining_opts_selector\\\": \\\"\\\"}\", \"Calibrate\": \"{\\\"output_options_cond\\\": {\\\"__current_case__\\\": 1, \\\"selector\\\": \\\"none\\\"}, \\\"L\\\": \\\"1.6\\\", \\\"add_opts\\\": {\\\"nonull3\\\": \\\"false\\\", \\\"random\\\": \\\"false\\\", \\\"beta\\\": \\\"1e-15\\\", \\\"seed\\\": \\\"181\\\", \\\"gc\\\": {\\\"__class__\\\": \\\"RuntimeValue\\\"}, \\\"nonbanded\\\": \\\"false\\\"}, \\\"selector\\\": \\\"true\\\", \\\"__current_case__\\\": 1, \\\"cont_exp_tails_fits\\\": {\\\"gtailn\\\": \\\"250\\\", \\\"ltailn\\\": \\\"750\\\", \\\"__current_case__\\\": 0, \\\"selector\\\": \\\"top_n\\\"}}\", \"model_construction_opts\": \"{\\\"model_construction_opts_selector\\\": \\\"--fast\\\", \\\"symfrac\\\": \\\"0.5\\\", \\\"__current_case__\\\": 0}\", \"relative_weights_opts\": \"{\\\"relative_weights_opts_selector\\\": \\\"--wpb\\\", \\\"__current_case__\\\": 0}\"}", "id": 8, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "6e18e0b098cd", "name": "infernal", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "d653418f-f33d-482b-a690-0eb1ba7682e4", "errors": null, "name": "cmbuild", "post_job_actions": {"HideDatasetActioncmfile_outfile": 
{"output_name": "cmfile_outfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionsummary_outfile": {"output_name": "summary_outfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionrefined_multiple_alignment_output": {"output_name": "refined_multiple_alignment_output", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "alignment_infile", "description": "runtime parameter for tool cmbuild"}], "position": {"top": 1006.921875, "left": 2360.9375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmbuild/1.1.2.0", "type": "tool"}, "9": {"tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmsearch/1.1.2.0", "tool_version": "1.1.2.0", "outputs": [{"type": "tabular", "name": "outfile"}, {"type": "tabular", "name": "multiple_alignment_output"}], "workflow_outputs": [], "input_connections": {"cm_opts|cmfile": {"output_name": "cmfile_outfile", "id": 8}, "seqdb": {"output_name": "data.fasta.scan", "id": 1}}, "tool_state": "{\"nohmm\": \"\\\"true\\\"\", \"anytrunc\": \"\\\"false\\\"\", \"verbose\": \"\\\"false\\\"\", \"notrunc\": \"\\\"false\\\"\", \"smxsize\": \"\\\"128.0\\\"\", \"cm_opts\": \"{\\\"cmfile\\\": {\\\"__class__\\\": \\\"RuntimeValue\\\"}, \\\"__current_case__\\\": 1, \\\"cm_opts_selector\\\": \\\"histdb\\\"}\", \"__page__\": null, \"bottomonly\": \"\\\"false\\\"\", \"seqdb\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"inclusion_thresholds_opts\": \"{\\\"__current_case__\\\": 0, \\\"inclusion_thresholds_selector\\\": \\\"\\\"}\", \"__rerun_remap_job_id__\": null, \"mid\": \"\\\"false\\\"\", \"mxsize\": \"\\\"128.0\\\"\", \"A\": \"\\\"false\\\"\", \"acyk\": \"\\\"false\\\"\", \"max\": \"\\\"false\\\"\", \"acceleration_huristics\": \"{\\\"__current_case__\\\": 3, \\\"acceleration_huristics_selector\\\": \\\"--default\\\"}\", \"reporting_thresholds_opts\": 
\"{\\\"reporting_thresholds_selector\\\": \\\"\\\", \\\"__current_case__\\\": 0}\", \"cyk\": \"\\\"false\\\"\", \"Z\": \"\\\"\\\"\", \"model_thresholds\": \"{\\\"cut_ga\\\": \\\"false\\\", \\\"cut_nc\\\": \\\"false\\\", \\\"cut_tc\\\": \\\"false\\\"}\", \"--acyk\": \"\\\"false\\\"\", \"g\": \"\\\"true\\\"\", \"nonull3\": \"\\\"false\\\"\", \"toponly\": \"\\\"true\\\"\", \"noali\": \"\\\"false\\\"\"}", "id": 9, "tool_shed_repository": {"owner": "bgruening", "changeset_revision": "6e18e0b098cd", "name": "infernal", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "04a643c3-4ee6-4241-bba2-0df6535f32eb", "errors": null, "name": "cmsearch", "post_job_actions": {"HideDatasetActionoutfile": {"output_name": "outfile", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActionmultiple_alignment_output": {"output_name": "multiple_alignment_output", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "cm_opts", "description": "runtime parameter for tool cmsearch"}, {"name": "seqdb", "description": "runtime parameter for tool cmsearch"}], "position": {"top": 1031.90625, "left": 2732.375}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/infernal/infernal_cmsearch/1.1.2.0", "type": "tool"}, "10": {"tool_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_postprocessing/glob_report/0.3", "tool_version": "0.3", "outputs": [{"type": "input", "name": "clusters"}, {"type": "input", "name": "partitions"}, {"type": "input", "name": "topSecondaryStruct"}, {"type": "input", "name": "topDot"}, {"type": "input", "name": "rscapePlot"}, {"type": "txt", "name": "final_stats"}, {"type": "tabular", "name": "tableForEval"}, {"type": "txt", "name": "final_soft"}, {"type": "txt", "name": "final_used_cmsearch"}, {"type": "txt", "name": "evaluation"}, {"type": "txt", "name": "combined_cm_out"}, {"type": "zip", "name": "RESULTS_zip"}], "workflow_outputs": [{"output_name": "final_used_cmsearch", "uuid": 
"08b56468-85c9-4d1d-a044-9f6742e0c148", "label": null}, {"output_name": "topDot", "uuid": "14959ce4-9e2c-4174-8952-ed1dc12bd395", "label": null}, {"output_name": "combined_cm_out", "uuid": "a3ac919a-95ea-492b-b7cb-58c8ab4002b2", "label": null}, {"output_name": "final_stats", "uuid": "38bc3ed1-1dba-4665-b2bc-88698f5875aa", "label": null}, {"output_name": "rscapePlot", "uuid": "740cb63b-1077-42b6-a2da-f772b99e1a96", "label": null}, {"output_name": "topSecondaryStruct", "uuid": "e73c4935-1b80-4c65-aa9a-43e8a8159d23", "label": null}, {"output_name": "clusters", "uuid": "d2a96c0d-4702-40cd-9f89-79b95065dfe8", "label": null}, {"output_name": "RESULTS_zip", "uuid": "9d09a53f-abb6-42e5-af3e-169cd92d922a", "label": null}, {"output_name": "evaluation", "uuid": "51dc83c0-5c56-48e8-91ee-e4b3fac56602", "label": null}, {"output_name": "final_soft", "uuid": "0e0ea996-a1b6-4184-a922-bb4dd483f585", "label": null}], "input_connections": {"FASTA": {"output_name": "FASTA", "id": 1}, "model_tree_files": {"output_name": "model_tree_fa", "id": 5}, "cmsearch_results": {"output_name": "outfile", "id": 9}}, "tool_state": "{\"__page__\": 0, \"model_tree_files\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"cm_max_eval\": \"\\\"0.001\\\"\", \"min_cluster_size\": \"\\\"3\\\"\", \"advanced_opts\": \"{\\\"param_type\\\": {\\\"param_type_selector\\\": \\\"locarna\\\", \\\"__current_case__\\\": 1}, \\\"__current_case__\\\": 1, \\\"advanced_opts_selector\\\": \\\"show\\\"}\", \"merge_cluster_ol\": \"\\\"0.66\\\"\", \"iteration_num\": \"{\\\"iteration_num_selector\\\": \\\"false\\\", \\\"__current_case__\\\": 1}\", \"cm_min_bitscore\": \"\\\"20\\\"\", \"cut_type\": \"\\\"false\\\"\", \"results_top_num\": \"\\\"5\\\"\", \"merge_overlap\": \"\\\"0.51\\\"\", \"cm_bitscore_sig\": \"\\\"1\\\"\", \"__rerun_remap_job_id__\": null, \"FASTA\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"cmsearch_results\": \"{\\\"__class__\\\": \\\"RuntimeValue\\\"}\", \"cdhit\": \"{\\\"__class__\\\": 
\\\"RuntimeValue\\\"}\", \"partition_type\": \"\\\"true\\\"\"}", "id": 10, "tool_shed_repository": {"owner": "rnateam", "changeset_revision": "c7ca5d173482", "name": "graphclust_postprocessing", "tool_shed": "toolshed.g2.bx.psu.edu"}, "uuid": "2cf68268-36ab-4b5c-a0be-5bbe149970f4", "errors": null, "name": "cluster_collection_report", "post_job_actions": {"HideDatasetActionpartitions": {"output_name": "partitions", "action_type": "HideDatasetAction", "action_arguments": {}}, "HideDatasetActiontableForEval": {"output_name": "tableForEval", "action_type": "HideDatasetAction", "action_arguments": {}}}, "label": null, "inputs": [{"name": "model_tree_files", "description": "runtime parameter for tool cluster_collection_report"}, {"name": "FASTA", "description": "runtime parameter for tool cluster_collection_report"}, {"name": "cmsearch_results", "description": "runtime parameter for tool cluster_collection_report"}, {"name": "cdhit", "description": "runtime parameter for tool cluster_collection_report"}], "position": {"top": 952.90625, "left": 3131.875}, "annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/rnateam/graphclust_postprocessing/glob_report/0.3", "type": "tool"}}, "annotation": "", "a_galaxy_workflow": "true"} -------------------------------------------------------------------------------- /workflows/extra-workflows/RNAshapes/README.md: -------------------------------------------------------------------------------- 1 | These pre-configurations use Bielefeld's RNAshapes package for the RNA structure predictions. -------------------------------------------------------------------------------- /workflows/extra-workflows/SHAPE/README.md: -------------------------------------------------------------------------------- 1 | The SHAPE workflows allow providing the structure probing reactivity information alongside the fasta sequences. Please check the sample data for formatting. 
The SHAPE variant has settings similar to the "main" workflows and will eventually be merged into them as an optional feature. -------------------------------------------------------------------------------- /workflows/extra-workflows/with-subworkflow/README.md: -------------------------------------------------------------------------------- 1 | Up to 3 rounds of clustering, depending on the size of the input and of the classes, are usually enough to identify the homologs. For large datasets with thousands of sequences, further iterations of clustering can be helpful. For such cases, the *sub-workflow* feature recently introduced by the Galaxy team is handy. To simplify invoking further rounds, we have encapsulated the initial and the iterative rounds of GraphClust2 as *sub-workflows*. Each sub-workflow wraps the collection of individual tools for one round and exposes it as a single step. Connecting the initial fast clustering to any number of iterative sub-workflows yields the corresponding number of clustering iterations. --------------------------------------------------------------------------------
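Because the `.ga` workflow files are plain JSON, the wiring between rounds can be inspected without a running Galaxy instance. The following is a minimal sketch, not part of the repository: `EXAMPLE_GA` is a hypothetical, heavily trimmed stand-in for a real file, and `summarize_steps` and `decode_tool_state` are illustrative helper names. It assumes the usual layout with a top-level `steps` mapping, that encapsulated rounds appear as steps of type `subworkflow`, and that an input may be wired to a single source or to a list of sources.

```python
import json

# Hypothetical, heavily trimmed stand-in for a .ga file; real files carry many more keys.
EXAMPLE_GA = json.dumps({
    "a_galaxy_workflow": "true",
    "steps": {
        "0": {"id": 0, "name": "Input dataset", "type": "data_input",
              "input_connections": {}, "tool_state": "{}"},
        "1": {"id": 1, "name": "initial_round", "type": "subworkflow",
              "input_connections": {"FASTA": {"id": 0, "output_name": "output"}},
              "tool_state": "{}"},
        "2": {"id": 2, "name": "cmfinder", "type": "tool",
              "input_connections": {
                  "model_tree_stk": {"id": 1, "output_name": "model_tree_stk"}},
              "tool_state": "{\"__page__\": 0}"},
    },
})


def summarize_steps(ga_text):
    """Return (id, name, type, sorted upstream step ids) for every step, ordered by id."""
    workflow = json.loads(ga_text)
    summary = []
    for step in sorted(workflow["steps"].values(), key=lambda s: s["id"]):
        upstream = set()
        for connection in step.get("input_connections", {}).values():
            # Assumption: one input is wired to a single source (dict) or several (list).
            sources = connection if isinstance(connection, list) else [connection]
            upstream.update(source["id"] for source in sources)
        summary.append((step["id"], step["name"], step["type"], sorted(upstream)))
    return summary


def decode_tool_state(step):
    """tool_state is itself a JSON-encoded string, so it needs a second json.loads."""
    return json.loads(step["tool_state"])


if __name__ == "__main__":
    for row in summarize_steps(EXAMPLE_GA):
        print(row)
```

Filtering the summary for rows whose type is `subworkflow` gives a quick count of how many clustering rounds a composed workflow chains together.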