├── .gitattributes
├── .gitignore
├── LICENSE.txt
├── README.md
├── codemlScript.py
└── codemlScript_Batch.sh


/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto

# Custom for Visual Studio
*.cs diff=csharp
*.sln merge=union
*.csproj merge=union
*.vbproj merge=union
*.fsproj merge=union
*.dbproj merge=union

# Standard to msysgit
*.doc diff=astextplain
*.DOC diff=astextplain
*.docx diff=astextplain
*.DOCX diff=astextplain
*.dot diff=astextplain
*.DOT diff=astextplain
*.pdf diff=astextplain
*.PDF diff=astextplain
*.rtf diff=astextplain
*.RTF diff=astextplain

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Windows image file caches
Thumbs.db
ehthumbs.db

# Folder config file
Desktop.ini

# Recycle Bin used on file shares
$RECYCLE.BIN/

# Windows Installer files
*.cab
*.msi
*.msm
*.msp

# =========================
# Operating System Files
# =========================

# OSX
# =========================

.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r.
Icon

# Thumbnails
._*

# Files that might appear on external disk
.Spotlight-V100
.Trashes

--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2015 Nathan V. Whelan

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
automate-PAML-codeml
=====================

This code automates the codeml program from the PAML package (Yang, 2007). It requires Biopython.

It works well when you have many (tens to thousands of) genes/alignments that you want to test for positive selection or fit
to any codeml model.

--------------------------------------------------------------------------------
/codemlScript.py:
--------------------------------------------------------------------------------
#!/usr/bin/python

#########################################################################################################################
#This script was written by Nathan Whelan.
# THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE CONTRIBUTORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
# OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE
# SOFTWARE.
##########################################################################################################################

##This script utilizes codeml in PAML by Ziheng Yang. If you use this script, please also cite PAML.

##BIOPYTHON IS REQUIRED FOR THIS SCRIPT!


##This script can be used to automate running codeml from the PAML package (e.g. if you have hundreds of genes you want to fit to a model).
#A codeml ctl file named codeml.ctl should be in your working directory with the parameters you wish to use.

##This program is typically called from the Bash script provided with this Python script. However, it can also be run independently.
##To run script: python codemlScript.py <alignment> <tree> <output>   The input files must be typed in this order.

from __future__ import division
from Bio.Phylo.PAML import codeml  ##Biopython PAML
import sys

#This program takes three inputs: the alignment in phylip format, a tree file in Newick format, and the output file name
#cml.run() returns a nested dictionary with various results depending on analysis
if len(sys.argv) != 4:
    #sys.exit with a string prints the message to stderr and exits with status 1
    sys.exit("Error. There should be three inputs: an alignment in phylip format, a tree file, and the name for the output file")

cml = codeml.Codeml()
cml.read_ctl_file("codeml.ctl")  ##CTL File. See PAML manual for format.
cml.alignment = sys.argv[1]
cml.tree = sys.argv[2]
cml.out_file = sys.argv[3]

name = sys.argv[1]
print(name)
name2 = name[0:5] + "_omega.out"  #name2 is only printed; results are written to cml.out_file
print(name2)
results1 = cml.run(verbose=True)

--------------------------------------------------------------------------------
/codemlScript_Batch.sh:
--------------------------------------------------------------------------------
#!/bin/bash

#########################################################################################################################
#This script was written by Nathan Whelan.

# THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE CONTRIBUTORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
# OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE
# SOFTWARE.
##########################################################################################################################


##This shell script uses a simple for loop to automate codeml in combination with the script codemlScript.py. It was put together
##because there wasn't a very good way to call PAML in batch mode. This script could be useful if you have many gene alignments,
##a tree for your taxa, and the desire to fit your alignments and tree to a model in PAML. It could be used to test for positive
##selection for any given gene. NOTE: The tree must have the same tip names as taxa in the alignment.

for FILE in *.phy.aln
do
    ##Strip the alignment suffix to get the gene name
    NAME="${FILE%_DNA.phy.aln}"

    ##This needs to be modified given your naming schemes.
    python codemlScript.py "${NAME}_DNA.phy.aln" "${NAME}_dropped.tre" "${NAME}_PAML_NeutralMethod1.out"
done

--------------------------------------------------------------------------------
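
The file-naming convention that the batch loop relies on can also be sketched in Python, which makes it easier to check which tree and output names a given alignment maps to before launching codeml. This is only a minimal sketch, not part of the repository: the helper names `build_job` and `build_command` are hypothetical, and the three suffixes are simply copied from `codemlScript_Batch.sh` on the assumption that your files follow the same scheme.

```python
import glob

# Suffixes copied from codemlScript_Batch.sh; adjust them to your own naming scheme.
ALN_SUFFIX = "_DNA.phy.aln"
TREE_SUFFIX = "_dropped.tre"
OUT_SUFFIX = "_PAML_NeutralMethod1.out"


def build_job(alignment):
    """Return (alignment, tree, output) file names for one alignment.

    Hypothetical helper: it mirrors the suffix stripping done in the shell loop.
    """
    if not alignment.endswith(ALN_SUFFIX):
        raise ValueError("unexpected alignment name: %s" % alignment)
    stem = alignment[: -len(ALN_SUFFIX)]  # gene name without the alignment suffix
    return alignment, stem + TREE_SUFFIX, stem + OUT_SUFFIX


def build_command(alignment, script="codemlScript.py"):
    """Return the argument list the batch loop would run for this alignment."""
    return ["python", script] + list(build_job(alignment))


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for aln in sorted(glob.glob("*" + ALN_SUFFIX)):
        print(" ".join(build_command(aln)))
```

Running the sketch in a directory of alignments prints one command per gene without executing anything, so a mismatch between alignment and tree names shows up before any codeml run starts.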