├── LICENSE ├── README.md └── phylobench ├── DESCRIPTION ├── NAMESPACE ├── R ├── benchmarks_EP.R └── phylobench.R ├── inst └── extdata │ ├── input │ ├── FASTA │ │ └── seq1_DNA.fas │ ├── Newick │ │ ├── three_unrooted_trees_4tips.tre │ │ ├── tree1_Newick.tre │ │ └── tree_primates.tre │ └── Table │ │ ├── M_SaitouNei.txt │ │ └── data_primates.txt │ └── output │ ├── BF1.txt │ ├── PIC_primates.txt │ ├── bt1.txt │ └── tree_NJ_SaitouNei.tre ├── man ├── BF.Rd ├── BTIMES.Rd ├── MANTEL.Rd ├── NJ_SaitouNei.Rd ├── PIC.Rd ├── RCOAL.Rd ├── REORDERPHYLO.Rd ├── SPLITS.Rd ├── TOPODIST.Rd ├── ULTRAMETRIC.Rd ├── VCVBM.Rd ├── YULE.Rd ├── phylobench-package.Rd └── runTests.Rd └── vignettes └── PhylogeneticBenchmarks.Rnw /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 2, June 1991 3 | 4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc., 5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 6 | Everyone is permitted to copy and distribute verbatim copies 7 | of this license document, but changing it is not allowed. 8 | 9 | Preamble 10 | 11 | The licenses for most software are designed to take away your 12 | freedom to share and change it. By contrast, the GNU General Public 13 | License is intended to guarantee your freedom to share and change free 14 | software--to make sure the software is free for all its users. This 15 | General Public License applies to most of the Free Software 16 | Foundation's software and to any other program whose authors commit to 17 | using it. (Some other Free Software Foundation software is covered by 18 | the GNU Lesser General Public License instead.) You can apply it to 19 | your programs, too. 20 | 21 | When we speak of free software, we are referring to freedom, not 22 | price. 
Our General Public Licenses are designed to make sure that you 23 | have the freedom to distribute copies of free software (and charge for 24 | this service if you wish), that you receive source code or can get it 25 | if you want it, that you can change the software or use pieces of it 26 | in new free programs; and that you know you can do these things. 27 | 28 | To protect your rights, we need to make restrictions that forbid 29 | anyone to deny you these rights or to ask you to surrender the rights. 30 | These restrictions translate to certain responsibilities for you if you 31 | distribute copies of the software, or if you modify it. 32 | 33 | For example, if you distribute copies of such a program, whether 34 | gratis or for a fee, you must give the recipients all the rights that 35 | you have. You must make sure that they, too, receive or can get the 36 | source code. And you must show them these terms so they know their 37 | rights. 38 | 39 | We protect your rights with two steps: (1) copyright the software, and 40 | (2) offer you this license which gives you legal permission to copy, 41 | distribute and/or modify the software. 42 | 43 | Also, for each author's protection and ours, we want to make certain 44 | that everyone understands that there is no warranty for this free 45 | software. If the software is modified by someone else and passed on, we 46 | want its recipients to know that what they have is not the original, so 47 | that any problems introduced by others will not reflect on the original 48 | authors' reputations. 49 | 50 | Finally, any free program is threatened constantly by software 51 | patents. We wish to avoid the danger that redistributors of a free 52 | program will individually obtain patent licenses, in effect making the 53 | program proprietary. To prevent this, we have made it clear that any 54 | patent must be licensed for everyone's free use or not licensed at all. 
55 | 56 | The precise terms and conditions for copying, distribution and 57 | modification follow. 58 | 59 | GNU GENERAL PUBLIC LICENSE 60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 61 | 62 | 0. This License applies to any program or other work which contains 63 | a notice placed by the copyright holder saying it may be distributed 64 | under the terms of this General Public License. The "Program", below, 65 | refers to any such program or work, and a "work based on the Program" 66 | means either the Program or any derivative work under copyright law: 67 | that is to say, a work containing the Program or a portion of it, 68 | either verbatim or with modifications and/or translated into another 69 | language. (Hereinafter, translation is included without limitation in 70 | the term "modification".) Each licensee is addressed as "you". 71 | 72 | Activities other than copying, distribution and modification are not 73 | covered by this License; they are outside its scope. The act of 74 | running the Program is not restricted, and the output from the Program 75 | is covered only if its contents constitute a work based on the 76 | Program (independent of having been made by running the Program). 77 | Whether that is true depends on what the Program does. 78 | 79 | 1. You may copy and distribute verbatim copies of the Program's 80 | source code as you receive it, in any medium, provided that you 81 | conspicuously and appropriately publish on each copy an appropriate 82 | copyright notice and disclaimer of warranty; keep intact all the 83 | notices that refer to this License and to the absence of any warranty; 84 | and give any other recipients of the Program a copy of this License 85 | along with the Program. 86 | 87 | You may charge a fee for the physical act of transferring a copy, and 88 | you may at your option offer warranty protection in exchange for a fee. 89 | 90 | 2. 
You may modify your copy or copies of the Program or any portion 91 | of it, thus forming a work based on the Program, and copy and 92 | distribute such modifications or work under the terms of Section 1 93 | above, provided that you also meet all of these conditions: 94 | 95 | a) You must cause the modified files to carry prominent notices 96 | stating that you changed the files and the date of any change. 97 | 98 | b) You must cause any work that you distribute or publish, that in 99 | whole or in part contains or is derived from the Program or any 100 | part thereof, to be licensed as a whole at no charge to all third 101 | parties under the terms of this License. 102 | 103 | c) If the modified program normally reads commands interactively 104 | when run, you must cause it, when started running for such 105 | interactive use in the most ordinary way, to print or display an 106 | announcement including an appropriate copyright notice and a 107 | notice that there is no warranty (or else, saying that you provide 108 | a warranty) and that users may redistribute the program under 109 | these conditions, and telling the user how to view a copy of this 110 | License. (Exception: if the Program itself is interactive but 111 | does not normally print such an announcement, your work based on 112 | the Program is not required to print an announcement.) 113 | 114 | These requirements apply to the modified work as a whole. If 115 | identifiable sections of that work are not derived from the Program, 116 | and can be reasonably considered independent and separate works in 117 | themselves, then this License, and its terms, do not apply to those 118 | sections when you distribute them as separate works. 
But when you 119 | distribute the same sections as part of a whole which is a work based 120 | on the Program, the distribution of the whole must be on the terms of 121 | this License, whose permissions for other licensees extend to the 122 | entire whole, and thus to each and every part regardless of who wrote it. 123 | 124 | Thus, it is not the intent of this section to claim rights or contest 125 | your rights to work written entirely by you; rather, the intent is to 126 | exercise the right to control the distribution of derivative or 127 | collective works based on the Program. 128 | 129 | In addition, mere aggregation of another work not based on the Program 130 | with the Program (or with a work based on the Program) on a volume of 131 | a storage or distribution medium does not bring the other work under 132 | the scope of this License. 133 | 134 | 3. You may copy and distribute the Program (or a work based on it, 135 | under Section 2) in object code or executable form under the terms of 136 | Sections 1 and 2 above provided that you also do one of the following: 137 | 138 | a) Accompany it with the complete corresponding machine-readable 139 | source code, which must be distributed under the terms of Sections 140 | 1 and 2 above on a medium customarily used for software interchange; or, 141 | 142 | b) Accompany it with a written offer, valid for at least three 143 | years, to give any third party, for a charge no more than your 144 | cost of physically performing source distribution, a complete 145 | machine-readable copy of the corresponding source code, to be 146 | distributed under the terms of Sections 1 and 2 above on a medium 147 | customarily used for software interchange; or, 148 | 149 | c) Accompany it with the information you received as to the offer 150 | to distribute corresponding source code. 
(This alternative is 151 | allowed only for noncommercial distribution and only if you 152 | received the program in object code or executable form with such 153 | an offer, in accord with Subsection b above.) 154 | 155 | The source code for a work means the preferred form of the work for 156 | making modifications to it. For an executable work, complete source 157 | code means all the source code for all modules it contains, plus any 158 | associated interface definition files, plus the scripts used to 159 | control compilation and installation of the executable. However, as a 160 | special exception, the source code distributed need not include 161 | anything that is normally distributed (in either source or binary 162 | form) with the major components (compiler, kernel, and so on) of the 163 | operating system on which the executable runs, unless that component 164 | itself accompanies the executable. 165 | 166 | If distribution of executable or object code is made by offering 167 | access to copy from a designated place, then offering equivalent 168 | access to copy the source code from the same place counts as 169 | distribution of the source code, even though third parties are not 170 | compelled to copy the source along with the object code. 171 | 172 | 4. You may not copy, modify, sublicense, or distribute the Program 173 | except as expressly provided under this License. Any attempt 174 | otherwise to copy, modify, sublicense or distribute the Program is 175 | void, and will automatically terminate your rights under this License. 176 | However, parties who have received copies, or rights, from you under 177 | this License will not have their licenses terminated so long as such 178 | parties remain in full compliance. 179 | 180 | 5. You are not required to accept this License, since you have not 181 | signed it. However, nothing else grants you permission to modify or 182 | distribute the Program or its derivative works. 
These actions are 183 | prohibited by law if you do not accept this License. Therefore, by 184 | modifying or distributing the Program (or any work based on the 185 | Program), you indicate your acceptance of this License to do so, and 186 | all its terms and conditions for copying, distributing or modifying 187 | the Program or works based on it. 188 | 189 | 6. Each time you redistribute the Program (or any work based on the 190 | Program), the recipient automatically receives a license from the 191 | original licensor to copy, distribute or modify the Program subject to 192 | these terms and conditions. You may not impose any further 193 | restrictions on the recipients' exercise of the rights granted herein. 194 | You are not responsible for enforcing compliance by third parties to 195 | this License. 196 | 197 | 7. If, as a consequence of a court judgment or allegation of patent 198 | infringement or for any other reason (not limited to patent issues), 199 | conditions are imposed on you (whether by court order, agreement or 200 | otherwise) that contradict the conditions of this License, they do not 201 | excuse you from the conditions of this License. If you cannot 202 | distribute so as to satisfy simultaneously your obligations under this 203 | License and any other pertinent obligations, then as a consequence you 204 | may not distribute the Program at all. For example, if a patent 205 | license would not permit royalty-free redistribution of the Program by 206 | all those who receive copies directly or indirectly through you, then 207 | the only way you could satisfy both it and this License would be to 208 | refrain entirely from distribution of the Program. 209 | 210 | If any portion of this section is held invalid or unenforceable under 211 | any particular circumstance, the balance of the section is intended to 212 | apply and the section as a whole is intended to apply in other 213 | circumstances. 
214 | 215 | It is not the purpose of this section to induce you to infringe any 216 | patents or other property right claims or to contest validity of any 217 | such claims; this section has the sole purpose of protecting the 218 | integrity of the free software distribution system, which is 219 | implemented by public license practices. Many people have made 220 | generous contributions to the wide range of software distributed 221 | through that system in reliance on consistent application of that 222 | system; it is up to the author/donor to decide if he or she is willing 223 | to distribute software through any other system and a licensee cannot 224 | impose that choice. 225 | 226 | This section is intended to make thoroughly clear what is believed to 227 | be a consequence of the rest of this License. 228 | 229 | 8. If the distribution and/or use of the Program is restricted in 230 | certain countries either by patents or by copyrighted interfaces, the 231 | original copyright holder who places the Program under this License 232 | may add an explicit geographical distribution limitation excluding 233 | those countries, so that distribution is permitted only in or among 234 | countries not thus excluded. In such case, this License incorporates 235 | the limitation as if written in the body of this License. 236 | 237 | 9. The Free Software Foundation may publish revised and/or new versions 238 | of the General Public License from time to time. Such new versions will 239 | be similar in spirit to the present version, but may differ in detail to 240 | address new problems or concerns. 241 | 242 | Each version is given a distinguishing version number. If the Program 243 | specifies a version number of this License which applies to it and "any 244 | later version", you have the option of following the terms and conditions 245 | either of that version or of any later version published by the Free 246 | Software Foundation. 
If the Program does not specify a version number of 247 | this License, you may choose any version ever published by the Free Software 248 | Foundation. 249 | 250 | 10. If you wish to incorporate parts of the Program into other free 251 | programs whose distribution conditions are different, write to the author 252 | to ask for permission. For software which is copyrighted by the Free 253 | Software Foundation, write to the Free Software Foundation; we sometimes 254 | make exceptions for this. Our decision will be guided by the two goals 255 | of preserving the free status of all derivatives of our free software and 256 | of promoting the sharing and reuse of software generally. 257 | 258 | NO WARRANTY 259 | 260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN 262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED 264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS 266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE 267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, 268 | REPAIR OR CORRECTION. 269 | 270 | 12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR 272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING 274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED 275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY 276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 278 | POSSIBILITY OF SUCH DAMAGES. 279 | 280 | END OF TERMS AND CONDITIONS 281 | 282 | How to Apply These Terms to Your New Programs 283 | 284 | If you develop a new program, and you want it to be of the greatest 285 | possible use to the public, the best way to achieve this is to make it 286 | free software which everyone can redistribute and change under these terms. 287 | 288 | To do so, attach the following notices to the program. It is safest 289 | to attach them to the start of each source file to most effectively 290 | convey the exclusion of warranty; and each file should have at least 291 | the "copyright" line and a pointer to where the full notice is found. 292 | 293 | {description} 294 | Copyright (C) {year} {fullname} 295 | 296 | This program is free software; you can redistribute it and/or modify 297 | it under the terms of the GNU General Public License as published by 298 | the Free Software Foundation; either version 2 of the License, or 299 | (at your option) any later version. 300 | 301 | This program is distributed in the hope that it will be useful, 302 | but WITHOUT ANY WARRANTY; without even the implied warranty of 303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 304 | GNU General Public License for more details. 
305 | 306 | You should have received a copy of the GNU General Public License along 307 | with this program; if not, write to the Free Software Foundation, Inc., 308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 309 | 310 | Also add information on how to contact you by electronic and paper mail. 311 | 312 | If the program is interactive, make it output a short notice like this 313 | when it starts in an interactive mode: 314 | 315 | Gnomovision version 69, Copyright (C) year name of author 316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 317 | This is free software, and you are welcome to redistribute it 318 | under certain conditions; type `show c' for details. 319 | 320 | The hypothetical commands `show w' and `show c' should show the appropriate 321 | parts of the General Public License. Of course, the commands you use may 322 | be called something other than `show w' and `show c'; they could even be 323 | mouse-clicks or menu items--whatever suits your program. 324 | 325 | You should also get your employer (if you work as a programmer) or your 326 | school, if any, to sign a "copyright disclaimer" for the program, if 327 | necessary. Here is a sample; alter the names: 328 | 329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program 330 | `Gnomovision' (which makes passes at compilers) written by James Hacker. 331 | 332 | {signature of Ty Coon}, 1 April 1989 333 | Ty Coon, President of Vice 334 | 335 | This General Public License does not permit incorporating your program into 336 | proprietary programs. If your program is a subroutine library, you may 337 | consider it more useful to permit linking proprietary applications with the 338 | library. If this is what you want to do, use the GNU Lesser General 339 | Public License instead of this License. 
340 | 341 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## phylobench: Phylogenetic Benchmarking 2 | 3 | phylobench implements tests of phylogenetic analyses and compares their outputs with the expected results. 4 | 5 | Currently, the implemented benchmarks are: 6 | 7 | - Branching times calculation 8 | - Base frequencies from DNA sequences 9 | - Phylogenetically independent contrasts 10 | - Variance-covariance under Brownian motion 11 | - Neighbor-joining 12 | - Random coalescent trees 13 | - Random Yule trees 14 | - Type I error rate of the Mantel test 15 | - Ultrametric trees 16 | - Topological distances 17 | - Splits from unrooted trees 18 | - Reordering of the edge matrix 19 | 20 | New benchmarks are easily added by: 21 | 22 | - Writing a function, say `FUN`, which runs the benchmark and returns `"OK"` if the results are as expected. This function must either take no arguments or have default values for all of them, so that it can be called as `FUN()`. 23 | - Modifying the list `.list_of_tests` (in the file phylobench/R/phylobench.R) by adding the new benchmark like this: 24 | ```r 25 | .list_of_tests <- list(...., "Title of the benchmark" = "FUN") 26 | ``` 27 | - Optionally, creating a file phylobench/man/FUN.Rd that describes the new benchmark. 28 | 29 | If the benchmark requires files, these must be placed in phylobench/inst/extdata/.
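
As a rough sketch of the first step, a new benchmark could look like the function below. The name `COPHENETIC`, the three-tip tree, and the expected matrix are hypothetical and chosen only for illustration; they are not part of phylobench. `read.tree()` and `cophenetic()` come from ape.

```r
library(ape)

## Hypothetical benchmark (not part of phylobench): check the pairwise
## tip-to-tip distances computed by ape on a tiny tree.
COPHENETIC <- function()
{
    tips <- c("A", "B", "C")
    tr <- read.tree(text = "((A:1,B:1):1,C:2);")
    ## reorder rows and columns explicitly so the comparison
    ## does not depend on the order of the tip labels
    D <- cophenetic(tr)[tips, tips]
    ## distances worked out by hand from the branch lengths above
    expected <- matrix(c(0, 2, 4,
                         2, 0, 4,
                         4, 4, 0), 3, 3, dimnames = list(tips, tips))
    if (isTRUE(all.equal(D, expected))) return("OK")
    "cophenetic distances differ from the expected values"
}

COPHENETIC()
```

Registering this sketch would then amount to adding an entry such as `"Cophenetic distances" = "COPHENETIC"` to `.list_of_tests`, following the second step above. Returning a short message on failure, rather than `NULL`, makes the problem easier to read in the list returned by `runTests()`.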
30 | 31 | All the benchmarks are run when building the vignette, so they can be visualized once the package is installed with: 32 | 33 | ```r 34 | vignette("PhylogeneticBenchmarks") 35 | ``` 36 | -------------------------------------------------------------------------------- /phylobench/DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: phylobench 2 | Version: 0.2-1 3 | Date: 2023-02-02 4 | Title: Phylogenetic Benchmarks 5 | Authors@R: c(person("Emmanuel", "Paradis", role = c("aut", "cre", "cph"), email = "Emmanuel.Paradis@ird.fr", comment = c(ORCID = "0000-0003-3092-2199"))) 6 | Depends: R (>= 3.2.0), ape 7 | Description: Benchmarks for phylogenetic and comparative methods. The outputs from functions are compared to the expected results stored in files or calculated from theoretical expectations. 8 | License: GPL (>= 2) 9 | URL: https://github.com/emmanuelparadis/phylobench 10 | -------------------------------------------------------------------------------- /phylobench/NAMESPACE: -------------------------------------------------------------------------------- 1 | export(codeTests, fileTests, listTests, runTests) 2 | 3 | import(ape) 4 | 5 | importFrom(stats, cor, qnorm, reorder, runif) 6 | 7 | importFrom(utils, read.table) 8 | -------------------------------------------------------------------------------- /phylobench/R/benchmarks_EP.R: -------------------------------------------------------------------------------- 1 | ## benchmarks_EP.R (2023-02-02) 2 | 3 | ## Phylogenetic Benchmarks 4 | 5 | ## Copyright 2019-2023 Emmanuel Paradis 6 | 7 | ## This file is part of the R-package `phylobench'. 8 | ## See the file ../COPYING for licensing issues. 
9 | 10 | ## read a tree, calculate its branching times, and compare them to values in a file 11 | BTIMES <- function() 12 | { 13 | nwk1 <- system.file("extdata/input/Newick/tree1_Newick.tre", 14 | package = "phylobench") 15 | tr1 <- read.tree(nwk1) 16 | bt1 <- branching.times(tr1) 17 | bt1.0 <- scan(system.file("extdata/output/bt1.txt", package = "phylobench"), 18 | sep = "\n", quiet = TRUE) 19 | if (all(abs(bt1 - bt1.0) < eps)) "OK" 20 | else "problem in branching times calculation" 21 | } 22 | 23 | ## base frequencies 24 | BF <- function() 25 | { 26 | fas1 <- system.file("extdata/input/FASTA/seq1_DNA.fas", package = "phylobench") 27 | dna1 <- read.dna(fas1, format = "f") 28 | BF1 <- base.freq(dna1, freq = TRUE, all = TRUE) 29 | out1 <- system.file("extdata/output/BF1.txt", package = "phylobench") 30 | BF1.0 <- read.table(out1, header = TRUE) 31 | if (all(BF1[c("a", "c", "g", "t", "n")] == BF1.0)) "OK" else "problem when calculating base frequencies" 32 | } 33 | 34 | ## Phylogenetically independent contrasts 35 | PIC <- function() 36 | { 37 | treefile <- system.file("extdata/input/Newick/tree_primates.tre", package = "phylobench") 38 | datfile <- system.file("extdata/input/Table/data_primates.txt", package = "phylobench") 39 | tree.primates <- read.tree(treefile) 40 | DATA <- read.table(datfile, header = TRUE) 41 | pic.body <- pic(DATA$body, tree.primates) 42 | pic.brain <- pic(DATA$brain, tree.primates) 43 | outfile <- system.file("extdata/output/PIC_primates.txt", package = "phylobench") 44 | PIC.0 <- read.table(outfile, header = TRUE) 45 | ## only 6 digits in PHYLIP's output 46 | test1 <- all(abs(sort(pic.body) - sort(PIC.0$body)) < 1e-5) 47 | test2 <- all(abs(sort(pic.brain) - sort(PIC.0$brain)) < 1e-5) 48 | if (test1 && test2) return("OK") 49 | return("disagreement between the values of PICs") 50 | } 51 | 52 | ## Variance-covariance under Brownian motion 53 | VCVBM <- function() 54 | { 55 | tr <- compute.brtime(stree(5, "l"), 4:1) 56 | vcvape <- vcv(tr) 57 | expected.vcv
<- diag(4, 5, 5) 58 | expected.vcv[lower.tri(expected.vcv)] <- offdiag <- rep(0:3, 4:1) 59 | expected.vcv <- t(expected.vcv) 60 | expected.vcv[lower.tri(expected.vcv)] <- offdiag 61 | if (all(expected.vcv == vcvape)) "OK" else "disagreement between the expected and computed variance-covariance matrices" 62 | } 63 | 64 | ## Neighbor-joining 65 | NJ_SaitouNei <- function() 66 | { 67 | matfile <- system.file("extdata/input/Table/M_SaitouNei.txt", package = "phylobench") 68 | M <- as.matrix(read.table(matfile)) 69 | tr.nj <- nj(M) 70 | outfile <- system.file("extdata/output/tree_NJ_SaitouNei.tre", package = "phylobench") 71 | tr.ref <- read.tree(outfile) 72 | test <- all.equal(tr.nj, tr.ref) 73 | if (isTRUE(test)) return("OK") 74 | return("disagreement between the reconstructed and reference NJ trees") 75 | } 76 | 77 | ## Random coalescent trees 78 | RCOAL <- function() 79 | { 80 | BOUND <- qnorm(0.995) 81 | N <- 100 82 | tree.sizes <- c(5, 10, 20, 50, 75, 100) 83 | res <- numeric() 84 | for (i in 1:200) { 85 | for (n in tree.sizes) { 86 | k <- 2:n 87 | expected.mean <- 2 * sum(1/(k * (k - 1))) 88 | expected.var <- 4 * sum(1/(k * (k - 1))^2) 89 | x <- replicate(N, branching.times(rcoal(n))[1]) 90 | res <- c(res, (mean(x) - expected.mean) * sqrt(N/expected.var)) 91 | } 92 | if (anyNA(res)) 93 | return(paste("some missing values returned after", 94 | length(res), "simulations")) 95 | tab <- tabulate((abs(res) > BOUND) + 1L, 2L)[2] 96 | if (tab < length(res)/100) return("OK") 97 | } 98 | paste("found", tab, "out-of-range replications out of", 99 | length(res), "(1% expected)") 100 | } 101 | 102 | ## Random Yule trees 103 | YULE <- function(N = 1000, lambda = 0.05, Tmax = 50, threshold = c(0.8, 1.2)) 104 | { 105 | x <- replicate(floor(N/2), balance(rlineage(lambda, 0, Tmax))[1, ]) 106 | dim(x) <- NULL 107 | mx <- max(x) 108 | O <- tabulate(x, mx) 109 | P <- length(x) * dyule(1:mx, lambda, Tmax) 110 | r <- cor(P, O) 111 | if (r < threshold[1] || r > threshold[2]) 112 | return(paste("observed and predicted numbers of species seem too different: cor =", 
round(r, 3))) 113 | "OK" 114 | } 115 | 116 | ## Type I error rate of the Mantel test 117 | MANTEL <- function(N = 100, n = 10) 118 | { 119 | rmat <- function(n) { 120 | x <- runif(n * (n - 1) / 2) 121 | m <- matrix(0, n, n) 122 | m[lower.tri(m)] <- x 123 | m <- t(m) 124 | m[lower.tri(m)] <- x 125 | m 126 | } 127 | res <- numeric() 128 | for (i in 1:200) { 129 | res <- c(res, replicate(N, { 130 | ma <- rmat(n) 131 | mb <- rmat(n) 132 | mantel.test(ma, mb)$p 133 | })) 134 | if (anyNA(res)) 135 | return(paste("some missing values returned after", 136 | length(res), "replications")) 137 | nsig <- sum(res < 0.05) 138 | if (nsig / length(res) <= 0.05) return("OK") 139 | } 140 | paste("number of significant tests seems too high after", 141 | length(res), "replications") 142 | } 143 | 144 | ## Ultrametric trees 145 | ULTRAMETRIC <- function(N = 100, n = c(5, 10, 20, 50, 100)) 146 | { 147 | res <- logical() 148 | for (k in n) { 149 | res <- c(res, !replicate(N, is.ultrametric(rcoal(k)))) 150 | res <- c(res, replicate(N, is.ultrametric(rtree(k)))) 151 | } 152 | if (any(res)) 153 | return(paste(sum(res), "test(s) incorrect out of", length(res))) 154 | "OK" 155 | } 156 | 157 | ## Topological distances: 158 | TOPODIST <- function() 159 | { 160 | fl <- system.file("extdata/input/Newick/three_unrooted_trees_4tips.tre", 161 | package = "phylobench") 162 | TR <- read.tree(fl) 163 | D <- dist.topo(TR) 164 | if (length(D) == 3 && all(D == 2)) return("OK") 165 | "not all distances equal to 2" 166 | } 167 | 168 | ## Splits from unrooted trees: 169 | SPLITS <- function() 170 | { 171 | fl <- system.file("extdata/input/Newick/three_unrooted_trees_4tips.tre", 172 | package = "phylobench") 173 | TR <- read.tree(fl) 174 | a <- summary(prop.part(TR))[-1] 175 | b <- bitsplits(TR)$freq 176 | if (length(a) == 3 && all(a == 1) && length(b) == 3 && all(b == 1)) 177 | return("OK") 178 | "did not return three splits with relative frequencies 1/3" 179 | } 180 | 181 | ## Test reordering of edge matrix: 
182 | REORDERPHYLO <- function(Nmin = 3, Nmax = 1000, ProbRooted = 0.5, 183 | ProbMultichotomy = 0.5, nrep = 1e4) 184 | { 185 | Ntip <- Nnode <- integer(nrep) 186 | Test1 <- Test2 <- logical(nrep) 187 | 188 | pm <- runif(nrep) < ProbMultichotomy 189 | N <- ceiling(runif(nrep, Nmin, Nmax)) 190 | Rooted <- runif(nrep) < ProbRooted 191 | 192 | for (i in 1:nrep) { 193 | n <- N[i] 194 | rooted <- Rooted[i] 195 | tr <- rtree(n, rooted) 196 | if (pm[i]) { 197 | if (n == 3 && !rooted) break 198 | INTS <- which(tr$edge[, 2L] > n) 199 | m <- length(INTS) 200 | if (!m) break 201 | k <- sample(INTS, ceiling(ProbMultichotomy * m)) 202 | tr$edge.length[k] <- 0 203 | tr <- di2multi(tr) 204 | } 205 | Ntip[i] <- Ntip(tr) 206 | Nnode[i] <- Nnode(tr) 207 | Test1[i] <- identical(reorder(reorder(tr, "pr"))$edge, tr$edge) 208 | Test2[i] <- identical(reorder(reorder(tr, "po"))$edge, tr$edge) 209 | } 210 | 211 | res <- data.frame(Ntip = Ntip, Nnode = Nnode, Rooted = Rooted, 212 | Test1 = Test1, Test2 = Test2) 213 | 214 | if (all(res$Test1) && all(res$Test2)) "OK" else res 215 | } 216 | -------------------------------------------------------------------------------- /phylobench/R/phylobench.R: -------------------------------------------------------------------------------- 1 | ## phylobench.R (2021-04-20) 2 | 3 | ## Phylogenetic Benchmarks 4 | 5 | ## Copyright 2019-2021 Emmanuel Paradis 6 | 7 | ## This file is part of the R-package `phylobench'. 8 | ## See the file ../COPYING for licensing issues. 
9 | 10 | .list_of_tests <- list("Branching times calculation" = "BTIMES", 11 | "Base frequencies from DNA sequences" = "BF", 12 | "Phylogenetically independent contrasts" = "PIC", 13 | "Variance-covariance under Brownian motion" = "VCVBM", 14 | "Neighbor-joining" = "NJ_SaitouNei", 15 | "Random coalescent trees" = "RCOAL", 16 | "Random Yule trees" = "YULE", 17 | "Type I error rate of the Mantel test" = "MANTEL", 18 | "Ultrametric trees" = "ULTRAMETRIC", 19 | "Topological distances" = "TOPODIST", 20 | "Splits from unrooted trees" = "SPLITS", 21 | "Test reordering of edge matrix" = "REORDERPHYLO") 22 | 23 | eps <- .Machine$double.eps 24 | 25 | runTests <- function(verbose = TRUE) 26 | { 27 | FUN <- .list_of_tests 28 | tl <- names(FUN) 29 | ntests <- length(tl) 30 | res <- vector("list", ntests) 31 | if (verbose) 32 | cat(" Phylogenetic benchmarking: starting", ntests, "tests...\n\n") 33 | for (i in 1:ntests) { 34 | if (verbose) cat("Running test no.", i, ": ", tl[i], "...", sep = "") 35 | out <- try(eval(parse(text = paste0(FUN[[i]], "()")))) 36 | if (verbose) 37 | cat(if (identical(out, "OK")) " OK.\n" else " problem!\n") 38 | res[[i]] <- out 39 | } 40 | if (verbose) { 41 | allok <- sapply(res, identical, y = "OK") 42 | Nnotok <- sum(!allok) 43 | if (!Nnotok) cat("\nAll tests were OK.\n") 44 | else cat("\n", Nnotok, "tests out of", ntests, "were not OK. 
See details in the returned list.\n") 45 | } 46 | names(res) <- tl 47 | res 48 | } 49 | 50 | listTests <- function() 51 | { 52 | FUN <- .list_of_tests 53 | DF <- data.frame(Function = unlist(FUN)) 54 | row.names(DF) <- paste(1:nrow(DF), names(FUN), sep = ": ") 55 | DF 56 | } 57 | 58 | fileTests <- function(which) 59 | { 60 | if (missing(which)) 61 | stop("give the number of the test to show its files (see listTests())") 62 | FUN <- .list_of_tests[which] 63 | tl <- names(FUN) 64 | code <- deparse(get(FUN[[1]])) 65 | files <- grep("system\\.file", code, value = TRUE) 66 | prefix <- paste0(system.file(package = "phylobench"), "/") 67 | files <- gsub(".*system\\.file\\(\"", prefix, files) 68 | files <- gsub("\".*$", "", files) 69 | infiles <- grep("/extdata/input/", files, value = TRUE) 70 | outfiles <- grep("/extdata/output/", files, value = TRUE) 71 | cat("Test no.", which, ": ", tl, ":\n\n", sep = "") 72 | cat("Input (data) files:", infiles, sep = "\n") 73 | cat("\nOutput (result) files:", outfiles, sep = "\n") 74 | } 75 | 76 | codeTests <- function(which) 77 | { 78 | if (missing(which)) 79 | stop("give the number of the test to show its code (see listTests())") 80 | FUN <- .list_of_tests[which] 81 | tl <- names(FUN) 82 | cat("Test no.", which, ": ", tl, "\n\n", FUN[[1]], " <- ", sep = "") 83 | get(FUN[[1]]) 84 | } 85 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/input/FASTA/seq1_DNA.fas: -------------------------------------------------------------------------------- 1 | >A 2 | AAAAAAAAAA 3 | >B 4 | TTTTTTTTTT 5 | >C 6 | GGGGGGGGGG 7 | >D 8 | CCCCCCCCCC 9 | >E 10 | NNNNNNNNNN 11 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/input/Newick/three_unrooted_trees_4tips.tre: -------------------------------------------------------------------------------- 1 | (A,B,(C,D)); 2 | (A,C,(B,D)); 3 | (A,D,(B,C)); 4 | 
-------------------------------------------------------------------------------- /phylobench/inst/extdata/input/Newick/tree1_Newick.tre: -------------------------------------------------------------------------------- 1 | ((((A:1,B:1):1,C:2):1,D:3):1,E:4); 2 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/input/Newick/tree_primates.tre: -------------------------------------------------------------------------------- 1 | ((((Homo:0.21,Pongo:0.21):0.28,Macaca:0.49):0.13,Ateles:0.62):0.38,Galago:1); 2 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/input/Table/M_SaitouNei.txt: -------------------------------------------------------------------------------- 1 | 1 2 3 4 5 6 7 8 2 | 1 0 7 8 11 13 16 13 17 3 | 2 7 0 5 8 10 13 10 14 4 | 3 8 5 0 5 7 10 7 11 5 | 4 11 8 5 0 8 11 8 12 6 | 5 13 10 7 8 0 5 6 10 7 | 6 16 13 10 11 5 0 9 13 8 | 7 13 10 7 8 6 9 0 8 9 | 8 17 14 11 12 10 13 8 0 10 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/input/Table/data_primates.txt: -------------------------------------------------------------------------------- 1 | body brain 2 | Homo 4.09434 4.74493 3 | Pongo 3.61092 3.33220 4 | Macaca 2.37024 3.36730 5 | Ateles 2.02815 2.89037 6 | Galago -1.46968 2.30259 7 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/output/BF1.txt: -------------------------------------------------------------------------------- 1 | A C G T N 2 | 10 10 10 10 10 3 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/output/PIC_primates.txt: -------------------------------------------------------------------------------- 1 | ## source: http://www0.nih.go.jp/~jun/research/phylip/contrast.htm (accessed 2017-09-25) 2 | body brain 3 | 0.74593 2.17989 4 | 1.58474 0.71761 5 | 1.19293 
0.86790 6 | 3.35832 0.89706 7 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/output/bt1.txt: -------------------------------------------------------------------------------- 1 | 4 2 | 3 3 | 2 4 | 1 5 | -------------------------------------------------------------------------------- /phylobench/inst/extdata/output/tree_NJ_SaitouNei.tre: -------------------------------------------------------------------------------- 1 | (8:6,7:2,((((1:5,2:2):2,3:1):1,4:3):2,(5:1,6:4):2):1); 2 | -------------------------------------------------------------------------------- /phylobench/man/BF.Rd: -------------------------------------------------------------------------------- 1 | \name{baseFrequencies} 2 | \alias{baseFrequencies} 3 | \alias{BF} 4 | \title{Base Frequencies} 5 | \description{ 6 | This benchmark computes the base frequencies from five sequences of 7 | ten bases each, stored in a FASTA file. Each sequence consists 8 | entirely of a single symbol: A, C, G, T, or N. These values are compared with the 9 | values returned by the function \code{\link[ape]{base.freq}}. 10 | } 11 | \usage{ 12 | BF() 13 | } 14 | \author{Emmanuel Paradis} 15 | \keyword{utilities} 16 | -------------------------------------------------------------------------------- /phylobench/man/BTIMES.Rd: -------------------------------------------------------------------------------- 1 | \name{BTIMES} 2 | \alias{BTIMES} 3 | \title{Branching Times} 4 | \description{ 5 | This benchmark assesses the branching times calculated on an 6 | ultrametric tree with five tips and four branching times equal to 7 | four, three, two, and one unit of time (from the root to the most 8 | recent node). The initial tree is stored in a Newick file. These 9 | values are compared with the values returned by the function 10 | \code{\link[ape]{branching.times}}.
11 | } 12 | \usage{ 13 | BTIMES() 14 | } 15 | \author{Emmanuel Paradis} 16 | \keyword{utilities} 17 | -------------------------------------------------------------------------------- /phylobench/man/MANTEL.Rd: -------------------------------------------------------------------------------- 1 | \name{MANTEL} 2 | \alias{MANTEL} 3 | \title{Type I Error Rate of the Mantel Test} 4 | \description{ 5 | This benchmark assesses the type I error rate of the Mantel test 6 | computed by the function \code{\link[ape]{mantel.test}}. The test is 7 | performed on two random square matrices of size \code{n} and is 8 | replicated \code{N} times. If more than 5\% of the tests are 9 | significant, this cycle (simulation + test) is repeated; otherwise 10 | ``OK'' is returned. If more than 5\% of the tests are still 11 | significant after 200 cycles, an appropriate message is returned. 12 | 13 | If at least one missing value (NA or NaN) is observed, an appropriate 14 | message is returned. 15 | } 16 | \usage{ 17 | MANTEL(N = 100, n = 10) 18 | } 19 | \arguments{ 20 | \item{N}{the number of replications.} 21 | \item{n}{the number of rows in the matrices.} 22 | } 23 | \author{Emmanuel Paradis} 24 | \references{ 25 | Mantel, N. (1967) The detection of disease clustering and a 26 | generalized regression approach. \emph{Cancer Research}, \bold{27}, 27 | 209--220. 28 | } 29 | \keyword{utilities} 30 | -------------------------------------------------------------------------------- /phylobench/man/NJ_SaitouNei.Rd: -------------------------------------------------------------------------------- 1 | \name{NJ_SaitouNei} 2 | \alias{NJ_SaitouNei} 3 | \title{Neighbor-Joining} 4 | \description{ 5 | This benchmark assesses the neighbor-joining (NJ) method of Saitou 6 | and Nei (1987) using the small example provided in their paper. This 7 | simple case results in an NJ tree whose branch lengths are all 8 | integers. This tree is compared with the tree returned by the function 9 | \code{\link[ape]{nj}}.
10 | } 11 | \usage{ 12 | NJ_SaitouNei() 13 | } 14 | \author{Emmanuel Paradis} 15 | \references{ 16 | Saitou, N. and Nei, M. (1987) The neighbor-joining method: a new 17 | method for reconstructing phylogenetic trees. \emph{Molecular Biology 18 | and Evolution}, \bold{4}, 406--425. 19 | } 20 | \keyword{utilities} 21 | -------------------------------------------------------------------------------- /phylobench/man/PIC.Rd: -------------------------------------------------------------------------------- 1 | \name{PIC} 2 | \alias{PIC} 3 | \title{Phylogenetically Independent Contrasts Benchmark} 4 | \description{ 5 | This benchmark compares the phylogenetically independent contrasts 6 | calculated by \pkg{ape} with those output from Phylip (Felsenstein 7 | 2004). The data are a phylogeny of five species of primates and two 8 | variables: body mass and brain mass, both log-transformed. 9 | 10 | The comparisons are done to the nearest 1e-5 since the results 11 | reported in Phylip have five digits. 12 | } 13 | \usage{ 14 | PIC() 15 | } 16 | \author{Emmanuel Paradis} 17 | \references{ 18 | Felsenstein, J. (2004) Phylip (Phylogeny Inference Package) version 19 | 3.68. Department of Genetics, University of Washington, Seattle, USA. 20 | \url{http://evolution.genetics.washington.edu/phylip/phylip.html}. 21 | } 22 | \keyword{utilities} 23 | -------------------------------------------------------------------------------- /phylobench/man/RCOAL.Rd: -------------------------------------------------------------------------------- 1 | \name{RCOAL} 2 | \alias{RCOAL} 3 | \title{Random Coalescent Trees} 4 | \description{ 5 | This benchmark simulates random coalescent trees with the function 6 | \code{\link[ape]{rcoal}}, calculates the time to the MRCA (root age) 7 | with \code{\link[ape]{branching.times}} for different sample sizes 8 | (\emph{n} = 5, 10, 20, 50, 75, 100). 
The expected mean and variance are 9 | calculated with: 10 | 11 | \deqn{2\sum_{i=2}^n\frac{1}{i(i-1)}}{2 \sum 1/(i(i-1)), i=2, \dots, n} 12 | 13 | \deqn{4\sum_{i=2}^n\frac{1}{[i(i-1)]^2}}{4 \sum 1/(i(i-1))^2, i=2, \dots, n} 14 | 15 | For each value of \emph{n}, 100 trees are simulated and the mean root 16 | age is centered and scaled with the above formulas, so the transformed 17 | values are expected to follow a standard normal distribution (assuming 18 | the central limit theorem applies). If more than 1\% of these 19 | transformed values are smaller or larger than expected as indicated by 20 | the quantile of the normal distribution (see 21 | \code{\link[stats]{qnorm}}), the whole process is repeated, otherwise 22 | the test returns ``OK''. The maximum number of repetitions is 200 23 | (equivalent to 120,000 simulated trees). If after these 200 repetitions, 24 | more than 1\% out of the 1200 means are out of range, an appropriate 25 | message is returned. 26 | 27 | If at least one missing value (NA or NaN) is observed, an appropriate 28 | message is returned. 29 | } 30 | \usage{ 31 | RCOAL() 32 | } 33 | \author{Emmanuel Paradis} 34 | \references{ 35 | Hudson, R. R. (1991) Gene genealogies and the coalescent process. \emph{Oxford Surveys in Evolutionary Biology}, \bold{7}, 1--44. 36 | 37 | Kingman, J. F. C. (1982) On the genealogy of large populations. \emph{Journal of Applied Probability}, \bold{19A}, 27--43. 38 | 39 | Wakeley, J. (2009) Coalescent theory: an introduction. Roberts \& Company Publishers, Greenwood Village CO. 40 | } 41 | \keyword{utilities} 42 | 43 | -------------------------------------------------------------------------------- /phylobench/man/REORDERPHYLO.Rd: -------------------------------------------------------------------------------- 1 | \name{REORDERPHYLO} 2 | \alias{REORDERPHYLO} 3 | \title{Test Reordering of Edge Matrix} 4 | \description{ 5 | This benchmark tests whether reordering the edges of a \code{"phylo"} 6 | tree works correctly.
7 | 8 | This benchmark is quite critical as reordering the edges of a 9 | \code{"phylo"} tree is an important operation used in many functions. 10 | } 11 | \usage{ 12 | REORDERPHYLO(Nmin = 3, Nmax = 1000, ProbRooted = 0.5, 13 | ProbMultichotomy = 0.5, nrep = 1e4) 14 | } 15 | \arguments{ 16 | \item{Nmin, Nmax}{the smallest and largest values allowed for the 17 | number of tips, by default between 3 and 1000.} 18 | \item{ProbRooted}{the probability that the tree is simulated rooted; 19 | 0.5 by default.} 20 | \item{ProbMultichotomy}{the probability that multichotomies are 21 | introduced into the tree, 0.5 by default; the same value gives the proportion of internal edges that are collapsed.} 22 | \item{nrep}{the number of trees simulated (10,000 by default).} 23 | } 24 | \details{ 25 | The edges (or branches) in a tree of class \code{"phylo"} can be 26 | ordered in cladewise, pruningwise, or postorder order (see the 27 | definition of the class on \pkg{ape}'s web site for more details). 28 | 29 | The idea of this benchmark is to simulate a tree with the function 30 | \code{\link[ape]{rtree}} which outputs trees in cladewise order. The 31 | simulated tree can be rooted or not and include multichotomies, both 32 | being controlled by the above options. The simulated tree is then 33 | reordered into pruningwise order, and back into cladewise order: the 34 | edge matrix of the final tree is expected to be identical to the one 35 | of the original tree. The same operation is performed with the 36 | postorder order. Both operations are repeated \code{nrep} times, each time on a newly 37 | simulated tree. 38 | 39 | If at least one test does not run as expected, a data frame is 40 | returned with information on all simulated trees (see below).
41 | } 42 | \value{ 43 | \code{"OK"} if no problem is detected; otherwise, a data frame is 44 | returned with \code{nrep} rows and the following columns: 45 | 46 | \describe{ 47 | \item{Ntip}{the number of tips in the simulated tree.} 48 | \item{Nnode}{the number of internal nodes in the simulated tree.} 49 | \item{Rooted}{a logical value indicating whether the tree was rooted 50 | or not.} 51 | \item{Test1}{a logical value indicating whether the first test 52 | described above (pruningwise order) ran as expected.} 53 | \item{Test2}{the same for the second test (postorder order).} 54 | } 55 | } 56 | \author{Emmanuel Paradis} 57 | \keyword{utilities} 58 | -------------------------------------------------------------------------------- /phylobench/man/SPLITS.Rd: -------------------------------------------------------------------------------- 1 | \name{SPLITS} 2 | \alias{SPLITS} 3 | \title{Splits from Unrooted Trees} 4 | \description{ 5 | This benchmark assesses the extraction of splits (or bipartitions) 6 | from unrooted trees. The test trees are the three possible unrooted 7 | topologies with four tips. The splits are extracted with the functions 8 | \code{\link[ape]{prop.part}} and \code{\link[ape]{bitsplits}} both 9 | from \pkg{ape}. It is expected that three splits are observed, each 10 | with a relative frequency of 1/3. 11 | } 12 | \usage{ 13 | SPLITS() 14 | } 15 | \author{Emmanuel Paradis} 16 | \seealso{\code{\link{TOPODIST}}} 17 | \keyword{utilities} 18 | -------------------------------------------------------------------------------- /phylobench/man/TOPODIST.Rd: -------------------------------------------------------------------------------- 1 | \name{TOPODIST} 2 | \alias{TOPODIST} 3 | \title{Topological Distances} 4 | \description{ 5 | This benchmark assesses the topological distances between pairs of 6 | unrooted trees. Three trees are considered which represent the three 7 | possible unrooted topologies with four tips. The distances are 8 | calculated with the function \code{\link[ape]{dist.topo}}.
If all 9 | distances are equal to 2, ``OK'' is returned. 10 | } 11 | \usage{ 12 | TOPODIST() 13 | } 14 | \author{Emmanuel Paradis} 15 | \references{ 16 | Saitou, N. and Nei, M. (1987) The neighbor-joining method: a new 17 | method for reconstructing phylogenetic trees. \emph{Molecular Biology 18 | and Evolution}, \bold{4}, 406--425. 19 | } 20 | \seealso{\code{\link{SPLITS}}} 21 | \keyword{utilities} 22 | -------------------------------------------------------------------------------- /phylobench/man/ULTRAMETRIC.Rd: -------------------------------------------------------------------------------- 1 | \name{ULTRAMETRIC} 2 | \alias{ULTRAMETRIC} 3 | \title{Ultrametric Trees} 4 | \description{ 5 | For each value in \code{n}, one tree is simulated with 6 | \code{\link[ape]{rtree}} and one tree with 7 | \code{\link[ape]{rcoal}}. Both trees are tested with 8 | \code{\link[ape]{is.ultrametric}}. This is replicated \code{N} 9 | times. If all tests are as expected, ``OK'' is returned; otherwise, 10 | a message with the number of unexpected results is returned. 11 | } 12 | \usage{ 13 | ULTRAMETRIC(N = 100, n = c(5, 10, 20, 50, 100)) 14 | } 15 | \arguments{ 16 | \item{N}{the number of replications.} 17 | \item{n}{the tree sizes.} 18 | } 19 | \author{Emmanuel Paradis} 20 | \keyword{utilities} 21 | -------------------------------------------------------------------------------- /phylobench/man/VCVBM.Rd: -------------------------------------------------------------------------------- 1 | \name{VCVBM} 2 | \alias{VCVBM} 3 | \title{Phylogenetic Variance-Covariance Matrix Under Brownian Motion Model} 4 | \description{ 5 | This benchmark assesses the variance-covariance (VCV) matrix for a 6 | trait evolving under Brownian motion on an ultrametric tree with five 7 | tips and four branching times equal to four, three, two, and one unit 8 | of time (from the root to the most recent node). In this case, the 9 | values in the VCV matrix can be calculated by hand. 
These values are 10 | compared with the values returned by the function \code{\link{vcv}}. 11 | } 12 | \usage{ 13 | VCVBM() 14 | } 15 | \author{Emmanuel Paradis} 16 | \references{ 17 | Felsenstein, J. (1985) Phylogenies and the comparative method. 18 | \emph{American Naturalist}, \bold{125}, 1--15. 19 | 20 | Martins, E. P. and Hansen, T. F. (1997) Phylogenies and the comparative 21 | method: a general approach to incorporating phylogenetic information 22 | into the analysis of interspecific data. \emph{American Naturalist}, 23 | \bold{149}, 646--667. 24 | } 25 | \examples{ 26 | ## the tree used in the benchmark: 27 | tr <- compute.brtime(stree(5, "l"), 4:1) 28 | plot(tr) 29 | } 30 | \keyword{utilities} 31 | -------------------------------------------------------------------------------- /phylobench/man/YULE.Rd: -------------------------------------------------------------------------------- 1 | \name{YULE} 2 | \alias{YULE} 3 | \title{Random Yule Trees} 4 | \description{ 5 | This benchmark simulates random phylogenies under the Yule model 6 | (i.e., without extinction) with the function 7 | \code{\link[ape]{rlineage}}. The process is repeated \code{N} times 8 | and the number of species is extracted and the frequencies of these 9 | values are compared with the expected values calculated with 10 | \code{\link[ape]{dbd}}. 11 | 12 | This benchmark is quite sensitive to the parameter values. For 13 | instance, \code{YULE(N = 10)} fails most of the time. 14 | 15 | Because the number of species can grow to a very large number if 16 | \code{lambda} and/or \code{Tmax} are large, this may result in an 17 | error if the simulated tree has more than 100,000 branches. 
18 | } 19 | \usage{ 20 | YULE(N = 1000, lambda = 0.05, Tmax = 50, threshold = c(0.8, 1.2)) 21 | } 22 | \arguments{ 23 | \item{N}{the number of simulated trees.} 24 | \item{lambda}{the value of speciation rate.} 25 | \item{Tmax}{the timespan of the simulation.} 26 | \item{threshold}{the lower and upper bounds when comparing the 27 | observed and predicted numbers of species.} 28 | } 29 | \author{Emmanuel Paradis} 30 | \references{ 31 | Kendall, D. G. (1948) On the generalized ``birth-and-death'' 32 | process. \emph{Annals of Mathematical Statistics}, \bold{19}, 1--15. 33 | 34 | Yule, G. U. (1924) A mathematical theory of evolution, based on the 35 | conclusions of Dr. J. C. Willis, F.R.S.. \emph{Philosophical 36 | Transactions of the Royal Society of London. Series B}, \bold{213}, 37 | 21--87. 38 | } 39 | \keyword{utilities} 40 | 41 | -------------------------------------------------------------------------------- /phylobench/man/phylobench-package.Rd: -------------------------------------------------------------------------------- 1 | \name{phylobench-package} 2 | \alias{phylobench-package} 3 | \alias{phylobench} 4 | \docType{package} 5 | \title{ 6 | Phylogenetic Benchmarks 7 | } 8 | \description{ 9 | \pkg{phylobench} provides functions for testing phylogenetic functions. 10 | 11 | More information on \pkg{phylobench} can be found at 12 | \url{https://github.com/emmanuelparadis/phylobench}. 13 | } 14 | 15 | \author{ 16 | Emmanuel Paradis 17 | 18 | Maintainer: Emmanuel Paradis 19 | } 20 | \keyword{package} 21 | -------------------------------------------------------------------------------- /phylobench/man/runTests.Rd: -------------------------------------------------------------------------------- 1 | \name{runTests} 2 | \alias{runTests} 3 | \alias{codeTests} 4 | \alias{listTests} 5 | \alias{fileTests} 6 | \title{Phylogenetic Benchmarking} 7 | \description{ 8 | \code{runTests} runs a series of phylogenetic benchmark tests. The 9 | other functions are utilities. 
10 | } 11 | \usage{ 12 | runTests(verbose = TRUE) 13 | listTests() 14 | fileTests(which) 15 | codeTests(which) 16 | } 17 | \arguments{ 18 | \item{verbose}{a logical value specifying whether to print the 19 | progress of the tests. Set it to \code{FALSE} for completely 20 | silent testing.} 21 | \item{which}{a number giving the number of the test.} 22 | } 23 | \author{Emmanuel Paradis} 24 | \value{ 25 | \code{runTests} returns a named list with the results of the tests. 26 | 27 | \code{listTests} returns a data frame with the titles and functions of 28 | the tests. 29 | 30 | \code{fileTests} and \code{codeTests} simply print their results. 31 | } 32 | \examples{ 33 | ## This is a test to check that the functions match with 34 | ## those listed in the test list inside the package: 35 | fun.db <- unlist(phylobench:::.list_of_tests) 36 | fun.pkg <- ls(envir = asNamespace("phylobench")) 37 | del <- match(c("runTests", "listTests", "fileTests", "codeTests", "eps"), 38 | fun.pkg) 39 | fun.pkg <- fun.pkg[-del] 40 | test1 <- length(fun.db) == length(fun.pkg) 41 | test2 <- all(sort(fun.db) == sort(fun.pkg)) 42 | if (!(test1 && test2)) { 43 | cat("Function(s) in the package not in the list:", 44 | fun.pkg[is.na(match(fun.pkg, fun.db))], "\n") 45 | cat("Function(s) in the list not in the package:", 46 | fun.db[is.na(match(fun.db, fun.pkg))], "\n") 47 | stop("Check the functions and the list of functions.") 48 | } 49 | } 50 | \keyword{utilities} 51 | -------------------------------------------------------------------------------- /phylobench/vignettes/PhylogeneticBenchmarks.Rnw: -------------------------------------------------------------------------------- 1 | \documentclass[a4paper]{article} 2 | %\VignetteIndexEntry{Phylogenetic Benchmarks} 3 | %\VignettePackage{phylobench} 4 | \usepackage{fancyvrb} 5 | \usepackage{color} 6 | 7 | \newcommand{\code}{\texttt} 8 | \newcommand{\pkg}{\textsf} 9 | \newcommand{\ape}{\pkg{ape}} 10 | \newcommand{\phylobench}{\pkg{phylobench}} 11 | 
\newcommand{\R}{\pkg{R}} 12 | 13 | \author{Emmanuel Paradis} 14 | \title{Phylogenetic Benchmarks With \phylobench} 15 | 16 | \begin{document} 17 | 18 | \maketitle 19 | 20 | <<>>= 21 | options(width=60) 22 | @ 23 | 24 | \section{Structure of the Package} 25 | 26 | The package \phylobench\ has four functions: 27 | 28 | \begin{itemize} 29 | \item \code{runTests} which runs all the tests programmed in 30 | \phylobench. It has a single option: \code{verbose = TRUE} by 31 | default. 32 | \item \code{listTests} which lists all the tests programmed. 33 | \item \code{fileTests} which lists the files used for each test. 34 | \item \code{codeTests} which displays the code of each test. 35 | \end{itemize} 36 | The last two functions have a mandatory option, \code{which}, to 37 | specify the number of the test as returned by \code{listTests}. 38 | 39 | \phylobench\ has a predefined parameter named \code{eps} which is 40 | taken from the machine numerical characteristics and is defined as the 41 | smallest positive number $x$ such that $1+x\ne 1$ (see \code{?.Machine}): 42 | 43 | <<>>= 44 | library(phylobench) 45 | phylobench:::eps 46 | .Machine$double.eps 47 | @ 48 | This value can be used when programming tests, although it is not 49 | exported by \phylobench. 50 | 51 | The tests themselves are programmed in \R\ (see below). The files 52 | needed for the tests are included in \phylobench. They are arranged 53 | into two subdirectories depending on their use during the tests: 54 | 55 | <<>>= 56 | dir(system.file("extdata/input", package = "phylobench"), recursive = TRUE) 57 | dir(system.file("extdata/output", package = "phylobench"), recursive = TRUE) 58 | @ 59 | The files in \code{"phylobench/extdata/input"} are used as input data for 60 | the tests, whereas the files in \code{"phylobench/extdata/output"} are 61 | used for comparisons with the results of the tests.
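As a minimal sketch of how an input file and an output file are paired, the hypothetical function below mimics a branching-time benchmark. It assumes that \ape\ is installed; the name \code{btimesSketch} and the tolerance \code{tol} are assumptions made for this illustration and are not part of \phylobench. A real test would locate its files with \code{system.file}; here the five-tip tree and the four expected values are inlined so that the code runs on its own.

```r
library(ape)  # assumed to be installed; provides read.tree() and branching.times()

## Hypothetical benchmark written for illustration only: this is NOT the
## code shipped in phylobench. The data are inlined instead of being read
## through system.file() so that the example is self-contained.
btimesSketch <- function(tol = sqrt(.Machine$double.eps)) {
    ## "input": an ultrametric tree with five tips
    tr <- read.tree(text = "((((A:1,B:1):1,C:2):1,D:3):1,E:4);")
    ## "output": the expected branching times, from the root to the tips
    expected <- c(4, 3, 2, 1)
    ## sort so that the comparison does not depend on the node numbering
    observed <- sort(unname(branching.times(tr)), decreasing = TRUE)
    if (length(observed) == length(expected) &&
        all(abs(observed - expected) < tol)) "OK"
    else "branching times differ from the expected values"
}

btimesSketch()
```

Returning a short message rather than \code{"OK"} when the comparison fails fits the way \code{runTests} works: any value that is not identical to \code{"OK"} is reported as a problem and stored in the returned list.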
62 | 63 | \section{Programming the Tests} 64 | 65 | Each test is performed by a function which has the following 66 | features: 67 | 68 | \begin{itemize} 69 | \item The function should have either no argument or have all its 70 | arguments with default values, so that it will be executed with 71 | something like \code{FUN()}. 72 | \item The function should return the character string \code{"OK"} if 73 | the tests run as expected. 74 | \item The name of the function should not start with a period so that 75 | it is easily found (though it will not be exported). 76 | \end{itemize} 77 | 78 | Once the function and its data files (if any) have been added to the 79 | source of \phylobench, we need to update the list of tests in the 80 | package: 81 | 82 | <<>>= 83 | phylobench:::.list_of_tests 84 | @ 85 | Note that the contents of this list are checked against the 86 | (non-exported) functions present in the package during \code{R CMD check 87 | phylobench}. An optional help page can be provided describing the 88 | benchmark. These help pages can be used to give more or less 89 | detailed descriptions of the benchmarks which are then compiled in 90 | \phylobench's PDF manual. 91 | 92 | \section{Usage} 93 | 94 | We run all tests: 95 | 96 | <<>>= 97 | res <- runTests() 98 | str(res) 99 | @ 100 | Then, we examine the files and the code of the third set of tests: 101 | 102 | <<>>= 103 | listTests() 104 | fileTests(3) 105 | codeTests(3) 106 | @ 107 | Note that since the examples from this vignette were run in a 108 | temporary directory, the directories listed above will not be 109 | standard. 110 | 111 | \end{document} 112 | --------------------------------------------------------------------------------