The Test Anything Protocol (TAP)


Test: A piece of software which determines if a particular facility or component is functional.

Test Result: The encoding in TAP of the result of a single test. It may have at most one TAP result directive.

Test Set: Zero or more test results with exactly one plan. A test set can be determined to have failed or passed.

Test Suite: A collection of test sets.

TAP Document: One test set which parses correctly as defined in this document. Future expansion may allow multiple test sets to be contained in a single TAP document, allowing an entire test suite to be stored as a single TAP document.

Plan: The number of tests which are expected to pass in a single test set.

Directive: A directive changes the meaning of a test result; it is commonly used to indicate that a failing test should be reported as passing (TODO) or that a test was not run (SKIP).

Reason: A reason explains why a directive was necessary.

Description: A description says what a test result is testing.

Producer: A producer is a thing which can generate a valid TAP document.

Consumer: A consumer is a thing which can parse and interpret a TAP document.

Filter: A filter is a piece of software which consumes a TAP document and produces a new TAP document based in some way on the TAP input. For example, it can reproduce the TAP document exactly, normalize the formatting, combine multiple documents, or summarize the result of tests as a new TAP document.


[Figure: flow diagram showing Storage, Formatted Output, TAP parser, and Test summaries]

Additionally, utilities like "prove" can further simplify running a suite of TAP producers, by searching for files having certain characteristics or matching particular patterns. For instance, conventionally, all tests for a Perl module are stored in the 't/' folder, and consist of executable scripts (TAP producers) with extensions of 't'. The "prove" utility, part of the Test::Harness module, can then be used to find these files and run them all. In the following example, prove executes t/error.t, t/id.t and t/url.t and evaluates the produced TAP documents in turn, checking which ones passed and failed, and providing a complete summary of the entire test suite run.


A TAP test set consists of one version, one plan, zero or more test results, and any number of comments and ignored elements. A TAP test set is split into lines, separated by newline characters. Any line beginning with the letters 'ok' or 'not ok' is a test result. All unparsable lines must be ignored by TAP consumers. In order to keep TAP readable on the system on which it is produced, the end-of-line sequence for TAP may be LF or CRLF. The same end-of-line sequence must be used throughout a document.

Free-form strings in TAP, such as reasons and descriptions, must be in UTF-8 unless the interchange parties agree otherwise. Strings MUST NOT contain an EOL or NUL character. Since directives start with a "#" and come after a description, descriptions MUST NOT contain a "#".
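As an illustration of the line rules above, the following Python sketch splits a document on a single end-of-line sequence and flags test result lines. The function names are hypothetical, and rejecting mixed line endings with an exception is an assumption; the text only requires that one sequence be used throughout.

```python
def split_tap_lines(text: str) -> list[str]:
    """Split a TAP document into lines, accepting LF or CRLF but not a mix."""
    has_crlf = "\r\n" in text
    # Any "\n" left after removing every CRLF pair is a bare LF.
    has_bare_lf = "\n" in text.replace("\r\n", "")
    if has_crlf and has_bare_lf:
        # Assumption: a mixed document is rejected outright.
        raise ValueError("mixed LF and CRLF line endings")
    eol = "\r\n" if has_crlf else "\n"
    lines = text.split(eol)
    # A trailing end-of-line leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_test_result(line: str) -> bool:
    """A line beginning with the letters 'ok' or 'not ok' is a test result."""
    return line.startswith("ok") or line.startswith("not ok")
```

A consumer would typically call split_tap_lines once per document and then hand each line to a stricter grammar-based parser; the prefix check here is only the coarse classification described in the prose.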

Every test set as defined in this document should contain a version definition. A test set without a version definition is assumed to be written in TAP version 12, which is not covered by this document.


Every test set should contain one and only one plan. A test set containing multiple plans can be parsed as a TAP document, but will be judged to have failed.
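The one-plan rule could be checked along the following lines. This is a sketch only: the "1..N" plan syntax with an optional trailing "#" part is assumed from common TAP usage, and the helper names are hypothetical.

```python
import re

# Assumed plan syntax: "1..N", optionally followed by a "# ..." plan directive.
PLAN_RE = re.compile(r"^1\.\.(\d+)(\s*#.*)?$")


def count_plans(lines: list[str]) -> int:
    """Count the lines that look like a plan."""
    return sum(1 for line in lines if PLAN_RE.match(line))


def has_exactly_one_plan(lines: list[str]) -> bool:
    """A test set with zero plans or with several plans does not satisfy the rule."""
    return count_plans(lines) == 1
```

A consumer can still parse a document that contains several plans; the check only feeds into the pass/fail verdict for the test set.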


Comments are lines which have no meaning in TAP. They do not alter the result of a test. A TAP document has exactly the same meaning with or without its comments.

Comments typically contain debugging information. TAP consumers should not display comments by default, as there will likely be a large number of tests in such a suite.

Note that TAP does not provide a mechanism for comments to be associated with particular test results; for instance, comments of a general nature might be interspersed with comments specific to a particular test.


In order to allow extension of the protocol while maintaining backwards compatibility, a TAP consumer must ignore certain lines and elements. Any line which does not parse must be ignored. Any directive or plan-directive which is not recognized must also be ignored.

Here is an example of a TAP stream which contains ignored elements.

Line 2 should be ignored because it does not parse. The directive "BANG" on test #2 should be treated as an ordinary comment because it is not a TAP directive.
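The ignoring behaviour can be sketched in Python as follows. The regular expression is a simplified stand-in for the grammar, not a transcription of it, and the case-insensitive, whole-word matching of SKIP and TODO is an assumption. All names are hypothetical.

```python
import re

# Simplified test-result shape: status, optional number, description, optional "#" trailer.
RESULT_RE = re.compile(r"^(not ok|ok)\b(?:\s+(\d+))?([^#]*)(?:#\s*(.*))?$")
KNOWN_DIRECTIVES = ("SKIP", "TODO")


def parse_result(line: str):
    """Return a dict for a test result line, or None for a line that does not parse."""
    m = RESULT_RE.match(line)
    if m is None:
        return None
    status, number, description, trailer = m.groups()
    directive, reason = None, None
    words = (trailer or "").split(None, 1)
    # An unrecognized directive is dropped, exactly like an ordinary comment.
    if words and words[0].upper() in KNOWN_DIRECTIVES:
        directive = words[0].upper()
        reason = words[1] if len(words) > 1 else ""
    return {
        "ok": status == "ok",
        "number": int(number) if number else None,
        "description": description.strip(),
        "directive": directive,
        "reason": reason,
    }


def parse_results(lines: list[str]) -> list[dict]:
    """Keep only the lines that parse as test results; everything else is ignored."""
    parsed = (parse_result(line) for line in lines)
    return [r for r in parsed if r is not None]
```

With this sketch, a result carrying an unknown directive such as "BANG" keeps its plain "ok" or "not ok" status, and a line of free text contributes nothing to the parsed output.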


The test set is defined below using Augmented Backus-Naur Form (ABNF), as defined in .

A document must parse as per the grammar above to qualify as a TAP document. The grammar presented below may be used by TAP consumers to ensure that a test set contains all the required parts.


Both test sets and test results can be determined to have passed or failed as detailed below. Note that a test set might fail even if every single test result contained in it passes.


The SKIP directive indicates that the test was not begun. Usually this is caused by environmental reasons: a missing optional library, an operating-system-specific test, or an expensive test that is only run on request.

Since the test was skipped, the test result is expected to be "ok" (indicating that the test was skipped correctly). A skipped test with a result of "not ok" is suspicious, and the TAP consumer should report a warning.
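A consumer might implement the warning for suspicious skips roughly as below. The function name and the message wording are hypothetical; only the condition comes from the text.

```python
from typing import Optional


def skip_warning(ok: bool, directive: Optional[str], number: Optional[int] = None) -> Optional[str]:
    """Return a warning for a skipped test reported as 'not ok', otherwise None."""
    if directive == "SKIP" and not ok:
        label = f"test {number}" if number is not None else "a test"
        return f"warning: {label} was skipped but reported 'not ok'"
    return None
```

Returning the message instead of printing it leaves the choice of reporting channel to the consumer.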


The TODO directive indicates that the test was run, but failure is expected and should not cause the test set to fail. This is usually because the functionality being tested has not been completely implemented or is obstructed by a known bug.

Neither a failing nor a passing TODO test will cause the test set to fail, but since passing TODOs are suspicious, they may optionally be reported by TAP consumers.
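The effect of TODO on a single result can be sketched as two small predicates. The names are hypothetical, and letting every non-TODO result fall through to its raw status is an assumption made to keep the sketch short.

```python
from typing import Optional


def result_counts_as_pass(ok: bool, directive: Optional[str]) -> bool:
    """A TODO result never fails the test set; otherwise the raw status decides."""
    if directive == "TODO":
        return True
    # Assumption: other directives are not given special treatment here.
    return ok


def is_unexpected_todo_pass(ok: bool, directive: Optional[str]) -> bool:
    """A passing TODO test is suspicious and may optionally be reported."""
    return directive == "TODO" and ok
```

Keeping the two questions separate lets a consumer compute the verdict first and decide afterwards whether to list the unexpectedly passing TODO tests.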


The plan indicates the number of tests which are expected to be run in the current test set. Two types of plans are defined, with different requirements for passing:

A simple plan: the number of expected tests must be a positive integer. In order for a test set with a simple plan to pass, it must contain exactly the number of test results indicated in the plan, in ascending numeric order, without either any gaps in test number or any duplicate tests. Every test result must pass.

A skip-all plan: the number of expected tests is zero. A test set with a skip-all plan passes unless it contains any test results.

Allowing simple plans to plan for zero tests is being considered, but is not a part of this specification.
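The two plan types translate into a short verdict function. In this sketch the function name is hypothetical, test numbering is assumed to start at 1, and the passed flag of each result is assumed to already reflect any directive.

```python
def evaluate_test_set(planned: int, results: list[tuple[int, bool]]) -> bool:
    """Decide whether a test set passes.

    planned: the number taken from the plan.
    results: (test number, passed) pairs in document order.
    """
    if planned == 0:
        # Skip-all plan: passes unless any test result is present.
        return not results
    numbers = [number for number, _ in results]
    # Requiring exactly 1..planned in order rules out gaps, duplicates,
    # misordering, and a wrong number of results in one comparison.
    if numbers != list(range(1, planned + 1)):
        return False
    return all(passed for _, passed in results)
```

For example, a plan of 2 with results numbered 1 and 1 fails on the duplicate even though both results passed, which is one way a test set can fail while every test result in it passes.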


TAP consumers may use any other system-specific factors to determine whether a test set passes or fails. If such a failure is to be reported, the consumer MUST inform the user of the state of the TAP parsing and whether all tests appear to have passed, and then must separately describe the nature of the system-specific failure which caused the consumer to become suspicious of the results.

TAP consumers which also act as TAP producers could add additional test results to the end of the output TAP document if such a failure occurs. For instance, take a TAP producer which emits ten successful tests, then throws an exception and exits with a failure. A filter might add a new (11th) failing ("not ok") test result to the end of the TAP it emits, which informs the user of the exception or failure exit code. Note that such a test result would cause the test set to fail: unless the test set planned one more test than it emitted, the number of tests would not equal the number of tests planned, and even if it did plan one more, the 11th failing test would itself cause the test set to fail.

An example of the use of this option is to check exit codes of TAP producers for a failure, which might indicate that the producer failed after having emitted a complete TAP document. It would be impractical to ignore such exit codes, but this information cannot be reliably expressed in its own TAP output by a failing producer, and must be treated separately. This also allows for language-specific and system-specific features to be used by the TAP consumer at its discretion to improve the rigour of its testing.
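A minimal sketch of the exit-code case, assuming the producer is a child process whose standard output is the TAP document. The function names and message wording are hypothetical. The point illustrated is that the TAP verdict and the system-specific failure are reported as separate statements.

```python
import subprocess


def run_producer(argv: list[str]) -> tuple[str, int]:
    """Run a TAP producer and return its standard output and exit code."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.stdout, proc.returncode


def final_report(tap_passed: bool, exit_code: int) -> tuple[bool, list[str]]:
    """Combine the TAP verdict with the exit code, describing each separately."""
    if tap_passed:
        messages = ["TAP result: all tests appear to have passed"]
    else:
        messages = ["TAP result: test set failed"]
    if exit_code != 0:
        # The system-specific failure is stated on its own line.
        messages.append(f"System-specific failure: producer exited with status {exit_code}")
    return tap_passed and exit_code == 0, messages
```

A consumer would parse the output returned by run_producer, compute the TAP verdict from it, and then pass that verdict together with the exit code to final_report.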


A parser which stores test results in a dynamically sized array may be vulnerable to memory starvation by a test which uses a very high test number. For example:

The above test result would cause an array of 123456789 elements to be allocated. It is therefore recommended that test results, if stored at all, be stored in a sparse array.
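In Python, a dictionary keyed by test number behaves as a sparse array, so a single very high test number costs one entry and not millions of empty slots. The class below is a hypothetical sketch. Overwriting on a repeated number is an assumption; a real consumer would also need to flag the duplicate.

```python
from typing import Optional


class SparseResults:
    """Store test results keyed by test number without allocating unused slots."""

    def __init__(self) -> None:
        self._by_number: dict[int, bool] = {}

    def record(self, number: int, passed: bool) -> None:
        # Memory use grows with the number of results seen, not with `number`.
        self._by_number[number] = passed

    def get(self, number: int) -> Optional[bool]:
        """Return the stored result, or None if that test number was never seen."""
        return self._by_number.get(number)

    def __len__(self) -> int:
        return len(self._by_number)
```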


TBD


This document is based on Andy Armstrong's description of the TAP protocol, version 13, which is itself based on Andy Lester's description of the TAP protocol, version 1.00. The basis for the TAP format was created by Larry Wall in the original test script for Perl 1. Tim Bunce and Andreas Koenig developed it further with their modifications to the Test::Harness module.
