├── composer.json ├── LICENSE.txt ├── README.md └── Parsedown.php /composer.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "erusev/parsedown", 3 | "description": "Parser for Markdown.", 4 | "keywords": ["markdown", "parser"], 5 | "homepage": "http://parsedown.org", 6 | "type": "library", 7 | "license": "MIT", 8 | "authors": [ 9 | { 10 | "name": "Emanuil Rusev", 11 | "email": "hello@erusev.com", 12 | "homepage": "http://erusev.com" 13 | } 14 | ], 15 | "require": { 16 | "php": ">=5.3.0", 17 | "ext-mbstring": "*" 18 | }, 19 | "require-dev": { 20 | "phpunit/phpunit": "^4.8.35" 21 | }, 22 | "autoload": { 23 | "psr-0": {"Parsedown": ""} 24 | }, 25 | "autoload-dev": { 26 | "psr-0": { 27 | "TestParsedown": "test/", 28 | "ParsedownTest": "test/", 29 | "CommonMarkTest": "test/", 30 | "CommonMarkTestWeak": "test/" 31 | } 32 | } 33 | } 34 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2013-2018 Emanuil Rusev, erusev.com 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR 18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 |

Hello Parsedown!
41 | ``` 42 | 43 | You can also parse inline markdown only: 44 | 45 | ```php 46 | echo $Parsedown->line('Hello _Parsedown_!'); # prints: Hello <em>Parsedown</em>! 47 | ``` 48 | 49 | More examples in [the wiki](https://github.com/erusev/parsedown/wiki/) and in [this video tutorial](http://youtu.be/wYZBY8DEikI). 50 | 51 | ## Security 52 | 53 | Parsedown is capable of escaping user-input within the HTML that it generates. Additionally, Parsedown will apply sanitisation to additional scripting vectors (such as scripting link destinations) that are introduced by the markdown syntax itself. 54 | 55 | To tell Parsedown that it is processing untrusted user-input, use the following: 56 | 57 | ```php 58 | $Parsedown->setSafeMode(true); 59 | ``` 60 | 61 | If, instead, you wish to allow HTML within untrusted user-input but still want output to be free from XSS, it is recommended that you use an HTML sanitiser that allows HTML tags to be whitelisted, like [HTML Purifier](http://htmlpurifier.org/). 62 | 63 | In both cases you should strongly consider employing defence-in-depth measures, like [deploying a Content-Security-Policy](https://scotthelme.co.uk/content-security-policy-an-introduction/) (a browser security feature) so that your page is likely to be safe even if an attacker finds a vulnerability in one of the first lines of defence above. 64 | 65 | #### Security of Parsedown Extensions 66 | 67 | Safe mode does not necessarily yield safe results when using extensions to Parsedown. Extensions should be evaluated on their own to determine their specific safety against XSS. 68 | 69 | ## Escaping HTML 70 | 71 | > ⚠️ **WARNING:** This method isn't safe from XSS! 72 | 73 | If you wish to escape HTML **in trusted input**, you can use the following: 74 | 75 | ```php 76 | $Parsedown->setMarkupEscaped(true); 77 | ``` 78 | 79 | Beware that this still allows users to insert unsafe scripting vectors, such as links like `[xss](javascript:alert%281%29)`. 
80 | 81 | ## Questions 82 | 83 | **How does Parsedown work?** 84 | 85 | It tries to read Markdown like a human. First, it looks at the lines. It’s interested in how the lines start. This helps it recognise blocks. It knows, for example, that if a line starts with a `-` then perhaps it belongs to a list. Once it recognises the blocks, it continues to the content. As it reads, it watches out for special characters. This helps it recognise inline elements (or inlines). 86 | 87 | We call this approach "line based". We believe that Parsedown is the first Markdown parser to use it. Since the release of Parsedown, other developers have used the same approach to develop other Markdown parsers in PHP and in other languages. 88 | 89 | **Is it compliant with CommonMark?** 90 | 91 | It passes most of the CommonMark tests. Most of the tests that don't pass deal with cases that are quite uncommon. Still, as CommonMark matures, compliance should improve. 92 | 93 | **Who uses it?** 94 | 95 | [Laravel Framework](https://laravel.com/), [Bolt CMS](http://bolt.cm/), [Grav CMS](http://getgrav.org/), [Herbie CMS](http://www.getherbie.org/), [Kirby CMS](http://getkirby.com/), [October CMS](http://octobercms.com/), [Pico CMS](http://picocms.org), [Statamic CMS](http://www.statamic.com/), [phpDocumentor](http://www.phpdoc.org/), [RaspberryPi.org](http://www.raspberrypi.org/), [Symfony demo](https://github.com/symfony/symfony-demo) and [more](https://packagist.org/packages/erusev/parsedown/dependents). 96 | 97 | **How can I help?** 98 | 99 | Use it, star it, share it and if you feel generous, [donate](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=528P3NZQMP8N2). 
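
The "line based" idea described under "How does Parsedown work?" can be illustrated with a minimal sketch. It is not Parsedown's actual code: the function name `guessBlockTypes` and the small marker table are hypothetical, chosen only to show how the first character of each line can hint at a block type.

```php
<?php
// Hypothetical illustration only: guessBlockTypes() is not part of Parsedown.

// Guess a block type for each line from the line's first non-space character.
function guessBlockTypes($markdown)
{
    // Marker character => candidate block type (a simplified assumption).
    $markers = array(
        '#' => 'header',
        '-' => 'list',
        '*' => 'list',
        '+' => 'list',
        '>' => 'quote',
    );

    $types = array();

    // Standardize line breaks, then split the text into lines.
    $lines = explode("\n", str_replace(array("\r\n", "\r"), "\n", $markdown));

    foreach ($lines as $line) {
        $text = ltrim($line, ' ');

        if ($text === '') {
            $types[] = 'blank';
            continue;
        }

        // The first character decides which block type is tried.
        $marker = $text[0];
        $types[] = isset($markers[$marker]) ? $markers[$marker] : 'paragraph';
    }

    return $types;
}

print_r(guessBlockTypes("# Title\n\n- item\n> quote\nplain text"));
```

A real parser has to go further than this sketch: a line starting with `-` may also be a rule, a table divider or a setext header underline, so each marker maps to several candidate block types that are tried in turn.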
100 | -------------------------------------------------------------------------------- /Parsedown.php: -------------------------------------------------------------------------------- 1 | textElements($text); 27 | 28 | # convert to markup 29 | $markup = $this->elements($Elements); 30 | 31 | # trim line breaks 32 | $markup = trim($markup, "\n"); 33 | 34 | return $markup; 35 | } 36 | 37 | protected function textElements($text) 38 | { 39 | # make sure no definitions are set 40 | $this->DefinitionData = array(); 41 | 42 | # standardize line breaks 43 | $text = str_replace(array("\r\n", "\r"), "\n", $text); 44 | 45 | # remove surrounding line breaks 46 | $text = trim($text, "\n"); 47 | 48 | # split text into lines 49 | $lines = explode("\n", $text); 50 | 51 | # iterate through lines to identify blocks 52 | return $this->linesElements($lines); 53 | } 54 | 55 | # 56 | # Setters 57 | # 58 | 59 | function setBreaksEnabled($breaksEnabled) 60 | { 61 | $this->breaksEnabled = $breaksEnabled; 62 | 63 | return $this; 64 | } 65 | 66 | protected $breaksEnabled; 67 | 68 | function setMarkupEscaped($markupEscaped) 69 | { 70 | $this->markupEscaped = $markupEscaped; 71 | 72 | return $this; 73 | } 74 | 75 | protected $markupEscaped; 76 | 77 | function setUrlsLinked($urlsLinked) 78 | { 79 | $this->urlsLinked = $urlsLinked; 80 | 81 | return $this; 82 | } 83 | 84 | protected $urlsLinked = true; 85 | 86 | function setSafeMode($safeMode) 87 | { 88 | $this->safeMode = (bool) $safeMode; 89 | 90 | return $this; 91 | } 92 | 93 | protected $safeMode; 94 | 95 | function setStrictMode($strictMode) 96 | { 97 | $this->strictMode = (bool) $strictMode; 98 | 99 | return $this; 100 | } 101 | 102 | protected $strictMode; 103 | 104 | protected $safeLinksWhitelist = array( 105 | 'http://', 106 | 'https://', 107 | 'ftp://', 108 | 'ftps://', 109 | 'mailto:', 110 | 'tel:', 111 | 'data:image/png;base64,', 112 | 'data:image/gif;base64,', 113 | 'data:image/jpeg;base64,', 114 | 'irc:', 115 | 'ircs:', 116 | 'git:', 
117 | 'ssh:', 118 | 'news:', 119 | 'steam:', 120 | ); 121 | 122 | # 123 | # Lines 124 | # 125 | 126 | protected $BlockTypes = array( 127 | '#' => array('Header'), 128 | '*' => array('Rule', 'List'), 129 | '+' => array('List'), 130 | '-' => array('SetextHeader', 'Table', 'Rule', 'List'), 131 | '0' => array('List'), 132 | '1' => array('List'), 133 | '2' => array('List'), 134 | '3' => array('List'), 135 | '4' => array('List'), 136 | '5' => array('List'), 137 | '6' => array('List'), 138 | '7' => array('List'), 139 | '8' => array('List'), 140 | '9' => array('List'), 141 | ':' => array('Table'), 142 | '<' => array('Comment', 'Markup'), 143 | '=' => array('SetextHeader'), 144 | '>' => array('Quote'), 145 | '[' => array('Reference'), 146 | '_' => array('Rule'), 147 | '`' => array('FencedCode'), 148 | '|' => array('Table'), 149 | '~' => array('FencedCode'), 150 | ); 151 | 152 | # ~ 153 | 154 | protected $unmarkedBlockTypes = array( 155 | 'Code', 156 | ); 157 | 158 | # 159 | # Blocks 160 | # 161 | 162 | protected function lines(array $lines) 163 | { 164 | return $this->elements($this->linesElements($lines)); 165 | } 166 | 167 | protected function linesElements(array $lines) 168 | { 169 | $Elements = array(); 170 | $CurrentBlock = null; 171 | 172 | foreach ($lines as $line) 173 | { 174 | if (chop($line) === '') 175 | { 176 | if (isset($CurrentBlock)) 177 | { 178 | $CurrentBlock['interrupted'] = (isset($CurrentBlock['interrupted']) 179 | ? $CurrentBlock['interrupted'] + 1 : 1 180 | ); 181 | } 182 | 183 | continue; 184 | } 185 | 186 | while (($beforeTab = strstr($line, "\t", true)) !== false) 187 | { 188 | $shortage = 4 - mb_strlen($beforeTab, 'utf-8') % 4; 189 | 190 | $line = $beforeTab 191 | . str_repeat(' ', $shortage) 192 | . substr($line, strlen($beforeTab) + 1) 193 | ; 194 | } 195 | 196 | $indent = strspn($line, ' '); 197 | 198 | $text = $indent > 0 ? 
substr($line, $indent) : $line; 199 | 200 | # ~ 201 | 202 | $Line = array('body' => $line, 'indent' => $indent, 'text' => $text); 203 | 204 | # ~ 205 | 206 | if (isset($CurrentBlock['continuable'])) 207 | { 208 | $methodName = 'block' . $CurrentBlock['type'] . 'Continue'; 209 | $Block = $this->$methodName($Line, $CurrentBlock); 210 | 211 | if (isset($Block)) 212 | { 213 | $CurrentBlock = $Block; 214 | 215 | continue; 216 | } 217 | else 218 | { 219 | if ($this->isBlockCompletable($CurrentBlock['type'])) 220 | { 221 | $methodName = 'block' . $CurrentBlock['type'] . 'Complete'; 222 | $CurrentBlock = $this->$methodName($CurrentBlock); 223 | } 224 | } 225 | } 226 | 227 | # ~ 228 | 229 | $marker = $text[0]; 230 | 231 | # ~ 232 | 233 | $blockTypes = $this->unmarkedBlockTypes; 234 | 235 | if (isset($this->BlockTypes[$marker])) 236 | { 237 | foreach ($this->BlockTypes[$marker] as $blockType) 238 | { 239 | $blockTypes []= $blockType; 240 | } 241 | } 242 | 243 | # 244 | # ~ 245 | 246 | foreach ($blockTypes as $blockType) 247 | { 248 | $Block = $this->{"block$blockType"}($Line, $CurrentBlock); 249 | 250 | if (isset($Block)) 251 | { 252 | $Block['type'] = $blockType; 253 | 254 | if ( ! 
isset($Block['identified'])) 255 | { 256 | if (isset($CurrentBlock)) 257 | { 258 | $Elements[] = $this->extractElement($CurrentBlock); 259 | } 260 | 261 | $Block['identified'] = true; 262 | } 263 | 264 | if ($this->isBlockContinuable($blockType)) 265 | { 266 | $Block['continuable'] = true; 267 | } 268 | 269 | $CurrentBlock = $Block; 270 | 271 | continue 2; 272 | } 273 | } 274 | 275 | # ~ 276 | 277 | if (isset($CurrentBlock) and $CurrentBlock['type'] === 'Paragraph') 278 | { 279 | $Block = $this->paragraphContinue($Line, $CurrentBlock); 280 | } 281 | 282 | if (isset($Block)) 283 | { 284 | $CurrentBlock = $Block; 285 | } 286 | else 287 | { 288 | if (isset($CurrentBlock)) 289 | { 290 | $Elements[] = $this->extractElement($CurrentBlock); 291 | } 292 | 293 | $CurrentBlock = $this->paragraph($Line); 294 | 295 | $CurrentBlock['identified'] = true; 296 | } 297 | } 298 | 299 | # ~ 300 | 301 | if (isset($CurrentBlock['continuable']) and $this->isBlockCompletable($CurrentBlock['type'])) 302 | { 303 | $methodName = 'block' . $CurrentBlock['type'] . 'Complete'; 304 | $CurrentBlock = $this->$methodName($CurrentBlock); 305 | } 306 | 307 | # ~ 308 | 309 | if (isset($CurrentBlock)) 310 | { 311 | $Elements[] = $this->extractElement($CurrentBlock); 312 | } 313 | 314 | # ~ 315 | 316 | return $Elements; 317 | } 318 | 319 | protected function extractElement(array $Component) 320 | { 321 | if ( ! isset($Component['element'])) 322 | { 323 | if (isset($Component['markup'])) 324 | { 325 | $Component['element'] = array('rawHtml' => $Component['markup']); 326 | } 327 | elseif (isset($Component['hidden'])) 328 | { 329 | $Component['element'] = array(); 330 | } 331 | } 332 | 333 | return $Component['element']; 334 | } 335 | 336 | protected function isBlockContinuable($Type) 337 | { 338 | return method_exists($this, 'block' . $Type . 'Continue'); 339 | } 340 | 341 | protected function isBlockCompletable($Type) 342 | { 343 | return method_exists($this, 'block' . $Type . 
'Complete'); 344 | } 345 | 346 | # 347 | # Code 348 | 349 | protected function blockCode($Line, $Block = null) 350 | { 351 | if (isset($Block) and $Block['type'] === 'Paragraph' and ! isset($Block['interrupted'])) 352 | { 353 | return; 354 | } 355 | 356 | if ($Line['indent'] >= 4) 357 | { 358 | $text = substr($Line['body'], 4); 359 | 360 | $Block = array( 361 | 'element' => array( 362 | 'name' => 'pre', 363 | 'element' => array( 364 | 'name' => 'code', 365 | 'text' => $text, 366 | ), 367 | ), 368 | ); 369 | 370 | return $Block; 371 | } 372 | } 373 | 374 | protected function blockCodeContinue($Line, $Block) 375 | { 376 | if ($Line['indent'] >= 4) 377 | { 378 | if (isset($Block['interrupted'])) 379 | { 380 | $Block['element']['element']['text'] .= str_repeat("\n", $Block['interrupted']); 381 | 382 | unset($Block['interrupted']); 383 | } 384 | 385 | $Block['element']['element']['text'] .= "\n"; 386 | 387 | $text = substr($Line['body'], 4); 388 | 389 | $Block['element']['element']['text'] .= $text; 390 | 391 | return $Block; 392 | } 393 | } 394 | 395 | protected function blockCodeComplete($Block) 396 | { 397 | return $Block; 398 | } 399 | 400 | # 401 | # Comment 402 | 403 | protected function blockComment($Line) 404 | { 405 | if ($this->markupEscaped or $this->safeMode) 406 | { 407 | return; 408 | } 409 | 410 | if (strpos($Line['text'], '<!--') === 0) 411 | { 412 | $Block = array( 413 | 'element' => array( 414 | 'rawHtml' => $Line['body'], 415 | 'autobreak' => true, 416 | ), 417 | ); 418 | 419 | if (strpos($Line['text'], '-->') !== false) 420 | { 421 | $Block['closed'] = true; 422 | } 423 | 424 | return $Block; 425 | } 426 | } 427 | 428 | protected function blockCommentContinue($Line, array $Block) 429 | { 430 | if (isset($Block['closed'])) 431 | { 432 | return; 433 | } 434 | 435 | $Block['element']['rawHtml'] .= "\n" . 
$Line['body']; 436 | 437 | if (strpos($Line['text'], '-->') !== false) 438 | { 439 | $Block['closed'] = true; 440 | } 441 | 442 | return $Block; 443 | } 444 | 445 | # 446 | # Fenced Code 447 | 448 | protected function blockFencedCode($Line) 449 | { 450 | $marker = $Line['text'][0]; 451 | 452 | $openerLength = strspn($Line['text'], $marker); 453 | 454 | if ($openerLength < 3) 455 | { 456 | return; 457 | } 458 | 459 | $infostring = trim(substr($Line['text'], $openerLength), "\t "); 460 | 461 | if (strpos($infostring, '`') !== false) 462 | { 463 | return; 464 | } 465 | 466 | $Element = array( 467 | 'name' => 'code', 468 | 'text' => '', 469 | ); 470 | 471 | if ($infostring !== '') 472 | { 473 | $Element['attributes'] = array('class' => "language-$infostring"); 474 | } 475 | 476 | $Block = array( 477 | 'char' => $marker, 478 | 'openerLength' => $openerLength, 479 | 'element' => array( 480 | 'name' => 'pre', 481 | 'element' => $Element, 482 | ), 483 | ); 484 | 485 | return $Block; 486 | } 487 | 488 | protected function blockFencedCodeContinue($Line, $Block) 489 | { 490 | if (isset($Block['complete'])) 491 | { 492 | return; 493 | } 494 | 495 | if (isset($Block['interrupted'])) 496 | { 497 | $Block['element']['element']['text'] .= str_repeat("\n", $Block['interrupted']); 498 | 499 | unset($Block['interrupted']); 500 | } 501 | 502 | if (($len = strspn($Line['text'], $Block['char'])) >= $Block['openerLength'] 503 | and chop(substr($Line['text'], $len), ' ') === '' 504 | ) { 505 | $Block['element']['element']['text'] = substr($Block['element']['element']['text'], 1); 506 | 507 | $Block['complete'] = true; 508 | 509 | return $Block; 510 | } 511 | 512 | $Block['element']['element']['text'] .= "\n" . 
$Line['body']; 513 | 514 | return $Block; 515 | } 516 | 517 | protected function blockFencedCodeComplete($Block) 518 | { 519 | return $Block; 520 | } 521 | 522 | # 523 | # Header 524 | 525 | protected function blockHeader($Line) 526 | { 527 | $level = strspn($Line['text'], '#'); 528 | 529 | if ($level > 6) 530 | { 531 | return; 532 | } 533 | 534 | $text = trim($Line['text'], '#'); 535 | 536 | if ($this->strictMode and isset($text[0]) and $text[0] !== ' ') 537 | { 538 | return; 539 | } 540 | 541 | $text = trim($text, ' '); 542 | 543 | $Block = array( 544 | 'element' => array( 545 | 'name' => 'h' . $level, 546 | 'handler' => array( 547 | 'function' => 'lineElements', 548 | 'argument' => $text, 549 | 'destination' => 'elements', 550 | ) 551 | ), 552 | ); 553 | 554 | return $Block; 555 | } 556 | 557 | # 558 | # List 559 | 560 | protected function blockList($Line, array $CurrentBlock = null) 561 | { 562 | list($name, $pattern) = $Line['text'][0] <= '-' ? array('ul', '[*+-]') : array('ol', '[0-9]{1,9}+[.\)]'); 563 | 564 | if (preg_match('/^('.$pattern.'([ ]++|$))(.*+)/', $Line['text'], $matches)) 565 | { 566 | $contentIndent = strlen($matches[2]); 567 | 568 | if ($contentIndent >= 5) 569 | { 570 | $contentIndent -= 1; 571 | $matches[1] = substr($matches[1], 0, -$contentIndent); 572 | $matches[3] = str_repeat(' ', $contentIndent) . $matches[3]; 573 | } 574 | elseif ($contentIndent === 0) 575 | { 576 | $matches[1] .= ' '; 577 | } 578 | 579 | $markerWithoutWhitespace = strstr($matches[1], ' ', true); 580 | 581 | $Block = array( 582 | 'indent' => $Line['indent'], 583 | 'pattern' => $pattern, 584 | 'data' => array( 585 | 'type' => $name, 586 | 'marker' => $matches[1], 587 | 'markerType' => ($name === 'ul' ? 
$markerWithoutWhitespace : substr($markerWithoutWhitespace, -1)), 588 | ), 589 | 'element' => array( 590 | 'name' => $name, 591 | 'elements' => array(), 592 | ), 593 | ); 594 | $Block['data']['markerTypeRegex'] = preg_quote($Block['data']['markerType'], '/'); 595 | 596 | if ($name === 'ol') 597 | { 598 | $listStart = ltrim(strstr($matches[1], $Block['data']['markerType'], true), '0') ?: '0'; 599 | 600 | if ($listStart !== '1') 601 | { 602 | if ( 603 | isset($CurrentBlock) 604 | and $CurrentBlock['type'] === 'Paragraph' 605 | and ! isset($CurrentBlock['interrupted']) 606 | ) { 607 | return; 608 | } 609 | 610 | $Block['element']['attributes'] = array('start' => $listStart); 611 | } 612 | } 613 | 614 | $Block['li'] = array( 615 | 'name' => 'li', 616 | 'handler' => array( 617 | 'function' => 'li', 618 | 'argument' => !empty($matches[3]) ? array($matches[3]) : array(), 619 | 'destination' => 'elements' 620 | ) 621 | ); 622 | 623 | $Block['element']['elements'] []= & $Block['li']; 624 | 625 | return $Block; 626 | } 627 | } 628 | 629 | protected function blockListContinue($Line, array $Block) 630 | { 631 | if (isset($Block['interrupted']) and empty($Block['li']['handler']['argument'])) 632 | { 633 | return null; 634 | } 635 | 636 | $requiredIndent = ($Block['indent'] + strlen($Block['data']['marker'])); 637 | 638 | if ($Line['indent'] < $requiredIndent 639 | and ( 640 | ( 641 | $Block['data']['type'] === 'ol' 642 | and preg_match('/^[0-9]++'.$Block['data']['markerTypeRegex'].'(?:[ ]++(.*)|$)/', $Line['text'], $matches) 643 | ) or ( 644 | $Block['data']['type'] === 'ul' 645 | and preg_match('/^'.$Block['data']['markerTypeRegex'].'(?:[ ]++(.*)|$)/', $Line['text'], $matches) 646 | ) 647 | ) 648 | ) { 649 | if (isset($Block['interrupted'])) 650 | { 651 | $Block['li']['handler']['argument'] []= ''; 652 | 653 | $Block['loose'] = true; 654 | 655 | unset($Block['interrupted']); 656 | } 657 | 658 | unset($Block['li']); 659 | 660 | $text = isset($matches[1]) ? 
$matches[1] : ''; 661 | 662 | $Block['indent'] = $Line['indent']; 663 | 664 | $Block['li'] = array( 665 | 'name' => 'li', 666 | 'handler' => array( 667 | 'function' => 'li', 668 | 'argument' => array($text), 669 | 'destination' => 'elements' 670 | ) 671 | ); 672 | 673 | $Block['element']['elements'] []= & $Block['li']; 674 | 675 | return $Block; 676 | } 677 | elseif ($Line['indent'] < $requiredIndent and $this->blockList($Line)) 678 | { 679 | return null; 680 | } 681 | 682 | if ($Line['text'][0] === '[' and $this->blockReference($Line)) 683 | { 684 | return $Block; 685 | } 686 | 687 | if ($Line['indent'] >= $requiredIndent) 688 | { 689 | if (isset($Block['interrupted'])) 690 | { 691 | $Block['li']['handler']['argument'] []= ''; 692 | 693 | $Block['loose'] = true; 694 | 695 | unset($Block['interrupted']); 696 | } 697 | 698 | $text = substr($Line['body'], $requiredIndent); 699 | 700 | $Block['li']['handler']['argument'] []= $text; 701 | 702 | return $Block; 703 | } 704 | 705 | if ( ! isset($Block['interrupted'])) 706 | { 707 | $text = preg_replace('/^[ ]{0,'.$requiredIndent.'}+/', '', $Line['body']); 708 | 709 | $Block['li']['handler']['argument'] []= $text; 710 | 711 | return $Block; 712 | } 713 | } 714 | 715 | protected function blockListComplete(array $Block) 716 | { 717 | if (isset($Block['loose'])) 718 | { 719 | foreach ($Block['element']['elements'] as &$li) 720 | { 721 | if (end($li['handler']['argument']) !== '') 722 | { 723 | $li['handler']['argument'] []= ''; 724 | } 725 | } 726 | } 727 | 728 | return $Block; 729 | } 730 | 731 | # 732 | # Quote 733 | 734 | protected function blockQuote($Line) 735 | { 736 | if (preg_match('/^>[ ]?+(.*+)/', $Line['text'], $matches)) 737 | { 738 | $Block = array( 739 | 'element' => array( 740 | 'name' => 'blockquote', 741 | 'handler' => array( 742 | 'function' => 'linesElements', 743 | 'argument' => (array) $matches[1], 744 | 'destination' => 'elements', 745 | ) 746 | ), 747 | ); 748 | 749 | return $Block; 750 | } 751 | } 752 
| 753 | protected function blockQuoteContinue($Line, array $Block) 754 | { 755 | if (isset($Block['interrupted'])) 756 | { 757 | return; 758 | } 759 | 760 | if ($Line['text'][0] === '>' and preg_match('/^>[ ]?+(.*+)/', $Line['text'], $matches)) 761 | { 762 | $Block['element']['handler']['argument'] []= $matches[1]; 763 | 764 | return $Block; 765 | } 766 | 767 | if ( ! isset($Block['interrupted'])) 768 | { 769 | $Block['element']['handler']['argument'] []= $Line['text']; 770 | 771 | return $Block; 772 | } 773 | } 774 | 775 | # 776 | # Rule 777 | 778 | protected function blockRule($Line) 779 | { 780 | $marker = $Line['text'][0]; 781 | 782 | if (substr_count($Line['text'], $marker) >= 3 and chop($Line['text'], " $marker") === '') 783 | { 784 | $Block = array( 785 | 'element' => array( 786 | 'name' => 'hr', 787 | ), 788 | ); 789 | 790 | return $Block; 791 | } 792 | } 793 | 794 | # 795 | # Setext 796 | 797 | protected function blockSetextHeader($Line, array $Block = null) 798 | { 799 | if ( ! isset($Block) or $Block['type'] !== 'Paragraph' or isset($Block['interrupted'])) 800 | { 801 | return; 802 | } 803 | 804 | if ($Line['indent'] < 4 and chop(chop($Line['text'], ' '), $Line['text'][0]) === '') 805 | { 806 | $Block['element']['name'] = $Line['text'][0] === '=' ? 
'h1' : 'h2'; 807 | 808 | return $Block; 809 | } 810 | } 811 | 812 | # 813 | # Markup 814 | 815 | protected function blockMarkup($Line) 816 | { 817 | if ($this->markupEscaped or $this->safeMode) 818 | { 819 | return; 820 | } 821 | 822 | if (preg_match('/^<[\/]?+(\w*)(?:[ ]*+'.$this->regexHtmlAttribute.')*+[ ]*+(\/)?>/', $Line['text'], $matches)) 823 | { 824 | $element = strtolower($matches[1]); 825 | 826 | if (in_array($element, $this->textLevelElements)) 827 | { 828 | return; 829 | } 830 | 831 | $Block = array( 832 | 'name' => $matches[1], 833 | 'element' => array( 834 | 'rawHtml' => $Line['text'], 835 | 'autobreak' => true, 836 | ), 837 | ); 838 | 839 | return $Block; 840 | } 841 | } 842 | 843 | protected function blockMarkupContinue($Line, array $Block) 844 | { 845 | if (isset($Block['closed']) or isset($Block['interrupted'])) 846 | { 847 | return; 848 | } 849 | 850 | $Block['element']['rawHtml'] .= "\n" . $Line['body']; 851 | 852 | return $Block; 853 | } 854 | 855 | # 856 | # Reference 857 | 858 | protected function blockReference($Line) 859 | { 860 | if (strpos($Line['text'], ']') !== false 861 | and preg_match('/^\[(.+?)\]:[ ]*+<?(\S+?)>?(?:[ ]+["\'(](.+)["\')])?[ ]*+$/', $Line['text'], $matches) 862 | ) { 863 | $id = strtolower($matches[1]); 864 | 865 | $Data = array( 866 | 'url' => $matches[2], 867 | 'title' => isset($matches[3]) ? $matches[3] : null, 868 | ); 869 | 870 | $this->DefinitionData['Reference'][$id] = $Data; 871 | 872 | $Block = array( 873 | 'element' => array(), 874 | ); 875 | 876 | return $Block; 877 | } 878 | } 879 | 880 | # 881 | # Table 882 | 883 | protected function blockTable($Line, array $Block = null) 884 | { 885 | if ( ! 
isset($Block) or $Block['type'] !== 'Paragraph' or isset($Block['interrupted'])) 886 | { 887 | return; 888 | } 889 | 890 | if ( 891 | strpos($Block['element']['handler']['argument'], '|') === false 892 | and strpos($Line['text'], '|') === false 893 | and strpos($Line['text'], ':') === false 894 | or strpos($Block['element']['handler']['argument'], "\n") !== false 895 | ) { 896 | return; 897 | } 898 | 899 | if (chop($Line['text'], ' -:|') !== '') 900 | { 901 | return; 902 | } 903 | 904 | $alignments = array(); 905 | 906 | $divider = $Line['text']; 907 | 908 | $divider = trim($divider); 909 | $divider = trim($divider, '|'); 910 | 911 | $dividerCells = explode('|', $divider); 912 | 913 | foreach ($dividerCells as $dividerCell) 914 | { 915 | $dividerCell = trim($dividerCell); 916 | 917 | if ($dividerCell === '') 918 | { 919 | return; 920 | } 921 | 922 | $alignment = null; 923 | 924 | if ($dividerCell[0] === ':') 925 | { 926 | $alignment = 'left'; 927 | } 928 | 929 | if (substr($dividerCell, - 1) === ':') 930 | { 931 | $alignment = $alignment === 'left' ? 
'center' : 'right'; 932 | } 933 | 934 | $alignments []= $alignment; 935 | } 936 | 937 | # ~ 938 | 939 | $HeaderElements = array(); 940 | 941 | $header = $Block['element']['handler']['argument']; 942 | 943 | $header = trim($header); 944 | $header = trim($header, '|'); 945 | 946 | $headerCells = explode('|', $header); 947 | 948 | if (count($headerCells) !== count($alignments)) 949 | { 950 | return; 951 | } 952 | 953 | foreach ($headerCells as $index => $headerCell) 954 | { 955 | $headerCell = trim($headerCell); 956 | 957 | $HeaderElement = array( 958 | 'name' => 'th', 959 | 'handler' => array( 960 | 'function' => 'lineElements', 961 | 'argument' => $headerCell, 962 | 'destination' => 'elements', 963 | ) 964 | ); 965 | 966 | if (isset($alignments[$index])) 967 | { 968 | $alignment = $alignments[$index]; 969 | 970 | $HeaderElement['attributes'] = array( 971 | 'style' => "text-align: $alignment;", 972 | ); 973 | } 974 | 975 | $HeaderElements []= $HeaderElement; 976 | } 977 | 978 | # ~ 979 | 980 | $Block = array( 981 | 'alignments' => $alignments, 982 | 'identified' => true, 983 | 'element' => array( 984 | 'name' => 'table', 985 | 'elements' => array(), 986 | ), 987 | ); 988 | 989 | $Block['element']['elements'] []= array( 990 | 'name' => 'thead', 991 | ); 992 | 993 | $Block['element']['elements'] []= array( 994 | 'name' => 'tbody', 995 | 'elements' => array(), 996 | ); 997 | 998 | $Block['element']['elements'][0]['elements'] []= array( 999 | 'name' => 'tr', 1000 | 'elements' => $HeaderElements, 1001 | ); 1002 | 1003 | return $Block; 1004 | } 1005 | 1006 | protected function blockTableContinue($Line, array $Block) 1007 | { 1008 | if (isset($Block['interrupted'])) 1009 | { 1010 | return; 1011 | } 1012 | 1013 | if (count($Block['alignments']) === 1 or $Line['text'][0] === '|' or strpos($Line['text'], '|')) 1014 | { 1015 | $Elements = array(); 1016 | 1017 | $row = $Line['text']; 1018 | 1019 | $row = trim($row); 1020 | $row = trim($row, '|'); 1021 | 1022 | 
preg_match_all('/(?:(\\\\[|])|[^|`]|`[^`]++`|`)++/', $row, $matches); 1023 | 1024 | $cells = array_slice($matches[0], 0, count($Block['alignments'])); 1025 | 1026 | foreach ($cells as $index => $cell) 1027 | { 1028 | $cell = trim($cell); 1029 | 1030 | $Element = array( 1031 | 'name' => 'td', 1032 | 'handler' => array( 1033 | 'function' => 'lineElements', 1034 | 'argument' => $cell, 1035 | 'destination' => 'elements', 1036 | ) 1037 | ); 1038 | 1039 | if (isset($Block['alignments'][$index])) 1040 | { 1041 | $Element['attributes'] = array( 1042 | 'style' => 'text-align: ' . $Block['alignments'][$index] . ';', 1043 | ); 1044 | } 1045 | 1046 | $Elements []= $Element; 1047 | } 1048 | 1049 | $Element = array( 1050 | 'name' => 'tr', 1051 | 'elements' => $Elements, 1052 | ); 1053 | 1054 | $Block['element']['elements'][1]['elements'] []= $Element; 1055 | 1056 | return $Block; 1057 | } 1058 | } 1059 | 1060 | # 1061 | # ~ 1062 | # 1063 | 1064 | protected function paragraph($Line) 1065 | { 1066 | return array( 1067 | 'type' => 'Paragraph', 1068 | 'element' => array( 1069 | 'name' => 'p', 1070 | 'handler' => array( 1071 | 'function' => 'lineElements', 1072 | 'argument' => $Line['text'], 1073 | 'destination' => 'elements', 1074 | ), 1075 | ), 1076 | ); 1077 | } 1078 | 1079 | protected function paragraphContinue($Line, array $Block) 1080 | { 1081 | if (isset($Block['interrupted'])) 1082 | { 1083 | return; 1084 | } 1085 | 1086 | $Block['element']['handler']['argument'] .= "\n".$Line['text']; 1087 | 1088 | return $Block; 1089 | } 1090 | 1091 | # 1092 | # Inline Elements 1093 | # 1094 | 1095 | protected $InlineTypes = array( 1096 | '!' 
=> array('Image'), 1097 | '&' => array('SpecialCharacter'), 1098 | '*' => array('Emphasis'), 1099 | ':' => array('Url'), 1100 | '<' => array('UrlTag', 'EmailTag', 'Markup'), 1101 | '[' => array('Link'), 1102 | '_' => array('Emphasis'), 1103 | '`' => array('Code'), 1104 | '~' => array('Strikethrough'), 1105 | '\\' => array('EscapeSequence'), 1106 | ); 1107 | 1108 | # ~ 1109 | 1110 | protected $inlineMarkerList = '!*_&[:<`~\\'; 1111 | 1112 | # 1113 | # ~ 1114 | # 1115 | 1116 | public function line($text, $nonNestables = array()) 1117 | { 1118 | return $this->elements($this->lineElements($text, $nonNestables)); 1119 | } 1120 | 1121 | protected function lineElements($text, $nonNestables = array()) 1122 | { 1123 | # standardize line breaks 1124 | $text = str_replace(array("\r\n", "\r"), "\n", $text); 1125 | 1126 | $Elements = array(); 1127 | 1128 | $nonNestables = (empty($nonNestables) 1129 | ? array() 1130 | : array_combine($nonNestables, $nonNestables) 1131 | ); 1132 | 1133 | # $excerpt is based on the first occurrence of a marker 1134 | 1135 | while ($excerpt = strpbrk($text, $this->inlineMarkerList)) 1136 | { 1137 | $marker = $excerpt[0]; 1138 | 1139 | $markerPosition = strlen($text) - strlen($excerpt); 1140 | 1141 | $Excerpt = array('text' => $excerpt, 'context' => $text); 1142 | 1143 | foreach ($this->InlineTypes[$marker] as $inlineType) 1144 | { 1145 | # check to see if the current inline type is nestable in the current context 1146 | 1147 | if (isset($nonNestables[$inlineType])) 1148 | { 1149 | continue; 1150 | } 1151 | 1152 | $Inline = $this->{"inline$inlineType"}($Excerpt); 1153 | 1154 | if ( ! isset($Inline)) 1155 | { 1156 | continue; 1157 | } 1158 | 1159 | # makes sure that the inline belongs to "our" marker 1160 | 1161 | if (isset($Inline['position']) and $Inline['position'] > $markerPosition) 1162 | { 1163 | continue; 1164 | } 1165 | 1166 | # sets a default inline position 1167 | 1168 | if ( ! 
isset($Inline['position'])) 1169 | { 1170 | $Inline['position'] = $markerPosition; 1171 | } 1172 | 1173 | # cause the new element to 'inherit' our non nestables 1174 | 1175 | 1176 | $Inline['element']['nonNestables'] = isset($Inline['element']['nonNestables']) 1177 | ? array_merge($Inline['element']['nonNestables'], $nonNestables) 1178 | : $nonNestables 1179 | ; 1180 | 1181 | # the text that comes before the inline 1182 | $unmarkedText = substr($text, 0, $Inline['position']); 1183 | 1184 | # compile the unmarked text 1185 | $InlineText = $this->inlineText($unmarkedText); 1186 | $Elements[] = $InlineText['element']; 1187 | 1188 | # compile the inline 1189 | $Elements[] = $this->extractElement($Inline); 1190 | 1191 | # remove the examined text 1192 | $text = substr($text, $Inline['position'] + $Inline['extent']); 1193 | 1194 | continue 2; 1195 | } 1196 | 1197 | # the marker does not belong to an inline 1198 | 1199 | $unmarkedText = substr($text, 0, $markerPosition + 1); 1200 | 1201 | $InlineText = $this->inlineText($unmarkedText); 1202 | $Elements[] = $InlineText['element']; 1203 | 1204 | $text = substr($text, $markerPosition + 1); 1205 | } 1206 | 1207 | $InlineText = $this->inlineText($text); 1208 | $Elements[] = $InlineText['element']; 1209 | 1210 | foreach ($Elements as &$Element) 1211 | { 1212 | if ( ! isset($Element['autobreak'])) 1213 | { 1214 | $Element['autobreak'] = false; 1215 | } 1216 | } 1217 | 1218 | return $Elements; 1219 | } 1220 | 1221 | # 1222 | # ~ 1223 | # 1224 | 1225 | protected function inlineText($text) 1226 | { 1227 | $Inline = array( 1228 | 'extent' => strlen($text), 1229 | 'element' => array(), 1230 | ); 1231 | 1232 | $Inline['element']['elements'] = self::pregReplaceElements( 1233 | $this->breaksEnabled ? 
'/[ ]*+\n/' : '/(?:[ ]*+\\\\|[ ]{2,}+)\n/', 1234 | array( 1235 | array('name' => 'br'), 1236 | array('text' => "\n"), 1237 | ), 1238 | $text 1239 | ); 1240 | 1241 | return $Inline; 1242 | } 1243 | 1244 | protected function inlineCode($Excerpt) 1245 | { 1246 | $marker = $Excerpt['text'][0]; 1247 | 1248 | if (preg_match('/^(['.$marker.']++)[ ]*+(.+?)[ ]*+(?<!['.$marker.'])\1(?!'.$marker.')/s', $Excerpt['text'], $matches)) 1249 | { 1250 | $text = $matches[2]; 1251 | $text = preg_replace('/[ ]*+\n/', ' ', $text); 1252 | 1253 | return array( 1254 | 'extent' => strlen($matches[0]), 1255 | 'element' => array( 1256 | 'name' => 'code', 1257 | 'text' => $text, 1258 | ), 1259 | ); 1260 | } 1261 | } 1262 | 1263 | protected function inlineEmailTag($Excerpt) 1264 | { 1265 | $hostnameLabel = '[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?'; 1266 | 1267 | $commonMarkEmail = '[a-zA-Z0-9.!#$%&\'*+\/=?^_`{|}~-]++@' 1268 | . $hostnameLabel . '(?:\.' . $hostnameLabel . ')*'; 1269 | 1270 | if (strpos($Excerpt['text'], '>') !== false 1271 | and preg_match("/^<((mailto:)?$commonMarkEmail)>/i", $Excerpt['text'], $matches) 1272 | ){ 1273 | $url = $matches[1]; 1274 | 1275 | if ( ! isset($matches[2])) 1276 | { 1277 | $url = "mailto:$url"; 1278 | } 1279 | 1280 | return array( 1281 | 'extent' => strlen($matches[0]), 1282 | 'element' => array( 1283 | 'name' => 'a', 1284 | 'text' => $matches[1], 1285 | 'attributes' => array( 1286 | 'href' => $url, 1287 | ), 1288 | ), 1289 | ); 1290 | } 1291 | } 1292 | 1293 | protected function inlineEmphasis($Excerpt) 1294 | { 1295 | if ( ! 
isset($Excerpt['text'][1])) 1296 | { 1297 | return; 1298 | } 1299 | 1300 | $marker = $Excerpt['text'][0]; 1301 | 1302 | if ($Excerpt['text'][1] === $marker and preg_match($this->StrongRegex[$marker], $Excerpt['text'], $matches)) 1303 | { 1304 | $emphasis = 'strong'; 1305 | } 1306 | elseif (preg_match($this->EmRegex[$marker], $Excerpt['text'], $matches)) 1307 | { 1308 | $emphasis = 'em'; 1309 | } 1310 | else 1311 | { 1312 | return; 1313 | } 1314 | 1315 | return array( 1316 | 'extent' => strlen($matches[0]), 1317 | 'element' => array( 1318 | 'name' => $emphasis, 1319 | 'handler' => array( 1320 | 'function' => 'lineElements', 1321 | 'argument' => $matches[1], 1322 | 'destination' => 'elements', 1323 | ) 1324 | ), 1325 | ); 1326 | } 1327 | 1328 | protected function inlineEscapeSequence($Excerpt) 1329 | { 1330 | if (isset($Excerpt['text'][1]) and in_array($Excerpt['text'][1], $this->specialCharacters)) 1331 | { 1332 | return array( 1333 | 'element' => array('rawHtml' => $Excerpt['text'][1]), 1334 | 'extent' => 2, 1335 | ); 1336 | } 1337 | } 1338 | 1339 | protected function inlineImage($Excerpt) 1340 | { 1341 | if ( ! 
isset($Excerpt['text'][1]) or $Excerpt['text'][1] !== '[') 1342 | { 1343 | return; 1344 | } 1345 | 1346 | $Excerpt['text']= substr($Excerpt['text'], 1); 1347 | 1348 | $Link = $this->inlineLink($Excerpt); 1349 | 1350 | if ($Link === null) 1351 | { 1352 | return; 1353 | } 1354 | 1355 | $Inline = array( 1356 | 'extent' => $Link['extent'] + 1, 1357 | 'element' => array( 1358 | 'name' => 'img', 1359 | 'attributes' => array( 1360 | 'src' => $Link['element']['attributes']['href'], 1361 | 'alt' => $Link['element']['handler']['argument'], 1362 | ), 1363 | 'autobreak' => true, 1364 | ), 1365 | ); 1366 | 1367 | $Inline['element']['attributes'] += $Link['element']['attributes']; 1368 | 1369 | unset($Inline['element']['attributes']['href']); 1370 | 1371 | return $Inline; 1372 | } 1373 | 1374 | protected function inlineLink($Excerpt) 1375 | { 1376 | $Element = array( 1377 | 'name' => 'a', 1378 | 'handler' => array( 1379 | 'function' => 'lineElements', 1380 | 'argument' => null, 1381 | 'destination' => 'elements', 1382 | ), 1383 | 'nonNestables' => array('Url', 'Link'), 1384 | 'attributes' => array( 1385 | 'href' => null, 1386 | 'title' => null, 1387 | ), 1388 | ); 1389 | 1390 | $extent = 0; 1391 | 1392 | $remainder = $Excerpt['text']; 1393 | 1394 | if (preg_match('/\[((?:[^][]++|(?R))*+)\]/', $remainder, $matches)) 1395 | { 1396 | $Element['handler']['argument'] = $matches[1]; 1397 | 1398 | $extent += strlen($matches[0]); 1399 | 1400 | $remainder = substr($remainder, $extent); 1401 | } 1402 | else 1403 | { 1404 | return; 1405 | } 1406 | 1407 | if (preg_match('/^[(]\s*+((?:[^ ()]++|[(][^ )]+[)])++)(?:[ ]+("[^"]*+"|\'[^\']*+\'))?\s*+[)]/', $remainder, $matches)) 1408 | { 1409 | $Element['attributes']['href'] = $matches[1]; 1410 | 1411 | if (isset($matches[2])) 1412 | { 1413 | $Element['attributes']['title'] = substr($matches[2], 1, - 1); 1414 | } 1415 | 1416 | $extent += strlen($matches[0]); 1417 | } 1418 | else 1419 | { 1420 | if (preg_match('/^\s*\[(.*?)\]/', $remainder, 
$matches)) 1421 | { 1422 | $definition = strlen($matches[1]) ? $matches[1] : $Element['handler']['argument']; 1423 | $definition = strtolower($definition); 1424 | 1425 | $extent += strlen($matches[0]); 1426 | } 1427 | else 1428 | { 1429 | $definition = strtolower($Element['handler']['argument']); 1430 | } 1431 | 1432 | if ( ! isset($this->DefinitionData['Reference'][$definition])) 1433 | { 1434 | return; 1435 | } 1436 | 1437 | $Definition = $this->DefinitionData['Reference'][$definition]; 1438 | 1439 | $Element['attributes']['href'] = $Definition['url']; 1440 | $Element['attributes']['title'] = $Definition['title']; 1441 | } 1442 | 1443 | return array( 1444 | 'extent' => $extent, 1445 | 'element' => $Element, 1446 | ); 1447 | } 1448 | 1449 | protected function inlineMarkup($Excerpt) 1450 | { 1451 | if ($this->markupEscaped or $this->safeMode or strpos($Excerpt['text'], '>') === false) 1452 | { 1453 | return; 1454 | } 1455 | 1456 | if ($Excerpt['text'][1] === '/' and preg_match('/^<\/\w[\w-]*+[ ]*+>/s', $Excerpt['text'], $matches)) 1457 | { 1458 | return array( 1459 | 'element' => array('rawHtml' => $matches[0]), 1460 | 'extent' => strlen($matches[0]), 1461 | ); 1462 | } 1463 | 1464 | if ($Excerpt['text'][1] === '!' 
and preg_match('/^<!---?[^>-](?:-?+[^-])*-->/s', $Excerpt['text'], $matches)) 1465 | { 1466 | return array( 1467 | 'element' => array('rawHtml' => $matches[0]), 1468 | 'extent' => strlen($matches[0]), 1469 | ); 1470 | } 1471 | 1472 | if ($Excerpt['text'][1] !== ' ' and preg_match('/^<\w[\w-]*+(?:[ ]*+'.$this->regexHtmlAttribute.')*+[ ]*+\/?>/s', $Excerpt['text'], $matches)) 1473 | { 1474 | return array( 1475 | 'element' => array('rawHtml' => $matches[0]), 1476 | 'extent' => strlen($matches[0]), 1477 | ); 1478 | } 1479 | } 1480 | 1481 | protected function inlineSpecialCharacter($Excerpt) 1482 | { 1483 | if (substr($Excerpt['text'], 1, 1) !== ' ' and strpos($Excerpt['text'], ';') !== false 1484 | and preg_match('/^&(#?+[0-9a-zA-Z]++);/', $Excerpt['text'], $matches) 1485 | ) { 1486 | return array( 1487 | 'element' => array('rawHtml' => '&' . $matches[1] . ';'), 1488 | 'extent' => strlen($matches[0]), 1489 | ); 1490 | } 1491 | 1492 | return; 1493 | } 1494 | 1495 | protected function inlineStrikethrough($Excerpt) 1496 | { 1497 | if ( ! isset($Excerpt['text'][1])) 1498 | { 1499 | return; 1500 | } 1501 | 1502 | if ($Excerpt['text'][1] === '~' and preg_match('/^~~(?=\S)(.+?)(?<=\S)~~/', $Excerpt['text'], $matches)) 1503 | { 1504 | return array( 1505 | 'extent' => strlen($matches[0]), 1506 | 'element' => array( 1507 | 'name' => 'del', 1508 | 'handler' => array( 1509 | 'function' => 'lineElements', 1510 | 'argument' => $matches[1], 1511 | 'destination' => 'elements', 1512 | ) 1513 | ), 1514 | ); 1515 | } 1516 | } 1517 | 1518 | protected function inlineUrl($Excerpt) 1519 | { 1520 | if ($this->urlsLinked !== true or ! 
isset($Excerpt['text'][2]) or $Excerpt['text'][2] !== '/') 1521 | { 1522 | return; 1523 | } 1524 | 1525 | if (strpos($Excerpt['context'], 'http') !== false 1526 | and preg_match('/\bhttps?+:[\/]{2}[^\s<]+\b\/*+/ui', $Excerpt['context'], $matches, PREG_OFFSET_CAPTURE) 1527 | ) { 1528 | $url = $matches[0][0]; 1529 | 1530 | $Inline = array( 1531 | 'extent' => strlen($matches[0][0]), 1532 | 'position' => $matches[0][1], 1533 | 'element' => array( 1534 | 'name' => 'a', 1535 | 'text' => $url, 1536 | 'attributes' => array( 1537 | 'href' => $url, 1538 | ), 1539 | ), 1540 | ); 1541 | 1542 | return $Inline; 1543 | } 1544 | } 1545 | 1546 | protected function inlineUrlTag($Excerpt) 1547 | { 1548 | if (strpos($Excerpt['text'], '>') !== false and preg_match('/^<(\w++:\/{2}[^ >]++)>/i', $Excerpt['text'], $matches)) 1549 | { 1550 | $url = $matches[1]; 1551 | 1552 | return array( 1553 | 'extent' => strlen($matches[0]), 1554 | 'element' => array( 1555 | 'name' => 'a', 1556 | 'text' => $url, 1557 | 'attributes' => array( 1558 | 'href' => $url, 1559 | ), 1560 | ), 1561 | ); 1562 | } 1563 | } 1564 | 1565 | # ~ 1566 | 1567 | protected function unmarkedText($text) 1568 | { 1569 | $Inline = $this->inlineText($text); 1570 | return $this->element($Inline['element']); 1571 | } 1572 | 1573 | # 1574 | # Handlers 1575 | # 1576 | 1577 | protected function handle(array $Element) 1578 | { 1579 | if (isset($Element['handler'])) 1580 | { 1581 | if (!isset($Element['nonNestables'])) 1582 | { 1583 | $Element['nonNestables'] = array(); 1584 | } 1585 | 1586 | if (is_string($Element['handler'])) 1587 | { 1588 | $function = $Element['handler']; 1589 | $argument = $Element['text']; 1590 | unset($Element['text']); 1591 | $destination = 'rawHtml'; 1592 | } 1593 | else 1594 | { 1595 | $function = $Element['handler']['function']; 1596 | $argument = $Element['handler']['argument']; 1597 | $destination = $Element['handler']['destination']; 1598 | } 1599 | 1600 | $Element[$destination] = 
$this->{$function}($argument, $Element['nonNestables']); 1601 | 1602 | if ($destination === 'handler') 1603 | { 1604 | $Element = $this->handle($Element); 1605 | } 1606 | 1607 | unset($Element['handler']); 1608 | } 1609 | 1610 | return $Element; 1611 | } 1612 | 1613 | protected function handleElementRecursive(array $Element) 1614 | { 1615 | return $this->elementApplyRecursive(array($this, 'handle'), $Element); 1616 | } 1617 | 1618 | protected function handleElementsRecursive(array $Elements) 1619 | { 1620 | return $this->elementsApplyRecursive(array($this, 'handle'), $Elements); 1621 | } 1622 | 1623 | protected function elementApplyRecursive($closure, array $Element) 1624 | { 1625 | $Element = call_user_func($closure, $Element); 1626 | 1627 | if (isset($Element['elements'])) 1628 | { 1629 | $Element['elements'] = $this->elementsApplyRecursive($closure, $Element['elements']); 1630 | } 1631 | elseif (isset($Element['element'])) 1632 | { 1633 | $Element['element'] = $this->elementApplyRecursive($closure, $Element['element']); 1634 | } 1635 | 1636 | return $Element; 1637 | } 1638 | 1639 | protected function elementApplyRecursiveDepthFirst($closure, array $Element) 1640 | { 1641 | if (isset($Element['elements'])) 1642 | { 1643 | $Element['elements'] = $this->elementsApplyRecursiveDepthFirst($closure, $Element['elements']); 1644 | } 1645 | elseif (isset($Element['element'])) 1646 | { 1647 | $Element['element'] = $this->elementApplyRecursiveDepthFirst($closure, $Element['element']); 1648 | } 1649 | 1650 | $Element = call_user_func($closure, $Element); 1651 | 1652 | return $Element; 1653 | } 1654 | 1655 | protected function elementsApplyRecursive($closure, array $Elements) 1656 | { 1657 | foreach ($Elements as &$Element) 1658 | { 1659 | $Element = $this->elementApplyRecursive($closure, $Element); 1660 | } 1661 | 1662 | return $Elements; 1663 | } 1664 | 1665 | protected function elementsApplyRecursiveDepthFirst($closure, array $Elements) 1666 | { 1667 | foreach ($Elements 
as &$Element) 1668 | { 1669 | $Element = $this->elementApplyRecursiveDepthFirst($closure, $Element); 1670 | } 1671 | 1672 | return $Elements; 1673 | } 1674 | 1675 | protected function element(array $Element) 1676 | { 1677 | if ($this->safeMode) 1678 | { 1679 | $Element = $this->sanitiseElement($Element); 1680 | } 1681 | 1682 | # identity map if element has no handler 1683 | $Element = $this->handle($Element); 1684 | 1685 | $hasName = isset($Element['name']); 1686 | 1687 | $markup = ''; 1688 | 1689 | if ($hasName) 1690 | { 1691 | $markup .= '<' . $Element['name']; 1692 | 1693 | if (isset($Element['attributes'])) 1694 | { 1695 | foreach ($Element['attributes'] as $name => $value) 1696 | { 1697 | if ($value === null) 1698 | { 1699 | continue; 1700 | } 1701 | 1702 | $markup .= " $name=\"".self::escape($value).'"'; 1703 | } 1704 | } 1705 | } 1706 | 1707 | $permitRawHtml = false; 1708 | 1709 | if (isset($Element['text'])) 1710 | { 1711 | $text = $Element['text']; 1712 | } 1713 | // very strongly consider an alternative if you're writing an 1714 | // extension 1715 | elseif (isset($Element['rawHtml'])) 1716 | { 1717 | $text = $Element['rawHtml']; 1718 | 1719 | $allowRawHtmlInSafeMode = isset($Element['allowRawHtmlInSafeMode']) && $Element['allowRawHtmlInSafeMode']; 1720 | $permitRawHtml = !$this->safeMode || $allowRawHtmlInSafeMode; 1721 | } 1722 | 1723 | $hasContent = isset($text) || isset($Element['element']) || isset($Element['elements']); 1724 | 1725 | if ($hasContent) 1726 | { 1727 | $markup .= $hasName ? '>' : ''; 1728 | 1729 | if (isset($Element['elements'])) 1730 | { 1731 | $markup .= $this->elements($Element['elements']); 1732 | } 1733 | elseif (isset($Element['element'])) 1734 | { 1735 | $markup .= $this->element($Element['element']); 1736 | } 1737 | else 1738 | { 1739 | if (!$permitRawHtml) 1740 | { 1741 | $markup .= self::escape($text, true); 1742 | } 1743 | else 1744 | { 1745 | $markup .= $text; 1746 | } 1747 | } 1748 | 1749 | $markup .= $hasName ? '</' . 
$Element['name'] . '>' : ''; 1750 | } 1751 | elseif ($hasName) 1752 | { 1753 | $markup .= ' />'; 1754 | } 1755 | 1756 | return $markup; 1757 | } 1758 | 1759 | protected function elements(array $Elements) 1760 | { 1761 | $markup = ''; 1762 | 1763 | $autoBreak = true; 1764 | 1765 | foreach ($Elements as $Element) 1766 | { 1767 | if (empty($Element)) 1768 | { 1769 | continue; 1770 | } 1771 | 1772 | $autoBreakNext = (isset($Element['autobreak']) 1773 | ? $Element['autobreak'] : isset($Element['name']) 1774 | ); 1775 | // (autobreak === false) covers both sides of an element 1776 | $autoBreak = !$autoBreak ? $autoBreak : $autoBreakNext; 1777 | 1778 | $markup .= ($autoBreak ? "\n" : '') . $this->element($Element); 1779 | $autoBreak = $autoBreakNext; 1780 | } 1781 | 1782 | $markup .= $autoBreak ? "\n" : ''; 1783 | 1784 | return $markup; 1785 | } 1786 | 1787 | # ~ 1788 | 1789 | protected function li($lines) 1790 | { 1791 | $Elements = $this->linesElements($lines); 1792 | 1793 | if ( ! in_array('', $lines) 1794 | and isset($Elements[0]) and isset($Elements[0]['name']) 1795 | and $Elements[0]['name'] === 'p' 1796 | ) { 1797 | unset($Elements[0]['name']); 1798 | } 1799 | 1800 | return $Elements; 1801 | } 1802 | 1803 | # 1804 | # AST Convenience 1805 | # 1806 | 1807 | /** 1808 | * Replace occurrences of $regexp with $Elements in $text. Return an array of 1809 | * elements representing the replacement. 
1810 | */ 1811 | protected static function pregReplaceElements($regexp, $Elements, $text) 1812 | { 1813 | $newElements = array(); 1814 | 1815 | while (preg_match($regexp, $text, $matches, PREG_OFFSET_CAPTURE)) 1816 | { 1817 | $offset = $matches[0][1]; 1818 | $before = substr($text, 0, $offset); 1819 | $after = substr($text, $offset + strlen($matches[0][0])); 1820 | 1821 | $newElements[] = array('text' => $before); 1822 | 1823 | foreach ($Elements as $Element) 1824 | { 1825 | $newElements[] = $Element; 1826 | } 1827 | 1828 | $text = $after; 1829 | } 1830 | 1831 | $newElements[] = array('text' => $text); 1832 | 1833 | return $newElements; 1834 | } 1835 | 1836 | # 1837 | # Deprecated Methods 1838 | # 1839 | 1840 | function parse($text) 1841 | { 1842 | $markup = $this->text($text); 1843 | 1844 | return $markup; 1845 | } 1846 | 1847 | protected function sanitiseElement(array $Element) 1848 | { 1849 | static $goodAttribute = '/^[a-zA-Z0-9][a-zA-Z0-9-_]*+$/'; 1850 | static $safeUrlNameToAtt = array( 1851 | 'a' => 'href', 1852 | 'img' => 'src', 1853 | ); 1854 | 1855 | if ( ! isset($Element['name'])) 1856 | { 1857 | unset($Element['attributes']); 1858 | return $Element; 1859 | } 1860 | 1861 | if (isset($safeUrlNameToAtt[$Element['name']])) 1862 | { 1863 | $Element = $this->filterUnsafeUrlInAttribute($Element, $safeUrlNameToAtt[$Element['name']]); 1864 | } 1865 | 1866 | if ( ! empty($Element['attributes'])) 1867 | { 1868 | foreach ($Element['attributes'] as $att => $val) 1869 | { 1870 | # filter out badly parsed attribute 1871 | if ( ! 
preg_match($goodAttribute, $att)) 1872 | { 1873 | unset($Element['attributes'][$att]); 1874 | } 1875 | # dump onevent attribute 1876 | elseif (self::striAtStart($att, 'on')) 1877 | { 1878 | unset($Element['attributes'][$att]); 1879 | } 1880 | } 1881 | } 1882 | 1883 | return $Element; 1884 | } 1885 | 1886 | protected function filterUnsafeUrlInAttribute(array $Element, $attribute) 1887 | { 1888 | foreach ($this->safeLinksWhitelist as $scheme) 1889 | { 1890 | if (self::striAtStart($Element['attributes'][$attribute], $scheme)) 1891 | { 1892 | return $Element; 1893 | } 1894 | } 1895 | 1896 | $Element['attributes'][$attribute] = str_replace(':', '%3A', $Element['attributes'][$attribute]); 1897 | 1898 | return $Element; 1899 | } 1900 | 1901 | # 1902 | # Static Methods 1903 | # 1904 | 1905 | protected static function escape($text, $allowQuotes = false) 1906 | { 1907 | return htmlspecialchars($text, $allowQuotes ? ENT_NOQUOTES : ENT_QUOTES, 'UTF-8'); 1908 | } 1909 | 1910 | protected static function striAtStart($string, $needle) 1911 | { 1912 | $len = strlen($needle); 1913 | 1914 | if ($len > strlen($string)) 1915 | { 1916 | return false; 1917 | } 1918 | else 1919 | { 1920 | return strtolower(substr($string, 0, $len)) === strtolower($needle); 1921 | } 1922 | } 1923 | 1924 | static function instance($name = 'default') 1925 | { 1926 | if (isset(self::$instances[$name])) 1927 | { 1928 | return self::$instances[$name]; 1929 | } 1930 | 1931 | $instance = new static(); 1932 | 1933 | self::$instances[$name] = $instance; 1934 | 1935 | return $instance; 1936 | } 1937 | 1938 | private static $instances = array(); 1939 | 1940 | # 1941 | # Fields 1942 | # 1943 | 1944 | protected $DefinitionData; 1945 | 1946 | # 1947 | # Read-Only 1948 | 1949 | protected $specialCharacters = array( 1950 | '\\', '`', '*', '_', '{', '}', '[', ']', '(', ')', '>', '#', '+', '-', '.', '!', '|', '~' 1951 | ); 1952 | 1953 | protected $StrongRegex = array( 1954 | '*' => 
'/^[*]{2}((?:\\\\\*|[^*]|[*][^*]*+[*])+?)[*]{2}(?![*])/s', 1955 | '_' => '/^__((?:\\\\_|[^_]|_[^_]*+_)+?)__(?!_)/us', 1956 | ); 1957 | 1958 | protected $EmRegex = array( 1959 | '*' => '/^[*]((?:\\\\\*|[^*]|[*][*][^*]+?[*][*])+?)[*](?![*])/s', 1960 | '_' => '/^_((?:\\\\_|[^_]|__[^_]*__)+?)_(?!_)\b/us', 1961 | ); 1962 | 1963 | protected $regexHtmlAttribute = '[a-zA-Z_:][\w:.-]*+(?:\s*+=\s*+(?:[^"\'=<>`\s]+|"[^"]*+"|\'[^\']*+\'))?+'; 1964 | 1965 | protected $voidElements = array( 1966 | 'area', 'base', 'br', 'col', 'command', 'embed', 'hr', 'img', 'input', 'link', 'meta', 'param', 'source', 1967 | ); 1968 | 1969 | protected $textLevelElements = array( 1970 | 'a', 'br', 'bdo', 'abbr', 'blink', 'nextid', 'acronym', 'basefont', 1971 | 'b', 'em', 'big', 'cite', 'small', 'spacer', 'listing', 1972 | 'i', 'rp', 'del', 'code', 'strike', 'marquee', 1973 | 'q', 'rt', 'ins', 'font', 'strong', 1974 | 's', 'tt', 'kbd', 'mark', 1975 | 'u', 'xm', 'sub', 'nobr', 1976 | 'sup', 'ruby', 1977 | 'var', 'span', 1978 | 'wbr', 'time', 1979 | ); 1980 | } 1981 | --------------------------------------------------------------------------------
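The sketch below is not part of Parsedown.php. It copies the loop from pregReplaceElements() into a free function (the name splitOnPattern is hypothetical) so that the splitting behaviour used by inlineText() can be tried in isolation. It assumes breaksEnabled is false, so only a trailing backslash or two or more trailing spaces before a newline count as a hard break.

```php
<?php

// Hypothetical standalone copy of the pregReplaceElements() loop, for illustration only.
function splitOnPattern($regexp, array $Elements, $text)
{
    $newElements = array();

    while (preg_match($regexp, $text, $matches, PREG_OFFSET_CAPTURE))
    {
        $offset = $matches[0][1];

        // text before the match becomes its own text element
        $newElements[] = array('text' => substr($text, 0, $offset));

        // the replacement elements stand in for the matched text
        foreach ($Elements as $Element)
        {
            $newElements[] = $Element;
        }

        // continue with whatever follows the match
        $text = substr($text, $offset + strlen($matches[0][0]));
    }

    $newElements[] = array('text' => $text);

    return $newElements;
}

// the pattern inlineText() passes when breaksEnabled is false
$hardBreak = '/(?:[ ]*+\\\\|[ ]{2,}+)\n/';

// "first", two trailing spaces, a newline, then "second"
$input = 'first' . str_repeat(' ', 2) . "\n" . 'second';

$parts = splitOnPattern(
    $hardBreak,
    array(array('name' => 'br'), array('text' => "\n")),
    $input
);

// $parts holds four elements: text "first", a br element, text "\n", text "second"
echo json_encode($parts), "\n";
```

Returning element arrays instead of a finished HTML string leaves escaping to element(), which is why the replacement can mix a named br element with plain text nodes.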