├── .gitignore ├── CHANGES.md ├── readme.txt ├── License.text ├── PHP Markdown Readme.text └── php-markdown-modified.php /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store -------------------------------------------------------------------------------- /CHANGES.md: -------------------------------------------------------------------------------- 1 | ### Markdown-Modified 2 | 3 | #### 1.0.1 4 | 5 | * renamed functions and the `MARKDOWN_VERSION` constant so that no errors occur if another version of `markdown.php` is loaded by another plugin 6 | 7 | #### 1.0 8 | 9 | * initial commit -------------------------------------------------------------------------------- /readme.txt: -------------------------------------------------------------------------------- 1 | Modified version of the original PHP Markdown plugin, made to work alongside Markdown on Save variants. 2 | 3 | All Markdown markup will be rendered in all posts regardless of the setting in the Markdown on Save variant. 4 | 5 | A better solution for those who have always used Markdown markup would be some method of changing the setting from **not use** Markdown on posts to **use** Markdown. 6 | -------------------------------------------------------------------------------- /License.text: -------------------------------------------------------------------------------- 1 | PHP Markdown 2 | Copyright (c) 2004-2009 Michel Fortin 3 | 4 | All rights reserved. 5 | 6 | Based on Markdown 7 | Copyright (c) 2003-2006 John Gruber 8 | 9 | All rights reserved. 10 | 11 | Redistribution and use in source and binary forms, with or without 12 | modification, are permitted provided that the following conditions are 13 | met: 14 | 15 | * Redistributions of source code must retain the above copyright notice, 16 | this list of conditions and the following disclaimer.
17 | 18 | * Redistributions in binary form must reproduce the above copyright 19 | notice, this list of conditions and the following disclaimer in the 20 | documentation and/or other materials provided with the distribution. 21 | 22 | * Neither the name "Markdown" nor the names of its contributors may 23 | be used to endorse or promote products derived from this software 24 | without specific prior written permission. 25 | 26 | This software is provided by the copyright holders and contributors "as 27 | is" and any express or implied warranties, including, but not limited 28 | to, the implied warranties of merchantability and fitness for a 29 | particular purpose are disclaimed. In no event shall the copyright owner 30 | or contributors be liable for any direct, indirect, incidental, special, 31 | exemplary, or consequential damages (including, but not limited to, 32 | procurement of substitute goods or services; loss of use, data, or 33 | profits; or business interruption) however caused and on any theory of 34 | liability, whether in contract, strict liability, or tort (including 35 | negligence or otherwise) arising in any way out of the use of this 36 | software, even if advised of the possibility of such damage. 37 | -------------------------------------------------------------------------------- /PHP Markdown Readme.text: -------------------------------------------------------------------------------- 1 | PHP Markdown 2 | ============ 3 | 4 | Version 1.0.1m - Sat 21 Jun 2008 5 | 6 | by Michel Fortin 7 | 8 | 9 | based on work by John Gruber 10 | 11 | 12 | 13 | Introduction 14 | ------------ 15 | 16 | Markdown is a text-to-HTML conversion tool for web writers. Markdown 17 | allows you to write using an easy-to-read, easy-to-write plain text 18 | format, then convert it to structurally valid XHTML (or HTML). 19 | 20 | "Markdown" is two things: a plain text markup syntax, and a software 21 | tool, written in Perl, that converts the plain text markup to HTML. 
22 | PHP Markdown is a port to PHP of the original Markdown program by 23 | John Gruber. 24 | 25 | PHP Markdown can work as a plug-in for WordPress and bBlog, as a 26 | modifier for the Smarty templating engine, or as a replacement for 27 | Textile formatting in any software that supports Textile. 28 | 29 | Full documentation of Markdown's syntax is available on John's 30 | Markdown page: 31 | 32 | 33 | Installation and Requirement 34 | ---------------------------- 35 | 36 | PHP Markdown requires PHP version 4.0.5 or later. 37 | 38 | 39 | ### WordPress ### 40 | 41 | PHP Markdown works with [WordPress][wp], version 1.2 or later. 42 | 43 | [wp]: http://wordpress.org/ 44 | 45 | 1. To use PHP Markdown with WordPress, place the "markdown.php" file 46 | in the "plugins" folder. This folder is located inside 47 | "wp-content" at the root of your site: 48 | 49 | (site home)/wp-content/plugins/ 50 | 51 | 2. Activate the plugin with the administrative interface of 52 | WordPress. In the "Plugins" section you will now find Markdown. 53 | To activate the plugin, click on the "Activate" button on the 54 | same line as Markdown. Your entries will now be formatted by 55 | PHP Markdown. 56 | 57 | 3. To post Markdown content, you'll first have to disable the 58 | "visual" editor in the User section of WordPress. 59 | 60 | You can configure PHP Markdown to not apply to the comments on your 61 | WordPress weblog. See the "Configuration" section below. 62 | 63 | It is not possible at this time to apply a different set of 64 | filters to different entries. All your entries will be formatted by 65 | PHP Markdown. This is a limitation of WordPress. If your old entries 66 | are written in HTML (as opposed to another formatting syntax, like 67 | Textile), they'll probably stay fine after installing Markdown. 68 | 69 | 70 | ### bBlog ### 71 | 72 | PHP Markdown also works with [bBlog][bb].
73 | 74 | [bb]: http://www.bblog.com/ 75 | 76 | To use PHP Markdown with bBlog, rename "markdown.php" to 77 | "modifier.markdown.php" and place the file in the "bBlog_plugins" 78 | folder. This folder is located inside the "bblog" directory of 79 | your site, like this: 80 | 81 | (site home)/bblog/bBlog_plugins/modifier.markdown.php 82 | 83 | Select "Markdown" as the "Entry Modifier" when you post a new 84 | entry. This setting will only apply to the entry you are editing. 85 | 86 | 87 | ### Replacing Textile in TextPattern ### 88 | 89 | [TextPattern][tp] uses [Textile][tx] to format your text. You can 90 | replace Textile with Markdown in TextPattern without having to change 91 | any code by using the *Textile Compatibility Mode*. This may work 92 | with other software that expects Textile too. 93 | 94 | [tx]: http://www.textism.com/tools/textile/ 95 | [tp]: http://www.textpattern.com/ 96 | 97 | 1. Rename the "markdown.php" file to "classTextile.php". This will 98 | make PHP Markdown behave as if it was the actual Textile parser. 99 | 100 | 2. Replace the "classTextile.php" file TextPattern installed in your 101 | web directory. It can be found in the "lib" directory: 102 | 103 | (site home)/textpattern/lib/ 104 | 105 | Contrary to Textile, Markdown does not convert quotes to curly ones 106 | and does not convert multiple hyphens (`--` and `---`) into en- and 107 | em-dashes. If you use PHP Markdown in Textile Compatibility Mode, you 108 | can solve this problem by installing the "smartypants.php" file from 109 | [PHP SmartyPants][psp] beside the "classTextile.php" file. The Textile 110 | Compatibility Mode function will use SmartyPants automatically without 111 | further modification. 112 | 113 | [psp]: http://michelf.com/projects/php-smartypants/ 114 | 115 | 116 | ### Updating Markdown in Other Programs ### 117 | 118 | Many web applications now ship with PHP Markdown, or have plugins to 119 | perform the conversion to HTML.
You can update PHP Markdown in many of 120 | these programs by swapping the old "markdown.php" file for the new one. 121 | 122 | Here is a short non-exhaustive list of some programs and where they 123 | hide the "markdown.php" file. 124 | 125 | | Program | Path to Markdown 126 | | ------- | ---------------- 127 | | [Pivot][] | `(site home)/pivot/includes/markdown/markdown.php` 128 | 129 | If you're unsure if you can do this with your application, ask the 130 | developer, or wait for the developer to update his application or 131 | plugin with the new version of PHP Markdown. 132 | 133 | [Pivot]: http://pivotlog.net/ 134 | 135 | 136 | ### In Your Own Programs ### 137 | 138 | You can use PHP Markdown easily in your current PHP program. Simply 139 | include the file and then call the Markdown function on the text you 140 | want to convert: 141 | 142 | include_once "markdown.php"; 143 | $my_html = Markdown($my_text); 144 | 145 | If you wish to use PHP Markdown with another text filter function 146 | built to parse HTML, you should filter the text *after* the Markdown 147 | function call. This is an example with [PHP SmartyPants][psp]: 148 | 149 | $my_html = SmartyPants(Markdown($my_text)); 150 | 151 | 152 | ### With Smarty ### 153 | 154 | If your program uses the [Smarty][sm] template engine, PHP Markdown 155 | can now be used as a modifier for your templates. Rename "markdown.php" 156 | to "modifier.markdown.php" and put it in your smarty plugins folder. 157 | 158 | [sm]: http://smarty.php.net/ 159 | 160 | If you are using MovableType 3.1 or later, the Smarty plugin folder is 161 | located at `(MT CGI root)/php/extlib/smarty/plugins`. This will allow 162 | Markdown to work on dynamic pages. 163 | 164 | 165 | Configuration 166 | ------------- 167 | 168 | By default, PHP Markdown produces XHTML output for tags with empty 169 | elements. E.g.: 170 | 171 | <br />
172 | 173 | Markdown can be configured to produce HTML-style tags; e.g.: 174 | 175 | <br>
176 | 177 | To do this, you must edit the "MARKDOWN_EMPTY_ELEMENT_SUFFIX" 178 | definition below the "Global default settings" header at the start of 179 | the "markdown.php" file. 180 | 181 | 182 | ### WordPress-Specific Settings ### 183 | 184 | By default, the Markdown plugin applies to both posts and comments on 185 | your WordPress weblog. To deactivate one or the other, edit the 186 | `MARKDOWN_WP_POSTS` or `MARKDOWN_WP_COMMENTS` definitions under the 187 | "WordPress settings" header at the start of the "markdown.php" file. 188 | 189 | 190 | Bugs 191 | ---- 192 | 193 | To file bug reports please send email to: 194 | 195 | 196 | Please include with your report: (1) the example input; (2) the output you 197 | expected; (3) the output PHP Markdown actually produced. 198 | 199 | 200 | Version History 201 | --------------- 202 | 203 | 1.0.1n (10 Oct 2009): 204 | 205 | * Enabled reference-style shortcut links. Now you can write reference-style 206 | links with fewer brackets: 207 | 208 | This is [my website]. 209 | 210 | [my website]: http://example.com/ 211 | 212 | This was added in the 1.0.2 betas, but commented out in the 1.0.1 branch, 213 | waiting for the feature to be made official. [But half of the other Markdown 214 | implementations are supporting this syntax][half], so it makes sense for 215 | compatibility's sake to allow it in PHP Markdown too. 216 | 217 | [half]: http://babelmark.bobtfish.net/?markdown=This+is+%5Bmy+website%5D.%0D%0A%09%09%0D%0A%5Bmy+website%5D%3A+http%3A%2F%2Fexample.com%2F%0D%0A&src=1&dest=2 218 | 219 | * Now accepting many valid email addresses in autolinks that were 220 | previously rejected, such as: 221 | 222 | 223 | 224 | <"abc@def"@example.com> 225 | <"Fred Bloggs"@example.com> 226 | 227 | 228 | * Now accepting spaces in URLs for inline and reference-style links. Such 229 | URLs need to be surrounded by angle brackets.
For instance: 230 | 231 | [link text]( "optional title") 232 | 233 | [link text][ref] 234 | [ref]: "optional title" 235 | 236 | There is still a quirk which may prevent this from working correctly with 237 | relative URLs in inline-style links however. 238 | 239 | * Fix for adjacent lists of different kinds where the second list could 240 | end up as a sublist of the first when not separated by an empty line. 241 | 242 | * Fixed a bug where inline-style links wouldn't be recognized when the link 243 | definition contains a line break between the url and the title. 244 | 245 | * Fixed a bug where tags whose names contain an underscore weren't parsed 246 | correctly. 247 | 248 | * Fixed some corner-cases mixing underscore-emphasis and asterisk-emphasis. 249 | 250 | 251 | 1.0.1m (21 Jun 2008): 252 | 253 | * Lists can now have empty items. 254 | 255 | * Rewrote the emphasis and strong emphasis parser to fix some issues 256 | with oddly placed and overlong markers. 257 | 258 | 259 | 1.0.1l (11 May 2008): 260 | 261 | * Now removing the UTF-8 BOM at the start of a document, if present. 262 | 263 | * Now accepting capitalized URI schemes (such as HTTP:) in automatic 264 | links, such as ``. 265 | 266 | * Fixed a problem where `` was seen as a horizontal 267 | rule instead of an automatic link. 268 | 269 | * Fixed an issue where some characters in Markdown-generated HTML 270 | attributes weren't properly escaped with entities. 271 | 272 | * Fix for code blocks as first element of a list item. Previously, 273 | this didn't create any code block for item 2: 274 | 275 | * Item 1 (regular paragraph) 276 | 277 | * Item 2 (code block) 278 | 279 | * A code block starting on the second line of a document wasn't seen 280 | as a code block. This has been fixed. 281 | 282 | * Added programmatically settable parser properties `predef_urls` and 283 | `predef_titles` for predefined URLs and titles for reference-style 284 | links.
To use this, your PHP code must call the parser this way: 285 | 286 | $parser = new Markdown_Parser; 287 | $parser->predef_urls = array('linkref' => 'http://example.com'); 288 | $html = $parser->transform($text); 289 | 290 | You can then use the URL as a normal link reference: 291 | 292 | [my link][linkref] 293 | [my link][linkRef] 294 | 295 | Reference names in the parser properties *must* be lowercase. 296 | Reference names in the Markdown source may have any case. 297 | 298 | * Added `setup` and `teardown` methods which can be used by subclassers 299 | as hook points to arrange the state of some parser variables before and 300 | after parsing. 301 | 302 | 303 | 1.0.1k (26 Sep 2007): 304 | 305 | * Fixed a problem introduced in 1.0.1i where three or more identical 306 | uppercase letters, as well as a few other symbols, would trigger 307 | a horizontal line. 308 | 309 | 310 | 1.0.1j (4 Sep 2007): 311 | 312 | * Fixed a problem introduced in 1.0.1i where the closing `code` and 313 | `pre` tags at the end of a code block were appearing in the wrong 314 | order. 315 | 316 | * Overriding configuration settings by defining constants from an 317 | external file before markdown.php is included is now possible without 318 | producing a PHP warning. 319 | 320 | 321 | 1.0.1i (31 Aug 2007): 322 | 323 | * Fixed a problem where an escaped backslash before a code span 324 | would prevent the code span from being created. This should now 325 | work as expected: 326 | 327 | Literal backslash: \\`code span` 328 | 329 | * Overall speed improvements, especially with long documents. 330 | 331 | 332 | 1.0.1h (3 Aug 2007): 333 | 334 | * Added two properties (`no_markup` and `no_entities`) to the parser 335 | allowing HTML tags and entities to be disabled. 336 | 337 | * Fix for a problem introduced in 1.0.1g where posting comments in 338 | WordPress would trigger PHP warnings and cause some markup to be 339 | incorrectly filtered by the kses filter in WordPress.
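The 1.0.1j entry above describes overriding a configuration setting by defining its constant before markdown.php is included. The following is a minimal sketch of that pattern, assuming only PHP's standard `define()` behaviour: a second `define()` of an existing constant fails and returns false, and the `@` operator hides the resulting diagnostic. It does not load the plugin file; the second `define()` call merely stands in for the library's own default-settings line, and `$applied` is a name introduced here for illustration.

```php
<?php
// Caller's script: set the override before the Markdown file would be included.
define('MARKDOWN_EMPTY_ELEMENT_SUFFIX', '>');

// Stand-in for the library's default-settings line, which runs later.
// The constant already exists, so define() fails and returns false;
// the @ operator suppresses the diagnostic PHP would otherwise emit.
$applied = @define('MARKDOWN_EMPTY_ELEMENT_SUFFIX', ' />');

// $applied is false here: the library default was not applied.
var_dump($applied);

// The caller's value is the one that stays in effect.
echo '<br' . MARKDOWN_EMPTY_ELEMENT_SUFFIX, "\n";
```

The same ordering applies to the other settings constants: the override has to be defined before the file is included, because a constant cannot be redefined afterwards.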
340 | 341 | 342 | 1.0.1g (3 Jul 2007): 343 | 344 | * Fix for PHP 5 compiled without the mbstring module. The previous fix to 345 | calculate the length of UTF-8 strings in `detab` when `mb_strlen` is 346 | not available only worked with PHP 4. 347 | 348 | * Fixed a problem with WordPress 2.x where full-content posts in RSS feeds 349 | were not processed correctly by Markdown. 350 | 351 | * Now supports URLs containing literal parentheses for inline links 352 | and images, such as: 353 | 354 | [WIMP](http://en.wikipedia.org/wiki/WIMP_(computing)) 355 | 356 | Such parentheses may be arbitrarily nested, but must be 357 | balanced. Unbalanced parentheses are allowed, however, when they 358 | are escaped or when the URL is enclosed in angle brackets `<>`. 359 | 360 | * Fixed a performance problem where the regular expression for strong 361 | emphasis introduced in version 1.0.1d could sometimes take long to process, 362 | give slightly wrong results, and in some circumstances could entirely 363 | remove the content of a whole paragraph. 364 | 365 | * Some change in version 1.0.1d made possible the incorrect nesting of 366 | anchors within each other. This is now fixed. 367 | 368 | * Fixed a rare issue where certain MD5 hashes in the content could 369 | be changed to their corresponding text. For instance, this: 370 | 371 | The MD5 value for "+" is "26b17225b626fb9238849fd60eabdf60". 372 | 373 | was incorrectly changed to this in previous versions of PHP Markdown: 374 | 375 |

The MD5 value for "+" is "+".

376 | 377 | * Now converts escaped characters to their equivalent numeric character 378 | references. 379 | 380 | This fixes an integration issue with SmartyPants and backslash escapes. 381 | Since Markdown and SmartyPants have some escapable characters in common, 382 | it was sometimes necessary to escape them twice. Previously, two 383 | backslashes were sometimes required to prevent Markdown from "eating" the 384 | backslash before SmartyPants sees it: 385 | 386 | Here are two hyphens: \\-- 387 | 388 | Now, only one backslash will do: 389 | 390 | Here are two hyphens: \-- 391 | 392 | 393 | 1.0.1f (7 Feb 2007): 394 | 395 | * Fixed an issue with WordPress where manually-entered excerpts, but 396 | not the auto-generated ones, would contain nested paragraphs. 397 | 398 | * Fixed an issue introduced in 1.0.1d where headers and blockquotes 399 | preceded too closely by a paragraph (not separated by a blank line) 400 | were incorrectly put inside the paragraph. 401 | 402 | * Fixed an issue introduced in 1.0.1d in the tokenizeHTML method where 403 | two consecutive code spans would be merged into one when together they 404 | form a valid tag in a multiline paragraph. 405 | 406 | * Fixed a long-prevailing issue where blank lines in code blocks would 407 | be doubled when the code block is in a list item. 408 | 409 | This was due to the list processing functions relying on artificially 410 | doubled blank lines to correctly determine when list items should 411 | contain block-level content. The list item processing model was thus 412 | changed to avoid the need for double blank lines. 413 | 414 | * Fixed an issue with `<% asp-style %>` instructions used as inline 415 | content where the opening `<` was encoded as `&lt;`. 416 | 417 | * Fixed a parse error occurring when PHP is configured to accept 418 | ASP-style delimiters as boundaries for PHP scripts. 419 | 420 | * Fixed a bug introduced in 1.0.1d where underscores in automatic links 421 | got swapped with emphasis tags.
422 | 423 | 424 | 1.0.1e (28 Dec 2006) 425 | 426 | * Added support for internationalized domain names for email addresses in 427 | automatic links. Improved the speed at which email addresses are converted 428 | to entities. Thanks to Milian Wolff for his optimisations. 429 | 430 | * Made deterministic the conversion to entities of email addresses in 431 | automatic links. This means that a given email address will always be 432 | encoded the same way. 433 | 434 | * PHP Markdown will now use its own function to calculate the length of a 435 | UTF-8 string in `detab` when `mb_strlen` is not available instead of 436 | giving a fatal error. 437 | 438 | 439 | 1.0.1d (1 Dec 2006) 440 | 441 | * Fixed a bug where inline images always had an empty title attribute. The 442 | title attribute is now present only when explicitly defined. 443 | 444 | * Link reference definitions can now have an empty title; previously, if the 445 | title was defined but left empty the link definition was ignored. This can 446 | be useful if you want an empty title attribute in images to hide the 447 | tooltip in Internet Explorer. 448 | 449 | * Made `detab` aware of UTF-8 characters. UTF-8 multi-byte sequences are now 450 | correctly mapped to one character instead of the number of bytes. 451 | 452 | * Fixed a small bug with WordPress where WordPress' default filter `wpautop` 453 | was not properly deactivated on comment text, resulting in hard line breaks 454 | where Markdown does not prescribe them. 455 | 456 | * Added a `TextileRestricted` method to the textile compatibility mode. There 457 | is no restriction however, as Markdown does not have a restricted mode at 458 | this point. This should make PHP Markdown work again in the latest 459 | versions of TextPattern. 460 | 461 | * Converted PHP Markdown to an object-oriented design. 462 | 463 | * Changed span and block gamut methods so that they loop over a 464 | customizable list of methods.
This makes subclassing the parser a more 465 | interesting option for creating syntax extensions. 466 | 467 | * Also added a "document" gamut loop which can be used to hook document-level 468 | methods (like for stripping link definitions). 469 | 470 | * Changed all methods which were inserting HTML code so that they now return 471 | a hashed representation of the code. New methods `hashSpan` and `hashBlock` 472 | are used to hash respectively span- and block-level generated content. This 473 | has a couple of significant effects: 474 | 475 | 1. It prevents invalid nesting of Markdown-generated elements which 476 | could occur with constructs like `*something [link*][1]`. 477 | 2. It prevents problems occurring with deeply nested lists on which 478 | paragraphs were ill-formed. 479 | 3. It removes the need to call `hashHTMLBlocks` twice during the 480 | block gamut. 481 | 482 | Hashes are turned back to HTML prior to output. 483 | 484 | * Made the block-level HTML parser smarter using a specially-crafted regular 485 | expression capable of handling nested tags. 486 | 487 | * Solved backtick issues in tag attributes by rewriting the HTML tokenizer to 488 | be aware of code spans. All these lines should work correctly now: 489 | 490 | bar 491 | bar 492 | `` 493 | 494 | * Changed the parsing of HTML comments to match simply from `<!--` to `-->` 495 | instead of using the more complicated SGML-style rule with paired `--`. 496 | This is how most browsers parse comments and how XML defines them too. 497 | 498 | * `
` has been added to the list of block-level elements and is now 499 | treated as an HTML block instead of being wrapped within paragraph tags. 500 | 501 | * Now only trim trailing newlines from code blocks, instead of trimming 502 | all trailing whitespace characters. 503 | 504 | * Fixed bug where this: 505 | 506 | [text](http://m.com "title" ) 507 | 508 | wasn't working as expected, because the parser wasn't allowing for spaces 509 | before the closing paren. 510 | 511 | * Filthy hack to support markdown='1' in div tags. 512 | 513 | * _DoAutoLinks() now supports the 'dict://' URL scheme. 514 | 515 | * PHP- and ASP-style processor instructions are now protected as 516 | raw HTML blocks. 517 | 518 | 519 | <% ... %> 520 | 521 | * Fix for escaped backticks still triggering code spans: 522 | 523 | There are two raw backticks here: \` and here: \`, not a code span 524 | 525 | 526 | 1.0.1c (9 Dec 2005) 527 | 528 | * Fixed a problem occurring with PHP 5.1.1 due to a small 529 | change to strings variable replacement behaviour in 530 | this version. 531 | 532 | 533 | 1.0.1b (6 Jun 2005) 534 | 535 | * Fixed a bug where an inline image followed by a reference link would 536 | give a completely wrong result. 537 | 538 | * Fix for escaped backticks still triggering code spans: 539 | 540 | There are two raw backticks here: \` and here: \`, not a code span 541 | 542 | * Fix for an ordered list following an unordered list, and the 543 | reverse. There is now a loop in _DoList that does the two 544 | separately. 545 | 546 | * Fix for nested sub-lists in list-paragraph mode. Previously we got 547 | a spurious extra level of `

` tags for something like this: 548 | 549 | * this 550 | 551 | * sub 552 | 553 | that 554 | 555 | * Fixed some incorrect behaviour with emphasis. This will now work 556 | as it should: 557 | 558 | *test **thing*** 559 | **test *thing*** 560 | ***thing* test** 561 | ***thing** test* 562 | 563 | Name: __________ 564 | Address: _______ 565 | 566 | * Correct a small bug in `_TokenizeHTML` where a Doctype declaration 567 | was not seen as HTML. 568 | 569 | * Major rewrite of the WordPress integration code that should 570 | correct many problems by preventing default WordPress filters from 571 | tampering with Markdown-formatted text. More details here: 572 | 573 | 574 | 575 | 1.0.1a (15 Apr 2005) 576 | 577 | * Fixed an issue where PHP warnings were triggered when converting 578 | text with list items running on PHP 4.0.6. This was coming from 579 | the `rtrim` function which did not support the second argument 580 | prior to version 4.1. Replaced by a regular expression. 581 | 582 | * Markdown now correctly filters post excerpts and comment 583 | excerpts in WordPress. 584 | 585 | * Automatic links and some code samples were "corrected" by 586 | the balanceTags filter in WordPress meant to ensure HTML 587 | is well formed. This new version of PHP Markdown postpones this 588 | filter so that it runs after Markdown. 589 | 590 | * Blockquote syntax and some code samples were stripped by 591 | a new WordPress 1.5 filter meant to remove unwanted HTML 592 | in comments. This new version of PHP Markdown postpones this 593 | filter so that it runs after Markdown. 594 | 595 | 596 | 1.0.1 (16 Dec 2004): 597 | 598 | * Changed the syntax rules for code blocks and spans. Previously, 599 | backslash escapes for special Markdown characters were processed 600 | everywhere other than within inline HTML tags. Now, the contents of 601 | code blocks and spans are no longer processed for backslash escapes.
602 | This means that code blocks and spans are now treated literally, 603 | with no special rules to worry about regarding backslashes. 604 | 605 | **IMPORTANT**: This breaks the syntax from all previous versions of 606 | Markdown. Code blocks and spans involving backslash characters will 607 | now generate different output than before. 608 | 609 | Implementation-wise, this change was made by moving the call to 610 | `_EscapeSpecialChars()` from the top-level `Markdown()` function to 611 | within `_RunSpanGamut()`. 612 | 613 | * Significant performance improvements in `_DoHeader`, `_Detab` 614 | and `_TokenizeHTML`. 615 | 616 | * Added `>`, `+`, and `-` to the list of backslash-escapable 617 | characters. These should have been done when these characters 618 | were added as unordered list item markers. 619 | 620 | * Inline links using `<` and `>` URL delimiters weren't working: 621 | 622 | like [this]() 623 | 624 | Fixed by moving `_DoAutoLinks()` after `_DoAnchors()` in 625 | `_RunSpanGamut()`. 626 | 627 | * Fixed bug where auto-links were being processed within code spans: 628 | 629 | like this: `` 630 | 631 | Fixed by moving `_DoAutoLinks()` from `_RunBlockGamut()` to 632 | `_RunSpanGamut()`. 633 | 634 | * Sort-of fixed a bug where lines in the middle of hard-wrapped 635 | paragraphs, which lines look like the start of a list item, 636 | would accidentally trigger the creation of a list. E.g. a 637 | paragraph that looked like this: 638 | 639 | I recommend upgrading to version 640 | 8. Oops, now this line is treated 641 | as a sub-list. 642 | 643 | This is fixed for top-level lists, but it can still happen for 644 | sub-lists. E.g., the following list item will not be parsed 645 | properly: 646 | 647 | * I recommend upgrading to version 648 | 8. Oops, now this line is treated 649 | as a sub-list. 650 | 651 | Given Markdown's list-creation rules, I'm not sure this can 652 | be fixed.
653 | 654 | * Fix for horizontal rules preceded by 2 or 3 spaces or followed by 655 | trailing spaces and tabs. 656 | 657 | * Standalone HTML comments are now handled; previously, they'd get 658 | wrapped in a spurious `

` tag. 659 | 660 | * `_HashHTMLBlocks()` now tolerates trailing spaces and tabs following 661 | HTML comments and `


` tags. 662 | 663 | * Changed special case pattern for hashing `
` tags in 664 | `_HashHTMLBlocks()` so that they must occur within three spaces 665 | of left margin. (With 4 spaces or a tab, they should be 666 | code blocks, but weren't before this fix.) 667 | 668 | * Auto-linked email addresses can now optionally contain 669 | a 'mailto:' protocol. I.e. these are equivalent: 670 | 671 | 672 | 673 | 674 | * Fixed annoying bug where nested lists would wind up with 675 | spurious (and invalid) `

` tags. 676 | 677 | * Changed `_StripLinkDefinitions()` so that link definitions must 678 | occur within three spaces of the left margin. Thus if you indent 679 | a link definition by four spaces or a tab, it will now be a code 680 | block. 681 | 682 | * You can now write empty links: 683 | 684 | [like this]() 685 | 686 | and they'll be turned into anchor tags with empty href attributes. 687 | This should have worked before, but didn't. 688 | 689 | * `***this***` and `___this___` are now turned into 690 | 691 | this 692 | 693 | Instead of 694 | 695 | this 696 | 697 | which isn't valid. 698 | 699 | * Fixed problem for links defined with urls that include parens, e.g.: 700 | 701 | [1]: http://sources.wikipedia.org/wiki/Middle_East_Policy_(Chomsky) 702 | 703 | "Chomsky" was being erroneously treated as the URL's title. 704 | 705 | * Double quotes in the title of an inline link used to give strange 706 | results (incorrectly made entities). Fixed. 707 | 708 | * Tabs are now correctly changed into spaces. Previously, only 709 | the first tab was converted. In code blocks, the second one was too, 710 | but was not always correctly aligned. 711 | 712 | * Fixed a bug where a tab character inserted after a quote on the same 713 | line could add a slash before the quotes. 714 | 715 | This is "before" [tab] and "after" a tab. 716 | 717 | Previously gave this result: 718 | 719 |

This is \"before\" [tab] and "after" a tab.

720 | 721 | * Removed a call to `htmlentities`. This fixes a bug where multibyte 722 | characters present in the title of a link reference could lead to 723 | invalid utf-8 characters. 724 | 725 | * Changed a regular expression in `_TokenizeHTML` that could lead to 726 | a segmentation fault with PHP 4.3.8 on Linux. 727 | 728 | * Fixed some notices that could show up if PHP error reporting 729 | E_NOTICE flag was set. 730 | 731 | 732 | Copyright and License 733 | --------------------- 734 | 735 | PHP Markdown 736 | Copyright (c) 2004-2009 Michel Fortin 737 | 738 | All rights reserved. 739 | 740 | Based on Markdown 741 | Copyright (c) 2003-2006 John Gruber 742 | 743 | All rights reserved. 744 | 745 | Redistribution and use in source and binary forms, with or without 746 | modification, are permitted provided that the following conditions are 747 | met: 748 | 749 | * Redistributions of source code must retain the above copyright notice, 750 | this list of conditions and the following disclaimer. 751 | 752 | * Redistributions in binary form must reproduce the above copyright 753 | notice, this list of conditions and the following disclaimer in the 754 | documentation and/or other materials provided with the distribution. 755 | 756 | * Neither the name "Markdown" nor the names of its contributors may 757 | be used to endorse or promote products derived from this software 758 | without specific prior written permission. 759 | 760 | This software is provided by the copyright holders and contributors "as 761 | is" and any express or implied warranties, including, but not limited 762 | to, the implied warranties of merchantability and fitness for a 763 | particular purpose are disclaimed. 
In no event shall the copyright owner 764 | or contributors be liable for any direct, indirect, incidental, special, 765 | exemplary, or consequential damages (including, but not limited to, 766 | procurement of substitute goods or services; loss of use, data, or 767 | profits; or business interruption) however caused and on any theory of 768 | liability, whether in contract, strict liability, or tort (including 769 | negligence or otherwise) arising in any way out of the use of this 770 | software, even if advised of the possibility of such damage. 771 | -------------------------------------------------------------------------------- /php-markdown-modified.php: -------------------------------------------------------------------------------- 1 | 8 | # 9 | # Original Markdown 10 | # Copyright (c) 2004-2006 John Gruber 11 | # 12 | # 13 | 14 | 15 | define( 'AJF_MARKDOWN_VERSION', "1.0.1o" ); # Sun 8 Jan 2012 16 | 17 | 18 | # 19 | # Global default settings: 20 | # 21 | 22 | # Change to ">" for HTML output 23 | @define( 'MARKDOWN_EMPTY_ELEMENT_SUFFIX', " />"); 24 | 25 | # Define the width of a tab for code blocks. 26 | @define( 'MARKDOWN_TAB_WIDTH', 4 ); 27 | 28 | 29 | # 30 | # WordPress settings: 31 | # 32 | 33 | # Change to false to remove Markdown from posts and/or comments. 34 | @define( 'MARKDOWN_WP_POSTS', true ); 35 | @define( 'MARKDOWN_WP_COMMENTS', true ); 36 | 37 | 38 | 39 | ### Standard Function Interface ### 40 | 41 | @define( 'MARKDOWN_PARSER_CLASS', 'AJF_Markdown_Parser' ); 42 | 43 | function AJF_Markdown($text) { 44 | # 45 | # Initialize the parser and return the result of its transform method. 46 | # 47 | # Setup static parser variable. 48 | static $parser; 49 | if (!isset($parser)) { 50 | $parser_class = MARKDOWN_PARSER_CLASS; 51 | $parser = new $parser_class; 52 | } 53 | 54 | # Transform text using parser. 
55 | return $parser->transform($text); 56 | } 57 | 58 | 59 | ### WordPress Plugin Interface ### 60 | 61 | /* 62 | Plugin Name: Markdown - Modified 63 | Plugin URI: https://github.com/afragen/php-markdown-modified 64 | GitHub Plugin URI: https://github.com/afragen/php-markdown-modified 65 | Description: Modified to work along side Markdown on Save variants. All posts containing Markdown are rendered regardless of Markdown on Save variant setting. Using PHP Markdown 1.0.1o 66 | Version: 1.0.1 67 | Author: Andy Fragen 68 | */ 69 | 70 | 71 | if (isset($wp_version)) { 72 | # More details about how it works here: 73 | # 74 | 75 | # Post content and excerpts 76 | # - Remove WordPress paragraph generator. 77 | # - Run Markdown on excerpt, then remove all tags. 78 | # - Add paragraph tag around the excerpt, but remove it for the excerpt rss. 79 | if (MARKDOWN_WP_POSTS) { 80 | remove_filter('the_content', 'wpautop'); 81 | remove_filter('the_content_rss', 'wpautop'); 82 | remove_filter('the_excerpt', 'wpautop'); 83 | add_filter('the_content', 'AJF_Markdown', 6); 84 | add_filter('the_content_rss', 'AJF_Markdown', 6); 85 | add_filter('get_the_excerpt', 'AJF_Markdown', 6); 86 | add_filter('get_the_excerpt', 'trim', 7); 87 | add_filter('the_excerpt', 'ajf_mdwp_add_p'); 88 | add_filter('the_excerpt_rss', 'ajf_mdwp_strip_p'); 89 | 90 | remove_filter('content_save_pre', 'balanceTags', 50); 91 | remove_filter('excerpt_save_pre', 'balanceTags', 50); 92 | add_filter('the_content', 'balanceTags', 50); 93 | add_filter('get_the_excerpt', 'balanceTags', 9); 94 | } 95 | 96 | # Comments 97 | # - Remove WordPress paragraph generator. 98 | # - Remove WordPress auto-link generator. 99 | # - Scramble important tags before passing them to the kses filter. 100 | # - Run Markdown on excerpt then remove paragraph tags. 
101 | if (MARKDOWN_WP_COMMENTS) { 102 | remove_filter('comment_text', 'wpautop', 30); 103 | remove_filter('comment_text', 'make_clickable'); 104 | add_filter('pre_comment_content', 'AJF_Markdown', 6); 105 | add_filter('pre_comment_content', 'ajf_mdwp_hide_tags', 8); 106 | add_filter('pre_comment_content', 'ajf_mdwp_show_tags', 12); 107 | add_filter('get_comment_text', 'AJF_Markdown', 6); 108 | add_filter('get_comment_excerpt', 'AJF_Markdown', 6); 109 | add_filter('get_comment_excerpt', 'ajf_mdwp_strip_p', 7); 110 | 111 | global $mdwp_hidden_tags, $mdwp_placeholders; 112 | $mdwp_hidden_tags = explode(' ', 113 | '<p> </p> <pre> </pre> <ol> </ol> <ul> </ul> <li> </li>'); 114 | $mdwp_placeholders = explode(' ', str_rot13( 115 | 'pEj07ZbbBZ U1kqgh4w4p pre2zmeN6K QTi31t9pre ol0MP1jzJR '. 116 | 'ML5IjmbRol ulANi1NsGY J7zRLJqPul liA8ctl16T K9nhooUHli')); 117 | } 118 | 119 | function ajf_mdwp_add_p($text) { 120 | if (!preg_match('{^$|^<(p|ul|ol|dl|pre|blockquote)>}i', $text)) { 121 | $text = '<p>'.$text.'</p>'; 122 | $text = preg_replace('{\n{2,}}', "</p>\n\n<p>", $text); 123 | } 124 | return $text; 125 | } 126 | 127 | function ajf_mdwp_strip_p($t) { return preg_replace('{</?p>}i', '', $t); } 128 | 129 | function ajf_mdwp_hide_tags($text) { 130 | global $mdwp_hidden_tags, $mdwp_placeholders; 131 | return str_replace($mdwp_hidden_tags, $mdwp_placeholders, $text); 132 | } 133 | function ajf_mdwp_show_tags($text) { 134 | global $mdwp_hidden_tags, $mdwp_placeholders; 135 | return str_replace($mdwp_placeholders, $mdwp_hidden_tags, $text); 136 | } 137 | } 138 | 139 | 140 | ### bBlog Plugin Info ### 141 | 142 | function AJF_identify_modifier_markdown() { 143 | return array( 144 | 'name' => 'markdown', 145 | 'type' => 'modifier', 146 | 'nicename' => 'Markdown', 147 | 'description' => 'A text-to-HTML conversion tool for web writers', 148 | 'authors' => 'Michel Fortin and John Gruber', 149 | 'licence' => 'BSD-like', 150 | 'version' => AJF_MARKDOWN_VERSION, 151 | 'help' => 'Markdown syntax allows you to write using an easy-to-read, easy-to-write plain text format. Based on the original Perl version by John Gruber. More...' 152 | ); 153 | } 154 | 155 | 156 | ### Smarty Modifier Interface ### 157 | 158 | function AJF_smarty_modifier_markdown($text) { 159 | return AJF_Markdown($text); 160 | } 161 | 162 | 163 | ### Textile Compatibility Mode ### 164 | 165 | # Rename this file to "classTextile.php" and it can replace Textile everywhere. 166 | 167 | if (strcasecmp(substr(__FILE__, -16), "classTextile.php") == 0) { 168 | # Try to include PHP SmartyPants. Should be in the same directory. 169 | @include_once 'smartypants.php'; 170 | # Fake Textile class. It calls Markdown instead. 171 | class Textile { 172 | function TextileThis($text, $lite='', $encode='') { 173 | if ($lite == '' && $encode == '') $text = AJF_Markdown($text); 174 | if (function_exists('SmartyPants')) $text = SmartyPants($text); 175 | return $text; 176 | } 177 | # Fake restricted version: restrictions are not supported for now.
178 | function TextileRestricted($text, $lite='', $noimage='') { 179 | return $this->TextileThis($text, $lite); 180 | } 181 | # Workaround to ensure compatibility with TextPattern 4.0.3. 182 | function blockLite($text) { return $text; } 183 | } 184 | } 185 | 186 | 187 | 188 | # 189 | # Markdown Parser Class 190 | # 191 | 192 | class AJF_Markdown_Parser { 193 | 194 | # Regex to match balanced [brackets]. 195 | # Needed to insert a maximum bracket depth while converting to PHP. 196 | var $nested_brackets_depth = 6; 197 | var $nested_brackets_re; 198 | 199 | var $nested_url_parenthesis_depth = 4; 200 | var $nested_url_parenthesis_re; 201 | 202 | # Table of hash values for escaped characters: 203 | var $escape_chars = '\`*_{}[]()>#+-.!'; 204 | var $escape_chars_re; 205 | 206 | # Change to ">" for HTML output. 207 | var $empty_element_suffix = MARKDOWN_EMPTY_ELEMENT_SUFFIX; 208 | var $tab_width = MARKDOWN_TAB_WIDTH; 209 | 210 | # Change to `true` to disallow markup or entities. 211 | var $no_markup = false; 212 | var $no_entities = false; 213 | 214 | # Predefined urls and titles for reference links and images. 215 | var $predef_urls = array(); 216 | var $predef_titles = array(); 217 | 218 | 219 | function __construct() { 220 | # 221 | # Constructor function. Initialize appropriate member variables. 222 | # 223 | $this->_initDetab(); 224 | $this->prepareItalicsAndBold(); 225 | 226 | $this->nested_brackets_re = 227 | str_repeat('(?>[^\[\]]+|\[', $this->nested_brackets_depth). 228 | str_repeat('\])*', $this->nested_brackets_depth); 229 | 230 | $this->nested_url_parenthesis_re = 231 | str_repeat('(?>[^()\s]+|\(', $this->nested_url_parenthesis_depth). 232 | str_repeat('(?>\)))*', $this->nested_url_parenthesis_depth); 233 | 234 | $this->escape_chars_re = '['.preg_quote($this->escape_chars).']'; 235 | 236 | # Sort document, block, and span gamut in ascending priority order.
237 | asort($this->document_gamut); 238 | asort($this->block_gamut); 239 | asort($this->span_gamut); 240 | } 241 | 242 | 243 | # Internal hashes used during transformation. 244 | var $urls = array(); 245 | var $titles = array(); 246 | var $html_hashes = array(); 247 | 248 | # Status flag to avoid invalid nesting. 249 | var $in_anchor = false; 250 | 251 | 252 | function setup() { 253 | # 254 | # Called before the transformation process starts to set up parser 255 | # states. 256 | # 257 | # Clear global hashes. 258 | $this->urls = $this->predef_urls; 259 | $this->titles = $this->predef_titles; 260 | $this->html_hashes = array(); 261 | 262 | $this->in_anchor = false; 263 | } 264 | 265 | function teardown() { 266 | # 267 | # Called after the transformation process to clear any variable 268 | # which may be taking up memory unnecessarily. 269 | # 270 | $this->urls = array(); 271 | $this->titles = array(); 272 | $this->html_hashes = array(); 273 | } 274 | 275 | 276 | function transform($text) { 277 | # 278 | # Main function. Performs some preprocessing on the input text 279 | # and passes it through the document gamut. 280 | # 281 | $this->setup(); 282 | 283 | # Remove UTF-8 BOM and marker character in input, if present. 284 | $text = preg_replace('{^\xEF\xBB\xBF|\x1A}', '', $text); 285 | 286 | # Standardize line endings: 287 | # DOS to Unix and Mac to Unix 288 | $text = preg_replace('{\r\n?}', "\n", $text); 289 | 290 | # Make sure $text ends with a couple of newlines: 291 | $text .= "\n\n"; 292 | 293 | # Convert all tabs to spaces. 294 | $text = $this->detab($text); 295 | 296 | # Turn block-level HTML blocks into hash entries 297 | $text = $this->hashHTMLBlocks($text); 298 | 299 | # Strip any lines consisting only of spaces and tabs. 300 | # This makes subsequent regexen easier to write, because we can 301 | # match consecutive blank lines with /\n+/ instead of something 302 | # contorted like /[ ]*\n+/ .
303 | $text = preg_replace('/^[ ]+$/m', '', $text); 304 | 305 | # Run document gamut methods. 306 | foreach ($this->document_gamut as $method => $priority) { 307 | $text = $this->$method($text); 308 | } 309 | 310 | $this->teardown(); 311 | 312 | return $text . "\n"; 313 | } 314 | 315 | var $document_gamut = array( 316 | # Strip link definitions, store in hashes. 317 | "stripLinkDefinitions" => 20, 318 | 319 | "runBasicBlockGamut" => 30, 320 | ); 321 | 322 | 323 | function stripLinkDefinitions($text) { 324 | # 325 | # Strips link definitions from text, stores the URLs and titles in 326 | # hash references. 327 | # 328 | $less_than_tab = $this->tab_width - 1; 329 | 330 | # Link defs are in the form: ^[id]: url "optional title" 331 | $text = preg_replace_callback('{ 332 | ^[ ]{0,'.$less_than_tab.'}\[(.+)\][ ]?: # id = $1 333 | [ ]* 334 | \n? # maybe *one* newline 335 | [ ]* 336 | (?: 337 | <(.+?)> # url = $2 338 | | 339 | (\S+?) # url = $3 340 | ) 341 | [ ]* 342 | \n? # maybe one newline 343 | [ ]* 344 | (?: 345 | (?<=\s) # lookbehind for whitespace 346 | ["(] 347 | (.*?) # title = $4 348 | [")] 349 | [ ]* 350 | )? # title is optional 351 | (?:\n+|\Z) 352 | }xm', 353 | array(&$this, '_stripLinkDefinitions_callback'), 354 | $text); 355 | return $text; 356 | } 357 | function _stripLinkDefinitions_callback($matches) { 358 | $link_id = strtolower($matches[1]); 359 | $url = $matches[2] == '' ? $matches[3] : $matches[2]; 360 | $this->urls[$link_id] = $url; 361 | $this->titles[$link_id] =& $matches[4]; 362 | return ''; # String that will replace the block 363 | } 364 | 365 | 366 | function hashHTMLBlocks($text) { 367 | if ($this->no_markup) return $text; 368 | 369 | $less_than_tab = $this->tab_width - 1; 370 | 371 | # Hashify HTML blocks: 372 | # We only want to do this for block-level HTML tags, such as headers, 373 | # lists, and tables. That's because we still want to wrap <p>s around 374 | # "paragraphs" that are wrapped in non-block-level tags, such as anchors, 375 | # phrase emphasis, and spans. The list of tags we're looking for is 376 | # hard-coded: 377 | # 378 | # * List "a" is made of tags which can be both inline or block-level. 379 | # These will be treated block-level when the start tag is alone on 380 | # its line, otherwise they're not matched here and will be taken as 381 | # inline later. 382 | # * List "b" is made of tags which are always block-level; 383 | # 384 | $block_tags_a_re = 'ins|del'; 385 | $block_tags_b_re = 'p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|address|'. 386 | 'script|noscript|form|fieldset|iframe|math'; 387 | 388 | # Regular expression for the content of a block tag. 389 | $nested_tags_level = 4; 390 | $attr = ' 391 | (?> # optional tag attributes 392 | \s # starts with whitespace 393 | (?> 394 | [^>"/]+ # text outside quotes 395 | | 396 | /+(?!>) # slash not followed by ">" 397 | | 398 | "[^"]*" # text inside double quotes (tolerate ">") 399 | | 400 | \'[^\']*\' # text inside single quotes (tolerate ">") 401 | )* 402 | )? 403 | '; 404 | $content = 405 | str_repeat(' 406 | (?> 407 | [^<]+ # content without tag 408 | | 409 | <\2 # nested opening tag 410 | '.$attr.' # attributes 411 | (?> 412 | /> 413 | | 414 | >', $nested_tags_level). # end of opening tag 415 | '.*?'. # last level nested tag content 416 | str_repeat(' 417 | </\2\s*> # closing nested tag 418 | ) 419 | | 420 | <(?!/\2\s*> # other tags with a different name 421 | ) 422 | )*', 423 | $nested_tags_level); 424 | $content2 = str_replace('\2', '\3', $content); 425 | 426 | # First, look for nested blocks, e.g.: 427 | # <div> 428 | # <div> 429 | # tags for inner block must be indented. 430 | # </div> 431 | # </div> 432 | # 433 | # The outermost tags must start at the left margin for this to match, and 434 | # the inner nested divs must be indented. 435 | # We need to do this before the next, more liberal match, because the next 436 | # match will start at the first `<div>` and stop at the first `</div>`. 437 | $text = preg_replace_callback('{(?> 438 | (?> 439 | (?<=\n\n) # Starting after a blank line 440 | | # or 441 | \A\n? # the beginning of the doc 442 | ) 443 | ( # save in $1 444 | 445 | # Match from `\n<tag>` to `</tag>\n`, handling nested tags 446 | # in between. 447 | 448 | [ ]{0,'.$less_than_tab.'} 449 | <('.$block_tags_b_re.')# start tag = $2 450 | '.$attr.'> # attributes followed by > and \n 451 | '.$content.' # content, support nesting 452 | </\2> # the matching end tag 453 | [ ]* # trailing spaces/tabs 454 | (?=\n+|\Z) # followed by a newline or end of document 455 | 456 | | # Special version for tags of group a. 457 | 458 | [ ]{0,'.$less_than_tab.'} 459 | <('.$block_tags_a_re.')# start tag = $3 460 | '.$attr.'>[ ]*\n # attributes followed by > 461 | '.$content2.' # content, support nesting 462 | </\3> # the matching end tag 463 | [ ]* # trailing spaces/tabs 464 | (?=\n+|\Z) # followed by a newline or end of document 465 | 466 | | # Special case just for <hr />. It was easier to make a special 467 | # case than to make the other regex more complicated. 468 | 469 | [ ]{0,'.$less_than_tab.'} 470 | <(hr) # start tag = $2 471 | '.$attr.' # attributes 472 | /?> # the matching end tag 473 | [ ]* 474 | (?=\n{2,}|\Z) # followed by a blank line or end of document 475 | 476 | | # Special case for standalone HTML comments: 477 | 478 | [ ]{0,'.$less_than_tab.'} 479 | (?s: 480 | <!-- .*? --> 481 | ) 482 | [ ]* 483 | (?=\n{2,}|\Z) # followed by a blank line or end of document 484 | 485 | | # PHP and ASP-style processor instructions (<? and <%) 486 | 487 | [ ]{0,'.$less_than_tab.'} 488 | (?s: 489 | <([?%]) # $2 490 | .*? 491 | \2> 492 | ) 493 | [ ]* 494 | (?=\n{2,}|\Z) # followed by a blank line or end of document 495 | 496 | ) 497 | )}Sxmi', 498 | array(&$this, '_hashHTMLBlocks_callback'), 499 | $text); 500 | 501 | return $text; 502 | } 503 | function _hashHTMLBlocks_callback($matches) { 504 | $text = $matches[1]; 505 | $key = $this->hashBlock($text); 506 | return "\n\n$key\n\n"; 507 | } 508 | 509 | 510 | function hashPart($text, $boundary = 'X') { 511 | # 512 | # Called whenever a tag must be hashed when a function inserts an atomic 513 | # element in the text stream. Passing $text through this function gives 514 | # a unique text-token which will be reverted back when calling unhash. 515 | # 516 | # The $boundary argument specifies what character should be used to surround 517 | # the token. By convention, "B" is used for block elements that need not 518 | # be wrapped into paragraph tags at the end, ":" is used for elements 519 | # that are word separators and "X" is used in the general case. 520 | # 521 | # Swap back any tag hash found in $text so we do not have to `unhash` 522 | # multiple times at the end. 523 | $text = $this->unhash($text); 524 | 525 | # Then hash the block. 526 | static $i = 0; 527 | $key = "$boundary\x1A" . ++$i . $boundary; 528 | $this->html_hashes[$key] = $text; 529 | return $key; # String that will replace the tag.
530 | } 531 | 532 | 533 | function hashBlock($text) { 534 | # 535 | # Shortcut function for hashPart with block-level boundaries. 536 | # 537 | return $this->hashPart($text, 'B'); 538 | } 539 | 540 | 541 | var $block_gamut = array( 542 | # 543 | # These are all the transformations that form block-level 544 | # tags like paragraphs, headers, and list items. 545 | # 546 | "doHeaders" => 10, 547 | "doHorizontalRules" => 20, 548 | 549 | "doLists" => 40, 550 | "doCodeBlocks" => 50, 551 | "doBlockQuotes" => 60, 552 | ); 553 | 554 | function runBlockGamut($text) { 555 | # 556 | # Run block gamut tranformations. 557 | # 558 | # We need to escape raw HTML in Markdown source before doing anything 559 | # else. This need to be done for each block, and not only at the 560 | # begining in the Markdown function since hashed blocks can be part of 561 | # list items and could have been indented. Indented blocks would have 562 | # been seen as a code block in a previous pass of hashHTMLBlocks. 563 | $text = $this->hashHTMLBlocks($text); 564 | 565 | return $this->runBasicBlockGamut($text); 566 | } 567 | 568 | function runBasicBlockGamut($text) { 569 | # 570 | # Run block gamut tranformations, without hashing HTML blocks. This is 571 | # useful when HTML blocks are known to be already hashed, like in the first 572 | # whole-document pass. 573 | # 574 | foreach ($this->block_gamut as $method => $priority) { 575 | $text = $this->$method($text); 576 | } 577 | 578 | # Finally form paragraph and restore hashed blocks. 579 | $text = $this->formParagraphs($text); 580 | 581 | return $text; 582 | } 583 | 584 | 585 | function doHorizontalRules($text) { 586 | # Do Horizontal Rules: 587 | return preg_replace( 588 | '{ 589 | ^[ ]{0,3} # Leading space 590 | ([-*_]) # $1: First marker 591 | (?> # Repeated marker group 592 | [ ]{0,2} # Zero, one, or two spaces. 593 | \1 # Marker character 594 | ){2,} # Group repeated at least twice 595 | [ ]* # Tailing spaces 596 | $ # End of line. 
597 | }mx', 598 | "\n".$this->hashBlock("<hr$this->empty_element_suffix")."\n", 599 | $text); 600 | } 601 | 602 | 603 | var $span_gamut = array( 604 | # 605 | # These are all the transformations that occur *within* block-level 606 | # tags like paragraphs, headers, and list items. 607 | # 608 | # Process character escapes, code spans, and inline HTML 609 | # in one shot. 610 | "parseSpan" => -30, 611 | 612 | # Process anchor and image tags. Images must come first, 613 | # because ![foo][f] looks like an anchor. 614 | "doImages" => 10, 615 | "doAnchors" => 20, 616 | 617 | # Make links out of things like `<url>` 618 | # Must come after doAnchors, because you can use < and > 619 | # delimiters in inline links like [this](<url>). 620 | "doAutoLinks" => 30, 621 | "encodeAmpsAndAngles" => 40, 622 | 623 | "doItalicsAndBold" => 50, 624 | "doHardBreaks" => 60, 625 | ); 626 | 627 | function runSpanGamut($text) { 628 | # 629 | # Run span gamut transformations. 630 | # 631 | foreach ($this->span_gamut as $method => $priority) { 632 | $text = $this->$method($text); 633 | } 634 | 635 | return $text; 636 | } 637 | 638 | 639 | function doHardBreaks($text) { 640 | # Do hard breaks: 641 | return preg_replace_callback('/ {2,}\n/', 642 | array(&$this, '_doHardBreaks_callback'), $text); 643 | } 644 | function _doHardBreaks_callback($matches) { 645 | return $this->hashPart("<br$this->empty_element_suffix\n"); 646 | } 647 | 648 | 649 | function doAnchors($text) { 650 | # 651 | # Turn Markdown link shortcuts into XHTML <a> tags. 652 | # 653 | if ($this->in_anchor) return $text; 654 | $this->in_anchor = true; 655 | 656 | # 657 | # First, handle reference-style links: [link text] [id] 658 | # 659 | $text = preg_replace_callback('{ 660 | ( # wrap whole match in $1 661 | \[ 662 | ('.$this->nested_brackets_re.') # link text = $2 663 | \] 664 | 665 | [ ]? # one optional space 666 | (?:\n[ ]*)? # one optional newline followed by spaces 667 | 668 | \[ 669 | (.*?)
# id = $3 670 | \] 671 | ) 672 | }xs', 673 | array(&$this, '_doAnchors_reference_callback'), $text); 674 | 675 | # 676 | # Next, inline-style links: [link text](url "optional title") 677 | # 678 | $text = preg_replace_callback('{ 679 | ( # wrap whole match in $1 680 | \[ 681 | ('.$this->nested_brackets_re.') # link text = $2 682 | \] 683 | \( # literal paren 684 | [ \n]* 685 | (?: 686 | <(.+?)> # href = $3 687 | | 688 | ('.$this->nested_url_parenthesis_re.') # href = $4 689 | ) 690 | [ \n]* 691 | ( # $5 692 | ([\'"]) # quote char = $6 693 | (.*?) # Title = $7 694 | \6 # matching quote 695 | [ \n]* # ignore any spaces/tabs between closing quote and ) 696 | )? # title is optional 697 | \) 698 | ) 699 | }xs', 700 | array(&$this, '_doAnchors_inline_callback'), $text); 701 | 702 | # 703 | # Last, handle reference-style shortcuts: [link text] 704 | # These must come last in case you've also got [link text][1] 705 | # or [link text](/foo) 706 | # 707 | $text = preg_replace_callback('{ 708 | ( # wrap whole match in $1 709 | \[ 710 | ([^\[\]]+) # link text = $2; can\'t contain [ or ] 711 | \] 712 | ) 713 | }xs', 714 | array(&$this, '_doAnchors_reference_callback'), $text); 715 | 716 | $this->in_anchor = false; 717 | return $text; 718 | } 719 | function _doAnchors_reference_callback($matches) { 720 | $whole_match = $matches[1]; 721 | $link_text = $matches[2]; 722 | $link_id =& $matches[3]; 723 | 724 | if ($link_id == "") { 725 | # for shortcut links like [this][] or [this]. 
726 | $link_id = $link_text; 727 | } 728 | 729 | # lower-case and turn embedded newlines into spaces 730 | $link_id = strtolower($link_id); 731 | $link_id = preg_replace('{[ ]?\n}', ' ', $link_id); 732 | 733 | if (isset($this->urls[$link_id])) { 734 | $url = $this->urls[$link_id]; 735 | $url = $this->encodeAttribute($url); 736 | 737 | $result = "<a href=\"$url\""; 738 | if ( isset( $this->titles[$link_id] ) ) { 739 | $title = $this->titles[$link_id]; 740 | $title = $this->encodeAttribute($title); 741 | $result .= " title=\"$title\""; 742 | } 743 | 744 | $link_text = $this->runSpanGamut($link_text); 745 | $result .= ">$link_text</a>"; 746 | $result = $this->hashPart($result); 747 | } 748 | else { 749 | $result = $whole_match; 750 | } 751 | return $result; 752 | } 753 | function _doAnchors_inline_callback($matches) { 754 | $whole_match = $matches[1]; 755 | $link_text = $this->runSpanGamut($matches[2]); 756 | $url = $matches[3] == '' ? $matches[4] : $matches[3]; 757 | $title =& $matches[7]; 758 | 759 | $url = $this->encodeAttribute($url); 760 | 761 | $result = "<a href=\"$url\""; 762 | if (isset($title)) { 763 | $title = $this->encodeAttribute($title); 764 | $result .= " title=\"$title\""; 765 | } 766 | 767 | $link_text = $this->runSpanGamut($link_text); 768 | $result .= ">$link_text</a>"; 769 | 770 | return $this->hashPart($result); 771 | } 772 | 773 | 774 | function doImages($text) { 775 | # 776 | # Turn Markdown image shortcuts into <img> tags. 777 | # 778 | # 779 | # First, handle reference-style labeled images: ![alt text][id] 780 | # 781 | $text = preg_replace_callback('{ 782 | ( # wrap whole match in $1 783 | !\[ 784 | ('.$this->nested_brackets_re.') # alt text = $2 785 | \] 786 | 787 | [ ]? # one optional space 788 | (?:\n[ ]*)? # one optional newline followed by spaces 789 | 790 | \[ 791 | (.*?)
# id = $3 792 | \] 793 | 794 | ) 795 | }xs', 796 | array(&$this, '_doImages_reference_callback'), $text); 797 | 798 | # 799 | # Next, handle inline images: ![alt text](url "optional title") 800 | # Don't forget: encode * and _ 801 | # 802 | $text = preg_replace_callback('{ 803 | ( # wrap whole match in $1 804 | !\[ 805 | ('.$this->nested_brackets_re.') # alt text = $2 806 | \] 807 | \s? # One optional whitespace character 808 | \( # literal paren 809 | [ \n]* 810 | (?: 811 | <(\S*)> # src url = $3 812 | | 813 | ('.$this->nested_url_parenthesis_re.') # src url = $4 814 | ) 815 | [ \n]* 816 | ( # $5 817 | ([\'"]) # quote char = $6 818 | (.*?) # title = $7 819 | \6 # matching quote 820 | [ \n]* 821 | )? # title is optional 822 | \) 823 | ) 824 | }xs', 825 | array(&$this, '_doImages_inline_callback'), $text); 826 | 827 | return $text; 828 | } 829 | function _doImages_reference_callback($matches) { 830 | $whole_match = $matches[1]; 831 | $alt_text = $matches[2]; 832 | $link_id = strtolower($matches[3]); 833 | 834 | if ($link_id == "") { 835 | $link_id = strtolower($alt_text); # for shortcut links like ![this][]. 836 | } 837 | 838 | $alt_text = $this->encodeAttribute($alt_text); 839 | if (isset($this->urls[$link_id])) { 840 | $url = $this->encodeAttribute($this->urls[$link_id]); 841 | $result = "<img src=\"$url\" alt=\"$alt_text\""; 842 | if (isset($this->titles[$link_id])) { 843 | $title = $this->titles[$link_id]; 844 | $title = $this->encodeAttribute($title); 845 | $result .= " title=\"$title\""; 846 | } 847 | $result .= $this->empty_element_suffix; 848 | $result = $this->hashPart($result); 849 | } 850 | else { 851 | # If there's no such link ID, leave intact: 852 | $result = $whole_match; 853 | } 854 | 855 | return $result; 856 | } 857 | function _doImages_inline_callback($matches) { 858 | $whole_match = $matches[1]; 859 | $alt_text = $matches[2]; 860 | $url = $matches[3] == '' ?
$matches[4] : $matches[3]; 861 | $title =& $matches[7]; 862 | 863 | $alt_text = $this->encodeAttribute($alt_text); 864 | $url = $this->encodeAttribute($url); 865 | $result = "<img src=\"$url\" alt=\"$alt_text\""; 866 | if (isset($title)) { 867 | $title = $this->encodeAttribute($title); 868 | $result .= " title=\"$title\""; # $title already quoted 869 | } 870 | $result .= $this->empty_element_suffix; 871 | 872 | return $this->hashPart($result); 873 | } 874 | 875 | 876 | function doHeaders($text) { 877 | # Setext-style headers: 878 | # Header 1 879 | # ======== 880 | # 881 | # Header 2 882 | # -------- 883 | # 884 | $text = preg_replace_callback('{ ^(.+?)[ ]*\n(=+|-+)[ ]*\n+ }mx', 885 | array(&$this, '_doHeaders_callback_setext'), $text); 886 | 887 | # atx-style headers: 888 | # # Header 1 889 | # ## Header 2 890 | # ## Header 2 with closing hashes ## 891 | # ... 892 | # ###### Header 6 893 | # 894 | $text = preg_replace_callback('{ 895 | ^(\#{1,6}) # $1 = string of #\'s 896 | [ ]* 897 | (.+?) # $2 = Header text 898 | [ ]* 899 | \#* # optional closing #\'s (not counted) 900 | \n+ 901 | }xm', 902 | array(&$this, '_doHeaders_callback_atx'), $text); 903 | 904 | return $text; 905 | } 906 | function _doHeaders_callback_setext($matches) { 907 | # Terrible hack to check we haven't found an empty list item. 908 | if ($matches[2] == '-' && preg_match('{^-(?: |$)}', $matches[1])) 909 | return $matches[0]; 910 | 911 | $level = $matches[2][0] == '=' ? 1 : 2; 912 | $block = "<h$level>".$this->runSpanGamut($matches[1])."</h$level>"; 913 | return "\n" . $this->hashBlock($block) . "\n\n"; 914 | } 915 | function _doHeaders_callback_atx($matches) { 916 | $level = strlen($matches[1]); 917 | $block = "<h$level>".$this->runSpanGamut($matches[2])."</h$level>"; 918 | return "\n" . $this->hashBlock($block) . "\n\n"; 919 | } 920 | 921 | 922 | function doLists($text) { 923 | # 924 | # Form HTML ordered (numbered) and unordered (bulleted) lists.
925 | # 926 | $less_than_tab = $this->tab_width - 1; 927 | 928 | # Re-usable patterns to match list item bullets and number markers: 929 | $marker_ul_re = '[*+-]'; 930 | $marker_ol_re = '\d+[\.]'; 931 | $marker_any_re = "(?:$marker_ul_re|$marker_ol_re)"; 932 | 933 | $markers_relist = array( 934 | $marker_ul_re => $marker_ol_re, 935 | $marker_ol_re => $marker_ul_re, 936 | ); 937 | 938 | foreach ($markers_relist as $marker_re => $other_marker_re) { 939 | # Re-usable pattern to match any entire ul or ol list: 940 | $whole_list_re = ' 941 | ( # $1 = whole list 942 | ( # $2 943 | ([ ]{0,'.$less_than_tab.'}) # $3 = number of spaces 944 | ('.$marker_re.') # $4 = first list item marker 945 | [ ]+ 946 | ) 947 | (?s:.+?) 948 | ( # $5 949 | \z 950 | | 951 | \n{2,} 952 | (?=\S) 953 | (?! # Negative lookahead for another list item marker 954 | [ ]* 955 | '.$marker_re.'[ ]+ 956 | ) 957 | | 958 | (?= # Lookahead for another kind of list 959 | \n 960 | \3 # Must have the same indentation 961 | '.$other_marker_re.'[ ]+ 962 | ) 963 | ) 964 | ) 965 | '; // mx 966 | 967 | # We use a different prefix before nested lists than top-level lists. 968 | # See extended comment in processListItems(). 969 | 970 | if ($this->list_level) { 971 | $text = preg_replace_callback('{ 972 | ^ 973 | '.$whole_list_re.' 974 | }mx', 975 | array(&$this, '_doLists_callback'), $text); 976 | } 977 | else { 978 | $text = preg_replace_callback('{ 979 | (?:(?<=\n)\n|\A\n?) # Must eat the newline 980 | '.$whole_list_re.' 981 | }mx', 982 | array(&$this, '_doLists_callback'), $text); 983 | } 984 | } 985 | 986 | return $text; 987 | } 988 | function _doLists_callback($matches) { 989 | # Re-usable patterns to match list item bullets and number markers: 990 | $marker_ul_re = '[*+-]'; 991 | $marker_ol_re = '\d+[\.]'; 992 | $marker_any_re = "(?:$marker_ul_re|$marker_ol_re)"; 993 | 994 | $list = $matches[1]; 995 | $list_type = preg_match("/$marker_ul_re/", $matches[4]) ? "ul" : "ol"; 996 | 997 | $marker_any_re = ( $list_type == "ul" ? $marker_ul_re : $marker_ol_re ); 998 | 999 | $list .= "\n"; 1000 | $result = $this->processListItems($list, $marker_any_re); 1001 | 1002 | $result = $this->hashBlock("<$list_type>\n" . $result . "</$list_type>"); 1003 | return "\n". $result ."\n\n"; 1004 | } 1005 | 1006 | var $list_level = 0; 1007 | 1008 | function processListItems($list_str, $marker_any_re) { 1009 | # 1010 | # Process the contents of a single ordered or unordered list, splitting it 1011 | # into individual list items. 1012 | # 1013 | # The $this->list_level global keeps track of when we're inside a list. 1014 | # Each time we enter a list, we increment it; when we leave a list, 1015 | # we decrement. If it's zero, we're not in a list anymore. 1016 | # 1017 | # We do this because when we're not inside a list, we want to treat 1018 | # something like this: 1019 | # 1020 | # I recommend upgrading to version 1021 | # 8. Oops, now this line is treated 1022 | # as a sub-list. 1023 | # 1024 | # As a single paragraph, despite the fact that the second line starts 1025 | # with a digit-period-space sequence. 1026 | # 1027 | # Whereas when we're inside a list (or sub-list), that line will be 1028 | # treated as the start of a sub-list. What a kludge, huh? This is 1029 | # an aspect of Markdown's syntax that's hard to parse perfectly 1030 | # without resorting to mind-reading. Perhaps the solution is to 1031 | # change the syntax rules such that sub-lists must start with a 1032 | # starting cardinal number; e.g. "1." or "a.". 1033 | 1034 | $this->list_level++; 1035 | 1036 | # trim trailing blank lines: 1037 | $list_str = preg_replace("/\n{2,}\\z/", "\n", $list_str); 1038 | 1039 | $list_str = preg_replace_callback('{ 1040 | (\n)? # leading line = $1 1041 | (^[ ]*) # leading whitespace = $2 1042 | ('.$marker_any_re.'
# list marker and space = $3 1043 | (?:[ ]+|(?=\n)) # space only required if item is not empty 1044 | ) 1045 | ((?s:.*?)) # list item text = $4 1046 | (?:(\n+(?=\n))|\n) # tailing blank line = $5 1047 | (?= \n* (\z | \2 ('.$marker_any_re.') (?:[ ]+|(?=\n)))) 1048 | }xm', 1049 | array(&$this, '_processListItems_callback'), $list_str); 1050 | 1051 | $this->list_level--; 1052 | return $list_str; 1053 | } 1054 | function _processListItems_callback($matches) { 1055 | $item = $matches[4]; 1056 | $leading_line =& $matches[1]; 1057 | $leading_space =& $matches[2]; 1058 | $marker_space = $matches[3]; 1059 | $tailing_blank_line =& $matches[5]; 1060 | 1061 | if ($leading_line || $tailing_blank_line || 1062 | preg_match('/\n{2,}/', $item)) 1063 | { 1064 | # Replace marker with the appropriate whitespace indentation 1065 | $item = $leading_space . str_repeat(' ', strlen($marker_space)) . $item; 1066 | $item = $this->runBlockGamut($this->outdent($item)."\n"); 1067 | } 1068 | else { 1069 | # Recursion for sub-lists: 1070 | $item = $this->doLists($this->outdent($item)); 1071 | $item = preg_replace('/\n+$/', '', $item); 1072 | $item = $this->runSpanGamut($item); 1073 | } 1074 | 1075 | return "<li>" . $item . "</li>\n"; 1076 | } 1077 | 1078 | 1079 | function doCodeBlocks($text) { 1080 | # 1081 | # Process Markdown `<pre><code>` blocks.
	#
		$text = preg_replace_callback('{
				(?:\n\n|\A\n?)
				(	            # $1 = the code block -- one or more lines, starting with a space/tab
				  (?>
					[ ]{'.$this->tab_width.'}  # Lines must start with a tab or a tab-width of spaces
					.*\n+
				  )+
				)
				((?=^[ ]{0,'.$this->tab_width.'}\S)|\Z)	# Lookahead for non-space at line-start, or end of doc
			}xm',
			array(&$this, '_doCodeBlocks_callback'), $text);

		return $text;
	}
	function _doCodeBlocks_callback($matches) {
		$codeblock = $matches[1];

		$codeblock = $this->outdent($codeblock);
		$codeblock = htmlspecialchars($codeblock, ENT_NOQUOTES);

		# trim leading newlines and trailing newlines
		$codeblock = preg_replace('/\A\n+|\n+\z/', '', $codeblock);

		$codeblock = "<pre><code>$codeblock\n</code></pre>
    "; 1107 | return "\n\n".$this->hashBlock($codeblock)."\n\n"; 1108 | } 1109 | 1110 | 1111 | function makeCodeSpan($code) { 1112 | # 1113 | # Create a code span markup for $code. Called from handleSpanToken. 1114 | # 1115 | $code = htmlspecialchars(trim($code), ENT_NOQUOTES); 1116 | return $this->hashPart("$code"); 1117 | } 1118 | 1119 | 1120 | var $em_relist = array( 1121 | '' => '(?:(? '(?<=\S|^)(? '(?<=\S|^)(? '(?:(? '(?<=\S|^)(? '(?<=\S|^)(? '(?:(? '(?<=\S|^)(? '(?<=\S|^)(?em_relist as $em => $em_re) { 1143 | foreach ($this->strong_relist as $strong => $strong_re) { 1144 | # Construct list of allowed token expressions. 1145 | $token_relist = array(); 1146 | if (isset($this->em_strong_relist["$em$strong"])) { 1147 | $token_relist[] = $this->em_strong_relist["$em$strong"]; 1148 | } 1149 | $token_relist[] = $em_re; 1150 | $token_relist[] = $strong_re; 1151 | 1152 | # Construct master expression from list. 1153 | $token_re = '{('. implode('|', $token_relist) .')}'; 1154 | $this->em_strong_prepared_relist["$em$strong"] = $token_re; 1155 | } 1156 | } 1157 | } 1158 | 1159 | function doItalicsAndBold($text) { 1160 | $token_stack = array(''); 1161 | $text_stack = array(''); 1162 | $em = ''; 1163 | $strong = ''; 1164 | $tree_char_em = false; 1165 | 1166 | while (1) { 1167 | # 1168 | # Get prepared regular expression for seraching emphasis tokens 1169 | # in current context. 1170 | # 1171 | $token_re = $this->em_strong_prepared_relist["$em$strong"]; 1172 | 1173 | # 1174 | # Each loop iteration search for the next emphasis token. 1175 | # Each token is then passed to handleSpanToken. 1176 | # 1177 | $parts = preg_split($token_re, $text, 2, PREG_SPLIT_DELIM_CAPTURE); 1178 | $text_stack[0] .= $parts[0]; 1179 | $token =& $parts[1]; 1180 | $text =& $parts[2]; 1181 | 1182 | if (empty($token)) { 1183 | # Reached end of text span: empty stack without emitting. 1184 | # any more emphasis. 
				while ($token_stack[0]) {
					$text_stack[1] .= array_shift($token_stack);
					$text_stack[0] .= array_shift($text_stack);
				}
				break;
			}

			$token_len = strlen($token);
			if ($tree_char_em) {
				# Reached closing marker while inside a three-char emphasis.
				if ($token_len == 3) {
					# Three-char closing marker, close em and strong.
					array_shift($token_stack);
					$span = array_shift($text_stack);
					$span = $this->runSpanGamut($span);
					$span = "<strong><em>$span</em></strong>";
					$text_stack[0] .= $this->hashPart($span);
					$em = '';
					$strong = '';
				} else {
					# Other closing marker: close one em or strong and
					# change current token state to match the other
					$token_stack[0] = str_repeat($token[0], 3-$token_len);
					$tag = $token_len == 2 ? "strong" : "em";
					$span = $text_stack[0];
					$span = $this->runSpanGamut($span);
					$span = "<$tag>$span</$tag>";
					$text_stack[0] = $this->hashPart($span);
					$$tag = ''; # $$tag stands for $em or $strong
				}
				$tree_char_em = false;
			} else if ($token_len == 3) {
				if ($em) {
					# Reached closing marker for both em and strong.
					# Closing strong marker:
					for ($i = 0; $i < 2; ++$i) {
						$shifted_token = array_shift($token_stack);
						$tag = strlen($shifted_token) == 2 ? "strong" : "em";
						$span = array_shift($text_stack);
						$span = $this->runSpanGamut($span);
						$span = "<$tag>$span</$tag>";
						$text_stack[0] .= $this->hashPart($span);
						$$tag = ''; # $$tag stands for $em or $strong
					}
				} else {
					# Reached opening three-char emphasis marker. Push on token
					# stack; will be handled by the special condition above.
					$em = $token[0];
					$strong = "$em$em";
					array_unshift($token_stack, $token);
					array_unshift($text_stack, '');
					$tree_char_em = true;
				}
			} else if ($token_len == 2) {
				if ($strong) {
					# Unwind any dangling emphasis marker:
					if (strlen($token_stack[0]) == 1) {
						$text_stack[1] .= array_shift($token_stack);
						$text_stack[0] .= array_shift($text_stack);
					}
					# Closing strong marker:
					array_shift($token_stack);
					$span = array_shift($text_stack);
					$span = $this->runSpanGamut($span);
					$span = "<strong>$span</strong>";
					$text_stack[0] .= $this->hashPart($span);
					$strong = '';
				} else {
					array_unshift($token_stack, $token);
					array_unshift($text_stack, '');
					$strong = $token;
				}
			} else {
				# Here $token_len == 1
				if ($em) {
					if (strlen($token_stack[0]) == 1) {
						# Closing emphasis marker:
						array_shift($token_stack);
						$span = array_shift($text_stack);
						$span = $this->runSpanGamut($span);
						$span = "<em>$span</em>";
						$text_stack[0] .= $this->hashPart($span);
						$em = '';
					} else {
						$text_stack[0] .= $token;
					}
				} else {
					array_unshift($token_stack, $token);
					array_unshift($text_stack, '');
					$em = $token;
				}
			}
		}
		return $text_stack[0];
	}


	function doBlockQuotes($text) {
		$text = preg_replace_callback('/
			  (								# Wrap whole match in $1
				(?>
				  ^[ ]*>[ ]?
				  # ">" at the start of a line
					.+\n					# rest of the first line
				  (.+\n)*					# subsequent consecutive lines
				  \n*						# blanks
				)+
			  )
			/xm',
			array(&$this, '_doBlockQuotes_callback'), $text);

		return $text;
	}
	function _doBlockQuotes_callback($matches) {
		$bq = $matches[1];
		# trim one level of quoting - trim whitespace-only lines
		$bq = preg_replace('/^[ ]*>[ ]?|^[ ]+$/m', '', $bq);
		$bq = $this->runBlockGamut($bq);		# recurse

		$bq = preg_replace('/^/m', " ", $bq);
		# These leading spaces cause problems with <pre> content,
		# so we need to fix that:
		$bq = preg_replace_callback('{(\s*<pre>.+?</pre>)}sx',
			array(&$this, '_doBlockQuotes_callback2'), $bq);

		return "\n". $this->hashBlock("<blockquote>\n$bq\n</blockquote>")."\n\n";
	}
	function _doBlockQuotes_callback2($matches) {
		$pre = $matches[1];
		$pre = preg_replace('/^ /m', '', $pre);
		return $pre;
	}


	function formParagraphs($text) {
	#
	#	Params:
	#		$text - string to process with html <p> tags
	#
		# Strip leading and trailing lines:
		$text = preg_replace('/\A\n+|\n+\z/', '', $text);

		$grafs = preg_split('/\n{2,}/', $text, -1, PREG_SPLIT_NO_EMPTY);

		#
		# Wrap <p> tags and unhashify HTML blocks
		#
		foreach ($grafs as $key => $value) {
			if (!preg_match('/^B\x1A[0-9]+B$/', $value)) {
				# Is a paragraph.
				$value = $this->runSpanGamut($value);
				$value = preg_replace('/^([ ]*)/', "<p>", $value);
				$value .= "</p>";
				$grafs[$key] = $this->unhash($value);
			}
			else {
				# Is a block.
				# Modify elements of @grafs in-place...
				$graf = $value;
				$block = $this->html_hashes[$graf];
				$graf = $block;
//				if (preg_match('{
//					\A
//					(							# $1 = <div> tag
//					  <div  \s+
//					  [^>]*
//					  \b
//					  markdown\s*=\s*  ([\'"])	#	$2 = attr quote char
//					  1
//					  \2
//					  [^>]*
//					  >
//					)
//					(							# $3 = contents
//					.*
//					)
//					(</div>)					# $4 = closing tag
//					\z
//					}xs', $block, $matches))
//				{
//					list(, $div_open, , $div_content, $div_close) = $matches;
//
//					# We can't call Markdown(), because that resets the hash;
//					# that initialization code should be pulled into its own sub, though.
//					$div_content = $this->hashHTMLBlocks($div_content);
//
//					# Run document gamut methods on the content.
//					foreach ($this->document_gamut as $method => $priority) {
//						$div_content = $this->$method($div_content);
//					}
//
//					$div_open = preg_replace(
//						'{\smarkdown\s*=\s*([\'"]).+?\1}', '', $div_open);
//
//					$graf = $div_open . "\n" . $div_content . "\n" . $div_close;
//				}
				$grafs[$key] = $graf;
			}
		}

		return implode("\n\n", $grafs);
	}


	function encodeAttribute($text) {
	#
	# Encode text for a double-quoted HTML attribute. This function
	# is *not* suitable for attributes enclosed in single quotes.
	#
		$text = $this->encodeAmpsAndAngles($text);
		$text = str_replace('"', '&quot;', $text);
		return $text;
	}


	function encodeAmpsAndAngles($text) {
	#
	# Smart processing for ampersands and angle brackets that need to
	# be encoded. Valid character entities are left alone unless the
	# no-entities mode is set.
	#
		if ($this->no_entities) {
			$text = str_replace('&', '&amp;', $text);
		} else {
			# Ampersand-encoding based entirely on Nat Irons's Amputator
			# MT plugin:
			$text = preg_replace('/&(?!#?[xX]?(?:[0-9a-fA-F]+|\w+);)/',
								'&amp;', $text);
		}
		# Encode remaining <'s
		$text = str_replace('<', '&lt;', $text);

		return $text;
	}


	function doAutoLinks($text) {
		$text = preg_replace_callback('{<((https?|ftp|dict):[^\'">\s]+)>}i',
			array(&$this, '_doAutoLinks_url_callback'), $text);

		# Email addresses:
		$text = preg_replace_callback('{
			<
			(?:mailto:)?
			(
				(?:
					[-!#$%&\'*+/=?^_`.{|}~\w\x80-\xFF]+
				|
					".*?"
				)
				\@
				(?:
					[-a-z0-9\x80-\xFF]+(\.[-a-z0-9\x80-\xFF]+)*\.[a-z]+
				|
					\[[\d.a-fA-F:]+\]	# IPv4 & IPv6
				)
			)
			>
			}xi',
			array(&$this, '_doAutoLinks_email_callback'), $text);

		return $text;
	}
	function _doAutoLinks_url_callback($matches) {
		$url = $this->encodeAttribute($matches[1]);
		$link = "<a href=\"$url\">$url</a>";
		return $this->hashPart($link);
	}
	function _doAutoLinks_email_callback($matches) {
		$address = $matches[1];
		$link = $this->encodeEmailAddress($address);
		return $this->hashPart($link);
	}


	function encodeEmailAddress($addr) {
	#
	#	Input: an email address, e.g. "foo@example.com"
	#
	#	Output: the email address as a mailto link, with each character
	#		of the address encoded as either a decimal or hex entity, in
	#		the hopes of foiling most address harvesting spam bots. E.g.:
	#
	#

    foo@exampl 1470 | # e.com

    1471 | # 1472 | # Based by a filter by Matthew Wickline, posted to BBEdit-Talk. 1473 | # With some optimizations by Milian Wolff. 1474 | # 1475 | $addr = "mailto:" . $addr; 1476 | $chars = preg_split('/(? $char) { 1480 | $ord = ord($char); 1481 | # Ignore non-ascii chars. 1482 | if ($ord < 128) { 1483 | $r = ($seed * (1 + $key)) % 100; # Pseudo-random function. 1484 | # roughly 10% raw, 45% hex, 45% dec 1485 | # '@' *must* be encoded. I insist. 1486 | if ($r > 90 && $char != '@') /* do nothing */; 1487 | else if ($r < 45) $chars[$key] = '&#x'.dechex($ord).';'; 1488 | else $chars[$key] = '&#'.$ord.';'; 1489 | } 1490 | } 1491 | 1492 | $addr = implode('', $chars); 1493 | $text = implode('', array_slice($chars, 7)); # text without `mailto:` 1494 | $addr = "$text"; 1495 | 1496 | return $addr; 1497 | } 1498 | 1499 | 1500 | function parseSpan($str) { 1501 | # 1502 | # Take the string $str and parse it into tokens, hashing embeded HTML, 1503 | # escaped characters and handling code spans. 1504 | # 1505 | $output = ''; 1506 | 1507 | $span_re = '{ 1508 | ( 1509 | \\\\'.$this->escape_chars_re.' 1510 | | 1511 | (?no_markup ? '' : ' 1514 | | 1515 | # comment 1516 | | 1517 | <\?.*?\?> | <%.*?%> # processing instruction 1518 | | 1519 | <[/!$]?[-a-zA-Z0-9:_]+ # regular tags 1520 | (?> 1521 | \s 1522 | (?>[^"\'>]+|"[^"]*"|\'[^\']*\')* 1523 | )? 1524 | > 1525 | ').' 1526 | ) 1527 | }xs'; 1528 | 1529 | while (1) { 1530 | # 1531 | # Each loop iteration seach for either the next tag, the next 1532 | # openning code span marker, or the next escaped character. 1533 | # Each token is then passed to handleSpanToken. 1534 | # 1535 | $parts = preg_split($span_re, $str, 2, PREG_SPLIT_DELIM_CAPTURE); 1536 | 1537 | # Create token from text preceding tag. 1538 | if ($parts[0] != "") { 1539 | $output .= $parts[0]; 1540 | } 1541 | 1542 | # Check if we reach the end. 
			if (isset($parts[1])) {
				$output .= $this->handleSpanToken($parts[1], $parts[2]);
				$str = $parts[2];
			}
			else {
				break;
			}
		}

		return $output;
	}


	function handleSpanToken($token, &$str) {
	#
	# Handle $token provided by parseSpan by determining its nature and
	# returning the corresponding value that should replace it.
	#
		switch ($token[0]) {
			case "\\":
				return $this->hashPart("&#". ord($token[1]). ";");
			case "`":
				# Search for end marker in remaining text.
				if (preg_match('/^(.*?[^`])'.preg_quote($token).'(?!`)(.*)$/sm',
					$str, $matches))
				{
					$str = $matches[2];
					$codespan = $this->makeCodeSpan($matches[1]);
					return $this->hashPart($codespan);
				}
				return $token; // return as text since no ending marker found.
			default:
				return $this->hashPart($token);
		}
	}


	function outdent($text) {
	#
	# Remove one level of line-leading tabs or spaces
	#
		return preg_replace('/^(\t|[ ]{1,'.$this->tab_width.'})/m', '', $text);
	}


	# String length function for detab. `_initDetab` will create a function to
	# handle UTF-8 if the default function does not exist.
	var $utf8_strlen = 'mb_strlen';

	function detab($text) {
	#
	# Replace tabs with the appropriate amount of space.
	#
		# For each line we separate the line into blocks delimited by
		# tab characters. Then we reconstruct every line by adding the
		# appropriate number of spaces between each block.

		$text = preg_replace_callback('/^.*\t.*$/m',
			array(&$this, '_detab_callback'), $text);

		return $text;
	}
	function _detab_callback($matches) {
		$line = $matches[0];
		$strlen = $this->utf8_strlen; # strlen function for UTF-8.

		# Split in blocks.
		$blocks = explode("\t", $line);
		# Add each block to the line.
		$line = $blocks[0];
		unset($blocks[0]); # Do not add first block twice.
		foreach ($blocks as $block) {
			# Calculate amount of space, insert spaces, insert block.
			$amount = $this->tab_width -
				$strlen($line, 'UTF-8') % $this->tab_width;
			$line .= str_repeat(" ", $amount) . $block;
		}
		return $line;
	}
	function _initDetab() {
	#
	# Check for the availability of the function in the `utf8_strlen` property
	# (initially `mb_strlen`). If the function is not available, create a
	# function that will loosely count the number of UTF-8 characters with a
	# regular expression.
	#
		if (function_exists($this->utf8_strlen)) return;
		$this->utf8_strlen = create_function('$text', 'return preg_match_all(
			"/[\\\\x00-\\\\xBF]|[\\\\xC0-\\\\xFF][\\\\x80-\\\\xBF]*/",
			$text, $m);');
	}


	function unhash($text) {
	#
	# Swap back in all the tags hashed by _HashHTMLBlocks.
	#
		return preg_replace_callback('/(.)\x1A[0-9]+\1/',
			array(&$this, '_unhash_callback'), $text);
	}
	function _unhash_callback($matches) {
		return $this->html_hashes[$matches[0]];
	}

}

/*

PHP Markdown
============

Description
-----------

This is a PHP translation of the original Markdown formatter written in
Perl by John Gruber.

Markdown is a text-to-HTML filter; it translates an easy-to-read /
easy-to-write structured text format into HTML. Markdown's text format
is most similar to that of plain text email, and supports features such
as headers, *emphasis*, code blocks, blockquotes, and links.
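
As an illustration of one conversion step in this file, the tab expansion
done by `detab` and `_detab_callback` can be sketched as a standalone
function. This is only a sketch under stated assumptions: `detab_sketch` is
a hypothetical name that does not exist in this file, and it measures width
with plain `strlen`, so it is only accurate for single-byte text (the class
method uses a UTF-8 aware length function instead).

```php
<?php
# Hypothetical helper for illustration; not part of this file's API.
# Assumes single-byte (ASCII) input.
function detab_sketch($text, $tab_width = 4) {
    $lines = explode("\n", $text);
    foreach ($lines as $i => $line) {
        # Split the line into the blocks that sit between tab characters.
        $blocks = explode("\t", $line);
        $out = array_shift($blocks);
        foreach ($blocks as $block) {
            # Pad with spaces up to the next multiple of the tab width.
            $amount = $tab_width - strlen($out) % $tab_width;
            $out .= str_repeat(" ", $amount) . $block;
        }
        $lines[$i] = $out;
    }
    return implode("\n", $lines);
}
```

Calling `detab_sketch("a\tb")` pads the single character `a` with three
spaces so that `b` starts at column 4, which is the same column arithmetic
the method above applies to each line that contains a tab.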

Markdown's syntax is designed not as a generic markup language, but
specifically to serve as a front-end to (X)HTML. You can use span-level
HTML tags anywhere in a Markdown document, and you can use block level
HTML tags (like <div> and <table> as well).

For more information about Markdown's syntax, see:




Bugs
----

To file bug reports please send email to:



Please include with your report: (1) the example input; (2) the output you
expected; (3) the output Markdown actually produced.


Version History
---------------

See the readme file for detailed release notes for this version.


Copyright and License
---------------------

PHP Markdown
Copyright (c) 2004-2009 Michel Fortin

All rights reserved.

Based on Markdown
Copyright (c) 2003-2006 John Gruber

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

*	Redistributions of source code must retain the above copyright notice,
	this list of conditions and the following disclaimer.

*	Redistributions in binary form must reproduce the above copyright
	notice, this list of conditions and the following disclaimer in the
	documentation and/or other materials provided with the distribution.

*	Neither the name "Markdown" nor the names of its contributors may
	be used to endorse or promote products derived from this software
	without specific prior written permission.

This software is provided by the copyright holders and contributors "as
is" and any express or implied warranties, including, but not limited
to, the implied warranties of merchantability and fitness for a
particular purpose are disclaimed. In no event shall the copyright owner
or contributors be liable for any direct, indirect, incidental, special,
exemplary, or consequential damages (including, but not limited to,
procurement of substitute goods or services; loss of use, data, or
profits; or business interruption) however caused and on any theory of
liability, whether in contract, strict liability, or tort (including
negligence or otherwise) arising in any way out of the use of this
software, even if advised of the possibility of such damage.

*/
?>
--------------------------------------------------------------------------------