├── .gitignore
├── CHANGES.md
├── readme.txt
├── License.text
├── PHP Markdown Readme.text
└── php-markdown-modified.php


/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store


--------------------------------------------------------------------------------
/CHANGES.md:
--------------------------------------------------------------------------------
1 | ### Markdown-Modified
2 | 
3 | #### 1.0.1
4 | 
5 |  * renamed functions and `MARKDOWN_VERSION` const so that no errors occur if another version of `markdown.php` is loaded by another plugin
6 | 
7 | #### 1.0
8 | 
9 |  * initial commit


--------------------------------------------------------------------------------
/readme.txt:
--------------------------------------------------------------------------------
1 | Modified version of the original PHP Markdown plugin to work alongside Markdown on Save variants.
2 | 
3 | All Markdown markup will be rendered in all posts regardless of the setting in the Markdown on Save variant.
4 | 
5 | A better solution for those who have always used Markdown markup would be some method of switching the setting on existing posts from **not use** Markdown to **use** Markdown.
6 | 


--------------------------------------------------------------------------------
/License.text:
--------------------------------------------------------------------------------
 1 | PHP Markdown
 2 | Copyright (c) 2004-2009 Michel Fortin  
 3 | <http://michelf.com/>  
 4 | All rights reserved.
 5 | 
 6 | Based on Markdown  
 7 | Copyright (c) 2003-2006 John Gruber   
 8 | <http://daringfireball.net/>   
 9 | All rights reserved.
10 | 
11 | Redistribution and use in source and binary forms, with or without
12 | modification, are permitted provided that the following conditions are
13 | met:
14 | 
15 | * Redistributions of source code must retain the above copyright notice,
16 |   this list of conditions and the following disclaimer.
17 | 
18 | * Redistributions in binary form must reproduce the above copyright
19 |   notice, this list of conditions and the following disclaimer in the
20 |   documentation and/or other materials provided with the distribution.
21 | 
22 | * Neither the name "Markdown" nor the names of its contributors may
23 |   be used to endorse or promote products derived from this software
24 |   without specific prior written permission.
25 | 
26 | This software is provided by the copyright holders and contributors "as
27 | is" and any express or implied warranties, including, but not limited
28 | to, the implied warranties of merchantability and fitness for a
29 | particular purpose are disclaimed. In no event shall the copyright owner
30 | or contributors be liable for any direct, indirect, incidental, special,
31 | exemplary, or consequential damages (including, but not limited to,
32 | procurement of substitute goods or services; loss of use, data, or
33 | profits; or business interruption) however caused and on any theory of
34 | liability, whether in contract, strict liability, or tort (including
35 | negligence or otherwise) arising in any way out of the use of this
36 | software, even if advised of the possibility of such damage.
37 | 


--------------------------------------------------------------------------------
/PHP Markdown Readme.text:
--------------------------------------------------------------------------------
  1 | PHP Markdown
  2 | ============
  3 | 
  4 | Version 1.0.1m - Sat 21 Jun 2008
  5 | 
  6 | by Michel Fortin
  7 | <http://michelf.com/>
  8 | 
  9 | based on work by John Gruber  
 10 | <http://daringfireball.net/>
 11 | 
 12 | 
 13 | Introduction
 14 | ------------
 15 | 
 16 | Markdown is a text-to-HTML conversion tool for web writers. Markdown
 17 | allows you to write using an easy-to-read, easy-to-write plain text
 18 | format, then convert it to structurally valid XHTML (or HTML).
 19 | 
 20 | "Markdown" is two things: a plain text markup syntax, and a software 
 21 | tool, written in Perl, that converts the plain text markup to HTML. 
 22 | PHP Markdown is a port to PHP of the original Markdown program by 
 23 | John Gruber.
 24 | 
 25 | PHP Markdown can work as a plug-in for WordPress and bBlog, as a 
 26 | modifier for the Smarty templating engine, or as a replacement for
 27 | Textile formatting in any software that supports Textile.
 28 | 
 29 | Full documentation of Markdown's syntax is available on John's 
 30 | Markdown page: <http://daringfireball.net/projects/markdown/>
 31 | 
 32 | 
 33 | Installation and Requirement
 34 | ----------------------------
 35 | 
 36 | PHP Markdown requires PHP version 4.0.5 or later.
 37 | 
 38 | 
 39 | ### WordPress ###
 40 | 
 41 | PHP Markdown works with [WordPress][wp], version 1.2 or later.
 42 | 
 43 |  [wp]: http://wordpress.org/
 44 | 
 45 | 1.  To use PHP Markdown with WordPress, place the "markdown.php" file 
 46 |     in the "plugins" folder. This folder is located inside 
 47 |     "wp-content" at the root of your site:
 48 | 
 49 |         (site home)/wp-content/plugins/
 50 | 
 51 | 2.  Activate the plugin with the administrative interface of 
 52 |     WordPress. In the "Plugins" section you will now find Markdown. 
 53 |     To activate the plugin, click on the "Activate" button on the 
 54 |     same line as Markdown. Your entries will now be formatted by 
 55 |     PHP Markdown.
 56 | 
 57 | 3.  To post Markdown content, you'll first have to disable the 
 58 | 	"visual" editor in the User section of WordPress.
 59 | 
 60 | You can configure PHP Markdown to not apply to the comments on your 
 61 | WordPress weblog. See the "Configuration" section below.
 62 | 
 63 | It is not possible at this time to apply a different set of 
 64 | filters to different entries. All your entries will be formatted by 
 65 | PHP Markdown. This is a limitation of WordPress. If your old entries 
 66 | are written in HTML (as opposed to another formatting syntax, like 
 67 | Textile), they'll probably stay fine after installing Markdown.
 68 | 
 69 | 
 70 | ### bBlog ###
 71 | 
 72 | PHP Markdown also works with [bBlog][bb].
 73 | 
 74 |  [bb]: http://www.bblog.com/
 75 | 
 76 | To use PHP Markdown with bBlog, rename "markdown.php" to 
 77 | "modifier.markdown.php" and place the file in the "bBlog_plugins" 
 78 | folder. This folder is located inside the "bblog" directory of 
 79 | your site, like this:
 80 | 
 81 |         (site home)/bblog/bBlog_plugins/modifier.markdown.php
 82 | 
 83 | Select "Markdown" as the "Entry Modifier" when you post a new 
 84 | entry. This setting will only apply to the entry you are editing.
 85 | 
 86 | 
 87 | ### Replacing Textile in TextPattern ###
 88 | 
 89 | [TextPattern][tp] uses [Textile][tx] to format your text. You can 
 90 | replace Textile with Markdown in TextPattern without having to change
 91 | any code by using the *Textile Compatibility Mode*. This may work 
 92 | with other software that expects Textile too.
 93 | 
 94 |  [tx]: http://www.textism.com/tools/textile/
 95 |  [tp]: http://www.textpattern.com/
 96 | 
 97 | 1.  Rename the "markdown.php" file to "classTextile.php". This will
 98 | 	make PHP Markdown behave as if it was the actual Textile parser.
 99 | 
100 | 2.  Replace the "classTextile.php" file TextPattern installed in your
101 | 	web directory. It can be found in the "lib" directory:
102 | 
103 | 		(site home)/textpattern/lib/
104 | 
105 | Contrary to Textile, Markdown does not convert quotes to curly ones 
106 | and does not convert multiple hyphens (`--` and `---`) into en- and 
107 | em-dashes. If you use PHP Markdown in Textile Compatibility Mode, you 
108 | can solve this problem by installing the "smartypants.php" file from 
109 | [PHP SmartyPants][psp] beside the "classTextile.php" file. The Textile 
110 | Compatibility Mode function will use SmartyPants automatically without 
111 | further modification.
112 | 
113 |  [psp]: http://michelf.com/projects/php-smartypants/
114 | 
115 | 
116 | ### Updating Markdown in Other Programs ###
117 | 
118 | Many web applications now ship with PHP Markdown, or have plugins to 
119 | perform the conversion to HTML. You can update PHP Markdown in many of 
120 | these programs by swapping the old "markdown.php" file for the new one.
121 | 
122 | Here is a short non-exhaustive list of some programs and where they 
123 | hide the "markdown.php" file.
124 | 
125 | | Program   | Path to Markdown
126 | | -------   | ----------------
127 | | [Pivot][] | `(site home)/pivot/includes/markdown/markdown.php`
128 | 
129 | If you're unsure if you can do this with your application, ask the 
130 | developer, or wait for the developer to update his application or 
131 | plugin with the new version of PHP Markdown.
132 | 
133 |  [Pivot]: http://pivotlog.net/
134 | 
135 | 
136 | ### In Your Own Programs ###
137 | 
138 | You can use PHP Markdown easily in your current PHP program. Simply 
139 | include the file and then call the Markdown function on the text you 
140 | want to convert:
141 | 
142 |     include_once "markdown.php";
143 |     $my_html = Markdown($my_text);
144 | 
145 | If you wish to use PHP Markdown with another text filter function 
146 | built to parse HTML, you should filter the text *after* the Markdown
147 | function call. This is an example with [PHP SmartyPants][psp]:
148 | 
149 |     $my_html = SmartyPants(Markdown($my_text));
150 | 
151 | 
152 | ### With Smarty ###
153 | 
154 | If your program uses the [Smarty][sm] template engine, PHP Markdown 
155 | can now be used as a modifier for your templates. Rename "markdown.php" 
156 | to "modifier.markdown.php" and put it in your smarty plugins folder.
157 | 
158 |   [sm]: http://smarty.php.net/
159 | 
160 | If you are using MovableType 3.1 or later, the Smarty plugin folder is 
161 | located at `(MT CGI root)/php/extlib/smarty/plugins`. This will allow 
162 | Markdown to work on dynamic pages.
163 | 
164 | 
165 | Configuration
166 | -------------
167 | 
168 | By default, PHP Markdown produces XHTML output for tags with empty 
169 | elements. E.g.:
170 | 
171 |     <br />
172 | 
173 | Markdown can be configured to produce HTML-style tags; e.g.:
174 | 
175 |     <br>
176 | 
177 | To do this, you must edit the "MARKDOWN_EMPTY_ELEMENT_SUFFIX" 
178 | definition below the "Global default settings" header at the start of 
179 | the "markdown.php" file.
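
The version history (1.0.1j) notes that settings can also be overridden by defining the constant before "markdown.php" is included, which avoids editing the file. A minimal sketch, assuming the stock file name and `">"` as the suffix for HTML-style tags:

```php
<?php
// Assumption: a constant defined before the include takes precedence over
// the default under "Global default settings" in markdown.php.
define('MARKDOWN_EMPTY_ELEMENT_SUFFIX', '>');

include_once "markdown.php";

// Two trailing spaces request a hard line break, which is an empty element.
echo Markdown("first line  \nsecond line");
```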
180 | 
181 | 
182 | ### WordPress-Specific Settings ###
183 | 
184 | By default, the Markdown plugin applies to both posts and comments on 
185 | your WordPress weblog. To deactivate one or the other, edit the 
186 | `MARKDOWN_WP_POSTS` or `MARKDOWN_WP_COMMENTS` definitions under the 
187 | "WordPress settings" header at the start of the "markdown.php" file.
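
As a sketch of the edit described above, the two definitions might look like this after turning Markdown off for comments only. The boolean values are an assumption based on the stock file; check the definitions in your own copy before changing them:

```php
<?php
// Assumption: both settings are plain booleans, as in the stock markdown.php.
define('MARKDOWN_WP_POSTS',    true);   // posts still go through Markdown
define('MARKDOWN_WP_COMMENTS', false);  // comments are left untouched
```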
188 | 
189 | 
190 | Bugs
191 | ----
192 | 
193 | To file bug reports please send email to:
194 | <michel.fortin@michelf.com>
195 | 
196 | Please include with your report: (1) the example input; (2) the output you
197 | expected; (3) the output PHP Markdown actually produced.
198 | 
199 | 
200 | Version History
201 | ---------------
202 | 
203 | 1.0.1n (10 Oct 2009):
204 | 
205 | *	Enabled reference-style shortcut links. Now you can write reference-style 
206 | 	links with fewer brackets:
207 | 	
208 | 		This is [my website].
209 | 		
210 | 		[my website]: http://example.com/
211 | 	
212 | 	This was added in the 1.0.2 betas, but commented out in the 1.0.1 branch, 
213 | 	waiting for the feature to be officialized. [But half of the other Markdown
214 | 	implementations are supporting this syntax][half], so it makes sense for 
215 | 	compatibility's sake to allow it in PHP Markdown too.
216 | 
217 |  [half]: http://babelmark.bobtfish.net/?markdown=This+is+%5Bmy+website%5D.%0D%0A%09%09%0D%0A%5Bmy+website%5D%3A+http%3A%2F%2Fexample.com%2F%0D%0A&src=1&dest=2
218 | 
219 | *	Now accepting many valid email addresses in autolinks that were 
220 | 	previously rejected, such as:
221 | 	
222 | 		<abc+mailbox/department=shipping@example.com>
223 | 		<!#$%&'*+-/=?^_`.{|}~@example.com>
224 | 		<"abc@def"@example.com>
225 | 		<"Fred Bloggs"@example.com>
226 | 		<jsmith@[192.0.2.1]>
227 | 
228 | *	Now accepting spaces in URLs for inline and reference-style links. Such 
229 | 	URLs need to be surrounded by angle brackets. For instance:
230 | 	
231 | 		[link text](<http://url/with space> "optional title")
232 | 
233 | 		[link text][ref]
234 | 		[ref]: <http://url/with space> "optional title"
235 | 	
236 | 	There is still a quirk which may prevent this from working correctly with 
237 | 	relative URLs in inline-style links however.
238 | 
239 | *	Fix for adjacent lists of different kinds where the second list could
240 | 	end up as a sublist of the first when not separated by an empty line.
241 | 
242 | *	Fixed a bug where inline-style links wouldn't be recognized when the link 
243 | 	definition contains a line break between the url and the title.
244 | 
245 | *	Fixed a bug where tags whose name contains an underscore weren't parsed 
246 | 	correctly.
247 | 
248 | *	Fixed some corner-cases mixing underscore-emphasis and asterisk-emphasis.
249 | 
250 | 
251 | 1.0.1m (21 Jun 2008):
252 | 
253 | *	Lists can now have empty items.
254 | 
255 | *	Rewrote the emphasis and strong emphasis parser to fix some issues
256 | 	with oddly placed and overlong markers.
257 | 
258 | 
259 | 1.0.1l (11 May 2008):
260 | 
261 | *	Now removing the UTF-8 BOM at the start of a document, if present.
262 | 
263 | *	Now accepting capitalized URI schemes (such as HTTP:) in automatic
264 | 	links, such as `<HTTP://EXAMPLE.COM/>`.
265 | 
266 | *	Fixed a problem where `<hr@example.com>` was seen as a horizontal
267 | 	rule instead of an automatic link.
268 | 
269 | *	Fixed an issue where some characters in Markdown-generated HTML
270 | 	attributes weren't properly escaped with entities.
271 | 
272 | *	Fix for code blocks as first element of a list item. Previously,
273 | 	this didn't create any code block for item 2:
274 | 	
275 | 		*   Item 1 (regular paragraph)
276 | 		
277 | 		*       Item 2 (code block)
278 | 
279 | *	A code block starting on the second line of a document wasn't seen
280 | 	as a code block. This has been fixed.
281 | 	
282 | *	Added programmatically-settable parser properties `predef_urls` and 
283 | 	`predef_titles` for predefined URLs and titles for reference-style 
284 | 	links. To use this, your PHP code must call the parser this way:
285 | 	
286 | 		$parser = new Markdown_Parser;
287 | 		$parser->predef_urls = array('linkref' => 'http://example.com');
288 | 		$html = $parser->transform($text);
289 | 	
290 | 	You can then use the URL as a normal link reference:
291 | 	
292 | 		[my link][linkref]	
293 | 		[my link][linkRef]
294 | 		
295 | 	Reference names in the parser properties *must* be lowercase.
296 | 	Reference names in the Markdown source may have any case.
297 | 
298 | *	Added `setup` and `teardown` methods which can be used by subclassers
299 | 	as hook points to arrange the state of some parser variables before and 
300 | 	after parsing.
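
The `setup` and `teardown` hooks mentioned in this release can be pictured with the sketch below. So that it runs on its own, `Sketch_Parser` is a hypothetical stand-in for the real parser class, and its `transform` body is a placeholder; only the two hook names come from the entry above.

```php
<?php
// Hypothetical stand-in for the real parser, so the sketch is self-contained.
class Sketch_Parser
{
    function setup() {}
    function teardown() {}

    function transform($text)
    {
        $this->setup();                                     // hook before parsing
        $html = '<p>' . htmlspecialchars($text) . '</p>';   // placeholder for real parsing
        $this->teardown();                                  // hook after parsing
        return $html;
    }
}

// A subclass overrides the hooks to arrange its own per-document state.
class Counting_Parser extends Sketch_Parser
{
    public $documents = 0;
    public $scratch = array();

    function setup()
    {
        parent::setup();
        $this->scratch = array();   // start every document with clean state
        $this->documents++;
    }

    function teardown()
    {
        $this->scratch = array();   // release the state once parsing is done
        parent::teardown();
    }
}

$parser = new Counting_Parser;
$parser->transform("first");
$parser->transform("second");
echo $parser->documents;
```

Calling the parent hook inside each override keeps any state handling in the base class intact.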
301 | 
302 | 
303 | 1.0.1k (26 Sep 2007):
304 | 
305 | *	Fixed a problem introduced in 1.0.1i where three or more identical
306 | 	uppercase letters, as well as a few other symbols, would trigger
307 | 	a horizontal line.
308 | 
309 | 
310 | 1.0.1j (4 Sep 2007):
311 | 
312 | *	Fixed a problem introduced in 1.0.1i where the closing `code` and 
313 | 	`pre` tags at the end of a code block were appearing in the wrong 
314 | 	order.
315 | 
316 | *	Overriding configuration settings by defining constants from an 
317 | 	external file before markdown.php is included is now possible without 
318 | 	producing a PHP warning.
319 | 
320 | 
321 | 1.0.1i (31 Aug 2007):
322 | 
323 | *	Fixed a problem where an escaped backslash before a code span 
324 | 	would prevent the code span from being created. This should now
325 | 	work as expected:
326 | 	
327 | 		Literal backslash: \\`code span`
328 | 
329 | *	Overall speed improvements, especially with long documents.
330 | 
331 | 
332 | 1.0.1h (3 Aug 2007):
333 | 
334 | *	Added two properties (`no_markup` and `no_entities`) to the parser 
335 | 	allowing HTML tags and entities to be disabled.
336 | 
337 | *	Fix for a problem introduced in 1.0.1g where posting comments in 
338 | 	WordPress would trigger PHP warnings and cause some markup to be 
339 | 	incorrectly filtered by the kses filter in WordPress.
340 | 
341 | 
342 | 1.0.1g (3 Jul 2007):
343 | 
344 | *	Fix for PHP 5 compiled without the mbstring module. Previous fix to 
345 | 	calculate the length of UTF-8 strings in `detab` when `mb_strlen` is 
346 | 	not available was only working with PHP 4.
347 | 
348 | *	Fixed a problem with WordPress 2.x where full-content posts in RSS feeds 
349 | 	were not processed correctly by Markdown.
350 | 
351 | *	Now supports URLs containing literal parentheses for inline links 
352 | 	and images, such as:
353 | 
354 | 		[WIMP](http://en.wikipedia.org/wiki/WIMP_(computing))
355 | 
356 | 	Such parentheses may be arbitrarily nested, but must be
357 | 	balanced. Unbalanced parentheses are allowed, however, when they are 
358 | 	escaped or when the URL is enclosed in angle brackets `<>`.
359 | 
360 | *	Fixed a performance problem where the regular expression for strong 
361 | 	emphasis introduced in version 1.0.1d could sometimes take a long time to 
362 | 	process, give slightly wrong results, and in some circumstances remove 
363 | 	the content of a whole paragraph entirely.
364 | 
365 | *	Some change in version 1.0.1d made possible the incorrect nesting of 
366 | 	anchors within each other. This is now fixed.
367 | 
368 | *	Fixed a rare issue where certain MD5 hashes in the content could
369 | 	be changed to their corresponding text. For instance, this:
370 | 
371 | 		The MD5 value for "+" is "26b17225b626fb9238849fd60eabdf60".
372 | 	
373 | 	was incorrectly changed to this in previous versions of PHP Markdown:
374 | 
375 | 		<p>The MD5 value for "+" is "+".</p>
376 | 
377 | *	Now convert escaped characters to their numeric character 
378 | 	references equivalent.
379 | 	
380 | 	This fixes an integration issue with SmartyPants and backslash escapes. 
381 | 	Since Markdown and SmartyPants have some escapable characters in common, 
382 | 	it was sometimes necessary to escape them twice. Previously, two 
383 | 	backslashes were sometimes required to prevent Markdown from "eating" the 
384 | 	backslash before SmartyPants sees it:
385 | 	
386 | 		Here are two hyphens: \\--
387 | 	
388 | 	Now, only one backslash will do:
389 | 	
390 | 		Here are two hyphens: \--
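
The balanced-parentheses rule for inline link URLs described in this release can be illustrated with a recursive PCRE pattern. This is a sketch only: `inline_link_url` is a hypothetical helper, and the pattern is not taken from the library, whose own implementation may differ.

```php
<?php
// Return the URL of the first inline link in $markdown, or null if none.
// Group 1 is the URL: runs of non-space, non-parenthesis characters, or a
// parenthesised group that recurses into group 1 via (?1).
function inline_link_url(string $markdown): ?string
{
    $re = '/\[[^\]]*\]\(((?:[^()\s]++|\((?1)\))*+)\)/';
    return preg_match($re, $markdown, $m) ? $m[1] : null;
}

echo inline_link_url('[WIMP](http://en.wikipedia.org/wiki/WIMP_(computing))');
```

The possessive quantifiers (`++`, `*+`) keep the pattern from backtracking heavily on unbalanced input, where the function simply returns null.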
391 | 
392 | 
393 | 1.0.1f (7 Feb 2007):
394 | 
395 | *	Fixed an issue with WordPress where manually-entered excerpts, but 
396 | 	not the auto-generated ones, would contain nested paragraphs.
397 | 
398 | *	Fixed an issue introduced in 1.0.1d where headers and blockquotes 
399 | 	preceded too closely by a paragraph (not separated by a blank line) 
400 | 	were incorrectly put inside the paragraph.
401 | 
402 | *	Fixed an issue introduced in 1.0.1d in the tokenizeHTML method where 
403 | 	two consecutive code spans would be merged into one when together they 
404 | 	form a valid tag in a multiline paragraph.
405 | 
406 | *	Fixed a long-standing issue where blank lines in code blocks would 
407 | 	be doubled when the code block is in a list item.
408 | 	
409 | 	This was due to the list processing functions relying on artificially 
410 | 	doubled blank lines to correctly determine when list items should 
411 | 	contain block-level content. The list item processing model was thus 
412 | 	changed to avoid the need for double blank lines.
413 | 
414 | *	Fixed an issue with `<% asp-style %>` instructions used as inline 
415 | 	content where the opening `<` was encoded as `&lt;`.
416 | 
417 | *	Fixed a parse error occurring when PHP is configured to accept 
418 | 	ASP-style delimiters as boundaries for PHP scripts.
419 | 
420 | *	Fixed a bug introduced in 1.0.1d where underscores in automatic links
421 | 	got swapped with emphasis tags.
422 | 
423 | 
424 | 1.0.1e (28 Dec 2006)
425 | 
426 | *	Added support for internationalized domain names for email addresses in 
427 | 	automatic links. Improved the speed at which email addresses are converted 
428 | 	to entities. Thanks to Milian Wolff for his optimisations.
429 | 
430 | *	Made deterministic the conversion to entities of email addresses in 
431 | 	automatic links. This means that a given email address will always be 
432 | 	encoded the same way.
433 | 
434 | *	PHP Markdown will now use its own function to calculate the length of a 
435 | 	UTF-8 string in `detab` when `mb_strlen` is not available instead of 
436 | 	giving a fatal error.
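
The `detab` fallback described in the last entry can be sketched as follows. The function name `detab_utf8`, the 4-column tab stops, and the regex-based character count are assumptions for illustration, not the library's actual code.

```php
<?php
// Expand tabs in one line to the next tab stop, counting UTF-8 characters
// rather than bytes. Assumption: tab stops every $tab_width columns.
function detab_utf8(string $line, int $tab_width = 4): string
{
    $blocks = explode("\t", $line);
    $out = array_shift($blocks);            // text before the first tab
    foreach ($blocks as $block) {
        // Prefer mb_strlen; without mbstring, count characters with a
        // UTF-8-aware regex instead of failing.
        $len = function_exists('mb_strlen')
            ? mb_strlen($out, 'UTF-8')
            : preg_match_all('/./us', $out);
        $pad = $tab_width - $len % $tab_width;
        $out .= str_repeat(' ', $pad) . $block;
    }
    return $out;
}

echo detab_utf8("\xC3\xA9\tnext");   // "é" is two bytes but one column
```

Counting bytes with `strlen` here would pad the two-byte "é" with only two spaces and misalign the following column.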
437 | 
438 | 
439 | 1.0.1d (1 Dec 2006)
440 | 
441 | *   Fixed a bug where inline images always had an empty title attribute. The 
442 | 	title attribute is now present only when explicitly defined.
443 | 
444 | *	Link reference definitions can now have an empty title. Previously, if the 
445 | 	title was defined but left empty, the link definition was ignored. This can 
446 | 	be useful if you want an empty title attribute in images to hide the 
447 | 	tooltip in Internet Explorer.
448 | 
449 | *	Made `detab` aware of UTF-8 characters. UTF-8 multi-byte sequences are now 
450 | 	correctly mapped to one character instead of the number of bytes.
451 | 
452 | *	Fixed a small bug with WordPress where WordPress' default filter `wpautop`
453 | 	was not properly deactivated on comment text, resulting in hard line breaks
454 | 	where Markdown does not prescribe them.
455 | 
456 | *	Added a `TextileRestricted` method to the textile compatibility mode. There
457 | 	is no restriction however, as Markdown does not have a restricted mode at 
458 | 	this point. This should make PHP Markdown work again in the latest 
459 | 	versions of TextPattern.
460 | 
461 | *   Converted PHP Markdown to an object-oriented design.
462 | 
463 | *	Changed span and block gamut methods so that they loop over a 
464 | 	customizable list of methods. This makes subclassing the parser a more 
465 | 	interesting option for creating syntax extensions.
466 | 
467 | *	Also added a "document" gamut loop which can be used to hook document-level 
468 | 	methods (like for stripping link definitions).
469 | 
470 | *	Changed all methods which were inserting HTML code so that they now return 
471 | 	a hashed representation of the code. New methods `hashSpan` and `hashBlock`
472 | 	are used to hash respectively span- and block-level generated content. This 
473 | 	has a couple of significant effects:
474 | 	
475 | 	1.	It prevents invalid nesting of Markdown-generated elements which 
476 | 	    could occur with constructs like `*something [link*][1]`.
477 | 	2.	It prevents problems occurring with deeply nested lists on which 
478 | 	    paragraphs were ill-formed.
479 | 	3.	It removes the need to call `hashHTMLBlocks` twice during the 
480 | 		block gamut.
481 | 	
482 | 	Hashes are turned back to HTML prior to output.
483 | 
484 | *	Made the block-level HTML parser smarter using a specially-crafted regular 
485 | 	expression capable of handling nested tags.
486 | 
487 | *	Solved backtick issues in tag attributes by rewriting the HTML tokenizer to 
488 | 	be aware of code spans. All these lines should work correctly now:
489 | 	
490 | 		<span attr='`ticks`'>bar</span>
491 | 		<span attr='``double ticks``'>bar</span>
492 | 		`<test a="` content of attribute `">`
493 | 
494 | *	Changed the parsing of HTML comments to match simply from `<!--` to `-->` 
495 | 	instead of using the more complicated SGML-style rule with paired `--`.
496 | 	This is how most browsers parse comments and how XML defines them too.
497 | 
498 | *	`<address>` has been added to the list of block-level elements and is now
499 | 	treated as an HTML block instead of being wrapped within paragraph tags.
500 | 
501 | *	Now only trim trailing newlines from code blocks, instead of trimming
502 | 	all trailing whitespace characters.
503 | 
504 | *	Fixed bug where this:
505 | 
506 | 		[text](http://m.com "title" )
507 | 		
508 | 	wasn't working as expected, because the parser wasn't allowing for spaces
509 | 	before the closing paren.
510 | 
511 | *	Filthy hack to support markdown='1' in div tags.
512 | 
513 | *	_DoAutoLinks() now supports the 'dict://' URL scheme.
514 | 
515 | *	PHP- and ASP-style processor instructions are now protected as
516 | 	raw HTML blocks.
517 | 
518 | 		<? ... ?>
519 | 		<% ... %>
520 | 
521 | *	Fix for escaped backticks still triggering code spans:
522 | 
523 | 		There are two raw backticks here: \` and here: \`, not a code span
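
The hashing approach introduced in this release (see the `hashSpan` and `hashBlock` entry above) can be pictured with the sketch below. The `HtmlStash` class, its method names, and the token format are hypothetical and do not mirror the library's internals; the sketch only shows why generated HTML swapped for an opaque token survives later passes.

```php
<?php
// Hypothetical helper illustrating the placeholder technique.
class HtmlStash
{
    private $stash = [];

    // Swap generated HTML for an opaque token that later passes will not match.
    public function hash(string $html): string
    {
        $key = "\x1A" . count($this->stash) . "\x1A";
        $this->stash[$key] = $html;
        return $key;
    }

    // Turn every token back into its HTML just before output.
    public function unhash(string $text): string
    {
        return $this->stash ? strtr($text, $this->stash) : $text;
    }
}

$stash = new HtmlStash();
$text  = 'see ' . $stash->hash('<a href="http://example.com/a_b_c">site</a>') . ' now';

// A later, naive emphasis pass: the underscores in the URL are out of reach.
$text = preg_replace('/_(.+?)_/', '<em>$1</em>', $text);

echo $stash->unhash($text);
```

Without the stash, the emphasis pass in this example would rewrite `_b_` inside the `href` attribute.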
524 | 
525 | 
526 | 1.0.1c (9 Dec 2005)
527 | 
528 | *   Fixed a problem occurring with PHP 5.1.1 due to a small
529 |     change to strings variable replacement behaviour in
530 |     this version.
531 | 
532 | 
533 | 1.0.1b (6 Jun 2005)
534 | 
535 | *	Fixed a bug where an inline image followed by a reference link would
536 | 	give a completely wrong result.
537 | 
538 | *	Fix for escaped backticks still triggering code spans:
539 | 	
540 | 		There are two raw backticks here: \` and here: \`, not a code span
541 | 
542 | *	Fix for an ordered list following an unordered list, and the 
543 | 	reverse. There is now a loop in _DoList that does the two 
544 | 	separately.
545 | 
546 | *	Fix for nested sub-lists in list-paragraph mode. Previously we got
547 | 	a spurious extra level of `<p>` tags for something like this:
548 | 
549 | 		*	this
550 | 		
551 | 			*	sub
552 | 		
553 | 			that
554 | 
555 | *	Fixed some incorrect behaviour with emphasis. This will now work
556 | 	as it should:
557 | 
558 | 		*test **thing***  
559 | 		**test *thing***  
560 | 		***thing* test**  
561 | 		***thing** test*
562 | 
563 | 		Name: __________  
564 | 		Address: _______
565 | 
566 | *	Correct a small bug in `_TokenizeHTML` where a Doctype declaration
567 | 	was not seen as HTML.
568 | 
569 | *	Major rewrite of the WordPress integration code that should 
570 | 	correct many problems by preventing default WordPress filters from 
571 | 	tampering with Markdown-formatted text. More details here:
572 | 	<http://michelf.com/weblog/2005/wordpress-text-flow-vs-markdown/>
573 | 
574 | 
575 | 1.0.1a (15 Apr 2005)
576 | 
577 | *	Fixed an issue where PHP warnings were triggered when converting
578 | 	text with list items running on PHP 4.0.6. This was coming from 
579 | 	the `rtrim` function which did not support the second argument 
580 | 	prior to version 4.1. Replaced by a regular expression.
581 | 
582 | *	Markdown now correctly filters post excerpts and comment
583 | 	excerpts in WordPress.
584 | 
585 | *	Automatic links and some code samples were "corrected" by 
586 | 	the balanceTags filter in WordPress meant to ensure HTML
587 | 	is well formed. This new version of PHP Markdown postpones this
588 | 	filter so that it runs after Markdown.
589 | 
590 | *	Blockquote syntax and some code samples were stripped by 
591 | 	a new WordPress 1.5 filter meant to remove unwanted HTML 
592 | 	in comments. This new version of PHP Markdown postpones this
593 | 	filter so that it runs after Markdown.
594 | 
595 | 
596 | 1.0.1 (16 Dec 2004):
597 | 
598 | *	Changed the syntax rules for code blocks and spans. Previously,
599 | 	backslash escapes for special Markdown characters were processed
600 | 	everywhere other than within inline HTML tags. Now, the contents of
601 | 	code blocks and spans are no longer processed for backslash escapes.
602 | 	This means that code blocks and spans are now treated literally,
603 | 	with no special rules to worry about regarding backslashes.
604 | 
605 | 	**IMPORTANT**: This breaks the syntax from all previous versions of
606 | 	Markdown. Code blocks and spans involving backslash characters will
607 | 	now generate different output than before.
608 | 
609 | 	Implementation-wise, this change was made by moving the call to
610 | 	`_EscapeSpecialChars()` from the top-level `Markdown()` function to
611 | 	within `_RunSpanGamut()`.
612 | 
613 | *	Significant performance improvements in `_DoHeader`, `_Detab`
614 | 	and `_TokenizeHTML`.
615 | 
616 | *	Added `>`, `+`, and `-` to the list of backslash-escapable
617 | 	characters. These should have been done when these characters
618 | 	were added as unordered list item markers.
619 | 
620 | *	Inline links using `<` and `>` URL delimiters weren't working:
621 | 
622 | 		like [this](<http://example.com/>)
623 | 
624 | 	Fixed by moving `_DoAutoLinks()` after `_DoAnchors()` in
625 | 	`_RunSpanGamut()`.
626 | 
627 | *	Fixed bug where auto-links were being processed within code spans:
628 | 
629 | 		like this: `<http://example.com/>`
630 | 
631 | 	Fixed by moving `_DoAutoLinks()` from `_RunBlockGamut()` to
632 | 	`_RunSpanGamut()`.
633 | 
634 | *	Sort-of fixed a bug where lines in the middle of hard-wrapped
635 | 	paragraphs, which lines look like the start of a list item,
636 | 	would accidentally trigger the creation of a list. E.g. a
637 | 	paragraph that looked like this:
638 | 
639 | 		I recommend upgrading to version
640 | 		8. Oops, now this line is treated
641 | 		as a sub-list.
642 | 
643 | 	This is fixed for top-level lists, but it can still happen for
644 | 	sub-lists. E.g., the following list item will not be parsed
645 | 	properly:
646 | 
647 | 		*	I recommend upgrading to version
648 | 			8. Oops, now this line is treated
649 | 			as a sub-list.
650 | 
651 | 	Given Markdown's list-creation rules, I'm not sure this can
652 | 	be fixed.
653 | 
654 | *	Fix for horizontal rules preceded by 2 or 3 spaces or followed by
655 | 	trailing spaces and tabs.
656 | 
657 | *	Standalone HTML comments are now handled; previously, they'd get
658 | 	wrapped in a spurious `<p>` tag.
659 | 
660 | *	`_HashHTMLBlocks()` now tolerates trailing spaces and tabs following
661 | 	HTML comments and `<hr/>` tags.
662 | 
663 | *	Changed special case pattern for hashing `<hr>` tags in
664 | 	`_HashHTMLBlocks()` so that they must occur within three spaces
665 | 	of left margin. (With 4 spaces or a tab, they should be
666 | 	code blocks, but weren't before this fix.)
667 | 
668 | *	Auto-linked email addresses can now optionally contain
669 | 	a 'mailto:' protocol. I.e. these are equivalent:
670 | 
671 | 		<mailto:user@example.com>
672 | 		<user@example.com>
673 | 
674 | *	Fixed annoying bug where nested lists would wind up with
675 | 	spurious (and invalid) `<p>` tags.
676 | 
677 | *	Changed `_StripLinkDefinitions()` so that link definitions must
678 | 	occur within three spaces of the left margin. Thus if you indent
679 | 	a link definition by four spaces or a tab, it will now be a code
680 | 	block.
681 | 
682 | *	You can now write empty links:
683 | 
684 | 		[like this]()
685 | 
686 | 	and they'll be turned into anchor tags with empty href attributes.
687 | 	This should have worked before, but didn't.
688 | 
689 | *	`***this***` and `___this___` are now turned into
690 | 
691 | 		<strong><em>this</em></strong>
692 | 
693 | 	Instead of
694 | 
695 | 		<strong><em>this</strong></em>
696 | 
697 | 	which isn't valid.
698 | 
699 | *	Fixed problem for links defined with urls that include parens, e.g.:
700 | 
701 | 		[1]: http://sources.wikipedia.org/wiki/Middle_East_Policy_(Chomsky)
702 | 
703 | 	"Chomsky" was being erroneously treated as the URL's title.
704 | 
705 | *	Double quotes in the title of an inline link used to give strange 
706 | 	results (incorrectly made entities). Fixed.
707 | 
708 | *	Tabs are now correctly changed into spaces. Previously, only 
709 | 	the first tab was converted. In code blocks, the second one was too,
710 | 	but was not always correctly aligned.
711 | 
712 | *	Fixed a bug where a tab character inserted after a quote on the same
713 | 	line could add a slash before the quotes.
714 | 
715 | 		This is "before"	[tab] and "after" a tab.
716 | 
717 | 	Previously gave this result:
718 | 
719 | 		<p>This is \"before\"  [tab] and "after" a tab.</p>
720 | 
721 | *	Removed a call to `htmlentities`. This fixes a bug where multibyte
722 | 	characters present in the title of a link reference could lead to
723 | 	invalid utf-8 characters. 
724 | 
725 | *	Changed a regular expression in `_TokenizeHTML` that could lead to
726 | 	a segmentation fault with PHP 4.3.8 on Linux.
727 | 
728 | *	Fixed some notices that could show up if PHP's E_NOTICE
729 | 	error-reporting flag was set.
730 | 
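The tab-handling fix in the list above is easier to picture with a small example. The following is only a sketch of expanding tabs to tab stops, assuming a tab width of 4 (the `MARKDOWN_TAB_WIDTH` default). `expand_tabs_sketch` is a hypothetical name used for illustration; it is not part of PHP Markdown, whose own `detab` implementation may differ.

```php
<?php
# Hypothetical helper for illustration only; not part of PHP Markdown.
function expand_tabs_sketch($line, $tab_width = 4) {
	# Every piece except the last was followed by a tab in the input.
	$pieces = explode("\t", $line);
	$last   = array_pop($pieces);

	$out = '';
	foreach ($pieces as $piece) {
		$out .= $piece;
		# Pad to the next tab stop. Columns are counted in bytes here;
		# multibyte text would need a character-aware length function.
		$pad  = $tab_width - (strlen($out) % $tab_width);
		$out .= str_repeat(' ', $pad);
	}
	return $out . $last;
}
```

For instance, `expand_tabs_sketch("ab\tc\td")` should give `ab  c   d`: each tab pads the text out to the next multiple of four columns, which is why every tab on a line has to be converted, not just the first.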
731 | 
732 | Copyright and License
733 | ---------------------
734 | 
735 | PHP Markdown
736 | Copyright (c) 2004-2009 Michel Fortin  
737 | <http://michelf.com/>  
738 | All rights reserved.
739 | 
740 | Based on Markdown  
741 | Copyright (c) 2003-2006 John Gruber   
742 | <http://daringfireball.net/>   
743 | All rights reserved.
744 | 
745 | Redistribution and use in source and binary forms, with or without
746 | modification, are permitted provided that the following conditions are
747 | met:
748 | 
749 | *   Redistributions of source code must retain the above copyright notice,
750 |     this list of conditions and the following disclaimer.
751 | 
752 | *   Redistributions in binary form must reproduce the above copyright
753 |     notice, this list of conditions and the following disclaimer in the
754 |     documentation and/or other materials provided with the distribution.
755 | 
756 | *   Neither the name "Markdown" nor the names of its contributors may
757 |     be used to endorse or promote products derived from this software
758 |     without specific prior written permission.
759 | 
760 | This software is provided by the copyright holders and contributors "as
761 | is" and any express or implied warranties, including, but not limited
762 | to, the implied warranties of merchantability and fitness for a
763 | particular purpose are disclaimed. In no event shall the copyright owner
764 | or contributors be liable for any direct, indirect, incidental, special,
765 | exemplary, or consequential damages (including, but not limited to,
766 | procurement of substitute goods or services; loss of use, data, or
767 | profits; or business interruption) however caused and on any theory of
768 | liability, whether in contract, strict liability, or tort (including
769 | negligence or otherwise) arising in any way out of the use of this
770 | software, even if advised of the possibility of such damage.
771 | 


--------------------------------------------------------------------------------
/php-markdown-modified.php:
--------------------------------------------------------------------------------
   1 | <?php
   2 | #
   3 | # Markdown  -  A text-to-HTML conversion tool for web writers
   4 | #
   5 | # PHP Markdown
   6 | # Copyright (c) 2004-2012 Michel Fortin  
   7 | # <http://michelf.com/projects/php-markdown/>
   8 | #
   9 | # Original Markdown
  10 | # Copyright (c) 2004-2006 John Gruber  
  11 | # <http://daringfireball.net/projects/markdown/>
  12 | #
  13 | 
  14 | 
  15 | define( 'AJF_MARKDOWN_VERSION',  "1.0.1o" ); # Sun 8 Jan 2012
  16 | 
  17 | 
  18 | #
  19 | # Global default settings:
  20 | #
  21 | 
  22 | # Change to ">" for HTML output
  23 | @define( 'MARKDOWN_EMPTY_ELEMENT_SUFFIX',  " />");
  24 | 
  25 | # Define the width of a tab for code blocks.
  26 | @define( 'MARKDOWN_TAB_WIDTH',     4 );
  27 | 
  28 | 
  29 | #
  30 | # WordPress settings:
  31 | #
  32 | 
  33 | # Change to false to remove Markdown from posts and/or comments.
  34 | @define( 'MARKDOWN_WP_POSTS',      true );
  35 | @define( 'MARKDOWN_WP_COMMENTS',   true );
  36 | 
  37 | 
  38 | 
  39 | ### Standard Function Interface ###
  40 | 
  41 | @define( 'MARKDOWN_PARSER_CLASS',  'AJF_Markdown_Parser' );
  42 | 
  43 | function AJF_Markdown($text) {
  44 | #
  45 | # Initialize the parser and return the result of its transform method.
  46 | #
  47 | 	# Set up the static parser variable.
  48 | 	static $parser;
  49 | 	if (!isset($parser)) {
  50 | 		$parser_class = MARKDOWN_PARSER_CLASS;
  51 | 		$parser = new $parser_class;
  52 | 	}
  53 | 
  54 | 	# Transform text using parser.
  55 | 	return $parser->transform($text);
  56 | }
  57 | 
  58 | 
  59 | ### WordPress Plugin Interface ###
  60 | 
  61 | /*
  62 | Plugin Name: Markdown - Modified
  63 | Plugin URI: https://github.com/afragen/php-markdown-modified
  64 | GitHub Plugin URI: https://github.com/afragen/php-markdown-modified
  65 | Description: Modified to work alongside Markdown on Save variants. All posts containing Markdown are rendered regardless of Markdown on Save variant setting. Using PHP Markdown 1.0.1o
  66 | Version: 1.0.1
  67 | Author: Andy Fragen
  68 | */
  69 | 
  70 | 
  71 | if (isset($wp_version)) {
  72 | 	# More details about how it works here:
  73 | 	# <http://michelf.com/weblog/2005/wordpress-text-flow-vs-markdown/>
  74 | 	
  75 | 	# Post content and excerpts
  76 | 	# - Remove WordPress paragraph generator.
  77 | 	# - Run Markdown on excerpt, then remove all tags.
  78 | 	# - Add paragraph tag around the excerpt, but remove it for the excerpt rss.
  79 | 	if (MARKDOWN_WP_POSTS) {
  80 | 		remove_filter('the_content',     'wpautop');
  81 | 		remove_filter('the_content_rss', 'wpautop');
  82 | 		remove_filter('the_excerpt',     'wpautop');
  83 | 		add_filter('the_content',     'AJF_Markdown', 6);
  84 | 		add_filter('the_content_rss', 'AJF_Markdown', 6);
  85 | 		add_filter('get_the_excerpt', 'AJF_Markdown', 6);
  86 | 		add_filter('get_the_excerpt', 'trim', 7);
  87 | 		add_filter('the_excerpt',     'ajf_mdwp_add_p');
  88 | 		add_filter('the_excerpt_rss', 'ajf_mdwp_strip_p');
  89 | 		
  90 | 		remove_filter('content_save_pre',  'balanceTags', 50);
  91 | 		remove_filter('excerpt_save_pre',  'balanceTags', 50);
  92 | 		add_filter('the_content',  	  'balanceTags', 50);
  93 | 		add_filter('get_the_excerpt', 'balanceTags', 9);
  94 | 	}
  95 | 	
  96 | 	# Comments
  97 | 	# - Remove WordPress paragraph generator.
  98 | 	# - Remove WordPress auto-link generator.
  99 | 	# - Scramble important tags before passing them to the kses filter.
 100 | 	# - Run Markdown on excerpt then remove paragraph tags.
 101 | 	if (MARKDOWN_WP_COMMENTS) {
 102 | 		remove_filter('comment_text', 'wpautop', 30);
 103 | 		remove_filter('comment_text', 'make_clickable');
 104 | 		add_filter('pre_comment_content', 'AJF_Markdown', 6);
 105 | 		add_filter('pre_comment_content', 'ajf_mdwp_hide_tags', 8);
 106 | 		add_filter('pre_comment_content', 'ajf_mdwp_show_tags', 12);
 107 | 		add_filter('get_comment_text',    'AJF_Markdown', 6);
 108 | 		add_filter('get_comment_excerpt', 'AJF_Markdown', 6);
 109 | 		add_filter('get_comment_excerpt', 'ajf_mdwp_strip_p', 7);
 110 | 	
 111 | 		global $mdwp_hidden_tags, $mdwp_placeholders;
 112 | 		$mdwp_hidden_tags = explode(' ',
 113 | 			'<p> </p> <pre> </pre> <ol> </ol> <ul> </ul> <li> </li>');
 114 | 		$mdwp_placeholders = explode(' ', str_rot13(
 115 | 			'pEj07ZbbBZ U1kqgh4w4p pre2zmeN6K QTi31t9pre ol0MP1jzJR '.
 116 | 			'ML5IjmbRol ulANi1NsGY J7zRLJqPul liA8ctl16T K9nhooUHli'));
 117 | 	}
 118 | 	
 119 | 	function ajf_mdwp_add_p($text) {
 120 | 		if (!preg_match('{^$|^<(p|ul|ol|dl|pre|blockquote)>}i', $text)) {
 121 | 			$text = '<p>'.$text.'</p>';
 122 | 			$text = preg_replace('{\n{2,}}', "</p>\n\n<p>", $text);
 123 | 		}
 124 | 		return $text;
 125 | 	}
 126 | 	
 127 | 	function ajf_mdwp_strip_p($t) { return preg_replace('{</?p>}i', '', $t); }
 128 | 
 129 | 	function ajf_mdwp_hide_tags($text) {
 130 | 		global $mdwp_hidden_tags, $mdwp_placeholders;
 131 | 		return str_replace($mdwp_hidden_tags, $mdwp_placeholders, $text);
 132 | 	}
 133 | 	function ajf_mdwp_show_tags($text) {
 134 | 		global $mdwp_hidden_tags, $mdwp_placeholders;
 135 | 		return str_replace($mdwp_placeholders, $mdwp_hidden_tags, $text);
 136 | 	}
 137 | }
 138 | 
 139 | 
 140 | ### bBlog Plugin Info ###
 141 | 
 142 | function AJF_identify_modifier_markdown() {
 143 | 	return array(
 144 | 		'name'			=> 'markdown',
 145 | 		'type'			=> 'modifier',
 146 | 		'nicename'		=> 'Markdown',
 147 | 		'description'	=> 'A text-to-HTML conversion tool for web writers',
 148 | 		'authors'		=> 'Michel Fortin and John Gruber',
 149 | 		'licence'		=> 'BSD-like',
 150 | 		'version'		=> AJF_MARKDOWN_VERSION,
 151 | 		'help'			=> '<a href="http://daringfireball.net/projects/markdown/syntax">Markdown syntax</a> allows you to write using an easy-to-read, easy-to-write plain text format. Based on the original Perl version by <a href="http://daringfireball.net/">John Gruber</a>. <a href="http://michelf.com/projects/php-markdown/">More...</a>'
 152 | 	);
 153 | }
 154 | 
 155 | 
 156 | ### Smarty Modifier Interface ###
 157 | 
 158 | function AJF_smarty_modifier_markdown($text) {
 159 | 	return AJF_Markdown($text);
 160 | }
 161 | 
 162 | 
 163 | ### Textile Compatibility Mode ###
 164 | 
 165 | # Rename this file to "classTextile.php" and it can replace Textile everywhere.
 166 | 
 167 | if (strcasecmp(substr(__FILE__, -16), "classTextile.php") == 0) {
 168 | 	# Try to include PHP SmartyPants. Should be in the same directory.
 169 | 	@include_once 'smartypants.php';
 170 | 	# Fake Textile class. It calls Markdown instead.
 171 | 	class Textile {
 172 | 		function TextileThis($text, $lite='', $encode='') {
 173 | 			if ($lite == '' && $encode == '')    $text = AJF_Markdown($text);
 174 | 			if (function_exists('SmartyPants'))  $text = SmartyPants($text);
 175 | 			return $text;
 176 | 		}
 177 | 		# Fake restricted version: restrictions are not supported for now.
 178 | 		function TextileRestricted($text, $lite='', $noimage='') {
 179 | 			return $this->TextileThis($text, $lite);
 180 | 		}
 181 | 		# Workaround to ensure compatibility with TextPattern 4.0.3.
 182 | 		function blockLite($text) { return $text; }
 183 | 	}
 184 | }
 185 | 
 186 | 
 187 | 
 188 | #
 189 | # Markdown Parser Class
 190 | #
 191 | 
 192 | class AJF_Markdown_Parser {
 193 | 
 194 | 	# Regex to match balanced [brackets].
 195 | 	# Needed to insert a maximum bracket depth while converting to PHP.
 196 | 	var $nested_brackets_depth = 6;
 197 | 	var $nested_brackets_re;
 198 | 	
 199 | 	var $nested_url_parenthesis_depth = 4;
 200 | 	var $nested_url_parenthesis_re;
 201 | 
 202 | 	# Table of hash values for escaped characters:
 203 | 	var $escape_chars = '\`*_{}[]()>#+-.!';
 204 | 	var $escape_chars_re;
 205 | 
 206 | 	# Change to ">" for HTML output.
 207 | 	var $empty_element_suffix = MARKDOWN_EMPTY_ELEMENT_SUFFIX;
 208 | 	var $tab_width = MARKDOWN_TAB_WIDTH;
 209 | 	
 210 | 	# Change to `true` to disallow markup or entities.
 211 | 	var $no_markup = false;
 212 | 	var $no_entities = false;
 213 | 	
 214 | 	# Predefined urls and titles for reference links and images.
 215 | 	var $predef_urls = array();
 216 | 	var $predef_titles = array();
 217 | 
 218 | 
 219 | 	function __construct() {
 220 | 	#
 221 | 	# Constructor function. Initialize appropriate member variables.
 222 | 	#
 223 | 		$this->_initDetab();
 224 | 		$this->prepareItalicsAndBold();
 225 | 	
 226 | 		$this->nested_brackets_re = 
 227 | 			str_repeat('(?>[^\[\]]+|\[', $this->nested_brackets_depth).
 228 | 			str_repeat('\])*', $this->nested_brackets_depth);
 229 | 	
 230 | 		$this->nested_url_parenthesis_re = 
 231 | 			str_repeat('(?>[^()\s]+|\(', $this->nested_url_parenthesis_depth).
 232 | 			str_repeat('(?>\)))*', $this->nested_url_parenthesis_depth);
 233 | 		
 234 | 		$this->escape_chars_re = '['.preg_quote($this->escape_chars).']';
 235 | 		
 236 | 		# Sort document, block, and span gamut in ascending priority order.
 237 | 		asort($this->document_gamut);
 238 | 		asort($this->block_gamut);
 239 | 		asort($this->span_gamut);
 240 | 	}
 241 | 
 242 | 
 243 | 	# Internal hashes used during transformation.
 244 | 	var $urls = array();
 245 | 	var $titles = array();
 246 | 	var $html_hashes = array();
 247 | 	
 248 | 	# Status flag to avoid invalid nesting.
 249 | 	var $in_anchor = false;
 250 | 	
 251 | 	
 252 | 	function setup() {
 253 | 	#
 254 | 	# Called before the transformation process starts to set up parser 
 255 | 	# states.
 256 | 	#
 257 | 		# Clear global hashes.
 258 | 		$this->urls = $this->predef_urls;
 259 | 		$this->titles = $this->predef_titles;
 260 | 		$this->html_hashes = array();
 261 | 		
 262 | 		$this->in_anchor = false;
 263 | 	}
 264 | 	
 265 | 	function teardown() {
 266 | 	#
 267 | 	# Called after the transformation process to clear any variable 
 268 | 	# which may be taking up memory unnecessarily.
 269 | 	#
 270 | 		$this->urls = array();
 271 | 		$this->titles = array();
 272 | 		$this->html_hashes = array();
 273 | 	}
 274 | 
 275 | 
 276 | 	function transform($text) {
 277 | 	#
 278 | 	# Main function. Performs some preprocessing on the input text
 279 | 	# and passes it through the document gamut.
 280 | 	#
 281 | 		$this->setup();
 282 | 	
 283 | 		# Remove UTF-8 BOM and marker character in input, if present.
 284 | 		$text = preg_replace('{^\xEF\xBB\xBF|\x1A}', '', $text);
 285 | 
 286 | 		# Standardize line endings:
 287 | 		#   DOS to Unix and Mac to Unix
 288 | 		$text = preg_replace('{\r\n?}', "\n", $text);
 289 | 
 290 | 		# Make sure $text ends with a couple of newlines:
 291 | 		$text .= "\n\n";
 292 | 
 293 | 		# Convert all tabs to spaces.
 294 | 		$text = $this->detab($text);
 295 | 
 296 | 		# Turn block-level HTML blocks into hash entries
 297 | 		$text = $this->hashHTMLBlocks($text);
 298 | 
 299 | 		# Strip any lines consisting only of spaces and tabs.
 300 | 		# This makes subsequent regexen easier to write, because we can
 301 | 		# match consecutive blank lines with /\n+/ instead of something
 302 | 		# contorted like /[ ]*\n+/ .
 303 | 		$text = preg_replace('/^[ ]+$/m', '', $text);
 304 | 
 305 | 		# Run document gamut methods.
 306 | 		foreach ($this->document_gamut as $method => $priority) {
 307 | 			$text = $this->$method($text);
 308 | 		}
 309 | 		
 310 | 		$this->teardown();
 311 | 
 312 | 		return $text . "\n";
 313 | 	}
 314 | 	
 315 | 	var $document_gamut = array(
 316 | 		# Strip link definitions, store in hashes.
 317 | 		"stripLinkDefinitions" => 20,
 318 | 		
 319 | 		"runBasicBlockGamut"   => 30,
 320 | 		);
 321 | 
 322 | 
 323 | 	function stripLinkDefinitions($text) {
 324 | 	#
 325 | 	# Strips link definitions from text, stores the URLs and titles in
 326 | 	# hash references.
 327 | 	#
 328 | 		$less_than_tab = $this->tab_width - 1;
 329 | 
 330 | 		# Link defs are in the form: ^[id]: url "optional title"
 331 | 		$text = preg_replace_callback('{
 332 | 							^[ ]{0,'.$less_than_tab.'}\[(.+)\][ ]?:	# id = $1
 333 | 							  [ ]*
 334 | 							  \n?				# maybe *one* newline
 335 | 							  [ ]*
 336 | 							(?:
 337 | 							  <(.+?)>			# url = $2
 338 | 							|
 339 | 							  (\S+?)			# url = $3
 340 | 							)
 341 | 							  [ ]*
 342 | 							  \n?				# maybe one newline
 343 | 							  [ ]*
 344 | 							(?:
 345 | 								(?<=\s)			# lookbehind for whitespace
 346 | 								["(]
 347 | 								(.*?)			# title = $4
 348 | 								[")]
 349 | 								[ ]*
 350 | 							)?	# title is optional
 351 | 							(?:\n+|\Z)
 352 | 			}xm',
 353 | 			array(&$this, '_stripLinkDefinitions_callback'),
 354 | 			$text);
 355 | 		return $text;
 356 | 	}
 357 | 	function _stripLinkDefinitions_callback($matches) {
 358 | 		$link_id = strtolower($matches[1]);
 359 | 		$url = $matches[2] == '' ? $matches[3] : $matches[2];
 360 | 		$this->urls[$link_id] = $url;
 361 | 		$this->titles[$link_id] =& $matches[4];
 362 | 		return ''; # String that will replace the block
 363 | 	}
 364 | 
 365 | 
 366 | 	function hashHTMLBlocks($text) {
 367 | 		if ($this->no_markup)  return $text;
 368 | 
 369 | 		$less_than_tab = $this->tab_width - 1;
 370 | 
 371 | 		# Hashify HTML blocks:
 372 | 		# We only want to do this for block-level HTML tags, such as headers,
 373 | 		# lists, and tables. That's because we still want to wrap <p>s around
 374 | 		# "paragraphs" that are wrapped in non-block-level tags, such as anchors,
 375 | 		# phrase emphasis, and spans. The list of tags we're looking for is
 376 | 		# hard-coded:
 377 | 		#
 378 | 		# *  List "a" is made of tags which can be both inline or block-level.
 379 | 		#    These will be treated block-level when the start tag is alone on 
 380 | 		#    its line, otherwise they're not matched here and will be taken as 
 381 | 		#    inline later.
 382 | 		# *  List "b" is made of tags which are always block-level;
 383 | 		#
 384 | 		$block_tags_a_re = 'ins|del';
 385 | 		$block_tags_b_re = 'p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|address|'.
 386 | 						   'script|noscript|form|fieldset|iframe|math';
 387 | 
 388 | 		# Regular expression for the content of a block tag.
 389 | 		$nested_tags_level = 4;
 390 | 		$attr = '
 391 | 			(?>				# optional tag attributes
 392 | 			  \s			# starts with whitespace
 393 | 			  (?>
 394 | 				[^>"/]+		# text outside quotes
 395 | 			  |
 396 | 				/+(?!>)		# slash not followed by ">"
 397 | 			  |
 398 | 				"[^"]*"		# text inside double quotes (tolerate ">")
 399 | 			  |
 400 | 				\'[^\']*\'	# text inside single quotes (tolerate ">")
 401 | 			  )*
 402 | 			)?	
 403 | 			';
 404 | 		$content =
 405 | 			str_repeat('
 406 | 				(?>
 407 | 				  [^<]+			# content without tag
 408 | 				|
 409 | 				  <\2			# nested opening tag
 410 | 					'.$attr.'	# attributes
 411 | 					(?>
 412 | 					  />
 413 | 					|
 414 | 					  >', $nested_tags_level).	# end of opening tag
 415 | 					  '.*?'.					# last level nested tag content
 416 | 			str_repeat('
 417 | 					  </\2\s*>	# closing nested tag
 418 | 					)
 419 | 				  |				
 420 | 					<(?!/\2\s*>	# other tags with a different name
 421 | 				  )
 422 | 				)*',
 423 | 				$nested_tags_level);
 424 | 		$content2 = str_replace('\2', '\3', $content);
 425 | 
 426 | 		# First, look for nested blocks, e.g.:
 427 | 		# 	<div>
 428 | 		# 		<div>
 429 | 		# 		tags for inner block must be indented.
 430 | 		# 		</div>
 431 | 		# 	</div>
 432 | 		#
 433 | 		# The outermost tags must start at the left margin for this to match, and
 434 | 		# the inner nested divs must be indented.
 435 | 		# We need to do this before the next, more liberal match, because the next
 436 | 		# match will start at the first `<div>` and stop at the first `</div>`.
 437 | 		$text = preg_replace_callback('{(?>
 438 | 			(?>
 439 | 				(?<=\n\n)		# Starting after a blank line
 440 | 				|				# or
 441 | 				\A\n?			# the beginning of the doc
 442 | 			)
 443 | 			(						# save in $1
 444 | 
 445 | 			  # Match from `\n<tag>` to `</tag>\n`, handling nested tags 
 446 | 			  # in between.
 447 | 					
 448 | 						[ ]{0,'.$less_than_tab.'}
 449 | 						<('.$block_tags_b_re.')# start tag = $2
 450 | 						'.$attr.'>			# attributes followed by >
 451 | 						'.$content.'		# content, support nesting
 452 | 						</\2>				# the matching end tag
 453 | 						[ ]*				# trailing spaces/tabs
 454 | 						(?=\n+|\Z)	# followed by a newline or end of document
 455 | 
 456 | 			| # Special version for tags of group a.
 457 | 
 458 | 						[ ]{0,'.$less_than_tab.'}
 459 | 						<('.$block_tags_a_re.')# start tag = $3
 460 | 						'.$attr.'>[ ]*\n	# attributes followed by > and \n
 461 | 						'.$content2.'		# content, support nesting
 462 | 						</\3>				# the matching end tag
 463 | 						[ ]*				# trailing spaces/tabs
 464 | 						(?=\n+|\Z)	# followed by a newline or end of document
 465 | 					
 466 | 			| # Special case just for <hr />. It was easier to make a special 
 467 | 			  # case than to make the other regex more complicated.
 468 | 			
 469 | 						[ ]{0,'.$less_than_tab.'}
 470 | 						<(hr)				# start tag = $2
 471 | 						'.$attr.'			# attributes
 472 | 						/?>					# the matching end tag
 473 | 						[ ]*
 474 | 						(?=\n{2,}|\Z)		# followed by a blank line or end of document
 475 | 			
 476 | 			| # Special case for standalone HTML comments:
 477 | 			
 478 | 					[ ]{0,'.$less_than_tab.'}
 479 | 					(?s:
 480 | 						<!-- .*? -->
 481 | 					)
 482 | 					[ ]*
 483 | 					(?=\n{2,}|\Z)		# followed by a blank line or end of document
 484 | 			
 485 | 			| # PHP and ASP-style processor instructions (<? and <%)
 486 | 			
 487 | 					[ ]{0,'.$less_than_tab.'}
 488 | 					(?s:
 489 | 						<([?%])			# $2
 490 | 						.*?
 491 | 						\2>
 492 | 					)
 493 | 					[ ]*
 494 | 					(?=\n{2,}|\Z)		# followed by a blank line or end of document
 495 | 					
 496 | 			)
 497 | 			)}Sxmi',
 498 | 			array(&$this, '_hashHTMLBlocks_callback'),
 499 | 			$text);
 500 | 
 501 | 		return $text;
 502 | 	}
 503 | 	function _hashHTMLBlocks_callback($matches) {
 504 | 		$text = $matches[1];
 505 | 		$key  = $this->hashBlock($text);
 506 | 		return "\n\n$key\n\n";
 507 | 	}
 508 | 	
 509 | 	
 510 | 	function hashPart($text, $boundary = 'X') {
 511 | 	#
 512 | 	# Called whenever a tag must be hashed when a function inserts an atomic 
 513 | 	# element in the text stream. Passing $text through this function gives
 514 | 	# a unique text-token which will be reverted back when calling unhash.
 515 | 	#
 516 | 	# The $boundary argument specifies what character should be used to surround
 517 | 	# the token. By convention, "B" is used for block elements that need not
 518 | 	# be wrapped into paragraph tags at the end, ":" is used for elements
 519 | 	# that are word separators and "X" is used in the general case.
 520 | 	#
 521 | 		# Swap back any tag hash found in $text so we do not have to `unhash`
 522 | 		# multiple times at the end.
 523 | 		$text = $this->unhash($text);
 524 | 		
 525 | 		# Then hash the block.
 526 | 		static $i = 0;
 527 | 		$key = "$boundary\x1A" . ++$i . $boundary;
 528 | 		$this->html_hashes[$key] = $text;
 529 | 		return $key; # String that will replace the tag.
 530 | 	}
 531 | 
 532 | 
 533 | 	function hashBlock($text) {
 534 | 	#
 535 | 	# Shortcut function for hashPart with block-level boundaries.
 536 | 	#
 537 | 		return $this->hashPart($text, 'B');
 538 | 	}
 539 | 
 540 | 
 541 | 	var $block_gamut = array(
 542 | 	#
 543 | 	# These are all the transformations that form block-level
 544 | 	# tags like paragraphs, headers, and list items.
 545 | 	#
 546 | 		"doHeaders"         => 10,
 547 | 		"doHorizontalRules" => 20,
 548 | 		
 549 | 		"doLists"           => 40,
 550 | 		"doCodeBlocks"      => 50,
 551 | 		"doBlockQuotes"     => 60,
 552 | 		);
 553 | 
 554 | 	function runBlockGamut($text) {
 555 | 	#
 556 | 	# Run block gamut transformations.
 557 | 	#
 558 | 		# We need to escape raw HTML in Markdown source before doing anything 
 559 | 		# else. This needs to be done for each block, and not only at the 
 560 | 		# beginning in the Markdown function, since hashed blocks can be part of
 561 | 		# list items and could have been indented. Indented blocks would have 
 562 | 		# been seen as a code block in a previous pass of hashHTMLBlocks.
 563 | 		$text = $this->hashHTMLBlocks($text);
 564 | 		
 565 | 		return $this->runBasicBlockGamut($text);
 566 | 	}
 567 | 	
 568 | 	function runBasicBlockGamut($text) {
 569 | 	#
 570 | 	# Run block gamut transformations, without hashing HTML blocks. This is 
 571 | 	# useful when HTML blocks are known to be already hashed, like in the first
 572 | 	# whole-document pass.
 573 | 	#
 574 | 		foreach ($this->block_gamut as $method => $priority) {
 575 | 			$text = $this->$method($text);
 576 | 		}
 577 | 		
 578 | 		# Finally form paragraph and restore hashed blocks.
 579 | 		$text = $this->formParagraphs($text);
 580 | 
 581 | 		return $text;
 582 | 	}
 583 | 	
 584 | 	
 585 | 	function doHorizontalRules($text) {
 586 | 		# Do Horizontal Rules:
 587 | 		return preg_replace(
 588 | 			'{
 589 | 				^[ ]{0,3}	# Leading space
 590 | 				([-*_])		# $1: First marker
 591 | 				(?>			# Repeated marker group
 592 | 					[ ]{0,2}	# Zero, one, or two spaces.
 593 | 					\1			# Marker character
 594 | 				){2,}		# Group repeated at least twice
 595 | 				[ ]*		# Trailing spaces
 596 | 				$			# End of line.
 597 | 			}mx',
 598 | 			"\n".$this->hashBlock("<hr$this->empty_element_suffix")."\n", 
 599 | 			$text);
 600 | 	}
 601 | 
 602 | 
 603 | 	var $span_gamut = array(
 604 | 	#
 605 | 	# These are all the transformations that occur *within* block-level
 606 | 	# tags like paragraphs, headers, and list items.
 607 | 	#
 608 | 		# Process character escapes, code spans, and inline HTML
 609 | 		# in one shot.
 610 | 		"parseSpan"           => -30,
 611 | 
 612 | 		# Process anchor and image tags. Images must come first,
 613 | 		# because ![foo][f] looks like an anchor.
 614 | 		"doImages"            =>  10,
 615 | 		"doAnchors"           =>  20,
 616 | 		
 617 | 		# Make links out of things like `<http://example.com/>`
 618 | 		# Must come after doAnchors, because you can use < and >
 619 | 		# delimiters in inline links like [this](<url>).
 620 | 		"doAutoLinks"         =>  30,
 621 | 		"encodeAmpsAndAngles" =>  40,
 622 | 
 623 | 		"doItalicsAndBold"    =>  50,
 624 | 		"doHardBreaks"        =>  60,
 625 | 		);
 626 | 
 627 | 	function runSpanGamut($text) {
 628 | 	#
 629 | 	# Run span gamut transformations.
 630 | 	#
 631 | 		foreach ($this->span_gamut as $method => $priority) {
 632 | 			$text = $this->$method($text);
 633 | 		}
 634 | 
 635 | 		return $text;
 636 | 	}
 637 | 	
 638 | 	
 639 | 	function doHardBreaks($text) {
 640 | 		# Do hard breaks:
 641 | 		return preg_replace_callback('/ {2,}\n/', 
 642 | 			array(&$this, '_doHardBreaks_callback'), $text);
 643 | 	}
 644 | 	function _doHardBreaks_callback($matches) {
 645 | 		return $this->hashPart("<br$this->empty_element_suffix\n");
 646 | 	}
 647 | 
 648 | 
 649 | 	function doAnchors($text) {
 650 | 	#
 651 | 	# Turn Markdown link shortcuts into XHTML <a> tags.
 652 | 	#
 653 | 		if ($this->in_anchor) return $text;
 654 | 		$this->in_anchor = true;
 655 | 		
 656 | 		#
 657 | 		# First, handle reference-style links: [link text] [id]
 658 | 		#
 659 | 		$text = preg_replace_callback('{
 660 | 			(					# wrap whole match in $1
 661 | 			  \[
 662 | 				('.$this->nested_brackets_re.')	# link text = $2
 663 | 			  \]
 664 | 
 665 | 			  [ ]?				# one optional space
 666 | 			  (?:\n[ ]*)?		# one optional newline followed by spaces
 667 | 
 668 | 			  \[
 669 | 				(.*?)		# id = $3
 670 | 			  \]
 671 | 			)
 672 | 			}xs',
 673 | 			array(&$this, '_doAnchors_reference_callback'), $text);
 674 | 
 675 | 		#
 676 | 		# Next, inline-style links: [link text](url "optional title")
 677 | 		#
 678 | 		$text = preg_replace_callback('{
 679 | 			(				# wrap whole match in $1
 680 | 			  \[
 681 | 				('.$this->nested_brackets_re.')	# link text = $2
 682 | 			  \]
 683 | 			  \(			# literal paren
 684 | 				[ \n]*
 685 | 				(?:
 686 | 					<(.+?)>	# href = $3
 687 | 				|
 688 | 					('.$this->nested_url_parenthesis_re.')	# href = $4
 689 | 				)
 690 | 				[ \n]*
 691 | 				(			# $5
 692 | 				  ([\'"])	# quote char = $6
 693 | 				  (.*?)		# Title = $7
 694 | 				  \6		# matching quote
 695 | 				  [ \n]*	# ignore any spaces/tabs between closing quote and )
 696 | 				)?			# title is optional
 697 | 			  \)
 698 | 			)
 699 | 			}xs',
 700 | 			array(&$this, '_doAnchors_inline_callback'), $text);
 701 | 
 702 | 		#
 703 | 		# Last, handle reference-style shortcuts: [link text]
 704 | 		# These must come last in case you've also got [link text][1]
 705 | 		# or [link text](/foo)
 706 | 		#
 707 | 		$text = preg_replace_callback('{
 708 | 			(					# wrap whole match in $1
 709 | 			  \[
 710 | 				([^\[\]]+)		# link text = $2; can\'t contain [ or ]
 711 | 			  \]
 712 | 			)
 713 | 			}xs',
 714 | 			array(&$this, '_doAnchors_reference_callback'), $text);
 715 | 
 716 | 		$this->in_anchor = false;
 717 | 		return $text;
 718 | 	}
 719 | 	function _doAnchors_reference_callback($matches) {
 720 | 		$whole_match =  $matches[1];
 721 | 		$link_text   =  $matches[2];
 722 | 		$link_id     =& $matches[3];
 723 | 
 724 | 		if ($link_id == "") {
 725 | 			# for shortcut links like [this][] or [this].
 726 | 			$link_id = $link_text;
 727 | 		}
 728 | 		
 729 | 		# lower-case and turn embedded newlines into spaces
 730 | 		$link_id = strtolower($link_id);
 731 | 		$link_id = preg_replace('{[ ]?\n}', ' ', $link_id);
 732 | 
 733 | 		if (isset($this->urls[$link_id])) {
 734 | 			$url = $this->urls[$link_id];
 735 | 			$url = $this->encodeAttribute($url);
 736 | 			
 737 | 			$result = "<a href=\"$url\"";
 738 | 			if ( isset( $this->titles[$link_id] ) ) {
 739 | 				$title = $this->titles[$link_id];
 740 | 				$title = $this->encodeAttribute($title);
 741 | 				$result .=  " title=\"$title\"";
 742 | 			}
 743 | 		
 744 | 			$link_text = $this->runSpanGamut($link_text);
 745 | 			$result .= ">$link_text</a>";
 746 | 			$result = $this->hashPart($result);
 747 | 		}
 748 | 		else {
 749 | 			$result = $whole_match;
 750 | 		}
 751 | 		return $result;
 752 | 	}
 753 | 	function _doAnchors_inline_callback($matches) {
 754 | 		$whole_match	=  $matches[1];
 755 | 		$link_text		=  $matches[2];	# span gamut is run once, further down
 756 | 		$url			=  $matches[3] == '' ? $matches[4] : $matches[3];
 757 | 		$title			=& $matches[7];
 758 | 
 759 | 		$url = $this->encodeAttribute($url);
 760 | 
 761 | 		$result = "<a href=\"$url\"";
 762 | 		if (isset($title)) {
 763 | 			$title = $this->encodeAttribute($title);
 764 | 			$result .=  " title=\"$title\"";
 765 | 		}
 766 | 		
 767 | 		$link_text = $this->runSpanGamut($link_text);
 768 | 		$result .= ">$link_text</a>";
 769 | 
 770 | 		return $this->hashPart($result);
 771 | 	}
 772 | 
 773 | 
 774 | 	function doImages($text) {
 775 | 	#
 776 | 	# Turn Markdown image shortcuts into <img> tags.
 777 | 	#
 778 | 		#
 779 | 		# First, handle reference-style labeled images: ![alt text][id]
 780 | 		#
 781 | 		$text = preg_replace_callback('{
 782 | 			(				# wrap whole match in $1
 783 | 			  !\[
 784 | 				('.$this->nested_brackets_re.')		# alt text = $2
 785 | 			  \]
 786 | 
 787 | 			  [ ]?				# one optional space
 788 | 			  (?:\n[ ]*)?		# one optional newline followed by spaces
 789 | 
 790 | 			  \[
 791 | 				(.*?)		# id = $3
 792 | 			  \]
 793 | 
 794 | 			)
 795 | 			}xs', 
 796 | 			array(&$this, '_doImages_reference_callback'), $text);
 797 | 
 798 | 		#
 799 | 		# Next, handle inline images:  ![alt text](url "optional title")
 800 | 		# Don't forget: encode * and _
 801 | 		#
 802 | 		$text = preg_replace_callback('{
 803 | 			(				# wrap whole match in $1
 804 | 			  !\[
 805 | 				('.$this->nested_brackets_re.')		# alt text = $2
 806 | 			  \]
 807 | 			  \s?			# One optional whitespace character
 808 | 			  \(			# literal paren
 809 | 				[ \n]*
 810 | 				(?:
 811 | 					<(\S*)>	# src url = $3
 812 | 				|
 813 | 					('.$this->nested_url_parenthesis_re.')	# src url = $4
 814 | 				)
 815 | 				[ \n]*
 816 | 				(			# $5
 817 | 				  ([\'"])	# quote char = $6
 818 | 				  (.*?)		# title = $7
 819 | 				  \6		# matching quote
 820 | 				  [ \n]*
 821 | 				)?			# title is optional
 822 | 			  \)
 823 | 			)
 824 | 			}xs',
 825 | 			array(&$this, '_doImages_inline_callback'), $text);
 826 | 
 827 | 		return $text;
 828 | 	}
 829 | 	function _doImages_reference_callback($matches) {
 830 | 		$whole_match = $matches[1];
 831 | 		$alt_text    = $matches[2];
 832 | 		$link_id     = strtolower($matches[3]);
 833 | 
 834 | 		if ($link_id == "") {
 835 | 			$link_id = strtolower($alt_text); # for shortcut links like ![this][].
 836 | 		}
 837 | 
 838 | 		$alt_text = $this->encodeAttribute($alt_text);
 839 | 		if (isset($this->urls[$link_id])) {
 840 | 			$url = $this->encodeAttribute($this->urls[$link_id]);
 841 | 			$result = "<img src=\"$url\" alt=\"$alt_text\"";
 842 | 			if (isset($this->titles[$link_id])) {
 843 | 				$title = $this->titles[$link_id];
 844 | 				$title = $this->encodeAttribute($title);
 845 | 				$result .=  " title=\"$title\"";
 846 | 			}
 847 | 			$result .= $this->empty_element_suffix;
 848 | 			$result = $this->hashPart($result);
 849 | 		}
 850 | 		else {
 851 | 			# If there's no such link ID, leave intact:
 852 | 			$result = $whole_match;
 853 | 		}
 854 | 
 855 | 		return $result;
 856 | 	}
 857 | 	function _doImages_inline_callback($matches) {
 858 | 		$whole_match	= $matches[1];
 859 | 		$alt_text		= $matches[2];
 860 | 		$url			= $matches[3] == '' ? $matches[4] : $matches[3];
 861 | 		$title			=& $matches[7];
 862 | 
 863 | 		$alt_text = $this->encodeAttribute($alt_text);
 864 | 		$url = $this->encodeAttribute($url);
 865 | 		$result = "<img src=\"$url\" alt=\"$alt_text\"";
 866 | 		if (isset($title)) {
 867 | 			$title = $this->encodeAttribute($title);
 868 | 			$result .=  " title=\"$title\""; # $title already quoted
 869 | 		}
 870 | 		$result .= $this->empty_element_suffix;
 871 | 
 872 | 		return $this->hashPart($result);
 873 | 	}
 874 | 
 875 | 
 876 | 	function doHeaders($text) {
 877 | 		# Setext-style headers:
 878 | 		#	  Header 1
 879 | 		#	  ========
 880 | 		#  
 881 | 		#	  Header 2
 882 | 		#	  --------
 883 | 		#
 884 | 		$text = preg_replace_callback('{ ^(.+?)[ ]*\n(=+|-+)[ ]*\n+ }mx',
 885 | 			array(&$this, '_doHeaders_callback_setext'), $text);
 886 | 
 887 | 		# atx-style headers:
 888 | 		#	# Header 1
 889 | 		#	## Header 2
 890 | 		#	## Header 2 with closing hashes ##
 891 | 		#	...
 892 | 		#	###### Header 6
 893 | 		#
 894 | 		$text = preg_replace_callback('{
 895 | 				^(\#{1,6})	# $1 = string of #\'s
 896 | 				[ ]*
 897 | 				(.+?)		# $2 = Header text
 898 | 				[ ]*
 899 | 				\#*			# optional closing #\'s (not counted)
 900 | 				\n+
 901 | 			}xm',
 902 | 			array(&$this, '_doHeaders_callback_atx'), $text);
 903 | 
 904 | 		return $text;
 905 | 	}
 906 | 	function _doHeaders_callback_setext($matches) {
 907 | 		# Terrible hack to check we haven't found an empty list item.
 908 | 		if ($matches[2] == '-' && preg_match('{^-(?: |$)}', $matches[1]))
 909 | 			return $matches[0];
 910 | 		
 911 | 		$level = $matches[2][0] == '=' ? 1 : 2;
 912 | 		$block = "<h$level>".$this->runSpanGamut($matches[1])."</h$level>";
 913 | 		return "\n" . $this->hashBlock($block) . "\n\n";
 914 | 	}
 915 | 	function _doHeaders_callback_atx($matches) {
 916 | 		$level = strlen($matches[1]);
 917 | 		$block = "<h$level>".$this->runSpanGamut($matches[2])."</h$level>";
 918 | 		return "\n" . $this->hashBlock($block) . "\n\n";
 919 | 	}
 920 | 
 921 | 
 922 | 	function doLists($text) {
 923 | 	#
 924 | 	# Form HTML ordered (numbered) and unordered (bulleted) lists.
 925 | 	#
 926 | 		$less_than_tab = $this->tab_width - 1;
 927 | 
 928 | 		# Re-usable patterns to match list item bullets and number markers:
 929 | 		$marker_ul_re  = '[*+-]';
 930 | 		$marker_ol_re  = '\d+[\.]';
 931 | 		$marker_any_re = "(?:$marker_ul_re|$marker_ol_re)";
 932 | 
 933 | 		$markers_relist = array(
 934 | 			$marker_ul_re => $marker_ol_re,
 935 | 			$marker_ol_re => $marker_ul_re,
 936 | 			);
 937 | 
 938 | 		foreach ($markers_relist as $marker_re => $other_marker_re) {
 939 | 			# Re-usable pattern to match any entire ul or ol list:
 940 | 			$whole_list_re = '
 941 | 				(								# $1 = whole list
 942 | 				  (								# $2
 943 | 					([ ]{0,'.$less_than_tab.'})	# $3 = number of spaces
 944 | 					('.$marker_re.')			# $4 = first list item marker
 945 | 					[ ]+
 946 | 				  )
 947 | 				  (?s:.+?)
 948 | 				  (								# $5
 949 | 					  \z
 950 | 					|
 951 | 					  \n{2,}
 952 | 					  (?=\S)
 953 | 					  (?!						# Negative lookahead for another list item marker
 954 | 						[ ]*
 955 | 						'.$marker_re.'[ ]+
 956 | 					  )
 957 | 					|
 958 | 					  (?=						# Lookahead for another kind of list
 959 | 					    \n
 960 | 						\3						# Must have the same indentation
 961 | 						'.$other_marker_re.'[ ]+
 962 | 					  )
 963 | 				  )
 964 | 				)
 965 | 			'; // mx
 966 | 			
 967 | 			# We use a different prefix before nested lists than top-level lists.
 968 | 			# See extended comment in processListItems().
 969 | 		
 970 | 			if ($this->list_level) {
 971 | 				$text = preg_replace_callback('{
 972 | 						^
 973 | 						'.$whole_list_re.'
 974 | 					}mx',
 975 | 					array(&$this, '_doLists_callback'), $text);
 976 | 			}
 977 | 			else {
 978 | 				$text = preg_replace_callback('{
 979 | 						(?:(?<=\n)\n|\A\n?) # Must eat the newline
 980 | 						'.$whole_list_re.'
 981 | 					}mx',
 982 | 					array(&$this, '_doLists_callback'), $text);
 983 | 			}
 984 | 		}
 985 | 
 986 | 		return $text;
 987 | 	}
 988 | 	function _doLists_callback($matches) {
 989 | 		# Re-usable patterns to match list item bullets and number markers:
 990 | 		$marker_ul_re  = '[*+-]';
 991 | 		$marker_ol_re  = '\d+[\.]';
 992 | 		$marker_any_re = "(?:$marker_ul_re|$marker_ol_re)";
 993 | 		
 994 | 		$list = $matches[1];
 995 | 		$list_type = preg_match("/$marker_ul_re/", $matches[4]) ? "ul" : "ol";
 996 | 		
 997 | 		$marker_any_re = ( $list_type == "ul" ? $marker_ul_re : $marker_ol_re );
 998 | 		
 999 | 		$list .= "\n";
1000 | 		$result = $this->processListItems($list, $marker_any_re);
1001 | 		
1002 | 		$result = $this->hashBlock("<$list_type>\n" . $result . "</$list_type>");
1003 | 		return "\n". $result ."\n\n";
1004 | 	}
1005 | 
1006 | 	var $list_level = 0;
1007 | 
1008 | 	function processListItems($list_str, $marker_any_re) {
1009 | 	#
1010 | 	#	Process the contents of a single ordered or unordered list, splitting it
1011 | 	#	into individual list items.
1012 | 	#
1013 | 		# The $this->list_level global keeps track of when we're inside a list.
1014 | 		# Each time we enter a list, we increment it; when we leave a list,
1015 | 		# we decrement. If it's zero, we're not in a list anymore.
1016 | 		#
1017 | 		# We do this because when we're not inside a list, we want to treat
1018 | 		# something like this:
1019 | 		#
1020 | 		#		I recommend upgrading to version
1021 | 		#		8. Oops, now this line is treated
1022 | 		#		as a sub-list.
1023 | 		#
1024 | 		# As a single paragraph, despite the fact that the second line starts
1025 | 		# with a digit-period-space sequence.
1026 | 		#
1027 | 		# Whereas when we're inside a list (or sub-list), that line will be
1028 | 		# treated as the start of a sub-list. What a kludge, huh? This is
1029 | 		# an aspect of Markdown's syntax that's hard to parse perfectly
1030 | 		# without resorting to mind-reading. Perhaps the solution is to
1031 | 		# change the syntax rules such that sub-lists must start with a
1032 | 		# starting cardinal number; e.g. "1." or "a.".
1033 | 		
1034 | 		$this->list_level++;
1035 | 
1036 | 		# trim trailing blank lines:
1037 | 		$list_str = preg_replace("/\n{2,}\\z/", "\n", $list_str);
1038 | 
1039 | 		$list_str = preg_replace_callback('{
1040 | 			(\n)?							# leading line = $1
1041 | 			(^[ ]*)							# leading whitespace = $2
1042 | 			('.$marker_any_re.'				# list marker and space = $3
1043 | 				(?:[ ]+|(?=\n))	# space only required if item is not empty
1044 | 			)
1045 | 			((?s:.*?))						# list item text   = $4
1046 | 			(?:(\n+(?=\n))|\n)				# tailing blank line = $5
1047 | 			(?= \n* (\z | \2 ('.$marker_any_re.') (?:[ ]+|(?=\n))))
1048 | 			}xm',
1049 | 			array(&$this, '_processListItems_callback'), $list_str);
1050 | 
1051 | 		$this->list_level--;
1052 | 		return $list_str;
1053 | 	}
1054 | 	function _processListItems_callback($matches) {
1055 | 		$item = $matches[4];
1056 | 		$leading_line =& $matches[1];
1057 | 		$leading_space =& $matches[2];
1058 | 		$marker_space = $matches[3];
1059 | 		$tailing_blank_line =& $matches[5];
1060 | 
1061 | 		if ($leading_line || $tailing_blank_line || 
1062 | 			preg_match('/\n{2,}/', $item))
1063 | 		{
1064 | 			# Replace marker with the appropriate whitespace indentation
1065 | 			$item = $leading_space . str_repeat(' ', strlen($marker_space)) . $item;
1066 | 			$item = $this->runBlockGamut($this->outdent($item)."\n");
1067 | 		}
1068 | 		else {
1069 | 			# Recursion for sub-lists:
1070 | 			$item = $this->doLists($this->outdent($item));
1071 | 			$item = preg_replace('/\n+$/', '', $item);
1072 | 			$item = $this->runSpanGamut($item);
1073 | 		}
1074 | 
1075 | 		return "<li>" . $item . "</li>\n";
1076 | 	}
1077 | 
1078 | 
1079 | 	function doCodeBlocks($text) {
1080 | 	#
1081 | 	#	Process Markdown `<pre><code>` blocks.
1082 | 	#
1083 | 		$text = preg_replace_callback('{
1084 | 				(?:\n\n|\A\n?)
1085 | 				(	            # $1 = the code block -- one or more lines, starting with a space/tab
1086 | 				  (?>
1087 | 					[ ]{'.$this->tab_width.'}  # Lines must start with a tab or a tab-width of spaces
1088 | 					.*\n+
1089 | 				  )+
1090 | 				)
1091 | 				((?=^[ ]{0,'.$this->tab_width.'}\S)|\Z)	# Lookahead for non-space at line-start, or end of doc
1092 | 			}xm',
1093 | 			array(&$this, '_doCodeBlocks_callback'), $text);
1094 | 
1095 | 		return $text;
1096 | 	}
1097 | 	function _doCodeBlocks_callback($matches) {
1098 | 		$codeblock = $matches[1];
1099 | 
1100 | 		$codeblock = $this->outdent($codeblock);
1101 | 		$codeblock = htmlspecialchars($codeblock, ENT_NOQUOTES);
1102 | 
1103 | 		# trim leading newlines and trailing newlines
1104 | 		$codeblock = preg_replace('/\A\n+|\n+\z/', '', $codeblock);
1105 | 
1106 | 		$codeblock = "<pre><code>$codeblock\n</code></pre>";
1107 | 		return "\n\n".$this->hashBlock($codeblock)."\n\n";
1108 | 	}
1109 | 
1110 | 
1111 | 	function makeCodeSpan($code) {
1112 | 	#
1113 | 	# Create a code span markup for $code. Called from handleSpanToken.
1114 | 	#
1115 | 		$code = htmlspecialchars(trim($code), ENT_NOQUOTES);
1116 | 		return $this->hashPart("<code>$code</code>");
1117 | 	}
1118 | 
1119 | 
1120 | 	var $em_relist = array(
1121 | 		''  => '(?:(?<!\*)\*(?!\*)|(?<!_)_(?!_))(?=\S|$)(?![\.,:;]\s)',
1122 | 		'*' => '(?<=\S|^)(?<!\*)\*(?!\*)',
1123 | 		'_' => '(?<=\S|^)(?<!_)_(?!_)',
1124 | 		);
1125 | 	var $strong_relist = array(
1126 | 		''   => '(?:(?<!\*)\*\*(?!\*)|(?<!_)__(?!_))(?=\S|$)(?![\.,:;]\s)',
1127 | 		'**' => '(?<=\S|^)(?<!\*)\*\*(?!\*)',
1128 | 		'__' => '(?<=\S|^)(?<!_)__(?!_)',
1129 | 		);
1130 | 	var $em_strong_relist = array(
1131 | 		''    => '(?:(?<!\*)\*\*\*(?!\*)|(?<!_)___(?!_))(?=\S|$)(?![\.,:;]\s)',
1132 | 		'***' => '(?<=\S|^)(?<!\*)\*\*\*(?!\*)',
1133 | 		'___' => '(?<=\S|^)(?<!_)___(?!_)',
1134 | 		);
1135 | 	var $em_strong_prepared_relist;
1136 | 	
1137 | 	function prepareItalicsAndBold() {
1138 | 	#
1139 | 	# Prepare regular expressions for searching emphasis tokens in any
1140 | 	# context.
1141 | 	#
1142 | 		foreach ($this->em_relist as $em => $em_re) {
1143 | 			foreach ($this->strong_relist as $strong => $strong_re) {
1144 | 				# Construct list of allowed token expressions.
1145 | 				$token_relist = array();
1146 | 				if (isset($this->em_strong_relist["$em$strong"])) {
1147 | 					$token_relist[] = $this->em_strong_relist["$em$strong"];
1148 | 				}
1149 | 				$token_relist[] = $em_re;
1150 | 				$token_relist[] = $strong_re;
1151 | 				
1152 | 				# Construct master expression from list.
1153 | 				$token_re = '{('. implode('|', $token_relist) .')}';
1154 | 				$this->em_strong_prepared_relist["$em$strong"] = $token_re;
1155 | 			}
1156 | 		}
1157 | 	}
1158 | 	
1159 | 	function doItalicsAndBold($text) {
1160 | 		$token_stack = array('');
1161 | 		$text_stack = array('');
1162 | 		$em = '';
1163 | 		$strong = '';
1164 | 		$tree_char_em = false;
1165 | 		
1166 | 		while (1) {
1167 | 			#
1168 | 			# Get prepared regular expression for searching emphasis tokens
1169 | 			# in current context.
1170 | 			#
1171 | 			$token_re = $this->em_strong_prepared_relist["$em$strong"];
1172 | 			
1173 | 			#
1174 | 			# Each loop iteration searches for the next emphasis token.
1175 | 			# Each token is then handled by the branches below.
1176 | 			#
1177 | 			$parts = preg_split($token_re, $text, 2, PREG_SPLIT_DELIM_CAPTURE);
1178 | 			$text_stack[0] .= $parts[0];
1179 | 			$token =& $parts[1];
1180 | 			$text =& $parts[2];
1181 | 			
1182 | 			if (empty($token)) {
1183 | 				# Reached end of text span: empty stack without emitting
1184 | 				# any more emphasis.
1185 | 				while ($token_stack[0]) {
1186 | 					$text_stack[1] .= array_shift($token_stack);
1187 | 					$text_stack[0] .= array_shift($text_stack);
1188 | 				}
1189 | 				break;
1190 | 			}
1191 | 			
1192 | 			$token_len = strlen($token);
1193 | 			if ($tree_char_em) {
1194 | 				# Reached closing marker while inside a three-char emphasis.
1195 | 				if ($token_len == 3) {
1196 | 					# Three-char closing marker, close em and strong.
1197 | 					array_shift($token_stack);
1198 | 					$span = array_shift($text_stack);
1199 | 					$span = $this->runSpanGamut($span);
1200 | 					$span = "<strong><em>$span</em></strong>";
1201 | 					$text_stack[0] .= $this->hashPart($span);
1202 | 					$em = '';
1203 | 					$strong = '';
1204 | 				} else {
1205 | 					# Other closing marker: close one em or strong and
1206 | 					# change current token state to match the other
1207 | 					$token_stack[0] = str_repeat($token[0], 3-$token_len);
1208 | 					$tag = $token_len == 2 ? "strong" : "em";
1209 | 					$span = $text_stack[0];
1210 | 					$span = $this->runSpanGamut($span);
1211 | 					$span = "<$tag>$span</$tag>";
1212 | 					$text_stack[0] = $this->hashPart($span);
1213 | 					$$tag = ''; # $$tag stands for $em or $strong
1214 | 				}
1215 | 				$tree_char_em = false;
1216 | 			} else if ($token_len == 3) {
1217 | 				if ($em) {
1218 | 					# Reached closing marker for both em and strong.
1219 | 					# Closing strong marker:
1220 | 					for ($i = 0; $i < 2; ++$i) {
1221 | 						$shifted_token = array_shift($token_stack);
1222 | 						$tag = strlen($shifted_token) == 2 ? "strong" : "em";
1223 | 						$span = array_shift($text_stack);
1224 | 						$span = $this->runSpanGamut($span);
1225 | 						$span = "<$tag>$span</$tag>";
1226 | 						$text_stack[0] .= $this->hashPart($span);
1227 | 						$$tag = ''; # $$tag stands for $em or $strong
1228 | 					}
1229 | 				} else {
1230 | 					# Reached opening three-char emphasis marker. Push on token 
1231 | 					# stack; will be handled by the special condition above.
1232 | 					$em = $token[0];
1233 | 					$strong = "$em$em";
1234 | 					array_unshift($token_stack, $token);
1235 | 					array_unshift($text_stack, '');
1236 | 					$tree_char_em = true;
1237 | 				}
1238 | 			} else if ($token_len == 2) {
1239 | 				if ($strong) {
1240 | 					# Unwind any dangling emphasis marker:
1241 | 					if (strlen($token_stack[0]) == 1) {
1242 | 						$text_stack[1] .= array_shift($token_stack);
1243 | 						$text_stack[0] .= array_shift($text_stack);
1244 | 					}
1245 | 					# Closing strong marker:
1246 | 					array_shift($token_stack);
1247 | 					$span = array_shift($text_stack);
1248 | 					$span = $this->runSpanGamut($span);
1249 | 					$span = "<strong>$span</strong>";
1250 | 					$text_stack[0] .= $this->hashPart($span);
1251 | 					$strong = '';
1252 | 				} else {
1253 | 					array_unshift($token_stack, $token);
1254 | 					array_unshift($text_stack, '');
1255 | 					$strong = $token;
1256 | 				}
1257 | 			} else {
1258 | 				# Here $token_len == 1
1259 | 				if ($em) {
1260 | 					if (strlen($token_stack[0]) == 1) {
1261 | 						# Closing emphasis marker:
1262 | 						array_shift($token_stack);
1263 | 						$span = array_shift($text_stack);
1264 | 						$span = $this->runSpanGamut($span);
1265 | 						$span = "<em>$span</em>";
1266 | 						$text_stack[0] .= $this->hashPart($span);
1267 | 						$em = '';
1268 | 					} else {
1269 | 						$text_stack[0] .= $token;
1270 | 					}
1271 | 				} else {
1272 | 					array_unshift($token_stack, $token);
1273 | 					array_unshift($text_stack, '');
1274 | 					$em = $token;
1275 | 				}
1276 | 			}
1277 | 		}
1278 | 		return $text_stack[0];
1279 | 	}
1280 | 
1281 | 
1282 | 	function doBlockQuotes($text) {
1283 | 		$text = preg_replace_callback('/
1284 | 			  (								# Wrap whole match in $1
1285 | 				(?>
1286 | 				  ^[ ]*>[ ]?			# ">" at the start of a line
1287 | 					.+\n					# rest of the first line
1288 | 				  (.+\n)*					# subsequent consecutive lines
1289 | 				  \n*						# blanks
1290 | 				)+
1291 | 			  )
1292 | 			/xm',
1293 | 			array(&$this, '_doBlockQuotes_callback'), $text);
1294 | 
1295 | 		return $text;
1296 | 	}
1297 | 	function _doBlockQuotes_callback($matches) {
1298 | 		$bq = $matches[1];
1299 | 		# trim one level of quoting - trim whitespace-only lines
1300 | 		$bq = preg_replace('/^[ ]*>[ ]?|^[ ]+$/m', '', $bq);
1301 | 		$bq = $this->runBlockGamut($bq);		# recurse
1302 | 
1303 | 		$bq = preg_replace('/^/m', "  ", $bq);
1304 | 		# These leading spaces cause problems with <pre> content,
1305 | 		# so we need to fix that:
1306 | 		$bq = preg_replace_callback('{(\s*<pre>.+?</pre>)}sx', 
1307 | 			array(&$this, '_doBlockQuotes_callback2'), $bq);
1308 | 
1309 | 		return "\n". $this->hashBlock("<blockquote>\n$bq\n</blockquote>")."\n\n";
1310 | 	}
1311 | 	function _doBlockQuotes_callback2($matches) {
1312 | 		$pre = $matches[1];
1313 | 		$pre = preg_replace('/^  /m', '', $pre);
1314 | 		return $pre;
1315 | 	}
1316 | 
1317 | 
1318 | 	function formParagraphs($text) {
1319 | 	#
1320 | 	#	Params:
1321 | 	#		$text - string to process with html <p> tags
1322 | 	#
1323 | 		# Strip leading and trailing lines:
1324 | 		$text = preg_replace('/\A\n+|\n+\z/', '', $text);
1325 | 
1326 | 		$grafs = preg_split('/\n{2,}/', $text, -1, PREG_SPLIT_NO_EMPTY);
1327 | 
1328 | 		#
1329 | 		# Wrap <p> tags and unhashify HTML blocks
1330 | 		#
1331 | 		foreach ($grafs as $key => $value) {
1332 | 			if (!preg_match('/^B\x1A[0-9]+B$/', $value)) {
1333 | 				# Is a paragraph.
1334 | 				$value = $this->runSpanGamut($value);
1335 | 				$value = preg_replace('/^([ ]*)/', "<p>", $value);
1336 | 				$value .= "</p>";
1337 | 				$grafs[$key] = $this->unhash($value);
1338 | 			}
1339 | 			else {
1340 | 				# Is a block.
1341 | 				# Modify elements of @grafs in-place...
1342 | 				$graf = $value;
1343 | 				$block = $this->html_hashes[$graf];
1344 | 				$graf = $block;
1345 | //				if (preg_match('{
1346 | //					\A
1347 | //					(							# $1 = <div> tag
1348 | //					  <div  \s+
1349 | //					  [^>]*
1350 | //					  \b
1351 | //					  markdown\s*=\s*  ([\'"])	#	$2 = attr quote char
1352 | //					  1
1353 | //					  \2
1354 | //					  [^>]*
1355 | //					  >
1356 | //					)
1357 | //					(							# $3 = contents
1358 | //					.*
1359 | //					)
1360 | //					(</div>)					# $4 = closing tag
1361 | //					\z
1362 | //					}xs', $block, $matches))
1363 | //				{
1364 | //					list(, $div_open, , $div_content, $div_close) = $matches;
1365 | //
1366 | //					# We can't call Markdown(), because that resets the hash;
1367 | //					# that initialization code should be pulled into its own sub, though.
1368 | //					$div_content = $this->hashHTMLBlocks($div_content);
1369 | //					
1370 | //					# Run document gamut methods on the content.
1371 | //					foreach ($this->document_gamut as $method => $priority) {
1372 | //						$div_content = $this->$method($div_content);
1373 | //					}
1374 | //
1375 | //					$div_open = preg_replace(
1376 | //						'{\smarkdown\s*=\s*([\'"]).+?\1}', '', $div_open);
1377 | //
1378 | //					$graf = $div_open . "\n" . $div_content . "\n" . $div_close;
1379 | //				}
1380 | 				$grafs[$key] = $graf;
1381 | 			}
1382 | 		}
1383 | 
1384 | 		return implode("\n\n", $grafs);
1385 | 	}
1386 | 
1387 | 
1388 | 	function encodeAttribute($text) {
1389 | 	#
1390 | 	# Encode text for a double-quoted HTML attribute. This function
1391 | 	# is *not* suitable for attributes enclosed in single quotes.
1392 | 	#
1393 | 		$text = $this->encodeAmpsAndAngles($text);
1394 | 		$text = str_replace('"', '&quot;', $text);
1395 | 		return $text;
1396 | 	}
1397 | 	
1398 | 	
1399 | 	function encodeAmpsAndAngles($text) {
1400 | 	#
1401 | 	# Smart processing for ampersands and angle brackets that need to 
1402 | 	# be encoded. Valid character entities are left alone unless the
1403 | 	# no-entities mode is set.
1404 | 	#
1405 | 		if ($this->no_entities) {
1406 | 			$text = str_replace('&', '&amp;', $text);
1407 | 		} else {
1408 | 			# Ampersand-encoding based entirely on Nat Irons's Amputator
1409 | 			# MT plugin: <http://bumppo.net/projects/amputator/>
1410 | 			$text = preg_replace('/&(?!#?[xX]?(?:[0-9a-fA-F]+|\w+);)/', 
1411 | 								'&amp;', $text);
1412 | 		}
1413 | 		# Encode remaining <'s
1414 | 		$text = str_replace('<', '&lt;', $text);
1415 | 
1416 | 		return $text;
1417 | 	}
1418 | 
1419 | 
1420 | 	function doAutoLinks($text) {
1421 | 		$text = preg_replace_callback('{<((https?|ftp|dict):[^\'">\s]+)>}i', 
1422 | 			array(&$this, '_doAutoLinks_url_callback'), $text);
1423 | 
1424 | 		# Email addresses: <address@domain.foo>
1425 | 		$text = preg_replace_callback('{
1426 | 			<
1427 | 			(?:mailto:)?
1428 | 			(
1429 | 				(?:
1430 | 					[-!#$%&\'*+/=?^_`.{|}~\w\x80-\xFF]+
1431 | 				|
1432 | 					".*?"
1433 | 				)
1434 | 				\@
1435 | 				(?:
1436 | 					[-a-z0-9\x80-\xFF]+(\.[-a-z0-9\x80-\xFF]+)*\.[a-z]+
1437 | 				|
1438 | 					\[[\d.a-fA-F:]+\]	# IPv4 & IPv6
1439 | 				)
1440 | 			)
1441 | 			>
1442 | 			}xi',
1443 | 			array(&$this, '_doAutoLinks_email_callback'), $text);
1444 | 
1445 | 		return $text;
1446 | 	}
1447 | 	function _doAutoLinks_url_callback($matches) {
1448 | 		$url = $this->encodeAttribute($matches[1]);
1449 | 		$link = "<a href=\"$url\">$url</a>";
1450 | 		return $this->hashPart($link);
1451 | 	}
1452 | 	function _doAutoLinks_email_callback($matches) {
1453 | 		$address = $matches[1];
1454 | 		$link = $this->encodeEmailAddress($address);
1455 | 		return $this->hashPart($link);
1456 | 	}
1457 | 
1458 | 
1459 | 	function encodeEmailAddress($addr) {
1460 | 	#
1461 | 	#	Input: an email address, e.g. "foo@example.com"
1462 | 	#
1463 | 	#	Output: the email address as a mailto link, with each character
1464 | 	#		of the address encoded as either a decimal or hex entity, in
1465 | 	#		the hopes of foiling most address harvesting spam bots. E.g.:
1466 | 	#
1467 | 	#	  <p><a href="&#109;&#x61;&#105;&#x6c;&#116;&#x6f;&#58;&#x66;o&#111;
1468 | 	#        &#x40;&#101;&#x78;&#97;&#x6d;&#112;&#x6c;&#101;&#46;&#x63;&#111;
1469 | 	#        &#x6d;">&#x66;o&#111;&#x40;&#101;&#x78;&#97;&#x6d;&#112;&#x6c;
1470 | 	#        &#101;&#46;&#x63;&#111;&#x6d;</a></p>
1471 | 	#
1472 | 	#	Based on a filter by Matthew Wickline, posted to BBEdit-Talk.
1473 | 	#   With some optimizations by Milian Wolff.
1474 | 	#
1475 | 		$addr = "mailto:" . $addr;
1476 | 		$chars = preg_split('/(?<!^)(?!$)/', $addr);
1477 | 		$seed = (int)abs(crc32($addr) / strlen($addr)); # Deterministic seed.
1478 | 		
1479 | 		foreach ($chars as $key => $char) {
1480 | 			$ord = ord($char);
1481 | 			# Ignore non-ascii chars.
1482 | 			if ($ord < 128) {
1483 | 				$r = ($seed * (1 + $key)) % 100; # Pseudo-random function.
1484 | 				# roughly 10% raw, 45% hex, 45% dec
1485 | 				# '@' *must* be encoded. I insist.
1486 | 				if ($r > 90 && $char != '@') /* do nothing */;
1487 | 				else if ($r < 45) $chars[$key] = '&#x'.dechex($ord).';';
1488 | 				else              $chars[$key] = '&#'.$ord.';';
1489 | 			}
1490 | 		}
1491 | 		
1492 | 		$addr = implode('', $chars);
1493 | 		$text = implode('', array_slice($chars, 7)); # text without `mailto:`
1494 | 		$addr = "<a href=\"$addr\">$text</a>";
1495 | 
1496 | 		return $addr;
1497 | 	}
1498 | 
1499 | 
1500 | 	function parseSpan($str) {
1501 | 	#
1502 | 	# Take the string $str and parse it into tokens, hashing embedded HTML,
1503 | 	# escaped characters and handling code spans.
1504 | 	#
1505 | 		$output = '';
1506 | 		
1507 | 		$span_re = '{
1508 | 				(
1509 | 					\\\\'.$this->escape_chars_re.'
1510 | 				|
1511 | 					(?<![`\\\\])
1512 | 					`+						# code span marker
1513 | 			'.( $this->no_markup ? '' : '
1514 | 				|
1515 | 					<!--    .*?     -->		# comment
1516 | 				|
1517 | 					<\?.*?\?> | <%.*?%>		# processing instruction
1518 | 				|
1519 | 					<[/!$]?[-a-zA-Z0-9:_]+	# regular tags
1520 | 					(?>
1521 | 						\s
1522 | 						(?>[^"\'>]+|"[^"]*"|\'[^\']*\')*
1523 | 					)?
1524 | 					>
1525 | 			').'
1526 | 				)
1527 | 				}xs';
1528 | 
1529 | 		while (1) {
1530 | 			#
1531 | 			# Each loop iteration searches for either the next tag, the next
1532 | 			# opening code span marker, or the next escaped character.
1533 | 			# Each token is then passed to handleSpanToken.
1534 | 			#
1535 | 			$parts = preg_split($span_re, $str, 2, PREG_SPLIT_DELIM_CAPTURE);
1536 | 			
1537 | 			# Create token from text preceding tag.
1538 | 			if ($parts[0] != "") {
1539 | 				$output .= $parts[0];
1540 | 			}
1541 | 			
1542 | 			# Check if we reach the end.
1543 | 			if (isset($parts[1])) {
1544 | 				$output .= $this->handleSpanToken($parts[1], $parts[2]);
1545 | 				$str = $parts[2];
1546 | 			}
1547 | 			else {
1548 | 				break;
1549 | 			}
1550 | 		}
1551 | 		
1552 | 		return $output;
1553 | 	}
1554 | 	
1555 | 	
1556 | 	function handleSpanToken($token, &$str) {
1557 | 	#
1558 | 	# Handle $token provided by parseSpan by determining its nature and 
1559 | 	# returning the corresponding value that should replace it.
1560 | 	#
1561 | 		switch ($token[0]) {
1562 | 			case "\\":
1563 | 				return $this->hashPart("&#". ord($token[1]). ";");
1564 | 			case "`":
1565 | 				# Search for end marker in remaining text.
1566 | 				if (preg_match('/^(.*?[^`])'.preg_quote($token).'(?!`)(.*)$/sm', 
1567 | 					$str, $matches))
1568 | 				{
1569 | 					$str = $matches[2];
1570 | 					$codespan = $this->makeCodeSpan($matches[1]);
1571 | 					return $this->hashPart($codespan);
1572 | 				}
1573 | 				return $token; // return as text since no ending marker found.
1574 | 			default:
1575 | 				return $this->hashPart($token);
1576 | 		}
1577 | 	}
1578 | 
1579 | 
1580 | 	function outdent($text) {
1581 | 	#
1582 | 	# Remove one level of line-leading tabs or spaces
1583 | 	#
1584 | 		return preg_replace('/^(\t|[ ]{1,'.$this->tab_width.'})/m', '', $text);
1585 | 	}
1586 | 
1587 | 
1588 | 	# String length function for detab. `_initDetab` will create a function to 
1589 | 	# handle UTF-8 if the default function does not exist.
1590 | 	var $utf8_strlen = 'mb_strlen';
1591 | 	
1592 | 	function detab($text) {
1593 | 	#
1594 | 	# Replace tabs with the appropriate amount of space.
1595 | 	#
1596 | 		# For each line we separate the line into blocks delimited by
1597 | 		# tab characters. Then we reconstruct every line by adding the
1598 | 		# appropriate number of spaces between blocks.
1599 | 		
1600 | 		$text = preg_replace_callback('/^.*\t.*$/m',
1601 | 			array(&$this, '_detab_callback'), $text);
1602 | 
1603 | 		return $text;
1604 | 	}
1605 | 	function _detab_callback($matches) {
1606 | 		$line = $matches[0];
1607 | 		$strlen = $this->utf8_strlen; # strlen function for UTF-8.
1608 | 		
1609 | 		# Split in blocks.
1610 | 		$blocks = explode("\t", $line);
1611 | 		# Add each block to the line.
1612 | 		$line = $blocks[0];
1613 | 		unset($blocks[0]); # Do not add first block twice.
1614 | 		foreach ($blocks as $block) {
1615 | 			# Calculate amount of space, insert spaces, insert block.
1616 | 			$amount = $this->tab_width - 
1617 | 				$strlen($line, 'UTF-8') % $this->tab_width;
1618 | 			$line .= str_repeat(" ", $amount) . $block;
1619 | 		}
1620 | 		return $line;
1621 | 	}
1622 | 	function _initDetab() {
1623 | 	#
1624 | 	# Check for the availability of the function in the `utf8_strlen` property
1625 | 	# (initially `mb_strlen`). If the function is not available, create a 
1626 | 	# function that will loosely count the number of UTF-8 characters with a
1627 | 	# regular expression.
1628 | 	#
1629 | 		if (function_exists($this->utf8_strlen)) return;
1630 | 		$this->utf8_strlen = function ($text) { return preg_match_all(
1631 | 			'/[\x00-\xBF]|[\xC0-\xFF][\x80-\xBF]*/',
1632 | 			$text, $m); };
1633 | 	}
1634 | 
1635 | 
1636 | 	function unhash($text) {
1637 | 	#
1638 | 	# Swap back in all the tags hashed by _HashHTMLBlocks.
1639 | 	#
1640 | 		return preg_replace_callback('/(.)\x1A[0-9]+\1/', 
1641 | 			array(&$this, '_unhash_callback'), $text);
1642 | 	}
1643 | 	function _unhash_callback($matches) {
1644 | 		return $this->html_hashes[$matches[0]];
1645 | 	}
1646 | 
1647 | }
1648 | 
1649 | /*
1650 | 
1651 | PHP Markdown
1652 | ============
1653 | 
1654 | Description
1655 | -----------
1656 | 
1657 | This is a PHP translation of the original Markdown formatter written in
1658 | Perl by John Gruber.
1659 | 
1660 | Markdown is a text-to-HTML filter; it translates an easy-to-read /
1661 | easy-to-write structured text format into HTML. Markdown's text format
1662 | is most similar to that of plain text email, and supports features such
1663 | as headers, *emphasis*, code blocks, blockquotes, and links.
1664 | 
1665 | Markdown's syntax is designed not as a generic markup language, but
1666 | specifically to serve as a front-end to (X)HTML. You can use span-level
1667 | HTML tags anywhere in a Markdown document, and you can use block level
1668 | HTML tags (like <div> and <table>) as well.
1669 | 
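As a rough illustration of the headers and *emphasis* mentioned above, the
sketch below converts a two-line sample with a toy function. `toy_markdown`
is a hypothetical helper written for this note; it is not the parser in this
file, handles only setext headers and single-asterisk emphasis, and assumes
PHP 5.3 or later for the closure.

```php
<?php
// Toy illustration only: NOT the parser in this file. It covers just
// setext headers and single-asterisk emphasis to show the input/output idea.
function toy_markdown($text) {
    // "Title\n=====" becomes <h1>; "Title\n-----" becomes <h2>.
    $text = preg_replace_callback(
        '/^(.+?)[ ]*\n(=+|-+)[ ]*\n+/m',
        function ($m) {
            $level = $m[2][0] === '=' ? 1 : 2;
            return "<h$level>" . $m[1] . "</h$level>\n\n";
        },
        $text
    );
    // A lone asterisk pair such as *word* becomes <em>word</em>.
    $text = preg_replace(
        '/(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)/',
        '<em>$1</em>',
        $text
    );
    return $text;
}

$html = toy_markdown("Header 1\n========\n\nSome *emphasis* here.\n");
echo $html;
```

The real parser does more than this sketch: among other things,
`formParagraphs` would also wrap the second line in `<p>` tags.
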
1670 | For more information about Markdown's syntax, see:
1671 | 
1672 | <http://daringfireball.net/projects/markdown/>
1673 | 
1674 | 
1675 | Bugs
1676 | ----
1677 | 
1678 | To file bug reports please send email to:
1679 | 
1680 | <michel.fortin@michelf.com>
1681 | 
1682 | Please include with your report: (1) the example input; (2) the output you
1683 | expected; (3) the output Markdown actually produced.
1684 | 
1685 | 
1686 | Version History
1687 | --------------- 
1688 | 
1689 | See the readme file for detailed release notes for this version.
1690 | 
1691 | 
1692 | Copyright and License
1693 | ---------------------
1694 | 
1695 | PHP Markdown
1696 | Copyright (c) 2004-2009 Michel Fortin  
1697 | <http://michelf.com/>  
1698 | All rights reserved.
1699 | 
1700 | Based on Markdown
1701 | Copyright (c) 2003-2006 John Gruber   
1702 | <http://daringfireball.net/>   
1703 | All rights reserved.
1704 | 
1705 | Redistribution and use in source and binary forms, with or without
1706 | modification, are permitted provided that the following conditions are
1707 | met:
1708 | 
1709 | *	Redistributions of source code must retain the above copyright notice,
1710 | 	this list of conditions and the following disclaimer.
1711 | 
1712 | *	Redistributions in binary form must reproduce the above copyright
1713 | 	notice, this list of conditions and the following disclaimer in the
1714 | 	documentation and/or other materials provided with the distribution.
1715 | 
1716 | *	Neither the name "Markdown" nor the names of its contributors may
1717 | 	be used to endorse or promote products derived from this software
1718 | 	without specific prior written permission.
1719 | 
1720 | This software is provided by the copyright holders and contributors "as
1721 | is" and any express or implied warranties, including, but not limited
1722 | to, the implied warranties of merchantability and fitness for a
1723 | particular purpose are disclaimed. In no event shall the copyright owner
1724 | or contributors be liable for any direct, indirect, incidental, special,
1725 | exemplary, or consequential damages (including, but not limited to,
1726 | procurement of substitute goods or services; loss of use, data, or
1727 | profits; or business interruption) however caused and on any theory of
1728 | liability, whether in contract, strict liability, or tort (including
1729 | negligence or otherwise) arising in any way out of the use of this
1730 | software, even if advised of the possibility of such damage.
1731 | 
1732 | */
1733 | ?>


--------------------------------------------------------------------------------