├── README.md ├── README.txt ├── UPDATING.txt ├── admin ├── apc.php ├── codemirror │ ├── codemirror.css │ ├── codemirror.js │ └── properties.js ├── edit-pattern.php ├── index.php ├── login.php ├── require_login.php ├── template.php └── update.php ├── cache ├── index.php ├── rss-with-key │ └── index.php └── rss │ └── index.php ├── changelog.txt ├── cleancache.php ├── config.php ├── css ├── bootstrap.min.css ├── feed.css └── feed.xsl ├── extract.php ├── ftr_compatibility_test.php ├── images └── agplv3.png ├── index.php ├── js ├── bootstrap-popover.js ├── bootstrap-tab.js ├── bootstrap-tooltip.js └── jquery.min.js ├── libraries ├── DisableSimplePieSanitize.php ├── Zend │ ├── Cache.php │ ├── Cache │ │ ├── Backend.php │ │ ├── Backend │ │ │ ├── ExtendedInterface.php │ │ │ ├── File.php │ │ │ └── Interface.php │ │ ├── Core.php │ │ └── Exception.php │ └── Exception.php ├── content-extractor │ ├── ContentExtractor.php │ └── SiteConfig.php ├── feedwriter │ ├── FeedItem.php │ └── FeedWriter.php ├── htmLawed │ └── htmLawed.php ├── html5php │ ├── HTML5.php │ ├── HTML5 │ │ ├── Elements.php │ │ ├── Entities.php │ │ ├── Exception.php │ │ ├── InstructionProcessor.php │ │ ├── Parser │ │ │ ├── CharacterReference.php │ │ │ ├── DOMTreeBuilder.php │ │ │ ├── EventHandler.php │ │ │ ├── FileInputStream.php │ │ │ ├── InputStream.php │ │ │ ├── ParseError.php │ │ │ ├── README.md │ │ │ ├── Scanner.php │ │ │ ├── StringInputStream.php │ │ │ ├── Tokenizer.php │ │ │ ├── TreeBuildingRules.php │ │ │ └── UTF8Utils.php │ │ └── Serializer │ │ │ ├── HTML5Entities.php │ │ │ ├── OutputRules.php │ │ │ ├── README.md │ │ │ ├── RulesInterface.php │ │ │ └── Traverser.php │ ├── LICENSE.txt │ ├── README.md │ ├── RELEASE.md │ └── autoloader.php ├── humble-http-agent │ ├── CookieJar.php │ ├── HumbleHttpAgent.php │ ├── HumbleHttpAgentDummy.php │ ├── RollingCurl.php │ └── SimplePie_HumbleHttpAgent.php ├── language-detect │ ├── LanguageDetect.php │ ├── LanguageDetect │ │ ├── Exception.php │ │ ├── ISO639.php │ │ └── 
Parser.php │ ├── lang.dat │ └── unicode_blocks.dat ├── readability │ ├── JSLikeHTMLElement.php │ └── Readability.php └── simplepie │ ├── LICENSE.txt │ ├── autoloader.php │ └── library │ ├── SimplePie.php │ └── SimplePie │ ├── Author.php │ ├── Cache.php │ ├── Cache │ ├── Base.php │ ├── DB.php │ ├── File.php │ ├── Memcache.php │ ├── Memcached.php │ ├── MySQL.php │ └── Redis.php │ ├── Caption.php │ ├── Category.php │ ├── Content │ └── Type │ │ └── Sniffer.php │ ├── Copyright.php │ ├── Core.php │ ├── Credit.php │ ├── Decode │ └── HTML │ │ └── Entities.php │ ├── Enclosure.php │ ├── Exception.php │ ├── File.php │ ├── HTTP │ └── Parser.php │ ├── IRI.php │ ├── Item.php │ ├── Locator.php │ ├── Misc.php │ ├── Net │ └── IPv6.php │ ├── Parse │ └── Date.php │ ├── Parser.php │ ├── Rating.php │ ├── Registry.php │ ├── Restriction.php │ ├── Sanitize.php │ ├── Source.php │ ├── XML │ └── Declaration │ │ └── Parser.php │ └── gzdecode.php ├── license.txt ├── makefulltextfeed.php ├── robots.txt ├── site_config ├── README.txt ├── custom │ └── index.php ├── index.php └── standard │ ├── .wikipedia.org.txt │ ├── README.md │ ├── index.php │ └── version.txt └── ubuntu-16.04.pp /README.md: -------------------------------------------------------------------------------- 1 | Full-Text RSS 2 | ============= 3 | 4 | [Full-Text RSS](https://fivefilters.org/content-only/) from [FiveFilters.org](https://fivefilters.org) is used by software developers and news enthusiasts to extract article content from news sites and blogs, and to convert RSS feeds that contain only extracts of stories to full-text feeds. 5 | 6 | This is the public version of Full-Text RSS available to download for free from . 7 | 8 | For best extraction results, and to help us sustain the project, you can purchase the most up-to-date version at . 9 | 10 | If you want a managed, hosted API instead, [see our plans](https://rapidapi.com/fivefilters/api/full-text-rss/pricing). 11 | 12 | ### Installation 13 | 14 | 1. 
Extract the files in this ZIP archive to a folder on your computer. 15 | 16 | 2. FTP the files up to your server 17 | 18 | 3. Access index.php through your browser. E.g. https://example.org/full-text-rss/index.php 19 | 20 | 4. Enter a URL in the form field to test the code 21 | 22 | 5. If you get an RSS feed with full-text content, all is working well. :) 23 | 24 | ### Configuration (optional) 25 | 26 | 1. Save a copy of config.php as custom_config.php and edit custom_config.php 27 | 28 | 2. If you decide to enable caching, make sure the cache folder (and its 2 sub folders) is writable. (You might need to change the permissions of these folders to 777 through your FTP client.) 29 | 30 | ### Site-specific extraction rules 31 | 32 | This free version does not contain the site config files we include with purchased copies, but these are now all available [online](https://github.com/fivefilters/ftr-site-config). If you'd like to keep yours up to date using Git, follow the steps below: 33 | 34 | 1. Change into the site_config/standard/ folder 35 | 2. Delete everything in there 36 | 3. Using the command line, enter: `git clone https://github.com/fivefilters/ftr-site-config.git .` 37 | 4. Git should now download the latest site config files for you. 38 | 5. To update the site config files again, you can simply run `git pull` from the directory. 39 | 40 | ### Code example 41 | 42 | If you're developing an application which requires content extraction, you can call Full-Text RSS as a web service from within your application. Here's how to do it in PHP: 43 | 44 | <?php 45 | // $ftr: URL where you installed Full-Text RSS (placeholder value) 46 | $ftr = 'https://example.org/full-text-rss/'; 47 | // $article: the article or feed URL to process (placeholder value) 48 | $article = 'https://example.org/article.html'; 49 | 50 | // Build the request URL, asking for JSON output 51 | $request = $ftr.'makefulltextfeed.php?format=json&url='.urlencode($article); 52 | 53 | // Fetch the request URL with an HTTP GET request 54 | $result = @file_get_contents($request); 55 | if (!$result) die('Failed to fetch content'); 56 | 57 | // Parse the HTTP response body as JSON 58 | $json = @json_decode($result); 59 | if (!$json) die('Failed to parse JSON'); 60 | 61 | // What do we have? 62 | // Inspect the item (or items) returned 63 | 64 | var_dump($json->rss->channel->item); 65 | 66 | $title = $json->rss->channel->item->title; 67 | // Note: this works when you're processing an article. 68 | // If the input URL is a feed, ->item will be an array. 69 | 70 | echo $title; 71 | 72 | ### Different language? 73 | 74 | Although we don't have examples in other programming languages, the essential steps should be: 75 | 76 | 1. 
Construct the request URL using the URL where you installed Full-Text RSS and the article or feed URL (see $ftr, $article, $request in example above). 77 | 78 | 2. Fetch the resulting URL using an HTTP GET request. 79 | 80 | 3. Parse the HTTP response body as JSON and grab what you need. -------------------------------------------------------------------------------- /README.txt: -------------------------------------------------------------------------------- 1 | Full-Text RSS 2 | ============= 3 | 4 | About 5 | ----- 6 | 7 | See http://fivefilters.org/content-only/ for a description of the code. 8 | 9 | 10 | Installation 11 | ------------ 12 | 13 | 1. Extract the files in this ZIP archive to a folder on your computer. 14 | 15 | 2. FTP the files up to your server 16 | 17 | 3. Access index.php through your browser. E.g. http://example.org/full-text-rss/index.php 18 | 19 | 4. Enter a URL in the form field to test the code 20 | 21 | 5. If you get an RSS feed with full-text content, all is working well. :) 22 | 23 | Configuration (optional) 24 | ------------------------ 25 | 26 | 1. Save a copy of config.php as custom_config.php and edit custom_config.php 27 | 28 | 2. If you decide to enable caching, make sure the cache folder (and its 2 sub folders) is writable. 29 | (You might need to change the permissions of these folders to 777 through your FTP client.) 30 | 31 | 3. If you want to use the admin area to edit/update your site config files, make sure the 32 | site_config folder (and its 2 sub folders) is writable. (You might need to change the permissions 33 | of these folders to 777 through your FTP client.) 
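The permission requirements in the configuration steps above can be checked with a few lines of PHP before enabling caching or the admin editor. This is only a sketch: `ftr_check_writable()` is a hypothetical helper, not part of Full-Text RSS, and it assumes the script is saved in the Full-Text RSS root folder. The folder names are the ones that ship in this package.

```php
<?php
// Hypothetical helper (not part of Full-Text RSS): report which of the
// given folders exist under $base and are writable by the PHP process.
function ftr_check_writable($base, array $folders)
{
    $report = array();
    foreach ($folders as $folder) {
        $path = rtrim($base, '/').'/'.$folder;
        $report[$folder] = is_dir($path) && is_writable($path);
    }
    return $report;
}

// Assumption: this script sits in the Full-Text RSS root folder.
$folders = array(
    'cache', 'cache/rss', 'cache/rss-with-key',
    'site_config', 'site_config/custom', 'site_config/standard',
);
foreach (ftr_check_writable(__DIR__, $folders) as $folder => $ok) {
    echo $folder.': '.($ok ? 'writable' : 'NOT writable')."\n";
}
```

If a folder is reported as not writable, adjust its permissions as described in the steps above and run the check again.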
34 | 35 | Help 36 | ---- 37 | 38 | Please visit http://help.fivefilters.org -------------------------------------------------------------------------------- /UPDATING.txt: -------------------------------------------------------------------------------- 1 | Updating Full-Text RSS 2 | ====================== 3 | 4 | To update your copy of Full-Text RSS to ensure feeds continue to be processed as they were before, we suggest the following steps: 5 | 6 | 1. Keep your current installation in place for now (we'll deal with it later) 7 | 8 | 2. Extract this updated package to a new folder -- for example, if the last version is in a folder called 'full-text-rss', extract this version to a new folder called 'full-text-rss-updated' 9 | 10 | 3. Upload the new folder to your server 11 | 12 | 4. Access index.php in the new folder through your browser -- for example http://example.org/full-text-rss-updated/index.php 13 | 14 | 5. Enter a URL in the form field to test the updated code 15 | 16 | 6. If you'd configured the last version, copy custom_config.php from your old version to the new installation. 17 | 18 | 7. If you'd added custom site config files (in site_config/custom/), copy those over to the new installation. 19 | 20 | 8. If you'd enabled caching, make sure the cache folder (and its 2 sub folders) is writable. (You might need to change the permissions of these folders to 777 through your FTP client.) 21 | 22 | 9. Test the new copy again to make sure the config values are now applied to the new version. 23 | 24 | 10. Now simply rename the folder with your old copy to 'full-text-rss-old' and then rename the folder with the new copy to 'full-text-rss' (or whatever name you'd given the original folder). 25 | 26 | That's all that's needed. Your feeds should continue to work as they did before. Let us know if you have any trouble: help@fivefilters.org. 
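Steps 6 and 7 above (carrying over custom_config.php and any custom site config files) can also be scripted. The following is a minimal sketch under stated assumptions: `ftr_carry_over()` is a hypothetical helper, not part of Full-Text RSS, and it assumes custom site config files end in `.txt`, like the files in site_config/standard/.

```php
<?php
// Hypothetical helper (not part of Full-Text RSS): copy custom_config.php
// and custom site config files from an old install folder to a new one.
// Returns the list of relative paths that were copied.
function ftr_carry_over($old, $new)
{
    $copied = array();

    // Step 6: the custom configuration file, if one was created.
    if (is_file($old.'/custom_config.php')
        && copy($old.'/custom_config.php', $new.'/custom_config.php')) {
        $copied[] = 'custom_config.php';
    }

    // Step 7: custom site config files. scandir() is used rather than
    // glob('*.txt') because glob skips file names starting with a dot.
    $from = $old.'/site_config/custom';
    $to = $new.'/site_config/custom';
    if (is_dir($from) && is_dir($to)) {
        foreach (scandir($from) as $name) {
            if (substr($name, -4) === '.txt'
                && is_file($from.'/'.$name)
                && copy($from.'/'.$name, $to.'/'.$name)) {
                $copied[] = 'site_config/custom/'.$name;
            }
        }
    }
    return $copied;
}

// Example call; the folder names are the ones used in the steps above.
// print_r(ftr_carry_over('full-text-rss', 'full-text-rss-updated'));
```

Run it before renaming the folders in step 10, then test the new copy as described in step 9.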
27 | 28 | Updating Site Patterns 29 | ====================== 30 | 31 | Site patterns are used by Full-Text RSS to improve extraction for certain sites. These are simple text files which are updated more frequently than the Full-Text RSS software itself. 32 | 33 | To make sure your copy of Full-Text RSS is using the latest site patterns, we've created a simple tool to help you get the latest copy of these site patterns from FiveFilters.org. To use it, access the admin/ folder in your browser and follow the instructions. 34 | 35 | Alternatively, you can download the latest set of site config files via GitHub: https://github.com/fivefilters/ftr-site-config -------------------------------------------------------------------------------- /admin/codemirror/codemirror.css: -------------------------------------------------------------------------------- 1 | .CodeMirror { 2 | line-height: 1em; 3 | font-family: monospace; 4 | 5 | /* Necessary so the scrollbar can be absolutely positioned within the wrapper on Lion. */ 6 | position: relative; 7 | /* This prevents unwanted scrollbars from showing up on the body and wrapper in IE. */ 8 | overflow: hidden; 9 | } 10 | 11 | .CodeMirror-scroll { 12 | overflow-x: auto; 13 | overflow-y: hidden; 14 | height: 300px; 15 | /* This is needed to prevent an IE[67] bug where the scrolled content 16 | is visible outside of the scrolling box. */ 17 | position: relative; 18 | outline: none; 19 | } 20 | 21 | /* Vertical scrollbar */ 22 | .CodeMirror-scrollbar { 23 | float: right; 24 | overflow-x: hidden; 25 | overflow-y: scroll; 26 | 27 | /* This corrects for the 1px gap introduced to the left of the scrollbar 28 | by the rule for .CodeMirror-scrollbar-inner. */ 29 | margin-left: -1px; 30 | } 31 | .CodeMirror-scrollbar-inner { 32 | /* This needs to have a nonzero width in order for the scrollbar to appear 33 | in Firefox and IE9. 
*/ 34 | width: 1px; 35 | } 36 | .CodeMirror-scrollbar.cm-sb-overlap { 37 | /* Ensure that the scrollbar appears in Lion, and that it overlaps the content 38 | rather than sitting to the right of it. */ 39 | position: absolute; 40 | z-index: 1; 41 | float: none; 42 | right: 0; 43 | min-width: 12px; 44 | } 45 | .CodeMirror-scrollbar.cm-sb-nonoverlap { 46 | min-width: 12px; 47 | } 48 | .CodeMirror-scrollbar.cm-sb-ie7 { 49 | min-width: 18px; 50 | } 51 | 52 | .CodeMirror-gutter { 53 | position: absolute; left: 0; top: 0; 54 | z-index: 10; 55 | background-color: #f7f7f7; 56 | border-right: 1px solid #eee; 57 | min-width: 2em; 58 | height: 100%; 59 | } 60 | .CodeMirror-gutter-text { 61 | color: #aaa; 62 | text-align: right; 63 | padding: .4em .2em .4em .4em; 64 | white-space: pre !important; 65 | cursor: default; 66 | } 67 | .CodeMirror-lines { 68 | padding: .4em; 69 | white-space: pre; 70 | cursor: text; 71 | } 72 | .CodeMirror-lines * { 73 | /* Necessary for throw-scrolling to decelerate properly on Safari. 
*/ 74 | pointer-events: none; 75 | } 76 | 77 | .CodeMirror pre { 78 | -moz-border-radius: 0; 79 | -webkit-border-radius: 0; 80 | -o-border-radius: 0; 81 | border-radius: 0; 82 | border-width: 0; margin: 0; padding: 0; background: transparent; 83 | font-family: inherit; 84 | font-size: inherit; 85 | padding: 0; margin: 0; 86 | white-space: pre; 87 | word-wrap: normal; 88 | line-height: inherit; 89 | color: inherit; 90 | } 91 | 92 | .CodeMirror-wrap pre { 93 | word-wrap: break-word; 94 | white-space: pre-wrap; 95 | word-break: normal; 96 | } 97 | .CodeMirror-wrap .CodeMirror-scroll { 98 | overflow-x: hidden; 99 | } 100 | 101 | .CodeMirror textarea { 102 | outline: none !important; 103 | } 104 | 105 | .CodeMirror pre.CodeMirror-cursor { 106 | z-index: 10; 107 | position: absolute; 108 | visibility: hidden; 109 | border-left: 1px solid black; 110 | border-right: none; 111 | width: 0; 112 | } 113 | .cm-keymap-fat-cursor pre.CodeMirror-cursor { 114 | width: auto; 115 | border: 0; 116 | background: transparent; 117 | background: rgba(0, 200, 0, .4); 118 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr=#6600c800, endColorstr=#4c00c800); 119 | } 120 | /* Kludge to turn off filter in ie9+, which also accepts rgba */ 121 | .cm-keymap-fat-cursor pre.CodeMirror-cursor:not(#nonsense_id) { 122 | filter: progid:DXImageTransform.Microsoft.gradient(enabled=false); 123 | } 124 | .CodeMirror pre.CodeMirror-cursor.CodeMirror-overwrite {} 125 | .CodeMirror-focused pre.CodeMirror-cursor { 126 | visibility: visible; 127 | } 128 | 129 | div.CodeMirror-selected { background: #d9d9d9; } 130 | .CodeMirror-focused div.CodeMirror-selected { background: #d7d4f0; } 131 | 132 | .CodeMirror-searching { 133 | background: #ffa; 134 | background: rgba(255, 255, 0, .4); 135 | } 136 | 137 | /* Default theme */ 138 | 139 | .cm-s-default span.cm-keyword {color: #708;} 140 | .cm-s-default span.cm-atom {color: #219;} 141 | .cm-s-default span.cm-number {color: #164;} 142 | .cm-s-default 
span.cm-def {color: #00f;} 143 | .cm-s-default span.cm-variable {color: black;} 144 | .cm-s-default span.cm-variable-2 {color: #05a;} 145 | .cm-s-default span.cm-variable-3 {color: #085;} 146 | .cm-s-default span.cm-property {color: black;} 147 | .cm-s-default span.cm-operator {color: black;} 148 | .cm-s-default span.cm-comment {color: #a50;} 149 | .cm-s-default span.cm-string {color: #a11;} 150 | .cm-s-default span.cm-string-2 {color: #f50;} 151 | .cm-s-default span.cm-meta {color: #555;} 152 | .cm-s-default span.cm-error {color: #f00;} 153 | .cm-s-default span.cm-qualifier {color: #555;} 154 | .cm-s-default span.cm-builtin {color: #30a;} 155 | .cm-s-default span.cm-bracket {color: #cc7;} 156 | .cm-s-default span.cm-tag {color: #170;} 157 | .cm-s-default span.cm-attribute {color: #00c;} 158 | .cm-s-default span.cm-header {color: blue;} 159 | .cm-s-default span.cm-quote {color: #090;} 160 | .cm-s-default span.cm-hr {color: #999;} 161 | .cm-s-default span.cm-link {color: #00c;} 162 | 163 | span.cm-header, span.cm-strong {font-weight: bold;} 164 | span.cm-em {font-style: italic;} 165 | span.cm-emstrong {font-style: italic; font-weight: bold;} 166 | span.cm-link {text-decoration: underline;} 167 | 168 | div.CodeMirror span.CodeMirror-matchingbracket {color: #0f0;} 169 | div.CodeMirror span.CodeMirror-nonmatchingbracket {color: #f22;} 170 | -------------------------------------------------------------------------------- /admin/codemirror/properties.js: -------------------------------------------------------------------------------- 1 | CodeMirror.defineMode("properties", function() { 2 | return { 3 | token: function(stream, state) { 4 | var sol = stream.sol() || state.afterSection; 5 | var eol = stream.eol(); 6 | 7 | state.afterSection = false; 8 | 9 | if (sol) { 10 | if (state.nextMultiline) { 11 | state.inMultiline = true; 12 | state.nextMultiline = false; 13 | } else { 14 | state.position = "def"; 15 | } 16 | } 17 | 18 | if (eol && ! 
state.nextMultiline) { 19 | state.inMultiline = false; 20 | state.position = "def"; 21 | } 22 | 23 | if (sol) { 24 | while(stream.eatSpace()); 25 | } 26 | 27 | var ch = stream.next(); 28 | 29 | if (sol && (ch === "#")) { 30 | state.position = "comment"; 31 | stream.skipToEnd(); 32 | return "comment"; 33 | } else if (sol && ch === "[") { 34 | state.afterSection = true; 35 | stream.skipTo("]"); stream.eat("]"); 36 | return "header"; 37 | } else if (ch === ":") { 38 | state.position = "quote"; 39 | return null; 40 | } else if (ch === "\\" && state.position === "quote") { 41 | if (stream.next() !== "u") { // u = Unicode sequence \u1234 42 | // Multiline value 43 | state.nextMultiline = true; 44 | } 45 | } 46 | 47 | return state.position; 48 | }, 49 | 50 | startState: function() { 51 | return { 52 | position : "def", // Current position, "def", "quote" or "comment" 53 | nextMultiline : false, // Is the next line multiline value 54 | inMultiline : false, // Is the current line a multiline value 55 | afterSection : false // Did we just open a section 56 | }; 57 | } 58 | 59 | }; 60 | }); 61 | 62 | CodeMirror.defineMIME("text/x-properties", "properties"); 63 | CodeMirror.defineMIME("text/x-ini", "properties"); 64 | -------------------------------------------------------------------------------- /admin/index.php: -------------------------------------------------------------------------------- 1 | . 23 | */ 24 | 25 | // Usage 26 | // ----- 27 | // Access this file in your browser 28 | 29 | error_reporting(E_ALL ^ E_NOTICE); 30 | ini_set("display_errors", 1); 31 | @set_time_limit(120); 32 | 33 | //////////////////////////////// 34 | // Load config file 35 | //////////////////////////////// 36 | require_once('../config.php'); 37 | require_once('require_login.php'); 38 | require_once('template.php'); 39 | tpl_header('Admin'); 40 | 41 | ?> 42 |

The admin pages are intended to help you manage your copy of Full-Text RSS more easily.

43 | -------------------------------------------------------------------------------- /admin/login.php: -------------------------------------------------------------------------------- 1 | admin_credentials) || $options->admin_credentials['username'] == '' || $options->admin_credentials['password'] == '') { 6 | die('

Admin privileges required

This page requires admin privileges but Full-Text RSS has not been configured with admin credentials.

If you are the administrator, please edit your custom_config.php file and enter the credentials in the appropriate section. When you\'ve done that, this page will prompt you for your admin credentials.

'); 7 | } 8 | 9 | $name = @$_POST['username']; 10 | $pass = @$_POST['pass']; 11 | $invalid_login = false; 12 | 13 | if ($name || $pass) { 14 | if ($name == $options->admin_credentials['username'] && $pass == $options->admin_credentials['password']) { 15 | // Authentication successful - set session 16 | $_SESSION['auth'] = 1; 17 | if (isset($_POST['redirect']) && preg_match('/^[0-9a-z]+$/', $_POST['redirect'])) { 18 | header('Location: '.$_POST['redirect'].'.php'); 19 | } else { 20 | header('Location: index.php'); 21 | } 22 | exit; 23 | } 24 | $invalid_login = true; 25 | } 26 | ?> 27 | 28 | 29 | Login 30 | 31 | Invalid login, please try again. If you can\'t remember your admin credentials, open your custom_config.php and you\'ll find them in there.

'; ?> 32 |
33 | '; ?> 34 | 35 | 36 | 37 |
38 | 39 | -------------------------------------------------------------------------------- /admin/require_login.php: -------------------------------------------------------------------------------- 1 | . 23 | */ 24 | 25 | // Usage 26 | // ----- 27 | // This file is included on pages which require admin privileges - e.g. updating the software. 28 | // The username is 'admin' by default and the password should be set in the custom_config.php file. 29 | session_start(); 30 | require_once(dirname(dirname(__FILE__)).'/config.php'); 31 | 32 | if (isset($_GET['logout'])) $_SESSION['auth'] = 0; 33 | 34 | if (!isset($_SESSION['auth']) || $_SESSION['auth'] != 1) { 35 | if (isset($admin_page)) { 36 | header('Location: login.php?redirect='.$admin_page); 37 | } else { 38 | header('Location: login.php'); 39 | } 40 | exit; 41 | } -------------------------------------------------------------------------------- /admin/template.php: -------------------------------------------------------------------------------- 1 | 6 | 7 | 8 | 9 | <?php echo $title; ?> 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 33 | 34 | 35 |
36 | 49 | 50 | 55 | 56 |
57 | 58 | 59 | . 20 | */ 21 | 22 | // Usage 23 | // ----- 24 | // Set up your scheduler (e.g. cron) to request this file periodically. 25 | // Note: this file must _not_ be named cleancache.php so please rename it. 26 | // We ask you to do this to prevent others from initiating 27 | // the cache cleanup process. It will not run if it's called cleancache.php. 28 | 29 | error_reporting(E_ALL ^ E_NOTICE); 30 | ini_set("display_errors", 1); 31 | @set_time_limit(120); 32 | 33 | // check file name 34 | if (basename(__FILE__) == 'cleancache.php') die('cleancache.php must be renamed'); 35 | 36 | // set include path 37 | set_include_path(realpath(dirname(__FILE__).'/libraries').PATH_SEPARATOR.get_include_path()); 38 | 39 | // Autoloading of classes allows us to include files only when they're 40 | // needed. If we've got a cached copy, for example, only Zend_Cache is loaded. 41 | spl_autoload_register(function ($class_name) { 42 | static $mapping = array( 43 | 'Zend_Cache' => 'Zend/Cache.php' 44 | ); 45 | if (isset($mapping[$class_name])) { 46 | //echo "Loading $class_name\n<br />"; 47 | require_once $mapping[$class_name]; 48 | return true; 49 | } else { 50 | return false; 51 | } 52 | }); 53 | require_once dirname(__FILE__).'/config.php'; 54 | if (!$options->caching) die('Caching is disabled'); 55 | 56 | // clean APC cache 57 | if ($options->apc && function_exists('apc_delete')) { 58 | $_apc_data = apc_cache_info('user'); 59 | foreach ($_apc_data['cache_list'] as $_apc_item) { 60 | //var_dump($_apc_item); exit; 61 | // APCu keys incompatible with original APC keys, apparently fixed in newer versions, but not in 4.0.4 62 | // So let's look for those keys and fix here (ctime -> creation_time, key -> info). 63 | if (isset($_apc_item['ctime'])) $_apc_item['creation_time'] = $_apc_item['ctime']; 64 | if (isset($_apc_item['key'])) $_apc_item['info'] = $_apc_item['key']; 65 | if ($_apc_item['ttl'] > 0 && ($_apc_item['ttl'] + $_apc_item['creation_time'] < time())) { 66 | apc_delete($_apc_item['info']); 67 | } 68 | } 69 | } 70 | 71 | // clean rss (non-key) cache 72 | $frontendOptions = array( 73 | 'lifetime' => $options->cache_time*60, 74 | 'automatic_serialization' => false, 75 | 'write_control' => false, 76 | 'automatic_cleaning_factor' => 0, 77 | 'ignore_user_abort' => false 78 | ); 79 | $backendOptions = array( 80 | 'cache_dir' => $options->cache_dir.'/rss/', 81 | 'file_locking' => false, 82 | 'read_control' => true, 83 | 'read_control_type' => 'strlen', 84 | 'hashed_directory_level' => $options->cache_directory_level, 85 | 'hashed_directory_perm' => 0777, 86 | 'cache_file_perm' => 0664, 87 | 'file_name_prefix' => 'ff' 88 | ); 89 | $cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions); 90 | $cache->clean(Zend_Cache::CLEANING_MODE_OLD); 91 | 92 | // clean rss (key) cache 93 | $frontendOptions = array( 94 | 'lifetime' => $options->cache_time*60, 95 | 'automatic_serialization' => false, 96 | 'write_control' => false, 97 | 'automatic_cleaning_factor' => 0, 98 | 'ignore_user_abort' => false 99 | ); 100 | $backendOptions = 
array( 101 | 'cache_dir' => $options->cache_dir.'/rss-with-key/', 102 | 'file_locking' => false, 103 | 'read_control' => true, 104 | 'read_control_type' => 'strlen', 105 | 'hashed_directory_level' => $options->cache_directory_level, 106 | 'hashed_directory_perm' => 0777, 107 | 'cache_file_perm' => 0664, 108 | 'file_name_prefix' => 'ff' 109 | ); 110 | $cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions); 111 | $cache->clean(Zend_Cache::CLEANING_MODE_OLD); -------------------------------------------------------------------------------- /css/feed.css: -------------------------------------------------------------------------------- 1 | /* RSS CSS Document */ 2 | 3 | * { margin:0; padding:0; } 4 | 5 | p { padding: .5em 0; } 6 | 7 | h1,h2,h3,h4,h5,h6 { font-size: 1em; padding: .5em 0; } 8 | 9 | html { display:block; padding-bottom:50px; } 10 | body { font:80% Verdana, sans-serif; color:#000; padding:25px 0 0 35px; } 11 | 12 | a { color:#5BAB03; text-decoration:none; } 13 | a:hover { color:#5BAB03; text-decoration: underline;} 14 | 15 | ul { margin-left:1.5em; } 16 | li { margin-bottom:0.4em; } 17 | div#content>ul { list-style-type: none; } 18 | div.article>li>a { font-weight:bold; font-size: 1.3em;} 19 | 20 | 21 | div { line-height:1.6em; } 22 | 23 | div#content { background:#fff; margin-right:15px; padding-left:1em;} 24 | div#content div { margin:0 1em 1em 0; } 25 | 26 | div#explanation { padding:1em 1em 0 1em; border:1px solid #ddd; background:#efefef; margin:0 2em 2em 0; } 27 | div#explanation h1 { font-weight:normal; font-size:1.8em; margin-bottom:0.3em; } 28 | div#explanation p { margin-bottom:1em; } 29 | 30 | .small { font-size: .7em; color: #666; } -------------------------------------------------------------------------------- /css/feed.xsl: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | <xsl:value-of select="$title"/> (full-text feed) 10 | 11 | 12 | 13 |
14 |

(full-text feed)

15 |

You are viewing an auto-generated full-text RSS feed. RSS feeds allow you to stay up to date with the latest news and features you want from websites.
Subscribe to this feed.

16 |

Below is the latest content available from this feed.

17 |
18 | 19 |
20 |
    21 | 22 |
    23 |
  • 24 |
    25 | 26 | 27 | 28 | 29 |
    30 |
  • 31 |
    32 |
    33 |
34 |
35 | 36 | 37 |
38 |
-------------------------------------------------------------------------------- /extract.php: -------------------------------------------------------------------------------- 1 | . 24 | */ 25 | 26 | // Usage 27 | // ----- 28 | // Request this file passing it a web page URL in the querystring: extract.php?url=example.org 29 | // You can use GET and POST requests. 30 | // You'll get a simple JSON response: 31 | /* 32 | HTTP/1.0 200 OK 33 | { 34 | "title": "Blowing Smoke with Boxing's Big Voice", 35 | "content": "<p>Content here</p><p>More content</p>", 36 | "author": "Rafi Kohan", 37 | "excerpt": "Short extract from the beginning of the article.", 38 | "language": "en", 39 | "url": "http://example.org/article.html", 40 | "effective_url": "http://example.org/article.html", 41 | "date": "2014-05-10" 42 | } 43 | */ 44 | 45 | define('_FF_FTR_MODE', 'simple'); 46 | 47 | // Don't process URL as feed 48 | $_POST['accept'] = 'html'; 49 | // JSON output only 50 | $_POST['format'] = 'json'; 51 | // Enable excerpts 52 | $_POST['summary'] = '1'; 53 | // Guess language if it's not already marked up 54 | $_POST['lang'] = '2'; 55 | // Don't produce result if extraction fails 56 | // Additional meta elements might still be useful (e.g. og/twitter elements), 57 | // so we're commenting this out from FTR 3.7 58 | //$_POST['exc'] = '1'; 59 | // Enable XSS filtering (unless explicitly disabled) 60 | if (isset($_POST['xss']) && $_POST['xss'] !== '0') { 61 | $_POST['xss'] = '1'; 62 | } elseif (isset($_GET['xss']) && $_GET['xss'] !== '0') { 63 | $_GET['xss'] = '1'; 64 | } else { 65 | $_POST['xss'] = '1'; 66 | } 67 | 68 | require 'makefulltextfeed.php'; -------------------------------------------------------------------------------- /images/agplv3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/timothytylee/full-text-rss/384d52fd83361ffd6e7f28bd39b322970a015a28/images/agplv3.png -------------------------------------------------------------------------------- /js/bootstrap-popover.js: -------------------------------------------------------------------------------- 1 | /* =========================================================== 2 | * bootstrap-popover.js v2.0.3 3 | * http://twitter.github.com/bootstrap/javascript.html#popovers 4 | * =========================================================== 5 | * Copyright 2012 Twitter, Inc. 
6 | * 7 | * Licensed under the Apache License, Version 2.0 (the "License"); 8 | * you may not use this file except in compliance with the License. 9 | * You may obtain a copy of the License at 10 | * 11 | * http://www.apache.org/licenses/LICENSE-2.0 12 | * 13 | * Unless required by applicable law or agreed to in writing, software 14 | * distributed under the License is distributed on an "AS IS" BASIS, 15 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 16 | * See the License for the specific language governing permissions and 17 | * limitations under the License. 18 | * =========================================================== */ 19 | 20 | 21 | !function ($) { 22 | 23 | "use strict"; // jshint ;_; 24 | 25 | 26 | /* POPOVER PUBLIC CLASS DEFINITION 27 | * =============================== */ 28 | 29 | var Popover = function ( element, options ) { 30 | this.init('popover', element, options) 31 | } 32 | 33 | 34 | /* NOTE: POPOVER EXTENDS BOOTSTRAP-TOOLTIP.js 35 | ========================================== */ 36 | 37 | Popover.prototype = $.extend({}, $.fn.tooltip.Constructor.prototype, { 38 | 39 | constructor: Popover 40 | 41 | , setContent: function () { 42 | var $tip = this.tip() 43 | , title = this.getTitle() 44 | , content = this.getContent() 45 | 46 | $tip.find('.popover-title')[this.isHTML(title) ? 'html' : 'text'](title) 47 | $tip.find('.popover-content > *')[this.isHTML(content) ? 'html' : 'text'](content) 48 | 49 | $tip.removeClass('fade top bottom left right in') 50 | } 51 | 52 | , hasContent: function () { 53 | return this.getTitle() || this.getContent() 54 | } 55 | 56 | , getContent: function () { 57 | var content 58 | , $e = this.$element 59 | , o = this.options 60 | 61 | content = $e.attr('data-content') 62 | || (typeof o.content == 'function' ? 
o.content.call($e[0]) : o.content) 63 | 64 | return content 65 | } 66 | 67 | , tip: function () { 68 | if (!this.$tip) { 69 | this.$tip = $(this.options.template) 70 | } 71 | return this.$tip 72 | } 73 | 74 | }) 75 | 76 | 77 | /* POPOVER PLUGIN DEFINITION 78 | * ======================= */ 79 | 80 | $.fn.popover = function (option) { 81 | return this.each(function () { 82 | var $this = $(this) 83 | , data = $this.data('popover') 84 | , options = typeof option == 'object' && option 85 | if (!data) $this.data('popover', (data = new Popover(this, options))) 86 | if (typeof option == 'string') data[option]() 87 | }) 88 | } 89 | 90 | $.fn.popover.Constructor = Popover 91 | 92 | $.fn.popover.defaults = $.extend({} , $.fn.tooltip.defaults, { 93 | placement: 'right' 94 | , content: '' 95 | , template: '

' 96 | }) 97 | 98 | }(window.jQuery); -------------------------------------------------------------------------------- /js/bootstrap-tab.js: -------------------------------------------------------------------------------- 1 | /* ======================================================== 2 | * bootstrap-tab.js v2.0.3 3 | * http://twitter.github.com/bootstrap/javascript.html#tabs 4 | * ======================================================== 5 | * Copyright 2012 Twitter, Inc. 6 | * 7 | * Licensed under the Apache License, Version 2.0 (the "License"); 8 | * you may not use this file except in compliance with the License. 9 | * You may obtain a copy of the License at 10 | * 11 | * http://www.apache.org/licenses/LICENSE-2.0 12 | * 13 | * Unless required by applicable law or agreed to in writing, software 14 | * distributed under the License is distributed on an "AS IS" BASIS, 15 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 16 | * See the License for the specific language governing permissions and 17 | * limitations under the License. 
18 | * ======================================================== */ 19 | 20 | 21 | !function ($) { 22 | 23 | "use strict"; // jshint ;_; 24 | 25 | 26 | /* TAB CLASS DEFINITION 27 | * ==================== */ 28 | 29 | var Tab = function ( element ) { 30 | this.element = $(element) 31 | } 32 | 33 | Tab.prototype = { 34 | 35 | constructor: Tab 36 | 37 | , show: function () { 38 | var $this = this.element 39 | , $ul = $this.closest('ul:not(.dropdown-menu)') 40 | , selector = $this.attr('data-target') 41 | , previous 42 | , $target 43 | , e 44 | 45 | if (!selector) { 46 | selector = $this.attr('href') 47 | selector = selector && selector.replace(/.*(?=#[^\s]*$)/, '') //strip for ie7 48 | } 49 | 50 | if ( $this.parent('li').hasClass('active') ) return 51 | 52 | previous = $ul.find('.active a').last()[0] 53 | 54 | e = $.Event('show', { 55 | relatedTarget: previous 56 | }) 57 | 58 | $this.trigger(e) 59 | 60 | if (e.isDefaultPrevented()) return 61 | 62 | $target = $(selector) 63 | 64 | this.activate($this.parent('li'), $ul) 65 | this.activate($target, $target.parent(), function () { 66 | $this.trigger({ 67 | type: 'shown' 68 | , relatedTarget: previous 69 | }) 70 | }) 71 | } 72 | 73 | , activate: function ( element, container, callback) { 74 | var $active = container.find('> .active') 75 | , transition = callback 76 | && $.support.transition 77 | && $active.hasClass('fade') 78 | 79 | function next() { 80 | $active 81 | .removeClass('active') 82 | .find('> .dropdown-menu > .active') 83 | .removeClass('active') 84 | 85 | element.addClass('active') 86 | 87 | if (transition) { 88 | element[0].offsetWidth // reflow for transition 89 | element.addClass('in') 90 | } else { 91 | element.removeClass('fade') 92 | } 93 | 94 | if ( element.parent('.dropdown-menu') ) { 95 | element.closest('li.dropdown').addClass('active') 96 | } 97 | 98 | callback && callback() 99 | } 100 | 101 | transition ? 
102 | $active.one($.support.transition.end, next) : 103 | next() 104 | 105 | $active.removeClass('in') 106 | } 107 | } 108 | 109 | 110 | /* TAB PLUGIN DEFINITION 111 | * ===================== */ 112 | 113 | $.fn.tab = function ( option ) { 114 | return this.each(function () { 115 | var $this = $(this) 116 | , data = $this.data('tab') 117 | if (!data) $this.data('tab', (data = new Tab(this))) 118 | if (typeof option == 'string') data[option]() 119 | }) 120 | } 121 | 122 | $.fn.tab.Constructor = Tab 123 | 124 | 125 | /* TAB DATA-API 126 | * ============ */ 127 | 128 | $(function () { 129 | $('body').on('click.tab.data-api', '[data-toggle="tab"], [data-toggle="pill"]', function (e) { 130 | e.preventDefault() 131 | $(this).tab('show') 132 | }) 133 | }) 134 | 135 | }(window.jQuery); -------------------------------------------------------------------------------- /libraries/DisableSimplePieSanitize.php: -------------------------------------------------------------------------------- 1 | infinite lifetime) 68 | * @return boolean true if no problem 69 | */ 70 | public function save($data, $id, $tags = array(), $specificLifetime = false); 71 | 72 | /** 73 | * Remove a cache record 74 | * 75 | * @param string $id Cache id 76 | * @return boolean True if no problem 77 | */ 78 | public function remove($id); 79 | 80 | /** 81 | * Clean some cache records 82 | * 83 | * Available modes are : 84 | * Zend_Cache::CLEANING_MODE_ALL (default) => remove all cache entries ($tags is not used) 85 | * Zend_Cache::CLEANING_MODE_OLD => remove too old cache entries ($tags is not used) 86 | * Zend_Cache::CLEANING_MODE_MATCHING_TAG => remove cache entries matching all given tags 87 | * ($tags can be an array of strings or a single string) 88 | * Zend_Cache::CLEANING_MODE_NOT_MATCHING_TAG => remove cache entries not {matching one of the given tags} 89 | * ($tags can be an array of strings or a single string) 90 | * Zend_Cache::CLEANING_MODE_MATCHING_ANY_TAG => remove cache entries matching any 
given tags 91 | * ($tags can be an array of strings or a single string) 92 | * 93 | * @param string $mode Clean mode 94 | * @param array $tags Array of tags 95 | * @return boolean true if no problem 96 | */ 97 | public function clean($mode = Zend_Cache::CLEANING_MODE_ALL, $tags = array()); 98 | 99 | } 100 | -------------------------------------------------------------------------------- /libraries/Zend/Cache/Exception.php: -------------------------------------------------------------------------------- 1 | _previous = $previous; 48 | } else { 49 | parent::__construct($msg, (int) $code, $previous); 50 | } 51 | } 52 | 53 | /** 54 | * Overloading 55 | * 56 | * For PHP < 5.3.0, provides access to the getPrevious() method. 57 | * 58 | * @param string $method 59 | * @param array $args 60 | * @return mixed 61 | */ 62 | public function __call($method, array $args) 63 | { 64 | if ('getprevious' == strtolower($method)) { 65 | return $this->_getPrevious(); 66 | } 67 | return null; 68 | } 69 | 70 | /** 71 | * String representation of the exception 72 | * 73 | * @return string 74 | */ 75 | public function __toString() 76 | { 77 | if (version_compare(PHP_VERSION, '5.3.0', '<')) { 78 | if (null !== ($e = $this->getPrevious())) { 79 | return $e->__toString() 80 | . "\n\nNext " 81 | . 
parent::__toString(); 82 | } 83 | } 84 | return parent::__toString(); 85 | } 86 | 87 | /** 88 | * Returns previous Exception 89 | * 90 | * @return Exception|null 91 | */ 92 | protected function _getPrevious() 93 | { 94 | return $this->_previous; 95 | } 96 | } 97 | -------------------------------------------------------------------------------- /libraries/feedwriter/FeedItem.php: -------------------------------------------------------------------------------- 1 | 9 | * @link http://www.ajaxray.com/projects/rss 10 | */ 11 | class FeedItem 12 | { 13 | private $elements = array(); //Collection of feed elements 14 | private $version; 15 | 16 | /** 17 | * Constructor 18 | * 19 | * @param contant (RSS1/RSS2/ATOM) RSS2 is default. 20 | */ 21 | function __construct($version = RSS2) 22 | { 23 | $this->version = $version; 24 | } 25 | 26 | /** 27 | * Set element (overwrites existing elements with $elementName) 28 | * 29 | * @access public 30 | * @param srting The tag name of an element 31 | * @param srting The content of tag 32 | * @param array Attributes(if any) in 'attrName' => 'attrValue' format 33 | * @return void 34 | */ 35 | public function setElement($elementName, $content, $attributes = null) 36 | { 37 | if (isset($this->elements[$elementName])) { 38 | unset($this->elements[$elementName]); 39 | } 40 | $this->addElement($elementName, $content, $attributes); 41 | } 42 | 43 | /** 44 | * Add an element to elements array 45 | * 46 | * @access public 47 | * @param srting The tag name of an element 48 | * @param srting The content of tag 49 | * @param array Attributes(if any) in 'attrName' => 'attrValue' format 50 | * @return void 51 | */ 52 | public function addElement($elementName, $content, $attributes = null) 53 | { 54 | $i = 0; 55 | if (isset($this->elements[$elementName])) { 56 | $i = count($this->elements[$elementName]); 57 | } else { 58 | $this->elements[$elementName] = array(); 59 | } 60 | $this->elements[$elementName][$i]['name'] = $elementName; 61 | 
$this->elements[$elementName][$i]['content'] = $content; 62 | $this->elements[$elementName][$i]['attributes'] = $attributes; 63 | } 64 | 65 | /** 66 | * Set multiple feed elements from an array. 67 | * Elements which have attributes cannot be added by this method 68 | * 69 | * @access public 70 | * @param array array of elements in 'tagName' => 'tagContent' format. 71 | * @return void 72 | */ 73 | public function addElementArray($elementArray) 74 | { 75 | if(! is_array($elementArray)) return; 76 | foreach ($elementArray as $elementName => $content) 77 | { 78 | $this->addElement($elementName, $content); 79 | } 80 | } 81 | 82 | /** 83 | * Return the collection of elements in this feed item 84 | * 85 | * @access public 86 | * @return array 87 | */ 88 | public function getElements() 89 | { 90 | return $this->elements; 91 | } 92 | 93 | // Wrapper functions ------------------------------------------------------ 94 | 95 | /** 96 | * Set the 'dscription' element of feed item 97 | * 98 | * @access public 99 | * @param string The content of 'description' element 100 | * @return void 101 | */ 102 | public function setDescription($description) 103 | { 104 | $tag = ($this->version == ATOM)? 'summary' : 'description'; 105 | $this->setElement($tag, $description); 106 | } 107 | 108 | /** 109 | * @desc Set the 'title' element of feed item 110 | * @access public 111 | * @param string The content of 'title' element 112 | * @return void 113 | */ 114 | public function setTitle($title) 115 | { 116 | $this->setElement('title', $title); 117 | } 118 | 119 | /** 120 | * Set the 'date' element of feed item 121 | * 122 | * @access public 123 | * @param string The content of 'date' element 124 | * @return void 125 | */ 126 | public function setDate($date) 127 | { 128 | if(! 
is_numeric($date)) 129 | { 130 | $date = strtotime($date); 131 | } 132 | 133 | if($this->version == ATOM) 134 | { 135 | $tag = 'updated'; 136 | $value = date(DATE_ATOM, $date); 137 | } 138 | elseif($this->version == RSS2) 139 | { 140 | $tag = 'pubDate'; 141 | $value = date(DATE_RSS, $date); 142 | } 143 | else 144 | { 145 | $tag = 'dc:date'; 146 | $value = date("Y-m-d", $date); 147 | } 148 | 149 | $this->setElement($tag, $value); 150 | } 151 | 152 | /** 153 | * Set the 'link' element of feed item 154 | * 155 | * @access public 156 | * @param string The content of 'link' element 157 | * @return void 158 | */ 159 | public function setLink($link) 160 | { 161 | if($this->version == RSS2 || $this->version == RSS1) 162 | { 163 | $this->setElement('link', $link); 164 | } 165 | else 166 | { 167 | $this->setElement('link','',array('href'=>$link)); 168 | $this->setElement('id', FeedWriter::uuid($link,'urn:uuid:')); 169 | } 170 | 171 | } 172 | 173 | /** 174 | * Set the 'encloser' element of feed item 175 | * For RSS 2.0 only 176 | * 177 | * @access public 178 | * @param string The url attribute of encloser tag 179 | * @param string The length attribute of encloser tag 180 | * @param string The type attribute of encloser tag 181 | * @return void 182 | */ 183 | public function setEncloser($url, $length, $type) 184 | { 185 | $attributes = array('url'=>$url, 'length'=>$length, 'type'=>$type); 186 | $this->setElement('enclosure','',$attributes); 187 | } 188 | 189 | } -------------------------------------------------------------------------------- /libraries/html5php/HTML5/Exception.php: -------------------------------------------------------------------------------- 1 | ) 72 | * @return int One of the Tokenizer::TEXTMODE_* constants. 73 | */ 74 | public function startTag($name, $attributes = array(), $selfClosing = false); 75 | 76 | /** 77 | * An end-tag. 78 | */ 79 | public function endTag($name); 80 | 81 | /** 82 | * A comment section (unparsed character data). 
83 | */ 84 | public function comment($cdata); 85 | 86 | /** 87 | * A unit of parsed character data. 88 | * 89 | * Entities in this text are *already decoded*. 90 | */ 91 | public function text($cdata); 92 | 93 | /** 94 | * Indicates that the document has been entirely processed. 95 | */ 96 | public function eof(); 97 | 98 | /** 99 | * Emitted when the parser encounters an error condition. 100 | */ 101 | public function parseError($msg, $line, $col); 102 | 103 | /** 104 | * A CDATA section. 105 | * 106 | * @param string $data 107 | * The unparsed character data. 108 | */ 109 | public function cdata($data); 110 | 111 | /** 112 | * This is a holdover from the XML spec. 113 | * 114 | * While user agents don't get PIs, server-side does. 115 | * 116 | * @param string $name 117 | * The name of the processor (e.g. 'php'). 118 | * @param string $data 119 | * The unparsed data. 120 | */ 121 | public function processingInstruction($name, $data = null); 122 | } 123 | -------------------------------------------------------------------------------- /libraries/html5php/HTML5/Parser/FileInputStream.php: -------------------------------------------------------------------------------- 1 | is = $input; 32 | } 33 | 34 | /** 35 | * Get the current position. 36 | * 37 | * @return int The current intiger byte position. 38 | */ 39 | public function position() 40 | { 41 | return $this->is->key(); 42 | } 43 | 44 | /** 45 | * Take a peek at the next character in the data. 46 | * 47 | * @return string The next character. 48 | */ 49 | public function peek() 50 | { 51 | return $this->is->peek(); 52 | } 53 | 54 | /** 55 | * Get the next character. 56 | * 57 | * Note: This advances the pointer. 58 | * 59 | * @return string The next character. 
60 | */ 61 | public function next() 62 | { 63 | $this->is->next(); 64 | if ($this->is->valid()) { 65 | if ($this->debug) 66 | fprintf(STDOUT, "> %s\n", $this->is->current()); 67 | return $this->is->current(); 68 | } 69 | 70 | return false; 71 | } 72 | 73 | /** 74 | * Get the current character. 75 | * 76 | * Note, this does not advance the pointer. 77 | * 78 | * @return string The current character. 79 | */ 80 | public function current() 81 | { 82 | if ($this->is->valid()) { 83 | return $this->is->current(); 84 | } 85 | 86 | return false; 87 | } 88 | 89 | /** 90 | * Silently consume N chars. 91 | */ 92 | public function consume($count = 1) 93 | { 94 | for ($i = 0; $i < $count; ++ $i) { 95 | $this->next(); 96 | } 97 | } 98 | 99 | /** 100 | * Unconsume some of the data. 101 | * This moves the data pointer backwards. 102 | * 103 | * @param int $howMany 104 | * The number of characters to move the pointer back. 105 | */ 106 | public function unconsume($howMany = 1) 107 | { 108 | $this->is->unconsume($howMany); 109 | } 110 | 111 | /** 112 | * Get the next group of that contains hex characters. 113 | * 114 | * Note, along with getting the characters the pointer in the data will be 115 | * moved as well. 116 | * 117 | * @return string The next group that is hex characters. 118 | */ 119 | public function getHex() 120 | { 121 | return $this->is->charsWhile(static::CHARS_HEX); 122 | } 123 | 124 | /** 125 | * Get the next group of characters that are ASCII Alpha characters. 126 | * 127 | * Note, along with getting the characters the pointer in the data will be 128 | * moved as well. 129 | * 130 | * @return string The next group of ASCII alpha characters. 131 | */ 132 | public function getAsciiAlpha() 133 | { 134 | return $this->is->charsWhile(static::CHARS_ALPHA); 135 | } 136 | 137 | /** 138 | * Get the next group of characters that are ASCII Alpha characters and numbers. 
139 | * 140 | * Note, along with getting the characters the pointer in the data will be 141 | * moved as well. 142 | * 143 | * @return string The next group of ASCII alpha characters and numbers. 144 | */ 145 | public function getAsciiAlphaNum() 146 | { 147 | return $this->is->charsWhile(static::CHARS_ALNUM); 148 | } 149 | 150 | /** 151 | * Get the next group of numbers. 152 | * 153 | * Note, along with getting the characters the pointer in the data will be 154 | * moved as well. 155 | * 156 | * @return string The next group of numbers. 157 | */ 158 | public function getNumeric() 159 | { 160 | return $this->is->charsWhile('0123456789'); 161 | } 162 | 163 | /** 164 | * Consume whitespace. 165 | * 166 | * Whitespace in HTML5 is: formfeed, tab, newline, space. 167 | */ 168 | public function whitespace() 169 | { 170 | return $this->is->charsWhile("\n\t\f "); 171 | } 172 | 173 | /** 174 | * Returns the current line that is being consumed. 175 | * 176 | * @return int The current line number. 177 | */ 178 | public function currentLine() 179 | { 180 | return $this->is->currentLine(); 181 | } 182 | 183 | /** 184 | * Read chars until something in the mask is encountered. 185 | */ 186 | public function charsUntil($mask) 187 | { 188 | return $this->is->charsUntil($mask); 189 | } 190 | 191 | /** 192 | * Read chars as long as the mask matches. 193 | */ 194 | public function charsWhile($mask) 195 | { 196 | return $this->is->charsWhile($mask); 197 | } 198 | 199 | /** 200 | * Returns the current column of the current line that the tokenizer is at. 201 | * 202 | * Newlines are column 0. The first char after a newline is column 1. 203 | * 204 | * @return int The column number. 205 | */ 206 | public function columnOffset() 207 | { 208 | return $this->is->columnOffset(); 209 | } 210 | 211 | /** 212 | * Get all characters until EOF. 213 | * 214 | * This consumes characters until the EOF. 215 | * 216 | * @return int The number of characters remaining. 
217 | */ 218 | public function remainingChars() 219 | { 220 | return $this->is->remainingChars(); 221 | } 222 | } 223 | -------------------------------------------------------------------------------- /libraries/html5php/HTML5/Parser/TreeBuildingRules.php: -------------------------------------------------------------------------------- 1 | 1, 20 | 'dd' => 1, 21 | 'dt' => 1, 22 | 'rt' => 1, 23 | 'rp' => 1, 24 | 'tr' => 1, 25 | 'th' => 1, 26 | 'td' => 1, 27 | 'thead' => 1, 28 | 'tfoot' => 1, 29 | 'tbody' => 1, 30 | 'table' => 1, 31 | 'optgroup' => 1, 32 | 'option' => 1 33 | ); 34 | 35 | /** 36 | * Build a new rules engine. 37 | * 38 | * @param \DOMDocument $doc 39 | * The DOM document to use for evaluation and modification. 40 | */ 41 | public function __construct($doc) 42 | { 43 | $this->doc = $doc; 44 | } 45 | 46 | /** 47 | * Returns true if the given tagname has special processing rules. 48 | */ 49 | public function hasRules($tagname) 50 | { 51 | return isset(static::$tags[$tagname]); 52 | } 53 | 54 | /** 55 | * Evaluate the rule for the current tag name. 56 | * 57 | * This may modify the existing DOM. 58 | * 59 | * @return \DOMElement The new Current DOM element. 
60 | */ 61 | public function evaluate($new, $current) 62 | { 63 | switch ($new->tagName) { 64 | case 'li': 65 | return $this->handleLI($new, $current); 66 | case 'dt': 67 | case 'dd': 68 | return $this->handleDT($new, $current); 69 | case 'rt': 70 | case 'rp': 71 | return $this->handleRT($new, $current); 72 | case 'optgroup': 73 | return $this->closeIfCurrentMatches($new, $current, array( 74 | 'optgroup' 75 | )); 76 | case 'option': 77 | return $this->closeIfCurrentMatches($new, $current, array( 78 | 'option', 79 | )); 80 | case 'tr': 81 | return $this->closeIfCurrentMatches($new, $current, array( 82 | 'tr' 83 | )); 84 | case 'td': 85 | case 'th': 86 | return $this->closeIfCurrentMatches($new, $current, array( 87 | 'th', 88 | 'td' 89 | )); 90 | case 'tbody': 91 | case 'thead': 92 | case 'tfoot': 93 | case 'table': // Spec isn't explicit about this, but it's necessary. 94 | 95 | return $this->closeIfCurrentMatches($new, $current, array( 96 | 'thead', 97 | 'tfoot', 98 | 'tbody' 99 | )); 100 | } 101 | 102 | return $current; 103 | } 104 | 105 | protected function handleLI($ele, $current) 106 | { 107 | return $this->closeIfCurrentMatches($ele, $current, array( 108 | 'li' 109 | )); 110 | } 111 | 112 | protected function handleDT($ele, $current) 113 | { 114 | return $this->closeIfCurrentMatches($ele, $current, array( 115 | 'dt', 116 | 'dd' 117 | )); 118 | } 119 | 120 | protected function handleRT($ele, $current) 121 | { 122 | return $this->closeIfCurrentMatches($ele, $current, array( 123 | 'rt', 124 | 'rp' 125 | )); 126 | } 127 | 128 | protected function closeIfCurrentMatches($ele, $current, $match) 129 | { 130 | $tname = $current->tagName; 131 | if (in_array($current->tagName, $match)) { 132 | $current->parentNode->appendChild($ele); 133 | } else { 134 | $current->appendChild($ele); 135 | } 136 | 137 | return $ele; 138 | } 139 | } 140 | -------------------------------------------------------------------------------- /libraries/html5php/HTML5/Parser/UTF8Utils.php: 
-------------------------------------------------------------------------------- 1 | 8 | 9 | Permission is hereby granted, free of charge, to any person obtaining a 10 | copy of this software and associated documentation files (the 11 | "Software"), to deal in the Software without restriction, including 12 | without limitation the rights to use, copy, modify, merge, publish, 13 | distribute, sublicense, and/or sell copies of the Software, and to 14 | permit persons to whom the Software is furnished to do so, subject to 15 | the following conditions: 16 | 17 | The above copyright notice and this permission notice shall be included 18 | in all copies or substantial portions of the Software. 19 | 20 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 21 | OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 22 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 23 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 24 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 25 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 26 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 27 | 28 | */ 29 | /** 30 | * UTF-8 Utilities 31 | */ 32 | class UTF8Utils 33 | { 34 | 35 | /** 36 | * The Unicode replacement character.. 37 | */ 38 | const FFFD = "\xEF\xBF\xBD"; 39 | 40 | /** 41 | * Count the number of characters in a string. 42 | * 43 | * UTF-8 aware. This will try (in order) iconv, 44 | * MB, libxml, and finally a custom counter. 45 | * 46 | * @todo Move this to a general utility class. 47 | */ 48 | public static function countChars($string) 49 | { 50 | // Get the length for the string we need. 51 | if (function_exists('mb_strlen')) { 52 | return mb_strlen($string, 'utf-8'); 53 | } elseif (function_exists('iconv_strlen')) { 54 | return iconv_strlen($string, 'utf-8'); 55 | } elseif (function_exists('utf8_decode')) { 56 | // MPB: Will this work? 
Won't certain decodes lead to two chars 57 | // extrapolated out of 2-byte chars? 58 | return strlen(utf8_decode($string)); 59 | } 60 | $count = count_chars($string); 61 | // 0x80 = 0x7F - 0 + 1 (one added to get inclusive range) 62 | // 0x33 = 0xF4 - 0xC2 + 1 (one added to get inclusive range) 63 | return array_sum(array_slice($count, 0, 0x80)) + array_sum(array_slice($count, 0xC2, 0x33)); 64 | } 65 | 66 | /** 67 | * Convert data from the given encoding to UTF-8. 68 | * 69 | * This has not yet been tested with character sets other than UTF-8. 70 | * It should work with ISO-8859-1/-13 and standard Latin Win charsets. 71 | * 72 | * @param string $data 73 | * The data to convert. 74 | * @param string $encoding 75 | * A valid encoding. Examples: http://www.php.net/manual/en/mbstring.supported-encodings.php 76 | */ 77 | public static function convertToUTF8($data, $encoding = 'UTF-8') 78 | { 79 | /* 80 | * From the HTML5 spec: Given an encoding, the bytes in the input stream must be converted to Unicode characters for the tokeniser, as described by the rules for that encoding, except that the leading U+FEFF BYTE ORDER MARK character, if any, must not be stripped by the encoding layer (it is stripped by the rule below). Bytes or sequences of bytes in the original byte stream that could not be converted to Unicode characters must be converted to U+FFFD REPLACEMENT CHARACTER code points. 81 | */ 82 | 83 | // mb_convert_encoding is chosen over iconv because of a bug. The best 84 | // details for the bug are on http://us1.php.net/manual/en/function.iconv.php#108643 85 | // which contains links to the actual bug reports as well as workaround 86 | // details. 87 | if (function_exists('mb_convert_encoding')) { 88 | // mb library has the following behaviors: 89 | // - UTF-16 surrogates result in false. 90 | // - Overlongs and outside Plane 16 result in empty strings. 
91 | 92 | // Before we run mb_convert_encoding we need to tell it what to do with 93 | // characters it does not know. This could be different than the parent 94 | // application executing this library so we store the value, change it 95 | // to our needs, and then change it back when we are done. This feels 96 | // a little excessive and it would be great if there was a better way. 97 | $save = mb_substitute_character(); 98 | mb_substitute_character('none'); 99 | $data = mb_convert_encoding($data, 'UTF-8', $encoding); 100 | mb_substitute_character($save); 101 | } // @todo Get iconv running in at least some environments if that is possible. 102 | elseif (function_exists('iconv') && $encoding != 'auto') { 103 | // fprintf(STDOUT, "iconv found\n"); 104 | // iconv has the following behaviors: 105 | // - Overlong representations are ignored. 106 | // - Beyond Plane 16 is replaced with a lower char. 107 | // - Incomplete sequences generate a warning. 108 | $data = @iconv($encoding, 'UTF-8//IGNORE', $data); 109 | } else { 110 | // we can make a conforming native implementation 111 | throw new Exception('Not implemented, please install mbstring or iconv'); 112 | } 113 | 114 | /* 115 | * One leading U+FEFF BYTE ORDER MARK character must be ignored if any are present. 116 | */ 117 | if (substr($data, 0, 3) === "\xEF\xBB\xBF") { 118 | $data = substr($data, 3); 119 | } 120 | 121 | return $data; 122 | } 123 | 124 | /** 125 | * Checks for Unicode code points that are not valid in a document. 126 | * 127 | * @param string $data 128 | * A string to analyze. 129 | * @return array An array of (string) error messages produced by the scanning. 130 | */ 131 | public static function checkForIllegalCodepoints($data) 132 | { 133 | if (! function_exists('preg_match_all')) { 134 | throw new \Exception('The PCRE library is not loaded or is not available.'); 135 | } 136 | 137 | // Vestigial error handling. 
138 | $errors = array(); 139 | 140 | /* 141 | * All U+0000 null characters in the input must be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of such characters is a parse error. 142 | */ 143 | for ($i = 0, $count = substr_count($data, "\0"); $i < $count; $i ++) { 144 | $errors[] = 'null-character'; 145 | } 146 | 147 | /* 148 | * Any occurrences of any characters in the ranges U+0001 to U+0008, U+000B, U+000E to U+001F, U+007F to U+009F, U+D800 to U+DFFF , U+FDD0 to U+FDEF, and characters U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE, U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE, U+FFFFF, U+10FFFE, and U+10FFFF are parse errors. (These are all control characters or permanently undefined Unicode characters.) 149 | */ 150 | // Check PCRE is loaded. 151 | $count = preg_match_all( 152 | '/(?: 153 | [\x01-\x08\x0B\x0E-\x1F\x7F] # U+0001 to U+0008, U+000B, U+000E to U+001F and U+007F 154 | | 155 | \xC2[\x80-\x9F] # U+0080 to U+009F 156 | | 157 | \xED(?:\xA0[\x80-\xFF]|[\xA1-\xBE][\x00-\xFF]|\xBF[\x00-\xBF]) # U+D800 to U+DFFFF 158 | | 159 | \xEF\xB7[\x90-\xAF] # U+FDD0 to U+FDEF 160 | | 161 | \xEF\xBF[\xBE\xBF] # U+FFFE and U+FFFF 162 | | 163 | [\xF0-\xF4][\x8F-\xBF]\xBF[\xBE\xBF] # U+nFFFE and U+nFFFF (1 <= n <= 10_{16}) 164 | )/x', $data, $matches); 165 | for ($i = 0; $i < $count; $i ++) { 166 | $errors[] = 'invalid-codepoint'; 167 | } 168 | 169 | return $errors; 170 | } 171 | } 172 | -------------------------------------------------------------------------------- /libraries/html5php/HTML5/Serializer/README.md: -------------------------------------------------------------------------------- 1 | # The Serializer (Writer) Model 2 | 3 | The serializer roughly follows sections _8.1 Writing HTML documents_ and section 4 | _8.3 Serializing HTML fragments_ by converting DOMDocument, 
DOMDocumentFragment, 5 | and DOMNodeList into HTML5. 6 | 7 | [ HTML5 ] // Interface for saving. 8 | || 9 | [ Traverser ] // Walk the DOM 10 | || 11 | [ Rules ] // Convert DOM elements into strings. 12 | || 13 | [ HTML5 ] // HTML5 document or fragment in text. 14 | 15 | 16 | ## HTML5 Class 17 | 18 | Provides the top level interface for saving. 19 | 20 | ## The Traverser 21 | 22 | Walks the DOM finding each element and passing it off to the output rules to 23 | convert to HTML5. 24 | 25 | ## Output Rules 26 | 27 | The output rules are defined in the RulesInterface which can have multiple 28 | implementations. Currently, the OutputRules is the default implementation that 29 | converts a DOM as is into HTML5. 30 | 31 | ## HTML5 String 32 | 33 | The output of the process is HTML5 as a string or saved to a file. -------------------------------------------------------------------------------- /libraries/html5php/HTML5/Serializer/RulesInterface.php: -------------------------------------------------------------------------------- 1 | 'html', 21 | 'http://www.w3.org/1998/Math/MathML' => 'math', 22 | 'http://www.w3.org/2000/svg' => 'svg' 23 | ); 24 | 25 | protected $dom; 26 | 27 | protected $options; 28 | 29 | protected $encode = false; 30 | 31 | protected $rules; 32 | 33 | protected $out; 34 | 35 | /** 36 | * Create a traverser. 37 | * 38 | * @param DOMNode|DOMNodeList $dom 39 | * The document or node to traverse. 40 | * @param resource $out 41 | * A stream that allows writing. The traverser will output into this 42 | * stream. 43 | * @param array $options 44 | * An array of options for the traverser as key/value pairs. These include: 45 | * - encode_entities: A bool to specify if full encoding should happen for all named 46 | * character references. Defaults to false which escapes &'<>". 47 | * - output_rules: The path to the class handling the output rules. 
48 | */ 49 | public function __construct($dom, $out, RulesInterface $rules, $options = array()) 50 | { 51 | $this->dom = $dom; 52 | $this->out = $out; 53 | $this->rules = $rules; 54 | $this->options = $options; 55 | 56 | $this->rules->setTraverser($this); 57 | } 58 | 59 | /** 60 | * Tell the traverser to walk the DOM. 61 | * 62 | * @return resource $out 63 | * Returns the output stream. 64 | */ 65 | public function walk() 66 | { 67 | if ($this->dom instanceof \DOMDocument) { 68 | $this->rules->document($this->dom); 69 | } elseif ($this->dom instanceof \DOMDocumentFragment) { 70 | // Document fragments are a special case. Only the children need to 71 | // be serialized. 72 | if ($this->dom->hasChildNodes()) { 73 | $this->children($this->dom->childNodes); 74 | } 75 | } // If NodeList, loop 76 | elseif ($this->dom instanceof \DOMNodeList) { 77 | // If this is a NodeList of DOMDocuments this will not work. 78 | $this->children($this->dom); 79 | } // Else assume this is a DOMNode-like datastructure. 80 | else { 81 | $this->node($this->dom); 82 | } 83 | 84 | return $this->out; 85 | } 86 | 87 | /** 88 | * Process a node in the DOM. 89 | * 90 | * @param mixed $node 91 | * A node implementing \DOMNode. 92 | */ 93 | public function node($node) 94 | { 95 | // A listing of types is at http://php.net/manual/en/dom.constants.php 96 | switch ($node->nodeType) { 97 | case XML_ELEMENT_NODE: 98 | $this->rules->element($node); 99 | break; 100 | case XML_TEXT_NODE: 101 | $this->rules->text($node); 102 | break; 103 | case XML_CDATA_SECTION_NODE: 104 | $this->rules->cdata($node); 105 | break; 106 | case XML_PI_NODE: 107 | $this->rules->processorInstruction($node); 108 | break; 109 | case XML_COMMENT_NODE: 110 | $this->rules->comment($node); 111 | break; 112 | // Currently we don't support embedding DTDs. 113 | default: 114 | //print ''; 115 | break; 116 | } 117 | } 118 | 119 | /** 120 | * Walk through all the nodes on a node list. 
121 | * 122 | * @param \DOMNodeList $nl 123 | * A list of child elements to walk through. 124 | */ 125 | public function children($nl) 126 | { 127 | foreach ($nl as $node) { 128 | $this->node($node); 129 | } 130 | } 131 | 132 | /** 133 | * Is an element local? 134 | * 135 | * @param mixed $ele 136 | * An element that implement \DOMNode. 137 | * 138 | * @return bool True if local and false otherwise. 139 | */ 140 | public function isLocalElement($ele) 141 | { 142 | $uri = $ele->namespaceURI; 143 | if (empty($uri)) { 144 | return false; 145 | } 146 | 147 | return isset(static::$local_ns[$uri]); 148 | } 149 | } 150 | -------------------------------------------------------------------------------- /libraries/html5php/LICENSE.txt: -------------------------------------------------------------------------------- 1 | ## HTML5-PHP License 2 | 3 | Copyright (c) 2013 The Authors of HTML5-PHP 4 | 5 | Matt Butcher - mattbutcher@google.com 6 | Matt Farina - matt@mattfarina.com 7 | Asmir Mustafic - goetas@gmail.com 8 | 9 | Permission is hereby granted, free of charge, to any person obtaining a copy of 10 | this software and associated documentation files (the "Software"), to deal in 11 | the Software without restriction, including without limitation the rights to 12 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 13 | the Software, and to permit persons to whom the Software is furnished to do so, 14 | subject to the following conditions: 15 | 16 | The above copyright notice and this permission notice shall be included in all 17 | copies or substantial portions of the Software. 18 | 19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 21 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR 22 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 23 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 24 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 25 | 26 | ## HTML5Lib License 27 | 28 | Portions of this are based on html5lib's PHP version, which was a 29 | sub-project of html5lib. The following is the list of contributors from 30 | html5lib: 31 | 32 | html5lib: 33 | 34 | Copyright (c) 2006-2009 The Authors 35 | 36 | Contributors: 37 | James Graham - jg307@cam.ac.uk 38 | Anne van Kesteren - annevankesteren@gmail.com 39 | Lachlan Hunt - lachlan.hunt@lachy.id.au 40 | Matt McDonald - kanashii@kanashii.ca 41 | Sam Ruby - rubys@intertwingly.net 42 | Ian Hickson (Google) - ian@hixie.ch 43 | Thomas Broyer - t.broyer@ltgt.net 44 | Jacques Distler - distler@golem.ph.utexas.edu 45 | Henri Sivonen - hsivonen@iki.fi 46 | Adam Barth - abarth@webkit.org 47 | Eric Seidel - eric@webkit.org 48 | The Mozilla Foundation (contributions from Henri Sivonen since 2008) 49 | David Flanagan (Mozilla) - dflanagan@mozilla.com 50 | 51 | Permission is hereby granted, free of charge, to any person obtaining a copy of 52 | this software and associated documentation files (the "Software"), to deal in 53 | the Software without restriction, including without limitation the rights to 54 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 55 | the Software, and to permit persons to whom the Software is furnished to do so, 56 | subject to the following conditions: 57 | 58 | The above copyright notice and this permission notice shall be included in all 59 | copies or substantial portions of the Software. 60 | 61 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 62 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 63 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR 64 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 65 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 66 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 67 | -------------------------------------------------------------------------------- /libraries/html5php/RELEASE.md: -------------------------------------------------------------------------------- 1 | # Release Notes 2 | 3 | 2.3.0 (2017-09-04) 4 | 5 | - #129: image within inline svg breaks system (fixed by #133) 6 | - #131: ² does not work (fixed by #132) 7 | - #134: Improve tokenizer performance by 20% (alternative version of #130 thanks to @MichaelHeerklotz) 8 | - #135: Raw & in attributes 9 | 10 | 2.2.2 (2016-09-22) 11 | 12 | - #116: In XML mode, tags are case sensitive 13 | - #115: Fix PHP Notice in OutputRules 14 | - #112: fix parsing of options of an optgroup 15 | - #111: Adding test for the address tag 16 | 17 | 2.2.1 (2016-05-10) 18 | 19 | - #109: Fixed issue where address tag could be written without closing tag (thanks sylus) 20 | 21 | 2.2.0 (2016-04-11) 22 | 23 | - #105: Enable composer cache (for CI/CD) 24 | - #100: Use mb_substitute_character instead of ini_set for environments where ini_set is disabled (e.g., shared hosting) 25 | - #98: Allow link, meta, style tags in noscript tags 26 | - #96: Fixed xml:href on svgs that use the "use" breaking 27 | - #94: Counting UTF8 characters performance improvement 28 | - #93: Use newer version of coveralls package 29 | - #90: Remove duplicate test 30 | - #87: Allow multiple root nodes 31 | 32 | 2.1.2 (2015-06-07) 33 | - #82: Support for PHP7 34 | - #84: Improved boolean attribute handling 35 | 36 | 2.1.1 (2015-03-23) 37 | - #78: Fixes bug where unmatched entity-like string drops everything after &. 
38 | 39 | 2.1.0 (2015-02-01) 40 | - #74: Added `disable_html_ns` and `target_doc` dom parsing options 41 | - Unified option names 42 | - #73: Fixed alphabet, ß now can be detected 43 | - #75 and #76: Allow whitespace in RCDATA tags 44 | - #77: Fixed parsing blunder for json embeds 45 | - #72: Add options to HTML methods 46 | 47 | 2.0.2 (2014-12-17) 48 | - #50: empty document handling 49 | - #63: tags with strange capitalization 50 | - #65: dashes and underscores as allowed characters in tag names 51 | - #68: Fixed issue with non-inline elements inside inline containers 52 | 53 | 2.0.1 (2014-09-23) 54 | - #59: Fixed issue parsing some fragments. 55 | - #56: Incorrectly saw 0 as empty string 56 | - Sami as new documentation generator 57 | 58 | 2.0.0 (2014-07-28) 59 | - #53: Improved boolean attributes handling 60 | - #52: Facebook HHVM compatibility 61 | - #48: Adopted PSR-2 as coding standard 62 | - #47: Moved everything to Masterminds namespace 63 | - #45: Added custom namespaces 64 | - #44: Added support for XML-style namespaces 65 | - #37: Refactored HTML5 class removing static methods 66 | 67 | 1.0.5 (2014-06-10) 68 | - #38: Set the dev-master branch as the 1.0.x branch for composer (goetas) 69 | - #34: Tests use PSR-4 for autoloading. (goetas) 70 | - #40, #41: Fix entity handling in RCDATA sections. (KitaitiMakoto) 71 | - #32: Fixed issue where character references were being incorrectly encoded in style tags. 72 | 73 | 1.0.4 (2014-04-29) 74 | - #30/#31 Don't throw an exception for invalid tag names. 75 | 76 | 1.0.3 (2014-02-28) 77 | - #23 and #29: Ignore attributes with illegal chars in name for the PHP DOM. 78 | 79 | 1.0.2 (2014-02-12) 80 | - #23: Handle missing tag close in attribute list. 81 | - #25: Fixed text escaping in the serializer (HTML5 8.3). 82 | - #27: Fixed tests on Windows: changed "\n" -> PHP_EOL. 83 | - #28: Fixed infinite loop for char "&" in unquoted attribute in parser. 84 | - #26: Updated tag name case handling to deal with uppercase usage. 
85 | - #24: Newlines and tabs are allowed inside quoted attributes (HTML5 8.2.4). 86 | - Fixed Travis CI testing. 87 | 88 | 1.0.1 (2013-11-07) 89 | - CDATA encoding is improved. (Non-standard; Issue #19) 90 | - Some parser rules were not returning the new current element. (Issue #20) 91 | - Added, to the README, details on code test coverage and to packagist version. 92 | - Fixed processor instructions. 93 | - Improved test coverage and documentation coverage. 94 | 95 | 1.0.0 (2013-10-02) 96 | - Initial release. 97 | -------------------------------------------------------------------------------- /libraries/html5php/autoloader.php: -------------------------------------------------------------------------------- 1 | path = dirname(__FILE__); 16 | } 17 | 18 | /** 19 | * Autoloader 20 | * 21 | * @param string $class The name of the class to attempt to load. 22 | */ 23 | public function autoload($class) 24 | { 25 | // Only load the class if it starts with "HTML5" 26 | if (strpos($class, 'Masterminds\HTML5') !== 0) 27 | { 28 | return; 29 | } 30 | $class = substr($class, 12); 31 | //die($class); 32 | 33 | $filename = $this->path . DIRECTORY_SEPARATOR . str_replace('\\', DIRECTORY_SEPARATOR, $class) . 
'.php'; 34 | include $filename; 35 | } 36 | } -------------------------------------------------------------------------------- /libraries/humble-http-agent/HumbleHttpAgentDummy.php: -------------------------------------------------------------------------------- 1 | body = $body; 33 | if (isset($headers)) $this->headers = $headers; 34 | } 35 | 36 | public function rewriteHashbangFragment($url) { 37 | return $url; 38 | } 39 | 40 | public function getRedirectURLfromHTML($url, $html) { 41 | return false; 42 | } 43 | 44 | public function getMetaRefreshURL($url, $html) { 45 | return false; 46 | } 47 | 48 | public function getUglyURL($url, $html) { 49 | return false; 50 | } 51 | 52 | public function removeFragment($url) { 53 | return $url; 54 | } 55 | 56 | public function rewriteUrls($url) { 57 | return $url; 58 | } 59 | 60 | public function enableDebug($bool=true) { 61 | return; 62 | } 63 | 64 | public function minimiseMemoryUse($bool = true) { 65 | return; 66 | } 67 | 68 | public function setMaxParallelRequests($max) { 69 | return; 70 | } 71 | 72 | public function validateUrl($url) { 73 | $url = filter_var($url, FILTER_SANITIZE_URL); 74 | $test = filter_var($url, FILTER_VALIDATE_URL); 75 | // deal with bug http://bugs.php.net/51192 (present in PHP 5.2.13 and PHP 5.3.2) 76 | if ($test === false) { 77 | $test = filter_var(strtr($url, '-', '_'), FILTER_VALIDATE_URL); 78 | } 79 | if ($test !== false && $test !== null && preg_match('!^https?://!', $url)) { 80 | return $url; 81 | } else { 82 | return false; 83 | } 84 | } 85 | 86 | public function fetchAll(array $urls) { 87 | return; 88 | } 89 | 90 | // fetch all URLs without following redirects 91 | public function fetchAllOnce(array $urls, $isRedirect=false) { 92 | return; 93 | } 94 | 95 | public function get($url, $remove=false, $gzdecode=true) { 96 | return array( 97 | 'body' => $this->body, 98 | 'headers' => $this->headers, 99 | 'status_code' => 200, 100 | 'effective_url' => $url 101 | ); 102 | } 103 | 104 | public 
function parallelSupport() { 105 | return false; 106 | } 107 | } -------------------------------------------------------------------------------- /libraries/humble-http-agent/SimplePie_HumbleHttpAgent.php: -------------------------------------------------------------------------------- 1 | encode($parsed['authority']), $parsed['path'], $parsed['query'], $parsed['fragment']); 37 | } 38 | $this->url = $url; 39 | $this->useragent = $useragent; 40 | if (preg_match('/^http(s)?:\/\//i', $url)) 41 | { 42 | if (!is_array($headers)) 43 | { 44 | $headers = array(); 45 | } 46 | $this->method = SIMPLEPIE_FILE_SOURCE_REMOTE | SIMPLEPIE_FILE_SOURCE_CURL; 47 | $headers2 = array(); 48 | foreach ($headers as $key => $value) { 49 | $headers2[] = "$key: $value"; 50 | } 51 | //TODO: allow for HTTP headers 52 | // curl_setopt($fp, CURLOPT_HTTPHEADER, $headers2); 53 | 54 | $response = self::$agent->get($url); 55 | 56 | if ($response === false || !isset($response['status_code'])) { 57 | $this->error = 'failed to fetch URL'; 58 | $this->success = false; 59 | } else { 60 | // The extra lines at the end are there to satisfy SimplePie's HTTP parser. 61 | // The class expects a full HTTP message, whereas we're giving it only 62 | // headers - the new lines indicate the start of the body. 
63 | $parser = new SimplePie_HTTP_Parser($response['headers']."\r\n\r\n"); 64 | if ($parser->parse()) { 65 | $this->headers = $parser->headers; 66 | //$this->body = $parser->body; 67 | $this->body = $response['body']; 68 | $this->status_code = $parser->status_code; 69 | } 70 | } 71 | } 72 | else 73 | { 74 | $this->error = 'invalid URL'; 75 | $this->success = false; 76 | } 77 | } 78 | } -------------------------------------------------------------------------------- /libraries/language-detect/LanguageDetect/Exception.php: -------------------------------------------------------------------------------- 1 | 10 | * @license BSD http://www.opensource.org/licenses/bsd-license.php 11 | * @link http://pear.php.net/package/Text_LanguageDetect/ 12 | */ 13 | 14 | class Text_LanguageDetect_Exception extends Exception 15 | { 16 | /** 17 | * Database file could not be found 18 | */ 19 | const DB_NOT_FOUND = 10; 20 | 21 | /** 22 | * Database file found, but not readable 23 | */ 24 | const DB_NOT_READABLE = 11; 25 | 26 | /** 27 | * Database file is empty 28 | */ 29 | const DB_EMPTY = 12; 30 | 31 | /** 32 | * Database contents is not a PHP array 33 | */ 34 | const DB_NOT_ARRAY = 13; 35 | 36 | /** 37 | * Magic quotes are activated 38 | */ 39 | const MAGIC_QUOTES = 14; 40 | 41 | 42 | /** 43 | * Parameter of invalid type passed to method 44 | */ 45 | const PARAM_TYPE = 20; 46 | 47 | /** 48 | * Character in parameter is invalid 49 | */ 50 | const INVALID_CHAR = 21; 51 | 52 | 53 | /** 54 | * Language is not in the database 55 | */ 56 | const UNKNOWN_LANGUAGE = 30; 57 | 58 | 59 | /** 60 | * Error during block detection 61 | */ 62 | const BLOCK_DETECTION = 40; 63 | 64 | 65 | /** 66 | * Error while clustering languages 67 | */ 68 | const NO_HIGHEST_KEY = 50; 69 | } 70 | -------------------------------------------------------------------------------- /libraries/language-detect/unicode_blocks.dat: -------------------------------------------------------------------------------- 1 | 
a:145:{i:0;a:3:{i:0;i:0;i:1;i:127;i:2;s:11:"Basic Latin";}i:1;a:3:{i:0;i:128;i:1;i:255;i:2;s:18:"Latin-1 Supplement";}i:2;a:3:{i:0;i:256;i:1;i:383;i:2;s:16:"Latin Extended-A";}i:3;a:3:{i:0;i:384;i:1;i:591;i:2;s:16:"Latin Extended-B";}i:4;a:3:{i:0;i:592;i:1;i:687;i:2;s:14:"IPA Extensions";}i:5;a:3:{i:0;i:688;i:1;i:767;i:2;s:24:"Spacing Modifier Letters";}i:6;a:3:{i:0;i:768;i:1;i:879;i:2;s:27:"Combining Diacritical Marks";}i:7;a:3:{i:0;i:880;i:1;i:1023;i:2;s:16:"Greek and Coptic";}i:8;a:3:{i:0;i:1024;i:1;i:1279;i:2;s:8:"Cyrillic";}i:9;a:3:{i:0;i:1280;i:1;i:1327;i:2;s:19:"Cyrillic Supplement";}i:10;a:3:{i:0;i:1328;i:1;i:1423;i:2;s:8:"Armenian";}i:11;a:3:{i:0;i:1424;i:1;i:1535;i:2;s:6:"Hebrew";}i:12;a:3:{i:0;i:1536;i:1;i:1791;i:2;s:6:"Arabic";}i:13;a:3:{i:0;i:1792;i:1;i:1871;i:2;s:6:"Syriac";}i:14;a:3:{i:0;i:1872;i:1;i:1919;i:2;s:17:"Arabic Supplement";}i:15;a:3:{i:0;i:1920;i:1;i:1983;i:2;s:6:"Thaana";}i:16;a:3:{i:0;i:2304;i:1;i:2431;i:2;s:10:"Devanagari";}i:17;a:3:{i:0;i:2432;i:1;i:2559;i:2;s:7:"Bengali";}i:18;a:3:{i:0;i:2560;i:1;i:2687;i:2;s:8:"Gurmukhi";}i:19;a:3:{i:0;i:2688;i:1;i:2815;i:2;s:8:"Gujarati";}i:20;a:3:{i:0;i:2816;i:1;i:2943;i:2;s:5:"Oriya";}i:21;a:3:{i:0;i:2944;i:1;i:3071;i:2;s:5:"Tamil";}i:22;a:3:{i:0;i:3072;i:1;i:3199;i:2;s:6:"Telugu";}i:23;a:3:{i:0;i:3200;i:1;i:3327;i:2;s:7:"Kannada";}i:24;a:3:{i:0;i:3328;i:1;i:3455;i:2;s:9:"Malayalam";}i:25;a:3:{i:0;i:3456;i:1;i:3583;i:2;s:7:"Sinhala";}i:26;a:3:{i:0;i:3584;i:1;i:3711;i:2;s:4:"Thai";}i:27;a:3:{i:0;i:3712;i:1;i:3839;i:2;s:3:"Lao";}i:28;a:3:{i:0;i:3840;i:1;i:4095;i:2;s:7:"Tibetan";}i:29;a:3:{i:0;i:4096;i:1;i:4255;i:2;s:7:"Myanmar";}i:30;a:3:{i:0;i:4256;i:1;i:4351;i:2;s:8:"Georgian";}i:31;a:3:{i:0;i:4352;i:1;i:4607;i:2;s:11:"Hangul Jamo";}i:32;a:3:{i:0;i:4608;i:1;i:4991;i:2;s:8:"Ethiopic";}i:33;a:3:{i:0;i:4992;i:1;i:5023;i:2;s:19:"Ethiopic Supplement";}i:34;a:3:{i:0;i:5024;i:1;i:5119;i:2;s:8:"Cherokee";}i:35;a:3:{i:0;i:5120;i:1;i:5759;i:2;s:37:"Unified Canadian Aboriginal 
Syllabics";}i:36;a:3:{i:0;i:5760;i:1;i:5791;i:2;s:5:"Ogham";}i:37;a:3:{i:0;i:5792;i:1;i:5887;i:2;s:5:"Runic";}i:38;a:3:{i:0;i:5888;i:1;i:5919;i:2;s:7:"Tagalog";}i:39;a:3:{i:0;i:5920;i:1;i:5951;i:2;s:7:"Hanunoo";}i:40;a:3:{i:0;i:5952;i:1;i:5983;i:2;s:5:"Buhid";}i:41;a:3:{i:0;i:5984;i:1;i:6015;i:2;s:8:"Tagbanwa";}i:42;a:3:{i:0;i:6016;i:1;i:6143;i:2;s:5:"Khmer";}i:43;a:3:{i:0;i:6144;i:1;i:6319;i:2;s:9:"Mongolian";}i:44;a:3:{i:0;i:6400;i:1;i:6479;i:2;s:5:"Limbu";}i:45;a:3:{i:0;i:6480;i:1;i:6527;i:2;s:6:"Tai Le";}i:46;a:3:{i:0;i:6528;i:1;i:6623;i:2;s:11:"New Tai Lue";}i:47;a:3:{i:0;i:6624;i:1;i:6655;i:2;s:13:"Khmer Symbols";}i:48;a:3:{i:0;i:6656;i:1;i:6687;i:2;s:8:"Buginese";}i:49;a:3:{i:0;i:7424;i:1;i:7551;i:2;s:19:"Phonetic Extensions";}i:50;a:3:{i:0;i:7552;i:1;i:7615;i:2;s:30:"Phonetic Extensions Supplement";}i:51;a:3:{i:0;i:7616;i:1;i:7679;i:2;s:38:"Combining Diacritical Marks Supplement";}i:52;a:3:{i:0;i:7680;i:1;i:7935;i:2;s:25:"Latin Extended Additional";}i:53;a:3:{i:0;i:7936;i:1;i:8191;i:2;s:14:"Greek Extended";}i:54;a:3:{i:0;i:8192;i:1;i:8303;i:2;s:19:"General Punctuation";}i:55;a:3:{i:0;i:8304;i:1;i:8351;i:2;s:27:"Superscripts and Subscripts";}i:56;a:3:{i:0;i:8352;i:1;i:8399;i:2;s:16:"Currency Symbols";}i:57;a:3:{i:0;i:8400;i:1;i:8447;i:2;s:39:"Combining Diacritical Marks for Symbols";}i:58;a:3:{i:0;i:8448;i:1;i:8527;i:2;s:18:"Letterlike Symbols";}i:59;a:3:{i:0;i:8528;i:1;i:8591;i:2;s:12:"Number Forms";}i:60;a:3:{i:0;i:8592;i:1;i:8703;i:2;s:6:"Arrows";}i:61;a:3:{i:0;i:8704;i:1;i:8959;i:2;s:22:"Mathematical Operators";}i:62;a:3:{i:0;i:8960;i:1;i:9215;i:2;s:23:"Miscellaneous Technical";}i:63;a:3:{i:0;i:9216;i:1;i:9279;i:2;s:16:"Control Pictures";}i:64;a:3:{i:0;i:9280;i:1;i:9311;i:2;s:29:"Optical Character Recognition";}i:65;a:3:{i:0;i:9312;i:1;i:9471;i:2;s:22:"Enclosed Alphanumerics";}i:66;a:3:{i:0;i:9472;i:1;i:9599;i:2;s:11:"Box Drawing";}i:67;a:3:{i:0;i:9600;i:1;i:9631;i:2;s:14:"Block Elements";}i:68;a:3:{i:0;i:9632;i:1;i:9727;i:2;s:16:"Geometric 
Shapes";}i:69;a:3:{i:0;i:9728;i:1;i:9983;i:2;s:21:"Miscellaneous Symbols";}i:70;a:3:{i:0;i:9984;i:1;i:10175;i:2;s:8:"Dingbats";}i:71;a:3:{i:0;i:10176;i:1;i:10223;i:2;s:36:"Miscellaneous Mathematical Symbols-A";}i:72;a:3:{i:0;i:10224;i:1;i:10239;i:2;s:21:"Supplemental Arrows-A";}i:73;a:3:{i:0;i:10240;i:1;i:10495;i:2;s:16:"Braille Patterns";}i:74;a:3:{i:0;i:10496;i:1;i:10623;i:2;s:21:"Supplemental Arrows-B";}i:75;a:3:{i:0;i:10624;i:1;i:10751;i:2;s:36:"Miscellaneous Mathematical Symbols-B";}i:76;a:3:{i:0;i:10752;i:1;i:11007;i:2;s:35:"Supplemental Mathematical Operators";}i:77;a:3:{i:0;i:11008;i:1;i:11263;i:2;s:32:"Miscellaneous Symbols and Arrows";}i:78;a:3:{i:0;i:11264;i:1;i:11359;i:2;s:10:"Glagolitic";}i:79;a:3:{i:0;i:11392;i:1;i:11519;i:2;s:6:"Coptic";}i:80;a:3:{i:0;i:11520;i:1;i:11567;i:2;s:19:"Georgian Supplement";}i:81;a:3:{i:0;i:11568;i:1;i:11647;i:2;s:8:"Tifinagh";}i:82;a:3:{i:0;i:11648;i:1;i:11743;i:2;s:17:"Ethiopic Extended";}i:83;a:3:{i:0;i:11776;i:1;i:11903;i:2;s:24:"Supplemental Punctuation";}i:84;a:3:{i:0;i:11904;i:1;i:12031;i:2;s:23:"CJK Radicals Supplement";}i:85;a:3:{i:0;i:12032;i:1;i:12255;i:2;s:15:"Kangxi Radicals";}i:86;a:3:{i:0;i:12272;i:1;i:12287;i:2;s:34:"Ideographic Description Characters";}i:87;a:3:{i:0;i:12288;i:1;i:12351;i:2;s:27:"CJK Symbols and Punctuation";}i:88;a:3:{i:0;i:12352;i:1;i:12447;i:2;s:8:"Hiragana";}i:89;a:3:{i:0;i:12448;i:1;i:12543;i:2;s:8:"Katakana";}i:90;a:3:{i:0;i:12544;i:1;i:12591;i:2;s:8:"Bopomofo";}i:91;a:3:{i:0;i:12592;i:1;i:12687;i:2;s:25:"Hangul Compatibility Jamo";}i:92;a:3:{i:0;i:12688;i:1;i:12703;i:2;s:6:"Kanbun";}i:93;a:3:{i:0;i:12704;i:1;i:12735;i:2;s:17:"Bopomofo Extended";}i:94;a:3:{i:0;i:12736;i:1;i:12783;i:2;s:11:"CJK Strokes";}i:95;a:3:{i:0;i:12784;i:1;i:12799;i:2;s:28:"Katakana Phonetic Extensions";}i:96;a:3:{i:0;i:12800;i:1;i:13055;i:2;s:31:"Enclosed CJK Letters and Months";}i:97;a:3:{i:0;i:13056;i:1;i:13311;i:2;s:17:"CJK Compatibility";}i:98;a:3:{i:0;i:13312;i:1;i:19903;i:2;s:34:"CJK Unified Ideographs 
Extension A";}i:99;a:3:{i:0;i:19904;i:1;i:19967;i:2;s:23:"Yijing Hexagram Symbols";}i:100;a:3:{i:0;i:19968;i:1;i:40959;i:2;s:22:"CJK Unified Ideographs";}i:101;a:3:{i:0;i:40960;i:1;i:42127;i:2;s:12:"Yi Syllables";}i:102;a:3:{i:0;i:42128;i:1;i:42191;i:2;s:11:"Yi Radicals";}i:103;a:3:{i:0;i:42752;i:1;i:42783;i:2;s:21:"Modifier Tone Letters";}i:104;a:3:{i:0;i:43008;i:1;i:43055;i:2;s:12:"Syloti Nagri";}i:105;a:3:{i:0;i:44032;i:1;i:55215;i:2;s:16:"Hangul Syllables";}i:106;a:3:{i:0;i:55296;i:1;i:56191;i:2;s:15:"High Surrogates";}i:107;a:3:{i:0;i:56192;i:1;i:56319;i:2;s:27:"High Private Use Surrogates";}i:108;a:3:{i:0;i:56320;i:1;i:57343;i:2;s:14:"Low Surrogates";}i:109;a:3:{i:0;i:57344;i:1;i:63743;i:2;s:16:"Private Use Area";}i:110;a:3:{i:0;i:63744;i:1;i:64255;i:2;s:28:"CJK Compatibility Ideographs";}i:111;a:3:{i:0;i:64256;i:1;i:64335;i:2;s:29:"Alphabetic Presentation Forms";}i:112;a:3:{i:0;i:64336;i:1;i:65023;i:2;s:27:"Arabic Presentation Forms-A";}i:113;a:3:{i:0;i:65024;i:1;i:65039;i:2;s:19:"Variation Selectors";}i:114;a:3:{i:0;i:65040;i:1;i:65055;i:2;s:14:"Vertical Forms";}i:115;a:3:{i:0;i:65056;i:1;i:65071;i:2;s:20:"Combining Half Marks";}i:116;a:3:{i:0;i:65072;i:1;i:65103;i:2;s:23:"CJK Compatibility Forms";}i:117;a:3:{i:0;i:65104;i:1;i:65135;i:2;s:19:"Small Form Variants";}i:118;a:3:{i:0;i:65136;i:1;i:65279;i:2;s:27:"Arabic Presentation Forms-B";}i:119;a:3:{i:0;i:65280;i:1;i:65519;i:2;s:29:"Halfwidth and Fullwidth Forms";}i:120;a:3:{i:0;i:65520;i:1;i:65535;i:2;s:8:"Specials";}i:121;a:3:{i:0;i:65536;i:1;i:65663;i:2;s:18:"Linear B Syllabary";}i:122;a:3:{i:0;i:65664;i:1;i:65791;i:2;s:18:"Linear B Ideograms";}i:123;a:3:{i:0;i:65792;i:1;i:65855;i:2;s:14:"Aegean Numbers";}i:124;a:3:{i:0;i:65856;i:1;i:65935;i:2;s:21:"Ancient Greek Numbers";}i:125;a:3:{i:0;i:66304;i:1;i:66351;i:2;s:10:"Old Italic";}i:126;a:3:{i:0;i:66352;i:1;i:66383;i:2;s:6:"Gothic";}i:127;a:3:{i:0;i:66432;i:1;i:66463;i:2;s:8:"Ugaritic";}i:128;a:3:{i:0;i:66464;i:1;i:66527;i:2;s:11:"Old 
Persian";}i:129;a:3:{i:0;i:66560;i:1;i:66639;i:2;s:7:"Deseret";}i:130;a:3:{i:0;i:66640;i:1;i:66687;i:2;s:7:"Shavian";}i:131;a:3:{i:0;i:66688;i:1;i:66735;i:2;s:7:"Osmanya";}i:132;a:3:{i:0;i:67584;i:1;i:67647;i:2;s:17:"Cypriot Syllabary";}i:133;a:3:{i:0;i:68096;i:1;i:68191;i:2;s:10:"Kharoshthi";}i:134;a:3:{i:0;i:118784;i:1;i:119039;i:2;s:25:"Byzantine Musical Symbols";}i:135;a:3:{i:0;i:119040;i:1;i:119295;i:2;s:15:"Musical Symbols";}i:136;a:3:{i:0;i:119296;i:1;i:119375;i:2;s:30:"Ancient Greek Musical Notation";}i:137;a:3:{i:0;i:119552;i:1;i:119647;i:2;s:21:"Tai Xuan Jing Symbols";}i:138;a:3:{i:0;i:119808;i:1;i:120831;i:2;s:33:"Mathematical Alphanumeric Symbols";}i:139;a:3:{i:0;i:131072;i:1;i:173791;i:2;s:34:"CJK Unified Ideographs Extension B";}i:140;a:3:{i:0;i:194560;i:1;i:195103;i:2;s:39:"CJK Compatibility Ideographs Supplement";}i:141;a:3:{i:0;i:917504;i:1;i:917631;i:2;s:4:"Tags";}i:142;a:3:{i:0;i:917760;i:1;i:917999;i:2;s:30:"Variation Selectors Supplement";}i:143;a:3:{i:0;i:983040;i:1;i:1048575;i:2;s:32:"Supplementary Private Use Area-A";}i:144;a:3:{i:0;i:1048576;i:1;i:1114111;i:2;s:32:"Supplementary Private Use Area-B";}} -------------------------------------------------------------------------------- /libraries/readability/JSLikeHTMLElement.php: -------------------------------------------------------------------------------- 1 | registerNodeClass('DOMElement', 'JSLikeHTMLElement'); 16 | * $doc->loadHTML('

<div><p>Para 1</p><p>Para 2</p></div>

'); 17 | * $elem = $doc->getElementsByTagName('div')->item(0); 18 | * 19 | * // print innerHTML 20 | * echo $elem->innerHTML; // prints '

<p>Para 1</p><p>Para 2</p>

' 21 | * echo "\n\n"; 22 | * 23 | * // set innerHTML 24 | * $elem->innerHTML = 'FiveFilters.org'; 25 | * echo $elem->innerHTML; // prints 'FiveFilters.org' 26 | * echo "\n\n"; 27 | * 28 | * // print document (with our changes) 29 | * echo $doc->saveXML(); 30 | * @endcode 31 | * 32 | * @author Keyvan Minoukadeh - http://www.keyvan.net - keyvan@keyvan.net 33 | * @see http://fivefilters.org (the project this was written for) 34 | */ 35 | class JSLikeHTMLElement extends DOMElement 36 | { 37 | /** 38 | * Used for setting innerHTML like it's done in JavaScript: 39 | * @code 40 | * $div->innerHTML = '

<h2>Chapter 2</h2><p>The story begins...</p>

'; 41 | * @endcode 42 | */ 43 | public function __set($name, $value) { 44 | if ($name == 'innerHTML') { 45 | // first, empty the element 46 | for ($x=$this->childNodes->length-1; $x>=0; $x--) { 47 | $this->removeChild($this->childNodes->item($x)); 48 | } 49 | // $value holds our new inner HTML 50 | if ($value != '') { 51 | $f = $this->ownerDocument->createDocumentFragment(); 52 | // appendXML() expects well-formed markup (XHTML) 53 | $result = @$f->appendXML($value); // @ to suppress PHP warnings 54 | if ($result) { 55 | if ($f->hasChildNodes()) $this->appendChild($f); 56 | } else { 57 | // $value is probably ill-formed 58 | $f = new DOMDocument(); 59 | $value = mb_convert_encoding($value, 'HTML-ENTITIES', 'UTF-8'); 60 | // Using <htmlfragment> will generate a warning, but so will bad HTML 61 | // (and by this point, bad HTML is what we've got). 62 | // We use it (and suppress the warning) because an HTML fragment will 63 | // be wrapped around <html><body> tags which we don't really want to keep. 64 | // Note: despite the warning, if loadHTML succeeds it will return true. 65 | $result = @$f->loadHTML('<htmlfragment>'.$value.'</htmlfragment>'); 66 | if ($result) { 67 | $import = $f->getElementsByTagName('htmlfragment')->item(0); 68 | foreach ($import->childNodes as $child) { 69 | $importedNode = $this->ownerDocument->importNode($child, true); 70 | $this->appendChild($importedNode); 71 | } 72 | } else { 73 | // oh well, we tried, we really did. :( 74 | // this element is now empty 75 | } 76 | } 77 | } 78 | } else { 79 | $trace = debug_backtrace(); 80 | trigger_error('Undefined property via __set(): '.$name.' in '.$trace[0]['file'].' 
on line '.$trace[0]['line'], E_USER_NOTICE); 81 | } 82 | } 83 | 84 | /** 85 | * Used for getting innerHTML like it's done in JavaScript: 86 | * @code 87 | * $string = $div->innerHTML; 88 | * @endcode 89 | */ 90 | public function __get($name) 91 | { 92 | if ($name == 'innerHTML') { 93 | $inner = ''; 94 | foreach ($this->childNodes as $child) { 95 | $inner .= $this->ownerDocument->saveXML($child); 96 | } 97 | return $inner; 98 | } 99 | 100 | $trace = debug_backtrace(); 101 | trigger_error('Undefined property via __get(): '.$name.' in '.$trace[0]['file'].' on line '.$trace[0]['line'], E_USER_NOTICE); 102 | return null; 103 | } 104 | 105 | public function __toString() 106 | { 107 | return '['.$this->tagName.']'; 108 | } 109 | } 110 | ?> -------------------------------------------------------------------------------- /libraries/simplepie/LICENSE.txt: -------------------------------------------------------------------------------- 1 | Copyright (c) 2004-2007, Ryan Parman and Geoffrey Sneddon. 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without modification, are 5 | permitted provided that the following conditions are met: 6 | 7 | * Redistributions of source code must retain the above copyright notice, this list of 8 | conditions and the following disclaimer. 9 | 10 | * Redistributions in binary form must reproduce the above copyright notice, this list 11 | of conditions and the following disclaimer in the documentation and/or other materials 12 | provided with the distribution. 13 | 14 | * Neither the name of the SimplePie Team nor the names of its contributors may be used 15 | to endorse or promote products derived from this software without specific prior 16 | written permission. 
17 | 18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS 19 | OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY 20 | AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS 21 | AND CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 22 | CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 23 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 24 | THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR 25 | OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 26 | POSSIBILITY OF SUCH DAMAGE. -------------------------------------------------------------------------------- /libraries/simplepie/autoloader.php: -------------------------------------------------------------------------------- 1 | path = dirname(__FILE__) . DIRECTORY_SEPARATOR . 'library'; 67 | } 68 | 69 | /** 70 | * Autoloader 71 | * 72 | * @param string $class The name of the class to attempt to load. 73 | */ 74 | public function autoload($class) 75 | { 76 | // Only load the class if it starts with "SimplePie" 77 | if (strpos($class, 'SimplePie') !== 0) 78 | { 79 | return; 80 | } 81 | 82 | $filename = $this->path . DIRECTORY_SEPARATOR . str_replace('_', DIRECTORY_SEPARATOR, $class) . 
'.php'; 83 | include $filename; 84 | } 85 | } -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Author.php: -------------------------------------------------------------------------------- 1 | name = $name; 90 | $this->link = $link; 91 | $this->email = $email; 92 | } 93 | 94 | /** 95 | * String-ified version 96 | * 97 | * @return string 98 | */ 99 | public function __toString() 100 | { 101 | // There is no $this->data here 102 | return md5(serialize($this)); 103 | } 104 | 105 | /** 106 | * Author's name 107 | * 108 | * @return string|null 109 | */ 110 | public function get_name() 111 | { 112 | if ($this->name !== null) 113 | { 114 | return $this->name; 115 | } 116 | else 117 | { 118 | return null; 119 | } 120 | } 121 | 122 | /** 123 | * Author's link 124 | * 125 | * @return string|null 126 | */ 127 | public function get_link() 128 | { 129 | if ($this->link !== null) 130 | { 131 | return $this->link; 132 | } 133 | else 134 | { 135 | return null; 136 | } 137 | } 138 | 139 | /** 140 | * Author's email address 141 | * 142 | * @return string|null 143 | */ 144 | public function get_email() 145 | { 146 | if ($this->email !== null) 147 | { 148 | return $this->email; 149 | } 150 | else 151 | { 152 | return null; 153 | } 154 | } 155 | } 156 | 157 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Cache.php: -------------------------------------------------------------------------------- 1 | 'SimplePie_Cache_MySQL', 65 | 'memcache' => 'SimplePie_Cache_Memcache', 66 | 'memcached' => 'SimplePie_Cache_Memcached', 67 | 'redis' => 'SimplePie_Cache_Redis' 68 | ); 69 | 70 | /** 71 | * Don't call the constructor. Please. 
72 | */ 73 | private function __construct() { } 74 | 75 | /** 76 | * Create a new SimplePie_Cache object 77 | * 78 | * @param string $location URL location (scheme is used to determine handler) 79 | * @param string $filename Unique identifier for cache object 80 | * @param string $extension 'spi' or 'spc' 81 | * @return SimplePie_Cache_Base Type of object depends on scheme of `$location` 82 | */ 83 | public static function get_handler($location, $filename, $extension) 84 | { 85 | $type = explode(':', $location, 2); 86 | $type = $type[0]; 87 | if (!empty(self::$handlers[$type])) 88 | { 89 | $class = self::$handlers[$type]; 90 | return new $class($location, $filename, $extension); 91 | } 92 | 93 | return new SimplePie_Cache_File($location, $filename, $extension); 94 | } 95 | 96 | /** 97 | * Create a new SimplePie_Cache object 98 | * 99 | * @deprecated Use {@see get_handler} instead 100 | */ 101 | public function create($location, $filename, $extension) 102 | { 103 | trigger_error('Cache::create() has been replaced with Cache::get_handler(). Switch to the registry system to use this.', E_USER_DEPRECATED); 104 | return self::get_handler($location, $filename, $extension); 105 | } 106 | 107 | /** 108 | * Register a handler 109 | * 110 | * @param string $type DSN type to register for 111 | * @param string $class Name of handler class. 
Must implement SimplePie_Cache_Base 112 | */ 113 | public static function register($type, $class) 114 | { 115 | self::$handlers[$type] = $class; 116 | } 117 | 118 | /** 119 | * Parse a URL into an array 120 | * 121 | * @param string $url 122 | * @return array 123 | */ 124 | public static function parse_URL($url) 125 | { 126 | $params = parse_url($url); 127 | $params['extras'] = array(); 128 | if (isset($params['query'])) 129 | { 130 | parse_str($params['query'], $params['extras']); 131 | } 132 | return $params; 133 | } 134 | } 135 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Cache/Base.php: -------------------------------------------------------------------------------- 1 | get_items(); 63 | $items_by_id = array(); 64 | 65 | if (!empty($items)) 66 | { 67 | foreach ($items as $item) 68 | { 69 | $items_by_id[$item->get_id()] = $item; 70 | } 71 | 72 | if (count($items_by_id) !== count($items)) 73 | { 74 | $items_by_id = array(); 75 | foreach ($items as $item) 76 | { 77 | $items_by_id[$item->get_id(true)] = $item; 78 | } 79 | } 80 | 81 | if (isset($data->data['child'][SIMPLEPIE_NAMESPACE_ATOM_10]['feed'][0])) 82 | { 83 | $channel =& $data->data['child'][SIMPLEPIE_NAMESPACE_ATOM_10]['feed'][0]; 84 | } 85 | elseif (isset($data->data['child'][SIMPLEPIE_NAMESPACE_ATOM_03]['feed'][0])) 86 | { 87 | $channel =& $data->data['child'][SIMPLEPIE_NAMESPACE_ATOM_03]['feed'][0]; 88 | } 89 | elseif (isset($data->data['child'][SIMPLEPIE_NAMESPACE_RDF]['RDF'][0])) 90 | { 91 | $channel =& $data->data['child'][SIMPLEPIE_NAMESPACE_RDF]['RDF'][0]; 92 | } 93 | elseif (isset($data->data['child'][SIMPLEPIE_NAMESPACE_RSS_20]['rss'][0]['child'][SIMPLEPIE_NAMESPACE_RSS_20]['channel'][0])) 94 | { 95 | $channel =& $data->data['child'][SIMPLEPIE_NAMESPACE_RSS_20]['rss'][0]['child'][SIMPLEPIE_NAMESPACE_RSS_20]['channel'][0]; 96 | } 97 | else 98 | { 99 | $channel = null; 100 | } 101 | 102 | if ($channel !== null) 103 | { 104 
| if (isset($channel['child'][SIMPLEPIE_NAMESPACE_ATOM_10]['entry'])) 105 | { 106 | unset($channel['child'][SIMPLEPIE_NAMESPACE_ATOM_10]['entry']); 107 | } 108 | if (isset($channel['child'][SIMPLEPIE_NAMESPACE_ATOM_03]['entry'])) 109 | { 110 | unset($channel['child'][SIMPLEPIE_NAMESPACE_ATOM_03]['entry']); 111 | } 112 | if (isset($channel['child'][SIMPLEPIE_NAMESPACE_RSS_10]['item'])) 113 | { 114 | unset($channel['child'][SIMPLEPIE_NAMESPACE_RSS_10]['item']); 115 | } 116 | if (isset($channel['child'][SIMPLEPIE_NAMESPACE_RSS_090]['item'])) 117 | { 118 | unset($channel['child'][SIMPLEPIE_NAMESPACE_RSS_090]['item']); 119 | } 120 | if (isset($channel['child'][SIMPLEPIE_NAMESPACE_RSS_20]['item'])) 121 | { 122 | unset($channel['child'][SIMPLEPIE_NAMESPACE_RSS_20]['item']); 123 | } 124 | } 125 | if (isset($data->data['items'])) 126 | { 127 | unset($data->data['items']); 128 | } 129 | if (isset($data->data['ordered_items'])) 130 | { 131 | unset($data->data['ordered_items']); 132 | } 133 | } 134 | return array(serialize($data->data), $items_by_id); 135 | } 136 | } 137 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Cache/File.php: -------------------------------------------------------------------------------- 1 | location = $location; 91 | $this->filename = $name; 92 | $this->extension = $type; 93 | $this->name = "$this->location/$this->filename.$this->extension"; 94 | } 95 | 96 | /** 97 | * Save data to the cache 98 | * 99 | * @param array|SimplePie $data Data to store in the cache. 
If passed a SimplePie object, only cache the $data property 100 | * @return bool Successfulness 101 | */ 102 | public function save($data) 103 | { 104 | if (file_exists($this->name) && is_writeable($this->name) || file_exists($this->location) && is_writeable($this->location)) 105 | { 106 | if ($data instanceof SimplePie) 107 | { 108 | $data = $data->data; 109 | } 110 | 111 | $data = serialize($data); 112 | return (bool) file_put_contents($this->name, $data); 113 | } 114 | return false; 115 | } 116 | 117 | /** 118 | * Retrieve the data saved to the cache 119 | * 120 | * @return array Data for SimplePie::$data 121 | */ 122 | public function load() 123 | { 124 | if (file_exists($this->name) && is_readable($this->name)) 125 | { 126 | return unserialize(file_get_contents($this->name)); 127 | } 128 | return false; 129 | } 130 | 131 | /** 132 | * Retrieve the last modified time for the cache 133 | * 134 | * @return int Timestamp 135 | */ 136 | public function mtime() 137 | { 138 | return @filemtime($this->name); 139 | } 140 | 141 | /** 142 | * Set the last modified time to the current time 143 | * 144 | * @return bool Success status 145 | */ 146 | public function touch() 147 | { 148 | return @touch($this->name); 149 | } 150 | 151 | /** 152 | * Remove the cache 153 | * 154 | * @return bool Success status 155 | */ 156 | public function unlink() 157 | { 158 | if (file_exists($this->name)) 159 | { 160 | return unlink($this->name); 161 | } 162 | return false; 163 | } 164 | } 165 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Cache/Memcache.php: -------------------------------------------------------------------------------- 1 | options = array( 90 | 'host' => '127.0.0.1', 91 | 'port' => 11211, 92 | 'extras' => array( 93 | 'timeout' => 3600, // one hour 94 | 'prefix' => 'simplepie_', 95 | ), 96 | ); 97 | $this->options = SimplePie_Misc::array_merge_recursive($this->options, 
SimplePie_Cache::parse_URL($location)); 98 | 99 | $this->name = $this->options['extras']['prefix'] . md5("$name:$type"); 100 | 101 | $this->cache = new Memcache(); 102 | $this->cache->addServer($this->options['host'], (int) $this->options['port']); 103 | } 104 | 105 | /** 106 | * Save data to the cache 107 | * 108 | * @param array|SimplePie $data Data to store in the cache. If passed a SimplePie object, only cache the $data property 109 | * @return bool Successfulness 110 | */ 111 | public function save($data) 112 | { 113 | if ($data instanceof SimplePie) 114 | { 115 | $data = $data->data; 116 | } 117 | return $this->cache->set($this->name, serialize($data), MEMCACHE_COMPRESSED, (int) $this->options['extras']['timeout']); 118 | } 119 | 120 | /** 121 | * Retrieve the data saved to the cache 122 | * 123 | * @return array Data for SimplePie::$data 124 | */ 125 | public function load() 126 | { 127 | $data = $this->cache->get($this->name); 128 | 129 | if ($data !== false) 130 | { 131 | return unserialize($data); 132 | } 133 | return false; 134 | } 135 | 136 | /** 137 | * Retrieve the last modified time for the cache 138 | * 139 | * @return int Timestamp 140 | */ 141 | public function mtime() 142 | { 143 | $data = $this->cache->get($this->name); 144 | 145 | if ($data !== false) 146 | { 147 | // essentially ignore the mtime because Memcache expires on its own 148 | return time(); 149 | } 150 | 151 | return false; 152 | } 153 | 154 | /** 155 | * Set the last modified time to the current time 156 | * 157 | * @return bool Success status 158 | */ 159 | public function touch() 160 | { 161 | $data = $this->cache->get($this->name); 162 | 163 | if ($data !== false) 164 | { 165 | return $this->cache->set($this->name, $data, MEMCACHE_COMPRESSED, (int) $this->options['extras']['timeout']); 166 | } 167 | 168 | return false; 169 | } 170 | 171 | /** 172 | * Remove the cache 173 | * 174 | * @return bool Success status 175 | */ 176 | public function unlink() 177 | { 178 | return 
$this->cache->delete($this->name, 0); 179 | } 180 | } 181 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Cache/Memcached.php: -------------------------------------------------------------------------------- 1 | options = array( 86 | 'host' => '127.0.0.1', 87 | 'port' => 11211, 88 | 'extras' => array( 89 | 'timeout' => 3600, // one hour 90 | 'prefix' => 'simplepie_', 91 | ), 92 | ); 93 | $this->options = SimplePie_Misc::array_merge_recursive($this->options, SimplePie_Cache::parse_URL($location)); 94 | 95 | $this->name = $this->options['extras']['prefix'] . md5("$name:$type"); 96 | 97 | $this->cache = new Memcached(); 98 | $this->cache->addServer($this->options['host'], (int)$this->options['port']); 99 | } 100 | 101 | /** 102 | * Save data to the cache 103 | * @param array|SimplePie $data Data to store in the cache. If passed a SimplePie object, only cache the $data property 104 | * @return bool Successfulness 105 | */ 106 | public function save($data) { 107 | if ($data instanceof SimplePie) { 108 | $data = $data->data; 109 | } 110 | 111 | return $this->setData(serialize($data)); 112 | } 113 | 114 | /** 115 | * Retrieve the data saved to the cache 116 | * @return array Data for SimplePie::$data 117 | */ 118 | public function load() { 119 | $data = $this->cache->get($this->name); 120 | 121 | if ($data !== false) { 122 | return unserialize($data); 123 | } 124 | return false; 125 | } 126 | 127 | /** 128 | * Retrieve the last modified time for the cache 129 | * @return int Timestamp 130 | */ 131 | public function mtime() { 132 | $data = $this->cache->get($this->name . 
'_mtime'); 133 | return (int) $data; 134 | } 135 | 136 | /** 137 | * Set the last modified time to the current time 138 | * @return bool Success status 139 | */ 140 | public function touch() { 141 | $data = $this->cache->get($this->name); 142 | return $this->setData($data); 143 | } 144 | 145 | /** 146 | * Remove the cache 147 | * @return bool Success status 148 | */ 149 | public function unlink() { 150 | return $this->cache->delete($this->name, 0); 151 | } 152 | 153 | /** 154 | * Set the last modified time and data to Memcached 155 | * @return bool Success status 156 | */ 157 | private function setData($data) { 158 | 159 | if ($data !== false) { 160 | $this->cache->set($this->name . '_mtime', time(), (int)$this->options['extras']['timeout']); 161 | return $this->cache->set($this->name, $data, (int)$this->options['extras']['timeout']); 162 | } 163 | 164 | return false; 165 | } 166 | } 167 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Cache/Redis.php: -------------------------------------------------------------------------------- 1 | 8 | * @link http://galvani.cz/ 9 | * @license http://www.opensource.org/licenses/bsd-license.php BSD License 10 | * @version 0.2.9 11 | */ 12 | 13 | 14 | /** 15 | * Caches data to redis 16 | * 17 | * Registered for URLs with the "redis" protocol 18 | * 19 | * For example, `redis://localhost:6379/?timeout=3600&prefix=sp_&dbIndex=0` will 20 | * connect to redis on `localhost` on port 6379. 
Note that the constructor below does not read the query 21 | * parameters: keys are prefixed with `rss:simple_primary:` and never expire unless an $options array is passed 22 | * 23 | * @package SimplePie 24 | * @subpackage Caching 25 | * @uses Redis 26 | */ 27 | class SimplePie_Cache_Redis implements SimplePie_Cache_Base { 28 | /** 29 | * Redis instance 30 | * 31 | * @var \Redis 32 | */ 33 | protected $cache; 34 | 35 | /** 36 | * Options 37 | * 38 | * @var array 39 | */ 40 | protected $options; 41 | 42 | /** 43 | * Cache name 44 | * 45 | * @var string 46 | */ 47 | protected $name; 48 | 49 | /** 50 | * Cache Data 51 | * 52 | * @var mixed 53 | */ 54 | protected $data; 55 | 56 | /** 57 | * Create a new cache object 58 | * 59 | * @param string $location Location string (from SimplePie::$cache_location) 60 | * @param string $name Unique ID for the cache 61 | * @param array|null $options Optional settings: 'prefix' (key prefix) and 'expire' (lifetime in seconds, 0 for no expiry); anything that is not an array falls back to the defaults 62 | */ 63 | public function __construct($location, $name, $options = null) { 64 | //$this->cache = \flow\simple\cache\Redis::getRedisClientInstance(); 65 | $parsed = SimplePie_Cache::parse_URL($location); 66 | $redis = new Redis(); 67 | $redis->connect($parsed['host'], $parsed['port']); 68 | $this->cache = $redis; 69 | 70 | if (!is_null($options) && is_array($options)) { 71 | $this->options = $options; 72 | } else { 73 | $this->options = array ( 74 | 'prefix' => 'rss:simple_primary:', 75 | 'expire' => 0, 76 | ); 77 | } 78 | 79 | $this->name = $this->options['prefix'] . $name; 80 | } 81 | 82 | /** 83 | * @param \Redis $cache 84 | */ 85 | public function setRedisClient(\Redis $cache) { 86 | $this->cache = $cache; 87 | } 88 | 89 | /** 90 | * Save data to the cache 91 | * 92 | * @param array|SimplePie $data Data to store in the cache. 
If passed a SimplePie object, only cache the $data property 93 | * @return bool Successfulness 94 | */ 95 | public function save($data) { 96 | if ($data instanceof SimplePie) { 97 | $data = $data->data; 98 | } 99 | $response = $this->cache->set($this->name, serialize($data)); 100 | if ($this->options['expire']) { 101 | $this->cache->expire($this->name, $this->options['expire']); 102 | } 103 | 104 | return $response; 105 | } 106 | 107 | /** 108 | * Retrieve the data saved to the cache 109 | * 110 | * @return array Data for SimplePie::$data 111 | */ 112 | public function load() { 113 | $data = $this->cache->get($this->name); 114 | 115 | if ($data !== false) { 116 | return unserialize($data); 117 | } 118 | return false; 119 | } 120 | 121 | /** 122 | * Retrieve the last modified time for the cache 123 | * 124 | * @return int Timestamp 125 | */ 126 | public function mtime() { 127 | 128 | $data = $this->cache->get($this->name); 129 | 130 | if ($data !== false) { 131 | return time(); 132 | } 133 | 134 | return false; 135 | } 136 | 137 | /** 138 | * Set the last modified time to the current time 139 | * 140 | * @return bool Success status 141 | */ 142 | public function touch() { 143 | 144 | $data = $this->cache->get($this->name); 145 | 146 | if ($data !== false) { 147 | $return = $this->cache->set($this->name, $data); 148 | if ($this->options['expire']) { 149 | return $this->cache->expire($this->name, $this->options['expire']); // was $this->ttl, which is never defined on this class 150 | } 151 | return $return; 152 | } 153 | 154 | return false; 155 | } 156 | 157 | /** 158 | * Remove the cache 159 | * 160 | * @return bool Success status 161 | */ 162 | public function unlink() { 163 | return $this->cache->set($this->name, null); 164 | } 165 | 166 | } 167 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Caption.php: -------------------------------------------------------------------------------- 1 | ` captions as defined in Media RSS. 
47 | * 48 | * Used by {@see SimplePie_Enclosure::get_caption()} and {@see SimplePie_Enclosure::get_captions()} 49 | * 50 | * This class can be overloaded with {@see SimplePie::set_caption_class()} 51 | * 52 | * @package SimplePie 53 | * @subpackage API 54 | */ 55 | class SimplePie_Caption 56 | { 57 | /** 58 | * Content type 59 | * 60 | * @var string 61 | * @see get_type() 62 | */ 63 | var $type; 64 | 65 | /** 66 | * Language 67 | * 68 | * @var string 69 | * @see get_language() 70 | */ 71 | var $lang; 72 | 73 | /** 74 | * Start time 75 | * 76 | * @var string 77 | * @see get_starttime() 78 | */ 79 | var $startTime; 80 | 81 | /** 82 | * End time 83 | * 84 | * @var string 85 | * @see get_endtime() 86 | */ 87 | var $endTime; 88 | 89 | /** 90 | * Caption text 91 | * 92 | * @var string 93 | * @see get_text() 94 | */ 95 | var $text; 96 | 97 | /** 98 | * Constructor, used to input the data 99 | * 100 | * For documentation on all the parameters, see the corresponding 101 | * properties and their accessors 102 | */ 103 | public function __construct($type = null, $lang = null, $startTime = null, $endTime = null, $text = null) 104 | { 105 | $this->type = $type; 106 | $this->lang = $lang; 107 | $this->startTime = $startTime; 108 | $this->endTime = $endTime; 109 | $this->text = $text; 110 | } 111 | 112 | /** 113 | * String-ified version 114 | * 115 | * @return string 116 | */ 117 | public function __toString() 118 | { 119 | // There is no $this->data here 120 | return md5(serialize($this)); 121 | } 122 | 123 | /** 124 | * Get the end time 125 | * 126 | * @return string|null Time in the format 'hh:mm:ss.SSS' 127 | */ 128 | public function get_endtime() 129 | { 130 | if ($this->endTime !== null) 131 | { 132 | return $this->endTime; 133 | } 134 | else 135 | { 136 | return null; 137 | } 138 | } 139 | 140 | /** 141 | * Get the language 142 | * 143 | * @link http://tools.ietf.org/html/rfc3066 144 | * @return string|null Language code as per RFC 3066 145 | */ 146 | public function 
get_language() 147 | { 148 | if ($this->lang !== null) 149 | { 150 | return $this->lang; 151 | } 152 | else 153 | { 154 | return null; 155 | } 156 | } 157 | 158 | /** 159 | * Get the start time 160 | * 161 | * @return string|null Time in the format 'hh:mm:ss.SSS' 162 | */ 163 | public function get_starttime() 164 | { 165 | if ($this->startTime !== null) 166 | { 167 | return $this->startTime; 168 | } 169 | else 170 | { 171 | return null; 172 | } 173 | } 174 | 175 | /** 176 | * Get the text of the caption 177 | * 178 | * @return string|null 179 | */ 180 | public function get_text() 181 | { 182 | if ($this->text !== null) 183 | { 184 | return $this->text; 185 | } 186 | else 187 | { 188 | return null; 189 | } 190 | } 191 | 192 | /** 193 | * Get the content type (not MIME type) 194 | * 195 | * @return string|null Either 'text' or 'html' 196 | */ 197 | public function get_type() 198 | { 199 | if ($this->type !== null) 200 | { 201 | return $this->type; 202 | } 203 | else 204 | { 205 | return null; 206 | } 207 | } 208 | } 209 | 210 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Category.php: -------------------------------------------------------------------------------- 1 | term = $term; 90 | $this->scheme = $scheme; 91 | $this->label = $label; 92 | } 93 | 94 | /** 95 | * String-ified version 96 | * 97 | * @return string 98 | */ 99 | public function __toString() 100 | { 101 | // There is no $this->data here 102 | return md5(serialize($this)); 103 | } 104 | 105 | /** 106 | * Get the category identifier 107 | * 108 | * @return string|null 109 | */ 110 | public function get_term() 111 | { 112 | if ($this->term !== null) 113 | { 114 | return $this->term; 115 | } 116 | else 117 | { 118 | return null; 119 | } 120 | } 121 | 122 | /** 123 | * Get the categorization scheme identifier 124 | * 125 | * @return string|null 126 | */ 127 | public function get_scheme() 128 | { 129 | if ($this->scheme !== null) 130 
| { 131 | return $this->scheme; 132 | } 133 | else 134 | { 135 | return null; 136 | } 137 | } 138 | 139 | /** 140 | * Get the human readable label 141 | * 142 | * @return string|null 143 | */ 144 | public function get_label() 145 | { 146 | if ($this->label !== null) 147 | { 148 | return $this->label; 149 | } 150 | else 151 | { 152 | return $this->get_term(); 153 | } 154 | } 155 | } 156 | 157 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Copyright.php: -------------------------------------------------------------------------------- 1 | ` copyright tags as defined in Media RSS 46 | * 47 | * Used by {@see SimplePie_Enclosure::get_copyright()} 48 | * 49 | * This class can be overloaded with {@see SimplePie::set_copyright_class()} 50 | * 51 | * @package SimplePie 52 | * @subpackage API 53 | */ 54 | class SimplePie_Copyright 55 | { 56 | /** 57 | * Copyright URL 58 | * 59 | * @var string 60 | * @see get_url() 61 | */ 62 | var $url; 63 | 64 | /** 65 | * Attribution 66 | * 67 | * @var string 68 | * @see get_attribution() 69 | */ 70 | var $label; 71 | 72 | /** 73 | * Constructor, used to input the data 74 | * 75 | * For documentation on all the parameters, see the corresponding 76 | * properties and their accessors 77 | */ 78 | public function __construct($url = null, $label = null) 79 | { 80 | $this->url = $url; 81 | $this->label = $label; 82 | } 83 | 84 | /** 85 | * String-ified version 86 | * 87 | * @return string 88 | */ 89 | public function __toString() 90 | { 91 | // There is no $this->data here 92 | return md5(serialize($this)); 93 | } 94 | 95 | /** 96 | * Get the copyright URL 97 | * 98 | * @return string|null URL to copyright information 99 | */ 100 | public function get_url() 101 | { 102 | if ($this->url !== null) 103 | { 104 | return $this->url; 105 | } 106 | else 107 | { 108 | return null; 109 | } 110 | } 111 | 112 | /** 113 | * Get the attribution text 114 | * 115 | * @return string|null 
116 | */ 117 | public function get_attribution() 118 | { 119 | if ($this->label !== null) 120 | { 121 | return $this->label; 122 | } 123 | else 124 | { 125 | return null; 126 | } 127 | } 128 | } 129 | 130 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Core.php: -------------------------------------------------------------------------------- 1 | ` as defined in Media RSS 46 | * 47 | * Used by {@see SimplePie_Enclosure::get_credit()} and {@see SimplePie_Enclosure::get_credits()} 48 | * 49 | * This class can be overloaded with {@see SimplePie::set_credit_class()} 50 | * 51 | * @package SimplePie 52 | * @subpackage API 53 | */ 54 | class SimplePie_Credit 55 | { 56 | /** 57 | * Credited role 58 | * 59 | * @var string 60 | * @see get_role() 61 | */ 62 | var $role; 63 | 64 | /** 65 | * Organizational scheme 66 | * 67 | * @var string 68 | * @see get_scheme() 69 | */ 70 | var $scheme; 71 | 72 | /** 73 | * Credited name 74 | * 75 | * @var string 76 | * @see get_name() 77 | */ 78 | var $name; 79 | 80 | /** 81 | * Constructor, used to input the data 82 | * 83 | * For documentation on all the parameters, see the corresponding 84 | * properties and their accessors 85 | */ 86 | public function __construct($role = null, $scheme = null, $name = null) 87 | { 88 | $this->role = $role; 89 | $this->scheme = $scheme; 90 | $this->name = $name; 91 | } 92 | 93 | /** 94 | * String-ified version 95 | * 96 | * @return string 97 | */ 98 | public function __toString() 99 | { 100 | // There is no $this->data here 101 | return md5(serialize($this)); 102 | } 103 | 104 | /** 105 | * Get the role of the person receiving credit 106 | * 107 | * @return string|null 108 | */ 109 | public function get_role() 110 | { 111 | if ($this->role !== null) 112 | { 113 | return $this->role; 114 | } 115 | else 116 | { 117 | return null; 118 | } 119 | } 120 | 121 | /** 122 | * Get the organizational scheme 123 | * 124 | * @return string|null 
125 | */ 126 | public function get_scheme() 127 | { 128 | if ($this->scheme !== null) 129 | { 130 | return $this->scheme; 131 | } 132 | else 133 | { 134 | return null; 135 | } 136 | } 137 | 138 | /** 139 | * Get the credited person/entity's name 140 | * 141 | * @return string|null 142 | */ 143 | public function get_name() 144 | { 145 | if ($this->name !== null) 146 | { 147 | return $this->name; 148 | } 149 | else 150 | { 151 | return null; 152 | } 153 | } 154 | } 155 | 156 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Exception.php: -------------------------------------------------------------------------------- 1 | ` or `` tags as defined in Media RSS and iTunes RSS respectively 46 | * 47 | * Used by {@see SimplePie_Enclosure::get_rating()} and {@see SimplePie_Enclosure::get_ratings()} 48 | * 49 | * This class can be overloaded with {@see SimplePie::set_rating_class()} 50 | * 51 | * @package SimplePie 52 | * @subpackage API 53 | */ 54 | class SimplePie_Rating 55 | { 56 | /** 57 | * Rating scheme 58 | * 59 | * @var string 60 | * @see get_scheme() 61 | */ 62 | var $scheme; 63 | 64 | /** 65 | * Rating value 66 | * 67 | * @var string 68 | * @see get_value() 69 | */ 70 | var $value; 71 | 72 | /** 73 | * Constructor, used to input the data 74 | * 75 | * For documentation on all the parameters, see the corresponding 76 | * properties and their accessors 77 | */ 78 | public function __construct($scheme = null, $value = null) 79 | { 80 | $this->scheme = $scheme; 81 | $this->value = $value; 82 | } 83 | 84 | /** 85 | * String-ified version 86 | * 87 | * @return string 88 | */ 89 | public function __toString() 90 | { 91 | // There is no $this->data here 92 | return md5(serialize($this)); 93 | } 94 | 95 | /** 96 | * Get the organizational scheme for the rating 97 | * 98 | * @return string|null 99 | */ 100 | public function get_scheme() 101 | { 102 | if ($this->scheme !== null) 103 | { 104 | return 
$this->scheme; 105 | } 106 | else 107 | { 108 | return null; 109 | } 110 | } 111 | 112 | /** 113 | * Get the value of the rating 114 | * 115 | * @return string|null 116 | */ 117 | public function get_value() 118 | { 119 | if ($this->value !== null) 120 | { 121 | return $this->value; 122 | } 123 | else 124 | { 125 | return null; 126 | } 127 | } 128 | } 129 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Registry.php: -------------------------------------------------------------------------------- 1 | 'SimplePie_Cache', 62 | 'Locator' => 'SimplePie_Locator', 63 | 'Parser' => 'SimplePie_Parser', 64 | 'File' => 'SimplePie_File', 65 | 'Sanitize' => 'SimplePie_Sanitize', 66 | 'Item' => 'SimplePie_Item', 67 | 'Author' => 'SimplePie_Author', 68 | 'Category' => 'SimplePie_Category', 69 | 'Enclosure' => 'SimplePie_Enclosure', 70 | 'Caption' => 'SimplePie_Caption', 71 | 'Copyright' => 'SimplePie_Copyright', 72 | 'Credit' => 'SimplePie_Credit', 73 | 'Rating' => 'SimplePie_Rating', 74 | 'Restriction' => 'SimplePie_Restriction', 75 | 'Content_Type_Sniffer' => 'SimplePie_Content_Type_Sniffer', 76 | 'Source' => 'SimplePie_Source', 77 | 'Misc' => 'SimplePie_Misc', 78 | 'XML_Declaration_Parser' => 'SimplePie_XML_Declaration_Parser', 79 | 'Parse_Date' => 'SimplePie_Parse_Date', 80 | ); 81 | 82 | /** 83 | * Class mapping 84 | * 85 | * @see register() 86 | * @var array 87 | */ 88 | protected $classes = array(); 89 | 90 | /** 91 | * Legacy classes 92 | * 93 | * @see register() 94 | * @var array 95 | */ 96 | protected $legacy = array(); 97 | 98 | /** 99 | * Constructor 100 | * 101 | * No-op 102 | */ 103 | public function __construct() { } 104 | 105 | /** 106 | * Register a class 107 | * 108 | * @param string $type See {@see $default} for names 109 | * @param string $class Class name, must subclass the corresponding default 110 | * @param bool $legacy Whether to enable legacy support for this class 111 | * @return 
bool Successfulness 112 | */ 113 | public function register($type, $class, $legacy = false) 114 | { 115 | if (!@is_subclass_of($class, $this->default[$type])) 116 | { 117 | return false; 118 | } 119 | 120 | $this->classes[$type] = $class; 121 | 122 | if ($legacy) 123 | { 124 | $this->legacy[] = $class; 125 | } 126 | 127 | return true; 128 | } 129 | 130 | /** 131 | * Get the class registered for a type 132 | * 133 | * Where possible, use {@see create()} or {@see call()} instead 134 | * 135 | * @param string $type 136 | * @return string|null 137 | */ 138 | public function get_class($type) 139 | { 140 | if (!empty($this->classes[$type])) 141 | { 142 | return $this->classes[$type]; 143 | } 144 | if (!empty($this->default[$type])) 145 | { 146 | return $this->default[$type]; 147 | } 148 | 149 | return null; 150 | } 151 | 152 | /** 153 | * Create a new instance of a given type 154 | * 155 | * @param string $type 156 | * @param array $parameters Parameters to pass to the constructor 157 | * @return object Instance of class 158 | */ 159 | public function &create($type, $parameters = array()) 160 | { 161 | $class = $this->get_class($type); 162 | 163 | if (in_array($class, $this->legacy)) 164 | { 165 | switch ($type) 166 | { 167 | case 'locator': 168 | // Legacy: file, timeout, useragent, file_class, max_checked_feeds, content_type_sniffer_class 169 | // Specified: file, timeout, useragent, max_checked_feeds 170 | $replacement = array($this->get_class('file'), $parameters[3], $this->get_class('content_type_sniffer')); 171 | array_splice($parameters, 3, 1, $replacement); 172 | break; 173 | } 174 | } 175 | 176 | if (!method_exists($class, '__construct')) 177 | { 178 | $instance = new $class; 179 | } 180 | else 181 | { 182 | $reflector = new ReflectionClass($class); 183 | $instance = $reflector->newInstanceArgs($parameters); 184 | } 185 | 186 | if (method_exists($instance, 'set_registry')) 187 | { 188 | $instance->set_registry($this); 189 | } 190 | return $instance; 191 | } 192 
| 193 | /** 194 | * Call a static method for a type 195 | * 196 | * @param string $type 197 | * @param string $method 198 | * @param array $parameters 199 | * @return mixed 200 | */ 201 | public function &call($type, $method, $parameters = array()) 202 | { 203 | $class = $this->get_class($type); 204 | 205 | if (in_array($class, $this->legacy)) 206 | { 207 | switch ($type) 208 | { 209 | case 'Cache': 210 | // For backwards compatibility with old non-static 211 | // Cache::create() methods 212 | if ($method === 'get_handler') 213 | { 214 | $result = @call_user_func_array(array($class, 'create'), $parameters); 215 | return $result; 216 | } 217 | break; 218 | } 219 | } 220 | 221 | $result = call_user_func_array(array($class, $method), $parameters); 222 | return $result; 223 | } 224 | } 225 | -------------------------------------------------------------------------------- /libraries/simplepie/library/SimplePie/Restriction.php: -------------------------------------------------------------------------------- 1 | ` as defined in Media RSS 46 | * 47 | * Used by {@see SimplePie_Enclosure::get_restriction()} and {@see SimplePie_Enclosure::get_restrictions()} 48 | * 49 | * This class can be overloaded with {@see SimplePie::set_restriction_class()} 50 | * 51 | * @package SimplePie 52 | * @subpackage API 53 | */ 54 | class SimplePie_Restriction 55 | { 56 | /** 57 | * Relationship ('allow'/'deny') 58 | * 59 | * @var string 60 | * @see get_relationship() 61 | */ 62 | var $relationship; 63 | 64 | /** 65 | * Type of restriction 66 | * 67 | * @var string 68 | * @see get_type() 69 | */ 70 | var $type; 71 | 72 | /** 73 | * Restricted values 74 | * 75 | * @var string 76 | * @see get_value() 77 | */ 78 | var $value; 79 | 80 | /** 81 | * Constructor, used to input the data 82 | * 83 | * For documentation on all the parameters, see the corresponding 84 | * properties and their accessors 85 | */ 86 | public function __construct($relationship = null, $type = null, $value = null) 87 | { 88 | 
$this->relationship = $relationship; 89 | $this->type = $type; 90 | $this->value = $value; 91 | } 92 | 93 | /** 94 | * String-ified version 95 | * 96 | * @return string 97 | */ 98 | public function __toString() 99 | { 100 | // There is no $this->data here 101 | return md5(serialize($this)); 102 | } 103 | 104 | /** 105 | * Get the relationship 106 | * 107 | * @return string|null Either 'allow' or 'deny' 108 | */ 109 | public function get_relationship() 110 | { 111 | if ($this->relationship !== null) 112 | { 113 | return $this->relationship; 114 | } 115 | else 116 | { 117 | return null; 118 | } 119 | } 120 | 121 | /** 122 | * Get the type 123 | * 124 | * @return string|null 125 | */ 126 | public function get_type() 127 | { 128 | if ($this->type !== null) 129 | { 130 | return $this->type; 131 | } 132 | else 133 | { 134 | return null; 135 | } 136 | } 137 | 138 | /** 139 | * Get the list of restricted things 140 | * 141 | * @return string|null 142 | */ 143 | public function get_value() 144 | { 145 | if ($this->value !== null) 146 | { 147 | return $this->value; 148 | } 149 | else 150 | { 151 | return null; 152 | } 153 | } 154 | } 155 | -------------------------------------------------------------------------------- /robots.txt: -------------------------------------------------------------------------------- 1 | User-agent: * 2 | Disallow: /makefulltextfeed.php 3 | Disallow: /extract.php -------------------------------------------------------------------------------- /site_config/README.txt: -------------------------------------------------------------------------------- 1 | Full-Text RSS Site Patterns 2 | --------------------------- 3 | 4 | Site patterns allow you to specify what should be extracted from specific sites. 5 | 6 | Please see http://help.fivefilters.org/customer/portal/articles/223153-site-patterns for more information. 
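As a rough illustration only: a site pattern is a plain text file named after the site's hostname (this download ships .wikipedia.org.txt in site_config/standard). The sketch below assumes a hypothetical site, example.org, saved as site_config/custom/example.org.txt, and the XPath expressions describe made-up page markup. The directive names shown (title, body, strip, test_url) are ones commonly seen in Full-Text RSS site patterns; treat the help page above as the authoritative list.

```
# Hypothetical pattern for example.org (markup and URL are assumptions)
title: //h1[@class='headline']
body: //div[@id='article-body']

# Remove a block we do not want in the extracted content
strip: //div[@class='related-links']

# A page that can be used to check the pattern
test_url: http://example.org/news/some-article
```

Each line is a directive name, a colon, and a value; lines starting with # are comments. Keeping your own patterns in site_config/custom rather than site_config/standard should make it easier to update the standard set without overwriting local changes.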
-------------------------------------------------------------------------------- /site_config/custom/index.php: -------------------------------------------------------------------------------- 1 | The Fork & Pull Model lets anyone fork an existing repository and push changes to their personal fork without requiring access be granted to the source repository. The changes must then be pulled into the source repository by the project maintainer. This model reduces the amount of friction for new contributors and is popular with open source projects because it allows people to work independently without upfront coordination. 17 | 18 | When we receive a pull request we'll review the changes and if everything's okay we'll update our copy. 19 | 20 | If a site is not in our set, you can create a file for it in the same way. See [Creating files on GitHub](https://github.com/blog/1327-creating-files-on-github). 21 | 22 | ### How to write a site config file 23 | 24 | The quickest and simplest way is to use our [point-and-click interface](http://siteconfig.fivefilters.org). It's a simple tool only intended to create a rule to extract the correct content block. 25 | 26 | For further refinements, e.g. selecting the title, stripping elements, dealing with multi-page articles, please see our [help page](http://help.fivefilters.org/customer/portal/articles/223153-site-patterns). 27 | 28 | ### Instapaper 29 | 30 | When we introduced site patterns, we chose to adopt the [same format](http://blog.instapaper.com/post/730281947) used by Instapaper. This allows us to make use of the existing extraction rules contributed by Instapaper users. 31 | 32 | Marco, Instapaper's creator, graciously opened up the database of contributions to everyone: 33 | 34 | > And, recognizing that your efforts could be useful to a wide range of other tools and services, I'll make the list of all of these site-specific configurations available to the public, free, with no strings attached. 
35 | 36 | Most of the extraction rules in our set are borrowed from Instapaper. You can see the list maintained by Instapaper at [instapaper.com/bodytext/](http://instapaper.com/bodytext/) (no longer available since Instapaper was sold). 37 | 38 | ### Testing site config files 39 | 40 | Currently you will have to have a copy of Full-Text RSS to test changes to the site config files. In the future we will try to make this process easier. 41 | -------------------------------------------------------------------------------- /site_config/standard/index.php: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /site_config/standard/version.txt: -------------------------------------------------------------------------------- 1 | 2014-05-05T08:36:15Z --------------------------------------------------------------------------------