├── LICENSE-GPL2 ├── LICENSE-LGPL3 ├── README.md ├── composer.json ├── htmLawed.php ├── htmLawed_README.htm ├── htmLawed_README.txt └── htmLawed_TESTCASE.txt /LICENSE-GPL2: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 2, June 1991 3 | 4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc., 5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 6 | Everyone is permitted to copy and distribute verbatim copies 7 | of this license document, but changing it is not allowed. 8 | 9 | Preamble 10 | 11 | The licenses for most software are designed to take away your 12 | freedom to share and change it. By contrast, the GNU General Public 13 | License is intended to guarantee your freedom to share and change free 14 | software--to make sure the software is free for all its users. This 15 | General Public License applies to most of the Free Software 16 | Foundation's software and to any other program whose authors commit to 17 | using it. (Some other Free Software Foundation software is covered by 18 | the GNU Lesser General Public License instead.) You can apply it to 19 | your programs, too. 20 | 21 | When we speak of free software, we are referring to freedom, not 22 | price. Our General Public Licenses are designed to make sure that you 23 | have the freedom to distribute copies of free software (and charge for 24 | this service if you wish), that you receive source code or can get it 25 | if you want it, that you can change the software or use pieces of it 26 | in new free programs; and that you know you can do these things. 27 | 28 | To protect your rights, we need to make restrictions that forbid 29 | anyone to deny you these rights or to ask you to surrender the rights. 30 | These restrictions translate to certain responsibilities for you if you 31 | distribute copies of the software, or if you modify it. 
32 | 33 | For example, if you distribute copies of such a program, whether 34 | gratis or for a fee, you must give the recipients all the rights that 35 | you have. You must make sure that they, too, receive or can get the 36 | source code. And you must show them these terms so they know their 37 | rights. 38 | 39 | We protect your rights with two steps: (1) copyright the software, and 40 | (2) offer you this license which gives you legal permission to copy, 41 | distribute and/or modify the software. 42 | 43 | Also, for each author's protection and ours, we want to make certain 44 | that everyone understands that there is no warranty for this free 45 | software. If the software is modified by someone else and passed on, we 46 | want its recipients to know that what they have is not the original, so 47 | that any problems introduced by others will not reflect on the original 48 | authors' reputations. 49 | 50 | Finally, any free program is threatened constantly by software 51 | patents. We wish to avoid the danger that redistributors of a free 52 | program will individually obtain patent licenses, in effect making the 53 | program proprietary. To prevent this, we have made it clear that any 54 | patent must be licensed for everyone's free use or not licensed at all. 55 | 56 | The precise terms and conditions for copying, distribution and 57 | modification follow. 58 | 59 | GNU GENERAL PUBLIC LICENSE 60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 61 | 62 | 0. This License applies to any program or other work which contains 63 | a notice placed by the copyright holder saying it may be distributed 64 | under the terms of this General Public License. 
The "Program", below, 65 | refers to any such program or work, and a "work based on the Program" 66 | means either the Program or any derivative work under copyright law: 67 | that is to say, a work containing the Program or a portion of it, 68 | either verbatim or with modifications and/or translated into another 69 | language. (Hereinafter, translation is included without limitation in 70 | the term "modification".) Each licensee is addressed as "you". 71 | 72 | Activities other than copying, distribution and modification are not 73 | covered by this License; they are outside its scope. The act of 74 | running the Program is not restricted, and the output from the Program 75 | is covered only if its contents constitute a work based on the 76 | Program (independent of having been made by running the Program). 77 | Whether that is true depends on what the Program does. 78 | 79 | 1. You may copy and distribute verbatim copies of the Program's 80 | source code as you receive it, in any medium, provided that you 81 | conspicuously and appropriately publish on each copy an appropriate 82 | copyright notice and disclaimer of warranty; keep intact all the 83 | notices that refer to this License and to the absence of any warranty; 84 | and give any other recipients of the Program a copy of this License 85 | along with the Program. 86 | 87 | You may charge a fee for the physical act of transferring a copy, and 88 | you may at your option offer warranty protection in exchange for a fee. 89 | 90 | 2. You may modify your copy or copies of the Program or any portion 91 | of it, thus forming a work based on the Program, and copy and 92 | distribute such modifications or work under the terms of Section 1 93 | above, provided that you also meet all of these conditions: 94 | 95 | a) You must cause the modified files to carry prominent notices 96 | stating that you changed the files and the date of any change. 
97 | 98 | b) You must cause any work that you distribute or publish, that in 99 | whole or in part contains or is derived from the Program or any 100 | part thereof, to be licensed as a whole at no charge to all third 101 | parties under the terms of this License. 102 | 103 | c) If the modified program normally reads commands interactively 104 | when run, you must cause it, when started running for such 105 | interactive use in the most ordinary way, to print or display an 106 | announcement including an appropriate copyright notice and a 107 | notice that there is no warranty (or else, saying that you provide 108 | a warranty) and that users may redistribute the program under 109 | these conditions, and telling the user how to view a copy of this 110 | License. (Exception: if the Program itself is interactive but 111 | does not normally print such an announcement, your work based on 112 | the Program is not required to print an announcement.) 113 | 114 | These requirements apply to the modified work as a whole. If 115 | identifiable sections of that work are not derived from the Program, 116 | and can be reasonably considered independent and separate works in 117 | themselves, then this License, and its terms, do not apply to those 118 | sections when you distribute them as separate works. But when you 119 | distribute the same sections as part of a whole which is a work based 120 | on the Program, the distribution of the whole must be on the terms of 121 | this License, whose permissions for other licensees extend to the 122 | entire whole, and thus to each and every part regardless of who wrote it. 123 | 124 | Thus, it is not the intent of this section to claim rights or contest 125 | your rights to work written entirely by you; rather, the intent is to 126 | exercise the right to control the distribution of derivative or 127 | collective works based on the Program. 
128 | 129 | In addition, mere aggregation of another work not based on the Program 130 | with the Program (or with a work based on the Program) on a volume of 131 | a storage or distribution medium does not bring the other work under 132 | the scope of this License. 133 | 134 | 3. You may copy and distribute the Program (or a work based on it, 135 | under Section 2) in object code or executable form under the terms of 136 | Sections 1 and 2 above provided that you also do one of the following: 137 | 138 | a) Accompany it with the complete corresponding machine-readable 139 | source code, which must be distributed under the terms of Sections 140 | 1 and 2 above on a medium customarily used for software interchange; or, 141 | 142 | b) Accompany it with a written offer, valid for at least three 143 | years, to give any third party, for a charge no more than your 144 | cost of physically performing source distribution, a complete 145 | machine-readable copy of the corresponding source code, to be 146 | distributed under the terms of Sections 1 and 2 above on a medium 147 | customarily used for software interchange; or, 148 | 149 | c) Accompany it with the information you received as to the offer 150 | to distribute corresponding source code. (This alternative is 151 | allowed only for noncommercial distribution and only if you 152 | received the program in object code or executable form with such 153 | an offer, in accord with Subsection b above.) 154 | 155 | The source code for a work means the preferred form of the work for 156 | making modifications to it. For an executable work, complete source 157 | code means all the source code for all modules it contains, plus any 158 | associated interface definition files, plus the scripts used to 159 | control compilation and installation of the executable. 
However, as a 160 | special exception, the source code distributed need not include 161 | anything that is normally distributed (in either source or binary 162 | form) with the major components (compiler, kernel, and so on) of the 163 | operating system on which the executable runs, unless that component 164 | itself accompanies the executable. 165 | 166 | If distribution of executable or object code is made by offering 167 | access to copy from a designated place, then offering equivalent 168 | access to copy the source code from the same place counts as 169 | distribution of the source code, even though third parties are not 170 | compelled to copy the source along with the object code. 171 | 172 | 4. You may not copy, modify, sublicense, or distribute the Program 173 | except as expressly provided under this License. Any attempt 174 | otherwise to copy, modify, sublicense or distribute the Program is 175 | void, and will automatically terminate your rights under this License. 176 | However, parties who have received copies, or rights, from you under 177 | this License will not have their licenses terminated so long as such 178 | parties remain in full compliance. 179 | 180 | 5. You are not required to accept this License, since you have not 181 | signed it. However, nothing else grants you permission to modify or 182 | distribute the Program or its derivative works. These actions are 183 | prohibited by law if you do not accept this License. Therefore, by 184 | modifying or distributing the Program (or any work based on the 185 | Program), you indicate your acceptance of this License to do so, and 186 | all its terms and conditions for copying, distributing or modifying 187 | the Program or works based on it. 188 | 189 | 6. 
Each time you redistribute the Program (or any work based on the 190 | Program), the recipient automatically receives a license from the 191 | original licensor to copy, distribute or modify the Program subject to 192 | these terms and conditions. You may not impose any further 193 | restrictions on the recipients' exercise of the rights granted herein. 194 | You are not responsible for enforcing compliance by third parties to 195 | this License. 196 | 197 | 7. If, as a consequence of a court judgment or allegation of patent 198 | infringement or for any other reason (not limited to patent issues), 199 | conditions are imposed on you (whether by court order, agreement or 200 | otherwise) that contradict the conditions of this License, they do not 201 | excuse you from the conditions of this License. If you cannot 202 | distribute so as to satisfy simultaneously your obligations under this 203 | License and any other pertinent obligations, then as a consequence you 204 | may not distribute the Program at all. For example, if a patent 205 | license would not permit royalty-free redistribution of the Program by 206 | all those who receive copies directly or indirectly through you, then 207 | the only way you could satisfy both it and this License would be to 208 | refrain entirely from distribution of the Program. 209 | 210 | If any portion of this section is held invalid or unenforceable under 211 | any particular circumstance, the balance of the section is intended to 212 | apply and the section as a whole is intended to apply in other 213 | circumstances. 214 | 215 | It is not the purpose of this section to induce you to infringe any 216 | patents or other property right claims or to contest validity of any 217 | such claims; this section has the sole purpose of protecting the 218 | integrity of the free software distribution system, which is 219 | implemented by public license practices. 
Many people have made 220 | generous contributions to the wide range of software distributed 221 | through that system in reliance on consistent application of that 222 | system; it is up to the author/donor to decide if he or she is willing 223 | to distribute software through any other system and a licensee cannot 224 | impose that choice. 225 | 226 | This section is intended to make thoroughly clear what is believed to 227 | be a consequence of the rest of this License. 228 | 229 | 8. If the distribution and/or use of the Program is restricted in 230 | certain countries either by patents or by copyrighted interfaces, the 231 | original copyright holder who places the Program under this License 232 | may add an explicit geographical distribution limitation excluding 233 | those countries, so that distribution is permitted only in or among 234 | countries not thus excluded. In such case, this License incorporates 235 | the limitation as if written in the body of this License. 236 | 237 | 9. The Free Software Foundation may publish revised and/or new versions 238 | of the General Public License from time to time. Such new versions will 239 | be similar in spirit to the present version, but may differ in detail to 240 | address new problems or concerns. 241 | 242 | Each version is given a distinguishing version number. If the Program 243 | specifies a version number of this License which applies to it and "any 244 | later version", you have the option of following the terms and conditions 245 | either of that version or of any later version published by the Free 246 | Software Foundation. If the Program does not specify a version number of 247 | this License, you may choose any version ever published by the Free Software 248 | Foundation. 249 | 250 | 10. If you wish to incorporate parts of the Program into other free 251 | programs whose distribution conditions are different, write to the author 252 | to ask for permission. 
For software which is copyrighted by the Free 253 | Software Foundation, write to the Free Software Foundation; we sometimes 254 | make exceptions for this. Our decision will be guided by the two goals 255 | of preserving the free status of all derivatives of our free software and 256 | of promoting the sharing and reuse of software generally. 257 | 258 | NO WARRANTY 259 | 260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN 262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED 264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS 266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE 267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, 268 | REPAIR OR CORRECTION. 269 | 270 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR 272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING 274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED 275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY 276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 278 | POSSIBILITY OF SUCH DAMAGES. 
279 | 280 | END OF TERMS AND CONDITIONS 281 | 282 | How to Apply These Terms to Your New Programs 283 | 284 | If you develop a new program, and you want it to be of the greatest 285 | possible use to the public, the best way to achieve this is to make it 286 | free software which everyone can redistribute and change under these terms. 287 | 288 | To do so, attach the following notices to the program. It is safest 289 | to attach them to the start of each source file to most effectively 290 | convey the exclusion of warranty; and each file should have at least 291 | the "copyright" line and a pointer to where the full notice is found. 292 | 293 | 294 | Copyright (C) 295 | 296 | This program is free software; you can redistribute it and/or modify 297 | it under the terms of the GNU General Public License as published by 298 | the Free Software Foundation; either version 2 of the License, or 299 | (at your option) any later version. 300 | 301 | This program is distributed in the hope that it will be useful, 302 | but WITHOUT ANY WARRANTY; without even the implied warranty of 303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 304 | GNU General Public License for more details. 305 | 306 | You should have received a copy of the GNU General Public License along 307 | with this program; if not, write to the Free Software Foundation, Inc., 308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 309 | 310 | Also add information on how to contact you by electronic and paper mail. 311 | 312 | If the program is interactive, make it output a short notice like this 313 | when it starts in an interactive mode: 314 | 315 | Gnomovision version 69, Copyright (C) year name of author 316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 317 | This is free software, and you are welcome to redistribute it 318 | under certain conditions; type `show c' for details. 
319 | 320 | The hypothetical commands `show w' and `show c' should show the appropriate 321 | parts of the General Public License. Of course, the commands you use may 322 | be called something other than `show w' and `show c'; they could even be 323 | mouse-clicks or menu items--whatever suits your program. 324 | 325 | You should also get your employer (if you work as a programmer) or your 326 | school, if any, to sign a "copyright disclaimer" for the program, if 327 | necessary. Here is a sample; alter the names: 328 | 329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program 330 | `Gnomovision' (which makes passes at compilers) written by James Hacker. 331 | 332 | , 1 April 1989 333 | Ty Coon, President of Vice 334 | 335 | This General Public License does not permit incorporating your program into 336 | proprietary programs. If your program is a subroutine library, you may 337 | consider it more useful to permit linking proprietary applications with the 338 | library. If this is what you want to do, use the GNU Lesser General 339 | Public License instead of this License. 340 | -------------------------------------------------------------------------------- /LICENSE-LGPL3: -------------------------------------------------------------------------------- 1 | GNU LESSER GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | 9 | This version of the GNU Lesser General Public License incorporates 10 | the terms and conditions of version 3 of the GNU General Public 11 | License, supplemented by the additional permissions listed below. 12 | 13 | 0. Additional Definitions. 14 | 15 | As used herein, "this License" refers to version 3 of the GNU Lesser 16 | General Public License, and the "GNU GPL" refers to version 3 of the GNU 17 | General Public License. 
18 | 19 | "The Library" refers to a covered work governed by this License, 20 | other than an Application or a Combined Work as defined below. 21 | 22 | An "Application" is any work that makes use of an interface provided 23 | by the Library, but which is not otherwise based on the Library. 24 | Defining a subclass of a class defined by the Library is deemed a mode 25 | of using an interface provided by the Library. 26 | 27 | A "Combined Work" is a work produced by combining or linking an 28 | Application with the Library. The particular version of the Library 29 | with which the Combined Work was made is also called the "Linked 30 | Version". 31 | 32 | The "Minimal Corresponding Source" for a Combined Work means the 33 | Corresponding Source for the Combined Work, excluding any source code 34 | for portions of the Combined Work that, considered in isolation, are 35 | based on the Application, and not on the Linked Version. 36 | 37 | The "Corresponding Application Code" for a Combined Work means the 38 | object code and/or source code for the Application, including any data 39 | and utility programs needed for reproducing the Combined Work from the 40 | Application, but excluding the System Libraries of the Combined Work. 41 | 42 | 1. Exception to Section 3 of the GNU GPL. 43 | 44 | You may convey a covered work under sections 3 and 4 of this License 45 | without being bound by section 3 of the GNU GPL. 46 | 47 | 2. Conveying Modified Versions. 
48 | 49 | If you modify a copy of the Library, and, in your modifications, a 50 | facility refers to a function or data to be supplied by an Application 51 | that uses the facility (other than as an argument passed when the 52 | facility is invoked), then you may convey a copy of the modified 53 | version: 54 | 55 | a) under this License, provided that you make a good faith effort to 56 | ensure that, in the event an Application does not supply the 57 | function or data, the facility still operates, and performs 58 | whatever part of its purpose remains meaningful, or 59 | 60 | b) under the GNU GPL, with none of the additional permissions of 61 | this License applicable to that copy. 62 | 63 | 3. Object Code Incorporating Material from Library Header Files. 64 | 65 | The object code form of an Application may incorporate material from 66 | a header file that is part of the Library. You may convey such object 67 | code under terms of your choice, provided that, if the incorporated 68 | material is not limited to numerical parameters, data structure 69 | layouts and accessors, or small macros, inline functions and templates 70 | (ten or fewer lines in length), you do both of the following: 71 | 72 | a) Give prominent notice with each copy of the object code that the 73 | Library is used in it and that the Library and its use are 74 | covered by this License. 75 | 76 | b) Accompany the object code with a copy of the GNU GPL and this license 77 | document. 78 | 79 | 4. Combined Works. 80 | 81 | You may convey a Combined Work under terms of your choice that, 82 | taken together, effectively do not restrict modification of the 83 | portions of the Library contained in the Combined Work and reverse 84 | engineering for debugging such modifications, if you also do each of 85 | the following: 86 | 87 | a) Give prominent notice with each copy of the Combined Work that 88 | the Library is used in it and that the Library and its use are 89 | covered by this License. 
90 | 91 | b) Accompany the Combined Work with a copy of the GNU GPL and this license 92 | document. 93 | 94 | c) For a Combined Work that displays copyright notices during 95 | execution, include the copyright notice for the Library among 96 | these notices, as well as a reference directing the user to the 97 | copies of the GNU GPL and this license document. 98 | 99 | d) Do one of the following: 100 | 101 | 0) Convey the Minimal Corresponding Source under the terms of this 102 | License, and the Corresponding Application Code in a form 103 | suitable for, and under terms that permit, the user to 104 | recombine or relink the Application with a modified version of 105 | the Linked Version to produce a modified Combined Work, in the 106 | manner specified by section 6 of the GNU GPL for conveying 107 | Corresponding Source. 108 | 109 | 1) Use a suitable shared library mechanism for linking with the 110 | Library. A suitable mechanism is one that (a) uses at run time 111 | a copy of the Library already present on the user's computer 112 | system, and (b) will operate properly with a modified version 113 | of the Library that is interface-compatible with the Linked 114 | Version. 115 | 116 | e) Provide Installation Information, but only if you would otherwise 117 | be required to provide such information under section 6 of the 118 | GNU GPL, and only to the extent that such information is 119 | necessary to install and execute a modified version of the 120 | Combined Work produced by recombining or relinking the 121 | Application with a modified version of the Linked Version. (If 122 | you use option 4d0, the Installation Information must accompany 123 | the Minimal Corresponding Source and Corresponding Application 124 | Code. If you use option 4d1, you must provide the Installation 125 | Information in the manner specified by section 6 of the GNU GPL 126 | for conveying Corresponding Source.) 127 | 128 | 5. Combined Libraries. 
129 | 130 | You may place library facilities that are a work based on the 131 | Library side by side in a single library together with other library 132 | facilities that are not Applications and are not covered by this 133 | License, and convey such a combined library under terms of your 134 | choice, if you do both of the following: 135 | 136 | a) Accompany the combined library with a copy of the same work based 137 | on the Library, uncombined with any other library facilities, 138 | conveyed under the terms of this License. 139 | 140 | b) Give prominent notice with the combined library that part of it 141 | is a work based on the Library, and explaining where to find the 142 | accompanying uncombined form of the same work. 143 | 144 | 6. Revised Versions of the GNU Lesser General Public License. 145 | 146 | The Free Software Foundation may publish revised and/or new versions 147 | of the GNU Lesser General Public License from time to time. Such new 148 | versions will be similar in spirit to the present version, but may 149 | differ in detail to address new problems or concerns. 150 | 151 | Each version is given a distinguishing version number. If the 152 | Library as you received it specifies that a certain numbered version 153 | of the GNU Lesser General Public License "or any later version" 154 | applies to it, you have the option of following the terms and 155 | conditions either of that published version or of any later version 156 | published by the Free Software Foundation. If the Library as you 157 | received it does not specify a version number of the GNU Lesser 158 | General Public License, you may choose any version of the GNU Lesser 159 | General Public License ever published by the Free Software Foundation. 
160 | 161 | If the Library as you received it specifies that a proxy can decide 162 | whether future versions of the GNU Lesser General Public License shall 163 | apply, that proxy's public statement of acceptance of any version is 164 | permanent authorization for you to choose that version for the 165 | Library. 166 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | HTMLawed is ... 2 | =============== 3 | 4 | ... a single-file, 45 kb PHP script that makes input text more secure, HTML standards-compliant, and 5 | suitable in general from the viewpoint of a web-page administrator, for use in the body of HTML, XHTML 6 | or XML documents. A simple HTMLTidy alternative, the htmLawed filter, processor, purifier, sanitizer, 7 | beautifier, etc., is highly customizable. 8 | 9 | It ensures that HTML tags are balanced and properly nested, neutralizes code that may be used 10 | for cross-site scripting (XSS) attacks, limits allowed HTML elements, attributes, or URL protocols, 11 | tidies the code, and so forth. 12 | 13 | As such, it may serve as an alternative to [HTMLtidy](http://en.wikipedia.org/wiki/HTML_Tidy) in a 14 | sanitization context. 15 | 16 | 17 | This repository is ... 18 | ====================== 19 | 20 | ... a derivative, which closely tracks [the original](http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/). 21 | 22 | 23 | Links 24 | ===== 25 | 26 | * The Original: http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/ 27 | * The SF site where the official Original Releases are available (no cvs/svn/... 
repository there, though, just releases): http://sourceforge.net/projects/htmlawed/ 28 | * HTMLawed against RSnake's XSS attack vectors: http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/rsnake/RSnakeXSSTest.htm 29 | 30 | -------------------------------------------------------------------------------- /composer.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "htmlawed/htmlawed", 3 | "type": "library", 4 | "description": "Official htmLawed PHP library for HTML filtering", 5 | "keywords": [ 6 | "clean","compliance","filter","filtering","htm","html","input","purify","safe","safety","sanitize","sanitizer","standards","text","xss" 7 | ], 8 | "homepage": "https://bioinformatics.org/phplabware/internal_utilities/htmLawed", 9 | "license": [ 10 | "GPL-2.0-or-later","LGPL-3.0-only" 11 | ], 12 | "authors": [ 13 | { 14 | "name": "Santosh Patnaik", 15 | "email": "drpatnaikREMOVECAPS@yahoo.com", 16 | "role": "Creator and developer" 17 | } 18 | ], 19 | "require": { 20 | "php": ">=4.4" 21 | }, 22 | "autoload": { 23 | "files": ["htmLawed.php"] 24 | }, 25 | "repositories": [ 26 | { 27 | "type": "composer", 28 | "url": "https://bioinformatics.org/phplabware/downloads" 29 | }, 30 | { 31 | "type": "composer", 32 | "url": "https://sourceforge.net/projects/htmlawed/files" 33 | } 34 | ], 35 | "support": { 36 | "docs": "https://bioinformatics.org/phplabware/internal_utilities/htmLawed", 37 | "forum": "https://bioinformatics.org/phplabware/internal_utilities/htmLawed", 38 | "source": "https://bioinformatics.org/phplabware/internal_utilities/htmLawed" 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /htmLawed.php: -------------------------------------------------------------------------------- 1 | 14 | * @copyright (c) 2007-, Santosh Patnaik 15 | * @dependency None 16 | * @license LGPL 3 and GPL 2+ dual license 17 | * @link 
https://bioinformatics.org/phplabware/internal_utilities/htmLawed 18 | * @package htmLawed 19 | * @php >=4.4 20 | * @time 2023-08-04 21 | * @version 1.2.15 22 | */ 23 | 24 | /* 25 | * Main function. 26 | * Calls all other functions (alphabetically ordered further below). 27 | * 28 | * @param string $t HTM. 29 | * @param mixed $C $config configuration option. 30 | * @param mixed $S $spec specification option. 31 | * @return string Filtered/sanitized $t. 32 | */ 33 | function htmLawed($t, $C=1, $S=array()) 34 | { 35 | // Standard elements including deprecated. 36 | 37 | $eleAr = array('a'=>1, 'abbr'=>1, 'acronym'=>1, 'address'=>1, 'applet'=>1, 'area'=>1, 'article'=>1, 'aside'=>1, 'audio'=>1, 'b'=>1, 'bdi'=>1, 'bdo'=>1, 'big'=>1, 'blockquote'=>1, 'br'=>1, 'button'=>1, 'canvas'=>1, 'caption'=>1, 'center'=>1, 'cite'=>1, 'code'=>1, 'col'=>1, 'colgroup'=>1, 'command'=>1, 'data'=>1, 'datalist'=>1, 'dd'=>1, 'del'=>1, 'details'=>1, 'dialog'=>1, 'dfn'=>1, 'dir'=>1, 'div'=>1, 'dl'=>1, 'dt'=>1, 'em'=>1, 'embed'=>1, 'fieldset'=>1, 'figcaption'=>1, 'figure'=>1, 'font'=>1, 'footer'=>1, 'form'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'header'=>1, 'hgroup'=>1, 'hr'=>1, 'i'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'ins'=>1, 'isindex'=>1, 'kbd'=>1, 'keygen'=>1, 'label'=>1, 'legend'=>1, 'li'=>1, 'link'=>1, 'main'=>1, 'map'=>1, 'mark'=>1, 'menu'=>1, 'meta'=>1, 'meter'=>1, 'nav'=>1, 'noscript'=>1, 'object'=>1, 'ol'=>1, 'optgroup'=>1, 'option'=>1, 'output'=>1, 'p'=>1, 'param'=>1, 'picture'=>1, 'pre'=>1, 'progress'=>1, 'q'=>1, 'rb'=>1, 'rbc'=>1, 'rp'=>1, 'rt'=>1, 'rtc'=>1, 'ruby'=>1, 's'=>1, 'samp'=>1, 'script'=>1, 'section'=>1, 'select'=>1, 'slot'=>1, 'small'=>1, 'source'=>1, 'span'=>1, 'strike'=>1, 'strong'=>1, 'style'=>1, 'sub'=>1, 'summary'=>1, 'sup'=>1, 'table'=>1, 'tbody'=>1, 'td'=>1, 'template'=>1, 'textarea'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'time'=>1, 'tr'=>1, 'track'=>1, 'tt'=>1, 'u'=>1, 'ul'=>1, 'var'=>1, 'video'=>1, 'wbr'=>1); 38 | 39 | // Set $C array 
($config), using default parameters as needed. 40 | 41 | $C = is_array($C) ? $C : array(); 42 | if (!empty($C['valid_xhtml'])) { 43 | $C['elements'] = empty($C['elements']) ? '*-acronym-big-center-dir-font-isindex-s-strike-tt' : $C['elements']; 44 | $C['make_tag_strict'] = isset($C['make_tag_strict']) ? $C['make_tag_strict'] : 2; 45 | $C['xml:lang'] = isset($C['xml:lang']) ? $C['xml:lang'] : 2; 46 | } 47 | 48 | // -- Configure for elements. 49 | 50 | if (!empty($C['safe'])) { 51 | unset($eleAr['applet'], $eleAr['audio'], $eleAr['canvas'], $eleAr['dialog'], $eleAr['embed'], $eleAr['iframe'], $eleAr['object'], $eleAr['script'], $eleAr['video']); 52 | } 53 | $x = !empty($C['elements']) ? str_replace(array("\n", "\r", "\t", ' '), '', strtolower($C['elements'])) : '*'; 54 | if ($x == '-*') { 55 | $eleAr = array(); 56 | } elseif (strpos($x, '*') === false) { 57 | $eleAr = array_flip(explode(',', $x)); 58 | } else { 59 | if (isset($x[1])) { 60 | if (strpos($x, '(')) { // Temporarily replace hyphen of custom element, minus being special character 61 | $x = 62 | preg_replace_callback( 63 | '`\([^()]+\)`', 64 | function ($m) { 65 | return str_replace(array('(', ')', '-'), array('', '', 'A'), $m[0]); 66 | }, 67 | $x); 68 | } 69 | preg_match_all('`(?:^|-|\+)[^\-+]+?(?=-|\+|$)`', $x, $m, PREG_SET_ORDER); 70 | for ($i=count($m); --$i>=0;) { 71 | $m[$i] = $m[$i][0]; 72 | } 73 | foreach ($m as $v) { 74 | $v = str_replace('A', '-', $v); 75 | if ($v[0] == '+') { 76 | $eleAr[substr($v, 1)] = 1; 77 | } elseif ($v[0] == '-') { 78 | if (strpos($v, '-', 1)) { 79 | $eleAr[$v] = 1; 80 | } elseif (isset($eleAr[($v = substr($v, 1))]) && !in_array('+'. $v, $m)) { 81 | unset($eleAr[$v]); 82 | } 83 | } 84 | } 85 | } 86 | } 87 | $C['elements'] =& $eleAr; 88 | 89 | // -- Configure for attributes. 90 | 91 | $x = !empty($C['deny_attribute']) ? 
strtolower(preg_replace('"\s+-"', '/', trim($C['deny_attribute']))) : ''; 92 | $x = str_replace(array(' ', "\t", "\r", "\n"), '', $x); 93 | $x = 94 | array_flip( 95 | (isset($x[0]) && $x[0] == '*') 96 | ? preg_replace( 97 | '`^[^*]`', 98 | '-'. '\\0', 99 | explode( 100 | '/', 101 | (!empty($C['safe']) ? preg_replace('`/on[^/]+`', '', $x) : $x))) 102 | : array_filter(explode(',', $x. (!empty($C['safe']) ? ',on*' : '')))); 103 | $C['deny_attribute'] = $x; 104 | 105 | // -- Configure URL handling. 106 | 107 | $x = (isset($C['schemes'][2]) && strpos($C['schemes'], ':') 108 | ? strtolower($C['schemes']) 109 | : ('href: aim, feed, file, ftp, gopher, http, https, irc, mailto, news, nntp, sftp, ssh, tel, telnet, ws, wss' 110 | . (empty($C['safe']) 111 | ? ', app, javascript; *: data, javascript, ' 112 | : '; *:') 113 | . 'file, http, https, ws, wss')); 114 | $C['schemes'] = array(); 115 | foreach (explode(';', trim(str_replace(array(' ', "\t", "\r", "\n"), '', $x), ';')) as $v) { 116 | if(strpos($v, ':')) { 117 | list($x, $y) = explode(':', $v, 2); 118 | $C['schemes'][$x] = array_flip(explode(',', $y)); 119 | } 120 | } 121 | if (!isset($C['schemes']['*'])) { 122 | $C['schemes']['*'] = array('file'=>1, 'http'=>1, 'https'=>1, 'ws'=>1, 'wss'=>1); 123 | if (empty($C['safe'])) { 124 | $C['schemes']['*'] += array('data'=>1, 'javascript'=>1); 125 | } 126 | } 127 | if (!empty($C['safe']) && empty($C['schemes']['style'])) { 128 | $C['schemes']['style'] = array('!'=>1); 129 | } 130 | $C['abs_url'] = isset($C['abs_url']) ? $C['abs_url'] : 0; 131 | if (!isset($C['base_url']) || !preg_match('`^[a-zA-Z\d.+\-]+://[^/]+/(.+?/)?$`', $C['base_url'])) { 132 | $C['base_url'] = $C['abs_url'] = 0; 133 | } 134 | 135 | // -- Configure other parameters. 136 | 137 | $C['and_mark'] = empty($C['and_mark']) ? 
0 : 1; 138 | $C['anti_link_spam'] = 139 | (isset($C['anti_link_spam']) 140 | && is_array($C['anti_link_spam']) 141 | && count($C['anti_link_spam']) == 2 142 | && (empty($C['anti_link_spam'][0]) 143 | || hl_regex($C['anti_link_spam'][0])) 144 | && (empty($C['anti_link_spam'][1]) 145 | || hl_regex($C['anti_link_spam'][1]))) 146 | ? $C['anti_link_spam'] 147 | : 0; 148 | $C['anti_mail_spam'] = isset($C['anti_mail_spam']) ? $C['anti_mail_spam'] : 0; 149 | $C['any_custom_element'] = (!isset($C['any_custom_element']) || !empty($C['any_custom_element'])) ? 1 : 0; 150 | $C['balance'] = isset($C['balance']) ? (bool)$C['balance'] : 1; 151 | $C['cdata'] = isset($C['cdata']) ? $C['cdata'] : (empty($C['safe']) ? 3 : 0); 152 | $C['clean_ms_char'] = empty($C['clean_ms_char']) ? 0 : $C['clean_ms_char']; 153 | $C['comment'] = isset($C['comment']) ? $C['comment'] : (empty($C['safe']) ? 3 : 0); 154 | $C['css_expression'] = empty($C['css_expression']) ? 0 : 1; 155 | $C['direct_list_nest'] = empty($C['direct_list_nest']) ? 0 : 1; 156 | $C['hexdec_entity'] = isset($C['hexdec_entity']) ? $C['hexdec_entity'] : 1; 157 | $C['hook'] = (!empty($C['hook']) && is_callable($C['hook'])) ? $C['hook'] : 0; 158 | $C['hook_tag'] = (!empty($C['hook_tag']) && is_callable($C['hook_tag'])) ? $C['hook_tag'] : 0; 159 | $C['keep_bad'] = isset($C['keep_bad']) ? $C['keep_bad'] : 6; 160 | $C['lc_std_val'] = isset($C['lc_std_val']) ? (bool)$C['lc_std_val'] : 1; 161 | $C['make_tag_strict'] = isset($C['make_tag_strict']) ? $C['make_tag_strict'] : 1; 162 | $C['named_entity'] = isset($C['named_entity']) ? (bool)$C['named_entity'] : 1; 163 | $C['no_deprecated_attr'] = isset($C['no_deprecated_attr']) ? $C['no_deprecated_attr'] : 1; 164 | $C['parent'] = isset($C['parent'][0]) ? strtolower($C['parent']) : 'body'; 165 | $C['show_setting'] = !empty($C['show_setting']) ? $C['show_setting'] : 0; 166 | $C['style_pass'] = empty($C['style_pass']) ? 0 : 1; 167 | $C['tidy'] = empty($C['tidy']) ? 
0 : $C['tidy']; 168 | $C['unique_ids'] = isset($C['unique_ids']) && (!preg_match('`\W`', $C['unique_ids'])) ? $C['unique_ids'] : 1; 169 | $C['xml:lang'] = isset($C['xml:lang']) ? $C['xml:lang'] : 0; 170 | 171 | if (isset($GLOBALS['C'])) { 172 | $oldC = $GLOBALS['C']; 173 | } 174 | $GLOBALS['C'] = $C; 175 | 176 | // Set $S array ($spec). 177 | 178 | $S = is_array($S) ? $S : hl_spec($S); 179 | if (isset($GLOBALS['S'])) { 180 | $oldS = $GLOBALS['S']; 181 | } 182 | $GLOBALS['S'] = $S; 183 | 184 | // Handle characters. 185 | 186 | $t = preg_replace('`[\x00-\x08\x0b-\x0c\x0e-\x1f]`', '', $t); // Remove illegal 187 | if ($C['clean_ms_char']) { // Convert MS Windows CP-1252 188 | $x = array("\x7f"=>'', "\x80"=>'€', "\x81"=>'', "\x83"=>'ƒ', "\x85"=>'…', "\x86"=>'†', "\x87"=>'‡', "\x88"=>'ˆ', "\x89"=>'‰', "\x8a"=>'Š', "\x8b"=>'‹', "\x8c"=>'Œ', "\x8d"=>'', "\x8e"=>'Ž', "\x8f"=>'', "\x90"=>'', "\x95"=>'•', "\x96"=>'–', "\x97"=>'—', "\x98"=>'˜', "\x99"=>'™', "\x9a"=>'š', "\x9b"=>'›', "\x9c"=>'œ', "\x9d"=>'', "\x9e"=>'ž', "\x9f"=>'Ÿ'); 189 | $x = $x 190 | + ($C['clean_ms_char'] == 1 191 | ? array("\x82"=>'‚', "\x84"=>'„', "\x91"=>'‘', "\x92"=>'’', "\x93"=>'“', "\x94"=>'”') 192 | : array("\x82"=>'\'', "\x84"=>'"', "\x91"=>'\'', "\x92"=>'\'', "\x93"=>'"', "\x94"=>'"')); 193 | $t = strtr($t, $x); 194 | } 195 | 196 | // Handle CDATA, comments, and entities. 197 | 198 | if ($C['cdata'] || $C['comment']) { 199 | $t = preg_replace_callback('`<!(?:(?:--.*?--)|(?:\[CDATA\[.*?\]\]))>`sm', 'hl_commentCdata', $t); 200 | } 201 | $t = 202 | preg_replace_callback( 203 | '`&amp;([a-zA-Z][a-zA-Z0-9]{1,30}|#(?:[0-9]{1,8}|[Xx][0-9A-Fa-f]{1,7}));`', 204 | 'hl_entity', 205 | str_replace('&', '&amp;', $t)); 206 | if ($C['unique_ids'] && !isset($GLOBALS['hl_Ids'])) { 207 | $GLOBALS['hl_Ids'] = array(); 208 | } 209 | 210 | if ($C['hook']) { 211 | $t = call_user_func($C['hook'], $t, $C, $S); 212 | } 213 | 214 | // Handle remaining text. 
215 | 216 | $t = preg_replace_callback('`<(?:(?:\s|$)|(?:[^>]*(?:>|$)))|>`m', 'hl_tag', $t); 217 | $t = $C['balance'] ? hl_balance($t, $C['keep_bad'], $C['parent']) : $t; 218 | $t = (($C['cdata'] || $C['comment']) && strpos($t, "\x01") !== false) 219 | ? str_replace(array("\x01", "\x02", "\x03", "\x04", "\x05"), array('', '', '&', '<', '>'), $t) 220 | : $t; 221 | $t = $C['tidy'] ? hl_tidy($t, $C['tidy'], $C['parent']) : $t; 222 | 223 | // Cleanup. 224 | 225 | if ($C['show_setting'] && preg_match('`^[a-z][a-z0-9_]*$`i', $C['show_setting'])) { 226 | $GLOBALS[$C['show_setting']] = array('config'=>$C, 'spec'=>$S, 'time'=>microtime(true), 'version'=>hl_version()); 227 | } 228 | unset($C, $eleAr); 229 | if (isset($oldC)) { 230 | $GLOBALS['C'] = $oldC; 231 | } 232 | if (isset($oldS)) { 233 | $GLOBALS['S'] = $oldS; 234 | } 235 | return $t; 236 | } 237 | 238 | /** 239 | * Validate attribute value and possibly reset to a default. 240 | * 241 | * @param string $attr Attribute name. 242 | * @param string $value Attribute value. 243 | * @param array $ruleAr Array of rules derived from $spec. 244 | * @param string $ele Element. 245 | * @return mixed 0 if invalid $value, 246 | * or string with validated or default value. 247 | */ 248 | function hl_attributeValue($attr, $value, $ruleAr, $ele) 249 | { 250 | static $spacedValsAttrAr = array('accesskey', 'class', 'itemtype', 'rel'); // Some attributes have multiple values 251 | $valSep = 252 | (in_array($attr, $spacedValsAttrAr) || ($attr == 'archive' && $ele == 'object')) 253 | ? ' ' 254 | : (($attr == 'sizes' || $attr == 'srcset' || ($attr == 'archive' && $ele == 'applet')) 255 | ? ',' 256 | : ''); 257 | $out = array(); 258 | $valAr = !empty($valSep) ? 
explode($valSep, $value) : array($value); 259 | foreach ($valAr as $v) { 260 | $ok = 1; 261 | $v = trim($v); 262 | $lengthVal = strlen($v); 263 | foreach ($ruleAr as $ruleType=>$ruleVal) { 264 | if (!$lengthVal) { 265 | continue; 266 | } 267 | switch ($ruleType) { 268 | case 'maxlen': if ($lengthVal > $ruleVal) { 269 | $ok = 0; 270 | } 271 | break; case 'minlen': if ($lengthVal < $ruleVal) { 272 | $ok = 0; 273 | } 274 | break; case 'maxval': if ((float)($v) > $ruleVal) { 275 | $ok = 0; 276 | } 277 | break; case 'minval': if ((float)($v) < $ruleVal) { 278 | $ok = 0; 279 | } 280 | break; case 'match': if (!preg_match($ruleVal, $v)) { 281 | $ok = 0; 282 | } 283 | break; case 'nomatch': if (preg_match($ruleVal, $v)) { 284 | $ok = 0; 285 | } 286 | break; case 'oneof': if(!in_array($v, explode('|', $ruleVal))) { 287 | $ok = 0; 288 | } 289 | break; case 'noneof': if(in_array($v, explode('|', $ruleVal))) { 290 | $ok = 0; 291 | } 292 | break; default: 293 | break; 294 | } 295 | if (!$ok) { 296 | break; 297 | } 298 | } 299 | if ($ok) { 300 | $out[] = $v; 301 | } 302 | } 303 | $out = implode($valSep == ',' ? ', ' : ' ', $out); 304 | return (isset($out[0]) ? $out : (isset($ruleAr['default']) ? $ruleAr['default'] : 0)); 305 | } 306 | 307 | /* 308 | * Enforce parent-child validity of elements and balance tags. 309 | * 310 | * @param string $t HTM. Previously partly sanitized/filtered. CDATA 311 | * and comment sections have characters hidden. 312 | * @param int $act $config's keep_bad parameter. 313 | * @param string $parentEle $t's parent element option. 314 | * @return string $t with valid nesting and balanced tags. 315 | */ 316 | function hl_balance($t, $act=1, $parentEle='div') 317 | { 318 | // Group elements in different ways. 
319 | 320 | $closingTagOmitableEleAr = array('caption'=>1, 'colgroup'=>1, 'dd'=>1, 'dt'=>1, 'li'=>1, 'optgroup'=>1, 'option'=>1, 'p'=>1, 'rp'=>1, 'rt'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1); 321 | 322 | // -- Block, inline, etc. 323 | 324 | $blockEleAr = array('a'=>1, 'address'=>1, 'article'=>1, 'aside'=>1, 'blockquote'=>1, 'center'=>1, 'del'=>1, 'details'=>1, 'dialog'=>1, 'dir'=>1, 'dl'=>1, 'div'=>1, 'fieldset'=>1, 'figure'=>1, 'footer'=>1, 'form'=>1, 'ins'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'header'=>1, 'hr'=>1, 'isindex'=>1, 'main'=>1, 'menu'=>1, 'nav'=>1, 'noscript'=>1, 'ol'=>1, 'p'=>1, 'pre'=>1, 'section'=>1, 'slot'=>1, 'style'=>1, 'table'=>1, 'template'=>1, 'ul'=>1); 325 | $inlineEleAr = array('#pcdata'=>1, 'a'=>1, 'abbr'=>1, 'acronym'=>1, 'applet'=>1, 'audio'=>1, 'b'=>1, 'bdi'=>1, 'bdo'=>1, 'big'=>1, 'br'=>1, 'button'=>1, 'canvas'=>1, 'cite'=>1, 'code'=>1, 'command'=>1, 'data'=>1, 'datalist'=>1, 'del'=>1, 'dfn'=>1, 'em'=>1, 'embed'=>1, 'figcaption'=>1, 'font'=>1, 'i'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'ins'=>1, 'kbd'=>1, 'label'=>1, 'link'=>1, 'map'=>1, 'mark'=>1, 'meta'=>1, 'meter'=>1, 'object'=>1, 'output'=>1, 'picture'=>1, 'progress'=>1, 'q'=>1, 'ruby'=>1, 's'=>1, 'samp'=>1, 'select'=>1, 'script'=>1, 'small'=>1, 'span'=>1, 'strike'=>1, 'strong'=>1, 'sub'=>1, 'summary'=>1, 'sup'=>1, 'textarea'=>1, 'time'=>1, 'tt'=>1, 'u'=>1, 'var'=>1, 'video'=>1, 'wbr'=>1); 326 | $otherEleAr = array('area'=>1, 'caption'=>1, 'col'=>1, 'colgroup'=>1, 'command'=>1, 'dd'=>1, 'dt'=>1, 'hgroup'=>1, 'keygen'=>1, 'legend'=>1, 'li'=>1, 'optgroup'=>1, 'option'=>1, 'param'=>1, 'rb'=>1, 'rbc'=>1, 'rp'=>1, 'rt'=>1, 'rtc'=>1, 'script'=>1, 'source'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'thead'=>1, 'th'=>1, 'tr'=>1, 'track'=>1); 327 | $flowEleAr = $blockEleAr + $inlineEleAr; 328 | 329 | // -- Type of child allowed. 
330 | 331 | $blockKidEleAr = array('blockquote'=>1, 'form'=>1, 'map'=>1, 'noscript'=>1); 332 | $flowKidEleAr = array('a'=>1, 'article'=>1, 'aside'=>1, 'audio'=>1, 'button'=>1, 'canvas'=>1, 'del'=>1, 'details'=>1, 'dialog'=>1, 'div'=>1, 'dd'=>1, 'fieldset'=>1, 'figure'=>1, 'footer'=>1, 'header'=>1, 'iframe'=>1, 'ins'=>1, 'li'=>1, 'main'=>1, 'menu'=>1, 'nav'=>1, 'noscript'=>1, 'object'=>1, 'section'=>1, 'slot'=>1, 'style'=>1, 'td'=>1, 'template'=>1, 'th'=>1, 'video'=>1); // Later context-wise dynamic move of ins & del to $inlineKidEleAr 333 | $inlineKidEleAr = array('abbr'=>1, 'acronym'=>1, 'address'=>1, 'b'=>1, 'bdi'=>1, 'bdo'=>1, 'big'=>1, 'caption'=>1, 'cite'=>1, 'code'=>1, 'data'=>1, 'datalist'=>1, 'dfn'=>1, 'dt'=>1, 'em'=>1, 'figcaption'=>1, 'font'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'hgroup'=>1, 'i'=>1, 'kbd'=>1, 'label'=>1, 'legend'=>1, 'mark'=>1, 'meter'=>1, 'output'=>1, 'p'=>1, 'picture'=>1, 'pre'=>1, 'progress'=>1, 'q'=>1, 'rb'=>1, 'rt'=>1, 'ruby'=>1, 's'=>1, 'samp'=>1, 'small'=>1, 'span'=>1, 'strike'=>1, 'strong'=>1, 'sub'=>1, 'summary'=>1, 'sup'=>1, 'time'=>1, 'tt'=>1, 'u'=>1, 'var'=>1); 334 | $noKidEleAr = array('area'=>1, 'br'=>1, 'col'=>1, 'command'=>1, 'embed'=>1, 'hr'=>1, 'img'=>1, 'input'=>1, 'isindex'=>1, 'keygen'=>1, 'link'=>1, 'meta'=>1, 'param'=>1, 'source'=>1, 'track'=>1, 'wbr'=>1); 335 | 336 | // Special parent-child relations. 
337 | 338 | $invalidMomKidAr = array('a'=>array('a'=>1, 'address'=>1, 'button'=>1, 'details'=>1, 'embed'=>1, 'iframe'=>1, 'keygen'=>1, 'label'=>1, 'select'=>1, 'textarea'=>1), 'address'=>array('address'=>1, 'article'=>1, 'aside'=>1, 'footer'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'header'=>1, 'hgroup'=>1, 'keygen'=>1, 'nav'=>1, 'section'=>1), 'audio'=>array('audio'=>1, 'video'=>1), 'button'=>array('a'=>1, 'address'=>1, 'button'=>1, 'details'=>1, 'embed'=>1, 'iframe'=>1, 'keygen'=>1, 'label'=>1, 'select'=>1, 'textarea'=>1), 'dfn'=>array('dfn'=>1), 'fieldset'=>array('fieldset'=>1), 'footer'=>array('footer'=>1, 'header'=>1), 'form'=>array('form'=>1), 'header'=>array('footer'=>1, 'header'=>1), 'label'=>array('label'=>1), 'main'=>array('main'=>1), 'meter'=>array('meter'=>1), 'noscript'=>array('script'=>1), 'progress'=>array('progress'=>1), 'rb'=>array('ruby'=>1), 'rt'=>array('ruby'=>1), 'ruby'=>array('ruby'=>1), 'time'=>array('time'=>1), 'video'=>array('audio'=>1, 'video'=>1)); 339 | $invalidKidEleAr = array('a'=>1, 'address'=>1, 'article'=>1, 'aside'=>1, 'audio'=>1, 'button'=>1, 'details'=>1, 'dfn'=>1, 'embed'=>1, 'fieldset'=>1, 'footer'=>1, 'form'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'header'=>1, 'hgroup'=>1, 'iframe'=>1, 'keygen'=>1, 'label'=>1, 'main'=>1, 'meter'=>1, 'nav'=>1, 'progress'=>1, 'ruby'=>1, 'script'=>1, 'section'=>1, 'select'=>1, 'textarea'=>1, 'time'=>1, 'video'=>1); // $invalidMomKidAr values 340 | $invalidMomEleAr = array_keys($invalidMomKidAr); 341 | $validMomKidAr = array('colgroup'=>array('col'=>1, 'template'=>1), 'datalist'=>array('option'=>1, 'script'=>1), 'dir'=>array('li'=>1), 'dl'=>array('dd'=>1, 'div'=>1, 'dt'=>1), 'hgroup'=>array('h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1), 'menu'=>array('li'=>1, 'script'=>1, 'template'=>1), 'ol'=>array('li'=>1, 'script'=>1, 'template'=>1), 'optgroup'=>array('option'=>1, 'script'=>1, 'template'=>1), 'option'=>array('#pcdata'=>1), 'picture'=>array('img'=>1, 
'script'=>1, 'source'=>1, 'template'=>1), 'rbc'=>array('rb'=>1), 'rp'=>array('#pcdata'=>1), 'rtc'=>array('rp'=>1, 'rt'=>1), 'select'=>array('optgroup'=>1, 'option'=>1), 'script'=>array('#pcdata'=>1), 'table'=>array('caption'=>1, 'col'=>1, 'colgroup'=>1, 'script'=>1, 'tbody'=>1, 'tfoot'=>1, 'thead'=>1, 'tr'=>1, 'template'=>1), 'tbody'=>array('script'=>1, 'template'=>1, 'tr'=>1), 'tfoot'=>array('tr'=>1), 'textarea'=>array('#pcdata'=>1), 'thead'=>array('script'=>1, 'template'=>1, 'tr'=>1), 'tr'=>array('script'=>1, 'td'=>1, 'template'=>1, 'th'=>1), 'ul'=>array('li'=>1, 'script'=>1, 'template'=>1)); // Immediate parent-child relation 342 | if ($GLOBALS['C']['direct_list_nest']) { 343 | $validMomKidAr['ol'] = $validMomKidAr['ul'] = $validMomKidAr['menu'] += array('menu'=>1, 'ol'=>1, 'ul'=>1); 344 | } 345 | $otherValidMomKidAr = array('address'=>array('p'=>1), 'applet'=>array('param'=>1), 'audio'=>array('source'=>1, 'track'=>1), 'blockquote'=>array('script'=>1), 'fieldset'=>array('legend'=>1, '#pcdata'=>1), 'figure'=>array('figcaption'=>1),'form'=>array('script'=>1), 'map'=>array('area'=>1), 'legend'=>array('h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1), 'object'=>array('param'=>1, 'embed'=>1), 'ruby'=>array('rb'=>1, 'rbc'=>1, 'rp'=>1, 'rt'=>1, 'rtc'=>1), 'summary'=>array('h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'hgroup'=>1), 'video'=>array('source'=>1, 'track'=>1)); 346 | 347 | // Valid elements for top-level parent. 348 | 349 | $mom = ((isset($flowEleAr[$parentEle]) && $parentEle != '#pcdata') 350 | || isset($otherEleAr[$parentEle])) 351 | ? $parentEle 352 | : 'div'; 353 | if (isset($noKidEleAr[$mom])) { 354 | return (!$act ? 
'' : str_replace(array('<', '>'), array('&lt;', '&gt;'), $t)); 355 | } 356 | if (isset($validMomKidAr[$mom])) { 357 | $validInMomEleAr = $validMomKidAr[$mom]; 358 | } elseif (isset($inlineKidEleAr[$mom])) { 359 | $validInMomEleAr = $inlineEleAr; 360 | $inlineKidEleAr['del'] = 1; 361 | $inlineKidEleAr['ins'] = 1; 362 | } elseif (isset($flowKidEleAr[$mom])) { 363 | $validInMomEleAr = $flowEleAr; 364 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 365 | } elseif (isset($blockKidEleAr[$mom])) { 366 | $validInMomEleAr = $blockEleAr; 367 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 368 | } 369 | if (isset($otherValidMomKidAr[$mom])) { 370 | $validInMomEleAr = $validInMomEleAr + $otherValidMomKidAr[$mom]; 371 | } 372 | if (isset($invalidMomKidAr[$mom])) { 373 | $validInMomEleAr = array_diff_assoc($validInMomEleAr, $invalidMomKidAr[$mom]); 374 | } 375 | if (strpos($mom, '-')) { // Custom element 376 | $validInMomEleAr = array('*' => 1, '#pcdata' =>1); 377 | } 378 | 379 | // Loop over elements. 380 | 381 | $t = explode('<', $t); 382 | $validKidsOfMom = $openEleQueue = array(); // Queue of opened elements 383 | ob_start(); 384 | for ($i=-1, $eleCount=count($t); ++$i<$eleCount;) { 385 | 386 | // Check element validity as child. Same code as section: Finishing (below). 
387 | 388 | if ($queueLength = count($openEleQueue)) { 389 | $eleNow = array_pop($openEleQueue); 390 | $openEleQueue[] = $eleNow; 391 | if (isset($validMomKidAr[$eleNow])) { 392 | $validKidsOfMom = $validMomKidAr[$eleNow]; 393 | } elseif (isset($inlineKidEleAr[$eleNow])) { 394 | $validKidsOfMom = $inlineEleAr; 395 | $inlineKidEleAr['del'] = 1; 396 | $inlineKidEleAr['ins'] = 1; 397 | } elseif (isset($flowKidEleAr[$eleNow])) { 398 | $validKidsOfMom = $flowEleAr; 399 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 400 | } elseif (isset($blockKidEleAr[$eleNow])) { 401 | $validKidsOfMom = $blockEleAr; 402 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 403 | } 404 | if (isset($otherValidMomKidAr[$eleNow])) { 405 | $validKidsOfMom = $validKidsOfMom + $otherValidMomKidAr[$eleNow]; 406 | } 407 | if (isset($invalidMomKidAr[$eleNow])) { 408 | $validKidsOfMom = array_diff_assoc($validKidsOfMom, $invalidMomKidAr[$eleNow]); 409 | } 410 | if (strpos($eleNow, '-')) { // Custom element 411 | $validKidsOfMom = array('*'=>1, '#pcdata'=>1); 412 | } 413 | } else { 414 | $validKidsOfMom = $validInMomEleAr; 415 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 416 | } 417 | if ( 418 | isset($ele) 419 | && ($act == 1 420 | || (isset($validKidsOfMom['#pcdata']) 421 | && ($act == 3 422 | || $act == 5))) 423 | ) { 424 | echo '<', $slash, $ele, $attrs, '>'; 425 | } 426 | if (isset($content[0])) { 427 | if (strlen(trim($content)) 428 | && (($queueLength && isset($blockKidEleAr[$eleNow])) 429 | || (isset($blockKidEleAr[$mom]) && !$queueLength)) 430 | ) { 431 | echo '<div>', $content, '</div>'; 432 | } elseif ($act < 3 || isset($validKidsOfMom['#pcdata'])) { 433 | echo $content; 434 | } elseif (strpos($content, "\x02\x04")) { 435 | foreach ( 436 | preg_split( 437 | '`(\x01\x02[^\x01\x02]+\x02\x01)`', $content, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY) as $m) { 438 | echo( 439 | substr($m, 0, 2) == "\x01\x02" 440 | ? $m 441 | : ($act > 4 442 | ? preg_replace('`\S`', '', $m) 443 | : '')); 444 | } 445 | } elseif ($act > 4) { 446 | echo preg_replace('`\S`', '', $content); 447 | } 448 | } // End: Check element validity as child 449 | 450 | // Get parts of element. 451 | 452 | if (!preg_match('`^(/?)([a-z][^ >]*)([^>]*)>(.*)`sm', $t[$i], $m)) { 453 | $content = $t[$i]; 454 | continue; 455 | } 456 | $slash = null; // Closing tag's slash 457 | $ele = null; // Name 458 | $attrs = null; // Attribute string 459 | $content = null; // Content 460 | list($all, $slash, $ele, $attrs, $content) = $m; 461 | 462 | // Handle closing tag. 463 | 464 | if ($slash) { 465 | if (isset($noKidEleAr[$ele]) || !in_array($ele, $openEleQueue)) { // Element empty type or unopened 466 | continue; 467 | } 468 | if ($eleNow == $ele) { // Last open tag 469 | array_pop($openEleQueue); 470 | echo '</', $ele, '>'; 471 | unset($ele); 472 | continue; 473 | } 474 | $closedTags = ''; // Nesting, so close open elements as necessary 475 | for ($j=-1, $cj=count($openEleQueue); ++$j<$cj;) { 476 | if (($closableEle = array_pop($openEleQueue)) == $ele) { 477 | break; 478 | } else { 479 | $closedTags .= "</{$closableEle}>"; 480 | } 481 | } 482 | echo $closedTags, '</', $ele, '>'; 483 | unset($ele); 484 | continue; 485 | } 486 | 487 | // Handle opening tag. 488 | 489 | if (isset($blockKidEleAr[$ele]) && strlen(trim($content))) { // $blockKidEleAr element needs $blockEleAr element 490 | $t[$i] = "{$ele}{$attrs}>"; 491 | array_splice($t, $i+1, 0, 'div>'. 
$content); 492 | unset($ele, $content); 493 | ++$eleCount; 494 | --$i; 495 | continue; 496 | } 497 | if (strpos($ele, '-')) { // Custom element 498 | $validKidsOfMom[$ele] = 1; 499 | } 500 | if ((($queueLength && isset($blockKidEleAr[$eleNow])) 501 | || (isset($blockKidEleAr[$mom]) && !$queueLength)) 502 | && !isset($blockEleAr[$ele]) 503 | && !isset($validKidsOfMom[$ele]) 504 | && !isset($validKidsOfMom['*']) 505 | ) { 506 | array_splice($t, $i, 0, 'div>'); 507 | unset($ele, $content); 508 | ++$eleCount; 509 | --$i; 510 | continue; 511 | } 512 | if ( 513 | !$queueLength 514 | || !isset($invalidKidEleAr[$ele]) 515 | || !array_intersect($openEleQueue, $invalidMomEleAr) 516 | ) { // If no open element; mostly immediate parent-child relation should hold 517 | if (!isset($validKidsOfMom[$ele]) && !isset($validKidsOfMom['*'])) { 518 | if ($queueLength && isset($closingTagOmitableEleAr[$eleNow])) { 519 | echo '</', array_pop($openEleQueue), '>'; 520 | unset($ele, $content); 521 | --$i; 522 | } 523 | continue; 524 | } 525 | if (!isset($noKidEleAr[$ele])) { 526 | $openEleQueue[] = $ele; 527 | } 528 | echo '<', $ele, $attrs, '>'; 529 | unset($ele); 530 | continue; 531 | } 532 | if (isset($validMomKidAr[$eleNow][$ele])) { // Specific parent-child relation 533 | if (!isset($noKidEleAr[$ele])) { 534 | $openEleQueue[] = $ele; 535 | } 536 | echo '<', $ele, $attrs, '>'; 537 | unset($ele); 538 | continue; 539 | } 540 | $closedTags = ''; // Nesting, so close open elements as needed 541 | $openEleQueue2 = array(); 542 | for ($k=-1, $kc=count($openEleQueue); ++$k<$kc;) { 543 | $closableEle = $openEleQueue[$k]; 544 | $validKids2 = array(); 545 | if (isset($validMomKidAr[$closableEle])) { 546 | $openEleQueue2[] = $closableEle; 547 | continue; 548 | } 549 | $validKids2 = isset($inlineKidEleAr[$closableEle]) ? 
$inlineEleAr : $flowEleAr; 550 | if (isset($otherValidMomKidAr[$closableEle])) { 551 | $validKids2 = $validKids2 + $otherValidMomKidAr[$closableEle]; 552 | } 553 | if (isset($invalidMomKidAr[$closableEle])) { 554 | $validKids2 = array_diff_assoc($validKids2, $invalidMomKidAr[$closableEle]); 555 | } 556 | if (!isset($validKids2[$ele]) && !strpos($ele, '-')) { 557 | if (!$k && !isset($validInMomEleAr[$ele]) && !isset($validInMomEleAr['*'])) { 558 | continue 2; 559 | } 560 | $closedTags = "</{$closableEle}>"; 561 | for (;++$k<$kc;) { 562 | $closedTags = "</{$openEleQueue[$k]}>{$closedTags}"; 563 | } 564 | break; 565 | } else { 566 | $openEleQueue2[] = $closableEle; 567 | } 568 | } 569 | $openEleQueue = $openEleQueue2; 570 | if (!isset($noKidEleAr[$ele])) { 571 | $openEleQueue[] = $ele; 572 | } 573 | echo $closedTags, '<', $ele, $attrs, '>'; 574 | unset($ele); 575 | continue; 576 | } // End of For: loop over elements 577 | 578 | // Finishing. Same code as: 'Check element validity as child'. 579 | 580 | if ($queueLength = count($openEleQueue)) { 581 | $eleNow = array_pop($openEleQueue); 582 | $openEleQueue[] = $eleNow; 583 | if (isset($validMomKidAr[$eleNow])) { 584 | $validKidsOfMom = $validMomKidAr[$eleNow]; 585 | } elseif (isset($inlineKidEleAr[$eleNow])) { 586 | $validKidsOfMom = $inlineEleAr; 587 | $inlineKidEleAr['del'] = 1; 588 | $inlineKidEleAr['ins'] = 1; 589 | } elseif (isset($flowKidEleAr[$eleNow])) { 590 | $validKidsOfMom = $flowEleAr; 591 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 592 | } elseif (isset($blockKidEleAr[$eleNow])) { 593 | $validKidsOfMom = $blockEleAr; 594 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 595 | } 596 | if (isset($otherValidMomKidAr[$eleNow])) { 597 | $validKidsOfMom = $validKidsOfMom + $otherValidMomKidAr[$eleNow]; 598 | } 599 | if (isset($invalidMomKidAr[$eleNow])) { 600 | $validKidsOfMom = array_diff_assoc($validKidsOfMom, $invalidMomKidAr[$eleNow]); 601 | } 602 | if (strpos($eleNow, '-')) { // Custom element 603 | $validKidsOfMom = 
array('*'=>1, '#pcdata'=>1); 604 | } 605 | } else { 606 | $validKidsOfMom = $validInMomEleAr; 607 | unset($inlineKidEleAr['del'], $inlineKidEleAr['ins']); 608 | } 609 | if ( 610 | isset($ele) 611 | && ($act == 1 612 | || (isset($validKidsOfMom['#pcdata']) 613 | && ($act == 3 614 | || $act == 5))) 615 | ) { 616 | echo '<', $slash, $ele, $attrs, '>'; 617 | } 618 | if (isset($content[0])) { 619 | if ( 620 | strlen(trim($content)) 621 | && (($queueLength && isset($blockKidEleAr[$eleNow])) 622 | || (isset($blockKidEleAr[$mom]) && !$queueLength)) 623 | ) { 624 | echo '<div>', $content, '</div>'; 625 | } elseif ($act < 3 || isset($validKidsOfMom['#pcdata'])) { 626 | echo $content; 627 | } elseif (strpos($content, "\x02\x04")) { 628 | foreach ( 629 | preg_split( 630 | '`(\x01\x02[^\x01\x02]+\x02\x01)`', $content, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY) as $m) { 631 | echo( 632 | substr($m, 0, 2) == "\x01\x02" 633 | ? $m 634 | : ($act > 4 635 | ? preg_replace('`\S`', '', $m) 636 | : '')); 637 | } 638 | } elseif ($act > 4) { 639 | echo preg_replace('`\S`', '', $content); 640 | } 641 | } // End: Finishing 642 | 643 | while (!empty($openEleQueue) && ($ele = array_pop($openEleQueue))) { 644 | echo '</', $ele, '>'; 645 | } 646 | $o = ob_get_contents(); 647 | ob_end_clean(); 648 | return $o; 649 | } 650 | 651 | /** 652 | * Handle comment/CDATA section. 653 | * 654 | * Filter/sanitize as per $config and disguise special characters. 655 | * 656 | * @param array $t Array result of preg_replace, with potential comment/CDATA. 657 | * @return string Sanitized comment/CDATA with hidden special characters. 658 | */ 659 | function hl_commentCdata($t) 660 | { 661 | $t = $t[0]; 662 | global $C; 663 | if (!($rule = $C[$type = $t[3] == '-' ? 'comment' : 'cdata'])) { 664 | return $t; 665 | } 666 | if ($rule == 1) { 667 | return ''; 668 | } 669 | if ($type == 'comment') { 670 | if (substr(($t = preg_replace('`--+`', '-', substr($t, 4, -3))), -1) != ' ') { 671 | $t .= $rule == 4 ? '' : ' '; 672 | } 673 | } else { 674 | $t = substr($t, 1, -1); 675 | } 676 | $t = $rule == 2 ? str_replace(array('&', '<', '>'), array('&amp;', '&lt;', '&gt;'), $t) : $t; 677 | return 678 | str_replace( 679 | array('&', '<', '>'), 680 | array("\x03", "\x04", "\x05"), 681 | ($type == 'comment' ? "\x01\x02\x04!--$t--\x05\x02\x01" : "\x01\x01\x04$t\x05\x01\x01")); 682 | } 683 | 684 | /** 685 | * Transform deprecated element, with any attribute, into a new element. 686 | * 687 | * 688 | * @param string $ele Deprecated element. 689 | * @param string $attrStr Attribute string of element. 
690 | * @param int $act No transformation if 2. 691 | * @return mixed New attribute string (may be empty) or 0. 692 | */ 693 | function hl_deprecatedElement(&$ele, &$attrStr, $act=1) 694 | { 695 | if ($ele == 'big') { 696 | $ele = 'span'; 697 | return 'font-size: larger;'; 698 | } 699 | if ($ele == 's' || $ele == 'strike') { 700 | $ele = 'span'; 701 | return 'text-decoration: line-through;'; 702 | } 703 | if ($ele == 'tt') { 704 | $ele = 'code'; 705 | return ''; 706 | } 707 | if ($ele == 'center') { 708 | $ele = 'div'; 709 | return 'text-align: center;'; 710 | } 711 | static $fontSizeAr = array('0'=>'xx-small', '1'=>'xx-small', '2'=>'small', '3'=>'medium', '4'=>'large', '5'=>'x-large', '6'=>'xx-large', '7'=>'300%', '-1'=>'smaller', '-2'=>'60%', '+1'=>'larger', '+2'=>'150%', '+3'=>'200%', '+4'=>'300%'); 712 | if ($ele == 'font') { 713 | $attrStrNew = ''; 714 | while (preg_match('`(^|\s)(color|size)\s*=\s*(\'|")?(.+?)(\\3|\s|$)`i', $attrStr, $m)) { 715 | $attrStr = str_replace($m[0], ' ', $attrStr) ; 716 | $attrStrNew .= 717 | strtolower($m[2]) == 'color' 718 | ? ' color: '. str_replace(array('"', ';', ':'), '\'', trim($m[4])). ';' 719 | : (isset($fontSizeAr[($m = trim($m[4]))]) 720 | ? ' font-size: '. $fontSizeAr[$m]. ';' 721 | : ''); 722 | } 723 | while ( 724 | preg_match('`(^|\s)face\s*=\s*(\'|")?([^=]+?)\\2`i', $attrStr, $m) 725 | || preg_match('`(^|\s)face\s*=(\s*)(\S+)`i', $attrStr, $m) 726 | ) { 727 | $attrStr = str_replace($m[0], ' ', $attrStr) ; 728 | $attrStrNew .= ' font-family: '. str_replace(array('"', ';', ':'), '\'', trim($m[3])). ';'; 729 | } 730 | $ele = 'span'; 731 | return ltrim(str_replace('<', '', $attrStrNew)); 732 | } 733 | if ($ele == 'acronym') { 734 | $ele = 'abbr'; 735 | return ''; 736 | } 737 | if ($ele == 'dir') { 738 | $ele = 'ul'; 739 | return ''; 740 | } 741 | if ($act == 2) { 742 | $ele = 0; 743 | return 0; 744 | } 745 | return ''; 746 | } 747 | 748 | /** 749 | * Handle entity. 
750 | * 751 | * As needed, convert to named/hexadecimal form, or neutralize '&' as '&'. 752 | * 753 | * @param array $t Array result of preg_replace, with potential entity. 754 | * @return string Neutralized or converted entity. 755 | */ 756 | function hl_entity($t) 757 | { 758 | global $C; 759 | $t = $t[1]; 760 | static $reservedEntAr = array('amp'=>1, 'AMP'=>1, 'gt'=>1, 'GT'=>1, 'lt'=>1, 'LT'=>1, 'quot'=>1, 'QUOT'=>1); 761 | static $commonEntNameAr = array('Aacute'=>'193', 'aacute'=>'225', 'Acirc'=>'194', 'acirc'=>'226', 'acute'=>'180', 'AElig'=>'198', 'aelig'=>'230', 'Agrave'=>'192', 'agrave'=>'224', 'alefsym'=>'8501', 'Alpha'=>'913', 'alpha'=>'945', 'and'=>'8743', 'ang'=>'8736', 'apos'=>'39', 'Aring'=>'197', 'aring'=>'229', 'asymp'=>'8776', 'Atilde'=>'195', 'atilde'=>'227', 'Auml'=>'196', 'auml'=>'228', 'bdquo'=>'8222', 'Beta'=>'914', 'beta'=>'946', 'brvbar'=>'166', 'bull'=>'8226', 'cap'=>'8745', 'Ccedil'=>'199', 'ccedil'=>'231', 'cedil'=>'184', 'cent'=>'162', 'Chi'=>'935', 'chi'=>'967', 'circ'=>'710', 'clubs'=>'9827', 'cong'=>'8773', 'copy'=>'169', 'crarr'=>'8629', 'cup'=>'8746', 'curren'=>'164', 'dagger'=>'8224', 'Dagger'=>'8225', 'darr'=>'8595', 'dArr'=>'8659', 'deg'=>'176', 'Delta'=>'916', 'delta'=>'948', 'diams'=>'9830', 'divide'=>'247', 'Eacute'=>'201', 'eacute'=>'233', 'Ecirc'=>'202', 'ecirc'=>'234', 'Egrave'=>'200', 'egrave'=>'232', 'empty'=>'8709', 'emsp'=>'8195', 'ensp'=>'8194', 'Epsilon'=>'917', 'epsilon'=>'949', 'equiv'=>'8801', 'Eta'=>'919', 'eta'=>'951', 'ETH'=>'208', 'eth'=>'240', 'Euml'=>'203', 'euml'=>'235', 'euro'=>'8364', 'exist'=>'8707', 'fnof'=>'402', 'forall'=>'8704', 'frac12'=>'189', 'frac14'=>'188', 'frac34'=>'190', 'frasl'=>'8260', 'Gamma'=>'915', 'gamma'=>'947', 'ge'=>'8805', 'harr'=>'8596', 'hArr'=>'8660', 'hearts'=>'9829', 'hellip'=>'8230', 'Iacute'=>'205', 'iacute'=>'237', 'Icirc'=>'206', 'icirc'=>'238', 'iexcl'=>'161', 'Igrave'=>'204', 'igrave'=>'236', 'image'=>'8465', 'infin'=>'8734', 'int'=>'8747', 'Iota'=>'921', 'iota'=>'953', 
'iquest'=>'191', 'isin'=>'8712', 'Iuml'=>'207', 'iuml'=>'239', 'Kappa'=>'922', 'kappa'=>'954', 'Lambda'=>'923', 'lambda'=>'955', 'laquo'=>'171', 'larr'=>'8592', 'lArr'=>'8656', 'lceil'=>'8968', 'ldquo'=>'8220', 'le'=>'8804', 'lfloor'=>'8970', 'lowast'=>'8727', 'loz'=>'9674', 'lrm'=>'8206', 'lsaquo'=>'8249', 'lsquo'=>'8216', 'macr'=>'175', 'mdash'=>'8212', 'micro'=>'181', 'middot'=>'183', 'minus'=>'8722', 'Mu'=>'924', 'mu'=>'956', 'nabla'=>'8711', 'nbsp'=>'160', 'ndash'=>'8211', 'ne'=>'8800', 'ni'=>'8715', 'not'=>'172', 'notin'=>'8713', 'nsub'=>'8836', 'Ntilde'=>'209', 'ntilde'=>'241', 'Nu'=>'925', 'nu'=>'957', 'Oacute'=>'211', 'oacute'=>'243', 'Ocirc'=>'212', 'ocirc'=>'244', 'OElig'=>'338', 'oelig'=>'339', 'Ograve'=>'210', 'ograve'=>'242', 'oline'=>'8254', 'Omega'=>'937', 'omega'=>'969', 'Omicron'=>'927', 'omicron'=>'959', 'oplus'=>'8853', 'or'=>'8744', 'ordf'=>'170', 'ordm'=>'186', 'Oslash'=>'216', 'oslash'=>'248', 'Otilde'=>'213', 'otilde'=>'245', 'otimes'=>'8855', 'Ouml'=>'214', 'ouml'=>'246', 'para'=>'182', 'part'=>'8706', 'permil'=>'8240', 'perp'=>'8869', 'Phi'=>'934', 'phi'=>'966', 'Pi'=>'928', 'pi'=>'960', 'piv'=>'982', 'plusmn'=>'177', 'pound'=>'163', 'prime'=>'8242', 'Prime'=>'8243', 'prod'=>'8719', 'prop'=>'8733', 'Psi'=>'936', 'psi'=>'968', 'radic'=>'8730', 'raquo'=>'187', 'rarr'=>'8594', 'rArr'=>'8658', 'rceil'=>'8969', 'rdquo'=>'8221', 'real'=>'8476', 'reg'=>'174', 'rfloor'=>'8971', 'Rho'=>'929', 'rho'=>'961', 'rlm'=>'8207', 'rsaquo'=>'8250', 'rsquo'=>'8217', 'sbquo'=>'8218', 'Scaron'=>'352', 'scaron'=>'353', 'sdot'=>'8901', 'sect'=>'167', 'shy'=>'173', 'Sigma'=>'931', 'sigma'=>'963', 'sigmaf'=>'962', 'sim'=>'8764', 'spades'=>'9824', 'sub'=>'8834', 'sube'=>'8838', 'sum'=>'8721', 'sup'=>'8835', 'sup1'=>'185', 'sup2'=>'178', 'sup3'=>'179', 'supe'=>'8839', 'szlig'=>'223', 'Tau'=>'932', 'tau'=>'964', 'there4'=>'8756', 'Theta'=>'920', 'theta'=>'952', 'thetasym'=>'977', 'thinsp'=>'8201', 'THORN'=>'222', 'thorn'=>'254', 'tilde'=>'732', 'times'=>'215', 
'trade'=>'8482', 'Uacute'=>'218', 'uacute'=>'250', 'uarr'=>'8593', 'uArr'=>'8657', 'Ucirc'=>'219', 'ucirc'=>'251', 'Ugrave'=>'217', 'ugrave'=>'249', 'uml'=>'168', 'upsih'=>'978', 'Upsilon'=>'933', 'upsilon'=>'965', 'Uuml'=>'220', 'uuml'=>'252', 'weierp'=>'8472', 'Xi'=>'926', 'xi'=>'958', 'Yacute'=>'221', 'yacute'=>'253', 'yen'=>'165', 'yuml'=>'255', 'Yuml'=>'376', 'Zeta'=>'918', 'zeta'=>'950', 'zwj'=>'8205', 'zwnj'=>'8204'); 762 | static $rareEntNameAr = array('Abreve'=>'258', 'abreve'=>'259', 'ac'=>'8766', 'acd'=>'8767', 'Acy'=>'1040', 'acy'=>'1072', 'af'=>'8289', 'Afr'=>'120068', 'afr'=>'120094', 'aleph'=>'8501', 'Amacr'=>'256', 'amacr'=>'257', 'amalg'=>'10815', 'And'=>'10835', 'andand'=>'10837', 'andd'=>'10844', 'andslope'=>'10840', 'andv'=>'10842', 'ange'=>'10660', 'angle'=>'8736', 'angmsd'=>'8737', 'angmsdaa'=>'10664', 'angmsdab'=>'10665', 'angmsdac'=>'10666', 'angmsdad'=>'10667', 'angmsdae'=>'10668', 'angmsdaf'=>'10669', 'angmsdag'=>'10670', 'angmsdah'=>'10671', 'angrt'=>'8735', 'angrtvb'=>'8894', 'angrtvbd'=>'10653', 'angsph'=>'8738', 'angst'=>'197', 'angzarr'=>'9084', 'Aogon'=>'260', 'aogon'=>'261', 'Aopf'=>'120120', 'aopf'=>'120146', 'ap'=>'8776', 'apacir'=>'10863', 'apE'=>'10864', 'ape'=>'8778', 'apid'=>'8779', 'ApplyFunction'=>'8289', 'approx'=>'8776', 'approxeq'=>'8778', 'Ascr'=>'119964', 'ascr'=>'119990', 'Assign'=>'8788', 'ast'=>'42', 'asympeq'=>'8781', 'awconint'=>'8755', 'awint'=>'10769', 'backcong'=>'8780', 'backepsilon'=>'1014', 'backprime'=>'8245', 'backsim'=>'8765', 'backsimeq'=>'8909', 'Backslash'=>'8726', 'Barv'=>'10983', 'barvee'=>'8893', 'barwed'=>'8965', 'Barwed'=>'8966', 'barwedge'=>'8965', 'bbrk'=>'9141', 'bbrktbrk'=>'9142', 'bcong'=>'8780', 'Bcy'=>'1041', 'bcy'=>'1073', 'becaus'=>'8757', 'because'=>'8757', 'Because'=>'8757', 'bemptyv'=>'10672', 'bepsi'=>'1014', 'bernou'=>'8492', 'Bernoullis'=>'8492', 'beth'=>'8502', 'between'=>'8812', 'Bfr'=>'120069', 'bfr'=>'120095', 'bigcap'=>'8898', 'bigcirc'=>'9711', 'bigcup'=>'8899', 
'bigodot'=>'10752', 'bigoplus'=>'10753', 'bigotimes'=>'10754', 'bigsqcup'=>'10758', 'bigstar'=>'9733', 'bigtriangledown'=>'9661', 'bigtriangleup'=>'9651', 'biguplus'=>'10756', 'bigvee'=>'8897', 'bigwedge'=>'8896', 'bkarow'=>'10509', 'blacklozenge'=>'10731', 'blacksquare'=>'9642', 'blacktriangle'=>'9652', 'blacktriangledown'=>'9662', 'blacktriangleleft'=>'9666', 'blacktriangleright'=>'9656', 'blank'=>'9251', 'blk12'=>'9618', 'blk14'=>'9617', 'blk34'=>'9619', 'block'=>'9608', 'bNot'=>'10989', 'bnot'=>'8976', 'Bopf'=>'120121', 'bopf'=>'120147', 'bot'=>'8869', 'bottom'=>'8869', 'bowtie'=>'8904', 'boxbox'=>'10697', 'boxdl'=>'9488', 'boxdL'=>'9557', 'boxDl'=>'9558', 'boxDL'=>'9559', 'boxdr'=>'9484', 'boxdR'=>'9554', 'boxDr'=>'9555', 'boxDR'=>'9556', 'boxh'=>'9472', 'boxH'=>'9552', 'boxhd'=>'9516', 'boxHd'=>'9572', 'boxhD'=>'9573', 'boxHD'=>'9574', 'boxhu'=>'9524', 'boxHu'=>'9575', 'boxhU'=>'9576', 'boxHU'=>'9577', 'boxminus'=>'8863', 'boxplus'=>'8862', 'boxtimes'=>'8864', 'boxul'=>'9496', 'boxuL'=>'9563', 'boxUl'=>'9564', 'boxUL'=>'9565', 'boxur'=>'9492', 'boxuR'=>'9560', 'boxUr'=>'9561', 'boxUR'=>'9562', 'boxv'=>'9474', 'boxV'=>'9553', 'boxvh'=>'9532', 'boxvH'=>'9578', 'boxVh'=>'9579', 'boxVH'=>'9580', 'boxvl'=>'9508', 'boxvL'=>'9569', 'boxVl'=>'9570', 'boxVL'=>'9571', 'boxvr'=>'9500', 'boxvR'=>'9566', 'boxVr'=>'9567', 'boxVR'=>'9568', 'bprime'=>'8245', 'breve'=>'728', 'Breve'=>'728', 'bscr'=>'119991', 'Bscr'=>'8492', 'bsemi'=>'8271', 'bsim'=>'8765', 'bsime'=>'8909', 'bsol'=>'92', 'bsolb'=>'10693', 'bsolhsub'=>'10184', 'bullet'=>'8226', 'bump'=>'8782', 'bumpE'=>'10926', 'bumpe'=>'8783', 'Bumpeq'=>'8782', 'bumpeq'=>'8783', 'Cacute'=>'262', 'cacute'=>'263', 'Cap'=>'8914', 'capand'=>'10820', 'capbrcup'=>'10825', 'capcap'=>'10827', 'capcup'=>'10823', 'capdot'=>'10816', 'CapitalDifferentialD'=>'8517', 'caret'=>'8257', 'caron'=>'711', 'Cayleys'=>'8493', 'ccaps'=>'10829', 'Ccaron'=>'268', 'ccaron'=>'269', 'Ccirc'=>'264', 'ccirc'=>'265', 'Cconint'=>'8752', 'ccups'=>'10828', 
'ccupssm'=>'10832', 'Cdot'=>'266', 'cdot'=>'267', 'Cedilla'=>'184', 'cemptyv'=>'10674', 'centerdot'=>'183', 'CenterDot'=>'183', 'cfr'=>'120096', 'Cfr'=>'8493', 'CHcy'=>'1063', 'chcy'=>'1095', 'check'=>'10003', 'checkmark'=>'10003', 'cir'=>'9675', 'circeq'=>'8791', 'circlearrowleft'=>'8634', 'circlearrowright'=>'8635', 'circledast'=>'8859', 'circledcirc'=>'8858', 'circleddash'=>'8861', 'CircleDot'=>'8857', 'circledR'=>'174', 'circledS'=>'9416', 'CircleMinus'=>'8854', 'CirclePlus'=>'8853', 'CircleTimes'=>'8855', 'cirE'=>'10691', 'cire'=>'8791', 'cirfnint'=>'10768', 'cirmid'=>'10991', 'cirscir'=>'10690', 'ClockwiseContourIntegral'=>'8754', 'CloseCurlyDoubleQuote'=>'8221', 'CloseCurlyQuote'=>'8217', 'clubsuit'=>'9827', 'colon'=>'58', 'Colon'=>'8759', 'Colone'=>'10868', 'colone'=>'8788', 'coloneq'=>'8788', 'comma'=>'44', 'commat'=>'64', 'comp'=>'8705', 'compfn'=>'8728', 'complement'=>'8705', 'complexes'=>'8450', 'congdot'=>'10861', 'Congruent'=>'8801', 'conint'=>'8750', 'Conint'=>'8751', 'ContourIntegral'=>'8750', 'copf'=>'120148', 'Copf'=>'8450', 'coprod'=>'8720', 'Coproduct'=>'8720', 'COPY'=>'169', 'copysr'=>'8471', 'CounterClockwiseContourIntegral'=>'8755', 'cross'=>'10007', 'Cross'=>'10799', 'Cscr'=>'119966', 'cscr'=>'119992', 'csub'=>'10959', 'csube'=>'10961', 'csup'=>'10960', 'csupe'=>'10962', 'ctdot'=>'8943', 'cudarrl'=>'10552', 'cudarrr'=>'10549', 'cuepr'=>'8926', 'cuesc'=>'8927', 'cularr'=>'8630', 'cularrp'=>'10557', 'Cup'=>'8915', 'cupbrcap'=>'10824', 'cupcap'=>'10822', 'CupCap'=>'8781', 'cupcup'=>'10826', 'cupdot'=>'8845', 'cupor'=>'10821', 'curarr'=>'8631', 'curarrm'=>'10556', 'curlyeqprec'=>'8926', 'curlyeqsucc'=>'8927', 'curlyvee'=>'8910', 'curlywedge'=>'8911', 'curvearrowleft'=>'8630', 'curvearrowright'=>'8631', 'cuvee'=>'8910', 'cuwed'=>'8911', 'cwconint'=>'8754', 'cwint'=>'8753', 'cylcty'=>'9005', 'daleth'=>'8504', 'Darr'=>'8609', 'dash'=>'8208', 'Dashv'=>'10980', 'dashv'=>'8867', 'dbkarow'=>'10511', 'dblac'=>'733', 'Dcaron'=>'270', 'dcaron'=>'271', 
'Dcy'=>'1044', 'dcy'=>'1076', 'DD'=>'8517', 'dd'=>'8518', 'ddagger'=>'8225', 'ddarr'=>'8650', 'DDotrahd'=>'10513', 'ddotseq'=>'10871', 'Del'=>'8711', 'demptyv'=>'10673', 'dfisht'=>'10623', 'Dfr'=>'120071', 'dfr'=>'120097', 'dHar'=>'10597', 'dharl'=>'8643', 'dharr'=>'8642', 'DiacriticalAcute'=>'180', 'DiacriticalDot'=>'729', 'DiacriticalDoubleAcute'=>'733', 'DiacriticalGrave'=>'96', 'DiacriticalTilde'=>'732', 'diam'=>'8900', 'diamond'=>'8900', 'Diamond'=>'8900', 'diamondsuit'=>'9830', 'die'=>'168', 'DifferentialD'=>'8518', 'digamma'=>'989', 'disin'=>'8946', 'div'=>'247', 'divideontimes'=>'8903', 'divonx'=>'8903', 'DJcy'=>'1026', 'djcy'=>'1106', 'dlcorn'=>'8990', 'dlcrop'=>'8973', 'dollar'=>'36', 'Dopf'=>'120123', 'dopf'=>'120149', 'Dot'=>'168', 'dot'=>'729', 'DotDot'=>'8412', 'doteq'=>'8784', 'doteqdot'=>'8785', 'DotEqual'=>'8784', 'dotminus'=>'8760', 'dotplus'=>'8724', 'dotsquare'=>'8865', 'doublebarwedge'=>'8966', 'DoubleContourIntegral'=>'8751', 'DoubleDot'=>'168', 'DoubleDownArrow'=>'8659', 'DoubleLeftArrow'=>'8656', 'DoubleLeftRightArrow'=>'8660', 'DoubleLeftTee'=>'10980', 'DoubleLongLeftArrow'=>'10232', 'DoubleLongLeftRightArrow'=>'10234', 'DoubleLongRightArrow'=>'10233', 'DoubleRightArrow'=>'8658', 'DoubleRightTee'=>'8872', 'DoubleUpArrow'=>'8657', 'DoubleUpDownArrow'=>'8661', 'DoubleVerticalBar'=>'8741', 'downarrow'=>'8595', 'DownArrow'=>'8595', 'Downarrow'=>'8659', 'DownArrowBar'=>'10515', 'DownArrowUpArrow'=>'8693', 'DownBreve'=>'785', 'downdownarrows'=>'8650', 'downharpoonleft'=>'8643', 'downharpoonright'=>'8642', 'DownLeftRightVector'=>'10576', 'DownLeftTeeVector'=>'10590', 'DownLeftVector'=>'8637', 'DownLeftVectorBar'=>'10582', 'DownRightTeeVector'=>'10591', 'DownRightVector'=>'8641', 'DownRightVectorBar'=>'10583', 'DownTee'=>'8868', 'DownTeeArrow'=>'8615', 'drbkarow'=>'10512', 'drcorn'=>'8991', 'drcrop'=>'8972', 'Dscr'=>'119967', 'dscr'=>'119993', 'DScy'=>'1029', 'dscy'=>'1109', 'dsol'=>'10742', 'Dstrok'=>'272', 'dstrok'=>'273', 'dtdot'=>'8945', 
'dtri'=>'9663', 'dtrif'=>'9662', 'duarr'=>'8693', 'duhar'=>'10607', 'dwangle'=>'10662', 'DZcy'=>'1039', 'dzcy'=>'1119', 'dzigrarr'=>'10239', 'easter'=>'10862', 'Ecaron'=>'282', 'ecaron'=>'283', 'ecir'=>'8790', 'ecolon'=>'8789', 'Ecy'=>'1069', 'ecy'=>'1101', 'eDDot'=>'10871', 'Edot'=>'278', 'edot'=>'279', 'eDot'=>'8785', 'ee'=>'8519', 'efDot'=>'8786', 'Efr'=>'120072', 'efr'=>'120098', 'eg'=>'10906', 'egs'=>'10902', 'egsdot'=>'10904', 'el'=>'10905', 'Element'=>'8712', 'elinters'=>'9191', 'ell'=>'8467', 'els'=>'10901', 'elsdot'=>'10903', 'Emacr'=>'274', 'emacr'=>'275', 'emptyset'=>'8709', 'EmptySmallSquare'=>'9723', 'emptyv'=>'8709', 'EmptyVerySmallSquare'=>'9643', 'emsp13'=>'8196', 'emsp14'=>'8197', 'ENG'=>'330', 'eng'=>'331', 'Eogon'=>'280', 'eogon'=>'281', 'Eopf'=>'120124', 'eopf'=>'120150', 'epar'=>'8917', 'eparsl'=>'10723', 'eplus'=>'10865', 'epsi'=>'949', 'epsiv'=>'1013', 'eqcirc'=>'8790', 'eqcolon'=>'8789', 'eqsim'=>'8770', 'eqslantgtr'=>'10902', 'eqslantless'=>'10901', 'Equal'=>'10869', 'equals'=>'61', 'EqualTilde'=>'8770', 'equest'=>'8799', 'Equilibrium'=>'8652', 'equivDD'=>'10872', 'eqvparsl'=>'10725', 'erarr'=>'10609', 'erDot'=>'8787', 'escr'=>'8495', 'Escr'=>'8496', 'esdot'=>'8784', 'Esim'=>'10867', 'esim'=>'8770', 'excl'=>'33', 'Exists'=>'8707', 'expectation'=>'8496', 'exponentiale'=>'8519', 'ExponentialE'=>'8519', 'fallingdotseq'=>'8786', 'Fcy'=>'1060', 'fcy'=>'1092', 'female'=>'9792', 'ffilig'=>'64259', 'fflig'=>'64256', 'ffllig'=>'64260', 'Ffr'=>'120073', 'ffr'=>'120099', 'filig'=>'64257', 'FilledSmallSquare'=>'9724', 'FilledVerySmallSquare'=>'9642', 'flat'=>'9837', 'fllig'=>'64258', 'fltns'=>'9649', 'Fopf'=>'120125', 'fopf'=>'120151', 'ForAll'=>'8704', 'fork'=>'8916', 'forkv'=>'10969', 'Fouriertrf'=>'8497', 'fpartint'=>'10765', 'frac13'=>'8531', 'frac15'=>'8533', 'frac16'=>'8537', 'frac18'=>'8539', 'frac23'=>'8532', 'frac25'=>'8534', 'frac35'=>'8535', 'frac38'=>'8540', 'frac45'=>'8536', 'frac56'=>'8538', 'frac58'=>'8541', 'frac78'=>'8542', 
'frown'=>'8994', 'fscr'=>'119995', 'Fscr'=>'8497', 'gacute'=>'501', 'Gammad'=>'988', 'gammad'=>'989', 'gap'=>'10886', 'Gbreve'=>'286', 'gbreve'=>'287', 'Gcedil'=>'290', 'Gcirc'=>'284', 'gcirc'=>'285', 'Gcy'=>'1043', 'gcy'=>'1075', 'Gdot'=>'288', 'gdot'=>'289', 'gE'=>'8807', 'gEl'=>'10892', 'gel'=>'8923', 'geq'=>'8805', 'geqq'=>'8807', 'geqslant'=>'10878', 'ges'=>'10878', 'gescc'=>'10921', 'gesdot'=>'10880', 'gesdoto'=>'10882', 'gesdotol'=>'10884', 'gesles'=>'10900', 'Gfr'=>'120074', 'gfr'=>'120100', 'gg'=>'8811', 'Gg'=>'8921', 'ggg'=>'8921', 'gimel'=>'8503', 'GJcy'=>'1027', 'gjcy'=>'1107', 'gl'=>'8823', 'gla'=>'10917', 'glE'=>'10898', 'glj'=>'10916', 'gnap'=>'10890', 'gnapprox'=>'10890', 'gne'=>'10888', 'gnE'=>'8809', 'gneq'=>'10888', 'gneqq'=>'8809', 'gnsim'=>'8935', 'Gopf'=>'120126', 'gopf'=>'120152', 'grave'=>'96', 'GreaterEqual'=>'8805', 'GreaterEqualLess'=>'8923', 'GreaterFullEqual'=>'8807', 'GreaterGreater'=>'10914', 'GreaterLess'=>'8823', 'GreaterSlantEqual'=>'10878', 'GreaterTilde'=>'8819', 'Gscr'=>'119970', 'gscr'=>'8458', 'gsim'=>'8819', 'gsime'=>'10894', 'gsiml'=>'10896', 'Gt'=>'8811', 'gtcc'=>'10919', 'gtcir'=>'10874', 'gtdot'=>'8919', 'gtlPar'=>'10645', 'gtquest'=>'10876', 'gtrapprox'=>'10886', 'gtrarr'=>'10616', 'gtrdot'=>'8919', 'gtreqless'=>'8923', 'gtreqqless'=>'10892', 'gtrless'=>'8823', 'gtrsim'=>'8819', 'Hacek'=>'711', 'hairsp'=>'8202', 'half'=>'189', 'hamilt'=>'8459', 'HARDcy'=>'1066', 'hardcy'=>'1098', 'harrcir'=>'10568', 'harrw'=>'8621', 'Hat'=>'94', 'hbar'=>'8463', 'Hcirc'=>'292', 'hcirc'=>'293', 'heartsuit'=>'9829', 'hercon'=>'8889', 'hfr'=>'120101', 'Hfr'=>'8460', 'HilbertSpace'=>'8459', 'hksearow'=>'10533', 'hkswarow'=>'10534', 'hoarr'=>'8703', 'homtht'=>'8763', 'hookleftarrow'=>'8617', 'hookrightarrow'=>'8618', 'hopf'=>'120153', 'Hopf'=>'8461', 'horbar'=>'8213', 'HorizontalLine'=>'9472', 'hscr'=>'119997', 'Hscr'=>'8459', 'hslash'=>'8463', 'Hstrok'=>'294', 'hstrok'=>'295', 'HumpDownHump'=>'8782', 'HumpEqual'=>'8783', 'hybull'=>'8259', 
'hyphen'=>'8208', 'ic'=>'8291', 'Icy'=>'1048', 'icy'=>'1080', 'Idot'=>'304', 'IEcy'=>'1045', 'iecy'=>'1077', 'iff'=>'8660', 'ifr'=>'120102', 'Ifr'=>'8465', 'ii'=>'8520', 'iiiint'=>'10764', 'iiint'=>'8749', 'iinfin'=>'10716', 'iiota'=>'8489', 'IJlig'=>'306', 'ijlig'=>'307', 'Im'=>'8465', 'Imacr'=>'298', 'imacr'=>'299', 'ImaginaryI'=>'8520', 'imagline'=>'8464', 'imagpart'=>'8465', 'imath'=>'305', 'imof'=>'8887', 'imped'=>'437', 'Implies'=>'8658', 'in'=>'8712', 'incare'=>'8453', 'infintie'=>'10717', 'inodot'=>'305', 'Int'=>'8748', 'intcal'=>'8890', 'integers'=>'8484', 'Integral'=>'8747', 'intercal'=>'8890', 'Intersection'=>'8898', 'intlarhk'=>'10775', 'intprod'=>'10812', 'InvisibleComma'=>'8291', 'InvisibleTimes'=>'8290', 'IOcy'=>'1025', 'iocy'=>'1105', 'Iogon'=>'302', 'iogon'=>'303', 'Iopf'=>'120128', 'iopf'=>'120154', 'iprod'=>'10812', 'iscr'=>'119998', 'Iscr'=>'8464', 'isindot'=>'8949', 'isinE'=>'8953', 'isins'=>'8948', 'isinsv'=>'8947', 'isinv'=>'8712', 'it'=>'8290', 'Itilde'=>'296', 'itilde'=>'297', 'Iukcy'=>'1030', 'iukcy'=>'1110', 'Jcirc'=>'308', 'jcirc'=>'309', 'Jcy'=>'1049', 'jcy'=>'1081', 'Jfr'=>'120077', 'jfr'=>'120103', 'jmath'=>'567', 'Jopf'=>'120129', 'jopf'=>'120155', 'Jscr'=>'119973', 'jscr'=>'119999', 'Jsercy'=>'1032', 'jsercy'=>'1112', 'Jukcy'=>'1028', 'jukcy'=>'1108', 'kappav'=>'1008', 'Kcedil'=>'310', 'kcedil'=>'311', 'Kcy'=>'1050', 'kcy'=>'1082', 'Kfr'=>'120078', 'kfr'=>'120104', 'kgreen'=>'312', 'KHcy'=>'1061', 'khcy'=>'1093', 'KJcy'=>'1036', 'kjcy'=>'1116', 'Kopf'=>'120130', 'kopf'=>'120156', 'Kscr'=>'119974', 'kscr'=>'120000', 'lAarr'=>'8666', 'Lacute'=>'313', 'lacute'=>'314', 'laemptyv'=>'10676', 'lagran'=>'8466', 'lang'=>'10216', 'Lang'=>'10218', 'langd'=>'10641', 'langle'=>'10216', 'lap'=>'10885', 'Laplacetrf'=>'8466', 'Larr'=>'8606', 'larrb'=>'8676', 'larrbfs'=>'10527', 'larrfs'=>'10525', 'larrhk'=>'8617', 'larrlp'=>'8619', 'larrpl'=>'10553', 'larrsim'=>'10611', 'larrtl'=>'8610', 'lat'=>'10923', 'latail'=>'10521', 'lAtail'=>'10523', 
'late'=>'10925', 'lbarr'=>'10508', 'lBarr'=>'10510', 'lbbrk'=>'10098', 'lbrace'=>'123', 'lbrack'=>'91', 'lbrke'=>'10635', 'lbrksld'=>'10639', 'lbrkslu'=>'10637', 'Lcaron'=>'317', 'lcaron'=>'318', 'Lcedil'=>'315', 'lcedil'=>'316', 'lcub'=>'123', 'Lcy'=>'1051', 'lcy'=>'1083', 'ldca'=>'10550', 'ldquor'=>'8222', 'ldrdhar'=>'10599', 'ldrushar'=>'10571', 'ldsh'=>'8626', 'lE'=>'8806', 'LeftAngleBracket'=>'10216', 'leftarrow'=>'8592', 'LeftArrow'=>'8592', 'Leftarrow'=>'8656', 'LeftArrowBar'=>'8676', 'LeftArrowRightArrow'=>'8646', 'leftarrowtail'=>'8610', 'LeftCeiling'=>'8968', 'LeftDoubleBracket'=>'10214', 'LeftDownTeeVector'=>'10593', 'LeftDownVector'=>'8643', 'LeftDownVectorBar'=>'10585', 'LeftFloor'=>'8970', 'leftharpoondown'=>'8637', 'leftharpoonup'=>'8636', 'leftleftarrows'=>'8647', 'leftrightarrow'=>'8596', 'LeftRightArrow'=>'8596', 'Leftrightarrow'=>'8660', 'leftrightarrows'=>'8646', 'leftrightharpoons'=>'8651', 'leftrightsquigarrow'=>'8621', 'LeftRightVector'=>'10574', 'LeftTee'=>'8867', 'LeftTeeArrow'=>'8612', 'LeftTeeVector'=>'10586', 'leftthreetimes'=>'8907', 'LeftTriangle'=>'8882', 'LeftTriangleBar'=>'10703', 'LeftTriangleEqual'=>'8884', 'LeftUpDownVector'=>'10577', 'LeftUpTeeVector'=>'10592', 'LeftUpVector'=>'8639', 'LeftUpVectorBar'=>'10584', 'LeftVector'=>'8636', 'LeftVectorBar'=>'10578', 'lEg'=>'10891', 'leg'=>'8922', 'leq'=>'8804', 'leqq'=>'8806', 'leqslant'=>'10877', 'les'=>'10877', 'lescc'=>'10920', 'lesdot'=>'10879', 'lesdoto'=>'10881', 'lesdotor'=>'10883', 'lesges'=>'10899', 'lessapprox'=>'10885', 'lessdot'=>'8918', 'lesseqgtr'=>'8922', 'lesseqqgtr'=>'10891', 'LessEqualGreater'=>'8922', 'LessFullEqual'=>'8806', 'LessGreater'=>'8822', 'lessgtr'=>'8822', 'LessLess'=>'10913', 'lesssim'=>'8818', 'LessSlantEqual'=>'10877', 'LessTilde'=>'8818', 'lfisht'=>'10620', 'Lfr'=>'120079', 'lfr'=>'120105', 'lg'=>'8822', 'lgE'=>'10897', 'lHar'=>'10594', 'lhard'=>'8637', 'lharu'=>'8636', 'lharul'=>'10602', 'lhblk'=>'9604', 'LJcy'=>'1033', 'ljcy'=>'1113', 'll'=>'8810', 
'Ll'=>'8920', 'llarr'=>'8647', 'llcorner'=>'8990', 'Lleftarrow'=>'8666', 'llhard'=>'10603', 'lltri'=>'9722', 'Lmidot'=>'319', 'lmidot'=>'320', 'lmoust'=>'9136', 'lmoustache'=>'9136', 'lnap'=>'10889', 'lnapprox'=>'10889', 'lne'=>'10887', 'lnE'=>'8808', 'lneq'=>'10887', 'lneqq'=>'8808', 'lnsim'=>'8934', 'loang'=>'10220', 'loarr'=>'8701', 'lobrk'=>'10214', 'longleftarrow'=>'10229', 'LongLeftArrow'=>'10229', 'Longleftarrow'=>'10232', 'longleftrightarrow'=>'10231', 'LongLeftRightArrow'=>'10231', 'Longleftrightarrow'=>'10234', 'longmapsto'=>'10236', 'longrightarrow'=>'10230', 'LongRightArrow'=>'10230', 'Longrightarrow'=>'10233', 'looparrowleft'=>'8619', 'looparrowright'=>'8620', 'lopar'=>'10629', 'Lopf'=>'120131', 'lopf'=>'120157', 'loplus'=>'10797', 'lotimes'=>'10804', 'lowbar'=>'95', 'LowerLeftArrow'=>'8601', 'LowerRightArrow'=>'8600', 'lozenge'=>'9674', 'lozf'=>'10731', 'lpar'=>'40', 'lparlt'=>'10643', 'lrarr'=>'8646', 'lrcorner'=>'8991', 'lrhar'=>'8651', 'lrhard'=>'10605', 'lrtri'=>'8895', 'lscr'=>'120001', 'Lscr'=>'8466', 'lsh'=>'8624', 'Lsh'=>'8624', 'lsim'=>'8818', 'lsime'=>'10893', 'lsimg'=>'10895', 'lsqb'=>'91', 'lsquor'=>'8218', 'Lstrok'=>'321', 'lstrok'=>'322', 'Lt'=>'8810', 'ltcc'=>'10918', 'ltcir'=>'10873', 'ltdot'=>'8918', 'lthree'=>'8907', 'ltimes'=>'8905', 'ltlarr'=>'10614', 'ltquest'=>'10875', 'ltri'=>'9667', 'ltrie'=>'8884', 'ltrif'=>'9666', 'ltrPar'=>'10646', 'lurdshar'=>'10570', 'luruhar'=>'10598', 'male'=>'9794', 'malt'=>'10016', 'maltese'=>'10016', 'Map'=>'10501', 'map'=>'8614', 'mapsto'=>'8614', 'mapstodown'=>'8615', 'mapstoleft'=>'8612', 'mapstoup'=>'8613', 'marker'=>'9646', 'mcomma'=>'10793', 'Mcy'=>'1052', 'mcy'=>'1084', 'mDDot'=>'8762', 'measuredangle'=>'8737', 'MediumSpace'=>'8287', 'Mellintrf'=>'8499', 'Mfr'=>'120080', 'mfr'=>'120106', 'mho'=>'8487', 'mid'=>'8739', 'midast'=>'42', 'midcir'=>'10992', 'minusb'=>'8863', 'minusd'=>'8760', 'minusdu'=>'10794', 'MinusPlus'=>'8723', 'mlcp'=>'10971', 'mldr'=>'8230', 'mnplus'=>'8723', 'models'=>'8871', 
'Mopf'=>'120132', 'mopf'=>'120158', 'mp'=>'8723', 'mscr'=>'120002', 'Mscr'=>'8499', 'mstpos'=>'8766', 'multimap'=>'8888', 'mumap'=>'8888', 'Nacute'=>'323', 'nacute'=>'324', 'nap'=>'8777', 'napos'=>'329', 'napprox'=>'8777', 'natur'=>'9838', 'natural'=>'9838', 'naturals'=>'8469', 'ncap'=>'10819', 'Ncaron'=>'327', 'ncaron'=>'328', 'Ncedil'=>'325', 'ncedil'=>'326', 'ncong'=>'8775', 'ncup'=>'10818', 'Ncy'=>'1053', 'ncy'=>'1085', 'nearhk'=>'10532', 'nearr'=>'8599', 'neArr'=>'8663', 'nearrow'=>'8599', 'NegativeMediumSpace'=>'8203', 'NegativeThickSpace'=>'8203', 'NegativeThinSpace'=>'8203', 'NegativeVeryThinSpace'=>'8203', 'nequiv'=>'8802', 'nesear'=>'10536', 'NestedGreaterGreater'=>'8811', 'NestedLessLess'=>'8810', 'NewLine'=>'10', 'nexist'=>'8708', 'nexists'=>'8708', 'Nfr'=>'120081', 'nfr'=>'120107', 'nge'=>'8817', 'ngeq'=>'8817', 'ngsim'=>'8821', 'ngt'=>'8815', 'ngtr'=>'8815', 'nharr'=>'8622', 'nhArr'=>'8654', 'nhpar'=>'10994', 'nis'=>'8956', 'nisd'=>'8954', 'niv'=>'8715', 'NJcy'=>'1034', 'njcy'=>'1114', 'nlarr'=>'8602', 'nlArr'=>'8653', 'nldr'=>'8229', 'nle'=>'8816', 'nleftarrow'=>'8602', 'nLeftarrow'=>'8653', 'nleftrightarrow'=>'8622', 'nLeftrightarrow'=>'8654', 'nleq'=>'8816', 'nless'=>'8814', 'nlsim'=>'8820', 'nlt'=>'8814', 'nltri'=>'8938', 'nltrie'=>'8940', 'nmid'=>'8740', 'NoBreak'=>'8288', 'NonBreakingSpace'=>'160', 'nopf'=>'120159', 'Nopf'=>'8469', 'Not'=>'10988', 'NotCongruent'=>'8802', 'NotCupCap'=>'8813', 'NotDoubleVerticalBar'=>'8742', 'NotElement'=>'8713', 'NotEqual'=>'8800', 'NotExists'=>'8708', 'NotGreater'=>'8815', 'NotGreaterEqual'=>'8817', 'NotGreaterLess'=>'8825', 'NotGreaterTilde'=>'8821', 'notinva'=>'8713', 'notinvb'=>'8951', 'notinvc'=>'8950', 'NotLeftTriangle'=>'8938', 'NotLeftTriangleEqual'=>'8940', 'NotLess'=>'8814', 'NotLessEqual'=>'8816', 'NotLessGreater'=>'8824', 'NotLessTilde'=>'8820', 'notni'=>'8716', 'notniva'=>'8716', 'notnivb'=>'8958', 'notnivc'=>'8957', 'NotPrecedes'=>'8832', 'NotPrecedesSlantEqual'=>'8928', 'NotReverseElement'=>'8716', 
'NotRightTriangle'=>'8939', 'NotRightTriangleEqual'=>'8941', 'NotSquareSubsetEqual'=>'8930', 'NotSquareSupersetEqual'=>'8931', 'NotSubsetEqual'=>'8840', 'NotSucceeds'=>'8833', 'NotSucceedsSlantEqual'=>'8929', 'NotSupersetEqual'=>'8841', 'NotTilde'=>'8769', 'NotTildeEqual'=>'8772', 'NotTildeFullEqual'=>'8775', 'NotTildeTilde'=>'8777', 'NotVerticalBar'=>'8740', 'npar'=>'8742', 'nparallel'=>'8742', 'npolint'=>'10772', 'npr'=>'8832', 'nprcue'=>'8928', 'nprec'=>'8832', 'nrarr'=>'8603', 'nrArr'=>'8655', 'nrightarrow'=>'8603', 'nRightarrow'=>'8655', 'nrtri'=>'8939', 'nrtrie'=>'8941', 'nsc'=>'8833', 'nsccue'=>'8929', 'Nscr'=>'119977', 'nscr'=>'120003', 'nshortmid'=>'8740', 'nshortparallel'=>'8742', 'nsim'=>'8769', 'nsime'=>'8772', 'nsimeq'=>'8772', 'nsmid'=>'8740', 'nspar'=>'8742', 'nsqsube'=>'8930', 'nsqsupe'=>'8931', 'nsube'=>'8840', 'nsubseteq'=>'8840', 'nsucc'=>'8833', 'nsup'=>'8837', 'nsupe'=>'8841', 'nsupseteq'=>'8841', 'ntgl'=>'8825', 'ntlg'=>'8824', 'ntriangleleft'=>'8938', 'ntrianglelefteq'=>'8940', 'ntriangleright'=>'8939', 'ntrianglerighteq'=>'8941', 'num'=>'35', 'numero'=>'8470', 'numsp'=>'8199', 'nvdash'=>'8876', 'nvDash'=>'8877', 'nVdash'=>'8878', 'nVDash'=>'8879', 'nvHarr'=>'10500', 'nvinfin'=>'10718', 'nvlArr'=>'10498', 'nvrArr'=>'10499', 'nwarhk'=>'10531', 'nwarr'=>'8598', 'nwArr'=>'8662', 'nwarrow'=>'8598', 'nwnear'=>'10535', 'oast'=>'8859', 'ocir'=>'8858', 'Ocy'=>'1054', 'ocy'=>'1086', 'odash'=>'8861', 'Odblac'=>'336', 'odblac'=>'337', 'odiv'=>'10808', 'odot'=>'8857', 'odsold'=>'10684', 'ofcir'=>'10687', 'Ofr'=>'120082', 'ofr'=>'120108', 'ogon'=>'731', 'ogt'=>'10689', 'ohbar'=>'10677', 'ohm'=>'937', 'oint'=>'8750', 'olarr'=>'8634', 'olcir'=>'10686', 'olcross'=>'10683', 'olt'=>'10688', 'Omacr'=>'332', 'omacr'=>'333', 'omid'=>'10678', 'ominus'=>'8854', 'Oopf'=>'120134', 'oopf'=>'120160', 'opar'=>'10679', 'OpenCurlyDoubleQuote'=>'8220', 'OpenCurlyQuote'=>'8216', 'operp'=>'10681', 'Or'=>'10836', 'orarr'=>'8635', 'ord'=>'10845', 'order'=>'8500', 
'orderof'=>'8500', 'origof'=>'8886', 'oror'=>'10838', 'orslope'=>'10839', 'orv'=>'10843', 'oS'=>'9416', 'Oscr'=>'119978', 'oscr'=>'8500', 'osol'=>'8856', 'Otimes'=>'10807', 'otimesas'=>'10806', 'ovbar'=>'9021', 'OverBar'=>'8254', 'OverBrace'=>'9182', 'OverBracket'=>'9140', 'OverParenthesis'=>'9180', 'par'=>'8741', 'parallel'=>'8741', 'parsim'=>'10995', 'parsl'=>'11005', 'PartialD'=>'8706', 'Pcy'=>'1055', 'pcy'=>'1087', 'percnt'=>'37', 'period'=>'46', 'pertenk'=>'8241', 'Pfr'=>'120083', 'pfr'=>'120109', 'phiv'=>'981', 'phmmat'=>'8499', 'phone'=>'9742', 'pitchfork'=>'8916', 'planck'=>'8463', 'planckh'=>'8462', 'plankv'=>'8463', 'plus'=>'43', 'plusacir'=>'10787', 'plusb'=>'8862', 'pluscir'=>'10786', 'plusdo'=>'8724', 'plusdu'=>'10789', 'pluse'=>'10866', 'PlusMinus'=>'177', 'plussim'=>'10790', 'plustwo'=>'10791', 'pm'=>'177', 'Poincareplane'=>'8460', 'pointint'=>'10773', 'popf'=>'120161', 'Popf'=>'8473', 'Pr'=>'10939', 'pr'=>'8826', 'prap'=>'10935', 'prcue'=>'8828', 'pre'=>'10927', 'prE'=>'10931', 'prec'=>'8826', 'precapprox'=>'10935', 'preccurlyeq'=>'8828', 'Precedes'=>'8826', 'PrecedesEqual'=>'10927', 'PrecedesSlantEqual'=>'8828', 'PrecedesTilde'=>'8830', 'preceq'=>'10927', 'precnapprox'=>'10937', 'precneqq'=>'10933', 'precnsim'=>'8936', 'precsim'=>'8830', 'primes'=>'8473', 'prnap'=>'10937', 'prnE'=>'10933', 'prnsim'=>'8936', 'Product'=>'8719', 'profalar'=>'9006', 'profline'=>'8978', 'profsurf'=>'8979', 'Proportion'=>'8759', 'Proportional'=>'8733', 'propto'=>'8733', 'prsim'=>'8830', 'prurel'=>'8880', 'Pscr'=>'119979', 'pscr'=>'120005', 'puncsp'=>'8200', 'Qfr'=>'120084', 'qfr'=>'120110', 'qint'=>'10764', 'qopf'=>'120162', 'Qopf'=>'8474', 'qprime'=>'8279', 'Qscr'=>'119980', 'qscr'=>'120006', 'quaternions'=>'8461', 'quatint'=>'10774', 'quest'=>'63', 'questeq'=>'8799', 'rAarr'=>'8667', 'Racute'=>'340', 'racute'=>'341', 'raemptyv'=>'10675', 'rang'=>'10217', 'Rang'=>'10219', 'rangd'=>'10642', 'range'=>'10661', 'rangle'=>'10217', 'Rarr'=>'8608', 'rarrap'=>'10613', 
'rarrb'=>'8677', 'rarrbfs'=>'10528', 'rarrc'=>'10547', 'rarrfs'=>'10526', 'rarrhk'=>'8618', 'rarrlp'=>'8620', 'rarrpl'=>'10565', 'rarrsim'=>'10612', 'Rarrtl'=>'10518', 'rarrtl'=>'8611', 'rarrw'=>'8605', 'ratail'=>'10522', 'rAtail'=>'10524', 'ratio'=>'8758', 'rationals'=>'8474', 'rbarr'=>'10509', 'rBarr'=>'10511', 'RBarr'=>'10512', 'rbbrk'=>'10099', 'rbrace'=>'125', 'rbrack'=>'93', 'rbrke'=>'10636', 'rbrksld'=>'10638', 'rbrkslu'=>'10640', 'Rcaron'=>'344', 'rcaron'=>'345', 'Rcedil'=>'342', 'rcedil'=>'343', 'rcub'=>'125', 'Rcy'=>'1056', 'rcy'=>'1088', 'rdca'=>'10551', 'rdldhar'=>'10601', 'rdquor'=>'8221', 'rdsh'=>'8627', 'Re'=>'8476', 'realine'=>'8475', 'realpart'=>'8476', 'reals'=>'8477', 'rect'=>'9645', 'REG'=>'174', 'ReverseElement'=>'8715', 'ReverseEquilibrium'=>'8651', 'ReverseUpEquilibrium'=>'10607', 'rfisht'=>'10621', 'rfr'=>'120111', 'Rfr'=>'8476', 'rHar'=>'10596', 'rhard'=>'8641', 'rharu'=>'8640', 'rharul'=>'10604', 'rhov'=>'1009', 'RightAngleBracket'=>'10217', 'rightarrow'=>'8594', 'RightArrow'=>'8594', 'Rightarrow'=>'8658', 'RightArrowBar'=>'8677', 'RightArrowLeftArrow'=>'8644', 'rightarrowtail'=>'8611', 'RightCeiling'=>'8969', 'RightDoubleBracket'=>'10215', 'RightDownTeeVector'=>'10589', 'RightDownVector'=>'8642', 'RightDownVectorBar'=>'10581', 'RightFloor'=>'8971', 'rightharpoondown'=>'8641', 'rightharpoonup'=>'8640', 'rightleftarrows'=>'8644', 'rightleftharpoons'=>'8652', 'rightrightarrows'=>'8649', 'rightsquigarrow'=>'8605', 'RightTee'=>'8866', 'RightTeeArrow'=>'8614', 'RightTeeVector'=>'10587', 'rightthreetimes'=>'8908', 'RightTriangle'=>'8883', 'RightTriangleBar'=>'10704', 'RightTriangleEqual'=>'8885', 'RightUpDownVector'=>'10575', 'RightUpTeeVector'=>'10588', 'RightUpVector'=>'8638', 'RightUpVectorBar'=>'10580', 'RightVector'=>'8640', 'RightVectorBar'=>'10579', 'ring'=>'730', 'risingdotseq'=>'8787', 'rlarr'=>'8644', 'rlhar'=>'8652', 'rmoust'=>'9137', 'rmoustache'=>'9137', 'rnmid'=>'10990', 'roang'=>'10221', 'roarr'=>'8702', 'robrk'=>'10215', 
'ropar'=>'10630', 'ropf'=>'120163', 'Ropf'=>'8477', 'roplus'=>'10798', 'rotimes'=>'10805', 'RoundImplies'=>'10608', 'rpar'=>'41', 'rpargt'=>'10644', 'rppolint'=>'10770', 'rrarr'=>'8649', 'Rrightarrow'=>'8667', 'rscr'=>'120007', 'Rscr'=>'8475', 'rsh'=>'8625', 'Rsh'=>'8625', 'rsqb'=>'93', 'rsquor'=>'8217', 'rthree'=>'8908', 'rtimes'=>'8906', 'rtri'=>'9657', 'rtrie'=>'8885', 'rtrif'=>'9656', 'rtriltri'=>'10702', 'RuleDelayed'=>'10740', 'ruluhar'=>'10600', 'rx'=>'8478', 'Sacute'=>'346', 'sacute'=>'347', 'Sc'=>'10940', 'sc'=>'8827', 'scap'=>'10936', 'sccue'=>'8829', 'sce'=>'10928', 'scE'=>'10932', 'Scedil'=>'350', 'scedil'=>'351', 'Scirc'=>'348', 'scirc'=>'349', 'scnap'=>'10938', 'scnE'=>'10934', 'scnsim'=>'8937', 'scpolint'=>'10771', 'scsim'=>'8831', 'Scy'=>'1057', 'scy'=>'1089', 'sdotb'=>'8865', 'sdote'=>'10854', 'searhk'=>'10533', 'searr'=>'8600', 'seArr'=>'8664', 'searrow'=>'8600', 'semi'=>'59', 'seswar'=>'10537', 'setminus'=>'8726', 'setmn'=>'8726', 'sext'=>'10038', 'Sfr'=>'120086', 'sfr'=>'120112', 'sfrown'=>'8994', 'sharp'=>'9839', 'SHCHcy'=>'1065', 'shchcy'=>'1097', 'SHcy'=>'1064', 'shcy'=>'1096', 'ShortDownArrow'=>'8595', 'ShortLeftArrow'=>'8592', 'shortmid'=>'8739', 'shortparallel'=>'8741', 'ShortRightArrow'=>'8594', 'ShortUpArrow'=>'8593', 'sigmav'=>'962', 'simdot'=>'10858', 'sime'=>'8771', 'simeq'=>'8771', 'simg'=>'10910', 'simgE'=>'10912', 'siml'=>'10909', 'simlE'=>'10911', 'simne'=>'8774', 'simplus'=>'10788', 'simrarr'=>'10610', 'slarr'=>'8592', 'SmallCircle'=>'8728', 'smallsetminus'=>'8726', 'smashp'=>'10803', 'smeparsl'=>'10724', 'smid'=>'8739', 'smile'=>'8995', 'smt'=>'10922', 'smte'=>'10924', 'SOFTcy'=>'1068', 'softcy'=>'1100', 'sol'=>'47', 'solb'=>'10692', 'solbar'=>'9023', 'Sopf'=>'120138', 'sopf'=>'120164', 'spadesuit'=>'9824', 'spar'=>'8741', 'sqcap'=>'8851', 'sqcup'=>'8852', 'Sqrt'=>'8730', 'sqsub'=>'8847', 'sqsube'=>'8849', 'sqsubset'=>'8847', 'sqsubseteq'=>'8849', 'sqsup'=>'8848', 'sqsupe'=>'8850', 'sqsupset'=>'8848', 'sqsupseteq'=>'8850', 
'squ'=>'9633', 'square'=>'9633', 'Square'=>'9633', 'SquareIntersection'=>'8851', 'SquareSubset'=>'8847', 'SquareSubsetEqual'=>'8849', 'SquareSuperset'=>'8848', 'SquareSupersetEqual'=>'8850', 'SquareUnion'=>'8852', 'squarf'=>'9642', 'squf'=>'9642', 'srarr'=>'8594', 'Sscr'=>'119982', 'sscr'=>'120008', 'ssetmn'=>'8726', 'ssmile'=>'8995', 'sstarf'=>'8902', 'Star'=>'8902', 'star'=>'9734', 'starf'=>'9733', 'straightepsilon'=>'1013', 'straightphi'=>'981', 'strns'=>'175', 'Sub'=>'8912', 'subdot'=>'10941', 'subE'=>'10949', 'subedot'=>'10947', 'submult'=>'10945', 'subnE'=>'10955', 'subne'=>'8842', 'subplus'=>'10943', 'subrarr'=>'10617', 'subset'=>'8834', 'Subset'=>'8912', 'subseteq'=>'8838', 'subseteqq'=>'10949', 'SubsetEqual'=>'8838', 'subsetneq'=>'8842', 'subsetneqq'=>'10955', 'subsim'=>'10951', 'subsub'=>'10965', 'subsup'=>'10963', 'succ'=>'8827', 'succapprox'=>'10936', 'succcurlyeq'=>'8829', 'Succeeds'=>'8827', 'SucceedsEqual'=>'10928', 'SucceedsSlantEqual'=>'8829', 'SucceedsTilde'=>'8831', 'succeq'=>'10928', 'succnapprox'=>'10938', 'succneqq'=>'10934', 'succnsim'=>'8937', 'succsim'=>'8831', 'SuchThat'=>'8715', 'Sum'=>'8721', 'sung'=>'9834', 'Sup'=>'8913', 'supdot'=>'10942', 'supdsub'=>'10968', 'supE'=>'10950', 'supedot'=>'10948', 'Superset'=>'8835', 'SupersetEqual'=>'8839', 'suphsol'=>'10185', 'suphsub'=>'10967', 'suplarr'=>'10619', 'supmult'=>'10946', 'supnE'=>'10956', 'supne'=>'8843', 'supplus'=>'10944', 'supset'=>'8835', 'Supset'=>'8913', 'supseteq'=>'8839', 'supseteqq'=>'10950', 'supsetneq'=>'8843', 'supsetneqq'=>'10956', 'supsim'=>'10952', 'supsub'=>'10964', 'supsup'=>'10966', 'swarhk'=>'10534', 'swarr'=>'8601', 'swArr'=>'8665', 'swarrow'=>'8601', 'swnwar'=>'10538', 'Tab'=>'9', 'target'=>'8982', 'tbrk'=>'9140', 'Tcaron'=>'356', 'tcaron'=>'357', 'Tcedil'=>'354', 'tcedil'=>'355', 'Tcy'=>'1058', 'tcy'=>'1090', 'tdot'=>'8411', 'telrec'=>'8981', 'Tfr'=>'120087', 'tfr'=>'120113', 'therefore'=>'8756', 'Therefore'=>'8756', 'thetav'=>'977', 'thickapprox'=>'8776', 
'thicksim'=>'8764', 'ThinSpace'=>'8201', 'thkap'=>'8776', 'thksim'=>'8764', 'Tilde'=>'8764', 'TildeEqual'=>'8771', 'TildeFullEqual'=>'8773', 'TildeTilde'=>'8776', 'timesb'=>'8864', 'timesbar'=>'10801', 'timesd'=>'10800', 'tint'=>'8749', 'toea'=>'10536', 'top'=>'8868', 'topbot'=>'9014', 'topcir'=>'10993', 'Topf'=>'120139', 'topf'=>'120165', 'topfork'=>'10970', 'tosa'=>'10537', 'tprime'=>'8244', 'TRADE'=>'8482', 'triangle'=>'9653', 'triangledown'=>'9663', 'triangleleft'=>'9667', 'trianglelefteq'=>'8884', 'triangleq'=>'8796', 'triangleright'=>'9657', 'trianglerighteq'=>'8885', 'tridot'=>'9708', 'trie'=>'8796', 'triminus'=>'10810', 'TripleDot'=>'8411', 'triplus'=>'10809', 'trisb'=>'10701', 'tritime'=>'10811', 'trpezium'=>'9186', 'Tscr'=>'119983', 'tscr'=>'120009', 'TScy'=>'1062', 'tscy'=>'1094', 'TSHcy'=>'1035', 'tshcy'=>'1115', 'Tstrok'=>'358', 'tstrok'=>'359', 'twixt'=>'8812', 'twoheadleftarrow'=>'8606', 'twoheadrightarrow'=>'8608', 'Uarr'=>'8607', 'Uarrocir'=>'10569', 'Ubrcy'=>'1038', 'ubrcy'=>'1118', 'Ubreve'=>'364', 'ubreve'=>'365', 'Ucy'=>'1059', 'ucy'=>'1091', 'udarr'=>'8645', 'Udblac'=>'368', 'udblac'=>'369', 'udhar'=>'10606', 'ufisht'=>'10622', 'Ufr'=>'120088', 'ufr'=>'120114', 'uHar'=>'10595', 'uharl'=>'8639', 'uharr'=>'8638', 'uhblk'=>'9600', 'ulcorn'=>'8988', 'ulcorner'=>'8988', 'ulcrop'=>'8975', 'ultri'=>'9720', 'Umacr'=>'362', 'umacr'=>'363', 'UnderBar'=>'95', 'UnderBrace'=>'9183', 'UnderBracket'=>'9141', 'UnderParenthesis'=>'9181', 'Union'=>'8899', 'UnionPlus'=>'8846', 'Uogon'=>'370', 'uogon'=>'371', 'Uopf'=>'120140', 'uopf'=>'120166', 'uparrow'=>'8593', 'UpArrow'=>'8593', 'Uparrow'=>'8657', 'UpArrowBar'=>'10514', 'UpArrowDownArrow'=>'8645', 'updownarrow'=>'8597', 'UpDownArrow'=>'8597', 'Updownarrow'=>'8661', 'UpEquilibrium'=>'10606', 'upharpoonleft'=>'8639', 'upharpoonright'=>'8638', 'uplus'=>'8846', 'UpperLeftArrow'=>'8598', 'UpperRightArrow'=>'8599', 'upsi'=>'965', 'Upsi'=>'978', 'UpTee'=>'8869', 'UpTeeArrow'=>'8613', 'upuparrows'=>'8648', 
'urcorn'=>'8989', 'urcorner'=>'8989', 'urcrop'=>'8974', 'Uring'=>'366', 'uring'=>'367', 'urtri'=>'9721', 'Uscr'=>'119984', 'uscr'=>'120010', 'utdot'=>'8944', 'Utilde'=>'360', 'utilde'=>'361', 'utri'=>'9653', 'utrif'=>'9652', 'uuarr'=>'8648', 'uwangle'=>'10663', 'vangrt'=>'10652', 'varepsilon'=>'1013', 'varkappa'=>'1008', 'varnothing'=>'8709', 'varphi'=>'981', 'varpi'=>'982', 'varpropto'=>'8733', 'varr'=>'8597', 'vArr'=>'8661', 'varrho'=>'1009', 'varsigma'=>'962', 'vartheta'=>'977', 'vartriangleleft'=>'8882', 'vartriangleright'=>'8883', 'vBar'=>'10984', 'Vbar'=>'10987', 'vBarv'=>'10985', 'Vcy'=>'1042', 'vcy'=>'1074', 'vdash'=>'8866', 'vDash'=>'8872', 'Vdash'=>'8873', 'VDash'=>'8875', 'Vdashl'=>'10982', 'vee'=>'8744', 'Vee'=>'8897', 'veebar'=>'8891', 'veeeq'=>'8794', 'vellip'=>'8942', 'verbar'=>'124', 'Verbar'=>'8214', 'vert'=>'124', 'Vert'=>'8214', 'VerticalBar'=>'8739', 'VerticalLine'=>'124', 'VerticalSeparator'=>'10072', 'VerticalTilde'=>'8768', 'VeryThinSpace'=>'8202', 'Vfr'=>'120089', 'vfr'=>'120115', 'vltri'=>'8882', 'Vopf'=>'120141', 'vopf'=>'120167', 'vprop'=>'8733', 'vrtri'=>'8883', 'Vscr'=>'119985', 'vscr'=>'120011', 'Vvdash'=>'8874', 'vzigzag'=>'10650', 'Wcirc'=>'372', 'wcirc'=>'373', 'wedbar'=>'10847', 'wedge'=>'8743', 'Wedge'=>'8896', 'wedgeq'=>'8793', 'Wfr'=>'120090', 'wfr'=>'120116', 'Wopf'=>'120142', 'wopf'=>'120168', 'wp'=>'8472', 'wr'=>'8768', 'wreath'=>'8768', 'Wscr'=>'119986', 'wscr'=>'120012', 'xcap'=>'8898', 'xcirc'=>'9711', 'xcup'=>'8899', 'xdtri'=>'9661', 'Xfr'=>'120091', 'xfr'=>'120117', 'xharr'=>'10231', 'xhArr'=>'10234', 'xlarr'=>'10229', 'xlArr'=>'10232', 'xmap'=>'10236', 'xnis'=>'8955', 'xodot'=>'10752', 'Xopf'=>'120143', 'xopf'=>'120169', 'xoplus'=>'10753', 'xotime'=>'10754', 'xrarr'=>'10230', 'xrArr'=>'10233', 'Xscr'=>'119987', 'xscr'=>'120013', 'xsqcup'=>'10758', 'xuplus'=>'10756', 'xutri'=>'9651', 'xvee'=>'8897', 'xwedge'=>'8896', 'YAcy'=>'1071', 'yacy'=>'1103', 'Ycirc'=>'374', 'ycirc'=>'375', 'Ycy'=>'1067', 'ycy'=>'1099', 
'Yfr'=>'120092', 'yfr'=>'120118', 'YIcy'=>'1031', 'yicy'=>'1111', 'Yopf'=>'120144', 'yopf'=>'120170', 'Yscr'=>'119988', 'yscr'=>'120014', 'YUcy'=>'1070', 'yucy'=>'1102', 'Zacute'=>'377', 'zacute'=>'378', 'Zcaron'=>'381', 'zcaron'=>'382', 'Zcy'=>'1047', 'zcy'=>'1079', 'Zdot'=>'379', 'zdot'=>'380', 'zeetrf'=>'8488', 'ZeroWidthSpace'=>'8203', 'zfr'=>'120119', 'Zfr'=>'8488', 'ZHcy'=>'1046', 'zhcy'=>'1078', 'zigrarr'=>'8669', 'zopf'=>'120171', 'Zopf'=>'8484', 'Zscr'=>'119989', 'zscr'=>'120015'); 763 | if ($t[0] != '#') { 764 | return 765 | ($C['and_mark'] ? "\x06" : '&') 766 | . (isset($reservedEntAr[$t]) 767 | ? $t 768 | : (isset($commonEntNameAr[$t]) 769 | ? (!$C['named_entity'] 770 | ? '#'. ($C['hexdec_entity'] > 1 771 | ? 'x'. dechex($commonEntNameAr[$t]) 772 | : $commonEntNameAr[$t]) 773 | : $t) 774 | : (isset($rareEntNameAr[$t]) 775 | ? (!$C['named_entity'] 776 | ? '#'. ($C['hexdec_entity'] > 1 777 | ? 'x'. dechex($rareEntNameAr[$t]) 778 | : $rareEntNameAr[$t]) 779 | : $t) 780 | : 'amp;'. $t))) 781 | . ';'; 782 | } 783 | if ( 784 | ($n = ctype_digit($t = substr($t, 1)) ? intval($t) : hexdec(substr($t, 1))) < 9 785 | || ($n > 13 && $n < 32) 786 | || $n == 11 787 | || $n == 12 788 | || ($n > 126 && $n < 160 && $n != 133) 789 | || ($n > 55295 790 | && ($n < 57344 791 | || ($n > 64975 && $n < 64992) 792 | || $n == 65534 793 | || $n == 65535 794 | || $n > 1114111)) 795 | ) { 796 | return ($C['and_mark'] ? "\x06" : '&'). "amp;#{$t};"; 797 | } 798 | return 799 | ($C['and_mark'] ? "\x06" : '&') 800 | . '#' 801 | . (((ctype_digit($t) && $C['hexdec_entity'] < 2) 802 | || !$C['hexdec_entity']) 803 | ? $n 804 | : 'x'. dechex($n)) 805 | . ';'; 806 | } 807 | 808 | /** 809 | * Check regex pattern for PHP error. 810 | * 811 | * @param string $t Pattern including limiters/modifiers. 812 | * @return int 0 or 1 if pattern is invalid or valid, respectively. 
813 | */ 814 | function hl_regex($t) 815 | { 816 | if (empty($t) || !is_string($t)) { 817 | return 0; 818 | } 819 | if ($funcsExist = function_exists('error_clear_last') && function_exists('error_get_last')) { 820 | error_clear_last(); 821 | } else { 822 | if ($valTrackErr = ini_get('track_errors')) { 823 | $valMsgErr = isset($php_errormsg) ? $php_errormsg : null; 824 | } else { 825 | ini_set('track_errors', '1'); 826 | } 827 | unset($php_errormsg); 828 | } 829 | if (($valShowErr = ini_get('display_errors'))) { 830 | ini_set('display_errors', '0'); 831 | } 832 | preg_match($t, ''); 833 | if ($funcsExist) { 834 | $out = error_get_last() == null ? 1 : 0; 835 | } else { 836 | $out = isset($php_errormsg) ? 0 : 1; 837 | if ($valTrackErr) { 838 | $php_errormsg = isset($valMsgErr) ? $valMsgErr : null; 839 | } else { 840 | ini_set('track_errors', '0'); 841 | } 842 | } 843 | if ($valShowErr) { 844 | ini_set('display_errors', '1'); 845 | } 846 | return $out; 847 | } 848 | 849 | /** 850 | * Parse $spec htmLawed argument as array. 851 | * 852 | * @param string $t Value of $spec. 853 | * @return array Multidimensional array of form: tag -> attribute -> rule. 854 | */ 855 | function hl_spec($t) 856 | { 857 | $out = array(); 858 | 859 | // Hide special characters used for rules. 860 | 861 | if (!function_exists('hl_aux1')) { 862 | function hl_aux1($x) { 863 | return 864 | substr( 865 | str_replace( 866 | array(";", "|", "~", " ", ",", "/", "(", ")", '`"'), 867 | array("\x01", "\x02", "\x03", "\x04", "\x05", "\x06", "\x07", "\x08", '"'), 868 | $x[0]), 869 | 1, -1); 870 | } 871 | } 872 | $t = 873 | str_replace( 874 | array("\t", "\r", "\n", ' '), 875 | '', 876 | preg_replace_callback('/"(?>(`.|[^"])*)"/sm', 'hl_aux1', trim($t))); 877 | 878 | // Tag, attribute, and rule separators: semi-colon, comma, and slash respectively. 
879 | 880 | for ($i = count(($t = explode(';', $t))); --$i>=0;) { 881 | $ele = $t[$i]; 882 | if ( 883 | empty($ele) 884 | || ($tagPos = strpos($ele, '=')) === false 885 | || !strlen(($tagSpec = substr($ele, $tagPos + 1))) 886 | ) { 887 | continue; 888 | } 889 | $ruleAr = $denyAttrAr = array(); 890 | foreach (explode(',', $tagSpec) as $v) { 891 | if (!preg_match('`^(-?data-[^:=]+|[a-z:\-\*]+)(?:\((.*?)\))?`i', $v, $m) 892 | || preg_match('`^-?data-xml`i', $m[1])) { 893 | continue; 894 | } 895 | if (($attr = strtolower($m[1])) == '-*') { 896 | $denyAttrAr['*'] = 1; 897 | continue; 898 | } 899 | if ($attr[0] == '-') { 900 | $denyAttrAr[substr($attr, 1)] = 1; 901 | continue; 902 | } 903 | if (!isset($m[2])) { 904 | $ruleAr[$attr] = 1; 905 | continue; 906 | } 907 | foreach (explode('/', $m[2]) as $m) { 908 | if (empty($m) 909 | || ($rulePos = strpos($m, '=')) === 0 910 | || $rulePos < 5 // Shortest rule: oneof 911 | ) { 912 | $ruleAr[$attr] = 1; 913 | continue; 914 | } 915 | $rule = strtolower(substr($m, 0, $rulePos)); 916 | $ruleAr[$attr][$rule] = 917 | str_replace( 918 | array("\x01", "\x02", "\x03", "\x04", "\x05", "\x06", "\x07", "\x08"), 919 | array(";", "|", "~", " ", ",", "/", "(", ")"), 920 | substr($m, $rulePos + 1)); 921 | } 922 | if (isset($ruleAr[$attr]['match']) && !hl_regex($ruleAr[$attr]['match'])) { 923 | unset($ruleAr[$attr]['match']); 924 | } 925 | if (isset($ruleAr[$attr]['nomatch']) && !hl_regex($ruleAr[$attr]['nomatch'])) { 926 | unset($ruleAr[$attr]['nomatch']); 927 | } 928 | } 929 | 930 | if (!count($ruleAr) && !count($denyAttrAr)) { 931 | continue; 932 | } 933 | foreach (explode(',', substr($ele, 0, $tagPos)) as $tag) { 934 | if (!strlen(($tag = strtolower($tag)))) { 935 | continue; 936 | } 937 | if (count($ruleAr)) { 938 | $out[$tag] = !isset($out[$tag]) ? $ruleAr : array_merge($out[$tag], $ruleAr); 939 | } 940 | if (count($denyAttrAr)) { 941 | $out[$tag]['deny'] = !isset($out[$tag]['deny']) ? 
$denyAttrAr : array_merge($out[$tag]['deny'], $denyAttrAr); 942 | } 943 | } 944 | } 945 | 946 | return $out; 947 | } 948 | 949 | /** 950 | * Handle tag text with limiters, and attributes in opening tags. 951 | * 952 | * @param array $t Array from preg_replace call. 953 | * @return string Tag with any attribute, 954 | * or text with < and > neutralized into entities, or empty. 955 | */ 956 | function hl_tag($t) 957 | { 958 | $t = $t[0]; 959 | global $C; 960 | 961 | // Check if character not in tag. 962 | 963 | if ($t == '< ') { 964 | return '&lt; '; 965 | } 966 | if ($t == '>') { 967 | return '&gt;'; 968 | } 969 | if (!preg_match('`^<(/?)([a-zA-Z][^\s>]*)([^>]*?)\s?>$`m', $t, $m)) { // Get tag with element name and attributes 970 | return str_replace(array('<', '>'), array('&lt;', '&gt;'), $t); 971 | } 972 | 973 | // Check if element not permitted. Custom element names have certain requirements. 974 | 975 | $ele = rtrim(strtolower($m[2]), '/'); 976 | static $invalidCustomEleAr = array('annotation-xml'=>1, 'color-profile'=>1, 'font-face'=>1, 'font-face-src'=>1, 'font-face-uri'=>1, 'font-face-format'=>1, 'font-face-name'=>1, 'missing-glyph'=>1); 977 | if ( 978 | (!strpos($ele, '-') 979 | && !isset($C['elements'][$ele])) // Not custom element 980 | || (strpos($ele, '-') 981 | && (isset($C['elements']['-' . $ele]) 982 | || (!$C['any_custom_element'] 983 | && !isset($C['elements'][$ele])) 984 | || isset($invalidCustomEleAr[$ele]) 985 | || preg_match( 986 | '`[^-._0-9a-z\xb7\xc0-\xd6\xd8-\xf6\xf8-\x{2ff}' 987 | . '\x{370}-\x{37d}\x{37f}-\x{1fff}\x{200c}-\x{200d}\x{2070}-\x{218f}' 988 | . '\x{2c00}-\x{2fef}\x{3001}-\x{d7ff}\x{f900}-\x{fdcf}\x{fdf0}-\x{fffd}\x{10000}-\x{effff}]`u' 989 | , $ele))) 990 | ) { 991 | return (($C['keep_bad']%2) ? str_replace(array('<', '>'), array('&lt;', '&gt;'), $t) : ''); 992 | } 993 | 994 | // Attribute string. 995 | 996 | $attrStr = str_replace(array("\n", "\r", "\t"), ' ', trim($m[3])); 997 | 998 | // Transform deprecated element. 
999 | 1000 | static $deprecatedEleAr = array('acronym'=>1, 'applet'=>1, 'big'=>1, 'center'=>1, 'dir'=>1, 'font'=>1, 'isindex'=>1, 's'=>1, 'strike'=>1, 'tt'=>1); 1001 | if ($C['make_tag_strict'] && isset($deprecatedEleAr[$ele])) { 1002 | $eleTransformed = hl_deprecatedElement($ele, $attrStr, $C['make_tag_strict']); // hl_deprecatedElement uses referencing 1003 | if (!$ele) { 1004 | return (($C['keep_bad'] % 2) ? str_replace(array('<', '>'), array('&lt;', '&gt;'), $t) : ''); 1005 | } 1006 | } 1007 | 1008 | // Handle closing tag. 1009 | 1010 | static $emptyEleAr = array('area'=>1, 'br'=>1, 'col'=>1, 'command'=>1, 'embed'=>1, 'hr'=>1, 'img'=>1, 'input'=>1, 'isindex'=>1, 'keygen'=>1, 'link'=>1, 'meta'=>1, 'param'=>1, 'source'=>1, 'track'=>1, 'wbr'=>1); 1011 | if (!empty($m[1])) { 1012 | return( 1013 | !isset($emptyEleAr[$ele]) 1014 | ? (empty($C['hook_tag']) 1015 | ? "</$ele>" 1016 | : call_user_func($C['hook_tag'], $ele, 0)) 1017 | : ($C['keep_bad'] % 2 1018 | ? str_replace(array('<', '>'), array('&lt;', '&gt;'), $t) 1019 | : '')); 1020 | } 1021 | 1022 | // Handle opening tag. 1023 | 1024 | // -- Sets of possible attributes. 1025 | 1026 | // .. Element-specific non-global. 
1027 | 1028 | static $attrEleAr = array('abbr'=>array('td'=>1, 'th'=>1), 'accept'=>array('form'=>1, 'input'=>1), 'accept-charset'=>array('form'=>1), 'action'=>array('form'=>1), 'align'=>array('applet'=>1, 'caption'=>1, 'col'=>1, 'colgroup'=>1, 'div'=>1, 'embed'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'hr'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'legend'=>1, 'object'=>1, 'p'=>1, 'table'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1), 'allowfullscreen'=>array('iframe'=>1), 'alt'=>array('applet'=>1, 'area'=>1, 'img'=>1, 'input'=>1), 'archive'=>array('applet'=>1, 'object'=>1), 'async'=>array('script'=>1), 'autocomplete'=>array('form'=>1, 'input'=>1), 'autofocus'=>array('button'=>1, 'input'=>1, 'keygen'=>1, 'select'=>1, 'textarea'=>1), 'autoplay'=>array('audio'=>1, 'video'=>1), 'axis'=>array('td'=>1, 'th'=>1), 'bgcolor'=>array('embed'=>1, 'table'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1), 'border'=>array('img'=>1, 'object'=>1, 'table'=>1), 'bordercolor'=>array('table'=>1, 'td'=>1, 'tr'=>1), 'cellpadding'=>array('table'=>1), 'cellspacing'=>array('table'=>1), 'challenge'=>array('keygen'=>1), 'char'=>array('col'=>1, 'colgroup'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1), 'charoff'=>array('col'=>1, 'colgroup'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1), 'charset'=>array('a'=>1, 'script'=>1), 'checked'=>array('command'=>1, 'input'=>1), 'cite'=>array('blockquote'=>1, 'del'=>1, 'ins'=>1, 'q'=>1), 'classid'=>array('object'=>1), 'clear'=>array('br'=>1), 'code'=>array('applet'=>1), 'codebase'=>array('applet'=>1, 'object'=>1), 'codetype'=>array('object'=>1), 'color'=>array('font'=>1), 'cols'=>array('textarea'=>1), 'colspan'=>array('td'=>1, 'th'=>1), 'compact'=>array('dir'=>1, 'dl'=>1, 'menu'=>1, 'ol'=>1, 'ul'=>1), 'content'=>array('meta'=>1), 'controls'=>array('audio'=>1, 'video'=>1), 'coords'=>array('a'=>1, 'area'=>1), 'crossorigin'=>array('img'=>1), 
'data'=>array('object'=>1), 'datetime'=>array('del'=>1, 'ins'=>1, 'time'=>1), 'declare'=>array('object'=>1), 'default'=>array('track'=>1), 'defer'=>array('script'=>1), 'dirname'=>array('input'=>1, 'textarea'=>1), 'disabled'=>array('button'=>1, 'command'=>1, 'fieldset'=>1, 'input'=>1, 'keygen'=>1, 'optgroup'=>1, 'option'=>1, 'select'=>1, 'textarea'=>1), 'download'=>array('a'=>1), 'enctype'=>array('form'=>1), 'face'=>array('font'=>1), 'flashvars'=>array('embed'=>1), 'for'=>array('label'=>1, 'output'=>1), 'form'=>array('button'=>1, 'fieldset'=>1, 'input'=>1, 'keygen'=>1, 'label'=>1, 'object'=>1, 'output'=>1, 'select'=>1, 'textarea'=>1), 'formaction'=>array('button'=>1, 'input'=>1), 'formenctype'=>array('button'=>1, 'input'=>1), 'formmethod'=>array('button'=>1, 'input'=>1), 'formnovalidate'=>array('button'=>1, 'input'=>1), 'formtarget'=>array('button'=>1, 'input'=>1), 'frame'=>array('table'=>1), 'frameborder'=>array('iframe'=>1), 'headers'=>array('td'=>1, 'th'=>1), 'height'=>array('applet'=>1, 'canvas'=>1, 'embed'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'object'=>1, 'td'=>1, 'th'=>1, 'video'=>1), 'high'=>array('meter'=>1), 'href'=>array('a'=>1, 'area'=>1, 'link'=>1), 'hreflang'=>array('a'=>1, 'area'=>1, 'link'=>1), 'hspace'=>array('applet'=>1, 'embed'=>1, 'img'=>1, 'object'=>1), 'icon'=>array('command'=>1), 'ismap'=>array('img'=>1, 'input'=>1), 'keyparams'=>array('keygen'=>1), 'keytype'=>array('keygen'=>1), 'kind'=>array('track'=>1), 'label'=>array('command'=>1, 'menu'=>1, 'option'=>1, 'optgroup'=>1, 'track'=>1), 'language'=>array('script'=>1), 'list'=>array('input'=>1), 'longdesc'=>array('img'=>1, 'iframe'=>1), 'loop'=>array('audio'=>1, 'video'=>1), 'low'=>array('meter'=>1), 'marginheight'=>array('iframe'=>1), 'marginwidth'=>array('iframe'=>1), 'max'=>array('input'=>1, 'meter'=>1, 'progress'=>1), 'maxlength'=>array('input'=>1, 'textarea'=>1), 'media'=>array('a'=>1, 'area'=>1, 'link'=>1, 'source'=>1, 'style'=>1), 'mediagroup'=>array('audio'=>1, 'video'=>1), 
'method'=>array('form'=>1), 'min'=>array('input'=>1, 'meter'=>1), 'model'=>array('embed'=>1), 'multiple'=>array('input'=>1, 'select'=>1), 'muted'=>array('audio'=>1, 'video'=>1), 'name'=>array('a'=>1, 'applet'=>1, 'button'=>1, 'embed'=>1, 'fieldset'=>1, 'form'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'keygen'=>1, 'map'=>1, 'object'=>1, 'output'=>1, 'param'=>1, 'select'=>1, 'slot'=>1, 'textarea'=>1), 'nohref'=>array('area'=>1), 'noshade'=>array('hr'=>1), 'novalidate'=>array('form'=>1), 'nowrap'=>array('td'=>1, 'th'=>1), 'object'=>array('applet'=>1), 'open'=>array('details'=>1, 'dialog'=>1), 'optimum'=>array('meter'=>1), 'pattern'=>array('input'=>1), 'ping'=>array('a'=>1, 'area'=>1), 'placeholder'=>array('input'=>1, 'textarea'=>1), 'pluginspage'=>array('embed'=>1), 'pluginurl'=>array('embed'=>1), 'poster'=>array('video'=>1), 'pqg'=>array('keygen'=>1), 'preload'=>array('audio'=>1, 'video'=>1), 'prompt'=>array('isindex'=>1), 'pubdate'=>array('time'=>1), 'radiogroup'=>array('command'=>1), 'readonly'=>array('input'=>1, 'textarea'=>1), 'rel'=>array('a'=>1, 'area'=>1, 'link'=>1), 'required'=>array('input'=>1, 'select'=>1, 'textarea'=>1), 'rev'=>array('a'=>1), 'reversed'=>array('ol'=>1), 'rows'=>array('textarea'=>1), 'rowspan'=>array('td'=>1, 'th'=>1), 'rules'=>array('table'=>1), 'sandbox'=>array('iframe'=>1), 'scope'=>array('td'=>1, 'th'=>1), 'scoped'=>array('style'=>1), 'scrolling'=>array('iframe'=>1), 'seamless'=>array('iframe'=>1), 'selected'=>array('option'=>1), 'shape'=>array('a'=>1, 'area'=>1), 'size'=>array('font'=>1, 'hr'=>1, 'input'=>1, 'select'=>1), 'sizes'=>array('img'=>1, 'link'=>1, 'source'=>1), 'span'=>array('col'=>1, 'colgroup'=>1), 'src'=>array('audio'=>1, 'embed'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'script'=>1, 'source'=>1, 'track'=>1, 'video'=>1), 'srcdoc'=>array('iframe'=>1), 'srclang'=>array('track'=>1), 'srcset'=>array('img'=>1, 'link'=>1, 'source'=>1), 'standby'=>array('object'=>1), 'start'=>array('ol'=>1), 'step'=>array('input'=>1), 
'summary'=>array('table'=>1), 'target'=>array('a'=>1, 'area'=>1, 'form'=>1), 'type'=>array('a'=>1, 'area'=>1, 'button'=>1, 'command'=>1, 'embed'=>1, 'input'=>1, 'li'=>1, 'link'=>1, 'menu'=>1, 'object'=>1, 'ol'=>1, 'param'=>1, 'script'=>1, 'source'=>1, 'style'=>1, 'ul'=>1), 'typemustmatch'=>array('object'=>1), 'usemap'=>array('img'=>1, 'input'=>1, 'object'=>1), 'valign'=>array('col'=>1, 'colgroup'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1), 'value'=>array('button'=>1, 'data'=>1, 'input'=>1, 'li'=>1, 'meter'=>1, 'option'=>1, 'param'=>1, 'progress'=>1), 'valuetype'=>array('param'=>1), 'vspace'=>array('applet'=>1, 'embed'=>1, 'img'=>1, 'object'=>1), 'width'=>array('applet'=>1, 'canvas'=>1, 'col'=>1, 'colgroup'=>1, 'embed'=>1, 'hr'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'object'=>1, 'pre'=>1, 'table'=>1, 'td'=>1, 'th'=>1, 'video'=>1), 'wmode'=>array('embed'=>1), 'wrap'=>array('textarea'=>1)); 1029 | 1030 | // .. Empty. 1031 | 1032 | static $emptyAttrAr = array('allowfullscreen'=>1, 'checkbox'=>1, 'checked'=>1, 'command'=>1, 'compact'=>1, 'declare'=>1, 'defer'=>1, 'default'=>1, 'disabled'=>1, 'hidden'=>1, 'inert'=>1, 'ismap'=>1, 'itemscope'=>1, 'multiple'=>1, 'nohref'=>1, 'noresize'=>1, 'noshade'=>1, 'nowrap'=>1, 'open'=>1, 'radio'=>1, 'readonly'=>1, 'required'=>1, 'reversed'=>1, 'selected'=>1); 1033 | 1034 | // .. Global. 1035 | 1036 | static $globalAttrAr = array( 1037 | 1038 | // .... General. 1039 | 1040 | 'accesskey'=>1, 'autocapitalize'=>1, 'autofocus'=>1, 'class'=>1, 'contenteditable'=>1, 'contextmenu'=>1, 'dir'=>1, 'draggable'=>1, 'dropzone'=>1, 'enterkeyhint'=>1, 'hidden'=>1, 'id'=>1, 'inert'=>1, 'inputmode'=>1, 'is'=>1, 'itemid'=>1, 'itemprop'=>1, 'itemref'=>1, 'itemscope'=>1, 'itemtype'=>1, 'lang'=>1, 'nonce'=>1, 'role'=>1, 'slot'=>1, 'spellcheck'=>1, 'style'=>1, 'tabindex'=>1, 'title'=>1, 'translate'=>1, 'xmlns'=>1, 'xml:base'=>1, 'xml:lang'=>1, 'xml:space'=>1, 1041 | 1042 | // .... Event. 
1043 | 1044 | 'onabort'=>1, 'onauxclick'=>1, 'onblur'=>1, 'oncancel'=>1, 'oncanplay'=>1, 'oncanplaythrough'=>1, 'onchange'=>1, 'onclick'=>1, 'onclose'=>1, 'oncontextlost'=>1, 'oncontextmenu'=>1, 'oncontextrestored'=>1, 'oncopy'=>1, 'oncuechange'=>1, 'oncut'=>1, 'ondblclick'=>1, 'ondrag'=>1, 'ondragend'=>1, 'ondragenter'=>1, 'ondragleave'=>1, 'ondragover'=>1, 'ondragstart'=>1, 'ondrop'=>1, 'ondurationchange'=>1, 'onemptied'=>1, 'onended'=>1, 'onerror'=>1, 'onfocus'=>1, 'onformchange'=>1, 'onformdata'=>1, 'onforminput'=>1, 'ongotpointercapture'=>1, 'oninput'=>1, 'oninvalid'=>1, 'onkeydown'=>1, 'onkeypress'=>1, 'onkeyup'=>1, 'onload'=>1, 'onloadeddata'=>1, 'onloadedmetadata'=>1, 'onloadend'=>1, 'onloadstart'=>1, 'onlostpointercapture'=>1, 'onmousedown'=>1, 'onmouseenter'=>1, 'onmouseleave'=>1, 'onmousemove'=>1, 'onmouseout'=>1, 'onmouseover'=>1, 'onmouseup'=>1, 'onmousewheel'=>1, 'onpaste'=>1, 'onpause'=>1, 'onplay'=>1, 'onplaying'=>1, 'onpointercancel'=>1, 'onpointerdown'=>1, 'onpointerenter'=>1, 'onpointerleave'=>1, 'onpointermove'=>1, 'onpointerout'=>1, 'onpointerover'=>1, 'onpointerup'=>1, 'onprogress'=>1, 'onratechange'=>1, 'onreadystatechange'=>1, 'onreset'=>1, 'onresize'=>1, 'onscroll'=>1, 'onsearch'=>1, 'onsecuritypolicyviolation'=>1, 'onseeked'=>1, 'onseeking'=>1, 'onselect'=>1, 'onshow'=>1, 'onslotchange'=>1, 'onstalled'=>1, 'onsubmit'=>1, 'onsuspend'=>1, 'ontimeupdate'=>1, 'ontoggle'=>1, 'ontouchcancel'=>1, 'ontouchend'=>1, 'ontouchmove'=>1, 'ontouchstart'=>1, 'onvolumechange'=>1, 'onwaiting'=>1, 'onwheel'=>1, 1045 | 1046 | // .... Aria. 
1047 | 1048 | 'aria-activedescendant'=>1, 'aria-atomic'=>1, 'aria-autocomplete'=>1, 'aria-braillelabel'=>1, 'aria-brailleroledescription'=>1, 'aria-busy'=>1, 'aria-checked'=>1, 'aria-colcount'=>1, 'aria-colindex'=>1, 'aria-colindextext'=>1, 'aria-colspan'=>1, 'aria-controls'=>1, 'aria-current'=>1, 'aria-describedby'=>1, 'aria-description'=>1, 'aria-details'=>1, 'aria-disabled'=>1, 'aria-dropeffect'=>1, 'aria-errormessage'=>1, 'aria-expanded'=>1, 'aria-flowto'=>1, 'aria-grabbed'=>1, 'aria-haspopup'=>1, 'aria-hidden'=>1, 'aria-invalid'=>1, 'aria-keyshortcuts'=>1, 'aria-label'=>1, 'aria-labelledby'=>1, 'aria-level'=>1, 'aria-live'=>1, 'aria-multiline'=>1, 'aria-multiselectable'=>1, 'aria-orientation'=>1, 'aria-owns'=>1, 'aria-placeholder'=>1, 'aria-posinset'=>1, 'aria-pressed'=>1, 'aria-readonly'=>1, 'aria-relevant'=>1, 'aria-required'=>1, 'aria-roledescription'=>1, 'aria-rowcount'=>1, 'aria-rowindex'=>1, 'aria-rowindextext'=>1, 'aria-rowspan'=>1, 'aria-selected'=>1, 'aria-setsize'=>1, 'aria-sort'=>1, 'aria-valuemax'=>1, 'aria-valuemin'=>1, 'aria-valuenow'=>1, 'aria-valuetext'=>1); 1049 | 1050 | static $urlAttrAr = array('action'=>1, 'archive'=>1, 'cite'=>1, 'classid'=>1, 'codebase'=>1, 'data'=>1, 'formaction'=>1, 'href'=>1, 'itemtype'=>1, 'longdesc'=>1, 'model'=>1, 'pluginspage'=>1, 'pluginurl'=>1, 'poster'=>1, 'src'=>1, 'srcset'=>1, 'usemap'=>1); // Excludes style and on* 1051 | 1052 | // .. Deprecated. 
1053 | 1054 | $alterDeprecAttr = 0; 1055 | if ($C['no_deprecated_attr']) { 1056 | static $deprecAttrEleAr = array('align'=>array('caption'=>1, 'div'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'hr'=>1, 'img'=>1, 'input'=>1, 'legend'=>1, 'object'=>1, 'p'=>1, 'table'=>1), 'bgcolor'=>array('table'=>1, 'tbody'=>1, 'td'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1), 'border'=>array('object'=>1), 'bordercolor'=>array('table'=>1, 'td'=>1, 'tr'=>1), 'cellspacing'=>array('table'=>1), 'clear'=>array('br'=>1), 'compact'=>array('dl'=>1, 'ol'=>1, 'ul'=>1), 'height'=>array('td'=>1, 'th'=>1), 'hspace'=>array('img'=>1, 'object'=>1), 'language'=>array('script'=>1), 'name'=>array('a'=>1, 'form'=>1, 'iframe'=>1, 'img'=>1, 'map'=>1), 'noshade'=>array('hr'=>1), 'nowrap'=>array('td'=>1, 'th'=>1), 'size'=>array('hr'=>1), 'vspace'=>array('img'=>1, 'object'=>1), 'width'=>array('hr'=>1, 'pre'=>1, 'table'=>1, 'td'=>1, 'th'=>1)); 1057 | static $deprecAttrPossibleEleAr = array('a'=>1, 'br'=>1, 'caption'=>1, 'div'=>1, 'dl'=>1, 'form'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'hr'=>1, 'iframe'=>1, 'img'=>1, 'input'=>1, 'legend'=>1, 'map'=>1, 'object'=>1, 'ol'=>1, 'p'=>1, 'pre'=>1, 'script'=>1, 'table'=>1, 'td'=>1, 'th'=>1, 'tr'=>1, 'ul'=>1); 1058 | $alterDeprecAttr = isset($deprecAttrPossibleEleAr[$ele]) ? 1 : 0; 1059 | } 1060 | 1061 | // -- Standard attribute values that may need lowercasing. 
1062 | 1063 | if ($C['lc_std_val']) { 1064 | static $lCaseStdAttrValAr = array('all'=>1, 'auto'=>1, 'baseline'=>1, 'bottom'=>1, 'button'=>1, 'captions'=>1, 'center'=>1, 'chapters'=>1, 'char'=>1, 'checkbox'=>1, 'circle'=>1, 'col'=>1, 'colgroup'=>1, 'color'=>1, 'cols'=>1, 'data'=>1, 'date'=>1, 'datetime'=>1, 'datetime-local'=>1, 'default'=>1, 'descriptions'=>1, 'email'=>1, 'file'=>1, 'get'=>1, 'groups'=>1, 'hidden'=>1, 'image'=>1, 'justify'=>1, 'left'=>1, 'ltr'=>1, 'metadata'=>1, 'middle'=>1, 'month'=>1, 'none'=>1, 'number'=>1, 'object'=>1, 'password'=>1, 'poly'=>1, 'post'=>1, 'preserve'=>1, 'radio'=>1, 'range'=>1, 'rect'=>1, 'ref'=>1, 'reset'=>1, 'right'=>1, 'row'=>1, 'rowgroup'=>1, 'rows'=>1, 'rtl'=>1, 'search'=>1, 'submit'=>1, 'subtitles'=>1, 'tel'=>1, 'text'=>1, 'time'=>1, 'top'=>1, 'url'=>1, 'week'=>1); 1065 | static $lCaseStdAttrValPossibleEleAr = array('a'=>1, 'area'=>1, 'bdo'=>1, 'button'=>1, 'col'=>1, 'fieldset'=>1, 'form'=>1, 'img'=>1, 'input'=>1, 'object'=>1, 'ol'=>1, 'optgroup'=>1, 'option'=>1, 'param'=>1, 'script'=>1, 'select'=>1, 'table'=>1, 'td'=>1, 'textarea'=>1, 'tfoot'=>1, 'th'=>1, 'thead'=>1, 'tr'=>1, 'track'=>1, 'xml:space'=>1); 1066 | $lCaseStdAttrVal = isset($lCaseStdAttrValPossibleEleAr[$ele]) ? 1 : 0; 1067 | } 1068 | 1069 | // -- Get attribute name-value pairs. 
1070 | 1071 | if (strpos($attrStr, "\x01") !== false) { // Remove CDATA/comment 1072 | $attrStr = preg_replace('`\x01[^\x01]*\x01`', '', $attrStr); 1073 | } 1074 | $attrStr = trim($attrStr, ' /'); 1075 | $attrAr = array(); 1076 | $state = 0; 1077 | while (strlen($attrStr)) { 1078 | $ok = 0; // For parsing errors, to deal with space, ", and ' characters 1079 | switch ($state) { 1080 | case 0: if (preg_match('`^[^=\s/\x7f-\x9f]+`', $attrStr, $m)) { // Name 1081 | $attr = strtolower($m[0]); 1082 | $ok = $state = 1; 1083 | $attrStr = ltrim(substr_replace($attrStr, '', 0, strlen($m[0]))); 1084 | } 1085 | break; case 1: if ($attrStr[0] == '=') { 1086 | $ok = 1; 1087 | $state = 2; 1088 | $attrStr = ltrim($attrStr, '= '); 1089 | } else { // No value 1090 | $ok = 1; 1091 | $state = 0; 1092 | $attrStr = ltrim($attrStr); 1093 | $attrAr[$attr] = ''; 1094 | } 1095 | break; case 2: if (preg_match('`^((?:"[^"]*")|(?:\'[^\']*\')|(?:\s*[^\s"\']+))(.*)`', $attrStr, $m)) { // Value 1096 | $attrStr = ltrim($m[2]); 1097 | $m = $m[1]; 1098 | $ok = 1; 1099 | $state = 0; 1100 | $attrAr[$attr] = 1101 | trim( 1102 | str_replace('<', '&lt;', 1103 | ($m[0] == '"' || $m[0] == '\'') 1104 | ? substr($m, 1, -1) 1105 | : $m)); 1106 | } 1107 | break; 1108 | } 1109 | if (!$ok) { 1110 | $attrStr = preg_replace('`^(?:"[^"]*("|$)|\'[^\']*(\'|$)|\S)*\s*`', '', $attrStr); 1111 | $state = 0; 1112 | } 1113 | } 1114 | if ($state == 1) { 1115 | $attrAr[$attr] = ''; 1116 | } 1117 | 1118 | // -- Clean attributes. 1119 | 1120 | global $S; 1121 | $eleSpec = isset($S[$ele]) ? $S[$ele] : array(); 1122 | $filtAttrAr = array(); // Finalized attributes 1123 | $deniedAttrAr = $C['deny_attribute']; 1124 | 1125 | foreach ($attrAr as $attr=>$v) { 1126 | 1127 | // .. Check if attribute is permitted. 1128 | 1129 | if ( 1130 | 1131 | // .... Valid attribute. 
1132 | 1133 | ((isset($attrEleAr[$attr][$ele]) 1134 | || isset($globalAttrAr[$attr]) 1135 | || preg_match('`data-((?!xml)[^:]+$)`', $attr) 1136 | || (strpos($ele, '-') 1137 | && strpos($attr, 'data-xml') !== 0)) 1138 | 1139 | // .... No denial through $spec. 1140 | 1141 | && (empty($eleSpec) 1142 | || (!isset($eleSpec['deny']) 1143 | || (!isset($eleSpec['deny']['*']) 1144 | && !isset($eleSpec['deny'][$attr]) 1145 | && !isset($eleSpec['deny'][preg_replace('`^(on|aria|data).+`', '\\1', $attr). '*'])))) 1146 | 1147 | // .... No denial through $config. 1148 | 1149 | && (empty($deniedAttrAr) 1150 | || (isset($deniedAttrAr['*']) 1151 | ? (isset($deniedAttrAr["-$attr"]) 1152 | || isset($deniedAttrAr['-'. preg_replace('`^(on|aria|data)..+`', '\\1', $attr). '*'])) 1153 | : (!isset($deniedAttrAr[$attr]) 1154 | && !isset($deniedAttrAr[preg_replace('`^(on|aria|data).+`', '\\1', $attr). '*']))))) 1155 | 1156 | // .... Permit if permission through $spec. 1157 | 1158 | || (!empty($eleSpec) 1159 | && (isset($eleSpec[$attr]) 1160 | || (isset($globalAttrAr[$attr]) 1161 | && isset($eleSpec[preg_replace('`^(on|aria|data).+`', '\\1', $attr). '*'])))) 1162 | ) { 1163 | 1164 | // .. Attribute with no value or standard value. 1165 | 1166 | if (isset($emptyAttrAr[$attr])) { 1167 | $v = $attr; 1168 | } elseif ( 1169 | !empty($lCaseStdAttrVal) // ! Rather loose but should be ok 1170 | && (($ele != 'button' || $ele != 'input') 1171 | || $attr == 'type') 1172 | ) { 1173 | $v = (isset($lCaseStdAttrValAr[($vNew = strtolower($v))])) ? $vNew : $v; 1174 | } 1175 | 1176 | // .. URLs and CSS expressions in style attribute. 
1177 | 1178 | if ($attr == 'style' && !$C['style_pass']) { 1179 | if (false !== strpos($v, '&#')) { // Change any entity to character 1180 | static $entityAr = array('&#x20;'=>' ', '&#32;'=>' ', '&#x3a;'=>':', '&#58;'=>':', '&#x22;'=>'"', '&#34;'=>'"', '&#x28;'=>'(', '&#40;'=>'(', '&#x29;'=>')', '&#41;'=>')', '&#x2a;'=>'*', '&#42;'=>'*', '&#x2f;'=>'/', '&#47;'=>'/', '&#x5c;'=>'\\', '&#92;'=>'\\', '&#101;'=>'e', '&#69;'=>'e', '&#x45;'=>'e', '&#x65;'=>'e', '&#105;'=>'i', '&#73;'=>'i', '&#x49;'=>'i', '&#x69;'=>'i', '&#108;'=>'l', '&#76;'=>'l', '&#x4c;'=>'l', '&#x6c;'=>'l', '&#110;'=>'n', '&#78;'=>'n', '&#x4e;'=>'n', '&#x6e;'=>'n', '&#111;'=>'o', '&#79;'=>'o', '&#x4f;'=>'o', '&#x6f;'=>'o', '&#112;'=>'p', '&#80;'=>'p', '&#x50;'=>'p', '&#x70;'=>'p', '&#114;'=>'r', '&#82;'=>'r', '&#x52;'=>'r', '&#x72;'=>'r', '&#115;'=>'s', '&#83;'=>'s', '&#x53;'=>'s', '&#x73;'=>'s', '&#117;'=>'u', '&#85;'=>'u', '&#x55;'=>'u', '&#x75;'=>'u', '&#120;'=>'x', '&#88;'=>'x', '&#x58;'=>'x', '&#x78;'=>'x', '&#x27;'=>"'", '&#39;'=>"'"); 1181 | $v = strtr($v, $entityAr); 1182 | } 1183 | $v = 1184 | preg_replace_callback( 1185 | '`(url(?:\()(?: )*(?:\'|"|&(?:quot|apos);)?)(.+?)((?:\'|"|&(?:quot|apos);)?(?: )*(?:\)))`iS', 1186 | 'hl_url', 1187 | $v); 1188 | $v = !$C['css_expression'] 1189 | ? preg_replace('`expression`i', ' ', preg_replace('`\\\\\S|(/|(%2f))(\*|(%2a))`i', ' ', $v)) 1190 | : $v; 1191 | 1192 | // .. URLs in other attributes. 1193 | 1194 | } elseif (isset($urlAttrAr[$attr]) || (isset($globalAttrAr[$attr]) && strpos($attr, 'on') === 0)) { 1195 | $v = 1196 | str_replace("­", ' ', 1197 | (strpos($v, '&') !== false // ! Double-quoted character = soft-hyphen 1198 | ? str_replace(array('&#xad;', '&#173;', '&shy;'), ' ', $v) 1199 | : $v)); 1200 | if ($attr == 'srcset' || ($attr == 'archive' && $ele == 'applet')) { 1201 | $vNew = ''; 1202 | foreach (explode(',', $v) as $k=>$x) { 1203 | $x = explode(' ', ltrim($x), 2); 1204 | $k = isset($x[1]) ? trim($x[1]) : ''; 1205 | $x = trim($x[0]); 1206 | if (isset($x[0])) { 1207 | $vNew .= hl_url($x, $attr). (empty($k) ? '' : ' '. $k). 
', '; 1208 | } 1209 | } 1210 | $v = trim($vNew, ', '); 1211 | } 1212 | if ($attr == 'itemtype' || ($attr == 'archive' && $ele == 'object')) { 1213 | $vNew = ''; 1214 | foreach (explode(' ', $v) as $x) { 1215 | if (isset($x[0])) { 1216 | $vNew .= hl_url($x, $attr). ' '; 1217 | } 1218 | } 1219 | $v = trim($vNew, ' '); 1220 | } else { 1221 | $v = hl_url($v, $attr); 1222 | } 1223 | 1224 | // Anti-spam measure. 1225 | 1226 | if ($attr == 'href') { 1227 | if ($C['anti_mail_spam'] && strpos($v, 'mailto:') === 0) { 1228 | $v = str_replace('@', htmlspecialchars($C['anti_mail_spam']), $v); 1229 | } elseif ($C['anti_link_spam']) { 1230 | $x = $C['anti_link_spam'][1]; 1231 | if (!empty($x) && preg_match($x, $v)) { 1232 | continue; 1233 | } 1234 | $x = $C['anti_link_spam'][0]; 1235 | if (!empty($x) && preg_match($x, $v)) { 1236 | if (isset($filtAttrAr['rel'])) { 1237 | if (!preg_match('`\bnofollow\b`i', $filtAttrAr['rel'])) { 1238 | $filtAttrAr['rel'] .= ' nofollow'; 1239 | } 1240 | } elseif (isset($attrAr['rel'])) { 1241 | if (!preg_match('`\bnofollow\b`i', $attrAr['rel'])) { 1242 | $addNofollow = 1; 1243 | } 1244 | } else { 1245 | $filtAttrAr['rel'] = 'nofollow'; 1246 | } 1247 | } 1248 | } 1249 | } 1250 | } 1251 | 1252 | // .. Check attribute value against any $spec rule. 1253 | 1254 | if (isset($eleSpec[$attr]) 1255 | && is_array($eleSpec[$attr]) 1256 | && ($v = hl_attributeValue($attr, $v, $eleSpec[$attr], $ele)) === 0) { 1257 | continue; 1258 | } 1259 | 1260 | $filtAttrAr[$attr] = str_replace('"', '&quot;', $v); 1261 | } 1262 | } 1263 | 1264 | // -- Add nofollow. 1265 | 1266 | if (isset($addNofollow)) { 1267 | $filtAttrAr['rel'] = isset($filtAttrAr['rel']) ? $filtAttrAr['rel']. ' nofollow' : 'nofollow'; 1268 | } 1269 | 1270 | // -- Add required attributes. 
1271 | 1272 | static $requiredAttrAr = array('area'=>array('alt'=>'area'), 'bdo'=>array('dir'=>'ltr'), 'command'=>array('label'=>''), 'form'=>array('action'=>''), 'img'=>array('src'=>'', 'alt'=>'image'), 'map'=>array('name'=>''), 'optgroup'=>array('label'=>''), 'param'=>array('name'=>''), 'style'=>array('scoped'=>''), 'textarea'=>array('rows'=>'10', 'cols'=>'50')); 1273 | if (isset($requiredAttrAr[$ele])) { 1274 | foreach ($requiredAttrAr[$ele] as $k=>$v) { 1275 | if (!isset($filtAttrAr[$k])) { 1276 | $filtAttrAr[$k] = isset($v[0]) ? $v : $k; 1277 | } 1278 | } 1279 | } 1280 | 1281 | // -- Transform deprecated attributes into CSS declarations in style attribute. 1282 | 1283 | if ($alterDeprecAttr) { 1284 | $css = array(); 1285 | foreach ($filtAttrAr as $name=>$val) { 1286 | if ($name == 'style' || !isset($deprecAttrEleAr[$name][$ele])) { 1287 | continue; 1288 | } 1289 | $val = str_replace(array('\\', ':', ';', '&#'), '', $val); 1290 | if ($name == 'align') { 1291 | unset($filtAttrAr['align']); 1292 | if ($ele == 'img' && ($val == 'left' || $val == 'right')) { 1293 | $css[] = 'float: '. $val; 1294 | } elseif (($ele == 'div' || $ele == 'table') && $val == 'center') { 1295 | $css[] = 'margin: auto'; 1296 | } else { 1297 | $css[] = 'text-align: '. $val; 1298 | } 1299 | } elseif ($name == 'bgcolor') { 1300 | unset($filtAttrAr['bgcolor']); 1301 | $css[] = 'background-color: '. $val; 1302 | } elseif ($name == 'border') { 1303 | unset($filtAttrAr['border']); 1304 | $css[] = "border: {$val}px"; 1305 | } elseif ($name == 'bordercolor') { 1306 | unset($filtAttrAr['bordercolor']); 1307 | $css[] = 'border-color: '. $val; 1308 | } elseif ($name == 'cellspacing') { 1309 | unset($filtAttrAr['cellspacing']); 1310 | $css[] = "border-spacing: {$val}px"; 1311 | } elseif ($name == 'clear') { 1312 | unset($filtAttrAr['clear']); 1313 | $css[] = 'clear: '. ($val != 'all' ? 
$val : 'both'); 1314 | } elseif ($name == 'compact') { 1315 | unset($filtAttrAr['compact']); 1316 | $css[] = 'font-size: 85%'; 1317 | } elseif ($name == 'height' || $name == 'width') { 1318 | unset($filtAttrAr[$name]); 1319 | $css[] = 1320 | $name 1321 | . ': ' 1322 | . ((isset($val[0]) && $val[0] != '*') 1323 | ? $val. (ctype_digit($val) ? 'px' : '') 1324 | : 'auto'); 1325 | } elseif ($name == 'hspace') { 1326 | unset($filtAttrAr['hspace']); 1327 | $css[] = "margin-left: {$val}px; margin-right: {$val}px"; 1328 | } elseif ($name == 'language' && !isset($filtAttrAr['type'])) { 1329 | unset($filtAttrAr['language']); 1330 | $filtAttrAr['type'] = 'text/'. strtolower($val); 1331 | } elseif ($name == 'name') { 1332 | if ($C['no_deprecated_attr'] == 2 || ($ele != 'a' && $ele != 'map')) { 1333 | unset($filtAttrAr['name']); 1334 | } 1335 | if (!isset($filtAttrAr['id']) && !preg_match('`\W`', $val)) { 1336 | $filtAttrAr['id'] = $val; 1337 | } 1338 | } elseif ($name == 'noshade') { 1339 | unset($filtAttrAr['noshade']); 1340 | $css[] = 'border-style: none; border: 0; background-color: gray; color: gray'; 1341 | } elseif ($name == 'nowrap') { 1342 | unset($filtAttrAr['nowrap']); 1343 | $css[] = 'white-space: nowrap'; 1344 | } elseif ($name == 'size') { 1345 | unset($filtAttrAr['size']); 1346 | $css[] = 'size: '. $val. 'px'; 1347 | } elseif ($name == 'vspace') { 1348 | unset($filtAttrAr['vspace']); 1349 | $css[] = "margin-top: {$val}px; margin-bottom: {$val}px"; 1350 | } 1351 | } 1352 | if (count($css)) { 1353 | $css = implode('; ', $css); 1354 | $filtAttrAr['style'] = 1355 | isset($filtAttrAr['style']) 1356 | ? rtrim($filtAttrAr['style'], ' ;'). '; '. $css. ';' 1357 | : $css. ';'; 1358 | } 1359 | } 1360 | 1361 | // -- Enforce unique id attribute values. 
1362 | 1363 | if ($C['unique_ids'] && isset($filtAttrAr['id'])) { 1364 | if (preg_match('`\s`', ($id = $filtAttrAr['id'])) || (isset($GLOBALS['hl_Ids'][$id]) && $C['unique_ids'] == 1)) { 1365 | unset($filtAttrAr['id']); 1366 | } else { 1367 | while (isset($GLOBALS['hl_Ids'][$id])) { 1368 | $id = $C['unique_ids']. $id; 1369 | } 1370 | $GLOBALS['hl_Ids'][($filtAttrAr['id'] = $id)] = 1; 1371 | } 1372 | } 1373 | 1374 | // -- Handle lang attributes. 1375 | 1376 | if ($C['xml:lang'] && isset($filtAttrAr['lang'])) { 1377 | $filtAttrAr['xml:lang'] = isset($filtAttrAr['xml:lang']) ? $filtAttrAr['xml:lang'] : $filtAttrAr['lang']; 1378 | if ($C['xml:lang'] == 2) { 1379 | unset($filtAttrAr['lang']); 1380 | } 1381 | } 1382 | 1383 | // -- If transformed element, modify style attribute. 1384 | 1385 | if (!empty($eleTransformed)) { 1386 | $filtAttrAr['style'] = 1387 | isset($filtAttrAr['style']) 1388 | ? rtrim($filtAttrAr['style'], ' ;'). '; '. $eleTransformed 1389 | : $eleTransformed; 1390 | } 1391 | 1392 | // -- Return opening tag with attributes. 1393 | 1394 | if (empty($C['hook_tag'])) { 1395 | $attrStr = ''; 1396 | foreach ($filtAttrAr as $k=>$v) { 1397 | $attrStr .= " {$k}=\"{$v}\""; 1398 | } 1399 | return "<{$ele}{$attrStr}". (isset($emptyEleAr[$ele]) ? ' /' : ''). '>'; 1400 | } else { 1401 | return call_user_func($C['hook_tag'], $ele, $filtAttrAr); 1402 | } 1403 | } 1404 | 1405 | /** 1406 | * Tidy/beautify HTM by adding newline and other spaces (padding), 1407 | * or compact by removing unnecessary spaces. 1408 | * 1409 | * @param string $t HTM. 1410 | * @param mixed $format -1 (compact) or string (type of padding). 1411 | * @param string $parentEle Parent element of $t. 1412 | * @return string HTM after tidying or compacting. 1413 | */ 1414 | function hl_tidy($t, $format, $parentEle) 1415 | { 1416 | if (strpos(' pre,script,textarea', "$parentEle,")) { 1417 | return $t; 1418 | } 1419 | 1420 | // Hide CDATA/comment. 
1421 | 1422 | if (!function_exists('hl_aux2')) { 1423 | function hl_aux2($x) { 1424 | return 1425 | $x[1] 1426 | . str_replace( 1427 | array("<", ">", "\n", "\r", "\t", ' '), 1428 | array("\x01", "\x02", "\x03", "\x04", "\x05", "\x07"), 1429 | $x[3]) 1430 | . $x[4]; 1431 | } 1432 | } 1433 | $t = 1434 | preg_replace( 1435 | array('`(<\w[^>]*(?<!/)>)\s+`', '`\s+`', '`(<\w[^>]*(?<!/)>) `'), 1436 | array(' $1', ' ', '$1'), 1437 | preg_replace_callback( 1438 | array('`(<(!\[CDATA\[))(.+?)(\]\]>)`sm', '`(<(!--))(.+?)(-->)`sm', '`(<(pre|script|textarea)[^>]*?>)(.+?)(</\2>)`sm'), 1439 | 'hl_aux2', 1440 | $t)); 1441 | 1442 | if (($format = strtolower($format)) == -1) { 1443 | return 1444 | str_replace(array("\x01", "\x02", "\x03", "\x04", "\x05", "\x07"), array('<', '>', "\n", "\r", "\t", ' '), $t); 1445 | } 1446 | $padChar = strpos(" $format", 't') ? "\t" : ' '; 1447 | $padStr = 1448 | preg_match('`\d`', $format, $m) 1449 | ? str_repeat($padChar, intval($m[0])) 1450 | : str_repeat($padChar, ($padChar == "\t" ? 1 : 2)); 1451 | $leadN = preg_match('`[ts]([1-9])`', $format, $m) ? intval($m[1]) : 0; 1452 | 1453 | // Group elements by line-break requirement. 
1454 | 1455 | $postCloseEleAr = array('br'=>1); // After closing 1456 | $preEleAr = array('button'=>1, 'command'=>1, 'input'=>1, 'option'=>1, 'param'=>1, 'track'=>1); // Before opening or closing 1457 | $preOpenPostCloseEleAr = array('audio'=>1, 'canvas'=>1, 'caption'=>1, 'dd'=>1, 'dt'=>1, 'figcaption'=>1, 'h1'=>1, 'h2'=>1, 'h3'=>1, 'h4'=>1, 'h5'=>1, 'h6'=>1, 'isindex'=>1, 'label'=>1, 'legend'=>1, 'li'=>1, 'object'=>1, 'p'=>1, 'pre'=>1, 'style'=>1, 'summary'=>1, 'td'=>1, 'textarea'=>1, 'th'=>1, 'video'=>1); // Before opening and after closing 1458 | $prePostEleAr = array('address'=>1, 'article'=>1, 'aside'=>1, 'blockquote'=>1, 'center'=>1, 'colgroup'=>1, 'datalist'=>1, 'details'=>1, 'dialog'=>1, 'dir'=>1, 'div'=>1, 'dl'=>1, 'fieldset'=>1, 'figure'=>1, 'footer'=>1, 'form'=>1, 'header'=>1, 'hgroup'=>1, 'hr'=>1, 'iframe'=>1, 'main'=>1, 'map'=>1, 'menu'=>1, 'nav'=>1, 'noscript'=>1, 'ol'=>1, 'optgroup'=>1, 'picture'=>1, 'rbc'=>1, 'rtc'=>1, 'ruby'=>1, 'script'=>1, 'section'=>1, 'select'=>1, 'table'=>1, 'tbody'=>1, 'template'=>1, 'tfoot'=>1, 'thead'=>1, 'tr'=>1, 'ul'=>1); // Before and after opening and closing 1459 | 1460 | $doPad = 1; 1461 | $t = explode('<', $t); 1462 | while ($doPad) { 1463 | $n = $leadN; 1464 | $eleAr = $t; 1465 | ob_start(); 1466 | if (isset($prePostEleAr[$parentEle])) { 1467 | echo str_repeat($padStr, ++$n); 1468 | } 1469 | echo ltrim(array_shift($eleAr)); 1470 | for ($i=-1, $j=count($eleAr); ++$i<$j;) { 1471 | $rest = ''; 1472 | list($tag, $rest) = explode('>', $eleAr[$i]); 1473 | $open = $tag[0] == '/' ? 0 : (substr($tag, -1) == '/' ? 1 : ($tag[0] != '!' ? 2 : -1)); 1474 | $ele = !$open ? ltrim($tag, '/') : ($open > 0 ? 
substr($tag, 0, strcspn($tag, ' ')) : 0); 1475 | $tag = "<$tag>"; 1476 | if (isset($prePostEleAr[$ele])) { 1477 | if (!$open) { 1478 | if ($n) { 1479 | echo "\n", str_repeat($padStr, --$n), "$tag\n", str_repeat($padStr, $n); 1480 | } else { 1481 | ++$leadN; 1482 | ob_end_clean(); 1483 | continue 2; 1484 | } 1485 | } else { 1486 | echo "\n", str_repeat($padStr, $n), "$tag\n", str_repeat($padStr, ($open != 1 ? ++$n : $n)); 1487 | } 1488 | echo $rest; 1489 | continue; 1490 | } 1491 | $pad = "\n". str_repeat($padStr, $n); 1492 | if (isset($preOpenPostCloseEleAr[$ele])) { 1493 | if (!$open) { 1494 | echo $tag, $pad, $rest; 1495 | } else { 1496 | echo $pad, $tag, $rest; 1497 | } 1498 | } elseif (isset($preEleAr[$ele])) { 1499 | echo $pad, $tag, $rest; 1500 | } elseif (isset($postCloseEleAr[$ele])) { 1501 | echo $tag, $pad, $rest; 1502 | } elseif (!$ele) { 1503 | echo $pad, $tag, $pad, $rest; 1504 | } else { 1505 | echo $tag, $rest; 1506 | } 1507 | } 1508 | $doPad = 0; 1509 | } 1510 | $t = str_replace(array("\n ", " \n"), "\n", preg_replace('`[\n]\s*?[\n]+`', "\n", ob_get_contents())); 1511 | ob_end_clean(); 1512 | if (($newline = strpos(" $format", 'r') ? (strpos(" $format", 'n') ? "\r\n" : "\r") : 0)) { 1513 | $t = str_replace("\n", $newline, $t); 1514 | } 1515 | return str_replace(array("\x01", "\x02", "\x03", "\x04", "\x05", "\x07"), array('<', '>', "\n", "\r", "\t", ' '), $t); 1516 | } 1517 | 1518 | /** 1519 | * Handle URL to convert to relative/absolute type, 1520 | * block scheme, or add anti-spam text. 1521 | * 1522 | * @param mixed $url URL string, or array with URL value (if $attr is null). 1523 | * @param mixed $attr Attribute name string, or null (if $url is array). 1524 | * @return string With URL after any conversion/obfuscation. 
1525 | */ 1526 | function hl_url($url, $attr=null) 1527 | { 1528 | global $C; 1529 | $preUrl = $postUrl = ''; 1530 | static $blocker = 'denied:'; 1531 | if ($attr == null) { // style attribute value 1532 | $attr = 'style'; 1533 | $preUrl = $url[1]; 1534 | $postUrl = $url[3]; 1535 | $url = trim($url[2]); 1536 | } 1537 | $okSchemeAr = isset($C['schemes'][$attr]) ? $C['schemes'][$attr] : $C['schemes']['*']; 1538 | if (isset($okSchemeAr['!']) && substr($url, 0, 7) != $blocker) { 1539 | $url = "{$blocker}{$url}"; 1540 | } 1541 | if (isset($okSchemeAr['*']) 1542 | || !strcspn($url, '#?;') 1543 | || substr($url, 0, strlen($blocker)) == $blocker 1544 | ) { 1545 | return "{$preUrl}{$url}{$postUrl}"; 1546 | } 1547 | if (preg_match('`^([^:?[@!$()*,=/\'\]]+?)(:|&(#(58|x3a)|colon);|%3a|\\\\0{0,4}3a).`i', $url, $m) 1548 | && !isset($okSchemeAr[strtolower($m[1])]) // Special crafting suggests malice 1549 | ) { 1550 | return "{$preUrl}{$blocker}{$url}{$postUrl}"; 1551 | } 1552 | if ($C['abs_url']) { 1553 | if ($C['abs_url'] == -1 && strpos($url, $C['base_url']) === 0) { // Make URL relative 1554 | $url = substr($url, strlen($C['base_url'])); 1555 | } elseif (empty($m[1])) { // Make URL absolute 1556 | if (substr($url, 0, 2) == '//') { 1557 | $url = substr($C['base_url'], 0, strpos($C['base_url'], ':') + 1). $url; 1558 | } elseif ($url[0] == '/') { 1559 | $url = preg_replace('`(^.+?://[^/]+)(.*)`', '$1', $C['base_url']). $url; 1560 | } elseif (strcspn($url, './')) { 1561 | $url = $C['base_url']. $url; 1562 | } else { 1563 | preg_match('`^([a-zA-Z\d\-+.]+://[^/]+)(.*)`', $C['base_url'], $m); 1564 | $url = preg_replace('`(?<=/)\./`', '', $m[2]. $url); 1565 | while (preg_match('`(?<=/)([^/]{3,}|[^/.]+?|\.[^/.]|[^/.]\.)/\.\./`', $url)) { 1566 | $url = preg_replace('`(?<=/)([^/]{3,}|[^/.]+?|\.[^/.]|[^/.]\.)/\.\./`', '', $url); 1567 | } 1568 | $url = $m[1]. $url; 1569 | } 1570 | } 1571 | } 1572 | return "{$preUrl}{$url}{$postUrl}"; 1573 | } 1574 | 1575 | /** 1576 | * Report version. 
1577 | * 1578 | * @return string Version. 1579 | */ 1580 | function hl_version() 1581 | { 1582 | return '1.2.15'; 1583 | } 1584 | -------------------------------------------------------------------------------- /htmLawed_TESTCASE.txt: -------------------------------------------------------------------------------- 1 | /* 2 | htmLawed_TESTCASE.txt, 23 January 2023 3 | To test htmLawed 4 | Copyright Santosh Patnaik 5 | Dual licensed with LGPL 3 and GPL 2+ 6 | A PHP Labware internal utility - www.bioinformatics.org/phplabware/internal_utilities/htmLawed 7 | */ 8 | 9 | This file has UTF-8-encoded text with both correct and incorrect/malformed HTML/XHTML code snippets to test htmLawed (test cases/samples). The entire text may also be used as a unit. 10 | 11 | ************************************************ 12 | when viewing this file in a web browser, set the 13 | character encoding to Unicode/UTF-8 14 | ************************************************ 15 | 16 | --------------------- start -------------------- 17 | 18 | Try different $config and $spec values. Some text even when filtered in will not be displayed in a rendered web-page
19 | 20 |
Attributes
21 | 22 | Xml:lang:, ,
23 | Standard, predefined value, or empty attribute: , ,
24 | Required: , image
25 | Quote & space variation: a, a, a
26 | Invalid: a
27 | Duplicated: a
28 | Deprecated: a,

29 | Casing:
30 | Custom: image
31 | Data-*: a
32 | Admin-restricted?: 33 | 34 |
Attribute values
35 | 36 | Duplicate ID value:, ,
37 | (try 'my_' for prefix)
38 | Double-quotes in value:, ,
39 | (try filter for CSS expression)
40 | CSS expression:

41 | Other: ,
42 | (try 'maxlen', 'maxval', etc., for 'input' in '$spec') 43 | 44 |
Blockquotes
45 | 46 |
abc

47 |
abc
def

48 |
abc
def

49 |
abc
def
ghi

50 | abc
def
ghi
51 |
QQQ
x

52 |
x
QQQ

53 |
x
QQQ
x

54 |
x
QQQ

x


55 |
56 | (try with blockquote parent) 57 | 58 |
CDATA sections
59 | 60 | Special characters inside: ]]>, 3.5, & 4 > 4 ]]>
61 | Normal: , CDATA follows:
62 | Malformed: , < ![CDATA check ]]>, , < ![CDATA check ] ]>
63 | Invalid: >CDATA in tag content,
text not allowed
64 | 65 |
Complex-1: deprecated elements
66 | 67 |
68 | The PHP software script used for this web-page webpage is htmLawedTest.php, from PHP Labware. 69 |
70 | 71 |
Complex-2: deprecated attributes
72 | 73 | aa 74 |
75 |
76 | image 77 | 78 | 79 | 86 | 89 | 90 |
80 |
81 |

Section

82 |

Para

83 |
  1. First item
84 |
85 |
87 |
  1. First item
88 |
91 |
92 | 93 |
Complex-3: embed, object, area
94 | 95 |
96 | 97 |
98 | 99 | 100 |

navigate the site: 1 | 3 | 4

101 | 102 |
103 | 104 | value 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 |
Complex-4: nested and other tables
114 | 115 |
Cell
Cell
Cell
Cell Cell Cell
Cell
Cell Cell Cell

116 | PCDATA wrong: Well
Hello

117 | Missing tr:
Well

118 | 119 |
Complex-5: pseudo, disallowed or non-HTML tags
120 | 121 | (Try different 'keep_bad' values) 122 | <*> Pseudotags <*> 123 | Non-HTML tag xml 124 |

125 | Disallowed tag p 126 |

127 |
    Bad
  • OK
128 | 129 |
Elements
130 | 131 | Unbalanced: check
132 | Non-XHTML:

133 | Malformed: < a href="">, , , , < /a>, < a href="">, a, a,
135 | Invalid: a
136 | Empty: a, a, atext
137 | Content invalid: 12
138 | Content invalid?:

(try setting 'form' as parent)
139 | Casing:
140 | Check for tidy:



hi
141 | Customized element: 142 | Custom element: Click me?A beautiful tree towering over an empty savannah 143 | Custom element: 144 | Facebook 145 | G+ 146 | xx 147 | 148 | 149 | Math: 2 = 2 150 | SVG: 151 | 152 |
Entities
153 | 154 | Special: & 3 < 2 & 5>4 and j >i >a & ia
155 | Padding: B B f f  
156 | Malformed: & #x27;, &x27;, ' &TILDE;, &tilde
157 | Invalid: , �, , �, ￿, &bad;
158 | Discouraged characters: , „, ﷠, 􏿾
159 | Context: '>', <?
160 | Casing: ', ', &TILDE;, ˜ 161 |
162 | (also check named-to-numeric and hexdec-to-decimal, and vice versa, conversions) 163 | 164 |
Format
165 | 166 | Valid but ill-formatted: text 167 | text 169 | text text
p r e
174 | text text

177 | text none text 178 | text none t e x t 179 |
text none t e x t 180 | 181 | text none t e x t 182 | 183 |
184 |
p r e  
185 |
186 | 				pre
187 | 		
188 |
189 |
Cell
Cell
Cell
CellCellCell
Cell
CellCellCell
190 | (try to compact or beautify) 191 | 192 |
Forms
193 | 194 | (note nesting of 'form', missing required attributes, etc.)
195 |
196 | 197 |
pl
198 | h 199 | 200 |

201 |


202 |
B:C:

203 | (try each of these lines separately)
204 |
what
205 | what 206 | (try with container as div and as form)
207 | c a b 208 | 209 |
HTML comments (also CDATA)
210 | 211 | Script inside:
214 | Special characters inside: , , , c
215 | Normal: , , comment:,
text not allowed

216 | Malformed: , < ![CDATA check ]]>, < ![CDATA check ] ]>
217 | Invalid:
>comment in tag content, 218 | 219 |
HTML5
220 | 221 | figure and figcaption:
picture
Caption for the awesome picture
222 | article:

A

B

C

E

F

G

223 | meter:

Heat 150.

224 | datalist: 225 | 226 |
Ins-Del
227 | 228 | (depending on context, these elements can be of either block or inline type)
229 |

block


230 |

d


231 |

d

d

d
232 | 233 |
Lists
234 | 235 | Invalid character data:
  • (item
  • )

236 | Definition list:
a
bad
first one
b
second

237 | Definition list, close-tags omitted:
a
bad
first one
b
second

238 | Definition lists, nested:
239 |
T1
240 |
D1
241 |
T2
242 |
D2
t1
d1
t2
d2
243 |
T3
244 |
D3
245 |
T4
246 |
D4
t1
d1
247 |

248 | Definition lists, nested, close-tags omitted:
249 |
T1 250 |
D1
251 |
T2
252 |
D2
t1
d1
t2
d2
253 |
T3 254 |
D3 255 |
T4 256 |
D4
t1
d1
257 |

258 | Nested:
    259 |
  • l1
  • 260 |
  • l2
    1. lo1
    2. lo2
  • 261 |
  • l3
  • 262 |
  • l4
    1. lo3
    2. lo4
      1. lo5
  • 263 |

264 | Nested, directly:
    265 |
  • l1
  • 266 |
      l2
    267 |
  • l3
  • 268 |

269 | Nested, close-tags omitted:
    270 |
  • l1
  • 271 |
  • l2
    1. lo1
    2. lo2
    272 |
  • l3 273 |
  • l4
    1. lo3
    2. lo4
      1. lo5
    274 |

275 | Complex: 276 |
  1. 277 |
    285 |
286 | Menu:
  • 287 | 288 |
  • 289 |
    290 | 291 |
    Microdata
    292 | 293 |
    294 | I am X but people call me Y. 295 | Find me at 296 |
    297 | 298 |
    Microsoft Word
    299 | 300 | Proprietary tag:

     


    301 | XML declaration:
    302 | XML-invalid character code-point (may not replicate):

    “Where is he?” asked both Mary – the one so lovely – and Jane.

    303 | 304 |
    Nesting
    305 | 306 | Block or inline a:

    text

    hi

    307 | 308 |
    Non-English text-1
    309 | 310 | Inscrieţi-vă acum la a Zecea Conferinţă Internaţională
    311 | გთხოვთ ახლავე გაიაროთ რეგისტრაცია
    312 | večjezično računalništvo
    313 | อ.อ่าง
    314 | Зарегистрируйтесь сейчас 316 | на Десятую Международную Конференцию по
    317 | (this file should have utf-8 encoding; some characters may not be displayed because of missing fonts, etc.) 318 | 319 |
    Non-English text-2: entities
    320 | 321 | 用统一码
    322 | გთხოვთ
    323 | Inscreva-se agora para a Décima Conferência Internacional Sobre O Unicode, realizada entre os dias 10 e 12 de março de 1997 em Mainz 324 | na Alemanha. 325 | 326 |
    Ruby
    327 | 328 | (need compatible browser)
    329 | 330 | 331 | 332 | 333 | 334 | 335 | 336 | 337 | さい 338 | とう 339 | のぶ 340 | 341 | 342 | 343 | W3C Associate Chairman 344 | 345 |
    346 | 347 | WWW 348 | (World Wide Web) 349 |
    350 | 351 | A 352 | (aaa) 353 | 354 | 355 | 356 |
    Tables
    357 | 358 | Omitted closing tags: 359 | 360 | 361 | 363 |
    h1c1h1c2 362 |
    r1c1r1c2 364 |
    r2c1r2c2 365 |

    366 | Nested, omitted closing tags: 367 | 368 | 369 | 371 |
    h1c1h1c2 370 |
    r1c1r1c2 372 | 373 | 374 | 376 |
    h1c1h1c2 375 |
    r1c1r1c2 377 |
    r2c1r2c2 378 |
    379 |
    r2c1r2c2 380 |

    381 | 382 |
    Tag transformation
    383 | Font element with malicious code:


    384 | Font element intended as 'inline' element:

    hi


    385 | Font element intended as 'block' element:
    hi

    386 | Font element intended as 'block' element:
    hi
    QQQ

    387 | 388 |
    Tidy
    389 | White-space handling: abc def ghi abc def ghi 390 | 391 |
    URLs
    392 | 393 | Relative and absolute: , , , , , ,
    394 | (try base URL value of 'http://a.com/b/')
    395 | CSS URLs:
    ,
    ,
    ,
    ,

    396 | Double URLs: b
    397 | Anti-spam: (try regex for 'http://a.com', etc.) , , , , , , ,
    398 | Soft-hyphen: ídis­c 399 | 400 |
    XSS
    401 | 402 | <img onmouseover=confirm(1)// 403 | '';!--"=&{()}
    404 |
    405 |
    406 |
    407 |
    408 |
    410 | test 411 | 412 |

    413 |

    414 |

    415 |
    416 |
    417 |

    418 | test
    419 | Bad IE7: x
    421 | Opera: link 422 | Bad IE7: xxx
    423 | Bad IE7: xxx
    424 | Bad IE7: xxx
    425 | Bad IE7: xxx
    426 | Bad IE7: xxx
    427 | Bad IE7: xxx
    428 | Bad IE7: xxx
    429 | Bad IE7: xxx
    430 | Bad IE7: xxx
    431 | Bad IE7: xxx
    432 | Bad IE7: xxx
    433 | Bad IE7: xxx
    434 | Bad IE7: xxx
    435 | Bad IE7: x
    436 | Bad IE7: x
    437 | Bad IE7: x
    438 | Bad IE7: x
    439 | Bad IE7: exp/*x
    440 | Bad IE7: hi
    441 | Bad IE7: hi
    442 | Bad IE7: test
    443 | Bad IE7: hi
    444 | Bad IE7: hi
    446 | 447 |
    Other
    448 | 449 | 3 < 4
    450 | 3 > 4
    451 | > 3
    452 | <._.> hi!
    453 | <<< ALERT >>>
    454 | some stuff
    455 |
    456 |
    457 |
    458 | if(13age){say 'teen'}
    459 | age >51 and a smoking history of >51 pack-years was
    460 | age > 51 and a smoking history of >51 pack-years was
    461 | age <51 and a smoking history of <51 pack-years was
    462 | age < 51 and a smoking history of < 51 pack-years was
    463 | age >51 and a smoking history of >51 pack-years
    464 | age > 51 and a smoking history of >51 pack-years
    465 | age <51 and a smoking history of <51 pack-years
    466 | age < 51 and a smoking history of < 51 pack-years
    467 | --------------------------------------------------------------------------------
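The "Duplicate ID value" test case above (with its hint to try 'my_' for the prefix) exercises the "Enforce unique id attribute values" step in htmLawed.php. The following is a minimal standalone sketch of that step's three branches, not the library's own code: the function name `sketch_unique_id` and the explicit `$seen` array are assumptions made for illustration (htmLawed itself tracks ids in `$GLOBALS['hl_Ids']` and reads the setting from `$C['unique_ids']`).

```php
<?php
/**
 * Hypothetical helper (not part of htmLawed): decide what happens to one
 * id value, following the branches of the 'unique_ids' check.
 *
 * @param string $id        Candidate id value.
 * @param mixed  $uniqueIds 0 (off), 1 (drop duplicates), or a string prefix.
 * @param array  $seen      Ids already used, keyed by id (updated in place).
 * @return string|null Id to keep, or null if the attribute should be dropped.
 */
function sketch_unique_id($id, $uniqueIds, array &$seen)
{
    // Feature switched off: the id passes through unchanged and untracked.
    if (!$uniqueIds) {
        return $id;
    }
    // An id containing white-space is dropped outright.
    if (preg_match('`\s`', $id)) {
        return null;
    }
    // With a setting of 1, a repeated id is dropped rather than renamed
    // (loose comparison, as in the source).
    if (isset($seen[$id]) && $uniqueIds == 1) {
        return null;
    }
    // Otherwise keep prepending the prefix until the id is unused.
    while (isset($seen[$id])) {
        $id = $uniqueIds . $id;
    }
    $seen[$id] = 1;
    return $id;
}

// Three elements share the id 'top'; a fourth has an invalid id.
$seen = array();
$out = array();
foreach (array('top', 'top', 'top', 'a b') as $id) {
    $out[] = sketch_unique_id($id, 'my_', $seen);
}
var_export($out); // 'top', 'my_top', 'my_my_top', NULL
```

Because the prefix is prepended repeatedly, a third duplicate becomes 'my_my_top' rather than receiving a counter suffix; with the setting at 1 the second and third occurrences would simply lose their id attribute.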