Diffstat (limited to 'library/htmlpurifier-4.6.0-lite')
-rw-r--r-- | library/htmlpurifier-4.6.0-lite/CREDITS | 9 |
-rw-r--r-- | library/htmlpurifier-4.6.0-lite/INSTALL | 374 |
-rw-r--r-- | library/htmlpurifier-4.6.0-lite/LICENSE | 504 |
-rw-r--r-- | library/htmlpurifier-4.6.0-lite/NEWS | 1078 |
4 files changed, 1965 insertions, 0 deletions
diff --git a/library/htmlpurifier-4.6.0-lite/CREDITS b/library/htmlpurifier-4.6.0-lite/CREDITS new file mode 100644 index 000000000..7921b45af --- /dev/null +++ b/library/htmlpurifier-4.6.0-lite/CREDITS @@ -0,0 +1,9 @@ + +CREDITS + +Almost everything written by Edward Z. Yang (Ambush Commander). Lots of thanks +to the DevNetwork Community for their help (see docs/ref-devnetwork.html for +more details), Feyd especially (namely IPv6 and optimization). Thanks to RSnake +for letting me package his fantastic XSS cheatsheet for a smoketest. + + vim: et sw=4 sts=4 diff --git a/library/htmlpurifier-4.6.0-lite/INSTALL b/library/htmlpurifier-4.6.0-lite/INSTALL new file mode 100644 index 000000000..677c04aa0 --- /dev/null +++ b/library/htmlpurifier-4.6.0-lite/INSTALL @@ -0,0 +1,374 @@ + +Install + How to install HTML Purifier + +HTML Purifier is designed to run out of the box, so actually using the +library is extremely easy. (Although... if you were looking for a +step-by-step installation GUI, you've downloaded the wrong software!) + +While the impatient can get going immediately with some of the sample +code at the bottom of this library, it's well worth reading this entire +document--most of the other documentation assumes that you are familiar +with these contents. + + +--------------------------------------------------------------------------- +1. Compatibility + +HTML Purifier is PHP 5 only, and is actively tested from PHP 5.0.5 and +up. It has no core dependencies with other libraries. PHP +4 support was deprecated on December 31, 2007 with HTML Purifier 3.0.0. +HTML Purifier is not compatible with zend.ze1_compatibility_mode. 
+ +These optional extensions can enhance the capabilities of HTML Purifier: + + * iconv : Converts text to and from non-UTF-8 encodings + * bcmath : Used for unit conversion and imagecrash protection + * tidy : Used for pretty-printing HTML + +These optional libraries can enhance the capabilities of HTML Purifier: + + * CSSTidy : Clean CSS stylesheets using %Core.ExtractStyleBlocks + * Net_IDNA2 (PEAR) : IRI support using %Core.EnableIDNA + +--------------------------------------------------------------------------- +2. Reconnaissance + +A big plus of HTML Purifier is its inerrant support of standards, so +your web-pages should be standards-compliant. (They should also use +semantic markup, but that's another issue altogether, one HTML Purifier +cannot fix without reading your mind.) + +HTML Purifier can process these doctypes: + +* XHTML 1.0 Transitional (default) +* XHTML 1.0 Strict +* HTML 4.01 Transitional +* HTML 4.01 Strict +* XHTML 1.1 + +...and these character encodings: + +* UTF-8 (default) +* Any encoding iconv supports (with crippled internationalization support) + +These defaults reflect what my choices would be if I were authoring an +HTML document, however, what you choose depends on the nature of your +codebase. If you don't know what doctype you are using, you can determine +the doctype from this identifier at the top of your source code: + + <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> + +...and the character encoding from this code: + + <meta http-equiv="Content-type" content="text/html;charset=ENCODING"> + +If the character encoding declaration is missing, STOP NOW, and +read 'docs/enduser-utf8.html' (web accessible at +http://htmlpurifier.org/docs/enduser-utf8.html). In fact, even if it is +present, read this document anyway, as many websites specify their +document's character encoding incorrectly. 
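If you only want to see which encoding a saved page claims to use, the check can be sketched in shell. This is a rough sketch, not part of HTML Purifier: the function name declared_charset is made up for illustration, it assumes the page has been saved to a local file, and it only reads the http-equiv meta form shown above (it ignores HTTP headers and says nothing about whether the declaration is actually correct). It also relies on grep -o, which GNU and BSD grep provide.

```shell
# Hypothetical helper (not shipped with HTML Purifier): print the charset
# declared in a saved HTML file's <meta http-equiv> tag. Prints nothing
# when no "charset=..." token is found.
declared_charset() {
    # Take the first "charset=VALUE" token, stopping at a space, quote,
    # '>' or ';', then strip everything up to and including the '='.
    grep -i -o 'charset=[^ ">;]*' "$1" | head -n 1 | sed 's/^[^=]*=//'
}

# Usage, assuming the page was saved as page.html:
# declared_charset page.html
```

An empty result means the declaration is missing (or written in a form this sketch does not recognize), which is the case where reading docs/enduser-utf8.html matters most.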
+ + +--------------------------------------------------------------------------- +3. Including the library + +The procedure is quite simple: + + require_once '/path/to/library/HTMLPurifier.auto.php'; + +This will set up an autoloader, so the library's files are only included +when you use them. + +Only the contents in the library/ folder are necessary, so you can remove +everything else when using HTML Purifier in a production environment. + +If you installed HTML Purifier via PEAR, all you need to do is: + + require_once 'HTMLPurifier.auto.php'; + +Please note that the usual PEAR practice of including just the classes you +want will not work with HTML Purifier's autoloading scheme. + +Advanced users, read on; other users can skip to section 4. + +Autoload compatibility +---------------------- + + HTML Purifier attempts to be as smart as possible when registering an + autoloader, but there are some cases where you will need to change + your own code to accommodate HTML Purifier. These are those cases: + + PHP VERSION IS LESS THAN 5.1.2, AND YOU'VE DEFINED __autoload + Because spl_autoload_register() doesn't exist in early versions + of PHP 5, HTML Purifier has no way of adding itself to the autoload + stack. Modify your __autoload function to test + HTMLPurifier_Bootstrap::autoload($class) + + For example, suppose your autoload function looks like this: + + function __autoload($class) { + require str_replace('_', '/', $class) . '.php'; + return true; + } + + A modified version with HTML Purifier would look like this: + + function __autoload($class) { + if (HTMLPurifier_Bootstrap::autoload($class)) return true; + require str_replace('_', '/', $class) . '.php'; + return true; + } + + Note that there *is* some custom behavior in our autoloader; the + original autoloader in our example would work 99% of the time, + but would fail when including language files. 
+ + AN __autoload FUNCTION IS DECLARED AFTER OUR AUTOLOADER IS REGISTERED + spl_autoload_register() has the curious behavior of disabling + the existing __autoload() handler. Users need to explicitly call + spl_autoload_register('__autoload'). Because we use SPL when it + is available, __autoload() will ALWAYS be disabled. If __autoload() + is declared before HTML Purifier is loaded, this is not a problem: + HTML Purifier will register the function for you. But if it is + declared afterwards, it will mysteriously not work. This + snippet of code (after your autoloader is defined) will fix it: + + spl_autoload_register('__autoload'); + + Users should also be on guard if they use a version of PHP previous + to 5.1.2 without an autoloader--HTML Purifier will define __autoload() + for you, which can collide with an autoloader that was added by *you* + later. + + +For better performance +---------------------- + + Opcode caches, which greatly speed up PHP initialization for scripts + with large amounts of code (HTML Purifier included), don't like + autoloaders. We offer an include file that includes all of HTML Purifier's + files in one go in an opcode cache friendly manner: + + // If /path/to/library isn't already in your include path, uncomment + // the below line: + // require '/path/to/library/HTMLPurifier.path.php'; + + require 'HTMLPurifier.includes.php'; + + Optional components still need to be included--you'll know if you try to + use a feature and you get a "class doesn't exist" error! The autoloader + can be used in conjunction with this approach to catch classes that are + missing. Simply add this afterwards: + + require 'HTMLPurifier.autoload.php'; + +Standalone version +------------------ + + HTML Purifier has a standalone distribution; you can also generate + a standalone file from the full version by running the script + maintenance/generate-standalone.php. 
The standalone version has the + benefit of having most of its code in one file, so parsing is much + faster and the library is easier to manage. + + If HTMLPurifier.standalone.php exists in the library directory, you + can use it like this: + + require '/path/to/HTMLPurifier.standalone.php'; + + This is equivalent to including HTMLPurifier.includes.php, except that + the contents of standalone/ will be added to your path. To override this + behavior, specify a new HTMLPURIFIER_PREFIX where standalone files can + be found (usually, this will be one directory up, the "true" library + directory in full distributions). Don't forget to set your path too! + + The autoloader can be added to the end to ensure the classes are + loaded when necessary; otherwise you can manually include them. + To use the autoloader, use this: + + require 'HTMLPurifier.autoload.php'; + +For advanced users +------------------ + + HTMLPurifier.auto.php performs a number of operations that can be done + individually. These are: + + HTMLPurifier.path.php + Puts /path/to/library in the include path. For high performance, + this should be done in php.ini. + + HTMLPurifier.autoload.php + Registers our autoload handler HTMLPurifier_Bootstrap::autoload($class). + + You can do these operations by yourself--in fact, you must modify your own + autoload handler if you are using a version of PHP earlier than PHP 5.1.2 + (See "Autoload compatibility" above). + + +--------------------------------------------------------------------------- +4. Configuration + +HTML Purifier is designed to run out-of-the-box, but occasionally HTML +Purifier needs to be told what to do. If you answer no to any of these +questions, read on; otherwise, you can skip to the next section (or, if you're +into configuring things just for the heck of it, skip to 4.3). + +* Am I using UTF-8? +* Am I using XHTML 1.0 Transitional? 
+ +If you answered no to any of these questions, instantiate a configuration +object and read on: + + $config = HTMLPurifier_Config::createDefault(); + + +4.1. Setting a different character encoding + +You really shouldn't use any other encoding except UTF-8, especially if you +plan to support multilingual websites (read section three for more details). +However, switching to UTF-8 is not always immediately feasible, so we can +adapt. + +HTML Purifier uses iconv to support other character encodings, as such, +any encoding that iconv supports <http://www.gnu.org/software/libiconv/> +HTML Purifier supports with this code: + + $config->set('Core.Encoding', /* put your encoding here */); + +An example usage for Latin-1 websites (the most common encoding for English +websites): + + $config->set('Core.Encoding', 'ISO-8859-1'); + +Note that HTML Purifier's support for non-Unicode encodings is crippled by the +fact that any character not supported by that encoding will be silently +dropped, EVEN if it is ampersand escaped. If you want to work around +this, you are welcome to read docs/enduser-utf8.html for a fix, +but please be cognizant of the issues the "solution" creates (for this +reason, I do not include the solution in this document). + + +4.2. Setting a different doctype + +For those of you using HTML 4.01 Transitional, you can disable +XHTML output like this: + + $config->set('HTML.Doctype', 'HTML 4.01 Transitional'); + +Other supported doctypes include: + + * HTML 4.01 Strict + * HTML 4.01 Transitional + * XHTML 1.0 Strict + * XHTML 1.0 Transitional + * XHTML 1.1 + + +4.3. Other settings + +There are more configuration directives which can be read about +here: <http://htmlpurifier.org/live/configdoc/plain.html> They're a bit boring, +but they can help out for those of you who like to exert maximum control over +your code. 
Some of the more interesting ones are configurable at the +demo <http://htmlpurifier.org/demo.php> and are well worth looking into +for your own system. + +For example, you can fine tune allowed elements and attributes, convert +relative URLs to absolute ones, and even autoparagraph input text! These +are, respectively, %HTML.Allowed, %URI.MakeAbsolute and %URI.Base, and +%AutoFormat.AutoParagraph. The %Namespace.Directive naming convention +translates to: + + $config->set('Namespace.Directive', $value); + +E.g. + + $config->set('HTML.Allowed', 'p,b,a[href],i'); + $config->set('URI.Base', 'http://www.example.com'); + $config->set('URI.MakeAbsolute', true); + $config->set('AutoFormat.AutoParagraph', true); + + +--------------------------------------------------------------------------- +5. Caching + +HTML Purifier generates some cache files (generally one or two) to speed up +its execution. For maximum performance, make sure that +library/HTMLPurifier/DefinitionCache/Serializer is writeable by the webserver. + +If you are in the library/ folder of HTML Purifier, you can set the +appropriate permissions using: + + chmod -R 0755 HTMLPurifier/DefinitionCache/Serializer + +If the above command doesn't work, you may need to assign write permissions +to all. This may be necessary if your webserver runs as nobody, but is +not recommended since it means any other user can write files in the +directory. Use: + + chmod -R 0777 HTMLPurifier/DefinitionCache/Serializer + +You can also chmod files via your FTP client; this option +is usually accessible by right clicking the corresponding directory and +then selecting "chmod" or "file permissions". + +Starting with 2.0.1, HTML Purifier will generate friendly error messages +that will tell you exactly what you have to chmod the directory to, if in doubt, +follow its advice. 
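Before reaching for chmod, it can help to check whether the cache directory is already writable. The following is a minimal sketch, assuming a POSIX shell; check_cache_dir is a made-up name, not something HTML Purifier provides, and the answer only reflects the user running the command, so run it as the webserver user for a meaningful result.

```shell
# Hypothetical helper (not shipped with HTML Purifier): report whether the
# current user can write to the given cache directory.
check_cache_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}

# Usage, from the library/ folder:
# check_cache_dir HTMLPurifier/DefinitionCache/Serializer
```

If it reports "not writable", try the 0755 command above first and fall back to 0777 only if that is not enough.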
+ +If you are unable or unwilling to give write permissions to the cache +directory, you can either disable the cache (and suffer a performance +hit): + + $config->set('Core.DefinitionCache', null); + +Or move the cache directory somewhere else (no trailing slash): + + $config->set('Cache.SerializerPath', '/home/user/absolute/path'); + + +--------------------------------------------------------------------------- +6. Using the code + +The interface is mind-numbingly simple: + + $purifier = new HTMLPurifier($config); + $clean_html = $purifier->purify( $dirty_html ); + +That's it! For more examples, check out docs/examples/ (they aren't very +different though). Also, docs/enduser-slow.html gives advice on what to +do if HTML Purifier is slowing down your application. + + +--------------------------------------------------------------------------- +7. Quick install + +First, make sure library/HTMLPurifier/DefinitionCache/Serializer is +writable by the webserver (see Section 5: Caching above for details). 
+If your website is in UTF-8 and XHTML Transitional, use this code: + +<?php + require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php'; + + $config = HTMLPurifier_Config::createDefault(); + $purifier = new HTMLPurifier($config); + $clean_html = $purifier->purify($dirty_html); +?> + +If your website is in a different encoding or doctype, use this code: + +<?php + require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php'; + + $config = HTMLPurifier_Config::createDefault(); + $config->set('Core.Encoding', 'ISO-8859-1'); // replace with your encoding + $config->set('HTML.Doctype', 'HTML 4.01 Transitional'); // replace with your doctype + $purifier = new HTMLPurifier($config); + + $clean_html = $purifier->purify($dirty_html); +?> + + vim: et sw=4 sts=4 diff --git a/library/htmlpurifier-4.6.0-lite/LICENSE b/library/htmlpurifier-4.6.0-lite/LICENSE new file mode 100644 index 000000000..8c88a20d4 --- /dev/null +++ b/library/htmlpurifier-4.6.0-lite/LICENSE @@ -0,0 +1,504 @@ + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. 
You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. + + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. 
+ + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries in order to permit linking those +libraries into non-free programs. + + When a program is linked with a library, whether statically or using +a shared library, the combination of the two is legally speaking a +combined work, a derivative of the original library. The ordinary +General Public License therefore permits such linking only if the +entire combination fits its criteria of freedom. The Lesser General +Public License permits more lax criteria for linking other code with +the library. + + We call this license the "Lesser" General Public License because it +does Less to protect the user's freedom than the ordinary General +Public License. It also provides other free software developers Less +of an advantage over competing non-free programs. These disadvantages +are the reason we use the ordinary General Public License for many +libraries. However, the Lesser license provides advantages in certain +special circumstances. + + For example, on rare occasions, there may be a special need to +encourage the widest possible use of a certain library, so that it becomes +a de-facto standard. To achieve this, non-free programs must be +allowed to use the library. A more frequent case is that a free +library does the same job as widely used non-free libraries. 
In this +case, there is little to gain by limiting the free library to free +software only, so we use the Lesser General Public License. + + In other cases, permission to use a particular library in non-free +programs enables a greater number of people to use a large body of +free software. For example, permission to use the GNU C Library in +non-free programs enables many more people to use the whole GNU +operating system, as well as its variant, the GNU/Linux operating +system. + + Although the Lesser General Public License is Less protective of the +users' freedom, it does ensure that the user of a program that is +linked with the Library has the freedom and the wherewithal to run +that program using a modified version of the Library. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +"work based on the library" and a "work that uses the library". The +former contains code derived from the library, whereas the latter must +be combined with the library in order to run. + + GNU LESSER GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License Agreement applies to any software library or other +program which contains a notice placed by the copyright holder or +other authorized party saying it may be distributed under the terms of +this Lesser General Public License (also called "this License"). +Each licensee is addressed as "you". + + A "library" means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The "Library", below, refers to any such software library or work +which has been distributed under these terms. 
A "work based on the +Library" means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term "modification".) + + "Source code" for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + + 1. You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. + + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + + 2. 
You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) The modified work must itself be a software library. + + b) You must cause the files modified to carry prominent notices + stating that you changed the files and the date of any change. + + c) You must cause the whole of the work to be licensed at no + charge to all third parties under the terms of this License. + + d) If a facility in the modified Library refers to a function or a + table of data to be supplied by an application program that uses + the facility, other than as an argument passed when the facility + is invoked, then you must make a good faith effort to ensure that, + in the event an application does not supply such function or + table, the facility still operates, and performs whatever part of + its purpose remains meaningful. + + (For example, a function in a library to compute square roots has + a purpose that is entirely well-defined independent of the + application. Therefore, Subsection 2d requires that any + application-supplied function or table used by this function must + be optional: if the application does not supply it, the square + root function must still compute square roots.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. 
But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + + 4. 
You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + + 5. A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a "work that uses the Library". Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a "work that uses the Library" with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a "work that uses the +library". The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a "work that uses the Library" uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. 
+ + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + + 6. As an exception to the Sections above, you may also combine or +link a "work that uses the Library" with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + + a) Accompany the work with the complete corresponding + machine-readable source code for the Library including whatever + changes were used in the work (which must be distributed under + Sections 1 and 2 above); and, if the work is an executable linked + with the Library, with the complete machine-readable "work that + uses the Library", as object code and/or source code, so that the + user can modify the Library and then relink to produce a modified + executable containing the modified Library. 
(It is understood + that the user who changes the contents of definitions files in the + Library will not necessarily be able to recompile the application + to use the modified definitions.) + + b) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (1) uses at run time a + copy of the library already present on the user's computer system, + rather than copying library functions into the executable, and (2) + will operate properly with a modified version of the library, if + the user installs one, as long as the modified version is + interface-compatible with the version that the work was made with. + + c) Accompany the work with a written offer, valid for at + least three years, to give the same user the materials + specified in Subsection 6a, above, for a charge no more + than the cost of performing this distribution. + + d) If distribution of the work is made by offering access to copy + from a designated place, offer equivalent access to copy the above + specified materials from the same place. + + e) Verify that the user has already received a copy of these + materials or that you have already sent this user a copy. + + For an executable, the required form of the "work that uses the +Library" must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the materials to be distributed need not include anything that is +normally distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + + 7. 
You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + + a) Accompany the combined library with a copy of the same work + based on the Library, uncombined with any other library + facilities. This must be distributed under the terms of the + Sections above. + + b) Give prominent notice with the combined library of the fact + that part of it is a work based on the Library, and explaining + where to find the accompanying uncombined form of the same work. + + 8. You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + + 9. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + + 10. 
Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties with +this License. + + 11. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. 
Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 12. If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + + 13. The Free Software Foundation may publish revised and/or new +versions of the Lesser General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +"any later version", you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + + 14. If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. 
Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + + NO WARRANTY + + 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Libraries + + If you develop a new library, and you want it to be of the greatest +possible use to the public, we recommend making it free software that +everyone can redistribute and change. You can do so by permitting +redistribution under these terms (or, alternatively, under the terms of the +ordinary General Public License). + + To apply these terms, attach the following notices to the library. 
It is +safest to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least the +"copyright" line and a pointer to where the full notice is found. + + <one line to give the library's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with this library; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the library, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the + library `Frob' (a library for tweaking knobs) written by James Random Hacker. + + <signature of Ty Coon>, 1 April 1990 + Ty Coon, President of Vice + +That's all there is to it! + + vim: et sw=4 sts=4 diff --git a/library/htmlpurifier-4.6.0-lite/NEWS b/library/htmlpurifier-4.6.0-lite/NEWS new file mode 100644 index 000000000..90a054620 --- /dev/null +++ b/library/htmlpurifier-4.6.0-lite/NEWS @@ -0,0 +1,1078 @@ +NEWS ( CHANGELOG and HISTORY ) HTMLPurifier +||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| + += KEY ==================== + # Breaks back-compat + ! 
Feature + - Bugfix + + Sub-comment + . Internal change +========================== + +4.6.0, released 2013-11-30 +# Secure URI munge hashing algorithm has changed to hash_hmac("sha256", $url, $secret). + Please update any verification scripts you may have. +# URI parsing algorithm was made more strict, so only prefixes which + looks like schemes will actually be schemes. Thanks + Michael Gusev <mgusev@sugarcrm.com> for fixing. +# %Core.EscapeInvalidChildren is no longer supported, and no longer does + anything. +! New directive %Core.AllowHostnameUnderscore which allows underscores + in hostnames. +- Eliminate quadratic behavior in DOMLex by using a proper queue. + Thanks Ole Laursen for noticing this. +- Rewritten MakeWellFormed/FixNesting implementation eliminates quadratic + behavior in the rest of the purificaiton pipeline. Thanks Chedburn + Networks for sponsoring this work. +- Made Linkify URL parser a bit less permissive, so that non-breaking + spaces and commas are not included as part of URL. Thanks nAS for fixing. +- Fix some bad interactions with %HTML.Allowed and injectors. Thanks + David Hirtz for reporting. +- Fix infinite loop in DirectLex. Thanks Ashar Javed (@soaj1664ashar) + for reporting. + +4.5.0, released 2013-02-17 +# Fix bug where stacked attribute transforms clobber each other; + this also means it's no longer possible to override attribute + transforms in later modules. No internal code was using this + but this may break some clients. +# We now use SHA-1 to identify cached definitions, instead of MD5. +! Support display:inline-block +! Support for more white-space CSS values. +! Permit underscores in font families +! Support for page-break-* CSS3 properties when proprietary properties + are enabled. +! New directive %Core.DisableExcludes; can be set to 'true' to turn off + SGML excludes checking. If HTML Purifier is removing too much text + and you don't care about full standards compliance, try setting this to + 'true'. 
+- Use prepend for SPL autoloading on PHP 5.3 and later. +- Fix bug with nofollow transform when pre-existing rel exists. +- Fix bug where background:url() always gets lower-cased + (but not background-image:url()) +- Fix bug with non lower-case color names in HTML +- Fix bug where data URI validation doesn't remove temporary files. + Thanks Javier Marín Ros <javiermarinros@gmail.com> for reporting. +- Don't remove certain empty tags on RemoveEmpty. + +4.4.0, released 2012-01-18 +# Removed PEARSax3 handler. +# URI.Munge now munges URIs inside the same host that go from https + to http. Reported by Neike Taika-Tessaro. +# Core.EscapeNonASCIICharacters now always transforms entities to + entities, even if target encoding is UTF-8. +# Tighten up selector validation in ExtractStyleBlocks. + Non-syntactically valid selectors are now rejected, along with + some of the more obscure ones such as attribute selectors, the + :lang pseudoselector, and anything not in CSS2.1. Furthermore, + ID and class selectors now work properly with the relevant + configuration attributes. Also, mute errors when parsing CSS + with CSS Tidy. Reported by Mario Heiderich and Norman Hippert. +! Added support for 'scope' attribute on tables. +! Added %HTML.TargetBlank, which adds target="blank" to all outgoing links. +! Properly handle sub-lists directly nested inside of lists in + a standards compliant way, by moving them into the preceding <li> +! Added %HTML.AllowedComments and %HTML.AllowedCommentsRegexp for + limited allowed comments in untrusted situations. +! Implement iframes, and allow them to be used in untrusted mode with + %HTML.SafeIframe and %URI.SafeIframeRegexp. Thanks Bradley M. Froehle + <brad.froehle@gmail.com> for submitting an initial version of the patch. +! The Forms module now works properly for transitional doctypes. +! Added support for internationalized domain names.
You need the PEAR + Net_IDNA2 module to be in your path; if it is installed, ensure the + class can be loaded and then set %Core.EnableIDNA to true. +- Color keywords are now case insensitive. Thanks Yzmir Ramirez + <yramirez-htmlpurifier@adicio.com> for reporting. +- Explicitly initialize anonModule variable to null. +- Do not duplicate nofollow if already present. Thanks 178 + for reporting. +- Do not add nofollow if hostname matches our current host. Thanks 178 + for reporting, and Neike Taika-Tessaro for helping diagnose. +- Do not unset parser variable; this fixes intermittent serialization + problems. Thanks Neike Taika-Tessaro for reporting, bill + <10010tiger@gmail.com> for diagnosing. +- Fix iconv truncation bug, where non-UTF-8 target encodings see + output truncated after around 8000 characters. Thanks Jörg Ludwig + <joerg.ludwig@iserv.eu> for reporting. +- Fix broken table content model for XHTML1.1 (and also earlier + versions, although the W3C validator doesn't catch those violations). + Thanks GlitchMr <glitch.mr@gmail.com> for reporting. + +4.3.0, released 2011-03-27 +# Fixed broken caching of customized raw definitions, but requires an + API change. The old API still works but will emit a warning, + see http://htmlpurifier.org/docs/enduser-customize.html#optimized + for how to upgrade your code. +# Protect against Internet Explorer innerHTML behavior by specially + treating attributes with backticks but no angled brackets, quotes or + spaces. This constitutes a slight semantic change, which can be + reverted using %Output.FixInnerHTML. Reported by Neike Taika-Tessaro + and Mario Heiderich. +# Protect against cssText/innerHTML by restricting allowed characters + used in fonts further than mandated by the specification and encoding + some extra special characters in URLs. Reported by Neike + Taika-Tessaro and Mario Heiderich. +! Added %HTML.Nofollow to add rel="nofollow" to external links. +! 
More types of SPL autoloaders allowed on later versions of PHP. +! Implementations for position, top, left, right, bottom, z-index + when %CSS.Trusted is on. +! Add %Cache.SerializerPermissions option for custom serializer + directory/file permissions +! Fix longstanding bug in Flash support for non-IE browsers, and + allow more wmode attributes. +! Add %CSS.AllowedFonts to restrict permissible font names. +- Switch to an iterative traversal of the DOM, which prevents us + from running out of stack space for deeply nested documents. + Thanks Maxim Krizhanovsky for contributing a patch. +- Make removal of conditional IE comments ungreedy; thanks Bernd + for reporting. +- Escape CDATA before removing Internet Explorer comments. +- Fix removal of id attributes under certain conditions by ensuring + armor attributes are preserved when recreating tags. +- Check if schema.ser was corrupted. +- Check if zend.ze1_compatibility_mode is on, and error out if it is. + This safety check is only done for HTMLPurifier.auto.php; if you + are using standalone or the specialized includes files, you're + expected to know what you're doing. +- Stop repeatedly writing the cache file after I'm done customizing a + raw definition. Reported by ajh. +- Switch to using require_once in the Bootstrap to work around bad + interaction with Zend Debugger and APC. Reported by Antonio Parraga. +- Fix URI handling when hostname is missing but scheme is present. + Reported by Neike Taika-Tessaro. +- Fix missing numeric entities on DirectLex; thanks Neike Taika-Tessaro + for reporting. +- Fix harmless notice from indexing into empty string. Thanks Matthijs + Kooijman <matthijs@stdin.nl> for reporting. +- Don't autoclose no parent elements are able to support the element + that triggered the autoclose. In particular fixes strange behavior + of stray <li> tags. Thanks pkuliga@gmail.com for reporting and + Neike Taika-Tessaro <pinkgothic@gmail.com> for debugging assistance. 
+ +4.2.0, released 2010-09-15 +! Added %Core.RemoveProcessingInstructions, which lets you remove + <? ... ?> statements. +! Added %URI.DisableResources functionality; the directive originally + did nothing. Thanks David Rothstein for reporting. +! Add documentation about configuration directive types. +! Add %CSS.ForbiddenProperties configuration directive. +! Add %HTML.FlashAllowFullScreen to permit embedded Flash objects + to utilize full-screen mode. +! Add optional support for the <code>file</code> URI scheme, enable + by explicitly setting %URI.AllowedSchemes. +! Add %Core.NormalizeNewlines options to allow turning off newline + normalization. +- Fix improper handling of Internet Explorer conditional comments + by parser. Thanks zmonteca for reporting. +- Fix missing attributes bug when running on Mac Snow Leopard and APC. + Thanks sidepodcast for the fix. +- Warn if an element is allowed, but an attribute it requires is + not allowed. + +4.1.1, released 2010-05-31 +- Fix undefined index warnings in maintenance scripts. +- Fix bug in DirectLex for parsing elements with a single attribute + with entities. +- Rewrite CSS output logic for font-family and url(). Thanks Mario + Heiderich <mario.heiderich@googlemail.com> for reporting and Takeshi + Terada <t-terada@violet.plala.or.jp> for suggesting the fix. +- Emit an error for CollectErrors if a body is extracted +- Fix bug where in background-position for center keyword handling. +- Fix infinite loop when a wrapper element is inserted in a context + where it's not allowed. Thanks Lars <lars@renoz.dk> for reporting. +- Remove +x bit and shebang from index.php; only supported mode is to + explicitly call it with php. +- Make test script less chatty when log_errors is on. + +4.1.0, released 2010-04-26 +! Support proprietary height attribute on table element +! Support YouTube slideshows that contain /cp/ in their URL. +! Support for data: URI scheme; not enabled by default, add it using + %URI.AllowedSchemes +! 
Support flashvars when using %HTML.SafeObject and %HTML.SafeEmbed. +! Support for Internet Explorer compatibility with %HTML.SafeObject + using %Output.FlashCompat. +! Handle <ol><ol> properly, by inserting the necessary <li> tag. +- Always quote the insides of url(...) in CSS. + +4.0.0, released 2009-07-07 +# APIs for ConfigSchema subsystem have substantially changed. See + docs/dev-config-bcbreaks.txt for details; in essence, anything that + had both namespace and directive now have a single unified key. +# Some configuration directives were renamed, specifically: + %AutoFormatParam.PurifierLinkifyDocURL -> %AutoFormat.PurifierLinkify.DocURL + %FilterParam.ExtractStyleBlocksEscaping -> %Filter.ExtractStyleBlocks.Escaping + %FilterParam.ExtractStyleBlocksScope -> %Filter.ExtractStyleBlocks.Scope + %FilterParam.ExtractStyleBlocksTidyImpl -> %Filter.ExtractStyleBlocks.TidyImpl + As usual, the old directive names will still work, but will throw E_NOTICE + errors. +# The allowed values for class have been relaxed to allow all of CDATA for + doctypes that are not XHTML 1.1 or XHTML 2.0. For old behavior, set + %Attr.ClassUseCDATA to false. +# Instead of appending the content model to an old content model, a blank + element will replace the old content model. You can use #SUPER to get + the old content model. +! More robust support for name="" and id="" +! HTMLPurifier_Config::inherit($config) allows you to inherit one + configuration, and have changes to that configuration be propagated + to all of its children. +! Implement %HTML.Attr.Name.UseCDATA, which relaxes validation rules on + the name attribute when set. Use with care. Thanks Ian Cook for + sponsoring. +! Implement %AutoFormat.RemoveEmpty.RemoveNbsp, which removes empty + tags that contain non-breaking spaces as well other whitespace. You + can also modify which tags should have maintained with + %AutoFormat.RemoveEmpty.RemoveNbsp.Exceptions. +! 
Implement %Attr.AllowedClasses, which allows administrators to restrict + classes users can use to a specified finite set of classes, and + %Attr.ForbiddenClasses, which is the logical inverse. +! You can now maintain your own configuration schema directories by + creating a config-schema.php file or passing an extra argument. Check + docs/dev-config-schema.html for more details. +! Added HTMLPurifier_Config->serialize() method, which lets you save away + your configuration in a compact serial file, which you can unserialize + and use directly without having to go through the overhead of setup. +- Fix bug where URIDefinition would not get cleared if it's directives got + changed. +- Fix fatal error in HTMLPurifier_Encoder on certain platforms (probably NetBSD 5.0) +- Fix bug in Linkify autoformatter involving <a><span>http://foo</span></a> +- Make %URI.Munge not apply to links that have the same host as your host. +- Prevent stray </body> tag from truncating output, if a second </body> + is present. +. Created script maintenance/rename-config.php for renaming a configuration + directive while maintaining its alias. This script does not change source code. +. Implement namespace locking for definition construction, to prevent + bugs where a directive is used for definition construction but is not + used to construct the cache hash. + +3.3.0, released 2009-02-16 +! Implement CSS property 'overflow' when %CSS.AllowTricky is true. +! Implement generic property list classess +- Fix bug with testEncodingSupportsASCII() algorithm when iconv() implementation + does not do the "right thing" with characters not supported in the output + set. +- Spellcheck UTF-8: The Secret To Character Encoding +- Fix improper removal of the contents of elements with only whitespace. Thanks + Eric Wald for reporting. +- Fix broken test suite in versions of PHP without spl_autoload_register() +- Fix degenerate case with YouTube filter involving double hyphens. 
+ Thanks Pierre Attar for reporting. +- Fix YouTube rendering problem on certain versions of Firefox. +- Fix CSSDefinition Printer problems with decorators +- Add text parameter to unit tests, forces text output +. Add verbose mode to command line test runner, use (--verbose) +. Turn on unit tests for UnitConverter +. Fix missing version number in configuration %Attr.DefaultImageAlt (added 3.2.0) +. Fix newline errors that caused spurious failures when CRLF HTML Purifier was + tested on Linux. +. Removed trailing whitespace from all text files, see + remote-trailing-whitespace.php maintenance script. +. Convert configuration to use property list backend. + +3.2.0, released 2008-10-31 +# Using %Core.CollectErrors forces line number/column tracking on, whereas + previously you could theoretically turn it off. +# HTMLPurifier_Injector->notifyEnd() is formally deprecated. Please + use handleEnd() instead. +! %Output.AttrSort for when you need your attributes in alphabetical order to + deal with a bug in FCKEditor. Requested by frank farmer. +! Enable HTML comments when %HTML.Trusted is on. Requested by Waldo Jaquith. +! Proper support for name attribute. It is now allowed and equivalent to the id + attribute in a and img tags, and is only converted to id when %HTML.TidyLevel + is heavy (for all doctypes). +! %AutoFormat.RemoveEmpty to remove some empty tags from documents. Please don't + use on hand-written HTML. +! Add error-cases for unsupported elements in MakeWellFormed. This enables + the strategy to be used, standalone, on untrusted input. +! %Core.AggressivelyFixLt is on by default. This causes more sensible + processing of left angled brackets in smileys and other whatnot. +! Test scripts now have a 'type' parameter, which lets you say 'htmlpurifier', + 'phpt', 'vtest', etc. in order to only execute those tests. This supercedes + the --only-phpt parameter, although for backwards-compatibility the flag + will still work. +! 
AutoParagraph auto-formatter will now preserve double-newlines upon output. + Users who are not performing inbound filtering, this may seem a little + useless, but as a bonus, the test suite and handling of edge cases is also + improved. +! Experimental implementation of forms for %HTML.Trusted +! Track column numbers when maintain line numbers is on +! Proprietary 'background' attribute on table-related elements converted into + corresponding CSS. Thanks Fusemail for sponsoring this feature! +! Add forward(), forwardUntilEndToken(), backward() and current() to Injector + supertype. +! HTMLPurifier_Injector->handleEnd() permits modification to end tokens. The + time of operation varies slightly from notifyEnd() as *all* end tokens are + processed by the injector before they are subject to the well-formedness rules. +! %Attr.DefaultImageAlt allows overriding default behavior of setting alt to + basename of image when not present. +! %AutoFormat.DisplayLinkURI neuters <a> tags into plain text URLs. +- Fix two bugs in %URI.MakeAbsolute; one involving empty paths in base URLs, + the other involving an undefined $is_folder error. +- Throw error when %Core.Encoding is set to a spurious value. Previously, + this errored silently and returned false. +- Redirected stderr to stdout for flush error output. +- %URI.DisableExternal will now use the host in %URI.Base if %URI.Host is not + available. +- Do not re-munge URL if the output URL has the same host as the input URL. + Requested by Chris. +- Fix error in documentation regarding %Filter.ExtractStyleBlocks +- Prevent <![CDATA[<body></body>]]> from triggering %Core.ConvertDocumentToFragment +- Fix bug with inline elements in blockquotes conflicting with strict doctype +- Detect if HTML support is disabled for DOM by checking for loadHTML() method. +- Fix bug where dots and double-dots in absolute URLs without hostname were + not collapsed by URIFilter_MakeAbsolute. 
+- Fix bug with anonymous modules operating on SafeEmbed or SafeObject elements + by reordering their addition. +- Will now throw exception on many error conditions during lexer creation; also + throw an exception when MaintainLineNumbers is true, but a non-tracksLineNumbers + is being used. +- Detect if domxml extension is loaded, and use DirectLEx accordingly. +- Improve handling of big numbers with floating point arithmetic in UnitConverter. + Reported by David Morton. +. Strategy_MakeWellFormed now operates in-place, saving memory and allowing + for more interesting filter-backtracking +. New HTMLPurifier_Injector->rewind() functionality, allows injectors to rewind + index to reprocess tokens. +. StringHashParser now allows for multiline sections with "empty" content; + previously the section would remain undefined. +. Added --quick option to multitest.php, which tests only the most recent + release for each series. +. Added --distro option to multitest.php, which accepts either 'normal' or + 'standalone'. This supercedes --exclude-normal and --exclude-standalone + +3.1.1, released 2008-06-19 +# %URI.Munge now, by default, does not munge resources (for example, <img src="">) + In order to enable this again, please set %URI.MungeResources to true. +! More robust imagecrash protection with height/width CSS with %CSS.MaxImgLength, + and height/width HTML with %HTML.MaxImgLength. +! %URI.MungeSecretKey for secure URI munging. Thanks Chris + for sponsoring this feature. Check out the corresponding documentation + for details. (Att Nightly testers: The API for this feature changed before + the general release. Namely, rename your directives %URI.SecureMungeSecretKey => + %URI.MungeSecretKey and and %URI.SecureMunge => %URI.Munge) +! Implemented post URI filtering. Set member variable $post to true to set + a URIFilter as such. +! Allow modules to define injectors via $info_injector. Injectors are + automatically disabled if injector's needed elements are not found. 
+! Support for "safe" objects added, use %HTML.SafeObject and %HTML.SafeEmbed. + Thanks Chris for sponsoring. If you've been using ad hoc code from the + forums, PLEASE use this instead. +! Added substitutions for %e, %n, %a and %p in %URI.Munge (in order, + embedded, tag name, attribute name, CSS property name). See %URI.Munge + for more details. Requested by Jochem Blok. +- Disable percent height/width attributes for img. +- AttrValidator operations are now atomic; updates to attributes are not + manifest in token until end of operations. This prevents naughty internal + code from directly modifying CurrentToken when they're not supposed to. + This semantics change was requested by frank farmer. +- Percent encoding checks enabled for URI query and fragment +- Fix stray backslashes in font-family; CSS Unicode character escapes are + now properly resolved (although *only* in font-family). Thanks Takeshi Terada + for reporting. +- Improve parseCDATA algorithm to take into account newline normalization +- Account for browser confusion between Yen character and backslash in + Shift_JIS encoding. This fix generalizes to any other encoding which is not + a strict superset of printable ASCII. Thanks Takeshi Terada for reporting. +- Fix missing configuration parameter in Generator calls. Thanks vs for the + partial patch. +- Improved adherence to Unicode by checking for non-character codepoints. + Thanks Geoffrey Sneddon for reporting. This may result in degraded + performance for extremely large inputs. +- Allow CSS property-value pair ''text-decoration: none''. Thanks Jochem Blok + for reporting. +. Added HTMLPurifier_UnitConverter and HTMLPurifier_Length for convenient + handling of CSS-style lengths. HTMLPurifier_AttrDef_CSS_Length now uses + this class. +. API of HTMLPurifier_AttrDef_CSS_Length changed from __construct($disable_negative) + to __construct($min, $max). __construct(true) is equivalent to + __construct('0'). +. Added HTMLPurifier_AttrDef_Switch class +. 
Rename HTMLPurifier_HTMLModule_Tidy->construct() to setup() and bubble method + up inheritance hierarchy to HTMLPurifier_HTMLModule. All HTMLModules + get this called with the configuration object. All modules now + use this rather than __construct(), although legacy code using constructors + will still work--the new format, however, lets modules access the + configuration object for HTML namespace dependant tweaks. +. AttrDef_HTML_Pixels now takes a single construction parameter, pixels. +. ConfigSchema data-structure heavily optimized; on average it uses a third + the memory it did previously. The interface has changed accordingly, + consult changes to HTMLPurifier_Config for details. +. Variable parsing types now are magic integers instead of strings +. Added benchmark for ConfigSchema +. HTMLPurifier_Generator requires $config and $context parameters. If you + don't know what they should be, use HTMLPurifier_Config::createDefault() + and new HTMLPurifier_Context(). +. Printers now properly distinguish between output configuration, and + target configuration. This is not applicable to scripts using + the Printers for HTML Purifier related tasks. +. HTML/CSS Printers must be primed with prepareGenerator($gen_config), otherwise + fatal errors will ensue. +. URIFilter->prepare can return false in order to abort loading of the filter +. Factory for AttrDef_URI implemented, URI#embedded to indicate URI that embeds + an external resource. +. %URI.Munge functionality factored out into a post-filter class. +. Added CurrentCSSProperty context variable during CSS validation + +3.1.0, released 2008-05-18 +# Unnecessary references to objects (vestiges of PHP4) removed from method + signatures. 
The following methods do not need references when assigning from + them and will result in E_STRICT errors if you try: + + HTMLPurifier_Config->get*Definition() [* = HTML, CSS] + + HTMLPurifier_ConfigSchema::instance() + + HTMLPurifier_DefinitionCacheFactory::instance() + + HTMLPurifier_DefinitionCacheFactory->create() + + HTMLPurifier_DoctypeRegistry->register() + + HTMLPurifier_DoctypeRegistry->get() + + HTMLPurifier_HTMLModule->addElement() + + HTMLPurifier_HTMLModule->addBlankElement() + + HTMLPurifier_LanguageFactory::instance() +# Printer_ConfigForm's get*() functions were static-ified +# %HTML.ForbiddenAttributes requires attribute declarations to be in the + form of tag@attr, NOT tag.attr (which will throw an error and won't do + anything). This is for forwards compatibility with XML; you'd do best + to migrate any %HTML.AllowedAttributes directives to this syntax too. +! Allow index to be false for config from form creation +! Added HTMLPurifier::VERSION constant +! Commas, not dashes, used for serializer IDs. This change is forwards-compatible + and allows for version numbers like "3.1.0-dev". +! %HTML.Allowed deals gracefully with whitespace anywhere, anytime! +! HTML Purifier's URI handling is a lot more robust, with much stricter + validation checks and better percent encoding handling. Thanks Gareth Heyes + for indicating security vulnerabilities from lax percent encoding. +! Bootstrap autoloader deals more robustly with classes that don't exist, + preventing class_exists($class, true) from barfing. 
+- InterchangeBuilder now alphabetizes its lists +- Validation error in configdoc output fixed +- Iconv and other encoding errors muted even with custom error handlers that + do not honor error_reporting +- Add protection against imagecrash attack with CSS height/width +- HTMLPurifier::instance() created for consistency, is equivalent to getInstance() +- Fixed and revamped broken ConfigForm smoketest +- Bug with bool/null fields in Printer_ConfigForm fixed +- Bug with global forbidden attributes fixed +- Improved error messages for allowed and forbidden HTML elements and attributes +- Missing (or null) in configdoc documentation restored +- If DOM throws an exception during parsing with PH5P (occurs in newer versions + of DOM), HTML Purifier punts to DirectLex +- Fatal error with unserialization of ScriptRequired +- Created directories are now chmod'ed properly +- Fixed bug with fallback languages in LanguageFactory +- Standalone testing setup properly with autoload +. Out-of-date documentation revised +. UTF-8 encoding check optimization as suggested by Diego +. HTMLPurifier_Error removed in favor of exceptions +. More copy() functions removed; should use clone instead +. More extensive unit tests for HTMLDefinition +. assertPurification moved to central harness +. HTMLPurifier_Generator accepts $config and $context parameters during + instantiation, not runtime +. Double-quotes outside of attribute values are now unescaped + +3.1.0rc1, released 2008-04-22 +# Autoload support added. Internal require_once's removed in favor of an + explicit require list or autoloading. To use HTML Purifier, + you must now either use HTMLPurifier.auto.php + or HTMLPurifier.includes.php; setting the include path and including + HTMLPurifier.php is insufficient--in such cases include HTMLPurifier.autoload.php + as well to register our autoload handler (or modify your autoload function + to check HTMLPurifier_Bootstrap::getPath($class)). 
You can also use + HTMLPurifier.safe-includes.php for a less performance friendly but more + user-friendly library load. +# HTMLPurifier_ConfigSchema static functions are officially deprecated. Schema + information is stored in the ConfigSchema directory, and the + maintenance/generate-schema-cache.php generates the schema.ser file, which + is now instantiated. Support for userland schema changes coming soon! +# HTMLPurifier_Config will now throw E_USER_NOTICE when you use a directive + alias; to get rid of these errors just modify your configuration to use + the new directive name. +# HTMLPurifier->addFilter is deprecated; built-in filters can now be + enabled using %Filter.$filter_name or by setting your own filters using + %Filter.Custom +# Directive-level safety properties superseded in favor of module-level + safety. Internal method HTMLModule->addElement() has changed, although + the externally visible HTMLDefinition->addElement has *not* changed. +! Extra utility classes for testing and non-library operations can + be found in extras/. Specifically, these are FSTools and ConfigDoc. + You may find a use for these in your own project, but right now they + are highly experimental and volatile. +! Integration with PHPT allows for automated smoketests +! Limited support for proprietary HTML elements, namely <marquee>, sponsored + by Chris. You can enable them with %HTML.Proprietary if your client + demands them. +! Support for !important CSS cascade modifier. By default, this will be stripped + from CSS, but you can enable it using %CSS.AllowImportant +! Support for display and visibility CSS properties added, set %CSS.AllowTricky + to true to use them. +! HTML Purifier now has its own Exception hierarchy under HTMLPurifier_Exception. + Developer error (not enduser error) can cause these to be triggered. +! Experimental kses() wrapper introduced with HTMLPurifier.kses.php +! 
Finally %CSS.AllowedProperties for tweaking allowed CSS properties without + mucking around with HTMLPurifier_CSSDefinition +! ConfigDoc output has been enhanced with version and deprecation info. +! %HTML.ForbiddenAttributes and %HTML.ForbiddenElements implemented. +- Autoclose now operates iteratively, i.e. <span><span><div> now has + both span tags closed. +- Various HTMLPurifier_Config convenience functions now accept another parameter + $schema which defines what HTMLPurifier_ConfigSchema to use besides the + global default. +- Fix bug with trusted script handling in libxml versions later than 2.6.28. +- Fix bug in ExtractStyleBlocks with comments in style tags +- Fix bug in comment parsing for DirectLex +- Flush output now displayed when in command line mode for unit tester +- Fix bug with rgb(0, 1, 2) color syntax with spaces inside shorthand syntax +- HTMLPurifier_HTMLDefinition->addAttribute can now be called multiple times + on the same element without emitting errors. +- Fixed fatal error in PH5P lexer with invalid tag names +. Plugins now get their own changelogs according to project conventions. +. Convert tokens to use instanceof, reducing memory footprint and + improving comparison speed. +. Dry runs now supported in SimpleTest; testing facilities improved +. Bootstrap class added for handling autoloading functionality +. Implemented recursive glob at FSTools->globr +. ConfigSchema now has instance methods for all corresponding define* + static methods. +. A couple of new historical maintenance scripts were added. +. HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php split into two files +. tests/index.php can now be run from any directory. +. HTMLPurifier_Token subclasses split into separate files +. HTMLPURIFIER_PREFIX now is defined in Bootstrap.php, NOT HTMLPurifier.php +. HTMLPURIFIER_PREFIX can now be defined outside of HTML Purifier +. New --php=php flag added, allows PHP executable to be specified (command + line only!) +. 
htmlpurifier_add_test() preferred method to translate test files into + classes, because it handles PHPT files too. +. Debugger class is deprecated and will be removed soon. +. Command line argument parsing for testing scripts revamped, now --opt value + format is supported. +. Smoketests now cleanup after magic quotes +. Generator now can output comments (however, comments are still stripped + from HTML Purifier output) +. HTMLPurifier_ConfigSchema->validate() deprecated in favor of + HTMLPurifier_VarParser->parse() +. Integers auto-cast into float type by VarParser. +. HTMLPURIFIER_STRICT removed; no validation is performed on runtime, only + during cache generation +. Reordered script calls in maintenance/flush.php +. Command line scripts now honor exit codes +. When --flush fails in unit testers, abort tests and print message +. Improved documentation in docs/dev-flush.html about the maintenance scripts +. copy() methods removed in favor of clone keyword + +3.0.0, released 2008-01-06 +# HTML Purifier is PHP 5 only! The 2.1.x branch will be maintained + until PHP 4 is completely deprecated, but no new features will be added + to it. + + Visibility declarations added + + Constructor methods renamed to __construct() + + PHP4 reference cruft removed (in progress) +! CSS properties are now case-insensitive +! DefinitionCacheFactory now can register new implementations +! New HTMLPurifier_Filter_ExtractStyleBlocks for extracting <style> from + documents and cleaning their contents up. Requires the CSSTidy library + <http://csstidy.sourceforge.net/>. You can access the blocks with the + 'StyleBlocks' Context variable ($purifier->context->get('StyleBlocks')). + The output CSS can also be "scoped" for a specific element, use: + %Filter.ExtractStyleBlocksScope +! Experimental support for some proprietary CSS attributes allowed: + opacity (and all of the browser-specific equivalents) and scrollbar colors. + Enable by setting %CSS.Proprietary to true. 
+- Colors missing # but in hex form will be corrected +- CSS Number algorithm improved +- Unit testing and multi-testing now on steroids: command lines, + XML output, and other goodies now added. +. Unit tests for Injector improved +. New classes: + + HTMLPurifier_AttrDef_CSS_AlphaValue + + HTMLPurifier_AttrDef_CSS_Filter +. Multitest now has a file docblock + +2.1.3, released 2007-11-05 +! tests/multitest.php allows you to test multiple versions by running + tests/index.php through multiple interpreters using `phpv` shell + script (you must provide this script!) +- Fixed poor include ordering for Email URI AttrDefs, causes fatal errors + on some systems. +- Injector algorithm further refined: off-by-one error regarding skip + counts for dormant injectors fixed +- Corrective blockquote definition now enabled for HTML 4.01 Strict +- Fatal error when <img> tag (or any other element with required attributes) + has 'id' attribute fixed, thanks NykO18 for reporting +- Fix warning emitted when a non-supported URI scheme is passed to the + MakeAbsolute URIFilter, thanks NykO18 (again) +- Further refine AutoParagraph injector. Behavior inside of elements + allowing paragraph tags clarified: only inline content delimited by + double newlines (not block elements) are paragraphed. 
+- Buggy treatment of end tags of elements that have required attributes + fixed (does not manifest on default tag-set) +- Spurious internal content reorganization error suppressed +- HTMLDefinition->addElement now returns a reference to the created + element object, as implied by the documentation +- Phorum mod's HTML Purifier help message expanded (unreleased elsewhere) +- Fix a theoretical class of infinite loops from DirectLex reported + by Nate Abele +- Work around unnecessary DOMElement type-cast in PH5P that caused errors + in PHP 5.1 +- Work around PHP 4 SimpleTest lack-of-error complaining for one-time-only + HTMLDefinition errors, this may indicate problems with error-collecting + facilities in PHP 5 +- Make ErrorCollectorEMock work in both PHP 4 and PHP 5 +- Make PH5P work with PHP 5.0 by removing unnecessary array parameter typedef +. %Core.AcceptFullDocuments renamed to %Core.ConvertDocumentToFragment + to better communicate its purpose +. Error unit tests can now specify the expectation of no errors. Future + iterations of the harness will be extremely strict about what errors + are allowed +. Extend Injector hooks to allow for more powerful injector routines +. HTMLDefinition->addBlankElement created, as according to the HTMLModule + method +. Doxygen configuration file updated, with minor improvements +. Test runner now checks for similarly named files in conf/ directory too. +. Minor cosmetic change to flush-definition-cache.php: trailing newline is + outputted +. Maintenance script for generating PH5P patch added, original PH5P source + file also added under version control +. Full unit test runner script title made more descriptive with PHP version +. Updated INSTALL file to state that 4.3.7 is the earliest version we + are actively testing + +2.1.2, released 2007-09-03 +! Implemented Object module for trusted users +! Implemented experimental HTML5 parsing mode using PH5P. 
To use, add + this to your code: + require_once 'HTMLPurifier/Lexer/PH5P.php'; + $config->set('Core', 'LexerImpl', 'PH5P'); + Note that this Lexer introduces some classes not in the HTMLPurifier + namespace. Also, this is PHP5 only. +! CSS property border-spacing implemented +- Fix non-visible parsing error in DirectLex with empty tags that have + slashes inside attribute values. +- Fix typo in CSS definition: border-collapse:seperate; was incorrectly + accepted as valid CSS. Usually non-visible, because this styling is the + default for tables in most browsers. Thanks Brett Zamir for pointing + this out. +- Fix validation errors in configuration form +- Hammer out a bunch of edge-case bugs in the standalone distribution +- Inclusion reflection removed from URISchemeRegistry; you must manually + include any new schema files you wish to use +- Numerous typo fixes in documentation thanks to Brett Zamir +. Unit test refactoring for one logical test per test function +. Config and context parameters in ComplexHarness deprecated: instead, edit + the $config and $context member variables +. HTML wrapper in DOMLex now takes DTD identifiers into account; doesn't + really make a difference, but is good for completeness sake +. merge-library.php script refactored for greater code reusability and + PHP4 compatibility + +2.1.1, released 2007-08-04 +- Fix show-stopper bug in %URI.MakeAbsolute functionality +- Fix PHP4 syntax error in standalone version +. Add prefix directory to include path for standalone, this prevents + other installations from clobbering the standalone's URI schemes +. Single test methods can be invoked by prefixing with __only + +2.1.0, released 2007-08-02 +# flush-htmldefinition-cache.php superseded in favor of a generic + flush-definition-cache.php script, you can clear a specific cache + by passing its name as a parameter to the script +! Phorum mod implemented for HTML Purifier +! 
With %Core.AggressivelyFixLt, <3 and similar emoticons no longer + trigger HTML removal in PHP5 (DOMLex). This directive is not necessary + for PHP4 (DirectLex). +! Standalone file now available, which greatly reduces the amount of + includes (although there are still a few files that reside in the + standalone folder) +! Relative URIs can now be transformed into their absolute equivalents + using %URI.Base and %URI.MakeAbsolute +! Ruby implemented for XHTML 1.1 +! You can now define custom URI filtering behavior, see enduser-uri-filter.html + for more details +! UTF-8 font names now supported in CSS +- AutoFormatters emit friendly error messages if tags or attributes they + need are not allowed +- ConfigForm's compactification of directive names is now configurable +- AutoParagraph autoformatter algorithm refined after field-testing +- XHTML 1.1 now applies XHTML 1.0 Strict cleanup routines, namely + blockquote wrapping +- Contents of <style> tags removed by default when tags are removed +. HTMLPurifier_Config->getSerial() implemented, this is extremely useful + for output cache invalidation +. ConfigForm printer now can retrieve CSS and JS files as strings, in + case HTML Purifier's directory is not publicly accessible +. Introduce new text/itext configuration directive values: these represent + longer strings that would be more appropriately edited with a textarea +. Allow newlines to act as separators for lists, hashes, lookups and + %HTML.Allowed +. ConfigForm generates textareas instead of text inputs for lists, hashes, + lookups, text and itext fields +. Hidden element content removal genericized: %Core.HiddenElements can + be used to customize this behavior, by default <script> and <style> are + hidden +. Added HTMLPURIFIER_PREFIX constant, should be used instead of dirname(__FILE__) +. Custom ChildDef added to default include list +. URIScheme reflection improved: will not attempt to include file if class + already exists. 
May clobber autoload, so I need to keep an eye on it +. ConfigSchema heavily optimized, will only collect information and validate + definitions when HTMLPURIFIER_SCHEMA_STRICT is true. +. AttrDef_URI unit tests and implementation refactored +. benchmarks/ directory now protected from public view with .htaccess file; + run the tests via command line +. URI scheme is munged off if there is no authority and the scheme is the + default one +. All unit tests inherit from HTMLPurifier_Harness, not UnitTestCase +. Interface for URIScheme changed +. Generic URI object to hold components of URI added, most systems involved + in URI validation have been migrated to use it +. Custom filtering for URIs factored out to URIDefinition interface for + maximum extensibility + +2.0.1, released 2007-06-27 +! Tag auto-closing now based on a ChildDef heuristic rather than a + manually set auto_close array; some behavior may change +! Experimental AutoFormat functionality added: auto-paragraph and + linkify your HTML input by setting %AutoFormat.AutoParagraph and + %AutoFormat.Linkify to true +! Newlines normalized internally, and then converted back to the + value of PHP_EOL. If this is not desired, set your newline format + using %Output.Newline. +! 
Beta error collection, messages are implemented for the most generic + cases involving Lexing or Strategies +- Clean up special case code for <script> tags +- Reorder includes for DefinitionCache decorators, fixes a possible + missing class error +- Fixed bug where manually modified definitions were not saved via cache + (mostly harmless, except for the fact that it would be a little slower) +- Configuration objects with different serials do not clobber each + other when revision numbers are unequal +- Improve Serializer DefinitionCache directory permissions checks +- DefinitionCache no longer throws errors when it encounters old + serial files that do not conform to the current style +- Stray xmlns attributes removed from configuration documentation +- configForm.php smoketest no longer has XSS vulnerability due to + unescaped print_r output +- Printer adheres to configuration's directives on output format +- Fix improperly named form field in ConfigForm printer +. Rewire some test-cases to swallow errors rather than expect them +. HTMLDefinition printer updated with some of the new attributes +. DefinitionCache keys reordered to reflect precedence: version number, + hash, then revision number +. %Core.DefinitionCache renamed to %Cache.DefinitionImpl +. Interlinking in configuration documentation added using + Injector_PurifierLinkify +. Directives now keep track of aliases to themselves +. Error collector now requires a severity to be passed, use PHP's internal + error constants for this +. HTMLPurifier_Config::getAllowedDirectivesForForm implemented, allows + much easier selective embedding of configuration values +. Doctype objects now accept public and system DTD identifiers +. %HTML.Doctype is now constrained by specific values, to specify a custom + doctype use new %HTML.CustomDoctype +. 
ConfigForm truncates long directives to keep the form small, and does + not re-output namespaces + +2.0.0, released 2007-06-20 +# Completely refactored HTMLModuleManager, decentralizing safety + information +# Transform modules changed to Tidy modules, which offer more flexibility + and better modularization +# Configuration object now finalizes itself when a read operation is + performed on it, ensuring that its internal state stays consistent. + To revert this behavior, you can set the $autoFinalize member variable + off, but it's not recommended. +# New compact syntax for AttrDef objects that can be used to instantiate + new objects via make() +# Definitions (esp. HTMLDefinition) are now cached for a significant + performance boost. You can disable caching by setting %Core.DefinitionCache + to null. You CANNOT edit raw definitions without setting the corresponding + DefinitionID directive (%HTML.DefinitionID for HTMLDefinition). +# Contents between <script> tags are now completely removed if <script> + is not allowed +# Prototype-declarations for Lexer removed in favor of configuration + determination of Lexer implementations. +! HTML Purifier now works in PHP 4.3.2. +! Configuration form-editing API makes tweaking HTMLPurifier_Config a + breeze! +! Configuration directives that accept hashes now allow new string + format: key1:value1,key2:value2 +! ConfigDoc now factored into OOP design +! All deprecated elements now natively supported +! Implement TinyMCE styled whitelist specification format in + %HTML.Allowed +! Config object gives more friendly error messages when things go wrong +! Advanced API implemented: easy functions for creating elements (addElement) + and attributes (addAttribute) on HTMLDefinition +! Add native support for required attributes +- Deprecated and removed EnableRedundantUTF8Cleaning. It didn't even work! 
+- DOMLex will not emit errors when a custom error handler that does not + honor error_reporting is used +- StrictBlockquote child definition refrains from wrapping whitespace + in tags now. +- Bug resulting from tag transforms to non-allowed elements fixed +- ChildDef_Custom's regex generation has been improved, removing several + false positives +. Unit test for ElementDef created, ElementDef behavior modified to + be more flexible +. Added convenience functions for HTMLModule constructors +. AttrTypes now has accessor functions that should be used instead + of directly manipulating info +. TagTransform_Center deprecated in favor of generic TagTransform_Simple +. Add extra protection in AttrDef_URI against phantom Schemes +. Doctype object added to HTMLDefinition which describes certain aspects + of the operational document type +. Lexer is now pre-emptively included, with a conditional include for the + PHP5 only version. +. HTMLDefinition and CSSDefinition have a common parent class: Definition. +. DirectLex can now track line-numbers +. Preliminary error collector is in place, although no code actually reports + errors yet +. Factor out most of ValidateAttributes to new AttrValidator class + +1.6.1, released 2007-05-05 +! Support for more deprecated attributes via transformations: + + hspace and vspace in img + + size and noshade in hr + + nowrap in td + + clear in br + + align in caption, table, img and hr + + type in ul, ol and li +! DirectLex now preserves text in which a < bracket is followed by + a non-alphanumeric character. This means that certain emoticons + are now preserved. +! %Core.RemoveInvalidImg is now operational, when set to false invalid + images will hang around with an empty src +! target attribute in a tag supported, use %Attr.AllowedFrameTargets + to enable +! CSS property white-space now allows nowrap (supported in all modern + browsers) but not others (which have spotty browser implementations) +! 
XHTML 1.1 mode now sort-of works without any fatal errors, and + lang is now moved over to xml:lang. +! Attribute transformation smoketest available at smoketests/attrTransform.php +! Transformation of font's size attribute now handles super-large numbers +- Possibly fatal bug with __autoload() fixed in module manager +- Invert HTMLModuleManager->addModule() processing order to check + prefixes first and then the literal module +- Empty strings get converted to empty arrays instead of arrays with + an empty string in them. +- Merging in attribute lists now works. +. Demo script removed: it has been added to the website's repository +. Basic.php script modified to work out of the box +. Refactor AttrTransform classes to reduce duplication +. AttrTransform_TextAlign axed in favor of a more general + AttrTransform_EnumToCSS, refer to HTMLModule/TransformToStrict.php to + see how the new equivalent is implemented +. Unit tests now use exclusively assertIdentical + +1.6.0, released 2007-04-01 +! Support for most common deprecated attributes via transformations: + + bgcolor in td, th, tr and table + + border in img + + name in a and img + + width in td, th and hr + + height in td, th +! Support for CSS attribute 'height' added +! Support for rel and rev attributes in a tags added, use %Attr.AllowedRel + and %Attr.AllowedRev to activate +- You can define ID blacklists using regular expressions via + %Attr.IDBlacklistRegexp +- Error messages are emitted when you attempt to "allow" elements or + attributes that HTML Purifier does not support +- Fix segfault in unit test. The problem is not very reproducible and + I don't know what causes it, but a six line patch fixed it. + +1.5.0, released 2007-03-23 +! Added a rudimentary I18N and L10N system modeled off MediaWiki. It + doesn't actually do anything yet, but keep your eyes peeled. +! docs/enduser-utf8.html explains how to use UTF-8 and HTML Purifier +! Newly structured HTMLDefinition modeled off of XHTML 1.1 modules. 
+ I am loath to release beta quality APIs, but this is exactly that; + don't use the internal interfaces if you're not willing to do migration + later on. +- Allow 'x' subtag in language codes +- Fixed buggy chameleon-support for ins and del +. Added support for IDREF attributes (i.e. for) +. Renamed HTMLPurifier_AttrDef_Class to HTMLPurifier_AttrDef_Nmtokens +. Removed context variable ParentType, replaced with IsInline, which + is false when you're not inline and an integer of the parent that + caused you to become inline when you are (so possibly zero) +. Removed ElementDef->type in favor of ElementDef->descendants_are_inline + and HTMLDefinition->content_sets +. StrictBlockquote now reports what elements it's supposed to allow, + rather than what it does allow +. Removed HTMLDefinition->info_flow_elements in favor of + HTMLDefinition->content_sets['Flow'] +. Removed redundant "exclusionary" definitions from DTD roster +. StrictBlockquote now requires a construction parameter as if it + were a Required ChildDef, this is the "real" set of allowed elements +. AttrDef partitioned into HTML, CSS and URI segments +. Modify Youtube filter regexp to be multiline +. Require both PHP5 and DOM extension in order to use DOMLex, fixes + some edge cases where a DOMDocument class exists in a PHP4 environment + due to DOM XML extension. + +1.4.1, released 2007-01-21 +! docs/enduser-youtube.html updated according to new functionality +- YouTube IDs can have underscores and dashes + +1.4.0, released 2007-01-21 +! Implemented list-style-image, URIs now allowed in list-style +! Implemented background-image, background-repeat, background-attachment + and background-position CSS properties. Shorthand property background + supports all of these properties. +! Configuration documentation looks nicer +! Added %Core.EscapeNonASCIICharacters to workaround loss of Unicode + characters while %Core.Encoding is set to a non-UTF-8 encoding. +! 
Support for configuration directive aliases added +! Config object can now be instantiated from ini files +! YouTube preservation code added to the core, with two lines of code + you can add it as a filter to your code. See smoketests/preserveYouTube.php + for sample code. +! Moved SLOW to docs/enduser-slow.html and added code examples +- Replaced version check with functionality check for DOM (thanks Stephen + Khoo) +. Added smoketest 'all.php', which loads all other smoketests via frames +. Implemented AttrDef_CSSURI for url(http://google.com) style declarations +. Added convenient single test selector form on test runner + +1.3.2, released 2006-12-25 +! HTMLPurifier object now accepts configuration arrays, no need to manually + instantiate a configuration object +! Context object now accessible to outside +! Added enduser-youtube.html, explains how to embed YouTube videos. See + also corresponding smoketest preserveYouTube.php. +! Added purifyArray(), which takes a list of HTML and purifies it all +! Added static member variable $version to HTML Purifier with PHP-compatible + version number string. +- Fixed fatal error thrown by upper-cased language attributes +- printDefinition.php: added labels, added better clarification +. HTMLPurifier_Config::create() added, takes mixed variable and converts into + a HTMLPurifier_Config object. + +1.3.1, released 2006-12-06 +! Added HTMLPurifier.func.php stub for a convenient function to call the library +- Fixed bug in RemoveInvalidImg code that caused all images to be dropped + (thanks to .mario for reporting this) +. Standardized all attribute handling variables to attr, made it plural + +1.3.0, released 2006-11-26 +# Invalid images are now removed, rather than replaced with a dud + <img src="" alt="Invalid image" />. Previous behavior can be restored + with new directive %Core.RemoveInvalidImg set to false. +! (X)HTML Strict now supported + + Transparently handles inline elements in block context (blockquote) +! 
Added GET method to demo for easier validation, added 50kb max input size
+! New directive %HTML.BlockWrapper, for block-ifying inline elements
+! New directive %HTML.Parent, allows you to only allow inline content
+! New directives %HTML.AllowedElements and %HTML.AllowedAttributes to let
+  users narrow the set of allowed tags
+! <li value="4"> and <ul start="2"> now allowed in loose mode
+! New directives %URI.DisableExternalResources and %URI.DisableResources
+! New directive %Attr.DisableURI, which eliminates all hyperlinking
+! New directive %URI.Munge, munges URI so you can use some sort of redirector
+  service to avoid PageRank leaks or warn users that they are exiting your site.
+! Added spiffy new smoketest printDefinition.php, which lets you twiddle with
+  the configuration settings and see how the internal rules are affected.
+! New directive %URI.HostBlacklist for blocking links to bad hosts.
+  xssAttacks.php smoketest updated accordingly.
+- Added missing type to ChildDef_Chameleon
+- Remove Tidy option from demo if there is not Tidy available
+. ChildDef_Required guards against empty tags
+. Lookup table HTMLDefinition->info_flow_elements added
+. Added peace-of-mind variable initialization to Strategy_FixNesting
+. Added HTMLPurifier->info_parent_def, parent child processing made special
+. Added internal documents briefly summarizing future progression of HTML
+. HTMLPurifier_Config->getBatch($namespace) added
+. More lenient casting to bool from string in HTMLPurifier_ConfigSchema
+. Refactored ChildDef classes into their own files
+
+1.2.0, released 2006-11-19
+# ID attributes now disabled by default. New directives:
+  + %HTML.EnableAttrID - restores old behavior by allowing IDs
+  + %Attr.IDPrefix - %Attr.IDBlacklist alternative that munges all user IDs
+    so that they don't collide with your IDs
+  + %Attr.IDPrefixLocal - Same as above, but for when there are multiple
+    instances of user content on the page
+  + Profuse documentation on how to use these available in docs/enduser-id.txt
+! Added MODx plugin <http://modxcms.com/forums/index.php/topic,6604.0.html>
+! Added percent encoding normalization
+! XSS attacks smoketest given facelift
+! Configuration documentation now has table of contents
+! Added %URI.DisableExternal, which prevents links to external websites. You
+  can also use %URI.Host to permit absolute linking to subdomains
+! Non-accessible resources (ex. mailto) blocked from embedded URIs (img src)
+- Type variable in HTMLDefinition was not being set properly, fixed
+- Documentation updated
+  + TODO added request Phalanger
+  + TODO added request Native compression
+  + TODO added request Remove redundant tags
+  + TODO added possible plaintext formatter for HTML Purifier documentation
+  + Updated ConfigDoc TODO
+  + Improved inline comments in AttrDef/Class.php, AttrDef/CSS.php
+    and AttrDef/Host.php
+  + Revamped documentation into HTML, along with misc updates
+- HTMLPurifier_Context doesn't throw a variable reference error if you attempt
+  to retrieve a non-existent variable
+. Switched to purify()-wide Context object registry
+. Refactored unit tests to minimize duplication
+. XSS attack sheet updated
+. configdoc.xml now has xml:space attached to default value nodes
+. Allow configuration directives to permit null values
+. Cleaned up test-cases to remove unnecessary swallowErrors()
+
+1.1.2, released 2006-09-30
+! Add HTMLPurifier.auto.php stub file that configures include_path
+- Documentation updated
+  + INSTALL document rewritten
+  + TODO added semi-lossy conversion
+  + API Doxygen docs' file exclusions updated
+  + Added notes on HTML versus XML attribute whitespace handling
+  + Noted that HTMLPurifier_ChildDef_Custom isn't being used
+  + Noted that config object's definitions are cached versions
+- Fixed lack of attribute parsing in HTMLPurifier_Lexer_PEARSax3
+- ftp:// URIs now have their typecodes checked
+- Hooked up HTMLPurifier_ChildDef_Custom's unit tests (they weren't being run)
+. Line endings standardized throughout project (svn:eol-style standardized)
+. Refactored parseData() to general Lexer class
+. Tester named "HTML Purifier" not "HTMLPurifier"
+
+1.1.1, released 2006-09-24
+! Configuration option to optionally Tidy up output for indentation to make up
+  for dropped whitespace by DOMLex (pretty-printing for the entire application
+  should be done by a page-wide Tidy)
+- Various documentation updates
+- Fixed parse error in configuration documentation script
+- Fixed fatal error in benchmark scripts, slightly augmented
+- As far as possible, whitespace is preserved in-between table children
+- Sample test-settings.php file included
+
+1.1.0, released 2006-09-16
+! Directive documentation generation using XSLT
+! XHTML can now be turned off, output becomes <br>
+- Made URI validator more forgiving: will ignore leading and trailing
+  quotes, apostrophes and less than or greater than signs.
+- Enforce alphanumeric namespace and directive names for configuration.
+- Table child definition made more flexible, will fix up poorly ordered elements
+. Renamed ConfigDef to ConfigSchema
+
+1.0.1, released 2006-09-04
+- Fixed slight bug in DOMLex attribute parsing
+- Fixed rejection of case-insensitive configuration values when there is a
+  set of allowed values. This manifested in %Core.Encoding.
+- Fixed rejection of inline style declarations that had lots of extra
+  space in them. This manifested in TinyMCE.
+
+1.0.0, released 2006-09-01
+! Shorthand CSS properties implemented: font, border, background, list-style
+! Basic color keywords translated into hexadecimal values
+! Table CSS properties implemented
+! Support for charsets other than UTF-8 (defined by iconv)
+! Malformed UTF-8 and non-SGML character detection and cleaning implemented
+- Fixed broken numeric entity conversion
+- API documentation completed
+. (HTML|CSS)Definition de-singleton-ized
+
+1.0.0beta, released 2006-08-16
+! First public release, most functionality implemented. Notable omissions are:
+  + Shorthand CSS properties
+  + Table CSS properties
+  + Deprecated attribute transformations
+
+    vim: et sw=4 sts=4