Diffstat (limited to 'library/Text_Highlighter/README')
-rw-r--r-- | library/Text_Highlighter/README | 455 |
1 files changed, 0 insertions, 455 deletions
diff --git a/library/Text_Highlighter/README b/library/Text_Highlighter/README deleted file mode 100644 index 88f71aed2..000000000 --- a/library/Text_Highlighter/README +++ /dev/null @@ -1,455 +0,0 @@ -# $Id$ - -Introduction -============ - -Text_Highlighter is a class for syntax highlighting. The main idea is to -simplify creation of subclasses implementing syntax highlighting for -particular language. Subclasses do not implement any new functioanality, they -just provide syntax highlighting rules. The rules sources are in XML format. -To create a highlighter for a language, there is no need to code a new class -manually. Simply describe the rules in XML file and use Text_Highlighter_Generator -to create a new class. - - -This document does not contain a formal description of API - it is very -simple, and I believe providing some examples of code is sufficient. - - -Highlighter XML source -====================== - -Basics ------- - -Creating a new syntax highlighter begins with describing the highlighting -rules. There are two basic elements: block and region. A block is just a -portion of text matching a regular expression and highlighted with a single -color. Keyword is an example of a block. A region is defined by two regular -expressions: one for start of region, and another for the end. The main -difference from a block is that a region can contain blocks and regions -(including same-named regions). An example of a region is a group of -statements enclosed in curly brackets (this is used in many languages, for -example PHP and C). Also, characters matching start and end of a region may be -highlighted with their own color, and region contents with another. - -Blocks and regions may be declared as contained. Contained blocks and regions -can only appear inside regions. If a region or a block is not declared as -contained, it can appear both on top level and inside regions. Block or region -declared as not-contained can only appear on top level. 
- -For any region, a list of blocks and regions that can appear inside this -region can be specified. - -In this document, the term "color group" is used. Chunks of text assigned to -same color group will be highlighted with same color. Note that in versions -prior 0.5.0 color goups were refered as CSS classes, but since 0.5.0 not only -HTML output is supported, so "color group" is more appropriate term. - -Elements --------- - -The toplevel element is <highlight>. Attribute lang is required and denotes -the name of the language. Its value is used as a part of generated class name, -and must only contain letters, digits and underscores. Optional attribute -case, when given value yes, makes the language case sensitive (default is case -insensitive). Allowed subelements are: - - * <authors>: Information about the authors of the file. - <author>: Information about a single author of the file. (May be used - multiple times, one per author.) - - name="...": Author's name. Required. - - email="...": Author's email address. Optional. - - * <default>: Default color group. - - innerGroup="...": color group name. Required. - - * <region>: Region definition - - name="...": Region name. Required. - - innerGroup="...": Default color group of region contents. Required. - - delimGroup="...": color group of start and end of region. Optional, - defaults to value of innerGroup attribute. - - start="...", end="...": Regular expression matching start and end - of region. Required. Regular expression delimiters are optional, but - if you need to specify delimiter, use /. The only case when the - delimiters are needed, is specifying regular expression modifiers, - such as m or U. Examples: \/\* or /$/m. - - contained="yes": Marks region as contained. - - never-contained="yes": Marks region as not-contained. - - <contains>: Elements allowed inside this region. - - all="yes" Region can contain any other region or block - (except not-contained). May be used multiple times. 
- - <but> Do not allow certain regions or blocks. - - region="..." Name of region not allowed within - current region. - - block="..." Name of block not allowed within - current region. - - region="..." Name of region allowed within current region. - - block="..." Name of block allowed within current region. - - <onlyin> Only allow this region within certain regions. May be - used multiple times. - - block="..." Name of parent region - - * <block>: Block definition - - name="...": Block name. Required. - - innerGroup="...": color group of block contents. Optional. If not - specified, color group of parent region or default color group will be - used. One would only want to omit this attribute if there are - keyword groups (see below) inherited from this block, and no special - highlighting should apply when the block does not match the keyword. - - match="..." Regular expression matching the block. Required. - Regular expression delimiters are optional, but if you need to - specify delimiter, use /. The only case when the delimiters are - needed, is specifying regular expression modifiers, such as m or U. - Examples: #|\/\/ or /$/m. - - contained="yes": Marks block as contained. - - never-contained="yes": Marks block as not-contained. - - <onlyin> Only allow this block within certain regions. May be used - multiple times. - - block="..." Name of parent region - - multiline="yes": Marks block as multi-line. By default, whole - blocks are assumed to reside in a single line. This make the things - faster. If you need to declare a multi-line block, use this - attribute. - - <partgroup>: Assigns another color group to a part of the block that - matched a subpattern. - - index="n": Subpattern index. Required. - - innerGroup="...": color group name. Required. - - This is an example from CSS highlighter: the measure is matched as - a whole, but the measurement units are highlighted with different - color. 
- - <block name="measure" match="\d*\.?\d+(\%|em|ex|pc|pt|px|in|mm|cm)" - innerGroup="number" contained="yes"> - <onlyin region="property"/> - <partGroup index="1" innerGroup="string" /> - </block> - - * <keywords>: Keyword group definition. Keyword groups are useful when you - want to highlight some words that match a condition for a block with a - different color. Keywords are defined with literal match, not regular - expressions. For example, you have a block named identifier matching a - general identifier, and want to highlight reserved words (which match - this block as well) with different color. You inherit a keyword group - "reserved" from "identifier" block. - - name="...": Keyword group. Required. - - ifdef="...", ifndef="..." : Conditional declaration. See - "Conditions" below. - - inherits="...": Inherited block name. Required. - - innerGroup="...": color group of keyword group. Required. - - case="yes|no": Overrides case-sensitivity of the language. - Optional, defaults to global value. - - <keyword>: Single keyword definition. - - match="..." The keyword. Note: this is not a regular - expression, but literal match (possibly case insensitive). - -Note that for BC reasons element partClass is alias for partGroup, and -attributes innerClass and delimClass are aliases of innerGroup and -delimGroup, respectively. - - -Conditions ----------- - -Conditional declarations allow enabling or disabling certain highlighting -rules at runtime. For example, Java highlighter has a very big list of -keywords matching Java standard classes. Finding a match in this list can take -much time. For that reason, corresponding keyword group is declared with -"ifdef" attribute : - - <keywords name="builtin" inherits="identifier" innerClass="builtin" - case="yes" ifdef="java.builtins"> - <keyword match="AbstractAction" /> - <keyword match="AbstractBorder" /> - <keyword match="AbstractButton" /> - ... - ... 
- <keyword match="_Remote_Stub" /> - <keyword match="_ServantActivatorStub" /> - <keyword match="_ServantLocatorStub" /> - </keywords> - -This keyword group will be only enabled when "java.builtins" is passed as an -element of "defines" option: - - $options = array( - 'defines' => array( - 'java.builtins', - ), - 'numbers' => HL_NUMBERS_TABLE, - ); - $highlighter = Text_Highlighter::factory('java', $options); - -"ifndef" attribute has reverse meaning. - -Currently, "ifdef" and "ifndef" attributes are only supported for <keywords> -tag. - - - -Class generation -================ - -Creating XML description of highlighting rules is the most complicated part of -the process. To generate the class, you need just few lines of code: - - <?php - require_once 'Text/Highlighter/Generator.php'; - $generator = new Text_Highlighter_Generator('php.xml'); - $generator->generate(); - $generator->saveCode('PHP.php'); - ?> - - - -Command-line class generation tool -================================== - -Example from previous section looks pretty simple, but it does not handle any -errors which may occur during parsing of XML source. The package provides a -command-line script to make generation of classes even more simple, and takes -care of possible errors. It is called generate (on Unix/Linux) or generate.bat -(on Windows). This script is able to process multiple files in one run, and -also to process XML from standard input and write generated code to standard -output. - - Usage: - generate options - - Options: - -x filename, --xml=filename - source XML file. Multiple input files can be specified, in which - case each -x option must be followed by -p unless -d is specified - Defaults to stdin - -p filename, --php=filename - destination PHP file. Defaults to stdout. If specied multiple times, - each -p must follow -x - -d dirname, --dir=dirname - Default destination directory. 
File names will be taken from XML input - ("lang" attribute of <highlight> tag) - -h, --help - This help - -Examples - - Read from php.xml, write to PHP.php - - generate -x php.xml -p PHP.php - - Read from php.xml, write to standard output - - generate -x php.xml - - Read from php.xml, write to PHP.php, read from xml.xml, write to XML.php - - generate -x php.xml -p PHP.php -x xml.xml -p XML.php - - Read from php.xml, write to /some/dir/PHP.php, read from xml.xml, write to - /some/dir/XML.php (assuming that xml.xml contains <highlight lang="xml">, and - php.xml contains <highlight lang="php">) - - generate -x php.xml -x xml.xml -d /some/dir/ - - - -Renderers -========= - -Introduction ------------- - -Text_Highlighter supports renderes. Using renderers, you can get output in -different formats. Two renderers are included in the package: - - - HTML renderer. Generates HTML output. A style sheet should be linked to - the document to display colored text - - - Console renderer. Can be used to output highlighted text to - color-capable terminals, either directly or trough less -r - - -Renderers API -------------- - -Renderers are subclasses of Text_Highlighter_Renderer. Renderer should -override at least two methods - acceptToken and getOutput. Overriding other -methods is optional, depending on the nature of renderer's output and details -of implementation. - - string reset() - resets renderer state. This method is called every time before a new - source file is highlighted. - - string preprocess(string $code) - preprocesses code. Can be used, for example, to normalize whitespace - before highlighting. Returns preprocessed string. - - void acceptToken(string $group, string $content) - the core method of the renderer. Highlighter passes chunks of text to - this method in $content, and color group in $group - - void finalize() - signals the renderer that no more tokens are available. - - mixed getOutput() - returns generated output. 
- - -Setting renderer options --------------------------------- - -Renderers accept an optional argument to their constructor - options array. -Elements of this array are renderer-specific. - -HTML renderer -------------- - -HTML renderer produces HTML output with optional line numbering. The renderer -itself does not provide information about actual colors of highlighted text. -Instead, <span class="hl-XXX"> is used, where XXX is replaced with color group -name (hl-var, hl-string, etc.). It is up to you to create a CSS stylesheet. -If 'use_language' option with value evaluating to true was passed, class names -will be formatted as "LANG-hl-XXX", where LANG is language name as defined in -highlighter XML source ("lang" attribute of <highlight> tag) in lower case. - -There are 3 special CSS classes: - - hl-main - this class applies to whole output or right table column, - depending on 'numbers' option - hl-gutter - applies to left column in table - hl-table - applies to whole table - -HTML renderer accepts following options (each being optional): - - * numbers - line numbering style. - 0 - no numbering (default) - HL_NUMBERS_LI - use <ol></ol> for line numbering - HL_NUMBERS_TABLE - create a 2-column table, with line numbers in left - column and highlighted text in right column - - * tabsize - tabulation size. Defaults to 4 - - Example: - - require_once 'Text/Highlighter/Renderer/Html.php'; - $options = array( - 'numbers' => HL_NUMBERS_LI, - 'tabsize' => 8, - ); - $renderer = new Text_Highlighter_Renderer_HTML($options); - -Console renderer ----------------- - -Console renderer produces output for displaying on a color-capable terminal, -either directly or through less -r, using ANSI escape sequences. By default, -this renderer only highlights most common color groups. Additional colors -can be specified using 'colors' option. This renderer also accepts 'numbers' -option - a boolean value, and 'tabsize' option. 
- - Example : - - require_once 'Text/Highlighter/Renderer/Console.php'; - $colors = array( - 'prepro' => "\033[35m", - 'types' => "\033[32m", - ); - $options = array( - 'numbers' => true, - 'tabsize' => 8, - 'colors' => $colors, - ); - $renderer = new Text_Highlighter_Renderer_Console($options); - - -ANSI color escape sequences have the following format: - - ESC[#;#;....;#m - -where ESC is character with ASCII code 27 (033 octal, 0x1B hexadecimal). # is -one of the following: - - 0 for normal display - 1 for bold on - 4 underline (mono only) - 5 blink on - 7 reverse video on - 8 nondisplayed (invisible) - 30 black foreground - 31 red foreground - 32 green foreground - 33 yellow foreground - 34 blue foreground - 35 magenta foreground - 36 cyan foreground - 37 white foreground - 40 black background - 41 red background - 42 green background - 43 yellow background - 44 blue background - 45 magenta background - 46 cyan background - 47 white background - - -How to use Text_Highlighter class -================================= - -Creating a highlighter object ------------------------------ - -To create a highlighter for a certain language, use Text_Highlighter::factory() -static method: - - require_once 'Text/Highlighter.php'; - $hl = Text_Highlighter::factory('php'); - - -Setting a renderer ------------------- - -Actual output is produced by a renderer. - - require_once 'Text/Highlighter.php'; - require_once 'Text/Highlighter/Renderer/Html.php'; - $options = array( - 'numbers' => HL_NUMBERS_LI, - 'tabsize' => 8, - ); - $renderer = new Text_Highlighter_Renderer_HTML($options); - $hl = Text_Highlighter::factory('php'); - $hl->setRenderer($renderer); - -Note that for BC reasons, it is possible to use highlighter without setting a -renderer. If no renderer is set, HTML renderer will be used by default. In -this case, you should pass options as second parameter to factory method. 
The -following example works exactly as previous one: - - require_once 'Text/Highlighter.php'; - $options = array( - 'numbers' => HL_NUMBERS_LI, - 'tabsize' => 8, - ); - $hl = Text_Highlighter::factory('php', $options); - - -Getting output --------------- - -And finally, do the highlighting and get the output: - - require_once 'Text/Highlighter.php'; - require_once 'Text/Highlighter/Renderer/Html.php'; - $options = array( - 'numbers' => HL_NUMBERS_LI, - 'tabsize' => 8, - ); - $renderer = new Text_Highlighter_Renderer_HTML($options); - $hl = Text_Highlighter::factory('php'); - $hl->setRenderer($renderer); - $html = $hl->highlight(file_get_contents('example.php')); - -# vim: set autoindent tabstop=4 shiftwidth=4 softtabstop=4 tw=78: */ - |