
Class SimpleHtmlSaxParser

Description

Converts HTML tokens into selected SAX events.

Located in /sapphire/dev/simpletest/parser.php (line 543)

Variable Summary
mixed $_attributes
mixed $_current_attribute
mixed $_lexer
mixed $_listener
mixed $_tag
Method Summary
static SimpleLexer &createLexer (SimpleSaxParser &$parser)
static string decodeHtml (string $html)
static string normalise (string $html)
SimpleHtmlSaxParser SimpleHtmlSaxParser (SimpleSaxListener &$listener)
boolean acceptAttributeToken (string $token, integer $event)
boolean acceptEndToken (string $token, integer $event)
boolean acceptEntityToken (string $token, integer $event)
boolean acceptStartToken (string $token, integer $event)
boolean acceptTextToken (string $token, integer $event)
boolean ignore (string $token, integer $event)
boolean parse (string $raw)
Variables
mixed $_attributes (line 547)
mixed $_current_attribute (line 548)
mixed $_lexer (line 544)
mixed $_listener (line 545)
mixed $_tag (line 546)
Methods
static method createLexer (line 581)

Sets up the matching lexer. Starts in 'text' mode.

  • return: Lexer suitable for this parser.
  • access: public
static SimpleLexer &createLexer (SimpleSaxParser &$parser)
  • SimpleSaxParser &$parser: Event generator, usually $self.
static method decodeHtml (line 693)

Decodes any HTML entities.

  • return: Outgoing plain text.
  • access: public
static string decodeHtml (string $html)
  • string $html: Incoming HTML.
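A minimal sketch of entity decoding, assuming behaviour equivalent to PHP's built-in html_entity_decode(). The function name sketchDecodeHtml is hypothetical and is not part of this class; the page does not show the library's own implementation.

```php
<?php
// Hypothetical stand-in for decodeHtml(); not the library's own code.
function sketchDecodeHtml(string $html): string
{
    // ENT_QUOTES decodes both double- and single-quote entities
    // in addition to the usual named and numeric entities.
    return html_entity_decode($html, ENT_QUOTES, 'UTF-8');
}

echo sketchDecodeHtml('Fish &amp; chips &lt;b&gt;'), "\n";
// prints: Fish & chips <b>
```

Note that the decoded output can contain angle brackets, so it should be treated as plain text rather than fed back in as markup.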
static method normalise (line 706)

Turns HTML into the text a browser would display. Images are converted to their alt text and tags are suppressed.

Entities are converted to their visible representation.

  • return: Plain text.
  • access: public
static string normalise (string $html)
  • string $html: HTML to convert.
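The steps described above (alt text in place of images, tags suppressed, entities made visible) can be approximated as follows. This is a sketch under stated assumptions: sketchNormalise is a hypothetical name, the regular expressions are illustrative, and only double-quoted alt attributes are handled. It is not the library's implementation.

```php
<?php
// Hypothetical stand-in for normalise(); an approximation only.
function sketchNormalise(string $html): string
{
    // Drop HTML comments entirely.
    $text = preg_replace('/<!--.*?-->/s', '', $html);
    // Replace each image that has a double-quoted alt attribute with that text.
    $text = preg_replace('/<img[^>]*\balt\s*=\s*"([^"]*)"[^>]*>/i', ' $1 ', $text);
    // Suppress all remaining tags.
    $text = preg_replace('/<[^>]*>/', '', $text);
    // Convert entities to their visible characters.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace, as a browser would when rendering.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo sketchNormalise('<p>Hello <img src="a.png" alt="World">&amp; <b>all</b></p>'), "\n";
// prints: Hello World & all
```

Decoding entities after stripping tags matters here: decoding first would turn &lt;b&gt; into a literal tag that the next step would then remove.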
Constructor SimpleHtmlSaxParser (line 555)

Sets the listener.

  • access: public
SimpleHtmlSaxParser SimpleHtmlSaxParser (SimpleSaxListener &$listener)
acceptAttributeToken (line 639)

Part of the tag data.

  • return: False if parse error.
  • access: public
boolean acceptAttributeToken (string $token, integer $event)
  • string $token: Incoming characters.
  • integer $event: Lexer event type.
acceptEndToken (line 625)

Accepts a token from the end tag mode.

The element name is converted to lower case.

  • return: False if parse error.
  • access: public
boolean acceptEndToken (string $token, integer $event)
  • string $token: Incoming characters.
  • integer $event: Lexer event type.
acceptEntityToken (line 660)

A character entity.

  • return: False if parse error.
  • access: public
boolean acceptEntityToken (string $token, integer $event)
  • string $token: Incoming characters.
  • integer $event: Lexer event type.
acceptStartToken (line 597)

Accepts a token from the tag mode. If the starting element completes then the element is dispatched and the current attributes are set back to empty. The element or attribute name is converted to lower case.

  • return: False if parse error.
  • access: public
boolean acceptStartToken (string $token, integer $event)
  • string $token: Incoming characters.
  • integer $event: Lexer event type.
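The state handling described above can be sketched as a small stand-alone class. Everything here is an assumption made for illustration: the SKETCH_* event codes, the StartTagSketch class and its $dispatched array (which stands in for the listener) are hypothetical, and attribute values, which arrive through acceptAttributeToken, are left empty.

```php
<?php
// Hypothetical event codes; the real lexer defines its own constants.
const SKETCH_ENTER = 1;
const SKETCH_MATCHED = 2;
const SKETCH_EXIT = 3;

class StartTagSketch
{
    private $tag = '';
    private $attributes = array();
    public $dispatched = array(); // stands in for the listener

    public function acceptStartToken($token, $event)
    {
        if ($event === SKETCH_ENTER) {
            // Token looks like "<A"; drop the "<" and lower-case the name.
            $this->tag = strtolower(substr($token, 1));
            return true;
        }
        if ($event === SKETCH_EXIT) {
            // Element complete: dispatch it, then reset the state.
            $this->dispatched[] = array($this->tag, $this->attributes);
            $this->tag = '';
            $this->attributes = array();
            return true;
        }
        if ($token !== '=') {
            // Any other token is treated as an attribute name.
            $this->attributes[strtolower($token)] = '';
        }
        return true;
    }
}

$sketch = new StartTagSketch();
$sketch->acceptStartToken('<A', SKETCH_ENTER);
$sketch->acceptStartToken('HREF', SKETCH_MATCHED);
$sketch->acceptStartToken('>', SKETCH_EXIT);
print_r($sketch->dispatched);
```

After the exit event the sketch holds one dispatched element named "a" with a single empty "href" attribute, and its internal tag and attribute state is empty again, ready for the next start tag.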
acceptTextToken (line 671)

Character data between tags regarded as important.

  • return: False if parse error.
  • access: public
boolean acceptTextToken (string $token, integer $event)
  • string $token: Incoming characters.
  • integer $event: Lexer event type.
ignore (line 682)

Incoming data to be ignored.

  • return: False if parse error.
  • access: public
boolean ignore (string $token, integer $event)
  • string $token: Incoming characters.
  • integer $event: Lexer event type.
parse (line 570)

Runs the content through the lexer which should call back to the acceptors.

  • return: False if parse error.
  • access: public
boolean parse (string $raw)
  • string $raw: Page text to parse.
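The callback flow described above, where parsing drives a listener and a rejected callback becomes a false return, can be illustrated with a much simpler stand-in. This sketch does not use the library's lexer: sketchParse and RecordingListener are hypothetical, and the listener method names (startElement, endElement, addContent) are assumptions chosen for illustration.

```php
<?php
// Hypothetical listener that records the events it receives.
class RecordingListener
{
    public $events = array();

    public function startElement($name)
    {
        $this->events[] = "start:$name";
        return true;
    }

    public function endElement($name)
    {
        $this->events[] = "end:$name";
        return true;
    }

    public function addContent($text)
    {
        $this->events[] = "text:$text";
        return true;
    }
}

// Hypothetical mini-parser: splits on tags and calls back to the listener.
function sketchParse($raw, $listener)
{
    $flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;
    $tokens = preg_split('/(<[^>]+>)/', $raw, -1, $flags);
    foreach ($tokens as $token) {
        if (preg_match('/^<\/\s*(\w+)/', $token, $m)) {
            $ok = $listener->endElement(strtolower($m[1]));
        } elseif (preg_match('/^<\s*(\w+)/', $token, $m)) {
            $ok = $listener->startElement(strtolower($m[1]));
        } else {
            $ok = $listener->addContent($token);
        }
        if (!$ok) {
            return false; // a rejecting callback counts as a parse error
        }
    }
    return true;
}

$listener = new RecordingListener();
$result = sketchParse('<P>Hi <B>there</B></P>', $listener);
print_r($listener->events);
```

Keeping the listener separate from the tokenising loop mirrors the SAX style: the parser only reports events, and the listener decides what to build from them.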

Documentation generated on Sun, 19 Oct 2008 06:44:03 +1300 by phpDocumentor 1.3.2