HTML_Safe

Class: HTML_Safe

Source Location: /HTML_Safe-0.3.5/Safe.php

Class Overview


HTML_Safe Parser


Version:

  • Release: @package_version@

Copyright:

  • 1997-2005 Roman Ivanov

Class Details

[line 58]
HTML_Safe Parser

This parser strips all potentially dangerous content from HTML:

  • opening tag without its closing tag
  • closing tag without its opening tag
  • any of these tags: "base", "basefont", "head", "html", "body", "applet", "object", "iframe", "frame", "frameset", "script", "layer", "ilayer", "embed", "bgsound", "link", "meta", "style", "title", "blink", "xml" etc.
  • any of these attributes: on*, data*, dynsrc
  • javascript:/vbscript:/about: etc. protocols
  • expression/behavior etc. in styles
  • any other active content
It also tries to convert the markup to valid XHTML, but htmltidy is a far better solution for that task.

Example:

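 // Include path assumes a standard PEAR install of the package.
 require_once 'HTML/Safe.php';

 // Plain assignment: "=& new" is PHP 4 syntax and is rejected by PHP 7+.
 $parser = new HTML_Safe();
 $result = $parser->parse($doc);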
 





Class Variables

$attributes = array('dynsrc', 'id', 'name', )

[line 270]

List of dangerous attributes
  • Access: public

Type:   array
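
The attribute rules can be illustrated with a minimal sketch. It assumes the prefix rules from the class overview (on* and data*) are combined with an exact-match lookup in $attributes. The function isAttributeDangerous() is hypothetical and not part of HTML_Safe.

```php
<?php
// Hypothetical helper, not part of HTML_Safe.
// Returns true when an attribute name should be dropped.
function isAttributeDangerous(string $name, array $attributes): bool
{
    $name = strtolower($name);

    // Prefix rules taken from the class overview: on* and data*.
    if (strpos($name, 'on') === 0 || strpos($name, 'data') === 0) {
        return true;
    }

    // Exact-match rule: the name appears in the list of dangerous attributes.
    return in_array($name, $attributes, true);
}

// Default value of $attributes as documented above.
$attributes = array('dynsrc', 'id', 'name');
```

Under these assumptions, 'onclick' and 'DynSrc' are rejected while 'href' passes this check and is left to protocol filtering.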



$blackProtocols = array(
        'about',   'chrome',     'data',       'disk',     'hcp',     
        'help',    'javascript', 'livescript', 'lynxcgi',  'lynxexec', 
        'ms-help', 'ms-its',     'mhtml',      'mocha',    'opera',   
        'res',     'resource',   'shell',      'vbscript', 'view-source', 
        'vnd.ms.radio',          'wysiwyg', 
        )

[line 176]

List of "dangerous" protocols (used for blacklist-filtering)
  • Access: public

Type:   array



$closeParagraph = array(
        'address', 'blockquote', 'center', 'dd',      'dir',       'div', 
        'dl',      'dt',         'h1',     'h2',      'h3',        'h4', 
        'h5',      'h6',         'hr',     'isindex', 'listing',   'marquee', 
        'menu',    'multicol',   'ol',     'p',       'plaintext', 'pre', 
        'table',   'ul',         'xmp', 
        )

[line 237]

List of block-level tags that terminate a paragraph

An open paragraph is closed when one of these tags is opened

  • Access: public

Type:   array
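
A simplified sketch of the paragraph-closing rule follows. It assumes the parser tracks open tags on a stack and only checks the innermost one. The function openTag() and the stack layout are hypothetical, not taken from the HTML_Safe source.

```php
<?php
// Hypothetical helper, not part of HTML_Safe.
// Emits the markup for an opening tag, closing an open <p> first
// when the new tag is listed in $closeParagraph.
function openTag(string $tag, array &$stack, array $closeParagraph): string
{
    $out = '';

    // Simplification: only the innermost open tag is inspected.
    if (in_array($tag, $closeParagraph, true) && end($stack) === 'p') {
        array_pop($stack);
        $out .= '</p>';
    }

    $stack[] = $tag;
    return $out . '<' . $tag . '>';
}

// Subset of the documented $closeParagraph list.
$closeParagraph = array('div', 'p', 'table', 'ul');
```

In this sketch, opening a div while a p is the innermost open tag yields '</p><div>', whereas an inline tag such as b leaves the paragraph open.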



$cssKeywords = array(
        'absolute', 'behavior',       'behaviour',   'content', 'expression', 
        'fixed',    'include-source', 'moz-binding',
        )

[line 215]

List of dangerous CSS keywords

The whole style="" attribute is removed if the parser finds one of these keywords

  • Access: public

Type:   array
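
The keyword check can be sketched as a case-insensitive substring search. This is an assumption about the matching strategy. The real parser may also normalize comments or escapes before searching. The function styleIsDangerous() is hypothetical.

```php
<?php
// Hypothetical helper, not part of HTML_Safe.
// Returns true when a style="" value contains any listed keyword,
// in which case the whole attribute would be dropped.
function styleIsDangerous(string $style, array $keywords): bool
{
    $style = strtolower($style);
    foreach ($keywords as $keyword) {
        if (strpos($style, $keyword) !== false) {
            return true;
        }
    }
    return false;
}

// Default value of $cssKeywords as documented above.
$cssKeywords = array(
    'absolute', 'behavior',       'behaviour',   'content', 'expression',
    'fixed',    'include-source', 'moz-binding',
);
```

Note that the list is deliberately broad: harmless declarations such as "position: absolute" are rejected along with "expression(...)".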



$deleteTags = array(
        'applet', 'base',   'basefont', 'bgsound', 'blink',  'body', 
        'embed',  'frame',  'frameset', 'head',    'html',   'ilayer', 
        'iframe', 'layer',  'link',     'meta',    'object', 'style', 
        'title',  'script', 
        )

[line 146]

List of dangerous tags (such tags will be deleted)
  • Access: public

Type:   array



$deleteTagsContent = array('script', 'style', 'title', 'xml', )

[line 160]

List of dangerous tags (such tags will be deleted, and all content inside these tags will also be removed)
  • Access: public

Type:   array
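
The difference between $deleteTags and $deleteTagsContent can be shown with a rough sketch. It uses regular expressions purely for brevity, which is an assumption of this example and not a description of how the class is implemented. The function removeListedTags() is hypothetical.

```php
<?php
// Hypothetical helper, not part of HTML_Safe.
// Regex matching is a simplification and is not robust against malformed markup.
function removeListedTags(string $html, array $deleteTags, array $deleteTagsContent): string
{
    // First drop listed elements together with everything inside them.
    foreach ($deleteTagsContent as $tag) {
        $quoted  = preg_quote($tag, '#');
        $pattern = '#<' . $quoted . '\b[^>]*>.*?</' . $quoted . '\s*>#is';
        $html    = preg_replace($pattern, '', $html);
    }

    // Then drop the remaining listed tags but keep their inner content.
    foreach ($deleteTags as $tag) {
        $pattern = '#</?' . preg_quote($tag, '#') . '\b[^>]*>#i';
        $html    = preg_replace($pattern, '', $html);
    }

    return $html;
}

// Subsets of the documented lists, chosen for illustration.
$deleteTags        = array('body', 'blink', 'script');
$deleteTagsContent = array('script', 'style');

$clean = removeListedTags(
    '<body><p>Hi</p><script>alert(1)</script></body>',
    $deleteTags,
    $deleteTagsContent
);
```

Here the script element disappears with its content, while the body tags are removed and the paragraph inside them is kept.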



$listTags = array('dir', 'menu', 'ol', 'ul', )

[line 262]

List of list tags
  • Access: public

Type:   array



$noClose = array()

[line 227]

List of tags that can have no "closing tag"
  • Deprecated: XHTML does not allow such tags
  • Access: public

Type:   array



$protocolAttributes = array(
        'action', 'background', 'codebase', 'dynsrc', 'href', 'lowsrc', 'src', 
        )

[line 202]

List of attributes that can contain protocols
  • Access: public

Type:   array



$protocolFiltering =  'white'

[line 168]

Type of protocol filtering ('white' or 'black')
  • Access: public

Type:   string
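
The two filtering modes can be sketched as follows. The sketch assumes 'white' accepts only listed schemes, 'black' rejects only listed schemes, and URLs without a scheme are let through. The function isProtocolAllowed() is hypothetical and not part of HTML_Safe.

```php
<?php
// Hypothetical helper, not part of HTML_Safe.
// Decides whether a URL's scheme passes 'white' or 'black' filtering.
function isProtocolAllowed(string $url, string $mode, array $white, array $black): bool
{
    // No "scheme:" prefix means a relative URL; this sketch lets it through.
    if (!preg_match('/^\s*([a-z][a-z0-9+.\-]*):/i', $url, $m)) {
        return true;
    }

    $scheme = strtolower($m[1]);

    return $mode === 'white'
        ? in_array($scheme, $white, true)    // whitelist: must be listed
        : !in_array($scheme, $black, true);  // blacklist: must not be listed
}

// Subsets of $whiteProtocols and $blackProtocols, chosen for illustration.
$white = array('http', 'https', 'ftp', 'mailto');
$black = array('javascript', 'vbscript', 'about');
```

With these lists, an unlisted scheme such as "skype:" is rejected in 'white' mode and accepted in 'black' mode, which is why the whitelist default is the stricter choice.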



$singleTags = array('area', 'br', 'img', 'input', 'hr', 'wbr', )

[line 138]

List of single tags ("<tag />")
  • Access: public

Type:   array



$tableTags = array(
        'caption', 'col', 'colgroup', 'tbody', 'td', 'tfoot', 'th', 
        'thead',   'tr', 
        )

[line 251]

List of table tags; any of these tags found outside a table will be removed
  • Access: public

Type:   array



$whiteProtocols = array(
        'ed2k',   'file', 'ftp',  'gopher', 'http',  'https', 
        'irc',    'mailto', 'news', 'nntp', 'telnet', 'webcal', 
        'xmpp', 
        )

[line 190]

List of "safe" protocols (used for whitelist-filtering)
  • Access: public

Type:   array





Method Detail

HTML_Safe (Constructor)   [line 277]

HTML_Safe HTML_Safe( )

Constructs class
  • Access: public


clear   [line 567]

boolean clear( )

Clears current document data
  • Access: public


getXHTML   [line 552]

string getXHTML( )

Returns the XHTML document
  • Return: Processed (X)HTML document
  • Access: public


parse   [line 580]

string parse( string $doc)

Main parsing function
  • Return: Processed (X)HTML document
  • Access: public

Parameters:

string   $doc     HTML document for processing



Documentation generated on Sun, 04 Sep 2005 07:30:06 -0400 by phpDocumentor 1.2.3.