Proposal for "HTML_Safe"

» Metadata » Status
  • Category: HTML
  • Proposer: Roman Ivanov 
  • License: BSD, can be changed to PHP
» Description

This parser strips all potentially dangerous content from HTML:

  • opening tag without its closing tag
  • closing tag without its opening tag
  • any of these tags: ‘base’, ‘basefont’, ‘head’, ‘html’, ‘body’, ‘applet’, ‘object’, ‘iframe’, ‘frame’, ‘frameset’, ‘script’, ‘layer’, ‘ilayer’, ‘embed’, ‘bgsound’, ‘link’, ‘meta’, ‘style’, ‘title’, ‘blink’, ‘xml’ etc.
  • any of these attributes: on*, data*, dynsrc
  • javascript:/vbscript:/about: etc. protocols
  • expression/behavior etc. in styles
  • any other active content
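
The black-list rules above can be illustrated with a short sketch. This is not HTML_Safe's code: it is written in Python rather than PHP, covers only a subset of the listed tags, and every name in it (`BlacklistFilter`, `attr_is_safe`, `sanitize`, the constant sets) is hypothetical. It makes no attempt to handle unbalanced tags or obfuscated input.

```python
from html import escape
from html.parser import HTMLParser

# Illustrative subsets of the blacklist described above (not the full list).
DROP_WITH_CONTENT = {"script", "style", "iframe", "object", "applet"}
DROP_TAG_ONLY = {"base", "meta", "link", "embed", "frame", "bgsound"}
BAD_PROTOCOLS = ("javascript:", "vbscript:", "about:")


def attr_is_safe(name, value):
    """Return False for on*/data*/dynsrc attributes and dangerous values."""
    value = (value or "").strip().lower()
    if name.startswith(("on", "data")) or name == "dynsrc":
        return False
    if value.startswith(BAD_PROTOCOLS):
        return False
    if name == "style" and ("expression" in value or "behavior" in value):
        return False
    return True


class BlacklistFilter(HTMLParser):
    """Re-emit the parsed HTML, leaving out anything on the blacklist."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif tag not in DROP_TAG_ONLY and not self.skip_depth:
            kept = "".join(
                ' {}="{}"'.format(name, escape(value or ""))
                for name, value in attrs
                if attr_is_safe(name, value)
            )
            self.out.append("<{}{}>".format(tag, kept))

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag not in DROP_TAG_ONLY and not self.skip_depth:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        if not self.skip_depth:
            # The parser decoded character references, so re-escape the text.
            self.out.append(escape(data, quote=False))


def sanitize(html):
    """Run the filter over one HTML fragment and return the cleaned markup."""
    parser = BlacklistFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With this sketch, `sanitize('<p onclick="steal()">Hi <a href="javascript:alert(1)">link</a><script>alert(2)</script></p>')` returns `<p>Hi <a>link</a></p>`: the tags that are not blacklisted survive, while the event handler, the `javascript:` link target, and the whole `script` element are removed.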

It also tries to convert the code to valid XHTML, but htmltidy is a far better solution for this task.

Advantages compared to strip_tags:

1. strip_tags works on a white-list basis, deleting all tags except the
allowed ones. HTML_Safe works on a black-list basis, deleting only
dangerous content.
2. strip_tags can only strip tags. HTML_Safe strips all active
content, including tags, attributes, and attribute values.
3. strip_tags is not intended to fight XSS. HTML_Safe's primary goal
is to prevent XSS attacks.
4. strip_tags does not try to produce XHTML-compliant code. It does
not close unclosed tags.
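
Point 2 is the one most easily overlooked: a filter that decides tag by tag leaves the attributes of allowed tags untouched. The following rough Python emulation of white-list tag stripping shows the effect. It is an assumption-laden sketch, not PHP's strip_tags implementation, and the names `TAG` and `strip_tags_whitelist` are hypothetical.

```python
import re

# Matches an opening or closing tag and captures the tag name.
TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def strip_tags_whitelist(html, allowed):
    """Remove every tag whose name is not in `allowed`; keep the rest verbatim."""
    def keep_or_drop(match):
        return match.group(0) if match.group(1).lower() in allowed else ""
    return TAG.sub(keep_or_drop, html)


html = '<b onmouseover="steal()">bold</b><script>alert(1)</script>'
result = strip_tags_whitelist(html, {"b"})
print(result)
```

Here `result` is `<b onmouseover="steal()">bold</b>alert(1)`: the `script` tags are gone, but the event handler on the allowed `b` tag passes through unchanged, which is the kind of active content a black-list filter that inspects attributes is meant to catch.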

HTML_Safe is the successor of the SafeHTML project and fixes all known issues with SafeHTML.

» Dependencies » Links
  • XML_HTMLSax3
» Timeline » Changelog
  • First Draft: 2005-01-29
  • Proposal: 2005-01-29
  • Call for Votes: 2005-02-06
  • Roman Ivanov
    [2005-01-30 18:01 UTC]

    Description updated: added comparison with strip_tags() function.

  • Roman Ivanov
    [2005-02-06 16:33 UTC]

    Description updated: relationship with SafeHTML clarified.

    Code updated: it now appears to be fully compliant with the PEAR Coding Standards.