org.clapper.util.html
Class HTMLUtil

java.lang.Object
  extended by org.clapper.util.html.HTMLUtil

public final class HTMLUtil
extends java.lang.Object

Static class containing miscellaneous HTML-related utility methods.

Version:
$Revision: 6735 $
Author:
Copyright © 2004-2007 Brian M. Clapper

Method Summary
static java.lang.String convertCharacterEntities(java.lang.String s)
          Converts all inline HTML character entities (cf. http://www.w3.org/TR/REC-html40/sgml/entities.html) to their Unicode character counterparts, if possible.
static java.lang.String escapeHTML(java.lang.String s)
          Escape characters that are special in HTML, so that the resulting string can be included in HTML (or XML).
static java.lang.String makeCharacterEntities(java.lang.String s)
          Converts appropriate Unicode characters to their HTML character entity counterparts (cf. http://www.w3.org/TR/REC-html40/sgml/entities.html).
static java.lang.String stripHTMLTags(java.lang.String s)
          Removes all HTML element tags from a string, leaving just the character data.
static java.lang.String textFromHTML(java.lang.String s)
          Convenience method to convert embedded HTML to text.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

stripHTMLTags

public static java.lang.String stripHTMLTags(java.lang.String s)
Removes all HTML element tags from a string, leaving just the character data. This method does not touch any inline HTML character entity codes. Use convertCharacterEntities() to convert HTML character entity codes.

Parameters:
s - the string to adjust
Returns:
the resulting, possibly modified, string
See Also:
convertCharacterEntities(java.lang.String)
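
The behavior described above can be approximated with a single regular expression. The following is a minimal sketch, not the library's implementation: the class name StripTagsSketch and the method stripTags are hypothetical, and the pattern assumes that every tag is a "<", a run of characters other than ">", and a closing ">".

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not the HTMLUtil source.
final class StripTagsSketch {
    // Anything that looks like a tag: '<', any run of non-'>' characters, '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Removes tag-like spans and leaves all other text, including
    // character entities such as &amp;, exactly as it was.
    static String stripTags(String s) {
        return TAG.matcher(s).replaceAll("");
    }
}
```

A pattern this simple does not cope with a ">" inside an attribute value, with comments, or with script content; it only illustrates that entity codes pass through untouched.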

escapeHTML

public static java.lang.String escapeHTML(java.lang.String s)
Escape characters that are special in HTML, so that the resulting string can be included in HTML (or XML). For instance, this method will convert an embedded "&" to "&amp;".

Parameters:
s - the string to convert
Returns:
the converted string
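
To make the escaping concrete, here is a minimal sketch of this kind of conversion. It is not the library's code: the class name EscapeSketch and the method escape are hypothetical, and the set of escaped characters (ampersand, angle brackets, and the double quote) is an assumption, since the description above only confirms the ampersand case.

```java
// Hypothetical stand-in for illustration only; not the HTMLUtil source.
final class EscapeSketch {
    // Replaces characters that are special in HTML with entity references.
    // Which characters the real method escapes is an assumption here.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }
}
```

Walking the string once, character by character, avoids the double-escaping that chained replacements can cause if the ampersand is not handled first.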

convertCharacterEntities

public static java.lang.String convertCharacterEntities(java.lang.String s)
Converts all inline HTML character entities (cf. http://www.w3.org/TR/REC-html40/sgml/entities.html) to their Unicode character counterparts, if possible.

Parameters:
s - the string to convert
Returns:
the resulting, possibly modified, string
See Also:
stripHTMLTags(java.lang.String), makeCharacterEntities(java.lang.String)
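
One way to picture the conversion is a single pass that recognizes named and numeric entity references and leaves anything it cannot resolve unchanged, which is one reading of "if possible". The sketch below is an illustration under stated assumptions, not the library's implementation: the class name EntityDecodeSketch, the method decode, and the deliberately tiny entity table are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not the HTMLUtil source.
final class EntityDecodeSketch {
    // A tiny sample of the HTML 4 entity table; the full table is much larger.
    private static final Map<String, Integer> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", (int) '&');
        NAMED.put("lt", (int) '<');
        NAMED.put("gt", (int) '>');
        NAMED.put("quot", (int) '"');
        NAMED.put("nbsp", 0x00A0);
        NAMED.put("copy", 0x00A9);
        NAMED.put("eacute", 0x00E9);
    }

    // Decimal (&#169;), hexadecimal (&#xA9;), or named (&copy;) references.
    private static final Pattern ENTITY =
        Pattern.compile("&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);");

    static String decode(String s) {
        Matcher m = ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            Integer cp;
            try {
                if (body.startsWith("#x") || body.startsWith("#X")) {
                    cp = Integer.valueOf(body.substring(2), 16);
                } else if (body.startsWith("#")) {
                    cp = Integer.valueOf(body.substring(1));
                } else {
                    cp = NAMED.get(body); // null when the name is unknown
                }
            } catch (NumberFormatException e) {
                cp = null; // number too large to parse; treat as unresolvable
            }
            // Unresolvable references are copied through verbatim.
            String replacement = (cp != null && Character.isValidCodePoint(cp))
                ? new String(Character.toChars(cp))
                : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because the input is scanned once, text such as "&amp;lt;" decodes to "&lt;" rather than being decoded a second time to "<".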

makeCharacterEntities

public static java.lang.String makeCharacterEntities(java.lang.String s)
Converts appropriate Unicode characters to their HTML character entity counterparts (cf. http://www.w3.org/TR/REC-html40/sgml/entities.html).

Parameters:
s - the string to convert
Returns:
the resulting, possibly modified, string
See Also:
stripHTMLTags(java.lang.String), convertCharacterEntities(java.lang.String)
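
The reverse direction can be sketched the same way. The description above does not say which characters count as "appropriate", so the sketch assumes that only non-ASCII characters are converted, using a named entity when one is known and a numeric reference otherwise. The class name EntityEncodeSketch, the method encode, and the three-entry name table are hypothetical; this is not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the HTMLUtil source.
final class EntityEncodeSketch {
    // A tiny sample of code point -> entity name mappings.
    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put(0x00A0, "nbsp");
        NAMES.put(0x00A9, "copy");
        NAMES.put(0x00E9, "eacute");
    }

    static String encode(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // step over surrogate pairs as a unit
            if (cp < 0x80) {
                out.appendCodePoint(cp);  // assumption: ASCII is left untouched
            } else {
                String name = NAMES.get(cp);
                if (name != null) {
                    out.append('&').append(name).append(';');
                } else {
                    out.append("&#").append(cp).append(';'); // decimal reference
                }
            }
        }
        return out.toString();
    }
}
```

Under these assumptions an ASCII "&" is not escaped; that job belongs to a method like escapeHTML.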

textFromHTML

public static java.lang.String textFromHTML(java.lang.String s)
Convenience method to convert embedded HTML to text.

Parameters:
s - the string to parse
Returns:
the resulting, possibly modified, string
See Also:
convertCharacterEntities(java.lang.String), stripHTMLTags(java.lang.String)
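
The See Also entries suggest that this conversion combines tag stripping with entity conversion. The following minimal sketch shows one plausible composition; the order of the steps, the class name TextFromHtmlSketch, the method toText, and the short list of entities are all assumptions rather than the library's documented behavior.

```java
// Hypothetical stand-in for illustration only; not the HTMLUtil source.
final class TextFromHtmlSketch {
    static String toText(String html) {
        // Step 1: remove tag-like spans while entities are still encoded,
        // so that an escaped "&lt;b&gt;" is not mistaken for a real tag.
        String noTags = html.replaceAll("<[^>]*>", "");

        // Step 2: decode a handful of entities. "&amp;" goes last so that
        // "&amp;lt;" ends up as the literal text "&lt;" instead of "<".
        return noTags.replace("&lt;", "<")
                     .replace("&gt;", ">")
                     .replace("&quot;", "\"")
                     .replace("&nbsp;", "\u00a0")
                     .replace("&amp;", "&");
    }
}
```

Stripping before decoding is the design point worth noting: doing it the other way round would delete text that the author had deliberately escaped.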


Copyright © 2004-2007 Brian M. Clapper. All Rights Reserved.