Object
  HTMLToString
public class HTMLToString
This class provides a single static convert() method that converts an
HTML file into an XML string, which can then be pre-filtered and added
to a Lucene index by the XMLTextProcessor class.
Internally, the HTML-to-XML conversion is performed by the jTidy
library, a Java port of the HTML Tidy converter.
| Field Summary | |
|---|---|
| private static HashMap | htmlCodeMap: HashMap built from the htmlCodes table. |
| (package private) static String[] | htmlCodes: Table of conversions from HTML ampersand codes to Unicode. |
| (package private) static Tidy | tidy: The HTMLTidy object that does the work. |
| Constructor Summary |
|---|
| HTMLToString() |
| Method Summary | |
|---|---|
| static String | convert(InputStream htmlInputStream): Convert an HTML file into an HTMLTidy style XML string. |
| static String | replaceHtmlCodes(String in): Convert any non-XML ampersand codes within a string to their Unicode equivalents. |
| Methods inherited from class Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
static Tidy tidy
static final String[] htmlCodes
private static HashMap htmlCodeMap
| Constructor Detail |
|---|
public HTMLToString()
| Method Detail |
|---|
public static String convert(InputStream htmlInputStream)
Parameters: htmlInputStream - Stream of HTML text to convert to an XML string.
null.
public static String replaceHtmlCodes(String in)
Parameters: in - The string within which to convert codes.
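The relationship between htmlCodes, htmlCodeMap, and replaceHtmlCodes() can be illustrated with a minimal, self-contained sketch. This is not the class's actual implementation: the class name HtmlCodeSketch, the layout of the table as alternating code/replacement pairs, and the four sample entries are all assumptions made for illustration. It shows one plausible way to replace non-XML ampersand codes with Unicode characters while leaving XML's own codes (such as &amp;) untouched.

```java
import java.util.HashMap;
import java.util.Map;

public class HtmlCodeSketch {
    // Hypothetical excerpt of a code table, assumed here to be stored as
    // alternating pairs: HTML ampersand code, then its Unicode replacement.
    static final String[] HTML_CODES = {
        "&nbsp;",   "\u00A0",
        "&copy;",   "\u00A9",
        "&eacute;", "\u00E9",
        "&mdash;",  "\u2014"
    };

    // Lookup map built once from the table above.
    private static final Map<String, String> HTML_CODE_MAP = new HashMap<>();
    static {
        for (int i = 0; i < HTML_CODES.length; i += 2) {
            HTML_CODE_MAP.put(HTML_CODES[i], HTML_CODES[i + 1]);
        }
    }

    // Replace every code found in the map; anything else that starts with
    // '&' (including XML codes like &amp; and bare ampersands) is copied as-is.
    public static String replaceHtmlCodes(String in) {
        StringBuilder out = new StringBuilder(in.length());
        int pos = 0;
        while (pos < in.length()) {
            int amp = in.indexOf('&', pos);
            if (amp < 0) {
                out.append(in, pos, in.length());
                break;
            }
            out.append(in, pos, amp);
            int semi = in.indexOf(';', amp);
            String replacement =
                (semi < 0) ? null : HTML_CODE_MAP.get(in.substring(amp, semi + 1));
            if (replacement != null) {
                out.append(replacement);
                pos = semi + 1;
            } else {
                out.append('&');
                pos = amp + 1;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceHtmlCodes("Caf&eacute; &amp; bar &copy; 2005"));
    }
}
```

In this sketch, codes absent from the map pass through unchanged, so the output remains valid input for an XML parser as long as the only remaining codes are ones XML itself defines.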