java.lang.Object
  org.ceryle.xml.XMLUtils
public class XMLUtils
Provides some static XML utility methods.
| Field Summary | |
|---|---|
static boolean |
entifyApos
When true, entify apostrophe ("'") characters. |
static boolean |
forceUTF8
Force set of XML declaration's encoding to UTF-8 when true. |
static int |
SERIALIZE_HTML
An enumerated type indicating that the serialization method used is HTML, which treats the incoming node as a text string (i.e., no indenting or other processing). |
static int |
SERIALIZE_TEXT
An enumerated type indicating that the serialization method used is TEXT, which treats the incoming node as a text string (i.e., no indenting). |
static int |
SERIALIZE_UNKNOWN
An enumerated type indicating an unknown serialization method (-1). |
static int |
SERIALIZE_XHTML
An enumerated type indicating that the serialization method used is XHTML, which preserves XHTML's empty element behaviour. |
static int |
SERIALIZE_XML
An enumerated type indicating that the serialization method used is XML (default). |
static int |
SERIALIZE_XTM
An enumerated type indicating that the serialization method used is for XTM (same as XML). |
static String |
XSLT_property_cdata_section_elements
The standard XSLT property keys supported are: |
static String |
XSLT_property_doctype_public
The standard XSLT property keys supported are: |
static String |
XSLT_property_doctype_system
The standard XSLT property keys supported are: |
static String |
XSLT_property_encoding
The standard XSLT property keys supported are: |
static String |
XSLT_property_indent
The standard XSLT property keys supported are: |
static String |
XSLT_property_media_type
The standard XSLT property keys supported are: |
static String |
XSLT_property_method
The standard XSLT property keys supported are: |
static String |
XSLT_property_omit_xml_declaration
The standard XSLT property keys supported are: |
static String |
XSLT_property_standalone
The standard XSLT property keys supported are: |
static String |
XSLT_property_version
The standard XSLT property keys supported are: |
| Constructor Summary | |
|---|---|
XMLUtils()
|
|
| Method Summary | |
|---|---|
static String |
deentify(String s)
Return a string with any of the XML-defined numeric or named character entities replaced by their character equivalents (as normal text). |
static String |
entify(String s,
boolean asNumeric)
Return a string with markup-sensitive characters (LT,GT,AMP,APOS and QUOT) expressed as either numeric or named character entities, depending on the boolean asNumeric. |
static Document |
generateDocument(String uri,
String qname)
Returns a DOM implementation of a Document object with a document element having an XML Namespace URI uri and a qualified name qname. |
static Element |
getChildElementByTagName(Element element,
String name,
boolean exclusive)
Returns the child Element of the provided Element whose element type ("tag") name matches the String name. If more than one such child element exists, throws a ProcessException when the boolean exclusive is true, and returns the first instance otherwise. |
static List |
getElementsWithAttribute(Document doc,
String name,
String attrname,
String value,
boolean ignoreCase)
Returns the List of all instances found of an element with the provided tag name containing an attribute attrname whose value is value, in the order they are located. |
static String |
getElementText(Element element,
boolean normalizeWS)
Returns the text content of all Text node children of Element element, ignoring any descendant elements (grandchildren, etc.). |
static Element |
getElementWithAttribute(Document doc,
String name,
String attrname,
String value,
boolean ignoreCase)
Returns the first instance found of an element with the provided tag name containing an attribute attrname whose value is value. |
static Element |
getFirstChildElement(Element element,
String name)
Returns the first child element of element whose element type name (AKA "tag name") matches the provided String, null if no match. |
static Element |
getFirstChildElementNS(Element element,
String namespaceURI,
String localName)
Returns the first child element of element whose XML Namespace URI and local name (AKA "element type name" or "tag name") match the parameters, null if no match. |
static Element |
getFirstDescendantElement(Element element,
String tagName)
Returns the first descendant element of element whose element type name (AKA "tag name") matches the provided String, null if no match. |
static int |
getMethodForMIME(MIME mime)
Returns an XMLUtils.SERIALIZE_* value based upon the provided MIME type. |
static String |
getMethodName(int method)
Returns a String indicator of the provided serialization method, as provided by the org.apache.xml.serialize.Method class. |
static MIME |
getMIMEtype(int method)
Returns a MIME object based upon the provided serialization method (XMLUtils.SERIALIZE_XML, etc.). |
static String |
getPCDATAContent(Element element)
Returns the PCDATA content of the Element provided. |
static String |
getSerializationMethod(String mimetype)
Returns the serialization method (as a String, but using the org.apache.xml.serialize.Method class as the source) based on a String comparison with mimetype when matched against several common MIME types. |
static String |
getSerializedNode(Node node)
A static method that returns a serialization of the provided DOM Document or Element using an XML serialization method. |
static int |
getSerializedSize(Node node)
A static method that returns the size in characters of the serialization of the provided DOM Document or Element using an XML serialization method. |
static org.apache.xml.serializer.Serializer |
getSerializer(Object out,
String method)
Returns a serializer suitable for the provided method, using the provided Writer or OutputStream out. |
static org.apache.xml.serializer.Serializer |
getSerializer(Properties props,
Object out,
String method)
Returns a serializer suitable for the provided method, using the provided Writer or OutputStream out. |
static String |
harvestText(Node node,
boolean useTreeWalker,
boolean goDeep,
boolean normalizeWS)
Provided with a DOM Node node, iterates over its content, returning a concatenation of all Text nodes, with normalized whitespace as necessary to keep words from erroneously merging if normalizeWS is true. |
static boolean |
isXML(int method)
Returns true if the provided serialization method is a non-XHTML form of XML, including XTM and generic XML. |
static void |
removeNamespaceCruft(Node node)
Provided with a DOM Node node, iterates over its content, removing all namespace cruft, including all namespace declarations and prefixes. |
static String |
scanForTitle(Element doc,
String content)
Provided with a DOM Node node (expected to be an XHTML Document), iterates over its content, returning the first matching "DC.Title" content. |
static String |
scanTextForTitle(String s,
String name)
A utility method that attempts to obtain the element content from the first instance in an HTML document provided as a String s of a given element type name. |
static boolean |
serialize(Document doc,
String filename)
A convenience method that writes the Document doc to a file named filename, using an XML serialization method. |
static boolean |
serialize(Node node,
Object out,
String method)
Using the serializer from the Xalan project, serializes the provided DOM Node to the provided Writer using the designated method. |
static void |
setPCDATAContent(Element element,
String content)
Sets the PCDATA content of the Element provided. |
static String |
stripMarkup(String s)
Strips markup from the provided String s. |
static Document |
toDocument(String source)
A static method, when provided with a String source, parses it to an XML Document. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final int SERIALIZE_UNKNOWN
public static final int SERIALIZE_XML
public static final int SERIALIZE_XTM
public static final int SERIALIZE_XHTML
public static final int SERIALIZE_HTML
public static final int SERIALIZE_TEXT
public static boolean forceUTF8
public static boolean entifyApos
public static final String XSLT_property_method
public static final String XSLT_property_version
public static final String XSLT_property_encoding
public static final String XSLT_property_standalone
public static final String XSLT_property_doctype_public
public static final String XSLT_property_doctype_system
public static final String XSLT_property_indent
public static final String XSLT_property_media_type
public static final String XSLT_property_cdata_section_elements
public static final String XSLT_property_omit_xml_declaration
| Constructor Detail |
|---|
public XMLUtils()
| Method Detail |
|---|
public static Document generateDocument(String uri,
String qname)
public static Document toDocument(String source)
throws DocumentException
DocumentException
public static org.apache.xml.serializer.Serializer getSerializer(Object out,
String method)
Note: because this method is mostly intended to be used with a Writer rather than an OutputStream, an UnsupportedEncodingException raised while building the serializer is not thrown; instead, an error is registered with the handler and null is returned.
public static org.apache.xml.serializer.Serializer getSerializer(Properties props,
Object out,
String method)
Note: because this method is mostly intended to be used with a Writer rather than an OutputStream, an UnsupportedEncodingException raised while building the serializer is not thrown; instead, an error is registered with the handler and null is returned. If the OutputFormat is non-null, the serialization method is supplied by the OutputFormat; the method parameter is then ignored.
public static int getSerializedSize(Node node)
throws ProcessException
ProcessException - if an error occurs during serialization.
public static String getSerializedNode(Node node)
throws ProcessException
ProcessException - if an error occurs during serialization.
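The relationship between getSerializedNode and getSerializedSize can be sketched with the JDK's own identity Transformer. This is only an illustration under stated assumptions: the real methods use the Xalan serializer and throw ProcessException, whereas the hypothetical `SerializeSketch` class below uses `javax.xml.transform`, sets the standard XSLT output keys through `OutputKeys` (which the XSLT_property_* fields presumably mirror), and wraps failures in an unchecked exception.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical stand-in; not the org.ceryle.xml.XMLUtils implementation.
final class SerializeSketch {

    // Helper for the example: parse a String into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    // Serialize a Document or Element to a String using the XML output method.
    static String serializedNode(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Assumption: the real method reports this as a ProcessException.
            throw new IllegalStateException("serialization failed", e);
        }
    }

    // The size in characters is simply the length of the serialization.
    static int serializedSize(Node node) {
        return serializedNode(node).length();
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>x</b></a>");
        System.out.println(serializedNode(doc.getDocumentElement()));
        System.out.println(serializedSize(doc.getDocumentElement()));
    }
}
```

Computing the size this way serializes the whole node, so for large documents it costs as much as producing the String itself.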
public static boolean serialize(Document doc,
String filename)
throws ProcessException
doc - the DOM Document to serialize
filename - the pathname of the target file
ProcessException - if an error occurs during serialization.
public static boolean serialize(Node node,
Object out,
String method)
throws IOException
node - the DOM node to be serialized.
out - the File, Writer or OutputStream to receive the serialized content.
method - the method should be one of Method.TEXT, Method.HTML, Method.XHTML or Method.XML (the default if the value is not recognized).
IOException - if out is not a Writer or an OutputStream, or a serialization error occurs.
public static MIME getMIMEtype(int method)
public static String getMethodName(int method)
public static int getMethodForMIME(MIME mime)
public static String getSerializationMethod(String mimetype)
public static boolean isXML(int method)
public static final String entify(String s,
boolean asNumeric)
Note that use of numeric entities allows for more "entification" than simply XML's five built-in entities. While not currently supported, use of numeric entities may in the future mean support for characters not included in the current encoding.
If the provided String is either null or empty, an empty String is returned.
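The behaviour described for entify can be sketched as follows. This is a minimal illustration, not the library's code: the class name `EntifySketch` is hypothetical, the numeric form is assumed to use decimal character references, and the `entifyApos` flag is assumed to default to true and to leave apostrophes untouched when false.

```java
// Hypothetical stand-in; not the org.ceryle.xml.XMLUtils implementation.
final class EntifySketch {

    // Assumption: mirrors the documented static flag; default chosen for the example.
    static boolean entifyApos = true;

    // Replace markup-sensitive characters with named or numeric entities.
    static String entify(String s, boolean asNumeric) {
        if (s == null || s.isEmpty()) {
            return ""; // documented: null or empty input yields an empty String
        }
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<':  sb.append(asNumeric ? "&#60;" : "&lt;");   break;
                case '>':  sb.append(asNumeric ? "&#62;" : "&gt;");   break;
                case '&':  sb.append(asNumeric ? "&#38;" : "&amp;");  break;
                case '"':  sb.append(asNumeric ? "&#34;" : "&quot;"); break;
                case '\'':
                    if (entifyApos) {
                        sb.append(asNumeric ? "&#39;" : "&apos;");
                    } else {
                        sb.append(c);
                    }
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(entify("a<b & 'c'", false));
        System.out.println(entify("a<b & 'c'", true));
    }
}
```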
public static final String deentify(String s)
If the provided String is either null or empty, an empty String is returned.
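The reverse operation might look like the sketch below. It is an assumption-laden illustration rather than the actual implementation: the hypothetical `DeentifySketch` class recognizes only XML's five named entities plus decimal and hexadecimal character references, decodes in a single pass so that `&amp;lt;` becomes `&lt;` rather than `<`, and leaves anything it does not recognize unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; not the org.ceryle.xml.XMLUtils implementation.
final class DeentifySketch {

    // Named XML entities, decimal references (&#60;) and hex references (&#x3C;).
    private static final Pattern ENTITY =
            Pattern.compile("&(#[0-9]{1,7}|#x[0-9A-Fa-f]{1,6}|lt|gt|amp|apos|quot);");

    static String deentify(String s) {
        if (s == null || s.isEmpty()) {
            return ""; // documented: null or empty input yields an empty String
        }
        Matcher m = ENTITY.matcher(s);
        StringBuilder out = new StringBuilder(s.length());
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());   // copy text before the entity
            out.append(resolve(m.group(1)));  // decode the entity itself
            last = m.end();
        }
        out.append(s, last, s.length());      // copy the remaining tail
        return out.toString();
    }

    private static String resolve(String name) {
        switch (name) {
            case "lt":   return "<";
            case "gt":   return ">";
            case "amp":  return "&";
            case "apos": return "'";
            case "quot": return "\"";
            default:     break;
        }
        int cp = name.startsWith("#x")
                ? Integer.parseInt(name.substring(2), 16)
                : Integer.parseInt(name.substring(1));
        if (!Character.isValidCodePoint(cp)) {
            return "&" + name + ";"; // leave out-of-range references untouched
        }
        return new String(Character.toChars(cp));
    }

    public static void main(String[] args) {
        System.out.println(deentify("&lt;a&gt; &amp;amp; &#39;x&#x27;"));
    }
}
```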
public static Element getElementWithAttribute(Document doc,
String name,
String attrname,
String value,
boolean ignoreCase)
public static List getElementsWithAttribute(Document doc,
String name,
String attrname,
String value,
boolean ignoreCase)
public static void removeNamespaceCruft(Node node)
public static Element getFirstDescendantElement(Element element,
String tagName)
public static Element getFirstChildElement(Element element,
String name)
public static Element getFirstChildElementNS(Element element,
String namespaceURI,
String localName)
public static String getElementText(Element element,
boolean normalizeWS)
Note that currently this returns an empty String even for Elements that have no children. This behaviour should not be relied upon, and may be changed in the future (i.e., empty Elements may return null).
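A rough sketch of the documented behaviour, using only the standard DOM API, is shown below. The class `ElementTextSketch` is hypothetical; it assumes that only direct Text node children are concatenated (CDATA sections are not counted) and that whitespace normalization means trimming and collapsing runs of whitespace to a single space.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical stand-in; not the org.ceryle.xml.XMLUtils implementation.
final class ElementTextSketch {

    // Helper for the example: parse a String into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    // Concatenate the Text node children of element, ignoring descendant elements.
    static String getElementText(Element element, boolean normalizeWS) {
        StringBuilder sb = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                sb.append(n.getNodeValue());
            }
        }
        String text = sb.toString(); // empty String when there are no Text children
        return normalizeWS ? text.trim().replaceAll("\\s+", " ") : text;
    }

    public static void main(String[] args) {
        Element a = parse("<a> x <b>y</b> z </a>").getDocumentElement();
        System.out.println("[" + getElementText(a, true) + "]");
    }
}
```

Callers that need to distinguish "no text" from "empty element" should test for child nodes themselves rather than rely on the empty-String result, in line with the note above.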
public static String harvestText(Node node,
boolean useTreeWalker,
boolean goDeep,
boolean normalizeWS)
If the boolean useTreeWalker is true, uses an org.w3c.dom.traversal.TreeWalker, otherwise an org.w3c.dom.traversal.NodeIterator. If the boolean goDeep is true, traverses nodes "forward" in a depth-first traversal. If false, only the children of the provided node are traversed. Since knowledge of placement in the tree is available only with the TreeWalker, goDeep is only relevant when using the TreeWalker; NodeIterators always go deep.
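The difference between the two traversal modes can be sketched with the standard DOM Traversal API. The hypothetical `HarvestTextSketch` below is one possible reading of the description, not the library's code: it assumes the Document implements `DocumentTraversal` (true for the JDK's default parser), and it approximates whitespace normalization by inserting a space after each text fragment and then collapsing runs of whitespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;

// Hypothetical stand-in; not the org.ceryle.xml.XMLUtils implementation.
final class HarvestTextSketch {

    // Helper for the example: parse a String into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    static String harvestText(Node node, boolean useTreeWalker,
                              boolean goDeep, boolean normalizeWS) {
        Document doc = (node.getNodeType() == Node.DOCUMENT_NODE)
                ? (Document) node : node.getOwnerDocument();
        DocumentTraversal traversal = (DocumentTraversal) doc;
        StringBuilder sb = new StringBuilder();
        if (useTreeWalker && goDeep) {
            // Depth-first walk over every Text node below the starting node.
            TreeWalker w = traversal.createTreeWalker(node, NodeFilter.SHOW_TEXT, null, true);
            for (Node n = w.nextNode(); n != null; n = w.nextNode()) {
                collect(sb, n, normalizeWS);
            }
        } else if (useTreeWalker) {
            // Shallow walk: show all node types so that child elements are not
            // skipped through, then keep only the direct Text children.
            TreeWalker w = traversal.createTreeWalker(node, NodeFilter.SHOW_ALL, null, true);
            for (Node n = w.firstChild(); n != null; n = w.nextSibling()) {
                collect(sb, n, normalizeWS);
            }
        } else {
            // A NodeIterator has no notion of tree position, so it always goes deep.
            NodeIterator it = traversal.createNodeIterator(node, NodeFilter.SHOW_TEXT, null, true);
            for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
                collect(sb, n, normalizeWS);
            }
            it.detach();
        }
        String text = sb.toString();
        return normalizeWS ? text.trim().replaceAll("\\s+", " ") : text;
    }

    private static void collect(StringBuilder sb, Node n, boolean normalizeWS) {
        if (n.getNodeType() != Node.TEXT_NODE) {
            return;
        }
        sb.append(n.getNodeValue());
        if (normalizeWS) {
            sb.append(' '); // keeps adjacent fragments from merging into one word
        }
    }

    public static void main(String[] args) {
        Node a = parse("<a>one<b>two</b>three</a>").getDocumentElement();
        System.out.println(harvestText(a, true, true, true));
        System.out.println(harvestText(a, true, false, true));
    }
}
```

Note that a TreeWalker created with `SHOW_TEXT` skips over elements but still descends into them, which is why the shallow case above uses `SHOW_ALL` and filters by node type instead.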
public static String scanForTitle(Element doc,
String content)
scanTextForTitle(String,String),
which then attempts to obtain the title.
public static String scanTextForTitle(String s,
String name)
public static String stripMarkup(String s)
public static Element getChildElementByTagName(Element element,
String name,
boolean exclusive)
throws ProcessException
ProcessException
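The effect of the exclusive flag can be illustrated with a short sketch. The `ChildLookupSketch` class is hypothetical and makes two assumptions not confirmed by this page: that null is returned when no child matches, and that an unchecked IllegalStateException can stand in for ProcessException.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical stand-in; not the org.ceryle.xml.XMLUtils implementation.
final class ChildLookupSketch {

    // Helper for the example: parse a String into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    // Find the child element named name; when exclusive, reject duplicates.
    static Element childByTagName(Element parent, String name, boolean exclusive) {
        Element found = null;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE
                    || !name.equals(((Element) n).getTagName())) {
                continue;
            }
            if (found != null) {
                // Only reachable when exclusive is true (see early return below).
                throw new IllegalStateException("more than one <" + name + "> child");
            }
            found = (Element) n;
            if (!exclusive) {
                return found; // the first match is enough
            }
        }
        return found; // assumption: null when nothing matches
    }

    public static void main(String[] args) {
        Element r = parse("<r><c id='1'/><c id='2'/><d/></r>").getDocumentElement();
        System.out.println(childByTagName(r, "c", false).getAttribute("id"));
        System.out.println(childByTagName(r, "d", true).getTagName());
    }
}
```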
public static void setPCDATAContent(Element element,
String content)
public static String getPCDATAContent(Element element)
throws ProcessException
ProcessException