org.ceryle.xml
Class TypedInputSource

java.lang.Object
  extended by org.xml.sax.InputSource
      extended by org.ceryle.xml.TypedInputSource

public class TypedInputSource
extends InputSource

Extends SAX's InputSource to provide a typed input source for processing of various file formats. File extensions are mapped as follows:

    extension   description
    ".xml"      XML
    ".xtm"      XML Topic Map (XTM)
    ".ltm"      Linear Topic Map (LTM)
    ".log"      Ceryle log notation
    ".atm"      AsTMa= Topic Map notation
    ".cl"       Cyc ontology (LISP-like syntax)
    ".its"      ITIS zoological taxonomy (XML format)
    ".kif"      Knowledge Interchange Format (KIF)
    ".xcg"      XML Conceptual Graph (XCG)
    ".ref"      Refer bibliographic format (text)
    ".bib"      Bibtex bibliographic format (text)
    ".gxl"      Graph Exchange Language (GXL)
    ".pdb"      Pilot Document format (binary)
    ".apt"      Augmented Plain Text (APT)
    ".wik"      Ceryle Wiki Text (a wiki text variant based on APT)
    ".lisp"     Operational Conceptual Modeling Language (OCML)
 

Since:
JDK1.3
Version:
$Id: TypedInputSource.java,v 3.9 2007-06-15 12:10:29 altheim Exp $
Author:
Murray Altheim
See Also:
InputSource

Field Summary
static boolean canonicalize
          When true, alters all created TypedInputSources so that they use a canonical form of the systemId.
static String[] fileExtensions
          An array of file extensions acceptable to the import filter.
static int TYPE_APT
          An int indicating the Augmented Plain Text (APT) notation or file format.
static int TYPE_ASTMA
          An int indicating the AsTMa= Topic Map notation or file format.
static int TYPE_AUTO
          An int indicating that the TypedInputSource object should determine the file type by looking at the file extension.
static int TYPE_BIBTEX
          An int indicating the Bibtex Bibliographic format.
static int TYPE_CYC
          An int indicating the Cyc (Cyc ontology) notation or file format.
static int TYPE_GXL
          An int indicating the Graph Exchange Language (GXL) notation or file format.
static int TYPE_ITIS
          An int indicating the ITIS (XML zoological taxonomy) notation or file format.
static int TYPE_KIF
          An int indicating the Knowledge Interchange Format (KIF) notation or file format.
static int TYPE_LOG
          An int indicating the Ceryle log notation or file format.
static int TYPE_LTM
          An int indicating the Linear Topic Map (LTM) notation or file format.
static int TYPE_NULL
          An int indicating a null or unspecified type or file format.
static int TYPE_OCML
          An int indicating the Operational Conceptual Modelling Language (OCML) notation or file format.
static int TYPE_PDB
          An int indicating the Pilot Document format (a binary format).
static int TYPE_REFER
          An int indicating the Refer Bibliographic format.
static int TYPE_WIKI
          An int indicating the Wiki text notation or file format.
static int TYPE_XCG
          An int indicating the XML Conceptual Graph (XCG) notation or file format.
static int TYPE_XML
          An int indicating the XML type or file format.
static int TYPE_XTM
          An int indicating the XML Topic Map (XTM) notation or file format.
 
Constructor Summary
TypedInputSource(int type, InputStream stream)
          Constructor with an int type and an InputStream stream.
TypedInputSource(int type, Reader reader)
          Constructor with an int type and a Reader reader.
TypedInputSource(int type, String systemId)
          Constructor with an int type and system identifier systemId.
TypedInputSource(int type, String publicId, String systemId)
          Constructor with an int type, a public identifier publicId, and a system identifier systemId.
TypedInputSource(int type, URL url)
          Constructor with an int type and a URL url.
TypedInputSource(Reader reader)
          Constructor provided with a Reader reader, assuming its content is well-formed XML.
TypedInputSource(String systemId)
          Constructor with a system identifier systemId; will set file type based on its file extension.
TypedInputSource(String publicId, String systemId)
          Constructor with both a public identifier publicId and a system identifier systemId; will set file type based on its file extension.
TypedInputSource(URL url)
          Constructor provided with a URL url, which must be a fully-resolved URL, assuming its content is well-formed XML.
 
Method Summary
 InputStream getInputStream()
          Returns the source as an InputStream.
 File getLocalReference()
          If this TypedInputSource resolves to a local File reference, returns a File.
 Reader getReader()
          Returns the source as a Reader.
static TypedInputSource getSource(int type, String systemId)
          Returns a TypedInputSource provided with a source type type and system identifier systemId.
static TypedInputSource getSource(int type, URL url)
          Returns a TypedInputSource provided with a source type type and URL url.
 int getType()
          Returns the type of this TypedInputSource.
static String getTypeDescription(int type)
          Returns a descriptive String for the supplied TypedInputSource type.
static int getTypeFor(String systemId)
          Returns the type of the system identifier depending on its file extension.
static boolean isURL(String sysid)
          Returns true if the supplied sysid can be made into a URL, false otherwise.
static String resolveSystemId(String systemId)
          Resolves the String systemId to return a resolved URI (as a String).
 void setType(String systemId)
          Sets the type of this TypedInputSource depending on the file extension.
 
Methods inherited from class org.xml.sax.InputSource
getByteStream, getCharacterStream, getEncoding, getPublicId, getSystemId, setByteStream, setCharacterStream, setEncoding, setPublicId, setSystemId
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fileExtensions

public static String[] fileExtensions
An array of file extensions acceptable to the import filter.


TYPE_NULL

public static final int TYPE_NULL
An int indicating a null or unspecified type or file format. This is an error condition.

See Also:
Constant Field Values

TYPE_XML

public static final int TYPE_XML
An int indicating the XML type or file format. This includes XHTML documents.

See Also:
Constant Field Values

TYPE_XTM

public static final int TYPE_XTM
An int indicating the XML Topic Map (XTM) notation or file format.

See Also:
Constant Field Values

TYPE_LTM

public static final int TYPE_LTM
An int indicating the Linear Topic Map (LTM) notation or file format.

See Also:
Constant Field Values

TYPE_ASTMA

public static final int TYPE_ASTMA
An int indicating the AsTMa= Topic Map notation or file format.

See Also:
Constant Field Values

TYPE_CYC

public static final int TYPE_CYC
An int indicating the Cyc (Cyc ontology) notation or file format.

See Also:
Constant Field Values

TYPE_ITIS

public static final int TYPE_ITIS
An int indicating the ITIS (XML zoological taxonomy) notation or file format.

See Also:
Constant Field Values

TYPE_KIF

public static final int TYPE_KIF
An int indicating the Knowledge Interchange Format (KIF) notation or file format.

See Also:
Constant Field Values

TYPE_XCG

public static final int TYPE_XCG
An int indicating the XML Conceptual Graph (XCG) notation or file format.

See Also:
Constant Field Values

TYPE_OCML

public static final int TYPE_OCML
An int indicating the Operational Conceptual Modelling Language (OCML) notation or file format.

See Also:
Constant Field Values

TYPE_REFER

public static final int TYPE_REFER
An int indicating the Refer Bibliographic format.

See Also:
Constant Field Values

TYPE_BIBTEX

public static final int TYPE_BIBTEX
An int indicating the Bibtex Bibliographic format.

See Also:
Constant Field Values

TYPE_GXL

public static final int TYPE_GXL
An int indicating the Graph Exchange Language (GXL) notation or file format.

See Also:
Constant Field Values

TYPE_APT

public static final int TYPE_APT
An int indicating the Augmented Plain Text (APT) notation or file format.

See Also:
Constant Field Values

TYPE_WIKI

public static final int TYPE_WIKI
An int indicating the Wiki text notation or file format.

See Also:
Constant Field Values

TYPE_PDB

public static final int TYPE_PDB
An int indicating the Pilot Document format (a binary format).

See Also:
Constant Field Values

TYPE_LOG

public static final int TYPE_LOG
An int indicating the Ceryle log notation or file format.

See Also:
Constant Field Values

TYPE_AUTO

public static final int TYPE_AUTO
An int indicating that the TypedInputSource object should determine the file type by looking at the file extension.

See Also:
Constant Field Values

canonicalize

public static boolean canonicalize
When true, alters all created TypedInputSources so that they use a canonical form of the systemId.

Constructor Detail

TypedInputSource

public TypedInputSource(String systemId)
                 throws IllegalArgumentException
Constructor with a system identifier systemId; will set file type based on its file extension.

Throws:
IllegalArgumentException

TypedInputSource

public TypedInputSource(String publicId,
                        String systemId)
                 throws IllegalArgumentException
Constructor with both a public identifier publicId and a system identifier systemId; will set file type based on its file extension. Preference between the public or system identifier is determined by the XML catalog settings. The public identifier is optional (will be ignored if null).

Throws:
IllegalArgumentException

TypedInputSource

public TypedInputSource(int type,
                        String publicId,
                        String systemId)
                 throws IllegalArgumentException
Constructor with an int type, a public identifier publicId, and a system identifier systemId. Preference between the public or system identifier is determined by the XML catalog settings. The public identifier is optional (will be ignored if null).

Throws:
IllegalArgumentException

TypedInputSource

public TypedInputSource(int type,
                        String systemId)
                 throws IllegalArgumentException
Constructor with an int type and system identifier systemId. If the system identifier is not a URI, it will be converted to one.

Throws:
IllegalArgumentException - if the specified type is out of range.

TypedInputSource

public TypedInputSource(int type,
                        URL url)
Constructor with an int type and a URL url.


TypedInputSource

public TypedInputSource(URL url)
Constructor provided with a URL url, which must be a fully-resolved URL, assuming its content is well-formed XML. If a more specific type is desired, use setType(String).


TypedInputSource

public TypedInputSource(Reader reader)
Constructor provided with a Reader reader, assuming its content is well-formed XML.


TypedInputSource

public TypedInputSource(int type,
                        Reader reader)
                 throws IllegalArgumentException
Constructor with an int type and a Reader reader.

Throws:
IllegalArgumentException

TypedInputSource

public TypedInputSource(int type,
                        InputStream stream)
                 throws IllegalArgumentException
Constructor with an int type and an InputStream stream.

Throws:
IllegalArgumentException

Method Detail

setType

public void setType(String systemId)
Sets the type of this TypedInputSource depending on the file extension. If unrecognized, will assume the file type is generically 'XML'.


getTypeFor

public static int getTypeFor(String systemId)
Returns the type of the system identifier depending on its file extension. If unrecognized, will return the file type as 'XML'.
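
The extension lookup described here can be illustrated with a small stand-alone sketch. It is not the Ceryle implementation: the class name ExtensionLookupSketch and the method describe are hypothetical, the sketch returns the description text from the extension table instead of the int constants (whose values are not shown on this page), and matching the extension case-insensitively is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for an extension lookup; not the Ceryle source. */
final class ExtensionLookupSketch {

    // Extension/description pairs copied from the table in the class description.
    private static final Map<String, String> DESCRIPTIONS = new LinkedHashMap<>();
    static {
        DESCRIPTIONS.put(".xml", "XML");
        DESCRIPTIONS.put(".xtm", "XML Topic Map (XTM)");
        DESCRIPTIONS.put(".ltm", "Linear Topic Map (LTM)");
        DESCRIPTIONS.put(".log", "Ceryle log notation");
        DESCRIPTIONS.put(".atm", "AsTMa= Topic Map notation");
        DESCRIPTIONS.put(".cl", "Cyc ontology (LISP-like syntax)");
        DESCRIPTIONS.put(".its", "ITIS zoological taxonomy (XML format)");
        DESCRIPTIONS.put(".kif", "Knowledge Interchange Format (KIF)");
        DESCRIPTIONS.put(".xcg", "XML Conceptual Graph (XCG)");
        DESCRIPTIONS.put(".ref", "Refer bibliographic format (text)");
        DESCRIPTIONS.put(".bib", "Bibtex bibliographic format (text)");
        DESCRIPTIONS.put(".gxl", "Graph Exchange Language (GXL)");
        DESCRIPTIONS.put(".pdb", "Pilot Document format (binary)");
        DESCRIPTIONS.put(".apt", "Augmented Plain Text (APT)");
        DESCRIPTIONS.put(".wik", "Ceryle Wiki Text (a wiki text variant based on APT)");
        DESCRIPTIONS.put(".lisp", "Operational Conceptual Modeling Language (OCML)");
    }

    /** Returns the description for the extension of systemId, or "XML" if unrecognized. */
    static String describe(String systemId) {
        if (systemId != null) {
            int dot = systemId.lastIndexOf('.');
            if (dot >= 0) {
                // Assumption: extensions are compared in lower case.
                String ext = systemId.substring(dot).toLowerCase(Locale.ROOT);
                String description = DESCRIPTIONS.get(ext);
                if (description != null) {
                    return description;
                }
            }
        }
        return "XML"; // generic fallback, as the method description states
    }

    public static void main(String[] args) {
        System.out.println(describe("maps/opera.xtm"));
        System.out.println(describe("notes.txt"));
    }
}
```

An unrecognized or missing extension falls through to the generic "XML" result, mirroring the fallback described for getTypeFor and setType.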


getTypeDescription

public static String getTypeDescription(int type)
Returns a descriptive String for the supplied TypedInputSource type.


getLocalReference

public File getLocalReference()
                       throws IOException
If this TypedInputSource resolves to a local File reference, returns a File. If the source is not a reference to a local file this returns null; if anything fails or the file does not exist, throws an IOException.

Throws:
IOException
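
The three outcomes described for getLocalReference() (a File, null, or an IOException) can be sketched with standard java.io and java.net calls. This is an illustration only: the class LocalReferenceSketch and its static method localReference are hypothetical, and treating "local" as "the system identifier uses the file scheme" is an assumption about how the real method decides.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical sketch of a local-file lookup; not the Ceryle source. */
final class LocalReferenceSketch {

    /**
     * Returns a File for a file: system identifier, null for any other scheme,
     * and throws IOException if the identifier is unusable or the file is missing.
     */
    static File localReference(String systemId) throws IOException {
        URI uri;
        try {
            uri = new URI(systemId);
        } catch (URISyntaxException e) {
            throw new IOException("not a valid URI: " + systemId, e);
        }
        // Assumption: only file: identifiers count as local references.
        if (!"file".equalsIgnoreCase(uri.getScheme())) {
            return null;
        }
        File file;
        try {
            file = new File(uri); // rejects opaque or authority-bearing file URIs
        } catch (IllegalArgumentException e) {
            throw new IOException("cannot map to a local file: " + systemId, e);
        }
        if (!file.exists()) {
            throw new FileNotFoundException(file.getPath());
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Not a file: identifier, so no local reference is available.
        System.out.println(localReference("http://example.org/map.xtm"));
    }
}
```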

resolveSystemId

public static String resolveSystemId(String systemId)
                              throws IOException
Resolves the String systemId to return a resolved URI (as a String).

Throws:
IOException
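
One plausible reading of "resolves the systemId to a URI" is sketched below: an identifier that is already an absolute URI is returned unchanged, and anything else is treated as a filename and resolved against the working directory. The class ResolveSketch and the method resolve are hypothetical, and both rules are assumptions; the real method may apply different ones (for example, XML catalog lookups).

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical sketch of system identifier resolution; not the Ceryle source. */
final class ResolveSketch {

    /** Returns systemId as an absolute URI string. */
    static String resolve(String systemId) throws IOException {
        try {
            URI uri = new URI(systemId);
            if (uri.isAbsolute()) {
                return uri.toString(); // already has a scheme; leave as-is
            }
        } catch (URISyntaxException e) {
            // Not URI syntax (e.g. contains spaces): fall through and treat as a filename.
        }
        // Assumption: relative names are resolved against the current working directory.
        return new File(systemId).getCanonicalFile().toURI().toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(resolve("http://example.org/map.xtm"));
        System.out.println(resolve("data/map.xtm"));
    }
}
```

The second call prints a file: URI whose exact value depends on the directory the program is run from.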

isURL

public static final boolean isURL(String sysid)
Returns true if the supplied sysid can be made into a URL, false otherwise. This is a modified copy of the method from org.ceryle.util.Utilities, made for class portability.
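
Since the signature returns a boolean, a check of this kind can be sketched as an attempt to build a java.net.URL and report whether it succeeded. The class UrlCheckSketch and the method isUrl are hypothetical; the original method in org.ceryle.util.Utilities is not shown on this page and may differ.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical sketch of a "can this be a URL?" test; not the Ceryle source. */
final class UrlCheckSketch {

    /** Returns true if sysid can be converted to a java.net.URL, false otherwise. */
    static boolean isUrl(String sysid) {
        if (sysid == null) {
            return false;
        }
        try {
            // toURL() fails for relative URIs and for schemes without a protocol handler.
            new URI(sysid).toURL();
            return true;
        } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUrl("http://example.org/map.xtm"));
        System.out.println(isUrl("data/map.xtm"));
    }
}
```

No network access takes place; the conversion only parses the string and looks up a protocol handler.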


getReader

public Reader getReader()
Returns the source as a Reader. This relies on getInputStream(), wrapping its InputStream with a BufferedReader.


getInputStream

public InputStream getInputStream()
                           throws IOException
Returns the source as an InputStream. For File sources, this is a FileInputStream; for URL sources, this is the input stream retrieved from the URLConnection.

Throws:
IOException
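
The behaviour described for getInputStream() and getReader() (a FileInputStream for files, the URLConnection's stream for other URLs, and a BufferedReader wrapped around the result) can be sketched with standard library calls. The class StreamAccessSketch and its methods are hypothetical, and decoding as UTF-8 is an assumption; this page does not say which encoding the real getReader() uses.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of stream and reader access; not the Ceryle source. */
final class StreamAccessSketch {

    /** Opens a FileInputStream for file: URLs, otherwise the URLConnection's stream. */
    static InputStream open(URL url) throws IOException {
        if ("file".equals(url.getProtocol())) {
            try {
                return new FileInputStream(new File(url.toURI()));
            } catch (URISyntaxException | IllegalArgumentException e) {
                throw new IOException("cannot map URL to a file: " + url, e);
            }
        }
        URLConnection connection = url.openConnection();
        return connection.getInputStream();
    }

    /** Wraps the stream from open(url) in a BufferedReader (UTF-8 is an assumption). */
    static Reader openReader(URL url) throws IOException {
        return new BufferedReader(new InputStreamReader(open(url), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java StreamAccessSketch file:/path/to/document.xml
        if (args.length == 1) {
            try (BufferedReader reader = new BufferedReader(openReader(new URL(args[0])))) {
                System.out.println(reader.readLine());
            }
        }
    }
}
```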

getSource

public static TypedInputSource getSource(int type,
                                         String systemId)
                                  throws IllegalArgumentException,
                                         IOException
Returns a TypedInputSource provided with a source type type and system identifier systemId. This will fully resolve any relative URLs or filenames in providing an absolute URI.

Throws:
IllegalArgumentException
IOException

getSource

public static TypedInputSource getSource(int type,
                                         URL url)
                                  throws IllegalArgumentException,
                                         IOException
Returns a TypedInputSource provided with a source type type and URL url. This will fully resolve any relative URLs or filenames in providing an absolute URI.

Throws:
IllegalArgumentException
IOException

getType

public int getType()
Returns the type of this TypedInputSource.



The Ceryle Project. Copyright ©2001-2007 Murray Altheim, All Rights Reserved. See LICENSE included with distribution.