Creating XML using SAX
Consider Company
class containing array of Employee
;
class Company{
String name;
Employee employees[];
Company(String name, Employee... employees){
this.name = name;
this.employees = employees;
}
}
class Employee{
String id;
String name;
String email;
int age;
Employee(String id, String name, String email, int age){
this.id = id;
this.name = name;
this.email = email;
this.age = age;
}
}
To create xml:
import jlibs.xml.sax.XMLDocument;
import javax.xml.transform.stream.StreamResult;
XMLDocument xml = new XMLDocument(new StreamResult(System.out), false, 4, null);
xml.startDocument();{
xml.startElement("company");{
xml.addAttribute("name", company.name);
for(Employee emp: company.employees){
xml.startElement("employee");{
xml.addAttribute("id", emp.id);
xml.addAttribute("age", ""+emp.age);
xml.addElement("name", emp.name);
xml.addElement("email", emp.email);
}
xml.endElement("employee");
}
}
xml.endElement("company");
}
xml.endDocument();
Running this prints following:
<?xml version="1.0" encoding="UTF-8"?>
<company name="MyCompany">
<employee id="1" age="20">
<name>scott</name>
<email>scott@gmail.com</email>
</employee>
<employee id="2" age="25">
<name>alice</name>
<email>alice@gmail.com</email>
</employee>
</company>
The constructor of XMLDocument is:
XMLDocument(Result result, boolean omitXMLDeclaration, int indentAmount, String encoding) throws TransformerConfigurationException
The first argument is of type javax.xml.transform.Result
; So we can even use DOMResult
to create DOM;
if last argument encoding
is null, then it defaults to default XML encoding(UTF-8
);
NULL Friendly
The methods to fire SAX events are null
friendly. it means:
xml.addAttribute("id", emp.id);
will not add attribute if emp.id=null
. So you no longer need to write as below:
if(emp.id!=null)
xml.addAttribute("id", emp.id);
null
friendly methods, avoid code clutter and make it more readable
Method Chaining
The methods to fire SAX events return this
. So method calls can be chained to produce more readable code
xml.startElement("employee")
.addAttribute("id", emp.id)
.addAttribute("age", ""+emp.age);
instead of:
xml.startElement("employee");
xml.addAttribute("id", emp.id);
xml.addAttribute("age", ""+emp.age);
Simple Text Only Elements
You can do following:
xml.addElement("email", emp.email);
instead of:
if(emp.email!=null){
xml.startElement("email");
xml.addText(emp.email);
xml.endElement("email");
}
there is also addCDATAElement(...)
available
End Element
To end element, we do:
xml.endElement("employee");
If you mis-spell element name here, it will throw SAXException
:
org.xml.sax.SAXException: expected </employee>
there is also another variation of endElement
with no arguments;
xml.endElement();
This will implicitly find the recent element started and ends it.
suppose we have series of endElement
calls as below:
xml.endElement("elem3");
xml.endElement("elem2");
xml.endElement("elem1");
the same can be done in single line as below:
xml.endElements("elem1");
This will do endElement() until elem1
is closed
To end all elements started, do:
xml.endElements();
NOTE:
endElements()
will do nothing if all elements are already closedendElements()
is implictly called inendDocument()
. So you can safely ignore trailing end elements of xml
DTD
xml.addSystemDTD("company", "company.dtd");
will produce:
<!DOCTYPE company SYSTEM "company.dtd">
xml.addPublicDTD("company", "-//W3C//DTD XHTML 1.0 Transitional//EN", "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd");
will produce:
<!DOCTYPE company PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Adding XML
xml.startElement("elem1");
xml.addXML("<test><test1>first</test1><test2>second</test2></test>", false);
xml.endElement();
will produce:
<elem1>
<test>
<test1>first</test1>
<test2>second</test2>
</test>
</elem1>
The first argument to addXML(...)
should be well-formed xml string;
second argument will tell whether to ignore root element or not;
if second argument is true
in above sample it will produce:
<elem1>
<test1>first</test1>
<test2>second</test2>
</elem1>
there is another variation of addXML(…) available:
public XMLDocument addXML(InputSource is, boolean excludeRoot) throws SAXException
for example, you could write:
xml.addXML(new InputSource("notes.xml"), true);
Miscellaneous
xml.addComment("this is comment");
xml.addCDATA("this is inside cdata");
// to produce: <?xml-stylesheet href="classic.xsl" type="text/xml"?>
xml.addProcessingInstruction("xml-stylesheet", "href=\"classic.xsl\" type=\"text/xml\"");
Namespaces
static final String URI_JLIBS = "http://jlibs.org";
static final String URI_COMP = "http://mycompany.com";
static final String URI_EMP = "http://employee.com";
xml.startDocument();
xml.startElement(URI_COMP, "company")
.addAttribute("name", "mycompany")
.addAttribute(URI_JLIBS, "version", "0.1")
.startElement(URI_EMP, "employee")
.addAttribute("name", "scott")
.addElement(URI_EMP, "email", "scott@google.com")
.endElement()
.endElement();
xml.endDocument();
will produce the following:
<?xml version="1.0" encoding="UTF-8"?>
<mycompany:company xmlns:mycompany="http://mycompany.com" name="mycompany" jlibs:version="0.1" xmlns:jlibs="http://jlibs.org">
<employee:employee xmlns:employee="http://employee.com" name="scott">
<employee:email>scott@google.com</employee:email>
</employee:employee>
</mycompany:company>
You can notice that, we didn’t tell what prefix to use.
XMLDocument
is intelligent enough to generate prefixes automatically from namespace uri.
Standard Namespaces
jlibs.xml.Namespaces
class contains most frequently used namespaces like:
public static final String URI_XSD = "http://www.w3.org/2001/XMLSchema";
public static final String URI_XSI = "http://www.w3.org/2001/XMLSchema-instance";
public static final String URI_XSL = "http://www.w3.org/1999/XSL/Transform";
Namespaces.suggestPrefix(String uri)
suggests most commonly used prefix for any of these standard prefixes
String prefix = Namespaces.suggestPrefix(Namespaces.URI_XSD); // prefix will be "xsd"
XMLDocument
uses suggested prefixes from Namespaces
if available; For example:
import static jlibs.xml.Namespaces.*;
xml.startDocument();
xml.startElement(URI_XSD, "element")
.addAttribute("name", "employee")
.addAttribute("type", "employeeType");
xml.endDocument();
will produce the following:
<xsd:element xmlns:xsd="http://www.w3.org/2001/XMLSchema" name="employee" type="employeeType"/>
Suggesting Prefixes
public void suggestPrefix(String prefix, String uri)
this method can be used to suggest prefix for given uri.
Note that, using this method you can even ovverride the prefixes for standard namespaces, if needed.
xml.startDocument();
xml.suggestPrefix(URI_JLIBS, "jlibs");
xml.suggestPrefix(URI_COMP, "comp");
xml.suggestPrefix(URI_EMP, "emp");
xml.startElement(URI_COMP, "company")
.addAttribute("name", "mycompany")
.addAttribute(URI_JLIBS, "version", "0.1")
.startElement(URI_EMP, "employee")
.addAttribute("name", "scott")
.addElement(URI_EMP, "email", "scott@google.com")
.endElement()
.endElement();
xml.endDocument();
will produce the following:
<comp:company xmlns:comp="http://mycompany.com" name="mycompany" jlibs:version="0.1" xmlns:jlibs="http://jlibs.org">
<emp:employee xmlns:emp="http://employee.com" name="scott">
<emp:email>scott@google.com</emp:email>
</emp:employee>
</comp:company>
Declaring Prefixes
When you declare prefix, xmlns attribute will be added to generated xml.
This could be handy in following situation:
xml.startDocument();
xml.startElement(URI_COMP, "company")
.startElement(URI_EMP, "employee")
.addAttribute("name", "scott")
.endElement()
.startElement(URI_EMP, "employee")
.addAttribute("name", "alice")
.endElement()
.startElement(URI_EMP, "employee")
.addAttribute("name", "alean")
.endElement()
.endElement();
xml.endDocument();
produces the following:
<mycompany:company xmlns:mycompany="http://mycompany.com">
<employee:employee xmlns:employee="http://employee.com" name="scott"/>
<employee:employee xmlns:employee="http://employee.com" name="alice"/>
<employee:employee xmlns:employee="http://employee.com" name="alean"/>
</mycompany:company>
In output, you can notice that employee
namespace is declared in each <employee>
element.
The xml is looking cluttered because of this. If we could have defined employee
namespace in <company>
, it would be better.
To do this:
xml.startDocument();
xml.declarePrefix(URI_EMP); // we are declaring manually here
xml.startElement(URI_COMP, "company")
.startElement(URI_EMP, "employee")
.addAttribute("name", "scott")
.endElement()
.startElement(URI_EMP, "employee")
.addAttribute("name", "alice")
.endElement()
.startElement(URI_EMP, "employee")
.addAttribute("name", "alean")
.endElement()
.endElement();
xml.endDocument();
now the above code produces:
<mycompany:company xmlns:mycompany="http://mycompany.com" xmlns:employee="http://employee.com">
<employee:employee name="scott"/>
<employee:employee name="alice"/>
<employee:employee name="alean"/>
</mycompany:company>
notice that xmlns:employee
attribute is now moved to <mycompany>
element.
there is also another variant of declarePrefix(...)
public boolean declarePrefix(String prefix, String uri)
using this, you can specify prefix of your wish.
Computing QNames
xml.startDocument();
xml.declarePrefix("emp", URI_EMP);
xml.startElement(URI_XSD, "schema");
.startElement(URI_XSD, "element")
.addAttribute("name", "employee")
.addAttribute("type", toQName(URI_EMP, "emloyeeType"));
xml.endDocument();
will produce following:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:emp="http://employee.com">
<xsd:element name="employee" type="emp:emloyeeType"/>
</xsd:schema>
here the value of @type
is a qname which should be valid. i.e you want to use correct prefix
toQName(uri, localPart)
will return the correct qname string.
if the given uri is not yet declared, it will be declared automatically.
Mark and Release
let us say we have following two methods:
public static void serializeCompany(XMLDocument xml, Company company) throws SAXException{
xml.startElement("company");
xml.addAttribute("name", company.name);
for(Employee emp: company.employees){
serializeEmployee(xml, emp);
}
xml.endElement("company");
}
public static void serializeEmployee(XMLDocument xml, Employee emp) throws SAXException{
xml.startElement("employee");{
xml.addAttribute("id", emp.id);
xml.addAttribute("age", ""+emp.age);
xml.addElement("name", emp.name);
xml.addElement("email", emp.email);
}
xml.endElement();
//xml.endElement();
}
public static void main(String[] args) throws Exception{
Company company = createCompany();
XMLDocument xml = new XMLDocument(new StreamResult(System.out), false, 4, null);
xml.startDocument();
serializeCompany(xml, company);
xml.endDocument();
}
The above code works perfectly.
Now uncomment last line in serializeEmployee(...)
. when you run, it produces following exception:
Exception in thread "main" org.xml.sax.SAXException: can't find matching start element
at jlibs.xml.sax.XMLDocument.findEndElement(XMLDocument.java:244)
at jlibs.xml.sax.XMLDocument.endElement(XMLDocument.java:257)
at jlibs.xml.sax.XMLDocument.endElement(XMLDocument.java:264)
at Example.serializeCompany(XMLDocument.java:483)
at Example.main(XMLDocument.java:504)
from above stacktrace, you will notice that the error is reported for in serializeCompany(...)
.
but actually the bug is in serializeEmployee(...)
method. This make finding bug tedious.
to make finding bug easier, change serializeCompany(...)
to use marking support as follows:
public static void serializeCompany(XMLDocument xml, Company company) throws SAXException{
xml.startElement("company");
xml.addAttribute("name", company.name);
for(Employee emp: company.employees){
xml.mark();
serializeEmployee(xml, emp);
xml.release();
}
xml.endElement("company");
}
now the exception produced will be:
Exception in thread "main" org.xml.sax.SAXException: can't find matching start element
at jlibs.xml.sax.XMLDocument.findEndElement(XMLDocument.java:244)
at jlibs.xml.sax.XMLDocument.endElement(XMLDocument.java:268)
at Example.serializeEmployee(XMLDocument.java:496)
at Example.serializeCompany(XMLDocument.java:482)
at Example.main(XMLDocument.java:506)
i.e the stacktrace now clearly tells the bug is in serializeEmployee(...)
method;
Thus the xml generated between mark and release is validated as complete element.
let us see marking support in detail:
xml.startElement("elem1");
...
xml.startElement("elem2");
....
xml.mark();
xml.startElement("elem3");
....
xml.startElement("elem4");
.....
xml.release(); // will close elem4 and elem3 i.e upto the mark and clears the mark
xml.endElement("elem2");
xml.release()
must be called prior to ending Elements before the mark. i.e,
xml.startElement("elem1");
...
xml.mark();
....
xml.endElement("elem1"); // will throw SAXException: can't find matching start element
NOTE:
- You can mark as many times as possible. i.e, multiple marks can exist;
endElements()
will only end elements which are started after recent mark.release()
implictly does end elements
you can also release any mark, instead of last mark as below:
int mark = xml.mark();
...
xml.mark();
...
xml.mark();
...
xml.release(mark);
when you call mark()
, it returns the number of mark;
first call to mark()
returns 1. next call to mark()
will return 2, if earlier mark is not released;
NOTE: there is an implicit mark 0, which should not be released by user. it is used by XMLDocument
XMLDocument Wrappers
You can create wrappers for XMLDocument
to make creating specific type of xml document easier;
For example: JLibs has one such wrapper jlibs.xml.xsd.XSDocument
to create XMLSchema documents easily;
import jlibs.xml.xsd.XSDocument;
XSDocument xsd = new XSDocument(new StreamResult(System.out), false, 4, null);
xsd.startDocument();
{
String n1 = "http://www.example.com/N1";
String n2 = "http://www.example.com/N2";
xsd.xml().declarePrefix("n1", n1);
xsd.xml().declarePrefix("n2", n2);
xsd.startSchema(n1);
{
xsd.addImport(n2, "imports/b.xsd");
xsd.startComplexType().name("MyType");
{
xsd.startCompositor(Compositor.SEQUENCE);
xsd.startElement().ref(n1, "e1").endElement();
xsd.endCompositor();
}
xsd.endComplexType();
xsd.startElement().name("root").type(n1, "MyType").endElement();
}
xsd.endSchema();
}
xsd.endDocument();
produces following output:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.com/N1" xmlns:n1="http://www.example.com/N1" xmlns:n2="http://www.example.com/N2">
<xsd:import namespace="http://www.example.com/N2" schemaLocation="imports/b.xsd"/>
<xsd:complexType name="MyType">
<xsd:sequence>
<xsd:element ref="n1:e1"/>
</xsd:sequence>
</xsd:complexType>
<xsd:element name="root" type="n1:MyType"/>
</xsd:schema>
You can also create similar wrappers.
ObjectInputSource
org.xml.sax.InputSource
wraps systemID
or OutputStream
or Reader
which is source of xml;
Similarly, ObjectInputSource
extends InputSource
and wraps a java object, which is the source of xml:
new ObjectInputSource<E>(E obj, XMLDocument xml)
ObjectInputSource
has with single abstract method:
protected abstract void write(E obj, XMLDocument xml) throws SAXException;
Subclasses override this method and fire SAX events.
Let us write an implementation of ObjectInputSource
for Company
:
import jlibs.xml.sax.ObjectInputSource;
import org.xml.sax.SAXException;
public class CompanyInputSource extends ObjectInputSource<Company>{
public CompanyInputSource(Company company){
super(company);
}
@Override
protected void write(Company company, XMLDocument xml) throws SAXException{
xml.startElement("company");
xml.addAttribute("name", company.name);
for(Employee emp: company.employees){
xml.startElement("employee");{
xml.addAttribute("id", emp.id);
xml.addAttribute("age", ""+emp.age);
xml.addElement("name", emp.name);
xml.addElement("email", emp.email);
}
xml.endElement("employee");
}
xml.endElement("company");
}
}
Note: xml.startDocument()
is implicitly called before write(...)
method and
xml.endDocument()
is called implicitly after write(...)
method.
To create XML, now we can do the following:
public static void main(String[] args) throws Exception{
Employee scott = new Employee("1", "scott", "scott@gmail.com", 20);
Employee alice = new Employee("2", "alice", "alice@gmail.com", 25);
Company company = new Company("MyCompany", scott, alice);
// print company to System.out as xml
new CompanyInputSource(company).writeTo(System.out, false, 4, null);
}
ObjectInputSource
contains several methods to serialize the SAX events:
public void writeTo(Writer writer, boolean omitXMLDeclaration, int indentAmount) throws TransformerException
public void writeTo(OutputStream out, boolean omitXMLDeclaration, int indentAmount, String encoding) throws TransformerException
public void writeTo(String systemID, boolean omitXMLDeclaration, int indentAmount, String encoding) throws TransformerException
if encoding
is null, then it defaults to default XML encoding(UTF-8
);
These writeTo(...)
methods use Identity Trasformer;