JAXP DOM Parser Example


In this article, we will parse an XML file and construct a Document Object Model.

The DOM represents the entire structure of the parsed XML as a tree. Each node knows its parent as well as its children.

The constructed DOM structure is held in memory, so one can navigate through the node structure and add, modify, or delete elements and content.
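
As a minimal sketch of what navigating and editing the in-memory tree looks like, the snippet below parses a small XML string, adds an element, changes a text value, and removes a node. The class name DomEditSketch and the helpers parse and addEmployee are hypothetical names chosen for illustration; only the standard javax.xml.parsers and org.w3c.dom calls are real API.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomEditSketch {

    // Hypothetical helper: parses an XML string into an in-memory DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: appends <employee id="..."><name>...</name></employee> to the root.
    static Element addEmployee(Document doc, String id, String name) {
        Element emp = doc.createElement("employee");
        emp.setAttribute("id", id);
        Element nameEl = doc.createElement("name");
        nameEl.setTextContent(name);
        emp.appendChild(nameEl);
        doc.getDocumentElement().appendChild(emp);
        return emp;
    }

    public static void main(String[] args) {
        Document doc = parse("<employees><employee id=\"1\"><name>Joe</name></employee></employees>");
        Element root = doc.getDocumentElement();

        // Add: the new node is attached to the tree and knows its parent.
        Element added = addEmployee(doc, "2", "Sam");
        System.out.println(added.getParentNode().isSameNode(root));

        // Modify: replace the text of the first employee's <name>.
        Element first = (Element) root.getFirstChild();
        first.getFirstChild().setTextContent("Joseph");
        System.out.println(first.getTextContent());

        // Delete: detach the employee we just added.
        root.removeChild(added);
        System.out.println(root.getElementsByTagName("employee").getLength());
    }
}
```

All changes here affect only the in-memory tree; the source XML is untouched unless the document is explicitly written back out.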

Sample XML

In this example, we will see how to convert an XML document into a DOM object.

The XML below contains employee information. We will first convert the XML into a DOM object and then traverse the DOM to convert it into POJOs.

emp.xml:

<?xml version="1.0"?>
<employees>
	<employee id="1">
		<name>Joe</name>
		<age>34</age>
	</employee>
	<employee id="2">
		<name>Sam</name>
		<age>24</age>
	</employee>
	<employee id="3">
		<name>John<!-- author --> S</name>
		<age>44</age>
	</employee>
</employees>

Here is the employee bean.

Employee:

package com.javarticles.sax;

public class Employee {
    private String name;
    private Integer age;
    private Integer id;
    
    Employee(){}
    
    Employee(Integer id, String name, Integer age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public Integer getAge() {
        return age;
    }
    public void setAge(Integer age) {
        this.age = age;
    }
    public Integer getId() {
        return id;
    }
    public void setId(Integer id) {
        this.id = id;
    }
    public String toString() {
        return "Employee(id:"+ id + ", name:" + name + ", age:" + age + ")";
    }
}

Parse the XML into DOM

Below are the steps to convert the XML document into a DOM object.

  1. Instantiate DOM Builder Factory.
    DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
    
  2. Next, get a new instance of a document builder from the above factory.
    DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
    
  3. Use the document builder to parse the XML file.
    Document document = documentBuilder.parse( DOMParserExample.class.getResourceAsStream("emp.xml") );
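
The three steps above can be wrapped into one helper. The sketch below is one possible arrangement, not the article's code: the class name DomLoadSketch and the method load are hypothetical, the null check is added because getResourceAsStream returns null when the resource is missing, and enabling XMLConstants.FEATURE_SECURE_PROCESSING is an optional hardening step assumed here for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class DomLoadSketch {

    // Hypothetical helper combining factory creation, builder creation, and parsing.
    static Document load(InputStream in) {
        if (in == null) {
            // getResourceAsStream returns null when the resource cannot be found.
            throw new IllegalArgumentException("XML resource not found");
        }
        try (InputStream stream = in) {
            // Step 1: instantiate the factory (with secure processing switched on).
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Step 2: obtain a builder from the factory.
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Step 3: parse the stream into a Document.
            return builder.parse(stream);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<employees><employee id=\"1\"/></employees>".getBytes(StandardCharsets.UTF_8);
        Document document = load(new ByteArrayInputStream(xml));
        System.out.println(document.getDocumentElement().getNodeName());
    }
}
```

In the article's setting, the call would be load(DOMParserExample.class.getResourceAsStream("emp.xml")), which fails with a clear message instead of a parser error if emp.xml is not on the classpath.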
    

Convert DOM into POJO

Once we have the Document object, we need to traverse its nodes to convert it into POJOs.

  1. The topmost element of the document is employees, which in turn contains one or more employee child elements. To iterate through the employee elements, we first need to get the top element.
    Element top = document.getDocumentElement();
    
  2. Next, get the child nodes of the top element and pick out the employee elements.
    NodeList children = top.getChildNodes();
    for ( int i = 0; i < children.getLength() ; i++ ) {
        Node node = children.item( i );
        if (node instanceof Element) {
            final Element element = (Element) node;
            final String tag = element.getNodeName();
            if ( tag.equals( "employee" ) ) {
                ....
            }
        }
    }
    
  3. Get the value of the employee element's id attribute.
    String id = element.getAttribute("id");
    
  4. We also need to retrieve the name and age of the employee. Since name and age are child elements of the employee element, we have to search the employee's child nodes for the element with the matching tag name.
    public static Element getChildElementByTagName(Element ele, String childEleName) {
        NodeList nl = ele.getChildNodes();
        for (int i = 0; i < nl.getLength(); i++) {
            Node node = nl.item(i);
            if (node instanceof Element && node.getNodeName().equals(childEleName)) {
                return (Element) node;
            }
        }
        return null;
    }
    
  5. Once we have found the child element (name or age in our example), we need to retrieve its value. We concatenate its character data and skip any comments.
    private static String getTextValue(Element el) {
        NodeList nl = el.getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nl.getLength(); i++) {
            Node item = nl.item(i);
            if ((item instanceof CharacterData && !(item instanceof Comment))) {
                sb.append(item.getNodeValue());
            }
        }
        return sb.toString();
    }
    
  6. Finally, we print the constructed employee objects.
    empList.forEach(System.out::print);
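
The instanceof Element check in step 2 and the comment filter in step 5 exist because a parent's child list contains more than elements: the indentation between tags becomes text nodes, and comments become comment nodes. The sketch below makes that visible by printing one letter per child node. The class ChildNodeTypesSketch and its methods parse and describeChildren are hypothetical names used only for this illustration; it also shows the standard getTextContent method, which concatenates text while skipping comments, as a possible shorter alternative to a hand-written text collector.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ChildNodeTypesSketch {

    // Hypothetical helper: parses an XML string with a default (non-validating) builder.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns one letter per child: E = element, T = text, C = comment, ? = anything else.
    static String describeChildren(Element parent) {
        StringBuilder sb = new StringBuilder();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            switch (children.item(i).getNodeType()) {
                case Node.ELEMENT_NODE: sb.append('E'); break;
                case Node.TEXT_NODE:    sb.append('T'); break;
                case Node.COMMENT_NODE: sb.append('C'); break;
                default:                sb.append('?');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Whitespace between the tags shows up as text nodes around each element.
        Document doc = parse("<employees>\n\t<employee id=\"1\"/>\n\t<employee id=\"2\"/>\n</employees>");
        System.out.println(describeChildren(doc.getDocumentElement())); // prints TETET

        // A comment inside <name> splits its content into text, comment, text.
        Element name = parse("<name>John<!-- author --> S</name>").getDocumentElement();
        System.out.println(describeChildren(name));  // prints TCT
        System.out.println(name.getTextContent());   // prints John S
    }
}
```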
    

DOMParserExample:

package com.javarticles.sax;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.CharacterData;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;


public class DOMParserExample {
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
        Document document = documentBuilder.parse( DOMParserExample.class.getResourceAsStream("emp.xml") );
        
        final Element top = document.getDocumentElement();
        final NodeList children = top.getChildNodes();
        Employee currentEmp = null;
        List<Employee> empList = new ArrayList<>();
        for ( int i = 0; i < children.getLength() ; i++ ) {
            Node node = children.item( i );
            if (node instanceof Element) {
                final Element element = (Element) node;
                final String tag = element.getNodeName();
                if ( tag.equals( "employee" ) ) {
                    currentEmp = new Employee();
                    String id = element.getAttribute("id");
                    currentEmp.setId(Integer.parseInt(id));
                    empList.add(currentEmp);
                    
                    Element nameElement = getChildElementByTagName(element, "name");
                    currentEmp.setName(getTextValue(nameElement));

                    Element ageElement = getChildElementByTagName(element, "age");
                    currentEmp.setAge(Integer.parseInt(getTextValue(ageElement)));                                       
                }
            }
            
        }
        empList.forEach(System.out::print);
       
    }
    
    private static String getTextValue(Element el) {
        NodeList nl = el.getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nl.getLength(); i++) {
            Node item = nl.item(i);
            if ((item instanceof CharacterData && !(item instanceof Comment))) {
                sb.append(item.getNodeValue());
            }
        }
        return sb.toString();
    }
    
    public static Element getChildElementByTagName(Element ele, String childEleName) {
        NodeList nl = ele.getChildNodes();
        for (int i = 0; i < nl.getLength(); i++) {
            Node node = nl.item(i);
            if (node instanceof Element && node.getNodeName().equals(childEleName)) {
                return (Element) node;
            }
        }
        return null;
    }
}
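
For comparison, here is a more compact variant of the same traversal, offered as a sketch rather than a replacement. It swaps the hand-written helpers for the standard getElementsByTagName and getTextContent methods. The class DomCompactSketch and its methods are hypothetical; to stay self-contained it returns formatted strings instead of Employee objects. Note that getElementsByTagName matches descendants at any depth, not only direct children, which is an acceptable assumption for the flat employee structure used here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomCompactSketch {

    // Hypothetical helper: parses the XML and formats every <employee> as a string.
    static List<String> readEmployees(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        List<String> result = new ArrayList<>();
        NodeList employees = doc.getElementsByTagName("employee");
        for (int i = 0; i < employees.getLength(); i++) {
            Element emp = (Element) employees.item(i);
            int id = Integer.parseInt(emp.getAttribute("id"));
            String name = firstText(emp, "name");
            int age = Integer.parseInt(firstText(emp, "age").trim());
            result.add("Employee(id:" + id + ", name:" + name + ", age:" + age + ")");
        }
        return result;
    }

    // Text of the first descendant with the given tag; comments are skipped by getTextContent.
    static String firstText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        if (matches.getLength() == 0) {
            throw new IllegalArgumentException("Missing <" + tag + "> in <" + parent.getNodeName() + ">");
        }
        return matches.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String xml = "<employees>"
                + "<employee id=\"1\"><name>Joe</name><age>34</age></employee>"
                + "<employee id=\"3\"><name>John<!-- author --> S</name><age>44</age></employee>"
                + "</employees>";
        readEmployees(xml).forEach(System.out::println);
    }
}
```

Unlike getChildElementByTagName, which returns null for a missing child, firstText fails fast with a descriptive exception; which behavior is preferable depends on how strictly the input should be validated.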

Output:

Employee(id:1, name:Joe, age:34)Employee(id:2, name:Sam, age:24)Employee(id:3, name:John S, age:44)

Download the source code

This was an example of parsing XML with the DOM parser.

You can download the source code here: domParserExample.zip

About Author

Ram's expertise lies in test-driven development and refactoring. He is passionate about open-source technologies and loves blogging about Java and open-source technologies such as Spring.
