How to read XML using SAX parser

Introduction

In the previous article we talked about the DOM parser and provided different examples for parsing and reading elements of an XML document. The SAX parser is another XML parser shipped with the JDK. It is event-driven and usually needs far less memory than DOM, which also tends to make it faster on large documents.

The SAX parser doesn’t load the whole document into memory. Instead, it reads the document sequentially from start to end and invokes callback methods on a handler, so that the developer can process each element as it is encountered.
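To make the callback model concrete, here is a minimal sketch that records the order in which the parser fires its events. The class name SaxEventTrace and the method trace() are hypothetical names chosen for this illustration, and the XML is parsed from an in-memory string rather than a file:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; it is not part of the article's source code.
class SaxEventTrace {

    // Parses the given XML text and returns the callbacks in the order the parser fired them.
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text such as the indentation between tags.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<students><student><id>1</id><name>Hussein</name></student></students>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

Each start event is followed later by its matching end event, and the text of an element arrives between the two. At no point does the handler see the document as a whole.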

1- students.xml

Consider the following students.xml file:

<students>
    <student graduated="true">
        <id>1</id>
        <name>Hussein</name>
    </student>
    <student>
        <id>2</id>
        <name>Alex</name>
    </student>
</students>

2- Student.java

For mapping purposes, we create Student.java, which holds the data of each student element inside students.xml:

package com.programmer.gate;
 
public class Student {
 
    private int id;
    private String name;
    private boolean isGraduated;
 
    public int getId() {
        return id;
    }
 
    public void setId(int id) {
        this.id = id;
    }
 
    public String getName() {
        return name;
    }
 
    public void setName(String name) {
        this.name = name;
    }
 
    public boolean isGraduated() {
        return isGraduated;
    }
 
    public void setGraduated(boolean isGraduated) {
        this.isGraduated = isGraduated;
    }
}

3- Define SAX handler

In this section, we’re going to parse students.xml and populate a List of Student objects out of it.

SAX parses documents using a handler. To define our own handler, we create a class called SAXHandler that extends DefaultHandler as follows:

package com.programmer.gate;
 
import java.util.ArrayList;
import java.util.List;
 
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
 
public class SAXHandler extends DefaultHandler {
    
    private List<Student> students = null;
    private Student student = null;
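    // Buffer for the text of the element that is currently being parsed.
    private StringBuilder elementValue = new StringBuilder();
    
    @Override
    public void startDocument() throws SAXException {
        students = new ArrayList<Student>();
    }
    
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // Reset the buffer so that text from a previous element does not leak into this one.
        elementValue.setLength(0);
        
        if (qName.equalsIgnoreCase("student")) {
            student = new Student();
            
            // getValue() returns null when the attribute is absent; parseBoolean(null) is false.
            String graduated = attributes.getValue("graduated");
            student.setGraduated(Boolean.parseBoolean(graduated));
        }
    }
    
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // Trim surrounding whitespace before using the collected text.
        String text = elementValue.toString().trim();
        
        if (qName.equalsIgnoreCase("student")) {
            students.add(student);
        }
        
        if (qName.equalsIgnoreCase("id")) {
            student.setId(Integer.parseInt(text));
        }
        
        if (qName.equalsIgnoreCase("name")) {
            student.setName(text);
        }
    }
    
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // The parser may deliver the text of one element in several chunks, so append instead of overwriting.
        elementValue.append(ch, start, length);
    }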
 
    public List<Student> getStudents() {
        return students;
    }
}

Following is a brief description of the above code snippet:

  1. startDocument(): This method is called when the parser starts parsing the document.
  2. endDocument(): This method is called when the parser finishes parsing the document. Our handler doesn’t override it.
  3. startElement(): This method is called when the parser encounters the opening tag of an element inside the document.
     • qName: refers to the qualified name of the element, i.e. the tag name.
     • attributes: holds the attributes attached to the element.
     • In the above example, we instantiate a new Student object whenever the parser starts parsing a ‘student’ element.
  4. endElement(): This method is called when the parser encounters the closing tag of an element inside the document.
     • qName: refers to the qualified name of the element, i.e. the tag name.
     • In the above example, we add the already instantiated Student object to the students list whenever we reach the end of a ‘student’ element. If the ending element is id or name, then we set the id or name of the current Student object.
  5. characters(): This method receives the text content of the element currently being parsed. The parser may deliver the text of a single element across several calls. We keep the text in a class field called elementValue so that we can access it inside endElement().
  6. getStudents(): This method exposes the populated list of Student objects so that caller classes can use it.
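The list above mentions endDocument(), which SAXHandler does not override. As a rough sketch of how it could be used, the hypothetical handler below (NameCollector is an illustrative name, not part of the article's source code) collects only the student names and uses endDocument() to freeze the result once parsing is complete:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration only; it is not part of the article's source code.
class NameCollector extends DefaultHandler {

    private List<String> names;
    private final StringBuilder text = new StringBuilder();
    private boolean documentEnded;

    @Override
    public void startDocument() {
        names = new ArrayList<>();
        documentEnded = false;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Start with an empty buffer for every element.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("name".equals(qName)) {
            names.add(text.toString().trim());
        }
    }

    @Override
    public void endDocument() {
        // Called once after the last element: make the result read-only.
        names = Collections.unmodifiableList(names);
        documentEnded = true;
    }

    List<String> getNames() {
        return names;
    }

    boolean isDocumentEnded() {
        return documentEnded;
    }

    // Convenience method (an assumption of this sketch) that parses XML held in a string.
    static NameCollector parse(String xml) throws Exception {
        NameCollector handler = new NameCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }
}
```

Doing the final step in endDocument() rather than in getNames() keeps all parsing-related state changes inside the callbacks that the parser drives.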

4- Parse students.xml

Now we create our main class, named ReadXMLWithSAX, which parses students.xml using SAXParser.

package com.programmer.gate;
 
import java.io.File;
import java.io.IOException;
import java.util.List;
 
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
 
import org.xml.sax.SAXException;
 
public class ReadXMLWithSAX {
 
    public static void main(String[] args) {
        try
        {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            
            SAXHandler saxHandler = new SAXHandler();
            // Pass a File so that the argument is treated as a file path rather than a URI.
            saxParser.parse(new File("students.xml"), saxHandler);
            
            List<Student> students = saxHandler.getStudents();
            for(Student student : students)
            {
                System.out.println("Student Id = " + student.getId());
                System.out.println("Student Name = " + student.getName());
                System.out.println("Is student graduated? " + student.isGraduated());
            }
        }
        catch(ParserConfigurationException | SAXException | IOException ex)
        {
            // Report configuration, parsing and I/O problems.
            ex.printStackTrace();
        }
    }
}

After running the above main method, we get the following output:

Student Id = 1
Student Name = Hussein
Is student graduated? true
Student Id = 2
Student Name = Alex
Is student graduated? false
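If the XML comes from a source you don't fully control, it may be worth configuring the factory more strictly before creating the parser. The following is only a sketch under assumptions: HardenedSaxParsing and its methods are hypothetical names, and the disallow-doctype-decl feature URI is specific to Xerces-based parsers such as the one bundled with the JDK, so other implementations may reject it:

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; it is not part of the article's source code.
class HardenedSaxParsing {

    // Creates a parser that applies secure-processing limits and refuses DOCTYPE declarations.
    static SAXParser newHardenedParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: the underlying implementation is Xerces-based and knows this feature.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory.newSAXParser();
    }

    // Returns true when the XML parses cleanly, false when the parser rejects it.
    static boolean isAccepted(String xml) throws Exception {
        try {
            newHardenedParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException rejected) {
            return false;
        }
    }
}
```

With this configuration a plain document such as students.xml is still parsed normally, while a document that declares a DOCTYPE is rejected with a SAXException before any entity it defines can be expanded.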

5- Source Code

You can download the source code from this repository: Read-XML

Summary

In this article we used the SAX parser provided by the JDK to read students.xml: we defined a handler that extends DefaultHandler, collected the text of each element in its callbacks, and populated a list of Student objects without loading the whole document into memory.

Thanks and happy coding! We hope you enjoyed this article. If you have any questions or comments, feel free to reach out to jacob@initialcommit.io.