Evolution of XML Parsing and Manipulation in Java
The combination of Java and XML has been one of the most attractive things to happen in software development in the 21st century, mainly for two reasons: Java is arguably the most widely used programming language, and XML is, almost unarguably, the best mechanism for data description and transfer.
Because these were two different technologies, a developer initially needed a sound understanding of both before being able to make the best use of the combination. Since then there has been a shift towards the Java side, and a few interesting technologies have evolved to make this happen. Some of them are:
SAX - Simple API for XML
It was the first to come on the scene and, interestingly, it was developed on the XML-Dev mailing list. Evidently the people who developed it were XML gurus, and that is quite visible in the way the API is used: you need a fair understanding of XML to work with it. Still, Java developers at last had a structured way to combine the two worlds of Java and XML, and it instantly became a hit.
Being the first rung on the evolution ladder, it offered only basic support for XML processing. It is an event-based technology: the parser reads the XML document sequentially and reports each part (start of an element, character data, end of an element and so on) through callbacks. This effectively means you can't go back to a part that was read/processed previously - if you do have such a requirement, you need to store and manage the relevant data yourself.
Since this API does not need to load the entire XML document into memory, and because it processes the document strictly sequentially, it is quite fast. Another reason for its speed is that it does not allow modification of the underlying XML data.
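To make the callback model concrete, here is a minimal sketch of a SAX handler. It assumes a made-up document whose books carry a title element; the class name SaxTitleCollector, the handler TitleHandler and the element name are illustrative choices, not part of SAX itself. Note how the handler has to remember the titles on its own, since the parser keeps nothing once an event has been delivered.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTitleCollector {

    // Hypothetical handler: stores the text of every <title> element,
    // because SAX does not let us go back to it later.
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder current; // non-null only while inside <title>

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("title".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one text node, so we append.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName)) {
                titles.add(current.toString());
                current = null;
            }
        }
    }

    // Parses the XML text once, front to back, and returns the collected titles.
    static List<String> collectTitles(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TitleHandler handler = new TitleHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book><title>Java</title></book>"
                   + "<book><title>XML</title></book></library>";
        System.out.println(collectTitles(xml));
    }
}
```

The document is never held in memory as a whole; only the list of titles survives the parse, which is exactly the bookkeeping the developer is responsible for.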
Interested in a step-by-step implementation (with an explanation of the complete source code) of a simple SAX parser in Java using the SAX2 APIs? Here it is: SAX Parser Implementation in Java
DOM - Document Object Model
The Java binding for DOM provides a tree-based representation of XML documents, allowing random access to, and modification of, the underlying XML data. It is not hard to deduce that this makes it slower than SAX.
The event-based callback methodology was replaced by an object-oriented, in-memory representation of the XML document. Whether the entire document or only a part of it is held in memory at a given instant differs from one implementation to another, but Java developers are kept out of that hassle: the entire tree is readily available whenever they want it.
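The random access and modification described above can be sketched as follows. This is only an illustration: the class name DomEditExample, its methods and the title element are assumptions made for the example, while the parsing and tree calls are the standard org.w3c.dom and javax.xml.parsers APIs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomEditExample {

    // Builds the whole document tree from the XML text.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Changes the text of the LAST <title>, then returns the text of the
    // FIRST one: nodes can be visited in any order and edited in place.
    static String renameLastTitle(Document doc, String newTitle) {
        NodeList titles = doc.getElementsByTagName("title");
        Node last = titles.item(titles.getLength() - 1);
        last.setTextContent(newTitle);
        return titles.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<library><book><title>Java</title></book>"
                           + "<book><title>XML</title></book></library>");
        String first = renameLastTitle(doc, "XML in Java");
        NodeList titles = doc.getElementsByTagName("title");
        System.out.println(first + " / " + titles.item(1).getTextContent());
    }
}
```

Neither step is possible with plain SAX, where an element is gone as soon as its callbacks have returned.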
JAXP - Java API for XML Processing
The creators and designers of Java realized that Java developers should not have to be XML gurus to use XML in Java applications. The first step towards making this possible was JAXP, which made it easier to obtain either a DOM Document or a SAX-compliant parser via a factory class. This reduced the dependence of Java developers on the numerous vendors supplying parsers of either type. Additionally, JAXP made sure that switching between parsers required minimal code changes.
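As a rough sketch of the factory approach, the code below obtains a SAX parser and a DOM builder through the JAXP factory classes and uses each to count the elements in the same document. The class name JaxpFactoryExample and the two counting methods are invented for illustration; the point is that no vendor-specific parser class is named anywhere in the code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class JaxpFactoryExample {

    // Counts elements with whichever SAX parser the factory supplies.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++; // one event per opening tag
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    // Counts elements with whichever DOM implementation the factory supplies.
    static int countWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // "*" matches every element in the tree, the root included.
        return doc.getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><c><d/></c></a>";
        System.out.println(countWithSax(xml) + " " + countWithDom(xml));
    }
}
```

Which implementation the factories return can be changed from outside the code, for example through the javax.xml.parsers.SAXParserFactory and javax.xml.parsers.DocumentBuilderFactory system properties, so swapping parsers leaves this code unchanged.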