Fastest XML parser
Most applications need to handle XML at some point, whether producing it for external systems, consuming it from them, or both. The systems I work on use a lot of XML and need to be as fast as possible, so I ran some tests on several XML parsers. I tested the following parsers/XML frameworks:
- JDom
- Piccolo
- StaxMate
- XStream
- JAXB 2.1
The test setup:
- 25 threads running simultaneously
- Every thread calls the parser 100 times
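A minimal sketch of how such a setup could be driven; it is not the original test code. The `XmlParser` interface, the `averageParseMillis` helper, and the sample document are hypothetical names introduced here, and the JDK's built-in DOM parser stands in for the five libraries. Memory measurement is left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class ParserBenchmark {

    /** Hypothetical abstraction: one implementation per parser under test. */
    interface XmlParser {
        void parse(String xml) throws Exception;
    }

    /**
     * Starts `threads` workers, each calling the parser `callsPerThread` times,
     * and returns the average time per call in milliseconds.
     */
    static double averageParseMillis(XmlParser parser, String xml,
                                     int threads, int callsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> worker = () -> {
                    long total = 0;
                    for (int i = 0; i < callsPerThread; i++) {
                        long start = System.nanoTime();
                        parser.parse(xml);
                        total += System.nanoTime() - start;
                    }
                    return total;
                };
                results.add(pool.submit(worker));
            }
            long totalNanos = 0;
            for (Future<Long> result : results) {
                totalNanos += result.get(); // surfaces any parser failure
            }
            return totalNanos / 1e6 / ((double) threads * callsPerThread);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("parser failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    /** Stand-in parser (assumption): JDK DOM parser, new factory and builder per call for thread safety. */
    static XmlParser jdkDomParser() {
        return xml -> DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) {
        String xml = "<order id=\"1\"><item>book</item></order>"; // hypothetical sample document
        double avg = averageParseMillis(jdkDomParser(), xml, 25, 100);
        System.out.println("Avg. parsing time: " + avg + " ms");
    }
}
```

Wrapping each library behind the same small interface keeps the timing logic identical for every parser, so the numbers stay comparable even if the absolute values are rough.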
The results:
- JDom
Avg. parsing time: 325.24 ms
Avg. memory usage: 307 KB
- Piccolo
Avg. parsing time: 88.08 ms
Avg. memory usage: 454 KB
- StaxMate
Avg. parsing time: 96.16 ms
Avg. memory usage: 203 KB
- XStream
Avg. parsing time: 77.04 ms
Avg. memory usage: 319 KB
- JAXB 2.1
Avg. parsing time: 1778.12 ms
Avg. memory usage: 618 KB
Ranked by parsing time, fastest first:
- XStream
- Piccolo
- StaxMate
- JDom
- JAXB 2.1
Ranked by memory usage, smallest first:
- StaxMate
- JDom
- XStream
- Piccolo
- JAXB 2.1
I compared the parsers on three criteria:
- Speed
- Memory footprint
- Ease of use
JAXB is disappointingly slow and has a huge memory footprint. Its ease of use is a big plus, but XStream is almost as easy: it also provides annotations to map XML to Java objects, and it performs much better.
Piccolo was a bit slower than XStream, but the significant differences are in the memory footprint and the ease of use. With Piccolo you have to implement a lot of methods that you never actually use. A lot of exceptions also need to be caught, which results in a lot of boilerplate code.
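To give an idea of that boilerplate: Piccolo is driven through the SAX API, so the sketch below uses the JDK's bundled SAX parser as a stand-in (an assumption, as are the `SaxItemReader` class and the `<item>` document). Even for pulling a few strings out of a document you write a stateful handler and deal with three checked exceptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxItemReader {

    /** Collects the text of every <item> element; all other events are ignored. */
    static class ItemHandler extends DefaultHandler {
        final List<String> items = new ArrayList<>();
        private StringBuilder text; // non-null only while inside an <item>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("item".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length); // may be called several times per element
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("item".equals(qName)) {
                items.add(text.toString());
                text = null;
            }
        }
    }

    static List<String> readItems(String xml) {
        ItemHandler handler = new ItemHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Three checked exceptions for a single parse call.
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.items;
    }

    public static void main(String[] args) {
        // prints [book, pen]
        System.out.println(readItems("<order><item>book</item><item>pen</item></order>"));
    }
}
```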
StaxMate would also be a good choice, but its ease of use lags a bit behind XStream and JAXB.
XStream itself also has some additional features like:
- JSON support.
- Streaming capabilities.
- Ability to switch to another type of XML parser without changing the parsing code.
So in general, if you want fast XML parsing and an easy-to-use API, I would recommend XStream. :-)
Nice.
It would be nice if you could post the code of your benchmark so anyone (especially contributors to those projects) can run it.
:)
The test code is unfortunately not open source. But basically I wrote 5 parsers that each parse a specific XML file, then a single abstract test containing the benchmarking logic, and created 5 separate tests that each test a specific parser in isolation. Pretty straightforward. :-)
If I have some time left I'll try to write an open source version of the tests ;-).
Hey, glad to see StaxMate being included! (I assume with Woodstox?).
Interesting results.
But I am a bit curious regarding XStream: while I really like it functionality-wise, my results have always indicated it to be a bit slow, and JAXB (v2) to be much faster.
Also: I hope you had a warmup round before the actual test. If not, make sure you run through the test setup once without measuring, and a second time doing measurements. Otherwise you will be measuring JVM startup time and HotSpot overhead, not lib performance.
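The warmup advice can be sketched roughly as follows. This is an illustration only: `Task`, `averageMillis`, and `measureAfterWarmup` are hypothetical names, and the JDK's DOM parser stands in for whichever library is being measured.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class WarmupThenMeasure {

    /** Hypothetical stand-in for the code under test; may throw like a real parser. */
    interface Task {
        void run() throws Exception;
    }

    /** Runs the task `iterations` times and returns the average time per run in milliseconds. */
    static double averageMillis(Task task, int iterations) {
        long start = System.nanoTime();
        try {
            for (int i = 0; i < iterations; i++) {
                task.run();
            }
        } catch (Exception e) {
            throw new IllegalStateException("task failed", e);
        }
        return (System.nanoTime() - start) / 1e6 / iterations;
    }

    /** Runs the whole setup twice and reports only the second pass. */
    static double measureAfterWarmup(Task task, int iterations) {
        averageMillis(task, iterations);        // warmup round: class loading and JIT, result discarded
        return averageMillis(task, iterations); // measured round
    }

    public static void main(String[] args) {
        String xml = "<order id=\"1\"><item>book</item></order>"; // hypothetical sample document
        Task parse = () -> DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        System.out.println("Avg. after warmup: " + measureAfterWarmup(parse, 1000) + " ms");
    }
}
```

Printing the warmup round's average next to the measured one is an easy way to see how much of a first-run number is startup overhead.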
The code I tested was written in a thread-safe manner, meaning that for JAXB v2 you need to call createUnmarshaller() every time you want to parse something. I think this is the reason why it comes out slower in my tests. The XStream parser only contains 1 line:
...xstream.fromXML(xml);
Actually I was very surprised and a bit disappointed myself about the results of JAXB v2. I didn't expect it to be that bad.
I didn't include a warmup round as you described, but I made sure that everything needed for the test (XML file, instantiation of the parsers, etc.) was done before the parsing started. Maybe not the best way to do it in your opinion, but all tests for every parser were run in exactly the same way, which is the important part: you at least have relative numbers to see which one performed better in a certain scenario.
Yes, you do need a new Unmarshaller each time. But that's cheap; as far as I know it's the JAXBContext that's expensive to construct. So I don't know why it would be slow, unless it was given input as a DOM tree or something (which probably was not the case).
how about vtd-xml?
http://vtd-xml.sf.net
Parsing an XML file with the XStream parser works like a charm for small XML files. Sometimes my web responses (in XML format) are very large, more than 10 MB. In that case the XML parsing takes too long to finish. How can I improve the performance of the XStream parser for large files?