Fastest XML parser

In most applications you need to do some XML parsing, either to provide XML to external systems, to parse XML from external systems, or both. Since the systems I work on use a lot of XML and need to be as fast as possible, I ran some tests on several XML parsers. I tested the following parsers/XML frameworks:
  1. JDom
  2. Piccolo
  3. StaxMate
  4. XStream
  5. JAXB 2.1
All parsers were tested in isolation with the following parameters:
  1. 25 threads running simultaneously
  2. Every thread calls the parser 100 times
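
The benchmark code itself is not published, so the following is only a minimal sketch of how such a setup could look, assuming a fixed pool of 25 threads that each call the parser 100 times. The class name ParserBenchmark, its methods, and the use of the JDK's built-in StAX reader as a stand-in parser are all assumptions made for illustration. The sketch averages wall-clock time per thread, which may differ from how the numbers in this post were computed, and it does not measure memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical harness: runs one parser from several threads and averages the time. */
final class ParserBenchmark {

    // One factory per thread, because XMLInputFactory is not guaranteed to be thread safe.
    private static final ThreadLocal<XMLInputFactory> FACTORY =
            ThreadLocal.withInitial(XMLInputFactory::newInstance);

    /** Returns the average wall-clock time in milliseconds a thread needed for all its calls. */
    static double averageMillisPerThread(Consumer<String> parser, String xml,
                                         int threads, int callsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> task = () -> {
                    long start = System.nanoTime();
                    for (int i = 0; i < callsPerThread; i++) {
                        parser.accept(xml);
                    }
                    return System.nanoTime() - start;
                };
                results.add(pool.submit(task));
            }
            long totalNanos = 0;
            for (Future<Long> result : results) {
                totalNanos += result.get(); // waits for the thread to finish
            }
            return totalNanos / 1_000_000.0 / threads;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Stand-in parser (assumption): walks the whole document with the JDK's StAX reader. */
    static void parseWithStax(String xml) {
        try {
            XMLStreamReader reader = FACTORY.get().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling ParserBenchmark.averageMillisPerThread(ParserBenchmark::parseWithStax, xml, 25, 100) mirrors the parameters above. Any other parser can be plugged in as a Consumer<String>, which keeps the measuring logic identical for every parser.
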
The results are as follows.
  1. JDom
    Avg. parsing time: 325.24 ms
    Avg. memory usage: 307 KB
  2. Piccolo
    Avg. parsing time: 88.08 ms
    Avg. memory usage: 454 KB
  3. StaxMate
    Avg. parsing time: 96.16 ms
    Avg. memory usage: 203 KB
  4. XStream
    Avg. parsing time: 77.04 ms
    Avg. memory usage: 319 KB
  5. JAXB 2.1
    Avg. parsing time: 1778.12 ms
    Avg. memory usage: 618 KB
As you can see, XStream is the fastest of them all. The full ranking of the parsers by speed is:
  1. XStream
  2. Piccolo
  3. StaxMate
  4. JDom
  5. JAXB 2.1
StaxMate, on the other hand, has the smallest memory footprint. The full ranking of the parsers by memory usage is:
  1. StaxMate
  2. JDom
  3. XStream
  4. Piccolo
  5. JAXB 2.1
Weighing the overall pros and cons, XStream comes out as the best choice when you consider the combination of:
  1. Speed
  2. Memory footprint
  3. Ease of use
JDom has a remarkably low memory footprint, but this is because the test used a single XML file that wasn't very big. As the XML file grows, JDom's memory footprint will grow drastically, since it loads everything into memory to construct the full document.
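
To make the difference between a full in-memory document and a streaming reader concrete, here is a small sketch that counts elements both ways. It uses the JDK's DOM parser as a stand-in for a tree model like JDom's and the JDK's StAX reader for the streaming side. The class name TreeVersusStream and both method names are made up for this illustration and are not taken from the test code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical comparison of a tree model and a streaming reader. */
final class TreeVersusStream {

    /** Tree model: the whole document is built in memory before anything can be counted. */
    static int countWithTree(String xml, String elementName) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return document.getElementsByTagName(elementName).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Streaming: only the current event is held, so memory does not grow with the file. */
    static int countWithStream(String xml, String elementName) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both methods give the same answer. The difference is that the tree version keeps every node of the document alive until it is done, while the streaming version forgets each element as soon as it has been seen.
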

JAXB is disappointingly slow and has a huge memory footprint. Its ease of use is a big plus, but XStream is almost as easy to use: it also provides annotations to easily map XML to Java objects, and it performs much better.

Piccolo was only a bit slower than XStream, but it used noticeably more memory (454 KB versus 319 KB) and is harder to use. With Piccolo you have to implement a lot of methods that you don't actually use. A lot of exceptions also need to be caught, resulting in a lot of boilerplate code.
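
Piccolo is a SAX parser, so the code around it is callback based. The sketch below gives an idea of that style using the SAX parser that ships with the JDK rather than Piccolo itself. The class name SaxTitleReader and the title element are assumptions for illustration only. It extends DefaultHandler, which already supplies empty versions of the callbacks you don't need; implementing the ContentHandler interface directly would mean writing all of them out.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX example: collects the text of every <title> element. */
final class SaxTitleReader {

    static List<String> readTitles(String xml) {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <title> element

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length); // may be called several times per element
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Three checked exceptions for a single parse call.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return titles;
    }
}
```

Even for this tiny task you need three callbacks, some state to remember where you are in the document, and a catch block for several checked exceptions.
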

StaxMate would also be a good choice, but its ease of use lags a bit behind XStream and JAXB.

XStream itself also has some additional features like:
  1. JSON support.
  2. Streaming capabilities.
  3. Ability to switch to another type of XML parser without changing the parsing code.
So in general, if you want fast XML parsing and an easy-to-use API, I would recommend XStream. :-)

Comments

  1. Nice.
    It would be nice if you could post code of your benchmark so anyone (especially contributors to those projects) can run it.
    :)

  2. The test code is unfortunately not open source. But basically I wrote 5 parsers that could each parse a specific XML file, then a single abstract test containing the benchmarking logic, and then created 5 separate tests that each test a specific parser in isolation. Pretty straightforward. :-)

    If I have some time left I'll try to write an open source version of the tests ;-).

  3. Hey, glad to see StaxMate being included! (I assume with Woodstox?).
    Interesting results.

    But I am a bit curious regarding XStream: while I really like it functionality-wise, my results have always indicated it to be a bit slow, and JAXB (v2) to be much faster.

    Also: I hope you had a warmup round before the actual test. If not, make sure you run through the test setup once without measuring, and then a second time doing measurements. Otherwise you will be measuring JVM startup time and HotSpot overhead, not library performance.

  4. The code I tested was written in a thread-safe manner, meaning that for JAXB v2 you need to call createUnmarshaller() every time you want to parse something. I think this is the reason why it comes out slower in my tests. The XStream parser only contains 1 line:
    ...xstream.fromXML(xml);

    Actually I was very surprised and a bit disappointed myself about the results of JAXB v2. I didn't expect it to be that bad.

    I didn't include a warmup round as you described, but I made sure that everything needed for the test (XML file, instantiation of the parsers, etc.) was done before the parsing started. Maybe not the best way to do it in your opinion, but the tests for every parser were done in exactly the same way. That is the important part: you at least have relative numbers to see which one performed better in a certain scenario.

  5. Yes, you do need a new Unmarshaller each time. But that's cheap; as far as I know, it's the JAXBContext that's expensive to construct. So I don't know why it would be slow, unless it was given input as a DOM tree or something (which probably was not the case).

  6. how about vtd-xml?
    http://vtd-xml.sf.net

  7. Parsing XML files with the XStream parser works like a charm for small XML files. Sometimes my web responses (in XML format) are very large, more than 10 MB. In those cases the XML parsing takes too long to finish. How can I increase the performance of the XStream parser for large files?
