node-elementtree – Node.js Library To Build And Parse XML Documents

This is another post in the series about the Cloud Monitoring libraries we have open-sourced. I first talked about Whiskey, a powerful Node.js test framework, and more recently Gary talked about the Cassandra CQL driver. This week I’m going to talk about node-elementtree, a Node.js library for building and parsing XML, inspired by Python’s ElementTree module.

We began collecting requirements before we started to build our Monitoring-as-a-Service (MaaS) platform, and one of the requirements was to support XML. None of the developers on our team particularly likes XML (we prefer JSON), but we needed to support it because some of the API consumers still use it. When we started working on the project, there wasn’t a good Node.js library available for creating XML documents, so we decided to build our own. Many of the developers on our team have previously worked with Python and were familiar with Python’s ElementTree API, so we modeled our library after ElementTree’s straightforward, easy-to-use interface.

In this blog post I am going to go over a couple of examples of using the API. First we’ll create a document, and then we’ll parse one.

Example 1 – Creating An XML Document

This example shows how to build a valid XML document that can be published to Atom Hopper. Internally, Atom Hopper serves as a bridge, called “Usage,” between our products and revenue collection. MaaS and other products send similar events to it every time a user performs an action on a resource (e.g. creates, updates or deletes it). Below is an example of using the API to create a new XML document.

var et = require('elementtree');
var ElementTree = et.ElementTree;
var element = et.Element;
var subElement = et.SubElement;

var date, root, tenantId, serviceName, eventType, usageId, dataCenter, region, resourceId, category, startTime, resourceName, etree, xml;

date = new Date();

root = element('entry');
root.set('xmlns', 'http://www.w3.org/2005/Atom');

tenantId = subElement(root, 'TenantId');
tenantId.text = '12345';

serviceName = subElement(root, 'ServiceName');
serviceName.text = 'MaaS';

resourceId = subElement(root, 'ResourceID');
resourceId.text = 'enAAAA';

usageId = subElement(root, 'UsageID');
usageId.text = '550e8400-e29b-41d4-a716-446655440000';

eventType = subElement(root, 'EventType');
eventType.text = 'create';

category = subElement(root, 'category');
category.set('term', 'monitoring.entity.create');

dataCenter = subElement(root, 'DataCenter');
dataCenter.text = 'global';

region = subElement(root, 'Region');
region.text = 'global';

startTime = subElement(root, 'StartTime');
startTime.text = date.toString();

resourceName = subElement(root, 'ResourceName');
resourceName.text = 'entity';

etree = new ElementTree(root);
xml = etree.write({'xml_declaration': false});
console.log(xml);

https://gist.github.com/2554363

As you can see, both et.Element and et.SubElement are factory methods. et.Element returns a new, standalone element, while et.SubElement returns a new element that is appended to the given parent. Once you have created an element (tag), you can use the set method to set an attribute. To set the element’s text content, assign a value to its .text property.
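
Because Example 1 repeats the same "create a sub-element, then assign .text" pattern for most fields, the pattern can be folded into a loop. The following is a minimal sketch rather than anything shipped with node-elementtree: buildEntry is a hypothetical helper name, it takes the module (et) as an argument, and it relies only on the Element, SubElement, set and .text usage shown above.

```javascript
// Hypothetical helper (not part of node-elementtree): builds an Atom <entry>
// element with one text-only child per key in `fields`.
// `et` is the module returned by require('elementtree').
function buildEntry(et, fields) {
  var root = et.Element('entry');
  root.set('xmlns', 'http://www.w3.org/2005/Atom');

  Object.keys(fields).forEach(function (tag) {
    // SubElement creates the child and attaches it to `root`.
    var child = et.SubElement(root, tag);
    // Convert explicitly so numbers and dates are stored as strings.
    child.text = String(fields[tag]);
  });

  return root;
}
```

Assuming the library is installed, passing the result to new et.ElementTree(...) and calling write({'xml_declaration': false}) should serialize it the same way as in the example above. Elements that carry attributes instead of text, such as category, still need to be created by hand.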

This example would output a document that looks like this:

<entry xmlns="http://www.w3.org/2005/Atom">
  <TenantId>12345</TenantId>
  <ServiceName>MaaS</ServiceName>
  <ResourceID>enAAAA</ResourceID>
  <UsageID>550e8400-e29b-41d4-a716-446655440000</UsageID>
  <EventType>create</EventType>
  <category term="monitoring.entity.create"/>
  <DataCenter>global</DataCenter>
  <Region>global</Region>
  <StartTime>Sun Apr 29 2012 16:37:32 GMT-0700 (PDT)</StartTime>
  <ResourceName>entity</ResourceName>
</entry>

https://gist.github.com/2632179
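
The StartTime value above is the default string form of a JavaScript Date, which depends on the time zone of the machine that runs the script. If the consumer of the document expects a machine-readable timestamp (an assumption here; this post does not say which format Atom Hopper requires), Date’s built-in toISOString method produces an ISO 8601 value in UTC. A minimal sketch, using a fixed date for illustration:

```javascript
// Fixed instant for illustration: 2012-04-29 23:37:32 UTC, which is
// 16:37:32 in a UTC-7 zone such as the one in the sample output.
var date = new Date(Date.UTC(2012, 3, 29, 23, 37, 32)); // months are 0-based

// ISO 8601 in UTC, independent of the local time zone.
var isoStartTime = date.toISOString();
console.log(isoStartTime); // 2012-04-29T23:37:32.000Z
```

In Example 1 this would mean assigning date.toISOString() to startTime.text.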

Example 2 – Parsing An XML Document

This example shows how to parse an XML document and use simple XPath selectors. For demonstration purposes, we will use the XML document located at https://gist.github.com/2554343.

Behind the scenes, node-elementtree uses Isaac Schlueter’s sax library for parsing XML, but node-elementtree has a concept of “parsers,” which makes it fairly simple to add support for a different parser.

var fs = require('fs');

var et = require('elementtree');

var data, etree;

data = fs.readFileSync('document.xml').toString();
etree = et.parse(data);

console.log(etree.findall('./entry/TenantId').length); // 2
console.log(etree.findtext('./entry/ServiceName')); // MaaS
console.log(etree.findall('./entry/category')[0].get('term')); // monitoring.entity.create
console.log(etree.findall('*/category/[@term="monitoring.entity.update"]').length); // 1

https://gist.github.com/2554353

The module exposes three lookup methods on the ElementTree instance:
•    etree.find – returns the first element that matches the XPath selector
•    etree.findall – returns an array of all elements that match the XPath selector
•    etree.findtext – returns the text value of the first element that matches the XPath selector

For more information on how to use XPath selectors, you can refer to the Python ElementTree documentation.
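
Since findall returns an array, indexing [0] on an empty result gives undefined, and calling get on it throws a TypeError. When a selector might not match the document, a small guard keeps the script from crashing. The helper below is a hypothetical sketch, not part of node-elementtree; it relies only on findall and get as used in the example above.

```javascript
// Hypothetical helper (not part of node-elementtree): returns the value of
// attribute `name` on the first element matching `path`, or `fallback` when
// nothing matches or the attribute is missing.
function firstAttribute(etree, path, name, fallback) {
  var matches = etree.findall(path) || [];
  if (matches.length === 0) {
    return fallback;
  }
  var value = matches[0].get(name);
  return (value === undefined || value === null) ? fallback : value;
}
```

For example, firstAttribute(etree, './entry/category', 'term', null) returns null instead of throwing when the path matches nothing, which is a hint to double-check the selector against the document’s actual structure.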

How We Use node-elementtree In MaaS

In MaaS we use it in two places:
1. Sending usage documents to an Atom Hopper instance
2. Returning a response to API users who have requested XML as the response format

In the second case we use it in combination with Swiz, our validation and serialization framework. We use Swiz to define field types and validators for every object in the system. When a user hits our API and requests XML, Swiz serializes the response using the object definition and the node-elementtree library.

Conclusion

I hope you found this post helpful! More examples and the project source code can be found on GitHub. We will be back soon to talk more about Swiz, the serialization and validation framework mentioned above. If you think this kind of work would be challenging and fun, and you enjoy contributing to open-source projects, we are always looking for talented engineers.

About the Author

This is a post written and contributed by Rackspace.

3 Comments

Why not use this simple script by Derek Anderson to convert a JSON object to xml? If there’s one thing I hate about all the xml parsers out there for e.g. PHP, it’s that you have to state the obvious again and a-frickin-gain. I was looking for something that could convert a human-readable json object to xml, and I found Derek’s NodeJS function.

I converted it to a NodeJS module for my own project. https://gist.github.com/2842694

Redsandro on May 31, 2012

Just a comment/suggestion… Would have an option that can do streams/pipes.. since you can have fairly large XML representations that could take up a lot of memory/overhead for what you are doing here.

The same goes for reading.. should have an element name that you are looking for, with an event to read in just that element (one at a time) for processing.

This is actually a pretty big shortcoming in all the XML handlers for Node.js, they tend to be an all at once approach, which is fine for small data, but bad for larger import/export processors and services.

Michael J. Ryan on October 24, 2012

Nice code. Clean and simple. But the parsing does not work for me. I get an error when parsing: “TypeError: Cannot call method ‘get’ of undefined”

And when I comment that line, I get this result:

0
undefined
0

That does not look right. Anyone got an idea?

Bastian on December 6, 2012
