node-elementtree – Node.js Library To Build And Parse XML Documents

Filed in Product & Development by Tomaz Muraus | May 9, 2012 9:00 am

This is another post in the series about Cloud Monitoring libraries we have open-sourced. I first talked about Whiskey, a powerful Node.js test framework[1] and recently Gary talked about the Cassandra CQL driver[2]. This week I’m going to talk about node-elementtree, a Node.js library inspired by Python’s ElementTree module[3] for building and parsing XML.

We began collecting requirements before we started to build our Monitoring-as-a-Service (MaaS) platform, and one of the requirements was to support XML. None of the developers on our team particularly likes XML (we prefer JSON), but we needed to support it because some of the API consumers still use it. When we started working on the project, there wasn’t a good Node.js library available for creating XML documents, so we decided to build our own. Many of the developers on our team had previously worked with Python and were familiar with Python’s ElementTree[4] API, so we modeled our library after ElementTree’s straightforward, easy-to-use interface.

In this blog post I am going to go over a couple of examples of using the API. We’ll start with creating a document, and then we’ll talk about parsing one.

Example 1 – Creating An XML Document

This example shows how to build a valid XML document that can be published to Atom Hopper[5]. Internally, Atom Hopper serves as a bridge, called “Usage,” between our products and revenue collection. MaaS and other products send similar events to it every time a user performs an action on a resource (e.g. creates, updates or deletes it). Below is an example of using the API to create a new XML document.

var et = require('elementtree');
var ElementTree = et.ElementTree;
var element = et.Element;
var subElement = et.SubElement;

var date, root, tenantId, serviceName, eventType, usageId, dataCenter, region, resourceId, category, startTime, resourceName, etree, xml;

date = new Date();

root = element('entry');
root.set('xmlns', 'http://www.w3.org/2005/Atom');

tenantId = subElement(root, 'TenantId');
tenantId.text = '12345';

serviceName = subElement(root, 'ServiceName');
serviceName.text = 'MaaS';

resourceId = subElement(root, 'ResourceID');
resourceId.text = 'enAAAA';

usageId = subElement(root, 'UsageID');
usageId.text = '550e8400-e29b-41d4-a716-446655440000';

eventType = subElement(root, 'EventType');
eventType.text = 'create';

category = subElement(root, 'category');
category.set('term', 'monitoring.entity.create');

dataCenter = subElement(root, 'DataCenter');
dataCenter.text = 'global';

region = subElement(root, 'Region');
region.text = 'global';

startTime = subElement(root, 'StartTime');
startTime.text = date;

resourceName = subElement(root, 'ResourceName');
resourceName.text = 'entity';

etree = new ElementTree(root);
xml = etree.write({'xml_declaration': false});
console.log(xml);

https://gist.github.com/2554363[6]

As you can see, both et.Element and et.SubElement are factory functions, so you call them without new. et.Element returns a new element, and et.SubElement returns a new element that has already been appended to the given parent. After you create an element (tag), you can use the set method to set an attribute. To set the element’s text content, assign a value to its .text property.

This example would output a document that looks like this:

<entry xmlns="http://www.w3.org/2005/Atom">
  <TenantId>12345</TenantId>
  <ServiceName>MaaS</ServiceName>
  <ResourceID>enAAAA</ResourceID>
  <UsageID>550e8400-e29b-41d4-a716-446655440000</UsageID>
  <EventType>create</EventType>
  <category term="monitoring.entity.create"/>
  <DataCenter>global</DataCenter>
  <Region>global</Region>
  <StartTime>Sun Apr 29 2012 16:37:32 GMT-0700 (PDT)</StartTime>
  <ResourceName>entity</ResourceName>
</entry>

https://gist.github.com/2632179[7]
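To make the roles of Element, SubElement, set and .text more concrete, here is a minimal, dependency-free sketch of how such an API could be put together. It is not node-elementtree’s actual implementation: ToyElement, escapeXml and serialize are hypothetical names invented for this illustration, and the real library handles namespaces, encodings and many other details that are left out here.

```javascript
// Toy stand-ins for et.Element / et.SubElement (hypothetical, not library code).
function ToyElement(tag) {
  this.tag = tag;
  this.attrib = {};    // attribute name -> value
  this.text = null;    // text content, or null when the element has none
  this.children = [];  // child elements in document order
}

ToyElement.prototype.set = function(key, value) { this.attrib[key] = value; };
ToyElement.prototype.get = function(key) { return this.attrib[key]; };

// Factory functions, so callers do not need `new`.
function element(tag) { return new ToyElement(tag); }

function subElement(parent, tag) {
  var child = new ToyElement(tag);
  parent.children.push(child);  // the child is attached to its parent right away
  return child;
}

// Escape the characters that are unsafe in XML text and attribute values.
function escapeXml(value) {
  return String(value).replace(/&/g, '&amp;').replace(/</g, '&lt;')
    .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

// Serialize an element and its descendants without any indentation.
function serialize(el) {
  var attrs = Object.keys(el.attrib).map(function(key) {
    return ' ' + key + '="' + escapeXml(el.attrib[key]) + '"';
  }).join('');
  if (el.text === null && el.children.length === 0) {
    return '<' + el.tag + attrs + '/>';
  }
  var body = (el.text === null ? '' : escapeXml(el.text)) +
    el.children.map(function(child) { return serialize(child); }).join('');
  return '<' + el.tag + attrs + '>' + body + '</' + el.tag + '>';
}

var root = element('entry');
root.set('xmlns', 'http://www.w3.org/2005/Atom');

var tenantId = subElement(root, 'TenantId');
tenantId.text = '12345';

var category = subElement(root, 'category');
category.set('term', 'monitoring.entity.create');

console.log(serialize(root));
// <entry xmlns="http://www.w3.org/2005/Atom"><TenantId>12345</TenantId><category term="monitoring.entity.create"/></entry>
```

The point of the sketch is the shape of the API: a sub-element is attached to its parent at creation time, so building a document is just a sequence of factory calls followed by one serialization step.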

Example 2 – Parsing An XML Document

This example shows how to parse an XML document and use simple XPath selectors. For demonstration purposes, we will use the XML document located at https://gist.github.com/2554343[8].

Behind the scenes, node-elementtree uses Isaac’s sax library[9] for parsing XML, but the library has a concept of “parsers,” which means it’s pretty simple to add support for a different parser.

var fs = require('fs');

var et = require('elementtree');

var data, etree;

data = fs.readFileSync('document.xml').toString();
etree = et.parse(data);

console.log(etree.findall('./entry/TenantId').length); // 2
console.log(etree.findtext('./entry/ServiceName')); // MaaS
console.log(etree.findall('./entry/category')[0].get('term')); // monitoring.entity.create
console.log(etree.findall('*/category/[@term="monitoring.entity.update"]').length); // 1

https://gist.github.com/2554353[10]

The module exposes three lookup methods on the ElementTree instance:
•    etree.find – returns the first element which matches the XPath selector
•    etree.findall – returns an array of all elements which match the XPath selector
•    etree.findtext – returns the text value of the first element which matches the XPath selector

For more information on how to use XPath selectors you can refer to the Python documentation[11].
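To illustrate how the three lookups relate to each other, here is a small self-contained sketch over a toy tree. It is only an approximation written for this post: node, findall, find and findtext below are hypothetical helpers, not the library’s code, and the path support is limited to '.', '*', plain tag names and a single tag[@attr="value"] predicate. The sample data is also a made-up stand-in for the document used in the example above.

```javascript
// Toy element: a plain object, not the library's Element class.
function node(tag, attrib, text, children) {
  return {tag: tag, attrib: attrib || {}, text: text || null, children: children || []};
}

// Walk the path one step at a time, keeping the set of elements matched so far.
// Limitation of this sketch: attribute values must not contain '/'.
function findall(root, path) {
  var current = [root];
  path.split('/').forEach(function(step) {
    if (step === '.' || step === '') { return; }
    var m = /^([^\[]+)(?:\[@([^=]+)="([^"]*)"\])?$/.exec(step);
    if (!m) { throw new Error('Unsupported path step: ' + step); }
    var next = [];
    current.forEach(function(el) {
      el.children.forEach(function(child) {
        var tagOk = m[1] === '*' || child.tag === m[1];
        var attrOk = m[2] === undefined || child.attrib[m[2]] === m[3];
        if (tagOk && attrOk) { next.push(child); }
      });
    });
    current = next;
  });
  return current;
}

// find is findall restricted to the first match (null when nothing matches).
function find(root, path) {
  var matches = findall(root, path);
  return matches.length > 0 ? matches[0] : null;
}

// findtext is find followed by reading the text of the match.
function findtext(root, path) {
  var el = find(root, path);
  return el === null ? null : (el.text || '');
}

// Made-up sample data with two entries.
var feed = node('feed', {}, null, [
  node('entry', {}, null, [
    node('TenantId', {}, '12345'),
    node('ServiceName', {}, 'MaaS'),
    node('category', {term: 'monitoring.entity.create'})
  ]),
  node('entry', {}, null, [
    node('TenantId', {}, '44444'),
    node('ServiceName', {}, 'MaaS'),
    node('category', {term: 'monitoring.entity.update'})
  ])
]);

console.log(findall(feed, './entry/TenantId').length);  // 2
console.log(findtext(feed, './entry/ServiceName'));     // MaaS
console.log(findall(feed, '*/category[@term="monitoring.entity.update"]').length);  // 1
```

Defining find and findtext in terms of findall keeps the matching logic in one place, which is why the three methods always agree on what a selector matches.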

How We Use node-elementtree In MaaS

In MaaS we use it in two places:
1. Sending usage documents to an Atom Hopper instance
2. Returning a response to API users who have requested XML as the response format

In the second case we use it in combination with Swiz, our validation and serialization framework. We use Swiz to define field types and validators for every object in the system. When a user hits our API and requests XML, Swiz serializes the response using the object definition and the node-elementtree library.
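The idea of serializing from an object definition can be sketched in a few lines. Swiz’s real API is not shown in this post, so everything below is hypothetical: entityDef, toXml and toJson are invented names, and the XML is built with plain string concatenation rather than node-elementtree to keep the sketch self-contained.

```javascript
// Hypothetical object definition: which fields to expose and where each
// value comes from on the internal object. This is not Swiz's format.
var entityDef = {
  tag: 'entity',
  fields: [
    {name: 'id', src: 'key'},
    {name: 'label', src: 'label'}
  ]
};

// Escape the characters that are unsafe in XML text.
function escapeXml(value) {
  return String(value).replace(/&/g, '&amp;').replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Serialize an object to XML by walking the definition, not the object.
function toXml(def, obj) {
  var body = def.fields.map(function(field) {
    return '<' + field.name + '>' + escapeXml(obj[field.src]) + '</' + field.name + '>';
  }).join('');
  return '<' + def.tag + '>' + body + '</' + def.tag + '>';
}

// The same definition can drive the JSON representation.
function toJson(def, obj) {
  var out = {};
  def.fields.forEach(function(field) { out[field.name] = obj[field.src]; });
  return JSON.stringify(out);
}

// Internal object with an extra property that the definition does not expose.
var entity = {key: 'enAAAA', label: 'web & db', internalNote: 'not exposed'};

console.log(toXml(entityDef, entity));
// <entity><id>enAAAA</id><label>web &amp; db</label></entity>
console.log(toJson(entityDef, entity));
// {"id":"enAAAA","label":"web & db"}
```

Because both serializers read the same definition, the XML and JSON responses expose the same fields, and anything missing from the definition stays out of both.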

Conclusion

I hope you found this post helpful! More examples and the project source code can be found on GitHub[12]. We will be back soon to talk more about Swiz, the serialization and validation framework mentioned above. If you think this kind of work would be challenging and fun, and you enjoy contributing to open-source projects, we are always looking for talented engineers[13].

Endnotes:
  1. Whiskey, a powerful Node.js test framework: http://www.rackspace.com/blog/rackspace-open-sources-whiskey-a-test-framework/
  2. Cassandra CQL driver: http://www.rackspace.com/blog/rackspace-contributes-cassandra-cql-driver-for-node-js/
  3. Python’s ElementTree module: http://effbot.org/zone/element-index.htm
  4. ElementTree: https://github.com/racker/node-elementtree
  5. Atom Hopper: http://www.rackspace.com/blog/rackspace-open-sources-atom-hopper-an-atom-publishing-server/
  6. https://gist.github.com/2554363: https://gist.github.com/2554363
  7. https://gist.github.com/2632179: https://gist.github.com/2632179
  8. https://gist.github.com/2554343: https://gist.github.com/2554343
  9. Isaac’s sax library: https://github.com/isaacs/sax-js
  10. https://gist.github.com/2554353: https://gist.github.com/2554353
  11. Python documentation: http://effbot.org/zone/element-xpath.htm
  12. GitHub: https://github.com/racker/node-elementtree
  13. talented engineers: http://jobs.rackspace.com/search?q=developer

Source URL: http://www.rackspace.com/blog/node-elementtree-node-js-library-to-build-and-parse-xml-documents/