Archive for February, 2011

DITA – A framework for scientific publishing?

There are two industry-recognised standards for XML-based documentation: DocBook and DITA (Darwin Information Typing Architecture).

DocBook is the older of the two specifications and was created specifically for technical documentation. DITA is a younger specification that grew out of IBM; it is described as an architecture in its own right and was designed to give structure to more than just a book. Both specifications are OASIS standards.

A DITA topic

As with XML schemas in general, both specifications can be extended to include bespoke features. However, DocBook is based more on a book structure of sections and subsections, whereas DITA is built around topics that can be assembled in any arrangement by a document map. A DITA topic is itself open to specialisation, but a basic topic has only three core parts (strictly, the DITA specification requires the first two and leaves the body optional):

  1. An id attribute
  2. A title
  3. A body
A topic also has numerous optional elements, many of which borrow HTML-like syntax (for example p, ul and li).
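To make the three parts concrete, here is a minimal sketch of what such a topic might look like, wrapped in a few lines of Python that parse it and report what it contains. The id value, the title and the placeholder text are invented for illustration, and the DOCTYPE declaration that a real .dita file would normally carry is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# A hypothetical minimal topic: the id, title and body listed above,
# plus a few optional HTML-like elements (p, ul, li) inside the body.
TOPIC_XML = """\
<topic id="materials-and-methods">
  <title>Materials and methods</title>
  <body>
    <p>Placeholder paragraph text.</p>
    <ul>
      <li>Placeholder list item one</li>
      <li>Placeholder list item two</li>
    </ul>
  </body>
</topic>
"""


def describe_topic(xml_text):
    """Parse a topic and return its id, its title and the tags used in its body."""
    topic = ET.fromstring(xml_text)
    body = topic.find("body")
    return {
        "id": topic.get("id"),
        "title": topic.findtext("title"),
        # Direct children of <body> only; an absent body yields an empty list.
        "body_tags": [child.tag for child in body] if body is not None else [],
    }


info = describe_topic(TOPIC_XML)
print(info)
```

Nothing here is specific to a DITA toolchain; it only shows that a topic is ordinary XML that any parser can read.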




A topic can exist as a single XML file, and topics can be composed into any arrangement for publication through the use of a document map. A DITA structure therefore offers a more flexible architecture, in which the same "topic" (for example a journal article section such as an abstract, materials and methods, or results) can be included with ease in more than one publication, correctly referenced. In this respect DITA is more like an object-oriented document schema, and can be more easily repurposed, in terms of structure, for any output format (e.g. PDF, HTML). That said, DocBook can be configured, with some work, to behave on a more topic-by-topic basis, and DITA can support a book-based methodology. They are, after all, both XML schemas and are equally extensible or open to specialisation.
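The reuse described above can be sketched as two maps that point at the same topic file. This is only an illustration under stated assumptions: the file names, the titles and the in-memory TOPICS dictionary (standing in for files on disk) are hypothetical, and resolve_map is a hand-written helper, not part of any DITA toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic store: file name -> topic XML. In practice each
# topic would live in its own .dita file on disk.
TOPICS = {
    "abstract.dita": '<topic id="abstract"><title>Abstract</title>'
                     '<body><p>Placeholder.</p></body></topic>',
    "methods.dita": '<topic id="methods"><title>Materials and methods</title>'
                    '<body><p>Placeholder.</p></body></topic>',
    "results.dita": '<topic id="results"><title>Results</title>'
                    '<body><p>Placeholder.</p></body></topic>',
}

# Two document maps; both reference methods.dita, so the same topic
# appears in two publications without being copied.
ARTICLE_MAP = """\
<map>
  <title>Journal article</title>
  <topicref href="abstract.dita"/>
  <topicref href="methods.dita"/>
  <topicref href="results.dita"/>
</map>
"""

HANDBOOK_MAP = """\
<map>
  <title>Methods handbook</title>
  <topicref href="methods.dita"/>
</map>
"""


def resolve_map(map_xml, topics):
    """Return (publication title, list of topic titles) in map order."""
    ditamap = ET.fromstring(map_xml)
    titles = []
    for ref in ditamap.iter("topicref"):
        topic = ET.fromstring(topics[ref.get("href")])
        titles.append(topic.findtext("title"))
    return ditamap.findtext("title"), titles


for publication in (ARTICLE_MAP, HANDBOOK_MAP):
    print(resolve_map(publication, TOPICS))
```

The topic is written once and the maps decide where it appears, which is what gives DITA its object-oriented feel.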

Because it is a standard, whole ecosystems have emerged that make use of the DITA architecture. For example, DITA for Publishers provides libraries to convert DITA markup into HTML, PDF, EPUB and Kindle formats. This allows content structured in DITA to be repurposed for different audiences or different devices with relative ease.
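As a toy illustration of single-source, multi-format output, the sketch below renders one topic as both HTML and plain text. It is deliberately hand-rolled and is not how DITA for Publishers or any other toolkit works internally; the to_html and to_text functions and the sample topic are assumptions made up for this example, and only the title and paragraph elements are handled.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical topic used as the single source for both outputs.
TOPIC_XML = (
    '<topic id="results"><title>Results</title>'
    "<body><p>First finding.</p><p>Second finding.</p></body></topic>"
)


def to_html(topic):
    """Render the title as a heading and each body paragraph as <p>."""
    parts = ["<h1>%s</h1>" % escape(topic.findtext("title"))]
    for p in topic.iterfind("body/p"):
        parts.append("<p>%s</p>" % escape("".join(p.itertext())))
    return "\n".join(parts)


def to_text(topic):
    """Render the same topic as underlined plain text."""
    title = topic.findtext("title")
    lines = [title, "=" * len(title)]
    lines += ["".join(p.itertext()) for p in topic.iterfind("body/p")]
    return "\n".join(lines)


topic = ET.fromstring(TOPIC_XML)
print(to_html(topic))
print(to_text(topic))
```

A real pipeline would cover the full element set and add styling, but the principle is the same: the markup is written once and each output format is just another transformation of it.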

I have recently started using DITA as an architecture to represent content designed primarily for books. With new demands appearing for different ways of delivering the traditional textbook, such as Web delivery and ebooks, DITA is proving immensely powerful for delivering the same content through different media with relative ease and speed. Having used it, it seems obvious to me that a DITA architecture would also benefit the representation of content within a journal article, allowing referencing, repurposing and multi-format delivery. Maybe a topic for discussion through the Beyond the PDF forum.

In the end, it's just XML, so I won't repeat the virtues of content markup through XML. For me, however, its main advantage is the object-oriented-like topic structure as a working architecture.
