Adoption without Disruption: NCBI’s Experience in Switching to BITS

The NCBI Bookshelf at the National Library of Medicine is an online archive of books and documents in life science and healthcare. Its growing collection comprises over 5,000 titles, the majority of which are stored as full-text XML. In the fall of 2014, Bookshelf began work to adopt the Book Interchange Tag Suite (BITS) DTD, replacing the NCBI Book Tag Set Version 2.3 as its XML format of choice. It became immediately apparent that Bookshelf could not simply perform a one-time “switch” to BITS; it needed to support the new schema alongside the old one. Given the complexity of the project, a one-time switch would have required Bookshelf to focus so much energy on the transition that regular production workflows would have come to a complete halt. This was inconceivable, particularly as Bookshelf judged the benefits of adopting BITS to be mostly long-term rather than immediate. Released only in December 2013, BITS was still very new. While there was no doubt that the format was superior to the NCBI Book DTD, the prospect of further revisions to the Tag Suite cautioned against acting too quickly. Therefore, adoption of BITS was conceived as a longer-term project of small, incremental steps designed neither to disrupt the regular production cycle nor to consume all resources. By the time version 2.0 of BITS was released in December 2015, Bookshelf was able to load, render, and index books tagged in BITS. A number of in-house XML converters had been updated to output BITS, and the first titles in BITS had been released. While the majority of new content was still tagged as per the NCBI Book Tag Set v2.3, Bookshelf now had a solid foundation to complete adoption using the new version of BITS. By the end of 2016, all workflows had switched to BITS, including Bookshelf’s Word authoring program, external vendors providing BITS XML, and over 20 in-house XML converters.
This paper describes Bookshelf’s experience in adopting BITS: the challenges Bookshelf faced, the solutions it developed, and the lessons learned along the way. Special emphasis is placed on issues related to markup and XML conversion.


In pursuit of family harmony: Introducing the JATS Compatibility Meta Model

JATS is an Open Standard. Users may modify it by adding or removing elements and attributes to suit their needs. Some publishers have extended (added to) JATS based on their own requirements, and there are also public extensions such as BITS, STS, and TaxPub. Users expect significant efficiencies from vocabularies based on JATS, including the ability to intermingle the documents in databases, to use tools created for JATS with their new vocabulary with minimal additional work, and to adopt rendering/formatting applications while changing only those aspects specific to the new vocabulary. Some model changes create compatible documents, which can interoperate gracefully with JATS documents, but others are disruptive. We discuss which types of changes to the JATS models can be integrated into existing XML environments and which may be disruptive. We propose a set of criteria to evaluate whether a proposed change will be seamless or might cause problems.

