Adoption without Disruption: NCBI’s Experience in Switching to BITS

The NCBI Bookshelf at the National Library of Medicine is an online archive of books and documents in life science and healthcare. Its growing collection comprises over 5,000 titles, the majority of which are stored as full-text XML. In the fall of 2014, Bookshelf began work to adopt the Book Interchange Tag Suite (BITS) DTD, replacing the NCBI Book Tag Set Version 2.3 as its XML format of choice. It became immediately apparent that Bookshelf could not simply perform a one-time “switch” to BITS; it needed to support the new schema alongside the old one. Given the complexity of the project, a one-time switch would have required Bookshelf to focus so much energy on the transition that regular production workflows would have come to a complete halt. This was unacceptable, particularly as Bookshelf judged the benefits of adopting BITS to be mostly long-term rather than immediate. Released only in December 2013, BITS was still very new. While there was no doubt that the format was superior to the NCBI Book DTD, the prospect of further revisions to the Tag Suite cautioned against acting too quickly. Therefore, adoption of BITS was conceived as a longer-term project of small, incremental steps designed neither to disrupt the regular production cycle nor to consume all resources. By the time version 2.0 of BITS was released in December 2015, Bookshelf was able to load, render, and index books tagged in BITS. A number of in-house XML converters had been updated to output BITS, and the first titles in BITS had been released. While the majority of new content was still tagged according to the NCBI Book Tag Set v2.3, Bookshelf now had a solid foundation to complete adoption using the new version of BITS. By the end of 2016, all workflows had switched to BITS, including Bookshelf’s Word authoring program, external vendors providing BITS XML, and over 20 in-house XML converters.
This paper describes Bookshelf’s experience in adopting BITS: the challenges Bookshelf faced, the solutions it developed, and the lessons learned along the way. Special emphasis is placed on issues related to markup and XML conversion.
