Citations as First-Class Data Entities: Introduction

Citations are now centre stage

As a result of the Initiative for Open Citations (I4OC), launched on April 6 last year, almost all the major scholarly publishers now open the reference lists they submit to Crossref, resulting in more than half a billion references being openly available via the Crossref API.

It is therefore time to think carefully about how citations are treated, and how they might be better handled as part of the Linked Open Data Web.

Peer review as practised at Wellcome Open Research

In November 2016 Wellcome became the first research funder to launch a publishing platform for the exclusive use of its grantholders. Wellcome Open Research, run on behalf of Wellcome by F1000, uses a model of immediate publication followed by invited, post-publication peer review. All reviews are citable and posted to the platform along with the identities of the reviewers.

Six essential reads on peer review

In preparation for our meeting on Transparency, Recognition, and Innovation in Peer Review in the Life Sciences on February 7-9 at HHMI Headquarters, we’ve collected some recent (and not-so-recent) literature on journal peer review. A full annotated bibliography can be found at the bottom of this post, and we invite any additions via comments. To […]

New feature aims to draw journals into post-publication comments on PubPeer

When a paper is challenged on PubPeer, is a journal paying attention? A new feature recently unveiled by the site makes it easier to find out. The Journal Dashboards allow journals to see what people are saying about the papers they published, and allows readers to know which journals are particularly responsive to community feedback. […]

The post New feature aims to draw journals into post-publication comments on PubPeer appeared first on Retraction Watch.

Bringing the peer review conversation to life

At Wellcome Open Research, we operate a model of post-publication open peer review. We believe this will encourage constructive feedback from experts focused on helping the authors improve their work.

There are many other models of open peer review out there that work in different ways. In most models the reviewer is named, which is seen as a way of crediting them for their work. We go a step further: we not only name reviewers, we also include their full reports as part of the published article. Each peer review report has its own DOI that can be added to an ORCID profile, further ensuring peer reviewers get credit for their work.

Open peer review as a two-way conversation

Open peer review could also be described as a way of giving reviewers a voice, as their critique and insight often help shape the final article. Although the peer review reports and reviewers’ names are readily available, we don’t often hear from reviewers directly, so we were interested to explore what the conversation between author and reviewer looks like.

CRISPR for the community

Jürg Bähler, María Rodríguez-López and their team decided to refine the CRISPR/Cas9 gene-editing technique for yeast, based on questions raised on a community email distribution list. They saw their work as a valuable resource for helping others in their research and were keen to get it out there quickly. This was one of the main reasons they decided to publish their Method Article on Wellcome Open Research.

Peer reviewed by the community

Once published, Jürg, María and colleagues then needed to decide who had the most appropriate expertise to review their article. This can be particularly important in niche fields as authors are best placed to know who should review their work. In this instance, Jürg and María thought it would be good to invite Damien Helmand to review as they knew his work, and also knew he was interested in this specific technique from questions he raised on the email distribution list.  Damien agreed and invited two of his PhD students, Carlo Yague-Sanz and Olivier Finet, to review with him as a way of gaining experience in peer review. Carlo and Olivier are also named alongside Damien as reviewers of the article. Credit where credit is due.



Exploring the living article’s publishing process

After the article passed peer review, we went to meet with Jürg, María, Damien and Carlo to hear their views on the publication process, open peer review and how versioning has helped make a living article.

INK – the file conversion engine

For the past 8 months we have been building INK – the open source file conversion and transformation engine for publishing.

INK is now nearing 1.0, which should be ready in the next few weeks. In anticipation of the first major release, we thought you might like to know a little more about what INK does and why.

INK has been built with two major use cases in mind:

  1. Publishers – publishers need to automate all manner of operations on files (conversion, enrichment, format validation, etc.). INK does all this and can be integrated with any current technology stack the publisher uses.
  2. File conversion pros and production staff – the people who love staying up all night perfecting file transformations. INK is a job management framework into which you can plug any action you want taken on files, create recipes, generate reports and more.

Let’s look at these needs a little more closely.

INK and Publishers

Publishers need to do all sorts of things to files. The highest-value need right now is to automate file conversion from one format to another. Most publishers currently ‘automate’ file conversion by sending MS Word documents to external vendors, which is both costly and slow. Adding to these inefficiencies, errors introduced by the file conversion vendor – and the workflow required to correct those errors – can be painful.

We built INK so that publishers could automate these conversions and generate reports to measure accuracy and speed. INK supports the publisher’s workflow by acting as an ‘invisible’ file conversion service: you push a button and get a result. INK can be integrated into your current workflow with minimal hassle since it uses APIs. Because INK is open source, publishers can either set up their own instance of INK, or use a hosted version offered as a service for a small fee (we are currently talking to some service providers to make this kind of hosted version available). Several smaller publishers could also set up a shared instance of INK to lower costs even further.
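For the technically minded, pushing a job to an INK instance over HTTP might look something like the Python sketch below. The endpoint path, payload shape, and token are our assumptions for illustration – INK’s real API may differ, so treat this as a shape, not a spec.

```python
import json
import urllib.request

# Hypothetical INK endpoint; substitute your own instance's URL.
INK_URL = "https://ink.example.com/api/v1/executions"

def build_conversion_request(recipe_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking INK to run a recipe.

    The payload shape and Bearer token are illustrative assumptions;
    INK uses JWT authentication, so some token would be required.
    """
    payload = json.dumps({"execution": {"recipe_id": recipe_id}}).encode("utf-8")
    return urllib.request.Request(
        INK_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

req = build_conversion_request(recipe_id=42, token="<jwt>")
print(req.get_method(), req.full_url)
```

The point is simply that the whole interaction is an ordinary authenticated HTTP call, which is what makes the ‘invisible’ integration possible.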

As mentioned above, integration with existing software is easy. We have, for example, integrated INK with Editoria, the open access monograph production platform. The integration comes in the form of a button that says ‘Upload Word’. Uploading a Word doc sends the document to INK, which returns beautifully formatted and structured HTML to Editoria and ‘automagically’ loads it into a chapter – all without the user knowing a thing about file conversion.

In other contexts you may also require production tools to QA conversions. In this case it is very simple to set up a tightly integrated production environment connecting INK to, for example, a QA editing environment – everything you need to make your production staff happy (see below for how INK helps troubleshoot file conversions).

INK and File Conversion Pros / Production Staff

It is a simple truth that you cannot have good file conversions without some file conversion pro, somewhere, doing the initial hard work for you. This is because file conversion is not just a science, it is an undocumented art!

INK helps these talented artists help you in 3 critical ways:

  1. Easy-to-build conversion pipelines – INK enables production staff to construct file conversion pipelines through a simple UI. This means they can assemble a new pipeline, reusing previously constructed conversion steps, in (literally) a matter of minutes. This flexibility hasn’t been available in the publishing industry until now: most file conversion pipelines are hard coded, which makes them very difficult to optimize and very difficult to reuse for other conversions.
  2. Reusable steps – INK’s pipelines are built up of discrete, reusable steps. This is the magic behind INK’s philosophy of reuse. File conversion specialists can build these steps very easily (we have clear example documentation) and then use them in as many pipelines as they wish. Steps can be wholly new code in any language, leverage existing services via APIs, or run system processes. Once built, steps can be shared with the world or kept private. Our hope is to build up a shared repository of reusable steps for every need a publisher may have. This would reduce duplicated effort and enable us as a community to spend our time optimizing conversion steps rather than building the same hard-coded conversion pipelines over and over again.
  3. Troubleshooting conversions – INK has a very sophisticated way of managing file conversions and exposing the pipeline results through a clean, open API. INK also logs and displays errors to assist in troubleshooting, so file conversion specialists or production staff can inspect any given conversion and work out exactly where a problem occurred and why.
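To make the step-and-recipe idea concrete, here is a toy sketch in Python. INK itself is not implemented this way and the step names are invented; the sketch only illustrates the principle that a recipe is an ordered list of reusable steps, each transforming the file payload and logging its progress for troubleshooting.

```python
from typing import Callable, List

# A "step" is any callable that transforms one file payload into another.
Step = Callable[[str], str]

def strip_blank_lines(text: str) -> str:
    """Example reusable step: drop empty lines from the payload."""
    return "\n".join(line for line in text.splitlines() if line.strip())

def wrap_in_html(text: str) -> str:
    """Example reusable step: wrap each remaining line in a paragraph."""
    body = "".join(f"<p>{line}</p>" for line in text.splitlines())
    return f"<html><body>{body}</body></html>"

def run_recipe(steps: List[Step], payload: str) -> str:
    """Run each step in order, logging progress so failures are easy to locate."""
    for step in steps:
        payload = step(payload)
        print(f"step {step.__name__} ok ({len(payload)} bytes)")
    return payload

# Assemble a new pipeline from previously built steps in seconds.
recipe = [strip_blank_lines, wrap_in_html]
result = run_recipe(recipe, "Hello\n\nWorld")
```

Because each step is independent, the same `strip_blank_lines` step could be dropped unchanged into an entirely different recipe – that is the reuse the list above describes.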


Currently we have developed INK steps to achieve the following:

  • Docx to xHTML (a very sophisticated conversion that we have been working on for over 6 months)
  • HTML to PDF
  • EPUB to InDesign XML (ICML)
  • Docx to PDF
  • HTML to print-ready, journal-grade PDF

In the works are the following:

  • Docx to JATS
  • LaTeX to PDF
  • HTML to JATS
  • R Markdown to Docx
  • Markdown to HTML
  • HTML to Markdown
  • EPUB to print ready book formatted PDF
  • EPUB to Mobi
  • Docx to DocBook XML

and more! INK itself, and all steps we produce, are open source (MIT license).

It’s not all about conversions

INK isn’t only about conversions. Reusable steps can be written to mine data from articles, automatically register DOIs, automate plagiarism checks, normalize data, validate formats and data, link identifiers, syndicate, and a whole lot more. One of the most important use cases ahead of us, we think, is to start parsing and normalizing metadata out of manuscripts at submission time and then disseminating it to third parties – reducing the time and effort needed to process research and improving early discovery of preprints and articles. A perfect job for INK. We will move on to these use cases quickly once our initial file conversions are in place. You should see rapid progress on these other file operations within the next month or so!


There is a lot to the INK universe, as it is sophisticated software. Here is a short breakdown for the technically minded. On the server side, INK provides:


  • HTTP Service API
  • Resource management
  • Async request management
  • Multi-tenant service architecture
  • JWT authentication
  • Step abstraction (leveraging Ruby gems)
  • Recipe management
  • Web Socket support
  • Event subscription during recipe execution, meaning any client using the INK API can update their users on the progress of execution in real time.

The INK web client currently supports:

  • Login
  • UI Recipe creation (including selecting the steps from an automatically populated, searchable dropdown of available steps on that INK instance)
  • Public and private recipes
  • Editing a recipe from the UI
  • An updated recipe view with clearer step names and descriptions
  • Users can immediately see the file list belonging to each step as it completes.
  • Users can download each file individually or together as a .zip file.
  • Administrators can get a status report of services INK uses, so it’s easy to spot potential issues that may affect users.
  • A list of user accounts – it’s basic at the moment, and will evolve to account management.
  • A list of available steps. In the future, administrators will be able to enable and disable execution of these steps from this panel.

As you can see, INK has come a long way from a proof-of-concept and we’re excited about what it can bring to the domain.

We are currently working on the following features:

  • downloadable log and report generation
  • single step execution (currently steps are nested in recipes)
  • synchronous execution
  • HTTP recipe parameters
  • HTTP step parameters
  • semantic tagging of outputs

Please get in touch if you’re interested in finding out more, or in working with us to improve INK, implement it, or build and share steps! INK 1.0 is due by the end of June!

ORCID iDs @ Temple


Last year on the blog, we introduced ORCID, a non-profit organization that provides persistent, unique identifiers to researchers across the globe. ORCID iDs help ensure that researchers get credit for all their scholarly contributions.

While there are a number of different researcher identifiers out there (including ResearcherID and Scopus Author ID), we recommend that all Temple researchers register for an ORCID iD. It’s free and takes less than a minute to sign up.
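Incidentally, an ORCID iD is not an arbitrary string: its final character is a check character computed with the ISO 7064 11,2 algorithm, which lets systems catch mistyped iDs. A minimal Python sketch, checked against the well-known sample iD 0000-0002-1825-0097:

```python
def orcid_checksum(base_digits: str) -> str:
    """Compute the final check character of an ORCID iD (ISO 7064 mod 11-2).

    `base_digits` is the first 15 digits of the iD with hyphens removed.
    """
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

# Sample iD 0000-0002-1825-0097: the computed check character
# should match its final digit, 7.
print(orcid_checksum("000000021825009"))  # -> 7
```

This is also why some iDs end in “X”: it stands in for a checksum value of 10.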

There are currently 3,364,764 live ORCID iDs. Sixteen publishers, including the American Chemical Society, PLOS, and Wiley, now require that authors submit ORCID iDs at some point in the publication process. And if you think ORCID is just for scientists, you’re wrong. Cambridge University Press has begun integrating ORCID iDs into their book publishing workflows, and Taylor & Francis is currently undertaking a pilot project to integrate ORCID iDs into their humanities journals.

Researchers can use their ORCID iD profile to highlight their education, employment, publications, and grants. They can even add peer review activities. The American Geophysical Union, F1000, and IEEE are just three of the organizations that currently connect with ORCID to recognize the work of their peer reviewers.

In order to get a better sense of who is using ORCID at Temple, we looked for researchers with publicly available ORCID profiles who note “Temple University” as their current place of employment. We found 205 ORCID iDs matching these criteria. Of those, the Lewis Katz School of Medicine has the highest number of researchers with ORCID iDs at Temple. The College of Science and Technology has the second highest number, with faculty from Physics, Chemistry, and Biology particularly well represented. The College of Liberal Arts has the third-highest number of ORCID iDs, thanks in large part to the Psychology department. A handful of researchers in the Fox School of Business, the College of Engineering, and the College of Education have also signed up for ORCID iDs. The overwhelming majority of researchers with ORCID iDs at Temple are faculty members. Some postdoctoral fellows have ORCID iDs, but very few graduate students do.

Because filling out one’s ORCID iD profile is optional, and profiles can also be set to private, our data is incomplete, and probably underestimates the true number of individuals at Temple with ORCID iDs. Nonetheless, it is exciting to see that researchers in almost all of Temple’s schools and colleges have signed up for ORCID iDs. We’re confident that this number will continue to grow in the future.

Temple Libraries is proud to be an institutional member of ORCID.

SciLite – an open annotation platform for sustainable curation

Scientific publications are the main medium for sharing scientific results and assertions supported by observational data. Consequently, bioinformatics resources depend on the research literature to keep their content updated; a task carried out by curators, who extract information from articles and transfer the essential facts to the corresponding resources.

The advances made in high-throughput technology have resulted in tremendous growth of biological data, increasing the number of research papers being published. This poses a great challenge for manual curation, which relies on finding the right articles and assimilating the facts described in them. Services that support researchers and curators in browsing content and identifying key biological concepts with minimal effort would therefore benefit the community.


What is SciLite?


We at the literature services group at EMBL-EBI host Europe PMC, a database for life science literature and a partner in PubMed Central International. Europe PMC hosts a large variety of content and provides free access to over 32 million abstracts (27 million from PubMed) and 4 million full-text articles.

Our goal is to develop Europe PMC as an open community platform for new developments that improve our interaction with the scientific literature. As part of this effort we have recently launched a new Europe PMC tool, SciLite, which we present in our Software Tool Article published on Wellcome Open Research. SciLite presents an opportunity for text miners to showcase their work to a wider public, exposing text-mined annotations and deep links to related data for a wide audience of scientists and curators, as well as other interested stakeholders.


How does SciLite work?


SciLite links text-mined annotations from the literature to the corresponding data resource and highlights those outputs on full-text articles and abstracts in Europe PMC. Using the checkboxes on the right-hand side of article pages, readers can select the types of concepts they are interested in, and matching annotations will be highlighted on the article text. Clicking a highlighted term opens a popup with information about the annotation, such as a link to the related database record and the source of the annotation.
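As a rough illustration of the display side (not SciLite’s actual implementation), highlighting annotated terms can be sketched in a few lines of Python: each matched term is wrapped in a tag carrying the link to its database record. The term, snippet, and link here are invented for illustration.

```python
import re

def highlight(text: str, annotations: dict) -> str:
    """Wrap each annotated term in a <mark> tag carrying its linked resource."""
    for term, link in annotations.items():
        text = re.sub(
            rf"\b{re.escape(term)}\b",
            f'<mark data-link="{link}">{term}</mark>',
            text,
        )
    return text

snippet = "Mutations in BRCA1 are linked to breast cancer."
annos = {"BRCA1": "https://www.uniprot.org/uniprot/P38398"}
print(highlight(snippet, annos))
```

The reader sees the highlighted term; the attached link is what makes the popup’s jump to the underlying data record possible.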


What types of annotations are available?


SciLite annotates articles by identifying concepts, such as gene/protein names, organisms, diseases, Gene Ontology terms, chemicals, and accession numbers, as well as biological events (e.g. phosphorylation). The latter annotations are provided by the National Centre for Text Mining. SciLite also displays gene function annotations (GeneRIF – Gene Reference into Function) contributed by the Bibliomics and Text Mining group at the University of Applied Sciences, Geneva.


Are all annotations correct?


Although text-mining algorithms have greatly improved over the years and are being actively used in real-world applications, inaccuracies do occur. To counteract this we have introduced a user-driven mechanism to refine the annotations: while reading a paper, users of Europe PMC can validate an annotation or report it as erroneous. Such feedback ensures the quality of the provided annotations and improves the text-mined outputs.


How is SciLite useful?


For the reader, SciLite makes it easy to skim articles, focusing on highlighted terms and concepts to quickly understand what a given article is about. Those annotated entities are linked to the corresponding resources, so the reader can get to the underlying data in a straightforward way. In addition, SciLite could be useful for finding related concepts in the text, as annotations highlighted in close proximity might signal a functional relationship between those terms, e.g. a gene–disease association.


What are the future plans for SciLite?


We believe SciLite has the potential to further enhance the reading experience of scientific articles through applications that improve full-text searching, filtering and integration with biological data. We have taken an initial step towards this for Protein Data Bank (PDB) accession numbers with the BioJS application: for a given PDB accession number it fetches the coordinate information and displays the corresponding 3D molecular structure, serving as an interactive visualiser. Similar applications could be developed to display relevant information for other annotation types in the context of the article.


How can you contribute to SciLite?


We encourage text-mining and other associated communities to share their annotations on the SciLite platform, and we have set up a participation page to help interested groups submit annotation data. Furthermore, the annotations on SciLite are modelled on the Web Annotation Data Model specification, and the open nature of the format allows other platforms, such as journal publisher websites or other content aggregators, to fetch these annotations from SciLite and display them on their own resources.
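To give a flavour of the format, a SciLite-style annotation expressed in the W3C Web Annotation Data Model might look like the following Python sketch. The body and target values (the UniProt record, the article ID, the quoted text) are invented for illustration; only the overall structure follows the specification.

```python
import json

# Sketch of one annotation in the W3C Web Annotation Data Model:
# the "body" is what the annotation says (a linked database record),
# the "target" is where it applies (a quote within an article).
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": {
        "id": "https://www.uniprot.org/uniprot/P38398",
        "type": "SpecificResource",
    },
    "target": {
        "source": "https://europepmc.org/article/MED/12345678",
        "selector": {
            "type": "TextQuoteSelector",
            "exact": "BRCA1",
            "prefix": "Mutations in ",
            "suffix": " are linked",
        },
    },
}
print(json.dumps(annotation, indent=2))
```

The `TextQuoteSelector` (the exact quote plus surrounding context) is what lets any consuming platform re-anchor the highlight in its own copy of the article text.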


If you would like to find out more about developments at Europe PMC, follow @EuropePMC_news on Twitter.
