OpenCitations are pleased to announce the launch of the Initiative for Open Citations (I4OC), which brings fresh momentum to efforts in the scholarly publishing world to open up data on the citations that link research publications. OpenCitations are proud to be a founder of I4OC, and we encourage those remaining publishers whose journal article reference lists are still closed to embrace this sea change in attitude towards open citation data. The other I4OC founding organizations are the Wikimedia Foundation, PLOS, eLife, DataCite, and the Centre for Culture and Technology at Curtin University.
Until recently, the vast majority of citation data were not openly available, even though all major publishers freely share their article metadata through Crossref. Before I4OC started, only about 1% of the reference data deposited in Crossref were freely available. Today that figure has jumped to 40%.
In recent months, following earlier indications of willingness reported in this blog, several publishers have made the decision to release these metadata publicly, including the American Geophysical Union, Association for Computing Machinery, BMJ, Cambridge University Press, Cold Spring Harbor Laboratory Press, EMBO Press, Royal Society of Chemistry, SAGE Publishing, Springer Nature, Taylor & Francis, and Wiley. These publishers join others who have been opening their references through Crossref for some time. The full list of scholarly publishers now opening their reference data via Crossref is given in the notes at the end of this post.
These decisions stem from discussions that have been taking place since a call-to-action to open up citations was made by Dario Taraborelli of the Wikimedia Foundation at the 2016 OASPA Conference on Open-Access Publishing. The creation of I4OC was spearheaded by Jonathan Dugan, Martin Fenner, Jan Gerlach, Catriona MacCallum, Daniel Mietchen, Cameron Neylon, Mark Patterson, Michelle Paulson, Silvio Peroni, myself and Dario Taraborelli. The purpose of I4OC is to coordinate these efforts and to promote the creation of a comprehensive, freely-available corpus of scholarly citation data.
Such a corpus will be valuable for new as well as existing services, and will allow many more interested parties to explore, mine, and reuse the data for new knowledge. The key benefits that arise from a fully open citation dataset include:
- The establishment of a global public web of linked scholarly citation data to enhance the discoverability of published content, both subscription access and open access. This will particularly benefit individuals who are not members of academic institutions with subscriptions to commercial citation databases.
- The ability to build new services over the open citation data, for the benefit of publishers, researchers, funding agencies, academic institutions and the general public, as well as to enhance existing services.
- The creation of a public citation graph to explore connections between knowledge fields, and to follow the evolution of ideas and scholarly disciplines.
The Internet Archive, Mozilla, the Wellcome Trust, and twenty-eight other projects and organizations have formally put their names behind I4OC as stakeholders in support of openly accessible citations. The full list of stakeholders is given in the notes at the end of this post.
Dario Taraborelli, Head of Research at the Wikimedia Foundation, said:
“Citations are the foundation for how we know what we know. Today, tens of millions of scholarly citations become available to the public with no copyright restriction. We look forward to more organizations joining this initiative to release, and build on these data.”
Liz Ferguson, VP Publishing Development, Wiley, said:
“Wiley is delighted to support I4OC by opening our citation metadata via Crossref. Collaborating with other publishers further contributes to sustainable and standardized infrastructure that will benefit the research community. We are particularly excited by the potential to expose networks of research that would otherwise lie hidden or take years to discover.”
Robert Kiley, Head of Open Research at the Wellcome Trust, said:
“The open availability of citation data will help all funders better evaluate the research they fund. The progress that I4OC has made is an essential first step and we encourage all publishers to publicly share this data.”
Mark Patterson, Executive Director of eLife, said:
“It’s fantastic to see the interest that’s being shown by so many publishers in making their reference list metadata publicly available. We hope that this new momentum will encourage all publishers to follow suit, and that new services and tools can be built around these open data.”
Catriona MacCallum, Advocacy Director, PLOS, said:
“Creating an open database of citations will allow researchers to perform independent analyses of how scientific ideas are communicated through article citations, and a transparent way of tracking the influence of particular articles. By opening up these metadata via Crossref, publishers are providing a vital contribution to Open Science.”
Many other publishers have expressed interest in opening up their reference data. They can do this very easily via Crossref, with a simple email to firstname.lastname@example.org requesting they turn on reference distribution for all their DOI prefixes. This is required even for publishers of open access articles, since by default references submitted to the Crossref Cited-By Linking service are closed, as previously explained here. I4OC will provide regular updates on the growth of the public citation corpus, on how the data are being used, on additional stakeholders and participating publishers as they join, and on new services as they are developed.
I4OC and OpenCitations
Through the efforts of I4OC, scholarly citation data will be increasingly available to any interested party through all of Crossref’s Metadata Delivery Services, including the REST API and bulk metadata dumps. From this open source, OpenCitations will progressively import the citation data into the OpenCitations Corpus, describe them using the SPAR Ontologies according to the OCC metadata model, and make them available in RDF under a Creative Commons public domain dedication as Linked Open Data. Potential users should be aware that it will take some considerable time before all the new citation data now available via the Crossref API are ingested into the OpenCitations Corpus.
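For readers who want to look at the open references directly, the following is a minimal sketch of reading them through the Crossref REST API, not a description of how OpenCitations performs its own import. It assumes that a work record is served at https://api.crossref.org/works/ followed by the DOI, and that open references appear as a "reference" list inside the record's "message" object, with a "DOI" field on those entries that Crossref has matched to a DOI. The function names are hypothetical and chosen only for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the Crossref REST API "works" route.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def works_url(doi):
    """Build the request URL for one DOI, percent-encoding unsafe characters."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def cited_dois(message):
    """Return the DOIs found in the 'reference' list of a work record.

    Entries without a 'DOI' field (for example, unmatched references)
    are skipped. A record with no 'reference' list yields an empty list.
    """
    return [ref["DOI"] for ref in message.get("reference", []) if "DOI" in ref]


def fetch_cited_dois(doi):
    """Fetch one work record over the network and list the DOIs it cites."""
    with urlopen(works_url(doi)) as response:
        record = json.load(response)
    return cited_dois(record["message"])
```

Calling fetch_cited_dois with the DOI of an article whose publisher has opened its references should return the DOIs that article cites. An empty list may mean either that the references are still closed or that none of them carries a DOI, so it should not be read as proof that the article cites nothing.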
 40% is the percentage of publications with open references out of the total number of publications with reference metadata deposited with Crossref. As of March 2017, nearly 35 million articles with references are deposited with Crossref.
 Full list of publishers now making their citation data open via Crossref.
 Full list of I4OC supporting stakeholder organizations.