As a publisher, we want to meet the needs of researchers and academics who interact with our site to ensure that we offer the best user experience and functionality possible. With this in mind, we began a project to facilitate text and data mining on emeraldinsight.com, following requests from individuals and institutions in recent months.
So what is text and data mining, or TDM as it is often abbreviated? It is the analysis of large bodies of work by a machine, to try to identify trends that would not ordinarily be picked up through usual ‘human’ reading. For example, processing the data contained in a large collection of scientific papers in a particular medical field could suggest a possible link between a gene and a disease, or between a drug and an adverse effect. These are connections that a human reader would be unlikely to piece together, even after reading thousands of articles.
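To make the idea a little more concrete, here is a minimal Python sketch of one very simple mining technique: counting how often pairs of terms appear in the same document. It is only an illustration, not a description of how any particular mining tool works. The function name, the term names (`gene_a`, `disease_x`, `drug_b`) and the three toy abstracts are all invented for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, terms):
    """Count how many documents mention each pair of terms together."""
    counts = Counter()
    # Sort the terms so every pair is always stored in the same order.
    lowered_terms = sorted(term.lower() for term in terms)
    for text in documents:
        # Naive tokenisation: lowercase and split on whitespace.
        words = set(text.lower().split())
        present = [term for term in lowered_terms if term in words]
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Invented toy abstracts; a real corpus would hold thousands of articles.
abstracts = [
    "Variant in gene_a observed in patients with disease_x",
    "gene_a expression elevated in disease_x cohort",
    "drug_b trial reports no link to disease_x",
]

pairs = cooccurrence_counts(abstracts, ["gene_a", "disease_x", "drug_b"])
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs that co-occur far more often than chance would suggest are the kind of signal a researcher might then follow up by reading the underlying articles.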
With so much amazing content on our site, enabling this functionality was an obvious decision. We hope that it will spark further ideas and research, and perhaps even change the world! Okay, we are maybe getting a bit ahead of ourselves, but it is still a good thing that it is now available.
Having investigated a number of options for how we could do this, we decided on a solution that uses CrossRef’s TDM facility. This meant adding extra data to current and future deposits with CrossRef, along with depositing a huge tranche of historical information. So far, we have provided data for over 200,000 articles, and this number will continue to grow over the coming weeks. We have also enabled access to an equivalent number of machine-readable files on our site.
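For researchers wondering how this deposited data might be used in practice, the following is a rough Python sketch. It assumes the article metadata has already been retrieved from Crossref's REST API as a work record, in which full-text locations appear in a `link` list and entries meant for miners carry an `intended-application` value of `text-mining`. The helper name `text_mining_links` and the sample record, including its DOI and URLs, are made up for illustration and do not point at real Emerald content.

```python
def text_mining_links(work):
    """Return (url, content_type) pairs flagged for text mining in a work record."""
    return [
        (link["URL"], link.get("content-type", "unspecified"))
        for link in work.get("link", [])
        if link.get("intended-application") == "text-mining"
    ]


# Hypothetical record shaped like the "message" part of a Crossref works response.
sample_work = {
    "DOI": "10.0000/example",
    "link": [
        {
            "URL": "https://example.org/fulltext/10.0000/example.xml",
            "content-type": "application/xml",
            "content-version": "vor",
            "intended-application": "text-mining",
        },
        {
            "URL": "https://example.org/doi/10.0000/example",
            "content-type": "unspecified",
            "content-version": "vor",
            "intended-application": "similarity-checking",
        },
    ],
}

links = text_mining_links(sample_work)
for url, content_type in links:
    print(content_type, url)
```

Fetching the files behind those links is still subject to the access conditions on the publisher's site, so a sketch like this only covers the discovery step.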
Users wishing to mine the site are encouraged to inform us of their intention to do so, so that they are not automatically blocked by our system. The usual access restrictions also remain in place, so a user will still have to be a subscriber to the content. But aside from those minor caveats, we encourage our users to use the facility and mine for that one diamond of information that is just waiting to be discovered.