Tuesday, 31 March 2015

Rachel Lammey - CrossRef Text & Data Mining Services: one year in

The background to this session:-

"The field of Text and Data Mining (TDM) is growing in importance with an increasing number of researchers interested in mining scholarly content. CrossRef Text and Data Mining Services launched in May 2014 and focuses on providing one common way to retrieve the full text of articles for the purposes of TDM for interested parties. This session will provide an introduction to and update on this service, and a short demonstration of it in action". 

"This is an introductory-level talk," said Rachel, and indeed this slide is great for people new to text & data mining.

This slide includes links to "Announcing the PLOS Text Mining Collection" and to "Text mining: what do publishers have against this hi-tech research tool?" by prolific science reporter/blogger Alok Jha.


Most in the room were already aware of CrossRef and its services, so Rachel skipped through the opening section of her slides, which can be found here.

Rachel explained the importance of DOIs.

CrossRef was set up to do things that publishers don't do, on an equal platform. It currently has 27 to 28 staff.

May 2014 was the launch of CrossRef's TDM services.

There are a lot of technical components involved across the TDM landscape, the industry and the publishers.

CrossRef built a cross-publisher API for TDM'ing.
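
To give a feel for what "one common way to retrieve the full text" can look like in practice, here is a minimal sketch in Python. It assumes the metadata record for a DOI carries a "link" list (with "URL", "content-type" and "intended-application" keys) and a "license" list, which is how the CrossRef REST API is generally documented; the sample record, its DOI and its URLs are entirely made up for illustration, and nothing here was shown in the talk.

```python
from typing import Any


def text_mining_links(work: dict[str, Any]) -> list[dict[str, str]]:
    """Return the full-text links flagged for text mining in a metadata record."""
    return [
        {"url": link["URL"], "content_type": link.get("content-type", "unspecified")}
        for link in work.get("link", [])
        # Assumption: publishers mark TDM links with this field value.
        if link.get("intended-application") == "text-mining" and "URL" in link
    ]


def license_urls(work: dict[str, Any]) -> list[str]:
    """Return the license URLs deposited alongside the record."""
    return [lic["URL"] for lic in work.get("license", []) if "URL" in lic]


# Hypothetical record, shaped like the assumed API response (not real data).
sample_work = {
    "DOI": "10.5555/example",
    "license": [{"URL": "https://example.org/licenses/tdm"}],
    "link": [
        {
            "URL": "https://example.org/fulltext/example.xml",
            "content-type": "application/xml",
            "intended-application": "text-mining",
        },
        {
            "URL": "https://example.org/fulltext/example.pdf",
            "content-type": "application/pdf",
            "intended-application": "similarity-checking",
        },
    ],
}

print(text_mining_links(sample_work))
print(license_urls(sample_work))
```

The point of the design is that a researcher only needs the DOI and one record format: the same two lookups work regardless of which publisher deposited the links.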

The live demo did not work as planned (the joys of doing a live demo), but we captured some of this section on camera, as you do.

Negotiations / Permissions 

API Token

Publishers can upload their own T&Cs to this service; they then get an API key.
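
As a rough sketch of how a token like this might be presented when asking for full text, the Python below builds the HTTP headers for such a request. The header name CR-Clickthrough-Client-Token is an assumption based on CrossRef's documentation of its click-through service and should be checked before relying on it; the function name and the token value are hypothetical, and no request is actually sent.

```python
def build_tdm_headers(token: str, accept: str = "application/xml") -> dict[str, str]:
    """Build request headers carrying an API token for a full-text request."""
    cleaned = token.strip()
    if not cleaned:
        raise ValueError("an API token is required")
    return {
        # Assumed header name; verify against the service documentation.
        "CR-Clickthrough-Client-Token": cleaned,
        # Ask for a machine-readable format rather than a web page.
        "Accept": accept,
    }


# Hypothetical token, for illustration only.
headers = build_tdm_headers("hypothetical-token-123")
print(headers)
```

These headers would then be attached to a request for one of the full-text URLs found in an article's metadata.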

A short video clip from some of the "Demo Version"

Rachel Lammey - CrossRef Text & Data Mining Services - DEMO from Graham Steel on Vimeo.


Rachel concluded her talk by covering the benefits of TDM: "Over 14 million articles with full-text links and license information deposited".

Importantly, this API is open source and free to play around with on GitHub, as are the likes of ContentMine.

