Friday, 6 October 2017

Should Academic Libraries offer a policy or service for Text Data Mining?

Sheldon Korpet (Information Officer in ScHARR Library) reports on a Masters research project she undertook for the University of Sheffield Library.

While Text Data Mining (TDM) is not completely unheard of within librarianship, it was a very unfamiliar area to me and to two other MSc Digital Library Management students at the University of Sheffield. We were tasked with exploring this area and how the library could support its growing popularity across disciplines.

Image of Sheldon Korpet
Download the report
What is TDM? 

TDM is a way of analysing data computationally. It can be used to look for themes and sentiment within documents or to compare documents’ word usage or sentence structure to determine similarity.
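
To make the idea of comparing word usage a little more concrete, here is a minimal sketch in Python. It is only an illustration and not the method used by any of the researchers we interviewed: the function names and the three sample sentences are made up for this example, and cosine similarity over raw word counts is just one simple way of scoring how alike two texts are.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count how often each word appears."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Score word-usage overlap from 0.0 (no shared words) to 1.0 (same usage)."""
    a, b = word_counts(text_a), word_counts(text_b)
    # Multiply the counts of every word the two texts have in common.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with anything
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, invented for this illustration.
doc_1 = "Libraries support researchers who mine text."
doc_2 = "Researchers mine text and libraries support them."
doc_3 = "The weather in Sheffield was rainy today."

print(cosine_similarity(doc_1, doc_2))  # shares most of its words with doc_1
print(cosine_similarity(doc_1, doc_3))  # shares no words with doc_1
```

The first pair scores higher than the second because the two sentences share most of their vocabulary. Real TDM projects typically go further, for example by removing very common words or weighting rarer ones more heavily.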

Why is TDM Important?

Scholarly publications are increasing at an overwhelming rate. TDM has helped the researchers we interviewed cope with information overload by letting them examine large amounts of information in new ways. The ability to examine huge data sets has also enabled the study of social media data, which would have been vastly time-consuming or simply impossible to analyse by hand.

Who Uses TDM?

Through our interviews we found researchers using TDM in all five of the University of Sheffield’s subject faculties, including Mark Clowes, Information Specialist at ScHARR. These methods are being used widely, well beyond computer science. However, the researchers we interviewed often spoke of needing programming or statistical knowledge to exploit the technology to its fullest extent.

How Could an Academic Library Support TDM?

Academic libraries already host information and digital literacy skills programmes, and maintain publisher connections and content collections. In addition, they have copyright specialists and subject-neutral spaces. These key assets could help researchers access the information they need and overcome the legal challenges of TDM, supporting its growth.

Read the report to learn what we recommended the University of Sheffield Library could do to support TDM in its institution.

A Practical Class Project

Erica, Bálint and I decided to release this report into the wild on the recommendation of our supervisor, Dr Andrew Cox, and our interview participants, many of whom found the end result of interest.

Images of Sheldon, Erica and Bálint
Left to right: Sheldon, Erica and Bálint

  • Sheldon Korpet is an Information Officer in the School of Health and Related Research, University of Sheffield.

  • Dr Erica Brown is working in Scholarly Communications at the University of Manchester.

Useful resources

Fact Sheet: Text Mining — NLPN
Text & Data Mining — University of Cambridge Library LibGuide

