While Text Data Mining (TDM) is not completely unheard of within librarianship, it was a very unfamiliar area to me and two other MSc Digital Library Management students at the University of Sheffield. We were tasked with exploring this area and how the library could support its growing popularity across disciplines.
TDM is a way of analysing data computationally. It can be used to look for themes and sentiment within documents or to compare documents’ word usage or sentence structure to determine similarity.
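To make the idea of comparing documents' word usage more concrete, here is a minimal sketch in Python. It is an illustration only, not the method used by any of the researchers we interviewed: it assumes a very simple approach (counting words and measuring cosine similarity between the counts), and the function names and sample sentences are made up for this example.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count how often each word appears, ignoring case and punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(counts_a, counts_b):
    """Return a score from 0.0 (no shared words) to 1.0 (same word proportions)."""
    shared_words = set(counts_a) & set(counts_b)
    dot_product = sum(counts_a[w] * counts_b[w] for w in shared_words)
    length_a = math.sqrt(sum(n * n for n in counts_a.values()))
    length_b = math.sqrt(sum(n * n for n in counts_b.values()))
    if length_a == 0 or length_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot_product / (length_a * length_b)


# Hypothetical sample documents, invented for illustration.
doc_one = "Libraries support researchers with access to publications."
doc_two = "Researchers need access to publications from libraries."
doc_three = "Social media posts can reveal public sentiment."

if __name__ == "__main__":
    print(cosine_similarity(word_counts(doc_one), word_counts(doc_two)))
    print(cosine_similarity(word_counts(doc_one), word_counts(doc_three)))
```

The first pair of sentences shares most of its vocabulary, so it scores higher than the second pair, which shares no words at all. Real TDM projects usually go further, for example by removing very common words or weighting rare ones more heavily.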
Why is TDM Important?
Scholarly publications are increasing at an overwhelming rate. TDM has helped the researchers we interviewed cope with information overload by letting them examine increasingly large amounts of information in new ways. The ability to examine huge data sets has also enabled the study of social media data, which would otherwise have been vastly time-consuming or simply impossible to analyse.
Who Uses TDM?
Through our interviews we found researchers using TDM in all five of the University of Sheffield’s subject faculties, including Mark Clowes, Information Specialist at ScHARR. These methods are being used widely, well beyond computer science. However, the researchers we interviewed often spoke of needing programming or statistical knowledge to exploit the technology to its fullest extent.
How Could an Academic Library Support TDM?
Academic libraries already host information and digital literacy skills programs, and maintain publisher connections and content collections. In addition, they have copyright specialists and subject-neutral spaces. These key assets could help researchers access the information they need and counter the legal challenges of TDM, supporting its growth.
A Practical Class Project
Erica, Bálint and I decided to release this report into the wild on the recommendation of our supervisor, Dr Andrew Cox, and of our interview participants, many of whom found the end result of interest.
- Sheldon Korpet is an Information Officer in the School of Health and Related Research, University of Sheffield.
- Dr Erica Brown is working in Scholarly Communications at the University of Manchester.
- Bálint Csöllei is a Freelance Information Professional.