
Machine Reading the Archive - programme overview

A digital methods development programme organised by Cambridge Digital Humanities Network and the Cambridge Digital History Programme with support from the Isaac Newton Trust. Programme outline and registration information below.

Machine Reading the Archive brings together humanities researchers, archivists and computer scientists to explore the challenges of working with archives in the digital age. Through a series of reading group sessions, practical workshops, technical demonstrations, field trips and a one-day end-of-programme workshop, we hope to seed new collaborations and encourage the exchange of ideas and practices across professions and disciplines.

The programme is born out of a recognition that the practice of making, curating and using archives has been changed by the adoption of digital technologies, at both an institutional and individual level. Archives and library special collections are developing new roles as platforms for different kinds of data, held in a variety of formats from XML to PDFs and TIFFs, rather than as physical containers for people, books and documents. Many researchers return from visits to the archive (or the archive’s website) having filled hard drives with collections of digital photographs of rare books, documents, manuscripts, maps, pictures and objects of scholarly interest whose fragility and immobility required the production of a digital copy. The digital archive thus seeds new private sub-collections on researchers’ laptops and tablets. At times these are a promising and overwhelmingly rich resource; at other times they remain invisible and inaccessible, while growing in scale and complexity over the trajectory of a scholarly life.

The primary aim of Machine Reading the Archive is to help participants develop a deeper understanding of the challenges and possibilities of working with archival data in the digital age, drawing on theory, methods and practice from the humanities, computer science and the archival profession. The programme provides a chance to develop skills to engage with existing digital archives in new ways, to turn a cluttered hard drive of archival photographs into a refined dataset, or to embark on text-mining to reveal new aspects of existing research or lay the groundwork for prospective projects.

In addition to giving participants the chance to learn practical skills and experiment with digital methods using their own or provided datasets, the course is designed to prompt reflection on the significance of the ways private and institutional digital archives are sorted, structured and accessed, and to discuss how these insular knowledge infrastructures influence writing, thinking and the development of research projects.

Joining the programme

Participants can follow the programme through two tracks. Track 1 is by registration, which provides priority access to booking programme workshops, training sessions and other events.

Sign up here for Track 1

Track 2 (Machine Reading the Archive Projects) will require a larger commitment to the programme. We will offer Track 2 participants a series of small group mentoring / peer-to-peer learning sessions to support them as they build their own project developing and using archive data. Track 2 is directed both at researchers who have already collected archival materials in digital form and want to find out about new ways of analysing their data, and at researchers who are interested in conducting research in digital archives but want to improve their skills and gain a broader understanding of computational methods in archival research before they begin. Track 2 participants will also be invited to give a short presentation in the final course workshop reflecting on what they have learned over the course of the programme.

MRtA Projects is currently closed for applications.


Registered participants must be PhD students or staff at the University and Colleges of Cambridge. However, some of our workshops, including the end-of-programme showcase, will also be open for public booking. If you are not a member of the University and would like to be notified of these opportunities, please sign up for the Digital Humanities mailing list here.

Pre-requisites and time commitment

There are no formal pre-requisites. Track 2 participants are asked to commit to around 4 small group sessions and to make a brief presentation about their project at the end-of-programme workshop in June 2018 (TBC).

Application process

Applicants for Track 1 should complete the online Track 1 Programme Registration form.

Apply for Machine Reading the Archive Projects (Track 2) using the online application form.


Sessions for 2017/8 will be advertised here and elsewhere on the Digital Humanities website shortly.


This year's programme will be organised into introductory sessions and advanced workshops. We hope to offer sessions on the following topics:

Introductory sessions:

  • Digital research project design for beginners
  • How to clean up your messy data (introduction to OpenRefine)
  • How to turn your PDFs into searchable text (introduction to simple OCR tools)
  • Build and publish a simple digital archive or collection (introduction to Omeka)

  • Network analysis in the digital archive
  • Curating your own digital archive (principles of metadata creation, version control, and database construction)
  • Text-mining the archive – an introduction

Advanced workshops will be announced during the course of the year. This year we are excited to announce workshop collaborations with two external partners, The National Archives and the Transkribus project.

We are a network of researchers at the University of Cambridge who are interested in how the use of digital tools is transforming scholarship in the humanities and social sciences. This transformation spans both the content and practice of humanities research, as the diffusion of digital technologies opens up new fields of study and generates research questions which breach traditional disciplinary boundaries.
