We will be building a search engine using data from the British Library Labs' 19th Century Book collection. Led by British Library Labs developer Ben O'Steen, this 3-hour workshop will give participants an understanding of how search engines work from the inside.
Attendees will load some texts from the British Library's digitised book collection, which is largely from the 19th century, into a search engine to explore the problems, opportunities and assumptions involved in creating such a service. The session will use Elasticsearch, Python, Git and Notepad++. The aim is to step people through the challenges and compromises required to build something that appears as simple as a Google search service, and to explore a few ways to tailor it to specific needs. It involves dealing with XML and the quality of real-world data, and using Python code to load data into Elasticsearch and query it.
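As a rough illustration of the kind of task involved, the sketch below parses a small XML sample and builds a toy in-memory index that can then be queried. It is not the workshop material and does not use Elasticsearch: the element names (`books`, `book`, `title`, `text`), the sample records and the function names are all hypothetical, and the real collection's XML will look different.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample records; the real collection's XML schema will differ.
SAMPLE_XML = """
<books>
  <book id="b1">
    <title>A Tour in Scotland</title>
    <text>The hills of Scotland were covered in mist.</text>
  </book>
  <book id="b2">
    <title>London Streets</title>
    <text>The streets of London were crowded and loud.</text>
  </book>
</books>
"""


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(xml_string):
    """Map each word to the set of book ids whose title or text contains it."""
    index = defaultdict(set)
    root = ET.fromstring(xml_string.strip())
    for book in root.iter("book"):
        book_id = book.get("id")
        title = book.findtext("title") or ""
        body = book.findtext("text") or ""
        for word in tokenize(title + " " + body):
            index[word].add(book_id)
    return index


def search(index, query):
    """Return the ids of books that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(SAMPLE_XML)
print(sorted(search(index, "scotland")))
print(sorted(search(index, "the streets")))
```

Even this toy version raises the sort of questions the session explores: how to split text into words, whether very common words such as "the" should count, and what to do when the source data is inconsistent.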
No technical knowledge is required as a prerequisite, but spaces are strictly limited and the focus of this workshop will be on practical application of the ideas.
Book early to secure a place. University of Cambridge researchers and students have priority for bookings, but please contact us if you are from outside the University and would like to attend.
Book online here