The Linking Lives project aims to deliver:
- An end-user interface that provides a means to integrate archival data with other information sources
- Blog posts that share our progress and reflect on the work
- Reusable software outputs for manipulating RDF and formatting within Web pages
- An evaluation report
- Documentation setting out the data sources and relationships behind the interface
You can read more about it on our ‘About Us’ page.
First things first. We currently have a Linked Data store with a small amount of Archives Hub data. We need to expand this considerably. Our aim is to provide a substantial amount of the Hub data, preferably the entire data set, as Linked Data, so that it can become part of the Linking Lives interface.
We have already consulted Hub contributors about our Linked Data work, but in order to really expand the data set, we need to make it very clear to them what we want to do. The Archives Hub is an aggregation of data from over 200 archives across the UK, so we are in a unique situation, and we want to work with archivists to move the community towards an open data agenda. It is vital for us to show our contributors that we are working on their behalf, and that they will be fully informed about our plans and progress.
We feel that it is important to give the data an explicit licence, preferably a completely open licence. That way we don’t put any barriers in the way of its potential reuse. I was recently at the Europeana Tech conference in Vienna, and the dominant theme of the conference was the fundamental importance of open data. One observation that struck me was the conclusion from Europeana participants that it is better to put less data out under an open licence than to put more data out but compromise with a complex and/or restrictive licence. Some of the Archives Hub contributors have been concerned about commercial exploitation. It is worth looking at Jill Cousins’ presentation on this. She argues that even a non-commercial licence means that you are substantially restricting the potential of the data. It can’t be used on any sites or cultural blogs that demonstrate any commercial activity, and it can’t be used with Wikipedia or by commercial companies that might generate income for partners.
We need to bring Hub contributors on board with this vision, and to do this we sent out an email to all contributors outlining our proposal and asking that they let us know if they do not want to participate.
In the email I did the following:
1) Set out the benefits of Linked Open Data
2) Described the Linking Lives proposal
3) Referred to the potential for us to be involved with the US-based ‘SNAC’ project. This is not a Linked Data project, but it is creating name authority files using the archival standard of EAC-CPF, and I wanted to show that we are working on different fronts with the aim of improving access to archives. I do think it’s worth giving this kind of context; showing that services like the Hub are working in different ways on behalf of archives to promote understanding and use of primary source material.
4) Set out the options for licensing, including the possibility of an attribution licence, although ideally we would still opt for a completely open licence and strongly promote best practice around attribution (and we are looking at named graphs with this in mind, as a means to ensure that the provenance of statements can be shown).
5) Emphasised that this is about the metadata, not the content. This may sound obvious, but it is an important distinction. The metadata is there to promote the collections. There are far more complex issues around open access to some collections, where there are legal issues around IPR.
6) Referred to some useful sources to read more about open data and some initiatives, such as Europeana and Discovery, that are fully behind an open data approach.
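For anyone wondering what the named graphs idea in point 4 might look like in practice, here is a minimal sketch in plain Python. It is only an illustration and not our actual implementation: each statement is stored with a fourth element, the named graph, which stands in for the contributing archive, so that the source of any statement can be looked up. All of the identifiers (the `example.org` IRIs, `Quad`, `provenance_of`, `statements_by_graph`) are hypothetical and are not real Archives Hub names.

```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    """One RDF-style statement plus the named graph it belongs to."""
    subject: str
    predicate: str
    obj: str
    graph: str  # named graph IRI, used here to record the contributing source


# Hypothetical placeholder namespace, not a real Archives Hub identifier scheme.
EX = "http://example.org/"

quads = [
    Quad(EX + "person/1", EX + "name", "Example Person", EX + "graph/archive-a"),
    Quad(EX + "person/1", EX + "associatedWith", EX + "collection/42", EX + "graph/archive-a"),
    Quad(EX + "person/1", EX + "birthYear", "1850", EX + "graph/archive-b"),
]


def provenance_of(subject, predicate, obj, store):
    """Return the named graphs (the sources) that assert a given statement."""
    return sorted(
        {q.graph for q in store if (q.subject, q.predicate, q.obj) == (subject, predicate, obj)}
    )


def statements_by_graph(store):
    """Group statements by named graph so each source's contribution is visible."""
    grouped = defaultdict(list)
    for q in store:
        grouped[q.graph].append((q.subject, q.predicate, q.obj))
    return dict(grouped)
```

Calling `provenance_of(EX + "person/1", EX + "birthYear", "1850", quads)` returns the graph for the second (hypothetical) archive, which is the kind of lookup that would let an interface credit the right contributor even when the data itself carries a completely open licence.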
I think the real potential of Linked Data is still difficult for people to grasp. I pointed to things like Tim Sherratt’s recent work, creating a narrative using the Web of Data, as this is a great way to demonstrate the possible uses of this structured data, and I also referred to established and respected institutions like the BBC leading the way with using different data sources and taking the risk of incorporating Wikipedia data on their site.
So far we have had two contributors asking to opt out of the Linked Data work, one very small archive and one large HE archive. We have also had some questions about what the work involves, questions that show a certain level of concern (as you would expect), albeit with an overall positive attitude towards open data. Maybe we need more explicit help with licensing archival data. There are a number of useful sources, such as the Licensing Open Data guide (PDF) available from the Discovery website, but it would be useful to have a document that specifically refers to opening up archival metadata, and maybe more information on the issues around data aggregations.
Several contributors have written to us to show their support, including the Universities of London, Dundee and Hull. We are very pleased that two of our biggest contributors, the University of Glasgow and John Rylands Library at the University of Manchester, have shown very strong support. We’re going to be adding their data to our triple store in the near future. They have large collection descriptions with thousands of component items, so that will be a good test for our stylesheet. Institutions like this have some great archives, and detailed descriptions that lend themselves to strong narratives, linking up people, places and events to create a whole host of different stories.
We are still working on exactly which licence to use for the Archives Hub data, but we are certain that it will be open, as this is vital to ensuring that we can truly connect data. As Edward L Ayers wrote, back in 1999: “Might history, which exists in symbiosis with large amounts of diverse evidence, be especially well-suited for the technology evolving around us?” (from History in Hypertext). I think that the answer is ‘yes’, and I think that Linked Data promises much if it really does become embedded in the Web.