Designing an Interface: some first thoughts

One of the aims of the Linking Lives project is to demonstrate the value of Linked Data through the creation of an end-user interface that pulls in content from the Hub Linked Data, including the external data sets we are linking to. The Linking Lives interface will be part of the Archives Hub service, that is to say, available from within the Hub website. We will present it as a beta service; something that is usable and useful, but also in a state of development. With the provision of this interface, we can start to build up an understanding of how valuable this type of name-based resource is for researchers. We will be able to monitor use as well as carry out an evaluation to ask researchers what they think of the site. This is far preferable to positing benefits based upon potential, which is tending to happen too much with Linked Data at present.

This post is written from a non-technical perspective and covers a few of the areas that we are currently thinking about, as we start to set out our interface design.

Priorities

We will be concentrating on development of the interface, rather than prioritising scale for this project: quality rather than quantity you might say, although we expect to have some thousands of records included. This is partly pragmatic: we are still finding it challenging to integrate EAD data (Archives Hub descriptions) into our Linked Data, owing to inconsistencies and sometimes problematic content. The problems that we face with variable data are ongoing, and maybe highlight a basic issue with Linked Data: it works best with consistent data-centric information, and not so well with archival descriptions, built up over decades, many created before there were any standards at all to adhere to. However, on the positive side, our Linked Data work has enabled us to highlight and deal with many data issues, which is beneficial in the long run for any data processing that we might do (or that others might do).

Our focus for this project is on the Linking Lives pages themselves, and what researchers can access from there, so we will not be prioritising the creation of different search options into the data: this would be a next stage, once we get a clearer idea of the use of the interface.

Archives Hub Branding and Navigation

We want Linking Lives (LL) to be recognisably part of the Hub, although it would be premature to try to fully integrate the two. As yet, we don’t know how users will respond to what we are proposing, and we need to evaluate what we are doing before taking it further into service. We are carrying out an evaluation as part of the project: we will be asking a small group of researchers questions about the current Hub interface, and following this up with some focus group work to get reactions to our new LL interface. This will help us in understanding user requirements.

Linking Lives will be an interface available within the Archives Hub site, but we propose to incorporate data other than archival descriptions within the page. This does raise questions about the clarity of what we are doing and the balance between the different data sources. If we strongly brand the page as Archives Hub, will researchers expect to access just archives, and not other information resources? Will they assume all of the sources are held by us, or that we are responsible for them? If we include the basic Hub navigation at the top of the page, will that actually confuse users, as they may click on links that take them into the main Hub search without realising that LL and the Hub are somewhat different?

We are looking at creating a sub-brand of the Hub as a possible way to identify LL as part of the Hub, but still distinct from it to some extent. This may help to distinguish between the two different applications. We will use the basic Hub logo, but modify it to signify something different. We do want to keep the links between the two, as we believe that researchers will benefit from this, and we do want to bring archives and other data sources together to provide a fuller context, and not make them too distinctly separate. The idea is to enable researchers to move seamlessly from archives described within the Hub to other resources, and to take a fairly bold approach to integration; otherwise we will not get the benefits we are after. I am somewhat reminded of The National Archives’ initiative called ‘Your Archives’, a wiki for community content that does seem to have remained rather separate from the main TNA catalogues, and maybe that has been to its detriment in terms of profile and use (I often have trouble finding links to Your Archives from within TNA’s website).

Broad Appeal

The LL interface, like the Hub itself, will not be aimed at subject specialists or expert users. It will primarily be aimed at academic researchers, but is intended to appeal to a broad audience: anyone who might be interested in undertaking research. This means that we need to avoid making assumptions about knowledge. Our ‘designated community’ may not have prior knowledge of archives and certainly won’t have knowledge of Linked Data. So they may not know how archives are organised, what an archival ‘biographical history’ is, what an archival creator is, or what ‘same as’ links are between different data sources.

Our aim, therefore, is to incorporate these things in a way that makes sense and makes the person the primary focus of the page, so that it is easy to see that a page is about George Bernard Shaw, for example, and it provides life dates, descriptive information, biographical information, an image or two, aliases for the same person, etc. It is information you might expect to find, or information that makes sense within the context of a page about a person. At the same time we are keen to ensure that we capture provenance, and so this adds another dimension. Starting to include the source of each piece of information could clutter the screen and so we will need to think about how best to incorporate it. We believe that it will be important to some users, as it could have implications for the quality and accuracy of the data. It is something we would be pleased to see others do for our data, if they were presenting it in a Web interface.

The BBC Example

Our interface will combine content from different sources. We would like to draw in content, in a similar way to the BBC (on the BBC page for Stevie Wonder you can see how the Wikipedia biography is pulled into the page). The BBC page pulls in some of the Wikipedia biography, and provides a link to go to Wikipedia and read more. This helps to make clear that the information comes from elsewhere. With MusicBrainz, another Linked Data source, the BBC provide a link to the MusicBrainz site, but also, further down their page, they state: “Links & information come from MusicBrainz. You can add or edit information about Stevie Wonder at musicbrainz.org.” The information includes personal and business relationships, such as ‘child of’ and ‘collaborated on’.

On the BBC page, the Wikipedia information is more clearly labelled as being from that source; the MusicBrainz information is also identified, but in a less obvious way. For MusicBrainz, however, they not only declare where the information comes from, they also invite people to edit the information themselves.

LL will be a useful resource in itself, but can also be a starting point, in much the same way as the BBC provides a page that gives substantial information on a musician or an animal they are interested in, but also invites people to move away from the site to other resources. This in itself is an interesting shift of focus. Long gone are the days when some sites actually disabled the ‘back’ button, and now we are moving towards an even more fluid world, if this type of approach continues to gain traction, where we are not always trying to keep people on our pages, but are actually encouraging them to move around the ‘Web of Data’.

Focus on Expectations

Looking at the BBC page on Stevie Wonder again, one thing that I notice is that it is quite busy. There is a good deal of information, with various boxes and loads of links and options for the user. There does seem to be a trend towards busier pages now, maybe an indication that people are increasingly adept at finding their way through information online, so a certain level of complexity is acceptable. Also, the page is quite long. The BBC page about mammals is similarly long and complex: introduction, links to other pages on mammals, distribution, classification, BBC news, video, information elsewhere, size ranges, the Wikipedia ‘about’ page, etc. Yet the page does not seem cluttered or difficult to navigate. This is partly because of use of plain language, as well as BBC expertise in web design. It may also be that expectations largely match reality: users may expect the BBC to provide a wealth of information, and they generally know what they will get if they go to ‘programmes’ or ‘video’ or ‘news’ pages.

Expectations do play an important part in good Web design, and maybe it is easier if you are a very well known provider, as the expectations people have are clearer? Many people come into a page through a search engine, so you cannot expect they will have used your homepage, and picked up information via this route. However they arrive at a BBC page, most people know what the BBC is. But arriving at an Archives Hub Linking Lives page, you probably have little idea of the provider in this case, and you may not be clear about what archives are in this context.

We chose to create a biographical resource partly because this would provide a focus; we can convey the fact that the page is about one person relatively easily. This makes it easier in some ways than working with the Archives Hub itself, which doesn’t have that kind of focus. If we provide a page with a whole range of links to various types of biographical content, then we should be able to convey what the page is about fairly easily. It may be that good, clear and simple headings and relevant content (about one subject – in this case one person) are better than providing explanations about what you are and what you are trying to provide, as people don’t tend to read help pages.

A ‘Controlled’ Experience

Our interface will use the external data sources within our data, and will be designed in order to give users a controlled experience, in the sense that we are evaluating the sources we include and presenting the interface in a very defined way. Of course, we cannot control the content of the external data; I am just talking about the way we present it.

An alternative approach would be to pull in all the data that can be found on a topic and display it. Maybe this is the ideal for Linked Data – the ability to bring in any data sources on a topic – but we are quite some way, it seems, from presenting this in a way that end users will want to use. Try a search on Hakia, a semantic search engine (not directly about consuming Linked Data, but about pulling in related information in a more semantic way). I looked for Beatrice Webb, and got a substantial amount of information from a very diverse range of sources, including news, blogs, Twitter, images and video. It’s quite impressive in principle, and could be really useful for a researcher, but the net is cast very wide, so it’s not easy to process all of this varied information. Sig.ma describes itself as a semantic information mash-up. If you take a look at the page that sig.ma provides for Beatrice Webb, a substantial amount of data is pulled in, but it is not very user-friendly, not always very coherent and sometimes not relevant. Obviously it is just a demonstrator, and I would say it is for a different audience, with more expertise in Linked Data. It does show the potential for this type of approach, which draws in a really diverse range of data on the fly, but it also shows how semantic searching is complex and difficult to achieve within a user-friendly interface.

The Linking Lives Unique Selling Point

Sites like Wikipedia have biographical pages, and we can never compete with them, so what can we offer that is of value? Essentially, our focus is on meeting the needs of those who want to carry out more in-depth research and who are likely to use primary sources. It may not be people who know they want to use primary sources, it may actually be a means to bring people to archives for the first time (we know that a large proportion of Archives Hub users are first time users of the Hub, and have not necessarily used archives before). We want to make primary sources the focus, but at the same time put them within the context of a whole range of information sources about a person, so that they are not held apart as somehow different and not for mainstream researchers.

It is also worth pointing out that our interface will still in some sense be a demonstrator – it will provide one option for presenting our Linked Data, but the data is there for others to create their own interfaces, and the SPARQL endpoint is there for people to query the data in the ways they want to. In addition, we can re-expose the data that we present. So, there are several purposes here: benefiting end-users, evaluating a name-based approach and putting archives within a broader context, demonstrating the sort of interface that can be provided from Linked Data and possibly re-exposing the data to create more potential benefit.

This entry was posted in archival context, branding, interface.

4 Responses to Designing an Interface: some first thoughts

  1. Really interesting post – thanks.

    I think the sig.ma example shows both the potential of the ‘linked data’ web, and also why curators (whether human or algorithmic) are needed to help researchers explore and make sense of the wealth of data online (to take Clay Shirky’s phrase, we need to ensure there isn’t “filter failure”).

    Thinking about some of the other points (and I’d guess that you’ve already thought of these, but just getting them out of my head) I think the question of the USP, and the reflections on branding come together – the USP is the Archives behind the data – the opportunity to go from the data to the source.

    I’m not familiar enough with how academic researchers (and other archives users) interact with archives to know the best way of surfacing this, but it seems to me that things like ensuring easy access to contact details, restrictions on materials, opening hours, location, etc. need to be given some thought. It also seems likely that more experienced researchers may have already identified the major archives for information about individuals, and so ironically highlighting the archives that hold very little on a specific individual might be of advantage to these researchers (i.e. the places they haven’t yet looked).

    This also makes me think that an interface that gives the option to display information ‘by archive’ as opposed to ‘type of information’ would be key.

    You highlight some of the challenges of dealing with the data and why you are going for ‘quality not quantity’. I understand this point, and obviously it’s a choice you need to make. However, I do think there is an opportunity in publishing ‘bad’ data to get help in correcting it. This could be quite subtle in the interface I think (“Is John Smith from Wales this John Smith?” type Qs in a sidebar or something – single click response). Some (or perhaps much) of the disambiguation needed can only be done by humans, and I suspect embracing ‘the crowd’ is the only way to achieve this in an affordable way.

    Finally in the last section you say/ask “Sites like Wikipedia have biographical pages, and we can never compete with them, so what can we offer that is of value?”. I find the idea of basing the page around wikipedia content quite appealing – and I’d certainly recommend talking to people in wikipedia about how a project like Linking Lives might add value back to them. For example highlighting disagreements in factual data (birth/death dates/places) between your data and wikipedia’s is likely to improve both data sets.

    Perhaps this is a naive ‘non-research’ view of it, but for me the biography is the centre of what I’d expect on a page about a person – so why not use wikipedia to provide this as the starting point for the page? The advantage you have over wikipedia is you can surround this with data relevant to researchers, based on the data from contributing archives – rather than just the more generic presentation on the wikipedia site. If you will allow me some purple prose – illuminate the wikipedia biography with the light of the archives 🙂

  2. By the way, thinking about interfaces I thought this presentation on ‘Generous Interfaces’ was really interesting and food for thought http://www.slideshare.net/mtchl/generous-interfaces

  3. Jane Stevenson says:

    Hi Owen,
    Many thanks for your comments. You are largely thinking along the same lines as we are I think.
    The USP is of course the archives to an extent – but that’s the case for the Archives Hub as is – so I was thinking about this extra dimension with other data sources. And we do intend to use the Wikipedia entry – essentially we’ll often have several biographies for one person – usually different perspectives, although certainly some duplication of content.

    “ensuring easy access to contact details, restrictions on materials, opening hours, location, etc. etc. need to be given some thought”

    Yes, this is information that we have within the Hub descriptions or link to via Archon (http://www.nationalarchives.gov.uk/archon/). I had hoped the Archon data might be a bit more open, so we could use it, but that hasn’t happened as yet. I’m not sure about ways to include this info on the Linking Lives interface, as with several archive collections for one person, we have to be careful not to overload the page with info. The question is when it is appropriate to send the researcher to the Hub page and when should info be included in LL?

    “highlighting the archives that hold very little on a specific individual might be of advantage to these researchers”

    Yes, I think this is an interesting perspective. But it’s not that easy to distinguish. Ideally we would like to show (i) archives where the person is the creator, (ii) archives where the person is in the index terms, (iii) archives where the person is in the description but not in (i) or (ii). Problem is, inconsistency in cataloguing means you can’t draw too many conclusions from this. A person might be in the index terms, but arguably they shouldn’t be included there because they are barely mentioned in the archive; or significant people might not be indexed.

    This comes back to quality/quantity. You are quite right about crowdsourcing in theory, but in practice, it’s hard for us to match many of our names to e.g. VIAF or DBpedia at all, to be able to create the interface that we want. We’d have to think about a page for ‘John Smith’ that doesn’t draw other data in (because we can’t make the match) and then ask people to identify him. But loads of names in archives are really very anonymous. Still, I do think this would be great in principle. It’s just that most names like that would require quite a bit of researching to make any matches. I think it might be another project….

  4. Pingback: Linking Lives as Mashup: more on Aggregation and Provenance | Linking Lives
