Online research of digital newspapers of three national libraries: A survey.

By Sarah Oberbichler, Stefan Hechl, Barbara Klaus, Minna Kaukonen, Tuula Pääkkönen and Marion Ansel               

In 2018, the three national libraries in the NewsEye project (Austria, Finland and France) surveyed their users, focusing on their search habits and expectations regarding digitised newspaper collections. This online survey was written and coordinated by the NewsEye Digital Humanities (DH) Group members based in Innsbruck. Before we publish an in-depth analysis in the coming months, let's have a look at the preliminary results:

 

Why did we conduct this survey?

The DH group of the NewsEye team, in collaboration with the National Libraries in Austria, Finland and France, conducted this survey to find out how different groups use the digitised newspaper collections accessible via the National Libraries’ online search interfaces, namely ANNO (Austria), Gallica or Retronews (France), and DIGI (Finland). The aim was to ask users how they evaluate the search options offered and what improvements they would like to see in the future. For this reason, questions were asked about current usage behaviour, as well as about problems and difficulties in searching for relevant content. In addition, we asked questions about the future search functions and tools needed to facilitate research in digital newspaper corpora and to improve its outcomes. In Finland, the user survey also covered the use of digitised journals and ephemera, which are likewise available on the DIGI interface.

 

 

Who was the survey aimed at?

The DH group of the NewsEye project targeted different user groups, such as librarians, academics, students, teachers, pupils, and especially those groups interested in history and historical newspapers, for example journalists, archivists, genealogists, family historians, or chroniclers. As newspapers collect information about cultural, political and social events, it is strongly in our interest to consider the needs of anyone interested in European cultural heritage and not only those of professional historians. This wider audience is usually not easily reached, but with these surveys we deliberately aimed to reach the general public.

Even before the final evaluation, it can be said that, in the case of Austria, researchers represented the largest group of participants (41 %), followed by a group that can be described as hobby or lay historians (21 %), students (12 %) and other people interested in history, such as librarians, teachers, journalists or archivists (17 % altogether). We can thus say that, in the Austrian case, we succeeded in reaching people outside academia.

Figure 1. Austrian survey participants.

The Finnish survey, on the other hand, did not directly ask respondents about their user status. Based on the users' stated motivations, however, we can see that genealogists and researchers, as well as people interested in history, are the biggest user groups of DIGI (digi.kansalliskirjasto.fi).

In the case of the French survey, the largest group of participants were lay historians (24%), followed by librarians (21%) and researchers (16%). However, about a quarter of the responses were very detailed and still need to be analysed and possibly reclassified. Hence, this summary is subject to change once the in-depth analysis is completed.

Figure 2. French survey participants.

How did we conduct this survey?

The Finnish survey was available from mid-June to the end of July 2018 on the homepage of DIGI. In that time period, 140 people took part in the survey. The Finnish survey did not ask for any detailed demographic information (gender, age, location) in order to preserve the anonymity of users. Most of the participants have research interests in family history and scientific research, followed by research out of private interest. Individual respondents also quite often named several usage interests, e.g. primarily academic research with family research as a secondary interest. 17 % also said that they use the digitised resources simply for browsing.

The survey for German-speaking users was available from the beginning of November to the beginning of December 2018 on ANNO, the online search page of the Austrian National Library. In that time period, 374 people took part in the survey. 58 % of the participants were male, 30 % were under 35 years of age and 58 % under 50. Most of the participants conduct scientific research, followed by research out of private interest as well as research regarding family or hometown history. 22 % also said that they use newspapers simply for browsing.

For the French-speaking users, the survey was available from the beginning of December 2018 to the beginning of January 2019. It was featured in an article published on the Gallica blog and was directly accessible via the Gallica homepage. During this timeframe, 385 people took part in the survey: 50% of the participants were female, and 56% were over 50 years of age (36% between 51 and 65, 20% over 65). Respondents mainly reported using digitised newspapers for private purposes and academic research, but also out of interest in family or hometown history. As with their German-speaking counterparts, a share of them (26%) also reported using the newspapers simply for browsing.

What do we expect from the survey, and how is it relevant for our project?

Beyond reaching out to different user groups, the Digital Humanities Group of the project hoped to receive input regarding the practical needs of (regular) users of these online search pages. Knowing their needs, difficulties and demands, and combining this information with the research results of the DH team, will improve the feedback to the National Libraries and the computer scientists working together in NewsEye. Ultimately, knowing how different groups use the digital interfaces, and how they use the information they collect there for leisure, research, study or teaching, will have a decisive impact on the practical use of digital newspaper collections.

First results of the German-language survey show, for example, that full text search is the most used search function, followed by searches restricted to certain time periods, topics, or named entities. In addition, 60 % of participants said that they already use the advanced search. However, we have also learnt from the survey that only 18% of the participants can find everything they need with the search options currently available. Most of them would like to have better scan quality (78 %), the possibility to save, copy, and annotate certain articles (71 %), or various tools to get better or more specific results, such as named entity recognition, topic modelling, or keyword suggestions (51 %).

In the Finnish survey results, full text search was also the most used search function, followed by search facets and the search lists of all newspapers/journals. 58 % of the participants said that they use the advanced search. 80 % of the participants said that they can easily find everything they need with the search options currently available, at least in part (53 % answered "partly"). 74 % fully or partly agreed with the statement that sufficient search possibilities are available. The additional options respondents asked for, however, are mainly ways to search more precisely, for example for family names.

The initial results of the French-language survey show that full text search is again the most used search function. It is followed by the search for specific names, specific newspapers, time periods and topics. In addition, 61 % of the participants said that they already use the advanced search. However, 31% of participants only partly find what they need, and 26% find the content they need but only after a long time. A majority of the participants would like good scan quality for the original newspaper (78%), different tools for finding better results (63%), and the possibility to copy, save, and manually annotate articles (53%).

What comes next?

An in-depth analysis of the surveys and the publication of the results will follow in the coming months, first for internal use and then for the public. The most important step for the NewsEye team, however, is to use the surveys to set priorities for developing tools and improving features of the online interfaces.

Figure 3. Austrian participants’ answers to the question of which functions they need most.

As a first insight, the German-language survey showed that the functions participants miss most in the existing interface are the possibility to search for images (50 % of participants), named entity recognition (49 %), a tool for excluding certain search options (45 %), and tools for visualizations (41 %). Other functions, such as a personal workspace (25 %) or the possibility of manual annotation (24 %), would also be helpful for many users.

In Finland, the DIGI publication and back-end system is to be renewed this year. Among other modified functionalities, a general search across all content (newspapers, journals, ephemera, books) will be introduced, as well as a simple search tool on the front page. Image search will be improved by sorting the images into a few basic categories, which allows a more granular search for images across material types. A harvesting interface will be created for new material types. The implementation of an IIIF image interface and additional tools for researchers are being planned.

Finally, in the French survey, participants voted for the following improvements: a tool for excluding certain search options (50 %), the possibility to search for images (45 %), keyword suggestions (44 %), tools for visualizations (27 %), a personal workspace (20 %) and the possibility to annotate manually (11 %).

Follow our progress on Twitter as well as on our website's Blog, Media and Publications pages.