What is NewsEye about?

NewsEye, funded by the European Union’s Horizon 2020 research and innovation programme, is a research project advancing the state of the art and introducing new concepts, methods and tools for digital humanities by providing enhanced access to historical newspapers for a wide range of users. With the tools and methods created by NewsEye, crucial user groups will be able to investigate views and perspectives on historical events and developments. As a consequence, the project aims to change the way European digital heritage data is (re)searched, accessed, used and analysed.


Why does NewsEye focus on newspapers?

Newspapers collect information about cultural, political and social events in a more detailed way than any other public record. Since their beginnings in the 17th century, they have recorded billions of events, stories and names, in almost every language, every country, every day. Newspapers have always been an important medium for the dissemination of public and political opinions, literary works, essays and art. This thematic wealth puts them at centre stage for anyone interested in European cultural heritage.

During the last few decades, tens of millions of newspaper pages from European libraries have been digitised and made available online, and national libraries aim to intensify their digitisation efforts in the coming years in response to the large demand for access to historical newspapers. While the broad public shows a general interest in this historical and cultural resource, it is also of crucial importance to many humanities scholars.


Who is in the NewsEye project team?

The NewsEye project involves national libraries, humanities and social science research groups and computer science research groups (see our consortium page for more details). It addresses a number of challenges and, in doing so, is producing significant scientific advances in several directions:

  • in text recognition, text analysis, natural language processing (NLP), computational creativity and natural language generation, with regard to historical newspapers but also more broadly,
  • in digital newspaper research, addressing a number of editorial issues like optical character recognition (OCR) and article separation,
  • in Digital Humanities, with respect to the huge amounts of text material, the availability of useful tools and the possibilities for searching and browsing,
  • in history, in terms of analysing historical assets with new methods across different language corpora.


What are the project's aims?

The main objective of the NewsEye project is to develop methods and tools for effective exploration and exploitation of the rich resource of newspapers by means of new technologies and 'big data' approaches, combining the 'close' and 'distant reading' methods of Digital Humanities.

This will improve the methods of studying European cultural heritage used by researchers and experts, as well as the general public.

NewsEye is therefore developing a seamlessly integrated armoury of tools and methods that will improve users’ ability to access, analyse and use the content in digital libraries of historical newspapers.




  1. Text Recognition & Article Separation - enriches digitised newspaper data with article separation and classification information, as well as further textual information and full-text transcripts at the article level.
  2. Semantic text enrichment - produces semantic annotations to ease access and facilitate advanced systematic analyses of newspaper collections.
  3. Dynamic text analysis - develops methods to automatically find topics, trends, viewpoints and exceptions in the corpus being studied, both within a specified context and in comparison between contrasting contexts.
  4. Personal Research Assistant - is the user’s intelligent and transparent aid, using the enriched texts and dynamic text analysis tools to carry out a series of analysis steps and explain the results to the user. The Assistant aims to extract content from a dynamic query for an initial report, which is presented to the user in natural language. The Assistant then either continues the investigation autonomously, or the user selects viewpoints, articles and keywords to interactively refine the targeted query.


Read more about the project on the website of the European Commission: cordis.europa.eu/project/rcn/216024/.