What is NewsEye about?

NewsEye, funded by the European Union’s Horizon 2020 research and innovation programme, is a research project advancing the state of the art and introducing new concepts, methods and tools for digital humanities by providing enhanced access to historical newspapers for a wide range of users. With the tools and methods created by NewsEye, crucial user groups will be able to investigate views and perspectives on historical events and developments. As a consequence, the project will change the way European digital heritage data is (re)searched, accessed, used and analysed.


Why does NewsEye focus on newspapers?

Newspapers collect information about cultural, political and social events in a more detailed way than any other public record. Since their beginnings in the 17th century they have recorded billions of events, stories and names, in almost every language, every country, every day. Newspapers have always been an important medium for the dissemination of public and political opinions, literary works, essays and art. This thematic wealth puts them centre stage for anyone interested in European cultural heritage.

In recent decades, tens of millions of newspaper pages from European libraries have been digitised and made available online, and national libraries will intensify their digitisation efforts in the coming years because of the large demand for access to historical newspapers. While the broad public shows a general interest in this historical and cultural resource, it is of crucial importance to many humanities scholars.


Who is the NewsEye project team?

The NewsEye project involves national libraries, humanities and social science research groups and computer science research groups (see our consortium page for more details). It addresses a number of challenges, which will result in significant scientific advances in several directions:

  • in text recognition, text analysis, natural language processing, computational creativity and natural language generation, with regard to historical newspapers but also more broadly,
  • in digital newspaper research, addressing a number of editorial issues like OCR and article separation,
  • in digital humanities, with respect to handling huge amounts of text material, the availability of useful tools and the possibilities for searching and browsing,
  • in history, in terms of analysing historical assets with new methods across different language corpora.


What are the project's aims?

The main objective of the NewsEye project is to develop methods and tools for effective exploration and exploitation of the rich resource of newspapers by means of new technologies and “big data” approaches, combining “close” and “distant reading” methods of Digital Humanities.

This will improve the ways in which researchers and experts, as well as the interested general public, study European cultural heritage.

NewsEye will therefore develop a seamlessly integrated armoury of tools and methods that will improve users’ capability to access, analyse and use the content in the digital libraries of historical newspapers. It comprises four components:


  1. Text Recognition & Article Separation - enriches digitised newspaper data with article separation and classification information, further textual information and, finally, full-text transcripts at the article level.
  2. Semantic text enrichment - produces semantic annotations to ease access to the newspaper collections and allow their advanced systematic analysis.
  3. Dynamic text analysis - develops methods to automatically find topics, trends, viewpoints and exceptions in the corpus being studied, both within a specified context and in comparison between contrasting contexts.
  4. Personal Research Assistant - is the user’s intelligent and transparent aid, using the enriched texts and dynamic text analysis tools to carry out a series of analysis steps and explain the results to the user. The Assistant will extract the content from a dynamic query for an initial report, to be presented to the user in natural language. The Assistant will then either continue the investigation autonomously, or the user may select viewpoints, articles and keywords to interactively refine the targeted query.


Read more about the project on the website of the European Commission: cordis.europa.eu/project/rcn/216024/