The challenges of researching news chatbots on messaging apps

By Samuel Danzon-Chambaud (DCU), Enric Moreu (DCU), and Giuliander Carpes (UPS)

At a recent workshop, JOLT ESRs worked in teams to develop ideas for research projects. Here, three ESRs consider how to investigate the use of chatbots in news.

To develop this research idea we combined our three areas of expertise: algorithms and news; machine learning for news; and news distribution through messaging apps. News chatbots are the focus of our collaboration. Although they are not as fashionable as they were a couple of years ago, there are several interesting recent initiatives on the main platforms. Examples include Jam and Bonjour Marianne on Facebook Messenger; Robot LaBot and Politibot on Telegram; and chatbots run by Aos Fatos and the International Fact-Checking Network on WhatsApp.

An automated approach?

To research chatbots, we wanted to employ digital research methods and adopt a quantitative approach to the subject. We wondered if it would be possible to dig into chatbots’ APIs in order to collect data and better understand how they function. The answer from our development engineer Enric Moreu was disappointing: “not really”.

Collecting and analysing bots’ messages is complex because their algorithms are ‘unpredictable’ by nature. Obtaining data from a chatbot is not like scraping data from a webpage or collecting tweets, where all the data is visible. To access a chatbot’s information, you have to ask the right questions to “reveal” all the messages that are available. This is extremely hard to automate.
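
To make the difficulty concrete, here is a minimal sketch of what “asking the right questions” might look like for the simplest case: a bot that offers reply buttons with each message. Everything in it is an assumption for illustration. The `Reply` type, the `explore` function and the `fake_send` stand-in are hypothetical names, and no real messaging platform or chatbot is contacted. The sketch replays every sequence of button presses, breadth-first, and records what the bot says at each step.

```python
from collections import deque
from typing import Callable, NamedTuple


class Reply(NamedTuple):
    text: str            # message the bot sent back
    options: list[str]   # reply buttons offered with that message


def explore(send: Callable[[list[str]], Reply], start: str,
            max_depth: int = 3) -> dict[tuple[str, ...], str]:
    """Breadth-first walk over a button-driven bot's conversation tree.

    `send` is a hypothetical function that replays a whole sequence of
    user inputs in a fresh session and returns the bot's last reply.
    """
    seen: dict[tuple[str, ...], str] = {}
    queue = deque([(start,)])
    while queue:
        path = queue.popleft()
        reply = send(list(path))
        seen[path] = reply.text
        # max_depth bounds the walk, since menus can loop back on themselves.
        if len(path) < max_depth:
            for option in reply.options:
                queue.append(path + (option,))
    return seen


# Invented stand-in for a real bot, so the sketch runs offline.
FAKE_BOT = {
    ("/start",): Reply("Hi! What do you want to read?", ["Politics", "Sport"]),
    ("/start", "Politics"): Reply("Here is today's politics briefing.", []),
    ("/start", "Sport"): Reply("Here are today's sport headlines.", ["More"]),
    ("/start", "Sport", "More"): Reply("That's all for now.", []),
}


def fake_send(path: list[str]) -> Reply:
    return FAKE_BOT.get(tuple(path), Reply("Sorry, I did not understand.", []))


messages = explore(fake_send, "/start")
for path, text in messages.items():
    print(" > ".join(path), "->", text)
```

Even this toy version shows the limits. It only reaches messages that sit behind buttons, and the number of paths grows quickly with depth. A bot that expects free text, or that changes its answers with the day's news, would not be covered by a walk like this.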

In addition, the metadata provided by chatbots are usually very poor. For instance, when extracting data from a news bot on Telegram, there is no automatic way to identify the bot’s country of origin or even to find information about its creators. The confidentiality of the creators’ data makes it difficult to obtain any meaningful insight. OK, case closed.

An audience approach?

Apart from collecting data, another research perspective worth exploring revolves around users’ perceptions of chatbots. Because chatbots use automated text to interact with readers, they can be considered part of automated journalism. This is a new discipline understood as the automatic generation of journalistic text through software and algorithms, with no human intervention beyond the initial programming.

Audience research on automated journalism has previously shown that readers tend to perceive automated news as being as credible and trustworthy as human-written content, but not as enjoyable to read. Yet little seems to be known about readers’ perceptions of chatbots specifically. The few existing studies suggest that audiences consider chatbots to be as credible and trustworthy as automated news in general and, because chatbots sometimes use a casual tone to cater to younger demographics, that some people find them more appealing. An empirical investigation into this matter, probably through an online survey, would help to fill this research gap.

Easy, huh? Not really. How can we run an effective and meaningful survey of real news chatbot users? That is another challenge that demands consideration. Ideally, we would run the survey inside the environment of specific chatbots (with different conversational tones, for example). The problem, though, is that this kind of research can only be performed by the bots’ creators or managers themselves. That is actually the business model of Jam, the above-mentioned French startup, which surveys its audience (a considerable group of young people aged 15 to 25) and sells studies on specific topics to companies. It is unlikely that independent researchers would be granted that kind of access. And there are ethical implications that should also be considered.

We still do not have answers. Right now, we are looking for clues in studies that have taken similar approaches to researching chatbots. Access appears to be the greatest challenge for innovative research on digital platforms: access to data is very difficult, if not impossible. We have to develop the methods from scratch because old-fashioned approaches tend not to work. Moreover, there are serious ethical implications. In summary, it can take a lot of time, and there is no guarantee it will work.

This project is funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765140.