Online search engines are used by billions of users every day. They provide the basic infrastructure for many other industries and are therefore of great economic, political, and social importance. Over the past few years, an intense policy debate has formed around the question: do some search engines produce better results because their algorithms are better, or because they have access to more data from past searches?
In the former case, it may be best to refrain from market interventions so as not to stifle the innovation incentives of successful entrepreneurs (and their potential contestants). In the latter case, mandatory sharing of user-generated data, a policy that is currently under discussion and already contained in the EU’s Digital Markets Act, could spur innovation and would benefit all users of search engines.
Together with Tobias Klein, Madina Kurmangaliyeva, and Patricia Prüfer, and at the request of the German Finance Ministry, I embarked on a journey to produce relevant data and inform the policy-making process. The resulting paper, “How important are user-generated data for search engine quality? Experimental results”, has now been accepted for publication by the Journal of Law & Economics.
In this paper, we report results from a collaboration with a small search engine, Cliqz. Cliqz provided us with non-personalized search results for a random set of queries and conducted an experiment on our behalf, which allows for within-search-engine comparisons. We complemented the Cliqz data with non-personalized search results from Google and Bing for the same queries, in the same period, and in the same country, and asked external assessors to rate the quality of the search results on a 7-point Likert scale (without mentioning the origin of the results). This allows for between-search-engine comparisons.
We find robust evidence that differences in the quality of search results are driven by less popular search terms, for which a search algorithm can rely on less data. These insights are complemented by results from an experiment in which we keep the search engine’s algorithm fixed and vary the amount of data it uses as an input. This provides causal evidence that more user data on rare queries enables search engines to produce results of higher quality. Notably, 74% of the traffic in our data comes from such rare queries. Hence, this is the relevant dimension of competition on which a search engine must perform in order to attract users.
Our results show that the mandatory sharing of user data may be an appropriate remedy in the sense that it would likely allow entrants such as Cliqz to compete successfully with the incumbent (Google) by enabling them to provide high-quality search results also for rare queries. Unlike in other contexts, this remedy does not directly harm the incumbent, as it exploits the non-rivalry of information: the incumbent can still use the same data; only the exclusivity of data access would be reduced. Consequently, users would benefit.