Disentangling the Relation Between
Crowdsourcing and Bias Management
Ujwal Gadiraju, Cristina Sarasua, Alessandro Checco, Gianluca Demartini
ABSTRACT: The CrowdBias’18 workshop was held on 5th July 2018, during the first day of the AAAI Conference on Human Computation, at the University of Zurich, Switzerland. The goal of this workshop was to analyze both existing biases in crowdsourcing and methods to manage bias via crowdsourcing. The workshop discussed different types of biases, measures and methods to track bias, as well as methodologies to prevent and mitigate bias.
Crowdsourcing has become a successful and widely used means to obtain human input at a large scale, needed to evaluate various systems, augment algorithms, and perform high-quality data management, among a host of other applications. Humans, though, have various cognitive biases that influence the way they interpret statements, make decisions, and remember information. If we use crowdsourcing to generate ground truth, it is important to identify existing biases among crowdsourcing contributors and analyze the effects that their biases may produce and propagate. At the same time, having access to a potentially large number of people gives us the opportunity to manage the biases in existing data and systems.
The workshop consisted of two keynote talks, seven paper presentations, and a moderated discussion. We provided a framework for discussion among scholars, practitioners, and other interested parties, including crowd workers, requesters, and crowdsourcing platform managers. In the first keynote talk, Jahna Otterbacher (Open University Cyprus) discussed social biases in human-machine information systems and encouraged the audience to think about algorithmic transparency and accountability. In the second keynote, Kristy Milland, a long-time Turker, researcher, and founder of Turker Nation, talked about bias from the perspective of crowd workers. Milland stressed that crowd workers’ motivations may change across contexts and over time, and emphasized the need to treat crowd workers with care and fairness.
The papers presented covered the themes of aggregation bias, measuring how opinion bias influences crowdsourced labelling tasks, and comparing biases of experts with those of the crowd. Other major themes included measuring bias in data or content using crowdsourcing, bias in task selection, sampling biases during recruitment, and biases induced by work environments. The moderated discussion identified the need for a ‘taxonomy of biases in crowdsourcing’, possibly extending other classifications of general biases on the Web, that can guide future work in understanding and managing bias in crowdsourcing. Investigating the various sources of bias and developing methods to present bias-related information to end users were identified as important challenges in sociotechnical and crowdsourcing systems. The influence of task types in propagating and mitigating biases was discussed. Finally, the paid crowdsourcing paradigm was recognized as a special realm where additional factors such as worker motivation, self-selection, rewards, and task design can influence task outcomes.
Alessandro Checco, Gianluca Demartini, Ujwal Gadiraju, and Cristina Sarasua served as co-chairs of this workshop. The papers of this workshop were published online as CEUR workshop proceedings (http://ceur-ws.org/).
Alessandro Checco is a Research Associate at the Information School, University of Sheffield, United Kingdom.
Gianluca Demartini is a Senior Lecturer in Data Science at the University of Queensland, School of Information Technology and Electrical Engineering, Australia.
Ujwal Gadiraju is a Postdoctoral Fellow at the L3S Research Center, Leibniz University of Hannover, Germany.
Cristina Sarasua is a Researcher at the Department of Informatics, University of Zurich, Switzerland.