Can we trust online crowdworkers? Comparing online and offline participants in a preference test of virtual agents

September 22, 2020 · Declared Dead · 🏛 International Conference on Intelligent Virtual Agents

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Patrik Jonell, Taras Kucherenko, Ilaria Torre, Jonas Beskow
arXiv ID: 2009.10760
Category: cs.HC (Human-Computer Interaction)
Cross-listed: cs.MM
Citations: 27
Venue: International Conference on Intelligent Virtual Agents
Last Checked: 4 months ago

Abstract

Conducting user studies is a crucial component in many scientific fields. While some studies require participants to be physically present, others can be conducted both physically (e.g., in-lab) and online (e.g., via crowdsourcing). Inviting participants to the lab can be time-consuming and logistically difficult, and research groups are sometimes unable to run in-lab experiments at all, for example because of a pandemic. Crowdsourcing platforms such as Amazon Mechanical Turk (AMT) or Prolific can therefore be a suitable alternative for running certain experiments, such as evaluating virtual agents. Although previous studies have investigated the use of crowdsourcing platforms for running experiments, it remains uncertain whether the results are reliable for perceptual studies. Here we replicate a previous experiment in which participants evaluated a gesture generation model for virtual agents. The experiment is conducted across three participant pools (in-lab, Prolific, and AMT), with similar demographics across the in-lab and Prolific participants. Our results show no difference between the three participant pools with regard to their evaluations of the gesture generation models and their reliability scores. The results indicate that online platforms can successfully be used for perceptual evaluations of this kind.