Massimo Poesio

CrowdSourcing and Games-With-A-Purpose

Crowdsourcing (using workers contacted via the Web) has become the de facto standard for small- and medium-scale annotation in CL since the Snow et al (2008) paper (Poesio et al, in press). In our own research we have also used crowdsourcing systematically, in particular for summarization (first to prepare the 2009 Arabic Summarization data for MULTILING, more recently in the SENSEI project) and text classification (in our ongoing KTP with Minority Rights Group). But we have also been developers of crowdsourcing technology, in two areas in particular: using Games-With-A-Purpose (Chamberlain et al, 2013) to collect data, and using Bayesian models to analyze crowdsourced data.

Phrase Detectives

Phrase Detectives (Poesio et al, 2008; Chamberlain et al, 2008; Poesio et al, 2013; Poesio et al, in press) is a Game-With-a-Purpose developed to annotate anaphoric information. It is one of the most successful GWAPs for Computational Linguistics, having collected more than 2.5 million judgments over the years. The fully annotated dataset currently comprises over 300,000 tokens of English and Italian text, drawn from Wikipedia and from fiction in Project Gutenberg.

Analysing crowdsourced data

Data collected using crowdsourcing tend to be very noisy, so some method is required to identify unreliable workers and to assign a reliability to each label. Bayesian models of annotation (Dawid and Skene, 1979; Carpenter, 2008; Hovy et al, 2013; Passonneau and Carpenter, 2014) have proven much more effective than majority voting in assessing label reliability and are becoming the new standard. In our research, we have used such methods to assess the reliability of labels obtained in a variety of ways, most recently in the SENSEI evaluation campaign.
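
To make the contrast with majority voting concrete, here is a minimal sketch of the Dawid and Skene (1979) model fitted with EM, in plain Python. It is an illustration only, not the code used in our projects: the function names, the (item, worker, label) input layout, and the `smoothing` constant are all assumptions made for this example. The model estimates a confusion matrix for each worker and a posterior distribution over the true label of each item.

```python
import math
from collections import Counter, defaultdict


def majority_vote(votes):
    """Baseline: pick the most frequent label per item.

    votes: list of (item, worker, label) with labels in 0..n_classes-1.
    """
    counts = defaultdict(Counter)
    for item, _worker, label in votes:
        counts[item][label] += 1
    return {item: c.most_common(1)[0][0] for item, c in counts.items()}


def _normalize(row):
    total = sum(row)
    return [x / total for x in row]


def dawid_skene(votes, n_classes, n_iter=50, smoothing=0.01):
    """EM for the Dawid-Skene model (hypothetical helper, illustration only).

    Returns (post, conf):
      post[item][k]      posterior probability that the true label is k
      conf[worker][k][l] probability that the worker answers l when truth is k
    `smoothing` is an assumed small pseudo-count that avoids zero probabilities.
    """
    items = sorted({i for i, _, _ in votes})
    workers = sorted({w for _, w, _ in votes})

    # Initialise posteriors with per-item vote proportions.
    post = {i: [0.0] * n_classes for i in items}
    for i, _w, l in votes:
        post[i][l] += 1.0
    post = {i: _normalize(p) for i, p in post.items()}

    conf = {}
    for _ in range(n_iter):
        # M-step: class prior and per-worker confusion matrices.
        prior = _normalize(
            [smoothing + sum(post[i][k] for i in items) for k in range(n_classes)]
        )
        conf = {
            w: [[smoothing] * n_classes for _ in range(n_classes)] for w in workers
        }
        for i, w, l in votes:
            for k in range(n_classes):
                conf[w][k][l] += post[i][k]
        conf = {w: [_normalize(row) for row in m] for w, m in conf.items()}

        # E-step: posterior over true labels, computed in log space.
        logp = {i: [math.log(p) for p in prior] for i in items}
        for i, w, l in votes:
            for k in range(n_classes):
                logp[i][k] += math.log(conf[w][k][l])
        for i in items:
            m = max(logp[i])
            post[i] = _normalize([math.exp(x - m) for x in logp[i]])
    return post, conf
```

Unlike majority voting, which weights every worker equally, the estimated confusion matrices let a worker who always gives the same answer contribute almost nothing to the posterior, and the posterior itself serves as a per-label reliability score.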

Using crowdsourcing for summary evaluation

As part of the ongoing SENSEI project, we organized the Online Forums Summarization Task of MULTILING-2015, in which system summaries were evaluated using crowdsourcing (Kabadjov et al, submitted).

Projects (in reverse chronological order)

  • SENSEI, funded by the EU (2013-2016). This ongoing project is concerned with the use of discourse to summarize spoken and online conversations such as those in online forums.
  • AnaWiki (2007-2009, funded by EPSRC) was the project in which Phrase Detectives was developed.

Main publications

  • Poesio, Massimo, Jon Chamberlain, and Udo Kruschwitz, In press. Case Study: Phrase Detectives. In N. Ide and J. Pustejovsky (eds.), Handbook of Annotation. Springer (pdf)
  • Poesio, Massimo, Jon Chamberlain, and Udo Kruschwitz, In press. Crowdsourcing. In N. Ide and J. Pustejovsky (eds.), Handbook of Annotation. Springer (pdf)
  • Massimo Poesio, Jon Chamberlain, Udo Kruschwitz, Livio Robaldo and Luca Ducceschi, 2013. Phrase Detectives: Utilizing Collective Intelligence for Internet-Scale Language Resource Creation. ACM Transactions on Interactive Intelligent Systems, 3(1). (pdf)
  • Chamberlain, Jon, Karen Fort, Udo Kruschwitz, Mathieu Lafourcade and Massimo Poesio, 2013. Using games to create linguistic resources. In I. Gurevych et al (eds.), The People's Web Meets NLP. Springer.
  • Chamberlain, J., Kruschwitz, U. and Poesio, M., 2013. Methods for Engaging and Evaluating Users of Human Computation Systems. In P. Michelucci (ed.), Handbook of Human Computation. Springer.
  • Poesio, Massimo, Nils Diewald, Maik Stuehrenberg, Jon Chamberlain, Daniel Jettka, Daniela Goecke and Udo Kruschwitz, 2011. Markup infrastructure for the Anaphoric Bank, part I: Supporting web collaboration. In A. Mehler, K.-U. Kuehnberger, H. Lobin, H. Luengen, A. Storrer, and A. Witt, editors, Modelling, Learning and Processing of Text Technological Data Structures, Dordrecht, Springer.
  • Jon Chamberlain, Udo Kruschwitz and Massimo Poesio, 2009. Constructing an anaphorically annotated corpus with non-experts: assessing the quality of collaborative annotations. Proc. of the ACL Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources, Singapore.
  • J. Chamberlain, M. Poesio and U. Kruschwitz, 2009. A new life for a dead parrot: incentive structure in the Phrase Detectives game. Proc. of Webcentives09, Madrid.
  • U. Kruschwitz, J. Chamberlain, and M. Poesio, 2009. (Linguistic) Science Through Web Collaboration in the ANAWIKI Project. Proc. of Web Science, Athens.
  • Chamberlain, Jon, Massimo Poesio and Udo Kruschwitz, 2008. Phrase Detectives - A Web-based Collaborative Annotation Game. In Proc. of I-Semantics, Graz. (pdf)
  • Poesio, Massimo, Udo Kruschwitz and Jon Chamberlain, 2008. ANAWIKI: Creating Anaphorically Annotated Resources Through Web Collaboration. In Proc. of LREC, Marrakesh. (pdf)