THE GNOME CORPUS
The GNOME Corpus includes texts from three genres - museum
labels, pharmaceutical leaflets, and tutorial dialogues - in which
different types of discourse and semantic information have been annotated.
The corpus was created to study the aspects of discourse that affect
generation, particularly salience. The corpus has been used to study
Centering from both a generation and an interpretation perspective;
to study many subtasks of generation, including text planning,
aggregation, and sentence planning; and more recently to study the
interpretation of anaphoric expressions, particularly bridging references.
The GNOME Corpus cannot be freely distributed (we do not hold copyright on all of the texts), but it has been released on a case-by-case basis - ask Massimo Poesio for information.