The GNOME corpus includes texts from three genres (museum labels, pharmaceutical leaflets, and tutorial dialogues) in which different types of discourse and semantic information have been annotated. The corpus was created to study the aspects of discourse that affect generation, particularly salience. It has been used to study Centering from both a generation and an interpretation perspective; to study many subtasks of generation, including text planning, aggregation, and sentence planning; and, more recently, to study the interpretation of anaphoric expressions, particularly bridging references.

The GNOME corpus cannot be freely distributed, because we do not hold copyright on all of the texts, but it has been released on a case-by-case basis; ask Massimo Poesio for information.