Language and Computation Group
Resources
This is a list of resources available across the various departments. See
also Doug's old CL/MT pages. For a list
of resources available on the Web, see Chris Manning's Stat NLP pages.
Corpora
Many of these corpora are installed on a
University-accessible server under /ufs/corpora (\\corpora\corpora
from Windows machines). For details about access, ask Doug.
- al-Hayat corpus (CS; ask Abdul Goweder)
- British National Corpus (LAL; ask Doug)
- Brown Corpus (CS; ask Massimo)
- GNOME corpus (CS; ask Massimo)
- ICAME CD (contains the Brown corpus, LOB, London-Lund, and a few other
corpora) (CS; ask Massimo)
- LOB (CS, LAL; ask Doug or Massimo)
- London-Lund (CS, LAL; ask Doug or Massimo)
- MUC6, MUC7 (CS; ask Massimo)
- Reuters (CS; ask Udo)
- Switchboard (CS; ask Massimo)
- TREC (CS; ask Massimo)
- Verbmobil (CS; ask Massimo)
Lexical Resources
- Concise Medical Dictionary (OUP)
- New Oxford Thesaurus of English (OUP)
- Oxford English Dictionary (OUP)
- Oxford Spanish Dictionary (OUP)
- Pocket Oxford Italian Dictionary (OUP)
- WordNet (CS, LAL; ask Massimo; accessible from machines in the CS Labs)
(The items marked 'OUP' are available for research purposes under a
three-year licence from OUP to Massimo; ask him for details.)
Software
- Connexor Machinese Syntax (CS, installed in the Labs)
- GATE (CS, installed in the Labs) - a multi-purpose NL tool.
- LT-XML tools from LTG, Edinburgh (CS, ask Massimo) - POS tagger, chunker,
and tokenizers
- QTAG (CS, installed in the Labs) - a POS tagger from Birmingham