uk.ac.essex.malexa.nlp.dp.GuiTAR.prepro
Class PreProSyntacticHeuristics

java.lang.Object
  extended by uk.ac.essex.malexa.nlp.dp.GuiTAR.prepro.PreProSyntacticHeuristics

public class PreProSyntacticHeuristics
extends Object

A class that implements heuristics for identifying NP-type, NP-AgreementFeatures, NP-head, and NP-modifiers.

Version:
1.1
Author:
Mijail A. Kabadjov

Field Summary
private  Document domDocument
           
private  int headCounter
           
private  int modifierCounter
           
 
Constructor Summary
PreProSyntacticHeuristics()
           
 
Method Summary
private  Map createAdjectiveMap()
          An auxiliary method that creates word-to-type mappings for an adjective.
private  Map createAdjectiveSuperlativeMap()
          An auxiliary method that creates word-to-type mappings for a superlative adjective.
private  Map createDeterminerMap()
          An auxiliary method that creates word-to-type mappings for a determiner.
private  Map createDeterminerNoHeadMap()
          An auxiliary method that creates word-to-type mappings for a determiner when the NP has no head.
 Node findHead(Node node)
          A method that identifies the head of a Noun Phrase.
 byte findType(Node node, Node headNode)
          A method that identifies the type of a Noun Phrase, depending on its structure, and words and their part-of-speech.
private  Vector getPostModifiers(Node npNode, Node headNode)
          Gets the postmodifiers of a Noun Phrase.
private  Vector getPreModifiers(Node npNode, Node headNode)
          Gets the premodifiers of a Noun Phrase.
private  byte getTypeMapping(String word, Map wordToTypeMap)
          An auxiliary method that wraps the general functionality for retrieving the NP type associated with a given word.
static void main(String[] args)
          The main method (program entry point).
private  void markUpModifiers(Vector modifiers, Node refNode, boolean premod)
          Wraps the original DOM nodes holding the modifiers in a new node.
 void processFile(String inputFileName)
          This is the main method, which triggers the other methods for identifying and setting NP syntactic features.
private  void setAgreementFeatures(Node npNode, Node headNode)
          Sets the agreement features (i.e. person, number, and gender).
private  Node setHead(Node npNode, Node headNode)
          Marks up the head of a given NP.
private  void setModifiers(Node npNode, Node headNode)
          Marks up the modifiers of a given NP.
private  void setType(Node npNode, Node headNode)
          Sets the NP-type (i.e. the-np, pers-pro, etc.).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

domDocument

private Document domDocument

headCounter

private int headCounter

modifierCounter

private int modifierCounter
Constructor Detail

PreProSyntacticHeuristics

public PreProSyntacticHeuristics()
Method Detail

processFile

public void processFile(String inputFileName)
This is the main method, which triggers the other methods for identifying and setting NP syntactic features.

Parameters:
inputFileName - The name of the file to be processed

findHead

public Node findHead(Node node)
A method that identifies the head of a Noun Phrase. Essentially, the rightmost common or proper noun, personal, possessive, or reflexive pronoun, or determiner this/these (if any) in the NP is taken to be its head.

Parameters:
node - The DOM node that holds the noun phrase to be processed
Returns:
Node The DOM node holding the head of the noun phrase
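
The rightmost-candidate rule described above can be illustrated with a minimal sketch. It is not the GuiTAR implementation: the element name W, the attribute name pos, and the use of Penn Treebank tags are all assumptions made for illustration, as is the parse helper.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class HeadFinderSketch {
    // Assumed Penn Treebank tags for nouns and pronouns (PRP also covers reflexives).
    private static final Set<String> HEAD_TAGS =
            new HashSet<>(Arrays.asList("NN", "NNS", "NNP", "NNPS", "PRP", "PRP$"));

    /** Returns the rightmost child that qualifies as head, or null if there is none. */
    static Node findHead(Node np) {
        // Scan the NP's children from right to left.
        for (Node child = np.getLastChild(); child != null; child = child.getPreviousSibling()) {
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes
            }
            Element w = (Element) child;
            String pos = w.getAttribute("pos");
            String word = w.getTextContent().trim().toLowerCase();
            boolean demonstrative =
                    "DT".equals(pos) && (word.equals("this") || word.equals("these"));
            if (HEAD_TAGS.contains(pos) || demonstrative) {
                return w;
            }
        }
        return null;
    }

    /** Parses an XML string and returns its root element (hypothetical helper). */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch an NP such as "the poor", which contains no noun, pronoun, or this/these, yields null; how the real class handles a headless NP is not stated here.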

setType

private void setType(Node npNode,
                     Node headNode)
Sets the NP-type (i.e. the-np, pers-pro, etc.).

Parameters:
npNode - The DOM node that holds the noun phrase to be processed
headNode - The head of the given noun phrase

setAgreementFeatures

private void setAgreementFeatures(Node npNode,
                                  Node headNode)
Sets the agreement features (i.e. person, number, and gender).

Parameters:
npNode - The DOM node that holds the noun phrase to be processed
headNode - The head of the given noun phrase

setHead

private Node setHead(Node npNode,
                     Node headNode)
Marks up the head of a given NP.

Parameters:
npNode - The DOM node that holds the noun phrase to be processed
headNode - The head of the given noun phrase
Returns:
Node The newly created DOM Node that holds the NP head

setModifiers

private void setModifiers(Node npNode,
                          Node headNode)
Marks up the modifiers of a given NP.

Parameters:
npNode - The DOM node that holds the noun phrase to be processed
headNode - The head of the given noun phrase

markUpModifiers

private void markUpModifiers(Vector modifiers,
                             Node refNode,
                             boolean premod)
Wraps the original DOM nodes holding the modifiers in a new node.

Parameters:
modifiers - A Vector of DOM nodes holding the modifiers
refNode - A DOM node before which the new node will be inserted
premod - True if marking premodifiers, false otherwise
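
The wrapping step can be pictured with the following sketch, which uses only standard org.w3c.dom calls. The wrapper element names pre-mod and post-mod are hypothetical, the sketch returns the wrapper (the documented method returns void) so the result can be inspected, and it assumes refNode is non-null and is not itself one of the modifiers.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ModifierMarkupSketch {
    /** Moves the modifier nodes into a new wrapper element inserted before refNode. */
    static Element markUpModifiers(List<Node> modifiers, Node refNode, boolean premod) {
        // "pre-mod" and "post-mod" are hypothetical element names.
        Element wrapper = refNode.getOwnerDocument()
                .createElement(premod ? "pre-mod" : "post-mod");
        refNode.getParentNode().insertBefore(wrapper, refNode);
        for (Node modifier : modifiers) {
            // appendChild detaches the node from its old parent before re-attaching it.
            wrapper.appendChild(modifier);
        }
        return wrapper;
    }

    /** Parses an XML string and returns its root element (hypothetical helper). */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because appendChild moves a node that is already in the tree, no explicit removal from the NP is needed before wrapping.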

getPreModifiers

private Vector getPreModifiers(Node npNode,
                               Node headNode)
Gets the premodifiers of a Noun Phrase. Starts from the NP head and moves left, sibling by sibling, storing every node on the way except determiners.

Parameters:
npNode - The DOM node that holds the noun phrase to be processed
headNode - The head of the given noun phrase
Returns:
Vector The Vector of DOM nodes holding the premodifiers
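
The leftward sibling walk might look like the sketch below. The element name W, the pos attribute, and the DT tag for determiners are assumptions. Skipping non-element nodes and prepending each node so that the Vector ends up in document order are choices of this sketch, not documented behaviour.

```java
import java.io.StringReader;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class PreModifierSketch {
    /** Collects every non-determiner sibling to the left of the head. */
    static Vector<Node> getPreModifiers(Node npNode, Node headNode) {
        // npNode is kept to mirror the documented signature; this sketch only needs the head.
        Vector<Node> premodifiers = new Vector<>();
        for (Node n = headNode.getPreviousSibling(); n != null; n = n.getPreviousSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes
            }
            if ("DT".equals(((Element) n).getAttribute("pos"))) {
                continue; // determiners are not premodifiers
            }
            premodifiers.add(0, n); // prepend so the result is in document order
        }
        return premodifiers;
    }

    /** Parses an XML string and returns its root element (hypothetical helper). */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```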

getPostModifiers

private Vector getPostModifiers(Node npNode,
                                Node headNode)
Gets the postmodifiers of a Noun Phrase. First it gets the DOM node holding the NP, then retrieves all the nodes holding words and finally stores all the words after the head as postmodifiers.

Parameters:
npNode - The DOM node that holds the noun phrase to be processed
headNode - The head of the given noun phrase
Returns:
Vector The Vector of DOM nodes holding the postmodifiers
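
A possible reading of the three steps above, sketched with standard DOM calls: take the NP element, list all word nodes beneath it in document order, and keep those that follow the head. The element name W is an assumption about the markup, and treating nested words (for example inside an embedded PP) as postmodifiers is this sketch's interpretation of "all the nodes holding words".

```java
import java.io.StringReader;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PostModifierSketch {
    /** Collects every word node that follows the head inside the NP. */
    static Vector<Node> getPostModifiers(Node npNode, Node headNode) {
        Vector<Node> postmodifiers = new Vector<>();
        // getElementsByTagName returns descendants in document order.
        NodeList words = ((Element) npNode).getElementsByTagName("W");
        boolean afterHead = false;
        for (int i = 0; i < words.getLength(); i++) {
            Node word = words.item(i);
            if (afterHead) {
                postmodifiers.add(word);
            } else if (word.isSameNode(headNode)) {
                afterHead = true; // everything from here on is a postmodifier
            }
        }
        return postmodifiers;
    }

    /** Parses an XML string and returns its root element (hypothetical helper). */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```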

findType

public byte findType(Node node,
                     Node headNode)
A method that identifies the type of a Noun Phrase, depending on its structure, and words and their part-of-speech.

Parameters:
node - The DOM node that holds the noun phrase to be processed
headNode - The head of the given noun phrase
Returns:
byte The type of the noun phrase (the-np, a-np, etc.)

getTypeMapping

private byte getTypeMapping(String word,
                            Map wordToTypeMap)
An auxiliary method that wraps the general functionality for retrieving the NP type associated with a given word. If no mapping is found for the word, the type is set to bare-np.

Parameters:
word - The key word
wordToTypeMap - The map to be searched
Returns:
byte The type of the NP containing the input word
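
The lookup-with-default behaviour can be sketched as follows. The byte constants, the three map entries, and the lower-casing of the key are illustrative assumptions; the actual codes and hard-coded mappings used by GuiTAR are not listed in this documentation.

```java
import java.util.HashMap;
import java.util.Map;

class TypeMappingSketch {
    // Hypothetical byte codes for NP types.
    static final byte BARE_NP = 0;
    static final byte THE_NP = 1;
    static final byte A_NP = 2;

    /** Builds a small hard-coded determiner map (entries are illustrative only). */
    static Map<String, Byte> createDeterminerMap() {
        Map<String, Byte> map = new HashMap<>();
        map.put("the", THE_NP);
        map.put("a", A_NP);
        map.put("an", A_NP);
        return map;
    }

    /** Returns the type mapped to the word, or BARE_NP when no mapping exists. */
    static byte getTypeMapping(String word, Map<String, Byte> wordToTypeMap) {
        // Lower-casing the key is an assumption of this sketch.
        Byte type = wordToTypeMap.get(word.toLowerCase());
        return type != null ? type : BARE_NP;
    }
}
```

With this map, a call such as getTypeMapping("The", createDeterminerMap()) yields THE_NP, while an unmapped word such as "cars" falls back to BARE_NP.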

createDeterminerMap

private Map createDeterminerMap()
An auxiliary method that creates word-to-type mappings for a determiner. The mappings are hard-coded.

Returns:
Map The map containing the corresponding associations

createDeterminerNoHeadMap

private Map createDeterminerNoHeadMap()
An auxiliary method that creates word-to-type mappings for a determiner when the NP has no head. The mappings are hard-coded.

Returns:
Map The map containing the corresponding associations

createAdjectiveMap

private Map createAdjectiveMap()
An auxiliary method that creates word-to-type mappings for an adjective. The mappings are hard-coded.

Returns:
Map The map containing the corresponding associations

createAdjectiveSuperlativeMap

private Map createAdjectiveSuperlativeMap()
An auxiliary method that creates word-to-type mappings for a superlative adjective. The mappings are hard-coded.

Returns:
Map The map containing the corresponding associations

main

public static void main(String[] args)
                 throws IOException
The main method (program entry point).

Throws:
IOException