New version of GuiTAR: gtar3.0.3.zip
Unzip the file to a working directory.
Set the classpath to include all the .jar files in the lib subdirectory.
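One way to build that classpath on a Unix-like shell is sketched below. The helper name build_classpath and the unpack directory ./gtar3.0.3 are assumptions made for illustration, not part of the distribution. On Windows the separator is ';' rather than ':'.

```shell
# Hypothetical helper: join every .jar under <dir>/lib into a ':'-separated classpath.
build_classpath() {
  dir="$1"
  cp=""
  for jar in "$dir"/lib/*.jar; do
    # Skip the unexpanded pattern when lib/ contains no .jar files.
    [ -e "$jar" ] || continue
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}

# Assumes the zip was unpacked into ./gtar3.0.3 unless GTAR_HOME says otherwise.
CLASSPATH="$(build_classpath "${GTAR_HOME:-./gtar3.0.3}")"
export CLASSPATH
```

Set GTAR_HOME to your own working directory before running the last two lines.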
If you used ltchunk with previous versions, you may want to try it with this one.
It may still work, but I have not tested it myself.
This step has changed: the first time you run the system over your input, add the command-line option '-prepro' to the command described in sec. III below.
This is probably the recommended way to use it, and it should work as in previous versions.
With the option "-i" you can provide an open-ended list of input files. Although the list is open-ended, there is a limit on how many files you can pass on the command line (especially at the Windows prompt). For many input files, a more useful option is "-f" followed by the name of a text file containing the list of input file names (one per line).
The "-t" option is almost mandatory: "-t penntagSet.ini" selects the Penn Treebank tag set used by Charniak's parser. If "-t" is not provided, the system defaults to the tag set of the proprietary software XELDA (which I have not used for a long time).
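The file list for "-f" can be generated rather than typed by hand. The sketch below assumes the input files end in .xml and live under one directory. The helper name make_file_list and the paths ./corpus and inputFiles.txt are placeholders, not names used by GuiTAR.

```shell
# Hypothetical helper: write every *.xml file under <dir> to <list>, one path per line.
# Usage: make_file_list <dir> <list>
make_file_list() {
  find "$1" -type f -name '*.xml' | sort > "$2"
}

# Example usage (placeholder paths; the flags are the ones described above):
#   make_file_list ./corpus inputFiles.txt
#   java -jar gtar3.0.3.jar -t penntagSet.ini -f inputFiles.txt
```

Keeping the list in a file also sidesteps the command-line length limit mentioned for "-i".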
This new version of GuiTAR also features a discourse-new classifier, and the zip file contains two
trained models: one for a Support Vector Machine (SVM) classifier using the libsvm software,
and one for a Maximum Entropy classifier using the OpenNLP package.
To use this facility you must provide a valid Google key, because some of the classifier's input features
are computed by querying Google through its API.
To provide the key, edit the file "penntagSet.ini" (or "tagSet.ini" if using XELDA) and
replace the value "kkk" of the parameter "GOOGLE_KEY". Otherwise the Google features will not be computed, and
the system may not behave as expected. Either classifier can be invoked as follows:
java -jar gtar3.0.3.jar -log -t penntagSet.ini -verbose -svm gnmvpc.libsvm -f masxmlFilesVPCGNM.txt
java -jar gtar3.0.3.jar -log -t penntagSet.ini -verbose -maxent gnmvpc.maxent -f masxmlFilesVPCGNM.txt
Note that the libsvm model consists of two files: one with the normalization ranges of the input features and one with the model itself. The maxent model contains only the model, so any feature normalization must be done externally.
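Before running either command, the "GOOGLE_KEY" placeholder has to be replaced. The edit can be scripted as sketched below. This assumes the parameter name and its value "kkk" sit on the same line of the .ini file. The exact file layout is not documented here, and set_google_key is a hypothetical helper name.

```shell
# Hypothetical helper: replace the placeholder value "kkk" on the GOOGLE_KEY line only.
# Usage: set_google_key <ini-file> <key>
# Note: the key must not contain '|' or '&', which are special in the sed expression.
set_google_key() {
  # -i.bak edits the file in place and keeps the original as <ini-file>.bak
  sed -i.bak "/GOOGLE_KEY/s|kkk|$2|" "$1"
}

# Example usage (the key shown is a placeholder):
#   set_google_key penntagSet.ini "YOUR-KEY-HERE"
```

Other lines that happen to contain "kkk" are left untouched, and the .bak copy makes the change easy to undo.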
Feedback most welcome! (contact)