opennlp.grok.preprocess.mwe
Class EnglishFixedLexicalMWE
java.lang.Object
  |
  +--opennlp.grok.preprocess.mwe.LexicalMWE
        |
        +--opennlp.grok.preprocess.mwe.EnglishFixedLexicalMWE
- All Implemented Interfaces:
- opennlp.maxent.Evalable, opennlp.common.preprocess.NameFinder, opennlp.common.preprocess.Pipelink
- public class EnglishFixedLexicalMWE
- extends LexicalMWE
A fixed-lexicon multi-word expression (MWE) finder that uses "EnglishFixedLexicalMWE.data"
for its content model.
This finds both common and rare multi-word expressions that are completely fixed
in English. Examples are "ad hoc", "au pair", "aes alienum", and "ben trovato". Most
are foreign-language expressions that English has borrowed. Although they may be
analysable in their native language, the constituent words make no sense when
analysed with English grammar except as part of the MWE. Rather than extend the
grammar to include these special usages, it is much easier to treat the whole MWE
as a single lexicon entry with the right POS, semantic, and other tags.
Token tagging is delayed to a later stage in the pipeline. For example:
<?xml version="1.0" encoding="UTF-8"?>
<nlpDocument>
  <text>
    <p>
      <s>
        <t>
          <w>ad</w>
        </t>
        <t>
          <w>hoc</w>
        </t>
      </s>
    </p>
  </text>
</nlpDocument>
is transformed to:
<?xml version="1.0" encoding="UTF-8"?>
<nlpDocument>
  <text>
    <p>
      <s>
        <t type="mwe">
          <w>ad</w>
          <w>hoc</w>
        </t>
      </s>
    </p>
  </text>
</nlpDocument>
This class only loads the English MWE model; the matching algorithm itself is
implemented by its superclass, LexicalMWE.
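The token merging illustrated by the XML example can be sketched as a greedy longest-match over a fixed lexicon. This is only an illustrative sketch, not the Grok implementation: the class FixedMweSketch, the method group, the in-memory lexicon, and the exact (case-sensitive) comparison are all assumptions made for the example, and the format of "EnglishFixedLexicalMWE.data" is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: groups adjacent tokens that form a fixed MWE.
class FixedMweSketch {

    // Returns the tokens grouped so that each lexicon match becomes one
    // multi-word group and every other token becomes a group of its own.
    static List<List<String>> group(List<String> tokens,
                                    Set<List<String>> lexicon,
                                    int maxLen) {
        List<List<String>> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 1; // default: the token stands alone
            // Try the longest candidate first, down to two words.
            int longest = Math.min(maxLen, tokens.size() - i);
            for (int len = longest; len >= 2; len--) {
                if (lexicon.contains(tokens.subList(i, i + len))) {
                    matched = len;
                    break;
                }
            }
            out.add(new ArrayList<>(tokens.subList(i, i + matched)));
            i += matched;
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed toy lexicon; the real model is read from a data file.
        Set<List<String>> lexicon = new HashSet<>();
        lexicon.add(Arrays.asList("ad", "hoc"));
        lexicon.add(Arrays.asList("au", "pair"));

        List<String> tokens = Arrays.asList("ad", "hoc", "committee");
        // prints [[ad, hoc], [committee]]
        System.out.println(group(tokens, lexicon, 2));
    }
}
```

A group with more than one word corresponds to a <t type="mwe"> element holding several <w> children; a single-word group corresponds to an ordinary <t> element.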
- Version:
- $Revision: 1.1 $, $Date: 2002/03/12 12:51:20 $
- Author:
- Mike Atkinson
Constructor Summary
EnglishFixedLexicalMWE()
Constructor for the EnglishFixedLexicalMWE object, which creates the
model.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
EnglishFixedLexicalMWE
public EnglishFixedLexicalMWE()
- Constructor for the EnglishFixedLexicalMWE object, which creates the
model.
Copyright © 2003 Jason Baldridge and Gann Bierner. All Rights Reserved.