opennlp.grok.preprocess.mwe
Class EnglishVariableLexicalMWE
java.lang.Object
|
+--opennlp.grok.preprocess.mwe.LexicalMWE
|
+--opennlp.grok.preprocess.mwe.EnglishVariableLexicalMWE
- All Implemented Interfaces:
- opennlp.maxent.Evalable, opennlp.common.preprocess.NameFinder, opennlp.common.preprocess.Pipelink
- public class EnglishVariableLexicalMWE
- extends LexicalMWE
A Fixed Lexicon Multi-Word Expression finder that uses "EnglishFixedLexicalMWE.data"
for its content model.
This finds both common and rare multi-word expressions (MWEs) which are completely
fixed in English. Examples are "ad hoc", "au pair", "aes alienum", "ben trovato". Most
are foreign-language expressions which have been borrowed by English. Although
they might be analysable in their native language, the constituent words make
no sense when analysed with English grammar except as part of the MWE. Rather
than extend the grammar to include these special usages, it is much easier to
treat the whole MWE as a lexicon entry with the right POS, semantic, etc. tags.
Token tagging is delayed to a later stage in the pipeline. For example:
<?xml version="1.0" encoding="UTF-8"?>
<nlpDocument>
<text>
<p>
<s>
<t>
<w>ad</w>
</t>
<t>
<w>hoc</w>
</t>
</s>
</p>
</text>
</nlpDocument>
is transformed to:
<?xml version="1.0" encoding="UTF-8"?>
<nlpDocument>
<text>
<p>
<s>
<t type="mwe">
<w>ad</w>
<w>hoc</w>
</t>
</s>
</p>
</text>
</nlpDocument>
This class only supplies the MWE model; the matching algorithm itself is
implemented in FixedLexicalMWE.
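The token merging shown in the XML example above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Grok implementation: the class name MweMergeSketch, the merge method, the in-memory lexicon, and the greedy longest-match strategy are all hypothetical, and the real class reads its entries from a model file and operates on the nlpDocument XML.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not part of opennlp.grok.
class MweMergeSketch {

    // Greedy longest-match merge. Each element of the result is one token:
    // a single word for an ordinary token, several words for a merged MWE.
    // Matching is exact and case-sensitive (an assumption of this sketch).
    static List<List<String>> merge(List<String> words, Set<List<String>> lexicon) {
        // Longest lexicon entry bounds how far ahead we need to look.
        int maxLen = 1;
        for (List<String> entry : lexicon) {
            maxLen = Math.max(maxLen, entry.size());
        }

        List<List<String>> tokens = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {
            int matched = 1; // default: emit a single-word token
            // Try the longest candidate span first, down to two words.
            for (int len = Math.min(maxLen, words.size() - i); len >= 2; len--) {
                if (lexicon.contains(words.subList(i, i + len))) {
                    matched = len;
                    break;
                }
            }
            tokens.add(new ArrayList<>(words.subList(i, i + matched)));
            i += matched;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Stand-in for the content model; entries taken from the examples above.
        Set<List<String>> lexicon = new HashSet<>();
        lexicon.add(Arrays.asList("ad", "hoc"));
        lexicon.add(Arrays.asList("au", "pair"));

        List<String> sentence = Arrays.asList("an", "ad", "hoc", "committee");
        for (List<String> token : merge(sentence, lexicon)) {
            // Multi-word tokens are marked the same way as in the XML example.
            String type = token.size() > 1 ? " type=\"mwe\"" : "";
            System.out.println("<t" + type + "> " + String.join(" ", token) + " </t>");
        }
    }
}
```

In this sketch "ad" and "hoc" end up in one token marked type="mwe", while the surrounding words stay as single-word tokens. Trying the longest span first means a longer lexicon entry wins over a shorter one that shares its prefix.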
- Version:
- $Revision: 1.1 $, $Date: 2002/03/12 12:51:20 $
- Author:
- Mike Atkinson
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
EnglishVariableLexicalMWE
public EnglishVariableLexicalMWE()
- Constructor for the EnglishVariableLexicalMWE object, which creates the
model.
Copyright © 2003 Jason Baldridge and Gann Bierner. All Rights Reserved.