opennlp.grok.preprocess.mwe
Class EnglishCommonFixedLexicalMWE
java.lang.Object
|
+--opennlp.grok.preprocess.mwe.LexicalMWE
|
+--opennlp.grok.preprocess.mwe.EnglishCommonFixedLexicalMWE
- All Implemented Interfaces:
- opennlp.maxent.Evalable, opennlp.common.preprocess.NameFinder, opennlp.common.preprocess.Pipelink
- public class EnglishCommonFixedLexicalMWE
- extends LexicalMWE
A Fixed Lexicon Multi-Word Expression finder that uses "EnglishCommonFixedLexicalMWE.data"
for its content model.
This finds common multi-word expressions which are completely fixed in English.
Examples are "ad hoc" and "au pair". Most are foreign-language expressions that
have been borrowed by English. Although they may be analysable in their
native language, the constituent words make no sense when analysed with
English grammar except as part of the MWE. Rather than extending the grammar to
include these special usages, it is much easier to treat the whole MWE as a
lexicon entry with the right POS, semantic, and other tags.
Token tagging is delayed to a later stage in the pipeline.
For example, the following document:
<?xml version="1.0" encoding="UTF-8"?>
<nlpDocument>
<text>
<p>
<s>
<t>
<w>ad</w>
</t>
<t>
<w>hoc</w>
</t>
</s>
</p>
</text>
</nlpDocument>
is transformed to:
<?xml version="1.0" encoding="UTF-8"?>
<nlpDocument>
<text>
<p>
<s>
<t type="mwe">
<w>ad</w>
<w>hoc</w>
</t>
</s>
</p>
</text>
</nlpDocument>
This class only supplies the MWE model; FixedLexicalMWE implements the
matching algorithm.
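The token merging shown in the example above can be sketched roughly as follows. This is a minimal illustration only, not the Grok matching code: the class FixedMweSketch, the group method, the in-memory LEXICON set (standing in for the contents of the ".data" file), and the case-insensitive comparison are all assumptions made for the sketch. Each inner list corresponds to one <t> element, and a list holding more than one word corresponds to a <t type="mwe"> element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; it does not reproduce the actual FixedLexicalMWE code.
class FixedMweSketch {
    // Assumed stand-in for the fixed lexicon loaded from the ".data" file.
    static final Set<String> LEXICON =
            new HashSet<>(Arrays.asList("ad hoc", "au pair"));
    // Longest expression in the assumed lexicon, measured in words.
    static final int MAX_LEN = 2;

    /** Groups words into tokens; a token with several words is an MWE. */
    static List<List<String>> group(List<String> words) {
        List<List<String>> tokens = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {
            int matched = 1; // default: the word stays a single-word token
            // Try the longest candidate first so longer entries win.
            for (int len = Math.min(MAX_LEN, words.size() - i); len > 1; len--) {
                String candidate =
                        String.join(" ", words.subList(i, i + len)).toLowerCase();
                if (LEXICON.contains(candidate)) {
                    matched = len;
                    break;
                }
            }
            tokens.add(new ArrayList<>(words.subList(i, i + matched)));
            i += matched;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [[an], [ad, hoc], [committee]]
        System.out.println(group(Arrays.asList("an", "ad", "hoc", "committee")));
    }
}
```

Because the lexicon entries are completely fixed, a plain set lookup over adjacent words is enough in this sketch; no inflection or gap handling is attempted.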
- Version:
- $Revision: 1.1 $, $Date: 2002/03/12 12:51:20 $
- Author:
- Mike Atkinson
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
EnglishCommonFixedLexicalMWE
public EnglishCommonFixedLexicalMWE()
- Constructor for the EnglishCommonFixedLexicalMWE object, which creates the
model.
Copyright © 2003 Jason Baldridge and Gann Bierner. All Rights Reserved.