opennlp.grok.preprocess.mwe
Class EnglishVariableLexicalMWE

java.lang.Object
  |
  +--opennlp.grok.preprocess.mwe.LexicalMWE
        |
        +--opennlp.grok.preprocess.mwe.EnglishVariableLexicalMWE
All Implemented Interfaces:
opennlp.maxent.Evalable, opennlp.common.preprocess.NameFinder, opennlp.common.preprocess.Pipelink

public class EnglishVariableLexicalMWE
extends LexicalMWE

A Fixed Lexicon Multi-Word Expression finder that uses "EnglishFixedLexicalMWE.data" for its content model.

This finds both common and rare multi-word expressions that are completely fixed in English, for example "ad hoc", "au pair", "aes alienum" and "ben trovato". Most are foreign-language expressions that have been borrowed by English. Although they may be analysable in their native language, the constituent words make no sense when analysed with English grammar except as part of the MWE. Rather than extend the grammar to cover these special usages, it is much easier to treat the whole MWE as a single lexicon entry with the right POS, semantic, etc. tags.

Token tagging is delayed to a later stage in the pipeline.

 <?xml version="1.0" encoding="UTF-8"?>
 <nlpDocument>
   <text>
     <p>
       <s>
         <t>
           <w>ad</w>
         </t>
         <t>
           <w>hoc</w>
         </t>
       </s>
     </p>
   </text>
 </nlpDocument>

 is transformed to:

 <?xml version="1.0" encoding="UTF-8"?>
 <nlpDocument>
   <text>
     <p>
       <s>
         <t type="mwe">
           <w>ad</w>
           <w>hoc</w>
         </t>
       </s>
     </p>
   </text>
 </nlpDocument>

 
This class only obtains the MWE model; the matching algorithm itself is implemented by FixedLexicalMWE.
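
The matching algorithm itself is not shown on this page. As a rough illustration of the transformation above, the following is a minimal sketch that assumes a greedy, longest-match lookup of adjacent words against a fixed set of expressions. The class name MweMergeSketch, the merge method and the in-memory lexicon are hypothetical and are not part of the opennlp.grok API; the real model is loaded from a data file whose format is not described here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, NOT the opennlp.grok implementation.
public class MweMergeSketch {

    /**
     * Groups words into tokens. A run of adjacent words that appears in the
     * lexicon (words joined by single spaces) becomes one multi-word token;
     * every other word becomes a single-word token.
     *
     * @param words   the words of one sentence, in order
     * @param lexicon fixed expressions, e.g. "ad hoc" (assumed representation)
     * @param maxLen  the longest expression length, in words, to try
     */
    public static List<List<String>> merge(List<String> words, Set<String> lexicon, int maxLen) {
        List<List<String>> tokens = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {
            int matched = 1; // default: the word stays a token on its own
            // Try the longest candidate first so longer expressions win.
            for (int len = Math.min(maxLen, words.size() - i); len >= 2; len--) {
                String candidate = String.join(" ", words.subList(i, i + len));
                if (lexicon.contains(candidate)) {
                    matched = len;
                    break;
                }
            }
            tokens.add(new ArrayList<>(words.subList(i, i + matched)));
            i += matched;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> lexicon = new HashSet<>(
                Arrays.asList("ad hoc", "au pair", "aes alienum", "ben trovato"));
        List<String> sentence = Arrays.asList("an", "ad", "hoc", "committee");
        // prints [[an], [ad, hoc], [committee]]
        System.out.println(merge(sentence, lexicon, 2));
    }
}
```

Each inner list corresponds to one <t> element in the XML above: a list with more than one word plays the role of <t type="mwe">, while POS and semantic tagging is left to a later stage.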

Version:
$Revision: 1.1 $, $Date: 2002/03/12 12:51:20 $
Author:
Mike Atkinson

Field Summary
 
Fields inherited from class opennlp.grok.preprocess.mwe.LexicalMWE
model
 
Constructor Summary
EnglishVariableLexicalMWE()
          Constructor for the EnglishVariableLexicalMWE object, which creates the model.
 
Methods inherited from class opennlp.grok.preprocess.mwe.LexicalMWE
getEventCollector, getNegativeOutcome, localEval, process, requires
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EnglishVariableLexicalMWE

public EnglishVariableLexicalMWE()
Constructor for the EnglishVariableLexicalMWE object, which creates the model.



Copyright © 2003 Jason Baldridge and Gann Bierner. All Rights Reserved.