com.swfit.core.search
Class ScandinavianAnalyzer
java.lang.Object
  |
  +--org.apache.lucene.analysis.Analyzer
        |
        +--com.swfit.core.search.ScandinavianAnalyzer
- public final class ScandinavianAnalyzer
- extends org.apache.lucene.analysis.Analyzer
The ScandinavianAnalyzer is a required
A ScandinavianAnalyzer is a filter that does as little as possible in the way of
excluding stop words (there aren't any yet). The SWFIT package is not intended for
gigabytes of data, just a few hundred HTML documents, and the only effect I am
looking for here is that people without Scandinavian keyboards (and character sets)
can find terms spelled with local characters.
A word of advice: You are sincerely advised NOT to implement this code, as I
know far too little about linguistics, Unicode, character sets - you name it -
to make a valuable contribution to the field. Whatever I do here should have
been done by a professional. Go back and read the last sentence. For all I know,
this code will corrupt your data.
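To illustrate the effect described above, here is a minimal sketch of folding Scandinavian letters to plain ASCII so that a query typed as "tromso" can match the indexed term "Tromsø". It is not the SWFIT implementation: the class name ScandinavianFoldingSketch, the method fold, and the mapping itself (å and ä to a, ø and ö to o, æ to ae) are assumptions made for this example only.

```java
import java.util.Locale;

// Hypothetical helper, not part of SWFIT or Lucene.
final class ScandinavianFoldingSketch {

    // Lower-cases a term and replaces Scandinavian letters with ASCII
    // look-alikes. The mapping below is an assumption for illustration.
    static String fold(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        StringBuilder out = new StringBuilder(lower.length());
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            switch (c) {
                case '\u00e5': // a with ring above
                case '\u00e4': // a with diaeresis
                    out.append('a');
                    break;
                case '\u00f8': // o with stroke
                case '\u00f6': // o with diaeresis
                    out.append('o');
                    break;
                case '\u00e6': // ae ligature
                    out.append("ae");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Folding both the indexed term and the query gives them the same form.
        System.out.println(fold("Troms\u00f8"));
    }
}
```

The same folding would have to be applied both when indexing and when parsing queries; otherwise the folded query would never meet an unfolded term in the index.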
- Since:
- SWFIT1.0
- Version:
- $Revision: 1.2 $ $Date: 2003/02/03 05:57:34 $
- Author:
- Olaf Havnes
Constructor Summary
ScandinavianAnalyzer()
- Builds an analyzer with a default hashtable.
ScandinavianAnalyzer(java.util.Hashtable stop_table)
- Builds an analyzer with a Hashtable of given stop words
ScandinavianAnalyzer(java.lang.String[] stop_words)
- Builds an analyzer with an array of Strings of given stop words
Method Summary
org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldName, java.io.Reader reader)
- Constructs a StandardTokenizer filtered by
Methods inherited from class org.apache.lucene.analysis.Analyzer
tokenStream
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ScandinavianAnalyzer
public ScandinavianAnalyzer()
- Builds an analyzer with a default hashtable.
ScandinavianAnalyzer
public ScandinavianAnalyzer(java.lang.String[] stop_words)
- Builds an analyzer with an array of Strings of given stop words
ScandinavianAnalyzer
public ScandinavianAnalyzer(java.util.Hashtable stop_table)
- Builds an analyzer with a Hashtable of given stop words
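The constructors above accept stop words either as a String array or as a ready-made Hashtable. How the class converts between the two is not documented here, so the following is only a sketch of one plausible approach: building a lookup table from an array so that each term can be checked cheaply. The names StopTableSketch, makeStopTable and isStopWord are hypothetical, and the generic type parameters are a modern convenience that the 2003 code would not have had.

```java
import java.util.Hashtable;

// Hypothetical helper, not part of SWFIT or Lucene.
final class StopTableSketch {

    // Stores every stop word as both key and value, so the table acts as a set.
    static Hashtable<String, String> makeStopTable(String[] stopWords) {
        Hashtable<String, String> table = new Hashtable<>();
        for (String word : stopWords) {
            table.put(word, word);
        }
        return table;
    }

    // A term is excluded from the index only if it is a key in the table.
    static boolean isStopWord(Hashtable<String, String> table, String term) {
        return table.containsKey(term);
    }

    public static void main(String[] args) {
        // Example stop words chosen for illustration; the class ships with none yet.
        Hashtable<String, String> table = makeStopTable(new String[] {"og", "i", "det"});
        System.out.println(isStopWord(table, "og"));
        System.out.println(isStopWord(table, "fjord"));
    }
}
```

An empty array yields an empty table, which matches the stated goal of excluding as little as possible.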
tokenStream
public final org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldName,
java.io.Reader reader)
- Constructs a StandardTokenizer filtered by
- Overrides:
tokenStream in class org.apache.lucene.analysis.Analyzer
Copyright © 2003 Orgdot AS. All Rights Reserved.