Multilingual Text Stemmer - a naive comparison
Stemmers are critical in any text mining, text analysis, or information retrieval application, such as search or text classification. Below are a few that I found which have been in use for some time (some for the last 30 years).
1. Porter Stemmer - One of the oldest. Better ones are available, so its use can be avoided as it is primitive (Porter2 is an improved version of it).
2. Kstem - Available in Lucene. I feel this is good for search applications. The stems it produces are closer to real words.
3. Snowball - This is also available in Lucene and even in Sphinx. It is a collection of stemmers for different languages (e.g. Russian, Dutch, English, German). Its collection includes the Porter stemmer too, and even a Lovins stemmer. This is

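To see why Porter stems often do not look like real words (the point behind preferring Kstem for search), here is a minimal sketch of only the first rule group, step 1a, of the original Porter algorithm. The function name porter_step_1a and the rule table name are my own, and this fragment is an illustration, not a full stemmer. For real use, rely on the implementations shipped with Lucene or Snowball.

```python
# Step 1a of the original Porter algorithm (plural handling).
# Rules are tried in order and the first matching suffix wins.
# STEP_1A_RULES and porter_step_1a are illustrative names, not a library API.
STEP_1A_RULES = [
    ("sses", "ss"),  # caresses -> caress
    ("ies", "i"),    # ponies   -> poni
    ("ss", "ss"),    # caress   -> caress (guards against stripping the final s)
    ("s", ""),       # cats     -> cat
]


def porter_step_1a(word):
    """Apply only step 1a of the Porter stemmer to a lowercase word."""
    for suffix, replacement in STEP_1A_RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


if __name__ == "__main__":
    for w in ["caresses", "ponies", "ties", "caress", "cats"]:
        print(w, "->", porter_step_1a(w))
```

Note that "ponies" becomes "poni" and "ties" becomes "ti". Neither is a dictionary word, which is fine for matching inside an index but awkward when stems are shown to users.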






