Tuesday, February 26, 2008

[ANN] RMMSeg 0.1.2 Released

This release mainly improves performance and reduces memory usage.

rmmseg version 0.1.2
by pluskid
http://rmmseg.rubyforge.org

== DESCRIPTION

RMMSeg is an implementation of the MMSEG Chinese word segmentation
algorithm. It provides two algorithms, both variants of maximum
matching:

* simple algorithm, which uses only forward maximum matching.
* complex algorithm, which uses three-word chunk maximum matching and
  three additional rules to resolve ambiguities.
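
To make the simple algorithm concrete, here is a minimal sketch of forward maximum matching in plain Ruby. It is not RMMSeg's own code: the method name simple_segment, the toy DICTIONARY and the sample sentence are all assumptions chosen for illustration.

```ruby
# Illustrative toy dictionary (an assumption); a real segmenter loads a
# large word list.
DICTIONARY = %w[研究 研究生 生命 起源].freeze
MAX_WORD_LENGTH = DICTIONARY.map(&:length).max

# Forward maximum matching: at each position take the longest dictionary
# word that starts there, falling back to a single character.
# simple_segment is a hypothetical name, not part of RMMSeg's API.
def simple_segment(text, dictionary = DICTIONARY, max_len = MAX_WORD_LENGTH)
  words = []
  pos = 0
  while pos < text.length
    len = [max_len, text.length - pos].min
    # Shrink the candidate until it is a known word or one character long.
    len -= 1 while len > 1 && !dictionary.include?(text[pos, len])
    words << text[pos, len]
    pos += len
  end
  words
end

p simple_segment("研究生命起源")
# prints ["研究生", "命", "起源"]
```

Greedy matching takes 研究生 first and leaves 命 stranded as a single character. Ambiguities of this kind are what the complex algorithm's three-word chunks and additional rules are meant to resolve.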

For more information about the algorithm, please refer to the
following essays:

* http://technology.chtsai.org/mmseg/
* http://pluskid.lifegoo.com/?p=261

== CHANGES

* Add cache to find_match_words: performance improved.
* Implement Chunk as a module instead of a class: performance improved.
* Don’t store unnecessary data in dictionary: memory usage reduced.
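
The caching idea in the first change can be sketched as follows. This is only a guess at the general technique (memoizing lookups by starting offset), not the actual implementation: the MatchFinder class, its match_words method and the lookups counter are hypothetical names, and the real find_match_words may have a different signature.

```ruby
# Hypothetical sketch, not RMMSeg's code: remember which dictionary words
# start at each offset so repeated queries do no extra dictionary work.
class MatchFinder
  attr_reader :lookups

  def initialize(text, dictionary)
    @text = text
    @dictionary = dictionary
    @max_len = dictionary.map(&:length).max
    @cache = {}    # offset => array of words starting at that offset
    @lookups = 0   # number of uncached computations, for illustration
  end

  def match_words(pos)
    @cache[pos] ||= begin
      @lookups += 1
      longest = [@max_len, @text.length - pos].min
      found = (1..longest).map { |len| @text[pos, len] }
                          .select { |word| @dictionary.include?(word) }
      # Fall back to the single character when nothing matches.
      found.empty? ? [@text[pos, 1]] : found
    end
  end
end

finder = MatchFinder.new("研究生命起源", %w[研究 研究生 生命 起源])
finder.match_words(0)   # computed once
finder.match_words(0)   # served from the cache
```

A cache like this pays off when the same offset is queried repeatedly, which is plausible when many three-word chunks are built from overlapping positions.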