README.txt

The Stanford segmenter distribution includes tools for segmenting
Chinese and Arabic text.  See README-Chinese.txt and README-Arabic.txt
for more details.

------------------------------------
LICENSE
------------------------------------

 This program is free software; you can redistribute it and/or
 modify it under the terms of the GNU General Public License
 as published by the Free Software Foundation; either version 2
 of the License, or (at your option) any later version.

 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.

 You should have received a copy of the GNU General Public License
 along with this program.  If not, see http://www.gnu.org/licenses/ .

 For more information, bug reports, fixes, contact:
    Christopher Manning
    Dept of Computer Science, Gates 2A
    Stanford CA 94305-9020
    USA
    manning@cs.stanford.edu


------------------------------------
CHANGES
------------------------------------

2020-11-17    4.2.0     Update for compatibility 

2020-05-10    4.0.0     New Chinese segmenter trained off of CTB 9.0 

2018-10-16    3.9.2     Update for compatibility 

2018-02-27    3.9.1     Update for compatibility

2016-10-31    3.7.0     Update for compatibility 

2015-12-09    3.6.0     Update for compatibility 

2015-04-20    3.5.2     Update for compatibility 

2015-01-29    3.5.1     Update for compatibility 

2014-10-26    3.5.0     Upgrade to Java 1.8 

2014-08-27    3.4.1     Update for compatibility 

2014-06-16      3.4     Update Arabic segmenter 

2014-01-04    3.3.1     Bugfix release 

2013-11-12    3.3.0     Update for compatibility 

2013-06-19    3.2.0     Improve handling of line-by-line input 

2013-04-04    1.6.8     ctb7 model, -nthreads option 

2012-11-11    1.6.7     Bugfixes to both Arabic and Chinese 
                        segmenters; Chinese segmenter can now load 
                        files from jar 

2012-07-09    1.6.6     Improved Arabic model 

2012-05-22    1.6.5     Supports stdin.

2012-03-09    1.6.4     Arabic segmenter now included.

2011-12-16    1.6.3     Updated code to maintain compatibility

2011-09-14    1.6.2     Updated code to maintain compatibility

2011-06-15    1.6.1     Updated code to maintain compatibility

2011-05-15      1.6     Updated models, code to be compatible with
                        other current releases

2008-05-21      1.5     The models in distribution incorporate
                        training lexicon features.  In addition, the
                        segmenter now supports k-best output.

2006-05-12      1.0     This distribution includes models of two
                        segmentation standards -- CTB and PK
                        (Beijing Univ.)

2006-04-10      0.9     Add normalization for punctuation/symbol
                        characters from the "ASCII" (U+0021-U+007E)
                        code point range.