NEWS
jiebaR 0.10
- Major Change: update CppJieba version to 5.0.0.
- Remove: 'query_threshold' and 'words_locate'
- Remove: 'level' and 'level_pair' methods for worker
- Change: query mode now behaves the same as Python jieba 'cut_for_search'.
- Fix: special Unicode string decoding error
- Fix: GCC 8 warnings
jiebaR 0.9.1 (2016-09-28)
- Major Change: 'distance' and 'vector_distance' now return integer value as distance.
- Major Change: requires C++11 with GCC 4.9+ to build this package
- Fix: 'tobin' now returns the correct value
- Fix: 'get_idf' row names now use a 1-based index
- Add: 'new_user_word' now has a default tag
- Add: 'apply_list' to handle nested list input data
- Add: 'simhash_dist' to compute distance of simhash values
- Add: 'simhash_dist_mat' to compute the distance matrix of simhash values
- Add: 'vector_tag' to tag a character vector
- Add: more docs
- Deprecated: quick mode will be removed in v0.11.0
- Deprecated: filecoding, renamed to file_coding
- Warning: the next version will update the internal CppJieba version to 5.0.0; 'query_threshold' and 'words_locate' will be removed due to upstream API changes.
jiebaR 0.8.2
- Add: user_weight option for worker(); the default value is the max weight.
- Fix: Build with R 3.3.0
jiebaR 0.8 (2016-01-30)
- Remove: ShowDictPath() EditDict() tag()
- Remove: some C API due to CppJieba V4.4.1 update.
- C APIs will not work: jiebaR_mp_ptr jiebaR_mp_cut jiebaR_query_ptr jiebaR_query_cut jiebaR_hmm_ptr jiebaR_hmm_cut.
- C APIs will work but give a warning: jiebaR_mix_ptr jiebaR_mix_cut jiebaR_tag_ptr jiebaR_tag_tag jiebaR_tag_file.
- C APIs change: jiebaR_key_ptr and jiebaR_sim_ptr add a user path variable.
- Add: some C API due to CppJieba V4.4.1 update.
jiebaR_jiebaclass_ptr, jiebaR_jiebaclass_mix_cut, jiebaR_jiebaclass_mp_cut, jiebaR_jiebaclass_hmm_cut, jiebaR_jiebaclass_query_cut, jiebaR_jiebaclass_full_cut, jiebaR_jiebaclass_level_cut, jiebaR_jiebaclass_level_cut_pair, jiebaR_jiebaclass_tag_tag, jiebaR_jiebaclass_tag_file, jiebaR_set_query_threshold, jiebaR_add_user_word, jiebaR_u64tobin, jiebaR_get_loc
- Add: more segmentation types: full cut and level cut.
- Add: default attribute for the type of segmentation.
- Add: new user words can be added after the worker engine is created.
- Add: query_threshold to update query threshold
- Add: words_locate to locate the positions of words
- Fix: build on GCC 5.3.2 with gnu++14
- Fix: build on Clang 3.8 RC
- Fix: add roxygen2 as a dependency for the update of devtools
jiebaR 0.7 (2015-12-06)
- Add: tobin() to transform simhash to binary format.
- Add: vector_simhash() vector_distance() to extract simhash or compute Hamming distance from the result of segmentation.
- Add: get_tuple() to get tuple from segmentation result.
- Add: get_idf() to generate IDF dict.
- Fix: C API now works with Clang on Mac 10.11.
- Enhancement: Update tests for C API.
- Warning: The next version will update the internal CppJieba version, and tag(), EditDict(), ShowDictPath() will be removed.
jiebaR 0.6 (2015-10-01)
- Add: C API.
- Add: freq() to count word frequency.
- Fix: filter_segment() may occasionally remove words.
- Enhancement: filter_segment() can now handle a list of vectors of words.
- Enhancement: segmentation worker can now remove stop words. The default STOPPATH is not used by the segmentation worker by default.
- Enhancement: when symbol = F, tokens such as 2010-10-13 and 10.2 can be identified.
jiebaR 0.5 (2015-04-29)
- Fix: edit_dict() on Mac.
- New function: filter_segment() to filter segmentation result.
- New function: vector_keywords() to extract keywords from a string.
- Enhancement: Segmentation support: Vector input => List output.
- Enhancement: Segmentation support: Input by lines => Output by lines.
- Enhancement: Add option write = "NOFILE".
- Enhancement: New rules for "English word + Numbers".
- Update documentation.
jiebaR 0.4 (2015-01-04)
- Remove Rcpp Modules.
- Better symbol filter in segmentation.
- Separate data files to jiebaRD package.
jiebaR 0.3 (2014-12-01)
- 2X segmentation speed.
- Quick Mode.
- A new '[' symbol to do segmentation.
- Portable string utility function.
jiebaR 0.2 (2014-11-23)