Cogode edited this page Nov 30, 2020 · 4 revisions

Chinese NLP (Chinese Natural Language Processing)

Task 1: Text segmentation

Chinese text segmentation means having a computer split Chinese text into words. Unlike English, Chinese sentences have no spaces between words, so word boundaries must be inferred. Segmentation is considered one of the most basic steps in Chinese natural language processing. Keywords: dictionary, forward maximum matching, backward maximum matching.
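As a rough illustration of the two matching strategies named in the keywords, here is a minimal sketch. The function names, the toy vocabulary, and the single-character fallback for out-of-dictionary text are assumptions made for this example, not part of the task description.

```python
def forward_max_match(text, vocab, max_len=None):
    """Greedy left-to-right segmentation: take the longest dictionary word at each position."""
    if max_len is None:
        max_len = max((len(w) for w in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Assumed fallback: an unknown single character becomes its own token.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len=None):
    """Greedy right-to-left segmentation: take the longest dictionary word ending at each position."""
    if max_len is None:
        max_len = max((len(w) for w in vocab), default=1)
    tokens = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


# Toy vocabulary, invented for this example.
vocab = {"研究", "研究生", "生命", "的", "起源"}
print(forward_max_match("研究生命的起源", vocab))   # ['研究生', '命', '的', '起源']
print(backward_max_match("研究生命的起源", vocab))  # ['研究', '生命', '的', '起源']
```

The two directions can disagree on the same sentence, as they do here. Comparing their outputs is one simple way to spot ambiguous spans.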

Task 2: New word discovery

Dictionary-based Chinese processing depends heavily on the dictionary, but no dictionary is complete: rare words and newly coined words are missing, so text containing them cannot be segmented correctly. New word discovery helps the segmenter identify words that are not yet in the dictionary, which improves segmentation performance. It can also help public opinion analysis surface trending topics. Keywords: information entropy.
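One common way to apply information entropy here is to measure how varied the characters are on each side of a candidate string: a real word tends to appear in many different contexts, so both its left and right neighbor distributions have high entropy. The sketch below assumes that approach. The function names, the `max_len` parameter, the toy corpus, and the choice to score by the smaller of the two entropies are all illustrative assumptions, and sentence boundaries are simply ignored.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (natural log) of a Counter of neighbor characters."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log(total / c) for c in counts.values())


def neighbor_entropy(corpus, max_len=4):
    """Score every substring up to max_len by min(left entropy, right entropy)."""
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for sentence in corpus:
        n = len(sentence)
        for i in range(n):
            for j in range(i + 1, min(i + max_len, n) + 1):
                cand = sentence[i:j]
                if i > 0:
                    left[cand][sentence[i - 1]] += 1   # character just before
                if j < n:
                    right[cand][sentence[j]] += 1      # character just after
    candidates = set(left) | set(right)
    return {c: min(entropy(left[c]), entropy(right[c])) for c in candidates}


# Toy corpus, invented for this example.
corpus = ["吃葡萄", "买葡萄", "葡萄甜", "葡萄酸"]
scores = neighbor_entropy(corpus)
print(scores["葡萄"] > scores["葡"])  # True
```

In this corpus "葡萄" has varied neighbors on both sides, while "葡" is always followed by "萄", so its right entropy is zero. In practice such a score is usually combined with a frequency threshold and a check against the existing dictionary.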
