To detect meaning boundaries, such as those of phrases, between the word groups that make up a speech recognition result.
NBEST candidates are taken for each speech segment of the input speech recognition result data. For each speech segment, the word sets contained in its NBEST candidates are merged, the words are sorted in increasing order of their start time information, and unnecessary words are removed from the resulting word string. The word strings of all speech segments are then connected into a single word string. Before and after each word border in this string, a window is specified as a range containing a fixed number of words, and a vector representing the meaning of each window is calculated. A similarity, such as the cosine measure, between the vectors of the two adjacent windows is calculated as the degree of cohesion (bundling) at that border, and the speech segment boundary nearest to each minimum point of this degree is adopted as a meaning boundary.
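The procedure described above can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the abstract does not say how the window vectors are built, so plain word-count vectors are assumed here, and the function names (`merge_nbest`, `cohesion_curve`, `detect_boundaries`), the `(word, start_time)` tuple layout, and the strict local-minimum test are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def merge_nbest(candidates, stopwords=frozenset()):
    """Merge the NBEST candidates of one speech segment into one word string.

    candidates: list of candidates, each a list of (word, start_time) pairs.
    Duplicate (word, start_time) pairs are kept once, words are sorted by
    start time, and words listed in `stopwords` are dropped.
    """
    merged = {(w, t) for cand in candidates for (w, t) in cand}
    ordered = sorted(merged, key=lambda wt: (wt[1], wt[0]))
    return [w for w, _ in ordered if w not in stopwords]


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cohesion_curve(words, window):
    """Cohesion at each word border i, i.e. between words[i-1] and words[i].

    Assumption: a window's "meaning vector" is simply its bag-of-words count.
    """
    scores = {}
    for i in range(window, len(words) - window + 1):
        left = Counter(words[i - window:i])
        right = Counter(words[i:i + window])
        scores[i] = cosine(left, right)
    return scores


def detect_boundaries(segments, window):
    """Return word indices of segment borders nearest to cohesion minima.

    segments: list of word strings, one per speech segment, already merged.
    """
    words, seg_borders = [], []
    for seg in segments:
        words.extend(seg)
        seg_borders.append(len(words))
    seg_borders = seg_borders[:-1]  # the end of the last segment is not a border
    if not seg_borders:
        return []
    scores = cohesion_curve(words, window)
    minima = [
        i for i in sorted(scores)
        if i - 1 in scores and i + 1 in scores
        and scores[i] < scores[i - 1] and scores[i] <= scores[i + 1]
    ]
    # Snap each minimum to the nearest speech segment border.
    return sorted({min(seg_borders, key=lambda b: abs(b - m)) for m in minima})


segments = [
    ["cat", "dog", "cat"], ["dog", "cat", "dog"],
    ["stock", "bond", "stock"], ["bond", "stock", "bond"],
]
boundaries = detect_boundaries(segments, window=3)
```

In this toy input the vocabulary changes completely after the sixth word, so the cohesion drops to zero at that border and the segment border at word index 6 is the one reported. A real system would likely use weighted or dimension-reduced vectors rather than raw counts.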
COPYRIGHT: (C)2004,JPO
JP2000099089A
JP2001154936A
JP1276266A
WO2002027546A1
Shinichiro Nishizawa et al., "Text Segmentation Using the Within-Document Frequency of Nouns," IPSJ SIG Technical Report, NL117-20 (1997-01), pp. 145-152