To provide a voice processing device that appropriately and stably extracts a voice segment from audio signals.
For each word in a recognizable word list generated in Step 200, the total duration of the phonemes that continue from the leading phoneme with power values smaller than a preset threshold value is calculated, and the maximum of these totals is obtained (Steps 202 to 214). The start point of a voice segment is then defined as the time that precedes, by a period equal to this maximum value, the point at which the power value of the input audio signals first exceeds the threshold value. Next, for each word on the list, the total duration of the phonemes that continue with power values smaller than the threshold value from a phoneme other than the leading phoneme is calculated, and the maximum of these totals is obtained (Steps 216 to 228). When the power value of the input audio signals remains smaller than the threshold value for a period longer than this second maximum value, the end point of the voice segment is defined as the point at which a period equal to that maximum value has elapsed since the power value became smaller than the threshold value.
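The procedure described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the phoneme representation as (power, duration in milliseconds) pairs, the fixed frame period, and the names leading_margin, inner_margin and detect_segment are all assumptions made for illustration. In particular, the sketch reads "phonemes other than the leading phoneme" as low-power runs that begin after the leading run has ended, which is one possible interpretation of the abstract.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (power value, duration in milliseconds).
Phoneme = Tuple[float, int]


def leading_margin(words: List[List[Phoneme]], threshold: float) -> int:
    """Longest total duration of low-power phonemes at the start of any word."""
    best = 0
    for phonemes in words:
        total = 0
        for power, duration in phonemes:
            if power >= threshold:
                break  # the leading low-power run has ended
            total += duration
        best = max(best, total)
    return best


def inner_margin(words: List[List[Phoneme]], threshold: float) -> int:
    """Longest low-power run that begins after the leading run (assumed reading)."""
    best = 0
    for phonemes in words:
        i = 0
        while i < len(phonemes) and phonemes[i][0] < threshold:
            i += 1  # skip the leading low-power run
        run = 0
        for power, duration in phonemes[i:]:
            run = run + duration if power < threshold else 0
            best = max(best, run)
    return best


def detect_segment(powers: List[float], frame_ms: int, threshold: float,
                   lead: int, inner: int) -> Optional[Tuple[int, int]]:
    """Return (start_ms, end_ms) of the first complete voice segment, or None."""
    start = None
    low_since = None
    for i, power in enumerate(powers):
        t = i * frame_ms
        if start is None:
            if power > threshold:
                # Go back by the leading margin, without going before time 0.
                start = max(0, t - lead)
            continue
        if power < threshold:
            if low_since is None:
                low_since = t
            # Low power has lasted longer than the inner margin: segment ends.
            if t + frame_ms - low_since > inner:
                return start, low_since + inner
        else:
            low_since = None
    return None


# Illustrative data only; none of these values come from the patent.
words = [
    [(0.1, 30), (0.9, 80), (0.2, 40), (0.1, 20), (0.8, 60)],
    [(0.9, 50), (0.1, 50), (0.9, 70)],
    [(0.2, 20), (0.3, 30), (0.9, 100)],
]
lead = leading_margin(words, 0.5)   # 50
inner = inner_margin(words, 0.5)    # 60
powers = [0.1] * 8 + [0.9] * 5 + [0.1] * 3 + [0.9] * 4 + [0.1] * 10
segment = detect_segment(powers, 10, 0.5, lead, inner)
print(segment)  # (30, 260)
```

In this example the 30 ms dip in the middle of the utterance is shorter than the inner margin, so it does not end the segment. Only the final silence, which outlasts the margin, closes it.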
JPH06273180 | ROUTE GUIDING APPARATUS
JP4096570 | Information system, terminal, information acquisition method, program
JPH09218050 | NAVIGATION SYSTEM