PURPOSE: To identify a speaker precisely from an input voice without placing any restriction on the content of the utterance.
CONSTITUTION: The figure shows the process from voice input to speaker identification. A sound signal digitized by a sound analysis part 20 is converted by a characteristic parameter extraction part 30 into a characteristic parameter time sequence 33. A characteristic parameter group 33a covering a prescribed number of frames (m), from Pnf-m+1 to Pnf, is input to a neural network 40 while being shifted one frame at a time, and a speaker identification information time sequence 53 is obtained as the output, with one entry per frame. The speaker identification information 53 for each frame therefore reflects both the speaker-specific characteristics of the short-time spectrum shape in each frame and the speaker-specific way in which the spectrum shape changes over time across the prescribed frames. Precise speaker identification (55) is then performed based on the speaker identification information time sequence 53.
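The windowing step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the network architecture, the kind of characteristic parameters, or how sequence 53 is turned into the final decision 55. The single hidden layer, the random untrained weights, the softmax output, the averaging rule, and all function and variable names below are assumptions made for illustration only.

```python
import numpy as np


def sliding_windows(features, m):
    """Group m consecutive frames, shifting by one frame each time.

    features: array of shape (n_frames, n_params), the parameter sequence.
    Returns an array of shape (n_frames - m + 1, m * n_params); row i holds
    frames i .. i+m-1 flattened, i.e. the group ending at frame nf = i+m-1.
    """
    n_frames, _ = features.shape
    if m < 1 or m > n_frames:
        raise ValueError("m must be between 1 and the number of frames")
    return np.stack([features[i:i + m].ravel()
                     for i in range(n_frames - m + 1)])


def network_forward(windows, w1, b1, w2, b2):
    """Assumed network: one tanh hidden layer and a softmax over speakers."""
    hidden = np.tanh(windows @ w1 + b1)
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def identify_speaker(per_window_scores):
    """Assumed decision rule: average the scores over time, take the argmax."""
    return int(np.argmax(per_window_scores.mean(axis=0)))


# Hypothetical sizes and random (untrained) weights, for shape checking only.
rng = np.random.default_rng(0)
n_frames, n_params, m, n_hidden, n_speakers = 50, 12, 5, 16, 3
features = rng.normal(size=(n_frames, n_params))
w1 = rng.normal(size=(m * n_params, n_hidden)) * 0.1
b1 = np.zeros(n_hidden)
w2 = rng.normal(size=(n_hidden, n_speakers)) * 0.1
b2 = np.zeros(n_speakers)

windows = sliding_windows(features, m)
scores = network_forward(windows, w1, b1, w2, b2)  # one row per window
speaker = identify_speaker(scores)
```

Because each network input spans m frames, every output row can depend on how the spectrum changes within that span as well as on the shape of each individual frame, which is the property the abstract emphasizes.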
KAMODA MORITOSHI
KATO TOSHIFUMI