To provide a speech extraction apparatus that extracts the speech of a specified person from a mixed signal containing the speech signal of the specified person and non-speech signals, while taking the time-dependent statistical state of each sound source into consideration, by using a plurality of input signals in which a plurality of sound sources are mixed and the statistical independence of the sound sources included in the input signals.
A non-speech section detection part 20 detects a non-speech section, which contains no speech, from a mixed signal in which a plurality of sound sources are mixed. A non-speech statistical state estimation part 30 estimates a non-speech statistical state by using information on the detected non-speech section. A speech extraction part 50 extracts at least one speech by sequentially updating the mixed signal according to an initial procedure selected by a separation procedure initialization part 40, using the statistical independence between the estimated non-speech statistical state and a speech statistical state stored in a speech statistical state storage part 70.
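The first two stages described above (detecting a non-speech section and estimating its statistical state) can be illustrated with a minimal single-channel sketch. This is not the method of the apparatus: the energy threshold, the Gaussian mean/variance summary, and every name used here (`FRAME_LEN`, `frame_energies`, `detect_non_speech`, `estimate_non_speech_state`, `make_demo_signal`) are assumptions chosen for illustration only, and the independence-based separation performed by parts 40 and 50 is not shown.

```python
import math
import random

FRAME_LEN = 160  # hypothetical frame length in samples (assumption)


def frame_energies(signal, frame_len=FRAME_LEN):
    """Mean power of each complete, non-overlapping frame."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def detect_non_speech(energies, rel_threshold=0.1):
    """Mark a frame as non-speech when its power is below
    rel_threshold times the largest frame power (illustrative rule)."""
    limit = rel_threshold * max(energies)
    return [e < limit for e in energies]


def estimate_non_speech_state(signal, mask, frame_len=FRAME_LEN):
    """Summarize the non-speech frames by a Gaussian (mean, variance).
    A real system would likely use a richer statistical model."""
    samples = [
        x
        for i, is_non_speech in enumerate(mask) if is_non_speech
        for x in signal[i * frame_len:(i + 1) * frame_len]
    ]
    if not samples:
        raise ValueError("no non-speech frames detected")
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean, var


def make_demo_signal(seed=0, n_frames=10, speech_frames=range(3, 7)):
    """Synthetic test input: Gaussian noise everywhere, plus a sine
    burst standing in for speech in the chosen frames."""
    rng = random.Random(seed)
    signal = []
    for f in range(n_frames):
        for n in range(FRAME_LEN):
            x = rng.gauss(0.0, 0.1)  # background (non-speech) source
            if f in speech_frames:
                x += math.sin(2.0 * math.pi * 8 * n / FRAME_LEN)
            signal.append(x)
    return signal


if __name__ == "__main__":
    sig = make_demo_signal()
    mask = detect_non_speech(frame_energies(sig))
    print("non-speech frames:", mask)
    print("estimated (mean, variance):", estimate_non_speech_state(sig, mask))
```

In this sketch the estimated variance plays the role of the non-speech statistical state; a separation stage could use such an estimate to characterize the non-speech sources before extracting the speech.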
COPYRIGHT: (C)2004,JPO
Kyoji Ohta
Hitoshi Sasaki
Mitsuyoshi Matsubara
JP2002189476A
JP2000242624A
JP2000097758A