Speech Synthesis Device, Method, and Program - Oki Electric Industry Co., Ltd.

Title:

Speech Synthesis Device, Method, and Program

Document Type and Number:

Japanese Patent JP5040778

Kind Code:

B2

Abstract:

To reflect the intonation intended by a speaker in synthetic voiced speech that is generated from non-voiced speech and a lip image.

A speech synthesis device receives the speaker's non-voiced speech and a captured lip image in synchronization and generates synthetic voiced speech. An image signal analysis means extracts vowel information from the input lip image, and extracts, as a pitch ratio, the ratio of the lip opening size during vowel pronunciation to a predetermined reference size. A speech signal analysis means extracts consonant information from the input non-voiced speech. It extracts text information using an acoustic model of the non-voiced vowel corresponding to the vowel extracted by the image signal analysis means, a built-in dictionary that stores phoneme sequences and words in association with each other, and a language model for calculating word sequences. It also extracts the duration of the whole utterance from the power variation of the input non-voiced speech. A speech synthesis means synthesizes voiced speech with intonation added, based on the information extracted by both analysis means.
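Two of the quantities named in the abstract, the pitch ratio and the utterance duration, can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the idea of scaling a base fundamental frequency by the ratio, and the threshold-based duration estimate are all assumptions made for illustration. The abstract states only that the ratio of lip opening size to a reference size is taken as a pitch ratio and that duration is derived from power variation.

```python
from typing import Sequence


def pitch_ratio(lip_opening: float, reference_opening: float) -> float:
    """Ratio of the lip opening size at a vowel to a predetermined reference size."""
    if reference_opening <= 0:
        raise ValueError("reference_opening must be positive")
    return lip_opening / reference_opening


def scaled_pitch(base_f0_hz: float, ratio: float) -> float:
    """Assumed mapping (not from the source): scale a base pitch by the ratio."""
    return base_f0_hz * ratio


def utterance_duration(
    frame_power: Sequence[float], threshold: float, frame_sec: float
) -> float:
    """Assumed estimate: time from the first to the last frame above a power threshold."""
    active = [i for i, p in enumerate(frame_power) if p > threshold]
    if not active:
        return 0.0
    return (active[-1] - active[0] + 1) * frame_sec
```

Under these assumptions, a lip opening 20% larger than the reference would raise a hypothetical 120 Hz base pitch proportionally, and the duration estimate covers every frame between the first and last frames that exceed the threshold, including quieter frames in between.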

Inventors:

Tsutomu Kaneyasu

Application Number:

JP2008097726A

Publication Date:

October 03, 2012

Filing Date:

April 04, 2008

Assignee:

Oki Electric Industry Co., Ltd.

International Classes:

G10L13/08; G10L13/10; G10L15/24; G10L21/003; G10L21/043; G10L25/90

Domestic Patent References:

JP2006276470A
JP200068882A
JP10240283A
JP2002351489A

Attorney, Agent or Firm:

Nobuyuki Kudo
