Title:
DEVICE, METHOD, AND PROGRAM FOR ANALYZING SPEECH SIGNAL
Document Type and Number:
WIPO Patent Application WO/2019/163753
Kind Code:
A1
Abstract:
The present invention makes it possible to accurately estimate, from the fundamental frequency pattern of a speech fragment, a parameter underlying that pattern, and to reconstruct the fundamental frequency pattern of the speech fragment from that parameter. A learning unit 30 learns a deep generative model from parallel data consisting of the fundamental frequency pattern of a speech signal and the parameter underlying that pattern. The parameter is treated as the latent variable of the deep generative model, which includes an encoder that estimates the latent variable from the fundamental frequency pattern of the speech signal and a decoder that reconstructs the fundamental frequency pattern of the speech signal from the latent variable.
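
The encoder/decoder arrangement described in the abstract can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the single linear layers, the segment length `T`, the latent size `D`, the Gaussian sampling step, and the form of the loss (reconstruction error plus a supervised term that ties the encoder output to the paired parameter from the parallel data). A variational-autoencoder-style latent is assumed only because one of the cited references is titled "VAE-SPACE"; the usual KL regularization term is left out for brevity, and no training loop is shown.

```python
import math
import random

random.seed(0)

T = 16  # frames per F0 (fundamental frequency) segment -- assumed value
D = 3   # size of the latent parameter vector -- assumed value

def init_linear(n_in, n_out):
    """Return (weights, bias) for one linear layer with small random weights."""
    w = [[random.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return w, [0.0] * n_out

def linear(x, layer):
    """Apply y = x W + b to a single input vector x."""
    w, b = layer
    return [sum(x[i] * w[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

# Hypothetical stand-ins for the encoder and decoder networks.
enc_mu = init_linear(T, D)      # F0 pattern -> latent mean
enc_logvar = init_linear(T, D)  # F0 pattern -> latent log-variance
dec = init_linear(D, T)         # latent -> reconstructed F0 pattern

def encode(f0):
    """Estimate the latent variable (mean, log-variance) from an F0 pattern."""
    return linear(f0, enc_mu), linear(f0, enc_logvar)

def decode(z):
    """Reconstruct an F0 pattern from a latent variable."""
    return linear(z, dec)

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def loss(f0, params):
    """Reconstruction error plus a supervised term on the latent mean."""
    mu, logvar = encode(f0)
    # Sample z = mu + sigma * eps (reparameterization-style sampling).
    z = [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0)
         for m, lv in zip(mu, logvar)]
    return mse(decode(z), f0) + mse(mu, params)

# Usage with made-up data: a smooth log-F0 contour and its paired parameter.
f0 = [5.0 + 0.1 * math.sin(t / 3.0) for t in range(T)]
params = [0.5, -0.2, 0.1]
print(loss(f0, params))
```

Minimizing such a loss over parallel data would push the encoder toward estimating the paired parameter and the decoder toward reconstructing the F0 pattern from it, which is the division of roles the abstract describes.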

Inventors:
TANAKA KO (JP)
KAMEOKA HIROKAZU (JP)
Application Number:
PCT/JP2019/006047
Publication Date:
August 29, 2019
Filing Date:
February 19, 2019
Assignee:
NIPPON TELEGRAPH & TELEPHONE (JP)
International Classes:
G10L25/90; G10L25/30
Domestic Patent References:
WO2017168870A1 2017-10-05
Foreign References:
JP2016085408A 2016-05-19
JP2015194781A 2015-11-05
JPH02239294A 1990-09-21
Other References:
HSU, CHIN-CHENG ET AL.: "Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks", INTERSPEECH, August 2017 (2017-08-01), pages 3364 - 3368, XP055633701, ISSN: 1990-9772
YOSHIZATO, KOTA ET AL.: "Estimation of phrase and accent commands from speech signals using statistical model of speech F0 contours", REPORT OF THE MEETING OF THE ACOUSTICAL SOCIETY OF JAPAN, March 2012 (2012-03-01), pages 311 - 314
NARUSAWA, SHUICHI ET AL.: "A method for automatic extraction of parameters of the fundamental frequency contour generation model", TRANSACTIONS OF THE INFORMATION PROCESSING SOCIETY OF JAPAN, vol. 43, no. 7, July 2002 (2002-07-01), pages 2155 - 2168
TANAKA, KOU ET AL.: "VAE-SPACE: Deep Generative Model for Voiced F0 contours", PROCEEDINGS OF THE ACOUSTICAL SOCIETY OF JAPAN, vol. 03, March 2018 (2018-03-01), pages 229 - 230, ISSN: 1880-7658
Attorney, Agent or Firm:
TAIYO, NAKAJIMA & KATO (JP)