Title:
DEEP LEARNING-BASED SPEECH TRAINING METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2022/141842
Kind Code:
A1
Abstract:
A deep learning-based speech training method and apparatus, a computer device, and a storage medium are provided, applied in the technical field of artificial intelligence. The method trains a speech synthesis model by means of teacher and student neural networks, so that the speech synthesis model can be trained efficiently and rapidly with low resource consumption. The method comprises: coding a first phoneme sequence to obtain a first phoneme coded value (S101); performing duration prediction on the first phoneme coded value to obtain a first pronunciation duration predicted value (S102); expanding each phoneme in the first phoneme sequence to obtain an expanded feature of each phoneme in the first phoneme sequence (S103); transforming the expanded feature of each phoneme in the first phoneme sequence into a first Mel spectrum value (S104); and training a student neural network by means of the first Mel spectrum value and latent variables provided by a trained teacher neural network, until a first loss function of the student neural network converges, to obtain a trained student neural network (S105).
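
The expansion step (S103) and the student training objective (S105) can be illustrated with a minimal sketch. This is not the applicant's implementation: it assumes the expansion repeats each phoneme's coded vector once per predicted frame (a length-regulator style rule), and that the first loss function combines a Mel spectrum error with an error against the teacher's latent variables. The names `expand_by_duration`, `mse`, `student_loss`, and `latent_weight` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def expand_by_duration(encoded: Sequence[Vector], durations: Sequence[int]) -> List[Vector]:
    """Repeat each phoneme's coded vector once per predicted frame.

    Assumption: the abstract does not spell out the expansion rule; this
    uses simple repetition driven by the predicted duration.
    """
    if len(encoded) != len(durations):
        raise ValueError("one duration per phoneme is required")
    frames: List[Vector] = []
    for vec, dur in zip(encoded, durations):
        # Negative predictions are treated as zero frames.
        frames.extend(list(vec) for _ in range(max(dur, 0)))
    return frames


def mse(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Mean squared error over two equally shaped frame sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of frames")
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            raise ValueError("frames must have the same width")
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count if count else 0.0


def student_loss(student_mel, target_mel, student_latent, teacher_latent,
                 latent_weight: float = 1.0) -> float:
    """Hypothetical student objective: Mel error plus a weighted error
    against the latent variables supplied by the trained teacher."""
    return mse(student_mel, target_mel) + latent_weight * mse(student_latent, teacher_latent)


# Three phonemes with predicted durations of 2, 1 and 3 frames.
encoded = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
durations = [2, 1, 3]
frames = expand_by_duration(encoded, durations)
```

In this sketch the number of output frames equals the sum of the predicted durations, so the expanded features line up frame by frame with the Mel spectrum. In a real system, training would repeat gradient updates on such a loss until it converges.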

Inventors:
SUN AOLAN (CN)
WANG JIANZONG (CN)
CHENG NING (CN)
Application Number:
PCT/CN2021/083233
Publication Date:
July 07, 2022
Filing Date:
March 26, 2021
Assignee:
PING AN TECH SHENZHEN CO LTD (CN)
International Classes:
G10L13/00; G10L13/02
Foreign References:
CN112116903A    2020-12-22
CN111968618A    2020-11-20
CN112002303A    2020-11-27
CN111583904A    2020-08-25
CN109979429A    2019-07-05
US20160049144A1    2016-02-18
Attorney, Agent or Firm:
SHENZHEN CHINA INNOVATION SOUTH INTELLECTUAL PROPERTY AGENCY CO., LTD. (CN)