Title:
CHINESE SPEECH SYNTHESIZING METHOD, APPARATUS AND DEVICE, STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2020/215551
Kind Code:
A1
Abstract:
The present invention relates to speech signal processing within the field of artificial intelligence. Disclosed are a Chinese speech synthesizing method, apparatus and device, and a storage medium, which reduce training time, strengthen the model's expressive and generalization capabilities, and thereby improve the quality of synthesized speech. The Chinese speech synthesizing method comprises: acquiring an initial mel-frequency spectrum and a target vector (101); processing the target vector to obtain a first sequence, the first sequence being a two-dimensional tensor (102); processing the initial mel-frequency spectrum to obtain a target mel-frequency spectrum (103); determining, in each subspace, a target correlation between the first sequence and the target mel-frequency spectrum (104); and performing speech synthesis according to a self-attention mechanism and the target correlation to obtain a target speech (105).
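Step (104) can be illustrated with a minimal sketch. The abstract does not specify how the per-subspace correlation is computed, so the code below assumes a multi-head scaled dot-product formulation: the feature axis of both inputs is cut into equal slices (the "subspaces"), and each mel frame is scored against every position of the first sequence within each slice. The function names, the omission of learned projections, and the requirement that both inputs share the same feature size are all assumptions made for illustration, not details taken from the application.

```python
import math


def split_heads(seq, num_heads):
    """Split each row of a 2-D sequence (steps x features) into num_heads equal slices."""
    dim = len(seq[0])
    if dim % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    width = dim // num_heads
    return [[row[h * width:(h + 1) * width] for row in seq] for h in range(num_heads)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def subspace_correlation(first_seq, mel_seq, num_heads):
    """Return weights[h][t][s]: weight of mel frame t on first-sequence position s in subspace h.

    Assumption: both inputs are already projected to the same feature size;
    the learned projection matrices of a real model are omitted here.
    """
    if len(first_seq[0]) != len(mel_seq[0]):
        raise ValueError("both inputs must share the same feature size")
    key_heads = split_heads(first_seq, num_heads)
    query_heads = split_heads(mel_seq, num_heads)
    scale = math.sqrt(len(first_seq[0]) // num_heads)
    weights = []
    for keys, queries in zip(key_heads, query_heads):
        head = []
        for q in queries:
            # Scaled dot product between one mel frame and every sequence position.
            scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
            head.append(softmax(scores))
        weights.append(head)
    return weights
```

Under these assumptions, calling `subspace_correlation(first_seq, mel_seq, num_heads=2)` on a two-position first sequence and a single mel frame returns two 1 x 2 weight matrices, one per subspace, with each row summing to 1.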
Inventors:
CHEN MINCHUAN (CN)
MA JUN (CN)
WANG SHAOJUN (CN)
Application Number:
PCT/CN2019/102247
Publication Date:
October 29, 2020
Filing Date:
August 23, 2019
Assignee:
PING AN TECH SHENZHEN CO LTD (CN)
International Classes:
G10L13/02
Foreign References:
CN110070852A | 2019-07-30
CN104392717A | 2015-03-04
CN109036377A | 2018-12-18
CN104485099A | 2015-04-01
CN107545903A | 2018-01-05
US20050182629A1 | 2005-08-18
Attorney, Agent or Firm:
BEIJING JINGDA LAW FIRM (CN)