


Title:
MULTI-STYLE AUDIO SYNTHESIS METHOD, APPARATUS AND DEVICE, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2022/116432
Kind Code:
A1
Abstract:
The present application relates to the field of artificial intelligence and discloses a multi-style audio synthesis method, apparatus and device, and a storage medium. The method comprises: acquiring text data to be processed and a first Mel spectrum of a single style; inputting the first Mel spectrum into a preset style extraction network for feature extraction to obtain a first style feature; inputting the text data into the encoder of a preset Mel spectrum generation network for feature extraction, and splicing the extracted first text feature with the first style feature to obtain a first fusion feature; inputting the first fusion feature into the decoder of the preset Mel spectrum generation network for feature conversion to obtain a second Mel spectrum; and inputting the second Mel spectrum into a preset vocoder for audio generation to obtain multi-style audio. By taking the style feature as a conditional feature of the vocoder, the present application makes it possible to generate multi-style audio.
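
The data flow described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the function names, the dimensions, the byte-level text encoding, and the randomly initialised linear layers stand in for the trained style extraction network, encoder and decoder, whose actual architectures the abstract does not specify. The vocoder stage is omitted. The sketch only shows how a style feature might be spliced (concatenated) onto each text feature frame before decoding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not give any dimensions.
N_MELS, STYLE_DIM, TEXT_DIM, VOCAB = 80, 16, 32, 256

# Random stand-ins for trained weights (assumption, not the patented networks).
W_style = rng.standard_normal((N_MELS, STYLE_DIM))
E_text = rng.standard_normal((VOCAB, TEXT_DIM))
W_dec = rng.standard_normal((TEXT_DIM + STYLE_DIM, N_MELS))


def extract_style(first_mel: np.ndarray) -> np.ndarray:
    """Style extraction stand-in: average over time, then project."""
    return first_mel.mean(axis=0) @ W_style  # shape (STYLE_DIM,)


def encode_text(text: str) -> np.ndarray:
    """Encoder stand-in: one embedding vector per UTF-8 byte."""
    ids = np.frombuffer(text.encode("utf-8"), dtype=np.uint8)
    return E_text[ids]  # shape (T_text, TEXT_DIM)


def splice(text_feat: np.ndarray, style_feat: np.ndarray) -> np.ndarray:
    """Repeat the style vector for every text frame and concatenate."""
    tiled = np.tile(style_feat, (text_feat.shape[0], 1))
    return np.concatenate([text_feat, tiled], axis=1)


def decode(fused: np.ndarray) -> np.ndarray:
    """Decoder stand-in: linear map from fused features to Mel bins."""
    return fused @ W_dec  # shape (T_text, N_MELS)


# A random "first Mel spectrum" of 120 frames stands in for real audio.
first_mel = rng.standard_normal((120, N_MELS))
style = extract_style(first_mel)
text_feat = encode_text("hello")
fused = splice(text_feat, style)   # first fusion feature
second_mel = decode(fused)         # second Mel spectrum, input to a vocoder
```

In this sketch the same style vector is attached to every text frame, which is one simple way to condition the decoder on a single reference style; the application itself may fuse the features differently.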

Inventors:
LIANG SHUANG (CN)
CHEN MINCHUAN (CN)
MA JUN (CN)
WANG SHAOJUN (CN)
Application Number:
PCT/CN2021/083546
Publication Date:
June 09, 2022
Filing Date:
March 29, 2021
Assignee:
PING AN TECH SHENZHEN CO LTD (CN)
International Classes:
G10L13/02; G10L13/04; G10L25/03; G10L25/18
Foreign References:
CN112562634A	2021-03-26
CN110136690A	2019-08-16
CN111627418A	2020-09-04
US20200258496A1	2020-08-13
US20200051583A1	2020-02-13
Attorney, Agent or Firm:
BEIJING JINGDA LAW FIRM (CN)