Title:
CUSTOM TONE AND VOCAL SYNTHESIS METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2022/156479
Kind Code:
A1
Abstract:
A custom tone and vocal synthesis method and apparatus, an electronic device, and a storage medium. The synthesis method comprises: training a first neural network by means of a speaker record sample to obtain a speaker recognition model, the output training result of the first neural network being a speaker vector sample (S102); training a second neural network by means of an unaccompanied vocal singing sample and the speaker vector sample to obtain an unaccompanied singing synthesis model (S104); inputting a speaker record to be synthesized into the speaker recognition model to obtain speaker information output by the intermediate hidden layer of the speaker recognition model (S106); and inputting unaccompanied singing music information to be synthesized and the speaker information into the unaccompanied singing synthesis model to obtain a synthesized custom tone and vocal (S108). By means of the method, the efficiency and effect of custom tone and vocal synthesis are improved, and the model training time and response time of custom tone and vocal synthesis are shortened.
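The data flow of steps S102 to S108 can be sketched roughly as follows. This is a minimal illustration, not the implementation described in the application: both networks are toy stand-ins with random weights instead of trained ones, training itself is omitted, and every class name, dimension, and feature value (`SpeakerRecognizer`, `SingingSynthesizer`, `recording_features`, `music_frames`) is a hypothetical chosen for the example. It only shows the idea that the hidden-layer output of the speaker recognition model is reused as the conditioning input of the singing synthesis model.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weight matrix (n_out rows, n_in columns); a stand-in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Matrix-vector product without bias."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class SpeakerRecognizer:
    """Toy stand-in for the first network (S102): features -> hidden -> speaker logits."""

    def __init__(self, n_features, n_hidden, n_speakers, rng):
        self.w_hidden = make_layer(n_features, n_hidden, rng)
        self.w_out = make_layer(n_hidden, n_speakers, rng)

    def hidden(self, features):
        # Intermediate hidden-layer activation, used as the speaker information (S106).
        return [math.tanh(v) for v in linear(self.w_hidden, features)]

    def logits(self, features):
        # Speaker classification output; only needed while training the recognizer.
        return linear(self.w_out, self.hidden(features))


class SingingSynthesizer:
    """Toy stand-in for the second network (S104): music frame + speaker vector -> acoustic frame."""

    def __init__(self, n_music, n_speaker, n_acoustic, rng):
        self.w = make_layer(n_music + n_speaker, n_acoustic, rng)

    def synthesize(self, music_frames, speaker_vector):
        # The same speaker vector is concatenated to every music frame (S108).
        return [linear(self.w, frame + speaker_vector) for frame in music_frames]


rng = random.Random(0)
recognizer = SpeakerRecognizer(n_features=4, n_hidden=3, n_speakers=2, rng=rng)
synthesizer = SingingSynthesizer(n_music=2, n_speaker=3, n_acoustic=5, rng=rng)

# Hypothetical inputs: features of one speaker recording and two music frames.
recording_features = [0.1, -0.2, 0.3, 0.05]
music_frames = [[60.0, 0.5], [62.0, 0.5]]  # assumed (pitch, duration) pairs

speaker_vector = recognizer.hidden(recording_features)
acoustic_frames = synthesizer.synthesize(music_frames, speaker_vector)
```

In this sketch the classification output (`logits`) plays no role at synthesis time; only the hidden-layer vector is passed on, which mirrors the abstract's statement that the speaker information comes from the intermediate hidden layer.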

Inventors:
ZHANG ZHENGCHEN (CN)
WU JUNYI (CN)
CAI YUYU (CN)
YUAN XIN (CN)
SONG WEI (CN)
HE XIAODONG (CN)
Application Number:
PCT/CN2021/140858
Publication Date:
July 28, 2022
Filing Date:
December 23, 2021
Assignee:
BEIJING WODONG TIANJUN INFORMATION TECHNOLOGY CO LTD (CN)
BEIJING JINGDONG CENTURY TRADING CO LTD (CN)
International Classes:
G10L13/08
Foreign References:
CN113781993A 2021-12-10
CN104766603A 2015-07-08
CN111862937A 2020-10-30
CN108461079A 2018-08-28
CN111583900A 2020-08-25
US20200135172A1 2020-04-30
Other References:
HEYANG XUE; SHAN YANG; YI LEI; LEI XIE; XIULIN LI: "Learn2Sing: Target Speaker Singing Voice Synthesis by learning from a Singing Teacher", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 17 November 2020 (2020-11-17), XP081815923
SERCAN ARIK, GREGORY DIAMOS, ANDREW GIBIANSKY, JOHN MILLER, KAINAN PENG, WEI PING, JONATHAN RAIMAN, YANQI ZHOU: "Deep Voice 2: Multi-Speaker Neural Text-to-Speech", 24 May 2017 (2017-05-24), XP055491751, Retrieved from the Internet
Attorney, Agent or Firm:
BEIJING INTELLEGAL INTELLECTUAL PROPERTY AGENT LTD. (CN)