CUSTOM TONE AND VOCAL SYNTHESIS METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM - BEIJING WODONG TIANJUN INFORMATION TECHNOLOGY CO LTD

Title:

CUSTOM TONE AND VOCAL SYNTHESIS METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM

Document Type and Number:

WIPO Patent Application WO/2022/156479

Kind Code:

A1

Abstract:

Disclosed are a custom tone and vocal synthesis method and apparatus, an electronic device, and a storage medium. The synthesis method comprises: training a first neural network on a speaker recording sample to obtain a speaker recognition model, the training output of the first neural network being a speaker vector sample (S102); training a second neural network on an unaccompanied vocal singing sample and the speaker vector sample to obtain an unaccompanied singing synthesis model (S104); inputting a speaker recording to be synthesized into the speaker recognition model to obtain the speaker information output by an intermediate hidden layer of that model (S106); and inputting the unaccompanied singing music information to be synthesized, together with the speaker information, into the unaccompanied singing synthesis model to obtain a synthesized custom tone and vocal (S108). The method improves the efficiency and quality of custom tone and vocal synthesis and shortens both the model training time and the response time.
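The inference path in steps S106 and S108 can be illustrated with a minimal data-flow sketch. Everything in it is an assumption made for illustration, not the applicant's implementation: the class names `SpeakerRecognizer` and `SingingSynthesizer`, the layer sizes, the feature layouts, and the random weights that stand in for the trained networks of S102 and S104. It shows only how a hidden-layer activation of the recognition model, rather than its classification output, is passed to the synthesis model as the speaker information.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weights standing in for parameters that S102/S104 would train."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SpeakerRecognizer:
    """Hypothetical stand-in for the speaker recognition model (first network)."""

    def __init__(self, feat_dim, hidden_dim, n_speakers, seed=0):
        rng = random.Random(seed)
        self.w_hidden = make_matrix(hidden_dim, feat_dim, rng)
        self.w_out = make_matrix(n_speakers, hidden_dim, rng)

    def hidden(self, features):
        # S106: the intermediate hidden layer output serves as speaker information.
        return [math.tanh(h) for h in matvec(self.w_hidden, features)]

    def classify(self, features):
        # The classification head is used for training only, not for synthesis.
        logits = matvec(self.w_out, self.hidden(features))
        return max(range(len(logits)), key=logits.__getitem__)


class SingingSynthesizer:
    """Hypothetical stand-in for the unaccompanied singing synthesis model."""

    def __init__(self, music_dim, speaker_dim, out_dim, seed=1):
        rng = random.Random(seed)
        self.w = make_matrix(out_dim, music_dim + speaker_dim, rng)

    def synthesize(self, music_frames, speaker_vec):
        # S108: each music frame is concatenated with the speaker information.
        return [matvec(self.w, frame + speaker_vec) for frame in music_frames]


recognizer = SpeakerRecognizer(feat_dim=4, hidden_dim=3, n_speakers=2)
synthesizer = SingingSynthesizer(music_dim=2, speaker_dim=3, out_dim=5)

recording = [0.1, -0.2, 0.3, 0.05]       # assumed acoustic features of one recording
score = [[60 / 127, 0.5], [62 / 127, 0.25]]  # assumed (pitch, duration) per frame

speaker_info = recognizer.hidden(recording)
acoustic_frames = synthesizer.synthesize(score, speaker_info)
print(len(speaker_info), len(acoustic_frames), len(acoustic_frames[0]))
```

In this sketch a new speaker requires only one forward pass through the recognition model and no retraining of the synthesis model, which is consistent with the shortened response time the abstract claims.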

Inventors:

ZHANG ZHENGCHEN (CN)
WU JUNYI (CN)
CAI YUYU (CN)
YUAN XIN (CN)
SONG WEI (CN)
HE XIAODONG (CN)

Application Number:

PCT/CN2021/140858

Publication Date:

July 28, 2022

Filing Date:

December 23, 2021

Assignee:

BEIJING WODONG TIANJUN INFORMATION TECHNOLOGY CO LTD (CN)
BEIJING JINGDONG CENTURY TRADING CO LTD (CN)

International Classes:

G10L13/08

Foreign References:

CN113781993A	2021-12-10
CN104766603A	2015-07-08
CN111862937A	2020-10-30
CN108461079A	2018-08-28
CN111583900A	2020-08-25
US20200135172A1	2020-04-30

Other References:

HEYANG XUE; SHAN YANG; YI LEI; LEI XIE; XIULIN LI: "Learn2Sing: Target Speaker Singing Voice Synthesis by learning from a Singing Teacher", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 17 November 2020 (2020-11-17), XP081815923
SERCAN ARIK, GREGORY DIAMOS, ANDREW GIBIANSKY, JOHN MILLER, KAINAN PENG, WEI PING, JONATHAN RAIMAN, YANQI ZHOU: "Deep Voice 2: Multi-Speaker Neural Text-to-Speech", 24 May 2017 (2017-05-24), XP055491751, Retrieved from the Internet

Attorney, Agent or Firm:

BEIJING INTELLEGAL INTELLECTUAL PROPERTY AGENT LTD. (CN)
