Title:
SPEECH SYNTHESIS METHOD AND SYSTEM
Document Type and Number:
WIPO Patent Application WO/2024/058573
Kind Code:
A1
Abstract:
The present disclosure relates to a speech synthesis method performed by means of at least one processor. The speech synthesis method comprises the steps of: receiving an input text; generating a text representation from the input text by using a text encoder; generating a self-supervised representation including linguistic information from the text representation by using a self-supervised representation generator; generating an acoustic feature on the basis of the self-supervised representation by using an acoustic feature generator; and generating synthetic speech on the basis of the acoustic feature by using a speech generator.
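
The four stages named in the abstract can be pictured as a simple chain of functions. The following is a minimal sketch only, not the applicants' implementation: the class name `SpeechSynthesisPipeline`, the `toy_*` functions, the vector sizes, and the upsampling factors are all hypothetical placeholders chosen so the example runs with the Python standard library. In a real system each stage would presumably be a trained neural network.

```python
from dataclasses import dataclass
from typing import Callable, List

Frames = List[List[float]]  # a sequence of feature vectors


@dataclass
class SpeechSynthesisPipeline:
    """Hypothetical container that chains the four stages from the abstract."""
    text_encoder: Callable[[str], Frames]
    ssl_generator: Callable[[Frames], Frames]
    acoustic_generator: Callable[[Frames], Frames]
    speech_generator: Callable[[Frames], List[float]]

    def synthesize(self, text: str) -> List[float]:
        text_repr = self.text_encoder(text)            # input text -> text representation
        ssl_repr = self.ssl_generator(text_repr)       # -> self-supervised representation
        acoustic = self.acoustic_generator(ssl_repr)   # -> acoustic feature
        return self.speech_generator(acoustic)         # -> synthetic speech samples


# Toy stand-ins (assumptions for illustration, not real models).
def toy_text_encoder(text: str) -> Frames:
    # One 2-dimensional vector per character.
    return [[ord(ch) / 128.0, 1.0] for ch in text]


def toy_ssl_generator(text_repr: Frames) -> Frames:
    # Repeat each text vector twice to mimic frame-level expansion.
    return [list(vec) for vec in text_repr for _ in range(2)]


def toy_acoustic_generator(ssl_repr: Frames) -> Frames:
    # Collapse each frame to a single averaged value.
    return [[sum(vec) / len(vec)] for vec in ssl_repr]


def toy_speech_generator(acoustic: Frames) -> List[float]:
    # Emit four identical samples per acoustic frame.
    return [frame[0] for frame in acoustic for _ in range(4)]


pipeline = SpeechSynthesisPipeline(
    text_encoder=toy_text_encoder,
    ssl_generator=toy_ssl_generator,
    acoustic_generator=toy_acoustic_generator,
    speech_generator=toy_speech_generator,
)

waveform = pipeline.synthesize("hi")
print(len(waveform))  # 2 characters x 2 frames x 4 samples = 16
```

Keeping each stage behind a plain callable mirrors the abstract's wording, where every step consumes only the output of the step before it, so any single stage could be swapped without touching the others.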

Inventors:
SONG EUNWOO (KR)
OH SUHYEON (KR)
LEE SANG-HOON (KR)
LEE SEONG-WHAN (KR)
Application Number:
PCT/KR2023/013832
Publication Date:
March 21, 2024
Filing Date:
September 14, 2023
Assignee:
NAVER CORP (KR)
UNIV KOREA RES & BUS FOUND (KR)
International Classes:
G10L13/08; G06N20/00; G10L13/027; G10L13/06
Foreign References:
KR20220083987A    2022-06-21
Other References:
CHENPENG DU; YIWEI GUO; XIE CHEN; KAI YU: "VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 11 May 2022 (2022-05-11), XP091218415
HYEONG-SEOK CHOI; JUHEON LEE; WANSOO KIM; JIE HWAN LEE; HOON HEO; KYOGU LEE: "Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 27 October 2021 (2021-10-27), XP091081743
ZHEHUAI CHEN; YU ZHANG; ANDREW ROSENBERG; BHUVANA RAMABHADRAN; GARY WANG; PEDRO MORENO: "Injecting Text in Self-Supervised Speech Pretraining", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 27 August 2021 (2021-08-27), XP091039368
YIHAN WU; XI WANG; SHAOFEI ZHANG; LEI HE; RUIHUA SONG; JIAN-YUN NIE: "Self-supervised Context-aware Style Representation for Expressive Speech Synthesis", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 25 June 2022 (2022-06-25), XP091257437
Attorney, Agent or Firm:
KIM, Han Sol et al. (KR)