Title:
APPARATUS AND METHOD FOR GENERATING SPEECH SYNTHESIS IMAGE
Document Type and Number:
WIPO Patent Application WO/2023/153555
Kind Code:
A1
Abstract:
An apparatus and a method for generating a speech synthesis image are disclosed. According to an embodiment, the apparatus generates a speech synthesis image on the basis of machine learning and comprises: a first global geometric transformation prediction unit that receives a source image and a target image, both of which include the same person, and is trained to predict, on the basis of the source image and the target image, a global geometric transformation for a global movement of the person between the source image and the target image; a local feature tensor prediction unit trained to predict a feature tensor for a local movement of the person on the basis of information relating to the input target image; and an image generation unit trained to reconstruct the target image on the basis of the global geometric transformation, the source image, and the feature tensor for the local movement.
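
The data flow between the three units named in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the abstract describes trained machine-learning units and does not state the form of the global geometric transformation or of the feature tensor. Here the transformation is reduced to an integer translation, the feature tensor to a list of per-row mean intensities, and all function names are hypothetical hand-written stand-ins rather than the disclosed method.

```python
from typing import List, Tuple

Image = List[List[float]]   # toy grayscale image: rows of pixel values
Shift = Tuple[int, int]     # assumed global transformation: (rows, cols) translation


def centroid(img: Image) -> Tuple[float, float]:
    """Mean (row, col) position of the non-zero pixels."""
    pts = [(y, x) for y, row in enumerate(img) for x, v in enumerate(row) if v > 0]
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))


def predict_global_shift(source: Image, target: Image) -> Shift:
    """Stand-in for the global transformation unit (uses source and target)."""
    (sy, sx), (ty, tx) = centroid(source), centroid(target)
    return round(ty - sy), round(tx - sx)


def predict_local_features(target: Image) -> List[float]:
    """Stand-in for the local feature unit (uses the target only)."""
    return [sum(row) / len(row) for row in target]


def generate(source: Image, shift: Shift, local: List[float]) -> Image:
    """Stand-in generator: translate the source, then rescale each row so
    that its mean intensity matches the local feature for that row."""
    h, w = len(source), len(source[0])
    dy, dx = shift
    moved = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if 0 <= y + dy < h and 0 <= x + dx < w:
                moved[y + dy][x + dx] = source[y][x]
    out = []
    for row, feat in zip(moved, local):
        mean = sum(row) / w
        gain = feat / mean if mean > 0 else 0.0
        out.append([v * gain for v in row])
    return out


def reconstruct(source: Image, target: Image) -> Image:
    """Wire the three units together in the order the abstract lists them."""
    shift = predict_global_shift(source, target)
    local = predict_local_features(target)
    return generate(source, shift, local)
```

In this toy, a target that is a translated copy of the source with one row brightened is reconstructed from the source, the estimated translation, and the per-row features. In the apparatus described above, each stand-in would instead be a trained unit, and the training objective would be the reconstruction of the target image.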

Inventors:
CHAE GYEONG SU (KR)
HWANG GUEM BUEL (KR)
Application Number:
PCT/KR2022/003610
Publication Date:
August 17, 2023
Filing Date:
March 15, 2022
Assignee:
DEEPBRAIN AI INC (KR)
International Classes:
G10L21/10; G06T5/00; G06T5/50; G06T7/20; G10L15/04; G10L21/055
Domestic Patent References:
WO2002031772A2    2002-04-18
Foreign References:
KR20200145700A    2020-12-30
US20200393943A1    2020-12-17
Other References:
LIJUAN WANG; WEI HAN; FRANK K. SOONG: "High quality lip-sync animation for 3D photo-realistic talking head", 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2012): KYOTO, JAPAN, 25-30 MARCH 2012; [PROCEEDINGS], IEEE, PISCATAWAY, NJ, 25 March 2012 (2012-03-25), pages 4529-4532, XP032228161, ISBN: 978-1-4673-0045-2, DOI: 10.1109/ICASSP.2012.6288925
M. WANG; D. BRADLEY; S. ZAFEIRIOU; T. BEELER: "Facial Expression Synthesis using a Global-Local Multilinear Framework", COMPUTER GRAPHICS FORUM: JOURNAL OF THE EUROPEAN ASSOCIATION FOR COMPUTER GRAPHICS, WILEY-BLACKWELL, OXFORD, vol. 39, no. 2, 13 July 2020 (2020-07-13), pages 235-245, XP071545877, ISSN: 0167-7055, DOI: 10.1111/cgf.13926
Attorney, Agent or Firm:
T&C IP LAW FIRM (KR)