Title:
APPARATUS AND METHOD FOR GENERATING SYNTHESIZED SPEECH IMAGE
Document Type and Number:
WIPO Patent Application WO/2023/153554
Kind Code:
A1
Abstract:
Disclosed are an apparatus and a method for generating a synthesized speech image. According to an embodiment, the apparatus is machine learning-based and comprises: a first global geometric transformation prediction unit that receives a source image and a target image, both including the same person, and is trained to predict, on the basis of the source image and the target image, a global geometric transformation for the global movement of the person between the two images; a local geometric transformation prediction unit that is trained to predict, on the basis of preconfigured input data, a local geometric transformation for the local movement of the person between the source image and the target image; a geometric transformation combination unit that combines the global geometric transformation and the local geometric transformation to calculate an overall movement geometric transformation for the overall movement of the person; and an image generation unit that is trained to reconstruct the target image on the basis of the source image and the overall movement geometric transformation.
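
The combination step can be illustrated with a minimal sketch. The abstract does not state how the transformations are represented or how they are combined, so the following assumes, purely for illustration, that each one is a 2D affine transformation stored as a 3x3 homogeneous matrix and that combining them means matrix composition. The helper names (affine, compose, apply) and the numeric values are hypothetical, and the trained prediction units are replaced by hand-written matrices.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # 3x3 homogeneous matrix for a 2D affine transformation


def affine(angle_rad: float, tx: float, ty: float, scale: float = 1.0) -> Matrix:
    """Build a matrix that scales and rotates a point, then translates it."""
    c = math.cos(angle_rad) * scale
    s = math.sin(angle_rad) * scale
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]


def compose(outer: Matrix, inner: Matrix) -> Matrix:
    """Matrix product outer * inner: `inner` is applied first, then `outer`."""
    return [
        [sum(outer[i][k] * inner[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def apply(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Apply a homogeneous 2D affine matrix to the point (x, y)."""
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


# Hypothetical stand-ins for the outputs of the two prediction units.
global_t = affine(math.pi / 2, 10.0, 0.0)  # assumed head motion: rotate 90 degrees, shift in x
local_t = affine(0.0, 0.0, 2.0)            # assumed local motion: shift by 2 in y

# Assumed combination rule: local movement first, then global movement.
overall_t = compose(global_t, local_t)

point = apply(overall_t, 1.0, 0.0)  # approximately (8.0, 1.0)
```

In this sketch the point (1, 0) first moves to (1, 2) under the local transformation and is then rotated and shifted by the global one. An image generation unit as described in the abstract would consume the source image together with such an overall transformation; that part is a trained network and is not sketched here.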

Inventors:
CHAE GYEONG SU (KR)
HWANG GUEM BUEL (KR)
Application Number:
PCT/KR2022/003608
Publication Date:
August 17, 2023
Filing Date:
March 15, 2022
Assignee:
DEEPBRAIN AI INC (KR)
International Classes:
G10L21/10; G06N20/00; G06T5/00; G06T7/269; G06T13/20; G10L15/04; G10L21/055
Domestic Patent References:
WO2002031772A2	2002-04-18
Foreign References:
KR20200145700A	2020-12-30
US20200393943A1	2020-12-17
KR20180057564A	2018-05-30
Other References:
LIJUAN WANG; WEI HAN; FRANK K. SOONG: "High quality lip-sync animation for 3D photo-realistic talking head", 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2012), KYOTO, JAPAN, 25-30 MARCH 2012, [PROCEEDINGS], IEEE, PISCATAWAY, NJ, 25 March 2012 (2012-03-25), pages 4529-4532, XP032228161, ISBN: 978-1-4673-0045-2, DOI: 10.1109/ICASSP.2012.6288925
Attorney, Agent or Firm:
T&C IP LAW FIRM (KR)