LIP SYNC VIDEO GENERATION APPARATUS AND METHOD

Document Type and Number:

WIPO Patent Application WO/2022/124498

Kind Code:

A1

Abstract:

Disclosed are a lip sync video generation apparatus and method. According to a disclosed embodiment, the lip sync video generation apparatus comprises one or more processors and memory storing one or more programs executed by the one or more processors, and includes a first artificial neural network model and a second artificial neural network model. The first artificial neural network model generates a synthesized speech video by using, as an input, a background video of a person and a speech audio signal corresponding to the background video of the person, and generates a synthesized silence video by using, as an input, only the background video of the person. The second artificial neural network model uses, as an input, a preset speech maintenance video and the synthesized silence video from the first artificial neural network model, and outputs classification values for the speech maintenance video and the synthesized silence video.
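The data flow described in the abstract can be illustrated with a toy sketch. This is not the applicant's implementation: the class names `FirstModel` and `SecondModel`, the flattened-frame representation, the random linear weights, and the choice to model the silence path as all-zero audio features are all assumptions made purely for illustration. Real embodiments would use trained neural networks; the sketch only shows which inputs go to which model and what each model returns.

```python
import math
import random
from typing import List, Optional

Frame = List[float]      # hypothetical: one flattened video frame
Audio = List[float]      # hypothetical: one audio feature vector per frame


class FirstModel:
    """Toy stand-in for the first network (assumed linear, untrained)."""

    def __init__(self, frame_size: int, audio_size: int, seed: int = 0):
        rng = random.Random(seed)
        self.audio_size = audio_size
        # One row of audio weights per output pixel.
        self.w_audio = [[rng.uniform(-0.1, 0.1) for _ in range(audio_size)]
                        for _ in range(frame_size)]

    def generate(self, background: List[Frame],
                 audio: Optional[List[Audio]] = None) -> List[Frame]:
        # Assumption: with no audio, the model sees all-zero audio features,
        # so the output is the synthesized *silence* video.
        if audio is None:
            audio = [[0.0] * self.audio_size for _ in background]
        video = []
        for frame, feat in zip(background, audio):
            video.append([pixel + sum(w * a for w, a in zip(row, feat))
                          for pixel, row in zip(frame, self.w_audio)])
        return video


class SecondModel:
    """Toy stand-in for the second network: one classification value per video."""

    def __init__(self, frame_size: int, seed: int = 1):
        rng = random.Random(seed)
        self.w = [rng.uniform(-1.0, 1.0) for _ in range(frame_size)]

    def classify(self, video: List[Frame]) -> float:
        # Average the frames over time, then apply a logistic unit.
        mean_frame = [sum(col) / len(video) for col in zip(*video)]
        logit = sum(w * x for w, x in zip(self.w, mean_frame))
        return 1.0 / (1.0 + math.exp(-logit))


# Made-up inputs: 4 frames of 6 "pixels" and 3 audio features per frame.
data_rng = random.Random(42)
background = [[data_rng.random() for _ in range(6)] for _ in range(4)]
speech_audio = [[data_rng.random() for _ in range(3)] for _ in range(4)]
speech_maintenance = [[data_rng.random() for _ in range(6)] for _ in range(4)]

first = FirstModel(frame_size=6, audio_size=3)
second = SecondModel(frame_size=6)

synth_speech = first.generate(background, speech_audio)   # video + audio
synth_silence = first.generate(background)                # video only

score_maintenance = second.classify(speech_maintenance)
score_silence = second.classify(synth_silence)
```

In this sketch the second model simply scores each of its two input videos; how those classification values would feed back into training the first model is not stated in the abstract and is left out here.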

Inventors:

HWANG GUEM BUEL (KR)
CHAE GYEONG SU (KR)

Application Number:

PCT/KR2021/006913

Publication Date:

June 16, 2022

Filing Date:

June 03, 2021

Assignee:

DEEPBRAIN AI INC (KR)

International Classes:

G06N3/04; G06N3/08; G10L21/10; G10L25/30

Other References:

ZHENG, Ruobing et al. Photorealistic Lip Sync with Adversarial Temporal Convolutional Networks. arXiv:2002.08700v1. February 2020, pp. 1-9. [retrieved on 29 July 2021]. Retrieved from .
K R PRAJWAL; RUDRABHA MUKHOPADHYAY; VINAY NAMBOODIRI; C V JAWAHAR: "A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild", arXiv.org, Cornell University Library, 23 August 2020 (2020-08-23), XP081746748, DOI: 10.1145/3394171.3413532
SANJANA SINHA; SANDIKA BISWAS; BROJESHWAR BHOWMICK: "Identity-Preserving Realistic Talking Face Generation", arXiv.org, Cornell University Library, 25 May 2020 (2020-05-25), XP081683355
TAVI HALPERIN; ARIEL EPHRAT; SHMUEL PELEG: "Dynamic Temporal Alignment of Speech to Lips", arXiv.org, Cornell University Library, 19 August 2018 (2018-08-19), XP080898125
RAN YI; ZIPENG YE; JUYONG ZHANG; HUJUN BAO; YONG-JIN LIU: "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose", arXiv.org, Cornell University Library, 24 February 2020 (2020-02-24), XP081613176

Attorney, Agent or Firm:

CHANG, Young Tae (KR)
