Title:
LIP SYNC VIDEO GENERATION APPARATUS AND METHOD
Document Type and Number:
WIPO Patent Application WO/2022/124498
Kind Code:
A1
Abstract:
Disclosed are a lip sync video generation apparatus and method. According to a disclosed embodiment, the apparatus comprises one or more processors and memory storing one or more programs executed by the one or more processors, and further comprises: a first artificial neural network model which generates a synthesized speech video from a background video of a person and a speech audio signal corresponding to that background video, and generates a synthesized silence video from the background video of the person alone; and a second artificial neural network model which receives a preset speech maintenance video and the synthesized silence video from the first artificial neural network model, and outputs classification values for both.
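
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `first_model`, `second_model`, and `classify_pair`, the `Frame` type, and the toy arithmetic inside each function are all hypothetical stand-ins for neural networks, chosen only to show which inputs go to which model and what comes out.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical stand-in: one video frame as a flat list of pixel values.
Frame = List[float]


def first_model(background: List[Frame],
                audio: Optional[List[float]] = None) -> List[Frame]:
    """Toy stand-in for the first network.

    With audio, returns a "synthesized speech video"; with the background
    video only, returns a "synthesized silence video". The per-frame additive
    offset is purely illustrative and is an assumption of this sketch.
    """
    if audio is not None and len(audio) != len(background):
        raise ValueError("expected one audio value per frame")
    synthesized = []
    for i, frame in enumerate(background):
        offset = audio[i] if audio is not None else 0.0
        synthesized.append([pixel + offset for pixel in frame])
    return synthesized


def second_model(video: List[Frame]) -> float:
    """Toy stand-in for the second network: maps a video to a value in (0, 1)."""
    total = sum(sum(frame) for frame in video)
    count = sum(len(frame) for frame in video)
    return 1.0 / (1.0 + math.exp(-total / count))


def classify_pair(speech_maintenance: List[Frame],
                  synthesized_silence: List[Frame]) -> Tuple[float, float]:
    """Return classification values for both inputs of the second network."""
    return second_model(speech_maintenance), second_model(synthesized_silence)


# Made-up example data: two frames of two pixels each.
background = [[0.1, 0.2], [0.3, 0.4]]
audio = [0.5, -0.5]
preset_speech_maintenance = [[0.0, 0.0], [0.0, 0.0]]

speech_video = first_model(background, audio)   # background + audio
silence_video = first_model(background)         # background only
scores = classify_pair(preset_speech_maintenance, silence_video)
```

Here `scores` holds one classification value for the preset speech maintenance video and one for the synthesized silence video. How those values would be used in training (for example, in an adversarial loss) is not stated in the abstract and is left out of the sketch.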

Inventors:
HWANG GUEM BUEL (KR)
CHAE GYEONG SU (KR)
Application Number:
PCT/KR2021/006913
Publication Date:
June 16, 2022
Filing Date:
June 03, 2021
Assignee:
DEEPBRAIN AI INC (KR)
International Classes:
G06N3/04; G06N3/08; G10L21/10; G10L25/30
Other References:
ZHENG, Ruobing et al. Photorealistic Lip Sync with Adversarial Temporal Convolutional Networks. arXiv:2002.08700v1. February 2020, pp. 1-9. [retrieved on 29 July 2021]. Retrieved from .
K R PRAJWAL; RUDRABHA MUKHOPADHYAY; VINAY NAMBOODIRI; C V JAWAHAR: "A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 23 August 2020 (2020-08-23), XP081746748, DOI: 10.1145/3394171.3413532
SANJANA SINHA; SANDIKA BISWAS; BROJESHWAR BHOWMICK: "Identity-Preserving Realistic Talking Face Generation", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 25 May 2020 (2020-05-25), XP081683355
TAVI HALPERIN; ARIEL EPHRAT; SHMUEL PELEG: "Dynamic Temporal Alignment of Speech to Lips", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 19 August 2018 (2018-08-19), XP080898125
RAN YI; ZIPENG YE; JUYONG ZHANG; HUJUN BAO; YONG-JIN LIU: "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 24 February 2020 (2020-02-24), XP081613176
Attorney, Agent or Firm:
CHANG, Young Tae (KR)