Title:
PROGRESSIVE POSITIONING METHOD FOR TEXT-TO-VIDEO CLIP POSITIONING
Document Type and Number:
WIPO Patent Application WO/2022/088238
Kind Code:
A1
Abstract:
Disclosed in the present invention is a progressive positioning method for text-to-video clip positioning. The method comprises: first, extracting features of the two modalities, video and text, using a separate feature extraction method for each; then progressively selecting different step lengths and learning the correlation between the video and the text in multiple stages; and finally, training the model end to end by combining the correlation loss of each stage. In addition, the fine time granularity stage incorporates information from the coarse time granularity stage by means of a conditional feature update module and an up-sampling connection, so that the stages reinforce one another. Different stages attend to clips of different time granularities, and by combining the relationships between stages the model can handle target clips whose lengths vary considerably. The invention draws on the way humans approach a clip positioning task: it works from coarse to fine, positions the target clip progressively over multiple stages, and substantially improves positioning performance.
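
The coarse-to-fine idea in the abstract can be illustrated with a small sketch. It is not the claimed model: the learned video-text correlation is replaced here by a plain temporal IoU against a known target clip, and the function names, the step lengths (8, 4, 2), and the equal default loss weights are all assumptions made for illustration. It only shows how smaller step lengths yield finer candidate clips and how per-stage losses might be summed into one training objective.

```python
from typing import List, Optional, Sequence, Tuple

Clip = Tuple[int, int]  # half-open interval [start, end) in video feature units


def candidate_clips(num_units: int, step: int) -> List[Clip]:
    """Enumerate clips whose boundaries fall on multiples of `step` (hypothetical helper)."""
    bounds = list(range(0, num_units, step)) + [num_units]
    return [(s, e) for i, s in enumerate(bounds) for e in bounds[i + 1:]]


def temporal_iou(a: Clip, b: Clip) -> float:
    """Temporal intersection over union of two clips."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    # The hull equals the union whenever the clips overlap; if they do not,
    # inter is 0 and the result is 0 regardless of the denominator.
    hull = max(a[1], b[1]) - min(a[0], b[0])
    return inter / hull if hull > 0 else 0.0


def best_clip_per_stage(num_units: int, steps: Sequence[int], target: Clip) -> List[Clip]:
    """For each stage (one step length per stage), pick the best-matching candidate.

    Assumption: IoU with a known target stands in for the learned correlation score.
    """
    return [
        max(candidate_clips(num_units, step), key=lambda c: temporal_iou(c, target))
        for step in steps
    ]


def total_loss(stage_losses: Sequence[float], weights: Optional[Sequence[float]] = None) -> float:
    """Combine per-stage losses as a weighted sum (equal weights are an assumption)."""
    if weights is None:
        weights = [1.0] * len(stage_losses)
    return sum(w * l for w, l in zip(weights, stage_losses))


if __name__ == "__main__":
    target = (5, 10)
    for step, clip in zip((8, 4, 2), best_clip_per_stage(16, (8, 4, 2), target)):
        print(f"step={step}: best clip={clip}, IoU={temporal_iou(clip, target):.3f}")
```

In this toy setting the best achievable overlap with the target grows as the step length shrinks, which is the motivation for letting later, finer stages refine what the coarser stages found.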

Inventors:
WANG XUN (CN)
DONG JIANFENG (CN)
ZHENG QI (CN)
PENG JINGWEI (CN)
Application Number:
PCT/CN2020/127657
Publication Date:
May 05, 2022
Filing Date:
November 10, 2020
Assignee:
UNIV ZHEJIANG GONGSHANG (CN)
International Classes:
G06F16/783; G06K9/00; G06K9/62; G06N3/04
Domestic Patent References:
WO2018081751A1   2018-05-03
Foreign References:
CN111414845A   2020-07-14
CN110121118A   2019-08-13
CN109145712A   2019-01-04
US20170330363A1   2017-11-16
Other References:
XU TONG, DU HAO, CHEN ENHONG, CHEN JOYA, WU YUFEI: "Cross-modal video moment retrieval based on visual-textual relationship alignment", SCIENTIA SINICA INFORMATIONIS, vol. 50, no. 6, 1 June 2020 (2020-06-01), pages 862-876, XP055926110, ISSN: 1674-7267, DOI: 10.1360/SSI-2019-0292
Attorney, Agent or Firm:
HANGZHOU QIUSHI PATENT OFFICE CO., LTD. (CN)