Title:
VISUAL POSITIONING METHOD AND APPARATUS, DEVICE, AND MEDIUM
Document Type and Number:
WIPO Patent Application WO/2023/201990
Kind Code:
A1
Abstract:
The present application relates to the technical field of artificial intelligence and discloses a visual positioning method and apparatus, a device, and a medium. The method comprises: performing feature splicing on an image coded feature and a text coded feature; performing feature fusion on the spliced coded feature to obtain a first fused coded feature; performing noise correction on the first fused coded feature and on the text coded feature, respectively, on the basis of a preset cross-attention mechanism to obtain a corrected fused feature and a corrected text coded feature; performing feature fusion on the spliced coded feature and the corrected text coded feature to obtain a second fused coded feature; and correcting a preset frame feature by using a target coded feature, determined on the basis of the corrected fused feature and the second fused coded feature, to predict the regional position coordinates of a target visual object. The present application thus corrects image-text noise on the basis of the preset cross-attention mechanism, weakens the impact of noise by reducing the attention paid to the noisy part of the text, and achieves noise-resistant visual positioning.
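
The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the implementation claimed in the application: the abstract does not specify the fusion operators, the form of the cross-attention, how the target coded feature is formed, or the coordinate head. The sketch assumes single-head scaled dot-product attention without learned projections for every fusion and correction step, an element-wise sum for the target coded feature, and a sigmoid over a linear map for the coordinates. The names `attend`, `predict_box`, and `w_box` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def attend(query, memory):
    """Scaled dot-product attention (assumed stand-in, no learned weights).

    Each query row becomes a weighted average of the memory rows.
    """
    scores = query @ memory.T / np.sqrt(query.shape[-1])
    return softmax(scores) @ memory


def predict_box(image_feat, text_feat, frame_feat, w_box):
    """Hypothetical sketch of the pipeline outlined in the abstract.

    image_feat: (n_img, d), text_feat: (n_txt, d),
    frame_feat: (1, d) preset frame feature, w_box: (d, 4) assumed linear head.
    """
    # Feature splicing: concatenate image and text tokens.
    spliced = np.concatenate([image_feat, text_feat], axis=0)
    # First fusion (assumption: self-attention over the spliced tokens).
    first_fused = attend(spliced, spliced)
    # Noise correction via cross-attention, in both directions (assumption).
    corrected_fused = attend(first_fused, text_feat)
    corrected_text = attend(text_feat, first_fused)
    # Second fusion: spliced tokens attend over the corrected text.
    second_fused = attend(spliced, corrected_text)
    # Target coded feature (assumption: element-wise sum of the two branches).
    target = corrected_fused + second_fused
    # Correct the preset frame feature using the target coded feature.
    corrected_frame = frame_feat + attend(frame_feat, target)
    # Map to four normalized coordinates in [0, 1] (assumed sigmoid head).
    logits = (corrected_frame @ w_box).ravel()
    return 1.0 / (1.0 + np.exp(-logits))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    box = predict_box(
        rng.normal(size=(6, d)),   # six image tokens
        rng.normal(size=(3, d)),   # three text tokens
        rng.normal(size=(1, d)),   # one preset frame feature
        rng.normal(size=(d, 4)),   # assumed coordinate head
    )
    print(box)
```

In this sketch a noisy text token affects the result only through the attention weights it receives, which is one plausible reading of "reducing the attention on the noise part in the text"; a trained model would use learned projections rather than the raw features used here.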

Inventors:
LI XIAOCHUAN (CN)
LI RENGANG (CN)
ZHAO YAQIAN (CN)
GUO ZHENHUA (CN)
FAN BAOYU (CN)
Application Number:
PCT/CN2022/122335
Publication Date:
October 26, 2023
Filing Date:
September 28, 2022
Assignee:
INSPUR SUZHOU INTELLIGENT TECHNOLOGY CO LTD (CN)
International Classes:
G06T5/00; G06T7/70; G06T9/00
Foreign References:
CN114511472A    2022-05-17
CN113850201A    2021-12-28
CN113837102A    2021-12-24
CN113095435A    2021-07-09
CN112800782A    2021-05-14
KR102279797B1   2021-07-21
Attorney, Agent or Firm:
BEIJING WANHUIDA LAW FIRM (CN)