

Title:
VISUAL SEARCH METHOD WITH FEEDBACK LOOP BASED ON INTERACTIVE SKETCH
Document Type and Number:
WIPO Patent Application WO/2023/286003
Kind Code:
A1
Abstract:
The disclosure relates to the fields of electronic processing of images and e-commerce, specifically to visual search with a feedback loop based on an interactive sketch, for transparency and user control. The presented method adds transparency and user control to the traditionally "black-box" visual search process by introducing a feedback loop based on an interactive sketch. The method comprises the following main steps: submission by the user of information about the object being searched for (in the form of an image, written text, spoken text, or selection of a product category from a menu in a GUI); identification of product attributes and verification of the identified attributes against the ontology rules; selection of building blocks from a database for the filtered attributes; connection of the selected blocks to form a sketch of the object being searched for; application of projection rules to produce a form vector; comparison of the vector from the previous step with the database; presentation to the user of an interactive sketch of the product alongside the list of matching products from the database; enabling the user to select a part of the object in the sketch; presentation to the user of possible modifications for the selected part of the object; and enabling the user to select the needed modification and re-iterate the search process as many times as needed, until the wanted results are achieved.

Inventors:
JAPERTAS PRANAS (LT)
Application Number:
PCT/IB2022/056492
Publication Date:
January 19, 2023
Filing Date:
July 14, 2022
Assignee:
GASEFIS UAB (LT)
International Classes:
G06F16/242
Foreign References:
US20120054177A1 (2012-03-01)
US20140337370A1 (2014-11-13)
US20070067279A1 (2007-03-22)
US20080177640A1 (2008-07-24)
US20190205333A1 (2019-07-04)
US8412594B2 (2013-04-02)
US20160239898A1 (2016-08-18)
US20180108066A1 (2018-04-19)
Other References:
WEIMING HU ET AL: "A Survey on Visual Content-Based Video Indexing and Retrieval", IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: PART C:APPLICATIONS AND REVIEWS, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 41, no. 6, 1 November 2011 (2011-11-01), pages 797 - 819, XP011479468, ISSN: 1094-6977, DOI: 10.1109/TSMCC.2011.2109710
Attorney, Agent or Firm:
PAKENIENE, Ausra (LT)
Claims:
CLAIMS

1. A visual search method with a feedback loop, based on an interactive sketch, for transparency and user control, characterized in that it comprises the following steps: introducing the input interface to input specifications about the object (101, 102, 103, 104); identification of product attributes (106) (Scenario 1); converting speech data into text keywords (105) (Scenario 3); identification of product attributes (107) (Scenarios 2 and 3); collecting applicable product attributes (109) from the Product Ontology Repository (114) (Scenario 4); checking of the identified attributes (108) against the ontology rules (113) (Scenarios 1, 2 and 3); selection of building blocks (201) from the database (205) for the filtered attributes; connection of the selected blocks (202) to form a schematic drawing using the Combination Rules (207); application of projection rules (206) to produce a form vector (203); comparison of the vector (204) from the previous step (203) with the database (208); enabling the user to select a part of the object in the schematic drawing (209); retrieval of possible modifications for the selected attribute (301) according to the Modification Rules (303); retrieval of matching building blocks (302) taken from the database (205); and enabling the user to choose an alternative presented to him for the selected part / attribute (303).

2. A method according to claim 1, characterized in that specifications of the object are presented as a picture (101) and/or typed keywords (102), and/or spoken keywords (103), and/or selection of product category from menu in user interface (104).

3. A method according to claim 1, characterized in that identification of product attributes (106) is performed by the object recognition models (111) that are responsible for recognizing different attributes in the provided image.

4. A method according to claim 1, characterized in that converting speech data into text keywords (105) is performed using a Speech Recognition (Speech-To-Text) engine (110), which is responsible for converting voice data into text data.

5. A method according to claim 1, characterized in that identification of product attributes (107) is performed by a Natural Language Processing engine (112), which is responsible for recognizing different product attributes from the provided keywords.

6. A method according to claim 1, characterized in that the product ontology rules (113) serve as a first pass filter for the identified attributes (108).

7. A method according to claim 1, characterized in that projection rules (206) are a mixture of knowledge-based rules with the data learned about the user.

Description:
VISUAL SEARCH METHOD WITH FEEDBACK LOOP BASED ON INTERACTIVE SKETCH

FIELD OF THE INVENTION

The disclosure relates to the fields of electronic processing of images and e-commerce, specifically to a visual search method with a feedback loop, based on an interactive sketch, for transparency and user control. In the remainder of this document, this particular invention is referred to as the INVENTION (all capitals).

DESCRIPTION OF THE RELATED ART

Traditionally, visual search is a "black-box" process: the user has little control over it. After the user submits initial information, results simply appear at the other end. If the search results are unsatisfactory, the cause is not clear to the user, and there is nothing the user can do to improve the results. There have been multiple attempts to improve this situation, but none of the discovered existing patent applications provides the level of user control over the search process that the INVENTION does.

Patent US8412594B2 (published on April 02, 2013) discloses a method that provides the possibility to select a first silhouette image of an item at a client machine, depicting a plurality of silhouette images representing aspects of the item such as style, length type and sleeve type. Selecting any of those aspects allows a server to search a database for listings of similar items that have those aspects. Concurrently selecting one or more of the images representing those aspects and one or more sizes allows a server to search a database for listings of similar items that have those aspects and those sizes. Although the described method provides the possibility to select images as search parameters, it lacks the feedback loop, which is an essential part of the INVENTION.

A patent application US20160239898A1 (published on August 18, 2016) discloses a system and method for sketch-based queries. A sketch of a search item is received from a user device. An item attribute corresponding to a physical attribute of the search item is extracted from the sketch. Inventory items are identified based on the extracted item attribute. The identified inventory items are presented to the user. A modification to the sketch of the search item is received, and the inventory items are updated based on the received modification. Although the method described in this patent application assumes multiple iterations of the search query, each iteration is essentially a separate query starting with a new (modified) image. The user does not modify the results in a systematic and controlled manner, as in the INVENTION, but performs a new search with a manually modified sketch. Also, this invention only allows a sketch as source data for the initial search query, while the INVENTION also allows a photograph, a textual or voice description, and product category selection from a menu list in the user interface.

A patent application US20180108066A1 (published on April 19, 2018) discloses systems, methods, and computer program products for identifying a relevant candidate product in an electronic marketplace. Embodiments perform a visual similarity comparison between candidate product image visual content and input query image visual content, process formal and informal natural language user inputs, and coordinate aggregated past user interactions with the marketplace stored in a knowledge graph. Visually similar items and their corresponding product categories, aspects, and aspect values can determine suggested candidate products without discernible delay during a multi-turn user dialog. The user can then refine the search for the most relevant items available for purchase by providing responses to machine-generated prompts that are based on the initial search results from visual, voice, and/or text inputs. An intelligent online personal assistant can thus guide a user to the most relevant candidate product more efficiently than existing search tools. Although the method described in this patent application allows multiple formats of initial search data (text, voice and image) and implies multiple iterations of search, the principle of how the criteria for a new search iteration are obtained is very different from the one described in the INVENTION. Here, modification of the search query is fully based on an AI-driven virtual assistant, which provides textual questions and suggestions to the user, and new iterations of the search are executed based on the user's answers. The essential difference from the INVENTION is that here the AI-based virtual assistant drives the modification of the search, while in the INVENTION the end user is fully in charge of which part of the search query should be modified and how.
Also, the INVENTION provides a unique, fully visual way for the user to choose the modification of the object being searched for, while in this patent application the interaction between the end user and the system is text-based.

SUMMARY OF THE INVENTION

The disclosure presents a visual search method with a feedback loop, based on an interactive sketch, for transparency and user control. The presented method adds a feedback loop that makes the process transparent and gives the user control. A visual representation of how the visual search algorithm perceives the initial input data is presented to the user. The object in the image is presented as a schematic drawing (sketch) to gain greater generalization power while retaining important specifics about the object being searched for. This visual representation of the object is also interactive, i.e. the user can manipulate the schematic drawing and thereby affect the search performed. The user is not restricted in how many interactions (corrections) can be made: each interaction with the schematic drawing has corresponding results presented. The interactive schematic drawing that the user can manipulate serves several purposes: it clearly presents the current query to the user (what the algorithm is currently searching for), providing transparency, and it allows the user to make hybrid searches, starting with an image but then modifying it and arriving at results that would not be possible with standard reverse image search.

Example workflow: the user has a photo of an object that is close to the object he really wants to find; the user submits the image, the object in the image is recognized and presented to the user in the form of an interactive sketch; the user makes the needed changes to the schematic drawing so that it better matches his or her expectations (fine-tunes the query), until the user is happy with the results (objects retrieved from the database).

The method comprises the following steps: introduction of the input interface to input specifications about the object (in the form of an image and/or text, and/or a voice recording, and/or product category selection); identification of product attributes by the object recognition models; checking of the identified attributes against the ontology rules; selection of building blocks from the database for the filtered attributes; connection of the selected blocks to form a schematic drawing; application of projection rules to produce a form vector; comparison of the vector from the previous step with the database; presentation of the schematic drawing to the user; enabling the user to select a part of the object in the schematic drawing; retrieval of possible modifications for the selected part of the object; retrieval of matching building blocks taken from the database; and enabling the user to choose an alternative presented to him for the selected part / attribute.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1. Schematically illustrates the method structure (Steps S1.1/S2.1/S3.1/S4.1-S3).

Fig. 2. Schematically illustrates the method structure (Steps S4-S8).

Fig. 3. Schematically illustrates the method structure (Steps S9-S12, including the loop back to S3-S8).

DETAILED DESCRIPTION

The disclosure presents the method for visual search with a feedback loop, based on an interactive sketch, for transparency and user control. The method comprises several steps. The search process begins with the user providing initial information about the object that is being searched for. The input interface is introduced to input specifications about the object. This information can come in several different forms: a) an image wherein the object is presented as a picture or schematic drawing (101); b) a set of typed keywords best describing the object (102); c) a set of spoken keywords (voice recording) best specifying the object (103); d) a specific product category, which is selected from a menu of the user interface (104).

These alternative ways of providing initial information about the object being searched for allow for the greatest possible applicability of the invention in different situations and usage scenarios.
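The four input forms above can be sketched as a simple dispatcher. This is a minimal illustrative sketch only: the extractor names and their toy return values are assumptions made for illustration, not part of the disclosure; a real system would invoke the object recognition models (111), the NLP engine (112), the speech-to-text engine (110), and the Products Ontologies Repository (114) at these points.

```python
from typing import Callable, Dict, List

# Hypothetical extractors standing in for the engines of Fig. 1; the names
# and toy return values are invented for illustration.
def attrs_from_image(data: str) -> List[str]:
    return ["chair", "wooden"]              # placeholder recognition result (106/111)

def attrs_from_keywords(data: str) -> List[str]:
    return data.lower().split()             # naive stand-in for the NLP engine (107/112)

def attrs_from_speech(data: str) -> List[str]:
    text = data                             # stand-in for a speech-to-text call (105/110)
    return attrs_from_keywords(text)

def attrs_from_category(data: str) -> List[str]:
    repository = {"chairs": ["chair", "legs", "backrest"]}  # toy ontology repository (114)
    return repository.get(data, [])

DISPATCH: Dict[str, Callable[[str], List[str]]] = {
    "image": attrs_from_image,        # Scenario 1 (101)
    "keywords": attrs_from_keywords,  # Scenario 2 (102)
    "speech": attrs_from_speech,      # Scenario 3 (103)
    "category": attrs_from_category,  # Scenario 4 (104)
}

def identify_attributes(form: str, data: str) -> List[str]:
    """Route the user's input to the extractor for its form."""
    return DISPATCH[form](data)
```

Whichever branch is taken, all four scenarios converge on the same output type (a list of candidate attributes), which is what lets the later steps (S3 onward) be shared.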

Each form of initial information implies a different beginning of the search process, so each case is described in detail below.

Scenario 1 - image is provided as initial information.

S1.1. At the first step (101), the user submits a picture or schematic drawing (sketch) of the product which is being searched for.

S1.2. In the following step (106), product attributes are identified by the object recognition models (111) that are responsible for recognizing different attributes in the provided image.

S1.3. In a further step (108), the identified attributes are checked against the ontology rules (113): conflicts between attributes are resolved and conflicting attributes are removed. In this case, the product ontology rules (113) serve as a first-pass filter for the identified attributes.

S3. List of selected attributes (115) is ready for further steps.
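The first-pass ontology filter of step S1.3 (108/113) might look like the following minimal sketch. It assumes ontology rules expressed as pairs of mutually exclusive attributes; the pairs shown are invented examples, and a real repository (114) would hold far richer rules.

```python
from typing import FrozenSet, List, Set

# Illustrative ontology rules (113): pairs of mutually exclusive attributes.
# These example pairs are assumptions, not taken from the disclosure.
CONFLICTS: Set[FrozenSet[str]] = {
    frozenset({"sleeveless", "long-sleeve"}),
    frozenset({"leather", "knitted"}),
}

def filter_attributes(identified: List[str]) -> List[str]:
    """First-pass filter (108): remove every attribute that takes part in a
    conflicting pair, preserving the order of the remaining attributes."""
    conflicting: Set[str] = set()
    for rule in CONFLICTS:
        if rule <= set(identified):   # both members of the pair are present
            conflicting |= rule
    return [a for a in identified if a not in conflicting]
```

Dropping both members of a detected conflict is one possible resolution policy; keeping the more confident of the two would be an equally valid reading of "conflicts are resolved".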

Scenario 2 - typed keywords are provided as initial information.

S2.1. At the first step (102), the user submits typed keywords describing the product that is being searched for.

S2.2. In the following step (107), product attributes are identified by the NLP (Natural Language Processing) engine (112), which is responsible for recognizing different product attributes from the provided keywords.

S2.3. In a further step (108), the identified attributes are checked against the ontology rules (113): conflicts between attributes are resolved and conflicting attributes are removed. In this case, the product ontology rules (113) serve as a first-pass filter for the identified attributes.

S3. List of selected attributes (115) is ready for further steps.
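A toy stand-in for the NLP attribute identification of step S2.2 (107/112) is sketched below. The hand-written vocabulary is an assumption made purely for illustration; a real engine would use trained language models rather than a lookup table.

```python
from typing import Dict, List, Tuple

# Invented toy vocabulary mapping surface keywords to (attribute, value)
# pairs; a real NLP engine (112) would generalize far beyond exact matches.
VOCAB: Dict[str, Tuple[str, str]] = {
    "red": ("colour", "red"),
    "blue": ("colour", "blue"),
    "leather": ("material", "leather"),
    "handbag": ("category", "bag"),
}

def identify_attributes_from_text(keywords: str) -> List[Tuple[str, str]]:
    """Step 107: map each recognised keyword to a product attribute,
    silently ignoring words the vocabulary does not cover."""
    found: List[Tuple[str, str]] = []
    for token in keywords.lower().split():
        if token in VOCAB:
            found.append(VOCAB[token])
    return found
```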

Scenario 3 - spoken keywords are provided as initial information.

S3.1. At the first step (103), the user submits a verbal description (words) of the product that is being searched for.

S3.2. In the following step (105), the voice input is processed by the Speech Recognition (Speech-To-Text) engine (110), which is responsible for reliably converting voice data into text data.

S3.3. In the following step (107), product attributes are identified by the NLP (Natural Language Processing) engine (112), which is responsible for recognizing different product attributes from the provided keywords.

S3.4. In a further step (108), the identified attributes are checked against the ontology rules (113): conflicts between attributes are resolved and conflicting attributes are removed. In this case, the product ontology rules (113) serve as a first-pass filter for the identified attributes.

S3. List of selected attributes (115) is ready for further steps.

Scenario 4 - product category is provided as initial information.

S4.1. At the first step (104), the user selects a product category to be searched for from a menu in the user interface.

S4.2. In the following step (109), product attributes applicable to the selected product category are collected from the Products Ontologies Repository (114).

S3. List of selected attributes (115) is ready for further steps.

S4. In the next step (201), building blocks for the filtered attributes are selected from the database (205).

S5. In the following step (202), the selected building blocks are connected together to form a schematic drawing, wherein the block combination rules (207) are used. In this step, the schematic drawing is presented to the user.
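Steps S4-S5 can be sketched as follows. The toy block database (205) of SVG-like fragments and the fixed drawing order standing in for the combination rules (207) are both invented for illustration; real combination rules would describe how blocks attach to one another geometrically.

```python
from typing import Dict, List

# Invented toy building-block database (205): one drawing fragment per attribute.
BLOCKS: Dict[str, str] = {
    "body": "<rect id='body'/>",
    "short-sleeve": "<path id='sleeve-short'/>",
    "round-neck": "<path id='neck-round'/>",
}

# Invented stand-in for the combination rules (207): a fixed drawing order.
DRAW_ORDER: List[str] = ["body", "short-sleeve", "round-neck"]

def build_sketch(attributes: List[str]) -> str:
    """Steps S4-S5: select the blocks for the filtered attributes (201) and
    connect them into one schematic drawing (202) in combination-rule order."""
    selected = [BLOCKS[a] for a in DRAW_ORDER if a in attributes and a in BLOCKS]
    return "<svg>" + "".join(selected) + "</svg>"
```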

S6. In the next step (203), the attributes from S1.4, S2.4, S3.5 or S4.3 are taken in parallel and the projection rules (206) are applied to produce a vector. The mentioned projection rules (206) are a mixture of knowledge-based rules and data learned about the user, to better reflect personal preference. In this case, the projection rules (206) map from the attribute space to a Euclidean space in order to best reflect similarity.
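A minimal sketch of step S6 follows, assuming each attribute contributes fixed coordinates and per-user axis weights model the learned preference data; the coordinate table and weights are invented examples, not values from the disclosure.

```python
from typing import Dict, List

# Illustrative projection rules (206): each attribute contributes fixed
# coordinates in a 2-D Euclidean space. The table is an invented example.
PROJECTION: Dict[str, List[float]] = {
    "short-sleeve": [1.0, 0.0],
    "long-sleeve": [-1.0, 0.0],
    "round-neck": [0.0, 1.0],
}

# Assumed per-user axis weights, standing in for "data learned about the user".
USER_WEIGHTS: List[float] = [1.0, 0.5]

def project(attributes: List[str]) -> List[float]:
    """Step S6 (203): map an attribute set into Euclidean space, producing
    the form vector by summing weighted per-attribute coordinates."""
    vec = [0.0, 0.0]
    for attr in attributes:
        for i, coord in enumerate(PROJECTION.get(attr, [0.0, 0.0])):
            vec[i] += coord * USER_WEIGHTS[i]
    return vec
```

The point of projecting into a Euclidean space is that similarity between a query and a catalogue object then reduces to an ordinary distance computation, which step S7 exploits.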

S7. In the following step (204), the vector from step S6 (203) is compared with the database (208), wherein the actual objects and their corresponding attributes are located.
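The comparison of step S7 can be sketched as a nearest-neighbour ranking by Euclidean distance; this is one plausible reading of "compared with the database", and the catalogue layout shown is an assumption.

```python
import math
from typing import List, Tuple

def euclidean(a: List[float], b: List[float]) -> float:
    """Ordinary Euclidean distance between two form vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_matches(query: List[float],
                 catalogue: List[Tuple[str, List[float]]],
                 top_k: int = 3) -> List[str]:
    """Step S7 (204): rank database objects (208) by distance to the query
    form vector and return the names of the closest matches."""
    ranked = sorted(catalogue, key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:top_k]]
```

At catalogue scale a real system would use an approximate nearest-neighbour index rather than a full sort, but the ranking criterion is the same.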

S8. At this step (209), the user is presented with a schematic drawing of the object being searched for alongside the list of matching objects from the database.

At this point the user has full control over the search process, as he can modify the presented schematic drawing. The user can perform his actions in several possible ways: a) by using a mouse, selecting/clicking on the sketch in the graphical user interface and selecting alternative options (this option is further used in the drawings and description); b) by typing in commands which specify the user action (which part of the sketch should be selected and changed); c) by speaking out commands which specify the user action (which part of the sketch should be selected and changed).

This action further requires the following steps:

S9. At this step (209), the user selects (using alternative method "a)" described in S8) a part of the object in the schematic drawing (representing some attribute).

S10. In the next step (301), possible modifications are retrieved for the selected attribute. A database with modification rules (303) is used for this purpose. It describes the possibilities for attribute modification and can be curated (knowledge-based) or learned from the analysis of real-world objects.
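The modification-rule lookup of step S10 might be sketched as a simple table from each attribute to its allowed alternatives; the rule entries shown are invented examples of what a curated rule base (303) could contain.

```python
from typing import Dict, List

# Invented example modification rules (303): allowed alternatives per attribute.
MOD_RULES: Dict[str, List[str]] = {
    "short-sleeve": ["long-sleeve", "sleeveless"],
    "round-neck": ["v-neck", "collar"],
}

def possible_modifications(selected_attr: str) -> List[str]:
    """Step S10 (301): look up the allowed alternatives for the attribute
    the user selected in the schematic drawing."""
    return MOD_RULES.get(selected_attr, [])
```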

S11. In the following step (302), matching building blocks taken from the database (205) are retrieved. The alternatives and building blocks are sent to the user interface and presented to the user.

S12. In this step (303), the user can choose (using one of the alternative methods described in S8) an alternative presented to him for the selected part / attribute. The change of attributes is passed back to the algorithm. After that, steps S3-S7 (115, 201-204) are repeated. The steps S8-S12 (209, 301-303) and then steps S3-S7 (115, 201-204) can be repeated as many times as wanted and constitute a control loop for the visual search.
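The control loop formed by steps S3-S12 can be sketched as follows. The search and user-interaction callbacks are placeholders: any implementation of S3-S8 can be passed as `run_search`, and `ask_user` stands in for the S9-S12 interaction, returning either an (old, new) attribute swap or `None` when the user accepts the results.

```python
from typing import Callable, List, Optional, Tuple

def search_loop(attributes: List[str],
                run_search: Callable[[List[str]], List[str]],
                ask_user: Callable[[List[str], List[str]], Optional[Tuple[str, str]]],
                max_rounds: int = 10) -> Tuple[List[str], List[str]]:
    """Steps S3-S12 as a control loop: run the search on the current
    attribute set (S3-S8), let the user swap one attribute (S9-S12), then
    repeat until the user accepts the results or the round limit is hit."""
    results: List[str] = []
    for _ in range(max_rounds):
        results = run_search(attributes)        # S3-S8: sketch + matches
        change = ask_user(attributes, results)  # S9-S12: user picks a change
        if change is None:                      # user is satisfied; exit loop
            break
        old, new = change
        attributes = [new if a == old else a for a in attributes]
    return attributes, results
```

The `max_rounds` cap is only a safety bound for the sketch; the disclosure places no limit on how many iterations the user may perform.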

The above description of the preferred embodiment is provided in order to illustrate and describe the present invention. It is not an exhaustive or limiting description that seeks to restrict the invention to the exact form or embodiment disclosed; it should be considered an illustration rather than a limitation. Numerous modifications and variations will be obvious to specialists in the field. The embodiment was chosen and described so that specialists in the field can best understand the principles of this invention and its practical application in various embodiments, with various modifications suitable for the particular use contemplated. It is intended that the scope of the invention be defined by the claims appended hereto and their equivalents, where all of the said terms have their meaning within the widest range, unless indicated otherwise.

Specialists in the field may make changes to the described embodiments without deviating from the scope of this invention as specified in the following claims.