SYSTEMS AND METHODS FOR ENHANCED NOVELTY DETECTION

Title:

SYSTEMS AND METHODS FOR ENHANCED NOVELTY DETECTION

Document Type and Number:

WIPO Patent Application WO/2020/106971

Kind Code:

Abstract:

A method for analytically determining novelty response, is provided. The method includes receiving a query item indicative of a response to a first stimulus, assigning in a filter engine multiple query hash functions to the query item, and comparing a first query hash function to a database item, each database item occupying a position indicative of a query hash function in the filter engine. The method also includes reducing a value for the database item when the database item is indicative of the first query hash function, and determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine. A system and a memory storing instructions to cause the system to perform the above method are also provided.

Inventors:

NAVLAKHA SAKET (US)

Application Number:

PCT/US2019/062638

Publication Date:

May 28, 2020

Filing Date:

November 21, 2019

Export Citation:

Click for automatic bibliography generation Help

Assignee:

SALK INST BIOLOGICAL STUDIES (US)

International Classes:

G06F16/13; G06F16/14; G06F16/22; G06K9/62

Foreign References:

US20080243941A1	2008-10-02
US20170041292A1	2017-02-09
US20130166576A1	2013-06-27
CN106372190A	2017-02-01
US20130080114A1	2013-03-28

Attorney, Agent or Firm:

NOVAK, Jason J. et al. (US)

Download PDF:

View/Download PDF PDF Help

Claims:

CLAIMS

WHAT IS CLAIMED IS:

1. A method for analytically determining novelty response, comprising:

receiving an initial item I;

assigning a plurality of initial hash functions IH to the initial item I;

inserting the initial item I into a Bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item;

updating the initial hash database to an updated hash database;

receiving a query item X;

assigning a plurality of query hash functions QH to the query item X;

inserting the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database; and

calculating a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

2. The method of claim 1, wherein each database item value recovers by a recovery factor e for each hash function of an inserted subsequent query item that does not map to that database item.

3. The method of claim 2, wherein recovery by recovery factor e is additive.

4. The method of claim 1, wherein each database item value decays by a decay factor d for each hash function of an inserted subsequent query item that maps to that database item.

5. The method of claim 4, wherein decay by decay factor d is multiplicative.

6. The method of claim 1, wherein the plurality of database items in the Bloom filter comprises a plurality of bits.

7. The method of claim 1, further comprising resetting the database item value to zero for calculating novelty score N when a specific hash function maps to a respective database item in the Bloom filter.

8. The method of claim 1, further comprising:

receiving a plurality of initial items IS;

assigning a plurality of initial hash functions IH to each of the plurality of initial items

IS; and

inserting the plurality of initial items IS into the Bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item.

9. A non-transitory computer-readable medium in which a program is stored for causing a computer to perform a method for analytically determining novelty response, the method comprising:

receiving an initial item I;

assigning a plurality of initial hash functions IH to the initial item I;

updating the initial hash database to an updated hash database;

receiving a query item X;

assigning a plurality of query hash functions QH to the query item X;

10. A system for analytically determining novelty response, comprising:

a processing unit comprising:

an initial item processing engine configured to

receive an initial item I;

assign a plurality of initial hash functions IH to the initial item I;

insert the initial item I into a bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item;

update the initial hash database to an updated hash database

a query item processing engine configured to

receive a query item X;

assign a plurality of query hash functions QH to the query item X; and

insert the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database; and

a scoring unit configured to calculate a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

11. A method for analytically determining novelty response, comprising:

receiving a query item indicative of a response to a first stimulus; assigning, in a filter engine, multiple query hash functions to the query item; comparing a first query hash function to a database item, each database item occupying a position indicative of a query hash function in the filter engine;

reducing a value for the database item in the filter engine when the database item is indicative of the first query hash function; and

determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine.

12. The method of claim 11, wherein comparing a first query hash function to a database item in a filter engine comprises mapping the query hash function to at least one of multiple database items in the filter engine, and resetting a value for the database item when the query hash function maps to the database item.

13. The method of claim 11, further comprising updating the hash database when the value for the first query hash function has changed.

14. The method of claim 11, further comprising determining a normalization factor in the weighted sum to make the novelty score less than one.

15. The method of claim 11, wherein determining a novelty score comprises assigning a similar novelty score to a second stimulus that is similar to the first stimulus.

16. The method of claim 11, wherein determining a novelty score further comprises increasing the value for the first query hash function when a pre-selected time has elapsed after the first stimulus.

17. The method of claim 11, further comprising increasing the value for the database item by a selected amount when none of the multiple query hash functions map to the database item.

18. The method of claim 11, further comprising selecting an additive factor to update a database item in the hash database when a second stimulus leaves the database item unchanged.

19. The method of claim 11, further comprising reducing each database item value by a decay factor d for each hash function of an inserted subsequent query item that maps to that database item.

20. The method of claim 11, wherein reducing the value for the database item comprises multiplying the value for the database item by a selected decay factor that is less than one.

21. The method of claim 11, further comprising forming the hash database as a string of bits, wherein each bit is a database item.

22. The method of claim 11, wherein reducing a value for the database item comprises resetting the value for the database item to zero.

23. The method of claim 11, further comprising:

receiving a second query item indicative of a response to a second stimulus;

assigning multiple query hash functions to the second query item;

mapping the query hash functions into the filter engine; and

determining a novelty score for the second stimulus based on a mapping of the query hash functions into the filter engine.

24. A non-transitory computer-readable medium in which a program is stored for causing a computer to perform a method for analytically determining novelty response, the method comprising:

receiving a query item indicative of a response to a first stimulus; assigning, in a filter engine, multiple query hash functions to the query item; comparing a first query hash function to a database item in a filter engine, each database item occupying a position indicative of a query hash function in the filter engine;

reducing a value for the database item when the database item is indicative of the first query hash function;

determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine; and updating a hash database when the value for the first query hash function has changed.

25. The non-transitory, computer readable medium of claim 24, wherein in the method, determining a novelty score comprises assigning a similar novelty score to a second stimulus that is similar to the first stimulus.

26. The non-transitory, computer readable medium of claim 24, wherein in the method, wherein determining a novelty score further comprises increasing the value for the first query hash function when a pre-selected time has elapsed after the first stimulus.

27. The non-transitory, computer readable medium of claim 24, wherein the method further comprises increasing the value for the database item by a selected amount when none of the multiple query hash functions map to the database item.

28. A system for analytically determining a novelty response, comprising:

a sensing device configured to:

generate a query item in response to a first stimulus;

assign multiple hash functions to the query item; and transmit a hash function through a network; and

a classification device configured to receive the hash function from the network, and to map a hash function to a database item, the classification device comprising:

a filter engine configured to map the hash function to a database item in a hash database;

a scoring tool configured to reset a value of the database item when one of the plurality of hash functions maps to the database item and to update the hash database; and

a novelty detection tool configured to calculate a continuous novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine.

29. The system of claim 28, further comprising a memory circuit storing an additive factor and a decaying factor for the scoring tool to reset the value of the database item.

30. The system of claim 28, wherein the hash database is a binary string, the system further comprising a memory circuit to store the binary string.

31. A method for filtering input data from a sensor to classify a stimulus, comprising:

encoding a stimulus into a vector based on the stimulation rate of each of multiple receptors in a sensor device;

normalizing the vector such that the mean stimulation rate is the same for different stimuli and different stimulus concentrations;

projecting the vector into a multidimensional space in a hash engine according to a sparse matrix;

finding an addition value of a random selection of the components of the vector using a component of the hash engine;

exciting an inhibitory component in the hash engine proportionally to the addition value;

silencing the remaining components in the hash engine with the inhibitory component; and

forming a hash for the stimulus with one or more unsilenced components in the hash engine.

32. A method for finding a novelty score for a stimulus in a sensor system, comprising:

receiving a hash function from an input stimulus;

determining a distance between the hash function and multiple database items indexed according to a longevity value for each database item;

adding the distances for multiple database items, each distance weighted with a decaying factor based on the longevity value for each database item; and

associating a novelty score for the input stimulus based on the addition of the distances for multiple database items.

Description:

SYSTEMS AND METHODS FOR ENHANCED NOVELTY DETECTION

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] The present application is related, and claims priority under 35 U.S.C. 119(e) to U.S. Provisional Patent Application No. 62/770,688 filed on November 21, 2018, entitled SYSTEMS AND METHODS FOR ENHANCED NOVELTY DETECTION, the contents of which are hereinafter incorporated in their entirety, for all purposes.

BACKGROUND

[0002] The embodiments disclosed herein are generally directed towards systems and methods for enhanced novelty detection (or duplicate detection). Systems and methods as disclosed herein may also be applied to outlier detection (e.g. , in a statistical distribution), or anomaly detection. Novelty detection (or duplicate detection) has been demonstrated to be a fundamental biological problem that organisms must solve to determine if a given stimulus departs from those previously experienced. In computer science, this problem can be solved efficiently using a data structure called a Bloom filter. Similarly, in the biological world, the olfactory circuit of the fruit fly, for example, evolved a variant of a Bloom filter to assess the novelty of odors. Compared to a traditional Bloom filter, the fly can adjust novelty responses based on, for example, the similarity of an odor to previously experienced odors, and the time elapsed since the odor was last experienced.

[0003] Given the current state of Bloom filters in computer science as essentially one dimensional, there exists a need in the industry to develop advances in data structures that are multi-dimensional, such as taking into account distance and time in calculations. Some of these data structures may be provided by filter engines, e.g., a Bloom filter.

BRIEF DESCRIPTION OF FIGURES

[0004] FIG. 1 is a block diagram that illustrates a sensor system, in accordance with various embodiments.

[0005] FIG. 2 illustrates a sensing device and a classification device in a sensor system as in FIG. 1, in accordance with various embodiments.

[0006] FIG. 3 illustrates a schematic view of a filter engine as used in accordance with various embodiments.

[0007] FIGS. 4A-4B illustrates charts indicating a distance sensitivity of a filter engine, in accordance with various embodiments. [0008] FIGS. 5A-5E illustrate multiple charts comparing the performance of different filter engines to determine novelty of a stimulus, in accordance with various embodiments.

[0009] FIGS. 6A-6B illustrate charts illustrating a time recovery of the novelty response to a stimulus in a classification device, in accordance with various embodiments.

[0010] FIG. 7 illustrates a distribution of an exemplary stimulus response as a function of the frequency of neuronal activity associated with the response (in Hertz), in accordance with various embodiments.

[0011] FIG. 8 is a flow chart illustrating steps in a method for analytically determining a novelty response, in accordance with various embodiments.

[0012] FIG. 9 is a flow chart illustrating steps in a method for analytically determining a novelty response to a stimulus, in accordance with various embodiments.

[0013] FIG. 10 is a flow chart illustrating steps in a method for filtering input data from a sensor to classify a stimulus, in accordance with various embodiments.

[0014] FIG. 11 is a flow chart illustrating steps in a method for creating a hash database with items associated with hash functions provided by a sensor in response to a stimulus, in accordance with various embodiments.

[0015] FIG. 12 illustrates a block diagram that illustrates a computer system, in accordance with various embodiments.

[0016] It is to be understood that the figures are not necessarily drawn to scale, nor are the objects in the figures necessarily drawn to scale in relationship to one another. The figures are depictions that are intended to bring clarity and understanding to various embodiments of apparatuses, systems, and methods disclosed herein. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. Moreover, it should be appreciated that the drawings are not intended to limit the scope of the present teachings in any way.

SUMMARY

[0017] In one embodiment, a method for analytically determining novelty response includes receiving an initial item I, assigning a plurality of initial hash functions IH to the initial item I, and inserting the initial item I into a Bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item. The method also includes updating the initial hash database to an updated hash database, receiving a query item X, and assigning a plurality of query hash functions QH to the query item X. The method also includes inserting the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database, and calculating a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

[0018] In a second embodiment, a non-transitory computer-readable medium in which a program is stored for causing a computer to perform a method for analytically determining novelty response is provided, where the method includes receiving an initial item I and assigning a plurality of initial hash functions IH to the initial item I. The method also includes inserting the initial item I into a Bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item, and updating the initial hash database to an updated hash database. The method also includes receiving a query item X, and assigning a plurality of query hash functions QH to the query item X. The method also includes inserting the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database, and calculating a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

[0019] In yet another embodiment, a system for analytically determining novelty response includes a processing unit having an initial item processing engine. The initial item processing engine is configured to receive an initial item I, to assign a plurality of initial hash functions IH to the initial item I, and to insert the initial item I into a bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item. The initial item processing engine is also configured to update the initial hash database to an updated hash database. The system also includes a query item processing engine configured to receive a query item X, to assign a plurality of query hash functions QH to the query item X, and to insert the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database. The system also includes a scoring unit configured to calculate a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

[0020] In one embodiment, a method for analytically determining novelty response includes receiving a query item indicative of a response to a first stimulus and assigning in a filter engine multiple query hash functions to the query item. The method also includes comparing a first query hash function to a database item, each database item occupying a position indicative of a query hash function in the filter engine, reducing a value for the database item when the database item is indicative of the first query hash function, and determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine.

[0021] In another embodiment, a non-transitory computer-readable medium in which a program is stored for causing a computer to perform a method for analytically determining novelty response is provided. The method includes receiving a query item indicative of a response to a first stimulus and assigning in a filter engine multiple query hash functions to the query item. The method also includes comparing a first query hash function to a database item, each database item occupying a position indicative of a query hash function in the filter engine and reducing a value for the database item when the database item is indicative of the first query hash function. The method also includes determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine, and updating the hash database when the value for the first query hash function has changed.

[0022] In yet another embodiment, a system for analytically determining a novelty response includes a sensing device configured to generate a query item in response to a first stimulus, to assign in a filter engine multiple hash functions to the query item, and to transmit a hash function through a network. The system also includes a classification device configured to receive the hash function from the network, and to map a hash function to a database item in a hash database. The classification device includes a scoring tool configured to reset a value of the database item when one of the plurality of hash functions maps to the database item and to update the hash database, and a novelty detection tool configured to calculate a continuous novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine.

[0023] In one embodiment, a method for filtering input data from a sensor to classify a stimulus includes encoding a stimulus into a vector based on the stimulation rate of each of multiple receptors in a sensor device, and normalizing the vector such that the mean stimulation rate is the same for different stimuli and different stimulus concentrations. The method also includes projecting the vector into a multidimensional space in a hash engine according to a sparse matrix, finding an addition value of a random selection of the components of the vector using a component of the hash engine, and exciting an inhibitory component in the hash engine proportionally to the addition value. The method also includes silencing the remaining components in the hash engine with the inhibitory component, and forming a hash for the stimulus with one or more unsilenced components in the hash engine.

[0024] In another embodiment a method for finding a novelty score for a stimulus in a sensor system includes receiving a hash function from an input stimulus and determining a distance between the hash function and multiple database items indexed according to a longevity value for each database item. The method also includes adding the distances for multiple database items, each distance weighted with a decaying factor based on the longevity value for each database item, and associating a novelty score for the input stimulus based on the addition of the distances for multiple database items.

DETAILED DESCRIPTION

[0025] This specification describes various exemplary embodiments of systems, methods, and software for enhanced novelty detection. The disclosure, however, is not limited to these exemplary embodiments and applications or to the manner in which the exemplary embodiments and applications operate or are described herein.

[0026] Unless otherwise defined, scientific and technical terms used in connection with the present teachings described herein shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. [0027] All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing devices, compositions, formulations, and methodologies which are described in the publication and which might be used in connection with the present disclosure.

[0028] As used herein, the terms “comprise,” “comprises,” “comprising,” “contain,” “contains,”“containing,”“have,”“having,”“i nclude,”“includes,” and“including” and their variants are not intended to be limiting, are inclusive or open-ended and do not exclude additional, unrecited additives, components, integers, elements, or method steps. For example, a process, method, system, composition, kit, or apparatus that comprises a list of features is not necessarily limited only to those features but may include other features not expressly listed or inherent to such process, method, system, composition, kit, or apparatus.

NOVELTY RESPONSE

[0029] Embodiments as disclosed herein provide a framework to predict novelty responses of a sensing device to given pairs of stimuli (e.g. , fly olfactory system sensing and classifying odors). Systems and methods in accordance to some embodiments may also be applied to outlier detection (e.g., in a statistical distribution), or anomaly detection in sensing and diagnostics applications. Various embodiments include systems for distance- and time- sensitive classification of stimuli that outperform prior filters and systems when evaluated on several biological and computational datasets. In accordance with various embodiments, the algorithmic basis of a neurobiological problem is used to device methods and systems for novelty detection in computational systems coupled to chemical, electronic, mechanical, and optical sensors, or any combination thereof.

[0030] By translating insights from the fly circuit to develop a new class of distance- and time-sensitive Bloom filters that outperform prior filters when evaluated on several biological and computational datasets, this need can be solved in a novel fashion. These new filters illuminate the algorithmic basis of an important neurobiological problem and offers new strategies for novelty detection (or duplicate detection) in computational systems.

[0031] In accordance with various embodiments herein, the systems, methods, and software are described for quantitatively predicting neural novelty responses based on use of data structures generated by a filtering engine. In accordance with various embodiments, the filter engine may include a Bloom filter, or a modified version thereof. In accordance with various embodiments herein, systems, methods, and software are described that translate the three above properties to extend a fundamental data structure to new application domains. [0032] The fruit fly brain evolved a simple space-efficient data structure to compute the novelty of odors, directly taking advantage of sparse, high-dimensional representations useful for discriminating odors. In accordance with various embodiments herein, the systems, methods and software described herein can advantageously be used to predict novelty responses in fruit flies to odor pairs. Moreover, in accordance with various embodiments herein, the systems, methods and software described herein can advantageously be used to enhance (e.g. , increase accuracy, reduce false positives), increase speed, increase efficiency, and so on in computational applications, including, for example and not limited to, anomaly detection, novelty or duplicate detection, outlier detection, edge computing applications, life long learning applications, energy-efficient computing applications, various computer networking architectures and applications, blockchain data processing, distributed and/or decentralized computing and networks, malware detection, content delivery networks, and deep learning/machine learning.

[0033] Assessing whether a stimulus is novel is a basic neural problem that helps alert organisms to new and potentially salient events. This problem requires determining how much a given stimulus departs from those previously experienced. The fruit fly, for example, determines the novelty of an odor using a two-step procedure.

[0034] The first step of this two-step procedure can include assigning a“tag” (the term used in neurobiology) or“hash” (the term used in computer science) to the odor, where the tag/hash corresponds to a small set of neurons that fire action potentials in response to the odor. An odor is initially encoded in the fly’s nose as a 50-dimensional vector, corresponding to the firing rates of 50 different types of odorant receptor neurons (ORNs). The 50 ORNs send odor information to 50 projection neurons (PNs), whose responses are normalized such that the mean firing rate of the PNs is nearly the same for all odors and odor concentrations. The PNs then project to about 2000 Kenyon cells (KCs) that lie in a structure called the“mushroom body”. The connection matrix between the 50 PNs and the 2000 KCs is sparse, approximately binary, and random. Each KC randomly selects about 6 of the 50 PNs, and sums up their firing rates. Each KC then provides feed-forward excitation to a single inhibitory neuron, which provides feedback inhibition to all of the KCs. As a result of this winner-take-all computation, only the top ~5% (100) of the highest rate KCs fire in response to the odor; the remaining 95% are silenced. This 5% corresponds to the tag/hash of the odor. Thus, the hash of an odor is a sparse point in a 2000-dimensional space. Each distinct odor activates a different 5% of the KCs, such that two very different odors will have little to no overlap in their active KCs. [0035] The second step of the fruit fly’s two-step procedure for determining odor novelty can be to decide if an odor, as defined by its hash, has been previously experienced. The output of the mushroom body is generated by mushroom body output neurons (MBONs) that receive input from the Kenyon cells and perform many different functions. For example, one such MBON, called MBON-a'3, receives inputs from 350 of the 2000 KCs. It was previously established that the activity of MBON-a'3 in response to an odor encodes a novelty signal for the odor, and that this signal is suppressed with odor familiarization (repeated presentations of the same odor).

[0036] For how MBON-a'3 encodes novelty, the following model has been previously proposed. For each odor presented to the fly, the hash for that odor is a sparse set of about 20 KCs (~5-10% of the 350 KCs that project to MBON-a'3). Local dopamine released by a neuron called PPLl-a'3 then modifies the strength of the 350 KCs MBON-a'3 synapses. Synapses made by the odor’s 20 activated KCs onto MBON-a'3 weaken, whereas synapses made by non-active KCs onto MBON-a'3 strengthen. The activity of MBON-a'3 is computed as the weighted sum of its inputs (/. _<?., the activity of each KC multiplied by its synaptic strength). Thus, repeated exposure to the same odor depresses active KC MBON-a'3 synapses, which suppresses the activity of MBON-a'3 in response to the odor, indicating that the odor has become familiar. At the same time, the synapses of non-active KCs onto MBON- a'3 strengthen, increasing over time the novelty response of odors with non-overlapping hashes.

[0037] In accordance with various embodiments, systems and methods as disclosed herein go beyond a simple binary classification of a stimulus (e.g., novel vs. familiar). Accordingly, in accordance with various embodiments, a continuous novelty score may indicate a degree in the classification of a given stimulus. For example, comparison of prior approaches for novelty detection makes clear that different odors are not just novel vs. familiar, but rather can have various degrees of novelty depending on the similarities between odors. Furthermore, in accordance with various embodiments, a filter engine as disclosed herein can be implemented in machine learning applications to solve problems in the realm of computers, computer networks, and data processing.

[0038] FIG. 1 is a block diagram that illustrates a computer system 100, in accordance with various embodiments. Computer system 100 may include one or more servers 130 communicatively coupled with one another and with one or more client devices 110 through a network 150. In accordance with various embodiments, the client devices 110 may be sensors configured to provide output signals to servers 130 in response to stimuli, through network 150. Servers 130 may include a computational device configured to receive the sensor inputs and perform a classification of the stimulus. In that regard, servers 130, client devices 110, and network 150 may be electronic components including memory circuits and processor circuits, and wired or wireless communication systems.

[0039] More generally, one or more of the components in system 100 may include one or more biological systems, such as sensory organs (e.g., eye, nose, tongue, skin, and the like, as client devices 110), a part of a central nervous system (e.g., brain, dorsal spine, and the like, as servers 130), or a communicative physiological network (nerves, lymphatic systems, or blood vessels, as part of network 150).

[0040] FIG. 2 illustrates an example sensing device 210 and a classification device 230 in a sensing system 200, according to various embodiments. Sensing device 210 may be a client device, and classification device 230 may be a server (e.g., client device 110 and server 130). More generally, sensing device 210 and classification device 230 may include a computer system having memory circuits 220-1 and 220-2, respectively (hereinafter, collectively referred to as“memories 220”), and processor circuits 212-1 and 212-2 (hereinafter, collectively referred to as“processors 212”). Memories 220 may store instructions which, when executed by processors 212, cause sensing device 210 and classification device 230 to perform at least partially one or more steps in methods consistent with various embodiments herein.

[0041] Sensing device 210 and classification device 230 may also communicate with each other through network 150 via communication modules 218-1 and 218-2, respectively (hereinafter, collectively referred to as“communication modules 218”). Sensing device 210 may also include an input device 214 and an output device 216. Input device 214 may include a keyboard, a mouse, a pointer, a stylus, a microphone, a touch sensitive display, and the like. Output device 216 may include a display (e.g. , a touch sensitive display and the like), a speaker, an alarm, a siren, or any other type of visual or auditory device.

[0042] In accordance with various embodiments, sensing device 210 can be configured to generate a query item in response to a stimulus, to assign multiple hash functions to the query item using a hash engine 222, and to transmit a hash function through a network (e.g., via communications module 218-1). The stimulus may be an olfactory stimulus, a visual stimulus, a tactile stimulus, an audible stimulus, a taste stimulus, or any combination thereof. In accordance with various embodiments, the hash functions are sparse, high-dimensional locality-sensitive hash functions.

[0043] Classification device 230 can be configured to receive the hash function from network 150, and to map the hash function to a database item in a hash database 252. In accordance to various embodiments, classification device 230 includes a filter engine 240 configured to map the hash function to the database item in the hash database, and a scoring tool 244 configured to reset a value of the database item when one of multiple hash functions maps to the database item and to update the hash database. Further, in accordance with various embodiments, classification device 230 includes a novelty detection tool 246 configured to calculate a continuous novelty score for the first stimulus by computing a weighted sum of the values of the database items in hash database 252. In accordance with various embodiments, novelty detection tool 246 may include other functions in addition to the weighted sum of the scores. For example, in accordance with various embodiments, novelty detection tool 246 may be configured for (or may include features for) finding the maximum of the score values, the minimum of the score values, or any combination thereof.

[0044] In various embodiments, a memory circuit 220-2 may store an additive factor and a decaying factor for scoring tool 244 to reset the value of the database item in hash database 252. In various embodiments, hash database 252 may include a binary string that is stored in memory 220-2. While the example illustrated system 200 indicates hash engine 222 as included in memory 220-1 of sensing device 210, and scoring tool 244, novelty detection tool 246, and a query log 248 as being stored together with filter engine 240 in memory 220-2 of classification device 230, it should be understood that in various embodiments any one of engines, tools, and logs may be stored at least partially on either one of sensing device 210 or classification device 230.

[0045] More generally, hash engine 222 and filter engine 240 form a mapping between sensory channels in sensing device 210 and novelty detection circuits in classification device 230. Thus, sensing system 200 may be configured to predict a novelty response between pairs of stimuli, analytically predict novelty responses, empirically evaluate benchmark datasets, and build distance- and time- sensitive filters for sensorial data in a biological organism, or in an electronic device or appliance.

[0046] FIG. 3 illustrates a schematic view of a filter engine 340 as used, in accordance to various embodiments. Without loss of generality, filter engine 340 will be illustrated in the context of an olfactory sensor of a fly, wherein the stimuli are odors to which the fly may be exposed. Filter engine 340 includes a novelty detection tool 346 (MBON-a'3) to encode novelty. For each odor presented to the fly, the hash for that odor is a sparse set of about 20 Kenyon Cells (KCs), which is about ~5-10% of the 350 KCs that project to MBON-a'3. The Kenyon Cells may be referred to as a hash engine 322. Local dopamine released by a neuron called PPLl-a'3 then modifies the strength of the 350 KCs MBON-a'3 synapses. Synapses made by the odor’s 20 activated KCs onto MBON-a'3 weaken, whereas synapses made by non-active KCs onto MBON-a'3 strengthen. The activity of MBON-a'3 is computed as the weighted sum of its inputs (e.g., the activity of each KC multiplied by its synaptic strength). Thus, repeated exposure to the same odor depresses active KC MBON-a'3 synapses, which suppresses the activity of MBON-a'3 in response to the odor, indicating that the odor has become familiar. At the same time, the synapses of non-active KCs onto MBON-a'3 strengthen, increasing over time the novelty response of odors with non-overlapping hashes.

[0047] From a computer science perspective, the KC MBON-a’3 synapses can be viewed as storing an enhanced-“Bloom filter”. A Bloom filter is a space-efficient data structure that serves as the foundation for answering the question,“Does x belong to S?”, where x is some query item (an odor) and S is a database of previously observed items. If x does not belong to S, then the item is considered novel, and vice-versa.

[0048] This foundational question of novelty naturally comes up in many applications. For example, when Google indexes the Web, its web crawler, upon reaching a website (x) needs to quickly determine if it has visited this site in the past (S) to avoid the cost of re-indexing. Similarly, determining whether a DNA sequence (x) belongs to a database (S) is a common primitive for many bioinformatics applications.

[0049] In various embodiments, filter engine 340 can include the following features:

[0050] 1. Continuous-valued novelty. For the fly, novelty responses should reflect how different the query item is from those previously experienced. A traditional Bloom filter, on the other hand, only reports a binary response, indicating whether or not the exact item is in the database.

[0051] 2. Distance sensitivity. The novelty of an odor with respect to a similar odor should be smaller than the novelty of the odor with respect to a very different odor. A traditional filter engine does not provide distance-sensitivity, instead treating each pair of unique items as equi distant from all other pairs. Distance- sensitivity also provides robustness, since, in a noisy environment, it is unlikely that the fly will observe the exact same odor twice.

[0052] 3. Time sensitivity. For the fly, if the same or a similar odor was last experienced a long time ago, its novelty should be larger than if the odor was experienced recently. A traditional filter engine, on the other hand, does not discount novelty over time.

[0053] The first two properties above modify the question to ask,“How close is x to some item in S?” The third property asks,“How long ago was x, or some item similar to x, observed?” These modifications are also relevant practically. For example, when recommending news articles, one might ask: how novel is this article compared to other articles the user has (recently) read? Or, how different is this patient’s profile compared to others who have (recently) signed up for a clinical trial?

[0054] In accordance with various embodiments herein, the systems, methods and software are described for quantitatively predicting neural novelty responses based on the use of filter engines. In accordance with various embodiments herein, systems, methods and software are described that translate the three above properties to extend a fundamental data structure to new application domains.

[0055] In various embodiments, filter engine 340 (like filter engine 240 of FIG. 2) can include a scoring tool 344 and a novelty detection tool 346 to quantitatively determine the novelty of a stimulus based on sensor responses. In various embodiments, a filter engine 340 with the above attributes may be implemented in other areas of digital communication and processing (including computer networks and related systems). For example, filter engine 340 may be part of a network server recommending news articles to a subscriber. Filter engine 340 may then evaluate a degree of novelty of a given article compared to other articles the user has (recently) read. In various embodiments, filter engine 340 may be included in a classification device for medical diagnostics, or in a healthcare facility to establish healthcare protocols and procedures. Accordingly, filter engine 340 may determine how different a patient’s profile is compared to other patients who have (recently) undergone a specific clinical trial.

[0056] In various embodiments, filter 340 includes a mapping between a layer of Kenyon Cells (KCs) 310, hash database 352 and a novelty response tool 346. The KC-to-MBON-03 synapses form a hash engine 322 that populates a hash database 352 for use in filter engine 340 having the three above features, namely: continuity, distance sensitivity, and time sensitivity. To answer the query:“Does x belong to S?” various embodiments may keep a sorted list of all of the items in S and then perform a binary search for x, or insert each item into hash database 352 and perform a lookup. In various embodiments, in may be desirable to create a compact hash database 352 rather than storing all items from S to avoid very large databases, or without having to store millions of small databases. Accordingly, in various embodiments, hash database 352 in filter engine 340 includes a small number of bits per item to store, independent of the size or content of each individual item.

[0057] Formally, a database of n items (wherein each item is a stimulus received by the sensing device), S ={si, S2, ... , s _n], and a query item x can be provided. In various embodiments, hash database 352 is formed with a binary novelty score, f (x , S) e {0, 1 } of x with respect to S. The objective function (Does x belong to S?) is:

[0058] In various embodiments, the goal is to output a continuous-valued novelty score, f (x , S) e [0, 1] of x with respect to S, and this score can be distance sensitive (cf. the third property, time sensitivity will be considered later). Distance sensitivity implies that f should reduce the novelty of x if there is an item similar to x in the database. The objective function (How close is x to some item in S?) is:

where d(x , Si ) is a distance measure between items x and Si. When f (x , S)=l, x is highly novel, whereas when f (x , S) = 0, x is highly familiar.

[0059] In various embodiments, novelty detection tool 346 approximates f (x , S) using a compressed m-bit representation of S. In general, m > n, where n is the number of items in S, and a value of m may be about 10h (e.g., 10 bits per item stored). In various embodiments of filter engine 340, each of the m bits may be initialized to 1. The choice of value (0 or 1) is not limiting, and in various embodiments, the bits associated with unused items (e.g. , novel stimuli) may be initialized to 0 instead of to 1. In various embodiments, the reverse choice may be used and novel items elicit a high novelty response instead of a low response. Each s e S is inserted into the filter by applying k << m independent hash functions to s. Each hash function maps s to one of the m positions in hash database 301, and each of these positions is reset to 0 as they belong to an‘old’ (e.g., already sensed) stimulus. To determine the novelty of a query x, filter engine 340 applies the same k hash functions to x to obtain k positions in hash database 301. When any of the resulting bits are 1, then the item definitely does not belong to S (e.g., a‘new’ stimulus). When all of the bits are 0, then the item may belong to S (e.g., an old stimulus). To avoid false positives (e.g.,‘chardonnay’ may be classified as old, when it is actually different from‘zinfandel’ or‘pinot noir’), in various embodiments, the indices of the k active KCs for an odor act as the k hashed positions of the odor in hash database 352 updated as follows. When a stimulus is presented (e.g., inserted into the filter by a sensing device), the strength of each of the m synapses is modified. Let w(i) correspond to the weight of the z ^'th entry in hash database 301. Initially, each w(i)=l. After an odor x is observed, each synaptic weight changes as follows: wherein, e is a pre-selected additive factor, and 0 < d < 1 is a pre-selected decaying factor. In other words, the synaptic weights of the k active KCs for the stimulus each decay by some factor, d. The remaining m- k KCs strengthen by a small amount, 0 < e < 1. In accordance to various embodiments, and without limitation, Eq. 3 assumes that each KC’s activity for a stimulus is binary (‘on’ or‘off’). In various embodiments, the response of the KCs to the stimulus may have a degree between zero (0) and one (1).

[0060] In various embodiments, when ignoring time, e = 0, and d = 0 so that once a stimulus activates a KC, its weight is reset to 0, and no recovery occurs with the presentation of subsequent stimuli that activate the same KC. Accordingly, in various embodiments, synaptic weights are binary (bits, cf Eq. 3). In various embodiments, a continuous-valued novelty score of a stimulus x includes a sum of the weights of the k entries in hash database 352 for the stimulus x. Hash engine 322 creates hash tags (or hash functions) associated with a specific stimulus. For example, hash tag 312 (with‘0’ values in positions [3, 6, 12] of hash database 352) may be associated with a‘zinfandel’ taste, hash tag 314 ([3, 6, 14]) may be associated with a‘pinot noir’ taste; and hash tag 316 ([2, 8, 14]) may be associated with‘’chardonnay’ taste. In various embodiments, novelty detection tool 346 includes a normalization factor (e.g., k) to generate continuous novelty responses for between 0 and 1. For example, novelty response 325 corresponds to a‘chardonnay’ taste (= 0.666), which indicates the‘novelty’ of this flavor relative to‘zinfandel,’ and‘pinot noir’ (= 0,‘old’ flavors). Because all weights are binary (ignoring time), the only weights that will be 1 are those corresponding to KCs that are active for x and no other stimulus in S previously inserted into the filter.

[0061] In various embodiments, filter engine 340 may include tools to adapt k, the number of hash functions used, based on the‘popularity’ of the data and to decentralize the filter to handle failures. In various embodiments, filter engine 340 may also include tools to allow items to be deleted, and to use other sophisticated inference steps. Moreover, filter engine 340 may include tools to compress high-dimensional datasets that enable estimating f(x, S) efficiently using a nearest-neighbors search. For example, short codes for dataset items using product quantization techniques, further speed up these computations using GPUs. In various embodiments, a space-efficient tree-based data structure for storing items may use less bits per item to store in hash database 352.

[0062] FIGS. 4A-4B illustrates charts 400A and 400B (hereinafter, collectively referred to as“charts 400”), indicating a distance sensitivity of a filter engine, according to various embodiments. To determine the similarity of two stimuli, charts 400 include data from a publicly available database, which provides sensor responses to different stimuli (e.g., odors detected by olfactory receptor neurons -ORNs-) by collecting experimental data from different sources.

[0063] FIG. 4A illustrates chart 400A, which tests how stimulus similarity, defined based on the differences in sensor activity for the two stimuli, affected novelty responses. Specifically, each stimulus is associated with a vector of recorded sensor activity in response to the stimulus, using the database. Accordingly, a significant correlation between the Euclidean distance between the vectors and the recorded novelty response of the filter engine for the pair: Higher distance is associated with a larger novelty (r =0:91).

[0064] FIG. 4B illustrates chart 400B, which tests how well a novelty detection tool as disclosed herein (e.g., novelty detection tools 246 and 346), defined based on the overlap of active Kenyon cells for the pair, correlated with the recorded novelty response (from the public database). Specifically, the sensor responses for each stimulus were computed by multiplying the vector of sensor activity for the stimulus by a sparse, binary random projection matrix into an m = 350-dimensional space. The hash engine operates by setting the indexes of the k = 20 (5% of 350) most rapidly firing KCs (e.g. , sensing units) to 1 and setting the rest of the indexes to 0. Chart 400B illustrates that a greater overlap in active KCs shared by the pair of stimuli, correlates with a lower novelty response (r =-0:94).

[0065] The robustness of the correlations in charts 400 was tested using leave-one-out cross- validation, and found no change to our conclusions. Charts 400 illustrate that the similarity between stimuli— either calculated in input sensor space or in KC hash space (e.g. , KC layer 310 or hash database 301), could predict the novelty response for the pair. Further, the similarity between two stimuli can be read off from the output of the novelty detection tool consistent with the first and second features of a filter engine as disclosed herein (cf. filter engines 240 and 340).

[0066] FIGS. 5A-5E illustrate multiple charts 500A, 500B, 500C, 500D, and 500E (hereinafter, collectively referred to as“charts 500”), comparing the performance of different filter engines to determine novelty of a stimulus, according to various embodiments. Charts 500 include an empirical evaluation on benchmark datasets. Accordingly, three different filter engines are compared in charts 500: a“fly [Bloom filter],” a“Bloom [filter],” and a prior distance-sensitive Bloom filter (“LSBF”). Exemplary scripts for each of the above filter engines are listed below.

[0067] A fly Bloom filter

[0069] And an LSBF

[0070] FIGS. 5A-5E and Table I illustrate a performance comparison on five datasets: Fly odors (chart 500A), Macaque faces (chart 500B), SIFT images (chart 500C), MNIST images (chart 500D), and Random exponential (chart 500E, cf. Table I). For each dataset, a 10-fold cross-validation is used, where all items from the training set S (cf. Eqs. 1-3) are inserted into the filter engine. Then, for each item in the test set, a ground-truth novelty response is determined via Eq. 2 and the predicted novelty from the filter engine (e.g., from the novelty detection tool). Charts 500 illustrate the correlation between the ground-truth and predicted novelty responses over all items in the test set. We set m = 30n and varied k e [5, 50].

TABLE I

[0071] FIGS. 5A and 5B illustrate charts 500A and 500B for two neural datasets (fly odors and primate faces, respectively). [0072] FIG. 5A illustrates chart 500A with results for the neural activity of 24 ORNs in the fruit fly in response to n = 110 odors.

[0073] FIG. 5B illustrates chart 500B with results for the neural activity of 99 face patch neurons in the macaque in response to n = 2,000 faces.

[0074] Charts 500A and 500B demonstrate a significant improvement in novelty detection using the fly Bloom filter compared with the alternatives. For example, on the odors dataset, the correlation between ground-truth and predicted novelty using the fly Bloom filter was 0.657±0.06, compared with 0.537±0.08 using the LSBF (k =40). Similarly, on the face data, the correlations were 0.508±0.03 for the fly vs. 0.404±0.03 for the LSBF. On all datasets, the Bloom filter had close to zero correlation between ground-truth and predicted novelty because the hash functions used in a Bloom filter are random and not distance sensitive.

[0075] FIGS. 5C-5E illustrate charts 500C, 500D, and 500E for three computational datasets. The first two are image datasets: a scale-invariant feature transform descriptors (SIFT, chart 500C), a modified National Institute of Standards and Technology database (MNIST, chart 500D) with n =5,000 each, and a random data drawn from an exponential distribution (chart 500E). Charts 500C, 500D, and 500E illustrate that the fly Bloom filter outperforms, or performs no worse than, other filters (e.g.,“Bloom” and“LSBF”).

[0076] FIG. 5C illustrates chart 500C (SIFT dataset), wherein the fly Bloom filter achieved a correlation of 0.535±0.03 vs. 0.345±0.03 and 0.002±0.02 for the LSBF and traditional filters, respectively. Similar results are obtained with m = 10h. Overall, embodiments of the fly Bloom filter as disclosed herein enhances novelty detection relative to other filters, while using the same space-efficient representation (e.g., hash database).

[0077] FIGS. 6A-6B illustrate charts 600A and 600B (hereinafter, collectively referred to as“charts 600”) illustrating a time recovery of the novelty response to a stimulus in a classification device, according to various embodiments. To incorporate timing mechanisms, a filter engine, in accordance with various embodiments, as disclosed herein may include several features illustrated in charts 600. In various embodiments, the rates of synapse weight recovery (e.g. , additive factor, e) and decay (e.g. , decay factor, d) provide clues to recover and suppresses novelty over time, in the classification device (e.g., classification device 130). In various embodiments, the functional forms of these two parameters may include an additive- increase, multiplicative-decrease mechanism. In various embodiments, multiplicative weight update rules may be used in online learning algorithms to converge to optimal decisions over time in a computer network application. In various embodiments, the novelty detection tool in a filter engine as disclosed herein may follow at least one of two general rules: first, an aggressive decay after initial exposure to a stimulus. In scenarios where the decay is slow, the novelty signal may persist over several successive presentations of the stimulus, which may unnecessarily burden attentiveness. Further, aggressive decay can more quickly heighten a difference between a stimulus detected once recently (familiar) and a stimulus that has not been detected for a long time (novel). Second, a slower recovery from familiarity back to novelty may be desirable. Otherwise, very soon after detecting a stimulus, the stimulus may become novel again, limiting the timescale over which novelty can be integrated. Various embodiments may apply the above rules in different degrees of rigour, based on a desired behavioral response.

[0078] In various embodiments, a filter engine can include a novelty detection tool based on a tradeoff between distance and time. For example, in various embodiments, the classification device includes a filter engine that distinguishes between two similar stimuli detected far apart in time, or between two dissimilar stimuli detected recently in time. To establish a compromise between the two above scenarios, a filter engine in accordance to various embodiments, may include more than one novelty detection tool. In various embodiments, a filter engine may be devised to include a local synaptic weight update rule (cf. Eq. 3) obtained by reverse engineering the global objective function that the local rule is trying to optimize.

[0079] In various embodiments, other measures from the sensing device may be used in a hash engine to provide a more accurate and better-graded novelty detection tool in the classification device. For example, the amount of dopamine released for a given stimulus, which may affect recovery and decay rates, could be related to a concentration or valence (intensity) of the stimulus and may be regulated by feedback from other mechanisms in the hash engine. In various embodiments, no dopamine release may indicate that a stimulus was not inserted into the filter. Understanding these parameters may allow novelty responses to be predicted for a time sequence of stimuli, instead of simply stimulus pairs. Accordingly, filter engines as disclosed herein may help understanding and evaluating novelty detection in other species and brain regions. Similar patterns of decay and recovery of novelty responses as illustrated in charts 500, albeit over longer timescales, may be found in the vertebrate olfactory system. Novelty detection tools as disclosed herein may also help understand stimulus response circuits in the hippocampus, in the amygdala, and in auditory systems.

[0080] In various embodiments, it may be desirable that similar stimuli observed recently in time should elicit a lower novelty response than similar items observed far apart in time. Accordingly, in various embodiments, it may be desirable that old data stored in the hash database slowly decay or be evicted from the hash database over time. Eviction is also important in life-long learning applications, where the database is not of fixed size but continuously grows. Unless old data are forgotten, a filter engine as disclosed herein may“fill” up (e.g., saturate, after multiple input stimuli) such that every bit in the hash database is reset to zero. A model as to how distance sensitivity and time sensitivity combine to create a single novelty response in various embodiments includes candidate objective functions, as follows.

[0081] Without time, the dataset S may be a set of items without a specific ordering scheme. When incorporating time, S becomes a time-series or stream of items,

where stimulus Si is observed at time i. Assuming that one item (stimulus) is observed per time- point, a first candidate objective function may be:

[0082] Accordingly, Eq. 4 may find the single closest item Si e S to x (or one sufficiently close to x), and set the desired novelty response to be the distance from x to Si, discounted by the time i when s, was observed. The time discount factor, following the decay rate, d, is multiplicative. A second candidate objective function is:

[0083] Eq. 5 adds the distances from each item s, e S to x (instead of the single closest item), discounting each by the same time factor as above. Accordingly, when there exists a single item Si e S that is very similar to x, Eq. 5 differentiates between instances where the rest of the items in S are very different than x versus the rest being very similar to x. If the rest are very similar, this means that the hashes for many items in S will overlap with the hash for x. This would reduce the novelty of x more than if the rest of the items are all very different than x. For both objectives, additional normalization may be used to ensure /(x, S) lies in [0, 1].

[0084] FIG. 6A illustrates chart 600A, including the recovery of novelty responses to the same stimulus over time. In chart 600A, a stimulus was presented, and after some recovery time (denoted on the x-axis), the same stimulus was presented, and the output of a novelty detection tool was recorded (e.g. , novelty detection tools 246 and 346). Chart 600A illustrates a linear relationship between recovery time elapsed and the novelty response, indicative of an additive relationship for the parameter, £ [0085] FIG 6B illustrates chart 600B, including a novelty response of the same stimulus decays with repeated presentations. In chart 600B, the same stimulus was presented repeatedly (denoted as‘trials’ on the x-axis), and after each trial, the novelty response from the novelty detection tool was recorded. Each black curve denotes a repeated experiment. These curves are clearly non-linear, illustrating an exponential decay. The thick curve denotes what would be expected with a decay parameter, d= 0.4. In other words, the novelty response for trial i, in chart 600B, would be 0.4z. This indicates a multiplicative relationship for decay.

[0086] Table II shows the performance of Eqs. 4 and 5 with the fly odors dataset (cf. FIG. 5A and chart 500A) using example realistic values of m = 350, k = 20, and n = 20, as the fly may distinguish between roughly 20 distinct odors. For Table II, a decay factor of d = 0.40, and an additive factor e = 0.01 are used for illustrative purposes only. The evaluation includes a leave-one-out cross-validation, where a random subset of n odors are inserted into the filter in a random order, and then the novelty is computed for a random single odor x. The ground- truth novelty response of x (as defined by either Eq. 4 or Eq. 5 above) is correlated with the predicted novelty response from the filter. This process is repeated 500 times and reports the average correlation. For both objectives, the fly Bloom filter engine (cf. FIGS. 5A-5E) has a higher correlation between the ground-truth and predicted novelty compared to the other two filters (Bloom filter and LSBF). For example, for Eq. 4, a correlation of 0.415 ± 0.01 for fly Bloom filter versus 0.305 ± 0.01 for ESBF and - 0.020 ± 0.01 for the traditional Bloom filter. Performance increased with Eq. 5:— 0.459 ± 0.01 for fly Bloom filter versus 0.349+0.01 for LSBF and -0.038+0.01 for the Bloom filter— suggesting that biological similarity may indeed take a form where more than one previous item is used to ascertain the novelty of x. Results for larger values d are also shown in Table II.

[0087] FIG. 7 illustrates a distribution 700 of an exemplary stimulus response as a function of the frequency of neuronal activity associated with the response (e.g., neuronal firing rates, in Hertz), according to various embodiments. Distribution 700 is an exponential distribution of ORN firing rates for 169 different stimuli, with at least 20 ORN responses per stimulus. Each black curve (or bracket) shows the cumulative distribution of the activity for ORNs for one stimulus. Each stimulus is normalized to have the same mean of fO Hz. The red line corresponds to an exponential distribution with a mean of b= f 0.

[0088] FIG. 8 is a flow chart illustrating steps in a method 800 for analytically determining a novelty response, according to various embodiments. Methods consistent with method 800 may be performed by a sensor system including a sensing device and a classification device, as disclosed herein (e.g., sensor systems 100 and 200, sensing device 210, and classification device 230). Further, the sensor system may include processors, memories, communications modules, and a network (e.g., processors 212, memories 220, communications modules 218, and network 150). The memories may include a hash engine, a filter engine, a scoring tool, a novelty detection tool, and a query log or a hash database, according to various embodiments (e.g. , hash engine 222, filter engine 240, scoring tool 244, novelty detection tool 246, and query log 248). Further, methods consistent with various embodiments may include at least one or more of the steps in method 800, performed in any order, simultaneously, or overlapping in time.

[0089] Step 802 includes receiving an initial item I.

[0090] Step 804 includes assigning a plurality of initial hash functions IH to the initial item I.

[0091] Step 806 includes testing whether the query item X belongs in the database, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item. In various embodiments, step 806 includes testing whether the query item X belongs in the database. In various embodiments, each database item value recovers by a recovery factor e for each hash function of an inserted subsequent query item that does not map to that database item. In various embodiments, each database item value decays by a decay factor d for each hash function of an inserted subsequent query item that maps to that database item. In various embodiments, wherein decay by decay factor d is multiplicative. In various embodiments, the plurality of database items in the Bloom filter includes a plurality of bits.

[0092] Step 808 includes updating the initial hash database to an updated hash database.

[0093] Step 810 includes receiving a query item X.

[0094] Step 812 includes assigning a plurality of query hash functions QH to the query item

[0095] Step 814 includes inserting the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database. In accordance with various embodiments, step 814 includes testing whether the query item X belongs in the database. In various embodiments, step 814 includes a recovery of a database item by a recovery factor e that is additive.

[0096] Step 816 includes calculating a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive. In various embodiments, step 816 includes resetting the database item value to zero for calculating novelty score N when a specific hash function maps to a respective database item in the Bloom filter.

[0097] FIG. 9 is a flow chart illustrating steps in a method 900 for analytically determining a novelty response to a stimulus, according to various embodiments. Methods consistent with method 900 may be performed by a sensor system including a sensing device and a classification device, as disclosed herein (e.g. , sensor systems 100 and 200, sensing device 210, and classification device 230). Further, the sensor system may include processors, memories, communications modules, and a network (e.g. , processors 212, memories 220, communications modules 218, and network 150). The memories may include a hash engine, a filter engine, a scoring tool, a novelty detection tool, and a query log or a hash database, according to various embodiments (e.g. , hash engine 222, filter engine 240, scoring tool 244, novelty detection tool 246, and query log 248). Further, methods consistent with various embodiments may include at least one or more of the steps in method 900, performed in any order, simultaneously, or overlapping in time.

[0098] Step 902 includes receiving a query item indicative of a response to a first stimulus.

[0100] Step 904 includes assigning, in the filter engine, multiple query hash functions to the query item.

[0101] Step 906 includes comparing a first query hash function to a database item, each database item occupying a position indicative of a query hash function in the filter engine. In various embodiments, step 906 includes mapping the query hash function to at least one of multiple database items in the hash database, and resetting a value for the database item when the query hash function maps to the database item.

[0102] Step 908 includes reducing a value for the database item when the database item is indicative of the first query hash function. In various embodiments, step 908 includes updating the hash database when the value for the first query hash function has changed. In various embodiments, step 908 includes increasing the value for the database item by a selected amount when none of the multiple query hash functions map to the database item. In various embodiments, step 908 includes selecting an additive factor to update a database item in the hash database when a second stimulus leaves the database item unchanged. In various embodiments, step 908 includes multiplying the value for the database item by a selected decay factor that is less than one. In various embodiments, step 908 includes forming the hash database as a string of bits, wherein each bit is a database item. In various embodiments, step 908 includes resetting the value for the database item to zero. In various embodiments, step 908 includes reducing each database item value by a decay factor d for each hash function of an inserted subsequent query item that maps to that database item.

[0103] Step 910 includes determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine. In various embodiments, step 910 includes determining a normalization factor in the weighted sum to make the novelty score less than one. In various embodiments, step 910 includes increasing the value for the first query hash function when a pre-selected time has elapsed after the first stimulus.

[0104] FIG. 10 is a flow chart illustrating steps in a method 1000 for filtering input data from a sensor to classify a stimulus, according to various embodiments. Methods consistent with method 1000 may be performed by a sensor system including a sensing device and a classification device, as disclosed herein (e.g. , sensor systems 100 and 200, sensing device 210, and classification device 230). Further, the sensor system may include processors, memories, communications modules, and a network (e.g. , processors 212, memories 220, communications modules 218, and network 150). The memories may include a hash engine, a filter engine, a scoring tool, a novelty detection tool, and a query log or a hash database, according to various embodiments (e.g. , hash engine 222, filter engine 240, scoring tool 244, novelty detection tool 246, and query log 248). Further, methods consistent with various embodiments may include at least one or more of the steps in method 1000, performed in any order, simultaneously, or overlapping in time.

[0105] Step 1002 includes encoding a stimulus into a vector based on the stimulation rate of each of multiple receptors in a sensor device.

[0106] Step 1004 includes normalizing the vector such that the mean stimulation rate is the same for different stimuli and different stimulus concentrations.

[0107] Step 1006 includes projecting the vector into a multidimensional space in a hash engine according to a sparse matrix.

[0108] Step 1008 includes finding an addition value of a random selection of the components of the vector using a component of the hash engine. In accordance to various embodiments, the addition value that excites the inhibitory component may be promotional to the sum of the values in the high-dimensional vector, not a random selection of values in the high-dimensional vector.

[0109] Step 1010 includes exciting an inhibitory component in the hash engine proportionally to the addition value. [0110] Step 1012 includes silencing the remaining components in the hash engine with the inhibitory component.

[0111] Step 1014 includes forming a hash for the stimulus with one or more unsilenced components in the hash engine.

[0112] FIG. 11 is a flow chart illustrating steps in a method 1100 for creating a hash database with items associated with hash functions provided by a sensor in response to a stimulus, according to various embodiments. Methods consistent with method 1100 may be performed by a sensor system including a sensing device and a classification device, as disclosed herein (e.g., sensor systems 100 and 200, sensing device 210, and classification device 230). Further, the sensor system may include processors, memories, communications modules, and a network (e.g., processors 212, memories 220, communications modules 218, and network 150). The memories may include a hash engine, a filter engine, a scoring tool, a novelty detection tool, and a query log or a hash database, according to various embodiments (e.g., hash engine 222, filter engine 240, scoring tool 244, novelty detection tool 246, and query log 248). Further, methods consistent with various embodiments may include at least one or more of the steps in method 1100, performed in any order, simultaneously, or overlapping in time.

[0113] Step 1102 includes receiving a hash function from an input stimulus.

[0114] Step 1104 includes determining a distance between the hash function and multiple database items indexed according to a longevity value for each database item.

[0115] Step 1106 includes adding the distances for multiple database items, each distance weighted with a decaying factor based on the longevity value for each database item.

[0116] Step 1108 includes associating a novelty score for the input stimulus based on the addition of the distances for multiple database items.

COMPUTER SYSTEM

[0117] FIG. 12 illustrates a block diagram that illustrates a computer system 1200, upon which embodiments, or portions of the embodiments, of the present teachings may be implemented. In various embodiments of the present teachings, computer system 1200 can include a bus 1202 or other communication mechanism for communicating information, and a processor 1204 coupled with bus 1202 for processing information. In various embodiments, computer system 1200 can also include a memory 1206, which can be a random access memory (RAM) or other dynamic storage device, coupled to bus 1202 for determining instructions to be executed by processor 1204. Memory 1206 also can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1204. In various embodiments, computer system 1200 can further include a read only memory (ROM) 1208 or other static storage device coupled to bus 1202 for storing static information and instructions for processor 1204. A storage device 1210, such as a magnetic disk or optical disk, can be provided and coupled to bus 1202 for storing information and instructions.

[0118] In various embodiments, computer system 1200 can be coupled via bus 1202 to a display 1212, such as a cathode ray tube (CRT) or liquid crystal display (LCD), for displaying information to a computer user. An input device 1214, including alphanumeric and other keys, can be coupled to bus 1202 for communicating information and command selections to processor 1204. Another type of user input device is a cursor control 1216, such as a mouse, a trackball or cursor direction keys for communicating direction information and command selections to processor 1204 and for controlling cursor movement on display 1212. This input device 1214 typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. However, it should be understood that input devices 1214 allowing for 3-dimensional (x, y and z) cursor movement are also contemplated herein.

[0119] Consistent with certain implementations of the present teachings, results can be provided by computer system 1200 in response to processor 1204 executing one or more sequences of one or more instructions contained in memory 1206. Such instructions can be read into memory 1206 from another computer-readable medium or computer-readable storage medium, such as storage device 1210. Execution of the sequences of instructions contained in memory 1206 can cause processor 1204 to perform the processes described herein. Alternatively, hard-wired circuitry can be used in place of or in combination with software instructions to implement the present teachings. Thus, implementations of the present teachings are not limited to any specific combination of hardware circuitry and software.

[0120] The term “computer-readable medium” (e.g., data store, data storage, etc.) or “computer-readable storage medium” as used herein refers to any media that participates in providing instructions to processor 1204 for execution. Such a medium can take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Examples of non-volatile media can include, but are not limited to, optical, solid state, magnetic disks, such as storage device 1210. Examples of volatile media can include, but are not limited to, dynamic memory, such as memory 1206. Examples of transmission media can include, but are not limited to, coaxial cables, copper wire, and fiber optics, including the wires in bus 1202.

[0121] Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, PROM, and EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other tangible medium from which a computer can read.

[0122] In addition to computer readable medium, instructions or data can be provided as signals on transmission media included in a communications apparatus or system to provide sequences of one or more instructions to processor 1204 of computer system 1200 for execution. For example, a communication apparatus may include a transceiver having signals indicative of instructions and data. The instructions and data are configured to cause one or more processors to implement the functions outlined in the disclosure herein. Representative examples of data communications transmission connections can include, but are not limited to, telephone modem connections, wide area networks (WAN), local area networks (LAN), infrared data connections, NFC connections, etc.

[0123] It should be appreciated that the methodologies described herein including flow charts, diagrams and accompanying disclosure can be implemented using computer system 1200 as a standalone device or on a distributed network of shared computer processing resources such as a cloud computing network.

[0124] In accordance with various embodiments, the systems and methods described herein can be implemented using computer system 1200 as a standalone device or on a distributed network of shared computer processing resources such as a cloud computing network. As such, a non-transitory computer-readable medium can be provided in which a program is stored for causing a computer to perform the disclosed methods for identifying mutually incompatible gene pairs.

[0125] It should also be understood that the preceding embodiments can be provided, in whole or in part, as a system of components integrated to perform the methods described. For example, in accordance with various embodiments, the methods described herein can be provided as a system of components or stations for analytically determining novelty responses.

[0126] In describing the various embodiments, the specification may have presented a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described. As one of ordinary skill in the art would appreciate, other sequences of steps may be possible. Therefore, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. In addition, the claims directed to the method and/or process should not be limited to the performance of their steps in the order written, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the various embodiments. Similarly, any of the various system embodiments may have been presented as a group of particular components. However, these systems should not be limited to the particular set of components, now their specific configuration, communication and physical orientation with respect to each other. One skilled in the art should readily appreciate that these components can have various configurations and physical orientations (e.g., wholly separate components, units and subunits of groups of components, different communication regimes between components).

[0127] Although specific embodiments and applications of the disclosure have been described in this specification (including the associated Appendix), these embodiments and applications are exemplary only, and many variations are possible.

RECITATION OF EMBODIMENTS

[0128] A. A method for analytically determining novelty response includes receiving an initial item I, assigning a plurality of initial hash functions IH to the initial item I, and inserting the initial item I into a Bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item. The method also includes updating the initial hash database to an updated hash database, receiving a query item X, and assigning a plurality of query hash functions QH to the query item X. The method also includes inserting the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database, and calculating a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

[0129] B. A non-transitory computer-readable medium in which a program is stored for causing a computer to perform a method for analytically determining novelty response is provided, where the method includes receiving an initial item I and assigning a plurality of initial hash functions IH to the initial item I. The method also includes inserting the initial item I into a Bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item, and updating the initial hash database to an updated hash database. The method also includes receiving a query item X, and assigning a plurality of query hash functions QH to the query item X. The method also includes inserting the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database, and calculating a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

[0130] C. A system for analytically determining novelty response includes a processing unit having an initial item processing engine. The initial item processing engine is configured to receive an initial item I, to assign a plurality of initial hash functions IH to the initial item I, and to insert the initial item I into a bloom filter, wherein each of the plurality of initial hash functions IH is mapped to a respective one of a plurality of database items in an initial hash database, each database item occupying a position in the initial hash database, and wherein each database item has a value that is reset when one of the plurality of initial hash functions IH maps to the respective database item. The initial item processing engine is also configured to update the initial hash database to an updated hash database. The system also includes a query item processing engine configured to receive a query item X, to assign a plurality of query hash functions QH to the query item X, and to insert the query item X into a Bloom filter, wherein each of the query hash functions of the query item X is compared to a respective one of the plurality of database items in the updated hash database, each database item occupying a position in the updated hash database, and wherein each query hash function adopts the value of the respective database item in the updated hash database. The system also includes a scoring unit configured to calculate a continuous novelty score N by computing a weighted sum of the database item values corresponding to each of the query hash functions, wherein 0 < N < 1, and wherein the continuous novelty score is both distance sensitive and time sensitive.

[0131] D. A method for analytically determining novelty response includes receiving a query item indicative of a response to a first stimulus and assigning in a filter engine multiple query hash functions to the query item. The method also includes comparing a first query hash function to a database item, each database item occupying a position indicative of a query hash function in the filter engine, reducing a value for the database item when the database item is indicative of the first query hash function, and determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine.

[0132] E. A non-transitory computer-readable medium in which a program is stored for causing a computer to perform a method for analytically determining novelty response is provided. The method includes receiving a query item indicative of a response to a first stimulus and assigning in the filter engine multiple query hash functions to the query item. The method also includes comparing a first query hash function to a database item, each database item occupying a position indicative of a query hash function in the filter engine and reducing a value for the database item when the database item is indicative of the first query hash function. The method also includes determining a novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine, and updating the hash database when the value for the first query hash function has changed.

[0133] F. A system for analytically determining a novelty response includes a processing unit. The processing unit includes a sensing device configured to generate a query item in response to a first stimulus, to assign multiple hash functions to the query item, and to transmit a hash function through a network. The system also includes a classification device configured to receive the hash function from the network, and to map a hash function to a database item in the filter engine. The classification device includes a scoring tool configured to reset a value of the database item when one of the plurality of hash functions maps to the database item and to update the hash database, and a novelty detection tool configured to calculate a continuous novelty score for the first stimulus by computing a weighted sum of the values of the database items in the filter engine.

[0134] G. A method for filtering input data from a sensor to classify a stimulus includes encoding a stimulus into a vector based on the stimulation rate of each of multiple receptors in a sensor device, and normalizing the vector such that the mean stimulation rate is the same for different stimuli and different stimulus concentrations. The method also includes projecting the vector into a multidimensional space in a hash engine according to a sparse matrix, finding an addition value of a random selection of the components of the vector using a component of the hash engine, and exciting an inhibitory component in the hash engine proportionally to the addition value. The method also includes silencing the remaining components in the hash engine with the inhibitory component, and forming a hash for the stimulus with one or more unsilenced components in the hash engine. [0135] H. A method for finding a novelty score for a stimulus in a sensor system includes receiving a hash function from an input stimulus and determining a distance between the hash function and multiple database items indexed according to a longevity value for each database item. The method also includes adding the distances for multiple database items, each distance weighted with a decaying factor based on the longevity value for each database item, and associating a novelty score for the input stimulus based on the addition of the distances for multiple database items.

[0136] Each of embodiments A, B, C, D, E, F, G or H may be combined with the following elements in any number and order, to produce further embodiments consistent with the present disclosure, as follows:

[0137] Element 1, wherein each database item value recovers by a recovery factor e for each hash function of an inserted subsequent query item that does not map to that database item. Element 2, wherein recovery by recovery factor e is additive. Element 3, wherein each database item value decays by a decay factor d for each hash function of an inserted subsequent query item that maps to that database item. Element 4, wherein decay by decay factor d is multiplicative. Element 5, wherein the plurality of database items in the Bloom filter includes a plurality of bits. Element 6, further including resetting the database item value to zero for calculating novelty score N when a specific hash function maps to a respective database item in the Bloom filter.

[0138] Element 7, wherein comparing a first query hash function to a database item includes mapping the query hash function to at least one of multiple database items in the filter engine, and resetting a value for the database item when the query hash function maps to the database item. Element 8, further including updating the hash database when the value for the first query hash function has changed. Element 9, further including determining a normalization factor in the weighted sum to make the novelty score less than one. Element 10, wherein determining a novelty score includes assigning a similar novelty score to a second stimulus that is similar to the first stimulus. Element 11, wherein determining a novelty score further includes increasing the value for the first query hash function when a pre-selected time has elapsed after the first stimulus. Element 12, further including increasing the value for the database item by a selected amount when none of the multiple query hash functions map to the database item. Element 13, further including selecting an additive factor to update a database item in the hash database when a second stimulus leaves the database item unchanged. Element 14, further including reducing each database item value by a decay factor d for each hash function of an inserted subsequent query item that maps to that database item. Element 15, wherein reducing the value for the database item includes multiplying the value for the database item by a selected decay factor that is less than one. Element 16, further including forming the hash database as a string of bits, wherein each bit is a database item. Element 17, wherein reducing a value for the database item includes resetting the value for the database item to zero. Element 18, further including: receiving a second query item indicative of a response to a second stimulus; assigning multiple query hash functions to the second query item; mapping the query hash functions into the filter engine; and determining a novelty score for the second stimulus based on a mapping of the query hash functions into the filter engine.

[0139] Element 19, wherein in the method, determining a novelty score includes assigning a similar novelty score to a second stimulus that is similar to the first stimulus. Element 20, wherein in the method, wherein determining a novelty score further includes increasing the value for the first query hash function when a pre-selected time has elapsed after the first stimulus. Element 21, wherein the method further includes increasing the value for the database item by a selected amount when none of the multiple query hash functions map to the database item.

[0140] Element 22, further including a memory circuit storing an additive factor and a decaying factor for the scoring tool to reset the value of the database item. Element 23, wherein the hash database is a binary string, the system further including a memory circuit to store the binary string.

Previous Patent: APPARATUS AND METHOD FOR FILTERING

Next Patent: WIDE-FIELD NANOSECOND IMAGING METHODS USING WIDE-FIELD OPTICAL MODULATORS