

Title:
EMBEDDING DRIFT IN A MACHINE LEARNING MODEL
Document Type and Number:
WIPO Patent Application WO/2023/183833
Kind Code:
A1
Abstract:
Techniques for determining an embedding drift score in a machine learning model. The techniques can include: obtaining one or more first embedding vectors based on at least one first prediction of a machine learning model; filtering the first embedding vectors based on a slice of the first prediction; determining a first average vector by averaging each dimension of the filtered first embedding vectors; obtaining one or more second embedding vectors based on at least one second prediction of the machine learning model; filtering the second embedding vectors based on a slice of the second prediction; generating a second average vector by averaging each dimension of the filtered second embedding vectors; and determining an embedding drift score based on a distance measure of the first average vector and the second average vector.

Inventors:
LOPATECKI JASON (US)
CARRASCO FRANCISCO (US)
DHINAKARAN APARNA (US)
SCHIFF MICHAEL (US)
Application Number:
PCT/US2023/064798
Publication Date:
September 28, 2023
Filing Date:
March 22, 2023
Assignee:
ARIZE AI INC (US)
International Classes:
G06N20/00
Other References:
HANG YU; TIANYU LIU; JIE LU; GUANGQUAN ZHANG: "Automatic Learning to Detect Concept Drift", arXiv.org, Cornell University Library, Ithaca, NY, 4 May 2021 (2021-05-04), XP081957990
GOLDENBERG I ET AL.: "Survey of distance measures for quantifying concept drift and shift in numeric data", KNOWLEDGE AND INFORMATION SYSTEMS, vol. 60, no. 2, August 2019 (2019-08-01), pages 591 - 615, XP036837568, DOI: 10.1007/s10115-018-1257-z
JEATRAKUL P ET AL.: "Data cleaning for classification using misclassification analysis", JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, vol. 14, no. 3, 2010, pages 297 - 302, XP055754280, DOI: 10.20965/jaciii.2010.p0297
GRECO S ET AL.: "Drift Lens: Real-time unsupervised Concept Drift detection by evaluating per-label embedding distributions", 2021 INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 7 December 2021 (2021-12-07), pages 341 - 349, XP034072393, DOI: 10.1109/ICDMW53433.2021.00049
CASTELLANI ANDREA; SCHMITT SEBASTIAN; HAMMER BARBARA: "Task-Sensitive Concept Drift Detector with Constraint Embedding", 2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), IEEE, 5 December 2021 (2021-12-05), pages 01 - 08, XP034044728, DOI: 10.1109/SSCI50451.2021.9659969
Attorney, Agent or Firm:
MINUTOLI, Gianni et al. (US)
Claims:
CLAIMS

1. A computer-implemented method for determining an embedding drift score in a machine learning model, the method comprising: obtaining one or more first embedding vectors based on at least one first prediction of a machine learning model; filtering the first embedding vectors based on a slice of the first prediction; determining a first average vector by averaging each dimension of the filtered first embedding vectors; obtaining one or more second embedding vectors based on at least one second prediction of the machine learning model; filtering the second embedding vectors based on a slice of the second prediction; generating a second average vector by averaging each dimension of the filtered second embedding vectors; and determining an embedding drift score based on a distance measure of the first average vector and the second average vector.

2. The method of claim 1, further comprising: optimizing the machine learning model based on the embedding drift score.

3. The method of claim 1, wherein determining the embedding drift score comprises: determining a Euclidean embedding drift score when the distance measure is a Euclidean distance.

4. The method of claim 1, wherein determining the embedding drift score comprises: determining a Cosine embedding drift score when the distance measure is a Cosine distance.

5. The method of claim 1, wherein the machine learning model is based on unstructured image data.

6. The method of claim 1, wherein the machine learning model is based on unstructured natural language data.

7. The method of claim 1, wherein the first prediction is during a first time period and the second prediction is during a second time period.

8. The method of claim 1, wherein the first prediction is generated when the model is operating in a training environment and the second prediction is generated when the model is operating in a production environment.

9. The method of claim 1, wherein obtaining the first embedding vectors and/or the second embedding vectors comprises extracting the vectors from the model based on activations at a layer of the model.

10. The method of claim 1, wherein the filtering of the first embedding vectors and the second embedding vectors comprises removing false negatives and/or false positives from the slice of the first prediction and second prediction, respectively.

11. A system for determining an embedding drift score in a machine learning model, the system comprising: a processor and an associated memory, the processor being configured for: obtaining one or more first embedding vectors based on at least one first prediction of a machine learning model; filtering the first embedding vectors based on a slice of the first prediction; determining a first average vector by averaging each dimension of the filtered first embedding vectors; obtaining one or more second embedding vectors based on at least one second prediction of the machine learning model; filtering the second embedding vectors based on a slice of the second prediction; generating a second average vector by averaging each dimension of the filtered second embedding vectors; and determining an embedding drift score based on a distance measure of the first average vector and the second average vector.

12. The system of claim 11, wherein the processor is further configured for optimizing the machine learning model based on the embedding drift score.

13. The system of claim 11, wherein determining the embedding drift score comprises: determining a Euclidean embedding drift score when the distance measure is a Euclidean distance.

14. The system of claim 11, wherein determining the embedding drift score comprises: determining a Cosine embedding drift score when the distance measure is a Cosine distance.

15. The system of claim 11, wherein the machine learning model is based on unstructured image data.

16. The system of claim 11, wherein the machine learning model is based on unstructured natural language data.

17. The system of claim 11, wherein the first prediction is during a first time period and the second prediction is during a second time period.

18. The system of claim 11, wherein the first prediction is generated when the model is operating in a training environment and the second prediction is generated when the model is operating in a production environment.

19. The system of claim 11, wherein obtaining the first embedding vectors and/or the second embedding vectors comprises extracting the vectors from the model based on activations at a layer of the model.

20. The system of claim 11, wherein the filtering of the first embedding vectors and the second embedding vectors comprises removing false negatives and/or false positives from the slice of the first prediction and second prediction, respectively.

Description:
EMBEDDING DRIFT IN A MACHINE LEARNING MODEL

RELATED APPLICATIONS

[1] This application is based on and derives the benefit of the filing date of United States Patent Application No. 17/703,205, filed March 24, 2022, the entire content of which is incorporated herein by reference. This application is related to U.S. Patent Application Nos. 17/212,202, filed March 25, 2021, and 17/548,070, filed December 10, 2021, which are incorporated by reference in their entirety.

BACKGROUND

[2] Performance of a machine learning model depends on whether the data and labels in the training examples are similar to those the model encounters in the production environment. For example, a model's performance may be adversely impacted if it is trained only on images of good quality, but in production the model encounters dark, pixelated, or blurred images, or other unfamiliar situations.

[3] Natural language processing (NLP) training samples (e.g., sentences or paragraphs) can have similar issues because they need to be labeled for training. Labeling can be an expensive, manual task, and it is typically done for only a tiny fraction of the production data. This is undesirable.

SUMMARY

[4] The present disclosure provides embedding drift techniques to overcome the aforementioned problems by tracking drift of models with unstructured training datasets. For example, a computer-implemented method for determining an embedding drift score in a machine learning model is disclosed. The method can include: obtaining one or more first embedding vectors based on at least one first prediction of a machine learning model; filtering the first embedding vectors based on a slice of the first prediction; determining a first average vector by averaging each dimension of the filtered first embedding vectors; obtaining one or more second embedding vectors based on at least one second prediction of the machine learning model; filtering the second embedding vectors based on a slice of the second prediction; generating a second average vector by averaging each dimension of the filtered second embedding vectors; and determining an embedding drift score based on a distance measure of the first average vector and the second average vector.

[5] In example embodiments, the method can further include optimizing the machine learning model based on the embedding drift score. The determining of the embedding drift score can include determining a Euclidean embedding drift score when the distance measure is a Euclidean distance. The determining of the embedding drift score can include determining a Cosine embedding drift score when the distance measure is a Cosine distance. The machine learning model can be based on unstructured image data or unstructured natural language data. The first prediction can be during a first time period and the second prediction during a second time period. The first prediction can be generated when the model is operating in a training environment and the second prediction can be generated when the model is operating in a production environment. The obtaining of the first embedding vectors and/or the second embedding vectors can include extracting the vectors from the model based on activations at a layer of the model. The filtering of the first embedding vectors and the second embedding vectors can include removing false negatives and/or false positives from the slice of the first prediction and second prediction, respectively.

[6] A system for determining an embedding drift score in a machine learning model is also disclosed. The system can include a processor and an associated memory, the processor being configured for: obtaining one or more first embedding vectors based on at least one first prediction of a machine learning model; filtering the first embedding vectors based on a slice of the first prediction; determining a first average vector by averaging each dimension of the filtered first embedding vectors; obtaining one or more second embedding vectors based on at least one second prediction of the machine learning model; filtering the second embedding vectors based on a slice of the second prediction; generating a second average vector by averaging each dimension of the filtered second embedding vectors; and determining an embedding drift score based on a distance measure of the first average vector and the second average vector.

BRIEF DESCRIPTION OF DRAWINGS

[7] Other objects and advantages of the present disclosure will become apparent to those skilled in the art upon reading the following detailed description of example embodiments, in conjunction with the accompanying drawings, in which like reference numerals have been used to designate like elements, and in which:

[8] FIG. 1 illustrates images used to train a machine learning model according to an example embodiment of the present disclosure;

[9] FIG. 2 shows various predictions by a machine learning model according to an example embodiment of the present disclosure;

[10] FIG. 3 illustrates natural language processing (NLP) samples used to train a machine learning model according to an example embodiment of the present disclosure;

[11] FIG. 4 shows a flowchart of a method for determining the embedding drift metric in a machine learning model according to an example embodiment of the present disclosure;

[12] FIG. 5 shows embedding vectors extracted from the model and visualized in a three-dimensional plot according to an example embodiment of the present disclosure;

[13] FIGS. 6A-6D show representations of embedding drift score based on the Euclidean and Cosine distance measures according to example embodiments of the present disclosure;

[14] FIG. 7 shows a graph of embedding drift over time according to an example embodiment of the present disclosure; and

[15] FIG. 8 illustrates a machine configured to perform computing operations according to an example embodiment of the present disclosure.

DESCRIPTION

[16] The present disclosure provides embedding drift techniques that can track the drift of a machine learning model that is based on unstructured data. FIG. 1 illustrates an example unstructured image dataset (102-124) consisting of images of chicken nuggets. Images 102, 104, 106, 112, and 116 are high quality images. Image 108 is pixelated, image 110 is blurred, and image 114 is darkened. Images 118-124 show environments that the model was not trained on. A machine learning model trained only on the high quality images (102, 104, 106, 112, and 116) may underperform in production if it encounters unstructured image data such as images 108, 110, 114, or 118-124.

[17] FIG. 2 is a graphic showing various predictions by a machine learning model. Of these, predictions 210 and 220 are incorrect. If a model is trained on a sample that excludes predictions 210 and 220 from labeling, the model may underperform. In the illustrated example, predictions 210 and 220 represent dark images, so if these are excluded from labeling, the model may underperform when other dark images are input to the model.

[18] FIG. 3 illustrates natural language processing (NLP) examples (302-346) that can be used to train a machine learning model based on movie descriptions and genre labels. The genre can belong to a single category or multiple categories. For example, example 302 has a genre of thriller, and example 328 has a genre of thriller as well as science fiction and animated. Similar to the previously described machine learning model trained on images, a machine learning model based on NLP sentences may underperform in production if it encounters inputs that are unstructured NLP sentences or paragraphs. A person of ordinary skill in the art would appreciate that a machine learning model based on unstructured data, as described herein, is not limited to models based on image data or NLP data; it can include models based on other unstructured data, e.g., graphs.

[19] The present disclosure describes an embedding drift metric that can be used to solve the aforementioned problems with unstructured data in production environments. FIG. 4 shows a flowchart of an example method 400 of determining the embedding drift metric in a machine learning model. The method 400 can include a step 410 of obtaining one or more first embedding vectors based on at least one first prediction of a machine learning model; a step 420 of filtering the first embedding vectors based on a slice of the first prediction; a step 430 of determining a first average vector by averaging each dimension of the filtered first embedding vectors; a step 440 of obtaining one or more second embedding vectors based on at least one second prediction of the machine learning model; a step 450 of filtering the second embedding vectors based on a slice of the second prediction; a step 460 of generating a second average vector by averaging each dimension of the filtered second embedding vectors; and a step 470 of determining an embedding drift score based on a distance measure of the first average vector and the second average vector. Each of these steps is described in detail below.
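For illustration only (this sketch is editorial and not part of the original disclosure), the following minimal Python/NumPy code mirrors steps 410-470 end to end; the function names and the boolean slice masks are assumptions introduced here for clarity:

```python
# Editorial sketch of steps 410-470 (not from the patent); NumPy only.
import numpy as np

def average_vector(vectors: np.ndarray) -> np.ndarray:
    """Steps 430/460: average each dimension across a set of embedding vectors."""
    return vectors.mean(axis=0)

def embedding_drift_score(first_vectors: np.ndarray, second_vectors: np.ndarray,
                          first_slice: np.ndarray, second_slice: np.ndarray,
                          measure: str = "euclidean") -> float:
    """Filter each set by its slice mask (steps 420/450), average each filtered
    set (steps 430/460), and return a distance between the averages (step 470)."""
    a = average_vector(first_vectors[first_slice])
    b = average_vector(second_vectors[second_slice])
    if measure == "euclidean":
        return float(np.linalg.norm(a - b))
    # Cosine distance = 1 - cosine similarity
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```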

[20] Although the steps 410-470 are illustrated in sequential order, these steps may also be performed in parallel, and/or in a different order than the order disclosed and described herein. Also, the various steps may be combined into fewer steps, divided into additional steps, and/or removed based upon a desired implementation.

[21] At step 410, one or more first embedding vectors based on at least one first prediction of a machine learning model can be obtained. An embedding vector is a vector of information that can be extracted from a model based on the activations at a specific layer of the model. In example embodiments, the first embedding vectors can be obtained from an external or internal source (e.g., from a memory device, a network connection, etc.), or extracted as explained below.
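As an editorial illustration of one common way to capture activations at a specific layer (a sketch under stated assumptions, not the patent's prescribed method), a PyTorch forward hook can collect the embedding vectors; the toy model and the choice of hooked layer below are assumptions:

```python
# Illustrative sketch: capturing activations at a chosen layer with a PyTorch
# forward hook. The toy model and hooked layer are assumptions for this example.
import torch

activations = []

def save_activation(module, inputs, output):
    # Detach so stored activations do not keep the autograd graph alive.
    activations.append(output.detach().cpu())

model = torch.nn.Sequential(
    torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4)
)
handle = model[1].register_forward_hook(save_activation)  # hook the ReLU output
model(torch.randn(5, 8))   # 5 predictions -> 5 embedding vectors of dimension 16
handle.remove()
embeddings = torch.cat(activations)  # shape (5, 16)
```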

[22] In an example embodiment, the first embedding vectors can be obtained at a certain timestamp or time period, e.g., when predictions of the model are at a baseline level of accuracy, which can be defined according to the type of model (e.g., an image-based model, NLP based model, etc.). The baseline level of accuracy can be changed/updated during the training or based on the model’s performance in production.

[23] In example embodiments, one or more embedding vectors can be grouped by their environment (i.e., vectors from a training, validation, or production set) or metadata to form the first set of embedding vectors. Further, one or more embedding vectors can also be grouped based on a combination of their timestamp, environment, and metadata.
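A hedged sketch of such grouping, assuming a tabular store with hypothetical column names ("environment", "timestamp") that are not from the disclosure:

```python
# Sketch of grouping embedding vectors by environment and time window with
# pandas; the column names here are hypothetical, not from the disclosure.
import pandas as pd

df = pd.DataFrame({
    "environment": ["training", "production", "production"],
    "timestamp": pd.to_datetime(["2022-02-20", "2022-03-21", "2022-03-22"]),
    "embedding": [[1.0, 2.0], [0.5, 1.5], [0.0, 1.0]],
})

# First set: training vectors; second set: recent production vectors.
first_set = df[df.environment == "training"].embedding.tolist()
second_set = df[(df.environment == "production")
                & (df.timestamp >= "2022-03-01")].embedding.tolist()
```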

[24] In an example embodiment, the first embedding vectors of step 410 can be extracted using known methods, e.g., as described in https://beta.openai.com/docs/guides/embeddings (accessed March 16, 2022) or https://www.pinecone.io/learn/vector-embeddings/ (accessed March 16, 2022), which are incorporated by reference. FIG. 5 shows example embedding vectors extracted from predictions 510 of the model and visualized in a three-dimensional plot 520.

[25] At step 420, the first embedding vectors can be filtered by applying filtering criteria (e.g., removing false positives and/or false negatives) based on a slice (i.e., a subset) of the model's first predictions. Techniques described in application Nos. 17/212,202 and/or 17/548,070 for generating a slice and filtering can be used to perform step 420.
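A minimal sketch of this slice-filtering step, assuming binary classification with ground-truth labels available for the slice (the helper name filter_slice is hypothetical):

```python
# Sketch of step 420, assuming a binary-classification slice with ground-truth
# labels available; dropping false positives/negatives keeps only the rows
# where the prediction agrees with the label. Names are illustrative.
import numpy as np

def filter_slice(embeddings: np.ndarray, preds: np.ndarray,
                 labels: np.ndarray, slice_mask: np.ndarray) -> np.ndarray:
    """Keep slice rows whose predictions are correct (remove FPs and FNs)."""
    keep = slice_mask & (preds == labels)
    return embeddings[keep]
```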

[26] At step 430, a first average vector can be determined by averaging each dimension of the filtered first embedding vectors. The first average vector can be considered a representation of the filtered first set of embedding vectors. In an example embodiment, for $N$ filtered embedding vectors $v_1, \ldots, v_N$, each with $n$ dimensions, the average vector can be calculated dimension by dimension using the equation

$$\bar{v} = \frac{1}{N} \sum_{k=1}^{N} v_k, \qquad \bar{v}_i = \frac{1}{N} \sum_{k=1}^{N} v_{k,i} \quad (i = 1, \ldots, n).$$

Other known methods may also be used.

[27] For a set of embedding vectors $v_1 = (-1, 2, 0)$, $v_2 = (2, 3, 1)$, $v_3 = (0, -2, 1)$, and $v_4 = (3, 0, -1)$, the average vector using the aforementioned equation can be calculated as

$$\bar{a}_{t_1} = \left(\tfrac{-1+2+0+3}{4},\ \tfrac{2+3-2+0}{4},\ \tfrac{0+1+1-1}{4}\right) = \left(1,\ \tfrac{3}{4},\ \tfrac{1}{4}\right).$$
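The worked example can be checked with a few lines of NumPy (an editorial sketch, not patent code):

```python
# Checking the worked example above (illustrative, not patent code).
import numpy as np

vectors = np.array([[-1, 2, 0], [2, 3, 1], [0, -2, 1], [3, 0, -1]])
print(vectors.mean(axis=0))  # [1.   0.75 0.25], i.e., (1, 3/4, 1/4)
```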

[28] At step 440, one or more second embedding vectors for at least one second prediction of the machine learning model can be obtained. The process for obtaining the second embedding vectors can be similar to the process described for obtaining the first embedding vectors in step 410. However, the timestamp/time period, environment, and/or metadata for the second prediction can be different from the first. For example, the second time period (e.g., 1-2 days ago) can be more recent than the first time period (e.g., 30 days ago). Similarly, the first prediction can be in a training environment while the second prediction is in a production environment.

[29] At step 450, the second embedding vectors can be filtered by applying filtering criteria (e.g., removing false positives and/or false negatives) based on a slice (i.e., a subset) of the model's second predictions. Techniques described in application No. 17/212,202 for generating a slice and filtering can be used to perform step 450. In an example embodiment, the criteria used to filter the first embedding vectors and the second embedding vectors can be the same.

[30] At step 460, a second average vector can be determined by averaging each dimension of the filtered second embedding vectors. The second average vector can be considered a representation of the filtered second set of embedding vectors. Step 460 can be performed using the process of averaging described with respect to step 430.

[31] For a second set of embedding vectors $w_1 = (5, \tfrac{1}{4}, 0)$, $w_2 = (1, 1, 1)$, $w_3 = (2, -1, 1)$, $w_4 = (3, \tfrac{1}{4}, -1)$, and $w_5 = (4, \tfrac{3}{4}, \tfrac{3}{2})$, the average vector using the aforementioned equation can be calculated as

$$\bar{b}_{t_1} = \left(\tfrac{5+1+2+3+4}{5},\ \tfrac{\frac{1}{4}+1-1+\frac{1}{4}+\frac{3}{4}}{5},\ \tfrac{0+1+1-1+\frac{3}{2}}{5}\right) = \left(3,\ \tfrac{1}{4},\ \tfrac{1}{2}\right).$$

[32] At step 470, an embedding drift score is determined based on a distance measure of the first average vector and the second average vector. In an example embodiment, the embedding drift score can be determined based on a Euclidean distance measure using the formula

$$d_E(\bar{a}, \bar{b}) = \sqrt{\sum_{i=1}^{n} \left(\bar{a}_i - \bar{b}_i\right)^2}.$$

With the first average vector $\bar{a}_{t_1} = (1, \tfrac{3}{4}, \tfrac{1}{4})$ and the second average vector $\bar{b}_{t_1} = (3, \tfrac{1}{4}, \tfrac{1}{2})$ determined at steps 430 and 460, the Euclidean distance measure is calculated as

$$d_E = \sqrt{(1-3)^2 + \left(\tfrac{3}{4}-\tfrac{1}{4}\right)^2 + \left(\tfrac{1}{4}-\tfrac{1}{2}\right)^2} = \sqrt{4.3125} \approx 2.077.$$

[33] In another example embodiment, the embedding drift score can be determined based on a Cosine distance measure using the formula

$$d_{\cos}(\bar{a}, \bar{b}) = 1 - \frac{\bar{a} \cdot \bar{b}}{\lVert \bar{a} \rVert \, \lVert \bar{b} \rVert}.$$

With the first average vector and the second average vector determined at steps 430 and 460, the Cosine distance measure is calculated as

$$d_{\cos} = 1 - \frac{3.3125}{\sqrt{1.625}\,\sqrt{9.3125}} \approx 1 - 0.8515 \approx 0.148.$$
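Both distance measures above can be verified with NumPy (again an illustrative sketch; plain NumPy is used so the arithmetic stays visible):

```python
# Verifying the two drift scores computed above (editorial sketch).
import numpy as np

a = np.array([1, 3/4, 1/4])  # first average vector (step 430)
b = np.array([3, 1/4, 1/2])  # second average vector (step 460)

euclidean = np.linalg.norm(a - b)
cosine = 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(round(float(euclidean), 3), round(float(cosine), 3))  # 2.077 0.148
```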

[34] FIGS. 6A-6D show various example representations of embedding drift scores based on the Euclidean and Cosine distance measures in accordance with the principles disclosed in steps 410-470. FIG. 6A shows a representation for the image dataset of chicken nuggets shown in FIG. 1. FIG. 6B shows a representation for the NLP dataset of movie genres and descriptions shown in FIG. 3. FIG. 6C shows a representation for an NLP dataset of women's clothing reviews. FIG. 6D shows a representation for an NLP dataset of hotel reviews.

[35] The magnitude of change in the distance measure over time indicates the magnitude of the embedding drift. In some cases, the relationship between the distance measure and the embedding drift can be directly proportional, i.e., any increase/decrease in the distance measure has a corresponding and equal increase/decrease in the embedding drift score. However, if the relationship is not directly proportional, the same distance measure may suggest a different amount of embedding drift across datasets, customers, environments, etc.

[36] Therefore, being able to compare the values of the distance between a reference group and any other group of embeddings may require adding scale information to the progression information extracted from tracking the distances over time. In an example embodiment, this can be done by scaling the distances over time, calculated as before, with a reference distance. The reference Euclidean distance and reference Cosine distance between embeddings from the same dataset but at different time periods, environments, feature metadata, etc. can be calculated using the following formulae, respectively:

$$d_E^{\mathrm{ref}} = \sqrt{\sum_{i=1}^{n} \left(\bar{a}_{t_1,i} - \bar{a}_{t_2,i}\right)^2}, \qquad d_{\cos}^{\mathrm{ref}} = 1 - \frac{\bar{a}_{t_1} \cdot \bar{a}_{t_2}}{\lVert \bar{a}_{t_1} \rVert \, \lVert \bar{a}_{t_2} \rVert}.$$

[37] The reference Euclidean distance and reference Cosine distance between embeddings from different datasets can be calculated using the following formulae, respectively:

$$d_E^{\mathrm{ref}} = \sqrt{\sum_{i=1}^{n} \left(\bar{a}_i - u_i\right)^2}, \qquad d_{\cos}^{\mathrm{ref}} = 1 - \frac{\bar{a} \cdot u}{\lVert \bar{a} \rVert \, \lVert u \rVert},$$

where the vector u depends on a type of scaling to be performed.

[38] For time scaling, the vector representing dataset A evolves with time, while the vector that represents dataset B is deemed constant (hence used as reference). In this scenario, vectors $\bar{a}_{t_1}$ and $\bar{b}_{t_1}$ represent datasets A and B, respectively, at $t = t_1$. Vector $\bar{a}_{t_2}$ represents dataset A at $t = t_2$. As mentioned, dataset B is recommended to be approximately constant in time.

[39] For environment scaling, the vector representing dataset A likewise evolves, while the vector that represents dataset B is deemed constant (hence used as reference). Here too, vectors $\bar{a}_{t_1}$ and $\bar{b}_{t_1}$ represent datasets A and B, respectively, at $t = t_1$, and vector $\bar{a}_{t_2}$ represents dataset A at $t = t_2$. As mentioned, dataset B is recommended to be approximately constant in time.

[40] In an example embodiment, A is a group of 4 vectors, each with 3 dimensions. As calculated previously, the average vector representing group A at time t1 is $\bar{a}_{t_1} = (1, \tfrac{3}{4}, \tfrac{1}{4})$. B is a group of 5 vectors, each with 3 dimensions. As calculated previously, the average vector representing group B at time t1 is $\bar{b}_{t_1} = (3, \tfrac{1}{4}, \tfrac{1}{2})$.

[41] At time t2, datasets A and B can be represented by the vectors $\bar{a}_{t_2} = (1, \tfrac{1}{4}, \tfrac{6}{4})$ and $\bar{b}_{t_2} = \bar{b}_{t_1} = (3, \tfrac{1}{4}, \tfrac{2}{4})$, respectively. In this example embodiment, the vector representing dataset B remains the same to keep a constant sense of scale. The distance between dataset A at time t2 and dataset B at time t1 can then be compared, scaling each distance by the reference distance between $\bar{a}_{t_1}$ and $\bar{b}_{t_1}$. The scaled distances are calculated as follows.

Euclidean: $\dfrac{d_E(\bar{a}_{t_2}, \bar{b}_{t_1})}{d_E(\bar{a}_{t_1}, \bar{b}_{t_1})} = \dfrac{\sqrt{5}}{2.077} \approx 1.077$

Cosine: $\dfrac{d_{\cos}(\bar{a}_{t_2}, \bar{b}_{t_1})}{d_{\cos}(\bar{a}_{t_1}, \bar{b}_{t_1})} = \dfrac{0.3136}{0.1485} \approx 2.11$
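The following sketch reproduces these numbers under the assumption, made explicit here rather than stated verbatim in the text, that each tracked distance is divided by the reference distance between $\bar{a}_{t_1}$ and $\bar{b}_{t_1}$:

```python
# Editorial sketch of the scaled distances; assumes the scaling is division
# by the reference distance d(a_t1, b_t1), as described in the text above.
import numpy as np

def euclid(u, v):
    return np.linalg.norm(u - v)

def cos_dist(u, v):
    return 1 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a_t1 = np.array([1, 3/4, 1/4])
a_t2 = np.array([1, 1/4, 6/4])
b_t1 = np.array([3, 1/4, 2/4])  # dataset B, held constant as the reference

scaled_euclidean = euclid(a_t2, b_t1) / euclid(a_t1, b_t1)   # ~1.077
scaled_cosine = cos_dist(a_t2, b_t1) / cos_dist(a_t1, b_t1)  # ~2.11
print(round(float(scaled_euclidean), 3), round(float(scaled_cosine), 2))
```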

[42] In this example, dataset A (e.g., in a production environment) is represented by vector $\bar{a} = (1, \tfrac{3}{4}, \tfrac{1}{4})$, dataset B (e.g., a training environment) is represented by $\bar{b} = (3, \tfrac{1}{4}, \tfrac{2}{4})$, and dataset C (e.g., a validation environment) is represented by vector $\bar{c} = (1, \tfrac{1}{4}, \tfrac{6}{4})$. Since these are the same vectors as in the previous example, the scaled distance results are the same as above. A person of skill in the art would understand that, while the calculations are the same in this example, in other examples the calculations may vary if reference datasets and scaling methods are changed, and hence the scaled distance may change as well.

[43] Issues arising from embedding drift can range from sudden data pipeline failures to long-term drift in feature inputs. The following are non-limiting examples of such issues: (1) an incorrect data indexing mistake breaks the upstream mapping of data; (2) a software engineering change alters the meaning of a field; (3) a third party data source makes a change, dropping a feature, changing a format, or moving data; (4) newly deployed code changes an item in a feature vector; (5) the outside world drastically changes (e.g., the COVID-19 pandemic) and every feature shifts; (6) a periodic daily collection of data fails, causing missing values or a missing file; (7) a format presumed valid changes and is suddenly not valid; (8) third party library functionality changes; (9) a date string changes format; (10) bad text handling causes new tokens the model has never seen, e.g., mistakes in handling case and problems with new text strings; (11) the system naturally evolves and features shift; (12) a drastic increase in volume skews statistics; and (13) different sources of features use different coordinates or indexing.

[44] In the real world, post model-deployment, embedding drift issues can occur in a myriad of ways and cause model performance issues. Therefore, to avoid such issues, the embedding drift can be monitored proactively, and an alert can be raised if the embedding drift score exceeds a predefined threshold.

[45] FIG. 7 shows a graph of embedding drift over time. In an example embodiment, if the embedding drift exceeds 0.2 (shown by the vertical line 710), an alert can be raised. The machine learning model can then be optimized by adjusting/changing one or more predictions or slices of predictions of the machine learning model. Techniques described in application No. 17/212,202 for optimizing the model can be used.
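A monitoring sketch of this alerting behavior, where the 0.2 threshold mirrors the FIG. 7 example and raise_alert is a hypothetical stand-in for a real alerting hook:

```python
# Sketch of proactive drift monitoring; the 0.2 threshold mirrors the FIG. 7
# example, and raise_alert is a hypothetical stand-in for a real alerting hook.
DRIFT_THRESHOLD = 0.2

def raise_alert(when: str, score: float) -> None:
    print(f"ALERT at {when}: embedding drift score {score:.3f} > {DRIFT_THRESHOLD}")

def monitor(scores_over_time):
    for when, score in scores_over_time:
        if score > DRIFT_THRESHOLD:
            raise_alert(when, score)

monitor([("day 1", 0.05), ("day 2", 0.12), ("day 3", 0.27)])  # alerts on day 3
```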

[46] FIG. 8 shows an example system 800 that can be used for implementing the method 400 and other aspects of the present disclosure. The system 800 can include a processor 802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both) and an associated memory 804. The processor 802 can be configured to perform all the previously described steps with respect to method 400. In various embodiments, the computer system 800 can operate as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of either a server or a client machine in server-client network environments, or it may act as a peer machine in peer-to-peer (or distributed) network environments.

[47] Example computer system 800 may further include a static memory 806, which communicates with the other components via an interconnect 808 (e.g., a link, a bus, etc.). The computer system 800 may further include a video display unit 810, an input device 812 (e.g., a keyboard), and a user interface (UI) navigation device 814 (e.g., a mouse). In one embodiment, the video display unit 810, input device 812, and UI navigation device 814 are incorporated into a touch screen display. The computer system 800 may additionally include a storage device 816 (e.g., a drive unit), a signal generation device 818 (e.g., a speaker), an output controller 832, a network interface device 820 (which may include or operably communicate with one or more antennas 830, transceivers, or other wireless communications hardware), and one or more sensors 828.

[48] The storage device 816 includes a machine-readable medium 822 on which is stored one or more sets of data structures and instructions 824 (e.g., software) embodying or utilized by any one or more of the methodologies or functions described herein. The instructions 824 may also reside, completely or at least partially, within the main memory 804, static memory 806, and/or within the processor 802 during execution thereof by the computer system 800, with the main memory 804, static memory 806, and the processor 802 constituting machine-readable media.

[49] While the machine-readable medium 822 is illustrated in an example embodiment to be a single medium, the term "machine-readable medium" may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions 824.

[50] The term "machine-readable medium" shall also be taken to include any tangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure, or that is capable of storing, encoding, or carrying data structures utilized by or associated with such instructions.

[51] The term "machine-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media. Specific examples of machine-readable media include non-volatile memory, including, by way of example, semiconductor memory devices (e.g., Electrically Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM)) and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.

[52] The instructions 824 may further be transmitted or received over a communications network 826 using a transmission medium via the network interface device 820 utilizing any one of several well-known transfer protocols (e.g., HTTP). Examples of communication networks include a local area network (LAN), wide area network (WAN), the Internet, mobile telephone networks, Plain Old Telephone (POTS) networks, and wireless data networks (e.g., Wi-Fi, 3G, and 4G LTE/LTE-A or WiMAX networks).

[53] The term "transmission medium" shall be taken to include any intangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible media to facilitate communication of such software.

[54] Other applicable network configurations may be included within the scope of the presently described communication networks. Although examples were provided with reference to a local area wireless network configuration and a wide area Internet network connection, it will be understood that communications may also be facilitated using any number of personal area networks, LANs, and WANs, using any combination of wired or wireless transmission mediums.

[55] The embodiments described above may be implemented in one or a combination of hardware, firmware, and software. For example, the features in the system architecture 800 of the processing system may be client-operated software or be embodied on a server running an operating system with software running thereon.

[56] While some embodiments described herein illustrate only a single machine or device, the terms “system”, “machine”, or “device” shall also be taken to include any collection of machines or devices that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

[57] Examples, as described herein, may include, or may operate on, logic or several components, modules, features, or mechanisms. Such items are tangible entities (e.g., hardware) capable of performing specified operations and may be configured or arranged in a certain manner. In an example, circuits may be arranged (e.g., internally or with respect to external entities such as other circuits) in a specified manner as a module, component, or feature. In an example, the whole or part of one or more computer systems (e.g., a standalone, client or server computer system) or one or more hardware processors may be configured by firmware or software (e.g., instructions, an application portion, or an application) as an item that operates to perform specified operations. In an example, the software may reside on a machine readable medium. In an example, the software, when executed by underlying hardware, causes the hardware to perform the specified operations.

[58] Accordingly, such modules, components, and features are understood to encompass a tangible entity, be that an entity that is physically constructed, specifically configured (e.g., hardwired), or temporarily (e.g., transitorily) configured (e.g., programmed) to operate in a specified manner or to perform part or all of the operations described herein. Considering examples in which modules, components, and features are temporarily configured, each of the items need not be instantiated at any one moment in time. For example, where the modules, components, and features comprise a general-purpose hardware processor configured using software, the general-purpose hardware processor may be configured as respective different items at different times. Software may accordingly configure a hardware processor, for example, to constitute a particular item at one instance of time and to constitute a different item at a different instance of time.

[59] Additional examples of the presently described method, system, and device embodiments are suggested according to the structures and techniques described herein. Other non-limiting examples may be configured to operate separately or can be combined in any permutation or combination with any one or more of the other examples provided above or throughout the present disclosure.

[60] It will be appreciated by those skilled in the art that the present disclosure can be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The presently disclosed embodiments are therefore considered in all respects to be illustrative and not restrictive. The scope of the disclosure is indicated by the appended claims rather than the foregoing description, and all changes that come within the meaning and range of equivalence thereof are intended to be embraced therein.

[61] It should be noted that the terms “including” and “comprising” should be interpreted as meaning “including, but not limited to”. If not already set forth explicitly in the claims, the term “a” should be interpreted as “at least one” and “the”, “said”, etc. should be interpreted as “the at least one”, “said at least one”, etc. Furthermore, it is the Applicant's intent that only claims that include the express language "means for" or "step for" be interpreted under 35 U.S.C. 112(f). Claims that do not expressly include the phrase "means for" or "step for" are not to be interpreted under 35 U.S.C. 112(f).