Title:
SYSTEMS AND METHODS FOR PROGRESSING SAMPLES THROUGH A WORKFLOW BASED ON SELECTION CRITERIA AND AUTOMATICALLY CAPTURING WORKFLOW DECISIONS
Document Type and Number:
WIPO Patent Application WO/2021/158548
Kind Code:
A1
Abstract:
Example embodiments provide systems and methods for automatically interfacing with a screening system configured to screen a plurality of samples according to at least one hit criterion. A hit matching system, which may be accessible to users via a network connection, programmatically identifies a subset of samples that meet the hit criterion; the hit criterion that gave rise to the subset of samples is preserved in an output structure for future re-analysis. The output can be provided to the screening system, which can automatically rescreen the subset of samples. By programmatically pushing the samples to be rescreened back to the screening instrument, the instrument can rescreen the samples in a short period of time. The cherry-picking lists (represented by the subset(s) of samples) allow for selected hits to be re-arrayed in a smaller and more manageable format for secondary and tertiary analysis.

Inventors:
LE ANH-HUY PHAN (US)
WALL MARK (US)
Application Number:
PCT/US2021/016215
Publication Date:
August 12, 2021
Filing Date:
February 02, 2021
Assignee:
BASF SE (DE)
International Classes:
G16H40/60
Foreign References:
US20060200315A12006-09-07
US20180051319A12018-02-22
Other References:
BECK BERND ET AL: "The impact of data integrity on decision making in early lead discovery", JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, SPRINGER NETHERLANDS, NL, vol. 29, no. 9, 26 September 2015 (2015-09-26), pages 911 - 921, XP035603074, ISSN: 0920-654X, [retrieved on 20150926], DOI: 10.1007/S10822-015-9871-2
DAVID SHUM ET AL: "A high density assay format for the detection of novel cytotoxic agents in large chemical libraries", JOURNAL OF ENZYME INHIBITION AND MEDICINAL CHEMISTRY, vol. 23, no. 6, 1 January 2008 (2008-01-01), GB, pages 931 - 945, XP055489207, ISSN: 1475-6366, DOI: 10.1080/14756360701810082
Attorney, Agent or Firm:
DHINDSA, Richa (US)
Claims:
CLAIMS

What is claimed is:

1. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the processors to: interface with a screening system configured to screen a plurality of samples according to at least one hit criterion; programmatically identify a subset of samples from among the plurality of samples that meet the at least one hit criterion; generate an output comprising respective identifiers for each of the samples in the subset of samples; and provide the output to the screening system, the screening system configured to rescreen the subset of samples.

2. The medium of claim 1, further storing instructions for receiving a selection of the at least one hit criterion, wherein the output further comprises the hit criterion that gave rise to the identified subset of samples.

3. The medium of claim 2, wherein receiving the selection of the at least one hit criterion comprises receiving a plurality of hit criteria combined with Boolean operators.

4. The medium of claim 1, wherein the screening system comprises a high throughput screening (HTS) instrument.

5. The medium of claim 1, further storing instructions for selecting at least one of the subset of samples as an outlier, excluding the selected outlier from the subset of samples, and storing an identifier of the outlier in an outlier list in the output.

6. The medium of claim 1, wherein the plurality of samples are respectively arranged into a plurality of wells on one or more plates.

7. The medium of claim 6, wherein the output comprises a heat map of the one or more plates, wherein the heat map color-codes the respective wells on the one or more plates based on how extensively the respective well exceeded the at least one hit criterion.

8. A computer-implemented method comprising: interfacing with a screening system configured to screen a plurality of samples according to at least one hit criterion; programmatically identifying a subset of samples from among the plurality of samples that meet the at least one hit criterion; generating an output comprising respective identifiers for each of the samples in the subset of samples; and providing the output to the screening instrument, the screening instrument configured to rescreen the subset of samples.

9. The method of claim 8, further comprising receiving a selection of the at least one hit criterion, wherein the output further comprises the hit criterion that gave rise to the identified subset of samples.

10. The method of claim 9, wherein receiving the selection of the at least one hit criterion comprises receiving a plurality of hit criteria combined with Boolean operators.

11. The method of claim 8, wherein the screening system comprises a high throughput screening (HTS) instrument.

12. The method of claim 8, further comprising selecting at least one of the subset of samples as an outlier, excluding the selected outlier from the subset of samples, and storing an identifier of the outlier in an outlier list in the output.

13. The method of claim 8, wherein the plurality of samples are respectively arranged into a plurality of wells on one or more plates.

14. The method of claim 13, wherein the output comprises a heat map of the one or more plates, wherein the heat map color-codes the respective wells on the one or more plates based on how extensively the respective well exceeded the at least one hit criterion.

15. An apparatus comprising: a hardware interface configured to communicate with a screening system configured to screen a plurality of samples according to at least one hit criterion; a hardware processor configured to programmatically identify a subset of samples from among the plurality of samples that meet the at least one hit criterion; and a non-transitory computer-readable medium configured to store an output comprising respective identifiers for each of the samples in the subset of samples, wherein the hardware interface is further configured to provide the output to the screening system, the screening system configured to rescreen the subset of samples.

16. The apparatus of claim 15, wherein the hardware processor is further configured to receive a selection of the at least one hit criterion, and wherein the output further comprises the hit criterion that gave rise to the identified subset of samples.

17. The apparatus of claim 16, wherein receiving the selection of the at least one hit criterion comprises receiving a plurality of hit criteria combined with Boolean operators.

18. The apparatus of claim 15, wherein the screening system comprises a high throughput screening (HTS) instrument.

19. The apparatus of claim 15, wherein the hardware processor is further configured to receive a selection of at least one of the subset of samples as an outlier, exclude the selected outlier from the subset of samples, and store an identifier of the outlier in an outlier list in the output.

20. The apparatus of claim 15, wherein the plurality of samples are respectively arranged into a plurality of wells on one or more plates, and the output comprises a heat map of the one or more plates, wherein the heat map color-codes the respective wells on the one or more plates based on how extensively the respective well exceeded the at least one hit criterion.

Description:
Systems and Methods for Progressing Samples Through a Workflow Based on Selection Criteria and Automatically Capturing Workflow Decisions

BACKGROUND

[0001] High-throughput screening (HTS) is a technique for scientific experimentation often used in connection with chemistry or biology. HTS allows users to rapidly conduct many chemical, genetic, or pharmacological tests in a short period of time. For example, an HTS system may analyze samples provided on microtiter plates, which typically include (e.g.) 96 sample wells arranged in a grid. An HTS tool may perform tests on multiple such plates (in some cases, hundreds of plates) in a given experimental run, and may make the data available for analysis. In some cases, multiple data points are available for each sample, meaning that many tens of thousands of units of data are generated in a single experimental run.

SUMMARY

[0002] Exemplary embodiments provide methods, mediums, and systems for identifying screened samples for rescreening based on one or more hit criteria.

[0003] According to one embodiment, a computing device may interface with a screening system configured to screen a plurality of samples according to at least one hit criterion. The screening system may include a high throughput screening (HTS) instrument. The samples may be arranged into a plurality of wells on one or more plates.

[0004] The computing device may programmatically identify a subset of samples from among the plurality of samples that meet the at least one hit criterion. In some embodiments, at least one of the subset of samples may be selected as an outlier. The selected outlier may be excluded from the subset of samples for further processing. An identifier of the outlier may be stored in an outlier list in the output generated by the computing device.

[0005] The computing device may generate an output comprising respective identifiers for each of the samples in the subset of samples. In some embodiments, the computing device may receive a selection of the at least one hit criterion, and may store the hit criterion that gave rise to the identified subset of samples in the output. The hit criterion may be one of multiple hit criteria that are combined with Boolean operators.

[0006] In some embodiments, the output may include a heat map of the one or more plates, wherein the heat map color-codes the respective wells on the one or more plates based on how extensively the respective well exceeded the at least one hit criterion.

[0007] The computing device may provide the output to the screening system, the screening system configured to rescreen the subset of samples.

[0008] These and other embodiments will now be described with reference to the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] FIG. 1 depicts an environment suitable for use with exemplary embodiments.

[0010] FIGs. 2A-2I depict examples of interfaces for data analysis, collection, and sample identification according to exemplary embodiments.

[0011] FIG. 3A depicts an example of an input data structure suitable for use with exemplary embodiments.

[0012] FIG. 3B depicts an example of an output data structure suitable for use with exemplary embodiments.

[0013] FIG. 4 depicts an input/output specification for an exemplary embodiment.

[0014] FIG. 5 is a block diagram depicting logic according to an exemplary embodiment.

[0015] FIG. 6 is a flowchart illustrating an exemplary procedure suitable for practicing exemplary embodiments.

[0016] FIG. 7 depicts an exemplary computing system suitable for use with exemplary embodiments.

[0017] FIG. 8 depicts an exemplary network environment suitable for use with exemplary embodiments.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

[0018] There are several limitations to conventional HTS systems and traditional analysis techniques used on the resulting data. First, analysis is typically slow and inefficient. One conventional technique for performing data analysis involves capturing the data at an HTS instrument, adding it to a database, importing the data to a spreadsheet, copying the spreadsheet data into an analysis template, and emailing the analysis template to an expert, who will then analyze the data over the course of several weeks.

[0019] The experts may analyze the analysis template in view of hit criteria that determine which samples are of interest (“hits”) and which are not (“non-hits” or “misses”). Unfortunately, a given analysis may not be repeatable, because the hit criteria used to establish which samples are of interest are not preserved. Instead, a list of hits is provided; if a future user wishes to determine how the data was analyzed, the user must attempt to reconstruct the hit criteria based on the results. Even if the user can regenerate the original hit criteria, it may not be possible to perform updated tests with variations on the hit criteria, since the underlying data may not be preserved. Moreover, the analysis templates provided to the experts are typically not standardized, which means that analyses from different organizations may not be directly comparable.

[0020] After the expert identifies one or more samples of interest, it may be desirable to rescreen the identified samples with the screening instrument. By rescreening a smaller set of samples, more detailed, more expensive, and/or more stringent tests can be performed in a shorter period of time, and users can confirm that the earlier screening did not yield false positives. The follow-up tests can determine if the originally-tested samples were of good quality. Furthermore, changes to the samples of interest can be introduced, and users can determine if a significant improvement over a reference sample is achieved. However, using conventional techniques, rescreening can suffer from the same problems as the original screening process. Moreover, rescreening may require substantial further effort, as the original samples need to be recreated (perhaps months later) or pulled from storage, arranged onto new plates, and set up to run with the new rescreening criteria.

[0021] Still further, it can be difficult to identify and diagnose problems in conventional screening systems. For instance, assume that the screening instrument is applying heat to the plates being tested, but is doing so unevenly. Accordingly, the samples on the outside of the plate may receive more or less heat than the samples in the middle. Using conventional techniques, it may be difficult to identify plate-level patterns in the data.

[0022] To address these and other problems, exemplary embodiments provide techniques for improved analysis of screening data that can trigger and perform rescreening of the samples.

[0023] According to exemplary embodiments, a hit matching system automatically interfaces with a screening system configured to screen a plurality of samples according to at least one hit criterion. The hit matching system, which may be accessible to users via a network connection, programmatically identifies a subset of samples that meet the hit criterion; the hit criterion that gave rise to the subset of samples is preserved in an output structure for future re-analysis. The output can be provided to the screening system, which can automatically rescreen the subset of samples. By programmatically pushing the samples to be rescreened back to the screening instrument, the instrument can rescreen the samples in a short period of time. The cherry-picking lists (represented by the subset(s) of samples) allow for selected hits to be re-arrayed in a smaller and more manageable format for secondary and tertiary analysis. In another example, hits may be combined to achieve synergistic improvements.

[0024] By eliminating a processing bottleneck in the rescreening process, significant time savings can be achieved. For instance, in one enzyme evolution project, it was estimated that use of the exemplary embodiment described above saved 2-3 months of project time and allowed an enzyme product to be brought to market significantly faster.

[0025] In some embodiments, users may identify outliers in the data (or the outliers can be programmatically identified based on one or more criteria). The outliers may be excluded from further analysis or re-analysis, but the identity of the outliers may be preserved in the output structure to facilitate review of the data in the future.

[0026] By preserving the hit criteria and any outliers, the reproducibility of the data processing is improved, allowing for retrospective analysis or re-analysis.

[0027] The following description of embodiments provides non-limiting representative examples referencing numerals to particularly describe features and teachings of different aspects of the invention. The embodiments described should be recognized as capable of implementation separately, or in combination, with other embodiments from the description of the embodiments. The description of embodiments should facilitate understanding of the invention to such an extent that other implementations, not specifically covered but within the knowledge of a person of skill in the art having read the description of embodiments, would be understood to be consistent with an application of the invention.

[0028] It is noted that, although exemplary embodiments are described in connection with particular examples (electrophoresis analysis of proteins, a standalone alignment and identification system, etc.), the present invention is not limited to these examples.

[0029] FIG. 1 illustrates an environment 100 according to an example embodiment. The environment 100 includes a screening system 102 configured to generate data from a collection of samples 108, and a hit matching system 116 configured to analyze the data, identify samples of interest 112 from among the collection of samples 108, and push a list of the samples of interest 112 back to the screening system 102 for re-screening and/or further analysis. The hit matching system 116 may store the results of the analysis in a database 118. A user device 120 may access the hit matching system 116, either directly or through a network. The user device 120 may be used to visualize the data, allow a user to select samples of interest 112 for further testing, flag outliers among the samples of interest 112, and trigger re-analysis of the samples 108.

[0030] The screening system 102 may include a high-throughput screening (HTS) instrument 104 configured to perform tests on a group of samples 108.

[0031] The samples may include one or more test materials, such as chemical compounds, cells, enzymes, etc. Typically, the goal of an HTS experiment is to analyze the samples 108 to determine whether each sample 108 includes active compounds, antibodies, genes, etc. that meet a given criterion (e.g., modulating a given biomolecular pathway). To determine if this is the case, the samples 108 may be analyzed by the HTS instrument 104 in view of one or more hit criteria. Any samples 108 meeting the hit criteria are considered to be samples of interest 112 and good candidates for further research.

[0032] In some cases, a sample 108 may meet the hit criteria and may be identified as a sample of interest 112, but may have certain characteristics that suggest that the sample 108 should not be considered a sample of interest 112. For example, problems in the experiment (e.g., over-heating or under-heating of the sample), in the sample (e.g., improper preparation of the sample), with the equipment (e.g., with the HTS instrument 104), or other issues may result in data that causes the sample 108 to be identified as a sample of interest 112, but that is clearly erroneous. In exemplary embodiments, such samples of interest 112 may be flagged as outliers 114; outliers 114 may be removed from consideration for analysis and/or re-screening. In some embodiments, outliers 114 may include samples 108 that were not flagged as being samples of interest 112, but which the user wishes to have re-analyzed (which might be the case, e.g., if there was a problem in the experimental design or execution that becomes apparent later). The identity of such outliers may be preserved in an output data structure in order to ensure the reproducibility of experimental results.

[0033] The samples 108 may be arranged in wells on one or more plates 110-i (where i is an integer from 1 to a value N), such as microtiter plates. Each plate 110-i may include multiple wells; typical plates include from 96 to 6144 wells (96 ≤ W ≤ 6144, where W is the number of wells per plate). A given screening system 102 may analyze hundreds of plates 110-i in a given run.
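
By way of non-limiting illustration, the following sketch shows one way well positions on such plates might be addressed. It assumes the standard 2:3 row-to-column layout (8 x 12 for 96 wells, 16 x 24 for 384 wells, and so on) and row-major well ordering; the function names are hypothetical and are not part of the described embodiments.

```python
import math


def plate_shape(n_wells: int) -> tuple[int, int]:
    """Return (rows, cols) for a plate with a 2:3 layout (assumed)."""
    unit = math.isqrt(n_wells // 6)
    rows, cols = 2 * unit, 3 * unit
    if rows * cols != n_wells:
        raise ValueError(f"{n_wells} wells does not form a 2:3 plate")
    return rows, cols


def row_label(row: int) -> str:
    """Zero-based row index to letters: 0 -> A, 25 -> Z, 26 -> AA."""
    label = ""
    row += 1
    while row > 0:
        row, rem = divmod(row - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def well_name(index: int, n_wells: int) -> str:
    """Zero-based, row-major well index to a name such as 'H12'."""
    _, cols = plate_shape(n_wells)
    row, col = divmod(index, cols)
    return f"{row_label(row)}{col + 1:02d}"
```

Under these assumptions, the last well of a 96-well plate (index 95) is named H12, and the last well of a 384-well plate (index 383) is named P24.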

[0034] The screening system 102 may optionally include a robotic system 106 for retrieving and handling samples 108 from a sample library. The robotic system 106 may be configured to retrieve samples 108 from the library based on an identifier of the sample 108, and to load the sample 108 into the HTS instrument 104 for testing. In some embodiments, entire plates 110-i may be loaded directly from the library. Alternatively or in addition, new plates 110-i may be created from a combination of individual samples 108 in the library. In some embodiments, two or more samples 108 may be combined by the robotic system 106 in order to explore synergistic effects.
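
As a hedged sketch of how new plates might be composed from individual samples, the following example assigns a list of sample identifiers to consecutive wells on destination plates. The record layout (`source`, `dest_plate`, `dest_well`) and the function name are assumptions made for illustration; an actual robotic system would likely require its own worklist format.

```python
def rearray(sample_ids: list[str], wells_per_plate: int = 96) -> list[dict]:
    """Assign each sample to the next free well on new destination plates."""
    picks = []
    for n, sample_id in enumerate(sample_ids):
        # Fill one destination plate completely before starting the next.
        plate, well = divmod(n, wells_per_plate)
        picks.append(
            {"source": sample_id, "dest_plate": plate + 1, "dest_well": well}
        )
    return picks


# 100 hypothetical samples spill over onto a second 96-well plate.
picks = rearray([f"S{i}" for i in range(100)])
```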

[0035] The robotic system 106 may also be a part of the HTS instrument 104, where the robot is used to read data from the samples 108, and/or to move particular samples under another measurement device.

[0036] The hit matching system 116 receives data from the screening system 102 and analyzes the data in view of one or more hit criteria. The hit criteria may be particular values for certain parameters of the data, ratios or other mathematical combinations of two or more parameters, and/or logical combinations of parameters (e.g., by linking parameters together with Boolean operators). The hit criteria may be provided by or selected by a user interacting with a user device 120.

[0037] The hit matching system 116 may provide data visualizations to the user device 120, may identify the hits to the user device, may receive identifications of outliers, and may push a list of samples of interest for re-analysis or further analysis back to the screening system 102. This may be done to confirm that a hit was, in fact, a hit, to collect more information, to explore ways to improve the effectiveness of a sample, and to perform quality control.

[0038] The data visualizations displayed on the user device 120 may be used for several applications, including identifying outliers. In some embodiments, a heat map of a given plate may be provided. By virtue of the heat map, a user may be able to see patterns in the data, which may indicate problems with the experimental design or execution. For example, the heat map may show that different samples were heated unevenly, and samples in given regions with improper heating may be selected (individually or as a group) as outliers.
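
One plate-level pattern of this kind, an edge effect, can also be checked numerically. The following is a minimal sketch, assuming the plate readings are available as a rectangular list of rows with at least three rows and three columns; the function name is hypothetical, and the embodiments above describe visual inspection of a heat map, not this calculation.

```python
from statistics import mean


def edge_effect(plate: list[list[float]]) -> float:
    """Mean of the outermost wells minus mean of the interior wells."""
    rows, cols = len(plate), len(plate[0])
    edge, inner = [], []
    for r, row in enumerate(plate):
        for c, value in enumerate(row):
            on_edge = r in (0, rows - 1) or c in (0, cols - 1)
            (edge if on_edge else inner).append(value)
    return mean(edge) - mean(inner)


# Hypothetical 4 x 6 plate whose outer wells read higher than the middle.
plate = [
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    [2.0, 1.0, 1.0, 1.0, 1.0, 2.0],
    [2.0, 1.0, 1.0, 1.0, 1.0, 2.0],
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
]
```

A value far from zero suggests that outer wells were treated differently from inner wells, in which case the affected wells might be reviewed as candidate outliers.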

[0039] The hit matching system 116 may also store the results of each analysis, along with any identified hits, the hit criteria that gave rise to the hits, and any identified outliers in a database 118.

[0040] FIGs. 2A-2I depict examples of interfaces for data analysis, collection, and sample identification according to exemplary embodiments.

[0041] FIG. 2A depicts an example of an interface including a heat map showing the results of an analysis on a plate containing 384 samples. Each well’s color (as designated by the scale on the right) represents the degree to which a specified parameter of the respective sample exceeded that of a control (controls may be provided a priori, or may be selected from among the samples that were tested) and/or deviated from the median of the plate, thus emphasizing relative differences between samples. In some embodiments, the heat map includes samples and controls but excludes any wells identified as outliers.

[0042] Instead of viewing all samples on a plate, a user may wish to view only selected subgroups of samples. For example, FIG. 2B depicts a similar heatmap to the one depicted in FIG. 2A, but displays only the samples identified as control samples (i.e., experimental samples have been omitted).
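
The per-well quantity behind such a heat map can be sketched as follows, assuming the median-deviation variant and a mapping from well name to reading. The function name and data layout are illustrative assumptions; wells marked as outliers are left out of both the median and the result.

```python
from statistics import median
from typing import Collection, Mapping


def deviations_from_median(
    readings: Mapping[str, float],
    outliers: Collection[str] = (),
) -> dict[str, float]:
    """Deviation of each non-outlier well from the plate median."""
    kept = {well: v for well, v in readings.items() if well not in outliers}
    center = median(kept.values())
    return {well: v - center for well, v in kept.items()}


# Hypothetical readings; A04 has been marked as an outlier.
readings = {"A01": 1.0, "A02": 2.0, "A03": 3.0, "A04": 100.0}
deviations = deviations_from_median(readings, outliers={"A04"})
```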

[0043] FIG. 2C is an interface for performing integrated outlier marking. The interface shows a heatmap, similar to the one depicted in FIGs. 2A-2B, and also displays corresponding data for each sample in tabular format. The interface provides filtering options (along the left side of the interface) that allow samples to be displayed or removed from the heatmap and/or table.

[0044] By selecting one of the samples (either in the heatmap or the table) and then selecting the “Mark Outlier” button, the user can flag a certain sample as an outlier. As noted above, outliers may be excluded from further analysis, but the identity of each outlier may be preserved in an output data structure to ensure data reproducibility.

[0045] As previously noted, various values may be calculated for each of the samples, which may be based on one or more parameters from the data generated by the screening system. FIG. 2D depicts an example of an interface for creating calculated values based on combinations of the parameters. The interface provides selectable icons allowing constants, operators, and variables to be included; the variables may be automatically included based on a list of parameters in the data collected by the screening system (for example, the depicted interface includes options for selecting the “PAHBAH” and “Iodine” parameters).
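
A minimal sketch of how constants, operators, and variables might be composed into a calculated value is given below. The helper names (`var`, `const`, `combine`) are hypothetical; only the parameter names “PAHBAH” and “Iodine” come from the depicted interface, and the numeric readings are invented for illustration.

```python
import operator
from typing import Callable, Mapping

Params = Mapping[str, float]
Calculated = Callable[[Params], float]

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def var(name: str) -> Calculated:
    """A variable: look up a measured parameter by name."""
    return lambda params: params[name]


def const(value: float) -> Calculated:
    """A constant that ignores the measured parameters."""
    return lambda params: value


def combine(op: str, left: Calculated, right: Calculated) -> Calculated:
    """Apply an arithmetic operator to two sub-expressions."""
    fn = OPERATORS[op]
    return lambda params: fn(left(params), right(params))


# Hypothetical calculated value: PAHBAH signal relative to Iodine signal.
ratio = combine("/", var("PAHBAH"), var("Iodine"))
percent = combine("*", const(100.0), ratio)
sample = {"PAHBAH": 3.0, "Iodine": 1.5}
```

Representing each calculated value as a function of the parameter mapping keeps the definition separate from the data, so the same definition can be stored with the hit criteria and re-applied later.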

[0046] The calculated values from FIG. 2D may be used to perform hitpicking, such as when the calculated value is above or below a predefined and/or dynamic threshold. Alternatively or in addition, the parameters from the data may be directly compared to the threshold. FIGs. 2E-2F depict various interfaces suitable for comparing experimental results (and derived values) to a given threshold (marked on the graphs in FIG. 2F). In this example, any sample above the threshold may be identified as a hit. FIG. 2E depicts the examples in tabular format, whereas FIG. 2F depicts a different graph for each respective plate.

[0047] FIG. 2E also depicts various interactable options allowing for the selection of a hit picking method (threshold or linear regression; set to “threshold” in this case), a comparison operator, a threshold boundary, and the threshold. As depicted, the threshold may be set in comparison to the average of the data (e.g., a certain number of standard deviations above or below the mean). The graphs as shown in FIG. 2F may be displayed in the same interface as the selection criteria and/or tabular data, and may be updated in real-time as selections are made in the interface. Accordingly, the user can visualize the effects of setting the threshold at a given point.
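
The threshold method can be sketched as follows, assuming a dynamic threshold placed a chosen number of sample standard deviations above (or below) the mean. The function name, its arguments, and the example values are assumptions for illustration only.

```python
from statistics import mean, stdev


def threshold_hits(
    values: dict[str, float],
    n_sd: float = 2.0,
    above: bool = True,
) -> tuple[float, list[str]]:
    """Return the threshold and the identifiers of samples beyond it."""
    data = list(values.values())
    mu, sd = mean(data), stdev(data)
    threshold = mu + n_sd * sd if above else mu - n_sd * sd
    if above:
        hits = [sid for sid, v in values.items() if v > threshold]
    else:
        hits = [sid for sid, v in values.items() if v < threshold]
    return threshold, hits


# Nine hypothetical samples read 1.0 and one sample reads 10.0.
values = {f"S{i}": 1.0 for i in range(9)}
values["S9"] = 10.0
threshold, hits = threshold_hits(values, n_sd=2.0)
```

Returning the threshold together with the hits makes it straightforward to record the value that was actually applied alongside the resulting hit list.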

[0048] Still further, a sample may be identified as a hit when it deviates by more than a certain amount from a relationship identified in the data (e.g., based on a linear regression analysis). An example of an interface for identifying such “off-axis hits” is shown in FIG. 2G. In this example, the interactable options may allow the variables to be compared to be specified, and may allow the user to set an alpha level for the linear regression analysis. Graphs displayed in the interface (e.g., for each plate) may be updated in real-time to allow the user to visualize the effect of selecting different linear regression options.

[0049] In addition to comparing variables directly and/or in mathematical combination, a further interface (see FIG. 2H) allows a user to combine variables using logical (e.g., Boolean) operations. The interface provides interactable elements providing logical operators (e.g., AND, OR, XOR, NOT, etc.). A visualization element shows (e.g., on a Venn diagram) the results of the Boolean operation.
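
The logical combination of hit criteria can be sketched with set operations, assuming each criterion has already been evaluated to a set of sample identifiers. The function names and the identifiers below are illustrative assumptions.

```python
import operator

BINARY_OPS = {
    "AND": operator.and_,  # samples meeting both criteria
    "OR": operator.or_,    # samples meeting either criterion
    "XOR": operator.xor,   # samples meeting exactly one criterion
}


def combine_hits(op: str, left: set[str], right: set[str]) -> set[str]:
    """Combine two hit sets with a named Boolean operator."""
    return BINARY_OPS[op](left, right)


def negate(hits: set[str], all_samples: set[str]) -> set[str]:
    """NOT: every screened sample that is not in the hit set."""
    return all_samples - hits


# Hypothetical hit sets from two separate criteria.
all_samples = {"A01", "A02", "B01", "C05", "D09"}
criterion_1 = {"A01", "A02", "B01"}
criterion_2 = {"A02", "B01", "C05"}
```

The regions of a two-set Venn diagram correspond directly to these results: the overlap is the AND result, and the two non-overlapping crescents together form the XOR result.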

[0050] When a user is satisfied with their hit-picking (by specifying options in the interfaces of FIGs. 2A-2H), the various options may be saved alongside the list of hits and/or the original data. FIG. 2I depicts an example of an interface allowing the user to select a name for the criteria and/or options, and for the resulting hit group. This information may be saved in an output data structure, as described in more detail below.

[0051] In order to perform the hitpicking described above, the hit matching system 116 may receive an input 300 from the screening system 102; an exemplary input 300 data structure suitable for use with exemplary embodiments is depicted in FIG. 3A.

[0052] The input 300 may be divided into a number of samples 302-i, where i is an integer from 1 to a value N. Each portion of the input 300 data structure associated with a given sample 302-i may include an identifier 304-i for the i-th sample, and screening series data 306-i derived from an analysis of the i-th sample by the screening system 102. The identifier 304-i may identify the plate on which the sample is provided, and the location of the sample on the plate. The data may be in any suitable format, such as a comma-separated value (CSV) list, an array, a linked list, a matrix, a table, a custom data structure, etc.
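
For the CSV case, the input might be read into per-sample records along the following lines. This is a sketch only: the column names `plate` and `well`, the `SampleRecord` class, the identifier format, and the example values are all assumptions, since the source does not fix a concrete layout.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class SampleRecord:
    plate: str                      # plate on which the sample is provided
    well: str                       # location of the sample on the plate
    data: dict[str, float] = field(default_factory=dict)  # screening data

    @property
    def identifier(self) -> str:
        """Identifier combining plate and well (format is an assumption)."""
        return f"{self.plate}:{self.well}"


def parse_input(csv_text: str) -> list[SampleRecord]:
    """Parse one record per CSV row; non-identifier columns are numeric."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        plate, well = row.pop("plate"), row.pop("well")
        data = {name: float(value) for name, value in row.items()}
        records.append(SampleRecord(plate, well, data))
    return records


example = "plate,well,PAHBAH,Iodine\nP1,A01,0.82,0.10\nP1,A02,1.35,0.12\n"
records = parse_input(example)
```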

[0053] Using the information from the input 300 data structure (and, optionally, additional parameters and settings as shown in FIG. 4), the hit matching system 116 may generate an output 350. FIG. 3B depicts an example of an output data structure 350 suitable for use with exemplary embodiments. The hit matching system 116 may generate the output data structure 350 and may store it locally, display all or portions of the output data structure 350, and/or transmit the output data structure 350 so that it can be stored in a remote database.

[0054] The output 350 may include identifiers 354-i, where i is an integer from 1 to a value N, for any samples that have been identified by the hit matching system 116 as being hits 352 (the identifiers 354-i may correspond to the identifiers 304-i identified in the input 300 structure).

[0055] The output 350 may further include identifiers 358-i, where i is an integer from 1 to a value N, for any outliers 356 identified in the interface (the identifiers 358-i may correspond to the identifiers 304-i identified in the input 300 structure).

[0056] The output 350 may further include the hit criteria 360 that gave rise to the hits. The hit criteria 360 may correspond to the options selected in the interfaces previously described.

[0057] Optionally, the output 350 may include visualizations generated in the interfaces, such as one or more heat maps 362, generated threshold graphs, linear regression graphs, etc.
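
A hedged sketch of such an output structure is shown below, holding the hit identifiers, the outlier identifiers, and the hit criteria, and serializing them together. The class name, field names, criteria keys, and the choice of JSON are assumptions for illustration; visualizations are omitted for brevity.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HitOutput:
    hits: list[str] = field(default_factory=list)
    outliers: list[str] = field(default_factory=list)
    hit_criteria: dict = field(default_factory=dict)

    def mark_outlier(self, sample_id: str) -> None:
        """Exclude a sample from the hits but preserve its identity."""
        if sample_id in self.hits:
            self.hits.remove(sample_id)
        if sample_id not in self.outliers:
            self.outliers.append(sample_id)

    def to_json(self) -> str:
        """Serialize hits, outliers, and criteria as one document."""
        return json.dumps(asdict(self), indent=2)


output = HitOutput(
    hits=["P1:A01", "P1:B03"],
    hit_criteria={"method": "threshold", "operator": ">", "n_sd": 2.0},
)
output.mark_outlier("P1:B03")
```

Keeping the criteria and the outlier list in the same document as the hits means a later reader can see both which samples were selected and why others were set aside.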

[0058] Still further, the output 350 may include the data 306-i from the input 300, and/or may be stored with the input 300, so that the data and the resulting hit criteria can be accessed from a single location and the experimental results can be replicated.

[0059] As shown in the exemplary input/output specification of FIG. 4, the screening data 306-i may be provided from the screening system 102 to hit matching logic 400, which may be provided (e.g.) on the hit matching system 116.

[0060] The hit matching system 116 may also make use of (locally or remotely-stored) parameters 404. The parameters 404 may include, for example, a list of available hit criteria and/or options (e.g., parameters available for consideration in hit picking, available types of comparisons such as a threshold comparison or linear regression, mathematical operators that may be applied to combine the parameters, etc.).

[0061] The parameters 404 may further include Boolean operators 408 that can be used to logically combine different parameters. The hit matching system may further store Boolean logic allowing the parameters to be combined according to the Boolean operators 408.

[0062] The parameters 404 may further include heat map parameters 410, which may define (e.g.) a list of colors and/or a scale that defines how the heat map should be generated or displayed. The heat map parameters 410 may include multiple different sets of parameters, so that different types of heat maps can be generated from the same data.
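
A set of heat map parameters of this kind might look like the following sketch, which maps a value onto one of a list of colors over a fixed scale. The class name, field names, and the equal-width binning are assumptions; other scales (for example, logarithmic) would be equally consistent with the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeatMapParams:
    colors: tuple[str, ...]  # ordered from low to high
    scale_min: float
    scale_max: float

    def color_for(self, value: float) -> str:
        """Pick a color by equal-width binning; clamp out-of-range values."""
        clamped = min(max(value, self.scale_min), self.scale_max)
        fraction = (clamped - self.scale_min) / (self.scale_max - self.scale_min)
        index = min(int(fraction * len(self.colors)), len(self.colors) - 1)
        return self.colors[index]


# Two hypothetical parameter sets applied to the same data.
diverging = HeatMapParams(("blue", "white", "red"), -1.0, 1.0)
coarse = HeatMapParams(("white", "red"), 0.0, 10.0)
```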

[0063] After processing the screening data 306-i in view of the parameters 404, the hit matching logic 400 may generate an output 350. The output 350 may be provided to a display 414 to display the data in tabular and/or graphical format, and/or may be uploaded to a database 118 of experimental results.

[0064] FIG. 5 is a block diagram depicting logic deployed on the devices of the environment 100 according to an exemplary embodiment.

[0065] The HTS instrument 104 includes a measurement device 502 suitable for generating data with respect to a collection of samples. For instance, the measurement device 502 may include a biochemical analysis device suitable for determining how a given sample reacts to a biochemical stimulus, whether a sample exhibits activity against a biological target, and/or whether the sample modulates a given biomolecular pathway. Various parameters may be measured by the measurement device 502, and these parameters may be used by the hit matching system 116 to perform hit matching.

[0066] The HTS instrument 104 may include a memory 504, which may be any suitable non-transitory computer-readable medium (e.g., RAM, ROM, an HDD, an SSD, flash memory, etc.). The memory 504 may store, as one or more instructions executable by a processor on the HTS instrument 104 (not shown), analysis logic 506 for performing an analysis of the samples using the measurement device 502.

[0067] The memory 504 may further store data 402, which may be screening measurements generated by the analysis logic 506. The data 402 may be provided to the hit matching system 116 over a network 510 via corresponding hardware network interfaces 508, 512.

[0068] The hit matching system 116 may also include a memory 514 (e.g., a non-transitory computer-readable medium), which may store the parameters 404 used to generate visualizations and/or identify hits. The memory 514 may store, as one or more instructions executable by a processor on the hit matching system 116 (not shown), logic 516 for retrieving screening data, identifying hits and outliers, and preserving the information used to identify the hits and outliers in an output for future use.

[0069] The output may be provided, in one embodiment, via the network interface 512 to a corresponding hardware network interface 530 on a user device 120. The user device 120 may be used to visualize the screening data, set options for hit picking, identify outliers, etc.

[0070] The logic 506 and 516 is described in more detail in connection with the flowchart 600 of FIG. 6. For example, the analysis logic 506 may perform the actions described at block 604; the retrieval logic 518 may perform the actions described at block 608; the hit logic 520 may perform the actions described at blocks 610-616; the Boolean logic 522 may be used in connection with block 612; the outlier logic 524 may perform the actions described at block 616; and the output generation logic 526 may perform the actions described at block 618.

[0071] With reference to FIG. 6, processing may start at block 602.

[0072] At block 604, the screening instrument may analyze a group of samples according to a design or configuration of the screening instrument. The measurements performed by the screening instrument may be collected and organized as screening data and stored in a memory of the screening instrument.

[0073] At block 606, the hit matching system may interface with the screening instrument, such as by establishing a connection to the screening instrument over a network. In some embodiments, the hit matching system may be connected directly (in a wired and/or wireless manner) to the screening instrument. In other embodiments, the hit matching system may be integrated with the screening instrument, and thus the two devices may already be connected.

[0074] At block 608, the hit matching system may retrieve the measurements stored at the screening instrument via the interface established at block 606. The screening instrument may provide the data to the hit matching system using the input format shown in FIG. 3A.

[0075] At block 610, the hit matching system may receive one or more hit criteria. For example, the hit criteria may be specified as selectable options via a user device interfaced with the hit matching system. The hit criteria may be received dynamically from the user device, and/or may be pre-stored (locally or remotely) and retrieved for use. The hit criteria may include one or more parameters, combined by mathematical and/or logical operators that may also be specified via the user interface or through pre-stored options. The hit matching system may apply the hit criteria in conjunction with the mathematical and/or logical criteria, and one or more hit matching criteria (threshold comparisons, linear regressions, etc.) to identify which samples correspond to hits (block 612).
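The disclosure names linear regression as one available type of comparison but does not specify how it is applied. One possible reading, offered purely as an assumption, is to fit a least-squares line through two measured parameters across all samples and to treat samples lying at least a chosen distance above the line as hits. The function names and the `min_residual` cutoff below are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_hits(
    samples: Dict[str, Tuple[float, float]], min_residual: float
) -> List[str]:
    """Return samples whose y value exceeds the fitted line by min_residual.

    samples maps a sample identifier to an (x, y) pair of measurements.
    """
    ids = list(samples)
    xs = [samples[i][0] for i in ids]
    ys = [samples[i][1] for i in ids]
    slope, intercept = fit_line(xs, ys)
    return [
        i
        for i, x, y in zip(ids, xs, ys)
        if y - (slope * x + intercept) >= min_residual
    ]


# Made-up measurements: four samples on a line and one well above it.
samples = {
    "A01": (1.0, 1.0),
    "B02": (2.0, 2.0),
    "C03": (3.0, 3.0),
    "D04": (4.0, 4.0),
    "E05": (5.0, 9.0),
}
hits = regression_hits(samples, min_residual=1.0)
```

A more robust variant might fit the line on control wells only, so that strong hits do not pull the fit toward themselves; that choice is left open here.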

[0076] At block 614, the system may forward a list of the hits to a display, such as a display of a user device, for review. The hit list may be displayed in tabular and/or graphical format. Optionally, at block 616, the user may select one or more of the hits as outliers that should be excluded from further analysis. In some embodiments, the user may select one or more non-hits as outliers that should be included in future analysis. The outliers may be transmitted back to the hit matching system and may be stored in a data structure for future reference.
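A minimal sketch of how user-selected outliers might be applied to a hit list is given below. The function name `apply_outliers` and the rule that an exclusion takes precedence over an inclusion are assumptions made for illustration.

```python
from typing import Iterable, List


def apply_outliers(
    hits: Iterable[str], excluded: Iterable[str], included: Iterable[str]
) -> List[str]:
    """Return the list of samples to carry forward.

    excluded: hits the user marked as outliers to drop.
    included: non-hits the user marked as outliers to keep.
    Original hit order is preserved; included samples are appended.
    """
    excluded_set = set(excluded)
    result = [h for h in hits if h not in excluded_set]
    for sample_id in included:
        # Assumption: an explicit exclusion wins over an inclusion.
        if sample_id not in excluded_set and sample_id not in result:
            result.append(sample_id)
    return result


# Made-up identifiers for illustration.
final_list = apply_outliers(
    hits=["A01", "A03", "B02"], excluded=["A03"], included=["C07"]
)
```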

[0077] At block 618, the hit matching system may generate an output, such as the output 350 depicted in FIG. 3B. The output may include a list of hits and a list of outliers, a list of hit criteria that gave rise to the list of hits, heat maps and/or other visualizations, and optionally any data used to generate the hit list (in some embodiments the output may be transmitted with or associated with the input retrieved at block 608 so that the screening data need not be stored with the output).
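For illustration, an output of this kind could be represented as a small record that keeps the hit list, the outlier list, and the criteria that produced them together, and that can be serialized for storage or transmission. The field names and the use of JSON are assumptions; the disclosure does not prescribe a particular encoding.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class HitOutput:
    """Hypothetical container mirroring the contents listed for the output."""
    hits: List[str]
    outliers: List[str]
    hit_criteria: List[str]
    visualizations: List[str] = field(default_factory=list)
    # Either embed the screening data or point at the stored input.
    screening_data: Optional[Dict[str, Dict[str, float]]] = None
    input_reference: Optional[str] = None

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "HitOutput":
        return cls(**json.loads(text))


# Made-up values for illustration.
output = HitOutput(
    hits=["A01", "B02"],
    outliers=["A03"],
    hit_criteria=["inhibition_pct >= 70 AND viability_pct >= 80"],
    input_reference="run-0001",
)
restored = HitOutput.from_json(output.to_json())
```

Storing the criteria as part of the record is what allows a later reader to reproduce the hit list from the same screening data.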

[0078] At block 620, the output data structure may be transmitted to the screening instrument, to a suitable database and/or to a display. Accordingly, the hit matching system may format the data appropriately for the type of output device and transmit the data to the appropriate device. If the data is to be stored in a database (e.g., a database of experimental results), the hit matching system may identify a suitable database at block 620 and transmit the results to a corresponding database server/service for storage in the database.
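One way to format the same hit list differently per destination is a simple lookup of formatter functions, sketched below. The destination names (`display`, `database`, `instrument`) and the three output shapes are hypothetical; actual instruments and databases would define their own formats.

```python
from typing import Callable, Dict, List


def format_for_display(hits: List[str]) -> str:
    """Single-column table as plain text, with a header row."""
    return "\n".join(["sample_id"] + hits)


def format_for_database(hits: List[str]) -> List[Dict[str, object]]:
    """One row per hit, ready for insertion by a database client."""
    return [{"sample_id": h, "is_hit": True} for h in hits]


def format_for_instrument(hits: List[str]) -> str:
    """Comma-separated list of samples to rescreen."""
    return ",".join(hits)


FORMATTERS: Dict[str, Callable[[List[str]], object]] = {
    "display": format_for_display,
    "database": format_for_database,
    "instrument": format_for_instrument,
}


def prepare_output(hits: List[str], destination: str) -> object:
    """Format the hit list for the given destination type."""
    try:
        formatter = FORMATTERS[destination]
    except KeyError:
        raise ValueError(f"unsupported destination: {destination}") from None
    return formatter(hits)


hits = ["A01", "B02"]
cherry_pick = prepare_output(hits, "instrument")
```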

[0079] If the output is used to rescreen the identified hits, then at block 620 the hit matching system may forward the output data structure to the screening instrument, and at block 622 the screening instrument may rescreen the hits identified in the output transmitted in block 620. The screening instrument may rescreen the hits automatically, without user intervention, and/or based on manual input from a user. The screening process may be similar to the one applied at block 604. Processing may then proceed to block 624 and terminate.

[0080] Although FIGs. 1-6 depict specific components in a particular configuration, it is contemplated that other configurations may also be used. For example, any or all of the analysis instrument 102, the A&I system 112, and the database 114 may be integrated in a single device, or aspects of these systems and instruments may be separated into distinct devices. Various logic modules may be split into multiple modules or combined into a single module and may be deployed on a single device (which may or may not be the precise device noted above) or split between multiple devices.

[0081] The above-described methods may be embodied as instructions on a computer readable medium or as part of a computing architecture. FIG. 7 illustrates an embodiment of an exemplary computing architecture 700 suitable for implementing various embodiments as previously described. In one embodiment, the computing architecture 700 may comprise or be implemented as part of an electronic device, such as a computer 701. The embodiments are not limited in this context.

[0082] As used in this application, the terms “system” and “component” are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution, examples of which are provided by the exemplary computing architecture 700. For example, a component can be, but is not limited to being, a process running on a processor, a processor, a hard disk drive, multiple storage drives (of optical and/or magnetic storage medium), an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a component. One or more components can reside within a process and/or thread of execution, and a component can be localized on one computer and/or distributed between two or more computers. Further, components may be communicatively coupled to each other by various types of communications media to coordinate operations. The coordination may involve the uni-directional or bi-directional exchange of information. For instance, the components may communicate information in the form of signals communicated over the communications media. The information can be implemented as signals allocated to various signal lines. In such allocations, each message is a signal. Further embodiments, however, may alternatively employ data messages. Such data messages may be sent across various connections. Exemplary connections include parallel interfaces, serial interfaces, and bus interfaces.

[0083] The computing architecture 700 includes various common computing elements, such as one or more processors, multi-core processors, co-processors, memory units, chipsets, controllers, peripherals, interfaces, oscillators, timing devices, video cards, audio cards, multimedia input/output (I/O) components, power supplies, and so forth. The embodiments, however, are not limited to implementation by the computing architecture 700.

[0084] As shown in FIG. 7, the computing architecture 700 comprises a processing unit 702, a system memory 704 and a system bus 706. The processing unit 702 can be any of various commercially available processors, including without limitation AMD® Athlon®, Duron® and Opteron® processors; ARM® application, embedded and secure processors; IBM® and Motorola® DragonBall® and PowerPC® processors; IBM and Sony® Cell processors; Intel® Celeron®, Core (2) Duo®, Itanium®, Pentium®, Xeon®, and XScale® processors; and similar processors. Dual microprocessors, multi-core processors, and other multi-processor architectures may also be employed as the processing unit 702.

[0085] The system bus 706 provides an interface for system components including, but not limited to, the system memory 704 to the processing unit 702. The system bus 706 can be any of several types of bus structure that may further interconnect to a memory bus (with or without a memory controller), a peripheral bus, and a local bus using any of a variety of commercially available bus architectures. Interface adapters may connect to the system bus 706 via a slot architecture. Example slot architectures may include without limitation Accelerated Graphics Port (AGP), Card Bus, (Extended) Industry Standard Architecture ((E)ISA), Micro Channel Architecture (MCA), NuBus, Peripheral Component Interconnect (Extended) (PCI(X)), PCI Express, Personal Computer Memory Card International Association (PCMCIA), and the like.

[0086] The computing architecture 700 may comprise or implement various articles of manufacture. An article of manufacture may comprise a computer-readable storage medium to store logic. Examples of a computer-readable storage medium may include any tangible media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or rewriteable memory, and so forth. Examples of logic may include executable computer program instructions implemented using any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, object-oriented code, visual code, and the like. Embodiments may also be at least partly implemented as instructions contained in or on a non-transitory computer-readable medium, which may be read and executed by one or more processors to enable performance of the operations described herein.

[0087] The system memory 704 may include various types of computer-readable storage media in the form of one or more higher speed memory units, such as read-only memory (ROM), random-access memory (RAM), dynamic RAM (DRAM), Double-Data-Rate DRAM (DDRAM), synchronous DRAM (SDRAM), static RAM (SRAM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, polymer memory such as ferroelectric polymer memory, ovonic memory, phase change or ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, magnetic or optical cards, an array of devices such as Redundant Array of Independent Disks (RAID) drives, solid state memory devices (e.g., USB memory, solid state drives (SSD)), and any other type of storage media suitable for storing information. In the illustrated embodiment shown in FIG. 7, the system memory 704 can include non-volatile memory 708 and/or volatile memory 710. A basic input/output system (BIOS) can be stored in the non-volatile memory 708.

[0088] The computing architecture 700 may include various types of computer-readable storage media in the form of one or more lower speed memory units, including an internal (or external) hard disk drive (HDD) 712, a magnetic floppy disk drive (FDD) 714 to read from or write to a removable magnetic disk 716, and an optical disk drive 718 to read from or write to a removable optical disk 720 (e.g., a CD-ROM or DVD). The HDD 712, FDD 714 and optical disk drive 718 can be connected to the system bus 706 by an HDD interface 722, an FDD interface 724 and an optical drive interface 726, respectively. The HDD interface 722 for external drive implementations can include at least one or both of Universal Serial Bus (USB) and IEEE 1394 interface technologies.

[0089] The drives and associated computer-readable media provide volatile and/or nonvolatile storage of data, data structures, computer-executable instructions, and so forth. For example, a number of program modules can be stored in the drives and memory units 708, 712, including an operating system 728, one or more application programs 730, other program modules 732, and program data 734. In one embodiment, the one or more application programs 730, other program modules 732, and program data 734 can include, for example, the various applications and/or components of the messaging system 500.

[0090] A user can enter commands and information into the computer 701 through one or more wire/wireless input devices, for example, a keyboard 736 and a pointing device, such as a mouse 738. Other input devices may include microphones, infra-red (IR) remote controls, radio-frequency (RF) remote controls, game pads, stylus pens, card readers, dongles, fingerprint readers, gloves, graphics tablets, joysticks, keyboards, retina readers, touch screens (e.g., capacitive, resistive, etc.), trackballs, trackpads, sensors, styluses, and the like. These and other input devices are often connected to the processing unit 702 through an input device interface 740 that is coupled to the system bus 706 but can be connected by other interfaces such as a parallel port, IEEE 1394 serial port, a game port, a USB port, an IR interface, and so forth.

[0091] A monitor 742 or other type of display device is also connected to the system bus 706 via an interface, such as a video adaptor 744. The monitor 742 may be internal or external to the computer 701. In addition to the monitor 742, a computer typically includes other peripheral output devices, such as speakers, printers, and so forth.

[0092] The computer 701 may operate in a networked environment using logical connections via wire and/or wireless communications to one or more remote computers, such as a remote computer 744. The remote computer 744 can be a workstation, a server computer, a router, a personal computer, portable computer, microprocessor-based entertainment appliance, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer 701, although, for purposes of brevity, only a memory/storage device 746 is illustrated. The logical connections depicted include wire/wireless connectivity to a local area network (LAN) 748 and/or larger networks, for example, a wide area network (WAN) 750. Such LAN and WAN networking environments are commonplace in offices and companies, and facilitate enterprise-wide computer networks, such as intranets, all of which may connect to a global communications network, for example, the Internet.

[0093] When used in a LAN networking environment, the computer 701 is connected to the LAN 748 through a wire and/or wireless communication network interface or adaptor 752. The adaptor 752 can facilitate wire and/or wireless communications to the LAN 748, which may also include a wireless access point disposed thereon for communicating with the wireless functionality of the adaptor 752.

[0094] When used in a WAN networking environment, the computer 701 can include a modem 754, or is connected to a communications server on the WAN 750, or has other means for establishing communications over the WAN 750, such as by way of the Internet. The modem 754, which can be internal or external and a wire and/or wireless device, connects to the system bus 706 via the input device interface 740. In a networked environment, program modules depicted relative to the computer 701, or portions thereof, can be stored in the remote memory/storage device 746. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers can be used.

[0095] The computer 701 is operable to communicate with wire and wireless devices or entities using the IEEE 802 family of standards, such as wireless devices operatively disposed in wireless communication (e.g., IEEE 802.11 over-the-air modulation techniques). This includes at least Wi-Fi (or Wireless Fidelity), WiMax, and Bluetooth™ wireless technologies, among others. Thus, the communication can be a predefined structure as with a conventional network or simply an ad hoc communication between at least two devices. Wi-Fi networks use radio technologies called IEEE 802.11x (a, b, g, n, etc.) to provide secure, reliable, fast wireless connectivity. A Wi-Fi network can be used to connect computers to each other, to the Internet, and to wire networks (which use IEEE 802.3-related media and functions).

[0096] FIG. 8 is a block diagram depicting an exemplary communications architecture 800 suitable for implementing various embodiments as previously described. The communications architecture 800 includes various common communications elements, such as a transmitter, receiver, transceiver, radio, network interface, baseband processor, antenna, amplifiers, filters, power supplies, and so forth. The embodiments, however, are not limited to implementation by the communications architecture 800.

[0097] As shown in FIG. 8, the communications architecture 800 includes one or more clients 802 and servers 804. The clients 802 may implement the client device 510. The servers 804 may implement the server device 526. The clients 802 and the servers 804 are operatively connected to one or more respective client data stores 806 and server data stores 808 that can be employed to store information local to the respective clients 802 and servers 804, such as cookies and/or associated contextual information.

[0098] The clients 802 and the servers 804 may communicate information between each other using a communications framework 810. The communications framework 810 may implement any well-known communications techniques and protocols. The communications framework 810 may be implemented as a packet-switched network (e.g., public networks such as the Internet, private networks such as an enterprise intranet, and so forth), a circuit-switched network (e.g., the public switched telephone network), or a combination of a packet-switched network and a circuit-switched network (with suitable gateways and translators).

[0099] The communications framework 810 may implement various network interfaces arranged to accept, communicate, and connect to a communications network. A network interface may be regarded as a specialized form of an input/output interface. Network interfaces may employ connection protocols including without limitation direct connect, Ethernet (e.g., thick, thin, twisted pair 10/100/1000 Base T, and the like), token ring, wireless network interfaces, cellular network interfaces, IEEE 802.11a-x network interfaces, IEEE 802.16 network interfaces, IEEE 802.20 network interfaces, and the like. Further, multiple network interfaces may be used to engage with various communications network types. For example, multiple network interfaces may be employed to allow for the communication over broadcast, multicast, and unicast networks. Should processing requirements dictate a greater amount of speed and capacity, distributed network controller architectures may similarly be employed to pool, load balance, and otherwise increase the communicative bandwidth required by the clients 802 and the servers 804. A communications network may be any one or a combination of wired and/or wireless networks including without limitation a direct interconnection, a secured custom connection, a private network (e.g., an enterprise intranet), a public network (e.g., the Internet), a Personal Area Network (PAN), a Local Area Network (LAN), a Metropolitan Area Network (MAN), an Operating Missions as Nodes on the Internet (OMNI), a Wide Area Network (WAN), a wireless network, a cellular network, and other communications networks.

[0100] The components and features of the devices described above may be implemented using any combination of discrete circuitry, application specific integrated circuits (ASICs), logic gates and/or single chip architectures. Further, the features of the devices may be implemented using microcontrollers, programmable logic arrays and/or microprocessors or any combination of the foregoing where suitably appropriate. It is noted that hardware, firmware and/or software elements may be collectively or individually referred to herein as “logic” or “circuit.”

[0101] It will be appreciated that the exemplary devices shown in the block diagrams described above may represent one functionally descriptive example of many potential implementations. Accordingly, division, omission or inclusion of block functions depicted in the accompanying figures does not imply that the hardware components, circuits, software and/or elements for implementing these functions would necessarily be divided, omitted, or included in embodiments.

[0102] At least one computer-readable storage medium may include instructions that, when executed, cause a system to perform any of the computer-implemented methods described herein.

[0103] Some embodiments may be described using the expression “one embodiment” or “an embodiment” along with their derivatives. These terms mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment. Moreover, unless otherwise noted the features described above are recognized to be usable together in any combination. Thus, any features discussed separately may be employed in combination with each other unless it is noted that the features are incompatible with each other.

[0104] With general reference to notations and nomenclature used herein, the detailed descriptions herein may be presented in terms of program procedures executed on a computer or network of computers. These procedural descriptions and representations are used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art.

[0105] A procedure is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. These operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It proves convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be noted, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to those quantities.

[0106] Further, the manipulations performed are often referred to in terms, such as adding or comparing, which are commonly associated with mental operations performed by a human operator. No such capability of a human operator is necessary, or desirable in most cases, in any of the operations described herein, which form part of one or more embodiments. Rather, the operations are machine operations. Useful machines for performing operations of various embodiments include general purpose digital computers or similar devices.

[0107] Some embodiments may be described using the expression “coupled” and “connected” along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, some embodiments may be described using the terms “connected” and/or “coupled” to indicate that two or more elements are in direct physical or electrical contact with each other. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.

[0108] Various embodiments also relate to apparatus or systems for performing these operations. This apparatus may be specially constructed for the required purpose or it may comprise a general purpose computer as selectively activated or reconfigured by a computer program stored in the computer. The procedures presented herein are not inherently related to a particular computer or other apparatus. Various general purpose machines may be used with programs written in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description given.

[0109] It is emphasized that the Abstract of the Disclosure is provided to allow a reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment. In the appended claims, the terms "including" and "in which" are used as the plain-English equivalents of the respective terms "comprising" and "wherein," respectively. Moreover, the terms "first," "second," "third," and so forth, are used merely as labels, and are not intended to impose numerical requirements on their objects.

[0110] What has been described above includes examples of the disclosed architecture. It is, of course, not possible to describe every conceivable combination of components and/or methodologies but one of ordinary skill in the art may recognize that many further combinations and permutations are possible. Accordingly, the novel architecture is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.