Title:
VOC-BASED NARCOLEPSY DIAGNOSTIC METHOD
Document Type and Number:
WIPO Patent Application WO/2014/180974
Kind Code:
A1
Abstract:
The invention provides a method, apparatus and kit for determining whether a patient suffers narcolepsy, the method comprising the detection of at least one volatile organic compound (VOC) in a sample obtained from a patient.

Inventors:
DOMÍNGUEZ ORTEGA LUIS (ES)
Application Number:
PCT/EP2014/059506
Publication Date:
November 13, 2014
Filing Date:
May 08, 2014
Assignee:
RAMEM S A (ES)
International Classes:
G01N27/62; G01N33/497
Domestic Patent References:
WO2008021617A12008-02-21
WO2001014555A12001-03-01
WO2006085648A12006-08-17
WO2011003922A12011-01-13
WO2012122128A22012-09-13
WO1994004705A11994-03-03
WO1995033848A11995-12-14
WO1998029563A11998-07-09
WO2000057182A12000-09-28
WO2010133714A12010-11-25
WO2008003797A12008-01-10
Foreign References:
US5996586A1999-12-07
US6312390B12001-11-06
CN2430111Y2001-05-16
Other References:
NARCOLEPSY UK: "Dog sniffs out trouble for narcolepsy sufferer", CATNAP NEWSLETTER OF NARCOLEPSY UK, 1 November 2011 (2011-11-01), pages 1 - 8, XP007922833
WESTHOFF M ET AL: "Ion mobility spectrometry for the detection of volatile organic compounds in exhaled breath of patients with lung cancer: results of a pilot study", THORAX, vol. 64, no. 9, September 2009 (2009-09-01), pages 744 - 748, XP002729436, ISSN: 0040-6376
MIEKISCH W ET AL: "Diagnostic potential of breath analysis - focus on volatile organic compounds", CLINICA CHIMICA ACTA, ELSEVIER BV, AMSTERDAM, NL, vol. 347, no. 1-2, 1 September 2004 (2004-09-01), pages 25 - 39, XP002556502, ISSN: 0009-8981, [retrieved on 20040622], DOI: 10.1016/J.CCCN.2004.04.023
MIEKISCH ET AL., CLINICA CHIMICA ACTA, vol. 347, 2004, pages 25 - 39
LINDINGER ET AL., INT J MASS SPECTROM ION PROCESS, vol. 173, 1998, pages 191 - 241
LINDINGER ET AL., ADV GAS PHASE ION CHEM, vol. 4, 2001, pages 191 - 241
ELLS ET AL., J. ENVIRON. MONIT., vol. 2, 2000, pages 393 - 397
DI NATALE ET AL., BIOSENSORS AND BIOELECTRONICS, vol. 18, 2003, pages 1209 - 1218
GORDON ET AL., CLIN CHEM, vol. 31, no. 8, 1985, pages 1278 - 1282
WEHINGER ET AL., INTER J MASS SPECTROMETRY, vol. 265, 2007, pages 49 - 59
PENG ET AL., NATURE NANOTECH, vol. 4, 2009, pages 669 - 673
PHILLIPS ET AL., CANCER BIOMARKERS, vol. 3, 2007, pages 95 - 109
TRYGG, J., WOLD, S. J. CHEMOMETRICS, vol. 16, 2002, pages 119 - 128
"International classification of sleep disorders (ICSD-2)", DIAGNOSTIC AND CODING MANUAL, 2005
ABRAHAM, SAVITZKY; M. J. E. GOLAY: "Smoothing and Differentiation of Data by Simplified Least Squares Procedures", ANALYTICAL CHEMISTRY, vol. 36, no. 8, 1 July 1964 (1964-07-01), pages 1627 - 1639
PAUL HC EILERS; HANS FM BOELENS, BASELINE CORRECTION WITH ASYMMETRIC LEAST SQUARES SMOOTHING, 2005
GIORGIO TOMASI; FRANCESCO SAVORANI; SOREN B. ENGELSEN: "Icoshift: An Effective Tool for the Alignment of Chromatographic Data", JOURNAL OF CHROMATOGRAPHY A, vol. 1218, no. 43, 28 October 2011 (2011-10-28)
COLIN A. SMITH ET AL.: "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification", ANALYTICAL CHEMISTRY, vol. 78, no. 3, 1 February 2006 (2006-02-01)
HAIZHOU WANG; MINGZHOU SONG: "Ckmeans.1d.dp: Optimal K-means Clustering in One Dimension by Dynamic Programming", A PEER-REVIEWED; OPEN-ACCESS PUBLICATION OF THE R FOUNDATION FOR STATISTICAL COMPUTING, 19 March 2013 (2013-03-19), pages 29
RONALD A. FISHER: "Annals of Human Genetics", vol. 7, 1936, article "The Use of Multiple Measurements in Taxonomic Problems", pages: 179 - 188
SVANTE WOLD; KIM ESBENSEN; PAUL GELADI: "Principal Component Analysis", CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, vol. 2, no. I, 1987, pages 37 - 52
Attorney, Agent or Firm:
LORCA MELTON, Miguel (1ºB, Tres Cantos, ES)
Claims:
CLAIMS

1. - Method for determining whether a patient suffers narcolepsy comprising the detection of at least one VOC in a sample obtained from a patient.

2. - Method according to claim 1 comprising the steps of:

- obtaining a sample from a subject;

- detecting the levels of at least one VOC in said sample in order to obtain the VOC profile of the sample; and

- comparing said VOC profile of the sample with a reference VOC profile.

3. - Method according to any of claims 1 or 2, wherein detecting comprises at least one technique selected from the group consisting of electronic noses, canine scent, gas chromatography, mass spectroscopy, quartz crystal microbalance, surface acoustic waves, resistive or capacitive sensors, liquid chromatography, differential mobility analysis and ion mobility spectroscopy, capillary electrophoresis and infrared detection.

4. - Method according to claim 3, wherein said ion mobility spectroscopy comprises drift time IMS and field asymmetric IMS.

5. - Method according to claim 3, wherein the detection step is performed by a properly trained dog.

6. - Method according to any of claims 1 to 5, comprising two or more orthogonal detection means.

7.- Method according to any of claims 1 to 6, wherein the sample is selected from the group consisting of blood, breath, swab, sweat, urine, feces, semen, vaginal discharge, hair, nails, soft body tissue and mucus.

8.- Method according to any of claims 1 to 7, comprising the following steps:

- obtaining a sample from a subject by washing a part of the subject and rubbing the subject with a gauze;

- detecting the levels of at least one VOC in said gauze in order to obtain a sample VOC profile, with a technique selected from the group consisting of GC-MS, IMS, DMA and a properly trained dog; and

- comparing said sample VOC profile with a reference VOC profile, and thereby determining whether the subject suffers narcolepsy.

9.- Method according to any of claims 1 to 8, wherein said reference VOC profile has been created by analyzing the VOC profile of at least one subject known to suffer narcolepsy.

10. - Method according to any of claims 1 to 8, wherein said reference VOC profile has been created by analyzing the VOC profile of at least one subject known not to suffer narcolepsy.

11. - Method according to any of claims 1 to 10, wherein said reference VOC profile is obtained by:

- obtaining 2 or more chromatograms or the result of any other analytical technique aimed to detect the odor from patient samples for which the presence or absence of narcolepsy is known;

- preprocessing said chromatograms in order to homogenize their retention times;

- clustering peaks with similar retention times;

- selecting statistically significant clusters; and

- analyzing and classifying said statistically significant clusters according to their correlation with the presence or absence of narcolepsy.

12. - Method according to claim 11, wherein said analysis is performed by using at least one method selected from the group consisting of artificial neural networks, multi-layer perceptron, generalized regression neural network, fuzzy inference systems, self-organizing map, radial basis function, genetic algorithms, neuro-fuzzy systems, adaptive resonance theory and statistical methods including principal component analysis, partial least squares, multiple linear regression, principal component regression, discriminant function analysis including linear discriminant analysis, and cluster analysis including nearest neighbor.

13. - Method according to any of claims 1 to 12, wherein at least one of said VOCs is a substance with a molecular weight below 1000 Dalton.

14. - Method according to any of claims 1 to 13, wherein at least one of said VOCs is selected from the group consisting of C1-C20 linear or branched hydrocarbons, aromatic hydrocarbons or substances that comprise an aromatic moiety, acids, esters, ketones, aldehydes and mixtures thereof.

15. - Method according to any of claims 1 to 14, wherein at least one of said VOCs is selected from the group consisting of 2,4-bis(1,1-dimethylethyl)-phenol, 2-methyl-6-(2-propenyl)-phenol, 2-ethyl-1,4-dimethylbenzene, 4'-hydroxy-acetophenone, 1-ethyl-3-methyl-benzene and benzyl-2-chloroethyl sulfone.

16. - Method according to any of claims 1 to 15, wherein at least one of said VOCs is selected from the group consisting of decane, hexadecane, 4-methyl-1-heptanol, acetic acid octyl ester, decane, 2-decanone, 3-methyl-decane, octanal, pentadecanenitrile, and tetradecene, 6-methyl-1-heptanol, 2-ethyl-1-hexanol, benzaldehyde, tetrahydrofuran, isopropyl myristate, 2,2,4-trimethyl-pentanenitrile, 2,2,4-trimethyl-3-carboxyisopropyl-isobutyl ester pentanoic acid, phenol, styrene, 4-methyl-tetradecane, toluene, tridecane, 6-methyl-tridecane, undecane, triethylamine, 2-hydroxy benzaldehyde, decanal, prednisolone, prednisolone acetate and 4,4,8,8-tetramethyloctahydro-4a,7-methano-4aH-naphth[1,8a-b]oxirene.

17.- Method according to claim 14, wherein said aromatic hydrocarbon or substance that comprises an aromatic moiety is substituted with one or more moieties selected from the group consisting of C1-C6 alkyl, C1-C6 alkenyl, C1-C6 alkynyl, hydroxyl, C1-C6 alkoxy, ketones and sulfones.

18. - Apparatus comprising a sensor capable of detecting VOCs and comparative means by which a sample's VOC profile is compared with at least one reference VOC profile.

19. - Kit for obtaining a narcolepsy diagnosis sample that comprises sterile material, a sealed container and instructions for the manipulation of materials and how to take and manipulate the sample.

20.- Kit according to claim 19, comprising a deodorized sterile gauze, tweezers, soap and sample container.

21. - Use of a kit comprising sterile material and a sealed container for narcolepsy diagnosis.

22. - Use of VOC detection for the determination of whether a patient suffers narcolepsy.

23.- Use according to the previous claim wherein VOC detection is performed by spectroscopic means.

24.- Use according to the previous claim wherein VOC detection comprises at least one technique selected from the group consisting of electronic noses, canine scent, gas chromatography, mass spectroscopy, quartz crystal microbalance, surface acoustic waves, resistive or capacitive sensors, liquid chromatography, differential mobility analysis and ion mobility spectroscopy, capillary electrophoresis and infrared detection.

25.- A computer implemented method comprising comparing the VOC profile obtained from a sample of a subject suspected of suffering narcolepsy with a reference VOC profile.

26. - A computer readable medium comprising computer executed instructions for performing the method defined in claim 25.

27. - A computer implemented method for the construction of a reference VOC profile comprising acquiring the VOC profile data from one or more subjects, including at least one subject known to suffer narcolepsy, and identifying one or more signals which distinguish subjects having narcolepsy from those not suffering narcolepsy.

28. - A computer readable medium comprising computer executed instructions for performing the method defined in claim 27.

Description:
VOC-BASED NARCOLEPSY DIAGNOSTIC METHOD

FIELD OF THE INVENTION

The present invention relates to a process for the diagnosis of narcolepsy by the detection of volatile organic compounds (VOC) in a patient's sample.

BACKGROUND OF THE INVENTION

Narcolepsy is a primary sleep disorder characterized by excessive daytime sleepiness, sleep attacks, cataplexy and sleep paralysis with hypnagogic hallucinations. REM sleep and sleep continuity variables are also disturbed in narcoleptic patients. The etiology of narcolepsy in humans is unknown, although great importance is given to the role of the orexin/hypocretin system, both in animals and humans. The prevalence of narcolepsy is 1/2000 in the general population. There are also some reports of narcolepsy symptoms associated with organic lesions such as mid-brain tumors, pontine gliomas, hypothalamic gliosis, and Down syndrome.

Current diagnosis of narcolepsy is based on clinical data and polysomnographic studies. These studies are expensive and require the subject to go in person to specialized facilities where he or she is monitored during sleep with costly equipment. Thus, in addition to the costs, this method takes hours to perform and interpret and is inconvenient for the patient. Overall, diagnosis comes with years of delay due to the non-specific nature of the symptoms, the high economic costs and the low reliability of the diagnostic methods.

Further known diagnosis methods can be found in WO 01/14555 A1 and WO 2006/085648 A1, describing the detection of mutations in the gene encoding hypocretin (orexin) receptor 1. Genetic methods are however time consuming, expensive and require the intervention of a specialist.

There is therefore a need to provide a diagnostic method for narcolepsy that overcomes the above problems. In order to obtain wide acceptance, the method must not only be reliable and robust, but also easy to perform without excessive inconveniences for the subject.

Unambiguous identification of narcolepsy would be important in the initial differential diagnosis in the study of hypersomnias, since a negative result would avoid more costly studies.

FIGURES

Figure 1: Peak detection results: Each dot represents a peak detected in a particular sample at a particular retention time.

Figure 2: Cluster range distribution. X axis shows the range of retention times; y axis the number of clusters.

Figure 3: LDA (linear discriminant analysis) classification for "S". The "x" axis represents the predicted probability of a sample being of type "S".

Figure 4: Supervised Multivariate Analysis Orthogonal Projection to Latent Structures analysis on a population of 17 diseased (black) and 27 healthy (red) subjects (example 3).

Figure 5: R2 (left) and Q2 (right) coefficients (example 3)

SUMMARY OF THE INVENTION

The inventors have overcome the above problems by providing a method based on the detection of at least one VOC. The method is useful since it provides a new noninvasive, and surprisingly easy, safe, cost effective and fast detection tool for the diagnosis of narcolepsy.

According to one aspect, the invention provides a method for determining whether a patient suffers narcolepsy comprising the detection of at least one VOC in a sample obtained from a patient. According to a preferred embodiment, the method comprises the steps of:

- obtaining a sample from a subject;

- detecting the odor components of the sample; and

- comparing said odor components of the sample with a reference odor.

In other words, the method comprises detecting the odor components (VOCs) of a sample obtained from a subject; and comparing said odor components of the sample with a reference odor.

According to a further preferred embodiment, the method comprises the steps of:

- obtaining a sample from a subject;

- detecting the levels of at least one VOC in said sample in order to obtain the VOC profile of the sample; and

- comparing said VOC profile of the sample with a reference VOC profile.

Among the origins or causes of narcolepsy, the deregulation of the orexin/hypocretin system plays an important role. Thus, the present invention may also provide a method for the detection of orexin/hypocretin system malfunction.

A further aspect of the invention is a kit for obtaining a narcolepsy diagnosis sample that comprises sterile and odorless material. Another aspect of the invention is an apparatus comprising a sensor capable of detecting VOCs and comparative means by which a sample's VOC profile is compared with at least one reference VOC profile.

The result is a convenient, simple, quick and inexpensive diagnostic process with high sensitivity and specificity. It is now possible for the first time to routinely perform these tests in large population groups and on professionals in positions of risk or high responsibility.

The diagnostic use of the test is important in the initial differential diagnosis in the study of hypersomnias, since a negative result would avoid more costly studies. It is important as a means of screening for studying populations or groups requiring treatment of narcolepsy. Its application is especially interesting in cases where hypersomnia could be hazardous, as in the case of drivers, pilots and in professions involving risk. This diagnostic technique will reduce costs and will help to make a better selection of patients that should be studied with more complex and costly tests. Furthermore, it will be a useful means of screening for the study of narcolepsy in risk populations.

Although screening of VOC has been used in the detection of other clinical conditions such as asthma (WO 2011/003922), lung cancer (e.g. US 5,996,586, US 6,312,390, WO 2012/122128 A1), diabetes (CN 2 430 111), or bacterial identification (WO 94/04705 A1, WO 95/33848 A1, WO 98/29563 A1, WO 00/57182 A1), it is surprising that a VOC profile could provide such a convenient diagnostic method for narcolepsy.

DETAILED DESCRIPTION OF THE INVENTION

Sample Collection and Manipulation

The particular VOC profile caused by narcolepsy can be reflected in different tissues and body fluids. The sample is preferably selected from the group consisting of blood, breath, swab (a sample obtained by rubbing the subject with a cloth, gauze or other means of absorbing odor from the skin), sweat, urine, feces, semen, vaginal discharge, hair, nails, soft body tissue and mucus. The sample may thus be collected and then analyzed ex vivo. The invention can thus be performed in vitro. In order to collect, manipulate, store and transport the sample it is preferable to maintain aseptic conditions to avoid contamination. Thus, another aspect of the present invention is a kit comprising suitable means for collecting, manipulating, storing and transporting the sample. Such kit is preferably sterile in order to prevent cross-contamination. In a preferred embodiment, all components of the kit are isolated from the exterior by, for example, a sealed receptacle which is opened immediately before collecting the sample. This may include containers for the sample that can be sealed, preferably opaque. The kit further includes instructions on how to manipulate materials and how to take and manipulate the sample. Just by way of example, the kit may contain a sterile deodorized container for urine. The sample may be used immediately for detection, or it can be sealed and transported to another location for analysis. The length of time for which the sample remains useful after it is taken depends on the nature of the sample; the sample is preferably preserved cold (e.g. between -10°C and 10°C).

In the case of swabs, the kit may contain an odorless sterilized sample container, gauze, soap and/or tweezers to manipulate the gauze. According to this embodiment, a part of the patient can be washed with said odorless soap, rinsed with water and then air dried before taking the sample by having the patient rub the washed area with the gauze. The kit may additionally comprise specially deodorized water for rinsing the soap in order to further improve reliability of the method and prevent cross-contamination.

In another embodiment the sample is blood, a convenient sample form that is not excessively invasive and readily available in health centers.

Detection

The skilled person can appreciate that different methods are available to detect VOCs. Typical means include electronic noses (also known as olfactory systems) such as quartz crystal microbalance, resistive or capacitive sensors or surface acoustic waves, mass spectrometry (MS), liquid chromatography, gas chromatography (GC), capillary electrophoresis, differential mobility analyzer (DMA), ion mobility spectroscopy (IMS) in any of its variants (drift time IMS, Field Asymmetric IMS), and infrared techniques. These means can be used as stand-alone detection means or in different combinations. For example, GC-MS (Miekisch et al., Clinica Chimica Acta (2004) 347 25-39) or proton transfer reaction-mass spectroscopy (for a review on this technique see Lindinger et al. Int J Mass Spectrom Ion Process (1998) 173 191-241 or Lindinger et al. Adv Gas Phase Ion Chem (2001) 4 191-241) can be used. In a further embodiment, the means used is IMS-MS (e.g. Ells et al. J. Environ. Monit., 2000, 2, 393-397).

Detection can also be done by a properly trained dog. A dog so trained learns and remembers the odor, VOC profile or odor "fingerprint" of a subject or group of subjects suffering narcolepsy (reference odor or reference VOC profile). When exposed to the subject's sample, the dog detects the odor or VOC profile of the sample, and performs the comparison.

For example, Di Natale et al. (Biosensors and Bioelectronics (2003) 18 1209-1218) used an array of non-selective gas sensors for detecting various alkanes and benzene derivatives as possible candidate markers of lung cancer. Gordon et al. (Clin Chem (1985) 31(8) 1278-1282) used a breath collection technique and computer-assisted gas chromatography/mass spectrometry to identify several volatile organic compounds in the exhaled breath of lung cancer patients which appear to be associated with the disease. Wehinger et al. (Inter J Mass Spectrometry (2007) 265 49-59) used proton transfer reaction mass-spectrometric analysis to detect lung cancer in human breath. Peng et al. (Nature Nanotech (2009) 4 669-673) identified 42 VOCs that represent lung cancer biomarkers using gas chromatography/mass spectrometry.

Exemplary IMS or DMA systems useful in the method of the invention are described in WO 2010/133714 A1 or WO 2008/003797 A1, respectively.

According to a further embodiment, the method uses two or more orthogonal detection means, providing a more accurate diagnosis means. Orthogonal detection means are understood as two or more detection means that are mutually independent, i.e. they detect independent characteristics of the sample. Once the sample's VOC profile has been obtained, it is correlated with the diagnosis of narcolepsy. The determination of said levels can involve detecting whether said at least one VOC is present or absent, or alternatively the levels in which it is present.

In an embodiment of the invention, the sample VOC profile is compared with at least one negative reference VOC profile acting as negative control sample, i.e. a VOC profile obtained from a subject or group of subjects known not to have narcolepsy. In another embodiment the comparison is done with a positive reference VOC profile acting as positive control sample, i.e. a VOC profile obtained from a subject or group of subjects known to have narcolepsy. Thus, in order to determine the presence of narcolepsy the reference VOC profile can be positive or negative. The reference VOC profile can be obtained from a single subject or it can be obtained from a plurality of subjects.
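By way of illustration only, the comparison of a sample VOC profile with a positive and a negative reference VOC profile could be sketched as a nearest-reference rule. The Euclidean distance, the function names and the numerical levels below are assumptions introduced for this sketch; they are not taken from the application and do not represent measured data.

```python
import math

def distance(profile_a, profile_b):
    """Euclidean distance between two VOC profiles given as {voc_name: level}.

    A VOC missing from one profile is treated as level 0.0 (an assumption).
    """
    names = set(profile_a) | set(profile_b)
    return math.sqrt(
        sum((profile_a.get(n, 0.0) - profile_b.get(n, 0.0)) ** 2 for n in names)
    )

def classify(sample, positive_ref, negative_ref):
    """Return 'positive' if the sample lies closer to the positive reference."""
    d_pos = distance(sample, positive_ref)
    d_neg = distance(sample, negative_ref)
    return "positive" if d_pos < d_neg else "negative"

# Hypothetical normalized levels, for illustration only.
positive_ref = {"decane": 0.60, "octanal": 0.30, "toluene": 0.10}
negative_ref = {"decane": 0.20, "octanal": 0.30, "toluene": 0.50}
sample = {"decane": 0.55, "octanal": 0.25, "toluene": 0.20}

result = classify(sample, positive_ref, negative_ref)
```

A single-reference variant (positive only or negative only) would need a distance threshold instead of a two-way comparison; how such a threshold is chosen is left open here.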

According to one embodiment of the invention, the levels or abundance of one VOC are detected. In an alternative embodiment, the levels or abundance of two or more VOC are detected, and the determination of the VOC profile of the sample can involve the detection of the levels or abundance and proportions in which different VOCs are found.

Thus, a further aspect of the invention is a method for comparing a sample from a patient suspected of suffering narcolepsy with a reference VOC profile obtained as explained below. The method can be conveniently implemented in a computer or similar device, and a further aspect of the invention is a data processing system having means for carrying out said method for comparing. It is therefore a further aspect of the invention a computer implemented method comprising comparing the VOC profile obtained from a sample of a subject suspected of suffering narcolepsy with a reference VOC profile, and a computer readable medium comprising computer executed instructions for performing said method for comparing.

VOCs are organic substances that can be detected by an animal nose or by instrumental methods such as gas chromatography or liquid chromatography, or others disclosed herein. They are typically substances with low molecular weight, e.g. below 1000 Dalton, typically below 800 or below 600 Dalton.

Examples of VOCs that can be useful for diagnostic purposes according to the present invention are C1-C20 linear or branched hydrocarbons, for example linear or branched hydrocarbons having between 10 and 16 carbon atoms. VOCs can be saturated or unsaturated, cyclic or acyclic. Non-limiting examples include decane or hexadecane. According to a further embodiment, said VOC detected is an aromatic hydrocarbon or comprises an aromatic moiety. Typical examples include benzenes substituted with at least one moiety selected from the group consisting of C1-C6 alkyl, C1-C6 alkenyl, C1-C6 alkynyl, hydroxyl, C1-C6 alkoxy, ketones and sulfones. Non-limiting examples include 2,4-bis(1,1-dimethylethyl)-phenol, 2-methyl-6-(2-propenyl)-phenol, 2-ethyl-1,4-dimethylbenzene, 4'-hydroxy-acetophenone, 1-ethyl-3-methyl-benzene or benzyl-2-chloroethyl sulfone. Other exemplary VOCs can be acids, esters, ketones or aldehydes. Non-limiting examples are 4-methyl-1-heptanol, acetic acid octyl ester, decane, 2-decanone, 3-methyl-decane, octanal, pentadecanenitrile, and tetradecene, 6-methyl-1-heptanol, 2-ethyl-1-hexanol, benzaldehyde, tetrahydrofuran, isopropyl myristate, 2,2,4-trimethyl-pentanenitrile, 2,2,4-trimethyl-3-carboxyisopropyl-isobutyl ester pentanoic acid, phenol, styrene, 4-methyl-tetradecane, toluene, tridecane, 6-methyl-tridecane, undecane, triethylamine, 2-hydroxy benzaldehyde, and decanal. Other VOCs which can be detected according to the present invention are prednisolone or prednisolone acetate, or polycyclic alkanes, optionally oxidized (e.g. 4,4,8,8-tetramethyloctahydro-4a,7-methano-4aH-naphth[1,8a-b]oxirene).

Each can be used in isolation or in combination with two or more VOCs. According to an embodiment of the invention the variation can be measured in one or more VOCs. There is no limitation on the type of variation measured. It can involve relative variation between different VOC levels, or between the levels of one or more VOCs in comparison with a reference level or a reference sample. The present invention can also involve the detection of the relative levels between two or more VOCs.
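As a minimal sketch of the two kinds of variation mentioned above, the following shows a ratio between two VOC levels within one sample and the change of one VOC level relative to a reference level. The use of a log2 fold change, the function names and the values are assumptions chosen for illustration; the application does not prescribe any particular measure.

```python
import math

def level_ratio(levels, numerator, denominator):
    """Ratio between the levels of two VOCs measured in the same sample."""
    if levels[denominator] == 0:
        raise ZeroDivisionError("denominator VOC level is zero")
    return levels[numerator] / levels[denominator]

def log2_fold_change(sample_level, reference_level):
    """log2 of a sample level relative to a reference level (both must be > 0)."""
    return math.log2(sample_level / reference_level)

# Hypothetical levels in arbitrary units, for illustration only.
sample = {"decane": 40.0, "octanal": 10.0}
reference = {"decane": 10.0, "octanal": 10.0}

ratio_in_sample = level_ratio(sample, "decane", "octanal")
change_vs_reference = log2_fold_change(sample["decane"], reference["decane"])
```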

Analysis

The VOC profile obtained from the patient can be compared to a reference VOC profile according to various methods known in the art. For example, a multi-linear regression and fuzzy logic can be used to analyze the sample (Phillips et al., Cancer Biomarkers (2007) 3 95-109). According to an embodiment of the invention, the VOC profile of the sample can be analyzed with an algorithm selected from the group consisting of artificial neural networks, multi-layer perceptron (MLP), generalized regression neural network (GRNN), fuzzy inference systems (FIS), self-organizing map (SOM), radial basis function (RBF), genetic algorithms (GAS), neuro-fuzzy systems (NFS), adaptive resonance theory (ART) and statistical methods including, but not limited to, principal component analysis (PCA), partial least squares (PLS), multiple linear regression (MLR), principal component regression (PCR), discriminant function analysis (DFA) including linear discriminant analysis (LDA), and cluster analysis including nearest neighbor. In an exemplary embodiment, the algorithm used to analyze the pattern is principal component analysis (PCA). In another exemplary embodiment, the algorithm used to analyze the pattern is discriminant function analysis (DFA). In other embodiments, the pattern can be analyzed using support vector machine (SVM) analysis.

According to a particular embodiment, the analysis is performed through a principal component analysis (PCA) and/or a linear discriminant analysis (LDA).
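For illustration, a two-class linear discriminant analysis (Fisher's discriminant) can be sketched in a few lines. The sketch is deliberately restricted to two VOC features so that the 2x2 pooled scatter matrix can be inverted by hand; it does not include a PCA step. The feature values are hypothetical, and a practical analysis would more likely rely on an established statistics library.

```python
def mean_vector(rows):
    """Component-wise mean of a list of 2-element feature vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def scatter(rows, mean):
    """Within-class scatter as (sxx, sxy, syy) for 2-element feature vectors."""
    sxx = sum((r[0] - mean[0]) ** 2 for r in rows)
    sxy = sum((r[0] - mean[0]) * (r[1] - mean[1]) for r in rows)
    syy = sum((r[1] - mean[1]) ** 2 for r in rows)
    return sxx, sxy, syy

def fit_lda(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0), Sw the pooled scatter."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # Sw = [[a, b], [b, c]]
    det = a * c - b * b
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    w = [(c * dx - b * dy) / det, (-b * dx + a * dy) / det]
    # Decision threshold: projection of the midpoint between the class means.
    threshold = w[0] * (m0[0] + m1[0]) / 2 + w[1] * (m0[1] + m1[1]) / 2
    return w, threshold

def predict(model, x):
    """Return 1 if x projects above the threshold (class1 side), else 0."""
    w, threshold = model
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0

# Hypothetical levels of two VOCs per subject, for illustration only.
healthy = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
narcoleptic = [(4.0, 4.5), (4.5, 4.0), (5.0, 5.2), (4.2, 5.0)]

model = fit_lda(healthy, narcoleptic)
```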

According to an embodiment of the invention, the reference VOC profile is obtained through a method comprising the following steps:

- obtaining 2 or more chromatograms or the result of any other analytical technique aimed to detect the odor from patient samples for which the presence or absence of narcolepsy is known;

- preprocessing said chromatograms or instrumental results in order to homogenize their retention times or other instrumental references;

- clustering peaks with similar retention times or the appropriate reference;

- selecting statistically significant clusters; and

- classifying said statistically significant clusters according to their correlation with the presence or absence of narcolepsy.

Thus, the method of the invention is useful to predict with reasonable accuracy the presence or absence of narcolepsy in a patient by the analysis of its VOC profile through chromatographic or other instrumental techniques.
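The peak-clustering step of the preceding list can be sketched as follows. This is a simplified stand-in, not the applicants' procedure: it assumes retention times have already been aligned, groups peaks by a fixed gap tolerance (a hypothetical parameter), and merely counts in how many samples of each class a cluster appears, leaving the statistical selection of clusters open. All peak data are invented for illustration.

```python
def cluster_peaks(peaks, tolerance):
    """Group peaks by retention time.

    `peaks` is a list of (sample_id, retention_time, intensity) tuples.
    After sorting by retention time, a peak joins the current cluster if it
    lies within `tolerance` of the previous peak; otherwise a new cluster starts.
    """
    ordered = sorted(peaks, key=lambda p: p[1])
    clusters = []
    for peak in ordered:
        if clusters and peak[1] - clusters[-1][-1][1] <= tolerance:
            clusters[-1].append(peak)
        else:
            clusters.append([peak])
    return clusters

def cluster_presence(clusters, labels):
    """For each cluster, return (mean retention time, {label: sample count})."""
    summary = []
    for cluster in clusters:
        samples = {sample_id for sample_id, _, _ in cluster}
        counts = {}
        for sample_id in samples:
            label = labels[sample_id]
            counts[label] = counts.get(label, 0) + 1
        center = sum(rt for _, rt, _ in cluster) / len(cluster)
        summary.append((center, counts))
    return summary

# Hypothetical samples and peaks (retention time in minutes).
labels = {"s1": "narcolepsy", "s2": "narcolepsy", "s3": "control", "s4": "control"}
peaks = [
    ("s1", 5.02, 100.0), ("s2", 5.05, 90.0),
    ("s1", 8.40, 50.0), ("s2", 8.44, 55.0), ("s3", 8.41, 52.0), ("s4", 8.38, 48.0),
    ("s3", 12.10, 70.0), ("s4", 12.13, 65.0),
]

clusters = cluster_peaks(peaks, tolerance=0.1)
summary = cluster_presence(clusters, labels)
```

In this invented data set the first cluster appears only in the narcolepsy samples and the last only in controls, which is the kind of pattern a subsequent significance test and classifier would then evaluate.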

According to an alternative embodiment the creation of a suitable reference VOC profile may involve the analysis of a large sample of healthy and diseased subjects.

According to an embodiment of the invention, the reference VOC profile is obtained through a method comprising the following steps:

- obtaining 2 or more chromatograms or the result of any other analytical technique aimed to detect VOC from one or more patient samples for which the presence or absence of narcolepsy is known;

- creation of a library of target compounds;

- deconvolution of said chromatograms in order to obtain chromatograms from any pure compound; and

- aligning the compounds to obtain a compound:abundance matrix.

The method of the invention can be further refined by processing the compound:abundance matrix. Such refining may include, by way of example, elimination of signals or abundances derived from column bleed, normalization of signals with respect to the total signal of the chromatogram, normalization test, or keeping signals present in at least a predetermined amount of the VOC profiles (e.g. present in at least 75% of the chromatograms of a group).
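Two of the refinements named above, normalization to the total signal and the 75% presence filter, can be sketched as follows. The data layout (a dictionary of samples, each mapping compound names to abundances), the reading of "of a group" as "of at least one group", and all values are assumptions made for this sketch.

```python
def normalize_to_total(row):
    """Divide each abundance by the total signal of its chromatogram."""
    total = sum(row.values())
    if total <= 0:
        return dict(row)
    return {compound: value / total for compound, value in row.items()}

def keep_frequent(matrix, groups, min_fraction=0.75):
    """Keep compounds present (abundance > 0) in at least `min_fraction`
    of the chromatograms of at least one group."""
    compounds = {c for row in matrix.values() for c in row}
    by_group = {}
    for sample, label in groups.items():
        by_group.setdefault(label, []).append(sample)
    kept = set()
    for compound in compounds:
        for members in by_group.values():
            present = sum(1 for s in members if matrix[s].get(compound, 0.0) > 0.0)
            if present / len(members) >= min_fraction:
                kept.add(compound)
                break
    return {
        sample: {c: row.get(c, 0.0) for c in sorted(kept)}
        for sample, row in matrix.items()
    }

# Hypothetical compound:abundance matrix (compounds "A", "B", "C").
matrix = {
    "p1": {"A": 10.0, "B": 5.0, "C": 1.0},
    "p2": {"A": 12.0, "B": 4.0},
    "p3": {"A": 9.0, "B": 6.0},
    "p4": {"A": 11.0},
    "c1": {"A": 10.0, "C": 2.0},
    "c2": {"A": 8.0},
    "c3": {"A": 9.0},
    "c4": {"A": 10.0},
}
groups = {s: ("patient" if s.startswith("p") else "control") for s in matrix}

filtered = keep_frequent(matrix, groups)
normalized = {s: normalize_to_total(row) for s, row in filtered.items()}
```

Here compound "B" is retained because it is present in three of the four patient chromatograms, whereas "C" is dropped because it appears in only one chromatogram of each group.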

In a further embodiment of the invention, the reference VOC profile is obtained through a method comprising the following steps:

- obtaining 2 or more chromatograms or the result of any other analytical technique aimed to detect the VOC from one or more patient samples for which the presence or absence of narcolepsy is known;

- performing a multivariate (e.g. supervised: Orthogonal Projection to Latent Structures [Trygg, J.; Wold, S. J. Chemometrics 2002; 16: 119-128], unsupervised: Principal Component Analysis (PCA)) or univariate analysis of the VOC profiles in order to obtain a model; and

- further resampling (e.g. jack-knife, t-test, Monte Carlo or Benjamini) the model obtained in order to generate a reference VOC profile.
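As one possible reading of the univariate and resampling steps above, the sketch below applies a Monte Carlo permutation test to the abundance of a single VOC in two groups and then adjusts a set of p-values for multiple testing. Interpreting "Benjamini" as the Benjamini-Hochberg false discovery rate adjustment is an assumption, as are the function names, the number of permutations and the abundance values.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference of means."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical abundances of one VOC in patients and controls.
patients = [5.1, 4.8, 5.5, 5.0]
controls = [2.0, 2.3, 1.9, 2.2]
p_single = permutation_p_value(patients, controls)

# Hypothetical p-values for four VOC signals.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Signals whose adjusted p-value stays below a chosen level would then be candidates for inclusion in the reference VOC profile; the choice of that level is not specified in the text.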

It is therefore a further embodiment of the invention a computer implemented method for the construction of a reference VOC profile comprising acquiring the VOC profile data from one or more subjects, including one or more subjects known to suffer narcolepsy, and identifying one or more signals which distinguish subjects having narcolepsy from those not suffering narcolepsy. A further aspect of the invention is a computer readable medium comprising computer executed instructions for performing said method for the construction of a reference VOC profile.

The data processing systems described in the present invention will be any capable of computing the methods of the invention, including, but not limited to computers, laptops, tablets, smartphones, whether or not connected to the detection means. Such data processing systems will preferably comprise a display screen, data acquisition means, memory, one or more processors and one or more programs stored in the memory configured to be executed by one or more processors, the program including instructions for performing the methods of the invention. That is, in one case the instructions for comparing a sample from a patient suspected of suffering narcolepsy with a reference VOC profile; in another case, for the construction of said reference VOC profile comprising acquiring the VOC profile data from one or more subjects, including at least one subject known to suffer narcolepsy, and identifying one or more signals which distinguish subjects having narcolepsy from those not suffering narcolepsy.

The method of the invention will now be further explained by using a non-limiting example of a specific embodiment.

EXAMPLES

Example 1: Dog Scent

Following the criteria of the American Academy of Sleep Medicine (American Academy of Sleep Medicine. International classification of sleep disorders (ICSD-2). Diagnostic and coding manual. 2005; Westchester, IL.), the patients were diagnosed through their clinical records and nocturnal polysomnography followed by the Multiple Sleep Latency Test (MSLT). The control group was made up of healthy individuals of both sexes and various ages, who completed a sleep questionnaire to rule out sleep disorders, suspicion of narcolepsy or other sleep pathologies. Patients were enrolled in the study from April 2011 to June 2012. All the patients and control group participants signed an informed consent form.

Sweat collection.

The samples from both patients and controls were collected according to the following protocol: wash the hands and forearms for 30 seconds with non-perfumed soap (water, vegetable glycerin, coconut, corn syrup soap, alcohol, citric acid, vegetable vitamin E, nothing else); rinse the hands and forearms in water for 2 minutes; air dry the hands for 2 minutes; rub the palms of the hands on the forearms for 5 minutes; while walking for 10 minutes, rub gauze on the palms of the hands, forearms and arms; introduce the gauze into the vial and close it with the septum; identify the sample (sex, age, date of collection) and keep the sample refrigerated at 5 °C until it is sent for testing, which takes place within 24 hours.

To avoid contamination, only the patients and controls came into contact with materials used to take the sample (vial, septum, tweezers, and gauze). Before taking the sample the candidates had to avoid the use of perfumes, fragrances, deodorants, tobacco, or any source of external contamination. Sterile surgical cotton gauzes were used.

Dogs and training

Two 4-year-old Labrador retrievers, one male named "Kuns" and one female named "Coca", were chosen for the detection, instead of just one dog as in other published clinical trials, in order to obtain greater accuracy in the odor identification. Both dogs belong to the Civil Guard ("Guardia Civil" in Spanish), an armed institution of military nature whose main mission is to ensure the full exercise of rights and freedoms and to vouch for public safety. Its Cynological Service (SECIR) was created in 1951 for the training and use of dogs in searching for missing persons, intervention in disasters, location of drugs or explosives, mountain rescues and so on. This service conducted all of the dogs' training and completed the detection test for all samples submitted.

Coca was trained in drug detection (active detection) and Kuns in the detection of explosives (passive detection). In active detection Coca detects a specific odor and moves, using paws and nails waiting to receive her reward. In the passive detection Kuns, in front of a specific sample demonstrates an inhibition in arousal and movements and adopts a seated position to receive his reward.

The training method for the detection of the suspected specific odor in narcoleptic patients was as follows. Medium-sized retriever breeds with a strong sense of smell, socialized to adapt to the intervention locations, were used. The dogs were exposed to the odor of narcoleptics using gauze soaked in their sweat until the dogs associated the rewards with the odor and demonstrated that they could discriminate it from other odorant samples and from sweat-soaked gauze from the healthy control group. The dogs were trained to search for and identify odors, either guided on a leash or away from the guide without a leash.

To improve the procedure, a vertical cabinet with 8 boxes was designed to present to the dogs, for identification, the sweat-soaked gauzes of patients and healthy controls as well as other substances. One positive sample was provided among seven others used as controls, each in a separate box. For each test the samples were changed and each box was cleaned with alcohol prior to use. Neither dog nor handler knew the location of the sample.

The dogs worked in a 6 x 9 m room. The cabinet was made of a formic-lined agglomerate panel measuring 366 x 91 cm and 2 cm thick, hanging perpendicular to the floor on metal legs. It has 4 holes, 7.5 cm in diameter, which are separated by 88 cm and are 57.5 cm from the floor. The distance from the last hole to the end of the panel is 44 cm, so that it can be linked to the other panel while maintaining the distance (88 cm) between holes. On the back of the panel there are 4 boxes with lids, whose placement coincides with the holes on the other side. The boxes are attached by wood screws and measure 30 cm wide x 20 cm deep x 20 cm tall. As already mentioned, both active and passive detection were used, each dog using its own method. The dog training was completed during March and April 2011.

Testing

The tests were performed between April 2011 and July 2012. New samples, from patients and the control group, were used for each test. The dogs would pass by the boxes only once. A maximum of two tests were done daily with different samples. It was a double blind test (neither trainer nor dog knew the type of samples selected nor its placement in the search device), with a double test by the dogs (each sample was analyzed independently by each dog) using two different approaches of detection (active and passive). The dogs were always under veterinary supervision, in perfect health and injury free.

Statistical analysis

The distribution of the variables age, gender and tobacco habit among patients and controls is described. A chance-corrected index of the agreement between the two raters (dogs) was computed using the Kappa coefficient (95 % CI).
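As an illustration of the chance-corrected agreement index mentioned above, the following minimal sketch computes Cohen's Kappa for two binary raters. The function name and the two rating lists are hypothetical (34 invented calls in which the two dogs disagree on a single sample); they are not the study data, and no confidence interval is computed.

```python
def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement (Cohen's Kappa) between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of samples")
    n = len(rater_a)
    # Observed proportion of samples on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(lab) / n) * (rater_b.count(lab) / n)
                   for lab in labels)
    if expected == 1.0:
        return 1.0  # both raters always give the same single label
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls (1 = marked as narcoleptic, 0 = not marked)
dog_a = [1] * 14 + [0] * 20
dog_b = [1] * 13 + [0] * 21
print(f"kappa = {cohen_kappa(dog_a, dog_b):.2f}")
```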

Since both tests, problem (detection by dogs) and reference (gold standard diagnosis) have a binary distribution, analysis of the operational characteristics of the problem diagnosis test has been conducted by using a 2x2 matrix. Estimates on sensitivity (Se), specificity (Sp), and positive and negative predictive values (PPV, NPV), with 95 % confidence intervals, are performed. A joint evaluation of the behaviour of Se and Sp is performed by means of a ROC curve, whose area below the curve represents the likelihood that, when applying the test to two subjects taken at random, one patient and another a control, they may be suitably identified.
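The indices derived from the 2x2 matrix can be sketched as follows. The four cell counts are an assumption: they are not copied from Table 2 but inferred from the figures reported in the Results (12 patients, 22 controls, 3 false positives, sensitivity 0.92). The function name is illustrative, and confidence intervals are not computed.

```python
def diagnostic_indices(tp, fn, fp, tn):
    """Return Se, Sp, PPV and NPV from the four cells of a 2x2 matrix."""
    return {
        "Se": tp / (tp + fn),   # patients correctly marked
        "Sp": tn / (tn + fp),   # controls correctly left unmarked
        "PPV": tp / (tp + fp),  # marked samples that are patients
        "NPV": tn / (tn + fn),  # unmarked samples that are controls
    }


# Assumed counts (inferred from the reported totals, not taken from Table 2)
indices = diagnostic_indices(tp=11, fn=1, fp=3, tn=19)
for name, value in indices.items():
    print(f"{name} = {value:.2f}")
```

With these assumed counts the sketch reproduces the reported point estimates (0.92, 0.86, 0.79 and 0.95).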

In order to control for the effect of the prevalence of narcolepsy in the series studied and to generalize the results of the diagnostic estimators to different clinical situations, the likelihood ratios (LR) are calculated. They are an alternative way to describe the performance of a diagnostic test, and can be used to calculate the probability of disease after a positive or negative test. The LR+, the likelihood ratio of a positive result [Se/(1-Sp)], expresses how many times more likely a positive test result is to be found in narcoleptics compared with controls. The LR-, the likelihood ratio of a negative result [(1-Se)/Sp], expresses how many times less likely a negative test result is to be found in narcoleptics compared with controls. The initial clinical probability (pre-test probability, Ppre) heuristically established by the doctor (extent of clinical suspicion) is modified (post-test probability, Ppost) depending on the efficacy of the diagnostic test in accordance with the function:

Oddspost = LR x Oddspre.

Odds is a ratio of two probabilities and a convenient way of expressing probability (p) for certain mathematical calculations. The two can be interconverted using simple formulas: Odds=[p/(1-p)] and p=[Odds/(1+Odds)].
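The conversion from pre-test to post-test probability can be sketched as below. The function names are illustrative. The likelihood ratios 6.72 and 0.10 are the values reported in the Results, while the pre-test probabilities of 5 %, 10 % and 20 % are assumed levels of clinical suspicion chosen only for the example.

```python
def likelihood_ratios(se, sp):
    """Return (LR+, LR-) from sensitivity and specificity."""
    return se / (1.0 - sp), (1.0 - se) / sp


def post_test_probability(pre_test, lr):
    """Apply Odds_post = LR x Odds_pre and convert back to a probability."""
    odds_pre = pre_test / (1.0 - pre_test)
    odds_post = lr * odds_pre
    return odds_post / (1.0 + odds_post)


LR_POS, LR_NEG = 6.72, 0.10  # values reported in the Results
for pre in (0.05, 0.10, 0.20):  # assumed pre-test probabilities
    pos = post_test_probability(pre, LR_POS)
    neg = post_test_probability(pre, LR_NEG)
    print(f"pre-test {pre:.0%}: positive test {pos:.1%}, negative test {neg:.1%}")
```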

Results

34 subjects have been studied, 12 narcoleptic and 22 healthy controls. Median age of the series was 33, 36 in patients and 33 in controls. 68 % of cases were men, which was distributed as 62 % in patients and 73 % in controls. The frequency of smokers in series, and in cases and controls, was 26, 23 and 31 %, respectively (Table 1).

                          Patients (n=12)          Controls (n=22)
Smoked (within 2 weeks)   Yes: Male=3, Female=1    Yes: Male=3, Female=2
                          No: Male=5, Female=3     No: Male=13, Female=4
                          Total Yes=4              Total Yes=5
                          Total No=8               Total No=17

Table 1

The chance-corrected agreement (Kappa index) between the two detector dogs was 0.94 (95 % CI 0.83 - 1.00).

Table 2 shows the distribution matrix of the subjects studied.

Table 2

The basic operational diagnostic characteristics obtained in the experiment (CI 95 %) are: sensitivity 0.92 (0.72-1.00), specificity 0.86 (0.70-1.00), PPV 0.79 (0.054-1.00) and NPV 0.95 (0.83-1.00). The area under the ROC curve (Receiver Operating Characteristic curve) is 0.89 (0.78-0.99 CI 95 %). The likelihood ratio of a positive test is 6.72 (2.32-19.51) and that of a negative test is 0.10 (0.01-0.63). Table 3 shows a sensitivity analysis of the post-test probabilities obtained for positive and negative results in the test, in different situations of clinical suspicion (pre-test probability).

Table 3

The first line shows the estimated population prevalence of narcolepsy in the European population. If the test result is positive, the initial probability estimates that the patient observed may in fact have narcolepsy increase up to 0.3, 0.7, 6.4, 24.6, 26.1, 42.7 and 62.7 %, respectively. If the test proves negative, they decrease to 0, 0, 0.1, 0.5, 1.1 and 2.4 %, respectively.

Out of the 22 healthy controls, 3 cases of false positives were detected for narcolepsy. These subjects were clinically re-evaluated, including with a polygraph, and narcolepsy was ruled out. These false positives may be due to contamination in handling the samples or to a physiological or pathological situation in said controls, unknown at the time.

The results obtained in the study show that the patients with narcolepsy give off a specific VOC profile. The point estimates of the diagnostic indices are high, and the study results prove a high predictive capacity of the test, particularly when the result is negative. Further, when detection by VOC is negative, the post-test probability that the patient has narcolepsy decreases to clinically irrelevant values.

Example 2: Gas Chromatography

A classifier was trained to develop factor values Met.A, based on 50 gas chromatograms of samples obtained following the procedure described in the previous example (see section "Sweat Collection" in example 1). Said factor values were thus used as reference VOC profiles.

The analysis was done using vials, in which the gauze containing the odor was placed and which were sealed with special septa. Solid Phase Micro Extraction (SPME) was used to extract the VOCs from the vials' headspace and to concentrate them in the SPME fibre. CAR-DVB-PDMS was the SPME fibre selected for the VOCs. After extracting the compounds for 21 h at ambient temperature, the SPME fibre was placed in the injector of a Gas Chromatograph for 30 s at 250 °C in splitless configuration. A temperature ramp was programmed until all the VOCs were detected in a Mass Spectrometer, which was the detector used.

The 50 samples thus obtained were processed as described below. Each chromatogram was labeled as follows:

- A unique sample identification name (e.g. "M-100").

- Met.A: A factor which may take three values: "S", "N" or "?".

Table 4 shows how samples were distributed for each factor.

Table 4: Label distribution across samples

Met.A factors behave like binary labels with "Yes/No" values plus an extra "Unknown" value. Therefore "unknown" samples cannot be used for training the classifier; they can only be used to see which label the classifier would assign them.

Methodology

The methodology followed to train the classifier can be decomposed into four steps:

- Preprocessing

- Feature extraction

- Feature selection

- Classification/Validation

Preprocessing

First, samples are interpolated using splines in order to homogenize the retention time axis among different samples. Then noise is removed using a Savitzky-Golay (Abraham Savitzky and M. J. E. Golay, "Smoothing and Differentiation of Data by Simplified Least Squares Procedures," Analytical Chemistry 36, no. 8 (July 1, 1964): 1627-1639, doi: 10.1021/ac60214a047) filter with a window size of 7 samples and a second order polynomial.
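The smoothing step can be sketched as follows for the stated settings (window of 7 samples, second order polynomial), for which the Savitzky-Golay weights are the standard (-2, 3, 6, 7, 6, 3, -2)/21. The function name is illustrative and the edge handling (the first and last three points are left unchanged) is an assumption; the spline interpolation step is not covered.

```python
# 7-point, second-order Savitzky-Golay smoothing weights (to be divided by 21)
SG_WEIGHTS = (-2, 3, 6, 7, 6, 3, -2)
SG_NORM = 21


def savgol7(signal):
    """Smooth a list of intensities with a 7-point quadratic Savitzky-Golay filter."""
    half = len(SG_WEIGHTS) // 2
    smoothed = list(signal)  # edge points are kept as-is (assumption)
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        smoothed[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window)) / SG_NORM
    return smoothed


# A single-point spike is attenuated, while a quadratic trend is preserved
spike = [0, 0, 0, 0, 21, 0, 0, 0, 0]
print(savgol7(spike))
```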

A baseline estimation is computed for each sample using Asymmetric Least Squares (Paul HC Eilers and Hans FM Boelens, "Baseline Correction with Asymmetric Least Squares Smoothing," Leiden University Medical Centre Report (2005), http://zanran_storage.s3.amazonaws.com/www.science.uva.nl/ContentPages/443199618.pdf) using λ=10^6 to ensure the baseline does not overfit to the peak areas and p=0.005 to impose that it is fitted below the signal and not over it.
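A minimal sketch of an Asymmetric Least Squares baseline is given below, reading the stated smoothness value as ten to the sixth power (lam = 1e6) and using p = 0.005. It is not the authors' code: the function names, the number of iterations and the dense Gaussian-elimination solver are assumptions made to keep the example self-contained, and a real chromatogram would call for a sparse or banded solver.

```python
def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            if factor:
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def als_baseline(y, lam=1e6, p=0.005, n_iter=10):
    """Estimate a baseline z minimizing sum(w*(y-z)^2) + lam*sum((second diff of z)^2)."""
    n = len(y)
    # penalty = lam * D^T D, with D the second-order difference operator
    penalty = [[0.0] * n for _ in range(n)]
    for k in range(n - 2):
        idx, coef = (k, k + 1, k + 2), (1.0, -2.0, 1.0)
        for i, ci in zip(idx, coef):
            for j, cj in zip(idx, coef):
                penalty[i][j] += lam * ci * cj
    w = [1.0] * n
    z = list(y)
    for _ in range(n_iter):
        a = [row[:] for row in penalty]
        for i in range(n):
            a[i][i] += w[i]
        z = solve_dense(a, [w[i] * y[i] for i in range(n)])
        # Points above the baseline get the small weight p, points below get 1 - p
        w = [p if y[i] > z[i] else 1.0 - p for i in range(n)]
    return z


# Synthetic signal: flat baseline at 1.0 with a single peak of height 10
y = [1.0] * 40
y[20] = 11.0
z = als_baseline(y)
print(round(z[0], 2), round(z[20], 2))
```

The small value of p makes points lying above the current estimate count very little, which is what keeps the fitted baseline under the peaks.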

Finally, the different samples are aligned using icoshift (Giorgio Tomasi, Francesco Savorani, and Soren B. Engelsen, "Icoshift: An Effective Tool for the Alignment of Chromatographic Data," Journal of Chromatography A 1218, no. 43 (October 28, 2011): 7832-7840, doi: 10.1016/j.chroma.2011.08.086), using the average spectrum as reference and fixed-size windows of 9 seconds.

Feature extraction

In this step a set of features is extracted from the chromatograms, building a "feature matrix". To generate the features, peaks from all the samples are detected; then peaks with similar retention times are clustered together and a set of clusters is defined, where each cluster has its own retention time range. Finally, a matrix with samples as rows and clusters as columns is built, having as element (i,j) the integral of sample i along the retention time range of cluster j.

Peak detection

Peak detection is performed using a matched gaussian filter based on the xcms R package (Colin A. Smith et al, "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification," Analytical Chemistry 78, no. 3 (February 1, 2006): 779-787, doi: 10.1021/ac051437y). The peak detection algorithm is controlled by three parameters:

- The standard deviation of the gaussian used as a pattern by the matched filter.

- A maximum number of detected peaks allowed.

- A Signal to noise ratio used as a threshold to stop detecting peaks.

Peaks are detected until the maximum number of peaks is reached or until the remaining peaks fall below the signal to noise ratio threshold. Peak boundaries are then determined for each detected peak.

A sigma of 1.2 seconds is suitable given the average peak widths, detecting a maximum of 325 peaks on each sample and requesting the peaks to be above 1 % S/N ratio. The most stringent criterion is the signal to noise ratio as the number of peaks limitation is never reached.
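The three parameters above can be illustrated with the following sketch of a matched gaussian filter. It is not the xcms implementation: the function names, the assumption of one data point per second (so that sigma = 1.2 is expressed in samples), the externally supplied noise level and the way the signal to noise threshold is applied are all hypothetical simplifications.

```python
import math


def gaussian_kernel(sigma, half_width):
    """Normalized gaussian template used as the matched-filter pattern."""
    kernel = [math.exp(-0.5 * (i / sigma) ** 2)
              for i in range(-half_width, half_width + 1)]
    total = sum(kernel)
    return [v / total for v in kernel]


def matched_filter_peaks(signal, sigma=1.2, max_peaks=325, min_snr=1.0, noise=1.0):
    """Return indices of detected peaks, strongest first, then sorted by position."""
    half = max(1, math.ceil(3 * sigma))
    kernel = gaussian_kernel(sigma, half)
    n = len(signal)
    # Correlate the signal with the gaussian template (zero padding at the edges)
    filtered = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:
                acc += w * signal[idx]
        filtered.append(acc)
    # Local maxima of the filtered trace are the peak candidates
    candidates = [i for i in range(1, n - 1)
                  if filtered[i] > filtered[i - 1] and filtered[i] >= filtered[i + 1]]
    candidates.sort(key=lambda i: filtered[i], reverse=True)
    peaks = []
    for i in candidates:
        # Stop at the peak-count limit or once the S/N threshold is no longer met
        if len(peaks) >= max_peaks or filtered[i] / noise < min_snr:
            break
        peaks.append(i)
    return sorted(peaks)


# Synthetic trace with two gaussian peaks centred at samples 20 and 60
signal = [10 * math.exp(-0.5 * ((t - 20) / 1.2) ** 2)
          + 5 * math.exp(-0.5 * ((t - 60) / 1.2) ** 2) for t in range(80)]
print(matched_filter_peaks(signal))
```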

Peak clustering

Peak clustering was performed using a one dimensional k-means algorithm (Haizhou Wang and Mingzhou Song, "Ckmeans.1d.dp: Optimal K-means Clustering in One Dimension by Dynamic Programming," A Peer-reviewed, Open-access Publication of the R Foundation for Statistical Computing (n.d.): 29, accessed March 19, 2013). Peaks were clustered in 325 groups based on how close the peaks were to each other. Each cluster was considered a "feature" and was given a unique identification. Each cluster was assigned a retention time range, where all of its peaks were contained.

Cluster integration

Finally, the feature matrix was built by integrating each of the samples over the clusters' retention time ranges. The integral was performed along the cluster time range, and everything inside the time range was integrated.
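The construction of the feature matrix can be sketched as below, with samples as rows and clusters as columns. The trapezoidal rule, the function names and the toy data are assumptions; the text does not state which integration rule was used.

```python
def integrate(times, values, t_start, t_end):
    """Trapezoidal integral of one chromatogram over [t_start, t_end]."""
    area = 0.0
    for k in range(1, len(times)):
        # Only segments lying fully inside the cluster range are counted
        if times[k - 1] >= t_start and times[k] <= t_end:
            area += 0.5 * (values[k] + values[k - 1]) * (times[k] - times[k - 1])
    return area


def feature_matrix(times, samples, cluster_ranges):
    """Element (i, j) is the integral of sample i over the range of cluster j."""
    return [[integrate(times, sample, start, end) for start, end in cluster_ranges]
            for sample in samples]


# Toy data: a common retention time axis, two samples and two cluster ranges
times = [0, 1, 2, 3, 4]
samples = [[0, 1, 2, 1, 0], [0, 0, 0, 2, 0]]
clusters = [(0, 2), (2, 4)]
print(feature_matrix(times, samples, clusters))
```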

Feature selection

Once the feature matrix was built, irrelevant features were discarded in order to reduce further the dimensionality of the matrix and the computational requirements. For example, some features could have only appeared because a peak was detected on a single sample or on a non-significant subset of samples. Those features were not taken into account by the classifier to avoid over-fitting problems.

Cluster selection: Number of samples

The first feature selection criterion was based on the idea that a cluster is not significant if it appears in "very few" samples. In other words, a cluster was selected as valid if it met at least one of these criteria:

- The cluster had peaks detected in more than 30% of the samples.

- The cluster had peaks detected in more than 30% of the samples of a particular level of the factor. This means that if a cluster was detected in more than 30% of the samples that have Met.A = "S", or any other possible value, the cluster was also selected.
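The two criteria above can be sketched as a simple filter over a detection table. The table layout (one row per sample, one boolean per cluster), the function name and the toy labels are hypothetical.

```python
def select_clusters(detected, labels, min_fraction=0.30):
    """Return indices of clusters seen in more than min_fraction of all samples,
    or in more than min_fraction of the samples of any single factor level."""
    n_samples = len(detected)
    n_clusters = len(detected[0])
    selected = []
    for j in range(n_clusters):
        # Global criterion: fraction over all samples
        keep = sum(row[j] for row in detected) / n_samples > min_fraction
        # Per-level criterion: fraction within each level of the factor
        for level in set(labels):
            rows = [row for row, lab in zip(detected, labels) if lab == level]
            if sum(row[j] for row in rows) / len(rows) > min_fraction:
                keep = True
        if keep:
            selected.append(j)
    return selected


# Toy example: 2 "S" samples and 8 "N" samples, 3 clusters
labels = ["S", "S"] + ["N"] * 8
detected = [[False, False, True] for _ in labels]
detected[0][0] = True  # cluster 0: seen in 1 of 2 "S" samples only
detected[2][1] = True  # cluster 1: seen in 1 of 8 "N" samples only
print(select_clusters(detected, labels))
```

In this toy case cluster 0 passes only through the per-level criterion, cluster 1 fails both and cluster 2 passes the global one.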

Cluster selection: Analysis of variance

The second feature selection criterion was based on an analysis of variance (ANOVA). Each cluster was tested for significance against the "Met.A" factor, and was selected if it was statistically significant against said factor with a p-value less than 0.05. Therefore a feature matrix was built with the clusters statistically significant for "Met.A".
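As an illustration of the test applied to each cluster, the sketch below computes the one-way ANOVA F statistic for the integrals of one cluster grouped by factor level. The function name and the numbers are hypothetical. The p-value that the text compares with 0.05 requires the F distribution with (k - 1, n - k) degrees of freedom, which is not computed here.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical integrals of one cluster for "S" and "N" samples
cluster_s = [4.0, 5.0, 6.0]
cluster_n = [1.0, 2.0, 3.0]
print(one_way_f([cluster_s, cluster_n]))
```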

Classification

A PCA/LDA classifier was built for factor Met.A.

Linear Discriminant Analysis

LDA (Linear Discriminant Analysis) aims to separate both classes based on Fisher's linear discriminant (Ronald A. Fisher, "The Use of Multiple Measurements in Taxonomic Problems," Annals of Human Genetics 7, no. 2 (1936): 179-188), which tries to maximize the inter-class variance while minimizing the in-class variance. A prior PCA step was performed in order to ensure that the dimensionality (the number of features) was small enough for LDA.

Principal Component Analysis

PCA (Principal Component Analysis) (Svante Wold, Kim Esbensen, and Paul Geladi, "Principal Component Analysis," Chemometrics and Intelligent Laboratory Systems 2, no. 1 (1987): 37-52) reduces dimensionality by capturing the greatest sources of variance in the first principal components. Thanks to the previous feature selection process, we could assume that the greatest contributions to the variance in the feature matrix belonged to the effect of the factors, and not to some other non-interesting effect. The number of components selected was determined in our problem by the number of samples of the least populated label divided by 4. This means that for the Met.A classifier we could use 16/4 = 4 principal components. This criterion was taken as a rule of thumb.

Validation

In order to validate the results and estimate the uncertainty, a stratified random-subsampling cross-validation was performed with 100 iterations and an 80% training - 20% test split. The stratification was done in order to have a fair proportion of "S" and "N" samples in the training set. A confusion matrix was built for each classifier and averaged in order to obtain the figures of merit.
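The resampling scheme can be sketched as follows. The label counts (16 "S" and 30 "N") are an assumption: only the size of the least populated label (16) is stated in the text, since Table 4 is not reproduced. Function names, the rounding of the test-set size and the seed are likewise illustrative.

```python
import random


def stratified_split(labels, test_fraction=0.2, rng=None):
    """Split sample indices into train/test, keeping the label proportions."""
    rng = rng or random.Random()
    train, test = [], []
    for level in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == level]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def repeated_splits(labels, n_iter=100, test_fraction=0.2, seed=0):
    """Random subsampling: n_iter independent stratified splits."""
    rng = random.Random(seed)
    return [stratified_split(labels, test_fraction, rng) for _ in range(n_iter)]


# Assumed label distribution ("?" samples are excluded from training and testing)
labels = ["S"] * 16 + ["N"] * 30
splits = repeated_splits(labels)
train, test = splits[0]
print(len(splits), len(train), len(test))
```

A classifier would then be fitted on each training set and its confusion matrix on the matching test set accumulated and averaged over the 100 iterations.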

Results and discussion

Feature extraction

Peak detection

Peaks detected by the matched filter are shown on figure 1. Between 4 and 6 peaks were detected every 30 seconds on average.

Peak clustering

The number of clusters is a parameter that must be estimated for the K-means algorithm. Knowing that a chromatogram lasts for 30 minutes approximately, the following estimation was given:

1800 seconds * [4,6] peaks / 30 seconds = [240,360] peaks.

Therefore the estimation of 325 peaks seemed reasonable. In figure 2, the distribution of clusters' ranges is shown. Most of the clusters extended over 5 seconds, in agreement with the peak detection routines.

Feature selection

Among the 325 clusters formed, 269 appeared in more than 30% of the samples of some Met.A level. The global criterion (clusters appearing in more than 30% of all samples) did not add any extra cluster to the selection.

Using the analysis of variance criterion, 7 clusters were statistically significant for the Met.A factor; they are shown in table 5:

Table 5

Classification

A visual representation of the ability of our PCA/LDA classifier to predict the classes is shown in figure 3 (LDA classification for "S"). The x axis represents the predicted probability of a sample being of type "S". In an ideal case the central plot, corresponding to "N" samples, would have all samples concentrated on the left (near x=0), whereas the bottom plot, corresponding to "S" samples, would have all samples concentrated on the right (near x=1, high "S" probability). We see that the Met.A classifier is very specific ("N" samples are predicted accurately).

This methodology described in example 2 can be applied to other detection means such as electronic noses, quartz crystal microbalance, surface acoustic waves, resistive or capacitive sensors, liquid chromatography, differential mobility analysis and ion mobility spectroscopy, capillary electrophoresis and infrared detection. Thus, the method of the invention is useful to predict with reasonable accuracy the presence or absence of narcolepsy in a patient by the analysis of its VOC profile through chromatographic or other detection means.

Example 3

A sweat collection was obtained as described in Example 1. 17 diseased and 27 healthy subjects participated in the experiment. Sweat samples were processed with an Agilent 7890 gas chromatograph coupled to a quadrupole mass spectrometer, all in full scan mode. A library of target compounds was built, including possible compounds found in the whole chromatograms, by comparison of any spectrum peaks with the reference spectral library NIST 2008. The signal/time/mass spectra matrix was deconvoluted with AMDIS using this home-built library. The data obtained were further aligned with Mass Profiler Professional B.02.01 (Agilent Technologies) in order to obtain a compound/abundance matrix. Abundances derived from column bleed were eliminated, then signals were normalized with respect to the total signal of the chromatogram, followed by elimination of variables not present in at least 25% of the chromatograms of a group. Finally, variables present in at least 75% of the chromatograms of a group were kept, and the result was subjected to a normalization test. Variables with zero abundance were treated so as to give the most similar global tendency inside the group. Multivariate discriminant analysis was performed with SIMCA-P+ 12.0 software (UMETRICS). A multivariate supervised Orthogonal Projection to Latent Structures was applied, resulting in an excellent separation of the groups (see figures 4 and 5), followed by Jack-knife analysis to find the statistically significant variables. Additionally, a univariate analysis was performed in Matlab (t-test, unequal variances, 0.05 value for p), together with Bonferroni and Benjamini significance tests.

After the application of the three tests (t-test, Benjamini and Jack-knife), statistically significant VOCs were selected and their variation percentage was calculated according to the following equation:

variation % = (mean value in diseased group - mean value in healthy group) * 100 / mean value in healthy group

As a result, at least the following relevant compounds were identified* (table 6):

* final confirmation of compounds is pending comparison with a standard

Table 6

Thus, variations in at least one of the above compounds were capable of distinguishing healthy from diseased subjects with 95% statistical significance, which is a great improvement with respect to the methods currently used. The proposed method could be based not only on the relative abundance of a single variable in patients compared to controls but also on some selective combination of more than one variable.
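The variation percentage used in Example 3 to rank the selected VOCs can be sketched as a short function. The function name and the abundance values are hypothetical and serve only to show the arithmetic.

```python
def variation_percent(diseased, healthy):
    """(mean in diseased group - mean in healthy group) * 100 / mean in healthy group."""
    mean_diseased = sum(diseased) / len(diseased)
    mean_healthy = sum(healthy) / len(healthy)
    return (mean_diseased - mean_healthy) * 100.0 / mean_healthy


# Hypothetical normalized abundances of one compound in each group
diseased_abundance = [3.0, 5.0]
healthy_abundance = [2.0, 2.0]
print(variation_percent(diseased_abundance, healthy_abundance))
```

A positive value indicates a compound that is more abundant in the diseased group, and a negative value one that is less abundant.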