Title:
METHODS FOR PROFILING AND QUANTITATING CELL-FREE RNA
Document Type and Number:
WIPO Patent Application WO/2015/069900
Kind Code:
A1
Abstract:
The invention generally relates to methods for assessing a neurological disorder by characterizing circulating nucleic acids in a blood sample. According to certain embodiments, methods for assessing a neurological disorder include obtaining RNA present in a blood sample of a patient suspected of having a neurological disorder, determining a level of RNA present in the sample that is specific to brain tissue, comparing the sample level of RNA to a reference level of RNA specific to brain tissue, determining whether a difference exists between the sample level and the reference level, and indicating a neurological disorder if a difference is determined.

Inventors:
KOH LIAN CHYE WINSTON (US)
QUAKE STEPHEN R (US)
FAN HEI-MUN CHRISTINA (US)
PAN WENYING (US)
Application Number:
PCT/US2014/064355
Publication Date:
May 14, 2015
Filing Date:
November 06, 2014
Assignee:
UNIV LELAND STANFORD JUNIOR (US)
International Classes:
C12Q1/68
Domestic Patent References:
WO2013113012A2    2013-08-01
Foreign References:
CA2838562A1    2013-02-14
Attorney, Agent or Firm:
MEYERS, Thomas, C. et al. (One Financial Center, Boston, MA, US)
Claims:
What is claimed is:

1. A method for characterizing a neurological disorder of a patient, the method comprising: obtaining RNA from a blood sample of a patient suspected of having a neurological disorder;

converting the RNA obtained in the sample into cDNA;

determining a level of the sample cDNA that corresponds to RNA originating from brain tissue;

comparing the level of the sample cDNA to a reference level of circulating RNA originating from brain tissue; and

indicating a neurological disorder based upon a statistically-significant deviation between the level of sample cDNA and the reference level.

2. The method of claim 1, further comprising the step of determining a stage of the indicated neurological disorder.

3. The method of claim 2, wherein the stage is selected from the group consisting of no cognitive impairment, mild cognitive impairment, moderate cognitive impairment, and severe cognitive impairment.

4. The method of claim 1, wherein the neurological disorder is Alzheimer's disease.

5. The method of claim 1, wherein the level of the sample cDNA and the reference level correspond to an amount of circulating RNA released from brain tissue selected from the group consisting of spinal cord, pituitary, hypothalamus, thalamus, corpus callosum, cerebrum, cerebral cortex, and combinations thereof.

6. The method of claim 1, further comprising the step of monitoring progression of the neurological disorder by repeating the steps of obtaining through comparing.

7. The method of claim 1, wherein the reference level comprises a level of cDNA corresponding to a patient population without cognitive impairment.

8. The method of claim 1, wherein the reference level comprises a level of cDNA corresponding to a patient population diagnosed with a neurological disorder.

9. The method of claim 1, wherein the blood sample is plasma or serum.

10. The method of claim 1, wherein the determining step is performed via a sequencing technique, a microarray technique, or both.

11. A method for characterizing a neurological disorder of a patient, the method comprising: obtaining RNA from a blood sample of a patient suspected of having a neurological disorder;

converting the RNA obtained in the sample into cDNA;

determining a level of the sample cDNA that corresponds to RNA originating from brain tissue;

comparing the level of the sample cDNA to a set of variables correlated with a neurological disorder, wherein the variables comprise reference levels of cDNA that correspond to circulating RNA originating from brain tissue and to one or more stages of the neurological disorder; and

indicating a stage of a neurological disorder of the patient based upon a statistically significant deviation between the level of the sample cDNA and the set of variables correlated with a neurological disorder.

12. The method of claim 11, wherein the reference levels of cDNA further correspond to patient populations of certain ages.

13. The method of claim 11, wherein the level of the sample cDNA and reference levels of cDNA correspond to an amount of circulating RNA released from brain tissue that is selected from the group consisting of pituitary, hypothalamus, thalamus, corpus callosum, cerebrum, cerebral cortex, and combinations thereof.

14. The method of claim 11, further comprising monitoring progression of the neurological disorder by repeating the detecting step through the indicating step at a future time.

15. The method of claim 11, wherein the stages are selected from the group consisting of no cognitive impairment, mild cognitive impairment, moderate cognitive impairment, and severe cognitive impairment.

16. The method of claim 11, wherein the neurological disorder is Alzheimer's disease.

17. The method of claim 11, wherein the blood sample is plasma or serum.

18. The method of claim 11, wherein the determining step is performed via a sequencing technique, a microarray technique, or both.

19. A method of characterizing a neurological disorder, comprising the steps of

obtaining RNA from a blood sample of a patient suspected of having a neurological disorder;

determining a level of RNA present in the sample that is specific to brain tissue;

comparing the sample level of RNA to a reference level of RNA specific to brain tissue;

determining whether a difference exists between the sample level and the reference level; and

indicating a neurological disorder if a difference is determined.

20. The method of claim 19, further comprising the step of determining a stage of the indicated neurological disorder.

21. The method of claim 19, wherein the stage is selected from the group consisting of no cognitive impairment, mild cognitive impairment, moderate cognitive impairment, and severe cognitive impairment.

22. The method of claim 19, wherein the neurological disorder is Alzheimer's disease.

23. The method of claim 19, wherein the level of sample RNA and the reference level of RNA correspond to an amount of circulating RNA released from brain tissue selected from the group consisting of pituitary, hypothalamus, thalamus, corpus callosum, cerebrum, cerebral cortex, and combinations thereof.

24. The method of claim 19, further comprising the step of monitoring progression of the neurological disorder by repeating the steps of obtaining through comparing at a future time.

25. The method of claim 19, wherein the reference level of RNA corresponds to a patient population diagnosed with a neurological disorder.

26. The method of claim 19, wherein the blood sample is plasma or serum.

27. The method of claim 19, wherein the determining step is performed via a sequencing technique, a microarray technique, or both.

28. A method for identifying one or more biomarkers associated with a neurological disorder, the method comprising

obtaining RNA present in a blood sample of a patient suspected of having a neurological disorder;

converting the RNA in the sample into cDNA;

determining levels of the sample cDNA that correspond to RNA originating from brain tissue;

comparing the levels of the sample cDNA to one or more reference levels that correspond to circulating RNA originating from brain tissue; and

identifying as a biomarker for a neurological disorder a level of sample cDNA that is statistically different from a reference level.

Description:
Methods for Profiling and Quantitating Cell-Free RNA

Related Applications

This application claims the benefit of and priority to U.S. Provisional No. 61/900,927, filed November 6, 2013, and is a continuation-in-part of U.S. Non-Provisional No. 13/752,131, filed January 28, 2013, which claims the benefit of and priority to U.S. Provisional No. 61/591,642, filed on January 27, 2012. The entirety of each foregoing application is incorporated herein by reference.

Technical Field

The present invention relates to assessing neurological disorders based on nucleic acid specific to brain tissue.

Background

Dementia is a catchall term used to characterize cognitive declines that interfere with one's ability to perform everyday activities. Signs of dementia include declines in the following mental functions: memory, communication and language, ability to focus and pay attention, reasoning, judgment, motor skills, and visual perception. While there are several neurological disorders that cause dementia, Alzheimer's disease is the most common, accounting for 60 to 80 percent of all dementia cases.

Alzheimer's disease is a progressive disease that gradually destroys memory and mental functions in patients. Symptoms manifest initially as a decline in memory followed by deterioration of other cognitive functions as well as by abnormal behavior. Individuals with Alzheimer's disease usually begin to show dementia symptoms later in life (e.g., 65 years or older), but a small percentage of individuals in their 40s and 50s experience early onset Alzheimer's disease. Alzheimer's disease is associated with the damage and degeneration of neurons in several regions of the brain. The neuropathic characteristics of Alzheimer's disease include the presence of plaques and tangles, synaptic loss, and selective neuronal cell death. Plaques are abnormal levels of protein fragments called beta-amyloid that accumulate between nerve cells. Tangles are twisted fibers of a protein known as tau that accumulate within nerve cells.

While the above-described neuropathic characteristics are hallmarks of the disease, the exact cause of Alzheimer's disease is unknown and there are no specific tests that confirm whether an individual has Alzheimer's disease. For diagnosis of Alzheimer's, clinicians assess a combination of clinical criteria, which may include a neurological exam, mental status tests, and brain imaging. Efforts are being made to determine the genetic causes in order to help definitively diagnose Alzheimer's disease. However, only a handful of genetic markers associated with Alzheimer's have been characterized to date, and diagnostic tests for those markers require invasive brain biopsies.

Summary

The present invention provides methods for assessing neurological conditions using circulating nucleic acid (such as DNA or RNA) that is specific to brain tissue. In particular embodiments, the invention involves a comparative analysis of levels of circulating nucleic acid in a patient that are specific to brain tissue with reference levels of circulating nucleic acid that are specific to brain tissue. The present invention recognizes that abnormal deviations in circulating nucleic acid result from tissue-specific nucleic acid being released into the blood in large amounts as tissue begins to fail and degrade. By focusing on genes the expression of which is highly specific to brain tissue, methods of the invention allow one to characterize the extent of brain degradation based on statistically-significant levels of circulating brain-specific transcripts, and to use that characterization to diagnose and determine the stage of the neurological disease. Accordingly, methods of the invention allow one to characterize neurological disorders without focusing on a small subset of known biomarkers, but rather focusing on the extent to which nucleic acid is released into blood from brain tissue affected by disease. Methods of the invention are particularly useful in diagnosing and determining the stage of Alzheimer's disease.

In particular embodiments, methods of the invention include obtaining RNA from a blood sample of a patient suspected of having a neurological disorder, and determining a level of the sample RNA that originated from brain tissue. In certain embodiments, the RNA is converted to cDNA. The level of the sample RNA specific to brain tissue is then compared to a reference level of RNA that is specific to brain tissue. The reference level may be derived from a subject or patient population having a neurological disorder or from a normal/control subject or patient population. Depending on the reference level chosen, similarities or variances between the level of sample RNA and the reference level of RNA are indicative of the neurological disorder, the type of neurological disorder and/or the stage of the neurological disorder. In certain embodiments, only similarities or variances of statistical significance are indicative of the neurological disorder. Whether a variance is significant depends upon the chosen reference population.
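The comparison of a sample level against a reference level can be illustrated with a minimal sketch. It assumes a single brain-specific transcript, a control reference population, and a standard score (z-score) as the measure of deviation; the function names, the threshold of 2.0, and all numeric levels are hypothetical and are not taken from the application.

```python
from statistics import mean, stdev


def z_score(sample_level, reference_levels):
    """Standard score of one sample level against a reference population."""
    mu = mean(reference_levels)
    sigma = stdev(reference_levels)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference levels have no spread")
    return (sample_level - mu) / sigma


def indicates_disorder(sample_level, reference_levels, threshold=2.0):
    """Flag a deviation when |z| exceeds a hypothetical threshold."""
    return abs(z_score(sample_level, reference_levels)) > threshold


# Hypothetical normalized levels of one brain-specific transcript
# in a control population (illustrative values only).
control_levels = [10.0, 11.0, 9.0, 10.5, 9.5]

elevated = indicates_disorder(14.0, control_levels)
typical = indicates_disorder(10.2, control_levels)
```

If the reference population were instead drawn from patients with a diagnosed disorder, a small deviation (similarity) rather than a large one would be the informative outcome, which is why the interpretation depends on the chosen reference.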

Additional aspects of the invention involve assessing a neurological disorder using a set of predictive variables correlated with a neurological disorder. In such aspects, methods of the invention involve detecting RNA present in a biological sample obtained from a patient suspected of having a neurological disorder. In certain embodiments, the RNA is converted to cDNA. Sample levels of one or more RNA transcripts that are specific to brain tissue are determined, and the sample levels of RNA transcripts specific to brain tissue are compared to a set of predictive variables correlated with a neurological disorder. The predictive variables may include reference levels of RNA transcripts that are specific to brain tissue and correspond to one or more stages of the neurological disorder. In certain embodiments, the predictive variables may include brain-specific reference levels of transcripts that correlate to other factors such as age, sex, environmental exposure, familial history of dementia, and dementia symptoms. The stage of a neurological disorder of the patient may be indicated based on variances or similarities between the level of sample RNA and the predictive variables.

RNA obtained from the blood sample may be converted into synthetic cDNA. In such instances, the sample levels of cDNA that correspond to RNA originating from brain tissue may be compared to reference levels of RNA or reference levels of cDNA that correspond to RNA originating from brain tissue. For example, methods of the invention may include the steps of detecting circulating RNA in a sample obtained from a patient suspected of having a neurological disorder and converting the circulating RNA from the sample into cDNA. The next steps involve determining levels of the sample cDNA that correspond to RNA originating from brain tissue, and comparing the determined levels of the cDNA to a reference level of cDNA. The reference level of cDNA may also correspond to RNA originating from brain tissue. The neurological condition of the patient may then be indicated based on similarities or differences between the patient cDNA levels and the reference cDNA levels.

Methods of the invention are also useful to identify one or more biomarkers associated with a neurological disorder. In such aspects, brain-specific transcripts of an individual or patient population suspected of having or actually having a neurological disorder (e.g. exhibiting impaired cognitive functions) are compared to a reference (e.g. brain-specific transcripts of a healthy, normal population). The brain-specific transcripts of the individual or patient population that are differentially expressed as compared to the reference may then be identified as biomarkers of the neurological disorder. In certain embodiments, only differentially expressed brain-specific transcripts that are statistically significant are identified as biomarkers. Methods of determining statistical significance are known in the art.
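One way to picture the biomarker screen is the sketch below. It assumes per-transcript levels for a patient group and a control group and uses Welch's t statistic with a fixed cutoff as a stand-in for whichever significance test is actually applied; the gene labels, the cutoff of 3.0, and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    if standard_error == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / standard_error


def candidate_biomarkers(patient_levels, control_levels, t_cutoff=3.0):
    """Return transcripts whose |t| exceeds a hypothetical cutoff."""
    return sorted(
        gene
        for gene in patient_levels
        if abs(welch_t(patient_levels[gene], control_levels[gene])) > t_cutoff
    )


# Hypothetical levels of two brain-specific transcripts (illustrative only).
patients = {"GENE_A": [8.0, 9.0, 10.0, 9.0], "GENE_B": [5.0, 6.0, 5.5, 5.5]}
controls = {"GENE_A": [4.0, 5.0, 6.0, 5.0], "GENE_B": [5.5, 5.0, 6.0, 5.5]}

selected = candidate_biomarkers(patients, controls)
```

A real screen over many transcripts would convert the statistic to a p-value and correct for multiple testing; the fixed cutoff here only keeps the sketch self-contained.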

The reference level of RNA or cDNA specific to brain tissue may pertain to a patient population having a particular condition or pertain to a normal/control patient population. In one embodiment, the reference level of RNA or cDNA specific to brain tissue may be levels of RNA or cDNA specific to brain tissue in a normal patient population. In another embodiment, the reference level of RNA or cDNA may be the level of RNA or cDNA specific to brain tissue in a patient population having a certain neurological disorder. The certain neurological disorder may be mild cognitive impairment or moderate-to-severe cognitive impairment. The various levels of cognitive impairment may be indicative of a stage of Alzheimer's disease. In further embodiments, the reference level of RNA or cDNA may be the level of RNA or cDNA specific to brain tissue having a certain neurological disorder at a certain age. Other embodiments may include reference levels that correspond to a variety of predictive variables, including type of neurological disorder, stage of neurological disorder, age, sex, environmental exposure, familial history of dementia, and dementia symptoms.

Methods of the invention involve assaying biological samples for circulating nucleic acid (RNA or DNA). Suitable biological samples may include blood, blood fractions, plasma, saliva, sputum, urine, semen, transvaginal fluid, and cerebrospinal fluid. Preferably, the sample is a blood sample. The blood sample may be plasma or serum.

The present invention also provides methods for profiling the origin of the cell-free RNA to assess the health of an organ or tissue. Deviations in normal cell-free transcriptomes are caused when organ/tissue-specific transcripts are released into the blood in large amounts as those organs/tissues begin to fail or are attacked by the immune system or pathogens. As a result, an inflammation process can occur as part of the body's complex biological response to these harmful stimuli. The invention, according to certain aspects, utilizes tissue-specific RNA transcripts of healthy individuals to deduce the relative optimal contributions of different tissues in the normal cell-free transcriptome, with each tissue-specific RNA transcript of the sample being indicative of the apoptotic rate of that tissue. The normal cell-free transcriptome serves as a baseline or reference level to assess tissue health of other individuals. The invention includes a comparative measurement of the cell-free transcriptome of a sample to the normal cell-free transcriptome to assess the sample levels of tissue-specific transcripts circulating in plasma and to assess the health of tissues contributing to the cell-free transcriptome.

In addition to cell-free transcriptome reference levels of normal patient populations, methods of the invention also utilize reference levels for cell-free transcriptomes specific to other patient populations. Using methods of the invention, one can determine the relative contribution of tissue-specific transcripts to the cell-free transcriptome of maternal subjects, fetus subjects, and/or subjects having a condition or disease.

By analyzing the health of tissue based on tissue-specific transcripts, methods of the invention advantageously allow one to assess the health of a tissue without relying on disease-related protein biomarkers. In certain aspects, methods of the invention assess the health of a tissue by comparing a sample level of RNA in a biological sample to a reference level of RNA specific to a tissue, determining whether a difference exists between the sample level and the reference level, and characterizing the tissue as abnormal if a difference is detected. For example, if a patient's RNA expression levels for a specific tissue differ from the RNA expression levels for the specific tissue in the normal cell-free transcriptome, this indicates that the patient's tissue is not functioning properly.

In certain aspects, methods of the invention involve assessing health of a tissue by characterizing the tissue as abnormal if a specified level of RNA is present in the blood. The method may further include detecting a level of RNA in a blood sample, comparing the sample level of RNA to a reference level of RNA specific to a tissue, determining whether a difference exists between the sample level and the reference level, and characterizing the tissue as abnormal if the sample level and the reference level are the same.

The present invention also provides methods for comprehensively profiling fetal specific cell-free RNA in maternal plasma and deconvoluting the cell-free transcriptome of fetal origin with relative proportion to different fetal tissue types. Methods of the invention involve the use of next-generation sequencing technology and/or microarrays to characterize the cell-free RNA transcripts that are present in maternal plasma at different stages of pregnancy. Quantification of these transcripts allows one to deduce changes of these genes across different trimesters, and hence provides a way to quantify temporal changes in transcripts. Methods of the invention allow diagnosis and identification of the potential for complications during or after pregnancy. Methods also allow the identification of pregnancy-associated transcripts which, in turn, elucidates maternal and fetal developmental programs. Methods of the invention are useful for preterm diagnosis as well as elucidation of transcript profiles associated with fetal developmental pathways generally. Thus, methods of the invention are useful to characterize fetal development and are not limited to characterization only of disease states or complications associated with pregnancy. Exemplary embodiments of the methods are described in the detailed description, claims, and figures provided below.

Brief Description of the Drawings

FIG. 1 depicts a listing of the top detected female pregnancy associated differentially expressed transcripts.

FIG. 2 shows plots of the two main principal components for cell free RNA transcript levels obtained in Example 1.

FIG. 3A depicts a heatmap of the top 100 cell free transcript levels exhibiting different temporal levels in preterm and normal pregnancy using microarrays. The heat map of FIG. 3A is split across FIG. 3A-1 and FIG. 3A-2, as indicated by the graphical figure outline.

FIG. 3B depicts a heatmap of the top 100 cell free transcript levels exhibiting different temporal levels in preterm and normal pregnancy using RNA-Seq. The heat map of FIG. 3B is split across FIG. 3B-1 and FIG. 3B-2, as indicated by the graphical figure outline.

FIG. 4 depicts a ranking of the top 20 transcripts differentially expressed between preterm and normal pregnancy.

FIG. 5 depicts results of a Gene Ontology analysis on the top 20 common RNA transcripts of FIG. 4, showing those transcripts enriched for proteins that are attached (integrated or loosely bound) to the plasma membrane or on the membranes of the platelets.

FIG. 6 depicts the gene expression profile for PVALB across the different trimesters, showing that premature births (highlighted in blue) have higher levels of cell free RNA transcripts than normal pregnancies.

FIG. 7 outlines exemplary process steps for determining the relative tissue contributions to a cell-free transcriptome of a sample. FIG. 7 is split across FIGS. 7A and 7B, as indicated by the graphical figure outline.

FIG. 8 depicts the panel of selected fetal tissue-specific transcripts generated in Example 2. FIG. 8 is split across FIGS. 8A and 8B, as indicated by the graphical figure outline.

FIGS. 9A and 9B depict the raw data of parallel quantification of the fetal tissue-specific transcripts showing changes across maternal time-points (first trimester, second trimester, third trimester, and post partum) using the actual cell free RNA as well as the cDNA library of the same cell free RNA.

FIG. 10 illustrates relative expression of placental genes across maternal time points (first trimester, second trimester, third trimester, and post partum). FIG. 10 is split across FIGS. 10A and 10B, as indicated by the graphical figure outline. FIG. 10A shows relative expression fold changes of each trimester as compared to post-partum for the panel of placental genes. Plotted are the results for two subjects done at two different concentrations each; each point represents one subject sampled at a particular trimester, and the cell free RNA went through the described protocol at two concentration levels. FIG. 10B depicts the same results segmented across the two subjects labeled as P53 & P54.

FIG. 11 illustrates relative expression of fetal brain genes across maternal time points (first trimester, second trimester, third trimester, and post partum). FIG. 11 is split across FIGS. 11A and 11B, as indicated by the graphical figure outline. FIG. 11A shows relative expression fold changes of each trimester as compared to post-partum for the panel of fetal brain genes. Plotted are the results for two subjects done at two different concentrations each; each point represents one subject sampled at a particular trimester, and the cell free RNA went through the described protocol at two concentration levels. FIG. 11B depicts the same results segmented across the two subjects labeled as P53 & P54.

FIG. 12 illustrates relative expression of fetal liver genes across maternal time points (first trimester, second trimester, third trimester, and post partum). FIG. 12 is split across FIGS. 12A and 12B, as indicated by the graphical figure outline. FIG. 12A shows relative expression fold changes of each trimester as compared to post-partum for the panel of fetal liver genes. Plotted are the results for two subjects done at two different concentrations each; each point represents one subject sampled at a particular trimester, and the cell free RNA went through the described protocol at two concentration levels. FIG. 12B depicts the same results segmented across the two subjects labeled as P53 & P54.

FIG. 13 illustrates the relative composition of different organs' contributions towards a plasma adult cell free transcriptome.

FIG. 14 illustrates a decomposition of organ contribution towards a plasma adult cell free transcriptome using RNA-seq data.

FIG. 15 shows a heat map of the tissue specific transcripts of Table 2 of Example 3, being detectable in the cell free RNA.

FIG. 16 depicts a flow-diagram of a method of the invention according to certain embodiments.

FIG. 17 illustrates identifying brain-specific cell-free RNA transcripts that differ between Alzheimer's subjects and control subjects.

FIG. 18 illustrates an experimental design comparing microarray, RNA-seq and quantitative PCR for a customized bioinformatics pipeline. In the experiment, 11 pregnant women and 4 non-pregnant control subjects were recruited. For all the pregnant patients, blood was drawn at the 1st, 2nd, and 3rd trimesters and postpartum. The cell-free plasma RNA was then extracted, amplified and characterized by Affymetrix microarray, Illumina sequencer and quantitative PCR.

FIG. 19 illustrates a heat map of temporal varying genes obtained from microarray analysis. Unsupervised clustering was performed on genes across different time points. A cluster of genes belonging to the CGB family, which are known to be expressed at high levels during the first trimester, exhibited correspondingly high levels of RNA in the first trimester.

FIG. 20 illustrates another heat map of temporal varying genes obtained from microarray analysis. Unsupervised clustering was performed on genes across different time points. A cluster of genes belonging to the CGB family, which are known to be expressed at high levels during the first trimester, exhibited correspondingly high levels of RNA in the first trimester.

FIG. 21 illustrates a list of genes identified with fetal SNPs using the experimental design of FIG. 18. The figure lists the gene transcripts with identified fetal SNPs and the captured temporal dynamics. The bar plot reflects the relative contribution of fetal SNPs as reflected in the sequencing data. The red bar reflects the extent of the relative fetal SNP contribution.

FIG. 22 identifies placental specific transcripts measured by qPCR in the experimental design of FIG. 18. As shown in FIG. 22, the time course of placental specific genes is measured by qPCR. The plot shows the Delta Ct value with respect to the housekeeping gene ACTB across the different trimesters of pregnancy, including after birth. General trends show elevated levels during the trimesters with a decline to low levels after the baby is born.

FIG. 23 identifies fetal brain specific transcripts measured by qPCR. As shown in FIG. 23, the time course of fetal brain specific genes is measured by qPCR. The plot shows the Delta Ct value with respect to the housekeeping gene ACTB across the different trimesters of pregnancy, including after birth. General trends show elevated levels during the trimesters with a decline to low levels after the baby is born.

FIG. 24 identifies fetal liver specific transcripts measured by qPCR. As shown in FIG. 24, the time course of fetal liver specific genes is measured by qPCR. The plot shows the Delta Ct value with respect to the housekeeping gene ACTB across the different trimesters of pregnancy, including after birth. General trends show elevated levels during the trimesters with a decline to low levels after the baby is born.

FIG. 25 illustrates tissue composition of the adult cell free transcriptome in typical adult plasma as a summation of RNAs from different tissue types.

FIG. 26 illustrates decomposition of the cell-free RNA transcriptome of a normal adult into its respective tissue types using microarray data and quadratic programming.

FIG. 27 depicts a Principal Component Analysis (PCA) space reflecting the unsupervised clustering of the patients using the gene expression data from the 48 genes assay.

FIG. 28 depicts the measured APP levels in patients. The left panel shows the levels of APP transcripts across different age groups in the study. The right panel shows the different levels of the APP transcripts of the combined population of patients.

FIG. 29 depicts the measured MOBP levels in patients. The left panel shows the levels of the MOBP transcripts across different age groups in the study. The right panel shows the different levels of the MOBP transcripts of the combined population of patients.

FIG. 30 depicts classification results using combined Z-scores.
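The Delta Ct values (FIGS. 22 to 24) and the fold changes relative to post-partum (FIGS. 10 to 12) can be related by a short worked sketch. It assumes the common 2^-ΔΔCt convention, which presumes roughly 100% amplification efficiency; the application does not state which convention was used, and the Ct values below are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct of a target transcript relative to a housekeeping gene."""
    return ct_target - ct_reference


def fold_change(dct_sample, dct_baseline):
    """Relative expression as 2^-(ddCt); assumes ~100% PCR efficiency."""
    return 2.0 ** -(dct_sample - dct_baseline)


# Hypothetical Ct values per time point: (target transcript, ACTB).
ct_values = {
    "first trimester": (27.0, 20.0),
    "second trimester": (26.0, 20.0),
    "third trimester": (25.0, 20.0),
    "postpartum": (30.0, 20.0),
}

# Post-partum serves as the baseline, as in the fold-change figures.
baseline = delta_ct(*ct_values["postpartum"])
fold_changes = {
    time_point: fold_change(delta_ct(target, reference), baseline)
    for time_point, (target, reference) in ct_values.items()
}
```

A lower Delta Ct means more transcript relative to ACTB, so in this illustrative data the fold change rises across the trimesters and equals 1 at the post-partum baseline.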

Detailed Description

Methods and materials described herein apply a combination of next-generation sequencing and microarray techniques for detecting, quantitating and characterizing RNA present in a biological sample. In certain embodiments, the biological sample contains a mixture of genetic material from different genomic sources, e.g. a pregnant female and a fetus. Unlike other methods of digital analysis in which the nucleic acid in the sample is isolated to a nominal single target molecule in a small reaction volume, methods of the present invention are conducted without diluting or distributing the genetic material in the sample. Methods of the invention allow for simultaneous screening of multiple transcriptomes, and provide informative sequence information for each transcript at the single-nucleotide level, thus providing the capability for non-invasive, high throughput screening for a broad spectrum of diseases or conditions in a subject from a limited amount of biological sample.

In one particular embodiment, methods of the invention involve analysis of mixed fetal and maternal RNA in the maternal blood to identify differentially expressed transcripts throughout different stages of pregnancy that may be indicative of a preterm or pathological pregnancy. Differential detection of transcripts is achieved, in part, by isolating and amplifying plasma RNA from the maternal blood throughout the different stages of pregnancy, and quantitating and characterizing the isolated transcripts via microarray and RNA-Seq.

Methods and materials specific for analyzing a biological sample containing RNA (including non-maternal, maternal, maternal-fetus mixed) as described herein, are merely one example of how methods of the invention can be applied and are not intended to limit the invention. Methods of the invention are also useful to screen for the differential expression of target genes related to cancer diagnosis, progression and/or prognosis using cell-free RNA in blood, stool, sputum, urine, transvaginal fluid, breast nipple aspirate, cerebrospinal fluid, etc.

In certain embodiments, methods of the invention generally include the following steps: obtaining a biological sample containing a mixture of genetic material from different genomic sources, isolating total RNA from the biological sample, preparing amplified cDNA from the total RNA, sequencing the amplified cDNA, and digitally counting, analyzing, and profiling the amplified cDNA.

Methods of the invention also involve assessing the health of a tissue contributing to the cell-free transcriptome. In certain embodiments, the invention involves assessing the cell-free transcriptome of a biological sample to determine tissue-specific contributions of individual tissues to the cell-free transcriptome. According to certain aspects, the invention assesses the health of a tissue by detecting a sample level of RNA in a biological sample, comparing the sample level of RNA to a reference level of RNA specific to the tissue, and characterizing the tissue as abnormal if a difference is detected. This method is applicable to characterize the health of a tissue in non-maternal subjects, pregnant subjects, and live fetuses. FIG. 16 depicts a flow-diagram of this method according to certain embodiments.

In certain aspects, methods of the invention employ a deconvolution of a reference cell-free RNA transcriptome to determine a reference level for a tissue. Preferably, the reference cell-free RNA transcriptome is a normal, healthy transcriptome, and the reference level of a tissue is a relative level of RNA specific to the tissue present in the blood of healthy, normal individuals. Methods of the invention assume that apoptotic cells from different tissue types release their RNA into plasma of a subject. Each of these tissues expresses a specific number of genes unique to the tissue type, and the cell-free RNA transcriptome of a subject is a summation of the contributions of the different tissue types. Each tissue may express one or more such genes. In certain embodiments, the reference level is a level associated with one of the genes expressed by a certain tissue. In other embodiments, the reference level is a level associated with a plurality of genes expressed by a certain tissue. It should be noted that a reference level or threshold amount for a tissue-specific transcript present in circulating RNA may be zero or a positive number.

For healthy, normal subjects, the relative contributions of circulating RNA from different tissue types are relatively stable, and each tissue-specific RNA transcript of the cell-free RNA transcriptome for normal subjects can serve as a reference level for that tissue. Applying methods of the invention, a tissue is characterized as unhealthy or abnormal if a sample includes a level of RNA that differs from a reference level of RNA specific to the tissue. The tissue of the sample may be characterized as unhealthy if the actual level of RNA is statistically different from the reference level. Statistical significance can be determined by any method known in the art. These measurements can be used to screen for organ health, as a diagnostic tool, and as a tool to measure response to pharmaceuticals or in clinical trials to monitor health.

If a difference is detected between the sample level of RNA and the reference level of RNA, such difference suggests that the associated tissue is not functioning properly. The change in circulating RNA may be the precursor to organ failure or indicate that the tissue is being attacked by the immune system or pathogens. If a tissue is identified as abnormal, the next step(s), according to certain embodiments, may include more extensive testing of the tissue (e.g. invasive biopsy of the tissue), prescribing a course of treatment specific to the tissue, and/or routine monitoring of the tissue. Methods of the invention can be used to infer organ health non-invasively. This noninvasive testing can be used to screen for appendicitis, incipient diabetes and pathological conditions induced by diabetes such as nephropathy, neuropathy, retinopathy etc. In addition, the invention can be used to determine the presence of graft versus host disease in organ transplants, particularly in bone marrow transplant recipients whose new immune system is attacking the skin, GI tract or liver. The invention can also be used to monitor the health of solid organ transplant recipients such as heart, lung and kidney. The methods of the invention can assess likelihood of prematurity, preeclampsia and anomalies in pregnancy and fetal development. In addition, methods of the invention could be used to identify and monitor neurological disorders (e.g. multiple sclerosis and Alzheimer's disease) that involve cell specific death (e.g. of neurons or due to demyelination) or that involve the generation of plaques or protein aggregation.

A cell-free transcriptome for purposes of determining a reference level for tissue-specific transcripts can be the cell-free transcriptome of one or more normal subjects, maternal subjects, subjects having certain conditions or diseases, or fetus subjects. In the case of certain conditions, the reference level of a tissue is a level of RNA specific to the tissue present in blood of one or more subjects having a certain disease or condition. In such aspect, the method includes detecting a level of RNA in a blood sample, comparing the sample level of RNA to a reference level of RNA specific to a tissue, determining whether a difference exists between the sample level and the reference level, and characterizing the tissue as abnormal if the sample level and the reference level are the same.

A deconvolution of a cell-free transcriptome is used to determine the relative contribution of each tissue type towards the cell-free RNA transcriptome. The following steps are employed to determine the relative RNA contributions of certain tissues in a sample. First, a panel of tissue-specific transcripts is identified. Second, total RNA in plasma from a sample is determined using methods known in the art. Third, the total RNA is assessed against the panel of tissue-specific transcripts, and the total RNA is considered a summation of these different tissue-specific transcripts. Quadratic programming can be used as a constrained optimization method to deduce the relative optimal contributions of different organs/tissues towards the cell-free transcriptome of the sample. One or more databases of genetic information can be used to identify a panel of tissue-specific transcripts. Accordingly, aspects of the invention provide systems and methods for the use and development of a database. Particularly, methods of the invention utilize databases containing existing data generated across tissue types to identify the tissue-specific genes.

Databases utilized for identification of tissue-specific genes include the Human 133A/GNF1H Gene Atlas and RNA-Seq Atlas, although any other database or literature can be used. In order to identify tissue-specific transcripts from one or more databases, certain embodiments apply a template-matching algorithm to the databases. Template matching algorithms used to filter data are known in the art, see e.g., Pavlidis P, Noble WS (2001) Analysis of strain and regional variation in gene expression in mouse brain. Genome Biol 2:research0042.1-0042.15.
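The template-matching step can be illustrated with a minimal sketch. It assumes one simple form of template matching: each gene's expression profile across tissues is correlated (Pearson) with a one-hot template that is 1 for the target tissue and 0 elsewhere, and genes above a cutoff are kept. The function names, the gene labels, the expression values, and the 0.9 cutoff are all hypothetical and are not taken from the databases named above.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no tissue preference
    return cov / sqrt(vx * vy)

def tissue_specific_genes(expression, tissues, target, min_r=0.9):
    """Keep genes whose profile matches a one-hot template for `target`."""
    template = [1.0 if t == target else 0.0 for t in tissues]
    hits = []
    for gene, profile in expression.items():
        r = pearson(profile, template)
        if r >= min_r:  # min_r is an assumed cutoff
            hits.append((gene, r))
    return sorted(hits, key=lambda h: -h[1])

# Hypothetical atlas values in arbitrary units (illustration only).
tissues = ["brain", "liver", "kidney", "placenta"]
expression = {
    "GENE_A": [950.0, 12.0, 9.0, 15.0],       # brain-restricted
    "GENE_B": [300.0, 310.0, 290.0, 305.0],   # broadly expressed
    "GENE_C": [20.0, 880.0, 25.0, 18.0],      # liver-restricted
}
brain_panel = tissue_specific_genes(expression, tissues, "brain")
print(brain_panel)
```

A broadly expressed gene correlates poorly with any one-hot template, so it drops out of every tissue panel, which is the filtering behavior the panel construction relies on.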

In certain embodiments, quadratic programming is used as a constrained optimization method to deduce relative optimal contributions of different organs/tissues towards the cell-free transcriptome in a sample. Quadratic programming is known in the art and described in detail in D. Goldfarb and A. Idnani (1982). Dual and Primal-Dual Methods for Solving Strictly Convex Quadratic Programs. In J. P. Hennart (ed.), Numerical Analysis, Springer-Verlag, Berlin, pages 226-239, and D. Goldfarb and A. Idnani (1983). A numerically stable dual method for solving strictly convex quadratic programs. Mathematical Programming, 27, 1-33.

FIG. 7 outlines exemplary process steps for determining the relative tissue contributions to a cell-free transcriptome of a sample. Using information provided by one or more tissue-specific databases, a panel of tissue-specific genes is generated with a template-matching function. A quality control function can be applied to filter the results. A blood sample is then analyzed to determine the relative contribution of each tissue-specific transcript to the total RNA of the sample. Cell-free RNA is extracted from the sample, and the cell-free RNA extractions are processed using one or more quantification techniques (e.g. standard microarrays and RNA-sequencing protocols). The obtained gene expression values for the sample are then normalized. This involves rescaling all gene expression values relative to the housekeeping genes. Next, the sample's total RNA is assessed against the panel of tissue-specific genes using quadratic programming in order to determine the tissue-specific relative contributions to the sample's cell-free transcriptome. The following constraints are employed to obtain the estimated relative contributions during the quadratic programming analysis: a) the RNA contributions of different tissues are greater than or equal to zero, and b) the sum of all contributions to the cell-free transcriptome equals one.
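The constrained fit described above can be sketched as a least-squares problem: find tissue fractions x that minimize the squared difference between the observed panel expression and the mixture of tissue signatures, subject to x >= 0 and sum(x) = 1. The sketch below is an illustration under stated assumptions. It solves this quadratic program with projected gradient descent onto the probability simplex, which is a swapped-in technique chosen to keep the example dependency-free; it is not the Goldfarb-Idnani dual method cited in this description. The signature matrix, the sample values, and all function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the shift from the largest valid prefix
    return [max(vi - theta, 0.0) for vi in v]

def deconvolve(signatures, sample, steps=2000):
    """Estimate tissue fractions.

    signatures[g][t]: expression of panel gene g in tissue t.
    sample[g]: normalized expression of gene g in the cell-free sample.
    """
    n_genes, n_tissues = len(signatures), len(signatures[0])
    # Step size from a crude Lipschitz bound:
    # 2 * ||S||_F^2 >= 2 * largest eigenvalue of S^T S.
    lipschitz = 2.0 * sum(s * s for row in signatures for s in row)
    step = 1.0 / lipschitz
    x = [1.0 / n_tissues] * n_tissues  # start from equal contributions
    for _ in range(steps):
        residual = [
            sum(row[t] * x[t] for t in range(n_tissues)) - sample[g]
            for g, row in enumerate(signatures)
        ]
        grad = [
            2.0 * sum(signatures[g][t] * residual[g] for g in range(n_genes))
            for t in range(n_tissues)
        ]
        x = project_to_simplex([x[t] - step * grad[t] for t in range(n_tissues)])
    return x

# Hypothetical signatures: 4 panel genes (rows) x 3 tissues (columns).
signatures = [
    [1.0, 0.0, 0.1],
    [0.0, 1.0, 0.1],
    [0.1, 0.0, 1.0],
    [0.0, 0.2, 0.9],
]
true_fractions = [0.6, 0.3, 0.1]
# Build a noise-free synthetic sample as the mixture of the signatures.
sample = [sum(s * f for s, f in zip(row, true_fractions)) for row in signatures]

estimate = deconvolve(signatures, sample)
print([round(f, 4) for f in estimate])
```

In practice a dedicated quadratic programming solver would typically replace the hand-written loop; the projection step is what enforces constraints a) and b) at every iteration.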

Methods of the invention for determining the relative contributions for each tissue can be used to determine the reference level for the tissue. That is, a certain population of subjects (e.g., maternal, normal, cancerous, Alzheimer's (and various stages thereof)) can be subjected to the deconvolution process outlined in FIG. 7 to obtain reference levels of tissue-specific gene expression for that patient population. When relative tissue contributions are considered individually, quantification of each of these tissue-specific transcripts can be used as a measure for the reference apoptotic rate of that particular tissue for that particular population. For example, blood from one or more healthy, normal individuals can be analyzed to determine the relative RNA contribution of tissues to the cell-free RNA transcriptome for healthy, normal individuals. Each relative RNA contribution of tissue that makes up the normal RNA transcriptome is a reference level for that tissue.

According to certain embodiments, an unknown sample of blood can be subjected to the process outlined in FIG. 7 to determine the relative tissue contributions to the cell-free RNA transcriptome of that sample. The relative tissue contributions of the sample are then compared to one or more reference levels of the relative contributions to a reference cell-free RNA transcriptome. If a specific tissue shows a contribution to the cell-free RNA transcriptome in the sample that is greater or less than the contribution of the specific tissue in a reference cell-free RNA transcriptome, then the tissue exhibiting differential contribution may be characterized accordingly. If the reference cell-free transcriptome represents a healthy population, a tissue exhibiting a differential RNA contribution in a sample cell-free transcriptome can be classified as unhealthy.
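One way to make the comparison against a healthy reference concrete is a per-tissue z-score, sketched below. This is an assumption for illustration: the description leaves the choice of statistical test open, and the cutoff of 3 standard deviations, the function name, and every number in the example are hypothetical.

```python
from statistics import mean, stdev

def flag_tissues(reference_cohort, sample, z_cutoff=3.0):
    """Compare a sample's tissue contributions to a healthy reference cohort.

    reference_cohort: list of dicts, one per reference subject,
                      mapping tissue -> relative contribution.
    sample: dict mapping tissue -> relative contribution.
    Returns tissue -> (z_score, is_differential).
    """
    flags = {}
    for tissue, value in sample.items():
        ref = [subject[tissue] for subject in reference_cohort]
        z = (value - mean(ref)) / stdev(ref)
        # Both higher and lower contributions count as differential.
        flags[tissue] = (z, abs(z) > z_cutoff)
    return flags

# Hypothetical relative contributions (fractions of the cell-free transcriptome).
reference_cohort = [
    {"brain": 0.020, "liver": 0.10},
    {"brain": 0.022, "liver": 0.12},
    {"brain": 0.018, "liver": 0.11},
    {"brain": 0.021, "liver": 0.09},
    {"brain": 0.019, "liver": 0.13},
]
sample = {"brain": 0.035, "liver": 0.115}

flags = flag_tissues(reference_cohort, sample)
for tissue, (z, differential) in flags.items():
    print(tissue, round(z, 2), "differential" if differential else "within reference range")
```

With a reference cohort this small the standard deviation is itself uncertain, so a real analysis would need a larger cohort or a test suited to small samples.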

The biological sample can be blood, saliva, sputum, urine, semen, transvaginal fluid, cerebrospinal fluid, sweat, breast milk, breast fluid (e.g., breast nipple aspirate), stool, a cell or a tissue biopsy. In certain embodiments, samples of the same type of biological sample are obtained at multiple different time points in order to analyze differential transcript levels in the biological sample over time. For example, maternal plasma may be analyzed in each trimester. In some embodiments, the biological sample is drawn blood containing circulating nucleic acids, such as cell-free RNA. The cell-free RNA may be from different genomic sources and is found in the blood or plasma, rather than in cells. In a particular embodiment, the drawn blood is maternal blood. In order to obtain a sufficient amount of nucleic acids for testing, it is preferred that approximately 10-50 mL of blood be drawn. However, less blood may be drawn for a genetic screen in which less statistical significance is required, or in which the RNA sample is enriched for fetal RNA.

Methods of the invention involve isolating total RNA from a biological sample. Total RNA can be isolated from the biological sample using any methods known in the art. In certain embodiments, total RNA is extracted from plasma. Plasma RNA extraction is described in Enders et al., "The Concentration of Circulating Corticotropin-releasing Hormone mRNA in Maternal Plasma Is Increased in Preeclampsia," Clinical Chemistry 49: 727-731, 2003. As described there, plasma harvested after centrifugation steps is mixed with Trizol LS reagent (Invitrogen) and chloroform. The mixture is centrifuged, and the aqueous layer transferred to new tubes. Ethanol is added to the aqueous layer. The mixture is then applied to an RNeasy mini column (Qiagen) and processed according to the manufacturer's recommendations.

In the embodiments where the biological sample is maternal blood, the maternal blood may optionally be processed to enrich the fetal RNA concentration in the total RNA. For example, after extraction, the RNA can be separated by gel electrophoresis and the gel fraction containing circulatory RNA with a size corresponding to fetal RNA (e.g., <300 bp) is carefully excised. The RNA is extracted from this gel slice and eluted using methods known in the art.

Alternatively, fetal-specific RNA may be concentrated by known methods, including centrifugation and various enzyme inhibitors. The RNA is bound to a selective membrane (e.g., silica) to separate it from contaminants. The RNA is preferably enriched for fragments circulating in the plasma, which are less than 300 bp. This size selection is done on an RNA size separation medium, such as an electrophoretic gel or chromatography material.

Flow cytometry techniques can also be used to enrich for fetal cells in maternal blood (Herzenberg et al., PNAS 76: 1453-1455 (1979); Bianchi et al., PNAS 87: 3279-3283 (1990); Bruch et al., Prenatal Diagnosis 11: 787-798 (1991)). U.S. Patent No. 5,432,054 also describes a technique for separation of fetal nucleated red blood cells, using a tube having a wide top and a narrow, capillary bottom made of polyethylene. Centrifugation using a variable speed program results in a stacking of red blood cells in the capillary based on the density of the molecules. The density fraction containing low-density red blood cells, including fetal red blood cells, is recovered and then differentially hemolyzed to preferentially destroy maternal red blood cells. A density gradient in a hypertonic medium is used to separate red blood cells, now enriched in the fetal red blood cells from lymphocytes and ruptured maternal cells. The use of a hypertonic solution shrinks the red blood cells, which increases their density, and facilitates purification from the more dense lymphocytes. After the fetal cells have been isolated, fetal RNA can be purified using standard techniques in the art.

Further, an agent that stabilizes cell membranes may be added to the maternal blood to reduce maternal cell lysis including but not limited to aldehydes, urea formaldehyde, phenol formaldehyde, DMAE (dimethylaminoethanol), cholesterol, cholesterol derivatives, high concentrations of magnesium, vitamin E, and vitamin E derivatives, calcium, calcium gluconate, taurine, niacin, hydroxylamine derivatives, bimoclomol, sucrose, astaxanthin, glucose, amitriptyline, isomer A hopane tetral phenylacetate, isomer B hopane tetral phenylacetate, citicoline, inositol, vitamin B, vitamin B complex, cholesterol hemisuccinate, sorbitol, calcium, coenzyme Q, ubiquinone, vitamin K, vitamin K complex, menaquinone, zonegran, zinc, ginkgo biloba extract, diphenylhydantoin, perftoran, polyvinylpyrrolidone, phosphatidylserine, tegretol, PABA, disodium cromoglycate, nedocromil sodium, phenytoin, zinc citrate, mexitil, dilantin, sodium hyaluronate, or poloxamer 188.

An example of a protocol for using this agent is as follows: The blood is stored at 4° C. until processing. The tubes are spun at 1000 rpm for ten minutes in a centrifuge with braking power set at zero. The tubes are spun a second time at 1000 rpm for ten minutes. The supernatant (the plasma) of each sample is transferred to a new tube and spun at 3000 rpm for ten minutes with the brake set at zero. The supernatant is transferred to a new tube and stored at -80° C. Approximately two milliliters of the "buffy coat," which contains maternal cells, is placed into a separate tube and stored at -80° C.

Methods of the invention also involve preparing amplified cDNA from total RNA.

cDNA is prepared and indiscriminately amplified without diluting the isolated RNA sample or distributing the mixture of genetic material in the isolated RNA into discrete reaction samples. Preferably, amplification is initiated at the 3' end as well as randomly throughout the whole transcriptome in the sample to allow for amplification of both mRNA and non-polyadenylated transcripts. The double-stranded cDNA amplification products are thus optimized for the generation of sequencing libraries for Next Generation Sequencing platforms. Suitable kits for amplifying cDNA in accordance with the methods of the invention include, for example, the Ovation® RNA-Seq System.

Methods of the invention also involve sequencing the amplified cDNA. While any known sequencing method can be used to sequence the amplified cDNA mixture, single molecule sequencing methods are preferred. Preferably, the amplified cDNA is sequenced by whole transcriptome shotgun sequencing (also referred to herein as "RNA-Seq"). Whole transcriptome shotgun sequencing (RNA-Seq) can be accomplished using a variety of next-generation sequencing platforms such as the Illumina Genome Analyzer platform, ABI Solid Sequencing platform, or Life Science's 454 Sequencing platform.

Methods of the invention further involve subjecting the cDNA to digital counting and analysis. The number of amplified sequences for each transcript in the amplified sample can be quantitated via sequence reads (one read per amplified strand). Unlike previous methods of digital analysis, sequencing allows for detection and quantitation at the single-nucleotide level for each transcript present in a biological sample containing genetic material from different genomic sources and therefore multiple transcriptomes.

After digital counting, the ratios of the various amplified transcripts can be compared to determine relative amounts of differential transcript in the biological sample. Where multiple biological samples are obtained at different time-points, the differential transcript levels can be characterized over the course of time.
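The counting and ratio comparison can be sketched as follows. The sketch assumes that read counts are first rescaled to counts per million so that samples sequenced to different depths are comparable, and that a small pseudocount guards against division by zero. Both choices, the function names, the transcript labels, and the counts are hypothetical and are not specified in this description.

```python
def counts_per_million(read_counts):
    """Rescale raw read counts so that each sample sums to one million."""
    total = sum(read_counts.values())
    return {tx: 1e6 * n / total for tx, n in read_counts.items()}

def fold_changes(earlier, later, pseudocount=1.0):
    """Ratio of normalized transcript levels between two time points.

    The pseudocount (an assumed value) avoids division by zero for
    transcripts that were not observed at the earlier time point.
    """
    a, b = counts_per_million(earlier), counts_per_million(later)
    return {
        tx: (b.get(tx, 0.0) + pseudocount) / (a[tx] + pseudocount)
        for tx in a
    }

# Hypothetical read counts for two transcripts at two time points.
first_trimester = {"TX_A": 100, "TX_B": 900}
second_trimester = {"TX_A": 400, "TX_B": 600}

changes = fold_changes(first_trimester, second_trimester)
for tx, ratio in changes.items():
    print(tx, round(ratio, 3))
```

A ratio above 1 indicates a transcript whose relative level rose between the two time points, and a ratio below 1 indicates one whose relative level fell.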

Differential transcript levels within the biological sample can also be analyzed using microarray techniques. The amplified cDNA can be used to probe a microarray containing gene transcripts associated with one or more conditions or diseases, such as any prenatal condition, or any type of cancer, inflammatory, or autoimmune disease.

It will be understood that methods and any flow diagrams disclosed herein can be implemented by computer program instructions. These program instructions may be provided to a computer processor, such that the instructions, which execute on the processor, create means for implementing the actions specified in the flowchart blocks or described in methods for assessing tissue disclosed herein. The computer program instructions may be executed by a processor to cause a series of operational steps to be performed by the processor to produce a computer implemented process. The computer program instructions may also cause at least some of the operational steps to be performed in parallel. Moreover, some of the steps may also be performed across more than one processor, such as might arise in a multi-processor computer system. In addition, one or more processes may also be performed concurrently with other processes or even in a different sequence than illustrated without departing from the scope or spirit of the invention.

The computer program instructions can be stored on any suitable computer-readable medium including, but not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computing device.

In certain aspects, methods of the invention can be used to determine cell-free RNA transcripts specific to a certain tissue, and use those transcripts to diagnose disorders and diseases associated with that tissue. In certain embodiments, methods of the invention can be used to determine cell-free RNA transcripts specific to the brain, and use those transcripts to diagnose neurological disorders (such as Alzheimer's disease). For example, methods of profiling cell-free RNA described herein can be used to differentiate subjects with neurological disorders from normal subjects because cell-free RNA transcripts associated with certain neurological disorders are present at statistically significantly different levels than the same cell-free RNA transcripts in normal healthy populations. As a result, one is able to utilize levels of those RNA transcripts for clear and simple diagnostic tests.

In accordance with certain embodiments, cell-free RNA transcripts that originate from brain tissue can be further examined as potential biomarkers for neurological disorders. In certain embodiments, once a brain-specific cell-free RNA transcript is determined, levels of the brain-specific cell-free RNA transcript in normal patients are compared to levels in patients with certain neurological disorders. In instances where the levels of a brain-specific cell-free RNA transcript consistently exhibit a statistically significant difference between subjects with a certain neurological disorder and normal subjects, that brain-specific cell-free RNA transcript can be used as a biomarker for that neurological disorder. For example, the inventors have found that measurements of PSD3 and APP cell-free RNA transcript levels in plasma for Alzheimer's disease patients are statistically different from the levels of PSD3 and APP cell-free RNA in normal subjects. According to certain aspects, a neurological disorder is indicated in a patient based on a comparison of the patient's circulating nucleic acid that is specific to brain tissue and circulating nucleic acid of a reference or multiple references that is specific to brain tissue. In particular, the circulating nucleic acid is RNA, but may also be DNA. In certain embodiments, levels of brain-specific circulating RNA present in a reference population are used as thresholds that are indicative of a condition. The condition may be a normal healthy condition or may be a diseased condition (e.g. neurological disorder, Alzheimer's disease generally or a particular stage of Alzheimer's disease). When the threshold is indicative of a diseased condition, the patient's transcript levels that are underexpressed or overexpressed in comparison to the threshold may indicate that the patient does not have the disease. When the threshold is indicative of a normal condition, the patient's transcript levels that are underexpressed or overexpressed in comparison to the threshold may indicate that the patient has the disease.

Reference RNA levels (e.g. levels of circulating RNA) may be obtained by statistically analyzing the brain-specific transcript levels of a defined patient population. The reference levels may pertain to a healthy patient population or a patient population with a particular neurological disorder. In further examples, the reference levels may be tailored to a more specific patient population. For example, a reference level may correlate to a patient population of a certain age and/or correspond to a patient population exhibiting symptoms associated with a particular stage of a neurological disorder. Other factors for tailoring the patient population for reference levels may include sex, familial history, environmental exposure, and/or phenotypic traits.

Brain-specific genes or transcripts may be determined by deconvolving the cell-free transcriptome as described above and outlined in FIG. 7. Brain-specific genes or transcripts may also be determined by directly analyzing brain tissue. In addition, Tables 1 and 2, as listed in Example 4 below, provide genes whose expression profiles are unique to certain tissue types. Particularly, Tables 1 and 2 list brain-specific genes corresponding with hypothalamus as well as genes corresponding with the whole brain (e.g. most brain tissue), prefrontal cortex, thalamus, etc. In certain embodiments, brain-specific genes or transcripts include APP, PSD3, MOBP, MAG, SLC2A1, TCF7L2, CDH22, CNTF, and PAQR6.

The brain-specific transcripts used in methods of the invention may correspond to cell-free transcripts released from certain types of brain tissue. The types of brain tissue include the pituitary, hypothalamus, thalamus, corpus callosum, cerebrum, cerebral cortex, and combinations thereof. In particular embodiments, the brain-specific transcripts correspond with the hypothalamus. The hypothalamus is bounded by specialized brain regions that lack an effective blood/brain barrier, and thus transcripts released from the hypothalamus are likely to be introduced into blood or plasma.

FIG. 19 illustrates the difference in levels of PSD3 and APP cell-free RNA between subjects with Alzheimer's disease (AD) and normal subjects. Measurements of PSD3 and APP cell-free RNA transcript levels in plasma show that the levels of these two transcripts are elevated in AD patients and can be used to cleanly separate the AD patients from the normal patients. The figure shows only two of the potential transcripts with significant diagnostic potential. High throughput microfluidic chips allow for simultaneous measurements of other brain-specific transcripts, which can improve the classification process.

In particular aspects, brain-specific transcripts are used to characterize and diagnose neurological disorders. The neurological disorder characterized may include degenerative neurological disorders, such as Alzheimer's disease, Parkinson's disease, Huntington's disease, and some types of multiple sclerosis. The most common neurological disorder is Alzheimer's disease. In some instances, the neurological disorder is classified by the extent of cognitive impairment, which may include no impairment, mild impairment, moderate impairment, and severe impairment.

Alzheimer's disease is characterized into stages based on the cognitive symptoms that occur as the disease progresses. Stage 1 involves no impairment (normal function). The person does not experience any memory problems or signs of dementia. Stage 2 involves a very mild decline in cognitive functions. During Stage 2, a person may experience mild memory loss, but cognitive impairment is not likely noticeable by friends, family, and treating physicians. Stage 3 involves a mild cognitive decline, in which friends, family, and treating physicians may notice difficulties in the individual's memory and ability to perform tasks. Examples include trouble identifying certain words, noticeable difficulty in performing tasks in social or work settings, and forgetting just-read materials. Stage 4 involves moderate cognitive decline, which is noticeable and causes a significant impairment on the individual's daily life. In Stage 4, the individual will have trouble performing everyday complex tasks, such as managing finances and planning social gatherings, will have trouble remembering their own personal history, and may become moody or withdrawn. Stage 5 involves moderately severe cognitive decline, in which gaps in memory and thinking are noticeable and the individual will begin to need help with certain activities. In Stage 5, individuals will be confused about the day, will have trouble with recalling particular details (such as phone number and street address), but will be able to remember significant details about themselves and their loved ones. Stage 6 involves severe cognitive decline, as the individual's memory continues to worsen. Individuals in Stage 6 will likely need extensive help with daily activities because they lose awareness of their surroundings and while they often remember certain tasks, they forget how to complete them or make mistakes (e.g. wearing pajamas during the day, forgetting to rinse after shampooing, wearing shoes on the wrong feet). Stage 7 involves very severe cognitive decline and is the final stage of Alzheimer's disease. In Stage 7, individuals lose their ability to respond to the environment, remember others, carry on a conversation, and control movement. Individuals need help with daily care, eating, dressing, and using the bathroom, and have abnormal reflexes and tense muscles. Individuals may still be verbal, but will not make sense or relate to the present.

In certain embodiments, methods for assessing a neurological disorder involve a comparison of one or more brain-specific transcripts of an individual to a set of predictive variables correlated with the neurological disorder. The set of predictive variables may include a variety of reference levels that are brain specific. For instance, the set of predictive variables may include brain-specific transcript levels of a plurality of references. For example, one reference level may correspond to a normal patient population and another reference level may correspond to a patient population with the neurological disorder. In further examples, the references may correspond to more specific patient populations. For example, each reference level may correlate to a patient population of a certain age and/or correspond to a patient population exhibiting symptoms associated with a particular stage of a neurological disorder. Other factors for tailoring the patient population for reference levels may include sex, familial history, environmental exposure, and/or phenotypic traits.

Statistical analyses can be used to determine brain-specific reference levels of certain patient populations (such as those discussed above). Statistical analyses for identifying trends in patient populations and comparing patient populations are known in the art. Suitable statistical analyses include, but are not limited to, clustering analysis, principal component analysis, non-parametric statistical analyses (e.g. Wilcoxon tests), etc. In addition, statistical analyses may be used to determine statistically significant deviations between the individual's circulating nucleic acid specific to brain tissue and that of a reference. When the reference is based on a diseased population, statistically significant deviations of the individual's brain-specific circulating RNA from those of the diseased population are indicative of no neurological disorder. When the reference is based on a normal population, statistically significant deviations of the individual's brain-specific circulating RNA from those of the normal population are indicative of a neurological disorder. Methods of determining statistical significance are known in the art. P-values and odds ratios can be used for statistical inference. Logistic regression models are common statistical classification models. In addition, Chi-square tests and t-tests may also be used to determine statistical significance.
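As an illustration of the non-parametric comparison mentioned above, the following sketch implements a two-sided Wilcoxon rank-sum (Mann-Whitney U) test between two cohorts. It is a minimal version under stated assumptions: it uses the normal approximation without continuity or tie correction, which is rough for very small cohorts, and the transcript levels shown are hypothetical values, not measurements from this study.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic for group_a, z score, two-sided p-value).
    """
    pooled = sorted(
        [(v, "a") for v in group_a] + [(v, "b") for v in group_b]
    )
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == "a")
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mu = n_a * n_b / 2.0
    sigma = sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mu) / sigma
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u_a, z, p

# Hypothetical normalized levels of one brain-specific transcript.
disease_cohort = [5.1, 6.3, 5.8, 6.9, 6.0, 5.5]
normal_cohort = [3.2, 4.1, 3.8, 4.4, 3.5, 4.9]

u, z, p = rank_sum_test(disease_cohort, normal_cohort)
print(u, round(z, 3), round(p, 4))
```

A transcript whose levels separate the two cohorts with a small p-value would be a candidate for further evaluation; when many transcripts are screened at once, a multiple-testing correction would also be needed.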

Methods of the invention can also be used to identify one or more biomarkers associated with a neurological disorder. In such aspects, brain-specific transcripts of an individual or patient population suspected of having or actually having a neurological disorder (e.g. exhibiting impaired cognitive functions) are compared to reference brain-specific transcripts (e.g. those of a healthy, normal control). The brain-specific transcripts of the individual or patient population that are differentially expressed as compared to the reference may then be identified as biomarkers of the neurological disorder. In certain embodiments, only differentially expressed brain-specific transcripts that are statistically significant are identified as biomarkers.

In certain embodiments, methods of the invention provide a recommended course of treatment based on the clinical indications determined by comparing the patient's circulating brain-specific RNA with the reference. Depending on the diagnosis, the course of treatment may include medicinal therapy, behavioral therapy, sleep therapy, and combinations thereof. The course of treatment and diagnosis may be provided in a read-out or a report.

EXAMPLES

Example 1: Profiling Maternal Plasma Cell-Free RNA by RNA Sequencing-A Comprehensive Approach

Overview: The plasma RNA profiles of 5 pregnant women were collected during the first trimester, second trimester, and post-partum, along with those of 2 non-pregnant female donors and 2 male donors, using both microarray and RNA-Seq.

Among these pregnancies, there were 2 pregnancies with clinical complications such as premature birth and one pregnancy with bi-lobed placenta. Comparison of these pregnancies against normal cases reveals genes that exhibit significantly different expression patterns across different temporal stages of pregnancy. Application of this technique to samples associated with complicated pregnancies may help identify transcripts that can be used as molecular markers predictive of these pathologies.

Study Design and Methods:

Subjects

Samples were collected from 5 pregnant women during the first trimester, second trimester, third trimester, and post-partum. As a control, blood plasma samples were also collected from 2 non-pregnant female donors and 2 male donors.

Blood Collection and processing

Blood samples were collected in EDTA tubes and centrifuged at 1600 g for 10 min at 4°C. Supernatants were placed in 1 ml aliquots in 1.5 ml microcentrifuge tubes, which were then centrifuged at 16000 g for 10 min at 4°C to remove residual cells. Supernatants were then stored in 1.5 ml microcentrifuge tubes at -80°C until use.

RNA Extraction and Amplification

The cell-free maternal plasma RNA was extracted using Trizol LS reagent. The extracted and purified total RNA was converted to cDNA and amplified using the RNA-Seq Ovation Kit (NuGen). (The above steps were the same for both microarray and RNA-Seq sample preparation.)

The cDNA was fragmented using DNase I and labeled with biotin, followed by hybridization to Affymetrix GeneChip ST 1.0 microarrays. The Illumina sequencing platform and standard Illumina library preparation protocols were used for sequencing.

Data Analysis: Correlation between microarray and RNA-Seq

The RMA algorithm was applied to process the raw microarray data for background correction and normalization. RPKM values of the sequenced transcripts were obtained using the CASAVA 1.7 pipeline for RNA-Seq. The RPKM values from RNA-Seq and the probe intensities from the microarray were converted to log2 scale. For the RNA-Seq data, to avoid taking the log of 0, gene expression values with an RPKM of 0 were set to 0.01 prior to taking logs. Correlation coefficients between the two platforms were then calculated.
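The log transformation and cross-platform correlation described above can be sketched in Python as follows. This is only an illustration: the function names and the example values are hypothetical, and the original analysis was not necessarily implemented this way.

```python
import math


def log2_rpkm(rpkm_values, floor=0.01):
    """Log2-transform RPKM values, replacing zeros with `floor` first."""
    return [math.log2(v if v > 0 else floor) for v in rpkm_values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five genes measured on both platforms.
rpkm = [0.0, 1.5, 12.0, 30.0, 250.0]
array_intensity = [20.0, 60.0, 180.0, 400.0, 2100.0]

r = pearson(log2_rpkm(rpkm), [math.log2(v) for v in array_intensity])
```

The floor of 0.01 matches the value stated in the text; the choice of floor affects the correlation when many genes have zero counts.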

Differential Expression of RNA transcripts levels using RNA-seq

Differential gene expression analysis was performed using edgeR, a set of library functions written specifically to analyze digital gene expression data. Gene Ontology analysis was then performed using DAVID to identify significantly enriched GO terms.

Principal Component Analysis & Identification of Significant Time Varying genes

Principal component analysis was carried out using a custom script in R. To identify time-varying genes, the time course library of functions in R was used to implement empirical Bayes methods for assessing differential expression in time-course experiments; in this case, the time points are the different trimesters and post-partum for each individual patient.

Results and Discussion

RNA-Seq reveals that pregnancy-associated transcripts are detected at significantly different levels between pregnant and non-pregnant subjects.

A comparison of transcript levels derived using RNA-Seq, together with Gene Ontology analysis, between pregnant and non-pregnant subjects revealed that the differentially detected transcripts are significantly associated with female pregnancy, suggesting that RNA-Seq enables observation of real differences between these two classes of transcriptome due to pregnancy. The top-ranked differentially expressed gene is PLAC4, which is also known as a target in previous studies for developing an RNA-based test for trisomy 21. A listing of the top detected differentially expressed transcripts associated with female pregnancy is shown in Figure 1.

Principal Component Analysis (PCA) on plasma cell-free RNA transcript levels in maternal plasma distinguishes between premature and normal pregnancy

Using the plasma cell-free transcript level profiles as inputs for principal component analysis, the profiles from each patient at different time points clustered into different pathological clusters, suggesting that the cell-free plasma RNA transcript profile in maternal plasma may be used to distinguish between pre-term and non-preterm pregnancy.

Plasma cell-free RNA levels were quantified using both microarray and RNA-Seq. Transcript expression level profiles from microarray and RNA-Seq for each patient are correlated, with a Pearson correlation of approximately 0.7. Plots of the two main principal components for cell-free RNA transcript levels are shown in Figure 2.

Identification of cell free RNA transcripts in maternal plasma exhibiting significantly different time varying trends between pre-term and normal pregnancy across all three trimesters and post-partum

A heatmap of the top 100 cell free transcript levels exhibiting different temporal levels in preterm and normal pregnancy using microarrays is shown in Figure 3A. A heatmap of the top 100 cell free transcript levels exhibiting different temporal levels in preterm and normal pregnancy using RNA-Seq is shown in Figure 3B.

Common cell free RNA transcripts identified by microarray and RNA-Seq which exhibit significantly different time varying trends between pre-term and normal pregnancy across all three trimesters and post-partum

A ranking of the top 20 transcripts differentially expressed between pre-term and normal pregnancy is shown in Figure 4. These top 20 common RNA transcripts were analyzed using Gene Ontology and were shown to be enriched for proteins that are attached (integrated or loosely bound) to the plasma membrane or on the membranes of the platelets (see Figure 5).

Gene Expression Profiles for PVALB

The protein encoded by the PVALB gene is a high-affinity calcium ion-binding protein that is structurally and functionally similar to calmodulin and troponin C. The encoded protein is thought to be involved in muscle relaxation. As shown in Figure 6, the gene expression profile for PVALB across the different trimesters shows that the premature births (highlighted in blue) have higher levels of cell-free RNA transcripts than the normal pregnancies.

Conclusion: Results from quantification and characterization of maternal plasma cell-free RNA using RNA-Seq strongly suggest that pregnancy associated transcripts can be detected.

Furthermore, both RNA-Seq and microarray methods can detect a considerable number of gene transcripts whose levels show differential time trends that have a high probability of being associated with premature births.

The methods described herein can be modified to investigate pregnancies of different pathological situations and can also be modified to investigate temporal changes at more frequent time points.

Example 2: Quantification of Tissue-Specific Cell-Free RNA Exhibiting Temporal Variation During Pregnancy

Overview:

Cell-free fetal DNA found in maternal plasma has been exploited extensively for noninvasive diagnostics. In contrast, cell-free fetal RNA, which has been shown to be similarly detectable in maternal circulation, has yet to be applied widely as a form of diagnostics. Both fetal cell-free RNA and DNA face similar challenges in distinguishing the fetal from the maternal component because in both cases the maternal component dominates. To detect cell-free RNA of fetal origin, focus can be placed on genes that are highly expressed only during fetal development, which are subsequently inferred to be of fetal origin and easily distinguished from background maternal RNA. Such a perspective is corroborated by studies that have established that cell-free fetal RNA derived from genes that are highly expressed in the placenta is detectable in maternal plasma during pregnancy.

A significant characteristic that sets RNA apart from DNA is the dynamic nature of RNA transcripts, which is well reflected during fetal development. Life begins as a series of well-orchestrated events that starts with fertilization to form a single-cell zygote and ends with a multi-cellular organism with diverse tissue types. During pregnancy, the majority of fetal tissues undergo extensive remodeling and contain functionally diverse cell types. This underlying diversity can be generated as a result of differential gene expression from the same nuclear repertoire, where the quantity of RNA transcripts dictates that different cell types make different amounts of proteins, despite their genomes being identical. The human genome comprises approximately 30,000 genes. Only a small set of genes is transcribed to RNA within a particular differentiated cell type. These tissue-specific RNA transcripts have been identified through many studies and databases involving developing fetuses of classical animal models. By combining the available literature with high-throughput data generated from samples via sequencing, the entire collection of RNA transcripts contained within maternal plasma can be characterized.

Fetal organ formation during pregnancy depends on successive programs of gene expression. Temporal regulation of RNA quantity is necessary to generate the progression of cell differentiation events that accompany fetal organogenesis. To unravel similar temporal dynamics for cell-free RNA, the expression profile of maternal plasma cell-free RNA, especially the selected fetal tissue-specific panel of genes, was analyzed across all three trimesters of pregnancy and post-partum. Leveraging the capability of high-throughput qPCR and sequencing technologies for simultaneous quantification of cell-free fetal tissue-specific RNA transcripts, a system-level view of the spectrum of RNA transcripts with fetal origins in maternal plasma was obtained. In addition, maternal plasma was analyzed to deconvolute the heterogeneous cell-free transcriptome of fetal origin into relative proportions of the different fetal tissue types. This approach incorporated physical constraints regarding the fetal contributions in maternal plasma; specifically, the fractional contribution of each fetal tissue was required to be non-negative, and the contributions were required to sum to one, during all three trimesters of the pregnancy. These constraints on the data set enabled the results to be interpreted as relative proportions from different fetal organs. That is, a panel of previously selected fetal tissue-specific RNA transcripts exhibiting temporal variation can be used as a foundation for applying quadratic programming in order to determine the relative tissue-specific RNA contribution in one or more samples.

When considered individually, quantification of each of these fetal tissue-specific transcripts within the maternal plasma can be used as a measure of the apoptotic rate of that particular fetal tissue during pregnancy. Normal fetal organ development is tightly regulated by cell division and apoptotic cell death. Developing tissues compete to survive and proliferate, and organ size is the result of a balance between cell proliferation and death. Due to the close association between aberrant cell death and developmental diseases, therapeutic modulation of apoptosis has become an area of intense research, but with this comes the demand for monitoring the apoptosis rate of specific tissues. Quantification of fetal cell-free RNA transcripts provides such prognostic value, especially in premature births, where the incidence of apoptosis in various organs of preterm infants has been shown to contribute to neurodevelopmental deficits and cerebral palsy.

Sample Collection and Study Design

Selection of Fetal Tissue Specific Transcript Panel

To detect the presence of these fetal tissue-specific transcripts, a list of known fetal tissue-specific genes was prepared from known literature and databases. The specificity for fetal tissues was validated by cross-referencing between two main databases: TiSGeD (Xiao, S.-J., Zhang, C. & Ji, Z.-L. TiSGeD: a Database for Tissue-Specific Genes. Bioinformatics (Oxford, England) 26, 1273-1275 (2010)) and BioGPS (Wu, C. et al. BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome biology 10, R130 (2009); Su, A. I. et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proceedings of the National Academy of Sciences of the United States of America 101, 6062-7 (2004)). Most of these selected transcripts are associated with known fetal developmental processes. This list of genes was overlapped with RNA sequencing and microarray data to generate the panel of selected fetal tissue-specific transcripts shown in Figure 8.

Subjects

Samples of maternal blood were collected from normal pregnant women during the first trimester, second trimester, third trimester, and post-partum. For positive controls, fetal tissue-specific RNA from the various fetal tissue types was bought from Agilent. Negative controls were run through the entire process using water, as well as using samples that did not undergo the reverse transcription process.

Blood Collection and Processing

At each time-point, 7 to 15 mL of peripheral blood was drawn from each subject. Blood was centrifuged at 1600 g for 10 min and transferred to microcentrifuge tubes for further centrifugation at 16000 g for 10 min to remove residual cells. The above steps were carried out within 24 hours of the blood draw. The resulting plasma was stored at -80°C for subsequent RNA extractions.

RNA Extraction

Cell-free RNA extractions were carried out using Trizol followed by Qiagen's RNeasy Mini Kit. To ensure that there was no contaminating DNA, DNase digestion was performed after RNA elution using RNase-free DNase from Qiagen. The resulting cell-free RNA from the pregnant subjects was then processed using standard microarray and Illumina RNA-Seq protocols. These steps generated the sequencing library that was used to generate the RNA-Seq data, as well as the microarray expression data. The remaining cell-free RNA was then used for parallel qPCR.

Parallel qPCR of Selected Transcripts

Accurate quantification of these fetal tissue-specific transcripts was carried out using the Fluidigm BioMark system (See e.g. Spurgeon, S. L., Jones, R. C. & Ramakrishnan, R. High throughput gene expression measurement with real time PCR in a microfluidic dynamic array. PloS one 3, e1662 (2008)). This system allows for simultaneous query of a panel of fetal tissue-specific transcripts. Two parallel forms of inquiry were conducted using different starting sources of material. One used the cDNA library from the Illumina sequencing protocol and the other used the eluted RNA directly. Both sources of material were amplified with EvaGreen primers targeting the genes of interest. Both sources, RNA and cDNA, were preamplified. The cDNA was preamplified using EvaGreen PCR supermix and primers. The RNA source was preamplified using the CellsDirect One-Step qRT-PCR kit from Invitrogen. Modifications were made to the default One-Step qRT-PCR protocol to accommodate a longer incubation time for reverse transcription. Nineteen cycles of preamplification were conducted for both sources, and the collected PCR products were cleaned up using Exonuclease I treatment. To increase the dynamic range and the ability to quantify the efficiency of the later qPCR steps, serial dilutions of 5 fold, 10 fold and 10 fold were performed on the PCR products. The maternal plasma collected from each pregnant woman at each time point went through the same procedures and was loaded onto 48x48 Dynamic Array Chips from Fluidigm to perform the qPCR. For positive controls, fetal tissue-specific RNA from the various fetal tissue types was bought from Agilent. Each of these RNAs from fetal tissues went through the same preamplification and clean-up steps. A pooled sample with equal proportions of the different fetal tissues was also created for later analysis, to deconvolute the relative contribution of each tissue type in the pooled sample.
All collected data from the Fluidigm BioMark system were pre-processed using Fluidigm Real Time PCR Analysis software to obtain the respective Ct values for each of the transcripts across all samples. Negative controls were run through the entire process using water, as well as using samples that did not undergo the reverse transcription process.

Data Analysis:

Fetal tissue-specific RNA transcripts clear from the maternal peripheral bloodstream within a short period after birth. That is, the post-partum cell-free RNA transcriptome of maternal blood lacks fetal tissue-specific RNA transcripts. As a result, the quantity of these fetal tissue-specific transcripts is expected to be higher before birth than after birth. The data of interest were the relative quantitative changes of the tissue-specific transcripts across all three trimesters of pregnancy as compared to this baseline level after the baby is born. As described in the methods, the fetal tissue-specific transcripts were quantified in parallel using both the actual cell-free RNA and the cDNA library of the same cell-free RNA. An example of the raw data obtained is shown in Figures 9A and 9B. The qPCR system gave a better quality readout using the cell-free RNA as the initial source. Focusing on the qPCR results from the direct cell-free RNA source, the analysis was conducted by comparing the fold change of each of these fetal tissue-specific transcripts across all three trimesters, using the post-partum level as the baseline for comparison. The Delta-Delta Ct method was employed (Schmittgen, T. D. & Livak, K. J. Analyzing real-time PCR data by the comparative CT method. Nature Protocols 3, 1101-1108 (2008)). Each transcript expression level was compared to the housekeeping genes to get the delta Ct value. Subsequently, to compare each trimester to after birth, the delta-delta Ct method was applied using the post-partum data as the baseline.
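The comparative Ct calculation described above can be sketched in Python as follows. The function names and Ct values are hypothetical, and the sketch assumes roughly 100% amplification efficiency, so that one cycle corresponds to a two-fold difference.

```python
def delta_ct(ct_target, ct_housekeeping):
    """Delta Ct: target Ct normalized to the housekeeping gene Ct."""
    return ct_target - ct_housekeeping


def fold_change_vs_baseline(ct_target, ct_hk, ct_target_base, ct_hk_base):
    """Fold change by the comparative Ct method: 2 ** (-delta-delta Ct).

    Assumes about 100% amplification efficiency (one cycle = two-fold).
    """
    ddct = delta_ct(ct_target, ct_hk) - delta_ct(ct_target_base, ct_hk_base)
    return 2.0 ** (-ddct)


# Hypothetical Ct values, given as (target, housekeeping).
postpartum = (30.0, 18.0)  # baseline sample
trimesters = {"T1": (28.0, 18.0), "T2": (27.0, 18.0), "T3": (25.0, 18.0)}

fold_changes = {
    name: fold_change_vs_baseline(ct_t, ct_hk, *postpartum)
    for name, (ct_t, ct_hk) in trimesters.items()
}
```

In this made-up example the target Ct falls as pregnancy progresses, so the fold change relative to the post-partum baseline rises across trimesters.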

Results and Discussion:

As shown in Figures 10, 11, and 12, the tissue-specific transcripts are generally found at a higher level during the trimesters than after birth. In particular, the tissue-specific panel of placental, fetal brain and fetal liver specific transcripts showed the same bias, where these transcripts are typically found at higher levels during pregnancy than after birth. Between the different trimesters, a general trend showed that the quantity of these transcripts increases as the pregnancy progresses.

Biological Significance of Quantified Fetal Tissue-Specific RNA: Most of the transcripts in the panel were involved in fetal organ development and many are also found within the amniotic fluid. One such example is ZNF238. This transcript is specific to fetal brain tissue and is known to be vital for cerebral cortex expansion during embryogenesis, when neuronal layers are formed. Loss of ZNF238 in the central nervous system leads to severe disruption of neurogenesis, resulting in a striking postnatal small-brain phenotype. Using methods of the invention, one can determine whether ZNF238 is present at healthy, normal levels according to the stage of development.

Known defects due to the loss of ZNF238 include a striking postnatal small-brain phenotype: microcephaly, agenesis of the corpus callosum and cerebellar hypoplasia. Microcephaly can sometimes be diagnosed before birth by prenatal ultrasound. In many cases, however, it might not be evident by ultrasound until the third trimester. Typically, diagnosis is not made until birth or later in infancy upon finding that the baby's head circumference is much smaller than normal. Microcephaly is a life-long condition and currently untreatable. A child born with microcephaly will require frequent examinations and diagnostic testing by a doctor to monitor the development of the head as he or she grows. Early detection of ZNF238 differential expression using methods of the invention provides for prenatal diagnosis and may hold prognostic value for drug treatments and dosing during the course of treatment.

Beyond ZNF238, many of the characterized transcripts may hold diagnostic value in developmental diseases involving apoptosis, i.e., diseases caused by removal of unnecessary neurons during neural development. Seeing that apoptosis of neurons is essential during development, one could extrapolate that similar apoptosis might be activated in neurodegenerative diseases such as Alzheimer's disease, Huntington's disease, and amyotrophic lateral sclerosis. In such a scenario, the methodology described herein will allow for close monitoring for disease progression and possibly an ideal dosage according to the progression.

Deducing relative contributions of different fetal tissue types: Differential rates of apoptosis of specific tissues may directly correlate with certain developmental diseases. That is, certain developmental diseases may increase the levels of particular tissue-specific RNA transcripts observed in the maternal transcriptome. Knowledge of the relative contribution from various tissue types will allow for observations of these types of changes during the progression of these diseases. The quantified panel of fetal tissue-specific transcripts during pregnancy can be considered as a summation of the contributions from the various fetal tissues (See FIG. 25).

Expressing this as an equation,

$$Y_i = \sum_j \pi_j X_{ij} + \varepsilon$$

where $Y_i$ is the observed transcript quantity in maternal plasma for gene i, $X_{ij}$ is the known transcript quantity for gene i in known fetal tissue j, $\pi_j$ is the fractional contribution of tissue type j, and $\varepsilon$ is the normally distributed error. Additional physical constraints include:

1. The sum of all fractions contributing to the observed quantification is 1, given by $\sum_j \pi_j = 1$.

2. The contribution from each tissue type has to be greater than or equal to zero. There is no physical meaning to having a negative contribution. This is given by $\pi_j \geq 0$, since $\pi_j$ is defined as the fractional contribution of each tissue type.

Consequently, to obtain the optimal fractional contribution of each tissue type, the least-squares error is minimized. The above equations are then solved using quadratic programming in R to obtain the optimal relative contributions of the tissue types towards the maternal cell-free RNA transcripts. In the workflow, the quantities of RNA transcripts are given relative to the housekeeping genes in terms of Ct values obtained from qPCR. Therefore, the Ct value can be considered as a proxy for the measured transcript quantity. A change in Ct value of one corresponds to a two-fold change in transcript quantity, i.e. 2 raised to the power of 1. The process begins with normalizing all of the Ct data relative to the housekeeping gene, and is followed by quadratic programming.
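The constrained least-squares problem described above (non-negative tissue fractions that sum to one) can be illustrated with a short, dependency-free Python sketch. Note that this sketch substitutes projected gradient descent onto the probability simplex for the quadratic programming solver in R mentioned in the text. The function names, the iteration count, and the example matrix are assumptions, and the inputs are taken to be linear-scale relative quantities rather than raw Ct values.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_j >= 0, sum_j p_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def deconvolve(X, y, iterations=2000):
    """Estimate tissue fractions pi minimizing sum_i (y_i - sum_j X[i][j]*pi_j)^2.

    X[i][j] is the quantity of gene i in tissue j; y[i] is the observed
    quantity of gene i in the mixture.
    """
    n_genes, n_tissues = len(X), len(X[0])
    # Step size from an upper bound on the largest eigenvalue of 2 * X^T X.
    step = 1.0 / (2.0 * sum(x * x for row in X for x in row))
    pi = [1.0 / n_tissues] * n_tissues
    for _ in range(iterations):
        residual = [
            sum(X[i][j] * pi[j] for j in range(n_tissues)) - y[i]
            for i in range(n_genes)
        ]
        grad = [
            2.0 * sum(X[i][j] * residual[i] for i in range(n_genes))
            for j in range(n_tissues)
        ]
        pi = project_to_simplex([p - step * g for p, g in zip(pi, grad)])
    return pi


# Hypothetical reference profiles: 4 genes (rows) x 3 tissues (columns).
X = [[10.0, 1.0, 0.0], [1.0, 8.0, 1.0], [0.0, 1.0, 9.0], [2.0, 2.0, 2.0]]
true_pi = [0.5, 0.3, 0.2]
y = [sum(x * p for x, p in zip(row, true_pi)) for row in X]

estimated_pi = deconvolve(X, y)
```

On this noise-free synthetic mixture the estimate should recover the fractions used to build it; with real qPCR data the fit is only as good as the tissue specificity of the chosen panel.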

As a proof of concept for the above scheme, different fetal tissue types (Brain, Placenta, Liver, Thymus, Lung) were mixed in equal proportions to generate a pooled sample. Each fetal tissue type (Brain, Placenta, Liver, Thymus, Lung), along with the pooled sample, was quantified using the same Fluidigm BioMark system to obtain the Ct values from qPCR for each fetal tissue-specific transcript across all tissues and the pooled sample. These values were used to perform the same deconvolution. The resulting fractions for the fetal tissues (Brain, Placenta, Liver, Thymus, Lung) were 0.109, 0.206, 0.236, 0.202 and 0.245, respectively.

Conclusion:

In summary, the panel of fetal-specific cell-free transcripts provides valuable biological information across different fetal tissues at once. Most particularly, the method can deduce the different relative proportions of fetal tissue-specific transcripts to total RNA, and, when considered individually, each transcript can be indicative of the apoptotic rate of the fetal tissue. Such measurements have numerous potential applications for developmental and fetal medicine. Most human fetal development studies have relied mainly on postnatal tissue specimens or aborted fetuses. Methods described herein provide a rapid assay of the rate of fetal tissue/organ growth or death in live fetuses with minimal risk to the pregnant mother and fetus. Similar methods may be employed to monitor major adult organ tissue systems that exhibit specific cell-free RNA transcripts in the plasma.

Example 3: Additional Study for Quantification of Tissue-Specific Cell-Free RNA Exhibiting Temporal Variation During Pregnancy

High-throughput methods of microarray and next-generation sequencing were used to characterize the landscape of the cell-free RNA transcriptome of healthy adults and of pregnant women across all three trimesters of pregnancy and post-partum. The results confirm the study presented in Example 2 by showing that it is possible to monitor the gene expression status of many tissues and that the temporal expression of certain genes can be measured across the stages of human development. The study also investigated the role of cell-free RNA in adults suffering from the neurodegenerative disorder Alzheimer's disease and observed a marked increase of neuron-specific transcripts in the blood of affected individuals. Thus, this study shows that the same principles of observing tissue-specific RNA to assess development can also be applied to assess the deterioration of brain tissue associated with neurological disorders.

Overview

An additional study following the guidance of Example 2 was conducted to illustrate the temporal variation among tissue-specific cell-free RNA across trimesters. FIG. 18 outlines the experimental design for this study, which examined cell-free plasma samples of 15 subjects, of which 11 were pregnant and 4 were not pregnant (2 males; 2 females). The blood samples were taken over several time-points: 1st, 2nd, and 3rd Trimester and Post-Partum. The cell-free plasma RNA was then extracted, amplified, and characterized by Affymetrix microarray, Illumina sequencer, and quantitative PCR. For each plasma sample, ~20 million sequencing reads were generated, ~80% of which could be mapped against the human reference genome (hg19). As the plasma RNA is of low concentration and vulnerable to degradation, contamination from the plasma DNA is a concern. To assess the quality of the sequencing library, the number of reads assigned to different regions was counted: 34% mapped to exons, 18% mapped to introns, and 24% mapped to ribosomal RNA and tRNA. Therefore, the dominant portion of the reads originated from RNA transcripts rather than DNA contamination. To validate the RNA-seq measurements, all of the plasma samples were also analyzed with gene expression microarrays. Apoptotic cells from different tissue types release their RNA into the cell-free RNA component in plasma. Each of these tissues expresses a number of genes unique to its tissue type, and the observed cell-free RNA transcriptome can be considered as a summation of contributions from these different tissue types. Using expression data of different tissue types available in public databases, the cell-free RNA transcriptomes from the four nonpregnant subjects were deconvoluted using quadratic programming to reveal the relative contributions of different tissue types (Fig. 26). These contributions identified different tissue types that are consistent among different control subjects.
Whole blood, as expected, is the major contributor (~40%) toward the cell-free RNA transcriptome. Other major contributing tissue types include the bone marrow and lymph nodes. One also sees consistent contributions from smooth muscle, epithelial cells, thymus, and hypothalamus.

Results and Discussion

Within the cohort, about 100 genes were analyzed whose RNA transcripts contained paternal SNPs that were distinct from the maternal inheritance, to explicitly demonstrate that the fetus contributes a substantial amount of RNA to the mother's blood (See FIG. 21). To accurately quantify and verify the relative fetal contribution, a mother and her fetus were genotyped and the paternal genotype was inferred. The weighted average fraction of fetal-originated cell-free RNA was quantified using paternal SNPs. The cell-free RNA fetal fraction depends on gene expression and varies greatly across different genes. In general, the fetal fraction of cell-free RNA increases as the pregnancy progresses and decreases after delivery. The weighted average fetal fraction started at 0.4% in the first trimester, increased to 3.4% in the second trimester, and peaked at 15.4% in the third trimester. Although fetal RNA should be cleared after delivery, the calculated fetal RNA fraction was still 0.3%, which can be attributed to background noise arising from misalignment and sequencing errors.
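The text does not give the formula used for the weighted average fetal fraction, so the following Python sketch shows only one plausible way such an estimate could be formed. It assumes informative SNP sites where the mother is homozygous and the fetus is heterozygous for a distinct paternal allele, assumes both fetal alleles are expressed equally (hence the factor of 2), and weights each site by its total read depth. All names and read counts are hypothetical.

```python
def fetal_fraction(paternal_reads, total_reads):
    """Per-site fetal fraction, assuming balanced allelic expression.

    Paternal-specific reads are taken to represent half of the fetal
    transcripts at the site (an assumption, not stated in the text).
    """
    return 2.0 * paternal_reads / total_reads


def weighted_average_fetal_fraction(sites):
    """Depth-weighted mean of the per-site fetal fractions.

    `sites` is a list of (paternal_reads, total_reads) pairs.
    """
    total_depth = sum(total for _, total in sites)
    weighted = sum(fetal_fraction(p, t) * t for p, t in sites)
    return weighted / total_depth


# Hypothetical read counts at three informative SNP sites.
sites = [(3, 100), (10, 200), (2, 50)]
estimate = weighted_average_fetal_fraction(sites)
```

Weighting by depth gives well-covered genes more influence, which matters because, as noted above, the fetal fraction varies greatly across genes.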

In addition to monitoring fetal tissue-specific mRNA, noncoding transcripts present in the cell-free compartment across pregnancy were identified. These noncoding transcripts include long noncoding RNAs (lncRNAs), as well as circular RNAs (circRNAs). Additional PCR assays were designed to specifically amplify and validate the presence of these circRNAs in plasma. circRNAs have recently been shown to be widely expressed in human cells and to have greater stability than their linear counterparts, potentially making them reliable biomarkers for capturing transient events. Several of the circRNA species appear to be specifically expressed during different trimesters of pregnancy. The identification of these cell-free noncoding RNAs during pregnancy may improve our ability to monitor the health of the mother and fetus.

There is a general increase in the number of genes detected across the different trimesters, followed by a steep drop after the pregnancy. Such an increase in the number of genes detected suggests that unique transcripts are expressed specifically during particular time intervals in the developing fetus. FIGS. 18 and 19 show the heatmap of genes whose level changed over time during pregnancy, as detected by microarray. ANOVA was applied to identify genes that varied in expression in a statistically significant manner across different trimesters. An additional condition was applied, filtering for transcripts that were expressed at low levels in both the postpartum plasma of pregnant subjects and in nonpregnant controls. Using these conditions, 39 genes from RNA-seq and 34 genes from microarray were identified, of which 17 genes were in common. Gene Ontology (GO) analysis performed on the identified genes using the Database for Annotation, Visualization and Integrated Discovery (DAVID) revealed that the identified gene list is enriched for the following GO terms: female pregnancy (Bonferroni-corrected P = 5.5 x 10^-5), extracellular region (corrected P = 6.6 x 10^-3), and hormone activity (corrected P = 6.3 x 10^-9). These RNA transcripts show a general trend of having low expression postpartum and the highest expression during the third trimester. Most of these transcripts are specifically expressed in the placenta, and their levels reach a maximum in the later stages of pregnancy.

Other, nonplacental transcripts share similar temporal trends. Two such significant transcripts were RAB6B and MARCH2, which are known to be expressed specifically in CD71+ erythrocytes. Erythrocytes enriched for CD71+ have been shown to contain fetal hemoglobin and are interpreted to be of fetal origin. The presence of transcripts with known specificity to different fetal tissue types reflects the fact that the cell-free transcriptome during pregnancy can be considered a summation of transcriptomes from various fetal tissues on top of a maternal background.

This analysis detected the presence of numerous transcripts that are specifically expressed in several other fetal tissues, although the available sequencing depth resulted in limited concordance between samples. To verify the presence of these and other potential fetal tissue-specific transcripts, a panel of fetal tissue-specific transcripts was devised for detailed quantification using the more sensitive method of quantitative PCR (qPCR). Three main sources of interest to fetal neurodevelopment and metabolism were the focus: placenta, fetal brain, and fetal liver. In FIGS. 22-24, the levels of these groups of fetal tissue-specific transcripts at different trimesters were systematically compared to the level seen in maternal serum after delivery. To illustrate the temporal trends, housekeeping genes were used as a baseline, and ΔCt analysis was applied to find the level of relative expression of these fetal tissue-specific transcripts with respect to the housekeeping genes. Many of these tissue-specific transcripts were expressed at substantially higher levels during the pregnancy compared with postpartum. There was a general trend of an increase in the quantity of these transcripts across advancing gestation.
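The ΔCt comparison against housekeeping genes described above can be illustrated with a minimal sketch. The function names and the Ct values are hypothetical, and the conversion 2^(-ΔCt) assumes roughly 100% amplification efficiency, which the text does not state.

```python
from statistics import mean

def delta_ct(target_ct, housekeeping_cts):
    """Return delta Ct = Ct(target) - mean Ct(housekeeping genes)."""
    return target_ct - mean(housekeeping_cts)

def relative_expression(target_ct, housekeeping_cts):
    """Relative level 2**(-delta Ct); assumes ~100% PCR efficiency."""
    return 2.0 ** (-delta_ct(target_ct, housekeeping_cts))

# Hypothetical Ct values for one fetal tissue-specific transcript.
housekeeping = [20.0, 22.0]                                  # two housekeeping genes
third_trimester = relative_expression(26.0, housekeeping)   # delta Ct = 5
postpartum = relative_expression(29.0, housekeeping)        # delta Ct = 8
fold_change = third_trimester / postpartum                   # 2**3 = 8
```

A lower ΔCt corresponds to a higher transcript level, so a transcript that is more abundant during pregnancy than postpartum yields a fold change above 1.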

The placental qPCR assay focused on genes that are known to be highly expressed in the placenta, many of which encode proteins that have been shown to be present in the maternal blood. The serum levels of these proteins are known to be involved in pregnancy complications such as preeclampsia and premature birth. Examples in our panel include ADAM12, which encodes disintegrin and metalloproteinase domain-containing protein 12. These proteinases are highly expressed in human placenta and are present at high concentrations in maternal serum as early as the first trimester. ADAM12 serum concentrations are known to be significantly reduced in pregnancies complicated by fetal trisomy 18 and trisomy 21 and may therefore be of potential use in conjunction with cell-free DNA for the detection of chromosomal abnormalities. Similarly, placental alkaline phosphatase, encoded by the ALPP gene, is a tissue-specific isoform expressed increasingly in the placenta throughout pregnancy until term. It is anchored to the plasma membrane of the syncytiotrophoblast and, to a lesser extent, of cytotrophoblastic cells. This enzyme is also released into maternal serum, and variations in its concentration are related to several clinical disorders such as preterm delivery. Another gene in the panel, BACE2, encodes the β-site APP-cleaving enzyme, which generates amyloid-β protein by endoproteolytic processing. Brain deposition of amyloid-β protein is a frequent complication in Down syndrome patients, and BACE2 is known to be overexpressed in Down syndrome.

Other transcripts in our placental assay are known to be transcribed at high levels in the placenta, and levels of these mRNAs are important for normal placental function and development in pregnancy. TAC3 is mainly expressed in the placenta and is significantly elevated in preeclamptic human placentas at term. Similarly, PLAC1 is essential for normal placental development. PLAC1 deficiency results in a hyperplastic placenta, characterized by an enlarged and dysmorphic junctional zone. An increase in cell-free mRNA of PLAC1 has been suggested to be correlated with the occurrence of preeclampsia.

On the fetal liver tissue-specific panel, one of the characterized transcripts is AFP. AFP encodes α-fetoprotein and is transcribed mainly in the fetal liver. AFP is the most abundant plasma protein found in the human fetus. Clinically, AFP protein levels are measured in pregnant women in either maternal blood or amniotic fluid and serve as a screening marker for fetal aneuploidy, as well as neural tube and abdominal wall defects. Other fetal liver-specific transcripts that were characterized are highly involved in metabolism. An example is the fetal liver-specific monooxygenase CYP3A7, which catalyzes many reactions involved in synthesis of cholesterol and steroids and is responsible for the metabolism of more than 50% of all clinical pharmaceuticals. In drug-treated diabetic pregnancies in which glucose levels in the woman are uncontrolled, neural tube and cardiac defects in the early developing brain, spine, and heart depend on functional GLUT2 carriers, whose transcripts are well characterized in the panel. Mutations in this gene result in Fanconi-Bickel syndrome, a congenital defect of facilitative glucose transport. Monitoring of fetal liver-specific transcripts during the drug regimen may enable analysis of the fetus's response to drug therapy that the mother is undergoing.

Example 4: Deconvolution of Adult Cell-Free Transcriptome

Overview:

The plasma RNA profiles of 4 healthy, normal adults were analyzed. Based on the gene expression profile of different tissue types, the methods described quantify the relative contributions of each tissue type towards the cell-free RNA component in a donor's plasma. For quantification, apoptotic cells from different tissue types are assumed to release their RNA into the plasma. Each of these tissues expressed a specific number of genes unique to the tissue type, and the observed cell-free RNA transcriptome is a summation of these different tissue types.

Study Design and Methods:

To determine the contribution of tissue-specific transcripts to the cell-free adult transcriptome, a list of known tissue-specific genes was prepared from the literature and databases. Two database sources were utilized: Human U133A/GNF1H Gene Atlas and RNA-Seq Atlas. Using the raw data from these two databases, tissue-specific genes were identified by applying a template-matching process. The list of tissue-specific genes identified by the method is provided in Table 1 below. The specificity and sensitivity of the panel are constrained by the number of tissue samples in the database. For example, the Human U133A/GNF1H Gene Atlas dataset includes 84 different tissue samples, and a panel's specificity from that database is constrained by the 84 sample sets. Similarly, for the RNA-Seq Atlas, there are 11 different tissue samples and specificity is limited to distinguishing between these 11 tissues. After obtaining a list of tissue-specific transcripts from the two databases, the specificity of these transcripts was verified with literature as well as the TisGED database.

The adult cell-free transcriptome can be considered as a summation of the tissue-specific transcripts obtained from the two databases. To quantitatively deduce the relative proportions of the different tissues in an adult cell-free transcriptome, quadratic programming is performed as a constrained optimization method to deduce the optimal relative contributions of different organs/tissues towards the cell-free transcriptome. The specificity and accuracy of this process depend on the table of genes (Table 2 below) and the extent to which they are detectable in RNA-seq and microarray.
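One way to picture the constrained optimization is as a least-squares problem: find tissue fractions x that are nonnegative, sum to 1, and minimize ||S x - y||^2, where S holds one expression signature per tissue and y is the observed cell-free profile. The sketch below is only an illustration under that assumed formulation. It solves the quadratic program with projected gradient descent rather than a dedicated QP solver, and the names `deconvolve`, `project_to_simplex`, and all numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(vi - theta, 0.0) for vi in v]

def deconvolve(signatures, observed, iterations=2000):
    """Minimize ||S x - y||^2 subject to x >= 0 and sum(x) = 1.

    signatures: one row per gene, one column per tissue (assumed layout).
    observed:   one cell-free expression value per gene.
    """
    n_genes, n_tissues = len(signatures), len(signatures[0])
    # Step 1/L, where L = 2 * ||S||_F^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * sum(s * s for row in signatures for s in row))
    x = [1.0 / n_tissues] * n_tissues
    for _ in range(iterations):
        residual = [sum(row[j] * x[j] for j in range(n_tissues)) - observed[i]
                    for i, row in enumerate(signatures)]
        gradient = [2.0 * sum(signatures[i][j] * residual[i] for i in range(n_genes))
                    for j in range(n_tissues)]
        x = project_to_simplex([x[j] - step * gradient[j] for j in range(n_tissues)])
    return x

# Hypothetical signatures for 4 genes across 3 tissues.
signatures = [[10.0, 1.0, 0.0],
              [0.0, 8.0, 1.0],
              [1.0, 0.0, 9.0],
              [2.0, 2.0, 2.0]]
observed = [6.3, 2.5, 1.5, 2.0]  # built from fractions 0.6, 0.3, 0.1
fractions = deconvolve(signatures, observed)  # approximately [0.6, 0.3, 0.1]
```

The sum-to-one and nonnegativity constraints are what make the output readable as relative tissue contributions; with only 11 or 84 reference tissues, any RNA from an unlisted tissue is necessarily assigned to the listed ones.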

Subjects: Plasma samples were collected from 4 healthy, normal adults.

Initial Results:

Deconvolution of our adult cell-free RNA transcriptome from microarray data using the above methods revealed the relative contributions of the different tissues and organs, which are tabulated in Figure 13.

Figure 13 shows that the normal cell-free transcriptome for adults is consistent across all 4 subjects. The relative contributions do not differ greatly between the 4 subjects, suggesting that the relative contributions from different tissue types are relatively stable between normal adults. Out of the 84 tissue types available, the major contributing tissues deduced by the optimization are whole blood and bone marrow.

An interesting tissue type contributing to circulating RNA is the hypothalamus. The hypothalamus is bounded by specialized brain regions that lack an effective blood-brain barrier; the capillary endothelium at these sites is fenestrated to allow free passage of even large proteins and other molecules. We therefore believe that RNA transcripts from apoptotic cells in that region could be released into the plasma cell-free RNA component. The same methods were performed on the subjects using RNA-seq. The results described herein are limited by the amount of tissue-specific RNA-Seq data available. However, it is understood that tissue-specific data is expanding with the increasing rate of sequencing of various tissue types, and future analysis will be able to leverage those datasets. For RNA-seq data (as compared to microarray), neither whole blood nor bone marrow samples are available. The cell-free transcriptome can only be decomposed into the 11 available tissue types of RNA-seq data. Of these, only relative contributions from the hypothalamus and spleen were observed, as shown in Figure 14.

A list of 84 tissue-specific genes (as provided in Table 2) was further selected for verification with qPCR. The Fluidigm BioMark Platform was used to perform the qPCR on RNA derived from the following tissues: Brain, Cerebellum, Heart, Kidney, Liver and Skin. A similar qPCR workflow was applied to the cell-free RNA component as well. The delta Ct values, obtained by comparison with the housekeeping gene ACTB, were plotted in heatmap format in Figure 15, which shows that these tissue-specific transcripts are detectable in the cell-free RNA.

Tables for Example 4

The following table lists the tissue-specific genes for Example 4 that were obtained using raw data from the Human U133A/GNF1H Gene Atlas and RNA-Seq Atlas databases.

Uterus Corpus

CON Testis Interstitial

X721 B lymphoblasts

CD33 Myeloid

CD71 Early Erythroid

EXTL3 Subthalamic Nucleus

Thymus

Adipocyte

X721 B lymphoblasts

The following table (Table 2) lists the panel of 94 tissue-specific genes in Example 4 that were verified with qPCR.

Example 5: Using Tissue-Specific Cell-Free RNA to Assess Alzheimer's

The analysis of fetal brain-specific transcripts in Examples 2 and 3 motivated the assessment of brain-specific transcripts for neurological disorders. In particular, the qPCR brain panel detected fetal brain-specific transcripts in maternal blood, whereas the whole transcriptome deconvolution analysis of our nonpregnant adult samples in Example 4 revealed that the hypothalamus is a significant contributor to the whole cell-free transcriptome. Since the hypothalamus is bounded by specialized brain regions that lack an effective blood-brain barrier, cell-free RNA in the blood was examined in the current study to measure neuronal death. qPCR was used to measure the expression levels of selected brain transcripts in the plasma of both Alzheimer's patients and age-matched normal controls. These measurements were made for a cohort of 16 patients: 6 diagnosed with Alzheimer's and 10 normal subjects. FIG. 17 depicts the measurements of PSD3 and APP cell-free RNA transcript levels in plasma. As shown in FIG. 17, the levels of PSD3 and APP cell-free RNA transcripts are elevated in Alzheimer's (AD) patients as compared to normal patients and can be used to characterize the different patient populations.

The APP transcript encodes the precursor molecule whose proteolysis generates β amyloid, which is the primary component of amyloid plaques found in the brains of Alzheimer's disease patients. Preliminary measurements of the plasma APP transcript corroborate the known biology behind progression of Alzheimer's disease and showed a significant increase in patients with Alzheimer's disease compared with normal subjects, suggesting that plasma APP mRNA levels may be a good marker for diagnosing Alzheimer's disease. Similarly, the gene PSD3, which is highly expressed in the nervous system and localized to the postsynaptic density based on sequence similarities, shows an increase in the plasma of patients with Alzheimer's disease. When the ΔCt values of APP were plotted against those of PSD3, AD patients clustered away from the normal patients. In light of this separation of clusters, cell-free RNA may serve as a blood-based diagnostic test for Alzheimer's disease and other neurodegenerative disorders.

Example 6: Assessing Neurological Disorders with Brain-Specific Transcripts

Overview

This study expands upon Example 5 and was designed to determine brain-specific tissue transcripts that correlate with the various stages of Alzheimer's disease. The study examined a cohort of patients from different centers that had previously collected samples from Alzheimer's patients and age-controlled references. A total of 254 plasma samples were available from the different centers. Cell-free RNA was extracted from each of the samples. The extracted cell-free RNA from each sample was then assayed using high-throughput qPCR on the Fluidigm BioMark system. Each sample was assayed using a panel of 48 genes, of which 43 genes are known to be brain specific. The resulting measurements from each sample were put through a very stringent quality control process. The first step measured the distribution of the housekeeping genes ACTB and GAPDH. By observing the levels of housekeeping genes across the samples from different batches, batches with significantly lower levels of housekeeping genes were removed from downstream analysis. The next quality control step counted the number of failed gene assays in each patient sample. Samples in which 8 or more assays failed to amplify were removed. This resulted in 125 good-quality samples:

I. 27 Alzheimer's patients (AD)

II. 52 Mild Cognitive Impairment patients (MCI)

III. 46 Normal patients.
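The two quality control steps described above might be sketched as follows. This is an illustration only: the data layout, the function names, and the `ct_cutoff` threshold are hypothetical, since the text does not state how "significantly lower" housekeeping levels were determined.

```python
from statistics import median

MAX_FAILED_ASSAYS = 7  # samples with 8 or more failed assays are removed

def passes_assay_qc(sample_cts):
    """sample_cts maps gene -> Ct, with None for an assay that failed to amplify."""
    failed = sum(1 for ct in sample_cts.values() if ct is None)
    return failed <= MAX_FAILED_ASSAYS

def low_housekeeping_batches(batches, ct_cutoff):
    """Return names of batches whose median housekeeping Ct exceeds ct_cutoff.

    A higher Ct means less template, so a high median indicates low levels.
    ct_cutoff is a hypothetical threshold, not a value from the study.
    """
    flagged = []
    for name, samples in batches.items():
        hk = [s[g] for s in samples for g in ("ACTB", "GAPDH")
              if s.get(g) is not None]
        if hk and median(hk) > ct_cutoff:
            flagged.append(name)
    return flagged

# Hypothetical batches, each a list of per-sample Ct dictionaries.
batches = {
    "batch_1": [{"ACTB": 18.0, "GAPDH": 19.0}],
    "batch_2": [{"ACTB": 27.0, "GAPDH": 28.0}],
}
removed = low_housekeeping_batches(batches, ct_cutoff=25.0)
```

Applying the batch filter first and the per-sample filter second mirrors the order given in the text.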

Analysis and Results

An unsupervised method, Principal Component Analysis (PCA), was applied to the qPCR gene expression of the 43 brain-specific transcripts in order to differentiate between Alzheimer's and normal patients. FIG. 27 illustrates the PCA space reflecting the unsupervised clustering of the patients using the gene expression data from the 48-gene assay. As shown in FIG. 27, two different populations are formed, which correspond to the neurological disease state of the patients.

Additionally, a Wilcoxon non-parametric statistical test was performed between Alzheimer's and normal patients for each of the brain-specific transcripts. The resulting p-values were Bonferroni corrected for multiple testing. Brain-specific transcripts whose corrected p-values were significant at the 0.05 level were cataloged as transcripts that have high distinguishing power between Alzheimer's and normal patients. Amongst all the assayed brain-specific transcripts, two are elevated in Alzheimer's patients: APP and PSD3. Another 7 transcripts were below normal levels at a significant level: MOBP; MAG; SLC2A1; TCF7L2; CDH22; CNTF and PAQR6. Figure 28 shows the boxplot of the different levels of APP transcripts across the different patient groups and the corrected P-value indicating the significance of the transcripts in distinguishing Alzheimer's. Figure 29 illustrates the opposite trend, where the levels of the measured brain transcript MOBP were lower in the Alzheimer's population as compared to the normal population. MOBP is a myelin-associated oligodendrocyte protein-coding gene which is known to play a role in compacting or stabilizing the myelin sheath.
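The per-transcript testing and multiple-testing correction described above can be sketched as below. The sketch assumes the test is the two-sample Wilcoxon rank-sum (Mann-Whitney U) test and uses a normal approximation without tie or continuity correction; the function names are hypothetical, and a statistics package would normally be used instead.

```python
import math

def rank_sum_p_value(group_a, group_b):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation.

    Tied values receive average ranks; no tie or continuity correction.
    """
    values = sorted(list(group_a) + list(group_b))
    rank_of = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = sum(rank_of[v] for v in group_a) - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return {gene: min(1.0, p * m) for gene, p in p_values.items()}

def significant_transcripts(p_values, alpha=0.05):
    """Transcripts whose Bonferroni-corrected p-value is below alpha."""
    return sorted(g for g, p in bonferroni(p_values).items() if p < alpha)
```

With 43 brain-specific transcripts tested, a raw p-value must fall below 0.05 / 43 (about 0.0012) to remain significant after Bonferroni correction.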

Methods of Normalization for Comparison across Sample Batches

Considerable heterogeneity may be present between different batches of samples collected. A normalization scheme may be deployed to allow for valid comparison across samples from different batches, and such a scheme was deployed in the present study. For each gene assay within each batch, the delta Ct value of each sample was used to generate a z-score by using the mean and standard deviation inferred from the population of normal samples within the batch. This z-score is then used as the normalized expression value for downstream analysis, as discussed below.
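A minimal sketch of this per-batch normalization for a single gene assay follows. The function name and the example values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

def batch_z_scores(delta_cts, is_normal):
    """Z-score one gene's delta Ct values within one batch.

    delta_cts: delta Ct values, one per sample in the batch.
    is_normal: parallel booleans marking the normal control samples.
    The mean and standard deviation come from the normal samples only.
    """
    normals = [d for d, flag in zip(delta_cts, is_normal) if flag]
    mu = mean(normals)
    sigma = stdev(normals)  # sample standard deviation (assumed estimator)
    return [(d - mu) / sigma for d in delta_cts]

# Hypothetical batch: three normal samples followed by one patient sample.
z = batch_z_scores([4.0, 6.0, 5.0, 8.0], [True, True, True, False])
```

Because each batch is scaled by its own normal controls, a z-score of 0 means "typical for a normal sample in this batch" regardless of batch-level shifts in raw Ct.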

Classification Results using Combined Z-Scores (See FIG. 30)

To incorporate the different measurements across the brain-specific genes into a single distinct measure for classification of the patients, the method of combined z-scores was employed. The combined z-scores measure the deviation of the brain-specific transcripts from the mean expected value of the normal controls and combine these deviations into a single measure for distinguishing Alzheimer's. To analyze the utility of such a measure in distinguishing Alzheimer's, a receiver operating characteristic (ROC) analysis was performed and achieved an area under the curve (AUC) of 0.79 (See FIG. 30).
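The text does not give the formula used to combine the z-scores, so the sketch below assumes a Stouffer-style combination (signed sum divided by the square root of the number of genes) and computes the AUC as the fraction of patient/control pairs ranked correctly. The function names, the per-gene direction signs, and the example values are all hypothetical.

```python
import math

def combined_z(z_scores, directions):
    """Stouffer-style combination of per-gene z-scores for one sample.

    directions holds +1 or -1 per gene so that deviations in the expected
    disease direction add up instead of cancelling (assumed convention).
    """
    k = len(z_scores)
    return sum(d * z for z, d in zip(z_scores, directions)) / math.sqrt(k)

def roc_auc(scores_disease, scores_normal):
    """AUC as the probability that a disease score exceeds a normal score.

    Ties count as one half, matching the rank-based definition of AUC.
    """
    wins = 0.0
    for d in scores_disease:
        for n in scores_normal:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(scores_disease) * len(scores_normal))

# Hypothetical sample with four genes, the second expected to deviate downward.
score = combined_z([2.0, -2.0, 1.0, 1.0], [1, -1, 1, 1])
auc = roc_auc([3.0, 1.5, 0.5], [1.0, 0.0])
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which places the reported 0.79 between the two.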

Incorporation by Reference

References and citations to other documents, such as patents, patent applications, patent publications, journals, books, papers, web contents, have been made throughout this disclosure. All such documents are hereby incorporated herein by reference in their entirety for all purposes.

Equivalents

The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.