SUBSTANTIALLY UNBIASED AMPLIFICATION OF GENOMES

Document Type and Number:

WIPO Patent Application WO/2014/193980

Abstract:

Methods and manufactures for substantially unbiased amplification of genomes are provided herein. Some embodiments include methods of producing a substantially unbiased amplification library of a genome of a single cell. Some embodiments include methods of producing a substantially unbiased amplification of a genome by multiple strand displacement amplification (MDA). Some embodiments include a substrate for substantially unbiased amplification of a genome of each of a plurality of single cells.

Inventors:

GOLE JEFF (US)
ZHANG KUN (US)

Application Number:

PCT/US2014/039830

Publication Date:

December 04, 2014

Filing Date:

May 28, 2014

Assignee:

UNIV CALIFORNIA (US)

International Classes:

C40B40/06

Foreign References:

US20060068394A1	2006-03-30
US20060014167A1	2006-01-19
US4683195A	1987-07-28
US5455166A	1995-10-03
US6977148B2	2005-12-20
US6410278B1	2002-06-25
US4988617A	1991-01-29
US5427930A	1995-06-27
US5849478A	1998-12-15
US5399491A	1995-03-21
US5130238A	1992-07-14
US5854033A	1998-12-29

Other References:

MARCY ET AL.: "Nanoliter Reactors Improve Multiple Displacement Amplification of Genomes from Single Cells", PLOS GENETICS, vol. 3, 1 September 2007 (2007-09-01), pages 1702 - 1708, XP002561035
TANG ET AL.: "mRNA-Seq whole-transcriptome analysis of a single cell", NATURE METHODS, vol. 6, 19 April 2009 (2009-04-19), pages 377 - 384, XP055037482
HUANG ET AL.: "Non-biased and efficient global amplification of a single-cell cDNA library.", NUCL ACIDS RES, 18 October 2013 (2013-10-18), pages 1 - 11, XP055297049
See also references of EP 3004433A4
LASKEN: "Current Opinion in Microbiology", vol. 10, 17 October 2007, CURRENT BIOLOGY LTD, article "Single-cell genomic sequencing using Multiple Displacement Amplification", pages: 510 - 516
ISHOEY ET AL.: "Current Opinion in Microbiology", vol. 11, 1 June 2008, CURRENT BIOLOGY LTD, article "Genomic sequencing of single microbial cells from environmental samples", pages: 198 - 204
DEAN ET AL., MULTIPLE DISPLACEMENT AMPLIFICATION
NOTOMI ET AL., PROCESS FOR SYNTHESIZING NUCLEIC ACID
KACIAN ET AL., NUCLEIC ACID SEQUENCE AMPLIFICATION METHODS
MALEK ET AL., ENHANCED NUCLEIC ACID AMPLIFICATION PROCESS
LIZARDI ET AL., BIOTECHNOLOGY, vol. 6, 1988, pages 1197
ZHANG, K. ET AL.: "Sequencing genomes from single cells by polymerase cloning", NAT BIOTECHNOL, vol. 24, 2006, pages 680 - 686, XP055149835, DOI: doi:10.1038/nbt1214
RODRIGUE, S. ET AL.: "Whole genome amplification and de novo assembly of single bacterial cells", PLOS ONE, vol. 4, 2009, pages e6864
FAN, H.C.; WANG, J.; POTANINA, A.; QUAKE, S.R.: "Whole-genome molecular haplotyping of single cells", NAT BIOTECHNOL, vol. 29, 2011, pages 51 - 57, XP055026438, DOI: doi:10.1038/nbt.1739
HOU, Y. ET AL.: "Single-cell exome sequencing and monoclonal evolution of a JAK2-negative myeloproliferative neoplasm", CELL, vol. 148, 2012, pages 873 - 885, XP028402802, DOI: doi:10.1016/j.cell.2012.02.028
PAN, X. ET AL.: "A procedure for highly specific, sensitive, and unbiased whole-genome amplification", PROC NATL ACAD SCI USA, vol. 105, 2008, pages 15499 - 15504, XP055211855, DOI: doi:10.1073/pnas.0808028105
MARCY, Y. ET AL.: "Dissecting biological ''dark matter'' with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth", PROC NATL ACAD SCI USA, vol. 104, 2007, pages 11889 - 11894, XP055112670, DOI: doi:10.1073/pnas.0704662104
YOON, H.S. ET AL.: "Single-cell genomics reveals organismal interactions in uncultivated marine protists", SCIENCE, vol. 332, 2011, pages 714 - 717
NAVIN, N. ET AL.: "Tumour evolution inferred by single-cell sequencing", NATURE, vol. 472, 2011, pages 90 - 94, XP055042936, DOI: doi:10.1038/nature09807
XU, X. ET AL.: "Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor", CELL, vol. 148, 2012, pages 886 - 895, XP028402803, DOI: doi:10.1016/j.cell.2012.02.025
WANG, J.; FAN, H.C.; BEHR, B.; QUAKE, S.R.: "Genome-wide single-cell analysis of recombination activity and de novo mutation rates in human sperm", CELL, vol. 150, 2012, pages 402 - 412, XP028930183, DOI: doi:10.1016/j.cell.2012.06.030
BANKEVICH, A. ET AL.: "SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing", J COMPUT BIOL, vol. 19, 2012, pages 455 - 477, XP002764171, DOI: doi:10.1089/cmb.2012.0021
CHITSAZ, H. ET AL.: "Efficient de novo assembly of single-cell bacterial genomes from short-read data sets", NAT BIOTECHNOL, vol. 29, 2011, pages 915 - 921
HUTCHISON, C.A. 3RD; SMITH, H.O.; PFANNKOCH, C.; VENTER, J.C.: "Cell-free cloning using phi29 DNA polymerase", PROC NATL ACAD SCI USA, vol. 102, 2005, pages 17332 - 17336
MARCY, Y. ET AL.: "Nanolitre reactors improve multiple displacement amplification of genomes from single cells", PLOS GENET, vol. 3, 2007, pages 1702 - 1708, XP002561035, DOI: doi:10.1371/journal.pgen.0030155
INOUE, J.; SHIGEMORI, Y.; MIKAWA, T.: "Improvements of rolling circle amplification (RCA) efficiency and accuracy using Thermus thermophilus SSB mutant protein", NUCLEIC ACIDS RES, vol. 34, 2006, pages e69, XP002541058, DOI: doi:10.1093/nar/gkl350
WOYKE, T. ET AL.: "One bacterial cell, one complete genome", PLOS ONE, vol. 5, 2010, pages e10314
FITZSIMONS, M.S. ET AL.: "Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome", GENOME RES, 2013
BLAINEY, P.C.; QUAKE, S.R.: "Digital MDA for enumeration of total nucleic acid contamination", NUCLEIC ACIDS RESEARCH, vol. 39, 2011, pages e19, XP055132193, DOI: doi:10.1093/nar/gkq1074
ADEY, A.; SHENDURE, J.: "Ultra-low-input, tagmentation-based whole-genome bisulfite sequencing", GENOME RES, vol. 22, 2012, pages 1139 - 1143, XP055136909, DOI: doi:10.1101/gr.136242.111
REHEN, S.K. ET AL.: "Constitutional aneuploidy in the normal human brain", J NEUROSCI, vol. 25, 2005, pages 2176 - 2180, XP002547871, DOI: doi:10.1523/jneurosci.4560-04.2005
REHEN, S.K. ET AL.: "Chromosomal variation in neurons of the developing and adult mammalian nervous system", PROC NATL ACAD SCI USA, vol. 98, 2001, pages 13361 - 13366, XP003013102, DOI: doi:10.1073/pnas.231487398
YANG, A.H. ET AL.: "Chromosome segregation defects contribute to aneuploidy in normal neural progenitor cells", J NEUROSCI, vol. 23, 2003, pages 10454 - 10462, XP002547545
YUROV, Y.B. ET AL.: "Aneuploidy and confined chromosomal mosaicism in the developing human brain", PLOS ONE, vol. 2, 2007, pages e558
MUOTRI, A.R.; GAGE, F.H.: "Generation of neuronal variability and complexity", NATURE, vol. 441, 2006, pages 1087 - 1093
SINGER, T.; MCCONNELL, M.J.; MARCHETTO, M.C.; COUFAL, N.G.; GAGE, F.H.: "LINE-1 retrotransposons: mediators of somatic variation in neuronal genomes?", TRENDS NEUROSCI, vol. 33, 2010, pages 345 - 354, XP027188680
WESTRA, J.W. ET AL.: "Neuronal DNA content variation (DCV) with regional and individual differences in the human brain", J COMP NEUROL, vol. 518, 2010, pages 3981 - 4000
BASLAN, T. ET AL.: "Genome-wide copy number analysis of single cells", NAT PROTOC, vol. 7, 2012, pages 1024 - 1041, XP009171967, DOI: doi:10.1038/nprot.2012.039
SHENDURE, J. ET AL.: "Accurate multiplex polony sequencing of an evolved bacterial genome", SCIENCE, vol. 309, 2005, pages 1728 - 1732, XP002427180, DOI: doi:10.1126/science.1117389
ZONG, C.; LU, S.; CHAPMAN, A.R.; XIE, X.S.: "Genome-wide detection of single-nucleotide and copy-number variations of a single human cell", SCIENCE, vol. 338, 2012, pages 1622 - 1626, XP055183862, DOI: doi:10.1126/science.1229164
HUSSEIN, S.M. ET AL.: "Copy number variation and selection during reprogramming to pluripotency", NATURE, vol. 471, 2011, pages 58 - 62, XP055000452, DOI: doi:10.1038/nature09871
WESTRA, J.W. ET AL.: "Aneuploid mosaicism in the developing and adult cerebellar cortex", J COMP NEUROL, vol. 507, 2008, pages 1944 - 1951
HUSON, D.H.; AUCH, A.F.; QI, J.; SCHUSTER, S.C.: "MEGAN analysis of metagenomic data", GENOME RES, vol. 17, 2007, pages 377 - 386, XP055033245, DOI: doi:10.1101/gr.5969107
GUREVICH, A.; SAVELIEV, V.; VYAHHI, N.; TESLER, G.: "QUAST: quality assessment tool for genome assemblies", BIOINFORMATICS, vol. 29, 2013, pages 1072 - 1075
AZIZ, R.K. ET AL.: "The RAST Server: rapid annotations using subsystems technology", BMC GENOMICS, vol. 9, 2008, pages 75, XP021033139
MORIYA, Y.; ITOH, M.; OKUDA, S.; YOSHIZAWA, A.C.; KANEHISA, M.: "KAAS: an automatic genome annotation and pathway reconstruction server", NUCLEIC ACIDS RESEARCH, vol. 35, 2007, pages W182 - 185, XP002517617, DOI: doi:10.1093/NAR/GKM321
FAN, CHRISTINA ET AL.: "Whole genome molecular haplotyping of single cells", NATURE BIOTECH
ZONG, CHENGHANG ET AL.: "Genome-Wide Detection of Single-Nucleotide and Copy-Number Variations of a Single Human Cell", SCIENCE, vol. 338, 2012, pages 1622, XP055183862, DOI: doi:10.1126/science.1229164
ZHANG, KUN ET AL.: "Sequencing Genomes from Single Cells by Polymerase Cloning", NATURE BIOTECH
EVRONY, GILAD ET AL.: "Single Neuron Sequencing Analysis of L1 Retrotransposition and Somatic Mutation in the Human Brain", CELL, vol. 151, 2012, pages 483
KIRKNESS, E.F. ET AL.: "Sequencing of isolated sperm cells for direct haplotyping of a human genome", GENOME RES, vol. 23, 2013, pages 826 - 832
LU, S. ET AL.: "Probing meiotic recombination and aneuploidy of single sperm cells by whole-genome sequencing", SCIENCE, vol. 338, 2012, pages 1627 - 1630

Attorney, Agent or Firm:

MILLER, Kimberly, J. (2040 Main Street, 14th Floor, Irvine, CA, US)

Claims:

WHAT IS CLAIMED IS:

1. A method of producing a substantially unbiased amplification library of a genome of a single cell, the method comprising:

amplifying the genome of the single cell in a nanoliter-scale reaction environment configured for substantially unbiased amplification of the genome; and

constructing a library comprising a plurality of amplicons of the substantially unbiased amplification of the genome.

2. The method of claim 1, wherein amplifying the genome of the single cell comprises multiple strand displacement amplification (MDA) comprising contacting the reaction environment with (a) strand-displacement polymerase, and (b) a plurality of random multimers of DNA, thereby producing a substantially unbiased amplification of the genome of the single cell.

3. The method of any of the above claims, wherein a ratio of amount of nucleic acid of the genome to volume of the nanoliter-scale reaction environment is at least about 0.03 Mega-basepairs per nanoliter.

4. The method of any of the above claims, wherein a ratio of amount of nucleic acid of the genome to volume of the nanoliter-scale reaction environment is at least about 200 Mega-basepairs per nanoliter.

5. The method of any of the above claims, wherein the nanoliter-scale reaction environment is configured for amplification of at least about 90% of the genome at greater than 1x coverage.

6. The method of any of the above claims, wherein the nanoliter-scale reaction environment comprises a volume of no more than about 20nL.

7. The method of any of the above claims, wherein the nanoliter-scale reaction environment comprises a volume of no more than about 12nL.

8. The method of any of the above claims, further comprising amplifying a plurality of genomes of single cells in a plurality of nanoliter-scale reaction environments on a single substrate, wherein at least 95% of the reaction environments do not comprise any genomes other than a genome of a single cell.

9. The method of claim 8, wherein at least 99% of the reaction environments do not comprise any genomes other than a genome of a single cell.

10. The method of claim 8 or 9, wherein the substrate is configured for a single pipetting action to distribute the genomes of single cells among the reaction environments.

11. The method of any one of claims 8-10, further comprising:

selecting a desired number of reaction environments; and

amplifying the plurality of genomes of single cells in only the desired number of reaction environments.

12. The method of any one of claims 8-11, further comprising identifying a reaction environment in which a desired level of amplification has been achieved, wherein the library is constructed from the reaction environment in which a desired level of amplification has been achieved.

13. The method of any one of claims 8-12, further comprising constructing a plurality of libraries from the plurality of reaction environments, wherein the number of the plurality of libraries is the same or different as the number of the plurality of reaction environments.

14. The method of any of the above claims, wherein amplifying the genome of the single cell in the nanoliter-scale reaction environment comprises amplification in the presence of an amplification-detection moiety.

15. The method of claim 14, wherein the amplification-detection moiety comprises a cyanine dye.

16. The method of any one of claims 14-15, wherein signal from the amplification-detection moiety identifies a reaction environment in which a desired level of amplification has been achieved.

17. The method of any of the above claims, wherein the reaction environment does not comprise any cells other than the single cell.

18. The method of any of the above claims, wherein the reaction environment does not comprise any genomes other than the genome of the single cell.

19. The method of any of the above claims, wherein the random multimers are selected from the group consisting of: pentamers, hexamers, heptamers, octamers, nonamers and decamers.

20. The method of claim 19, wherein the random multimers are hexamers.

21. The method of any of the above claims, wherein substantially all of the plurality of amplicons are unbranched.

22. The method of any of the above claims, further comprising removing at least some of the plurality of amplicons from the reaction environment prior to constructing the library.

23. The method of claim 22, wherein removing at least some of the plurality of amplicons comprises micromanipulation.

24. The method of any of the above claims, wherein the plurality of amplicons comprises no more than about 100 picograms to about 10 nanograms of DNA.

25. The method of any of the above claims, wherein the library comprises a transposase-based library.

26. The method of any of the above claims, wherein the library comprises a Tn5 transposase-based library.

27. The method of any of the above claims, wherein the library comprises a random fragmentation and ligation library.

28. The method of any of the above claims, wherein the single cell is one of a human cell or a microbial cell.

29. The method of any of the above claims, wherein the single cell comprises a cell of a bacterium that is unculturable, or substantially unculturable.

30. The method of any of the above claims, wherein the MDA comprises real time MDA.

31. The method of any of the above claims, wherein the method is performed in parallel on two or more genomes of two or more single cells, thereby producing two or more unbiased amplification libraries in parallel.

32. The method of any of the above claims, further comprising at least one of: de novo assembly of unculturable bacteria in the human gut, de novo assembly of unculturable bacteria in heterogeneous environments such as sea water, copy number variation calling on single neurons, copy number variation calling on single cancerous cells or circulating tumor cells, or human haplotyping.

33. A method of producing a substantially unbiased amplification of a genome by multiple strand displacement amplification (MDA), the method comprising:

providing the genome in a nanoliter-scale reaction environment; and

contacting the nanoliter-scale reaction environment with (a) strand-displacement polymerase, and (b) a plurality of random multimers of DNA, thereby producing a substantially unbiased amplification of the genome.

34. The method of claim 33, further comprising constructing a library comprising a plurality of amplicons of the substantially unbiased amplification of the genome.

35. The method of any one of claims 33-34, wherein the nanoliter-scale reaction environment is configured for amplification of at least 90% of the genome at greater than 1x coverage.

36. The method of any one of claims 33-35, wherein a ratio of amount of nucleic acid of the genome to volume of the nanoliter-scale reaction environment is at least about 0.3 Mega-basepairs per nanoliter.

37. The method of any one of claims 33-36, wherein a ratio of amount of nucleic acid of the genome to volume of the reaction environment is at least about 200 Mega-basepairs per nanoliter.

38. The method of any of claims 33-37, wherein the random multimers are selected from the group consisting of: pentamers, hexamers, heptamers, octamers, nonamers, and decamers.

39. The method of claim 38, wherein the random multimers are hexamers.

40. The method of any of claims 33-39, wherein substantially all of the plurality of amplicons are unbranched.

41. The method of any of claims 33-40, wherein the nanoliter-scale reaction environment comprises a nanoliter-scale reaction environment that facilitates substantially unbiased amplification of the single cells.

42. The method of any of claims 33-41, wherein the nanoliter-scale reaction environment comprises a volume of no more than about 20nL.

43. The method of any of claims 33-41, wherein the nanoliter-scale reaction environment comprises a volume of no more than about 12nL.

44. The method of any of claims 33-42, wherein there is at least a 99% probability that the reaction environment comprises no more than one genome.

45. The method of any of claims 33-44, further comprising at least one of: de novo assembly of a genome of an unculturable bacterium of the human gut, de novo assembly of an unculturable bacterium of a heterogeneous environment, copy number variation calling on a single neuron, copy number variation calling on a single cancerous cell or circulating tumor cell, or human haplotyping.

46. A substrate for substantially unbiased amplification of a genome of at least one single cell, the substrate comprising:

a plurality of loading areas, wherein each loading area is configured to receive a liquid sample, each loading area comprising:

a plurality of nanoliter-scale reaction environments that facilitates substantially unbiased amplification of a single cell.

47. The substrate of claim 46, wherein the plurality of nanoliter-scale reaction environments is configured for performing a desired number of amplification reactions in parallel, wherein each amplification reaction is conducted in a different nanoliter-scale reaction environment.

48. The substrate of claim 47, wherein the plurality of nanoliter-scale reaction environments is configured for performing a desired number of amplification reactions in parallel without further modification of the substrate.

49. The substrate of any one of claims 46-48, wherein the plurality of nanoliter-scale reaction environments are not in fluid communication with any microfluidic channels or nanofluidic channels.

50. The substrate of any one of claims 46-49, wherein each nanoliter-scale reaction environment has a volume of no more than about 12 nL.

51. The substrate of any one of claims 46-49, wherein each nanoliter-scale reaction environment has a volume of no more than about 20 nL.

52. The substrate of any one of claims 46-51, wherein each loading area is configured for loading a solution comprising diluted cells into the plurality of nanoliter-scale reaction environments via a single pipetting action.

53. The substrate of any one of claims 46-52, wherein each reaction environment comprises a plurality of random multimers and strand-displacement polymerase.

54. The substrate of claim 53, wherein the plurality of multimers comprises hexamers.

55. The substrate of any of claims 46-54 comprising at least three loading areas.

56. The substrate of any of claims 46-55, wherein each loading area comprises at least ten nanoliter-scale reaction environments.

57. The substrate of any of claims 46-55, wherein each loading area comprises at least one hundred nanoliter-scale reaction environments.

58. The substrate of any of claims 46-57, further comprising a detector configured to detect an amplification-detection moiety in each of the reaction environments.

59. The substrate of any of claims 46-58, further comprising a nanopipettor configured to withdraw amplified nucleic acid from a single reaction environment.

60. The substrate of any one of claims 46-59, wherein the nanoliter-scale reaction environments are configured so that at least 99% of the reaction environments comprise a genome of no more than one cell following a loading of solution comprising single cells or fractions thereof in the loading area.

61. The substrate of any one of claims 46-60, wherein substantially each reaction environment comprises a genome of no more than one cell, and wherein substantially each reaction environment that comprises a genome further comprises a plurality of amplicons of the genome.

62. The substrate of claim 61, wherein the plurality of amplicons comprises substantially unbiased coverage of the genome.

63. The substrate of any one of claims 46-62, wherein the plurality of amplicons comprises no more than about 100 picograms to about 10 nanograms of DNA.

64. The substrate of any one of claims 53-63 wherein the strand-displacement polymerase comprises phi29 polymerase.

65. The method of any of claims 1-45, wherein the strand-displacement polymerase comprises phi29 polymerase.

Description:

[0001] The present application claims the benefit of U.S. Provisional App. No. 61/829193, filed May 30, 2013, which is hereby incorporated by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED R&D

[0002] This research was supported by National Institutes of Health Grant R01HG004876. The government may have certain rights in the invention.

BACKGROUND

[0003] The genetic material in a single cell can be amplified by DNA polymerase into many clonal copies through whole genome amplification and characterized by shotgun sequencing. Single-cell genome sequencing has been successfully demonstrated on microbial and mammalian cells, and applied to the characterization of microbial genomic diversity of the ocean⁷, somatic mutations in cancers⁸,⁹, and meiotic recombination and mutation in sperm³,¹⁰.

Field

[0004] Embodiments herein relate generally to whole-genome amplification. Some embodiments herein relate generally to unbiased amplification of a genome.

SUMMARY

[0005] According to some aspects, a method of producing a substantially unbiased amplification library of a genome of a single cell is provided. The method can comprise amplifying the genome of the single cell in a nanoliter-scale reaction environment configured for substantially unbiased amplification of the genome, and constructing a library comprising a plurality of amplicons of the substantially unbiased amplification of the genome. In some embodiments, amplifying the genome of the single cell comprises multiple strand displacement amplification (MDA) comprising contacting the reaction environment with (a) strand-displacement polymerase, and (b) a plurality of random multimers of DNA, thereby producing a substantially unbiased amplification of the genome of the single cell. In some embodiments, a ratio of amount of nucleic acid of the genome to volume of the nanoliter-scale reaction environment is at least about 0.03 Mega-basepairs per nanoliter. In some embodiments, a ratio of amount of nucleic acid of the genome to volume of the nanoliter-scale reaction environment is at least about 200 Mega-basepairs per nanoliter. In some embodiments, the nanoliter-scale reaction environment is configured for amplification of at least about 90% of the genome at greater than 1x coverage. In some embodiments, the nanoliter-scale reaction environment comprises a volume of no more than about 20nL. In some embodiments, the nanoliter-scale reaction environment comprises a volume of no more than about 12nL. In some embodiments, the method further comprises amplifying a plurality of genomes of single cells in a plurality of nanoliter-scale reaction environments on a single substrate, wherein at least 95% of the reaction environments do not comprise any genomes other than a genome of a single cell. In some embodiments, at least 99% of the reaction environments do not comprise any genomes other than a genome of a single cell.
In some embodiments, the substrate is configured for a single pipetting action to distribute the genomes of single cells among the reaction environments. In some embodiments, the method further comprises selecting a desired number of reaction environments; and amplifying the plurality of genomes of single cells in only the desired number of reaction environments. In some embodiments, the method further comprises identifying a reaction environment in which a desired level of amplification has been achieved, wherein the library is constructed from the reaction environment in which a desired level of amplification has been achieved. In some embodiments, the method further comprises constructing a plurality of libraries from the plurality of reaction environments, in which the number of the plurality of libraries is the same or different as the number of the plurality of reaction environments. In some embodiments, amplifying the genome of the single cell in the nanoliter-scale reaction environment comprises amplification in the presence of an amplification-detection moiety. In some embodiments, the amplification-detection moiety comprises a cyanine dye. In some embodiments, the amplification-detection moiety comprises SYBR™ green dye. In some embodiments, signal from the amplification-detection moiety identifies a reaction environment in which a desired level of amplification has been achieved. In some embodiments, the reaction environment does not comprise any cells other than the single cell. In some embodiments, the reaction environment does not comprise any genomes other than the genome of the single cell. In some embodiments, the random multimers are selected from the group consisting of: pentamers, hexamers, heptamers, octamers, nonamers and decamers. In some embodiments, the random multimers are hexamers. In some embodiments, substantially all of the plurality of amplicons are unbranched.
In some embodiments, the method further comprises removing at least some of the plurality of amplicons from the reaction environment prior to constructing the library. In some embodiments, removing at least some of the plurality of amplicons comprises micromanipulation. In some embodiments, the plurality of amplicons comprises no more than about 100 picograms to about 10 nanograms of DNA. In some embodiments, the library comprises a transposase-based library. In some embodiments, the library comprises a Tn5 transposase-based library. In some embodiments, the library comprises a random fragmentation and ligation library. In some embodiments, the single cell is one of a human cell or a microbial cell. In some embodiments, the single cell comprises a cell of a bacterium that is unculturable, or substantially unculturable. In some embodiments, the MDA comprises real time MDA. In some embodiments, the method is performed in parallel on two or more genomes of two or more single cells, thereby producing two or more unbiased amplification libraries in parallel. In some embodiments, the method further comprises at least one of: de novo assembly of unculturable bacteria in the human gut, de novo assembly of unculturable bacteria in heterogeneous environments such as sea water, copy number variation calling on single neurons, copy number variation calling on single cancerous cells or circulating tumor cells, or human haplotyping. In some embodiments, the strand-displacement polymerase comprises a high-fidelity polymerase. In some embodiments, the strand-displacement polymerase comprises phi29 polymerase.
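
By way of non-limiting illustration, the ratio of genome size to reaction volume recited above can be worked through numerically. The following is a minimal sketch only: the genome sizes are approximate textbook figures assumed for the example and are not taken from this application, and the function names are hypothetical.

```python
# Illustrative arithmetic only. The genome sizes are rough, assumed values
# (not from the application); the function names are hypothetical.
APPROX_GENOME_MBP = {
    "E. coli (one genome copy)": 4.6,   # assumed approximate size
    "human (diploid)": 6200.0,          # assumed approximate size
}


def genome_density_mbp_per_nl(genome_mbp: float, volume_nl: float) -> float:
    """Return genome size (Mega-basepairs) divided by reaction volume (nL)."""
    if volume_nl <= 0:
        raise ValueError("volume_nl must be positive")
    return genome_mbp / volume_nl


def meets_threshold(genome_mbp: float, volume_nl: float,
                    threshold_mbp_per_nl: float) -> bool:
    """True if the genome-to-volume ratio is at least the given threshold."""
    return genome_density_mbp_per_nl(genome_mbp, volume_nl) >= threshold_mbp_per_nl


if __name__ == "__main__":
    for name, mbp in APPROX_GENOME_MBP.items():
        for volume_nl in (12.0, 20.0):
            ratio = genome_density_mbp_per_nl(mbp, volume_nl)
            print(f"{name} in {volume_nl:g} nL: {ratio:.3g} Mbp/nL; "
                  f">= 0.03: {meets_threshold(mbp, volume_nl, 0.03)}; "
                  f">= 200: {meets_threshold(mbp, volume_nl, 200.0)}")
```

Under these assumed sizes, a single bacterial genome in a 12nL or 20nL environment would clear the lower ratio but not the higher one, while a diploid human genome would clear both.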

[0006] According to some aspects, a method of producing a substantially unbiased amplification of a genome by multiple strand displacement amplification (MDA) is provided. The method can comprise providing the genome in a nanoliter-scale reaction environment, and contacting the nanoliter-scale reaction environment with (a) strand-displacement polymerase, and (b) a plurality of random multimers of DNA, thereby producing a substantially unbiased amplification of the genome. In some embodiments, the method further comprises constructing a library comprising a plurality of amplicons of the substantially unbiased amplification of the genome. In some embodiments, the nanoliter-scale reaction environment is configured for amplification of at least 90% of the genome at greater than 1x coverage. In some embodiments, a ratio of amount of nucleic acid of the genome to volume of the nanoliter-scale reaction environment is at least about 0.3 Mega-basepairs per nanoliter. In some embodiments, a ratio of amount of nucleic acid of the genome to volume of the reaction environment is at least about 200 Mega-basepairs per nanoliter. In some embodiments, the random multimers are selected from the group consisting of: pentamers, hexamers, heptamers, octamers, nonamers, and decamers. In some embodiments, the random multimers comprise hexamers. In some embodiments, substantially all of the plurality of amplicons are unbranched. In some embodiments, the nanoliter-scale reaction environment comprises a nanoliter-scale reaction environment that facilitates substantially unbiased amplification of the single cells. In some embodiments, the nanoliter-scale reaction environment comprises a volume of no more than about 20nL. In some embodiments, the nanoliter-scale reaction environment comprises a volume of no more than about 12nL. In some embodiments, there is at least a 99% probability that the reaction environment comprises no more than one genome.
In some embodiments, the method further comprises at least one of: de novo assembly of a genome of an unculturable bacterium of the human gut, de novo assembly of an unculturable bacterium of a heterogeneous environment, copy number variation calling on a single neuron, copy number variation calling on a single cancerous cell or circulating tumor cell, or human haplotyping. In some embodiments, the strand-displacement polymerase comprises a high-fidelity polymerase. In some embodiments, the strand-displacement polymerase comprises phi29 polymerase.
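
By way of non-limiting illustration, the probability that a reaction environment comprises no more than one genome can be estimated if one assumes that diluted cells distribute among the reaction environments according to a Poisson distribution. That distributional model is an assumption made for this sketch and is not stated in the application; the function names are likewise hypothetical.

```python
import math


def prob_at_most_one(mean_per_well: float) -> float:
    """P(count <= 1) for a Poisson-distributed count with the given mean.

    Assumes Poisson loading, which is an illustrative model only.
    """
    if mean_per_well < 0:
        raise ValueError("mean_per_well must be non-negative")
    return math.exp(-mean_per_well) * (1.0 + mean_per_well)


def max_mean_for_probability(target: float, tol: float = 1e-9) -> float:
    """Largest Poisson mean for which P(count <= 1) is still >= target.

    Uses bisection; P(count <= 1) decreases monotonically as the mean grows.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    low, high = 0.0, 50.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if prob_at_most_one(mid) >= target:
            low = mid   # still satisfies the target; try a denser loading
        else:
            high = mid  # too dense; back off
    return low


if __name__ == "__main__":
    mean = max_mean_for_probability(0.99)
    print(f"mean genomes per well for P(<=1) >= 0.99: {mean:.4f}")
```

Under the Poisson assumption, a 99% probability of no more than one genome per reaction environment corresponds to a mean loading of a little under 0.15 genomes per environment, so most environments would be empty at that dilution.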

[0007] According to some aspects, a substrate for substantially unbiased amplification of a genome of at least one single cell is provided. The substrate can comprise a plurality of loading areas, in which each loading area is configured to receive a liquid sample. Each loading area can comprise a plurality of nanoliter-scale reaction environments that facilitates substantially unbiased amplification of a single cell. In some embodiments, the plurality of nanoliter-scale reaction environments is configured for performing a desired number of amplification reactions in parallel, in which each amplification reaction is conducted in a different nanoliter-scale reaction environment. In some embodiments, the plurality of nanoliter-scale reaction environments is configured for performing a desired number of amplification reactions in parallel without further modification of the substrate. In some embodiments, the plurality of nanoliter-scale reaction environments are not in fluid communication with any microfluidic channels or nanofluidic channels. In some embodiments, each nanoliter-scale reaction environment has a volume of no more than about 12 nL. In some embodiments, each nanoliter-scale reaction environment has a volume of no more than about 20 nL. In some embodiments, each loading area is configured for loading a solution comprising diluted cells into the plurality of nanoliter-scale reaction environments via a single pipetting action. In some embodiments, each reaction environment comprises a plurality of random multimers and strand-displacement polymerase. In some embodiments, the plurality of multimers comprises hexamers. In some embodiments, the substrate comprises at least three loading areas. In some embodiments, each loading area comprises at least ten nanoliter-scale reaction environments. In some embodiments, each loading area comprises at least one hundred nanoliter-scale reaction environments.
In some embodiments, the substrate further comprises a detector configured to detect an amplification-detection moiety in each of the reaction environments. In some embodiments, the substrate further comprises a nanopipettor configured to withdraw amplified nucleic acid from a single reaction environment. In some embodiments, the nanoliter-scale reaction environments are configured so that at least 99% of the reaction environments comprise a genome of no more than one cell following a loading of solution comprising single cells or fractions thereof in the loading area. In some embodiments, substantially each reaction environment comprises a genome of no more than one cell, and wherein substantially each reaction environment that comprises a genome further comprises a plurality of amplicons of the genome. In some embodiments, the plurality of amplicons comprises substantially unbiased coverage of the genome. In some embodiments, the plurality of amplicons comprises no more than about, 100 picograms to about 10 nanograms of DNA. In some embodiments, the strand- displacement polymerase comprises a high-fidelity polymerase, in some embodiments, the strand-displacement polymerase comprises phi29 polymerase.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] Figure 1 is a series of schematic diagrams illustrating substantially unbiased amplification of genomes according to some embodiments herein. Figure 1A is a schematic diagram illustrating a substrate 100 according to some embodiments herein in the context of a method of substantially unbiased amplification of genomes in accordance with some embodiments herein. Each substrate 100 can contain 16 individual loading areas 12, with each loading area 14 containing 255 nanoliter-scale reaction environments, for example 12 nl microwells. Cells, lysis solution, denaturing buffer, neutralization buffer, and MDA master mix comprising an amplification-detection moiety can be each added to the microwells with a single pipette pump. Amplicon growth can be then visualized with a fluorescent microscope using a real time MDA system. Microwells showing increasing fluorescence over time are positive amplicons. The amplicons are extracted with fine glass pipettes attached to a micromanipulation system. Figure 1B is a series of scanning electron microscopy (SEM) images of a single E. coli cell at different magnifications. This particular well contains only 1 cell, and most wells observed also contained no more than 1 cell. Figure 1C is a photograph illustrating a custom microscope incubation chamber that can be used for real time MDA in accordance with some embodiments herein. The chamber is temperature and humidity controlled to mitigate evaporation of reagents. Additionally, it prevents contamination during amplicon extraction by self-containing the micromanipulation system. An image of the entire microwell array is also shown, as well as a micropipette probing a well. Figure 1D is a schematic diagram illustrating that in accordance with some embodiments herein, complex 3-dimensional MDA amplicons are reduced to linear DNA using DNA polymerase I and Ampligase. This process can significantly improve the library complexity post-tagmentation.

[0009] Figure 2 is a diagram of assembled E. coli genomes generated by MIDAS in accordance with some embodiments herein. Three single E. coli cells were analyzed using MIDAS. Between 88% and 94% of the genome was assembled with very little sequencing effort (2-8M PE100bp reads). The histograms show the log₂ of average depth of coverage across each assembled region for each of the three cells. Gaps are represented by blank whitespace in between colored contigs. Depth of coverage is fairly uniform across the genome, and few gaps are present.

[0010] Figure 3 is a series of graphs illustrating genomic coverage of single bacterial and mammalian cells post MDA and MIDAS in accordance with some embodiments herein. Figure 3A is a graph illustrating a comparison of single E. coli cells amplified in a PCR tube for 10 hours (top), 2 hours (middle), and in a microwell (MIDAS) for 10 hours (bottom) in accordance with some embodiments herein. Log₁₀ ratio (y-axis) represents the normalized coverage. The bias improves as MDA is limited, with the MIDAS method displaying the greatest uniformity. Figure 3B is a graph illustrating a comparison of single human cells amplified using traditional MDA and MIDAS in accordance with some embodiments herein. A 10 hour MDA of a single lymphocyte (top) displays more coverage bias when compared to a single neuronal nucleus amplified by MIDAS (bottom). Figure 3C is a graph illustrating distribution of coverage of amplified single bacterial cells in accordance with some embodiments herein. The x-axis represents the log₁₀ of genomic coverage binned into 100 total bins. MIDAS (30) demonstrates a tight coverage, indicating limited bias in the library. Both the normal (32) and limited (34) in-tube MDA libraries show a broad range of coverages. Figure 3D is a graph illustrating distribution of coverage of amplified single mammalian cells in accordance with some embodiments herein. MIDAS (36) shows a much tighter coverage distribution than an in-tube MDA library (38).

[0011] Figure 4 is a series of graphs illustrating detection of copy number variants using MIDAS in accordance with some embodiments herein. Figure 4A is a graph illustrating a plot of copy number variation in a Down Syndrome single cell analyzed with MIDAS in accordance with some embodiments herein. The x-axis shows genomic position, while the y-axis shows (in a log₂ scale) the estimated copy number. Trisomy 21 is clearly visible in this single cell, along with several other smaller CNV calls. Figure 4B is a plot of copy number variation in a Down Syndrome single cell with Trisomy 21 "spike-ins" in accordance with some embodiments herein. The x-axis shows genomic position, while the y-axis shows (in a log₂ scale) the estimated copy number. At each arrow, a 2 Mb section of chromosome 21 was computationally inserted into the genome. At each location, a copy number variant is called, showing that MIDAS can detect 2 Mb copy number variation accurately.

[0012] Figure 5 is a series of microscope images depicting real time MDA in accordance with some embodiments herein. Images are taken every hour using a 488 nm filter. Shown are 1 hour (Figure 5A), 2 hours (Figure 5B), 3 hours (Figure 5C), 4 hours (Figure 5D), 5 hours (Figure 5E), 6 hours (Figure 5F), 7 hours (Figure 5G), and 8 hours (Figure 5H). Amplicons are visualized growing beginning at 1 hour and continue to grow until they cannot amplify due to the limited space in the microwells. This saturation usually occurs within 5 to 6 hours. The amplicons are randomly distributed, demonstrating random cell seeding, and no amplicons are in abutting wells.

[0013] Figure 6 is a series of microscope images depicting amplicon extraction in accordance with some embodiments herein. Microwells are saturated with genomic DNA and MDA is performed such that every well contains an MDA amplicon. The fluorescence in Figure 6A displays successful amplification. After amplification, a micropipette is lowered into a single well, designated by the arrow, and the amplicon is extracted. Figure 6B shows a successful removal of the amplicon, as indicated by loss of fluorescence, without disturbing the contents of the nearby microwells.

[0014] Figure 7 is a schematic diagram depicting a comparison of assembly to mapped reads across a genome in accordance with some embodiments herein. The outer track displays the assembled contigs mapping to E. coli. The middle track shows the raw reads mapping to E. coli. The inner track presents the coverage of the reads. The coverage is less in the mapped regions where contigs were not assembled.

[0015] Figure 8 is a series of graphs depicting detection of copy number variants using traditional MDA-based single cell sequencing in accordance with some embodiments herein. Figure 8A is a graph depicting a plot of copy number variation in a Down Syndrome single cell analyzed with traditional MDA. The x-axis shows genomic position, while the y-axis shows (in a log2 scale) the estimated copy number. Trisomy 21 is not visible in this single cell, and several other large CNVs spread across the genome are called. Figure 8B is a graph depicting a plot of copy number variation in a Down Syndrome single cell with Trisomy 21 "spike-ins." The x-axis shows genomic position, while the y-axis shows (in a log2 scale) the estimated copy number. At each arrow, a 2 Mb section of chromosome 21 was computationally inserted into the genome. Copy number variation is not called at any location, showing that traditional MDA-based methods cannot detect CNVs accurately.

[0016] Figures 9A-9B are a series of graphs depicting a comparison of MIDAS amplification, according to some embodiments herein, to MALBAC, a different method of amplifying nucleic acids. Figure 9A is a pair of graphs depicting MALBAC (top) and MIDAS (bottom), in which MIDAS and MALBAC show similar unbiased coverage across the genome. Figure 9B is a pair of graphs showing that MIDAS 90 displays a slightly better distribution of coverages when compared with MALBAC 92.

[0017] Figures 10A-10C are a series of graphs depicting a comparison of MIDAS amplification according to some embodiments herein to previously published data for in-tube MDA ⁴³, microfluidic MDA ¹⁰ and MALBAC ⁴⁴ for diploid regions of pools of two sperm cells and diploid regions of a single SW480 cancer cell processed using MALBAC ³⁰. Genomic positions were consolidated into variable bins of ~60 kb in size previously determined to contain a similar read count ³⁰ and were plotted against the log10 ratio (y axis) of genomic coverage (normalized to the mean). For the cancer cell data, nondiploid regions have been masked out (white gaps between pink) to remove the bias generated by comparing a highly aneuploid cell to a primarily diploid cell. Figure 10A depicts results for sperm pool 1, in-tube MDA; sperm pool 2, in-tube MDA; and sperm pool 1, microfluidic MDA. Figure 10B depicts results for sperm pool 2, microfluidic MDA; sperm pool 1, MALBAC; and sperm pool 2, MALBAC. Figure 10C depicts results for SW480 cancer cell (diploid regions, MALBAC); Neuronal nucleus 1, MIDAS; and Neuronal nucleus 2, MIDAS.

DETAILED DESCRIPTION

[0018] Amplification of sub-nanogram quantities of nucleic acids, for example the genome of a single cell, can be useful for a number of applications. According to some embodiments herein, methods and manufactures for substantially unbiased amplification of nucleic acids are provided. In some embodiments, a small quantity of nucleic acid, for example the genomic material of a single cell, is amplified in a nanoliter-scale volume. The nanoliter-scale volume can provide for amplification in a high concentration of reactants. The amplification can comprise multiple strand displacement amplification (MDA). In some embodiments, the amplification is performed in a single reaction space, such as a well, thus minimizing moving parts. In some embodiments, the amplification method can be readily scaled by simply increasing or decreasing a number of nanoliter-scale amplifications that are performed in parallel. In some embodiments, a sequencing library is prepared from the amplified nucleic acid. In some embodiments, the library comprises a random fragmentation and ligation library.

[0019] Genome sequencing of single cells can have a variety of applications including, but not limited to, characterizing difficult-to-culture microorganisms and identifying somatic mutations in single cells from mammalian tissues. A major hurdle of this process can be bias in amplifying and making multiple copies of the genetic material from a single cell, a procedure known as polymerase cloning. Some embodiments herein provide a microwell displacement amplification system (MIDAS), a massively parallel polymerase cloning method in which single cells are randomly distributed into hundreds to thousands of microwells in nanoliter-scale volumes and simultaneously amplified for shotgun sequencing. In some embodiments, MIDAS dramatically reduces amplification bias by implementing polymerase cloning in nanoliter-scale reactions, allowing the de novo assembly of near-complete microbial genomes from single E. coli cells. In some embodiments, MIDAS allows detection of single-copy number changes in primary human adult neurons at 1-2 Mb resolution. MIDAS can facilitate the characterization of genomic diversity in many heterogeneous cell populations. It is further contemplated that as amplification reactions according to some embodiments herein are performed in a single reaction environment, these reactions can be performed with minimal moving parts (for example, only a pipettor to add or remove solution from a reaction environment). Accordingly, amplification reactions according to some embodiments herein can be performed with a high degree of reliability, while minimizing the need for additional components such as moving parts, and chassis and operating software for such moving parts. In some embodiments, amplification is performed in a single reaction environment. 
In some embodiments, the amplification is performed without the activity of fluidic channels or other fluidic system other than one or more pipettors for adding and/or removing solution from the reaction environment. In some embodiments, the amplification is performed in a reaction environment that is not in fluid communication with a network of fluidic channels, and is not configured for being in fluid communication with a network of fluidic channels.

[0020] Some embodiments allow for whole genome amplification of many single cells in parallel in an unbiased manner. Hundreds (or more) of cells can be amplified simultaneously in nanoliter volumes. Some embodiments include a low input sequencing library construction technique such that DNA directly from the whole genome amplification can be sequenced. The unbiased nature of amplification can allow for a myriad of downstream applications, including de novo assembly of unculturable bacteria and copy number variation calling of single mammalian cells.

[0021] According to some embodiments herein, methods of nucleic acid amplification are readily scalable. Depending on the desired number of amplification reactions to be performed, a number of nanoliter-scale reaction environments (for example wells) can be selected. Templates (e.g. single cells, or single cell genomes) can be diluted to a volume such that there is approximately no more than one template per reaction environment, and distributed among the desired number of reaction environments. In some embodiments, at least one substrate comprising a plurality of nanoliter-scale reaction environments is provided. If the desired number of reactions is less than the number of reaction environments on the substrate, only some of the reaction environments can be used. If the desired number of reactions is greater than the number of reaction environments on the substrate, two or more substrates can be used, for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, or 100 substrates, including ranges between any two of the listed values. It is contemplated herein that scalability offers flexibility to an operator. Additionally, as amplification reactions according to some embodiments herein can be performed with minimal moving parts, the number of amplification reactions can be readily scaled without any substantial customization or redesign of the substrate architecture (such as operating software, mechanical components, fluidic systems, and the like). Accordingly, in some embodiments, a large number of amplification reactions can be performed in parallel. 
In some embodiments, at least 2 amplification reactions are performed in parallel, for example at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, or 10000 amplifications, including ranges between any two of the listed values.

Nucleic acid amplification

[0022] Traditional whole genome amplification techniques for single cells can amplify genomes extremely biasedly. Small regions of the genome can be amplified greatly, whereas most of the genome can be amplified very little. Therefore, a large amount of sequencing effort can be required to resolve any of the genome. Downstream applications, such as de novo assembly or copy number variation calling, thus can be extremely difficult and inaccurate.

[0023] In some embodiments, whole genomes of single cells are amplified unbiasedly. In some embodiments, whole genomes of single cells are amplified substantially unbiasedly. As used herein "substantially unbiased" and pluralizations, conjugations, variations, and the like of this root term refers to amplification of a genome wherein, when the amplified genome is divided into at least 100 genomic bins that were previously determined such that each would contain a similar number of reads after mapping (see, e.g. ³⁰), the log₁₀ fold-amplification of at least 80% of the bins is within ±20% of the mean (i.e. for at least 80% of the genomic bins, the log₁₀ of the fold amplification is no more than 20% more, and no less than 20% less, than the mean number of copies genome-wide). In some embodiments, the log₁₀ fold-amplification of at least 80% of the bins is within ±20% of the mean, for example at least about 80%, 85%, 90%, 95%, 99%, or 99.9%. When whole genome amplification is substantially unbiased or unbiased, most of the genome can be amplified to a similar degree. Therefore, relatively little sequencing effort can be necessary for downstream analysis. De novo assembly can be accomplished and copy number variations can be called with a much greater accuracy.

[0024] As used herein, "nanoliter-scale" refers to a volume, for example in a reaction environment, of at least about one nanoliter and no more than about 50 nanoliters, more preferably about 5 nanoliters to about 30 nanoliters, more preferably about 10 nanoliters to about 25 nanoliters, for example about 12 nanoliters or about 20 nanoliters.
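
By way of non-limiting illustration, the "substantially unbiased" criterion defined above can be sketched in Python as follows. The sketch assumes one reading of the definition, namely that each bin's log₁₀ fold-amplification is compared against the mean log₁₀ fold-amplification across all bins. The function name, parameter names, and example values are hypothetical and are not taken from the disclosure.

```python
import math


def is_substantially_unbiased(fold_amplification, min_bins=100,
                              tolerance=0.20, min_fraction=0.80):
    """Return True if enough bins fall within the tolerance of the mean.

    fold_amplification: per-bin fold-amplification values (hypothetical input).
    Assumption: the comparison is made on log10 values against their mean.
    """
    if len(fold_amplification) < min_bins:
        raise ValueError("the definition calls for at least 100 genomic bins")
    logs = [math.log10(f) for f in fold_amplification]
    mean_log = sum(logs) / len(logs)
    # Count bins whose log10 value lies within +/- tolerance of the mean.
    within = sum(1 for x in logs
                 if abs(x - mean_log) <= tolerance * abs(mean_log))
    return within / len(logs) >= min_fraction


# Illustrative inputs only: a uniform library and a strongly skewed one.
uniform = [1e6] * 100
skewed = [1e2] * 50 + [1e10] * 50
print(is_substantially_unbiased(uniform))
print(is_substantially_unbiased(skewed))
```

Other readings of the definition (for example, comparing against the log₁₀ of the mean copy number) would change only the line that computes `mean_log`.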

[0025] In some embodiments, cells are diluted and spread evenly across a loading area on a substrate, in which the loading area contains hundreds of nanoliter-scale reaction environments such that at least 99% of the reaction environments contain no more than 1 cell per well. In some embodiments, the substrate comprises a PDMS slide. After lysis and denaturing, the DNA can be amplified using multiple displacement amplification (MDA). The MDA reactants can be provided in buffer comprising polymerase, dNTPs, random oligonucleotides, and an amplification-detection moiety such as SYBR™ green dye. The MDA can be performed in a temperature controlled environment and in optical communication with a detector for amplification-detection moiety, such as a microscope. Without being limited by any theory, the small volume and consequent high concentration of template can allow for an unbiased amplification of the whole genome. Staining with an amplification-detection moiety, for example SYBR™ green, during amplification allows for positive amplifications to be observed due to an increase in detectable signal over time. Positive amplifications are then automatically or manually removed using a micromanipulator and deposited into tubes. Some embodiments include a low input sequencing library construction method capable of using sub-nanogram inputs of DNA. The complex MDA amplicon can then be denatured and simple linear DNA created. The linear DNA can be used to construct a sequencing library. In some embodiments, transposons with Illumina sequencing adaptors (Nextera) then fragment the DNA while adding sequencing adapters. Accordingly, a sequencing library can be prepared. It is contemplated that nucleic acid amplified substantially unbiasedly in accordance with embodiments herein can be used for a number of downstream applications, including any of a number of genome sequencing techniques known to the skilled artisan.

[0026] A variety of techniques for amplifying nucleic acid are known to the skilled artisan. Exemplary techniques for amplifying nucleic acid include, but are not limited to: polymerase chain reaction (PCR), strand displacement amplification (SDA), for example multiple displacement amplification (MDA), loop-mediated isothermal amplification (LAMP), ligase chain reaction (LCR), immuno-amplification, and a variety of transcription-based amplification procedures, including transcription-mediated amplification (TMA), nucleic acid sequence based amplification (NASBA), self-sustained sequence replication (3SR), and rolling circle amplification. See, e.g., Mullis, "Process for Amplifying, Detecting, and/or Cloning Nucleic Acid Sequences," U.S. Pat. No. 4,683,195; Walker, "Strand Displacement Amplification," U.S. Pat. No. 5,455,166; Dean et al., "Multiple displacement amplification," U.S. Pat. No. 6,977,148; Notomi et al., "Process for Synthesizing Nucleic Acid," U.S. Pat. No. 6,410,278; Landegren et al., U.S. Pat. No. 4,988,617 "Method of detecting a nucleotide change in nucleic acids"; Birkenmeyer, "Amplification of Target Nucleic Acids Using Gap Filling Ligase Chain Reaction," U.S. Pat. No. 5,427,930; Cashman, "Blocked-Polymerase Polynucleotide Immunoassay Method and Kit," U.S. Pat. No. 5,849,478; Kacian et al., "Nucleic Acid Sequence Amplification Methods," U.S. Pat. No. 5,399,491; Malek et al., "Enhanced Nucleic Acid Amplification Process," U.S. Pat. No. 5,130,238; Lizardi et al., BioTechnology, 6: 1197 (1988); Lizardi et al., U.S. Pat. No. 5,854,033 "Rolling circle replication reporter systems," each of which is hereby incorporated by reference in its entirety. Preferably, MDA can be used in accordance with some embodiments herein. 
MDA can comprise annealing random oligonucleotide primers to a template nucleic acid, and extending the oligonucleotide primers forward to the annealing site of the most immediate downstream oligonucleotide primer so as to form branched amplified nucleic acid. MDA can be performed at a constant temperature, and compared to conventional PCR can produce relatively large products with a relatively low error rate. A variety of MDA reagents can be used in accordance with embodiments herein. In some embodiments, MDA is performed with a strand-displacement polymerase. In some embodiments, the strand-displacement polymerase comprises a high-fidelity DNA polymerase, for example phi29 DNA polymerase.

[0027] The fold amount of amplification that occurs according to some embodiments herein can depend on the amount of template, and the total mass of reactants. According to some embodiments herein, amplification is performed until saturation (e.g. until additional cycles of amplification are no longer in a logarithmic phase, so that the additional cycles produce few to no additional amplicons). Without being limited by any theory, it is contemplated that the total amount of amplification is proportional to the total mass of the reaction, and inversely proportional to the size of the template being amplified. Accordingly, by way of example, given the same reaction mass and amplification until saturation in accordance with some embodiments herein, a 1 Mb genome would be amplified approximately 10-fold more than a 10 Mb genome.
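
By way of non-limiting illustration, the proportionality stated above (fold amplification at saturation proportional to reaction mass and inversely proportional to template size) can be sketched as follows. The function and parameter names are hypothetical, and the sketch computes only a ratio between two templates rather than an absolute fold amplification.

```python
def relative_fold_amplification(template_mb, reference_mb,
                                reaction_mass_ratio=1.0):
    """Fold amplification of a template relative to a reference template.

    Assumes amplification to saturation, with fold amplification
    proportional to reaction mass and inversely proportional to
    template size. reaction_mass_ratio is the template's reaction
    mass divided by the reference's reaction mass.
    """
    if template_mb <= 0 or reference_mb <= 0:
        raise ValueError("genome sizes must be positive")
    return reaction_mass_ratio * reference_mb / template_mb


# Same reaction mass: a 1 Mb genome compared with a 10 Mb genome.
print(relative_fold_amplification(template_mb=1, reference_mb=10))
```

With equal reaction masses the sketch reproduces the 10-fold relationship given in the example above.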

[0028] Without being limited by any theory, it is contemplated herein that a high concentration of amplification reactants and template in accordance with some embodiments herein can facilitate substantially unbiased amplification of all or substantially all of the template, for example genomic material. So as to provide a high concentration of reactants, including, but not limited to, template, the ratio of template to reaction volume can be relatively high in some embodiments herein. Accordingly, in some embodiments, the nanoliter-scale reaction environments are configured for a high ratio of genomic material to reaction volume. In some embodiments, the nanoliter-scale reaction environments are configured for at least about 0.02 megabases of genomic material per nanoliter of reaction volume, for example at least about 0.02, 0.03, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1500, 2000, 2500, 3000, 4500, or 5000 megabases of genomic material per nanoliter, including ranges between any two of the listed values. In some embodiments, the nanoliter-scale reaction environments are configured for at least about 0.03 megabases of genomic material per nanoliter of reaction. In some embodiments, the nanoliter-scale reaction environments are configured for at least about 0.3 megabases of genomic material per nanoliter of reaction. In some embodiments, the nanoliter-scale reaction environments are configured for at least about 100 megabases of genomic material per nanoliter of reaction. In some embodiments, the nanoliter-scale reaction environments are configured for at least about 200 megabases of genomic material per nanoliter of reaction. 
It is further contemplated herein that the nanoliter-scale reaction environments can be configured so that substantially each nanoliter-scale reaction environment comprises only one genome (or cell comprising a genome) when a liquid comprising diluted whole cells or fractions thereof is applied to a substrate as described herein. Accordingly, in some embodiments, each nanoliter-scale reaction environment is configured so that at least about 95% of the nanoliter-scale reaction environments comprise only one cell after administration of the solution comprising cells or fragments thereof, for example at least about 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 99.99%.
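
By way of non-limiting illustration, the ratio of genomic material to reaction volume discussed above can be computed as follows. The E. coli genome size of about 4.6 Mb is an assumption supplied for this sketch and is not stated in the disclosure; the 12 nL and 20 nL volumes are the example reaction-environment volumes given herein. The function name is hypothetical.

```python
def template_density_mb_per_nl(genome_size_mb, reaction_volume_nl):
    """Megabases of genomic material per nanoliter of reaction volume."""
    if reaction_volume_nl <= 0:
        raise ValueError("reaction volume must be positive")
    return genome_size_mb / reaction_volume_nl


ECOLI_GENOME_MB = 4.6  # assumed approximate genome size, for illustration

for volume_nl in (12, 20):
    density = template_density_mb_per_nl(ECOLI_GENOME_MB, volume_nl)
    print(f"{volume_nl} nL well: {density:.2f} Mb/nL")
```

Under these assumptions a single bacterial genome in a 12 nL reaction environment falls above the 0.3 megabases-per-nanoliter value recited above, while the same genome in a 20 nL reaction environment falls between the 0.03 and 0.3 values.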

[0029] While substantially unbiased amplification in accordance with some embodiments herein can be useful for many applications, one useful application includes genome sequencing. It is contemplated that the substantially unbiased amplification in accordance with some embodiments herein yields amplification of all or substantially all of the template genome at a coverage level that is useful for sequencing. In some embodiments, the nanoliter-scale reaction environments are configured for amplifying at least about 90% of the entire genome therein with >1x coverage, for example at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, or 99.9%, including ranges between any two of the listed values.

[0030] In some embodiments, unbranched amplicons are produced for use in library construction. As used herein, "substantially all amplicons are unbranched" and the like means that at least about 70% of the amplicons (for example, about 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 99.9%) do not have a branch characteristic of multiple strand displacement, but rather are unbranched double-stranded DNA molecules. Without being limited by any theory, it is noted that MDA products are typically highly branched. In some embodiments, unbranched amplicons can be produced from MDA products by contacting the MDA products with DNA polymerase I.

[0031] A variety of sequencing techniques are known to the skilled artisan, and can be used in accordance with embodiments herein. The selection of a sequencing technique can depend on a variety of factors, for example the size and characteristics of a genome being amplified. As a number of embodiments herein comprise or are compatible with massively parallel amplification and sequencing, sequencing techniques compatible with rapid, large-scale "next-generation" sequencing can be useful in accordance with some embodiments herein. Exemplary sequencing techniques include Illumina™ (Solexa) sequencing (Illumina), Ion Torrent™ sequencing (Life Technologies), SOLiD™ sequencing (Life Technologies), and the like.

Amplification-detection moieties

[0032] In some embodiments, an amplification-detection moiety is used to monitor the progress of amplification. As used herein, "amplification-detection moiety" refers broadly to any of a number of detectable moieties that produce a detectable type or intensity of signal in the presence of amplification product, for example double-stranded nucleic acid, but do not produce the signal (or produce only low-level or background signal) in the absence of amplification product. A first class of amplification-detection moieties includes dyes that bind specifically to double-stranded DNA, for example intercalating agents. These dyes have a relatively low fluorescence when unbound, and a relatively high fluorescence upon binding to double-stranded nucleic acids. As such, dyes that selectively detect double-stranded DNA can be used to monitor the accumulation of double-stranded nucleic acids during an amplification reaction. Examples of dyes that selectively detect double-stranded DNA include, but are not limited to, SYBR™ Green I dye (Molecular Probes), SYBR™ Green II dye (Molecular Probes), SYBR™ Gold dye (Molecular Probes), PicoGreen dye (Molecular Probes), Hoechst 33258 (Hoechst AG), and cyanine dimer families of dyes such as the YOYO family of dyes (e.g. YOYO-1 and YOYO-3), the TOTO family of dyes (e.g. TOTO-1 and TOTO-3), and the like. Other types of amplification-detection moieties employ derivatives of sequence-specific nucleic acid probes. For example, oligonucleotide probes can be labeled with one or more dyes, such that upon hybridization to a template nucleic acid, a detectable change in fluorescence is generated. Exemplary amplification-detection moieties in this class include, but are not limited to, Taqman™ probes, molecular beacons, and the like. While non-specific dyes may be desirable for some applications, sequence-specific probes can provide more accurate measurements of amplification. 
One configuration of sequence-specific probe can include one end of the probe tethered to a fluorophore, and the other end of the probe tethered to a quencher. When the probe is unhybridized, it can maintain a stem-loop configuration, in which the fluorophore is quenched by the quencher, thus preventing the fluorophore from fluorescing. When the probe is hybridized to a template nucleic acid sequence, it is linearized, distancing the fluorophore from the quencher, and thus permitting the fluorophore to fluoresce. Another configuration of sequence-specific probe can include a first probe tethered to a first fluorophore of a FRET pair, and a second probe tethered to a second fluorophore of a FRET pair. The first probe and second probe can be configured to hybridize to sequences of an amplicon that are within sufficient proximity to permit energy transfer by FRET when the first probe and second probe are hybridized to the same amplicon.

[0033] In some embodiments, an amplification-detection moiety is used to quantify the double-stranded DNA in each reaction environment. Accordingly, in some embodiments, the products of reaction environments in which a desired amount of amplification has occurred can be selected for downstream applications such as construction of sequencing libraries. Thus, methods according to some embodiments herein can minimize the use of reagents and other resources by only constructing sequencing libraries for single-cell genomes that were actually amplified, and by reducing a need for preparing redundant libraries as a "back-up" against reaction environments that did not amplify.

[0034] In some embodiments, the sequence-specific probe comprises an oligonucleotide that is complementary to a sequence to be amplified, and is conjugated to a fluorophore. In some embodiments, the probe is conjugated to two or more fluorophores. Examples of fluorophores include: xanthene dyes, e.g., fluorescein and rhodamine dyes, such as fluorescein isothiocyanate (FITC), 2-[ethylamino)-3-(ethylimino)-2-7-dimethyl-3H-xanthen-9-yl]benzoic acid ethyl ester monohydrochloride (R6G) (emits a response radiation in the wavelength that ranges from about 500 to 560 nm), 1,1,3,3,3',3'-Hexamethylindodicarbocyanine iodide (HIDC) (emits a response radiation in the wavelength that ranges from about 600 to 660 nm), 6-carboxyfluorescein (commonly known by the abbreviations FAM and F), 6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 6-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein (JOE or J), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA or T), 6-carboxy-X-rhodamine (ROX or R), 5-carboxyrhodamine-6G (R6G5 or G5), 6-carboxyrhodamine-6G (R6G6 or G6), and rhodamine 110; cyanine dyes, e.g. Cy3, Cy5 and Cy7 dyes; coumarins, e.g., umbelliferone; benzimide dyes, e.g. Hoechst 33258; phenanthridine dyes, e.g. Texas Red; ethidium dyes; acridine dyes; carbazole dyes; phenoxazine dyes; porphyrin dyes; polymethine dyes, e.g. cyanine dyes such as Cy3 (emits a response radiation in the wavelength that ranges from about 540 to 580 nm), Cy5 (emits a response radiation in the wavelength that ranges from about 640 to 680 nm), etc.; BODIPY dyes and quinoline dyes. Specific fluorophores of interest include: Pyrene, Coumarin, Diethylaminocoumarin, FAM, Fluorescein Chlorotriazinyl, Fluorescein, R110, Eosin, JOE, R6G, HIDC, Tetramethylrhodamine, TAMRA, Lissamine, ROX, Napthofluorescein, Texas Red, Cy3, and Cy5, and the like.

[0035] In some embodiments, the sequence-specific probe is conjugated to a quencher. A quencher can absorb electromagnetic radiation and dissipate it as heat, thus remaining dark. Example quenchers include Dabcyl, NFQs, such as BHQ-1 or BHQ-2 (Biosearch), IOWA BLACK FQ (IDT), and IOWA BLACK RQ (IDT). In some embodiments, the quencher is selected to pair with a fluorophore so as to absorb electromagnetic radiation emitted by the fluorophore. Fluorophore/quencher pairs useful in the compositions and methods disclosed herein are well-known in the art, and can be found, e.g., described in S. Marras, "Selection of Fluorophore and Quencher Pairs for Fluorescent Nucleic Acid Hybridization Probes" available at the world wide web site molecular-beacons.org/download/marras,mmb06%28335%293.pdf.

[0036] In some embodiments, a fluorophore is attached to a first end of the sequence-specific probe, and a quencher is attached to a second end of the probe. Attachment can include covalent bonding, and can optionally include at least one linker molecule positioned between the probe and the fluorophore or quencher. In some embodiments, a fluorophore is attached to a 5' end of a probe, and a quencher is attached to a 3' end of a probe. In some embodiments, a fluorophore is attached to a 3' end of a probe, and a quencher is attached to a 5' end of a probe. Examples of probes that can be used in quantitative nucleic acid amplification include molecular beacons, SCORPIONS™ probes (Sigma) and TAQMAN™ probes (Life Technologies).

Substrates

[0037] Substrates comprising a plurality of nanoliter-scale reaction environments can be used in accordance with some embodiments herein.

[0038] In some embodiments, the substrate comprises several loading areas, and a plurality of nanoliter-scale reaction environments in fluid communication with each loading area. In some embodiments, applying to a loading area a solution having the total volume of the nanoliter-scale reaction environments for that loading area, and single genomes (for example single cells, or isolated genomes of single cells) at a dilution of about 0.1 genome per reaction environment, can result in 99% of the reaction environments in that loading area comprising no more than a single genome (or single cell comprising that genome). For example, if each loading area of the substrate comprises 255 microwell reaction environments, each having a diameter of about 400 μm and a depth of about 100 μm (for a volume of about 12 nl), then after applying 3 μl of a solution comprising 0.1 cells per microwell (e.g. 26 cells), about 99.5% of the microwells will comprise no more than one cell. It is noted that this number was confirmed via SEM microscopy (see Fig. 1B).
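
By way of non-limiting illustration, the loading figures in paragraph [0038] can be checked with a short calculation. The sketch below assumes that cells distribute among the microwells according to a Poisson distribution with a mean of 0.1 cells per well; the function and variable names are illustrative only and do not appear in the original disclosure.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a well, given the mean cells per well."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_at_most_one(mean: float) -> float:
    """Probability that a well holds zero or one cell."""
    return poisson_pmf(0, mean) + poisson_pmf(1, mean)


# Values taken from the worked example in paragraph [0038].
wells_per_loading_area = 255
well_volume_nl = 12.0
cells_per_well = 0.1

# Total well volume of one loading area, in microliters (about 3).
load_volume_ul = wells_per_loading_area * well_volume_nl / 1000.0
# Number of cells to apply to one loading area (25.5, i.e. about 26).
cells_to_load = wells_per_loading_area * cells_per_well
# Fraction of wells expected to hold no more than one cell (about 0.995).
fraction_single_or_empty = prob_at_most_one(cells_per_well)

print(f"{load_volume_ul:.2f} ul, {cells_to_load:.1f} cells, "
      f"{100 * fraction_single_or_empty:.1f}% of wells with <= 1 cell")
```

Under the stated Poisson assumption, the calculation reproduces the approximately 3 μl, 26 cells, and 99.5% figures given above.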

[0039] An exemplary substrate 10 in accordance with some embodiments herein is schematically illustrated in Fig. 1A. The substrate can comprise several loading areas 12, which are not in fluid communication with each other. In some embodiments, the substrate comprises at least 3 loading areas, for example, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, or 500 loading areas, including ranges between any two of the listed values. In some embodiments, each loading area is configured to be loaded directly by a pipette without any intervening fluidic channels (e.g. microfluidic or nanofluidic channels). The pipette can be manually operated or automatically operated. Each loading area 12 can comprise, or can be in fluid communication with, a plurality of nanoliter-scale reaction environments 14, for example microwells. The number of nanoliter-scale reaction environments can be useful for increasing the likelihood that each reaction environment comprises no more than one genome (or single cell comprising a genome). In some embodiments, each loading area 12 comprises at least about 100 nanoliter-scale reaction environments, for example about 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 3000, 4000, or 5000 nanoliter-scale reaction environments, including ranges between any two of the listed values. In some embodiments, each nanoliter-scale reaction environment 14 has a volume of no more than 30 nanoliters, for example about 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 nanoliters, including ranges between any two of the listed values. In some embodiments, each nanoliter-scale reaction environment 14 has a volume of no more than 20 nanoliters. In some embodiments, each nanoliter-scale reaction environment 14 has a volume of no more than 12 nanoliters. In some embodiments, each nanoliter-scale reaction environment 14 has a volume of about 20 nl. In some embodiments, each nanoliter-scale reaction environment 14 has a volume of about 12 nl. In some embodiments, each nanoliter-scale reaction environment has a diameter-to-depth ratio of about 4:1, for example about 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, or 8:1. For example, a round nanoliter-scale reaction environment having a diameter of about 400 μm and a depth of about 100 μm would have a volume of about 12 nl.
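
As a non-limiting illustration of the geometry in paragraph [0039], the volume of a round reaction environment can be estimated by treating it as an ideal cylinder. That idealization, and the function name `well_volume_nl`, are assumptions made for this sketch rather than part of the original disclosure.

```python
import math


def well_volume_nl(diameter_um: float, depth_um: float) -> float:
    """Volume in nanoliters of a cylindrical well with the given dimensions."""
    radius_um = diameter_um / 2.0
    volume_um3 = math.pi * radius_um ** 2 * depth_um
    # 1 nanoliter equals 1e6 cubic micrometers.
    return volume_um3 / 1e6


# A 400 um x 100 um well (diameter-to-depth ratio of 4:1).
print(f"{well_volume_nl(400.0, 100.0):.1f} nl")
```

For a diameter of 400 μm and a depth of 100 μm this gives about 12.6 nl, in line with the stated volume of about 12 nl.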

[0040] It is recognized that each loading area can be loaded with a separate sample, so that multiple samples can be amplified on the same substrate in parallel (one sample in each loading area). Accordingly, in some embodiments, the number of samples being amplified in parallel can readily be scaled up or down. For example, if the number of samples is less than or equal to the total number of loading areas on the substrate, the appropriate number of loading areas can be selected for parallel reactions. If the number of samples is greater than the total number of loading areas on the substrate, two or more substrates can be used to accommodate the total number of samples.

[0041] In some embodiments, the substrate 100 comprises 16 loading areas 12, and each loading area 12 comprises 255 nanoliter-scale reaction environments 14. Each nanoliter-scale reaction environment 14 can have a diameter of about 400 μm and a depth of about 100 μm, for a volume of about 12 nl. The substrate can comprise PDMS. Each loading area can have a height of about 7 mm and a width of about 7 mm. The loading areas can be arranged in a pattern on the substrate.

[0042] In some embodiments, the substrate further includes a detector for amplification-detection moieties. The detector need not be attached to the substrate. For example, the substrate can be positioned in optical communication with a fluorescent microscope, and optionally a camera. In accordance with some embodiments herein, an amplification-detection moiety can be present in the nanoliter-scale reaction environments, and can indicate when a desired amount of amplification of nucleic acid has occurred in a particular nanoliter-scale reaction environment. Accordingly, in some embodiments, the detector is configured to detect nanoliter-scale reaction environments in which a desired amount of amplification has occurred. In some embodiments, a manual user can select one or more nanoliter-scale reaction environments for downstream applications such as library construction based on the signal detected by the detector. In some embodiments, one or more nanoliter-scale reaction environments are automatically selected for downstream applications such as for library construction based on the amount of signal detected by the detector.

[0043] In some embodiments, the substrate further comprises a pipettor for withdrawing amplified nucleic acid from a selected nanoliter-scale reaction environment. The pipettor can be configured to withdraw nanoliter-scale volumes or less from the selected well. In some embodiments, the pipettor comprises a pipette having a diameter less than the diameter of the nanoliter-scale reaction environment. In some embodiments, the pipette has a diameter of no more than about 50 μm, for example about 50, 45, 40, 35, 30, 25, 20, 15, 10, or 5 μm, including ranges between any two of the listed values. In some embodiments, the pipette has a diameter of about 30 μm. In some embodiments, the pipette is a glass pipette. The pipette can be sterile. In some embodiments, the pipettor is under the mechanical control of a manual micromanipulator so that a user can manually select a nanoliter-scale reaction environment of interest for withdrawing liquid, for example amplified nucleic acid. In some embodiments, the pipettor is under the mechanical control of an automatic micromanipulator in data communication with a detector as described herein, so that the pipettor can automatically withdraw liquid from a nanoliter-scale reaction environment exhibiting a desired level of amplification.

[0044] In some embodiments, the genome of microbial and/or human cells is sequenced. Some embodiments include assembly of genomes of single bacterial cells with very little sequencing effort. Some embodiments include calling copy number variations on single human neurons down to a 1-2 megabase resolution.

[0045] Methods and manufactures in accordance with some embodiments herein can be useful for one or more of: de novo assembly of unculturable bacteria in the human gut; de novo assembly of unculturable bacteria in heterogeneous environments such as sea water; copy number variation calling on single neurons; copy number variation calling on single cancerous cells or circulating tumor cells; and haplotyping, for example human haplotyping.

[0046] In some embodiments, the genome of a single cell is amplified. In some embodiments, the cell is a human cell. In some embodiments, the cell is a microbial cell. In some embodiments, the cell is a bacterial cell. In some embodiments, the cell is from a substantially unculturable strain. As used herein, "substantially unculturable" and variations thereof refer to a strain for which, when cultured under normal laboratory conditions, fewer than 20% of replicates of that strain will reach a logarithmic growth phase, for example fewer than 20%, 15%, 10%, 5%, 2%, 1%, or 0.1%.

[0047] For previous techniques, a major technical challenge was the highly uneven amplification of the one or two copies of each chromosome in a single cell. This high amplification bias leads to difficulties in assembling microbial genomes de novo and inaccurate identification of copy number variants (CNV) or heterozygous single nucleotide changes in single mammalian cells. Recent developments of bias-tolerant algorithms have greatly mitigated the effects of uneven read depth on de novo genome assembly and CNV calling, yet an unusually high sequencing depth is still required, making this approach impractical for organisms with large genome sizes.

[0048] Several strategies have been previously developed to reduce amplification bias, including reduced reaction volume ¹³, ¹⁴ and supplementing amplification reactions with single-strand binding proteins or trehalose ^{15, 16}. Post-amplification normalization by digesting highly abundant sequences with a duplex-specific nuclease has also been utilized to markedly reduce bias ¹⁷. Despite these efforts, amplification bias still remains the primary technical challenge in single-cell genome sequencing. Using cells that contain multiple copies of the genome or multiple clonal cells has been the only viable solution to achieve near complete genome coverage with MDA ^{18, 19}. Without being bound by any particular theory, we reasoned that amplification is always bias-prone, and that by limiting the amplification magnitude to "just-enough" for sequencing in accordance with some embodiments herein, we could potentially reduce the bias. In addition, we supposed that reducing the reaction volume by ~1000-fold to nanoliter level and thus dramatically increasing the effective concentration of the template genome might reduce contamination and improve primer annealing and hence amplification uniformity ¹³, ¹⁴. To these ends, we developed the microwell displacement amplification system (MIDAS) in accordance with some embodiments herein, a microwell-based platform that allows for highly parallel polymerase cloning of single cells in thousands of nanoliter reactors of 12 nL in volume, the smallest volume that has been implemented to date to the best knowledge of Applicants. Coupled with a low-input library construction method, we achieved highly uniform coverage in the genomes of both microbial and mammalian cells. We demonstrated substantial improvement both in de novo genome assembly from single microbial cells and in the ability to detect small somatic copy number variants in individual human adult neurons with minimal sequencing effort.

[0049] Due to the extreme bias that can be caused by whole genome amplification from a single DNA molecule, genomic analysis of single cells has traditionally been a challenging task. Traditionally, a large amount of sequencing resources can be required to produce a draft quality genome assembly or determine a low-resolution copy number variation profile due to amplification bias and coverage dropout. MIDAS in accordance with some embodiments herein addresses this issue through the use of nanoliter-scale volumes to generate nanogram-level amplicons and the use of a low-input transposon-based library construction method. Compared to the traditional single-cell library construction and sequencing protocol, MIDAS in accordance with some embodiments herein provides a more uniform, higher-coverage, and lower-cost way to analyze single cells from a heterogeneous population.

[0050] MIDAS was applied to single E. coli cells and resolved nearly the entire genome with relatively low sequencing depth. Additionally, using de novo assembly on MIDAS libraries, over 90 percent of the genome was assembled. Thus, in some embodiments, MIDAS is applied to an uncultivated organism to provide a draft quality assembly with more genes covered and less sequencing resource expenditure. Currently, a majority of unculturable bacteria are analyzed metagenomically as part of a mixed population rather than individually. Although metagenomics allows for the discovery of novel genes, individual sequences cannot be resolved. The biased nature of traditional MDA-based methods when applied to single cells has made single-cell microbial analysis challenging in terms of de novo genome assembly. Despite recent success in analyzing partially assembled single-cell genomes, the full potential of single-cell genomics remains to be explored. As such, in some embodiments, through the use of MIDAS on heterogeneous environmental samples, novel single-cell organisms and genes can be easily discovered and characterized in a low-cost and high-throughput manner, allowing a much higher-resolution and more complete analysis of single bacterial cells.

[0051] In some embodiments, MIDAS is applied to the analysis of copy number variation in single human neuronal nuclei. With a low amount of sequencing effort, MIDAS was able to systematically call single copy number changes of 2 million base pairs or larger in size. It has been shown recently that, in human adult brains, post-mitotic neurons in different brain regions exhibited various levels of DNA content variation (DCV) ²⁹. The exact genomic regions that associate with DCV have been difficult to map to single neurons because of the amplification bias with existing MDA-based methods. CNVs in single tumor cells have been successfully characterized with a PCR-based whole genome amplification method. However, tumor cells tend to be highly aneuploid and exhibit copy number changes of larger magnitude, which are more easily detected. The applicability of this strategy to other primary cell types with more subtle CNV events remains unclear. We have demonstrated that MIDAS greatly reduces the variability of single cell analysis to a level such that a small single-copy change is detectable, allowing characterization of much more subtle copy number variation. MIDAS can be used to simultaneously probe into the individual genomes of many cells from patients with neurological diseases, and thus will allow identification of a range of structural genomic variants and eventually allow accurate determination of the influence of somatic CNVs on brain disorders in a high-throughput manner.

[0052] In some embodiments, MIDAS compares very favorably to traditional MDA-based methods. Recently, another single cell sequencing method that dramatically reduces amplification bias and increases genomic coverage was reported. Known as MALBAC, this method incorporates a novel enzymatic strategy to amplify single DNA molecules initially through quasi-linear amplification to a limited magnitude prior to exponential amplification and library construction ³². MALBAC was implemented in microliter reactions in conventional reaction tubes. In contrast, MIDAS represents an orthogonal strategy by adapting MDA to a microwell platform. MIDAS can therefore more easily analyze a larger number of single cells in parallel in a single experiment. While both MIDAS and MALBAC show relatively unbiased amplification across the genome (Figs. 9A-9B), MIDAS in accordance with some embodiments herein shows less variability in coverage distribution, making it more suitable for CNV calling with less sequencing effort. Additionally, unlike MIDAS, MALBAC has not been demonstrated on femtogram-level DNA inputs, which is required for genome sequencing of single microbial cells. Finally, the error rate of MALBAC is roughly 100-fold higher than that of MDA due to the difference in DNA polymerases used.

[0053] MIDAS can provide researchers with a powerful tool for many other applications, including high-coverage end-to-end haplotyping of mammalian genomes or probing de novo CNV events at the single-cell level during the induction of pluripotency or stem cell differentiation ³³. MIDAS can allow for efficient high-throughput sequencing of a variety of organisms at a relatively low price. This new technology should help propel single cell genomics, enhance our ability to identify diversity in multicellular organisms, and lead to the discovery of thousands of new organisms in various environments.

[0054] With reference to Examples 1-5, the following methods were used. The skilled artisan will appreciate that the following methods can readily be used or adapted or modified in accordance with some embodiments herein:

Microwell Array Fabrication

[0055] Microwell arrays were fabricated from polydimethylsiloxane (PDMS). Each array was 7 mm x 7 mm, with 2 rows of 8 arrays per slide and 156 microwells per array. The individual microwells were 400 μm in diameter and 100 μm deep (~12 nL volume), and were arranged in honeycomb patterns in order to minimize space in between the wells. To fabricate the arrays, first, an SU-8 mold was created using soft lithography at the Nano3 facility at UC San Diego. Next, a 10:1 ratio of polymer to curing agent mixture of PDMS was poured over the mold. Finally, the PDMS was degassed and cured for 3 hours at 65 °C.

Bacteria and Neuron Preparation

[0056] E. coli K12 MG1655 was cultured overnight, collected in log-phase, and washed 3x in PBS. After quantification, the solution was diluted to 10 cells/μL. Human neuronal nuclei were isolated as previously described and fixed in ice-cold 70% ethanol. Nuclei were labeled with a monoclonal mouse antibody against NeuN (1:100 dilution) (Chemicon, Temecula, CA) and an AlexaFluor 488 goat anti-mouse IgG secondary antibody (1:500 dilution) (Life Technologies, San Diego, CA). Nuclei were counterstained with propidium iodide (50 μg/ml) (Sigma, St. Louis, MO) in PBS solution containing 50 μg/ml RNase A (Sigma) and chick erythrocyte nuclei (Biosure, Grass Valley, CA). Nuclei in the G1/G0 cell cycle peak, determined by propidium iodide fluorescence, were electronically gated on a Becton Dickinson FACS-Aria II (BD Biosciences, San Jose, CA) and selectively collected based on NeuN+ immunoreactivity.

Cell Seeding, Lysis, and Multiple Displacement Amplification

[0057] All reagents not containing DNA or enzymes were first exposed to ultraviolet light for 10 minutes prior to use. The PDMS slides were treated with oxygen plasma to make them hydrophilic and ensure random cell seeding. The slides were then treated with 1% bovine serum albumin (BSA) (EMD Chemicals, Billerica, MA) in phosphate buffered saline (PBS) (Gibco, Grand Island, NY) for 30 minutes and washed 3x with PBS to prevent DNA from sticking to the PDMS. The slides were completely dried in a vacuum prior to cell seeding. Cells were diluted to a concentration of 10 cells/μL, and 3 μL of cell dilution was added to each array (30 cells total per array).

[0058] Initially, to verify that cell seeding adhered to the Poisson distribution, cells were stained with 1x SYBR green and viewed under a fluorescent microscope. Proper cell distribution was further confirmed with SEM imaging. For SEM imaging, chromium was sputtered onto the seeded cells for 6 seconds to increase conductivity. Note that the imaging of cell seeding was only used to confirm the theoretical Poisson distribution and was not performed during actual amplification and sequencing experiments due to the potential introduction of contamination.

[0059] After seeding, cells were left to settle into the wells for 10 minutes. The seeded cells were then lysed either with 300 U ReadyLyse lysozyme at 100 U/μL (Epicentre, Madison, WI) and incubation at room temperature for 10 minutes, or with five 1-minute freeze/thaw cycles using a dry ice brick and room temperature in a laminar flow hood. After lysis, 4.5 μL of alkaline lysis (ALS) buffer (400 mM KOH, 100 mM DTT, 10 mM EDTA) was added to each array and incubated on ice for 10 minutes. Then, 4.5 μL of neutralizing (NS) buffer (666 mM Tris-HCl, 250 mM HCl) was added to each array. 11.2 μL of MDA master mix (1x buffer, 0.2x SYBR green I, 1 mM dNTPs, 50 μM thiolated random hexamer primer, 8 U phi29 polymerase; Epicentre, Madison, WI) was added and the arrays were then covered with mineral oil. The slides were then transferred to the microscope stage enclosed in a custom temperature and humidity controlled incubator set to 30 °C. Images were taken at 30-minute intervals for 10 hours using a 488 nm filter.

Image Analysis

[0060] Images were analyzed with a custom Matlab script to subtract background fluorescence. Because SYBR Green was added to the MDA master mix, fluorescence under a 488 nm filter was expected to increase over time for positive amplifications. If a digital profile of fluorescent wells with increasing fluorescence over time was observed (approximately 10-20 wells per array), the array was kept. If no wells fluoresced, amplification was considered to have failed and further experiments were stopped. Alternatively, if a majority of the wells fluoresced, the array was considered to be contaminated and subsequent analysis was similarly stopped. If 2 abutting wells fluoresced, neither was extracted due to the higher likelihood of more than one cell existing in each well (as in this case, seeding was potentially non-uniform).
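
The decision rules of paragraph [0060] can be sketched, by way of non-limiting illustration, as follows. This is not the custom Matlab script referred to above. The well indices, the `min_gain` threshold, and the `neighbors` adjacency map are hypothetical inputs assumed for this sketch, and fluorescence traces are assumed to be already background-subtracted.

```python
from typing import Dict, List, Sequence, Set


def is_positive(trace: Sequence[float], min_gain: float) -> bool:
    """Treat a well as positive if its fluorescence rose by at least min_gain
    between the first and last image (min_gain is a hypothetical threshold)."""
    return len(trace) >= 2 and (trace[-1] - trace[0]) >= min_gain


def classify_array(positive: Set[int], total_wells: int) -> str:
    """Apply the keep / failed / contaminated rules to one array."""
    if not positive:
        return "failed"        # no wells fluoresced
    if len(positive) > total_wells / 2:
        return "contaminated"  # a majority of the wells fluoresced
    return "keep"              # sparse, digital profile of positive wells


def extractable_wells(positive: Set[int],
                      neighbors: Dict[int, Set[int]]) -> List[int]:
    """Return positive wells that do not abut another positive well."""
    return sorted(w for w in positive
                  if not (neighbors.get(w, set()) & positive))


# Hypothetical usage: wells 1 and 2 abut each other, well 9 stands alone.
positive_wells = {1, 2, 9}
adjacency = {1: {2}, 2: {1, 3}, 9: {8, 10}}
print(classify_array(positive_wells, 156))           # keep
print(extractable_wells(positive_wells, adjacency))  # [9]
```

In practice the adjacency map would be derived from the honeycomb layout of the array, and the threshold would be tuned to the imaging setup.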

Amplicon Extraction

[0061] 1 mm outer diameter glass pipettes (Sutter, Novato, CA) were pulled to ~30 μm diameters, bent to a 45-degree angle under heat, coated with SigmaCote (Sigma, St. Louis, MO), and washed 3 times with dH₂O.

[0062] Wells with positive amplification were identified using the custom Matlab script described above. A digital micromanipulation system (Sutter, Novato, CA) was used for amplicon extraction. The glass pipette was loaded into the micromanipulator and moved over the well of interest. The microscope filter was switched to bright field and the pipette was lowered into the well. Negative pressure was slowly applied, and the well contents were visualized proceeding into the pipette. The filter was then switched back to 488 nm to ensure the well was no longer fluorescent. Amplicons were deposited in 1 μL dH₂O.

Amplicon Quantification

[0063] For quantification of microwell amplification, 0.5 μL of amplicon was amplified a second time using MDA in a 20 μL PCR tube reaction (1x buffer, 0.2x SYBR green I, 1 mM dNTPs, 50 μM thiolated random hexamer primer, 8 U phi29 polymerase). After purification using Ampure XP beads (Beckman Coulter, Brea, CA), the 2nd-round amplicon was quantified using a Nanodrop spectrophotometer. The 2nd-round amplicon was then diluted to 1 ng, 100 pg, 10 pg, 1 pg, and 100 fg to create an amplicon ladder. Subsequently, the remaining 0.5 μL of the 1st-round amplicon was amplified using MDA along with the amplicon ladder in a quantitative PCR machine. The samples were allowed to amplify to completion, and the time required for each to reach 0.5x of the maximum fluorescence was extracted. The original amplicon concentration could then be interpolated.

Low-input library construction
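
With respect to the amplicon quantification of paragraph [0063], one way the interpolation could be performed is sketched below, by way of non-limiting illustration. The sketch assumes that the time to reach half-maximal fluorescence varies linearly with the base-10 logarithm of the input mass, which is an assumption of this example rather than a statement of the original method. The ladder times are hypothetical values chosen for illustration, not measured data.

```python
import math
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolate_mass_pg(ladder_pg: Sequence[float],
                        ladder_times_min: Sequence[float],
                        sample_time_min: float) -> float:
    """Estimate input mass (pg) from the time to half-maximal fluorescence."""
    log_mass = [math.log10(m) for m in ladder_pg]
    slope, intercept = fit_line(log_mass, ladder_times_min)
    return 10 ** ((sample_time_min - intercept) / slope)


# Ladder from paragraph [0063]: 1 ng, 100 pg, 10 pg, 1 pg, 100 fg (in pg).
ladder_pg = [1000.0, 100.0, 10.0, 1.0, 0.1]
# Hypothetical times (minutes) to half-maximal fluorescence.
ladder_times_min = [60.0, 80.0, 100.0, 120.0, 140.0]

estimate = interpolate_mass_pg(ladder_pg, ladder_times_min, 90.0)
print(f"estimated input: {estimate:.1f} pg")
```

With these illustrative values, a sample reaching half-maximum at 90 minutes falls midway between the 100 pg and 10 pg ladder points on the log scale.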

[0064] 1.5 μL of ALS buffer was added to the extracted amplicons to denature the DNA, followed by a 3-minute incubation at room temperature. 1.5 μL of NS buffer was added on ice to neutralize the solution. 10 U of DNA Polymerase I (Invitrogen, Carlsbad, CA) was added to the denatured amplicons along with 250 nanograms of unmodified random hexamer primer, 1 mM dNTPs, 1x Ampligase buffer (Epicentre, Madison, WI), and 1x NEB buffer 2 (NEB, Cambridge, MA). The solution was incubated at 37 °C for 1 hour, allowing second strand synthesis. 1 U of Ampligase was added to seal nicks and the reaction was incubated first at 37 °C for 10 minutes and then at 65 °C for 10 minutes. The reaction was cleaned using standard ethanol precipitation and eluted in 4 μL water.

[0065] Nextera transposase enzymes (Epicentre, Madison, WI) were diluted 100-fold in 1x TE buffer and glycerol. 10 μL transposase reactions were then conducted on the eluted amplicons after addition of 1 μL of the diluted enzymes and 1x tagment DNA buffer. The reactions were incubated for 5 minutes at 55 °C for mammalian cells and 1 minute at 55 °C for bacterial cells. 0.05 U of protease (Qiagen, Hilden, Germany) was added to each sample to inactivate the transposase enzymes; the protease reactions were incubated at 50 °C for 10 minutes followed by 65 °C for 20 minutes. 5 U Exo minus Klenow (Epicentre, Madison, WI) and 1 mM dNTPs were added and incubated at 37 °C for 15 minutes followed by 65 °C for 20 minutes. Two-stage quantitative PCR was performed using 1x KAPA Robust 2G master mix (Kapa Biosystems, Woburn, MA), 10 μM Adapter 1, and 10 μM barcoded Adapter 2 in the first stage, and 1x KAPA Robust 2G master mix, 10 μM Illumina primer 1, 10 μM Illumina primer 2, and 0.4x SYBR Green I in the second stage, and the reaction was stopped before amplification curves reached their plateaus. The reactions were then cleaned up using Ampure XP beads in a 1:1 ratio. A 6% PAGE gel verified successful tagmentation reactions.

Mapping and De novo Assembly of Bacterial Genomes

[0066] Bacterial libraries were size selected into the 300-600 bp range and sequenced in an Illumina Genome Analyzer IIx, Illumina HiSeq, or Illumina MiSeq using 100 bp paired end reads. E. coli data was both mapped to the reference genome and de novo assembled. For the mapping analysis, libraries were mapped as single end reads to the reference E. coli K12 MG1655 genome using default Bowtie parameters. Contamination was analyzed, and clonal reads were removed using SAMtools' rmdup function. For the de novo assembly, paired end reads with a combined length less than 200 bp were first joined and treated as single end reads. All remaining paired end reads and newly generated single end reads were then quality trimmed. De novo assembly was performed using SPAdes v. 2.4.0. Corrected reads were assembled with k-mer values of 21, 33, and 55. The assembled scaffolds were mapped to the NCBI nt database with BLAST, and the organism distribution was visualized using MEGAN ³⁵. Obvious contaminants (e.g., human) were removed from the assembly and the assembly was analyzed using QUAST ³⁶. The remaining contigs were annotated using RAST ³⁷ and KAAS ³⁸.

Example 1: MIDAS implements massively parallel polymerase cloning in microwells.

[0067] To implement "just-enough" amplification and thus limit the effects of the exponential amplification bias from MDA in a highly parallel manner, we designed and fabricated microwell arrays of a size comparable to standard microscope slides. The format of the microwell arrays, including well size, pattern, and spacing, was optimized to achieve efficient cell loading, optimal amplification yield, and convenient DNA extraction. Each slide contained 16 arrays each containing 156 microwells of 400 μm in diameter, allowing for parallel amplification of 16 separate heterogeneous cell populations (Fig. 1A). All liquid handling procedures (cell seeding, lysis, DNA denaturation, neutralization and addition of amplification master mix) required only a single pipette pump per step per array, greatly reducing the labor required for hundreds of amplification reactions. The reagent cost is 1000-fold less than conventional methods, as each microwell is 12 nL in volume. In order to ensure that each reactor would contain only one single cell, we under-loaded the microwells at a density of roughly 1 cell per 10 wells, ensuring that no more than 0.5% of the wells would contain more than 1 cell. The remaining empty wells served as internal negative controls, allowing easy detection and elimination of contaminated samples. Proper microbial cell seeding in microwells was confirmed by scanning electron microscopy (Fig. 1B).

[0068] After seeding of cell populations into each microwell array, we performed limited Multiple Displacement Amplification (MDA) on the seeded single cells at a reaction volume of ~12 nL in a temperature and humidity controlled chamber (Fig. 1C). We utilized SYBR Green I to visualize the amplicons growing in real-time using an epifluorescent microscope (Fig. 5). A random distribution of amplicons across the arrays was observed with approximately 10% of the wells containing amplicons, further confirming the parallel and localized amplification within individual microwells as well as the stochastic seeding of single cells ⁰. Exogenous contamination was easily detectable as a uniform increase of fluorescent signal across all microwells, allowing easy removal of contaminated samples. After amplification in the microwells, we employed a micromanipulation system to extract amplicons from individual wells for sequencing (Fig. 1C). Fluorescent monitoring during this procedure ensured that only single wells were extracted for analysis (Figs. 6A-6B). Using real-time MDA ¹, we estimated that the extracted amplicon masses ranged from 500 picograms to 3 nanograms.
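
As a non-limiting consistency check on the observation that approximately 10% of the wells contained amplicons, the expected fraction of occupied wells can be estimated under the assumption that seeding follows a Poisson distribution with a mean of 0.1 cells per well (roughly 1 cell per 10 wells). The function names below are illustrative only.

```python
import math


def fraction_occupied(cells_per_well: float) -> float:
    """Expected fraction of wells holding at least one cell."""
    return 1.0 - math.exp(-cells_per_well)


def fraction_multiple(cells_per_well: float) -> float:
    """Expected fraction of wells holding more than one cell."""
    return 1.0 - math.exp(-cells_per_well) * (1.0 + cells_per_well)


wells_per_array = 156
density = 0.1  # roughly 1 cell per 10 wells

occupied = fraction_occupied(density)
print(f"occupied wells: {100 * occupied:.1f}% "
      f"(about {occupied * wells_per_array:.0f} of {wells_per_array})")
print(f"wells with >1 cell: {100 * fraction_multiple(density):.2f}%")
```

Under this assumption, about 9.5% of wells (roughly 15 per 156-well array) are expected to be occupied, and fewer than 0.5% are expected to hold more than one cell.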

[0069] To construct Illumina sequencing libraries from the nanogram-scale DNA amplicons, we used a modified method based on the Nextera Tn5 transposase library construction kit. Previous studies have shown that Nextera transposase-based libraries can be prepared using as little as 10 picograms of genomic DNA ¹. However, the standard Nextera protocol was unable to generate high-complexity libraries from MDA amplicons, resulting in poor genomic coverage (data not shown). To address this issue, we used random hexamers and DNA Polymerase I to first convert the hyperbranched amplicons into unbranched double-stranded DNA molecules, which allowed effective library construction using the Nextera™ in vitro transposition method (Fig. 1D). We additionally used a small reaction volume to further increase the efficiency of the Nextera library construction.

[0070] Thus, a sequencing library was constructed using products of substantially unbiased amplification in accordance with some embodiments herein.

Example 2: MIDAS efficiently generates a near-complete genome assembly from single E. coli cells.

[0071] As a proof of concept, we utilized MIDAS on three single MG1655 E. coli cells and analyzed between approximately 2 and 8 million paired-end Illumina sequencing reads of 100 bp in length for each, which is equivalent to a genomic coverage of between 87x and 364x. We first mapped the reads to the reference E. coli genome and were able to recover between 94% and 99% of the genome at >1x coverage. We then performed de novo genome assembly using SPAdes. We were able to assemble between 88% and 94% of the E. coli genome (Fig. 2), with an N50 contig size of 2,654 - 27,882 bp and a max contig length of 18,465 - 132,037 bp. More than 80% of the assembled bases were mapped to E. coli, with the remainder resulting from common MDA contaminants such as Delftia and Acidovorax (Fig. 7, Table 1). We annotated the genome using the RAST and KAAS annotation servers. Over 96% of E. coli genes were either partially or fully covered in the assembly. Major biosynthetic pathways, including glycolysis and the citric acid cycle, were also present. Furthermore, pathways for amino acid synthesis and tRNA development were covered. MIDAS was thus able to assemble an extremely large portion of the E. coli genome from a single cell with very minimal sequencing.
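
The relationship between read count and genomic coverage quoted above can be sketched as follows. This is an illustrative calculation only: it assumes that the stated read counts refer to read pairs (two 100 bp reads per fragment) and that the E. coli K-12 MG1655 reference genome is 4,641,652 bp long; neither value is given in the description, and the function name `fold_coverage` is hypothetical.

```python
# Assumption for this sketch: length of the E. coli K-12 MG1655 reference genome.
E_COLI_MG1655_GENOME_BP = 4_641_652


def fold_coverage(n_read_pairs: int, read_length_bp: int,
                  genome_size_bp: int, reads_per_pair: int = 2) -> float:
    """Average fold coverage = total sequenced bases / genome size."""
    total_bases = n_read_pairs * reads_per_pair * read_length_bp
    return total_bases / genome_size_bp


# Round numbers from the text ("approximately 2 - 8 million" reads of 100 bp).
for pairs in (2_000_000, 8_000_000):
    cov = fold_coverage(pairs, 100, E_COLI_MG1655_GENOME_BP)
    print(f"{pairs:,} read pairs -> about {cov:.0f}x coverage")
```

With these round inputs the sketch gives values slightly below the reported 87x and 364x, which is what would be expected if the actual read counts were somewhat above 2 and 8 million.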

[0072] As a control, we also amplified and sequenced one E. coli cell using the conventional in-tube MDA method, and controlled the reaction time to limit the amplification yield to the nanogram level. A fraction of the control amplicon was further amplified in a second reaction to the microgram level. The two control amplicons were converted into sequencing libraries using the traditional shearing and ligation method. We found that limiting amplification yield resulted in a reduction of amplification bias even for in-tube amplification. However, MIDAS had a markedly reduced level of amplification bias when compared with either control reaction (Figs. 3A-3D). MIDAS was also able to recover a much larger fraction of the genome than the traditional MDA-based method. In fact, when compared with the most complete previously published single E. coli genome data set, MIDAS was able to recover 50% more of the E. coli genome than the traditional MDA-based method with 3 to 13-fold less sequencing effort (~90-400x vs. ~1200x). This result demonstrates that MIDAS provides a much more efficient and cost-effective way to assemble whole bacterial genomes from single cells without culture.

Example 3: MIDAS can identify small copy number variation in single human adult neurons.

[0073] Given the highly uniform genome coverage achieved by microwell-based polymerase cloning, we next applied MIDAS to the characterization of copy number variation in single mammalian cells. The higher cognitive function of the human brain is supported by a complex network of neurons and glia. It has been long thought that all cells in a human brain share the same genome. Without being bound by any particular theory, recent evidence suggests that individual neurons could have non-identical genomes due to aneuploidy ^{~ '6}, active retrotransposons ²⁸ and other DNA content variation ²⁹. However, the presence of somatic genetic variation in individual neurons has never been conclusively demonstrated at the single genome scale. To demonstrate the viability of MIDAS as a platform for investigating copy number variation in single primary human neurons, we prepared nuclei from one post-mortem brain sample from a healthy female donor and a second post-mortem brain sample from a female individual with Down Syndrome. We purified cortical neuronal nuclei by flow sorting based on neuron-specific NeuN antibody staining. Five sequencing libraries (two disease-free, three Down Syndrome) were generated from individual nuclei using MIDAS, and the resulting sequencing data were analyzed using an SNS method based on circular binary segmentation ⁰. We similarly observed a dramatic reduction of amplification bias in the MIDAS libraries when compared to the conventional in-tube MDA-based method (Figs. 3C-D).

[0074] We next sought to characterize the sensitivity of detecting single copy-number changes. While it was not possible, even with aggressive binning into large genomic regions, to distinguish true copy number differences from random amplification bias in the conventional single-cell MDA library, the uniform genome coverage in the MIDAS libraries allowed clear detection of Trisomy 21 in each of the Down Syndrome nuclei (Fig. 4A). Rigorous validation of single-cell sequencing methods has been extremely challenging, mainly due to the fact that any single cell analyzed might carry additional genomic differences from the bulk cell population. Hence, there is no reference genome that single cell data can be compared to. In order to determine the CNV detection limit of MIDAS, we computationally transplanted data from random 1 or 2 Mb regions of either chromosome 21 (to simulate the gain of a single copy, the smallest possible copy number change) or chromosome 4 (as a negative control) from Down Syndrome nuclei into 100 other random genomic locations (Table 2). This computational approach, similar to a strategy previously used for assessing sequencing errors ³¹, provided us a list of reference CNV events at various sizes for benchmarking without affecting the inherent technical noise in the data sets. We identified 68/100 (68%) of 1 Mb T21 insertions and 98/100 (98%) of 2 Mb T21 insertions, indicating that MIDAS is able to call copy number events at the megabase scale with high sensitivity (Fig. 4B, Table 2). As expected, the insertion of diploid chromosome 4 regions did not generate any copy number calls. When the same simulation was performed with data from traditional in-tube MDA libraries, no T21 insertions were detected, indicating that at this level of sequencing depth, traditional MDA-based methods are unable to call small CNVs (Figs. 8A-B). We then performed CNV calling using the parameters calibrated by the T21 transplantation simulation. MIDAS additionally called 4-17 copy number events in each neuron (Table 3). Only 2/62 called CNV events were larger than 2 Mb, and 5/62 larger than 1 Mb. It remains unclear whether the remaining events represent true copy number changes or whether they are false positives due to the small size of most of the calls. However, five smaller CNV events were consistently called in two different nuclei from the healthy donor, and one additional CNV event on chromosome 10 was called in two nuclei from the Down Syndrome patient, suggesting that they are germ-line CNVs. Based on the T21 computational transplantation results, it appears that the five human neurons contain an average of 1 region each with 1 copy number gain at the megabase scale.

[0075] Thus, substantially unbiased amplification in accordance with some embodiments herein can sensitively detect changes in copy number of portions of a genome.

Example 4: Identification of CNVs in MIDAS and MDA data

[0076] Mammalian single-cell libraries were sequenced in an Illumina Genome Analyzer IIx or Illumina HiSeq using 36 bp single end reads. The CNV algorithm previously published by Cold Spring Harbor Laboratories was used to call copy number variation on each single neuron, with modifications to successfully analyze non-cancer cells. Briefly, for each sample, reads were mapped to the genome using Bowtie. Clonal reads resulting from Polymerase Chain Reaction artifacts were removed using samtools, and the remaining unique reads were then assigned into 49,891 genomic bins that were previously determined such that each would contain a similar number of reads after mapping ³¹. Each bin's read count was then expressed as a value relative to the average number of reads per bin in the sample, and then normalized by GC content of each bin using a weighted sum of least squares algorithm (LOWESS). Circular binary segmentation was then used to divide each chromosome's bins into adjacent segments with similar means. Unlike the previously published algorithm, in which a histogram of bin counts was then plotted and the second peak chosen as representing a copy number of two, it was assumed, due to samples not being cancerous and thus being unlikely to contain significant amounts of aneuploidy, that the mean bin count in each sample would correspond to a copy number of two. Each segment's normalized bin count was thus multiplied by two and rounded to the nearest integer to call copy number. MIDAS data clearly showed a CNV call designating Trisomy 21 in all Down Syndrome single cells, while the traditional MDA-based method was not able to call Trisomy 21.
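
The final copy-number-calling step described above (expressing each bin relative to the sample mean, then multiplying each segment's normalized value by two and rounding) can be sketched as follows. This is a simplified illustration, not the published algorithm: GC normalization and circular binary segmentation are omitted, the segment boundaries are supplied by hand, and the function names and toy read counts are hypothetical.

```python
import math
from statistics import mean


def normalize_bins(read_counts):
    """Express each bin's read count relative to the average reads per bin."""
    average = mean(read_counts)
    return [count / average for count in read_counts]


def call_copy_number(normalized_bins, segments):
    """Call an integer copy number for each segment.

    `segments` is a list of half-open (start, end) bin index ranges, standing in
    for the output of a segmentation step. The sample mean is assumed to
    correspond to a copy number of two, as described for non-cancerous cells.
    """
    calls = []
    for start, end in segments:
        segment_mean = mean(normalized_bins[start:end])
        # Round half up to the nearest integer copy number.
        copy_number = math.floor(segment_mean * 2 + 0.5)
        calls.append((start, end, copy_number))
    return calls


# Toy data: 20 bins with a four-bin stretch carrying about 1.5x the reads.
toy_counts = [100] * 10 + [150] * 4 + [100] * 6
toy_segments = [(0, 10), (10, 14), (14, 20)]
calls = call_copy_number(normalize_bins(toy_counts), toy_segments)
print(calls)
```

In this toy example the elevated stretch is called as three copies and the flanking segments as two copies.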

Example 5: Identification of Artificial CNVs in MDA and MIDAS data

[0077] In order to test the ability of the CNV algorithm described above to call small CNVs, artificial CNVs were computationally constructed. Prior to circular binary segmentation, in each Down Syndrome sample, one hundred random genomic regions across chromosomes 1-22 were chosen, each consisting of either 17 or 34 bins of approximately 60 kb in size. Each region was replaced with an equivalently sized region from chromosome 21 or chromosome 4 (Supplementary Table 2). The above algorithm was then run on each "spiked-in" sample, and the number of new CNV calls in each sample that matched each spike-in was tallied. For the chromosome 21 spike-ins, MIDAS was able to accurately call 98% of spiked-in CNVs at the 2 Mb level and 68% of spiked-in CNVs at the 1 Mb level, while the traditional MDA-based method was not able to call any spiked-in CNVs. As expected, spike-ins of chromosome 4 did not result in any additional CNV calls.

[0078] Thus, small CNVs can be called in accordance with some embodiments herein.
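
The spike-in construction described in this example can be illustrated with a short sketch. It is a hedged illustration under stated assumptions: bin values are held in a plain dictionary of lists keyed by chromosome name, the donor chromosome is excluded from the candidate target locations, and the names `random_region` and `spike_in` as well as the toy values are hypothetical rather than taken from the description.

```python
import random


def random_region(bins_by_chrom, n_bins, rng, exclude=()):
    """Pick a random chromosome and start bin that can hold n_bins bins."""
    candidates = [chrom for chrom, bins in bins_by_chrom.items()
                  if chrom not in exclude and len(bins) >= n_bins]
    chrom = rng.choice(candidates)
    start = rng.randrange(len(bins_by_chrom[chrom]) - n_bins + 1)
    return chrom, start


def spike_in(bins_by_chrom, donor_chrom, donor_start, n_bins,
             target_chrom, target_start):
    """Return a copy of the bin data with a target region replaced by donor bins."""
    donor = bins_by_chrom[donor_chrom][donor_start:donor_start + n_bins]
    if len(donor) != n_bins:
        raise ValueError("donor region runs past the end of the chromosome")
    if target_start + n_bins > len(bins_by_chrom[target_chrom]):
        raise ValueError("target region runs past the end of the chromosome")
    spiked = {chrom: list(bins) for chrom, bins in bins_by_chrom.items()}
    spiked[target_chrom][target_start:target_start + n_bins] = donor
    return spiked


# Toy sample: a diploid chromosome (normalized value 1.0) and a trisomic
# chromosome 21 (normalized value 1.5). 17 bins of ~60 kb is about 1 Mb.
toy = {"chr1": [1.0] * 100, "chr21": [1.5] * 50}
rng = random.Random(0)
target_chrom, target_start = random_region(toy, 17, rng, exclude=("chr21",))
spiked = spike_in(toy, "chr21", 0, 17, target_chrom, target_start)
print(target_chrom, target_start,
      spiked[target_chrom][target_start:target_start + 17])
```

Because the replacement is performed on the binned values before segmentation, the spiked-in region carries the same technical noise as the donor region, which is the property the example relies on for benchmarking.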

Table 1: Single E. coli assembly statistics

[0079] Total number of reads, number of contigs mapping to E. coli, N50, maximum contig length, total base pairs assembled to E. coli K12 MG1655 genome, percent of E. coli K12 MG1655 covered in assembly, complete and partial genes covered, and percent of genome covered by mapped reads. Total number of reads refers to all sequencing reads, including non-mapping and clonal reads.
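
For reference, the N50 statistic listed above is conventionally the contig length at which contigs of that length or longer contain at least half of the assembled bases. The following minimal sketch uses that common definition; the function name `n50` and the toy contig lengths are illustrative, and assembly evaluation tools may apply additional filters (for example, a minimum contig length) that are not modeled here.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```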

Table 2: Artificial CNV transplantation statistics

[0080] Each genomic location used for calling of artificial CNVs is shown, along with whether or not MIDAS was able to call the artificial CNV. Only spike-ins of Trisomy Chromosome 21 from MIDAS samples generated CNV calls; spiking in either MIDAS Chromosome 4 or Trisomy Chromosome 21 from the traditional MDA-based method did not result in any artificial CNV calls.

2 Mb Region	2 Mb Size	1 Mb Region	1 Mb Size	chr21 2 Mb Spike-in	chr21 1 Mb Spike-in	2 Mb Spike-in Detected?	1 Mb Spike-in Detected?
chr1:35,953,938-37,889,989	1,936,052	chr1:35,953,938-36,992,9…	1,039,038	chr21:15,869,057-17,759,721	chr21:15,869,057-16,841,316	Yes	Yes
chr1:91,042,930-93,048,451	2,005,522	chr1:91,042,930-92,070,9…	1,028,011	chr21:35,733,857-37,620,466	chr21:35,733,857-36,687,022	Yes	Yes
chr1:98,284,802-100,167,143	1,882,342	chr1:98,284,802-99,236,989	952,188	chr21:31,329,048-33,234,529	chr21:31,329,048-32,284,116	Yes	Yes
chr1:101,720,184-103,622,384	1,902,201	chr1:101,720,184-102,680,542	960,359	chr21:15,549,571-17,439,036	chr21:15,549,571-16,523,267	Yes	Yes
chr1:158,948,121-160,904,574	1,956,454	chr1:158,948,121-159,956,586	1,008,466	chr21:43,947,454-45,973,419	chr21:43,947,454-45,032,873	Yes	Yes
chr1:180,612,063-182,538,641	1,926,579	chr1:180,612,063-181,604,263	992,201	chr21:18,144,565-20,109,045	chr21:18,144,565-19,193,040	Yes	No
chr1:219,167,316-221,099,504	1,932,189	chr1:219,167,316-220,161,764	994,449	chr21:37,382,817-39,338,993	chr21:37,382,817-38,415,585	Yes	No
chr1:241,304,468-243,539,334	2,234,867	chr1:241,304,468-242,305,206	1,000,739	chr21:45,256,736-47,160,835	chr21:45,256,736-46,250,492	Yes	Yes
chr2:47,279,743-49,257,602	1,977,860	chr2:47,279,743-48,315,8…	1,036,142	chr21:43,895,354-45,919,088	chr21:43,895,354-44,976,485	Yes	Yes
chr2:51,016,978-52,900,279	1,883,302	chr2:51,016,978-51,977,4…	960,498	chr21:28,485,490-30,475,416	chr21:28,485,490-29,548,591	Yes	Yes
chr2:120,917,453-122,858,393	1,900,941	chr2:120,917,453-121,860,157	942,705	chr21:20,701,039-22,557,692	chr21:20,701,039-21,663,245	Yes	Yes
chr2:139,284,812-141,151,575	1,866,764	chr2:139,284,812-140,248,942	964,131	chr21:21,715,029-23,559,945	chr21:21,715,029-22,661,749	No	No
chr2:151,537,791-153,484,681	1,946,891	chr2:151,537,791-152,544,463	1,006,673	chr21:19,016,794-20,913,230	chr21:19,016,794-20,003,839	Yes	No
chr2:175,346,199-177,271,180	1,924,982	chr2:175,346,199-176,315,579	969,381	chr21:17,384,295-19,361,384	chr21:17,384,295-18,362,470	Yes	No
chr2:204,550,336-206,420,645	1,870,310	chr2:204,550,336-205,511,965	961,630	chr21:46,087,520-47,989,191	chr21:46,087,520-47,051,166	Yes	No
chr2:240,935,763-242,873,950	1,938,188	chr2:240,935,763-241,911,231	975,469	chr21:45,534,869-47,430,139	chr21:45,534,869-46,517,024	Yes	No
chr3:21,457,475-23,388,030	1,930,556	chr3:21,457,475-22,429,8…	972,412	chr21:17,384,295-19,361,384	chr21:17,384,295-18,362,470	Yes	No
chr3:29,794,211-31,649,290	1,855,080	chr3:29,794,211-30,766,2…	972,070	chr21:32,609,078-34,563,159	chr21:32,609,078-33,620,107	Yes	Yes
chr3:64,759,471-66,748,898	1,989,428	chr3:64,759,471-65,729,4…	969,940	chr21:25,797,240-27,686,265	chr21:25,797,240-26,755,429	Yes	Yes
chr3:94,728,396-96,639,499	1,911,104	chr3:94,728,396-95,718,3…	989,977	chr21:29,548,591-31,432,266	chr21:29,548,591-30,529,736	Yes	Yes
chr3:131,350,124-133,321,647	1,971,524	chr3:131,350,124-132,355,013	1,004,890	chr21:20,380,269-22,240,402	chr21:20,380,269-21,341,117	Yes	Yes
chr3:169,532,039-171,494,322	1,962,284	chr3:169,532,039-170,559,996	1,027,958	chr21:43,312,554-45,314,840	chr21:43,312,554-44,279,997	Yes	Yes
chr3:190,659,728-192,553,633	1,893,906	chr3:190,659,728-191,648,743	989,016	chr21:45,699,913-47,592,766	chr21:45,699,913-46,673,526	Yes	No
chr4:26,717,043-28,591,712	1,874,670	chr4:26,717,043-27,687,7…	970,664	chr21:34,563,159-36,473,184	chr21:34,563,159-35,575,904	Yes	No
chr4:41,807,132-43,732,453	1,925,322	chr4:41,807,132-42,806,1…	998,987	chr21:46,195,641-48,129,895	chr21:46,195,641-47,160,835	Yes	No
chr4:47,152,041-52,814,683	5,662,643	chr4:47,152,041-48,136,1…	984,082	chr21:41,811,953-43,737,990	chr21:41,811,953-42,776,509	Yes	Yes
chr4:55,036,501-56,991,161	1,954,661	chr4:55,036,501-56,032,5…	996,090	chr21:38,137,201-40,021,631	chr21:38,137,201-39,127,553	Yes	Yes
chr4:59,922,675-61,851,268	1,928,594	chr4:59,922,675-60,922,0…	999,384	chr21:41,100,954-43,045,476	chr21:41,100,954-42,076,268	Yes	Yes
chr4:62,174,303-64,098,264	1,923,962	chr4:62,174,303-63,153,8…	979,518	chr21:39,759,243-41,648,348	chr21:39,759,243-40,718,850	Yes	Yes
chr4:68,752,406-70,866,031	2,113,626	chr4:68,752,406-69,818,1…	1,065,726	chr21:20,965,576-22,821,144	chr21:20,965,576-21,926,708	Yes	Yes
chr4:120,492,349-122,402,672	1,910,324	chr4:120,492,349-121,477,609	985,261	chr21:26,012,981-27,901,092	chr21:26,012,981-26,970,320	Yes	Yes
chr4:122,895,270-124,846,586	1,951,317	chr4:122,895,270-123,899,166	1,003,897	chr21:31,013,874-32,887,206	chr21:31,013,874-31,960,666	Yes	Yes
chr4:147,655,266-149,580,558	1,925,293	chr4:147,655,266-148,648,700	993,435	chr21:33,344,236-35,307,614	chr21:33,344,236-34,350,216	Yes	Yes
chr5:42,837,955-44,872,058	2,034,104	chr5:42,837,955-43,916,5…	1,078,644	chr21:25,960,034-27,848,628	chr21:25,960,034-26,918,132	Yes	Yes
chr5:75,108,161-77,082,347	1,974,187	chr5:75,108,161-76,129,009	1,020,849	chr21:23,559,945-25,429,916	chr21:23,559,945-24,529,659	Yes	Yes
chr6:137,141,139-139,051,493	1,910,355	chr6:137,141,139-138,109,997	968,859	chr21:25,271,009-27,132,790	chr21:25,271,009-26,223,918	Yes	Yes
chr7:27,905,786-29,810,788	1,905,003	chr7:27,905,786-28,850,6…	944,880	chr21:45,145,316-47,051,166	chr21:45,145,316-46,142,620	Yes	Yes
chr7:37,524,768-39,414,023	1,889,256	chr7:37,524,768-38,513,9…	989,157	chr21:41,212,176-43,151,199	chr21:41,212,176-42,183,940	Yes	Yes
chr7:56,531,510-63,813,298	7,281,789	chr7:56,531,510-62,248,7…	5,717,277	chr21:29,759,604-31,643,582	chr21:29,759,604-30,743,560	Yes	Yes
chr7:131,371,835-133,239,403	1,867,569	chr7:131,371,835-132,316,051	944,217	chr21:42,236,377-44,163,506	chr21:42,236,377-43,258,676	Yes	Yes
chr7:142,172,461-144,654,928	2,482,468	chr7:142,172,461-143,273,742	1,101,282	chr21:14,820,139-17,160,267	chr21:14,820,139-16,253,786	Yes	Yes
chr8:31,002,063-32,894,131	1,892,069	chr8:31,002,063-31,989,2…	987,203	chr21:39,864,943-41,758,932	chr21:39,864,943-40,831,777	No	No
chr8:58,142,435-60,035,197	1,888,763	chr8:58,142,435-59,110,5…	968,092	chr21:42,076,268-43,998,204	chr21:42,076,268-43,098,462	Yes	Yes
chr8:78,929,030-80,820,600	1,891,571	chr8:78,929,030-79,895,4…	966,378	chr21:18,248,286-20,218,830	chr21:18,248,286-19,304,425	Yes	No
chr9:1,055,886-2,968,704	1,912,819	chr9:1,055,886-2,053,04…	997,158	chr21:45,145,316-47,051,166	chr21:45,145,316-46,142,620	Yes	Yes
chr9:26,653,725-28,577,848	1,924,124	chr9:26,653,725-27,652,2…	998,550	chr21:45,200,702-47,107,423	chr21:45,200,702-46,195,641	Yes	Yes
chr9:78,438,145-80,383,381	1,945,237	chr9:78,438,145-79,427,9…	989,759	chr21:29,602,048-31,485,060	chr21:29,602,048-30,583,583	Yes	Yes
chr9:85,352,360-87,347,753	1,995,394	chr9:85,352,360-86,341,9…	989,555	chr21:36,473,184-38,415,585	chr21:36,473,184-37,439,834	Yes	Yes
chr9:108,943,484-110,868,831	1,925,348	chr9:108,943,484-109,924,645	981,162	chr21:30,027,091-31,908,111	chr21:30,027,091-31,013,874	Yes	No
chr9:136,054,731-138,033,870	1,979,140	chr9:136,054,731-137,134,437	1,079,707	chr21:27,357,146-29,249,735	chr21:27,357,146-28,326,314	Yes	No
chr9:138,851,087-140,884,555	2,033,469	chr9:138,851,087-139,905,741	1,054,655	chr21:41,432,175-43,366,620	chr21:41,432,175-42,401,776	Yes	Yes
chr10:37,235,643-43,386,249	6,150,607	chr10:37,235,643-38,367,6…	1,131,970	chr21:29,438,499-31,329,048	chr21:29,438,499-30,420,106	Yes	Yes
chr10:50,517,442-53,244,584	2,727,143	chr10:50,517,442-52,259,0…	1,741,630	chr21:41,265,140-43,202,177	chr21:41,265,140-42,236,377	Yes	Yes
chr10:84,733,418-86,669,139	1,935,722	chr10:84,733,418-85,709,4…	976,010	chr21:30,027,091-31,908,111	chr21:30,027,091-31,013,874	Yes	Yes
chr11:1,155,181-3,103,945	1,948,765	chr11:1,155,181-2,197,44…	1,042,264	chr21:32,662,136-34,618,144	chr21:32,662,136-33,675,958	Yes	Yes
chr11:60,102,184-62,149,657	2,047,474	chr11:60,102,184-61,169,0…	1,066,890	chr21:33,454,741-35,416,887	chr21:33,454,741-34,454,555	Yes	No
chr11:87,270,689-89,838,290	2,567,602	chr11:87,270,689-88,321,4…	1,050,795	chr21:27,572,323-29,548,591	chr21:27,572,323-28,537,710	Yes	No
chr11:90,997,546-92,889,388	1,891,843	chr11:90,997,546-91,975,9…	978,397	chr21:38,031,056-39,916,751	chr21:38,031,056-39,023,029	Yes	No
chr11:109,670,892-111,573,13…	1,902,245	chr11:109,670,892-110,675,346	1,004,455	chr21:32,174,455-34,332,773	chr21:32,174,455-33,174,429	Yes	No
chr11:116,094,960-118,005,15…	1,910,200	chr11:116,094,960-117,087,728	992,769	chr21:21,017,554-22,872,356	chr21:21,017,554-21,979,621	Yes	No
chr11:125,026,409-126,921,222	1,894,814	chr11:125,026,409-126,022,156	995,748	chr21:28,695,036-30,689,040	chr21:28,695,036-29,759,604	Yes	No
chr12:49,227,100-51,283,876	2,056,777	chr12:49,227,100-50,281,6…	1,054,590	chr21:45,145,316-47,051,166	chr21:45,145,316-46,142,620	Yes	Yes
chr12:52,063,249-54,036,936	1,973,688	chr12:52,063,249-53,042,3…	979,066	chr21:44,467,407-46,465,237	chr21:44,467,407-45,534,869	Yes	Yes
chr12:93,762,836-95,754,509	1,951,674	chr12:93,762,836-94,755,1…	992,265	chr21:22,083,898-23,935,965	chr21:22,083,898-23,028,045	Yes	No
chr12:105,555,128-107,478,70…	1,923,575	chr12:105,555,128-106,536,849	981,722	chr21:34,509,021-36,421,675	chr21:34,509,021-35,521,899	Yes	Yes
chr12:125,199,005-127,114,72…	1,915,716	chr12:125,199,005-126,193,374	994,370	chr21:41,265,140-43,202,177	chr21:41,265,140-42,236,377	Yes	Yes
chr13:20,265,706-22,292,811	2,027,106	chr13:20,265,706-21,295,8…	1,030,107	chr21:21,286,351-23,135,668	chr21:21,286,351-22,240,402	Yes	Yes
chr13:72,352,465-74,251,208	1,898,744	chr13:72,352,465-73,311,4…	959,028	chr21:32,067,733-34,023,166	chr21:32,067,733-33,061,883	Yes	No
chr13:111,996,474-114,071,86…	2,075,392	chr13:111,996,474-113,130,054	1,133,581	chr21:30,475,416-32,337,385	chr21:30,475,416-31,432,266	Yes	Yes
chr14:40,383,400-42,297,026	1,913,627	chr14:40,383,400-41,368,4…	985,054	chr21:16,421,666-18,309,003	chr21:16,421,666-17,384,295	Yes	No
chr14:47,205,765-49,129,974	1,924,210	chr14:47,205,765-48,164,1…	958,371	chr21:36,842,693-38,809,659	chr21:36,842,693-37,864,928	Yes	No
chr14:49,785,308-51,764,266	1,978,959	chr14:49,785,308-50,827,7…	1,042,409	chr21:34,618,144-36,526,570	chr21:34,618,144-35,629,603	Yes	No
chr14:54,010,611-55,940,946	1,930,336	chr14:54,010,611-54,992,3…	981,762	chr21:16,357,320-18,197,321	chr21:16,317,320-17,264,941	Yes	Yes
chr14:58,280,794-60,215,080	1,934,287	chr14:58,280,794-59,282,9…	1,002,168	chr21:20,808,528-22,661,749	chr21:20,808,528-21,768,367	Yes	Yes
chr14:65,798,341-67,757,790	1,959,450	chr14:65,798,341-66,798,2…	999,913	chr21:45,087,470-46,993,898	chr21:45,087,470-46,087,520	Yes	Yes
chr14:69,993,123-71,950,871	1,957,749	chr14:69,993,123-70,999,0…	1,005,892	chr21:40,718,810-42,616,819	chr21:40,718,810-41,704,149	Yes	Yes
chr14:79,855,300-81,757,060	1,901,761	chr14:79,855,300-80,828,7…	973,409	chr21:26,970,320-28,855,428	chr21:26,970,320-27,952,868	Yes	Yes
chr15:34,986,155-36,878,959	1,892,805	chr15:34,986,155-35,976,2…	990,131	chr21:31,643,582-33,561,518	chr21:31,643,582-32,609,078	Yes	Yes
chr15:79,683,541-81,583,565	1,900,025	chr15:79,683,541-80,679,5…	996,059	chr21:40,551,918-42,454,621	chr21:40,551,918-41,542,378	Yes	Yes
chr16:47,776,123-49,696,183	1,920,061	chr16:47,776,123-48,790,9…	1,014,857	chr21:40,127,624-42,024,470	chr21:40,127,624-41,100,954	Yes	No
chr16:73,420,082-75,485,559	2,065,478	chr16:73,420,082-74,472,5…	1,052,446	chr21:23,935,965-25,797,240	chr21:23,935,965-24,896,759	Yes	Yes
chr17:4,089,647-6,142,294	2,052,648	chr17:4,089,647-5,192,301	1,102,655	chr21:36,421,675-38,356,392	chr21:36,421,675-37,382,817	Yes	Yes
chr17:13,450,820-15,379,070	1,928,251	chr17:13,450,820-14,460,5…	1,009,727	chr21:46,195,641-48,129,895	chr21:46,195,641-47,160,835	Yes	Yes
chr17:37,373,102-39,387,990	2,014,889	chr17:37,373,102-38,433,1…	1,060,018	chr21:35,307,614-37,163,621	chr21:35,307,614-36,264,921	Yes	Yes
chr17:42,946,627-45,535,886	2,589,260	chr17:42,946,627-44,092,7…	1,146,091	chr21:39,023,029-40,887,012	chr21:39,023,029-39,970,006	Yes	Yes
chr17:62,034,684-64,166,903	2,132,220	chr17:62,034,684-63,235,8…	1,201,118	chr21:23,770,888-25,639,505	chr21:23,770,888-24,739,139	Yes	Yes
chr18:42,248,309-44,139,666	1,891,358	chr18:42,248,309-43,592,6…	944,349	chr21:36,900,623-38,861,882	chr21:36,900,623-37,919,721	Yes	No
chr19:24,366,238-29,680,778	5,314,541	chr19:24,366,238-28,755,1…	4,388,890	chr21:20,965,576-22,821,144	chr21:20,965,576-21,926,708	Yes	Yes
chr20:47,231,329-49,203,437	1,972,109	chr20:47,231,329-48,237,9…	1,006,616	chr21:16,630,136-18,541,644	chr21:16,630,136-17,604,225	Yes	Yes
chr22:49,539,479-51,909,988	2,370,510	chr22:49,539,479-50,562,673	1,023,195	chr21:30,361,934-32,231,305	chr21:30,361,934-31,329,048	Yes	Yes

Table 3: Copy number events called in each neuron.

[0081] All identified copy number events in each single cell are listed, along with the size of the CNV in actual base pairs and the number of base pairs in the CNV that were non-repetitive according to a previously published algorithm ⁸. Unique CNVs are presented in plain text, while CNVs shared between one or more samples are presented in italics (if a CNV call was partially identified in another sample) or bold (if a CNV call was fully identified in another sample). Aside from Trisomy 21 (identified in all three Down Syndrome cells), most CNV calls are fairly small in both size and non-repetitive size.

Cell #	Type	Chromosome	Start Position	End Position	Copy No.	Size	Size (Valid Genomic Regions ⁸)
1	Healthy	1	16,949,551	17,257,431	5	307,881	120,000
1	Healthy	1	147,802,093	149,049,044	3	1,246,952	120,000
1	Healthy	2	133,000,723	133,135,043	4	134,321	120,000
1	Healthy	…	75,275,861	76,035,772	3	759,912	420,000
1	Healthy	4	190,664,845	191,154,276	4	489,432	240,000
1	Healthy	6	32,526,395	32,645,736	1	119,342	120,000
1	Healthy	8	39,308,029	39,363,306	1	55,278	60,000
1	Healthy	10	47,008,316	47,538,599	4	530,284	180,000
1	Healthy	11	48,858,583	48,959,202	4	100,620	60,000
1	Healthy	11	122,887,817	123,010,937	1	123,121	120,000
1	Healthy	15	34,761,777	34,873,738	1	111,962	60,000
1	Healthy	16	3,762,009	3,818,563	1	56,555	60,000
1	Healthy	16	32,340,630	34,746,226	3	2,405,597	1,140,000
1	Healthy	16	71,141,287	71,246,392	…	105,106	60,000
1	Healthy	17	21,257,685	21,374,155	3	116,471	120,000
1	Healthy	17	77,452,319	77,652,085	4	199,767	60,000
1	Healthy	20	29,449,066	29,811,435	4	362,370	120,000
2	Healthy	1	16,949,551	17,257,431	4	307,881	120,000
2	Healthy	1	34,347,191	34,666,699	3	319,509	360,000
2	Healthy	1	147,802,093	149,049,044	4	…	120,000
2	Healthy	2	132,846,449	133,135,043	3	288,595	180,000
2	Healthy	3	75,803,231	75,901,346	4	98,116	60,000
2	Healthy	3	195,457,070	195,525,025	3	67,956	60,000
2	Healthy	6	0	358,119	1	358,120	180,000
2	Healthy	6	32,526,395	32,699,933	1	173,539	180,000
2	Healthy	8	39,308,029	39,363,306	1	55,278	60,000
2	Healthy	10	47,008,316	47,538,599	4	530,284	180,000
2	Healthy	15	34,761,777	34,873,738	1	111,962	60,000
2	Healthy	16	32,499,141	34,280,003	3	1,780,863	600,000
2	Healthy	16	34,410,499	34,746,226	1	335,728	360,000
2	Healthy	16	71,141,287	71,246,392	9	105,106	60,000
2	Healthy	17	21,257,685	21,374,155	3	116,471	120,000
2	Healthy	18	59,103,041	59,431,597	3	328,557	360,000
2	Healthy	20	25,753,877	29,868,184	1	4,114,308	420,000
2	Healthy	20	35,971,800	36,129,265	3	157,466	180,000
…	Down Syndrome	4	190,664,845	191,154,276	4	489,432	240,000
…	Down Syndrome	8	39,308,029	39,363,306	0	55,278	60,000
…	Down Syndrome	10	38,869,769	42,858,972	6	3,989,204	240,000
…	Down Syndrome	10	47,008,316	50,466,755	3	3,458,440	2,040,000
…	Down Syndrome	10	69,854,431	70,514,102	1	659,672	660,000
…	Down Syndrome	16	71,141,287	71,246,392	11	105,106	60,000
…	Down Syndrome	19	31,729,973	32,514,241	1	784,269	900,000
…	Down Syndrome	20	29,449,066	29,868,184	6	419,119	180,000
…	Down Syndrome	20	42,392,899	43,933,855	3	1,540,957	1,680,000
…	Down Syndrome	21	14,432,540	48,129,895	3	33,697,356	36,180,000
…	Down Syndrome	22	50,785,685	51,304,566	1	518,882	420,000
…	Down Syndrome	1	114,955,315	115,008,910	1	53,596	60,000
…	Down Syndrome	…	133,000,723	133,065,776	…	65,054	60,000
…	Down Syndrome	…	180,472,228	180,582,904	0	110,677	120,000
…	Down Syndrome	4	68,807,337	68,866,453	0	59,117	60,000
…	Down Syndrome	4	107,214,750	107,376,615	0	161,866	180,000
…	Down Syndrome	8	39,308,029	39,415,337	0	107,309	120,000
…	Down Syndrome	10	38,869,769	42,858,972	8	3,989,204	240,000
…	Down Syndrome	10	61,755,833	61,810,282	1	54,450	60,000
…	Down Syndrome	10	65,820,124	67,011,981	3	1,191,858	1,320,000
…	Down Syndrome	16	34,002,234	34,220,262	…	218,029	60,000
…	Down Syndrome	19	29,082,056	29,963,452	3	881,397	960,000
…	Down Syndrome	19	53,713,097	54,229,102	1	516,006	480,000
…	Down Syndrome	20	29,449,066	29,927,709	6	478,644	240,000
…	Down Syndrome	21	14,432,540	48,129,895	3	33,697,356	36,180,000
…	Down Syndrome	10	12,083,581	12,397,568	1	313,988	300,000
…	Down Syndrome	10	38,869,769	42,858,972	9	3,989,204	240,000
…	Down Syndrome	20	21,609,652	23,522,517	3	1,912,866	2,100,000
…	Down Syndrome	20	29,449,066	29,868,184	9	419,119	180,000
…	Down Syndrome	21	14,432,540	48,129,895	3	33,697,356	36,180,000

References

1. Zhang, K. et al. Sequencing genomes from single cells by polymerase cloning. Nat Biotechnol 24, 680-686 (2006).

2. Rodrigue, S. et al. Whole genome amplification and de novo assembly of single bacterial cells. PLoS One 4, e6864 (2009).

3. Fan, H.C., Wang, J., Potanina, A. & Quake, S.R. Whole-genome molecular haplotyping of single cells. Nat Biotechnol 29, 51 -57 (201 1).

4. Hou, Y. et al. Single-cell exome sequencing and monoclonal evolution of a JAK2- negative myeloproliferative neoplasm. Cell 148, 873-885 (2012).

5. Pan, X. et al. A procedure for highly specific, sensitive, and unbiased whole-genome amplification. Proc Natl Acad Sci US A 105, 15499-15504 (2008).

6. Marcy, Y. et al. Dissecting biological "dark matter" with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. Proc Natl Acad Sci U S A. 104, 1 1889- 1 1894 (2007).

7. Yoon, H.S. et al. Single-cell genomics reveals organ ismal interactions in uncultivated marine protisis. Science 332, 714-737 (201 1).

8. Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature 472, 90- 94 (201 1).

9. Xu, X. et al. Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 148, 886-895 (2012).

10. Wang, J., Fan, H.C., Behr, B. & Quake, S.R. Genome-wide single-cell analysis of recombination activity and de novo mutation rates in human sperm. Cell 150, 402-412 (2012).

1 1. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-ceil sequencing. J Compi.it Bio! 19, 455-477 (2012).

12. Chitsaz, I I. et al. Efficient de novo assembly of single-cell bacterial genomes from short-read data sets. Nat Biotechnol 29, 915-921 (201 1).

13. Hutchison, C.A., 3rd, Smith, H.O., Pfannkoch, C. & Venter, J.C, Cell-free cloning using phi29 DNA polymerase. Proc Natl Acad Sci USA 102, 17332- 17336 (2005). Marcy, Y. et al. Nanoliter reactors improve multiple displacement amplification of genomes from single cells. PLoS Genet 3, 1702-1708 (2007).

Inoue, J., Shigemori, Y. & Mikawa, T. Improvements of rolling circle amplification (RCA) efficiency and accuracy using Thermus therniophiius SSB mutant protein. Nucleic Acids Res 34, e69 (2006).

Pan, X. et al. A procedure for highly specific, sensitive, and unbiased whole-genome amplification. Proc Natl Acad Sci US A 105, 15499-15504 (2008).

Rodrigue, S. et al. Whole genome amplification and de novo assembly of single bacteria! cells. PLoS One 4, e6864 (2009).

Woyke, T. et al. One bacterial cell, one complete genome. PLoS One 5, el 0314 (2010).

Fitzsimons, M.S. et al. Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome. Genome Res (2013).

Blainey, P.C. & Quake, S.R. Digital MDA for enumeration of total nucleic acid contamination. Nucleic acids research 39, el 9 (2011).

Adey, A. & Shendure, J. Ultra-low-input, tagmentation-based whole-genome bisulfite sequencing. Genome Res 22, 1 139-1143 (2012).

22. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19, 455-477 (2012).

23. Rehen, S.K. et al. Constitutional aneuploidy in the normal human brain. J Neurosci 25, 2176-2180 (2005).

24. Rehen, S.K. et al. Chromosomal variation in neurons of the developing and adult mammalian nervous system. Proc Natl Acad Sci USA 98, 13361-13366 (2001).

25. Yang, A.H. et al. Chromosome segregation defects contribute to aneuploidy in normal neural progenitor cells. J Neurosci 23, 10454-10462 (2003).

26. Yurov, Y.B. et al. Aneuploidy and confined chromosomal mosaicism in the developing human brain. PLoS One 2, e558 (2007).

27. Muotri, A.R. & Gage, F.H. Generation of neuronal variability and complexity. Nature 441, 1087-1093 (2006).

28. Singer, T., McConnell, M.J., Marchetto, M.C., Coufal, N.G. & Gage, F.H. LINE-1 retrotransposons: mediators of somatic variation in neuronal genomes? Trends Neurosci 33, 345-354 (2010).

29. Westra, J.W. et al. Neuronal DNA content variation (DCV) with regional and individual differences in the human brain. J Comp Neurol 518, 3981-4000 (2010).

30. Baslan, T. et al. Genome-wide copy number analysis of single cells. Nat Protoc 7, 1024-1041 (2012).

31. Shendure, J. et al. Accurate multiplex polony sequencing of an evolved bacterial genome. Science 309, 1728-1732 (2005).

32. Zong, C., Lu, S., Chapman, A.R. & Xie, X.S. Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science 338, 1622-1626 (2012).

33. Hussein, S.M. et al. Copy number variation and selection during reprogramming to pluripotency. Nature 471, 58-62 (2011).

34. Westra, J.W. et al. Aneuploid mosaicism in the developing and adult cerebellar cortex. J Comp Neurol 507, 1944-1951 (2008).

35. Huson, D.H., Auch, A.F., Qi, J. & Schuster, S.C. MEGAN analysis of metagenomic data. Genome Res 17, 377-386 (2007).

36. Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072-1075 (2013).

37. Aziz, R.K. et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9, 75 (2008).

38. Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A.C. & Kanehisa, M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res 35, W182-185 (2007).

39. Fan, Christina et al. Whole genome molecular haplotyping of single cells. Nature Biotech

40. Zong, Chenghang et al. Genome-Wide Detection of Single-Nucleotide and Copy-Number Variations of a Single Human Cell. Science 338, 1622 (2012)

41. Zhang, Kun et al. Sequencing Genomes from Single Cells by Polymerase Cloning. Nature Biotech

42. Evrony, Gilad et al. Single Neuron Sequencing Analysis of L1 Retrotransposition and Somatic Mutation in the Human Brain. Cell 151, 483 (2012)

43. Kirkness, E.F. et al. Sequencing of isolated sperm cells for direct haplotyping of a human genome. Genome Res 23, 826-832 (2013).

44. Lu, S. et al. Probing meiotic recombination and aneuploidy of single sperm cells by whole-genome sequencing. Science 338, 1627-1630 (2012).

[0082] The disclosures of all references cited herein are incorporated herein by reference in their entireties.

[0083] In this application, the use of the singular can include the plural unless specifically stated otherwise or unless, as will be understood by one of skill in the art in light of the present disclosure, the singular is the only functional embodiment. Thus, for example, "a" can mean more than one, and "one embodiment" can mean that the description applies to multiple embodiments.

[0084] The foregoing description and Examples detail certain embodiments. It will be appreciated, however, that no matter how detailed the foregoing may appear in text, the invention may be practiced in many ways and the invention should be construed in accordance with the appended claims and any equivalents thereof.
