Nucleic acid encoding reactions

Document No.: 842739. Publication date: 2021-04-02.

Abstract: Nucleic acid encoding reactions, by Megan Anderson, Peilin Chen, Brian Fowler, Robert C. Jones, Fiona Kaper, and Ronald Lebofsky, dated 2012-05-21. The present application relates to nucleic acid encoding reactions. Methods are described that can be used to incorporate one or more adaptors and/or nucleotide tags and/or barcode nucleotide sequences into one, or typically a plurality of, target nucleotide sequences. In particular embodiments, nucleic acid fragments with adaptors, such as nucleic acid fragments suitable for use in high-throughput DNA sequencing, are generated. In other embodiments, information about the reaction mixture is encoded into the reaction product. Also described are methods and kits useful for amplifying one or more target nucleic acids in preparation for applications such as bidirectional nucleic acid sequencing. In particular embodiments, the methods of the present application comprise additionally performing bidirectional DNA sequencing. Methods of encoding and detecting and/or quantifying alleles by primer extension are also described.

1. A method of analyzing nucleic acids in or associated with a single particle, the method comprising:

a) capturing particles in separate reaction volumes such that each of a plurality of the reaction volumes comprises a single particle, wherein the separate reaction volumes are droplets in an emulsion;

b) generating a barcoded reaction product from the nucleic acid in the single particle in each individual reaction volume or from the nucleic acid associated with the single particle in each individual reaction volume, wherein generating the reaction product comprises, for each individual reaction volume, incorporating a barcoded adaptor nucleotide sequence into the reaction product using a plurality of different single-stranded barcoded adaptor molecules, wherein each adaptor molecule comprises: (i) a nucleotide sequence comprising a primer binding site, (ii) a degenerate tail sequence, and (iii) a barcode nucleotide sequence, wherein the plurality of different single-stranded barcoded adaptor molecules for the same individual reaction volume comprise the same barcode;

c) pooling the barcoded reaction products;

d) amplifying the barcoded reaction products using primers that bind to the primer binding sites to generate templates for bidirectional DNA sequencing; and

e) sequencing the barcoded reaction products.

2. The method of claim 1, wherein the particle is a cell.

3. The method of claim 2, further comprising lysing the cells after step a) and before step b) to release the cell contents.

4. The method of claim 2, wherein the distribution of cells in the individual reaction volumes approximates a Poisson distribution.

5. The method of claim 4, wherein the cells are diluted to produce the highest proportion of reaction volumes with only one cell.

6. The method of claim 2, wherein step a) further comprises binding the cells to a support.

7. The method of claim 6, wherein the support has a binding partner distributed on its surface.

8. The method of claim 7, wherein the binding partner is an antibody.

9. The method of claim 6, wherein the support is a bead.

10. The method of claim 1, wherein the particles are ruptured after step a) and before step b) to release internal components.

11. The method of claim 1, wherein the particles are beads.

12. The method of claim 11, wherein the beads are primer coated.

13. The method of claim 1, wherein step b) of producing a reaction product comprises reverse transcription.

14. The method of claim 1, wherein step b) of producing a reaction product comprises amplification.

15. The method of claim 13, wherein step b) of producing a reaction product comprises amplification.

16. The method of claim 1, further comprising f) correlating the sequencing results obtained in e) to a single particle.
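The loading statistics referred to in claims 4 and 5 can be illustrated with a short calculation. This is a minimal sketch that assumes cell counts per reaction volume follow an ideal Poisson distribution with mean `lam` cells per volume; the function names and the grid of candidate loadings are illustrative and are not part of the claimed method. Under that assumption the fraction of volumes holding exactly one cell is lam * exp(-lam), which peaks at lam = 1 at roughly 36.8%.

```python
import math

def poisson_pmf(k, lam):
    """Probability that a reaction volume holds exactly k cells."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def best_loading(candidates):
    """Return the mean occupancy that maximizes single-cell volumes."""
    return max(candidates, key=lambda lam: poisson_pmf(1, lam))

# Illustrative grid of mean occupancies (cells per volume): 0.05 to 3.00.
grid = [i / 20 for i in range(1, 61)]
lam_star = best_loading(grid)
single_fraction = poisson_pmf(1, lam_star)
multi_fraction = 1 - poisson_pmf(0, lam_star) - single_fraction
```

At the optimum, `multi_fraction` (volumes with two or more cells) is still substantial, which is why lower loadings are often preferred in practice when multi-cell volumes must be minimized.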

Technical Field

The present invention generally relates to the incorporation of nucleic acid sequences into target nucleic acids, e.g., the addition of one or more adaptors and/or nucleotide tags and/or barcode nucleotide sequences to target nucleotide sequences. The methods described herein can be used, for example, in the field of high-throughput assays for detecting and/or sequencing particular target nucleic acids.

Background

The ability to detect specific nucleic acid sequences in samples has led to a variety of new approaches in diagnostic and prognostic medical, environmental, food and agricultural monitoring, molecular biology research, and a variety of other fields. For many applications, it is desirable to simultaneously detect and/or analyze many target nucleic acids in multiple samples, e.g., multiple individual cells in a population.

Summary of the Invention

In certain embodiments, the invention provides methods of adding an adaptor molecule to each end of a plurality of target nucleic acids comprising sticky ends. The method comprises annealing an adaptor molecule to a sticky end of a double-stranded target nucleic acid molecule to produce an annealed adaptor-target nucleic acid molecule, wherein the adaptor molecules are:

(i) hairpin structures each comprising:

an adaptor nucleotide sequence linked to

a nucleotide linker linked to

a nucleotide sequence capable of annealing to the adaptor nucleotide sequence and linked to

a degenerate tail sequence; or

(ii) double-stranded or single-stranded molecules comprising on each strand:

a first adaptor nucleotide sequence linked to

a nucleotide linker linked to

a second adaptor nucleotide sequence; and

a degenerate tail sequence, wherein the double-stranded molecules each comprise two degenerate tail sequences as sticky ends. After annealing, the method includes filling in any gaps in the resulting annealed adaptor-target nucleic acid molecule, and ligating any adjacent nucleotide sequences in the annealed adaptor-target nucleic acid molecule to produce an adaptor-modified target nucleic acid molecule. In related embodiments, the invention provides a plurality of adaptor molecules, wherein the adaptor molecules are the hairpin structures of (i) above or the double-stranded or single-stranded molecules of (ii) above. Also contemplated are kits that, in various embodiments, may comprise a plurality of adaptor molecules in combination with a DNase, an exonuclease, an endonuclease, a polymerase, a ligase, or any combination thereof.
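To make the role of the degenerate tail concrete: if every tail position can be any of the four bases, a tail of length k represents 4^k distinct sequences, so a pool of adaptors can anneal to sticky ends of unknown sequence. The sketch below enumerates such a pool. The adaptor sequence, the tail length, and the placement of the tail at the 3' end are assumptions chosen for illustration only.

```python
from itertools import product

BASES = "ACGT"

def degenerate_tails(length):
    """Enumerate every concrete sequence of a fully degenerate tail."""
    return ["".join(p) for p in product(BASES, repeat=length)]

def adaptor_pool(adaptor, tail_length):
    """Append each possible tail to a fixed adaptor sequence (5'->3')."""
    return [adaptor + tail for tail in degenerate_tails(tail_length)]

# Hypothetical adaptor sequence and tail length, for illustration only.
pool = adaptor_pool("ACGTACGT", 3)
pool_size = len(pool)  # 4 ** 3 distinct adaptor molecules
```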

In other embodiments, the invention provides methods for tagging a plurality of target nucleic acids with nucleotide sequences. The method comprises preparing a first reaction mixture for each target nucleic acid, the first reaction mixture comprising an inner primer pair and an outer primer pair, wherein:

(i) the inner primer pair comprises:

a forward, inner primer comprising a first nucleotide tag, a first barcode nucleotide sequence, and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion, a first barcode nucleotide sequence, and a second nucleotide tag; and

(ii) the outer primer pair comprises:

a forward, outer primer comprising a second barcode nucleotide sequence and a first nucleotide tag-specific portion; and

a reverse, outer primer comprising a second nucleotide tag-specific portion and a second barcode nucleotide sequence, wherein the outer primers are in excess of the inner primers. Each first reaction mixture is subjected to a reaction to produce a plurality of tagged target nucleotide sequences, each comprising 5'-second barcode nucleotide sequence-first nucleotide tag sequence-first barcode nucleotide sequence-target nucleotide sequence-first barcode nucleotide sequence-second nucleotide tag sequence-second barcode nucleotide sequence-3'. In a related embodiment, the invention provides a kit comprising a polymerase in combination with the inner primers of (i) above and the outer primers of (ii) above, wherein the outer primers are in excess of the inner primers.

In certain embodiments, the present invention provides methods for tagging a plurality of target nucleic acids with nucleotide sequences. The method comprises preparing a first reaction mixture for each target nucleic acid, the first reaction mixture comprising an inner primer pair, a stuffer primer pair, and an outer primer pair, wherein:

(i) the inner primer pair comprises:

a forward, inner primer comprising a first nucleotide tag and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion and a second nucleotide tag;

(ii) the stuffer primer pair comprises:

a forward, stuffer primer comprising a third nucleotide tag, a first barcode nucleotide sequence, and a first nucleotide tag-specific portion; and

a reverse, stuffer primer comprising a second nucleotide tag-specific portion, a first barcode nucleotide sequence, and a fourth nucleotide tag; and

(iii) the outer primer pair comprises:

a forward, outer primer comprising a second barcode nucleotide sequence and a third nucleotide tag-specific portion; and

a reverse, outer primer comprising a fourth nucleotide tag-specific portion and a second barcode nucleotide sequence, wherein the outer primers are in excess of the stuffer primers and the stuffer primers are in excess of the inner primers. Each first reaction mixture is subjected to a reaction to produce a plurality of tagged target nucleotide sequences, each comprising 5'-second barcode nucleotide sequence-third nucleotide tag sequence-first barcode nucleotide sequence-first nucleotide tag sequence-target nucleotide sequence-second nucleotide tag sequence-first barcode nucleotide sequence-fourth nucleotide tag sequence-second barcode nucleotide sequence-3'. In a related embodiment, the invention provides a kit comprising a polymerase in combination with the inner primers of (i) above, the stuffer primers of (ii) above, and the outer primers of (iii) above, wherein the outer primers are in excess of the stuffer primers and the stuffer primers are in excess of the inner primers.
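The element order of the tagged products in the two-primer-pair and three-primer-pair schemes above follows from nesting: each successive primer pair adds its new elements outside those already present, while its tag-specific portion overlaps the existing tag rather than duplicating it. The following sketch reproduces the two layouts as label strings. The function name and the short labels (`tag1`, `BC1`, and so on) are shorthand introduced here for illustration and do not appear in the application.

```python
def product_layout(layers, target="target"):
    """Build the 5'->3' element order of a tagged product.

    layers: list of (five_prime_elements, three_prime_elements) tuples,
    innermost primer pair first. Tag-specific portions are omitted because
    they overlap a tag that is already present.
    """
    elements = [target]
    for five_prime, three_prime in layers:
        elements = list(five_prime) + elements + list(three_prime)
    return "5'-" + "-".join(elements) + "-3'"

# Inner + outer primer pairs (barcodes carried by both pairs).
two_pair = product_layout([
    (["tag1", "BC1"], ["BC1", "tag2"]),   # inner primers
    (["BC2"], ["BC2"]),                   # outer primers
])

# Inner + stuffer + outer primer pairs.
three_pair = product_layout([
    (["tag1"], ["tag2"]),                 # inner primers
    (["tag3", "BC1"], ["BC1", "tag4"]),   # stuffer primers
    (["BC2"], ["BC2"]),                   # outer primers
])
```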

In particular embodiments, the present invention provides methods for combinatorial tagging of a plurality of target nucleotide sequences. The method employs a plurality of tagged target nucleotide sequences derived from target nucleic acids, each tagged target nucleotide sequence comprising an endonuclease site and a first barcode nucleotide sequence, wherein the tagged target nucleotide sequences in the plurality comprise the same endonuclease site but N different first barcode nucleotide sequences, wherein N is an integer greater than 1. The method includes cleaving the plurality of tagged target nucleotide sequences with an endonuclease specific for the endonuclease site to generate a plurality of sticky-ended tagged target nucleotide sequences. The method further comprises ligating, in a first reaction mixture, a plurality of adaptors comprising a second barcode nucleotide sequence and complementary sticky ends to the plurality of sticky-ended tagged target nucleotide sequences, wherein the plurality of adaptors comprises M different second barcode nucleotide sequences, wherein M is an integer greater than 1. This ligation produces a plurality of combinatorially tagged target nucleotide sequences, each comprising first and second barcode nucleotide sequences, wherein the plurality comprises N x M different combinations of the first and second barcodes. In related embodiments, the invention provides a plurality of adaptors comprising:

a plurality of first adaptors, each comprising the same endonuclease site, a first primer binding site, a sticky end, and one of N different barcode nucleotide sequences, wherein N is an integer greater than 1;

a second adaptor comprising a second primer binding site and a sticky end; and

a plurality of third adaptors comprising second barcode nucleotide sequences and sticky ends complementary to those generated upon cleavage of said first adaptors at said endonuclease sites, wherein the plurality of third adaptors comprises M different second barcode nucleotide sequences, wherein M is an integer greater than 1. Also contemplated is a kit comprising a plurality of first adaptors, second adaptors, and a plurality of third adaptors in combination with a ligase and/or an endonuclease specific for the endonuclease site in the first adaptors.
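The N x M count in combinatorial tagging is simply the number of ordered pairs drawn from the two barcode sets, so a small number of barcode reagents can label many samples. A minimal sketch follows; the barcode sequences are hypothetical placeholders, not sequences from the application.

```python
from itertools import product

def combinatorial_barcodes(first_barcodes, second_barcodes):
    """Return every (first, second) barcode pair, N x M in total."""
    return list(product(first_barcodes, second_barcodes))

# Hypothetical barcode sets: N = 3 first barcodes, M = 2 second barcodes.
first_barcodes = ["ACGT", "TGCA", "GATC"]
second_barcodes = ["AAGG", "CCTT"]

combos = combinatorial_barcodes(first_barcodes, second_barcodes)
# N + M = 5 barcode reagents yield N x M = 6 distinguishable labels.
```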

In other embodiments, the invention provides methods for combinatorial tagging of a plurality of target nucleotide sequences, wherein the methods comprise annealing a plurality of barcode primers to a plurality of tagged target nucleotide sequences derived from the target nucleic acid. Each tagged target nucleotide sequence comprises a nucleotide tag at one end and a first barcode nucleotide sequence, wherein a plurality of tagged target nucleotide sequences comprise the same nucleotide tag but N different first barcode nucleotide sequences, wherein N is an integer greater than 1. Each barcode primer comprises:

a first tag-specific portion linked to

a second barcode nucleotide sequence linked to

a second tag-specific portion, wherein the plurality of barcode primers each comprise the same first and second tag-specific portions but M different second barcode nucleotide sequences, wherein M is an integer greater than 1. The method further comprises amplifying the tagged target nucleotide sequences in the first reaction mixture to produce a plurality of combinatorially tagged target nucleotide sequences, each comprising first and second barcode nucleotide sequences, wherein the plurality comprises N x M different first and second barcode combinations. In related embodiments, the invention provides a kit comprising one or more nucleotide tags together with the plurality of barcode primers above, the nucleotide tags being useful for producing the tagged target nucleotide sequences.

In certain embodiments, the invention provides an assay method for detecting a plurality of target nucleic acids, the method comprising preparing M first reaction mixtures to be pooled prior to assay, wherein M is an integer greater than 1. Each first reaction mixture comprises:

a sample nucleic acid;

a first, forward primer comprising a target-specific portion;

a first, reverse primer comprising a target-specific portion, wherein the first, forward primer or the first, reverse primer further comprises a barcode nucleotide sequence, and wherein the barcode nucleotide sequence in each of the M reaction mixtures is different. A first reaction is performed on each first reaction mixture to produce a plurality of barcoded target nucleotide sequences, each comprising a target nucleotide sequence linked to a barcode nucleotide sequence. The method further includes pooling the barcoded target nucleotide sequences from each of the M first reaction mixtures to form a test pool. A second reaction is then performed on the test pool, or on one or more aliquots thereof, using unique second primer pairs, wherein each second primer pair comprises:

a second, forward or reverse primer that anneals to the target nucleotide sequence; and

a second, reverse or forward primer that anneals to the barcode nucleotide sequence. The method then includes determining, for each unique second primer pair, whether a reaction product is present in the test pool or an aliquot thereof, whereby the presence of the reaction product indicates the presence of a particular target nucleic acid in a particular first reaction mixture.
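The decoding step of this pooled assay can be sketched as a lookup: each second primer pair is identified by the target and the barcode it anneals to, and a detected product is traced back through the barcode to the first reaction mixture that carried it. The data structures, names, and example results below are hypothetical and serve only to illustrate the logic.

```python
def decode_pool(results, barcode_to_mixture):
    """Map detected products back to their first reaction mixtures.

    results: {(target, barcode): True/False product detected} for each
             unique second primer pair.
    barcode_to_mixture: {barcode: first reaction mixture identifier}.
    Returns {mixture: set of targets detected in that mixture}.
    """
    detected = {}
    for (target, barcode), present in results.items():
        if present:
            mixture = barcode_to_mixture[barcode]
            detected.setdefault(mixture, set()).add(target)
    return detected

# Hypothetical example with M = 2 first reaction mixtures.
barcode_to_mixture = {"BC_A": "mixture_1", "BC_B": "mixture_2"}
results = {
    ("target_X", "BC_A"): True,
    ("target_X", "BC_B"): False,
    ("target_Y", "BC_A"): True,
    ("target_Y", "BC_B"): True,
}
calls = decode_pool(results, barcode_to_mixture)
```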

In a specific embodiment, a variation of this assay method for detecting a plurality of target nucleic acids includes preparing M first reaction mixtures to be pooled prior to the assay, where M is an integer greater than 1, and each first reaction mixture comprises:

a sample nucleic acid;

a first, forward primer comprising a target-specific portion;

a first, reverse primer comprising a target-specific portion, wherein the first, forward primer or the first, reverse primer further comprises a nucleotide tag; and

at least one barcode primer comprising a barcode nucleotide sequence and a nucleotide tag-specific portion, wherein the barcode primer is in excess of the first, forward and/or first, reverse primers, and wherein the barcode nucleotide sequence in each of the M reaction mixtures is different. A first reaction is performed on each first reaction mixture to generate a plurality of barcoded target nucleotide sequences, each comprising a target nucleotide sequence linked to a nucleotide tag linked to a barcode nucleotide sequence. The method further includes pooling the barcoded target nucleotide sequences from each of the M first reaction mixtures to form a test pool. A second reaction is then performed on the test pool, or on one or more aliquots thereof, using unique second primer pairs, wherein each second primer pair comprises:

a second, forward or reverse primer that anneals to a target nucleotide sequence; and

a second, reverse or forward primer that anneals to the barcode nucleotide sequence. The method then includes determining, for each unique second primer pair, whether a reaction product is present in the test pool or an aliquot thereof, whereby the presence of the reaction product indicates the presence of a particular target nucleic acid in a particular first reaction mixture.

In certain embodiments, the invention provides methods and kits useful for amplifying one or more target nucleic acids in preparation for applications such as bidirectional nucleic acid sequencing. In some embodiments, the methods of the invention comprise additionally performing bidirectional DNA sequencing.

In particular bidirectional embodiments, the methods include amplifying, tagging, and barcoding a plurality of target nucleic acids in a plurality of samples. The nucleotide tag sequence may comprise primer binding sites that may be used to aid in amplification and/or DNA sequencing. The barcode nucleotide sequence may encode information about the amplification product, such as the identity of the sample from which the amplification product was obtained.
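Because the barcode nucleotide sequence encodes the sample of origin, pooled sequencing reads can be sorted back to their samples by reading the barcode. The sketch below assumes, purely for illustration, that the barcode occupies a fixed number of bases at the start of each read and that only exact matches are accepted; the read sequences, barcode sequences, and sample names are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample, barcode_length):
    """Group reads by sample using an exact-match barcode prefix.

    Reads whose prefix matches no known barcode are discarded.
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]
        sample = barcode_to_sample.get(barcode)
        if sample is not None:
            by_sample[sample].append(insert)
    return dict(by_sample)

# Hypothetical barcodes and reads.
barcode_to_sample = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTAAACCC", "NNNNGGGGGG"]
sorted_reads = demultiplex(reads, barcode_to_sample, barcode_length=4)
```

Exact matching is the simplest design choice; tolerating sequencing errors in the barcode would require barcodes chosen with sufficient pairwise distance, which this sketch does not address.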

In certain bidirectional embodiments, the method of amplifying a target nucleic acid comprises amplifying the target nucleic acid using:

an inner primer set, wherein the set comprises:

an inner, forward primer comprising a target-specific portion and a first primer binding site;

an inner, reverse primer comprising a target-specific portion and a second primer binding site, wherein the first and second primer binding sites are different;

a first outer primer set, wherein the set comprises:

a first outer, forward primer comprising a portion specific for a first primer binding site; and

a first outer, reverse primer comprising a barcode nucleotide sequence and a portion specific for a second primer binding site;

a second outer primer set, wherein the set comprises:

a second outer, forward primer comprising a barcode nucleotide sequence and a portion specific for the first primer binding site; and

a second outer, reverse primer comprising a portion specific for the second primer binding site. This amplification produces two target amplicons, wherein:

the first target amplicon comprises 5'-first primer binding site-target nucleotide sequence-second primer binding site-barcode nucleotide sequence-3'; and

the second target amplicon comprises 5'-barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-3'. In variations of these embodiments, the barcode nucleotide sequence in each target amplicon is the same, and each target amplicon comprises only one barcode nucleotide sequence.

In some bidirectional embodiments, the first and second primer binding sites are binding sites for DNA sequencing primers. The outer primers may optionally each additionally comprise a further nucleotide sequence, wherein:

the first outer, forward primer comprises a first additional nucleotide sequence and the first outer, reverse primer comprises a second additional nucleotide sequence; and

the second outer, forward primer comprises a second additional nucleotide sequence and the second outer, reverse primer comprises the first additional nucleotide sequence; and the first and second additional nucleotide sequences are different. In such embodiments, the amplification produces two target amplicons, wherein:

the first target amplicon comprises: 5'-first additional nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-barcode nucleotide sequence-second additional nucleotide sequence-3'; and

the second target amplicon comprises: 5'-second additional nucleotide sequence-barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first additional nucleotide sequence-3'. In a specific embodiment, the first and/or second additional nucleotide sequence comprises a primer binding site. In an exemplary embodiment, the first set of outer primers comprises PE1-CS1 and PE2-BC-CS2, and the second set of outer primers comprises PE1-CS2 and PE2-BC-CS1 (Table 1, Example 9).

In certain bidirectional embodiments, amplification is performed in a single amplification reaction. In other embodiments, amplification comprises using the inner primers in a first amplification reaction and the outer primers in a second amplification reaction, wherein the second amplification reaction is separate from the first amplification reaction. In a variation of this latter embodiment, the second amplification reaction comprises two separate amplification reactions, wherein one amplification reaction employs the first outer primer set and the other amplification reaction employs the second outer primer set. The target amplicons produced in the two separate second amplification reactions can optionally be pooled.

In any of the above bidirectional embodiments, the method can comprise amplifying a plurality of target nucleic acids. The plurality of target nucleic acids may be, for example, genomic DNA, cDNA, fragmented DNA, DNA reverse transcribed from RNA, a DNA library, or nucleic acids extracted or amplified from a cell, body fluid, or tissue sample. In particular embodiments, the plurality of target nucleic acids are amplified from a formalin-fixed, paraffin-embedded tissue sample.

Any of the above bidirectional methods may further comprise sequencing the target amplicons. For example, where the target amplicons generated as described above comprise additional nucleotide sequences, the method may comprise an additional amplification using primers that bind the first and second additional nucleotide sequences to generate templates for DNA sequencing. In a particular embodiment, one or both of the primers that bind the first and second additional nucleotide sequences are immobilized on a substrate. In particular embodiments, the amplification that generates DNA sequencing templates may be performed by isothermal nucleic acid amplification. In certain embodiments, the method comprises performing DNA sequencing using the templates and primers that bind to the first and second primer binding sites and prime sequencing of the target nucleotide sequence; these primers are preferably present in substantially equal amounts. In some embodiments, the method comprises performing DNA sequencing using the templates and primers that bind to the first and second primer binding sites and prime sequencing of the barcode nucleotide sequence; these primers are preferably present in substantially equal amounts. In particular embodiments, the primers that prime sequencing of the barcode nucleotide sequence are the reverse complements of the primers that prime sequencing of the target nucleotide sequence. In an illustrative embodiment, the primers used to prime sequencing of the target nucleotide sequence and the barcode nucleotide sequence comprise CS1, CS2, CS1rc, and CS2rc (Table 2, Example 9).
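The relationship between a target-sequencing primer and the corresponding barcode-sequencing primer (for example CS1 and CS1rc) is a reverse complement: the second primer binds the same site on the opposite strand and extends in the opposite direction. A minimal sketch of that operation follows. The sequence used is a placeholder and is not the actual CS1 sequence, which is given in the table cited above and is not reproduced here.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(_COMPLEMENT)[::-1]

# Placeholder sequence for illustration only; NOT the actual CS1 primer.
placeholder_primer = "AACCGGTTAC"
placeholder_rc = reverse_complement(placeholder_primer)
```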

In any of the above bidirectional embodiments, the barcode nucleotide sequence can be selected to avoid substantial annealing to the target nucleic acid. In certain embodiments, the barcode nucleotide sequence identifies a particular sample.

In some embodiments, when bidirectional DNA sequencing is performed according to the methods described above, at least 50% of the sequences determined from DNA sequencing are present at greater than 50% of the average copy number of the sequences and less than 2 times the average copy number of the sequences. In certain embodiments, at least 70% of the sequences determined from DNA sequencing are present at greater than 50% of the average copy number of the sequences and less than 2 times the average copy number of the sequences. In particular embodiments, at least 90% of the sequences determined from DNA sequencing are present at greater than 50% and less than 2 times the average copy number of the sequences.
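The uniformity criterion above can be computed directly from per-sequence copy numbers: take the mean, then count the sequences whose copy number lies strictly between 50% of the mean and 2 times the mean. The sketch below does this; the function name and the example counts are hypothetical and do not represent data from the application.

```python
def fraction_within_band(copy_numbers, low=0.5, high=2.0):
    """Fraction of sequences with copy number between low*mean and high*mean."""
    mean = sum(copy_numbers) / len(copy_numbers)
    in_band = [c for c in copy_numbers if low * mean < c < high * mean]
    return len(in_band) / len(copy_numbers)

# Hypothetical copy numbers for eight sequences (mean = 112.5).
counts = [100, 120, 80, 95, 30, 260, 110, 105]
uniformity = fraction_within_band(counts)

# Compare against the thresholds named in the text.
meets_50 = uniformity >= 0.50
meets_70 = uniformity >= 0.70
meets_90 = uniformity >= 0.90
```

In this example six of the eight sequences fall inside the band, so the set would satisfy the 50% and 70% criteria but not the 90% criterion.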

In any of the above bidirectional embodiments, the target amplicons can have an average length of less than 200 bases. In various embodiments, the first amplification (i.e., the amplification that produces the target amplicons) is performed in a volume ranging from about 1 picoliter to about 50 nanoliters, or from about 5 picoliters to about 25 nanoliters. In particular embodiments, the first amplification reaction is formed in, or distributed to, a separate compartment of a microfluidic device prior to amplification. Microfluidic devices may be, for example, those fabricated at least partially from elastomeric materials. In certain embodiments, the first amplification reaction is performed in a droplet.

Another aspect of the invention includes a kit that can be used to perform the bidirectional embodiments discussed above. In certain embodiments, the kit comprises:

a first outer primer set, wherein the set comprises:

a first outer, forward primer comprising a portion specific for a first primer binding site; and

a first outer, reverse primer comprising a barcode nucleotide sequence and a portion specific for a second primer binding site, wherein the first and second primer binding sites are different;

a second outer primer set, wherein the set comprises:

a second outer, forward primer comprising a barcode nucleotide sequence and a portion specific for the first primer binding site; and

a second outer, reverse primer comprising a portion specific for the second primer binding site. In particular embodiments, the first and second primer binding sites are binding sites for DNA sequencing primers. In a specific embodiment, the outer primers each additionally comprise an additional nucleotide sequence, wherein:

the first outer, forward primer comprises a first additional nucleotide sequence and the first outer, reverse primer comprises a second additional nucleotide sequence; and

the second outer, forward primer comprises a second additional nucleotide sequence and the second outer, reverse primer comprises a first additional nucleotide sequence, and the first and second additional nucleotide sequences are different. In an exemplary embodiment, the first set of outer primers comprises PE1-CS1 and PE2-BC-CS2, and the second set of outer primers comprises PE1-CS2 and PE2-BC-CS1 (Table 1, Example 9). In certain embodiments, the kit further comprises an inner primer set, wherein the set comprises:

an inner, forward primer comprising a target-specific portion and a first primer binding site; and

an inner, reverse primer comprising a target-specific portion and a second primer binding site. In some embodiments, the kit comprises a plurality of inner primer sets each specific for a different target nucleic acid.

Any of the above kits useful for performing the bidirectional embodiments can further comprise DNA sequencing primers that bind the first and second primer binding sites and prime sequencing of the target nucleotide sequence, and/or DNA sequencing primers that bind the first and second primer binding sites and prime sequencing of the barcode nucleotide sequence. In particular embodiments, the primers that bind to the first and second primer binding sites and prime sequencing of the barcode nucleotide sequence are the reverse complements of the primers that prime sequencing of the target nucleotide sequence. For example, primers used to prime sequencing of the target nucleotide sequence and barcode nucleotide sequence include CS1, CS2, CS1rc, and CS2rc (Table 2, Example 9).

In some embodiments, the invention also provides methods of detecting and/or quantifying the relative amounts of at least two different target nucleic acids in a nucleic acid sample. The method comprises generating first and second tagged target nucleotide sequences from first and second target nucleic acids in a sample, wherein:

the first tagged target nucleotide sequence comprises a first nucleotide tag; and

the second tagged target nucleotide sequence comprises a second nucleotide tag, wherein the first and second nucleotide tags are different. The tagged target nucleotide sequence is subjected to a first primer extension reaction using a first primer that anneals to a first nucleotide tag, and a second primer extension reaction using a second primer that anneals to a second nucleotide tag. The method further comprises detecting and/or quantifying a signal indicative of extension of the first primer, and a signal indicative of extension of the second primer, wherein the signal of a given primer is indicative of the presence, and/or relative amount, of the corresponding target nucleic acid.

Specifically, the present application provides the following:

1. A method of adding an adaptor molecule to each end of a plurality of target nucleic acids comprising sticky ends, the method comprising:

(a) annealing an adaptor molecule to the sticky end of a double-stranded target nucleic acid molecule to produce an annealed adaptor-target nucleic acid molecule, wherein the adaptor molecules are:

(i) hairpin structures each comprising:

an adaptor nucleotide sequence linked to

a nucleotide linker linked to

a nucleotide sequence capable of annealing to the adaptor nucleotide sequence and linked to a degenerate tail sequence; or

(ii) double-stranded or single-stranded molecules comprising on each strand:

a first adaptor nucleotide sequence linked to

a nucleotide linker linked to

a second adaptor nucleotide sequence; and

a degenerate tail sequence, wherein the double-stranded molecules each comprise two degenerate tail sequences as sticky ends;

(b) filling any gaps in the resulting annealed adaptor-target nucleic acid molecules; and

(c) ligating any adjacent nucleotide sequences in the annealed adaptor-target nucleic acid molecule to produce an adaptor-modified target nucleic acid molecule.

2. The method of item 1, wherein the plurality of DNA molecules comprises a DNA library to be sequenced.

3. The method of any one of the preceding items, wherein the method comprises generating the plurality of target nucleic acid molecules comprising sticky ends by fragmenting DNA molecules.

4. The method of item 3, wherein the DNA molecules are fragmented by digesting them with DNase.

5. The method of item 3, wherein fragmentation is followed by digestion of the fragmented DNA molecules with an enzyme to produce sticky ends.

6. The method of item 5, wherein the enzyme used to produce sticky ends comprises a strand-specific endonuclease that does not have polymerase activity under the conditions employed in the digestion.

7. The method of any one of the preceding items, wherein the sticky end of the double stranded target nucleic acid molecule is a 3' extension.

8. The method of any preceding item, wherein the degenerate tail sequence is at the 3' end of the adaptor molecule.

9. The method of any one of the preceding items, wherein the adaptor molecules are hairpin structures each comprising:

an adaptor nucleotide sequence linked to

a nucleotide linker linked to

a nucleotide sequence capable of annealing to the adaptor nucleotide sequence and linked to

a degenerate tail sequence.

10. The method of item 9, wherein the adaptor molecule comprises:

a first type of adaptor molecule comprising a first adaptor nucleotide sequence; and

a second type of adaptor molecule comprising a second adaptor nucleotide sequence different from the first adaptor nucleotide sequence.

11. The method of clause 9, wherein the ligating converts the annealed adaptor-target nucleic acid molecule into a single-stranded circular DNA molecule.

12. The method of any of items 1-8, wherein the adaptor molecules annealed in (a) are double stranded molecules, each comprising:

a first adaptor nucleotide sequence linked to

a nucleotide linker linked to

a second adaptor nucleotide sequence different from the first adaptor nucleotide sequence; and

sticky ends comprising a degenerate tail sequence.

13. The method of any one of items 1-8, wherein the adaptor molecules annealed in (a) are single-stranded molecules each comprising:

a first adaptor nucleotide sequence linked to

a nucleotide linker linked to

a second adaptor nucleotide sequence different from the first adaptor nucleotide sequence; and

a degenerate tail sequence.

14. The method of item 10, 12 or 13, wherein the first and second adaptor sequences comprise primer binding sites capable of being specifically bound by a DNA sequencing primer.

15. The method of clauses 12 or 13, wherein the ligating converts the annealed adaptor-target nucleic acid molecule into a double stranded circular DNA molecule.

16. The method of any one of the preceding items, wherein the nucleotide linker comprises a nucleotide sequence selected from the group consisting of: an endonuclease site, a barcode nucleotide sequence, an affinity tag, and any combination thereof.

17. The method of any one of the preceding items, wherein the nucleotide linker comprises a restriction enzyme site.

18. The method of any one of the preceding items, wherein the nucleotide linker comprises a restriction enzyme site and at least one barcode nucleotide sequence.

19. The method of clause 16, wherein the nucleotide linker comprises an endonuclease site, and the method further comprises digesting the single-stranded or double-stranded circular DNA molecule to produce a linear DNA molecule.

20. The method of clause 16, wherein the nucleotide linker comprises a restriction enzyme site, and the method further comprises digesting the double-stranded circular DNA molecule to produce a linear DNA molecule.

21. The method of item 19, wherein the linear DNA molecule comprises 5'-first portion of the nucleotide linker-second adaptor nucleotide sequence-first degenerate tail sequence-target nucleic acid molecule-second degenerate tail sequence-first adaptor nucleotide sequence-second portion of the nucleotide linker-3'.

22. The method of any preceding item, wherein the method comprises:

generating the plurality of target nucleic acid molecules comprising sticky ends by:

digesting a DNA molecule with DNase I to produce fragmented DNA molecules, followed by heat inactivation of said DNase I;

digesting the fragmented DNA molecules with a nuclease having 5' to 3' exonuclease activity in the absence of deoxynucleotides to produce a plurality of target nucleic acid molecules having sticky ends;

annealing the adaptors to the sticky ends of the plurality of target nucleic acid molecules, wherein the nucleotide linkers of the adaptors comprise endonuclease sites;

filling any gaps in the annealed adaptor-target nucleic acid molecules and ligating any adjacent nucleotide sequences in a single reaction comprising a polymerase and a ligase to produce a circular DNA molecule; and

digesting the circular DNA molecule with an endonuclease that cleaves at the endonuclease site to produce a linear DNA molecule.

23. The method of item 22, wherein the nuclease having 5' to 3' exonuclease activity is exonuclease III.

24. The method of any one of the preceding items, wherein the method further comprises sequencing the adaptor-modified target nucleic acid molecule.

25. A plurality of adaptor molecules, wherein the adaptor molecules are:

(i) hairpin structures each comprising:

an adaptor nucleotide sequence linked to

a nucleotide linker linked to

a nucleotide sequence capable of annealing to the adaptor nucleotide sequence and linked to a degenerate tail sequence; or

(ii) double-stranded or single-stranded molecules comprising on each strand:

a first adaptor nucleotide sequence linked to

a nucleotide linker linked to

a second adaptor nucleotide sequence; and

a degenerate tail sequence, wherein the double-stranded molecules each comprise two degenerate tail sequences as sticky ends.

26. The plurality of adaptor molecules of clause 25, wherein the adaptor molecules are hairpin structures each comprising:

an adaptor nucleotide sequence linked to

a nucleotide linker linked to

a nucleotide sequence capable of annealing to the adaptor nucleotide sequence and linked to

a degenerate tail sequence.

27. The plurality of adaptor molecules of item 26, wherein the adaptor molecules comprise:

a first type of adaptor molecule comprising a first adaptor nucleotide sequence; and

a second type of adaptor molecule comprising a second adaptor nucleotide sequence different from the first adaptor nucleotide sequence.

28. The plurality of adaptor molecules of item 25, wherein the adaptor molecules are double-stranded molecules, each comprising:

a first adaptor nucleotide sequence linked to

a nucleotide linker linked to

a second adaptor nucleotide sequence different from the first adaptor nucleotide sequence; and

sticky ends comprising a degenerate tail sequence.

29. The plurality of adaptor molecules of item 25, wherein the adaptor molecules are single-stranded molecules, each comprising:

a first adaptor nucleotide sequence linked to

a nucleotide linker linked to

a second adaptor nucleotide sequence different from the first adaptor nucleotide sequence; and

a degenerate tail sequence.

30. The plurality of adaptor molecules of any one of items 27-29, wherein the first and second adaptor nucleotide sequences comprise primer binding sites capable of being specifically bound by a DNA sequencing primer.

31. The plurality of adaptor molecules of any one of items 25-30, wherein the nucleotide linker comprises a nucleotide sequence selected from the group consisting of: an endonuclease site, a barcode nucleotide sequence, an affinity tag, and any combination thereof.

32. The plurality of adaptor molecules of any one of items 25-30, wherein the nucleotide linkers comprise a restriction enzyme site.

33. The plurality of adaptor molecules of any one of items 25-30, wherein the nucleotide linkers comprise a restriction enzyme site and at least one barcode nucleotide sequence.

34. A kit comprising a plurality of adaptor molecules according to any one of items 25 to 33 and one or more components selected from the group consisting of: a DNase, an exonuclease, an endonuclease, a polymerase, and a ligase.

35. A method of tagging a plurality of target nucleic acids with nucleotide sequences, the method comprising:

(a) preparing a first reaction mixture for each target nucleic acid, the first reaction mixture comprising an inner primer pair and an outer primer pair, wherein:

(i) the inner primer pair comprises:

a forward, inner primer comprising a first nucleotide tag, a first barcode nucleotide sequence, and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion, a first barcode nucleotide sequence, and a second nucleotide tag; and

(ii) the outer primer pair comprises:

a forward, outer primer comprising a second barcode nucleotide sequence and a first nucleotide tag-specific portion; and

a reverse, outer primer comprising a second nucleotide tag-specific portion and a second barcode nucleotide sequence;

wherein the outer primers are in excess of the inner primers; and

(b) reacting each first reaction mixture to produce a plurality of tagged target nucleotide sequences, each comprising 5'-second barcode nucleotide sequence-first nucleotide tag sequence-first barcode nucleotide sequence-target nucleotide sequence-first barcode nucleotide sequence-second nucleotide tag sequence-second barcode nucleotide sequence-3'.
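The element order recited in step (b) can be checked with a small string model. The Python sketch below is a simplified illustration, not the reaction itself: the sequences are made-up placeholders, the function names are hypothetical, and it assumes that elements contributed by the reverse primers appear on the top strand as reverse complements.

```python
# Watson-Crick complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tagged_product(target, tag1, tag2, barcode1, barcode2):
    """Top strand (5'->3') once inner and outer primers are both incorporated.

    Forward side:  barcode2 - tag1 - barcode1 (read as written).
    Reverse side:  the reverse primers read 5'-barcode2-tag2-barcode1-3' on the
                   bottom strand, so the top strand carries their reverse complement.
    """
    forward_side = barcode2 + tag1 + barcode1
    reverse_side = reverse_complement(barcode2 + tag2 + barcode1)
    return forward_side + target + reverse_side


# Placeholder sequences chosen only for illustration.
product_seq = tagged_product(
    target="ATGGCC", tag1="AAAA", tag2="CCCC", barcode1="GT", barcode2="TC"
)
```

Reading `product_seq` from the 5' end gives second barcode, first tag, first barcode, target, and then the same three elements in mirrored order, which matches the arrangement recited in step (b).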

36. The method of clause 35, wherein the reaction comprises amplification.

37. The method of clause 35, wherein the reaction comprises ligation.

38. The method of clause 35, wherein the outer primer further comprises first and second primer binding sites capable of being bound by a DNA sequencing primer.

39. The method of item 38, wherein the reacting produces a tagged target nucleotide sequence comprising 5'-first primer binding site-second barcode nucleotide sequence-first nucleotide tag sequence-first barcode nucleotide sequence-target nucleotide sequence-first barcode nucleotide sequence-second nucleotide tag sequence-second barcode nucleotide sequence-second primer binding site-3'.

40. A kit for performing the method of item 35, wherein the kit comprises a polymerase and:

(i) an inner primer pair comprising:

a forward, inner primer comprising a first nucleotide tag, a first barcode nucleotide sequence, and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion, a first barcode nucleotide sequence, and a second nucleotide tag; and

(ii) an outer primer pair comprising:

a forward, outer primer comprising a second barcode nucleotide sequence and a first nucleotide tag-specific portion; and

a reverse, outer primer comprising a second nucleotide tag-specific portion and a second barcode nucleotide sequence;

wherein the outer primers are in excess of the inner primers.

41. A method of tagging a plurality of target nucleic acids with nucleotide sequences, the method comprising:

(a) preparing a first reaction mixture for each target nucleic acid, the first reaction mixture comprising an inner primer pair, a stuffer primer pair, and an outer primer pair, wherein:

(i) the inner primer pair comprises:

a forward, inner primer comprising a first nucleotide tag and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion and a second nucleotide tag;

(ii) the stuffer primer pair comprises:

a forward, stuffer primer comprising a third nucleotide tag, a first barcode nucleotide sequence, and a first nucleotide tag-specific portion; and

a reverse, stuffer primer comprising a second nucleotide tag-specific portion, a first barcode nucleotide sequence, and a fourth nucleotide tag; and

(iii) the outer primer pair comprises:

a forward, outer primer comprising a second barcode nucleotide sequence and a third nucleotide tag-specific portion; and

a reverse, outer primer comprising a fourth nucleotide tag-specific portion and a second barcode nucleotide sequence;

wherein the outer primers are in excess of the stuffer primers, which are in excess of the inner primers; and

(b) reacting each first reaction mixture to produce a plurality of tagged target nucleotide sequences, each comprising 5'-second barcode nucleotide sequence-third nucleotide tag sequence-first barcode nucleotide sequence-first nucleotide tag sequence-target nucleotide sequence-second nucleotide tag sequence-first barcode nucleotide sequence-fourth nucleotide tag sequence-second barcode nucleotide sequence-3'.

42. The method of clause 41, wherein the reaction comprises amplification.

43. The method of clause 41, wherein the reaction comprises ligation.

44. The method of clause 41, wherein the outer primer further comprises first and second primer binding sites capable of being bound by a DNA sequencing primer.

45. The method of item 44, wherein the reacting produces a tagged target nucleotide sequence comprising 5'-first primer binding site-second barcode nucleotide sequence-third nucleotide tag sequence-first barcode nucleotide sequence-first nucleotide tag sequence-target nucleotide sequence-second nucleotide tag sequence-first barcode nucleotide sequence-fourth nucleotide tag sequence-second barcode nucleotide sequence-second primer binding site-3'.

46. A kit for performing the method of item 41, wherein the kit comprises a polymerase and:

(i) an inner primer pair comprising:

a forward, inner primer comprising a first nucleotide tag and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion and a second nucleotide tag;

(ii) a stuffer primer pair comprising:

a forward, stuffer primer comprising a third nucleotide tag, a first barcode nucleotide sequence, and a first nucleotide tag-specific portion; and

a reverse, stuffer primer comprising a second nucleotide tag-specific portion, a first barcode nucleotide sequence, and a fourth nucleotide tag; and

(iii) an outer primer pair comprising:

a forward, outer primer comprising a second barcode nucleotide sequence and a third nucleotide tag-specific portion; and

a reverse, outer primer comprising a fourth nucleotide tag-specific portion and a second barcode nucleotide sequence;

wherein the outer primers are in excess of the stuffer primers, and the stuffer primers are in excess of the inner primers.

47. The kit of clauses 40 or 46, wherein the outer primer further comprises first and second primer binding sites capable of being bound by a DNA sequencing primer.

48. A method of combinatorial tagging of a plurality of target nucleotide sequences, the method comprising:

cleaving a plurality of tagged target nucleotide sequences derived from a target nucleic acid with an endonuclease specific for an endonuclease site, each tagged target nucleotide sequence comprising the endonuclease site and a first barcode nucleotide sequence, wherein the plurality of tagged target nucleotide sequences comprise the same endonuclease site but N different first barcode nucleotide sequences, wherein N is an integer greater than 1, to produce a plurality of sticky-ended, tagged target nucleotide sequences;

ligating a plurality of adaptors comprising a second barcode nucleotide sequence and complementary sticky ends to the plurality of sticky-ended, tagged target nucleotide sequences in a first reaction mixture, wherein the plurality of adaptors comprises M different second barcode nucleotide sequences, wherein M is an integer greater than 1, to produce a plurality of combinatorially tagged target nucleotide sequences, each comprising a first and a second barcode nucleotide sequence, wherein the plurality comprises N x M different combinations of first and second barcodes.
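The N x M count follows from pairing every first barcode with every second barcode. The Python sketch below illustrates only that arithmetic; the barcode sequences and the function name `combinatorial_tags` are hypothetical placeholders.

```python
from itertools import product


def combinatorial_tags(first_barcodes, second_barcodes):
    """Pair each of N first barcodes with each of M second barcodes."""
    return list(product(first_barcodes, second_barcodes))


first = ["AAC", "GTT", "CCA"]   # N = 3 placeholder first barcodes
second = ["TG", "CA"]           # M = 2 placeholder second barcodes
tags = combinatorial_tags(first, second)
```

With N = 3 and M = 2 the list holds 6 distinct pairs, so only N + M barcode reagents are needed to label N x M products.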

49. The method of clause 48, wherein the endonuclease site is adjacent to the first barcode nucleotide sequence in the tagged target nucleotide sequence.

50. The method of clause 49, wherein the second barcode nucleotide sequence is adjacent to the complementary sticky end in the adaptor.

51. The method of any of items 48-50, wherein the combinatorially tagged target nucleotide sequences comprise the first and second barcode nucleotide sequences separated by less than 5 nucleotides.

52. The method of any of items 48-51, wherein the tagged target nucleotide sequence comprises a first and a second primer binding site, having an arrangement selected from:

5'-endonuclease site-first barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-3'; and

5'-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-endonuclease site-3'.

53. The method of clause 52, wherein the first and second primer binding sites are binding sites for DNA sequencing primers.

54. The method of item 52 or 53, wherein the combinatorially tagged target nucleotide sequence comprises:

5'-second barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-3'; or

5'-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-second barcode nucleotide sequence-3'.

55. The method of any of clauses 48-54, wherein the tagged target nucleotide sequences are prepared by ligating an adaptor to a plurality of target nucleic acids, wherein the adaptor comprises:

a first adaptor comprising the endonuclease site, the first barcode nucleotide sequence, the first primer binding site, and a sticky end; and

a second adaptor comprising a second primer binding site and a sticky end.

56. The method of item 52 or 53, wherein the tagged target nucleotide sequence comprises a first additional nucleotide sequence having an arrangement selected from the group consisting of:

5'-endonuclease site-first barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first additional nucleotide sequence-3'; and

5'-first additional nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-endonuclease site-3'.

57. The method of item 56, wherein the adaptor comprises a second additional nucleotide sequence and has the following arrangement: 5'-second additional nucleotide sequence-second barcode nucleotide sequence-complementary sticky end-3'.

58. The method of item 57, wherein the combinatorially tagged target nucleotide sequences comprise:

5'-second additional nucleotide sequence-second barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first additional nucleotide sequence-3'; and

5'-second additional nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-second barcode nucleotide sequence-first additional nucleotide sequence-3'.

59. The method of clauses 56-58, wherein the first and/or second additional nucleotide sequence comprises a primer binding site.

60. The method of any of clauses 56-59, wherein the tagged target nucleotide sequences are prepared by ligating an adaptor to a plurality of target nucleic acids, wherein the adaptor comprises:

a first adaptor comprising the endonuclease site, the first barcode nucleotide sequence, the first primer binding site, and a sticky end; and

a second adaptor comprising a first additional nucleotide sequence, a second primer binding site, and a sticky end.

61. A plurality of adaptors comprising:

a plurality of first adaptors, each comprising the same endonuclease site, a first barcode nucleotide sequence, a first primer binding site, and a sticky end, wherein the plurality of first adaptors comprises N different first barcode nucleotide sequences, wherein N is an integer greater than 1;

a second adaptor comprising a second primer binding site and a sticky end; and

a plurality of third adaptors comprising second barcode nucleotide sequences and sticky ends complementary to those generated upon cleavage of said first adaptor at said endonuclease site, wherein said plurality of third adaptors comprises M different second barcode nucleotide sequences, wherein M is an integer greater than 1.

62. A kit comprising the adaptors according to item 61 and one or more components selected from the group consisting of: an endonuclease specific for the endonuclease site in the first adaptor, and a ligase.

63. A method for combinatorial tagging of a plurality of target nucleotide sequences, the method comprising:

annealing a plurality of barcode primers to a plurality of tagged target nucleotide sequences derived from the target nucleic acid, wherein:

each tagged target nucleotide sequence comprises a nucleotide tag at one end and a first barcode nucleotide sequence, wherein the plurality of tagged target nucleotide sequences comprises the same nucleotide tag but N different first barcode nucleotide sequences, wherein N is an integer greater than 1; and

each barcode primer comprises:

a first tag-specific portion linked to

a second barcode nucleotide sequence linked to

a second tag-specific portion;

wherein each of the plurality of barcode primers comprises identical first and second tag-specific portions but M different second barcode nucleotide sequences, wherein M is an integer greater than 1; and

amplifying the tagged target nucleotide sequences in a first reaction mixture to produce a plurality of combinatorially tagged target nucleotide sequences, each comprising a first and a second barcode nucleotide sequence, wherein the plurality comprises N x M different first and second barcode combinations.

64. The method of clause 63, wherein the first barcode nucleotide sequence and the nucleotide tag are separated by the target nucleotide sequence.

65. The method of item 63 or 64, wherein the first tag-specific portion of the barcode primer anneals to the 5' portion of the nucleotide tag, the second tag-specific portion of the barcode primer anneals to the adjacent 3' portion of the nucleotide tag, and the second barcode nucleotide sequence does not anneal to the nucleotide tag, thereby forming a loop between the annealed first and second tag-specific portions.

66. The method of any of items 63-65, wherein the tagged nucleotide sequence further comprises a primer binding site between the target nucleotide sequence and the first barcode nucleotide sequence.

67. The method of any of items 63-66, wherein the first and second tag-specific portions of the barcode primer are sufficiently long to serve as primer binding sites.

68. The method of clauses 66 or 67, wherein the binding site is a binding site for a DNA sequencing primer.

69. The method of any of items 66-68, wherein the combinatorially tagged target nucleotide sequence comprises 5'-first tag-specific portion-second barcode nucleotide sequence-second tag-specific portion-target nucleotide sequence-primer binding site-first barcode nucleotide sequence-3'.

70. The method of any of items 66-69, wherein the tagged target nucleotide sequence comprises a first additional nucleotide sequence having the following arrangement: 5'-nucleotide tag-target nucleotide sequence-primer binding site-first barcode nucleotide sequence-first additional nucleotide sequence-3'.

71. The method of item 70, wherein the barcode primer further comprises a second additional nucleotide sequence and has the following arrangement: 5'-second additional nucleotide sequence-first tag-specific portion-second barcode nucleotide sequence-second tag-specific portion-3'.

72. The method of item 71, wherein the combinatorially tagged target nucleotide sequence comprises 5'-second additional nucleotide sequence-first tag-specific portion-second barcode nucleotide sequence-second tag-specific portion-target nucleotide sequence-primer binding site-first barcode nucleotide sequence-first additional nucleotide sequence-3'.

73. The method of clauses 70-72, wherein the first and/or second additional nucleotide sequence comprises a primer binding site.

74. The method of any one of items 66-73, wherein the nucleotide tag comprises a transposon end that is incorporated into the tagged target nucleotide sequence using a transposase.

75. The method of any of items 35-74, wherein the reaction mixtures are prepared in separate compartments of a microfluidic device, the separate compartments being arranged in an array defined by rows and columns, wherein the combination of the first and second barcode nucleotide sequences in each tagged target nucleotide sequence identifies the row and column of the compartment from which the tagged target nucleotide sequence originates.
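Decoding a compartment from a barcode pair amounts to two table lookups. The Python sketch below assumes, purely for illustration, that first barcodes were assigned per row and second barcodes per column; the item does not state which barcode maps to which axis, and the lookup tables and the name `compartment_of` are hypothetical.

```python
def compartment_of(first_barcode, second_barcode, row_of, column_of):
    """Return the (row, column) of the compartment a tagged sequence came from.

    Assumption: first barcodes index rows and second barcodes index columns.
    """
    return row_of[first_barcode], column_of[second_barcode]


# Placeholder assignments for a 2-row by 3-column array.
row_of = {"AAC": 0, "GTT": 1}
column_of = {"TG": 0, "CA": 1, "GA": 2}

origin = compartment_of("GTT", "GA", row_of, column_of)
```

Because the barcode pair carries the address, products from all compartments can be pooled and still be traced back after sequencing.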

76. The method of clause 75, wherein the method comprises recovering the tagged target nucleotide sequence from the microfluidic device.

77. The method of item 76, wherein said recovering comprises pooling tagged target nucleotide sequences from compartments in a row or column to produce a test pool.

78. The method of item 77, wherein the method comprises further analyzing the test pool by nucleic acid amplification.

79. The method of item 77, wherein the method comprises further analyzing the test pool by nucleic acid sequencing.

80. The method of item 78 or 79, wherein the further analysis comprises determining whether a tagged target nucleotide sequence comprising a selected combination of first and second barcode nucleotide sequences is present in the test pool or an aliquot thereof.

81. A kit for performing the method of item 63, comprising:

one or more nucleotide tags; and

a plurality of barcode primers, wherein each barcode primer comprises:

a first portion, specific for the first portion of the nucleotide tag, attached to;

a barcode nucleotide sequence, which does not anneal to the nucleotide tag, linked to;

a second portion specific for a second portion of the nucleotide tag;

wherein each of the plurality of barcode primers comprises identical first and second tag-specific portions but M different second barcode nucleotide sequences, wherein M is an integer greater than 1.

82. The kit of item 81, wherein the nucleotide tag comprises a transposon end, and the kit further comprises a transposase.

83. The kit of clauses 81 or 82, wherein the kit further comprises a polymerase.

84. An assay method for detecting a plurality of target nucleic acids, the method comprising:

preparing M first reaction mixtures to be pooled before testing, wherein M is an integer greater than 1, and each first reaction mixture comprises:

a sample nucleic acid;

a first, forward primer comprising a target-specific portion;

a first, reverse primer comprising a target-specific portion, wherein said first, forward primer or said first, reverse primer further comprises a barcode nucleotide sequence, wherein each barcode nucleotide sequence in each of said M reaction mixtures is different;

performing a first reaction on each first reaction mixture to produce a plurality of barcoded target nucleotide sequences, each comprising a target nucleotide sequence linked to a barcode nucleotide sequence;

for each of the M first reaction mixtures, pooling the barcoded target nucleotide sequences to form a test pool;

performing a second reaction on the test pool or one or more aliquots thereof using unique second primer pairs, wherein each second primer pair comprises:

a second, forward or reverse primer that anneals to the target nucleotide sequence; and

a second, reverse or forward primer that anneals to a barcode nucleotide sequence; and

for each unique, second primer pair, determining whether a reaction product is present in the test pool or aliquot thereof, whereby the presence of a reaction product is indicative of the presence of a particular target nucleic acid in a particular first reaction mixture.
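The logic of the pooled assay can be sketched as set membership: each barcoded product records both its target and its source mixture, and each second primer pair queries one such combination. The Python model below is a deliberately idealized illustration (no amplification bias or cross-reactivity); the data, target names, and function names are hypothetical.

```python
def first_reactions(mixtures):
    """Barcode every target present in each mixture, then pool the products.

    `mixtures` maps a mixture-specific barcode to the set of targets present.
    """
    return {
        (target, barcode)
        for barcode, targets in mixtures.items()
        for target in targets
    }


def second_reactions(test_pool, targets, barcodes):
    """Query the pool with one (target primer, barcode primer) pair per combination."""
    return {(t, b): (t, b) in test_pool for t in targets for b in barcodes}


# Placeholder data: M = 3 mixtures, each with its own barcode.
mixtures = {"BC1": {"geneA"}, "BC2": {"geneA", "geneB"}, "BC3": set()}
test_pool = first_reactions(mixtures)
calls = second_reactions(test_pool, ["geneA", "geneB"], ["BC1", "BC2", "BC3"])
```

A positive call for a given target and barcode indicates that the target was present in the mixture that carried that barcode, even though all products were tested together.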

85. An assay method for detecting a plurality of target nucleic acids, the method comprising:

preparing M first reaction mixtures to be pooled before testing, wherein M is an integer greater than 1, and each first reaction mixture comprises:

a sample nucleic acid;

a first, forward primer comprising a target-specific portion;

a first, reverse primer comprising a target-specific portion, wherein the first, forward primer or the first, reverse primer further comprises a nucleotide tag; and

at least one barcode primer comprising a barcode nucleotide sequence and a nucleotide tag specific portion, wherein the barcode primer is in excess of the first, forward and/or first, reverse primers, and wherein each barcode nucleotide sequence in each of the M reaction mixtures is different;

performing a first reaction on each first reaction mixture to produce a plurality of barcoded target nucleotide sequences, each comprising a target nucleotide sequence linked to a nucleotide tag linked to a barcode nucleotide sequence;

for each of the M reaction mixtures, pooling the barcoded target nucleotide sequences to form a test pool;

performing a second reaction on the test pool or one or more aliquots thereof using unique second primer pairs, wherein each second primer pair comprises:

a second, forward or reverse primer that anneals to a target nucleotide sequence; and

a second, reverse or forward primer that anneals to the barcode nucleotide sequence; and

for each unique, second primer pair, determining whether a reaction product is present in the test pool or aliquot thereof, whereby the presence of a reaction product is indicative of the presence of a particular target nucleic acid in a particular first reaction mixture.

86. The method of any of clauses 84 or 85, wherein the method further comprises preparing M x N first reaction mixtures, wherein N is an integer greater than 1, and:

each first reaction mixture comprises first, forward and reverse primer pairs specific for different target nucleic acids;

the pooling comprises preparing N test pools of M first reaction mixtures, wherein each barcoded target nucleotide sequence in a test pool comprises a different barcode nucleotide sequence; and

the second reaction is carried out in each of the N test pools, each test pool being separate from each other test pool.

87. The method of any one of clauses 84-86, wherein the first reaction comprises amplification.

88. The method of any of clauses 84-87, wherein the second reaction comprises amplification.

89. The method of any one of clauses 35-86, wherein each first reaction mixture comprises a target nucleic acid from a single particle.

90. The method of clause 89, wherein the single particles comprise particles selected from the group consisting of: cells, organelles, vesicles, chromosomes, chromosome fragments, and particles produced using chemical cross-linking agents.

91. The method of clause 89, wherein the single particle comprises a single cell.

92. The method of any of clauses 84-91, wherein the first reaction mixture is prepared in separate compartments of a microfluidic device, the separate compartments being arranged in an array defined by rows and columns.

93. The method of item 86, wherein:

the first reaction mixture is prepared in separate compartments of a microfluidic device, the separate compartments being arranged as an array defined by rows and columns;

each of the N test pools is obtained by pooling the first reaction mixtures in a row or column of the device; and

the barcode nucleotide sequence in each barcoded target nucleotide sequence, together with the identity of the test pool, identifies the row and column of the compartment from which the barcoded target nucleotide sequence originates.

94. The method of any of clauses 84-93, wherein the second reaction mixture is prepared in separate compartments of a microfluidic device, the separate compartments being arranged in an array defined by rows and columns.

95. The method of clause 94, wherein the first reaction mixture is prepared in a separate compartment of a first microfluidic device and the second reaction mixture is prepared in a separate compartment of a second microfluidic device, wherein the second microfluidic device is different from the first microfluidic device.

96. The method of clauses 92 or 93, wherein the particles to be analyzed are partitioned into compartments of the device.

97. The method of clause 96, wherein the number of particles in each compartment is determined by brightfield microscopy or fluorescence microscopy.

98. The method of clause 96, wherein a stain, dye, or label is used to detect the number of particles in each compartment.

99. The method of clause 98, wherein the particles are cells and a cell membrane permeable nucleic acid dye is used to detect the number of cells in each compartment.

100. The method of clause 98, wherein the particles are cells and a labeled antibody specific for a cell surface marker is used to detect the number of cells in each compartment.

101. The method of item 96, wherein the method comprises:

determining whether any compartment contains more than a single particle; and

not further analyzing, or discarding, results from any compartment containing more than a single particle.

102. The method of item 96, wherein the method comprises:

determining whether any compartment contains a particle with a particular characteristic; and

further analyzing or considering only the results from any compartment containing a particle with the particular characteristic.

103. The method of item 102, wherein the characteristic is selected from the group consisting of: a particular genomic rearrangement, copy number variation, or polymorphism; expression of a particular gene; and expression of a particular protein.

104. The method of any of clauses 89-103, wherein at least one reaction is performed with whole particles.

105. The method of any of clauses 89-103, wherein at least one reaction is performed with disrupted particles.

106. The method of any of clauses 89-105, wherein the particles are treated with a bioresponse-eliciting agent prior to performing the plurality of first reactions.

107. The method of any one of clauses 84-106, wherein said determining whether a reaction product is present in the assay pool or an aliquot thereof is performed by polymerase chain reaction (PCR).

108. The method of any one of items 84-106, wherein said determining whether a reaction product is present in said assay pool or an aliquot thereof is performed by ligase chain reaction (LCR).

109. The method of any of clauses 84-108, wherein the determining whether a reaction product is present in the assay pool or an aliquot thereof is performed by a real-time assay.

110. The method of clause 109, wherein said determining whether a reaction product is present in said assay pool or an aliquot thereof is performed by an assay that uses a labeled probe and an unlabeled probe, wherein simultaneous hybridization of said probes to a reaction product results in formation of a flap at the 5' end of said labeled probe, and cleavage of said flap generates a signal.

111. The method of item 108, wherein cleavage of the flap separates a fluorophore from a quencher.

112. The method of item 108, wherein a portion of the labeled probe is specific for a barcode nucleotide sequence in the reaction product.

113. The method of clause 108, wherein the determining whether a reaction product is present in the assay pool or an aliquot thereof is performed by melting temperature analysis comprising detecting at a temperature at which:

the reaction product is substantially double-stranded and capable of generating a signal in the presence of the double-stranded DNA binding dye; and

the primer is substantially single-stranded and is incapable of generating a signal in the presence of the double-stranded DNA binding dye.

114. The method of any one of clauses 84-113, wherein the method comprises sequencing the reaction product.

115. The method of any one of clauses 35-114, wherein the method comprises determining the amount of at least one target nucleic acid in the first reaction mixture.

116. The method of any one of clauses 35-115, wherein the method comprises determining the copy number of one or more DNA molecules in the first reaction mixture.

117. The method of item 116, wherein the tagged or barcoded target nucleotide sequence is produced by less than 20 cycles of PCR.

118. The method of any one of clauses 35-117, wherein the method comprises determining a genotype at one or more loci in the first reaction mixture.

119. The method of any one of clauses 35-118, wherein the method comprises determining a haplotype of a plurality of loci in the first reaction mixture.

120. The method of item 119, wherein the method comprises:

condensing the chromosomes; and

partitioning the chromosomes into first reaction mixtures to produce a plurality of first reaction mixtures each comprising an individual chromosome;

wherein the reacting comprises sequencing a plurality of loci in each of the first reaction mixtures.

121. The method of any one of clauses 35-120, wherein the method comprises determining the expression level of one or more RNA molecules in the first reaction mixture.

122. The method of item 121, wherein the tagged or barcoded target nucleotide sequence is produced by RT-PCR in less than 20 cycles.

123. The method of any one of clauses 35-122, wherein the method comprises determining the nucleotide sequence of one or more RNA molecules in the first reaction mixture.

124. The method of any of items 35-123, wherein the method comprises:

performing a plurality of reactions in each first reaction mixture, wherein one of the plurality of reactions comprises amplification to produce a tagged or barcoded target nucleotide sequence;

analyzing the results of the plurality of reactions; and

correlating the results of the analysis with each first reaction mixture.

125. The method of clause 124, wherein the results of the analysis indicate the presence of two or more of the changes selected from the group consisting of:

a change in copy number;

mutation;

a change in expression level;

a splice variant;

wherein the presence of the two or more changes is associated with a phenotype.

126. The method of item 125, wherein said phenotype is selected from the group consisting of risk, presence, severity, prognosis, and responsiveness to a particular treatment.

127. The method of item 126, wherein said phenotype comprises resistance to a drug.

128. The method of clause 124, wherein the results of said analysis indicate the occurrence of genomic recombination.

129. The method of clause 124, wherein the results of said analysis indicate co-expression of a particular splice variant.

130. The method of clause 124, wherein the results of the analysis indicate co-expression of a particular light chain and heavy chain in a B cell.

131. The method of clause 124, wherein the results of the analysis indicate the presence of a particular pathogen in a particular host cell.

132. A method of amplifying a target nucleic acid, the method comprising:

amplifying the target nucleic acid using:

an inner primer set, wherein the set comprises:

an inner, forward primer comprising a target-specific portion and a first primer binding site;

an inner, reverse primer comprising a target-specific portion and a second primer binding site, wherein the first and second primer binding sites are different;

a first outer primer set, wherein the set comprises:

a first outer, forward primer comprising a portion specific for the first primer binding site; and

a first outer, reverse primer comprising a barcode nucleotide sequence and a portion specific for the second primer binding site;

a second outer primer set, wherein the set comprises:

a second outer, forward primer comprising a barcode nucleotide sequence and a portion specific for the first primer binding site; and

a second outer, reverse primer comprising a portion specific for the second primer binding site;

to produce two target amplicons, wherein:

the first target amplicon comprises 5'-first primer binding site-target nucleotide sequence-second primer binding site-barcode nucleotide sequence-3'; and

the second target amplicon comprises 5'-barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-3'.
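The two amplicon layouts recited in this item can be written out as ordered element lists. The sketch below is illustrative only; the function name and the `barcode_on` argument are hypothetical, and the element names are copied from the item.

```python
# Elements shared by both amplicons, listed 5' to 3'.
CORE = [
    "first primer binding site",
    "target nucleotide sequence",
    "second primer binding site",
]
BARCODE = "barcode nucleotide sequence"


def amplicon_layout(barcode_on):
    """Return the 5'->3' element order of the target amplicon.

    barcode_on is "reverse" when the outer reverse primer carries the
    barcode (first outer primer set) and "forward" when the outer
    forward primer carries it (second outer primer set).
    """
    if barcode_on == "reverse":
        return CORE + [BARCODE]
    if barcode_on == "forward":
        return [BARCODE] + CORE
    raise ValueError("barcode_on must be 'forward' or 'reverse'")


first = amplicon_layout("reverse")
second = amplicon_layout("forward")
print("5'-" + "-".join(first) + "-3'")
print("5'-" + "-".join(second) + "-3'")
```

Each layout contains exactly one barcode element, and the barcode sits at opposite ends of the two amplicons.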

133. The method of clause 132, wherein the barcode nucleotide sequence in each target amplicon is the same, and wherein each target amplicon comprises only one barcode nucleotide sequence.

134. The method of clause 132, wherein the first and second primer binding sites are binding sites for DNA sequencing primers.

135. The method of any one of clauses 132-134, wherein the outer primers each further comprise an additional nucleotide sequence, wherein:

the first outer, forward primer comprises a first additional nucleotide sequence and the first outer, reverse primer comprises a second additional nucleotide sequence; and

said second outer, forward primer comprises said second additional nucleotide sequence and said second outer, reverse primer comprises said first additional nucleotide sequence; and

the first and second additional nucleotide sequences are different; and

the amplification produces two target amplicons, wherein:

the first target amplicon comprises 5'-first additional nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-barcode nucleotide sequence-second additional nucleotide sequence-3'; and

the second target amplicon comprises 5'-second additional nucleotide sequence-barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first additional nucleotide sequence-3'.

136. The method of clause 135, wherein the first or second additional nucleotide sequence comprises a primer binding site.

137. The method of clause 136, wherein the first and second additional nucleotide sequences comprise primer binding sites.

138. The method of items 135-137, wherein the first set of outer primers comprises PE1-CS1 and PE2-BC-CS2, and the second set of outer primers comprises PE1-CS2 and PE2-BC-CS1 (Table, example 9).

139. The method of clause 132, wherein the amplification is performed in a single amplification reaction.

140. The method of clause 132, wherein said amplifying comprises employing said inner primer in a first amplification reaction and employing said outer primer in a second amplification reaction, wherein said second amplification reaction is different from said first amplification reaction.

141. The method of clause 140, wherein the second amplification reaction comprises two separate amplification reactions, wherein one amplification reaction employs the first outer primer set and the other amplification reaction employs the second outer primer set.

142. The method of clause 141, wherein target amplicons produced in the two separate amplification reactions are pooled.

143. The method of any of clauses 132-142, wherein the method comprises amplifying a plurality of target nucleic acids.

144. The method of clause 143, wherein the plurality of target nucleic acids is selected from the group consisting of: genomic DNA, cDNA, fragmented DNA, reverse transcribed DNA from RNA, DNA libraries, and nucleic acids extracted or amplified from cell, body fluid, or tissue samples.

145. The method of clause 144, wherein the plurality of target nucleic acids are amplified from a formalin-fixed, paraffin-embedded tissue sample.

146. The method of any of clauses 132-145, wherein the method further comprises sequencing the target amplicon.

147. The method of clause 143, wherein the method comprises amplifying the target amplicon with a primer that binds the first and second additional nucleotide sequences to generate a template for DNA sequencing.

148. The method of clause 147, wherein one or both of the primers that bind to the first and second additional nucleotide sequences is immobilized on a substrate.

149. The method of clauses 143 or 147, wherein the amplifying is performed by isothermal nucleic acid amplification.

150. The method of item 147, wherein the method comprises DNA sequencing with the template and a primer that binds to the first and second primer binding sites and primes sequencing of the target nucleotide sequence.

151. The method of clause 150, wherein the primers that bind the first and second primer binding sites and prime sequencing of the target nucleotide sequence are present in substantially equal amounts.

152. The method of item 147, wherein the method comprises DNA sequencing with the template and a primer that binds to the first and second primer binding sites and primes sequencing of the barcode nucleotide sequence.

153. The method of clause 152, wherein the primers that bind to the first and second primer binding sites and prime sequencing of the barcode nucleotide sequence are present in substantially equal amounts.

154. The method of clause 150, wherein the method comprises DNA sequencing with the template and a primer that binds to the first and second primer binding sites and primes sequencing of the barcode nucleotide sequence, wherein the primer is the reverse complement of the primer that primes sequencing of the target nucleotide sequence.

155. The method of item 154, wherein the primers used to prime sequencing of the target nucleotide sequence and barcode nucleotide sequence comprise CS1, CS2, CS1rc, and CS2rc (table 4, example 9).

156. The method of any of items 132-155, wherein the barcode nucleotide sequence is selected to avoid substantial annealing to the target nucleic acid.

157. The method of any one of items 132-156, wherein the barcode nucleotide sequence identifies a specific sample.

158. The method of clause 147, wherein at least 50% of the sequences determined from DNA sequencing are present at greater than 50% and less than 2 times the average copy number of the sequences.

159. The method of clause 147, wherein at least 70% of the sequences determined from DNA sequencing are present at greater than 50% and less than 2 times the average copy number of the sequences.

160. The method of clause 147, wherein at least 90% of the sequences determined from DNA sequencing are present at greater than 50% and less than 2 times the average copy number of the sequences.
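The coverage-uniformity criterion in items 158-160 (the fraction of sequences present at greater than 50% and less than 2 times the average copy number) can be computed as in the sketch below. The function name and the example counts are hypothetical and serve only to illustrate the arithmetic.

```python
def fraction_within_range(copy_numbers, low=0.5, high=2.0):
    """Fraction of sequences whose copy number lies strictly between
    low * mean and high * mean of all copy numbers."""
    if not copy_numbers:
        raise ValueError("copy_numbers must not be empty")
    mean = sum(copy_numbers) / len(copy_numbers)
    within = [c for c in copy_numbers if low * mean < c < high * mean]
    return len(within) / len(copy_numbers)


# Illustrative read counts per sequence; not data from the source.
counts = [100, 120, 80, 30, 250, 110]
fraction = fraction_within_range(counts)
print(f"{fraction:.0%} of sequences fall within 0.5x-2x of the mean")
```

For these example counts the mean is 115, so the accepted window is 57.5 to 230 and four of the six sequences fall inside it.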

161. The method of any one of clauses 132-160, wherein the target amplicon has an average length of less than 200 bases.

162. The method of any of clauses 132-161, wherein the first amplification reaction is performed in a volume ranging from about 1 picoliter to about 50 nanoliters.

163. The method of any of clauses 132-162, wherein the first amplification reaction is performed in a volume ranging from about 5 picoliters to about 25 nanoliters.

164. The method of any of clauses 132-163, wherein the first amplification reaction is formed in or distributed to a separate compartment of the microfluidic device prior to amplification.

165. The method of clause 164, wherein the microfluidic device is at least partially fabricated from an elastomeric material.

166. The method of any of clauses 132-163, wherein the first amplification reaction is performed in a droplet.

167. The method of clause 166, wherein the plurality of first amplification reactions are performed in droplets in an emulsion.

168. A kit for amplifying a target nucleic acid, the kit comprising:

a first outer primer set, wherein the set comprises:

a first outer, forward primer comprising a portion specific for a first primer binding site; and

a first outer, reverse primer comprising a barcode nucleotide sequence and a portion specific for a second primer binding site, wherein the first and second primer binding sites are different;

a second outer primer set, wherein the set comprises:

a second outer, forward primer comprising a barcode nucleotide sequence and a portion specific for the first primer binding site; and

a second outer, reverse primer comprising a portion specific for the second primer binding site.

169. The kit of clause 168, wherein the first and second primer binding sites are binding sites for DNA sequencing primers.

170. The kit of clauses 168 or 169, wherein each of said outer primers further comprises an additional nucleotide sequence, wherein:

the first outer, forward primer comprises a first additional nucleotide sequence and the first outer, reverse primer comprises a second additional nucleotide sequence; and

said second outer, forward primer comprises said second additional nucleotide sequence and said second outer, reverse primer comprises said first additional nucleotide sequence; and the first and second additional nucleotide sequences are different.

171. The kit of item 170, wherein the first set of outer primers comprises PE1-CS1 and PE2-BC-CS2 and the second set of outer primers comprises PE1-CS2 and PE2-BC-CS1 (table, example 9).

172. The kit of clause 168, further comprising:

an inner primer set, wherein the set comprises:

an inner, forward primer comprising a target-specific portion and the first primer binding site; and

an inner, reverse primer comprising a target-specific portion and the second primer binding site.

173. The kit of clause 172, comprising a plurality of inner primer sets each specific for a different target nucleic acid.

174. The kit of any one of items 168-172, further comprising a DNA sequencing primer that binds to the first and second primer binding sites and initiates sequencing of the target nucleotide sequence.

175. The kit of any one of items 168-174, further comprising a DNA sequencing primer that binds to the first and second primer binding sites and initiates sequencing of the barcode nucleotide sequence.

176. The kit of clause 175, wherein the primer that binds to the first and second primer binding sites and initiates sequencing of the barcode nucleotide sequence is the reverse complement of the primer that initiates sequencing of the target nucleotide sequence.

177. The kit of item 176, wherein the primers used to prime sequencing of the target nucleotide sequence and barcode nucleotide sequence comprise CS1, CS2, CS1rc, and CS2rc (table 4, example 9).

178. A method of detecting, and/or quantifying the relative amounts of, at least two different target nucleic acids in a nucleic acid sample, the method comprising:

generating first and second tagged target nucleotide sequences from first and second target nucleic acids in the sample, wherein:

the first tagged target nucleotide sequence comprises a first nucleotide tag; and

the second tagged target nucleotide sequence comprises a second nucleotide tag, wherein the first and second nucleotide tags are different;

subjecting the tagged target nucleotide sequences to:

a first primer extension reaction using a first primer annealed to the first nucleotide tag; and

a second primer extension reaction using a second primer annealed to the second nucleotide tag; and

detecting and/or quantifying:

a signal indicative of extension of the first primer; and

a signal indicative of extension of the second primer;

wherein the signal for a given primer is indicative of the presence, and/or relative amount, of the corresponding target nucleic acid.

179. The method of clause 178, wherein the first and second tagged target nucleotide sequences comprise an adaptor for DNA sequencing at each end of each molecule.

180. The method of clauses 178 or 179, wherein the first and second tagged target nucleotide sequences are generated by amplifying first and second target nucleic acids with first and second primer pairs, respectively, wherein at least one primer of the first primer pair comprises a first nucleotide tag and at least one primer of the second primer pair comprises a second nucleotide tag.

181. The method of clause 180, wherein one primer of each primer pair comprises 5'- (DNA sequencing adaptor) - (nucleotide tag) - (target-specific moiety) -3' and the other primer of each primer pair comprises 5'- (DNA sequencing adaptor) - (target-specific moiety) -3'.

182. The method of item 180, wherein the tagged target nucleotide sequence is further amplified prior to primer extension.

183. The method of clause 182, wherein said further amplification comprises emulsion amplification or bridge amplification.

184. The method of any one of items 178-183, wherein the first and second primer extension reactions are performed sequentially in at least two cycles of primer extension, wherein:

a first cycle of primer extension is performed with a first primer annealed to the first nucleotide tag;

a second cycle of primer extension is performed with a second primer annealed to the second nucleotide tag;

all deoxynucleoside triphosphates are provided in each cycle of primer extension;

the incorporation of any deoxynucleoside triphosphates into the DNA molecule produces a detectable signal; and

the signal detected in the first cycle is indicative of the presence, and/or relative amount, of the first target nucleic acid in the nucleic acid sample, and

the signal detected in the second cycle is indicative of the presence, and/or relative amount, of the second target nucleic acid in the nucleic acid sample.

185. The method of any of items 178-184, wherein the detectable signal comprises pyrophosphate release.

186. The method of clauses 184 or 185, wherein the tagged target nucleotide sequence is further amplified by emulsion PCR prior to primer extension.

187. The method of clauses 178 or 182, wherein the first and second primer extension reactions are performed by oligonucleotide ligation and detection, and wherein:

ligation of a labeled dibase oligonucleotide to the first and/or second primer generates a detectable signal; and

the total signal detected for a particular primer is indicative of the presence, and/or relative amount, of the corresponding target nucleic acid in the nucleic acid sample.

188. The method of clause 187, wherein said ligation of a labeled dibase oligonucleotide to said first primer and said ligation of a labeled dibase oligonucleotide to said second primer produce the same detectable signal, and said first and second primer extension reactions are performed separately.

189. The method of clause 188, wherein the first and second primer extension reactions are performed in consecutive cycles.

190. The method of clause 187, wherein said ligation of a labeled dibase oligonucleotide to said first primer and said ligation of a labeled dibase oligonucleotide to said second primer produce different detectable signals.

191. The method of clause 190, wherein the first and second primer extension reactions are performed simultaneously in one reaction mixture.

192. The method of any of items 187-191, wherein the detectable signal is a fluorescent signal.

193. The method of any of items 187-192, wherein the tagged target nucleotide sequence is further amplified by emulsion PCR prior to primer extension.

194. The method of clauses 178 or 182, wherein the first and second primer extension reactions comprise sequencing by synthesis, wherein:

each deoxynucleoside triphosphate is labeled with a different, base-specific label;

the incorporation of deoxynucleoside triphosphates into DNA molecules generates a base-specific detectable signal; and

the total signal detected for a particular primer is indicative of the presence, and/or relative amount, of the corresponding target nucleic acid in the nucleic acid sample.

195. The method of clause 194, wherein extension of the first primer produces the same detectable signal as extension of the second primer, and the first and second primer extension reactions are performed separately.

196. The method of clause 195, wherein the first and second primer extension reactions are performed in consecutive cycles.

197. The method of clause 194, wherein extension of the first primer produces a different detectable signal than extension of the second primer.

198. The method of clause 197, wherein the first and second primer extension reactions are performed simultaneously in one reaction mixture.

199. The method of any of clauses 194-197, wherein the detectable signal is a fluorescent signal.

200. The method of any of items 194-199, wherein the tagged target nucleotide sequence is further amplified by bridge PCR prior to primer extension.

201. The method of clause 182, wherein amplifying produces a clonal population of tagged target nucleotide sequences that are or become located at discrete reaction sites.

202. The method of clause 201, wherein the number of reaction sites comprising the first nucleotide tag relative to the number of reaction sites comprising the second nucleotide tag is indicative of the amount of the first target nucleic acid relative to the second target nucleic acid in the sample.

203. The method of items 201 or 202, wherein said detecting and/or quantifying comprises:

detecting and comparing the total signal of all reaction sites comprising the first nucleotide tag with the total signal of all reaction sites comprising the second nucleotide tag; or

detecting and comparing the number of reaction sites comprising the first nucleotide tag with the number of reaction sites comprising the second nucleotide tag.

204. The method of item 203, wherein the comparing comprises determining a ratio.
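The site-counting comparison in items 202-204 reduces to a ratio of counts. The sketch below assumes each reaction site has already been assigned to one nucleotide tag; the function name and the labels "tag1" and "tag2" are hypothetical.

```python
from collections import Counter


def tag_site_ratio(site_tags, first="tag1", second="tag2"):
    """Ratio of reaction sites carrying the first tag to sites carrying
    the second tag. site_tags holds one tag label per reaction site."""
    counts = Counter(site_tags)
    if counts[second] == 0:
        raise ValueError("no reaction sites carry the second tag")
    return counts[first] / counts[second]


# Illustrative site calls only; not data from the source.
sites = ["tag1", "tag1", "tag2", "tag1", "tag2"]
ratio = tag_site_ratio(sites)
print(ratio)  # 1.5
```

The same ratio could be formed from summed signal per tag instead of site counts, which corresponds to the other alternative recited in item 203.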

205. The method of items 179-187, 189-194 and 196-204, wherein the first nucleotide tag comprises a poly- (first nucleotide) sequence and the second nucleotide tag comprises a poly- (second nucleotide) sequence, wherein the first and second nucleotides are different.

206. The method of any of clauses 178-205, wherein the first and second target nucleic acids are selected from the group consisting of:

two different alleles of a polymorphic site;

a target nucleic acid that can be present in the nucleic acid sample at an altered copy number and a reference target nucleic acid that is expected to be present in the sample at a normal copy number;

a target nucleic acid on a single chromosome; and

target nucleic acids on different chromosomes.

207. The method of clause 206, wherein the first and second target nucleic acids comprise a mutant allele and a wild-type allele.

208. The method of any of items 178-207, comprising:

generating three or more tagged target nucleotide sequences from three or more target nucleic acids in the sample;

performing three or more primer extension reactions on the tagged target nucleotide sequences, each using a primer that anneals to a different nucleotide tag; and

detecting and/or quantifying a signal indicative of extension of each primer.

209. The kit of any one of items 34, 40, 47, 62, 81-83, and 168-177, wherein the kit further comprises a matrix-type microfluidic device.

Brief Description of Drawings

FIGS. 1A-1D: Schematic of the use of hairpin adaptor molecules to generate adaptor-modified target nucleic acid molecules, such as a library suitable for high-throughput DNA sequencing. (1A) Hairpin adaptor molecules, each comprising: an adaptor nucleotide sequence linked to a nucleotide linker, which is linked to a nucleotide sequence capable of annealing to the adaptor nucleotide sequence, which is in turn linked to a degenerate tail sequence; n = nucleotides; an optional specific enzyme cleavage site may be included in the nucleotide linker. (1B) Target nucleic acid molecule preparation can include fragmentation and digestion of the 5' ends to produce 3' sticky ends. (1C) Annealing, gap filling, and ligation. (1D) The resulting DNA can be conveniently linearized using an enzyme that cleaves within the linker.

FIGS. 2A-2D: Schematic of the use of double-stranded adaptor molecules to generate adaptor-modified target nucleic acid molecules, such as a library suitable for high-throughput DNA sequencing. (2A) Double-stranded adaptor molecules, each comprising on each strand: a first adaptor nucleotide sequence linked to a nucleotide linker, which is linked to a second adaptor nucleotide sequence; and a degenerate tail sequence, wherein each double-stranded molecule comprises two degenerate tail sequences as sticky ends; n = nucleotides; an optional specific enzyme cleavage site may be included in the nucleotide linker. (2B) Target nucleic acid molecule preparation can include fragmentation and digestion of the 5' ends to produce 3' sticky ends. (2C) Annealing, gap filling, and ligation. (2D) The resulting circular DNA can be conveniently linearized using an enzyme that cleaves within the linker.

FIG. 3: A four-primer, combinatorial barcoding approach can be used to place a combination of two barcodes on either end of each amplicon. The inner primers contain a target-specific portion ("TS-F" in the forward primer and "TS-R" in the reverse primer), a barcode nucleotide sequence ("bc2"), and different nucleotide tags. The outer primers contain tag-specific portions ("CS1" and "CS2"), a different barcode nucleotide sequence ("bc1"), and primer binding sites ("A" and "B") for sequencing primers.

FIG. 4: A six-primer, combinatorial barcoding approach can be used to place a combination of two barcodes on either end of each amplicon. The inner primers contain a target-specific portion ("TS-F" in the forward primer and "TS-R" in the reverse primer) and different nucleotide tags. The stuffer primers contain tag-specific portions ("CS1" and "CS2"), a barcode nucleotide sequence ("bc2"), and two additional, different nucleotide tags. The outer primers contain portions specific for the two additional nucleotide tags ("CS3" and "CS4"), a different barcode nucleotide sequence ("bc1"), and primer binding sites ("A" and "B") for sequencing primers.

FIGS. 5A-5B: A combinatorial ligation-based tagging method uses tagged target nucleotide sequences (5A) to generate combinatorially tagged target nucleotide sequences (5B). PE1, PE2 = Illumina sequencing flow cell binding sequences; Seq1, Seq2 = priming sites for sequencing; BC1, BC2 = barcode sequences. See example 2.

FIG. 6: Combinatorial insertional mutagenesis-based tagging for sequencing (e.g., Illumina sequencing). The barcode is inserted into the transposon tag sequence. TagA and TagB need to be long enough to prime sequencing. BC2 should contain a 4-base barcode plus 3 degenerate bases at the 5' end (e.g., NNNAGTC). Transposon end sequence: 5'-AGATGTGTATAAGAGACAG-3' (SEQ ID NO: 1). PE1, PE2 = Illumina sequencing flow cell binding sequences; BC1, BC2 = barcode sequences.
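The example pattern NNNAGTC above has three degenerate positions followed by four fixed bases. As a rough illustration of what that degeneracy implies, the sketch below enumerates every concrete sequence that matches such a pattern. The function name is hypothetical, and treating N as any of A, C, G, or T is the usual convention rather than something stated in the figure legend.

```python
from itertools import product


def expand_degenerate(pattern):
    """Expand every 'N' in pattern to A, C, G, and T; keep other bases."""
    choices = ["ACGT" if base == "N" else base for base in pattern]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("NNNAGTC")
print(len(variants))  # 64
print(variants[0])    # AAAAGTC
```

Three degenerate positions give 4^3 = 64 variants that all share the same fixed four-base portion.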

FIGS. 7A-7C: Barcoding and pooling reaction mixtures for subsequent analysis: generation of barcoded target nucleotide sequences. (7A) In an exemplary embodiment, cells are loaded at limiting dilution into an ACCESS ARRAY™ IFC ("integrated fluidic circuit", also referred to herein as a "chip"). The primer sets are loaded as shown, with each chamber in the chip receiving a complete set of 96 forward primers (F1-96) and 96 reverse primers (R1-96) for amplifying 96 targets. The reverse primers carry a tag that anneals to the barcode primer. Each chamber in a row of the chip receives a different barcode primer. (7B) Reverse transcription and preamplification are performed in the chip using the 3-primer method to generate barcoded target nucleotide sequences as described in example 5. Any given chamber will amplify all genes, and all amplicons will be tagged with a single barcode. The reaction products are exported as pools (at 90 degrees to the primer combinations, i.e., by sample). (7C) For detection, the pools may be loaded onto a DYNAMIC ARRAY™ IFC as shown; a forward primer (e.g., F1) is used to amplify a particular target nucleic acid, and a barcode primer (e.g., BC1) is used to amplify this sequence from a particular chamber of a particular pool (e.g., pool 1).

FIGS. 8A-8C: Barcoding and pooling reaction mixtures for subsequent analysis: exemplary strategies for amplifying/detecting barcoded target nucleotide sequences. (8A) An exemplary embodiment uses LCR to detect barcoded target nucleotide sequences having the following structure: 5'-forward primer sequence-target nucleotide sequence-reverse primer sequence-barcode nucleotide sequence-3'. In this case, one primer can anneal to the reverse primer sequence and the other primer can anneal to the adjacent barcode nucleotide sequence, followed by ligation and repeated cycles of annealing and ligation. (8B) Detection can be performed in real time using a flap endonuclease-ligase chain reaction. This reaction employs a labeled probe and an unlabeled probe, wherein simultaneous hybridization of the probes to the reaction product results in the formation of a flap at the 5' end of the labeled probe, and cleavage of the flap produces a signal. As shown, cleavage of the flap can separate a fluorophore from a quencher to generate the signal. (8C) An alternative real-time detection method that can be used, for example, to detect amplicons produced by LCR from a barcoded target nucleotide sequence having the structure: 5'-forward primer sequence-target nucleotide sequence-reverse primer sequence-barcode nucleotide sequence-3'. This method relies on a double-stranded DNA binding dye to detect the difference in melting temperature between the reaction product and the primers used in the LCR. Melting temperature analysis includes detection at a temperature ("high temperature") at which the reaction product is substantially double-stranded and capable of producing a signal in the presence of the double-stranded DNA binding dye, whereas the primers are substantially single-stranded and incapable of producing a signal.
For example, to detect a barcoded target nucleotide sequence having the following structure: 5'-forward primer sequence-target nucleotide sequence-reverse primer sequence-barcode nucleotide sequence-3', one primer can anneal to the reverse primer sequence and the other primer can anneal to the adjacent barcode nucleotide sequence, followed by ligation and repeated cycles of annealing and ligation. See FIG. 8C.

FIG. 9: schematic diagram of a unit cell structure ("MA006") for microfluidic devices suitable for cell manipulation, showing on-chip processes.

FIG. 10: limiting dilution of the cell suspension is used to obtain a single cell per individual reaction volume (a "chamber" of the microfluidic device, or "chip"). The theoretical distribution (Poisson distribution) for different cell densities is shown.
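
The Poisson model behind limiting dilution can be illustrated with a short calculation. The sketch below is not taken from the specification: the function names are hypothetical, and it assumes that the mean number of cells per chamber (cell density multiplied by chamber volume, neither of which is specified here) is supplied directly.

```python
import math


def poisson_pmf(k: int, mean_cells: float) -> float:
    """Probability that a chamber holds exactly k cells, given the
    average number of cells per chamber (mean_cells)."""
    return math.exp(-mean_cells) * mean_cells ** k / math.factorial(k)


def occupancy(mean_cells: float) -> dict:
    """Fractions of chambers expected to be empty, to hold a single
    cell, or to hold more than one cell."""
    p_empty = poisson_pmf(0, mean_cells)
    p_single = poisson_pmf(1, mean_cells)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


if __name__ == "__main__":
    # Hypothetical mean occupancies, chosen only for illustration.
    for mean in (0.1, 0.5, 1.0, 2.0):
        fractions = occupancy(mean)
        print(mean, {name: round(p, 3) for name, p in fractions.items()})
```

Under this model, lowering the mean occupancy reduces the fraction of chambers holding more than one cell, at the cost of more empty chambers.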

FIGS. 11A-11B: the results of cell counting in the chip using bright field imaging (11A) were compared to the theoretical distribution (11B). Based on bright field imaging, the cell density in the chip is close to but lower than the Poisson distribution, a trend that is exacerbated at higher cell densities.

FIGS. 12A-12B: fluorescent cell "ghost" images (12A) allow more cells to be detected than pre-PCR bright field imaging, so that the cell density more closely approximates the Poisson distribution (12B).

FIG. 13: specific methods that may be used to detect cells in the chip include, for example, detection using cell membrane-permeable nucleic acid stains and/or detection of cell-specific surface markers with antibodies. The results of these more specific methods are shown for a cell density of 1E6/ml.

FIGS. 14A-14B: (14A) Comparison of cells detected in the chip using a pre-RT-PCR nucleic acid stain (Syto10 DNA stain) with post-RT-PCR ghost images (cell ghosts). (14B) Syto10 did not inhibit RT-PCR of GAPDH.

FIG. 15: RT-PCR of GAPDH in the presence of 0.5% Tween 20 or 0.5% NP40 (the latter being a lytic reagent). Neither significantly inhibited RT-PCR of GAPDH.

FIG. 16: standard curve amplification of 11 genes performed in the MA006 chip. These results demonstrate that the CellsDirect™ One-Step qRT-PCR kit can be used with 0.5% NP40 (for cell lysis and to prevent depletion effects in the chip) to convert gene-specific RNA in cells into amplicons in the MA006 chip.

FIG. 17: a four-primer, combinatorial barcoding approach is used to place a combination of two barcodes on either end of each amplicon. The inner primers contain a target-specific portion ("TS-F" in the forward primer and "TS-R" in the reverse primer), a barcode nucleotide sequence ("bc2"), and different nucleotide tags. The outer primers contain tag-specific portions ("CS1" and "CS2"), a different barcode nucleotide sequence ("bc1"), and primer binding sites ("A" and "B") for sequencing primers.

FIGS. 18A-18B: illustration of how 4-primer barcoding can be performed on a chip such as MA006. (18A) Amplification is performed on the chip with the inner primers, where the chambers of each row have the same inner primer pair with the same barcode. (18B) The reaction products from each column of chambers can be harvested as pools and amplified using a different pair of outer primers for each pool. This amplification produces amplicons with a barcode combination at either end of the amplicon that uniquely identifies the chamber (by row and column) in which the initial amplification was performed.
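
The row/column logic of this combinatorial scheme can be sketched as a lookup table. The example below is illustrative only: the barcode sequences and the function name are hypothetical, and it assumes that one barcode identifies the row (inner primers) and the other identifies the column pool (outer primers).

```python
from itertools import product


def build_chamber_lookup(row_barcodes, column_barcodes):
    """Map each (row barcode, column barcode) pair to the (row, column)
    index of the chamber in which the initial amplification took place."""
    if len(set(row_barcodes)) != len(row_barcodes):
        raise ValueError("row barcodes must be distinct")
    if len(set(column_barcodes)) != len(column_barcodes):
        raise ValueError("column barcodes must be distinct")
    lookup = {}
    # Every row barcode is paired with every column barcode exactly once.
    for (row, bc_row), (col, bc_col) in product(
        enumerate(row_barcodes), enumerate(column_barcodes)
    ):
        lookup[(bc_row, bc_col)] = (row, col)
    return lookup


if __name__ == "__main__":
    # Hypothetical barcodes: 2 rows and 3 columns give 6 chambers.
    rows = ["AAC", "CCG"]
    columns = ["GTA", "TTC", "ACG"]
    lookup = build_chamber_lookup(rows, columns)
    print(len(lookup))
    print(lookup[("CCG", "ACG")])
```

With R row barcodes and C column barcodes, R + C barcodes distinguish R × C chambers, which is the economy that the combinatorial approach provides.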

FIG. 19: comparison of the results obtained from sequencing gene-specific amplicons from single cells (Example 5), expressed as the number of reads per gene-specific amplicon, with those obtained from total RNA. As is evident from the figure, the representation of these RNAs when measured in individual cells differs from that observed in total RNA.

FIGS. 20A-20B: capture sites with capture features and drains. (20A) Site without obstructions to focus the flow. (20B) Site with obstructions.

FIG. 21: additional capture site design.

FIGS. 22A-22C: the capture architecture can be designed to maximize the likelihood that the cell will contact the surface marker. For example, obstructions on one or more channel walls may be used to direct the beads toward the capture feature. (22A) Exemplary capture feature/obstruction combinations. (22B) The performance of the capture feature may be adjusted by adjusting one or more variables, including the angle of the obstruction, the distance of the obstruction from the capture site, the length of the obstruction, the size and shape of the capture feature, and the size of the drainage channel in the capture feature (if present). Obstructions on the channel wall serve to guide the beads toward the capture feature. (22C) The capture feature is paired with an obstruction on the channel wall; individual capture feature/obstruction combinations may be located on alternating walls to focus the flow toward adjacent capture feature/obstruction combinations.

FIGS. 23A-23B: strategies that utilize capture features to capture individual, affinity reagent-coated beads that subsequently display the affinity reagent (e.g., an antibody) to capture single particles (e.g., cells). (FIG. 23A, panel 1) The flow begins in a channel containing a capture feature. (FIG. 23A, panel 2) Antibody-bound beads flow toward the capture feature until a bead lodges in the capture feature. (FIG. 23A, panel 3) The channel is then washed to remove the uncaptured beads. (FIG. 23B, panel 1) Cells bearing the cell surface marker to which the antibody binds flow into the channel containing the captured beads. (FIG. 23B, panel 2) The marker-bearing cells interact with and bind to the antibody displayed by the captured bead. The display area is sized so that a bound cell will prevent other cells from interacting with the captured bead through steric occlusion, so that only one cell binds to each captured bead. (FIG. 23B, panel 3) The channel is then washed to remove unbound cells, leaving one immobilized cell at each capture site.

FIGS. 24A-24G: (24A) Schematic of a microfluidic device designed to capture single cells at discrete locations (niches). Single-cell capture allows analysis of biological events at the single-cell level. (24B) The flow is designed to be stronger over the niches than through the overflow channel. Each niche contains a small notch (~3 μm high). When a cell enters a niche, it plugs the niche and prevents any further flow into it. The flow then passes on to the next unoccupied niche until that niche is also plugged by a cell. One cell should be captured in each niche before cells pass through the overflow channel and out to waste. (24C) The schematic of (24A), with additional details provided in (24D-24F). (24D) The buffer inlet converges with the cell inlet, forcing the cells to the side of the feed channel closest to the series of transverse cell capture channels. (24E) The resistance of the transverse cell capture channels is lower than that of the cell overflow channel, so that the cell flow is directed preferentially into the niches rather than into the cell overflow channel. (24F) Each niche is large enough to capture only one cell. A cell in a niche raises the resistance of that particular circuit, and the flow is directed to the circuits without cells. (24G) The device of (24A), with captured human umbilical vein endothelial cells (HUVECs) located in the niches.

FIG. 25: the amplicon tagging strategy employed in Example 9. (A) Standard 4-primer amplicon tagging versus bidirectional sequencing amplicon tagging. The standard 4-primer amplicon tagging method incorporates paired-end Illumina sequencing primer annealing sites in consensus sequence tag 1 (CS1) and consensus sequence tag 2 (CS2). Sequencing both the 5' and 3' ends of each PCR product requires a paired-end sequencing run. (B) The target-specific primers are appended with the consensus sequence tags CS1 and CS2. The sample-specific primer pairs contain the consensus tag CS1 or CS2, appended in two permutations to the adaptor sequences used by the Genome Analyzer (PE1 and PE2). Two PCR product types are generated from the same target region: in the same sequencing read, product A allows sequencing of the 5' end of the target region and product B allows sequencing of the 3' end of the target region.

FIG. 26: overview of the segregated-primer PCR strategy used in Example 9. The first PCR, with target-specific primer pairs, is performed in an ACCESS ARRAY™ IFC. The pool of harvested PCR products is divided between two subsequent PCR reactions with sample-specific barcode primers. (A) The reaction that generates a product allowing sequencing of the 5' end of the target region utilizes the PE1_CS1 and PE2_BC_CS2 primer combination. (B) The reaction that generates a product allowing sequencing of the 3' end of the target region utilizes the PE1_CS2 and PE2_BC_CS1 primer combination.

FIG. 27: overview of the sequencing workflow used in Example 9. Both PCR product types are present on the flow cell. An equimolar mixture of CS1 and CS2 allows sequencing of the 5' end and the 3' end of the target region. The barcodes are sequenced after stripping and rehybridization of the clusters with an equimolar mixture of CS1rc and CS2rc. The sequencing primers CS1 and CS2 are provided in reagent FL1. The index primers CS1rc and CS2rc are provided in reagent FL2.

FIG. 28: bioanalyzer product obtained from barcoding reaction runs with barcodes from plate 1 and plate 2 in example 10.

FIG. 29: alternative sequencing primers used in Example 10. Using an equimolar mixture of all of the target-specific PCR primers used on the ACCESS ARRAY™ IFC as a pool of sequencing primers avoids sequencing through the uninformative target-specific primer regions.

FIG. 30: base-by-base coverage of the EGFR gene for one sample in Example 10. Reads from each strand are shown in different shades.

FIGS. 31A-31B: (31A) Before the 454 sequencing emulsion PCR reaction, allele-specific PCR on the target DNA is performed in one reaction. The forward primers carry a 454 adaptor and an allele-specific tag. Different tags are shown in different shades. This reaction produces amplicons ready for 454 bead emulsion PCR. (31B) Following emulsion PCR and loading onto the sequencer, the amplicons on the individual bead in each well are either wild type or mutant. The first 454 cycle flows the primer that binds to the wild-type tag (pink arrow) together with all dNTPs. As this primer is extended, multiple nucleotides are incorporated, giving a very robust signal, but only in wells with wild-type molecules. The second cycle flows all dNTPs and the mutant-tag primer, generating a signal only in wells with mutant molecules.

FIG. 32: agilent Bioanalyzer results from an interference experiment between Fluidigm and Illumina TruSeq sequencing primers on Illumina-generated libraries. The PCR reaction for each lane is as follows:

1. Illumina standard library + Fluidigm FL1 sequencing primer

2. Illumina standard library + Illumina TruSeq sequencing primer

3. Illumina standard library + Fluidigm FL1 and Illumina TruSeq sequencing primers

4. Illumina standard library + Illumina standard sequencing primers (control)

5. Illumina multiplex library + Fluidigm FL1 sequencing primer

6. Illumina multiplex library + Illumina TruSeq sequencing primers

7. Illumina multiplex library + Fluidigm FL1 and Illumina TruSeq sequencing primers

8. Illumina multiplex library + Illumina multiplex sequencing primers (control)

9. Illumina small RNA library + Fluidigm FL1 sequencing primer

10. Illumina small RNA library + Illumina TruSeq sequencing primers

11. Illumina small RNA library + Fluidigm FL1 and Illumina TruSeq sequencing primers

12. Illumina small RNA library + Illumina small RNA sequencing primers (control)

FIG. 33: from Fluidigm and Illumina TruSeq sequencing primer pair ACCESS ARRAYTMAgilent Bioanalyzer results of the interference experiments with IFC generated libraries. The PCR reaction for each lane is as follows:

1.Fluidigm ACCESS ARRAYTMIFC library + Fluidigm FL1 sequencing primer

2.Fluidigm ACCESS ARRAYTMIFC library + Illumina TruSeq sequencing primer

3.Fluidigm ACCESS ARRAYTMIFC library + Fluidigm FL 1 and Illumina TruSeq sequencing primers

Detailed description of the invention

For many applications, it is necessary or desirable to incorporate nucleic acid sequences into target nucleic acids, e.g., target nucleic acids derived from a sample, such as a biological sample. In certain embodiments, the incorporated sequences can aid in further analysis of the target nucleic acids. Thus, described herein are methods that can be used to incorporate one or more adaptors and/or nucleotide tags and/or barcode nucleotide sequences into one, or typically a plurality of, target nucleotide sequences. In particular embodiments, nucleic acid fragments having adaptors are generated, e.g., nucleic acid fragments suitable for use in high-throughput DNA sequencing. In other embodiments, information about the reaction mixture is encoded in the reaction product. For example, if nucleic acid amplification is performed in separate reaction volumes, it may be desirable to recover the contents for subsequent analysis, e.g., by PCR and/or nucleic acid sequencing. The contents of the separate reaction volumes can be analyzed separately and the results correlated to the initial reaction volumes. Alternatively, the particle/reaction volume identity may be encoded in the reaction product, e.g., as discussed below with respect to the multi-primer nucleic acid amplification methods. Furthermore, these two strategies can be combined: sets of separate reaction volumes are encoded such that each reaction volume within a set is uniquely identifiable, each set is then pooled, and each pool is then analyzed separately.

In certain embodiments, the invention provides amplification methods in which a barcode nucleotide sequence and additional nucleotide sequences that facilitate DNA sequencing are added to a target nucleotide sequence. The barcode nucleotide sequence can encode information, such as the sample origin, about the target nucleotide sequence to which it is attached. The added sequences can, for example, serve as binding sites for DNA sequencing primers. Barcoding target nucleotide sequences can increase the number of samples that can be analyzed for one or more targets in a single assay while minimizing the increase in assay cost. These methods are particularly suitable for increasing the efficiency of assays performed on microfluidic devices.

Definitions

Unless otherwise indicated, the terms used in the claims and specification are defined as set forth below. These terms are specifically defined for the sake of clarity, but all such definitions are consistent with how those terms would be understood by those skilled in the art.

The term "adjacent", when used herein to refer to two nucleotide sequences in a nucleic acid, can refer to nucleotide sequences that are separated by 0 to about 20 nucleotides, more specifically by about 1 to about 10 nucleotides, or to sequences that directly abut one another. As will be appreciated by those skilled in the art, two nucleotide sequences that are to be ligated together will typically directly abut one another.

The term "nucleic acid" refers to a polymer of nucleotides and, unless otherwise limited, includes known analogs of natural nucleotides that are capable of functioning (e.g., hybridizing) in a similar manner to naturally occurring nucleotides.

The term nucleic acid includes any form of DNA or RNA, including, for example, genomic DNA; complementary DNA (cDNA), which is a DNA representation of mRNA, typically obtained by reverse transcription of messenger RNA (mRNA) or by amplification; DNA molecules produced synthetically or by amplification; and mRNA.

The term nucleic acid includes double-stranded or triple-stranded nucleic acids, as well as single-stranded molecules. In double-stranded or triple-stranded nucleic acids, the nucleic acid strands need not be coextensive (i.e., the double-stranded nucleic acid need not be double-stranded along the entire length of both strands).

Double-stranded nucleic acids that are not double-stranded along the entire length of both strands have a 5' or 3' extension, referred to herein as a "sticky end" or "tail sequence". The term "sticky end" is typically used to refer to relatively short 5' or 3' extensions, such as those produced by restriction enzymes, while the term "tail sequence" is typically used to refer to longer 5' or 3' extensions.

The term "degenerate sequence" as used herein refers to a sequence in a plurality of molecules in which a plurality of different nucleotide sequences are present. For example, all possible sequences of a degenerate sequence may be present.

The term "degenerate tail sequence" is used to describe a tail sequence in a plurality of molecules, wherein the tail sequence has a plurality of different nucleotide sequences; for example, all possible different nucleotide sequences (one per tail) may be present in the plurality of molecules.
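
As an illustration of what "all possible different nucleotide sequences" means for a degenerate tail, the following sketch enumerates every sequence of a given length over the four DNA bases. The function name and the tail lengths are hypothetical and serve only to show how quickly the number of distinct tails grows.

```python
from itertools import product

DNA_BASES = "ACGT"


def all_degenerate_tails(length: int) -> list:
    """Return every possible DNA sequence of the given length,
    i.e., the full set of tails for a fully degenerate position run."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return ["".join(bases) for bases in product(DNA_BASES, repeat=length)]


if __name__ == "__main__":
    tails = all_degenerate_tails(2)
    print(tails)
    # The number of distinct tails is 4 ** length.
    for n in (2, 4, 6):
        print(n, len(all_degenerate_tails(n)))
```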

The term nucleic acid also includes any chemical modification thereof, for example by methylation and/or by capping. Nucleic acid modifications may include the addition of chemical groups that incorporate additional charge, polarizability, hydrogen bonding, electrostatic interaction, and functionality into individual nucleic acid bases or into the nucleic acid as a whole. Such modifications may include base modifications such as 2'-position sugar modifications, 5-position pyrimidine modifications, 8-position purine modifications, modifications at the exocyclic amines of cytosine, substitutions of 5-bromo-uracil, backbone modifications, unusual base pairing combinations such as the isobases isocytidine and isoguanine, and the like.

More specifically, in certain embodiments, nucleic acids may include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), and any other type of nucleic acid that is an N- or C-glycoside of a purine or pyrimidine base, as well as other polymers comprising non-nucleotide backbones, such as polyamide (e.g., peptide nucleic acids (PNAs)) and polymorpholino (commercially available from Anti-Virals, Inc., Corvallis, Oregon, as Neugene) polymers, and other synthetic sequence-specific nucleic acid polymers, provided that these polymers comprise nucleobases in a configuration that allows base pairing and base stacking, such as those found in DNA and RNA. The term nucleic acid also includes locked nucleic acids (LNAs), which are described in U.S. Patent Nos. 6,794,499, 6,670,461, 6,262,490, and 6,770,748, which are incorporated herein by reference in their entirety for their disclosure of LNAs.

Nucleic acids can be obtained from entirely chemical synthetic methods such as solid phase mediated chemical synthesis, from biological sources such as by isolation from any species from which the nucleic acid is produced, or from methods involving processing of the nucleic acid by molecular biological means such as DNA replication, PCR amplification, reverse transcription, or from combinations of these methods.

The order of elements in a nucleic acid molecule is generally described herein as 5 'to 3'. In the case of double-stranded molecules, the "upper" strand is typically shown from 5 'to 3' by convention, and the order of the elements is described herein with reference to the upper strand.

The term "target nucleic acid" as used herein refers to a specific nucleic acid to be detected in the method of the present invention.

The term "target nucleotide sequence" as used herein refers to a molecule comprising the nucleotide sequence of a target nucleic acid, such as, for example, an amplification product obtained by amplifying the target nucleic acid or a cDNA generated when an RNA target nucleic acid is reverse transcribed.

The term "complementary" as used herein refers to the ability to pair precisely between two nucleotides. That is, two nucleic acids are considered to be complementary to each other at a given position if the nucleotides of the two nucleic acids are capable of forming hydrogen bonds with the nucleotides of the other nucleic acid at that position. Complementarity between two single-stranded nucleic acid molecules may be "partial," in which only some of the nucleotides bind, or it may be complete when complete complementarity exists between the single-stranded molecules. The degree of complementarity between nucleic acid strands has a significant effect on the efficiency and strength of hybridization between nucleic acid molecules. A first nucleotide sequence is said to be the "complement" of a second sequence if the first nucleotide sequence is complementary to the second nucleotide sequence. A first nucleotide sequence is said to be the "reverse complement" of a second sequence if it is complementary to the sequence of the second sequence in reverse (i.e., the order of nucleotides is reversed).
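
The distinction between the "complement" and the "reverse complement" of a sequence can be made concrete with a minimal sketch. The function names are hypothetical, and the code assumes unambiguous, upper-case DNA sequences written 5' to 3'.

```python
# Watson-Crick pairing for unambiguous DNA bases.
_COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def complement(sequence: str) -> str:
    """Base-by-base complement, read in the same left-to-right order."""
    return sequence.translate(_COMPLEMENT_TABLE)


def reverse_complement(sequence: str) -> str:
    """Complement with the nucleotide order reversed, i.e., the
    opposite strand written 5' to 3'."""
    return complement(sequence)[::-1]


if __name__ == "__main__":
    seq = "ATGC"
    print(complement(seq))          # TACG
    print(reverse_complement(seq))  # GCAT
```

Applying `reverse_complement` twice returns the original sequence, which is a convenient sanity check when designing primers against either strand.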

By "specifically hybridizes" is meant that the nucleic acid binds to the target nucleotide sequence under defined stringent conditions and does not substantially bind to other nucleotide sequences present in the hybridization mixture. One of ordinary skill in the art will appreciate that relaxing the stringency of hybridization conditions will allow sequence mismatches to be tolerated.

In a specific embodiment, hybridization is performed under stringent hybridization conditions. The phrase "stringent hybridization conditions" generally refers to a temperature in the range of about 5 ℃ to about 20 ℃ or 25 ℃ below the melting temperature (Tm) for a particular sequence at a defined ionic strength and pH. As used herein, Tm is the temperature at which a population of double-stranded nucleic acid molecules becomes half-dissociated into single strands. Methods for calculating the Tm of nucleic acids are well known in the art (see, e.g., Berger and Kimmel (1987) METHODS IN ENZYMOLOGY, VOL. 152: GUIDE TO MOLECULAR CLONING TECHNIQUES, San Diego: Academic Press, Inc. and Sambrook et al. (1989) MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition, VOLS. 1-3, Cold Spring Harbor Laboratory, both of which are incorporated herein by reference). As indicated by standard references, when a nucleic acid is in an aqueous solution of 1 M NaCl, a simple estimate of the Tm value can be calculated by the equation Tm = 81.5 + 0.41 (% G + C) (see, e.g., Anderson and Young, Quantitative Filter Hybridization in NUCLEIC ACID HYBRIDIZATION (1985)). The melting temperature of a hybrid (and thus the conditions for stringent hybridization) is affected by a variety of factors, such as the length and nature of the primer or probe (DNA, RNA, base composition), the nature of the target nucleic acid (DNA, RNA, base composition, present in solution or immobilized, etc.), and the concentration of salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol). The effects of these factors are well known and are discussed in standard references in the art. Exemplary stringent conditions suitable for achieving specific hybridization of most sequences are a temperature of at least about 60 ℃ and a salt concentration of about 0.2 mol/liter at pH 7.
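
The simple estimate Tm = 81.5 + 0.41 (% G + C), stated for a nucleic acid in aqueous 1 M NaCl, and the stringent window of about 5 ℃ to about 25 ℃ below Tm can be worked through as follows. This is a minimal sketch: the function names and the example sequence are hypothetical, and the estimate deliberately ignores the other factors noted above (length, salt, formamide, and so on).

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(sequence: str) -> float:
    """Simple Tm estimate in degrees Celsius: 81.5 + 0.41 * (%G+C)."""
    return 81.5 + 0.41 * gc_percent(sequence)


def stringent_window(tm: float, upper_offset: float = 5.0,
                     lower_offset: float = 25.0) -> tuple:
    """Temperature range lying 5 to 25 degrees below Tm (offsets are
    the approximate bounds quoted in the text)."""
    return (tm - lower_offset, tm - upper_offset)


if __name__ == "__main__":
    example = "ATGCGGCTAACGTTAGCCGA"  # hypothetical sequence
    tm = estimate_tm(example)
    print(round(gc_percent(example), 1), round(tm, 1), stringent_window(tm))
```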

The term "oligonucleotide" is used to refer to a relatively short nucleic acid which is generally shorter than 200 nucleotides, more particularly shorter than 100 nucleotides, most particularly shorter than 50 nucleotides. Typically, the oligonucleotide is a single-stranded DNA molecule.

The term "adaptor" is used to refer to a nucleic acid that, in use, becomes attached to one or both ends of a nucleic acid. The adaptor may be single-stranded, double-stranded, or may comprise both single-stranded and double-stranded portions.

The term "primer" refers to an oligonucleotide that is capable of hybridizing to (also referred to as "annealing" with) a nucleic acid and serving as an initiation site for nucleotide (RNA or DNA) polymerization under suitable conditions (i.e., in the presence of four different nucleoside triphosphates and a polymerization agent, such as DNA or RNA polymerase or reverse transcriptase) in a suitable buffer and at a suitable temperature. The appropriate length of a primer depends on its intended use, but primers are typically at least 7 nucleotides in length, more typically ranging from 10 to 30 nucleotides, or even more typically from 15 to 30 nucleotides in length. Other primers may be somewhat longer, for example 30 to 50 nucleotides in length. In this context, "primer length" refers to the portion of an oligonucleotide or nucleic acid that hybridizes to a complementary "target" sequence and primes nucleotide synthesis. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with the template. The term "primer site" or "primer binding site" refers to the segment of the target nucleic acid to which a primer hybridizes.

A primer is said to anneal to another nucleic acid if it, or a portion thereof, hybridizes to a nucleotide sequence in the nucleic acid. The statement that a primer hybridizes to a particular nucleotide sequence is not intended to imply that the primer hybridizes completely or exclusively to that nucleotide sequence. For example, in certain embodiments, an amplification primer as used herein is said to "anneal to a nucleotide tag". This description includes primers that anneal completely to the nucleotide tag as well as primers that anneal partially to the nucleotide tag and partially to an adjacent nucleotide sequence, e.g., a target nucleotide sequence. Such hybrid primers can increase the specificity of the amplification reaction.

As used herein, selecting primers "so as to avoid substantial annealing to the target nucleic acid" means selecting primers such that the majority of amplicons detected after amplification are "full-length" in the sense that they result from priming at the desired site on each end of the target nucleic acid, as opposed to amplicons resulting from priming within the target nucleic acid, which produces shorter amplicons than desired. In various embodiments, the primers are selected such that at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the amplicons are full-length.

The term "primer pair" refers to a set of primers comprising a 5 '"upstream primer" or "forward primer" that hybridizes to the complement of the 5' end of the DNA sequence to be amplified, and a 3 '"downstream primer" or "reverse primer" that hybridizes to the 3' end of the sequence to be amplified. As one of ordinary skill in the art will appreciate, the terms "upstream" and "downstream" or "forward" and "reverse" in particular embodiments are not intended to be limiting, but rather to provide an illustrative direction.

In embodiments where two primer pairs are used, for example, in an amplification reaction, the primer pairs may be labeled as "inner" and "outer" primer pairs to indicate their relative positions; that is, the "inner" primer is incorporated into the reaction product (e.g., amplicon) at a position between the positions where the outer primer is incorporated.

In embodiments where three primer pairs are used, for example, in an amplification reaction, the term "stuffer primer" may be used to refer to a primer having a position between the inner and outer primers; that is, the "stuffer" primer is incorporated into the reaction product (e.g., amplicon) at an intermediate position between the inner and outer primers.

A primer pair is said to be "unique" if it can be used to specifically generate (e.g., amplify) a particular reaction product (e.g., amplicon) in a given reaction (e.g., amplification) mixture.

A "probe" is a nucleic acid capable of forming a double-stranded structure by binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, often through hydrogen bonding. The probe binds or hybridizes to a "probe binding site". The probe may be labeled with a detectable label to allow easy detection of the probe, particularly once the probe is hybridized to its complementary target. Alternatively, however, the probe may be unlabeled, but may be detectable by specific binding to a directly or indirectly labeled ligand. Probes can vary significantly in size. In general, probes are at least 7 to 15 nucleotides in length. Other probes are at least 20, 30, or 40 nucleotides in length. Still others are somewhat longer, at least 50, 60, 70, 80, or 90 nucleotides in length. Yet others are longer still, at least 100, 150, 200, or more nucleotides in length. A probe can also have any length within any range defined by any of the values above (e.g., 15 to 20 nucleotides in length).

The primer or probe may be fully complementary to the target nucleic acid sequence or may be less than fully complementary. In certain embodiments, the primer has at least 65% identity to the complement of the target nucleic acid sequence, and more often at least 75% identity, at least 85% identity, at least 90% identity, or at least 95%, 96%, 97%, 98%, or 99% identity over a sequence of at least 7 nucleotides, more typically over a sequence in the range of 10 to 30 nucleotides, and often over a sequence of at least 14-25 nucleotides. It will be appreciated that it is generally desirable that certain bases (e.g., the 3' base of a primer) be fully complementary to corresponding bases of a target nucleic acid sequence. Primers and probes typically anneal to the target sequence under stringent hybridization conditions.

The term "nucleotide tag" is used herein to refer to a predetermined nucleotide sequence that is added to a target nucleotide sequence. The nucleotide tag may encode information about the target nucleotide sequence, such as the identity of the target nucleotide sequence or the identity of the sample from which the target nucleotide sequence was derived. In certain embodiments, such information may be encoded into one or more nucleotide tags, e.g., a combination of two nucleotide tags, one on either end of the target nucleotide sequence may encode the identity of the target nucleotide sequence.

The term "affinity tag" as used herein refers to a moiety of a molecule that is specifically bound by a binding partner. The moiety may, but need not, be a nucleotide sequence. Specific binding can be used to facilitate affinity purification of affinity-tagged molecules.

The term "transposon end" refers to an oligonucleotide that is capable of being attached to a nucleic acid by a transposase.

The term "barcode primer" as used herein refers to a primer that includes a specific barcode nucleotide sequence that encodes information about the amplicon generated when the barcode primer is used in an amplification reaction. For example, different barcode primers can be used to amplify one or more target sequences from each of a number of different samples, such that the barcode nucleotide sequence is indicative of the sample source of the resulting amplicon.
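
The way a barcode nucleotide sequence indicates the sample source of an amplicon can be sketched as a simple read-sorting step. Everything specific in this example is an assumption made for illustration: the barcode sequences, the sample names, the function name, and the premise that the barcode occupies the first bases of each read.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by the sample encoded in their leading barcode.

    Assumes every barcode has the same length and sits at the start of
    the read; reads with an unknown barcode go to "unassigned".
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes barcodes of one common length")
    barcode_length = lengths.pop()

    by_sample = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:barcode_length])
        if sample is None:
            by_sample["unassigned"].append(read)
        else:
            # Store the read with its barcode trimmed off.
            by_sample[sample].append(read[barcode_length:])
    return dict(by_sample)


if __name__ == "__main__":
    barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}  # hypothetical
    reads = ["ACGTTTGACC", "TGCAGGATCC", "GGGGAAAA"]
    print(demultiplex(reads, barcodes))
```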

As used herein, the term "encoding reaction" refers to a reaction in which at least one nucleotide tag is added to a target nucleotide sequence. The nucleotide tag may be added, for example, by "encoding PCR", in which at least one primer comprises a target-specific portion and a nucleotide tag located on the 5' end of the target-specific portion, and a second primer comprises only a target-specific portion or a target-specific portion and a nucleotide tag located on the 5' end of the target-specific portion. For illustrative examples of PCR protocols that may be used for encoding PCR, see pending WO application US03/37808 and U.S. Pat. No. 6,605,451. Nucleotide tags may also be added by an "encoding ligation" reaction, which may include a ligation reaction in which at least one primer comprises a target-specific portion and a nucleotide tag located on the 5' end of the target-specific portion, and a second primer comprises only a target-specific portion or a target-specific portion and a nucleotide tag located on the 5' end of the target-specific portion. Exemplary encoding ligation reactions are described, for example, in U.S. Patent Publication No. 2005/0260640, which is incorporated by reference herein in its entirety, and in particular for its description of ligation reactions.

As used herein, an "encoding reaction" may generate a "tagged target nucleotide sequence" that includes a nucleotide tag attached to the target nucleotide sequence.

As used herein, with reference to a portion of a primer, the term "target-specific" nucleotide sequence refers to a sequence that is capable of specifically annealing to a target nucleic acid or target nucleotide sequence under appropriate annealing conditions.

As used herein, with reference to a portion of a primer, the term "nucleotide tag-specific nucleotide sequence" refers to a sequence that is capable of specifically annealing to a nucleotide tag under appropriate annealing conditions.

Amplification according to the teachings of the present invention includes any means of reproducing at least a portion of at least one target nucleic acid, typically in a template-dependent manner, including, but not limited to, a wide range of techniques for linear or exponential amplification of nucleic acid sequences. Illustrative means for accomplishing the amplification step include Ligase Chain Reaction (LCR), Ligase Detection Reaction (LDR), ligation followed by Q-replicase amplification, PCR, primer extension, Strand Displacement Amplification (SDA), hyperbranched strand displacement amplification, Multiple Displacement Amplification (MDA), nucleic acid sequence-based amplification (NASBA), two-step multiplex amplification, Rolling Circle Amplification (RCA), and the like, including various forms and combinations thereof, such as, but not limited to, OLA/PCR, PCR/OLA, LDR/PCR, PCR/LDR, LCR/PCR, PCR/LCR (also known as combined chain reaction, CCR), and the like. Descriptions of such techniques may be found in the following and other sources: Ausubel et al.; PCR Primer: A Laboratory Manual, Diffenbach, Cold Spring Harbor Press (1995); The Electronic Protocol Book, Chang Bioscience (2002); Msuih et al., J. Clin. Micro. 34:501-07 (1996); The Nucleic Acids Handbook, edited by R. Rapley, Humana Press, Totowa, N.J. (2002); Abramson et al., Curr Opin Biotechnol. 1993 Feb; 41-7 (1); U.S. Pat. No. 6,027,998; U.S. Pat. No. 6,605,451 to Barany et al.; PCT publication No. WO 97/31256; Wenz et al., PCT publication No. WO 01/92579; Day et al., Genomics, 29(1): 152-; Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press (1990); Favis et al., Nature Biotechnology 18:561-64 (2000); Rabenau et al., Infection 28:97-102 (2000); Belgrader, Barany and Lubin, Development of a Multiplex Ligation Detection Reaction DNA Typing Assay, Sixth International Symposium on Human Identification, 1995 (available at the following Internet site: promega.com/geneticdproc/ussymp6proc/blegrad.html); LCR Kit Instruction Manual, Cat. #200520, Rev. #050002, Stratagene, 2002; Barany, Proc. Natl. Acad. Sci. USA 88:188-93 (1991); Bi and Sambrook, Nucl. Acids Res. 25: 2924-2951 (1997); Zirvi et al., Nucl. Acids Res. 27: e40i-viii (1999); Dean et al., Proc Natl Acad Sci USA 99:5261-66 (2002); Barany and Gelfand, Gene 109:1-11 (1991); Walker et al., Nucl. Acids Res. 20:1691-96 (1992); Polstra et al., BMC Inf. Dis. 2:18- (2002); Lage et al., Genome Res. 2003 Feb.; 294 (294) -307; Landegren et al., Science 241:1077-80 (1988); Demidov, V., Expert Rev Mol Diagn. 2002 Nov.; 542-8; Cook et al., J Microbiol Methods. 2003 May; 165 (2), 165-74; Schweitzer et al., Curr Opin Biotechnol. 2001 Feb; 21-7; U.S. Pat. No. 5,830,711; U.S. Pat. No. 6,027,889; U.S. Pat. No. 5,686,243; PCT publication No. WO0056927A3; and PCT publication No. WO9803673A1.

In some embodiments, the amplification comprises at least one cycle of sequential steps of: annealing at least one primer with a complementary or substantially complementary sequence in at least one target nucleic acid; synthesizing at least one strand of nucleotides in a template-dependent manner using a polymerase; and denaturing the newly formed nucleic acid duplex to separate the strands. This cycle may or may not be repeated. Amplification may include thermal cycling or may be accomplished isothermally.

The term "qPCR" is used herein to refer to quantitative real-time Polymerase Chain Reaction (PCR), which is also referred to as "real-time PCR" or "kinetic polymerase chain reaction".

The term "substantially" as used herein with reference to a parameter means that the parameter is sufficient to provide a useful result. Thus, "substantially complementary," when applied to a nucleic acid sequence, generally means sufficiently complementary to function in the described context. Generally, substantially complementary means sufficiently complementary to hybridize under the conditions employed. In some embodiments described herein, the reaction product must be distinguished from unreacted primers. In this case, the statement that the "reaction product is substantially double-stranded" and the statement that the "primer is substantially single-stranded" indicate that there is sufficient difference between the amount of double-stranded reaction product and single-stranded primer such that the presence and/or amount of reaction product can be determined.

"reagent" broadly refers to any agent used in a reaction other than an analyte (e.g., a nucleic acid being analyzed). Illustrative reagents for a nucleic acid amplification reaction include, but are not limited to, buffers, metal ions, polymerases, reverse transcriptases, primers, template nucleic acids, nucleotides, labels, dyes, nucleases, and the like. Reagents for enzymatic reactions include, for example, substrates, cofactors, buffers, metal ions, inhibitors, and activators.

The term "universal detection probe" is used herein to refer to any probe that identifies the presence of an amplification product regardless of the identity of the target nucleotide sequence present in the product.

The term "universal qPCR probe" is used herein to refer to any such probe that identifies the presence of an amplification product during qPCR. In a specific embodiment, the nucleotide tag according to the invention may comprise a nucleotide sequence to which a detection probe, such as a universal qPCR probe, binds. When tags are added to both ends of the target nucleotide sequence, each tag may include a sequence recognized by the detection probe, if desired. Combinations of such sequences may encode information about the identity or sample source of the tagged target nucleotide sequence. In other embodiments, one or more amplification primers can include a nucleotide sequence to which a detection probe, such as a universal qPCR probe, binds. In this manner, one, two, or more probe binding sites may be added to the amplification product during the amplification step of the method of the invention. One of ordinary skill in the art will recognize that the possibility of introducing multiple probe binding sites during pre-amplification (if performed) and amplification facilitates multiplex detection, in which two or more different amplification products can be detected in a given amplification mixture or aliquot thereof.

The term "universal detection probe" is also intended to include primers labeled with a detectable label (e.g., a fluorescent label), as well as non-sequence-specific probes, such as DNA binding dyes, including double-stranded DNA (dsDNA) dyes, such as SYBR Green.

The term "label" as used herein refers to any atom or molecule that can be used to provide a detectable and/or quantifiable signal. Specifically, the label may be attached directly or indirectly to the nucleic acid or protein. Suitable labels that may be attached to the probe include, but are not limited to, radioisotopes, fluorophores, chromophores, mass labels, electron-dense particles, magnetic particles, spin labels, molecules that emit chemiluminescence, electrochemically active molecules, enzymes, cofactors, and enzyme substrates.

The term "stain" as used herein generally refers to any organic or inorganic molecule that binds a component of a reaction or assay mixture to aid in the detection of that component.

The term "dye" as used herein generally refers to any organic or inorganic molecule that absorbs electromagnetic radiation at a wavelength greater than or equal to 340 nm.

The term "fluorescent dye" as used herein generally refers to any dye that emits longer wavelength electromagnetic radiation by a fluorescence mechanism when illuminated by a source of electromagnetic radiation such as a lamp, photodiode, or laser.

The term "elastomer" has the general meaning used in the art. Thus, for example, Allcock et al. (2nd edition) describe elastomers in general as polymers existing at a temperature between their glass transition temperature and liquefaction temperature. Elastomeric materials exhibit elastic properties because the polymer chains readily undergo torsional motion, which allows the backbone chains to uncoil in response to a force and to recoil to their previous shape in the absence of the force. Generally, elastomers deform when a force is applied, but then return to their original shape when the force is removed.

The term "variation" as used herein refers to any difference. A variation may refer to a difference between individuals or populations. A variation encompasses a difference from the common or normal case. Thus, a "copy number variation" or "mutation" may refer to a difference from a common or normal copy number or nucleotide sequence. An "expression level variation" or "splice variant" may refer to an expression level or RNA or protein that differs from the common or normal expression level or RNA or protein for a particular cell or tissue, developmental stage, state, etc.

A "polymorphic marker" or "polymorphic site" is a locus at which nucleotide sequence divergence occurs. Exemplary markers have at least two alleles, each occurring at a frequency greater than 1%, and more typically greater than 10% or 20% of the selected population. Polymorphic sites may be as small as one base pair. Polymorphic markers include Restriction Fragment Length Polymorphism (RFLP), Variable Number Tandem Repeats (VNTR), hypervariable regions, microsatellite sequences, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, deletions, and insertion elements, such as Alu. The first identified allelic form is arbitrarily designated as the reference form and the other allelic forms are designated as the alternate or variant alleles. The allelic form that occurs most frequently in the selected population is sometimes referred to as wild-type. For allelic forms, diploid organisms may be homozygous or heterozygous. Biallelic polymorphisms have two forms. The triallelic polymorphism has three forms.

A "single nucleotide polymorphism" (SNP) occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. This site is typically preceded and followed by highly conserved sequences of the allele (e.g., sequences that vary in fewer than 1/100 or 1/1000 members of the population). SNPs typically arise from the substitution of one nucleotide for another at the polymorphic site. A transition is the substitution of one purine for another purine or one pyrimidine for another pyrimidine. A transversion is the substitution of a purine for a pyrimidine or vice versa. SNPs can also arise from the deletion of one nucleotide or the insertion of one nucleotide relative to a reference allele.
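The distinction between transitions and transversions can be illustrated with a minimal sketch. The function name `classify_substitution` is hypothetical and is introduced here only for illustration; it assumes DNA bases and a single-base substitution.

```python
# Purines and pyrimidines among the four DNA bases.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion.

    Hypothetical helper for illustration only.
    """
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("expected single DNA bases A, C, G or T")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine replaced by purine
print(classify_substitution("C", "A"))  # pyrimidine replaced by purine
```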

As used herein with respect to reactions, reaction mixtures, reaction volumes, and the like, the term "separate" refers to reactions, reaction mixtures, reaction volumes, and the like, in which the reactions are conducted separately from other reactions. Separate reactions, reaction mixtures, reaction volumes, etc., include those carried out in droplets (see, e.g., U.S. patent No. 7,294,503 to Quake et al., entitled "Microfabricated crossflow devices and methods," granted on 11/13/2007, which is incorporated herein by reference in its entirety, particularly for its description of the apparatus and methods for forming and analyzing droplets; U.S. patent publication No. 20100022414 to Link et al., published on 1/28/2010, entitled "Droplet libraries," which is incorporated herein by reference in its entirety, particularly for its description of the apparatus and methods for forming and analyzing droplets; and U.S. patent publication No. 20110000560 to Miller et al., published on 1/6/2011, entitled "Manipulation of microfluidic droplets," which is incorporated herein by reference in its entirety, particularly for its description of the apparatus and methods for forming and analyzing droplets), which may, but need not, be in an emulsion, as well as those in which the reactions, reaction mixtures, reaction volumes, etc. are separated by mechanical barriers, for example, separate vessels, separate wells of a microtiter plate, or separate compartments of a matrix-type microfluidic device.

Generation of adaptor-modified target nucleic acid molecules

In certain embodiments, the invention relates to methods of adding an adaptor molecule to each end of a plurality of target nucleic acids comprising sticky ends. These embodiments can be used, for example, for fragment generation for high throughput DNA sequencing. Adapters can be selected to facilitate sequencing using the selected DNA sequencing platform.

In particular embodiments, such methods include annealing an adaptor molecule to a sticky end of a double-stranded target nucleic acid molecule to produce an annealed adaptor-target nucleic acid molecule. A target nucleic acid molecule comprising a sticky end can be produced by any convenient method. In certain embodiments, the DNA molecule is fragmented, e.g., by any of enzymatic digestion, nebulization, sonication, and the like. For example, a DNA molecule may be fragmented by digestion with a DNase such as DNase I, with the digestion terminated by heat treatment. Fragmentation that does not produce sticky ends can be followed by enzymatic digestion of the fragmented DNA molecules to generate sticky ends. In a specific embodiment, the sticky end of a double-stranded target nucleic acid molecule is a 3' extension. Strand-specific endonucleases that do not have polymerase activity under the conditions employed for digestion can be used to generate sticky ends. In an exemplary embodiment, the sticky ends are generated by digesting the 5' end with exonuclease III in the absence of dNTPs.

In a first embodiment, the adaptor molecules are hairpin structures, each comprising: an adaptor nucleotide sequence linked to a nucleotide linker linked to a nucleotide sequence capable of annealing to the adaptor nucleotide sequence and being linked to a degenerate tail sequence. See fig. 1A. This embodiment employs two types of adaptor molecules, wherein each type comprises an adaptor nucleotide sequence (i.e., a first adaptor nucleotide sequence and a second adaptor nucleotide sequence) that is different from the other type.

In a second embodiment, the adaptor molecules are double-stranded or single-stranded molecules, each comprising on each strand: a first adaptor nucleotide sequence linked to a nucleotide linker, which is in turn linked to a second adaptor nucleotide sequence; and degenerate tail sequences, wherein the double-stranded molecules each comprise two degenerate tail sequences as sticky ends. See Fig. 2A.

In certain embodiments, e.g., embodiments in which the target nucleic acid molecule is intended for high-throughput DNA sequencing, the first and second adaptor sequences may comprise primer binding sites capable of being specifically bound by DNA sequencing primers, i.e., sequencer-specific tag 1 and sequencer-specific tag 2. See fig. 1A and 2A.

In all cases, the degenerate tail sequence may be at the 3' end of the adaptor molecule. The degenerate tail sequence of the adaptor molecule is substantially complementary to at least a portion of the sticky end on the target nucleic acid molecule; that is, the adaptor molecule is capable of annealing to the target nucleic acid molecule under the conditions employed. The length of the degenerate tail sequence will generally be sufficient to facilitate this annealing, e.g., from about 10 to about 20 nucleotides. In certain embodiments, the degenerate tail sequence is protected at its 3' end against exonuclease digestion, e.g., with phosphorothioate or dUTP.

Optionally, the adaptor molecule may comprise one or more additional nucleotide sequences. In certain embodiments, the nucleotide linker moiety of the adaptor molecule may comprise an endonuclease site, a barcode nucleotide sequence, an affinity tag, and any combination thereof. For example, the nucleotide linker may comprise a restriction enzyme site, and optionally, at least one barcode nucleotide sequence.

In the first and second embodiments, after annealing to the target nucleic acid molecule, the method includes filling in any gaps in the annealed adaptor-target nucleic acid molecule (e.g., using a DNA polymerase), and ligating any adjacent nucleotide sequences in the annealed adaptor-target nucleic acid molecule to produce an adaptor-modified target nucleic acid molecule. In some embodiments, sticky end generation and ligation may be performed in the same reaction mixture. For example, an exonuclease can be used together with a ligase (e.g., a thermostable ligase) and a polymerase in a single reaction mixture.

When the adaptor molecule is a hairpin structure, ligation of the adaptor to the target nucleic acid converts the annealed adaptor-target nucleic acid molecule into a single-stranded circular DNA molecule, which can form a double-stranded structure, as shown in Fig. 1D. When the adaptor molecule is a single-stranded or double-stranded molecule, ligation of the adaptor to the target nucleic acid converts the annealed adaptor-target nucleic acid molecule into a double-stranded circular DNA molecule. When the nucleotide linker comprises an endonuclease site, the method may further comprise digesting the single-stranded or double-stranded circular DNA molecule to produce a linear DNA molecule. See Figs. 1D and 2D. Specifically, double-stranded circular DNA molecules can be digested with restriction enzymes that cleave at sites in the nucleotide linkers to produce linear DNA molecules. In a specific embodiment, the linear DNA molecule comprises: 5'-(first portion of nucleotide linker)-(second adaptor nucleotide sequence)-(first degenerate tail sequence)-(target nucleic acid molecule)-(second degenerate tail sequence)-(first adaptor nucleotide sequence)-(second portion of nucleotide linker)-3'.

In an exemplary embodiment, the above method may be performed by:

generating a plurality of target nucleic acid molecules comprising sticky ends by:

digesting the DNA molecule with DNase I to produce fragmented DNA molecules, followed by heat inactivation of the DNase I;

digesting the fragmented DNA molecules with a nuclease having 5' to 3' exonuclease activity (such as exonuclease III) in the absence of deoxynucleotides to produce a plurality of target nucleic acid molecules having sticky ends;

annealing the adaptors to the sticky ends of the plurality of target nucleic acid molecules, wherein the nucleotide linkers of the adaptors comprise endonuclease sites;

filling any gaps in the annealed adaptor-target nucleic acid molecule and ligating any adjacent nucleotide sequences in a single reaction comprising a polymerase and a ligase to produce a circular DNA molecule; and

digesting the circular DNA molecule with an endonuclease that cleaves at the endonuclease site to produce a linear DNA molecule.
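The final digestion step can be pictured as opening a circle of sequence elements at the nucleotide linker. The sketch below is illustrative only: it tracks element labels rather than real sequences, the function name `cut_at_linker` is hypothetical, and the circular element order is an assumption chosen to match the linear molecule described above.

```python
def cut_at_linker(circle, linker="nucleotide linker"):
    """Linearize a circular list of element labels by cutting inside the linker.

    The elements that follow the linker (wrapping around the circle) end up
    between the two linker portions. Hypothetical helper for illustration.
    """
    i = circle.index(linker)
    # Everything after the linker, then everything before it (wrap-around).
    between = circle[i + 1:] + circle[:i]
    return [f"first portion of {linker}"] + between + [f"second portion of {linker}"]


# Assumed element order around the circle after gap filling and ligation.
circular_molecule = [
    "first adaptor nucleotide sequence",
    "nucleotide linker",
    "second adaptor nucleotide sequence",
    "first degenerate tail sequence",
    "target nucleic acid molecule",
    "second degenerate tail sequence",
]

linear_molecule = cut_at_linker(circular_molecule)
print("5'-" + "-".join(f"({e})" for e in linear_molecule) + "-3'")
```

Modelling the cut as a rotation of the circle makes it clear why the two adaptor sequences end up flanking the target, with the linker portions at the outermost ends.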

In particular embodiments, the method of adding an adaptor molecule to each end of a plurality of target nucleic acids can include sequencing the adaptor-modified target nucleic acid molecules by any available method, such as any available high throughput DNA sequencing technology.

Incorporation of nucleic acid sequences into target nucleic acids

The reaction for incorporating one or more nucleotide sequences into a target nucleic acid can be performed using two or more primers that comprise one or more nucleic acid sequences in addition to the portion that anneals to the target nucleic acid. One or more of these annealing portions may comprise a random sequence, so that the nucleic acid sequence is incorporated into substantially all of the nucleic acids in the sample. Alternatively or additionally, one or more of these portions may be specific for one or more sequences common to a plurality or all of the nucleic acids present. In other embodiments, the primer comprises a portion specific for one or more particular target nucleic acids. Nucleic acid sequences can be incorporated using as few as two primers. However, various embodiments employ three, four, five, or six or more primers, as described in more detail below. Such reactions are discussed below with respect to nucleic acid amplification; however, one skilled in the art will readily appreciate that the strategies discussed below may be employed in other types of reactions, such as polymerase extension and ligation.

Three-primer method

In particular embodiments, the invention provides amplification methods for incorporating a plurality (e.g., at least three) of selected nucleotide sequences into one or more target nucleic acids. In some embodiments, such methods comprise amplifying a plurality of target nucleic acids in a plurality of samples. In illustrative embodiments, the same set of target nucleic acids can be amplified in each of two or more different samples. The samples can differ from each other in any respect, e.g., the samples can be from different tissues, subjects, environmental sources, etc. At least three primers can be used to amplify each target nucleic acid, namely: forward and reverse amplification primers, each primer comprising a target-specific portion, and one or both primers comprising a nucleotide tag (e.g., a first and second nucleotide tag). These target-specific moieties can specifically anneal to the target under appropriate annealing conditions. The nucleotide tag for the forward primer may have the same or different sequence as the nucleotide tag for the reverse primer. Generally, the nucleotide tags are 5' to the target-specific moieties. The third primer is a barcode primer comprising a barcode nucleotide sequence and a first and/or second nucleotide tag specific portion. The barcode nucleotide sequence is a sequence selected to encode information about the amplicons generated when the barcode primer is used in an amplification reaction. The tag-specific portion can specifically anneal to one or both of the nucleotide tags in the forward and reverse primers. The barcode primers are generally 5' to the target-specific portion.

The barcode primer is typically present in the amplification mixture in excess of the forward and/or reverse (inner) primer(s). More specifically, if the barcode primer anneals to the nucleotide tag in the forward primer, the barcode primer is generally present in excess of the forward primer. If the barcode primer anneals to the nucleotide tag in the reverse primer, the barcode primer is generally present in excess of the reverse primer. In illustrative embodiments, in each case, the third primer in the amplification mixture, i.e., the reverse primer or the forward primer, respectively, can be present at a concentration substantially similar to that of the barcode primer. In general, the barcode primer is present in a substantial excess. For example, the concentration of the barcode primer in the amplification mixture can be at least 2-fold, at least 4-fold, at least 5-fold, at least 10-fold, at least 15-fold, at least 20-fold, at least 25-fold, at least 30-fold, at least 35-fold, at least 40-fold, at least 45-fold, at least 50-fold, at least 100-fold, at least 500-fold, at least 10³-fold, at least 5 × 10³-fold, at least 10⁴-fold, at least 5 × 10⁴-fold, at least 10⁵-fold, at least 5 × 10⁵-fold, at least 10⁶-fold, or higher, relative to the concentration of the one or more forward and/or reverse primers. Further, the excess concentration of the barcode primer may fall within any range having any of the above values as endpoints (e.g., 2-fold to 10⁵-fold). In illustrative embodiments, when the barcode primer has a tag-specific portion specific for a nucleotide tag on the forward primer, the forward primer can be present at a picomolar to nanomolar concentration, e.g., about 5 pM to 500 nM, about 5 pM to 100 nM, about 5 pM to 50 nM, about 5 pM to 10 nM, about 5 pM to 5 nM, about 10 pM to 1 nM, about 50 pM to about 500 pM, about 100 pM, or any other range having any of these values as endpoints (e.g., 10 pM to 50 pM). Exemplary concentrations of the barcode primer that can suitably be used in combination with any of these forward primer concentrations include about 10 nM to about 10 μM, about 25 nM to about 7.5 μM, about 50 nM to about 5 μM, about 75 nM to about 2.5 μM, about 100 nM to about 1 μM, about 250 nM to about 750 nM, about 500 nM, or any other range having any of these values as endpoints (e.g., 100 nM to 500 nM). In amplification reactions using such concentrations of forward and barcode primers, the reverse primer has a concentration on the same order of magnitude as that of the barcode primer (e.g., within about 10-fold, within about 5-fold, or the same).
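As a worked illustration of the fold-excess arithmetic, the following sketch converts two concentrations to a common unit and takes their ratio. The helper names are hypothetical; the example values (a 100 pM forward primer and a 500 nM barcode primer) are taken from the exemplary concentrations listed above.

```python
# Conversion factors to picomolar; "uM" stands for micromolar.
UNIT_TO_PM = {"pM": 1, "nM": 1_000, "uM": 1_000_000}


def to_pm(value, unit):
    """Convert a concentration to picomolar."""
    return value * UNIT_TO_PM[unit]


def fold_excess(barcode, inner):
    """Return the fold excess of the barcode primer over an inner primer.

    Both arguments are (value, unit) tuples. Hypothetical helper for illustration.
    """
    return to_pm(*barcode) / to_pm(*inner)


fold = fold_excess((500, "nM"), (100, "pM"))
print(f"{fold:g}-fold excess")  # 500 nM over 100 pM is a 5000-fold excess
```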

Each amplification mixture can be subjected to amplification to generate a target amplicon comprising a tagged target nucleotide sequence, each comprising first and second nucleotide tags flanking the target nucleotide sequence, and at least one barcode nucleotide sequence (relative to one strand of the target amplicon) on the 5' or 3' end of the target amplicon. In certain embodiments, the first and second nucleotide tags and/or barcode nucleotide sequences are selected so as to avoid substantial annealing to the target nucleic acid. In such embodiments, the tagged target nucleotide sequence may comprise a molecule having the following elements: 5'-(barcode nucleotide sequence)-(first nucleotide tag from forward primer)-(target nucleotide sequence)-(second nucleotide tag sequence from reverse primer)-3' or 5'-(first nucleotide tag from forward primer)-(target nucleotide sequence)-(second nucleotide tag sequence from reverse primer)-(barcode nucleotide sequence)-3'.
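The element order of the tagged target nucleotide sequence can be reproduced with a small string-level sketch of the three-primer scheme. This is not a thermodynamic model of PCR: all sequences are toy values, the function names are hypothetical, and the barcode primer is assumed to anneal to the first nucleotide tag.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tagged_amplicon(target, fwd_len, rev_len, tag1, tag2, barcode):
    """Return the expected top strand (5'->3') after three-primer amplification.

    target  : top strand of the target region
    fwd_len : length of the forward target-specific portion
    rev_len : length of the reverse target-specific portion
    """
    forward_primer = tag1 + target[:fwd_len]              # tag is 5' of the target-specific portion
    reverse_primer = tag2 + revcomp(target[-rev_len:])
    barcode_primer = barcode + tag1                        # barcode is 5' of the tag-specific portion
    # Product of the inner primers, written as its top strand.
    inner = forward_primer + target[fwd_len:-rev_len] + revcomp(reverse_primer)
    # The barcode primer primes on the complement of tag1 and adds the barcode.
    return barcode_primer + inner[len(tag1):]


# Toy sequences, chosen only for illustration.
target = "ATGGCCATTGTAATGGGCCGCTGA"
product = tagged_amplicon(target, 8, 8, tag1="CAGTCA", tag2="GTGTCA", barcode="TTAGGC")
print(product)
```

On the top strand the second nucleotide tag appears as its reverse complement, because it is introduced by the reverse primer.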

Four-primer method

In some embodiments, more than three primers may be used to add a desired element to a target nucleotide sequence. For example, four primers can be used to generate a molecule having the same elements as discussed above plus, optionally, an additional barcode, such as 5'-(barcode nucleotide sequence)-(first nucleotide tag from forward primer)-(target nucleotide sequence)-(second nucleotide tag from reverse primer)-(additional barcode nucleotide sequence)-3'. In an exemplary four-primer embodiment, the forward primer includes a target-specific portion and a first nucleotide tag, and the reverse primer includes a target-specific portion and a second nucleotide tag. Together, these two primers constitute the "inner primers". The remaining two primers are the "outer primers", which anneal to the first and second nucleotide tags present in the inner primers. One outer primer is a barcode primer, as described above. The second outer primer may comprise a second tag-specific portion and an additional barcode nucleotide sequence, i.e., it may be a second barcode primer.

Amplification may be performed in one or more amplification reactions to incorporate elements from more than three primers. For example, a four-primer amplification can be performed in one amplification reaction in which all four primers are present. Alternatively, four-primer amplification may be performed, for example, in two amplification reactions: one for incorporation of the inner primer and a different amplification reaction for incorporation of the outer primer. When all four primers are present in one amplification reaction, the outer primers are generally present in excess in the reaction mixture. In a one-step, four-primer amplification reaction, the relative concentration values given above for the barcode primer relative to the forward and/or reverse primer also apply to the concentration of the outer primer relative to the inner primer.

Combination method

In an exemplary embodiment of a four-primer amplification reaction, each of the outer primers comprises a unique barcode. For example, a barcode primer may be composed of the following elements: 5'-(first barcode nucleotide sequence)-(first nucleotide tag)-3', and the second barcode primer may be composed of the following elements: 5'-(second barcode nucleotide sequence)-(second nucleotide tag)-3'. In this embodiment, a number (J) of first barcode primers can be combined with a number (K) of second barcode primers to generate J × K unique amplification products.
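The J × K relationship can be checked by enumerating every pairing of first and second barcode primers. The barcode labels below are placeholders introduced for illustration, not sequences from this document.

```python
from itertools import product

# Placeholder labels standing in for barcode primers.
first_barcode_primers = ["BC1-A", "BC1-B", "BC1-C"]   # J = 3
second_barcode_primers = ["BC2-A", "BC2-B"]           # K = 2

# Each pairing labels one uniquely barcoded amplification product.
combinations = list(product(first_barcode_primers, second_barcode_primers))

for first, second in combinations:
    print(f"5'-({first})-...-({second})-3'")

print(len(combinations), "unique barcode combinations")
```

With J first barcode primers and K second barcode primers, only J + K primers are needed to label J × K products.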

In further exemplary embodiments of the invention, more than four primers may be combined in a single reaction to attach different combinations of barcode nucleotide sequences and nucleotide tags. For example, outer barcode primers comprising the following elements: 5'-(first barcode nucleotide sequence)-(first nucleotide tag)-3', 5'-(first barcode nucleotide sequence)-(second nucleotide tag)-3', 5'-(second barcode nucleotide sequence)-(first nucleotide tag)-3', and 5'-(second barcode nucleotide sequence)-(second nucleotide tag)-3', can be combined with inner target-specific primers, as described above, to generate an amplification product pool comprising all combinations of barcode primers and desired amplicon sequences.

In other exemplary embodiments of the invention, the outer barcode primer in any one of the above combinations or other combinations that would be apparent to one of ordinary skill in the art may be combined with more than one pair of target primer sequences having identical first and second nucleotide tag sequences. For example, as described above, an inner primer comprising up to ten different target-specific forward primer sequences combined with the same first nucleotide tag and up to ten different target-specific reverse primer sequences combined with the same second nucleotide tag can be combined with up to 2 or up to 4 outer barcode primers to generate a plurality of amplification products. In various embodiments, at least 10, at least 20, at least 50, at least 100, at least 200, at least 500, at least 1000, at least 2000, at least 5000, or at least 10000 different target-specific primer pairs carrying the same first nucleotide tag and second nucleotide tag can be combined with up to 2 or up to 4 outer barcode primers to generate a plurality of amplification products.

Bidirectional combination method

In an exemplary embodiment of a four-primer amplification reaction, the inner and outer primers may each comprise a unique barcode, such that amplification results in a barcode combination at each end of the resulting amplicon. This method is useful when the amplicons are to be sequenced, because barcode combinations can be read from either end of the sequence. For example, four primers can be employed to generate a molecule having the following elements: 5'-(second barcode nucleotide sequence)-(first nucleotide tag sequence)-(first barcode nucleotide sequence)-(target nucleotide sequence)-(first barcode nucleotide sequence)-(second nucleotide tag sequence)-(second barcode nucleotide sequence)-3'. In an exemplary four-primer embodiment, the two inner primers may comprise:

a forward, inner primer comprising a first nucleotide tag, a first barcode nucleotide sequence, and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion, a first barcode nucleotide sequence, and a second nucleotide tag.

The two outer primers may comprise:

a forward, outer primer comprising a second barcode nucleotide sequence and a first nucleotide tag-specific portion; and

a reverse, outer primer comprising a second nucleotide tag-specific portion and a second barcode nucleotide sequence.

As discussed above, if the inner and outer primers are contained in the same reaction mixture, the outer primers are preferably present in excess.

Similar combinations of elements can be produced in a six-primer amplification method that employs "stuffer" primers in addition to the inner and outer primers. Thus, for example, two inner primers may comprise:

a forward, inner primer comprising a first nucleotide tag and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion and a second nucleotide tag.

The two stuffer primers may comprise:

a forward, stuffer primer comprising a third nucleotide tag, a first barcode nucleotide sequence, and a first nucleotide tag-specific portion; and

a reverse, stuffer primer comprising a second nucleotide tag-specific portion, a first barcode nucleotide sequence, and a fourth nucleotide tag.

The two outer primers may comprise:

a forward, outer primer comprising a second barcode nucleotide sequence and a third nucleotide tag-specific portion; and

a reverse, outer primer comprising a fourth nucleotide tag-specific portion and a second barcode nucleotide sequence.

Nucleic acid amplification produces an amplicon comprising the following elements: 5'-second barcode nucleotide sequence-third nucleotide tag sequence-first barcode nucleotide sequence-first nucleotide tag sequence-target nucleotide sequence-second nucleotide tag sequence-first barcode nucleotide sequence-fourth nucleotide tag sequence-second barcode nucleotide sequence-3'. Amplification can be performed in one, two, or three amplification reactions. For example, all three primer pairs may be included in one reaction. Alternatively, two reactions may be performed, e.g., a first reaction including the inner and stuffer primers, followed by a second reaction including only the outer primers; or a first reaction including only the inner primers, followed by a second reaction including the stuffer and outer primers. When more than one primer pair is present, the primer pair that is the "outer" pair relative to the other pair is preferably present in excess, as discussed above. Thus, the stuffer primers are preferably present in excess if the inner and stuffer primers are contained in a reaction mixture, and the outer primers are preferably present in excess if the stuffer and outer primers are contained in a reaction mixture. When all three primer pairs are contained in a single reaction, the stuffer primers may be present at a concentration intermediate between the concentrations of the inner and outer primers.

In certain embodiments of the four- and six-primer amplification methods described above, for example, when the molecules produced in the reaction are to be subjected to DNA sequencing, the outer primers can additionally comprise first and second primer binding sites capable of being bound by DNA sequencing primers. For example, a four-primer reaction can produce a tagged target nucleotide sequence comprising: 5'-first primer binding site-second barcode nucleotide sequence-first nucleotide tag sequence-first barcode nucleotide sequence-target nucleotide sequence-first barcode nucleotide sequence-second nucleotide tag sequence-second barcode nucleotide sequence-second primer binding site-3'. This embodiment provides the benefit that the barcode combination can be determined from a sequencing read at either end of the molecule. Similarly, a six-primer reaction can produce a tagged target nucleotide sequence comprising: 5'-first primer binding site-second barcode nucleotide sequence-third nucleotide tag sequence-first barcode nucleotide sequence-first nucleotide tag sequence-target nucleotide sequence-second nucleotide tag sequence-first barcode nucleotide sequence-fourth nucleotide tag sequence-second barcode nucleotide sequence-second primer binding site-3'.
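The nested element orders recited above can be checked with a short sketch. This is an illustration only: the function names and the short labels (BC1, BC2, tag1 to tag4, PBS1, PBS2) are hypothetical shorthand for the elements named in the text, and the model tracks element order, not actual nucleotide sequences.

```python
def nested_amplicon(target, layers):
    """Return the 5'->3' element order of an amplicon.

    `layers` lists (forward_element, reverse_element) pairs from the
    innermost to the outermost; each pair is wrapped around the
    elements that are already present.
    """
    elements = [target]
    for forward_element, reverse_element in layers:
        elements = [forward_element] + elements + [reverse_element]
    return elements


def barcodes_read_from_each_end(elements, target):
    """List barcode labels in the order met when reading inward from each end."""
    idx = elements.index(target)
    from_5p = [e for e in elements[:idx] if e.startswith("BC")]
    from_3p = [e for e in reversed(elements[idx + 1:]) if e.startswith("BC")]
    return from_5p, from_3p


# Four-primer reaction with sequencing primer binding sites (hypothetical labels).
four_primer = nested_amplicon("target", [
    ("BC1", "BC1"),    # inner primers: first barcode
    ("tag1", "tag2"),  # inner primers: first and second nucleotide tags
    ("BC2", "BC2"),    # outer primers: second barcode
    ("PBS1", "PBS2"),  # outer primers: primer binding sites
])

# Six-primer reaction: the stuffer primers carry the first barcode.
six_primer = nested_amplicon("target", [
    ("tag1", "tag2"),  # inner primers
    ("BC1", "BC1"),    # stuffer primers: first barcode
    ("tag3", "tag4"),  # stuffer primers: third and fourth nucleotide tags
    ("BC2", "BC2"),    # outer primers: second barcode
    ("PBS1", "PBS2"),  # outer primers: primer binding sites
])

print("5'-" + "-".join(four_primer) + "-3'")
print("5'-" + "-".join(six_primer) + "-3'")
print(barcodes_read_from_each_end(four_primer, "target"))
```

In both layouts the same barcode combination is encountered when reading inward from either end, which is the property the bidirectional arrangement is intended to provide.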

Combinatorial ligation-based tagging

In certain embodiments, the invention includes ligation-based methods for combinatorial tagging (e.g., barcoding) a plurality of target nucleotide sequences. The method employs a plurality of tagged target nucleotide sequences derived from a target nucleic acid. Each tagged target nucleotide sequence comprises an endonuclease site and a first barcode nucleotide sequence. The plurality of tagged target nucleotide sequences comprises the same endonuclease site but N different first barcode nucleotide sequences, wherein N is an integer greater than 1.

The tagged target nucleotide sequences are cleaved with an endonuclease specific for the endonuclease site to produce a plurality of sticky-ended, tagged target nucleotide sequences. A plurality of adaptors is then ligated to the sticky-ended, tagged target nucleotide sequences in a first reaction mixture. Each adaptor comprises a second barcode nucleotide sequence and a sticky end complementary to the sticky ends of the tagged target nucleotide sequences. Further, the plurality of adaptors comprises M different second barcode nucleotide sequences, wherein M is an integer greater than 1. Ligation generates a plurality of combinatorially tagged target nucleotide sequences, each comprising first and second barcode nucleotide sequences, wherein the plurality comprises N x M different first and second barcode combinations.
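As a minimal sketch of the N x M count, the following enumerates every pairing of a first barcode with a second barcode. The barcode sequences and the function name are hypothetical and chosen only for illustration.

```python
from itertools import product


def barcode_combinations(first_barcodes, second_barcodes):
    """Every (first barcode, second barcode) pairing that ligation can produce."""
    return list(product(first_barcodes, second_barcodes))


first_barcodes = ["ACGT", "TGCA", "GATC"]  # N = 3 (hypothetical sequences)
second_barcodes = ["AACC", "GGTT"]         # M = 2 (hypothetical sequences)

combinations = barcode_combinations(first_barcodes, second_barcodes)
print(len(combinations))  # N x M pairings
```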

In certain embodiments, the endonuclease site is adjacent to the first barcode nucleotide sequence in the tagged target nucleotide sequence. In variations of such embodiments, the second barcode nucleotide sequence is adjacent to the complementary sticky end in the adaptor. In particular embodiments, for example, the combinatorially tagged target nucleotide sequence comprises first and second barcode nucleotide sequences separated by fewer than 5 nucleotides.

In particular embodiments, for example, when the combinatorially tagged target nucleotide sequence is intended for sequencing, the tagged target nucleotide sequence may comprise first and second primer binding sites, which may have either of the following arrangements: 5'-endonuclease site-first barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-3'; and 5'-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-endonuclease site-3'. To aid in sequencing, the first and second primer binding sites may be binding sites for DNA sequencing primers. In variations of such embodiments, the combinatorially tagged target nucleotide sequence may comprise a second barcode nucleotide sequence arranged in one of the following ways: 5'-second barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-3'; or 5'-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-second barcode nucleotide sequence-3'.

Tagged target nucleotide sequences useful in this method can be prepared by any convenient means, such as, for example, by ligating adaptors to a plurality of target nucleic acids, wherein the adaptors comprise: a first adaptor comprising an endonuclease site, a first barcode nucleotide sequence, a first primer binding site, and a sticky end; and a second adaptor comprising a second primer binding site and a sticky end.

In some embodiments, it is advantageous to include one or more additional nucleotide sequences in the tagged target nucleotide sequence, e.g., to aid in manipulation and/or identification. As such, the tagged target nucleotide sequence may comprise a first additional nucleotide sequence having an arrangement selected from: 5'-endonuclease site-first barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first additional nucleotide sequence-3'; and/or 5'-first additional nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-endonuclease site-3'. For example, in Illumina sequencing, flow cell binding sequences (e.g., PE1 and PE2) are incorporated at either end of a DNA template to be sequenced. In the present method, the tagged target nucleotide sequence may comprise one flow cell binding sequence as the first additional nucleotide sequence, and the other flow cell binding sequence may be introduced via an adaptor. See, for example, FIGS. 5A-5B. Thus, the method may employ an adaptor comprising a second additional nucleotide sequence and having the following arrangement: 5'-second additional nucleotide sequence-second barcode nucleotide sequence-complementary sticky end-3'. In this case, ligating the adaptor to the above-described tagged target nucleotide sequence comprising the first additional nucleotide sequence yields a combinatorially tagged target nucleotide sequence comprising: 5'-second additional nucleotide sequence-second barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first additional nucleotide sequence-3'; and/or 5'-second additional nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first barcode nucleotide sequence-second barcode nucleotide sequence-first additional nucleotide sequence-3'. In a variation of this embodiment, the first and/or second additional nucleotide sequence comprises a primer binding site.

The tagged target nucleotide sequence comprising the first additional nucleotide sequence may be prepared by any convenient means, such as, for example, by ligating adaptors to the plurality of target nucleic acids, wherein the adaptors comprise: a first adaptor comprising an endonuclease site, a first barcode nucleotide sequence, a first primer binding site, and a sticky end; and a second adaptor comprising a first additional nucleotide sequence, a second primer binding site and a sticky end.

Tagging based on combinatorial insertional mutagenesis

Combinatorial tagging can also be performed using insertional mutagenesis. In certain embodiments, the combinatorial tagging of a plurality of target nucleotide sequences is performed by annealing a plurality of barcode primers to a plurality of tagged target nucleotide sequences derived from target nucleic acids, and then amplifying the tagged target nucleotide sequences in a first reaction mixture to produce a plurality of combinatorially tagged target nucleotide sequences, each comprising first and second barcode nucleotide sequences, wherein the plurality comprises N x M different first and second barcode combinations.

In a specific embodiment, each tagged target nucleotide sequence comprises a nucleotide tag at one end and a first barcode nucleotide sequence, wherein the plurality of tagged target nucleotide sequences comprises the same nucleotide tag but N different first barcode nucleotide sequences, wherein N is an integer greater than 1. In variations of such embodiments, the first barcode nucleotide sequence is separated from the nucleotide tag by the target nucleotide sequence. Each barcode primer comprises a first tag-specific portion linked to a second barcode nucleotide sequence, which is itself linked to a second tag-specific portion, wherein the plurality of barcode primers each comprise the same first and second tag-specific portions but M different second barcode nucleotide sequences, wherein M is an integer greater than 1. The first tag-specific portion of the barcode primer anneals to a 5' portion of the nucleotide tag, the second tag-specific portion of the barcode primer anneals to an adjacent 3' portion of the nucleotide tag, and the second barcode nucleotide sequence does not anneal to the nucleotide tag, forming a loop between the annealed first and second tag-specific portions.
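The net effect of the looped barcode primer can be sketched as a string operation: the second barcode ends up inserted between the 5' and 3' portions of the nucleotide tag. This is a simplified model under stated assumptions: all sequences are written in the same strand orientation (complementarity and the actual annealing and extension steps are ignored), and every sequence and name below is hypothetical.

```python
def insert_loop_barcode(tagged_target, tag_5p, tag_3p, second_barcode):
    """Insert the second barcode between the two adjacent tag portions.

    Models only the outcome of amplification with a barcode primer of the
    form tag_5p + second_barcode + tag_3p; strand sense is ignored.
    """
    start = tagged_target.find(tag_5p + tag_3p)
    if start < 0:
        raise ValueError("nucleotide tag not found in tagged target")
    split = start + len(tag_5p)
    return tagged_target[:split] + second_barcode + tagged_target[split:]


TAG_5P, TAG_3P = "ACACGACG", "CTCTTCCG"     # hypothetical tag portions
TARGET = "TTGACCA"                          # hypothetical target sequence
first_barcodes = ["AAGG", "CCTT"]           # N = 2
second_barcodes = ["GATA", "TCGC", "ATAT"]  # M = 3

# Each tagged target has the form: nucleotide tag - target - first barcode.
products = {
    (bc1, bc2): insert_loop_barcode(TAG_5P + TAG_3P + TARGET + bc1, TAG_5P, TAG_3P, bc2)
    for bc1 in first_barcodes
    for bc2 in second_barcodes
}
print(len(products))  # N x M combinatorially tagged products
```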

In particular embodiments, such as those useful in DNA sequencing, the tagged target nucleotide sequence additionally comprises a primer binding site between the target nucleotide sequence and the first barcode nucleotide sequence. In variations of such embodiments, the first and second tag-specific portions of the barcode primer are sufficiently long to serve as primer binding sites. To aid in sequencing, one or more, or preferably all, of these binding sites are binding sites for DNA sequencing primers. In such embodiments, the combinatorially tagged target nucleotide sequence may comprise 5'-first tag-specific portion-second barcode nucleotide sequence-second tag-specific portion-target nucleotide sequence-primer binding site-first barcode nucleotide sequence-3'.

In some embodiments, it is advantageous to include one or more additional nucleotide sequences in the tagged target nucleotide sequence, e.g., to aid in manipulation and/or identification. As such, the tagged target nucleotide sequence may comprise a first additional nucleotide sequence having the following arrangement: 5'-nucleotide tag-target nucleotide sequence-primer binding site-first barcode nucleotide sequence-first additional nucleotide sequence-3'. For example, in Illumina sequencing, flow cell binding sequences (e.g., PE1 and PE2) are incorporated at either end of a DNA template to be sequenced. In the present method, the tagged target nucleotide sequence may comprise one flow cell binding sequence as the first additional nucleotide sequence, and the other flow cell binding sequence may be introduced via a barcode primer. See, for example, FIG. 6. As such, the method may employ a barcode primer comprising a second additional nucleotide sequence and having the following arrangement: 5'-second additional nucleotide sequence-first tag-specific portion-second barcode nucleotide sequence-second tag-specific portion-3'. In this case, amplification produces a combinatorially tagged target nucleotide sequence comprising 5'-second additional nucleotide sequence-first tag-specific portion-second barcode nucleotide sequence-second tag-specific portion-target nucleotide sequence-primer binding site-first barcode nucleotide sequence-first additional nucleotide sequence-3'. In a variation of this embodiment, the first and/or second additional nucleotide sequence comprises a primer binding site.

The target nucleotide sequence may be tagged by any convenient means, including primer-based methods described herein. In certain embodiments, the nucleotide tag comprises a transposon end that is incorporated into the tagged target nucleotide sequence using a transposase.

Reactions incorporating nucleic acid sequences

Any method can be used to incorporate a nucleic acid sequence into a target nucleic acid. In an exemplary embodiment, PCR is employed. When three or more primers are used, amplification is typically performed for at least three cycles to incorporate the first and second nucleotide tags and the barcode nucleotide sequence. In various embodiments, amplification is performed for 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 cycles, or for any number of cycles falling within a range having any of these values as endpoints (e.g., 5-10 cycles). In particular embodiments, amplification is performed for a sufficient number of cycles to normalize target amplicon copy number across targets and across samples (e.g., 15, 20, 25, 30, 35, 40, 45, or 50 cycles, or any number of cycles falling within a range having any of these values as endpoints).

Particular embodiments of the above-described methods provide for substantially uniform amplification, producing a plurality of target amplicons, wherein a majority of the amplicons are present at a level that is relatively close to an average copy number calculated for the plurality of target amplicons. As such, in various embodiments, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the target amplicons are present at greater than 50% of the average copy number of the target amplicons and less than 2-fold the average copy number of the target amplicons.
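The uniformity criterion above (amplicons present at greater than 50% and less than 2-fold of the average copy number) can be computed directly. The sketch below is illustrative only; the function name and the copy-number values are hypothetical.

```python
def fraction_within_twofold_of_mean(copy_numbers):
    """Fraction of amplicons with copy number > 0.5x and < 2x the mean."""
    if not copy_numbers:
        raise ValueError("at least one copy number is required")
    mean = sum(copy_numbers) / len(copy_numbers)
    within = [c for c in copy_numbers if 0.5 * mean < c < 2.0 * mean]
    return len(within) / len(copy_numbers)


# Hypothetical copy numbers for six target amplicons.
copy_numbers = [100, 120, 80, 90, 400, 10]
uniformity = fraction_within_twofold_of_mean(copy_numbers)
print(f"{uniformity:.0%} of amplicons fall within the window")
```

With these made-up values, the mean is about 133 copies, so the amplicons at 400 and 10 copies fall outside the window and four of the six amplicons fall inside it.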

Applications

In illustrative embodiments, the barcode nucleotide sequence identifies a particular sample. Thus, for example, a set of T target nucleic acids can be amplified in each of S samples, where S and T are integers, typically greater than 1. In such embodiments, amplification may be performed separately for each sample, with the same set of forward and reverse primers being used for each sample, and the set of forward and reverse primers having at least one nucleotide tag that is common to all primers in the set. A different barcode primer may be used for each sample, where the barcode primers have different barcode nucleotide sequences but the same tag-specific portion, which can anneal to the common nucleotide tag. This embodiment has the advantage of reducing the number of different primers that must be synthesized to encode sample origin in amplicons generated for multiple target sequences. Alternatively, a different set of forward and reverse primers may be used for each sample, where each set has a nucleotide tag different from that of the primers in the other sets, and a different barcode primer is used for each sample, where the barcode primers have different barcode nucleotide sequences and different tag-specific portions. In either case, amplification produces, from each sample, a set of T amplicons bearing sample-specific barcodes.

In embodiments where the same set of forward and reverse primers is used for each sample, the forward and reverse primers for each target can be initially combined with the sample separately, and each barcode primer can be initially combined with its corresponding sample. The initially combined aliquots of forward and reverse primers can then be added to the initially combined aliquots of sample and barcode primers to generate S × T amplification mixtures. These amplification mixtures may be formed in any article that can be subjected to conditions suitable for amplification. For example, the amplification mixture may be formed in or distributed to separate compartments of the microfluidic device prior to amplification. In illustrative embodiments, suitable microfluidic devices include matrix-type microfluidic devices, such as those described below.
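The S × T layout can be sketched as a simple cross product of samples and target-specific primer pairs, with one barcode primer per sample. The sample names, target names, and primer labels are hypothetical placeholders, and the primer count shown applies only to the design in which all target-specific primers share a common nucleotide tag.

```python
from itertools import product


def build_amplification_mixtures(samples, targets):
    """Pair every sample (with its barcode primer) with every target primer pair."""
    mixtures = []
    for (sample_index, sample), target in product(enumerate(samples), targets):
        mixtures.append({
            "sample": sample,
            "barcode_primer": f"BC{sample_index + 1}",  # one barcode per sample
            "primer_pair": (f"{target}_fwd", f"{target}_rev"),
        })
    return mixtures


samples = ["sample_1", "sample_2", "sample_3"]  # S = 3 (hypothetical)
targets = ["target_A", "target_B"]              # T = 2 (hypothetical)

mixtures = build_amplification_mixtures(samples, targets)

# With a common nucleotide tag, only 2T target-specific primers and
# S barcode primers are needed in total.
primers_to_synthesize = 2 * len(targets) + len(samples)
print(len(mixtures), primers_to_synthesize)
```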

In certain embodiments, the target amplicons produced in any of the methods described herein can be recovered from the amplification mixture. For example, a matrix-type microfluidic device (see below) adapted to allow recovery of the contents of each reaction compartment may be used for amplification to produce target amplicons. In variations of these embodiments, the target amplicon may be further amplified and/or analyzed. In certain embodiments, the amount of target amplicon produced in the amplification mixture can be quantified during amplification, for example, by quantitative real-time PCR, or later.

In embodiments useful for single-particle analysis, combinatorial barcoding can be used to encode the identity of the reaction volume, and thus of the particle that is the source of the amplification product. In particular embodiments, nucleic acid amplification is performed using at least two barcode sequences, and the combination of barcode sequences encodes the identity of the reaction volume from which the reaction product is derived (referred to as "combinatorial barcoding"). These embodiments are conveniently employed when the separate reaction volumes are in separate compartments of a matrix-type microfluidic device, such as those available from Fluidigm Corp. (South San Francisco, CA) and described below (see "Microfluidic device"). Each compartment may comprise a combination of barcode nucleotide sequences identifying the row and column of the compartment in which the encoding reaction is performed. If the reaction volume is recovered and subjected to further analysis (e.g., by DNA sequencing) that includes detection of the barcode combination, the results can be correlated to a particular compartment and thus to a particular particle in that compartment. Such embodiments are particularly useful when the separate reaction volumes are combined during or after the recovery process, thereby combining ("pooling") the reaction products from multiple separate reaction volumes. In a matrix-type microfluidic device, for example, reaction products from all compartments in a row, all compartments in a column, or all compartments in the device can be pooled. If all compartments in a row are pooled, each column in the row preferably has a unique barcode combination. If all compartments in a column are pooled, each row in the column preferably has a unique barcode combination. If all compartments in the device are pooled, each compartment in the device preferably has a unique barcode combination.
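The row-and-column encoding can be sketched as a lookup table from compartment position to barcode combination, together with the reverse lookup used after pooling. The barcode sequences, the array size, and the function names are hypothetical and serve only to illustrate the bookkeeping.

```python
def assign_barcode_combinations(row_barcodes, column_barcodes):
    """Map each (row, column) compartment to its (row barcode, column barcode)."""
    return {
        (r, c): (row_bc, col_bc)
        for r, row_bc in enumerate(row_barcodes)
        for c, col_bc in enumerate(column_barcodes)
    }


def source_compartment(combinations, observed):
    """Return the compartment whose barcode combination matches the observed pair."""
    matches = [pos for pos, combo in combinations.items() if combo == observed]
    if len(matches) != 1:
        raise ValueError("barcode combination does not identify one compartment")
    return matches[0]


row_barcodes = ["AACG", "CCGT"]             # 2 rows (hypothetical sequences)
column_barcodes = ["GGTA", "TTAC", "ACAC"]  # 3 columns (hypothetical sequences)

combinations = assign_barcode_combinations(row_barcodes, column_barcodes)

# After pooling the whole device, a read carrying this barcode pair
# still traces back to a single compartment.
print(source_compartment(combinations, ("CCGT", "TTAC")))
```

Because every compartment receives a distinct pair, products pooled from a row, a column, or the whole device remain attributable to their compartment of origin.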

Barcoding and pooling reaction mixtures for subsequent analysis

In other embodiments, a barcoding and pooling strategy is used to detect multiple target nucleic acids in individual reaction mixtures, each of which may, for example, comprise an individual particle, such as a cell. This strategy is described in Example 7 below for single-cell analysis of gene expression.

In one embodiment, the method includes preparing M first reaction mixtures to be pooled prior to assay, where M is an integer greater than 1. Each reaction mixture comprises sample nucleic acids; a first, forward primer comprising a target-specific portion; and a first, reverse primer comprising a target-specific portion. The first, forward primer or the first, reverse primer may additionally comprise a barcode nucleotide sequence, wherein the barcode nucleotide sequence in each of the M reaction mixtures is different. Optionally, the first, forward primer or the first, reverse primer additionally comprises a nucleotide tag, and each reaction mixture additionally comprises at least one barcode primer comprising a barcode nucleotide sequence and a nucleotide tag-specific portion, wherein the barcode nucleotide sequence in each of the M reaction mixtures is different. In this embodiment, the barcode primer is generally in excess of the first, forward and/or first, reverse primer. A first reaction is performed in each first reaction mixture to produce a plurality of barcoded target nucleotide sequences, each comprising a target nucleotide sequence linked to a barcode nucleotide sequence. The barcoded target nucleotide sequences from each of the M reaction mixtures are pooled to form an assay pool. In this assay pool, a particular target nucleotide sequence from a particular reaction mixture is uniquely identified by a particular barcode nucleotide sequence. A second reaction is performed on the assay pool, or on one or more aliquots thereof, using unique second primer pairs, wherein each second primer pair comprises a second, forward or reverse primer that anneals to the target nucleotide sequence and a second, reverse or forward primer, respectively, that anneals to the barcode nucleotide sequence. The method includes determining, for each unique, second primer pair, whether a reaction product is present in the assay pool or aliquot thereof.
For each unique, second primer pair, the presence of a reaction product indicates the presence of a particular target nucleic acid in a particular first reaction mixture.
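The logic of the two-step assay can be sketched as set operations. In this simplified model, the first reaction attaches each mixture's barcode to every target present in that mixture, and a second primer pair yields a product only if its (target, barcode) pairing exists in the pool. The target names, barcode labels, and function names are hypothetical, and reaction efficiency and detection thresholds are ignored.

```python
def first_reaction_and_pool(mixtures):
    """mixtures maps barcode -> set of targets present in that mixture.

    Returns the assay pool as a set of (target, barcode) products.
    """
    return {
        (target, barcode)
        for barcode, targets in mixtures.items()
        for target in targets
    }


def second_reaction(assay_pool, targets, barcodes):
    """For each (target, barcode) primer pair, report whether a product forms."""
    return {
        (target, barcode): (target, barcode) in assay_pool
        for target in targets
        for barcode in barcodes
    }


# M = 3 first reaction mixtures, each with its own barcode (hypothetical data).
mixtures = {
    "BC1": {"target_A", "target_B"},
    "BC2": {"target_A"},
    "BC3": set(),
}

assay_pool = first_reaction_and_pool(mixtures)
results = second_reaction(assay_pool, ["target_A", "target_B"], list(mixtures))

# A positive result for (target_B, BC1) indicates target_B in mixture BC1.
for pair, present in sorted(results.items()):
    print(pair, present)
```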

In certain embodiments, the method comprises preparing M x N first reaction mixtures, wherein N is an integer greater than 1, and each first reaction mixture comprises a first, forward and reverse primer pair specific for a different target nucleic acid. After the first reaction, N assay pools, each comprising M first reaction mixtures, are prepared, wherein each barcoded target nucleotide sequence in an assay pool comprises a different barcode nucleotide sequence. The second reaction is carried out separately in each of the N assay pools.

For the first reaction, any reaction that can produce a linkage of the target nucleotide sequence to the barcode nucleotide sequence may be performed. Convenient first reactions include amplification and ligation.

The second reaction may be any reaction that relies on primer-based detection of the barcoded target nucleotide sequence. Methods comprising amplification and/or ligation steps, including any of those described herein and/or known in the art, may be used. For example, the presence of the reaction product can be detected using Polymerase Chain Reaction (PCR) or Ligase Chain Reaction (LCR). In some embodiments, real-time detection is employed.

An exemplary second reaction may employ LCR to detect a barcoded target nucleotide sequence having the following structure: 5'-forward primer sequence-target nucleotide sequence-reverse primer sequence-barcode nucleotide sequence-3'. In this case, one primer can anneal to the reverse primer sequence and the other primer can anneal to the adjacent barcode nucleotide sequence, followed by ligation and repeated cycles of annealing and ligation. The reverse primer sequence provides target information, and the barcode nucleotide sequence identifies a pool (which may, for example, represent a pool of all targets amplified in a particular sample). See FIG. 8A.

Exemplary second reactions may include real-time detection, for example, using a flap endonuclease-ligase chain reaction. This reaction employs a labeled probe and an unlabeled probe, wherein simultaneous hybridization of the probes to the reaction product results in the formation of a flap at the 5' end of the labeled probe, and cleavage of the flap produces a signal. For example, cleavage of the flap can separate a fluorophore from a quencher to generate a signal. Exemplary embodiments can be employed to detect reaction products having the following structure: 5'-forward primer sequence-target nucleotide sequence-reverse primer sequence-barcode nucleotide sequence-3'. In this case, the reaction may employ an unlabeled probe that anneals to the reverse primer sequence and a labeled probe that anneals to the adjacent barcode nucleotide sequence. Annealing of the 3' end of the unlabeled probe prevents annealing of the 5' end of the labeled probe, forming a flap. This 5' flap portion may be labeled with a fluorophore, and the portion that anneals to the barcode nucleotide sequence may carry a quencher, such that cleavage by an enzyme, such as a 5' flap endonuclease, releases the flap, whereby the quencher is no longer capable of quenching the fluorophore. See FIG. 8B.

Useful alternative real-time detection methods, for example for detecting amplicons produced by LCR, rely on the use of double-stranded DNA-binding dyes to detect differences in melting temperature between the reaction product and the primers used in the LCR. Melting temperature analysis involves detection at a temperature at which the reaction product is substantially double-stranded and is capable of producing a signal in the presence of the double-stranded DNA-binding dye, but the primers are substantially single-stranded and are incapable of producing a signal. For example, to detect a barcoded target nucleotide sequence having the following structure: 5'-forward primer sequence-target nucleotide sequence-reverse primer sequence-barcode nucleotide sequence-3', one primer can anneal to the reverse primer sequence and the other primer can anneal to the adjacent barcode nucleotide sequence, followed by ligation and repeated cycles of annealing and ligation. See FIG. 8C. The ligated primer sequence, e.g., R1 plus BC1, annealed to its complement is sufficiently longer than R1 or BC1 annealed to its respective complement that, at an elevated temperature, the ligated primer sequence is substantially double-stranded (i.e., signal-producing) and the unligated primer sequences are substantially single-stranded (i.e., not signal-producing). In various embodiments, at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the unligated primers are single-stranded. In each of these embodiments, the percentage of ligated primers that are double-stranded can be at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90%.

In certain embodiments, the first reaction mixtures are prepared in separate compartments of a microfluidic device, the separate compartments being arranged as an array defined by rows and columns, e.g., such as those available from Fluidigm Corp. For example, a matrix-type microfluidic device (see below) adapted to allow recovery of the contents of the reaction compartments may be used for the first reaction. This approach is particularly convenient for preparing N assay pools, each containing M first reaction mixtures. More specifically, the first reaction is carried out in separate compartments of the microfluidic device, wherein the separate compartments are arranged as an array defined by rows and columns. Each of the N assay pools is obtained by pooling the first reaction mixtures in a row or column of the device. The barcode nucleotide sequence in each barcoded target nucleotide sequence, together with the identity of the assay pool, identifies the row and column of the compartment from which the barcoded target nucleotide sequence originated. In a specific embodiment, the second reaction mixtures are prepared in separate compartments of a microfluidic device having separate compartments arranged as an array defined by rows and columns. For example, the first reaction mixtures can be prepared in separate compartments of a first microfluidic device to incorporate barcode nucleotide sequences (e.g., Fluidigm Corporation's ACCESS ARRAY™ IFC (Integrated Fluidic Circuit) or MA006 IFC), and the second reaction mixtures can be prepared in separate compartments of a second, different microfluidic device, e.g., to facilitate detection (e.g., one of Fluidigm Corporation's DYNAMIC ARRAY™ IFCs, using PCR or RT-PCR with a double-stranded DNA-binding dye such as EvaGreen for detection).
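How a pool identity and a single barcode together locate a compartment can be sketched as follows. The sketch assumes, for illustration, that pooling is done along one axis of the array and that barcodes are assigned along the other axis in a known order; the barcode sequences and the function name are hypothetical.

```python
def locate_compartment(pool_index, barcode, barcodes_in_order, pooled_by="row"):
    """Return the (row, column) of origin for a barcoded product.

    pool_index        index of the assay pool (the pooled row or column)
    barcode           barcode detected on the product
    barcodes_in_order barcodes assigned along the non-pooled axis
    pooled_by         "row" if each pool combines one row, else "column"
    """
    position = barcodes_in_order.index(barcode)  # ValueError if unknown barcode
    if pooled_by == "row":
        return (pool_index, position)
    if pooled_by == "column":
        return (position, pool_index)
    raise ValueError("pooled_by must be 'row' or 'column'")


# M = 3 barcodes, one per position within each pooled row (hypothetical).
barcodes = ["GGTA", "TTAC", "ACAC"]

# A product carrying TTAC found in the pool made from row 2:
print(locate_compartment(2, "TTAC", barcodes, pooled_by="row"))
```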

In particular embodiments, at least one of the first and/or second reactions is performed on individual particles, such as cells. Particle capture and detection can be performed as described below or as known in the art. The MA006 IFC from Fluidigm Corporation is well suited for this purpose. The particles can be substantially intact when undergoing the first and/or second reaction, provided that the necessary reactants can be brought into contact with the target nucleic acid of interest. Optionally, the particles may be disrupted prior to the first or second reaction to aid barcoding and/or subsequent analysis. In some embodiments, the particles are treated with an agent that elicits a biological response prior to performing the plurality of first reactions.

Subsequent analysis

Any of the above-described methods of incorporating a nucleic acid sequence into a target nucleic acid (including the barcoding and pooling methods described above) can include any of a variety of analytical steps, such as determining the amount of at least one target nucleic acid in a first reaction mixture or determining the copy number of one or more DNA molecules in a first reaction mixture. In certain embodiments in which tagged or barcoded target nucleotide sequences are generated by PCR, e.g., those in which copy number determinations are made, PCR is advantageously performed for less than 20 cycles to preserve the relative copy number of different target nucleotide sequences.
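Why a limited number of cycles preserves relative copy number can be illustrated with a toy model. The sketch assumes ideal doubling in every cycle and a shared plateau at which amplification stops; the plateau value, the starting copy numbers, and the function name are hypothetical and are not taken from the methods described herein.

```python
def amplify(start_copies, cycles, plateau):
    """Toy PCR model: copies double each cycle until a plateau is reached."""
    copies = start_copies
    for _ in range(cycles):
        copies = min(copies * 2, plateau)
    return copies


PLATEAU = 10**9                          # hypothetical maximum yield per target
low_start, high_start = 10, 1000         # 100-fold difference before PCR

for cycles in (15, 40):
    low = amplify(low_start, cycles, PLATEAU)
    high = amplify(high_start, cycles, PLATEAU)
    print(cycles, high / low)
```

In this model the 100-fold starting difference is still present after 15 cycles, whereas after 40 cycles both targets have reached the plateau and the ratio has collapsed to 1, which corresponds to the normalization that a larger number of cycles can produce.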

Any of the above methods can comprise determining a genotype at one or more loci in the first reaction mixture and/or determining a haplotype for a plurality of loci in the first reaction mixture. Haplotype determination can be performed, for example, by condensing chromosomes and distributing them into first reaction mixtures to produce a plurality of first reaction mixtures each comprising a single chromosome. This distribution can be carried out, for example, as described below for single-particle analysis (in which case the "particle" being analyzed is a chromosome). Multiple loci in the first reaction mixture, which are therefore necessarily on the same chromosome, can be sequenced to provide a haplotype for those loci.

In any of the above methods, for example those in which RT-PCR is performed, the expression level of one or more RNA molecules in the first reaction mixture can be determined. As with DNA copy number determination, it is advantageous to perform PCR for fewer than 20 cycles to preserve the relative copy numbers of different target nucleotide sequences.

Whether the target nucleic acid in the first reaction mixture is DNA or RNA, subsequent analysis can include determining the sequence of the target nucleotide sequence produced therefrom.

In some embodiments, the methods described herein comprise performing a plurality of reactions in each first reaction mixture, wherein one of the plurality of reactions comprises amplification to produce a tagged or barcoded target nucleotide sequence, analyzing the results of the plurality of reactions, and correlating the analysis results with each first reaction mixture. This association may be aided by tagging or barcoding the target nucleotide sequence as mentioned above. For example, combinatorial barcoding can be used to encode information about a source reaction mixture. Alternatively, a combination of primer sequences and barcodes can encode this information, as discussed above for the barcoding and pooling methods.
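As an illustration of how combinatorial barcoding might encode the source reaction mixture, the sketch below assumes a hypothetical matrix layout in which each row and each column of reaction mixtures contributes one barcode, so that a pair of barcodes identifies one mixture. The barcode sequences, the row/column layout, and the function names are assumptions made for this example and are not taken from the specification.

```python
from itertools import product

# Hypothetical barcode sequences, chosen only for illustration.
ROW_BARCODES = {0: "ACGT", 1: "TGCA"}
COL_BARCODES = {0: "GGAA", 1: "CCTT", 2: "ATAT"}


def build_lookup(row_barcodes, col_barcodes):
    """Map each (row barcode, column barcode) pair to its (row, column) mixture."""
    return {
        (row_bc, col_bc): (row, col)
        for (row, row_bc), (col, col_bc) in product(
            row_barcodes.items(), col_barcodes.items()
        )
    }


def decode(row_bc, col_bc, lookup):
    """Return the source reaction mixture for a barcode pair, or None if unknown."""
    return lookup.get((row_bc, col_bc))


LOOKUP = build_lookup(ROW_BARCODES, COL_BARCODES)
source_mixture = decode("TGCA", "ATAT", LOOKUP)
print(source_mixture)
```

With two row barcodes and three column barcodes, five barcode sequences are enough to distinguish six reaction mixtures, which is the practical appeal of a combinatorial scheme.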

Bidirectional nucleic acid sequencing

In particular embodiments, the invention provides methods of preparing nucleic acids for bidirectional DNA sequencing that facilitate sequencing both ends of an amplification product in a single read sequencing run. Such a method is exemplified in example 9.

The DNA to be sequenced may be any type of DNA. In a particular embodiment, the DNA is genomic DNA or cDNA from an organism. In some embodiments, the DNA may be fragmented DNA. The DNA to be sequenced may be representative of the RNA in the sample, wherein the DNA is obtained by, for example, reverse transcription or amplification of the RNA. In certain embodiments, the DNA may be a DNA library.

To prepare nucleic acids for bidirectional DNA sequencing according to the methods described herein, each target nucleic acid to be sequenced is amplified using an inner primer set, wherein the set comprises:

an inner, forward primer comprising a target-specific portion and a first primer binding site;

an inner, reverse primer comprising a target-specific portion and a second primer binding site, wherein the first and second primer binding sites are different. These first and second primer binding sites serve a dual function: they act as nucleotide tags (as described below) to aid in the addition of further nucleotide sequences and, in certain embodiments, as primer binding sites to which DNA sequencing primers can anneal. In the specific embodiment of Example 9, the first and second primer binding sites are designated "CS1" and "CS2", for "common sequence tag 1" and "common sequence tag 2". In this embodiment, the target-specific portions of the inner primers are designated "TS-F" for "target-specific forward" and "TS-R" for "target-specific reverse".

After amplification, the target nucleotide sequence becomes tagged with the first and second primer binding sites. These tagged target nucleotide sequences are then contacted with two sets of outer primers, which anneal to the first and second primer binding sites. The two sets of outer primers comprise:

a first outer primer set, wherein the set comprises:

a first outer, forward primer comprising a portion specific for a first primer binding site; and

a first outer, reverse primer comprising a barcode nucleotide sequence and a portion specific for a second primer binding site;

a second outer primer set, wherein the set comprises:

a second outer, forward primer comprising a barcode nucleotide sequence and a portion specific for the first primer binding site; and

a second outer, reverse primer comprising a portion specific for the second primer binding site. Amplification then produces two target amplicons, namely:

a first target amplicon comprising 5'-first primer binding site-target nucleotide sequence-second primer binding site-barcode nucleotide sequence-3'; and

a second target amplicon comprising 5'-barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-3'. In particular embodiments, the barcode nucleotide sequence in each of the two target amplicons is the same, and each target amplicon comprises only one barcode nucleotide sequence. In some embodiments, when more than one target nucleic acid is amplified, each pair of target amplicons produced may have the same barcode sequence, but different pairs may have different barcode sequences. In this case, the barcode sequence will differ between target amplicons generated from different target nucleic acids. As discussed above, for example, different sets of target nucleic acids from a particular biological sample can be barcoded with the same set-specific sequences (i.e., sequences that differ between sets). In particular embodiments, the set-specific barcode may be a sample-specific barcode, i.e., a barcode that identifies the sample from which the target amplicon is derived.

In certain embodiments, the outer primers each further comprise an additional nucleotide sequence, wherein:

the first outer, forward primer comprises a first additional nucleotide sequence and the first outer, reverse primer comprises a second additional nucleotide sequence; and

the second outer, forward primer comprises a second additional nucleotide sequence and the second outer, reverse primer comprises a first additional nucleotide sequence, and the first and second additional nucleotide sequences are different. In such embodiments, the outer primer amplification produces two target amplicons, namely:

a first target amplicon comprising 5'-first additional nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-barcode nucleotide sequence-second additional nucleotide sequence-3'; and

a second target amplicon comprising 5'-second additional nucleotide sequence-barcode nucleotide sequence-first primer binding site-target nucleotide sequence-second primer binding site-first additional nucleotide sequence-3'. (Those skilled in the art will understand that the amplicons described herein in this manner are described for one strand, and the complementary strand will have the 5' to 3' order of these nucleotide sequences reversed.)

The first and/or second additional nucleotide sequence may further comprise a primer binding site. An exemplary primer configuration of this type is described in Example 9, wherein the additional nucleotide sequences are designated "PE-1" and "PE-2". These sequences are adaptor sequences used by the Genome Analyzer (commercially available from Illumina, Inc., San Diego, Calif.). The barcode nucleotide sequence is designated "BC". Outer primer amplification with these primers yields two target amplicons, namely:

a first target amplicon comprising 5'-PE1-CS1-target nucleotide sequence-CS2-BC-PE2-3'; and

a second target amplicon comprising 5'-PE2-BC-CS1-target nucleotide sequence-CS2-PE1-3'. In a specific, exemplary embodiment, the first outer primer set (PE1-CS1 and PE2-BC-CS2) and the second outer primer set (PE1-CS2 and PE2-BC-CS1) have the nucleotide sequences set forth in Table 1 of Example 9.
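The two amplicon layouts described above can be sketched as ordered lists of segment labels. This is only a bookkeeping illustration, not a primer design tool: the function names are hypothetical, and segment labels (PE1, CS1, CS2, BC, PE2) stand in for the actual nucleotide sequences of Table 1, which are not reproduced here.

```python
def layout(segments):
    """Render an ordered list of segment labels as a 5'-to-3' layout string."""
    return "5'-" + "-".join(segments) + "-3'"


def target_amplicons(target="target"):
    """Return the segment order of the two amplicon types for one target.

    Type A carries the barcode downstream of CS2; type B carries it
    upstream of CS1. Both types carry exactly one barcode.
    """
    type_a = ["PE1", "CS1", target, "CS2", "BC", "PE2"]
    type_b = ["PE2", "BC", "CS1", target, "CS2", "PE1"]
    return type_a, type_b


type_a, type_b = target_amplicons()
print(layout(type_a))
print(layout(type_b))
```

Writing the layouts this way makes the key property easy to check: the target is flanked by CS1 and CS2 in both amplicons, while the positions of PE1 and PE2 are swapped, which is what allows both ends of the target to be read in one run.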

The inner and outer primer amplifications can be performed in a single amplification reaction. Alternatively, the inner primer amplification may be performed in a first amplification reaction and the outer primer amplification may be performed in a second, different amplification reaction than the first. In certain embodiments, the second amplification reaction may be performed in two separate second amplification reactions: one with a first set of outer primers and the other with a second set of outer primers. See example 9, figure 2. In such embodiments, the target amplicons produced in each respective second amplification reaction can be pooled for further analysis, such as DNA sequencing.

In many embodiments, the above methods will be performed on a plurality of target nucleic acids, such as, for example, a DNA library. In this case, the method can be used to generate a pool of target amplicons comprising two types of amplicons (described above and shown in Example 9 as "Fig. 2A" and "Fig. 2B") for each target nucleic acid. One type of target amplicon ("Fig. 2A") facilitates sequencing the 5' end of the target nucleic acid, and the other type ("Fig. 2B") facilitates sequencing the 3' end of the target nucleic acid. In addition, each target amplicon comprises a barcode sequence, which in certain embodiments is the same in each of the two types of target amplicons. The barcode nucleotide sequence may encode information about the target nucleotide sequence, such as the identity of the reaction from which it was produced and/or the identity of the sample from which the target nucleic acid was derived. As described in more detail below, the target nucleotide sequence and barcode nucleotide sequence in each target amplicon can be readily determined using any suitable available DNA sequencing method. In particular embodiments, the DNA sequencing method is a high throughput sequencing method, such as the bridge amplification (cluster generation) and sequencing methods commercialized by Illumina, Inc. In certain embodiments, for example those employing bridge amplification and sequencing, the average length of the target amplicon is less than 200 bases, less than 150 bases, or less than 100 bases.

In bridge amplification and sequencing, for example, a target amplicon generated as described herein is hybridized to a lawn of immobilized primer pairs via the first and second additional nucleotide sequences (e.g., PE1 and PE2). One immobilized primer in each primer pair is cleavable. First strand synthesis is performed to produce double-stranded molecules. These are denatured, and the initially hybridized target amplicon strands, which served as templates for first strand synthesis, are washed away, leaving the immobilized first strands. These can flip over and hybridize with appropriate adjacent primers to form bridges. Second strand synthesis is performed to generate double-stranded bridges. These are denatured, and each bridge yields two immobilized single-stranded molecules, which can hybridize again with appropriate immobilized primers. Isothermal bridge amplification is performed to generate a plurality of double-stranded bridges. The double-stranded bridges are denatured, and the "reverse" strands are cleaved and washed away, leaving a cluster of immobilized "forward" strands that can be used as templates for DNA sequencing.

When bridge amplification and sequencing are carried out on target amplicons generated as described herein, primers that anneal to the first and second primer binding sites (e.g., CS1 and CS2) can be used to sequence the target nucleotide sequence or the barcode nucleotide sequence, both of which are present in the immobilized template generated from the amplicon. In certain embodiments, a primer pair suitable for sequencing the target nucleotide sequence is contacted with the immobilized template under conditions suitable for annealing, followed by DNA sequencing. After these sequences are read, the sequencing products can be denatured and washed away. The immobilized template can then be contacted with a primer pair suitable for sequencing the barcode nucleotide sequence under conditions suitable for annealing, followed by DNA sequencing. The order of these sequencing reactions is not critical and can be reversed (i.e., the barcode nucleotide sequence can be sequenced first, followed by the target nucleotide sequence). See Example 9, Figure 3. In certain embodiments, the primer that primes sequencing of the barcode nucleotide sequence is the reverse complement of the primer that primes sequencing of the target nucleotide sequence. In a specific, exemplary embodiment, the primers used to prime sequencing of the target nucleotide sequence and barcode nucleotide sequence are CS1, CS2, CS1rc, and CS2rc (Table 2, Example 9).
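The relationship between a target-sequencing primer and the corresponding barcode-sequencing primer (one being the reverse complement of the other) can be illustrated with a short helper. The sequence used below is a made-up placeholder, not the CS1 or CS2 sequence of Table 2, and the function name is an assumption for this sketch.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical placeholder standing in for a primer such as CS1.
primer_placeholder = "AACCGT"
primer_rc = reverse_complement(primer_placeholder)
print(primer_rc)
```

Applying the function twice returns the original sequence, which is a quick sanity check when a primer and its "rc" counterpart are entered by hand.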

Conveniently, both types of target amplicons are bridge amplified and sequenced in the same reaction to allow simultaneous sequencing of templates from each type of target amplicon. See Example 9, Figure 3. This allows simultaneous sequencing of each target nucleotide sequence from the 5' end (e.g., by sequencing the template from a type A amplicon in Example 9, Figure 3) and from the 3' end (e.g., by sequencing the template from a type B amplicon in Example 9, Figure 3). In particular embodiments, primers that bind to the first and second primer binding sites and prime sequencing of the target nucleotide sequences are present at substantially equal concentrations, thereby generating 5' and 3' DNA sequence information from each target nucleotide sequence. Similarly, in certain embodiments, primers that bind to the first and second primer binding sites and prime sequencing of the barcode nucleotide sequence are present at substantially equal concentrations, thereby generating barcode sequences from each template type (i.e., in Example 9, Figure 3, from either a type A amplicon or a type B amplicon).

When the inner amplification is performed as separate reactions, particularly when multiple target nucleic acids are amplified, it may be convenient to perform the individual reactions (e.g., amplifying 1, 2, 3, 4, 5 or more target nucleic acids per reaction) in separate compartments of a microfluidic device, such as any of those described herein or known in the art. As discussed below, suitable microfluidic devices may be fabricated at least partially from elastomeric materials.

In particular embodiments, the inner (or inner and outer) amplification is performed in a microfluidic device designed to facilitate recovery of amplification products after the amplification reaction, such as the ACCESS ARRAY™ IFC described herein (see Figs. 2-9) and available from Fluidigm, Inc., South San Francisco, CA. In this type of exemplary device, dilation pumping may be used to remove substantially all of the reaction product from the microfluidic device, providing uniformity between different reaction product pools. In this manner, it is possible to produce pools of barcoded reaction products that are uniform in volume and copy number. In various embodiments, the volume and/or copy number uniformity is such that the variability per pool recovered from the device is less than about 100%, less than about 90%, less than about 80%, less than about 70%, less than about 60%, less than about 50%, less than about 40%, less than about 30%, less than about 20%, less than about 17%, or less than about 15%, 12%, 10%, 9%, 8%, 7%, 6%, 5%, 4.5%, 4%, 3.5%, 3%, 2.5%, 2%, 1.5%, 1%, or 0.5% in terms of volume and/or copy number. Those skilled in the art understand that the volume and/or copy number variability can fall within any range bounded by any of these values (e.g., about 2% to about 7%). In exemplary embodiments, the volume of sample recovered from the microfluidic device does not vary by more than about 10%. Standard pipetting error is on the order of 5% to 10%. As such, the observed variability in volume is primarily attributable to pipetting error. Utilizing the systems and methods described herein reduces the time and labor required to prepare sequencing libraries as compared to conventional techniques.
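The text does not state which statistic is used to express pool-to-pool variability. As one plausible reading, the sketch below assumes the coefficient of variation (sample standard deviation divided by the mean, as a percentage) computed over the recovered volumes or copy numbers. The function name and the example volumes are hypothetical.

```python
from statistics import mean, stdev


def percent_variability(values):
    """Return the coefficient of variation (sample SD / mean) as a percentage."""
    if len(values) < 2:
        raise ValueError("need at least two pools to estimate variability")
    return 100.0 * stdev(values) / mean(values)


# Hypothetical recovered volumes, in microliters, for three pools.
recovered_volumes_ul = [9.0, 10.0, 11.0]
variability = percent_variability(recovered_volumes_ul)
print(f"{variability:.1f}%")
```

Under this assumption, the three example pools show a variability of 10%, which is within the 5% to 10% range attributed above to pipetting error alone.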

One skilled in the art will know of other devices and strategies that can be used to perform the inner (or inner and outer) amplification described herein for a plurality of different target nucleic acids, each in separate reactions. For example, droplet-based amplification is well suited to performing the inner amplification. See, for example, U.S. Patent No. 7,294,503 to Quake et al., entitled "Microfabricated crossflow devices and methods," issued November 13, 2007, which is incorporated herein by reference in its entirety, particularly for its description of apparatus and methods for forming and analyzing droplets; U.S. Patent Publication No. 20100022414 to Link et al., entitled "Droplet libraries," published January 28, 2010, which is incorporated herein by reference in its entirety, particularly for its description of apparatus and methods for forming and analyzing droplets; and U.S. Patent Publication No. 20110000560 to Miller et al., entitled "Manipulation of Microfluidic Droplets," published January 6, 2011, which is incorporated herein by reference in its entirety, particularly for its description of apparatus and methods for forming and analyzing droplets. In particular embodiments, the inner amplification is performed in droplets of an emulsion.

Encoding and detection/quantification of alleles by primer extension

Nucleic acid encoding can be employed in methods for detecting and evaluating the proportion of a particular target nucleic acid (e.g., a rare mutation) in a nucleic acid sample. The method includes generating first and second tagged target nucleotide sequences from first and second target nucleic acids in a sample. For example, allele-specific amplification can be used to introduce an allele-specific nucleotide tag into each resulting tagged target nucleotide sequence. The tagged target nucleotide sequences are then subjected to primer extension reactions using a primer specific for each nucleotide tag. The method comprises detecting and/or quantifying a signal indicative of extension of the first primer and a signal indicative of extension of the second primer. The signal for a given primer indicates the presence and/or relative amount of the corresponding target nucleic acid. This method can be conveniently performed on a high throughput (e.g., next-generation) DNA sequencing platform to detect, for example, known mutations in a sample by detecting the presence of tags, rather than by determining the DNA sequence of each molecule. The benefits of this approach are speed, sensitivity and accuracy. The large number of clonal molecules examined in next-generation sequencing allows reliable detection of very rare sequences (e.g., fewer than 1 in 10^6 sequences). Moreover, the proportion of target sequences (e.g., mutations) can be determined more accurately than by PCR, because next-generation sequencing platforms can obtain a very high number of reads.

To facilitate primer extension on the DNA sequencing platform, adaptors for, e.g., high throughput DNA sequencing can be introduced to the first and second tagged target nucleotide sequences. In particular embodiments, an adaptor is introduced to each end of the tagged target nucleotide sequence molecule. These adapters can be conveniently introduced in one reaction together with the nucleotide tag.

The nucleotide tag and/or DNA sequencing adaptor may be introduced into the target nucleotide sequence using any suitable method such as, for example, amplification or ligation. For example, first and second tagged target nucleotide sequences can be generated by amplifying first and second target nucleic acids with first and second primer pairs, respectively. At least one primer of the first primer pair comprises a first nucleotide tag and at least one primer of the second primer pair comprises a second nucleotide tag. When DNA sequencing adaptors are introduced in the same reaction, one primer of each primer pair comprises 5'-(DNA sequencing adaptor)-(nucleotide tag)-(target-specific portion)-3' and the other primer of each primer pair comprises 5'-(DNA sequencing adaptor)-(target-specific portion)-3'.

Many high throughput DNA sequencing techniques involve an amplification step prior to DNA sequencing. Thus, in some embodiments, the tagged target nucleotide sequence is further amplified prior to primer extension on the DNA sequencing platform. For example, emulsion amplification or bridge amplification may be performed. Emulsion PCR (emPCR) isolates individual DNA molecules, along with primer-coated beads, in aqueous droplets within an oil phase. PCR generates multiple copies of each DNA molecule, which attach to the primers on the bead, followed by immobilization for subsequent sequencing. emPCR is used in the methods of Margulies et al. (commercialized by 454 Life Sciences, Branford, CT, and referred to herein as "454 sequencing"), in the methods of Shendure and Porreca et al. (also known as "polony sequencing"), and in SOLiD sequencing (Life Technologies, Foster City, CA). See M. Margulies, et al., (2005) "Genome sequencing in microfabricated high-density picolitre reactors" Nature 437: 376-380; J. Shendure, et al., (2005) "Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome" Science 309(5741): 1728-. In vitro clonal amplification can also be performed by "bridge PCR," in which fragments are amplified on primers attached to a solid surface. Braslavsky et al. developed a single-molecule method (commercialized by Helicos Biosciences Corp., Cambridge, Mass.) that omits this amplification step and immobilizes DNA molecules directly on a surface. See I. Braslavsky, et al., (2003) "Sequence information can be obtained from single DNA molecules" Proceedings of the National Academy of Sciences of the United States of America 100: 3960-.

DNA molecules that are physically bound to a surface can be sequenced in parallel. "Sequencing by synthesis," like dye-termination electrophoretic sequencing, uses a DNA polymerase to determine the base sequence. "Pyrosequencing" uses DNA polymerization, adding one nucleotide species at a time and detecting and quantifying the number of nucleotides added at a given position through the light emitted upon release of the attached pyrophosphate (commercialized by 454 Life Sciences, Branford, CT). See M. Ronaghi, et al., (1996) "Real-time DNA sequencing using detection of pyrophosphate release" Analytical Biochemistry 242: 84-89. Reversible terminator methods (commercialized by Illumina, Inc., San Diego, CA and Helicos Biosciences Corp., Cambridge, MA) use reversible versions of dye terminators, adding one nucleotide at a time and detecting fluorescence at each position in real time, with repeated removal of the blocking group to allow polymerization of the next nucleotide.

In one embodiment of the detection method by primer extension, which may conveniently be carried out on a 454 sequencing platform, the first and second primer extension reactions are carried out sequentially over at least two cycles of primer extension. Specifically, a first cycle of primer extension is performed with a first primer annealed to a first nucleotide tag, and a second cycle of primer extension is performed with a second primer annealed to a second nucleotide tag. All deoxynucleoside triphosphates (dNTPs) are provided in each primer extension cycle. Any incorporation of dNTPs into the DNA molecule produces a detectable signal. The signal detected in the first cycle is indicative of the presence of the first nucleic acid target in the nucleic acid sample, and the signal detected in the second cycle is indicative of the presence of the second nucleic acid target in the nucleic acid sample. As such, each target nucleic acid (e.g., mutation) can be detected with only a single cycle of the sequencing platform.

Since the signal detected is proportional to the copy number of the target nucleic acid, the signal can also be used to assess the amount of target nucleic acid in the sample. In particular, the signal can be used to determine the amount of two or more target nucleic acids relative to each other.

In an exemplary embodiment that uses the 454 sequencing platform to detect wild-type and mutant target nucleic acids, an allele-specific PCR reaction is prepared with specific tags for the wild-type and for each mutant to be detected. As shown in FIG. 31, the forward primer has a 454 adaptor and an allele-specific tag (identified by different shading). The adaptor is 5' to the tag, and the tag is 5' to the allele-specific portion of the primer. The reverse primer comprises a 454 adaptor 5' to the target-specific portion. As shown in FIG. 31, only one reverse primer is required to detect a single nucleotide polymorphism. In this example, the two allele-specific PCR reactions are performed in a single PCR reaction, although this is not required for the method. The PCR reaction produces tagged target nucleotide sequences ready for 454 bead emulsion PCR. The emulsion PCR step can be omitted, for example, by directly annealing the tagged target nucleotide sequences to beads that are preloaded with allele-specific oligonucleotides (i.e., each individual bead carries only one type of oligonucleotide). In either case, an individual bead will carry only one type of tagged target nucleotide sequence. The beads are loaded onto a 454 sequencer. The first 454 cycle flows, for example, the primer that binds the wild-type tag together with all four dNTPs. As this primer is extended, multiple nucleotides are incorporated, providing a very robust signal, but only in wells containing wild-type beads. The second 454 cycle flows the primer that binds the mutant tag together with all four dNTPs, providing signal only in the wells containing mutant beads.

In another embodiment of the detection method by primer extension, which can be conveniently performed on the SOLiD sequencing platform, the first and second primer extension reactions are performed by oligonucleotide ligation and detection. In this embodiment, ligation of a labeled di-base oligonucleotide to the first and/or second primer produces a detectable signal, and the total signal detected for a particular primer is indicative of the presence and/or relative amount of the corresponding target nucleic acid in the nucleic acid sample. In a variation of this embodiment, ligation of a labeled di-base oligonucleotide to the first primer and ligation of a labeled di-base oligonucleotide to the second primer produce the same detectable signal, and the first and second primer extension reactions are performed separately, e.g., in simultaneous or sequential cycles. In another variation, ligation of a labeled di-base oligonucleotide to the first primer and ligation of a labeled di-base oligonucleotide to the second primer produce different detectable signals. The use of different signals allows the first and second primer extension reactions to be performed simultaneously in one reaction mixture. Any type of detectable signal may be used in the method, but typically a fluorescent signal is used, e.g., for SOLiD sequencing.

Tagged target nucleotide sequences comprising, for example, an allele-specific tag and suitable DNA sequencing adaptors are prepared for primer extension on the SOLiD sequencing platform as described above. Emulsion PCR may be performed, although this step is not strictly necessary. As described above with respect to 454 sequencing, any method of generating a clonal population of tagged target nucleotide sequences attached to beads can be used to generate tagged target nucleotide sequences suitable for primer extension on the SOLiD sequencing platform.

In yet another embodiment of the detection method by primer extension, which may conveniently be performed on the Illumina sequencing platform, the first and second primer extension reactions comprise sequencing by synthesis. In this embodiment, each deoxynucleoside triphosphate is labeled with a different, base-specific label, and incorporation of the deoxynucleoside triphosphate into a DNA molecule produces a base-specific detectable signal. The total signal detected for a particular primer is indicative of the presence and/or relative amount of the corresponding target nucleic acid in the nucleic acid sample. In a variation of this embodiment, extension of the first primer produces the same detectable signal as extension of the second primer, and the first and second primer extension reactions are performed separately, e.g., in simultaneous or sequential cycles. In another variation, extension of the first primer produces a different detectable signal than extension of the second primer. The use of different signals allows the first and second primer extension reactions to be performed simultaneously in one reaction mixture. Any type of detectable signal may be used in the method, but typically a fluorescent signal is used, e.g., for Illumina sequencing. Tagged target nucleotide sequences comprising an allele-specific tag and suitable DNA sequencing adaptors are prepared for primer extension on the Illumina sequencing platform as described above. For primer extension on the Illumina sequencing platform, the tagged target nucleotide sequence is further amplified, typically by bridge PCR, prior to DNA sequencing.

In the specific embodiments of detection by primer extension described above, and in some other implementations of the method, amplification produces clonal populations of tagged target nucleotide sequences that are, or become, located at discrete reaction sites. The number of reaction sites comprising the first nucleotide tag relative to the number of reaction sites comprising the second nucleotide tag is indicative of the amount of the first target nucleic acid relative to the second target nucleic acid in the sample. In particular embodiments of this type, the method can include detecting and comparing the total signal from all reaction sites comprising the first nucleotide tag to the total signal from all reaction sites comprising the second nucleotide tag. Alternatively or additionally, the method may comprise detecting and comparing the number of reaction sites comprising the first nucleotide tag with the number of reaction sites comprising the second nucleotide tag. In either case, the comparison may include any conventional means of comparing two values, such as, for example, determining a ratio.
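The site-counting comparison described above can be sketched as a simple tally. The example assumes that each discrete reaction site (e.g., a bead well or cluster) has already been assigned a tag label by an upstream detection step; the labels "WT" and "MUT", the site list, and the function name are hypothetical.

```python
from collections import Counter


def tag_ratio(site_tags, first_tag, second_tag):
    """Return (sites with first_tag) / (sites with second_tag)."""
    counts = Counter(site_tags)
    if counts[second_tag] == 0:
        raise ValueError("no reaction sites carry the second tag")
    return counts[first_tag] / counts[second_tag]


# Hypothetical tag calls for ten reaction sites: nine wild-type, one mutant.
site_tags = ["WT"] * 9 + ["MUT"]
mutant_to_wild_type = tag_ratio(site_tags, "MUT", "WT")
mutant_fraction = site_tags.count("MUT") / len(site_tags)
print(mutant_to_wild_type, mutant_fraction)
```

The same function applies to summed signal intensities if the per-site tallies are replaced by per-tag signal totals; the ratio is only one of the conventional comparisons the text allows.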

The selection of suitable, distinguishable nucleotide tags for use in the method is within the ability of one skilled in the art. In certain embodiments, the first nucleotide tag can comprise a homopolymer of a first nucleotide (e.g., poly-A), and the second nucleotide tag can comprise a homopolymer of a second, different nucleotide (e.g., poly-G).

Although the detection method by primer extension is described above for the analysis of two target nucleic acids, the method encompasses the analysis of three or more target nucleic acids, each tagged with a different nucleotide tag. In that case, three or more primer extension reactions are performed on the resulting tagged target nucleotide sequences, each using a primer that anneals to a different nucleotide tag, and a signal indicative of extension of each primer is detected and/or quantified. In particular embodiments, the two or more tagged target nucleotide sequences comprise different barcodes, as described above, that can encode information about the tagged target nucleotide sequence, e.g., the sample or reaction mixture.

The above detection method by primer extension can be carried out in multiplex, if necessary. For example, in certain embodiments, multiple samples can be analyzed together in one or more primer extension reactions by incorporating one or more barcodes into the nucleotide tag, where the barcode encodes the identity of the sample. For primer extension reactions, primers specific for both alleles and barcodes can be employed, or alternatively, the barcode can preferably be adjacent to the nucleotide tag to which the primer anneals, and the primer extension reaction can be a DNA sequencing reaction, which requires only the detection of the sequence of the barcode. In the former embodiment, primer extension will indicate the presence of an allele from a particular sample, while in the latter embodiment, primer extension will indicate the presence of an allele and the barcode nucleotide sequence will identify the sample.
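For the second multiplexing variant, in which the barcode is read by sequencing and identifies the sample, the bookkeeping might look like the sketch below. It assumes that each read has already been reduced to a (barcode, allele tag) pair; the barcode sequences, sample names, and function name are all hypothetical.

```python
from collections import defaultdict


def allele_counts_by_sample(reads, barcode_to_sample):
    """Tally allele tags per sample, skipping reads with unknown barcodes."""
    counts = defaultdict(lambda: defaultdict(int))
    for barcode, allele_tag in reads:
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # barcode not assigned to any sample
        counts[sample][allele_tag] += 1
    return {sample: dict(tags) for sample, tags in counts.items()}


# Hypothetical barcode assignments and reads.
BARCODE_TO_SAMPLE = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = [("ACGT", "WT"), ("ACGT", "MUT"), ("TGCA", "WT"), ("NNNN", "WT"), ("ACGT", "WT")]
counts = allele_counts_by_sample(reads, BARCODE_TO_SAMPLE)
print(counts)
```

Reads whose barcode matches no sample are dropped rather than guessed, which keeps a misread barcode from inflating another sample's allele count.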

Single particle analysis applications

Incorporation of nucleic acid sequences into a single particle

In certain embodiments, the above-described methods of incorporating nucleic acid sequences into target nucleic acids (including the above-described barcoding and pooling methods) are used in the context of assaying single particles in a population of particles. Typically, the nucleic acid sequence is introduced into a target nucleic acid associated with, or contained in, the particle. Thus, the first reaction described above is carried out in a reaction volume containing an individual particle. The ability to associate the results of single particle analysis with each particle assayed can be exploited where, for example, two or more parameters are associated with a phenotype. The two or more parameters measured may be different types of parameters, for example, RNA expression level and nucleotide sequence. Additional applications of the single cell analysis methods described herein are described below.

Single particle analysis involves capturing particles of a population in separate reaction volumes to produce a plurality of separate reaction volumes each containing only one particle. The separate reaction volumes comprising the particles may be formed in droplets, in an emulsion, in a vessel, in wells of a microtiter plate, or in compartments of a matrix-type microfluidic device. In illustrative embodiments, the separate reaction volumes are present in separate compartments of a microfluidic device, such as, for example, any of those described herein. See also U.S. Patent Publication No. 2004/0229349 to Daridon et al., published November 18, 2004, which is incorporated herein by reference in its entirety, and particularly for its description of microfluidic particle analysis systems.

In certain embodiments, the parameters are assayed by performing a reaction, such as nucleic acid amplification, in each separate reaction volume to produce one or more reaction products, analyzing the reaction products to obtain results, and then associating the results with the particle and entering them into a data set. The particles may be captured in separate reaction volumes and then contacted with one or more reagents for performing one or more reactions. Alternatively or additionally, the particles may be contacted with one or more such reagents, and the reaction mixture may then be distributed into separate reaction volumes. In various embodiments, at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more reactions are carried out in each separate reaction volume. The analysis of the reaction products can be carried out in the separate reaction volumes. In some embodiments, however, it may be advantageous to recover the contents of the separate reaction volumes for subsequent analysis or other purposes. For example, if nucleic acid amplification is performed in the separate reaction volumes, it may be desirable to recover the contents for subsequent analysis, e.g., by PCR and/or nucleic acid sequencing. The contents of the separate reaction volumes can be analyzed separately and the results associated with the particles that were present in the original reaction volumes. Alternatively, the particle/reaction volume identity may be encoded in the reaction product, e.g., as discussed above for the multi-primer nucleic acid amplification method. In addition, the two strategies can be combined: sets of separate reaction volumes are encoded such that each reaction volume within a set is uniquely identifiable, the reaction volumes of each set are then pooled, and each pool is analyzed separately, as exemplified by the barcoding and pooling methods described above.
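The combined strategy can be pictured as a lookup keyed on both the pool and the barcode: a barcode may be reused in different pools, provided it is unique within a pool. The following is a minimal sketch under that assumption; the chamber, pool, and barcode names are hypothetical and serve only to illustrate the bookkeeping.

```python
def build_lookup(layout):
    """Map (pool, barcode) -> reaction volume identifier.

    `layout` is an iterable of (volume_id, pool, barcode) triples. A barcode
    may recur in different pools but must be unique within any one pool.
    """
    lookup = {}
    for volume_id, pool, barcode in layout:
        key = (pool, barcode)
        if key in lookup:
            raise ValueError(f"barcode {barcode} reused within pool {pool}")
        lookup[key] = volume_id
    return lookup

# Hypothetical layout: two barcodes reused across two pools.
layout = [
    ("chamber_1", "pool_A", "BC1"),
    ("chamber_2", "pool_A", "BC2"),
    ("chamber_3", "pool_B", "BC1"),
    ("chamber_4", "pool_B", "BC2"),
]
lookup = build_lookup(layout)
origin = lookup[("pool_B", "BC1")]  # reaction volume that produced this product
```

Because each pool is analyzed separately, the pool identity is known at analysis time, so only as many distinct barcodes as there are reaction volumes per pool are needed in this scheme.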

Particles

The methods described herein can be used to analyze any type of particle, for example, by performing any of the above-described reactions on nucleic acid from one or more individual particles. In certain embodiments, a particle generally includes any object that is small enough to be suspended in a fluid, but large enough to be distinguished from the fluid. The particles may be microscopic or near microscopic, and may have a diameter of about 0.005 to 100 μm, 0.1 to 50 μm, or about 0.5 to 30 μm. Alternatively or additionally, the particles may have a mass of about 10⁻²⁰ to 10⁻⁵ grams, 10⁻¹⁶ to 10⁻⁷ grams, or 10⁻¹⁴ to 10⁻⁸ grams. In certain embodiments, the particle is a particle from a biological source ("bioparticle"). Biological particles include, for example, molecules such as nucleic acids, proteins, carbohydrates, lipids, and combinations or aggregates thereof (e.g., lipoproteins), as well as larger entities such as viruses, chromosomes, cell vesicles and organelles, and cells. Particles that can be analyzed as described herein also include those having insoluble components, e.g., beads, to which the molecules to be analyzed are attached.

In an exemplary embodiment, the particle is a cell. Cells suitable for use as particles in the methods described herein generally include any self-replicating, membrane-bound biological entity or any non-replicating, membrane-bound progeny thereof. The non-replicating progeny may be senescent cells, terminally differentiated cells, cell chimeras, serum-starved cells, infected cells, non-replicating mutants, anucleate cells, and the like. The cells used in the methods described herein can have any source, genetic background, health state, fixation state, membrane permeability, pretreatment, and/or population purity, among other characteristics. Suitable cells may be eukaryotic, prokaryotic, archaeal, and the like, and may be derived from animal, plant, fungal, protozoan, bacterial, and/or similar sources. In an exemplary embodiment, human cells are analyzed. The cells can be from any stage of biological development; for example, in the case of mammalian cells (e.g., human cells), embryonic, fetal, or adult cells can be analyzed. In certain embodiments, the cell is a stem cell. The cell may be wild-type; a natural, chemical, or viral mutant; an engineered mutant (such as a transgenic); and/or the like. In addition, cells may be growing, quiescent, senescent, transformed, and/or immortalized, among other states. Moreover, the cells may be a monoculture, generally obtained as a clonal population from a single cell or a small group of very similar cells; may be pre-sorted by any suitable mechanism, such as affinity binding, FACS, drug selection, and the like; and/or may be a mixed or heterogeneous population of different cell types.

Particles comprising membranes (e.g., cells or cell vesicles or organelles), cell walls, or any other type of barrier separating one or more internal components from an external space can be intact or partially (e.g., permeabilized) or completely (e.g., to release the internal components) disrupted. When the particles are cells, fixed and/or non-fixed cells may be used. Living or dead, fixed or unfixed cells may have intact membranes, and/or membranes that are permeabilized/disrupted to allow access of ions, stains, dyes, labels, ligands, etc., and/or lysed to allow release of the cell contents.

One benefit of the methods described herein is that they can be used to analyze almost any number of particles, including numbers that are well below the millions of particles required by other methods. In various embodiments, the number of particles analyzed can be about 10, about 50, about 100, about 500, about 1,000, about 2,000, about 3,000, about 4,000, about 5,000, about 6,000, about 7,000, about 8,000, about 9,000, about 10,000, about 15,000, about 20,000, about 25,000, about 30,000, about 35,000, about 40,000, about 45,000, about 50,000, about 75,000, or about 100,000. In particular embodiments, the number of particles analyzed may fall within a range bounded by any two values listed above.

Particle capture

The particles may be captured in the separate reaction volumes by any means known in the art or described herein. In certain embodiments, a capture feature retains one or more cells at a capture site within a separate reaction volume. In preferred embodiments, the capture feature preferentially retains only a single cell at the capture site. In certain preferred embodiments, each capture site is located in a separate compartment of the microfluidic device. The term "separate compartment" as used herein refers to a compartment that is at least temporarily separated from other compartments in the microfluidic device, such that the compartments may contain separate reaction volumes. Temporary separation can be achieved, for example, with valves, as in the case of microfluidic devices available from Fluidigm, Inc. The degree of separation must be such that the assays/reactions can be performed separately in the compartments. The term "capture feature" as used herein includes single or multiple mechanisms operating in series or in parallel. The capture feature may act to overcome a positioning force exerted by a fluid flow. Suitable capture features can be based on physical barriers coupled with flow (referred to as "mechanical capture"), chemical interactions (referred to as "affinity-based capture"), vacuum forces, fluid flow in a loop, gravity, centrifugal forces, magnetic forces, electrical forces (e.g., electrophoretic or electroosmotic forces), and/or optically generated forces, among others.

The capture feature may be selective or non-selective. Selective mechanisms may be fractionally selective, i.e., they retain less than all (a subset) of the input particles. Fractionally selective mechanisms may rely, at least in part, on random focusing features (see below). Alternatively or additionally, selective mechanisms may be particle-dependent, i.e., the particles are retained based on one or more characteristics of the input particles, such as size, surface chemistry, density, magnetic characteristics, charge, optical characteristics (such as refractive index), and/or the like.

Mechanical capture

Mechanical capture may be based, at least in part, on particles coming into contact with any suitable physical barrier disposed, for example, in a microfluidic device. This particle-barrier contact generally restricts longitudinal particle movement along the direction of fluid flow, resulting in flow-assisted retention. Flow-assisted particle-barrier contact may also limit side-to-side/orthogonal (lateral) movement. Suitable physical barriers may be formed by projections extending inwardly from any portion (i.e., walls, roof, and/or floor) of a channel or other passageway. For example, the projections may be fixed and/or movable, including columns, posts, blocks, bumps, walls, and/or partially/fully closed valves, among others. Some physical barriers, such as valves, may be movable or adjustable. Alternatively or additionally, the physical barrier may be defined by a recess (e.g., a niche) formed in the channel or other passageway, or by a fluid-permeable membrane. Other physical barriers may be formed based on the cross-sectional dimensions of the passageway. For example, a size-selective channel may retain particles that are too large to enter the channel. (Size-selective channels may also be referred to as filtration channels, microchannels, or particle-restrictive or particle-selective channels.) Examples 6 and 8 provide exemplary mechanical capture embodiments.

Affinity-based capture

Affinity-based capture may retain particles based on one or more chemical interactions, i.e., where a binding partner binds to a component of the particle. Chemical interactions may be covalent and/or non-covalent interactions, including ionic, electrostatic, hydrophobic, van der Waals, and/or metal coordination interactions, among others. Chemical interactions may selectively and/or non-selectively retain particles. Selective and non-selective retention may be based on specific and/or non-specific chemical interactions between particles and surfaces, e.g., in microfluidic devices.

Specific chemical mechanisms may use specific binding partners (SBPs), e.g., first and second SBPs disposed on the surface of the particle and of the device, respectively. Exemplary SBPs may include biotin/avidin, antibodies/antigens, lectins/carbohydrates, and the like. The SBP may be locally disposed in the microfluidic device before, during, and/or after device formation. For example, the surface of the substrate and/or fluid layer components may be locally modified by the adhesion/attachment of SBP members before the substrate and fluid layer components are joined. Alternatively or additionally, the SBP may be locally associated with a portion of the microfluidic device after the device has been formed, for example, by a local chemical reaction of the SBP member with the device (such as a chemical reaction catalyzed by local illumination with light). See also Example 7, which describes an embodiment in which beads bearing SBP members are mechanically captured at capture sites to display the SBP members for affinity-based capture of particles (i.e., cells).

Non-specific chemical mechanisms may rely on local differences in the surface chemistry of the microfluidic device. As described above, such local differences may be created before, during, and/or after the formation of the microfluidic device. Local differences may result from local chemical reactions, e.g., to create hydrophobic or hydrophilic regions, and/or from local binding of materials. The bound materials may include poly-L-lysine, poly-D-lysine, polyethyleneimine, albumin, gelatin, collagen, laminin, fibronectin, entactin, vitronectin, fibrillin, elastin, heparin, keratan sulfate, heparan sulfate, chondroitin sulfate, hyaluronic acid, and/or extracellular matrix extracts/mixtures, among others.

Other capture features

Alternatively or in addition to affinity-based or mechanical capture, other capture features may be used. Some or all of these mechanisms, and/or the mechanisms described above, may rely at least in part on friction between the particles and the microfluidic device channels or passages to aid in retention.

The capture feature may be based on vacuum force, fluid flow, and/or gravity. Vacuum-based capture features can apply a force that pulls particles into closer contact with the surface of the passageway, for example, with a force directed outward from the channel. Vacuum application, and/or particle retention, may be assisted by holes/ports in the walls of the channels or other passageways. By contrast, capture features based on fluid flow may create fluid flow paths, such as loops, that retain particles. These fluid flow paths may be formed by a closed channel circuit without an outlet (e.g., closed by valves and actively pumped), and/or by a vortex, such as one formed by a generally circular fluid flow in a recess. Gravity-based capture features can retain particles against the bottom surface of the passageway, thereby restricting particle movement in combination with friction. Gravity-based retention may be aided by a recess and/or a reduction in fluid flow rate.

The capture feature may be based on centrifugal force, magnetic force, electrical force, and/or optically generated force. Capture features based on centrifugal force may retain particles by pushing the particles against the surface of the passageway, typically by applying a force to the particles that is generally perpendicular to the fluid flow. Such forces may be applied by centrifuging the microfluidic device and/or by particle movement in the fluid flow path. Magnetically based capture features can utilize magnetic fields generated inside and/or outside of the microfluidic device to retain particles. The magnetic field may interact with ferromagnetic and/or paramagnetic portions of the particles. For example, beads may be at least partially formed of a ferromagnetic material, or cells may include surface-bound or internalized ferromagnetic particles. Electrically based capture features can utilize an electric field to retain charged particles and/or populations. By contrast, capture features based on optically generated forces may utilize light to retain particles. Such mechanisms may operate based on optical tweezers, among other principles.

Another form of capture feature is a blind-fill channel, where the channel has an inlet but no outlet, either permanently or temporarily. For example, when the microfluidic device is fabricated from a gas-permeable material such as PDMS, gas present in a dead-end channel can escape, or be forced out of the channel through the gas-permeable material, as liquid flows in through the inlet. This is a preferred example of blind filling. Blind filling may also be used with a channel or compartment that has an inlet and an outlet that is gated or controlled by a valve. In this example, blind filling of the gas-filled channel or compartment occurs when the outlet valve is closed while the channel or compartment is filled through the inlet. If the inlet also has a valve, that valve can be closed after blind filling is complete, and the outlet can then be opened to expose the channel or compartment contents to another channel or compartment. If a third inlet is in communication with the channel or compartment, the third inlet may introduce an additional fluid, gas or liquid, into the channel or compartment to displace the blind-filled liquid so that a measured amount is expelled from the channel or compartment.

Focusing features

Particle capture can be enhanced in a microfluidic device by using one or more focusing features to focus particle flow toward each capture site. Focusing features can be categorized in a variety of ways, for example, without limitation, to reflect their origin and/or principle of operation, including direct and/or indirect, fluid-mediated and/or non-fluid-mediated, external and/or internal, and the like. These classifications are not mutually exclusive. As such, a given focusing feature can position particles in two or more ways; for example, an electric field may position particles directly (e.g., via electrophoresis) and indirectly (e.g., via electroosmosis).

Focusing features may act to define particle position longitudinally and/or laterally. The term "longitudinal position" denotes a position parallel to or along the long axis of a microfluidic channel and/or of a fluid flow stream in the channel. Conversely, the term "lateral position" refers to a position perpendicular to the long axis of the channel and/or of the associated primary fluid flow stream. Both longitudinal and lateral positions can be defined locally in a curved channel by equating the "long axis" with the "tangent." Focusing features may act to move particles along a path at any angle relative to the long axis of the channel and/or fluid flow, between longitudinal and transverse flow.

Focusing features may be used individually and/or in combination. If used in combination, the features can be used in series (i.e., sequentially) and/or in parallel (i.e., simultaneously). For example, an indirect mechanism such as fluid flow may be used for coarse positioning, and a direct mechanism such as optical tweezers may be used for final positioning.

Direct focusing features generally include any mechanism in which a force acts directly on a particle to position the particle in a microfluidic network. Direct focusing features may be based on any suitable mechanism, including optical, electrical, magnetic, and/or gravity-based forces, among others. Optical focusing features use light to mediate, or at least assist in, positioning the particles. Suitable optical focusing features include "optical tweezers," which utilize a suitably focused and movable light source to impart a positioning force on particles. Electrical focusing features use electricity to position particles. Suitable electrical mechanisms include "electrokinesis," i.e., applying a voltage and/or current across some or all of the microfluidic network, which, as described above, can move charged particles directly (e.g., via electrophoresis) and/or indirectly via movement of ions in the fluid (e.g., via electroosmosis). Magnetic focusing features use magnetism to position particles based on magnetic interactions. Suitable magnetic mechanisms include applying a magnetic field in or around the fluid network to position the particles via their association with ferromagnetic and/or paramagnetic materials in, on, or near the particles. Gravity-based focusing features utilize gravity to position particles, e.g., to bring adherent cells into contact with a substrate at a cell culture site.

Indirect focusing features generally include any mechanism in which forces act indirectly on particles, for example, moving particles in a microfluidic network longitudinally and/or laterally via a fluid. Longitudinal indirect focusing features may generally be created and/or regulated by fluid flow along channels and/or other passageways. Thus, longitudinal focusing features may be assisted and/or regulated by valves and/or pumps that regulate the flow rate and/or path. In some cases, longitudinal focusing features may be assisted and/or regulated by electroosmotic focusing features. Alternatively or additionally, longitudinal focusing features may be input-based, i.e., assisted and/or regulated by an input mechanism, such as a pressure-based or gravity-based mechanism, including a pressure head generated by unequal heights of fluid columns.
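For orientation, the pressure head produced by unequal fluid column heights follows the standard hydrostatic relation ΔP = ρ·g·Δh. The short sketch below evaluates it; the default density (water-like, 1000 kg/m³) and the function name are assumptions made for this illustration and do not come from the specification.

```python
STANDARD_GRAVITY = 9.80665  # m/s^2

def pressure_head_pa(height_difference_m, density_kg_m3=1000.0):
    """Hydrostatic pressure difference (Pa) between two fluid columns.

    Assumes a static, incompressible fluid of uniform density; the default
    density approximates water and is an assumption of this sketch.
    """
    return density_kg_m3 * STANDARD_GRAVITY * height_difference_m

# A 1 cm difference in column height gives a pressure head of roughly 98 Pa.
dp = pressure_head_pa(0.01)
```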

Lateral indirect focusing features may generally be created and/or regulated by fluid flow streams at channel junctions, laterally disposed regions of reduced fluid flow, channel bends, and/or physical barriers (i.e., obstructions). Channel junctions may be unifying sites or dividing sites, based on the number of channels carrying fluid toward the site relative to the number of channels carrying fluid away from the site. The physical barrier may have any suitable design to direct the flow of particles toward the capture site. For example, an obstruction may extend outwardly from any of the channel surfaces, e.g., at an angle that directs particle flow toward the capture site. The length of the obstruction, its angle to the surface of the channel, and its distance from the capture site can be adjusted to enhance the flow of particles toward the capture site. The obstruction may be formed by a projection extending inwardly from any portion (i.e., wall, roof, and/or floor) of the channel or other passageway. For example, the projections may be fixed and/or movable, including posts, rods, blocks, bumps, walls, and/or partially/fully closed valves, among others. Some physical barriers, such as valves, may be movable or adjustable.

In some embodiments, multiple obstructions may be employed for each capture site. For example, obstructions extending outwardly at an angle from each sidewall of the channel may be employed to direct particle flow toward a capture site centrally located in the channel. See Figs. 22A-22B. When mechanical capture is employed, the obstruction may be spaced apart from a physical barrier in the capture site. Alternatively or additionally, the obstruction may contact, or be an integral part of, a physical barrier in the capture site. See Figs. 22A and 22C. For example, an obstruction extending outwardly at an angle from a channel wall may contact or be an integral part of a concave capture feature (e.g., a physical barrier). It will be understood that a "concave" capture feature is concave on the side of the capture feature that generally faces the direction of fluid flow. The obstruction directs the flow of particles away from the channel wall and toward the concave capture feature, aiding particle capture. The next capture site along the flow path may have a similar obstruction-concave capture feature configuration, with the obstruction extending from the same wall of the channel. However, in some embodiments, it is advantageous for the next obstruction-concave capture feature to extend from the opposite channel wall. This alternating configuration acts to focus the flow from one obstruction to the next, thereby enhancing the flow of particles along each obstruction into each concave capture feature. See Fig. 22C.

Lateral indirect focusing features may be based on laminar flow, random partitioning, and/or centrifugal force, among other mechanisms. Lateral positioning of particles and/or reagents in a microfluidic device can be mediated, at least in part, by a laminar flow-based mechanism. Laminar flow-based mechanisms generally include any focusing feature in which the position of an input flow stream in a channel is determined by the presence, absence, and/or relative position of additional flow streams in the channel. Such a laminar flow-based mechanism may be defined by a channel junction that is a unifying site, at which inlet flow streams from two, three, or more channels flow toward the junction and unify to form a smaller number, preferably one, of outlet flow streams flowing away from the junction. Due to the laminar flow properties of flow streams at the microfluidic scale, the unifying site may maintain the relative distribution of the inlet flow streams after they unify into a laminar outlet flow stream. Thus, the particles and/or reagents may remain confined to any selected one or more of the laminar flow streams, depending on which inlet channels carry the particles and/or reagents, thereby laterally positioning the particles and/or reagents. See, for example, Fig. 24D.

The relative size (or flow rate) and position of each inlet flow stream may determine both the position and the relative width of the flow stream carrying the particles and/or reagents. For example, an inlet flow stream of particles/reagents that is relatively small (narrow), flanked by two larger (wider) flow streams, may occupy a narrow central position in a single outlet channel. Conversely, an inlet flow stream of particles/reagents that is relatively large (wide), flanked by a flow stream of comparable size and a smaller (narrower) flow stream, may occupy a wider position that is laterally offset toward the smaller flow stream. In either case, the laminar flow-based mechanism may be referred to as a focusing mechanism, since the particles/reagents are "focused" to a subset of the cross-sectional area of the outlet channel. Laminar flow-based mechanisms may be used to address particles and/or reagents individually to multiple different capture sites.
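The dependence of stream position and width on relative flow rate can be illustrated with a deliberately simplified model in which each stream occupies a fraction of the outlet channel width equal to its share of the total flow. That proportionality is an assumption of this sketch (it ignores, for example, the velocity profile across the channel and any viscosity differences), and the function name and flow values are hypothetical.

```python
def stream_layout(flow_rates):
    """Return (left_edge, right_edge) for each stream as fractions of channel width.

    Simplifying assumption: a stream's width is proportional to its share of
    the total flow. Streams are given in lateral order across the channel.
    """
    total = sum(flow_rates)
    if total <= 0:
        raise ValueError("total flow must be positive")
    edges, left = [], 0.0
    for q in flow_rates:
        right = left + q / total
        edges.append((left, right))
        left = right
    return edges

# Narrow particle stream (middle) flanked by two wider streams:
# it occupies a narrow central band of the outlet channel.
centered = stream_layout([4.0, 2.0, 4.0])

# Wide particle stream (middle) flanked by a comparable stream and a smaller
# one: it occupies a wider band that is offset toward the smaller stream.
offset = stream_layout([4.0, 4.0, 1.0])
```

In the first case the particle stream spans roughly 40% to 60% of the channel width; in the second it spans roughly 44% to 89%, shifted toward the side of the smaller flanking stream.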

The laminar flow-based mechanism may be a variable mechanism that changes the lateral position of the particles/reagents. As described above, the relative contribution of each inlet flow stream may determine the lateral position of the particle/reagent flow stream. Any change in the flow of an inlet flow stream may alter its contribution to the outlet flow stream, correspondingly shifting the particle/reagent flow stream. In the extreme case, called a perfusion mechanism, the reagent (or particle) flow stream can be moved laterally, either into contact with or away from retained particles (or reagents), based on the presence or absence of flow from an adjacent inlet flow stream. Such a mechanism may also be used to achieve variable or adjustable lateral positioning of the particles, for example, to direct the particles to capture sites having different lateral positions.

Lateral positioning of particles and/or reagents in a microfluidic device may be mediated, at least in part, by random (or fractional-flow) focusing features. Random lateral focusing features generally include any focusing feature in which an at least partially randomly selected subset of the input particles or reagents is distributed laterally away from a primary flow stream to a region of reduced fluid flow in the channel (or, possibly, to a different channel). The region of reduced flow may facilitate particle retention, handling, and detection, minimize particle damage, and/or promote particle contact with a substrate. Random focusing features may be provided by dividing-flow sites and/or locally widened channels, among others.

Dividing-flow sites may effect random positioning by creating regions of reduced fluid flow velocity. A dividing-flow site generally includes any channel junction at which an inlet flow stream from one (preferably) or more inlet channels is divided among a larger number of outlet channels, including two, three, or more channels. Such a dividing site may deliver a subset of particles, which may be selected randomly or based on characteristics of the particles (such as mass), to a region of reduced flow velocity or quasi-stagnant flow formed at or near the junction. The proportion of particles represented by the subset may depend on the relative flow directions of the outlet channels with respect to the inlet channels. These flow directions may be generally perpendicular to the inlet flow stream and directed in opposite directions, forming a "T-junction." Alternatively, the outlet flow directions may form angles of less than and/or greater than 90 degrees.

A dividing-flow focusing feature with two or more outlet channels may be used as a split-flow mechanism. In particular, the fluid, particles, and/or reagents carried to the channel junction may be divided according to the fluid flow through the two or more outlet channels. Thus, the number or volumetric proportion of particles or reagents entering the two or more channels may be regulated by the relative sizes of the channels and/or the fluid flow rates through the channels, which in turn may be regulated by valves or other suitable flow regulation mechanisms. In a first set of embodiments, the outlet channels may be of very unequal size, so that only a small proportion of particles and/or reagents is directed to the smaller channel. In a second set of embodiments, valves may be used to create a desired dilution of a reagent. In a third set of embodiments, valves may be used to selectively direct particles to one of two or more fluid paths.
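As a rough illustration of the first set of embodiments, the sketch below estimates how many particles enter each outlet when particles are assumed to divide in proportion to the volumetric flow through each outlet. That proportionality is an idealization adopted for this example (real particles need not follow the fluid exactly), and the function name and numbers are hypothetical.

```python
def expected_split(n_particles, outlet_flow_rates):
    """Expected number of particles entering each outlet channel.

    Idealized assumption: particles partition in proportion to the volumetric
    flow rate through each outlet.
    """
    total = sum(outlet_flow_rates)
    if total <= 0:
        raise ValueError("total outlet flow must be positive")
    return [n_particles * q / total for q in outlet_flow_rates]

# Very unequal outlets: about one tenth of the particles enter the smaller channel.
split = expected_split(1000, [9.0, 1.0])
```

Changing the relative flow rates, for example with a valve on one outlet, changes the expected split accordingly.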

Locally widened channels may facilitate random positioning by creating regions of reduced flow velocity lateral to the primary flow stream. The reduced flow velocity can deposit a subset of the input particles in the region of reduced flow velocity. Such widened channels may include non-linear channels that curve or bend at an angle. Alternatively or in addition, the widened region may be formed by a recess formed in the channel wall, a chamber intersecting the channel, and/or the like, in particular at the outer edge of a curved or bent channel.

Lateral positioning of particles and/or reagents may also be mediated, at least in part, by a centrifugal focusing feature. In a centrifugal focusing feature, particles may experience a centrifugal force determined by a change in velocity, for example, by moving through a bend in the fluid path. The size and/or density of the particles may determine the rate of the velocity change, distributing particles of different sizes and/or densities to different lateral positions.

Drainage channel feature

In certain embodiments, the capture site further comprises a drainage channel feature. When mechanical capture is employed, for example, the drainage channel feature can include one or more gaps in the capture feature that are sized to allow fluid, but not particles, to flow through and/or around the capture feature. As such, for example, the capture feature may comprise two physical barriers separated by a space (the drainage channel feature) that is large enough to allow particle-free fluid to flow between the barriers with low enough impedance to direct cells toward the barriers, thereby enhancing the probability of particle capture. The space between the physical barriers should generally be small enough and/or suitably configured so that particles to be captured at the capture site will not pass between the barriers. In a specific, exemplary embodiment, the capture feature comprises two concave physical barriers with first and second ends, wherein the barriers are arranged with a small space between the first ends of the barriers, forming the drainage channel feature, and a larger space between the second ends of the barriers. See Fig. 22B (where d3 is greater than d1, and d1 forms the drainage channel). In this configuration, the barriers form a "cup" sized to capture a particle, with the drainage channel at the base of the cup. By virtue of the drainage channel, particles flow toward the cup as long as the cup is unoccupied. After a particle flows into the cup, the drainage channel is "plugged," which tends to enhance the flow of particles around the cup and on to the next capture feature in the microfluidic device.

Non-optimized single particle capture

In particular embodiments, capture techniques such as limiting dilution are used to capture particles in separate reaction volumes. In this type of capture, no capture features, such as binding affinity or mechanical features that preferentially retain only single cells at the capture sites (for example, in a microfluidic device), are used. For example, limiting dilution may be performed by preparing a series of dilutions of the particle suspension and dispensing aliquots from each dilution into separate reaction volumes. The number of particles in each reaction volume is determined, and the dilution that produces the highest proportion of reaction volumes with only a single particle is then selected and used to capture particles for measurement of the parameters described herein.
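The selection step described above can be sketched as follows. This is only an illustrative sketch: the function names, the dilution labels, and the example counts are hypothetical, and the Poisson model is an assumption about how particles distribute among volumes under limiting dilution. Under that assumption, the expected fraction of volumes holding exactly one particle is m·exp(-m) for a mean of m particles per volume, which peaks at 1/e (roughly 37%) when m = 1.

```python
import math


def poisson_single_fraction(mean_particles):
    """Expected fraction of volumes with exactly one particle (Poisson assumption)."""
    return mean_particles * math.exp(-mean_particles)


def best_dilution(counts_by_dilution):
    """Return the dilution whose observed volumes most often hold exactly one particle.

    counts_by_dilution maps a dilution label to a list of per-volume particle counts.
    """
    def single_fraction(counts):
        return sum(1 for c in counts if c == 1) / len(counts)

    return max(counts_by_dilution, key=lambda d: single_fraction(counts_by_dilution[d]))


# Hypothetical particle counts observed in eight volumes per dilution.
observed = {
    "1:10": [3, 2, 4, 1, 2, 3, 2, 5],
    "1:100": [1, 0, 1, 2, 1, 0, 1, 1],
    "1:1000": [0, 0, 1, 0, 0, 0, 0, 1],
}
chosen = best_dilution(observed)
```

In this made-up data set the intermediate dilution is chosen, because more concentrated suspensions tend to give multiply occupied volumes and more dilute suspensions tend to give empty ones.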

Optimized single particle capture

In some embodiments, the methods include using optimized capture techniques to increase the expected fraction of separate reaction volumes with only one particle above that achieved with methods such as limiting dilution (i.e., above about 33%). In variations of these embodiments, capture is optimized such that the expected proportion of separate reaction volumes each having only one particle is at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, or at least about 95% of the total number of separate reaction volumes. In particular embodiments, the expected proportion of separate reaction volumes each having only one particle falls within a range bounded by any two of the percentages listed above. The expected proportion of separate reaction volumes each having only one particle may be determined empirically or statistically, depending on the particular capture technique (e.g., limiting dilution produces reaction volumes having only one particle in a manner consistent with a Poisson distribution). The term "optimized" as used herein does not imply achieving optimal results, but merely means that some measure is taken to increase the expected proportion of separate reaction volumes with only one particle above about 33%. In particular embodiments, optimized single particle capture can be achieved, for example, using a size-based mechanism that precludes the retention of more than one particle in each reaction volume (capture site).

In certain embodiments, mechanical capture is used alone or in combination with one or more other capture features to preferentially capture a single particle in each respective reaction volume (i.e., each capture site in a microfluidic device). For example, each capture site may comprise one or more physical barriers sized to contain only one particle. The physical barrier may be shaped to enhance retention of the particles. For example, when the particle is a cell, the size and configuration of the physical barrier may form a concave surface suitable for holding only one cell. In such embodiments, the physical barrier may be designed to allow fluid to flow through the capture site when unoccupied by a cell, and/or the capture site may include a drainage channel feature that aids in this flow. In particular embodiments, the microfluidic device comprises a plurality of appropriately sized/configured physical barriers whereby a plurality of individual particles are held in the device, each physical barrier holding one particle. In illustrative embodiments, the physical barrier may be located in separate compartments in the microfluidic device, one region per compartment. The compartments can be arranged to form an array, such as, for example, a microfluidic array available from Fluidigm Corp. (South San Francisco, CA) and described herein. See also fig. 24A-24G.

In certain embodiments, affinity-based capture is used alone or in combination with one or more other capture features, such as mechanical capture, to preferentially capture single cells in each respective reaction volume (i.e., each capture site in a microfluidic device). For example, discrete regions of a microfluidic device surface that contain binding partners for particles or particle components may be sized such that only one particle may bind to the region, with subsequent particle binding being blocked by steric hindrance. In particular embodiments, the microfluidic device comprises a plurality of appropriately sized regions whereby a plurality of individual particles are held in the device, one particle in each region. In illustrative embodiments, these regions may be located in separate compartments in a microfluidic device, one region for each compartment. The compartments can be arranged to form an array, such as, for example, a microfluidic array available from Fluidigm Corp. (South San Francisco, CA) and described herein.

One approach to affinity-based, optimized single particle capture is based on capturing a support comprising a binding partner that binds to the particle to be tested. In an exemplary embodiment, the support may be beads having binding partners distributed on their surface. See fig. 23A. Beads can be captured by mechanical capture using a cup-shaped capture feature to produce a single immobilized support (e.g., bead) at each capture site. In addition to immobilizing the support, in certain embodiments, the capture features can reduce the surface area of the support (e.g., bead) displaying the binding partner. This surface may be sufficiently reduced so that only one particle may bind to a region of the immobilized support (e.g., a bead) displaying a binding partner. To aid in particle-support binding, in some embodiments, the region of the immobilized support displaying the binding partner faces the flow path of the particle. In specific, exemplary embodiments, the flow channel of the microfluidic device comprises a series of capture features. A suspension of beads with binding partners (e.g., cell-specific antibodies) is input into the channel to generate a series of immobilized beads at the capture site. The channel is then washed to remove any free (i.e., non-immobilized) beads. Fig. 23A. The cell suspension is then transferred into the channel. Individual cells may bind to the portion of each bead that displays a binding partner. Each bound cell prevents any other cell from binding the bead via steric blockage. Washing of the channels removes unbound cells. See fig. 23B. The valve between the capture sites can then be closed to create separate reaction volumes, each containing one capture site and one bound cell. One or more focusing features may be employed to direct bead and particle flow toward each capture site. 
Alternatively or additionally, the capture features may each comprise a drainage channel feature that allows fluid to flow through the capture site when the capture feature is not occupied by a bead.

Determining the number and/or characteristics of captured particles

In certain embodiments, it is advantageous to determine the number of particles in each respective reaction volume. When limiting dilution is utilized, this determination can be made to identify the dilution that produces the highest proportion of compartments with only a single particle. This determination can also be made after any capture technique to identify those reaction volumes that contain only one particle. For example, in some embodiments, test results may be sorted into multiple "bins" based on whether the test results are from a reaction volume containing 0, 1, 2, or more cells, allowing one or more of these bins to be analyzed separately. In certain embodiments, any of the methods described herein can include determining whether any compartment includes more than a single particle; results from any compartment comprising more than a single particle are either not further analyzed or are discarded.
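The binning described above can be sketched as follows. The function name, the bin labels, and the example data are hypothetical and chosen only for illustration; the sketch assumes that each test result has already been paired with the particle count determined for its reaction volume.

```python
from collections import defaultdict


def bin_results(results):
    """Sort (particle_count, result) pairs into bins labeled '0', '1', '2', and '3+'."""
    bins = defaultdict(list)
    for count, result in results:
        if count < 0:
            raise ValueError("particle count cannot be negative")
        label = str(count) if count <= 2 else "3+"
        bins[label].append(result)
    return dict(bins)


# Hypothetical results: (particles counted in the volume, result identifier).
results = [(1, "A"), (0, "B"), (2, "C"), (1, "D"), (4, "E")]
bins = bin_results(results)

# Keep only results from single-particle volumes for further analysis.
single_particle_results = bins.get("1", [])
```

Results in the multi-particle bins can then be set aside or analyzed separately, as the embodiment requires.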

In some embodiments, the number of particles in each respective reaction volume is determined by microscopy. For example, when the separate reaction volumes are in compartments of a microfluidic device that are sufficiently transparent or translucent, simple bright field microscopy can be used to visualize and count particles, e.g., cells, in each compartment. See example 5. Microfluidic devices described below and available from Fluidigm Corp. (South San Francisco, CA) are suitable for use in this bright field microscopy method.

In certain embodiments, a stain, dye, or label may be employed to detect the number of particles in each respective reaction volume. Any stain, dye, or label detectable in the respective reaction volumes may be used. In illustrative embodiments, a fluorescent stain, dye, or label may be used. The stain, dye, or label employed may be tailored to a particular application. When the particle is a cell and the parameter to be measured is a characteristic of the cell surface, the stain, dye, or label may be a cell surface stain, dye, or label that does not need to penetrate the cell. For example, labeled antibodies specific for cell surface markers can be used to detect the number of cells in each respective reaction volume. When the particle is a cell and the parameter to be measured is a characteristic of the interior of the cell (e.g., a nucleic acid), the stain, dye, or label may be a membrane-permeable stain, dye, or label (e.g., a double-stranded DNA binding dye).

In particular embodiments, the characteristics of the cells can be detected in each respective reaction volume with or without determining the number of cells in each reaction volume. For example, a stain, dye, or label can be employed to determine whether any reaction volume (e.g., any compartment in a microfluidic device) includes a particle having this characteristic. This step can increase assay efficiency by allowing subsequent analysis of the reaction results for only those compartments that include particles with that particular characteristic. Exemplary features that may be detected in this context include, for example, a particular genomic rearrangement, copy number variation, or polymorphism; expression of a particular gene; and expression of specific proteins.

Analysis of nucleic acids in single particles

In particular embodiments, the methods described herein are used to analyze one or more nucleic acids. For example, the presence and/or level of a particular target nucleic acid can be determined, as can the characteristics of the target nucleic acid, e.g., nucleotide sequence. In an illustrative embodiment, a population of particles with one or more sample nucleic acids in or associated with the particles is captured in separate reaction volumes, each of which preferably comprises only a single particle. Reactions are performed, such as ligation and/or amplification of DNA, or reverse transcription and/or amplification of RNA, to produce reaction products for any reaction volume containing one or more target nucleic acids. These reaction products can be analyzed in the reaction volumes, or the reaction volumes can be recovered, separately or in a pool, for subsequent analysis, such as DNA sequencing.

In certain embodiments, the reaction incorporates one or more nucleotide sequences into the reaction product. These sequences may be incorporated by any suitable method, including ligation, transposase-mediated incorporation, or amplification using one or more primers bearing one or more nucleotide tags that include the sequence to be incorporated. These incorporated nucleotide sequences may serve any function that facilitates any of the assays described herein. For example, one or more nucleotide sequences may be incorporated into a reaction product to encode an item of information about the reaction product, such as the identity of the reaction volume from which the reaction product originated. In this case, the reaction is referred to herein as an "encoding reaction". For this purpose, a multi-primer method of adding a "barcode" nucleotide sequence to a target nucleic acid can be employed and is described above. In particular embodiments, nucleic acid amplification is performed using at least two amplification primers, wherein each amplification primer comprises a barcode nucleotide sequence, and the combination of barcode nucleotide sequences encodes the identity of the reaction volume from which the reaction product is derived (referred to as "combinatorial barcoding"). These embodiments are conveniently employed when the separate reaction volumes are in separate compartments of a matrix-type microfluidic device, such as, for example, those available from Fluidigm Corp (South San Francisco, CA) and described below (see "microfluidic device"). Each respective compartment may comprise a combination of barcode nucleotide sequences identifying the rows and columns of the compartment in which the encoding reaction is performed. If the reaction volume is recovered and subjected to further analysis including detection of barcode combinations, the results can be correlated to a particular compartment and thus to the particles in that compartment. 
This association can be performed for all compartments containing a single particle to allow single particle (e.g., single cell) analysis of a population of particles.
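The row/column scheme described above can be sketched as follows. The barcode sequences, function names, and array dimensions are hypothetical; the sketch assumes only that one barcode identifies the row and a second barcode identifies the column of the compartment in which the encoding reaction was performed.

```python
def build_barcode_map(row_barcodes, column_barcodes):
    """Map each (row barcode, column barcode) pair to its (row, column) compartment."""
    return {
        (row_bc, col_bc): (row, col)
        for row, row_bc in enumerate(row_barcodes)
        for col, col_bc in enumerate(column_barcodes)
    }


def assign_compartment(row_bc, col_bc, barcode_map):
    """Return the (row, column) of origin, or None if the combination is unknown."""
    return barcode_map.get((row_bc, col_bc))


# Hypothetical barcodes for a 2-row by 3-column array.
row_barcodes = ["ACGT", "TGCA"]
column_barcodes = ["GATC", "CTAG", "AGTC"]
barcode_map = build_barcode_map(row_barcodes, column_barcodes)

origin = assign_compartment("TGCA", "CTAG", barcode_map)
```

One consequence of the combinatorial design is that the number of barcode sequences needed grows with rows plus columns, while the number of compartments they can identify grows with rows times columns: here five barcodes distinguish six compartments.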

The following section discusses suitable nucleic acid samples, and target nucleic acids therein suitable for analysis in the methods described herein. Amplification primer design and exemplary amplification methods are then described. The remainder discusses various labeling strategies and removal of unwanted reaction components. These sections are described with respect to methods that employ amplification to incorporate nucleic acid sequences into target nucleic acids and/or to analyze them. However, based on the teachings herein, one of skill in the art will recognize that amplification is not critical to performing many of the methods described herein. For example, the nucleic acid sequence may be incorporated by other means, such as ligation or by use of a transposase.

Sample nucleic acid

Preparations of nucleic acids ("samples") can be obtained from biological sources and prepared using conventional methods known in the art. In particular, DNA or RNA useful in the methods described herein can be extracted and/or amplified from any source, including bacteria, protozoa, fungi, viruses, organelles, and higher organisms such as plants or animals, particularly mammals, and more particularly humans. Suitable nucleic acids can also be obtained from environmental sources (e.g., pond water), from manufactured products (e.g., food), from forensic samples, and the like. Nucleic acids can be extracted or amplified from cells, bodily fluids (e.g., blood, blood fractions, urine, etc.), or tissue samples by any of a variety of standard techniques. Exemplary samples include samples of plasma, serum, spinal fluid, lymph fluid, peritoneal fluid, pleural fluid, oral fluid, and external sections of the skin; samples from the respiratory, intestinal, reproductive, and urinary tracts; and samples of tears, saliva, blood cells, stem cells, or tumors. For example, a sample of fetal DNA may be obtained from an embryo or from maternal blood. Samples may be obtained from living or dead organisms or from in vitro cultures. Exemplary samples may include single cells, formalin-fixed and/or paraffin-embedded tissue samples, and needle biopsies. Nucleic acids useful in the methods described herein can also be derived from one or more nucleic acid libraries, including cDNA, cosmid, YAC, BAC, P1, and PAC libraries, and the like.

The nucleic acid of interest can be isolated using methods well known in the art, wherein the selection of a particular method depends on the source, nature, and the like of the nucleic acid. These sample nucleic acids need not be in pure form, but are typically sufficiently pure to allow the target reaction to be carried out. When the target nucleic acid is RNA, the RNA can be reverse transcribed into cDNA by standard methods known in the art and described in, for example, Sambrook, J., Fritsch, E.F., and Maniatis, T., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, NY, Vol.1,2,3 (1989).

Target nucleic acid

Target nucleic acids useful in the methods described herein can be derived from any of the sample nucleic acids described above. In typical embodiments, at least some nucleotide sequence information should be known for the target nucleic acid. For example, if PCR is used as the encoding reaction, sufficient sequence information is generally available for each end of a given target nucleic acid to allow for the design of appropriate amplification primers. In alternative embodiments, the target-specific sequence in the primer may be replaced with a random or degenerate nucleotide sequence.

These targets may include, for example, nucleic acids associated with pathogens such as viruses, bacteria, protozoa, or fungi; RNAs, e.g., those for which over- or under-expression is indicative of disease, those that are expressed in a tissue-specific or development-specific manner, or those that are induced by a particular stimulus; and genomic DNA that can be analyzed for specific polymorphisms (e.g., SNPs), alleles, or haplotypes, for example, in genotyping. Of particular interest are genomic DNAs that are altered (e.g., amplified, deleted, rearranged, and/or mutated) in genetic diseases or other pathologies; sequences associated with a desired or undesired trait; and/or sequences that uniquely identify an individual (e.g., in forensic or paternity testing). When multiple target nucleic acids are employed, these may be on the same or different chromosomes.

In various embodiments, the target nucleic acid to be amplified can be, for example, 25 bases, 50 bases, 100 bases, 200 bases, 500 bases, or 750 bases. In certain embodiments of the methods described herein, long-range amplification methods such as long-range PCR may be employed to generate amplicons from the amplification mixture. Long fragment PCR allows amplification of target nucleic acids ranging from 1 or several kilobases (kb) to over 50 kb. In various embodiments, the target nucleic acid amplified by long-fragment PCR is at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, or 50kb in length. The target nucleic acid can also fall within any range having any of these values as endpoints (e.g., 25 bases to 100 bases or 5-15 kb).

Primer design

Primers suitable for nucleic acid amplification are long enough to prime the synthesis of extension products in the presence of reagents for polymerization. The exact length and composition of the primer will depend on a variety of factors including, for example, the temperature of the annealing reaction, the source and composition of the primer, and, when a probe is used, the proximity of the probe annealing site to the primer annealing site and the ratio of primer to probe concentration. For example, depending on the complexity of the target nucleic acid sequence, an oligonucleotide primer typically contains in the range of about 15 to about 30 nucleotides, although it may contain more or fewer nucleotides. The primers should be sufficiently complementary to selectively anneal to their respective strands and form stable duplexes. One of ordinary skill in the art would know how to select an appropriate primer pair to amplify a target nucleic acid of interest.
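Two of the criteria above, primer length and complementarity to the respective strand, can be illustrated with a minimal sketch. The function names and the template sequence are hypothetical, and the check assumes perfect complementarity, which is stricter than the "sufficiently complementary" standard stated above; it is not a substitute for primer design software.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def length_in_typical_range(primer, low=15, high=30):
    """Check whether the primer length falls in the typical range of about 15 to 30 nt."""
    return low <= len(primer) <= high


def anneals_to_strand(primer, strand):
    """True if the primer is perfectly complementary to a site on the strand (5' to 3')."""
    return reverse_complement(primer) in strand.upper()


# Hypothetical 39-nt target, top strand written 5' to 3'.
top_strand = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
bottom_strand = reverse_complement(top_strand)

forward_primer = top_strand[:18]                        # anneals to the bottom strand
reverse_primer = reverse_complement(top_strand[-18:])   # anneals to the top strand
```

The forward primer matches the 5' end of the top strand and therefore pairs with the bottom strand, while the reverse primer pairs with the 3' end of the top strand, so the two primers face each other across the target.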

For example, PCR primers can be designed by using any commercially available software or open source software such as Primer3 (see, e.g., Rozen and Skaletsky (2000) Meth. Mol. Biol., 132: 365-). The amplicon sequences are input into the Primer3 program with the UPL probe sequences in brackets to ensure that the Primer3 program will design primers on either side of the bracketed probe sequences.

Primers can be prepared by any suitable method, including, for example, cloning of appropriate sequences and restriction enzyme cleavage, or direct chemical synthesis by methods such as the phosphotriester method of Narang et al. (1979) Meth. Enzymol. 68: 90-99; the phosphodiester method of Brown et al. (1979) Meth. Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al. (1981) Tetra. Lett., 22: 1859-1862; or the solid support method of U.S. Pat. No. 4,458,066. Alternatively, the primers may be obtained from commercial sources.

Primers can be purified by using a Sephadex column (Amersham Biosciences, Inc., Piscataway, NJ) or other methods known to those of ordinary skill in the art. Primer purification can improve the sensitivity of the method of the invention.

Amplification method

Nucleic acids can be amplified for any useful purpose according to the methods described herein, e.g., to increase the concentration of a target nucleic acid for subsequent analysis, and/or to incorporate one or more nucleotide sequences, and/or to detect and/or quantify and/or sequence one or more target nucleic acids. Amplification can be performed in droplets, in an emulsion, in a vessel, in a well of a microtiter plate, in a compartment of a matrix-type microfluidic device, and the like.

Amplification to increase target nucleic acid concentration

Amplification to increase target nucleic acid concentration can be intended to amplify all nucleic acids in a reaction mixture, all nucleic acids of a particular type (e.g., DNA or RNA), or a particular target nucleic acid. In specific, exemplary embodiments, whole genome amplification can be performed to increase the concentration of genomic DNA; RNA can be amplified, optionally preceded by a reverse transcription step; and/or general or target-specific preamplification can be performed.

Whole genome amplification

To analyze genomic DNA, sample nucleic acids can be amplified using a whole genome amplification (WGA) method. Suitable methods for WGA include primer extension PCR (PEP) and improved PEP (I-PEP), degenerate oligonucleotide primed PCR (DOP-PCR), ligation-mediated PCR (LMP), T7-based linear amplification of DNA (TLAD), and multiple displacement amplification (MDA). These techniques are described in U.S. Patent Publication No. 20100178655 (Hamilton et al.), published July 15, 2010, which is incorporated herein by reference in its entirety, particularly for its description of methods that may be used for single cell nucleic acid analysis.

WGA kits are available, for example, from Qiagen, Inc. (Valencia, Calif., USA) and Sigma-Aldrich (Rubicon Genomics; e.g., Sigma Single Cell Whole Genome Amplification Kit, PN WGA4-50 RXN). The WGA step of the methods described herein can be performed using any available kit according to the manufacturer's instructions.

In a specific embodiment, the WGA step is a limited WGA, i.e., the WGA is stopped before the reaction plateau is reached. Typically, WGA is performed for more than two amplification cycles. In certain embodiments, WGA is performed for fewer than about 10 amplification cycles, e.g., between 4 and 8 cycles, inclusive. However, WGA may be performed for 3, 4, 5, 6, 7, 8, or 9 cycles, or for a number of cycles falling within a range defined by any of these values.

RNA amplification

In certain embodiments, one or more RNA targets may be analyzed in the RNA of a single cell or small cell population. Suitable RNA targets include mRNA, as well as non-coding RNAs such as small nucleolar RNA (snoRNA), microRNA (miRNA), small interfering RNA (siRNA), and Piwi-interacting RNA (piRNA). In particular embodiments, the RNA of interest is converted to DNA, such as by reverse transcription or amplification.

For example, to analyze mRNA of a single cell or small cell population, the mRNA is typically converted into a DNA representation of the mRNA population. In certain embodiments, the methods used preferably produce a population of cDNAs, wherein the relative amount of each cDNA is about the same as the relative amount of the corresponding mRNA in the sample population.

In particular embodiments, reverse transcription may be employed to produce cDNA from an mRNA template using reverse transcriptase according to standard methods. Reverse transcription of cellular mRNA populations can be primed using, for example, specific primers, oligo-dT, or random primers. To synthesize a cDNA library representing cellular mRNA, reverse transcriptase can be used to synthesize a first strand of cDNA complementary to the cellular RNA of a sample. This can be done using the commercially available BRL Superscript II kit (BRL, Gaithersburg, Md.) or any other commercially available kit. The reverse transcriptase preferably uses RNA as a template, but may also use a single-stranded DNA template. Thus, second strand cDNA synthesis can be accomplished using reverse transcriptase and appropriate primers (e.g., poly-A, random primers, etc.). Second strand synthesis can also be accomplished using E. coli DNA polymerase I. The RNA can be removed at the same time as, or after, second cDNA strand synthesis. This can be accomplished, for example, by treating the mixture with an RNase that degrades RNA, such as E. coli RNase H.

In other embodiments, cDNA is generated from an mRNA template using an amplification method. In such embodiments, amplification methods that produce a population of cDNAs representative of the population of mRNAs are typically used.

Analysis of non-coding RNA of a single cell or small cell population typically begins with conversion of the RNA of interest into DNA. The conversion may be accomplished by reverse transcription or amplification. In certain embodiments, the methods used preferably produce a population of DNA, wherein the relative amount of each DNA is about the same as the relative amount of the corresponding mRNA in the sample population. The target RNA can be selectively reverse transcribed or amplified using primers that preferentially anneal to the RNA of interest. Suitable primers are commercially available or can be designed by those skilled in the art. For example, Life Technologies sells MegaPlex™ primer pools for microRNA (miRNA) targets. These primers can be used for reverse transcription (RT) and specific target amplification (STA). See, e.g., Example 6B.

Pre-amplification

Preamplification can be performed to increase the concentration of nucleic acid sequences in a reaction mixture generally, for example, using a random primer set, primers specific for one or more sequences common to multiple or all nucleic acids present (e.g., poly-dT to prime poly-A tails), or a combination of a random primer set and specific primers. Alternatively, preamplification can be performed using one or more primer pairs specific for one or more target nucleic acids of interest. In specific, exemplary embodiments, an amplified genome produced by WGA, or DNA (e.g., cDNA) produced from RNA, can be preamplified to produce a preamplification reaction mixture comprising one or more amplicons specific for one or more target nucleic acids of interest. Typically, preamplification is performed using preamplification primers, a suitable buffer system, nucleotides, and a DNA polymerase (e.g., a polymerase modified for "hot start" conditions).

In particular embodiments, the preamplification primers have the same sequences as those to be used in the amplification assay for which the sample is being prepared, although typically at reduced concentrations. The primer concentration may be, for example, about 10 to about 250 times lower than the primer concentration used in the amplification assay. Embodiments include the use of primers at concentrations that are about 10, 20, 35, 50, 65, 75, 100, 125, 150, 175, or 200 times lower than the primer concentration in the amplification assay.

In a specific embodiment, the preamplification is performed for at least two cycles. In certain embodiments, the preamplification is performed for fewer than about 20 cycles, such as between 8 and 18 cycles, inclusive. However, preamplification may be performed for 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 cycles, or for a number of cycles falling within a range defined by any of these values. In an exemplary embodiment, preamplification may be performed for about 14 cycles to increase the detected amplicons by about 16,000-fold.
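The figure of about 16,000-fold for 14 cycles is consistent with ideal doubling at every cycle, since 2^14 = 16,384. The relationship can be sketched as follows; the function name and the efficiency parameter are illustrative assumptions, with an efficiency of 1.0 representing perfect doubling per cycle.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Theoretical fold increase after a number of cycles.

    efficiency is the fraction of templates copied per cycle (1.0 = perfect doubling).
    """
    return (1.0 + efficiency) ** cycles


ideal_fold = fold_amplification(14)          # 2 ** 14 = 16384.0
reduced_fold = fold_amplification(14, 0.9)   # lower yield at 90% per-cycle efficiency
```

Because real reactions rarely double perfectly at every cycle, the fold increase actually obtained would be expected to fall somewhat below the ideal value.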

Amplification for detection and/or quantification of target nucleic acids

Any method of nucleic acid detection and/or quantification may be used in the methods described herein to detect the amplification product. In one embodiment, PCR (polymerase chain reaction) is used to amplify and/or quantify the target nucleic acid. In other embodiments, other amplification systems or detection systems are used, including, for example, the system described in U.S. patent No. 7,118,910 (which is incorporated herein by reference in its entirety for the purpose of describing an amplification/detection system). In a specific embodiment, a real-time quantification method is used. The amount of target nucleic acid present in the sample can be determined, for example, by measuring the amount of amplification product formed during the amplification process itself using a "quantitative real-time PCR" method.

The fluorogenic nuclease assay is a specific example of a real-time quantification method that can be used successfully in the methods described herein. This method of monitoring the formation of amplification products involves the continuous measurement of PCR product accumulation using a dual-labeled fluorogenic oligonucleotide probe, an approach commonly referred to in the literature as the "TaqMan method." See U.S. Patent No. 5,723,591; Heid et al., 1996, Real-time quantitative PCR, Genome Res. 6:986-94, each of which is incorporated herein by reference in its entirety for its description of the fluorogenic nuclease assay. It should be understood that although "TaqMan probes" are the most widely used for qPCR, the methods described herein are not limited to the use of these probes; any suitable probe may be used.

Other detection/quantification methods that may be used in the present invention include FRET and template extension reactions, molecular beacon detection, scorpion probe detection, Invader detection, and padlock probe detection.

FRET and template extension reactions use primers labeled with one member of a donor/acceptor pair and nucleotides labeled with the other member of the donor/acceptor pair. The donor and acceptor are spaced sufficiently apart so that energy transfer does not occur prior to incorporation of the labeled nucleotide into the primer during the template-dependent extension reaction. However, if the labeled nucleotides are incorporated into the primer and the spacing is close enough, energy transfer occurs and can be detected. These methods are particularly useful in performing single base pair extension reactions in detecting single nucleotide polymorphisms and are described in U.S. Pat. No. 5,945,283 and PCT publication WO 97/22719.

With respect to molecular beacons, when the probe hybridizes to a complementary region of an amplification product, its conformational change results in the formation of a detectable signal. The probe itself comprises two end segments: one at the 5' end and the other at the 3' end. These segments flank the probe segment that anneals to the probe binding site and are complementary to each other. One end segment is typically attached to a reporter dye and the other end segment is typically attached to a quencher dye. In solution, the two end segments can hybridize to each other to form a hairpin loop. In this conformation, the reporter and quencher dyes are close enough together that fluorescence from the reporter dye is efficiently quenched by the quencher dye. In contrast, the hybridized probe adopts a linear conformation in which the degree of quenching is reduced. Thus, by monitoring the emission changes of the two dyes, it is possible to monitor the formation of the amplification product indirectly. Probes of this type and methods for their use are further described by, for example, Piatek et al., 1998, Nat. Biotechnol. 16: 359-63; Tyagi and Kramer, 1996, Nat. Biotechnol. 14: 303-; and Tyagi et al., 1998, Nat. Biotechnol. 16:49-53.

Scorpion-type probe detection methods are described, for example, by Thelwell et al., 2000, Nucleic Acids Research 28:3752-. Scorpion primers are fluorogenic PCR primers in which a probe element is attached to the 5' end via a PCR stopper. They are used for real-time amplicon-specific detection of PCR products in homogeneous solution. Two different formats are possible, namely the "stem-loop" format and the "duplex" format. In both cases, the probing mechanism is intramolecular. The basic elements of scorpion probe detection in all formats are: (i) a PCR primer; (ii) a PCR stopper to prevent PCR read-through of the probe element; (iii) a specific probe sequence; and (iv) a fluorescence detection system comprising at least one fluorophore and a quencher. Following PCR extension of the scorpion primer, the resulting amplicon contains a sequence complementary to the probe, and this sequence is rendered single-stranded during the denaturation stage of each PCR cycle. On cooling, the probe is free to bind to this complementary sequence, producing an increase in fluorescence because the quencher is no longer in the vicinity of the fluorophore. The PCR stopper prevents undesirable read-through of the probe by Taq DNA polymerase.

Invader assays (Third Wave Technologies, Madison, WI) are particularly useful for SNP genotyping and use an oligonucleotide, called the signal probe, that is complementary to a target nucleic acid (DNA or RNA) or polymorphic site. A second oligonucleotide, termed the Invader Oligo, comprises the same 5' nucleotide sequence, but the 3' nucleotide sequence comprises a nucleotide polymorphism. The Invader Oligo interferes with the binding of the signal probe to the target nucleic acid such that the 5' end of the signal probe forms a "flap" at the nucleotide containing the polymorphism. This complex is recognized by a structure-specific endonuclease called Cleavase. The Cleavase enzyme cleaves the 5' flap of nucleotides. The released flap binds to a third probe carrying FRET labels, thereby forming another duplex structure recognized by the Cleavase enzyme. This time the Cleavase enzyme cleaves the fluorophore away from the quencher and generates a fluorescent signal. For SNP genotyping, the signal probe is designed to hybridize to either the reference (wild-type) allele or the variant (mutant) allele. Unlike PCR, there is linear amplification of the signal and no amplification of the nucleic acid. Further details sufficient to teach one of ordinary skill in the art are provided by, for example, Neri, B.P., et al, Advances in Nucleic Acid and Protein Analysis 3826: 117-.

Padlock probes (PLPs) are long (e.g., about 100 bases) linear oligonucleotides. The sequences at the 3' and 5' ends of the probe are complementary to adjacent sequences in the target nucleic acid. In the central, non-complementary region of the PLP there is a "tag" sequence that can be used to identify the specific PLP. The tag sequence flanks a universal priming site that allows PCR amplification of the tag. Upon hybridization to the target, the two ends of the PLP oligonucleotide are brought into close proximity and can be joined by enzymatic ligation. The resulting product is a circular probe molecule linked (catenated) to the target DNA strand. Any unligated probe (i.e., probe that has not hybridized to the target) is removed by the action of an exonuclease. Hybridization and ligation of a PLP requires that both end segments recognize the target sequence. In this way, PLPs provide very specific target recognition.

The tag region of the circularized PLP can then be amplified and the resulting amplicon detected. For example, real-time PCR can be performed to detect and quantify the amplicons. The presence and amount of amplicons can be correlated with the presence and amount of target sequence in the sample. For descriptions of PLPs, see, e.g., Landegren et al., 2003, Padlock and proximity probes for in situ and array-based analyses: tools for the post-genomic era, Comparative and Functional Genomics 4:525-30; Nilsson et al., 2006, Analyzing genes using closing and replicating circles, Trends Biotechnol. 24:83-8; Nilsson et al., 1994, Padlock probes: circularizing oligonucleotides for localized DNA detection, Science 265:2085-8.

In particular embodiments, fluorophores that can be used as detection labels for the probes include, but are not limited to, rhodamine, cyanine 3 (Cy3), cyanine 5 (Cy5), fluorescein, Vic™, Liz™, Tamra™, 5-Fam™, 6-Fam™, and Texas Red (Molecular Probes). (Vic™, Liz™, Tamra™, 5-Fam™, and 6-Fam™ are all available from Life Technologies, Foster City, Calif.)

In some embodiments, the skilled artisan can simply monitor the amount of amplification product after a predetermined number of cycles sufficient to indicate the presence of the target nucleic acid sequence in the sample. For any given sample type, primer sequence, and reaction condition, one skilled in the art can readily determine how many cycles are sufficient to determine the presence of a given target nucleic acid. In other embodiments, detection is performed at the end of exponential amplification, i.e., during the "plateau" phase, or end-point PCR is performed. In various embodiments, the amplification may be performed for about 2, 4, 10, 15, 20, 25, 30, 35, or 40 cycles, or for a number of cycles falling within any range bounded by any of these values.

By acquiring fluorescence at different temperatures, it is possible to follow the extent of hybridization. Furthermore, the temperature dependence of PCR product hybridization can be used to identify and/or quantify PCR products. Thus, the methods described herein include the use of melting curve analysis for the detection and/or quantification of amplicons. Melting curve analysis is well known and is described, for example, in U.S. Pat. Nos. 6,174,670; 6,472,156; and 6,569,627, each of which is incorporated herein by reference in its entirety, and particularly for its description of the use of melting curve analysis to detect and/or quantify amplification products. In an exemplary embodiment, melting curve analysis is performed using a double-stranded DNA dye, such as SYBR Green, Pico Green (Molecular Probes, Inc., Eugene, OR), EVA Green (Biotium), ethidium bromide, and the like (see Zhu et al., 1994, Anal. Chem. 66:1941-48).
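As a minimal sketch of the melting curve analysis mentioned above, the following estimates a melting temperature as the temperature at which the negative derivative of fluorescence with respect to temperature (-dF/dT) is greatest. The function name, the finite-difference approach, and the readings are illustrative assumptions and are not taken from the cited patents.

```python
def melting_temperature(temps, fluorescence):
    """Return the midpoint temperature of the interval where -dF/dT peaks.

    temps        : increasing temperatures (deg C), hypothetical readings
    fluorescence : dsDNA-dye fluorescence measured at each temperature
    """
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    best_tm, best_slope = None, float("-inf")
    for i in range(len(temps) - 1):
        d_temp = temps[i + 1] - temps[i]
        # Negative finite-difference slope: large when the duplex melts.
        slope = -(fluorescence[i + 1] - fluorescence[i]) / d_temp
        if slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2
    return best_tm


# Hypothetical melt data: fluorescence drops sharply between 82 and 84 deg C.
temps = [70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90]
fluor = [100, 99, 98, 96, 93, 88, 70, 30, 12, 10, 9]
tm = melting_temperature(temps, fluor)
```

In practice, distinct amplicons would be expected to give distinct peaks; this sketch reports only the single largest one.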

In certain embodiments, multiplex detection is performed in separate amplification mixtures, e.g., in separate reaction compartments of a microfluidic device, which can be used to further increase the number of samples and/or targets that can be analyzed in a single assay or to perform comparative methods, such as comparative genomic hybridization (CGH). In various embodiments, up to 2, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100, 500, 1000, 5000, 10000 or more amplification reactions are performed in each individual reaction compartment.

According to certain embodiments, the skilled artisan may employ internal standards to quantify the amplification products indicated by the fluorescent signal. See, for example, U.S. Pat. No. 5,736,333.

Devices have been developed that can perform a thermal cycling reaction with compositions containing a fluorescent dye, emit a light beam of a specified wavelength, read the intensity of the fluorescent dye, and display the fluorescence intensity after each cycle. Devices comprising a thermal cycler, a light beam emitter, and a fluorescent signal detector have been described in, for example, U.S. Pat. Nos. 5,928,907; 6,015,674; and 6,174,670.

In some embodiments, each of these functions may be performed by a separate device. For example, if the technician employs a Q-beta replicase reaction for amplification, the reaction may not be performed in a thermal cycler, but may include emitting a light beam at a particular wavelength, detecting the fluorescent signal, and calculating and displaying the amount of amplification product.

In particular embodiments, a combined thermal cycling and fluorescence detection device can be used to accurately quantify a target nucleic acid. In some embodiments, the fluorescent signal can be detected and displayed during and/or after one or more thermal cycles, thereby allowing real-time monitoring of the amplification product as the reaction occurs. In certain embodiments, the skilled artisan can use the amount of amplification product and the number of amplification cycles to calculate how much target nucleic acid sequence is present in the sample prior to amplification.
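The back-calculation described above can be sketched as follows, under the textbook assumption (not stated in the source) that each cycle multiplies the product by a constant factor of (1 + E), where E is the per-cycle amplification efficiency. The function name and the example values are hypothetical.

```python
def initial_copies(product_copies, cycles, efficiency=1.0):
    """Estimate target copies present before amplification.

    Assumes idealized exponential amplification:
        product = initial * (1 + efficiency) ** cycles
    where efficiency = 1.0 corresponds to perfect doubling each cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return product_copies / (1.0 + efficiency) ** cycles


# With perfect doubling, 2**20 product copies after 20 cycles imply 1 starting copy.
estimate = initial_copies(2 ** 20, cycles=20)
```

The model holds only during the exponential phase; once the reaction reaches plateau, the product amount no longer reflects the starting amount in this simple way.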

Amplification for DNA sequencing

In certain embodiments, amplification methods are employed to generate amplicons suitable for automated DNA sequencing. Many current DNA sequencing techniques rely on "sequencing by synthesis". These techniques entail library generation, massively parallel PCR amplification of library molecules, and sequencing. Library generation begins with converting sample nucleic acids into appropriately sized fragments, ligating adaptor sequences to the ends of the fragments, and selecting molecules that are appropriately tagged with adaptors. The presence of adaptor sequences on the ends of the library molecules enables amplification of random-sequence inserts. The above-described methods of tagging nucleotide sequences may be substituted for ligation to incorporate adaptor sequences, as described in more detail below.

Furthermore, the ability of the above-described methods to provide substantially uniform amplification of target nucleotide sequences facilitates the preparation of DNA sequencing libraries with good coverage. In the context of automated DNA sequencing, the term "coverage" refers to the number of times a sequence is measured during sequencing. A DNA sequencing library with substantially uniform coverage can generate sequence data in which the coverage is also substantially uniform. As such, in various embodiments, when performing automated sequencing of a plurality of target amplicons prepared as described herein, at least 50% of the sequences of the target amplicons are present at greater than 50% of the average copy number of the target amplicon sequences and less than 2-fold the average copy number of the target amplicon sequences. In various embodiments of this method, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the target amplicon sequences are present at greater than 50% of the average copy number of the target amplicon sequences and less than 2-fold the average copy number of the target amplicon sequences.
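The uniformity criterion in the preceding paragraph can be sketched as a simple calculation: the fraction of target amplicons whose copy number (here approximated by read count, which is an assumption) lies above 50% of the mean and below 2-fold the mean. The function name and the counts are hypothetical.

```python
def fraction_within_uniform_band(counts):
    """Fraction of amplicons with 0.5 * mean < count < 2 * mean."""
    if not counts:
        raise ValueError("counts must not be empty")
    mean = sum(counts) / len(counts)
    within = [c for c in counts if 0.5 * mean < c < 2.0 * mean]
    return len(within) / len(counts)


# Hypothetical read counts for ten target amplicons (mean = 100).
read_counts = [100, 120, 80, 30, 250, 110, 95, 105, 90, 20]
uniform_fraction = fraction_within_uniform_band(read_counts)

# The least stringent embodiment above would require this to be at least 0.50.
meets_50_percent = uniform_fraction >= 0.50
```

Here the amplicons with counts of 30, 250, and 20 fall outside the band of 50 to 200, so 7 of the 10 amplicons satisfy the criterion.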

In certain embodiments, at least three primers may be employed to generate amplicons suitable for DNA sequencing: forward, reverse, and barcode primers. However, one or more of the forward primer, reverse primer, and barcode primer may comprise at least one additional primer binding site. In particular embodiments, the barcode primer comprises at least a first additional primer binding site located upstream of a barcode nucleotide sequence that is upstream of the first nucleotide tag-specific portion. In certain embodiments, two of the forward primer, the reverse primer, and the barcode primer comprise at least one additional primer binding site (i.e., the amplicon resulting from amplification comprises the nucleotide tag sequence, the barcode nucleotide sequence, and the two additional binding sites). For example, if the barcode primer comprises a first additional primer binding site upstream of the barcode nucleotide sequence, in particular embodiments, the reverse primer may comprise at least a second additional primer binding site downstream of the second nucleotide tag. The amplification then produces a molecule with the following elements: 5'-first additional primer binding site-barcode nucleotide sequence-first nucleotide tag from forward primer-target nucleotide sequence-second nucleotide tag from reverse primer-second additional primer binding site-3'. In particular embodiments, the first and second additional primer binding sites are capable of being bound by a DNA sequencing primer to aid in sequencing a complete amplicon comprising a barcode that is indicative of the sample origin, as discussed above.

In other embodiments, at least four primers are employed to generate amplicons suitable for DNA sequencing. For example, the inner primers may be used with outer primers that further comprise first and second primer binding sites capable of being bound by DNA sequencing primers. Amplification produces a molecule with the following elements: 5'-first primer binding site-second barcode nucleotide sequence-first nucleotide tag sequence-first barcode nucleotide sequence-target nucleotide sequence-first barcode nucleotide sequence-second nucleotide tag sequence-second barcode nucleotide sequence-second primer binding site-3'. Because this molecule contains a barcode combination at either end, sequence can be obtained from either end of the molecule to identify the barcode combination.
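The point that the barcode combination can be identified from either end of the four-primer amplicon can be illustrated with a short sketch that works purely at the level of element names, in the order listed above. It does not model actual nucleotide sequences or strand orientation; the list and function names are hypothetical.

```python
# Element order of the four-primer amplicon, 5' to 3', as listed in the text.
FOUR_PRIMER_AMPLICON = [
    "first primer binding site",
    "second barcode",
    "first nucleotide tag",
    "first barcode",
    "target",
    "first barcode",
    "second nucleotide tag",
    "second barcode",
    "second primer binding site",
]


def barcodes_read_from(elements, end):
    """List the barcode elements encountered before the target when
    reading inward from the given end ("5'" or "3'")."""
    if end == "5'":
        ordered = list(elements)
    elif end == "3'":
        ordered = list(reversed(elements))
    else:
        raise ValueError("end must be \"5'\" or \"3'\"")
    target_index = ordered.index("target")
    return [e for e in ordered[:target_index] if e.endswith("barcode")]


from_five_prime = barcodes_read_from(FOUR_PRIMER_AMPLICON, "5'")
from_three_prime = barcodes_read_from(FOUR_PRIMER_AMPLICON, "3'")
```

Both reads encounter the second barcode and then the first barcode before reaching the target, which is why a read from either end suffices to recover the combination.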

In a similar manner, six primers can be used to prepare DNA for sequencing. More specifically, inner and stuffer primers, as discussed above, may be used with outer primers that additionally comprise first and second primer binding sites capable of being bound by DNA sequencing primers. Amplification produces a molecule with the following elements: 5'-first primer binding site-second barcode nucleotide sequence-third nucleotide tag sequence-first barcode nucleotide sequence-first nucleotide tag sequence-target nucleotide sequence-second nucleotide tag sequence-first barcode nucleotide sequence-fourth nucleotide tag sequence-second barcode nucleotide sequence-second primer binding site-3'. Because this molecule contains a barcode combination at either end, sequence can be obtained from either end of the molecule to identify the barcode combination.

The methods described herein can include DNA sequencing of at least one target amplicon using any available DNA sequencing method. In particular embodiments, a high-throughput sequencing method is used to sequence a plurality of target amplicons. Such methods typically use an in vitro cloning step to amplify individual DNA molecules. As discussed above, emulsion PCR (emPCR) isolates individual DNA molecules, along with primer-coated beads, in aqueous droplets within an oil phase. PCR produces copies of the DNA molecule, which bind to the primers on the bead, and the beads are then immobilized for later sequencing. In vitro clonal amplification can also be performed by "bridge PCR", in which fragments are amplified upon primers attached to a solid surface. DNA molecules that are physically bound to a surface can be sequenced in parallel, for example, by pyrosequencing or by sequencing-by-synthesis methods, as discussed above.

Labeling strategies

Any suitable labeling strategy may be used in the methods described herein. When the assay mixture is aliquoted and each aliquot is analyzed for the presence of a single amplification product, a universal detection probe can be used in the amplification mixture. In particular embodiments, universal qPCR probes may be used for real-time PCR detection. Suitable universal qPCR probes include double-stranded DNA dyes such as SYBR Green, Pico Green (Molecular Probes, Inc., Eugene, OR), EVA Green (Biotium), ethidium bromide, and the like (see Zhu et al., 1994, Anal. Chem. 66:1941-48). Suitable universal qPCR probes also include sequence-specific probes that bind to a nucleotide sequence present in all amplification products. Binding sites for such probes may be conveniently incorporated into the tagged target nucleotide sequences during amplification.

Alternatively, one or more target-specific qPCR probes (i.e., specific for the target nucleotide sequence to be detected) can be used in the amplification mixture to detect the amplification products. Target-specific probes may be useful, for example, when only a small number of target nucleic acids are to be detected in a large number of samples. For example, if only 3 targets are to be detected, target-specific probes having a different fluorescent label for each target may be used. By proper selection of the labels, an assay can be performed in which the different labels are excited and/or detected at different wavelengths in a single reaction. See, e.g., Fluorescence Spectroscopy (Pesce et al., Eds.) Marcel Dekker, New York, (1971); White et al., Fluorescence Analysis: A Practical Approach, Marcel Dekker, New York, (1970); Berlman, Handbook of Fluorescence Spectra of Aromatic Molecules, 2nd edition, Academic Press, New York, (1971); Griffiths, Colour and Constitution of Organic Molecules, Academic Press, New York, (1976); Indicators (Bishop, Ed.) Pergamon Press, Oxford, (1972); and Haugland, Handbook of Fluorescent Probes and Research Chemicals, Molecular Probes, Eugene (1992).

Removal of unwanted reaction components

It will be appreciated that reactions involving complex mixtures of nucleic acids in which multiple reaction steps are used may result in multiple unincorporated reaction components, and that removal of such unincorporated reaction components or reduction of their concentration by any of a number of purification steps (clean-up procedures) may improve the efficiency and specificity of the subsequent reactions taking place. For example, in some embodiments, it may be desirable to remove, or reduce the concentration of, the pre-amplification primers prior to performing the amplification steps described herein.

In certain embodiments, the concentration of an undesired component may be reduced by simple dilution. For example, the pre-amplified sample may be diluted about 2-fold, 5-fold, 10-fold, 100-fold, 500-fold, or 1000-fold prior to amplification to improve the specificity of subsequent amplification steps.

In some embodiments, unwanted components may be removed by various enzymatic methods. Alternatively, or in addition to the above methods, the unwanted components may be removed by purification. For example, a purification tag may be incorporated into any of the above primers (e.g., in a barcode nucleotide sequence) to facilitate purification of the tagged target nucleotide.

In particular embodiments, purification includes selectively immobilizing the desired nucleic acid. For example, the desired nucleic acid can be preferentially immobilized on a solid support. In an exemplary embodiment, an affinity moiety, such as biotin (e.g., photo-biotin), is attached to the desired nucleic acid, and the resulting biotin-labeled nucleic acid is immobilized on a solid support that includes an affinity moiety-binder, such as streptavidin. The immobilized nucleic acid can be interrogated with multiple probes, and unhybridized and/or unligated probes can be removed by washing (see, e.g., published P.C.T. application WO 03/006677 and USSN 09/931,285). Alternatively, the immobilized nucleic acid can be washed to remove other components and then released from the solid support for further analysis. This method can be used, for example, to recover target amplicons from an amplification mixture after addition of primer binding sites for DNA sequencing. In particular embodiments, an affinity moiety, such as biotin, can be attached to an amplification primer such that amplification produces an affinity moiety-labeled (e.g., biotin-labeled) amplicon. Thus, for example, when three primers are used to add barcode and nucleotide tag elements to a target nucleotide sequence, at least one of the barcode or reverse primers may include an affinity moiety, as described above. When four primers (two inner primers and two outer primers) are used to add the desired elements to the target nucleotide sequence, at least one of the outer primers may include an affinity moiety.

Microfluidic device

In certain embodiments, the methods described herein can be implemented using a microfluidic device. In an exemplary embodiment, the device is a matrix-type microfluidic device that allows multiple substrate solutions to be combined simultaneously with multiple reagent solutions in separate, isolated reaction compartments. It is understood that a substrate solution can comprise one or more substrates (e.g., target nucleic acids) and a reagent solution can comprise one or more reagents. For example, a microfluidic device may allow the simultaneous pairwise combination of multiple different amplification primers and samples. In certain embodiments, the device is configured to contain a different combination of primers and sample in each of the different compartments. In various embodiments, the number of separate reaction compartments can be greater than 50, typically greater than 100, more often greater than 500, even more often greater than 1000, and often greater than 5000, or greater than 10,000.

In a specific embodiment, the matrix-type microfluidic device is a Dynamic Array ("DA") microfluidic device. DA microfluidic devices are matrix-type microfluidic devices designed to isolate pairwise combinations of samples and reagents (e.g., amplification primers, detection probes, etc.) and are suitable for performing qualitative as well as quantitative PCR reactions, including real-time quantitative PCR analysis. In some embodiments, such DA microfluidic devices are at least partially fabricated from an elastomer. DA microfluidic devices are described in PCT Publication No. WO05107938A2 (Thermal Reaction Device and Method For Using The Same) and U.S. Patent Publication No. US20050252773A1, both of which are incorporated herein by reference in their entirety for their descriptions of DA microfluidic devices. DA microfluidic devices may incorporate high-density matrix designs that use fluid communication vias between layers of the microfluidic device to weave control lines and fluid lines through and between the layers of the device. Fluid lines in multiple layers of an elastomeric block make high-density reaction cell arrangements possible. Alternatively, a DA microfluidic device may be designed such that all reagent and sample channels are located in the same elastomeric layer, with the control channels in a different layer. In certain embodiments, a DA microfluidic device can be used to react M different samples with N different reagents.

Although the DA microfluidic device described in WO05107938 is well suited for carrying out the methods described herein, the present invention is not limited to any particular device or design. Any device that partitions a sample and/or allows independent pairwise combinations of reagents and samples may be used. U.S. Patent Publication No. 20080108063 (which is incorporated herein by reference in its entirety) includes a diagram illustrating the 48.48 DYNAMIC ARRAY™ IFC, a commercially available device from Fluidigm Corp. (South San Francisco, Calif.). It should be understood that other configurations are possible and contemplated, such as 48 × 96, 96 × 96, 30 × 120, and the like.

In a specific embodiment, the microfluidic device may be a DIGITAL ARRAY™ IFC microfluidic device, which is adapted to perform digital amplification. Such devices may have integrated channels and valves that partition mixtures of sample and reagents into nanoliter-volume reaction compartments. In some embodiments, the DIGITAL ARRAY™ IFC microfluidic device is at least partially fabricated from an elastomer. Exemplary DIGITAL ARRAY™ IFC microfluidic devices are described in co-pending U.S. application Ser. No. 12/170,414, entitled "Method and Apparatus for Determining Copy Number Variation Using Digital PCR", assigned to Fluidigm Corp. An exemplary embodiment has 12 input ports corresponding to 12 individual samples input into the device. The device may have 12 panels, and each of the 12 panels may contain 765 reaction compartments of 6 nL each, for a total volume of 4.59 μL per panel. Microfluidic channels may connect the different reaction compartments on a panel to a fluid source. Pressure may be applied to an accumulator to open and close the valves connecting the reaction compartments to the fluid source. In an exemplary embodiment, 12 inlets may be provided for loading the sample reagent mixture. 48 inlets may be used to provide a source of reagents, which are supplied to the chip when pressure is applied to the accumulator. Additionally, two or more inlets may be provided to provide hydration of the chip.
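The panel volume stated in the exemplary embodiment follows directly from the compartment count and compartment volume, as the short check below shows. The variable names are illustrative; only the figures of 12 panels, 765 compartments per panel, and 6 nL per compartment come from the text.

```python
PANELS = 12
COMPARTMENTS_PER_PANEL = 765
COMPARTMENT_VOLUME_NL = 6  # nanoliters per reaction compartment

# Volume of one panel: 765 compartments x 6 nL = 4590 nL = 4.59 microliters.
panel_volume_nl = COMPARTMENTS_PER_PANEL * COMPARTMENT_VOLUME_NL
panel_volume_ul = panel_volume_nl / 1000

# Totals across the whole device.
total_compartments = PANELS * COMPARTMENTS_PER_PANEL
total_volume_ul = PANELS * panel_volume_nl / 1000
```

Under these figures the device as a whole would hold 9,180 reaction compartments with a combined volume of about 55 μL.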

Although DIGITAL ARRAY™ IFC microfluidic devices are well suited for carrying out certain amplification methods described herein, one of ordinary skill in the art will recognize numerous variations and alternatives to these devices. The geometry of a given DIGITAL ARRAY™ IFC microfluidic device will depend on the particular application. Additional description of devices suitable for use in the methods described herein is provided in U.S. Patent Application Publication No. 20050252773, which is incorporated herein by reference for its disclosure of DIGITAL ARRAY™ IFC microfluidic devices.

In certain embodiments, the methods described herein can be implemented using a microfluidic device that provides for recovery of reaction products. Such devices are described in detail in co-pending U.S. application No. 61/166,105, filed on 4/2/2009 (which is incorporated herein by reference in its entirety, and specifically for its description of microfluidic devices and related methods that allow for recovery of reaction products), and are designated ACCESS ARRAY™ IFC (Integrated Fluidic Circuit) by Fluidigm Corp.

In an exemplary device of this type, independent sample inputs are combined with primer inputs in an M x N array configuration. Thus, each reaction is a unique combination of a particular sample and a particular reagent mixture. In one implementation, samples are loaded into sample compartments of the microfluidic device through sample input lines arranged as columns. Assay reagents (e.g., primers) are loaded into assay compartments of the microfluidic device through assay input lines arranged as rows crossing the columns. The sample compartments and assay compartments are in fluidic isolation during loading. After the loading process is complete, an interface valve, operable to obstruct a fluid line passing between each pair of sample and assay compartments, is opened to enable free interfacial diffusion of the pairwise combinations of samples and assays. Precise mixing of the samples and assays allows reactions to occur between the various pairwise combinations, producing one or more reaction products in each compartment. The reaction products are harvested and can then be used in subsequent processes. The terms "assay" and "sample" as used herein describe particular uses of these devices in certain embodiments. However, the use of these devices is not limited to "samples" and "assays" in all embodiments. For example, in other embodiments, "sample" may refer to a "first reagent" or a plurality of "first reagents" and "assay" may refer to a "second reagent" or a plurality of "second reagents". The M x N character of these devices enables any set of first reagents to be combined with any set of second reagents.
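The M x N pairing described above, in which every sample is combined with every assay so that each reaction compartment holds one unique pair, can be sketched with a Cartesian product. The sample and assay names below are hypothetical placeholders.

```python
from itertools import product


def pairwise_combinations(samples, assays):
    """Return every (sample, assay) pair, one per reaction compartment."""
    return list(product(samples, assays))


# Small hypothetical example: M = 3 samples, N = 2 primer sets.
samples = ["sample_A", "sample_B", "sample_C"]
assays = ["primer_set_1", "primer_set_2"]
reactions = pairwise_combinations(samples, assays)

# A 48 x 48 layout would give 48 * 48 = 2304 unique reactions.
n_reactions_48x48 = len(pairwise_combinations(range(48), range(48)))
```

Because each pair appears exactly once, the number of reactions is always M multiplied by N, and a given reaction product can be traced back to its sample and assay inputs.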

According to particular embodiments, reaction products from the M x N paired combinations can be recovered from the microfluidic device into discrete wells, e.g., one for each of the M samples. Typically, discrete wells are contained in a sample input port provided on a carrier (carrier). In some methods, these reaction products may be collected on a "per amplicon" basis for normalization purposes. Using embodiments of the present invention, it is possible to achieve results in which the copy number of the amplification product varies by no more than ± 25% in one sample and by no more than ± 25% between samples (for repeated experiments assembled from the same input solution of sample and assay solution). Thus, the amplification product recovered from the microfluidic device will be representative of the input sample, as measured by a particular known genotype profile. In certain embodiments, the output sample concentration will be greater than 2,000 copies/amplicon/microliter, and recovery of the reaction product will be complete in less than two hours.

In some embodiments, the reaction product is recovered by expansion pumping. Expansion pumping provides benefits not normally available using conventional techniques. For example, expansion pumping enables slow removal of reaction products from a microfluidic device. In an exemplary embodiment, the reaction product is recovered at a fluid flow rate of less than 100 μ l/hour. In this example, to distribute 48 reaction products to the reaction compartments in each column, each reaction product having a volume of about 1.5 μ l, removal of the reaction product over a period of about 30 minutes would result in a fluid flow rate of 72 μ l/hr. (i.e., 48X 1.5/0.5 hr). In other embodiments, the removal rate of the reaction product proceeds at the following rates: less than 90 μ l/hr, 80 μ l/hr, 70 μ l/hr, 60 μ l/hr, 50 μ l/hr, 40 μ l/hr, 30 μ l/hr, 20 μ l/hr, 10 μ l/hr, 9 μ l/hr, less than 8 μ l/hr, less than 7 μ l/hr, less than 6 μ l/hr, less than 5 μ l/hr, less than 4 μ l/hr, less than 3 μ l/hr, less than 2 μ l/hr, less than 1 μ l/hr, or less than 0.5 μ l/hr.

Expansion pumping results in recovery of a substantially high percentage, and possibly all, of the reaction products present in the microfluidic device. Some embodiments remove more than 75% of the reaction product present in the reaction compartments (e.g., sample compartments) of the microfluidic device. By way of example, some embodiments remove more than 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98%, or 99% of the reaction product present in the reaction compartments.

The methods described herein may use a microfluidic device with a plurality of "unit cells" that generally include a sample compartment and an assay compartment. Such unit cells may have dimensions on the order of several hundred microns, for example unit cells having the following dimensions: 500 x 500 μm, 525 x 525 μm, 550 x 550 μm, 575 x 575 μm, 600 x 600 μm, 625 x 625 μm, 650 x 650 μm, 675 x 675 μm, 700 x 700 μm, or the like. The dimensions of the sample compartment and the assay compartment are selected to provide a sufficient amount of material to complete the desired process while reducing the amounts of sample and assay solution used. By way of example, the sample compartments may have dimensions on the order of 100-400 μm width x 200-600 μm length x 100-500 μm height. For example, the width can be 100 μm, 125 μm, 150 μm, 175 μm, 200 μm, 225 μm, 250 μm, 275 μm, 300 μm, 325 μm, 350 μm, 375 μm, 400 μm, or the like. For example, the length can be 200 μm, 225 μm, 250 μm, 275 μm, 300 μm, 325 μm, 350 μm, 375 μm, 400 μm, 425 μm, 450 μm, 475 μm, 500 μm, 525 μm, 550 μm, 575 μm, 600 μm, or the like. For example, the height may be 100 μm, 125 μm, 150 μm, 175 μm, 200 μm, 225 μm, 250 μm, 275 μm, 300 μm, 325 μm, 350 μm, 375 μm, 400 μm, 425 μm, 450 μm, 475 μm, 500 μm, 525 μm, 550 μm, 575 μm, 600 μm, or the like. The assay compartments may have similar size ranges, typically with similar step sizes over a smaller range suited to the smaller compartment volume. In some embodiments, the ratio of sample compartment volume to assay compartment volume is about 5:1, 10:1, 15:1, 20:1, 25:1, or 30:1. Compartment volumes smaller than the listed ranges are included within the scope of the invention and are readily manufactured using microfluidic device manufacturing techniques.
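To give a sense of scale for the listed dimensions, the following sketch converts width, length, and height into a volume in nanoliters. It assumes, purely for illustration, that a compartment is a rectangular box; actual compartment geometry is not specified here, and the function name is hypothetical.

```python
def compartment_volume_nl(width_um, length_um, height_um):
    """Volume of a box-shaped compartment in nanoliters.

    1 nL = 1e6 cubic micrometers, so divide the volume in um^3 by 1e6.
    """
    if min(width_um, length_um, height_um) <= 0:
        raise ValueError("dimensions must be positive")
    return width_um * length_um * height_um / 1e6


# Smallest and largest sample compartments in the 100-400 x 200-600 x 100-500 um range.
smallest_nl = compartment_volume_nl(100, 200, 100)
largest_nl = compartment_volume_nl(400, 600, 500)
```

Under the box assumption, the stated ranges span roughly 2 nL to 120 nL per sample compartment.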

Higher density microfluidic devices will typically use smaller compartment volumes to reduce the footprint of the unit cell. In applications where very small sample sizes are available, reducing the chamber volume will facilitate testing of such small samples.

For single particle analysis, the microfluidic device may be designed to assist in loading and capturing the particular particles to be analyzed. FIG. 9 shows the unit cell structure of an exemplary microfluidic device for analyzing mammalian cells. Each unit cell has a "cell channel" (i.e., a sample compartment) and an "assay channel" (i.e., an assay compartment). The cell channels are circular and are used to load mammalian cells, with dimensions on the order of tens of microns in diameter and a hundred to several hundred microns in length. Depending on the size of the cells being analyzed, the diameter may be about 15 μm, about 20 μm, about 25 μm, about 30 μm, about 35 μm, about 40 μm, or about 45 μm or more, or may fall within a range having any of these values as endpoints. Depending on the size of the cells being analyzed, the length may be about 60 μm, about 90 μm, about 120 μm, about 150 μm, about 170 μm, about 200 μm, about 230 μm, about 260 μm, about 290 μm or more, or may fall within a range having any of these values as endpoints. In an exemplary microfluidic device of the ACCESS ARRAY™ IFC platform ("MA006"), the unit cell may have a cell channel of about 30 μm x 170 μm for loading mammalian cells. Such devices may be configured to provide, or assist in providing, heat to the cell channels to lyse the cells after loading. As shown in FIG. 9, the device may include an assay channel separate from the cell channel for carrying out reactions such as nucleic acid amplification. A 170 μm x 170 μm containment valve can be used to close the cell channel.

Co-pending U.S. Application No. 61/605,016, entitled "Methods, Systems, And Devices For Multiple Single-Particle or Single-Cell Processing Using Microfluidics" and filed on February 29, 2012, describes methods, systems, and devices for multiple single-particle or single-cell processing using microfluidics. Various embodiments provide for capturing, segregating, and/or manipulating individual particles or cells from a larger population of particles or cells, as well as generating genetic information and/or responses associated with each individual particle or cell. Some embodiments may be configured to image individual particles or cells, or associated reaction products, as part of the process. This application is incorporated by reference herein in its entirety, particularly for its description of microfluidic devices and related systems configured for multiple single-particle or single-cell processing.

In particular embodiments, microfluidic devices are employed to facilitate assays having the following dynamic ranges: at least 3 orders of magnitude, more often at least 4, at least 5, at least 6, at least 7, or at least 8 orders of magnitude.

Manufacturing methods using elastomeric materials and methods for designing devices and their components have been described in detail in the scientific and patent literature. See, e.g., Unger et al. (2000) Science 288:113-116; U.S. Pat. Nos. 6,960,437 (Nucleic acid amplification utilizing microfluidic devices); 6,899,137 (Microfabricated elastomeric valve and pump systems); 6,767,706 (Integrated active flux microfluidic devices and methods); 6,752,922 (Microfluidic chromatography); 6,408,878 (Microfabricated elastomeric valve and pump systems); 6,645,432 (Microfluidic systems including three-dimensionally arrayed channel networks); U.S. Patent Application Publication Nos. 2004/0115838; 2005/0072946; 2005/0000900; 2002/0127736; 2002/0109114; 2004/0115838; 2003/0138829; 2002/0164816; 2002/0127736; and 2002/0109114; PCT Publication Nos. WO 2005/084191; WO 05/030822A2; and WO 01/01025; Quake & Scherer, 2000, "From micro- to nanofabrication with soft materials" Science 290:1536-40; Unger et al., 2000, "Monolithic microfabricated valves and pumps by multilayer soft lithography" Science 288:113-116; Thorsen et al., 2002, "Microfluidic large-scale integration" Science 298:580-584; Chou et al., 2000, "Microfabricated Rotary Pump" Biomedical Microdevices 3:323-330; Liu et al., 2003, "Solving the "world-to-chip" interface problem with a microfluidic matrix" Analytical Chemistry 75:4718-23; Hong et al., 2004, "A nanoliter-scale nucleic acid processor with parallel architecture" Nature Biotechnology 22:435-39.

Data output and analysis

In certain embodiments, when the methods described herein are performed on a matrix-type microfluidic device, the data can be output as a heat matrix (also referred to as a "heat map"). In the heat matrix, each square (representing a reaction compartment on the DA matrix) is assigned a color value, which can be displayed in grayscale but is more typically displayed in color. In grayscale, black squares indicate that no amplification product was detected, white squares indicate the highest level of amplification product, and shades of gray indicate intermediate levels of amplification product. In a further aspect, a software program can be used to compile the data generated in the heat matrix into a more reader-friendly form.
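The grayscale convention described above can be illustrated with a short sketch. This is not the software used with the devices; it is a minimal example that assumes the per-compartment signal is available as a list of rows of non-negative numbers, and the function name is hypothetical.

```python
# Minimal sketch, not the instrument software. Assumes one non-negative
# signal value per reaction compartment; gray_levels is a hypothetical name.
def gray_levels(signal):
    """Map a 2D list of amplification signals to 0-255 gray values.

    0 (black) means no product detected; 255 (white) marks the highest
    signal in the matrix; intermediate signals scale linearly in between.
    """
    peak = max(max(row) for row in signal)
    if peak <= 0:
        # Nothing amplified anywhere: every square is black.
        return [[0 for _ in row] for row in signal]
    return [[round(255 * max(value, 0) / peak) for value in row] for row in signal]


# Hypothetical 2 x 2 matrix of signals (rows = samples, columns = assays).
example = [[0.0, 0.5], [1.0, 0.25]]
heat = gray_levels(example)

if __name__ == "__main__":
    for row in heat:
        print(row)
```

Linear scaling against the matrix maximum is one choice among several; a logarithmic scale may be more appropriate when signals span many orders of magnitude.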

Applications

In particular embodiments, the methods described herein are used to analyze one or more nucleic acids, e.g., in some embodiments, one or more nucleic acids in or associated with a particle. Thus, for example, these methods are useful for identifying the presence of a particular polymorphism (e.g., a SNP), allele, or haplotype, or chromosomal abnormality, such as an amplification, deletion, rearrangement, or aneuploidy. These methods can be used in genotyping, which can be carried out in a variety of contexts, including diagnosis of genetic diseases or disorders, cancer, pharmacogenomics (personalized medicine), quality control in agriculture (e.g., for seeds or livestock), research and management of populations of plants or animals (e.g., in aquaculture or fishery management or in determining population diversity), or paternity or forensic identification. The methods described herein can be used to identify sequences indicative of a particular condition or organism in a biological or environmental sample. For example, the methods can be used in assays to identify pathogens such as viruses, bacteria, and fungi. The method may also be used in studies aimed at characterizing the environment or microenvironment, for example characterizing microbial species in the human intestine.

In certain embodiments, these methods may also be used to determine DNA or RNA copy number. Determining abnormal DNA copy numbers in genomic DNA is useful, for example, in diagnosing and/or prognosing genetic defects and diseases, such as cancer. For the monitoring of expression of a gene of interest, e.g. under different conditions (e.g. different external stimuli or disease states) and/or at different developmental stages, e.g. in different individuals, tissues, or cells, it is useful to determine the RNA "copy number", i.e. expression level.

Furthermore, the method may be used to prepare nucleic acid samples for further analysis, such as, for example, DNA sequencing.

Furthermore, the nucleic acid sample may be tagged as a first step prior to subsequent analysis, thereby reducing the risk that mislabeling or cross-contamination of the sample will compromise the results. For example, any physician's office, laboratory, or hospital may tag the sample immediately after collection and may confirm the tag at the time of analysis. Similarly, samples containing nucleic acids collected at a crime scene may be tagged as soon as possible to ensure that the sample is not mislabeled or tampered with. Detection of the tag at each transfer of the sample from one party to another can be used to establish a chain of custody for the sample.

As discussed above, the methods described herein can be used to analyze other parameters of the particles in addition to the nucleic acids, such as, for example, the expression levels of one or more proteins in or associated with each particle. In some embodiments, each particle is analyzed for one or more nucleic acids along with one or more other parameters.

The ability to associate test results for multiple parameters with each particle in a population of particles can be exploited in a variety of different types of studies. In various embodiments, the methods described herein can be used to identify two or more changes, such as copy number changes, mutations, expression level changes, or splice variants, wherein the changes are together associated with a phenotype. A phenotype can be, for example, the risk, presence, severity, prognosis, and/or responsiveness to a particular therapy or resistance to a drug. The methods described herein can also be used to detect the co-presence of specific nucleic acid sequences, which can indicate genomic recombination, co-expression of specific splice variants, or co-expression of specific light and heavy chains in B cells. The methods may also be applicable to detecting the presence of a particular pathogen in a particular host cell, for example, when both pathogen-specific and host cell-specific nucleic acids (or other parameters) are present together in the same cell. The methods can also be used for targeted resequencing from circulating tumor cells, e.g., of mutational hot spots in different cancers.

Kits

Kits according to the invention may include one or more reagents useful for practicing one or more of the assay methods described herein. Generally the kit comprises a package in which one or more containers contain reagents (e.g., primers and/or probes) as one or more separate compositions, or optionally, as a mixture when compatibility of the reagents will permit. The kit may also include other substances that may be desirable from a user's perspective, such as buffers, diluents, standards, and/or any other substance useful in sample processing, washing, or any other step in performing the assay. In particular embodiments, the kit includes one or more matrix-type microfluidic devices as discussed above.

In certain embodiments, the invention includes a kit for performing the above-described method of adding adaptor molecules to each end of a plurality of target nucleic acids comprising sticky ends. These embodiments can be used, for example, for fragment generation for high-throughput DNA sequencing. Such a kit may comprise a plurality of adaptor molecules designed for use in this method (see above) and one or more components selected from the group consisting of: DNase, exonuclease, endonuclease, polymerase, and ligase.

In particular embodiments, the invention includes kits for combinatorial barcoding. A kit for performing the four-primer method, for example, can comprise a polymerase and:

(i) inner primers comprising:

a forward, inner primer comprising a first nucleotide tag, a first barcode nucleotide sequence, and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion, a first barcode nucleotide sequence, and a second nucleotide tag; and

(ii) outer primers comprising:

a forward, outer primer comprising a second barcode nucleotide sequence and a first nucleotide tag-specific portion; and

a reverse, outer primer comprising a second nucleotide tag-specific portion and a second barcode nucleotide sequence, wherein the outer primers are in excess of the inner primers.

A kit for performing a six-primer, combinatorial barcoding method may comprise a polymerase and:

(i) inner primers comprising:

a forward, inner primer comprising a first nucleotide tag and a target-specific portion; and

a reverse, inner primer comprising a target-specific portion and a second nucleotide tag;

(ii) stuffer primers comprising:

a forward, stuffer primer comprising a third nucleotide tag, a first barcode nucleotide sequence, and a first nucleotide tag-specific portion; and

a reverse, stuffer primer comprising a second nucleotide tag-specific portion, a first barcode nucleotide sequence, and a fourth nucleotide tag; and

(iii) outer primers comprising:

a forward, outer primer comprising a second barcode nucleotide sequence and a third nucleotide tag-specific portion; and

a reverse, outer primer comprising a fourth nucleotide tag-specific portion and a second barcode nucleotide sequence; wherein the outer primers are in excess of the stuffer primers and the stuffer primers are in excess of the inner primers.
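The element order of the product expected from the four-primer kit described above can be sketched symbolically. This is an illustration under stated assumptions, not a definitive model: elements are represented by labels rather than sequences, the order is written 5' to 3' along one strand, and the function name and barcode labels are hypothetical.

```python
# Symbolic sketch only: elements are labels, not nucleotide sequences.
# four_primer_amplicon and the barcode labels are hypothetical names.
def four_primer_amplicon(first_barcode: str, second_barcode: str) -> list[str]:
    """Return the 5'->3' element order of the doubly barcoded product."""
    # Inner primers: tag1-BC1-target (forward) and target-BC1-tag2 (reverse).
    inner_forward = ["tag1", first_barcode, "target"]
    inner_reverse = ["target", first_barcode, "tag2"]
    # Inner amplification joins the two primers through the shared target.
    inner_product = inner_forward + inner_reverse[1:]
    # Outer primers anneal to the tags and add the second barcode at each end.
    return [second_barcode] + inner_product + [second_barcode]


layout = four_primer_amplicon("BC1_a", "BC2_b")

if __name__ == "__main__":
    print("-".join(layout))
```

In this representation each product carries both barcodes on both sides of the target, which is what allows the barcode combination to be read from either end.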

In other embodiments, the invention includes kits for combinatorial ligation-based tagging. These kits comprise a plurality of adaptors comprising:

a plurality of first adaptors, each comprising the same endonuclease site, a first barcode nucleotide sequence, a first primer binding site, and a sticky end, wherein the plurality of first adaptors comprises N different first barcode nucleotide sequences, wherein N is an integer greater than 1;

a second adaptor comprising a second primer binding site and a sticky end; and

a plurality of third adaptors comprising second barcode nucleotide sequences and sticky ends complementary to those generated upon cleavage of said first adaptors at said endonuclease sites, wherein the plurality of third adaptors comprises M different second barcode nucleotide sequences, wherein M is an integer greater than 1.

Such a kit may optionally comprise an endonuclease specific for the endonuclease site in the first adaptor and/or a ligase.
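Combining N first barcodes with M second barcodes gives N x M distinguishable barcode combinations from only N + M adaptors. A minimal sketch of that arithmetic follows; the barcode labels and the function name are hypothetical placeholders, not sequences from this application.

```python
# Minimal sketch of the N x M arithmetic; labels are hypothetical placeholders.
from itertools import product


def barcode_combinations(first_barcodes, second_barcodes):
    """Enumerate every (first barcode, second barcode) pairing."""
    return list(product(first_barcodes, second_barcodes))


first = [f"BC1_{i}" for i in range(1, 5)]   # N = 4 first barcodes
second = [f"BC2_{j}" for j in range(1, 4)]  # M = 3 second barcodes
combos = barcode_combinations(first, second)

if __name__ == "__main__":
    print(f"{len(first) + len(second)} adaptors give {len(combos)} combinations")
```

For example, 96 first barcodes and 96 second barcodes would give 9,216 combinations.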

The invention also provides kits for tagging by insertion mutagenesis, which may also be used for combinatorial tagging as described above. In certain embodiments, such kits comprise:

one or more nucleotide tags; and

a plurality of barcode primers, wherein each barcode primer comprises:

a first tag-specific portion, specific for a first portion of the nucleotide tag, linked to;

a barcode nucleotide sequence, which does not anneal to the nucleotide tag, linked to;

a second tag-specific portion, specific for a second portion of the nucleotide tag, wherein each of the plurality of barcode primers comprises the same first and second tag-specific portions but M different second barcode nucleotide sequences, wherein M is an integer greater than 1.

In particular embodiments, the nucleotide tag comprises a transposon end, and the kit further comprises a transposase that can add the transposon end to the target nucleic acid. Such kits may also optionally comprise a polymerase.

The invention includes kits useful for bidirectional nucleic acid sequencing. In particular embodiments, such kits may comprise:

a first outer primer set, wherein the set comprises:

a first outer, forward primer comprising a portion specific for a first primer binding site; and

a first outer, reverse primer comprising a barcode nucleotide sequence and a portion specific for a second primer binding site, wherein the first and second primer binding sites are different;

a second outer primer set, wherein the set comprises:

a second outer, forward primer comprising a barcode nucleotide sequence and a portion specific for the first primer binding site; and

a second outer, reverse primer comprising a portion specific for the second primer binding site.

In certain embodiments, the first and second primer binding sites may be binding sites for DNA sequencing primers. In some embodiments, the outer primers may each further comprise an additional nucleotide sequence, wherein:

the first outer, forward primer comprises a first additional nucleotide sequence and the first outer, reverse primer comprises a second additional nucleotide sequence; and

the second outer, forward primer comprises the second additional nucleotide sequence and the second outer, reverse primer comprises the first additional nucleotide sequence; and the first and second additional nucleotide sequences are different.

In a specific, exemplary embodiment, the first outer primer set comprises PE1-CS1 and PE2-BC-CS2, and the second outer primer set comprises PE1-CS2 and PE2-BC-CS1 (Table 1, Example 9).

The bidirectional nucleic acid sequencing kit comprising two sets of outer primers can optionally further comprise a set of inner primers, wherein the set comprises:

an inner, forward primer comprising a target-specific portion and a first primer binding site; and

an inner, reverse primer comprising a target-specific portion and a second primer binding site.

In certain embodiments, the kit can comprise a plurality of inner primer sets, each specific for a different target nucleic acid.

Any of these bidirectional nucleic acid sequencing kits can also optionally comprise DNA sequencing primers that:

bind to the first and second primer binding sites and prime sequencing of the target nucleotide sequence; and/or

bind to the first and second primer binding sites and prime sequencing of the barcode nucleotide sequence.

In particular embodiments, both types of DNA sequencing primers are included in the kit, and the primers that bind to the first and second primer binding sites and prime sequencing of the barcode nucleotide sequence are the reverse complements of the primers that prime sequencing of the target nucleotide sequence. In a specific, exemplary embodiment, the kit comprises DNA sequencing primers CS1, CS2, CS1rc, and CS2rc (Table 2, Example 9).
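The reverse-complement relationship between the two types of sequencing primers can be made concrete with a short sketch. The sequence used below is a made-up placeholder, not the actual CS1 or CS2 sequence (those are given in Table 2 of Example 9), and the function name is hypothetical.

```python
# Sketch of the reverse-complement relationship. The example sequence is a
# made-up placeholder, NOT the actual CS1 or CS2 primer sequence.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return sequence.translate(_COMPLEMENT)[::-1]


placeholder_primer = "AACGTTGGCA"  # hypothetical stand-in for a target-read primer
placeholder_primer_rc = reverse_complement(placeholder_primer)

if __name__ == "__main__":
    print(placeholder_primer, "->", placeholder_primer_rc)
```

Because the two primers anneal to opposite strands at the same site, one reads toward the target nucleotide sequence and the other reads in the opposite direction, toward the barcode.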

The kits generally include instructions for performing one or more of the methods described herein. The instructions included in the kit may be attached to the packaging material or may be included as a package insert. While the instructions are typically written or printed materials, they are not so limited. Any medium capable of storing such instructions and communicating them to an end user is contemplated by the present invention. Such media include, but are not limited to, electronic storage media (e.g., disks, tapes, cartridges, chips), optical media (e.g., CD ROMs), RF tags, and the like. As used herein, the term "instructions" may include the address of an internet site that provides the instructions.

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.

Moreover, all other publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.

Examples

Example 1

General library preparation method for DNA sequencing

Existing methods of preparing libraries for nucleic acid sequencing are cumbersome and require multiple steps. The key methods involve random fragmentation of the DNA, followed by end repair, polishing of the fragment ends, and ligation of end adaptors. Each of these steps requires specific reaction conditions, and the product must be purified between steps.

This example describes an alternative method of library preparation, illustrated in Figures 1 and 2. This method utilizes simple sequencing adaptors, which may be double-stranded DNA molecules, comprising end adaptors (or portions thereof) for a given sequencer, restriction enzyme digestion sites (or other specific cleavage sites), and flanking degenerate sequences at the 3' ends of both strands. Alternatively, the adaptor may be a hairpin sequence or a double-stranded oligonucleotide. It is also possible for the end adaptor to be a single-stranded oligonucleotide with a degenerate sequence at the 3' end.

The DNA will be fragmented using standard methods (e.g. enzymatic digestion, nebulization, sonication). Enzymatic digests would be preferred as they result in less damage to the DNA molecule for use in downstream steps. For example, DNase I may be added to the DNA to be sequenced. This reaction can be terminated by heat treatment.

The double-stranded DNA will then be digested back to single-stranded DNA at the ends using T4 polymerase in the absence of NTPs or a strand-specific exonuclease without polymerase activity. An exonuclease will be preferred because it can act together with a ligase (e.g., a thermostable ligase) and a polymerase in a single reaction. However, if T4 polymerase is used, the preparation method will still work, in multiple steps.

Nuclease digestion will expose one strand at the end of the DNA. The adapter sequence will be added in the presence of polymerase and ligase. The adaptor sequence will anneal to the digested DNA and the nicks will be filled in and repaired by the polymerase/ligase mixture. In one version of this protocol, the adaptor sequences will be made from hairpin structures so that during digestion/ligation/polymerisation the end product is circularised DNA. It will be protected from further degradation by exonucleases, leading to the accumulation of the final product.

Example 2

Combinatorial ligation-based barcoding for Illumina sequencing

A DNA sequencing library was prepared and the standard PE 2-BC-tag sequence was replaced by a RE-1-BC-tag.

The PE2 tag sequence downstream of the barcode sequence was replaced with a recognition site (RE-1) for a restriction enzyme (e.g., BsrDI) that leaves a short overhang:

The library was cleaved with the enzyme.

An adaptor molecule comprising an appropriate overhang and a second barcode sequence is ligated:

Ligation will result in the following construct:

The remaining adaptor molecules are removed prior to sequencing using standard clean-up methods.

During index reads on a sequencing run, the index sequence reported back will be: CTAGNNAGCT (SEQ ID NO: 8).

Example 3

Single cell analysis of gene expression

Problem: To obtain single-cell gene expression data for a set of genes using a DYNAMIC ARRAY™ IFC, the cells must first be isolated in tubes off-chip. Methods for isolating such cells are difficult to perform and/or require large numbers of cells. This last obstacle becomes even more of a barrier to obtaining gene expression data from single cells using the BioMark when cells are limited, such as primary cells from tissues and/or cells from drug screening assays in microwell plates.

Solution: ACCESS ARRAY™ IFCs ("chips"), or similar chips that allow recovery of the reaction mixture, can be used to load single cells via limiting dilution (e.g., the MA006 chip). Using the chip as a means of sorting and preparing cells for downstream gene expression analysis makes it convenient to prepare a limited number of cells for the DYNAMIC ARRAY™ IFC, thereby providing a solution to the problems listed above. The method comprises the following steps:

1) Cells are loaded into the ACCESS ARRAY™ IFC at a limiting dilution. The primer sets are loaded as shown in FIG. 7A. Any given cell will be exposed to all gene-specific primers and a single unique barcode primer.

2) Reverse transcription and pre-amplification were performed on the chip. An example of the resulting amplicon is shown in fig. 7B. This is a 3 primer approach. The benefit of using this approach is that only a set of 96 primer pairs (or more, for as many genes as desired) need be designed and ordered for a particular experiment. The BC reverse primer was universal and was used in all experiments. Any given cell will have all of the genes amplified, and all amplicons will have been tagged with a single barcode. (see possible variations below).

3) The reaction products are exported from the chip by pool (at 90 degrees from the different primers, i.e., from the sample side). Pool N now contains a mixture of pre-amplified amplicons for 96 genes (more or fewer) and barcodes, each barcode corresponding to one cell. The pools are kept separate, so that even though multiple cells are tagged with the same barcode, they remain distinguishable because they belong to different pools.

4) Samples are loaded onto the DYNAMIC ARRAY™ IFC as shown in FIG. 7C. Note: single cells can be tracked on the ACCESS ARRAY™ IFC by a variety of methods. This provides information about which pool and which barcode pre-amplification reaction contains a single cell, i.e., which should be loaded onto the DYNAMIC ARRAY™ IFC. This option makes it possible to read out only those ACCESS ARRAY™ IFC chambers that contain a single cell, resulting in efficient use of the DYNAMIC ARRAY™ IFC. Moreover, if the target cells are identified using a cell-specific stain, e.g., an antibody to a cell surface marker, only this subset of cells may be selected for loading onto the DYNAMIC ARRAY™ IFC. This can become important when the cells of interest are rare in a heterogeneous population of cells, e.g., stem cells or cancer cells.

5) qPCR is run, with EvaGreen used for detection. A single cell (whose amplicons were tagged by the BC primer during pre-amplification in the ACCESS ARRAY™ IFC) can be interrogated for a given gene (whose amplification is detected with the gene-specific primer in the DYNAMIC ARRAY™ IFC) by amplifying with a combination of one BC primer and one gene-specific primer.
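The bookkeeping implied by steps 3) to 5), in which a cell is identified by its pool together with its barcode, can be sketched as follows. This is a hypothetical illustration: it assumes the per-chamber cell counts from on-chip tracking are available as a grid indexed by pool (row) and barcode (column), and the function name is not from this application.

```python
# Hypothetical bookkeeping sketch. Assumes cell_counts[pool][barcode] holds the
# number of cells observed in the chamber that received that pool and barcode.
def single_cell_reactions(cell_counts):
    """Return (pool, barcode) index pairs for chambers holding exactly one cell."""
    return [
        (pool, barcode)
        for pool, row in enumerate(cell_counts)
        for barcode, count in enumerate(row)
        if count == 1
    ]


# Made-up counts for 2 pools x 3 barcodes.
observed = [
    [0, 1, 2],
    [1, 0, 1],
]
selected = single_cell_reactions(observed)

if __name__ == "__main__":
    for pool, barcode in selected:
        print(f"assay pool {pool} with barcode primer {barcode}")
```

Only the selected pool and barcode pairs would then be carried forward, so that empty chambers and chambers with multiple cells do not consume downstream reactions. The same barcode index can appear in more than one pair because the pools are kept separate.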

Possible variations: there are different detection methods with a common end result of pre-amplifying a set of genes and labeling individual cells with unique barcodes. Examples are as follows:

The same procedure was followed as above but using the 2-primer method.

The Fen-ligase chain reaction was used.

A melting temperature strategy was used.

Example 4

Alternative method for detecting the reaction product from example 3

Instead of using qPCR with EvaGreen to detect the BC-tagged amplicons pre-amplified in the ACCESS ARRAY™ IFC, a ligase chain reaction with real-time detection is performed in a DYNAMIC ARRAY™ IFC (e.g., M96).

Exemplary amplicons have the following structure: 5'-forward primer sequence-target nucleotide sequence-reverse primer sequence-barcode nucleotide sequence-3'. In this case, one primer can anneal to the reverse primer sequence and the other primer can anneal to the adjacent barcode nucleotide sequence, followed by ligation and repeated cycles of annealing and ligation. See FIG. 8A. Amplicons in the pool that have a different reverse primer sequence ("R"), i.e., that are derived from a different target nucleic acid (here, a messenger RNA), or a different barcode primer sequence ("BC") will not be amplified. Thus, amplifying pool N using BC_M amplifies the barcoded target nucleic acids from the chamber in row N, column M of the ACCESS ARRAY™ IFC. Using R_1 as the other primer in this amplification amplifies the amplicon corresponding to R_1.

One method of real-time detection is the flap endonuclease-ligase chain reaction, which uses a 5' flap endonuclease and a labeled BC_n primer, as shown in FIG. 8B. This reaction employs a labeled probe and an unlabeled probe, where simultaneous hybridization of the probes to the reaction product results in the formation of a flap at the 5' end of the labeled probe, and cleavage of the flap separates the fluorophore from the quencher, producing a signal. Since the BC is not amplicon-specific, these primers need only be made once. For example, one set of 96 BCs will suffice for any number of different sets of F_nR_n amplicons.

The benefits of this strategy:

Selection of pools and BCs allows analysis of only those ACCESS ARRAY™ IFC chambers that contain a single cell (where single-cell analysis is the objective). Unlabeled cells can be imaged in the ACCESS ARRAY™ IFC using bright field or fluorescence imaging. In addition, the cells may be stained with a dye and/or labeled antibody before or after loading into the ACCESS ARRAY™ IFC to identify cells of interest (e.g., stem cells, cancer stem cells, etc.). Selection of pools and BCs allows analysis of only those ACCESS ARRAY™ IFC chambers that contain cells of interest, improving efficiency.

This strategy requires far fewer cells than FACS, which makes it usable in assays that cannot be performed using FACS, such as analyzing primary cells or populations of cells from screening assays.

Example 5

Method for preparing nucleic acids for sequencing from single cells using an ACCESS ARRAY™ IFC adapted for cell manipulation ("MA006")

General description of the Process

A "chip," referred to herein as MA006, has been developed using the ACCESS ARRAY™ IFC platform, and methods that use MA006 to integrate cell manipulation with sample preparation for nucleic acid sequencing have also been developed. FIG. 9 is a schematic diagram of the MA006 unit cell structure, showing the on-chip method. This integration simplifies the steps required to perform the experiment. Moreover, loading the chip requires only hundreds of cells.

The MA006 chip has the following features:

the unit cell has a 170 x 30 μm circular channel for loading mammalian cells;

48.48 matrix format;

heat is used to lyse the cells in the cell channels;

separate reaction compartments for the amplification reaction;

170 x 170 μm containment valve to close the cell channel;

additional resist layer: PouroB-30 μm circular resist;

chip manufacturing: uses the existing AA48.48 method;

65 μm alignment tolerance;

130 μm punch diameter;

65 x 85 μm valve size; and

3-layer design method.

There are no cell capture features on the MA006 chip. As a result, a limiting dilution strategy is used to obtain the desired number of cells per chamber. However, cell capture features can be designed into the chip. They may be physical (e.g., cup or pot structures), biological (e.g., spotted peptides), or chemical (e.g., charged ions).

Off-chip cell manipulation: The cells to be analyzed are prepared at a density that yields the desired number of cells per sample chamber ("cell channel" in FIG. 9). Because the MA006 chip uses a limiting dilution strategy, the number of cells per chamber follows a Poisson distribution, both theoretically and in practice. In the first instance, since the maximum number of chambers containing a single cell is desired, the optimal cell density is 300-600 cells per microliter. A minimum volume of 1 to 2 microliters may be applied to the inlet. Thus, experiments can be performed with only hundreds of cells. Any cell type (e.g., mammalian, bacterial, etc.) from any source (e.g., living organism, tissue culture, etc.) can be used. Any form or degree of preparation, washing, and/or staining may be used so long as it is compatible with downstream applications.
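The Poisson behavior of limiting dilution can be illustrated with a short calculation. This is a sketch under an idealized assumption, namely that cells distribute independently with a known mean number of cells per chamber; the mean values used are illustrative rather than measured, and the function names are hypothetical.

```python
# Idealized sketch: assumes cells distribute independently among chambers with
# a known mean number of cells per chamber. Function names are hypothetical.
import math


def poisson_pmf(k: int, mean_cells: float) -> float:
    """Probability that a chamber holds exactly k cells."""
    return math.exp(-mean_cells) * mean_cells**k / math.factorial(k)


def expected_single_cell_chambers(n_chambers: int, mean_cells: float) -> float:
    """Expected number of chambers that hold exactly one cell."""
    return n_chambers * poisson_pmf(1, mean_cells)


if __name__ == "__main__":
    n_chambers = 48 * 48  # 48.48 matrix format
    for mean_cells in (0.25, 0.5, 1.0, 2.0):
        singles = expected_single_cell_chambers(n_chambers, mean_cells)
        print(f"mean {mean_cells:>4} cells/chamber: {singles:6.0f} single-cell chambers")
```

Under this model the fraction of single-cell chambers peaks when the mean is one cell per chamber, at 1/e, or about 37% of chambers; higher densities increase the share of chambers with multiple cells, and lower densities increase the share of empty chambers.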

Cell tracking in the chip: In the absence of any polymerase/amplification-dependent chemistry, bright field or fluorescence microscopy can be used to monitor the location, identity, and/or content of the cells in the chip. The cells can be stained with any stain (e.g., a nucleic acid-specific stain such as SYTO 10; an immunodetection reagent such as Cy5-conjugated anti-CD19; etc.) so long as it is compatible with downstream applications. This can be used, for example, to identify rare cells, e.g., cancer stem cells, in a heterogeneous population of cells.

Chemical reaction: After the cells are loaded into MA006, the assay is loaded into the assay chamber ("assay channel" in FIG. 9), and the interface valve is released to mix the contents of the sample and assay chambers. The chip is thermal cycled according to the selected chemistry and imaged, either in real time or at the end point, if required and/or supported by the chemistry. This procedure is not limited to gene-specific amplification: non-specific degenerate primers can be used, or RNA-specific amplification can be performed. In the case of gene-specific amplification, a "multiplexing" strategy can be used to target more than one gene simultaneously. The chemistry is flexible, provided that the output is a substrate for sequencing, and should not be limited to polymerase chain reaction or even to amplification.

Cell manipulation

Cell counting: bright field imaging

RAMOS cells were manipulated as follows:

(1) Harvest cells.

(2) Wash 2-3X in ice cold Tris saline BSA buffer.

(3) Count and make appropriate dilutions. The theoretical distribution (Poisson distribution) for different cell densities is shown in FIG. 10.

(4) Push cells into MA006 chip.

(5) Image by bright field.

FIGS. 11A-11B show the results of cell counting in the chip using bright field imaging (A) compared to the theoretical distribution (B). Based on bright field imaging, the cell density in the chip is close to, but lower than, the Poisson distribution, a trend that is exacerbated at higher cell densities. This may be due in part to "shadowing" created by the chip features, which may reduce the measurable area in which cells can be detected using bright field imaging.

Cell counting: post-PCR fluorescence

Cells were loaded at 0.15E6/ml onto MA006 chips, and RT-PCR was performed using CellsDirect™ RT-PCR components, Rox, and EVA Green. FIGS. 12A-12B show that fluorescent cell "ghost" images (FIG. 12A) allow more cells to be detected than bright field imaging prior to PCR, so that the cell density more closely approximates the Poisson distribution (FIG. 12B). Based on these results, if 4000 cells (e.g., 4 μl at 1000 cells/μl) are applied to each inlet of the MA006 chip, about 1/3 of the 2304 (48 × 48) chambers, or about 800 chambers, have a single cell.
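The comparison with the Poisson distribution can be reproduced numerically. The following is a minimal Python sketch and is not part of the original protocol; the function names are illustrative, and the mean number of cells per chamber is an assumed input rather than a value reported above.

```python
import math


def poisson_pmf(k, mean):
    """Probability that a chamber holds exactly k cells at the given mean occupancy."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def expected_single_cell_chambers(n_chambers, mean):
    """Expected number of chambers that hold exactly one cell."""
    return n_chambers * poisson_pmf(1, mean)


# 2304 chambers (48 x 48); the mean occupancies below are assumed values.
for mean in (0.25, 0.5, 1.0, 2.0):
    singles = expected_single_cell_chambers(2304, mean)
    print(f"mean={mean:<4} single-cell chambers ~ {singles:.0f}")
```

Under ideal Poisson loading, the single-cell fraction peaks at e^-1 (about 37%) when the mean is one cell per chamber, which is in line with the roughly one-third figure given above.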

More specific methods

More specific methods that can be used for detecting cells in the chip include, for example, the use of cell membrane-permeable nucleic acid stains and/or the detection of cell-specific surface markers with antibodies. Thus, for example, RAMOS cells can be manipulated as follows:

(1) Harvest the cells.

(2) Wash 2-3X in ice-cold Tris saline BSA buffer.

(3) Stain with Syto10 DNA stain and/or Cy5-labeled anti-CD19 antibody.

(4) Wash 2-3X in ice-cold Tris saline BSA buffer.

(5) Count and make appropriate dilutions.

(6) Push the cells into the MA006 chip.

(7) Image.

The results of these more specific methods are shown in FIG. 13 for a cell density of 1E6/ml. FIG. 14A shows a comparison of nucleic acid staining before RT-PCR (Syto10 DNA stain) with ghost images after RT-PCR (cell ghosts), and FIG. 14B shows that Syto10 does not inhibit RT-PCR of GAPDH. The workflow for cell detection in the chip may include staining cells with DNA stain and/or antibodies, followed by pre-RT-PCR counting and then post-RT-PCR counting of cell ghosts as a backup.

Chemical reaction: one-step gene-specific RT-PCR

Different chemical reactions were studied to find an efficient chemical reaction for converting gene-specific RNA in cells into amplicons in the MA006 chip. Cells were pushed into the cell channels in Tris saline BSA (0.5 μg/ml) buffer. Reagents loaded into the assay channels included:

Primers (500nM final concentration)

CellsDirect™ One-Step qRT-PCR kit components (available from Life Technologies, Foster City, CA)

Reaction mixture

Enzyme mixture: SuperScript III + Platinum Taq polymerase

Buffer solution

Rox

EVA Green

Loading reagents-AA or GE (available from Fluidigm corp., South San Francisco, CA) to prevent nonspecific uptake by PDMS ("depletion effect") and to lyse cells.

RT-PCR of GAPDH was performed with or without the AA or GE loading reagent. The results show that both loading reagents inhibited RT-PCR. The loading reagents comprise Prionix (AA) or BSA (GE), and 0.5% Tween-20. RT-PCR of GAPDH was therefore performed in the presence of Prionix or BSA. Prionix, but not BSA, was found to inhibit RT-PCR. RT-PCR of GAPDH was also performed in the presence of 0.5% Tween 20 or 0.5% NP40 (the latter being a lytic reagent). The results of this study are shown in FIG. 15. Neither 0.5% Tween 20 nor 0.5% NP40 significantly inhibited RT-PCR of GAPDH.

To determine whether the reaction conditions developed for RT-PCR of GAPDH from cells would allow RT-PCR of other genes expressed at different levels, RT-PCR of 11 genes covering a range of expression levels was performed with 10 ng/μl RNA and the reagents described above, except that 0.5% NP40 was used instead of the AA/GE loading reagent. The thermal cycling protocol was: 30 minutes at 50 ℃; 30 minutes at 55 ℃; 2 minutes at 95 ℃; then 45 cycles of 95 ℃ for 15 seconds, 60 ℃ for 30 seconds, and 72 ℃ for 60 seconds. Standard curve amplification of these 11 genes performed in the MA006 chip is shown in FIG. 16. These results demonstrate that the CellsDirect™ One-Step qRT-PCR kit can be used with 0.5% NP40 (for cell lysis and to prevent depletion effects in the chip) to convert gene-specific RNA in cells into amplicons in the MA006 chip.
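As a rough check on run length, the programmed hold times of the thermal cycling protocol above can be summed. The sketch below is illustrative only; it assumes that ramp times between temperatures are negligible, which is not stated in the protocol, and the names used are hypothetical.

```python
# Each step: (temperature in deg C, hold time in seconds), taken from the protocol above.
HOLDS = [(50, 30 * 60), (55, 30 * 60), (95, 2 * 60)]
CYCLE = [(95, 15), (60, 30), (72, 60)]
N_CYCLES = 45


def total_hold_seconds(holds, cycle, n_cycles):
    """Sum programmed hold times; ramp times between temperatures are ignored."""
    hold_time = sum(seconds for _, seconds in holds)
    cycle_time = sum(seconds for _, seconds in cycle)
    return hold_time + n_cycles * cycle_time


total = total_hold_seconds(HOLDS, CYCLE, N_CYCLES)
print(f"Programmed hold time: {total} s ({total / 60:.2f} min)")
```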

Sequencing

To aid in sequencing gene-specific amplicons generated in the MA006 chip, a barcoding method was employed to distinguish amplicons from different chambers (e.g., cells). More specifically, a four-primer, combinatorial barcoding approach was employed to place a combination of two barcodes on either end of each amplicon. This method is shown diagrammatically in FIG. 17. The inner primers contain a target-specific portion ("TS-F" in the forward primer and "TS-R" in the reverse primer), a barcode nucleotide sequence ("bc2"), and different nucleotide tags. The outer primers contain tag-specific portions ("CS1" and "CS2"), a different barcode nucleotide sequence ("bc1"), and primer binding sites ("A" and "B") for sequencing primers. FIGS. 18A-18B illustrate how four-primer barcoding can be performed on a chip such as MA006. Amplification is performed on the chip with the inner primers, with the chambers of each row having the same inner primer pair with the same barcode. The reaction products from each column of chambers can be harvested as a pool and amplified using a different outer primer pair for each pool. This amplification produces amplicons having a combination of barcodes at either end of the amplicon that uniquely identifies the chamber (by row and column) in which the initial amplification was performed. The reaction products were sequenced, and the number of reads per sequence per reaction chamber was determined. This determination was performed on RAMOS cells and on spleen RNA. FIG. 19 shows a comparison of the results obtained, expressed as the number of reads per gene-specific amplicon (red) compared to the number of reads for total RNA. As is evident from the figure, the representation of these RNAs when measured in individual cells differs from that observed in total RNA.
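The row/column decoding described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis actually used: the barcode sequences are hypothetical placeholders, reads are assumed to have already been reduced to (bc2, bc1) pairs, and the function names are invented for the example.

```python
from collections import Counter


def build_chamber_lookup(row_barcodes, column_barcodes):
    """Map each (bc2, bc1) pair to the (row, column) index of its chamber.

    bc2 is carried by the inner primers (one per row); bc1 is carried by the
    outer primers (one per harvested column pool).
    """
    return {
        (bc2, bc1): (row, col)
        for row, bc2 in enumerate(row_barcodes)
        for col, bc1 in enumerate(column_barcodes)
    }


def count_reads_per_chamber(reads, lookup):
    """Count reads per chamber; reads with an unknown barcode pair are skipped."""
    counts = Counter()
    for bc2, bc1 in reads:
        chamber = lookup.get((bc2, bc1))
        if chamber is not None:
            counts[chamber] += 1
    return counts


# Hypothetical barcodes, for illustration only.
rows = ["AAC", "CGT", "GTA"]
cols = ["TTG", "CCA"]
lookup = build_chamber_lookup(rows, cols)
reads = [("AAC", "TTG"), ("AAC", "TTG"), ("GTA", "CCA"), ("NNN", "TTG")]
counts = count_reads_per_chamber(reads, lookup)
print(counts)
```

Because each chamber is identified by a pair of barcodes, R row barcodes and C column barcodes suffice to label R × C chambers.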

Example 6

Size-based microfluidic single particle capture

One method of discretely capturing single cells from a suspension as it flows through a microfluidic device is to define a microfluidic geometry that directs a suspension of particles (such as cells or beads) through a capture site in such a way that the capture site captures a single particle, captures it efficiently (e.g., has a high probability of capturing a particle passing near the capture site), and/or directs the remaining suspension around the capture site. The geometry may be size-based, i.e., the capture site is only large enough to contain one particle (and not more), but still allows particle-free suspension to flow through the site with reasonably low fluidic impedance, so that an empty capture site will direct the flow of particles toward it rather than around it. This can be achieved by using a drainage channel. Additional geometries may also focus the flow of particles so as to increase the probability that a particle passes close enough to the capture site for a high probability of successful capture. Variations in these geometries focus on controlling the flow resistance of the fluid paths around the capture site and drainage channel, including the drainage channel itself, as well as on changing the aperture of the focusing geometry to position the flow of particles proximate to the capture site. FIGS. 20A-20B illustrate capture sites with capture features and drainage channels. FIG. 20A shows a site without obstructions to focus flow, while FIG. 20B shows a site with obstructions. Additional capture site designs are shown in FIG. 21.

Example 7

Capturing particles based on surface markers

Single cell studies in microfluidic structures require the isolation of individual cells into individual reaction compartments (chambers, droplets, particles). Limiting dilution is one way to achieve this separation. Cells are loaded at a concentration of less than one cell per compartment on average and are distributed into those compartments in a manner described by Poisson statistics. Another approach relies on mechanical traps to capture cells. These traps are designed to capture cells of a given size range (see example 6). This results in biased selection of cells in this size range from the population.

For some applications, an ideal capture method would utilize biomarkers expressed on the cell surface. Antibodies may be arranged at specific locations on the microfluidic array, although this approach may not be simple depending on the structure of the microfluidic array.

This example describes a method of capturing a single particle (e.g., a cell) based on a single affinity reagent-coated bead that is first captured at a particular location in a microfluidic device. The surface area of the bead exposed at the capture site opening provides a defined surface of accessible affinity reagent for cell binding. The bead size and capture site can be selected/designed such that, after a single cell is bound to the bead, the remaining accessible surface area of the bead is sterically blocked by that first-bound cell. Selecting an appropriately sized bead capture site also allows a wide range of cell sizes to be captured. It should be possible to capture a cell as long as the cell is larger than the exposed capture area and expresses the appropriate surface marker or binding partner of the affinity reagent.

The capture structure may be designed to maximize the probability of the cell contacting the surface marker. For example, obstructions on one or more channel walls may be used to direct the beads toward the capture feature. See fig. 22A for an exemplary capture feature/obstruction combination. The performance of the capture feature may be adjusted by adjusting one or more variables including the angle of the obstruction, the distance of the obstruction from the capture site, the length of the obstruction, the size and shape of the capture feature, and the size of the drainage channel (if present) in the capture feature. Referring to fig. 22B and 22C, variables and behavior of the capture feature/obstruction combination are illustrated. In fig. 22B, the obstacles on the channel wall serve to guide the beads towards the capture feature. In fig. 22C, the capture feature is paired with an obstruction on the channel wall; individual capture feature/obstruction combinations may be located on alternating walls to focus the flow toward adjacent capture feature/obstruction combinations. These combinations can be located at sites that can be separated (e.g., using valves) in use to form separate reaction compartments.

FIGS. 23A and 23B illustrate (in simplified form, without obstructions) a strategy for using a capture feature to capture a single affinity reagent-coated bead, which then displays the affinity reagent (e.g., an antibody) to capture a single particle (e.g., a cell). In FIG. 23A, panel 1, flow begins in a channel containing a capture feature. In FIG. 23A, panel 2, antibody-bound beads flow toward the capture feature until a bead settles in the capture feature, as shown in FIG. 23A, panel 3. The channel is then washed to remove the uncaptured beads. Subsequently, as shown in FIG. 23B, panel 1, cells bearing the cell surface marker to which the antibody binds flow into the channel containing the captured bead. FIG. 23B, panel 2, illustrates how a cell bearing the marker interacts with and binds to the antibody displayed by the captured bead. The size of the display area is such that a bound cell will inhibit other cells from interacting with the captured bead through steric occlusion, so that only one cell binds to each captured bead. As shown in FIG. 23B, panel 3, the channel is then washed to remove unbound cells, leaving one immobilized cell at each capture site.

Example 8

Microfluidic device for cell capture ("CCap")

FIG. 24A shows a schematic of a microfluidic device designed to capture single cells at discrete locations (niches). The flow is designed to be stronger over the niches than through the overflow channel. Each niche contains a small notch (~3 μm high). See FIG. 24B. When a cell enters a niche, it blocks the niche and prevents any further flow into it. The flow then passes on to the next unoccupied niche until that niche is also blocked by a cell. Ideally, every niche should capture one cell before cells pass through the overflow channel and exit to waste. Referring in more detail to FIGS. 24C-24F, the buffer inlet converges with the cell inlet, forcing the cells to the side of the feed channel closest to the series of lateral cell capture channels. See FIG. 24D. The resistance of the lateral cell capture channels is lower than that of the cell overflow channel, so that the cell flow is directed preferentially into the niches rather than into the cell overflow channel. See FIG. 24E. As shown in FIG. 24F, each niche is large enough to capture only one cell. The niche notch is small enough that the cell is retained at the operating pressure/flow level. If the latter is too high and/or the niche opening is too large, the cell can deform and be pushed through the niche opening. The presence of a cell in a niche raises the resistance of that particular circuit, so that flow is directed to the cell-free circuits. FIG. 24G shows the actual device, with captured human umbilical vein endothelial cells (HUVECs) located in the niches.
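The way an occupied niche redirects flow can be illustrated with a lumped-resistance analogy. The sketch below is a deliberate simplification and not a model of the actual device: it assumes the niches and the overflow channel behave as parallel branches under a common pressure drop, and all resistance values are hypothetical, in arbitrary units.

```python
def flow_fractions(resistances):
    """Fraction of total flow in each parallel branch (flow ~ 1 / resistance)."""
    conductances = [1.0 / r for r in resistances]
    total = sum(conductances)
    return [g / total for g in conductances]


# Hypothetical resistances in arbitrary units: two capture niches and one overflow channel.
EMPTY_NICHE, OCCUPIED_NICHE, OVERFLOW = 1.0, 50.0, 5.0

before = flow_fractions([EMPTY_NICHE, EMPTY_NICHE, OVERFLOW])
after = flow_fractions([OCCUPIED_NICHE, EMPTY_NICHE, OVERFLOW])
print("before capture:", [round(f, 3) for f in before])
print("after capture: ", [round(f, 3) for f in after])
```

In this toy model, most of the flow initially passes through the low-resistance niches, and once the first niche is occupied its share of the flow falls while the share through the remaining empty niche rises.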

Example 9

Amplicon tagging for bidirectional DNA sequencing on Illumina sequencers using the 48.48 ACCESS ARRAY™ IFC - Protocol 1

Introduction

The following protocol outlines a bidirectional sequencing strategy on Illumina Genome Analyzer II (GAII), HiSeq, and MiSeq sequencers for amplicon libraries that have been generated on the ACCESS ARRAY™ System. The purpose of this protocol is to sequence both ends of the PCR products with a single-read sequencing run. In the standard 4-primer amplicon tagging method (see example 6), a tagged target-specific (TS) primer pair is combined with a sample-specific primer pair comprising a barcode sequence (BC) and the adaptor sequences used by Illumina sequencers (PE1 and PE2; FIG. 25, panel A). Here, in the bidirectional sequencing amplicon tagging strategy, the tagged target-specific primer pair is instead combined with two sets of sample-specific primer pairs. The sample-specific primer pairs contain the consensus tag CS1 or CS2, appended to the Illumina adaptor sequences in both permutations (PE1 and PE2; FIG. 25, panel B). This method requires only one set of target-specific primer pairs, whereas the sample-specific barcode primers are universal and can be used in multiple experiments.

Bidirectional sequencing amplicon tagging generates two types of PCR products per target region: one PCR product allows sequencing of the 5' end of the target region (product A), and the other allows sequencing of the 3' end of the target region (product B). Because both PCR products are present in the flow cell simultaneously, one sequencing read yields sequence information for both ends of the target region. The main difference between this strategy and paired-end sequencing (example 6) is that the 5' and 3' reads are not derived from the same cluster, i.e., from the same template molecule. Instead, they are derived from an average over the template population.

Amplification of multiple target sequences can be performed prior to addition of the bidirectional barcodes. In brief, the protocol employs a two-step process: the ACCESS ARRAY™ IFC is run in the presence of only the multiplexed, tagged, target-specific primers. The pool of harvested PCR products is then used as template for a second PCR with the sample-specific barcode primers. The two sets of barcode primers are added to separate PCR reactions as described below.

The sample-specific barcode primer pairs are split into two separate PCR reactions (FIG. 26; see also Table 1).

TABLE 1. Barcode primers used in the split-primer PCR strategy.

Primer        Sequence
PE1-CS1       5’-AATGATACGGCGACCACCGAGATCTACACTGACGACATGGTTCTACA-3’ (SEQ ID NO:9)
PE2-BC-CS2    5’-CAAGCAGAAGACGGCATACGAGAT-[BC]-TACGGTAGCAGAGACTTGGTCT-3’ (SEQ ID NO:10)
PE1-CS2       5’-AATGATACGGCGACCACCGAGATCTTACGGTAGCAGAGACTTGGTCT-3’ (SEQ ID NO:11)
PE2-BC-CS1    5’-CAAGCAGAAGACGGCATACGAGAT-[BC]-ACACTGACGACATGGTTCTACA-3’ (SEQ ID NO:12)
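The modular structure of the primers in Table 1 (adaptor, optional barcode, consensus tag) can be made explicit with a short Python sketch. The sequences are those in Table 1, with PE1 taken as the prefix shared by PE1-CS1 and PE1-CS2; the barcode shown is a hypothetical placeholder for [BC], and the function name is illustrative.

```python
# Adaptor and tag sequences read from Table 1 (PE1 is the prefix shared by
# PE1-CS1 and PE1-CS2; PE2 is the portion preceding [BC]).
PE1 = "AATGATACGGCGACCACCGAGATCT"
PE2 = "CAAGCAGAAGACGGCATACGAGAT"
CS1 = "ACACTGACGACATGGTTCTACA"
CS2 = "TACGGTAGCAGAGACTTGGTCT"


def barcode_primer(adaptor, tag, barcode=""):
    """Concatenate adaptor, optional barcode, and tag in the 5'->3' direction."""
    return adaptor + barcode + tag


EXAMPLE_BC = "ACGTAC"  # hypothetical barcode, not from the source

# The two primer pairs that are run in separate PCR reactions.
first_pair = (barcode_primer(PE1, CS1), barcode_primer(PE2, CS2, EXAMPLE_BC))
second_pair = (barcode_primer(PE1, CS2), barcode_primer(PE2, CS1, EXAMPLE_BC))
print(first_pair)
print(second_pair)
```

The two pairs differ only in which consensus tag is joined to which adaptor, which is what allows the same tagged amplicon to be read from either end.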

After barcoding PCR, the PCR products of both the 5' reaction and the 3' reaction were pooled and used as template for cluster formation on the flow cell. Since both PCR product types are present and form clusters on the flow cell, an equimolar mixture of the CS1 and CS2 sequencing primers allows simultaneous sequencing of both PCR product types (FIG. 27). Similarly, an index read with an equimolar mixture of the CS1rc and CS2rc sequencing primers allows simultaneous sequencing of the barcodes of both PCR product types.

The IFC Controller for ACCESS ARRAY™ System User Guide (PN 68000157) can be consulted as a reference for this protocol. The Illumina website can be consulted for the latest protocol, reagent, and catalog number information.

Preparation and sequencing of amplicons

The following reagents were used in this protocol and stored at -20 ℃: FastStart High Fidelity PCR System, dNTPack (Roche, PN 04-738-292-001); 20X ACCESS ARRAY™ Loading Reagent (Fluidigm, PN 100-; target-specific primer pairs tagged with universal tags (CS1 forward tag, CS2 reverse tag), including 50 μM CS1-tagged TS forward primer and 50 μM CS2-tagged TS reverse primer; and the Bidirectional 384 Barcode Kit for Illumina GAII, HiSeq and MiSeq sequencers (Fluidigm, PN 100-. Additional reagents were stored at 4 ℃, including: Agilent DNA 1000 kit reagents (Agilent, PN 5067-; and 1X ACCESS ARRAY™ Harvest Solution (Fluidigm, PN 100-. Other reagents were stored at room temperature, including: PCR Certified Water (Teknova, PN W330); DNA suspension buffer (10 mM Tris HCl, 0.1 mM EDTA, pH 8.0) (Teknova, PN T0221); and Agilent DNA 1000 chips (included in the Agilent DNA 1000 kit) (Agilent).

The following equipment and consumables were used for this protocol: 1.5 mL or 2 mL microcentrifuge tubes; a microcentrifuge with a rotor for 2 mL tubes; a microcentrifuge with a rotor for 0.2 mL PCR tube strips; a centrifuge with plate carriers; an Agilent 2100 BioAnalyzer (Agilent); 96-well reaction plates; MicroAmp Clear Adhesive Film (Applied Biosystems, PN 4306311); IFC Controller AX (quantity 2, pre-PCR and post-PCR) (Fluidigm); FC1 Cycler (Fluidigm); 48.48 ACCESS ARRAY™ IFCs (Fluidigm); and Control Line Fluid Syringes (Fluidigm, PN 89000020).

Multiplex PCR on the ACCESS ARRAY™ IFC was performed according to the instructions detailed in Chapter 6, "Multiplex PCR on the 48.48 ACCESS ARRAY™ IFC," of the Fluidigm ACCESS ARRAY™ System for Illumina Platform User Guide.

Barcoding PCR was performed according to the instructions detailed in Chapter 6, "Attaching Sequence Tags and Sample Barcodes," of the Fluidigm ACCESS ARRAY™ System for Illumina Platform User Guide. A 100-fold dilution of the harvested PCR product pool was used as template for two, rather than one, barcoding PCR reactions: one reaction produces PCR product A, which allows sequencing of the 5' end of the target region, and the other produces PCR product B, which allows sequencing of the 3' end of the target region. The reaction setup is the same as in "Attaching Sequence Tags and Sample Barcodes" in the Fluidigm ACCESS ARRAY™ System for Illumina Platform User Guide, except that the amount of Sample Pre-Mix Master Mix is doubled to compensate for the increased number of wells. After the second PCR is completed, the pools of PCR product A and PCR product B are combined and then sequenced. Chapter 8 of the Fluidigm ACCESS ARRAY™ System for Illumina Platform User Guide describes methods for purifying and quantifying the product library after PCR.

The remainder of this example provides a sequencing workflow used in this protocol.

The following instructions for preparing reagents are intended for use with Illumina TruSeq sequencing reagents. Fluidigm reagents FL1 and FL2 comprise equimolar mixtures of the CS1 and CS2 sequencing and indexing primers, respectively. FL1 is a sequencing primer mixture comprising 50 μM each of the CS1 and CS2 primers. FL2 is an index primer mixture comprising 50 μM each of the CS1rc and CS2rc primers. The sequences of these primers are shown in Table 2.

TABLE 2. Primers and sequences

Primer    Sequence
CS1       5’-ACACTGACGACATGGTTCTACA-3’ (SEQ ID NO:13)
CS2       5’-TACGGTAGCAGAGACTTGGTCT-3’ (SEQ ID NO:14)
CS1rc     5’-TGTAGAACCATGTCGTCAGTGT-3’ (SEQ ID NO:15)
CS2rc     5’-AGACCAAGTCTCTGCTACCGTA-3’ (SEQ ID NO:16)
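The relationship between the sequencing primers and the index primers in Table 2 can be checked directly: CS1rc and CS2rc are the reverse complements of CS1 and CS2. A minimal Python sketch follows; the helper name is illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequences copied from Table 2.
CS1 = "ACACTGACGACATGGTTCTACA"
CS2 = "TACGGTAGCAGAGACTTGGTCT"
CS1RC = "TGTAGAACCATGTCGTCAGTGT"
CS2RC = "AGACCAAGTCTCTGCTACCGTA"

print(reverse_complement(CS1) == CS1RC)
print(reverse_complement(CS2) == CS2RC)
```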

The sequencing primer reagent HP6/FL1 was prepared by diluting Fluidigm reagent FL1 (which contains the custom sequencing primers) to a final concentration of 0.25 μM in TruSeq reagent HP6 in a DNase- and RNase-free 0.5 mL microcentrifuge tube, as shown in Table 3. Vortex the primers after dilution to ensure complete mixing.

TABLE 3. Instructions for preparing HP6/FL1 (per mL)

Reagent                 Volume
TruSeq reagent HP6      995 μL
FL1                     5 μL
Total                   1000 μL

The index primer reagent HP8/FL2 was prepared by diluting Fluidigm reagent FL2 (which contains the custom index primers) to a final concentration of 0.25 μM in TruSeq reagent HP8 in a DNase- and RNase-free 0.5 mL microcentrifuge tube, as shown in Table 4. Vortex the primers after dilution to ensure complete mixing.

TABLE 4. Instructions for preparing HP8/FL2 (per mL)

Reagent                 Volume
TruSeq reagent HP8      995 μL
FL2                     5 μL
Total                   1000 μL
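The volumes in Tables 3 and 4 follow from the usual dilution relation C1·V1 = C2·V2. The Python sketch below reproduces them from the stated stock concentration (50 μM of each primer) and final concentration (0.25 μM); the function name is illustrative.

```python
def stock_volume_ul(stock_um, final_um, total_ul):
    """Volume of stock needed so that total_ul of mixture reaches final_um (C1*V1 = C2*V2)."""
    if not 0 < final_um <= stock_um:
        raise ValueError("final concentration must be positive and no higher than the stock")
    return final_um * total_ul / stock_um


primer_ul = stock_volume_ul(stock_um=50.0, final_um=0.25, total_ul=1000.0)
diluent_ul = 1000.0 - primer_ul
print(f"FL1 or FL2: {primer_ul} uL, HP6 or HP8: {diluent_ul} uL")
```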

Cluster generation was performed according to the instructions in the Illumina cBot™ User Guide, Illumina Cluster Station User Guide, or Illumina MiSeq User Guide. To hybridize the sequencing primers, the sequencing primer reagent HP6/FL1 was used for the first read.

Sequencing reagents were prepared according to the manufacturer's instructions and loaded into the sequencer. For read 1, a multiplexed single read sequencing run was performed following the manufacturer's instructions.

For the index read, the index reagent HP8/FL2 was used instead of the HP8 reagent. The barcode sequences used in the Fluidigm bipartite primer library were designed so that they can be distinguished even in the presence of sequencing errors. As more samples are run in parallel, the length of the index read required to distinguish the barcode sequences increases. Recommended index read lengths are given in Table 5.

TABLE 5. Recommended index read lengths

Number of samples per lane    1-96       97-384     385-1920
Length of index read          6 bases    8 bases    10 bases

When preparing the sequencing run, adjust the length of the index read as directed in Table 5. Ensure that the volume of sequencing reagents loaded into the sequencer is sufficient for the indexing cycles. Implement these changes according to the manufacturer's recommendations.
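The recommendation in Table 5 amounts to a simple lookup, sketched below in Python. The thresholds are those in the table; the function name and the decision to raise an error outside the tabulated range are choices made for this illustration.

```python
def index_read_length(samples_per_lane):
    """Return the index read length (in bases) recommended in Table 5."""
    if samples_per_lane < 1:
        raise ValueError("at least one sample per lane is required")
    if samples_per_lane <= 96:
        return 6
    if samples_per_lane <= 384:
        return 8
    if samples_per_lane <= 1920:
        return 10
    raise ValueError("Table 5 gives no recommendation above 1920 samples per lane")


for n in (48, 96, 97, 384, 385, 1920):
    print(n, index_read_length(n))
```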

Example 10

Detailed procedure for tagging target nucleic acids for bidirectional Illumina sequencing using a microfluidic device that allows recovery of amplification products

394 primer pairs were designed to PCR-amplify exons from the genes BRCA1, BRCA2, PTEN, PI3KCA, APC, EGFR, and TP53 (see Table 6 below). A Tag8 sequence was appended to each forward primer and a Tag5 sequence to each reverse primer. The 394 primer pairs were arranged in 48 groups, each containing an average of about 8 primer pairs, with each primer at a concentration of 1 μM in 0.05% Tween-20. Sample mixtures were prepared from 48 cell line genomic DNA samples (see Table 7 below) by adding 1 μl of sample (50 ng/μl) to 3 μl of a sample premix containing 1 U of Roche FastStart HiFi polymerase, 1X buffer, 100 μM dNTPs, 4.5 mM MgCl2, 5% DMSO, and 1X ACCESS ARRAY™ Loading Reagent.

The ACCESS ARRAY™ IFC was run according to the instructions in the ACCESS ARRAY™ User Guide. Sample mixtures were loaded into the sample ports of the 48.48 ACCESS ARRAY™ IFC, and each group of primers was loaded into an inlet of the 48.48 ACCESS ARRAY™ IFC. PCR was performed on a Fluidigm stand-alone thermal cycler using the standard PCR protocol supplied with the thermal cycler. After PCR, the products were harvested from the ACCESS ARRAY™ IFC using a separate controller. One microliter of each product was then transferred to a PCR plate and diluted 100X with PCR-grade water. A 4 μl PCR master mix was then prepared (1 U of Roche FastStart HiFi polymerase, 1X buffer, 100 μM dNTPs, 4.5 mM MgCl2, 5% DMSO, and barcode primers as described in Table 8 below). Plate 1 contained primer pairs of the form PE2-CS1/PE1-BC-CS2 with barcodes FL001-FL0048, each primer at a concentration of 400 nM. Plate 2 contained primer pairs of the form PE2-CS2/PE1-BC-CS1 with barcodes FL001-FL0048, each primer at a concentration of 400 nM. Plate 3 contained both primer pairs, of the form PE2-CS1/PE2-CS2/PE1-BC-CS1/PE1-BC-CS2, with barcodes FL0049-FL0096. All three plates were subjected to 15 PCR cycles using the following thermal cycling protocol: 95 ℃ for 10 min; 15 × (95 ℃ for 15 s, 60 ℃ for 30 s, 72 ℃ for 90 s); 72 ℃ for 3 min.

Each reaction product from each plate was analyzed on an Agilent 1000Bioanalyzer chip and the concentration of the PCR product pool was measured based on the electropherograms from the analysis (fig. 28). PCR products from each plate were pooled to equal concentrations using volumes adjusted according to the concentrations obtained from the Agilent Bioanalyzer.
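Pooling to equal concentrations means taking less volume from the more concentrated products. The following Python sketch shows one way to compute such volumes; it is not the calculation used in the example, the concentrations are hypothetical, and the convention of pipetting the most dilute sample at a fixed maximum volume is an assumption made for illustration.

```python
def equal_amount_volumes(concentrations, max_volume_ul):
    """Volumes that contribute the same amount of product from every sample.

    The most dilute sample is pipetted at max_volume_ul; more concentrated
    samples are scaled down in proportion to their concentration.
    """
    if not concentrations or min(concentrations) <= 0:
        raise ValueError("concentrations must be positive")
    lowest = min(concentrations)
    return [max_volume_ul * lowest / c for c in concentrations]


# Hypothetical Bioanalyzer concentrations (ng/uL) for three reaction products.
volumes = equal_amount_volumes([10.0, 20.0, 40.0], max_volume_ul=4.0)
print(volumes)
```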

Pooled samples were cleaned using AMPure beads (Beckman Coulter) at a bead to sample ratio of 1: 1.

The amplicon pool was sequenced on two separate lanes of a Genome Analyzer II (Illumina). The first lane used the CS1 and CS2 primers for the first read and the CS1rc and CS2rc primers for the index read. Since the annealing temperatures of CS1 and CS2 are expected to be 10 ℃ lower than those of the standard Illumina read 1 and index sequencing primers, LNA (locked nucleic acid) versions of CS1, CS2, CS1rc and CS2rc were used to optimize hybridization to clusters under the standard conditions described in the Illumina Cluster Station and Genome Analyzer manuals.

The second lane used, as sequencing primers, a pool of the target-specific forward and reverse primers that had been used during amplification on the ACCESS ARRAY™ IFC (FIG. 29). The CS1/CS2rc index primers were used for the index read. Because of their greater length, the target-specific primers have a higher annealing temperature than CS1 or CS2. This method avoids reading through the uninformative target-specific primer portion of the PCR product. Instead, the sequence information with the lowest error rate is obtained from the informative region of the PCR product, with the least amount of overlap between the 5' and 3' reads in this region. The method also allows greater overlap where the sequencing error rate is highest (i.e., in the middle of the PCR product), and an increase in PCR product size of 30-40 bp.

Sequence data was demultiplexed using Illumina software and aligned to the human genome reference sequence build hg19 using the aligner ELAND (Illumina). The base-by-base coverage of the gene EGFR for an exemplary sample is shown in FIG. 30.

TABLE 6 primers for amplification of exons from genes BRCA1, BRCA2, PTEN, PI3KCA, APC, EGFR, TP53

TABLE 7 cell line genomic DNA samples

TABLE 8 Bar code primers

Example 11

Amplicon tagging for bidirectional DNA sequencing on Illumina sequencers using the 48.48 ACCESS ARRAY™ IFC - Protocol 2

This example provides a modification of the protocol in example 9. The introduction to example 9 also applies to this example.

Preparation of amplicons

The following documents may be consulted as references for this protocol: IFC Controller for ACCESS ARRAY™ System User Guide (PN 68000157); Control Line Fluid Loading Procedure Quick Reference (PN 68000132); and Agilent DNA 1000 Kit Guide.

The following reagents were used in this protocol and stored at -20 ℃: FastStart High Fidelity PCR System, dNTPack (Roche, PN 04-738-292-001); 20X ACCESS ARRAY™ Loading Reagent (Fluidigm, PN 100-; 1X ACCESS ARRAY™ Harvest Solution (Fluidigm, PN 100-; ACCESS ARRAY™ Barcode Library for Illumina Sequencers-384 (Bidirectional) (Fluidigm, PN 100-; target-specific primer pairs tagged with universal tags (CS1 forward tag, CS2 reverse tag), including 50 μM CS1-tagged TS forward primer and 50 μM CS2-tagged TS reverse primer; and 50 ng/μL of template DNA. (1X ACCESS ARRAY™ Harvest Solution (Fluidigm, PN 100-. It can be purchased under the name ACCESS ARRAY™ Harvest Pack, PN 100-, or as a component of the ACCESS ARRAY™ Loading Reagent Kit, PN 100-1032.) Agilent DNA 1000 kit reagents (Agilent, PN 5067-. In addition, PCR Certified Water (Teknova, PN W330) was used; it was stored at room temperature.

Multiplex PCR on the ACCESS ARRAY™ IFC was performed as detailed in Chapter 6, "Multiplex Amplicon Tagging on the 48.48 ACCESS ARRAY™ IFC," of the ACCESS ARRAY™ System for Illumina Platform User Guide. Optionally, to achieve bidirectional amplicon tagging without multiplexing, 2-primer target-specific PCR was performed on the 48.48 ACCESS ARRAY™ IFC according to the instructions detailed in Appendix C of the ACCESS ARRAY™ System for Illumina Platform User Guide. The harvested PCR products were then barcoded as described below.

The PCR products were barcoded for bidirectional amplicon tagging in two 96-well plates according to the instructions detailed in Chapter 6, "Attaching Sequence Tags and Sample Barcodes," of the Fluidigm ACCESS ARRAY™ System for Illumina Platform User Guide. A 100-fold dilution of the harvested PCR product pool was used as template for two (rather than one) barcoding PCR reactions: one reaction, in one 96-well plate, produces PCR product A, which allows sequencing of the 5' end of the target region, and the other reaction, in the second 96-well plate, produces PCR product B, which allows sequencing of the 3' end of the target region. The reaction setup is the same as in "Attaching Sequence Tags and Sample Barcodes" in the Fluidigm ACCESS ARRAY™ System for Illumina Platform User Guide. However, the amount of sample premix solution is doubled to compensate for the increased number of reactions, and the ACCESS ARRAY™ Barcode Library for Illumina Sequencers-384 (Bidirectional) (Fluidigm, PN 100-.

TABLE 9. Sample mix solution - PCR product A

Component                                                                            Volume (μL)
Sample premix                                                                        15.0
ACCESS ARRAY™ Barcode Library for Illumina Sequencers-384 (Bidirectional) A          4.0
Diluted harvested PCR product pool                                                   1.0
Total                                                                                20.0

TABLE 10. Sample mix solution - PCR product B

Component                                                                            Volume (μL)
Sample premix                                                                        15.0
ACCESS ARRAY™ Barcode Library for Illumina Sequencers-384 (Bidirectional) B          4.0
Diluted harvested PCR product pool                                                   1.0
Total                                                                                20.0

After the second PCR is completed, the pools of PCR product A and PCR product B are combined and subsequently sequenced. Chapter 8 of the Fluidigm ACCESS ARRAY™ System for Illumina Platform User Guide describes methods for purifying and quantifying the product library after PCR. Use of the ACCESS ARRAY™ Barcode Library for Illumina Sequencers-384 (Bidirectional) (Fluidigm, PN 100- is critical for generating bidirectional amplicons for sequencing.

Sequencing workflow Using Fluidigm FL1 and FL2 sequencing primers

The following instructions are intended for use with Illumina TruSeq sequencing reagents on Illumina GAII and HiSeq systems. Fluidigm sequencing reagents FL1 and FL2 comprise equimolar mixtures of the CS1 and CS2 sequencing and indexing primers, respectively. FL1 is a custom sequencing primer mixture comprising 50 μM each of the CS1 and CS2 primers. FL2 is a custom index primer mixture comprising 50 μM each of the CS1rc and CS2rc primers. For single-read sequencing, reagents were prepared for the read 1 and index primers. For paired-end sequencing, reagents were prepared for the read 1, index, and read 2 primers.

The results of PCR experiments testing for mutual interference between the Fluidigm sequencing primers and the TruSeq sequencing primers are shown in FIGS. 32 and 33.

The following documents can be consulted for reference when sequencing: Illumina cBot™ User Guide; Illumina Genome Analyzer II™ User Guide; and Illumina HiSeq™ User Guide. The Illumina Genome Analyzer II User Guide or Illumina HiSeq User Guide should be consulted for a description of how to perform the sequencing run. Illumina technical support can also be contacted.

Preparation of reagents for sequencing on the Illumina GAII and HiSeq sequencing systems

Read 1 sequencing primer HT1/FL1 was prepared by diluting the FL1 stock solution to a final concentration of 500 nM with hybridization buffer (HT1) in a DNase-free, RNase-free 1.5 mL microcentrifuge tube (Table 11). The tube was vortexed for at least 20 seconds and centrifuged for 30 seconds to spin down all components. Table 11 outlines the preparation of the HT1/FL1 sequencing primer mix (per mL) for read 1. Approximately 300 μL per lane was used with a cBot Custom Primers Reagent Stage. The orientation of the custom primers in the tube strip was matched to the lanes of the GAII or HiSeq flow cell.
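The dilution described above follows the usual C1·V1 = C2·V2 relationship. The following is a minimal sketch of that arithmetic, assuming that the 500 nM target applies to each primer in the mix and that the stock holds each primer at 50 μM as stated for FL1 and FL2; the function name `dilution_volumes` is hypothetical and not part of the described protocol.

```python
def dilution_volumes(stock_nM, final_nM, total_uL):
    """Return (stock volume, diluent volume) in µL using C1*V1 = C2*V2."""
    if not 0 < final_nM <= stock_nM:
        raise ValueError("final concentration must be positive and not exceed the stock")
    stock_uL = total_uL * final_nM / stock_nM
    return stock_uL, total_uL - stock_uL


# Assumption: 50 µM stock (50,000 nM) diluted to 500 nM in 1 mL (1,000 µL) of mix.
fl1_uL, ht1_uL = dilution_volumes(stock_nM=50_000, final_nM=500, total_uL=1_000)
print(f"FL1 stock: {fl1_uL} µL, HT1 buffer: {ht1_uL} µL")
```

Under these assumptions the mix is a 1:100 dilution of the stock in HT1. The same calculation can be repeated with a different `total_uL` for the larger index and read 2 primer volumes.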

TABLE 11 Instructions for preparing HT1/FL1 (per mL)

The index primer HT1/FL2 was prepared by diluting the FL2 stock solution to a final concentration of 500 nM with hybridization buffer (HT1) in a DNase-free, RNase-free 1.5 mL microcentrifuge tube (Table 12). The tube was vortexed for at least 20 seconds and centrifuged for 30 seconds to spin down all components. Table 12 outlines the preparation of the HT1/FL2 index primer mix for the index read. Approximately 3 mL of index sequencing primer mix (HP8) was used for the index reads; 1.5 mL of the TruSeq reagent HP8 was replaced with 1.5 mL of HT1/FL2.

TABLE 12 Instructions for preparing HT1/FL2

Read 2 sequencing primer HT1/FL1 (for paired-end sequencing) was prepared by diluting the FL1 stock solution to a final concentration of 500 nM with hybridization buffer (HT1) in a DNase-free, RNase-free 1.5 mL microcentrifuge tube (Table 13). The tube was vortexed for at least 20 seconds and centrifuged for 30 seconds to spin down all components. Table 13 outlines the preparation of the HT1/FL1 sequencing primer mix for read 2. Approximately 3.2 mL of read 2 sequencing primer (HP7) was used for read 2; 1.6 mL of the TruSeq reagent HP7 was replaced with 1.6 mL of HT1/FL1.

TABLE 13 Instructions for preparing read 2 sequencing primer HT1/FL1

Performing the sequencing run

The Illumina Genome Analyzer II or HiSeq User Guide provides instructions on how to perform sequencing runs. Alternatively, Illumina technical support may be contacted.

For index reads, 1.5 mL of TruSeq reagent HP8 was replaced with 1.5 mL of index primer HT1/FL2 for both GAII and HiSeq sequencing runs. The barcode sequences used in the ACCESS ARRAY™ Barcode Library for Illumina Sequencers are designed so that they can be distinguished even in the presence of sequencing errors. As more samples are run in parallel, the index read length required to distinguish the barcode sequences increases. Recommended index read lengths are given in Table 14.
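The text does not state how the barcodes were made robust to sequencing errors. One common way to reason about this is Hamming distance: if every pair of barcodes differs at d or more positions, up to ⌊(d−1)/2⌋ substitution errors in an index read can still be assigned to a unique barcode. The sketch below illustrates that idea only. The barcode sequences are hypothetical and are not taken from the Fluidigm library, and the function names are likewise illustrative.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_barcode(read, barcodes, max_mismatches):
    """Return the single barcode within max_mismatches of the read, else None."""
    hits = [b for b in barcodes if hamming(read, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 6-nt barcodes, chosen for illustration only.
BARCODES = ["ACGTAC", "TGCATG", "GATCCA", "CTAGGT"]

d = min_pairwise_distance(BARCODES)
correctable = (d - 1) // 2  # substitutions that can always be resolved
print(d, correctable)
print(assign_barcode("ACGTAA", BARCODES, correctable))  # read with one error
```

A larger barcode set generally needs longer sequences to keep the same minimum distance, which is consistent with the longer index reads recommended when more samples are run in parallel.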

TABLE 14 Index read recommendations

When preparing the sequencing run, adjust the length of the index reads as directed in Table 14. Ensure that the volume of sequencing reagents loaded onto the sequencer is sufficient for the indexing cycles. For a detailed description of how to implement these changes, consult the Illumina sequencer User Guide or contact Illumina technical support.
