Molecular barcoding

Document number: 1618167. Publication date: 2020-01-10.

Abstract: Molecular barcoding, by R·雷伯弗斯基 and J·阿格雷丝迪, created 2018-05-21. Methods and compositions for making and using uniquely tagged target nucleic acid molecules are provided.

1. A method of producing a reaction mixture comprising a target nucleic acid molecule with a unique tag, the method comprising:

a) covalently linking a plurality of variable length barcode tags consisting of 0-10 nucleotides of one or more nucleic acid sequences to a first end of a plurality of target nucleic acid molecules, such that individual target nucleic acid molecules in the plurality comprise a single variable length barcode tag and the plurality comprises at least 5 different variable length barcode tag lengths and/or sequences; and

b) contacting the target nucleic acid molecule with a plurality of transposases such that transposase fragmentation sites and covalently linked transposome ends are introduced to the second ends of individual target nucleic acid molecules in the plurality, thereby producing a plurality of uniquely tagged target nucleic acid molecules, wherein individual uniquely tagged target nucleic acid molecules in the plurality comprise:

i) a variable length barcode tag located at the first end; and

ii) a transposase fragmentation site and a transposome end at the second end,

wherein the combination of i) and ii) in the individual uniquely tagged target nucleic acid molecules in the plurality together comprise a unique molecular barcode that is unique relative to all other individual uniquely tagged target nucleic acid molecules in the plurality having the same sequence of at least 25 contiguous nucleotides between the variable length barcode tag and the transposase fragmentation site in the reaction mixture.

2. The method of claim 1, wherein covalently attaching a plurality of variable length barcode tags comprises hybridizing a plurality of primers comprising the variable length barcode tags to a plurality of nucleic acid molecules comprising at least a portion of the target nucleic acid molecule sequence and extending the primers with a polymerase, thereby generating a plurality of double-stranded variable length barcode tag target nucleic acid molecules.

3. The method of claim 2, wherein the plurality of nucleic acid molecules comprise mRNA, the polymerase is an RNA-dependent DNA polymerase, and the plurality of primers comprising the variable length barcode tag comprise 3' oligo-dT ends.

4. The method of claim 2, wherein the first end of the uniquely tagged target nucleic acid molecule comprises a poly-A region and/or a poly-T region, and wherein the variable length barcode tag is 3' of the poly-A region and/or 5' of the poly-T region.

5. The method of claim 1, wherein covalently attaching a plurality of variable length barcode tags comprises attaching variable length barcode tags to target nucleic acid molecules in the plurality.

6. The method of claim 1, wherein after a) and before b), the method comprises forming a double stranded target nucleic acid molecule with a variable length barcode tag comprising a first DNA strand hybridized to a reverse complementary second DNA strand.

7. The method of claim 6, wherein the variable length barcode-tagged double stranded target nucleic acid molecule comprises a double stranded target cDNA molecule or a variable length barcode-tagged target genomic DNA molecule.

8. The method of claim 7, wherein the method comprises forming a double stranded target cDNA molecule by:

i) hybridizing a plurality of individual primers to a plurality of mRNA molecules, wherein the individual primers comprise variable length barcode tags, and extending the primers with an RNA-dependent DNA polymerase, thereby generating a plurality of double-stranded mRNA:cDNA hybrids;

ii) contacting the mRNA:cDNA hybrid with an enzyme comprising RNase H activity, thereby producing an mRNA fragment that hybridizes to the first strand cDNA molecule; and

iii) contacting the mRNA fragment with a DNA-dependent DNA polymerase, thereby extending the mRNA fragment in a template-directed polymerase reaction, wherein the template is a first strand cDNA polynucleotide, thereby forming a double-stranded cDNA molecule.

9. The method of claim 7, wherein the method comprises generating a target genomic DNA molecule with a variable length barcode tag by: hybridizing a plurality of first primers comprising a variable length barcode tag and a genomic DNA targeting region to a plurality of genomic DNA molecules comprising at least a portion of the target nucleic acid molecule sequence, and extending the primers with a DNA-dependent DNA polymerase, thereby generating the variable length barcode tag-bearing target genomic DNA molecule.

10. The method of claim 1, wherein the reaction mixture comprises 1-10,000 target nucleic acid molecules of different sequences.

11. The method of claim 1, wherein the reaction mixture comprises a plurality of fluidic partitions comprising target nucleic acid molecules from a single cell.

12. The method of claim 1, wherein a) is performed in a reaction mixture, wherein the target nucleic acid molecule is from a single cell, or wherein a) and b) are performed in a reaction mixture, wherein the target nucleic acid molecule is from a single cell.

13. The method of claim 1, wherein the variable length barcode tags consist of 0-10 nucleotides of a single nucleic acid sequence, wherein at least a portion of the variable length barcode tags comprise at least 1 nucleotide.

14. The method of claim 1, wherein the transposome end comprises, from 5' to 3', GTCTCGTGGGCTCGG (SEQ ID NO: 2); TCGTCGGCAGCGTC (SEQ ID NO: 3); AGATGTGTATAAGAGACAG (SEQ ID NO: 4); TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG (SEQ ID NO: 5); or GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG (SEQ ID NO: 6).

15. The method of claim 1, wherein the transposome end comprises two complementary nucleotide sequences comprising, from 5' to 3', AGATGTGTATAAGAGACAG (SEQ ID NO: 4), and from 5' to 3', CTGTCTTATACACATCT (SEQ ID NO: 7); TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG (SEQ ID NO: 5), and from 5' to 3', CTGTCTTATACACATCT (SEQ ID NO: 7); or GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG (SEQ ID NO: 6), and from 5' to 3', CTGTCTTATACACATCT (SEQ ID NO: 7).

16. A method of estimating the number of target nucleic acid molecules in a reaction mixture, the method comprising:

A) providing a reaction mixture, wherein the reaction mixture comprises a plurality of uniquely tagged target nucleic acid molecules comprising:

i) a variable length barcode tag located at the first end; and

ii) a transposase fragmentation site and a transposome end at the second end; and

wherein i) and ii) together comprise a unique molecular barcode that is unique relative to all other uniquely tagged target nucleic acid molecules having the same sequence between the variable length barcode tag and the transposase fragmentation site in the reaction mixture;

B) obtaining a plurality of sequence reads, wherein the sequence reads comprise one or more of: a sequence of a variable length barcode tag, a sequence of a portion of the target nucleic acid between the variable length barcode tag and the transposase fragmentation site, and a sequence of the fragmentation site; and

C) counting the number of target nucleic acid molecules having the same sequence between the variable length barcode tag and the transposase fragmentation site, but having different variable length barcode tags and/or transposase fragmentation sites, thereby estimating the number of target nucleic acid molecules in the reaction mixture.

17. The method of claim 16, wherein the providing is performed according to the method of claim 1.

18. The method of claim 16, wherein after A) and before B), the method further comprises amplifying the target nucleic acid molecule having a variable length barcode tag at a first end and a transposase fragmentation site and a transposome end at a second end.

19. A reaction mixture comprising a plurality of uniquely tagged target nucleic acid molecules, wherein the plurality of uniquely tagged target nucleic acid molecules comprises:

i) a variable length barcode tag located at the first end; and

ii) a transposase fragmentation site and a transposome end at the second end; and

wherein i) and ii) together comprise a unique molecular barcode that is unique relative to all other uniquely tagged target nucleic acid molecules having the same sequence between the variable length barcode tag and the transposase fragmentation site in the reaction mixture.

20. The reaction mixture of claim 19, wherein the reaction mixture comprises a plurality of fluidic partitions comprising target nucleic acid molecules from a single cell or target nucleic acid molecules from a plurality of cells.

Background

Next generation sequencing technologies can provide a large amount of sequence information from relatively small samples, such as nucleic acid (e.g., mRNA) samples from single cells. However, it can be difficult to extract quantitative information about the absolute or relative abundance of nucleic acids in a sample. In some cases, the absolute or relative abundance of target nucleic acids in a sample can be estimated by attaching Unique Molecular Identifiers (UMIs), such as unique oligonucleotide barcode sequences, to the target nucleic acids and detecting those UMIs during sequencing.

Summary of the Invention

In one aspect, the invention provides a method for producing a reaction mixture comprising a uniquely tagged target nucleic acid molecule, the method comprising: (a) covalently attaching a plurality of variable length barcode tags to a first end of a plurality of target nucleic acid molecules (the variable length barcode tags consisting of 0-10 nucleotides of one or more nucleic acid sequences), such that individual target nucleic acid molecules in the plurality comprise a single variable length barcode tag and the plurality comprises at least 5 different variable length barcode tag lengths and/or sequences; and (b) contacting the target nucleic acid molecules with a plurality of transposases such that a transposase fragmentation site and a covalently linked transposome end are introduced at a second end of individual target nucleic acid molecules in the plurality, thereby producing a plurality of uniquely tagged target nucleic acid molecules, wherein an individual uniquely tagged target nucleic acid molecule in the plurality comprises: (i) a variable length barcode tag located at the first end; and (ii) a transposase fragmentation site and a transposome end at the second end, wherein the combination of (i) and (ii) in individual uniquely tagged target nucleic acid molecules in the plurality together comprises a unique molecular barcode that is unique relative to all other individual uniquely tagged target nucleic acid molecules in the plurality in the reaction mixture that have the same sequence of at least 25 contiguous nucleotides between the variable length barcode tag and the transposase fragmentation site. In some embodiments, the reaction mixture comprises at least 1,000 target nucleic acid molecules having different sequences.
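To illustrate why combining the two ends matters, the following minimal sketch estimates the chance that every copy of one target sequence receives a distinct (tag, fragmentation site) pair. It assumes, purely for illustration, that all pairs are equally likely. The function name and the numbers (11 tags, 300 candidate fragmentation positions, 5 molecules) are hypothetical and are not taken from the disclosure.

```python
def prob_all_distinct(n_molecules: int, n_combinations: int) -> float:
    """Probability that n_molecules independent, uniform draws from
    n_combinations possible (tag, site) pairs are all different
    (the classic birthday-problem product)."""
    if n_molecules > n_combinations:
        return 0.0  # more molecules than pairs: a collision is certain
    p = 1.0
    for i in range(n_molecules):
        p *= 1.0 - i / n_combinations
    return p


# Hypothetical illustration values (assumptions, not from the source).
n_tags = 11        # e.g. tag lengths 0..10 taken from one sequence
n_sites = 300      # candidate fragmentation positions on the target
n_molecules = 5    # copies of the same target sequence

print("tags only:      ", prob_all_distinct(n_molecules, n_tags))
print("tags and sites: ", prob_all_distinct(n_molecules, n_tags * n_sites))
```

Under these assumptions the combined label space is the product of the two individual spaces, so the chance that all copies are distinguishable is higher than with the tag alone.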

In some embodiments, covalently linking the plurality of variable length barcode tags comprises hybridizing a plurality of primers comprising the variable length barcode tags to a plurality of nucleic acid molecules comprising at least a portion of the target nucleic acid molecule sequence and extending the primers with a polymerase, thereby generating a plurality of double stranded variable length barcoded target nucleic acid molecules. In some embodiments, covalently attaching a plurality of variable length barcode tags comprises attaching the variable length barcode tags to the target nucleic acid molecules in the plurality.

In some embodiments, the plurality of nucleic acid molecules comprises mRNA and the polymerase is an RNA-dependent DNA polymerase. In another embodiment, the plurality of nucleic acid molecules comprises mRNA and the plurality of primers comprising variable length barcode tags comprise a 3' oligo-dT terminus.

In some embodiments, the first end of the target nucleic acid molecule comprises a poly-A region. In another embodiment, the first end of the uniquely tagged target nucleic acid molecule comprises a poly-A region and/or a poly-T region. In some embodiments, the variable length barcode tag is 3' of the poly-A region and/or 5' of the poly-T region.
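The orientation described above can be sketched in Python. The example builds a primer with the tag 5' of an oligo-dT stretch and then takes the reverse complement to show where the tag falls relative to the poly-A region on the opposite strand. The tag `ACG`, the 10-nt oligo-dT length, and the helper names are hypothetical choices for illustration only.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return seq.translate(COMPLEMENT)[::-1]


def build_primer(tag: str, dt_length: int = 20) -> str:
    """Build 5'-[tag]-[oligo-dT]-3'. The tag may be empty (length 0)."""
    return tag + "T" * dt_length


primer = build_primer("ACG", dt_length=10)
opposite_strand = reverse_complement(primer)

print(primer)           # tag is 5' of the poly-T stretch
print(opposite_strand)  # tag complement is 3' of the poly-A stretch
```

On the primer strand the tag sits 5' of the poly-T region, and on the complementary strand the same position reads as the sequence immediately 3' of the poly-A region.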

In some embodiments, the method comprises, after step (a) and before step (b), forming a double stranded target nucleic acid molecule with a variable length barcode tag comprising a first DNA strand hybridized to a reverse complementary second DNA strand. In another embodiment, the double stranded target nucleic acid molecule with a variable length barcode tag comprises a double stranded target cDNA molecule.

In some embodiments, a variable length barcode-tagged double stranded target nucleic acid molecule comprises a variable length barcode-tagged target genomic DNA molecule. In some embodiments, the method comprises generating a target genomic DNA molecule with a variable length barcode tag by: hybridizing a plurality of first primers comprising a variable length barcode tag and a genomic DNA targeting region to a plurality of genomic DNA molecules comprising at least a portion of the target nucleic acid molecule sequence, and extending the primers with a DNA-dependent DNA polymerase, thereby generating the variable length barcode tag-bearing target genomic DNA molecule. In some embodiments, the method comprises amplifying the target genomic DNA molecule with a variable length barcode tag.

In one aspect, the present invention provides a method of forming a double stranded target cDNA molecule by: (i) hybridizing a plurality of individual primers to a plurality of mRNA molecules, wherein the individual primers comprise variable length barcode tags, and extending the primers with an RNA-dependent DNA polymerase, thereby generating a plurality of double-stranded mRNA:cDNA hybrids; (ii) contacting the mRNA:cDNA hybrid with an enzyme comprising RNase H activity, thereby producing an mRNA fragment that hybridizes to the first strand cDNA molecule; and (iii) contacting the mRNA fragment with a DNA-dependent DNA polymerase, thereby extending the mRNA fragment in a template-directed polymerase reaction, wherein the template is a first strand cDNA polynucleotide, thereby forming a double stranded target cDNA molecule. In some embodiments, the method comprises contacting the double stranded target cDNA molecule with a ligase.

In some embodiments, the RNA-dependent DNA polymerase comprises RNase H activity. In some embodiments, the method comprises contacting the mRNA:cDNA hybrid with an enzyme having RNase H activity and incubating the mRNA:cDNA hybrid in the presence of an RNA-dependent DNA polymerase, thereby producing an mRNA fragment that hybridizes to the first strand cDNA molecule. In some embodiments, contacting the mRNA:cDNA hybrid with an enzyme comprising RNase H activity comprises contacting the mRNA:cDNA hybrid with an enzyme that is structurally different from an RNA-dependent DNA polymerase.

In some embodiments of the methods of generating a reaction mixture comprising uniquely tagged nucleic acid molecules, (i) and (ii) together comprise a unique molecular identifier for an individual target nucleic acid molecule sequence, and the plurality of uniquely tagged individual target nucleic acid molecules do not comprise any other unique molecular identifiers. In some embodiments, the plurality of uniquely tagged individual target nucleic acid molecules comprises a cell barcode. In some embodiments, the plurality of uniquely tagged individual target nucleic acid molecules are cDNAs and the cell barcode is 3' of the poly-A region and/or 5' of the poly-T region.

In some embodiments, step (a) is performed in a reaction mixture, wherein the target nucleic acid molecule is from a single cell. In some embodiments, steps (a) and (b) are performed in a reaction mixture, wherein the target nucleic acid molecule is from a single cell. In some embodiments, step (b) is performed in a reaction mixture, wherein the target nucleic acid molecule is from at least 10 cells. In another embodiment, step (b) is performed in a reaction mixture wherein the target nucleic acid molecule is from about 50 to about 500 cells. In one embodiment, step (b) is performed in a reaction mixture wherein the target nucleic acid molecule is from about 10 to about 5000 cells. In another embodiment, step (b) is performed in a reaction mixture wherein the target nucleic acid molecule is from about 10 to about 10000 cells.

In some embodiments, the variable length barcode tags consist of 0-10 nucleotides of a single nucleic acid sequence, wherein at least a portion of the variable length barcode tags comprise at least 1 nucleotide.

In another embodiment, the variable length barcode tags consist of 0-5 nucleotides, wherein at least a portion of the variable length barcode tags comprise at least 1 nucleotide.
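The size of the tag pool implied by these length ranges can be worked out with a short sketch. Two readings are shown, both labeled as assumptions: an upper bound in which every A/C/G/T sequence of each length is allowed (counting the zero-length tag as one tag), and a pool derived from a single sequence by truncation, which yields one tag per length. The function names, the choice to truncate from the 5' end, and the base sequence `ACGTACGTAC` are hypothetical.

```python
def max_tag_count(max_length: int, alphabet_size: int = 4) -> int:
    """Upper bound on distinct tags of length 0..max_length when every
    sequence over the alphabet is permitted (assumption)."""
    return sum(alphabet_size ** k for k in range(max_length + 1))


def truncation_tags(base_sequence: str, max_length: int) -> list:
    """Tags of length 0..max_length taken from the 5' end of one base
    sequence (assumed construction; the source does not specify it)."""
    return [base_sequence[:k] for k in range(max_length + 1)]


print(max_tag_count(10))  # 1398101
print(max_tag_count(5))   # 1365
print(truncation_tags("ACGTACGTAC", 10))
```

A single 10-nt sequence therefore gives 11 tags that differ only in length, while allowing arbitrary sequences raises the ceiling to 1,398,101 tags for the 0-10 range and 1,365 for the 0-5 range.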

In some embodiments, step (a) is performed, followed by step (b) on the plurality of double-stranded variable length barcode-tagged target nucleic acid molecules produced in step (a), with or without an intermediate amplification step.

In some embodiments, the transposome end comprises: from 5' to 3', GTCTCGTGGGCTCGG (SEQ ID NO: 2) or from 5' to 3', TCGTCGGCAGCGTC (SEQ ID NO: 3).

In some embodiments, the transposome end comprises, from 5' to 3', AGATGTGTATAAGAGACAG (SEQ ID NO: 4).

In some embodiments, the transposome end comprises, from 5' to 3', TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG (SEQ ID NO: 5).

In some embodiments, the transposome end comprises, from 5' to 3', GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG (SEQ ID NO: 6).
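The relationship among the listed transposome end sequences can be checked with a small sketch: SEQ ID NO: 5 and SEQ ID NO: 6, as written in the text, are SEQ ID NO: 3 and SEQ ID NO: 2, respectively, followed by SEQ ID NO: 4. The helper `split_transposome_end` is a hypothetical illustration of how a read that begins with one of these ends might be separated from the remaining sequence. It is not a method described in the source.

```python
# Sequences copied from the text, written 5' to 3'.
SEQ_ID_2 = "GTCTCGTGGGCTCGG"
SEQ_ID_3 = "TCGTCGGCAGCGTC"
SEQ_ID_4 = "AGATGTGTATAAGAGACAG"
SEQ_ID_5 = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
SEQ_ID_6 = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# The two longer sequences are concatenations of the shorter ones.
assert SEQ_ID_3 + SEQ_ID_4 == SEQ_ID_5
assert SEQ_ID_2 + SEQ_ID_4 == SEQ_ID_6

# Try longer ends first so a short end never shadows a longer match.
KNOWN_ENDS = sorted(
    [SEQ_ID_2, SEQ_ID_3, SEQ_ID_4, SEQ_ID_5, SEQ_ID_6], key=len, reverse=True
)


def split_transposome_end(read: str):
    """Return (end, remainder) if the read starts with a listed end,
    otherwise (None, read). Hypothetical helper for illustration."""
    for end in KNOWN_ENDS:
        if read.startswith(end):
            return end, read[len(end):]
    return None, read


end, remainder = split_transposome_end(SEQ_ID_5 + "ACGTACGT")
print(end == SEQ_ID_5, remainder)
```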

In some embodiments, the method further comprises, after step (b), amplifying the uniquely tagged target nucleic acid molecule having a variable length barcode tag at a first end and a transposase fragmentation site and a transposome end at a second end. In some embodiments, the amplification is performed using a hot start DNA-dependent DNA polymerase. In some embodiments, the amplification is performed under conditions such that polymerase-mediated nucleic acid extension occurs substantially after the initial denaturation step.

In one aspect, the invention provides a method of estimating the number of target nucleic acid molecules in a reaction mixture, the method comprising: (A) providing the reaction mixture, wherein the reaction mixture comprises a plurality of uniquely tagged target nucleic acid molecules comprising: (i) a variable length barcode tag located at the first end; and (ii) a transposase fragmentation site and a transposome end at the second end; wherein the combination of (i) and (ii) together comprises a unique molecular barcode that is unique relative to all other uniquely tagged target nucleic acid molecules having the same sequence of at least 25 contiguous nucleotides between the variable length barcode tag and the transposase fragmentation site in the reaction mixture; (B) obtaining a plurality of sequence reads, wherein the sequence reads comprise one or more of: a sequence of a variable length barcode tag, a sequence of a portion of the target nucleic acid between the variable length barcode tag and the transposase fragmentation site, and a sequence of the fragmentation site; and (C) counting the number of target nucleic acid molecules having the same sequence of at least 25 contiguous nucleotides between the variable length barcode tag and the transposase fragmentation site, but different variable length barcode tags and/or transposase fragmentation sites, thereby estimating the number of target nucleic acid molecules in the reaction mixture.
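The counting in step (C) can be sketched as follows. This is a minimal illustration that assumes reads have already been parsed into a target identifier, a variable length barcode tag, and a fragmentation site coordinate. The `ParsedRead` type, the function name, and the example values are hypothetical, and read parsing and alignment are outside the scope of the sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class ParsedRead(NamedTuple):
    target: str     # identifier of the shared target sequence
    tag: str        # variable length barcode tag ("" for length 0)
    frag_site: int  # coordinate of the transposase fragmentation site


def estimate_molecule_counts(reads: Iterable[ParsedRead]) -> Dict[str, int]:
    """Count distinct (tag, fragmentation site) pairs per target.
    Reads sharing all three fields are treated as copies of one molecule."""
    pairs_per_target = defaultdict(set)
    for read in reads:
        pairs_per_target[read.target].add((read.tag, read.frag_site))
    return {target: len(pairs) for target, pairs in pairs_per_target.items()}


reads = [
    ParsedRead("geneA", "ACG", 120),
    ParsedRead("geneA", "ACG", 120),  # duplicate of the first read
    ParsedRead("geneA", "ACG", 245),  # same tag, different site
    ParsedRead("geneA", "", 120),     # zero-length tag, same site
    ParsedRead("geneB", "TT", 77),
]
print(estimate_molecule_counts(reads))  # {'geneA': 3, 'geneB': 1}
```

Using a set per target means amplification duplicates collapse automatically, while molecules that differ in either the tag or the fragmentation site are counted separately.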

In some embodiments, the method of estimating the number of target nucleic acid molecules in a reaction mixture is performed according to any of the methods disclosed herein.

In some embodiments, the method of estimating the number of target nucleic acid molecules in a reaction mixture further comprises amplifying the target nucleic acid molecules having a variable length barcode tag at a first end after step (A) and before step (B).

In one aspect, the invention provides a reaction mixture comprising a plurality of uniquely tagged target nucleic acid molecules, wherein the plurality of uniquely tagged target nucleic acid molecules comprises: (i) a variable length barcode tag located at the first end; and (ii) a transposase fragmentation site and a transposome end at the second end; wherein the combination of (i) and (ii) together comprises a unique molecular barcode that is unique relative to all other uniquely tagged target nucleic acid molecules having the same sequence between the variable length barcode tag and the transposase fragmentation site in the reaction mixture.

In some embodiments, the reaction mixture comprises at least 10 different uniquely tagged target nucleic acid molecules. In some embodiments, the reaction mixture comprises 10 to 1000 different uniquely tagged target nucleic acid molecules, wherein the different uniquely tagged target nucleic acid molecules comprise mRNA transcripts from a single cell. In some embodiments, the reaction mixture comprises 10 to 2000 different uniquely tagged target nucleic acid molecules, wherein the different uniquely tagged target nucleic acid molecules comprise mRNA transcripts from a single cell. In some embodiments, the reaction mixture comprises 10 to 5000 different uniquely tagged target nucleic acid molecules, wherein the different uniquely tagged target nucleic acid molecules comprise mRNA transcripts from a single cell. In some embodiments, the reaction mixture comprises at least 10 different uniquely tagged target nucleic acid molecules, wherein the different uniquely tagged target nucleic acid molecules comprise unique mRNA transcripts from a single cell. In some embodiments, the reaction mixture comprises 10 to 5000 different uniquely tagged target nucleic acid molecules, wherein the different uniquely tagged target nucleic acid molecules comprise unique mRNA transcripts from a plurality of cells. In some embodiments, the reaction mixture comprises 10 to 10000 different uniquely tagged target nucleic acid molecules, wherein the different uniquely tagged target nucleic acid molecules comprise unique mRNA transcripts from a plurality of cells. In some embodiments, the reaction mixture comprises a fluidic partition. In some embodiments, the reaction mixture comprises droplets. In some embodiments, the reaction mixture comprises droplets from an emulsion, such as, but not limited to, a water-in-oil emulsion.
In another embodiment, the reaction mixture comprises a plurality of droplets, optionally wherein each droplet comprises 0 to 5000 different uniquely tagged target nucleic acid molecules. In another embodiment, the reaction mixture comprises a plurality of fluidic partitions, wherein one or more of the fluidic partitions comprises from 10 to 1000 different uniquely tagged target nucleic acid molecules, wherein the different uniquely tagged target nucleic acid molecules comprise a unique mRNA transcript from a single cell. In some embodiments, the reaction mixture further comprises amplification products of the uniquely tagged target nucleic acid molecule.

In one aspect, the invention provides a computer program product comprising a non-transitory machine-readable medium storing program code that, when executed by one or more processors of a computer system, causes the computer system to implement a method for estimating a quantity of a target nucleic acid molecule in a reaction mixture, the method employing (A) a sequence of a target nucleic acid, (i) a variable length barcode tag located at a first end of the target nucleic acid, and (ii) a transposase fragmentation site and a transposome end at a second end of the target nucleic acid to identify and estimate the number of individual molecules of the target nucleic acid in the reaction mixture, the program code comprising: code for obtaining sequence reads of a plurality of amplified polynucleotides, wherein the plurality of amplified polynucleotides are obtained by amplifying nucleic acid fragments in a reaction mixture, the nucleic acid fragments comprising (i) a variable length barcode tag at a first end and (ii) a transposase fragmentation site and a transposome end at a second end; code for identifying a plurality of physical unique molecular identifiers (UMIs) in the combinations of variable length barcode tags and transposase fragmentation sites; and code for counting the number of target nucleic acid molecules having the unique molecular identifiers.
