Preparation method of long fragment capture sequencing probe set

Document No.: 1290459. Publication date: 2020-08-07.

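Abstract: This technology, "Preparation method of long fragment capture sequencing probe set" (长片段捕获测序探针组的制备方法), was created by 潘世让, 王洋, 梁羽, 吴昕 and 汪德鹏 on 2020-06-24. The invention belongs to the technical field of biotechnology and relates in particular to a method for preparing a long fragment capture sequencing probe set. The method comprises: a) calculating the range of the average probe spacing using the formula N = (L+2)×P ± 3×P, where N is the average probe spacing in bp, P is the average length of each probe in bp, and L is the average length of the fragmented genome in Kb, taken as an integer; and b) preparing a probe set for detecting the target region according to that range. Designing probes with this method is comprehensive, fast, accurate and cost-effective, and it addresses the problems of dense capture sequencing probes and high synthesis cost.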

1. A method for preparing a long fragment capture sequencing probe set, characterized by comprising the following steps:

a) calculating the range of the average probe spacing using the following formula:

N = (L+2)×P ± 3×P;

wherein:

N: average probe spacing, in bp;

P: average length of each probe, in bp;

L: average length of the fragmented genome, in Kb, taken as an integer;

b) preparing a probe set for detecting the target region according to the range of the average probe spacing.

2. The method of claim 1, wherein the value of P ranges from 15bp to 250 bp; preferably 100bp to 200 bp.

3. The method of claim 1, wherein the value of L is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.

4. The method of claim 1, wherein the target region is of human origin.

5. The method of claim 1, further comprising grouping the probe sets, the grouping selected from at least one of the following groups:

(1) probes that capture the full length of the target region;

(2) probes that capture only the exon regions of the target region;

(3) probes that capture only regions with a high mutation rate;

preferably, the method further comprises coupling the probe to a label capable of binding to a solid phase, the label being selected from one or more of streptavidin, biotin, avidin, an antibody, and a chemical coupling agent.

6. The method of any one of claims 1 to 5, wherein the target region comprises:

any one of a Dystrophin gene, a TSC1 gene, a TSC2 gene, a BRCA1 gene, a BRCA2 gene, a GJB2 gene, a 12S rRNA gene, or an SLC26A4 gene.

7. Use of the method of any one of claims 1 to 6 for third generation sequencing;

preferably, the third generation sequencing is performed using a PacBio Sequel, PromethION, MinION or GridION platform.

8. A probe set prepared by the method of any one of claims 1 to 6;

optionally, the probe set is used to detect TSC genes.

9. A kit for gene detection comprising the probe set of claim 8.

10. A gene sequencing method, comprising the steps of:

A) breaking the genome DNA in a sample to be detected into nucleic acid fragments and amplifying to construct a DNA library;

B) capturing in the DNA library a target sequence capable of specifically binding to the probe set using the probe set of claim 8;

C) sequencing the target sequence;

preferably, when N is 7P, the amount of probe added from the probe set is not less than 0.3 pmol.

Technical Field

The invention belongs to the technical field of biology, and particularly relates to a preparation method of a long fragment capture sequencing probe set.

Background

The human genome is approximately 3 Gb in size and contains about 20,000 protein-coding genes. Whole-genome sequencing consumes considerable cost and time, so target sequence capture sequencing has become a popular technique in current genomics research. By sequencing only the target sequence, researchers can analyse more samples at greater depth for the same cost. Especially for rare variants, or for mutations present in only a fraction of somatic cells, the achievable sequencing depth makes target sequence capture sequencing an effective tool.

Target sequence capture sequencing designs custom probes specific to the genomic regions of interest, hybridizes them with genomic DNA, enriches the DNA fragments of the target regions, and then sequences them.

Target enrichment based on liquid-phase hybridization capture uses thousands to millions of oligonucleotide capture probes, designed to be complementary to the target region, to capture that region, which is then sequenced. Its advantages are flexible probe design, high throughput, a capture region that can be large or small, and the ability to detect fusion genes; its disadvantage is the relatively high cost of probe synthesis, especially for large panels. Therefore, reducing the number of probes in a panel would effectively promote the adoption of long fragment sequencing methods.

In view of the above, the present invention is particularly proposed.

Disclosure of Invention

The invention aims to provide an economical and effective long fragment capture sequencing method capable of reducing the number of probes.

The invention relates to a preparation method of a long fragment capture sequencing probe set, which comprises the following steps:

a) calculating the range of the average probe spacing using the following formula:

N = (L+2)×P ± 3×P;

wherein:

N: average probe spacing, in bp;

P: average length of each probe, in bp;

L: average length of the fragmented genome, in Kb, taken as an integer;

b) preparing a probe set for detecting the target region according to the range of the average probe spacing.

Designing probes with this method is comprehensive, fast, accurate and cost-effective, and it addresses the problems of dense capture sequencing probes and high synthesis cost.

According to another aspect of the invention, the invention also relates to the use of the method as described above in third generation sequencing.

According to another aspect of the invention, the invention also relates to the probe set prepared by the method as described above, or a kit comprising the probe set, and the use of the probe set or the kit.

The probe set prepared by the method provided by the invention can detect multiple types of mutation in a target gene and is suitable for a variety of sample types, such as muscle, blood and amniotic fluid.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 shows an average depth of coverage of 0.5× for captured DNA fragments when the genome is fragmented to an average length of 5Kb, in one embodiment of the present invention;

FIG. 2 shows an average depth of coverage of 0.5× for captured DNA fragments when the genome is fragmented to an average length of 2Kb, in one embodiment of the present invention;

FIG. 3 shows an average depth of coverage of 0.5× for captured DNA fragments when the genome is fragmented to an average length of 8Kb, in one embodiment of the present invention;

FIG. 4 is a graph of the amount of probe added versus the on-target rate of capture, in one embodiment of the present invention.

Detailed Description

The invention relates to a preparation method of a long fragment capture sequencing probe set, which comprises the following steps:

a) calculating the range of the average probe spacing using the following formula:

N = (L+2)×P ± 3×P;

wherein:

N: average probe spacing, in bp;

P: average length of each probe, in bp;

L: average length of the fragmented genome, in Kb, taken as an integer;

b) preparing a probe set for detecting the target region according to the range of the average probe spacing.
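As a rough illustration of step a), the spacing formula can be evaluated directly. The following is a minimal Python sketch and not part of the described method; the function name `probe_spacing_range` and the example values L = 5 and P = 120 bp are assumptions chosen only for illustration.

```python
def probe_spacing_range(l_kb: int, p_bp: float) -> tuple[float, float, float]:
    """Return (lower, centre, upper) of N = (L + 2) * P ± 3 * P, in bp.

    l_kb: average length of the fragmented genome in Kb (integer).
    p_bp: average probe length in bp.
    """
    if l_kb < 1 or p_bp <= 0:
        raise ValueError("L must be a positive integer and P must be positive")
    centre = (l_kb + 2) * p_bp  # central value (L + 2) * P
    margin = 3 * p_bp           # the ± 3 * P tolerance
    return centre - margin, centre, centre + margin


# Hypothetical example: fragments of 5 Kb on average, probes of 120 bp.
low, centre, high = probe_spacing_range(5, 120)
print(f"N = {centre} bp, allowed range {low} to {high} bp")
```

For these example values the centre is 840 bp (7P) and the range runs from 480 bp to 1200 bp (4P to 10P).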

A "target region" of a genome (and any grammatical equivalent thereof) refers to the genome as a whole, or to any one or more regions of the genome that are identified as targets and/or selected by one or more of the methods described herein. Target regions of a genome sequenced by the methods and systems described herein include, but are not limited to, introns, exons, intergenic regions, or any combination thereof. In certain examples, the methods and systems described herein provide sequence information about a full exome, a portion of an exon, one or more selected genes (including a selected set of genes), one or more introns, and combinations of intronic and exonic sequences.

The target region of the genome may also include certain portions or percentages of the genome rather than regions identified by sequence. In certain embodiments, the target region of the genome captured and analyzed according to the methods described herein comprises a portion of the genome located at every 1, 2, 5, 10, 15, 20, 25, 50, 100, 200, 250, 500, 750, 1000, or 10000Kb of the genome. In other embodiments, the target region of the genome comprises 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% of the whole genome. In other embodiments, the target region comprises 1-10%, 5-20%, 10-30%, 15-40%, 20-50%, 25-60%, 30-70%, 35-80%, 40-90%, or 45-95% of the whole genome. Furthermore, the target region may be continuous or discontinuous.

In some embodiments, the value of P ranges from 15bp to 250bp; such as 15bp, 20bp, 30bp, 40bp, 50bp, 60bp, 70bp, 80bp, 90bp, 100bp, 110bp, 120bp, 130bp, 140bp, 150bp, 160bp, 170bp, 180bp, 190bp, 200bp, 210bp, 220bp, 230bp, 240bp or 250bp, or any range bounded by two of these values.

In some embodiments, the probe is used to detect a target region of no less than 1Kb in length, optionally 1.5Kb, 2Kb, 2.5Kb, 3Kb, 4Kb, 5Kb, 10Kb, 15Kb, 20Kb, 30Kb, 40Kb, 50Kb, 100Kb, 150Kb, 200Kb, 300Kb, 400Kb, 500Kb, 600Kb, 700Kb, 800Kb, 900Kb, 1Mb, 2Mb, 3Mb, 4Mb, 5Mb, 10Mb, 15Mb, 20Mb or more.

Probes in a probe set should be complementary to the corresponding regions in the target region. The term "complementary" as used herein means that the probe is capable of hybridizing to the desired corresponding site under stringent conditions. The stringent conditions used in the present invention are well known and include, for example, hybridization at 65 ℃ for 12 to 16 hours in a hybridization solution containing 400mM NaCl, 40mM PIPES (pH 6.4) and 1mM EDTA, followed by washing at 65 ℃ for 15 to 60 minutes with a washing solution containing 0.1% SDS and 0.1% SSC. These conditions are familiar to the person skilled in the art.

In some embodiments, the value of L is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.

In some embodiments, the species of the target region may be selected from animals, including humans, all domesticated animals (e.g., livestock and pets), and wild animals and birds, including without limitation cattle, horses, pigs, sheep, goats, rats, mice, dogs, cats, rabbits, camels, donkeys, deer, mink, chickens, ducks, geese, turkeys, etc.; mammals are preferred, primates more preferred, and humans most preferred.

In some embodiments, the method further comprises grouping the sets of probes, the grouping selected from at least one of the following groups:

(1) probes that capture the full length of the target region;

(2) probes that capture only the exon regions of the target region;

(3) probes that capture only regions with a high mutation rate.

In some embodiments, the method further comprises coupling the probe to a label capable of binding to a solid phase, the label being selected from one or more of streptavidin, biotin, avidin, an antibody, and a chemical coupling agent, preferably biotin.

In the present invention, the term "target region" or "gene" is understood to include, but is not limited to, any of the (preferably human) Dystrophin gene, TSC1 gene, TSC2 gene, BRCA1 gene, BRCA2 gene, GJB2 gene, 12S rRNA gene, or SLC26A4 gene.

The invention also relates to the use of the method in third generation sequencing.

In some embodiments, the third generation sequencing is performed using a PacBio Sequel, PromethION, MinION, or GridION platform.

The invention also relates to a probe set, which is prepared by the method;

in some embodiments, the probe set is used to detect the TSC (tuberous sclerosis complex) gene, which herein refers to the TSC1 gene and/or the TSC2 gene unless otherwise specified.

In some specific embodiments, the probe set comprises probes that are complementary to the following chromosomal regions (hereinafter referred to as the TSC probe set):

chr9:135813137-135813237,chr9:135789857-135789957,chr9:135768617-135768717,chr9:135768977-135769077,chr9:135769457-135769557,chr9:135769817-135769917,chr9:135770297-135770397,chr9:135770777-135770877,chr9:135771137-135771237,chr9:135771617-135771717,chr9:135772097-135772197,chr9:135772457-135772557,chr9:135772937-135773037,chr9:135758177-135758277,chr9:135773777-135773877,chr9:135774257-135774357,chr9:135774617-135774717,chr9:135775097-135775197,chr9:135775577-135775677,chr9:135775937-135776037,chr9:135776417-135776517,chr9:135776897-135776997,chr9:135777257-135777357,chr9:135777737-135777837,chr9:135758657-135758757,chr9:135778577-135778677,chr9:135779057-135779157,chr9:135779417-135779517,chr9:135779897-135779997,chr9:135780377-135780477,chr9:135780617-135780717,chr9:135781097-135781197,chr9:135781577-135781677,chr9:135781937-135782037,chr9:135782417-135782517,chr9:135783737-135783837,chr9:135759257-135759357,chr9:135784577-135784677,chr9:135785057-135785157,chr9:135785897-135785997,chr9:135786377-135786477,chr9:135786737-135786837,chr9:135787217-135787317,chr9:135787697-135787797,chr9:135788057-135788157,chr9:135759737-135759837,chr9:135790217-135790317,chr9:135790697-135790797,chr9:135791177-135791277,chr9:135791537-135791637,chr9:135792017-135792117,chr9:135792497-135792597,chr9:135793217-135793317,chr9:135793697-135793797,chr9:135794057-135794157,chr9:135794537-135794637,chr9:135795377-135795477,chr9:135795857-135795957,chr9:135796217-135796317,chr9:135796697-135796797,chr9:135797177-135797277,chr9:135797537-135797637,chr9:135798017-135798117,chr9:135798497-135798597,chr9:135798857-135798957,chr9:135799337-135799437,chr9:135800177-135800277,chr9:135800657-135800757,chr9:135801017-135801117,chr9:135801497-135801597,chr9:135801977-135802077,chr9:135802337-135802437,chr9:135802817-135802917,chr9:135803297-135803397,chr9:135803657-135803757,chr9:135804137-135804237,chr9:135804857-135804957,chr9:135805337-135805437,chr9:135761417-135761517,chr9:135806177-135806277,
chr9:135806657-135806757,chr9:135807017-135807117,chr9:135807497-135807597,chr9:135807977-135808077,chr9:135808337-135808437,chr9:135808817-135808917,chr9:135809297-135809397,chr9:135809657-135809757,chr9:135810137-135810237,chr9:135761897-135761997,chr9:135810977-135811077,chr9:135811457-135811557,chr9:135811817-135811917,chr9:135812297-135812397,chr9:135813617-135813717,chr9:135814097-135814197,chr9:135814457-135814557,chr9:135814937-135815037,chr9:135815777-135815877,chr9:135816257-135816357,chr9:135816977-135817077,chr9:135817457-135817557,chr9:135817817-135817917,chr9:135818297-135818397,chr9:135818777-135818877,chr9:135819137-135819237,chr9:135819617-135819717,chr9:135820097-135820197,chr9:135820457-135820557,chr9:135820937-135821037,chr9:135762977-135763077,chr9:135821777-135821877,chr9:135822257-135822357,chr9:135822617-135822717,chr9:135823577-135823677,chr9:135823937-135824037,chr9:135824417-135824517,chr9:135824897-135824997,chr9:135825737-135825837,chr9:135763457-135763557,chr9:135826577-135826677,chr9:135827057-135827157,chr9:135828377-135828477,chr9:135828617-135828717,chr9:135765137-135765237,chr9:135765617-135765717,chr9:135766097-135766197,chr9:135766457-135766557,chr9:135766937-135767037,chr9:135757577-135757677,chr9:135767777-135767877,chr9:135768257-135768357,chr16:2144932-2145032,chr16:2088172-2088272,chr16:2117692-2117792,chr16:2102452-2102552,chr16:2107732-2107832,chr16:2147572-2147672,chr16:2147812-2147912,chr16:2091412-2091512,chr16:2116732-2116832,chr16:2099812-2099912,chr16:2100292-2100392,chr16:2101612-2101712,chr16:2102092-2102192,chr16:2102932-2103032,chr16:2089252-2089352,chr16:2103772-2103872,chr16:2104252-2104352,chr16:2104612-2104712,chr16:2105092-2105192,chr16:2105932-2106032,chr16:2106412-2106512,chr16:2106892-2106992,chr16:2107252-2107352,chr16:2089732-2089832,chr16:2108572-2108672,chr16:2110372-2110472,chr16:2110732-2110832,chr16:2111212-2111312,chr16:2111692-2111792,chr16:2111932-2112032,chr16:2112412-2112512,chr16:2112892-2112
992,chr16:2113252-2113352,chr16:2113732-2113832,chr16:2090332-2090432,chr16:2114572-2114672,chr16:2115412-2115512,chr16:2115892-2115992,chr16:2116372-2116472,chr16:2117212-2117312,chr16:2118532-2118632,chr16:2090812-2090912,chr16:2120212-2120312,chr16:2120692-2120792,chr16:2121172-2121272,chr16:2121532-2121632,chr16:2122012-2122112,chr16:2122492-2122592,chr16:2122852-2122952,chr16:2123332-2123432,chr16:2091292-2091392,chr16:2124052-2124152,chr16:2124532-2124632,chr16:2125852-2125952,chr16:2126212-2126312,chr16:2126692-2126792,chr16:2127172-2127272,chr16:2127532-2127632,chr16:2128012-2128112,chr16:2128492-2128592,chr16:2128852-2128952,chr16:2129332-2129432,chr16:2130172-2130272,chr16:2131012-2131112,chr16:2131492-2131592,chr16:2131972-2132072,chr16:2132332-2132432,chr16:2132812-2132912,chr16:2133292-2133392,chr16:2133652-2133752,chr16:2134132-2134232,chr16:2092372-2092472,chr16:2134972-2135072,chr16:2135452-2135552,chr16:2136172-2136272,chr16:2136652-2136752,chr16:2137012-2137112,chr16:2137492-2137592,chr16:2137972-2138072,chr16:2138332-2138432,chr16:2138812-2138912,chr16:2139292-2139392,chr16:2139652-2139752,chr16:2140132-2140232,chr16:2092972-2093072,chr16:2140972-2141072,chr16:2141452-2141552,chr16:2141812-2141912,chr16:2142292-2142392,chr16:2142772-2142872,chr16:2143132-2143232,chr16:2143612-2143712,chr16:2144092-2144192,chr16:2144452-2144552,chr16:2093452-2093552,chr16:2145772-2145872,chr16:2146252-2146352,chr16:2147092-2147192,chr16:2093812-2093912,chr16:2094292-2094392,chr16:2094772-2094872,chr16:2096092-2096192,chr16:2096452-2096552,chr16:2096932-2097032,chr16:2088652-2088752,chr16:2097772-2097872,chr16:2098252-2098352,chr16:2098612-2098712

wherein the chromosomal regions are regions of the human TSC1 and TSC2 genes, and the reference sequence is the human reference genome version Dec. 2013 (GRCh38/hg38) (TSC1: chr9:132891348-132944633; TSC2: chr16:2047895-2088720).
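To inspect how such a coordinate list is laid out, the region strings can be parsed and the distance between neighbouring probes computed. The following is a minimal Python sketch; the helper names `parse_region` and `start_to_start_distances` are hypothetical, only four of the chr9 regions listed above are used, and measuring the distance from one probe start to the next is an assumption, since the text does not state how the spacing N is measured.

```python
from collections import defaultdict


def parse_region(region: str) -> tuple[str, int, int]:
    """Split a string such as 'chr9:135768617-135768717' into (chrom, start, end)."""
    chrom, span = region.strip().split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)


def start_to_start_distances(regions: list[str]) -> dict[str, list[int]]:
    """Sort probes per chromosome and return distances between consecutive starts."""
    starts = defaultdict(list)
    for region in regions:
        chrom, start, _end = parse_region(region)
        starts[chrom].append(start)
    distances = {}
    for chrom, values in starts.items():
        values.sort()
        distances[chrom] = [b - a for a, b in zip(values, values[1:])]
    return distances


# Four regions copied from the TSC probe set listed above.
sample = [
    "chr9:135768617-135768717",
    "chr9:135768977-135769077",
    "chr9:135769457-135769557",
    "chr9:135769817-135769917",
]
print(start_to_start_distances(sample))
```

For these four regions the start-to-start distances are 360, 480 and 360 bp.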

The invention also relates to a kit for gene detection, comprising a probe set as described above.

In some embodiments, the kit further comprises one or more of a solid support as defined above, a linker sequence, primers for binding to the linker sequence and amplifying a nucleic acid fragment, a DNA extraction system, a PCR reaction buffer, nuclease-free water, a DNA polymerase, a molecular weight marker, a target sequence eluent, a terminal repair enzyme, a terminal repair buffer, a DNA ligase.

In some embodiments, the probes in the probe set further comprise a label capable of binding to a solid phase; in some embodiments, the label may be selected from biotin, streptavidin, avidin, an antibody, a chemical coupling agent, and any biological, chemical, physical or enzymatic reagent known in the art for affinity purification.

"Chemical coupling agent" refers to a group that is capable of reacting with another chemical group to form a covalent bond, i.e., a group that is covalently reactive under suitable reaction conditions. Generally, nucleophilic groups, electrophilic groups, and photoactivatable groups can be selected. Exemplary chemical coupling agents include, but are not limited to, olefins, acetylenes, alcohols, acids, ethers, oxides, halides, aldehydes, ketones, carboxylic acids, esters, amides, cyanates, isocyanates, thiocyanates, isothiocyanates, amines, hydrazines, hydrazones, hydrazides, diazos, diazonium salts, nitro compounds, nitriles, thiols, sulfides, disulfides, sulfoxides, sulfones, sulfonic acids, sulfuric acids, acetals, ketals, anhydrides, sulfates, sulfenamides, amidines, diimides, imides, nitrones, hydroxylamines, oximes, hydroxamic acids, thiohydroxamic acids, allenes, orthoesters, sulfites, enamines, acetylenic amines, ureas, pseudoureas, semicarbazides, carbodiimides, carbamates, imines, azides, azo compounds, azoxy compounds, and nitroso compounds. Reactive functional groups of chemical coupling agents also include those used to prepare bioconjugates, such as N-hydroxysuccinimide esters, maleimides, and the like. Methods of preparing each of these functional groups are well known in the art, and their use or modification for a particular purpose is within the ability of those skilled in the art (see, e.g., Sandler and Karo, editors, Organic Functional Group Preparations, Academic Press, San Diego, 1989).

The probe can be combined with third-generation sequencing to achieve better technical effects.

The invention also relates to a gene sequencing method, which comprises the following steps:

A) breaking the genome DNA in a sample to be detected into nucleic acid fragments and amplifying to construct a DNA library;

B) capturing in the DNA library a target sequence capable of specifically binding to the probe set using the probe set as described above;

C) sequencing the target sequence.

In some embodiments, when N is 7P, the amount of probe added from the probe set is 0.3 pmol or more, for example 0.3 to 0.8 pmol.
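The condition N = 7P can be related back to the spacing formula: 7P is the central value (L+2)×P when L = 5, and it also lies inside the ±3×P range for several other values of L. The Python sketch below enumerates them; the function name `l_values_allowing` is hypothetical, and restricting L to 1 through 10 follows the embodiment that lists those values.

```python
def l_values_allowing(multiple: int, l_candidates=range(1, 11)) -> list[int]:
    """Return the L values for which N = multiple * P satisfies
    (L + 2) * P - 3 * P <= N <= (L + 2) * P + 3 * P.

    P is positive and cancels out, so only the multiple of P is compared.
    """
    return [l for l in l_candidates if (l + 2) - 3 <= multiple <= (l + 2) + 3]


print(l_values_allowing(7))
```

Under these assumptions, N = 7P falls within the formula's range for L = 2 through L = 8.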

In some embodiments, the sample to be tested is a body fluid, tissue, or tissue lysate; the body fluid may be selected from, for example, blood (whole blood), serum, plasma, cell culture supernatant, saliva or semen; the tissue may be selected from, for example, amniotic fluid, villi, bone, muscle, or hair, among others.

In some embodiments, the method of constructing a DNA library comprises:

repairing the ends of the nucleic acid fragments, ligating the fragments to adaptor sequences, and performing PCR amplification with primers designed using the adaptor sequences as templates.

In some embodiments, the capturing occurs on a solid support;

the solid support is preferably enrichment particles coated with biotin or avidin, preferably avidin, in which case the moiety it captures is biotin.

In a preferred embodiment, the "solid support" is an "enrichment particle". As used herein, "enrichment particles" refers to discrete small objects of various shapes, such as spheres (e.g., beads), capsules, polyhedra, and the like. The particles may be macroscopic or microscopic, such as microparticles or nanoparticles. The particles may be non-magnetic or magnetic. Magnetic particles may contain a ferromagnetic substance, such as Fe, Ni, Co, or iron oxide.

In some embodiments, the target sequence is amplified prior to sequencing the target sequence.

In some embodiments, the amplification is specifically a PCR amplification.

In some embodiments, the sequencing is third generation sequencing.

In some embodiments, the third generation sequencing is performed using a PacBio Sequel, PromethION, MinION, or GridION platform.

More preferably, the enrichment particles described above are magnetic beads.

According to one aspect of the invention, the invention also relates to the use of a probe or probe combination as described above, or a kit as described above, or a method as described above, for detecting a variation in a human gene (e.g. a TSC gene);

in some embodiments, the variation in the human gene (e.g., a TSC gene) comprises an insertion, deletion, duplication, inversion, translocation, or SNP.

This use covers both diagnostic and non-diagnostic purposes, for example genetic studies, population distribution studies, human chemistry and the like (typically applications of SNPs); where homology with the human gene is high, it may also be used to characterize cell and animal models of TSC gene-related diseases.

According to one aspect of the invention, the invention also relates to the use of a probe set as described above, or a kit as described above, in the preparation of a diagnostic agent for tuberous sclerosis in humans.

Embodiments of the present invention will be described in detail below with reference to examples, but it will be understood by those skilled in the art that the following examples are only illustrative of the present invention and should not be construed as limiting its scope. Where specific conditions are not stated in the examples, conventional conditions or the conditions recommended by the manufacturer were used. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
