High-throughput pathogen microorganism gene detection screening method

Document No.: 1388889    Publication date: 2020-08-18

Reading note: This technology, "High-throughput pathogen microorganism gene detection screening method", was designed by Yang Gongda (杨功达), Zeng Fengbo (曾丰波), and Hu Xiudi (胡秀弟) on 2020-04-29. Main content: The invention belongs to the field of biology, and particularly relates to a high-throughput pathogen microorganism gene detection screening method. The method comprises: S1: designing probes; S2: chip synthesis; S3: extracting DNA or RNA from a sample; S4: constructing a library; S5: hybridization capture of the library target regions and sequencing; S6: detection on a high-throughput sequencing platform; S7: performing bioinformatics analysis, using Samtools to calculate the depth at each site of the pathogen genome sequence; sequences that overlap a probe target region are treated as "target sequences", and all others are treated as "non-target sequences". With a single test, the invention achieves both targeted pathogen detection and metagenomic detection: it improves detection sensitivity for the targeted pathogens and can also detect pathogens for which no probe was designed, thereby widening the detection range.

1. A high-throughput pathogen microorganism gene detection screening method, comprising:

S1: designing a probe;

S2: chip synthesis;

S3: extracting DNA or RNA from a sample;

S4: constructing a library;

S5: hybridization capture of the library target regions and sequencing;

S6: detecting on a high-throughput sequencing platform;

S7: performing bioinformatics analysis, and calculating the depth at each site of the pathogen genome sequence using Samtools; sequences that overlap a probe target region are considered "target sequences"; all other sequences are considered "non-target sequences".

2. The method of claim 1, wherein a target pathogen microorganism is selected as the screening target, a bioinformatics method is used to select a species-specific sequence on the pathogen genome, and a probe sequence is designed based on that region.

3. The method of claim 1, wherein 1-100 probes are designed for each microorganism, and after the probes are synthesized, they are mixed in equimolar amounts to obtain a liquid-phase detection chip.

4. The method of claim 1, wherein the sample is a blood sample and cell-free DNA (cfDNA) is obtained after sample collection and separation.

5. The method according to claim 1, wherein S4 includes:

S41: quantifying the cfDNA with QuantiFluor-ST (Promega) and assessing its quality on an Agilent 2100;

S42: preparing the genomic DNA library and the cfDNA library separately, using a next-generation sequencing library construction kit with the cfDNA as the sample.

6. The method according to claim 5, wherein S42 comprises the following steps in sequence:

end repair, purification after end repair, A-tailing, purification after A-tailing, ligation of single-molecule tag adapters, magnetic bead purification after adapter ligation, library amplification, library verification, and library purification.

7. The method of claim 1, wherein the high-throughput sequencing platform is Illumina X-Ten.

8. The method according to claim 1, wherein S7 includes:

S71: analyzing the raw sequencer output, first demultiplexing the data and then filtering by quality value to remove low-quality data;

S72: matching the k-mers of the sequenced reads against all reference genomes, using Kraken as the taxonomic classifier; building a custom classification database according to the Kraken algorithm;

S73: calculating the depth at each site of the pathogen genome sequence using Samtools; obtaining the target sequences and the non-target sequences;

S74: calculating the coverage of the probe target regions and the number of sequences in the non-target regions using a custom script; a pathogen is considered positive if its sequences are present in both the target and non-target regions.

9. The method of claim 1, wherein the pathogen is BK virus.

10. Use of a method according to any one of claims 1 to 9 in the detection of a pathogen.

Technical Field

The invention belongs to the field of biology, and particularly relates to a high-throughput pathogen microorganism gene detection screening method.

Background

The detection and identification of pathogenic microorganisms is of great significance for clinically determining the type of infection and treating it in a targeted manner. Pathogenic microorganisms are diverse and come from many sources, and infections by different pathogens can present similar clinical symptoms, such as fever, so the type of infection is difficult to distinguish from symptoms alone; an accurate and effective screening or diagnostic method is therefore very important. The pathogen detection methods currently used clinically mainly include culture, protein-based antigen-antibody specificity detection, and nucleic acid-based gene detection; common nucleic acid methods include qPCR, first-generation sequencing, and high-throughput sequencing. Because of the limits of culture conditions, only a few types of microorganisms can be cultured, and culture takes a long time, so the disadvantages of the culture method are obvious. Antigen-antibody-based detection is rapid, but it responds to a broad range of microorganisms and cannot accurately determine the species of a particular pathogenic microorganism. Nucleic acid detection is more broadly applicable and is increasingly used for detecting pathogenic microorganisms.

Among the gene detection methods for pathogenic microorganisms, qPCR is the most widely applied. A species-specific sequence in the microorganism's genome is selected, a corresponding primer or probe is designed, and the genome copy number of the pathogen is calculated by relative quantification to determine whether the sample contains the pathogen. For known pathogens, qPCR has clear advantages: low cost, high speed, and high accuracy. Clinically, however, many pathogens cause similar symptoms, and the number of sites a qPCR assay can detect is limited, so multiple assays are needed to cover many pathogens. In addition, rare or unknown pathogens may also cause clinical symptoms, and such infections are difficult to resolve with qPCR.

Pathogen detection based on high-throughput sequencing is increasingly used, owing to the falling cost of high-throughput sequencing and the development of target region capture technologies. The detection modes in common use fall into two groups: whole-genome or metagenome sequencing (WGS), and target region capture sequencing. Metagenome sequencing makes no prior assumptions and sequences the genomes of all species in the test sample. Although its detection range is wide, it requires a large volume of data, most of which comes from background sequences, so data efficiency is low, detection sensitivity is limited, and the cost is very high.

In target region capture sequencing, species-specific sequences of a set of pathogens are designed into probes or primers in advance, the target regions of the DNA in the test sample are enriched, and the enriched material is then sequenced and analyzed. Two approaches are commonly used: liquid-phase hybridization capture and multiplex PCR. Because the target regions are enriched, detection sensitivity is greatly improved compared with whole-genome sequencing, while the amount of sequencing data required is small and the cost is low. However, the method detects only the pathogenic microorganisms chosen in advance, so its detection range is limited.

Disclosure of Invention

The invention aims to screen for known and unknown pathogens simultaneously by combining liquid-phase hybridization with high-throughput sequencing. The method captures and enriches the target pathogens, which improves detection sensitivity, and it also widens the detection range by analyzing the sequences from non-target regions.

Specifically, the technical scheme of the invention is as follows:

In a first aspect, the invention discloses a high-throughput pathogen microorganism gene detection screening method, which comprises the following steps:

S1: designing a probe;

S2: chip synthesis;

S3: extracting DNA or RNA from a sample;

S4: constructing a library;

S5: hybridization capture of the library target regions and sequencing;

S6: detecting on a high-throughput sequencing platform;

S7: performing bioinformatics analysis, and calculating the depth at each site of the pathogen genome sequence using Samtools; sequences that overlap a probe target region are considered "target sequences"; all other sequences are considered "non-target sequences".

It should be understood that the present invention is not limited to the above steps, and may also include other additional steps, for example, before step S1, between steps S1 and S2, between steps S2 and S3, between steps S3 and S4, between steps S4 and S5, between steps S5 and S6, between steps S6 and S7, and after step S7, without departing from the scope of the present invention.

Preferably, a target pathogen microorganism is selected as the screening target, a species-specific sequence on the pathogen genome is selected using bioinformatics, and a probe sequence is designed based on that region.

Preferably, 1-100 probes are designed for each microorganism, and after the probes are synthesized, they are mixed in equimolar amounts to obtain the liquid-phase detection chip.
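As a rough illustration of equimolar mixing, the sketch below computes the volume of each probe stock that contributes the same molar amount to the pool. The probe names, stock concentrations, and per-probe amount are hypothetical values chosen for the example; the source does not specify them.

```python
from typing import Dict


def equimolar_volumes(stock_conc_um: Dict[str, float], pmol_each: float) -> Dict[str, float]:
    """Return the volume (in microliters) of each probe stock needed so that
    every probe contributes `pmol_each` picomoles to the pool.

    1 uM equals 1 pmol per microliter, so volume_uL = pmol / concentration_uM.
    """
    volumes = {}
    for name, conc in stock_conc_um.items():
        if conc <= 0:
            raise ValueError(f"stock concentration for {name} must be positive")
        volumes[name] = pmol_each / conc
    return volumes


# Hypothetical stocks: two probes at different concentrations (uM).
stocks = {"probe_1": 100.0, "probe_2": 50.0}
pool = equimolar_volumes(stocks, pmol_each=10.0)
for name, vol in pool.items():
    print(f"{name}: {vol:.2f} uL")
```

A more dilute stock needs a proportionally larger volume, which is why the pool is defined by molar amount rather than by equal volumes.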

Preferably, the sample is a blood sample, and cell-free DNA (cfDNA) is obtained after sample collection and separation.

Preferably, S4 includes:

S41: quantifying the cfDNA with QuantiFluor-ST (Promega) and assessing its quality on an Agilent 2100;

S42: preparing the genomic DNA library and the cfDNA library separately, using a next-generation sequencing library construction kit with the cfDNA as the sample.

More preferably, S42 comprises the following steps in sequence:

end repair, purification after end repair, A-tailing, purification after A-tailing, ligation of single-molecule tag adapters, magnetic bead purification after adapter ligation, library amplification, library verification, and library purification.

Preferably, the high-throughput sequencing platform is Illumina X-Ten.

Preferably, S7 includes:

S71: analyzing the raw sequencer output, first demultiplexing the data and then filtering by quality value to remove low-quality data;

S72: matching the k-mers of the sequenced reads against all reference genomes, using Kraken as the taxonomic classifier; building a custom classification database according to the Kraken algorithm;

S73: calculating the depth at each site of the pathogen genome sequence using Samtools; obtaining the target sequences and the non-target sequences;

S74: calculating the coverage of the probe target regions and the number of sequences in the non-target regions using a custom script; a pathogen is considered positive if its sequences are present in both the target and non-target regions.
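Steps S73 and S74 can be sketched as follows. This is a minimal illustration, not the custom script referred to in the source, which is not disclosed. It assumes a single reference genome, reads already aligned and reduced to 1-based inclusive (start, end) coordinates, probe target regions given in the same coordinates, and per-site depth supplied as tab-separated lines of reference name, position, and depth, as printed by `samtools depth`. All function names and the example coordinates are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end) on one reference


def overlaps_any(start: int, end: int, targets: List[Interval]) -> bool:
    """True if the read [start, end] overlaps at least one probe target region."""
    return any(start <= t_end and end >= t_start for t_start, t_end in targets)


def split_reads(reads: Iterable[Interval], targets: List[Interval]):
    """Split aligned reads into target and non-target sequences (S73)."""
    target, non_target = [], []
    for start, end in reads:
        bucket = target if overlaps_any(start, end, targets) else non_target
        bucket.append((start, end))
    return target, non_target


def parse_depth(lines: Iterable[str]) -> Dict[int, int]:
    """Parse per-site depth lines (reference, position, depth) for one reference."""
    depth = {}
    for line in lines:
        _ref, pos, d = line.rstrip("\n").split("\t")
        depth[int(pos)] = int(d)
    return depth


def target_coverage(depth: Dict[int, int], targets: List[Interval], min_depth: int = 1) -> float:
    """Fraction of probe target positions whose depth is at least min_depth (S74)."""
    positions = [p for s, e in targets for p in range(s, e + 1)]
    if not positions:
        return 0.0
    covered = sum(1 for p in positions if depth.get(p, 0) >= min_depth)
    return covered / len(positions)


def is_positive(n_target: int, n_non_target: int) -> bool:
    """Positive call per S74: sequences present in both target and non-target regions."""
    return n_target > 0 and n_non_target > 0


# Hypothetical example: one probe target region and four aligned reads.
targets = [(100, 219)]
reads = [(90, 120), (150, 200), (400, 450), (220, 260)]
on_target, off_target = split_reads(reads, targets)
print(len(on_target), len(off_target), is_positive(len(on_target), len(off_target)))
```

Requiring support from both read classes means a call does not rest on captured fragments alone, which is the accuracy argument the description makes for analyzing the non-target sequences.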

Preferably, the pathogen is a BK virus. It is to be understood that the pathogen detected by the present invention is not limited to BK virus, and any pathogen that can be detected by the method of the present invention is within the scope of the present invention.

In a second aspect, the invention discloses the use of the above method for the detection of pathogens.

On the basis of the common general knowledge in the field, the above-mentioned preferred conditions can be combined arbitrarily without departing from the concept and the protection scope of the invention.

The technical scheme of the invention uses liquid-phase hybridization capture: probes are designed and manufactured against the genome sequences of the target pathogens to be detected, and the target regions of the test sample are captured and sequenced. Different on-target rates can be obtained by adjusting the experimental conditions of the hybridization capture (the hybridization temperature and the proportion of hybridization capture reagent), where

On-target rate = (number of sequences belonging to the target regions) / (total number of sequences obtained from the sequencer for the sample).

The on-target rate is generally used to measure the enrichment efficiency of liquid-phase hybridization capture: the higher the on-target rate, the larger the proportion of target-region sequences in the sequencing result, and the better the enrichment. Fragments from non-target regions are usually discarded as "useless sequences". The present method instead analyzes and uses both the target-region sequences and the non-target-region sequences in the sequencing result of liquid-phase hybridization capture. By adjusting the hybridization capture conditions, the non-target sequences obtained are very close to a random WGS library, while the target sequences are the fragments the probes are intended to capture. By controlling the on-target rate, the target sequences and non-target sequences in the sequencing result are kept in a reasonable proportion.

In the sequencing result, the target sequences serve to enrich the target pathogens: pathogen sequences can be enriched more than 100-fold, which improves detection sensitivity. The non-target sequences are metagenomic data (WGS) with little bias, so a pathogen for which no probe was designed can still be detected from the non-target sequences as long as it is present in the test sample. The sequences of a target pathogen are enriched among the target sequences and can also be detected among the non-target sequences, which improves detection accuracy.

Compared with the prior art, the invention has the following remarkable advantages and effects:

The method of the invention captures and enriches the target pathogens, which improves detection sensitivity, and it also widens the detection range by analyzing the sequences from non-target regions. By adjusting the liquid-phase hybridization capture process, the on-target rate can be set anywhere from 20% to 75%. For example, at an on-target rate of 60%, 60% of the sequencing output belongs to the designed target pathogens, and the remaining 40% is random sequencing of the nucleic acids in the test sample. That remainder is equivalent to metagenomic data, contains non-target pathogen sequences, and complements the capture sequencing of the target regions.
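The on-target rate and the 60% example can be written out as a small calculation. This is an illustrative sketch only: the function names and the total read count are hypothetical, and the 20% to 75% check simply mirrors the range stated in the text.

```python
from typing import Tuple


def on_target_rate(target_reads: int, total_reads: int) -> float:
    """On-target rate = target-region sequences / all sequences from the sample."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= target_reads <= total_reads:
        raise ValueError("target_reads must lie between 0 and total_reads")
    return target_reads / total_reads


def split_read_budget(total_reads: int, rate: float) -> Tuple[int, int]:
    """Split a read budget into (target, non-target) counts at a given on-target rate."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    target = round(total_reads * rate)
    return target, total_reads - target


def in_stated_range(rate: float, low: float = 0.20, high: float = 0.75) -> bool:
    """Check a rate against the 20%-75% adjustable range stated in the text."""
    return low <= rate <= high


# Hypothetical run: 10 million reads at a 60% on-target rate.
target, non_target = split_read_budget(10_000_000, 0.60)
print(target, non_target, in_stated_range(0.60))
```

With these assumed numbers, 6 million reads would serve targeted detection and 4 million would serve as the metagenome-like fraction.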

Therefore, with a single test the method achieves both targeted pathogen detection and metagenomic detection: it improves detection sensitivity for the target pathogens and can also detect pathogens for which no probe was designed, widening the detection range.

Drawings

FIG. 1 is a graph showing a comparison of the results of measuring a sample of plasma from a patient that is positive for BK virus in an example of the present invention.

Detailed Description

The technical solutions of the present invention are described in detail below with reference to the drawings and the embodiments, but the present invention is not limited to the scope of the embodiments.

Where specific conditions are not stated in the following examples, the experimental methods follow conventional methods and conditions or the manufacturer's instructions. The reagents and starting materials used in the present invention are commercially available.
