Method for designing hypercholesterolemia pathogenic gene screening probes and gene chip thereof

Document No.: 1731658 Publication date: 2019-12-20

Reading note: This technology, "Method for designing hypercholesterolemia pathogenic gene screening probes and gene chip thereof" (一种高胆固醇血症致病基因筛查探针设计方法及其基因芯片), was designed and created by Wang Lüya (王绿娅), Jiang Long (江龙) and Zhang Feng (张峰) on 2018-12-25. Its main content is as follows: The invention discloses a method for designing hypercholesterolemia pathogenic gene screening probes and a gene chip thereof. The method comprises 3 steps: step 1, probe design; step 2, capture library construction and sequencing; and step 3, bioinformatics analysis. The number of probe sets is 7, wherein probe set 1 is JL3_1, probe set 2 is JL_1, probe set 3 is JL4_1, probe set 5 is JL1_1, probe set 6 is JL2_1 and probe set 7 is JL-UTR_1; the total number of probes is 12944, the probe size is 1.012 Mbp, and the recommended minimum sequencing amount per sample is 202.577 Mbp. The gene chip comprises a solid phase carrier and probes fixed on the solid phase carrier. The invention systematically collates and summarizes all pathogenic genes and SNPs to map out a complete screening target region, and can significantly improve the detection rate of familial hypercholesterolemia patients in China.

1. A method for designing a hypercholesterolemia pathogenic gene screening probe, comprising the following steps:

step 1, probe design, wherein the number of probe sets is 7, probe set 1 is JL3_1, probe set 2 is JL_1, probe set 3 is JL4_1, probe set 5 is JL1_1, probe set 6 is JL2_1 and probe set 7 is JL-UTR_1, the total number of probes is 12944, the probe size is 1.012 Mbp, and the recommended minimum sequencing amount per sample is 202.577 Mbp;

step 2, capture library construction and sequencing, wherein the DNA is first efficiently enriched and then sequenced at high throughput and high depth on an Illumina platform, and the Agilent SureSelect XT Custom kit is used for the library construction experiment;

step 3, bioinformatics analysis, wherein quality control is performed on the raw sequencing reads, followed by sequence alignment, mutation detection and annotation, and then shared mutation analysis, site-based significance analysis and gene-based significance analysis.

2. The method as claimed in claim 1, wherein the capture library construction and sequencing of step 2 further comprises:

step 201, DNA sample testing, which mainly uses 2 methods: 1) agarose gel electrophoresis, to analyze the degree of DNA degradation and to check for RNA and protein contamination; 2) Qubit, to accurately quantify the DNA concentration, wherein DNA samples containing more than 0.5 μg of DNA are used to construct a library;

step 202, capture and library construction, wherein genomic DNA is randomly sheared by a Covaris instrument into fragments of 180-280 bp, adapters are ligated to both ends of the fragments after end repair and A-tailing to prepare a DNA library, libraries carrying specific indexes are pooled and hybridized in liquid phase with biotin-labeled probes, the specific target regions of the genome are captured using streptavidin-coated magnetic beads, linear amplification is performed by PCR, and the library is quality-checked and sequenced once it passes;

step 203, library quality inspection, wherein after library construction Qubit 2.0 is first used for preliminary quantification, the Agilent 2100 is then used to check the insert size of the library, and once the insert size meets expectations, qPCR is used to accurately quantify the effective concentration (3 nM) of the library to ensure library quality;

step 204, on-machine sequencing, wherein Illumina HiSeq4000 PE150 sequencing is performed according to the effective concentration of the qualified library and the required data output.

3. The method as claimed in claim 1, wherein the raw data quality control of step 3 further comprises:

the raw sequencing reads contain adapter-containing and low-quality reads. To ensure the quality of the information analysis, the raw reads are strictly filtered to obtain clean reads, on which all subsequent analysis is based, the filtering specifically comprising:

1) removing read pairs that contain adapters;

2) removing a read pair when the proportion of N (N denotes a base that could not be determined) in either single-end read is greater than 10%;

3) removing a read pair when low-quality bases (quality value below 5) make up more than 50% of the length of either single-end read.

4. The method as claimed in claim 1, wherein the sequence alignment of step 3 further comprises: aligning the valid sequencing data to the reference genome with BWA to obtain initial alignment results in BAM format, then sorting the alignment results with SAMtools; then marking duplicate reads with Picard, and finally using the duplicate-marked alignment results to compute coverage and depth statistics.

5. The method as claimed in claim 1, wherein the mutation detection and annotation of step 3 further comprises:

on the basis of the alignment results, identifying SNP sites and InDels with SAMtools, and filtering them using preset filtering criteria, wherein the filtering criteria are as follows:

a) filtering against the 1000 Genomes Project (1000G) database to remove sites that are polymorphic among individuals and obtain rare mutations that may truly be pathogenic: variant sites with a frequency below 0.01 in 1000G are retained;

b) variants in exonic regions or splice-site regions (splicing, within 10 bp upstream or downstream) are retained;

c) synonymous mutations are removed, leaving mutations that affect the gene expression product;

d) based on the 4 prediction tools SIFT, Polyphen, MutationTaster and CADD, a site is retained only if at least half of the 4 tools support it as possibly deleterious; after mutation screening and statistics, the retained SNVs and InDels are each annotated with mutation information, wherein the SNV and InDel annotations include information from the 1000 Genomes Project, ExAC, Novo-Zhonghua and other existing databases, and the annotation content comprises 6 parts: priority information, gene and region annotation, database frequency annotation, conservation and deleteriousness prediction, mutation site information, and gene function and pathway annotation.

6. The method as claimed in claim 1, wherein the shared mutation analysis of step 3 further comprises: on the basis of site filtering, screening for mutated genes shared among samples, the threshold being genes shared by 10% of patients; if the number of samples corresponding to 10% of patients is not an integer, it is rounded up, for example, if 10% of the samples is 19.2, mutated genes shared by at least 20 samples are selected; this parameter can be adjusted according to actual conditions.

7. The method as claimed in claim 1, wherein the site-based significance analysis of step 3 further comprises: performing association analysis on the mutation sites with PLINK, and using Fisher's exact test to find sites that differ significantly between patients and normal controls: the P value and OR value of each SNP site are calculated, and significantly associated mutation sites are selected by the association P value, these being the mutation sites associated with familial hypercholesterolemia.

8. The method as claimed in claim 1, wherein the gene-based significance analysis of step 3 further comprises: on the basis of site filtering, performing burden analysis of mutations at the gene level using the SKAT algorithm, so as to mine disease-associated mutations at the gene level, which facilitates the discovery of rare mutations associated with the disease.

9. A gene chip for screening hypercholesterolemia pathogenic genes, characterized by comprising a solid phase carrier and probes fixed on the solid phase carrier, wherein the sequences of the probes are shown as sequences 1-158 in the sequence listing.

10. The gene chip of claim 9, wherein the solid phase carrier is selected from any one of a glass slide, a silicon wafer, a membrane or a polymer material, and the membrane is selected from any one of a nitrocellulose membrane, a nylon membrane and a polystyrene membrane.

Technical Field

The invention belongs to the field of gene diagnosis, relates to a gene chip, and particularly relates to a method for designing a probe for screening a hypercholesterolemia pathogenic gene and a gene chip thereof.

Background

Familial hypercholesterolemia (FH, MIM #143890), also known as hereditary hypercholesterolemia, is a common and serious dominantly inherited disease and is internationally recognized as a global disease. Clinically it is classified as homozygous or heterozygous. Homozygous FH patients are rare but often show extreme clinical features: plasma cholesterol is 6-8 times that of normal individuals, xanthomas appear at multiple skin sites, and systemic atherosclerosis (As) can develop in childhood and progress to death from myocardial infarction. In heterozygous FH patients atherosclerosis progresses rapidly, with an approximately 80-fold increase in the risk of death from coronary heart disease in untreated patients between the ages of 20 and 29. Early lipid-lowering intervention effectively prevents the progression of As, yet most patients currently miss the opportunity for early intervention because they are not identified early. The prevalence of FH was previously thought to be 1/500 in the general population, but recent reports put it at 1/200 in France, Canada, Lebanon and Finland, and as high as 1/137 in Denmark. In China, however, the total number of reported homozygous and heterozygous FH patients is fewer than several hundred, less than the number reported in Hong Kong. This does not mean that the prevalence of FH in China is low; rather, awareness of the harm caused by FH is insufficient. As the most populous country in the world, China may bear a heavier genetic burden of FH than other countries, but the genetic background of Chinese FH patients is still unclear. In-depth research on the etiology and molecular mechanisms of FH is therefore important for the early diagnosis and early intervention of FH patients.

FH is a monogenic disease, that is, a genetic disease controlled by a single pair of alleles. To date, 3 FH pathogenic genes have been discovered: (1) the low-density lipoprotein receptor (LDL-R), which mediates about 70% of LDL entry into cells; mutations in this gene are the most common, accounting for about 70% of known mutations; (2) apolipoprotein B100 (apoB100), the ligand required for LDL to bind to its receptor, mutations in which account for 15%; (3) proprotein convertase subtilisin/kexin type 9 (PCSK9), which promotes degradation of the LDL-R protein and in which mutations are rare. These 3 pathogenic genes all exert important physiological functions centered on LDL-R and are key genes of cholesterol metabolism; a mutation in any one of them alters the structure and function of the encoded protein, fundamentally changing cholesterol metabolism and triggering FH.

Although pathogenic mutations can be detected by methods such as whole-genome sequencing and whole-exome sequencing, these methods are expensive and time-consuming and are not suitable for use in large populations. With the wide application of high-throughput next-generation sequencing in research and clinical settings, sequencing of genomic target regions and candidate gene regions in large sample sets has become possible, and researchers can sequence hundreds or even thousands of samples for a chromosomal region or a large number of candidate gene regions of interest. Target Region Sequencing (TRS) is a next-generation sequencing method that performs targeted capture sequencing of known pathogenic sites, genes or disease-related genomic segments. It is widely used in disease research, clinical diagnosis and gene screening, and has the following characteristics:

(1) highly targeted: compared with research at the whole-genome level, target region sequencing is more focused, relying on candidate chromosomal regions, or large numbers of candidate genes in biological pathways, obtained from extensive earlier research;

(2) the cost is low: compared with exome sequencing, the sequencing region of the target region is smaller, and hundreds of samples can be rapidly sequenced, so that the research cost is greatly reduced;

(3) the information amount is large: compared with a research strategy of SNP typing of a target region or a candidate gene haplotype label, the sequencing of the target region can completely cover the whole gene region, not only can obtain the typing data of high-frequency SNP, but also can discover low-frequency and individual specific variation;

(4) the efficiency is high: compared with a candidate gene sequencing method using a Sanger method, the target region sequencing based on the second generation sequencing technology is quicker and more efficient.

However, because the pathogenic genes differ between diseases, a targeted-capture next-generation sequencing chip has to be custom-built for each disease. So far there is no targeted-capture next-generation sequencing chip designed specifically for Chinese patients with familial hypercholesterolemia, and this is a problem that urgently needs to be solved.

Disclosure of Invention

The invention aims to provide a method for screening hypercholesterolemia pathogenic genes and a gene chip thereof. Because FH carries a high risk of coronary heart disease, and China, as the most populous country in the world, is likely to bear a heavier genetic burden of FH than other countries, an effective method is needed to screen for possible FH patients; targeted-capture next-generation sequencing, however, generally requires a capture chip customized for each disease. We therefore systematically reviewed all pathogenic mutant genes and SNPs associated with familial hypercholesterolemia and screened all genes and SNPs associated with lipid metabolism. Pathogenic genes and SNPs of early-onset coronary heart disease were screened as well. All pathogenic genes and SNPs were then systematically collated and summarized to map out a complete screening target region. The gene chip of the invention offers simple operation, high detection specificity, good stability, short turnaround time and low cost.

The technical scheme adopted by the invention is as follows:

a method for designing a hypercholesterolemia pathogenic gene screening probe, comprising the following steps:

step 1, probe design, wherein the number of probe sets is 7, probe set 1 is JL3_1, probe set 2 is JL_1, probe set 3 is JL4_1, probe set 5 is JL1_1, probe set 6 is JL2_1 and probe set 7 is JL-UTR_1, the total number of probes is 12944, the probe size is 1.012 Mbp, and the recommended minimum sequencing amount per sample is 202.577 Mbp;

step 2, capture library construction and sequencing, wherein the DNA is first efficiently enriched and then sequenced at high throughput and high depth on an Illumina platform, and the Agilent SureSelect XT Custom kit is used for the library construction experiment;

step 3, bioinformatics analysis, wherein quality control is performed on the raw sequencing reads, followed by sequence alignment, mutation detection and annotation, and then shared mutation analysis, site-based significance analysis and gene-based significance analysis.

The capture library construction and sequencing of step 2 further comprises the following steps:

step 201, DNA sample testing, which mainly uses 2 methods: 1) agarose gel electrophoresis, to analyze the degree of DNA degradation and to check for RNA and protein contamination; 2) Qubit, to accurately quantify the DNA concentration, wherein DNA samples containing more than 0.5 μg of DNA are used to construct a library;

step 202, capture and library construction, wherein genomic DNA is randomly sheared by a Covaris instrument into fragments of 180-280 bp, adapters are ligated to both ends of the fragments after end repair and A-tailing to prepare a DNA library, libraries carrying specific indexes are pooled and hybridized in liquid phase with biotin-labeled probes, the specific target regions of the genome are captured using streptavidin-coated magnetic beads, linear amplification is performed by PCR, and the library is quality-checked and sequenced once it passes;

step 203, library quality inspection, wherein after library construction Qubit 2.0 is first used for preliminary quantification, the Agilent 2100 is then used to check the insert size of the library, and once the insert size meets expectations, qPCR is used to accurately quantify the effective concentration (3 nM) of the library to ensure library quality;

step 204, on-machine sequencing, wherein Illumina HiSeq4000 PE150 sequencing is performed according to the effective concentration of the qualified library and the required data output.

The raw data quality control of step 3 further comprises: the raw sequencing reads contain adapter-containing and low-quality reads. To ensure the quality of the information analysis, the raw reads are strictly filtered to obtain clean reads, on which all subsequent analysis is based, the filtering specifically comprising:

1) removing read pairs that contain adapters;

2) removing a read pair when the proportion of N (N denotes a base that could not be determined) in either single-end read is greater than 10%;

3) removing a read pair when low-quality bases (quality value below 5) make up more than 50% of the length of either single-end read.

The sequence alignment of step 3 further comprises: aligning the valid sequencing data to the reference genome with BWA to obtain initial alignment results in BAM format, then sorting the alignment results with SAMtools; then marking duplicate reads with Picard, and finally using the duplicate-marked alignment results to compute coverage and depth statistics.

The mutation detection and annotation of step 3 further comprises: on the basis of the alignment results, identifying SNP sites and InDels with SAMtools, and filtering them using preset filtering criteria, wherein the filtering criteria are as follows:

a) filtering against the 1000 Genomes Project (1000G) database to remove sites that are polymorphic among individuals and obtain rare mutations that may truly be pathogenic: variant sites with a frequency below 0.01 in 1000G are retained;

b) variants in exonic regions or splice-site regions (splicing, within 10 bp upstream or downstream) are retained;

c) synonymous mutations are removed, leaving mutations that affect the gene expression product;

d) based on the 4 prediction tools SIFT, Polyphen, MutationTaster and CADD, a site is retained only if at least half of the 4 tools support it as possibly deleterious; after mutation screening and statistics, the retained SNVs and InDels are each annotated with mutation information, wherein the SNV and InDel annotations include information from the 1000 Genomes Project, ExAC, Novo-Zhonghua and other existing databases, and the annotation content comprises 6 parts: priority information, gene and region annotation, database frequency annotation, conservation and deleteriousness prediction, mutation site information, and gene function and pathway annotation.

The shared mutation analysis of step 3 further comprises: on the basis of site filtering, screening for mutated genes shared among samples, the threshold being genes shared by 10% of patients; if the number of samples corresponding to 10% of patients is not an integer, it is rounded up, for example, if 10% of the samples is 19.2, mutated genes shared by at least 20 samples are selected; this parameter can be adjusted according to actual conditions.

The site-based significance analysis of step 3 further comprises: performing association analysis on the mutation sites with PLINK, and using Fisher's exact test to find sites that differ significantly between patients and normal controls: the P value and OR value of each SNP site are calculated, and significantly associated mutation sites are selected by the association P value, these being the mutation sites associated with familial hypercholesterolemia.

The gene-based significance analysis of step 3 further comprises: on the basis of site filtering, performing burden analysis of mutations at the gene level using the SKAT algorithm, so as to mine disease-associated mutations at the gene level, which facilitates the discovery of rare mutations associated with the disease.

A gene chip for screening hypercholesterolemia pathogenic genes, characterized by comprising a solid phase carrier and probes fixed on the solid phase carrier, wherein the sequences of the probes are shown as sequences 1-158 in the sequence listing.

The solid phase carrier is selected from any one of glass slides, silicon wafers, membranes or high polymer materials, and the membranes are selected from any one of nitrocellulose membranes, nylon membranes and polystyrene membranes.

The technical scheme of the invention can obviously improve the detection rate of familial hypercholesterolemia patients in China.

Drawings

FIG. 1 is a schematic flow chart of the method for designing the hypercholesterolemia pathogenic gene screening probe;

FIG. 2 is a schematic diagram of an example of a probe report section;

FIG. 3 is a schematic diagram of a capture library construction process.

Detailed Description

For better illustrating the present invention, the technical solution will be further described with reference to the specific embodiments and the drawings attached to the specification. Although the present invention has been described in detail with reference to the embodiments, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

A schematic flow chart of the method for designing a hypercholesterolemia pathogenic gene screening probe is shown in FIG. 1. The gene screening method is divided into 3 steps: step 1, probe design; step 2, capture library construction and sequencing; and step 3, bioinformatics analysis.

The capture probes are designed according to the Agilent probe design method, taking the genes and loci in the gene list as capture targets; part of the probe report is shown in FIG. 2. The number of probe sets is 7, where probe set 1 is JL3_1, probe set 2 is JL_1, probe set 3 is JL4_1, probe set 5 is JL1_1, probe set 6 is JL2_1 and probe set 7 is JL-UTR_1; the total number of probes is 12944, the probe size is 1.012 Mbp, and the recommended minimum sequencing amount per sample is 202.577 Mbp.

FIG. 3 is a schematic flow chart of the capture and library creation in step 2. Firstly, DNA is efficiently enriched, and then high-throughput and high-depth sequencing is carried out on an Illumina platform. The Agilent SureSelect XT Custom kit is adopted in the library building and capturing experiment, the reagents and consumables recommended by the kit specification are strictly used, and the operation is carried out according to the latest optimized experiment process.

Step 201, DNA sample testing. DNA samples are tested mainly by 2 methods: 1) agarose gel electrophoresis, to analyze the degree of DNA degradation and to check for RNA and protein contamination; 2) Qubit, to accurately quantify the DNA concentration. DNA samples containing more than 0.5 μg of DNA are used to construct the library.

Step 202, capture and library construction. Genomic DNA is randomly sheared by a Covaris instrument into fragments of 180-280 bp, and adapters are ligated to both ends of the fragments after end repair and A-tailing to prepare a DNA library. Libraries carrying specific indexes are pooled and hybridized in liquid phase with biotin-labeled probes, the specific target regions of the genome are captured using streptavidin-coated magnetic beads, and linear amplification is performed by PCR. The library is then quality-checked and sequenced once it passes.

Step 203, library quality inspection. After library construction, Qubit 2.0 is first used for preliminary quantification, and the Agilent 2100 is then used to check the insert size of the library. Once the insert size meets expectations, qPCR is used to accurately quantify the effective concentration (3 nM) of the library to ensure library quality.

Step 204, on-machine sequencing. Illumina HiSeq4000 PE150 sequencing is performed according to the effective concentration of the qualified library and the required data output.
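As a rough illustration of the data output requirement, the sketch below converts the 202.577 Mbp per-sample figure and the 1.012 Mbp probe size from step 1 into a minimum number of PE150 read pairs and a fold ratio over the probe size. The function names are hypothetical, and reading the ratio as an approximate raw depth is an assumption that ignores on-target rate and duplicates.

```python
# Values taken from the probe design report in step 1.
PROBE_SIZE_BP = 1_012_000             # 1.012 Mbp
MIN_BASES_PER_SAMPLE = 202_577_000    # 202.577 Mbp per sample
READ_LENGTH_BP = 150                  # PE150: two 150 bp reads per pair


def min_read_pairs(total_bases: int, read_length: int = READ_LENGTH_BP) -> int:
    """Smallest number of read pairs whose bases sum to at least total_bases."""
    bases_per_pair = 2 * read_length
    return -(-total_bases // bases_per_pair)  # integer ceiling division


def fold_over_probe_size(total_bases: int, probe_size: int) -> float:
    """Ratio of sequenced bases to probe size (a crude upper bound on depth)."""
    return total_bases / probe_size


pairs = min_read_pairs(MIN_BASES_PER_SAMPLE)
fold = fold_over_probe_size(MIN_BASES_PER_SAMPLE, PROBE_SIZE_BP)
print(f"read pairs needed: {pairs}, fold over probe size: {fold:.1f}")
```

The ratio comes out at roughly 200, which suggests the recommended figure corresponds to about 200 times the probe size; the source does not state this explicitly.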

Next, step 3, bioinformatics analysis, is performed. It comprises the following steps:

Step 301, raw data quality control. The raw sequencing reads contain adapter-containing and low-quality reads. To ensure the quality of the information analysis, the raw reads are strictly filtered to obtain clean reads, on which all subsequent analysis is based. The filtering specifically comprises:

1) removing read pairs that contain adapters;

2) removing a read pair when the proportion of N (N denotes a base that could not be determined) in either single-end read is greater than 10%;

3) removing a read pair when low-quality bases (quality value below 5) make up more than 50% of the length of either single-end read.
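The three filtering rules of step 301 can be sketched as a small predicate over a read pair. This is a minimal illustration rather than the filtering software actually used: the tuple layout and function names are hypothetical, and adapter detection is assumed to have been done upstream and passed in as a flag.

```python
from typing import Sequence, Tuple

# A read is (sequence, per-base quality values, adapter_found); layout is hypothetical.
Read = Tuple[str, Sequence[int], bool]


def read_fails(seq: str, quals: Sequence[int], has_adapter: bool,
               max_n_frac: float = 0.10, low_q: int = 5,
               max_low_frac: float = 0.50) -> bool:
    """Return True if a single read violates any of the three rules."""
    if has_adapter:                                   # rule 1: adapter present
        return True
    if not seq:                                       # empty read: treat as failing
        return True
    if seq.upper().count("N") / len(seq) > max_n_frac:  # rule 2: more than 10% N
        return True
    low = sum(1 for q in quals if q < low_q)          # bases with quality below 5
    return low / len(seq) > max_low_frac              # rule 3: more than 50% low quality


def keep_pair(read1: Read, read2: Read) -> bool:
    """A pair is kept only if neither of its reads fails."""
    return not (read_fails(*read1) or read_fails(*read2))


good = ("ACGT" * 5, [30] * 20, False)
too_many_n = ("NNN" + "A" * 17, [30] * 20, False)     # 15% N
low_quality = ("A" * 20, [2] * 11 + [30] * 9, False)  # 55% low-quality bases
print(keep_pair(good, good), keep_pair(good, too_many_n), keep_pair(low_quality, good))
```

Because the rules are applied per read but the decision is made per pair, a single bad mate is enough to drop both reads, which keeps the two FASTQ files in sync.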

Step 302, sequence alignment. The valid sequencing data are aligned to the reference genome with BWA, giving initial alignment results in BAM format. The alignment results are then sorted with SAMtools, duplicate reads are marked with Picard (mark duplicate reads), and the duplicate-marked alignment results are used to compute coverage and depth statistics.
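One way the tool chain of step 302 might be strung together is sketched below as a function that only builds the shell command lines and does not run them. Several details are assumptions not given in the text: the `bwa mem` algorithm, the availability of a `picard` wrapper script on the PATH, the thread count, and all file names.

```python
import shlex
from typing import List


def alignment_commands(ref: str, fq1: str, fq2: str, sample: str,
                       threads: int = 4) -> List[str]:
    """Build (but do not execute) shell commands for align, sort, mark, depth."""
    q = shlex.quote
    raw = f"{sample}.bam"
    sorted_bam = f"{sample}.sorted.bam"
    marked = f"{sample}.markdup.bam"
    return [
        # BWA writes SAM to stdout; samtools view -b converts it to BAM.
        f"bwa mem -t {threads} {q(ref)} {q(fq1)} {q(fq2)} "
        f"| samtools view -b -o {q(raw)} -",
        # Coordinate-sort the alignments.
        f"samtools sort -o {q(sorted_bam)} {q(raw)}",
        # Mark duplicate reads (assumes a 'picard' wrapper is installed).
        f"picard MarkDuplicates I={q(sorted_bam)} O={q(marked)} "
        f"M={q(sample + '.dup_metrics.txt')}",
        # Per-base depth, from which coverage and mean depth can be derived.
        f"samtools depth {q(marked)} > {q(sample + '.depth.txt')}",
    ]


commands = alignment_commands("ref.fa", "S1_1.fq.gz", "S1_2.fq.gz", "S1")
for cmd in commands:
    print(cmd)
```

Keeping command construction separate from execution makes it easy to inspect or log the exact invocations before handing them to a scheduler or `subprocess`.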

Step 303, mutation detection and annotation. On the basis of the alignment results, SNP sites and InDels are identified with SAMtools and filtered using internationally accepted filtering criteria. The specific criteria are as follows:

a) variant sites found in the 1000 Genomes Project (1000G) database at a population frequency above 0.01 are filtered out, removing sites that are polymorphic among individuals and leaving rare mutations that may truly be pathogenic: variant sites with a frequency below 0.01 in 1000G are retained;

b) variants in exonic regions or splice-site regions (splicing, within 10 bp upstream or downstream) are retained;

c) synonymous mutations (mutations that do not change the encoded amino acid) are removed, leaving mutations that affect the gene expression product;

d) based on the 4 prediction tools SIFT, Polyphen, MutationTaster and CADD, a site is retained only if at least half of the 4 tools support it as possibly deleterious (for example, if a site is predicted as 'SIFT 0.07, T', 'Polyphen2-HVAR 0.923, D, Polyphen2-HDIV 0.999, D', 'MutationTaster 1.000, N', 'CADD', then the proportion of tools supporting a deleterious call is 1/3, which is less than half, and the site is discarded). After mutation screening and statistics, the retained SNVs and InDels are each annotated with mutation information. The SNV and InDel annotations include information from the 1000 Genomes Project, ExAC, Novo-Zhonghua and other existing databases, and the annotation content comprises 6 parts: priority information, gene and region annotation, database (frequency) annotation, conservation (deleteriousness) prediction, mutation site information, and gene function and pathway annotation.
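Criteria a) to d) amount to a chain of four tests on each annotated variant, which the sketch below expresses in plain Python. The `Variant` record and its field names are hypothetical. Two behaviours are assumptions: a tool without a prediction is left out of the denominator (consistent with the 1/3 in the worked example), and a variant with no predictions at all is discarded.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Variant:
    freq_1000g: Optional[float]             # None means absent from 1000G
    region: str                             # e.g. "exonic", "splicing", "intronic"
    effect: str                             # e.g. "missense", "synonymous"
    predictions: Dict[str, Optional[bool]]  # tool -> deleterious? (None = no call)


def majority_deleterious(predictions: Dict[str, Optional[bool]]) -> bool:
    """True if at least half of the tools that made a call say 'deleterious'."""
    calls = [c for c in predictions.values() if c is not None]
    # Assumption: with no calls at all the site is not retained.
    return bool(calls) and 2 * sum(calls) >= len(calls)


def passes_filters(v: Variant, max_freq: float = 0.01) -> bool:
    rare = v.freq_1000g is None or v.freq_1000g < max_freq   # criterion a)
    in_region = v.region in ("exonic", "splicing")           # criterion b)
    non_synonymous = v.effect != "synonymous"                # criterion c)
    return rare and in_region and non_synonymous and majority_deleterious(v.predictions)


# The worked example from criterion d): only Polyphen calls the site deleterious.
example = Variant(0.001, "exonic", "missense",
                  {"SIFT": False, "Polyphen": True, "MutationTaster": False, "CADD": None})
print(passes_filters(example))
```

The example variant passes a) to c) but is rejected at d), matching the 1/3 outcome described in the text.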

Step 304, shared mutation analysis. On the basis of site filtering, mutated genes shared among samples are screened, the threshold being genes shared by 10% of patients (if control samples are available, 90% of the control samples must also carry no deleterious mutation in the gene). If the number of samples corresponding to 10% of patients is not an integer, it is rounded up; for example, if 10% of the samples is 19.2, mutated genes shared by at least 20 samples are selected. This parameter can be adjusted according to actual conditions.
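The rounding rule in step 304 is a ceiling on 10% of the patient count, and the screen itself is a per-gene count of carrying samples. The sketch below illustrates both; the function names, sample IDs and input layout are hypothetical, and the optional control-sample condition is left out for brevity.

```python
from collections import Counter
from typing import Dict, Set


def min_shared_samples(n_patients: int, percent: int = 10) -> int:
    """Ceiling of percent% of n_patients, using integer arithmetic only."""
    return -(-n_patients * percent // 100)


def shared_mutated_genes(genes_by_sample: Dict[str, Set[str]], percent: int = 10) -> Set[str]:
    """Genes carrying a filtered mutation in at least the threshold number of samples."""
    threshold = min_shared_samples(len(genes_by_sample), percent)
    counts = Counter(gene for genes in genes_by_sample.values() for gene in genes)
    return {gene for gene, n in counts.items() if n >= threshold}


# 192 patients -> 10% is 19.2 -> rounded up to 20, as in the text.
print(min_shared_samples(192))

# Toy cohort of 20 patients: the threshold is 2 samples.
cohort = {f"P{i}": set() for i in range(20)}
cohort["P0"] = {"LDLR", "APOB"}
cohort["P1"] = {"LDLR"}
cohort["P2"] = {"LDLR"}
print(shared_mutated_genes(cohort))
```

Integer ceiling division avoids the floating-point surprises that `math.ceil(0.1 * n)` can produce for some values of `n`.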

Step 305, site-based significance analysis. Association analysis is performed on the mutation sites with PLINK, using Fisher's exact test to find sites that differ significantly between patients and normal controls: the P value and OR value of each SNP site are calculated, and significantly associated mutation sites are selected by the association significance (P value). These are the mutation sites associated with familial hypercholesterolemia.
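To make the P value and OR of step 305 concrete, the sketch below re-implements a two-sided Fisher's exact test and the odds ratio for a single 2x2 table in plain Python. It does not call PLINK, and how the counts are tabulated (carriers versus non-carriers, or allele counts) is an assumption left to the caller.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    a, b: patients with / without the variant; c, d: controls with / without.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio (a*d)/(b*c); infinite when b*c is zero."""
    return (a * d) / (b * c) if b * c else float("inf")


p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p:.4f}, OR = {odds_ratio(3, 1, 1, 3):.1f}")
```

With thousands of sites tested, the raw P values would normally need a multiple-testing correction before sites are called significant; the text does not specify which one is used.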

Step 306, gene-based significance analysis. On the basis of site filtering, burden analysis of mutations at the gene level is performed using the SKAT algorithm, so that disease-associated mutations are mined at the gene level, which facilitates the discovery of rare mutations associated with the disease.
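The idea behind gene-level burden analysis can be illustrated with a much simpler stand-in than SKAT: collapse all filtered rare variants in a gene into a single "carrier or not" indicator per sample, then compare carrier counts between patients and controls. The sketch below performs only that collapsing step and is not the SKAT algorithm; the input layout and sample IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def gene_carrier_counts(variant_calls: Iterable[Tuple[str, str]],
                        cases: Set[str], controls: Set[str]) -> Dict[str, Tuple[int, int]]:
    """Map each gene to (number of case carriers, number of control carriers).

    variant_calls: (gene, sample_id) pairs, one per filtered rare variant observed.
    A sample counts once per gene however many variants it carries there.
    """
    carriers: Dict[str, Set[str]] = defaultdict(set)
    for gene, sample in variant_calls:
        carriers[gene].add(sample)
    return {gene: (len(samples & cases), len(samples & controls))
            for gene, samples in carriers.items()}


calls = [("LDLR", "P1"), ("LDLR", "P1"), ("LDLR", "P2"),
         ("LDLR", "C1"), ("PCSK9", "P3")]
counts = gene_carrier_counts(calls, cases={"P1", "P2", "P3"}, controls={"C1", "C2"})
print(counts)
```

Each gene's carrier counts, together with the non-carrier counts, form a 2x2 table that could be tested for association; SKAT instead models the individual variants jointly and does not require them to act in the same direction.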

The invention provides a gene chip for screening hypercholesterolemia pathogenic genes, comprising a solid phase carrier and probes fixed on the solid phase carrier, wherein the sequences of the probes are shown as sequences 1-158 in the sequence listing.

The probe may be modified with 5'-NH2, 5'-SH, 5'-PolyT (A, C or G), 5'-biotin, 3'-NH2, 3'-SH, 3'-PolyT (A, C or G), 3'-biotin, or the like.

In the present invention, the solid phase carrier may be any carrier known in the art, as long as the carrier is compatible with the reactant and does not affect the detection result. Preferably, the solid phase carrier is selected from any one of a glass slide, a silicon wafer, a membrane or a high molecular material. Preferably, the membrane is selected from any one of nitrocellulose membrane, nylon membrane, and polystyrene membrane.

There are two main types of chip preparation method:

1) Spotting method: first, a probe library is prepared, either by selecting specific sequences from a relevant gene database according to the analysis target of the gene chip and amplifying them by PCR, or by directly synthesizing oligonucleotide sequences. The different probe solutions are then deposited point by point at different sites on the surface of a solid phase substrate such as glass or nylon, using special needles and micro-nozzles mounted on a computer-controlled three-coordinate working platform, and are fixed by physical and chemical methods. Every technical link of this method is mature and the method is highly flexible, so it is suitable for research units that prepare their own gene chips of moderate array scale as required;

2) In-situ synthesis: in this method an oligonucleotide probe array is synthesized directly on a hard surface such as glass. The methods currently applied mainly comprise photo-deprotection parallel synthesis, piezoelectric printing synthesis and the like. The probe can be fixed on the solid phase carrier through a linker arm. The linker arm provides free space for the double-stranded portion of the probe and reduces steric hindrance, thereby facilitating the hybridization reaction. The longer the linker arm, the higher the hybridization efficiency. Typical linker arms are 15-30 functional groups in length. The linker arm may be selected from any suitable functional group such as PolyT (A, C or G), chimeras of C-atom or polyethylene glycol with PolyT (A, C or G), polyethylene glycol, polyvinyl alcohol, polyurethane, and combinations thereof. The method relies on template positioning technology with high spatial resolution and DNA chemical synthesis technology with high synthesis yield, is suitable for manufacturing large-scale DNA probe chips, and enables the standardized, large-scale production of probe chips.

The invention also provides a using method of the gene chip, which comprises the following steps:

(a) preparing a sample DNA fragment;

(b) fluorescent labeling of the DNA fragment;

(c) eluting the labeled product;

(d) hybridizing the labeled product with the gene chip;

(e) scanning the hybridization signal of the gene chip to obtain a result.

Preparation of the sample DNA fragments may include an amplification step that uses as template either nucleic-acid-containing cells isolated from the target sample or extracted target nucleic acid. For example, leukocytes are separated from whole blood using magnetic beads, and the target nucleic acid sequence is amplified directly using the separated leukocytes, or nucleic acid extracted from whole blood, as the template. The amplified DNA may carry a fluorescent or biotin label, and the labeled DNA may be used for hybridization without purification.

The sample DNA fragments may be enriched using any suitable amplification method, such as polymerase chain reaction (PCR), multiplex PCR, ligase chain reaction (LCR), rolling circle amplification (RCA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), and transcription-mediated amplification (TMA), among others.

Either the probe or the sample DNA may be labeled. A label can be introduced into the probe during synthesis; for the sample DNA, a label can be introduced during amplification or, by a suitable method, after amplification.

Suitable labels include fluorescent labels, radioisotope labels, chromophores, luminophores, FRET labels, enzymes, biotin, or ligands that have specific binding partners.

Further, the gene chip of the present invention is prepared by an in situ synthesis method. The procedures for preparing gene chips by in situ synthesis are well known to those skilled in the art, and the preparation of the gene chip of the present invention can be accomplished by conventional techniques.

The scope of protection of the present invention is not limited to the above-described embodiments, and it is apparent that those skilled in the art can make various modifications and variations to the present invention without departing from its scope and spirit. The present invention is intended to cover such modifications and variations provided they come within the scope of the appended claims and their equivalents.
