Design method and system of multiple methylation specific PCR primers

Document No.: 1143091  Publication date: 2020-09-11

Reading note: this technology, "Design method and system of multiple methylation specific PCR primers", was designed and created by 易吉 and 李泽卿 on 2020-05-29. Main content: The invention discloses a method and a system for designing multiplex methylation specific PCR primers. The method sets parameters according to user requirements, designs and screens primers for the target methylation sites, pairs the screened primer pairs of the multiple methylation sites two by two and checks their compatibility, selects the compatible multiplex primer combination containing the largest number of primer pairs, and finally evaluates the designed multiplex primer combination and decides whether the primers need to be redesigned. The invention enables multiplex methylation specific PCR primer design on ultra-long sequences, effectively reduces secondary structures such as dimers/hairpins within a single primer pair and between multiple primer pairs as well as non-specific amplification in the genome, yields highly specific primers, improves the operability and accuracy of multiplex methylation specific PCR experiments, and greatly improves working efficiency.

1. A method for designing multiplex methylation specific PCR primers, characterized by comprising the following steps:

S000, setting parameters to generate a primer design environment;

S100, designing primers for the target methylation sites, and screening the designed primers;

S200, pairing the screened primer pairs two by two, checking the compatibility between every two primer pairs, and selecting the compatible multiplex primer combination containing the largest number of primer pairs;

S300, evaluating the designed multiplex primer combination and determining whether primer redesign is needed.

2. The multiplex methylation specific PCR primer design method of claim 1, wherein the step S000 comprises the steps of:

S001, acquiring target site information and creating a target site;

S002, acquiring and establishing primer requirement parameter information, and binding the primer requirement parameter information to the target site;

S003, acquiring a reference sequence, and binding the reference sequence to the target site after CT conversion;

S004, acquiring the shielding sites defined by the user;

S005, acquiring the shielding regions defined by the user;

S006, generating a primer design environment including a site sequence, local environment parameters and global environment parameters from the parameters acquired in the steps S001-S005, and passing the environment parameters to the step S100.

3. The multiplex methylation specific PCR primer design method of claim 1, wherein the step S100 comprises the steps of:

S101, acquiring the primer design environment from the step S000, and designing the corresponding primers;

S102, checking whether the primers obtained in the step S101 contain CG sites and terminal C sites, and deleting unqualified primers;

S103, screening out primers that form a Dimer structure;

S104, screening out primers that form a Hairpin structure;

S105, screening out primers whose ends are located at a shielding site defined by the user;

S106, screening out primers whose ends are located in a shielding region defined by the user;

S107, screening out primers whose primer pair may itself amplify multiple non-specific regions in the reference sequence, and passing the qualified primers, together with the potential binding sites of each single primer in the reference sequence, to the step S200.

4. The multiplex methylation specific PCR primer design method of claim 1, wherein the step S200 comprises the steps of:

S201, for the qualified primers screened in the step S100, computing their potential binding sites in the reference sequence, evaluating the specificity of each primer pair, and selecting the primer pair with the highest specificity; all primers of the sites other than the site of that primer pair are assigned to a first other primer set list, which is passed to the step S202;

S202, performing Dimer evaluation, one by one, between the primers in the first other primer set list obtained in the step S201 and the highest-specificity primer pair selected in the step S201, keeping the primer pairs that form no Dimer structure with it, and passing the resulting second other primer set list to the step S203;

S203, performing non-specific evaluation, one by one, between the primers in the second other primer set list obtained in the step S202 and the highest-specificity primer pair selected in the step S201, keeping the primer pairs that produce no non-specific amplification in the reference sequence together with the highest-specificity primer pair, and passing the resulting third other primer set list to the step S204;

S204, if the primer pairs in the third other primer set list correspond to more than 1 site, sending the third other primer set list back to the step S201 for another cycle; if they correspond to exactly 1 site, sending the third other primer set list to the step S201 to select the highest-specificity primer pair, then combining the highest-specificity primer pairs from all cycles into a compatible multiplex primer pair combination and sending it to the step S300; if they correspond to 0 sites, directly combining the highest-specificity primer pairs from all cycles into a compatible multiplex primer pair combination and sending it to the step S300.

5. The multiplex methylation specific PCR primer design method of claim 1, wherein: in the step S300, the compatible multiplex primer combination delivered by the step S200 is evaluated; if the number of primer pairs is too small or required sites are not included, the sites and primer parameters are modified and the process returns to the step S000 for redesign; if the multiplex primer combination meets the requirements but some target sites are not included, the missing target sites are input into the step S000 again, and primers are redesigned for these sites to form a new multiplex primer combination.

6. A computer device, characterized by: said computer device being programmed to perform the steps of the multiplex methylation specific PCR primer design method of any one of claims 1-5; or a storage medium of said computer device having stored therein a computer program programmed to perform the multiplex methylation specific PCR primer design method of any one of claims 1-5.

7. A multiplex methylation specific PCR primer design system, comprising:

the reference sequence CT conversion and parameter setting module is used for performing C->T conversion on the reference sequence, setting parameters according to user requirements and generating a primer design environment;

the primer design and screening module is used for designing primers for the target sites and screening qualified primers according to user requirements;

the multi-primer combination module is used for carrying out unified investigation on the primers of a plurality of methylation sites and screening out a multi-primer group without non-specific amplification and secondary structure in a genome;

and the report module is used for recording and monitoring all execution processes, generating a process execution report file according to the user requirements, and providing help and support for further screening primers and changing parameters for the user.

8. The multiplex methylation specific PCR primer design system of claim 7, wherein the reference sequence CT transformation and parameter setting module comprises:

a reference sequence CT conversion submodule for performing C->T conversion on the reference sequence, wherein C->T conversion is performed on the positive strand of the reference sequence, and the negative strand of the reference sequence is subjected to C->T conversion and then reverse-complemented;

the parameter setting submodule is used for setting parameters of the primer design and screening process in advance, wherein the parameters comprise the name and the position of a target site on a reference sequence, the design length of the primer, the generation quantity of the primer, the GC content, the annealing temperature and a region and a site which are required to be skipped in the primer screening;

and the environment generation submodule is used for generating a target region sequence and a parameter format required by primer design according to the set parameters and transmitting the target region sequence and the parameter format to the primer design and screening module.

9. The multiplex methylation specific PCR primer design system of claim 7, wherein the primer design and screening module comprises:

the primer design submodule is used for designing primers according to the primer design environment passed in by the reference sequence CT conversion and parameter setting module;

the CG site screening submodule is used for evaluating whether a designed primer contains a CG site that may cause amplification imbalance, and removing primer pairs that may cause amplification imbalance;

the Dimer screening submodule is used for evaluating whether a designed primer forms a Dimer secondary structure that may cause low amplification efficiency or non-productive primer dimers, and removing such primer pairs;

the Hairpin screening submodule is used for evaluating whether a designed primer forms a Hairpin secondary structure that may cause low amplification efficiency, and removing such primer pairs;

the shielding site screening submodule is used for evaluating whether a designed primer contains a shielding site defined by the user, and removing primer pairs that contain a user-defined shielding site;

the shielding region screening submodule is used for evaluating whether a designed primer falls in a shielding region defined by the user, and removing primer pairs that contain a user-defined shielding region;

and the primer pair internal non-specific screening submodule is used for evaluating whether a designed primer pair can amplify a non-specific region in the converted reference sequence, and removing primer pairs that may show non-specific amplification.

10. The multiplex methylation specific PCR primer design system of claim 7, wherein the multiplex primer combination module comprises:

the single primer pair specificity evaluation submodule is used for evaluating the binding sites, in the reference sequence, of the forward and reverse primers of a single primer pair and giving an evaluation value;

the double-primer-pair Dimer evaluation submodule is used for evaluating whether a Dimer secondary structure that may cause low amplification efficiency or non-productive primer dimers forms between two primer pairs, and recording the compatibility between the two primer pairs;

the double-primer-pair non-specific evaluation submodule is used for evaluating whether two primer pairs together amplify a non-specific region in the converted reference sequence, and recording the compatibility between the two primer pairs;

and the multi-primer compatibility evaluation submodule is used for evaluating the compatibility between primers, in ascending order of the evaluation value given by the single primer pair specificity evaluation submodule, by using the double-primer-pair Dimer evaluation submodule and the double-primer-pair non-specific evaluation submodule, and calculating by a statistical method the compatible multiplex primer combination that covers the largest number of target sites.

Technical Field

The invention relates to the field of DNA methylation detection, in particular to a method and a system for designing a multiple methylation specific PCR primer.

Background

Since the Human Genome Project and related studies were carried out, the relationship between human phenotypes and human genome mutations has gradually become clear. As of 2015, about 88 million mutations (84.7 million single nucleotide mutations, 3.6 million short insertions or deletions) and 60,000 structural variations had been identified. In 2001, the Online Mendelian Inheritance in Man database (OMIM™) already held 13,005 entries on research results relating human genome mutations to human disease. Human genome mutations are increasingly used in fields such as medication guidance, drug resistance detection, prenatal testing, and tumor detection.

Epigenetics is the branch of genetics that studies heritable changes in gene expression that occur without changes in the nucleotide sequence of the gene. Epigenetic phenomena include DNA methylation, genomic imprinting, gene silencing, nucleolar dominance, activation of dormant transposons, and RNA editing. Studies have shown that epigenetic biomarkers can also serve as indicators of disease, suggesting disease risk or the health status of the human body. DNA methylation refers to the covalent attachment of a methyl group to the 5-carbon of the cytosine in a genomic CpG dinucleotide under the action of DNA methyltransferase. Numerous studies have shown that DNA methylation can cause changes in chromatin structure, DNA conformation, DNA stability, and the way DNA interacts with proteins, thereby controlling gene expression. DNA methylation markers have been widely used in the medical and health fields. Luyuming et al. used the fact that certain CpG islands on fetal chromosome 21 display a methylation pattern different from that of the corresponding CpG islands on maternal chromosome 21 to detect, before birth, whether the fetus has trisomy 21 (Down syndrome) (CN 101535502A). In addition, the Septin9 gene methylation assay has been identified as the gold standard for colorectal cancer screening (CN201810326484.7, CN108048570A, CN105861672A, CN201610948298.8, CN 201710285809.7). With the widespread use of whole-genome methylation sequencing and whole-genome methylation profiling in humans and other organisms, more and more methylation sites are being identified as specific to clinical or biological phenotypes. The Cancer Genome Atlas (TCGA) program has completed genome-wide methylation maps for about 11,000 samples covering more than 30 cancers, each of which has a large number of clinically significant methylation sites available to guide tumor detection and prevention.

Methods for detecting methylation sites include whole-genome sequencing, whole-genome methylation profiling, multi-gene methylation capture sequencing, fluorescent quantitative PCR, methylation specific PCR, sequencing of methylation amplification products, and the like. However, whole-genome sequencing, whole-genome methylation profiling and multi-gene methylation capture sequencing are aimed at detecting more than 1000 methylation sites, whereas fluorescent quantitative PCR, methylation specific PCR and sequencing of methylation amplification products are aimed at detecting 1-2 target methylation sites, and the conventional DNA methylation specific PCR primer design tools MethPrimer and Methyl Primer Express design primers for a single site only. At present, therefore, detecting multiple methylation sites requires designing a large number of primers for multiple target fragments and screening them repeatedly over many experiments, which is laborious and time-consuming. Multiplex methylation specific PCR can reduce the number of experiments and improve working efficiency, and can also lower the required sample amount and improve technical feasibility. However, because bisulfite treatment converts the unmethylated C in genomic DNA to T, the complexity of the genome is reduced. This lowers primer complexity, increases primer dimers, secondary structures and non-specific amplification, and makes multiplex methylation specific PCR primers harder to design, so conventional general-purpose multiplex PCR primer design software cannot be applied to them.

At present, there is no system or method that can design multiplex PCR primers for multiple methylation sites while ensuring that no secondary structures such as dimers/hairpins form within a single primer pair or between multiple primer pairs, and that amplification in the genome is specific.

Disclosure of Invention

In view of the above, the invention provides a method and a system for designing multiplex methylation specific PCR primers, which can effectively reduce secondary structures such as dimers/hairpins within a single primer pair and between multiple primer pairs, reduce non-specific amplification in chromosomes by a single primer pair and between multiple primer pairs, lower the labor intensity and difficulty of multiplex methylation specific PCR experiments, and improve working efficiency, so that multiple methylation sites can be detected efficiently and accurately.

The technical scheme of the invention is realized as follows:

In a first aspect, the present invention provides a method for designing multiplex methylation specific PCR primers, comprising the following steps:

S000, parameter configuration: setting parameters and generating a primer design environment;

S100, primer design and screening: designing primers for the target methylation sites, and screening the designed primers;

S200, multiplex primer combination calculation: pairing the primer pairs of the multiple sites two by two, checking the compatibility between the primer pairs, and selecting the compatible multiplex primer combination containing the largest number of primer pairs;

S300, primer evaluation and redesign: evaluating the designed multiplex primer combination and deciding whether primer redesign is needed.

On the basis of the above technical solution, preferably, the step S000 includes the following steps:

S001, acquiring target site information and creating a target site;

S002, acquiring and establishing primer requirement parameter information, and binding the primer requirement parameter information to the target site;

S003, acquiring a reference sequence, and binding the reference sequence to the target site after CT conversion;

S004, acquiring the shielding sites defined by the user;

S005, acquiring the shielding regions defined by the user;

S006, generating a primer design environment including a site sequence, local environment parameters and global environment parameters from the parameters acquired in the steps S001-S005, and passing the environment parameters to the step S100.
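By way of non-limiting illustration, steps S001-S006 could be sketched in Python as follows. The names `TargetSite`, `build_environment` and `flank`, the dictionary layout, and the choice to pass in an already CT-converted reference are assumptions made for this sketch and are not taken from the invention.

```python
from dataclasses import dataclass, field


@dataclass
class TargetSite:
    """A target methylation site (S001) with its local primer requirements (S002)."""
    name: str
    position: int                                  # 0-based position on the reference
    params: dict = field(default_factory=dict)     # local parameters for this site
    sequence: str = ""                             # converted window bound in S003


def build_environment(sites, converted_reference, global_params,
                      shielding_sites=(), shielding_regions=(), flank=100):
    """Assemble a primer design environment (S006).

    `converted_reference` is assumed to be CT-converted already (S003).
    `flank` (hypothetical) is the number of bases kept on each side of a site.
    """
    for site in sites:
        start = max(0, site.position - flank)
        end = min(len(converted_reference), site.position + flank + 1)
        site.sequence = converted_reference[start:end]
        # Local parameters override the global defaults.
        site.params = {**global_params, **site.params}
    return {
        "sites": list(sites),
        "global": dict(global_params),
        "shielding_sites": set(shielding_sites),       # S004
        "shielding_regions": list(shielding_regions),  # S005, (start, end) tuples
        "reference": converted_reference,
    }
```

In this sketch the per-site `params` play the role of the local environment parameters and `global` that of the global environment parameters.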

On the basis of the above technical solution, preferably, the step S100 includes the steps of:

S101, primer design: receiving the primer design environment from the step S000, and designing the corresponding primers;

S102, CG site screening: checking whether the primers received from the step S101 contain CG sites and terminal C sites, and deleting unqualified primers;

S103, Dimer screening: screening out primers that form a Dimer structure;

S104, Hairpin screening: screening out primers that form a Hairpin structure;

S105, shielding site screening: screening out primers whose ends are located at a shielding site defined by the user;

S106, shielding region screening: screening out primers whose ends are located in a shielding region defined by the user;

S107, primer pair internal non-specific screening: screening out primers whose primer pair may itself amplify multiple non-specific regions in the reference sequence, and passing the qualified primers, together with the potential binding sites of each single primer in the reference sequence, to the step S200.
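As a hedged illustration of the sequence-based and position-based filters (S102, S105 and S106), a minimal Python sketch is given below. It assumes that "terminal C" means a C at the 3' end of the primer and that the "end" of a primer is its 3' end; the candidate tuple layout `(sequence, start, strand)` and all function names are hypothetical.

```python
def has_undetermined_cytosine(primer):
    """S102: reject primers that contain a CG site or end in C at the 3' end
    (assumed interpretation of 'terminal C site')."""
    p = primer.upper()
    return "CG" in p or p.endswith("C")


def three_prime_end(start, length, strand="+"):
    """Reference coordinate of the 3' end of a primer placed at `start`."""
    return start + length - 1 if strand == "+" else start


def end_is_shielded(end, shielding_sites, shielding_regions):
    """S105/S106: the 3' end hits a shielding site or falls inside a region."""
    if end in shielding_sites:
        return True
    return any(lo <= end <= hi for lo, hi in shielding_regions)


def screen_primers(candidates, shielding_sites=(), shielding_regions=()):
    """Keep candidates (sequence, start, strand) that pass S102, S105 and S106."""
    sites = set(shielding_sites)
    kept = []
    for seq, start, strand in candidates:
        if has_undetermined_cytosine(seq):
            continue
        end = three_prime_end(start, len(seq), strand)
        if end_is_shielded(end, sites, shielding_regions):
            continue
        kept.append((seq, start, strand))
    return kept
```

The Dimer, Hairpin and non-specific checks (S103, S104, S107) would be applied to the survivors of this filter in the same manner.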

On the basis of the above technical solution, preferably, the step S200 includes the following steps:

S201, single primer pair specificity evaluation: for the qualified primers screened in the step S100, computing their potential binding sites in the reference sequence, evaluating the specificity of each primer pair, and selecting the primer pair with the highest specificity (the seed primer pair); all primers of the sites other than the site of that primer pair are assigned to a first other primer set list, which is passed to the step S202;

S202, double-primer-pair Dimer evaluation: performing Dimer evaluation, one by one, between the primers in the first other primer set list obtained in the step S201 and the highest-specificity primer pair selected in the step S201, keeping the primer pairs that form no Dimer structure with it, and passing the resulting second other primer set list to the step S203;

S203, double-primer-pair non-specific evaluation: performing non-specific evaluation, one by one, between the primers in the second other primer set list obtained in the step S202 and the highest-specificity primer pair selected in the step S201, keeping the primer pairs that produce no non-specific amplification in the reference sequence together with the highest-specificity primer pair, and passing the resulting third other primer set list to the step S204;

S204, multi-primer compatibility evaluation: if the primer pairs in the third other primer set list correspond to more than 1 site, sending the third other primer set list back to the step S201 for another cycle; if they correspond to exactly 1 site, sending the third other primer set list to the step S201 to select the highest-specificity primer pair, then combining the highest-specificity primer pairs from all cycles into a compatible multiplex primer pair combination and sending it to the step S300; if they correspond to 0 sites, directly combining the highest-specificity primer pairs from all cycles into a compatible multiplex primer pair combination and sending it to the step S300.
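The cycle S201-S204 amounts to a greedy, seed-based selection, which could be sketched as below. The sketch assumes that each primer pair is a dictionary with a `site` key and a `binding_sites` count in which a lower count means higher specificity, and that the two pairwise checks are supplied as callables; these names and conventions are illustrative assumptions, not the invention's own data model.

```python
def select_compatible_set(pairs, has_dimer, has_nonspecific):
    """Greedy sketch of S201-S204.

    pairs           : list of dicts with keys 'id', 'site', 'binding_sites'
    has_dimer       : callable(seed, other) -> bool   (S202)
    has_nonspecific : callable(seed, other) -> bool   (S203)
    """
    remaining = list(pairs)
    chosen = []
    while remaining:
        # S201: the most specific remaining pair becomes the seed.
        seed = min(remaining, key=lambda p: p["binding_sites"])
        chosen.append(seed)
        # Drop every other candidate for the seed's own site.
        others = [p for p in remaining if p["site"] != seed["site"]]
        # S202: keep pairs forming no dimer with the seed.
        others = [p for p in others if not has_dimer(seed, p)]
        # S203: keep pairs giving no non-specific product with the seed.
        others = [p for p in others if not has_nonspecific(seed, p)]
        # S204: loop again while candidates remain, otherwise stop.
        remaining = others
    return chosen
```

Because every surviving candidate has been checked against each earlier seed, the returned pairs are mutually compatible under the two supplied checks. A greedy pass of this kind does not by itself guarantee that the result is the largest possible compatible set.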

On the basis of the above technical solution, preferably, in the step S300, the compatible multiplex primer combination delivered by the step S200 is evaluated. If the number of primer pairs is too small or required sites are not included, the sites and primer parameters are modified and the process returns to the step S000 for redesign. If the multiplex primer combination meets the requirements but some target sites are not included, the missing target sites are input into the step S000 again, and primers are redesigned for these sites to form a new multiplex primer combination.
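The decision made in step S300 could be expressed, purely as an illustration, by the following Python sketch. The threshold `min_pairs`, the `required_sites` argument and the action labels are hypothetical; the text above does not specify how "too small" is quantified.

```python
def evaluate_combination(chosen_sites, target_sites, required_sites=(), min_pairs=2):
    """Sketch of S300: decide whether to accept, extend, or redesign.

    chosen_sites   : sites covered by the compatible multiplex combination
    target_sites   : all sites the user asked for
    required_sites : sites that must be covered (hypothetical parameter)
    min_pairs      : smallest acceptable number of primer pairs (hypothetical)
    """
    chosen = set(chosen_sites)
    missing_required = set(required_sites) - chosen
    if len(chosen) < min_pairs or missing_required:
        # Too few pairs or a required site is missing: go back to S000
        # with modified site and primer parameters.
        return {"action": "redesign_all", "sites": sorted(set(target_sites))}
    leftover = sorted(set(target_sites) - chosen)
    if leftover:
        # Acceptable set, but some targets are uncovered: design a new
        # multiplex combination for the remaining sites only.
        return {"action": "design_additional_set", "sites": leftover}
    return {"action": "accept", "sites": []}
```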

In a second aspect, the present invention provides a computer device programmed to perform the steps of the multiplex methylation specific PCR primer design method of the first aspect of the invention; or a storage medium of said computer device having stored therein a computer program programmed to perform the multiplex methylation specific PCR primer design method of the first aspect of the invention.

In a third aspect, the present invention provides a computer readable storage medium having stored therein a computer program programmed to perform the methylation specific PCR primer design method of the first aspect of the invention.

In a fourth aspect, the invention provides a multiple methylation specificity PCR primer design system, which comprises a reference sequence CT conversion and parameter setting module, a primer design and screening module, a multiple primer combination module and a report module, and specifically comprises the following components:

the reference sequence CT conversion and parameter setting module is used for performing C->T conversion on the reference sequence, setting parameters such as the target sites and the primer length according to user requirements, and generating a primer design environment;

the primer design and screening module is used for designing primers for the target sites and screening qualified primers according to user requirements;

the multi-primer combination module is used for carrying out unified investigation on the primers of a plurality of methylation sites and screening out a multi-primer group without non-specific amplification and secondary structure in a genome;

and the report module is used for recording and monitoring all execution processes, generating process execution report files according to user requirements, wherein the process execution report files comprise a primer design report, a screening report, a multiple combination report and the like, and providing help and support for further screening primers and changing parameters for users.

On the basis of the above technical solution, preferably, the reference sequence CT conversion and parameter setting module includes a reference sequence CT conversion sub-module, a parameter setting sub-module, and an environment generation sub-module, which are as follows:

the reference sequence CT conversion submodule is used for performing C->T conversion on the reference sequence: unmethylated C in the genome is converted to T, while methylated C remains unchanged. Because the genome has a positive strand and a negative strand, non-CpG C->T conversion is performed on the positive strand of the reference sequence, and the negative strand of the reference sequence is subjected to non-CpG C->T conversion and then reverse-complemented;
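A minimal Python sketch of the two-strand conversion described above is given here for illustration. It assumes that every C in a CpG context is treated as methylated (and therefore kept) and that the input contains only A, C, G and T; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def convert_non_cpg_c(seq):
    """Replace every C that is not followed by G with T; CpG cytosines are kept."""
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        out.append("T" if base == "C" and not in_cpg else base)
    return "".join(out)


def convert_reference(seq):
    """Return (positive, negative) converted references in positive-strand coordinates."""
    positive = convert_non_cpg_c(seq)
    # Negative strand: take the reverse complement, convert it, then
    # reverse-complement again so both results share one coordinate system.
    negative = reverse_complement(convert_non_cpg_c(reverse_complement(seq)))
    return positive, negative
```

For the input `ACCGTCAG`, the positive-strand result is `ATCGTTAG` and the negative-strand result, expressed in positive-strand coordinates, is `ACCGTCAA`, in which only the G outside a CpG context has become A.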

the parameter setting submodule is used for setting parameters of the primer design and screening process in advance, wherein the parameters comprise the name and the position of a target site on a reference sequence, the design length of the primer, the generation quantity of the primer, the GC content, the annealing temperature and a region and a site which are required to be skipped in the primer screening;

and the environment generation submodule is used for generating a target region sequence and a parameter format required by the design of the primer according to the set parameters, and transmitting the target region sequence and the parameter format into the primer design and screening module after the primer design environment is generated.

On the basis of the above technical scheme, preferably, the primer design and screening module includes submodules for primer design, CG site screening, Dimer screening, Hairpin screening, shielded site screening, shielded region screening, and primer pair internal non-specific screening, and specifically includes the following steps:

the primer design submodule is used for designing primers according to the primer design environment passed in by the reference sequence CT conversion and parameter setting module;

the CG site screening submodule is used for evaluating whether a designed primer contains a CG site that may cause amplification imbalance, and removing primer pairs that may cause amplification imbalance;

the Dimer screening submodule is used for evaluating whether a designed primer forms a Dimer secondary structure that may cause low amplification efficiency or non-productive primer dimers, and removing such primer pairs;

the Hairpin screening submodule is used for evaluating whether a designed primer forms a Hairpin secondary structure that may cause low amplification efficiency, and removing such primer pairs;

the shielding site screening submodule is used for evaluating whether a designed primer contains a shielding site defined by the user, and removing primer pairs that contain a user-defined shielding site;

the shielding region screening submodule is used for evaluating whether a designed primer falls in a shielding region defined by the user, and removing primer pairs that contain a user-defined shielding region;

and the primer pair internal non-specific screening submodule is used for evaluating whether a designed primer pair can amplify multiple regions in the converted reference sequence (namely non-specific amplification), and removing primer pairs that may show non-specific amplification.
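The text does not state how Dimer and Hairpin structures are scored. As one simple possibility, offered only as a sketch, both can be approximated by exact complementarity tests. The thresholds `k`, `stem` and `min_loop` and the function names are assumptions; a practical implementation would more likely use a thermodynamic model.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_dimer(a, b, k=5):
    """True if the 3'-terminal k-mer of either primer can anneal to the other.

    Calling has_dimer(p, p) tests for a self-dimer.
    """
    a, b = a.upper(), b.upper()
    return reverse_complement(a[-k:]) in b or reverse_complement(b[-k:]) in a


def has_hairpin(primer, stem=4, min_loop=3):
    """True if a stem of `stem` bases can pair across a loop of >= `min_loop` bases."""
    p = primer.upper()
    for i in range(len(p) - 2 * stem - min_loop + 1):
        arm = reverse_complement(p[i:i + stem])
        # The partner arm must start after the first arm plus the minimum loop.
        if arm in p[i + stem + min_loop:]:
            return True
    return False
```

The same `has_dimer` test can be reused between primers of different primer pairs when evaluating the compatibility of a multiplex combination.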

On the basis of the above technical solution, preferably, the multiple primer combination module includes submodules for single primer pair specificity evaluation, double primer pair Dimer evaluation, double primer pair non-specificity evaluation, and multiple primer compatibility evaluation, which are specifically as follows:

the single primer pair specificity evaluation submodule is used for evaluating the binding sites, in the reference sequence, of the forward and reverse primers of a single primer pair and giving an evaluation value;

the double-primer-pair Dimer evaluation submodule is used for evaluating whether a Dimer secondary structure that may cause low amplification efficiency or non-productive primer dimers forms between two primer pairs, and recording the compatibility between the two primer pairs;

the double-primer-pair non-specific evaluation submodule is used for evaluating whether two primer pairs together amplify multiple regions in the converted reference sequence (namely non-specific amplification), and recording the compatibility between the two primer pairs;

and the multi-primer compatibility evaluation submodule is used for evaluating the compatibility between primers, in ascending order of the evaluation value given by the single primer pair specificity evaluation submodule, by using the double-primer-pair Dimer evaluation submodule and the double-primer-pair non-specific evaluation submodule, and calculating, by statistical methods such as a shortest distance method, the compatible multiplex primer combination that covers the largest number of target sites.
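To illustrate the double-primer-pair non-specific evaluation, the following sketch predicts amplicons by exact string matching on a single converted reference and flags two primer pairs as incompatible when pooling them creates a product that neither pair produces alone. Exact matching, the single-reference simplification, the `max_len` limit and all names are assumptions of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_all(pattern, text):
    """All start positions of `pattern` in `text`, overlapping matches included."""
    hits, i = [], text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def predicted_amplicons(primers, reference, max_len=500):
    """(start, end) of every product any two primers in the pool could form."""
    # A primer matching the reference acts as a forward primer there.
    fwd = [p for s in primers for p in find_all(s.upper(), reference)]
    # A primer whose reverse complement matches acts as a reverse primer.
    rev = [(p, len(s)) for s in primers
           for p in find_all(reverse_complement(s), reference)]
    products = set()
    for fstart in fwd:
        for rstart, rlen in rev:
            end = rstart + rlen
            if fstart < rstart and end - fstart <= max_len:
                products.add((fstart, end))
    return sorted(products)


def pairs_are_compatible(pair_a, pair_b, reference, max_len=500):
    """True if pooling the two pairs adds no product beyond their own."""
    alone = set(predicted_amplicons(pair_a, reference, max_len)) | \
            set(predicted_amplicons(pair_b, reference, max_len))
    together = set(predicted_amplicons(list(pair_a) + list(pair_b), reference, max_len))
    return together == alone
```

The number of products returned by `predicted_amplicons` for a single pair could also serve as a simple evaluation value for the single primer pair specificity evaluation, with smaller values indicating higher specificity.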

Compared with the prior art, the method and the system for designing the multiple methylation site amplification primer have the following beneficial effects:

(1) by defining a reference sequence and performing on it the C->T conversion required for methylation site detection, the invention provides a new approach to methylation specific PCR primer design and makes methylation primer design on an ultra-long reference sequence (>1Mb) possible;

(2) the CG site screening submodule prevents primers from containing sites whose methylation conversion state cannot be determined, which avoids differential amplification, so that the primers are not affected by the methylation state of the reference sequence and the target sites are amplified comprehensively and specifically;

(3) the Dimer screening submodule and the Hairpin screening submodule prevent secondary structures within a primer pair, ensure the amplification efficiency of qualified primers, and reduce primer dimer formation and loss of primer efficiency during experimental verification;

(4) owing to the complexity of PCR amplification, some regions and sites of the reference genome may interfere with the user's amplification experiment or with subsequent steps such as sequencing and library construction; the shielding site screening submodule and the shielding region screening submodule let the user exclude them, which makes primer design more convenient and gives the user more control;

(5) the system provides primer pair internal non-specific screening over the full reference sequence, a function that no methylation primer design software currently provides; it prevents non-specific primer amplification at the source, reduces the experimental workload of primer verification by 90%, and greatly improves working efficiency;

(6) the invention provides a multi-primer compatibility evaluation method that uses an optimal primer method together with Dimer and non-specific evaluation among multiple primer pairs to simulate how the primer pairs react together in an actual molecular biology experiment; this reduces secondary structures among the primer pairs and non-specific amplification in the reference sequence, reduces abnormal amplification in the actual experiment, and improves amplification efficiency.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a block diagram of the multiplex methylation specific PCR primer design system of the present invention.

FIG. 2 is a flow chart of the method of the present invention for designing multiplex methylation specific PCR primers.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.

This embodiment provides a multiplex methylation specific PCR primer design system. As shown in FIG. 1, the system comprises four modules: a reference sequence CT conversion and parameter setting module, a primer design and screening module, a multiple primer combination module, and a report module.

The reference sequence CT conversion and parameter setting module is used for performing C->T conversion on the reference sequence, setting parameters such as the target sites and the primer length according to user requirements, and generating a primer design environment. It may specifically include the following 3 submodules:

(1) a reference sequence CT conversion submodule, which performs C->T conversion on the reference sequence: unmethylated cytosines of the genome are converted into thymines while methylated cytosines remain unchanged. Because a genome has a positive strand and a negative strand, non-CpG C->T conversion is performed directly on the positive strand of the reference sequence, whereas on the negative strand the non-CpG C->T conversion is performed first and is followed by reverse complementation;

(2) a parameter setting submodule, which sets the parameters of the primer design and screening process in advance, including the name and position of each target site on the reference sequence, the primer design length, the number of primers to generate, the GC content, the annealing temperature, and the regions and sites to be skipped during primer screening;

(3) an environment generation submodule, which generates the target region sequence and the parameter format required for primer design from the CT-converted reference sequence and the preset parameters, and transmits the primer design environment to the primer design and screening module.
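The reference sequence CT conversion can be sketched as follows. This is a minimal illustration, assuming an upper-case A/C/G/T reference and treating every CpG cytosine as retained; the function names are hypothetical and not part of the invention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def convert_non_cpg_c(seq: str) -> str:
    """Replace every C that is not followed by G with T (CpG cytosines are kept)."""
    out = []
    for i, base in enumerate(seq):
        followed_by_g = i + 1 < len(seq) and seq[i + 1] == "G"
        out.append("T" if base == "C" and not followed_by_g else base)
    return "".join(out)


def ct_convert_both_strands(reference: str) -> tuple:
    """Return (converted plus strand, converted minus strand in plus-strand coordinates)."""
    reference = reference.upper()
    plus = convert_non_cpg_c(reference)
    # Read the minus strand 5'->3', convert it, then reverse-complement back
    # so that both results share the coordinates of the input reference.
    minus = convert_non_cpg_c(reverse_complement(reference))
    return plus, reverse_complement(minus)
```

Expressing the converted negative strand in positive-strand coordinates means that a target site position refers to the same index in both converted sequences.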

It should be understood that the regions and sites to be skipped during primer screening are regions and sites that the user, considering the complexity of PCR amplification, chooses to shield during primer screening, so that the designed primers do not impair the PCR amplification or subsequent experiments such as sequencing and library construction.

The primer design and screening module is used for designing primers for the target sites and screening qualified primers according to user requirements. It may specifically include the following 7 submodules:

(1) a primer design submodule, which designs primers according to the primer design environment transmitted by the environment generation submodule;

(2) a CG site screening submodule, which evaluates whether a designed primer contains a CG site that may cause amplification imbalance, and removes primer pairs that may amplify unevenly;

(3) a Dimer screening submodule, which evaluates whether a designed primer pair forms a dimer secondary structure that may cause low amplification efficiency or non-productive primer dimers, and removes such primer pairs;

(4) a Hairpin screening submodule, which evaluates whether a designed primer forms a hairpin secondary structure that may cause low amplification efficiency, and removes such primer pairs;

(5) a shielding site screening submodule, which evaluates whether a designed primer contains a user-defined shielding site, and removes primer pairs that do;

(6) a shielding region screening submodule, which evaluates whether a designed primer contains a user-defined shielding region, and removes primer pairs that do;

(7) a primer pair internal non-specific screening submodule, which evaluates whether a designed primer pair can amplify more than one region in the converted reference sequence (namely non-specific amplification), and removes primer pairs that may amplify non-specifically.

It should be understood that qualified primers are obtained by applying six screens one by one: CG site screening, Dimer screening, Hairpin screening, shielding site screening, shielding region screening, and primer pair internal non-specific screening. The Dimer screening submodule, the Hairpin screening submodule, the shielding site screening submodule and the shielding region screening submodule can be replaced by other optional modules, and further screening submodules can be added after the shielding region screening submodule according to the actual requirements of the user.
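Because the screening submodules are applied one by one and can be replaced or extended, the screening stage can be pictured as an ordered chain of interchangeable predicates. The sketch below is one possible arrangement, assuming a primer pair is represented as a (forward, reverse) tuple of strings; `run_screens` and the screen names are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

PrimerPair = Tuple[str, str]           # (forward, reverse); hypothetical representation
Screen = Callable[[PrimerPair], bool]  # returns True when the pair passes the screen


def run_screens(
    pairs: Iterable[PrimerPair],
    screens: List[Tuple[str, Screen]],
) -> Tuple[List[PrimerPair], List[Tuple[str, int]]]:
    """Apply the screens in order; return the survivors and a per-screen rejection count."""
    kept = list(pairs)
    report = []
    for name, screen in screens:
        before = len(kept)
        kept = [pair for pair in kept if screen(pair)]
        report.append((name, before - len(kept)))
    return kept, report
```

Replacing a submodule then amounts to swapping one entry of `screens`, and adding a further condition amounts to appending an entry; the rejection counts are the kind of detail a screening report could record.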

The multiple primer combination module is used for examining the primers of a plurality of methylation sites together and screening out multiple primer groups that show no non-specific amplification in the genome and no secondary structure. It may specifically include the following 4 submodules:

(1) a single primer pair specificity evaluation submodule, which evaluates the binding sites in the reference sequence of the forward and reverse primers of each single primer pair and gives an evaluation value;

(2) a double-primer-pair Dimer evaluation submodule, which evaluates whether a dimer secondary structure, which may cause low amplification efficiency or non-productive primer dimers, exists between two primer pairs, and records the compatibility between the two primer pairs;

(3) a double-primer-pair non-specific evaluation submodule, which evaluates whether the primers of two primer pairs, when combined, amplify unintended regions in the converted reference sequence (namely non-specific amplification), and records the compatibility between the two primer pairs;

(4) a multi-primer compatibility evaluation submodule, which takes the primer pairs in ascending order of the evaluation value given by the single primer pair specificity evaluation submodule, evaluates the compatibility between them by using the double-primer-pair Dimer evaluation submodule and the double-primer-pair non-specific evaluation submodule, and calculates the largest compatible multi-primer combination for the target sites by using statistical methods such as a shortest distance method.
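A Dimer evaluation between two primer pairs can be approximated by checking whether the 3' end of any primer in one pair can anneal to any primer in the other pair. The sketch below is a deliberately simplified stand-in that uses exact complementarity over a hypothetical window `min_run`; it is not the evaluation used by the invention, and practical tools typically score dimers thermodynamically.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_3prime_dimer(a: str, b: str, min_run: int = 4) -> bool:
    """True if the last min_run bases of primer a can anneal somewhere on primer b."""
    tail = a[-min_run:]
    # Antiparallel pairing: b must contain the reverse complement of a's 3' tail.
    return reverse_complement(tail) in b


def pairs_compatible(pair_a: tuple, pair_b: tuple, min_run: int = 4) -> bool:
    """True if no primer of pair_a forms a 3' dimer with any primer of pair_b."""
    for x, y in product(pair_a, pair_b):
        if has_3prime_dimer(x, y, min_run) or has_3prime_dimer(y, x, min_run):
            return False
    return True
```

The boolean result corresponds to the compatibility that the submodule records for each two primer pairs.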

The report module is used for recording and monitoring all execution processes, generating a process execution report file according to user requirements, and helping the user to screen the primers further and to change the parameters.

It should be understood that the process execution report file includes the text, graph or list reports produced in each stage of the primer design and screening process, specifically: the environment parameter list generated by executing the environment generation submodule; the primer design report generated by executing the primer design submodule, which includes the reasons for primer design failures; the primer screening report generated by executing each screening submodule, which records the details of the screening results; the multi-primer compatibility report generated by executing the multi-primer compatibility evaluation submodule; and the progress monitoring of all execution processes.

In addition, it should be noted that the above-described system embodiments are merely illustrative, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of the modules to implement the purpose of the embodiment according to actual needs, and the present invention is not limited herein.

The system provided by this embodiment realizes the design of multiplex methylation specific PCR primers. FIG. 2 is a flow chart of the working method of the primer design system.

In this embodiment, the method for designing a multiplex methylation specific PCR primer includes the following steps:

S000, parameter configuration: setting the parameters and generating a primer design environment;

S100, primer design and screening: designing primers for the target methylation sites and screening the designed primers;

S200, multiple primer combination calculation: pairing the screened primer pairs two by two, checking the compatibility between each two primer pairs, and selecting the largest compatible multiple primer combination;

S300, primer evaluation and redesign: evaluating the designed multiple primer combination and deciding whether the primers need to be redesigned.

Further, the step S000 includes the steps of:

S001, acquiring target site information and creating the target sites;

S002, acquiring and creating primer requirement parameter information, and binding it to the target sites;

S003, acquiring a reference sequence, and binding it to the target sites after CT conversion;

S004, acquiring the user-defined shielding sites;

S005, acquiring the user-defined shielding regions;

S006, generating a primer design environment comprising the site sequences, local environment parameters and global environment parameters from the parameters acquired in steps S001 to S005, and transmitting the environment parameters to step S100.

It should be understood that executing the reference sequence CT conversion submodule implements step S003, executing the parameter setting submodule implements steps S001, S002, S004 and S005, and executing the environment generation submodule implements step S006.
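One way to picture the primer design environment produced by steps S001 to S006 is as a small data structure that binds a target site, its local primer parameters, a window of the CT-converted reference, and the global shielding settings. All class names, field names and default values below are hypothetical illustrations and are not specified by the invention.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class TargetSite:
    name: str
    position: int          # 0-based index on the reference (assumed convention)
    min_primer_len: int = 18   # illustrative local parameters
    max_primer_len: int = 25


@dataclass
class DesignEnvironment:
    site: TargetSite
    region_start: int      # reference index of region_seq[0]
    region_seq: str        # window of the CT-converted reference around the site
    shielded_sites: Set[int]
    shielded_regions: List[Tuple[int, int]]  # inclusive (start, end) intervals


def build_environment(
    converted_ref: str,
    site: TargetSite,
    flank: int,
    shielded_sites: Iterable[int],
    shielded_regions: Iterable[Tuple[int, int]],
) -> DesignEnvironment:
    """Cut a window of +/- flank bases around the site and attach the shielding settings."""
    start = max(0, site.position - flank)
    end = min(len(converted_ref), site.position + flank + 1)
    return DesignEnvironment(
        site=site,
        region_start=start,
        region_seq=converted_ref[start:end],
        shielded_sites=set(shielded_sites),
        shielded_regions=list(shielded_regions),
    )
```

Keeping `region_start` alongside the window lets later screening steps translate primer positions back to reference coordinates.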

Further, the step S100 includes the steps of:

S101, primer design: executing the primer design submodule, which receives the primer design environment from step S000 and generates primers using Primer3.0 software;

S102, CG site screening: executing the CG site screening submodule, which checks whether the primers received from step S101 contain CG sites or terminal C sites, and deletes unqualified primers;

S103, Dimer screening: executing the Dimer screening submodule, which screens out and removes primers with a dimer structure;

S104, Hairpin screening: executing the Hairpin screening submodule, which screens out and removes primers with a hairpin structure;

S105, shielding site screening: executing the shielding site screening submodule, which screens out and removes primers whose terminal end lies on a user-defined shielding site;

S106, shielding region screening: executing the shielding region screening submodule, which screens out and removes primers whose terminal end lies in a user-defined shielding region;

S107, primer pair internal non-specific screening: executing the primer pair internal non-specific screening submodule, which screens out and removes primer pairs that may amplify multiple non-specific regions in the reference sequence, and transmitting the remaining qualified primers, together with the potential binding sites of each single primer in the reference sequence, to step S200.
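Steps S102, S105 and S106 lend themselves to short predicates. The sketch below rests on two labeled assumptions: "terminal C" is read as a primer whose last base is C, and the "terminal end" of a primer is read as its 3' end in reference coordinates. The function names and the inclusive interval convention are hypothetical.

```python
def passes_cg_screen(primer: str) -> bool:
    """S102 sketch: reject primers that contain a CG dinucleotide or end in C."""
    primer = primer.upper()
    return "CG" not in primer and not primer.endswith("C")


def three_prime_position(start: int, length: int, strand: str) -> int:
    """Reference index of the primer's 3' base; start is the leftmost reference index."""
    return start + length - 1 if strand == "+" else start


def end_is_shielded(start, length, strand, shielded_sites, shielded_regions) -> bool:
    """S105/S106 sketch: True if the 3' end falls on a shielded site or inside a shielded region."""
    pos = three_prime_position(start, length, strand)
    if pos in shielded_sites:
        return True
    return any(lo <= pos <= hi for lo, hi in shielded_regions)
```

A primer would be kept only when `passes_cg_screen` returns True and `end_is_shielded` returns False.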

Further, the step S200 includes the steps of:

S201, single primer pair specificity evaluation: executing the single primer pair specificity evaluation submodule, which takes the qualified primers screened in step S100 and their potential binding sites in the reference sequence, evaluates the specificity of each primer pair, and selects the primer pair with the highest specificity (the Seed primer pair); all primers of the sites other than the site of the Seed primer pair are placed in a first other primer set list, which is transmitted to step S202;

S202, double-primer-pair Dimer evaluation: executing the double-primer-pair Dimer evaluation submodule, which performs Dimer evaluation one by one between the primer pairs in the first other primer set list obtained in step S201 and the highest specificity primer pair generated in step S201, and keeps the primer pairs that form no dimer structure with it, yielding a second other primer set list, which is transmitted to step S203;

S203, double-primer-pair non-specific evaluation: executing the double-primer-pair non-specific evaluation submodule, which performs non-specific evaluation one by one between the primer pairs in the second other primer set list obtained in step S202 and the highest specificity primer pair generated in step S201, and keeps the primer pairs that show no non-specific amplification in the reference sequence together with the highest specificity primer pair, yielding a third other primer set list, which is transmitted to step S204;

S204, multi-primer compatibility evaluation: if the primer pairs in the third other primer set list correspond to more than 1 site, the list is sent back to step S201 for another cycle; if they correspond to exactly 1 site, the list is sent to step S201 to select the highest specificity primer pair, after which the highest specificity primer pairs of all cycles are assembled into a compatible multiple primer pair combination and sent to step S300; if they correspond to 0 sites, the highest specificity primer pairs of all cycles are directly assembled into a compatible multiple primer pair combination and sent to step S300.

It should be understood that the highest specificity primer pair is the single primer pair with the highest specific binding ability for the region where a given target methylation site is located.
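The cycle formed by steps S201 to S204 can be sketched as a greedy loop: pick the most specific remaining primer pair as the seed, drop every candidate at other sites that is incompatible with it, and repeat until no site remains. The sketch assumes a lower score means higher specificity (for example, fewer potential binding sites) and takes the two pairwise evaluations as a single caller-supplied `compatible` function; all names are hypothetical.

```python
from typing import Callable, Dict, Hashable, List


def select_multiplex(
    candidates: Dict[Hashable, List[tuple]],
    specificity: Callable[[tuple], float],
    compatible: Callable[[tuple, tuple], bool],
) -> Dict[Hashable, tuple]:
    """Return one primer pair per site such that every later pick is compatible with all earlier seeds."""
    remaining = {site: list(pairs) for site, pairs in candidates.items() if pairs}
    chosen = {}
    while remaining:
        # S201: the most specific pair among all remaining sites becomes the seed.
        seed_site, seed = min(
            ((site, pair) for site, pairs in remaining.items() for pair in pairs),
            key=lambda item: specificity(item[1]),
        )
        chosen[seed_site] = seed
        # S202/S203: keep only pairs at other sites that are compatible with the seed.
        filtered = {}
        for site, pairs in remaining.items():
            if site == seed_site:
                continue
            kept = [pair for pair in pairs if compatible(seed, pair)]
            if kept:
                filtered[site] = kept
        # S204: loop again while any site still has candidates.
        remaining = filtered
    return chosen
```

Because each surviving candidate has been checked against every earlier seed, the returned pairs are mutually compatible. A greedy procedure of this kind is not guaranteed in general to find the largest possible compatible set.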

Further, in step S300, the compatible multiple primer combination delivered by step S200 is evaluated. If the number of primer pairs is too small or required sites are not included, the sites and primer parameters are adjusted and the process returns to step S000 for redesign. If the multiple primer combination meets the screening requirements but some target sites are not included, the target sites that are not included are input into step S000 again, primers are redesigned for those sites, and a new set of multiple primer combinations is formed. If the multiple primer combination meets the screening requirements and contains all target sites, that is, all requirements of the user are met, the multiplex methylation specific PCR primer design process ends.
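The three outcomes of step S300 can be written as a small decision function. The thresholds `min_pairs` and `required_sites` stand in for the user's screening requirements and are hypothetical, as are the outcome names.

```python
from enum import Enum


class Outcome(Enum):
    FINISHED = "finished"
    REDESIGN_MISSING_SITES = "redesign_missing_sites"
    ADJUST_AND_REDESIGN = "adjust_parameters_and_redesign"


def evaluate_combination(chosen_sites, target_sites, required_sites=(), min_pairs=1):
    """Return (outcome, sites to send back to step S000)."""
    chosen, targets = set(chosen_sites), set(target_sites)
    # Too few primer pairs, or a required site is absent: adjust and start over.
    if len(chosen) < min_pairs or not set(required_sites) <= chosen:
        return Outcome.ADJUST_AND_REDESIGN, targets
    # Requirements met but some targets uncovered: redesign only for those sites.
    missing = targets - chosen
    if missing:
        return Outcome.REDESIGN_MISSING_SITES, missing
    return Outcome.FINISHED, set()
```

For example, a combination covering site "a" out of targets "a" and "b" yields `REDESIGN_MISSING_SITES` with `{"b"}`, unless "b" is a required site, in which case the parameters are adjusted and the whole design is repeated.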

This embodiment also provides a computer device that is either programmed to perform the steps of the multiplex methylation specific PCR primer design method of this embodiment, or has a storage medium storing a computer program programmed to execute that method.

This embodiment also provides a computer-readable storage medium storing a computer program programmed to execute the multiplex methylation specific PCR primer design method of this embodiment.

The method and system for designing multiplex methylation specific PCR primers provided by the embodiments of the invention can design methylation specific primers on an ultra-long sequence, effectively reduce secondary structures such as dimers and hairpins within single primer pairs and between multiple primer pairs, and reduce non-specific amplification in the genome within single primer pairs and between multiple primer pairs. This lowers the workload and difficulty of multiplex methylation specific PCR experiments and greatly improves working efficiency, so that multiple methylation sites can be detected efficiently and accurately.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
