Method for simultaneously mapping two trait-related genes

Document No.: 1615535 Publication date: 2020-01-10

Reading note: this technology, "Method for simultaneously mapping two trait-related genes" (一种同时定位两个性状相关基因的方法), was designed by 张晓军, 于晓娜, 王月福, 张绍静, 郭蕊, 颜凡壮, 赵瑞华, 宋丹蕾, 李季华, 李瑶瑶, 司 on 2019-10-24. Main content: The invention belongs to the technical field of molecular genetics and molecular breeding and discloses a method that uses recombinant-phenotype bulk analysis to map the loci of genes related to two traits simultaneously. Loci related to two independent traits are mapped at once by analyzing groups of recombinant-phenotype individuals through genome resequencing or transcriptome sequencing. Specifically, a segregating population is constructed from two parents that differ in both traits, and recombinant-phenotype individuals, whose two traits are of different parental types, are selected from the population and divided into two classes: one with trait A of the female-parent type and trait B of the male-parent type, and the other with trait A of the male-parent type and trait B of the female-parent type. The parents and the two classes, as mixed pools or as individuals, are subjected to genome resequencing or transcriptome sequencing, and for SNPs that differ between the parents the difference between the two groups in the frequency of a chosen parental-type SNP is calculated, so that the loci related to both traits are mapped at the same time.

1. A method for simultaneously mapping two genes related to target traits, wherein: using conventional genome resequencing or transcriptome sequencing, recombinant-phenotype individuals from a segregating population formed by selfing after a biparental cross are grouped and sequenced, so that the genes related to two independent traits are mapped at the same time; the specific steps are:

(1) parent selection and hybridization: select as parents two homozygous varieties or lines that are distantly related and highly polymorphic, the more traits that differ between them the better, and cross them to construct a hybrid combination;

(2) construction and selection of the segregating population: after the biparental cross, a segregating population is formed by selfing, forced selfing, or selfing after backcrossing;

(3) selection and grouping of recombinant-phenotype individuals: two independent target traits are considered at once; individuals whose phenotypic values for both traits are close to those of the male or female parent are selected from the segregating population as extreme-phenotype individuals and divided into two types according to the phenotypes of the two traits: parental-phenotype individuals, in which both traits match the same parent, and recombinant-phenotype individuals, in which the two traits are of different parental types; the recombinant-phenotype individuals fall into two classes: one with trait A of the female-parent type and trait B of the male-parent type, the AmBf group, and the other with trait A of the male-parent type and trait B of the female-parent type, the AfBm group; the AmBf group and the AfBm group serve as the two mixed pools of recombinant-phenotype extreme individuals;

(4) acquisition of SNP and InDel variant sites: genome resequencing or transcriptome sequencing is performed on the parents and on the two recombinant-phenotype mixed pools, or on the individuals separately; the clean reads from the parents and the two pools are aligned to the reference genome to obtain single nucleotide polymorphism (SNP) and insertion-deletion (InDel) variants relative to the reference genome; from all parental SNPs that differ from the reference genome, those that are homozygous in both parents and differ in genotype between the two parents are selected for the subsequent SNP_index calculation;

(5) calculation of the SNP_index of each mixed pool and the ΔSNP_index between the recombinant-phenotype mixed pools, and mapping of the trait-related loci: by comparison with the reference genome, SNPs that are homozygous within each parent but differ in genotype between the parents are selected for calculating the genotype frequency of the SNP marker, the SNP_index; each such SNP has two genotypes, the maternal SNP_M and the paternal SNP_F; one parental genotype is chosen, and for each SNP the frequency of that genotype is calculated in each of the two mixed pools as the ratio of the number of reads carrying the chosen parental genotype to the number of all reads detected at the site, giving the SNP_M_index or the SNP_F_index; the difference in SNP_M/F_index between the two mixed pools is then calculated for each SNP, giving the ΔSNP_M/F_index; to eliminate false-positive sites, the ΔSNP_index values of markers on the same chromosome can be fitted with a sliding window according to marker position on the genome and the window means plotted; usually the mean ΔSNP_M/F_index within each 200 kb interval is plotted, 1000 permutation tests are performed, and the 95% confidence level is taken as the screening threshold; regions where the sliding-window mean line of the ΔSNP_M/F_index rises above the positive threshold or falls below the negative threshold are candidate regions for the underlying genes; within one analysis the parental genotype used for the SNP must be fixed and kept consistent, but either the female-parent or the male-parent type may be chosen, and both can be calculated separately to cross-check the result; once the SNP genotype is fixed, the loci for the two traits are distinguished as a positive peak and a negative peak, and which direction corresponds to which trait is determined by the direction of subtraction between the individual groups and by the chosen SNP genotype.

2. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein: when parents are selected in step (1), heterozygous parents can also be used provided the marker analysis is unambiguous, in which case the marker genotypes analyzed in the parents and progeny are not limited to two variants.

3. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein: the segregating population in step (2) is selected from any segregating generation, preferably an F2 population, a BCnF2 population, a recombinant inbred line (RIL) population, or a doubled haploid (DH) line, wherein the BCnF2 population is formed by selfing after F1 is backcrossed directly 1 to n times, or after selected individuals from the progeny of another segregating population are backcrossed 1 to n times.

4. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein: when the segregating population is selected in step (2), any progeny segregating population is suitable provided its genetic pedigree or population information is clear.

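5. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein: when the maternal SNP_M is chosen for calculating the ΔSNP_index and mapping the trait-related loci, the calculation is, in brief:

$$\mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AmBf}) = \frac{\text{number of } \mathrm{SNP}_{M} \text{ reads}}{\text{number of } \mathrm{SNP}_{M} \text{ reads} + \text{number of } \mathrm{SNP}_{F} \text{ reads}} \quad \text{(reads counted in the AmBf group)},$$

$$\mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AfBm}) = \frac{\text{number of } \mathrm{SNP}_{M} \text{ reads}}{\text{number of } \mathrm{SNP}_{M} \text{ reads} + \text{number of } \mathrm{SNP}_{F} \text{ reads}} \quad \text{(reads counted in the AfBm group)},$$

and the difference in SNP_M_index between the two groups at each SNP is:

$$\Delta\mathrm{SNP}_{M}\_\mathrm{index} = \mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AmBf}) - \mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AfBm}).$$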

6. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein the trait-related loci are mapped as follows: with the ΔSNP_M_index calculated from the maternal SNP_M, for trait A the pool with the maternal-type trait carries more maternal SNP_M at the trait-related locus and the pool with the paternal-type trait carries less, so when the difference is taken as the maternal-type pool minus the paternal-type pool, the locus related to trait A is a positive peak; for trait B, the pool with the paternal-type trait carries less maternal SNP_M at the trait-related locus and the pool with the maternal-type trait carries more, and for this trait the same subtraction is the paternal-type pool minus the maternal-type pool, so the locus related to trait B is a negative peak; by the same reasoning, the relationship between peak direction and trait can be determined for a ΔSNP_F_index calculated from the paternal SNP_F.

7. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein: the target trait can be a qualitative trait or a quantitative trait controlled by major-effect loci.

8. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein: the method is best suited to plant or animal species in which homozygous parents can be used and a segregating population of their hybrid progeny can be constructed; homozygous parents are preferred, but populations built from non-homozygous parents are not excluded where the molecular markers permit.

9. The method for simultaneously mapping two genes related to target traits according to claim 1, wherein: the SNP_index and the between-pool ΔSNP_index can be calculated from SNP markers alone or from SNP and InDel markers together, and the marker frequency of an InDel marker is calculated in the same way as in step (5).

Technical field:

The invention belongs to the technical field of molecular genetics and molecular breeding and relates to a method for simultaneously mapping the loci of genes related to two traits.

Background art:

Bulked segregant analysis (BSA), also called pooled analysis of segregating individuals or mixed-group analysis, was first proposed by R. W. Michelmore et al. (PNAS. 1991. 88(21):9828), who used it successfully to screen lettuce for markers linked to disease-resistance genes. In this method, a certain number of plants are first selected, according to the phenotype of the target trait, from any segregating population derived from a pair of parents that differ in that trait, forming two extreme-trait subgroups. The DNA of the individuals in each group is mixed in equal amounts to form two "gene pools" of contrasting phenotype; the two pools are then analyzed with codominant molecular markers that are polymorphic between the parents, and markers that are polymorphic between the two pools are genetically linked to the locus of the target trait gene. Once markers linked to the target gene have been obtained, a mapping population can be analyzed to measure how tightly each marker is linked to the target trait gene and to place the marker on a known molecular map or chromosome, which completes the marker-based mapping of the target gene. Many kinds of molecular marker are used with BSA for gene mapping; common ones include restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), and simple sequence repeat (SSR) markers. As genome and transcriptome sequencing have advanced and their cost has fallen, using sequencing together with a reference genome to develop single nucleotide polymorphism (SNP) markers has further extended the application of BSA.
Specifically, the two parents and two evenly mixed DNA pools of contrasting phenotype, selected from the segregating population produced by crossing them, are each subjected to genome sequencing; the short reads from the parents are aligned to the reference genome to find SNPs that are polymorphic between the parents and the reference genome; the allele frequency (SNP_index) of these parental polymorphic SNPs is then analyzed in the two extreme-trait pools, and the difference between the pools (ΔSNP_index) pinpoints the position of the target trait gene.

When an extreme-trait mixed pool is built with the BSA or BSA-seq method, a segregating population from a parental cross is used and grouping considers only the two extremes of a single target trait. In theory the genetic background of other traits is not selected, so the two gene pools differ mainly in the segment carrying the target gene; they are therefore also called near-isogenic pools, the influence of environmental and human factors is excluded, and the results are more accurate and reliable. BSA overcomes the limitation that near-isogenic lines are hard to obtain for many crops and saves time and labor compared with the near-isogenic line method; it is a very practical and widely used way to map genes related to a target trait. It has one limitation, however: a single two-pool analysis maps loci for only one target trait. Whole-population genetic linkage mapping can analyze several traits at once on the basis of a genetic map, but building that map requires molecular marker data for a large number of individuals in the segregating population, which is a heavy workload. A method is therefore urgently needed that can map two traits genetically at the same time with less work.

Summary of the invention:

The invention aims to overcome the shortcomings of the prior art by providing a new experimental design based on grouped analysis of extreme-trait individuals in a segregating population, so that a single analysis of two recombinant-phenotype individual pools by genome resequencing or transcriptome sequencing maps the target loci of two independent traits simultaneously.

To achieve this, the invention provides a method for simultaneously mapping genes related to two target traits: using conventional genome resequencing or transcriptome sequencing, recombinant-phenotype individuals from a segregating population formed by selfing after a biparental cross are grouped and sequenced, and the genes related to two independent traits are mapped at the same time. The specific steps are:

(1) parent selection and hybridization: select as parents two homozygous varieties or lines that are distantly related and highly polymorphic, the more traits that differ between them the better, and cross them to construct a hybrid combination;

(2) construction and selection of the segregating population: after the two parents are crossed, a segregating population is formed by selfing, forced selfing, or selfing after backcrossing;

(3) selection and grouping of recombinant-phenotype individuals: two independent target traits are considered at once; individuals whose phenotypic values for both traits are close to those of the male or female parent are selected from the segregating population as extreme-phenotype individuals and divided into two types according to the phenotypes of the two traits: parental-phenotype individuals, in which both traits match the same parent, and recombinant-phenotype individuals, in which the two traits are of different parental types; the recombinant-phenotype individuals fall into two classes: one with trait A of the female-parent type and trait B of the male-parent type, the AmBf group, and the other with trait A of the male-parent type and trait B of the female-parent type, the AfBm group; the AmBf group and the AfBm group serve as the two mixed pools of recombinant-phenotype extreme individuals;

(4) acquisition of SNP and InDel variant sites: genome resequencing or transcriptome sequencing is performed on the parents and on the two recombinant-phenotype mixed pools, or on the individuals separately; the clean reads from the parents and the two pools are aligned to the reference genome to obtain single nucleotide polymorphism (SNP) and insertion-deletion (InDel) variants relative to the reference genome; from all parental SNPs that differ from the reference genome, those that are homozygous in both parents and differ in genotype between the two parents are selected for the subsequent SNP_index calculation;

(5) calculation of the SNP_index of each mixed pool and the ΔSNP_index between the recombinant-phenotype mixed pools, and mapping of the trait-related loci: by comparison with the reference genome, SNPs that are homozygous within each parent but differ in genotype between the parents are selected for calculating the genotype frequency of the SNP marker, the SNP_index; each such SNP has two genotypes, the maternal SNP_M and the paternal SNP_F; one parental genotype is chosen, and for each SNP the frequency of that genotype is calculated in each of the two mixed pools as the ratio of the number of reads carrying the chosen parental genotype to the number of all reads detected at the site, giving the SNP_M_index or the SNP_F_index; the difference in SNP_M/F_index between the two mixed pools is then calculated for each SNP, giving the ΔSNP_M/F_index; to eliminate false-positive sites, the ΔSNP_index values of markers on the same chromosome can be fitted with a sliding window according to marker position on the genome and the window means plotted; usually the mean ΔSNP_M/F_index within each 200 kb interval is plotted, 1000 permutation tests are performed, and the 95% confidence level is taken as the screening threshold; regions where the sliding-window mean line of the ΔSNP_M/F_index rises above the positive threshold or falls below the negative threshold are candidate regions for the underlying genes; within one analysis the parental genotype used for the SNP must be fixed and kept consistent, but either the female-parent or the male-parent type may be chosen, and both can be calculated separately to cross-check the result; once the SNP genotype is fixed, the loci for the two traits are distinguished as a positive peak and a negative peak, and which direction corresponds to which trait is determined by the direction of subtraction between the individual groups and by the chosen SNP genotype.
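The windowing and thresholding in step (5) can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how the 1000 permutations are carried out, so the sketch assumes a null in which swapping the two pool labels at a SNP flips the sign of its Δ value, and it takes the threshold from the largest absolute window mean seen in each permutation. It also uses consecutive fixed 200 kb intervals rather than overlapping windows. All function names are hypothetical.

```python
import random
from collections import defaultdict


def window_means(positions, deltas, window=200_000):
    """Average delta values inside consecutive fixed-size windows on one chromosome."""
    bins = defaultdict(list)
    for pos, delta in zip(positions, deltas):
        bins[pos // window].append(delta)
    # Map each window's start coordinate to the mean of the markers it contains.
    return {idx * window: sum(v) / len(v) for idx, v in sorted(bins.items())}


def permutation_threshold(positions, deltas, window=200_000,
                          n_perm=1000, confidence=0.95, seed=0):
    """Threshold from an ASSUMED null: random pool-label swaps flip each delta's sign."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in deltas]
        means = window_means(positions, flipped, window)
        maxima.append(max(abs(m) for m in means.values()))
    maxima.sort()
    # Pick the value at the requested quantile of the permutation maxima.
    return maxima[min(int(confidence * n_perm), n_perm - 1)]


def candidate_windows(positions, deltas, threshold, window=200_000):
    """Return (window_start, mean) pairs above +threshold or below -threshold."""
    means = window_means(positions, deltas, window)
    return [(start, m) for start, m in means.items() if abs(m) > threshold]
```

In practice the threshold would be computed per chromosome or genome-wide from real marker data; the symmetric threshold used here is a simplification of the separate positive and negative thresholds mentioned in the text.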

When parents are selected in step (1) of the invention, heterozygous parents can also be used provided the marker analysis is unambiguous; in that case the marker genotypes analyzed in the parents and progeny are not limited to two variants.
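For the default case of homozygous parents, the marker filter described in step (4) keeps only sites where both parents are homozygous and carry different alleles. A minimal Python sketch follows, assuming genotype calls are written as slash-separated allele strings such as "A/A"; that representation, the function name, and the example sites are all hypothetical and not taken from the patent.

```python
def is_informative(maternal_gt: str, paternal_gt: str) -> bool:
    """True when both parents are homozygous and carry different alleles."""
    m_alleles = set(maternal_gt.split("/"))
    p_alleles = set(paternal_gt.split("/"))
    if "." in m_alleles | p_alleles:
        return False  # a missing call cannot be classified
    return len(m_alleles) == 1 and len(p_alleles) == 1 and m_alleles != p_alleles


# Hypothetical parental genotype calls keyed by (chromosome, position).
parent_calls = {
    ("chr1", 1050): ("A/A", "G/G"),  # homozygous and different: keep
    ("chr1", 2310): ("A/A", "A/A"),  # no polymorphism between parents: drop
    ("chr1", 4872): ("C/T", "T/T"),  # heterozygous female parent: drop
}
informative = [site for site, (m, p) in parent_calls.items() if is_informative(m, p)]
```

With heterozygous parents this simple two-allele test no longer applies, which is why the text requires the marker analysis to be unambiguous in that case.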

The segregating population in step (2) of the invention is selected from any segregating generation, preferably an F2 population, a BCnF2 population, a recombinant inbred line (RIL) population, or a doubled haploid (DH) line, wherein the BCnF2 population is formed by selfing after F1 is backcrossed directly 1 to n times, or after selected individuals from the progeny of another segregating population are backcrossed 1 to n times.

When the segregating population is selected in step (2) of the invention, any progeny segregating population is suitable provided its genetic pedigree or population information is clear.

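When the maternal SNP_M is chosen in step (5) of the invention for calculating the ΔSNP_index and mapping the trait-related loci, the calculation is, in brief:

$$\mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AmBf}) = \frac{\text{number of } \mathrm{SNP}_{M} \text{ reads}}{\text{number of } \mathrm{SNP}_{M} \text{ reads} + \text{number of } \mathrm{SNP}_{F} \text{ reads}} \quad \text{(reads counted in the AmBf group)},$$

$$\mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AfBm}) = \frac{\text{number of } \mathrm{SNP}_{M} \text{ reads}}{\text{number of } \mathrm{SNP}_{M} \text{ reads} + \text{number of } \mathrm{SNP}_{F} \text{ reads}} \quad \text{(reads counted in the AfBm group)},$$

The difference in SNP_M_index between the two groups at each SNP is:

$$\Delta\mathrm{SNP}_{M}\_\mathrm{index} = \mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AmBf}) - \mathrm{SNP}_{M}\_\mathrm{index}\,(\mathrm{AfBm}).$$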
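As a worked illustration of the calculation, the following Python sketch computes the maternal-allele index in each pool and their difference at a single SNP. The read counts are invented for the example and the function names are hypothetical.

```python
def snp_index(maternal_reads: int, paternal_reads: int) -> float:
    """Fraction of reads at one SNP that carry the maternal allele."""
    total = maternal_reads + paternal_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return maternal_reads / total


def delta_snp_index(ambf_counts, afbm_counts) -> float:
    """Maternal-allele index of the AmBf pool minus that of the AfBm pool."""
    return snp_index(*ambf_counts) - snp_index(*afbm_counts)


# Hypothetical (maternal, paternal) read counts at a single SNP.
example_delta = delta_snp_index(ambf_counts=(18, 2), afbm_counts=(3, 17))
```

Here the AmBf pool has an index of 18/20 = 0.90 and the AfBm pool 3/20 = 0.15, so the difference is about 0.75, the kind of strongly positive value expected near a locus for trait A.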

The trait-related loci are mapped in step (5) of the invention as follows: with the ΔSNP_M_index calculated from the maternal SNP_M, for trait A the pool with the maternal-type trait carries more maternal SNP_M at the trait-related locus and the pool with the paternal-type trait carries less, so when the difference is taken as the maternal-type pool minus the paternal-type pool, the locus related to trait A is a positive peak; for trait B, the pool with the paternal-type trait carries less maternal SNP_M at the trait-related locus and the pool with the maternal-type trait carries more, and for this trait the same subtraction is the paternal-type pool minus the maternal-type pool, so the locus related to trait B is a negative peak. By the same reasoning, the relationship between peak direction and trait can be determined for a ΔSNP_F_index calculated from the paternal SNP_F.
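The sign rule can be expressed as a small Python helper. This sketch assumes the difference is always taken as the AmBf pool minus the AfBm pool and that markers are strictly biallelic, so that the paternal index equals one minus the maternal index and the paternal-based difference simply has the opposite sign. The function name and the symmetric threshold are hypothetical.

```python
def trait_for_peak(window_mean: float, threshold: float,
                   index_parent: str = "maternal"):
    """Assign a significant window to 'trait A' or 'trait B'; None if not significant.

    Assumes the difference is AmBf minus AfBm and markers are biallelic.
    """
    if index_parent not in ("maternal", "paternal"):
        raise ValueError("index_parent must be 'maternal' or 'paternal'")
    if abs(window_mean) <= threshold:
        return None
    positive = window_mean > 0
    if index_parent == "paternal":
        # The paternal-allele index mirrors the maternal one, so peaks swap sign.
        positive = not positive
    return "trait A" if positive else "trait B"
```

For example, with the maternal index and a threshold of 0.4, a window mean of 0.6 would be assigned to trait A and a window mean of -0.6 to trait B.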

The target trait of the invention may be a qualitative trait or a quantitative trait controlled by major-effect loci.
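Whichever kind of trait is used, the grouping in step (3) reduces each extreme individual to a parental-type call for trait A and for trait B. The following Python sketch shows that bookkeeping, assuming each call has already been made and is coded "m" (female-parent type) or "f" (male-parent type); the coding, the function name, and the individual labels are hypothetical.

```python
def recombinant_group(trait_a_type: str, trait_b_type: str):
    """Return 'AmBf', 'AfBm', or None for parental-phenotype individuals."""
    pair = (trait_a_type, trait_b_type)
    if pair == ("m", "f"):
        return "AmBf"
    if pair == ("f", "m"):
        return "AfBm"
    return None  # both traits match the same parent: not used for the pools


# Hypothetical extreme-phenotype individuals with (trait A, trait B) calls.
individuals = {
    "P01": ("m", "f"),
    "P02": ("m", "m"),
    "P03": ("f", "m"),
    "P04": ("f", "f"),
}
pools = {"AmBf": [], "AfBm": []}
for name, (a_type, b_type) in individuals.items():
    group = recombinant_group(a_type, b_type)
    if group is not None:
        pools[group].append(name)
```

For a quantitative trait, the "m" or "f" call would come from comparing the individual's phenotypic value with the parental values, a step that depends on the trait and is not shown here.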

The method for simultaneously mapping genes related to two target traits by recombinant-phenotype grouping analysis of a segregating population is best suited to plant or animal species in which homozygous parents can be used and a segregating population of their hybrid progeny can be constructed; homozygous parents are preferred, but populations built from non-homozygous parents are not excluded where the molecular markers permit.

The SNP_index and the between-pool ΔSNP_index of the invention can be calculated from SNP markers alone or from SNP and InDel markers together, and the marker frequency of an InDel marker is calculated in the same way as in step (5).

Compared with the prior art, the invention has the following beneficial effect: a single round of genome resequencing or transcriptome sequencing of two mixed pools maps two different traits simultaneously. Compared with conventional BSA or BSR analysis, which can map the locus of only one trait-related gene at a time, this greatly improves mapping efficiency and reduces research and development costs.

Description of the drawings:

FIG. 1 shows the mapping of trait-related genes using ΔSNP_M_index data from recombinant-type individual groups for two independent peanut traits, prostrate/erect growth habit and alternate/continuous flowering; the positive peak is the locus related to prostrate/erect habit, and the negative peak is the locus related to alternate/continuous flowering.

Detailed description:

The technique of the invention is further illustrated by the following examples.
