Method for identifying transcription factor binding element in cotton fiber development period by DNA affinity protein sequencing

Document No.: 796646    Publication date: 2021-04-13

Reading note: this technology, "Method for identifying transcription factor binding element in cotton fiber development period by DNA affinity protein sequencing", was designed by Wei Yangyang, Cheng Shuang, Liu Yuling, Ao Guowei, Jin Liying, Chen Yue, Shao Jing, Meng Fei, Shang Haihong, Wang Xuefei and Peng Renhai on 2020-12-09. Abstract: The invention relates to a method for identifying transcription factor binding elements in the cotton fiber development period by DNA affinity protein sequencing, comprising: 1) extracting cotton fiber genomic DNA, randomly fragmenting it into fragments of about 200 bp, and using the fragments to construct an Index-DNA library carrying sequencing adapters; 2) expressing a tagged cotton transcription factor protein in a prokaryotic system and confirming the expressed target protein by immunoblotting; 3) incubating the target protein with the Index-DNA library in vitro so that the two bind fully, and extracting the Index-DNA specifically bound to the protein; 4) amplifying and sequencing, mapping the resulting data to the cotton genome, cross-referencing it with the DHS sites of the cotton fiber development period, and identifying and confirming the resulting binding element sequences. The method requires no genetic transformation, identifies the downstream binding sites of cotton transcription factors rapidly and accurately, and is fast, accurate, economical and labor-saving.

1. A method for identifying transcription factor binding elements in the cotton fiber development period by DNA affinity protein sequencing, characterized by comprising the following steps:

S1, extracting genomic DNA from cotton fiber in the development period, randomly fragmenting it to obtain DNA fragments, and repairing and ligating the fragments to construct an Index-DNA library carrying Illumina sequencing adapters;

S2, fusing the cotton transcription factor gene with a tag gene, then expressing and purifying the protein to obtain a tagged fusion protein;

S3, incubating the tagged fusion protein with the Index-DNA library in vitro so that the two bind fully, and extracting the Index-DNA specifically bound to the fusion protein;

S4, amplifying and sequencing the DAP-DNA obtained in step S3, mapping the resulting data to the cotton genome, cross-validating it against a DHS map of the cotton fiber development period, and identifying and confirming the sequences of the resulting binding elements.

2. The method of claim 1, wherein in step S1 the cotton fiber genomic DNA is extracted with a Qiagen plant DNA extraction kit, and the random fragmentation is performed with an ultrasonicator to obtain DNA fragments of about 200 bp.

3. The method of claim 2, wherein in step S1 the ultrasonicator program is set to Time ON 30 s, Time OFF 30 s, Cycle Num 13.

4. The method of claim 1, wherein in step S1 the obtained DNA fragments are end-repaired with an end-repair enzyme, dATP and Klenow enzyme are then added for A-tailing to obtain dA-tailed DNA, and an Illumina sequencing adapter is then added and ligated to obtain the Index-DNA library.

5. The method of claim 1, wherein in step S2, the gene sequence of the cotton transcription factor is shown in SEQ ID No.1, and the tag is a GST tag, a His tag, a Flag tag or a GFP tag.

6. The method of claim 1, wherein in step S3 the in vitro incubation is carried out in Incubation Buffer with rotary shaking at room temperature for 1 h; a polyclonal antibody against the tag is then added and incubated overnight; rProtein A agarose beads are then added and incubated at 4 °C for 3-4 h; after washing, the rProtein A agarose beads are separated from the tag polyclonal antibody-tagged fusion protein-Index-DNA fragment complex with an eluent; and the Index-DNA fragments are extracted from the complex with phenol-chloroform.

7. The method of claim 6, wherein the Incubation Buffer contains, at final concentration, 0.05 M NaCl, 0.02 M Tris-HCl, 0.005 M EDTA, 0.0002 M PMSF and Complete Mini.

8. The method of claim 7, wherein the washing is performed by adding Buffer A-50mM NaCl, Buffer A-100mM NaCl and Buffer A-150mM NaCl solutions in that order, each wash consisting of adding the solution, mixing by inversion for 1 minute, centrifuging at 5000 rpm for 1 minute at room temperature, standing on ice for 1 minute, and removing as much supernatant as possible; Buffer A contains 0.05 M Tris-HCl (pH 7.5) and 0.01 M EDTA (pH 8.0), and 50 mM, 100 mM and 150 mM NaCl denote the final NaCl concentration in Buffer A;

the eluent is Elution buffer preheated to 42 °C; the Elution buffer contains, at final concentration, 0.05 M NaCl, 0.02 M Tris-HCl (pH 7.5), 0.005 M EDTA (pH 8.0) and 1% SDS; the elution is carried out twice, each time in a 65 °C water bath for 15 min with mixing by inversion once every 5 min.

9. The method of claim 1, wherein in step S4 the amplification uses a 50 µL system consisting of 14 µL DAP-DNA, 25 µL 2× KAPA HiFi HotStart ReadyMix, 1 µL 0.25 µM TruSeq PCR primer cocktail and 10 µL ddH2O; the amplification program is 98 °C for 45 s; 3-22 cycles of 98 °C for 15 s, 63 °C for 30 s and 72 °C for 30 s; then 72 °C for 1 min and hold at 4 °C; the amplified product is run on a 1% agarose gel, and fragments of about 300 bp are recovered and sequenced.

10. The method of claim 1, wherein in step S4 identifying and confirming the resulting binding element sequences comprises: aligning the sequencing data to the cotton genome with bowtie software; calling peaks with Popera software; determining the correct positions of the peaks with BEDTools software; performing motif analysis on the peaks; identifying the promoters of the genes downstream of the peaks; and overlapping the peaks with DHS data of the cotton fiber development period to draw a Venn diagram, wherein the elements shared by all three analyses are the binding elements downstream of the transcription factor.

Technical Field

The invention relates to a method for identifying a transcription factor binding element in a cotton fiber development period by using DNA affinity protein sequencing, belonging to the technical field of plant molecular biology.

Background

Transcription factors play an important role in the growth and development of animals and plants; they regulate target genes by binding cis-acting elements in the promoters of those genes. Studies have shown that many cotton transcription factors are involved in fiber development and stress tolerance: for example, GhWRKY34, GhWRKY6-like, GhATAF1 and GhABF2 can improve the salt tolerance of transgenic plants, and GhMYB25, GhHOX3 and GhTCP14 can regulate cotton fiber development. However, identifying the downstream binding element sites of cotton transcription factors is difficult, so the related regulatory mechanisms are hard to resolve and the related research lags behind. The DNase I hypersensitive site (DHS) is a gold standard for identifying functional genomic elements (promoters, enhancers, insulators and the like), and DHS-seq data from a specific developmental period help identify the downstream binding elements of transcription factors.

At present, most methods for identifying the downstream targets of transcription factors rely on chromatin immunoprecipitation sequencing (ChIP-seq). There are two main approaches. In the first, nuclei are isolated directly from the material of interest and subjected to chromatin immunoprecipitation and sequencing. This requires an antibody specific to the transcription factor; because transcription factors usually belong to families, an antibody specific to a single member is hard to obtain, and background interference often reduces the accuracy of the results. In the second, a tag with a known functional domain is fused to the transcription factor, a fusion expression vector is constructed, and transgenic plants are generated. However, a cotton genetic transformation cycle takes 6-8 months, and chromatin immunoprecipitation can be performed only after homozygous transgenic lines have been screened, which is time-consuming, laborious and costly. Research on transcription factor regulatory sites in the cotton fiber development period currently relies mostly on the transgenic approach, with the same drawbacks (Shan C M, et al, 2014,5(5519):5519.). The conventional DAP-seq method does not take into account DNA methylation in the specific developmental period and includes no analysis of chromatin open state in that period, so it yields a large number of transcription factor binding sites (Ninlihua et al., scientific reports, 2019, v.64(24):81-92.), which makes the subsequent screening of functional sites more difficult.

Therefore, a rapid, accurate, economical and labor-saving method for identifying the downstream binding elements of transcription factors in the cotton fiber development period is needed.

Disclosure of Invention

(I) Technical problem to be solved

In order to solve the above problems in the prior art, the present invention provides a method for identifying transcription factor binding elements during cotton fiber development by DNA affinity protein sequencing.

(II) Technical solution

To achieve this purpose, the invention adopts the following main technical solution:

A method for identifying transcription factor binding elements in the cotton fiber development period by DNA affinity protein sequencing, comprising the following steps:

S1, extracting genomic DNA from cotton fiber in the development period, randomly fragmenting it to obtain DNA fragments, and repairing and ligating the fragments to construct an Index-DNA library carrying Illumina sequencing adapters;

S2, fusing the cotton transcription factor gene with a tag gene, then expressing and purifying the protein to obtain a tagged fusion protein;

S3, incubating the tagged fusion protein with the Index-DNA library in vitro so that the two bind fully, and extracting the Index-DNA specifically bound to the fusion protein;

S4, amplifying and sequencing the Index-DNA obtained in step S3, mapping the resulting data to the cotton genome, cross-validating it against the DHS sites of the cotton fiber development period, and identifying and confirming the sequences of the resulting binding elements.

In the method described above, preferably, in step S1, the cotton fiber genomic DNA is extracted with a Qiagen plant DNA extraction kit, and the random fragmentation is performed with an ultrasonicator to obtain DNA fragments of about 200 bp.

Because the extracted DNA is cotton fiber genomic DNA, misidentification of transcription factor binding site elements caused by DNA methylation differences among cotton tissues is avoided.

Further preferably, in step S1, the fragmentation is performed with a Bioruptor pico ultrasonicator set to Time ON 30 s, Time OFF 30 s, Cycle Num 13, finally yielding DNA fragments of about 200 bp.

In the method described above, preferably, in step S1, the obtained DNA fragments are end-repaired with an end-repair enzyme, dATP and Klenow enzyme are then added for A-tailing to obtain A-tailed DNA, and Illumina sequencing adapters are added and ligated to obtain the Index-DNA fragments.

Further preferably, the Illumina sequencing adapter is an annealed TruSeq adapter.

In the method described above, preferably, in step S2, the gene sequence of the cotton transcription factor is shown in SEQ ID No.1, and the tag is a GST tag, a His tag, a Flag tag or a GFP tag.

In the method described above, preferably, in step S3, the in vitro incubation is carried out in Incubation Buffer with rotary shaking at room temperature for 1 h; a polyclonal antibody against the tag is added and incubated overnight; rProtein A agarose beads are finally added and incubated at 4 °C for 3-4 h; after washing, the rProtein A agarose beads are separated from the tag polyclonal antibody-tagged fusion protein-Index-DNA fragment complex with an eluent; and finally the Index-DNA fragments in the complex are extracted with phenol-chloroform.

In the method described above, preferably, the Incubation Buffer contains, at final concentration, 0.05 M NaCl, 0.02 M Tris-HCl, 0.005 M EDTA, 0.0002 M PMSF and Complete Mini (Roche).

The rProtein A agarose beads are small white solid particles; after centrifugation, the pellet is the rProtein A agarose bead-tag polyclonal antibody-tagged fusion protein-Index-DNA fragment complex.

In the method described above, preferably, the washing is performed by adding Buffer A-50mM NaCl, Buffer A-100mM NaCl and Buffer A-150mM NaCl solutions in that order, each wash consisting of adding the solution, mixing by inversion for 1 minute, centrifuging at 5000 rpm for 1 minute at room temperature, standing on ice for 1 minute, and removing as much supernatant as possible; Buffer A contains 0.05 M Tris-HCl (pH 7.5) and 0.01 M EDTA (pH 8.0), and 50 mM, 100 mM and 150 mM NaCl denote the final NaCl concentration in Buffer A;

the eluent is Elution buffer preheated to 42 °C; the Elution buffer contains, at final concentration, 0.05 M NaCl, 0.02 M Tris-HCl (pH 7.5), 0.005 M EDTA (pH 8.0) and 1% SDS; the elution is carried out twice, each time in a 65 °C water bath for 15 min with mixing by inversion once every 5 min.
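The final concentrations given for Buffer A can be converted into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is only a minimal sketch: the stock concentrations (1 M Tris-HCl, 0.5 M EDTA, 5 M NaCl), the 10 mL batch size and the function name are assumptions made for illustration and are not specified by the method.

```python
# Assumed stock concentrations in mol/L (hypothetical; replace with the stocks on hand).
STOCKS_M = {"Tris-HCl pH 7.5": 1.0, "EDTA pH 8.0": 0.5, "NaCl": 5.0}


def stock_volumes_ml(final_m, total_ml):
    """Return mL of each stock, plus water, for the requested final molarities."""
    volumes = {}
    for name, conc in final_m.items():
        # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
        volumes[name] = round(conc * total_ml / STOCKS_M[name], 4)
    volumes["ddH2O"] = round(total_ml - sum(volumes.values()), 4)
    return volumes


# Buffer A: 0.05 M Tris-HCl pH 7.5 and 0.01 M EDTA pH 8.0, with 50, 100 or 150 mM NaCl.
buffer_a = {
    nacl_mm: stock_volumes_ml(
        {"Tris-HCl pH 7.5": 0.05, "EDTA pH 8.0": 0.01, "NaCl": nacl_mm / 1000},
        total_ml=10,
    )
    for nacl_mm in (50, 100, 150)
}

for nacl_mm, recipe in buffer_a.items():
    print(f"Buffer A-{nacl_mm}mM NaCl (10 mL): {recipe}")
```

Only the NaCl and water volumes change between the three wash buffers, which is why they can be prepared side by side from the same stocks.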

In the method described above, in step S4, the amplification uses a 50 µL system consisting of 14 µL DAP-DNA, 25 µL 2× KAPA HiFi HotStart ReadyMix, 1 µL 0.25 µM TruSeq PCR primer cocktail and 10 µL ddH2O; the amplification program is 98 °C for 45 s; 3-22 cycles of 98 °C for 15 s, 63 °C for 30 s and 72 °C for 30 s; then 72 °C for 1 min and hold at 4 °C; the amplified product is run on a 1% agarose gel, and fragments of about 300 bp are recovered and sequenced.
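As a quick consistency check on the amplification system, the component volumes can be summed and scaled to several reactions. This is a sketch only: the 10% pipetting excess and the helper names are assumptions, and the program time ignores instrument ramp times.

```python
# Per-reaction volumes in microlitres, as listed in the text.
PCR_MIX_UL = {
    "DAP-DNA": 14,
    "2x KAPA HiFi HotStart ReadyMix": 25,
    "TruSeq PCR primer cocktail": 1,
    "ddH2O": 10,
}


def master_mix(n_reactions, excess=0.1):
    """Volumes of the shared components for n reactions (template is added per tube).

    The 10% excess is an assumed allowance for pipetting loss.
    """
    factor = n_reactions * (1 + excess)
    return {
        name: round(vol * factor, 2)
        for name, vol in PCR_MIX_UL.items()
        if name != "DAP-DNA"
    }


def program_seconds(cycles):
    """Hold time of the program: 45 s, then cycles of 15 + 30 + 30 s, then 60 s."""
    return 45 + cycles * (15 + 30 + 30) + 60


print("total per reaction (uL):", sum(PCR_MIX_UL.values()))
print("master mix for 4 reactions:", master_mix(4))
print("hold time at 22 cycles (s):", program_seconds(22))
```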

In the method described above, preferably, in step S4, identifying and confirming the resulting binding element sequences comprises: first aligning the sequencing data to the cotton genome with bowtie software; calling peaks with Popera software; finally determining the correct positions of the peaks with BEDTools software; performing motif analysis on the peaks; identifying the promoters of the genes downstream of the peaks; and overlapping the peaks with DHS data of the cotton fiber development period to draw a Venn diagram, wherein the elements shared by all three analyses are the binding elements downstream of the transcription factor.
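The selection of elements shared by all three analyses amounts to a three-way set intersection. The sketch below illustrates this with plain Python sets; the peak identifiers and the three input sets are invented toy data, and the function name is hypothetical.

```python
def venn3_regions(a, b, c):
    """Split three sets into the seven regions of a three-set Venn diagram."""
    return {
        "only_a": a - b - c,
        "only_b": b - a - c,
        "only_c": c - a - b,
        "a_and_b": (a & b) - c,
        "a_and_c": (a & c) - b,
        "b_and_c": (b & c) - a,
        "all_three": a & b & c,
    }


# Toy peak IDs (hypothetical) passing each of the three analyses.
with_motif = {"peak1", "peak2", "peak3", "peak5"}
in_promoter = {"peak1", "peak2", "peak4"}
in_dhs = {"peak1", "peak3", "peak4", "peak6"}

regions = venn3_regions(with_motif, in_promoter, in_dhs)
for name, members in regions.items():
    print(name, sorted(members))

# Candidate binding elements are the peaks present in all three sets.
candidates = sorted(regions["all_three"])
```

The seven region sizes are exactly the counts a Venn diagram would display; only the central region is carried forward.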

Further preferably, the Venn diagram is drawn by overlapping with a DHS map of the cotton fiber development period; the DHS map contains most of the active functional elements of that period, which avoids misidentifying inactive DNA element sites.
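Overlapping peaks with a DHS map is an interval-intersection problem. The following pure-Python sketch shows the idea on toy data; it is not the BEDTools implementation, the chromosome names and coordinates are invented, and half-open BED-style coordinates (start inclusive, end exclusive) are assumed.

```python
from collections import defaultdict


def peaks_in_dhs(peaks, dhs_sites):
    """Keep the peaks that overlap at least one DHS on the same chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in dhs_sites:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < d_end and d_start < end for d_start, d_end in by_chrom[chrom]):
            kept.append((chrom, start, end))
    return kept


# Toy intervals (hypothetical coordinates).
peaks = [("A01", 100, 300), ("A01", 1000, 1200), ("D05", 50, 250)]
dhs = [("A01", 250, 400), ("A01", 1200, 1500), ("D05", 0, 60)]

active_peaks = peaks_in_dhs(peaks, dhs)
print(active_peaks)
```

The second peak is dropped because it only touches the DHS boundary at position 1200 without sharing a base, which is how half-open coordinates behave.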

(III) Advantageous effects

The beneficial effects of the invention are as follows:

The invention provides a method for identifying transcription factor binding elements in the cotton fiber development period by DNA affinity protein sequencing. A tagged cotton transcription factor fusion protein is expressed in vitro in a prokaryotic system and used to capture affinity fragments from cotton fiber genomic DNA; an Illumina sequencing library is constructed and sequenced; motif analysis is performed on the peaks; the promoters of the genes downstream of the peaks are identified with BEDTools software; and the peaks are overlapped with DHS data of the cotton fiber development period to determine the correct positions of the DNA element sequences. The method identifies the downstream binding sites of cotton transcription factors quickly and accurately without genetic transformation, and has considerable practical value and significance.

The method for identifying the transcription factor binding element in the cotton fiber development period by using DNA affinity protein sequencing has the following advantages:

1. Fast. The experimental period of the method is 3 days, far shorter than the 6-8 months required for cotton genetic transformation.

2. Accurate. The results obtained for GhWRKY70 with this method are essentially consistent with the binding sites of its homologous gene in Arabidopsis.

3. Economical. The method selects the corresponding antibody according to the GST tag confirmed in the fusion protein, so there is no need for a laborious specific-antibody screening process or for the time and expense of producing or purchasing a specific antibody.

4. Convenient. The experimental results are easy to reproduce, the method is applicable to tissues of most plants for which DHS sequencing has been performed, and it can be used directly to identify the core region of a functional DNA element.

Drawings

FIG. 1 shows the electrophoresis detection of cotton fiber development-period genomic DNA fragmented by sonication;

FIG. 2 is a vector map of pGEX-4T-1-GhWRKY70;

FIG. 3 shows the Western Blot detection of the prokaryotically expressed protein;

FIG. 4 shows the Illumina sequencing library purified by gel recovery;

FIG. 5 shows the analysis and identification of the downstream binding sites of GhWRKY70.

Detailed Description

A transcription factor binds specifically to the cis-regulatory element DNA of its downstream target genes and thereby regulates their expression. For cotton, a non-model plant, studying the downstream binding sites of transcription factors is difficult, and studying the binding sites in a specific period of cotton fiber development is more difficult still.

Through extensive experimental research, the invention adopts the DNA affinity protein sequencing (DAP-seq) method. First, the transcription factor is expressed as a fusion protein carrying a specific tag (GST, Flag, GFP and the like), so that a commercial standard antibody can be used to recognize the fusion protein, avoiding the high cost and time required to prepare a transcription factor antibody. Second, genomic DNA from the cotton fiber development period is used, which exploits the fact that DNA methylation hinders the binding of transcription factors to their target sites, so that the sites finally bound are DNA elements of the fiber-specific period. Finally, the DHS map of the cotton fiber development period and the DAP-seq data are cross-validated to exclude binding at inactive sites, yielding transcription factor binding DNA elements within the open chromatin regions of cotton fiber, that is, the specific binding sites of the transcription factor in cotton fiber development; the genes downstream of these sites are the target genes of the transcription factor. Through this multiple verification, the method obtains transcription factor binding elements in the cotton fiber development period efficiently and accurately. To better explain the invention and facilitate understanding, it is described in detail below by way of specific embodiments with reference to the accompanying drawings.

Example 1

A method for identifying a transcription factor binding element in a cotton fiber development period by using DNA affinity protein sequencing specifically comprises the following steps:

S1, randomly fragmenting cotton genomic DNA and ligating Illumina sequencing adapters to construct the Index-DNA library

1. Random disruption of cotton genomic DNA

(1) Genomic DNA of the cotton fiber development period was extracted and checked using the DNeasy Plant Mini Kit (50) plant DNA extraction kit; the extraction and detection steps are not described here. The extracted genomic DNA was diluted to 20 ng/µL with TE, and 300 µL was transferred to a Covaris microTUBE and randomly fragmented in a Bioruptor pico ultrasonicator set to Time ON 30 s, Time OFF 30 s, Cycle Num 13. The electrophoresis result is shown in FIG. 1, in which M denotes the DNA marker and lanes 1 and 2 are both fragmented cotton fiber DNA. The result shows that the program setting is appropriate: the cotton fiber DNA fragments obtained are about 200 bp. A shorter sonication time gives larger fragments, and a longer time gives smaller fragments. These sonication conditions were established through extensive experimental investigation.

(2) The sonicated samples were transferred to clean 1.5 mL tubes. Add 30 µL of 3 M NaOAc (0.1 vol) and 600 µL of cold 100% ethanol (2 vol) and vortex to mix.

(3) Incubate on ice or at -20 °C for at least 15 minutes but not more than 24 hours.

(4) Centrifuge at maximum speed (20,000 g) at 4 °C for 20 minutes and decant the supernatant.

(5) Wash the pellet with 1 mL of 70% ethanol, centrifuge at maximum speed (20,000 g) at 4 °C for 10 minutes, and decant the supernatant.

(6) Spin briefly at 3000 g for 5 seconds, then pipette off all remaining ethanol, taking care not to disturb the DNA pellet.

(7) Dry the pellet at room temperature (22 °C) for 10-15 minutes or at 37 °C for 5-10 minutes. Make sure the pellet is completely dry before resuspension. Over-drying may make resuspension more difficult but should not damage the DNA.

(8) Resuspend the DNA pellet in 30-40 µL TE. Leave the DNA at 37 °C for 5 minutes to help it dissolve, then use it in the following reaction.
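The precipitation volumes in the steps above follow a fixed ratio to the sample volume (0.1 volume of 3 M NaOAc, 2 volumes of cold ethanol), and the DNA input follows from the dilution in step (1). A minimal sketch of that arithmetic, with hypothetical function names, is:

```python
def precipitation_volumes_ul(sample_ul):
    """Volumes for ethanol precipitation: 0.1 vol 3 M NaOAc and 2 vol 100% ethanol."""
    return {
        "3 M NaOAc": round(0.1 * sample_ul, 2),
        "100% ethanol": round(2 * sample_ul, 2),
    }


def dna_mass_ng(conc_ng_per_ul, volume_ul):
    """Total DNA mass in a sample of the given concentration and volume."""
    return conc_ng_per_ul * volume_ul


# 300 uL of sonicated DNA at 20 ng/uL, as in step (1).
print(precipitation_volumes_ul(300))   # reagent volumes for the 300 uL sample
print(dna_mass_ng(20, 300), "ng of input DNA")

# The same ratios apply to the later 50 uL reactions.
print(precipitation_volumes_ul(50))
```

The same helper reproduces the 5 µL / 100 µL volumes used after the 50 µL end-repair, A-tailing and ligation reactions.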

2. End-repair and making Index-DNA library:

(1) In a 1.5 mL centrifuge tube, prepare a 50 µL reaction from the following components (End-It™ kit (Lucigen); thaw on ice before use, centrifuge briefly, and mix by pipetting before adding, avoiding air bubbles):

Note: add the components to the 1.5 mL centrifuge tube in order, changing the pipette tip each time and marking off each component to avoid errors. Mix by flicking the tube with a finger, centrifuge briefly, and leave at room temperature for 45 min.

(2) Add 5 µL of 3 M NaOAc (0.1 vol) and 100 µL of cold 100% ethanol (2 vol) and vortex to mix.

(3) Incubate on ice or at -20 °C for at least 15 minutes but not more than 24 hours.

(4) Centrifuge at maximum speed (20,000 g) at 4 °C for 20 minutes and decant the supernatant.

(5) Wash the pellet with 1 mL of 70% ethanol, centrifuge at maximum speed (20,000 g) at 4 °C for 10 minutes, and decant the supernatant.

(6) Spin briefly at 3000 g for 5 seconds, then pipette off all remaining ethanol, taking care not to disturb the DNA pellet.

(7) Dry the pellet at room temperature (22 °C) for 10-15 minutes or at 37 °C for 5-10 minutes. Make sure the pellet is completely dry before resuspension. Over-drying may make resuspension more difficult but should not damage the DNA.

(8) Resuspend the DNA pellet in 41.5 µL EB and leave at 37 °C for 5 minutes to help dissolve the DNA.

3. A-tailing, set up as follows:

(1) Flick the tube wall to mix, centrifuge briefly, incubate in a PCR instrument at 37 °C for 30 min, then take out and place on ice immediately.

(2) Add 5 µL of 3 M NaOAc (0.1 vol) and 100 µL of cold 100% ethanol (2 vol) and vortex to mix.

(3) Incubate on ice or at -20 °C for at least 15 minutes but not more than 24 hours.

(4) Centrifuge at maximum speed (20,000 g) at 4 °C for 20 minutes and decant the supernatant.

(5) Wash the pellet with 1 mL of 70% ethanol, centrifuge at maximum speed (20,000 g) at 4 °C for 10 minutes, and decant the supernatant.

(6) Spin briefly at 3000 g for 5 seconds, then pipette off all remaining ethanol, taking care not to disturb the DNA pellet.

(7) Dry the pellet at room temperature (22 °C) for 10-15 minutes or at 37 °C for 5-10 minutes. Make sure the pellet is completely dry before resuspension. Over-drying may make resuspension more difficult but should not damage the DNA.

(8) Resuspend the DNA pellet in 20 µL EB and leave at 37 °C for 5 minutes to help dissolve the DNA, giving the A-tailed DNA.

4. Index adapter ligation

In this step, different Index adapters can be added according to the number of samples, or a single Index adapter can be used so that all proteins share one library; the Index information must be recorded clearly at sequencing to avoid confusion. In this example, the Index of the annealed TruSeq Adapter is ATCACG (SEQ ID NO.2). The adapter consists of TruSeq Universal Adapter (SEQ ID NO. 3): AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC. T and TruSeq Adapter Index 1 (SEQ ID NO. 4): GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG, in which the modification mark represents phosphorylation. The ligation reaction is set up as follows:

The 50 µL reaction was mixed by flicking the tube with a finger, centrifuged briefly, and left at room temperature for 15 min.
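The two adapter strands listed above anneal only over a short complementary stretch, which can be checked directly from the sequences. The sketch below is illustrative: it assumes the mark printed inside the universal adapter sequence is a modification symbol and drops it, and the helper names are hypothetical.

```python
# Sequences as listed in the text; the mark before the final T of the
# universal adapter is assumed to be a modification symbol and is omitted here.
UNIVERSAL = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC" + "T"
INDEX1 = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG"
INDEX = "ATCACG"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def stem_length(universal, indexed):
    """Longest 5' prefix of `indexed` that pairs with `universal` just before its 3' T."""
    body = universal[:-1]  # the final T is left unpaired as an overhang
    best = 0
    for k in range(1, min(len(body), len(indexed)) + 1):
        if body.endswith(revcomp(indexed[:k])):
            best = k
    return best


stem = stem_length(UNIVERSAL, INDEX1)
print("complementary stem (nt):", stem)
print("paired region of Index 1:", INDEX1[:stem])
print("index present once:", INDEX1.count(INDEX) == 1)
```

The unpaired 3' T left on the universal strand is what pairs with the A overhang added to the fragments in the A-tailing step.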

The annealed TruSeq adapter is prepared as follows: dissolve the adapters in Annealing buffer (10 mM Tris pH 8.0, 50 mM NaCl, 1 mM EDTA); mix 10 µL TruSeq Universal Adapter (100 µM) with 10 µL TruSeq Adapter Index 1 (100 µM); incubate at 75 °C for 15 min, 60 °C for 10 min, 50 °C for 10 min, 40 °C for 10 min and 25 °C for 30 min; then store at -20 °C.

(1) Add 5 µL of 3 M NaOAc (0.1 vol) and 100 µL of cold 100% ethanol (2 vol) and vortex to mix.

(2) Incubate on ice or at -20 °C for at least 15 minutes but not more than 24 hours.

(3) Centrifuge at maximum speed (20,000 g) at 4 °C for 20 minutes and decant the supernatant.

(4) Wash the pellet with 1 mL of 70% ethanol, centrifuge at maximum speed (20,000 g) at 4 °C for 10 minutes, and decant the supernatant.

(5) Spin briefly at 3000 g for 5 seconds, then pipette off all remaining ethanol, taking care not to disturb the DNA pellet.

(6) Dry the pellet at room temperature (22 °C) for 10-15 minutes or at 37 °C for 5-10 minutes. Make sure the pellet is completely dry before resuspension. Over-drying may make resuspension more difficult but should not damage the DNA.

(7) Resuspend the DNA pellet in 31-50 µL EB. Leave the DNA at 37 °C for 5 minutes to help it dissolve.

(8) Store at -20 °C; for long-term storage, keep at -80 °C.

S2, prokaryotic expression of cotton transcription factor GhWRKY70 and protein purification

1. GST tag fusion protein expression

The full-length sequence of the cotton transcription factor gene was cloned; the CDS sequence of the cotton transcription factor GhWRKY70 is shown in SEQ ID No.1. A prokaryotic expression vector carrying a GST tag, pGEX-4T-1-GhWRKY70 (available from Fenghui Bio; shown in FIG. 2), was constructed by homologous recombination (another suitable tag, such as Flag, GFP or His-tag, can be chosen according to laboratory conditions). Escherichia coli DH5α was transformed, positive clones were detected by colony PCR, and the constructed vector was sequenced to verify the gene sequence. The correctly sequenced plasmid was extracted (its sequence is shown in SEQ ID No.5, and it may be designated pGEX-4T-GST-GhWRKY70). The correct plasmid and the empty plasmid were each transformed into the E. coli expression strain BL21(DE3), and positive clones were screened by colony PCR. Protein expression was induced with 1 mM IPTG, the bacterial lysate was collected and analysed by SDS-polyacrylamide gel electrophoresis, and after staining the lysate with a suitable induction concentration was selected. The GhWRKY70-GST fusion protein and the GST tag protein were purified with GST magnetic beads, and the purified proteins were examined by Western Blot. The purified fusion protein after SDS-PAGE is shown in FIG. 3, in which M denotes the protein marker and lane 1 shows the GhWRKY70-GST fusion protein at 61 kDa (36 kDa + 25 kDa). After verification, the GhWRKY70-GST fusion protein and the GST tag protein were aliquoted at 5-10 µg/tube (about 20-30 µL/tube) and stored in a -80 °C freezer until use.

S3, obtaining transcription factor affinity DNA fragment

1. Take one tube of the GhWRKY70-GST fusion protein obtained in the previous step (about 20-30 µL/tube; GST tag protein serves as the control) and add Incubation Buffer to a total volume of 500 µL.

Incubation buffer was prepared as follows (10 mL):

2. Add 50-100 ng of the DAP library and rotate at room temperature (15 rpm) for 1 h.

3. Add the antibody (4-8 µg; GST polyclonal antibody, purchased from Shanghai Probiotics, No. D110271) to each tube, seal the tube with parafilm, and shake overnight at 4 °C.

4. Prepare Incubation buffer and keep it on ice. Take 40 µL of rProtein A-sepharose agarose beads (rPAS) per tube/sample, centrifuge at 5000 rpm for 1 min, and remove the supernatant.

5. Wash the beads with 500 µL of Incubation Buffer. Place on a magnet and aspirate the supernatant. Centrifuge at 5000 rpm for 1 min, stand on ice for 1 min, and remove the supernatant. Repeat three times to obtain washed rPAS.

6. Add the rProtein A-sepharose (rPAS) evenly to the antibody-bound sample tubes from step 3, seal the tubes with parafilm, and rotate (15 rpm) at 4 °C for 3 hours.

7. Prepare Buffer A (with 50 mM, 100 mM and 150 mM NaCl, 1 mL/sample each) and Elution Buffer (1 mL/sample) as follows:

Buffer A-50mM NaCl(10mL)

Buffer A-100mM NaCl(10mL)

Buffer A-150mM NaCl(10mL)

Elution buffer(10mL)

8. centrifuge at 13000rpm for 1 minute at ambient temperature, leave on ice for 1 minute, and carefully remove supernatant.

9. Add 1mL of Buffer A-50mM NaCl and mix by inversion for 1 min.

10. Centrifuge at 5000rpm for 1 minute at room temperature, and leave on ice for 1 minute to remove as much supernatant as possible.

11. Add 1mL of Buffer A-100mM NaCl and mix by inversion for 1 min.

12. Centrifuge at 5000rpm for 1 minute at room temperature, and leave on ice for 1 minute to remove as much supernatant as possible.

13. Add 1mL of Buffer A-150mM NaCl and mix by inversion for 1 min.

14. Centrifuge at 5000 rpm for 1 minute at room temperature, leave on ice for 1 minute, and remove as much supernatant as possible, using the finest pipette tip to aspirate the last traces.

15. Add 400 μL of preheated Elution buffer (42 ℃) and mix by inversion (this step separates the beads from the nucleosomes; take care to seal the tube tightly).

16. The mixture was placed in a water bath at 65 ℃ for 15min and mixed by inversion every five minutes.

17. Centrifuge at 13000 rpm for 1 minute, then let stand at room temperature for 1 minute. Pipette the supernatant into another newly labeled 2 mL centrifuge tube.

18. Add preheated Elution buffer (42 ℃) to the original tube again, and mix by inversion.

19. Place in a 65 ℃ water bath for 15 min, mixing by inversion every 5 min.

20. Centrifuge at 13000 rpm for 1 minute, then let stand at room temperature for 1 minute. Pipette the supernatant into the newly labeled centrifuge tube from step 17 and mix by inversion, giving a total of about 800 μL of liquid.

21. Add phenol and 400 μL of chloroform; this step separates the Index-DNA fragments from the protein.

22. Vortex for 20 seconds and centrifuge at 13000 rpm for 4 minutes at room temperature. Transfer the supernatant to a new 1.5 mL centrifuge tube.

23. Add glycogen, 80 μL of 3 M sodium acetate and 480 μL of 100% isopropyl alcohol, mix well, and leave at room temperature for 10 minutes.

24. Centrifuge at 13000 rpm and 4 ℃ for 10 minutes, and discard the supernatant.

25. Add 500 μL of 70% ethanol, mix by inversion for 1 minute, and wash the precipitate.

26. Centrifuge at 13000 rpm for 2-3 minutes at room temperature, decant the supernatant, and check for the pellet. Dry the pellet at room temperature (22 ℃) for 10-15 minutes, or at 37 ℃ for 5-10 minutes. Make sure the pellet is completely dry before resuspension. Drying the pellet for too long may make resuspension more difficult but should not damage the DNA. (The fusion protein has been removed by the steps above, leaving only the Index-DNA.)

27. Resuspend the DNA pellet (this DNA is the DAP-Index-DNA carrying the specific Index) in 15 μL of TE. Leave the DNA at 37 ℃ for 5 minutes to help it dissolve. The sample can be stored at -20 ℃; long-term storage requires -80 ℃.
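The volumes in step 23 can be related to the roughly 800 μL of combined eluate from step 20. The short Python sketch below works out those ratios. It assumes that the aqueous phase recovered after the phenol/chloroform step is still about 800 μL and it ignores the glycogen volume; the function name `precipitation_ratios` is purely illustrative and not part of the described method.

```python
# Illustrative check of the precipitation volumes in steps 20-23.
# Assumption: the recovered aqueous phase is about 800 uL (the combined eluate),
# and the glycogen volume is small enough to ignore.

def precipitation_ratios(aqueous_ul, naoac_ul, isopropanol_ul, naoac_stock_m=3.0):
    """Return the volume ratios and the final sodium acetate molarity."""
    total_ul = aqueous_ul + naoac_ul + isopropanol_ul
    return {
        # sodium acetate volume relative to the aqueous phase
        "naoac_vol_ratio": naoac_ul / aqueous_ul,
        # isopropyl alcohol volume relative to the aqueous phase
        "isopropanol_vol_ratio": isopropanol_ul / aqueous_ul,
        # final sodium acetate concentration after mixing (mol/L)
        "naoac_final_m": naoac_stock_m * naoac_ul / total_ul,
    }

ratios = precipitation_ratios(aqueous_ul=800, naoac_ul=80, isopropanol_ul=480)
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the sodium acetate is one tenth, and the isopropyl alcohol six tenths, of the aqueous volume.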

S4, amplifying and sequencing the Index-DNA fragments, mapping the obtained data to the cotton genome, comparing them with the DHS sites of the cotton fiber development period, and identifying and confirming the sequences of the binding elements obtained.

1. The PCR amplification reaction was prepared as follows:

After mixing by flicking the tube with a finger, the tube was briefly centrifuged and amplified with the following program: 98 ℃ for 45 sec; 11 cycles of 98 ℃ for 15 sec, 63 ℃ for 30 sec and 72 ℃ for 30 sec; 72 ℃ for 1 min; hold at 4 ℃. (Any number of cycles from 3 to 22 may be used; 11 cycles were used for GhWRKY70-GST in this embodiment.)

2. A 1% agarose gel was prepared with 1× TAE + EB.

3. Load 4 μL of 100 bp Marker and 4 μL of 6× loading dye. Place the samples on both sides of the Marker, and leave one empty lane between adjacent samples to avoid cross-contamination.

4. Electrophoresis at 100V for 20 min.

5. The fragments of about 300 bp were recovered using a QIAGEN recovery kit.

6. Finally, 31 μL of eluent was added, and after 5 min at room temperature the tube was centrifuged at 13200 rpm for 1 min.

7. The sample was sent to a sequencing company and sequenced on an Illumina sequencing platform with the PE150 strategy to obtain the raw data. The results are as follows:

Name            Total reads    Mapped reads          Unique reads
GhWRKY70-GST    75157242       69579946 (92.58%)     39322174
GST             61611168       57122032 (92.71%)     33151850
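The percentages in the table can be re-derived from the raw read counts. The following Python sketch is only a sanity check on the reported numbers; the dictionary layout and the helper name `percent` are illustrative choices and are not part of the described pipeline.

```python
# Read counts copied from the table above.
stats = {
    "GhWRKY70-GST": {"total": 75157242, "mapped": 69579946, "unique": 39322174},
    "GST": {"total": 61611168, "mapped": 57122032, "unique": 33151850},
}

def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)

# Share of all reads that were mapped to the genome.
mapping_rate = {name: percent(s["mapped"], s["total"]) for name, s in stats.items()}

# Share of the mapped reads that were kept as unique reads.
unique_of_mapped = {name: percent(s["unique"], s["mapped"]) for name, s in stats.items()}

for name in stats:
    print(name, "mapping rate (%):", mapping_rate[name],
          "| unique among mapped (%):", unique_of_mapped[name])
```

The recomputed mapping rates agree with the 92.58% and 92.71% reported in the table.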

S5, DAP-seq data analysis

1. The FASTQ files obtained from sequencing were aligned to the cotton reference genome using the bowtie2 v2.3.5 mapping software. Trimming and filtering of low-quality and duplicate reads were performed according to the data quality and the reference genome. Uniquely aligned reads were selected for in-depth analysis.

2. The background was controlled using the GEM v3.4 software, with the GST reads as the negative control. The parameters --k_min 6 --k_max 13 --fold 2 --outBED were set, and the reads specific to the target transcription factor were identified as the DAP peak data.

3. Peak identification was carried out with the Popera software, and the final peak positions were determined with the BEDTools software. Motif analysis and identification of the gene promoters downstream of the peaks were then performed, and a Venn diagram was drawn by overlapping the peaks with the DHS data of the cotton fiber development period. Elements found in all three are the binding elements downstream of the transcription factor. The results are shown in figure 5. Panel A is an IGV locus view: the red line (the short bar at the upper right) is a DHS site that is open in the cotton fiber development period; the brown GST line is the alignment against the cotton genome of the sequenced GST protein control; the green line B1 is the alignment against the cotton genome of the sequenced affinity DNA fragments obtained with the GhWRKY70-GST fusion protein; and the blue line (the short bar at the lower right) is the binding site peak that was identified. Panel B is the functional element sequence CGTTCAC at the blue-line site in A.

The final result was compared with the cotton fiber development DHS map produced in this laboratory, and it was determined that cotton GhWRKY70 can bind at the CGTTCAC site in the promoter region upstream of the GhNAC gene, thereby regulating the expression of the GhNAC gene.
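The two ideas in this analysis, keeping peaks that coincide with DHS sites and locating the CGTTCAC element in a promoter sequence, can be sketched in plain Python. This is a minimal illustration and not the BEDTools or motif-analysis software used here. The chromosome names, coordinates and promoter string are made-up toy data, and the helper names (`overlaps`, `peaks_in_dhs`, `find_motif`) are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]

def find_motif(seq, motif="CGTTCAC"):
    """Return (0-based start, strand) for every motif hit on either strand."""
    seq = seq.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc and rc != motif:
            hits.append((i, "-"))
    return hits

def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def peaks_in_dhs(peaks, dhs_sites):
    """Keep only the peaks that overlap at least one DHS interval."""
    return [p for p in peaks if any(overlaps(p, d) for d in dhs_sites)]

# Toy data (assumed for illustration only, not taken from the experiment).
peaks = [("A01", 100, 300), ("A01", 1000, 1200), ("D05", 50, 250)]
dhs = [("A01", 250, 400), ("D05", 300, 500)]
shared = peaks_in_dhs(peaks, dhs)

promoter = "TTAGCGTTCACGGATGTGAACGAA"  # made-up sequence containing the motif
hits = find_motif(promoter)

print("peaks overlapping DHS:", shared)
print("motif hits (start, strand):", hits)
```

Scanning both strands matters because a binding element can sit on either strand of the promoter; in the toy sequence the motif occurs once in the forward orientation and once as its reverse complement GTGAACG.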

The above description is only a preferred embodiment of the present invention and does not limit the invention to other forms. Any person skilled in the art may use the technical content disclosed above to change or modify it into an equivalent embodiment. However, any simple modification, equivalent change or adaptation of the above embodiments made according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.

Sequence listing

<110> Anyang Institute of Technology

Zhengzhou University

<120> method for identifying transcription factor binding element in cotton fiber development period by using DNA affinity protein sequencing

<160> 5

<170> SIPOSequenceListing 1.0

<210> 1

<211> 828

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

atggatactc tttatccatg gcctgaacct gaaacactat caagcaacaa gaaaagggta 60

atacaaaaac ttgtggaagg tcaacagtgt gctactgagc ttcaagttat tgtactccac 120

aacaacaagc cccctcaaca agctgaggag cttgtgcaaa agatcttgtg gtcatttaat 180

cagacacttt ctatgctagc tgaagctggt catcatgatg aagtgatttc ccagaatcag 240

gcaacttgta atgatgattg taagtctcaa gattctagtg agagcagcaa gagatcactt 300

tcagcattcg ttaaggataa gaggggctgt tacaagagaa agaggtttgc tcaaacaaag 360

atagtggtgt ctgataaaat agaagatggg catgcatgga gaaaatatgg acaaaaaaat 420

atcttacatt ctaaacatcc aaggagttac ttcaggtgca gtcacaagca tgatcaaggc 480

tgtagtgcta tcaaacaagt tcaaagaatg gaagatgatg cccaaatgta ccacatcaca 540

tacattggta cccacacttg cagagaccag tactcatcca tggctacacc acgaatcgac 600

agtccgagtc cgatattaaa actcgaatcc gaggaacaag cgacgacacc aagcaatgtt 660

acggatttgg attcgatgac catgtggacg gatgtaatga tgggtggtgt tggttttgaa 720

actgatgtgg tgtccaacat gtattcatgc actgaaatca cttgtctgga tttagaacct 780

gttgagcttg aaaatggttt gctgtttgat gacactgatt ttgcttag 828

<210> 2

<211> 6

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

atcacg 6

<210> 3

<211> 58

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

aatgatacgg cgaccaccga gatctacact ctttccctac acgacgctct tccgatct 58

<210> 4

<211> 63

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

gatcggaaga gcacacgtct gaactccagt cacatcacga tctcgtatgc cgtcttctgc 60

ttg 63

<210> 5

<211> 5789

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

acgttatcga ctgcacggtg caccaatgct tctggcgtca ggcagccatc ggaagctgtg 60

gtatggctgt gcaggtcgta aatcactgca taattcgtgt cgctcaaggc gcactcccgt 120

tctggataat gttttttgcg ccgacatcat aacggttctg gcaaatattc tgaaatgagc 180

tgttgacaat taatcatcgg ctcgtataat gtgtggaatt gtgagcggat aacaatttca 240

cacaggaaac agtattcatg tcccctatac taggttattg gaaaattaag ggccttgtgc 300

aacccactcg acttcttttg gaatatcttg aagaaaaata tgaagagcat ttgtatgagc 360

gcgatgaagg tgataaatgg cgaaacaaaa agtttgaatt gggtttggag tttcccaatc 420

ttccttatta tattgatggt gatgttaaat taacacagtc tatggccatc atacgttata 480

tagctgacaa gcacaacatg ttgggtggtt gtccaaaaga gcgtgcagag atttcaatgc 540

ttgaaggagc ggttttggat attagatacg gtgtttcgag aattgcatat agtaaagact 600

ttgaaactct caaagttgat tttcttagca agctacctga aatgctgaaa atgttcgaag 660

atcgtttatg tcataaaaca tatttaaatg gtgatcatgt aacccatcct gacttcatgt 720

tgtatgacgc tcttgatgtt gttttataca tggacccaat gtgcctggat gcgttcccaa 780

aattagtttg ttttaaaaaa cgtattgaag ctatcccaca aattgataag tacttgaaat 840

ccagcaagta tatagcatgg cctttgcagg gctggcaagc cacgtttggt ggtggcgacc 900

atcctccaaa atcggatctg gttccgcgtg gatccccgga attcccgggt atggatactc 960

tttatccatg gcctgaacct gaaacactat caagcaacaa gaaaagggta atacaaaaac 1020

ttgtggaagg tcaacagtgt gctactgagc ttcaagttat tgtactccac aacaacaagc 1080

cccctcaaca agctgaggag cttgtgcaaa agatcttgtg gtcatttaat cagacacttt 1140

ctatgctagc tgaagctggt catcatgatg aagtgatttc ccagaatcag gcaacttgta 1200

atgatgattg taagtctcaa gattctagtg agagcagcaa gagatcactt tcagcattcg 1260

ttaaggataa gaggggctgt tacaagagaa agaggtttgc tcaaacaaag atagtggtgt 1320

ctgataaaat agaagatggg catgcatgga gaaaatatgg acaaaaaaat atcttacatt 1380

ctaaacatcc aaggagttac ttcaggtgca gtcacaagca tgatcaaggc tgtagtgcta 1440

tcaaacaagt tcaaagaatg gaagatgatg cccaaatgta ccacatcaca tacattggta 1500

cccacacttg cagagaccag tactcatcca tggctacacc acgaatcgac agtccgagtc 1560

cgatattaaa actcgaatcc gaggaacaag cgacgacacc aagcaatgtt acggatttgg 1620

attcgatgac catgtggacg gatgtaatga tgggtggtgt tggttttgaa actgatgtgg 1680

tgtccaacat gtattcatgc actgaaatca cttgtctgga tttagaacct gttgagcttg 1740

aaaatggttt gctgtttgat gacactgatt ttgcttaggc ggccgcatcg tgactgactg 1800

acgatctgcc tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg 1860

gagacggtca cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg 1920

tcagcgggtg ttggcgggtg tcggggcgca gccatgaccc agtcacgtag cgatagcgga 1980

gtgtataatt cttgaagacg aaagggcctc gtgatacgcc tatttttata ggttaatgtc 2040

atgataataa tggtttctta gacgtcaggt ggcacttttc ggggaaatgt gcgcggaacc 2100

cctatttgtt tatttttcta aatacattca aatatgtatc cgctcatgag acaataaccc 2160

tgataaatgc ttcaataata ttgaaaaagg aagagtatga gtattcaaca tttccgtgtc 2220

gcccttattc ccttttttgc ggcattttgc cttcctgttt ttgctcaccc agaaacgctg 2280

gtgaaagtaa aagatgctga agatcagttg ggtgcacgag tgggttacat cgaactggat 2340

ctcaacagcg gtaagatcct tgagagtttt cgccccgaag aacgttttcc aatgatgagc 2400

acttttaaag ttctgctatg tggcgcggta ttatcccgtg ttgacgccgg gcaagagcaa 2460

ctcggtcgcc gcatacacta ttctcagaat gacttggttg agtactcacc agtcacagaa 2520

aagcatctta cggatggcat gacagtaaga gaattatgca gtgctgccat aaccatgagt 2580

gataacactg cggccaactt acttctgaca acgatcggag gaccgaagga gctaaccgct 2640

tttttgcaca acatggggga tcatgtaact cgccttgatc gttgggaacc ggagctgaat 2700

gaagccatac caaacgacga gcgtgacacc acgatgcctg cagcaatggc aacaacgttg 2760

cgcaaactat taactggcga actacttact ctagcttccc ggcaacaatt aatagactgg 2820

atggaggcgg ataaagttgc aggaccactt ctgcgctcgg cccttccggc tggctggttt 2880

attgctgata aatctggagc cggtgagcgt gggtctcgcg gtatcattgc agcactgggg 2940

ccagatggta agccctcccg tatcgtagtt atctacacga cggggagtca ggcaactatg 3000

gatgaacgaa atagacagat cgctgagata ggtgcctcac tgattaagca ttggtaactg 3060

tcagaccaag tttactcata tatactttag attgatttaa aacttcattt ttaatttaaa 3120

aggatctagg tgaagatcct ttttgataat ctcatgacca aaatccctta acgtgagttt 3180

tcgttccact gagcgtcaga ccccgtagaa aagatcaaag gatcttcttg agatcctttt 3240

tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac cgctaccagc ggtggtttgt 3300

ttgccggatc aagagctacc aactcttttt ccgaaggtaa ctggcttcag cagagcgcag 3360

ataccaaata ctgtccttct agtgtagccg tagttaggcc accacttcaa gaactctgta 3420

gcaccgccta catacctcgc tctgctaatc ctgttaccag tggctgctgc cagtggcgat 3480

aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac cggataaggc gcagcggtcg 3540

ggctgaacgg ggggttcgtg cacacagccc agcttggagc gaacgaccta caccgaactg 3600

agatacctac agcgtgagct atgagaaagc gccacgcttc ccgaagggag aaaggcggac 3660

aggtatccgg taagcggcag ggtcggaaca ggagagcgca cgagggagct tccaggggga 3720

aacgcctggt atctttatag tcctgtcggg tttcgccacc tctgacttga gcgtcgattt 3780

ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg ccagcaacgc ggccttttta 3840

cggttcctgg ccttttgctg gccttttgct cacatgttct ttcctgcgtt atcccctgat 3900

tctgtggata accgtattac cgcctttgag tgagctgata ccgctcgccg cagccgaacg 3960

accgagcgca gcgagtcagt gagcgaggaa gcggaagagc gcctgatgcg gtattttctc 4020

cttacgcatc tgtgcggtat ttcacaccgc ataaattccg acaccatcga atggtgcaaa 4080

acctttcgcg gtatggcatg atagcgcccg gaagagagtc aattcagggt ggtgaatgtg 4140

aaaccagtaa cgttatacga tgtcgcagag tatgccggtg tctcttatca gaccgtttcc 4200

cgcgtggtga accaggccag ccacgtttct gcgaaaacgc gggaaaaagt ggaagcggcg 4260

atggcggagc tgaattacat tcccaaccgc gtggcacaac aactggcggg caaacagtcg 4320

ttgctgattg gcgttgccac ctccagtctg gccctgcacg cgccgtcgca aattgtcgcg 4380

gcgattaaat ctcgcgccga tcaactgggt gccagcgtgg tggtgtcgat ggtagaacga 4440

agcggcgtcg aagcctgtaa agcggcggtg cacaatcttc tcgcgcaacg cgtcagtggg 4500

ctgatcatta actatccgct ggatgaccag gatgccattg ctgtggaagc tgcctgcact 4560

aatgttccgg cgttatttct tgatgtctct gaccagacac ccatcaacag tattattttc 4620

tcccatgaag acggtacgcg actgggcgtg gagcatctgg tcgcattggg tcaccagcaa 4680

atcgcgctgt tagcgggccc attaagttct gtctcggcgc gtctgcgtct ggctggctgg 4740

cataaatatc tcactcgcaa tcaaattcag ccgatagcgg aacgggaagg cgactggagt 4800

gccatgtccg gttttcaaca aaccatgcaa atgctgaatg agggcatcgt tcccactgcg 4860

atgctggttg ccaacgatca gatggcgctg ggcgcaatgc gcgccattac cgagtccggg 4920

ctgcgcgttg gtgcggatat ctcggtagtg ggatacgacg ataccgaaga cagctcatgt 4980

tatatcccgc cgttaaccac catcaaacag gattttcgcc tgctggggca aaccagcgtg 5040

gaccgcttgc tgcaactctc tcagggccag gcggtgaagg gcaatcagct gttgcccgtc 5100

tcactggtga aaagaaaaac caccctggcg cccaatacgc aaaccgcctc tccccgcgcg 5160

ttggccgatt cattaatgca gctggcacga caggtttccc gactggaaag cgggcagtga 5220

gcgcaacgca attaatgtga gttagctcac tcattaggca ccccaggctt tacactttat 5280

gcttccggct cgtatgttgt gtggaattgt gagcggataa caatttcaca caggaaacag 5340

ctatgaccat gattacggat tcactggccg tcgttttaca acgtcgtgac tgggaaaacc 5400

ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 5460

gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatggc 5520

gctttgcctg gtttccggca ccagaagcgg tgccggaaag ctggctggag tgcgatcttc 5580

ctgaggccga tactgtcgtc gtcccctcaa actggcagat gcacggttac gatgcgccca 5640

tctacaccaa cgtaacctat cccattacgg tcaatccgcc gtttgttccc acggagaatc 5700

cgacgggttg ttactcgctc acatttaatg ttgatgaaag ctggctacag gaaggccaga 5760

cgcgaattat ttttgatggc gttggaatt 5789
