Construction method, system and platform of gene site-specific knock-in vector

Document No.: 831803 | Published: 2021-03-30

Abstract: This technology, "Construction method, system and platform of gene site-specific knock-in vector" (一种基因定点敲入载体构建方法、系统及平台), was designed by 高翠 and 白颖 on 2020-12-23. The invention belongs to the field of bioinformatics and relates to a method, a system and a platform for constructing a gene site-specific knock-in vector. The method acquires the original data of the gene to be knocked in, creates a backbone corresponding to that gene, and constructs a vector knock-in model. It then acquires the sequence data to be inserted in real time and knocks the homology arm sequences of the gene, the exogenous gene and a number of functional elements into the vector at defined sites, in order, according to design rules. Together with the corresponding system and platform, the method enables site-specific knock-in of an exogenous gene, which gives a simpler genetic background and more accurate and efficient experimental work; compared with conventional practice it is simpler to operate, more flexible in design, lower in cost and shorter in cycle time. Site-specific knock-in also opens a possibility for future targeted gene therapy, in which a loss-of-function DNA segment is repaired into a functional one. The scheme is further intended to raise output, reduce manual labor and improve working efficiency.

1. A method for constructing a gene site-specific knock-in vector, characterized by comprising the following steps:

acquiring original data information of a gene to be knocked in, and creating a backbone corresponding to the gene to be knocked in;

constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

and acquiring sequence data to be inserted in real time and, using the vector knock-in model, knocking the homology arms of the gene to be knocked in and the functional elements into the vector at defined sites.

2. The method for constructing a gene site-specific knock-in vector according to claim 1, wherein the acquiring of the original data information of the gene to be knocked in further comprises:

marking the sequence information of the homology arms and the knock-in elements corresponding to the gene to be knocked in;

and screening, in real time, the backbones corresponding to the model of the gene to be knocked in.

3. The method for constructing a gene site-specific knock-in vector according to claim 2, wherein the marking of the sequence information of the homology arms and the knock-in elements corresponding to the gene to be knocked in further comprises:

creating a homology arm region and recording the source of each knock-in element sequence;

and processing mutations on the homology arms and the gene sequence in real time.

4. The method for constructing a gene site-specific knock-in vector according to claim 1, wherein the constructing of the vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements further comprises:

constructing a gene sequence element knock-in plasmid vector model;

constructing a tandem primer model according to the length of the knock-in sequence and the sequence source of each element;

constructing a bacterial detection primer model and a restriction digestion identification model;

constructing a sequencing primer model for sequencing the vector;

and wherein the acquiring of sequence data to be inserted in real time and, using the vector knock-in model, the knocking of the homology arm sequences of the gene to be knocked in, the exogenous gene and the regulatory element sequences into the vector at defined sites further comprises:

visually displaying the vector after each element sequence has been knocked in.

5. The method according to claim 4, wherein the constructing of the gene sequence element knock-in plasmid vector model further comprises:

creating an element knock-in scheme in real time according to the length of the knock-in sequence;

and selecting, in real time through the knock-in scheme, the endonuclease required to cut the backbone in each knock-in operation after the first step, and adding the selected endonuclease site to the sequence of the element to be knocked in at the corresponding step of the scheme, wherein a backbone cutting-site scheme is provided for the sequence that is knocked in first.

6. The method for constructing a gene site-specific knock-in vector according to claim 4, wherein the constructing of the tandem primer model according to the length of the knock-in sequence and the sequence source of each element further comprises:

grouping the sequences according to fragment length and element source, wherein extremely short sequences are joined through primers, sequences that have a template are obtained by PCR amplification with primers, the remaining sequences are obtained by gene synthesis, and adjacent synthesized fragments are synthesized together as a single fragment;

and constructing the tandem primer model according to the sequence fragment type.

7. The method for constructing a gene site-specific knock-in vector according to claim 5, wherein the creating of the element knock-in scheme in real time according to the length of the knock-in sequence further comprises:

creating a stepwise element knock-in sequence model; and judging how the plasmid vector backbone is cut in order to process the restriction digestion data.

8. A gene site-specific knock-in vector construction system, characterized by comprising:

an acquisition unit for acquiring the original data information of the gene to be knocked in and creating a backbone corresponding to the model of the gene to be knocked in;

a model construction unit for constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

and a site-specific knock-in unit for acquiring the sequence data to be knocked in in real time and, using the vector knock-in model, knocking each sequence element into the vector at defined sites.

9. The system for constructing a gene site-specific knock-in vector according to claim 8, further comprising:

a display module for visually displaying the vector after the elements are knocked in;

wherein the acquisition unit further comprises:

a marking module for marking the homology arms, gene sequence information, mutation information and regulatory element information corresponding to the gene to be knocked in;

a screening module for screening, in real time, the backbones corresponding to the gene to be knocked in;

a first creating module for creating a homology arm region and recording the source of each knock-in element sequence;

a processing module for processing mutations on the homology arms and the gene sequence in real time;

and wherein the model construction unit further comprises:

a first construction module for constructing a gene sequence element knock-in plasmid vector model, which also includes a scheme model for cutting the backbone with endonucleases;

a second construction module for constructing a tandem primer model according to the length of the knock-in sequence and the sequence source of each element;

a third construction module for constructing a bacterial detection primer model and a restriction digestion identification model;

a fourth construction module for constructing a sequencing primer model used to sequence the vector;

a third creating module for creating an element knock-in scheme in real time according to the length of the element sequence;

a first selection module for selecting, in real time through the knock-in scheme, the endonuclease required to cut the backbone in each knock-in operation after the first step, and adding the selected endonuclease site to the sequence of the element to be knocked in at the corresponding step of the scheme;

a grouping module for grouping the sequences according to fragment length and element source;

a fifth construction module for constructing a tandem primer model according to the sequence fragment type;

a second creating module for creating an element knock-in sequence model, and for judging and processing the restriction digestion data.

10. A gene site-specific knock-in vector construction platform, characterized by comprising: a processor, a memory, and a control program for the gene site-specific knock-in vector construction platform;

wherein the control program is stored in the memory and executed by the processor, and, when executed, implements the steps of the method for constructing a gene site-specific knock-in vector according to any one of claims 1 to 7.

Technical Field

The invention belongs to the field of bioinformatics, and particularly relates to a method, a system and a platform for constructing a gene site-specific knock-in vector.

Background

Currently, there are two main approaches to studying gene function: one is to cause loss of gene function; the other is to overexpress the gene. Gene knock-in is an effective technique for achieving stable overexpression. Gene knock-in refers to a gene editing technique in which a foreign functional gene is inserted into a genome by random integration, a transposon system, homologous recombination or a similar mechanism, and is then expressed in the cell.

Current knock-in models include conventional gene knock-in, point mutation, conditional point mutation and humanization. Earlier knock-in approaches mainly integrated the gene into the host genome by random integration or a transposon system. Such approaches are highly random and can easily disrupt the function of other genes. In addition, neither the inserted copy number nor the insertion position is fixed, so expression differs markedly between cells of the same cell line, which makes the expression level of the gene hard to control.

In addition, the market currently offers only a few plasmid vector design systems that assist manual design; no automated design method covers plasmid vector design together with every production step. In conventional practice, designing a vector scheme by hand takes a long time and considerable labor, and requires experienced experimenters to work out each technical point of the design; even so, the design requirements for constructing the vector of the target gene knock-in scheme may ultimately not be met.

Therefore, it is necessary to provide a method, a system and a platform for constructing a gene site-specific knock-in vector that address the technical problems of conventional experimental practice, which cannot achieve site-specific knock-in of a foreign gene and is complex, inefficient, time-consuming and labor-intensive.

Disclosure of Invention

In view of the technical problems and shortcomings of conventional experimental practice, which cannot achieve site-specific knock-in of an exogenous gene and is complex, tedious, inefficient, time-consuming and labor-intensive, the invention provides a method, a system and a platform for constructing a gene site-specific knock-in vector. Namely:

the first object of the present invention is to provide a method for constructing a gene site-specific knock-in vector;

the second object of the present invention is to provide a gene site-specific knock-in vector construction system;

the third object of the present invention is to provide a gene site-specific knock-in vector construction platform.

The first object of the present invention is achieved by a method that specifically comprises the following steps:

acquiring original data information of a gene to be knocked in, and creating a backbone corresponding to the gene to be knocked in;

constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

and acquiring sequence data to be inserted in real time and, using the vector knock-in model, knocking the homology arms of the gene to be knocked in and the functional regulatory elements into the vector at defined sites.

Further, the acquiring of the original data information of the gene to be knocked in further comprises:

marking the sequence information of the homology arms and the knock-in elements corresponding to the gene to be knocked in;

and screening, in real time, the backbones corresponding to the model of the gene to be knocked in.

Further, the marking of the sequence information of the homology arms and the knock-in elements corresponding to the gene to be knocked in further comprises:

creating a homology arm region and recording the source of each knock-in element sequence;

and processing mutations on the homology arms and the gene sequence in real time.

Further, the constructing of the vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements further comprises:

constructing a gene sequence element knock-in plasmid vector model;

constructing a tandem primer model according to the length of the knock-in sequence and the sequence source of each element;

constructing a bacterial detection primer model and a restriction digestion identification model;

constructing a sequencing primer model for sequencing the vector;

and the acquiring of sequence data to be inserted in real time and, using the vector knock-in model, the knocking of the homology arm sequences of the gene to be knocked in, the exogenous gene and the regulatory element sequences into the vector at defined sites further comprises:

visually displaying the vector after each element sequence has been knocked in.

Further, the constructing of the gene sequence element knock-in plasmid vector model further comprises:

creating an element knock-in scheme in real time according to the length of the knock-in sequence;

and selecting, in real time through the knock-in scheme, the endonuclease required to cut the backbone in each knock-in operation after the first step, and adding the selected endonuclease site to the sequence of the element to be knocked in at the corresponding step of the scheme, wherein a backbone cutting-site scheme is provided for the sequence that is knocked in first.

Further, the constructing of the tandem primer model according to the length of the knock-in sequence and the sequence source of each element further comprises:

grouping the sequences according to fragment length and element source, wherein extremely short sequences are joined through primers, sequences that have a template are obtained by PCR amplification with primers, the remaining sequences are obtained by gene synthesis, and adjacent synthesized fragments are synthesized together as a single fragment;

and constructing the tandem primer model according to the sequence fragment type.

Further, the creating of the element knock-in scheme in real time according to the length of the knock-in sequence further comprises:

creating a stepwise element knock-in sequence model; and judging how the plasmid vector backbone is cut in order to process the restriction digestion data.

The second object of the present invention is achieved by a system that specifically comprises:

an acquisition unit for acquiring the original data information of the gene to be knocked in and creating a backbone corresponding to the model of the gene to be knocked in;

a model construction unit for constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

and a site-specific knock-in unit for acquiring the sequence data to be knocked in in real time and, using the vector knock-in model, knocking each sequence element into the vector at defined sites.

Further, the system also comprises:

a display module for visually displaying the vector after the elements are knocked in;

the acquisition unit further comprises:

a marking module for marking the homology arms, gene sequence information, mutation information and regulatory element information corresponding to the gene to be knocked in;

a screening module for screening, in real time, the backbones corresponding to the gene to be knocked in;

a first creating module for creating a homology arm region and recording the source of each knock-in element sequence;

a processing module for processing mutations on the homology arms and the gene sequence in real time;

the model construction unit further comprises:

a first construction module for constructing a gene sequence element knock-in plasmid vector model, which also includes a scheme model for cutting the backbone with endonucleases;

a second construction module for constructing a tandem primer model according to the length of the knock-in sequence and the sequence source of each element;

a third construction module for constructing a bacterial detection primer model and a restriction digestion identification model;

a fourth construction module for constructing a sequencing primer model used to sequence the vector;

a third creating module for creating an element knock-in scheme in real time according to the length of the element sequence;

a first selection module for selecting, in real time through the knock-in scheme, the endonuclease required to cut the backbone in each knock-in operation after the first step, and adding the selected endonuclease site to the sequence of the element to be knocked in at the corresponding step of the scheme;

a grouping module for grouping the sequences according to fragment length and element source;

a fifth construction module for constructing a tandem primer model according to the sequence fragment type;

a second creating module for creating an element knock-in sequence model, and for judging and processing the restriction digestion data.

The third object of the present invention is achieved by a platform that comprises:

a processor, a memory, and a control program for the gene site-specific knock-in vector construction platform;

the control program is stored in the memory and executed by the processor, and, when executed, implements the steps of the method for constructing a gene site-specific knock-in vector.

Compared with the prior art, the invention has the following beneficial effects:

The method of the invention for constructing a gene site-specific knock-in vector acquires the original data information of the gene to be knocked in and creates a backbone corresponding to that gene; constructs a vector knock-in model according to the original data information, the backbone and the knock-in elements; and acquires the sequence data to be inserted in real time and, using the vector knock-in model, knocks the exogenous gene into the vector at defined sites. Together with the corresponding system and platform, the method enables site-specific knock-in of the exogenous gene, so that the genetic background is simpler and the experimental work is more accurate and efficient. Compared with conventional practice, operation is simpler, design is more flexible, cost is lower and the cycle is shorter. Site-specific knock-in also opens a possibility for future targeted gene therapy, in which a loss-of-function DNA segment is repaired into a functional DNA segment.

In other words, the scheme of the invention is intended to raise output, reduce manual labor, and improve working efficiency and accuracy, making a design process that used to be laborious simpler and faster. It also lowers the barrier of background knowledge: researchers without extensive experimental experience can obtain a detailed step-by-step scheme for every stage of designing a gene knock-in plasmid vector.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described here show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of a construction method of a gene site-specific knock-in vector according to the present invention;

FIG. 2 is a schematic view of the plasmid vector construction process based on Cas9 technology / ES cell gene targeting technology according to the present invention;

FIG. 3 is a schematic diagram of a general flow chart of vector construction in a method for constructing a gene site-specific knock-in vector according to the present invention;

FIG. 4 is a schematic diagram of an insertion grouping design flow structure of a gene site-specific knock-in vector construction method according to the present invention;

FIG. 5 is a schematic diagram of the design flow structure of the sequencing primer of the gene site-specific knock-in vector construction method of the present invention;

FIG. 6 is a schematic diagram of the sequence insertion and digestion scheme of a method for constructing a gene site-specific knock-in vector according to the present invention;

FIG. 7 is a schematic diagram of the tandem primer design flow in a method for constructing a gene site-specific knock-in vector according to the present invention;

FIG. 8 is a schematic diagram of the design flow of primers for bacterial detection in the method for constructing a gene site-specific knock-in vector according to the present invention;

FIG. 9 is a schematic view of a visualized flow chart of a method for constructing a gene site-specific knock-in vector according to the present invention;

FIG. 10 is a schematic diagram of a system architecture for constructing a gene site-specific knock-in vector according to the present invention;

FIG. 11 is a schematic diagram of a construction platform of a gene site-specific knock-in vector according to the present invention;

FIG. 12 is a block diagram of a computer-readable storage medium according to an embodiment of the present invention;

The objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

For better understanding of the objects, aspects and advantages of the present invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings, and other advantages and capabilities of the present invention will become apparent to those skilled in the art from the description.

The invention is capable of other and different embodiments and its several details are capable of modification in various other respects, all without departing from the spirit and scope of the present invention.

It should be noted that, if directional indications (such as up, down, left, right, front, back, …) are involved in an embodiment of the present invention, they are used only to explain the relative positions and movements of the components in a specific posture (as shown in the drawings); if that posture changes, the directional indications change accordingly.

In addition, where an embodiment of the present invention uses terms such as "first" and "second", these terms are for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. Furthermore, the technical solutions of the embodiments can be combined with one another, provided that a person skilled in the art can implement the combination; where a combination is contradictory or cannot be implemented, it should be regarded as non-existent and outside the protection scope of the present invention.

Preferably, the method for constructing a gene site-specific knock-in vector is applied to one or more terminals or servers. A terminal is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The terminal can be a desktop computer, a notebook, a handheld computer, a cloud server or other computing equipment. The terminal can interact with a user through a keyboard, a mouse, a remote control, a touch panel or a voice-control device.

The invention relates to a method, a system, a platform and a storage medium for constructing a gene site-specific knock-in vector.

FIG. 1 is a flowchart of a method for constructing a gene site-directed knock-in vector according to an embodiment of the present invention.

In this embodiment, the method for constructing the gene site-specific knock-in vector may be applied to a terminal or a fixed terminal having a display function; the terminal includes, but is not limited to, a personal computer, a smart phone, a tablet computer, and a desktop or all-in-one machine with a camera.

The method for constructing the gene site-specific knock-in vector can also be applied to a hardware environment consisting of a terminal and a server connected to the terminal through a network. Networks include, but are not limited to, a wide area network, a metropolitan area network, or a local area network. The method of the embodiment of the invention can be executed by a server, by a terminal, or by both together.

For example, for a terminal that needs to construct gene site-specific knock-in vectors, the construction function provided by the method of the present invention can be integrated directly on the terminal, or a client implementing the method can be installed. As another example, the method may run on a device such as a server in the form of a Software Development Kit (SDK) that exposes an interface to the gene site-specific knock-in vector construction function; a terminal or other device can then use the function through that interface.

The invention is further explained below with reference to the drawings.

As shown in FIG. 1, the present invention provides a method for constructing a gene site-specific knock-in vector, which specifically comprises the following steps:

S1, acquiring original data information of the gene to be knocked in, and creating a backbone corresponding to the gene to be knocked in;

S2, constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

S3, acquiring sequence data to be inserted in real time and, using the vector knock-in model, knocking the homology arms of the gene to be knocked in and the functional regulatory elements into the vector at defined sites.

That is, as shown in FIG. 2, the present invention acquires the original data information of the gene to be knocked in and creates a backbone corresponding to that gene; constructs a vector knock-in model according to the original data information, the backbone and the knock-in elements; and acquires the sequence data to be inserted in real time and, using the vector knock-in model, knocks the exogenous gene and the regulatory elements into the vector at defined sites. The operator only needs to select a few parameters and paste in the sequences, however complex, and can even specify point mutations; a detailed vector construction procedure is then obtained within one minute.
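The three steps S1 to S3 can be pictured as a small pipeline. The following Python sketch is illustrative only: the names `KnockInRequest`, `VectorModel`, `build_model` and `assemble`, and the ordering of left arm, elements, right arm, are assumptions made for this example and are not taken from the invention's implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class KnockInRequest:
    """S1 input (hypothetical structure): what the operator selects or pastes in."""
    technique: str                   # "ES" or "Cas9"
    left_arm: str                    # 5' homology arm sequence
    right_arm: str                   # 3' homology arm sequence
    elements: List[Tuple[str, str]]  # ordered (name, sequence) pairs


@dataclass
class VectorModel:
    """S2 output (hypothetical structure): backbone plus ordered parts."""
    backbone_name: str
    parts: List[Tuple[str, str]]


def build_model(request: KnockInRequest, backbone_name: str) -> VectorModel:
    # Assumed ordering: the homology arms flank the knock-in elements.
    parts = [("left_arm", request.left_arm)]
    parts.extend(request.elements)
    parts.append(("right_arm", request.right_arm))
    return VectorModel(backbone_name=backbone_name, parts=parts)


def assemble(model: VectorModel) -> str:
    """S3 (simplified): concatenate the parts in order to get the insert sequence."""
    return "".join(seq.upper() for _, seq in model.parts)
```

A real system would also carry restriction sites, primers and mutations through these steps; the sketch only shows how the inputs of S1 flow into the model of S2 and the ordered insert of S3.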

The acquiring of the original data information of the gene to be knocked in further comprises the following steps:

S11, marking the sequence information of the homology arms and the knock-in elements corresponding to the gene to be knocked in;

S12, screening, in real time, the backbones corresponding to the model of the gene to be knocked in.

Specifically, for backbone selection, as shown in FIG. 3, an appropriate plasmid backbone is selected automatically for vector design according to the technology type entered by the user and the sequences of the elements to be inserted. The parameters that influence backbone selection include:

1) the technology type: ES cell targeting technology or CRISPR/Cas9 technology;

2) the Promoter;

3) the LSL module: recombinase recognition sequence element + expression termination element + recombinase recognition sequence element;

4) the polyA.

Preferably, the sequence elements to be inserted and the backbone elements to be cut out are determined as follows: for site-specific knock-in, the Promoter, LSL module and polyA designed by the user are compared with the corresponding elements on the backbone, and from this comparison the endonuclease scheme for the elements to be replaced and the elements to be inserted are calculated in order to choose the backbone. In-situ knock-in must distinguish between the two technical routes, ES and Cas9, because each uses a different plasmid vector design scheme.
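The comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Backbone` and `Design` records, the slot names, and the rule "pick the backbone of the right technology type that needs the fewest element replacements" are hypothetical and are not specified by the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Backbone:
    name: str
    technique: str            # "ES" or "Cas9"
    promoter: Optional[str]   # element already present on the backbone, if any
    lsl: Optional[str]
    polya: Optional[str]


@dataclass(frozen=True)
class Design:
    technique: str
    promoter: Optional[str]   # element requested by the user, if any
    lsl: Optional[str]
    polya: Optional[str]


SLOTS = ("promoter", "lsl", "polya")


def elements_to_insert(design: Design, backbone: Backbone) -> List[str]:
    """Slots where the requested element differs from what the backbone carries."""
    return [
        slot for slot in SLOTS
        if getattr(design, slot) is not None
        and getattr(design, slot) != getattr(backbone, slot)
    ]


def select_backbone(design: Design, library: List[Backbone]) -> Optional[Backbone]:
    """Assumed rule: same technology type, fewest elements left to insert."""
    candidates = [b for b in library if b.technique == design.technique]
    if not candidates:
        return None
    return min(candidates, key=lambda b: len(elements_to_insert(design, b)))
```

For a Cas9 design requesting a promoter, an LSL module and a polyA, a Cas9 backbone that already carries all three would be preferred over one that lacks the LSL module, since nothing has to be replaced on it.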

The marking of the homology arms and the gene sequence information corresponding to the gene to be knocked in further comprises:

S111, creating a homology arm region and recording the source of each insertion element sequence;

S112, processing mutations on the homology arms and the gene sequence in real time.

That is, acquiring the gene information in this embodiment further includes data analysis based on Ensembl gene data, in which the corresponding information is retrieved by the identifier of the target transcript, specifically:

a. basic gene information, used mainly for scheme display: gene name, length, chromosome, exon regions, protein coding region, and the correspondence between the NCBI and Ensembl gene records;

b. data for analysis: the target transcript information, the encoded protein information and the gene sequence, including 15 kb of sequence upstream and downstream of the gene;

c. definition of the homology arms: the two homology arms and the other gene sequence information required for vector design are marked according to the defined information.
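As a rough sketch of item c, the two homology arms can be marked by slicing the retrieved region (gene plus flanking sequence) around the insertion position. The function name, the coordinate convention (0-based, half-open) and the arm lengths are assumptions for illustration; the source does not state how long the arms are.

```python
from typing import Dict, Tuple


def mark_homology_arms(
    region_seq: str, insert_pos: int, left_len: int, right_len: int
) -> Dict[str, Tuple[int, int, str]]:
    """Return (start, end, sequence) for the 5' and 3' arms around insert_pos."""
    left_start = insert_pos - left_len
    right_end = insert_pos + right_len
    if left_start < 0 or right_end > len(region_seq):
        raise ValueError("homology arms extend beyond the retrieved region")
    return {
        "5_arm": (left_start, insert_pos, region_seq[left_start:insert_pos]),
        "3_arm": (insert_pos, right_end, region_seq[insert_pos:right_end]),
    }
```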

Preferably, when creating the insertion sequence template search, as shown in fig. 4, the source of each insertion sequence falls mainly into the following cases:

a. genomic data: obtained by designing primers directly and performing PCR amplification;

b. built-in elements: common elements obtained by direct PCR amplification; these include common complex sequence elements, whose sequences are not obtained by template PCR but which affect the workflow and the scheme of the subsequent vector construction;

c. vector template library: a PCR amplification template is identified by sequence alignment and analysis;

d. genomic ORF sequence: found by searching and aligning against a dedicated vendor website, and can be ordered and purchased directly;

e. no template: the sequence is obtained by gene fragment synthesis.
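The five cases above amount to a lookup cascade. The sketch below is one possible reading, with several assumptions: the priority order follows the list order, exact substring matching stands in for the sequence alignment the text mentions, reverse-complement matches are ignored, and all names and return labels are hypothetical.

```python
from typing import Dict


def classify_source(
    seq: str,
    genome_region: str,
    builtin: Dict[str, str],
    template_library: Dict[str, str],
    orf_catalog: Dict[str, str],
) -> str:
    """Decide where an insertion sequence comes from (assumed priority: a to e)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    if seq in genome_region.upper():
        return "genomic PCR"                  # case a
    if any(seq == s.upper() for s in builtin.values()):
        return "built-in element"             # case b
    if any(seq in s.upper() for s in template_library.values()):
        return "template library PCR"         # case c
    if any(seq == s.upper() for s in orf_catalog.values()):
        return "purchase ORF clone"           # case d
    return "gene synthesis"                   # case e
```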

Constructing the vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements further comprises the following steps:

Mutations on the homology arms and the gene sequence are processed in real time; that is, it is judged whether each group of enzyme digestion data can be retained, and every introduced restriction site is to be completely erased. For a restriction recognition sequence inherent to the backbone, the judgment is made when a sequence whose two ends join the backbone is inserted: if the enzyme is not used again in a subsequent fragment insertion, the complete recognition sequence may be restored; otherwise the recognition sequence must be deleted through the primers when the vector is constructed.

S21, constructing a model for knocking the gene sequence elements into the plasmid vector;

S22, constructing a continuous transfer primer model according to the length of the inserted sequence and the sequence source of each element;

S23, constructing a bacteria detection primer model and an enzyme digestion identification model;

S24, constructing a sequencing primer model for sequencing the vector.

In the present embodiment, for sequencing primer design, as shown in fig. 5, the sequenced region comprises the complex elements on the backbone (such as 3 × SV40 polyA), a region at the boundary of each homologous sequence insert together with the exon regions it contains, and the entire sequence of every other inserted element. Sequencing primers are designed in the following cases:

a. backbone-fixed sequencing primers: it is judged whether each primer can effectively sequence its region;

b. sequencing primer pairs: preferably, the length of the primer product is close to 600 bp, and the fragment lengths of the endonuclease scheme differ by more than 100 bp.

Constructing the model for knocking the gene sequence elements into the plasmid vector further comprises:

S211, creating an element knock-in scheme in real time according to the length of the knock-in sequence;

S212, using the knock-in scheme to select, in real time, the endonucleases required to cut the backbone at each knock-in step after the first, and adding the recognition sequences of the selected endonucleases to the sequence of the element knocked in at the corresponding earlier step of the scheme; the first knock-in uses a cutting site scheme already present on the backbone.

That is, a scheme for inserting the elements into the plasmid vector is designed (model creation). Because the length of sequence that can be inserted into a backbone in one step is technically limited, an overly long KI sequence must be inserted in several steps. Each scheme has to decide on and design the insertions, and a multi-step scheme must introduce restriction recognition sequences at the corresponding positions during the preceding insertion step for use in the subsequent steps.

Specifically, as shown in fig. 6, the fragment insertion scheme is designed as follows:

a. if the total length of the inserted sequence is less than the configured maximum insertion length, the fragment can be inserted into the vector in one step;

b. if the total length of the inserted sequence exceeds the configured maximum insertion length, the fragment must be inserted into the plasmid backbone in multiple steps: specific complex elements are inserted separately, and the remaining sequence is split into several groups. The splitting principles are: keep each element intact as far as possible, use as few groups as possible, and make the group sizes as uniform as possible;

c. order of insertion:

1) PCR-amplified regions are inserted first, then fragments requiring gene synthesis, and finally complex elements;

2) a restriction scheme is designed for the inserted fragments: the first fragment always uses restriction sites already present on the backbone, and the restriction scheme for each subsequent fragment group must be introduced in advance;

d. the scheme must satisfy the following rules: the inserted sequence and the backbone must not contain sites for the enzymes used; when a group of sequences is inserted, 0 to 2 groups of restriction schemes are introduced at its two ends for subsequent fragment insertion; two groups of enzymes introduced at the same time must not conflict; and a newly introduced restriction scheme must not conflict with restriction combinations that have not yet been used;

e. judging whether each group of enzyme digestion data can be retained: all introduced restriction sites must be erased. For a restriction recognition sequence inherent to the backbone, the judgment is made when a sequence whose two ends join the backbone is inserted: if the enzyme is not used again in a subsequent fragment insertion, the complete recognition sequence may be restored; otherwise the recognition sequence must be deleted through the primers when the vector is constructed.
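The splitting principles in item b (elements intact, fewest groups, uniform sizes) can be illustrated with a small partitioning routine. This is a sketch under assumptions, not the algorithm of the invention: elements are taken in their cassette order and grouped only with neighbours, complex elements are assumed to have been set aside already, "uniform" is interpreted as minimising the largest group, and the function name and the example limit are hypothetical.

```python
from functools import lru_cache
from typing import List, Sequence, Tuple


def split_into_steps(elements: Sequence[Tuple[str, int]], max_len: int) -> List[List[str]]:
    """Group consecutive (name, length) elements into the fewest balanced insertion steps."""
    lengths = [length for _, length in elements]
    if any(length > max_len for length in lengths):
        raise ValueError("an element exceeds the insertion limit and cannot stay intact")
    n = len(lengths)
    if n == 0:
        return []
    prefix = [0]
    for length in lengths:
        prefix.append(prefix[-1] + length)

    # Fewest groups: greedy packing of consecutive elements gives the minimum count.
    groups, current = 1, 0
    for length in lengths:
        if current + length > max_len:
            groups, current = groups + 1, 0
        current += length

    @lru_cache(maxsize=None)
    def best(i: int, k: int) -> Tuple[int, Tuple[int, ...]]:
        """Smallest achievable largest-group size for elements[i:] in exactly k groups."""
        if k == 1:
            return prefix[n] - prefix[i], (n,)
        result = None
        for j in range(i + 1, n - k + 2):
            tail_max, cuts = best(j, k - 1)
            candidate = (max(prefix[j] - prefix[i], tail_max), (j,) + cuts)
            if result is None or candidate[0] < result[0]:
                result = candidate
        return result

    _, cuts = best(0, groups)
    steps, start = [], 0
    for cut in cuts:
        steps.append([name for name, _ in elements[start:cut]])
        start = cut
    return steps
```

With five elements of 3000, 2500, 1500, 2000 and 1000 nt and an assumed limit of 5000 nt, the routine needs three steps and groups them as 3000, 2500 + 1500 and 2000 + 1000, so the largest step is 4000 nt.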

Constructing the continuous transfer primer model according to the length of the knock-in sequence and the sequence source of each element further comprises:

S221, grouping the sequences by fragment length and element source: very short sequences are joined via primers, sequences that have a template are obtained by PCR amplification with primers, the remaining sequences are obtained by gene synthesis, and adjacent synthesized fragments are synthesized together as a single fragment;

S222, constructing the continuous transfer primer model according to the fragment type.

That is, for continuous transfer primer design, as shown in fig. 7, the continuous transfer primers used in each step of vector construction are designed according to the length of the inserted sequence and the sequence source of each element, specifically:

a. grouping the sequences by fragment length and element template source:

1) for a PCR-derived fragment, a pair of opposing primers is designed at the two ends of the sequence for PCR amplification;

2) if a short sequence fragment has a gene synthesis fragment at one end, the short fragment is synthesized together with that adjacent fragment;

3) several consecutive synthesized fragments can be combined and obtained in a single gene synthesis;

b. designing the primers according to the fragment type:

Synthesized fragments: 1) if the synthesized fragment joins the backbone, the 20 nt at the end of the backbone are appended directly to the end of the synthesized sequence; if the synthesized fragment joins another PCR fragment, the 20 nt at the end of that PCR fragment are appended to the end of the synthesized sequence and synthesized together with it; 2) a group of specific restriction recognition sequences is added to the ends of fragments that are synthesized together, and a pair of opposing primers is designed at the two ends.

PCR-amplified fragments: 1) the primer at a junction with the backbone must be extended to give a 20 nt overlap with the backbone; 2) primers at junctions between fragments also require an overlap of at least 20 nt; 3) if the sequence to be added at the end of the fragment exceeds the maximum primer length, additional primers are added for extension, with at least 20 nt of overlap between successive primers.
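The 20 nt overlap rule for PCR-amplified fragments can be sketched as below. Only the 20 nt overlap comes from the text; the 20 nt annealing length, the function names and the omission of melting-temperature and maximum-primer-length checks are assumptions made to keep the example short.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_primers(
    upstream: str, fragment: str, downstream: str, overlap: int = 20, anneal: int = 20
) -> Tuple[str, str]:
    """Build a primer pair whose 5' tails overlap the neighbouring sequences.

    upstream/downstream are the sequences (backbone or adjacent fragment) that the
    amplified fragment will be joined to; all three are given on the same strand.
    """
    if len(upstream) < overlap or len(downstream) < overlap or len(fragment) < anneal:
        raise ValueError("sequences are too short for the requested overlap")
    forward = upstream[-overlap:] + fragment[:anneal]
    reverse = reverse_complement(fragment[-anneal:] + downstream[:overlap])
    return forward, reverse
```

With the defaults each primer is 40 nt long: 20 nt that anneal to the fragment plus a 20 nt tail matching the neighbouring sequence.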

For bacteria detection (colony PCR) primer design, as shown in fig. 8, one to two groups of bacteria detection primers and one group of empty-vector primers are designed for each insertion step; the primer parameters are designed and screened according to rules summarized from the experimenters' experience:

a. if the inserted fragment is shorter than 500 nt, one pair of opposing primers is placed on the backbone upstream and downstream of the insert;

b. if the inserted fragment is longer than 500 nt, one pair of opposing primers is placed upstream and downstream of each of the two insertion junctions;

c. one pair of opposing primers is placed upstream and downstream of the backbone cutting position (the empty-vector primers), and the length of its PCR product must differ from the lengths of the bacteria detection products described above.
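Rules a and b can be expressed as a choice of amplicon layout. In this sketch the 500 nt threshold comes from the text, while the flank size, the treatment of an insert of exactly 500 nt (handled like the long case) and the function name are assumptions; actual primer picking is not attempted.

```python
from typing import List, Tuple


def detection_amplicons(
    insert_start: int, insert_len: int, flank: int = 150, threshold: int = 500
) -> List[Tuple[int, int]]:
    """Return (start, end) target regions for bacteria detection PCR products."""
    insert_end = insert_start + insert_len
    if insert_len < threshold:
        # Short insert: one product spanning the whole insert from backbone to backbone.
        return [(insert_start - flank, insert_end + flank)]
    # Long insert: one product across each of the two insertion junctions.
    return [
        (insert_start - flank, insert_start + flank),
        (insert_end - flank, insert_end + flank),
    ]
```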

For the design of the enzyme digestion identification scheme, digestion of the plasmid vector by single enzymes and by enzyme pairs is evaluated, and a suitable digestion scheme is selected. The screening requirements are as follows:

a. for fragments of 100 to 1000 nt, the length difference is at least 100 nt;

b. for fragments of 1000 to 2000 nt, the length difference is at least 150 nt;

c. for fragments of 2000 to 3000 nt, the length difference is at least 300 nt;

d. for fragments of 3000 to 5000 nt, the length difference is at least 500 nt;

e. for fragments of 5000 to 8000 nt, the length difference is at least 1000 nt;

f. a single-enzyme digestion scheme is preferred over a double-enzyme digestion scheme.
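The length-difference tiers above can be checked mechanically. The sketch below simulates digestion of a circular plasmid and tests whether the resulting fragments are far enough apart. Several points are assumptions, because the source does not specify them: the tier is chosen by the longer fragment of each pair, boundary lengths such as exactly 1000 nt fall into the lower tier, fragments outside 100 to 8000 nt are treated as unusable, and the function names are hypothetical.

```python
from typing import Iterable, List, Optional

# (upper bound of the tier in nt, minimum length difference in nt), from the list above.
TIERS = [(1000, 100), (2000, 150), (3000, 300), (5000, 500), (8000, 1000)]
MIN_FRAGMENT = 100


def circular_digest(plasmid_len: int, cut_positions: Iterable[int]) -> List[int]:
    """Fragment lengths produced by cutting a circular plasmid at the given positions."""
    cuts = sorted({p % plasmid_len for p in cut_positions})
    if not cuts:
        return [plasmid_len]
    return [
        (cuts[(i + 1) % len(cuts)] - cut) % plasmid_len or plasmid_len
        for i, cut in enumerate(cuts)
    ]


def min_gap(length: int) -> Optional[int]:
    """Required length difference for a fragment, or None if it is outside all tiers."""
    if length < MIN_FRAGMENT:
        return None
    for upper, gap in TIERS:
        if length <= upper:
            return gap
    return None


def bands_resolvable(fragments: Iterable[int]) -> bool:
    """True if every pair of fragments differs by at least the tier's minimum gap."""
    ordered = sorted(fragments)
    if any(min_gap(f) is None for f in ordered):
        return False
    # Checking neighbours is enough because the gap is taken from the longer fragment.
    return all(b - a >= min_gap(b) for a, b in zip(ordered, ordered[1:]))
```

Rule f is not encoded here; one simple option would be to evaluate single-enzyme schemes first and fall back to enzyme pairs only when none passes.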

Creating the element knock-in scheme in real time according to the length of the knock-in sequence further comprises:

S2121, creating a model of the stepwise knock-in order of the elements, and judging the cutting of the plasmid vector backbone in order to process the enzyme digestion data.

Specifically, scheme creation must satisfy the following rules: the inserted sequence and the backbone must not contain sites for the enzymes used; when a group of sequences is inserted, 0 to 2 groups of restriction schemes are introduced at its two ends for subsequent fragment insertion; two groups of enzymes introduced at the same time must not conflict; and a newly introduced restriction scheme must not conflict with restriction combinations that have not yet been used.

At the same time, it is judged whether each group of enzyme digestion data can be retained: all introduced restriction sites must be erased. For a restriction recognition sequence inherent to the backbone, the judgment is made when a sequence whose two ends join the backbone is inserted: if the enzyme is not used again in a subsequent fragment insertion, the complete recognition sequence may be restored; otherwise the recognition sequence must be deleted through the primers when the vector is constructed.
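The conflict rules above can be illustrated with a simple site search. The recognition sequences listed are the standard ones for these five commercially common enzymes; everything else is an assumption for illustration: the small enzyme panel, the function names, the reading of "conflict" as "the site already occurs in the insert or backbone, or the enzyme is reserved for a later step", and the linear treatment of the backbone (a site spanning the origin of a circular sequence would be missed).

```python
from typing import Dict, FrozenSet, List

# Palindromic recognition sites, so scanning one strand is sufficient.
SITES: Dict[str, str] = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
}


def count_sites(seq: str, site: str) -> int:
    """Count (possibly overlapping) occurrences of a recognition site in seq."""
    seq = seq.upper()
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def enzymes_safe_to_introduce(
    insert: str, backbone: str, reserved: FrozenSet[str] = frozenset()
) -> List[str]:
    """Enzymes whose site is absent from insert and backbone and not reserved for later."""
    return sorted(
        name
        for name, site in SITES.items()
        if name not in reserved
        and count_sites(insert, site) == 0
        and count_sites(backbone, site) == 0
    )
```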

Acquiring the sequence data to be inserted in real time and, using the vector knock-in model, knocking the exogenous gene into the vector at a fixed site further comprises:

S40, visually displaying the vector into which the elements have been knocked.

That is, in the scheme of the present invention, the visual display, as shown in fig. 9, may specifically comprise:

a. document download module: includes the gb annotation file, the strategy map image, the scheme document and the data documents produced by each stage of vector design;

b. enzyme digestion strategy diagram display module: a schematic of the recombinase reaction (strategy diagram);

c. vector construction process diagram display module: visually displays the knock-in scheme, annotated with the enzyme digestion scheme, the continuous transfer primers, the bacteria detection primers and the sequencing primers;

d. table data display module: the insertion enzyme digestion scheme, continuous transfer primer identification data, bacteria detection identification data, sequencing primer data and enzyme digestion identification data;

e. gene information display module: gene information from the NCBI website, including the gene description and genome version, and gene information and transcript graphics from the Ensembl website;

f. homology arm sequence analysis module: dot plot and GC content plot.
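The data behind the GC content plot in item f can be produced with a sliding window. This is a minimal sketch; the window and step sizes and the function name are assumptions, since the source does not give the parameters used for the plot.

```python
from typing import List, Tuple


def gc_content_windows(seq: str, window: int = 100, step: int = 50) -> List[Tuple[int, float]]:
    """Return (window start, GC fraction) pairs along a sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    points = []
    # A sequence shorter than the window still yields one (shortened) window.
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        if not chunk:
            break
        gc = chunk.count("G") + chunk.count("C")
        points.append((start, gc / len(chunk)))
    return points
```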

In order to achieve the above object, the present invention further provides a gene site-specific knock-in vector construction system, as shown in fig. 10, which specifically includes:

an acquisition unit, used for acquiring the original data information of the gene to be knocked in and creating the backbone corresponding to the gene model to be knocked in;

a model construction unit, used for constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

a site-specific knock-in unit, used for acquiring the sequence data to be knocked in in real time and, using the vector knock-in model, knocking each sequence element into the vector at a fixed site.

Further, the system also includes:

a display module, used for visually displaying the vector into which the exogenous gene and the regulatory element sequences have been knocked at a fixed site.

The acquisition unit further includes:

a marking module, used for marking the homology arms, gene sequence information, mutation information and regulatory element information corresponding to the gene to be knocked in;

a screening module, used for screening, in real time, the backbone corresponding to the gene to be knocked in;

a first creating module, used for creating the homology arm regions and filling in the source of each insertion element sequence;

a processing module, used for processing mutations on the homology arms and the gene sequence in real time.

The model construction unit further includes:

a first construction module, used for constructing the model for knocking the gene sequence elements into the plasmid vector, including the model of the endonuclease scheme for cutting the backbone;

a second construction module, used for constructing the continuous transfer primer model according to the length of the knock-in sequence and the sequence source of each element;

a third construction module, used for constructing the bacteria detection primer model and the enzyme digestion identification model;

a fourth construction module, used for constructing the sequencing primer model for sequencing the vector;

a third creating module, used for creating the element knock-in scheme in real time according to the length of the element sequence;

a first selection module, used for selecting, in real time via the knock-in scheme, the endonucleases required to cut the backbone at each knock-in step after the first, and adding the recognition sequences of the selected endonucleases to the sequence of the element knocked in at the corresponding step of the scheme;

a grouping module, used for grouping the sequences according to fragment length and element source;

a fifth construction module, used for constructing the continuous transfer primer model according to the fragment type;

a second creating module, used for creating the element knock-in order model and for judging and processing the enzyme digestion data.

That is, the system of the present invention specifically includes: a target gene information acquisition module, an insertion sequence template search module, a vector construction scheme design module, a continuous transfer primer design module, a sequencing primer design module, a bacteria detection primer design module and a visual display module.

In the system embodiment of the present invention, the functions of each unit or module correspond to the steps of the gene site-specific knock-in vector construction method; the details are described above and are not repeated here.

In order to achieve the above object, the present invention also provides a gene site-specific knock-in vector construction platform, as shown in fig. 11, comprising:

a processor, a memory and a gene site-specific knock-in vector construction platform control program;

wherein the control program is stored in the memory and executed by the processor, and, when executed, implements the steps of the gene site-specific knock-in vector construction method, for example:

S1, acquiring the original data information of the gene to be knocked in, and creating the backbone corresponding to the gene to be knocked in;

S2, constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

S3, acquiring the sequence data to be inserted in real time and, using the vector knock-in model, knocking the exogenous gene and the regulatory sequences into the vector at a fixed site.

The details of the steps have been set forth above and are not repeated here.

In an embodiment of the present invention, the processor built into the gene site-specific knock-in vector construction platform may consist of integrated circuits, for example a single packaged integrated circuit or a plurality of packaged integrated circuits with the same or different functions, and may include one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor connects to each component through various interfaces and lines, and performs the functions of gene site-specific knock-in vector construction and processes data by running or executing the programs or units stored in the memory and calling the data stored in the memory;

the memory is used for storing program code and various data; it is installed in the gene site-specific knock-in vector construction platform and provides high-speed, automatic access to programs and data during operation.

The memory includes Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disk memory, tape memory, or any other computer-readable medium that can be used to carry or store data.

In order to achieve the above object, the present invention further provides a computer-readable storage medium, as shown in fig. 12, in which a gene site-specific knock-in vector construction platform control program is stored; when executed, the control program implements the steps of the gene site-specific knock-in vector construction method, for example:

S1, acquiring the original data information of the gene to be knocked in, and creating the backbone corresponding to the gene to be knocked in;

S2, constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

S3, acquiring the sequence data to be inserted in real time and, using the vector knock-in model, knocking the exogenous gene and the regulatory element sequences into the vector at a fixed site.

The details of the steps have been set forth above and are not repeated here.

In describing embodiments of the present invention, it should be noted that any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and that the scope of the preferred embodiments of the present invention includes additional implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processing module-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM).

Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

In an embodiment of the present invention, to achieve the above object, the present invention further provides a chip system that includes at least one processor; when program instructions are executed in the at least one processor, the chip system performs the steps of the gene site-specific knock-in vector construction method, for example:

S1, acquiring the original data information of the gene to be knocked in, and creating the backbone corresponding to the gene to be knocked in;

S2, constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements;

S3, acquiring the sequence data to be inserted in real time and, using the vector knock-in model, knocking the exogenous gene and the regulatory element sequences into the vector at a fixed site.

The details of the steps have been set forth above and are not repeated here.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application. It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

Through the steps of the method, the system, the platform and the storage medium, dependence on specialist gene knock-in experiment design skills is removed, and a knock-in vector construction scheme that ordinary experimenters can use directly is obtained. The scheme also reduces experimental cost: the source of sequence templates is resolved by automatically searching and aligning against existing template libraries, and PCR amplification is cheaper than sequence synthesis and has a shorter experimental cycle. In addition, the scheme solves the strategy problem of inserting sequence fragments into the vector: the system handles long-fragment splitting, separate insertion of complex elements, stepwise multi-fragment insertion, the insertion order of each step and the endonuclease selection scheme, and quickly provides a vector construction strategy. Finally, the invention reduces the time and labor cost of traditional operation, in which a great deal of time and effort is spent on acquiring gene information, selecting a suitable vector backbone, searching for sequence templates, designing the vector construction scheme, designing suitable continuous transfer primers with sequence complexity and fragment length in mind, designing the sequencing primers and bacteria detection primers used for subsequent identification of the vector, and issuing the vector construction scheme documents.

That is to say, the scheme of the invention only requires the user to define the designed target vector expression cassette; the system then automatically searches for a suitable backbone vector and automatically analyzes the subsequent vector construction process. The method can also be used to modify a semi-finished plasmid vector into the final plasmid vector, and supports knock-in vector design at a user-defined gene locus. This avoids the situation in which, because manual vector scheme design takes a long time, a target knock-in vector construction design can only be delivered quickly by investing a great deal of manpower.

That is, gene knock-in schemes designed with the prior art must be completed by experienced experts or experimenters, whereas the scheme of the invention is an automated experiment design system developed from design rules summarized by experienced experimenters. The user only needs to select a few parameters and paste in the complex sequences, and may even specify point mutations, to obtain a detailed vector construction process within one minute. The output data comprise gRNA off-target data, the vector insertion enzyme digestion scheme, continuous transfer primers, sequencing primers, bacteria detection primers, the vector annotation file for each stage, the design data files required by the experiment and the scheme report file. A reasonable experimental design scheme is provided by combining the fast processing of a computer with several algorithm modules developed in-house by the designers.

In general, the invention provides a gene site-specific knock-in vector construction method: acquiring the original data information of the gene to be knocked in, and creating the backbone corresponding to the gene to be knocked in; constructing a vector knock-in model according to the original data information of the gene to be knocked in, the backbone and the knock-in elements; and acquiring the sequence data to be inserted in real time and, using the vector knock-in model, knocking the exogenous gene and the regulatory element sequences into the vector at a fixed site. Together with the corresponding system and platform, the method achieves site-specific knock-in of the exogenous gene, so that the genetic background is simpler and the experimental operation is more accurate and efficient. Compared with traditional operation, it is simpler, more flexible in design, lower in cost and shorter in cycle. Site-specific knock-in also opens a possibility for future targeted gene therapy: repairing a loss-of-function DNA segment into a functional DNA segment achieves the purpose of gene therapy.

That is, the scheme of the invention can greatly increase output, free up manpower, and improve working efficiency and accuracy, making the originally complicated design process simpler and faster. It also lowers the knowledge barrier: researchers without extensive experimental experience can obtain a detailed step-by-step scheme for every stage of knock-in plasmid vector design. Production schemes can be designed fully automatically, so that the experimental design of every stage of the experiment is analyzed automatically, which makes automation of the future production workflow possible.

The above embodiments express only several implementations of the present invention; their description is relatively specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.
