Method and system for improving safety of gene editing technology

Document No.: 600278. Publication date: 2021-05-04.

Reading note: this technology, "Method and system for improving safety of gene editing technology", was designed by Li Xiaoguang (李晓光) on 2021-01-19. Abstract: The invention discloses a method and a system for improving the safety of a gene editing technology. The method comprises: obtaining a first exogenous gene for gene editing; obtaining first cell gene sequence information after the first exogenous gene is introduced; obtaining, according to the first exogenous gene, a first gene region that is prone to off-target effects; obtaining second cell gene sequence information within the first gene region; inputting the second cell gene sequence information into a first training model to obtain first output information of the first training model, wherein the first output information identifies result information on whether the second cell gene sequence satisfies a predetermined condition; and, when the first output information satisfies the predetermined condition, obtaining a first result of the safety of the gene editing technology. The method addresses the technical problems of low safety, a high off-target rate and insufficiently accurate off-target prediction in prior-art gene editing technology.

1. A method of increasing the safety of gene editing techniques, wherein the method comprises:

obtaining a first exogenous gene for gene editing;

obtaining first cell gene sequence information after the first exogenous gene is introduced;

obtaining, according to the first exogenous gene, a first gene region that is prone to off-target effects;

obtaining second cell gene sequence information in the first gene region according to the first cell gene sequence information and the first gene region;

inputting the second cell gene sequence information into a first training model, wherein the first training model is obtained by training a plurality of groups of training data, and each group of training data in the plurality of groups comprises: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition;

obtaining first output information of the first training model, wherein the first output information identifies result information of whether the second cell gene sequence satisfies a predetermined condition;

and when the first output information satisfies the predetermined condition, obtaining a first result of the safety of the gene editing technology.

2. The method of claim 1, wherein when the first output information satisfies the predetermined condition, the method further comprises:

obtaining, according to the first exogenous gene, a second gene region that is not prone to off-target effects;

obtaining third cell gene sequence information in the second gene region according to the first cell gene sequence information and the second gene region;

inputting the third cell gene sequence information into a second training model, wherein the second training model is obtained by training a plurality of groups of training data, and each group of training data in the plurality of groups comprises: third cell gene sequence information, and result information identifying whether the third cell gene sequence information satisfies the predetermined condition;

obtaining second output information of the second training model, wherein the second output information identifies result information of whether the third cell gene sequence satisfies the predetermined condition;

and when the second output information indicates that the predetermined condition is satisfied, obtaining a second result of the safety of the gene editing technology.

3. The method of claim 1, wherein after obtaining gene sequence information of the first cell after introduction of the first exogenous gene, the method further comprises:

inputting the first cell gene sequence information into a first predictive model;

obtaining first potential off-target site information;

inputting the first potential off-target site information into a first alignment model.

4. The method of claim 3, wherein after obtaining second cellular gene sequence information within the first gene region, the method further comprises:

inputting the second cellular gene sequence information to the first predictive model;

obtaining second potential off-target site information;

inputting the second potential off-target site information into the first alignment model;

outputting a first comparison result from the first alignment model;

and determining the first gene region according to the first comparison result.

5. The method of claim 1, wherein after obtaining the first exogenous gene for gene editing, the method further comprises:

obtaining first variety information of the first exogenous gene;

obtaining first reference genome information of the first exogenous gene according to the first variety information;

obtaining first target information according to the first reference genome information;

and introducing the first exogenous gene according to the first target information.

6. The method of claim 1, wherein the method comprises:

generating a first verification code according to the first cell gene sequence information, wherein the first verification code corresponds to the first cell gene sequence information;

generating a second verification code according to the second cell gene sequence information and the first verification code; and so on, generating an Nth verification code according to the Nth cell gene sequence information and the (N-1)th verification code, wherein N is a natural number greater than 1;

and taking each piece of cell gene sequence information together with its corresponding verification code as a storage unit, and copying and storing the storage units on M devices, wherein M is a natural number greater than 1.

7. The method of claim 6, wherein the method comprises:

taking the Nth cell gene sequence information and the Nth verification code as an Nth storage unit;

obtaining the recording time of the Nth storage unit, wherein the recording time of the Nth storage unit represents the time the Nth storage unit requires for recording;

acquiring, according to the recording time of the Nth storage unit, a first device with the largest memory among the M devices;

and sending the recording right of the Nth storage unit to the first device.

8. A system for improving the safety of gene editing techniques, wherein the system comprises:

a first obtaining unit for obtaining a first exogenous gene for gene editing;

a second obtaining unit for obtaining first cell gene sequence information after the first exogenous gene is introduced;

a third obtaining unit for obtaining, according to the first exogenous gene, a first gene region that is prone to off-target effects;

a fourth obtaining unit configured to obtain second cell gene sequence information within the first gene region based on the first cell gene sequence information and the first gene region;

a first input unit, configured to input the second cell gene sequence information into a first training model, where the first training model is obtained by training multiple sets of training data, and each set of training data in the multiple sets includes: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition;

a fifth obtaining unit, configured to obtain first output information of the first training model, where the first output information identifies result information of whether the second cell gene sequence satisfies a predetermined condition;

a sixth obtaining unit configured to obtain a first result of safety of a gene editing technique when the first output information satisfies the predetermined condition.

9. A system for improving the safety of gene editing techniques, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the program.

Technical Field

The invention relates to the technical field of gene editing, in particular to a method and a system for improving the safety of a gene editing technology.

Background

Gene editing, also known as genome editing or genome engineering, is an emerging and relatively precise genetic engineering technique capable of modifying a specific target gene in the genome of an organism. However, off-target effects remain a main factor limiting whether gene editing technology can be widely applied. How to correctly evaluate and detect off-target effects, and how to provide corresponding strategies to reduce them, is an important research direction in the current gene editing field.

In the process of implementing the technical scheme of the invention in the embodiment of the present application, the inventor of the present application finds that the above-mentioned technology has at least the following technical problems:

the gene editing technology has low safety, a high off-target rate, and insufficiently accurate off-target prediction.

Disclosure of Invention

The embodiments of the application provide a method and a system for improving the safety of a gene editing technology. They address the technical problems in the prior art that gene editing technology has low safety, a high off-target rate and insufficiently accurate off-target prediction, and they achieve the technical aim of performing genome comparison based on a training model, thereby improving the accuracy of off-target detection and the safety of the gene editing technology.

The embodiment of the application provides a method for improving the safety of a gene editing technology, wherein the method comprises the following steps: obtaining a first exogenous gene for gene editing; obtaining first cell gene sequence information after the first exogenous gene is introduced; obtaining, according to the first exogenous gene, a first gene region that is prone to off-target effects; obtaining second cell gene sequence information in the first gene region according to the first cell gene sequence information and the first gene region; inputting the second cell gene sequence information into a first training model, wherein the first training model is obtained by training on a plurality of groups of training data, and each group of training data comprises: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition; obtaining first output information of the first training model, wherein the first output information identifies result information on whether the second cell gene sequence satisfies the predetermined condition; and when the first output information satisfies the predetermined condition, obtaining a first result of the safety of the gene editing technology.

In another aspect, the present application further provides a system for improving the safety of a gene editing technology, wherein the system comprises: a first obtaining unit for obtaining a first exogenous gene for gene editing; a second obtaining unit for obtaining first cell gene sequence information after the first exogenous gene is introduced; a third obtaining unit for obtaining, according to the first exogenous gene, a first gene region that is prone to off-target effects; a fourth obtaining unit configured to obtain second cell gene sequence information within the first gene region based on the first cell gene sequence information and the first gene region; a first input unit configured to input the second cell gene sequence information into a first training model, where the first training model is obtained by training on multiple sets of training data, and each set of training data includes: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition; a fifth obtaining unit configured to obtain first output information of the first training model, where the first output information identifies result information on whether the second cell gene sequence satisfies the predetermined condition; and a sixth obtaining unit configured to obtain a first result of the safety of the gene editing technology when the first output information satisfies the predetermined condition.

On the other hand, the embodiment of the present application further provides a system for improving the safety of the gene editing technology, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method according to the first aspect when executing the program.

One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:

The edited gene sequence is input into the training model, so that the gene sequence of the region prone to off-target effects is examined and possible off-target events are predicted. Because a training model can continuously learn from data and accumulate experience in processing it, the edited genome can be compared accurately, which achieves the technical purposes of improving the accuracy of off-target detection and improving the safety of the gene editing technology.

The foregoing is a summary of the present disclosure, and embodiments of the present disclosure are described below to make the technical means of the present disclosure more clearly understood.

Drawings

FIG. 1 is a schematic flow chart of a method for improving the safety of a gene editing technology according to an embodiment of the present application;

FIG. 2 is a schematic diagram of a system for improving the safety of a gene editing technique according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an exemplary electronic device according to an embodiment of the present application.

Description of reference numerals: a first obtaining unit 11, a second obtaining unit 12, a third obtaining unit 13, a fourth obtaining unit 14, a first input unit 15, a fifth obtaining unit 16, a sixth obtaining unit 17, a bus 300, a receiver 301, a processor 302, a transmitter 303, a memory 304, and a bus interface 305.

Detailed Description

The embodiments of the application provide a method and a system for improving the safety of a gene editing technology. They address the technical problems in the prior art that gene editing technology has low safety, a high off-target rate and insufficiently accurate off-target prediction, and they achieve the technical aim of performing genome comparison based on a training model, thereby improving the accuracy of off-target detection and the safety of the gene editing technology. Hereinafter, example embodiments of the present application will be described in detail with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application, and the present application is not limited to the example embodiments described herein.

Summary of the application

Off-target effects remain a main factor limiting whether gene editing technology can be widely applied. How to correctly evaluate and detect off-target effects, and how to provide corresponding strategies to reduce them, is an important research direction in the current gene editing field. The prior art therefore has the technical problems of low safety of the gene editing technology, a high off-target rate and insufficiently accurate off-target prediction.

In view of the above technical problems, the technical solution provided by the present application has the following general idea:

The embodiment of the application provides a method for improving the safety of a gene editing technology, wherein the method comprises the following steps: obtaining a first exogenous gene for gene editing; obtaining first cell gene sequence information after the first exogenous gene is introduced; obtaining, according to the first exogenous gene, a first gene region that is prone to off-target effects; obtaining second cell gene sequence information in the first gene region according to the first cell gene sequence information and the first gene region; inputting the second cell gene sequence information into a first training model, wherein the first training model is obtained by training on a plurality of groups of training data, and each group of training data comprises: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition; obtaining first output information of the first training model, wherein the first output information identifies result information on whether the second cell gene sequence satisfies the predetermined condition; and when the first output information satisfies the predetermined condition, obtaining a first result of the safety of the gene editing technology.

Having thus described the general principles of the present application, various non-limiting embodiments thereof will now be described in detail with reference to the accompanying drawings.

Example one

As shown in fig. 1, the present application provides a method for improving the safety of a gene editing technology, wherein the method includes:

Step S100: obtaining a first exogenous gene for gene editing;

Specifically, gene editing relies on genetically engineered nucleases, also known as "molecular scissors", to generate site-specific double-strand breaks (DSBs) at specific locations in the genome, inducing the organism to repair the DSBs by non-homologous end joining (NHEJ) or homologous recombination (HR). Because this repair process is prone to errors, NHEJ in particular, it results in targeted mutations. The first exogenous gene is a guide sequence for gene editing of a DNA sequence.

Step S200: obtaining the gene sequence information of a first cell after the first exogenous gene is introduced;

Specifically, after the first exogenous gene is introduced into the original gene sequence under the action of the nuclease, the first cell gene sequence information is obtained, which lays the foundation for predicting off-target effects.

Step S300: obtaining, according to the first exogenous gene, a first gene region that is prone to off-target effects;

Specifically, the first gene region is a region of the first cell gene sequence in which off-target effects are likely to occur. The first gene region is determined by predicting potential off-target sites in the first cell gene sequence, and the gene sequence within the first gene region is then analyzed and examined.

Step S400: obtaining second cell gene sequence information in the first gene region according to the first cell gene sequence information and the first gene region;

Specifically, after the first gene region is determined, the gene sequence information within the first gene region is examined, so that the safety of the entire gene sequence is evaluated by evaluating the safety of gene editing within the first gene region.
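As an illustration of step S400, the following minimal Python sketch extracts the second cell gene sequence information as the part of the first cell gene sequence that lies inside the first gene region. The representation of a region as a 0-based, end-exclusive coordinate pair, the names `Region` and `extract_region`, and the toy sequence are assumptions made for this example; the application does not specify a data format.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical representation of a gene region (0-based, end-exclusive)."""
    start: int
    end: int


def extract_region(sequence: str, region: Region) -> str:
    """Return the part of `sequence` that lies inside `region`."""
    if not (0 <= region.start <= region.end <= len(sequence)):
        raise ValueError("region lies outside the sequence")
    return sequence[region.start:region.end]


# Toy data standing in for the first cell gene sequence information.
first_cell_sequence = "ACGTACGTTTGACCA"
first_gene_region = Region(start=4, end=10)

# Second cell gene sequence information: the sequence inside the first gene region.
second_cell_sequence = extract_region(first_cell_sequence, first_gene_region)
print(second_cell_sequence)
```

Only this sub-sequence would then be passed on to the first training model in step S500.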

Step S500: inputting the second cell gene sequence information into a first training model, wherein the first training model is obtained by training a plurality of groups of training data, and each group of training data in the plurality of groups comprises: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition;

step S600: obtaining first output information of the first training model, wherein the first output information identifies result information of whether the second cell gene sequence satisfies a predetermined condition;

Specifically, the first training model is a machine learning model. A machine learning model can learn continuously from a large amount of data, be corrected step by step, and finally accumulate enough experience to process other data. The model is obtained by training on a plurality of groups of training data, and training the model (for example, a neural network model) on such data is essentially a process of supervised learning. Each group of training data comprises: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies the predetermined condition. Given the second cell gene sequence information, the machine learning model outputs result information on whether the second cell gene sequence satisfies the predetermined condition, and this output is checked against the identified result information in the training data. If the output is consistent with the identified result information, supervised learning on this group of data is finished and the next group is learned. If the output is inconsistent with the identified result information, the machine learning model adjusts itself until it reaches the expected accuracy, and supervised learning then continues with the next group of data. The machine learning model is thus continuously corrected and optimized through the training data, and the supervised learning process improves the accuracy with which it processes data, so that more accurate result information on whether the second cell gene sequence satisfies the predetermined condition is obtained.
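The supervised learning loop described above (predict, compare with the labelled result, adjust on disagreement, move to the next group) can be sketched with a simple perceptron. This is only a minimal sketch: the application does not specify the model type, the sequence encoding or the training data, so the perceptron, the one-hot encoding, the function names and the four toy sequences below are all assumptions chosen for illustration.

```python
from typing import List, Tuple

BASES = "ACGT"


def one_hot(seq: str) -> List[float]:
    """Encode a DNA string as a flat one-hot vector (assumed encoding)."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def predict(weights: List[float], bias: float, features: List[float]) -> int:
    """Return 1 if the sequence is predicted to satisfy the condition, else 0."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train_perceptron(data: List[Tuple[str, int]], epochs: int = 50):
    """Supervised training: adjust the model whenever output and label disagree."""
    weights = [0.0] * len(one_hot(data[0][0]))
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for seq, label in data:
            x = one_hot(seq)
            output = predict(weights, bias, x)
            if output != label:  # inconsistent with the labelled result
                delta = label - output
                weights = [w + delta * xi for w, xi in zip(weights, x)]
                bias += delta
                errors += 1
        if errors == 0:  # every group is reproduced correctly: stop training
            break
    return weights, bias


# Toy training groups: (second cell gene sequence, 1 = satisfies the condition).
training_data = [("ACGT", 1), ("ACGA", 0), ("TCGT", 0), ("ACCT", 0)]
weights, bias = train_perceptron(training_data)
print([predict(weights, bias, one_hot(seq)) for seq, _ in training_data])
```

A production model would need far more data and a held-out validation set; the sketch only mirrors the compare-and-correct structure of the paragraph above.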

Step S700: when the first output information satisfies the predetermined condition, obtaining a first result of the safety of the gene editing technology.

Specifically, the predetermined condition concerns whether the second cell gene sequence shows an off-target effect. If genome detection shows that the second cell gene sequence is identical to the original gene sequence, the second cell gene sequence satisfies the predetermined condition, and the evaluation information of the first result of the safety of the gene editing technology is thereby obtained.

Further, step S700 in the embodiment of the present application further includes:

Step S701: obtaining, according to the first exogenous gene, a second gene region that is not prone to off-target effects;

step S702: obtaining third cell gene sequence information in the second gene region according to the first cell gene sequence information and the second gene region;

step S703: inputting the third cell gene sequence information into a second training model, wherein the second training model is obtained by training a plurality of groups of training data, and each group of training data in the plurality of groups comprises: third cell gene sequence information, and result information identifying whether the third cell gene sequence information satisfies the predetermined condition;

step S704: obtaining second output information of the second training model, wherein the second output information identifies result information whether the third cell gene sequence satisfies a predetermined condition;

step S705: and obtaining a second result of the safety of the gene editing technology when the second output information shows that the predetermined condition is met.

Specifically, the second gene region is a region of the first cell gene sequence information that is not prone to off-target effects. The gene sequence information of the second gene region, that is, the third cell gene sequence information, is obtained and input into the second training model. The second training model is a machine learning model obtained by training on multiple sets of training data, and training the model (for example, a neural network model) on such data is essentially a process of supervised learning. Each set of training data comprises: third cell gene sequence information, and result information identifying whether the third cell gene sequence information satisfies the predetermined condition. Given the third cell gene sequence information, the machine learning model outputs result information on whether the third cell gene sequence information satisfies the predetermined condition, and this output is checked against the identified result information in the training data. If the output is consistent with the identified result information, supervised learning on this set of data is finished and the next set is learned. If the output is inconsistent with the identified result information, the machine learning model adjusts itself until it reaches the expected accuracy, and supervised learning then continues with the next set of data. If genome detection shows that the third cell gene sequence is identical to the original gene sequence, the third cell gene sequence satisfies the predetermined condition, and the evaluation information of the second result of the safety of the gene editing technology is thereby obtained.

Further, step S200 in the embodiment of the present application further includes:

step S201: inputting the first cell gene sequence information into a first predictive model;

step S202: obtaining first potential off-target site information;

Step S203: inputting the first potential off-target site information into a first alignment model.

Specifically, one of the simplest and most effective methods for off-target detection is whole genome sequencing. Potential off-target sites in the first cell gene sequence information are predicted by the first prediction model, and the first potential off-target site information is input into the first alignment model to be aligned with the original genome.

Further, step S201 in the embodiment of the present application further includes:

step S2011: inputting the second cellular gene sequence information to the first predictive model;

step S2012: obtaining second potential off-target site information;

step S2013: inputting the second potential off-target site information into the first alignment model;

Step S2014: outputting a first comparison result from the first alignment model;

step S2015: and determining the first gene region according to the first comparison result.

Specifically, the potential off-target site information of the second cell gene sequence information is obtained, and the first alignment model aligns it against the first potential off-target site information. If a potential off-target site is unchanged, no off-target event has occurred at that site; if it has changed, an off-target event may have occurred. The first comparison result is then used to determine the off-target-prone region in the first cell gene sequence information, namely the first gene region.
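The prediction and alignment steps above can be sketched as follows. The application does not describe how the first prediction model or the first alignment model work internally, so this sketch substitutes two simple stand-ins, both of them assumptions: potential off-target sites are taken to be positions where the genome differs from the guide sequence by a small number of mismatches, and the comparison is a direct string comparison that assumes the edit introduces substitutions only, so that coordinates do not shift. All names and sequences are hypothetical.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def potential_off_target_sites(genome: str, guide: str, max_mismatches: int = 1) -> List[int]:
    """Stand-in for the prediction model: near matches of the guide in the genome."""
    k = len(guide)
    sites = []
    for i in range(len(genome) - k + 1):
        distance = hamming(genome[i:i + k], guide)
        if 0 < distance <= max_mismatches:  # distance 0 is the intended target
            sites.append(i)
    return sites


def changed_sites(original: str, edited: str, sites: List[int], length: int) -> List[int]:
    """Stand-in for the alignment model: sites whose sequence differs after editing."""
    return [i for i in sites if original[i:i + length] != edited[i:i + length]]


guide = "ACGT"
original = "ACGTTTACGATTGGGG"  # toy original genome; on-target site at position 0
edited = "ACGTTTACTATTGGGG"    # toy edited genome with a change at a near-match site

sites = potential_off_target_sites(original, guide)
off_target_hits = changed_sites(original, edited, sites, len(guide))
print(sites, off_target_hits)
```

Sites reported in `off_target_hits` would mark the off-target-prone region; an empty list would correspond to the case where no potential off-target site has changed.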

Further, step S100 in the embodiment of the present application further includes:

step S101: after obtaining the first exogenous gene for gene editing, the method further comprises:

Step S102: obtaining first variety information of the first exogenous gene;

Step S103: obtaining first reference genome information of the first exogenous gene according to the first variety information;

Step S104: obtaining first target information according to the first reference genome information;

step S105: and introducing the first exogenous gene according to the first target information.

Specifically, even within the same species there are considerable genetic differences between varieties and lines, which can strongly affect off-target behavior and gene editing accuracy. Therefore, when the genome of a given variety is edited, using the reference genome of that variety for target design can effectively improve the safety of gene editing and reduce off-target events. The variety of the first exogenous gene is determined, the reference genome is obtained according to the first variety information, the first target information is obtained from the reference genome, and the first exogenous gene is introduced according to the first target information.

Further, step S400 in the embodiment of the present application further includes:

step S401: generating a first verification code according to the first cell gene sequence information, wherein the first verification code corresponds to the first cell gene sequence information;

Step S402: generating a second verification code according to the second cell gene sequence information and the first verification code; and so on, generating an Nth verification code according to the Nth cell gene sequence information and the (N-1)th verification code, wherein N is a natural number greater than 1;

Step S403: taking each piece of cell gene sequence information together with its corresponding verification code as a storage unit, and copying and storing the storage units on M devices, wherein M is a natural number greater than 1.

Specifically, in order to ensure the safety of the gene sequence information, a first verification code is generated according to the first cell gene sequence information, the first verification code corresponding one-to-one to the first cell gene sequence information. A second verification code is generated according to the second cell gene sequence information and the first verification code, and so on. The first cell gene sequence information and the first verification code serve as a first storage unit, the second cell gene sequence information and the second verification code serve as a second storage unit, and so on, giving N storage units in total. The verification code information serves as the identification information of a subject and distinguishes that subject from other subjects. When the cell gene sequence information needs to be retrieved, each node, after receiving the data stored by the previous node, verifies the data through a "consensus mechanism" and then stores it. The storage units are linked in series by hashing, so the cell gene sequence information is not easily lost or damaged, and this blockchain-based data processing ensures the safety of gene sequence information storage during gene editing.
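The chained verification codes can be sketched with a standard hash function. The application only speaks of "hash technology", so the use of SHA-256, the dictionary layout of a storage unit and the function names below are assumptions for illustration, and the consensus mechanism between nodes is not modelled.

```python
import hashlib
from typing import Dict, List


def verification_code(sequence_info: str, previous_code: str = "") -> str:
    """Nth code derived from the Nth sequence and the (N-1)th code (SHA-256 assumed)."""
    return hashlib.sha256((previous_code + sequence_info).encode("utf-8")).hexdigest()


def build_storage_units(sequences: List[str]) -> List[Dict[str, str]]:
    """Pair each sequence with its chained verification code."""
    units = []
    previous = ""
    for seq in sequences:
        code = verification_code(seq, previous)
        units.append({"sequence": seq, "code": code})
        previous = code
    return units


def verify_chain(units: List[Dict[str, str]]) -> bool:
    """Recompute every code; any altered sequence breaks the chain."""
    previous = ""
    for unit in units:
        if verification_code(unit["sequence"], previous) != unit["code"]:
            return False
        previous = unit["code"]
    return True


def replicate(units: List[Dict[str, str]], m: int) -> List[List[Dict[str, str]]]:
    """Copy the storage units onto M devices (modelled here as M independent lists)."""
    return [[dict(unit) for unit in units] for _ in range(m)]


units = build_storage_units(["ACGTACGT", "ACGTTT", "GGCA"])
devices = replicate(units, m=3)

tampered = [dict(unit) for unit in units]
tampered[1]["sequence"] = "AAAAAA"  # simulate damage to a stored sequence

print(verify_chain(units), verify_chain(tampered))
```

Because each code depends on the previous one, changing any stored sequence invalidates its own code and breaks verification from that unit onward, which is what makes tampering detectable.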

Further, step S402 in the embodiment of the present application further includes:

step S4021: taking the Nth cell gene sequence information and the Nth verification code as an Nth storage unit;

step S4022: obtaining the recording time of the Nth storage unit, wherein the recording time of the Nth storage unit represents the time required to be recorded by the Nth storage unit;

step S4023: obtaining, according to the recording time of the Nth storage unit, a first device with the largest memory among the M devices;

step S4024: sending the recording right of the Nth storage unit to the first device.

Specifically, the Nth cell gene sequence information and the Nth verification code are divided into blocks, and the Nth device node adds the blocks to the blockchain after they have been validated. The recording time of the Nth storage unit is the time needed to verify the obtained Nth verification code and Nth cell gene sequence information through the "consensus mechanism", store them once verification passes, and append them to the existing block. The shorter the recording time of the Nth storage unit, the faster the transport capacity of the device node. Selecting the device with the fastest transport capacity as the block recording device improves the real-time performance of off-chain data interaction in the blockchain, keeps the decentralized blockchain system running safely, effectively, and stably, raises the efficiency of blockchain message processing, and thereby improves the storage security of the cell gene sequence information.
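
The selection of the recording device can be sketched as follows. The steps name the device with the largest memory, while the explanation above selects the node with the shortest recording time. This sketch assumes the recording time is the primary criterion and uses memory only to break ties. The `Device` class, its fields, and `select_recording_device` are hypothetical names, and the recording times are taken as already measured.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Device:
    """A candidate device node (all fields are illustrative assumptions)."""
    name: str
    memory_gb: float
    recording_time_s: float  # measured time to verify and store the Nth unit


def select_recording_device(devices: Sequence[Device]) -> Device:
    """Pick the device that receives the recording right.

    Assumed rule: shortest recording time wins; larger memory breaks ties.
    """
    if len(devices) < 2:
        raise ValueError("M must be greater than 1")
    return min(devices, key=lambda d: (d.recording_time_s, -d.memory_gb))
```

For example, among devices with recording times of 0.8 s, 0.5 s, and 0.5 s, the function returns whichever of the two 0.5 s devices has more memory.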

In summary, the method for improving the safety of the gene editing technology provided by the embodiment of the present application has the following technical effects:

The edited gene sequence is input into the training model, so that the gene sequence of the off-target-prone region is examined and possible off-target events are predicted. Because the training model can keep learning and accumulating experience in processing data, the edited genome can be compared accurately, which achieves the technical purpose of improving the accuracy of off-target detection and the safety of the gene editing technology.

Example two

Based on the same inventive concept as the method for improving the safety of the gene editing technology in the previous embodiment, the present invention further provides a system for improving the safety of the gene editing technology, as shown in fig. 2, the system comprises:

a first obtaining unit 11, wherein the first obtaining unit 11 is used for obtaining a first exogenous gene for gene editing;

a second obtaining unit 12, wherein the second obtaining unit 12 is used for obtaining the gene sequence information of the first cell after the first exogenous gene is introduced;

a third obtaining unit 13, wherein the third obtaining unit 13 is used for obtaining a first gene region that is prone to off-target effects according to the first exogenous gene;

a fourth obtaining unit 14, wherein the fourth obtaining unit 14 is configured to obtain second cell gene sequence information in the first gene region according to the first cell gene sequence information and the first gene region;

a first input unit 15, where the first input unit 15 is configured to input the second cell gene sequence information into a first training model, where the first training model is obtained by training on multiple sets of training data, and each set of training data in the multiple sets includes: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition;

a fifth obtaining unit 16, where the fifth obtaining unit 16 is configured to obtain first output information of the first training model, where the first output information identifies result information of whether the second cell gene sequence satisfies a predetermined condition;

a sixth obtaining unit 17, where the sixth obtaining unit 17 is configured to obtain a first result of the safety of the gene editing technology when the first output information satisfies the predetermined condition.
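
The data flow through units 11 to 17 can be sketched as a short pipeline. The embodiment does not specify how a gene region is represented or what the first training model looks like, so the `(start, end)` region, the callable `model`, and the function names `extract_region` and `assess_safety` are assumptions for illustration; a real trained model would replace the toy callable shown in the usage note.

```python
from typing import Callable, Optional, Tuple

Region = Tuple[int, int]  # assumed representation: half-open [start, end)


def extract_region(first_sequence: str, region: Region) -> str:
    """Obtain the second cell gene sequence information inside the region."""
    start, end = region
    if not 0 <= start < end <= len(first_sequence):
        raise ValueError("region lies outside the sequence")
    return first_sequence[start:end]


def assess_safety(
    first_sequence: str,
    off_target_prone_region: Region,
    model: Callable[[str], bool],
) -> Optional[str]:
    """Run the region through the model and report a first safety result.

    `model` stands in for the first training model and returns True when
    the sequence satisfies the predetermined condition.
    """
    second_sequence = extract_region(first_sequence, off_target_prone_region)
    first_output = model(second_sequence)
    if first_output:
        return "first result: predetermined condition satisfied"
    return None  # no first result is produced when the condition fails
```

As a usage example, `assess_safety("ACGATTAC", (2, 6), lambda s: s == "GATT")` passes the subsequence `GATT` to a toy stand-in model that simply checks it against an expected sequence.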

Further, the system further comprises:

a seventh obtaining unit for obtaining a second gene region that is less prone to off-target effects according to the first exogenous gene;

an eighth obtaining unit configured to obtain third cellular gene sequence information within the second gene region from the first cellular gene sequence information and the second gene region;

a second input unit, configured to input the third cell gene sequence information into a second training model, where the second training model is obtained by training multiple sets of training data, and each set of training data in the multiple sets includes: third cell gene sequence information, and result information identifying whether the third cell gene sequence information satisfies a predetermined condition;

a ninth obtaining unit configured to obtain second output information of the second training model, wherein the second output information identifies result information whether the third cell gene sequence satisfies a predetermined condition;

a tenth obtaining unit, configured to obtain a second result of the safety of the gene editing technology when the second output information indicates that a predetermined condition is satisfied.

Further, the system further comprises:

a third input unit for inputting the first cell gene sequence information to a first prediction model;

an eleventh obtaining unit for obtaining first potential off-target site information;

a fourth input unit for inputting the first potential off-target site information to a first comparison model.

Further, the system further comprises:

a fifth input unit for inputting the second cell gene sequence information to the first prediction model;

a twelfth obtaining unit for obtaining second potential off-target site information;

a sixth input unit for inputting the second potential off-target site information to the first comparison model;

a thirteenth obtaining unit configured to obtain a first comparison result output by the first comparison model;

a fourteenth obtaining unit, configured to determine the first gene region according to the first comparison result.

Further, the system further comprises:

a fifteenth obtaining unit for obtaining first species information of the first exogenous gene;

a sixteenth obtaining unit, configured to obtain first reference genome information of the first exogenous gene according to the first species information;

a seventeenth obtaining unit, configured to obtain first target information according to the first reference genome information;

a first introduction unit configured to introduce the first exogenous gene according to the first target information.

Further, the system further comprises:

an eighteenth obtaining unit, configured to generate a first verification code according to the first cell gene sequence information, where the first verification code corresponds to the first cell gene sequence information;

a nineteenth obtaining unit configured to generate a second verification code based on the second cell gene sequence information and the first verification code; by analogy, generating an Nth verification code according to the Nth cell gene sequence information and the (N-1)th verification code, wherein N is a natural number greater than 1;

a first storage unit, configured to take each piece of cell gene sequence information and its corresponding verification code as a storage unit, and to copy and store each storage unit on M devices, wherein M is a natural number greater than 1.

Further, the system further comprises:

a twentieth obtaining unit for using the Nth cell gene sequence information and the Nth verification code as an Nth storage unit;

a twenty-first obtaining unit, configured to obtain a recording time of the Nth storage unit, where the recording time of the Nth storage unit represents the time that the Nth storage unit needs for recording;

a twenty-second obtaining unit, configured to obtain, according to the recording time of the Nth storage unit, a first device with the largest memory among the M devices;

a first sending unit, configured to send the recording right of the Nth storage unit to the first device.

Various modifications and embodiments of the method for improving the safety of the gene editing technology in the first embodiment of fig. 1 are also applicable to the system for improving the safety of the gene editing technology in this embodiment, and a person skilled in the art can clearly know the system for improving the safety of the gene editing technology in this embodiment from the foregoing detailed description of the method for improving the safety of the gene editing technology, so for the sake of brevity of the description, detailed descriptions thereof are omitted here.

Exemplary electronic device

The electronic device of the embodiment of the present application is described below with reference to fig. 3.

Fig. 3 illustrates a schematic structural diagram of an electronic device according to an embodiment of the present application.

Based on the inventive concept of the method for improving the safety of a gene editing technology in the foregoing embodiments, the present invention further provides a system for improving the safety of a gene editing technology, on which a computer program is stored; when executed by a processor, the computer program implements the steps of any one of the foregoing methods for improving the safety of a gene editing technology.

In Fig. 3, a bus architecture is represented by bus 300. Bus 300 may include any number of interconnected buses and bridges, and links together various circuits including one or more processors, represented by processor 302, and memory, represented by memory 304. The bus 300 may also link together various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. A bus interface 305 provides an interface between the bus 300 and the receiver 301 and transmitter 303. The receiver 301 and the transmitter 303 may be the same element, i.e., a transceiver, providing a means for communicating with various other apparatus over a transmission medium.

The processor 302 is responsible for managing the bus 300 and general processing, and the memory 304 may be used for storing data used by the processor 302 in performing operations.

The embodiment of the application provides a method for improving the safety of a gene editing technology, wherein the method comprises the following steps: obtaining a first exogenous gene for gene editing; obtaining the gene sequence information of a first cell after the first exogenous gene is introduced; obtaining a first gene region that is prone to off-target effects according to the first exogenous gene; obtaining second cell gene sequence information in the first gene region according to the first cell gene sequence information and the first gene region; inputting the second cell gene sequence information into a first training model, wherein the first training model is obtained by training on a plurality of groups of training data, and each group of training data in the plurality of groups comprises: second cell gene sequence information, and result information identifying whether the second cell gene sequence satisfies a predetermined condition; obtaining first output information of the first training model, wherein the first output information identifies result information of whether the second cell gene sequence satisfies the predetermined condition; and when the first output information satisfies the predetermined condition, obtaining a first result of the safety of the gene editing technology.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction system that implements the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
