Method for determining polymer sequence

Document No.: 653035  Published: 2021-04-23

Summary: Method for determining polymer sequence, by Clive G. Brown, Timothy L. Massingham and Stuart W. Reid, dated 2019-09-04. The present invention relates to a method of determining the sequence of a target polymer or a portion thereof, the target polymer or portion thereof comprising polymer units that include canonical polymer units and atypical polymer units. The method includes obtaining a series of measurements of a signal associated with the target polymer, wherein the measurements of the signal depend on a plurality of polymer units, wherein the polymer units of the target polymer modulate the signal, and wherein atypical polymer units modulate the signal differently than the corresponding canonical polymer units. The series of measurements is analyzed using a machine learning technique that attributes the measurements of atypical polymer units to the measurements of the corresponding canonical polymer units. The sequence of the target polymer or portion thereof is determined based on the analyzed series of measurements. Atypical polymer units identified from the analysis may additionally or alternatively be determined. Two or more types of atypical polymer units, corresponding to two or more types of canonical polymer units, may be used. The polynucleotide may be DNA.

1. A method of determining the sequence of a target polymer or portion thereof, the target polymer or portion thereof comprising polymer units, the polymer units comprising canonical polymer units and atypical polymer units, the method comprising:

obtaining a series of measurements of a signal associated with the target polymer, wherein the measurements of the signal depend on a plurality of polymer units, and wherein the polymer units of the target polymer modulate the signal, and wherein atypical polymer units modulate the signal differently than corresponding canonical polymer units;

analyzing the series of measurements using a machine learning technique that attributes measurements of atypical polymer units to measurements of corresponding canonical polymer units; and

determining the sequence of the target polymer or portion thereof based on the analyzed series of measurements.

2. The method of claim 1 wherein atypical polymer units identified from the analysis are additionally or alternatively determined.

3. The method of claim 1 or 2, wherein the target polymer comprises two or more types of atypical polymer units corresponding to two or more types of canonical polymer units.

4. The method according to any of the preceding claims, wherein the identity and sequence position of atypical polymer units are determined.

5. The method according to any one of the preceding claims, wherein the target polymer comprises atypical polymer units corresponding to each type of canonical polymer unit.

6. The method of any preceding claim, wherein the machine learning technique does not determine whether a polymer unit is an atypical polymer unit or a corresponding canonical polymer unit.

7. The method of claim 1 wherein the target polymer comprises a plurality of atypical polymer units for each of one or more types of atypical polymer units present.

8. The method according to claim 1, wherein the atypical polymer unit may correspond to more than one canonical polymer unit.

9. The method according to any of the preceding claims, wherein the target polymer comprises about 50% atypical polymer units.

10. The method of claim 1, wherein the atypical polymer units are modified canonical polymer units.

11. The method of claim 1 wherein the atypical polymer units are naturally modified.

12. The method of any one of the preceding claims, wherein the series of measurements are taken during movement of the target polymer relative to a nanopore.

13. The method of any one of the preceding claims, wherein the measurement is a measurement indicative of an ionic current flowing through the nanopore or a measurement of a voltage across the nanopore during translocation of the target polymer.

14. The method of any preceding claim, wherein the machine learning technique is trainable by a method comprising:

providing a plurality of target polymers comprising atypical units that have replaced equivalent canonical units at different sequence positions in the target polymer;

obtaining a series of measurements of a signal associated with the target polymer;

analyzing the series of measurements using the machine learning technique; and

estimating the corresponding canonical polymer units of the polymer training strands.

15. The method of any preceding claim, wherein the machine learning technique is a recurrent neural network.

16. The method of any one of the preceding claims, wherein the polymer is a polynucleotide and the polymer units are nucleotide bases.

17. The method according to any of the preceding claims, wherein the one or more atypical bases have been modified enzymatically.

18. The method of claim 1, further comprising the step of modifying a canonical polymer to provide the target polymer that includes one or more atypical bases of one or more different types.

19. The method according to any one of the preceding claims, wherein the polynucleotide comprising one or more atypical bases of one or more different types is produced from the complement of the polynucleotide by use of a polymerase and a proportion of atypical bases.

20. The method of any one of the preceding claims, wherein the polynucleotide is DNA.

21. The method of any one of the preceding claims, wherein the movement of the polynucleotide relative to the nanopore is controlled by an enzyme.

22. The method of claim 21, wherein the enzyme is a helicase.

23. The method of claim 14, wherein the polynucleotide training strand comprises more than one type of atypical polymer unit.

24. A method of determining a consensus sequence of a target polymer or portion thereof, the method comprising:

providing a plurality of polymers, wherein the polymers comprise canonical polymer units and atypical polymer units, and each of the polymers comprises a region of polymer units corresponding to a region of the target polymer;

analyzing measurements of signals associated with the plurality of polymers, wherein a measurement depends on a plurality of polymer units, and wherein the polymer units of the target polymer modulate the signals, and wherein atypical polymer units modulate the signals differently than corresponding canonical polymer units; and

determining a consensus sequence based on the analyzed series of measurements of the plurality of polymers.

25. The method of claim 24, wherein analyzing the series of measurements includes a machine learning technique that attributes measurements of atypical polymer units to measurements of corresponding canonical polymer units.

26. The method according to claim 24, wherein atypical polymer units identified from the analysis additionally or alternatively retain a measurement of an atypical polymer unit as a measurement of a corresponding canonical polymer unit.

27. The method according to any one of claims 24 to 26 wherein the atypical nucleotide has been introduced into the polynucleotide in place of the corresponding canonical base.

28. The method according to any one of claims 24 to 26, wherein one or more of the polynucleotide strands each comprise four or more different types of atypical bases.

29. The method of any one of claims 24 to 26 further comprising the step of introducing the atypical base into the polynucleotide strand.

30. The method according to any one of claims 25 to 29, wherein the series of measurements are analysed using a machine learning technique that has been trained to attribute measurements relating to the presence of one or more atypical bases in a nucleotide region to measurements of equivalent regions in which the one or more types of atypical bases have been replaced by the corresponding one or more canonical bases, and wherein an estimate of the consensus sequence is provided in which the one or more types of atypical bases are determined as their corresponding canonical base or bases.

31. The method of any one of claims 24 to 30 wherein two or more types of atypical polymer units are introduced into one or more of the polynucleotide strands.

32. The method according to any one of claims 24 to 31, wherein each of the polynucleotide strands comprises between 30% and 80% atypical polymer units.

33. The method of any one of claims 24 to 31, wherein the series of measurements are taken during movement of the polymer relative to a nanopore.

34. The method of any one of the preceding claims, wherein the target polymer is derived from a template or complement of an original polymer, and the template or the complement of the target polymer has a 3' or 5' linkage to a polymerase fill, wherein at least one of the template, the complement, or the polymerase fill of the target polymer comprises canonical polymer units and atypical polymer units.

35. The method of claim 34, wherein the atypical base is non-detectably incorporated into a target polymer according to the method of any preceding claim.

36. The method according to any one of the preceding claims, wherein the polynucleotide comprising one or more atypical bases of one or more different types is generated from a template or complement of the polynucleotide by use of a polymerase and a proportion of atypical bases.

37. The method of claim 36, wherein the generated polynucleotide is covalently linked to the corresponding template or complement by two hairpin adaptors and the resulting construct is circular.

38. The method of claim 37, wherein the two hairpin adaptors are asymmetric.

39. The method according to any one of the preceding claims, wherein the polymer is a polynucleotide and the polymer units are nucleotide bases and the target polynucleotide comprises a repeated sequence segment of a template polynucleotide strand produced from a circular construct by using a polymerase and a proportion of atypical bases.

40. The method of claim 39, wherein the target polynucleotide comprises alternating segments of repeated sequences of a template polynucleotide strand and a complement polynucleotide.

41. The method of claim 37, wherein the target polynucleotide is produced from the circular construct by using a polymerase and a proportion of atypical bases.

42. The method of claim 19, wherein the complement is prepared by at least one of: covalently linking adapters to opposite ends of the double-stranded polynucleotide; and separating the double stranded polynucleotides to provide complement chains each comprising an adapter at one end or an adapter at either end.
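
As an illustration of the relationship recited in claims 1, 4, 24 and 32, the following minimal Python sketch shows how atypical labels might be attributed to their canonical counterparts and how a consensus might then be taken over several strands covering the same region. It is not the machine learning technique of the claims, which operates on signal measurements; it works on already-decoded labels only. The lowercase-letter encoding of atypical bases, the mapping table, the function names, and the assumption that the reads are pre-aligned and of equal length are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical encoding (an assumption of this sketch): a lowercase letter
# denotes an atypical base whose corresponding canonical base is the same
# letter in uppercase.
ATYPICAL_TO_CANONICAL = {"a": "A", "c": "C", "g": "G", "t": "T"}


def collapse_to_canonical(labels):
    """Attribute each atypical label to its corresponding canonical label."""
    return "".join(ATYPICAL_TO_CANONICAL.get(base, base) for base in labels)


def atypical_positions(labels):
    """Return (position, label) pairs for every atypical unit in the read."""
    return [(i, base) for i, base in enumerate(labels)
            if base in ATYPICAL_TO_CANONICAL]


def atypical_fraction(labels):
    """Fraction of units in the read that are atypical."""
    if not labels:
        raise ValueError("empty read")
    return sum(base in ATYPICAL_TO_CANONICAL for base in labels) / len(labels)


def consensus(reads):
    """Per-position majority vote over canonical-collapsed reads.

    Assumes the reads are already aligned and have equal length; a real
    pipeline would need an alignment step before voting.
    """
    collapsed = [collapse_to_canonical(read) for read in reads]
    if len({len(read) for read in collapsed}) != 1:
        raise ValueError("reads must be pre-aligned to equal length")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*collapsed))


if __name__ == "__main__":
    # Three reads of the same region, each with atypical bases at different
    # positions and (in the second and third) one differing base call.
    reads = ["AcGtAC", "aCGTaG", "ACgAAc"]
    print(collapse_to_canonical(reads[0]))   # ACGTAC
    print(atypical_positions(reads[0]))      # [(1, 'c'), (3, 't')]
    print(consensus(reads))                  # ACGTAC
```

In this toy input the isolated miscalls fall at different positions in different strands, so the per-position vote recovers the canonical sequence. The vote is used here only because it is the simplest consensus rule to show; the claims do not specify it.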
