System and method for storage

Document No.: 653337 | Published: 2021-04-23

Abstract: The present technology, System and method for storage, was created on 2020-05-26 by 莫瑞克·萧阿里·阿加斯坦利·洪塔伦·库拉纳阿塔万·卡鲁纳卡拉恩克雷格·切希拉. Devices, systems, and methods for non-volatile storage include a well activation device operable to modify one or more wells from a plurality of wells of a flow cell to provide a set of readable wells. The readable wells are configured to allow exposure of the wells to substances from a nucleotide sequencing fluid and to prevent exposure to other substances and fluids, such as nucleotide synthesis fluids. The well activation device may also modify the wells to provide a set of writable wells. The writable wells are configured to allow exposure to nucleotide synthesis fluids and substances, and to prevent exposure to nucleotide sequencing fluids and substances. Measures to mitigate the risk of data errors may also be provided, such as generating commands to write specified data to a nucleotide sequence associated with a particular location in the storage device, reading the nucleotide sequence, and performing a comparison.

1. A system for non-volatile storage, comprising:

a memory controller comprising a processor and a memory;

a storage device, comprising:

a flow-through cell comprising a plurality of wells having an open side accessible from a first surface of the flow-through cell, wherein the wells are adapted to accommodate a polynucleotide,

a flow control interface, and

a sequencing interface;

a fluidic device for providing one or more fluids to a first surface of the flow-through cell, wherein the one or more fluids comprise a nucleotide writing reagent and a nucleotide reading reagent;

a sequencing device for sequencing polynucleotides within the plurality of wells and determining nucleotides through the sequencing interface; and

a well activation device to:

modify one or more wells of the plurality of wells to provide a set of readable wells, wherein the set of readable wells allows exposure to the nucleotide reading reagent and prevents exposure to other reagents from the fluidic device, and

modify one or more wells of the plurality of wells to provide a set of writeable wells, wherein the set of writeable wells allows exposure to the nucleotide writing reagent and prevents exposure to other reagents from the fluidic device.

2. The system of claim 1, wherein the well activation device comprises a plurality of electrodes, and wherein:

at least one electrode of the plurality of electrodes is located adjacent to each well of the plurality of wells,

a control interface of the storage device is coupled to the plurality of electrodes and provides a set of control signals from the memory controller to the plurality of electrodes,

each of the plurality of electrodes generates a voltage based on the set of control signals, the voltage comprising a first voltage or a second voltage, and

a first voltage generated by an electrode of the plurality of electrodes modifies a well of the plurality of wells proximate to the electrode generating the first voltage to a readable well, and a second voltage generated by an electrode of the plurality of electrodes modifies a well of the plurality of wells proximate to the electrode generating the second voltage to a writeable well.

3. The system of claim 2, wherein the at least one electrode is located on a sidewall of each well of the plurality of wells.

4. The system of any one or more of claims 2 to 3, wherein the at least one electrode is located on the first surface at a perimeter of each well of the plurality of wells.

5. The system of any one or more of claims 2 to 4, wherein the at least one electrode is located at a bottom of each well of the plurality of wells, and wherein the at least one electrode comprises an annular shape that allows light to pass through the annular shape from a second surface of the flow cell opposite the first surface and into a well of the plurality of wells in which the at least one electrode is located.

6. The system of any one or more of claims 2 to 5, wherein the second voltage causes hybridization of enzyme inhibitors of wells of the plurality of wells that are proximal to the electrode that generates the second voltage and binding of a desired nucleotide in wells of the plurality of wells that are proximal to the electrode that generates the second voltage.

7. The system of any one or more of claims 1 to 6, wherein:

the well activation device includes a plurality of pH control devices, each pH control device corresponding to one well of the plurality of wells, and

Each pH control device generates a voltage that controls the pH of the voltage-sensitive functionalized fluid provided by the fluidics device to modify a well of the plurality of wells corresponding to the pH control device into a readable or writeable well.

8. The system of any one or more of claims 1 to 7, wherein:

the well activation device comprises a Spatial Light Modulator (SLM) to emit light into one or more of the plurality of wells, and

the emitted light modifies each of the one or more wells to be a readable well or a writable well.

9. The system of any one or more of claims 1-8, wherein the memory controller, during a sequencing operation:

receives from the sequencing device a set of address data describing nucleotides of polynucleotides in address wells of the plurality of wells,

determines a set of target wells from the plurality of wells based on the set of address data, and

causes the fluidic device to provide the nucleotide reading reagent only to the set of target wells.

10. The system of claim 9, wherein during a sequencing operation, the memory controller controls localization of the enzyme of the nucleotide reading reagent by causing the fluidics device to strip the enzyme from the plurality of wells and causing the well activation device to provide a charged tag.

11. The system of any one or more of claims 1 to 10, wherein the well activation device comprises:

an electrical release mechanism for generating a voltage to modify a well into a writable well, and

a photonic release mechanism for generating photon energy to modify a well into a writable well,

wherein the set of writable wells comprises a well modified by the electrical release mechanism and a well modified by the photonic release mechanism.

12. The system of any one or more of claims 1 to 11, wherein:

the well activation device comprises an electrowetting device near a first surface of the flow cell,

the electrowetting device delivers fluid from the fluidics device into any one of the plurality of wells, and

during simultaneous sequencing and synthesis operations, the memory controller:

provides a droplet of the nucleotide reading reagent to a first well of the plurality of wells, and

provides a droplet of the nucleotide writing reagent to a second well of the plurality of wells, wherein the first well and the second well are adjacent.

13. The system of any one or more of claims 1 to 12, wherein each of the plurality of wells comprises:

polarization features to reduce cross talk between nearby wells, and

optical waveguide features that reduce cross talk between nearby wells.

14. The system of any one or more of claims 1-13, wherein the memory controller, during a synthesis operation:

converts a set of data into a set of nucleotides,

synthesizes a first polynucleotide strand in a first well of the set of writeable wells based on the set of nucleotides, and

synthesizes a second polynucleotide strand in a second well of the set of writeable wells based on the set of nucleotides, wherein the first polynucleotide strand and the second polynucleotide strand are identical when properly synthesized.

15. The system of claim 14, wherein the memory controller sequences the first polynucleotide strand and the second polynucleotide strand with the sequencing device and provides an indication of a synthesis error if the first polynucleotide strand and the second polynucleotide strand are not identical.
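
Illustrative sketch (not part of the claims): the duplicate-strand check recited above can be pictured as encoding the data once, writing it to two wells, and comparing the two reads. The 2-bits-per-base mapping and the function names below are assumptions made for illustration; the claims do not specify an encoding.

```python
# Hypothetical mapping of bit pairs to bases; not taken from the claims.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Convert bytes into a nucleotide string, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def has_synthesis_error(first_read: str, second_read: str) -> bool:
    """Two strands written from the same nucleotides should read back identical."""
    return first_read != second_read


target = encode(b"Hi")          # nucleotides to synthesize in both wells
first_read = target             # simulated read of the first well
second_read = "T" + target[1:]  # simulated read with one substituted base
print(has_synthesis_error(first_read, second_read))
```

A mismatch only shows that at least one copy is wrong; deciding which copy to trust would need a further check, such as the hash recited in later claims.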

16. The system of any one or more of claims 14 to 15, further comprising a spatial light modulator to project an optical pattern onto the flow-through cell via the sequencing interface, wherein the memory controller operates the spatial light modulator to project the same optical pattern into the first and second wells to synthesize the first and second polynucleotide strands.

17. The system of any one or more of claims 1-16, wherein the memory controller, during a synthesis operation:

converts a set of data into a set of nucleotides,

synthesizes a first polynucleotide strand in a first well of the set of writeable wells based on a first portion of the set of nucleotides, and

synthesizes a second polynucleotide strand in a second well of the set of writeable wells based on a second portion of the set of nucleotides concurrently with synthesizing the first polynucleotide strand, wherein the first polynucleotide strand and the second polynucleotide strand collectively represent the entirety of the set of nucleotides.
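
Illustrative sketch (not part of the claims): splitting a set of nucleotides into portions that are written to separate wells in parallel might look like the following. The function name and the even-sized chunking policy are assumptions; the claim only requires that the portions together represent the whole set.

```python
def split_for_wells(nucleotides: str, well_count: int) -> list:
    """Split a nucleotide string into at most well_count contiguous portions."""
    if well_count < 1:
        raise ValueError("well_count must be at least 1")
    if not nucleotides:
        return []
    size = -(-len(nucleotides) // well_count)  # ceiling division
    return [nucleotides[i:i + size] for i in range(0, len(nucleotides), size)]


portions = split_for_wells("CAGACGGC", 2)
# Concatenating the portions in order restores the full set of nucleotides.
assert "".join(portions) == "CAGACGGC"
```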

18. The system of any one or more of claims 1-17, wherein the memory controller, during a synthesis operation:

converts a set of data into a set of nucleotides,

synthesizes a first polynucleotide strand in a first well of the set of writeable wells based on the set of nucleotides,

synthesizes a second polynucleotide strand in the first well based on the set of nucleotides, wherein the first and second polynucleotide strands are identical when properly synthesized,

generates a strand hash value based on the set of nucleotides, and

synthesizes the first polynucleotide strand and the second polynucleotide strand so as to add the strand hash value.

19. The system of claim 18, wherein the memory controller, during a sequencing operation:

sequences the first polynucleotide strand and the second polynucleotide strand in the first well with the sequencing device to determine a set of nucleotides and a sequencing hash value for each sequenced polynucleotide strand,

provides an indication of a synthesis error when the sets of nucleotides sequenced from the first polynucleotide strand and the second polynucleotide strand are not identical, and

provides an indication that the hash value does not match when the sequencing hash value of the first polynucleotide strand or the second polynucleotide strand does not match a subsequent hash value of the sequenced set of nucleotides.
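
Illustrative sketch (not part of the claims): one way to picture the strand hash is as a short digest, itself encoded as nucleotides, appended to the payload at synthesis and recomputed after sequencing. SHA-256 truncated to two bytes, the 2-bits-per-base mapping, and all names below are assumptions chosen for illustration.

```python
import hashlib

BASES = "ACGT"   # hypothetical 2-bits-per-base mapping, not from the claims
HASH_BASES = 8   # two digest bytes encoded as eight bases (assumed length)


def strand_hash(nucleotides: str) -> str:
    """Hash the payload and express the first two digest bytes as bases."""
    digest = hashlib.sha256(nucleotides.encode("ascii")).digest()[:2]
    return "".join(BASES[(b >> s) & 0b11] for b in digest for s in (6, 4, 2, 0))


def add_hash(nucleotides: str) -> str:
    """Strand as synthesized: payload followed by its strand hash."""
    return nucleotides + strand_hash(nucleotides)


def check_strands(first_read: str, second_read: str) -> list:
    """Return the indications raised for a pair of sequenced strands."""
    problems = []
    if first_read != second_read:
        problems.append("synthesis error")
    for read in (first_read, second_read):
        payload, stored = read[:-HASH_BASES], read[-HASH_BASES:]
        # The "subsequent" hash is recomputed from the sequenced payload.
        if strand_hash(payload) != stored:
            problems.append("hash mismatch")
            break
    return problems


good = add_hash("CAGACGGC")
print(check_strands(good, good))
```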

20. A method for non-volatile storage, comprising:

installing a storage device, the storage device comprising:

a flow-through cell comprising a plurality of wells having an open side accessible from a first surface of the flow-through cell, wherein the wells are adapted to accommodate a polynucleotide,

a flow control interface, and

a sequencing interface;

performing a synthesis operation to produce polynucleotides in the plurality of wells by operating a fluidic device to provide nucleotide writing reagents to the first surface;

performing a sequencing operation with a sequencing device and nucleotide reading reagents from the fluidics device to determine the nucleotides of the polynucleotides in the plurality of wells; and

using a well activation device prior to the synthesis operation and the sequencing operation:

21. The method of claim 20, wherein using the well activation device comprises operating a plurality of electrodes of the well activation device, and wherein:

at least one electrode of the plurality of electrodes is located adjacent to each well of the plurality of wells,

a control interface of the storage device is coupled to the plurality of electrodes and provides a set of control signals to the plurality of electrodes,

each of the plurality of electrodes generates a voltage based on the set of control signals, the voltage comprising a first voltage or a second voltage, and

the first voltage modifies a well of the plurality of wells proximate to an electrode that generates the first voltage to be a readable well, and the second voltage modifies a well of the plurality of wells proximate to an electrode that generates the second voltage to be a writeable well.

22. The method of claim 21, wherein operating the plurality of electrodes comprises operating at least one electrode located on a sidewall of a well of the plurality of wells.

23. The method of any one or more of claims 21 to 22, wherein operating the plurality of electrodes comprises operating at least one electrode located on the first surface at a perimeter of a well of the plurality of wells.

24. The method of any one or more of claims 21 to 23, wherein operating the plurality of electrodes comprises operating at least one electrode located at a bottom of a well of the plurality of wells, and wherein the at least one electrode comprises an annular shape that allows light to pass from a second surface of the flow cell, opposite the first surface, through the annular shape and into a well of the plurality of wells in which the at least one electrode is located.

25. The method of any one or more of claims 21 to 24, further comprising configuring the second voltage to hybridize an inhibitor of an enzyme in a well of the plurality of wells proximal to an electrode that generates the second voltage and bind a desired nucleotide to a well of the plurality of wells proximal to an electrode that generates the second voltage.

26. A method according to any one or more of claims 20 to 25, wherein using the well activation device comprises operating a plurality of pH control devices of the well activation device, and wherein:

each pH control device corresponds to one of the plurality of wells, and

each pH control device generates a voltage that controls the pH of the voltage-sensitive functionalized fluid provided by the fluidics device to modify a well of the plurality of wells corresponding to the pH control device into a readable or writeable well.

27. A method according to any one or more of claims 20 to 26, wherein operating the well activation device comprises operating a Spatial Light Modulator (SLM) to emit light into one or more wells of the plurality of wells, and wherein the emitted light modifies each of the one or more wells to be a readable well or a writable well.

28. The method of any one or more of claims 20 to 27, further comprising, during the sequencing operation:

receiving a set of address data from the sequencing apparatus, the set of address data describing nucleotides of a sequenced polynucleotide from an address well of the plurality of wells,

determining a set of target wells from the plurality of wells based on the set of address data, and

causing the fluidic device to provide the nucleotide reading reagent only to the set of target wells.

29. The method of claim 28, further comprising controlling the localization of the enzyme of the nucleotide reading reagent by causing the fluidic device to strip unwanted enzymes from the plurality of wells and causing the well activation device to provide a charged tag.

30. A method as claimed in any one or more of claims 20 to 29, wherein using the well activation device comprises:

generating a voltage to modify a well into a writable well using an electrical release mechanism of the well activation device, and

using a photonic release mechanism of the well activation device to generate photon energy to modify a well into a writable well,

wherein the set of writable wells comprises a well modified by the electrical release mechanism and a well modified by the photonic release mechanism.

31. A method according to any one or more of claims 20 to 30, wherein operating the well activation device comprises:

operating an electrowetting device of the well activation device to deliver fluid from the fluidics device to the plurality of wells, and

during simultaneous sequencing and synthesis operations:

providing a droplet of the nucleotide reading reagent to a first well of the plurality of wells, and

providing a droplet of the nucleotide writing reagent to a second well of the plurality of wells, wherein the first well and the second well are adjacent.

32. The method of any one or more of claims 20 to 31, further comprising, during the synthesizing operation:

converting the set of data into a set of nucleotides,

synthesizing a first polynucleotide strand in a first well of the set of writeable wells based on the set of nucleotides, and

33. The method of claim 32, further comprising sequencing the first polynucleotide strand and the second polynucleotide strand with the sequencing device and providing an indication of a synthesis error when the first polynucleotide strand and the second polynucleotide strand are not identical.

34. The method of any one or more of claims 32 to 33, further comprising operating a spatial light modulator to project the same optical pattern onto the first and second wells via the sequencing interface, wherein the same optical pattern is used to synthesize first and second polynucleotide strands in parallel.

35. The method of any one or more of claims 20 to 34, further comprising, during the synthesizing operation:

converting the set of data into a set of nucleotides,

synthesizing a first polynucleotide strand in a first well of the set of writeable wells based on a first portion of the set of nucleotides, and

36. The method of any one or more of claims 20 to 35, further comprising, during the synthesizing operation:

converting the set of data into a set of nucleotides,

synthesizing a first polynucleotide strand in a first well of the set of writeable wells based on the set of nucleotides,

synthesizing a second polynucleotide strand in the first well based on the set of nucleotides, wherein the first and second polynucleotide strands are identical when properly synthesized,

generating a strand hash value based on the set of nucleotides, and

synthesizing the first polynucleotide strand and the second polynucleotide strand so as to add the strand hash value.

37. The method of claim 36, further comprising, during the sequencing operation:

providing an indication of a synthesis error in the case that the sequenced set of nucleotides of the first and second polynucleotide strands are not identical,

determining a subsequent hash value for each of the first and second polynucleotide strands based on the sequenced set of nucleotides, and

providing an indication that the hash value does not match when the sequencing hash value of the first polynucleotide strand or the second polynucleotide strand does not match a subsequent hash value of the sequenced set of nucleotides.

38. A system for non-volatile storage, comprising:

a memory controller comprising a processor and a memory;

a storage device, comprising:

a flow-through cell comprising a plurality of wells having an open side accessible from a first surface of the flow-through cell, wherein the wells are adapted to accommodate a polynucleotide,

a flow control interface, and

a sequencing interface;

a fluidic device for providing one or more fluids to a first surface of the flow-through cell, wherein the one or more fluids comprise a nucleotide writing reagent and a nucleotide reading reagent;

a sequencing device for sequencing polynucleotides within the plurality of wells and determining nucleotides through the sequencing interface; and

a cache memory comprising electronic memory for storing data queued to be encoded into a set of nucleotides and synthesized into a polynucleotide in the plurality of wells.

39. The system of claim 38, wherein the cache memory is located in the storage device, and wherein the storage device is a removable storage device.

40. The system of any one or more of claims 38-39, wherein the cache memory stores one or more of:

a set of file indices describing the names and locations of data stored as polynucleotides in the plurality of wells, and

a set of checksum values that can be used to verify the integrity of data stored as polynucleotides in the plurality of wells.

41. The system of any one or more of claims 38-40, wherein the memory controller:

receives a set of input data to be written to the storage device as polynucleotides in the plurality of wells,

receives a request to read output data from polynucleotides stored in the plurality of wells,

determines whether the storage device is in a write mode or a read mode based on whether the storage device most recently received the nucleotide writing reagent or the nucleotide reading reagent from the fluidics device,

with the storage device in the write mode, writes the set of input data prior to reading the output data, and

with the storage device in the read mode, stores the set of input data on the cache memory and reads the output data prior to writing the set of input data.
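
Illustrative sketch (not part of the claims): the ordering rule above, which avoids switching reagents unnecessarily, can be modeled as a small planning function. The function name, the tuple-based operation list, and the use of a deque as the electronic cache are assumptions; how the current mode is detected is left outside the sketch.

```python
from collections import deque


def plan_operations(mode: str, input_data: bytes, read_request: str,
                    cache: deque) -> list:
    """Order one pending write and one pending read for the current mode."""
    if mode == "write":
        # Writing reagent was supplied most recently: write first, then read.
        return [("write", input_data), ("read", read_request)]
    if mode == "read":
        # Reading reagent was supplied most recently: park the input in the
        # cache, serve the read, then write the oldest cached input.
        cache.append(input_data)
        return [("read", read_request), ("write", cache.popleft())]
    raise ValueError(f"unknown mode: {mode!r}")


cache = deque()
print(plan_operations("read", b"new data", "report.txt", cache))
```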

42. The system of any one or more of claims 38-41, further comprising an electrowetting device positioned near the first surface that delivers fluid from the fluidics device into any of the plurality of wells, wherein the memory controller:

operates the electrowetting device to provide droplets of the nucleotide reading reagent to a first well of the plurality of wells to enable sequencing of a polynucleotide stored therein,

while providing droplets of the nucleotide reading reagent to the first well, identifies a second well located closest to the first well based on a plurality of requests for output data, and

operates the electrowetting device to provide a portion of the droplet of nucleotide reading reagent to the second well to sequence the polynucleotide stored therein.

43. The system of any one or more of claims 38-42, wherein the memory controller:

determines a subset of the plurality of wells that are most frequently sequenced based on past requests for output data,

operates the sequencing device to sequence the subset of the plurality of wells and generate a set of nucleotides describing polynucleotides stored within the subset of the plurality of wells,

converts the set of nucleotides into a set of digital data,

stores the set of digital data in the cache memory, and

provides the set of digital data from the cache memory in response to a subsequent request.
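
Illustrative sketch (not part of the claims): prefetching the most frequently read wells into the electronic cache could be outlined as below. The request log format, the `sequence_well` callback, the dictionary cache, and the 2-bits-per-base decoding are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable

BASES = "ACGT"  # hypothetical 2-bits-per-base mapping, not from the claims


def decode(nucleotides: str) -> bytes:
    """Convert a nucleotide string (length a multiple of 4) back into bytes."""
    out = bytearray()
    for i in range(0, len(nucleotides), 4):
        byte = 0
        for base in nucleotides[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def prefetch(past_requests: list, sequence_well: Callable[[int], str],
             cache: dict, k: int) -> None:
    """Sequence the k most requested wells once and cache the decoded data."""
    for well, _count in Counter(past_requests).most_common(k):
        cache[well] = decode(sequence_well(well))


# Simulated wells keyed by well number; a real device would sequence them.
wells = {1: "CGGC", 2: "CAGA", 3: "CAGACGGC"}
cache = {}
prefetch([3, 1, 3, 2, 3, 1], wells.__getitem__, cache, k=2)
print(cache.get(3))  # later requests for well 3 are served from the cache
```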

44. A system for non-volatile storage, comprising:

a memory controller comprising a processor and a memory;

a storage device, comprising:

a first flow-through cell comprising a first plurality of wells, wherein the first plurality of wells is adapted to hold a polynucleotide,

a second flow-through cell comprising a second plurality of wells, wherein the second plurality of wells is adapted to hold a polynucleotide,

a flow control interface, and

a sequencing interface;

a fluidics device for providing one or more fluids to the first and second plurality of wells, wherein the one or more fluids comprise a nucleotide writing reagent and a nucleotide reading reagent; and

a sequencing device for sequencing and determining nucleotides for polynucleotides within the first and second plurality of wells through the sequencing interface;

wherein, when in a mirror mode, the memory controller:

converts a set of data into a set of nucleotides, and

operates the fluidic device to produce identical polynucleotides in the first and second plurality of wells based on the set of data.

45. The system of claim 44, further comprising a spatial light modulator to project an optical pattern onto the first flow-through cell and the second flow-through cell via the sequencing interface, wherein the memory controller operates the spatial light modulator to project the same optical pattern into a first well of the first plurality of wells and a second well of the second plurality of wells to synthesize a first polynucleotide strand and a second polynucleotide strand.

46. The system of any one or more of claims 44 to 45, wherein when in an exclusive mode, the memory controller:

converts the set of data into a set of nucleotides,

designates the first plurality of wells for writing data and designates the second plurality of wells for reading data,

operates the fluidic device based on the set of data to produce polynucleotides in the first plurality of wells, and

operates the fluidics device and the sequencing device to sequence polynucleotides in the second plurality of wells based on an output request.

47. The system of claim 46, wherein the memory controller:

determines that there is no current output request, and

switches from the exclusive mode to the mirror mode.
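
Illustrative sketch (not part of the claims): the mirror and exclusive modes for two flow-through cells can be summarized as a mode-selection rule plus a write plan. The mode strings, cell labels, and function names are hypothetical.

```python
def next_mode(current_mode: str, pending_output_requests: int) -> str:
    """Fall back to mirror mode once no output request is outstanding."""
    if current_mode == "exclusive" and pending_output_requests == 0:
        return "mirror"
    return current_mode


def plan_write(mode: str, nucleotides: str) -> dict:
    """Choose which flow-through cell(s) receive the synthesized strand."""
    if mode == "mirror":
        # Mirror mode: identical polynucleotides in both cells.
        return {"cell_1": nucleotides, "cell_2": nucleotides}
    # Exclusive mode: the first cell is written while the second is read.
    return {"cell_1": nucleotides}


mode = next_mode("exclusive", pending_output_requests=0)
print(mode, plan_write(mode, "CAGACGGC"))
```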

48. A method of operating a storage device, comprising:

generating one or more commands to write specified data to a polynucleotide associated with a particular location in a storage device;

reading the polynucleotide;

performing a comparison, wherein the comparison compares the polynucleotide stored in the storage device to a particular quality control value stored in a non-nucleotide memory; and

determining, based on the comparison, whether the particular location in the storage device is to be considered as having corrupted data.

49. The method of claim 48, wherein reading the polynucleotide, performing the comparison, and determining whether a particular location in the storage device will be deemed to have corrupted data are performed automatically based on receiving a command to write to the storage device.

50. A method according to any one or more of claims 48 to 49, wherein the method comprises:

determining that the particular location in the storage device is to be considered as having corrupted data; and

based on determining that the particular location in the storage device is to be deemed as having corrupted data, writing a new polynucleotide encoding uncorrupted data to the particular location in the storage device.

51. A method according to claim 50, wherein the method includes generating the uncorrupted data based on reading information from one or more other locations in the storage device.

52. A method according to any one or more of claims 50 to 51, wherein the method comprises:

based on determining that the particular location in the storage device is to be considered as having corrupted data, and prior to writing a new polynucleotide encoding uncorrupted data to the particular location in the storage device, updating an index of the storage device to indicate that the particular location in the storage device has corrupted data; and

after writing a new polynucleotide encoding uncorrupted data to the particular location in the storage device, updating the index of the storage device to indicate that the particular location in the storage device is free of corrupted data.
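
Illustrative sketch (not part of the claims): the write, read-back, compare, and rewrite sequence of claims 48 to 52 might be outlined as follows. Using CRC-32 as the quality control value, dictionaries for the non-nucleotide memory and the index, and the `write`/`read` callbacks are all assumptions; the claims leave these choices open.

```python
import zlib
from typing import Callable


def verified_write(location: int, data: bytes,
                   write: Callable[[int, bytes], None],
                   read: Callable[[int], bytes],
                   qc_values: dict, index: dict,
                   max_attempts: int = 2) -> bool:
    """Write data, read it back, and rewrite if the location looks corrupted."""
    # Quality control value kept in non-nucleotide (electronic) memory.
    qc_values[location] = zlib.crc32(data)
    for _ in range(max_attempts):
        write(location, data)
        if zlib.crc32(read(location)) == qc_values[location]:
            index[location] = "ok"
            return True
        # Mark the location before attempting the rewrite.
        index[location] = "corrupted"
    return False


# Simulated device whose first write is corrupted.
wells, attempts = {}, {"n": 0}

def flaky_write(loc: int, data: bytes) -> None:
    attempts["n"] += 1
    wells[loc] = b"XX" if attempts["n"] == 1 else data

qc, idx = {}, {}
ok = verified_write(7, b"Hi", flaky_write, wells.__getitem__, qc, idx)
print(ok, idx[7], attempts["n"])
```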

53. A method according to any one or more of claims 48 to 52, wherein the method comprises generating the particular quality control value based on the specified data.

54. The method of any one or more of claims 48 to 53, wherein:

the storage device comprises a plurality of addressable locations;

the particular location is one of the plurality of addressable locations;

the non-nucleotide memory stores a plurality of quality control values;

the particular quality control value is one of the plurality of quality control values; and

each quality control value from the plurality of quality control values is associated with a respective addressable location from the plurality of addressable locations.

55. The method of any one or more of claims 48 to 54, wherein:

the polynucleotide comprises a check portion; and

the method includes generating the check portion based on the specified data.

56. The method of claim 55, wherein the check portion is a parity bit.
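
Illustrative sketch (not part of the claims): a parity-bit check portion could be computed from the specified data and appended to the strand as one extra base. Even parity, the base chosen for each parity value, and the function names are assumptions made for illustration.

```python
# Hypothetical mapping from parity value to the base that stores it.
PARITY_BASE = {0: "A", 1: "C"}


def parity_bit(data: bytes) -> int:
    """Even-parity bit: 1 when the data contains an odd number of 1 bits."""
    return sum(bin(byte).count("1") for byte in data) % 2


def with_check_portion(payload: str, data: bytes) -> str:
    """Append the parity base to the nucleotide payload."""
    return payload + PARITY_BASE[parity_bit(data)]


print(with_check_portion("CAGACGGC", b"Hi"))
```

A single parity bit detects any odd number of flipped bits but cannot locate or correct them, which is consistent with its use here as a corruption flag.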

57. The method of any one or more of claims 55 to 56, wherein the check portion comprises methylation or other information-retaining modification of a nucleobase in the polynucleotide.

58. A method according to any one or more of claims 55 to 57, wherein the check portion encodes data matching the particular quality control value.

59. The method of any one or more of claims 55 to 58, wherein:

a second polynucleotide is stored in the storage device, wherein the second polynucleotide comprises a second check portion that is the same as the check portion comprised by the polynucleotide; and

the method includes determining that the particular location in the storage device is to be considered as having corrupted data based on identifying a difference between the polynucleotide and the second polynucleotide.

60. The method of any one or more of claims 48-52, wherein:

the particular quality control value is the specified data; and

the comparison comprises checking whether there is any difference between the data stored in the polynucleotide and the specified data.

61. The method of any one or more of claims 48-52 or 60, wherein the method comprises:

determining that the particular location in the storage device should not be considered as having corrupted data; and

deleting the particular quality control value stored in the non-nucleotide memory based on determining that the particular location in the storage device should not be considered as having corrupted data.

62. A system comprising a storage device having one or more non-transitory computer-readable media storing instructions for the storage device to perform a method comprising:

generating one or more commands to write specified data to a polynucleotide associated with a particular location in a storage device;

reading the polynucleotide;

performing a comparison, wherein the comparison compares the polynucleotide stored in the storage device to a particular quality control value stored in a non-nucleotide memory; and

determining, based on the comparison, whether the particular location in the storage device is to be considered as having corrupted data.

63. The system of claim 62, wherein the method further comprises:

reading the polynucleotide;

performing the comparison; and

automatically determining, based on receiving a command to write to the storage device, whether a particular location in the storage device is to be considered as having corrupted data.

64. The system of any one or more of claims 62-63, wherein the method comprises:

determining that the particular location in the storage device is to be considered as having corrupted data; and

65. The system of claim 64, wherein the method comprises generating the uncorrupted data based on reading information from one or more other locations in the storage device.

66. The system of any one or more of claims 64-65, wherein the method comprises:

after writing a new polynucleotide encoding uncorrupted data to the particular location in the storage device, updating the index of the storage device to indicate that the particular location in the storage device is free of corrupted data.

67. The system of any one or more of claims 62-66, wherein the method comprises generating the particular quality control value based on the specified data.

68. The system of any one or more of claims 62-67, wherein:

the storage device comprises a plurality of addressable locations;

the particular location is one of the plurality of addressable locations;

the non-nucleotide memory stores a plurality of quality control values;

the particular quality control value is one of the plurality of quality control values; and

each quality control value from the plurality of quality control values is associated with a respective addressable location from the plurality of addressable locations.

69. The system of any one or more of claims 62-68, wherein:

the polynucleotide comprises a check portion; and

the method comprises generating the check portion based on the specified data.

70. The system of claim 69, wherein the check portion is a parity bit.

71. The system of any one or more of claims 69-70, wherein the check portion comprises methylation or other information-retaining modifications of nucleobases in the polynucleotide.

72. The system of any one or more of claims 69-71, wherein the check portion encodes data that matches the particular quality control value.

73. The system of any one or more of claims 69-72, wherein:

74. The system of any one or more of claims 62-66, wherein:

the particular quality control value is the specified data; and

the comparison comprises checking whether there is any difference between the data stored in the polynucleotide and the specified data.

75. The system of any one or more of claims 62-66 or 74, wherein the method comprises:

determining that the particular location in the storage device is not to be considered as having corrupted data; and

76. One or more non-transitory computer-readable media storing instructions for a storage device to perform a method, the method comprising:

generating one or more commands to write specified data to a polynucleotide associated with a particular location in a storage device;

reading the polynucleotide;

performing a comparison, wherein the comparison compares the polynucleotide stored in the storage device to a particular quality control value stored in a non-nucleotide memory; and

based on the comparison, determining whether the particular location in the storage device is to be considered as having corrupted data.

77. The computer readable medium of claim 76, wherein the method further comprises:

reading the polynucleotide;

performing the comparison; and

based on receiving a command to write to the storage device, automatically determining whether the particular location in the storage device is to be considered as having corrupted data.

78. The computer-readable medium of any one or more of claims 76-77, wherein the method comprises:

determining that the particular location in the storage device is to be considered as having corrupted data; and

based on determining that the particular location in the storage device is to be considered as having corrupted data, writing a new polynucleotide encoding uncorrupted data to the particular location in the storage device.

79. The computer-readable medium of claim 78, wherein the method includes generating the uncorrupted data based on reading information from one or more other locations in the storage device.

80. The computer-readable medium of any one or more of claims 78-79, wherein the method comprises:

81. The computer-readable medium of any one or more of claims 76-80, wherein the method comprises generating the particular quality control value based on the specified data.

82. The computer-readable medium of any one or more of claims 76-81, wherein:

the storage device comprises a plurality of addressable locations;

the particular location is one of the plurality of addressable locations;

the non-nucleotide memory stores a plurality of quality control values;

the particular quality control value is one of the plurality of quality control values; and

each quality control value from the plurality of quality control values is associated with a respective addressable location from the plurality of addressable locations.

83. The computer-readable medium of any one or more of claims 76-82, wherein:

the polynucleotide comprises a check portion; and

the method comprises generating the check portion based on the specified data.

84. The computer-readable medium of claim 83, wherein the check portion is a parity bit.

85. The computer-readable medium of any one or more of claims 83-84, wherein the check portion comprises methylation or other information-retaining modifications of nucleobases in the polynucleotide.

86. The computer-readable medium of any one or more of claims 83-85, wherein the check portion encodes data that matches the particular quality control value.

87. The computer-readable medium of any one or more of claims 83-86, wherein:

88. The computer-readable medium of any one or more of claims 76-80, wherein:

the particular quality control value is the specified data; and

the comparison comprises checking whether there is any difference between the data stored in the polynucleotide and the specified data.

89. The computer-readable medium of any one or more of claims 76-80 or 88, wherein the method comprises:

determining that the particular location in the storage device is not to be considered as having corrupted data; and

Background

Computer systems use a variety of different mechanisms to store data, including magnetic storage, optical storage, and solid state storage. These forms of data storage may be deficient in terms of read and write speed, data retention time, power consumption, or data density.

Just as naturally occurring DNA can be read, machine-written DNA can also be read. Existing DNA reading techniques may include array-based cycle sequencing assays (e.g., sequencing-by-synthesis (SBS)), in which dense arrays of DNA features (e.g., template nucleic acids) are sequenced through iterative cycles of enzymatic manipulations. After each cycle, an image may be captured and subsequently analyzed with other images to determine a sequence of machine-written DNA features. In another biochemical assay, an unknown analyte having an identifiable label (e.g., a fluorescent label) may be exposed to an array of known probes having predetermined addresses within the array. Observing the chemical reaction that occurs between the probe and the unknown analyte can help identify or reveal the identity of the analyte.

SUMMARY

Described herein are devices, systems, and methods for polynucleotide data storage, including features for per-well activation, simultaneous read-write caching, and multi-volume management for simultaneous read-write capability, as well as devices, systems, and methods for mitigating the risk of errors in such storage, for example, by including quality control measures in the process for reading and writing data encoded in polynucleotides.

One implementation relates to a system for non-volatile storage, comprising: a memory controller comprising a processor and a memory; a storage device, comprising: a flow-through cell comprising a plurality of wells having open sides accessible from a first surface of the flow-through cell, wherein the wells are adapted to accommodate polynucleotides, a fluidic interface and a sequencing interface; a fluidic device for providing one or more fluids to a first surface of a flow-through cell, wherein the one or more fluids comprise a nucleotide writing reagent and a nucleotide reading reagent; a sequencing device for sequencing polynucleotides within the plurality of wells and determining nucleotides through the sequencing interface; and a well activation device for: modifying one or more wells of the plurality of wells to provide a set of readable wells, wherein the set of readable wells allows exposure to nucleotide reading reagents and prevents exposure to other reagents from the fluidic device, and modifying one or more wells of the plurality of wells to provide a set of writeable wells, wherein the set of writeable wells allows exposure to nucleotide writing reagents and prevents exposure to other reagents from the fluidic device.

There are variations to any one or more of the above implementations, wherein the well activation device comprises a plurality of electrodes, and wherein: at least one electrode of the plurality of electrodes is located near each well of the plurality of wells, a control interface of the storage device is coupled with the plurality of electrodes and provides a set of control signals from the memory controller to the plurality of electrodes, each of the plurality of electrodes generates a voltage based on the set of control signals, the voltage includes a first voltage or a second voltage, the first voltage generated by an electrode of the plurality of electrodes modifies a well of the plurality of wells adjacent to that electrode into a readable well, and the second voltage generated by an electrode of the plurality of electrodes modifies a well of the plurality of wells adjacent to that electrode into a writable well.

There are variations to any one or more of the above implementations in which at least one electrode is located on a sidewall of each well of the plurality of wells.

Variations exist in any one or more of the above implementations in which at least one electrode is located on the first surface at a perimeter of each well of the plurality of wells.

There are variations to any one or more of the above implementations in which at least one electrode is located at a bottom of each well of the plurality of wells, and in which the at least one electrode comprises an annular shape that allows light to pass from a second surface of the flow cell, opposite the first surface, through the annular shape and into the well of the plurality of wells in which the at least one electrode is located.

There are variations to any one or more of the above implementations in which the second voltage causes hybridization of the enzyme inhibitor of a well of the plurality of wells near the electrode that generates the second voltage and binds the desired nucleotide to a well of the plurality of wells near the electrode that generates the second voltage.

Variations exist in any one or more of the above implementations in which: the well activation device comprises a plurality of pH control devices, each pH control device corresponding to one well of the plurality of wells, and each pH control device generates a voltage that controls the pH of the voltage sensitive functionalized fluid provided by the fluidics device to modify the wells of the plurality of wells corresponding to the pH control devices into readable or writable wells.

Variations exist in any one or more of the above implementations, wherein: the well activation device includes a Spatial Light Modulator (SLM) to emit light into one or more wells of the plurality of wells, and the emitted light modifies each of the one or more wells to be a readable well or a writable well.

There are variations to any one or more of the above implementations in which the memory controller, during a sequencing operation: receives, from the sequencing device, a set of address data describing nucleotides of polynucleotides in address wells of the plurality of wells; determines a set of target wells from the plurality of wells based on the set of address data; and causes the fluidics device to provide nucleotide reading reagents only to the set of target wells.

There are variations to any one or more of the above implementations in which the memory controller controls the localization of the enzyme of the nucleotide reading reagent during the sequencing operation by causing the fluidic device to strip the enzyme from the plurality of wells and causing the well activation device to provide a charged tag.

There are variations to any one or more of the above implementations, wherein the well activation device comprises: an electrical release mechanism that generates a voltage to modify the well into a writable well, and a photonic release mechanism that generates photonic energy to modify the well into a writable well, and wherein a set of writable wells includes a well modified by the electrical release mechanism and a well modified by the photonic release mechanism.

Variations exist in any one or more of the above implementations, wherein: the well activation device comprises an electrowetting device proximate the first surface of the flow-through cell that delivers fluid from the fluidics device to any well of the plurality of wells, and the memory controller, during simultaneous sequencing and synthesis operations, provides a droplet of nucleotide reading reagent to a first well of the plurality of wells and provides a droplet of nucleotide writing reagent to a second well of the plurality of wells, wherein the first well and the second well are adjacent.

Variations of any one or more of the above implementations exist in which each of the plurality of wells comprises: polarization features that reduce cross talk between nearby wells, and optical waveguide features that reduce cross talk between nearby wells.

There are variations to any one or more of the above implementations in which the memory controller, during the synthesis operation: converts a set of data into a set of nucleotides, synthesizes a first polynucleotide strand in a first well of the set of writeable wells based on the set of nucleotides, and synthesizes a second polynucleotide strand in a second well of the set of writeable wells based on the set of nucleotides, wherein the first and second polynucleotide strands are identical when correctly synthesized.

There is a variation of any one or more of the implementations above wherein the memory controller sequences the first polynucleotide strand and the second polynucleotide strand with a sequencing device and provides an indication of a synthesis error when they are not identical.
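By way of non-limiting illustration, the conversion of data into nucleotides and the comparison of two mirrored strands may be sketched as follows. The two-bits-per-base mapping and the names `encode`, `decode`, and `synthesis_error` are assumptions made for this sketch only; the implementations above do not prescribe a particular encoding.

```python
# Hypothetical mapping of two bits to one base; not specified by the source.
BASES = "ACGT"

def encode(data: bytes) -> str:
    """Convert bytes to a nucleotide string, two bits per base, high bits first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))

def decode(strand: str) -> bytes:
    """Inverse of encode(); the strand length must be a multiple of four."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)

def synthesis_error(first_read: str, second_read: str) -> bool:
    """Return True when the sequenced strands from the two wells differ."""
    return first_read != second_read
```

In this sketch, the same encoded string would be synthesized in both wells, and a later sequencing pass would call `synthesis_error` on the two reads.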

There are variations to any one or more of the above implementations further comprising a spatial light modulator for projecting an optical pattern onto the flow cell through the sequencing interface, wherein the memory controller operates the spatial light modulator to project the same optical pattern into the first well and the second well to synthesize the first polynucleotide strand and the second polynucleotide strand.

There are variations to any one or more of the above implementations in which the memory controller, during the synthesis operation: converts a set of data into a set of nucleotides, synthesizes a first polynucleotide strand in a first well of the set of writeable wells based on a first portion of the set of nucleotides, and synthesizes a second polynucleotide strand in a second well of the set of writeable wells based on a second portion of the set of nucleotides in parallel with the synthesis of the first polynucleotide strand, wherein the first polynucleotide strand and the second polynucleotide strand collectively represent the entirety of the set of nucleotides.
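By way of non-limiting illustration, dividing a set of nucleotides into portions for parallel synthesis in separate wells may be sketched as follows. The contiguous split and the names `stripe` and `reassemble` are assumptions of this sketch; other ways of dividing the set are equally possible.

```python
def stripe(nucleotides: str, wells: int = 2) -> list[str]:
    """Split a nucleotide sequence into contiguous portions, one per well."""
    size = -(-len(nucleotides) // wells)  # ceiling division
    return [nucleotides[i * size:(i + 1) * size] for i in range(wells)]

def reassemble(portions: list[str]) -> str:
    """Join the portions read back from each well, in well order."""
    return "".join(portions)
```

Each portion would be synthesized in its own well at the same time, and the portions together represent the entire set of nucleotides.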

There are variations to any one or more of the above implementations in which the memory controller, during the synthesis operation: converts a set of data into a set of nucleotides, synthesizes a first polynucleotide strand in a first well of the set of writable wells based on the set of nucleotides, synthesizes a second polynucleotide strand in the first well based on the set of nucleotides, wherein the first and second polynucleotide strands are identical when correctly synthesized, generates a strand hash value based on the set of nucleotides, and synthesizes the first and second polynucleotide strands to include the strand hash value.

There are variations to any one or more of the above implementations in which the memory controller, during a sequencing operation: sequences the first polynucleotide strand and the second polynucleotide strand in the first well with a sequencing device to determine, for each polynucleotide strand, a sequenced set of nucleotides and a sequencing hash value; provides an indication of a synthesis error when the sequenced sets of nucleotides of the first polynucleotide strand and the second polynucleotide strand are not the same; and provides an indication of a hash value mismatch when the sequencing hash value of the first polynucleotide strand or the second polynucleotide strand does not match a subsequent hash value determined from the sequenced set of nucleotides.
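By way of non-limiting illustration, adding a hash value to each strand and checking it after sequencing may be sketched as follows. The use of a truncated SHA-256 digest, the 16-base hash length, and the names `strand_hash`, `with_hash`, and `check_strands` are assumptions of this sketch, not features stated above.

```python
import hashlib

BASES = "ACGT"    # hypothetical two-bits-per-base alphabet
HASH_BASES = 16   # hypothetical: 16 bases carry 32 bits of digest

def strand_hash(nucleotides: str) -> str:
    """Derive a short nucleotide-encoded hash from a payload sequence."""
    digest = hashlib.sha256(nucleotides.encode("ascii")).digest()[:HASH_BASES // 4]
    return "".join(BASES[(b >> s) & 0b11] for b in digest for s in (6, 4, 2, 0))

def with_hash(nucleotides: str) -> str:
    """Sequence to synthesize: payload followed by its hash."""
    return nucleotides + strand_hash(nucleotides)

def check_strands(first_read: str, second_read: str) -> list[str]:
    """Return the indications raised for two strands sequenced from one well."""
    problems = []
    if first_read != second_read:
        problems.append("synthesis error")
    for read in (first_read, second_read):
        payload, sequenced_hash = read[:-HASH_BASES], read[-HASH_BASES:]
        # The "subsequent" hash is recomputed from the sequenced payload.
        if strand_hash(payload) != sequenced_hash:
            problems.append("hash mismatch")
            break
    return problems
```

Two identical, intact reads produce no indication; a read whose payload no longer agrees with its trailing hash produces a mismatch indication.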

Another implementation relates to a method for non-volatile polynucleotide storage, comprising: installing a storage device, the storage device comprising: a flow-through cell comprising a plurality of wells having open sides accessible from a first surface of the flow-through cell, wherein the wells are adapted to accommodate polynucleotides, a fluidic interface and a sequencing interface; performing a synthesis operation to produce polynucleotides in the plurality of wells by operating a fluidic device to provide nucleotide writing reagents to the first surface; performing a sequencing operation with a sequencing device and nucleotide reading reagents from the fluidic device to determine nucleotides of the polynucleotides in the plurality of wells; and using a well activation device, prior to the synthesis operation and the sequencing operation, to: modify one or more wells from the plurality of wells to provide a set of readable wells, wherein the set of readable wells allows exposure to nucleotide reading reagents and prevents exposure to other reagent fluids from the fluidic device, and modify one or more wells from the plurality of wells to provide a set of writeable wells, wherein the set of writeable wells allows exposure to nucleotide writing reagents and prevents exposure to other reagent fluids from the fluidic device.

There are variations to any one or more of the above implementations, wherein using the well activation device comprises operating a plurality of electrodes of the well activation device, and wherein: at least one electrode of the plurality of electrodes is located near each well of the plurality of wells, a control interface of the storage device is coupled with the plurality of electrodes and provides a set of control signals to the plurality of electrodes, each of the plurality of electrodes generates a voltage based on the set of control signals, the voltage includes a first voltage or a second voltage, and the first voltage modifies a well of the plurality of wells near the electrode generating the first voltage to a readable well and the second voltage generated by the electrode of the plurality of electrodes modifies a well of the plurality of wells near the electrode generating the second voltage to a writable well.

There are variations to any one or more of the above implementations in which operating the plurality of electrodes comprises operating at least one electrode located on a sidewall of a well of the plurality of wells.

There are variations to any one or more of the above implementations in which operating the plurality of electrodes comprises operating at least one electrode located on the first surface at a perimeter of a well of the plurality of wells.

There are variations to any one or more of the above implementations in which operating the plurality of electrodes comprises operating at least one electrode located at a well bottom of the plurality of wells, and in which the at least one electrode comprises an annular shape that allows light to pass through the annular shape from a second surface of the flow cell opposite the first surface and into a well of the plurality of wells in which the at least one electrode is located.

Variations of any one or more of the above implementations further comprise configuring the second voltage to cause hybridization of an enzyme inhibitor of a well of the plurality of wells in proximity to the electrode that generates the second voltage and to bind the desired nucleotide into the well of the plurality of wells in proximity to the electrode that generates the second voltage.

There are variations to any one or more of the above implementations wherein using the well activation device comprises operating a plurality of pH control devices of the well activation device, and wherein: each pH control device corresponds to one of the plurality of wells, and each pH control device generates a voltage that controls the pH of the voltage sensitive functionalized fluid provided by the fluidics device to modify the wells of the plurality of wells corresponding to the pH control devices into readable or writeable wells.

There are variations to any one or more of the above implementations, wherein using the well activation device comprises operating a Spatial Light Modulator (SLM) to emit light into one or more wells of the plurality of wells, and wherein the emitted light modifies each of the one or more wells to be a readable well or a writable well.

Variations exist in any one or more of the above implementations, further comprising, during the sequencing operation: receiving a set of address data from the sequencing device, the set of address data describing nucleotides of the sequenced polynucleotide from an address well of the plurality of wells, determining a set of target wells from the plurality of wells based on the set of address data, and causing the fluidics device to provide nucleotide reading reagents only to the set of target wells.

Variations exist in any one or more of the above implementations, further comprising controlling the localization of the enzyme of the nucleotide reading reagent by causing the fluidic device to strip unwanted enzymes from the plurality of wells and causing the well activation device to provide a charged tag.

There are variations to any one or more of the above implementations, wherein using a well activation device comprises: generating a voltage using an electrical release mechanism of the well activation device to modify the well into a writable well, and generating photonic energy using a photonic release mechanism of the well activation device to modify the well into a writable well, and wherein the set of writable wells comprises a well modified by the electrical release mechanism and a well modified by the photonic release mechanism.

There are variations to any one or more of the above implementations in which using the well activation device comprises: operating an electrowetting device of the well activation device to deliver fluid from the fluidics device to the plurality of wells, and during simultaneous sequencing and synthesis operations: providing a droplet of nucleotide reading reagent to a first well of the plurality of wells and providing a droplet of nucleotide writing reagent to a second well of the plurality of wells, wherein the first well and the second well are adjacent.

Variations exist in any one or more of the above implementations, further comprising during the synthesizing operation: converting a set of data into a set of nucleotides, synthesizing a first polynucleotide strand in a first well of the set of writeable wells based on the set of nucleotides, and synthesizing a second polynucleotide strand in a second well of the set of writeable wells based on the set of nucleotides, wherein the first and second polynucleotide strands are identical when correctly synthesized.

Variations exist in any one or more of the above implementations, further comprising sequencing the first polynucleotide strand and the second polynucleotide strand with a sequencing device, and when they are not identical, providing an indication of a synthesis error.

There are variations to any one or more of the above implementations, further comprising operating the spatial light modulator to project a same optical pattern onto the first well and the second well through the sequencing interface, wherein the same optical pattern is used to synthesize the first polynucleotide strand and the second polynucleotide strand in parallel.

Variations exist in any one or more of the above implementations, further comprising during the synthesizing operation: converting a set of data into a set of nucleotides, synthesizing a first polynucleotide strand in a first well of the set of writeable wells based on a first portion of the set of nucleotides, and synthesizing a second polynucleotide strand in a second well of the set of writeable wells based on a second portion of the set of nucleotides in parallel with synthesizing the first polynucleotide strand, wherein the first polynucleotide strand and the second polynucleotide strand collectively represent the entirety of the set of nucleotides.

Variations exist in any one or more of the above implementations, further comprising, during the synthesizing operation: converting a set of data into a set of nucleotides, synthesizing a first polynucleotide strand in a first well of the set of writable wells based on the set of nucleotides, synthesizing a second polynucleotide strand in the first well based on the set of nucleotides, wherein the first and second polynucleotide strands are identical when correctly synthesized, generating a strand hash value based on the set of nucleotides, and synthesizing the first and second polynucleotide strands to include the strand hash value.

Variations exist in any one or more of the above implementations, further comprising, during the sequencing operation: sequencing the first polynucleotide strand and the second polynucleotide strand in the first well with a sequencing device to determine a sequenced set of nucleotides and a sequencing hash value for each polynucleotide strand, providing an indication of a synthesis error when the sequenced sets of nucleotides for the first polynucleotide strand and the second polynucleotide strand are not the same, determining a subsequent hash value for each of the first polynucleotide strand and the second polynucleotide strand based on the sequenced set of nucleotides, and providing an indication that the hash values do not match when the sequencing hash value for the first polynucleotide strand or the second polynucleotide strand does not match the subsequent hash value for the sequenced set of nucleotides.

Yet another implementation relates to a system for non-volatile polynucleotide storage, comprising: a memory controller comprising a processor and a memory; a storage device, comprising: a flow-through cell comprising a plurality of wells having open sides accessible from a first surface of the flow-through cell, wherein the wells are adapted to accommodate polynucleotides, a fluidic interface and a sequencing interface; a fluidic device for providing one or more fluids to a first surface of a flow-through cell, wherein the one or more fluids comprise a nucleotide writing reagent and a nucleotide reading reagent; a sequencing device for sequencing polynucleotides within the plurality of wells and determining nucleotides through the sequencing interface; and a cache memory comprising electronic memory to store data queued for encoding into a set of nucleotides and synthesized into polynucleotides in a plurality of wells.

There are variations to any one or more of the above implementations in which the cache memory is located in a storage device, and in which the storage device is a removable storage device.

Variations exist in any one or more of the above implementations in which the cache memory stores one or more of: a set of file indices describing the names and locations of data stored as polynucleotides within the plurality of wells, and a set of checksum values that may be used to verify the integrity of the data stored as polynucleotides within the plurality of wells.

Variations exist in any one or more of the above implementations in which the memory controller: receives a set of input data to be written to the storage device as polynucleotides in the plurality of wells; receives a request to read output data from the polynucleotides stored in the plurality of wells; determines whether the storage device is in a write mode or a read mode based on whether the storage device most recently received a nucleotide writing reagent or a nucleotide reading reagent from the fluidic device; when the storage device is in the write mode, writes the set of input data before reading the output data; and when the storage device is in the read mode, stores the set of input data in the cache memory and reads the output data before writing the set of input data.
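By way of non-limiting illustration, the ordering of a pending write and a pending read according to the current mode may be sketched as follows. The class `MemoryController`, its `log` attribute, and the string mode names are hypothetical stand-ins for this sketch; the fluidic and sequencing operations themselves are only recorded, not performed.

```python
from collections import deque

class MemoryController:
    """Toy scheduler: orders one write and one read based on the current mode."""

    def __init__(self, mode: str = "read"):
        self.mode = mode      # "write" or "read": the most recently supplied reagent
        self.cache = deque()  # electronic cache for input data awaiting synthesis
        self.log = []         # order in which operations reach the flow cell

    def handle(self, input_data: bytes, read_request: str) -> None:
        if self.mode == "write":
            # Writing reagent is already present: write first, then read.
            self.log.append(("write", input_data))
            self.log.append(("read", read_request))
            self.mode = "read"
        else:
            # Reading reagent is present: park the input in the cache, read first.
            self.cache.append(input_data)
            self.log.append(("read", read_request))
            while self.cache:
                self.log.append(("write", self.cache.popleft()))
            self.mode = "write"
```

The design intent reflected here is to avoid an unnecessary reagent change: whichever operation matches the reagent already in the flow cell is served first.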

There are variations to any one or more of the above implementations, further comprising an electrowetting device positioned near the first surface that delivers fluid from the fluidic device to any of the plurality of wells, wherein the memory controller: operates the electrowetting device to provide droplets of a nucleotide reading reagent to a first well of the plurality of wells to enable sequencing of polynucleotides stored therein; while the droplets are provided to the first well, identifies, based on a plurality of output data requests, a second well that is closest to the first well; and operates the electrowetting device to provide a portion of the droplets of the nucleotide reading reagent to the second well to enable sequencing of polynucleotides stored therein.

Variations exist in any one or more of the above implementations in which the memory controller: determines, based on past requests for output data, a subset of the plurality of wells that are most frequently sequenced; operates the sequencing device to sequence the subset of the plurality of wells and generate a set of nucleotides that describe polynucleotides stored within the subset of the plurality of wells; converts the set of nucleotides into a set of digital data; stores the set of digital data in the cache memory; and provides the set of digital data from the cache memory in response to subsequent requests.
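By way of non-limiting illustration, selecting the most frequently sequenced wells and serving later requests from the cache may be sketched as follows. The function names and the `sequence_well` and `decode` callables are hypothetical placeholders for the sequencing device and the nucleotide-to-data conversion.

```python
from collections import Counter

def hottest_wells(past_requests, k):
    """Return the k wells that appear most often in past output requests."""
    return [well for well, _ in Counter(past_requests).most_common(k)]

def build_cache(past_requests, k, sequence_well, decode):
    """Sequence the hottest wells once and keep their decoded data in a dict."""
    return {well: decode(sequence_well(well))
            for well in hottest_wells(past_requests, k)}

def read(well, cache, sequence_well, decode):
    """Serve from the cache when possible; otherwise sequence the well."""
    if well in cache:
        return cache[well]
    return decode(sequence_well(well))
```

A request for a cached well is answered from electronic memory without a further sequencing pass.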

Yet another implementation relates to a system for non-volatile polynucleotide storage, comprising: a memory controller comprising a processor and a memory; a storage device, comprising: a first flow-through cell comprising a first plurality of wells, wherein the first plurality of wells is adapted to hold a polynucleotide; a second flow-through cell comprising a second plurality of wells, wherein the second plurality of wells is adapted to hold a polynucleotide; a fluidics device for providing one or more fluids to the first and second plurality of wells, wherein the one or more fluids comprise a nucleotide writing reagent and a nucleotide reading reagent; and a sequencing device for sequencing polynucleotides within the first and second plurality of wells and determining nucleotides through the sequencing interface, wherein, when in mirror mode, the memory controller converts a set of data into a set of nucleotides and operates the fluidics device to produce identical polynucleotides in the first and second plurality of wells based on the set of data.

There are variations to any one or more of the above implementations, further comprising a spatial light modulator to project an optical pattern onto the first flow-through cell and the second flow-through cell via the sequencing interface, wherein the memory controller operates the spatial light modulator to project the same optical pattern into a first well of the first plurality of wells and a second well of the second plurality of wells to synthesize the first polynucleotide strand and the second polynucleotide strand.

There are variations to any one or more of the above implementations in which, when in the dedicated mode, the memory controller: converts a set of data into a set of nucleotides, designates the first plurality of wells for writing data and the second plurality of wells for reading data, operates the fluidics device to generate polynucleotides in the first plurality of wells based on the set of data, and operates the fluidics device and the sequencing device to sequence the polynucleotides in the second plurality of wells based on an output request.

Variations exist in any one or more of the above implementations in which the memory controller determines that there is no current output request and switches from the dedicated mode to the mirror mode.
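By way of non-limiting illustration, the choice between the mirror mode and the dedicated mode may be sketched as follows. The function names `next_mode` and `write_targets`, and the labels "first" and "second" for the two flow-through cells, are assumptions of this sketch.

```python
def next_mode(current_mode: str, pending_output_requests: int) -> str:
    """Switch from dedicated to mirror mode when no output request is pending."""
    if current_mode == "dedicated" and pending_output_requests == 0:
        return "mirror"
    return current_mode

def write_targets(mode: str) -> list[str]:
    """Flow-through cells that receive newly synthesized polynucleotides."""
    if mode == "mirror":
        return ["first", "second"]  # identical strands in both cells
    return ["first"]                # second cell is reserved for reading
```

The sketch covers only the switch described above; when and how a controller would return to the dedicated mode is not addressed here.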

Another implementation relates to a method for mitigating a risk of errors in a storage device. In such implementations, the method can include generating one or more commands to write specified data to a polynucleotide associated with a particular location in the storage device. The method may further comprise reading the polynucleotide and performing a comparison. In such implementations, the comparison may compare the polynucleotide stored in the storage device to a particular quality control value stored in a non-nucleotide memory. In some such implementations, the method may include determining, based on the comparison, whether the particular location in the storage device is to be considered as having corrupted data.

There are variations to any one or more of the above implementations in which reading the polynucleotide, performing the comparison, and determining whether the particular location in the storage device is to be considered as having corrupted data are performed automatically based on receiving a command to write to the storage device.

There are variations on any one or more of the implementations described above in which the method may include determining that a particular location in the storage device is to be considered as having corrupted data. In some such implementations, the method may include, based on determining that a particular location in the storage device is to be considered as having corrupted data, writing a new polynucleotide encoding uncorrupted data to the particular location in the storage device.

There are variations on any one or more of the implementations described above in which the method may include generating uncorrupted data based on reading information from one or more other locations in the storage device.

There may be variations to any one or more of the implementations described above wherein the method may include, based on determining that a particular location in the storage device is to be considered as having corrupted data, and prior to writing a new polynucleotide encoding uncorrupted data to the particular location in the storage device, updating an index of the storage device to indicate that the particular location in the storage device has corrupted data. In some such implementations, the method may include, after writing a new polynucleotide encoding uncorrupted data to a particular location in the storage device, updating an index of the storage device to indicate that the particular location in the storage device is free of corrupted data.
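The write, read-back, compare, and repair flow described in the preceding implementations can be sketched as follows. This is an illustrative simulation only: `SimulatedStore` and `qc_value` are hypothetical names, the choice of a SHA-256 digest as the quality control value is an assumption, and the `fault` flag merely imitates a synthesis error.

```python
import hashlib


def qc_value(data: bytes) -> str:
    """Hypothetical quality control value: a SHA-256 digest of the specified data."""
    return hashlib.sha256(data).hexdigest()


class SimulatedStore:
    """Stand-in for a DNA storage device plus a non-nucleotide memory."""

    def __init__(self):
        self.wells = {}   # location -> data as read back from the polynucleotide
        self.qc = {}      # location -> quality control value (non-nucleotide memory)
        self.index = {}   # location -> "ok" or "corrupted"

    def write(self, location, data: bytes, fault: bool = False):
        self.qc[location] = qc_value(data)
        # A fault flips the first byte to imitate a synthesis error.
        stored = (bytes([data[0] ^ 0xFF]) + data[1:]) if fault else data
        self.wells[location] = stored

    def verify(self, location) -> bool:
        """Read back, compare with the stored value, and update the index."""
        ok = qc_value(self.wells[location]) == self.qc[location]
        self.index[location] = "ok" if ok else "corrupted"
        return ok

    def repair(self, location, uncorrupted: bytes) -> bool:
        """Rewrite a location flagged as corrupted, then verify it again."""
        self.write(location, uncorrupted)
        return self.verify(location)
```

For example, writing `b"hello"` with `fault=True`, calling `verify`, and then calling `repair` with the original bytes moves the index entry for that location from `"corrupted"` to `"ok"`.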

There are variations to any one or more of the above implementations, wherein the method may include generating the particular quality control value based on the specified data.

There are variations of any one or more of the above implementations in which a storage device may comprise a plurality of addressable locations. In some such implementations, a particular location may be comprised of multiple addressable locations. In some such implementations, the non-nucleotide memory can store a plurality of quality control values. In some such implementations, the particular quality control value is comprised of a plurality of quality control values. In some such implementations, each quality control value from the plurality of quality control values is associated with a corresponding addressable location from the plurality of addressable locations.

There are variations to any one or more of the above implementations in which the polynucleotide can comprise an inspection portion. In some such implementations, the method may include generating the inspection portion based on the specified data.

There are variations to any one or more of the above implementations in which the inspection portion may be a parity bit.
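A parity bit appended to a polynucleotide can be sketched as follows. The two-bits-per-base mapping and the function names are assumptions made for illustration; the disclosure does not prescribe a particular mapping.

```python
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}  # assumed mapping
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def parity_base(payload: str) -> str:
    """Return the base that encodes even parity over the payload's bits."""
    ones = sum(BASE_TO_BITS[base].count("1") for base in payload)
    return BITS_TO_BASE["0" + str(ones % 2)]  # "A" if even, "C" if odd


def with_parity(payload: str) -> str:
    """Append the parity base to the data-carrying bases."""
    return payload + parity_base(payload)


def passes_check(strand: str) -> bool:
    """Recompute parity over the payload and compare with the final base."""
    payload, check = strand[:-1], strand[-1]
    return parity_base(payload) == check
```

A single parity bit detects only an odd number of flipped bits. Under the assumed mapping, a substitution such as A to T changes two bits and would go unnoticed, which is one reason stronger checks may be preferred.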

There are variations in any one or more of the above implementations in which the inspection portion can include methylation or other information-retaining modifications of nucleobases in the polynucleotide.

There are variations to any one or more of the above implementations in which the inspection portion may encode data that matches a particular quality control value.

There are variations to any one or more of the above implementations in which a second polynucleotide may be stored in the storage device, and the second polynucleotide may comprise a second inspection portion that is the same as the inspection portion comprised by the polynucleotide. In some such implementations, the method can include determining that the particular location in the storage device is to be considered as having corrupted data based on identifying a difference between the polynucleotide and the second polynucleotide.

There are variations to any one or more of the above implementations in which the particular quality control value may be the specified data. In some such implementations, the comparison may include checking whether there are any differences between the data stored in the polynucleotide and the specified data.

There are variations to any one or more of the implementations described above in which the method may include determining that the particular location in the storage device should not be considered as having corrupted data. In some such implementations, the method can include, based on determining that the particular location in the storage device should not be considered as having corrupted data, deleting the particular quality control value stored in the non-nucleotide memory.

Another implementation relates to a system that includes a storage device having one or more non-transitory computer-readable media storing instructions for the storage device to perform a method. In some such implementations, the method may include generating one or more commands to write specified data to a polynucleotide associated with a particular location in the storage device. In some such implementations, the method can include reading the polynucleotide and performing a comparison. In some such implementations, the comparison may compare the polynucleotide stored in the storage device to a particular quality control value stored in a non-nucleotide memory. In some such implementations, the method may include determining, based on the comparison, whether the particular location in the storage device is to be considered as having corrupted data.

There are variations to any one or more of the implementations described above in which the method can include, based on receiving a command to write to the storage device, automatically reading the polynucleotide, performing the comparison, and determining whether the particular location in the storage device is to be considered as having corrupted data.

There are variations to any one or more of the above implementations, wherein the method may include generating the particular quality control value based on the specified data.

There are variations to any one or more of the above implementations in which the inspection portion may be a parity bit.

There are variations to any one or more of the above implementations in which the inspection portion may encode data that matches a particular quality control value.

Yet another implementation relates to one or more non-transitory computer-readable media storing instructions for a storage device to perform a method. In some such implementations, the method may include generating one or more commands to write specified data to a polynucleotide associated with a particular location in the storage device. In some such implementations, the method can include reading the polynucleotide and performing a comparison. In some such implementations, the comparison may compare the polynucleotide stored in the storage device to a particular quality control value stored in a non-nucleotide memory. In some such implementations, the method may include determining, based on the comparison, whether the particular location in the storage device is to be considered as having corrupted data.

There are variations to any one or more of the above implementations, wherein the method may include generating the particular quality control value based on the specified data.

There are variations to any one or more of the above implementations in which the inspection portion may be a parity bit.

There are variations to any one or more of the above implementations in which the inspection portion may encode data that matches a particular quality control value.

It is to be understood that all combinations of the foregoing concepts and additional concepts discussed in greater detail below (provided that such concepts are not mutually inconsistent) are contemplated as being part of the inventive subject matter disclosed herein and as achieving the benefits and advantages described herein. In particular, all combinations of claimed subject matter appearing at the end of this disclosure are contemplated as being part of the inventive subject matter disclosed herein.

Brief Description of Drawings

The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages will become apparent from the description, the drawings, and the claims, wherein:

FIG. 1 depicts a schematic block diagram of an example of a system that may be used to perform biochemical processes;

FIG. 2 depicts a schematic cross-sectional block diagram of an example of a consumable cartridge that may be used with the system of FIG. 1;

FIG. 3 depicts a perspective view of an example of a flow cell that may be used with the system of FIG. 1;

FIG. 4 depicts an enlarged perspective view of the channels of the flow cell of FIG. 3;

FIG. 5 depicts a schematic cross-sectional view of an example of a well that may be incorporated into the channel of FIG. 4;

FIG. 6 depicts a flow diagram of an example of a process for reading a polynucleotide;

FIG. 7 depicts a schematic cross-sectional view of another example of a well that may be incorporated into the channel of FIG. 4;

FIG. 8 depicts a flow diagram of an example of a process for writing a polynucleotide;

FIG. 9 depicts a top view of an example of an electrode assembly;

FIG. 10 depicts a schematic cross-sectional view of another example of a well that may be incorporated into the channel of FIG. 4;

FIG. 11A depicts a top view of an example of a well that may be incorporated into the channel of FIG. 4;

FIG. 11B depicts a top view of another example of a well that may be incorporated into the channel of FIG. 4;

FIG. 11C depicts a top view of yet another example of a well that may be incorporated into the channel of FIG. 4;

FIG. 12A depicts a schematic of an example of a portion of a DNA storage system operable to read data from, or write data to, one or more wells;

FIG. 12B depicts a schematic diagram illustrating an example of a light pattern projected onto a plurality of wells by the DNA storage system of FIG. 12A;

FIG. 13A depicts a schematic of another example of a portion of a DNA storage system operable to read data from, or write data to, one or more wells;

FIG. 13B depicts a schematic illustrating an example of electrowetting of a plurality of wells;

FIG. 14 depicts a flow diagram of a process that may be performed to provide controlled read and write regions to multiple wells;

FIG. 15 depicts a schematic diagram illustrating an example of a storage device that may be used with a DNA storage system;

FIG. 16 depicts a flowchart of a process that may be performed to provide caching of read and write operations to the storage device of FIG. 15;

FIG. 17A depicts a schematic of another example of a storage device that may be used with a DNA storage system;

FIG. 17B depicts a schematic of a configuration of a plurality of storage devices that may be used with a DNA storage system;

FIG. 18 depicts a flowchart of a process that may be performed to provide redundant data writes and reads with a storage device;

FIG. 19 depicts a flowchart of a process that may be performed to provide high speed data writing and reading with a storage device;

FIG. 20 depicts a flowchart of a process that may be performed to provide for simultaneous reading and writing with multiple storage devices;

FIG. 21 depicts a schematic of an example of a DNA storage system; and

FIG. 22 depicts a process of reading and writing data that may be performed in some implementations.

It will be appreciated that some or all of the figures are schematic illustrations for purposes of illustration. The drawings are provided for the purpose of illustrating one or more implementations, it being expressly understood that they are not intended to limit the scope or meaning of the claims.

Detailed Description

In some aspects, disclosed herein are methods and systems for providing selective activation of storage wells, simultaneous well reading and writing, and mitigation of data errors in DNA storage devices containing machine-written DNA, and for synthesizing DNA (or other biological material) to store data or other information; and/or reading machine-written DNA (or other biological material) to retrieve machine-written data or other information. Machine-written DNA may replace conventional forms of data storage (e.g., magnetic, optical, and solid-state storage). Machine-written DNA can provide faster read and write speeds, longer data retention times, lower power consumption, and higher data densities. Although the examples described herein refer to a "DNA storage system" or "DNA storage device," it should be understood that this is only one example of polynucleotide storage. The teachings herein can be readily applied to storage systems and devices that utilize polynucleotides that are not necessarily in the form of DNA. Thus, the present invention is not limited to the use of DNA as the only polynucleotide for storage as described herein. Furthermore, polynucleotides are only one example of biological material that can be used for storage as described herein.

An example of how digital information can be stored in DNA is disclosed in U.S. Publication No. 2015/0261664, entitled "High-Capacity Storage of Digital Information in DNA," published on September 17, 2015, the entire contents of which are incorporated herein by reference. For example, methods from coding theory can be used to enhance the recoverability of encoded messages from DNA fragments, including the avoidance of DNA homopolymers (i.e., runs of more than one identical base), which are known to be associated with higher error rates in existing high-throughput techniques. Furthermore, an error detection portion, similar to a parity bit, may be integrated into the index information in the code. More sophisticated schemes, including but not limited to error-correcting codes and, in fact, substantially any form of digital data security currently employed in informatics (e.g., RAID-based schemes), may be implemented in future developments of DNA storage schemes. The DNA encoding of the information can be computed with software. The bytes making up each computer file may be represented by a homopolymer-free DNA sequence, using a coding scheme that replaces each byte with five or six bases to produce an encoded file.
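One way to obtain a DNA sequence without homopolymers is to write the data in base 3 and let each ternary digit select one of the three bases that differ from the previous base. The sketch below is a simplified illustration under stated assumptions: it uses a fixed six ternary digits per byte rather than a variable five or six, the candidate ordering is arbitrary rather than the table of the referenced publication, and the `start` base is a virtual predecessor that is not written. All function names are hypothetical.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six ternary digits cover one byte


def byte_to_trits(value: int) -> list:
    """Write one byte as six base-3 digits, most significant first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        value, trit = divmod(value, 3)
        trits.append(trit)
    return trits[::-1]


def encode(data: bytes, start: str = "A") -> str:
    prev, out = start, []
    for byte in data:
        for trit in byte_to_trits(byte):
            # Three candidates remain once the previous base is excluded.
            prev = [b for b in BASES if b != prev][trit]
            out.append(prev)
    return "".join(out)


def decode(dna: str, start: str = "A") -> bytes:
    prev, trits = start, []
    for base in dna:
        trits.append([b for b in BASES if b != prev].index(base))
        prev = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)
```

Because every emitted base differs from its predecessor by construction, no two adjacent bases in the output are identical, and `decode` recovers the original bytes as long as the same `start` base is used.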

The codes used in the coding scheme may be configured to allow direct coding (e.g., without repeated nucleotides) that approaches the optimal information capacity for run-length limited channels, although other coding schemes may be used. The resulting electronic DNA sequence may be too long to be easily produced by standard oligonucleotide synthesis, and may be divided into overlapping fragments of 100 bases in length, with 75 bases overlapping. To reduce the risk of systematic synthesis errors introducing any particular base run, alternating ones of the fragments may be converted to their reverse complements, meaning that each base may be "written" four times, twice in each direction. Each fragment may then be augmented with index information that allows the determination of the computer file from which the fragment originated and its location in that computer file plus simple error detection information. Such index information may also be encoded as non-repetitive DNA nucleotides and appended to the information storing bases of the DNA fragment. The division of the DNA fragments into lengths of 100 bases and overlaps of 75 bases is purely arbitrary and illustrative, and it will be appreciated that other lengths and overlaps may be used, and are not limiting.
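The fragmentation step can be sketched as follows, using the illustrative values of 100-base fragments with 75 bases of overlap. The function names are hypothetical, and the sketch ignores any trailing bases that do not fill a whole fragment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def fragment(seq: str, length: int = 100, overlap: int = 75) -> list:
    """Cut seq into overlapping pieces, reverse-complementing every other one."""
    step = length - overlap  # 25 bases with the illustrative values
    pieces = []
    for n, start in enumerate(range(0, len(seq) - length + 1, step)):
        piece = seq[start:start + length]
        pieces.append(reverse_complement(piece) if n % 2 else piece)
    return pieces
```

With a step of 25 bases, each interior base falls within four consecutive fragments, two of which are stored in each orientation. Bases near either end of the sequence are covered fewer times unless the sequence is padded, which this sketch does not do.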

Other encoding schemes for the DNA fragments may be used, for example to provide enhanced error correction performance. The amount of index information may be increased to allow more or larger files to be encoded. To avoid systematic patterns in the DNA fragments, one extension of the coding scheme may be to add variation to the information. One method is to "shuffle" the information in the DNA fragments; if the shuffle pattern is known, the information can be retrieved. Different shuffle patterns can be used for different DNA fragments. Another approach is to add a degree of randomness to the information in each DNA fragment. For this purpose, a series of random numbers can be combined, by modular addition, with the digits containing the information encoded in the DNA fragment. If the series of random numbers used is known, the information can be retrieved by modular subtraction in the decoding process. Different series of random numbers can be used for different DNA fragments. The data encoding portion of each string may contain Shannon information of 5.07 bits per DNA base, which is close to the theoretical optimum of 5.05 bits per DNA base for a base-4 channel with a run length limit of 1. An index implementation may allow 3¹⁴ = 4,782,969 unique data locations. Increasing the number of index ternary digits (and thus also bases) used to specify a file and a location within the file by two, i.e., to 16, results in 3¹⁶ = 43,046,721 unique locations, exceeding the practical maximum of 16.8M for the Nested Primer Molecular Memory (NPMM) scheme.
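The randomization by modular addition, and the index capacity arithmetic, can be illustrated with a short sketch. It assumes the information is held as ternary digits and that the series of random numbers is reproduced from a known seed; the seeded generator and all function names are assumptions made for illustration.

```python
import random


def keystream(seed: int, n: int, modulus: int = 3) -> list:
    """Reproducible series of random digits; a different seed per fragment."""
    rng = random.Random(seed)
    return [rng.randrange(modulus) for _ in range(n)]


def scramble(digits: list, seed: int, modulus: int = 3) -> list:
    """Add the series to the information digits, digit by digit, modulo 3."""
    ks = keystream(seed, len(digits), modulus)
    return [(d + k) % modulus for d, k in zip(digits, ks)]


def unscramble(digits: list, seed: int, modulus: int = 3) -> list:
    """Subtract the same series to recover the information digits."""
    ks = keystream(seed, len(digits), modulus)
    return [(d - k) % modulus for d, k in zip(digits, ks)]


def index_capacity(ternary_digits: int) -> int:
    """Number of distinct locations addressable with the given digit count."""
    return 3 ** ternary_digits
```

Here `index_capacity(14)` evaluates to 4,782,969 and `index_capacity(16)` to 43,046,721, matching the figures given above.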

The DNA fragment designs can be synthesized in three different runs (with DNA fragments randomly assigned to runs) to create approximately 1.2 × 10⁷ copies of each DNA fragment design. Phosphoramidite chemistry can be used, and inkjet printing in an in situ microarray synthesis platform and flow cell reactor technology can be used. Inkjet printing in an anhydrous chamber can allow very small amounts of phosphoramidite to be delivered to a confined coupling region on a 2D planar surface, resulting in the addition of hundreds of thousands of bases in parallel. Subsequent oxidation and detritylation can be carried out in a flow cell reactor. Once DNA synthesis is complete, the oligonucleotides can be cleaved from the surface and deprotected.

Adapters may then be added to the DNA fragments to enable multiple copies of the DNA fragments to be made. DNA fragments without adapters may require additional chemistry to "prime" the synthesis of multiple copies by adding additional groups at the ends of the DNA fragments. Oligonucleotides can be amplified using polymerase chain reaction (PCR) methods and paired-end PCR primers, followed by bead purification and quantitation. The oligonucleotides can then be sequenced to generate reads of 104 bases. Decoding of the digital information can then be performed by sequencing the central bases of each oligonucleotide from both ends, rapidly computing the full-length oligonucleotides, and removing sequence reads that are inconsistent with the designs. The sequence reads can be decoded using computer software that exactly reverses the encoding process. Sequence reads for which the parity ternary digit indicates an error, or which cannot be unambiguously decoded or assigned to a reconstructed computer file, can be discarded. Each position in a decoded file can be detected in a number of different sequenced DNA oligonucleotides, and simple majority voting can be used to resolve any differences caused by DNA synthesis or sequencing errors.
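Simple majority voting across overlapping reads can be sketched as follows. The representation of a read as an `(offset, decoded_text)` pair and the function name `consensus` are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def consensus(reads: list) -> str:
    """Combine (offset, decoded_text) reads by per-position majority vote."""
    votes = defaultdict(Counter)
    for offset, text in reads:
        for i, symbol in enumerate(text):
            votes[offset + i][symbol] += 1
    # Ties go to the symbol seen first; positions with no votes are skipped.
    return "".join(votes[pos].most_common(1)[0][0] for pos in sorted(votes))
```

For example, `consensus([(0, "0120"), (2, "2011"), (0, "0121"), (2, "2011")])` returns `"012011"`: at position 3, three reads report `0` and one reports `1`, so the minority value is outvoted.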

While several examples are provided in the context of machine-written DNA, it is contemplated that the principles described herein may be applied to other kinds of machine-written biological material.

As used herein, the term "machine-written DNA" is understood to include one or more polynucleotide strands that are produced by, or otherwise modified by, a machine to store data or other information. An example of a polynucleotide herein is DNA. It is noted that although the term "DNA" is used throughout this disclosure in the context of DNA being read or written, this term is used merely as a representative example of a polynucleotide and may encompass any polynucleotide. A "machine," as used herein in "machine-written," may include an instrument or system specifically designed for writing DNA, as described in more detail herein. The system may be non-biological or biological. In one example, the biological system may include or be a polymerase. For example, the polymerase may be terminal deoxynucleotidyl transferase (TdT). In biological systems, the process may additionally be controlled by machine hardware (e.g., a processor) or an algorithm. "Machine-written DNA" may include any polynucleotide having one or more base sequences written by a machine. Although machine-written DNA is used as an example herein, other polynucleotide strands may be substituted for the machine-written DNA described herein. "Machine-written DNA" may include natural bases and modifications of natural bases, including but not limited to methylated or other chemically tagged modified bases; a synthetic polymer similar to DNA, such as peptide nucleic acid (PNA); or morpholino DNA. "Machine-written DNA" may also include DNA strands or other polynucleotides formed from at least one base strand derived from nature (e.g., extracted from a naturally occurring organism) to which machine-written base strands are attached in a parallel or end-to-end manner. In other implementations, "machine-written DNA" may be written by a biological system (e.g., an enzyme) instead of, or in addition to, the non-biological systems described herein (e.g., an electrode machine). In other words, "machine-written DNA" may be written directly by a machine, or written by an enzyme (e.g., a polymerase) controlled by an algorithm and/or a machine.

"machine-written DNA" may include data that has been converted from an original form (e.g., a photograph, a text document, etc.) into a binary coding sequence using known techniques, then the binary coding sequence is converted into a DNA base sequence using known techniques, and then the DNA base sequence is machine-generated in the form of one or more DNA strands or other polynucleotides. Alternatively, "machine-written DNA" may be generated to index or otherwise track pre-existing DNA, store data or information from any other source, and used for any suitable purpose, without necessarily requiring an intermediate step of converting the raw data into binary code.

As described in more detail below, machine-written DNA may be written to and/or read from reaction sites. As used herein, the term "reaction site" is a localized region where at least one specified reaction can occur. The reaction site may comprise a support surface of the reaction structure or substrate on which a substance may be immobilized. For example, a reaction site may be a discrete region of space in which a set of discrete DNA strands or other polynucleotides are written. The reaction sites may allow for chemical reactions separate from reactions of adjacent reaction sites. An apparatus that provides machine writing of DNA may include a flow cell with a well having writing features (e.g., electrodes) and/or reading features. In some cases, the reaction site may comprise a surface of the reaction structure (which may be located in a channel of a flow cell) that already has a reaction component, e.g. a polynucleotide colony thereon. In some flow-through cells, the polynucleotides in a colony have the same sequence, e.g., a clonal copy of a single-stranded or double-stranded template. However, in some flow-through cells, the reaction sites may comprise only a single polynucleotide molecule, e.g., in single-stranded or double-stranded form.

The plurality of reaction sites may be randomly distributed along the reaction structure of the flow cell, or may be arranged in a predetermined manner (e.g., side by side in a matrix, such as in a microarray). The reaction site may also include a reaction chamber, recess, or well that at least partially defines a spatial region or volume configured to isolate a specified reaction. As used herein, the term "reaction chamber" or "reaction recess" includes a defined spatial region of a support structure (which is typically fluidly coupled with a flow channel). The reaction recess may be at least partially separated from the ambient environment or other spatial region. For example, a plurality of reaction wells may be separated from each other by a shared wall. As a more specific example, the reaction recess may be a nanowell that includes an indentation, pit, well, groove, cavity, or depression defined by an inner surface of the detection surface and has an opening or aperture (i.e., open-sided) such that the nanowell can be fluidically coupled to the flow channel.

To read machine-written DNA, one or more discrete detectable regions of the reaction site may be defined. Such a detectable region can be an imageable region, an electrically detectable region, or other type of region that has a measurable change in a property (or no change in a property) based on the type of nucleotide present during reading.

As used herein, the term "pixel" refers to a discrete imageable region. Each imageable region can include a compartment or discrete spatial region in which the polynucleotide is present. In some cases, a pixel may include two or more reaction sites (e.g., two or more reaction chambers, two or more reaction recesses, two or more wells, etc.). In some other cases, a pixel may include only one reaction site. A single image sensor may be used with the objective lens to capture several "pixels" during an imaging event. In some other implementations, each discrete photodiode or photosensor may capture a respective pixel. In some implementations, the light sensors (e.g., photodiodes) of one or more detection devices may be associated with respective reaction sites. A light sensor associated with a reaction site can detect light emissions from the associated reaction site. In some implementations, detection of light emission can be performed by at least one light guide when a specified reaction occurs at the relevant reaction site. In some implementations, multiple light sensors (e.g., several pixels of a light detection or imaging device) may be associated with a single reaction site. In some implementations, a single photosensor (e.g., a single pixel) can be associated with a single reaction site or a group of reaction sites.

As used herein, the term "synthesis" is understood to include the process of producing DNA by a machine to store data or other information. Thus, machine-written DNA may constitute synthetic DNA. As used herein, the terms "consumable cartridge," "reagent cartridge," "removable cartridge," and/or "cartridge" refer to the same cartridge and/or to a combination of components that make up a cartridge or a cartridge system. The cartridges described herein may be independent of elements having reaction sites, such as flow cells having multiple wells. In some cases, the flow cell may be removably inserted into the cartridge, which is then inserted into the instrument. In some other implementations, the flow cell may be removably inserted into the instrument without the cartridge. As used herein, the term "biochemical analysis" may include at least one of a biological analysis or a chemical analysis.

The term "based on" is understood to mean that something is at least partially determined by what it is indicated as being "based on". To indicate that something must be completely determined by something else, it is described as being based exclusively on whatever completely determines it.

The term "non-nucleotide memory" is understood to mean an object, device, or combination of devices capable of storing data or instructions in a form other than nucleotides, which data or instructions can be retrieved and/or processed by a device. Examples of "non-nucleotide memory" include solid-state memory, magnetic memory, hard disk drives, optical drives, and combinations of the foregoing (e.g., magneto-optical storage elements).

The term "DNA storage device" should be understood to refer to an object, device, or combination of devices configured to store data or instructions in the form of a polynucleotide sequence (e.g., machine-written DNA). Examples of "DNA storage devices" include flow cells having addressable wells as described herein, systems comprising a plurality of such flow cells, and tubes or other containers that store nucleotide sequences that have been cleaved from the surface on which they were synthesized. As used herein, the term "nucleotide sequence" or "polynucleotide sequence" is understood to include a polynucleotide molecule, as well as the underlying sequence of the molecule, depending on the context. The polynucleotide sequence may comprise (or encode) information indicative of certain physical characteristics.

Implementations described herein can be used to perform specified reactions for consumable cartridge preparation and/or biochemical analysis and/or machine-written DNA synthesis.

I. Overview of the System

FIG. 1 is a schematic diagram of a system 100 configured to perform biochemical analysis and/or synthesis. The system 100 can include a base instrument 102, the base instrument 102 configured to receive and detachably engage the removable cartridge 200 and/or a component having one or more reaction sites. The base instrument 102 and the removable cartridge 200 can be configured to interact to transport the biological material to different locations within the system 100 and/or to perform specified reactions involving the biological material in order to prepare the biological material for subsequent analysis (e.g., by synthesizing the biological material), and optionally, to detect one or more events of the biological material. In some implementations, the base instrument 102 can be configured to detect one or more events of biological material directly on the removable cartridge 200. The event may indicate a specified reaction with the biological material. The removable cartridge 200 may be constructed in accordance with any of the cartridges described herein.

Although reference is made below to base instrument 102 and removable cartridge 200 as shown in FIG. 1, it should be understood that base instrument 102 and removable cartridge 200 illustrate only one implementation of system 100 and that other implementations exist. For example, the base instrument 102 and the removable cartridge 200 include various components and features that collectively perform several operations for preparing and/or analyzing biological material. Furthermore, although the removable cartridge 200 described herein includes an element having a reaction site, such as a flow cell having a plurality of wells, other cartridges may be independent of the element having a reaction site, and the element having a reaction site may be inserted into the base instrument 102 separately. That is, in some cases, the flow cell may be removably inserted into the removable cartridge 200, and then the removable cartridge 200 is inserted into the base instrument 102. In some other implementations, the flow cell may be removably inserted directly into the base instrument 102 without the removable cartridge 200. In further implementations, the flow cell may be integrated into a removable cartridge 200 that is inserted into the base instrument 102.

In the illustrated implementation, each of the base instrument 102 and the removable cartridge 200 is capable of performing certain functions. However, it should be understood that the base instrument 102 and the removable cartridge 200 may perform different functions and/or may share such functions. For example, the base instrument 102 is shown to include a detection assembly 110 (e.g., an imaging device) configured to detect a specified reaction at the removable cartridge 200. In alternative implementations, the removable cartridge 200 may include a detection assembly and may be communicatively coupled to one or more components of the base instrument 102. As another example, the base instrument 102 is a "dry" instrument that does not provide, receive, or exchange fluid with the removable cartridge 200. That is, as shown, the removable cartridge 200 includes a consumable reagent portion 210 and a flow cell receiving portion 220. The consumable reagent portion 210 can contain reagents used during biochemical analysis and/or synthesis. The flow cell receiving portion 220 may include an optically transparent or other detectable region for the detection assembly 110 to detect one or more events occurring within the flow cell receiving portion 220. In alternative implementations, the base instrument 102 can provide, for example, reagents or other liquids to the removable cartridge 200 that are subsequently consumed by the removable cartridge 200 (e.g., for a specified reaction or synthetic process).

As used herein, a biological material may include one or more biological or chemical substances, such as nucleosides, nucleotides, nucleic acids, polynucleotides, oligonucleotides, proteins, enzymes, peptides, oligopeptides, polypeptides, antibodies, antigens, ligands, receptors, polysaccharides, carbohydrates, polyphosphates, nanopores, organelles, lipid layers, cells, tissues, organisms, and/or biologically active compounds, such as analogs or mimetics of the foregoing. In some cases, the biological material can include whole blood, lymph, serum, plasma, sweat, tears, saliva, sputum, cerebrospinal fluid, amniotic fluid, semen, vaginal secretions, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cyst fluid, bile, urine, gastric fluid, intestinal fluid, fecal samples, fluids containing single or multiple cells, fluids containing organelles, fluidized tissues, fluidized organisms, viruses including viral pathogens, fluids containing multicellular organisms, biological swabs, and biological washes. In some cases, the biological material may include a set of synthetic sequences, including but not limited to machine-written DNA, which may be fixed (e.g., attached in a specific well in a cartridge) or unfixed (e.g., stored in a cuvette).

In some implementations, the biological material can include added materials, such as water, deionized water, saline solution, acidic solution, basic solution, detergent solution, and/or pH buffer. The added materials may also include reagents to be used during a given assay protocol to perform a biochemical reaction. For example, the added material may include material that is subjected to multiple Polymerase Chain Reaction (PCR) cycles with the biological material. In other aspects, the added material can be a carrier for the biological material, such as a cell culture medium or other buffered and/or pH adjusted and/or isotonic carrier, which can allow or maintain the biological function of the biological material.

However, it should be understood that the biological material being analyzed may be in a different form or state than the biological material loaded into the system 100 or produced by the system 100. For example, the biological material loaded into the system 100 may include whole blood or saliva or a population of cells, which is subsequently processed (e.g., by an isolation or amplification procedure) to provide prepared nucleic acids. The prepared nucleic acids can then be analyzed (e.g., quantitatively by PCR or by SBS sequencing) by the system 100. Thus, when the term "biological material" is used while describing a first operation, such as PCR, and is used again when describing a subsequent second operation, such as sequencing, it is understood that the biological material in the second operation may be altered relative to the biological material prior to or during the first operation. For example, amplicon nucleic acids generated from template nucleic acids amplified in a previous amplification (e.g., PCR) can be sequenced (e.g., SBS). In this case, the amplicon is a copy of the template, and the amplicon is present in a higher amount compared to the amount of template.

In some implementations, the system 100 can automatically prepare a sample for biochemical analysis based on a substance (e.g., whole blood or saliva or a population of cells) provided by a user. However, in other implementations, the system 100 may analyze biological material that is partially or previously prepared for analysis by a user. For example, a user may provide a solution comprising nucleic acids that have been isolated and/or amplified from whole blood; or a viral sample may be provided in which the RNA or DNA sequences are partially or fully exposed for treatment.

As used herein, a "specified reaction" includes a change in at least one of a chemical, electrical, physical, or optical property (or characteristic) of the analyte in question. In particular implementations, the specified reaction is an associative binding event (e.g., admixing the analyte in question with a fluorescently labeled biomolecule). The specified reaction may be a dissociative binding event (e.g., the release of a fluorescently labeled biomolecule from the analyte in question). The specified reaction may be a chemical transformation, a chemical change, or a chemical interaction. The specified reaction may also be a change in electrical properties. For example, the specified reaction may be a change in ion concentration in the solution. Some reactions include, but are not limited to, chemical reactions, such as reduction, oxidation, addition, elimination, rearrangement, esterification, amidation, etherification, cyclization, or substitution; a binding interaction in which a first chemical binds to a second chemical; a dissociation reaction in which two or more chemical substances are separated from each other; fluorescence; luminescence; bioluminescence; chemiluminescence; and biological reactions, such as nucleic acid replication, nucleic acid amplification, nucleic acid hybridization, nucleic acid ligation, phosphorylation, enzymatic catalysis, receptor binding, or ligand binding. The specified reaction may also be the addition or removal of protons, for example, detectable as a change in pH of the surrounding solution or environment. An additional specified reaction may be the detection of ion flux across a membrane (e.g., a natural or synthetic bilayer membrane). For example, when ions flow through the membrane, the current is interrupted and the interruption can be detected. In-situ sensing of charged tags may also be used, as may thermal sensing and other suitable analytical sensing techniques.

In a particular implementation, the specified reaction includes incorporation of a fluorescently labeled molecule into the analyte. The analyte may be an oligonucleotide and the fluorescently labeled molecule may be a nucleotide. When excitation light is directed to the oligonucleotide with the labeled nucleotide, the fluorophore emits a detectable fluorescent signal, allowing the specified reaction to be detected. In an alternative implementation, the detected fluorescence is the result of chemiluminescence and/or bioluminescence. The specified reaction may also increase fluorescence (or Förster) resonance energy transfer (FRET), for example, by bringing the donor fluorophore close to the acceptor fluorophore; decrease FRET by separating the donor fluorophore and the acceptor fluorophore; increase fluorescence by separating the quencher from the fluorophore; or decrease fluorescence by co-locating the quencher and the fluorophore.

As used herein, "reaction components" include any substance that can be used to obtain a specified reaction. For example, reaction components include reagents, catalysts such as enzymes, reactants for the reaction, samples, reaction products, other biomolecules, salts, metal cofactors, chelators, and buffer solutions (e.g., hydrogenation buffers). The reaction components can be delivered to different locations in the fluidic network, either individually in solution or in combination in one or more mixtures. For example, the reaction components may be delivered to a reaction chamber in which the biological material is immobilized. The reaction components may interact with the biological material directly or indirectly. In some implementations, the removable cartridge 200 is pre-loaded with one or more reaction components that participate in performing a specified assay protocol. The pre-loading may occur at one location (e.g., a manufacturing facility) prior to receipt of the cartridge 200 by a user (e.g., at a customer's facility). For example, one or more reaction components or reagents may be pre-loaded into the consumable reagent portion 210. In some implementations, the removable cartridge 200 can also be pre-loaded with the flow cell in the flow cell receiving portion 220.

In some implementations, the base instrument 102 can be configured to interact with one removable cartridge 200 per session. After the session, the removable cartridge 200 may be replaced with another removable cartridge 200. In some implementations, the base instrument 102 can be configured to interact with more than one removable cartridge 200 per session. As used herein, the term "session" includes performing at least one of a sample preparation protocol and/or a biochemical analysis protocol. Sample preparation may include synthesizing biological material; and/or isolating, modifying, and/or amplifying one or more components of the biological material such that the prepared biological material is suitable for analysis. In some implementations, a session may include a continuous activity in which a plurality of controlled reactions are performed until (a) a specified number of reactions have been performed; (b) a specified number of events have been detected; (c) a specified period of system time has elapsed; (d) the signal-to-noise ratio has dropped to a specified threshold; (e) the target component has been identified; (f) a system failure or malfunction has been detected; and/or (g) one or more resources for conducting the reactions have been depleted. Optionally, a session may include pausing system activity for a period of time (e.g., minutes, hours, days, weeks) and completing the session at a later time until at least one of (a)-(g) occurs.
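The stop conditions (a) through (g) can be sketched as a check that is evaluated after each controlled reaction. This is a minimal illustration under stated assumptions, not the control logic of the system 100: the `RunState` and `RunLimits` fields, the numeric limits, and the function name `stop_reasons` are all hypothetical, since the description gives no concrete values.

```python
from dataclasses import dataclass


@dataclass
class RunState:
    """Hypothetical snapshot of an ongoing run (all fields are illustrative)."""
    reactions_done: int = 0
    events_detected: int = 0
    elapsed_s: float = 0.0
    snr: float = float("inf")
    target_identified: bool = False
    fault_detected: bool = False
    resources_depleted: bool = False


@dataclass
class RunLimits:
    """Hypothetical thresholds; the description does not specify any numbers."""
    max_reactions: int
    max_events: int
    max_elapsed_s: float
    min_snr: float


def stop_reasons(state, limits):
    """Return the letters (a)-(g) of every stop condition that is currently met."""
    checks = [
        ("a", state.reactions_done >= limits.max_reactions),
        ("b", state.events_detected >= limits.max_events),
        ("c", state.elapsed_s >= limits.max_elapsed_s),
        ("d", state.snr <= limits.min_snr),
        ("e", state.target_identified),
        ("f", state.fault_detected),
        ("g", state.resources_depleted),
    ]
    return [letter for letter, met in checks if met]


limits = RunLimits(max_reactions=100, max_events=50, max_elapsed_s=3600.0, min_snr=3.0)
running = RunState(reactions_done=10, events_detected=5, elapsed_s=120.0, snr=12.0)
finished = RunState(reactions_done=100, elapsed_s=900.0, snr=2.5)

print(stop_reasons(running, limits))   # no condition met, so the run continues
print(stop_reasons(finished, limits))  # conditions (a) and (d) are met
```

Returning every satisfied condition, rather than only the first, keeps the "and/or" wording of the list: several conditions may hold at once.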

The assay protocol may include a series of operations for performing, detecting, and/or analyzing a specified reaction. Collectively, the removable cartridge 200 and the base instrument 102 may include components for performing different operations. Operation of the assay protocol may include fluidic operation, thermal control operation, detection operation, and/or mechanical operation.

Fluidic operations include controlling the flow of fluid (e.g., liquid or gas) through the system 100, which may be actuated by the base instrument 102 and/or the removable cartridge 200. In one example, the fluid is in liquid form. For example, fluidic operations may include controlling a pump to induce a flow of biological material or reaction components into a reaction chamber.

The thermal control operation may include controlling the temperature of a designated portion of the system 100, such as the temperature of one or more portions of the removable cartridge 200. For example, the thermal control operation may include increasing or decreasing a temperature of a Polymerase Chain Reaction (PCR) region in which a fluid including biological material is stored.

The detecting operation may include controlling actuation of the detector or monitoring activity of the detector to detect a predetermined property or characteristic of the biological material. As one example, the detecting operation may include capturing an image of a specified region including biological material to detect fluorescent emissions from the specified region. The detecting operation may include controlling a light source to illuminate the biological material or controlling a detector to observe the biological material.

The mechanical operation may include controlling movement or position of a designated component. For example, the mechanical operation may include controlling a motor to move a valve control component in the base instrument 102 that operably engages a movable valve in the removable cartridge 200. In some examples, a combination of different operations may occur simultaneously. For example, the detector may capture an image of the reaction chamber when the pump controls the flow of fluid through the reaction chamber. In some examples, different operations directed to different biological materials may occur simultaneously. For example, a first biological material may undergo amplification (e.g., PCR) while a second biological material may undergo detection.

Similar or identical fluidic elements (e.g., channels, ports, reservoirs, etc.) can be labeled differently to more easily distinguish between these fluidic elements. For example, a port may be referred to as a reservoir port, a supply port, a network port, a feed port, and the like. It is to be understood that two or more fluidic elements (e.g., reservoir channels, sample channels, flow channels, bridge channels) that are labeled differently do not require that the fluidic elements be structurally different. In addition, the claims can be modified to add such labels to more easily distinguish such fluidic elements in the claims.

As used herein, a "liquid" is a substance that is relatively incompressible and has the ability to flow and conform to the shape of the container or channel holding the substance. The liquid may be water-based and include polar molecules that exhibit a surface tension that holds the liquid together. The liquid may also include non-polar molecules, for example, in oil-based or non-aqueous materials. It should be understood that references to liquids in this application may include liquids comprising combinations of two or more liquids. For example, separate reagent solutions may be subsequently combined to perform a specified reaction.

One or more implementations can include retaining a biological material (e.g., a template nucleic acid) at a specified location where the biological material is analyzed. As used herein, the term "retain," when used in reference to a biological material, includes attaching the biological material to a surface or confining the biological material within a specified space. As used herein, the term "immobilizing", when used in reference to a biological material, includes attaching the biological material to a surface in or on a solid support. Immobilizing may include attaching the biological material to the surface at a molecular level. For example, biological materials can be immobilized to a substrate surface using adsorption techniques including non-covalent interactions (e.g., electrostatic forces, van der waals forces, and dehydration of hydrophobic interfaces) and covalent binding techniques, wherein functional groups or linkers facilitate attachment of a biological sample to the surface. The immobilization of the biological material to the surface of the substrate may be based on the characteristics of the substrate surface, the liquid medium carrying the biological material, and the characteristics of the biological material itself. In some examples, the substrate surface can be functionalized (e.g., chemically or physically altered) to facilitate immobilization of the biological material to the substrate surface. The substrate surface may first be modified to have functional groups bound to the surface. The functional group can then bind to the biomaterial to immobilize the biomaterial thereon. In some cases, the biological material may be immobilized on the surface by a gel.

In some implementations, nucleic acids can be immobilized to a surface and amplified using bridge amplification. Another useful method for amplifying nucleic acids on a surface is Rolling Circle Amplification (RCA), e.g., using the methods set forth in more detail below. In some implementations, the nucleic acid can be attached to a surface and amplified using one or more primer pairs. For example, one primer may be in solution and the other primer may be immobilized on a surface (e.g., 5'-attached). For example, a nucleic acid molecule can hybridize to one of the primers on the surface, followed by extension of the immobilized primer to generate a first copy of the nucleic acid. The primer in solution is then hybridized to the first copy of the nucleic acid and can be extended using the first copy of the nucleic acid as a template. Optionally, after generating the first copy of the nucleic acid, the original nucleic acid molecule may be hybridized to a second immobilized primer on the surface and may be extended simultaneously with or after extension of the primer in solution. In any implementation, repeated rounds of extension (e.g., amplification) using the immobilized primer and the primer in solution can be used to provide multiple copies of a nucleic acid. In some implementations, the biological material can be confined within a predetermined space along with reaction components configured for use during amplification (e.g., PCR) of the biological material.

One or more implementations set forth herein may be configured to perform an assay protocol that is or includes an amplification (or PCR) protocol. During an amplification protocol, the temperature of the biological material within the reservoirs or channels can be changed in order to amplify a target sequence or biological material (e.g., DNA of the biological material). For example, the biological material may be subjected to (1) a pre-heating stage of about 95 ℃ for about 75 seconds; (2) a denaturation stage of about 95 ℃ for about 15 seconds; (3) an annealing-extension stage of about 59 ℃ for about 45 seconds; and (4) a temperature hold stage of about 72 ℃ for about 60 seconds. Implementations may perform multiple amplification cycles. It should be noted that the above-described cycle describes only one particular implementation, and that alternative implementations may include modifications to the amplification scheme.
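As a rough illustration, the four stages above can be laid out as a schedule and the total run time estimated. The sketch below assumes that the pre-heating and temperature hold stages run once and that only the denaturation and annealing-extension stages repeat per cycle; the description does not state this, and the cycle count of 30 and the names `build_schedule` and `total_seconds` are hypothetical.

```python
# Stage temperatures (deg C) and durations (s) taken from the example above.
PREHEAT = (95, 75)
DENATURE = (95, 15)
ANNEAL_EXTEND = (59, 45)
HOLD = (72, 60)


def build_schedule(cycles):
    """Return a list of (stage, temp_c, seconds) steps.

    Assumption: pre-heat and hold run once; only denaturation and
    annealing-extension are repeated for each amplification cycle.
    """
    steps = [("preheat", *PREHEAT)]
    for _ in range(cycles):
        steps.append(("denature", *DENATURE))
        steps.append(("anneal_extend", *ANNEAL_EXTEND))
    steps.append(("hold", *HOLD))
    return steps


def total_seconds(steps):
    """Sum the nominal stage durations, ignoring temperature ramp times."""
    return sum(seconds for _, _, seconds in steps)


schedule = build_schedule(cycles=30)  # 30 cycles is an illustrative value only
print(len(schedule), "steps,", total_seconds(schedule), "seconds")
```

The estimate ignores the time needed to ramp between temperatures, so an actual run would take longer than the summed stage durations.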

The methods and systems set forth herein may use an array of features having any of a variety of densities, including, for example, at least about 10 features/cm², about 100 features/cm², about 500 features/cm², about 1000 features/cm², about 5000 features/cm², about 10000 features/cm², about 50000 features/cm², about 100000 features/cm², about 1000000 features/cm², about 5000000 features/cm², or higher. The methods and apparatus set forth herein may include a detection component or device having a resolution at least sufficient to resolve individual features at one or more of these densities.
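To give a sense of the resolution these densities imply, the sketch below converts a feature density into a center-to-center spacing. It assumes the features lie on a regular square grid, which the description does not require, and the function name `pitch_um` is hypothetical.

```python
import math


def pitch_um(features_per_cm2):
    """Center-to-center spacing in micrometers for an assumed square grid.

    One feature occupies 1/density cm^2, so the grid pitch is
    1/sqrt(density) cm, which is 1e4/sqrt(density) micrometers.
    """
    if features_per_cm2 <= 0:
        raise ValueError("density must be positive")
    return 1e4 / math.sqrt(features_per_cm2)


for density in (10, 100, 10_000, 1_000_000, 5_000_000):
    print(f"{density:>9} features/cm^2 -> pitch of about {pitch_um(density):.2f} um")
```

Under this assumption, 1000000 features/cm² corresponds to a pitch of about 10 micrometers, so a detector would need to resolve features at roughly that spacing or finer.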

The base instrument 102 can include a user interface 130 configured to receive user input for performing a specified assay protocol and/or configured to communicate information about the assay to a user. The user interface 130 may be incorporated with the base instrument 102. For example, the user interface 130 can include a touch screen attached to the housing of the base instrument 102 and configured to identify a touch from a user and the location of the touch relative to information displayed on the touch screen. Alternatively, the user interface 130 may be remotely located relative to the base instrument 102.

II. Cartridge

The removable cartridge 200 is configured to detachably engage or removably couple to the base instrument 102 at the cartridge chamber 140. As used herein, the terms "detachably engaged" or "removably coupled" (or similar terms) are used to describe the relationship between the removable cartridge 200 and the base instrument 102. The terms are intended to mean that the connection between the removable cartridge 200 and the base instrument 102 is detachable without damaging the base instrument 102. Accordingly, the removable cartridge 200 may be detachably engaged to the base instrument 102 in an electrical manner such that the electrical contacts of the base instrument 102 are not damaged. The removable cartridge 200 may be detachably engaged to the base instrument 102 in a mechanical manner such that features of the base instrument 102 (e.g., the cartridge chamber 140) that hold the removable cartridge 200 are not damaged. The removable cartridge 200 may be detachably engaged to the base instrument 102 in a fluidic manner such that the ports of the base instrument 102 are not damaged. For example, the base instrument 102 is not considered "damaged" if only a simple adjustment (e.g., realignment) of a component or a simple replacement (e.g., replacement of a nozzle) is required. The components (e.g., the removable cartridge 200 and the base instrument 102) are easily separable when they can be separated from each other without undue effort or time. In some implementations, the removable cartridge 200 and the base instrument 102 can be easily separable without damaging the removable cartridge 200 or the base instrument 102.

In some implementations, the removable cartridge 200 may be permanently altered or partially damaged during the session with the base instrument 102. For example, a container containing a liquid may include a foil lid that is pierced to allow the liquid to flow through the system 100. In such an implementation, the foil lid may be damaged, so that the damaged container will be replaced by another container. In particular implementations, the removable cartridge 200 is a disposable cartridge such that the removable cartridge 200 can be replaced and optionally disposed of after a single use. Similarly, the flow cell of the removable cartridge 200 may be individually disposable such that the flow cell may be replaced and optionally disposed of after a single use.

In other implementations, the removable cartridge 200 can be used in more than one session when engaged with the base instrument 102, and/or can be removed from the base instrument 102, reloaded with reagents, and re-engaged to the base instrument 102 for additional designated reactions. Thus, in some cases, the removable cartridge 200 may be refurbished such that the same removable cartridge 200 may be used with different consumables (e.g., reactive components and biological materials). After the cartridge 200 is removed from the base instrument 102 at the customer facility, refurbishment may occur at the manufacturing facility.

The cartridge chamber 140 may include slots, mounts, connector interfaces, and/or any other features to receive the removable cartridge 200 or a portion thereof to interact with the base instrument 102.

The removable cartridge 200 can include a fluidic network that can hold and direct a fluid (e.g., a liquid or a gas) therethrough. A fluidic network can include a plurality of interconnected fluidic elements that can store fluid and/or allow fluid to flow therethrough. Non-limiting examples of fluidic elements include channels, ports of channels, chambers, storage devices, reservoirs of storage devices, reaction chambers, waste reservoirs, detection chambers, multi-purpose chambers for reactions and detection, and the like. For example, consumable reagent portion 210 can include one or more reagent wells or reagent chambers that store reagents, and can be part of or coupled to a fluidic network. The fluidic elements may be fluidically coupled to one another in a prescribed manner such that the system 100 is capable of performing sample preparation and/or analysis.

As used herein, the term "fluidly coupled" (or similar terms) refers to two regions of space that are connected together such that a liquid or gas may be directed between the two regions of space. In some examples, the fluid coupling allows fluid to be directed back and forth between the two spatial regions. In other examples, the fluid coupling is unidirectional such that there is only one direction of flow between the two spatial regions. For example, the assay reservoir may be fluidly coupled with the channel such that liquid may be transported from the assay reservoir into the channel. However, in some implementations, it may not be possible to direct fluid in the channel back into the assay reservoir. In particular implementations, the fluidic network can be configured to receive biological material and direct the biological material through sample preparation and/or sample analysis. The fluidic network can direct the biological material and other reaction components to a waste reservoir.
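The distinction between bidirectional and unidirectional fluid coupling can be modeled as a directed graph in which each coupling contributes one or two edges. The sketch below is only an illustration: the element names, the particular couplings, and the functions `build_network` and `can_flow` are hypothetical and do not describe the actual fluidic network of the removable cartridge 200.

```python
from collections import deque


def build_network(couplings):
    """Build a directed graph from (source, target, bidirectional) couplings."""
    graph = {}
    for source, target, bidirectional in couplings:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())
        if bidirectional:
            graph[target].add(source)
    return graph


def can_flow(graph, source, target):
    """Breadth-first search: can fluid be directed from source to target?"""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical couplings chosen to mirror the example in the text.
network = build_network([
    ("assay_reservoir", "channel", False),          # one-way: reservoir -> channel
    ("channel", "reaction_chamber", True),          # fluid can move back and forth
    ("reaction_chamber", "waste_reservoir", False),
])

print(can_flow(network, "assay_reservoir", "waste_reservoir"))
print(can_flow(network, "channel", "assay_reservoir"))
```

In this model the assay reservoir can feed the channel, but nothing in the channel can be directed back into the assay reservoir, matching the unidirectional case described above.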

Fig. 2 depicts an implementation of a consumable cartridge 300. The consumable cartridge can be part of a combined removable cartridge, such as the consumable reagent portion 210 of the removable cartridge 200 of fig. 1; or may be a separate kit. The consumable cartridge 300 can include a housing 302 and a top 304. The housing 302 may comprise a non-conductive polymer or other material and is formed to create one or more reagent chambers 310, 320, 330. The size of the reagent chambers 310, 320, 330 may be varied to accommodate different volumes of reagent stored therein. For example, the first chamber 310 may be larger than the second chamber 320, and the second chamber 320 may be larger than the third chamber 330. The first chamber 310 is sized to accommodate a larger volume of a particular reagent, such as a buffer reagent. The second chamber 320 may be sized to contain a smaller volume of reagent than the first chamber 310, such as a reagent chamber containing a lysis reagent. The third chamber 330 may be sized to contain a smaller volume of reagent than the first chamber 310 and the second chamber 320, such as a reagent chamber containing a fully functional nucleotide-containing reagent.

In the illustrated implementation, the housing 302 has a plurality of housing walls or sides 350 in which the chambers 310, 320, 330 are formed. In the illustrated implementation, the housing 302 forms an at least substantially unitary or monolithic structure. In alternative implementations, the housing 302 may be made up of one or more subcomponents that are combined to form the housing 302, such as independently formed compartments for the chambers 310, 320, and 330.

Once reagents are provided into the respective chambers 310, 320, 330, the housing 302 may be sealed by the top 304. The top 304 may comprise a conductive or non-conductive material. For example, the top 304 may be an aluminum foil seal that is adhered to the top surface of the housing 302 to seal the reagents within their respective chambers 310, 320, 330. In other implementations, the top 304 may be a plastic seal that is adhered to the top surface of the housing 302 to seal the reagents within their respective chambers 310, 320, 330.

In some implementations, the housing 302 can also include an identifier 390. The identifier 390 may be a Radio Frequency Identification (RFID) transponder, a bar code, an identification chip, and/or other identifier. In some implementations, the identifier 390 may be embedded in the housing 302 or attached to an external surface. The identifier 390 may include data for a unique identifier of the consumable cartridge 300 and/or data for the type of consumable cartridge 300. As described herein, the data of the identifier 390 may be read by the base instrument 102 or a separate device configured to heat the consumable cartridge 300.

In some implementations, the consumable cartridge 300 can include other components, such as valves, pumps, fluid lines, ports, and the like. In some implementations, the consumable cartridge 300 can be contained within another housing.

System controller

The base instrument 102 can also include a system controller 120, the system controller 120 configured to control operation of at least one of the removable cartridge 200 and/or the detection assembly 110. The system controller 120 may be implemented using any combination of dedicated hardware circuits, boards, DSPs, processors, etc. Alternatively, the system controller 120 may be implemented using an off-the-shelf PC having a single processor or multiple processors, with functional operations distributed among the processors. Alternatively, the system controller 120 may be implemented using a hybrid configuration in which some of the module functions are performed using dedicated hardware, while the remaining module functions are performed using an off-the-shelf PC or the like.

The system controller 120 may include a plurality of circuit modules configured to control the operation of certain components of the base instrument 102 and/or the removable cartridge 200. The term "module" herein may refer to a hardware device configured to perform a particular task. For example, the circuit module may include a flow control module configured to control fluid flow through the fluidic network of the removable cartridge 200. The flow control module may be operably connected to the valve actuator and/or the system pump. The flow control module may selectively activate the valve actuator and/or the system pump to direct fluid flow through one or more pathways and/or to prevent fluid flow through one or more pathways.

The system controller 120 may also include a thermal control module. The thermal control module may control a thermal cycler or other thermal component to provide and/or remove thermal energy from the sample preparation region of the removable cartridge 200 and/or any other region of the removable cartridge 200. In one particular example, the thermal cycler can increase and/or decrease the temperature experienced by the biological material according to a PCR protocol.

The system controller 120 may also include a detection module configured to control the detection assembly 110 to obtain data about the biological material. If the detection assembly 110 is part of the removable cartridge 200, the detection module may control the operation of the detection assembly 110 through a direct wired connection or through a contact array. The detection module may control the detection assembly 110 to acquire data at a predetermined time or over a predetermined period of time. For example, when the biological material has a fluorophore attached thereto, the detection module can control the detection assembly 110 to capture an image of the reaction chamber of the flow cell receiving portion 220 of the removable cartridge 200. In some implementations, multiple images may be obtained.

Optionally, the system controller 120 may include an analysis module configured to analyze the data to provide at least a partial result to a user of the system 100. For example, the analysis module can analyze imaging data provided by the detection assembly 110. The analysis may include identifying a nucleic acid sequence of the biological material.

The system controller 120 and/or circuit modules described above may include one or more logic-based devices including one or more microcontrollers, processors, Reduced Instruction Set Computers (RISC), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), logic circuits, and any other circuit capable of executing the functions described herein. In one implementation, the system controller 120 and/or circuit modules execute a set of instructions stored in a computer or machine readable medium to perform one or more assay protocols and/or other operations. The set of instructions may be stored in the form of an information source or physical memory element within the base instrument 102 and/or the removable cartridge 200. The protocol performed by the system 100 may be used to perform, for example, machine-writing or otherwise synthesizing DNA (e.g., converting binary data into a DNA sequence and then synthesizing a DNA strand or other polynucleotide representing the binary data), quantitative analysis of DNA or RNA, protein analysis, DNA sequencing (e.g., sequencing-by-synthesis (SBS)), sample preparation, and/or preparation of a library of fragments for sequencing.
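The conversion of binary data into a DNA sequence mentioned above can be illustrated with a simple two-bits-per-base mapping. This mapping (00 to A, 01 to C, 10 to G, 11 to T) is an assumption made for the sketch; the description does not specify an encoding, and a practical scheme would likely add error correction and sequence constraints. The function names `encode` and `decode` are hypothetical.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(seq: str) -> bytes:
    """Invert encode(); raises ValueError on bad length or unknown bases."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


sequence = encode(b"Hi")
print(sequence)          # CAGACGGC
print(decode(sequence))  # b'Hi'
```

Comparing `decode(encode(data))` with the original data gives a basic round-trip check on the mapping itself; it says nothing about errors introduced during synthesis or sequencing.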

The set of instructions may include various commands that instruct the system 100 to perform specific operations such as the methods and processes of the various implementations described herein. The set of instructions may be in the form of a software program. As used herein, the terms "software" and "firmware" are interchangeable, and include any computer program stored in memory, including RAM memory, ROM memory, EPROM memory, EEPROM memory, and non-volatile RAM (NVRAM) memory, for execution by a computer. The above memory types are exemplary only, and are thus not limiting as to the types of memory usable for storage of a computer program.

The software may be in various forms, such as system software or application software. Further, the software may be in the form of a collection of separate programs or program modules within a larger program or portion of a program module. The software may also include modular programming in the form of object-oriented programming. After obtaining the test data, the test data may be automatically processed by the system 100, processed in response to user input, or processed in response to a request made by another processing machine (e.g., a remote request over a communication link).

System controller 120 may be connected to other components or subsystems of system 100 via communication links that may be hardwired or wireless. The system controller 120 may also be communicatively connected to an off-board system or server. The system controller 120 may receive user inputs or commands from the user interface 130. The user interface 130 may include a keyboard, mouse, touch screen panel, and/or voice recognition system, among others.

The system controller 120 is used to provide processing capabilities, such as storing, interpreting, and/or executing software instructions, and to control the overall operation of the system 100. The system controller 120 may be configured and programmed to control the data and/or power aspects of the various components. Although the system controller 120 is represented in fig. 1 as a single structure, it should be understood that the system controller 120 may include multiple separate components (e.g., processors) distributed throughout the system 100 at different locations. In some implementations, one or more components may be integrated with base instrument 102, and one or more components may be remotely located relative to base instrument 102.

IV. Flow cell

Figs. 3-4 depict examples of flow cells 400 that may be used in the system 100. The flow cell 400 of this example comprises a body 402 defining a plurality of elongate flow channels 410, the flow channels 410 being recessed below the upper surface 404 of the body 402. The flow channels 410 are generally parallel to each other and extend substantially along the entire length of the body 402. Although five flow channels 410 are shown, the flow cell 400 may include any other suitable number of flow channels 410, including more or fewer than five flow channels 410. The flow cell 400 of this example also includes a set of inlet ports 420 and a set of outlet ports 422, each port 420, 422 being associated with a respective flow channel 410. Thus, each inlet port 420 may be used to deliver a fluid (e.g., a reagent, etc.) to a respective flow channel 410; and each outlet port 422 may be used to convey fluid from a respective flow channel 410.

In some forms, the flow cell 400 is integrated directly into the flow cell receiving portion 220 of the removable cartridge 200. In some other forms, the flow cell 400 is removably connected with the flow cell receiving portion 220 of the removable cartridge 200. In versions where the flow cell 400 is directly integrated into the flow cell receiving portion 220 or removably coupled with the flow cell receiving portion 220, the flow channel 410 of the flow cell 400 may receive fluid from the consumable reagent portion 210 via the inlet port 420, and the inlet port 420 may be fluidly coupled with a reagent stored in the consumable reagent portion 210. Of course, the flow channel 410 may be coupled with various other fluid sources or reservoirs, etc. via the ports 420, 422. As another illustrative variation, some forms of the consumable cartridge 300 may be configured to removably receive or otherwise integrate the flow cell 400. In this form, the flow channel 410 of the flow cell 400 may receive fluid from the reagent chambers 310, 320, 330 through the inlet port 420. Other suitable ways in which the flow cell 400 may be incorporated into the system 100 will be apparent to those skilled in the art in view of the teachings herein.

Fig. 4 shows the flow channel 410 of the flow cell 400 in more detail. As shown, the flow channel 410 includes a plurality of wells 430 formed in a base surface 412 of the flow channel 410. As will be described in more detail below, each well 430 is configured to contain a DNA strand or other polynucleotide, such as a machine-written polynucleotide. In some forms, each well 430 has a cylindrical configuration with a generally circular cross-sectional profile. In some other forms, each well 430 has a polygonal (e.g., hexagonal, octagonal, etc.) cross-sectional profile. Alternatively, well 430 may have any other suitable configuration. It should also be understood that the wells 430 may be arranged in any suitable pattern, including but not limited to a grid pattern.

Fig. 5 shows a part of a channel within a flow cell 500, which flow cell 500 is an example of a variant of the flow cell 400. In other words, the channel depicted in fig. 5 is a variation of the flow channel 410 of the flow cell 400. The flow cell 500 is operable to read a polynucleotide strand 550 immobilized to a bottom surface 534 of a well 530 in the flow cell 500. For example only, the bottom surface 534 to which the polynucleotide strand 550 is immobilized may comprise a co-block polymer that is end-capped with an azide group. By way of further example only, such polymers may include a poly (N- (5-azidoacetamidopentyl) acrylamide-co-acrylamide) (PAZAM) coating provided in accordance with at least some of the teachings of U.S. patent No. 9,012,022 entitled "Polymer Coatings," issued 4/21/2015, the entire contents of which are incorporated herein by reference. Such polymers may be incorporated into any of the various flow cells described herein.

In this example, the wells 530 are separated by interstitial spaces 514 provided by the floor 512 of the flow cell 500. Each well 530 has sidewalls 532 and a bottom surface 534. The flow cell 500 in this example is operable to provide an image sensor 540 below each well 530. In some forms, each well 530 has at least one corresponding image sensor 540, the image sensors 540 being fixed in position relative to the wells 530. Each image sensor 540 may comprise a CMOS image sensor, a CCD image sensor, or any other suitable type of image sensor. For example only, each well 530 may have one associated image sensor 540 or a plurality of associated image sensors 540. As another variation, a single image sensor 540 may be associated with two or more wells 530. In some other forms, one or more image sensors 540 are movable relative to the wells 530, such that a single image sensor 540 or a single group of image sensors 540 can be moved from one well 530 to another. As a further variation, the flow cell 500 may be movable relative to a single image sensor 540 or a single set of image sensors 540, which may be at least substantially fixed in place.

Each image sensor 540 may be directly incorporated into the flow cell 500. Alternatively, each image sensor 540 may be incorporated directly into a cartridge, such as the removable cartridge 200, with the flow cell 500 integrated into or otherwise coupled to the cartridge. As yet another illustrative variation, each image sensor 540 may be incorporated directly into the base instrument 102 (e.g., as part of the detection assembly 110 described above). Wherever the image sensor 540 is located, the image sensor 540 may be integrated into a printed circuit that includes other components (e.g., control circuitry, etc.). In versions where one or more image sensors 540 are not directly incorporated into the flow cell 500, the flow cell 500 may include optical transmission features (e.g., windows, etc.) that allow the one or more image sensors 540 to capture fluorescence emitted by one or more fluorophores associated with polynucleotide strands 550, the polynucleotide strands 550 being immobilized to a bottom surface 534 of wells 530 in the flow cell 500, as described in more detail below. It should also be understood that various optical elements (e.g., lenses, optical waveguides, etc.) may be positioned between the bottom surface 534 of the well 530 and the corresponding image sensor 540.

As also shown in fig. 5, the light source 560 is operable to project light 562 into the well 530. In some forms, each well 530 has at least one corresponding light source 560, the light sources 560 being fixed in place relative to the wells 530. For example only, each well 530 may have one associated light source 560 or a plurality of associated light sources 560. As another variation, a single light source 560 may be associated with two or more wells 530. In some other forms, one or more light sources 560 are moved relative to the well 530 such that a single light source 560 or a single group of light sources 560 can be moved relative to the well 530. As yet another variation, the flow cell 500 may be movable relative to a single light source 560 or a single set of light sources 560, which may be substantially fixed in place. For example only, each light source 560 may include one or more lasers. In another example, the light source 560 may include one or more diodes.

Each light source 560 may be directly incorporated into the flow cell 500. Alternatively, each light source 560 may be incorporated directly into a cartridge, such as the removable cartridge 200, with the flow cell 500 integrated into or otherwise coupled to the cartridge. As yet another illustrative variation, each light source 560 may be incorporated directly into base instrument 102 (e.g., as part of detection assembly 110 described above). In versions where the one or more light sources 560 are not directly incorporated into the flow cell 500, the flow cell 500 may include optical transmission features (e.g., windows, etc.) that allow the wells 530 to receive light emitted by the one or more light sources 560, thereby enabling the light to reach the polynucleotide strands 550 immobilized to the bottom surface 534 of the wells 530. It should also be understood that various optical elements (e.g., lenses, optical waveguides, etc.) may be positioned between the wells 530 and the respective light sources 560.

As described elsewhere herein and as shown in block 590 of fig. 6, the DNA reading process may begin with a sequencing reaction being performed in the target well 530 (e.g., according to at least some of the teachings of U.S. patent No. 9,453,258 entitled "Methods and Compositions for Nucleic Acid Sequencing," issued 9/27/2016, the entire contents of which are incorporated herein by reference). Next, as shown in block 592 of fig. 6, the light source 560 is activated on the target well 530, thereby illuminating the target well 530. Projected light 562 can cause a fluorophore associated with polynucleotide strand 550 to fluoresce. Accordingly, as shown in block 594 of fig. 6, the respective image sensor 540 may detect fluorescence emitted from one or more fluorophores associated with the polynucleotide strand 550. The system controller 120 of the base instrument 102 may drive the light source 560 to emit light. The system controller 120 of the base instrument 102 may also process image data obtained from the image sensor 540 that represents the fluorescence emission profile from the polynucleotide strands 550 in the wells 530. Using the image data from the image sensor 540, and as shown in block 596 of fig. 6, the system controller 120 can determine the base sequence in each polynucleotide strand 550. By way of example only, the methods and apparatus may be used to map a genome or otherwise determine biological information associated with a naturally occurring organism from which a DNA strand or other polynucleotide is obtained or on which it is otherwise based. Alternatively, the data stored in the machine-written DNA may be obtained using the processes and equipment described above, as will be described in more detail below.

By way of further example only, when performing the above-described procedure shown in fig. 6, a spatiotemporal sequencing reaction may utilize one or more chemical and imaging events or steps to distinguish between multiple analytes (e.g., four nucleotides) incorporated into a growing nucleic acid strand during the sequencing reaction; alternatively, less than four different colors may be detected in a mixture with four different nucleotides while still resulting in the determination of four different nucleotides (e.g., in a sequencing reaction). A pair of nucleotide types can be detected at the same wavelength, but distinguished based on a difference in intensity of one member of the pair compared to the other, or based on an alteration (e.g., via a chemical, photochemical, or physical modification) of one member of the pair that results in the appearance or disappearance of an apparent signal compared to that detected by the other member of the pair.
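By way of illustration only, the following sketch shows one way that fewer than four detected colors could still distinguish four nucleotides, here using the presence or absence of signal in two channels. The channel-to-nucleotide table, the threshold value, and the function name `call_sequence` are assumptions made for this example and are not taken from the process described above.

```python
# Hypothetical two-channel call table: the presence or absence of signal in
# each of two detection channels yields four distinguishable states.
# The particular assignment below is an assumption for illustration.
CALLS = {
    (True, True): "A",
    (True, False): "C",
    (False, True): "T",
    (False, False): "G",
}


def call_sequence(cycles, threshold=0.5):
    """Convert per-cycle (channel_1, channel_2) intensities into bases.

    `threshold` is an assumed cutoff separating signal from no signal.
    """
    return "".join(
        CALLS[(ch1 > threshold, ch2 > threshold)] for ch1, ch2 in cycles
    )


if __name__ == "__main__":
    intensities = [(0.9, 0.8), (0.9, 0.1), (0.1, 0.9), (0.0, 0.1)]
    print(call_sequence(intensities))
```

An intensity-based variant, in which two nucleotide types share one wavelength and differ in brightness, could be sketched similarly by comparing each intensity against more than one threshold.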

V. Machine-written biomaterial

In some implementations, a system 100, such as the system 100 shown in fig. 1, can be configured to synthesize biological material (e.g., polynucleotides, such as DNA) to encode data that can then be retrieved by performing assays such as those described above. In some implementations, this type of encoding can be performed by assigning values to nucleotide bases (e.g., binary values, such as 0 or 1, ternary values, such as 0, 1, or 2, etc.), converting the data to be encoded into a string of the relevant values (e.g., converting a text message into a binary string using an ASCII encoding scheme), and then creating one or more polynucleotides composed of nucleotides having bases in a sequence corresponding to the string obtained by converting the data.
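By way of illustration only, the encoding described above may be sketched as follows. The assignment of 0 to adenine and 1 to cytosine, and the function names, are assumptions made for this example; the description above does not fix which base carries which value.

```python
# Hypothetical one-bit-per-base assignment (an assumption for illustration).
BIT_TO_BASE = {"0": "A", "1": "C"}
BASE_TO_BIT = {base: bit for bit, base in BIT_TO_BASE.items()}


def text_to_bits(text):
    """Convert an ASCII text message into a binary string, 8 bits per character."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def bits_to_sequence(bits):
    """Map each binary value to its assigned base."""
    return "".join(BIT_TO_BASE[bit] for bit in bits)


def sequence_to_text(sequence):
    """Invert the encoding: bases to bits, bits to ASCII characters."""
    bits = "".join(BASE_TO_BIT[base] for base in sequence)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


if __name__ == "__main__":
    sequence = bits_to_sequence(text_to_bits("Hi"))
    print(sequence)                    # base sequence to be synthesized
    print(sequence_to_text(sequence))  # message recovered after sequencing
```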

In some implementations, such polynucleotide generation may be performed using the form of a flow cell 400 having a well array 630 configured as shown in fig. 7. Fig. 7 shows a portion of a channel within a flow cell 600, which is an example of a variation of the flow cell 400. In other words, the channel depicted in fig. 7 is a variation of the flow channel 410 of the flow cell 400. In this example, each well 630 is recessed below the base surface 612 of the flow cell 600. Thus, wells 630 are separated from each other by interstitial spaces 614. For example only, the wells 630 may be arranged in a grid or any other suitable pattern along the base surface 612 of the flow cell 600. Each well 630 of this example includes sidewalls 632 and a floor 634. Each well 630 of this example also includes a respective electrode assembly 640 located on a bottom surface 634 of the well 630. In some forms, each electrode assembly 640 includes only a single electrode element. In some other forms, each electrode assembly 640 includes a plurality of electrode elements or segments. The terms "electrode" and "electrode assembly" should be understood herein as being interchangeable.

The base instrument 102 is operable to independently activate the electrode assemblies 640 such that one or more electrode assemblies 640 may be in an activated state while one or more other electrode assemblies 640 are not in an activated state. In some forms, the electrode assemblies 640 are controlled using CMOS devices or other devices. Such CMOS devices may be integrated directly into the flow cell 600, may be integrated into a cartridge (e.g., cartridge 200) containing the flow cell 600, or may be integrated directly into the base instrument 102. As shown in fig. 7, each electrode assembly 640 extends along the entire width of the bottom surface 634, terminating at a sidewall 632 of the corresponding well 630. In other forms, each electrode assembly 640 may extend along only a portion of the bottom surface 634. For example, some forms of electrode assembly 640 may terminate inward of the sidewall 632. Although each electrode assembly 640 is schematically depicted in fig. 7 as a single element, it should be understood that each electrode assembly 640 may actually be formed from a plurality of discrete electrodes, rather than just one single electrode.

As shown in fig. 7, a particular polynucleotide strand 650 may be produced in a single well 630 by activating the electrode assembly 640 of the associated well 630 to electrochemically generate an acid, which may deprotect the end groups of the polynucleotide strand 650 in the well 630. By way of example only, the polynucleotide strand 650 may be chemically attached to the surface at the bottom of the well 630 using a linker having a chemistry such as a silane chemistry at one end and a DNA synthesis compatible chemistry (e.g., an enzyme-bound short oligonucleotide) at the other end.

To facilitate reagent exchange (e.g., transport of the deblocking agent), in this example, each electrode assembly 640 and the bottom surface 634 of each well 630 can include at least one opening 660. The opening 660 may be fluidly coupled with a flow channel 662 extending below the well 630 and below the bottom surface 634. To provide such an opening 660 through the electrode assembly 640, the electrode assembly 640 may be annular, may be placed in quadrants, may be placed on the perimeter or sidewall 632 of the well 630, or may be placed or shaped in other suitable ways to avoid interfering with reagent exchange and/or passage of light (e.g., as may be used in sequencing processes involving fluorescence emission detection). In other implementations, the reagent may be provided into the flow channel of the flow cell 600 without the opening 660. It should be understood that the opening 660 may be optional and may be omitted in some forms. Similarly, the flow passage 662 may be optional and may be omitted in some forms.

Fig. 9 shows an example of the form that electrode assembly 640 may take. In this example, the electrode assembly 640 includes four discrete electrode segments 642, 644, 646, 648 which together define an annular shape. The electrode segments 642, 644, 646, 648 are thus configured as discrete but adjacent quadrants of a ring. Each electrode segment 642, 644, 646, 648 can be configured to provide a predetermined charge uniquely associated with a particular nucleotide. For example, electrode segment 642 may be configured to provide a charge uniquely associated with adenine; electrode segment 644 can be configured to provide a charge uniquely associated with cytosine; electrode segment 646 can be configured to provide a charge uniquely associated with guanine; and electrode segment 648 can be configured to provide a charge uniquely associated with thymine. When a mixture of these four nucleotides flows through the flow channel above well 630, activation of electrode segments 642, 644, 646, 648 can cause the corresponding nucleotides from that flow channel to adhere to strand 650. Thus, when electrode segment 642 is activated, it can effect the writing of adenine to strand 650; when electrode segment 644 is activated, it can effect the writing of cytosine to strand 650; when electrode segment 646 is activated, it can effect the writing of guanine to strand 650; and when electrode segment 648 is activated, it can effect the writing of thymine to strand 650. Such writing may be provided by the activated electrode segments 642, 644, 646, 648 hybridizing an inhibitor of the enzyme to the pixels associated with the activated electrode segments 642, 644, 646, 648. Although the electrode segments 642, 644, 646, 648 are shown in fig. 9 as forming a ring shape, it should be understood that the electrode segments 642, 644, 646, 648 may form any other suitable shape.

In other implementations, a single electrode may be used for the electrode assembly 640, and the charge may be adjusted to incorporate the various nucleotides to be written to a DNA strand or other polynucleotide.
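By way of illustration only, the association between electrode segments and nucleotides in the four-segment example may be expressed as a lookup from the base to be written to the segment to be activated. The dictionary and function names below are hypothetical; the segment numerals follow the example of fig. 9.

```python
# Segment numerals per the fig. 9 example: 642 adenine, 644 cytosine,
# 646 guanine, 648 thymine. The names used here are hypothetical.
SEGMENT_FOR_BASE = {"A": 642, "C": 644, "G": 646, "T": 648}


def activation_plan(sequence):
    """Return, in write order, the electrode segment to activate for each base."""
    return [SEGMENT_FOR_BASE[base] for base in sequence.upper()]


if __name__ == "__main__":
    print(activation_plan("GATC"))
```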

As another example, the electrode assembly 640 may be activated to provide a localized (e.g., located within the well 630 in which the electrode assembly 640 is disposed) electrochemically generated pH change; and/or to electrochemically generate moieties (e.g., reducing or oxidizing agents) locally to remove blocks from nucleotides. As another variation, different nucleotides may have different blocks; and these blocks may be photocleaved based on the wavelength of light transmitted to the wells 630 (e.g., light 562 projected from the light source 560). As another variation, different nucleotides may have different blocks; and these blocks can be cleaved based on certain other conditions. For example, one of the four blocks may be removed based on a combination of reducing conditions plus a high local pH or a low local pH; another of the four blocks may be removed based on a combination of oxidizing conditions plus a high local pH or a low local pH; another of the four blocks may be removed based on a combination of light and a high local pH; and another of the four blocks may be removed based on a combination of light and a low local pH. Thus, four nucleotides can be incorporated simultaneously, but selective deblocking occurs in response to four different sets of conditions.

The electrode assembly 640 further defines an opening 660 in the center of the arrangement of electrode segments 642, 644, 646, 648. As described above, this opening 660 may provide a path for fluid communication between the flow channel 662 and the well 630, allowing reagents or the like flowing through the flow channel 662 to reach the well 630. Also as described above, some variations may omit the flow channels 662 and provide communication of reagents or the like to the wells 630 in some other manner (e.g., by passive diffusion or the like). Regardless of whether the fluid is communicated through the opening 660, the opening 660 may provide an optical transmission path through the bottom of the well 630 during a read cycle, as described herein. In some forms, the opening 660 may be optional and thus may be omitted. In versions where opening 660 is omitted, fluid may be communicated to well 630 via one or more flow channels located above well 630 or otherwise positioned relative to well 630. Furthermore, during a read cycle, the opening 660 may not be needed to provide an optical transmission path through the bottom of the well 630. For example, as described below with respect to flow cell 601, electrode assembly 640 may include an optically transparent material (e.g., an optically Transparent Conductive Film (TCF), etc.), and flow cell 600 itself may include an optically transparent material (e.g., glass), such that electrode assembly 640 and the material forming flow cell 600 may allow fluorescence emitted from one or more fluorophores associated with machine-written polynucleotide strand 650 to reach image sensor 540 under well 630.

Fig. 8 shows an example of a process that may be used in a flow cell 600 for machine-writing of polynucleotides or other nucleotide sequences. At the start of the process, nucleotides may flow into the flow cell 600 over the wells 630, as shown in the first block 690 of fig. 8. As shown in the next block 692 of fig. 8, the electrode assembly 640 may then be activated to write the first nucleotide to the primer at the bottom of the target well 630. As shown in the next block 694 of fig. 8, the terminator may then be cleaved off from the just-written first nucleotide in the target well 630. Various suitable ways in which a terminator may be cleaved from a first nucleotide will be apparent to those skilled in the art in view of the teachings herein. Once the terminator is cleaved from the first nucleotide, as shown in the next block 696 of fig. 8, the electrode assembly 640 may be activated to write a second nucleotide to the first nucleotide. Although not shown in fig. 8, the terminator can be cleaved from the second nucleotide, a third nucleotide can be written to the second nucleotide, and so on, until the desired nucleotide sequence has been written.
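By way of illustration only, the write cycle of fig. 8 may be sketched as a generator of ordered steps for a desired sequence. The step wording and the function name `write_steps` are assumptions made for this example; in particular, the sketch assumes that a terminator is cleaved only between successive writes, since the description above does not state what happens after the final nucleotide.

```python
def write_steps(sequence):
    """List the ordered steps for writing `sequence` into one target well.

    Assumption: the terminator is cleaved between writes only, not after
    the last nucleotide has been written.
    """
    steps = ["flow nucleotides into the flow cell (block 690)"]
    for position, base in enumerate(sequence):
        if position > 0:
            steps.append("cleave terminator from previously written nucleotide")
        steps.append(f"activate electrode assembly to write {base}")
    return steps


if __name__ == "__main__":
    for step in write_steps("ACG"):
        print(step)
```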

In some implementations, encoding of data by synthesis of biological material, such as DNA, may be performed in other ways. For example, in some implementations, the flow cell 600 may be completely devoid of the electrode assembly 640. For example, the deblocking reagent may be selectively communicated from the flow channel 662 to the well 630 through an opening 660. This may eliminate the need for the electrode assembly 640 to selectively activate nucleotides. As another example, an array of wells 630 may be exposed to a solution containing all nucleotide bases that may be used to encode data, and then individual nucleotides may be selectively activated for individual wells 630 by using light from a Spatial Light Modulator (SLM). As another example, in some implementations, a single base may be assigned a combined value (e.g., adenine may be used to encode dyad 00, guanine may be used to encode dyad 01, cytosine may be used to encode dyad 10, and thymine may be used to encode dyad 11) to increase the storage density of the resulting polynucleotide. Other examples are possible and will be apparent to those skilled in the art in light of this disclosure. Thus, the above description of synthesizing biomaterials such as DNA to encode data should be understood as being illustrative only; and should not be considered limiting.
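By way of illustration only, the two-bit-per-base assignment given in the example above (adenine 00, guanine 01, cytosine 10, thymine 11) may be sketched as follows. The function names are hypothetical. Under this assignment each byte occupies four bases, rather than the eight bases it would occupy at one bit per base.

```python
# Two-bit assignment from the example above: A=00, G=01, C=10, T=11.
PAIR_TO_BASE = {"00": "A", "01": "G", "10": "C", "11": "T"}
BASE_TO_PAIR = {base: pair for pair, base in PAIR_TO_BASE.items()}


def bytes_to_sequence(data):
    """Encode bytes as bases, two bits per base (four bases per byte)."""
    bits = "".join(format(byte, "08b") for byte in data)
    return "".join(PAIR_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def sequence_to_bytes(sequence):
    """Decode a base sequence produced by bytes_to_sequence."""
    bits = "".join(BASE_TO_PAIR[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    encoded = bytes_to_sequence(b"\x1b")  # 0x1b is 00 01 10 11 in binary
    print(encoded)
    print(sequence_to_bytes(encoded))
```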

VI. Reading machine-written biological materials

After the polynucleotide strands 650 in one or more wells 630 of the flow cell 600 have been machine written, the polynucleotide strands 650 may then be read to extract any data or other information stored in the machine-written polynucleotide strands 650. Such a read process may be performed using an arrangement such as that shown in fig. 5 and described above. In other words, one or more light sources 560 may be used to illuminate one or more fluorophores associated with the machine-written polynucleotide strand 650; and one or more image sensors 540 may be used to detect fluorescence emitted by the illuminated one or more fluorophores associated with the machine-written polynucleotide strand 650. The fluorescence profile of light emitted by the illuminated one or more fluorophores associated with the machine-written polynucleotide strand 650 can be processed to determine the base sequence in the machine-written polynucleotide strand 650. This determined base sequence in the machine-written polynucleotide strand 650 can be processed to determine data or other information stored in the machine-written polynucleotide strand 650.

In some forms, the machine-written polynucleotide strands 650 remain in the flow cell 600 comprising the wells 630 during storage. When it is desired to read the machine-written polynucleotide strand 650, the flow cell 600 may allow reading the machine-written polynucleotide strand 650 directly from the flow cell. By way of example only, the flow cell 600 containing the wells 630 may be housed in a cartridge (e.g., cartridge 200) or base instrument 102 containing the light source 560 and/or image sensor 540 such that the machine-written polynucleotide strands 650 are read directly from the wells 630.

As another illustrative example, a flow cell containing well 630 may be directly coupled to one or both of light source 560 or image sensor 540. Fig. 10 shows an example of a flow cell 601 comprising a well 630 with an electrode assembly 640, one or more image sensors 540 and a control circuit 670. Similar to the flow cell 500 depicted in fig. 5, the flow cell 601 of this example is operable to receive light 562 projected from a light source 560. The projected light 562 can cause one or more fluorophores associated with the machine-written polynucleotide strand 650 to fluoresce; and the corresponding image sensor 540 may capture fluorescence emitted from one or more fluorophores associated with the machine-written polynucleotide strand 650.

As described above in the case of the flow cell 500, each well 630 of the flow cell 601 may include its own image sensor 540 and/or its own light source 560; or these components may be otherwise configured and arranged as described above. In this example, fluorescent light emitted from one or more fluorophores associated with the machine-written polynucleotide strand 650 can reach the image sensor 540 via the opening 660. In addition, or alternatively, electrode assembly 640 can include an optically transparent material (e.g., optically Transparent Conductive Film (TCF), etc.), and flow cell 601 itself can include an optically transparent material (e.g., glass), such that electrode assembly 640 and the material forming flow cell 601 can allow fluorescence emitted from one or more fluorophores associated with machine-written polynucleotide strand 650 to reach image sensor 540. In addition, various optical elements (e.g., lenses, optical waveguides, etc.) may be placed between wells 630 and the corresponding image sensors 540 to ensure that each image sensor 540 only receives fluorescence emitted from one or more fluorophores associated with the machine-written polynucleotide strand 650 of the desired well 630.

In this example, the control circuitry 670 is integrated directly into the flow cell 601. For example only, the control circuitry 670 may include a CMOS chip and/or other printed circuit configurations/components. The control circuitry 670 may be in communication with the image sensor 540, the electrode assembly 640, and/or the light source 560. In this case, "communicate" means that the control circuitry 670 is in electrical communication with the image sensor 540, the electrode assembly 640, and/or the light source 560. For example, the control circuit 670 is operable to receive and process signals from the image sensor 540 that are representative of an image picked up by the image sensor 540. "communicating" herein may also include the control circuitry 670 providing electrical energy to the image sensor 540, the electrode assembly 640, and/or the light source 560.

In some forms, each image sensor 540 has a corresponding control circuit 670. In some other forms, the control circuitry 670 is coupled to several, if not all, of the image sensors in the flow cell 601. Various suitable components and configurations that may be used to achieve this will be apparent to those skilled in the art in view of the teachings herein. It should also be understood that the control circuitry 670 may be integrated in whole or in part into the cartridge (e.g., the removable cartridge 200) and/or the base instrument 102 in addition to or instead of being integrated into the flow cell 601.

As yet another illustrative example, a machine-written polynucleotide chain 650 can be transferred from the well 630 after synthesis, whether using a write-only flow cell like the flow cell 600 of FIG. 7 or a read-write flow cell like the flow cell 601 of FIG. 10. This may occur shortly after synthesis is complete, just before the machine-written polynucleotide strand 650 is read, or at any other suitable time. In such a format, the machine-written polynucleotide strand 650 can be transferred to a read-only flow cell, such as flow cell 500 shown in fig. 5; and then read in the read-only flow cell 500. Alternatively, any other suitable device or process may be used.

In some implementations, reading data encoded by biomaterial synthesis can be achieved by determining the wells 630 that store the synthesized strands 650 of interest, and then sequencing those strands 650 using techniques such as those previously described (e.g., sequencing-by-synthesis). In some implementations, to facilitate reading data stored in a nucleotide sequence, an index may be updated, when the data is stored, with information identifying the wells 630 in which the strands 650 encoding the data were synthesized. For example, when an implementation of the system 100 configured to synthesize strands 650 each capable of storing up to 256 bits of data is used to store a one megabit (1,048,576 bits) file, the system controller 120 may perform the following steps: 1) dividing the file into 4,096 256-bit segments; 2) identifying a sequence of 4,096 wells 630 of the flow cell 600, 601 that are not currently being used to store data; 3) writing the 4,096 segments into the 4,096 wells 630; and 4) updating the index to indicate that the sequence starting from the first identified well 630 and ending at the last identified well 630 is being used to store the file. Subsequently, when a request is made to read the file, the index can be used to identify the wells 630 containing the relevant strands 650, the strands 650 from those wells 630 can be sequenced, the sequences can be combined and converted to an appropriate encoding format (e.g., binary), and the combined and converted data can then be returned as a response to the read request.
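By way of illustration only, the four steps above and the subsequent read may be sketched with an in-memory model in which a dictionary stands in for the wells and for the strands synthesized in them. The class `WellStore`, its methods, and the helper functions are hypothetical names introduced for this example; synthesis and sequencing are not modeled.

```python
SEGMENT_BITS = 256
SEGMENT_BYTES = SEGMENT_BITS // 8


def split_into_segments(data):
    """Step 1: divide the file into 256-bit (32-byte) segments."""
    return [data[i:i + SEGMENT_BYTES] for i in range(0, len(data), SEGMENT_BYTES)]


def find_free_run(used, total_wells, count):
    """Step 2: find the first run of `count` consecutive unused wells."""
    run_start, run_length = 0, 0
    for well in range(total_wells):
        if well in used:
            run_start, run_length = well + 1, 0
        else:
            run_length += 1
            if run_length == count:
                return run_start
    raise RuntimeError("no sufficiently long run of unused wells")


class WellStore:
    """Hypothetical stand-in for a flow cell plus its index."""

    def __init__(self, total_wells):
        self.total_wells = total_wells
        self.wells = {}   # well number -> segment (stands in for a strand)
        self.index = {}   # file name -> (first well, last well)

    def write_file(self, name, data):
        segments = split_into_segments(data)
        if not segments:
            raise ValueError("nothing to write")
        start = find_free_run(self.wells.keys(), self.total_wells, len(segments))
        for offset, segment in enumerate(segments):   # step 3
            self.wells[start + offset] = segment
        self.index[name] = (start, start + len(segments) - 1)   # step 4
        return self.index[name]

    def read_file(self, name):
        first, last = self.index[name]
        return b"".join(self.wells[well] for well in range(first, last + 1))


if __name__ == "__main__":
    store = WellStore(total_wells=10_000)
    one_megabit = bytes(1_048_576 // 8)
    print(store.write_file("example", one_megabit))
```

In this sketch a one megabit file occupies wells 0 through 4,095 of an empty store, and a second file begins at the next unused well.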

In some implementations, data previously encoded by biomaterial synthesis may be read in other ways. For example, in some implementations, if a file corresponding to 4,096 wells 630 is to be written, rather than identifying 4,096 sequential wells 630 to write to, the controller may identify any 4,096 available wells 630 and then, if those wells 630 do not form a contiguous sequence, update the index with multiple locations corresponding to the file. As another example, in some implementations, rather than identifying individual wells 630, the system controller 120 may group the wells 630 together (e.g., into groups of 128 wells 630), thereby reducing the overhead associated with storing location data (i.e., by reducing addressing requirements from one address per well 630 to one address per group of wells 630). As another example, in implementations where data reflecting the location of wells 630 in which DNA strands or other polynucleotides have been synthesized is stored, the data may be stored in various ways, such as sequential identifiers (e.g., well 1, well 2, well 3, etc.) or coordinates (e.g., X and Y coordinates of the location of the wells in the array).
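The addressing alternatives described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that segments are written in ascending well order and that wells are numbered row by row across an array with a known number of columns.

```python
def to_ranges(wells):
    """Collapse well numbers into inclusive (first, last) index entries."""
    ranges = []
    for well in sorted(set(wells)):
        if ranges and well == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], well)   # extend the current run
        else:
            ranges.append((well, well))          # start a new run
    return ranges


def to_groups(wells, group_size=128):
    """Return the group numbers covering the wells (one address per group)."""
    return sorted({well // group_size for well in wells})


def well_to_xy(well, columns):
    """Convert a sequential well identifier to (x, y) array coordinates."""
    return (well % columns, well // columns)
```

For example, a file occupying wells 0 through 4,095 needs 4,096 per-well addresses but only 32 group addresses when wells are grouped in sets of 128.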

As another example, in some implementations, rather than reading a strand 650 from the well 630 in which the strand 650 was synthesized, the strand 650 may be read from another location. For example, the strand 650 may be synthesized to include an address, then cleaved from the well 630 and stored in a tube for later retrieval, during which the included address information may be used to identify the strand 650 corresponding to a particular file. As another illustrative example, a polymerase may be used to copy the strand 650 off the surface, and the copy may then be eluted and stored in a tube. Alternatively, the strand 650 can be copied onto beads by using biotinylated oligonucleotides hybridized to the DNA strands or other polynucleotides and capturing the extension products on streptavidin beads dispensed in the wells 630. Other examples are possible and will be apparent to those skilled in the art in light of this disclosure. Accordingly, the above description of retrieving data encoded by synthesized biomaterials should be understood as being illustrative only, and should not be considered limiting.

Implementations described herein may utilize a polymer coating for the flow cell surface, such as that described in U.S. Patent No. 9,012,022, entitled "Polymer Coatings," issued April 21, 2015, which is incorporated herein by reference in its entirety. Implementations described herein may utilize one or more labeled nucleotides having a detectable label and a cleavable linker, such as those described in U.S. Patent No. 7,414,116, entitled "Labelled Nucleotide Strands," issued August 19, 2008, which is incorporated herein by reference in its entirety. For example, implementations described herein may utilize a cleavable linker that can be cleaved by contact with a water-soluble phosphine or a water-soluble transition metal-containing catalyst, with a fluorophore as the detectable label. Implementations described herein may use a dual-channel detection method to detect nucleotides of a polynucleotide, such as the method described in U.S. Patent No. 9,453,258, entitled "Methods and Compositions for Nucleic Acid Sequencing," issued September 27, 2016, which is incorporated herein by reference in its entirety. For example, implementations described herein can utilize a fluorescence-based SBS method that has a first nucleotide type detected in a first channel (e.g., dATP has a label detected in the first channel when excited by a first excitation wavelength), a second nucleotide type detected in a second channel (e.g., dCTP has a label detected in the second channel when excited by a second excitation wavelength), a third nucleotide type detected in both the first and second channels (e.g., dTTP has at least one label detected in both channels when excited by the first and/or second excitation wavelengths), and a fourth nucleotide type that lacks a label and is not detected, or is only minimally detected, in either channel (e.g., unlabeled dGTP). Implementations of the cartridges and/or flow cells described herein may be constructed in accordance with one or more of the teachings described in: U.S. Patent No. 8,906,320, entitled "Biosensors for Biological or Chemical Analysis and Systems and Methods for Same," issued December 9, 2014, which is incorporated herein by reference in its entirety; U.S. Patent No. 9,512,422, entitled "Gel Patterned Surfaces," issued December 6, 2016, which is incorporated herein by reference in its entirety; U.S. Patent No. 10,254,225, entitled "Biosensors for Biological or Chemical Analysis and Methods of Manufacturing the Same," issued April 9, 2019, which is incorporated herein by reference in its entirety; and/or U.S. Publication No. 2018/0117587, entitled "Cartridge Assembly," published May 3, 2018, which is incorporated herein by reference in its entirety.
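The dual-channel detection scheme described above (dATP in the first channel, dCTP in the second channel, dTTP in both, unlabeled dGTP in neither) can be sketched as a simple decision rule. The normalized intensity inputs and the 0.5 threshold are assumptions made for illustration and are not taken from the referenced patents.

```python
def call_base(channel_1, channel_2, threshold=0.5):
    """Call one nucleotide from two normalized channel intensities.

    The threshold is a hypothetical value chosen for this sketch.
    """
    on_1 = channel_1 >= threshold
    on_2 = channel_2 >= threshold
    if on_1 and on_2:
        return "T"   # detected in both channels
    if on_1:
        return "A"   # detected in the first channel only
    if on_2:
        return "C"   # detected in the second channel only
    return "G"       # unlabeled: not detected in either channel


def call_read(cycles):
    """Call a sequence from a list of (channel_1, channel_2) pairs."""
    return "".join(call_base(c1, c2) for c1, c2 in cycles)
```

For instance, the intensity pairs (0.9, 0.1), (0.1, 0.1), (0.2, 0.8), (0.9, 0.9) would be called as the sequence AGCT under these assumptions.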

Systems and methods for reading and writing to controlled areas

One challenge associated with storage devices is allowing simultaneous or near-simultaneous reading and writing of data, because the sequencing and synthesis operations of some flow cells may require conditioning and preparation of the flow cell (e.g., thermal conditioning to a suitable temperature, chemical conditioning with a suitable reagent, etc.) for either writing data or reading data at a given time. For such conventional flow cells and systems, switching between "write mode" and "read mode" may require stopping all operations for a period of time while the wells are brought to a certain temperature, previously used reagents are flushed away, and new reagents or other inputs are received. With such a system, it may not be possible to synthesize or write data to a first well of the flow cell while also sequencing or reading data from a second well of the flow cell, as this may create conflicts in the reagents or other conditioning inputs provided to the flow cell.

Many modern data storage systems are expected to allow users of those systems, as well as other systems and devices that communicate with them, to simultaneously read data from and write data to a volume. Thus, the inability to simultaneously read and write data to a volume may be inconvenient to a user. For example, a user may wish to exchange information between volumes in a storage device rack so that one device may be removed and placed in storage (e.g., copying a first file from volume A to volume B while also copying a second file from volume C to volume A); without simultaneous reading and writing, the user may be unable to remove a volume because multiple queued actions cannot be performed simultaneously. This may also be a technical problem for systems and devices that communicate with a volume, as software applications may be programmed to constantly write data to a database or file system stored on the volume while also periodically reading data from the same database or file system. In situations where such actions cannot be performed simultaneously, such software applications may encounter various unexpected behaviors and errors, such as reduced hardware performance due to local memory and cache being overwhelmed by queued operations, race conditions, or reduced software performance due to the absence of required input at the time of operation.

To address these issues, DNA storage devices and related systems and devices operable to read, write, or read and write data may implement one or more features, such as selective activation of wells, simultaneous read and write caching, and multi-volume management using simultaneous read and write operations of the DNA storage system. Although the examples described herein refer to a "DNA storage system," it should be understood that this is merely one example of polynucleotide storage. The teachings herein can be readily applied to storage systems that utilize polynucleotides that are not necessarily in the form of DNA. Thus, the present invention is not limited to the use of DNA as the only polynucleotide for storage as described herein. Furthermore, polynucleotides are only one example of biological material that can be used for storage as described herein.

A. Exemplary DNA storage system

As described herein, a system operable to read digital data encoded as DNA, or encode and write digital data to DNA, or both, may be referred to as a system for DNA storage, or a DNA storage system. It should be understood that such a system may include various components and devices that may be assembled as a single piece of equipment (e.g., may be assembled and communicatively coupled within a housing), or may be separate equipment that may be connected, arranged, or both to provide the described features.

FIG. 21 shows a schematic diagram of an example of a DNA storage system 1300. The DNA storage system 1300 includes a set of instruments 1301 and a storage device 1320. The set of instruments 1301 may correspond to the base instrument 102 described above. The set of instruments 1301 may be assembled within a single apparatus or may be one or more separate apparatuses arranged, connected, or both, to provide the described functionality. The set of instruments 1301 includes a memory controller 1302, which memory controller 1302 may be one or more processors and memory configured to store and execute instructions to operate the set of instruments 1301. The set of instruments 1301 also includes a sequencing device 1304, a synthesis device 1306, a fluidics device 1308, and an electrical interface 1310.

In some implementations, memory controller 1302, sequencing device 1304, synthesis device 1306, fluidics device 1308, and electrical interface 1310 may be separate devices with one or more fluidics, electrical, or mechanical interfaces therebetween. In other implementations, the memory controller 1302, sequencing device 1304, synthesis device 1306, fluidics device 1308, and electrical interface 1310 may be integrated into a single device, with each of the sequencing device 1304, synthesis device 1306, fluidics device 1308, and electrical interface 1310 forming a subcomponent thereof.

The storage device 1320 may be permanently or removably coupled with the set of instruments 1301, and includes a flow cell 1322 having a plurality of wells. The storage device 1320 also includes a sequencing interface 1324, a synthesis interface 1326, a fluidics interface 1328, and a set of module electronics 1330. In some implementations, the flow cell 1322, sequencing interface 1324, synthesis interface 1326, fluidics interface 1328, and set of module electronics 1330 can be stand-alone devices with one or more fluidics, electrical, or mechanical interfaces therebetween. In other implementations, the flow cell 1322, sequencing interface 1324, synthesis interface 1326, fluidics interface 1328, and set of module electronics 1330 may be integrated into a single device, with each forming a subcomponent thereof.

The sequencing device 1304 is operable to read data encoded and stored as DNA in one or more wells of the storage device 1320, and may include devices such as imaging devices, optical sensors, illumination devices (e.g., LEDs, illuminators), and other devices that can be used to detect characteristics of DNA stored in wells (e.g., the processes and devices described above with respect to SBS, where fluorescent labels or tags associated with individual nucleotides can be detected by optical sensors). The sequencing device 1304 interacts with a flow cell 1322 via a sequencing interface 1324. The sequencing interface 1324 may be a glass cover or other interface surface configured to allow the sequencing device 1304 to interact with the flow channel 410. In examples where the sequencing device 1304 includes an optical sensor and light source that can be used to detect labeled nucleotides, the sequencing interface 1324 can be an optically transparent glass cover that covers the flow channel 410 and prevents leakage of fluids transported therein while transmitting light in each direction. In some implementations, the sequencing interface 1324 can include one or more waveguides to selectively illuminate one or more portions of the flow cell 1322.

The synthesis device 1306 may be used to synthesize DNA having a particular nucleotide arrangement in one or more wells of the flow cell 1322 of the storage device 1320. In other implementations, the synthesis device 1306 may synthesize DNA on a particular surface of the flow cell 1322 that has no wells. The synthesis device 1306 includes a reservoir of individual nucleotides or other biological material and an input delivery device operable to deliver the input biological material to one or more wells of the flow cell 1322. In some implementations, this may include a set of electrodes located near the wells that are operable to attract a particular nucleotide to a particular well while the input delivery device provides the nucleotide carrier liquid or nucleotide writing reagent to the flow channel 410 through the inlet port 420. In some implementations, this may include a nucleotide injector head, which may be positioned near the desired well and may release one or more nucleotides in a desired order. The synthesis interface 1326 is configured to allow the synthesis device 1306 to interact with one or more wells and will therefore vary depending on the particular synthesis device 1306. In some implementations, the synthesis interface 1326 may be a conductive layer or coupling that receives electrical signals from the electrodes and conducts them to a region near the well. In some implementations, the synthesis interface 1326 can include some or all of the fluidics interface 1328, such as where the synthesis device 1306 provides a nucleotide carrier liquid during synthesis. In some implementations, the synthesis interface 1326 can be a porous membrane that allows nucleotides to enter the flow channel when injected by the nucleotide injector head at a desired location. In some implementations, the synthesis interface 1326 may be formed of a flexible material, may include a plurality of small valves, or may include other features configured to self-seal after the nucleotide injector head provides a nucleotide.

The fluidics device 1308 can include any device or feature described herein relating to fluidics, and can include fluidic networks, pumps, valves, and other components operable to provide a desired type of fluid at a desired volume and pressure to one or more flow channels 410 or to a particular location on one or more flow channels 410. In some implementations, the fluidics device 1308 can include electrowetting features that are operable to precisely direct a desired volume of fluid to a desired location, rather than flooding the flow channel 410 with fluid. The fluidics interface 1328 will vary based on the particular implementation of the fluidics device 1308, but may include a fluidic network, inlet ports 420, outlet ports 422, and other components within the storage device 1320.

The fluid provided by the fluidics device 1308 can include fluidic reagents generated and used in the various processes performed with the sequencing device 1304 and the synthesis device 1306, and can also include non-functional fluids, such as distilled water for flushing and cleaning one or more components of the DNA storage system 1300. The reagents used by the sequencing device 1304 may be different from the reagents used by the synthesis device 1306, and each device may itself use one or more different reagents in different parts of synthesis and sequencing. As used herein, any of the various reagents that can be provided during a sequencing operation can be collectively referred to as nucleotide reading reagents, while any of the various reagents that can be provided during a synthesis operation can be collectively referred to as nucleotide writing reagents.

The electrical interface 1310 may include a wired, electrically conductive connection, or may include a wireless transceiver device (e.g., RFID, NFC, Bluetooth, light emitter, inductive charging device) capable of exchanging power, data, or both with the module electronics 1330 of the storage device 1320. This may include providing power to and exchanging data with electronic memory of the storage device 1320, providing power to and exchanging data with one or more sensors of the storage device 1320, and enabling other electronic or data-driven capabilities of the module electronics 1330 (if present).

The DNA storage system 1300 may also include a module receiver 1321 that includes one or more features to couple and statically position the storage device 1320 with respect to the set of instruments 1301 during use, in implementations where the storage device 1320 is a removable cartridge. In other implementations, the storage device 1320 may include a flow cell 1322 that interfaces with the components of the set of instruments 1301. The module receiver 1321 can include a slot in which the storage device 1320 can be seated, as well as guide features (e.g., rails) and locking features to position the storage device 1320 with a high degree of precision and stability, so that one or more devices of the set of instruments 1301 can be repeatedly and automatically positioned to interact with the corresponding interface.

It is understood that DNA storage system 1300 is an example and that many variations are possible and will be apparent to those skilled in the art in light of this disclosure. By way of example, a set of instruments 1301 and a storage device 1320 may have fewer components or more components than shown. As another example, some implementations of storage device 1320 may include components of the set of instruments 1301, such as where the multiple electrodes of sequencing device 1304 are integrated on or within flow cell 1322 itself. In this case, the portion of the sequencing device 1304 that is paired with the set of instruments 1301 can include a network of electrically conductive switches that allow electrical signals to be generated and transmitted to the desired electrodes within the flowcell 1322.

B. Storage device with selectively activated wells

In some DNA sequencing systems, there is an "all or nothing" approach to sequencing DNA in multiple wells. For example, a particular channel of a flow cell (e.g., flow channel 410) may have thousands of individual wells, each well containing at least one DNA strand. To sequence a single strand of DNA in the channel and read the encoded data using a process such as sequencing-by-synthesis, the entire channel may be flooded with chemical reagents to prepare the stored DNA for pairing with optically tagged nucleotides, illuminated with a light source to make the optical tags visible, and imaged with an imaging device or optical sensor to capture the tags. Thus, although the encoded data may only need to come from a single well, the DNA stored in thousands of other wells may be affected by this process, which may lead to degradation of or damage to the stored DNA over time and over multiple read operations.

FIGS. 11A-11C depict top views of wells that can be implemented with flow cells to address these and other concerns. FIG. 11A shows a single well 700, which may be located with a plurality of other wells on a surface 702 of a flow channel or other flow cell structure having a plurality of wells. The well 700 comprises a sidewall 704, which defines an opening in the surface 702 and descends into the structure of the flow cell. The well 700 also includes a ring electrode 706 located at the bottom of the well 700, the ring electrode 706 having features similar to those described in the context of the electrode assembly 640 shown in FIGS. 9 and 10. In particular, the ring electrode 706 is operable in response to electrical signals to generate electrical characteristics, such as current and voltage, within the fluid at and near the well 700. Due to its shape, the ring electrode 706 also leaves room for a port space 708 at the bottom of the well 700, which may be an optical port, a fluid port, or both. Where the port space 708 includes an optical port, the bottom of the well 700 may be constructed of glass or another optically transparent material to allow illumination, imaging, or both from below the well (e.g., through the closed side of the well). Where the port space 708 includes a fluid port, the opening may pass through the bottom of the well to the closed side of the well (e.g., the side opposite the open side of the well, which is referred to as "closed" even when a fluid port is provided). Although the ring electrode 706 is shown in FIG. 11A as a single monolithic electrode, it should be understood that the ring electrode 706 may in fact be formed from a plurality of individually addressable electrode segments (e.g., like the segments 642, 644, 646, 648 of the electrode assembly 640).

The fluid ports may be used to allow for flushing and reagent exchange, as shown and described with respect to FIG. 9, and may be provided as an alternative or supplement to the fluid path flowing over the surface 702. The fluid ports may be connected to a fluidic network to allow fluid to be provided to flush used reagent from the flow channel (e.g., expelled through the open sides of the wells 700), or to provide suction to pull used reagent through the bottom of the wells, either of which may be performed in conjunction with similar operations along the surface 702 (e.g., flushing fluid may be provided along the surface 702 while the fluid ports are used to aspirate flushing fluid and any remaining reagent through the bottom of each well). Implementations having both a fluid port and an optical port in the port space 708 can combine these two functions by, for example, offsetting the fluid port to the edge of the port space 708 and using the remaining area of the port space 708 for unobstructed imaging and illumination of the well.

For multiple wells, such as well 700, each having a ring electrode 706, a single well may be selectively "activated" or "deactivated" during a sequencing or synthesis operation, or each well on surface 702 may otherwise be individually affected, even where reagents and other fluids are provided across the entire surface 702. As an example, the ring electrode 706 may be operated to generate a current or voltage in one or more wells that is either attractive or repulsive to one or more nucleotides, enzymes, sequencing primers, polymerases, or other substances suspended in the fluid at the surface 702, in order to increase the chance of a desired substance being pulled from the fluid at the surface 702 into the fluid within the well 700 itself (e.g., due to an attractive electrical characteristic) or to decrease the chance of an undesired substance flowing from the surface 702 into the well 700 (e.g., due to a repulsive electrical characteristic).

As another example, as already described, a single electrode such as the ring electrode 706 may be activated on a per-well basis in conjunction with flooding the flow cell with a voltage-sensitive functionalizing fluid, in order to locally affect the pH of the fluid at surface 702 and within well 700 to help attract or repel suspended nucleotides, or to selectively activate nucleotides for binding by finely controlling the voltage generated by the electrode and thereby the pH of the voltage-sensitive functionalizing fluid.

As described above, selectively controllable electrodes for each individual well allow DNA to be sequenced and data to be read from a selected set of wells activated for sequencing while preventing sequencing from wells deactivated for sequencing. The described functionality may be similarly applied to DNA synthesis within a well, as a fluid comprising all types of nucleotides may be provided to the entire surface 702, and electrodes at each well may be activated to introduce the next nucleotide required for synthesis into the well 700. As an example, when the digital data stored in a particular well has been encoded into a DNA format that describes an ordered sequence of nucleotides such as "AGCT" (e.g., which format can be readily converted from or to binary as described above), the electrodes at that well may be operated by signals from sequencing device 1304, memory controller 1302, or both, to produce an ordered sequence of currents, voltages, or other electrical characteristics to sequentially attract A, G, C, then T nucleotides into well 700.
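The relationship between encoded data, an ordered nucleotide sequence such as "AGCT", and an ordered sequence of per-well electrode signals can be sketched as follows. The two-bits-per-nucleotide mapping and the structure of each schedule step are hypothetical choices made for illustration; the disclosure does not prescribe a particular encoding or signal format.

```python
# Assumed mapping of two bits to one nucleotide (illustration only).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data):
    """Convert bytes to a nucleotide sequence, two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence):
    """Convert a nucleotide sequence back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def electrode_schedule(sequence):
    """Build one hypothetical signal step per nucleotide to be added.

    Each step names the nucleotide to attract into the well and the
    nucleotides to repel during that step.
    """
    return [
        {"step": i, "attract": base, "repel": sorted(set("ACGT") - {base})}
        for i, base in enumerate(sequence)
    ]
```

Under this assumed mapping, the single byte 0x27 (binary 00100111) encodes to AGCT, so the schedule attracts A, then G, then C, then T into the well.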

By being able to sequence or synthesize on a per-well basis, an index or other addressing information that provides the spatial location of the wells affected by a particular read or write operation can be used to activate only those wells. Such addressing information may also be used to save reagents, hardware usage, and processor time that would otherwise be wasted on undesired sequencing or synthesis. Reagents containing suspended materials, such as nucleotides and enzymes, can also be saved by incorporating charged labels to control the localization of the materials and by stripping the enzymes from the wells when not needed. Passive optical features, including optical waveguides and polarizing materials, may also be included in the wells to prevent or at least reduce cross-talk between wells and to confine illumination (whether for activating nucleotides and other substances or for generating fluorescence) and optical imaging to the desired wells. As noted above, by using electrodes on a per-well basis, the movement of nucleotides and other substances into and out of the wells can be encouraged or inhibited so as to control and produce a particular desired movement.

It should also be noted that the flexibility of the electrodes allows a single electrode associated with a well to facilitate the desired synthesis and writing of encoded data to the well in a "write mode," and also to facilitate the desired sequencing and reading of encoded data from the same well in a "read mode." This flexibility, in turn, may allow simultaneous reading from and writing to different wells on surface 702. For example, a particular well may be activated for writing to extract certain nucleotides and other substances from a multi-substance fluid for synthesis, while a different well may be activated for reading to extract certain nucleotides and other substances from the same multi-substance fluid for sequencing, with each well inhibiting the entry of unwanted substances. Such functionality may be implemented as described above, and may also be implemented with one or more of the other features disclosed herein to facilitate simultaneous reading and writing of data to separate wells on surface 702.
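As a minimal sketch of how a controller might plan simultaneous reading and writing on one surface, the following assigns each well a mode for a single operation cycle. The Mode enumeration and the function plan_activation are hypothetical names; the rule that one well cannot be read and written in the same cycle is an assumption of this sketch.

```python
from enum import Enum


class Mode(Enum):
    OFF = "off"      # well inhibits uptake of substances
    READ = "read"    # well is activated for sequencing
    WRITE = "write"  # well is activated for synthesis


def plan_activation(total_wells, read_wells, write_wells):
    """Return a per-well list of modes for one cycle.

    Raises ValueError if a well is requested for both reading and writing.
    """
    read_wells, write_wells = set(read_wells), set(write_wells)
    conflict = read_wells & write_wells
    if conflict:
        raise ValueError(f"wells requested for both modes: {sorted(conflict)}")
    plan = [Mode.OFF] * total_wells
    for well in read_wells:
        plan[well] = Mode.READ
    for well in write_wells:
        plan[well] = Mode.WRITE
    return plan
```

Wells not named in either request remain in the OFF mode, which corresponds to a well that repels substances from the shared fluid.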

FIGS. 11B and 11C each show an alternative well implementation that provides the features described above with respect to well 700. In FIG. 11B, the well 701 on the surface 702 does not include the ring electrode 706, which leaves more room for the port space 708 at the bottom of the well 701. As described above, the port space 708 may be an optical port that allows illumination and imaging from the underside of the well 701. A fluid port 709 is also shown, offset to the edge of the port space 708 as described above, to allow a larger area for unobstructed imaging through the optically transparent port space 708. An electrode 712 is mounted inside the well 701 on the sidewall 704, where it can be operated to produce the desired electrical conditions on a per-well basis while also maximizing the size of the port space 708. Although only one electrode 712 is shown in FIG. 11B, the well 701 may include more than one electrode 712. For example, some variations of the well 701 may include four separate electrodes 712 at respective locations along the sidewall 704, each electrode 712 being associated with a respective nucleotide base.

In FIG. 11C, the well 703 includes an electrode 714 mounted on the surface 702 at the perimeter of the well 703, just above the sidewall 704. The well 703 does not have a fluid port 709, although one can be implemented with the well 703 if desired. The port space 708 of the well 703 is relatively large and completely unobstructed, which may allow for maximum optical access from below the well 703 (e.g., from the closed side of the well 703). As with the previous examples, the electrode 714 may be operated to produce the desired electrical conditions in the vicinity of the well 703 on a per-well basis.

Each of the above-described electrodes of FIGS. 11A-11C may be mounted to a structure of the flow cell, with electrical connections embedded within the structure itself and routed to the underside of the structure (e.g., on the closed side of the well, opposite surface 702). These electrical connections may be coupled with an integrated circuit (IC) or a complementary metal-oxide-semiconductor (CMOS) device configured to route and control electrical signals to the desired electrodes, such as the control circuitry 670 of FIG. 10. When used with the storage device 1320, such control circuitry may be integrated into the storage device 1320 and coupled to the bottom surface of the flow cell 1322, or may be part of the set of instruments 1301. In the latter case, when the storage device 1320 is coupled with the set of instruments 1301, the control circuitry is precisely positioned against the bottom surface of the flow cell 1322 such that each surface electrical lead of the control circuitry pairs with a corresponding contact on the underside of the flow cell 1322, and each contact on the underside of the flow cell 1322 is connected to an electrode leading to a single well.

FIGS. 12A and 12B illustrate aspects of an alternative system for selectively promoting and inhibiting the uptake of substances from the surrounding fluid on a per-well basis. FIG. 12A shows a portion of a DNA storage system, such as the DNA storage system 1300, operable to selectively read, write, or read and write data to one or more wells of a flow cell 1322 on a per-well basis. In this figure, a spatial light modulator (SLM) system 800 is shown that includes a spatial light projector (SLP) 806 (e.g., a light source paired with a digital micro-mirror device, a set of LEDs paired with an optically addressed set of liquid crystals, or another spatial light modulation device) and an SLM controller 808. The SLM system 800 is shown in relation to a flow cell 802, which may be similar to the flow cell 1322 and other examples of flow cells described herein. Although the flow cell 802 is shown as including a set of wells 804 comprising nine wells in a three-by-three grid, it should be understood that this is to aid in visualization of the system, and that the SLM system 800 may operate with flow cells having thousands of individual wells 804.

The SLM system 800 may be operated by one or more devices of a set of instruments 1301, such as a sequencing device 1304, a synthesis device 1306, or a memory controller 1302, to produce a pattern of light on the flowcell 802. The projected light pattern includes spatially encoded control signals that may be configured to interact with the fluid near the well 804 or within the well 804 and substances carried in the fluid so as to promote or inhibit movement of substances to and from the well 804, as described above with respect to the electrodes of fig. 11A-11C.

The SLM controller 808 is coupled to the SLP 806 and provides control signals to the SLP 806 based on instructions received from a device such as the memory controller 1302. Such instructions may be generated in response to a request to read data from the storage device 1320, write data to the storage device 1320, or both. As an example, instructions provided to the SLM controller 808 from a device such as the memory controller 1302 may include an identification of a first well 804 to be activated to write data (e.g., DNA synthesis within the first well 804), an identification of a second well 804 to be activated to read data (e.g., DNA sequencing within the second well 804), or both. The SLM controller 808 may convert these instructions into control signals configured to cause the SLP 806 to project a spatial pattern of light onto the flow cell 802 corresponding to the identified wells 804 and the desired read or write operation of each well 804.

The spatially encoded light pattern provided by the SLP 806 may specify, on a per-pixel basis (where multiple pixels may correspond to each well 804 of the flow cell 802), whether the pixel is illuminated, as well as variations in lighting characteristics such as color, wavelength, frequency, amplitude, or brightness. Light projected into a single well 804 may cause nucleotides, enzymes, or other substances in that well 804 to be modified in some manner (e.g., cleaved, degraded, destroyed, attracted to light, repelled by light), or may cause a local change in the pH or other characteristic of the fluid to facilitate or inhibit the transport of substances to that well 804. In this manner, the SLM system 800 provides functionality similar to that described above in the context of the electrodes of fig. 11A-11C: per-well activation or deactivation for sequencing, for synthesis, or for both sequencing and synthesis.

As a result, a complex spatially encoded light pattern may be projected simultaneously onto multiple wells 804, each well receiving a portion of the projection configured to cause a desired behavior of fluids and substances in the vicinity of the well 804. Fig. 12B shows the simplified example described above in relation to the flow cell 802. A spatially encoded projection 810 is shown, where 9 regions correspond to the wells 804 in fig. 12A. The areas depicted with the dotted pattern receive light configured to facilitate the transport of substances required to read data from the nearby fluid to the well 804, while the areas depicted as pure white receive no light, or receive light configured to inhibit the transport of substances in the nearby fluid to the well 804. Region 814 corresponds to well 1-3 of the set of wells 804 and region 812 corresponds to well 1-1 of the set of wells 804.

As can be seen, the spatially encoded projection 810 provides light to a region 812 that is configured to promote desired DNA synthesis within well 1-1, and that may be varied over time to facilitate uptake and binding of particular desired nucleotides in an ordered sequence to construct a desired polynucleotide. In parallel, the region 814 projected onto well 1-3 inhibits uptake of one or more substances by well 1-3, prevents reagents from being wasted on well 1-3, or prevents DNA already stored in well 1-3 from being undesirably affected by nearby fluids. As an alternative to the example above, where solid white areas instead indicate light configured to facilitate reading of DNA from the respective wells 804, the region 814 projected onto well 1-3 may inhibit uptake from nearby fluids of substances associated with the DNA synthesis within well 1-1, while facilitating uptake of substances required for DNA sequencing within well 1-3.

In the simplest form, the alternating regions of the spatially encoded projection may be the entire region with light, or the entire region without light. However, the SLM system 800 can also project more complex light patterns such that the area 812 can be projected as hundreds or thousands of individually controllable light pixels, each having its own characteristics and projected individually onto the wells 804 such that the SLM system 800 can both facilitate uptake of a particular nucleotide by the wells 804 and direct the nucleotide to a desired location within the wells 804 (e.g., centered within the wells 804, offset to the edges of the wells 804).
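The conversion from per-well instructions into a spatially encoded projection can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function name `build_projection`, the mode labels, and the `pixels_per_well` parameter are hypothetical and are not part of the disclosed SLM controller 808.

```python
from typing import Dict, List, Tuple

# Hypothetical labels for what the light projected onto a well should do.
INHIBIT, READ, WRITE = "inhibit", "read", "write"


def build_projection(rows: int, cols: int,
                     requests: Dict[Tuple[int, int], str],
                     pixels_per_well: int = 1) -> List[List[str]]:
    """Return a pixel grid in which each well's block of pixels carries its mode.

    Wells are addressed as 1-based (row, column) pairs, as in "well 1-3".
    Wells that are not named in `requests` default to INHIBIT.
    """
    grid = [[INHIBIT] * (cols * pixels_per_well)
            for _ in range(rows * pixels_per_well)]
    for (r, c), mode in requests.items():
        if not (1 <= r <= rows and 1 <= c <= cols):
            raise ValueError(f"well {r}-{c} is outside the flow cell")
        # Fill the square block of pixels that covers this well.
        for pr in range((r - 1) * pixels_per_well, r * pixels_per_well):
            for pc in range((c - 1) * pixels_per_well, c * pixels_per_well):
                grid[pr][pc] = mode
    return grid


# Nine-well example: synthesize in well 1-1 and sequence in well 1-3.
pattern = build_projection(3, 3, {(1, 1): WRITE, (1, 3): READ})
```

A real projection would carry wavelength and intensity per pixel rather than a single label; the label stands in for whatever lighting characteristics produce the desired effect.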

The SLP 806 may be positioned above the flowcell 1322 (e.g. on an open side of the well 804) so as to project into the well 804 from the open side of the well 804 via an interface such as an optically transparent glass sequencing interface 1324 above the well 804, or may be positioned below the flowcell 1322 (e.g. on an enclosed side of the well 804) so as to project into the well 804 from the enclosed side of the well 804 via an optically transparent sequencing interface 1324 (e.g. configured for optical transmission to the port space 708 in the well 804).

In conjunction with SLM system 800, some implementations of DNA storage system 1300 may include electrodes operable to provide per-well control and activation or deactivation of wells for a desired process as described in fig. 11A-11C. In this way, the wells can be activated to write or read data by a combination of the generated electrical characteristics and the projected light, the combination being configured to provide a complementary or additive effect for promoting or inhibiting uptake. As another example, a first group of wells may be activated for writing data based on the generated electrical characteristics, while a second group of wells may be activated for reading data based on the projected light. As another example, the electrical characteristics and the projected light may each be used to perform a write operation on a separate well. These combinations can be paired with configurations of the substances carried by the fluid that allow certain nucleotides or bases to be electrically released or transported (e.g., based on interaction with an electric current or voltage) without being affected by the projected light, while other nucleotides or bases are only reactive to the projected light and are substantially unaffected by the electrical properties.

Fig. 13A and 13B illustrate aspects of another example of a system that may provide per-well activation and deactivation as described above. Fig. 13A depicts a schematic of a portion of a DNA storage system operable to read, write, or read and write data to one or more wells 904 on a per-well basis. The electrowetting system 900 includes an electrowetting controller 912 coupled to an electrowetting surface 906. The electrowetting surface 906 overlies the flow cell 902, with openings for a set of wells 904. The electrowetting surface 906 may be coupled with the flow cell 902 at a top side (e.g., an open side of the wells 904). Also shown are a fluidics controller 908 and a fluid supply port 910, which may be components of fluidics device 1308 and which are operable to control the flow of fluid into and out of the flow cell 902 via fluidics interface 1328.

The electrowetting controller 912 is configured to provide electrical signals (e.g., current, voltage) to the electrowetting surface 906 in order to produce varying electrical conditions at discrete locations on the electrowetting surface 906. The electrowetting controller 912 may receive instructions from a device, such as the memory controller 1302, that identify a path along the electrowetting surface 906, such as a path leading from the fluid supply port 910 to one or more wells 904, or a path between wells 904. The electrowetting controller 912 can generate a series of electrical conditions (e.g., voltage patterns) on the electrowetting surface 906 with high precision, both spatially (e.g., at particular locations) and electrically (e.g., at particular currents, voltages, or frequencies) based on the instructions. Fluidic controller 908 may receive corresponding instructions indicating the type of fluid, quantity, composition, and sequence of delivery of the fluid to electrowetting surface 906. Based on these instructions, electrowetting controller 912 and fluidic controller 908 operate in parallel, are managed by memory controller 1302, or communicate with each other, to provide precise amounts and compositions of fluid, e.g., droplets, to an input region of electrowetting surface 906, followed by a precise sequence of electrical characteristics along electrowetting surface 906 that are configured to transport the droplets from the input region to one or more wells 904.

In this manner, rather than flooding or submerging the entire surface of the flow cell 902 with fluid, individual droplets may be composed and transported directly to the respective wells 904 to enable writing or reading of data. This allows selective reading from and writing to a single well 904 (described above, with respect to the electrode-based and SLM-based implementations, as "activating" and "deactivating" the wells), as well as simultaneous read and write operations across the wells 904 without interference. The electrowetting system 900 also saves reagents because it does not need to flood or submerge the surface with a multi-substance reagent, which may include many nucleotides, enzymes, or other substances that are necessary for operation in one well 904 but that may be wasted on, or may interfere with operation in, another well 904. Instead, droplets having substantially the same volume as a well 904 and containing only the substances required for operation in that well 904 may be delivered directly to that well 904.

Fig. 13B shows an example of an electrowetting process. Surface map 920 corresponds to a set of wells 904. As with the previous example, nine wells 904 are shown for simplicity, but it should be understood that the electrowetting system 900 can support a surface with thousands of wells 904, for example by introducing additional fluid supply ports 910 at discrete locations on the electrowetting surface 906 in order to minimize the travel distance of a particular droplet. A first path 922 is shown in phantom on surface map 920, traveling along the interstitial space between wells 1-2 and 1-3 until it reaches the interstitial space near well 2-3, where it is directed into well 2-3. The first path 922 may result from a set of consecutive electrical conditions (e.g., applied voltage changes) along the electrowetting surface 906, configured to attract a droplet to a subsequent location, repel a droplet from a current location, or both; and may also include a pair of electrical conditions that flank the first path 922 and direct the droplet back to that path, somewhat like a set of "rails" that both guide the droplet and prevent deviation from the first path 922.
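The idea of a path being produced by consecutive electrical conditions can be illustrated with a small sketch. Everything here is a hypothetical simplification: the electrowetting surface is modeled as a grid of individually addressable pads, and `activation_steps` and the `attract`/`release` keys are names invented for illustration, not the behavior of the electrowetting controller 912.

```python
from typing import Dict, List, Tuple

Pad = Tuple[int, int]  # hypothetical (row, column) address of one electrode pad


def activation_steps(path: List[Pad]) -> List[Dict[str, Pad]]:
    """Turn a droplet path into an ordered list of electrical conditions.

    At each step the next pad is energized to attract the droplet, and the
    pad the droplet currently occupies is released.
    """
    for cur, nxt in zip(path, path[1:]):
        # A droplet can only be handed over to a directly adjacent pad.
        if abs(cur[0] - nxt[0]) + abs(cur[1] - nxt[1]) != 1:
            raise ValueError(f"pads {cur} and {nxt} are not adjacent")
    return [{"attract": nxt, "release": cur}
            for cur, nxt in zip(path, path[1:])]


# A short path from a supply port pad to a pad next to the target well.
steps = activation_steps([(0, 0), (0, 1), (1, 1)])
```

The "rails" described above would add further conditions on the pads flanking each step; they are left out to keep the sketch short.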

Similarly, second path 924 travels along the interstitial spaces between wells 904 until well 2-2 is reached, where the droplet can be divided into two portions, one portion being introduced into well 2-2 and the second portion continuing to well 3-1. This path may be created in a manner similar to the first path 922, with the additional process of splitting the droplet where the path branches. This can be accomplished by creating an electrical condition, for example along the centerline of the droplet, to divide it into two parts, paired with electrical "rails" or other conditions to direct each sub-droplet from the splitting location to a subsequent location. Second path 924 may be useful, for example, in situations where well 2-2 and well 3-1 are each prepared to synthesize DNA and therefore require the same or very similar fluid composition.

The electrowetting system 900 can be paired with other systems disclosed herein, such as the per-well electrodes of fig. 11A-11C, the SLM system 800, or both. As an example, activation of the electrodes can be paired with delivery of droplets to filter out trace amounts of undesirable substances that may be picked up from the electrowetting surface 906 during droplet delivery. Alternatively, the droplet may pass over wells instead of or in addition to the interstitial spaces, with the electrodes of the wells along its route being activated to prevent uptake. As another example, a droplet may contain a mixture of substances, with separate subsets of the mixture intended for two separate wells. The mixed droplet may be delivered to a first well, whose electrodes may be activated to extract the desired subset from the droplet, after which the droplet may be delivered to a second well to deliver the remaining substances.

The SLM system 800 can be paired with the electrowetting system 900 and can project patterned light onto wells over which droplets travel, to inhibit accidental uptake of substances by those wells as the droplets pass. Alternatively, the SLM system 800 can project patterned light onto the interstitial spaces or other paths of the electrowetting surface 906 along which droplets travel, to destroy or degrade residual substances and prevent their absorption by subsequent droplets.

In some implementations, the electrowetting surface 906 may be a component of the storage device 1320, and may be integrated with and coupled to the flow cell surface as shown in fig. 13A. In other implementations, the electrowetting surface 906 may be a component of the set of instruments 1301 (e.g., the sequencing device 1304 or the synthesis device 1306) and may be coupled to a surface of the flow cell 1322 as a result of the storage device 1320 being inserted into or otherwise coupled to the set of instruments 1301. As an example, where the storage device 1320 is inserted into the modular receptacle 1321, the electrowetting surface 906 may slide into a slot or opening in the storage device 1320 and automatically align with the wells of the flow cell 1322 when the storage device 1320 is locked in place, as already described, so that discrete portions of the electrowetting surface 906 can still be accurately addressed to the respective wells.

Several implementations have been disclosed that provide selective activation and deactivation of wells on a per-well basis for read and write operations. Fig. 14 depicts a flow diagram of a process 1000 that may be performed to provide controlled read and write regions to multiple wells with such systems and devices. An index of wells may be maintained by a device, such as memory controller 1302 (block 1002), and may include details of the status of each well of a flow cell, such as flow cell 1322. The well state information may uniquely identify each well (e.g., by ID number, physical location on the surface of the flow cell 1322, or both) and indicate whether each well contains a polynucleotide or is empty. The well state information may also include an identifier of the information stored in the well, such as a file list, a data description, a unique write operation identifier, the time and date of the write operation, or other similar information. This information may also be stored as a separate file index that indicates the one or more wells in which each file is stored by listing the unique identifiers of those wells, with the well index containing the unique identifiers and the physical locations or addresses of the wells. The well index, the file index, or both may be stored on a memory or drive accessible to memory controller 1302, or may be stored on electronic memory that is a component of the storage device 1320, which will be described in more detail below.
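One possible shape for the well index and the separate file index is sketched below. The class and field names (`WellRecord`, `StorageIndex`, `write_id`, and so on) are assumptions made for illustration; the description above does not prescribe a particular data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class WellRecord:
    """State kept for one well (field names are illustrative assumptions)."""
    well_id: int
    location: Tuple[int, int]         # physical (row, column) on the flow cell
    occupied: bool = False            # True once the well holds a polynucleotide
    write_id: Optional[str] = None    # identifier of the write operation
    written_at: Optional[str] = None  # time and date of the write operation


@dataclass
class StorageIndex:
    wells: Dict[int, WellRecord] = field(default_factory=dict)  # well index
    files: Dict[str, List[int]] = field(default_factory=dict)   # file index

    def add_well(self, well_id: int, location: Tuple[int, int]) -> None:
        self.wells[well_id] = WellRecord(well_id, location)

    def empty_wells(self) -> List[int]:
        return sorted(w.well_id for w in self.wells.values() if not w.occupied)

    def record_write(self, filename: str, well_ids: List[int],
                     write_id: str, written_at: str) -> None:
        for wid in well_ids:
            rec = self.wells[wid]
            rec.occupied, rec.write_id, rec.written_at = True, write_id, written_at
        self.files[filename] = list(well_ids)

    def locate(self, filename: str) -> List[Tuple[int, int]]:
        """Resolve a file to well locations via the file index and well index."""
        return [self.wells[wid].location for wid in self.files[filename]]


index = StorageIndex()
for wid, loc in enumerate([(1, 1), (1, 2), (1, 3)], start=1):
    index.add_well(wid, loc)
index.record_write("report.txt", [1, 3], "w-0001", "2020-01-01T00:00")
```

Keeping the file index as a list of well identifiers, with locations held only in the well index, mirrors the separation described above.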

The DNA storage system 1300 may receive (block 1004) a request from a user or from other systems and devices to write to or read data from a storage device; and the data associated with those requests may be stored until they can be completed. When a request is received (block 1004), the DNA storage system 1300 may determine (block 1006) one or more wells affected by the request. For requests to read data and provide output, this may include referencing the well index to determine the identity of wells that can be read, in order to generate a description of the machine-written polynucleotide contained therein; and converting the polynucleotide into digital data. For requests to write input to the storage device, this may include referencing the well index to identify one or more fully or partially empty wells that provide the required amount of storage based on the input size; and assigning a portion of the input to each of the wells.
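For a write request, identifying wells and assigning a portion of the input to each could look like the following sketch. The helper name `assign_wells` and the fixed `bytes_per_well` capacity are assumptions; actual capacity would depend on the polynucleotide length a well can hold.

```python
from typing import Dict, List


def assign_wells(empty_well_ids: List[int], payload: bytes,
                 bytes_per_well: int) -> Dict[int, bytes]:
    """Split `payload` into chunks and assign one chunk to each empty well."""
    if bytes_per_well <= 0:
        raise ValueError("bytes_per_well must be positive")
    chunks = [payload[i:i + bytes_per_well]
              for i in range(0, len(payload), bytes_per_well)]
    if len(chunks) > len(empty_well_ids):
        raise ValueError("not enough empty wells for this input")
    # Wells are used in the order given; surplus wells stay unassigned.
    return dict(zip(empty_well_ids, chunks))


assignment = assign_wells([4, 7, 9], b"abcdefgh", bytes_per_well=3)
```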

The DNA storage system 1300 may manage (block 1008) the well activation of the affected wells, which may include any or all of the following: (1) activating the affected well to read data or activating the affected well to write data; (2) deactivating or otherwise protecting wells near or adjacent to the affected well (e.g., the electrowetting system 900 may provide droplets to the target well while the adjacent wells are deactivated); and/or (3) deactivating each well that is not an affected well (e.g., where the entire surface of the flow cell receives the multi-substance fluid, each unaffected well is deactivated to prevent undesired synthesis or sequencing). As described above, well activation may include the use of systems and devices such as those shown in fig. 11A-11B to promote desired transport and binding of substances in target wells, inhibit undesired uptake of substances in non-target wells, or both. While in some implementations it may be more accurate to say that nucleotides or other substances are activated while the well itself remains unchanged, it should be understood that "activation" of the well is used herein to cover changes in properties associated with the well (e.g., electrical properties of the region near the well), as well as changes in substances near the well (e.g., activation of nucleotides in response to voltage or photon energy).
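The three activation options listed above can be expressed as a small planning step. This is a sketch only; the function `plan_activation` and the `protect` values `"adjacent"` and `"all"` are hypothetical names for options (2) and (3).

```python
from typing import Dict, Iterable, Set, Tuple

Well = Tuple[int, int]  # 1-based (row, column), as in "well 2-3"


def plan_activation(rows: int, cols: int, targets: Iterable[Well],
                    protect: str = "adjacent") -> Dict[str, Set[Well]]:
    """Choose which wells to activate and which to deactivate.

    protect="adjacent" deactivates only neighbors of the target wells;
    protect="all" deactivates every well that is not a target.
    """
    target_set = set(targets)
    all_wells = {(r, c) for r in range(1, rows + 1) for c in range(1, cols + 1)}
    if not target_set <= all_wells:
        raise ValueError("a target well is outside the flow cell")
    if protect == "all":
        deactivate = all_wells - target_set
    elif protect == "adjacent":
        neighbors = {(r + dr, c + dc) for (r, c) in target_set
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
        deactivate = (neighbors & all_wells) - target_set
    else:
        raise ValueError("protect must be 'adjacent' or 'all'")
    return {"activate": target_set, "deactivate": deactivate}


plan = plan_activation(3, 3, [(2, 3)])
```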

The DNA storage system may perform (block 1010) input and output requests by sequencing or synthesizing DNA in the respective activated wells using methods as described herein, which may include providing various reagent fluids to some or all of the wells, providing an ordered sequence of nucleotides to the wells, and illuminating and imaging the wells, among others. As described above, the data to be written into the wells may be encoded into a DNA format (e.g., where the nucleotide bases correspond to binary data, such as where each individual nucleotide corresponds to a different binary value) that is used to determine the sequence of nucleotides written into the wells. The data read from the wells will initially be in DNA format and will describe ordered sequences of nucleotides, which can be converted back into binary data using the corresponding decoding rules.
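A minimal illustration of encoding binary data into a nucleotide sequence and decoding it again follows. The two-bits-per-base mapping (A=00, C=01, G=10, T=11) is an assumption chosen for simplicity; the description above does not fix a particular encoding rule.

```python
# Assumed mapping: each nucleotide stands for a distinct two-bit value.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def encode(data: bytes) -> str:
    """Convert bytes into a nucleotide string, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence: str) -> bytes:
    """Convert a nucleotide string produced by `encode` back into bytes."""
    if len(sequence) % 4:
        raise ValueError("sequence length must be a multiple of four bases")
    bits = "".join(BITS_FOR_BASE[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"A")          # 0x41 is 01 00 00 01 in binary
round_trip = decode(encode(b"hello"))
```

Practical encodings typically add constraints (for example, avoiding long runs of one base), which this sketch ignores.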

B. Caching method for simultaneous reads and writes

In addition to benefiting from features that allow simultaneous reading and writing of data to storage devices, systems such as the DNA storage system 1300 may also benefit from DNA storage-specific caching methods that simulate simultaneous reading and writing of data, so that users or other systems and devices that rely on the DNA storage system 1300 are not affected by occasional delays in the ability of the DNA storage system 1300 to write data to the storage device 1320. Some implementations of these caching strategies may also minimize the likelihood of future read and write collisions.

While the set of instruments 1301 may include various system-level cache features (e.g., a processor, memory, or motherboard cache that is built into the memory controller 1302 and configured to automatically cache data related to the basic operation of the processor), it may be advantageous to provide a cache memory (e.g., electronic memory) associated with the storage device 1320 instead of the set of instruments 1301. As an example, when data is being written to the storage device 1320 and a user begins to unload the volume from the set of instruments 1301 of the DNA storage system 1300 before the data has been written to the flow cell 1322, the unwritten data may be stored on the electronic memory and will move with the storage device 1320 until it can be written to the flow cell 1322. As another example, where an index of wells is maintained as described in fig. 14 (block 1002), it may be useful to maintain the index on electronic memory that moves with the storage device 1320, rather than or in addition to maintaining it on cloud storage or on a permanent storage volume of the set of instruments 1301 of the DNA storage system 1300. In this manner, if the storage device 1320 is shipped and installed with a set of instruments 1301 of a different DNA storage system 1300, the well index is immediately available without the need to obtain information from the cloud storage volume or from the previously coupled set of instruments 1301.

FIG. 15 depicts a schematic diagram illustrating an example of a storage device 1100 that can be used with a DNA storage system, such as DNA storage system 1300. The storage device 1100 includes a cartridge interface 1104 (e.g., sequencing interface 1324, synthesis interface 1326, etc.) configured to couple the storage device 1100 with a DNA storage system interface 1108 (e.g., a set of instruments 1301, an electrical interface 1310). The storage device 1100 also includes a flow cell 1106 on which data can be written and stored as polynucleotides. The cache memory 1102 is included in the storage device 1100 and may be part of the module electronics 1330 such that it couples with the electrical interface 1310 when the storage device 1320 is placed or installed in the DNA storage system 1300. The electrical interface 1310 both powers the cache memory 1102 and allows data exchange between the cache memory 1102 and the memory controller 1302 or another device. In some implementations, the cache memory 1102 may be a non-volatile electronic solid-state memory configured to be physically connected for data transfer. In some implementations, the cache memory 1102 may be configured for wireless data transfer when coupled to the DNA storage system 1300. In some implementations, the cache memory 1102 can include several memories, such as a large solid-state memory for storing data, and a small wireless memory (e.g., an RFID memory containing a unique identifier associated with the storage device 1320) for storing a small amount of identification information.

In addition to the advantages of cache memory 1102 already described, particular caching strategies may be implemented for DNA storage systems 1300 in which the cache memory 1102 is available. As an example, fig. 16 depicts a flow diagram of a process 1120 that may be performed to provide caching of read and write operations for the storage device 1100. When writing data to the storage device 1100 using the DNA storage system 1300, additional data may be written to the cache memory 1102. This may include writing (block 1122) a file index to the cache memory 1102, which may describe the contents of the multiple wells and the location of a particular file or data, whether stored in the multiple wells, on the cache memory 1102, or both. Such information may be used to allow later access and retrieval of the requested data. Checksum data for a single file or bundle of data may also be written (block 1124) to cache memory 1102 and may be associated with the file index.
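A file index entry with an associated checksum might be built as in the sketch below. The use of SHA-256, the helper names `index_entry` and `matches`, and the dictionary layout are all assumptions; the description does not specify a checksum algorithm or record format.

```python
import hashlib
from typing import Dict, Iterable


def index_entry(data: bytes, well_ids: Iterable[int]) -> Dict[str, object]:
    """Build a cache-resident index entry for one file (layout is assumed)."""
    return {
        "wells": list(well_ids),                     # where the file is stored
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),  # checksum for later checks
    }


def matches(entry: Dict[str, object], data: bytes) -> bool:
    """Check data read back from the flow cell against the stored checksum."""
    return (len(data) == entry["size"]
            and hashlib.sha256(data).hexdigest() == entry["sha256"])


cache_index = {"report.txt": index_entry(b"quarterly figures", [1, 3])}
```

After a later sequencing operation, `matches` could be applied to the decoded data to flag a misread or degraded well before the data is returned.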

Compared with storing such data on the flow cell 1106, which requires that the drive contents be sequenced before they are accessible and frequently re-synthesized to reflect changes, or storing such data on cloud storage or servers, which may require network connectivity and access permissions before the drive contents can be accessed, storing indexes, checksum lists, or both for files and data on cache memory 1102 may enable faster data reads and writes in the future. When such data is also stored on the flow cell 1106 itself or on a network-accessible volume, storing it on the cache memory 1102 provides the additional advantage of redundancy, as loss of file tables and indices may result in complete loss of the data stored on the flow cell 1106, or may greatly increase the time and resource costs of reconstructing a file index based on well-by-well checks.

When the storage device 1100 is coupled with the DNA storage system 1300, the system may receive (block 1126) a read operation from a user or from systems and devices in communication with the DNA storage system 1300 and may receive (block 1136) a write operation. As already discussed, in some cases it may not be possible to allow simultaneous reading and writing of data to separate wells of the flow cell 1106. This may be due to limitations of the DNA storage system 1300, or limitations of the storage device 1100. As an example, some implementations of DNA storage system 1300 may lack selective or per-well activation for reading and writing data, such as described in the context of fig. 9-12, and may instead perform sequencing and synthesis as batch operations that may affect each well of the flow cell 1106. In such an implementation, the storage device 1100 may be considered to have two mutually exclusive modes: a read mode and a write mode.

While such implementations may particularly benefit from the disclosed caching strategy, it should be appreciated that even implementations that support simultaneous reads and writes may encounter various situations where they are actually in a read mode or a write mode, e.g., where the well activation feature is limited to a certain number of simultaneous operations within a period of time (e.g., the electrowetting system 900 may be limited to the number of wells that it may activate within a given period of time, and a large number of data read requests may cause subsequent data write requests to queue for a period of time).

Upon receiving a read operation (block 1126), the DNA storage system 1300 may check the file index to locate the requested data and determine whether it is currently stored on the cache memory 1102 (block 1128). In various circumstances, the requested data may be available on cache memory 1102. By way of example, when data is written to the storage device 1100 and then requested in the near future, it may still be stored in the cache memory 1102. As another example, where data was recently read from the flowcell 1106 upon request, it may be stored in the cache memory 1102 until overwritten. As another example, the DNA storage system 1300 may be configured to mark certain data stored on the flowcell 1106 as also maintained in the cache memory 1102, where possible, due to manual configuration by a user, or based on an automatic determination by the DNA storage system 1300 based on the frequency of read requests for such data.

In the event that the requested data is available from the cache memory 1102, the DNA storage system 1300 will read (block 1130) the data from the cache memory 1102 to service the request, which may allow the storage device 1100 to remain in write mode while allowing the data to be read (albeit from the cache memory 1102 rather than from the flow cell 1106). In the event that the requested data is not stored in the cache memory 1102, the DNA storage system may read the data from the flow cell 1106 when such functionality is available (e.g., when the storage device 1100 is in a read mode, or when a read operation is otherwise available) (block 1132). In the case where the storage device 1100 is in a write mode and actively writing data to the flow cell 1106, it may not be advantageous to switch back to a read mode immediately, due to the time and reagent costs of switching between modes and conditioning the wells. However, in the event that the queued data to be written is of a size that can be stored on cache memory 1102, it may be advantageous to switch to a read mode and allow the requested data to be read from the wells (block 1132) while the incoming data is stored on cache memory 1102. In this case, a user or other system or device that has requested both writing and reading of data perceives these actions as being performed simultaneously, since output data is being read from flow cell 1106 while input data is being written to cache memory 1102.

After each read operation, the file index or another data set on cache memory 1102 may be updated (block 1134) to reflect the read frequency of the most recently requested data. Such a data set is useful for determining which data should be cycled into cache memory 1102 in the future due to frequency of use (e.g., data that may be requested every single day) or pattern of use (e.g., data requested every Friday may be cycled into cache memory 1102 at low priority on Thursday, so that the transfer completes within a period during which there may be fewer data read and write requests).

With continued reference to FIG. 16, where a write operation is received (block 1136), the DNA storage system 1300 may determine (block 1138) whether the storage device 1100 is currently in a read mode. Where the storage device 1100 is currently in a read mode, input data associated with the write operation may be written (block 1142) to cache memory 1102 and marked to be written to the flow cell 1106 when writing becomes available. Where the storage device 1100 is already in a write mode, input data may be written (block 1140) to one or more wells of the flow cell 1106.
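The read and write branches of process 1120 can be summarized in a short sketch. The class `CachedStorageDevice` and its attributes are hypothetical stand-ins: plain dictionaries take the place of the flow cell 1106 and the cache memory 1102, and block numbers in the comments indicate which step each line is meant to mirror.

```python
class CachedStorageDevice:
    """Toy model of a storage device with mutually exclusive read/write modes."""

    def __init__(self, mode: str = "read"):
        self.mode = mode        # "read" or "write"
        self.flow_cell = {}     # stands in for data held as polynucleotides
        self.cache = {}         # stands in for the electronic cache memory
        self.pending = set()    # cached keys still to be written to the flow cell
        self.read_counts = {}   # read-frequency data

    def read(self, key):
        if key in self.cache:                      # blocks 1128 and 1130
            data = self.cache[key]
        elif self.mode == "read":                  # block 1132
            data = self.flow_cell[key]
        else:
            raise RuntimeError("flow cell is in write mode; read must wait")
        self.read_counts[key] = self.read_counts.get(key, 0) + 1  # block 1134
        return data

    def write(self, key, data) -> str:
        if self.mode == "read":                    # blocks 1138 and 1142
            self.cache[key] = data
            self.pending.add(key)
            return "cache"
        self.flow_cell[key] = data                 # block 1140
        return "flow cell"

    def switch_mode(self, mode: str) -> None:
        self.mode = mode
        if mode == "write":
            # Flush data that was marked to be written once writing is available.
            for key in sorted(self.pending):
                self.flow_cell[key] = self.cache[key]
            self.pending.clear()


device = CachedStorageDevice(mode="read")
device.flow_cell["old"] = b"archived"
where = device.write("new", b"incoming")   # arrives while in read mode
```

In this model a requester sees the write accepted and the read served in the same period, even though only the read touches the flow cell.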

The disclosed caching method may also be governed by a caching policy that favors staying in the current mode (e.g., read mode or write mode) over other considerations, such that all queued read operations may be performed before switching to write mode, regardless of the order in which the requests arrived. The system may even remain in read mode for a brief period after the last read request completes, in order to allow other read requests to arrive and be serviced before switching modes. In addition to reducing the total number of mode switches performed in a given time period, this strategy also reduces the risk of cross-contamination between read-specific and write-specific reagents, which would otherwise occur more often with more frequent mode switches.
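The effect of a stay-in-mode policy on the number of mode switches can be shown with a small sketch. The functions `schedule` and `count_switches` and the `(kind, name)` request tuples are hypothetical; the brief wait for late-arriving requests is not modeled.

```python
from typing import List, Tuple

Request = Tuple[str, str]  # (kind, name), where kind is "read" or "write"


def schedule(queue: List[Request], current_mode: str) -> List[Request]:
    """Service every request that matches the current mode before switching."""
    same = [r for r in queue if r[0] == current_mode]
    other = [r for r in queue if r[0] != current_mode]
    return same + other  # arrival order is preserved within each group


def count_switches(ordered: List[Request], current_mode: str) -> int:
    """Count how many mode switches an ordering would require."""
    switches, mode = 0, current_mode
    for kind, _ in ordered:
        if kind != mode:
            switches += 1
            mode = kind
    return switches


queue = [("read", "a"), ("write", "b"), ("read", "c"), ("write", "d")]
in_arrival_order = count_switches(queue, "read")
with_policy = count_switches(schedule(queue, "read"), "read")
```

For this queue, servicing requests strictly in arrival order needs three mode switches, while the policy needs one.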

C. Multi-volume management method for synchronous reading and writing

Systems such as DNA storage system 1300 may also benefit from DNA storage-specific methods for multi-volume management that may allow for simultaneous reading and writing of data, redundant storage of data, error checking of data, and increased reading and writing speeds. For example, in some implementations, when data is written to a location (e.g., a well in a flow cell), a mirror copy may also be automatically written to a second location (e.g., a second well in the same flow cell, or a well in a different flow cell). In some such implementations, the presence of one or more mirror copies may be used to improve system performance, for example, by supporting parallel operations. For example, in some implementations, a DNA storage device may include two flow cells (e.g., two flow cells 601), a first flow cell for read operations and a second flow cell for write operations. In this case, if two different users want to read and write data stored in a particular location at the same time, the read request can be satisfied by sequencing the copy of the data stored in the first flow cell 601, while the write request is fulfilled by writing a new polynucleotide to the corresponding location in the second flow cell 601. Subsequently, after the requests have been processed (e.g., during a period of low activity), the first flow cell 601 may be resynchronized with the second flow cell 601, thereby ensuring that any data that has been written to the second flow cell 601 can be read using the first flow cell 601 whenever the next read request is issued.
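The two-flow-cell arrangement with deferred resynchronization can be modeled as follows. The class `MirroredFlowCells` and its `dirty` set are illustrative assumptions; dictionaries stand in for the two flow cells.

```python
class MirroredFlowCells:
    """Toy model: one flow cell serves reads, the other accepts writes."""

    def __init__(self):
        self.read_cell = {}    # sequenced to satisfy read requests
        self.write_cell = {}   # synthesized into to satisfy write requests
        self.dirty = set()     # locations written since the last resync

    def write(self, location, data) -> None:
        self.write_cell[location] = data
        self.dirty.add(location)

    def read(self, location):
        # Reads never touch the write cell, so they can run in parallel.
        return self.read_cell[location]

    def resync(self) -> int:
        """Copy changed locations to the read cell; return how many changed."""
        for location in self.dirty:
            self.read_cell[location] = self.write_cell[location]
        changed = len(self.dirty)
        self.dirty.clear()
        return changed


pair = MirroredFlowCells()
pair.read_cell["loc-1"] = pair.write_cell["loc-1"] = b"v1"
pair.write("loc-1", b"v2")           # handled by the write cell
before_resync = pair.read("loc-1")   # still served from the read cell
```

Until `resync` runs, a read returns the previous copy, which is the trade-off this arrangement accepts in exchange for parallel operation.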

As another example, in some implementations, when data is to be written, a system can be implemented to write the data along with redundancy values that can be used to reconstruct a portion of the data if that portion is lost. To illustrate, consider table 1 below, which provides an example of redundancy values that may be generated for data stored in the form of four polynucleotides, each storing four bits of data.

Data sequence 1	Data sequence 2	Data sequence 3	Data sequence 4		Redundant sequence
1	1	1	1	XOR	0
1	1	1	0	XOR	1
1	1	0	0	XOR	0
1	0	0	0	XOR	1

Table 1: An example method of storing data using a redundant sequence, which allows a portion of the data to be recreated even when that portion is lost.

In some implementations, the 16-bit data may be divided into four-bit sequences, and a fifth four-bit sequence may be created by applying a logical exclusive-or operator to the bits from the first four sequences. These five sequences can then be stored in five polynucleotides at five different locations in the DNA storage device. Then, if during a sequencing operation it is found that data in one location is corrupted and/or inadvertently misread, the data can be recreated by applying the XOR operator to the remaining four sequences and storing the encoded result as a new polynucleotide in the location where the corrupted data was previously stored.
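The XOR scheme can be checked with the values from Table 1, reading each column as one four-bit sequence. The function names `redundancy` and `recover` are hypothetical, and integers stand in for the stored polynucleotides.

```python
from functools import reduce
from operator import xor
from typing import List, Optional


def redundancy(sequences: List[int]) -> int:
    """XOR all data sequences together to obtain the redundant sequence."""
    return reduce(xor, sequences, 0)


def recover(stored: List[Optional[int]]) -> List[int]:
    """Rebuild the single missing entry (marked None) from the remaining ones.

    `stored` holds the data sequences followed by the redundant sequence.
    """
    missing = [i for i, s in enumerate(stored) if s is None]
    if len(missing) != 1:
        raise ValueError("exactly one sequence must be missing")
    rebuilt = reduce(xor, (s for s in stored if s is not None), 0)
    restored = list(stored)
    restored[missing[0]] = rebuilt
    return restored


# Columns of Table 1: data sequences 1 to 4.
data = [0b1111, 0b1110, 0b1100, 0b1000]
parity = redundancy(data)                                 # 0b0101
restored = recover([0b1111, 0b1110, None, 0b1000, parity])
```

As with any single-parity scheme, this recovers one lost or corrupted sequence per group; two simultaneous losses cannot be rebuilt from one redundant sequence.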

During reading and/or writing, "phasing" and/or "pre-phasing" may occur and introduce errors in the final written or read sequence. "Phasing" refers to the situation where a second nucleotide is incorporated because the reversible terminator of the first incorporated nucleotide is inadvertently removed (e.g., by interaction with residual reagents that have not yet been flushed from the flow cell). During writing, this may result in two nucleotides being written instead of one for a particular DNA sequence. During reading, this may result in no fluorophore being detected for the first nucleotide, thus shifting the read sequence by skipping one nucleotide. "Pre-phasing" refers to the situation where a nucleotide is not incorporated. During writing, this may result in no nucleotide being written into the sequence. During reading, this may result in either no fluorophore being detected for the nucleotide being read, or the fluorophore associated with the previous nucleotide being detected again, thereby shifting the read sequence by delaying or repeating the reading of one nucleotide. The use of redundant sequences may allow errors in the writing process, the reading process, or both to be detected.

FIGS. 17A and 17B depict schematic diagrams of storage device configurations that allow for multi-volume management for DNA storage. Fig. 17A illustrates a storage device 1200, which can be coupled with a DNA storage system interface 1108 (e.g., as described in the context of storage device 1100). The storage device 1200 includes a cartridge interface 1202 and two different flowcells 1204, 1206, the cartridge interface 1202 having similar features and capabilities as described with respect to cartridge interface 1104, and the two different flowcells 1204, 1206 having similar features and capabilities as the other flowcells described herein. Although referred to as flow-through cells, it is also understood that first flow-through cell 1204 and second flow-through cell 1206 may alternatively be different channels, such as two channels separate from flow channel 410, which may be managed independently from a fluidics, sequencing, and synthesis perspective.

FIG. 17B shows a configuration similar to the storage device shown in FIG. 17A, but with two different storage devices 1210. Each storage device 1210 of this example has a cartridge interface 1212, the cartridge interface 1212 having similar features and capabilities as described with respect to the cartridge interface 1104. Each storage device 1210 of this example also has a flowcell 1214, the flowcell 1214 having similar features and capabilities as the other flowcells described herein. Each of the two different storage devices 1210 is coupled to a DNA storage system interface 1108, which itself may be slightly modified to couple to two different storage devices 1210 simultaneously. For example, the DNA storage system interface 1108 may double or otherwise provide additional capabilities for each set of instruments 1301; or two storage devices 1210 may be positioned close to each other and the set of instruments 1301 configured such that they may float between modules, for example where the illumination and optical sensing devices may move between modules. Alternatively, the DNA storage system interface 1108 may provide a set of instruments 1301 that interact with each module simultaneously, such as in the case of a fluidic device, which may provide fluid to each module through a fluidic network.

FIG. 18 depicts a flow diagram of a process 1220 that may be performed to provide redundant data write and read operations with a storage device, such as storage device 1200 or storage device 1210. Upon receiving an input request (block 1222) providing input data that should be written to the wells of the flowcell, the DNA storage system 1300 may, in parallel or in close sequence depending on its capabilities, synthesize (block 1224) and store the input data in a first well, and synthesize (block 1228) and store the input data in a second well. Although FIG. 18 depicts a first well and a second well, it should be understood that the writing of data may advantageously be performed across three or more wells. The redundancy of the first and second wells may be used for error checking, for example by sequencing the polynucleotides of one or both of the first and second wells to determine whether phasing or pre-phasing occurred during reading and/or writing. Assuming no errors in the synthesis, the first and second wells will contain identical copies of the polynucleotide corresponding to the input data. This provides various advantages depending on the nature of the first and second wells.

For example, referring to storage device 1200, a first well may be located in first flowcell 1204 and a second well may be located in second flowcell 1206. In this case, the written data is stored redundantly in two separate volumes, which is desirable for data integrity and to minimize the risk of data loss. Furthermore, since data is cloned across two different flowcells, the DNA storage system 1300 may also simultaneously read and write data to a volume, for example, by keeping the first flowcell 1204 in write mode at all times, while switching the second flowcell 1206 to read mode upon receiving a request to output data. These two advantages also apply to the case where the first well is located in the flow cell 1214 of one storage device 1210 and the second well is located in the flow cell 1214 of another storage device 1210.

Even in the case where another storage device (e.g., storage device 1100) is used and the first and second wells are each located in the flow cell 1106, there are some advantages to cloning data across wells in the same flow cell. In addition to providing data redundancy to reduce the risk of data loss due to well failure or unexpected degradation of DNA within one well, the clonally written data may also provide additional error checking capabilities regardless of where the two wells are located. As an example, where input data is written to separate wells as shown in FIG. 18, the input data may then be read back from both wells by sequencing (block 1226) and reading the data from the first well, and sequencing (block 1230) and reading the data from the second well. Comparison (block 1232) of the output read from each well (the complete file or data set, or a checksum of the output) will indicate whether the polynucleotide written to one of the wells was mis-synthesized, misread, or subsequently degraded for some reason.
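The write-twice-and-compare flow of FIG. 18 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that wells are modeled as entries in a dictionary and that SHA-256 serves as the checksum; the names `write_cloned`, `copies_match`, and the well identifiers are hypothetical.

```python
import hashlib

def checksum(sequence: str) -> str:
    """Digest of a read-back sequence; SHA-256 is an arbitrary choice here."""
    return hashlib.sha256(sequence.encode("ascii")).hexdigest()

def write_cloned(wells: dict, first: str, second: str, sequence: str) -> None:
    """Store the same sequence in two wells (blocks 1224 and 1228)."""
    wells[first] = sequence
    wells[second] = sequence

def copies_match(wells: dict, first: str, second: str) -> bool:
    """Read both wells (blocks 1226, 1230) and compare checksums (block 1232)."""
    return checksum(wells[first]) == checksum(wells[second])

wells = {}
write_cloned(wells, "well_a", "well_b", "AATTCCGG")
match_after_write = copies_match(wells, "well_a", "well_b")

# Simulate degradation of one copy: the last nucleotide is lost.
wells["well_b"] = "AATTCCG"
match_after_damage = copies_match(wells, "well_a", "well_b")

print(match_after_write, match_after_damage)
```

A mismatch only shows that the two copies differ. Deciding which copy is the damaged one requires additional information, such as a stored checksum of the original input.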

In some implementations, synthesis of the polynucleotide in the second well can be accomplished based on synthesis of the polynucleotide in the first well. That is, the polynucleotides written into the first well may be clonally amplified, and one or more cloned polynucleotides may be stored in the second well and/or the fluid storage chamber. The clonally amplified polynucleotides in the first well may be sequenced to determine the sequence of the nucleotides in the first well. The nucleotide sequence in the first well may be compared to the polynucleotide that was intended to be written to determine whether any phasing or pre-phasing errors occurred during the writing process. If an error occurred, the one or more cloned polynucleotides stored in the second well and/or fluid storage chamber may be discarded as corrupted, and the writing process may be repeated. If no error occurred, the one or more cloned polynucleotides stored in the second well and/or the fluid storage chamber may be cloned and stored in the first well and/or one or more other wells to provide two or more identical polynucleotides as described herein.

Some implementations that perform cloned data writing as shown in FIG. 18 and that also include an SLM system 800, where the first and second wells are contained within the same flow cell, may benefit from the ability of the SLM system 800 to project encoded spatial light across the flow cell surface. In this case, the flow cell may already have received the reagents required to synthesize and write data to the first well, which means that the second well typically also receives the same reagents. The SLM system 800 can then project patterned light onto the first well to complete the synthesis and, since the second well is on the same flow cell and within the potential projection footprint of the SLM system 800, duplicating the pattern so that it is also projected onto the second well may require little additional reagent or processor time.

In yet another implementation of the method of fig. 18, a single well may be used as the first well and the second well, such that two identical polynucleotides may be produced in the single well. In addition to providing redundancy against single strand degradation, two separate strands may be constructed from optically tagged or labeled nucleotides and illuminated and imaged in the same well immediately after completion to determine if one of the strands is broken during the writing process. One or more disclosed implementations can also include attaching additional nucleotides to the machine-written polynucleotide, the additional nucleotides representing information such as a hash value or checksum of the nucleotide sequences within the strand, a spacing value indicating the end of a particular data sequence, and/or a discrete column value separated from a subsequent nucleotide sequence. As already described, such information may be written to the polynucleotide, stored as an index in a separate storage medium (e.g., cache memory 1102), or both, and may be used to help verify the data integrity of the stored data.

As one example of the above, where a set of input data is encoded as a sequence AATTCCGG, one example of creating a corresponding hash value may include assigning each distinct nucleotide an arbitrary value (e.g., A=1, T=2, C=3, G=4), and then mathematically combining the sequence of values into a single value (e.g., by one or more of addition, multiplication, or other mathematical operations). Additional values may also be combined into the hash, such as a value indicating the length of the sequence (e.g., 8 in the example above). Continuing with the above example, the resulting hash value will depend on the mathematical operation used to generate the hash value.

For example, the resulting hash value may be 28 (e.g., the result of adding the sequence values and the sequence length), 4608 (e.g., the result of multiplying the sequence values and the sequence length), or 132 (e.g., the result of alternately multiplying and adding the sequence values, followed by adding the sequence length). Different approaches will provide hash values with a variable number of possible inputs, which can be used to verify the sequence of input nucleotides with different confidence levels. As an example, using the quaternary (base-4) mapping A=0, T=1, C=2, and G=3, the hash values of the above example may themselves be encoded as the nucleotide sequences TGA for 28, TACAAAA for 4608, and CATA for 132.
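The three example hash values can be reproduced with the following Python sketch, which also re-encodes each value as nucleotides in base 4. The function names are hypothetical, and the alternating variant assumes that the operations start with a multiplication, since that ordering is the one that yields 132 for AATTCCGG.

```python
VALUE = {"A": 1, "T": 2, "C": 3, "G": 4}   # arbitrary values for hashing
DIGITS = "ATCG"                            # base-4 digits: A=0, T=1, C=2, G=3

def hash_add(seq):
    """Sum of the nucleotide values plus the sequence length."""
    return sum(VALUE[b] for b in seq) + len(seq)

def hash_multiply(seq):
    """Product of the nucleotide values times the sequence length."""
    product = 1
    for b in seq:
        product *= VALUE[b]
    return product * len(seq)

def hash_alternate(seq):
    """Alternately multiply and add the values, then add the length."""
    values = [VALUE[b] for b in seq]
    acc = values[0]
    for i, v in enumerate(values[1:]):
        acc = acc * v if i % 2 == 0 else acc + v
    return acc + len(seq)

def to_nucleotides(n):
    """Write a non-negative integer in base 4 using nucleotide digits."""
    if n == 0:
        return DIGITS[0]
    out = []
    while n:
        n, r = divmod(n, 4)
        out.append(DIGITS[r])
    return "".join(reversed(out))

for fn in (hash_add, hash_multiply, hash_alternate):
    h = fn("AATTCCGG")
    print(fn.__name__, h, to_nucleotides(h))
# hash_add 28 TGA
# hash_multiply 4608 TACAAAA
# hash_alternate 132 CATA
```

None of these simple hashes detects every error. For instance, the additive and multiplicative variants are unchanged when two nucleotides swap positions.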

In implementations where interval values are included in the sequence, the interval values may be paired with hash values or other post- or pre-sequence information, and may include any nucleotide sequence, such as TTTTTTT, that is unlikely to occur in the normal encoding of input data, so that it can be treated as an interval value rather than as encoded data when decoding sequencing data.

FIG. 19 depicts a flowchart of a process 1240 that may be performed to provide high speed data writing and reading with a storage device. The process of fig. 19 may be performed using storage device 1200, storage device 1210, or another storage device that supports per-well activation for synthesis and sequencing. When an input request is received (block 1242), the DNA storage system 1300 may segment (block 1244) the data associated with the input request into multiple portions (e.g., two or more). Instead of writing the entire input into a single well, the system may synthesize and write the first portion into a first well (block 1246) and, in parallel, synthesize and write the second portion into a second well (block 1248). In the case where the first and second wells can be synthesized separately in this way without affecting the writing speed of the other well, the result is that the input data can be completely written to the DNA storage at approximately twice the speed relative to writing the entire input to a single well. Similarly, when an output request is received (block 1250), the DNA storage system 1300 may sequence (block 1252) and read the second portion from the second well, and in parallel, sequence (block 1254) and read the first portion from the first well. The two portions may then be recombined and a complete output may be provided (block 1256). Also in this case, the output data can be read from the DNA storage at about twice the speed as compared to reading the data from a single well.
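The split, parallel write, parallel read, and recombine steps of FIG. 19 can be sketched as follows. This is a simplified illustration: wells are modeled as dictionary entries, Python threads stand in for wells that can be synthesized and sequenced independently, and the names `split_input`, `write_striped`, and `read_striped` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

def split_input(data: bytes, parts: int) -> List[bytes]:
    """Divide the input into `parts` contiguous portions (block 1244)."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(parts)]

def write_striped(wells: Dict[str, bytes], well_ids: List[str], data: bytes) -> None:
    """Write one portion per well in parallel (blocks 1246 and 1248)."""
    portions = split_input(data, len(well_ids))

    def write_one(pair):
        well_id, portion = pair
        wells[well_id] = portion

    with ThreadPoolExecutor(max_workers=len(well_ids)) as pool:
        list(pool.map(write_one, zip(well_ids, portions)))

def read_striped(wells: Dict[str, bytes], well_ids: List[str]) -> bytes:
    """Read every portion in parallel and recombine (blocks 1252 to 1256)."""
    with ThreadPoolExecutor(max_workers=len(well_ids)) as pool:
        portions = list(pool.map(wells.__getitem__, well_ids))  # order preserved
    return b"".join(portions)

wells: Dict[str, bytes] = {}
write_striped(wells, ["well_1", "well_2"], b"abcdefg")
restored = read_striped(wells, ["well_1", "well_2"])
print(wells, restored)
```

The well identifiers must be supplied in the same order for reading as for writing, which is one reason an index of portion locations would need to be kept alongside the stored data.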

The multi-volume configuration disclosed herein may also be implemented to provide simultaneous read and write functionality, even without the per-well activation feature. By way of example, FIG. 20 depicts a flow diagram of a process 1260 that may be performed to provide simultaneous reads and writes where multiple separate volumes (e.g., flowcells) are available, such as with storage device 1200 and storage device 1210. When the DNA storage system 1300 receives an input request (block 1262), the system will default to synthesizing (block 1264) and writing data to each volume in clone mode so that, absent unexpected synthesis errors, the individual wells on each individual volume (e.g., first flowcell 1204 and second flowcell 1206) contain the same polynucleotide after synthesis.

If a request to read output from the storage device 1200 is received (block 1266), the DNA storage system 1300 will switch (block 1268) to managing multiple volumes in a hybrid mode, where synthesis (block 1270) continues uninterrupted on a first volume (e.g., the first flowcell 1204), but sequencing (block 1272) begins in order to read the requested output from a second volume (e.g., the second flowcell 1206). The DNA storage system 1300 may continue in the hybrid mode for a period of time during which input may be continuously written to the first volume and output may be continuously read from the second volume, as long as the requested data was present on the second volume before the switch to the hybrid mode. Over time, the states of the first and second volumes will gradually diverge, as the first volume will increasingly contain data that has not been cloned to the second volume.

To address this divergence, the DNA storage system 1300 may evaluate several factors to determine when to return to the clone mode. As an example, where all queued input requests have completed and there are no pending operations to synthesize and write (block 1274) data to the first volume, the system may begin sequencing in clone mode (block 1276) and reading data from multiple volumes. In this way, if an output request is received for data that exists within the divergent portion of the data stored by the first volume and that has not yet been stored by the second volume, the data may be read from the first volume without interruption. It should be noted that, for divergent data, a common file index or well index may be shared by the multiple volumes and maintained to indicate the availability of wells on the multiple volumes. In other words, when each volume is full, the aggregate data they store should be the same, although data written to the well at a given X-Y coordinate on the first volume need not be written to the well at the same X-Y coordinate on the second volume. Where an implementation does require that well locations correspond from volume to volume, the shared file and well index may be used to enforce such behavior.

As another example, where all queued output requests are complete and there are no pending operations to sequence and read data from the second volume (block 1278), the system may return to synthesizing in clone mode (block 1280) and writing the data to multiple volumes, ignoring the divergent portions of the data for the time being. Later, when the demand on the multiple volumes is minimal, the DNA storage system 1300 may synthesize (block 1282) and write the divergent data to the second volume to return it to a true clone of the first volume. In some implementations, the synthesis of the divergent data may be performed by reading the divergent data from the first volume and then writing it to the second volume (block 1282).

In some implementations, synthesis (block 1282) may be performed by in situ cloning (e.g., clonal amplification) of the polynucleotides of the divergent data in the wells of the first volume where they are stored, and then transporting the cloned strands directly into the corresponding wells of the second volume where they are bound, as already described. In such an implementation, the transfer of cloned strands between volumes may be achieved by a fluidic connection between each well in the first volume and each corresponding well in the second volume, such that a fluid flow may be used to transfer the cloned strands directly from the wells of the first volume to the corresponding wells of the second volume via the fluidic connection.

As another example, the DNA storage system 1300 may be configured with a threshold indicating an allowed time period or an allowed amount of data divergence. When the time or other threshold is exceeded (block 1284), whether due to an extended period in the hybrid mode or a large number of writes in the hybrid mode, the DNA storage system 1300 may inhibit reading data from the multiple volumes for a period of time and begin synthesizing (block 1282) and writing the divergent data to the second volume until it catches up.
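The clone and hybrid modes of FIG. 20 can be sketched as a small state machine. This is a simplified model rather than the described system: volumes are dictionaries, the divergence threshold counts wells instead of measuring time, reads of divergent wells simply fall back to the first volume, and the class and method names are hypothetical.

```python
class MirroredVolumes:
    """Two volumes managed in clone mode or hybrid mode (illustrative only)."""

    def __init__(self, max_divergent_wells: int):
        self.first = {}            # volume that keeps accepting writes
        self.second = {}           # volume that serves reads in hybrid mode
        self.mode = "clone"
        self.divergent = set()     # wells written only to the first volume
        self.max_divergent_wells = max_divergent_wells

    def write(self, well: str, sequence: str) -> None:
        self.first[well] = sequence                  # blocks 1264 / 1270
        if self.mode == "clone":
            self.second[well] = sequence
            return
        self.divergent.add(well)
        if len(self.divergent) > self.max_divergent_wells:   # block 1284
            self.resynchronize()

    def start_reading(self) -> None:
        self.mode = "hybrid"                         # block 1268

    def read(self, well: str) -> str:
        if well in self.divergent:
            return self.first[well]                  # divergent data lives only here
        return self.second[well]                     # block 1272

    def resynchronize(self) -> None:
        for well in self.divergent:                  # block 1282
            self.second[well] = self.first[well]
        self.divergent.clear()
        self.mode = "clone"

vols = MirroredVolumes(max_divergent_wells=2)
vols.write("w1", "AATT")          # clone mode: written to both volumes
vols.start_reading()              # switch to hybrid mode
vols.write("w2", "CCGG")          # only the first volume receives this
print(vols.mode, sorted(vols.second), sorted(vols.divergent))
```

Writing two further wells in hybrid mode would exceed the threshold of two divergent wells, which triggers `resynchronize` and returns the pair to clone mode.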

While the process of FIG. 20 has been described in the context of two volumes, it should be understood that three or more volumes may provide significant advantages in maintaining data redundancy while also reducing the likelihood of excessive divergence of data for a particular volume. As an example, when managing three or more volumes according to FIG. 20, the second and third volumes may alternate between read and write modes in order to reduce the amount of divergent data on any single volume and to make the most recently written data available for reading. For example, if the third volume switches to read mode three hours after the second volume switched to read mode, allowing the second volume to switch back to write mode with the first volume, the immediate result may be that the divergent data of the second volume is limited to three hours of written input, and the third volume makes an additional three hours of written input available for reading compared to the data available on the second volume.

Data error risk mitigation

In some implementations, a system configured to encode data in the form of a nucleotide sequence (e.g., machine-written DNA) and to read the data therefrom can include features to mitigate the risk of errors in the data that may result from phasing and/or pre-phasing during writing and/or reading. For example, in some implementations, when reading data previously written as a nucleotide sequence, the system can compare the data read from the nucleotide sequence to a quality control value stored in a non-nucleotide memory, and then use this comparison as a basis for determining whether the nucleotide sequence (or, in an implementation in which sequences are stored in wells as addressable elements, the well storing the nucleotide sequence) should be considered to contain corrupted data. In implementations where this type of comparison occurs, it may be performed in a variety of ways. For example, in some implementations, the comparison may be performed by calculating a value (e.g., a checksum, a hash value, or some other type of error detection value) based on the read nucleotide sequence, and then comparing that value to a value that was previously calculated and stored in non-nucleotide memory when the data that should have been encoded in the nucleotide sequence was written. Similarly, in some implementations, when a nucleotide sequence is written to a DNA storage device, the data that should be encoded in the sequence can be stored in a non-nucleotide memory, and the comparison can be a direct comparison of the data read from the nucleotide sequence to the data stored in the non-nucleotide memory. In some implementations, a combination approach may also be used.

In some implementations, a combination of the risk mitigation methods described in the preceding paragraphs may also be used. To illustrate, consider fig. 22, which depicts a process of reading and writing data that may be performed in some implementations. Initially, a method as shown in fig. 22 may include receiving a data write request, as shown at block 1701 of fig. 22. This may be, for example, a request submitted by a user via the user interface 130 to store a file in the DNA storage device, or an automatically generated request to store data generated as process output, as an internal backup, or for other purposes. After receiving a data write request (block 1701), the method may include generating one or more commands to write data, as shown at block 1702 of FIG. 22. This may include, for example, determining how the data to be written should be represented given the nature of the DNA storage device (e.g., the length of the nucleotide sequence that may be written, the encoding scheme used for the data, etc.), determining the specific locations in the DNA storage device where the nucleotide sequence encoding the data should be synthesized, and generating commands for synthesizing those sequences (e.g., commands for activating electrodes corresponding to particular wells 630 in the flow cell 600, 601 so that bases are appropriately added to strands in those wells 630). These commands may then be executed to write data to the DNA storage device, as shown in block 1703 of fig. 22.

In some implementations, after data is written (block 1703) to the DNA storage device, it can be read automatically (i.e., where the polynucleotide in which the data was written can be sequenced directly, or by copying the polynucleotide and sequencing the copy), as shown in block 1704 of fig. 22. The read data may then be compared to the data that was the subject of the original write request, as shown at block 1705 of FIG. 22. For example, in some implementations, when a write request is received (block 1701), the data to be written will be automatically maintained in the non-nucleotide memory of the system (e.g., in the RAM of the system controller 120) until an acknowledgement of a successful write is received. In these types of implementations, the comparison of data read from the DNA storage device with data that was the subject of the original request (block 1705) may be performed by a bit-level comparison of data from the write request stored in a non-nucleotide memory of the system with data read by sequencing information written to the DNA storage device.

In an implementation performing the process shown in FIG. 22, if the data read from the DNA storage device (block 1704) matches the data in the non-nucleotide memory of the system, a fingerprint of the data may be stored and the write operation may be ended, as shown in block 1706 of FIG. 22. In some implementations, storing the fingerprint of the data may include calculating a checksum, a cyclic redundancy check value, a hash value, or other similar type of value that may be used to detect changes in the encoded data, which is then stored in a non-volatile memory, such as a disk drive, solid state memory, optical disk, or other type of storage element. In some implementations, the fingerprint may also reflect information about how the data was encoded, rather than the data itself. For example, if the nucleotide sequence consists of bases A, C, T, and G, then the fingerprint may consist of the number of instances of A, C, T, and G in the sequence, rather than reflecting the data encoded using these bases. Other variations are also possible (e.g., fingerprints showing the parity of the number of each type of base included in the sequence, or fingerprints based on both the data and how the data is encoded) and may be used in various implementations. In some implementations, fingerprints can be appended to the ends of the DNA sequence. Ending the write operation may include sending a message to the source of the write request indicating that the data has been successfully written. Additionally, in some implementations, ending the write operation may include clearing data included in the write request from one or more locations in memory so that those locations may be used to store other information (e.g., new data that may be included in future write requests).
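Three of the fingerprint variants mentioned above can be sketched in Python as follows. The function names are hypothetical, and CRC-32 from the standard `zlib` module is used only as one example of a cyclic redundancy check value.

```python
import zlib
from collections import Counter

def crc_fingerprint(sequence: str) -> int:
    """Cyclic redundancy check value over the nucleotide string."""
    return zlib.crc32(sequence.encode("ascii"))

def count_fingerprint(sequence: str) -> tuple:
    """Number of instances of A, C, T, and G, in that order."""
    counts = Counter(sequence)
    return tuple(counts[base] for base in "ACTG")

def parity_fingerprint(sequence: str) -> tuple:
    """Parity (0 = even, 1 = odd) of the count of each base."""
    return tuple(n % 2 for n in count_fingerprint(sequence))

original = "AATTCCGG"
truncated = "AATTCCG"     # one base lost
swapped = "ATATCCGG"      # two adjacent bases exchanged

print(count_fingerprint(original), parity_fingerprint(truncated))
print(count_fingerprint(swapped) == count_fingerprint(original))
print(crc_fingerprint(swapped) == crc_fingerprint(original))
```

The count-based and parity-based fingerprints are compact but cannot detect reordered bases, whereas the CRC value changes for the swapped sequence. This illustrates the trade-off between fingerprint size and detection strength.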

In some implementations, if the data read from the DNA storage device (block 1704) does not match the data in the non-nucleotide memory of the system, the data may be deemed to have been corrupted (e.g., not written correctly), as shown in block 1707 of fig. 22. In some implementations, this treatment of data as corrupt may trigger various error handling procedures. For example, in some implementations, if the data read from the DNA storage device (block 1704) does not match the data in the non-nucleotide memory of the system, a process such as that shown in FIG. 22 can return to writing the data (block 1703) to the DNA storage device so that the data can be recorded regardless of what problem caused the previous error. As another approach, in some implementations, if the data read from the DNA storage device (block 1704) does not match the data in the non-nucleotide memory of the system, the location with the unmatched data may be identified as "bad" (e.g., the corrupt data flag in the index may be flipped for that location), and the correct data may be subsequently written (e.g., during future low utilization) to the location of the corrupt data. As another approach, in some implementations, if the data read from the DNA storage device (block 1704) does not match the data in the non-nucleotide memory of the system, an error message (e.g., a function return code) may be generated indicating that the write operation encountered a problem that may require some other aspect or user remedy of the system (e.g., an error code may be generated identifying the location of the "corrupted" data as a location where hardware troubleshooting (e.g., replacement of the electrodes 640) may be appropriate). In some implementations, a combination approach may also be used. 
For example, in some implementations, if the data read from the DNA storage device (block 1704) does not match the data in the non-nucleotide memory of the system, the writing of the data to the DNA storage device may be retried (block 1703) and an error message may be generated so that the user is made aware of an early problem that he or she may wish to resolve before it develops into an unrecoverable failure. Other methods will be apparent to those of ordinary skill in the art, may be implemented without undue experimentation, and may be used in other implementations. Thus, the description of how an implementation performing a method such as that shown in FIG. 22 treats data as corrupt (block 1707) and how such an implementation potentially responds to such corruption should be understood as illustrative only and should not be taken as limiting.
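The write path of FIG. 22 (blocks 1703 to 1707), combined with a retry-based error handling procedure, can be sketched as follows. Everything here is a hypothetical stand-in: `FlakyDevice` simulates a device whose first write drops a base, the index is a plain dictionary, and CRC-32 is an arbitrary fingerprint choice.

```python
import zlib

class FlakyDevice:
    """Simulated DNA storage device whose first write loses the last base."""

    def __init__(self):
        self.wells = {}
        self.writes = 0

    def write(self, location: str, sequence: str) -> None:
        self.writes += 1
        self.wells[location] = sequence[:-1] if self.writes == 1 else sequence

    def read(self, location: str) -> str:
        return self.wells[location]

def verified_write(device, index: dict, location: str, sequence: str,
                   max_attempts: int = 3) -> int:
    """Write, read back, and compare; retry on mismatch. Returns attempts used."""
    for attempt in range(1, max_attempts + 1):
        device.write(location, sequence)                  # block 1703
        if device.read(location) == sequence:             # blocks 1704 and 1705
            index[location] = {                           # block 1706
                "fingerprint": zlib.crc32(sequence.encode("ascii")),
                "corrupt": False,
            }
            return attempt
    index[location] = {"fingerprint": None, "corrupt": True}   # block 1707
    raise IOError(f"write to {location} failed after {max_attempts} attempts")

device = FlakyDevice()
index = {}
attempts = verified_write(device, index, "well_7", "AATTCCGG")
print(attempts, index["well_7"]["corrupt"])
```

In this sketch the data to be written stays in ordinary memory (the `sequence` argument) until the read-back matches, mirroring the practice of retaining the write request in non-nucleotide memory until a successful write is confirmed.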

In some implementations, the system 100 can read the nucleotide sequence, for example, by sequencing it as described previously, as shown in block 1708 of FIG. 22. This may be done, for example, in response to a read request (e.g., a request to retrieve a previously stored file), as part of a data quality check (e.g., in some implementations, data stored in the DNA storage device may be periodically read to identify whether it has degraded while in storage), or for other suitable reasons. In some implementations, after the nucleotide sequence is read (block 1708), it may undergo additional risk mitigation steps, such as computing a fingerprint based on the read data, as shown at block 1709 of FIG. 22. This fingerprint may then be compared to another fingerprint to which it is expected to correspond (e.g., a fingerprint that was automatically created and stored as part of the write process (block 1706)) to determine whether the retrieved data should be considered corrupt and/or whether an error occurred in the read process, as shown in block 1710 of FIG. 22. In some implementations, if there is a match, the retrieved data may be returned (e.g., sent as a return value to the process that issued the read request), as shown at block 1711 of FIG. 22. Alternatively, in some implementations, the read data may not be returned, but rather some code may be provided (e.g., a code indicating that no errors were found if the data was read as part of a data quality check (block 1708)). Similarly, if the comparison (block 1710) finds that the fingerprints do not match, the data may be considered corrupt (block 1707) and, in some implementations, various error handling procedures such as those described previously may be initiated.
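The read path of FIG. 22 (blocks 1708 to 1711) can be sketched in the same style. The dictionaries standing in for the wells and for the stored fingerprints, the exception type, and the function name are all hypothetical, and CRC-32 is again an arbitrary fingerprint choice.

```python
import zlib

class CorruptDataError(Exception):
    """Raised when a read-back fingerprint does not match the stored one."""

def verified_read(wells: dict, fingerprints: dict, location: str) -> str:
    sequence = wells[location]                           # block 1708
    computed = zlib.crc32(sequence.encode("ascii"))      # block 1709
    if computed != fingerprints[location]:               # block 1710
        raise CorruptDataError(location)                 # block 1707
    return sequence                                      # block 1711

wells = {"well_7": "AATTCCGG"}
fingerprints = {"well_7": zlib.crc32(b"AATTCCGG")}       # stored at write time
first_read = verified_read(wells, fingerprints, "well_7")

wells["well_7"] = "AATTCCGC"                             # simulate degradation
try:
    verified_read(wells, fingerprints, "well_7")
    degraded_detected = False
except CorruptDataError:
    degraded_detected = True

print(first_read, degraded_detected)
```

For a periodic data quality check, a caller could discard the returned sequence and report only whether an exception was raised.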

In some implementations, other types of risk mitigation measures may be taken, and/or variations may exist in how risk mitigation measures such as those described above are put into practice. For example, in some implementations, when data is written to a location (e.g., a well in a flowcell), a mirror copy may also be automatically written to a second location (e.g., a second well in the same flowcell, or a well in a different flowcell). In some such implementations, the presence of one or more mirror copies may be used to improve system performance, for example, by supporting parallel operations. For example, in some implementations, a DNA storage device may include two flow cells (e.g., two flow cells 601), a first flow cell for read operations and a second flow cell for write operations. In this case, if two different users want to read and write data stored in a particular location at the same time, the read request can be satisfied by sequencing the copy of the data stored in the first flow cell 601, while the write request is fulfilled by writing a new polynucleotide in place in the second flow cell 601. Subsequently, once the requests have been processed (e.g., during a period of low activity), the first flow cell 601 may be resynchronized with the second flow cell 601, thereby ensuring that any data that has been written to the second flow cell 601 can be read using the first flow cell 601 when the next read request is issued.

As another example, in some implementations, when data is to be written, in addition to writing the data, a system can be implemented to write the data along with redundancy values that can be used to recreate a portion of the data in the event of data loss. To illustrate, consider table 2 below, which provides an example of redundancy values that may be generated for data stored in the form of four polynucleotides, each storing four bits of data.

Data sequence 1	Data sequence 2	Data sequence 3	Data sequence 4		Redundant sequence
1	1	1	1	XOR	0
1	1	1	0	XOR	1
1	1	0	0	XOR	0
1	0	0	0	XOR	1

Table 2: an example of a method of storing data using a redundant sequence that allows a portion of the data to be recreated when the data is lost.

In some implementations, the 16-bit data may be decomposed into four-bit sequences, and a fifth four-bit sequence may be created by applying a logical exclusive-or operator to the bits from the first four sequences. These five sequences can then be stored in five polynucleotides at five different locations in the DNA storage device. Then, in the event that data in a location is found to be corrupted, the data can be recreated by applying the exclusive or operator to the remaining four sequences and encoding/storing the result as a new polynucleotide in the location where the corrupted data was previously saved.

Further, in some implementations, when encoding a polynucleotide to store data in a DNA storage device, the polynucleotide may be synthesized to include error detection data, such as a fingerprint of the type previously described, in addition to the stored data. For example, in some implementations, when synthesizing a polynucleotide, a parity bit may be calculated from the data encoded by the polynucleotide, and a methylated (or unmethylated) base may be added to the polynucleotide to represent the parity bit. Similarly, in some implementations, other types of information-retaining modifications (i.e., modifications that do not interfere with the bases being read, but that allow the modified bases to be distinguished from unmodified bases) can also or alternatively be used, such as the addition of synthetic bases or sequences encoding parity or other error-checking information. In some implementations that include this type of error detection data in the polynucleotide, the data may be used to identify and remedy errors. For example, in some implementations, a mirror copy may be stored at another location whenever a polynucleotide is stored. Then, when the polynucleotide is read, both the polynucleotide and its mirror copy can be read and compared to each other. If the comparison indicates that the polynucleotide and its mirror copy are not the same, the parity bit (or other checksum) may be recalculated for each strand, and any strand whose recalculated value matches its methylation may be considered correct, while the other strand may be considered damaged and replaced (e.g., as part of an error handling routine as previously described).
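The mirror-plus-parity check described above can be sketched as follows. This is a simplified model under stated assumptions: a strand is represented as a dictionary holding its data bits and the parity bit that the methylated (or unmethylated) base would represent, even parity is assumed, and the names `make_strand` and `resolve` are hypothetical.

```python
def parity_bit(bits):
    """Even parity (an assumption): 1 if the number of 1 bits is odd, else 0."""
    return sum(bits) % 2

def make_strand(bits):
    # Data bits plus the parity value that the added base would represent.
    return {"data": list(bits), "parity": parity_bit(bits)}

def is_consistent(strand):
    # Recalculate the parity from the data as read and compare with the stored value.
    return parity_bit(strand["data"]) == strand["parity"]

def resolve(primary, mirror):
    """Return the strand to trust, or None if this check cannot decide."""
    if primary["data"] == mirror["data"]:
        return primary
    ok_primary, ok_mirror = is_consistent(primary), is_consistent(mirror)
    if ok_primary and not ok_mirror:
        return primary   # mirror is treated as damaged and would be replaced
    if ok_mirror and not ok_primary:
        return mirror    # primary is treated as damaged and would be replaced
    return None          # both or neither consistent: parity alone is inconclusive


primary = make_strand([1, 0, 1, 1])
mirror = make_strand([1, 0, 1, 1])
mirror["data"][0] = 0            # simulate a single-bit read error in the mirror
trusted = resolve(primary, mirror)
```

A single parity bit only detects an odd number of flipped bits, so the `None` branch matters: when both strands still pass the check despite differing, a stronger check would be needed to decide between them.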

As another example, in some implementations, to identify whether data should be considered corrupted, the sequence to be tested can be examined to determine whether it binds to one of a plurality of error-detecting polynucleotides. For example, in the case of a nucleotide sequence ending in a fingerprint portion, when the sequence is read, a new fingerprint may be calculated based on the data read from the sequence. An error-checking strand that encodes the new fingerprint in reverse order may then be synthesized. If the fingerprint at the end of the original nucleotide sequence does not bind to the newly synthesized fingerprint strand, this may be treated as indicating that the data has been corrupted and should be replaced.
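The logic of the binding test can be modeled in software, with the caveat that the real check is a physical hybridization. In this sketch the fingerprint function (a CRC-32 written as 16 bases, two bits per base) is purely hypothetical, since the specification's fingerprint is defined elsewhere, and binding is idealized as perfect reverse complementarity, which is an assumption about how the error-checking strand is constructed.

```python
import zlib

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def fingerprint(data_bases, length=16):
    """Hypothetical fingerprint: CRC-32 of the data written as bases, 2 bits each."""
    crc = zlib.crc32(data_bases.encode("ascii"))
    return "".join(BASES[(crc >> (2 * i)) & 3] for i in range(length))

def check_strand(fp):
    """Reverse complement of a fingerprint: the strand assumed to hybridize to it."""
    return "".join(COMPLEMENT[base] for base in reversed(fp))

def binds(strand_a, strand_b):
    """Idealized binding: true only for perfect reverse complements."""
    return strand_a == check_strand(strand_b)

def is_corrupted(sequence, fp_length=16):
    # Split the read into data and the fingerprint stored at its end.
    data, stored_fp = sequence[:-fp_length], sequence[-fp_length:]
    # Synthesize the probe from a fingerprint recalculated over the data as read.
    probe = check_strand(fingerprint(data, fp_length))
    return not binds(stored_fp, probe)


data = "ACGTTGCAACGT"
good = data + fingerprint(data)
bad = "C" + good[1:]             # one substituted base in the data portion
results = (is_corrupted(good), is_corrupted(bad))
```

Real hybridization tolerates some mismatches, so a physical implementation would detect fewer errors than the exact string comparison used here.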

When sequencing on a CMOS sequencing chip, it may be useful to apply various image correction techniques (e.g., to correct optical or spectral crosstalk between different pixels), and the correction process may vary from chip to chip. A calibration method may use spatially controlled, diverse flow cell training data for base-call training, especially for optical systems in which optical crosstalk introduces errors that require internal calibration. For example, a flow cell with a smaller pitch may exhibit distortions near each well, which may be masked using known sequences. The calibration method may include an onboard quality control (QC) system based on writing a predetermined sequence for different DNA reading systems. The predetermined sequence may be contained within or near a well and may provide a correction factor for the base calls. These methods provide crosstalk correction for individual wells by creating a known ground truth, or truth table. The known sequences may be placed at predetermined positions on the flow cell for synchronous sequencers and/or for possible random access. These methods may allow for field calibration.

IX. Miscellaneous

All references, including patents, patent applications, and articles, are expressly incorporated herein by reference in their entirety.

The previous description is provided to enable any person skilled in the art to practice the various configurations described herein. While the subject technology has been described in detail with reference to various figures and configurations, it should be understood that these are for illustration purposes only and should not be taken as limiting the scope of the subject technology.

As used herein, an element or step recited in the singular and preceded by the word "a" or "an" should be understood as not excluding plural said elements or steps, unless such exclusion is explicitly recited. Furthermore, references to "one implementation" are not intended to be interpreted as excluding the existence of additional implementations that also incorporate the recited features. In addition, unless explicitly stated to the contrary, implementations "comprising" or "having" an element or a plurality of elements having a particular property may include additional elements, whether or not they have that property.

The terms "substantially" and "approximately" are used throughout this specification to describe and account for small fluctuations due to, for example, variations in processing. For example, they may refer to less than or equal to ± 5%, such as less than or equal to ± 2%, such as less than or equal to ± 1%, such as less than or equal to ± 0.5%, such as less than or equal to ± 0.2%, such as less than or equal to ± 0.1%, such as less than or equal to ± 0.05%.

Many other ways of implementing the subject technology are possible. The various functions and elements described herein may be divided differently than those shown without departing from the scope of the subject technology. Various modifications to these implementations will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other implementations. Accordingly, many changes and modifications may be made to the subject technology by one of ordinary skill in the art without departing from the scope of the subject technology. For example, a different number of given modules or units may be used, a different type or types of given modules or units may be used, given modules or units may be added, or given modules or units may be omitted.

Underlined and/or italicized headings and subheadings are used for convenience only, do not limit the subject technology, and are not to be relied on in interpreting the description of the subject technology. All structural and functional equivalents to the elements of the various implementations described throughout this disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the subject technology. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the above description.

It should be appreciated that all combinations of the foregoing concepts and additional concepts discussed in greater detail below (assuming such concepts are not mutually inconsistent) are contemplated as part of the inventive subject matter disclosed herein. In particular, all combinations of claimed subject matter appearing at the end of this disclosure are contemplated as being part of the inventive subject matter disclosed herein.
