Data storage layout

Document No.: 1845425 Publication date: 2021-11-16

Reading note: This technology, Data storage layout, was designed by J. M. Bent, N. Danilov, K. K. Claffey, and R. B. Das on 2021-05-11. The application discloses a data storage layout: a composite layout for storing one or more sections of a data object in a first storage system and one or more sections of the data object in a second, different storage system. The first storage system may be configured to efficiently store small chunks of data, such as, for example, chunks of data that are smaller than the addressable block size of the storage devices used by the storage system.

1. A system, comprising:

one or more data storage devices for storing one or more data objects; and

a computing device comprising one or more processors and operably coupled to the one or more data storage devices, the computing device configured to:

maintain, using the one or more data storage devices, a key-value storage system and another storage system different from the key-value storage system; and

provide, for each data object stored on the one or more data storage devices, a composite layout comprising mapping information linking one or more sections of each data object to one or more locations on the one or more data storage devices in which the one or more sections of the data object are stored, wherein at least one of the one or more sections of each data object is stored in the key-value storage system and at least another one of the one or more sections of each data object is stored in the other storage system.

2. The system of claim 1, wherein the at least one section stored in the key-value storage system comprises a first section of each data object.

3. The system of claim 1, wherein the at least one section stored in the key-value storage system is equal to or less than 4096 bytes.

4. The system of claim 1, wherein the key-value storage system uses a b-tree to store key-value pairs.

5. A method, comprising:

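maintaining, using one or more data storage devices to store one or more data objects, a key-value storage system and another storage system different from the key-value storage system; and

providing, for each data object stored on the one or more data storage devices, a composite layout comprising mapping information linking one or more sections of each data object to one or more locations on the one or more data storage devices in which the one or more sections of the data object are stored, wherein at least one of the one or more sections of each data object is stored in the key-value storage system and at least another one of the one or more sections of each data object is stored in the other storage system.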

6. The method of claim 5, wherein the at least one section stored in the key-value storage system comprises a first section of each data object.

7. The method of claim 5, wherein the at least one section stored in the key-value storage system is equal to or less than 4096 bytes.

8. The method of claim 5, wherein the key-value storage system uses a b-tree to store key-value pairs.

9. A composite layout corresponding to a data object describing one or more locations of one or more sections of the data object on one or more storage devices, comprising:

a plurality of sub-layouts ranked from lowest priority to highest priority, each sub-layout comprising mapping information linking one or more sections of the data objects to one or more locations on the storage device in which the one or more sections of the data objects are stored on the storage device, wherein at least one of the one or more sections of each data object is stored in a key-value storage system and at least another one of the one or more sections of each data object is stored in another storage system different from the key-value storage system.

10. The composite layout of claim 9 wherein the mapping information of at least one of the plurality of sub-layouts indicates whether one or more sections of the data object are stored in the key-value storage system.

Disclosure of Invention

An illustrative system may comprise: one or more data storage devices for storing one or more data objects; and a computing device comprising one or more processors and operatively coupled to the one or more data storage devices. The computing device can be configured to maintain, using the one or more data storage devices, a key-value storage system and another storage system different from the key-value storage system, and to provide a composite layout for each data object stored on the one or more data storage devices. The composite layout may include mapping information linking one or more sections of each data object to one or more locations on the one or more data storage devices in which the one or more sections of the data object are stored. At least one of the one or more sections of each data object may be stored in the key-value storage system, and at least another one of the one or more sections of each data object may be stored in the other storage system.

An illustrative method may comprise maintaining, using one or more data storage devices to store one or more data objects, a key-value storage system and another storage system different from the key-value storage system. The illustrative method may further include providing a composite layout for each data object stored on the one or more data storage devices. The composite layout may include mapping information linking one or more sections of each data object to one or more locations on the one or more data storage devices in which the one or more sections of the data object are stored. At least one of the one or more sections of each data object may be stored in the key-value storage system, and at least another one of the one or more sections of each data object may be stored in the other storage system.

An illustrative composite layout, corresponding to a data object and describing one or more locations of one or more sections of the data object on one or more storage devices, may include a plurality of sub-layouts ranked from lowest priority to highest priority, and each sub-layout may include mapping information linking the one or more sections of the data object to one or more locations on a storage device in which the one or more sections of the data object are stored. At least one of the one or more sections of each data object can be stored in a key-value storage system, and at least another one of the one or more sections of each data object can be stored in another storage system different from the key-value storage system.

The above summary is not intended to describe each embodiment or every implementation of the present disclosure. A more complete understanding will become apparent and appreciated by reference to the following detailed description and claims when taken in conjunction with the accompanying drawings. In other words, these and various other features and advantages will become apparent from a reading of the following detailed description.

Drawings

The disclosure may be more completely understood in consideration of the following detailed description of various embodiments of the disclosure in connection with the following drawings.

FIG. 1 is a block diagram of an exemplary system including a file system for storing data objects.

FIG. 2 is an illustration of a "simple" layout corresponding to data objects for use with an exemplary system such as, for example, that depicted in FIG. 1.

FIG. 3 is an illustration of an exemplary layout corresponding to data objects for use with an exemplary system such as, for example, that depicted in FIG. 1.

FIG. 4 is an illustration of an exemplary composite layout corresponding to data objects for use with an exemplary system such as, for example, that depicted in FIG. 1.

FIG. 5 is a flow chart of an illustrative write method using the illustrative composite layout of FIGS. 3-4.

FIG. 6 is a flow chart of an illustrative read method using the illustrative composite layout of FIGS. 3-4.

Detailed Description

The present disclosure relates to systems, methods, and processes for utilizing file system data location lookup in a dynamic environment. As further described herein, exemplary systems, methods, and processes can reduce computational complexity in describing file data locations in a dynamic environment and implement a range of layout-related file system features using a common description format and minimized code paths. In general, the systems, methods, and processes may utilize or include an exemplary composite layout, and/or mechanisms associated therewith, having a set of useful characteristics. For example, the illustrative composite layouts described herein may be configured to provide specific functionality for processing small files or small data objects in a space-efficient manner. As will be further described, one or more sections of each data object may be stored in a first storage system (such as a key-value storage system) while any remaining sections may be stored in a second storage system (such as a more traditional storage system).

In at least one embodiment, a first or initial section of each data object may be stored in a first storage system (such as a key-value storage system) while any remaining sections may be stored in a second storage system (such as a more traditional storage system). Any writes to or reads from the first section will be directed to the first storage system (e.g., the key-value storage system).

Further, for example, an illustrative composite layout may include or consist of a set of sub-layouts. The sub-layouts may occupy a particular ordered ranking in the composite layout or structure. A new write may be directed to the highest-ranked sub-layout, while a read may be directed to the highest-ranked sub-layout having a mapped section for the requested file range. In at least one embodiment, one of the sub-layouts may correspond to only the sections stored in the first storage system (e.g., the key-value storage system).

An exemplary system 10 for storing data objects is depicted in FIG. 1. The system 10 includes a host device 12 (such as, for example, a personal computer, server, etc.) and a data storage system 20. The host device 12 may be operatively coupled to the data storage system 20 to read data objects or files from the data storage system 20 and to write data objects or files to the data storage system 20. Although a single host device is depicted, it should be understood that the system 10 may include a plurality of host devices 12 operatively coupled to the data storage system 20. Additionally, the data storage system 20 itself may include one or more computing devices to provide the functionality provided by the data storage system 20. More specifically, one or more computing devices of the data storage system 20 may include one or more processors, processing circuitry, memory, or the like, configured to provide for reading and writing of one or more data objects (e.g., including files) from the data storage system 20 and one or more mechanisms and processes associated with the example composite layouts described herein. For example, the host device 12 may request data from a data object from the data storage system 20, and the data storage system 20 may return the requested data for the data object. Further, for example, the host device 12 may attempt to write data to a data object of the data storage system 20, and the data storage system 20 may facilitate writing data to the data object.

As shown, the data storage system 20 includes a plurality of data storage devices 22 for storing data objects. The data storage devices 22 may include any device and/or apparatus configured to store data (e.g., binary data, etc.). The data storage devices 22 may include, but are not necessarily limited to: solid state memory, hard disks, magnetic tape, optical disks, integrated circuits, volatile memory, non-volatile memory, and any combination thereof. Further, each data storage device 22 may be an array of storage devices, such as, for example, a RAID (redundant array of inexpensive disks) storage arrangement. Each data storage device 22 may be a server or a virtual server. It should be understood that the present disclosure is not limited to the system 10 depicted in FIG. 1, but rather, the system 10 is merely one illustrative configuration. For example, the data storage system 20 may include one or more of a local file system, a Storage Area Network (SAN) file system, a distributed file system, a parallel file system, a virtual file system, and/or combinations thereof.

The data storage system 20 may further be described as a system designed to provide computer applications on the hosts 12 with access to data stored on the data storage devices 22 in a logical, consistent manner. In addition, the data storage system 20 may be described as hiding the details of how data is stored on the data storage devices 22 from the host 12 and applications running on the host 12. For example, the data storage devices 22 may generally be block addressable in that data is addressed at a minimum granularity of one block, and multiple contiguous portions or chunks of data may define or form a section. A section may be defined as a portion of data within a data object or file. In other words, a section may be described as a range of bytes within a data object or file. The size of a particular section (e.g., 512 bytes long, 1024 kilobytes long, 4096 kilobytes long, etc.) may depend on the type and size of the data storage device 22. An application on the host 12 may request data from the data storage system 20, and the data storage system 20 may be responsible for seamlessly mapping between application logical sections within data objects and physical spaces on the data storage devices 22.

Existing file systems have used various methods to provide such mapping. For example, a file system may provide data locations on the data storage devices 22 via a lookup (e.g., a list of extents in an inode in the case of a local file system such as the EXT4 file system, or a set of object/server pairs in a distributed system such as the LUSTRE file system) or in a formulaic manner (e.g., parameters of a SWIFT ring), using mapping information or metadata according to the layout. These existing file systems may suffer from the assumption that the layout remains mostly static. For example, modifying some or all of the data layout may typically require either completely rewriting the lookup information or moving the data itself to accommodate the new parameters.

A layout may be defined as a description of a location in a file system where a particular set of data (e.g., a file or data object) is located. As described herein, a section may be defined as a portion of data within a data object or file. The term "FID" as used throughout this disclosure means a "file identifier" that can be used as a handle or descriptor to reference the layout of a particular file. For some types of layouts, the FID may point to some metadata describing the layout formula and the parameters of the formula for a series of sections in the file. In addition, other types of layout mapping functions (such as block bitmaps or extent lists) may also be used.

Depicted in FIG. 2 is an illustration of a simple layout 110 referenced as an FID corresponding to a data object or file 100 for use with an exemplary system. The data object or file 100 is represented graphically as a string of hexadecimal bytes. Layout 110 may correspond to data object 100 such that layout 110 extends from a left end representing a range of bytes within data object 100 to a right end, where the left end represents the beginning of data object 100 and the right end represents the end of data object 100. Layout 110 may include mapping information represented by shading in FIG. 2 that links individual sections 112 of data object 100 to one or more locations 114 on one or more storage devices 22. The shading of the mapping information within the layout 110 corresponds to the size and location of the sections within the data object 100. In other words, layout 110 may correspond to data object 100 in that layout 110 describes where various sections 112 of data object 100 are stored or located on one or more storage devices 22.

The sections 112 are graphically depicted using shading within the layout FID 110 of FIG. 2 to indicate which sections or ranges of the data object 100 include data, and thus which sections have corresponding mapping information in the layout 110. For example, a shaded section 112 or region of the layout 110 indicates that mapping information exists to link the section 112 to the location 114.

When data is to be written to or read from data object 100 (e.g., by host 12), the exemplary system may utilize layout functions to determine, from layout 110, the mapping information for writing data to or reading data from data object 100. If new data is to be written to a portion or section of data object 100, the layout function may determine, based on the mapping information within layout 110, the sections 112 on the one or more storage devices 22 (e.g., on which storage device 22, at which location within the storage device 22, etc.) in which those portions or sections of data object 100 reside, and may then overwrite such sections 112 of the data storage devices 22, or a portion thereof, with the new data. If data is to be read from portions or sections of the data object 100, the layout function may determine, based on mapping information within the layout 110, the sections 112 on the one or more storage devices 22 on which those portions or sections of the data object 100 reside, and may then read such sections 112 of the data storage devices 22, or portions thereof.
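The lookup performed by such a layout function can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the class names, the device names, and the numeric locations are hypothetical and chosen solely to show how a byte offset within a data object might be resolved to a mapped section, or to nothing if the range is unmapped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MappedSection:
    """One mapped byte range of a data object (hypothetical structure)."""
    offset: int    # first byte of the section within the data object
    length: int    # number of bytes in the section
    device: str    # hypothetical storage device name
    location: int  # hypothetical address on that device


class SimpleLayout:
    """Non-overlapping mapped sections for a single data object."""

    def __init__(self, sections: List[MappedSection]):
        self.sections = sorted(sections, key=lambda s: s.offset)

    def lookup(self, byte_offset: int) -> Optional[MappedSection]:
        """Return the section covering byte_offset, or None if unmapped."""
        for s in self.sections:
            if s.offset <= byte_offset < s.offset + s.length:
                return s
        return None


# Two mapped sections with an unmapped gap between them.
layout_110 = SimpleLayout([
    MappedSection(offset=0, length=4096, device="dev-a", location=81920),
    MappedSection(offset=8192, length=4096, device="dev-b", location=4096),
])
```

In this sketch a read or overwrite at byte 100 resolves to the section on "dev-a", while byte 5000 falls in an unmapped range and resolves to no section.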

The layout 110 shown with respect to FIG. 2 may be described as a "simple" layout because a "simple" layout may not include mapping information that links any of the sections to an alternate storage system (e.g., a key-value storage system). Rather, the layout 110 may include only mapping information linking each section 112 to various locations on the data storage devices 22, without the flexibility to choose among multiple storage systems or schemes, which may lead to storage issues for very small files.

It is desirable to store very small files, which may be common in machine learning and artificial intelligence applications, more efficiently. For example, in many file systems, it is often extremely inefficient to store a small file individually in a 4K (4096-byte) disk block because the file occupies the entire 4K disk block regardless of how small the file is. In other words, while a small file may be 100 bytes, in many file systems it may occupy the entire 4096 bytes of a 4K disk block. Additionally, if the file system creates an inode for each 100-byte file, the inode (which may be 1024 bytes long) will also occupy another 4K disk block. In one example, if 50% of the files stored in a 4K block storage system are 100 bytes, the system may lose up to 40 times the available capacity. Thus, for example, if a small portion of data is stored in an empty file object according to layout 110, even if that portion of data is small, the entire first section (such as a 4K block), which maps to various locations on the data storage devices 22, will be consumed.

To address these issues, the illustrative systems and methods may use a composite layout to store a first section (such as, for example, a first 4K block) of each file in a first storage system (such as, for example, a key-value storage system) for optimizing storage of small files, as will be further described herein with reference to FIGS. 3-4. While previous systems and methods may utilize 80 terabytes to store 1 terabyte of 100-byte files, for example, the illustrative systems and methods may utilize only 11 terabytes of space.
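The capacity figures can be approximated with a short calculation. The following sketch uses the sizes given in the description (4096-byte blocks, 100-byte files, 1024-byte inodes). The accounting itself is an assumption made for illustration: the block-based case rounds both the file data and its inode up to whole blocks, and the packed case assumes the data and inode are stored back to back with no padding. The description does not state how its own figures were derived.

```python
import math

BLOCK = 4096       # bytes per disk block ("4K"), from the description
FILE_SIZE = 100    # bytes per small file, from the description
INODE_SIZE = 1024  # bytes per inode, from the description


def blocks_needed(n_bytes: int, block: int = BLOCK) -> int:
    """Number of whole blocks needed to hold n_bytes."""
    return math.ceil(n_bytes / block)


# Block-per-file storage: data and inode each round up to a whole block.
per_file_blocks = (blocks_needed(FILE_SIZE) + blocks_needed(INODE_SIZE)) * BLOCK

# Packed storage (assumption): data and inode stored back to back.
per_file_packed = FILE_SIZE + INODE_SIZE

# Bytes consumed on disk per byte of file data.
amplification_blocks = per_file_blocks / FILE_SIZE
amplification_packed = per_file_packed / FILE_SIZE
```

Under these assumptions, block-based storage consumes 8192 bytes per 100-byte file (roughly 82 times the file data) and packed storage consumes 1124 bytes (roughly 11 times), which is in line with the approximately 80 terabytes and 11 terabytes cited for 1 terabyte of 100-byte files.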

Additionally, one or more sections of a data object may be used frequently, and thus, it may be beneficial to locate such sections on a storage system that is faster than other available storage systems. Furthermore, one or more sections of a data object may be determined to be more critical than other sections, and therefore, it may be beneficial to locate such sections on a storage system that is more reliable, more fault tolerant, and/or includes better redundancy and error correction than other available storage systems. To address these issues, the illustrative systems and methods may store frequently used and/or critical sections of data objects in a first storage system (such as, for example, a key-value storage system) using a composite layout to increase the speed of storage, modification, and retrieval of such sections, and to increase the reliability, fault tolerance, redundancy, and error correction of such sections, as will be further described herein with reference to FIGS. 3-4.

The illustrative systems and methods described herein may include composite layouts configured or adapted to handle small files and/or object compression. Further, the illustrative systems and methods may be described as using multiple types of storage systems (such as key-value stores) in combination with a composite layout to ensure efficient packing of small file data, e.g., into as few disk blocks as possible. Thus, the illustrative systems and methods described herein may be capable of efficiently storing large numbers of small files to disk.

It can be described that the illustrative composite layout can readily allow the "header" of a file to be stored in a first storage system (such as, for example, a key-value storage system or volume) that can automatically use a log-structured merge-tree and/or streaming B-tree mechanism to efficiently pack small data. Further, it may be described that the illustrative systems and methods may provide an integrated storage system that supports objects, files, and key-value pairs. In one or more embodiments, the key-value store may be implemented as a streaming B-tree, such that small key-value pairs may be automatically merged into 4K blocks. Still further, it can be described that the illustrative systems and methods utilize a key-value store and a composite layout such that small files and their inodes are transparently and efficiently stored in the key-value store.

Furthermore, the layout 110 shown with respect to FIG. 2 may be described as a "simple" layout because a "simple" layout cannot represent more than one location on a storage device for any overlapping range of sections. For example, section 112a and section 112b in FIG. 2 cannot overlap, because the location of any portion of data object 100 in the intersection could not be uniquely resolved. Other storage systems may internally map a section to multiple locations for data redundancy purposes. For example, a section may map to a pair of locations in a RAID1 configuration, where data is written to both physical devices and may be read from either physical device in the event of a failure. More complex but fully resolvable schemes such as RAID6 or erasure coding schemes may also be used. However, these simple layouts, used alone or in combination with a RAID configuration layout, remain statically fixed over time. Typically, repairing RAID data by incorporating a new device, or changing the desired file location, involves moving all mapped sections of the file in their entirety into a new static layout. In such cases, these systems may temporarily retain two layouts for the file, an original layout and a new layout, while performing such data migration (copying the data to a new location).

To address these issues, the systems, methods, and processes described herein with reference to FIGS. 3-4 utilize an exemplary composite layout, or a series of sub-layouts, each describing an arbitrary set of mapped or unmapped sections of a data object, ordered as a "sieve." The unmapped sections may be considered "holes" in the sieve. Any section that is not mapped by an upper-level sub-layout falls through to the next sub-layout in the sieve. More specifically, an exemplary composite layout may be composed of a set of sub-layouts occupying a particular ordered ranking in the composite layout structure. A new write may be directed to the highest-ranked sub-layout, and a read may be directed to the highest-ranked sub-layout that has a mapped section for the requested file range. In addition, by inserting new sub-layout layers at different ranks and providing the ability to read and write sub-layouts directly, many useful behaviors can be instantiated.
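The sieve behavior can be sketched in a few lines. This is an illustrative model under stated assumptions rather than the disclosed implementation: the classes `SubLayout` and `CompositeLayout`, the string locations such as "dev-a:0", and the use of a dictionary keyed by (offset, length) are all hypothetical. The sketch shows only the ordering rule, in which reads fall through holes to the next sub-layout and new writes land in the highest-ranked sub-layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SubLayout:
    fid: str
    # Maps (offset, length) of a mapped section to a hypothetical location.
    mapped: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def find(self, byte_offset: int) -> Optional[str]:
        """Return the location covering byte_offset, or None for a hole."""
        for (offset, length), location in self.mapped.items():
            if offset <= byte_offset < offset + length:
                return location
        return None


@dataclass
class CompositeLayout:
    # Index 0 is the highest-ranked sub-layout.
    sub_layouts: List[SubLayout]

    def read(self, byte_offset: int) -> Optional[str]:
        """Resolve a read through the sieve, highest rank first."""
        for sub in self.sub_layouts:
            location = sub.find(byte_offset)
            if location is not None:
                return location
        return None  # unmapped in every sub-layout

    def write(self, offset: int, length: int, location: str) -> None:
        """Direct a new write to the highest-ranked sub-layout."""
        self.sub_layouts[0].mapped[(offset, length)] = location


sieve = CompositeLayout([
    SubLayout("FID0"),
    SubLayout("FID1", {(0, 4096): "dev-a:0"}),
    SubLayout("FID2", {(0, 8192): "dev-b:0"}),
])
```

With this arrangement, a read at byte 100 resolves in "FID1", a read at byte 5000 falls through to "FID2", and a subsequent `sieve.write(0, 512, "dev-c:0")` causes byte 100 to resolve in "FID0" while byte 600 still resolves in "FID1".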

The systems, methods, and processes described herein may use an exemplary composite layout 220 as shown in FIGS. 3-4, which exemplary composite layout 220 is more useful than the layout 110 of FIG. 2, because, for example, the mapping information of the illustrative composite layout 220 may allow different sections (e.g., portions, chunks, extents, etc.) of a data object to be efficiently and effectively mapped, and may allow those sections to be easily dynamically remapped. In particular, the illustrative composite layout 220 may include mapping information that links one or more sections of the data object 100 to an alternate storage system (such as, for example, a key-value storage system). In the particular example depicted in FIGS. 3-4, the illustrative composite layout 220 may include mapping information that links the first sections 202x, 222x of the data object 100 to an alternate storage system (such as, for example, a key-value storage system). While the illustrative composite layout 220 of FIGS. 3-4 includes mapping information that links only the first sections 202x, 222x of the data object 100 to the alternate storage system, it should be understood that the illustrative composite layout 220 may include mapping information that links any section of the data object 100 to the alternate storage system (e.g., based on the criticality of the section, the frequency of access of the section, etc.). Further, it may be generally described that the illustrative composite layout 220 of the systems, methods, and processes may enable a range of desired capabilities that may be simply and efficiently implemented, thereby presenting a more capable, more efficient, and more stable file system.

Similar to the layout 110 shown in FIG. 2, the exemplary composite layout 220a in FIG. 3 may correspond to data objects 100 stored on one or more storage devices 22 and may include mapping information that links the data objects 100 to one or more locations 114 on the one or more storage devices 22. Unlike the layout 110 of FIG. 2, the illustrative composite layout 220a may include mapping information for linking the first section 202x of the data object 100 to one or more locations on one or more storage devices 22 in the first storage system 30 (such as, for example, a key-value storage system).

The data storage system 20 of FIGS. 3-4 may include two or more different storage systems or schemes. As shown, the data storage system 20 includes two different storage systems or schemes, namely a first storage system 30 and a second or further storage system 32. The second storage system 32 may be any other type of storage system different from the first storage system 30. In at least one embodiment, the second storage system 32 is any type of storage system other than a key-value storage system (which may be the first storage system 30). In at least one embodiment, the second storage system 32 may be described as a conventional storage system, such as an object storage system. For example, the second storage system 32 may be an EXT4 file system. Further, for example, the second storage system 32 may be a conventional distributed data storage system optimized for storing large data objects, such as a conventional file storage system like the LUSTRE file system or a conventional object storage system like the SWIFT ring.

Additionally, the first storage system 30 and the second storage system 32 may utilize different types or kinds of storage devices 22 that may vary in size, speed, reliability, and so forth. The first storage system 30 may include faster (e.g., faster reads and/or writes) storage devices 22 than the second storage system 32. For example, the first storage system 30 may include or utilize solid state drives that are faster than the rotating disk drives of the second storage system 32. Further, the first storage system 30 may include storage devices 22 that are more reliable (e.g., include higher redundancy, better error correction, etc.) than those of the second storage system 32.

As described herein, the first storage system 30 may be a key-value storage system. The illustrative key-value storage system can automatically use log-structured merge-tree and/or streaming B-tree mechanisms for efficiently packing small data. In one or more embodiments, a key-value storage system may be configured to store key-value pairs, where the key of each pair provides mapping or location information to find the value within the key-value storage system that corresponds to the key.

A key-value store may include a single index of keys or multiple indices of keys (or sets of indices of keys). Each key may identify a location of a corresponding value within a key-value storage system. Thus, the key may be used to find a section within a key-value storage system. In the illustrative systems and methods described herein, the "value" corresponding to the "key" in the key-value storage system will be the first section 202x of the file object 100, according to its composite layout 220a.

In at least one embodiment, the "key" of each section may be stored in a key-value storage system as follows:

{ FID | offset | Length }.

The FID of a key is an object or file identifier, the offset is the distance from the starting location or address, and the length is the length of the value. The "value" of a key-value pair is data for a section that can be identified using a key.

In the single index example of a key, the input to find a section in the key-value store would be the concatenation of the FID and the offset. In the multiple index example of a key, the input to find a section in the key-value store would be the FID identifying an index of the multiple indexes and an offset within the index.
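A key of the form { FID | offset | length } and a single-index lookup by FID and offset can be sketched as follows. The fixed-width big-endian encoding, the field widths, and the helper names are assumptions made for illustration, since the description does not specify a byte-level format. A plain dictionary stands in for the key-value store.

```python
import struct

# Assumed fixed-width, big-endian layout: 8-byte FID, 8-byte offset,
# 4-byte length. The description does not specify these widths.
KEY_FORMAT = ">QQI"
PREFIX_FORMAT = ">QQ"


def make_key(fid: int, offset: int, length: int) -> bytes:
    """Encode { FID | offset | length } as a single byte string."""
    return struct.pack(KEY_FORMAT, fid, offset, length)


def parse_key(key: bytes):
    """Decode a key back into (fid, offset, length)."""
    return struct.unpack(KEY_FORMAT, key)


def lookup_prefix(fid: int, offset: int) -> bytes:
    """Single-index lookup input: the FID concatenated with the offset."""
    return struct.pack(PREFIX_FORMAT, fid, offset)


# Stand-in for a key-value store holding one 100-byte first section.
store = {make_key(7, 0, 100): b"x" * 100}


def find_section(fid: int, offset: int):
    """Return the stored value whose key begins with FID + offset, if any."""
    prefix = lookup_prefix(fid, offset)
    for key, value in store.items():
        if key.startswith(prefix):
            return value
    return None
```

Because the length is the last field of the key, a caller that knows only the FID and the offset can still locate the section by matching on the leading bytes of the key.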

In at least one embodiment, all keys for all small files or headers may be stored in a single "index," which is a logical set of key-value pairs. Prefix matching and iteration may be able to retrieve all key-value pairs that contain data in the first section or header. Furthermore, during B-tree merging, overlapping key-value pairs may be garbage collected, and snapshots should not block such garbage collection.
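Prefix matching and iteration over a single ordered index can be sketched with a sorted list. This is a simplified stand-in, not a log-structured merge-tree or streaming B-tree: the class `SortedIndex` and the textual key format "fid7|0000" are hypothetical, and the sketch shows only that keys sharing a prefix are adjacent in sorted order and can therefore be retrieved with one search followed by a linear scan.

```python
import bisect
from typing import List, Tuple


class SortedIndex:
    """A single ordered index of key-value pairs (illustrative only)."""

    def __init__(self) -> None:
        self._keys: List[bytes] = []
        self._values: List[bytes] = []

    def put(self, key: bytes, value: bytes) -> None:
        """Insert a pair, replacing the value if the key already exists."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def scan_prefix(self, prefix: bytes) -> List[Tuple[bytes, bytes]]:
        """Return all pairs whose key starts with prefix, in key order."""
        i = bisect.bisect_left(self._keys, prefix)
        result = []
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            result.append((self._keys[i], self._values[i]))
            i += 1
        return result


index = SortedIndex()
index.put(b"fid7|0000", b"header-7")
index.put(b"fid9|0000", b"header-9")
index.put(b"fid7|0512", b"more-7")
```

Scanning with the prefix `b"fid7|"` returns both pairs stored for that identifier and skips the pair stored for the other identifier.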

The size of the first section 202x of the data object 100 may vary depending on the block size of the storage system 20. In at least one embodiment, the first section 202x is 4096 bytes (in other words, a 4K block). The first section 202x may be less than 4096 bytes, such as, for example, 512 bytes.

As described herein, the initial or first mapping information links the first section 202x of the data object 100 to one or more locations on one or more storage devices 22 in the first storage system 30 (such as, for example, a key-value storage system). The remaining mapping information links the other sections 202a, 202b, 202c of the data object 100 to one or more locations on one or more storage devices 22 in the second storage system 32 in which the sections 202a, 202b, 202c of the data object 100 are stored. Again, as described herein, although the mapping information of the composite layout 220a of FIG. 3 links only the first section 202x of the data object 100 to one or more locations on one or more storage devices 22 in the first storage system 30, it should be understood that the composite layout 220a may include mapping information that links sections other than the first section 202x (such as one or more of the sections 202a, 202b, 202c) to one or more locations on one or more storage devices 22 in the first storage system 30.
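The routing implied by such a layout, in which the first section goes to the first storage system and everything after it goes to the second, can be sketched as a function that splits a requested byte range. The function name, the labels "key-value" and "other", and the default first-section size of 4096 bytes (taken from the embodiment described above) are illustrative assumptions.

```python
from typing import List, Tuple

FIRST_SECTION_SIZE = 4096  # bytes, per the 4K embodiment described above


def route_range(offset: int, length: int,
                first_size: int = FIRST_SECTION_SIZE) -> List[Tuple[str, int, int]]:
    """Split [offset, offset + length) into per-storage-system parts.

    Returns a list of (storage_system, offset, length) tuples.
    """
    if length <= 0:
        return []
    end = offset + length
    parts = []
    if offset < first_size:
        # Portion that falls inside the first section.
        kv_end = min(end, first_size)
        parts.append(("key-value", offset, kv_end - offset))
    if end > first_size:
        # Portion that falls beyond the first section.
        start = max(offset, first_size)
        parts.append(("other", start, end - start))
    return parts
```

For example, a 200-byte request starting at offset 4000 is split into 96 bytes directed to the key-value storage system and 104 bytes directed to the other storage system, while a 100-byte file is handled entirely by the key-value storage system.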

Additionally, another illustrative layout 220b is depicted in FIG. 4, which illustrative layout 220b may include a plurality of sub-layouts 201 ranked from lowest priority to highest priority. As shown, each sub-layout 201 may include a handle or descriptor "FID" and a subscript number that indicates the priority of the sub-layout 201. The subscript numbers may indicate priority in descending order, with the highest-priority sub-layout 201 having the handle "FID₀" and lower-priority sub-layouts having handles with larger numerical subscripts. In this example, the lowest-priority sub-layout 201 has the handle "FID₂", which indicates a lower priority than the sub-layouts 201 "FID₀" and "FID₁". In other words, the plurality of sub-layouts 201 may be layered to form a cascading sieve, with higher-priority sub-layouts 201 positioned in a higher layer than lower-priority sub-layouts 201.

Although the illustrative composite layout 220b depicted in FIG. 4 includes three sub-layouts 201, it should be understood that the exemplary composite layout 220b is dynamic such that the number of sub-layouts 201 may be increased or decreased to provide the functionality described further herein. More specifically, new sub-layouts 201 may be added within the composite layout 220b at an intentional or selected priority, e.g., higher or lower than the priority of an existing sub-layout 201. Furthermore, the ability of the composite layout 220b to include multiple sub-layouts 201 may allow for overlapping ranges of sections, which may also be useful for providing the functionality described herein.

Each sub-layout 201 may include mapping information linking one or more sections 222a, 222b, 222c, 222d, 222e of the data object 100 to one or more locations 114 on one or more storage devices 22 on which the one or more sections 222a, 222b, 222c, 222d, 222e of the data object 100 are stored. Similar to layout 110, each sub-layout 201 corresponds to the data object 100 such that the sub-layout 201 extends from a left end to a right end representing a range of bytes within the data object 100, where the left end represents the beginning of the data object 100 and the right end represents the end of the data object 100. The sections 212 of the sub-layout 201 are graphically depicted using shading within the sub-layout 201 to indicate which sections or ranges of the data object 100 have corresponding mapping information in the sub-layout 201 to link the sections or ranges to locations 114 on one or more storage devices 22. In other words, a shaded section 212 or region of the sub-layout 201 indicates that mapping information exists to link the section 212 to a location 114 on one or more storage devices 22.

Additionally, the plurality of sub-layouts 201 includes a sub-layout 201 FID_x. The sub-layout 201 FID_x includes mapping information linking a first section 212x of the data object 100 to one or more locations on one or more storage devices 22 in the first storage system 30 (such as, for example, a key-value storage system), similar to the illustrative layout 220a. Since the sections stored in the first storage system 30 are located in a single sub-layout 201 FID_x, the systems, methods, and processes may be able to determine whether a section is located in the first storage system 30 based only on the FID. In other words, there may be no mixing of sub-layouts 201 between the sub-layout that directs sections to the first storage system 30 (such as sub-layout 201 FID_x) and the sub-layouts that direct sections to the second storage system 32 (such as the remaining sub-layouts 201 FID₀, FID₁, FID₂).

Further, in one or more embodiments, the mapping information for the first section 212x may be stored in the highest priority sub-layout 201. Further, in one or more embodiments, the layout function may include an exception for the first section 222x, such that the layout function directs a read or write to the first section to the first storage system 30. Thus, the exception mapping information linking the first section 222x to the first storage system 30 may be used with a composite layout 220b comprising a plurality of sub-layouts 201 either as its own sub-layout 201 FID_x, as depicted in FIG. 4, or as a mapping process in the layout function.

The exemplary composite layout 220b may be described as a resulting layout FID_r 221 that is the combined result of the sub-layouts 201. Although the resulting layout FID_r 221 is depicted in FIG. 4, it should be understood that the resulting layout FID_r 221 is depicted primarily to describe the example systems, methods, and processes described herein. The resulting layout FID_r 221 may be provided based on the mapping information found for a particular section in the highest priority sub-layout 201 that includes mapping information for that particular section. In other words, when looking for mapping information for a particular section or range, the highest priority sub-layout 201 may be checked, and if there is no mapping information for that section or range, the next highest priority sub-layout 201 may be checked, and so on.
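The lookup order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SubLayout`, `resolve`, the integer `priority` field (smaller meaning higher priority, mirroring FID₀ as the highest), and the byte offsets and location strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SubLayout:
    fid: str
    priority: int  # smaller number = higher priority (FID0 is highest)
    extents: list = field(default_factory=list)  # (start, end, location); end exclusive

    def find(self, offset):
        """Return the location mapped for offset, or None if unmapped here."""
        for start, end, location in self.extents:
            if start <= offset < end:
                return location
        return None


def resolve(sub_layouts, offset):
    """Check sub-layouts from highest to lowest priority; first hit wins."""
    for sub in sorted(sub_layouts, key=lambda s: s.priority):
        location = sub.find(offset)
        if location is not None:
            return sub.fid, location
    return None  # no sub-layout holds mapping information for this offset


# Hypothetical overlapping sub-layouts, listed in arbitrary order.
subs = [
    SubLayout("FID2", 2, [(0, 400, "loc-c")]),
    SubLayout("FID0", 0, [(100, 200, "loc-a")]),
    SubLayout("FID1", 1, [(150, 300, "loc-b")]),
]
```

With these made-up extents, offset 120 resolves through FID0, offset 250 falls through to FID1, and offset 350 falls through to FID2.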

Proceeding from left to right through the resulting layout FID_r 221 in FIG. 4 (e.g., from the beginning to the end of the data object 100), the mapping information for the first section 222x is provided by the mapping information of section 212x of sub-layout 201 FID_x, and the mapping information for the second section 222a is provided by sub-layout 201 FID₀, which may be the highest priority sub-layout 201 that includes mapping information for the second section 222a. As shown, sub-layouts 201 FID₁ and FID₂ also include mapping information, in sections 212b and 212c respectively, that corresponds to the second section 222a of the resulting layout FID_r 221. However, sub-layout 201 FID₀ has a higher priority than sub-layouts 201 FID₁ and FID₂, and therefore the mapping information of sub-layout 201 FID₀ supersedes the mapping information in sub-layouts 201 FID₁ and FID₂. In other words, sub-layouts 201 FID₀, FID₁, and FID₂ overlap at section 222a, and thus the higher priority sub-layout 201 (i.e., sub-layout 201 FID₀) takes precedence over the lower priority sub-layouts 201.

Furthermore, the mapping information for the third section 222b of the resulting layout FID_r 221 is provided by sub-layout 201 FID₁, which is the highest priority sub-layout 201 that includes mapping information for the third section 222b. As shown, sub-layout 201 FID₂ also includes mapping information in section 212c that corresponds to a portion of the third section 222b of the resulting layout FID_r 221. However, sub-layout 201 FID₁ has a higher priority than sub-layout 201 FID₂, and therefore the mapping information of sub-layout 201 FID₁ supersedes the mapping information in sub-layout 201 FID₂. In other words, sub-layouts 201 FID₁ and FID₂ overlap at section 222b, and thus the higher priority sub-layout 201 (i.e., sub-layout 201 FID₁) takes precedence over the lower priority sub-layout 201.

Next, the mapping information for the fourth section 222c of the resulting layout FID_r 221 is provided by a portion of the mapping information of section 212d of sub-layout 201 FID₂ (the lowest priority sub-layout), because the composite layout 220b has no higher priority sub-layout 201 that includes mapping information for section 222c of the resulting layout FID_r 221. Finally, according to the priority function and logic described herein, the mapping information for the fifth section 222d and the sixth section 222e of the resulting layout FID_r 221 is provided by the mapping information of section 212e of sub-layout 201 FID₂ and by the mapping information of sub-layout 201 FID₁, respectively.
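The walk from left to right described above amounts to flattening the prioritized sub-layouts into a single resulting layout. The Python sketch below illustrates one way this could be computed. The function name `resulting_layout` and every byte offset are invented for illustration (FIG. 4 gives no numeric offsets), and FID_x is placed first on the assumption, stated in one embodiment above, that the first section may be held by the highest priority sub-layout.

```python
def resulting_layout(sub_layouts):
    """Flatten prioritized sub-layouts into (start, end, fid) extents.

    sub_layouts: list of (fid, [(start, end), ...]) ordered from highest
    to lowest priority; end offsets are exclusive.
    """
    # Every extent boundary splits the object into elementary intervals.
    bounds = sorted({b for _, extents in sub_layouts
                     for start, end in extents for b in (start, end)})
    result = []
    for lo, hi in zip(bounds, bounds[1:]):
        # The first (highest priority) sub-layout covering the interval wins.
        owner = next((fid for fid, extents in sub_layouts
                      if any(s <= lo and hi <= e for s, e in extents)), None)
        if owner is None:
            continue  # no mapping information for this range
        if result and result[-1][2] == owner and result[-1][1] == lo:
            result[-1] = (result[-1][0], hi, owner)  # merge adjacent pieces
        else:
            result.append((lo, hi, owner))
    return result


# Hypothetical offsets loosely modeled on the overlaps described for FIG. 4.
fig4 = [
    ("FID_x", [(0, 4)]),              # section 212x
    ("FID0", [(4, 10)]),
    ("FID1", [(6, 14), (24, 28)]),    # includes section 212b
    ("FID2", [(8, 12), (13, 18), (20, 24)]),  # sections 212c, 212d, 212e
]
flattened = resulting_layout(fig4)
```

With these made-up offsets, the flattened result lists six extents owned, in order, by FID_x, FID0, FID1, FID2, FID2, and FID1, matching the order of sections 222x through 222e described above.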

Additional features enabled by the composite layout 220 of the illustrative systems, methods, and processes may include tiering/information lifecycle management (ILM), data locality, failure recovery, and data rebalancing, such as described in U.S. Patent Application Publication No. 2018/0232282 A1 entitled "Data Storage Composite Layouts for Data Objects," published on August 16, 2018, which is incorporated herein by reference in its entirety.

FIG. 5 depicts an illustrative write method 50 using the illustrative composite layouts of FIGS. 3-4. The method 50 may include receiving write data to be written to an existing data object stored in the data storage system 20 (52), and determining whether the write data is to be written to a section located in the first storage system (54). If the write data is located in the first storage system, the method 50 may store the write data in the first storage system (56), such as, for example, a key-value storage system. Conversely, if the write data is not located in the first storage system, the method 50 may store the write data in the second storage system (58).

FIG. 6 depicts an illustrative read method 60 using the illustrative composite layouts of FIGS. 3-4. The method 60 may include receiving a read request for read data of an existing data object stored in the data storage system 20 (62), and determining whether the read data is located in the first storage system (64). If the read data is located in the first storage system, the method 60 may retrieve the read data from the first storage system (66), such as, for example, a key-value storage system. Conversely, if the read data is not located in the first storage system, the method 60 may retrieve the read data from the second storage system (68).
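The write method 50 and read method 60 can be sketched together as a simple dispatch on where the addressed section lives. The Python fragment below is a simplified illustration, not the disclosed implementation: the class `TwoSystemStore`, its in-memory dictionaries standing in for the two storage systems, and the rule that only offsets below a fixed first-section size belong to the first storage system are all assumptions, and requests are assumed not to straddle the section boundary.

```python
class TwoSystemStore:
    """Toy stand-in for a data storage system with two back ends."""

    def __init__(self, first_section_size=4096):
        self.first_section_size = first_section_size
        self.kv = {}     # stand-in for the first (key-value) storage system 30
        self.other = {}  # stand-in for the second storage system 32

    def _in_first_system(self, offset):
        # Steps 54 / 64: decide which storage system holds the section.
        return offset < self.first_section_size

    def write(self, object_id, offset, data):
        """Steps 52-58: store write data in the system that holds its section."""
        if self._in_first_system(offset):
            self.kv[(object_id, offset)] = data      # step 56
            return "kv"
        self.other[(object_id, offset)] = data       # step 58
        return "other"

    def read(self, object_id, offset):
        """Steps 62-68: retrieve read data from the system that holds it."""
        if self._in_first_system(offset):
            return self.kv.get((object_id, offset))      # step 66
        return self.other.get((object_id, offset))       # step 68


store = TwoSystemStore()
first_target = store.write("object-100", 0, b"small header")
second_target = store.write("object-100", 8192, b"bulk payload")
```

In this sketch the write at offset 0 lands in the key-value stand-in and the write at offset 8192 lands in the second stand-in; a later `store.read("object-100", 0)` is served from the key-value stand-in alone.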

It will be apparent to those skilled in the art that elements or processes from one embodiment may be used in combination with elements or processes of other embodiments, and that the possible embodiments of such systems, apparatuses, devices and methods using combinations of features set forth herein are not limited to the specific embodiments shown in the drawings and/or described herein. Further, it will be appreciated that the timing of the processes herein, as well as the size and shape of the various elements, may be modified while remaining within the scope of the present disclosure, although certain timings, one or more shapes and/or sizes, or types of elements may be advantageous relative to other timings, shapes and/or sizes, or types of elements.

The methods and/or techniques described in this disclosure, including those attributed to the computing device or various component parts of the host and/or file system, may be implemented at least in part in hardware, software, firmware, or any combination thereof. For example, aspects of these techniques may be implemented within one or more processors, including one or more microprocessors, DSPs, ASICs, FPGAs, or any other equivalent integrated or discrete logic circuitry, as well as any combinations of such components embodied in programs. The terms "controller," "module," "processor," or "processing circuitry" may generally refer to any of the preceding logic circuitry, alone or in combination with other logic circuitry, or any other equivalent circuitry.

Such hardware, software, and/or firmware may be implemented within the same device or within separate devices to support the various operations and functions described in this disclosure. Furthermore, any of the described units, modules or components may be implemented together or separately as discrete but interoperable logic devices. Depiction of different features as modules or units is intended to highlight different functional aspects and does not necessarily imply that such modules or units must be realized by separate hardware or software components. Rather, functionality associated with one or more modules or units may be performed by separate hardware or software components, or may be integrated within common or separate hardware or software components.

When implemented in software, the functions attributed to the systems, devices, and techniques described in this disclosure may be embodied as instructions on a computer-readable medium (such as RAM, ROM, NVRAM, EEPROM, flash memory, STRAM, RRAM, magnetic data storage media, optical data storage media, and the like, as well as any combination thereof). The instructions may be executed by one or more processors to support one or more aspects of the functionality described in this disclosure.

In the preceding description, reference has been made to the accompanying set of drawings that form a part hereof, and in which are shown by way of illustration several specific embodiments. It is to be understood that other embodiments are contemplated and may be made without departing from (e.g., still falling within) the scope or spirit of the present disclosure. The foregoing detailed description, therefore, is not to be taken in a limiting sense. The definitions provided herein are intended to facilitate understanding of certain terms used frequently herein and are not intended to limit the scope of the present disclosure.

Unless otherwise indicated, all numbers expressing feature sizes, amounts, and physical characteristics used in the specification and claims are to be understood as being modified in all instances by the term "about". Accordingly, unless indicated to the contrary, the numerical parameters set forth in the foregoing specification and attached claims are approximations that can vary depending upon the desired properties sought by those skilled in the art utilizing the teachings disclosed herein.

Further examples are:

Example 1: a system, comprising: one or more data storage devices for storing one or more data objects; and a computing device comprising one or more processors and operably coupled to the one or more data storage devices, the computing device configured to: maintain, using the one or more data storage devices, a key-value storage system and another storage system different from the key-value storage system; and provide, for each data object stored on the one or more data storage devices, a composite layout comprising mapping information linking one or more sections of each data object to one or more locations on the one or more data storage devices in which the one or more sections of the data object are stored, wherein at least one of the one or more sections of each data object is stored in the key-value storage system and at least another one of the one or more sections of each data object is stored in the other storage system.

Example 2: the system of example 1, wherein the at least one section stored in the key-value storage system comprises a first section of each data object.

Example 3: the system of example 1, wherein the at least one section stored in the key-value storage system is equal to or less than 4096 bytes.

Example 4: the system of example 1, wherein the key-value storage system uses a b-tree to store key-value pairs.

Example 5: the system of example 1, wherein the key-value storage system comprises a plurality of indices of key-value pairs, wherein each of the indices of key-value pairs comprises at least one key usable to find the at least one section within the key-value storage system.

Example 6: the system of example 1, wherein the computing device is further configured to: receiving a write data object to be stored on the one or more data storage devices; and storing at least one section of the write data object in the key-value storage system according to the composite layout.

Example 7: the system of example 1, wherein the computing device is further configured to: receiving write data to be written to an existing data object of the one or more data objects; determining whether the write data is to be written to a section of the existing data object that is stored in the key-value storage system; and if it is determined that the write data is to be written to the section of the existing data object stored in the key-value storage system, storing the write data in the key-value storage system.

Example 8: the system of example 1, wherein the computing device is further configured to: receiving a read request for read data of an existing data object of the one or more data objects; determining whether the read data is to be read from a section of the existing data object stored in the key-value storage system; and if it is determined that the read data is located in the section of the existing data object stored in the key-value storage system, reading the read data from the key-value storage system.

Example 9: the system of example 1, wherein the composite layout includes a plurality of sub-layouts from a lowest priority to a highest priority, each sub-layout including mapping information linking one or more sections of the data object to one or more locations on the storage device in which the sections of the data object are stored, wherein the mapping information for at least one of the plurality of sub-layouts indicates whether one or more sections are stored in the key-value storage system.

Example 10: the system of example 1, wherein the one or more data storage devices include one or more solid state drives, wherein the key value storage system uses the one or more solid state drives.

Example 11: a method, comprising: maintaining, using one or more data storage devices for storing one or more data objects, a key-value storage system and another storage system different from the key-value storage system; and providing, for each data object stored on the one or more data storage devices, a composite layout comprising mapping information linking one or more sections of each data object to one or more locations on the one or more data storage devices in which the one or more sections of the data object are stored, wherein at least one of the one or more sections of each data object is stored in the key-value storage system and at least another one of the one or more sections of each data object is stored in the other storage system.

Example 12: the method of example 11, wherein the at least one section stored in the key-value storage system comprises a first section of each data object.

Example 13: the method of example 11, wherein the at least one section stored in the key-value storage system is equal to or less than 4096 bytes.

Example 14: the method of example 11, wherein the key-value storage system uses a b-tree to store key-value pairs.

Example 15: the method of example 11, wherein the key-value storage system includes a plurality of indices of key-value pairs, wherein each of the indices of key-value pairs includes at least one key usable to find the at least one section within the key-value storage system.

Example 16: the method of example 11, the method further comprising: receiving a write data object to be stored on the one or more data storage devices; and storing at least one section of the write data object in the key-value storage system according to the composite layout.

Example 17: the method of example 11, the method further comprising: receiving write data to be written to an existing data object of the one or more data objects; determining whether the write data is to be written to a section of the existing data object that is stored in the key-value storage system; and if it is determined that the write data is to be written to the section of the existing data object stored in the key-value storage system, storing the write data in the key-value storage system.

Example 18: the method of example 11, the method further comprising: receiving a read request for read data of an existing data object of the one or more data objects; determining whether the read data is to be read from a section of the existing data object stored in the key-value storage system; and if it is determined that the read data is located in the section of the existing data object stored in the key-value storage system, reading the read data from the key-value storage system.

Example 19: the method of example 11, wherein the composite layout includes a plurality of sub-layouts from a lowest priority to a highest priority, each sub-layout including mapping information linking one or more sections of the data object to one or more locations on the storage device in which the sections of the data object are stored, wherein the mapping information for at least one of the plurality of sub-layouts indicates whether one or more sections of the data object are stored in the key-value storage system.

Example 20: the method of example 11, wherein the one or more data storage devices include one or more solid state drives, wherein the key value storage system uses the one or more solid state drives.

Example 21: a composite layout corresponding to a data object describing one or more locations of one or more sections of the data object on one or more storage devices, comprising: a plurality of sub-layouts ranked from lowest priority to highest priority, each sub-layout comprising mapping information linking one or more sections of the data objects to one or more locations on the storage device in which the one or more sections of the data objects are stored on the storage device, wherein at least one of the one or more sections of each data object is stored in a key-value storage system and at least another one of the one or more sections of each data object is stored in another storage system different from the key-value storage system.

Example 22: the composite layout of example 21 wherein the mapping information of at least one of the plurality of sub-layouts indicates whether one or more sections of the data object are stored in the key-value storage system.
