Data storage method, data storage equipment, computer program and storage medium

Document No.: 1270532 Publication date: 2020-08-25

Reading note: This technology, "Data storage method, data storage equipment, computer program and storage medium", was designed by 柴云鹏 and 吴坤尧 on 2020-04-28. Main content: The invention relates to a data storage method, device, computer program, and storage medium, comprising a step for processing user read-write requests, a step for implementing address translation, a step for caching data block requests, and a step for storing user data. The invention can greatly reduce the write amplification overhead of a RAID5 system and improve the read-write performance of the system. The invention can be widely applied in the technical field of data storage.

1. A data storage method, comprising:

a step for processing user read-write requests;

a step for implementing address translation;

a step for caching data block requests;

a step for storing user data.

2. The data storage method of claim 1, wherein processing a read request comprises the following steps:

S11: receiving a read request;

S12: converting the logical address of the read request into the address of a data block;

S13: determining whether the buffer is hit; if so, reading from the buffer; otherwise, reading from the SMR disk; then returning to step S11.
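
The read path of steps S11 to S13 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: `translate`, `buffer`, and `smr` are hypothetical stand-ins (one function and two dictionaries) for the address mapping module, the persistent buffer, and the SMR disk group.

```python
def translate(logical_addr):
    """Hypothetical address translation (S12); identity mapping for illustration."""
    return logical_addr


def handle_read(logical_addr, buffer, smr):
    """Serve one read request: try the buffer first, fall back to the SMR disk (S13)."""
    block_addr = translate(logical_addr)
    if block_addr in buffer:       # buffer hit
        return buffer[block_addr]
    return smr[block_addr]         # buffer miss: read from the SMR disk


# Assumed toy state: block 1 has a newer copy in the buffer.
smr = {0: b"old0", 1: b"old1"}
buffer = {1: b"new1"}
print(handle_read(0, buffer, smr))  # served from the SMR disk
print(handle_read(1, buffer, smr))  # served from the buffer
```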

3. The data storage method of claim 1, wherein processing a write request comprises the following steps:

S21: receiving a write request;

S22: performing address conversion on the write request, generating the address of the data block and the address of the parity block (check block) at the same time;

S23: determining the block type; if the block is a parity block, writing it directly to the SMR disk and returning to step S21; if the block is a data block, proceeding to step S24;

S24: determining whether the buffer is hit; if so, updating the block at its original position in the buffer and returning to step S21; otherwise, proceeding to step S25;

S25: determining whether the buffer is full; if so, executing a garbage collection process; if the buffer is not full, or after garbage collection finishes, writing the block directly to the buffer.
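
The write path of steps S21 to S25 can be sketched in the same spirit. This is a hedged illustration only: the dictionaries `buffer` and `smr`, the `capacity` parameter, and the `flush_all` helper (a crude stand-in for garbage collection that simply empties the buffer to disk) are assumptions introduced for the example.

```python
def flush_all(buffer, smr):
    """Hypothetical stand-in for garbage collection: move every block to the SMR disk."""
    smr.update(buffer)
    buffer.clear()


def handle_write(block_addr, is_parity, payload, buffer, smr, capacity,
                 collect_garbage=flush_all):
    """Route one already-translated block write and report where it went."""
    if is_parity:                   # S23: parity blocks bypass the buffer
        smr[block_addr] = payload
        return "smr"
    if block_addr in buffer:        # S24: buffer hit, update in place
        buffer[block_addr] = payload
        return "buffer-update"
    if len(buffer) >= capacity:     # S25: buffer full, reclaim space first
        collect_garbage(buffer, smr)
    buffer[block_addr] = payload    # S25: write into the buffer
    return "buffer-insert"


smr, buffer = {}, {}
print(handle_write(7, True, b"p", buffer, smr, capacity=2))
print(handle_write(0, False, b"a", buffer, smr, capacity=2))
print(handle_write(0, False, b"b", buffer, smr, capacity=2))
```

In this sketch the parity write goes straight to `smr`, while the second write to block 0 updates the buffered copy in place instead of consuming a new buffer slot.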

4. The data storage method of claim 2 or 3, wherein in step S12 and step S22 the address conversion is performed as follows: the parity blocks (check blocks) of each disk are moved onto the overwritable tracks, and the remaining data blocks are arranged sequentially on the disk in ascending order of address.

5. The data storage method of claim 4, wherein the number of disks in the RAID must equal the number of tracks in a Band.
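
One possible reading of the mapping in claims 4 and 5 is sketched below. The text does not give a mapping formula, so everything concrete here is an assumption made for illustration: each Band has exactly one overwritable track, parity rotates one disk to the left per stripe, each disk holds one block per stripe, and blocks are identified by their stripe number. The names `parity_disk` and `layout_band` are hypothetical.

```python
def parity_disk(stripe, n_disks):
    """Assumed RAID5 rotation: parity moves one disk to the left per stripe."""
    return (n_disks - 1) - (stripe % n_disks)


def layout_band(disk, band, n_disks, blocks_per_track):
    """Place one Band's worth of blocks on `disk`.

    Assumes a Band has n_disks tracks, one of which is overwritable.
    Returns (shingled_tracks, overwritable_track), each holding stripe numbers.
    """
    stripes_per_band = n_disks * blocks_per_track
    first = band * stripes_per_band
    data, parity = [], []
    for stripe in range(first, first + stripes_per_band):
        if parity_disk(stripe, n_disks) == disk:
            parity.append(stripe)   # parity blocks go to the overwritable track
        else:
            data.append(stripe)     # data blocks stay in ascending address order
    shingled = [data[i:i + blocks_per_track]
                for i in range(0, len(data), blocks_per_track)]
    return shingled, parity


shingled, overwritable = layout_band(disk=0, band=0, n_disks=4, blocks_per_track=2)
print(shingled)      # [[0, 1], [2, 4], [5, 6]]
print(overwritable)  # [3, 7]
```

Under these assumptions, each disk holds parity for one stripe in every `n_disks`, so with `n_disks` tracks per Band the parity blocks fill exactly one track. That is consistent with the requirement that the number of disks equal the number of tracks in a Band.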

6. The data storage method of claim 2, wherein in step S13 the determination is performed as follows:

S131: determining whether the data block corresponding to the read request is in the overwritable area of the persistent buffer; if so, reading the data directly from the overwritable area; otherwise, determining whether the data block is in the shingled area of the persistent buffer, and if so, reading it from the shingled area;

S132: if the data block is found in neither area of the persistent buffer, that is, no cache hit occurs, reading the data directly from the corresponding data block address.
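
The two-level lookup of steps S131 and S132 can be sketched as follows. The three dictionaries and the function name `lookup` are hypothetical; the sketch only shows the order in which the areas are consulted.

```python
def lookup(block_addr, overwritable_area, shingled_area, smr):
    """Return (source, data) for a block, checking the buffer areas in order."""
    if block_addr in overwritable_area:      # S131: overwritable area first
        return "overwritable", overwritable_area[block_addr]
    if block_addr in shingled_area:          # S131: then the shingled area
        return "shingled", shingled_area[block_addr]
    return "smr", smr[block_addr]            # S132: no hit, read the home location


# Assumed toy state for illustration.
overwritable_area = {5: b"newest"}
shingled_area = {5: b"older", 9: b"buffered"}
smr = {5: b"stale", 9: b"stale", 12: b"home"}
for addr in (5, 9, 12):
    print(addr, lookup(addr, overwritable_area, shingled_area, smr))
```

Checking the overwritable area first matters in this sketch because a block may have an older copy in the shingled area and a newer one in the overwritable area.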

7. The data storage method of claim 3, wherein in step S25 the garbage collection process comprises:

(1) garbage collection of the overwritable area: first, an empty Band of the shingled area is selected; if the entire shingled area is full, shingled area garbage collection is performed first. Then, as long as the remaining space of the selected Band allows, the Band number that owns the most data blocks in the overwritable area is selected as the eviction number, and the data blocks corresponding to the eviction number are moved into the selected Band in sequence; this is repeated until no further eviction number can be selected;

(2) garbage collection of the shingled area: first, the distribution of Band numbers across the data blocks in the shingled area is counted; each time, the Band number that owns the most data blocks is selected as the eviction number, and all data blocks in the Bands of the shingled area that contain the eviction number are moved to the corresponding area of the shingled recording disk;

(3) when the buffer is not full, or after garbage collection finishes, the write request is written to a free position in the overwritable area.
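
The eviction choice used by both garbage collection passes (pick the Band number that owns the most buffered data blocks) can be sketched as follows. The mapping from a block address to its Band number (`addr // blocks_per_band`), the tie-breaking rule, and the function names are assumptions made for this example.

```python
from collections import Counter


def pick_eviction_band(buffered_blocks, blocks_per_band):
    """Return the Band number owning the most buffered blocks, or None if empty."""
    counts = Counter(addr // blocks_per_band for addr in buffered_blocks)
    if not counts:
        return None
    # Ties go to the lowest Band number (an arbitrary choice for this sketch).
    return min(counts, key=lambda band: (-counts[band], band))


def evict_band(buffered_blocks, blocks_per_band):
    """Remove and return all buffered blocks of the chosen Band, in address order."""
    band = pick_eviction_band(buffered_blocks, blocks_per_band)
    if band is None:
        return None, []
    victims = sorted(a for a in buffered_blocks if a // blocks_per_band == band)
    buffered_blocks.difference_update(victims)
    return band, victims


blocks = {3, 101, 102, 105, 250}
print(evict_band(blocks, blocks_per_band=100))  # (1, [101, 102, 105])
print(sorted(blocks))                           # [3, 250]
```

Evicting the most populated Band first moves the largest number of blocks per Band rewrite, which is presumably why the method selects it.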

8. A data storage device, characterized in that: the device comprises an operating system, an address mapping module, a persistent buffer, and a shingled recording disk group; the operating system is used for processing user read-write requests; the address mapping module is used for implementing address translation; the persistent buffer is used for caching data block requests; the shingled recording disk group is used for storing user data.

9. A computer program, characterized in that: it comprises computer program instructions which, when executed by a processor, implement the corresponding steps of the data storage method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that: the computer-readable storage medium stores computer program instructions which, when executed by a processor, implement the corresponding steps of the data storage method according to any one of claims 1 to 7.

Technical Field

The present invention relates to the field of data storage technologies, and in particular to a data storage method, device, computer program, and storage medium based on novel shingled magnetic recording (tile recording) disks.

Background

Magnetic disk: the Hard Disk Drive (HDD) is currently the dominant data storage medium. Compared with a flash-based Solid State Drive (SSD), a magnetic disk offers high capacity, low cost, and a long service life. With the exponential growth of data volume, demand for large-scale storage systems is rising rapidly in fields including social media, the Internet of Things, astronomical observation, and medical imaging. Inexpensive, high-capacity, stable disks will remain the primary storage medium for future storage needs.

Shingled recording (tile recording): Shingled Magnetic Recording (SMR) is a new type of storage device distinct from Conventional Magnetic Recording (CMR). SMR drives fall into three categories: Drive-Managed SMR (DM-SMR), Host-Managed SMR (HM-SMR), and Host-Aware SMR (HA-SMR). Because of the superparamagnetic limit in physics, the storage density of conventional disks has reached its limit of 1 Tb/in². SMR raises storage density by changing the layout of tracks on existing disk hardware. It exploits the fact that the read head of a disk is narrower than the write head: the tracks are stacked on one another like roof tiles, and the exposed part of each track is kept wider than the read head so that the track can still be read normally. Because the tracks overlap, an SMR disk performs similarly to a conventional disk when handling read requests and sequential write requests, but suffers performance degradation when handling random write requests. Unlike Heat-Assisted Magnetic Recording (HAMR) and Bit-Patterned Magnetic Recording (BPMR), which have not yet been commercialized, SMR does not require substantial changes to the existing disk structure and is already mass-produced by companies such as Seagate and Western Digital. With its larger capacity and lower price, SMR is considered a promising storage technology.
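
The cost of a random write on overlapping tracks can be illustrated with a deliberately simplified model. It assumes that the tracks of a Band are shingled in index order, that each track overlaps only its immediate successor, and that updating a track therefore forces every later track in the Band to be rewritten. Real drives may differ; the function name is hypothetical.

```python
def tracks_rewritten(track_index, tracks_per_band):
    """Tracks that must be written to update `track_index` in place.

    Simplified model: writing track i damages track i + 1, which must then be
    rewritten, and so on to the end of the Band.
    """
    if not 0 <= track_index < tracks_per_band:
        raise ValueError("track index outside the Band")
    return tracks_per_band - track_index


for track in range(4):
    print(track, tracks_rewritten(track, tracks_per_band=4))
```

In this model the last track of a Band costs a single write because no other track overlaps it, so it can be updated in place like a track on a conventional disk.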

Disk array: a Redundant Array of Independent Disks (RAID) is a storage scheme that combines multiple disks behind a single logical interface in order to balance capacity, performance, and reliability. Disk arrays come in different levels, such as RAID0, RAID10, RAID4, and RAID5. RAID4 distributes user data evenly across the data disks and stores the parity information generated from the user data on a dedicated parity disk. When one disk fails, a RAID4 system can restore the data of the failed disk from the information on the remaining disks. RAID5 improves on RAID4 by distributing the parity information evenly across all disks. RAID5 is widely deployed in large-scale storage systems because of its good read-write performance, its tolerance of a single disk failure, and its large capacity.
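
The parity mechanism described above can be illustrated with XOR, the parity function commonly used by RAID4 and RAID5. The sketch below is a generic illustration with made-up two-byte blocks, not code from the invention. It shows recovery of a lost block and the parity update for a small write.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe with three data blocks and one parity block.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = xor_blocks(d0, d1, d2)

# Recovery: if the disk holding d1 fails, XOR the survivors with the parity.
recovered = xor_blocks(d0, d2, parity)
print(recovered == d1)  # True

# Small write: the new parity needs only the old data, new data, and old parity.
new_d1 = b"\xaa\xbb"
new_parity = xor_blocks(parity, d1, new_d1)
print(new_parity == xor_blocks(d0, new_d1, d2))  # True
```

The small-write case shows why every data block update in RAID5 also triggers a parity block update, which is the extra write traffic that becomes expensive on shingled tracks.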

Zoned block command set: Zoned Block Commands (ZBC) is an interface standard for HM-SMR proposed by the InterNational Committee for Information Technology Standards (INCITS). The standard provides a series of operations at zone granularity, including querying the type and size of a zone, opening and closing a zone, reading a zone, writing a zone, and resetting the write pointer within a zone.
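
The role of the write pointer can be illustrated with a toy in-memory zone. This is a conceptual sketch only: the class `Zone` and its methods are hypothetical and do not correspond to the actual ZBC commands, and the rule that every write must land exactly on the write pointer is an assumption that models a sequential-write zone.

```python
class Zone:
    """Toy model of a sequential-write zone; not the real ZBC command set."""

    def __init__(self, size):
        self.size = size
        self.write_pointer = 0
        self.blocks = [None] * size

    def write(self, offset, payload):
        """Accept a write only at the current write pointer."""
        if offset != self.write_pointer:
            raise ValueError("writes must land on the write pointer")
        if self.write_pointer >= self.size:
            raise ValueError("zone is full")
        self.blocks[offset] = payload
        self.write_pointer += 1

    def read(self, offset):
        """Read a block that has already been written."""
        if not 0 <= offset < self.write_pointer:
            raise ValueError("offset has not been written")
        return self.blocks[offset]

    def reset_write_pointer(self):
        """Discard the zone's contents and allow writing from the start again."""
        self.write_pointer = 0
        self.blocks = [None] * self.size


zone = Zone(size=4)
zone.write(0, b"a")
zone.write(1, b"b")
print(zone.read(1), zone.write_pointer)
zone.reset_write_pointer()
print(zone.write_pointer)
```

In this model an in-place update of block 0 is rejected until the whole zone is reset, which mirrors why random writes are costly on host-managed shingled disks.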

Disclosure of Invention

In view of the foregoing problems, it is an object of the present invention to provide a data storage method, apparatus, computer program, and storage medium, which can greatly reduce the write amplification overhead of a RAID5 system and improve the read/write performance of the system.

To achieve this purpose, the invention adopts the following technical scheme: a data storage method, comprising: a step for processing user read-write requests; a step for implementing address translation; a step for caching data block requests; a step for storing user data.

Further, processing a read request comprises the following steps: S11: receiving a read request; S12: converting the logical address of the read request into the address of a data block; S13: determining whether the buffer is hit; if so, reading from the buffer; otherwise, reading from the SMR disk; then returning to step S11.

Further, processing a write request comprises the following steps: S21: receiving a write request; S22: performing address conversion on the write request, generating the address of the data block and the address of the parity block (check block) at the same time; S23: determining the block type; if the block is a parity block, writing it directly to the SMR disk and returning to step S21; if the block is a data block, proceeding to step S24; S24: determining whether the buffer is hit; if so, updating the block at its original position in the buffer and returning to step S21; otherwise, proceeding to step S25; S25: determining whether the buffer is full; if so, executing a garbage collection process; if the buffer is not full, or after garbage collection finishes, writing the block directly to the buffer.

Further, in step S12 and step S22, the address conversion is performed as follows: the parity blocks (check blocks) of each disk are moved onto the overwritable tracks, and the remaining data blocks are arranged sequentially on the disk in ascending order of address.

Further, the number of disks in the RAID must equal the number of tracks in a Band.

Further, in step S13, the determination is performed as follows: S131: determining whether the data block corresponding to the read request is in the overwritable area of the persistent buffer; if so, reading the data directly from the overwritable area; otherwise, determining whether the data block is in the shingled area of the persistent buffer, and if so, reading it from the shingled area; S132: if the data block is found in neither area of the persistent buffer, that is, no cache hit occurs, reading the data directly from the corresponding data block address.

Further, in step S25, the garbage collection process comprises: (1) garbage collection of the overwritable area: first, an empty Band of the shingled area is selected; if the entire shingled area is full, shingled area garbage collection is performed first. Then, as long as the remaining space of the selected Band allows, the Band number that owns the most data blocks in the overwritable area is selected as the eviction number, and the data blocks corresponding to the eviction number are moved into the selected Band in sequence; this is repeated until no further eviction number can be selected; (2) garbage collection of the shingled area: first, the distribution of Band numbers across the data blocks in the shingled area is counted; each time, the Band number that owns the most data blocks is selected as the eviction number, and all data blocks in the Bands of the shingled area that contain the eviction number are moved to the corresponding area of the shingled recording disk; (3) when the buffer is not full, or after garbage collection finishes, the write request is written to a free position in the overwritable area.

A data storage device comprising an operating system, an address mapping module, a persistent buffer, and a shingled recording disk group; the operating system is used for processing user read-write requests; the address mapping module is used for implementing address translation; the persistent buffer is used for caching data block requests; the shingled recording disk group is used for storing user data.

A computer program comprising computer program instructions for implementing the steps corresponding to the above data storage method when executed by a processor.

A computer readable storage medium having stored thereon computer program instructions for implementing the corresponding steps of the above data storage method when executed by a processor.

By adopting the above technical scheme, the invention has the following advantages: 1. It improves the read-write speed, especially the random write speed, of a shingled recording RAID5 system. 2. It improves the performance of the storage system by changing only the address mapping, without changing the existing hardware or operating system. 3. It adopts a novel storage medium (host-managed shingled recording), which has lower cost and higher storage density than conventional magnetic recording.

In summary, the present invention exploits the random-write capability of the overwritable tracks to implement a new address mapping and buffer structure, thereby greatly reducing the write amplification overhead of a RAID5 system and improving the read-write performance of the system.

Drawings

FIG. 1 is a flow diagram of read request processing of the present invention;

FIG. 2 is a write request processing flow diagram of the present invention;

FIG. 3 is a schematic diagram of a conventional RAID5 mapping;

FIG. 4 is a mapping diagram of the present invention;

FIG. 5 is a schematic diagram of an overwritable write track configuration of the present invention;

FIG. 6 is a diagram of a persistent buffer configuration of the present invention;

FIG. 7 is a system device architecture diagram of the present invention.

Detailed Description

The invention is described in detail below with reference to the figures and examples.
