Distributed storage system based on electronic files

Document No.: 1952033 Publication date: 2021-12-10

Reading note: this technology, "Distributed storage system based on electronic files", was designed by 罗健锋, 王俊峰, 崔起旭, 李仕军, 周东, 翟晓清 and 许晓峰 on 2021-08-30. Summary: the invention relates to a distributed storage system based on electronic files, comprising the following: dividing the network into a management network, a storage private network and a storage external network; connecting the storage external network to the EDS cluster through stacked storage external network switches; connecting the management network to the EDS cluster through a management switch; connecting the storage private network to the EDS cluster through stacked storage private network switches; and ensuring that the storage node management IPs and the EDS cluster IP are in the same network segment. As a preferred embodiment, the storage external network and the storage private network are connected to the EDS cluster either with or without link aggregation. The invention provides a secure networking scheme for the electronic file data storage system that can, to a certain extent, withstand unexpected events during service operation, provides end-to-end high availability, and ensures that services run safely and efficiently.

1. A distributed storage system based on electronic files, comprising:

a protocol gateway layer PGW, configured to provide compatibility with multiple protocols;

a distributed capacity layer PhxStore, configured to optimize and refine the stored data and to achieve highly concurrent reads and writes of the stored data based on striping and data distribution techniques;

a storage service layer PhxTire, configured to interface upward with the protocol gateway layer PGW and to adapt downward to the capacity layer PhxStore;

and a visual management layer Manager, configured to provide monitoring and management of the distributed storage system.

2. The distributed storage system based on electronic files according to claim 1, wherein the storage service layer PhxTire comprises two storage engines:

a first engine PhxKV, configured to provide an efficient metadata service by using a distributed database;

and a second engine PhxCache, configured to provide high-performance data reads and writes by using distributed caching and small-file merging.

3. The distributed storage system based on electronic files according to claim 2, wherein the PGW comprises three different types of storage interface, each type supporting standard interface protocols:

the block storage interface supports the Librbd, iSCSI, SCSI and FC protocols;

the object storage interface supports the S3 and Swift protocols;

and the file storage interface supports the NFS, CIFS, SMB and FTP protocols.

4. The distributed storage system based on electronic files according to claim 1, wherein the system further comprises:

an access selection module, configured, when a data access request is received, to preferentially access the stored data in the distributed storage system and, when the stored data in the distributed storage system is abnormal, to automatically resolve the IP to the same-city disaster recovery data center.

5. The distributed storage system based on electronic files according to claim 1, wherein the distributed storage system is constructed as follows:

assuming that the value space of the hash function H is $[0, 2^{32}-1]$, that is, the hash value is a 32-bit unsigned integer, the whole hash space formed by the function H is treated as a ring on which 0 and $2^{32}-1$ meet at the zero point;

dividing the hash ring equally among the servers in the distributed storage system, determining the position of each server on the hash ring, taking each server as a node and recording its position information;

when storing data, calculating the hash value of the data key of the data to be stored through the function H, determining the position of the hash value on the hash ring, and taking the first server encountered when moving clockwise from that position as the server that stores the data;

and repeating the process until all the data to be stored has been stored.

6. The distributed storage system based on electronic files according to claim 5, wherein the distributed storage system further comprises:

a backup module, configured to back up the data stored in the distributed storage system and to recover abnormal data when the data stored in the distributed storage system becomes abnormal;

specifically, the backup module uses the following backup method:

converting the data to be stored into N original symbols according to a preset conversion mechanism, adding M further symbols so that the total number of symbols is L, and spreading the L symbols across L storage nodes, wherein the symbols satisfy the following rule: any N of the L symbols can be used to recover the data to be stored.

7. The distributed storage system based on electronic files according to claim 1, wherein the visual management layer Manager comprises:

an early warning module, configured to raise an early warning when the hardware or logical resources of the distributed storage system have health problems.

8. The distributed storage system based on electronic files according to claim 5, wherein the construction of the distributed storage system further comprises:

introducing virtual nodes such that each virtual node corresponds to exactly one server while each server may correspond to multiple virtual nodes, wherein the nodes formed by the virtual nodes and the servers are used to divide the hash ring equally and to store data information.

Technical Field

The invention relates to the field of electronic files, and in particular to a distributed storage system based on electronic files.

Background

Existing online case-handling business systems are often not mature enough to store data safely and effectively, and therefore cannot ensure that the electronic file system runs efficiently.

Disclosure of Invention

An object of the present invention is to address at least one deficiency of the prior art by providing a distributed storage system based on electronic files.

To achieve this object, the invention adopts the following technical solution:

specifically, a distributed storage system based on electronic files is provided, comprising:

a protocol gateway layer PGW, configured to provide compatibility with multiple protocols;

a distributed capacity layer PhxStore, configured to optimize and refine the stored data and to achieve highly concurrent reads and writes of the stored data based on striping and data distribution techniques;

a storage service layer PhxTire, configured to interface upward with the protocol gateway layer PGW and to adapt downward to the capacity layer PhxStore;

and a visual management layer Manager, configured to provide monitoring and management of the distributed storage system.

Further, the storage service layer PhxTire comprises two storage engines:

a first engine PhxKV, configured to provide an efficient metadata service by using a distributed database;

and a second engine PhxCache, configured to provide high-performance data reads and writes by using distributed caching and small-file merging.

Further, the protocol gateway layer PGW comprises three different types of storage interface, each type supporting standard interface protocols:

the block storage interface supports the Librbd, iSCSI, SCSI and FC protocols;

the object storage interface supports the S3 and Swift protocols;

and the file storage interface supports the NFS, CIFS, SMB and FTP protocols.

Further, the system also comprises:

an access selection module, configured, when a data access request is received, to preferentially access the stored data in the distributed storage system and, when the stored data in the distributed storage system is abnormal, to automatically resolve the IP to the same-city disaster recovery data center.

Further, the distributed storage system is constructed as follows:

assuming that the value space of the hash function H is $[0, 2^{32}-1]$, that is, the hash value is a 32-bit unsigned integer, the whole hash space formed by the function H is treated as a ring on which 0 and $2^{32}-1$ meet at the zero point;

dividing the hash ring equally among the servers in the distributed storage system, determining the position of each server on the hash ring, taking each server as a node and recording its position information;

when storing data, calculating the hash value of the data key of the data to be stored through the function H, determining the position of the hash value on the hash ring, and taking the first server encountered when moving clockwise from that position as the server that stores the data;

and repeating the process until all the data to be stored has been stored.

Further, the distributed storage system also comprises:

a backup module, configured to back up the data stored in the distributed storage system and to recover abnormal data when the data stored in the distributed storage system becomes abnormal;

specifically, the backup module uses the following backup method:

converting the data to be stored into N original symbols according to a preset conversion mechanism, adding M further symbols so that the total number of symbols is L, and spreading the L symbols across L storage nodes, wherein the symbols satisfy the following rule: any N of the L symbols can be used to recover the data to be stored.

Further, the visual management layer Manager comprises:

an early warning module, configured to raise an early warning when the hardware or logical resources of the distributed storage system have health problems.

Further, the construction of the distributed storage system also comprises:

introducing virtual nodes such that each virtual node corresponds to exactly one server while each server may correspond to multiple virtual nodes, wherein the nodes formed by the virtual nodes and the servers are used to divide the hash ring equally and to store data information.

The beneficial effects of the invention are as follows:

the invention provides a distributed storage system based on electronic files, in which the whole system is deployed as distributed storage. The protocol gateway layer PGW provides compatibility with multiple protocols; the distributed capacity layer PhxStore optimizes and refines the stored data and achieves highly concurrent reads and writes of the stored data based on striping and data distribution techniques; the storage service layer PhxTire interfaces upward with the protocol gateway layer PGW and adapts downward to the capacity layer PhxStore; and the visual management layer Manager provides monitoring and management of the distributed storage system. Together, these layers store electronic file data safely, and when the electronic file data becomes abnormal it is quickly recovered from previously created backup data, so that the distributed storage system keeps running efficiently.

Drawings

The foregoing and other features of the present disclosure will become more apparent from the detailed description of the embodiments given in conjunction with the drawings, in which like reference characters designate the same or similar elements. The drawings described below are merely some examples of the present disclosure, and those skilled in the art can derive other drawings from them without inventive effort. In the drawings:

FIG. 1 is a schematic structural diagram of a networked security system based on electronic files according to the present invention.

Detailed Description

The conception, the specific structure and the technical effects of the present invention will be clearly and completely described in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the schemes and the effects of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The same reference numbers will be used throughout the drawings to refer to the same or like parts.

Referring to FIG. 1, in embodiment 1 the present invention provides a distributed storage system based on electronic files, comprising:

a protocol gateway layer PGW, configured to provide compatibility with multiple protocols;

a distributed capacity layer PhxStore, configured to optimize and refine the stored data and to achieve highly concurrent reads and writes of the stored data based on striping and data distribution techniques;

a storage service layer PhxTire, configured to interface upward with the protocol gateway layer PGW and to adapt downward to the capacity layer PhxStore;

and a visual management layer Manager, configured to provide monitoring and management of the distributed storage system.

In this embodiment, the layers are as follows:

1) Protocol gateway layer (PGW)

The protocol gateway layer is mainly responsible for adapting and converting storage protocols and supports iSCSI, NFS, SMB, S3, HDFS and others. To make protocol access reliable and continuous, the EDS automatically groups the protocol gateway layers of the different storage nodes into a cluster and guarantees highly available data access through unified cluster scheduling.

2) Storage service layer (PhxTire)

PhxTire is a high-performance distributed tiered storage system built on a self-developed distributed programming framework (SRAFT). It serves as the middleware of the EDS (electronic data system), interfacing upward with the protocol gateway layer and adapting downward to the capacity layer PhxStore. PhxTire contains two storage engines, PhxKV and PhxCache: PhxKV provides an efficient metadata service through a distributed database, and PhxCache provides high-performance data reads and writes through techniques such as distributed caching and small-file merging.
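The text does not say how PhxCache merges small files. As a minimal sketch of the general idea only, the following packs many small payloads into one append-only buffer and keeps an index of offsets; the class name `MergedBlob` and the file names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class MergedBlob:
    """Packs many small files into one append-only buffer plus an index."""
    data: bytearray = field(default_factory=bytearray)
    # file name -> (offset, length) inside the shared buffer
    index: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def put(self, name: str, payload: bytes) -> None:
        # Record where the payload starts, then append it to the buffer.
        self.index[name] = (len(self.data), len(payload))
        self.data.extend(payload)

    def get(self, name: str) -> bytes:
        # One index lookup and one slice, regardless of how many files are packed.
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])


blob = MergedBlob()
blob.put("case-001/page1.txt", b"first page")
blob.put("case-001/page2.txt", b"second page")
```

Writing one large object instead of many tiny ones reduces per-file overhead on the capacity layer; the index plays the role that a metadata service would play in a real system.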

3) Distributed capacity layer (PhxStore)

As the capacity layer, PhxStore has two main jobs: guaranteeing data reliability and getting the most out of the hard disks. For data reliability, the EDS extensively optimizes and extends open-source erasure codes so that they meet the requirements of production environments. For disk performance, the EDS uses striping and data distribution techniques to achieve highly concurrent data reads and writes, which safeguards the read and write performance of the storage system.
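To illustrate what striping means here, the sketch below cuts a byte string into fixed-size units and deals them round-robin across several disks, so that consecutive units can be read or written in parallel. The function names, the unit size and the in-memory "disks" are assumptions made for illustration; the patent does not describe PhxStore's actual layout.

```python
from typing import List


def stripe(data: bytes, num_disks: int, unit: int) -> List[List[bytes]]:
    """Split data into fixed-size units and deal them round-robin across disks."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), unit):
        disks[(offset // unit) % num_disks].append(data[offset:offset + unit])
    return disks


def unstripe(disks: List[List[bytes]]) -> bytes:
    """Reassemble the original bytes by reading the units back in round-robin order."""
    out = bytearray()
    rows = max(len(d) for d in disks)
    for row in range(rows):
        for d in disks:
            if row < len(d):  # the last row may be only partly filled
                out.extend(d[row])
    return bytes(out)


layout = stripe(b"abcdefghij", num_disks=3, unit=2)
```

With three disks and two-byte units, the five units of the sample input land on disks 0, 1, 2, 0, 1 in turn, and `unstripe(layout)` returns the original bytes.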

4) Visual management layer

The Manager layer provides monitoring and management of the storage system. For monitoring, the EDS uses visualization to display the health of hardware and logical resources in real time and can raise early warnings of faults. For management, the EDS uses guided wizards to simplify and automate tasks such as capacity expansion and configuration, so that the corresponding operations can be completed in a few steps.

In this way electronic file data is stored safely, and when the electronic file data becomes abnormal it is quickly recovered from previously created backup data, so that the distributed storage system keeps running efficiently.

As a preferred embodiment of the present invention, the storage service layer PhxTire comprises two storage engines:

a first engine PhxKV, configured to provide an efficient metadata service by using a distributed database;

and a second engine PhxCache, configured to provide high-performance data reads and writes by using distributed caching and small-file merging.

As a preferred embodiment of the present invention, the protocol gateway layer PGW comprises three different types of storage interface, each type supporting standard interface protocols:

the block storage interface supports the Librbd, iSCSI, SCSI and FC protocols;

the object storage interface supports the S3 and Swift protocols;

and the file storage interface supports the NFS, CIFS, SMB and FTP protocols.

As a preferred embodiment of the present invention, the system further comprises:

an access selection module, configured, when a data access request is received, to preferentially access the stored data in the distributed storage system and, when the stored data in the distributed storage system is abnormal, to automatically resolve the IP to the same-city disaster recovery data center.
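The selection rule of the access selection module can be sketched as a priority list with a health check: use the primary storage when it is healthy, otherwise fall back to the same-city disaster recovery site. The function name, the addresses (taken from the reserved documentation ranges) and the `is_healthy` callback are hypothetical; in the described system the switch is made through IP resolution rather than in application code.

```python
from typing import Callable, Sequence

# Hypothetical addresses from the reserved documentation ranges.
PRIMARY = "192.0.2.10"      # production distributed storage
DR_SITE = "198.51.100.10"   # same-city disaster recovery data center


def select_endpoint(endpoints: Sequence[str],
                    is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy endpoint, trying them in priority order."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    raise RuntimeError("no healthy storage endpoint available")


# Normal operation: the primary is chosen.
normal = select_endpoint([PRIMARY, DR_SITE], lambda ep: True)
# Primary reported abnormal: requests resolve to the disaster recovery site.
failover = select_endpoint([PRIMARY, DR_SITE], lambda ep: ep != PRIMARY)
```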

As a preferred embodiment of the present invention, the distributed storage system is constructed as follows:

assuming that the value space of the hash function H is $[0, 2^{32}-1]$, that is, the hash value is a 32-bit unsigned integer, the whole hash space formed by the function H is treated as a ring on which 0 and $2^{32}-1$ meet at the zero point;

dividing the hash ring equally among the servers in the distributed storage system, determining the position of each server on the hash ring, taking each server as a node and recording its position information;

when storing data, calculating the hash value of the data key of the data to be stored through the function H, determining the position of the hash value on the hash ring, and taking the first server encountered when moving clockwise from that position as the server that stores the data;

and repeating the process until all the data to be stored has been stored.
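The placement steps above can be sketched in a few lines. The sketch assumes that H is the first four bytes of an MD5 digest (the patent does not name a hash function) and places the servers at equally spaced positions on the ring, as the text describes; `HashRing`, `server_at` and `locate` are illustrative names.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # hash values are 32-bit unsigned integers


def h(key: str) -> int:
    """Assumed hash function H: first 4 bytes of MD5, read as an unsigned int."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    def __init__(self, servers):
        self.servers = list(servers)
        n = len(self.servers)
        # Divide the ring equally: server i sits at position i * 2^32 / n.
        self.positions = [i * RING_SIZE // n for i in range(n)]

    def server_at(self, position: int) -> str:
        # First server at or clockwise after the position; past the last
        # server the search wraps around to the zero point.
        i = bisect.bisect_left(self.positions, position)
        return self.servers[i % len(self.servers)]

    def locate(self, data_key: str) -> str:
        """Return the server that stores the data identified by data_key."""
        return self.server_at(h(data_key))


ring = HashRing(["s0", "s1", "s2", "s3"])
owner = ring.locate("case-2021-0001/page-17.pdf")
```

Because the positions are kept sorted, each lookup is a binary search, and a key that hashes beyond the last server position wraps around to the first server.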

As a preferred embodiment of the present invention, the distributed storage system further comprises:

a backup module, configured to back up the data stored in the distributed storage system and to recover abnormal data when the data stored in the distributed storage system becomes abnormal;

specifically, the backup module uses the following backup method:

converting the data to be stored into N original symbols according to a preset conversion mechanism, adding M further symbols so that the total number of symbols is L, and spreading the L symbols across L storage nodes, wherein the symbols satisfy the following rule: any N of the L symbols can be used to recover the data to be stored.
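The patent does not name the code used by the backup module. One construction that has the stated property (any N of the L = N + M symbols recover the data) is a Reed-Solomon-style code built from polynomial interpolation. The sketch below works over the small prime field of 257 elements so that the arithmetic stays readable; the function names and the field size are assumptions, and a production system would normally use an optimized erasure-coding library instead.

```python
P = 257  # small prime field; every byte value 0..255 is a valid symbol


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange form, mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat's little theorem).
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


def encode(data, m):
    """Return L = N + M symbols: the N originals followed by M redundant ones."""
    n = len(data)
    points = list(enumerate(data))  # symbol i is the polynomial's value at x = i
    return list(data) + [_interpolate(points, x) for x in range(n, n + m)]


def decode(survivors, n):
    """Rebuild the N original symbols from any N surviving (index, symbol) pairs."""
    points = list(survivors)[:n]
    return [_interpolate(points, x) for x in range(n)]


data = [104, 105, 33, 7]                           # N = 4 original symbols
coded = encode(data, m=2)                          # L = 6 symbols on 6 nodes
survivors = [(i, coded[i]) for i in (1, 3, 4, 5)]  # nodes 0 and 2 are lost
recovered = decode(survivors, n=len(data))
```

The N original symbols fix a polynomial of degree below N, and every stored symbol is a value of that polynomial, so any N surviving values determine it again. The sketch requires L to be at most 257 so that all evaluation points are distinct in the field.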

As a preferred embodiment of the present invention, the visual management layer Manager comprises:

an early warning module, configured to raise an early warning when the hardware or logical resources of the distributed storage system have health problems.

As a preferred embodiment of the present invention, the construction of the distributed storage system further comprises:

introducing virtual nodes such that each virtual node corresponds to exactly one server while each server may correspond to multiple virtual nodes, wherein the nodes formed by the virtual nodes and the servers are used to divide the hash ring equally and to store data information.

The ring is divided into M equal parts; if N servers join, each server is assigned V = M/N virtual nodes. When a server goes offline, its virtual nodes are evenly distributed around the ring, so its load is shared evenly among the neighboring nodes; likewise, when a new server joins, load from the other nodes is transferred evenly to it. In addition, the virtual nodes on the ring can be allocated to servers using each server's actual performance as a weight, which addresses performance differences between storage nodes.
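A minimal sketch of this allocation follows, assuming that the number of ring slots M is divisible by the total weight and that slots are handed out round-robin so that each server's virtual nodes are spread around the ring. `build_virtual_ring`, `owner_of` and the weights are illustrative, not taken from the patent.

```python
import bisect

RING_SIZE = 2 ** 32


def build_virtual_ring(weights: dict, slots: int):
    """Divide the ring into `slots` equal parts and hand them out by weight.

    Returns (positions, owners): the start position of every virtual node
    and the server that owns it.
    """
    total = sum(weights.values())
    # With equal weights this is V = M / N virtual nodes per server.
    remaining = {s: slots * w // total for s, w in weights.items()}
    positions, owners = [], []
    slot = 0
    # Interleave the servers so that no server owns one long contiguous arc.
    while any(remaining.values()):
        for server in weights:
            if remaining[server] > 0:
                positions.append(slot * RING_SIZE // slots)
                owners.append(server)
                remaining[server] -= 1
                slot += 1
    return positions, owners


def owner_of(position: int, positions, owners) -> str:
    """First virtual node at or clockwise after the position, with wrap-around."""
    i = bisect.bisect_left(positions, position)
    return owners[i % len(owners)]


# Server "c" is assumed to be twice as capable as "a" and "b".
positions, owners = build_virtual_ring({"a": 1, "b": 1, "c": 2}, slots=8)
```

With M = 8 slots and weights 1 : 1 : 2, servers "a" and "b" each receive two virtual nodes and "c" receives four, so "c" serves roughly half of the key space.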

To mitigate the efficiency loss caused by data migration, the migration triggered by a newly added node can be spread across query tasks, so that each query migrates a small portion of the data. Migration can also be performed while the system is idle, which effectively improves efficiency.
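One way to picture this incremental migration: each query is served first and then moves a small, bounded batch of keys to the new node, while an idle-time routine finishes whatever is left. The class `LazyMigrator`, the batch size and the dictionaries standing in for storage nodes are all assumptions made for illustration.

```python
class LazyMigrator:
    """Moves a bounded number of keys from an old node to a new node per query."""

    def __init__(self, old: dict, new: dict, pending, batch: int = 2):
        self.old, self.new = old, new
        self.pending = list(pending)  # keys that now belong on the new node
        self.batch = batch

    def _migrate_some(self) -> None:
        for _ in range(min(self.batch, len(self.pending))):
            key = self.pending.pop()
            self.new[key] = self.old.pop(key)

    def get(self, key):
        # Serve the query from whichever node currently holds the key...
        value = self.new[key] if key in self.new else self.old[key]
        # ...then piggyback a small slice of migration work on it.
        self._migrate_some()
        return value

    def drain(self) -> None:
        """Finish all remaining migration, for example while the system is idle."""
        while self.pending:
            self._migrate_some()


old_node = {"k1": 1, "k2": 2, "k3": 3, "k4": 4, "k5": 5}
new_node = {}
migrator = LazyMigrator(old_node, new_node, pending=["k1", "k2", "k3"], batch=2)
first = migrator.get("k5")   # answers the query, then moves two keys
second = migrator.get("k3")  # k3 already lives on the new node
```

Bounding the batch keeps the extra latency added to any single query small, at the cost of a longer overall migration window.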

Description of the architecture:

1. The underlying storage uses x86 servers plus distributed storage technology, using the servers' local disks to build large-scale storage clusters at the scale of hundreds of PB.

2. The distributed object storage presents a unified namespace to the electronic file system, the trial process management system and other systems in cross-region scenarios, and supports the service requirements of nearby access, data aggregation and disaster recovery.

3. Disaster recovery is deployed across data centers in the same city: both the primary and the standby data center store the full data set, and, combined with an intelligent DNS component, operation switches in real time to the same-city disaster recovery data center after the underlying storage in the production data center fails.

4. Because of network conditions, the more important files in each regional data center can be aggregated to the same-city disaster recovery data center as needed.

5. Network resources from each region to the data center are expensive and bandwidth is limited, so the distributed object storage system provides QoS control policies based on traffic, time period and the like.

6. The object storage planned for this project is a fully decentralized distributed object storage system suited to massive numbers of files and large storage capacity, and provides the following functions:

support for large-capacity hard disks;

support for distributed metadata architecture storage;

support for massive file storage;

support for 7 × 24-hour service continuity, with online upgrades, capacity expansion and other changes that do not affect services;

support for file multi-versioning, CDP and fast rollback, so that unstructured data needs no separate backup, which avoids the poor backup performance and slow retrieval of tape libraries;

support for storage operation, maintenance and administration through a full management interface.
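Item 5 above mentions QoS policies based on traffic and time period without giving details. A common way to build such a policy is a token bucket whose refill rate depends on the hour of day; the sketch below shows that idea only. The class name, the rates and the peak window are hypothetical, and the clock is passed in explicitly so that the behavior is easy to follow.

```python
class PeriodRateLimiter:
    """Token bucket whose refill rate depends on the hour of day."""

    def __init__(self, peak_rate: float, offpeak_rate: float,
                 peak_hours: range, burst: float):
        self.peak_rate = peak_rate        # bytes per second during peak hours
        self.offpeak_rate = offpeak_rate  # bytes per second otherwise
        self.peak_hours = peak_hours
        self.burst = burst                # bucket capacity in bytes
        self.tokens = burst
        self.last = None                  # time of the previous call, in seconds

    def _rate(self, hour: int) -> float:
        return self.peak_rate if hour in self.peak_hours else self.offpeak_rate

    def allow(self, nbytes: float, now: float, hour: int) -> bool:
        """Return True if `nbytes` may be sent at time `now` (seconds)."""
        if self.last is not None:
            refill = (now - self.last) * self._rate(hour)
            self.tokens = min(self.burst, self.tokens + refill)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Hypothetical policy: 10 B/s between 09:00 and 18:00, 100 B/s otherwise.
limiter = PeriodRateLimiter(peak_rate=10, offpeak_rate=100,
                            peak_hours=range(9, 18), burst=50)
a = limiter.allow(50, now=0, hour=10)  # spends the initial burst
b = limiter.allow(20, now=1, hour=10)  # only 10 tokens refilled at peak rate
c = limiter.allow(20, now=2, hour=22)  # off-peak refill restores the bucket
```

A regional link could use a limiter like this to throttle bulk aggregation traffic during working hours and let it run faster overnight.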

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.

The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and which implements the steps of the above method embodiments when executed by a processor. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

While the present invention has been described in considerable detail and with particular reference to a few illustrative embodiments thereof, it is not intended to be limited to any such details or embodiments or any particular embodiments, but it is to be construed as effectively covering the intended scope of the invention by providing a broad, potential interpretation of such claims in view of the prior art with reference to the appended claims. Furthermore, the foregoing describes the invention in terms of embodiments foreseen by the inventor for which an enabling description was available, notwithstanding that insubstantial modifications of the invention, not presently foreseen, may nonetheless represent equivalent modifications thereto.

The above description is only a preferred embodiment of the present invention, and the present invention is not limited to this embodiment; any solution that achieves the technical effects of the present invention by the same means shall fall within the protection scope of the present invention. Within that scope, the technical solution and/or its implementation may be modified and varied in other ways.
