Network fabric storage system

Note: This technology, "Network fabric storage system," was created by 陈旭 (Chen Xu) on 2019-10-16. Abstract: The invention discloses a network fabric storage system, comprising: a chassis housing a plurality of primary storage devices and one or more cache devices, the cache devices being separate from each of the primary storage devices. A Link Control Card (LCC) is housed in the chassis and coupled to each of the primary storage devices and the cache devices via a midplane. The LCC includes a translation layer processor that receives first data from a host device and processes the first data for storage in a cache device such that the first data is stored in the cache device. The translation layer processor then determines that the first data should be stored in a first primary storage device and, in response, causes the first data to be moved from the cache device to the first primary storage device such that the first data is stored in the first primary storage device.

1. A network fabric storage system comprising:

a chassis;

a plurality of primary storage devices housed in the chassis;

at least one cache device separate from each of the plurality of primary storage devices and housed in the chassis;

a midplane housed in the chassis; and

a Link Control Card (LCC) housed in the chassis and coupled to each of the plurality of primary storage devices and the at least one cache device via the midplane, wherein the LCC comprises a translation layer processor configured to:

receive first data from at least one host device;

process the first data for storage in the at least one cache device such that the first data is stored in the at least one cache device;

determine that the first data should be stored in a first primary storage device included in the plurality of primary storage devices; and

cause the first data to be moved from the at least one cache device to the first primary storage device such that the first data is stored in the first primary storage device.

2. The system of claim 1, wherein the translation layer processor is configured to:

perform a logical address to physical address mapping operation on the first data stored on the at least one cache device; and

perform a logical address to physical address mapping operation on the first data stored on the first primary storage device.

3. The system of claim 1, wherein the translation layer processor is configured to:

generate and store metadata associated with a lifecycle of each of the plurality of primary storage devices.

4. The system of claim 1, wherein the translation layer processor is configured to:

determine that the first data stored on the first primary storage device is unavailable and, in response, perform a data recovery operation on the data stored on the first primary storage device.

5. The system of claim 1, wherein each of the plurality of primary storage devices is provided by a non-volatile memory express (NVMe) Solid State Drive (SSD).

6. The system of claim 1, wherein the translation layer processor is configured to:

receive second data from the host device;

process the second data for storage in the at least one cache device such that the second data is stored in the at least one cache device;

determine that the second data should be stored in a second primary storage device included in the plurality of primary storage devices; and

cause the second data to be moved from the at least one cache device to the second primary storage device such that the second data is stored in the second primary storage device.

7. The system of claim 1, wherein each of the plurality of primary storage devices is free of a translation layer processor and a cache subsystem.

8. An Information Handling System (IHS), comprising:

a processing system; and

a memory system coupled to the processing system and including instructions that, when executed by the processing system, cause the processing system to provide a translation layer engine configured to:

receive first data from at least one host device;

process the first data for storage in at least one cache device that is coupled to the translation layer engine via a midplane and located in a storage/cache enclosure, such that the first data is stored in the at least one cache device;

determine that the first data should be stored in a first primary storage device included in a plurality of primary storage devices that are coupled to the translation layer engine via the midplane, each separate from the at least one cache device and located in the storage/cache enclosure; and

cause the first data to be moved from the at least one cache device to the first primary storage device such that the first data is stored in the first primary storage device.

9. The IHS of claim 8, wherein the translation layer engine is configured to:

perform a logical address to physical address mapping operation on the first data stored on the at least one cache device; and

perform a logical address to physical address mapping operation on the first data stored on the first primary storage device.

10. The IHS of claim 8, wherein the translation layer engine is configured to:

generate and store metadata associated with a lifecycle of each of the plurality of primary storage devices.

11. The IHS of claim 8, wherein the translation layer engine is configured to:

determine that the first data stored on the first primary storage device is unavailable and, in response, perform a data recovery operation on the data stored on the first primary storage device.

12. The IHS of claim 8, wherein each of the plurality of primary storage devices is provided by a non-volatile memory express (NVMe) Solid State Drive (SSD).

13. The IHS of claim 8, wherein the translation layer engine is configured to:

receive second data from the host device;

process the second data for storage in the at least one cache device such that the second data is stored in the at least one cache device;

determine that the second data should be stored in a second primary storage device included in the plurality of primary storage devices; and

cause the second data to be moved from the at least one cache device to the second primary storage device such that the second data is stored in the second primary storage device.

14. A method of storing data in a network fabric storage system, the method comprising:

receiving, by a translation layer processor, first data from at least one host device;

processing, by the translation layer processor, the first data for storage in at least one cache device coupled to the translation layer processor via a midplane and located in a storage/cache enclosure such that the first data is stored in the at least one cache device;

determining, by the translation layer processor, that the first data should be stored in a first primary storage device included in a plurality of primary storage devices coupled to the translation layer processor via the midplane, each separate from the at least one cache device and located in the storage/cache enclosure; and

causing, by the translation layer processor, the first data to be moved from the at least one cache device to the first primary storage device such that the first data is stored in the first primary storage device.

15. The method of claim 14, further comprising:

performing, by the translation layer processor, a logical address to physical address mapping operation on the first data stored on the at least one cache device; and

performing, by the translation layer processor, a logical address to physical address mapping operation on the first data stored on the first primary storage device.

16. The method of claim 14, further comprising:

generating and storing, by the translation layer processor, metadata associated with a lifecycle of each of the plurality of primary storage devices.

17. The method of claim 14, further comprising:

determining, by the translation layer processor, that the first data stored on the first primary storage device is unavailable and, in response, performing a data recovery operation on the data stored on the first primary storage device.

18. The method of claim 14, wherein each of the plurality of primary storage devices is provided by a non-volatile memory express (NVMe) Solid State Drive (SSD).

19. The method of claim 14, further comprising:

receiving, by the translation layer processor, second data from the host device;

processing, by the translation layer processor, the second data for storage in the at least one cache device such that the second data is stored in the at least one cache device;

determining, by the translation layer processor, that the second data should be stored in a second primary storage device included in the plurality of primary storage devices; and

causing, by the translation layer processor, the second data to be moved from the at least one cache device to the second primary storage device such that the second data is stored in the second primary storage device.

20. The method of claim 14, wherein each of the plurality of primary storage devices is free of a translation layer processor and a cache subsystem.

Technical Field

The present disclosure relates generally to information handling systems, and more particularly, to data storage via a network fabric.

Background

As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to users is an information handling system. Information handling systems typically process, compile, store, and/or transmit information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary from user to user and application to application, information handling systems may also vary with respect to the information being handled, the manner in which the information is handled, the amount of information processed, stored, or transmitted, and the speed and efficiency at which the information is processed, stored, or transmitted. These variations allow information handling systems to be general-purpose or configured for a particular user or particular use, such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, an information handling system may include various hardware and software components that may be configured to process, store, and communicate information, and may include one or more computer systems, data storage systems, and networking systems.

Information handling systems typically include storage systems for storing data, and a current trend is to provide connectivity to such storage systems via a network fabric so that data may be stored over that fabric. For example, storage systems utilizing Non-Volatile Memory express (NVMe) Solid State Drives (SSDs) may be connected to computing devices (often referred to as host devices) via a network fabric to provide an NVMe over Fabrics (NVMeoF) storage system that allows the host devices to store data. One common design architecture for network fabric NVMe storage systems is referred to as the NVMeoF Just a Bunch Of Flash (JBOF) design. The NVMeoF JBOF design may utilize redundant Link Control Cards (LCCs), each providing a respective NVMeoF protocol processing system (e.g., provided via System On a Chip (SOC) technology) coupled to a respective Peripheral Component Interconnect express (PCIe) switch, with those LCCs coupled to the NVMe SSDs via a midplane. Further, the NVMe SSDs in NVMeoF JBOF designs typically include multiple memory devices (e.g., NAND flash memory devices), translation layers (e.g., Flash Translation Layers (FTLs) for the NAND flash memory devices), and controllers (e.g., NAND flash controllers for the NAND flash memory devices) coupled to a processing system (which provides a PCIe/host interface), as well as cache systems that may be provided by Dynamic Random Access Memory (DRAM) devices, Single Level Cell (SLC) flash memory devices, and/or other relatively high-performance, robust storage device technologies known in the art.

In the conventional NVMeoF JBOF design discussed above, data may be received from a host device and converted (e.g., from the Ethernet protocol to the PCIe protocol) by the NVMeoF protocol processing system in the LCC and then provided to the PCIe switch, which transmits the data to one or more NVMe SSDs. The data may thus be received by the PCIe/host interface in the processing system of an NVMe SSD and provided to the FTL in that NVMe SSD, which then processes the data for storage in the DRAM cache system and/or the NAND memory devices in the NVMe SSD. As will be understood by those skilled in the art, the FTL in the NVMe SSD may perform various processes for the NVMe SSD, including data mapping (e.g., logical-to-physical (L2P) address mapping of data stored on the NAND flash devices in the NVMe SSD), generating and storing metadata associated with a lifecycle of the NVMe SSD, performing data recovery operations in the event that data stored on the NVMe SSD is lost, moving data between the NAND flash devices and the DRAM cache system, and/or various other FTL operations known in the art. The inventors of the present disclosure have identified inefficiencies associated with conventional NVMeoF JBOF designs such as those described above.
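Because the disclosure itself contains no code, the following minimal Python sketch (all names are hypothetical) illustrates the conventional per-SSD FTL behavior just described, in which each drive privately owns its L2P map, its DRAM cache, and the decision to move data from cache to NAND:

```python
class ConventionalNVMeSSD:
    """Sketch of a conventional NVMe SSD whose FTL runs on the drive itself;
    class and method names are illustrative, not taken from this disclosure."""

    def __init__(self, nand_pages=1024, cache_pages=64):
        self.nand = [None] * nand_pages   # NAND flash memory devices
        self.cache = {}                   # on-drive DRAM cache system
        self.cache_pages = cache_pages
        self.l2p = {}                     # L2P map: lba -> ("cache"|"nand", index)
        self.write_count = 0              # lifecycle metadata kept by the FTL

    def write(self, lba, data):
        # The drive-resident FTL lands writes in the DRAM cache first.
        self.cache[lba] = data
        self.l2p[lba] = ("cache", lba)
        self.write_count += 1
        if len(self.cache) > self.cache_pages:        # cache pressure
            self._move_to_nand(next(iter(self.cache)))

    def _move_to_nand(self, lba):
        # Cache-to-NAND data movement followed by an L2P remap, both of
        # which are FTL responsibilities described in the text above.
        page = self.nand.index(None)                  # naive free-page pick
        self.nand[page] = self.cache.pop(lba)
        self.l2p[lba] = ("nand", page)

    def read(self, lba):
        kind, index = self.l2p[lba]
        return self.cache[index] if kind == "cache" else self.nand[index]
```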

For example, using the processing system in an NVMe SSD to perform FTL operations locks in the FTL processing capabilities of that NVMe SSD, which limits the ability to optimize or customize FTL processing for different applications, introduces dependencies between the NVMe SSD controller and flash media support, and/or causes various other FTL processing inefficiencies, as will be apparent to those skilled in the art. Furthermore, providing a dedicated caching system on each NVMe SSD may increase the cost of those NVMe SSDs and lock in the caching-system-to-NAND-flash ratio of the NVMe SSD, which may result in reduced caching system utilization (e.g., when NVMe SSD utilization is low), hinder flexibility in the caching media types used by the NVMe SSD, hinder caching system modification and/or adjustment (e.g., performance upgrades and/or downgrades depending on NVMe SSD usage), and/or result in various other caching system inefficiencies, as will be apparent to those skilled in the art.

It is therefore desirable to provide an improved network fabric storage system that addresses the above-mentioned problems.

Disclosure of Invention

According to one embodiment, an Information Handling System (IHS) includes a processing system; a memory system coupled to the processing system and including instructions that, when executed by the processing system, cause the processing system to provide a translation layer engine configured to: receiving first data from at least one host device; processing the first data for storage in at least one cache device coupled to the translation layer engine via the midplane and located in the storage/cache enclosure such that the first data is stored in the at least one cache device; determining that the first data should be stored in a first primary storage device included in a plurality of primary storage devices coupled to the translation layer engine via a midplane, each separate from at least one cache device and located in a storage/cache enclosure; and causing the first data to be moved from the at least one cache device to the first primary storage device such that the first data is stored in the first primary storage device.

Drawings

Fig. 1 is a schematic diagram illustrating an embodiment of an Information Handling System (IHS).

Fig. 2 is a schematic diagram illustrating an embodiment of a conventional network fabric storage system.

Fig. 3 is a schematic diagram illustrating an embodiment of a conventional storage device, which may be included in the conventional network fabric storage system of fig. 2.

Fig. 4 is a schematic diagram illustrating an embodiment of a network fabric storage system provided in accordance with the teachings of the present disclosure.

Fig. 5 is a schematic diagram illustrating an embodiment of a storage device provided in accordance with the teachings of the present disclosure, which may be included in the network fabric storage system of fig. 4.

Fig. 6 is a flow diagram illustrating an embodiment of a method for storing data in a network fabric storage system.

Detailed Description

For purposes of this disclosure, an information handling system may include any instrumentality or aggregate of instrumentalities operable to evaluate, compute, determine, classify, process, transmit, receive, retrieve, originate, switch, store, display, communicate, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, or other purposes. For example, an information handling system may be a personal computer (e.g., a desktop or laptop computer), a tablet computer, a mobile device (e.g., a Personal Digital Assistant (PDA) or a smartphone), a server (e.g., a blade server or a rack server), a network storage device, or any other suitable device, and may vary in size, shape, performance, functionality, and price. The information handling system may include Random Access Memory (RAM), one or more processing resources (e.g., a Central Processing Unit (CPU) or hardware or software control logic), ROM, and/or other types of nonvolatile memory. Additional components of the information handling system may include one or more disk drives, one or more network ports for communicating with external devices, and various input and output (I/O) devices, such as a keyboard, a mouse, a touch screen, and/or a video display. The information handling system may also include one or more buses operable to transmit communications between the various hardware components.

In one embodiment, IHS 100 of Fig. 1 includes a processor 102, processor 102 coupled to a bus 104. Bus 104 serves as a connection between processor 102 and the other components of IHS 100. An input device 106 is coupled to the processor 102 to provide input to the processor 102. Examples of input devices may include a keyboard, a touch screen, a pointing device (such as a mouse, trackball, and trackpad), and/or various other input devices known in the art. Programs and data are stored on the mass storage device 108, and the mass storage device 108 is coupled to the processor 102. Examples of mass storage devices may include hard disks, optical disks, magneto-optical disks, solid-state storage devices, and/or various other mass storage devices known in the art. IHS 100 also includes a display 110, display 110 coupled to processor 102 through a video controller 112. The system memory 114 is coupled to the processor 102 to provide fast storage for the processor to facilitate execution of computer programs by the processor 102. Examples of system memory may include Random Access Memory (RAM) devices such as dynamic RAM (DRAM), synchronous DRAM (SDRAM), solid state memory devices, and/or various other memory devices known in the art. In one embodiment, chassis 116 houses some or all of the components of IHS 100. It should be appreciated that other buses and intermediate circuits may be deployed between the above-described components and the processor 102 to facilitate interconnection between the components and the processor 102.

Referring now to Fig. 2, an embodiment of a conventional network fabric storage system 200 is illustrated for purposes of the discussion below. The storage system 200 may be provided by the IHS 100 discussed above with reference to Fig. 1 and/or may include some or all of the components of the IHS 100. Further, while the storage system 200 is illustrated and discussed as being provided within a single enclosure/chassis, those skilled in the art will recognize that the functionality of the storage system 200 and/or its components discussed below may be distributed across multiple enclosures/chassis while remaining within the scope of the present disclosure. In the illustrated embodiment, the storage system 200 includes a chassis 202 that houses the components of the storage system 200, only some of which are illustrated below. For example, the chassis 202 may house a communication system 204, which may be provided by a Network Interface Controller (NIC), a wireless communication system (e.g., Near Field Communication (NFC) components, WiFi components, etc.), and/or any other communication components that would be apparent to those skilled in the art. As such, the communication system 204 may include ports and/or other interfaces for coupling to the one or more host devices (not shown) discussed below.

In the illustrated embodiment, the chassis 202 of the storage system 200 houses a pair of redundant Link Control Cards (LCCs) 206 and 208. For example, the LCC 206 may include a protocol conversion processing system 206a that is coupled to the communication system 204 and to a switch 206b that is also included in the LCC 206. Similarly, the LCC 208 may include a protocol conversion processing system 208a that is coupled to the communication system 204 and to a switch 208b that is also included in the LCC 208. In the following example, the protocol conversion processing systems 206a and 208a are provided via respective NVMe over Fabrics (NVMeoF) Systems On a Chip (SOCs), while the switches 206b and 208b are provided by respective Peripheral Component Interconnect express (PCIe) switches. As will be appreciated by those skilled in the art, the NVMeoF SOCs providing the protocol conversion processing systems 206a and 208a may be configured to convert Ethernet, Fibre Channel, and/or other protocols used for data received from host devices via the communication system 204 into the PCIe protocol used by components in the storage system 200, while the PCIe switches providing the switches 206b and 208b may be configured to route that data (converted to the PCIe protocol as described above). However, while some specific functionality of LCC components has been described, one of ordinary skill in the art will recognize that the LCCs and/or LCC components may provide other conventional functionality while remaining within the scope of the present disclosure.

The chassis 202 of the storage system 200 also includes a storage enclosure 210 that houses a plurality of storage devices 210a, 210b, 210c, and 210d, each coupled to the switches 206b and 208b in the LCCs 206 and 208, respectively, via a midplane 212. In the following example, each storage device 210a-210d is provided by a conventional NVMe SSD that includes dual ports that allow each of those NVMe SSDs to be exclusively coupled to each of the switches 206b and 208b (e.g., the PCIe switches described above) via the midplane 212. The functionality of midplanes and other similar coupling subsystems is known in the art, and thus the coupling of the storage devices 210a-210d to the switches 206b and 208b is not discussed in detail herein. While a particular conventional network fabric storage system 200 has been illustrated, those skilled in the art will appreciate that conventional network fabric storage systems may include a variety of components and/or component configurations for providing conventional network fabric storage system functionality while remaining within the scope of the present disclosure.

Referring now to Fig. 3, an embodiment of a conventional storage device 300 is illustrated for purposes of the discussion below. The storage device 300 may be provided by the IHS 100 discussed above with reference to Fig. 1 and/or may include some or all of the components of the IHS 100. Further, while illustrated and discussed as an NVMe SSD, those skilled in the art will recognize that the functionality of the storage device 300 and/or its components discussed below may be provided by a variety of storage device technologies while remaining within the scope of the present disclosure. In the illustrated embodiment, the storage device 300 includes a chassis 302 that houses the components of the storage device 300, only some of which are illustrated below. For example, the chassis 302 may house a processing system 304, which in the illustrated embodiment includes a host interface 304a, a translation layer processor 304b, and a controller 304c. In the examples below, the host interface 304a is provided by one or more PCIe/NVMe host interfaces and is configured to receive data via the switches 206b and 208b (e.g., the PCIe switches) in the LCCs 206 and 208, respectively. Further, in the examples below, the translation layer processor 304b is provided by a Flash Translation Layer (FTL) processor configured to perform data processing and storage device management functions for the storage device 300 (e.g., for the NAND flash memory devices used by the storage device 300, as described below). Further, in the examples below, the controller 304c is provided by a NAND flash memory controller configured to interact with the NAND flash memory devices used by the storage device 300, as described below. However, while specific components and functionality have been described for the processing system 304, those skilled in the art will recognize that the processing system 304 may include other components and/or functionality while remaining within the scope of the present disclosure.

In the illustrated embodiment, the chassis 302 of the storage device 300 also houses a storage subsystem 306 that includes a plurality of memory devices 306a, 306b, 306c, 306d through 306e, and 306f. In the following example, the memory devices 306a-306f are provided by NAND flash memory devices, but those skilled in the art will recognize that other memory devices and/or storage technologies may be used for the storage subsystem 306 while remaining within the scope of the present disclosure. In the illustrated embodiment, the chassis 302 of the storage device 300 also houses a cache memory subsystem 308, which may be provided by one or more memory devices. In the following examples, the cache memory subsystem 308 is provided by Dynamic Random Access Memory (DRAM) devices, but those skilled in the art will recognize that other memory devices and/or storage technologies (e.g., Single Level Cell (SLC) flash memory devices, etc.) may be used for the cache memory subsystem 308 while remaining within the scope of the present disclosure. In the illustrated embodiment, the chassis 302 of the storage device 300 also houses a communication subsystem 310 that is coupled to the processing system 304 and configured to couple the processing system 304 to the midplane 212 in the storage system 200 and provide communication with the LCCs 206 and 208 in that storage system. While a particular conventional storage device 300 has been illustrated, those skilled in the art will recognize that conventional storage devices may include a variety of components and/or component configurations for providing conventional storage device functionality while remaining within the scope of the present disclosure.

Referring now to Fig. 4, an embodiment of a network fabric storage system 400 provided in accordance with the teachings of the present disclosure is illustrated. The storage system 400 may be provided by the IHS 100 discussed above with reference to Fig. 1 and/or may include some or all of the components of the IHS 100. Further, while the storage system 400 is illustrated and discussed as being provided within a single enclosure/chassis, those skilled in the art will recognize that the functionality of the storage system 400 and/or its components discussed below may be distributed across multiple enclosures/chassis while remaining within the scope of the present disclosure. In the illustrated embodiment, the storage system 400 includes a chassis 402 that houses the components of the storage system 400, only some of which are illustrated below. For example, the chassis 402 may house a communication system 404, which may be provided by a Network Interface Controller (NIC), a wireless communication system (e.g., Near Field Communication (NFC) components, WiFi components, etc.), and/or any other communication components that would be apparent to those skilled in the art. As such, the communication system 404 may include ports and/or other interfaces for coupling to the one or more host devices (not shown) discussed below.

In the illustrated embodiment, the chassis 402 of the storage system 400 houses a pair of redundant Link Control Cards (LCCs) 406 and 408. For example, the LCC 406 may include a protocol conversion processing system 406a that is coupled to the communication system 404 and to a global translation layer processor 406b that is also included in the LCC 406, with the global translation layer processor 406b coupled to a switch 406c that is also included in the LCC 406. Similarly, the LCC 408 may include a protocol conversion processing system 408a that is coupled to the communication system 404 and to a global translation layer processor 408b that is also included in the LCC 408, with the global translation layer processor 408b coupled to a switch 408c that is also included in the LCC 408. In the following example, the protocol conversion processing systems 406a and 408a are provided via respective NVMe over Fabrics (NVMeoF) Systems On a Chip (SOCs), the global translation layer processors 406b and 408b are provided by respective Flash Translation Layer (FTL) processors, and the switches 406c and 408c are provided by respective Peripheral Component Interconnect express (PCIe) switches.

As will be understood by those skilled in the art, the NVMeoF SOCs providing the protocol conversion processing systems 406a and 408a may be configured to convert Ethernet, Fibre Channel, and/or other protocols used for data received from host devices via the communication system 404 into the PCIe protocol used by components in the storage system 400; the FTL processors providing the global translation layer processors 406b and 408b may be configured to perform data processing and storage device management functions for any of the primary storage devices and cache devices provided in the storage system 400; and the PCIe switches providing the switches 406c and 408c may be configured to route data (converted to the PCIe protocol as described above). However, while some specific functionality of LCC components has been described, one of ordinary skill in the art will recognize that the LCCs and/or LCC components may provide other conventional functionality while remaining within the scope of the present disclosure.

The chassis 402 of the storage system 400 also houses a storage/cache enclosure 410 that houses a plurality of primary storage devices 410a-410b, each of which is coupled to each of the switches 406c and 408c in the LCCs 406 and 408, respectively, via a midplane 412. As illustrated, the storage/cache enclosure 410 may also house a plurality of cache devices 410c-410d, each of which is separate from any of the primary storage devices 410a-410b (e.g., provided by a drive, device, chassis, etc. that is different from those providing the primary storage devices 410a-410b), and each of which is coupled to each of the switches 406c and 408c in the LCCs 406 and 408, respectively, via the midplane 412. In the following example, each of the primary storage devices 410a-410b is provided by an NVMe SSD, described below, that includes dual ports such that each of those NVMe SSDs is exclusively coupled (e.g., via a dedicated PCIe connection) to each of the switches 406c and 408c (e.g., the PCIe switches described above) via the midplane 412.

Further, in the following example, each of the cache devices 410c-410d is provided by a DRAM memory system that includes dual ports enabling each of those DRAM memory systems to be exclusively coupled (e.g., via a dedicated PCIe connection) to each of the switches 406c and 408c (e.g., the PCIe switches described above) via the midplane 412. However, those skilled in the art will recognize that the DRAM memory systems providing the cache devices 410c-410d may be replaced by SLC flash memory systems, 3D XPoint memory systems, and/or other cache memory system technologies known in the art. Further, while more than one cache device 410c-410d is illustrated, those skilled in the art will recognize that the storage system 400 may utilize a single cache device while remaining within the scope of this disclosure. As described below, in some embodiments the cache devices 410c-410d may utilize relatively higher-performance and higher-endurance storage technologies than the primary storage devices 410a-410b due to, for example, the tendency to perform more write operations on the storage devices that provide the cache. While a particular network fabric storage system 400 has been illustrated, those skilled in the art will recognize that the network fabric storage system of the present disclosure may include a variety of components and/or component configurations for providing conventional network fabric storage system functionality, as well as the functionality described below, while remaining within the scope of the present disclosure.
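The dual-ported coupling just described can be summarized with a small illustrative mapping; the identifiers below are merely labels for the Fig. 4 reference numerals and are not taken from the disclosure:

```python
# Every primary storage device and cache device in the storage/cache enclosure
# 410 is coupled to both LCC switches via the midplane 412, so either LCC can
# reach any device. The labels below are illustrative only.
MIDPLANE_412_CONNECTIONS = {
    "primary_410a": ("switch_406c", "switch_408c"),
    "primary_410b": ("switch_406c", "switch_408c"),
    "cache_410c":   ("switch_406c", "switch_408c"),
    "cache_410d":   ("switch_406c", "switch_408c"),
}
```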

Referring now to Fig. 5, an embodiment of a storage device 500 provided in accordance with the teachings of the present disclosure is illustrated. The storage device 500 may be provided by the IHS 100 discussed above with reference to Fig. 1 and/or may include some or all of the components of the IHS 100. Further, while illustrated and discussed as an NVMe SSD, those skilled in the art will recognize that the functionality of the storage device 500 and/or its components discussed below may be provided by a variety of storage device technologies while remaining within the scope of the present disclosure. In the illustrated embodiment, the storage device 500 includes a chassis 502 that houses the components of the storage device 500, only some of which are illustrated below. For example, the chassis 502 may house a processing subsystem 504 that, in the illustrated embodiment, includes a host interface 504a and a controller 504b, but does not include a translation layer processor (e.g., the FTL processor discussed above) like the translation layer processor 304b included in the conventional storage device 300 discussed above with reference to Fig. 3. In the following example, the host interface 504a is provided by one or more PCIe/NVMe host interfaces and is configured to receive data via the switches 406c and 408c (e.g., the PCIe switches) in the LCCs 406 and 408, respectively. Further, in the following example, the controller 504b is provided by a NAND flash controller configured to interact with the NAND flash memory devices used by the storage device 500, as described below. However, while specific components and functionality have been described for the processing subsystem 504, those skilled in the art will recognize that the processing subsystem 504 may include other components and/or functionality while remaining within the scope of the present disclosure.

In the illustrated embodiment, the chassis 502 of the storage device 500 also houses a storage subsystem 506 that includes a plurality of memory devices 506a, 506b, 506c, 506d through 506e, and 506f. In the following example, the memory devices 506a-506f are provided by NAND flash memory devices, but those skilled in the art will recognize that other memory devices and/or storage technologies may be used for the storage subsystem 506 while remaining within the scope of the present disclosure. In the illustrated embodiment, the storage device 500 does not include a cache memory subsystem (e.g., the DRAM devices discussed above) like the cache memory subsystem 308 included in the conventional storage device 300 discussed above with reference to Fig. 3. In the illustrated embodiment, the chassis 502 of the storage device 500 also houses a communication subsystem 508 that is coupled to the processing subsystem 504 and configured to couple the processing subsystem 504 to the midplane 412 in the storage system 400 and provide communication with the LCCs 406 and 408 in that storage system. While a particular storage device 500 has been illustrated, those skilled in the art will recognize that storage devices provided in accordance with the teachings of the present disclosure may include a variety of components and/or component configurations for providing conventional storage device functionality, as well as the functionality discussed below, while remaining within the scope of the present disclosure.

Referring now to Fig. 6, an embodiment of a method 600 for storing data in a network fabric storage system is illustrated. As described below, the systems and methods of the present disclosure provide a new architecture for a network fabric storage system that moves translation layer processing from the individual storage devices to a Link Control Card (LCC) in order to provide global translation layer processing for each of the storage devices in the storage system, while removing the cache subsystems from the individual storage devices and providing a centralized cache system used by all of the storage devices. For example, a storage system may include a storage/cache enclosure that houses a plurality of primary storage devices and at least one cache device that is separate from each of the plurality of primary storage devices. A midplane in the storage system couples the plurality of primary storage devices and the at least one cache device to an LCC that includes a translation layer processor, which receives data from a host device and processes the data for storage in the at least one cache device such that the data is stored in the at least one cache device. When the translation layer processor determines that the data should be stored in a first primary storage device included in the plurality of primary storage devices, it causes the data to be moved from the at least one cache device to the first primary storage device such that the data is stored in the first primary storage device.
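A rough Python sketch of that flow (receive, stage in a shared cache device, then move to a chosen primary storage device) is given below, assuming a single global translation layer instance per LCC and using in-memory dictionaries to stand in for the devices; every name is hypothetical, not from the disclosure:

```python
class GlobalTranslationLayer:
    """Sketch of an LCC-resident global FTL: one L2P map and one shared pool
    of cache devices serve every primary storage device behind the midplane."""

    def __init__(self, primaries, caches):
        self.primaries = primaries        # e.g., {"410a": {}, "410b": {}}
        self.caches = caches              # e.g., {"410c": {}, "410d": {}}
        self.l2p = {}                     # lba -> (device_id, key)

    def write(self, lba, data):
        # Stage the write in a cache device first (block 604 of method 600);
        # the least-full-cache placement policy here is an assumed example.
        device = min(self.caches, key=lambda d: len(self.caches[d]))
        self.caches[device][lba] = data
        self.l2p[lba] = (device, lba)

    def migrate(self, lba, primary_id):
        # Move staged data from the cache device to a chosen primary storage
        # device and remap the logical address (block 606 of method 600).
        cache_id, key = self.l2p[lba]
        self.primaries[primary_id][lba] = self.caches[cache_id].pop(key)
        self.l2p[lba] = (primary_id, lba)

    def read(self, lba):
        device, key = self.l2p[lba]
        pool = self.caches if device in self.caches else self.primaries
        return pool[device][key]
```

For example, staging a write with write() and then calling migrate() for primary storage device 410a reproduces the cache-first storage and subsequent data movement described above.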

As such, the new network fabric storage system architecture described herein moves from the individualized translation layer processing provided in conventional storage devices to a global translation layer processor, providing the ability to optimize or customize translation layer processing for different applications, reducing dependencies between storage device controllers and storage media support, reducing the cost of the storage devices (e.g., by eliminating the dedicated chipset provided for dedicated translation layer processing), and providing other translation layer processing efficiencies that will be apparent to those skilled in the art. In addition, the new network fabric storage system architecture described herein eliminates the dedicated caching systems on its storage devices to reduce the cost of those storage devices, provides the ability to adjust the cache-to-primary-storage ratio of the storage system, results in higher cache system utilization, introduces the flexibility to use different cache media types for the storage devices, allows for cache system modifications and/or adjustments (e.g., performance upgrades and/or downgrades depending on the use of the storage system), and/or provides various other cache system efficiencies that will be apparent to those skilled in the art.

With reference to the conventional network fabric storage system 200 illustrated in Fig. 2 and the conventional storage device 300 illustrated in Fig. 3, conventional network fabric storage system operation is briefly discussed for reference below. Referring first to Fig. 2, a host device (not shown) may transmit data to the communication system 204 for storage in the storage system 200. As such, the protocol conversion processing system 206a in the LCC 206 may receive that data from the communication system 204, perform protocol conversion operations on the data to produce converted data (e.g., by converting the data from the Ethernet protocol to the PCIe protocol), and may provide the converted data to the switch 206b, which may route it to one or more of the storage devices 210a-210d. In a similar manner, data may be received via the protocol conversion processing system 208a and the switch 208b in the LCC 208 and provided to one or more of the storage devices 210a-210d, as will be appreciated by those skilled in the art.

Referring to Fig. 3, the host interface 304a in the processing system 304 of the storage device 300 that receives the data may then provide the data to the translation layer processor 304b in the processing system 304, which may process the data for storage in the storage subsystem 306 or the cache memory subsystem 308 and may provide the data to the controller 304c along with instructions, determined by that processing, to store the data. The controller 304c in the processing system 304 may then receive the data from the translation layer processor 304b and may store the data in one or more of the memory devices 306a-306f in the storage subsystem 306 or in the cache memory subsystem 308, depending on the instructions provided by the translation layer processor 304b. As such, in some examples, the controller 304c may store the data in the cache memory subsystem 308. As will be understood by those skilled in the art, the translation layer processor 304b may subsequently determine that data stored in the cache memory subsystem 308 should be moved to the storage subsystem 306, and may provide instructions to the controller 304c that cause the controller 304c to move that data from the cache memory subsystem 308 to one or more of the memory devices 306a-306f in the storage subsystem 306.

As such, the translation layer processor 304b may operate to control the storage of any data provided to the storage device 300 in the cache memory subsystem 308 or the storage subsystem 306, as well as to control the movement of data between the cache memory subsystem 308 and the storage subsystem 306. Further, the translation layer processor 304b may operate to perform mapping operations on data stored in the storage device 300 (e.g., logical address to physical address mapping operations), generate and store metadata about the lifecycle of the storage device 300 (e.g., information about the number of writes performed on the storage device), enforce policies for the storage device 300 (e.g., to extend the life or increase the performance of the storage device), perform data recovery operations when data on the storage device 300 becomes unavailable, and/or perform various other translation layer processor functions known in the art.

As described above, using the processing system 304 in the storage device 300 to provide translation layer processing operations locks in the translation layer processing capabilities of the storage device 300, which limits the ability to optimize or customize translation layer processing for different applications, introduces dependencies between the processing system 304 and flash media support, and results in various other translation layer processing inefficiencies that will be apparent to those skilled in the art. Moreover, providing the dedicated cache memory subsystem 308 on the storage device 300 increases the cost of the storage device 300 and locks in the ratio of the cache memory subsystem 308 to the storage subsystem 306, which may result in relatively low utilization of the cache memory subsystem 308 (e.g., when the storage device 300 has low utilization), hinders flexibility in the use of different cache media types for the storage device 300, hinders modification and/or adjustment of the cache memory subsystem 308 (e.g., performance upgrades and/or downgrades depending on the use of the storage device 300), and/or causes various other cache memory subsystem inefficiencies that will be apparent to those skilled in the art.

The method 600 begins at block 602, where a translation layer processor receives data from a host device. In one embodiment, a host device coupled to the storage system 400 may generate data for storage in the storage system 400 and transmit that data to the communication system 404 at block 602. As such, at block 602 the protocol conversion processing system 406a in the LCC 406 may receive the data from the communication system 404, perform protocol conversion operations on the data to provide converted data (e.g., by converting the data from the Ethernet protocol or the Fibre Channel (FC) protocol to the PCIe protocol), and provide the converted data to the global translation layer processor 406b. As such, at block 602 the global translation layer processor 406b may receive the converted data from the protocol conversion processing system 406a. As will be appreciated by those skilled in the art, the global translation layer processor 408b may receive data transmitted by the host device (e.g., via the communication system 404 and the protocol conversion processing system 408a) in a similar manner. In some examples, data received from the host device may be load balanced between the LCCs 406 and 408 to reduce the processing load on the global translation layer processors 406b and 408b.
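The load balancing mentioned above is not specified further in this disclosure; a trivial hash-based sketch of one possible policy (all names hypothetical) is:

```python
import zlib

def pick_lcc(host_id: bytes, lccs=("LCC_406", "LCC_408")) -> str:
    """Spread host traffic across the redundant LCCs so that neither global
    translation layer processor becomes a hotspot. Hashing on a host
    identifier is an assumed policy, not one taken from the disclosure."""
    return lccs[zlib.crc32(host_id) % len(lccs)]

# Example: requests from the same host consistently land on the same LCC.
assert pick_lcc(b"host-17") in ("LCC_406", "LCC_408")
```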

The method 600 then proceeds to block 604, where the translation layer processor processes the data for storage in one or more cache devices. In one embodiment, at block 604 the global translation layer processor 406b may operate to process the data received at block 602 for storage in the primary storage devices 410a-410b or the cache devices 410c-410d, and may provide the data via the switch 406c along with instructions, determined by that processing, to store the data. As such, in this example, at block 604 the global translation layer processor 406b may process the data received at block 602, determine that the data should be stored in the cache device 410c, and provide the data to the cache device 410c via the switch 406c along with instructions to store the data in the cache device 410c. The switch 406c may then transmit the data to the cache device 410c. As such, at block 604 the cache device 410c may receive and store the data in its memory system (e.g., the DRAM devices, SLC flash memory devices, and/or 3D XPoint memory devices discussed above).

The method 600 may then proceed to optional block 606, where the translation layer processor may move data stored in the one or more cache devices to one or more primary storage devices. In one embodiment, at block 606 the global translation layer processor 406b may operate to determine that the data stored in the cache device 410c at block 604 should be moved to a primary storage device. For example, the global translation layer processor 406b may determine that the data stored in the cache device 410c should be moved to the primary storage device 410a, retrieve the data from the cache device 410c via the switch 406c, and provide the data to the primary storage device 410a via the switch 406c along with instructions to store the data in the primary storage device 410a. The switch 406c may then transmit the data to the primary storage device 410a. As such, at block 606 the host interface 504a in the processing subsystem 504 included in that storage device 500 may receive the data and instructions via the communication subsystem 508 and provide them to the controller 504b in the processing subsystem 504, which may then execute the instructions and store the data in one or more of the memory devices 506a-506f in the storage subsystem 506. Thus, those skilled in the art will understand how the global translation layer processor 406b (or the global translation layer processor 408b) may receive data from one or more host devices, store that data in any of the one or more primary storage devices 410a-410b or the one or more cache devices 410c-410d, move data between the one or more primary storage devices 410a-410b and the one or more cache devices 410c-410d, and/or perform any of a variety of data storage and/or data movement operations that will be apparent to those skilled in the art.

The method 600 may then proceed to optional block 608, where the translation layer processor may map data stored in the one or more cache devices and/or the one or more primary storage devices. In one embodiment, at block 608 the global translation layer processor 406b may operate to map data stored in one or more of the primary storage devices 410a-410b or one or more of the cache devices 410c-410d. For example, the global translation layer processor 406b may operate to perform a mapping operation (e.g., a logical address to physical address mapping operation) on any data it provides for storage in any of the one or more primary storage devices 410a-410b or the one or more cache devices 410c-410d, to generate a mapping between the physical location at which the data is stored and the logical location used to retrieve the data, and to store the associated mapping information in a storage subsystem accessible to the global translation layer processor 406b. As such, those skilled in the art will understand how the global translation layer processor 406b (or the global translation layer processor 408b) may map data stored in any of the one or more primary storage devices 410a-410b or the one or more cache devices 410c-410d and/or perform any of a variety of data mapping operations that will be apparent to those skilled in the art.
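Concretely, the mapping operation at block 608 can be pictured as maintaining one table that is updated when data lands in a cache device and rewritten when that data moves to a primary storage device; the tuple layout below is an illustrative assumption:

```python
# Global L2P table: logical block address -> (device_id, physical_location).
# The same entry is written after cache placement (block 604) and rewritten
# after movement to a primary storage device (block 606); layout illustrative.
l2p_table = {}

def map_l2p(lba: int, device_id: str, physical_location: int) -> None:
    l2p_table[lba] = (device_id, physical_location)

map_l2p(0x2000, "cache_410c", 17)      # data staged in the cache device
map_l2p(0x2000, "primary_410a", 512)   # entry remapped after the move
```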

The method 600 may then proceed to optional block 610, where the translation layer processor may generate and store metadata associated with the lifecycle of the one or more cache devices and/or the one or more primary storage devices. In one embodiment, at block 610 the global translation layer processor 406b may operate to generate and store metadata associated with the lifecycle of any of the primary storage devices 410a-410b and any of the cache devices 410c-410d. For example, the global translation layer processor 406b may operate to monitor lifecycle characteristics (e.g., the number of writes performed on a storage device) of each of the primary storage devices 410a-410b and the cache devices 410c-410d, generate metadata associated with those lifecycle characteristics, and store that metadata in a storage subsystem accessible to the global translation layer processor 406b. As such, those skilled in the art will understand how the global translation layer processor 406b (or the global translation layer processor 408b) may generate and store lifecycle metadata for any of the storage devices in the storage/cache enclosure 410 and/or perform any of a variety of metadata/lifecycle operations that will be apparent to those skilled in the art. Further, in some embodiments, the metadata operations may include enforcing storage device policies defined via the metadata, for example, to extend the life of the storage system 400 and/or its components, to provide maximum performance of the storage system 400 and/or its components, and/or to perform various other storage device policy operations that will be apparent to those skilled in the art.
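A minimal sketch of the block 610 bookkeeping follows, assuming write counts as the tracked lifecycle characteristic and a least-worn placement rule as the enforced policy (both are assumed examples, since the disclosure does not fix a particular policy):

```python
from collections import Counter

class LifecycleMetadata:
    """Tracks per-device lifecycle metadata for the global translation layer;
    the wear-based placement policy below is an assumed example."""

    def __init__(self):
        self.writes = Counter()   # device_id -> number of writes observed

    def record_write(self, device_id: str) -> None:
        self.writes[device_id] += 1

    def least_worn(self, candidates) -> str:
        # Example policy enforcement: steer the next data movement to the
        # least-written device to extend the life of the storage system.
        return min(candidates, key=lambda d: self.writes[d])
```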

The method 600 may then proceed to optional block 612, where the translation layer processor may perform data recovery operations on data stored on the one or more cache devices and/or the one or more primary storage devices. In one embodiment, at block 612 the global translation layer processor 406b may operate to perform a data recovery operation for any of the primary storage devices 410a-410b and/or any of the cache devices 410c-410d. For example, the global translation layer processor 406b may operate to determine that data stored on any of the primary storage devices 410a-410b and/or any of the cache devices 410c-410d has been lost, corrupted, and/or otherwise rendered unavailable and, in response, may perform a data recovery operation. As such, those skilled in the art will understand how the global translation layer processor 406b (or the global translation layer processor 408b) may operate to recover data that has become unavailable on the primary storage devices 410a-410b and/or the cache devices 410c-410d and/or perform any of a variety of data recovery operations that will be apparent to those skilled in the art.
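The disclosure does not specify the data recovery mechanism, so the sketch below assumes a simple mirrored-copy scheme: unavailability is detected via a checksum mismatch and the data is rebuilt from a replica held on another device (all structures hypothetical):

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def recover(lba, primary, mirror, checksums):
    """Return valid data for lba, repairing the primary copy if it has been
    lost or corrupted; 'mirror' and 'checksums' are assumed structures that
    the global translation layer would maintain elsewhere."""
    data = primary.get(lba)
    if data is None or checksum(data) != checksums[lba]:
        data = mirror[lba]    # data recovery operation: rebuild from replica
        primary[lba] = data   # re-materialize the copy on the primary device
    return data
```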

The method 600 may then return to block 602 and loop through blocks 602-612 to continue performing the operations described above. As such, the one or more global translation layer processors of the present disclosure may perform the data storage operations, data movement operations, data mapping operations, metadata generation and storage operations, policy enforcement operations, and data recovery operations discussed above, as well as any other translation layer processing, for the primary storage devices 410a-410b and/or the cache devices 410c-410d during the method 600.

Thus, systems and methods have been described that provide a new architecture for NVMeoF systems that moves FTL processing from the individual NVMe SSDs to an LCC in order to provide global FTL processing for each NVMe SSD in the NVMeoF system, while removing the DRAM cache subsystems from the individual NVMe SSDs and providing a centralized cache system used by all of the NVMe SSDs. For example, the NVMeoF system may include a storage/cache enclosure housing a plurality of NVMe SSDs and at least one DRAM cache device that is separate from each of the plurality of NVMe SSDs. A midplane in the NVMeoF system couples the plurality of NVMe SSDs and the at least one DRAM cache device to the LCC, and the LCC includes an FTL processor that receives data from a host device and processes the data for storage in the at least one DRAM cache device such that the data is stored in the at least one DRAM cache device. When the FTL processor determines that the data should be stored in an NVMe SSD, it causes the data to be moved from the at least one DRAM cache device to that NVMe SSD such that the data is stored in the NVMe SSD.

As such, the new NVMeoF system architecture described herein moves from the individualized FTL processing provided in conventional NVMe SSDs to global FTL processing, providing the ability to optimize or customize FTL processing for different applications, reducing dependencies between NVMe SSD controllers and storage media support, eliminating the dedicated chipsets for providing FTL processing on the storage devices, and providing other FTL processing efficiencies that will be apparent to those skilled in the art. Furthermore, the new NVMeoF system architecture described herein eliminates the dedicated DRAM cache system on each NVMe SSD, reduces the cost of the NVMe SSDs, provides the ability to adjust the DRAM-cache-to-NAND-flash ratio of the NVMeoF system, results in higher DRAM cache system utilization, introduces the flexibility to use cache media types other than DRAM (e.g., SLC, 3D XPoint, etc.) for the NVMe SSDs, allows for DRAM cache system modifications and/or adjustments (e.g., performance upgrades and/or downgrades depending on the use of the NVMeoF system), and/or provides various other DRAM cache system efficiencies that will be apparent to those skilled in the art.

While illustrative embodiments have been shown and described, a wide range of modifications, changes, and substitutions is contemplated in the foregoing disclosure and, in some instances, some features of the embodiments may be employed without a corresponding use of the other features. Accordingly, the appended claims are consistent with the scope of the embodiments disclosed herein.
