Relay device and information processing system

Document No.: 1661764 Publication date: 2019-12-27

Abstract: The present technology, "Relay device and information processing system", was designed by 石田智弘 and 木村真敏 on 2019-04-18. A first endpoint and a second endpoint are provided. The first endpoint receives data from a root complex of a first one of the platforms each serving as a computer performing arithmetic processing. The second endpoint transmits data to a root complex of a second one of the platforms, the data to be transmitted being received at the second endpoint by tunneling from the first endpoint.

1. A relay apparatus that is connected to each of platforms serving as a computer that performs arithmetic processing to communicate with each of the platforms and relay communication between the platforms via a peripheral component interconnect express (PCIe) bus, the relay apparatus comprising:

a first endpoint that receives data from a root complex of a first one of the platforms; and

a second endpoint that transmits the data to a root complex of a second one of the platforms, the data to be transmitted being received at the second endpoint by tunneling from the first endpoint.

2. The relay apparatus of claim 1, further comprising storage areas associated with the first endpoint and the second endpoint,

wherein, when data is stored in a first storage area that is associated with the second endpoint as a transmission destination, among the storage areas that are set for the respective endpoints and associated with the first endpoint as a transmission source, the tunneling from the first endpoint to the second endpoint is performed by storing the data in the first storage area associated with the second endpoint as the transmission destination, among the storage areas associated with the second endpoint.

3. An information processing system, comprising:

platforms each serving as a computer that performs arithmetic processing; and

a relay device connected to the platforms to communicate with each of the platforms and relay communications between the platforms via a peripheral component interconnect express (PCIe) bus,

wherein the relay device comprises:

a first endpoint that receives data from a root complex of a first one of the platforms; and

a second endpoint that transmits the data to a root complex of a second one of the platforms, the data to be transmitted being received at the second endpoint by tunneling from the first endpoint.

4. The information processing system of claim 3, wherein

each of the platforms includes storage areas used by the relay device for the respective platforms, and

when data is stored in a first storage area that is associated with the second platform as a transfer destination, among the storage areas that are set for the respective platforms and included in the first platform as a transfer source, the data is transferred from the first platform to the second platform via the first endpoint and the second endpoint so that the data is stored in the first storage area associated with the second platform, among the storage areas included in the second platform as the transfer destination.

5. A relay apparatus that is connected to platforms each serving as a computer that performs arithmetic processing to communicate with each of the platforms and relays communication between the platforms via a data transfer bus, comprising:

a first endpoint that receives data from a root complex of a first one of the platforms; and

a second endpoint that transmits the data to a root complex of a second one of the platforms, the data to be transmitted being received at the second endpoint by tunneling from the first endpoint.

Technical Field

The invention relates to a relay device and an information processing system.

Background

A method of performing parallel computation using a plurality of calculators (arithmetic devices) is known. In this method, data is exchanged between the calculators, for example, through an Ethernet (registered trademark) line.

Reference list

Patent document

Patent document 1: Japanese patent laid-open No. 2008-41027

Patent document 2: Japanese translation of PCT International publication No. JP-T-2012-504835

Disclosure of Invention

Technical problem

However, when data is transferred within one device or between two devices, the communication speed of an Ethernet (registered trademark) line can become a bottleneck.

Technical scheme

According to one aspect, it is an object of the present invention to enable high speed communication between platforms.

The relay apparatus according to the aspect is connected to each of the platforms serving as computers that perform arithmetic processing to communicate with each of the platforms and relay communication between the platforms via a peripheral component interconnect express (PCIe) bus. The relay device includes: a first endpoint that receives data from a root complex of a first platform among the platforms; and a second endpoint that transmits data to a root complex of a second one of the platforms, the data to be transmitted being received at the second endpoint by tunneling from the first endpoint.

Technical effects

According to the above aspect of the present invention, high-speed communication between platforms can be realized.

Drawings

Fig. 1 is a diagram illustrating a connection configuration using a PCIe bus in various platforms;

Fig. 2 is a diagram illustrating a connection configuration using a PCIe bus in various platforms;

Fig. 3 is a diagram illustrating a connection configuration using a PCIe bus in various platforms;

Fig. 4 is a diagram schematically showing a connection configuration of a plurality of platforms in an information processing system as an example of an embodiment;

Fig. 5 is a diagram illustrating a software configuration of a platform in an information processing system as an example of an embodiment;

Fig. 6 is a diagram schematically illustrating a hardware configuration of a PCIe bridge controller in an information processing system as an example of an embodiment;

Fig. 7 is a diagram showing a layer configuration of PCIe as an example of an embodiment;

Fig. 8 is a diagram illustrating a view seen from a processor toward other processors in an information processing system as an example of an embodiment;

Fig. 9 is a diagram illustrating a view seen from a processor toward other processors in an information processing system as an example of an embodiment;

Fig. 10 is a diagram for explaining a data transfer method between platforms via a PCIe bridge controller in the information processing system as an example of the embodiment; and

Fig. 11 is a diagram for explaining a data transfer method between platforms via a PCIe bridge controller in the information processing system as an example of the embodiment.

Detailed Description

Exemplary embodiments of a relay apparatus and an information processing system will be described with reference to the accompanying drawings. Note that the embodiments described below are merely examples, and are not intended to exclude various modifications and technical applications not explicitly described in the embodiments. That is, the embodiments may be modified in various ways and implemented without departing from the gist of the present invention. The drawings are not intended to show that only the illustrated components are included; other functions and the like may also be included.

(A) Communication using PCIe bus

For example, to perform a high-load arithmetic operation such as AI inference processing or image processing on a PC, one may consider using a processor (arithmetic operation processor) such as a GPU or an FPGA that can function as a device of the PC. PC is an abbreviation for personal computer and AI is an abbreviation for artificial intelligence. GPU is an abbreviation for graphics processing unit and FPGA is an abbreviation for field programmable gate array.

In order for the above processor to operate as a device of a PC, a device driver for operating the specific hardware needs to be installed on an Operating System (OS). Examples of the OS include Windows (registered trademark) and Linux (registered trademark). It is also necessary to create a device driver that meets the requirements of each OS. In particular, in the case of Windows, driver requirements and the like differ according to the version of the OS, so expertise in device driver development is required. Therefore, without the expertise to develop Windows-compatible device drivers, a processor cannot be used as a device of a PC, however high its performance.

As an interface for connecting a device to a PC, a PCIe interface capable of transferring large-capacity data at high speed is known. On PCIe, a processor such as an Intel (registered trademark) processor functions as a Root Complex (RC) operating as a host, while a device functions as an Endpoint (EP). Data transfer is performed between the host and the device.

Each of fig. 1 to 3 is a diagram illustrating a connection configuration using a PCIe bus in various platforms.

For example, an x86 compatible processor manufactured by Intel corporation is installed on a PC platform and a general-purpose OS such as Windows and Linux operates thereon.

Fig. 1 shows an example of a configuration in which RCs are connected one-to-one to EPs on a PC platform equipped with PCIe. In this method shown in fig. 1, a PC platform is used as an RC while each device used as an EP is connected to the PC platform. Controllers for the respective devices in fig. 1 are provided by different manufacturers (a company to H company).

Each device becomes available only when its device driver is installed on the OS of the PC platform, so no device can be operated independently. When an operational failure occurs on the PC platform, all devices stop operating.

A device driver must be developed for each combination of hardware and OS, so the driver has to be redeveloped whenever the OS is changed.

Fig. 2 shows an example of a configuration in which multiple EPs are connected to a single RC via a PCIe switch controller. In addition, in the method shown in fig. 2, a PC platform is used as an RC, while each device used as an EP is connected to the PC platform.

The PCIe switch controller shown in fig. 2 is used to connect a plurality of EPs to a single RC when the number of RCs is insufficient for the number of devices to be connected. With this method, the bandwidth of one RC is shared by four EPs, so performance deteriorates.

However, the method of driving the device is the same as that in the above case of connecting RC and EP one to one, so that the device cannot be operated alone. Each device is available when its device driver is installed on the OS of the Intel x86 platform.

Fig. 3 shows an example of a configuration in which two PC platforms (unit a and unit B) are interconnected via an interconnect.

In addition, in the method illustrated in fig. 3, a PC platform is used as an RC, while each device used as an EP is connected to the PC platform.

Each device is available when its device driver is installed on the OS of the PC platform.

As shown in fig. 3, by connecting platforms (processors) on which an OS operates via an interconnect, Ethernet, or the like, the processors can be driven synchronously.

However, the same OS needs to be operated on the platforms to be connected, and the platforms to be connected need to support the same connection method. Therefore, the configuration shown in fig. 3 is not suitable for connecting different platforms.

For example, when unit a in fig. 3 causes a device of company E connected to unit B to perform a process, the process is transferred from the processor of unit a to the processor of unit B via the interconnect, whereby the processor of unit B causes the device of company E to perform the process.

Between platforms connected via an interconnect, each processor can initiate processing to a device connected to another unit. However, the processing must be performed via a processor connected to the apparatus, thereby increasing the load on the processor on the reception side accordingly.

As described above with reference to fig. 1 to 3, when PCIe communication in the related art is directly applied to communication between a plurality of platforms, a device driver for each device is indispensable to the OS, so that development cost and maintenance cost thereof may be required.

The information processing system according to the present invention realizes communication between platforms by connecting a plurality of platforms to each other via a PCIe bus, and provides a configuration in which each processor does not need a driver to function as an RC of another processor.

(B) Configuration

Fig. 4 is a diagram schematically showing a connection configuration of a plurality of platforms in the information processing system 1 as an example of the embodiment.

The information processing system 1 illustrated in fig. 4 includes a PCIe bridge controller 3 and a plurality of (8 in the example shown in fig. 4) platforms 2-1 to 2-8. Each of the platforms 2-1 through 2-8 is connected to a PCIe bridge controller 3.

In the following description, reference numerals 2-1 to 2-8 are used when one particular platform needs to be designated, and reference numeral 2 is used when any given platform is indicated. The platform 2 may also be referred to as a PC platform 2.

Platform

The platform 2-1 includes a processor 21-1. Similarly, platforms 2-2 through 2-8 include processors 21-2 through 21-8, respectively.

The respective processors 21-1 to 21-8 may be provided by different manufacturers (suppliers). For example, assume that processors 21-1, 21-2, 21-3, 21-4, 21-5, 21-6, 21-7, and 21-8 are provided by company A, company B, company C, company D, company E, company F, company G, and company H, respectively.

In the following description, processors 21-1, 21-2, 21-3, 21-4, 21-5, 21-6, 21-7, and 21-8 may be referred to as processor A, processor B, processor C, processor D, processor E, processor F, processor G, and processor H, respectively. Different platforms may be connected to the various EPs installed on the PCIe bridge controller 3. In addition, two or more EPs may be connected to one platform, and the platform may use multiple RCs to communicate with the PCIe bridge controller 3.

In the following description, reference numerals 21-1 to 21-8, reference signs A to H, and the like are used when one particular processor needs to be specified, and reference numeral 21 is used when any given processor is indicated.

Each of the platforms 2-1 to 2-8 provides a computer environment for executing arithmetic processing such as AI inference processing and image processing, and includes a processor 21 as well as the memory (physical memory) 22 and the storage 23 shown in fig. 10.

On the platform 2, when the processor 21 executes a program stored in the memory 22 or the storage 23, various functions are implemented.

The storage 23 is a storage device such as a Hard Disk Drive (HDD), a Solid State Drive (SSD), and a Storage Class Memory (SCM), and stores various data therein.

The memory 22 is a storage memory including a Read Only Memory (ROM) and a Random Access Memory (RAM). Various software programs, data for the programs, and the like are written in the ROM of the memory 22. The software program on the memory 22 is suitably read for execution by the processor 21. The RAM of the memory 22 is used as a main storage memory or a working memory.

The processor 21 controls the entire platform 2. The processor 21 may be a multiprocessor. For example, the processor 21 may be any one of a Central Processing Unit (CPU), a Micro Processing Unit (MPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Programmable Logic Device (PLD), and a Field Programmable Gate Array (FPGA). The processor 21 may be a combination of two or more types of components of a CPU, MPU, DSP, ASIC, PLD, and FPGA.

Fig. 5 is a diagram illustrating a software configuration of the platform 2 in the information processing system 1 as an example of the embodiment.

For convenience, fig. 5 shows only the software configuration of the platforms 2-1 to 2-3.

In the information processing system 1 illustrated in fig. 5, the OS of the platform 2-1 is Windows, and a storage management program is executed on the OS. The OS of each of the platforms 2-2 and 2-3 is Linux, and a distributed processing program is executed on the OS (distributed processing A, B).

Each platform 2 comprises a bridge driver 20. The platform 2 communicates with the PCIe bridge controller 3 and another platform 2 via the bridge driver 20. The communication method performed by the bridge driver 20 will be described later.

Each platform 2 includes a processor 21 and a memory (physical memory) 22. The processor 21 executes an OS, various programs, drivers, and the like stored in the memory 22 to realize the respective functions.

The processors 21 included in the respective platforms 2 may be provided by different vendors from each other. In the example shown in FIG. 4, a platform including multiple RCs (e.g., an x86 processor manufactured by Intel corporation) may be used as at least some of the platforms 2 (e.g., platform 2-7).

Each of the platforms 2 is configured to be capable of operating independently without affecting the other drive configurations.

On the platform 2, as described later with reference to fig. 10, a part of the storage area of the memory 22 is used as a communication buffer 221 that temporarily stores data transferred between the platforms 2 (between the processors 21).

The PCIe bridge controller 3 enables communication of data and the like between the platforms 2-1 to 2-7.

Fig. 6 is a diagram schematically showing the hardware configuration of the PCIe bridge controller 3 in the information processing system 1 as an example of the embodiment.

The PCIe bridge controller 3 is, for example, a relay device that includes EPs for eight channels in a single chip. As shown in FIG. 6, the PCIe bridge controller 3 includes a CPU 31, a memory 32, an interconnect 33, and a plurality of (8 in the example shown in FIG. 6) slots 34-1 to 34-8.

Devices configured to meet the PCIe standard are connected to each of the slots 34-1 through 34-8. Specifically, in the information processing system 1, the platform 2 is connected to each of the slots 34-1 to 34-8.

In the following description, reference numerals 34-1 to 34-8 are used when one particular slot needs to be specified, and reference numeral 34 is used when any given slot is indicated.

Like the platforms 2-1 through 2-6 shown in FIG. 4, a single processor 21 may be connected to a single slot 34. Alternatively, like the platform 2-7 in FIG. 4, a single platform 2 may be connected to two or more slots 34 (two slots in the example of FIG. 4). The embodiment can be implemented with various modifications.

By assigning two or more slots 34 to a single platform 2 (like the platform 2-7 in fig. 4), the platform 2-7 can perform communication using a wide communication band.

Each of the slots 34 is connected to the interconnect 33 via an internal bus. The CPU 31 and the memory 32 are also connected to the interconnect 33. Accordingly, the slots 34, the CPU 31, and the memory 32 are connected so that they can communicate with each other via the interconnect 33.

The memory 32 is, for example, a storage memory (physical memory) including a ROM and a RAM. In the ROM of the memory 32, a software program related to data communication control, data used for the program, and the like are written. The software program on the memory 32 is read appropriately by the CPU 31 to be executed. The RAM of the memory 32 is used as a main storage memory or a working memory.

The PCIe bridge controller 3 includes registers 35 (refer to fig. 10) associated with the respective slots. The Base Address Register (BAR) space of the register 35 provides a storage area corresponding to each of the slots #0 to #7.

As described later, the PCIe bridge controller 3 performs data transfer between the platforms 2 by using the storage area of each slot in the BAR space.
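The per-slot storage areas in the BAR space can be pictured as equal-sized regions indexed by slot number. The following is a minimal sketch under that assumption only; the source does not state the region size or layout, so `SLOT_REGION_SIZE`, `slot_region`, and `bar_offset` are hypothetical names and values chosen for illustration.

```python
# Hypothetical layout: the source gives neither region sizes nor addresses.
NUM_SLOTS = 8               # slots #0 to #7, as in the text
SLOT_REGION_SIZE = 0x1000   # assumed bytes per slot region (not from the source)


def slot_region(slot: int) -> range:
    """Return the byte offsets within the BAR space assumed to belong to `slot`."""
    if not 0 <= slot < NUM_SLOTS:
        raise ValueError(f"slot must be in 0..{NUM_SLOTS - 1}, got {slot}")
    start = slot * SLOT_REGION_SIZE
    return range(start, start + SLOT_REGION_SIZE)


def bar_offset(slot: int, offset: int) -> int:
    """Translate (destination slot, offset inside its region) into a BAR offset."""
    region = slot_region(slot)
    if not 0 <= offset < len(region):
        raise ValueError("offset falls outside the slot's region")
    return region.start + offset
```

Under these assumed sizes, writing to offset 0x10 of the region for slot #4 would target BAR offset `bar_offset(4, 0x10)`, and an out-of-range slot or offset is rejected rather than silently wrapping into a neighbouring region.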

The CPU 31 controls the entire PCIe bridge controller 3. The CPU 31 may be a multiprocessor. Instead of the CPU 31, any one of an MPU, DSP, ASIC, PLD, and FPGA may be used. The CPU 31 may be a combination of two or more types of components among a CPU, MPU, DSP, ASIC, PLD, and FPGA.

When the CPU 31 executes the software program stored in the memory 32, data transfer between the platforms 2 (between the processors 21) is realized by the PCIe bridge controller 3.

The PCIe bridge controller 3 uses PCIe to increase the data transfer speed between the platforms 2. The PCIe bridge controller 3 causes the processor included in each platform 2 to operate as an RC as shown in fig. 4, and realizes data transfer between EPs operating as devices.

Specifically, in the information processing system 1, the processor of each platform 2 is made to operate as an RC of PCIe, which serves as the data transfer interface. The PCIe bridge controller 3 (i.e., the slot 34 to which each platform 2 is connected) is made to operate as an EP with respect to each platform 2 (processor 21).

As a method of connecting the PCIe bridge controller 3 to the processor 21 as an EP, various known methods can be used.

For example, upon connection with the platform 2, the PCIe bridge controller 3 sends the processor 21 a signal indicating that it functions as an EP, thereby connecting to the processor 21 as an EP.

The PCIe bridge controller 3 tunnels the data through endpoint-to-endpoint (EP-to-EP) communication to transmit the data to the RCs. Communication between the platforms is logically connected when a PCIe transaction occurs, and as long as data transfer is not concentrated on one processor, data transfer can be performed in parallel between the respective platforms.

Fig. 7 is a diagram showing a layer configuration of PCIe as an example of an embodiment.

FIG. 7 shows an example of performing communication between processor A of platform 2-1 and processor B of platform 2-2.

On the platform 2-1 as the transmission source, data generated by the processor A serving as the RC is transferred sequentially through software, the transaction layer, the data link layer, and the physical layer (PHY), and is transferred from the physical layer of the platform 2-1 to the physical layer of the PCIe bridge controller 3.

In the PCIe bridge controller 3, the data is transferred sequentially through the physical layer, the data link layer, the transaction layer, and software, and is transferred by tunneling to the EP corresponding to the RC of the platform 2 as the transmission destination.

That is, in the PCIe bridge controller 3, data is transferred from one RC included in one platform to another RC included in another platform by performing tunneling of data between EPs (i.e., performing tunneling of data received by one EP from one platform to another EP).

On the platform 2-2 as the transmission destination, the data transferred from the PCIe bridge controller 3 is transferred sequentially through the physical layer (PHY), the data link layer, the transaction layer, and software, and is delivered to the processor B of the platform 2-2 as the transmission destination.

In the present information processing system 1, communication between the processors 21 (between the platforms 2) is logically connected when a PCIe transaction occurs.

As long as data transfers from a plurality of other processors 21 are not concentrated on a specific processor 21 connected to one of the eight slots included in the PCIe bridge controller 3, data transfer may be performed in parallel between the processors 21 of a plurality of different pairs.

For example, in the event that each of processor B of platform 2-2 and processor C of platform 2-3 attempts to communicate with processor A of platform 2-1, PCIe bridge controller 3 processes the communications of processor B and processor C serially.

However, when communication is performed between different processors and communication is not concentrated on a specific processor (such as communication between processor a and processor B, processor C and processor D, and processor E and processor F), PCIe bridge controller 3 processes communication between respective processors 21 in parallel.
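The serial and parallel cases above can be illustrated with a small scheduling sketch. This is not the controller's actual arbitration logic, which the source does not describe; it only models the stated rule that transfers aimed at the same processor are handled one after another while transfers to different processors may proceed together. The function name `schedule_rounds` and the greedy packing strategy are assumptions made for illustration.

```python
def schedule_rounds(transfers):
    """Greedily pack (source, destination) pairs into rounds.

    Transfers in the same round are treated as running in parallel.
    Two transfers aimed at the same destination never share a round,
    so they end up being processed serially.
    """
    rounds = []  # each entry is a list of transfers that run together
    for src, dst in transfers:
        for batch in rounds:
            # Join the first round that has no transfer to this destination.
            if all(dst != other_dst for _, other_dst in batch):
                batch.append((src, dst))
                break
        else:
            # Every existing round already targets this destination.
            rounds.append([(src, dst)])
    return rounds
```

With this model, `schedule_rounds([("B", "A"), ("C", "A")])` needs two rounds because both transfers target processor A, whereas the pairs A to B, C to D, and E to F fit into a single round.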

Fig. 8 is a diagram illustrating a view seen from the processor 21-8 (processor H) toward the other processors 21 in the information processing system 1 as an example of the embodiment. Fig. 9 is a diagram illustrating a view seen from the processor 21-5 (processor E) toward the other processors 21.

Even when communication is performed between the processors 21, only the PCIe bridge controller 3 can be seen from an OS (e.g., a device manager of Windows) on each processor 21. Therefore, it is not necessary to directly manage another processor 21 as a connection destination. Thus, the processor 21 connected to the PCIe bridge controller 3 can be managed by a device driver provided in the PCIe bridge controller 3.

Therefore, it is not necessary to prepare device drivers for operating the respective processors 21 serving as the transmission source and the reception destination. Communication between processors 21 may be performed by simply performing communication processing on PCIe bridge controller 3 using a driver of PCIe bridge controller 3.

(C) Operation

A data transfer method between the processors 21 via the PCIe bridge controller 3 in the information processing system 1, as an example of the embodiment configured as described above, is described below with reference to fig. 10.

In the example shown in FIG. 10, data from platform 2-1 connected to slot #0 is transferred to platform 2-5 connected to slot #4.

On the platform 2-1 as a data transmission source, data transmitted by software or the like (hereinafter referred to as transmission data) is loaded from the storage 23 included in the platform 2-1 (reference numeral P1 in fig. 10) into the communication buffer 221.

The position information (e.g., offset/length) of the area storing the transmission data in the communication buffer 221 and the information (e.g., slot/offset) of the transmission destination are specified by software, and these pieces of information are passed to the bridge driver 20 (reference numeral P2).

The bridge driver 20 on the transmission source side transfers the transfer data to the address of slot #4 in the BAR space (reference numeral P3). In the PCIe bridge controller 3, the transfer data is transmitted from the transmission source slot to the slot (transmission destination slot) corresponding to the transmission destination platform 2-5 by EP-to-EP communication (reference numeral P4). At the transmission destination slot, the transfer data is stored in the storage area corresponding to slot #4 in the BAR space of the register 35.

At the transmission destination slot corresponding to the platform 2-5, the bridge driver 20 transfers the transfer data from the storage area corresponding to slot #4 in the BAR space of the register 35 to the communication buffer 221, and stores the transfer data in a predetermined area of the communication buffer 221 specified by the offset (reference numeral P5).

On the sending destination platform 2-5, according to the program, the transfer data stored in the communication buffer 221 is read out and moved to the memory (local memory) 22 (reference numeral P6) or the storage 23 (reference numeral P7).

As described above, data (transfer data) is transferred from the platform 2-1 as a transfer source to the platform 2-5 as a transfer destination.
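The flow from P2 to P5 can be modeled in a few lines. The sketch below is a simplified in-memory simulation, not driver code: the buffer and region sizes, the class names `Platform` and `BridgeController`, and the `transfer` method are all hypothetical, and the list of byte arrays merely stands in for the per-slot storage areas in the BAR space of the register 35.

```python
from dataclasses import dataclass, field

NUM_SLOTS = 8        # slots #0 to #7, as in the text
REGION_SIZE = 64     # assumed bytes per slot region in the BAR space
BUFFER_SIZE = 256    # assumed bytes per communication buffer


@dataclass
class Platform:
    slot: int
    # Stands in for the communication buffer 221 inside memory 22.
    comm_buffer: bytearray = field(default_factory=lambda: bytearray(BUFFER_SIZE))


@dataclass
class BridgeController:
    # One region per slot stands in for the BAR space of register 35.
    bar: list = field(
        default_factory=lambda: [bytearray(REGION_SIZE) for _ in range(NUM_SLOTS)]
    )

    def transfer(self, src, dst, src_offset, length, dst_offset):
        if not 0 <= length <= REGION_SIZE:
            raise ValueError("length does not fit in a slot region")
        for offset in (src_offset, dst_offset):
            if offset < 0 or offset + length > BUFFER_SIZE:
                raise ValueError("offset/length outside the communication buffer")
        # P2/P3: offset/length select the payload, which is written to the
        # region of the destination slot.
        payload = bytes(src.comm_buffer[src_offset:src_offset + length])
        region = self.bar[dst.slot]
        region[:length] = payload  # P4: EP-to-EP hand-off inside the controller
        # P5: the destination side copies the data into its own buffer.
        dst.comm_buffer[dst_offset:dst_offset + length] = region[:length]
```

For example, after placing five bytes at offset 0 of the buffer of a platform on slot #0 (P1), calling `transfer(src, dst, 0, 5, 16)` with a destination on slot #4 leaves the same five bytes at offset 16 of the destination buffer.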

A data transfer method between the platforms 2 via the PCIe bridge controller 3 in the information processing system 1 is described below with reference to fig. 11. Fig. 11 is a diagram for explaining an example of a data transfer method between platforms via the PCIe bridge controller 3 in the information processing system 1 according to the present embodiment.

In the example shown in FIG. 11, a case where data is transferred from the platform 2-1 connected to slot #0 to the platform 2-5 connected to slot #4 is described.

The platform 2-1 as the transmission source loads the data to be transmitted by software or the like (hereinafter referred to as transmission data) from the storage 23 or the like included in the platform 2-1 into the storage area 36 of the platform 2-1 (step S701). The storage area 36 may be part of a communication buffer that temporarily stores data to be transferred. The storage area 36 is an area provided in the memory 22 of each platform 2 and has the same size on every platform. The storage area 36 is divided into a plurality of areas, one for each slot 305. Each divided area of the storage area 36 is associated with one of the slots 305. For example, the storage area denoted as slot #0 in the storage area 36 is associated with platform 2-1 connected to slot #0. The storage area denoted as slot #4 in the storage area 36 is associated with platform 2-5 connected to slot #4. The platform 2-1 stores the transmission data in the area of the storage area 36 allocated to the slot 305 that is the transmission destination (slot #4 in this case).

The root complex (RC) of the platform 2-1 as the transmission source acquires or generates slot information indicating the slot 305 as the transmission destination, and address information indicating an address in the divided area of the storage area 36 at the transmission destination, based on the area of the storage area 36 of the platform 2 in which the transmission data is stored (step S702).

The platform 2-1 as the transmission source passes transfer data including the slot information, the address information, and the transmission data to the PCIe bridge controller 3, which has the function of a plurality of endpoints (step S703). Based on the slot information, the PCIe bridge controller 3 connects the slot 305 as the transmission source to the slot 305 as the transmission destination through EP-to-EP communication, and thereby transfers the transfer data to the platform 2-5 as the transmission destination (step S704). Based on the slot information and the address information, the platform 2 as the transmission destination stores the transmission data (or the transfer data) in the area indicated by the address information within the storage area corresponding to the communication buffer 221 of that platform 2 (step S705).

On the platform 2-5 as the transmission destination, the program reads out the transmission data stored on the communication buffer 221 and moves the transmission data to the memory (local memory) 22 or other area in the storage 23 (step S706, step S707).

As described above, data (transfer data) is transferred from the platform 2-1 as a transfer source to the platform 2-5 as a transfer destination.
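Steps S701 to S705 can be sketched as follows. This is an illustrative model under stated assumptions rather than the disclosed implementation: the per-slot area size, the `TransferData` record, and the `stage`, `forward`, and `receive` names are hypothetical, and the routing table keyed by slot number stands in for the EP-to-EP connection made from the slot information.

```python
from dataclasses import dataclass

NUM_SLOTS = 8
AREA_PER_SLOT = 32   # assumed size of each divided area of storage area 36


@dataclass(frozen=True)
class TransferData:
    slot: int        # slot information: destination slot
    address: int     # address information: offset inside the divided area
    payload: bytes   # transmission data


class Platform:
    def __init__(self, slot):
        self.slot = slot
        # Storage area 36: same size on every platform, one area per slot.
        self.storage_area = [bytearray(AREA_PER_SLOT) for _ in range(NUM_SLOTS)]

    def _write(self, slot, address, payload):
        if address < 0 or address + len(payload) > AREA_PER_SLOT:
            raise ValueError("payload does not fit in the divided area")
        self.storage_area[slot][address:address + len(payload)] = payload

    def stage(self, dst_slot, address, payload):
        """S701/S702: store the data locally and derive slot/address info."""
        self._write(dst_slot, address, payload)
        return TransferData(dst_slot, address, payload)

    def receive(self, transfer):
        """S705: store the data at the area the address information names."""
        self._write(transfer.slot, transfer.address, transfer.payload)


class BridgeController:
    def __init__(self, platforms):
        self.by_slot = {p.slot: p for p in platforms}

    def forward(self, transfer):
        """S703/S704: route by slot information to the destination platform."""
        self.by_slot[transfer.slot].receive(transfer)
```

In this model the source platform on slot #0 calls `stage(4, 8, b"data")`, hands the result to `forward`, and the platform on slot #4 ends up with the payload at offset 8 of its own slot #4 area, mirroring the same-sized layout on both sides.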

(D) Advantages

In the information processing system 1 as an example of the embodiment, the PCIe bridge controller 3 mediates data transfer between EPs in the PCIe bridge controller 3. Thus, data transfer can be realized between the plurality of RCs (processors 21) connected to the PCIe bridge controller 3.

That is, each processor 21 operates independently as a PCIe RC, and the PCIe bridge controller 3 connects to each processor 21 as a device serving as an EP and performs data transfer between the EPs. As a result, problems caused by device drivers can be avoided, and high-speed data transfer can be realized within a single system.

In addition, as long as each processor 21 has a data communication function conforming to the PCIe standard, data transfer can be performed between different processors 21. The choice of processors 21 is therefore widened, regardless of the availability of device drivers, the supported OSs, and the like.

The processors 21 are connected to one another via the PCIe bridge controller 3, which acts as an EP. Consequently, no device driver needs to be installed for the RC or for what lies beyond the EP, so no device driver development is required and malfunctions caused by adding a device driver are prevented.

As shown in FIG. 1, in a conventional PCIe connection using a processor such as an Intel (registered trademark) processor, a device is added to PCIe by connecting an EP to the RC. In this case, a device driver corresponding to each EP must be installed, and the operation of the entire apparatus may become unstable because of the added device driver. There are also the following problems: a device cannot be used when no device driver has been prepared for it, and processing is delayed because controlling the arithmetic processor increases the CPU load.

Even if the number of EPs is increased by using a PCIe switch controller as shown in FIG. 2, these problems remain.

As shown in FIG. 3, one conceivable method of distributing the CPU load and controlling PCIe devices is to use an interconnect that directly connects CPUs to each other. However, to use CPUs in the connection form shown in FIG. 3, every CPU must be compatible with the same interconnect. The types of CPUs that can be connected are therefore limited, which reduces versatility and narrows the choice of processors.

On the other hand, in the present information processing system 1, general-purpose processors such as an ARM processor and an FPGA are only required to operate as RCs, so they can easily be added as processors 21 of the present information processing system 1.

In the PCIe bridge controller 3, connection (communication) is performed by PCIe, so that high-speed transfer that cannot be realized by Ethernet can be realized. In addition, transmission and reception of high-definition images such as 4K and 8K, parallel computation on large-scale big data, and the like may be performed between platforms.

It is also possible to connect a dedicated processor dedicated to each function such as image processing, data retrieval, and the like, so that functions can be added and performance can be improved at low cost.

In addition, in the present information processing system 1, for example, virtualization of the system is not necessary, and system performance is not deteriorated by virtualization of the system. Therefore, the present information processing system 1 can also be applied to a system used for a high-load arithmetic operation such as AI inference or image processing.

(E) Others

The present disclosure is not limited to the above-described embodiments, and various modifications may be made without departing from the gist of the embodiments. The configurations and processes of the embodiments can be selected as needed, or can be combined with each other as appropriate.

For example, in the configuration shown in FIG. 6, PCIe bridge controller 3 includes eight slots 34-1 through 34-8, but the embodiments are not limited thereto and may be implemented with various modifications. That is, the PCIe bridge controller 3 may include seven or fewer slots 34, or nine or more slots 34.

In the above embodiments, although the communication system using PCIe has been described, the embodiments are not limited thereto. The present embodiment can be applied to communication based on a communication standard other than PCIe.

In the above embodiments, although PCIe is exemplified as a standard of an I/O interface for each component, the interface is not limited to PCIe. For example, the interface for each component may be realized by any technique for performing data transfer between a device (peripheral controller) and a processor via a data transfer bus. The data transfer bus may be a general-purpose bus that can transfer data at high speed in a local environment (e.g., one system or one device) provided in a single housing. The interface may be either a parallel interface or a serial interface.

In the case of serial transfer, the I/O interface may have a configuration capable of point-to-point connection and of transferring data on a packet basis. In the case of serial transfer, the I/O interface may include multiple lanes. The layer structure of the I/O interface may include a transaction layer for generating and decoding packets, a data link layer for performing error detection and the like, and a physical layer for converting between serial and parallel. The I/O interface may also include a root complex at the top of the hierarchy having one or more ports, endpoints as I/O devices, switches for adding ports, bridges for converting protocols, and the like. The interface may multiplex the transmission data and the clock signal with a multiplexer before transmission. In this case, the receiving side may separate the data from the clock signal with a demultiplexer.

The present embodiments can be implemented and manufactured by those skilled in the art in light of the above disclosure.

(F) Supplementary notes

The following supplementary notes are further disclosed with respect to the above embodiments.

Supplementary note 1. A relay apparatus that is connected to each of platforms serving as computers that perform arithmetic processing to communicate with each of the platforms and relay communication between the platforms via a peripheral component interconnect express (PCIe) bus, the relay apparatus comprising: a first endpoint that receives data from a root complex of a first platform among the platforms; and a second endpoint that transmits the data to a root complex of a second platform among the platforms, the data to be transmitted being received at the second endpoint by tunneling from the first endpoint.

Supplementary note 2. The relay apparatus according to supplementary note 1, further comprising storage areas associated with the first endpoint and the second endpoint, wherein, when data is stored on a first storage area associated with the second endpoint as a transmission destination among the storage areas set for the endpoints among the storage areas associated with the first endpoint as a transmission source, the tunneling from the first endpoint to the second endpoint is performed by storing the data on the first storage area associated with the second endpoint as a transmission destination among the storage areas associated with the second endpoint as a transmission destination.

Supplementary note 3. An information processing system comprising: platforms each serving as a computer that performs arithmetic processing; and a relay apparatus connected to the platforms to communicate with each of the platforms and relay communication between the platforms via a peripheral component interconnect express (PCIe) bus, wherein the relay apparatus comprises: a first endpoint that receives data from a root complex of a first platform among the platforms; and a second endpoint that transmits the data to a root complex of a second platform among the platforms, the data to be transmitted being received at the second endpoint by tunneling from the first endpoint.

Supplementary note 4. The information processing system according to supplementary note 3, wherein each of the platforms includes a storage area used by the relay apparatus for each platform, and when data is stored on a first storage area associated with the second platform as a transmission destination among storage areas set for each platform among the storage areas included in the first platform as a transmission source among the platforms, transfer of the data from the first platform to the second platform is performed so that the data is stored, via the first endpoint and the second endpoint, on a first storage area associated with the second platform as a transmission destination among the storage areas associated with the second platform as a transmission destination.
