Device and method for elastically expanding an AI edge server with embedded blades

Document No.: 1086816 | Publication date: 2020-10-20

Note: This technology, a device and method for elastically expanding an AI edge server with embedded blades, was designed and created by Han Lei (韩磊) on 2020-03-25.

Abstract: The invention belongs to the technical field of AI and discloses a device and method for elastically expanding an AI edge server with embedded blades. The device comprises one or more compute/decode blades, each of which acquires a network camera video stream from the network, decodes and preprocesses the stream with the CPU's built-in decoder to produce the input data required by the AI chip, transmits the input data to the AI chip over PCIe for inference, collects the inference results, and returns them to the remote AI management platform. An AI inference component obtains the input data to be inferred over PCIe, computes results on the AI chip, and returns them over PCIe. A switching mainboard routes the network camera video streams to the compute/decode blades through its switch chip; compute/decode blades and AI inference blades are used in tandem, and one compute/decode blade can drive multiple AI inference blades. The invention offers high computing power, high bandwidth, high density, high reliability, low latency, and flexible expansion.

1. A device for elastically expanding an AI edge server with embedded blades, characterized in that it comprises:

one or more compute/decode blades, each of which acquires a network camera video stream from the network, decodes and preprocesses the stream with the CPU's built-in decoder to produce the input data required by the AI chip, transmits the input data to the AI chip over PCIe for inference, collects the AI chip's inference results, and returns them to the remote AI management platform;

an AI inference component, which obtains the input data to be inferred over PCIe, computes a result on the AI chip, and returns the result over PCIe to the compute/decode blade;

a switching mainboard, which routes the network camera video streams to the compute/decode blades through its switch chip; the compute/decode blades and AI inference blades are used in tandem, and one compute/decode blade can drive multiple AI inference blades;

a power supply, which powers the CPU and the switching mainboard;

and a hard disk, connected to the CPU, for storing data.

2. The device for elastically expanding an AI edge server with embedded blades according to claim 1, wherein each compute/decode blade can process 0 to 16 channels of 1080p video streams.

3. The device for elastically expanding an AI edge server with embedded blades according to claim 1, wherein the AI inference component is an AI inference blade, which obtains the input data to be inferred over PCIe, computes a result on the AI chip, and returns the result to the compute/decode blade over PCIe.

4. The device for elastically expanding an AI edge server with embedded blades according to claim 3, wherein the mainboard is provided with a plurality of slots, each offering PCIe and a network port, so that different numbers of compute/decode blades and AI inference blades can be inserted according to performance requirements.

5. The device for elastically expanding an AI edge server with embedded blades according to claim 1, wherein the AI inference component is a complete server, and complete servers playing different roles communicate through a network switch.

6. The device for elastically expanding an AI edge server with embedded blades according to claim 5, wherein the communication between the complete servers comprises the following steps:

the general-purpose server issues a software algorithm and data to the GPU training server through the network switch;

the general-purpose server issues a software algorithm and data to the AI inference server through the network switch;

the GPU training server and the AI inference server compute and return results to the general-purpose server through the network switch;

and the general-purpose server receives the returned results, then further processes and outputs them.

7. A method for elastically expanding an AI edge server with embedded blades, characterized in that it comprises the following steps:

S1: multiple channels of network camera streams, pictures, or videos enter the AI server over the network, and the network switch distributes the data to different compute/decode blades;

S2: load balancing is performed according to the current load of each compute/decode blade and AI blade;

S3: the compute/decode blade decodes and preprocesses the network camera video stream to obtain the input data required by the AI chip;

S4: the feature data are transmitted to an AI inference blade over the PCIe interface for inference;

S5: the resulting feature data are transmitted back to the compute/decode blade over the PCIe interface;

S6: the results are transmitted back to the management platform through the compute/decode blade's network interface.

Technical Field

The invention relates to the technical field of AI (Artificial Intelligence), and in particular to a device and method for elastically expanding an AI edge server with embedded blades.

Background

AI technology enables a computer to think like a person: through software algorithms, information such as human faces, voices, and text can be intelligently recognized from videos or pictures. An AI system is mainly composed of servers for inference, training, control, and communication.

AI workloads require high-performance computing, large-capacity disks to store data, and high bandwidth to transfer data. Therefore, computers and solutions implementing AI must offer high performance, high bandwidth, high density, and the like.

Before AI-specific chips were developed, conventional AI algorithms were typically computed on CPUs and GPUs. Because conventional CPUs and GPUs carry many instructions irrelevant to AI and no AI-specific instructions, traditional AI solutions deliver low computing power at high power consumption and high cost. To reach the computing power and response times AI requires, most deployments upload algorithms and data to cloud servers and exploit the cloud's huge computing power and mass storage. This approach suffers from high power consumption, high cost, and large latency.

Disclosure of Invention

The technical problem to be solved by the invention is to overcome the above defects by providing a device and method for elastically expanding an AI edge server with embedded blades, which offer high computing power, high bandwidth, high density, high reliability, low latency, and flexible expansion, and can effectively solve the problems described in the background.

In order to solve these technical problems, the invention provides the following technical solutions.

the invention provides a device for elastically expanding an AI edge server by an embedded blade, which comprises:

one or more compute/decode blades, each of which acquires a network camera video stream from the network, decodes and preprocesses the stream with the CPU's built-in decoder to produce the input data required by the AI chip, transmits the input data to the AI chip over PCIe for inference, collects the AI chip's inference results, and returns them to the remote AI management platform (a code sketch of this loop follows the component list);

an AI inference component, which obtains the input data to be inferred over PCIe, computes a result on the AI chip, and returns the result over PCIe to the compute/decode blade;

a switching mainboard, which routes the network camera video streams to the compute/decode blades through its switch chip; the compute/decode blades and AI inference blades are used in tandem, and one compute/decode blade can drive multiple AI inference blades;

a power supply, which powers the CPU and the switching mainboard;

and a hard disk, connected to the CPU, for storing data.
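To make the compute/decode blade's data path concrete, the following is a minimal Python sketch of its main loop. It assumes an OpenCV-style capture (which uses the CPU's built-in decoder where available); the modules ai_chip and management_client, the PCIe slot number, and the tensor shape are illustrative assumptions, not the patent's API.

import cv2                # decodes the camera stream on the CPU

import ai_chip            # hypothetical SDK for the PCIe AI inference blade
import management_client  # hypothetical client for the remote AI platform

def run_decode_blade(camera_url: str, platform_url: str) -> None:
    cap = cv2.VideoCapture(camera_url)       # pull the IP camera stream
    device = ai_chip.open(pcie_slot=0)       # attach to the AI inference blade
    platform = management_client.connect(platform_url)
    while cap.isOpened():
        ok, frame = cap.read()               # decode one frame
        if not ok:
            break
        # Preprocess into the layout the AI chip expects (size is assumed).
        tensor = cv2.resize(frame, (416, 416)).transpose(2, 0, 1)
        result = device.infer(tensor)        # PCIe transfer + inference
        platform.report(result)              # return result to the platform
    cap.release()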

As a preferred technical solution of the invention, each compute/decode blade can process 0 to 16 channels of 1080p video streams.

As a preferred technical solution of the invention, the AI inference component is an AI inference blade, which obtains the input data to be inferred over PCIe, computes a result on the AI chip, and returns the result to the compute/decode blade over PCIe.

As a preferred technical solution of the invention, the mainboard is provided with a plurality of slots, each offering PCIe and a network port, so that different numbers of compute/decode blades and AI inference blades can be inserted according to performance requirements.
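As a worked example of this sizing, the back-of-the-envelope calculation below estimates how many blades of each type a given camera population needs. The 16-streams-per-decode-blade figure comes from the preferred solution above; the per-inference-blade capacity is an assumed, vendor-dependent number.

import math

def blades_needed(num_cameras: int,
                  streams_per_decode_blade: int = 16,   # from the 0-16 figure above
                  streams_per_inference_blade: int = 8  # assumption, vendor-dependent
                  ) -> tuple[int, int]:
    decode = math.ceil(num_cameras / streams_per_decode_blade)
    inference = math.ceil(num_cameras / streams_per_inference_blade)
    return decode, inference

print(blades_needed(48))  # -> (3, 6): 3 decode blades and 6 inference blades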

As a preferred technical solution of the invention, the AI inference component is a complete server, and complete servers playing different roles communicate through a network switch.

As a preferred technical solution of the invention, the communication between the complete servers comprises the following steps (a minimal code sketch follows the list):

the general-purpose server issues a software algorithm and data to the GPU training server through the network switch;

the general-purpose server issues a software algorithm and data to the AI inference server through the network switch;

the GPU training server and the AI inference server compute and return results to the general-purpose server through the network switch;

and the general-purpose server receives the returned results, then further processes and outputs them.
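The sketch below illustrates one way the general-purpose server could push an algorithm and data to a role server (GPU training or AI inference) and collect the result. It uses plain TCP from Python's standard library; the host name, port, and JSON message format are invented for illustration, as the patent does not specify a wire protocol.

import json
import socket

def send_job(host: str, port: int, algorithm: str, data: list) -> dict:
    # Push the algorithm name and data through the network switch, then
    # block until the role server returns its computed result.
    with socket.create_connection((host, port)) as conn:
        conn.sendall(json.dumps({"algorithm": algorithm, "data": data}).encode())
        conn.shutdown(socket.SHUT_WR)   # signal end of request
        reply = b"".join(iter(lambda: conn.recv(4096), b""))
    return json.loads(reply)            # result for further processing and output

# e.g. result = send_job("ai-inference-server", 9000, "face-detect", frames)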

A method for elastically expanding an AI edge server with embedded blades comprises the following steps:

S1: multiple channels of network camera streams, pictures, or videos enter the AI server over the network, and the network switch distributes the data to different compute/decode blades;

S2: load balancing is performed according to the current load of each compute/decode blade and AI blade (a dispatch sketch follows these steps);

S3: the compute/decode blade decodes and preprocesses the network camera video stream to obtain the input data required by the AI chip;

S4: the feature data are transmitted to an AI inference blade over the PCIe interface for inference;

S5: the resulting feature data are transmitted back to the compute/decode blade over the PCIe interface;

S6: the results are transmitted back to the management platform through the compute/decode blade's network interface.
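The dispatch sketch below illustrates steps S1-S2: incoming camera streams are assigned to the compute/decode blade with the lowest relative load. The Blade class and the load metric are assumptions for illustration; the patent does not specify the balancing policy.

from dataclasses import dataclass, field

@dataclass
class Blade:
    name: str
    capacity: int                      # max concurrent 1080p streams, e.g. 16
    streams: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return len(self.streams) / self.capacity

def dispatch(camera_url: str, blades: list) -> Blade:
    # Assign a new camera stream to the least-loaded blade.
    blade = min(blades, key=lambda b: b.load)
    if blade.load >= 1.0:
        raise RuntimeError("all compute/decode blades are saturated")
    blade.streams.append(camera_url)
    return blade

blades = [Blade("decode-0", 16), Blade("decode-1", 16)]
print(dispatch("rtsp://cam-01/stream", blades).name)  # -> decode-0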

The one or more technical solutions provided by the invention have at least the following technical effects or advantages:

1) High computing power density

A single AI server can host many compute/decode blades and AI acceleration blades; its computing power density is more than 1000 times that of the traditional architecture, and it can handle more than 64 times as many cameras.

2) Low power consumption

The ARM compute/decode CPU consumes little power, and the AI chip is a dedicated ASIC whose special-purpose instructions deliver high computing power at low power consumption and low cost.

3) Flexible combination

Different numbers of compute/decode blades or AI blades can be configured depending on the amount and characteristics of the data to be processed.

4) High bandwidth and low latency

Data paths between blades, and between the AI blades and the main control blade, run over PCIe and the network, providing high bandwidth and low latency.

5) High speed

The compute/decode blade uses the ARM chip's hardware decoding, after which the data are transferred directly over PCIe and the network to the corresponding components for immediate processing.

6) Reliability

Blades are hot-pluggable: compute/decode blades and AI blades can be added or removed while the application is running to scale computing power up or down, and a failed card can be hot-swapped for replacement (see the sketch below).
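Continuing the dispatcher sketch from the method section above, hot-plugging maps naturally onto adding or removing a blade from the scheduling pool at runtime; this is an illustrative assumption about the software side, not the patent's stated mechanism.

def hot_add(blades: list, blade) -> None:
    # A card inserted while the application runs joins the pool immediately.
    blades.append(blade)

def hot_remove(blades: list, name: str) -> None:
    # A failed card is dropped from the pool and can be pulled for replacement.
    blades[:] = [b for b in blades if b.name != name]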

7) Flexible data distribution

Different data to be processed can be assigned to different blades, allowing computing resources to be allocated flexibly.

8) High extensibility

For different customer or application requirements, compute/decode blades, AI inference blades, and other components from different manufacturers can be substituted to quickly and flexibly meet those requirements.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.

In the drawings:

Fig. 1 is a flowchart of the device and method for elastically expanding an AI edge server with embedded blades according to an embodiment of the invention;

Fig. 2 is a schematic diagram of the compute/decode blade of the device according to an embodiment of the invention;

Fig. 3 is a schematic diagram of the AI blade of the device according to an embodiment of the invention;

Fig. 4 is a schematic diagram of the complete AI elastic expansion server according to an embodiment of the invention;

Fig. 5 is a schematic diagram of a compute/decode blade of the embedded-blade elastic expansion AI edge server according to an embodiment of the invention;

Fig. 6 is a schematic diagram of the complete servers of the embedded-blade elastic expansion AI edge server according to an embodiment of the invention, where complete servers playing different roles communicate through a network switch.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

In the description of the present invention, it is to be understood that the terms "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on the orientations or positional relationships illustrated in the drawings, and are used merely for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention. Further, in the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.

For a better understanding of the above technical solutions, they are described in detail below in conjunction with the drawings and specific embodiments.
