Low-speed parallel asynchronous communication method and communication system between FPGA (field programmable gate array) chips

Document No.: 1908219  Publication date: 2021-11-30

Abstract: This technology, "Low-speed parallel asynchronous communication method and communication system between FPGA chips", was designed by 刘国成, 顾大晔, 王秋实, and 周乐 on 2021-09-01. The invention provides a low-speed parallel asynchronous communication method and communication system between FPGA chips. The sending-end working clock clk1 is divided by n to obtain the sending-end divided clock clk_div1. The sending end and the receiving end perform link synchronization before data transmission. After link synchronization is complete, the sending end converts the large-bit-width data of the clk1 clock domain into small-bit-width data of the lower-frequency clk_div1 clock domain and sends it out one by one; the receiving end samples the received clk_div1-domain data with the receiving-end working clock clk2 and restores the data to the clk2 clock domain. By using low-speed parallel asynchronous communication between FPGA chips, the invention achieves asynchronous communication and data transmission between FPGAs on different chips or boards when the data to be transmitted is wide and its signals toggle infrequently.

1. A low-speed parallel asynchronous communication method between FPGA chips, characterized in that the sending-end working clock clk1 is divided by n to obtain a sending-end divided clock clk_div1;

link synchronization is performed between the sending end and the receiving end before data transmission; after link synchronization is complete, the sending end converts the large-bit-width data of the clk1 clock domain into small-bit-width data of the lower-frequency clk_div1 clock domain and sends it out one by one, and the receiving end samples the received clk_div1-domain data with the receiving-end working clock clk2 and restores the data to the clk2 clock domain.

2. The low-speed parallel asynchronous communication method between FPGA chips according to claim 1, characterized in that the frequency division number is set according to the effective bandwidth actually required for transmission, which is determined from the transmission wait time between adjacent valid data.

3. The low-speed parallel asynchronous communication method between FPGA chips according to claim 1 or 2, characterized in that link synchronization comprises the following steps:

step A1: after power-on reset, both the sending end and the receiving end enter the link synchronization state, and the sampling position counter of the receiving end counts cyclically in the range 0 to 2n-1;

step A2: the receiving end pulls the synchronization signal sync low and sends it to the sending end;

step A3: the sending end sends the synchronization word to the receiving end in the clk_div1 clock domain;

step A4: after receiving the synchronization word sequence, the receiving end samples it at intervals according to the current value of the sampling position counter in order to find the synchronization word;

step A5: once the synchronization word has been found at the same sampling position t consecutive times, link synchronization is considered complete;

step A6: the receiving end pulls the synchronization signal sync high, both the sending end and the receiving end leave the link synchronization state, the sending end enters the data waiting state, and the receiving end enters the packet header detection state.

4. The low-speed parallel asynchronous communication method between FPGA chips according to claim 3, characterized in that data transmission comprises the following steps:

step B1: when the sending end detects a transition of the valid data, it writes the data into the FIFO data buffer to wait for sending;

step B2: while the sending end is in the data waiting state, if the FIFO data buffer is not empty, it reads out the data, adds a packet header, and sends the packet to the receiving end;

step B3: the receiving end samples the data sent by the sending end at the sampling position determined in the link synchronization stage; if the packet header is detected, it raises the packet header valid flag, enters the valid data receiving state, and obtains the valid data by sampling in sequence;

step B4: after the valid data has been received, the receiving end pulls the data-reception-complete signal high and outputs the valid data.

5. A low-speed parallel asynchronous communication system between FPGA chips, characterized in that it adopts the low-speed parallel asynchronous communication method between FPGA chips of claim 4 and comprises a sending end and a receiving end, wherein the sending end is provided with a data cache unit, a data coding unit, and a sending link synchronization unit, and the receiving end is provided with a data decoding unit and a receiving link synchronization unit;

the sending link synchronization unit enters the link synchronization state directly after power-on reset, sends out the synchronization word after receiving the pulled-low synchronization signal sync from the receiving end, and leaves the link synchronization state after link synchronization is complete;

the data cache unit writes the data to be sent, which has a large bit width and a low toggle frequency, into the FIFO for caching;

the data coding unit reads the large-bit-width data to be sent from the FIFO cache, converts it into low-frequency data with a small bit width, adds a packet header, and sends it to the receiving end;

the receiving link synchronization unit enters the link synchronization state directly after power-on reset and sends the pulled-low synchronization signal sync to the sending end; it receives the synchronization word from the sending end, pulls the synchronization signal sync high after detecting a stable synchronization word, and leaves the link synchronization state together with the sending end, so that the data analysis unit enters the valid data receiving state;

the data analysis unit parses the valid data according to the packet header and converts the low-frequency data with a small bit width back into data with a large bit width and a low toggle frequency.

Technical Field

The invention relates to the technical field of chip verification, and in particular to a low-speed parallel asynchronous communication method and system between FPGA (field programmable gate array) chips for chip verification.

Background

As chip designs grow in scale and chips find ever wider application, ensuring the correctness of a chip design is a matter every chip designer must consider. Thorough verification before chip production is one of the important measures for improving design quality and ensuring design correctness.

Many chip verification methods exist. Among them, FPGA-based prototype verification is much faster than software simulation on a server, so it is widely used in chip design. However, as chip designs keep growing, even the largest-capacity FPGA currently available cannot hold an entire chip design. The design must therefore be distributed across multiple FPGAs on one board, or even across multiple FPGAs on multiple boards. As a result, the various kinds of interconnect communication and data transmission between FPGA chips become a very important part of the whole verification platform.

Existing inter-chip interconnects mostly either connect the signals directly or use time division multiplexing with a raised communication clock frequency. Because the GPIO resources of an FPGA are limited and the clock frequency an FPGA can support has an upper bound, both approaches are difficult to implement in many cases, and even when they are implemented, the communication rate is very low.

Disclosure of Invention

To address these technical problems in existing chip verification, the invention provides a low-speed parallel asynchronous communication method and communication system between FPGA (field programmable gate array) chips. When the data to be transmitted is wide and its signals toggle infrequently, a time division multiplexing communication method with a low clock frequency is used. It needs only a small number of GPIO (general purpose input/output) ports and does not require raising the inter-chip communication clock frequency, yet it completes asynchronous communication and data transmission between FPGAs on different chips or boards.

The invention provides a low-speed parallel asynchronous communication method between FPGA chips, in which the sending-end working clock clk1 is divided by n to obtain the sending-end divided clock clk_div1. Link synchronization is performed between the sending end and the receiving end before data transmission. After link synchronization is complete, the sending end converts the large-bit-width data of the clk1 clock domain into small-bit-width data of the lower-frequency clk_div1 clock domain and sends it out one by one; the receiving end samples the received clk_div1-domain data with the receiving-end working clock clk2 and restores the data to the clk2 clock domain.

Further, the frequency division number is set according to the effective bandwidth actually required for transmission, which is determined from the transmission wait time between adjacent valid data.

Further, the link synchronization comprises the following steps:

Step A1: after power-on reset, both the sending end and the receiving end enter the link synchronization state, and the sampling position counter of the receiving end counts cyclically in the range 0 to 2n-1;

Step A2: the receiving end pulls the synchronization signal sync low and sends it to the sending end;

Step A3: the sending end sends the synchronization word f0f0f0f0 to the receiving end in the clk_div1 clock domain;

Step A4: after receiving the synchronization word sequence, the receiving end samples it at intervals according to the current value of the sampling position counter in order to find the synchronization word;

Step A5: once the synchronization word has been found at the same sampling position t consecutive times, link synchronization is considered complete;

Step A6: the receiving end pulls the synchronization signal sync high, both the sending end and the receiving end leave the link synchronization state, the sending end enters the data waiting state, and the receiving end enters the packet header detection state.

Further, the data transmission comprises the following steps:

Step B1: when the sending end detects a transition of the valid data, it writes the data into the FIFO data buffer to wait for sending;

Step B2: while the sending end is in the data waiting state, if the FIFO data buffer is not empty, it reads out the data, adds a packet header, and sends the packet to the receiving end;

Step B3: the receiving end samples the data sent by the sending end at the sampling position determined in the link synchronization stage; if the packet header is detected, it raises the packet header valid flag, enters the valid data receiving state, and obtains the valid data by sampling in sequence;

Step B4: after the valid data has been received, the receiving end pulls the data-reception-complete signal high and outputs the valid data.

The invention also provides a low-speed parallel asynchronous communication system between FPGA chips, which adopts the above communication method and comprises a sending end and a receiving end. The sending end is provided with a data cache unit, a data coding unit, and a sending link synchronization unit; the receiving end is provided with a data decoding unit and a receiving link synchronization unit.

The sending link synchronization unit enters the link synchronization state directly after power-on reset, sends out the synchronization word after receiving the pulled-low synchronization signal sync from the receiving end, and leaves the link synchronization state after link synchronization is complete.

The data cache unit writes the data to be sent, which has a large bit width and a low toggle frequency, into the FIFO for caching.

The data coding unit reads the large-bit-width data to be sent from the FIFO cache, converts it into low-frequency data with a small bit width, adds a packet header, and sends it to the receiving end.

The receiving link synchronization unit enters the link synchronization state directly after power-on reset and sends the pulled-low synchronization signal sync to the sending end. It receives the synchronization word from the sending end, pulls the synchronization signal sync high after detecting a stable synchronization word, and leaves the link synchronization state together with the sending end, so that the data analysis unit enters the valid data receiving state.

The data analysis unit parses the valid data according to the packet header and converts the low-frequency data with a small bit width back into data with a large bit width and a low toggle frequency.

By using low-speed parallel asynchronous communication between FPGA chips, the invention completes asynchronous communication and data transmission between FPGAs on different chips or boards when the data to be transmitted is wide and its signals toggle infrequently. Calculating and matching the effective bandwidth avoids spending a large amount of bandwidth on invalid data, significantly reduces both the number of GPIO ports used and the inter-chip communication clock frequency, and improves communication stability.

Drawings

Fig. 1 is a schematic diagram of the system application scenario;

Fig. 2 is a block diagram of the overall architecture of the FPGA inter-chip low-speed parallel asynchronous communication solution;

Fig. 3 is the sending-end state transition diagram;

Fig. 4 is the receiving-end state transition diagram;

Fig. 5 shows the receiving-end synchronization training process;

Fig. 6 shows the sending-end data packing format;

Fig. 7 shows the receiving-end data parsing format.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments. The embodiments of the present invention have been presented for purposes of illustration and description, and are not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

Example 1

In the application scenario shown in Fig. 1, data must be transferred among FPGA1 to FPGA6, which are distributed across multiple boards. Across boards, the reference clocks of two FPGA chips are asynchronous and may differ somewhat in frequency. How to implement communication between FPGA chips in this scenario is precisely the problem the invention addresses.

The overall structure block diagram of the FPGA inter-chip low-speed parallel asynchronous communication solution provided by the invention is shown in FIG. 2, and comprises a sending end and a receiving end, wherein the sending end is provided with a data cache unit, a data coding unit and a sending link synchronization unit, and the receiving end is provided with a data decoding unit and a receiving link synchronization unit.

For the AXI4 standard bus, this embodiment directly uses the CHIP2CHIP IP provided by Xilinx to convert the AXI bus interface into a high-speed serdes interface. The FPGA chips are interconnected directly via serdes, and the serdes link rate can reach 8 Gbps. The other control signals usually involve a very large number of signal lines with a very low toggle frequency; that is, there is little effective control information to transmit. The invention therefore adopts a low-speed parallel asynchronous communication solution: when the data to be transmitted is wide and its signals toggle infrequently, it uses time division multiplexing at a low clock frequency, needs only a small number of GPIO ports, and completes asynchronous communication and data transmission between FPGAs on different chips or boards without raising the communication clock frequency.

At the sending end: the sending link synchronization unit enters the link synchronization state directly after power-on reset, sends out the synchronization word after receiving the pulled-low synchronization signal sync from the receiving end, and leaves the link synchronization state after link synchronization is complete. The data cache unit writes the data to be sent, which has a large bit width and a low toggle frequency, into the FIFO for caching. The data coding unit reads the large-bit-width data to be sent from the FIFO cache, converts it into low-frequency data with a small bit width, adds a packet header, and sends it to the receiving end.

At the receiving end: the receiving link synchronization unit enters the link synchronization state directly after power-on reset and sends the pulled-low synchronization signal sync to the sending end. It receives the synchronization word from the sending end, pulls the synchronization signal sync high after detecting a stable synchronization word, and leaves the link synchronization state together with the sending end, so that the data analysis unit enters the valid data receiving state. The data analysis unit parses the valid data according to the packet header and converts the low-frequency data with a small bit width back into data with a large bit width and a low toggle frequency.

Moving data from a large-bit-width, high-frequency clock domain to a small-bit-width, low-frequency clock domain first requires determining the frequency division number. The division number is set according to the effective bandwidth actually required for transmission, which is determined from the transmission wait time between adjacent valid data.

For example, if clk1 runs at 100 MHz and the control signals to be transmitted are 500 bits wide, the theoretically required transmission bandwidth is 50 Gbps. If only 4 lines are available between the chips, carrying 50 Gbps would require a clock as high as 12.5 GHz, which is impossible on GPIO ports. Analysis shows, however, that in practice a control signal stays unchanged for a long time after carrying one valid value, and the next valid value may not arrive for 10 us; that is, the effective toggle frequency is only about 100 kHz. The effective bandwidth actually required is therefore only 50 Mbps (500 bits at 100 kHz), and with 4 signal lines between the chips a communication clock of 12.5 MHz is enough. With this kind of bandwidth-matching calculation, the communication module can be built as a parameterized standard module.
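The bandwidth-matching arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical, and the upper bound on the division number is an illustration that ignores packet header overhead and any sampling margin; the patent itself does not state such a bound.

```python
def required_link_clock_hz(bit_width, update_rate_hz, num_lines):
    """Clock rate needed to move `bit_width` bits, `update_rate_hz` times
    per second, over `num_lines` parallel wires (one bit per wire per cycle)."""
    bandwidth_bps = bit_width * update_rate_hz
    return bandwidth_bps / num_lines


def max_division_number(clk1_hz, link_clock_hz):
    """Largest integer n for which clk1 / n is still at least link_clock_hz.
    Assumption: header overhead and sampling margin are ignored."""
    return int(clk1_hz // link_clock_hz)


# Numbers from the example: 500-bit control bus, 4 inter-chip lines.
naive_clock = required_link_clock_hz(500, 100_000_000, 4)    # every clk1 cycle
effective_clock = required_link_clock_hz(500, 100_000, 4)    # one update per 10 us
n_upper_bound = max_division_number(100_000_000, effective_clock)

print(naive_clock / 1e9, "GHz")      # 12.5 GHz
print(effective_clock / 1e6, "MHz")  # 12.5 MHz
print(n_upper_bound)                 # 8
```

Under these assumptions any division number up to 8 keeps clk_div1 at or above 12.5 MHz; a smaller value, such as the divide-by-3 used later in this embodiment, leaves headroom for the packet header.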

In the low-speed parallel asynchronous communication method between FPGA chips, the sending-end working clock clk1 is divided by n to obtain the sending-end divided clock clk_div1, whose frequency is calculated from the effective bandwidth actually required for transmission. Link synchronization is performed between the sending end and the receiving end before data transmission. After link synchronization is complete, the sending end converts the large-bit-width data of the clk1 clock domain into small-bit-width data of the lower-frequency clk_div1 clock domain and sends it out one by one; the receiving end samples the received clk_div1-domain data with the receiving-end working clock clk2 and restores the data to the clk2 clock domain. In essence, this trades time for space: it saves GPIO port resources and improves the stability of asynchronous communication. The state transitions of the sending end and the receiving end are shown in Fig. 3 and Fig. 4. As they show, the first step in inter-chip asynchronous communication is to establish link synchronization.

After power-on reset, both the sending end and the receiving end enter the link synchronization state. The receiving end pulls the synchronization signal sync low and sends it to the sending end, and the sending end starts to send the synchronization word continuously: f0f0f0…

Take dividing clk1 by 3 as an example. Because the sending end transmits data in the divide-by-3 low-frequency clock domain while the receiving end samples in the high-frequency clock domain, the synchronization word sequence sampled in the high-frequency clock domain should ideally be fff000fff000fff000…

However, because clk1 and clk2 may differ in phase and frequency, the receiving end may instead sample sequences such as fff000fff0000…, ff0000ff0000…, or ffff00ffff00… Subsequent valid data can be captured correctly only if the synchronization word is recovered from the sampled sequence, which is why link synchronization is so important.
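The effect of a clock mismatch on the sampled sequence can be illustrated with a small Python model. This is a simplified sketch, not the patent's circuit: it assumes ideal edges, measures time in clk1 cycles, and uses hypothetical names (`sample_stream`, `run_lengths`, `ratio`, `phase`). It models only phase and frequency offset, so it reproduces the ideal fff000 pattern and the occasional stretched run, not every distorted pattern listed above.

```python
from fractions import Fraction


def sample_stream(symbols, n, num_samples, ratio=Fraction(1), phase=Fraction(1, 2)):
    """Sample a repeating symbol stream with the receiver clock.

    symbols: hex digits sent in turn, each held for n clk1 cycles (clk_div1 = clk1 / n)
    ratio:   f_clk2 / f_clk1 (1 means both clocks run at the same frequency)
    phase:   time of the first clk2 sample, in clk1 cycles
    """
    out = []
    for k in range(num_samples):
        t = phase + Fraction(k) / ratio        # sample instant in clk1 cycles
        index = int(t // n) % len(symbols)     # symbol on the wire at that instant
        out.append(symbols[index])
    return "".join(out)


def run_lengths(stream):
    """Lengths of consecutive runs of identical characters."""
    runs, count = [], 1
    for prev, cur in zip(stream, stream[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs


ideal = sample_stream("f0", n=3, num_samples=12)
print(ideal)  # fff000fff000

# Receiver clock 1% faster than clk1: most symbols are seen 3 times, some 4 times.
skewed = sample_stream("f0", n=3, num_samples=600, ratio=Fraction(101, 100))
print(sorted(set(run_lengths(skewed)[:-1])))  # last run may be cut short, so skip it
```

Exact rational arithmetic (`Fraction`) is used so that samples falling near a symbol boundary are not affected by floating-point rounding.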

The link synchronization method adopted by the embodiment comprises the following steps:

1. After power-on reset, both the sending end and the receiving end enter the link synchronization state, and the sampling position counter bit_cnt of the receiving end counts cyclically in the range 0 to 5. The range is 0 to 5 because, for the 8-bit synchronization word f0, the sequence ideally sampled at the receiving end is fff000, which has 6 sampling positions.

2. The receiving end pulls the synchronization signal sync low and sends it to the sending end; the sending end sends the synchronization word f0f0f0… to the receiving end in the clk_div1 clock domain.

3. After receiving the synchronization word data sequence, the receiving end samples it at intervals according to the current value of the sampling position counter in order to find the synchronization word. Taking the received sequence ff0000ff0000ff0000 as an example, if bit_cnt is 3, the sequence 0f0f0f can be sampled stably. The principle is shown in Table 1 below, where sampling starts at bit_cnt = 3.

Received sequence   F F 0 0 0 0 F F 0 0 0 0 F F 0 0 0 0 F F 0 0 0 0
bit_cnt             0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5
Sampled value             0     F     0     F     0     F     0

Table 1

4. Once the synchronization word has been found at the same sampling position more than a set threshold number of consecutive times, link synchronization is considered complete.

5. The receiving end pulls the synchronization signal sync high, both the sending end and the receiving end leave the link synchronization state, the sending end enters the data waiting state, and the receiving end enters the packet header detection state.
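The search for a stable sampling position in steps 3 and 4 can be sketched in Python as follows. This is one possible reading of the procedure, not the patent's RTL: the function name `valid_sampling_positions` is hypothetical, and a position is treated as valid when sampling every n-th value from it yields t consecutive synchronization words, in either f0 or 0f alignment.

```python
def valid_sampling_positions(samples, n, t):
    """Return every bit_cnt start value in 0..2n-1 from which sampling every
    n-th received value yields t consecutive sync words.

    samples: received values in the clk2 domain, one hex digit per clk2 cycle
    n:       frequency division number (each sent nibble spans about n samples)
    t:       required number of consecutive sync-word matches
    """
    valid = []
    for start in range(2 * n):
        picked = samples[start::n][: 2 * t]    # one sample per transmitted nibble
        if picked in ("f0" * t, "0f" * t):     # accept either word alignment
            valid.append(start)
    return valid


# Distorted sequence from Table 1: ff0000 repeated instead of the ideal fff000.
received = "ff0000" * 8
print(valid_sampling_positions(received, n=3, t=4))  # [0, 1, 3, 4]
```

For the Table 1 sequence, bit_cnt = 3 is among the valid positions, which matches the 0f0f0f sampling shown there, while positions 2 and 5 are rejected because they see only zeros.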

The data transmission flow of the low-speed parallel asynchronous communication method between FPGA chips is described below, taking the transmission of the valid data 0x12345678 as an example. Fig. 6 and Fig. 7 show the data packet formats of the sending end and the receiving end, respectively.

1. When the sending end detects a transition of the valid data, it writes the data 0x12345678 into the FIFO data buffer to wait for sending.

2. While the sending end is in the data waiting state, if the FIFO data buffer is not empty, it reads out the data 0x12345678, adds the packet header, and sends the packet to the receiving end: the packet header ffff0000 is sent first, followed by the valid data 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, and 0x8.

3. The receiving end samples the data sent by the sending end at the sampling position determined in the link synchronization stage. If the packet header ffff0000 is detected, it raises the packet header valid flag frm_header_vld, enters the valid data receiving state, and obtains the valid data 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, and 0x8 by sampling in sequence.

4. After the valid data has been received, the receiving end pulls the data-reception-complete signal data_out_vld high and outputs the valid data.
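The packing and parsing in the four steps above can be sketched in Python as follows. This is a minimal illustration: the 4-bit transfer width is an assumption taken from the 4-line example and the nibble-by-nibble order of the 0x12345678 example, the helper names `pack` and `unpack` are hypothetical, and the data width is assumed to be a multiple of the transfer width.

```python
HEADER = [0xF] * 4 + [0x0] * 4   # packet header ffff0000, one nibble per transfer


def pack(value, width_bits=32, lane_bits=4):
    """Sending side: prepend the header, then emit the value as nibbles,
    most significant nibble first."""
    mask = (1 << lane_bits) - 1
    shifts = range(width_bits - lane_bits, -1, -lane_bits)
    return HEADER + [(value >> s) & mask for s in shifts]


def unpack(stream, width_bits=32, lane_bits=4):
    """Receiving side: find the header, then rebuild the wide value from the
    nibbles that follow. Returns None if no complete packet is present."""
    count = width_bits // lane_bits
    for i in range(len(stream) - len(HEADER) - count + 1):
        if stream[i:i + len(HEADER)] == HEADER:
            value = 0
            payload_start = i + len(HEADER)
            for word in stream[payload_start:payload_start + count]:
                value = (value << lane_bits) | word
            return value
    return None


packet = pack(0x12345678)
print([hex(w) for w in packet[len(HEADER):]])  # payload nibbles 0x1 .. 0x8
print(hex(unpack([0, 0, 0] + packet)))         # 0x12345678
```

In this sketch the stream passed to `unpack` is assumed to be already decimated to one value per transmitted nibble, using the sampling position found during link synchronization.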

It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by one of ordinary skill in the art and related arts based on the embodiments of the present invention without any creative effort, shall fall within the protection scope of the present invention.
