Path design method, system and storage medium for forwarding instruction data in advance

Document number: 947609    Publication date: 2020-10-30

Abstract: This technology, "Path design method, system and storage medium for forwarding instruction data in advance", was created by 刘权胜 and 余红斌 on 2020-06-05. The invention relates to the technical field of microelectronics, and in particular to a path design method, system and storage medium for forwarding instruction data in advance. The method first determines the data of the class of instructions whose data can be determined early; it then performs early-forwarding condition detection between instructions and forwards data between instructions in advance; finally, it forwards in advance the data of instructions from several cycles earlier, which speeds up the issue of instructions from the reservation station to the execution units and also speeds up subsequent instructions in the pipeline. The method obtains an instruction's dependent data quickly and ahead of time, so that dependent instructions satisfy their issue conditions earlier.

1. A path design method for forwarding instruction data in advance, characterized in that the method first determines the data of the class of instructions whose data can be determined early; then performs early-forwarding condition detection between instructions and forwards data between instructions in advance; and finally forwards in advance the data of instructions from several cycles earlier, thereby speeding up the issue of instructions from the reservation station to the execution units and speeding up subsequent instructions in the pipeline.

2. The path design method for forwarding instruction data in advance as claimed in claim 1, wherein, for the data of such instructions, a subset of the integer instruction types of the RISC-V instruction set is selected, and whether a given instruction can obtain its data is determined at a stage such as before the instruction is written into the cache, the instruction fetch stage, the predecoder, or the instruction queue.

3. The method as claimed in claim 2, wherein, besides immediates, x0 is a special register in the RISC-V instruction set: it is hardwired to 0 and writes to it have no effect, so when x0 appears in an instruction its value is known to be 0, which is equivalent to a special form of the immediate 0.

4. The method according to claim 1, wherein such instructions first perform data forwarding between instructions; to support early data forwarding over a wider range, the instructions are renamed and a physical register is allocated to the destination register of each instruction; at the dispatch stage an instruction enters the reservation station and simultaneously enters a buffer fw_buffer[N-1:0]; and fw_buffer stores the immediates, physical register numbers and other control signals of the instructions from the most recent N cycles that qualify for early data forwarding.

5. The method as claimed in claim 4, wherein fw_buffer has depth N and width M, the depth N covering the instruction state of the N cycles preceding the current cycle; if an instruction that qualifies for early data forwarding exists within those N cycles, that instruction may still be waiting in the reservation station without having been issued to an execution unit, may be executing in an execution unit, or may be on the execution unit's data forwarding path; and its data is forwarded to the dependent instruction by comparing physical register numbers.

6. The method as claimed in claim 5, wherein, after renaming, the physical register of an instruction's source register is compared with the physical registers of the destination registers of the instructions in fw_buffer, and on a hit the data of the source register is obtained in advance.

7. The method of claim 1, wherein the design method uses an instruction optimization method, including the method of obtaining immediates from instructions; similar optimizations can be made according to the characteristics of different instruction sets, so the method is applicable to any instruction set.

8. The method according to claim 7, wherein the instruction optimization method comprises: the idea and principle of forwarding immediates between instructions; a method of forwarding data in advance, before the physical register read and before execution in the execution unit; a method of forwarding, through a buffer, the data of the instructions depended upon from up to N cycles earlier; and the decision logic and decision method for forwarding data.

9. A path design system for forwarding instruction data in advance, comprising a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor hardware executes the path design method for forwarding instruction data in advance according to any one of claims 1 to 8.

10. A readable medium comprising execution instructions, wherein when a processor of a path design system for forwarding instruction data in advance executes the execution instructions, the system performs the path design method for forwarding instruction data in advance according to any one of claims 1 to 8.

Technical Field

The invention relates to the technical field of microelectronics, in particular to a method, a system and a storage medium for designing a path for forwarding instruction data in advance.

Background

Microprocessors have made tremendous progress in just a few decades. Processor performance has been improved continually on many fronts, including hardware architecture, process technology, and the combination of software and hardware. The hardware architecture has evolved from single-issue scalar to multi-issue superscalar; from the original 3-stage pipeline to pipelines of tens of stages; from in-order to out-of-order execution; from no cache to a 3-level cache hierarchy; and from a single physical core to multiple physical cores (CMP, Chip Multi-Processors) and from a single logical core to multiple logical cores (SMT). Even in cluster systems for supercomputing, the instruction-level and thread-level parallelism exploited by processors has advanced greatly. The instruction-level parallel bandwidth demanded of a single-core microprocessor keeps rising, and the logic complexity of the chip has multiplied accordingly.

Currently, the pipeline processing bandwidth of server processors reaches up to 8 instructions per clock cycle, and in the terminal device domain the instruction processing bandwidth also reaches 6 instructions per clock cycle. CPUs are designed with high-bandwidth processing capability in pursuit of better performance. Instructions within the same clock cycle may depend on one another, or on instructions from some earlier clock cycle. Because the instruction set contains a class of instructions whose data is already determined, such instructions can obtain their data at the decoder or at some earlier stage.

In the conventional design approach, an instruction has to read its data from the register file, or obtain it by matching against the results of instructions that have finished executing in the execution units. For the class of instructions described above, the conventional approach therefore does nothing to improve the execution efficiency of the pipeline.

FIG. 1 illustrates a multi-core CPU in which N physical cores share the L3 cache and memory; each physical core may have a single-threaded or multi-threaded architecture. Each core is applicable to all instruction sets, architectures, and processes.

FIG. 2 shows a single physical core, which may have a single-threaded or multi-threaded architecture. The module division of the core is described functionally in Table 1.

Disclosure of Invention

To address the shortcomings of the prior art, the invention discloses a path design method, system and storage medium for forwarding instruction data in advance. They address the following problem: in current multi-issue designs, an instruction obtains its data either by reading the physical register file or from the result returned when an execution unit completes an instruction, so data that is already determined before an instruction enters the reservation station cannot be forwarded, and dependent instructions therefore cannot be woken up earlier.

The invention is realized by the following technical scheme:

In a first aspect, the invention discloses a path design method for forwarding instruction data in advance. The method first determines the data of the class of instructions whose data can be determined early; it then performs early-forwarding condition detection between instructions and forwards data between instructions in advance; finally, it forwards in advance the data of instructions from several cycles earlier, thereby speeding up the issue of instructions from the reservation station to the execution units and speeding up subsequent instructions in the pipeline.

Furthermore, for the data of such instructions, a subset of the integer instruction types of the RISC-V instruction set is selected, and whether a given instruction can obtain its data is determined at a stage such as before the instruction is written into the cache, the instruction fetch stage, the predecoder, or the instruction queue.

Furthermore, besides immediates, x0 is a special register in the RISC-V instruction set: it is hardwired to 0 and writes to it have no effect, so when x0 appears in an instruction its value is known to be 0, which is equivalent to a special form of the immediate 0.
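The early-determination step can be illustrated with a small software model. This is only a sketch under stated assumptions: the source does not list which integer instructions qualify, so the choice of `LUI` and `ADDI rd, x0, imm` here is an assumption, as are the function names `sign_extend` and `early_value` and the use of 32-bit (RV32) register values.

```python
# Assumption: only LUI and "ADDI rd, x0, imm" are treated as early-determinable.
# The source says only "a subset of RISC-V integer instruction types".

OPCODE_OP_IMM = 0b0010011  # I-type ALU instructions such as ADDI
OPCODE_LUI = 0b0110111     # load upper immediate


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def early_value(insn):
    """Return (rd, value) if the result is known from the encoding alone, else None.

    insn is a 32-bit RISC-V instruction word; value is truncated to 32 bits.
    """
    opcode = insn & 0x7F
    rd = (insn >> 7) & 0x1F
    if rd == 0:
        return None  # writes to x0 have no effect, so there is nothing to forward
    if opcode == OPCODE_LUI:
        return rd, insn & 0xFFFFF000  # imm[31:12] placed in the upper 20 bits
    if opcode == OPCODE_OP_IMM:
        funct3 = (insn >> 12) & 0x7
        rs1 = (insn >> 15) & 0x1F
        if funct3 == 0 and rs1 == 0:  # ADDI with x0 as source: result is the immediate
            return rd, sign_extend(insn >> 20, 12) & 0xFFFFFFFF
    return None


print(early_value(0x02A00293))  # addi x5, x0, 42
print(early_value(0x02A08293))  # addi x5, x1, 42 depends on x1, so not early
```

In hardware this check would be combinational logic at the predecoder or a similar stage; the Python form is only meant to make the condition explicit.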

Furthermore, such instructions first perform data forwarding between instructions. To support early data forwarding over a wider range, the instructions are renamed and a physical register is allocated to the destination register of each instruction. At the dispatch stage an instruction enters the reservation station and simultaneously enters a buffer fw_buffer[N-1:0]; fw_buffer stores the immediates, physical register numbers and other control signals of the instructions from the most recent N cycles that qualify for early data forwarding.

Furthermore, fw_buffer has depth N and width M, where the depth N covers the instruction state of the N cycles preceding the current cycle. If an instruction that qualifies for early data forwarding exists within those N cycles, that instruction may still be waiting in the reservation station without having been issued to an execution unit, may be executing in an execution unit, or may be on the execution unit's data forwarding path; its data is forwarded to the dependent instruction by comparing physical register numbers.
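The depth and width of fw_buffer can be pictured with a toy software model. The sketch below assumes that each entry is a `(pdst, value)` pair (destination physical register number and already-known value) and that one slot is recorded per cycle; the class name `FwBuffer` and its methods are hypothetical and do not come from the source.

```python
from collections import deque


class FwBuffer:
    """Toy model of fw_buffer[N-1:0]: N cycle slots, at most M entries per slot."""

    def __init__(self, depth_n, width_m):
        self.width_m = width_m
        # Slot 0 is the most recent cycle; maxlen drops the oldest slot automatically.
        self.slots = deque([[] for _ in range(depth_n)], maxlen=depth_n)

    def push_cycle(self, entries):
        """Record the early-forwardable instructions dispatched in this cycle.

        Each entry is an assumed (pdst, value) pair.
        """
        if len(entries) > self.width_m:
            raise ValueError("more entries than the buffer width M")
        self.slots.appendleft(list(entries))

    def snapshot(self):
        """Return (age_in_cycles, pdst, value) for every live entry, newest first."""
        return [(age, pdst, value)
                for age, slot in enumerate(self.slots)
                for pdst, value in slot]


buf = FwBuffer(depth_n=2, width_m=2)
buf.push_cycle([(40, 42)])          # cycle 1: one qualifying instruction
buf.push_cycle([(41, 7), (43, 0)])  # cycle 2: two qualifying instructions
buf.push_cycle([])                  # cycle 3: none; the cycle-1 slot ages out
print(buf.snapshot())
```

With N = 2, the entry written in the first cycle is no longer visible after the third push, which mirrors the statement that the buffer holds only the most recent N cycles.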

Furthermore, after renaming, the physical register of an instruction's source register is compared with the physical registers of the destination registers of the instructions in fw_buffer; on a hit, the data of the source register is obtained in advance.
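The comparison described above might look as follows in a software model. This is a hedged sketch, not the patented circuit: fw_buffer is represented as a plain list of per-cycle slots (newest first) holding assumed `(pdst, value)` pairs, and the helper names `lookup_early` and `resolve_sources` are hypothetical.

```python
def lookup_early(fw_buffer, psrc):
    """Compare one source physical register against every pdst in fw_buffer."""
    for slot in fw_buffer:        # slots are ordered newest cycle first
        for pdst, value in slot:
            if pdst == psrc:      # hit: the producer's value is already known
                return value
    return None                   # miss: fall back to the normal wake-up path


def resolve_sources(fw_buffer, psrcs):
    """Return {psrc: value} for the source registers that hit in fw_buffer."""
    hits = {}
    for psrc in psrcs:
        value = lookup_early(fw_buffer, psrc)
        if value is not None:     # a forwarded value of 0 is still a valid hit
            hits[psrc] = value
    return hits


# Two cycles of history: p41 <- 7 one cycle ago, p40 <- 42 two cycles ago.
fw_buffer = [[(41, 7)], [(40, 42)]]
hits = resolve_sources(fw_buffer, [40, 12])
print(hits)  # p40 hits; p12 must still wait for the register file or an execution unit
```

A dependent instruction whose sources all hit could be marked ready as soon as it is dispatched; sources that miss keep using the conventional path.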

Furthermore, the design method uses an instruction optimization method, including the method of obtaining immediates from instructions; similar optimizations can be made according to the characteristics of different instruction sets, so the method is applicable to any instruction set.

Furthermore, the instruction optimization method comprises: the idea and principle of forwarding immediates between instructions; a method of forwarding data in advance, before the physical register read and before execution in the execution unit; a method of forwarding, through a buffer, the data of the instructions depended upon from up to N cycles earlier; and the decision logic and decision method for forwarding data.

In a second aspect, the present invention discloses a path design system for forwarding instruction data in advance, which includes a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor hardware executes the path design method for forwarding instruction data in advance according to the first aspect.

In a third aspect, the present invention discloses a readable medium comprising execution instructions; when a processor of a path design system for forwarding instruction data in advance executes the execution instructions, the system performs the path design method for forwarding instruction data in advance according to the first aspect.

The invention has the beneficial effects that:

The method obtains an instruction's dependent data quickly and ahead of time, so that dependent instructions satisfy their issue conditions earlier. This speeds up the issue of instructions from the reservation station to the execution units and also speeds up subsequent instructions in the pipeline.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a diagram of a background art multi-core CPU with N physical cores sharing L3 and memory;

FIG. 2 is a background art single physical core diagram;

FIG. 3 is a diagram of immediate forwarding condition detection;

FIG. 4 is a diagram of forwarding in advance the data of instructions from several cycles earlier.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.