Method and device for accelerating execution of atomic instruction

Document No.: 1755134 | Published: 2019-11-29

Reading note: this technology, "Method and device for accelerating execution of atomic instruction" (一种加速原子指令执行的方法和装置), was designed by 郑重, 王永文, 隋兵才, 黄立波, 孙彩霞, 倪晓强, 郭维, 王俊辉, 雷国庆 and 郭辉 on 2019-08-28. Its main content is as follows.

The invention discloses a method and a device for accelerating the execution of atomic instructions. The method generates a corresponding auxiliary instruction (a Load instruction or a prefetch instruction) from an atomic instruction and outputs the atomic instruction and the auxiliary instruction to different pipelines. An auxiliary Load instruction is executed speculatively to read the data in the region the atomic instruction operates on, and writes the result to the result bus and to an atomic instruction result queue, so that other instructions that depend on the result can execute earlier; the result of the atomic instruction is thus delivered earlier and execution is accelerated. The atomic instruction executes after it commits, and its execution result is used to confirm the speculative result of the auxiliary Load instruction. If the auxiliary instruction is a prefetch instruction, the prefetch instruction brings the data into the processor core in advance and places it in the writable state, and the atomic instruction then operates directly on the data that the prefetch instruction has already made writable, which accelerates its execution. The invention improves processor core performance, is simple to implement, and can be flexibly applied to existing processor designs.

1. A method for accelerating the execution of atomic instructions, characterized in that the implementation steps comprise:

1) fetching an atomic instruction;

2) generating a corresponding auxiliary instruction from the atomic instruction, the auxiliary instruction being a Load instruction or a prefetch instruction, wherein the Load instruction is used to read, from the memory region operated on by the atomic instruction, the data as it is before the atomic operation executes, and the prefetch instruction is used to prefetch the data of that memory region into the processor core in advance and place it in the writable state; and outputting the atomic instruction and its corresponding auxiliary instruction to different pipelines;

3) if the auxiliary instruction is a Load instruction, speculatively executing the Load instruction to read the data of the region the atomic instruction operates on, and writing the result to the result bus and the atomic instruction result queue, so that other instructions that depend on the result can execute earlier; if the auxiliary instruction is a prefetch instruction, executing the prefetch instruction to bring the data into the processor core in advance and place it in the writable state; the atomic instruction executes after it commits; if the auxiliary instruction is a Load instruction, going to step 4); if the auxiliary instruction is a prefetch instruction, going to step 5);

4) comparing the execution result of the atomic instruction with the Load instruction's execution result in the atomic instruction result queue; if the results are equal, execution of the whole atomic instruction ends; if the results are not equal, writing the atomic instruction's return result to the result bus and triggering a pipeline flush starting from the atomic instruction, whereupon instruction execution ends;

5) performing the atomic operation directly on the data obtained by executing the prefetch instruction, whereupon instruction execution ends.

2. The method for accelerating the execution of atomic instructions according to claim 1, characterized in that generating the corresponding auxiliary instruction from the atomic instruction in step 2) specifically means: checking in a speculation-disable cache whether a speculative Load instruction for this atomic instruction has previously executed incorrectly; if so, the auxiliary instruction generated from the atomic instruction is a prefetch instruction; otherwise, the auxiliary instruction generated from the atomic instruction is a Load instruction; and triggering the pipeline flush starting from the atomic instruction in step 4) further comprises the step of recording the speculative execution error of the atomic instruction in the speculation-disable cache.

3. The method for accelerating the execution of atomic instructions according to claim 1, characterized in that, when the atomic instruction and its corresponding auxiliary instruction are output to different pipelines in step 2), the two kinds of auxiliary instruction, the Load instruction and the prefetch instruction, are sent to the same pipeline or to different pipelines.

4. The method for accelerating the execution of atomic instructions according to claim 1, characterized in that the detailed steps by which the atomic instruction executes after commit in step 3) comprise: issuing the atomic instruction into the pipeline, committing the atomic instruction, obtaining write permission for the data, operating on the data according to the operation of the atomic instruction, writing the modified data back to the memory region, and taking the data held in the memory region before the atomic operation as the execution result of the atomic instruction.

5. The method for accelerating the execution of atomic instructions according to claim 1, characterized in that the detailed steps of speculatively executing the Load instruction in step 3) comprise: issuing the Load instruction into the pipeline, obtaining the data, and writing the result back to the result bus, the Load instruction having no commit processing.

6. The method for accelerating the execution of atomic instructions according to claim 1, characterized in that the detailed steps of executing the prefetch instruction in step 3) comprise: issuing the prefetch instruction into the pipeline, obtaining the data, and committing the prefetch instruction; the prefetch instruction performs no write to the result bus and only reads the data from the next level of storage and places it in the processor core.

7. A device for accelerating the execution of atomic instructions, characterized in that the device is programmed to perform the steps of the method for accelerating the execution of atomic instructions according to any one of claims 1 to 6.

8. A device for accelerating the execution of atomic instructions, characterized by comprising:

an instruction fetch unit, for fetching atomic instructions;

an atomic instruction splitting unit, for generating a corresponding auxiliary instruction from the atomic instruction, the auxiliary instruction being a Load instruction or a prefetch instruction, wherein the Load instruction is used to read, from the memory region operated on by the atomic instruction, the data as it is before the atomic operation executes, and the prefetch instruction is used to prefetch the data of that memory region into the processor core in advance and place it in the writable state; and for outputting the atomic instruction and its corresponding auxiliary instruction to different pipelines;

an atomic instruction result queue, for recording the execution results of speculatively executed Load instructions;

a Load/prefetch pipeline, for executing the auxiliary instruction: if the auxiliary instruction is a Load instruction, the Load instruction is executed speculatively to read the data of the region the atomic instruction operates on, and the result is written to the result bus and the atomic instruction result queue, so that other instructions that depend on the result can execute earlier; if the auxiliary instruction is a prefetch instruction, the prefetch instruction is executed to bring the data into the processor core in advance and place it in the writable state;

an atomic instruction pipeline, for executing the atomic instruction, which executes after it commits: if the auxiliary instruction is a Load instruction, control passes to the atomic instruction result checking unit; if the auxiliary instruction is a prefetch instruction, the atomic operation is performed directly on the data obtained by executing the prefetch instruction, and instruction execution ends;

an atomic instruction result checking unit, for comparing the execution result of the atomic instruction with the Load instruction's execution result in the atomic instruction result queue: if the results are equal, execution of the whole atomic instruction ends; if the results are not equal, the atomic instruction's return result is written to the result bus and a pipeline flush starting from the atomic instruction is triggered, whereupon instruction execution ends.

Technical field

The present invention relates to the field of microprocessor microarchitecture design, and in particular to a method and device for accelerating the execution of atomic instructions.

Background art

To improve program performance, most programs today run in parallel, and parallel threads communicate through shared resources. To support concurrent writes to shared resources, most architectures provide atomic instructions. The basic operation of an atomic instruction is a "read-modify-write" on some region of the memory space that cannot be interrupted while it is in progress. An atomic instruction generally also has to return the data read from the target address and write it back to a register. An atomic instruction therefore combines data store, computation, and data fetch operations.

In a high-performance out-of-order processor, data fetch instructions (Load instructions) lie on the critical path of program execution, so they are specially optimized and may execute out of order. Data store instructions (Store instructions), in contrast, are generally not allowed to execute speculatively: data may be written to memory only after the instruction commits. Because an atomic instruction both reads and writes the memory space, it is generally executed in the manner of a Store instruction. In general, a Load instruction passes through the stages of instruction dispatch, instruction issue, data fetch, result write-back (writing the data to the result bus), and instruction commit. An atomic instruction, having the characteristics of a Store instruction, passes in order through instruction dispatch, instruction issue, instruction commit, data fetch and modification, and result write-back (writing the data to the result bus). Since an atomic instruction also writes back to a register, the time at which it writes back its result directly determines when subsequent instructions that use the data can execute, and therefore affects the performance of the whole program. Moreover, because an atomic instruction modifies data in the memory space, it can execute only after obtaining write permission for the data. Most data in a processor core is in the shared (read-only) state, so the core must use the coherence protocol to obtain the data in the writable state. This process involves interaction outside the processor core and generally takes a long time, which further increases the latency of atomic instruction execution.

As shown in Fig. 1, in a typical out-of-order processor a Load instruction can return data at T3, at which point subsequent instructions that depend on the Load instruction's destination register can obtain the data and execute. When the instruction commits at T4, the execution result of the Load instruction is confirmed; if it is incorrect, a pipeline flush is performed, which guarantees the correctness of the whole program. An atomic instruction, however, can return the data it reads only at T8, because a write to the memory space cannot be undone and the write must therefore wait until the instruction has committed. The atomic instruction thus writes its destination register much later than an ordinary Load instruction, and the execution of subsequent instructions is delayed accordingly. If a program reads and writes shared resources frequently, this manner of executing atomic instructions seriously degrades the performance of the whole program.

Summary of the invention

The technical problem to be solved by the invention: in view of the above problems in the prior art, to provide a method and device for accelerating the execution of atomic instructions that offer high performance, low overhead, easy implementation, and flexible use. The invention improves processor core performance, is simple to implement, and can be flexibly applied to existing processor designs.

To solve the above technical problem, the technical solution adopted by the invention is as follows:

A method for accelerating the execution of atomic instructions, whose implementation steps comprise:

1) fetching an atomic instruction;

2) generating a corresponding auxiliary instruction from the atomic instruction, the auxiliary instruction being a Load instruction or a prefetch instruction, wherein the Load instruction is used to read, from the memory region operated on by the atomic instruction, the data as it is before the atomic operation executes, and the prefetch instruction is used to prefetch the data of that memory region into the processor core in advance and place it in the writable state; and outputting the atomic instruction and its corresponding auxiliary instruction to different pipelines;

3) if the auxiliary instruction is a Load instruction, speculatively executing the Load instruction to read the data of the region the atomic instruction operates on, and writing the result to the result bus and the atomic instruction result queue, so that other instructions that depend on the result can execute earlier; if the auxiliary instruction is a prefetch instruction, executing the prefetch instruction to bring the data into the processor core in advance and place it in the writable state; the atomic instruction executes after it commits; if the auxiliary instruction is a Load instruction, going to step 4); if the auxiliary instruction is a prefetch instruction, going to step 5);

4) comparing the execution result of the atomic instruction with the Load instruction's execution result in the atomic instruction result queue; if the results are equal, execution of the whole atomic instruction ends; if the results are not equal, writing the atomic instruction's return result to the result bus and triggering a pipeline flush starting from the atomic instruction, whereupon instruction execution ends;

5) performing the atomic operation directly on the data obtained by executing the prefetch instruction, whereupon instruction execution ends.

Optionally, generating the corresponding auxiliary instruction from the atomic instruction in step 2) specifically means: checking in a speculation-disable cache whether a speculative Load instruction for this atomic instruction has previously executed incorrectly; if so, the auxiliary instruction generated from the atomic instruction is a prefetch instruction; otherwise, the auxiliary instruction generated from the atomic instruction is a Load instruction. Triggering the pipeline flush starting from the atomic instruction in step 4) further comprises the step of recording the speculative execution error of the atomic instruction in the speculation-disable cache.

Optionally, when the atomic instruction and its corresponding auxiliary instruction are output to different pipelines in step 2), the two kinds of auxiliary instruction, the Load instruction and the prefetch instruction, are sent to the same pipeline or to different pipelines.

Optionally, the detailed steps by which the atomic instruction executes after commit in step 3) comprise: issuing the atomic instruction into the pipeline, committing the atomic instruction, obtaining write permission for the data, operating on the data according to the operation of the atomic instruction, writing the modified data back to the memory region, and taking the data held in the memory region before the atomic operation as the execution result of the atomic instruction.

Optionally, the detailed steps of speculatively executing the Load instruction in step 3) comprise: issuing the Load instruction into the pipeline, obtaining the data, and writing the result back to the result bus; the Load instruction has no commit processing.

Optionally, the detailed steps of executing the prefetch instruction in step 3) comprise: issuing the prefetch instruction into the pipeline, obtaining the data, and committing the prefetch instruction; the prefetch instruction performs no write to the result bus and only reads the data from the next level of storage and places it in the processor core.

The invention also provides a device for accelerating the execution of atomic instructions, the device being programmed to perform the steps of the aforementioned method of the invention for accelerating the execution of atomic instructions.

The invention also provides a device for accelerating the execution of atomic instructions, comprising:

an instruction fetch unit, for fetching atomic instructions;

an atomic instruction splitting unit, for generating a corresponding auxiliary instruction from the atomic instruction, the auxiliary instruction being a Load instruction or a prefetch instruction, wherein the Load instruction is used to read, from the memory region operated on by the atomic instruction, the data as it is before the atomic operation executes, and the prefetch instruction is used to prefetch the data of that memory region into the processor core in advance and place it in the writable state; and for outputting the atomic instruction and its corresponding auxiliary instruction to different pipelines;

an atomic instruction result queue, for recording the execution results of speculatively executed Load instructions;

a Load/prefetch pipeline, for executing the auxiliary instruction: if the auxiliary instruction is a Load instruction, the Load instruction is executed speculatively to read the data of the region the atomic instruction operates on, and the result is written to the result bus and the atomic instruction result queue, so that other instructions that depend on the result can execute earlier; if the auxiliary instruction is a prefetch instruction, the prefetch instruction is executed to bring the data into the processor core in advance and place it in the writable state;

an atomic instruction pipeline, for executing the atomic instruction, which executes after it commits: if the auxiliary instruction is a Load instruction, control passes to the atomic instruction result checking unit; if the auxiliary instruction is a prefetch instruction, the atomic operation is performed directly on the data obtained by executing the prefetch instruction, and instruction execution ends;

an atomic instruction result checking unit, for comparing the execution result of the atomic instruction with the Load instruction's execution result in the atomic instruction result queue: if the results are equal, execution of the whole atomic instruction ends; if the results are not equal, the atomic instruction's return result is written to the result bus and a pipeline flush starting from the atomic instruction is triggered, whereupon instruction execution ends.

Compared with the prior art, the method of the invention for accelerating the execution of atomic instructions has the following advantages:

1. High performance. Because an atomic instruction can execute only after it has committed and obtained the data in the writable state, it takes a long time to return data to the result bus. The invention accelerates the execution of atomic instructions in two ways, thereby improving overall processing performance.

2. Low implementation cost. The Load and prefetch instructions that the method splits out already have execution units in a typical out-of-order processor, so the auxiliary instructions need no additional pipeline support. The pipeline flush is likewise an existing mechanism in out-of-order processors. Only some control logic and a few atomic instruction result queues need to be added to implement the method in an existing processor design.

3. Flexible use, with no impact on existing instruction execution paths. The method does not affect the execution paths of existing atomic instructions, Load instructions, or prefetch instructions. When the acceleration mechanism is not needed, the atomic instruction splitting unit can simply be turned off.

The device of the invention for accelerating the execution of atomic instructions has the same technical effects as the method of the invention, which are not repeated here.

Brief description of the drawings

Fig. 1 is a timing diagram of Load instruction and atomic instruction execution.

Fig. 2 is a basic flow diagram of the method of an embodiment of the invention.

Fig. 3 is a basic structural diagram of the device of an embodiment of the invention.

Detailed description of the embodiments

As shown in Fig. 2, the implementation steps of the method of this embodiment for accelerating the execution of atomic instructions comprise:

1) fetching an atomic instruction;

2) generating a corresponding auxiliary instruction from the atomic instruction, the auxiliary instruction being a Load instruction or a prefetch instruction, wherein the Load instruction is used to read, from the memory region operated on by the atomic instruction, the data as it is before the atomic operation executes, and the prefetch instruction is used to prefetch the data of that memory region into the processor core in advance and place it in the writable state; and outputting the atomic instruction and its corresponding auxiliary instruction to different pipelines;

3) if the auxiliary instruction is a Load instruction, speculatively executing the Load instruction to read the data of the region the atomic instruction operates on, and writing the result to the result bus and the atomic instruction result queue, so that other instructions that depend on the result can execute earlier; if the auxiliary instruction is a prefetch instruction, executing the prefetch instruction to bring the data into the processor core in advance and place it in the writable state; the atomic instruction executes after it commits; if the auxiliary instruction is a Load instruction, going to step 4); if the auxiliary instruction is a prefetch instruction, going to step 5);

4) comparing the execution result of the atomic instruction with the Load instruction's execution result in the atomic instruction result queue; if the results are equal, execution of the whole atomic instruction ends; if the results are not equal, writing the atomic instruction's return result to the result bus and triggering a pipeline flush starting from the atomic instruction (a Flush message is sent to the processor core control unit), whereupon instruction execution ends;

5) performing the atomic operation directly on the data obtained by executing the prefetch instruction, whereupon instruction execution ends.
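The steps above can be sketched as a small behavioral model. This is only an illustrative sketch, not the patented hardware: memory is a plain dictionary, the atomic operation is assumed to be an add, and the names `run_atomic_add` and `interfere` (a stand-in for a write by another core between the speculative Load and the commit of the atomic instruction) are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Memory = Dict[int, int]


def run_atomic_add(
    memory: Memory,
    addr: int,
    operand: int,
    use_prefetch: bool,
    interfere: Optional[Callable[[Memory], None]] = None,
) -> Tuple[Optional[int], int, bool]:
    """Model steps 2) to 5) for one atomic add.

    Returns (value forwarded early, atomic result, pipeline flushed?).
    """
    early: Optional[int] = None
    if not use_prefetch:
        # Step 3, Load path: speculative read, forwarded to dependents early.
        early = memory[addr]
    # Step 3, prefetch path: the data would be brought in as writable;
    # no value is forwarded, so there is nothing to verify later.

    if interfere is not None:
        # Hypothetical write by another core before the atomic commits.
        interfere(memory)

    # The atomic instruction executes only after commit.
    old = memory[addr]
    memory[addr] = old + operand

    # Step 4: only the Load path compares results; a mismatch forces a flush.
    flushed = (not use_prefetch) and early != old
    return early, old, flushed
```

In this model a flush occurs only when the Load path was taken and the memory location changed between the speculative read and the atomic operation; the prefetch path never forwards a value early and so never needs a flush.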

In this embodiment there are two kinds of auxiliary instruction: a Load instruction, which writes the result register, and a prefetch instruction, which does not. The memory address operated on by the auxiliary instruction is the same as that of the corresponding atomic instruction, and only one of the two is chosen at each split. The choice is made from the execution history of the auxiliary instruction and the atomic instruction: if the auxiliary instruction was a Load instruction and its execution result differed from that of the atomic instruction, a prefetch instruction is split out the next time that atomic instruction executes; otherwise, a Load instruction is selected. When a Load instruction is split out, the atomic instruction can likewise return data at time T3 in Fig. 1, which greatly accelerates its execution. When a prefetch instruction is split out, the time needed to obtain the data in the writable state is saved, which also accelerates the execution of the atomic instruction.

Take the atomic instruction ATOM Rd, Rt, [Rn] as an example, where Rd is the destination register that the instruction writes, Rt is one of the operands of the atomic operation, and Rn is the memory address that the atomic instruction operates on. The Load instruction corresponding to this atomic instruction is then LOAD Rd, [Rn], and the prefetch instruction is PLD [Rn]. The Load instruction reads, from the memory region operated on by the atomic instruction, the data as it is before the atomic operation executes. The prefetch instruction prefetches the data of that memory region into the processor core in advance and places it in the writable state. In this way, once the atomic instruction has committed, its operation on the data can proceed directly, which accelerates the execution of the atomic instruction.
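The mapping from ATOM Rd, Rt, [Rn] to its two possible auxiliary instructions can be sketched as follows. The `Instr` record, `split_atomic`, and `fmt` are hypothetical names introduced only for illustration; the source does not specify an encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    opcode: str            # "ATOM", "LOAD" or "PLD"
    rd: Optional[str]      # destination register (None for PLD)
    rt: Optional[str]      # operand register (ATOM only)
    rn: str                # register naming the memory address


def split_atomic(atom: Instr, use_prefetch: bool) -> Instr:
    """Derive the auxiliary instruction for an atomic instruction."""
    if atom.opcode != "ATOM":
        raise ValueError("only atomic instructions are split")
    if use_prefetch:
        # The prefetch has no destination register: PLD [Rn]
        return Instr("PLD", None, None, atom.rn)
    # The Load reuses the destination and address: LOAD Rd, [Rn]
    return Instr("LOAD", atom.rd, None, atom.rn)


def fmt(i: Instr) -> str:
    """Render an instruction in the textual form used in the description."""
    regs = [r for r in (i.rd, i.rt) if r is not None]
    return f"{i.opcode} " + ", ".join(regs + [f"[{i.rn}]"])
```

Both auxiliary forms keep the address operand [Rn] of the atomic instruction, and only the Load form keeps the destination register Rd.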

Whether the auxiliary instruction is a Load instruction or a prefetch instruction is decided by whether the speculation-disable cache records that a speculative Load instruction for this atomic instruction has previously executed incorrectly. In this embodiment, generating the corresponding auxiliary instruction from the atomic instruction in step 2) specifically means: checking in the speculation-disable cache whether a speculative Load instruction for this atomic instruction has previously executed incorrectly; if so, the auxiliary instruction generated from the atomic instruction is a prefetch instruction; otherwise, the auxiliary instruction generated from the atomic instruction is a Load instruction. Triggering the pipeline flush starting from the atomic instruction in step 4) further comprises the step of recording the speculative execution error of the atomic instruction in the speculation-disable cache.
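A minimal sketch of this selection rule follows. It assumes that the cache is keyed by the program address of the atomic instruction and that entries are never evicted; the class name `SpeculationDisableCache` and its methods are hypothetical, and a hardware table would have finite capacity and a replacement policy that the source does not describe.

```python
class SpeculationDisableCache:
    """Remembers atomic instructions whose speculative Load was wrong.

    Assumption: keyed by program address (PC), unbounded, no eviction.
    """

    def __init__(self) -> None:
        self._bad_pcs = set()

    def record_misspeculation(self, pc: int) -> None:
        # Called when step 4) triggers a pipeline flush for this atomic.
        self._bad_pcs.add(pc)

    def choose_auxiliary(self, pc: int) -> str:
        # A prior speculative error disables the Load and selects a prefetch.
        return "PLD" if pc in self._bad_pcs else "LOAD"
```

An atomic instruction that has never misspeculated gets a Load instruction; after one recorded error, the same program address gets a prefetch instruction on every later encounter.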

In this embodiment, when the atomic instruction and its corresponding auxiliary instruction are output to different pipelines in step 2), the two kinds of auxiliary instruction, the Load instruction and the prefetch instruction, are sent to the same pipeline or to different pipelines.

An atomic instruction executes in a dedicated pipeline, and its execution generally comprises the following steps: the instruction is issued into the pipeline, the instruction commits, write permission for the data is obtained, the data is operated on according to the atomic operation, the modified data is written back to the memory region, and the data held in the memory region before the atomic operation is taken as the execution result of the atomic instruction. In this embodiment, the detailed steps by which the atomic instruction executes after commit in step 3) follow exactly this sequence.
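The read-modify-write at the core of this sequence can be sketched as below. Commit and the acquisition of write permission are not modeled; `execute_atomic` and its parameters are hypothetical names, and the atomic operation is passed in as an ordinary function.

```python
from typing import Callable, Dict


def execute_atomic(
    memory: Dict[int, int],
    addr: int,
    operand: int,
    op: Callable[[int, int], int],
) -> int:
    """Apply op to memory[addr] and return the pre-operation value.

    Assumes the instruction has already committed and that the core
    already holds write permission for the data.
    """
    old = memory[addr]               # data before the atomic operation
    memory[addr] = op(old, operand)  # write the modified data back
    return old                       # this is the instruction's result
```

For an atomic add of 3 on a location holding 5, the function returns 5 and leaves 8 in memory, which matches the rule that the result of the instruction is the value held before the operation.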

For the aforementioned atomic instruction ATOM Rd, Rt, [Rn], the Load instruction that is split out is LOAD Rd, [Rn].

In this embodiment, the detailed steps of speculatively executing the Load instruction in step 3) comprise: issuing the Load instruction into the pipeline, obtaining the data, and writing the result back to the result bus; the Load instruction has no commit processing. During execution of the atomic instruction, in the Load case, the data in the atomic instruction result queue, which was written by the split-out Load instruction, must be checked for correctness. If the execution results are equal, the split-out Load instruction speculatively returned the correct result that the atomic instruction needs to return. Otherwise, the speculatively executed Load instruction wrote wrong data to the result bus, and propagation of that wrong data would cause the subsequent instructions to execute incorrectly. In that case the execution result of the atomic instruction is written to the result bus, that is, the correct result is written to the bus again, and a pipeline flush starting from the atomic instruction is triggered: because the earlier speculative Load has already written out wrong data, all instructions in the pipeline that are later than the atomic instruction must be removed to ensure that no instruction uses the wrong data. Information such as the program address of the atomic instruction is written into the speculation-disable cache, so that when the atomic instruction is encountered again, a speculative Load instruction will not be split out.
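The check performed against the atomic instruction result queue can be sketched as follows. This assumes the queue is consumed in first-in, first-out order and models the speculation-disable cache as a set of program addresses; `check_atomic_result` and the returned dictionary layout are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Optional, Set


def check_atomic_result(
    result_queue: Deque[int],
    atomic_result: int,
    pc: int,
    disable_cache: Set[int],
) -> Dict[str, Optional[int]]:
    """Compare the speculative Load result with the atomic result."""
    speculative = result_queue.popleft()  # written earlier by the Load
    if speculative == atomic_result:
        # Speculation was correct: nothing more to write, no flush.
        return {"flush": False, "bus_write": None}
    # Speculation was wrong: rewrite the result, flush younger
    # instructions, and disable speculation for this atomic instruction.
    disable_cache.add(pc)
    return {"flush": True, "bus_write": atomic_result}
```

On a mismatch the sketch returns the corrected value to be driven onto the result bus and marks the program address, so that later encounters of the same atomic instruction select a prefetch instruction instead of a speculative Load.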

For the aforementioned atomic instruction ATOM Rd, Rt, [Rn], the prefetch instruction that is split out is PLD [Rn].

Execution of a prefetch instruction comprises instruction issue, data prefetch, instruction commit, and so on. In this embodiment, the detailed steps of executing the prefetch instruction in step 3) comprise: issuing the prefetch instruction into the pipeline, obtaining the data, and committing the prefetch instruction; the prefetch instruction performs no write to the result bus and only reads the data from the next level of storage and places it in the processor core. The prefetch here must bring in the data block in the writable state, so that when the atomic instruction executes, no additional time is needed to obtain the writable state of the data block, which accelerates the execution of the atomic instruction.
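The benefit of prefetching in the writable state can be illustrated with a toy model of the per-core state of a data block. The two state names and the `late_upgrades` counter are assumptions made for illustration; the source does not name a specific coherence protocol.

```python
class CoreCache:
    """Toy model of data block states inside one processor core."""

    def __init__(self) -> None:
        self.state = {}          # addr -> "shared" or "writable"
        self.late_upgrades = 0   # upgrades paid on the atomic's own path

    def prefetch_writable(self, addr: int) -> None:
        # Models PLD [Rn]: fetch the block ahead of time as writable.
        self.state[addr] = "writable"

    def atomic_access(self, addr: int) -> None:
        # Without a prior prefetch, the atomic instruction must first
        # obtain write permission, which is the slow step to avoid.
        if self.state.get(addr) != "writable":
            self.late_upgrades += 1
            self.state[addr] = "writable"
```

With a prior `prefetch_writable` call the atomic access finds the block already writable and `late_upgrades` stays at zero; without it, the upgrade is counted on the atomic instruction's own execution path.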

This embodiment also provides a device for accelerating the execution of atomic instructions, the device being programmed to perform the steps of the aforementioned method of this embodiment for accelerating the execution of atomic instructions.

As shown in Fig. 3, this embodiment also provides a device for accelerating atomic instruction execution, comprising:

an instruction fetch unit, for fetching atomic instructions;

an atomic instruction splitting unit, for generating a corresponding auxiliary instruction according to the atomic instruction, the auxiliary instruction being a Load instruction or a prefetch instruction, wherein the Load instruction is used to read, from the storage region operated on by the atomic instruction, the data as it is before the atomic instruction executes, and the prefetch instruction is used to prefetch the data of the storage region operated on by the atomic instruction into the processor core in advance and set it to the writable state; and for outputting the atomic instruction and its corresponding auxiliary instruction to different pipelines respectively;

an atomic instruction result queue, for recording the execution result of the look-ahead Load instruction;

a Load/prefetch pipeline, for executing the auxiliary instruction: if the auxiliary instruction is a Load instruction, the Load instruction is executed ahead of time to read the data of the region that the atomic instruction needs to operate on, and the result is written to the result bus and to the atomic instruction result queue, so that other instructions that depend on the result of that instruction can execute early; if the auxiliary instruction is a prefetch instruction, the prefetch instruction is executed to bring the data into the processor core in advance and set it to the writable state;

an atomic instruction pipeline, for executing the atomic instruction, which executes after commit: if the auxiliary instruction is a Load instruction, control passes to the atomic instruction result checking unit; if the auxiliary instruction is a prefetch instruction, the atomic operation is performed directly on the data obtained by the prefetch instruction, and instruction execution ends;

an atomic instruction result checking unit, for comparing the execution result of the atomic instruction with the Load instruction execution result in the atomic instruction result queue: if the results are equal, execution of the whole atomic instruction ends; if they are not equal, the result returned by the atomic instruction is written to the result bus, a pipeline flush starting from the atomic instruction is triggered, and instruction execution ends.

Referring to Fig. 3, before splitting, the atomic instruction splitting unit checks the contents of the look-ahead prohibition queue to determine how the atomic instruction is split. The atomic instruction result checking unit checks whether the result of the atomic instruction is consistent with the stored execution result of the auxiliary Load instruction; if they are inconsistent, it sends Flush information to the processor core control unit to request a Flush operation.
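A minimal Python sketch of the splitting decision follows. Two points are assumptions and are not stated in the text: that an atomic instruction found in the prohibition structure falls back to a prefetch instruction, and that the look-ahead Load takes the form `LDR Rd, [Rn]`. The names `AtomicInstr` and `split_atomic` are likewise hypothetical.

```python
from typing import NamedTuple


class AtomicInstr(NamedTuple):
    pc: int   # program address of the atomic instruction
    rd: str   # destination register
    rt: str   # source operand register
    rn: str   # register holding the memory address


def split_atomic(instr, forbid_lookahead):
    """Return (atomic_text, auxiliary_text), one for each pipeline."""
    atomic_text = f"ATOM {instr.rd}, {instr.rt}, [{instr.rn}]"
    if instr.pc in forbid_lookahead:
        # Assumption: fall back to a prefetch when look-ahead is prohibited.
        aux_text = f"PLD [{instr.rn}]"
    else:
        # Assumption: hypothetical Load mnemonic writing the atomic's Rd.
        aux_text = f"LDR {instr.rd}, [{instr.rn}]"
    return atomic_text, aux_text
```

For example, `split_atomic(AtomicInstr(0x400, "R1", "R2", "R3"), set())` yields the atomic instruction together with a Load, while passing `{0x400}` as the prohibition set yields `PLD [R3]` as the auxiliary instruction.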

As shown in Fig. 3, the device for accelerating atomic instruction execution of this embodiment comprises an atomic instruction splitting unit, a Load/prefetch instruction execution pipeline, an atomic instruction execution pipeline, an atomic instruction result queue, an atomic instruction result checking unit, a look-ahead prohibition cache, and a processor core control unit. The function of the atomic instruction splitting unit is to split an atomic instruction into the atomic instruction and an auxiliary instruction, sending the atomic instruction to the atomic instruction pipeline for execution and the auxiliary instruction to the Load/prefetch instruction pipeline. Before splitting, the atomic instruction splitting unit checks the contents of the look-ahead prohibition queue to determine how the atomic instruction is split. The Load/prefetch instruction execution pipeline is responsible for executing auxiliary instructions, which are of two kinds, Load instructions and prefetch instructions; the execution paths of the two are similar and are generally implemented in the same pipeline. The atomic instruction pipeline executes atomic instructions; atomic instruction execution is relatively complex and can be implemented in an independent pipeline, or by adding extra paths to the Load/prefetch instruction execution pipeline. The atomic instruction result queue stores the result of the auxiliary Load instruction. The function of the atomic instruction result checking unit is to check whether the result of the atomic instruction is consistent with the stored execution result of the auxiliary Load instruction; if they are inconsistent, it sends Flush information to the processor core control unit to request a Flush operation. The processor core control unit is the central control mechanism that manages out-of-order execution of the entire pipeline.

The above is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment; all technical solutions falling under the concept of the present invention belong to its protection scope. It should be noted that, for those of ordinary skill in the art, improvements and modifications made without departing from the principle of the present invention shall also be regarded as falling within the protection scope of the present invention.
