Method for optimizing register access in a CPU

Document No.: 1771504  Published: 2019-12-03

Reading note: This technology, a method for optimizing register access in a CPU, was designed by Li Xiaohui (李晓辉) and Hu Shengfa (胡胜发) on 2019-08-01. The invention discloses a method for optimizing register access in a CPU, comprising: establishing a mapping relationship between data registers and a memory stack according to a preset mapping policy; and, according to that mapping relationship, simplifying the stack access instructions and performing data access operations according to the simplified instructions. Once the mapping between data registers and the memory stack is established, all temporary variables can be assumed to reside on the stack, and the system can reach those variables on the memory stack simply by accessing registers. By implementing the invention, the number of data registers can be greatly increased while a large amount of stack access code is eliminated, which reduces the complexity and difficulty of compilation and improves CPU performance.

1. A method for optimizing register access in a CPU, characterized by comprising:

establishing a mapping relationship between data registers and a memory stack according to a preset mapping policy; and

simplifying stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions.

2. The method for optimizing register access in a CPU according to claim 1, characterized in that the method further comprises:

configuring a stack pointer according to the preset mapping policy, so that the system synchronizes the memory stack and the data registers when the stack pointer moves.

3. The method for optimizing register access in a CPU according to claim 1, characterized in that the method further comprises:

modifying a compiler according to the preset mapping policy, so as to adjust the register allocation scheme, the variable access mode, and the form of the generated assembly.

4. The method for optimizing register access in a CPU according to claim 1, characterized in that simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, comprises:

in response to a stack access request for a variable, simplifying the push and pop instructions for the variable according to the mapping relationship between the data registers and the memory stack, and performing data access operations on the variable according to the simplified instructions.

5. The method for optimizing register access in a CPU according to claim 4, characterized in that simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, further comprises:

in response to a call request for a subfunction, simplifying the push and pop instructions of the subfunction according to the mapping relationship between the data registers and the memory stack, and performing data access operations on variables according to the simplified instructions.

6. The method for optimizing register access in a CPU according to claim 5, characterized in that simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, further comprises:

in response to an access request for an interrupt function, simplifying the push and pop instructions of the interrupt function according to the mapping relationship between the data registers and the memory stack, and performing data access operations for the interrupt function according to the simplified instructions.

7. The method for optimizing register access in a CPU according to claim 6, characterized in that simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, further comprises:

in response to a multithread switching request, simplifying the push and pop instructions of the thread switch according to the mapping relationship between the data registers and the memory stack, and performing the thread switching operation according to the simplified instructions.

Technical field

The present invention relates to the field of optimization, and more particularly to a method for optimizing register access in a CPU.

Background art

A register is a core component of a CPU. It is a high-speed storage component used to temporarily hold instructions, data, and addresses. Registers greatly speed up computation and memory access, so in theory the more registers, the better. Among existing architectures, x86 has only a handful of registers, ARM has a dozen or so, and RISC-V has more, over 30. With current technology, however, increasing the number of registers further is very difficult: too many registers increase instruction complexity and make compilation harder, and the resulting growth in code size can reduce execution efficiency. The prior art therefore cannot effectively improve system efficiency by increasing the number of registers.

Summary of the invention

The technical problem to be solved by the embodiments of the present invention is to provide a method for optimizing register access in a CPU that avoids the drawbacks normally caused by adding registers even when the number of registers is greatly increased, so that system efficiency can be effectively improved by increasing the number of registers.

To solve the above technical problem, an embodiment of the present invention provides a method for optimizing register access in a CPU, comprising:

establishing a mapping relationship between data registers and a memory stack according to a preset mapping policy; and

simplifying stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions.

Further, the method for optimizing register access in a CPU further comprises:

configuring a stack pointer according to the preset mapping policy, so that the system synchronizes the memory stack and the data registers when the stack pointer moves.

Further, the method for optimizing register access in a CPU further comprises:

modifying a compiler according to the preset mapping policy, so as to adjust the register allocation scheme, the variable access mode, and the form of the generated assembly.

Further, simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, comprises:

in response to a stack access request for a variable, simplifying the push and pop instructions for the variable according to the mapping relationship between the data registers and the memory stack, and performing data access operations on the variable according to the simplified instructions.

Further, it also comprises:

in response to a call request for a subfunction, simplifying the push and pop instructions of the subfunction according to the mapping relationship between the data registers and the memory stack, and performing data access operations on variables according to the simplified instructions.

Further, it also comprises:

in response to an access request for an interrupt function, simplifying the push and pop instructions of the interrupt function according to the mapping relationship between the data registers and the memory stack, and performing data access operations for the interrupt function according to the simplified instructions.

Further, it also comprises:

in response to a multithread switching request, simplifying the push and pop instructions of the thread switch according to the mapping relationship between the data registers and the memory stack, and performing the thread switching operation according to the simplified instructions.

Compared with the prior art, the present invention has the following beneficial effects:

The present invention provides a method for optimizing register access in a CPU, comprising: establishing a mapping relationship between data registers and a memory stack according to a preset mapping policy; and simplifying stack access instructions according to that mapping relationship and performing data access operations according to the simplified instructions. Once the mapping between data registers and the memory stack is established, all temporary variables can be assumed to reside on the stack, and the system can reach those variables on the memory stack simply by accessing registers. By implementing the present invention, the number of data registers can be greatly increased while a large amount of stack access code is eliminated, which reduces the complexity and difficulty of compilation and improves CPU performance.

Brief description of the drawings

Fig. 1 is a schematic flowchart of the method for optimizing register access in a CPU provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of mapping a certain number of registers onto the memory stack, provided by an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Referring to Fig. 1, an embodiment of the present invention provides a method for optimizing register access in a CPU, comprising:

Step S1: establishing a mapping relationship between data registers and a memory stack according to a preset mapping policy;

Step S2: simplifying stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions.

Compared with the prior art, once the present invention has established the mapping between data registers and the memory stack, all temporary variables can be assumed to reside on the stack, and the system can reach those variables on the memory stack simply by accessing registers. By implementing the present invention, the number of data registers can be greatly increased while a large amount of stack access code is eliminated, which reduces the complexity and difficulty of compilation and improves CPU performance.

Referring to Fig. 2, the core of the present invention is the mapping between data registers and the memory stack. Here, data registers are the registers in the CPU used for data operations, and the memory stack is a region of memory allocated to a program. The operating system allocates to a running program (or thread) a region of memory for its heap and stack; (in little-endian mode) the lower part of this region is the heap and the upper part is the stack. The stack is mainly used for temporary variables and grows downward: when a subroutine is called or more temporary variables are needed, the instructions generated by the compiler move the stack pointer (the SP register) down, and SP moves back up when the subroutine exits. The present invention maps a certain number of registers onto the memory stack, for example with SP pointing to R0. The figure shows at most 256 data registers; in practice the number can be increased or decreased as required. These data registers act like a cache for the stack, and a real data cache may of course still sit in between. Accessing a data register is equivalent to operating on the memory stack, and this work is done automatically by hardware such as the CPU.
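The mapping just described can be sketched in C as a purely functional model. This is a minimal illustration, not the hardware design: the names `Cpu`, `reg`, and `demo_register_is_stack_word`, the word-indexed stack, and the stack size are assumptions introduced here. The model only expresses the rule that data register Ri names the stack word at SP + i.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 1024   /* size of the modeled memory stack in words (assumed) */
#define NUM_REGS    256    /* data registers in the window, as in the Fig. 2 example */

/* Hypothetical model: the register file is just a window onto the stack. */
typedef struct {
    uint32_t stack[STACK_WORDS]; /* memory stack, indexed in words */
    uint32_t sp;                 /* stack pointer as a word index; R0 maps here */
} Cpu;

/* Return a pointer to the stack word that data register Ri is mapped to. */
uint32_t *reg(Cpu *cpu, unsigned i)
{
    assert(i < NUM_REGS);
    assert(cpu->sp + i < STACK_WORDS);
    return &cpu->stack[cpu->sp + i];
}

/* Write R3 as a register, then read the same word back through the stack. */
uint32_t demo_register_is_stack_word(void)
{
    static Cpu cpu;              /* static storage, so zero-initialized */
    cpu.sp = 512;
    *reg(&cpu, 3) = 0xC0FFEE;    /* "register" write ...            */
    return cpu.stack[512 + 3];   /* ... is visible as a stack word  */
}
```

In this model a register access and a stack access are the same operation, which is the property the compiler relies on when it treats every temporary variable as living on the stack.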

It can be understood that machine language is the most efficient language to execute, and the compilers commonly used to produce machine language are C/C++ compilers. In C, in addition to function variables, each subroutine also has temporary variables such as dynamic variables. In the prior art, when there are few variables they can be held directly in registers; when there are many, some variables are temporarily stored on the stack, and this is stack access. In practice, heavy use of subroutine calls, complex functions, interrupts, multitask switching, and so on all cause a large amount of stack access. Because stack access requires dedicated instructions, a large amount of stack access increases the amount of code that must be executed and lengthens execution time. Once the present invention has established the mapping between data registers and the memory stack, the compiler can assume that all temporary variables are on the stack and that accessing a register directly accesses the corresponding variable on the stack. A large amount of stack access code can therefore be eliminated, which effectively improves system performance.

In an embodiment of the present invention, the method for optimizing register access in a CPU further comprises:

configuring a stack pointer according to the preset mapping policy, so that the system synchronizes the memory stack and the data registers when the stack pointer moves.

In an embodiment of the present invention, simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, comprises:

in response to a stack access request for a variable, simplifying the push and pop instructions for the variable according to the mapping relationship between the data registers and the memory stack, and performing data access operations on the variable according to the simplified instructions.

It can be understood that too many variables, function calls, and code blocks within a function all cause temporary variables to be pushed or popped. The compiler must generate instructions to handle these push and pop operations, and in the prior art such instructions account for a large proportion of the code. By establishing the mapping between data registers and the memory stack, the present invention can greatly increase the number of registers; accessing a register is equivalent to accessing the memory stack, so push and pop instructions are greatly reduced and system efficiency is effectively improved.

For pushing and popping variables, the prior art and the present invention differ as follows:

a) Push: when an ordinary subfunction or code block is entered, the prior art generates push instructions, that is, the variables to be saved are written from registers to the memory stack while the stack pointer moves down. The present invention simply moves the stack pointer down. After the stack pointer moves down, R0 points to a new address and therefore represents a new variable. If the stack pointer moves down by N, the original R0 is now referred to as RN. Moving the stack pointer down causes the top registers, such as R255, to be synchronized into memory, but this is done automatically by hardware with optimal performance; the compiler need not consider it, and it consumes no CPU instruction cycles. The push operation of the present invention is therefore clearly better than that of the prior art.

b) Pop: when an ordinary subfunction or code block exits, the prior art uses pop instructions to restore the saved variables from the stack to registers while the stack pointer moves up. The present invention simply moves the stack pointer up. Moving the stack pointer up requires memory to be synchronized into the top registers, but this is also done automatically by hardware. The pop operation of the present invention is therefore also clearly better than that of the prior art.
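The renaming effect in a) and b) can be checked with a small C model. This is a sketch under the assumption that register Ri simply names the stack word at SP + i; the names `R`, `sp_down`, `sp_up`, and `demo_enter_and_leave_block` are hypothetical, and the automatic hardware synchronization of the top registers is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 1024              /* modeled stack size in words (assumed) */

static uint32_t stack_mem[STACK_WORDS];
static uint32_t sp = 512;             /* word index that R0 currently maps to */

/* Register Ri is the stack word at SP + i. */
uint32_t *R(unsigned i)
{
    assert(sp + i < STACK_WORDS);
    return &stack_mem[sp + i];
}

/* Entering a block: only SP moves, no store instructions are needed. */
void sp_down(unsigned n) { assert(n <= sp); sp -= n; }

/* Leaving a block: only SP moves, no load instructions are needed. */
void sp_up(unsigned n) { assert(sp + n <= STACK_WORDS); sp += n; }

/* Returns 1 when the old R0 is seen as R4 inside the block and as R0 after it. */
int demo_enter_and_leave_block(void)
{
    *R(0) = 11;                       /* caller's variable lives in R0 */
    sp_down(4);                       /* block needs 4 new variables */
    int renamed = (*R(4) == 11);      /* the original R0 is now R4 */
    *R(0) = 99;                       /* the new R0 is a fresh variable */
    sp_up(4);                         /* block exits */
    int restored = (*R(0) == 11);     /* caller's variable is R0 again */
    return renamed && restored;
}
```

With N = 4, the caller's value is never copied: only the register number used to name it changes while the block is active.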

In an embodiment of the present invention, simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, further comprises:

in response to a call request for a subfunction, simplifying the push and pop instructions of the subfunction according to the mapping relationship between the data registers and the memory stack, and performing data access operations on variables according to the simplified instructions.

It should be noted that when a subfunction is called, the parameters and the function return address are pushed onto the stack, and the stack space occupied by the return value is computed. When the subfunction returns, the return value is written to the stack and control returns to the caller. Although the present invention handles subfunction call and return similarly to existing technology, the optimized push and pop operations make the overall performance of subfunction call and return better than that of existing technology.

In an embodiment of the present invention, simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, further comprises:

in response to an access request for an interrupt function, simplifying the push and pop instructions of the interrupt function according to the mapping relationship between the data registers and the memory stack, and performing data access operations for the interrupt function according to the simplified instructions.

It should be noted that, on entry to an interrupt function, the prior art must push all data registers and then set the stack pointer to the top of the interrupt stack. The present invention only needs to set the stack pointer; while it is being set, hardware automatically synchronizes the original data registers into the stack memory of the current function. The present invention can also let the interrupt skip the stack switch and run the interrupt function directly on the current stack, which performs even better.

On exit from an interrupt function, the prior art must restore (pop) all data registers and then return to the point of interruption. The present invention only needs to restore the stack pointer; hardware automatically restores the original data registers while it is being set. The present invention is therefore clearly better than the prior art for interrupt handling.

In an embodiment of the present invention, simplifying the stack access instructions according to the mapping relationship between the data registers and the memory stack, and performing data access operations according to the simplified instructions, further comprises:

in response to a multithread switching request, simplifying the push and pop instructions of the thread switch according to the mapping relationship between the data registers and the memory stack, and performing the thread switching operation according to the simplified instructions.

It should be noted that a multitasking operating system switches threads frequently. Each switch must save all data registers of the old thread and restore all data registers of the new thread. The prior art still needs explicit push and pop instructions, whereas the present invention only needs to set the stack pointer, and hardware completes the related synchronization automatically. The present invention is therefore clearly better than the prior art for thread switching.

In an embodiment of the present invention, the method for optimizing register access in a CPU further comprises:

modifying a compiler according to the preset mapping policy, so as to adjust the register allocation scheme, the variable access mode, and the form of the generated assembly.

In a specific embodiment, the present invention involves modifications at the following levels:

a) CPU: mainly the handling of the stack pointer SP (taking little-endian as an example):

i. SP move-down instruction: the top N registers are synchronized into the memory stack (register -> memory), then SP = SP - N.

ii. SP move-up instruction: the memory stack is synchronized into the top N registers (memory -> register), then SP = SP + N.

iii. SP set instruction: all registers are written to the old stack, SP is set to the new value, and all registers are then loaded from the new stack.
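The three SP instructions can be modeled in C as follows. This is a hedged sketch: the `Cpu` structure, the function names, the 8-register window (the embodiment uses up to 256), and the choice to fill newly exposed low registers from memory are assumptions made to keep the model self-consistent, and the register-to-register copies stand in for what a real design would avoid.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 64
#define NUM_REGS  8    /* small window for the demo; the embodiment uses up to 256 */

typedef struct {
    uint32_t mem[MEM_WORDS];   /* backing memory stack */
    uint32_t reg[NUM_REGS];    /* reg[i] holds the value of stack word sp + i */
    uint32_t sp;
} Cpu;

/* i. SP move-down: top n registers -> memory, then SP = SP - n. */
void sp_move_down(Cpu *c, unsigned n)
{
    assert(n <= NUM_REGS && n <= c->sp && c->sp + NUM_REGS <= MEM_WORDS);
    for (unsigned i = NUM_REGS - n; i < NUM_REGS; i++)
        c->mem[c->sp + i] = c->reg[i];                 /* spill top n */
    memmove(&c->reg[n], &c->reg[0], (NUM_REGS - n) * sizeof c->reg[0]);
    c->sp -= n;
    /* Assumption: the n new low registers start from what memory holds. */
    memcpy(&c->reg[0], &c->mem[c->sp], n * sizeof c->reg[0]);
}

/* ii. SP move-up: memory -> top n registers, then SP = SP + n. */
void sp_move_up(Cpu *c, unsigned n)
{
    assert(n <= NUM_REGS && c->sp + n + NUM_REGS <= MEM_WORDS);
    memmove(&c->reg[0], &c->reg[n], (NUM_REGS - n) * sizeof c->reg[0]);
    c->sp += n;
    for (unsigned i = NUM_REGS - n; i < NUM_REGS; i++)
        c->reg[i] = c->mem[c->sp + i];                 /* fill top n */
}

/* iii. SP set: registers -> old stack, set SP, new stack -> registers. */
void sp_set(Cpu *c, uint32_t new_sp)
{
    assert(new_sp + NUM_REGS <= MEM_WORDS);
    memcpy(&c->mem[c->sp], c->reg, sizeof c->reg);
    c->sp = new_sp;
    memcpy(c->reg, &c->mem[c->sp], sizeof c->reg);
}

/* Returns 1 when all three instructions behave as described above. */
int demo_sp_instructions(void)
{
    Cpu c = {0};
    c.sp = 24;
    c.reg[0] = 5;
    c.reg[7] = 70;                                     /* top register */
    sp_move_down(&c, 2);                               /* old R0 becomes R2 */
    int ok = (c.reg[2] == 5) && (c.mem[24 + 7] == 70) && (c.sp == 22);
    sp_move_up(&c, 2);                                 /* top registers refilled */
    ok = ok && (c.reg[0] == 5) && (c.reg[7] == 70) && (c.sp == 24);
    sp_set(&c, 40);                                    /* e.g. another thread's stack */
    ok = ok && (c.mem[24] == 5) && (c.reg[0] == 0);
    sp_set(&c, 24);                                    /* switch back */
    return ok && (c.reg[0] == 5) && (c.reg[7] == 70);
}
```

The two `sp_set` calls in the demo correspond to the interrupt and thread-switch cases described earlier: the software issues only an SP write, and the save and restore of the register contents happens inside the modeled hardware.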

b) Compiler modification: adjust register allocation, variable access, assembly generation, and so on.

c) Software modification: mainly modify low-level implementations such as interrupt handling and thread switching.

It can be understood that the modifications at these levels can be adapted to the mapping requirements by the relevant specialists (compiler developers, operating system or low-level software developers, and so on).

In a specific embodiment, the following optimizations can be applied, for example on the CPU side:

a) The number N of physical data registers may be, but is not limited to, the 256 of this embodiment, and can be increased or decreased according to actual needs.

b) The registers use a ring-plus-index access mode: R[0] as accessed in an instruction may be the physical register Rx, and the physical R0 then corresponds to R[N-x] in the instruction. This mode reduces the data movement caused when SP moves, thereby improving performance.

c) One or more levels of cache are inserted between the registers and memory to improve performance.
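The ring-plus-index access in b) can be illustrated with a short C sketch. The structure `RingRegs`, its `base` field (playing the role of x), and the function names are assumptions for illustration; spilling to and filling from memory are deliberately left out so that only the index arithmetic is shown.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8     /* N; small for the demo, the embodiment uses up to 256 */

typedef struct {
    uint32_t phys[NUM_REGS];  /* physical register file */
    unsigned base;            /* x: physical index currently named R[0] */
} RingRegs;

/* Logical R[i] in an instruction -> physical register index. */
unsigned phys_index(const RingRegs *r, unsigned logical)
{
    assert(logical < NUM_REGS);
    return (r->base + logical) % NUM_REGS;
}

/* Physical register index -> logical number used in instructions. */
unsigned logical_index(const RingRegs *r, unsigned physical)
{
    assert(physical < NUM_REGS);
    return (physical + NUM_REGS - r->base) % NUM_REGS;
}

/* SP moves down by n: rotate the window; no register contents are copied. */
void ring_sp_down(RingRegs *r, unsigned n)
{
    assert(n <= NUM_REGS);
    r->base = (r->base + NUM_REGS - n) % NUM_REGS;
}

/* SP moves up by n: rotate the window back. */
void ring_sp_up(RingRegs *r, unsigned n)
{
    assert(n <= NUM_REGS);
    r->base = (r->base + n) % NUM_REGS;
}

/* Returns 1 when the index arithmetic matches the description in b). */
int demo_ring(void)
{
    RingRegs r = {{0}, 0};
    r.phys[phys_index(&r, 0)] = 5;            /* write logical R[0] */
    ring_sp_down(&r, 3);                      /* old R[0] should now be R[3] */
    int ok = (r.phys[phys_index(&r, 3)] == 5);
    /* Physical R0 corresponds to logical R[N - x]. */
    ok = ok && (r.base == 5) && (logical_index(&r, 0) == NUM_REGS - r.base);
    ring_sp_up(&r, 3);
    return ok && (r.base == 0) && (r.phys[phys_index(&r, 0)] == 5);
}
```

Compared with a file in which logical and physical numbers coincide, moving SP here changes only `base`, which is the reduction in data movement that b) refers to.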

Some typical C functions are given below, together with the corresponding assembly after translation (similar to ARM assembly), to demonstrate the present invention. The following examples contain no explicit data push or pop instructions, which demonstrates the advantage of the present invention over the prior art.

First, the calling convention:

a) When a function is called, a value is subtracted from sp so that it points to the 0th parameter to be passed, the parameters are then written to the stack in reverse order, and finally pc (the program counter) is pushed and control jumps to the subfunction. If the number of bytes needed by the parameters is less than the number of bytes needed by the return value, sp is reduced by the number of bytes needed by the return value.

b) Inside a function, the function parameters can be used as local variables, that is, the function may change the values of its parameters. The 0th parameter is r1; if the value of sp changes, it may become r[n].

c) Each variable of a function is stored on the stack. For however many variables are used, sp should be reduced by the corresponding amount so that every variable's address is greater than or equal to sp; this ensures that unexpected events such as interrupts will not overwrite the variables.

d) When a function returns, sp is restored and the first return value is placed in r1, that is, sp[1], followed by the next in r2; finally pc is popped and points to the instruction after the call in the calling function.
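Rules a) to d) can be walked through with a C simulation that uses a DoMath function with return value (a+b)*(c-d) as the callee. This is only a sketch of the convention, not generated assembly: the word-sized stack slots, the names `R`, `DoMath_body`, and `call_DoMath`, the dummy return address, and the caller releasing the argument area after the call are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 256

static int32_t stack_mem[STACK_WORDS];
static unsigned sp = STACK_WORDS - 16;   /* word index that R0 maps to */

/* Register Ri is the stack word at SP + i. */
int32_t *R(unsigned i)
{
    assert(sp + i < STACK_WORDS);
    return &stack_mem[sp + i];
}

/* Callee: on entry R0 holds the return address and R1..R4 hold a, b, c, d. */
void DoMath_body(void)
{
    /* Rule d): the first return value is placed in r1, overwriting a. */
    *R(1) = (*R(1) + *R(2)) * (*R(3) - *R(4));
}

/* Caller side of the convention, with a dummy return address. */
int32_t call_DoMath(int32_t a, int32_t b, int32_t c, int32_t d, int32_t return_pc)
{
    unsigned saved_sp = sp;

    sp -= 4;                  /* rule a): sp now points at parameter 0 */
    *R(3) = d;                /* parameters written in reverse order */
    *R(2) = c;
    *R(1) = b;
    *R(0) = a;
    sp -= 1;                  /* push pc ... */
    *R(0) = return_pc;

    DoMath_body();            /* ... and jump to the subfunction */

    int32_t pc = *R(0);       /* rule d): pc is popped on return */
    sp += 1;
    assert(pc == return_pc);

    int32_t result = *R(0);   /* the callee's r1 is the caller's R0 */
    sp = saved_sp;            /* assumption: caller releases the argument area */
    return result;
}
```

For example, `call_DoMath(1, 2, 7, 3, 0x100)` evaluates (1+2)*(7-3) entirely through register-named stack words; the only stack bookkeeping is the adjustment of sp.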

1) The simplest function: LoopDelay()

2) Memory copy function: CopyMemory

3) Mathematical operation: DoMath(a, b, c, d), whose return value is (a+b)*(c-d), followed by the optimized code

Optimized code:

The above is a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications also fall within the protection scope of the present invention.
