Method for accelerating large-integer modular multiplication over a prime field

Document No. 1798192; published 2021-11-05

Abstract: This technology, 一种实现素数域大整数模乘计算加速的方法 (Method for accelerating large-integer modular multiplication over a prime field), was created by 郑昉昱, 高莉莉, 魏荣, 马原, 王跃武, 范广 and 万立鹏 on 2021-07-12. The invention discloses a method for accelerating large-integer modular multiplication over a prime field. A multiplicand and a multiplier of length k bits in the prime field are each divided into N segments, where each of the first (N-1) segments is w bits long, the N-th segment is r bits long, and $w \ge r$. Each segment of the multiplicand and the multiplier is converted to a double-precision floating-point number, and fused multiply-add operations are applied to the converted segments. 2N fixed-point numbers are initialized, the binary values of the multiply-add results are accumulated into them, and the fixed-point numbers are then reduced in bit length to obtain the final modular multiplication result. The invention makes full use of the double-precision floating-point format and improves the efficiency of modular multiplication over a prime field.

1. A method for accelerating large-integer modular multiplication over a prime field, comprising the following steps:

1) A and B are defined over the prime field $F_p$, where $p = 2^k - \sigma$ is a prime and $\sigma < 2^w$; the multiplicand A and the multiplier B, each k bits long, are each divided into N segments, where each of the first (N-1) segments is w bits, the N-th segment is r bits, and $w \ge r$;

2) each segment of the multiplicand A and the multiplier B is converted to a double-precision floating-point number; fused multiply-add operations are applied to the converted multiplicand A and multiplier B, and the results are converted into a fixed-point number R;

3) the fixed-point number R is divided into 2N segments and, without changing the value of R, the first (2N-1) segments of R are each set to a length of w bits; R is reduced to an N-segment fixed-point number by multiplication and addition operations; the part of that N-segment number exceeding k bits is then reduced by multiplication, addition and shift operations, so that it becomes a k-bit fixed-point number;

4) it is determined whether the k-bit result is an integer in the selected prime field; if it is, it is the modular multiplication result of the large integers A and B; if it is not, the result minus p is the modular multiplication result of the large integers A and B.

2. The method of claim 1, wherein the number of segments of the multiplicand A and the multiplier B is $N = \lceil k/52 \rceil$, where 52 is the fraction length of the mantissa of a double-precision floating-point number; the bit length w of the first (N-1) segments and the bit length r of the N-th segment satisfy $(N-1) \times w + r = k$, and, subject to $52 \ge w \ge r$, $w - r$ is made as small as possible.

3. The method of claim 1 or 2, wherein, after the multiplicand and the multiplier are segmented, A[0:N-1] denotes the N segments 0 to (N-1) of the multiplicand A, A'[0:N-1] is the floating-point form of A[0:N-1], B[0:N-1] denotes the N segments 0 to (N-1) of the multiplier B, and B'[0:N-1] is the floating-point form of B[0:N-1].

4. The method of claim 3, wherein applying fused multiply-add operations to the converted multiplicand A and multiplier B comprises: first, initializing the fixed-point number R, divided into 2N segments denoted R[0:2N-1]; second, following the segment-scanning order of large-integer multiplication, $\sum_{i,j} A'[i] \cdot B'[j]$, computing the fused multiply-add result $M_{ij}[0]$ of segment A'[i] of the multiplicand A', segment B'[j] of the multiplier B' and the addend C0, and then computing the fused multiply-add result $M_{ij}[1]$ of A'[i], B'[j] and the addend C1, where $0 \le i, j < N$; letting conv_2_bin(x) denote the binary form of x, adding conv_2_bin($M_{ij}[0]$) to the fixed-point number R[i+j+1] and conv_2_bin($M_{ij}[1]$) to R[i+j].

5. The method of claim 4, wherein initializing the fixed-point number R comprises: when $t \in [0, N-1]$, R[t] = -[(t × (0x433 + w) + (t + 1) × 0x433) & 0xFFF] << 52; when $t \in [N, 2N-1]$, R[t] = -[((t + 1) × (0x433 + w) + t × 0x433) & 0xFFF] << 52.

6. The method of claim 4, wherein the addend C0 has the value $2^{52+w}$ and the addend C1 has the value $2^{52+w} + 2^{52} - M_{ij}[0]$.

7. The method of claim 1 or 5, wherein the segment length of the first (2N-1) segments of R is set to w bits by: $R_{t+1} = R_{t+1} + (R_t \gg w)$, $t \in [0, 2N-2]$, where $R_t$ denotes the (t+1)-th segment of R and $R_{t+1}$ denotes the (t+2)-th segment of R.

8. The method of claim 7, wherein reducing R to an N-segment fixed-point number using multiplication and addition operations comprises: [formula not reproduced in this text]; after reduction, the N-segment number lies in the range $[0, 2^k + \sigma \cdot 2^{digit-r})$, where digit is the bit length of a double-precision floating-point number; since A and B are large integers whose bit length k is much greater than digit, $0 < \sigma \cdot 2^{digit-r} < 2^k$, i.e. the reduced number is less than $2^{k+1}$.

9. The method of claim 8, wherein the reduced number consists of N segments, numbered 0 to (N-1); the high (digit - r) bits of its last segment are the carry, and, by the range given in claim 8, the carry is 0 or 1; reducing the part exceeding k bits by multiplication, addition and shift operations, so that the result is a k-bit fixed-point number, comprises: first, [formula not reproduced in this text], where $mask_r = 2^r - 1$; then, for $t \in [0, N-2]$, [formula not reproduced in this text]. After the carry reduction, the range of the result is as follows: when the carry is 0, [range not reproduced in this text]; when the carry is 1, [range not reproduced in this text]; since $\sigma$ is small and digit is much smaller than k, the range after carry reduction can be unified as $[0, 2^k - 1]$.

10. The method of claim 9, wherein, if the carry-reduced result is less than the prime p, it is the result of multiplying the large integers A and B modulo p; if it is greater than the prime p, the result minus p is the result of multiplying the large integers A and B modulo p.

Technical Field

The invention belongs to the field of computing technology and relates to a method for accelerating large-integer modular multiplication over a prime field.

Background

With the continued progress of science and the rapid development of computer technology, users demand stronger privacy protection, and cryptography is widely applied in network communication. For example, Internet-based industries that serve large numbers of users, such as e-commerce and software distribution, rely on key agreement and digital signatures for privacy protection and secure communication over the Internet. Large-integer modular multiplication is the core computational load of many asymmetric cryptographic algorithms. In particular, the main computational load of ECC (Elliptic Curve Cryptography), a mainstream asymmetric cryptographic algorithm worldwide, is large-integer modular multiplication over a prime field. The speed of this operation therefore directly affects the speed of key agreement and digital signatures, which makes high-performance implementations of prime-field large-integer modular multiplication an important research topic.

GPUs (graphics processing units) are highly efficient at computer graphics and image processing and are therefore particularly good at floating-point arithmetic. The floating-point computing power of GPUs has grown more than tenfold over the last decade. In addition, the CUDA parallel computing framework introduced by NVIDIA allows GPU computing resources, originally suited only to graphics processing, to be used to accelerate scientific computing. Many researchers have used GPU computing resources to accelerate mainstream cryptographic primitives. For example, Pan et al. used the fixed-point computing power of GPUs to accelerate ECDSA, and Niall et al. used the double-precision floating-point computing power of GPUs to accelerate RSA, with throughput reaching a new peak. To take advantage of the rapid growth of GPU floating-point computing power, the present method combines double-precision fused multiply-add instructions with integer arithmetic instructions to accelerate large-integer modular multiplication over a prime field.

The basic data types of current computers have fixed word lengths, so a large integer cannot be represented directly by a basic data type. Researchers generally split a large integer into parts, represent it with several values of a basic data type, and compute large-integer modular multiplication with multi-precision arithmetic.

The double-precision floating-point format used by the invention conforms to the IEEE 754 floating-point standard. A floating-point number in IEEE 754 consists of a sign bit, an exponent and a mantissa, where the mantissa comprises a 1-bit implicit bit and a fraction. A double-precision floating-point number contains a 1-bit sign, an 11-bit exponent, a 1-bit implicit bit and a 52-bit fraction; the implicit bit is not stored.
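The field layout can be inspected directly. The following is a minimal Python sketch (the helper name `double_fields` is hypothetical and not part of the patent) that unpacks a double into its sign, biased exponent and stored fraction:

```python
import struct

def double_fields(x):
    """Return (sign, biased_exponent, fraction) of an IEEE 754 double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]  # raw 64-bit pattern
    sign = bits >> 63                   # 1 sign bit
    exponent = (bits >> 52) & 0x7FF     # 11-bit biased exponent (bias 1023)
    fraction = bits & ((1 << 52) - 1)   # 52 stored fraction bits
    return sign, exponent, fraction

# 2^52 + 5 has unbiased exponent 52, so the exponent field is 1023 + 52 = 0x433
# and the fraction field holds the integer 5 directly.
sign, exponent, fraction = double_fields(float(2**52 + 5))
print(sign, hex(exponent), fraction)
```

For any double in $[2^{52}, 2^{53})$ the fraction field is simply the integer offset from $2^{52}$, which is the property the method relies on when it adds raw bit patterns to fixed-point accumulators.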

Disclosure of Invention

The invention provides a method for accelerating large-integer modular multiplication over a prime field that makes full use of the double-precision floating-point capability of computing resources and improves the speed of large-integer modular multiplication.

A method for accelerating large-integer modular multiplication over a prime field comprises the following steps:

1) the large integers A and B, each k bits long and defined over the prime field $F_p$, are each divided into N segments, where each of the first (N-1) segments is w bits, the N-th segment is r bits, and $w \ge r$; here $p = 2^k - \sigma$ is a prime and $\sigma < 2^w$;

2) each segment of the multiplicand A and the multiplier B is converted to a double-precision floating-point number; fused multiply-add operations are applied to the converted multiplicand A and multiplier B, and the results are converted into a fixed-point number R;

3) the fixed-point number R is divided into 2N segments and, without changing the value of R, the first (2N-1) segments of R are each set to a length of w bits; R is reduced to an N-segment fixed-point number by multiplication and addition operations; the part of that N-segment number exceeding k bits is then reduced by multiplication, addition and shift operations, so that it becomes a k-bit fixed-point number;

4) it is determined whether the k-bit result is an integer in the selected prime field; if it is, it is the modular multiplication result of the large integers A and B; if it is not, the result minus p is the modular multiplication result of the large integers A and B.

Where "large integer" refers to an integer that cannot be represented with only one double-precision floating-point number.

Further, the number of segments of the multiplicand A and the multiplier B is $N = \lceil k/52 \rceil$, where 52 is the fraction length of the mantissa of a double-precision floating-point number; the bit length w of the first (N-1) segments and the bit length r of the N-th segment satisfy $(N-1) \times w + r = k$, and, subject to $52 \ge w \ge r$, $w - r$ is made as small as possible.
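The parameter choice can be sketched in a few lines of Python. This assumes the segment count is $N = \lceil k/52 \rceil$, an assumption that agrees with every row of Table 1; the function name `choose_segments` is hypothetical. Because $w - r = N \cdot w - k$, the smallest admissible $w$ is $\lceil k/N \rceil$.

```python
def choose_segments(k, fraction_bits=52):
    """Pick (N, w, r) with (N-1)*w + r == k, fraction_bits >= w >= r, w - r minimal."""
    n = -(-k // fraction_bits)   # ceil(k / 52): segments needed at 52 bits each
    w = -(-k // n)               # ceil(k / N): smallest w that still has w >= r
    r = k - (n - 1) * w          # the last segment takes the remaining bits
    return n, w, r

# The embodiment p = 2^221 - 3 uses k = 221.
print(choose_segments(221))
```

Choosing the smallest admissible $w$ keeps the segments as balanced as possible, which leaves the most headroom in each 64-bit accumulator.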

Further, after the multiplicand and the multiplier are segmented, A[0:N-1] denotes the N segments 0 to (N-1) of the multiplicand A, A'[0:N-1] is the floating-point form of A[0:N-1], B[0:N-1] denotes the N segments 0 to (N-1) of the multiplier B, and B'[0:N-1] is the floating-point form of B[0:N-1].
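As an illustration of the segmentation and conversion, here is a small Python sketch (the helper names `split_segments` and `to_doubles` are hypothetical). It assumes segment 0 is the least significant segment and that $w \le 52$, so every segment is exactly representable as a double.

```python
def split_segments(x, n, w):
    """Split x into n segments: the first n-1 are w bits, the last holds the rest."""
    mask = (1 << w) - 1
    low = [(x >> (w * t)) & mask for t in range(n - 1)]
    return low + [x >> (w * (n - 1))]

def to_doubles(segments):
    """Convert integer segments to doubles; exact because each is below 2^53."""
    return [float(s) for s in segments]

# Example with the embodiment's parameters (k = 221, N = 5, w = 45, r = 41).
A = (1 << 221) - 4
A_seg = split_segments(A, 5, 45)
A_dbl = to_doubles(A_seg)
# Reassembling the doubles recovers A exactly.
print(sum(int(f) << (45 * t) for t, f in enumerate(A_dbl)) == A)
```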

Further, applying fused multiply-add operations to the converted multiplicand A and multiplier B comprises: first, 2N fixed-point numbers are initialized, denoted R[0:2N-1]; second, following the segment-scanning order of large-integer multiplication, $\sum_{i,j} A'[i] \cdot B'[j]$, the fused multiply-add result $M_{ij}[0]$ of segment A'[i] of the multiplicand A', segment B'[j] of the multiplier B' and the addend C0 is computed, and then the fused multiply-add result $M_{ij}[1]$ of A'[i], B'[j] and the addend C1 is computed, where $0 \le i, j < N$; letting conv_2_bin(x) denote the binary form of x, conv_2_bin($M_{ij}[0]$) is added to the fixed-point number R[i+j+1] and conv_2_bin($M_{ij}[1]$) is added to R[i+j].

Further, the 2N fixed-point numbers R[0:2N-1] are initialized as follows: when $t \in [0, N-1]$, R[t] = -[(t × (0x433 + w) + (t + 1) × 0x433) & 0xFFF] << 52; when $t \in [N, 2N-1]$, R[t] = -[((t + 1) × (0x433 + w) + t × 0x433) & 0xFFF] << 52. Here 0x433 is the hexadecimal form of 1075, the double-precision exponent bias 1023 plus 52, and 0xFFF is the hexadecimal form of $2^{12} - 1$.
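The purpose of the initialization is to cancel the exponent fields that ride along when raw double bit patterns are added to 64-bit integers. The Python sketch below models this under stated assumptions: the bit patterns of $M_{ij}[0]$ and $M_{ij}[1]$ are written out with integer operations (exponent field 0x433 + w with the high part of the product as fraction, and exponent field 0x433 with the low part), and each initial word is derived by counting the terms that land in it. Only the lower half, $t \in [0, N-1]$, is compared against the closed form quoted above. All function names are hypothetical.

```python
MASK64 = (1 << 64) - 1   # model 64-bit unsigned wrap-around
EXP_LO = 0x433           # exponent field of a double in [2^52, 2^53)

def initial_words(n, w):
    """Initial R[0:2N-1]: minus the exponent fields each word will receive."""
    hi_terms = [0] * (2 * n)   # how many M_ij[0] patterns land in R[t]
    lo_terms = [0] * (2 * n)   # how many M_ij[1] patterns land in R[t]
    for i in range(n):
        for j in range(n):
            hi_terms[i + j + 1] += 1
            lo_terms[i + j] += 1
    return [(-((hi_terms[t] * (EXP_LO + w) + lo_terms[t] * EXP_LO) << 52)) & MASK64
            for t in range(2 * n)]

def accumulate(a_seg, b_seg, w):
    """Add the raw bit patterns of all partial products into R."""
    n = len(a_seg)
    mask_w = (1 << w) - 1
    r = initial_words(n, w)
    for i in range(n):
        for j in range(n):
            prod = a_seg[i] * b_seg[j]
            bits_hi = ((EXP_LO + w) << 52) | (prod >> w)   # models conv_2_bin(M_ij[0])
            bits_lo = (EXP_LO << 52) | (prod & mask_w)     # models conv_2_bin(M_ij[1])
            r[i + j + 1] = (r[i + j + 1] + bits_hi) & MASK64
            r[i + j] = (r[i + j] + bits_lo) & MASK64
    return r

def closed_form_low(t, w):
    """Initial value quoted in the text for t in [0, N-1]."""
    return (-(((t * (EXP_LO + w) + (t + 1) * EXP_LO) & 0xFFF) << 52)) & MASK64

a_seg = [(1 << 45) - 1 - t for t in range(5)]
b_seg = [(1 << 45) - 1 - 7 * t for t in range(5)]
R = accumulate(a_seg, b_seg, 45)
```

After accumulation every exponent contribution has cancelled, so the words of `R`, weighted by $2^{wt}$, sum to the plain integer product.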

Further, the addend C0 has the value $2^{52+w}$ and the addend C1 has the value $2^{52+w} + 2^{52} - M_{ij}[0]$.
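The effect of the two addends can be illustrated with a Python model. Two points are assumptions rather than statements from the text: the fused multiply-add is taken to round toward zero (as, for example, CUDA's `__fma_rz` intrinsic does), and since Python has no portable fused multiply-add with that rounding mode, `fma_rz` below emulates it with exact integers. All names are hypothetical.

```python
import struct

W = 45                    # segment width from the embodiment
C0 = 1 << (52 + W)        # addend C0 = 2^(52+w)

def fma_rz(a, b, c):
    """Emulate fma(a, b, c) rounded toward zero for integer-valued inputs
    whose exact result is positive."""
    exact = int(a) * int(b) + int(c)          # exact, no intermediate rounding
    drop = max(exact.bit_length() - 53, 0)    # bits beyond a 53-bit significand
    return float((exact >> drop) << drop)     # truncate, then convert exactly

def raw_bits(x):
    """64-bit pattern of a double (what conv_2_bin is taken to mean here)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def split_product(a, b):
    """Return (M[0], M[1]): doubles carrying the high and low parts of a*b."""
    m0 = fma_rz(a, b, C0)                     # 2^(52+w) + (a*b >> w) * 2^w
    c1 = C0 + (1 << 52) - int(m0)             # addend C1
    m1 = fma_rz(a, b, c1)                     # 2^52 + (a*b mod 2^w)
    return m0, m1

a, b = float(2**45 - 1), float(2**45 - 3)
m0, m1 = split_product(a, b)
print(hex(raw_bits(m0) >> 52), hex(raw_bits(m1) >> 52))
```

With truncation, the fraction field of `m0` holds the upper bits of the product and the fraction field of `m1` holds the lower `W` bits, so both can be added to integer accumulators as raw bit patterns.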

Further, the segment length of the first (2N-1) segments of R is set to w bits as follows: $R_{t+1} = R_{t+1} + (R_t \gg w)$, $t \in [0, 2N-2]$, where $R_t$ denotes the (t+1)-th segment of R and $R_{t+1}$ denotes the (t+2)-th segment of R.
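A minimal Python sketch of this carry propagation follows. The text gives only the update of $R_{t+1}$; masking $R_t$ down to its low w bits after its upper bits are moved is an assumption here, made so that the overall value of R stays unchanged. The function name `normalize` is hypothetical.

```python
def normalize(r, w):
    """Propagate carries so every segment except the last fits in w bits."""
    mask = (1 << w) - 1
    out = list(r)
    for t in range(len(out) - 1):
        out[t + 1] += out[t] >> w   # R[t+1] = R[t+1] + (R[t] >> w)
        out[t] &= mask              # keep the low w bits (assumed step)
    return out

# 2^50 + 7 in segment 0 carries 2^5 = 32 into segment 1 when w = 45.
print(normalize([(1 << 50) + 7, 3, 1 << 46], 45))
```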

Further, reducing R to an N-segment fixed-point number using multiplication and addition operations comprises: [formula not reproduced in this text]; after reduction, the N-segment number lies in the range $[0, 2^k + \sigma \cdot 2^{digit-r})$, where digit is the bit length of a double-precision floating-point number; since A and B are large integers whose bit length k is much greater than digit, $0 < \sigma \cdot 2^{digit-r} < 2^k$, i.e. the reduced number is less than $2^{k+1}$.
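One way to fold the upper N segments into the lower N, consistent with $p = 2^k - \sigma$, uses the congruence $2^{Nw} = 2^{k + (w - r)} \equiv \sigma \cdot 2^{w-r} \pmod p$. The Python sketch below is an illustration under that assumption and is not claimed to be the exact formula of the patent; it does not propagate carries, so it makes no statement about the value range. The names `segments` and `fold_high_half` are hypothetical.

```python
def segments(x, count, w):
    """Split x into `count` segments of w bits; the last keeps any remainder."""
    mask = (1 << w) - 1
    return [(x >> (w * t)) & mask for t in range(count - 1)] + [x >> (w * (count - 1))]

def fold_high_half(r, n, w, r_bits, sigma):
    """Fold R[N:2N-1] into R[0:N-1] using 2^(N*w) = sigma * 2^(w-r) mod p."""
    factor = sigma << (w - r_bits)
    return [r[t] + factor * r[t + n] for t in range(n)]

# Embodiment parameters: p = 2^221 - 3, N = 5, w = 45, r = 41.
p = (1 << 221) - 3
product = (p - 1) * (p - 2)
R = segments(product, 10, 45)
S = fold_high_half(R, 5, 45, 41, 3)
value = sum(x << (45 * t) for t, x in enumerate(S))
print(value % p == product % p)
```

For the embodiment the multiplier is $3 \cdot 2^4 = 48$, so each folded word stays far below $2^{64}$ when the inputs have been normalized to 45 bits.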

Further, the reduced number consists of N segments, numbered 0 to (N-1); the high (digit - r) bits of its last segment are the carry, and, by the range given above, the carry is 0 or 1; reducing the part exceeding k bits by multiplication, addition and shift operations, so that the result is a k-bit fixed-point number, comprises: first, [formula not reproduced in this text], where $mask_r = 2^r - 1$; then, for $t \in [0, N-2]$, [formula not reproduced in this text]. After the carry reduction, the range of the result is as follows: when the carry is 0, [range not reproduced in this text]; when the carry is 1, [range not reproduced in this text]; since $\sigma$ is small and digit is much smaller than k, the range after carry reduction can be unified as $[0, 2^k - 1]$.

Further, if the carry-reduced result is less than the prime p, it is the result of multiplying the large integers A and B modulo p; if it is greater than the prime p, the result minus p is the result of multiplying the large integers A and B modulo p.
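The carry fold and the final conditional subtraction can be sketched as follows. This is an illustrative Python model under assumptions: bits above position k in the last segment are treated as the carry and re-enter at segment 0 multiplied by $\sigma$ (since $2^k \equiv \sigma \pmod p$), and the fold is written as a loop so that the sketch also accepts inputs whose carry exceeds 1. The names `propagate` and `finalize` are hypothetical.

```python
def propagate(seg, w):
    """In-place carry propagation: all segments but the last end below 2^w."""
    mask = (1 << w) - 1
    for t in range(len(seg) - 1):
        seg[t + 1] += seg[t] >> w
        seg[t] &= mask

def finalize(seg, w, r_bits, sigma):
    """Reduce an N-segment number to the canonical residue in [0, p)."""
    seg = list(seg)
    k = (len(seg) - 1) * w + r_bits
    p = (1 << k) - sigma
    propagate(seg, w)
    while seg[-1] >> r_bits:                 # bits above position k remain
        carry = seg[-1] >> r_bits
        seg[-1] &= (1 << r_bits) - 1         # apply mask_r = 2^r - 1
        seg[0] += sigma * carry              # 2^k is congruent to sigma mod p
        propagate(seg, w)
    value = sum(s << (w * t) for t, s in enumerate(seg))
    return value - p if value >= p else value   # at most one subtraction needed

# p + 1 = 2^221 - 2, written in segments, reduces to 1 for p = 2^221 - 3.
m = (1 << 45) - 1
print(finalize([m - 1, m, m, m, (1 << 41) - 1], 45, 41, 3))
```

Once the loop ends the value is below $2^k = p + \sigma$, which is why a single conditional subtraction of p suffices.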

Compared with the prior art, the invention has the following positive effects:

When computing large-integer modular multiplication over a prime field, the method first splits the multiplicand and the multiplier and converts the pieces into double-precision floating-point values, making full use of the fraction bits of the double-precision mantissa during conversion. Implementing prime-field large-integer modular multiplication with floating-point instructions is a novel idea and is computationally efficient; it exploits the double-precision floating-point storage format as far as possible and increases the speed of large-integer modular multiplication.

Drawings

FIG. 1 is a flowchart of the method of the invention for accelerating prime-field large-integer modular multiplication with floating-point instructions.

Detailed Description

The technical solution of the present invention is explained in detail below, but the scope of the present invention is not limited to the embodiments.

For a given prime field $F_p$ with $p = 2^{221} - 3$, where A and B are elements of $F_p$, computing A × B modulo p with the method of accelerating prime-field large-integer modular multiplication by floating-point instructions mainly comprises the following steps:

1) The multiplicand A and the multiplier B, each 221 bits long, are each divided into N = 5 segments, where each of the first 4 segments is 45 bits and the 5th segment is 41 bits;

2) After the multiplicand and the multiplier are segmented, A[0:4] denotes the 5 segments 0 to 4 of the multiplicand A, and B[0:4] denotes the 5 segments 0 to 4 of the multiplier B. Each segment of A[0:4] is converted to double-precision floating-point form and denoted A'[0:4], and each segment of B[0:4] is converted to double-precision floating-point form and denoted B'[0:4].

3) Following the segment-scanning order of large-integer multiplication, $\sum_{i,j} A'[i] \cdot B'[j]$, $i, j \in [0, 4]$: first compute the fused multiply-add result $M_{ij}[0]$ of segment A'[i] of the multiplicand A', segment B'[j] of the multiplier B' and the addend C0, where $C0 = 2^{97}$; then compute the fused multiply-add result $M_{ij}[1]$ of A'[i], B'[j] and the addend C1, where $C1 = 2^{97} + 2^{52} - M_{ij}[0]$.

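4) Initialize the fixed-point number R, divided into 2N segments denoted R[0:2N-1]. R[0:9] is initialized according to the general formula with w = 45 and N = 5: for $t \in [0, 4]$, R[t] = -[(t × (0x433 + 45) + (t + 1) × 0x433) & 0xFFF] << 52; for $t \in [5, 9]$, R[t] = -[((t + 1) × (0x433 + 45) + t × 0x433) & 0xFFF] << 52.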

5) Let conv_2_bin(x) denote the binary form of x; add conv_2_bin($M_{ij}[0]$) to the fixed-point number R[i+j+1] and conv_2_bin($M_{ij}[1]$) to R[i+j].

6) Set the segment length of the first 9 segments of R[0:9] to 45 bits, as follows:

$R_{t+1} = R_{t+1} + (R_t \gg 45), \quad t \in [0, 8]$

7) Reduce the 10-segment fixed-point number R to a 5-segment fixed-point number by multiplication and addition operations; the calculation is as follows:

[formula not reproduced in this text]

After reduction, the 5-segment number lies in the range $[0, 2^{221} + 3 \cdot 2^{23})$.

8) The reduced number consists of N segments, numbered 0 to (N-1); the high 23 bits of its last segment are the carry, and, by the range given in step 7), the carry is 0 or 1. Reduce the N-segment number to 221 bits by multiplication, addition and shift operations: first, [formula not reproduced in this text]; then, for $t \in [0, 3]$, [formula not reproduced in this text]. After the carry reduction, the result lies in $[0, 2^{221} - 1]$.

9) Determine whether the carry-reduced result is less than the prime p: if it is, it is the result of multiplying the large integers A and B modulo p; if it is greater than the prime p, the result minus p is the result of multiplying the large integers A and B modulo p.
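The nine steps can be checked end to end with a plain-integer reference model for $p = 2^{221} - 3$. The Python sketch below is such a model under stated assumptions, not the floating-point implementation itself: the high and low parts that $M_{ij}[0]$ and $M_{ij}[1]$ carry are computed with integer shifts and masks, the fold uses $2^{225} \equiv 3 \cdot 2^4 \pmod p$, and the carry fold is written as a loop. All names are hypothetical.

```python
K, SIGMA, N, W, R_BITS = 221, 3, 5, 45, 41    # first row of Table 1
P = (1 << K) - SIGMA
MASK_W = (1 << W) - 1

def split(x):
    """Steps 1-2: five segments, 45 bits each except the 41-bit last one."""
    return [(x >> (W * t)) & MASK_W for t in range(N - 1)] + [x >> (W * (N - 1))]

def propagate(seg):
    """Step 6 style carry propagation; all segments but the last end below 2^45."""
    for t in range(len(seg) - 1):
        seg[t + 1] += seg[t] >> W
        seg[t] &= MASK_W

def modmul(a, b):
    a_seg, b_seg = split(a), split(b)
    r = [0] * (2 * N)
    for i in range(N):                        # steps 3-5: segment scan
        for j in range(N):
            prod = a_seg[i] * b_seg[j]
            r[i + j + 1] += prod >> W         # high part, carried by M_ij[0]
            r[i + j] += prod & MASK_W         # low part, carried by M_ij[1]
    propagate(r)                              # step 6
    factor = SIGMA << (W - R_BITS)            # step 7: 2^(N*W) = 48 mod p
    s = [r[t] + factor * r[t + N] for t in range(N)]
    propagate(s)
    while s[-1] >> R_BITS:                    # step 8: fold bits above 2^221
        carry = s[-1] >> R_BITS
        s[-1] &= (1 << R_BITS) - 1
        s[0] += SIGMA * carry
        propagate(s)
    value = sum(x << (W * t) for t, x in enumerate(s))
    return value - P if value >= P else value # step 9

a = (1 << 220) + 12345
b = (1 << 219) + 67890
print(modmul(a, b) == (a * b) % P)
```

A model like this is useful as an oracle when testing a floating-point implementation: both should agree on every input pair in $[0, p)$.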

Finally, Table 1 lists the parameters obtained by applying the method of the invention to seven prime fields commonly used in cryptography.

Table 1. Number of segments and segment lengths for common prime fields

p	k	σ	N	w	r
2²²¹-3	221	3	5	45	41
2²²²-117	222	117	5	45	42
2²⁵¹-9	251	9	5	51	47
2²⁵⁵-19	255	19	5	51	51
2³⁸²-105	382	105	8	48	46
2³⁸³-187	383	187	8	48	47
2⁴¹⁴-17	414	17	8	52	50

Based on the same inventive concept, another embodiment of the present invention provides an asymmetric cryptographic method, which includes a prime number domain large integer modular multiplication calculation, where the prime number domain large integer modular multiplication is calculated by the method of the present invention.

Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory storing a computer program configured to be executed by the processor, and a processor, the computer program comprising instructions for performing the steps of the inventive method.

Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program, which when executed by a computer, performs the steps of the inventive method.

The particular embodiments of the present invention disclosed above are illustrative only and are not intended to be limiting, since various alternatives, modifications, and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The invention should not be limited to the disclosure of the embodiments in the present specification, but the scope of the invention is defined by the appended claims.
