Method and apparatus for Differential Power Analysis (DPA) resilient security in a cryptographic processor

Document No.: 1048019 Publication date: 2020-10-09

Abstract: This technology, "Method and apparatus for Differential Power Analysis (DPA) resilient security in a cryptographic processor", was created by R·拉玛拉居 R·瓦蒂科达 S·辛哈罗伊 鲁德 庞博 on 2019-01-24. In certain aspects, a circuit includes a dynamic differential logic gate having a first output and a second output; and a first static differential logic gate having first and second outputs and first and second inputs coupled to the first and second outputs of the dynamic differential logic gate, respectively. The dynamic differential logic gate is configured to receive a clock signal and to preset both the first output and the second output of the dynamic differential logic gate to a first preset value during a first phase of the clock signal. The first static differential logic gate is configured to preset both the first output and the second output of the first static differential logic gate to a second preset value when the first preset value is input to both the first input and the second input of the first static differential logic gate.

1. A circuit, comprising:

a dynamic differential logic gate having a first output and a second output; and

a first static differential logic gate having first and second outputs and first and second inputs coupled to the first and second outputs of the dynamic differential logic gate, respectively;

wherein the dynamic differential logic gate is configured to receive a clock signal and is configured to preset both the first output and the second output of the dynamic differential logic gate to a first preset value during a first phase of the clock signal; and

wherein the first static differential logic gate is configured to preset both the first output and the second output of the first static differential logic gate to a second preset value when the first preset value is input to both the first input and the second input of the first static differential logic gate.

2. The circuit of claim 1, wherein the first preset value and the second preset value are opposite in logical value.

3. The circuit of claim 1, wherein the dynamic differential logic gate comprises a dynamic differential exclusive-OR (XOR) gate.

4. The circuit of claim 1, wherein during a second phase of the clock signal, the dynamic differential logic gate is configured to:

perform a first differential logic function on input data bits to generate a first pair of complementary data bits; and

output the first pair of complementary data bits at the first output and the second output of the dynamic differential logic gate.

5. The circuit of claim 4, wherein the first static differential logic gate is configured to:

perform a second differential logic function on at least the first pair of complementary data bits to generate a second pair of complementary data bits; and

output the second pair of complementary data bits at the first output and the second output of the first static differential logic gate.

6. The circuit of claim 5, wherein the clock signal is low during the first phase of the clock signal and high during the second phase of the clock signal.

7. The circuit of claim 5, wherein the clock signal is high during the first phase of the clock signal and low during the second phase of the clock signal.

8. The circuit of claim 5, further comprising:

a second static differential logic gate having first and second outputs and first and second inputs coupled to the first and second outputs of the first static differential logic gate, respectively;

wherein the second static differential logic gate is configured to preset both the first output and the second output of the second static differential logic gate to the first preset value when the second preset value is input to both the first input and the second input of the second static differential logic gate.

9. The circuit of claim 8, wherein the first preset value and the second preset value are opposite in logical value.

10. The circuit of claim 5, wherein the first differential logic function is a differential exclusive OR (XOR) function.

11. A processor, comprising:

a first differential latch configured to latch first complementary data and output the latched first complementary data; and

a first pipeline configured to perform a first operation on the latched first complementary data to generate second complementary data;

wherein the first pipeline comprises one or more dynamic differential logic gates in a first stage of the first pipeline and one or more static differential logic gates in a second stage of the first pipeline; and

wherein each of the one or more dynamic differential logic gates in the first stage is configured to receive a clock signal and preset a respective output to a first preset value during a first phase of the clock signal.

12. The processor of claim 11, wherein each of the one or more static differential logic gates in the second stage is configured to preset a respective output to a second preset value when the output of the one or more dynamic differential logic gates in the first stage is preset to the first preset value.

13. The processor of claim 12, wherein the first preset value and the second preset value are opposite in logical value.

14. The processor of claim 13, wherein:

the first pipeline includes one or more static differential logic gates in a third stage of the first pipeline; and

each of the one or more static differential logic gates in the third stage is configured to preset a respective output to the first preset value when the output of the one or more static differential logic gates in the second stage is preset to the second preset value.

15. The processor of claim 12, wherein the first differential latch is configured to output the latched first complementary data to the first pipeline during a second phase of the clock signal.

16. The processor of claim 15, wherein the clock signal is low during the first phase of the clock signal and high during the second phase of the clock signal.

17. The processor of claim 11, further comprising:

a second differential latch configured to latch the second complementary data and configured to output the latched second complementary data; and

a second pipeline configured to perform a second operation on the latched second complementary data to generate third complementary data.

18. The processor of claim 17, wherein:

the first differential latch is configured to output the latched first complementary data to the first pipeline during a second phase of the clock signal; and

the second differential latch is configured to output the latched second complementary data to the second pipeline during the first phase of the clock signal.

19. The processor of claim 18, wherein the clock signal is low during the first phase of the clock signal and the clock signal is high during the second phase of the clock signal.

20. The processor of claim 18, wherein the second pipeline comprises one or more static differential logic gates in a first stage of the second pipeline.

21. The processor of claim 17, wherein:

the first operation comprises a mix columns operation or an inverse mix columns operation; and

the second operation comprises a byte substitution operation or an inverse byte substitution operation.

22. A differential logic gate, comprising:

a first logic gate comprising:

a first plurality of p-type field effect transistors (PFETs) coupled in series between the first output and a supply rail;

a second plurality of PFETs coupled in series between the first output and the supply rail;

a first plurality of n-type field effect transistors (NFETs) coupled in series between the first output and ground;

a second plurality of NFETs coupled in series between the first output and the ground;

a second logic gate comprising:

a third plurality of PFETs coupled in series between a second output and the supply rail;

a fourth plurality of PFETs coupled in series between the second output and the supply rail;

a third plurality of NFETs coupled in series between the second output and the ground;

a fourth plurality of NFETs coupled in series between the second output and the ground; and

a plurality of inputs coupled to gates of the first plurality of PFETs, the second plurality of PFETs, the third plurality of PFETs, and the fourth plurality of PFETs, and gates of the first plurality of NFETs, the second plurality of NFETs, the third plurality of NFETs, and the fourth plurality of NFETs, such that the differential logic gate performs a differential logic function when a first pair of complementary bits is input to a first input and a second input of the plurality of inputs, and a second pair of complementary bits is input to a third input and a fourth input of the plurality of inputs.

23. The differential logic gate of claim 22, wherein the plurality of inputs are coupled to the gates of the first, second, third, and fourth pluralities of PFETs, and the gates of the first, second, third, and fourth pluralities of NFETs, such that when a second preset value is input to the first, second, third, and fourth ones of the plurality of inputs, both the first and second outputs are preset to a first preset value.

24. The differential logic gate of claim 23, wherein the first preset value and the second preset value are opposite in logical value.

25. The differential logic gate of claim 22, wherein the differential logic function is one of: a differential exclusive-OR (XOR) function, a differential OR function, and a differential AND function.

26. The differential logic gate of claim 22, wherein the first logic gate and the second logic gate are complementary to each other.

Technical Field

Aspects of the present disclosure relate generally to processors and, more particularly, to processors that are resilient to Differential Power Analysis (DPA) attacks.

Background

Sensitive data may be encrypted at the sending device to provide secure communication of the data to the receiving device. For example, a sending device may encrypt data using a key, and a receiving device may decrypt the encrypted data using the key, where the key is known only to the sending and receiving devices. To maintain security, the keys need to be protected from software attacks and hardware attacks. An increasingly popular technique for attackers to retrieve keys is Differential Power Analysis (DPA), in which the attacker measures the power profile of a cryptographic processor on the sending and/or receiving device to discern unique power signatures of 1 and 0. This allows an attacker to retrieve the key after compiling and analyzing enough power measurements.

Disclosure of Invention

The following presents a simplified summary of one or more implementations in order to provide a basic understanding of such implementations. This summary is not an extensive overview of all contemplated implementations, and is intended to neither identify key or critical elements of all implementations, nor delineate the scope of any or all implementations. Its sole purpose is to present some concepts of one or more implementations in a simplified form as a prelude to the more detailed description that is presented later.

A first aspect relates to a circuit. The circuit includes a dynamic differential logic gate having a first output and a second output; and a first static differential logic gate having first and second outputs and first and second inputs coupled to the first and second outputs of the dynamic differential logic gate, respectively. The dynamic differential logic gate is configured to receive a clock signal and is configured to preset both a first output and a second output of the dynamic differential logic gate to a first preset value during a first phase of the clock signal. The first static differential logic gate is configured to preset both a first output and a second output of the first static differential logic gate to a second preset value when a first preset value is input to both the first input and the second input of the first static differential logic gate.

A second aspect relates to a processor. The processor includes a first differential latch configured to latch first complementary data and output the latched first complementary data. The processor also includes a first pipeline configured to perform a first operation on the latched first complementary data to generate second complementary data. The first pipeline includes one or more dynamic differential logic gates in a first stage of the first pipeline and one or more static differential logic gates in a second stage of the first pipeline. Each of the one or more dynamic differential logic gates in the first stage is configured to receive the clock signal and preset a respective output to a first preset value during a first phase of the clock signal.

A third aspect relates to a differential logic gate. The differential logic gate includes a first logic gate and a second logic gate. The first logic gate includes a first plurality of p-type field effect transistors (PFETs) coupled in series between a first output and a supply rail, a second plurality of PFETs coupled in series between the first output and the supply rail, a first plurality of n-type field effect transistors (NFETs) coupled in series between the first output and ground, and a second plurality of NFETs coupled in series between the first output and ground. The second logic gate includes a third plurality of PFETs coupled in series between a second output and the supply rail, a fourth plurality of PFETs coupled in series between the second output and the supply rail, a third plurality of NFETs coupled in series between the second output and ground, and a fourth plurality of NFETs coupled in series between the second output and ground. The differential logic gate further includes a plurality of inputs coupled to gates of the first plurality of PFETs, the second plurality of PFETs, the third plurality of PFETs, and the fourth plurality of PFETs, and gates of the first plurality of NFETs, the second plurality of NFETs, the third plurality of NFETs, and the fourth plurality of NFETs, such that the differential logic gate performs a differential logic function when a first pair of complementary bits is input to a first input and a second input of the plurality of inputs, and a second pair of complementary bits is input to a third input and a fourth input of the plurality of inputs.

Drawings

Fig. 1 shows an example of a logic gate that is vulnerable to DPA attacks.

Fig. 2 shows an example of a static differential logic gate.

Fig. 3 illustrates an example of a dynamic differential logic gate in accordance with certain aspects of the present disclosure.

Fig. 4 illustrates an example of a circuit block that may be used to implement a differential logic gate in accordance with certain aspects of the present disclosure.

Fig. 5 illustrates an example of a static differential XOR gate in accordance with certain aspects of the present disclosure.

Fig. 6A illustrates a truth table for a static differential XOR gate in accordance with certain aspects of the present disclosure.

Fig. 6B illustrates a first preset table for a static differential XOR gate in accordance with certain aspects of the present disclosure.

Fig. 6C illustrates a second preset table for a static differential XOR gate in accordance with certain aspects of the present disclosure.

Fig. 7 illustrates an example of a dynamic differential XOR gate in accordance with certain aspects of the present disclosure.

Fig. 8 shows an example of glitches caused by input data bits arriving at the logic gate at different times.

Fig. 9A illustrates a timing diagram of an example of data bits arriving at a static differential XOR gate at different times, in accordance with certain aspects of the present disclosure.

Fig. 9B illustrates a timing diagram of another example of data bits arriving at a static differential XOR gate at different times, in accordance with certain aspects of the present disclosure.

Fig. 10 illustrates an example of a static differential AND gate in accordance with certain aspects of the present disclosure.

Fig. 11A illustrates a truth table for a static differential AND gate in accordance with certain aspects of the present disclosure.

Fig. 11B illustrates a first preset table for a static differential AND gate in accordance with certain aspects of the present disclosure.

Fig. 11C illustrates a second preset table for a static differential AND gate in accordance with certain aspects of the present disclosure.

Fig. 12 illustrates an example of a static differential OR gate in accordance with certain aspects of the present disclosure.

Fig. 13A illustrates a truth table for a static differential OR gate in accordance with certain aspects of the present disclosure.

Fig. 13B illustrates a first preset table for a static differential OR gate in accordance with certain aspects of the present disclosure.

Fig. 13C illustrates a second preset table for a static differential OR gate in accordance with certain aspects of the present disclosure.

Fig. 14 illustrates an example of a pipeline including a plurality of differential logic gates, in accordance with certain aspects of the present disclosure.

Fig. 15 illustrates an example of a differential latch in accordance with certain aspects of the present disclosure.

Fig. 16A illustrates an exemplary implementation of a differential latch in accordance with certain aspects of the present disclosure.

Fig. 16B illustrates another exemplary implementation of a differential latch in accordance with certain aspects of the present disclosure.

Fig. 17A illustrates an example encryption processor in accordance with certain aspects of the present disclosure.

Fig. 17B illustrates an example implementation of a portion of an encryption processor in accordance with certain aspects of the present disclosure.

Fig. 18A illustrates an example decryption processor in accordance with certain aspects of the present disclosure.

Fig. 18B illustrates an exemplary implementation of a portion of a decryption processor in accordance with certain aspects of the present disclosure.

Detailed Description

The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations and is not intended to represent the only configurations in which the concepts described herein may be practiced. The detailed description includes specific details to provide a thorough understanding of various concepts. It will be apparent, however, to one skilled in the art that these concepts may be practiced without these specific details. In some instances, well-known structures and components are shown in block diagram form in order to avoid obscuring the concepts.

Fig. 1 shows an example of a static logic gate 110 that is susceptible to a Differential Power Analysis (DPA) attack. The static logic gate 110 includes a P-type field effect transistor (PFET) P1 and an N-type field effect transistor (NFET) N1, wherein the gates of PFET P1 and NFET N1 are coupled to the input of the logic gate 110 (which is labeled "In") and the drains of PFET P1 and NFET N1 are coupled to the output of the logic gate 110 (which is labeled "Out"). The logic gate 110 also includes a load capacitor (labeled "CL") coupled to the output. In this example, the logic gate 110 implements an inverter.

When the input of logic gate 110 switches from 1 to 0 (i.e., a 1->0 transition), capacitor CL charges to the supply voltage Vdd through PFET P1, resulting in a large spike in the supply current flow. When the input of logic gate 110 switches from 0 to 1 (i.e., a 0->1 transition), capacitor CL discharges to ground through NFET N1. Finally, when the input to logic gate 110 remains the same for two adjacent input bits, little current flows. In this case, the input remains 1 for two adjacent bits (i.e., a 1->1 transition) or 0 for two adjacent bits (i.e., a 0->0 transition).

Thus, the supply current flow (and thus the power profile) of the logic gate 110 depends on the bit value at the input. The dependence of the power profile on the input bit values allows an attacker to identify the input bit values based on the power measurements.
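The data dependence described above can be illustrated with a toy model. The following is a minimal sketch rather than a circuit simulation: it assumes that supply energy is drawn only when the load capacitor CL is charged, counts each charging event as one unit, and uses hypothetical helper names (`inverter_output`, `inverter_supply_charges`, `supply_trace`) that do not appear in the disclosure.

```python
def inverter_output(bit: int) -> int:
    """Ideal inverter: Out is the logical complement of In."""
    return 1 - bit


def inverter_supply_charges(prev_in: int, next_in: int) -> int:
    """Return 1 if CL is charged from Vdd on this input transition, else 0.

    Assumption: supply current is drawn only when the output rises (0 -> 1),
    which happens when the input falls (1 -> 0). Discharging CL through the
    NFET and holding the output are modeled as drawing nothing.
    """
    prev_out = inverter_output(prev_in)
    next_out = inverter_output(next_in)
    return 1 if (prev_out == 0 and next_out == 1) else 0


def supply_trace(bits):
    """Per-transition supply charge events for a sequence of input bits."""
    return [inverter_supply_charges(p, n) for p, n in zip(bits, bits[1:])]


# Transitions: 1->0, 0->0, 0->1, 1->1, 1->0
trace = supply_trace([1, 0, 0, 1, 1, 0])
print(trace)  # [1, 0, 0, 0, 1]
```

In this simplified model, only the 1->0 input transitions show up in the supply trace, so an observer of the supply current learns where those transitions occur.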

Fig. 2 shows an example of a static differential logic gate 210 that is less susceptible to DPA attacks than logic gate 110. The static differential logic gate 210 includes a first PFET P1, a second PFET P2, a first NFET N1, and a second NFET N2, wherein the first PFET P1 and the second PFET P2 are cross-coupled. The differential logic gate 210 is configured to receive a pair of complementary inputs (labeled "In" and "$\overline{\mathrm{In}}$") and to output a pair of complementary outputs (labeled "Out" and "$\overline{\mathrm{Out}}$"). In the discussion that follows, the input In is referred to as the true input, and the input $\overline{\mathrm{In}}$ is referred to as the complementary input. Furthermore, the output Out is referred to as the true output, and the output $\overline{\mathrm{Out}}$ is referred to as the complementary output.

In this example, the differential logic gate 210 also includes two load capacitors (labeled "CL"), one of which is coupled to the true output Out and the other of which is coupled to the complementary output $\overline{\mathrm{Out}}$.

When the true input In to the differential logic gate 210 switches from 1 to 0 (i.e., a 1->0 transition), the load capacitor coupled to the true output Out is discharged to ground, and the load capacitor coupled to the complementary output $\overline{\mathrm{Out}}$ is charged to the supply voltage Vdd. When the true input In of the logic gate 210 switches from 0 to 1 (i.e., a 0->1 transition), the load capacitor coupled to the true output Out is charged to the supply voltage Vdd, and the load capacitor coupled to the complementary output $\overline{\mathrm{Out}}$ is discharged to ground. Note that the complementary input $\overline{\mathrm{In}}$ switches in the opposite direction to the true input In.

Thus, when the true input In switches logic values (i.e., 1->0 or 0->1), one of the load capacitors is charged while the other load capacitor is discharged. If the load capacitors are balanced (i.e., have approximately the same capacitance), then the supply current flow for 1->0 transitions is approximately the same as the supply current flow for 0->1 transitions. In other words, the power profiles for the 1->0 and 0->1 transitions are approximately the same. This makes it difficult for an attacker to distinguish between 1->0 transitions and 0->1 transitions based on power measurements. In contrast, the power profile for the 1->0 transition of logic gate 110 in Fig. 1 is different from the power profile for the 0->1 transition. Thus, differential logic gate 210 is less susceptible to DPA attacks than logic gate 110.

When the true input In to the differential logic gate 210 remains the same for two adjacent input bits, almost no current flows. Note that in this case, the complementary input $\overline{\mathrm{In}}$ also remains the same. Thus, the power profile for the case where the true input In remains the same for two adjacent bits (i.e., 1->1 or 0->0) differs from the power profile for the case where the true input In switches logic values (i.e., 1->0 or 0->1). As a result, an attacker can still use power measurements to distinguish between the case where the true input In remains the same for two adjacent input bits and the case where the true input In switches logic values. This allows an attacker to use DPA to determine when a data transition has occurred at the input In. Thus, while differential logic gate 210 provides less information to an attacker than logic gate 110 in Fig. 1, the differential logic gate 210 is still susceptible to DPA attacks. A better solution is therefore needed to protect against DPA attacks.
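The residual leakage of the static differential gate can be sketched with the same kind of toy model. The code below is an illustrative assumption, not the disclosed circuit: it treats gate 210 as a differential buffer whose true output follows the true input (as in the description of Fig. 2), counts one unit of supply charge for each load capacitor that rises from 0 to 1, and uses hypothetical names (`differential_rails`, `rails_charged`).

```python
def differential_rails(true_in: int):
    """Return (Out, Out_bar) for the toy gate; Out follows In."""
    return true_in, 1 - true_in


def rails_charged(prev_in: int, next_in: int) -> int:
    """Count load capacitors charged from Vdd on one input transition.

    Assumption: the two load capacitors are balanced, so each charging
    event costs the same single unit regardless of which rail rises.
    """
    prev = differential_rails(prev_in)
    nxt = differential_rails(next_in)
    return sum(1 for p, n in zip(prev, nxt) if p == 0 and n == 1)


# Supply charge events for every (previous bit, next bit) pair.
profile = {(p, n): rails_charged(p, n) for p in (0, 1) for n in (0, 1)}
for transition, charges in sorted(profile.items()):
    print(transition, charges)
```

Under these assumptions, the 1->0 and 0->1 transitions each charge exactly one capacitor and are indistinguishable, while the 0->0 and 1->1 cases charge none, which is the remaining distinction an attacker could exploit.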

To improve protection against DPA attacks, according to various aspects of the present disclosure, dynamic differential logic gates are provided. As discussed further below, the dynamic differential logic gate has an approximately uniform power profile for all possible input transitions (i.e., 1->0, 1->1, 0->0, and 0->1). A uniform power profile makes it more difficult for an attacker to identify logical bit values based on power measurements, thus making it more difficult for the attacker to retrieve keys or other sensitive information.

Fig. 3 illustrates an example of a dynamic differential logic gate 310 in accordance with certain aspects of the present disclosure. The dynamic differential logic gate 310 includes a static differential logic gate 315 configured to perform a logical operation on one or more pairs of complementary inputs (labeled "In" and "$\overline{\mathrm{In}}$") to generate a pair of complementary outputs (labeled "Out" and "$\overline{\mathrm{Out}}$"). Although one pair of complementary inputs is shown in Fig. 3 for simplicity, it should be appreciated that multiple pairs of complementary inputs may be input to the static differential logic gate 315. The static differential logic gate 315 may perform any of a variety of different logical operations including, for example, an exclusive-OR (XOR) operation, an OR operation, an AND operation, and the like.

Dynamic differential logic gate 310 also includes clock transistors that make dynamic differential logic gate 310 dynamic, as discussed further below. The clock transistors include a first PFET 322, a second PFET 324, and an NFET 326. The first PFET 322 is coupled between the true output Out of the static differential logic gate 315 and the supply rail Vdd, and the second PFET 324 is coupled between the complementary output $\overline{\mathrm{Out}}$ of the static differential logic gate 315 and the supply rail Vdd. NFET 326 is coupled between static differential logic gate 315 and ground. The gates of first PFET 322, second PFET 324, and NFET 326 are driven by a clock signal (labeled "CLK").

When the clock signal CLK is low (which is referred to as the "clock low phase"), the first PFET 322 and the second PFET 324 are turned on, and the NFET 326 is turned off. As a result, first PFET 322 and second PFET 324 couple both outputs Out and $\overline{\mathrm{Out}}$ of differential logic gate 315 to the supply rail Vdd. This presets both outputs Out and $\overline{\mathrm{Out}}$ to Vdd (i.e., logic 1), independent of the bit values at the inputs In and $\overline{\mathrm{In}}$.

When the clock signal CLK is high (which is referred to as the "clock high phase"), first PFET 322 and second PFET 324 are turned off, and NFET 326 is turned on. This allows the static differential logic gate 315 to perform a logical operation on the inputs In and $\overline{\mathrm{In}}$ and generate a corresponding pair of complementary outputs Out and $\overline{\mathrm{Out}}$. Because the outputs Out and $\overline{\mathrm{Out}}$ of the static differential logic gate 315 are complementary, the outputs Out and $\overline{\mathrm{Out}}$ are set to opposite logical values (i.e., one of the outputs remains at the preset value of 1 while the other output becomes 0). Thus, after both outputs are preset to 1 in the clock low phase, one of the outputs switches from 1 to 0 in the clock high phase, while the other output remains 1. Which output switches from 1 to 0 depends on the bit values at the inputs In and $\overline{\mathrm{In}}$ and on the logical operation performed by differential logic gate 315.

Thus, during each cycle (i.e., period) of the clock signal CLK, the outputs Out and $\overline{\mathrm{Out}}$ are preset to 1 in the respective clock low phase, and the outputs Out and $\overline{\mathrm{Out}}$ are evaluated so that one of the outputs switches from 1 to 0 during the respective clock high phase. If the differential logic gate 315 is balanced (i.e., the capacitances on both sides of the differential logic gate 315 are approximately the same), then the power profile for the true output Out switching from 1 to 0 is approximately the same as the power profile for the complementary output $\overline{\mathrm{Out}}$ switching from 1 to 0.

Because one of the outputs Out and $\overline{\mathrm{Out}}$ switches from 1 to 0 during the clock high phase of each clock cycle, and the power profile for the true output Out switching from 1 to 0 is approximately the same as the power profile for the complementary output $\overline{\mathrm{Out}}$ switching from 1 to 0, the power profile of each clock cycle is approximately the same. Note that even if adjacent input bits are the same, one of the outputs Out and $\overline{\mathrm{Out}}$ still switches from 1 to 0 in each clock cycle. This is because both outputs Out and $\overline{\mathrm{Out}}$ are preset to 1 during the clock low phase of each clock cycle, so that one of the outputs Out and $\overline{\mathrm{Out}}$ switches from 1 to 0 during the subsequent clock high phase. In other words, presetting the outputs Out and $\overline{\mathrm{Out}}$ to 1 during the clock low phase ensures that one of the outputs switches from 1 to 0 during the clock high phase even when the input bits remain the same.

Thus, the power profile of dynamic differential logic gate 310 is approximately uniform across clock cycles, independent of input transitions (i.e., 1->0, 1->1, 0->0, and 0->1). The uniform power profile makes it difficult for an attacker to identify the logical bit values based on the power measurements.
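The preset-then-evaluate behavior can be sketched as a small cycle-level model. This is a hedged illustration under stated assumptions: the inner static gate 315 is modeled as a differential buffer, both outputs are preset to 1 in the clock low phase, and the names `evaluate`, `clock_cycle`, and `activity` are hypothetical helpers introduced only for this example.

```python
def evaluate(true_in: int):
    """Toy stand-in for static differential gate 315 (assumed to be a buffer)."""
    return true_in, 1 - true_in


def clock_cycle(true_in: int):
    """Simulate one CLK period: preset in the low phase, evaluate in the high phase.

    Returns (falling, rising): the number of outputs that fall 1 -> 0 when the
    gate evaluates, and the number that must rise 0 -> 1 at the next preset.
    """
    preset = (1, 1)                # clock low: both outputs pulled to Vdd
    evaluated = evaluate(true_in)  # clock high: outputs become complementary
    falling = sum(1 for p, e in zip(preset, evaluated) if p == 1 and e == 0)
    rising = sum(1 for e, p in zip(evaluated, preset) if e == 0 and p == 1)
    return falling, rising


def activity(bits):
    """Switching activity per clock cycle for a sequence of input bits."""
    return [clock_cycle(b) for b in bits]


# The input sequence covers the 1->1, 1->0, 0->0, and 0->1 transitions.
print(activity([1, 1, 0, 0, 1]))
```

In this model every clock cycle shows exactly one falling output and one rising output, whatever the input sequence, which is the uniformity the text relies on when the two sides of the gate are balanced.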

In the above example, both outputs Out and $\overline{\mathrm{Out}}$ are preset to 1 during each clock cycle. However, it should be appreciated that the present disclosure is not limited to this example. Alternatively, both outputs Out and $\overline{\mathrm{Out}}$ may be preset to 0 during each clock cycle. In this example, due to the complementary nature of the outputs, one of the outputs Out and $\overline{\mathrm{Out}}$ switches from 0 to 1 after each preset. The complementary nature of the outputs ensures that one of the outputs switches logic values after each preset, and the balanced structure of the differential logic gate 315 helps to ensure that the power profile is the same regardless of which of the outputs switches logic values.

In the above example, the outputs Out and $\overline{\mathrm{Out}}$ are preset during the clock low phase and evaluated during the clock high phase. However, the present disclosure is not limited to this example. Alternatively, the outputs Out and $\overline{\mathrm{Out}}$ may be preset during the clock high phase and evaluated during the clock low phase. This may be accomplished, for example, by inverting the clock signal CLK and driving the gates of clock transistors 322, 324, and 326 using the inverted clock signal.

Fig. 4 illustrates an example circuit block 410 in accordance with certain aspects of the present disclosure. As discussed further below, circuit block 410 may be used to implement an XOR gate, an XNOR gate, an OR gate, a NOR gate, an AND gate, or a NAND gate. Circuit block 410 includes a first transistor stack 415 and a second transistor stack 425.

First transistor stack 415 includes first PFET 412, second PFET 414, first NFET 416, and second NFET 418. First PFET 412 and second PFET 414 are coupled in series between supply rail Vdd and output node 430, while first NFET 416 and second NFET 418 are coupled in series between output node 430 and ground.

Second transistor stack 425 includes third PFET 422, fourth PFET 424, third NFET 426, and fourth NFET 428. Third PFET 422 and fourth PFET 424 are coupled in series between supply rail Vdd and output node 430, while third NFET 426 and fourth NFET 428 are coupled in series between output node 430 and ground.

As shown in fig. 4, the first transistor stack 415 is coupled to the second transistor stack 425 at an output node 430. Output node 430 provides an output to a logic gate implemented using circuit block 410. The capacitance at output node 430 may include the drain-gate capacitances of second PFET414, fourth PFET424, first NFET 416, and third NFET 426.

As discussed above, circuit block 410 may be used to implement an XOR gate, an XNOR gate, an OR gate, a NOR gate, an AND gate, or a NAND gate. In this regard, circuit block 410 may be configured to implement any of these logic gates by coupling inputs to the gates of the transistors 412, 414, 416, 418, 422, 424, 426, and 428 according to the logic gate to be implemented, as discussed further below.

FIG. 5 illustrates an example of a static differential XOR gate 510 implemented based on circuit block 410 shown in FIG. 4, in accordance with certain aspects of the present disclosure. In this example, the static differential XOR gate 510 includes inputs a, $\overline{a}$, b, and $\overline{b}$. As discussed further below, inputs a and $\overline{a}$ are configured to receive a first pair of complementary data bits, and inputs b and $\overline{b}$ are configured to receive a second pair of complementary data bits. Each of the inputs may be implemented using a metal structure coupled to the gates of a respective subset of the transistors in the static differential XOR gate.

Static differential XOR gate 510 is configured to perform a differential XOR operation on a first pair of complementary data bits and a second pair of complementary data bits to generate a pair of complementary output data bits, which are output from outputs Out and $\overline{Out}$. In the following discussion, the output Out is referred to as the true output, and the output $\overline{Out}$ is referred to as the complementary output.

Static differential XOR gate 510 includes a single-output XOR gate 520 and a single-output XNOR gate 530. Single-output XOR gate 520 provides the true output Out of differential XOR gate 510, and single-output XNOR gate 530 provides the complementary output $\overline{Out}$ of differential XOR gate 510. XNOR gate 530 is the complement (i.e., the inverse) of XOR gate 520. Thus, in this example, the static differential XOR gate 510 is implemented using a pair of complementary single-output logic gates.

Each of XOR gate 520 and XNOR gate 530 is implemented using a separate instance (i.e., a copy) of circuit block 410. To distinguish between the two instances, the reference number for each transistor in XOR gate 520 includes a 1 in parentheses, while the reference number for each transistor in XNOR gate 530 includes a 2 in parentheses.

In the example shown in FIG. 5, XOR gate 520 is implemented by coupling input a to the gates of first PFET 412(1) and first NFET 416(1), coupling input $\overline{a}$ to the gates of fourth NFET 428(1) and fourth PFET 424(1), coupling input b to the gates of second NFET 418(1) and third PFET 422(1), and coupling input $\overline{b}$ to the gates of second PFET 414(1) and third NFET 426(1). The node between first NFET 416(1) and second NFET 418(1) is coupled to the node between third NFET 426(1) and fourth NFET 428(1). Output node 430(1) provides the true output Out of differential XOR gate 510.

XNOR gate 530 is implemented by coupling input a to the gates of first PFET 412(2) and first NFET 416(2), coupling input $\overline{a}$ to the gates of fourth PFET 424(2) and fourth NFET 428(2), coupling input b to the gates of second PFET 414(2) and third NFET 426(2), and coupling input $\overline{b}$ to the gates of second NFET 418(2) and third PFET 422(2). The node between first PFET 412(2) and second PFET 414(2) is coupled to the node between third PFET 422(2) and fourth PFET 424(2). Output node 430(2) provides the complementary output $\overline{Out}$ of differential XOR gate 510.

FIG. 6A shows a truth table for the differential XOR gate 510, in which inputs a and $\overline{a}$ have complementary logic values, and inputs b and $\overline{b}$ have complementary logic values. As can be seen from the truth table, the outputs Out and $\overline{Out}$ have complementary logic values. When the inputs a and b have different logic values, the true output Out is 1; and when the inputs a and b have the same logic value, the true output Out is 0.
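The wiring described for FIG. 5 and the truth table of FIG. 6A can be cross-checked with a small switch-level model. The sketch below is an illustration only, not part of the disclosed circuit. It assumes ideal switches (a PFET conducts when its gate is 0, an NFET when its gate is 1), and the names `xor_520`, `xnor_530`, `a_n`, and `b_n` are hypothetical.

```python
from itertools import product

def pfet(gate):
    """Ideal PFET: conducts when its gate is at logic 0."""
    return gate == 0

def nfet(gate):
    """Ideal NFET: conducts when its gate is at logic 1."""
    return gate == 1

def resolve(pull_up, pull_down):
    """Logic value of an output node; None if floating or contended."""
    if pull_up and not pull_down:
        return 1
    if pull_down and not pull_up:
        return 0
    return None

def xor_520(a, a_n, b, b_n):
    """Single-output XOR gate 520 (true output Out)."""
    # Series PFET stacks: 412(1)-414(1) and 422(1)-424(1).
    up = (pfet(a) and pfet(b_n)) or (pfet(b) and pfet(a_n))
    # NFET mid-nodes are tied: (416(1) or 426(1)), then (418(1) or 428(1)).
    down = (nfet(a) or nfet(b_n)) and (nfet(b) or nfet(a_n))
    return resolve(up, down)

def xnor_530(a, a_n, b, b_n):
    """Single-output XNOR gate 530 (complementary output)."""
    # PFET mid-nodes are tied: (412(2) or 422(2)), then (414(2) or 424(2)).
    up = (pfet(a) or pfet(b_n)) and (pfet(b) or pfet(a_n))
    # Series NFET stacks: 416(2)-418(2) and 426(2)-428(2).
    down = (nfet(a) and nfet(b_n)) or (nfet(b) and nfet(a_n))
    return resolve(up, down)

def truth_table():
    """Rows (a, b, Out, Out_n) for complementary input pairs."""
    rows = []
    for a, b in product((0, 1), repeat=2):
        a_n, b_n = 1 - a, 1 - b
        rows.append((a, b, xor_520(a, a_n, b, b_n), xnor_530(a, a_n, b, b_n)))
    return rows

if __name__ == "__main__":
    for row in truth_table():
        print(row)
```

Under these assumptions, each row has Out equal to the XOR of a and b, with the second output as its complement.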

Static differential XOR gate 510 may be used to implement a dynamic differential XOR gate. In this regard, FIG. 7 illustrates an example of a dynamic differential XOR gate 710 in accordance with certain aspects of the present disclosure. Dynamic differential XOR gate 710 includes the static differential XOR gate 510 and clock transistors. The clock transistors include a first clock PFET 722, a second clock PFET 724, and a clock NFET 726. The first clock PFET 722 is coupled between the true output Out of the static differential XOR gate 510 and the supply rail Vdd. Since the single-output XOR gate 520 provides the true output Out of the static differential XOR gate 510, the first clock PFET 722 is coupled between the output of the XOR gate 520 and the supply rail Vdd. The second clock PFET 724 is coupled between the complementary output $\overline{Out}$ of the static differential XOR gate 510 and the supply rail Vdd. Since the single-output XNOR gate 530 provides the complementary output $\overline{Out}$ of the static differential XOR gate 510, the second clock PFET 724 is coupled between the output of the XNOR gate 530 and the supply rail Vdd. Clock NFET 726 is coupled between static differential XOR gate 510 and ground. More specifically, clock NFET 726 is coupled between single-output XOR gate 520 and ground, and between single-output XNOR gate 530 and ground. The gates of first clock PFET 722, second clock PFET 724, and clock NFET 726 are driven by a clock signal (labeled "CLK").

When the clock signal CLK is low (referred to as the "clock low phase"), the first clock PFET 722 and the second clock PFET 724 are turned on, and the clock NFET 726 is turned off. As a result, the first clock PFET 722 and the second clock PFET 724 couple both outputs Out and $\overline{Out}$ of the dynamic differential XOR gate 710 to the supply rail Vdd. Thus, in this example, both outputs Out and $\overline{Out}$ are preset to Vdd (i.e., logic 1).

When the clock signal CLK is high (referred to as the "clock high phase"), the first clock PFET 722 and the second clock PFET 724 are turned off, and the clock NFET 726 is turned on. This allows the static differential XOR gate 510 to pull one of the outputs Out and $\overline{Out}$ to 0.

Thus, during each cycle (i.e., period) of the clock signal CLK, both outputs Out and $\overline{Out}$ are preset high (i.e., 1) in the respective clock low phase, and in the respective clock high phase one of the outputs Out and $\overline{Out}$ goes low (i.e., 0) while the other one of the outputs Out and $\overline{Out}$ remains high (i.e., 1). Thus, during the clock high phase, the output of one of the XOR gate 520 and the XNOR gate 530 switches logic states. Since XOR gate 520 and XNOR gate 530 are similar in structure (i.e., both are implemented using circuit block 410 in FIG. 4), the same number of capacitive nodes are charged/discharged regardless of which one of the outputs Out and $\overline{Out}$ switches logic states. As a result, during the clock high phase, the power profile of differential XOR gate 510 is approximately the same regardless of which one of the outputs Out and $\overline{Out}$ switches logic states. The power profile of dynamic differential XOR gate 710 is therefore approximately the same for each clock cycle, regardless of which one of the outputs Out and $\overline{Out}$ goes low during the clock cycle. Because the power profile is approximately uniform across clock cycles, it is difficult for an attacker to identify logical bit values based on power measurements.
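The data-independence argument above can be illustrated by counting output transitions per clock cycle. The following sketch is a behavioral abstraction only: it does not model capacitances or actual power, and the helper names `dynamic_xor_cycle` and `single_ended_toggles` are hypothetical, introduced purely for illustration.

```python
from itertools import product

def dynamic_xor_cycle(a, b):
    """One CLK cycle of a preset-high differential XOR.

    Returns the number of outputs that fall during the clock high
    (evaluate) phase after both were preset to 1 in the clock low phase.
    """
    preset = (1, 1)              # clock low: both outputs pulled to Vdd
    out = a ^ b                  # clock high: the static gate evaluates
    evaluated = (out, 1 - out)   # true and complementary outputs
    return sum(p - e for p, e in zip(preset, evaluated))

def single_ended_toggles(bits):
    """Output toggles of an ordinary, non-preset XOR over a data sequence."""
    outs = [a ^ b for a, b in bits]
    return [int(x != y) for x, y in zip(outs, outs[1:])]

if __name__ == "__main__":
    data = list(product((0, 1), repeat=2))
    # Differential, preset gate: the count is the same for every input pair.
    print([dynamic_xor_cycle(a, b) for a, b in data])
    # Single-ended gate: whether the output toggles depends on the data.
    print(single_ended_toggles(data))
```

In this abstraction the preset differential gate always has exactly one falling output per cycle, whereas the toggle pattern of the single-ended gate tracks the data, which is the kind of correlation a DPA attack relies on.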

As discussed above, presetting both outputs Out and $\overline{Out}$ of differential XOR gate 510 high (i.e., 1) results in a more uniform power profile that is resilient to DPA attacks. Alternatively, this may be accomplished by presetting both outputs Out and $\overline{Out}$ of differential XOR gate 510 low (i.e., 0). In this case, after both outputs are preset low, one of the outputs Out and $\overline{Out}$ switches from low to high.

The static differential XOR gate 510 may also be preset by inputting a preset input value to the inputs of the static differential XOR gate 510. For example, by inputting a preset input value of 0 to all inputs of the static differential XOR gate 510, both outputs Out and $\overline{Out}$ of the static differential XOR gate 510 may be preset high (i.e., 1). In this regard, FIG. 6B illustrates a first preset table in which a preset input value of 0 is input to each of the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential XOR gate 510, causing both outputs Out and $\overline{Out}$ to be preset high (i.e., 1). Therefore, during the preset phase, the same value (i.e., 0) is input to all the inputs to preset the outputs Out and $\overline{Out}$ high (i.e., 1).

In this example, a logic circuit (not shown in FIG. 5) may be coupled to the inputs of the static differential XOR gate 510. During the preset phase, the logic circuit inputs a preset input value of 0 to the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential XOR gate 510 to preset the outputs Out and $\overline{Out}$ high (i.e., 1). During an evaluation phase following the preset phase, the logic circuit inputs a first pair of complementary data bits to inputs a and $\overline{a}$, and inputs a second pair of complementary data bits to inputs b and $\overline{b}$. The data bits may include cryptographic data or other secure data. In response to the data bits, static differential XOR gate 510 switches one of the outputs Out and $\overline{Out}$ from high to low according to the truth table in FIG. 6A. Presetting the outputs Out and $\overline{Out}$ by inputting a preset input value to the inputs of the static differential XOR gate 510 removes the need for clock transistors 722, 724, and 726 to preset the outputs Out and $\overline{Out}$. This reduces power consumption by removing the switching power associated with switching clock transistors 722, 724, and 726. The logic circuit may include one or more dynamic differential logic gates, one or more static differential logic gates, or a combination of dynamic differential logic gates and static differential logic gates, examples of which are discussed below with reference to FIG. 14.

In another example, by inputting a preset input value of 1 to all inputs of the static differential XOR gate 510, both outputs Out and $\overline{Out}$ of the static differential XOR gate 510 may be preset low (i.e., 0). In this regard, FIG. 6C illustrates a second preset table in which a preset input value of 1 is input to each of the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential XOR gate 510, causing both outputs Out and $\overline{Out}$ to be preset low (i.e., 0).

In this example, a logic circuit (not shown in FIG. 5) may be coupled to the inputs of the static differential XOR gate 510. During the preset phase, the logic circuit inputs a preset input value of 1 to the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential XOR gate 510 to preset the outputs Out and $\overline{Out}$ low (i.e., 0). During an evaluation phase following the preset phase, the logic circuit inputs a first pair of complementary data bits to inputs a and $\overline{a}$, and inputs a second pair of complementary data bits to inputs b and $\overline{b}$. The data bits may include cryptographic data or other secure data. In response to the data bits, static differential XOR gate 510 switches one of the outputs Out and $\overline{Out}$ from low to high according to the truth table in FIG. 6A. The logic circuit may include one or more dynamic differential logic gates, one or more static differential logic gates, or a combination of dynamic differential logic gates and static differential logic gates, examples of which are described below with reference to FIG. 14.

Logic gates may be susceptible to glitches when data bits corresponding to different inputs of a logic gate arrive at the logic gate at different times. In this regard, FIG. 8 shows an example of a glitch for a logic gate 810 that performs an XOR function. In this example, logic gate 810 has two inputs (labeled "a" and "b") and one output (labeled "Out"). FIG. 8 shows a timing diagram in which a bit value of 1 is sent to input a of logic gate 810 and a bit value of 1 is sent to input b of logic gate 810. In this example, the bit value of 1 for input a reaches logic gate 810 before the bit value of 1 for input b. This causes the output Out of logic gate 810 to go high temporarily between the time the bit value for input a reaches logic gate 810 and the time the bit value for input b reaches logic gate 810, creating a glitch 820.
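The glitch mechanism can be reproduced with a simple event-ordered model of a single-ended XOR gate. This is a minimal sketch that assumes both inputs start at 0 before the new bits arrive (the starting values are an assumption made here for illustration); `xor_trace` and `count_transitions` are hypothetical helper names.

```python
def xor_trace(initial, events):
    """Output of a two-input XOR after each input event.

    initial: dict with the starting values of inputs 'a' and 'b'.
    events:  list of (input_name, new_value) in order of arrival.
    """
    state = dict(initial)
    trace = [state["a"] ^ state["b"]]
    for name, value in events:
        state[name] = value
        trace.append(state["a"] ^ state["b"])
    return trace

def count_transitions(trace):
    """Number of times the output changes value along the trace."""
    return sum(1 for x, y in zip(trace, trace[1:]) if x != y)

if __name__ == "__main__":
    # Assumed starting point: both inputs at 0; input a arrives before input b.
    skewed = xor_trace({"a": 0, "b": 0}, [("a", 1), ("b", 1)])
    print(skewed, count_transitions(skewed))
```

The output starts and ends at the same value but changes twice in between. The temporary high level corresponds to the glitch 820 described above.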

Presetting the static differential XOR gate 510 greatly reduces glitches that occur when data bits arrive at the inputs of the static differential XOR gate 510 at different times. In this regard, FIG. 9A shows an example in which a preset input value of 0 is initially input to the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential XOR gate 510, causing the static differential XOR gate 510 to preset the outputs Out and $\overline{Out}$ high (i.e., 1). In this example, a bit value of 1 is sent to input a and a bit value of 1 is sent to input b. Note that, because the bit value for input $\overline{a}$ is the complement of the bit value for input a, input $\overline{a}$ remains at 0, and because the bit value for input $\overline{b}$ is the complement of the bit value for input b, input $\overline{b}$ remains at 0.

In this example, the bit value for input a arrives before the bit value for input b. As shown in FIG. 9A, when the bit value of 1 for input a first reaches input a, both outputs Out and $\overline{Out}$ remain high (i.e., remain at the preset output value of 1). This is because both inputs b and $\overline{b}$ are still at the preset input value of 0. This can be illustrated with reference to FIG. 5. When both inputs b and $\overline{b}$ are at the preset input value of 0, both the second PFET 414(1) and the third PFET 422(1) in the XOR gate 520 are turned on. As a result, a conduction path between output node 430(1) and supply rail Vdd is maintained through third PFET 422(1) and fourth PFET 424(1). This keeps the true output Out high (i.e., at the preset output value of 1). A similar analysis may be performed on XNOR gate 530 to show that the complementary output $\overline{Out}$ also remains high (i.e., remains at the preset output value of 1).

When the bit value of 1 for input b reaches input b, the true output Out of the static differential XOR gate 510 switches from high to low according to the truth table in FIG. 6A. Thus, the true output Out switches logic states only once during the evaluation phase (i.e., when the bit value of 1 for input b arrives), thereby avoiding glitches. Note that in this example, the complementary output $\overline{Out}$ remains high (i.e., does not switch logic states).

FIG. 9B shows an example in which the bit value of 1 for input b arrives before the bit value of 1 for input a. As shown in FIG. 9B, when the bit value of 1 for input b first reaches input b, both outputs Out and $\overline{Out}$ remain high (i.e., at the preset output value of 1). This is because both inputs a and $\overline{a}$ are still at the preset input value of 0. This can be illustrated with reference to FIG. 5. When both inputs a and $\overline{a}$ are at the preset input value of 0, both the first PFET 412(1) and the fourth PFET 424(1) in the XOR gate 520 are turned on. As a result, a conduction path between output node 430(1) and supply rail Vdd is maintained through first PFET 412(1) and second PFET 414(1). This keeps the true output Out high (i.e., at the preset output value of 1). A similar analysis may be performed on XNOR gate 530 to show that the complementary output $\overline{Out}$ also remains high (i.e., remains at the preset output value of 1).

When the bit value 1 for input a reaches input a, the true output Out of the static differential logic gate 510 switches from high to low according to the truth table in FIG. 6A. Thus, the true output Out only switches logic states once during the evaluation phase (i.e., when the bit value of 1 for input a arrives), thereby avoiding glitches.

For other combinations of data bit values, the analysis discussed above with reference to FIGS. 9A and 9B may be repeated to show that presetting the static differential XOR gate 510 also greatly reduces glitches for those combinations. In general, presetting the static differential XOR gate 510 during the preset phase helps to ensure that neither output changes logic state more than once during the subsequent evaluation phase, thereby avoiding glitches.
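The analysis of FIGS. 9A and 9B can be repeated exhaustively with a switch-level sketch. The code below assumes ideal switches and the FIG. 5 wiring as described in the text, presets all four inputs to 0, and then lets the two input pairs arrive in either order. The names `outputs`, `evaluate`, `transitions`, and `check_all` are hypothetical.

```python
from itertools import permutations, product

def outputs(a, a_n, b, b_n):
    """(Out, Out_n) of the static differential XOR gate, ideal switches."""
    p = lambda g: g == 0   # PFET conducts when its gate is 0
    n = lambda g: g == 1   # NFET conducts when its gate is 1
    up_t = (p(a) and p(b_n)) or (p(b) and p(a_n))   # XOR gate 520 pull-up
    dn_t = (n(a) or n(b_n)) and (n(b) or n(a_n))    # XOR gate 520 pull-down
    up_c = (p(a) or p(b_n)) and (p(b) or p(a_n))    # XNOR gate 530 pull-up
    dn_c = (n(a) and n(b_n)) or (n(b) and n(a_n))   # XNOR gate 530 pull-down
    assert up_t != dn_t and up_c != dn_c, "floating or contended output"
    return int(up_t), int(up_c)

def evaluate(bit_a, bit_b, order):
    """Trace of (Out, Out_n) from preset through skewed data arrival."""
    state = {"a": 0, "a_n": 0, "b": 0, "b_n": 0}   # preset phase: all inputs 0
    trace = [outputs(**state)]
    rising = {"a": "a" if bit_a else "a_n",        # rail that rises for pair a
              "b": "b" if bit_b else "b_n"}        # rail that rises for pair b
    for pair in order:
        state[rising[pair]] = 1
        trace.append(outputs(**state))
    return trace

def transitions(trace, index):
    """Number of value changes of one output along the trace."""
    vals = [t[index] for t in trace]
    return sum(1 for x, y in zip(vals, vals[1:]) if x != y)

def check_all():
    """Transition counts (Out, Out_n) for every data value and arrival order."""
    results = []
    for bit_a, bit_b in product((0, 1), repeat=2):
        for order in permutations(("a", "b")):
            tr = evaluate(bit_a, bit_b, order)
            results.append((transitions(tr, 0), transitions(tr, 1)))
    return results

if __name__ == "__main__":
    print(evaluate(1, 1, ("a", "b")))
    print(check_all())
```

In this model, both outputs stay at the preset value after the first arrival and exactly one output switches after the second, for every data combination and either arrival order.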

FIG. 10 illustrates an example of a static differential AND gate 1010 implemented based on circuit block 410 shown in FIG. 4, in accordance with certain aspects of the present disclosure. In this example, the static differential AND gate 1010 includes inputs a, $\overline{a}$, b, and $\overline{b}$. As discussed further below, inputs a and $\overline{a}$ are configured to receive a first pair of complementary data bits, and inputs b and $\overline{b}$ are configured to receive a second pair of complementary data bits. Each of the inputs may be implemented using a metal structure coupled to the gates of a respective subset of the transistors in the static differential AND gate.

Static differential AND gate 1010 is configured to perform a differential AND operation on a first pair of complementary data bits and a second pair of complementary data bits to generate a pair of complementary output data bits, which are output from outputs Out and $\overline{Out}$. In the following discussion, the output Out is referred to as the true output, and the output $\overline{Out}$ is referred to as the complementary output.

The static differential AND gate 1010 includes a single-output AND gate 1020 and a single-output NAND gate 1030. The single-output AND gate 1020 provides the true output Out of the differential AND gate 1010, and the single-output NAND gate 1030 provides the complementary output $\overline{Out}$ of the differential AND gate 1010. Each of the AND gate 1020 and the NAND gate 1030 is implemented using a separate instance (i.e., a copy) of the circuit block 410. To distinguish between the two instances, the reference number for each transistor in the AND gate 1020 includes a 1 in parentheses, while the reference number for each transistor in the NAND gate 1030 includes a 2 in parentheses.

In the example shown in FIG. 10, AND gate 1020 is implemented by coupling input $\overline{a}$ to the gates of first PFET 412(1), first NFET 416(1), second NFET 418(1), and fourth PFET 424(1), and coupling input $\overline{b}$ to the gates of second PFET 414(1), third PFET 422(1), third NFET 426(1), and fourth NFET 428(1). Output node 430(1) provides the true output Out of differential AND gate 1010.

NAND gate 1030 is implemented by coupling input a to the gates of first PFET 412(2), second PFET 414(2), second NFET 418(2), and third NFET 426(2), and coupling input b to the gates of first NFET 416(2), third PFET 422(2), fourth PFET 424(2), and fourth NFET 428(2). Output node 430(2) provides the complementary output $\overline{Out}$ of differential AND gate 1010.

FIG. 11A shows a truth table for the differential AND gate 1010, in which inputs a and $\overline{a}$ have complementary logic values, and inputs b and $\overline{b}$ have complementary logic values. As can be seen from the truth table, the outputs Out and $\overline{Out}$ have complementary logic values. When input a and input b are both 1, the true output Out is 1; and when one or both of the inputs a and b is 0, the true output Out is 0.

The static differential AND gate 1010 may be preset by inputting a preset input value to the inputs of the static differential AND gate 1010. For example, by inputting a preset input value of 0 to all inputs of the static differential AND gate 1010, both outputs Out and $\overline{Out}$ of the static differential AND gate 1010 may be preset high (i.e., 1). In this regard, FIG. 11B shows a first preset table in which a preset input value of 0 is input to each of the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential AND gate 1010, thereby causing both outputs Out and $\overline{Out}$ to be preset high (i.e., 1).

In this example, a logic circuit (not shown in FIG. 10) may be coupled to the inputs of the static differential AND gate 1010. During the preset phase, the logic circuit inputs a preset input value of 0 to the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential AND gate 1010 to preset the outputs Out and $\overline{Out}$ high (i.e., 1). During an evaluation phase following the preset phase, the logic circuit inputs a first pair of complementary data bits to inputs a and $\overline{a}$, and inputs a second pair of complementary data bits to inputs b and $\overline{b}$. The data bits may include cryptographic data or other secure data. In response to the data bits, static differential AND gate 1010 switches one of the outputs Out and $\overline{Out}$ from high to low according to the truth table in FIG. 11A. The logic circuit may include one or more dynamic differential logic gates, one or more static differential logic gates, or a combination of dynamic differential logic gates and static differential logic gates, examples of which are discussed below with reference to FIG. 14.

In another example, by inputting a preset input value of 1 to all inputs of the static differential AND gate 1010, both outputs Out and $\overline{Out}$ of the static differential AND gate 1010 may be preset low (i.e., 0). In this regard, FIG. 11C shows a second preset table in which a preset input value of 1 is input to each of the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential AND gate 1010, thereby causing both outputs Out and $\overline{Out}$ to be preset low (i.e., 0).

In this example, a logic circuit (not shown in FIG. 10) may be coupled to the inputs of the static differential AND gate 1010. During the preset phase, the logic circuit inputs a preset input value of 1 to the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential AND gate 1010 to preset both outputs Out and $\overline{Out}$ low (i.e., 0). During an evaluation phase following the preset phase, the logic circuit inputs a first pair of complementary data bits to inputs a and $\overline{a}$, and inputs a second pair of complementary data bits to inputs b and $\overline{b}$. The data bits may include cryptographic data or other secure data. In response to the data bits, static differential AND gate 1010 switches one of the outputs Out and $\overline{Out}$ from low to high. The logic circuit may include one or more dynamic differential logic gates, one or more static differential logic gates, or a combination of dynamic differential logic gates and static differential logic gates, examples of which are discussed below with reference to FIG. 14.
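The truth table and the two preset conditions for the differential AND gate can be checked with a Boolean-level sketch. The model below is an assumption-laden illustration. It treats AND gate 1020 as a NOR-type network driven by the complemented inputs and NAND gate 1030 as a NAND-type network driven by the true inputs, which is consistent with De Morgan's law and with the preset tables described above. It ignores the internal conductive paths and assumes they do not alter the steady-state output. All function names are hypothetical.

```python
from itertools import product

def and_1020(a_n, b_n):
    """Single-output AND gate 1020, driven by the complemented inputs."""
    up = (a_n == 0) and (b_n == 0)    # series PFETs conduct when both gates are 0
    down = (a_n == 1) or (b_n == 1)   # either NFET stack conducts
    return 1 if up and not down else 0

def nand_1030(a, b):
    """Single-output NAND gate 1030, driven by the true inputs."""
    up = (a == 0) or (b == 0)         # either PFET stack conducts
    down = (a == 1) and (b == 1)      # series NFETs conduct when both gates are 1
    return 1 if up and not down else 0

def diff_and(a, a_n, b, b_n):
    """(Out, Out_n) of the static differential AND gate."""
    return and_1020(a_n, b_n), nand_1030(a, b)

if __name__ == "__main__":
    # Evaluation with complementary input pairs.
    for a, b in product((0, 1), repeat=2):
        print(a, b, diff_and(a, 1 - a, b, 1 - b))
    # Preset conditions: all four inputs driven to the same value.
    print("all inputs 0:", diff_and(0, 0, 0, 0))
    print("all inputs 1:", diff_and(1, 1, 1, 1))
```

With complementary inputs the two outputs are complementary, and with all four inputs equal both outputs take the same value, which is what lets the gate be preset without clock transistors.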

In the above examples, presetting the outputs Out and $\overline{Out}$ during the preset phase ensures that one of the outputs Out and $\overline{Out}$ switches logic states during the evaluation phase. Thus, the output of one of the AND gate 1020 and the NAND gate 1030 switches logic states during the evaluation phase. Since the structures of the AND gate 1020 and the NAND gate 1030 are similar (i.e., both are implemented using the circuit block 410 in FIG. 4), the same number of capacitive nodes are charged/discharged regardless of which one of the outputs Out and $\overline{Out}$ switches logic states. As a result, during the evaluation phase, the power profile of differential AND gate 1010 is approximately the same regardless of which one of the outputs Out and $\overline{Out}$ switches logic states. Thus, the power profile is approximately uniform, making it difficult for an attacker to identify the logic bit values based on power measurements.

In the example shown in FIG. 10, the AND gate 1020 includes a first conductive path 1040 between a first internal node 1042 and a second internal node 1044. First internal node 1042 is between the drain of first PFET 412(1) and the source of second PFET 414(1), and second internal node 1044 is between the source of first NFET 416(1) and the drain of second NFET 418(1). The AND gate 1020 also includes a second conductive path 1050 between a third internal node 1052 and a fourth internal node 1054. Third internal node 1052 is between the drain of third PFET 422(1) and the source of fourth PFET 424(1), while fourth internal node 1054 is between the source of third NFET 426(1) and the drain of fourth NFET 428(1).

The first and second conductive paths 1040, 1050 are used to set the capacitance at the internal nodes 1042, 1044, 1052, and 1054 to a known charge state during a preset of the and gate 1020. For example, if a preset input value of 1 is input to all inputs of the AND gate 1020 to preset the output Out to 0, then the capacitances at nodes 1042, 1044, 1052 and 1054 discharge to ground. In this case, the capacitance at internal node 1042 is discharged through first conduction path 1040 and second NFET 418(1) to ground, and the capacitance at node 1044 is discharged through second NFET 418(1) to ground. In addition, the capacitance at internal node 1052 discharges through second conductive path 1050 and fourth NFET 428(1) to ground, and the capacitance at node 1054 discharges through fourth NFET 428(1) to ground. Setting the capacitances at internal nodes 1042, 1044, 1052, and 1054 to known states of charge during each preset period reduces the dependence of the power profile on the data bit values, resulting in a more uniform power profile.

Similar to the AND gate 1020, the NAND gate 1030 includes a first conductive path 1060 between a first internal node 1062 and a second internal node 1064. First internal node 1062 is between the drain of first PFET 412(2) and the source of second PFET 414(2), and second internal node 1064 is between the source of first NFET 416(2) and the drain of second NFET 418(2). The NAND gate 1030 also includes a second conductive path 1070 between a third internal node 1072 and a fourth internal node 1074. Third internal node 1072 is between the drain of third PFET 422(2) and the source of fourth PFET 424(2), and fourth internal node 1074 is between the source of third NFET 426(2) and the drain of fourth NFET 428(2). Similar to the AND gate 1020, the first conductive path 1060 and the second conductive path 1070 are used to set the capacitances at the internal nodes 1062, 1064, 1072, and 1074 to a known state of charge during a preset of the NAND gate 1030.

Static differential AND gate 1010 may be used to implement a dynamic differential AND gate. This can be achieved, for example, by coupling a first clock PFET between the true output Out and a supply rail, coupling a second clock PFET between the complementary output $\overline{Out}$ and the supply rail, and coupling a clock NFET between the static differential AND gate 1010 and ground. In this example, the gates of the clock transistors are driven by the clock signal.

Fig. 12 illustrates an example of a static differential or gate 1210 implemented based on the circuit block 410 illustrated in fig. 4, in accordance with certain aspects of the present disclosure. The structure of the static differential OR gate 1210 is similar to that of the static differential AND gate 1010, with the inputs rearranged to perform a differential OR operation. Components common to differential and gate 1010 and differential or gate 1210 are identified by the same reference numerals.

In this example, the static differential OR gate 1210 includes inputs a, $\overline{a}$, b, and $\overline{b}$. As discussed further below, inputs a and $\overline{a}$ are configured to receive a first pair of complementary data bits, and inputs b and $\overline{b}$ are configured to receive a second pair of complementary data bits. Each of the inputs may be implemented as a metal structure coupled to the gates of a respective subset of the transistors in the static differential OR gate.

Static differential OR gate 1210 is configured to perform a differential OR operation on a first pair of complementary data bits and a second pair of complementary data bits to generate a pair of complementary output data bits, which are output from outputs Out and $\overline{Out}$. In the following discussion, the output Out is referred to as the true output, and the output $\overline{Out}$ is referred to as the complementary output.

The static differential OR gate 1210 includes a single-output OR gate 1220 and a single-output NOR gate 1230. Single-output OR gate 1220 provides the true output Out of differential OR gate 1210, while single-output NOR gate 1230 provides the complementary output $\overline{Out}$ of differential OR gate 1210. Each of OR gate 1220 and NOR gate 1230 is implemented using a separate instance (i.e., a copy) of circuit block 410. To distinguish between the two instances, the reference number for each transistor in OR gate 1220 includes a 1 in parentheses, while the reference number for each transistor in NOR gate 1230 includes a 2 in parentheses.

In the example shown in FIG. 12, OR gate 1220 is implemented by coupling input $\overline{a}$ to the gates of first PFET 412(1), second PFET 414(1), second NFET 418(1), and third NFET 426(1), and coupling input $\overline{b}$ to the gates of third PFET 422(1), fourth PFET 424(1), first NFET 416(1), and fourth NFET 428(1). The output node 430(1) provides the true output Out of the differential OR gate 1210.

NOR gate 1230 is implemented by coupling input a to the gates of first PFET 412(2), first NFET 416(2), second NFET 418(2), and fourth PFET 424(2), and coupling input b to the gates of second PFET 414(2), third PFET 422(2), third NFET 426(2), and fourth NFET 428(2). Output node 430(2) provides the complementary output $\overline{Out}$ of differential OR gate 1210.

FIG. 13A shows a truth table for the differential OR gate 1210, in which inputs a and $\overline{a}$ have complementary logic values, and inputs b and $\overline{b}$ have complementary logic values. As can be seen from the truth table, the outputs Out and $\overline{Out}$ have complementary logic values. The true output Out is 1 when one or both of the inputs a and b is 1, and 0 when both inputs a and b are 0.
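The relationship between the two single-output gates and the truth table can be written out with De Morgan's law. The expressions below are a sketch under the assumption that OR gate 1220 behaves as a NAND-type network driven by the complemented inputs and that NOR gate 1230 behaves as a NOR-type network driven by the true inputs. This assumption is inferred from the transistor wiring described for FIG. 12 and is not stated in this form in the text.

```latex
\[
\begin{aligned}
\text{Out} &= \overline{\,\overline{a}\cdot\overline{b}\,} = a + b
  && \text{(gate 1220, driven by } \overline{a},\ \overline{b}\text{)} \\
\overline{\text{Out}} &= \overline{a + b}
  && \text{(gate 1230, driven by } a,\ b\text{)}
\end{aligned}
\]
```

For complementary input pairs these two expressions are always complements of each other, which matches the truth table of FIG. 13A.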

The static differential OR gate 1210 may be preset by inputting a preset input value to the inputs of the static differential OR gate 1210. For example, by inputting a preset input value of 0 to all inputs of the static differential OR gate 1210, both outputs Out and $\overline{Out}$ of the static differential OR gate 1210 may be preset high (i.e., 1). In this regard, FIG. 13B shows a first preset table in which a preset input value of 0 is input to each of the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential OR gate 1210, thereby causing both outputs Out and $\overline{Out}$ to be preset high (i.e., 1).

In this example, a logic circuit (not shown in FIG. 12) may be coupled to the inputs of the static differential OR gate 1210. During the preset phase, the logic circuit inputs a preset input value of 0 to the inputs a, $\overline{a}$, b, and $\overline{b}$ of the static differential OR gate 1210 to preset both outputs Out and $\overline{Out}$ high (i.e., 1). During an evaluation phase following the preset phase, the logic circuit inputs a first pair of complementary data bits to inputs a and $\overline{a}$, and inputs a second pair of complementary data bits to inputs b and $\overline{b}$. The data bits may include cryptographic data or other secure data. In response to the data bits, static differential OR gate 1210 switches one of the outputs Out and $\overline{Out}$ from high to low according to the truth table in FIG. 13A. The logic circuit may include one or more dynamic differential logic gates, one or more static differential logic gates, or a combination of dynamic differential logic gates and static differential logic gates, examples of which are discussed below with reference to FIG. 14.

In another example, by inputting a preset input value of 1 to all inputs of the static differential OR gate 1210, the outputs Out and $\overline{Out}$ of the static differential OR gate 1210 may both be preset low (i.e., 0). In this regard, FIG. 13C shows a second preset table in which a preset input value of 1 is input to the inputs a, $\overline{a}$, b and $\overline{b}$ of the static differential OR gate 1210, thereby causing the outputs Out and $\overline{Out}$ to both be preset low (i.e., 0).

In this example, a logic circuit (not shown in FIG. 12) may be coupled to the inputs of the static differential OR gate 1210. During the preset phase, the logic circuit inputs a preset input value of 1 to the inputs a, $\overline{a}$, b and $\overline{b}$ of the static differential OR gate 1210 to preset both outputs Out and $\overline{Out}$ low (i.e., 0). During an evaluation phase following the preset phase, the logic circuit inputs a first pair of complementary data bits to inputs a and $\overline{a}$, and inputs a second pair of complementary data bits to inputs b and $\overline{b}$. The data bits may include cryptographic data or other secure data. In response to the data bits, one of the outputs Out and $\overline{Out}$ of the static differential OR gate 1210 switches from low to high according to the truth table in FIG. 13A. The logic circuit may include one or more dynamic differential logic gates, one or more static differential logic gates, or a combination of dynamic differential logic gates and static differential logic gates, examples of which are discussed below with reference to FIG. 14.

In the above examples, presetting the outputs Out and $\overline{Out}$ during the preset phase ensures that one of the outputs Out and $\overline{Out}$ switches logic states during the evaluation phase. Thus, the output of one of the OR gate 1220 and the NOR gate 1230 switches logic states during the evaluation phase. Since the structures of the OR gate 1220 and the NOR gate 1230 are similar (i.e., both are implemented using the circuit block 410 in FIG. 4), approximately the same number of capacitive nodes is charged/discharged regardless of which one of the outputs Out and $\overline{Out}$ switches logic states. As a result, the power profile of the differential OR gate 1210 is approximately the same regardless of which one of the outputs Out and $\overline{Out}$ switches logic states during the evaluation phase. Thus, the power profile is approximately uniform, making it difficult for an attacker to identify the logic bit values based on power measurements.
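The claim that one output switches per evaluation phase, whatever the data, can be checked with a small behavioral sketch. The gate model is an assumption for illustration (Out as a NAND of the complementary inputs, the complementary output as a NOR of the true inputs), and the helper name `toggles_per_evaluation` is hypothetical. The sketch counts output transitions only and does not model capacitance or actual power.

```python
from itertools import product


def differential_or(a, a_bar, b, b_bar):
    """Assumed behavioral model of the static differential OR gate."""
    return 1 - (a_bar & b_bar), 1 - (a | b)


def toggles_per_evaluation(preset_value):
    """For each data input (a, b), count the outputs that change state when
    the gate moves from the preset phase to the evaluation phase."""
    p = preset_value
    preset_out = differential_or(p, p, p, p)
    counts = {}
    for a, b in product((0, 1), repeat=2):
        eval_out = differential_or(a, 1 - a, b, 1 - b)
        counts[(a, b)] = sum(x != y for x, y in zip(preset_out, eval_out))
    return counts


print(toggles_per_evaluation(0))  # preset high: every entry is 1
print(toggles_per_evaluation(1))  # preset low: every entry is 1
```

In this model the transition count is 1 for every data input and for both preset polarities, so a count of output transitions reveals nothing about the data bits.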

As shown in fig. 12, or gate 1220 includes first and second conductive paths 1040, 1050 discussed above, and nor gate 1230 includes first and second conductive paths 1060, 1070 discussed above. For simplicity, the description of these conductive paths is not repeated herein.

The static differential OR gate 1210 may be used to implement a dynamic differential OR gate. This can be done, for example, by coupling a first clock PFET between the true output Out and a supply rail, coupling a second clock PFET between the complementary output $\overline{Out}$ and the supply rail, and coupling a clock NFET between the static differential OR gate 1210 and ground. In this example, the gates of the clock transistors are driven by the clock signal.

The exemplary differential logic gates discussed above may be cascaded to implement a pipeline that is resilient to DPA attacks. In this regard, FIG. 14 shows an exemplary pipeline 1410 having a plurality of stages (labeled "stage 1" through "stage 5"), where each stage includes one or more differential logic gates 1415, 1420, 1425, 1430, 1440, and 1450.

The first stage of the pipeline 1410 (which is labeled "stage 1") includes a first dynamic differential logic gate 1415 and a second dynamic differential logic gate 1420. Each of the dynamic differential logic gates may be implemented using the dynamic differential logic gate 310 shown in FIG. 3 or the dynamic differential XOR gate 710 shown in FIG. 7. Each of the subsequent stages of the pipeline 1410 (labeled "stage 2" through "stage 5") includes one or more static differential logic gates 1425, 1430, 1440, and 1450. Each of the static differential logic gates may be implemented using the static differential XOR gate 510 shown in FIG. 5, the static differential AND gate 1010 shown in FIG. 10, or the static differential OR gate 1210 shown in FIG. 12. Thus, in this example, the first stage of the pipeline 1410 includes dynamic differential logic gates 1415 and 1420, while the subsequent stages of the pipeline 1410 include static differential logic gates 1425, 1430, 1440, and 1450.

As discussed further below, the dynamic differential logic gates 1415 and 1420 in the first stage of the pipeline 1410 are used to preset the static differential logic gates 1425, 1430, 1440, and 1450 in the subsequent stages of the pipeline 1410, without requiring dynamic differential logic gates in the subsequent stages. The use of static differential logic gates in the subsequent stages reduces the power consumption of the pipeline 1410, because a static differential logic gate does not consume switching power for switching clock transistors.

In this example, each static differential logic gate is configured to preset the output of each static differential logic gate to a respective preset output value (i.e., 1 or 0) when the respective preset input value is input to the input of the static differential logic gate. Further, each static differential logic gate is configured to preset its output to a preset output value opposite to the preset input value input to the static differential logic gate. For example, if a preset input value of 1 (i.e., high) is input to the static differential logic gate, the static differential logic gate presets its output to a preset output value of 0 (i.e., low), and vice versa.

The first and second dynamic differential logic gates 1415 and 1420 receive a clock signal CLK that drives clock transistors (not shown in FIG. 14) in the first and second dynamic differential logic gates 1415 and 1420. Each cycle (i.e., period) of the clock signal CLK includes a preset phase and an evaluation phase. In one example, the preset phase occurs when the clock signal CLK is low, and the evaluation phase occurs when the clock signal CLK is high. In this example, the preset phase corresponds to the clock low phase discussed above, and the evaluation phase corresponds to the clock high phase discussed above.

During each preset phase, the first and second dynamic differential logic gates 1415 and 1420 preset their outputs 1416 and 1424 high (i.e., 1). In the example shown in FIG. 14, the output 1416 of the first dynamic differential logic gate 1415 is coupled to the input 1426 of the static differential logic gate 1425 in the second stage (which is labeled "stage 2"). Thus, during each preset phase, the first dynamic differential logic gate 1415 outputs a preset value of 1 to the static differential logic gate 1425 in the second stage, which causes the static differential logic gate 1425 in the second stage to preset its output 1428 low (i.e., 0). This is because the static differential logic gate 1425 presets its output 1428 to a preset value that is opposite to the preset value input to the static differential logic gate 1425.

The static differential logic gate 1425 in the second stage outputs a preset value of 0 to the input 1432 of the static differential logic gate 1430 in the third stage (which is labeled "stage 3"), which causes the static differential logic gate 1430 in the third stage to preset its output 1434 high (i.e., 1). The static differential logic gate 1430 in the third stage outputs a preset value of 1 to the input 1442 of the static differential logic gate 1440 in the fourth stage (which is labeled "stage 4"), which causes the static differential logic gate 1440 in the fourth stage to preset its output 1444 low (i.e., 0). Finally, the static differential logic gate 1440 in the fourth stage outputs a preset value of 0 to the input 1452 of the static differential logic gate 1450 in the fifth stage (which is labeled "stage 5"), which causes the static differential logic gate 1450 in the fifth stage to preset its output 1456 high (i.e., 1).

Thus, during each preset phase, the preset value at the output 1416 of the first dynamic differential logic gate 1415 causes the static differential logic gates 1425, 1430, 1440, and 1450 in the subsequent stages of the pipeline 1410 to preset their outputs. The preset output values of the static differential logic gates 1425, 1430, 1440, and 1450 alternate between low and high when moving from the second stage to the fifth stage. In FIG. 14, a label "PH" indicates that the corresponding differential logic gate has a high preset output value, and a label "PL" indicates that the corresponding differential logic gate has a low preset output value.
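The alternation of preset values along the pipeline follows directly from each static gate inverting its preset input. A minimal sketch, with a hypothetical helper name `preset_levels` and the stage count taken from the five-stage example:

```python
def preset_levels(num_stages, first_stage_preset=1):
    """Preset output value of each stage; stage 1 is the dynamic stage."""
    levels = [first_stage_preset]
    for _ in range(num_stages - 1):
        # Each static differential gate presets its output to the value
        # opposite to the preset value at its inputs.
        levels.append(1 - levels[-1])
    return levels


labels = ["PH" if value else "PL" for value in preset_levels(5)]
print(labels)  # ['PH', 'PL', 'PH', 'PL', 'PH']
```

Odd-numbered stages preset high and even-numbered stages preset low when the dynamic stage presets high.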

During each evaluation phase, the first dynamic differential logic gate 1415 receives one or more pairs of complementary input data bits at its inputs 1412 and performs a differential logic operation (e.g., a differential exclusive-or operation) on the one or more pairs of complementary input data bits to generate complementary output data bits. The data bits may include cryptographic data or other secure data.

The dynamic differential logic gate 1415 outputs the complementary output data bits to the static differential logic gate 1425 in the second stage. The static differential logic gate 1425 in the second stage performs a differential logic operation on the complementary data bits from the first dynamic differential logic gate 1415 to generate complementary output data bits, and the static differential logic gate 1425 outputs the complementary output data bits to the static differential logic gate 1430 in the third stage. The process continues through the subsequent stages of the pipeline, where each of the static differential logic gates 1430, 1440, and 1450 receives complementary data bits from one or more differential logic gates in a previous stage, performs a differential logic operation on the received data bits to generate complementary output data bits, and outputs the complementary output data bits to one or more differential logic gates in a subsequent stage. The static differential logic gate 1450 in the fifth stage may output its complementary output data bits from the pipeline 1410.

Thus, during each evaluation phase, the pipeline 1410 receives input data bits and performs an operation (e.g., an encryption operation or a decryption operation) on the input data bits, with each of the differential logic gates 1415, 1420, 1425, 1430, 1440, and 1450 in the pipeline 1410 performing a portion of the operation. Presetting the differential logic gates 1415, 1420, 1425, 1430, 1440, and 1450 during the previous preset phase helps to ensure monotonic signal propagation through the pipeline 1410 during the evaluation phase, thereby preventing glitches. In addition, presetting the differential logic gates makes the power profile of the differential logic gates more uniform (i.e., flat), which makes it difficult for an attacker to use power measurements to identify data bit values inside the pipeline 1410.

FIG. 14 also shows examples of additional connections that may be made in the pipeline 1410. In one example, the output 1424 of the second dynamic differential logic gate 1420 is coupled to the input 1446 of the static differential logic gate 1440 in the fourth stage. In this example, during each preset phase, the static differential logic gate 1440 receives a preset value of 1 from the static differential logic gate 1430 in the third stage and a preset value of 1 from the second dynamic differential logic gate 1420. Thus, during each preset phase, the static differential logic gate 1440 in the fourth stage receives the same preset value of 1 at all inputs 1442 and 1446, causing the static differential logic gate 1440 to preset the output 1444 low (i.e., 0). During each evaluation phase, the static differential logic gate 1440 receives complementary data bits from the static differential logic gate 1430 in the third stage and complementary data bits from the second dynamic differential logic gate 1420. The static differential logic gate 1440 performs a differential logic operation (e.g., a differential XOR operation) on the received data bits to generate complementary output data bits and outputs the complementary output data bits at the output 1444.

Fig. 14 also shows an example in which the output 1428 of the static differential logic gate 1425 in the second stage is coupled to the input 1454 of the static differential logic gate 1450 in the fifth stage. In this example, during each preset phase, the static differential logic gate 1450 receives a preset value of 0 from the static differential logic gate 1440 in the fourth stage and a preset value of 0 from the static differential logic gate 1425 in the second stage. Thus, during each preset phase, the static differential logic gate 1450 receives the same preset value of 0 at all inputs 1452 and 1454, resulting in the static differential logic gate 1450 presetting the output 1456 to high (i.e., 1). During each evaluation phase, the static differential logic gate 1450 receives a complementary data bit from the static differential logic gate 1440 in the fourth stage and from the static differential logic gate 1425 in the second stage. The static differential logic gate 1450 performs a differential logic operation (e.g., a differential exclusive-or operation) on the received data bits to generate complementary output data bits, and outputs the complementary output bits at an output 1456.

As a general rule, all inputs of a static differential logic gate should receive the same preset value during the preset phase, so that the static differential logic gate correctly presets its output. In this regard, FIG. 14 shows an example of a connection (shown in dashed lines) that violates the rule and is therefore not allowed in the pipeline 1410. This connection couples the output of the first dynamic differential logic gate 1415 to the static differential logic gate 1430 in the third stage. This connection violates the above rule because, during the preset phase, the first dynamic differential logic gate 1415 outputs a preset value that is different from the preset value output by the static differential logic gate 1425 in the second stage (i.e., the dynamic differential logic gate 1415 in the first stage outputs a preset value of 1, while the static differential logic gate 1425 in the second stage outputs a preset value of 0). As a result, if the connection (shown in dashed lines) were made, the static differential logic gate 1430 in the third stage would receive different preset values from the first dynamic differential logic gate 1415 and the static differential logic gate 1425 in the second stage. In FIG. 14, a large "X" on a connection indicates that the connection is not allowed. Note that the other exemplary connections shown in FIG. 14 conform to the rule described above.

It should be appreciated that the pipeline 1410 may include a different number of stages than the number of stages shown in FIG. 14. The logical operations performed by the differential logic gates in the pipeline 1410, and the connections between the differential logic gates in the pipeline 1410, may be selected to perform desired operations (e.g., encryption operations). In certain aspects, the connection requirements between differential logic gates comply with the following rules: during the preset phase, the static differential logic gate receives the same preset value at all inputs.
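The connection rule lends itself to a mechanical check. The sketch below is one possible formulation, assuming that a gate's preset output value is fixed by its stage number (high for odd stages, low for even stages, as in the five-stage example). The reference numerals are used as gate identifiers, and the names `STAGE`, `preset_output`, and `violations` are hypothetical.

```python
# Stage number of each gate in the example pipeline (reference numerals as IDs).
STAGE = {1415: 1, 1420: 1, 1425: 2, 1430: 3, 1440: 4, 1450: 5}


def preset_output(gate):
    """The dynamic stage presets high; each later stage inverts the previous one."""
    return 1 if STAGE[gate] % 2 == 1 else 0


def violations(connections):
    """Return the gates whose inputs would see mixed preset values.

    `connections` is a list of (source_gate, destination_gate) pairs.
    """
    seen = {}
    for src, dst in connections:
        seen.setdefault(dst, set()).add(preset_output(src))
    return sorted(dst for dst, values in seen.items() if len(values) > 1)


allowed = [(1415, 1425), (1425, 1430), (1430, 1440), (1440, 1450),
           (1420, 1440), (1425, 1450)]
print(violations(allowed))                   # []
print(violations(allowed + [(1415, 1430)]))  # [1430]
```

The disallowed dashed connection from gate 1415 to gate 1430 is flagged because gate 1430 would receive a preset value of 1 from stage 1 and a preset value of 0 from stage 2.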

Although two dynamic differential logic gates 1415 and 1420 are shown in the first stage in the example of FIG. 14, it should be appreciated that the first stage may include a greater number of dynamic differential logic gates. Additionally, it should be appreciated that each dynamic differential logic gate may receive one or more pairs of complementary data bits per clock cycle.

In certain aspects, incoming input data bits may be latched before being input to the pipeline 1410. In this regard, fig. 15 shows an example of a first differential latch 1510 and a second differential latch 1520. First differential latch 1510 has an input 1512 configured to receive complementary input data bits, and an output 1514 coupled to an input 1412 of first dynamic differential logic gate 1415. The second differential latch 1520 has inputs 1522 configured to receive complementary input data bits, and an output 1524 coupled to the inputs 1422 of the second dynamic differential logic gate 1420. Each of differential latches 1510 and 1520 receives a clock signal CLK for the sequential operation of the latches.

In operation, each of differential latches 1510 and 1520 may be opened when the clock signal is low (i.e., a clock low phase). During this time, the dynamic differential logic gates 1415 and 1420 may be in a preset phase, where the dynamic differential logic gates 1415 and 1420 preset their outputs to high.

Each of differential latches 1510 and 1520 may latch the data bits at the respective input on a rising edge of the clock signal, and output the latched data bits to the respective dynamic differential logic gate when the clock signal is high (i.e., a clock high phase). During this time, the dynamic differential logic gates 1415 and 1420 may be in an evaluation phase. The latched data bits from differential latches 1510 and 1520 help ensure that the data bits input to dynamic differential logic gates 1415 and 1420 are stable during the evaluation phase.

In the above example, differential latches 1510 and 1520 are open during the low phase of the clock, and dynamic differential logic gates 1415 and 1420 preset their outputs during the low phase of the clock. However, it should be appreciated that the present disclosure is not limited to this example. For example, the differential latches 1510 and 1520 may be open during the high phase of the clock, and the dynamic differential logic gates 1415 and 1420 may preset their outputs during the high phase of the clock. In this example, the differential latches 1510 and 1520 may latch the data bits at the respective input on a falling edge of the clock signal and output the latched data bits to the respective dynamic differential logic gate when the clock signal is low (i.e., a clock low phase). During this time, the dynamic differential logic gates 1415 and 1420 may be in an evaluation phase.

FIG. 16A illustrates an exemplary implementation of a differential latch 1610 according to certain aspects of the invention. Differential latch 1610 may be used to implement each of differential latches 1510 and 1520 in FIG. 15. In other words, each of differential latches 1510 and 1520 may be a separate instance (i.e., a copy) of differential latch 1610.

Differential latch 1610 includes inverters 1612 and 1614 coupled in series for generating signals CB and C from the clock signal CLK, where signal CB is an inverted version of the clock signal CLK and signal C is a delayed version of the clock signal CLK. The differential latch 1610 further includes a first transmission gate 1620, a second transmission gate 1625, a first inverter 1630, and a second inverter 1635. The first transmission gate 1620 is coupled to the true input of the latch (which is labeled "DIN") through an inverter 1616, while the second transmission gate 1625 is coupled to the complementary input of the latch (which is labeled "DINB") through an inverter 1618. Each of the transmission gates receives signals C and CB and is configured to be open when signals C and CB are low and high, respectively (i.e., when the clock signal CLK is low), and closed when signals C and CB are high and low, respectively (i.e., when the clock signal CLK is high). When the transmission gates 1620 and 1625 are open, the first transmission gate 1620 couples the true input to the first signal path 1632 and the second transmission gate 1625 couples the complementary input to the second signal path 1634. The first signal path 1632 is coupled to the true output of the latch (which is labeled "Q") through an inverter 1640, and the second signal path 1634 is coupled to the complementary output of the latch (which is labeled "QB") through an inverter 1645. Thus, in this example, the differential latch 1610 is open during the low phase of the clock.

First inverter 1630 has an input coupled to second signal path 1634 and an output coupled to first signal path 1632, and second inverter 1635 has an input coupled to first signal path 1632 and an output coupled to second signal path 1634. Each of the first inverter 1630 and the second inverter 1635 receives signals C and CB and is configured to be disabled when signals C and CB are low and high, respectively (i.e., when the clock signal CLK is low), and enabled when signals C and CB are high and low, respectively (i.e., when the clock signal CLK is high). Thus, the first inverter 1630 and the second inverter 1635 are disabled during the time that the transmission gates 1620 and 1625 are open and enabled during the time that the transmission gates 1620 and 1625 are closed.

When first inverter 1630 and second inverter 1635 are enabled, first inverter 1630 and second inverter 1635 (back-to-back coupled) latch complementary data bits on signal paths 1632 and 1634. The latched complementary data bits are output at the true output Q and the complementary output QB of the latch. Thus, in this example, the differential latch 1610 latches the complementary data bits at the inputs DIN and DINB of the latch on the rising edge of the clock signal and outputs the latched complementary data bits at the outputs Q and QB of the latch during the high phase of the clock.

In this example, rather than using separate latches to latch the true data bits and the complementary data bits, respectively, the differential latch 1610 is used to latch the true data bits and the complementary data bits. The differential latch 1610 is more resilient to timing attacks than using separate latches for the true data bits and complementary data bits.

FIG. 16B illustrates another example differential latch 1650 in accordance with certain aspects of the disclosure. Differential latch 1650 is similar to differential latch 1610 in FIG. 16A, and components common to both latches are identified by the same reference numerals. As shown in FIG. 16B, the differential latch 1650 differs from the differential latch 1610 of FIG. 16A in that the signals C and CB input to the transmission gates 1620 and 1625 and the inverters 1630 and 1635 are inverted. As a result, the differential latch 1650 is open when the clock signal CLK is high (i.e., the clock high phase). The differential latch 1650 latches the data bits at the inputs DIN and DINB on the falling edge of the clock signal and outputs the latched data bits to the respective dynamic differential logic gates when the clock signal is low (i.e., a clock low phase).
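The two latch polarities can be summarized with a level-sensitive behavioral model. This is a sketch only: it abstracts the transmission gates and the cross-coupled inverters into a single "transparent or hold" decision and ignores the delay between signals C and CB. The class name `DifferentialLatch` and the parameter `transparent_level` are hypothetical.

```python
class DifferentialLatch:
    """Level-sensitive differential latch (behavioral sketch).

    transparent_level=0 models a latch that is open while CLK is low;
    transparent_level=1 models the opposite polarity.
    """

    def __init__(self, transparent_level=0):
        self.transparent_level = transparent_level
        self.q, self.qb = 0, 1

    def step(self, clk, din, dinb):
        if clk == self.transparent_level:
            # Transmission gates pass the inputs; the cross-coupled
            # inverters are disabled, so the outputs follow DIN and DINB.
            self.q, self.qb = din, dinb
        # Otherwise the cross-coupled inverters hold the last captured pair.
        return self.q, self.qb


latch = DifferentialLatch(transparent_level=0)
trace = [latch.step(clk, d, 1 - d)
         for clk, d in [(0, 1), (1, 0), (1, 1), (0, 0), (1, 1)]]
print(trace)  # [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1)]
```

Because the true and complementary bits are captured by the same clock event in one structure, Q and QB always change together in this model.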

The exemplary differential logic gates and pipelines discussed above may be used to implement encryption and/or decryption processors that are resilient to DPA attacks. In this regard, FIG. 17A illustrates an exemplary encryption processor 1705 configured to encrypt input data (which is labeled "plaintext") into encrypted data (which is labeled "ciphertext"). In this example, the encryption processor 1705 encrypts the data according to the Advanced Encryption Standard (AES) established by the National Institute of Standards and Technology (NIST). As discussed further below, in this example, encryption involves key addition operations, byte substitution operations, shift row operations, and mix column operations.

The encryption processor 1705 includes a first latch 1710, a mix column processor 1720, a first key adder 1730, a multiplexer 1735, a second latch 1740, a shift row and S-Box processor 1745, a second key adder 1725, a third key adder 1750, and a third latch 1755. The multiplexer 1735 is configured to couple the output of the first key adder 1730 or the second key adder 1725 to the second latch 1740 based on a round select signal. As discussed further below, the encryption processor 1705 is configured to encrypt data over multiple rounds (e.g., 12 rounds), where the data is repeatedly processed over the multiple rounds by the mix column processor 1720, the first key adder 1730, and the shift row and S-Box processor 1745 to generate encrypted data (which is labeled "ciphertext").

In operation, the second key adder 1725 receives input data to be encrypted (which is labeled "plaintext") and adds a key to the input data according to AES. The second key adder 1725 may be implemented using differential XOR gates. The multiplexer 1735 then couples the data from the second key adder 1725 to the second latch 1740, which latches the data and outputs the latched data to the shift row and S-Box processor 1745. The shift row and S-Box processor 1745 performs a shift row operation and a byte substitution operation on the data according to AES (e.g., using differential XOR gates and cross-wiring). The data output from the shift row and S-Box processor 1745 is then input to the first latch 1710 via loop 1712. The first latch 1710 latches the data and outputs the latched data to the mix column processor 1720. The mix column processor 1720 performs a mix column operation on the data according to AES (e.g., using differential logic gates). The first key adder 1730 (e.g., using differential XOR gates) adds a key to the data output from the mix column processor 1720. The multiplexer 1735 then couples the data output from the first key adder 1730 to the second latch 1740. Then, the above process is repeated. In this regard, the multiplexer 1735 couples the output of the first key adder 1730 to the second latch 1740 over multiple rounds to repeat the above process over the multiple rounds.

At the end of the last round, a third key adder 1750 adds keys to the data output from the shift row and S-Box processor 1745. The third latch 1755 latches the data from the third key adder 1750 and outputs the latched data as encrypted data (which is labeled "ciphertext").
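The order in which data passes through the blocks can be written out as a simple trace. The sketch below only lists block names in sequence and performs no cryptographic computation. The function name `encryption_schedule` and the parameter `num_rounds` are hypothetical (AES specifies 10, 12, or 14 rounds depending on key size), and the assumption that the last round bypasses the mix column processor is inferred from the third key adder taking its input from the shift row and S-Box processor.

```python
def encryption_schedule(num_rounds):
    """List the blocks that the data passes through, in order."""
    ops = ["key adder 1725"]                 # initial key addition
    for rnd in range(1, num_rounds + 1):
        ops.append("shift row and S-Box 1745")
        if rnd < num_rounds:
            ops.append("mix column 1720")
            ops.append("key adder 1730")     # fed back through multiplexer 1735
        else:
            ops.append("key adder 1750")     # last round: output to latch 1755
    return ops


for step in encryption_schedule(3):
    print(step)
```

With `num_rounds` set to 12, the trace contains 13 key additions in total: one initial, eleven in the loop, and one final.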

In certain aspects, the mix column processor 1720, the first key adder 1730, and the shift row and S-Box processor 1745 may be implemented using pipelines. In this regard, FIG. 17B illustrates first and second pipelines 1760 and 1765 according to certain aspects of the disclosure. The first pipeline 1760 may implement the mix column processor 1720 and the first key adder 1730 shown in FIG. 17A. In this example, the first pipeline 1760 has a plurality of stages, where each stage includes one or more differential logic gates. The first stage may include one or more dynamic differential logic gates, while each of the subsequent stages includes one or more static differential logic gates. Note that for ease of illustration, the various connections between the differential logic gates are not explicitly shown in FIG. 17B.

In operation, when the clock signal is low (i.e., the clock low phase), the first differential latch 1710 is open and the dynamic differential logic gates in the first stage of the first pipeline 1760 preset their outputs high. The high preset value at the outputs of the dynamic differential logic gates causes the static differential logic gates in the subsequent stages of the first pipeline 1760 to preset their outputs. The preset output values of the differential logic gates in the first pipeline 1760 alternate between high and low across the pipeline 1760. In FIG. 17B, the label "PH" indicates a high preset output value, and the label "PL" indicates a low preset output value.

The first differential latch 1710 latches the data bits on the rising edge of the clock signal and outputs the latched data bits to the first pipeline 1760 while the clock signal is high (i.e., the clock high phase). The differential logic gates in the first pipeline 1760 then perform the mix column operation and the key addition operation of the mix column processor 1720 and the first key adder 1730, respectively. The output data of the first pipeline 1760 is input to the second latch 1740. In this example, the preset phase of the first pipeline 1760 corresponds to the clock low phase, and the evaluation phase of the first pipeline 1760 corresponds to the clock high phase, as shown in FIG. 17B. The first differential latch 1710 may be implemented using one or more of the differential latches 1510 and 1520 shown in FIG. 15.

The second pipeline 1765 may implement the shift row and S-Box processor 1745 shown in FIG. 17A. In this example, the second pipeline 1765 has a plurality of stages, where each stage includes one or more differential logic gates. A first stage may include one or more dynamic logic gates and each of the subsequent stages includes one or more static logic gates. Alternatively, the first stage may also include one or more static differential logic gates.

In operation, the second differential latch 1740 is open when the clock signal is high (i.e., the clock high phase). For the example where the first stage of the second pipeline 1765 includes dynamic differential logic gates, the dynamic differential logic gates preset their outputs high during the clock high phase, causing the static differential logic gates in the subsequent stages of the second pipeline 1765 to preset their outputs. For the example where the first stage of the second pipeline 1765 includes static differential logic gates, the preset output value of the last stage of the first pipeline 1760 may flow into the first stage of the second pipeline 1765 through the second differential latch 1740 during the time that the second differential latch 1740 is open. As shown in FIG. 17B, the preset output values of the differential logic gates in the second pipeline 1765 alternate between high and low.

The second differential latch 1740 latches the data bits on the falling edge of the clock signal and outputs the latched data bits to the second pipeline 1765 while the clock signal is low (i.e., the clock low phase). The differential logic gates in the second pipeline 1765 then perform the operations of the shift row and S-Box processor 1745 on the data bits. As shown in FIG. 17B, in this example, the preset phase of the second pipeline 1765 corresponds to the clock high phase, and the evaluation phase of the second pipeline 1765 corresponds to the clock low phase.

Thus, the preset phases of the first and second pipelines 1760 and 1765 correspond to opposite phases of the clock signal, and the evaluation phases of the first and second pipelines 1760 and 1765 correspond to opposite phases of the clock signal. In the above example, the preset phase and the evaluation phase of the first pipeline 1760 correspond to the clock low phase and the clock high phase, respectively, and the preset phase and the evaluation phase of the second pipeline 1765 correspond to the clock high phase and the clock low phase, respectively. However, it should be appreciated that this may be reversed such that the preset and evaluation phases of the first pipeline 1760 correspond to the clock high phase and the clock low phase, respectively, and the preset and evaluation phases of the second pipeline 1765 correspond to the clock low phase and the clock high phase, respectively. In this case, the first differential latch 1710 is open in the clock high phase and latches in the clock low phase, and the second differential latch 1740 is open in the clock low phase and latches in the clock high phase. Each of the first differential latch 1710 and the second differential latch 1740 may be implemented using one or more of the differential latches 1510 and 1520 shown in FIG. 15.
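The opposite-phase relationship between the two pipelines can be expressed as a small lookup. The function name `pipeline_phases` and the flag `first_presets_on_low` are hypothetical; the flag selects between the arrangement in the example above and the reversed arrangement.

```python
def pipeline_phases(clk, first_presets_on_low=True):
    """Return (phase of first pipeline, phase of second pipeline) for a clock level."""
    first = "preset" if (clk == 0) == first_presets_on_low else "evaluate"
    # The second pipeline always operates in the opposite phase.
    second = "evaluate" if first == "preset" else "preset"
    return first, second


for clk in (0, 1):
    print(clk, pipeline_phases(clk))
# 0 ('preset', 'evaluate')
# 1 ('evaluate', 'preset')
```

At every clock level exactly one pipeline is evaluating while the other is presetting, in either arrangement.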

In the above example, the first pipeline 1760 implements the mix column processor 1720 and the first key adder 1730, and the second pipeline 1765 implements the shift row and S-Box processor 1745. However, it should be appreciated that the present disclosure is not limited to this example. In general, the operations of the mix column processor 1720, the first key adder 1730, and the shift row and S-Box processor 1745 may be split differently between the first and second pipelines 1760 and 1765 without affecting functionality, so long as the correct order of operations is maintained.

FIG. 18A illustrates an exemplary decryption processor 1805 configured to decrypt encrypted data (labeled "ciphertext") into decrypted data (labeled "plaintext"). In this example, the decryption processor 1805 decrypts the encrypted data according to AES. As discussed further below, in this example, decryption involves key addition operations, inverse byte substitution operations, inverse shift row operations, and inverse mix column operations.

The decryption processor 1805 includes a first latch 1810, an inverse mix column processor 1820, a first key adder 1830, a multiplexer 1835, a second latch 1840, an inverse shift row and S-Box processor 1845, a second key adder 1825, a third key adder 1850, and a third latch 1855. The multiplexer 1835 is configured to couple the output of either the inverse mix column processor 1820 or the second key adder 1825 to the second latch 1840 based on a round select signal. As discussed further below, the decryption processor 1805 is configured to decrypt data in multiple rounds (e.g., 12 rounds) in which the data is repeatedly processed by the inverse mix column processor 1820, the first key adder 1830, and the inverse shift row and S-Box processor 1845 to generate the decrypted data (labeled "plaintext").

In operation, the second key adder 1825 receives the encrypted data (labeled "ciphertext") and adds a key to the encrypted data according to AES. The second key adder 1825 may be implemented using a differential exclusive-OR (XOR) gate. The multiplexer 1835 then couples the data from the second key adder 1825 to the second latch 1840, which latches the data and outputs the latched data to the inverse shift row and S-Box processor 1845. The inverse shift row and S-Box processor 1845 performs inverse shift row operations and inverse byte substitution operations on the data according to AES (e.g., using differential XOR gates and cross-wiring). The data output from the inverse shift row and S-Box processor 1845 is then input to the first latch 1810 via loop 1812. The first latch 1810 latches the data and outputs the latched data to the first key adder 1830, which adds a key to the data. The inverse mix column processor 1820 performs inverse mix column operations on the data from the first key adder 1830 according to AES (e.g., using differential logic). The multiplexer 1835 then couples the data output from the inverse mix column processor 1820 to the second latch 1840, and the above process is repeated. In this regard, the multiplexer 1835 couples the output of the inverse mix column processor 1820 to the second latch 1840 over multiple rounds so that the above process repeats over multiple rounds.

At the end of the last round, the third key adder 1850 adds a key to the data output from the inverse shift row and S-Box processor 1845. The third latch 1855 latches the data from the third key adder 1850 and outputs the latched data as the decrypted data (labeled "plaintext").
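
The order in which data passes through the blocks of FIG. 18A over a full decryption can be listed explicitly. The following Python sketch is illustrative only: the function name `decryption_schedule` and the `num_rounds` parameter are hypothetical, and the code records which block acts at each step without performing any AES arithmetic.

```python
def decryption_schedule(num_rounds):
    """Return the ordered (block, operation) steps for one decryption.

    The ordering follows the data flow described above: an initial key
    addition, then in each round the inverse shift row and S-Box step,
    followed by a key addition and inverse mix columns in every round
    except the last, and a final key addition after the last round.
    """
    if num_rounds < 1:
        raise ValueError("num_rounds must be at least 1")

    sbox_step = ("inverse shift row and S-Box processor 1845",
                 "inverse shift rows + inverse byte substitution")

    # The multiplexer 1835 selects the second key adder for the first pass.
    steps = [("second key adder 1825", "add key")]
    for rnd in range(1, num_rounds + 1):
        steps.append(sbox_step)
        if rnd < num_rounds:
            # Loop 1812 feeds the first latch; the multiplexer then selects
            # the inverse mix column output for the next round.
            steps.append(("first key adder 1830", "add key"))
            steps.append(("inverse mix column processor 1820", "inverse mix columns"))
        else:
            # Last round: the result goes to the third key adder instead.
            steps.append(("third key adder 1850", "add key"))
    return steps


if __name__ == "__main__":
    for block, operation in decryption_schedule(3):
        print(f"{block}: {operation}")
```

Under this model a decryption with N rounds contains N + 1 key additions and N - 1 inverse mix column steps, so the inverse mix column processor is skipped only in the last round.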

In certain aspects, the inverse mix column processor 1820, the first key adder 1830, and the inverse shift row and S-Box processor 1845 may be implemented using pipelines. In this regard, FIG. 18B illustrates a first pipeline 1860 and a second pipeline 1865 according to certain aspects of the present disclosure. The first pipeline 1860 may implement the first key adder 1830 and the inverse mix column processor 1820 shown in FIG. 18A. In this example, the first pipeline 1860 has multiple stages, where each stage includes one or more differential logic gates. The first stage may include one or more dynamic differential logic gates, and each of the subsequent stages includes one or more static differential logic gates. Note that, for ease of illustration, the various connections between the differential logic gates are not explicitly shown in FIG. 18B.

In operation, the first differential latch 1810 is transparent when the clock signal is low (i.e., during the clock low phase), and the dynamic differential logic gates in the first stage of the first pipeline 1860 preset their outputs high during the clock low phase. The high preset values at the outputs of the dynamic differential logic gates cause the static differential logic gates in the subsequent stages of the first pipeline 1860 to preset their outputs. As shown in FIG. 18B, the preset output values of the differential logic gates in the first pipeline 1860 alternate between high and low across the pipeline 1860.
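
The way a preset value ripples through a pipeline can be sketched in a few lines of Python. This is a behavioral illustration only, under the assumption (inferred from the alternating preset values in FIG. 18B) that each static differential logic gate drives both of its outputs to the opposite level when both of its inputs carry the same preset level. The function names `static_gate_preset` and `pipeline_preset` are hypothetical.

```python
def static_gate_preset(inputs):
    """Preset response of an (assumed inverting) static differential gate.

    `inputs` is the (true, complement) output pair of the previous stage.
    During preset both rails carry the same value, and the gate drives
    both of its own rails to the opposite value.
    """
    true_rail, comp_rail = inputs
    if true_rail != comp_rail:
        raise ValueError("rails differ: the previous stage is evaluating, not preset")
    flipped = 1 - true_rail
    return (flipped, flipped)


def pipeline_preset(num_stages, dynamic_preset=1):
    """Return the preset (true, complement) rails at the output of each stage."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    rails = (dynamic_preset, dynamic_preset)  # stage 1: dynamic gate presets both rails
    stages = [rails]
    for _ in range(num_stages - 1):           # remaining stages: static gates
        rails = static_gate_preset(rails)
        stages.append(rails)
    return stages


if __name__ == "__main__":
    for index, rails in enumerate(pipeline_preset(4), start=1):
        print(f"stage {index}: outputs preset to {rails}")
```

Because both rails of every stage sit at the same level during preset, the state of the pipeline in this phase does not depend on the data, which is consistent with the DPA-resilience goal of the disclosure.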

While the clock signal is high (i.e., during the clock high phase), the first differential latch 1810 latches the data bits on the rising edge of the clock signal and outputs the latched data bits to the first pipeline 1860. The differential logic gates in the first pipeline 1860 then perform the key addition operation of the first key adder 1830 and the inverse mix column operation of the inverse mix column processor 1820. The output data of the first pipeline 1860 is input to the second latch 1840. As shown in FIG. 18B, in this example, the preset phase of the first pipeline 1860 corresponds to the low phase of the clock, while the evaluate phase of the first pipeline 1860 corresponds to the high phase of the clock.

The second pipeline 1865 may implement the inverse shift row and S-Box processor 1845 shown in FIG. 18A. In this example, the second pipeline 1865 has a plurality of stages, where each stage includes one or more differential logic gates. The first stage may include one or more dynamic differential logic gates, and each of the subsequent stages includes one or more static differential logic gates. Alternatively, the first stage may instead include one or more static differential logic gates.

In operation, the second differential latch 1840 is transparent when the clock signal is high (i.e., during the clock high phase). For the example where the first stage of the second pipeline 1865 includes dynamic differential logic gates, these gates preset their outputs high during the clock high phase, causing the static differential logic gates in the subsequent stages to preset their outputs. For the example where the first stage of the second pipeline 1865 includes static differential logic gates, the preset output values of the last stage of the first pipeline 1860 may flow into the first stage of the second pipeline 1865 through the second differential latch 1840 while the second differential latch 1840 is transparent. As shown in FIG. 18B, the preset output values of the differential logic gates in the second pipeline 1865 alternate between high and low across the pipeline.

While the clock signal is low (i.e., during the clock low phase), the second differential latch 1840 latches the data bits on the falling edge of the clock signal and outputs the latched data bits to the second pipeline 1865. The differential logic gates in the second pipeline 1865 then perform the operations of the inverse shift row and S-Box processor 1845 on the data bits. As shown in FIG. 18B, in this example, the preset phase of the second pipeline 1865 corresponds to the high phase of the clock, while the evaluate phase of the second pipeline 1865 corresponds to the low phase of the clock.

Thus, the preset phases of the first and second pipelines 1860 and 1865 correspond to opposite phases of the clock signal, and the evaluate phases of the first and second pipelines 1860 and 1865 correspond to opposite phases of the clock signal. In the above example, the preset and evaluate phases of the first pipeline 1860 correspond to the clock low phase and the clock high phase, respectively, and the preset and evaluate phases of the second pipeline 1865 correspond to the clock high phase and the clock low phase, respectively. However, it should be appreciated that this may be reversed such that the preset and evaluate phases of the first pipeline 1860 correspond to the clock high phase and the clock low phase, respectively, and the preset and evaluate phases of the second pipeline 1865 correspond to the clock low phase and the clock high phase, respectively. In this case, the first differential latch 1810 is transparent in the clock high phase and latches in the clock low phase, and the second differential latch 1840 is transparent in the clock low phase and latches in the clock high phase. Each of the first and second differential latches 1810 and 1840 may be implemented using one or more of the differential latches 1510 and 1520 shown in FIG. 15.

In the above example, the first pipeline 1860 implements the first key adder 1830 and the inverse mix column processor 1820, and the second pipeline 1865 implements the inverse shift row and S-Box processor 1845. However, it should be appreciated that the present disclosure is not limited to this example. In general, the operations of the first key adder 1830, the inverse mix column processor 1820, and the inverse shift row and S-Box processor 1845 may be divided differently between the first pipeline 1860 and the second pipeline 1865 without affecting functionality, as long as the correct order of operations is maintained.

It should be understood that the present disclosure is not limited to the terms used above to describe various aspects of the present disclosure. For example, it should be appreciated that logical 1's and 0's may also be referred to as high and low, respectively, dynamic may also be referred to as clocked, preset may also be referred to as precharge, logical values may also be referred to as logical states, and logical operations may also be referred to as logical functions.

It should also be understood that the present disclosure is not limited to the specific arrangements of the inputs a, ā, b, and b̄ shown in each of FIGS. 5, 7, 10, and 12. In this regard, it should be appreciated that, for each of the differential logic gates 510, 1010, and 1210 shown in these figures, there are several possible arrangements of the inputs a, ā, b, and b̄ that implement the functions of the differential logic gates discussed above. For example, in FIG. 5, the arrangement of the inputs a and ā to the gates of the first PFET 412(1) and the second PFET 414(1) may be reversed such that the input ā is coupled to the gate of the first PFET 412(1) and the input a is coupled to the gate of the second PFET 414(1). In this example, the function of the exclusive-OR gate 520 discussed above remains unchanged. Thus, it should be appreciated that the present disclosure covers other possible arrangements of the inputs a, ā, b, and b̄ for each gate.

Any reference to elements by a name such as "first," "second," etc., as used herein does not generally limit the number or order of those elements. Rather, these designations are used herein as a convenient way to distinguish two or more elements or instances of an element. Thus, reference to a first element and a second element does not imply that only two elements can be used or that the first element must precede the second element.

In this disclosure, the word "exemplary" is used to mean "serving as an example, instance, or illustration." Any implementation or aspect described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects of the disclosure. Likewise, the term "aspect" does not require that all aspects of the disclosure include the discussed feature, advantage, or mode of operation. The term "coupled" is used herein to refer to either a direct electrical coupling or an indirect electrical coupling between two structures.

The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the examples described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
