System memory latency compensation
Created by D·T·全 and R·A·斯图尔特 on 2018-03-26. Abstract: Pipeline logic latency in a memory system operating at a reduced frequency may be compensated. The pipeline logic may be controlled using at least a first clock signal and a second clock signal. When the memory system operates at a higher frequency, the first clock signal may be used to control all registers of the pipeline logic. When the memory system operates at a reduced frequency, however, the first clock signal may be used to control one or more of the registers and the second clock signal may be used to control one or more other registers.
1. A system for compensating for system memory latency, comprising:
a memory interface between a client device and a memory system, the memory interface having pipeline logic including a first register and a second register, a data input of the second register being pipelined to a data output of the first register;
a clock frequency controller configured to detect a client device workload demand and adjust a frequency of a system clock signal provided to the memory interface, the clock frequency controller adjusting the system clock signal to a first frequency in response to detecting a high client device workload demand and to a second frequency lower than the first frequency in response to detecting a low client device workload demand; and
a clock phase controller configured to control the first register and the second register using a first periodic clock edge signal in response to adjusting the system clock signal to the first frequency, the clock phase controller further configured to, in response to adjusting the system clock signal to the second frequency, control the first register using one of the first periodic clock edge signal and a second periodic clock edge signal and control the second register using the other of the first and second periodic clock edge signals, wherein a first periodic time interval between successive assertions of the first periodic clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal.
2. The system of claim 1, wherein:
the first periodic clock edge signal corresponds to successive assertions of a true edge of the system clock signal; and
the second periodic clock edge signal corresponds to successive assertions of a complementary edge of the system clock signal.
3. The system of claim 2, wherein:
the pipeline logic comprises three or more registers, the registers comprising one or more odd registers and one or more even registers, and the data input of each odd register is pipelined to the data output of one of the even registers; and
the clock phase controller is configured to control the odd registers and the even registers using the first periodic clock edge signal in response to adjusting the system clock signal to the first frequency, and is further configured to control the odd registers using the first periodic clock edge signal and the even registers using the second periodic clock edge signal in response to adjusting the system clock signal to the second frequency.
4. The system of claim 2, wherein:
the pipeline logic comprises three or more registers; and
the clock phase controller is configured to control each of the registers using the first periodic clock edge signal in response to adjusting the system clock signal to the first frequency, and is further configured to, in response to adjusting the system clock signal to the second frequency, control a first subset of the registers using the first periodic clock edge signal and a second subset of the registers using the second periodic clock edge signal, wherein at least one pair of pipeline registers is controlled by a same one of the first periodic clock edge signal and the second periodic clock edge signal.
5. The system of claim 1, wherein:
the first periodic clock edge signal corresponds to successive assertions of a true edge of the system clock signal; and
the second periodic clock edge signal corresponds to successive assertions of a true edge of a delayed system clock signal having a phase delay greater than or equal to zero and less than 360 degrees relative to the system clock signal.
6. The system of claim 5, wherein:
the pipeline logic includes three or more ("N") registers; and
the clock phase controller is configured to generate N delayed system clock signals, each delayed system clock signal having a unique phase delay of an integer multiple of 360/N relative to the system clock signal, and the clock phase controller is further configured to control each of the N registers using the system clock signal in response to adjusting the system clock signal to the first frequency, and to control each of the N registers using one of the N delayed system clock signals in response to adjusting the system clock signal to the second frequency.
7. The system of claim 5, wherein:
the pipeline logic comprises three or more registers; and
the clock phase controller is configured to generate a plurality of delayed system clock signals, each delayed system clock signal having a unique phase delay relative to the system clock signal, and the clock phase controller is further configured to control each of the registers using the system clock signal in response to adjusting the system clock signal to the first frequency, and to control a first pair of pipeline registers using two respective delayed system clock signals having a first phase difference and a second pair of pipeline registers using two respective delayed system clock signals having a second phase difference different from the first phase difference in response to adjusting the system clock signal to the second frequency.
8. The system of claim 1, wherein the clock phase controller comprises:
a clock phase generator configured to generate a plurality of different clock signals based on the system clock signal and to provide the different clock signals to elements of the pipeline logic including the first register and the second register; and
a mode table configured to generate mode control signals and to provide the mode control signals to elements of the pipeline logic, wherein the mode control signals indicate an association between the elements of the pipeline logic and one of the different clock signals.
9. The system of claim 1, wherein the memory interface comprises a dynamic random access memory ("DRAM") controller.
10. The system of claim 1, wherein the client device and the memory interface are included in a system on a chip ("SoC") of a portable computing device.
11. A system for compensating for system memory latency, comprising:
means for detecting a client device workload demand associated with a client device, the client device connected with a memory system through a memory interface, the memory interface having pipeline logic including a first register and a second register;
means for adjusting a frequency of a system clock signal provided to the memory interface, the means for adjusting the frequency adjusting the system clock signal to a first frequency in response to detecting a high client device workload demand and adjusting the system clock signal to a second frequency lower than the first frequency in response to detecting a low client device workload demand;
means for controlling the first register and the second register using a first periodic clock edge signal in response to adjusting the system clock signal to the first frequency; and
means for controlling the first register using one of the first periodic clock edge signal and a second periodic clock edge signal and the second register using the other of the first and second periodic clock edge signals in response to adjusting the system clock signal to the second frequency, wherein a first periodic time interval between successive assertions of the first periodic clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal.
12. The system of claim 11, wherein:
the first periodic clock edge signal corresponds to successive assertions of a true edge of the system clock signal; and
the second periodic clock edge signal corresponds to successive assertions of a complementary edge of the system clock signal.
13. The system of claim 12, wherein:
the pipeline logic comprises three or more registers, the registers comprising one or more odd registers and one or more even registers, and the data input of each odd register is pipelined to the data output of one of the even registers;
the means for controlling in response to adjusting the system clock signal to the first frequency comprises means for controlling the odd registers and the even registers using the first periodic clock edge signal; and
the means for controlling in response to adjusting the system clock signal to the second frequency comprises means for controlling the odd registers using the first periodic clock edge signal and the even registers using the second periodic clock edge signal.
14. The system of claim 12, wherein:
the pipeline logic comprises three or more registers;
the means for controlling in response to adjusting the system clock signal to the first frequency comprises means for controlling each of the registers using the first periodic clock edge signal; and
the means for controlling in response to adjusting the system clock signal to the second frequency comprises means for controlling a first subset of the registers using the first periodic clock edge signal and a second subset of the registers using the second periodic clock edge signal, wherein at least one pair of pipeline registers is controlled by a same one of the first periodic clock edge signal and the second periodic clock edge signal.
15. The system of claim 11, wherein:
the first periodic clock edge signal corresponds to successive assertions of a true edge of the system clock signal; and
the second periodic clock edge signal corresponds to successive assertions of a true edge of a delayed system clock signal having a phase delay greater than or equal to zero and less than 360 degrees relative to the system clock signal.
16. The system of claim 15, wherein:
the pipeline logic includes three or more ("N") registers;
the means for controlling in response to adjusting the system clock signal to the first frequency comprises means for controlling each of the N registers using the system clock signal; and
the means for controlling in response to adjusting the system clock signal to the second frequency comprises means for generating N delayed system clock signals and, in response to adjusting the system clock signal to the second frequency, controlling each of the N registers using one of the N delayed system clock signals, wherein each delayed system clock signal has a unique phase delay relative to the system clock signal that is an integer multiple of 360/N.
17. The system of claim 15, wherein:
the pipeline logic comprises three or more registers;
the means for controlling in response to adjusting the system clock signal to the first frequency comprises means for controlling each of the registers using the system clock signal; and
the means for controlling in response to adjusting the system clock signal to the second frequency comprises means for generating a plurality of delayed system clock signals and, in response to adjusting the system clock signal to the second frequency, controlling a first pair of pipeline registers using two respective delayed system clock signals having a first phase difference and controlling a second pair of pipeline registers using two respective delayed system clock signals having a second phase difference different from the first phase difference, wherein each delayed system clock signal has a unique phase delay relative to the system clock signal.
18. The system of claim 11, wherein the means for controlling in response to adjusting the system clock signal to the second frequency comprises:
a clock phase generator configured to generate a plurality of different clock signals based on the system clock signal and to provide the different clock signals to elements of the pipeline logic including the first register and the second register; and
a mode table configured to generate mode control signals and to provide the mode control signals to elements of the pipeline logic, wherein the mode control signals indicate an association between the elements of the pipeline logic and one of the different clock signals.
19. The system of claim 11, wherein the client device and the memory interface are included in a system on a chip ("SoC") of a portable computing device.
20. A method for compensating for system memory latency, the method comprising:
detecting a client device workload demand associated with a client device, the client device connected with a memory system through a memory interface, the memory interface having pipeline logic including a first register and a second register;
adjusting a frequency of a system clock signal provided to the memory interface, wherein the system clock signal is adjusted to a first frequency in response to detecting a high client device workload demand and to a second frequency lower than the first frequency in response to detecting a low client device workload demand;
controlling the first register and the second register using a first periodic clock edge signal in response to adjusting the system clock signal to the first frequency; and
in response to adjusting the system clock signal to the second frequency, controlling the first register using one of the first periodic clock edge signal and a second periodic clock edge signal and controlling the second register using the other of the first and second periodic clock edge signals, wherein a first periodic time interval between successive assertions of the first periodic clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal.
21. The method of claim 20, wherein:
the first periodic clock edge signal corresponds to successive assertions of a true edge of the system clock signal; and
the second periodic clock edge signal corresponds to successive assertions of a complementary edge of the system clock signal.
22. The method of claim 21, wherein:
the pipeline logic comprises three or more registers, the registers comprising one or more odd registers and one or more even registers, and the data input of each odd register is pipelined to the data output of one of the even registers;
controlling in response to adjusting the system clock signal to the first frequency comprises controlling the odd registers and the even registers using the first periodic clock edge signal; and
controlling in response to adjusting the system clock signal to the second frequency comprises controlling the odd registers using the first periodic clock edge signal and controlling the even registers using the second periodic clock edge signal.
23. The method of claim 21, wherein:
the pipeline logic comprises three or more registers;
controlling in response to adjusting the system clock signal to the first frequency comprises controlling each of the registers using the first periodic clock edge signal; and
controlling in response to adjusting the system clock signal to the second frequency comprises controlling a first subset of the registers using the first periodic clock edge signal and a second subset of the registers using the second periodic clock edge signal, wherein at least one pair of pipeline registers is controlled by a same one of the first periodic clock edge signal and the second periodic clock edge signal.
24. The method of claim 20, wherein:
the first periodic clock edge signal corresponds to successive assertions of a true edge of the system clock signal; and
the second periodic clock edge signal corresponds to successive assertions of a true edge of a delayed system clock signal having a phase delay greater than or equal to zero and less than 360 degrees relative to the system clock signal.
25. The method of claim 24, wherein:
the pipeline logic includes three or more ("N") registers; and
the method further comprises generating N delayed system clock signals, wherein each delayed system clock signal has a unique phase delay relative to the system clock signal that is an integer multiple of 360/N, controlling in response to adjusting the system clock signal to the first frequency comprises controlling each of the N registers using the system clock signal, and controlling in response to adjusting the system clock signal to the second frequency comprises controlling each of the N registers using one of the N delayed system clock signals.
26. The method of claim 24, wherein:
the pipeline logic comprises three or more registers; and
the method further comprises generating a plurality of delayed system clock signals, wherein each delayed system clock signal has a unique phase delay relative to the system clock signal, controlling in response to adjusting the system clock signal to the first frequency comprises controlling each of the registers using the system clock signal, and controlling in response to adjusting the system clock signal to the second frequency comprises controlling a first pair of pipeline registers using two respective delayed system clock signals having a first phase difference and controlling a second pair of pipeline registers using two respective delayed system clock signals having a second phase difference different from the first phase difference.
27. The method of claim 20, wherein controlling in response to adjusting the system clock signal to the second frequency comprises:
generating a plurality of different clock signals based on the system clock signal and providing the different clock signals to elements of the pipeline logic including the first register and the second register; and
generating a mode control signal, and providing the mode control signal to elements of the pipeline logic, wherein the mode control signal indicates an association between the elements of the pipeline logic and one of the different clock signals.
28. A computer program product for compensating for system memory latency, the computer program product comprising processor-executable logic embodied in at least one non-transitory storage medium, execution of the logic by one or more processors of a system configuring the system to:
detect a client device workload demand associated with a client device, the client device connected with a memory system through a memory interface, the memory interface having pipeline logic including a first register and a second register;
adjust a frequency of a system clock signal provided to the memory interface, wherein the system clock signal is adjusted to a first frequency in response to detecting a high client device workload demand and to a second frequency lower than the first frequency in response to detecting a low client device workload demand;
control the first register and the second register using a first periodic clock edge signal in response to adjusting the system clock signal to the first frequency; and
in response to adjusting the system clock signal to the second frequency, control the first register using one of the first periodic clock edge signal and a second periodic clock edge signal and control the second register using the other of the first and second periodic clock edge signals, wherein a first periodic time interval between successive assertions of the first periodic clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal.
29. The computer program product of claim 28, wherein:
the first periodic clock edge signal corresponds to successive assertions of a true edge of the system clock signal;
the second periodic clock edge signal corresponds to successive assertions of a complementary edge of the system clock signal;
the pipeline logic comprises three or more registers, the registers comprising one or more odd registers and one or more even registers, and the data input of each odd register is pipelined to the data output of one of the even registers;
the system is configured to control the odd registers and the even registers using the first periodic clock edge signal in response to adjusting the system clock signal to the first frequency; and
the system is configured to, in response to adjusting the system clock signal to the second frequency, control the odd registers using the first periodic clock edge signal and the even registers using the second periodic clock edge signal.
30. The computer program product of claim 28, wherein:
the pipeline logic includes three or more ("N") registers;
the system is further configured to generate N delayed system clock signals, wherein each delayed system clock signal has a unique phase delay relative to the system clock signal that is an integer multiple of 360/N;
the system is configured to control each of the N registers using the system clock signal in response to adjusting the system clock signal to the first frequency; and
the system is configured to control each of the N registers using one of the N delayed system clock signals in response to adjusting the system clock signal to the second frequency.
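The 360/N phase spacing recited in claims 6, 16, 25, and 30 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, register k is assumed to be driven by the k-th delayed clock in pipeline order, and each register is assumed to capture its data on the first edge of its own clock after the preceding register has captured.

```python
from fractions import Fraction


def phase_delays_deg(n):
    """Unique phase delays, each an integer multiple of 360/N degrees."""
    return [Fraction(360 * k, n) for k in range(n)]


def pipeline_latency(phases_deg, period):
    """Time from the first register's capture to the last register's capture.

    phases_deg: phase delay (degrees) of the clock driving each register,
                listed in pipeline order (an assumption of this sketch).
    period:     system clock period, in any time unit.
    """
    offsets = [phase / 360 * period for phase in phases_deg]
    t = first = offsets[0]
    for offset in offsets[1:]:
        # Earliest edge of this register's clock strictly after time t.
        cycles = (t - offset) // period + 1
        t = offset + cycles * period
    return t - first


# Hypothetical example: N = 4 registers, clock period of 8 time units.
same_clock = [Fraction(0)] * 4       # every register on the undelayed clock
staggered = phase_delays_deg(4)      # 0, 90, 180, 270 degrees

print(pipeline_latency(same_clock, 8))   # 24
print(pipeline_latency(staggered, 8))    # 6
```

Under these assumptions, staggering the register clocks lets data cross each stage in a quarter period instead of a full period, which is the kind of latency reduction the claims are directed to at the lower frequency.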
Background
Portable computing devices ("PCDs") are becoming a necessity for individuals and professionals. These devices may include cellular telephones, portable digital assistants, portable game consoles, palmtop computers, and other portable electronic devices.
PCDs have various electronic systems that consume power, such as one or more cores of a system on a chip ("SoC"). The cores may include, for example, central processing units ("CPUs"), graphics processing units ("GPUs"), digital signal processors ("DSPs"), and memory systems. Because the quality of the user experience depends on system performance, it is desirable to maintain a high system clock frequency, a wide system data path, and so on to maximize performance. However, parameters associated with high performance (such as high clock frequency and supply voltage) may adversely affect power savings. Since power savings are highly desirable in battery-powered PCDs, dynamic voltage and frequency scaling ("DVFS") techniques have been developed to balance system performance against power consumption. For example, power management logic may monitor operating conditions in the PCD, including workload demands on a processor, core, SoC, or other system. When the power management logic detects that the workload demand on such a system is low, it may issue a command to a clock signal controller to set the clock signal that controls operation of the system to a low frequency, which allows the power management logic to reduce the supply voltage provided to the system or portions of the system, thereby saving power without adversely affecting performance and, therefore, the user experience. When the power management logic detects that the workload demand on such a system is high, it may issue a command to the clock signal controller to set the clock signal to a higher frequency, which typically requires the power management logic to also increase the supply voltage, maintaining performance (and the user experience) at the expense of increased power consumption.
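The DVFS decision described above can be sketched as follows. This is a minimal illustration, not the power management logic of any particular device: the operating points, the workload threshold, and all names are hypothetical, and the power estimate relies only on the standard approximation that dynamic power scales with the square of the supply voltage times the clock frequency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    freq_mhz: int    # clock frequency
    supply_mv: int   # supply voltage


# Hypothetical operating points, chosen only for illustration.
HIGH = OperatingPoint(freq_mhz=800, supply_mv=900)
LOW = OperatingPoint(freq_mhz=200, supply_mv=600)


def select_operating_point(workload, threshold=0.5):
    """Pick the high point for high workload demand, else the low point.

    workload is a utilization estimate in [0, 1]; the threshold is an
    assumption of this sketch.
    """
    return HIGH if workload >= threshold else LOW


def relative_dynamic_power(point, reference=HIGH):
    """Dynamic power relative to a reference point, using P ~ V^2 * f."""
    scale = lambda p: p.supply_mv ** 2 * p.freq_mhz
    return scale(point) / scale(reference)


chosen = select_operating_point(workload=0.1)
print(chosen, relative_dynamic_power(chosen))
```

With these made-up numbers the low operating point draws roughly a ninth of the dynamic power of the high one, which is why lowering both frequency and voltage is attractive when workload demand is low.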
Memory latency may also affect the user experience, for example, by forcing a processor or other client device to wait for a memory access to complete. The interface between the client device and the memory system may include pipeline logic controlled by a system clock. Thus, when a memory interface designed for high bandwidth, high frequency operation is forced to operate at a lower frequency, memory latency increases proportionally. Memory latency can be reduced by operating the memory interface and associated system at a higher clock frequency, but such a solution does not maximize power savings.
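The proportionality between clock period and pipeline latency can be checked with simple arithmetic. The sketch below assumes that every pipeline stage advances once per system clock period; the stage count and frequencies are hypothetical values chosen for illustration.

```python
def pipeline_latency_ns(num_stages, clock_mhz):
    """Latency of a pipeline whose stages all advance once per clock period."""
    period_ns = 1_000.0 / clock_mhz   # one clock period in nanoseconds
    return num_stages * period_ns


# Hypothetical four-stage memory interface pipeline.
high = pipeline_latency_ns(num_stages=4, clock_mhz=800.0)   # 4 * 1.25 ns
low = pipeline_latency_ns(num_stages=4, clock_mhz=200.0)    # 4 * 5.0 ns

print(high, low, low / high)
```

Dropping the clock from 800 MHz to 200 MHz in this example multiplies the pipeline latency by four, from 5 ns to 20 ns, even though the logic between stages is no slower than before.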
Disclosure of Invention
Systems, methods, and computer program products for compensating for pipeline logic latency in a memory system are disclosed.
In one aspect, a system for compensating for pipeline logic latency may include a clock phase controller, a clock frequency controller, and a memory interface between a client device and a memory system. The memory interface may have pipeline logic including at least a first register and a second register. The clock frequency controller may be configured to adjust the system clock signal provided to the memory interface to a first frequency in response to detecting a high client device workload demand, and to adjust the system clock signal provided to the memory interface to a second frequency lower than the first frequency in response to detecting a low client device workload demand. The clock phase controller may be configured to control the first register and the second register using a first periodic clock edge signal in response to adjusting the system clock signal to the first frequency. The clock phase controller may be further configured to, in response to adjusting the system clock signal to the second frequency, control the first register using one of the first periodic clock edge signal and a second periodic clock edge signal and control the second register using the other of the first and second periodic clock edge signals. A first periodic time interval between successive assertions of the first periodic clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal.
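The relationship between the two time intervals can be made concrete with a small timing model. The sketch below is an illustration under assumptions that go beyond the text: the first periodic clock edge signal is taken to be asserted once per system clock period, the second is taken to be asserted half a period later (a 50% duty cycle), and the logic between registers is assumed to settle within half a period at the lower frequency. The function name and the "first"/"second" labels are hypothetical.

```python
def capture_times(edges, period_ns):
    """Return the time at which each register captures one data item.

    edges: one entry per register in pipeline order, either "first" or
           "second", naming the clock edge signal that controls it.
    """
    half = period_ns / 2.0
    times = []
    tick = -1  # last half-period boundary used; data is ready before tick 0
    for edge in edges:
        parity = 0 if edge == "first" else 1
        tick += 1
        if tick % 2 != parity:
            tick += 1  # wait for the next assertion of the required signal
        times.append(tick * half)
    return times


period_ns = 10.0  # hypothetical low-frequency clock period

# Both registers on the first signal: one full period between captures.
print(capture_times(["first", "first"], period_ns))    # [0.0, 10.0]

# Second register on the second signal: half a period between captures.
print(capture_times(["first", "second"], period_ns))   # [0.0, 5.0]
```

Here the first periodic time interval is the full 10 ns period, while the second interval is 5 ns, so moving the second register to the second signal halves the register-to-register delay in this model.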
In another aspect, a method for compensating for pipeline logic latency may include detecting a client device workload demand associated with a client device, adjusting a frequency of a system clock signal provided to a memory interface, and controlling pipeline logic of the memory interface using at least a first periodic clock edge signal and a second periodic clock edge signal. The pipeline logic includes at least a first register and a second register. In response to detecting a high client device workload demand, the system clock signal provided to the memory interface may be adjusted to a first frequency. The first and second registers may be controlled using the first periodic clock edge signal in response to adjusting the system clock signal to the first frequency. In response to detecting a low client device workload demand, the system clock signal provided to the memory interface may be adjusted to a second frequency lower than the first frequency. In response to adjusting the system clock signal to the second frequency, the first register may be controlled using one of the first periodic clock edge signal and the second periodic clock edge signal, and the second register may be controlled using the other of the first periodic clock edge signal and the second periodic clock edge signal. A first periodic time interval between successive assertions of the first periodic clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal.
In another aspect, a computer program product for compensating for pipeline logic latency may include processor-executable logic embodied in at least one non-transitory storage medium. Execution of the logic by one or more processors of a system configures the system to detect a client device workload demand associated with a client device, adjust a frequency of a system clock signal provided to a memory interface, and control pipeline logic of the memory interface using at least a first periodic clock edge signal and a second periodic clock edge signal. The pipeline logic includes at least a first register and a second register. In response to detecting a high client device workload demand, the system clock signal provided to the memory interface may be adjusted to a first frequency. The first and second registers may be controlled using the first periodic clock edge signal in response to adjusting the system clock signal to the first frequency. In response to detecting a low client device workload demand, the system clock signal provided to the memory interface may be adjusted to a second frequency lower than the first frequency. In response to adjusting the system clock signal to the second frequency, the first register may be controlled using one of the first periodic clock edge signal and the second periodic clock edge signal, and the second register may be controlled using the other of the first periodic clock edge signal and the second periodic clock edge signal. A first periodic time interval between successive assertions of the first periodic clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal.
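The sequence shared by the method and computer program product aspects (detect the workload demand, adjust the frequency, then choose which clock edge signal controls each register) can be sketched as a table lookup. Everything named here is hypothetical: the threshold, the mode names, and the table contents are assumptions for a two-register pipeline, not values taken from the disclosure.

```python
# Hypothetical table: frequency mode -> clock edge signal for each register.
MODE_TABLE = {
    "first_frequency": ("first", "first"),     # high demand: one signal for all
    "second_frequency": ("first", "second"),   # low demand: alternate signals
}


def detect_mode(workload_demand, threshold=0.5):
    """Map a workload estimate in [0, 1] to a frequency mode (assumed rule)."""
    return "first_frequency" if workload_demand >= threshold else "second_frequency"


def configure_pipeline(workload_demand):
    """Return the selected mode and the per-register clock edge selection."""
    mode = detect_mode(workload_demand)
    return mode, MODE_TABLE[mode]


for demand in (0.9, 0.2):
    mode, selection = configure_pipeline(demand)
    print(f"demand={demand}: {mode}, registers use {selection}")
```

Keeping the register-to-signal association in a table separates the frequency decision from the clock routing, so a different assignment could be tried by editing the table alone.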
Drawings
In the drawings, like reference numerals refer to like parts throughout the various views unless otherwise indicated. For reference numerals that utilize an alphabetic character designation such as "102A" or "102B," the alphabetic character designation may distinguish two identical parts or elements that appear in the same figure. Where a reference numeral is intended to include all parts having the same reference numeral throughout the drawings, the alphabetic character designation for the reference numeral may be omitted.
FIG. 1 is a block diagram of a portable computing device that may include a system for compensating for system memory latency in accordance with an exemplary embodiment.
FIG. 2 is a block diagram of a system for compensating for system memory latency in accordance with an exemplary embodiment.
FIG. 3 is a block diagram of pipeline logic provided with complementary system clock signals in accordance with an illustrative embodiment.
FIG. 4 is a timing diagram illustrating operation of the pipeline logic of FIG. 3 according to an exemplary embodiment.
FIG. 5 is similar to FIG. 4, but illustrates compensating for system memory latency when the system clock is set to a lower frequency according to an exemplary embodiment.
FIG. 6 is similar to FIG. 3, but shows at least one register of the pipeline logic with substantial delay.
FIG. 7 is a timing diagram illustrating compensation for system memory latency in a system having the pipeline logic of FIG. 6 in accordance with an exemplary embodiment.
FIG. 8 is a block diagram of pipeline logic provided with phase-delayed system clock signals in accordance with an illustrative embodiment.
FIG. 9 is a timing diagram illustrating the operation of the pipeline logic of FIG. 8 according to an exemplary embodiment.
FIG. 10 is similar to FIG. 9, but illustrates compensating for system memory latency when the system clock is set to a low frequency in a system having the pipeline logic of FIG. 8, according to an exemplary embodiment.
FIG. 11 is a block diagram of pipeline logic that has at least one register with substantial delay and is provided with system clock signals having phase delays in accordance with an exemplary embodiment.
FIG. 12 is a timing diagram illustrating compensation for system memory latency in a system having the pipeline logic of FIG. 11 in accordance with an exemplary embodiment.
FIG. 13 is a block diagram of a DRAM controller according to an exemplary embodiment.
FIG. 14 is a method flow diagram illustrating an exemplary method for compensating for system memory latency in accordance with an exemplary embodiment.
FIG. 15 illustrates an example of the clock mode table of FIG. 13 according to an exemplary embodiment.
Detailed Description
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any aspect described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects.
The terms "central processing unit" ("CPU"), "digital signal processor" ("DSP"), and "graphics processing unit" ("GPU") are non-limiting examples of processors that may be present in a PCD. These terms are used interchangeably herein, except as otherwise indicated.
The term "portable computing device" ("PCD") is used herein to describe any device that operates on a limited capacity power source, such as a battery. While battery-powered PCDs have been in use for decades, technological advances in rechargeable batteries combined with the advent of third generation ("3G") and fourth generation ("4G") wireless technology have enabled a large number of PCDs with multiple capabilities. Thus, a PCD may be a cellular or mobile phone, a satellite phone, a pager, a personal digital assistant ("PDA"), a smartphone, a navigation device, a smartbook or reader, a media player, a combination of the above, a notebook or handheld computer with a wireless connection or link, and so forth.
As used herein, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
The term "application" may be used to refer to a software entity having executable content, such as object code, scripts, byte code, markup language files, patches, and the like. In addition, an "application" may further include files that are not executable in nature, such as data files, configuration files, documents, and so forth.
As shown in FIG. 1, in an illustrative or exemplary embodiment, a system, method and computer program product for system memory latency compensation may be embodied in a PCD 100. The PCD 100 includes a system on a chip ("SoC") 102 (i.e., a system embodied in an integrated circuit chip). The SoC 102 may include a central processing unit ("CPU") 104, a graphics processing unit ("GPU") 106, or other processor. The SoC 102 may include an analog signal processor 108.
A display controller 110 and a touchscreen controller 112 may be coupled to the CPU 104. A touchscreen display 114 external to the SoC 102 may be coupled to the display controller 110 and the touchscreen controller 112. The SoC 102 may further include a video decoder 116. The video decoder 116 may be coupled to the CPU 104. A video amplifier 118 may be coupled to the video decoder 116 and the touchscreen display 114. A video port 120 may be coupled to the video amplifier 118. A universal serial bus ("USB") controller 122 may also be coupled to the CPU 104, and a USB port 124 may be coupled to the USB controller 122. A subscriber identity module ("SIM") card 126 may also be coupled to the CPU 104.
One or more memories may be coupled to the CPU 104. The one or more memories may include volatile memory and non-volatile memory. Examples of volatile memory include static random access memory ("SRAM") 128 and dynamic RAM ("DRAM") 130 and 131. Such memory may be external to the SoC 102, such as DRAM 130, or internal to the SoC 102, such as DRAM 131. A DRAM controller 132, coupled to the CPU 104, may control writing data to DRAMs 130 and 131 and reading data from DRAMs 130 and 131. In other embodiments, such a DRAM controller may be included within a processor, such as the CPU 104.
A stereo audio CODEC 134 may be coupled to the analog signal processor 108. Further, an audio amplifier 136 may be coupled to the stereo audio CODEC 134. A first stereo speaker 138 and a second stereo speaker 140 may each be coupled to the audio amplifier 136. Additionally, a microphone amplifier 142 may be coupled to the stereo audio CODEC 134, and a microphone 144 may be coupled to the microphone amplifier 142. A frequency modulation ("FM") radio tuner 146 may be coupled to the stereo audio CODEC 134. An FM antenna 148 may be coupled to the FM radio tuner 146. Further, stereo headphones 150 may be coupled to the stereo audio CODEC 134. Other devices that may be coupled to the CPU 104 include a digital camera 152.
A modem or radio frequency ("RF") transceiver 154 may be coupled to the analog signal processor 108. The RF switch 156 may be coupled to the RF transceiver 154 and the RF antenna 158. Additionally, a keypad 160, a mono headset with a microphone 162, and a vibrator device 164 may be coupled to the analog signal processor 108.
A power supply 166 may be coupled to the SoC 102 via a power management integrated circuit ("PMIC") 168. The power supply 166 may include a rechargeable battery or a DC power source derived through an AC-to-DC converter connected to an AC power source.
The CPU 104 may also be coupled to one or more internal, on-chip thermal sensors 170A, and one or more external, off-chip thermal sensors 170B. The thermal sensors 170A and 170B may generate a voltage drop that is converted to a digital signal using an analog-to-digital converter ("ADC").
In the exemplary or illustrative embodiment, the touchscreen display 114, the video port 120, the USB port 124, the camera 152, the first stereo speaker 138, the second stereo speaker 140, the microphone 144, the FM antenna 148, the stereo headphones 150, the RF switch 156, the RF antenna 158, the keypad 160, the mono headset 162, the vibrator 164, the thermal sensor 170B, the PMIC 168, the power supply 166, and the DRAM 130 are external to the SoC 102. However, it will be understood that in other embodiments, one or more of these devices may be included in such a SoC.
The SoC 102 may include a clock controller 172. The clock controller 172 can adjust the frequency of one or more system clock signals used by various systems, such as processors and memory systems. The clock controller 172 may dynamically adjust such clock frequency in response to operating conditions, such as measured or predicted workload demands on a processor, core, SoC, or other system. For example, when the clock controller 172 detects a high processor workload demand, the clock controller 172 may set the frequency of the clock signal provided to the processor to a high frequency. Likewise, when the clock controller 172 detects a low processor workload demand, the clock controller 172 may set the frequency of the clock signal provided to the processor to a low frequency. As used in this description, the terms "high frequency" and "low frequency" have no other meaning than to indicate relative values with respect to each other; high frequencies are higher than low frequencies. This dynamic frequency scaling may be used in combination with dynamic voltage scaling. The PMIC 168 may set a supply voltage of the clocked system to a high voltage level when the clock signal provided to the system is set to a high frequency, and the PMIC 168 may set the supply voltage to a low voltage level when the clock signal provided to the system is set to a low frequency. As used in this description, the terms "high voltage" and "low voltage" have no other meaning than to indicate relative values with respect to each other; the high voltage is higher than the low voltage. A reduction in the supply voltage generally results in a proportional saving of power consumed in the system. The dynamic adjustment of the supply voltage and clock frequency may be referred to as dynamic voltage and frequency scaling ("DVFS"). As understood by those of ordinary skill in the art, DVFS techniques enable a balance between power consumption and performance.
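The workload-driven selection of frequency and voltage described above can be sketched as follows. This is only an illustration under stated assumptions, not the clock controller's actual implementation: the operating-point values, the 0.0 to 1.0 workload scale, the threshold, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A paired clock frequency and supply voltage (a DVFS operating point)."""
    freq_mhz: int
    voltage_mv: int


# Hypothetical operating points; the numbers are illustrative only.
HIGH = OperatingPoint(freq_mhz=1600, voltage_mv=900)
LOW = OperatingPoint(freq_mhz=400, voltage_mv=600)


def select_operating_point(workload: float, threshold: float = 0.5) -> OperatingPoint:
    """Pick HIGH when the workload demand reaches the threshold, else LOW.

    `workload` is an assumed normalized demand in the range 0.0 to 1.0.
    """
    return HIGH if workload >= threshold else LOW
```

In this sketch the voltage follows the frequency, mirroring the pairing of a high frequency with a high voltage and a low frequency with a low voltage.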
As shown in FIG. 2, exemplary system 200 may include a device 202, a memory system 204, and a power manager 206. The SoC 102 described above with respect to FIG. 1 may be an example of the device 202. The PMIC 168 described above with respect to FIG. 1 may be an example of the power manager 206. The DRAM 130 described above with respect to FIG. 1 may be an example of the memory system 204.
The device 202 may include a power controller 208, a clock frequency controller 210, a clock phase controller 212, and a memory controller or interface 214, all of which may communicate with each other via a bus 216. The client device 218 may also communicate with the aforementioned components via the bus 216. Examples of client devices 218 include the CPU 104 (or a core thereof), the GPU 106, clients associated with the camera 152 and the display 114, and so forth, as described above with respect to FIG. 1. For clarity of illustration, only one exemplary client device 218 is shown in FIG. 2, although other such client devices may be similarly coupled to bus 216. Power controller 208 controls power manager 206 or otherwise communicates with power manager 206. Clock frequency controller 210 generates one or more system clock signals 220. The clock frequency controller 210 may monitor and detect workload demands on the client device 218 and adjust the frequency of one or more system clock signals 220 in response to the detected client workload demands. The power manager 206 may adjust one or more supply voltages (i.e., voltage rails) provided to the device 202 based on the clock frequency to facilitate a stable clock signal. Thus, for example, in response to clock frequency controller 210 setting one such system clock signal 220 to a high frequency, power manager 206 may set the corresponding power supply rail to a high voltage. Likewise, in response to clock frequency controller 210 setting one such system clock signal 220 to a low frequency, power manager 206 may set the corresponding power supply rail to a low voltage.
Clock phase controller 212 receives at least one such system clock signal 220 generated by clock frequency controller 210. In response to or based on the frequency of such system clock signal 220, clock phase controller 212 generates a
As described in further detail below, the
As shown in fig. 3,
The circuit elements, including the
In the timing diagram of fig. 4, an
In the example shown in FIG. 4, the phase_0 clock edge signal corresponds to one of the system clock signals 220 ("system clocks") on which the memory interface 214 (FIG. 2) or other such system operates. Accordingly, data in the form of one or more bits, words, etc. propagates through the pipeline logic of the memory interface 214 or other such system in synchronization with the phase_0 clock edge signal. In the example shown in FIG. 4, the frequency of the phase_0 clock edge signal may be a "high" frequency, where, in response to detecting a high workload demand of the client device 218, the clock frequency controller 210 sets or adjusts a corresponding one of the system clock signals 220 ("system clock") to the "high" frequency.
In response to adjusting the system clock signal to a high frequency, clock phase controller 212 controls all of registers 302 (fig. 3) using only one of periodic clock edge signals 222, such as the phase _0 clock edge signal. Accordingly, in the example illustrated in fig. 3-4,
In the example described above with respect to fig. 3-4, the total delay of data through
In the timing diagram of fig. 5, an
In the example illustrated in fig. 5, register 302a (fig. 3) captures and stores exemplary data in response to the assertion of the phase _0 clock edge signal defined by edge 502 a. The data propagates from the data output of
In the example described above with respect to fig. 3 and 5, the total delay ("L") of data through
Another example of delay compensation may be described with reference to FIGS. 6 and 7. As shown in FIG. 6, pipeline logic 600 is similar to pipeline logic 300 (FIG. 3) described above, except that in this example, logic 604a substantially delays data (e.g., logic 604a introduces a greater delay than logic 604b or logic 604c). Pipeline logic 600 may include any number of registers 602 (such as exemplary registers 602a, 602b, 602c, and 602d). Combinational logic 604 may be inserted between pairs of registers 602. For example, logic 604a may be inserted between registers 602a and 602b, logic 604b may be inserted between registers 602b and 602c, and logic 604c may be inserted between registers 602c and 602d. Although in the embodiment shown in FIG. 6, registers 602a-602d comprise D-type flip-flops, in other embodiments, pipeline logic may employ other types of delay elements, such as transparent latches, J-K flip-flops, S-R flip-flops, delay lines, and so forth.
In the timing diagram of fig. 7, an example timing diagram 700 may describe an example of operation of one of the systems or components of device 202 (fig. 2), such as memory interface 214, or a system that is a combination of two or more systems or components, such as client device 218, bus 216, memory interface 214, and memory system 204. The phase _0 clock edge signal and the phase _1 clock edge signal shown in fig. 7 are similar to those described above with respect to fig. 5. Accordingly, the identified or true edge of the phase _0 clock edge signal is similarly a rising edge 702 (such as edge 702a,
In the example illustrated in FIG. 7, register 602a (FIG. 6) captures and stores exemplary data in response to the assertion of the phase_0 clock edge signal defined by edge 702a. The data propagates from the data output of register 602a to the data input of
In the examples described above with respect to fig. 6-7, the total delay ("L") of data through
While in the examples described above with respect to FIGS. 6-7, the effect of the substantial delay introduced by logic 604a may be addressed in the manner described above, in other examples, the effect of the substantial delay introduced by any other combinational logic or other delay-introducing elements interposed between any one or more pairs of registers may be addressed in the same manner. Each register may be controlled by one of the clock edge signals 222 that has been individually selected to ensure that the register's timing requirements are met. The delay and timing requirements may be determined by analyzing timing analysis reports from a circuit synthesis and physical design simulator or timing analyzer (not shown) during a design phase of development of the device 202. The analysis may be performed for each timing path at each frequency and voltage operating condition, evaluated across temperature and fabrication process variations. The selection of the phase_0 or phase_1 clock edge signal for any given path between the source and destination registers may be determined by selecting the clock phase that results in the smallest acceptable (i.e., positive-slack) timing margin. In general, the input to the destination register may be derived from a plurality of upstream registers and associated logic paths; it is well known in digital circuit design that the timing margin must be acceptable (i.e., the slack must be positive) for all paths into the destination register. The assignment of clock phases may be established at each frequency and voltage operating condition. For example, a high-frequency, high-voltage condition will have one assignment of clock phases to all registers, and a low-frequency, low-voltage condition will have another assignment of clock phases to all registers. The overall assignment can be captured in the mode table 1500 shown in FIG. 15.
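The phase selection rule described above, choosing the clock phase with the smallest positive slack while keeping every path into the destination register positive, can be sketched as follows. This is a simplified model and not a timing analyzer: the function names, the single setup-time parameter, and the restriction to the first assertion of each phase after the launch edge are all assumptions made for illustration.

```python
import math


def next_assertion(launch: float, offset: float, period: float) -> float:
    """Time of the first assertion of a phase (at offset + k*period) strictly after launch."""
    k = math.floor((launch - offset) / period) + 1
    return offset + k * period


def select_phase(launch: float, path_delays: list[float],
                 phase_offsets: dict[str, float], period: float,
                 setup: float = 0.0) -> tuple[str, float]:
    """Return (phase name, slack) for the phase with the smallest positive slack.

    All paths into the destination register must meet timing, so the
    worst-case (largest) path delay is used for every candidate phase.
    """
    worst = max(path_delays)
    best = None
    for name, offset in phase_offsets.items():
        capture = next_assertion(launch, offset, period)
        slack = capture - launch - worst - setup
        if slack > 0 and (best is None or slack < best[1]):
            best = (name, slack)
    if best is None:
        raise ValueError("no candidate phase meets timing within one period")
    return best


# Hypothetical numbers: a 10 ns period with complementary phases.
PHASES = {"phase_0": 0.0, "phase_1": 5.0}
```

With these assumed numbers, `select_phase(0.0, [3.0, 4.0], PHASES, 10.0)` picks phase_1, because the worst-case 4 ns path still leaves positive slack at the 5 ns edge, while a 6 ns path falls back to the next phase_0 edge.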
One embodiment of the mode table 1500 may be fixed at design time and implemented as a static lookup table, using hardwired digital logic or ROM, that holds worst-case clock phase assignments. Alternatively, the mode table 1500 may be implemented using programmable registers, RAM, fuses, EPROM, flash memory, and the like. If the mode table 1500 is programmable, the clock phase assignments can be further optimized during operation, starting from initial values determined during the design phase. This is because the clock phase assignments in a hard-coded lookup table must be based on worst-case process and temperature variations (since the assignments must be reliable across millions of devices and at all temperatures). However, if the mode table is reprogrammable, it may be adjusted for a particular device depending on process variations (e.g., slow or fast). That is, a fast device exhibiting faster-than-typical silicon speed may populate the mode table 1500 with clock phases that result in lower latency than those of a slow device exhibiting slower-than-typical silicon speed. The adjustment may be made by determining the silicon speed of the individual device using conventional methods (e.g., ring oscillator speed, leakage current, etc.). This results in the lowest possible delay for the individual device. In either case, both the individualized and the globally applied (worst-case) clock phase assignments result in a reduction of the total delay.
Another example of delay compensation may be described with reference to fig. 8 and 9. As shown in fig. 8,
In the timing diagram of fig. 9, an example timing sequence 900 may describe an example of operation of one of the systems or components of device 202 (fig. 2), such as memory interface 214, or a system that is a combination of two or more systems or components, such as client device 218, bus 216, memory interface 214, and memory system 204. The phase _0 clock edge signals shown in fig. 9 are similar to those described above with respect to fig. 4. Accordingly, the frequency of the phase _0 clock edge signal may be a "high" frequency, wherein, in response to detecting a high workload demand of the client device 218, the clock frequency controller 210 sets or adjusts a corresponding one of the system clock signals 220 ("system clock") to the "high" frequency. Likewise, in the example shown in fig. 9, the identified or true edges of the phase _0 clock edge signal are rising edges 902, such as edges 902a, 902b, 902c, and 902 d. In addition to the phase _0 clock edge signal, other clock edge signals (such as the phase _1 clock edge signal, the phase _2 clock edge signal, and the phase _3 clock edge signal) may be included in the
In response to adjusting the system clock signal to a high frequency, clock phase controller 212 controls all registers 802 (fig. 8) using only one of clock edge signals 222 (e.g., the phase _0 clock edge signal). Accordingly, in the example illustrated in fig. 8-9,
In the example described above with respect to fig. 8-9, the total delay of data through
In the timing diagram of FIG. 10, an example timing diagram 1000 may describe an example of the operation of one of the systems or components of device 202 (FIG. 2), such as memory interface 214, or a system that is a combination of two or more systems or components, such as client device 218, bus 216, memory interface 214, and memory system 204. The phase_0 clock edge signal shown in FIG. 10 is the same as the clock edge signal described above with respect to FIG. 9, except that the frequency of the phase_0 clock edge signal may be a "low" frequency in the example of operation shown in FIG. 10, where clock frequency controller 210 sets or adjusts a corresponding one of the system clock signals 220 ("system clock") to the "low" frequency in response to detecting a low workload demand of client device 218. The phase_0 clock edge signal, the phase_1 clock edge signal, the phase_2 clock edge signal, and the phase_3 clock edge signal have the same frequency as one another, but have different phases or delays relative to the associated one of the system clock signals 220. For example, the delay of the phase_0 clock edge signal relative to a corresponding one of the system clock signals 220 may be zero, the delay of the phase_1 clock edge signal relative to a corresponding one of the system clock signals 220 may be a certain amount of time delay ("D"), the delay of the phase_2 clock edge signal relative to a corresponding one of the system clock signals 220 may be 2×D, and the delay of the phase_3 clock edge signal relative to a corresponding one of the system clock signals 220 may be 3×D. In other words, in this example, the system clock cycle may be divided into four phases with phase delays of 0°, 90°, 180°, and 270° relative to the system clock. The number of phases may be four because, in this example,
In the example shown in fig. 10, the identified or true edges of the phase _0 clock edge signal are rising edges 1002 (such as
In the example illustrated in fig. 10, register 802a (fig. 8) captures and stores example data in response to the assertion of the phase _0 clock edge signal defined by
In the examples described above with respect to fig. 8 and 10, the total delay ("L") of data through
Another example of delay compensation may be described with reference to fig. 11 and 12. As shown in fig. 11,
In the timing diagram of fig. 12, an
In the example illustrated in FIG. 12, register 1102a (FIG. 11) captures and stores exemplary data in response to the assertion of the phase _0 clock edge signal defined by
In the example described above with respect to fig. 11-12, the total delay ("L") of data through
While in the examples described above with respect to fig. 11-12, the effect of the substantial delay introduced by
In an exemplary embodiment, aspects of clock phase controller 212 and memory interface 214 (FIG. 2) may be integrated together in a single system or device. For example, as shown in FIG. 13, DRAM controller 1302 may include a clock phase controller 1306 (which may be an example of clock phase controller 212) and a DRAM controller interface 1304 (which may be an example of memory interface 214). DRAM controller 1302 may receive one or more system clock signals (e.g., adjusted in frequency in response to client device workload demands) generated by clock frequency controller 210 in the manner described above. DRAM controller 1302 may be coupled to an SoC bus such as bus 216 (FIG. 2). The DRAM controller interface 1304 may be coupled to a DRAM system, such as the memory system 204 (FIG. 2). DRAM controller interface 1304 may have a conventional structure and may include, for example, physical interface 1308, command generator 1310, request optimizer 1312, DRAM request queue 1314, and DRAM response queue 1316.
The clock phase controller 1306 may include a clock phase generator 1318 and a clock mode table 1320. Clock phase generator 1318 may generate delayed system clock signals or phase clock signals, such as the phase_0 clock edge signal, the phase_1 clock edge signal, the phase_2 clock edge signal, and the phase_3 clock edge signal described above with respect to FIGS. 3-12. Each of the generated phase clock signals may be provided to pipeline logic in DRAM controller interface 1304. The pipeline logic may include the physical interface 1308, the command generator 1310, the request optimizer 1312, the DRAM request queue 1314, and the DRAM response queue 1316 (each of which may itself include multiple stages of pipeline logic), so it may be appreciated that data may take many clock cycles to propagate through the DRAM controller interface 1304. It may also be appreciated that some of the elements of DRAM controller interface 1304 may have longer delays than others. For example, DRAM request queue 1314 and DRAM response queue 1316 may have a low logic latency, while request optimizer 1312 may have a high logic latency. Thus, in a system such as DRAM controller interface 1304 that includes a combination of elements with high logic delays and elements with low logic delays, it is beneficial to control each element using an individually selected one of two or more phase clock signals.
The clock mode table 1320 may associate elements of the pipeline logic or a set of multiple elements of the pipeline logic with one of a different phase clock signal or other clock edge signal for use when the system is operating in a low frequency mode (e.g., in response to detecting a low client device workload). In the example described above with respect to fig. 3-5, clock mode table 1320 may associate
The clock mode table 1320 may be a lookup table in which the associations described above are stored. Alternatively, the clock mode table 1320 may have any other structure, such as one using reconfigurable registers, RAM, ROM, EPROM, or other types of NV memory. The mode table 1500 (FIG. 15) described above may be an example of the clock mode table 1320. The clock mode table 1320 may generate mode control signals indicating the associations described above. The mode control signals may be provided to elements of the pipeline logic. In other words, the clock mode table 1320 indicates to various registers or other elements of the pipeline logic which of the two or more phase clock signals to use when the system is operating in the low frequency mode. In the example described above with respect to FIG. 3, mode control signal mode_A is provided to register 302a, mode control signal mode_B is provided to register 302b, mode control signal mode_C is provided to register 302c, and mode control signal mode_D is provided to register 302d. In the example described above with respect to FIG. 3, each of these mode control signals may consist of only one bit to enable selection among the two phase clock signals of these examples. Similarly, in the example described above with respect to FIG. 6, mode control signal mode_A is provided to register 602a, mode control signal mode_B is provided to register 602b, mode control signal mode_C is provided to register 602c, and mode control signal mode_D is provided to register 602d. In the examples described above with respect to FIG. 6, each of these mode control signals may consist of only one bit to enable selection among the two phase clock signals of these examples.
In the example described above with respect to FIG. 8, mode control signal mode_A is provided to register 802a, mode control signal mode_B is provided to register 802b, mode control signal mode_C is provided to register 802c, and mode control signal mode_D is provided to register 802d. In the examples described above with respect to FIG. 8, each of these mode control signals may consist of two bits to enable selection among the four phase clock signals of these examples. Similarly, in the example described above with respect to FIG. 11, at low frequencies, mode control signal mode_A is provided to register 1102a using mode table 1500, mode_A being equal to the binary value 00 determined by row 1550 (block0_reg_1102a) and column 1502 (f_LOW mode). Mode control signal mode_B is provided to register 1102b using mode table 1500, mode_B being equal to the binary value 01 determined by row 1551 (block0_reg_1102b) and column 1503 (f_LOW mode). Mode control signal mode_C is provided to register 1102c using mode table 1500, mode_C being equal to the binary value 11 determined by row 1552 (block0_reg_1102c) and column 1503 (f_LOW mode). Mode control signal mode_D is provided to register 1102d using mode table 1500, mode_D being equal to the binary value 00 determined by row 1553 (block0_reg_1102d) and column 1503 (f_LOW mode). This process is repeated for each register that employs delay compensation and for each frequency and voltage rail operating point. For example, the mode table 1500 includes low frequency operating points defined by columns 1502 and 1503, mid frequency operating points defined by
In the examples described above with respect to fig. 11, each of the mode control signals may consist of a 2-bit binary value to enable selection among the four phase clock signals of these examples. Although in the exemplary embodiments described in this disclosure, each element of the pipeline logic receives two or more phase clock signals and at least one mode control signal, in other embodiments, each element of the pipeline logic may only receive a single clock signal that has been pre-selected or generated by a clock phase controller. Other arrangements for controlling pipeline logic registers to compensate for pipeline logic delays using a phase clock signal or other type of periodic clock edge signal will occur to those of ordinary skill in the art in view of the description in this disclosure.
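The mode table lookup described above can be sketched as a small dictionary keyed by register and operating point. The low-frequency values (00, 01, 11, 00) are those given above for registers 1102a through 1102d; the high-frequency column, in which every register selects phase_0, follows the statement that all registers use a single clock edge signal at the high frequency. The dictionary layout, the key names, and the function name are assumptions made for illustration and are not the structure of the table in FIG. 15.

```python
# The four phase clock signals; a 2-bit mode value indexes into this tuple.
PHASE_CLOCKS = ("phase_0", "phase_1", "phase_2", "phase_3")

# Hypothetical in-memory form of a mode table: register -> operating point -> 2-bit mode.
MODE_TABLE = {
    "block0_reg_1102a": {"f_high": 0b00, "f_low": 0b00},
    "block0_reg_1102b": {"f_high": 0b00, "f_low": 0b01},
    "block0_reg_1102c": {"f_high": 0b00, "f_low": 0b11},
    "block0_reg_1102d": {"f_high": 0b00, "f_low": 0b00},
}


def clock_for(register: str, operating_point: str) -> str:
    """Return the phase clock selected for a register at an operating point."""
    mode = MODE_TABLE[register][operating_point]
    return PHASE_CLOCKS[mode]
```

A programmable table, as opposed to a hardwired one, would correspond to allowing the entries of `MODE_TABLE` to be rewritten for an individual device.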
An
The first and second periodic clock edge signals are related in that a first periodic time interval between successive assertions (e.g., identified edges) of the first clock edge signal is greater than a second periodic time interval between an assertion of the first periodic clock edge signal and a next assertion of the second periodic clock edge signal. Because the next assertion of the second periodic clock edge signal after the assertion of the first periodic clock edge signal occurs before the next assertion of the first periodic clock edge signal, a time between the assertion of the first periodic clock edge signal and the next assertion of the second periodic clock edge signal is less than a time between the assertion of the first periodic clock edge signal and the next assertion of the first periodic clock edge signal. Thus, when the system clock signal has been adjusted to the second (low) frequency, using the next assertion of the second periodic clock edge signal to control the register instead of waiting for the next assertion of the first periodic clock edge signal reduces latency through the pipeline logic.
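The latency reduction described above can be illustrated with a small timing model. It is a sketch under stated assumptions rather than a description of any figure: the 10 ns period, the 5 ns offset of the second clock edge signal, the 2 ns logic delays, and all names are hypothetical, and setup and hold times are ignored.

```python
import math


def capture_times(assignments: list[str], offsets: dict[str, float],
                  period: float, logic_delays: list[float]) -> list[float]:
    """Return the time at which each pipeline register captures the data.

    assignments[i] names the clock edge signal controlling register i;
    logic_delays[i] is the delay between register i and register i + 1.
    Each register captures at the first assertion of its own signal that
    occurs at or after the arrival of the data at its input.
    """
    times = [offsets[assignments[0]]]
    for name, delay in zip(assignments[1:], logic_delays):
        arrival = times[-1] + delay
        k = math.ceil((arrival - offsets[name]) / period)
        times.append(offsets[name] + k * period)
    return times


def latency(assignments, offsets, period, logic_delays) -> float:
    """Time from the first register's capture to the last register's capture."""
    t = capture_times(assignments, offsets, period, logic_delays)
    return t[-1] - t[0]


PERIOD = 10.0                                # first periodic time interval
OFFSETS = {"phase_0": 0.0, "phase_1": 5.0}   # second interval is 5.0, less than PERIOD
DELAYS = [2.0, 2.0, 2.0]

single = latency(["phase_0"] * 4, OFFSETS, PERIOD, DELAYS)
mixed = latency(["phase_0", "phase_1", "phase_0", "phase_1"], OFFSETS, PERIOD, DELAYS)
```

With these assumed numbers, controlling all four registers with phase_0 gives a latency of three full periods, while alternating the two signals lets each register capture half a period after the previous one, halving the latency.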
Aspects of the exemplary methods described in this disclosure, including
Alternative embodiments will become apparent to those of ordinary skill in the art to which the present invention pertains without departing from its spirit and scope. Thus, while selected aspects have been illustrated and described in detail, it will be understood that various substitutions and alterations may be made thereto without departing from the spirit and scope of the invention, as defined by the following claims.