Bandwidth-based power management for peripheral component interconnect express devices

Document No.: 425865 Publication date: 2021-12-21

Reading note: This technology, "Bandwidth-based power management for peripheral component interconnect express devices", was designed by T·塞尔万姆, D·V·穆拉利, M·克里希纳, S·迪亚斯, and T·张 on 2020-05-14. Abstract: A system includes an interface circuit configured to provide an interface with a link, and a controller. The controller is configured to: receive one or more bandwidth requests from one or more clients; and determine at least one of a link speed and a link width for the link based on the one or more bandwidth requests.

1. A system, comprising:

an interface circuit configured to provide an interface with a link; and

a controller configured to:

receive one or more bandwidth requests from one or more clients; and

determine at least one of a link speed and a link width for the link based on the one or more bandwidth requests.

2. The system of claim 1, wherein:

the controller is configured to generate a link request indicative of the at least one of the link speed and the link width; and

the interface circuit is configured to transmit the link request over the link.

3. The system of claim 2, wherein the controller is a peripheral component interconnect express (PCIe) endpoint device controller.

4. The system of claim 3, wherein the interface circuit is configured to send the link request to a PCIe host device via the link.

5. The system of claim 1, wherein the controller is configured to configure the interface circuit to interface with the link at the at least one of the link speed and the link width.

6. The system of claim 1, wherein the link comprises a peripheral component interconnect express (PCIe) link and the interface circuit comprises PCIe interface circuitry.

7. The system of claim 6, wherein the link speed comprises one of a plurality of different PCIe link speeds corresponding to different PCIe generations.

8. The system of claim 6, wherein the link comprises a plurality of lanes, and the link width corresponds to a number of active lanes in the plurality of lanes.

9. The system of claim 1, wherein the controller is configured to determine the link speed for the link using a look-up table that maps each of a plurality of bandwidths to a respective one of a plurality of link speeds.

10. The system of claim 1, wherein the controller is configured to determine the link width for the link using a look-up table that maps each of a plurality of bandwidths to a respective one of a plurality of link widths.

11. The system of claim 1, wherein the controller is configured to determine the link speed for the link by:

determining a power consumption for each of a plurality of different link speeds that satisfy the one or more bandwidth requests; and

determining one of the plurality of different link speeds having a lowest power consumption.

12. The system of claim 1, wherein the controller is configured to determine the link width for the link by:

determining a power consumption for each of a plurality of different link widths that satisfy the one or more bandwidth requests; and

determining one of the plurality of different link widths having a lowest power consumption.

13. The system of claim 1, wherein:

the system comprises a Power Management Integrated Circuit (PMIC) configured to provide one or more voltages to the interface circuit; and

the controller is configured to set the one or more voltages provided by the PMIC to the interface circuit based on the link speed.

14. The system of claim 1, wherein:

the link comprises a plurality of lanes;

the interface circuit comprises a plurality of drivers, wherein each of the drivers is configured to drive a respective one of the lanes;

the system includes a power switching circuit configured to selectively power the plurality of drivers; and

the controller is configured to set a number of the plurality of drivers to be powered by the power switching circuit based on the link width.

15. The system of claim 1, wherein the controller is one of a host controller or a device controller.

16. A method, comprising:

receiving, in a controller, one or more bandwidth requests from a client regarding communications over a link between a link partner and the client;

determining, in the controller, at least one of a link speed and a link width for the link based on the one or more bandwidth requests;

implementing, with the controller, a speed change in the client based on the determined at least one of link speed and link width for the link; and

sending, with the controller, a speed change request over the link to the link partner, the request based on the determined at least one of link speed and link width for the link.

17. The method of claim 16, wherein the controller comprises a peripheral component interconnect express (PCIe) endpoint device controller.

18. The method of claim 17, wherein the PCIe endpoint device controller comprises PCIe interface circuitry configured to send the speed change request to the link partner via the link.

19. The method of claim 18, wherein the PCIe endpoint device controller is configured to implement the speed change in the client based on the determined at least one of link speed and link width for the link.

20. The method of claim 16, wherein the link comprises a peripheral component interconnect express (PCIe) link.

21. The method of claim 20, wherein the link speed comprises one of a plurality of different PCIe link speeds corresponding to different PCIe generations.

22. The method of claim 16, wherein the link comprises a plurality of lanes, and the link width corresponds to a number of active lanes in the plurality of lanes.

23. The method of claim 16, further comprising:

determining the link speed for the link using a look-up table that maps each of a plurality of bandwidths to a respective one of a plurality of link speeds.

24. The method of claim 16, further comprising:

determining the link width for the link using a look-up table that maps each of a plurality of bandwidths to a respective one of a plurality of link widths.

25. The method of claim 16, further comprising:

determining the link speed for the link by:

determining a power consumption for each of a plurality of different link speeds that satisfy the one or more bandwidth requests; and

determining one of the plurality of different link speeds having a lowest power consumption.

26. The method of claim 16, further comprising:

determining the link width for the link by:

determining a power consumption for each of a plurality of different link widths that satisfy the one or more bandwidth requests; and

determining one of the plurality of different link widths having a lowest power consumption.

27. The method of claim 16, further comprising:

providing one or more voltages or clocks to an interface circuit, the interface circuit configured to provide an interface between the link and the controller; and

setting the one or more voltages or clocks provided to the interface circuit based on the link speed.

28. The method of claim 16, wherein the link includes a plurality of lanes, each lane driven by a respective driver of a plurality of drivers, the plurality of drivers being coupled to a power switching circuit configured to selectively power the plurality of drivers, the method further comprising:

setting a number of the plurality of drivers to be selectively powered by the power switching circuit to change the link width based on the determined link width for the link.

29. The method of claim 16, wherein the controller comprises one of a host controller and a device controller coupled to the link, and the link partner comprises the other of the device controller or the host controller.

Technical Field

Aspects of the present disclosure relate generally to peripheral component interconnect express (PCIe) devices, and more particularly, to managing power for PCIe devices.

Background

A system may include one or more processors (e.g., application processors) and peripherals, such as wireless modems, graphics processors, displays, sensors, and the like. The one or more processors may communicate with the peripheral devices using a high-speed communication link according to a standard (i.e., a protocol). One popular standard is the peripheral component interconnect express (PCIe) standard, which supports high-speed links capable of sending data at speeds of several gigabits per second.

Disclosure of Invention

The following presents a simplified summary of one or more implementations in order to provide a basic understanding of such implementations. This summary is not an extensive overview of all contemplated implementations, and is intended to neither identify key or critical elements of all implementations, nor delineate the scope of any or all implementations. Its sole purpose is to present some concepts of one or more implementations in a simplified form as a prelude to the more detailed description that is presented later.

One aspect relates to a system. The system comprises: an interface circuit configured to provide an interface with a link, and a controller. The controller is configured to: receiving one or more bandwidth requests from one or more clients; and determining at least one of a link speed and a link width for the link based on the one or more bandwidth requests.

Another aspect relates to a method. The method comprises receiving, in a controller, one or more bandwidth requests from a client regarding communications over a link between a link partner and the client. In addition, the method comprises: determining, in the controller, at least one of a link speed and a link width for the link based on the one or more bandwidth requests; and implementing, with the controller, a speed change in the client based on the determined at least one of link speed and link width for the link. Further, the method comprises sending, with the controller, a speed change request over the link to the link partner, the request based on the determined at least one of the link speed and the link width for the link.

To the accomplishment of the foregoing and related ends, the one or more implementations comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative aspects of the one or more implementations. These aspects are indicative, however, of but a few of the various ways in which the principles of various implementations may be employed and the described implementations are intended to include all such aspects and their equivalents.

Drawings

Fig. 1 illustrates an example of a root complex coupled to an endpoint device in accordance with certain aspects of the present disclosure.

Fig. 2 illustrates an example of a system including a host system and an endpoint device system, in accordance with aspects of the present disclosure.

Fig. 3 illustrates an example implementation of a link in accordance with certain aspects of the present disclosure.

Fig. 4 is a call flow diagram illustrating an example of a bandwidth-based power management methodology in accordance with certain aspects of the present disclosure.

Fig. 5 is a call flow diagram illustrating another example of a bandwidth-based power management methodology in accordance with certain aspects of the present disclosure.

Fig. 6A illustrates an example of a lookup table mapping bandwidth requirements to link speeds in accordance with certain aspects of the present disclosure.

Fig. 6B illustrates an example of a lookup table mapping bandwidth requirements to link widths in accordance with certain aspects of the present disclosure.

Fig. 6C illustrates an example of a lookup table mapping bandwidth requirements to link speeds and link widths in accordance with certain aspects of the present disclosure.

Fig. 7 is a call flow diagram illustrating yet another example of a bandwidth-based power management method in accordance with certain aspects of the present disclosure.

Fig. 8 is a call flow diagram illustrating yet another example of a bandwidth-based power management method in accordance with certain aspects of the present disclosure.

Fig. 9 illustrates a flow diagram of a further exemplary method for bandwidth-based power management in accordance with aspects of the present disclosure.

Detailed Description

The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations and is not intended to represent the only configurations in which the concepts described herein may be practiced. The detailed description includes specific details for the purpose of providing a thorough understanding of the various concepts. It will be apparent, however, to one skilled in the art that these concepts may be practiced without these specific details. In some instances, well-known structures and components are shown in block diagram form in order to avoid obscuring such concepts.

Aspects of the present disclosure provide bandwidth-based PCIe power management using link speed and/or link width scaling. Aspects of the disclosure are discussed below using the PCIe GEN1 through GEN4 examples. However, it will be appreciated that the present disclosure is not limited to these examples, and that the present disclosure may be used to provide power management for future implementations of the PCIe standard (e.g., GEN5 and higher). Further, it should be noted that while the present disclosure is discussed with respect to PCIe links, those skilled in the art will appreciate that the underlying principles of the disclosed systems and methods may be implemented in other types of PCI links, or even other physical serial interconnects between a host and client devices.

FIG. 1 illustrates a particular example of a system 110 supporting PCIe GEN3, the system 110 including a PCIe root complex 115 (e.g., on a host device) and a PCIe endpoint device 120, where the root complex 115 and the endpoint device 120 are coupled by a PCIe link 125 running at GEN3 speed (up to 8.0 GT/s theoretical speed). In this example, link 125 operates at GEN3 speed even when link 125 is being underutilized by low-bandwidth applications that could be adequately served by GEN2 speed or lower. Operating the link 125 at GEN3 speed may require maintaining the PCIe core and physical layer (PHY) supply voltages at higher voltage levels (also referred to as voltage corners) than at GEN2 or lower speeds. Thus, for low-bandwidth applications, keeping the link 125 at GEN3 speed keeps the supply voltage higher than required (e.g., input voltage 130 at a higher corner, as shown in block 130), which results in more leakage and faster battery depletion.

The current method for managing power specified in the PCIe specification is Active State Power Management (ASPM), which employs a method to reduce power based on link activity detected on the PCIe link between the root complex and the endpoint PCIe device. In this approach, when data is being transmitted over a PCIe link, the link operates in an L0 power state (i.e., a link operating state). When the link is idle (e.g., for a short time interval between data bursts), the link may transition from the L0 state to a lower power state (e.g., L0s → L1 → L1.1/L1.2) to reduce power consumption. In this example, L0s is a low power standby for state L0, and the L1 sub-state is the lowest possible active low power state for the PCIe link. Even though this approach reduces power consumption at the link level, this approach does not change the voltage domain of the PCIe device. Thus, when the link is operating at GEN3 speed or higher PCIe speed, the voltage domain will remain at a higher voltage level (corner) for GEN3 and higher speed PCIe generations.

In the current approach, a PCIe link runs at its maximum speed even during low-throughput traffic scenarios. This not only consumes more power in the PCIe controller and PHY, but also requires other systems interfacing with PCIe (such as, but not limited to, the memory subsystem and system bus interface) to run at higher clock frequencies.

To address this issue, aspects of the present disclosure provide a coordinated power management approach between the root complex and the endpoint device controller that implements dynamic PCIe link speed and/or link width scaling based on the bandwidth requirements of one or more clients. In this manner, during lower-bandwidth use cases, the GEN speed at which the PCIe link operates may be reduced (e.g., from GEN3 to GEN2/GEN1), allowing the voltage domain to be lowered to a lower voltage level (corner) (e.g., the lowest operating level for low-throughput applications that perform well at GEN2 or GEN1 speeds). The reduced voltage level (corner) for the lower-bandwidth use case reduces power consumption.

By reducing link speed (e.g., GEN speed) and/or link width during low throughput data traffic scenarios, aspects of the present disclosure give subsystems the opportunity to scale down voltage levels (e.g., reduce operating levels that meet current throughput requirements on PCIe links). The scaled down voltage level or levels reduce power consumption (e.g., reduce leakage current during sustained low throughput traffic or in idle use cases).
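As a rough illustration of the speed-scaling idea described above, the following Python sketch picks the lowest PCIe generation whose throughput covers the sum of the client bandwidth requests. The function names, the fixed lane count, and the capacity model (raw per-lane signaling rate multiplied by line-encoding efficiency, 8b/10b for GEN1/GEN2 and 128b/130b for GEN3/GEN4) are assumptions made for illustration only; the disclosure does not prescribe this calculation, and packet overhead is ignored.

```python
# Illustrative sketch only: names and the capacity model are assumptions,
# not taken from the disclosure.

# Raw per-lane signaling rate (GT/s) and line-encoding efficiency per generation.
GEN_RATES = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
}


def lane_throughput_gbps(gen):
    """Approximate usable throughput of one lane, ignoring packet overhead."""
    rate, efficiency = GEN_RATES[gen]
    return rate * efficiency


def select_link_speed(requests_gbps, lanes=1, max_gen=4):
    """Return the lowest generation that satisfies the summed requests.

    Falls back to max_gen when no supported generation is fast enough.
    """
    demand = sum(requests_gbps)
    for gen in sorted(GEN_RATES):
        if gen > max_gen:
            break
        if lane_throughput_gbps(gen) * lanes >= demand:
            return gen
    return max_gen
```

For example, `select_link_speed([1.0, 0.5])` selects GEN1 on a single lane, so the voltage domain could be lowered accordingly, whereas a 5 Gb/s request on one lane would need GEN3.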

It should be noted that system 110 may be implemented within a battery-based consumer device, such as a wireless device (e.g., a mobile phone, User Equipment (UE), or Mobile Station (MS)). Furthermore, in the context of a UE, a PCIe root complex or host device may be an application processor or other processor within the UE, and an endpoint device may be an onboard integrated circuit or client, such as a wireless modem, system on a chip (SOC), or 802.11 WiFi radio, as examples.

An exemplary PCIe system 205 in which aspects of the present disclosure may be implemented will now be discussed with reference to fig. 2. This will be followed by a description of a bandwidth-based power management method according to aspects of the present disclosure.

The system 205 includes a host system 210 and a PCIe endpoint device system 250. The host system 210 may be integrated on a first chip (e.g., a system on a chip) and the endpoint device system 250 may be integrated on a second chip. In this example, host system 210 and endpoint device system 250 are coupled by PCIe link 285.

The host system 210 includes one or more host clients 214. Each host client 214 may be implemented on a processor executing software that performs the functions of the host client 214 discussed herein. For the example of more than one host client 214, the host clients 214 may be implemented on the same processor or different processors. The host system 210 also includes a host controller 212 that can perform root complex functions specified in the PCIe specification, as discussed further below. The host controller 212 may be implemented on a processor executing software that performs the functions of the host controller 212 discussed herein.

Host system 210 includes PCIe interface circuit 216, system bus interface 215, and system memory 240. The system bus interface 215 may interface the one or more host clients 214 with the host controller 212, and each of the one or more host clients 214 and the host controller 212 with the PCIe interface circuit 216 and the system memory 240. PCIe interface circuit 216 provides host system 210 with an interface to PCIe link 285. In this regard, PCIe interface circuit 216 is configured to send data (e.g., from host client 214) to endpoint device system 250 over PCIe link 285, and receive data from endpoint device system 250 via PCIe link 285, as discussed further below. PCIe interface circuit 216 includes PCIe controller 218, PHY Interface for the PCI Express Architecture (PIPE) interface 220, Physical (PHY) Transmit (TX) block 222, PHY Receive (RX) block 226, and clock generator 224. The PIPE interface 220 provides a parallel interface between the PCIe controller 218 and the PHY TX block 222 and the PHY RX block 226. PCIe controller 218 (which may be implemented in hardware) may be configured to perform transaction layer, data link layer, and flow control functions specified in the PCIe specification, as discussed further below.

The host system 210 also includes an oscillator (e.g., a crystal oscillator or "XO") 230 configured to generate a stable reference clock signal 232. In one example, the reference clock signal 232 may have a frequency of 19.2 MHz, but is not limited to such a frequency. The reference clock signal 232 is input to the clock generator 224, and the clock generator 224 generates a plurality of clock signals based on the reference clock signal 232, as discussed further below. In this regard, the clock generator 224 may include a plurality of Phase Locked Loops (PLLs), where each PLL generates a respective clock signal of the plurality of clock signals by multiplying the frequency of the reference clock signal 232.

Endpoint device system 250 includes one or more device clients 254. Each device client 254 may be implemented on a processor executing software that performs the functions of the device client 254 discussed herein. For the example of more than one device client 254, the device clients 254 may be implemented on the same processor or different processors. The endpoint device system 250 also includes a device controller 252. As discussed further below, the device controller 252 may be configured to receive bandwidth requests from one or more device clients and determine whether to change the link speed (e.g., GEN speed) and/or the link width based on the bandwidth requests. The device controller 252 may be implemented on a processor executing software that performs the functions of the device controller discussed herein.

Endpoint device system 250 includes PCIe interface circuit 260, system bus interface 256, and system memory 274. The system bus interface 256 may interface the one or more device clients 254 with the device controller 252, and each of the one or more device clients 254 and the device controller 252 with the PCIe interface circuit 260 and the system memory 274. PCIe interface circuit 260 provides endpoint device system 250 with an interface to PCIe link 285. In this regard, PCIe interface circuit 260 is configured to send data (e.g., from device client 254) to host system 210 (also referred to as a host device) over PCIe link 285 and receive data from host system 210 via PCIe link 285, as discussed further below. PCIe interface circuit 260 includes PCIe controller 262, PIPE interface 264, Physical (PHY) Transmit (TX) block 266, PHY Receive (RX) block 270, and clock generator 268. PIPE interface 264 provides a parallel interface between PCIe controller 262 and PHY TX block 266 and PHY RX block 270. PCIe controller 262 (which may be implemented in hardware) may be configured to perform transaction layer, data link layer, and flow control functions specified in the PCIe specification, as discussed further below.

Endpoint device system 250 also includes an oscillator (e.g., a crystal oscillator) 272 configured to generate a stable reference clock signal 273 for system memory 274. In the example in fig. 2, clock generator 224 at host system 210 is configured to generate an Endpoint (EP) reference clock signal 287, which is forwarded by PHY RX block 226 to endpoint device system 250 via differential clock line 288. At endpoint device system 250, PHY RX block 270 receives EP reference clock signal 287 and forwards EP reference clock signal 287 to clock generator 268. The EP reference clock signal 287 may have a frequency of 100 MHz, but is not limited to this frequency. The clock generator 268 is configured to generate a plurality of clock signals based on the EP reference clock signal 287, as discussed further below. In this regard, the clock generator 268 may include a plurality of PLLs, where each PLL generates a respective clock signal of the plurality of clock signals by multiplying the frequency of the EP reference clock signal 287.

The system 205 also includes a Power Management Integrated Circuit (PMIC) 290 coupled to a battery 292 and/or another power source. PMIC 290 is configured to convert the voltage of battery 292 to multiple supply voltages (e.g., using a switching regulator, a linear regulator, or any combination thereof). In this example, PMIC 290 generates voltage 242 for oscillator 230, voltage 244 for PCIe controller 218, and voltage 246 for PHY blocks 222 and 226 and clock generator 224. The voltages 242, 244, and 246 may be programmable, with the PMIC 290 configured to set the voltage levels (corners) of the voltages 242, 244, and 246 according to instructions (e.g., from the host controller 212).

PMIC 290 also generates voltage 280 for oscillator 272, voltage 278 for PCIe controller 262, and voltage 276 for PHY blocks 266 and 270 and clock generator 268. The voltages 280, 278, and 276 may be programmable, with the PMIC 290 configured to set the voltage levels (corners) of the voltages 280, 278, and 276 according to instructions (e.g., from the device controller 252). The PMIC 290 may be implemented on one or more chips. Although PMIC 290 is shown in fig. 2 as one PMIC, it will be appreciated that PMIC 290 may be implemented by two or more PMICs. For example, PMIC 290 may include a first PMIC for generating voltages 242, 244, and 246 and a second PMIC for generating voltages 280, 278, and 276. In this example, both the first PMIC and the second PMIC may be coupled to the battery 292.
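To make the relationship between link speed and programmable supply voltages more concrete, the following Python sketch plans the order of operations when a controller changes GEN speed. Everything here is hypothetical: the corner names, the mapping from generation to corner, and the step strings are illustrative placeholders, and the ordering rule (raise the voltage before speeding up, lower it only after slowing down) is a common voltage-scaling convention assumed for this sketch rather than something stated in the passage above.

```python
# Illustrative sketch only: the corner names and the GEN-to-corner mapping
# are hypothetical placeholders, not values from the disclosure.
CORNER_FOR_GEN = {1: "low", 2: "low", 3: "nominal", 4: "high"}
CORNER_ORDER = ["low", "nominal", "high"]


def plan_transition(current_gen, target_gen):
    """Return the ordered steps for moving between two GEN speeds.

    Assumed rule: when the target needs a higher corner, raise the voltage
    first; when it allows a lower corner, change the speed first.
    """
    if current_gen == target_gen:
        return []
    old_corner = CORNER_FOR_GEN[current_gen]
    new_corner = CORNER_FOR_GEN[target_gen]
    speed_step = f"set_link_speed:GEN{target_gen}"
    voltage_step = f"set_voltage_corner:{new_corner}"
    if old_corner == new_corner:
        return [speed_step]
    if CORNER_ORDER.index(new_corner) > CORNER_ORDER.index(old_corner):
        return [voltage_step, speed_step]
    return [speed_step, voltage_step]
```

In a real system, each step would correspond to an instruction sent to the PMIC or to the PCIe interface circuit; here the steps are plain strings so the ordering can be inspected.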

In operation, PCIe interface circuit 216 on host system 210 may send data from one or more host clients 214 to endpoint device system 250 via PCIe link 285. Data from one or more host clients 214 may be directed to PCIe interface circuit 216 according to a PCIe mapping set by host controller 212 during initial configuration. At PCIe interface circuit 216, PCIe controller 218 may perform transaction layer and data link layer functions on the data, such as packetizing the data, generating error correction codes to be transmitted with the data, and so forth. PCIe controller 218 outputs the processed data to PHY TX block 222 via PIPE interface 220. The processed data includes data from one or more host clients 214 as well as overhead data (e.g., packet headers, error correction codes, etc.). In one example, clock generator 224 may generate a 250 MHz clock 234 for GEN3 based on reference clock 232 and input the 250 MHz clock 234 to PCIe controller 218 to clock the operation of PCIe controller 218. In this example, the PIPE interface 220 may include a 32-bit parallel bus that transmits 32 bits of data in parallel to the PHY TX block 222 in each cycle of the 250 MHz clock 234 (which translates to a transmission rate of approximately 8 GT/s).
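The arithmetic behind the 8 GT/s figure can be checked with a short Python sketch: a 32-bit parallel bus clocked at 250 MHz moves 32 × 250 × 10^6 = 8 × 10^9 bits per second. The helper names below are hypothetical and exist only to show the calculation and its inverse.

```python
# Illustrative sketch only: helper names are hypothetical.

def parallel_bus_rate_gbps(bus_width_bits, clock_hz):
    """Bits moved per second across a parallel bus, in Gb/s."""
    return bus_width_bits * clock_hz / 1e9


def required_clock_mhz(target_gbps, bus_width_bits):
    """Clock frequency (MHz) a bus of the given width needs to reach a target rate."""
    return target_gbps * 1e9 / bus_width_bits / 1e6


# The example from the text: a 32-bit bus at 250 MHz.
rate = parallel_bus_rate_gbps(32, 250e6)
```

The inverse helper shows the trade-off between bus width and clock frequency: halving the bus width to 16 bits would require doubling the clock to sustain the same rate.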

PHY TX block 222 serializes the parallel data from PCIe controller 218 and drives link 285 with the serialized data. In this regard, the PHY TX block 222 may include one or more serializers and one or more drivers. The clock generator 224 may generate a high frequency clock for one or more serializers based on the reference clock signal 232.

At endpoint device system 250, PHY RX block 270 receives the serialized data via link 285 and deserializes the received data into parallel data. In this regard, the PHY RX block 270 may include one or more receivers and one or more deserializers. Clock generator 268 may generate a high frequency clock for one or more deserializers based on EP reference clock signal 287. PHY RX block 270 transmits the deserialized data to PCIe controller 262 via PIPE interface 264. PCIe controller 262 may recover data from one or more host clients 214 from the deserialized data and forward the recovered data to one or more device clients 254.

On the endpoint device system 250, PCIe interface circuit 260 may send data from one or more device clients 254 to the host system 210 via link 285. In this regard, PCIe controller 262 at PCIe interface circuit 260 may perform transaction layer and data link layer functions on the data, such as packetizing the data, generating error correction codes to be transmitted with the data, and so forth. PCIe controller 262 outputs the processed data to PHY TX block 266 via PIPE interface 264. The processed data includes data from one or more device clients 254 as well as overhead data (e.g., packet headers, error correction codes, etc.). In one example, clock generator 268 may generate a 250 MHz clock for GEN3 based on EP reference clock 287 and input the 250 MHz clock to PCIe controller 262 to clock the operation of PCIe controller 262.

PHY TX block 266 serializes the parallel data from PCIe controller 262 and drives link 285 with the serialized data. In this regard, the PHY TX block 266 may include one or more serializers and one or more drivers. The clock generator 268 may generate a high frequency clock for one or more serializers based on the EP reference clock signal 287.

At host system 210, PHY RX block 226 receives the serialized data via link 285 and deserializes the received data into parallel data. In this regard, the PHY RX block 226 may include one or more receivers and one or more deserializers. Clock generator 224 may generate a high frequency clock for one or more deserializers based on reference clock signal 232. The PHY RX block 226 transmits the deserialized data to the PCIe controller 218 via the PIPE interface 220. PCIe controller 218 may recover data from one or more device clients 254 from the deserialized data and forward the recovered data to one or more host clients 214.

Fig. 3 illustrates an example of a PCIe link 285 that may be used in the system of fig. 2 in accordance with certain aspects of the present disclosure. In this example, link 285 includes a plurality of lanes 310-1 to 310-n, where each lane includes a respective first differential line 312-1 to 312-n for transmitting data from host system 210 to endpoint device system 250 and a respective second differential line 315-1 to 315-n for transmitting data from endpoint device system 250 to host system 210. Thus, each lane 310-1 to 310-n is bi-directional. The differential lines 312-1 through 312-n and 315-1 through 315-n may be implemented with metal traces on a substrate (e.g., a printed circuit board), where the host system 210 may be integrated on a first chip mounted on the substrate and the endpoint device system 250 may be integrated on a second chip mounted on the substrate. Differential lines 312-1 through 312-n and 315-1 through 315-n may also be implemented using wires, cables, etc. In this example, when data is sent from host system 210 to endpoint device system 250 across multiple lanes, PHY TX block 222 may include logic to split the data between the lanes. Similarly, when data is sent from endpoint device system 250 to host system 210 across multiple lanes, PHY TX block 266 may include logic to split the data between the lanes.
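The passage above does not say how the PHY TX logic splits data between lanes. One plausible scheme, assumed here purely for illustration, is round-robin byte striping, in which consecutive bytes are assigned to consecutive lanes. The following Python sketch shows the split on the transmit side and the matching reassembly on the receive side; the function names are hypothetical.

```python
# Illustrative sketch only: round-robin byte striping is an assumption,
# and the function names are hypothetical.

def stripe(data, lanes):
    """Split a byte string across lanes: byte k goes to lane k % lanes."""
    return [data[i::lanes] for i in range(lanes)]


def unstripe(per_lane):
    """Reassemble the original byte string from the per-lane byte streams."""
    lanes = len(per_lane)
    total = sum(len(chunk) for chunk in per_lane)
    out = bytearray(total)
    for i, chunk in enumerate(per_lane):
        # Lane i carried bytes i, i + lanes, i + 2 * lanes, ...
        out[i::lanes] = chunk
    return bytes(out)
```

With four lanes, `stripe(b"ABCDEFGH", 4)` places `A` and `E` on the first lane, `B` and `F` on the second, and so on; `unstripe` restores the original order.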

Based on the example in fig. 3, the PHY TX block 222 shown in fig. 2 may be implemented to include a driver 320-1 to 320-n for each differential line 312-1 to 312-n, and the PHY RX block 270 shown in fig. 2 may be implemented to include a receiver 340-1 to 340-n (e.g., an amplifier) for each differential line 312-1 to 312-n. Each driver 320-1 to 320-n is configured to drive a respective differential line 312-1 to 312-n with data, and each receiver 340-1 to 340-n is configured to receive data from the respective differential line 312-1 to 312-n. Further, in FIG. 3, PHY TX block 266 may include drivers 345-1 through 345-n for each differential line 315-1 through 315-n, and PHY RX block 226 may include receivers 325-1 through 325-n (e.g., amplifiers) for each differential line 315-1 through 315-n. Each driver 345-1 to 345-n is configured to drive a respective differential line 315-1 to 315-n with data, and each receiver 325-1 to 325-n is configured to receive data from the respective differential line 315-1 to 315-n.

In some aspects, the width of link 285 is scalable. In these aspects, the width of link 285 is scaled by controlling the number of active lanes 310-1 through 310-n. The greater the number of active lanes, the greater the width of link 285; the smaller the number of active lanes, the smaller the width of link 285. In one example, host controller 212 may configure the width of link 285 by configuring the number of lanes 310-1 to 310-n on which PCIe interface circuitry 216 and PCIe interface circuitry 260 send and/or receive data via link 285.

In one example, the host system 210 may include a power switch circuit 350 configured to individually control power to the drivers 320-1 through 320-n and the receivers 325-1 through 325-n from the PMIC 290. In this regard, the power switch circuit 350 may couple the drivers and receivers of the active lanes to the voltage 246 and decouple the drivers and receivers of the inactive lanes from the voltage 246. In this example, the drivers and receivers of the inactive lanes are powered down to save power. Thus, in this example, the number of drivers and receivers that are powered up scales with the width of link 285. The power switch circuit 350 may be configured to selectively power the drivers 320-1 through 320-n and the receivers 325-1 through 325-n based on instructions from the host controller 212, where the host controller 212 instructs the power switch circuit 350 which drivers and receivers to power on or off (e.g., based on the current link width). For ease of illustration, separate connections or couplings between the power switch circuit 350 and the drivers 320-1 through 320-n and the receivers 325-1 through 325-n are not shown in FIG. 3.

Similarly, endpoint device system 250 as shown in FIG. 2 may include a power switch circuit 360 configured to individually control the power supplied to drivers 345-1 through 345-n and receivers 340-1 through 340-n from PMIC 290. In this regard, the power switch circuit 360 may couple the drivers and receivers of the active lanes to the voltage 276 and decouple the drivers and receivers of the inactive lanes from the voltage 276. Thus, in this example, the drivers and receivers of the inactive lanes are powered down to save power. The power switch circuit 360 may be configured to selectively power the drivers 345-1 through 345-n and the receivers 340-1 through 340-n based on instructions from the device controller 252, wherein the device controller 252 instructs the power switch circuit 360 which drivers and receivers to power on or off (e.g., based on the current link width). For ease of illustration, separate connections or couplings between the power switch circuit 360 and the drivers 345-1 through 345-n and the receivers 340-1 through 340-n are not shown in FIG. 3.
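
The per-lane power gating described above can be sketched as follows. This is a minimal illustration only: the function name `lane_power_mask` and the choice to keep the lowest-numbered lanes active are assumptions, not details taken from the disclosure.

```python
def lane_power_mask(total_lanes: int, active_lanes: int) -> list[bool]:
    """Return one flag per lane: True = driver/receiver powered, False = powered down."""
    if not 1 <= active_lanes <= total_lanes:
        raise ValueError("active_lanes must be between 1 and total_lanes")
    # Assumption: the lowest-numbered lanes are the ones kept active.
    return [lane < active_lanes for lane in range(total_lanes)]


# A four-lane link narrowed to two lanes: the remaining two lanes are switched off.
mask = lane_power_mask(total_lanes=4, active_lanes=2)
print(mask)
```

A controller could pass such a mask to its power switch circuit so that the number of powered drivers and receivers tracks the current link width.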

Link 285 may support multiple link speeds. For example, link 285 may support multiple link speeds corresponding to different generations ("GENs") of the PCIe standard. In this regard, table 1 below lists exemplary transmission rates per lane per direction for GEN1 speed, GEN2 speed, GEN3 speed, GEN4 speed, and GEN5 speed.

TABLE 1

The exemplary transmission rates in table 1 may be theoretical transmission rates. The actual transmission rate for one or more of these link speeds may be slightly lower than the transmission rate shown in table 1. The transmission rate may also be expressed in Gbps. In this example, link speed may refer to the transmission rate per lane per direction.
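
As a rough illustration of why the usable rate is lower than the raw rate, the sketch below computes the payload bandwidth of a link from the per-lane signaling rate and the line-encoding overhead. The figures are the raw rates and encodings defined by the PCIe specification (8b/10b for GEN1 and GEN2, 128b/130b for GEN3 and later), not a copy of table 1; the names `PCIE_GEN` and `link_bandwidth_gbps` are hypothetical, and protocol overhead above the physical layer is ignored.

```python
# GEN -> (raw signaling rate per lane per direction in GT/s, line-encoding efficiency)
PCIE_GEN = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def link_bandwidth_gbps(gen: int, lanes: int) -> float:
    """Usable bandwidth per direction in Gbps after line encoding."""
    rate_gt, efficiency = PCIE_GEN[gen]
    return rate_gt * efficiency * lanes


for gen in PCIE_GEN:
    print(f"GEN{gen} x1: {link_bandwidth_gbps(gen, 1):.2f} Gbps")
```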

In the above example, host controller 212 and device controller 252 may negotiate a link speed (e.g., GEN speed) and configure PCIe interface circuits 216 and 260 to operate at the negotiated link speed (e.g., according to PCIe hardware programming guidelines). In this example, host controller 212 may set one or more of voltages 242, 244, and 246 based on the current link speed (current GEN speed). In one example, host controller 212 may include a table that maps each supported link speed (e.g., each supported GEN speed) to one or more corresponding voltage levels (corners). In this example, the host controller 212 may instruct the PMIC 290 to set one or more of the voltages 242, 244, and 246 provided by the PMIC 290 according to the one or more voltage levels (corners) mapped to the current link speed (e.g., the current GEN speed). The voltage level (corner) for a lower link speed may be lower than the voltage level (corner) for a higher link speed (e.g., due to looser timing requirements at the lower link speed). The host controller 212 may instruct the PMIC 290 directly or through another processor in direct communication with the PMIC 290. Thus, in this example, the voltage level (corner) of the host system 210 scales with the link speed (e.g., GEN speed).

Similarly, the device controller 252 may set one or more of the voltages 276, 278, and 280 based on the current link speed (current GEN speed). In one example, the device controller 252 may include a table that maps each supported link speed (e.g., each supported GEN speed) to one or more corresponding voltage levels (corners). In this example, the device controller 252 may instruct the PMIC 290 to set one or more of the voltages 276, 278, and 280 provided by the PMIC 290 according to the one or more voltage levels (corners) mapped to the current link speed (e.g., the current GEN speed). The device controller 252 may instruct the PMIC 290 directly or through another processor in direct communication with the PMIC 290. Thus, in this example, the voltage level (corner) of endpoint device system 250 scales with link speed (e.g., GEN speed).
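
A speed-to-corner table of the kind described above might look like the following sketch. The corner names, the particular mapping, and the helper `corner_update` are all hypothetical and chosen only for illustration; an actual mapping would depend on the silicon's timing requirements.

```python
from enum import IntEnum
from typing import Optional


class Corner(IntEnum):
    """Hypothetical voltage corners, ordered from lowest to highest voltage."""
    LOW = 0
    NOMINAL = 1
    HIGH = 2


# Hypothetical table: each supported GEN speed maps to a voltage corner.
CORNER_FOR_GEN = {
    1: Corner.LOW,
    2: Corner.LOW,
    3: Corner.NOMINAL,
    4: Corner.HIGH,
    5: Corner.HIGH,
}


def corner_update(old_gen: int, new_gen: int) -> Optional[Corner]:
    """Return the corner to request from the PMIC, or None if it is unchanged."""
    old_corner = CORNER_FOR_GEN[old_gen]
    new_corner = CORNER_FOR_GEN[new_gen]
    return None if new_corner == old_corner else new_corner


print(corner_update(3, 1))  # a downshift that lowers the corner
print(corner_update(1, 2))  # both speeds share a corner, so nothing to request
```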

Fig. 4 shows a call flow diagram of an exemplary bandwidth-based power management method 410 using link speed scaling, according to aspects of the present disclosure. The method 410 may also include link width scaling according to further aspects. At 412, the device controller 252 (or an endpoint PCIe software process) receives bandwidth requests from one or more device clients 254 (or endpoint PCIe client software processes). Each device client 254 may generate a respective bandwidth request based on the bandwidth requirements of that client. Each bandwidth request may have any format. For example, each bandwidth request may indicate the bandwidth requirements of the respective device client 254 in Mbps or another scale of bits per second. For the example where endpoint device system 250 includes multiple device clients 254, device controller 252 may receive multiple bandwidth requests from multiple device clients 254. In this example, the device controller 252 may aggregate the bandwidth requests (e.g., aggregate the bandwidth requirements indicated in the bandwidth requests).
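
The aggregation step at 412 can be sketched as below. The disclosure does not fix how requests are aggregated; summing the most recent requirement from each client is an assumption made here, and `BandwidthRequest` and `aggregate_mbps` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BandwidthRequest:
    client_id: str
    mbps: float  # bandwidth requirement of the client in Mbps


def aggregate_mbps(requests: Iterable[BandwidthRequest]) -> float:
    """Sum the most recent requirement of each client (assumed aggregation rule)."""
    latest: dict[str, float] = {}
    for request in requests:
        latest[request.client_id] = request.mbps  # a newer request replaces an older one
    return sum(latest.values())


requests = [
    BandwidthRequest("modem", 400.0),
    BandwidthRequest("wifi", 250.0),
    BandwidthRequest("modem", 100.0),  # the modem lowers its requirement
]
print(aggregate_mbps(requests))
```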

The device controller 252 implements a bandwidth solver 413, which may be an algorithmic process executing within the device controller 252, a hardware component of the device controller 252, or a combination thereof, to determine whether to change the current link speed of the link 285 based on the bandwidth requests (or, for the case of multiple device clients 254, based on the aggregated bandwidth requests). Bandwidth solver 413 may implement decision process 414 to determine whether a speed change for link 285 is warranted and also to perform scaling of the link speed or link width. Process 414 determines whether a speed change (a decrease or an increase in speed) is warranted. The decision result is shown by block 415: if a change is needed, messages (e.g., 416 and 420) are sent to the device client 254 and the host controller 212 to effect the speed change (the portion of block 415 above the dashed line shown within the block). Otherwise, when a speed change is not warranted, a no-speed-change message 418 is returned to the device client 254 (the portion of block 415 below the dashed line). In one example, if the bandwidth request (or aggregated bandwidth request) indicates a low bandwidth requirement that may be adequately served by a lower link speed or lower link width, the bandwidth solver 413 may determine to decrease the link speed from the current link speed to the lower link speed (e.g., from the GEN3 speed to the GEN2 or GEN1 speed). In another example, if the bandwidth request (or aggregated bandwidth request) indicates a high bandwidth requirement, the bandwidth solver 413 may determine to increase the link speed from the current link speed to a higher link speed (e.g., from a GEN1 or GEN2 speed to a GEN3 speed).
As described above, the bandwidth solver 413 may be implemented in software executed by the device controller 252, or some hardware logic as part of the device controller 252 or in communication with the device controller 252, or a combination thereof. An exemplary implementation of the bandwidth solver 413 is provided below.
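
One possible shape for decision process 414 is sketched below: pick the lowest link speed whose capacity covers the aggregated requirement and report whether that differs from the current speed. The selection rule, the restriction to GEN1 through GEN3, and the names `LANE_MBPS`, `Decision`, and `decide_speed` are assumptions for illustration; the per-lane capacities follow from the PCIe raw rates and line encodings.

```python
from typing import NamedTuple

# Usable Mbps per lane per direction after line encoding (GEN1 to GEN3 only).
LANE_MBPS = {1: 2000.0, 2: 4000.0, 3: 8000.0 * 128 / 130}


class Decision(NamedTuple):
    change: bool     # True if a speed change is warranted
    target_gen: int  # link speed to use after the decision


def decide_speed(required_mbps: float, current_gen: int, lanes: int = 1) -> Decision:
    """Choose the lowest GEN speed that can serve the requirement at this width."""
    for gen in sorted(LANE_MBPS):
        if LANE_MBPS[gen] * lanes >= required_mbps:
            return Decision(gen != current_gen, gen)
    # Requirement exceeds every supported speed: fall back to the fastest one.
    fastest = max(LANE_MBPS)
    return Decision(fastest != current_gen, fastest)


print(decide_speed(1500.0, current_gen=3))  # low demand on a GEN3 link
print(decide_speed(3000.0, current_gen=2))  # demand already matched by GEN2
```

A production solver would likely add hysteresis so that demand hovering near a threshold does not trigger repeated link retraining.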

If the bandwidth solver 413 determines that no link speed change is required, the device controller 252 may send an indicator to the one or more device clients 254 at 418 indicating no link speed change. If the bandwidth solver 413 determines to change the link speed, the device controller 252 may send an indicator or message to the one or more device clients 254 at 416 indicating the link speed change. The indicator to the device clients 254 may also indicate the new link speed or PCIe generation.

If the bandwidth solver 413 determines to change the link speed, the device controller 252 sends a speed change request to the host controller 212 at 420. The request 420 may indicate the new link speed. For example, if the bandwidth solver determines to change from the GEN1 or GEN2 speed to the GEN3 speed, the request to the host controller 212 may indicate the GEN3 speed. Device controller 252 may send the request to host controller 212 via PCIe interface circuits 260 and 216 and link 285.

According to further aspects, as an option indicated at block 419, the bandwidth solver may also initiate preparation of resources (e.g., system or power resources) for the speed change, including scaling them up. With respect to the process in block 419, system resources or power resources (including but not limited to voltage regulators or clock sources) may be scaled up for higher link speed change requests, if needed. This scaling up of power or system resources may reflect an aggregation of all previous bandwidth change requests, including the current request that has been sent or is being sent to the host and for which acknowledgement of completion of the link speed change is pending. This example illustrates only one implementation of preparing system or power resources for an outstanding link speed change request, and the disclosure is not limited to it. After the speed change request is sent to the host (as shown at 420), in another optional aspect, the host controller 212 may be configured to prepare resources for the speed change (including changes to accommodate higher link speed change requests), as shown at block 421 (e.g., to prepare system or power resources, including but not limited to voltage regulators and/or clock sources).

Additionally, in response to the request to change the link speed, host controller 212 may initiate a link speed change using a link speed change implementer at 422. Link speed change implementer 422 may handle speed changes according to PCIe hardware programming guidelines in the PCIe specification, which may include performing link retraining and reconfiguring PCIe interface circuits 216 and 260 for the new link speed. Link speed change implementer 422 may be implemented in software executed by host controller 212, hardware associated with controller 212 or as part of controller 212, or a combination thereof.

When the link speed change process is complete, the host controller 212 may send an indicator to the device controller 252 at 424 indicating that the speed change is complete. At 426, host controller 212 changes or updates the voltage level (corner) of one or more of voltages 242, 244, and 246 based on the new link speed, if necessary. For example, if the new link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the host controller 212 may decrease the voltage level (corner) of one or more of the voltages 242, 244, and 246. In another example, if the new link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), host controller 212 may increase the voltage level (corner) of one or more of voltages 242, 244, and 246. As discussed above, host controller 212 may change the voltage level (corner) of one or more of voltages 242, 244, and 246 by instructing PMIC 290 to set the voltage level (corner) of one or more of voltages 242, 244, and 246 provided by PMIC 290 based on the new link speed. The host controller 212 may leave the voltage levels of the voltages 242, 244, and 246 unchanged if the voltage levels for the new link speed are the same as the voltage levels for the previous link speed. The voltage scaling for the new link speed at 426 may be integrated with the link speed change process performed by the link speed change implementer 422 (i.e., the voltage scaling may be part of the link speed change process).

At 428, in response to the speed change complete indication from the host controller 212, the device controller 252 may send an indicator 428 to the one or more device clients 254, the indicator 428 notifying the one or more device clients 254 of the link speed change.

At 430, the device controller 252 updates the voltage level (corner) of one or more of the voltages 276, 278, and 280 based on the new link speed, if necessary. For example, if the new link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the device controller 252 may decrease the voltage level (corner) of one or more of the voltages 276, 278, and 280. In another example, if the new link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), the device controller 252 may increase the voltage level (corner) of one or more of the voltages 276, 278, and 280. As discussed above, the device controller 252 may change the voltage level (corner) of one or more of the voltages 276, 278, and 280 by instructing the PMIC 290 to set the voltage level (corner) of one or more of the voltages 276, 278, and 280 provided by the PMIC 290 based on the new link speed. The device controller 252 may leave the voltage levels of the voltages 276, 278, and 280 unchanged if the voltage levels for the new link speed are the same as the voltage levels for the previous link speed.

Thus, for low bandwidth use cases, the example power management method 410 reduces link speed (e.g., GEN speed), which gives the host controller 212 and the device controller 252 the opportunity to scale down one or more voltage levels of the system 205 (e.g., to a lower operating level that still meets the current throughput requirements on the link 285). The scaled down voltage level or levels reduce power consumption (e.g., reduce leakage current during sustained low throughput traffic or in idle use cases).
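
The ordering of the messages and actions in method 410 can be summarized with the short sketch below, which lists the reference numerals of fig. 4 in the order described above. The function name `endpoint_initiated_flow` is hypothetical, and the optional resource-preparation blocks 419 and 421 are omitted for brevity.

```python
def endpoint_initiated_flow(change_needed: bool) -> list[tuple[int, str]]:
    """Return the (reference numeral, description) steps of method 410 in order."""
    steps = [
        (412, "device clients -> device controller: bandwidth request(s)"),
        (414, "device controller: bandwidth solver decision"),
    ]
    if not change_needed:
        steps.append((418, "device controller -> device clients: no link speed change"))
        return steps
    steps += [
        (416, "device controller -> device clients: link speed change indicated"),
        (420, "device controller -> host controller: speed change request"),
        (422, "host controller: link speed change (retraining, reconfiguration)"),
        (424, "host controller -> device controller: speed change complete"),
        (426, "host controller: update voltages 242, 244, 246 if necessary"),
        (428, "device controller -> device clients: link speed changed"),
        (430, "device controller: update voltages 276, 278, 280 if necessary"),
    ]
    return steps


for numeral, description in endpoint_initiated_flow(change_needed=True):
    print(numeral, description)
```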

In the example shown in fig. 4, the bandwidth solver is implemented on the endpoint device side. However, it will be appreciated that the present disclosure is not limited to this example. For example, a bandwidth solver may also be implemented on the host side according to certain aspects. In this regard, fig. 5 illustrates a call flow diagram showing an exemplary bandwidth-based power management method 510 in which a bandwidth solver is implemented on the host side, in accordance with aspects of the present disclosure.

At 512, host controller 212 receives bandwidth requests from one or more host clients 214. Each host client 214 may generate a respective bandwidth request based on the bandwidth requirements of that client. Each bandwidth request may have any format (e.g., indicating the bandwidth requirements of the respective client in Mbps or another scale of bits per second). For an example in which the host system 210 includes multiple host clients 214, the host controller 212 may receive multiple bandwidth requests from the multiple host clients 214. In this example, host controller 212 may aggregate the bandwidth requests (e.g., aggregate the bandwidth requirements indicated in the bandwidth requests).

At 514, host controller 212 implements bandwidth solver 513 to determine whether to change the current link speed of link 285 based on the bandwidth request (or, for the case of multiple host clients, based on the aggregated bandwidth request). For example, if the bandwidth request (or aggregated bandwidth request) indicates a low bandwidth requirement that can be adequately served at a lower link speed, the bandwidth solver may determine to decrease the link speed from the current link speed to the lower link speed (e.g., from the GEN3 speed to the GEN2 or GEN1 speed). In another example, if the bandwidth request (or aggregated bandwidth request) indicates a high bandwidth requirement, the bandwidth solver may determine to increase the link speed from the current link speed to a higher link speed (e.g., from a GEN1 or GEN2 speed to a GEN3 speed). Exemplary implementations of bandwidth solvers are provided below. The bandwidth solver 513 may be implemented in software executed by the host controller 212, by hardware associated with the host controller 212, or by some combination thereof.

If the bandwidth solver 513 determines that no link speed change is required, the host controller 212 may send an indicator to the one or more host clients 214 at 518 indicating no link speed change. If the bandwidth solver 513 determines to change the link speed, the host controller 212 may send an indicator indicating the link speed change to the one or more host clients 214 at 516. The indicator may also indicate the new link speed. If the bandwidth solver 513 determines to change the link speed, the host controller 212 may also send a link speed change request to the device controller 252 at 520, which informs the device controller 252 of the proposed link speed change. The request may indicate the new link speed. After the host controller 212 sends the message at 520, the host controller 212 may be configured to wait for a response from the device controller 252. In response to the request, the device controller 252 may send an Acknowledgement (ACK) to the host controller 212 indicating that the device controller 252 is ready for the link speed change, as shown at 521. If the endpoint device does not support the new link speed, the device controller 252 may instead send a Negative Acknowledgement (NACK) to the host controller 212 indicating that the endpoint device does not support the proposed link speed change. If the host controller 212 receives a NACK from the device controller 252, the host controller 212 may abort the link speed change.
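
The request and ACK/NACK exchange at 520 and 521 can be sketched as follows. The names `Reply`, `device_reply`, and `host_request_speed`, and the representation of device capability as a set of supported GEN speeds, are assumptions made for illustration.

```python
from enum import Enum


class Reply(Enum):
    ACK = "ACK"
    NACK = "NACK"


def device_reply(requested_gen: int, supported_gens: set[int]) -> Reply:
    """Device controller side: acknowledge only link speeds the endpoint supports."""
    return Reply.ACK if requested_gen in supported_gens else Reply.NACK


def host_request_speed(current_gen: int, requested_gen: int,
                       device_supported: set[int]) -> int:
    """Host controller side: return the link speed in effect after the handshake."""
    if requested_gen == current_gen:
        return current_gen  # nothing to request
    if device_reply(requested_gen, device_supported) is Reply.NACK:
        return current_gen  # NACK received: abort the link speed change
    return requested_gen    # ACK received: proceed with the link speed change


print(host_request_speed(1, 3, {1, 2, 3}))  # supported, so the change proceeds
print(host_request_speed(1, 4, {1, 2, 3}))  # unsupported, so the change is aborted
```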

In the alternative, after receiving the message 520, the device controller 252 may be configured to initiate scaling up of system or power resources, including but not limited to a voltage regulator or clock source (e.g., PMIC 290 or clock generator 268), for higher link speed change requests, as shown at block 540. Additionally, upon receiving the ACK message 521, the host controller 212 may also be configured to then scale up its system or power resources (e.g., PMIC 290 or clock generator 224) to accommodate the higher link speed change request, as shown at block 542.

If the bandwidth solver 513 determines to change the link speed, the host controller 212 may initiate a link speed change using a link speed change implementer at 522. Link speed change implementer 522 may process the speed change according to PCIe hardware programming guidelines in the PCIe specification, which may include performing link retraining and reconfiguring PCIe interface circuits 216 and 260 for the new link speed. The link speed change implementer may be implemented in software executed by host controller 212.

When the link speed change process is complete, the host controller 212 may send an indicator to the one or more host clients 214 at 524 that informs the one or more host clients 214 of the link speed change. The host controller 212 may also send an indicator to the device controller 252 at 526 indicating that the speed change is complete.

At 528, host controller 212 updates the voltage level (corner) of one or more of voltages 242, 244, and 246 based on the new link speed, if necessary. For example, if the new link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the host controller 212 may decrease the voltage level (corner) of one or more of the voltages 242, 244, and 246. In another example, if the new link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), host controller 212 may increase the voltage level (corner) of one or more of voltages 242, 244, and 246. If the voltage levels of the voltages 242, 244, and 246 for the new link speed are the same as the voltage levels for the previous link speed, the host controller 212 may leave these voltage levels unchanged. The voltage scaling for the new link speed at 528 may be integrated with the link speed change process performed by link speed change implementer 522 (i.e., the voltage scaling may be part of the link speed change process).

At 530, the device controller 252 updates the voltage level (corner) of one or more of the voltages 276, 278, and 280 based on the new link speed, if necessary. For example, if the new link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the device controller 252 may decrease the voltage level (corner) of one or more of the voltages 276, 278, and 280. In another example, if the new link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), the device controller 252 may increase the voltage level (corner) of one or more of the voltages 276, 278, and 280. The device controller 252 may leave the voltage levels of the voltages 276, 278, and 280 unchanged if the voltage levels for the new link speed are the same as the voltage levels for the previous link speed.

The bandwidth solver 513 can be implemented in any of a number of ways to convert bandwidth requirements from one or more clients (e.g., one or more host clients, one or more device clients, etc.) to one of the following sets of link parameters: PCIe link speed only; PCIe link width only; or both PCIe link speed and link width. For the case of multiple clients, the bandwidth requirement may be an aggregation of the bandwidth requirements of the multiple clients. The bandwidth solver may also take into account additional parameters such as burst frequency. Fig. 4 and 5, discussed above, illustrate examples in which a bandwidth solver converts bandwidth requirements for one or more clients to link speeds.

The bandwidth solver 513 can be implemented in any of a number of ways to convert the bandwidth requirements into link speeds and/or link widths. Exemplary implementations of bandwidth solvers are discussed below. However, it will be appreciated that the bandwidth solver is not limited to these examples, and may be extended to other implementations based on the system power budget.

In certain aspects, the bandwidth solver 513 may convert the bandwidth requirements to PCIe link parameters by looking up a table and using the table to determine link speeds and/or link widths based on the bandwidth requirements. Fig. 6A shows an example of a lookup table 610 for converting bandwidth requirements (aggregated bandwidth requirements for the case of multiple clients) to link speeds (e.g., GEN speeds). In this example, table 610 includes different bandwidths (labeled "Bandwidth 1" through "Bandwidth m") and corresponding link speeds (labeled "Link speed 1" through "Link speed m") for each of the bandwidths. The bandwidth may be expressed in Mbps or another unit. Thus, table 610 maps each bandwidth to a corresponding link speed. It will be appreciated that two or more bandwidths may be mapped to the same link speed. It will also be appreciated that each bandwidth entry in the table may be a range of bandwidths mapped to the same link speed.

The table 610 may be pre-stored in a memory coupled to the bandwidth solver. When the bandwidth solver receives a bandwidth requirement from one or more clients, the bandwidth solver can convert the bandwidth requirement (aggregated bandwidth requirement for the case of multiple clients) to a link speed by looking up the link speed mapped to the bandwidth requirement in table 610. For example, if the bandwidth requirement (aggregated bandwidth requirement for the case of multiple clients) corresponds to bandwidth 1 in table 610, the bandwidth solver may convert the bandwidth requirement to link speed 1. If the corresponding link speed in table 610 is different from the current link speed, the device controller 252 or host controller 212 may initiate a link speed change to the corresponding link speed from table 610, as discussed above.
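
A table such as table 610, with each entry covering a range of bandwidths, could be searched as in the sketch below. The range boundaries and the names `UPPER_BOUNDS_MBPS`, `LINK_SPEEDS`, and `lookup_link_speed` are hypothetical; only the idea of mapping bandwidth ranges to link speeds comes from the text above.

```python
from bisect import bisect_left

# Hypothetical table: upper bound (inclusive, Mbps) of each bandwidth range,
# paired by index with the GEN speed mapped to that range.
UPPER_BOUNDS_MBPS = [1000, 3000, 6000]
LINK_SPEEDS = [1, 2, 3]


def lookup_link_speed(required_mbps: float) -> int:
    """Return the GEN speed mapped to the range containing required_mbps."""
    index = bisect_left(UPPER_BOUNDS_MBPS, required_mbps)
    # Requirements above the last bound are served at the highest listed speed.
    return LINK_SPEEDS[min(index, len(LINK_SPEEDS) - 1)]


current_gen = 3
target_gen = lookup_link_speed(800)
if target_gen != current_gen:
    print(f"initiate link speed change: GEN{current_gen} -> GEN{target_gen}")
```

Binary search keeps the lookup cheap even for a table with many ranges, which suits a solver that runs on every bandwidth request.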

The table 610 may be generated based on computer simulations of the system 205 and/or power measurements of the system 205 for various bandwidth and link speed scenarios. In this example, the link speed that results in the lowest power for a particular bandwidth based on simulation results and/or power measurements may be mapped to the bandwidth in table 610. Accordingly, the table 610 may be populated based on the simulation results and/or power measurements and then stored in a memory accessible to the bandwidth solver.

Fig. 6B shows an example of a lookup table 620 for converting bandwidth requirements (aggregated bandwidth requirements for the case of multiple clients) to link widths. In this example, table 620 includes different bandwidths (labeled "Bandwidth 1" through "Bandwidth m") and corresponding link widths (labeled "Link Width 1" through "Link Width m") for each of the bandwidths. Thus, table 620 maps each bandwidth to a corresponding link width. It will be appreciated that two or more bandwidths may be mapped to the same link width. It will also be appreciated that each bandwidth entry in the table may be a range of bandwidths that map to the same link width. In one example, the link width may be specified by the number of active lanes in link 285 corresponding to the link width. As discussed above with reference to fig. 3, the greater the number of active lanes in link 285, the wider the width of link 285.

In one example, a table 620 can be generated for each supported link speed and pre-stored in a memory coupled to the bandwidth solver. Thus, in this example, each link speed may have a corresponding table 620. In this example, the bandwidth solver may use a table 620 corresponding to the current link speed.

When the bandwidth solver receives a bandwidth requirement from one or more clients, the bandwidth solver may convert the bandwidth requirement (aggregated bandwidth requirement for the case of multiple clients) to a link width by looking up the link width mapped to the bandwidth requirement in table 620. For example, if the bandwidth requirement (aggregated bandwidth requirement for the case of multiple clients) corresponds to bandwidth 1 in table 620, the bandwidth solver may convert the bandwidth requirement to link width 1. If the corresponding link width in table 620 is different from the current link width, the device controller 252 or host controller 212 may initiate a link width change to the corresponding link width from table 620, as discussed further below.
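
Keeping one width table per supported link speed, as described above, might be sketched as follows. The table contents and the names `WIDTH_TABLES` and `lookup_link_width` are hypothetical; widths are expressed as the number of active lanes.

```python
# Hypothetical per-speed tables: GEN speed -> list of
# (upper bound of bandwidth range in Mbps, number of active lanes).
WIDTH_TABLES = {
    1: [(2000, 1), (4000, 2), (8000, 4)],
    2: [(4000, 1), (8000, 2), (16000, 4)],
}


def lookup_link_width(current_gen: int, required_mbps: float) -> int:
    """Return the lane count mapped to required_mbps at the current link speed."""
    table = WIDTH_TABLES[current_gen]
    for upper_bound_mbps, lanes in table:
        if required_mbps <= upper_bound_mbps:
            return lanes
    return table[-1][1]  # above every range: use the widest listed width


print(lookup_link_width(current_gen=1, required_mbps=3000))
print(lookup_link_width(current_gen=2, required_mbps=3000))
```

The same requirement maps to fewer lanes at the higher speed, which is why the solver consults the table for the current link speed.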

Table 620 may be generated based on computer simulations of system 205 and/or power measurements of system 205 for various bandwidth and link width scenarios. In this example, the link width that results in the lowest power for a particular bandwidth based on simulation results and/or power measurements may be mapped to the bandwidth in table 620. Accordingly, table 620 may be populated based on simulation results and/or power measurements and then stored in memory accessible to the bandwidth solver.

For the example where the bandwidth solver is implemented on the endpoint device side, the device controller 252 may send a link width change request to the host controller 212 if the bandwidth solver determines that the link width is to be changed. In response, host controller 212 may process the width change in accordance with PCIe hardware programming guidelines in the PCIe specification, which may include performing link retraining for the new link width and reconfiguring PCIe interface circuits 216 and 260. When the link width change is complete, the host controller 212 may notify the device controller 252.

In this example, if the link width is reduced, host controller 212 may power down the drivers in PHY TX block 222 and/or the receivers in PHY RX block 226 that correspond to lanes in link 285 that are deactivated due to the link width change. As discussed above, the host controller 212 may power down the selected drivers and/or receivers by sending an instruction to the power switch circuit 350 to turn off the selected drivers and/or receivers. In other words, the host controller 212 sets the number of drivers and/or receivers that are powered by the power switch circuit 350 based on the new link width.

Similarly, device controller 252 may power down the drivers in PHY TX block 266 and/or the receivers in PHY RX block 270 that correspond to lanes in link 285 that are deactivated due to the link width change. As discussed above, the device controller 252 may power down the selected drivers and/or receivers by sending an instruction to the power switch circuit 360 to turn off the selected drivers and/or receivers. In other words, the device controller 252 sets the number of drivers and/or receivers that are powered by the power switch circuit 360 based on the new link width. Accordingly, components associated with lanes that are deactivated due to link width changes may be powered down to save power.

For the example in which the bandwidth solver is implemented on the host side, if the bandwidth solver determines to change the link width, the host controller 212 may process the width change according to PCIe hardware programming guidelines in the PCIe specification, which may include performing link retraining for the new link width and reconfiguring the PCIe interface circuits 216 and 260. Host controller 212 may also notify device controller 252 of the link width change.

In this example, if the link width is reduced, host controller 212 may power down the drivers in PHY TX block 222 and/or the receivers in PHY RX block 226 that correspond to lanes in link 285 that are deactivated due to the link width change. Similarly, device controller 252 may power down the drivers in PHY TX block 266 and/or the receivers in PHY RX block 270 that correspond to lanes in link 285 that are deactivated due to the link width change.

Fig. 6C shows an example of a lookup table 630 for converting bandwidth requirements to both link speed (e.g., GEN speed) and link width. In this example, the table 630 includes different bandwidths (labeled "bandwidth 1" to "bandwidth m") and corresponding link speeds (labeled "link speed 1" to "link speed m") and link widths (labeled "link width 1" to "link width m") for each of the bandwidths. The bandwidth may be expressed in Mbps or another unit. Thus, table 630 maps each bandwidth to a corresponding link speed and link width. It will be appreciated that two or more bandwidths may be mapped to the same link speed and/or the same link width.

The table 630 may be pre-stored in a memory coupled to the bandwidth solver. When the bandwidth solver receives a bandwidth requirement from one or more clients, the bandwidth solver may convert the bandwidth requirement (aggregated bandwidth requirement for the case of multiple clients) to a link speed and a link width by looking up the link speed and the link width mapped to the bandwidth requirement in table 630. If the corresponding link speed in table 630 is different from the current link speed, the device controller 252 or host controller 212 may initiate a link speed change to the corresponding link speed from table 630, as discussed above. If the corresponding link width in table 630 is different from the current link width, the device controller 252 or host controller 212 may initiate a link width change to the corresponding link width from table 630, as discussed above. Thus, the link speed may be changed, the link width may be changed, or both the link speed and the link width may be changed.

The table 630 may be generated based on computer simulations of the system 205 and/or power measurements of the system 205 for various bandwidth, link speed, and link width scenarios. In this example, for each particular bandwidth, the link speed and link width that result in the lowest power according to the simulation results and/or power measurements may be mapped to that bandwidth in table 630. Accordingly, table 630 may be populated based on simulation results and/or power measurements and then stored in memory accessible to the bandwidth solver.
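One way to picture how such a table could be populated is shown below. The measurement tuples, the power figures, and the function name `build_table` are hypothetical and serve only to illustrate keeping the lowest-power configuration for each bandwidth.

```python
# Hypothetical measurements: (bandwidth in Mbps, link speed, link width, power in mW).
# Every row is assumed to be a configuration that can carry the listed bandwidth.
MEASUREMENTS = [
    (2_000, "GEN1", 1, 120.0),
    (2_000, "GEN2", 1, 150.0),
    (2_000, "GEN1", 2, 180.0),
    (7_000, "GEN2", 2, 310.0),
    (7_000, "GEN3", 1, 260.0),
    (7_000, "GEN3", 2, 400.0),
]


def build_table(measurements):
    """Map each bandwidth to the (speed, width) pair with the lowest measured power."""
    best = {}
    for bandwidth, speed, width, power_mw in measurements:
        # Keep the row only if it beats the best power seen so far for this bandwidth.
        if bandwidth not in best or power_mw < best[bandwidth][0]:
            best[bandwidth] = (power_mw, speed, width)
    return {bw: (speed, width) for bw, (_, speed, width) in sorted(best.items())}
```

The resulting dictionary plays the role of the pre-stored table: it is computed once, offline, and only consulted at run time.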

In certain aspects, a power budget may be prepared by experiment or simulation for various link configurations (e.g., various link speeds and link widths) by varying link parameters such as, but not limited to, the following: L0 power consumption; L0s power consumption; L1 power consumption; L1ss power consumption; L0s entry time; L1 entry time; L1ss entry time; L0s exit latency; L1 exit latency; and L1ss exit latency. In these aspects, the bandwidth solver may then select the link speed and the link width with the lowest power consumption.

Fig. 7 illustrates another call flow diagram of an exemplary bandwidth-based power management method 700 using link speed scaling according to aspects of the present disclosure. In particular, the method 700 relates to scenarios in which the device client 254 initiates speed scaling and the speed change is also completed, processed, and/or implemented by the device controller 252 in cooperation with the host controller 212. The method 700 may also include link width scaling according to further aspects.

At 702, one or more device clients 254 (e.g., endpoint PCIe client software processes) send bandwidth requests to the device controller 252 (e.g., an endpoint PCIe software process). Each device client 254 may generate a respective bandwidth request based on the client's bandwidth requirements, and each bandwidth request may have any format. For example, each bandwidth request may indicate the bandwidth requirements of the respective device client 254 in terms of Mbps or another scale of bits per second. As a further example, where endpoint device system 250 includes multiple device clients 254, device controller 252 may receive multiple bandwidth requests from multiple device clients 254. In this example, the device controller 252 may aggregate the bandwidth requests (e.g., aggregate the bandwidth requirements indicated in the bandwidth requests).
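A minimal sketch of the aggregation step follows. The disclosure does not fix how requests are combined; summing the most recent request of each client is an assumption made here, and the class name `BandwidthAggregator` and its methods are hypothetical.

```python
class BandwidthAggregator:
    """Tracks the latest bandwidth request (in Mbps) of each client and sums them."""

    def __init__(self):
        self._requests = {}  # client identifier -> most recent request in Mbps

    def request(self, client_id: str, mbps: float) -> float:
        """Record or replace a client's request and return the new aggregate."""
        if mbps < 0:
            raise ValueError("bandwidth request must be non-negative")
        self._requests[client_id] = mbps
        return self.total()

    def release(self, client_id: str) -> float:
        """Drop a client's request (e.g., when the client goes idle)."""
        self._requests.pop(client_id, None)
        return self.total()

    def total(self) -> float:
        # Assumption: the aggregated requirement is the simple sum of all requests.
        return sum(self._requests.values())
```

Keeping one entry per client means that a repeated request from the same client replaces its earlier value instead of being counted twice.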

The device controller 252 implements a bandwidth solver 704, which may be an algorithmic process executing within the device controller 252, a hardware component of the device controller 252, or a combination thereof, to determine whether to change the current link speed of a link (e.g., link 285 as shown in fig. 2) based on a bandwidth request (or, for the case of multiple device clients 254, based on an aggregated bandwidth request). The bandwidth solver 704 may implement a decision process 706 to determine whether a speed change for the link 285 is warranted and also to perform scaling of the link speed or link width. The process 706 may determine that a speed change is warranted (scaling the speed down or up) or that no change is required. The decision result is shown by block 708: if a change is needed, messages (e.g., optional message 710 and message 712) may be sent to the host controller 212 and the device client 254 regarding effecting the speed change (see, e.g., the portion of block 708 above the dashed line shown within the block). Otherwise, block 708 illustrates returning a no speed change message 714 to the device client 254 when a speed change is not warranted (i.e., the portion of block 708 below the dashed line).

In one example, if the bandwidth request (or aggregated bandwidth request) indicates a low bandwidth requirement that may be adequately served by a lower link speed or lower link width, the bandwidth solver 704 may determine to decrease the link speed from the current link speed to the lower link speed (e.g., from the GEN3 speed to the GEN2 or GEN1 speed). In another example, if the bandwidth request (or aggregated bandwidth request) indicates a high bandwidth requirement, the bandwidth solver 704 may determine to increase the link speed from the current link speed to a higher link speed (e.g., from a GEN1 or GEN2 speed to a GEN3 speed). As mentioned above, the bandwidth solver 704 may be implemented in software executed by the device controller 252, or some hardware logic as part of the device controller 252 or in communication with the device controller 252, or a combination thereof.
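The decision between decreasing, increasing, or keeping the link speed can be illustrated as follows. This sketch assumes a fixed link width, ignores packet and protocol overhead, and uses only the raw line rates and encodings of the PCIe generations (2.5 GT/s and 5 GT/s with 8b/10b encoding, 8 GT/s with 128b/130b encoding); the function name `decide_speed` and the return format are hypothetical.

```python
# Per-lane data rate in Mbps after line encoding (protocol overhead is ignored).
PER_LANE_MBPS = {
    "GEN1": 2500 * 8 / 10,     # 2.5 GT/s, 8b/10b
    "GEN2": 5000 * 8 / 10,     # 5 GT/s, 8b/10b
    "GEN3": 8000 * 128 / 130,  # 8 GT/s, 128b/130b
}
SPEED_ORDER = ["GEN1", "GEN2", "GEN3"]


def decide_speed(current_speed: str, lane_count: int, required_mbps: float):
    """Return (decision, target_speed) for an aggregated bandwidth requirement.

    The target is the slowest generation whose capacity at the given width
    covers the requirement; the fastest generation is used if none does.
    """
    target = SPEED_ORDER[-1]
    for speed in SPEED_ORDER:
        if PER_LANE_MBPS[speed] * lane_count >= required_mbps:
            target = speed
            break
    if target == current_speed:
        return ("no_change", current_speed)
    faster = SPEED_ORDER.index(target) > SPEED_ORDER.index(current_speed)
    return ("increase" if faster else "decrease", target)
```

For instance, a one-lane link at GEN3 speed with a 1000 Mbps requirement would be scaled down to GEN1 speed under these assumptions, while a 3000 Mbps requirement on a one-lane GEN2 link would leave the speed unchanged.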

If the bandwidth solver 704 determines to change the link speed, the device controller 252 may, at 712, send an indicator or message to the one or more device clients 254 notifying them that the speed change is in progress. Message 712 to device client 254 may also indicate a new link speed or PCIe generation. Additionally, if the bandwidth solver 704 determines to change the link speed, the device controller 252 may optionally send a speed change request to the host controller 212 requesting a change in link speed, as shown by message 710. The request 710 may indicate a new link speed. For example, if the bandwidth solver determines to change from the GEN1 or GEN2 speed to the GEN3 speed, the request to the host controller 212 may indicate the GEN3 speed. Device controller 252 may send the request to host controller 212 via PCIe interface circuits 260 and 216 and link 285. In turn, the host controller 212 may prepare the voltage (e.g., voltage corner) for the speed change, as indicated at block 716, and then send a speed change ready message 718 back to the device controller 252. Additionally, a process for preparing system or power resources for the speed change may be implemented in the device controller 252, as indicated by block 719.

According to a further optional aspect, the bandwidth solver 704 may also initiate upscaling. That is, system resources or power resources (including but not limited to voltage regulators or clock sources) may be scaled up, if needed, for requests to change to a higher link speed. This upscaling of power or system resources may reflect an aggregate of all previous bandwidth change requests, including the current request that has been sent or is being sent to the host and is awaiting acknowledgement that the link speed change is complete.

After the bandwidth solver 704 determines at block 708 that a speed change is warranted, a device link speed change implementer 720 may be run in the device controller 252 to implement the speed change for one or more device clients 254. Additionally, in some aspects, the device controller 252 may initiate the link speed change through the device link speed change implementer 720 in response to a request to change the link speed. Device link speed change implementer 720 may process the speed change in accordance with PCIe hardware programming guidelines in the PCIe specification, as shown in block 722, which may include performing link retraining and reconfiguring PCIe interface circuits 216 and 260 for the new link speed. The device link speed change implementer 720 may be implemented in software executed by the device controller 252, hardware associated with the controller 252 or as part of the controller 252, or a combination thereof.

When the device speed change process is complete, the device controller 252 may, at 724, send an indicator callback message to the device client 254 indicating that the speed change is complete. At 726, the device controller 252 changes or updates the voltage level (corner) of one or more of the voltages 276, 278, or 280 based on the new device link speed, if necessary. For example, if the new device link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the device controller 252 may decrease the voltage level (corner) of one or more of the voltages 276, 278, or 280. In another example, if the new device link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), the device controller 252 may increase the voltage level (corner) of one or more of the voltages 276, 278, or 280. As discussed above, the device controller 252 may change the voltage level (corner) of one or more of the voltages 276, 278, or 280 by instructing the PMIC 290 to set the voltage level (corner) of one or more of the voltages 276, 278, or 280 provided by the PMIC 290 based on the new device link speed. The device controller 252 may leave the voltage levels of the voltages 276, 278, or 280 unchanged if the voltage levels for the new device link speed are the same as the voltage levels for the previous device link speed. The voltage scaling for the new link speed at 726 may be integrated with the device link speed change process performed by the device link speed change implementer 720 (i.e., the voltage scaling may be part of the device link speed change process).
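The voltage update at 726 can be sketched as below. The mapping from link speed to voltage corner, the corner names, and the `FakePmic` class are hypothetical placeholders; a real PMIC interface is not described here. The point illustrated is that the PMIC is instructed only when the corner for the new speed differs from the corner for the previous speed.

```python
# Hypothetical mapping from link speed to voltage corner (illustrative names only).
CORNER_FOR_SPEED = {"GEN1": "low", "GEN2": "low", "GEN3": "nominal"}


class FakePmic:
    """Stand-in for a PMIC that records the corner requests it receives."""

    def __init__(self):
        self.calls = []

    def set_corner(self, rail: str, corner: str) -> None:
        self.calls.append((rail, corner))


def update_voltage_corners(pmic, rails, old_speed: str, new_speed: str) -> bool:
    """Set every rail to the corner for new_speed; return False if nothing changed."""
    old_corner = CORNER_FOR_SPEED[old_speed]
    new_corner = CORNER_FOR_SPEED[new_speed]
    if new_corner == old_corner:
        # Same voltage level for both speeds: leave the rails alone.
        return False
    for rail in rails:
        pmic.set_corner(rail, new_corner)
    return True
```

In this sketch the rails would correspond to voltages such as 276, 278, and 280 on the device side; the same routine could serve the host side with its own rails.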

At 728, in response to the speed change complete indication from the host controller 212, the device controller 252 may send a message 728 to the host controller 212 informing the host controller 212 of the device speed change. At 730, host controller 212 updates the voltage level (corner) of one or more of voltages 242, 244, or 246 based on the new link speed, if necessary. For example, if the new link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the host controller 212 may decrease the voltage level (corner) of one or more of the voltages 242, 244, or 246. In another example, if the new link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), the host controller 212 may increase the voltage level (corner) of one or more of the voltages 242, 244, or 246. As discussed above, the host controller 212 may change the voltage level (corner) of one or more of the voltages 242, 244, or 246 by instructing the PMIC 290 to set the voltage level (corner) of one or more of the voltages 242, 244, or 246 provided by the PMIC 290 based on the new device link speed. If the voltage levels of the voltages 242, 244, or 246 for the new device link speed are the same as the voltage levels for the previous link speed, the host controller 212 may leave these voltage levels unchanged.

According to another example, fig. 8 illustrates a call flow diagram of a method 800 in which a host client can initiate a bandwidth change request and a device controller completes, implements, or effectuates a link speed change. At 802, the host controller 212 receives bandwidth requests from one or more host clients 214. Each host client 214 may generate a respective bandwidth request based on the bandwidth requirements of the host client. Each bandwidth request may have any format (e.g., indicating the bandwidth requirements of the respective client in Mbps or another scale of bits per second). For an example in which the host system 210 includes multiple clients 214, the host controller 212 may receive multiple bandwidth requests from the multiple clients 214. In this example, host controller 212 may aggregate the bandwidth requests (e.g., aggregate the bandwidth requirements indicated in the bandwidth requests).

Host controller 212 implements bandwidth solver 804 to determine whether to change the current link speed of link 285 based on the bandwidth requests (or aggregated bandwidth requests for the case of multiple host clients), as shown at 806. For example, if the bandwidth request (or aggregate bandwidth request) indicates a low bandwidth requirement that can be adequately served at a lower link speed, the bandwidth solver 804 can determine to decrease the link speed from the current link speed to the lower link speed (e.g., from the GEN3 speed to the GEN2 or GEN1 speed). In another example, if the bandwidth request (or aggregated bandwidth request) indicates a high bandwidth requirement, the bandwidth solver 804 may determine to increase the link speed from the current link speed to a higher link speed (e.g., from a GEN1 or GEN2 speed to a GEN3 speed). An exemplary implementation of the bandwidth solver 804 is provided below. The bandwidth solver 804 may be implemented in software executed by the host controller 212, by hardware associated with the host controller 212, or by some combination thereof.

If the bandwidth solver 804 determines that no link speed change is required, the host controller 212 may send an indicator to one or more host clients indicating no link speed change at 518. If the bandwidth solver 804 determines to change the link speed, at 810, the host controller 212 may send an indicator to the one or more host clients 214 indicating the link speed change. Indicator 810 may also indicate a new link speed. Alternatively, if the bandwidth solver 804 determines that no speed change is required, a message 812 may be sent to the host client 214 to indicate no speed change.

If the bandwidth solver 804 determines to change the link speed, the host controller 212 may optionally prepare system resources for the speed change (e.g., prepare voltage corners or clocks), as shown at block 814. Additionally, at 816, the host controller 212 may send a link speed change request to the device controller 252 requesting the proposed link speed change. The request 816 may indicate a new link speed. In response, the device controller 252 may optionally prepare system resources (e.g., voltage corners or clocks) for the speed change, as shown in block 818. Additionally, the device controller may be configured to implement a device link speed change implementer 820. In one aspect, the device link speed change implementer 820 may handle speed changes in accordance with PCIe hardware programming guidelines, as shown at 822, which may include performing link retraining for the new link speed and reconfiguring PCIe interface circuits 216 and 260. The link speed change implementer 820 may be implemented in software executed by the device controller 252, hardware associated with the device controller 252 or coupled to the device controller 252, or some combination thereof.

When the link speed change procedure is complete, the device controller 252 may send a speed change complete message 824 to the host controller 212 to signal completion of the link speed change. The host controller 212 may then signal the host client 214 to call the host client latency callback function for the bandwidth/speed change request 802 to signal that the change has been completed, as shown at 826.

At 828, host controller 212 updates the voltage level (corner) of one or more of voltages 242, 244, and 246 based on the new link speed, if necessary. For example, if the new link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the host controller 212 may decrease the voltage level (corner) of one or more of the voltages 242, 244, and 246. In another example, if the new link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), host controller 212 may increase the voltage level (corner) of one or more of voltages 242, 244, and 246. If the voltage levels of the voltages 242, 244, and 246 for the new link speed are the same as the voltage levels for the previous link speed, the host controller 212 may leave these voltage levels unchanged. The voltage scaling for the new link speed at 828 may be integrated with the link speed change process performed by the device link speed change implementer 820 (i.e., the voltage scaling may be part of the link speed change process).

At 830, the device controller 252 updates the voltage level (corner) of one or more of the voltages 276, 278, and 280 based on the new link speed, if necessary. For example, if the new link speed is lower (e.g., changing from GEN3 speed to GEN2 or GEN1 speed), the device controller 252 may decrease the voltage level (corner) of one or more of the voltages 276, 278, and 280. In another example, if the new link speed is higher (e.g., changing from GEN2 or GEN1 speed to GEN3 speed), the device controller 252 may increase the voltage level (corner) of one or more of the voltages 276, 278, and 280. The device controller 252 may leave the voltage levels of the voltages 276, 278, and 280 unchanged if the voltage levels for the new link speed are the same as the voltage levels for the previous link speed.

The bandwidth solver 704 or 804 can be implemented in any of a number of ways to convert bandwidth requirements from one or more clients (e.g., one or more host clients, one or more device clients, etc.) to one of the following sets of link parameters: PCIe link speed only; PCIe link width only; or both PCIe link speed and link width. For the case of multiple clients, the bandwidth requirement may be an aggregation of the bandwidth requirements of the multiple clients. The bandwidth solver 704 or 804 may also take into account additional parameters such as burst frequency. Figs. 7 and 8, discussed above, illustrate examples in which a bandwidth solver converts bandwidth requirements for one or more clients to link speeds.

Fig. 9 shows a flow diagram of a method 900 for bandwidth-based power management for a link, such as a PCIe link, in accordance with aspects of the present disclosure. In certain aspects, method 900 enables scaling of one or more of link speed (or bandwidth) and/or link width, such as reducing the number of channels that are powered (e.g., selectively powering on or off drivers 320 or 345).

As can be seen in fig. 9, method 900 includes receiving, from a client (e.g., "device client" 254 in the examples of fig. 4 and 7 or "host client" 214 in the examples of fig. 5 and 8), one or more bandwidth requests regarding communications over a link between a link partner (i.e., a link partner of the client, such as device controller 252 in fig. 4 or host controller 212 shown in fig. 5) and the client (e.g., 254 in fig. 4 or 214 in fig. 5), as shown in block 902. The process of block 902 may be implemented by the device controller 252 in one example, or by the host controller 212 when executing the process in FIG. 5. Additionally, the process of block 902 may be implemented by software running in the device controller 252 or by hardware within the device controller 252 or coupled to the device controller 252.

The method 900 further includes determining at least one of a link speed and a link width for the link based on the one or more bandwidth requests, as shown in block 904. This determination may be implemented by the device controller 252 or software running therein, or alternatively in hardware in communication with or as part of the controller 252. In further aspects, the process of block 904 may be implemented by the bandwidth solver 413 shown in fig. 4. In further aspects, the process of block 904 may be implemented by the host controller 212 shown in fig. 5 or software running thereon, or alternatively in hardware in communication with or as part of the controller 212. In further aspects, the process of block 904 may be implemented by the bandwidth solver 513 shown in fig. 5.

Further, the method 900 includes: implementing a speed change in the client (or host client in the example of fig. 5) based on the determined at least one of link speed and link width for the link, as shown in block 906. These processes may be implemented by the device controller 252 (or the host controller 212 in the example of fig. 5) or software running therein, or alternatively by some dedicated hardware in communication with or as part of the controller 252. In further aspects, the process of block 904 may be implemented by a bandwidth solver 413 or 513 running in the controller 252 or 212 as shown in fig. 4 and 5. Additionally, the process of block 906 may include coordination and/or communication with host link speed change implementers 422 or 522.

Finally, the method 900 includes: a speed change request is sent to the host (or over the link) based on the determined at least one of link speed and link width for the link, as shown in block 908. According to certain aspects, the process may be implemented by the controller 252 or 212, and by a bandwidth solver 413 or 513 operating in the controller 252 or 212 as shown in fig. 4 and 5. In some aspects, in the case of fig. 5, the sending in block 908 may be carried out over link 285 or through an interface such as interface 215. Further, it should be noted that the various processes in any of blocks 902, 904, 906, and 908 may include using PCIe interface 216 or 260 as shown in fig. 2.

In a further aspect of method 900, it should be noted that the client may comprise a peripheral component interconnect express (PCIe) endpoint device controller. Further, the PCIe endpoint device controller may include PCIe interface circuitry configured to send a speed change request to the host via the link. In further aspects, the PCIe controller may implement the speed change in the client based on the determined at least one of the link speed and the link width for the link.

As previously discussed, the link in method 900 may be a peripheral component interconnect express (PCIe) link. Here, it should be noted that although PCIe has been described herein, method 900 is applicable to other links. Further, the link speed includes one of a plurality of different PCIe link speeds corresponding to different PCIe generations.

The link in method 900 may also include a plurality of lanes, and the link width corresponds to the number of the plurality of lanes that are active, as discussed previously with respect to fig. 3. The method 900 may further comprise determining the link speed for the link using a look-up table that maps each of a plurality of bandwidths to a respective one of a plurality of link speeds. In further aspects, the method 900 may include determining the link width using a look-up table that maps each of the plurality of bandwidths to a respective one of a plurality of link widths. In other aspects, the method 900 may utilize other algorithms based on the speed and width of the transmission to determine the optimal link parameter (e.g., link speed).

In other aspects, the method 900 includes determining a link speed for the link by determining a power consumption for each of a plurality of different link speeds that satisfy the one or more bandwidth requests, and selecting the one of the plurality of different link speeds having the lowest power consumption. Additionally, method 900 may include determining a link width for the link by determining a power consumption for each of a plurality of different link widths that satisfy the one or more bandwidth requests, and selecting the one of the plurality of different link widths having the lowest power consumption.

Further, the method 900 may include providing one or more voltages or clocks to the interface circuit, and setting the one or more voltages or clocks for the interface circuit based on the link speed. In yet another aspect, where the link includes a plurality of channels, each channel being driven by a respective driver of a plurality of drivers, and a power switching circuit is configured to selectively power the plurality of drivers, the method 900 may include setting the number of the plurality of drivers that are powered by the power switching circuit based on the determined link width for the link, in order to change the link width.
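As an illustration of setting the number of powered drivers from the link width, the following sketch computes which drivers a power switching circuit would keep powered. It assumes, for the example only, that the active channels are the lowest-numbered ones; the function name `driver_power_mask` is hypothetical.

```python
def driver_power_mask(total_channels: int, link_width: int) -> list:
    """Return one flag per channel driver: True = powered, False = powered down.

    Assumption: a link of width N uses channels 0 .. N-1.
    """
    if not 1 <= link_width <= total_channels:
        raise ValueError("link width must be between 1 and the number of channels")
    return [channel < link_width for channel in range(total_channels)]
```

Reducing the width of a four-channel link to two would, under this assumption, leave the drivers of channels 0 and 1 powered and power down the drivers of channels 2 and 3.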

It will be appreciated that the disclosure is not limited to the exemplary terms used above to describe aspects of the disclosure. For example, bandwidth may also be referred to as throughput, data rate, or another term.

While aspects of the disclosure are discussed above using an example of a PCIe standard, it will be appreciated that the disclosure is not limited to this example and may be used with other standards.

The host client 214, host controller 212, device controller 252, and device client 254 discussed above may each be implemented with a processor configured to perform the functions described herein by executing software comprising code for performing the functions. The software may be stored on a computer readable storage medium such as RAM, ROM, EEPROM, optical and/or magnetic disks.

Any reference herein to elements using nomenclature such as "first," "second," etc., does not generally limit the number or order of those elements. Rather, these designations are used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, references to a first element and a second element do not mean that only two elements can be used or that the first element must precede the second element.

Within this disclosure, the word "exemplary" is used to mean "serving as an example, instance, or illustration." Any implementation or aspect described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects of the disclosure. Similarly, the term "aspect" does not require that all aspects of the disclosure include the discussed feature, advantage or mode of operation. The term "coupled" is used herein to refer to a direct or indirect electrical or other communicative coupling between two structures. Further, the term "about" means within 10% of the stated value.

The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the examples described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
