## 19.3 A 7nm All-Digital Unified Voltage and Frequency Regulator Based on a High-Bandwidth 2-Phase Buck Converter with Package Inductors

Francois Atallah<sup>1</sup>, Keith Bowman<sup>1</sup>, Hoan Nguyen<sup>1</sup>, Jihoon Jeong<sup>1</sup>, Daniel Yingling<sup>1</sup>, Yu Sun<sup>1</sup>, Brad Appel<sup>1</sup>, Anthony Polomik<sup>1</sup>, Mahesh Harinath<sup>1</sup>, Joshua Morelli<sup>1</sup>, Thomas Moore<sup>1</sup>, Nathaniel Reeves<sup>2</sup>, Amer Cassier<sup>2</sup>, Arijit Raychowdhury<sup>3</sup>

<sup>1</sup>Qualcomm, Raleigh, NC <sup>2</sup>Qualcomm, San Diego, CA <sup>3</sup>Georgia Institute of Technology, Atlanta, GA

Conventional processors regulate the supply voltage ($V_{DD}$) and clock frequency ($F_{CLK}$) in two separate and independent control loops. Buck converters, switched-capacitor converters, and low-dropout (LDO) voltage regulators are examples of control loops that regulate $V_{DD}$ based on a reference voltage ($V_{REF}$). Processors commonly integrate a phase-locked loop (PLL) to separately regulate $F_{CLK}$ based on a reference clock frequency ($F_{REF}$). The $F_{CLK}$ control loop is unaware of the impact of dynamic parameter variations, such as $V_{DD}$ droops or temperature changes, on the path-timing margin because the PLL voltage-controlled oscillator (VCO) operates on a separate analog voltage. For this reason, conventional processors require either $V_{DD}$ or $F_{CLK}$ guardbands, or adaptive and resilient circuits, to ensure correct functionality in the presence of worst-case dynamic parameter variations [1].

Recent work [2-5] combines the regulation of $V_{DD}$ and $F_{CLK}$ in one control loop to reduce the guardbands for $V_{DD}$ and temperature variations, while protecting the processor from path-timing violations. These techniques enable a tight relationship between $V_{DD}$ and $F_{CLK}$. These previous unified $V_{DD}$ and $F_{CLK}$ regulation designs employ either a buck converter [4], an LDO [3], or a switched-capacitor [2], [5] voltage regulator. Today's high-performance SoC processors require the high efficiency of a buck voltage regulator with a fast-transient response. The design with a buck converter in [4] uses a large external inductor of 10µH and a large capacitor of 10µF to provide a peak power efficiency of 96.3%, but at the expense of a 20µs settling time for a 90mA load step with a 1ns rise time. Although this design effectively demonstrates high efficiency and guardband reduction with the unified $V_{DD}$ and $F_{CLK}$ control loop, the long settling time during a $V_{DD}$ droop is a major concern for a commercial processor: the $V_{DD}$ droop magnitude may exceed the minimum operating $V_{DD}$ for memory circuits, resulting in failures, and a sustained performance degradation of 20µs is unacceptable for some applications. While the design in [4] operates the processor at a faster $F_{CLK}$ after recovering from the $V_{DD}$ droop, the performance loss during the droop may violate the minimum requirements of applications with real-time deadlines. Recent advances in integrating inductors into the package [6] provide an opportunity to significantly improve the transient response of the unified $V_{DD}$ and $F_{CLK}$ control loop during a $V_{DD}$ droop. This paper describes an all-digital unified voltage and frequency buck regulator (UVFBR) with in-package inductors in a 7nm [7] test chip to enable a fast-transient response to $V_{DD}$ droops as required for high-performance processors.

The 7nm test chip (Figs. 19.3.1 and 19.3.7) contains the UVFBR, performance counters, and programmable noise generators. The UVFBR design extends the unified voltage and frequency regulator in [3] with a buck converter that uses two in-package inductors. The UVFBR controls both the output clock frequency ($F_{OUT}$) and output voltage ($V_{OUT}$) for the digital load in one loop. The UVFBR generates the clock from a tunable-replica circuit (TRC) oscillator and supplies the regulated $V_{OUT}$ to power both the TRC oscillator and the digital load. In the UVFBR control loop, $F_{OUT}$ is divided by a programmable $F_{REF}$ multiplication factor ($N$) to produce $F_{OUT}/N$. The UVFBR continuously monitors $F_{OUT}/N$ and adjusts $V_{OUT}$ to lock $F_{OUT}/N$ to a target $F_{REF}$, resulting in the optimum $V_{OUT}$. Since $V_{OUT}$ is regulated to achieve a target $F_{OUT}$, a $V_{REF}$ is not required. $F_{OUT}$ dynamically adapts to $V_{OUT}$ and temperature variations to compensate for delay changes in critical paths, maintaining a nearly constant timing margin [1-5]. When a $V_{OUT}$ droop occurs, the TRC oscillator slows down and provides a longer clock cycle time to the load until the UVFBR raises $V_{OUT}$ to satisfy the desired $F_{OUT}$. During a $V_{OUT}$ overshoot, the critical-path delays become faster, compensating for the TRC oscillator speeding up and generating a higher $F_{OUT}$, until the UVFBR lowers $V_{OUT}$ back to the nominal value.
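The locking behavior described above can be illustrated with a minimal behavioral sketch. It is not the UVFBR implementation: the linear TRC frequency model, its slope and threshold, the loop gain, and the function names `trc_frequency_hz` and `lock_v_out` are all assumptions chosen only to show that regulating $F_{OUT}/N$ toward $F_{REF}$ determines $V_{OUT}$ without any $V_{REF}$.

```python
def trc_frequency_hz(v_out, k_hz_per_v=4.0e9, v_th=0.3):
    """Hypothetical TRC model: frequency grows linearly with supply above v_th.

    The slope and threshold are illustrative assumptions, not measured values.
    """
    return max(0.0, k_hz_per_v * (v_out - v_th))


def lock_v_out(f_ref_hz, n, v_start=0.5, gain_v_per_hz=1.0e-9, steps=200):
    """Adjust v_out until F_OUT / N matches F_REF (simple integrating loop).

    No voltage reference appears anywhere: the frequency error alone
    drives the supply, as in the unified loop described in the text.
    """
    v_out = v_start
    for _ in range(steps):
        f_out = trc_frequency_hz(v_out)
        error_hz = f_ref_hz - f_out / n      # positive when the clock is too slow
        v_out += gain_v_per_hz * error_hz    # raise the supply to speed up the TRC
    return v_out, trc_frequency_hz(v_out)


# Assumed example: F_REF = 100 MHz and N = 20 target F_OUT = 2 GHz.
v_locked, f_locked = lock_v_out(100e6, 20)
print(f"V_OUT = {v_locked:.3f} V, F_OUT = {f_locked / 1e9:.3f} GHz")
```

Under these assumed model constants the loop settles where the TRC frequency equals $N \cdot F_{REF}$; changing the TRC model (for example, to mimic a hotter or slower die) shifts the locked $V_{OUT}$ while $F_{OUT}$ stays on target.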

The UVFBR includes two 4b Johnson counters, one clocked by $F_{REF}$ and the other by $F_{OUT}/N$. The Johnson counters produce 4 phases each of $F_{REF}$ and $F_{OUT}/N$. The UVFBR is a 2-phase buck regulator, with the phases separated by 180°. Each XNOR logic gate produces a signal representing the phase difference between $F_{REF}$ and $F_{OUT}/N$ to drive the buck converter output stage, which supplies the output load current necessary for a target $F_{REF}$. To avoid phase aliasing associated with the XNOR comparison, the Johnson counter uses an overrun protection (OP) scheme [3]. The OP design holds the $R_i$ value if $L_i = R_i$ and propagates the previous-stage value ($R_{i-1}$) to $R_i$ if $L_i \neq R_i$. In addition, the OP holds the $L_i$ value if $L_i \neq R_i$ and propagates the previous-stage value ($L_{i-1}$) to $L_i$ if $L_i = R_i$. If $F_{OUT}$ slows down considerably with respect to $F_{REF}$ due to a large load current demand, the phase difference may saturate at 180°, thus increasing the duty cycle to ~100% and maximizing the time the high side of the output stage is enabled. Conversely, if $F_{OUT}$ speeds up with respect to $F_{REF}$ due to a large $V_{OUT}$ overshoot event, the duty cycle may reduce to ~0%, minimizing the time the high side of the output stage is enabled.
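As a rough illustration of the phase-to-duty-cycle conversion, the sketch below models a plain 4b Johnson counter and the XNOR of the same stage in two counters. It is a simplification under stated assumptions: both counters tick at the same rate with a fixed offset, the overrun protection is omitted, and the function names `johnson_states` and `xnor_duty` are hypothetical. The sign convention and the mapping of the XNOR output onto the high-side drive in the actual design are not modeled, so only the linear dependence on phase offset should be read from it.

```python
def johnson_states(bits=4, cycles=8):
    """Return successive states of a Johnson (twisted-ring) counter."""
    state = [0] * bits
    states = []
    for _ in range(cycles):
        states.append(tuple(state))
        # Shift by one stage and feed the inverted last stage into stage 0.
        state = [1 - state[-1]] + state[:-1]
    return states


def xnor_duty(offset, bits=4):
    """Fraction of one counter period in which stage 0 of two counters agree.

    The second counter is assumed to lag the first by `offset` clock ticks.
    """
    period = 2 * bits  # a Johnson counter with `bits` stages repeats every 2*bits ticks
    stage0 = [s[0] for s in johnson_states(bits, period)]
    agree = sum(
        1 for t in range(period) if stage0[t] == stage0[(t - offset) % period]
    )
    return agree / period


for state in johnson_states():
    print("".join(str(b) for b in state))

for offset in range(5):
    print(offset, xnor_duty(offset))
```

Each stage of the counter is a 50% duty-cycle square wave shifted by one clock tick from its neighbor, which is why a 4b counter yields four phases. In this simplified model the XNOR output width changes linearly with the offset and saturates at a half-period (180°) offset, which is the point beyond which an unprotected comparison would alias.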

The TRC oscillator enables the interdependent relationship between $V_{OUT}$ and $F_{OUT}$ in the UVFBR. The TRC contains configurable delays to calibrate the oscillator clock period to match the critical-path delay of the digital load (i.e., the processor). Each coarse-tuning bit (Coarse\_s[3:0]) adjusts the TRC oscillator cycle time by ~30ps, and each small-tuning bit (S[0:12]) adjusts the TRC oscillator cycle time by ~4ps, resulting in a worst-case timing inaccuracy of ~1.0% at 3.0GHz. The performance counters allow an on-die measurement of $F_{OUT}$, which is scanned out of the chip to capture the TRC oscillator frequency. The UVFBR can operate in single-phase mode by asserting either phase1a\_select or phase1b\_select, or in 2-phase mode by asserting both signals. Six programmable noise generators provide the ability to stress the UVFBR with varying load currents.
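To put the tuning steps in perspective, the short calculation below expresses each step as a fraction of the clock period at 3.0GHz. Only the step sizes and the frequency come from the text; the helper names `period_ps` and `step_fraction` are hypothetical, and the exact definition of the quoted ~1.0% worst-case inaccuracy is not given here, so the result should be read as an order-of-magnitude check rather than a derivation of that figure.

```python
def period_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def step_fraction(step_ps, freq_ghz):
    """Tuning step expressed as a fraction of the clock period."""
    return step_ps / period_ps(freq_ghz)


fine = step_fraction(4.0, 3.0)     # ~4ps small-tuning step at 3.0GHz
coarse = step_fraction(30.0, 3.0)  # ~30ps coarse-tuning step at 3.0GHz

print(f"period at 3.0GHz: {period_ps(3.0):.1f} ps")
print(f"small-tuning step: {fine:.1%} of the period")
print(f"coarse-tuning step: {coarse:.1%} of the period")
```

The small-tuning step is about 1.2% of the ~333ps period, the same order as the quoted ~1.0% inaccuracy, while a coarse step is about 9% of the period.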

The three-turn inductors (Fig. 19.3.2) are in package, and each uses six levels of copper metal, where the last three metals constitute the three turns. The simulated DC resistance at room temperature is ~50mΩ, and the simulated inductor quality (Q) factor at the UVFBR operating frequencies varies between 17 and 30. Fig. 19.3.3 describes the UVFBR small-signal s-domain model. Simulations indicate more than 100° of phase margin at the unity-gain bandwidth. While there are two complex poles that sharply drop the phase at high frequencies, these poles are far from the unity-gain bandwidth and occur where the loop gain is below -20dB.

The UVFBR is implemented in a 7nm test chip (Fig. 19.3.7). It occupies 6,478µm². To distribute the UVFBR output to the regulated region, the UVFBR components are distributed along the height of the regulated region, thus increasing the effective area to 50,004µm². The area of the package inductors matches the area of the UVFBR regulated region of 639,000µm². Fig. 19.3.4 captures the UVFBR $V_{OUT}$ transition from 0.9V to 0.55V (1.4GHz) in single-phase and 2-phase modes as an example of a dynamic voltage-frequency scaling (DVFS) transition. The measured settling time is 510ns for the 1-phase mode and 250ns for the 2-phase mode. The UVFBR demonstrates a settling time of 60ns and a voltage-droop magnitude of 55mV at an $F_{OUT}$ of 2GHz with a current load step from 1mA to 178mA at a rise time of 500ps. This fast-transient response is critical for high-performance processors to limit both the voltage-droop magnitude and the time spent operating at a lower $F_{OUT}$. Fig. 19.3.5 describes the UVFBR $F_{OUT}$ regulation vs. output load current from 1mA to 900mA with temperature ranging from -15°C to 105°C across 24 dies, demonstrating consistent and highly accurate $F_{OUT}$ regulation for a target $F_{OUT}$ from 1.0GHz to 3.0GHz in steps of 500MHz across a wide range of process, temperature, voltage, and load current. Fig. 19.3.6 plots the measured and simulated power efficiencies for loads between 10mA and 600mA with a 1.6nH inductor, indicating agreement between measurements and simulations. The measured peak power efficiency of 60% for a 1.6nH inductor is too low for a processor; to guide future designs, simulations indicate that power efficiencies of 90% require an ~10× larger inductor. The small test-chip area allocated to and regulated by the UVFBR limits the size of the inductor to 1.6nH and the number of phases to only two. These simulations indicate that the 1.6nH inductor size limits the measured power efficiency in the UVFBR implementation. For a high-performance processor, the area is much larger than the test-chip area allocated to the UVFBR, therefore allowing for a larger inductor size and/or a greater number of phases to provide acceptable power efficiencies. Fig. 19.3.6 also provides a comparison table with state-of-the-art designs, highlighting the UVFBR fast-transient response.
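A few derived quantities help interpret the reported measurements. The sketch below uses only numbers stated in the text; the variable names are hypothetical, and the cycle count assumes the nominal 2GHz clock for the whole settling interval, which slightly overestimates the count because the TRC oscillator slows during the droop.

```python
# Reported values from the text.
UVFBR_AREA_UM2 = 6_478
UVFBR_DISTRIBUTED_AREA_UM2 = 50_004
REGULATED_REGION_UM2 = 639_000
SETTLING_TIME_S = 60e-9        # load-step settling time in 2-phase mode
F_OUT_HZ = 2.0e9               # nominal output clock during the load step
DVFS_SETTLE_1PHASE_S = 510e-9
DVFS_SETTLE_2PHASE_S = 250e-9

# Area overhead of the regulator relative to the region it supplies.
core_overhead = UVFBR_AREA_UM2 / REGULATED_REGION_UM2
distributed_overhead = UVFBR_DISTRIBUTED_AREA_UM2 / REGULATED_REGION_UM2

# Upper bound on clock cycles elapsed while the loop settles after the load step.
settling_cycles = round(SETTLING_TIME_S * F_OUT_HZ)

# Speed-up of the DVFS transition from enabling the second phase.
dvfs_speedup = DVFS_SETTLE_1PHASE_S / DVFS_SETTLE_2PHASE_S

print(f"UVFBR area / regulated region: {core_overhead:.1%}")
print(f"distributed area / regulated region: {distributed_overhead:.1%}")
print(f"settling time at nominal clock: about {settling_cycles} cycles")
print(f"2-phase DVFS settling speed-up: {dvfs_speedup:.2f}x")
```

The regulator itself is about 1% of the regulated region, or roughly 8% when counted as the distributed area, and the 60ns load-step response corresponds to on the order of a hundred clock cycles at 2GHz.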

## References:

[1] K. A. Bowman, "Adaptive and Resilient Circuits: A Tutorial on Improving Processor Performance, Energy Efficiency, and Yield via Dynamic Variation Tolerance," *IEEE Solid-State Circuits Magazine*, vol. 10, no. 3, pp. 16-25, 2018.

[2] D. Bol, et al., "SleepWalker: A 25-MHz 0.4-V Sub-mm<sup>2</sup> 7µW/MHz Microcontroller in 65-nm LP/GP CMOS for Low-Carbon Wireless Sensor Nodes," *IEEE JSSC*, vol. 48, no. 1, pp. 20-32, 2013.

[3] S. Gangopadhyay, et al., "UVFR: A Unified Voltage and Frequency Regulator with 500MHz/0.84V to 100KHz/0.27V Operating Range, 99.4% Current Efficiency and 27% Supply Guardband Reduction," *ESSCIRC*, pp. 321-324, 2016.

[4] X. Sun, et al., "A Combined All-Digital PLL-Buck Slack Regulation System with Autonomous CCM/DCM Transition Control and 82% Average Voltage-Margin Reduction in a 0.6-to-1.0V Cortex-M0 Processor," *ISSCC*, pp. 302-304, 2018.

[5] F. ur Rahman, et al., "An All-Digital Unified Clock Frequency and Switched-Capacitor Voltage Regulator for Variation Tolerance in a Sub-Threshold ARM Cortex M0 Processor," *IEEE Symp. VLSI Circuits*, pp. 65-66, 2018.

[6] E. A. Burton, et al., "FIVR — Fully Integrated Voltage Regulators on 4th Generation Intel® Core™ SoCs," *IEEE Applied Power Electronics Conference and Exposition*, pp. 432-439, 2014.

[7] S.-Y. Wu, et al., "A 7nm CMOS Platform Technology Featuring 4th Generation FinFET Transistors with a  $0.027 \mu m^2$  High Density 6-T SRAM Cell for Mobile SoC Applications," *IEDM*, pp. 43-46, 2016.

2019 IEEE International Solid-State Circuits Conference

978-1-5386-8531-0/19/\$31.00 ©2019 IEEE

## ISSCC 2019 / February 20, 2019 / 9:30 AM



Figure 19.3.1: Test-chip block diagram of the UVFBR with two in-package inductors, performance counters, programmable noise generators, and schematics of the 4b Johnson Counter with overrun protection, and the tunable-replica circuit (TRC) oscillator.



Figure 19.3.3: UVFBR small signal analysis and simulated UVFBR loop gain and phase.





Figure 19.3.2: 3D view of the six metal levels for the two in-package inductors, a detailed view of each metal layer, and the simulated inductor quality (Q) factor. Q ranges from 17 to 30 in the region of operation.



Figure 19.3.4: Measured oscilloscope captures of the UVFBR 1-phase and 2-phase modes during a 0.9V to 0.55V ($F_{OUT}$=1.4GHz) transition and of the UVFBR transient response ($F_{OUT}$=2GHz) in 2-phase mode with a current load step from 1mA to 178mA at a rise time of 500ps.



Figure 19.3.6: Measured and simulated UVFBR power efficiency vs. output load current and in-package inductor size and comparison table with other unified voltage-frequency regulation designs.


## **ISSCC 2019 PAPER CONTINUATIONS**

| Characteristic            | Value           |
|---------------------------|-----------------|
| Technology                | 7nm FinFET CMOS |
| Chip Area                 | 8mm²            |
| TRC Area                  | 48µm²           |
| UVFBR Area                | 6,478µm²        |
| UVFBR Distributed Area    | 50,004µm²       |
| Region Regulated by UVFBR | 639,000µm²      |
| Area of Noise Generators  | 32,571µm² × 6   |

Figure 19.3.7: Test-chip die micrograph and characteristics. The micrograph labels the region regulated by the UVFBR and the noise generators.
