# GRANULAR POWER CONVERSION WITH DISTRIBUTED SWITCHING CELLS AND MAGNETICS INTEGRATION

### PING WANG

A DISSERTATION

PRESENTED TO THE FACULTY

OF PRINCETON UNIVERSITY

IN CANDIDACY FOR THE DEGREE

OF DOCTOR OF PHILOSOPHY

RECOMMENDED FOR ACCEPTANCE

BY THE DEPARTMENT OF

ELECTRICAL AND COMPUTER ENGINEERING

ADVISER: PROFESSOR MINJIE CHEN

© Copyright by Ping Wang, 2023.
All rights reserved.

# Abstract

Power electronics is the backbone of future energy systems including data centers, electric vehicles, and grid-scale energy storage. These high-impact applications demand increased efficiency, density, and reliability in power conversion. To leverage the advances in semiconductor devices and the scaling laws of passive components, a promising trend is to adopt a granular power architecture with magnetics integration for minimized power conversion stress and maximized component utilization.

In pursuit of this vision, this thesis first develops a systematic approach to all-in-one magnetics integration through matrix coupling. The benefits of matrix coupling in size reduction, ripple compression, and transient acceleration are quantified. A matrix coupled SEPIC prototype is designed and built. It can support load current up to 185 A at 5-to-1-V voltage conversion with over 470 W/in<sup>3</sup> power density. Compared to commercial discrete inductors, the matrix coupled inductor has a 5.6 times smaller size and 8.5 times faster transient speed with similar current ripples and ratings.

Next, a multistack switched-capacitor point-of-load (MSC-PoL) architecture is presented to power high current computing systems with high efficiency and ultra-compact size. Benefiting from granular architecture, coupled magnetics, and soft-charging technique, the MSC-PoL architecture can reduce current ripple, boost transient speed, reduce charge sharing loss, and enable the self-balancing of granular switching cells. A 48-to-1-V/450-A voltage regulator containing two MSC-PoL modules is fabricated and tested. The prototype is enclosed into a $\frac{1}{16}$-brick/0.31-in<sup>3</sup>/6-mm-thick package with 724 W/in<sup>3</sup> power density, enabling ultra-compact power-supply-in-package (PwrSiP) voltage regulation for extreme efficiency, density, and control/communication bandwidth.

Finally, a multiport ac-coupled differential power processing (MAC-DPP) architecture is introduced to support large-scale energy systems with ultra-high system efficiency (> 99%). The proposed MAC-DPP architecture associates all granular switching ports through a series coupled multi-winding transformer, featuring reduced component count, smaller magnetic volume, and fewer differential power conversion stages compared to other DPP solutions. A stochastic loss model is developed to explore DPP performance scaling limits. A 10-port 450 W MAC-DPP prototype with over 700 W/in<sup>3</sup> power density is built and tested on a 50-HDD storage server. It achieves 99.77% system efficiency, completing the first demonstration of a DPP-powered data storage server with full reading, writing, and hot-swapping capabilities. The exploration of software, hardware, and power architecture co-design yields valuable insights for designing next-generation power architectures in data centers.

The matrix coupling theory, the MSC-PoL architecture, and the systematic DPP analysis advance the fundamentals of granular power electronics and pave the way toward high-performance power conversion systems for a wider range of applications.

# Acknowledgements

I want to express my foremost and sincere appreciation to my advisor, Professor Minjie Chen, for his unwavering guidance and encouragement throughout my research journey. He has taught me everything I know about research from scratch. The most valuable lesson I learned was to approach questions with a higher-level methodology perspective rather than focusing solely on engineering details. His timely replies have helped me navigate many challenging projects. I will never forget the midnights we worked together deriving equations and revising papers. Beyond an advisor, he has been a friend who has supported me both professionally and personally. I am thankful for his help in my apartment move and for celebrating my first birthday in the U.S.

I extend my heartfelt gratitude to my committee members, Professor Naveen Verma, Professor Kaushik Sengupta, and Professor Robert Pilawa. I am grateful to Professors Naveen Verma and Kaushik Sengupta for their valuable time and feedback on my preliminary FPO and practice talks. I am indebted to Professor Robert Pilawa for his critical questions and constructive advice during our bi-weekly meetings for the ARPA-E project. Professor Niraj Jha, I thank you for serving as the reader of this dissertation and for your detailed reviews that have helped improve the quality significantly. I thank Professor Philip Krein for his insightful comments on our collaborative differential power processing paper, and I'd also like to thank Professor Yuxin Chen for patiently answering my questions regarding the derivations of stochastic models. I appreciate all your contributions to my academic growth and development.

Thank you to all fellow members of Princeton Powerlab, past and present: Yenan Chen, Ming Liu, Jaeil Baek, Diego Serrano, Yufei Li, Jing Yuan, Yueshi Guan, Youssef Elasser, Daniel Zhou, Haoran Li, Mian Liao, Tanuj Sen, Hsin Cheng, Shukai Wang, and many more. Thank you for your companionship over the past six years and your steadfast support whenever I required assistance. I hold dear the memories of our discussions in the lab as we tackled complex questions and challenges together.

I would like to express my gratitude to my intern mentors, Sombuddha Chakraborty and Raveesh Magod, at TI Kilby Labs in 2020. I am grateful for the valuable opportunity they provided me to work on state-of-the-art power management solutions for industrial, automotive, and personal electronic markets. I would also like to thank David Giuliano, Stephen Allen, and Gregory Szczeszynski from pSemi for the collaboration on the microprocessor power architecture project. Their contributions and insights were critical to the project's success. I am grateful to Nathan Brooks at UC Berkeley, Pradeep Shenoy at TI, and Enver Candan at IBM for their support on the collaborative ARPA-E project on HDD server power architecture. I also want to express my appreciation to Xin Zhang at IBM for his assistance in applying for a fellowship.

Thank you to all my Princeton friends: Sulin, Fei, Tianran, Yuchen, Jimmy, Pengning, Zach, Bonan, and many more. The memories we shared, whether playing basketball, skiing, or celebrating holidays together, will stay with me forever. Your companionship added vibrancy to my life at Princeton, and I cherish every moment spent with you.

I extend my special thanks to my girlfriend, Yiqing. Her constant love and support have been instrumental in helping me navigate the ups and downs of graduate school. Her presence has brought warmth and joy to my life, and her compassionate nature has taught me the essence of love. I want to express my deepest gratitude to my parents, Defa and Yu, to whom I dedicate this dissertation. Their love, support, and guidance have been integral in shaping me into the person I am today. They have provided me with a nurturing and joyful environment to grow up in and have instilled in me the invaluable traits of critical thinking and independence. I am forever indebted to them for all that they have done for me!

Finally, completing this dissertation marks the start of a new chapter in my life, and I am eagerly looking forward to embracing the exciting future.

To my parents.

# Contents

| Section | Title |
|---------|-------|
|  | Abstract |
|  | Acknowledgements |
|  | List of Tables |
|  | List of Figures |
| 1 | Introduction |
| 1.1 | Granular Power Electronics Architecture |
| 1.1.1 | The Component Scaling Laws |
| 1.1.2 | Distributed Switching Cells and Magnetics Integration |
| 1.2 | Contributions and Thesis Organization |
| 2 | A Systematic Approach to All-in-One Magnetics Integration |
| 2.1 | Background and Motivation |
| 2.2 | Matrix Coupling Structure and Example Topologies |
| 2.3 | Mechanism of Current Ripple Reduction and Steering |
| 2.3.1 | Phase Current Ripple Reduction |
| 2.3.2 | Winding Current Ripple Reduction |
| 2.3.3 | Winding Current Ripple Steering |
| 2.4 | Transient Performance and Figure of Merit |
| 2.5 | Design of a Matrix Coupled SEPIC Converter |
| 2.5.1 | Matrix Coupled Inductor Design |
| 2.5.2 | Matrix Coupled SEPIC Prototype |
| 2.6 | Experimental Results |
| 2.6.1 | Inductor Current Ripple and Converter Dynamics |
| 2.6.2 | Efficiency Measurement and Magnetics Comparison |
| 2.7 | Chapter Summary |
| 3 | Granular Architecture with Parallel Coupled Magnetics for High Current Computing Systems |
| 3.1 | Background and Motivation |
| 3.2 | Multistack Switched-Capacitor (MSC) Architecture |
| 3.3 | A 48-to-1-V MSC-PoL CPU Voltage Regulator |
| 3.3.1 | Topology and Operation Principle |
| 3.3.2 | Dynamic Modeling and Analysis |
| 3.4 | Interphase $L$-$C$ Resonance and Stability Analysis |
| 3.4.1 | Impacts of Coupled Inductors |
| 3.4.2 | Influence on Control Stability |
| 3.5 | MSC-PoL Converter Design with 3D Stacked Packaging |
| 3.5.1 | Ladder-Structured Coupled Inductor |
| 3.5.2 | Gate Driver Circuits and 3D Stacked Packaging |
| 3.6 | Experimental Results |
| 3.6.1 | Prototype and Testbench |
| 3.6.2 | Steady-State Operation |
| 3.6.3 | Transient Performance |
| 3.6.4 | Efficiency Measurement |
| 3.6.5 | Performance Discussions and Comparison |
| 3.7 | Chapter Summary |
| 4 | Granular Architecture with Series Coupled Magnetics for Large-Scale Modular Energy Systems |
| 4.1 | Background and Motivation |
| 4.2 | Multiport ac-Coupled Differential Power Processing |
| 4.3 | Stochastic Power Loss Analysis and Performance Scaling |
| 4.3.1 | Parameter Definitions and Modeling Assumptions |
| 4.3.2 | Fully Coupled DPP and Ladder DPP |
| 4.3.3 | Stochastic Loss Model and Scaling Factor |
| 4.3.4 | Performance Scaling of Various DPP Topologies |
| 4.3.5 | Impacts of Load Correlation |
| 4.4 | Modeling and Control of Multi-Input-Multi-Output (MIMO) Power Flow |
| 4.4.1 | Modeling of MIMO Power Flow |
| 4.4.2 | Small Signal Model for Very Large-Scale MAC-DPP Systems |
| 4.4.3 | Feedback Control based on Distributed Phase Shift Modulation |
| 4.4.4 | Feedforward Control based on the Newton-Raphson Method |
| 4.5 | String Voltage Regulation for DPP |
| 4.5.1 | Power Stress Analysis of SVC |
| 4.5.2 | Example Topology Implementations of SVC |
| 4.6 | Experimental Results |
| 4.6.1 | A 450-W/50-to-5-V 10-Port MAC-DPP Prototype |
| 4.6.2 | HDD Server Testbench |
| 4.6.3 | LED Screen Testbench |
| 4.6.4 | A Buck SVC for the 10-Port MAC-DPP |
| 4.7 | Chapter Summary |
| B | Stochastic Loss Model for DPP: Detailed Derivation and Extended Case Study |
|  | Bibliography |

# List of Tables

| Table | Title |
|-------|-------|
| 2.1 | Key Parameters of Current Ripple Analysis for Symmetric Matrix Coupling |
| 2.2 | Bill-of-Material of the Matrix Coupled SEPIC Prototype |
| 3.1 | Parameters of a Two-Phase SCB Converter |
| 3.2 | Parameters for the Optimal Coupled Inductor Design |
| 3.3 | Comparison between the Two Four-Phase Coupled Inductors |
| 3.4 | Bill-of-Material of the 48-to-1-V MSC-PoL Converter |
| 3.5 | Performance Comparison of 48 V-to-1 V Point-of-Load VRMs |
| 4.1 | Stochastic Power Loss Model Comparison ($M \ge 1$, $N \ge 2$) |
| 4.2 | Comparison between the DAB Converter and DPP Topologies ($N \ge 2$) |
| 4.3 | $R_{out}$ Modeling of SC DPP Topologies at SSL ($N \ge 2$) |
| 4.4 | Simulation Parameters |
| 4.5 | SVC Incurred Power Processing in Buck and Boost Region |
| 4.6 | Comparison of Different SVC Topologies and Standalone dc-dc Regulators |
| 4.7 | Bill-of-Material of the MAC-DPP Converter |
| 4.8 | Read/Write Speed Comparison of Isolated SATA and Standard SATA Link |
| 4.9 | Long-Term Random Read/Write Testing Results |
| 4.10 | Bill-of-Material of the Prototype |
| A.1 | Capacitor Specifications for Data Points in Fig. 1.3a |
| B.1 | Average Power Consumption and DPP Power Loss of Each Voltage Domain and of the Total System |

# List of Figures

| 1.1 | (a) Application power rating vs. operation frequency and (b) theo-                |    |
|-----|-----------------------------------------------------------------------------------|----|
|     | retical $R_{on,sp}Q_{gd,sp}$ limit vs. breakdown voltage $V_B$ of different semi- |    |
|     | conductor device materials. Theoretical material limits are calculated            |    |
|     | based on [7], assuming drain-source voltage $V_D = 0.7V_B$ and gate-drain         |    |
|     | overlapping area ratio $k = 0.1.$                                                 | 40 |
| 1.2 | (a) Example structure of a transformer. (b) Scaling trend of trans-               |    |
|     | former VA power rating, volume, and onboard area versus linear di-                |    |
|     | mension $\lambda$                                                                 | 41 |
| 1.3 | (a) Peak energy density versus voltage rating of commercial Class-I               |    |
|     | and Class-II ceramic capacitors. (b) Comparison of total volume and               |    |
|     | energy storage between one large capacitor and three small capacitors.            |    |
|     | In (a), data are sourced from Murata Database [16]. Derated capac-                |    |
|     | itance due to dc bias is considered. Peak energy storage density is               |    |
|     | calculated based on $\int_0^{V_R} vC(v)dv$ . Detailed capacitor specifications of |    |
|     | the data points are listed in Table A.1                                           | 43 |
| 1.4 | (a) Conventional power architecture with single (or few) lumped                   |    |
|     | switching cell(s) and multiple discrete magnetics. (b) Granular power             |    |
|     | architecture with single (or few) coupled magnetics and multiple                  |    |
|     | distributed switching cells                                                       | 44 |

| 1.5 | (a) An example commercial converter with conventional power archi-         |    |
|-----|----------------------------------------------------------------------------|----|
|     | tecture [21]. (b) A multiport ac-coupled converter with granular power     |    |
|     | architecture [22]                                                          | 44 |
| 1.6 | (a) Example circuit implementations and (b) corresponding device           |    |
|     | power ratings of a conventional power converter and a reconfigurable       |    |
|     | MIMO power converter                                                       | 46 |
| 1.7 | Multiphase interleaving operation of parallel-distributed switching cells. | 47 |
| 2.1 | Magnetics integration on the functional and the package levels: (a)        |    |
|     | WE, dual mode inductors [35]; (b) UCC, split-winding integrated mag-       |    |
|     | netics [36]; (c) EPC, planar matrix transformer [37]; and (d) ViTEC,       |    |
|     | coupled inductor [38], as well as on the IC level: (e) Intel, FIVR [39];   |    |
|     | (f) Apple M1 Pro, integrated coupled inductors [40]                        | 52 |
| 2.2 | Coupled magnetic structures: (a) series coupled; (b) parallel coupled;     |    |
|     | (c) matrix coupled                                                         | 54 |
| 2.3 | Example PWM topologies that may apply matrix coupled magnet-               |    |
|     | ics: (a) multiphase SEPIC; (b) multiphase ZETA; (c) multiphase Ćuk;        |    |
|     | (d) multiphase tapped-inductor buck; (e) multiphase tapped-inductor        |    |
|     | boost; (f) multiphase flyback                                              | 56 |
| 2.4 | Equivalent magnetic models for the matrix coupled magnetics: (a)           |    |
|     | magnetic circuit model; (b) inductance dual model                          | 58 |
| 2.5 | (a) Interleaved winding voltages for parallel phases. (b) Modified in-     |    |
|     | ductance dual model where windings of each phase are combined as           |    |
|     | one port delivering the summed currents into the inductance network.       | 58 |
| 2.6 | Detailed superposition procedures for phase current ripple analysis        | 59 |
| 2.7 | Equivalent inductance dual model per phase for non-interleaved voltages.   | 61 |

| 2.8  | Magnetic models including leakage inductance between series coupled                   |    |
|------|---------------------------------------------------------------------------------------|----|
|      | windings on each core leg: (a) magnetic circuit model; (b) inductance                 |    |
|      | dual model                                                                            | 63 |
| 2.9  | Mapping the inductance dual model back to the one without $\mathcal{R}_K$             | 63 |
| 2.10 | Generalized series coupled winding configuration in the $k^{th}$ phase and            |    |
|      | its Thevenin-equivalent network                                                       | 65 |
| 2.11 | Current ripple steering among the series coupled windings in the $k^{th}$             |    |
|      | phase. If the windings have identical volt-per-turn, the phase current                |    |
|      | ripple $\Delta i_k$ will be linearly divided into each winding by a steering          |    |
|      | coefficient $s_j$                                                                     | 66 |
| 2.12 | (a) Duty ratio command for each phase remains identical during tran-                  |    |
|      | sients. (b) Switching-cycle averaged voltage of each winding is identical.            | 67 |
| 2.13 | (a) Switching-cycle averaged model for each winding. (b) Equivalent                   |    |
|      | inductance seen at each winding during transients                                     | 68 |
| 2.14 | Switching-cycle averaged dynamics of the converter with the matrix                    |    |
|      | coupled inductor are the same as that with discrete inductors of $\mathcal{L}_{tr}$ . | 69 |
| 2.15 | FOM as a function of duty ratio (D) for various numbers of phases                     |    |
|      | $(M)$ and coupling factors $(K_{\alpha\beta})$                                        | 70 |
| 2.16 | (a) Circuit topology of the four-phase matrix coupled synchronous                     |    |
|      | SEPIC converter. (b) Four-leg EE-type magnetic core built with Fer-                   |    |
|      | roxcube 3F4 material. (c) Cross section of the magnetic core and                      |    |
|      | inductor winding annotations                                                          | 71 |
| 2.17 | Alternative winding designs of the matrix coupled inductor based on                   |    |
|      | an 8-layer PCB board of 3-oz copper thickness: (a) side by side; (b)                  |    |
|      | non-interleaved overlapping; (c) interleaved overlapping. Assume each                 |    |
|      | inductor current is $I_L$ . MMF diagrams for windings in the window area              |    |
|      | are plotted along horizontal and vertical directions                                  | 72 |

| 2.18 | FEM simulation of magnetic field strength distributions and ac wind-                                               |    |
|------|--------------------------------------------------------------------------------------------------------------------|----|
|      | ing current distributions in the designs of (a) side by side; (b) non-                                             |    |
|      | interleaved overlapping; (c) interleaved overlapping. Each inductor is                                             |    |
|      | driven by a 1-MHz sinusoidal current excitation of 10-A amplitude.                                                 |    |
|      | Copper thickness and current directions are consistent with Fig. 2.17.                                             | 73 |
| 2.19 | 3-D structure of the matrix coupled inductor and PCB winding pat-                                                  |    |
|      | terns on (a) layer 1 & 3, (b) layer 2 & 4, (c) layer 6 & 8, and (d)                                                |    |
|      | layer 5 & 7. Winding terminal connections of phases 1 & 2 are plot-                                                |    |
|      | ted for demonstration. Phases 3 & 4 are centrosymmetric to phases 1                                                |    |
|      | & 2. The multilayer overlapped implementation of multiple windings                                                 |    |
|      | enables greatly reduced ac resistance                                                                              | 75 |
| 2.20 | Annotated matrix coupled SEPIC prototype: (a) top view; (b) bottom                                                 |    |
|      | view; (c) side view. The prototype measures 35 mm $\times$ 35 mm $\times$ 5.25 mm.                                 | 76 |
| 2.21 | External winding setup: (a) two winding options; (b) current measure-                                              |    |
|      | ment setup; (c) compact winding setup for high power density                                                       | 76 |
| 2.22 | (a) Measured two switch-node voltages, a blocking capacitor volt-                                                  |    |
|      | age, and an inductor current (as defined in Fig. 2.16), when                                                       |    |
|      | $V_{in} = 5 \text{ V}, V_{out} = 3.3 \text{ V}, I_{out} = 50 \text{ A}, \text{and } f_{sw} = 806 \text{ kHz}.$ (b) |    |
|      | Current ripple reduction ratio as a function of duty ratio with different                                          |    |
|      | external winding setups                                                                                            | 78 |
| 2.23 | Inductor current ripple under (a) interleaved operation and (b) non-                                               |    |
|      | interleaved operation. $V_{in}=1~\mathrm{V},~V_{out}=3.3~\mathrm{V},~f_{sw}=1~\mathrm{MHz},~\mathrm{and}$          |    |
|      | tested in the setup with current measurement loops                                                                 | 80 |

| 2.24 | Measured waveforms for verifying current ripple steering due to asym-                                 |    |
|------|-------------------------------------------------------------------------------------------------------|----|
|      | metric series coupling. In phase 1, two external winding leakages are                                 |    |
|      | both 27 nH, while in phase 3, they are 22 nH and 37 nH, respectively.                                 |    |
|      | The ripple steering ratio is inversely proportional to external winding                               |    |
|      | leakage inductances.                                                                                  | 80 |
| 2.25 | (a) Small-signal circuit model of the four-phase matrix coupled SEPIC                                 |    |
|      | converter. (b) Simplified small-signal circuit model. $D$ is the gate                                 |    |
|      | driving duty ratio of the lower switch $S_{k1}$ , and $D' = 1 - D$ . Assume                           |    |
|      | blocking capacitors have stable voltages and can be treated as constant                               |    |
|      | voltage sources.                                                                                      | 81 |
| 2.26 | (a) Modeled and measured Bode plots of the control $(\hat{d})$ to output                              |    |
|      | $(\hat{v}_{out})$ transfer function. (b) Duty ratio perturbation from 50% to 53%.                     |    |
|      | Duty ratio is indicated by the controller DAC output with $0{\sim}3.3~\mathrm{V}$ to                  |    |
|      | represent 0~100%. $V_{in}=3.3$ V, $V_{out}=3.3$ V, $f_{sw}=1$ MHz, effective                          |    |
|      | $C_{out}=168~\mu F$ in (a) and 8.8 $\mu F$ in (b), $R_o=2.5~\mathrm{k}\Omega$ in (a) and 0.4 $\Omega$ |    |
|      | in (b). Tested with current measurement loops                                                         | 82 |
| 2.27 | (a) Measured efficiency of different voltage conversion ratios at                                     |    |
|      | 806 kHz switching frequency. (b) Full-load hot-spot temperature of                                    |    |
|      | the prototype under 36 CFM airflow. $(V_{in} = 5 \text{ V}, V_{out} = 1 \text{ V}, I_{out} =$         |    |
|      | 185 A, and $f_{sw} = 806 \text{ kHz.}$ )                                                              | 83 |

| 2.28 | (a) Detailed power loss breakdown and calculated efficiency for 5 V-to-                          |    |
|------|--------------------------------------------------------------------------------------------------|----|
|      | 1 V voltage conversion at 806 kHz switching frequency. (b) Power loss                            |    |
|      | proportion in the peak-efficiency load condition ( $I_{out}=29~\mathrm{A}$ ) and full            |    |
|      | load condition ( $I_{out} = 185$ A). Loss breakdown includes conduction                          |    |
|      | loss and switching loss of high side and low side switches, $P_{HS.Cond}$ ,                      |    |
|      | $P_{HS.SW}$ , $P_{LS.Cond}$ , $P_{LS.SW}$ ; ESR loss of blocking capacitors $P_{Cap}$ ; in-      |    |
|      | ductor winding loss and core loss, $P_{Winding}$ , $P_{Core}$ ; conduction loss of               |    |
|      | PCB traces and vias, $P_{PCB}$                                                                   | 84 |
| 2.29 | Size comparison between commercial discrete inductors and the matrix                             |    |
|      | coupled inductor. The background grid cell size is 1 cm. Comparison is                           |    |
|      | based on the same current ripple, similar winding dc resistance (DCR),                           |    |
|      | and similar current ratings (i.e., $I_{rms} \ge 40$ A with inductor tempera-                     |    |
|      | ture rise less than 40 °C) when converting voltage from 5 V to 1 V at                            |    |
|      | 806 kHz switching frequency. Box dimensions (length×width×height)                                |    |
|      | of the eight discrete inductors and the matrix coupled inductor are                              |    |
|      | $44\times30.48\times12.66~\mathrm{mm^3}$ and $24\times24\times5.25~\mathrm{mm^3}$ , respectively | 85 |
| 2.30 | (a) Four-phase SEPIC prototype equipped with discrete inductors. (b)                             |    |
|      | Efficiency comparison of the SEPIC prototype with one matrix coupled                             |    |
|      | inductor (compact winding) and with eight discrete inductors                                     | 86 |
| 2.31 | Measured open-loop transient waveforms of the SEPIC prototype with                               |    |
|      | (a) matrix coupled inductor (compact winding); (b) discrete inductors.                           |    |
|      | Duty ratio steps from 17% to 41.9%. $V_{in}=5~\mathrm{V};V_{out}$ changes from 1 V               |    |
|      | to 3.3 V; $I_{out}=20$ A; $f_{sw}=806$ kHz; effective $C_{out}=300~\mu\text{F.}$                 | 86 |

| 3.1 | (a) Microprocessor trend data during 1972 $\sim$ 2022 (replotted                   |    |
|-----|------------------------------------------------------------------------------------|----|
|     | from [74]). (b) As microprocessors develop from single-core, mono-                 |    |
|     | lithic die to multi-core, multiple chiplets, modern computing systems              |    |
|     | are hitting both power wall and memory wall (replotted from [75]).                 |    |
|     | Process node geometry and die area of selected high-performance-tier               |    |
|     | GPUs in [76,77] are plotted along the scaling curve of GPU thermal                 |    |
|     | design power                                                                       | 90 |
| 3.2 | Microprocessor power architecture comparison between (a) traditional               |    |
|     | solution that heavily relies on the on-board power conversion and (b)              |    |
|     | PwrSiP solution that focuses on the in-package power conversion. A                 |    |
|     | two-stage on-board conversion architecture is demonstrated in (a) as               |    |
|     | an example. Labeled efficiencies are sourced from [37, 39, 97] and                 |    |
|     | Section 3.6.4 (including gate loss)                                                | 92 |
| 3.3 | Ultra-thin VRM embedded into a CPU package that fits in a land-                    |    |
|     | grid array (LGA) socket for extreme efficiency, density, and control               |    |
|     | bandwidth                                                                          | 92 |
| 3.4 | MSC-PoL architecture for microprocessor voltage regulation. Stacked                |    |
|     | SC cells break down the high input voltage and create many intermediate            |    |
|     | voltage rails loaded with switched inductor cells to perform voltage               |    |
|     | regulation. Multiple capacitors of the SC stage are soft charged by one            |    |
|     | single parallel coupled magnetic component                                         | 93 |
| 3.5 | MSC-PoL architecture based on modular H-bridge structures. Volt-                   |    |
|     | age conversion ratio can be extended by stacking more H-bridges. The               |    |
|     | switched-inductor current sources can be interleaved to reduce the out-            |    |
|     | put current ripple.                                                                | 95 |

| 3.6  | Example implementations of the MSC-PoL architecture: (a) current           |     |
|------|----------------------------------------------------------------------------|-----|
|      | sources are implemented as parallel multiphase buck converters; (b)        |     |
|      | current sources are separately regulated to supply different output volt-  |     |
|      | age levels; (c) current sources are tapped into different locations of the |     |
|      | stacked SC circuits and can be implemented as different converters,        |     |
|      | such as multiphase buck and multiphase SCB                                 | 96  |
| 3.7  | (a) Circuit topology and (b) key operation waveforms of the 48-to-1-       |     |
|      | V MSC-PoL converter. In subfigure (a), one 2:1 H-bridge SC cell is         |     |
|      | stacked in front and drives two 4-phase SCB cells. GaN FETs are            |     |
|      | plotted in blue and Silicon MOSFETs are plotted in red. Maximum            |     |
|      | voltage stress of each switch is labeled aside. In subfigure (b), inductor |     |
|      | currents and blocking capacitor voltages of the SCB cell A are plotted.    |     |
|      | Two SCB cells are interleaved by 180° phase shift as an example.           | 98  |
| 3.8  | Small-signal circuit model of the 48-to-1-V MSC-PoL converter              | 100 |
| 3.9  | Circuit topology and operation waveforms of an example two-phase           |     |
|      | series-capacitor buck converter with discrete inductors. The maximum       |     |
|      | switch voltage stress is labeled in red. Coupled inductors can be uti-     |     |
|      | lized to replace the discrete ones, and phase number can be extended       |     |
|      | by stacking more series-capacitor buck cells [54, 108], as indicated by    |     |
|      | the grey lines and grey dots                                               | 103 |
| 3.10 | Large-signal average model and its equivalent circuit model                | 104 |
| 3.11 | Input voltage disturbance and its response decomposed into: (a)            |     |
|      | common-mode dynamics; (b) differential-mode dynamics                       | 104 |

| 3.12 | (a) Response decomposition of common-mode and differential-mode                                       |     |
|------|-------------------------------------------------------------------------------------------------------|-----|
|      | dynamics for a general $M$-phase SCB converter ($R_C$ is ignored here).                               |     |
|      | (b) The $\tilde{v}_{in}$-to-$\tilde{i}_L$ transfer functions of an example 3-phase SCB converter,     |     |
|      | where $L=50$ nH, $C_{B1,2}=30$ $\mu$F, $C_o=100$ $\mu$F, $R_o=1$ $\Omega$,                            |     |
|      | $D=\frac{1}{6}$                                                                                       | 108 |
| 3.13 | (a) Large-signal average model of using coupled inductors. (b) Equiv-                                 |     |
|      | alent circuit and parameter conversion for a two-phase coupled inductor.                              | 109 |
| 3.14 | Bode plots of $G_{v_{in}\Delta i_L}$ with different coupling coefficients                             | 110 |
| 3.15 | Simulated and calculated $\Delta i_L$ during a line transient ( $v_{in}$ steps from                   |     |
|      | 12 V to 14 V) when using (a) discrete inductors ( $\beta=0$ ) and (b) a                               |     |
|      | coupled inductor $(\beta = 5)$                                                                        | 111 |
| 3.16 | Block diagram of an SCB converter with typical voltage-mode control.                                  | 112 |
| 3.17 | Measured open and closed loop transfer functions in SPICE simula-                                     |     |
|      | tions: (a) open loop $G_{dv_o}$ & loop gain, (b) $G_{v_{in}i_o}$ , (c) $G_{v_{in}v_o}$ , and (d)      |     |
|      | $G_{v_{in}\Delta i_L}$ . $(\beta=0)$                                                                  | 113 |
| 3.18 | Simulated voltage and current responses to a line transient ($V_{in}=12$ V                            |     |
|      | → 14 V → 12 V) in the case of (a) open loop and (b) closed loop. ($\beta = 0$)                        | 115 |
| 3.19 | Bode plots of (a) $G_{\Delta d\Delta i_L}$ and (b) $G_{\Delta dv_c}$ under different load conditions. | 116 |
| 3.20 | Two four-phase coupled inductor designs based on (a) a ladder core                                    |     |
|      | and (b) a ladder core plus a leakage plate. The ladder core is made of                                |     |
|      | DMR51W ( $\mu_r = 900$ ), while the leakage plate is made of DMR53 ( $\mu_r =$                        |     |
|      | 900), a higher frequency magnetic material to enhance the leakage flux                                |     |
|      | path                                                                                                  | 117 |

| 3.21 | Annotated design dimensions for the ladder core. To fit the PCB lay-                                                   |     |
|------|------------------------------------------------------------------------------------------------------------------------|-----|
|      | out, the entire inductor shape can be determined by three dimension                                                    |     |
|      | variables: $X_{Leg}$ , $H_{Leg}$ , and $H_{tot}$ . Predicted core loss for geometry opti-                              |     |
|      | mization is based on the flux density in each core segment (labeled in                                                 |     |
|      | blue) using iGSE                                                                                                       | 118 |
| 3.22 | Equivalent magnetic models for a ladder-structured coupled inductor:                                                   |     |
|      | (a) magnetic circuit model; (b) inductance dual model. The magnetic                                                    |     |
|      | flux in each core segment can be calculated through probing the current                                                |     |
|      | in the inductance dual model and dividing it by the corresponding                                                      |     |
|      | reluctance. For the designed coupled inductors, the turns ratio $n = 1$ .                                              | 119 |
| 3.23 | Calculated and ANSYS-simulated magnetic flux density in: (a) each                                                      |     |
|      | core header $(B_{H1} \sim B_{H3})$ and (b) each core leg $(B_{L1} \sim B_{L4})$ . $V_o = 1 \text{ V}$ ;                |     |
|      | $D = \frac{1}{6}$; $f_{sw} = 500$ kHz.                                                                                 | 121 |
| 3.24 | Optimization process for the ladder-core coupled inductor: (a) total                                                   |     |
|      | inductor loss contour plot at a specific $H_{tot}$ ; (b) optimized inductor                                            |     |
|      | loss versus $H_{tot}$ . Core loss and conduction loss are optimized for one                                            |     |
|      | coupled inductor (four-phase) supporting 125 A at 500 kHz switching                                                    |     |
|      | frequency.                                                                                                             | 122 |
| 3.25 | Customized magnetic components: (a) four-phase ladder magnetic core                                                    |     |
|      | (DMR51W, $\mu_r = 900$ ); (b) CNC-machined windings; (c) leakage mag-                                                  |     |
|      | netic plate (DMR53, $\mu_r = 900$ )                                                                                    | 123 |
| 3.26 | Coupled inductor height of: (a) using the ladder core only; (b) using                                                  |     |
|      | the ladder core plus the 0.8-mm leakage plate with a 0.2-mm air gap.                                                   | 123 |
| 3.27 | ANSYS FEM simulation of the two coupled inductor designs: (a) dc                                                       |     |
|      | flux density distribution when supporting 31.25 A average current per                                                  |     |
|      | phase (125 A in total) and (b) ac flux density distribution at $t=1~\mu\mathrm{s}$                                     |     |
|      | of one switching cycle. $V_{in} = 48$ V; $V_o = 1$ V; $f_{sw} = 500$ kHz.                                              | 124 |

| 3.28 | Simulated steady-state inductor currents and transient output voltages                                   |       |
|------|----------------------------------------------------------------------------------------------------------|-------|
|      | during a duty ratio step change when using: (a) the coupled inductor                                     |       |
|      | with ladder core only and discrete inductors of its equivalent $L_{ss}$ and                              |       |
|      | $L_{tr}$ ; (b) the coupled inductor with ladder core plus leakage plate. $V_{in} =$                      |       |
|      | 48 V, $V_o = 1 \rightarrow 1.2$ V, $f_{sw} = 500$ kHz, $R_{eq} = 3$ m $\Omega$ , $R_o = 0.01$ $\Omega$ , |       |
|      | $C_o = 1$ mF. (Steady-state inductor currents are simulated at $V_o = 1$ V.)                             | 125   |
| 3.29 | Design of gate driver circuits and bootstrap chains (plotted in green)                                   |       |
|      | for one MSC-PoL module. All gate driver and bootstrap circuits are                                       |       |
|      | laid out together with the power stage inside the compact converter                                      |       |
|      | package                                                                                                  | 126   |
| 3.30 | PCB layout and 3D stacked packaging of the MSC-PoL VRM: (a)                                              |       |
|      | annotated top view; (b) annotated bottom assembly view. The PCB                                          |       |
|      | area is 31.9 mm $\times$ 26.6 mm = 848.54 mm <sup>2</sup> , and the total VRM height                     |       |
|      | is only 6 mm (7 mm if including the leakage plate)                                                       | 128   |
| 3.31 | (a) Block diagram of the prototype power stage. (b) An example phase                                     |       |
|      | shift strategy, which enables 16-phase interleaving with multiplied                                      |       |
|      | ripple frequency $(16 \times f_{sw})$ and reduced ripple amplitude of the output                         |       |
|      | current                                                                                                  | 129   |
| 3.32 | (a) Picture of the 48-to-1-V/450-A MSC-PoL prototype containing two                                      |       |
|      | MSC-PoL modules, a signal interface board, and two microcontroller                                       |       |
|      | boards. Each MSC-PoL module is covered by a heat sink together with                                      |       |
|      | a dc fan                                                                                                 | 130   |
| 3.33 | (a) One MSC-PoL module (w/o the leakage plate) compared with a                                           |       |
|      | U.S. quarter. (b) Mechanical demonstration of a 225 W 48-to-1-V                                          |       |
|      | MSC-PoL module embedded into a 3D-printed FCLGA-3647 socket to                                           |       |
|      | support a server CPU (Intel Xeon Platinum 8280, 205 W)                                                   | 130   |

| 3.34 | Picture of the experimental testbench. Digital multimeters are inter-                                                                  |    |
|------|----------------------------------------------------------------------------------------------------------------------------------------|----|
|      | faced with the BenchVue platform to automatically collect efficiency                                                                   |    |
|      | measurement results. Two current shunts are utilized for measuring                                                                     |    |
|      | the input and the output currents. A dc power source is used as the                                                                    |    |
|      | 48 V dc bus. Multiple electronic loads are connected in parallel to                                                                    |    |
|      | drain high load currents                                                                                                               | 131 |
| 3.35 | Steady-state waveforms of switch drain-source voltages and intermedi-                                                                  |    |
|      | ate rail voltages. $V_{\text{Rail1A}}$ and $V_{\text{Rail1B}}$ are the positive and the negative                                       |    |
|      | terminal voltages of the flying capacitor $C_{fly}$. $f_{sw}=400$ kHz; $V_o=1$ V.                                                      | 132 |
| 3.36 | Steady-state waveforms of switch node voltages and output voltage                                                                      |    |
|      | ripples. The 16-phase interleaving operation in Fig. 3.31 is applied,                                                                  |    |
|      | yielding $16f_{sw}$ ripple frequency for the output voltage. $f_{sw} = 400 \text{ kHz}$ ;                                              |    |
|      | $V_o = 1$ V.                                                                                                                           | 133 |
| 3.37 | Steady-state waveforms of: (a) capacitor dc voltages; (b) capacitor ac                                                                 |    |
|      | voltage ripples and output current. $f_{sw} = 400$ kHz; $V_o = 1$ V; $I_o = 400$ A.                                                    | 134 |
| 3.38 | Measured open-loop transient waveforms with one MSC-PoL module                                                                         |    |
|      | when (a) using the leakage plate and (b) not using the leakage plate.                                                                  |    |
|      | Duty ratio steps from 15.8% to 22.2%, yielding a step change $V_o$ from                                                                |    |
|      | 0.8 V to 1.2 V. $f_{sw} = 704$ kHz; $I_o = 100$ A; $C_o = 3$ mF.                                                                       | 134 |
| 3.39 | Measured closed-loop transient waveforms with one MSC-PoL mod-                                                                         |    |
|      | ule (w/o the leakage plate) during a load step change between 50 A                                                                     |    |
|      | and 150 A. A typical voltage-mode feedback control is applied. The                                                                     |    |
|      | maximum voltage overshoot is less than 80 mV during the 100 A load                                                                     |    |
|      | step (44% of the full load) with 4 A/ $\mu$ s current slope. $f_{sw} = 704$ kHz;                                                       |    |
|      | $C_o = 3$ mF.                                                                                                                          | 135 |

| 3.40 | Measured 48-to-1-V efficiency of the MSC-PoL prototype when (a) us-         |     |
|------|-----------------------------------------------------------------------------|-----|
|      | ing the leakage plate and (b) not using the leakage plate. Efficiencies of  |     |
|      | different switching frequencies excluding and including the gate losses     |     |
|      | are plotted and compared. $V_{drive} = 8$ V.                                | 136 |
| 3.41 | Thermal image of the MSC-PoL prototype when operating at                    |     |
|      | 48-to-1-V/450-A, $f_{sw}=400$ kHz under dc fan and heat sink cooling for more |     |
|      | than 10 minutes. The hot-spot temperature of the heat sink remains          |     |
|      | around 45 °C. The ambient temperature is around 25 °C                       | 138 |
| 3.42 | Loss breakdown of the 48-to-1-V/400 kHz MSC-PoL prototype (with             |     |
|      | the leakage plate) at (a) full load range and (b) two specific load con-    |     |
|      | ditions. Gate loss is included. The power loss breakdown listed in the      |     |
|      | legend is ordered from bottom to top in the bar chart and clockwise         |     |
|      | from 12 o'clock in the pie charts                                           | 139 |
| 3.43 | Performance comparison of the MSC-PoL prototype (with the leakage           |     |
|      | plate) and other 48 V-to-1 V VRMs. Efficiency and power density             |     |
|      | points (including gate loss and size) at full load and peak-efficiency      |     |
|      | load are plotted and connected with a line. Switching frequencies are       |     |
|      | color coded, corresponding to the logarithmic color bar. The MSC-PoL        |     |
|      | prototype achieves both excellent efficiency and power density among        |     |
|      | state-of-the-art VRM designs                                                | 141 |
| 4.1  | Example large-scale energy systems: (a) photovoltaic systems; (b) bat-      |     |
|      | tery storage systems; and (c) data center servers                           | 144 |
| 4.2  | A data storage server with series stacked power delivery architecture.      |     |
|      | It comprises a cluster of $N \times M$ HDDs divided into $N$ series-stacked |     |
|      | voltage domains with differential power processing.                         | 145 |

| 4.3 | (a) Proposed MAC-DPP architecture. (b) Magnetic flux in the mag-                      |     |
|-----|---------------------------------------------------------------------------------------|-----|
|     | netic core of a multi-winding transformer with a single magnetic link-                |     |
|     | age. $\Phi_i$ is the magnetizing flux, and $\Delta\Phi_{ij}$ is the leakage flux. (c) |     |
|     | Waveforms of winding volt-per-turn and peak-peak flux variation                       | 149 |
| 4.4 | (a) FEM simulation setup: two windings are driven by two sinusoidal                   |     |
|     | voltage sources of different phase shifts. (b) Simulated magnetic flux                |     |
|     | density inside the core at the phase-shift of 0 degree and 180 degree,                |     |
|     | respectively. (c) Peak magnetic flux density in the spacing between                   |     |
|     | two adjacent windings when sweeping the voltage phase-shift from $0^{\circ}$          |     |
|     | to 180°.                                                                              | 151 |
| 4.5 | (a) An $N \times M$ DPP system with $N$ series-stacked voltage domains,               |     |
|     | each comprising $M$ load or source modules. $P_{ij}(t)$ and $P_i(t)$ are the          |     |
|     | power of one dc module and of one voltage domain, respectively; $\Delta P_i(t)$       |     |
|     | is the power mismatch for one voltage domain. (b) Load power and                      |     |
|     | mismatched power of each voltage domain is a random process with a                    |     |
|     | certain probability distribution (Gaussian distributions are shown here               |     |
|     | as an example)                                                                        | 154 |
| 4.6 | Typical DPP architectures: (a) fully-coupled DPP; (b) ladder DPP                      | 156 |
| 4.7 | Equivalent circuit model for loss analysis of: (a) conventional $N:1$ dc-             |     |
|     | dc converter based on a DAB; (b) fully-coupled DPP; (c) ladder DPP.                   | 158 |
| 4.8 | Expected power loss of the $i^{th}$ port or submodule in a fully-coupled              |     |
|     | DPP converter and a ladder DPP converter with $N$ series voltage do-                  |     |
|     | mains                                                                                 | 161 |

| 4.9  | Magnetic core window area distribution and winding conductance.                          |     |
|------|------------------------------------------------------------------------------------------|-----|
|      | Total core window area is proportional to $\sum G_m n^2$. $A_w$ represents the           |     |
|      | distributed area for each winding, $n$ is the effective number of turns                  |     |
|      | in each winding, $\rho$ is the winding resistivity, and $MLT$ is the mean                |     |
|      | length per turn, set to be identical for all windings                                    | 162 |
| 4.10 | Fully-coupled DPP topologies: (a) ac fully-coupled DPP [22, 130]; (b)                    |     |
|      | dc fully-coupled DPP [129, 139]; (c) Dickson-SC DPP [123]                                | 163 |
| 4.11 | Ladder DPP topologies: (a) ladder DPP with buck-boost cells [121,                        |     |
|      | 135, 137, 158]; (b) ladder DPP with DAB cells; (c) ladder-SC DPP [18,                    |     |
|      | 123, 124, 134, 140]                                                                      | 164 |
| 4.12 | Calculated and simulated loss ratio $\beta$ as a function of $N$ in: (a) fully-          |     |
|      | coupled DPP converters; (b) ladder DPP converters                                        | 168 |
| 4.13 | Calculated and simulated loss ratio $\beta$ as a function of the number of               |     |
|      | the parallel loads $M$ in (a) fully-coupled and (b) ladder DPP converters.               | 168 |
| 4.14 | Calculated and simulated loss ratio $\beta$ as a function of coefficient of              |     |
|      | variance $C_V$ in (a) fully-coupled and (b) ladder DPP converters                        | 169 |
| 4.15 | Two types of load correlation in an $N \times M$ DPP system: (1) vertical                |     |
|      | correlation across different voltage domains is denoted in green;                        |     |
|      | (2) horizontal correlation between loads within one voltage domain is                    |     |
|      | denoted in blue                                                                          | 171 |
| 4.16 | (a) Vertical correlation matrix $\rho_V$: $\rho_V(i,j)$ is the correlation coefficient   |     |
|      | between the $i^{th}$ and $j^{th}$ domain power, $P_i(t)$ and $P_j(t)$; (b) Horizontal    |     |
|      | correlation matrix $\rho_{Hk}$: $\rho_{Hk}(i,j)$ is the correlation coefficient between  |     |
|      | the $i^{th}$ and $j^{th}$ load power in the $k^{th}$ domain, $P_{ki}(t)$ and $P_{kj}(t)$ | 172 |
| 4.17 | Power loss of a fully-coupled DPP converter reaches its maximum with                     |     |
|      | worst case load correlation, where $\rho_{Hk}(i,j)=1$ for all loads within a             |     |
|      | voltage domain, and $\operatorname{Var}\left[\sum_{k=1}^{N} P_k(t)\right] = 0.$          | 174 |

| 4.18 | (a) The MAC-DPP converter functions as a multi-input-multi-output                                                        |     |
|------|--------------------------------------------------------------------------------------------------------------------------|-----|
|      | (MIMO) system. (b) Equivalent lumped circuit model to analyze the                                                        |     |
|      | MIMO power flow. The $N$ -port passive network is represented by a                                                       |     |
|      | delta network, and each dc-ac unit is modeled as a square-wave voltage                                                   |     |
|      | source                                                                                                                   | 175 |
| 4.19 | (a) Example waveforms of normalized port voltages $(\frac{V_1}{N_1} \sim \frac{V_3}{N_3})$ and                           |     |
|      | branch inductor current $(I_{13})$ with phase-shift modulation. (b) Average                                              |     |
|      | power flow between the $i^{\rm th}$ and the $j^{\rm th}$ ports as a function of phase shift                              |     |
|      | $\phi_{ij}$                                                                                                              | 176 |
| 4.20 | (a) Block diagram of a MAC-DPP architecture with a large number of                                                       |     |
|      | ac-coupled voltage domains connected in series; small-signal model of                                                    |     |
|      | (b) a DAB converter and (c) a MAC converter                                                                              | 178 |
| 4.21 | (a) Equivalent lumped circuit model to analyze the transfer function                                                     |     |
|      | of a DAB converter. (b) Inductor current variation $\Delta I_L$ due to $\Delta V_{out}$ .                                |     |
|      | (c) Current and voltage waveforms of DAB with power losses                                                               | 181 |
| 4.22 | (a) Improved small signal model of DAB considering power losses. (b)                                                     |     |
|      | Equivalent circuit of MAC showing the $i^{th}$ port inductor current vari-                                               |     |
|      | ation $\Delta I_{Li}$ due to $\Delta V_i$ . (c) Improved small signal model of MAC                                       |     |
|      | considering power losses                                                                                                 | 182 |
| 4.23 | (a) Comparison between calculated and simulated transfer function of                                                     |     |
|      | a 10-port MAC-DPP converter with and without power losses. (b)                                                           |     |
|      | Calculated and simulated $\phi$-to-$v$ Bode plots for three arbitrary ports                                              |     |
|      | in a 100-port lossless MAC-DPP system: (1) transfer function from                                                        |     |
|      | $\hat{\phi}_{10}$ to $\hat{v}_1$ ; (2) from $\hat{\phi}_{45}$ to $\hat{v}_1$ ; (3) from $\hat{\phi}_{92}$ to $\hat{v}_1$ | 184 |
| 4.24 | PLECS simulation platform of a 100-port lossless MAC-DPP converter.                                                      | 185 |

| 4.25 | (a) Principles of the modular distributed control strategy of an example                 |     |
|------|------------------------------------------------------------------------------------------|-----|
|      | 3-port MAC-DPP converter. (b) Equivalent single loop for each port.                      |     |
|      | (c) Loop gain Bode plot of port #1, port #10 with and without PI                         |     |
|      | controller. Port #1 has the heaviest load with the largest phase margin,                 |     |
|      | while port #10 has the lightest load with the lowest phase margin                        | 186 |
| 4.26 | (a) Feasible power range of a three-port MAC converter. (b) Fractal                      |     |
|      | convergence region of the Newton-Raphson solver. The solver is more                      |     |
|      | likely to converge if the initial anticipated points are close to the final              |     |
|      | solution. Empirically, for a symmetric multiport network, starting from                  |     |
|      | the origin is always a good strategy                                                     | 191 |
| 4.27 | A series voltage compensator (SVC) leveraging the partial power processing               |     |
|      | concept for voltage regulation of DPP systems. SVC only processes                        |     |
|      | a fraction of total power. Major power is directly delivered to                          |     |
|      | the DPP loads                                                                            | 193 |
| 4.28 | The SVC incurred power processing consists of: (1) SVC processed                         |     |
|      | power $P_{SVC}$; (2) additional differential power in DPP converters $\Delta P_{DPP}$.   | 195 |
| 4.29 | Conventional voltage pre-regulator for DPP system. In contrast, the                      |     |
|      | standalone dc-dc regulator needs to process total system input power                     |     |
|      | $P_{IN}$                                                                                 | 195 |
| 4.30 | (a) Normalized SVC processed power $\rho_{SVC}$ as a function of voltage                 |     |
|      | regulation ratio $M_v$. (b) Normalized additional DPP processed power                    |     |
|      | $\rho_{DPP}$ as a function of the voltage regulation ratio $M_v$. (c) Normalized         |     |
|      | total SVC incurred power processing $\rho_{tot} = \rho_{SVC} + \rho_{DPP}$ as a function |     |
|      | of the voltage regulation ratio $M_v$. Each curve is plotted only within                 |     |
|      | its feasible range: $M_v < \frac{1}{1-K}$                                                | 197 |

| 4.31 | Several circuit implementations of SVC: (a) buck SVC; (b) boost SVC;        |     |
|------|-----------------------------------------------------------------------------|-----|
|      | (c) buck-boost SVC; (d) extra DPP port [137,174]. The negative ter-         |     |
|      | minal of the input and output ports of the SVC is connected to the          |     |
|      | negative terminal of the first voltage domain to achieve the maximum        |     |
|      | benefits.                                                                   | 200 |
| 4.32 | Transistor and inductor $CLF$s comparison between: (a) different SVC        |     |
|      | topologies when $K_s=0.5$; (b) buck SVC and conventional buck; (c)          |     |
|      | boost SVC and conventional boost; and (d) buck-boost SVC and con-           |     |
|      | ventional buck-boost. For each $K_s$, $CLF$s are plotted within the fea-    |     |
|      | sible range: $M_v < \frac{1}{1-K_s}$                                        | 203 |
| 4.33 | Regulation ratio $(M_v)$ at the crossing point where transistor $CLFs$ of   |     |
|      | the SVC topology and its conventional counterpart are equal. The            |     |
|      | crossing point $M_v$ value is not continuous at $K_s = 1$ , where the tran- |     |
|      | sistor $CLF$ curves of the SVC and conventional converter will overlap      |     |
|      | instead of crossing                                                         | 204 |
| 4.34 | (a) Topology of a 10-port MAC-DPP converter with dc-ac units im-            |     |
|      | plemented as half-bridge circuits. (b) Modular isolated PWM driving         |     |
|      | circuit (in red) and voltage sampling circuit (in blue) at each port. (c)   |     |
|      | Annotated top view, side view, and 3D assembly view of the 10-port          |     |
|      | MAC-DPP prototype. The prototype is 40 mm $\times$ 35 mm in area and        |     |
|      | 7.56 mm in height. (d) Winding patterns on main power board (4              |     |
|      | layers) and bottom cover (6 layers)                                         | 206 |
| 4.35 | The 450 W 10-port MAC-DPP prototype and a U.S. quarter. The                 |     |
|      | peak system efficiency is $>99\%$ , and the peak converter efficiency is    |     |
|      | >96%                                                                        | 208 |

| 4.36 | (a) Port-to-port power converter efficiency in different cases. When              |     |
|------|-----------------------------------------------------------------------------------|-----|
|      | delivering 40 W from 9 ports to 1 port, the hot-spot temperature of the           |     |
|      | output port reached 114 °C under 110 CFM airflow. (b) System power                |     |
|      | conversion efficiency (total load power: 450 W)                                   | 209 |
| 4.37 | (a) Estimated conduction loss when delivering power from 9 ports to               |     |
|      | 1 port at different switching frequencies. (b) Estimated core loss and            |     |
|      | switching loss as a function of the switching frequency from $50~\mathrm{kHz}$ to |     |
|      | 200 kHz. Gate drive loss is not included. (c) Estimated total power loss          |     |
|      | of the MAC-DPP prototype when delivering power from 9 ports to 1                  |     |
|      | port at different frequencies. The total power loss includes conduction           |     |
|      | loss, core loss and switching loss                                                | 212 |
| 4.38 | Pictures of the Backblaze server (a) with the original ac-dc power sup-           |     |
|      | ply; (b) after replacing the power supply with MAC-DPP converter.                 |     |
|      | Both the power and the communication circuitry are reconfigured. The              |     |
|      | server comprises an Intel i3-2100 $3.10~\mathrm{GHz}$ CPU, a Supermicro MBD-      |     |
|      | X9SCM-F motherboard, and 8 GB RAMs                                                | 213 |
| 4.39 | Data link infrastructure of the series-stacked HDD server testbench:              |     |
|      | (a) Three-layer data link block diagram. (b) Component connection                 |     |
|      | diagram                                                                           | 214 |
| 4.40 | Isolated SATA wiring pattern of the modified Backblaze storage server.            |     |
|      | The three ground wires are removed, and the four differential signals             |     |
|      | are capacitive isolated. Note the SATA extension cards selected in this           |     |
|      | prototype have internal isolation capacitors. No external capacitors              |     |
|      | are needed                                                                        | 215 |

| 4.41 | Experimental setup for the HDD read/write speed comparison be-              |     |
|------|-----------------------------------------------------------------------------|-----|
|      | tween isolated SATA and standard SATA communication. Ten 2.5-inch           |     |
|      | HDDs are in series to a 50 V dc bus. The same HDD was swapped from          |     |
|      | the first voltage domain (isolated SATA) to the last domain (standard       |     |
|      | SATA) to test the read/write speed in sequential and 4KB random             |     |
|      | mode. The speed was tested using the disk drive benchmark tool,             |     |
|      | CrystalDiskMark V6.0                                                        | 215 |
| 4.42 | (a) Side view and (b) top view of the HDD server testbench supported        |     |
|      | by the MAC-DPP converter                                                    | 216 |
| 4.43 | LabVIEW real-time monitoring system. It measures and records the            |     |
|      | voltage and current waveforms of all ten series-stacked domains, and        |     |
|      | calculates the system efficiency in real time. In this example, the input   |     |
|      | power is 93.31 W, the load power is 92.99 W, and the instantaneous          |     |
|      | system efficiency is 99.79%                                                 | 217 |
| 4.44 | Experiment waveforms of all voltage domains at random read-                 |     |
|      | ing/writing test measured by LabVIEW: (a) voltage waveforms; (b)            |     |
|      | current waveforms.                                                          | 218 |
| 4.45 | (a) Transient response when hot-swapping an entire voltage domain           |     |
|      | (removing 5 HDDs from port #5) of the HDD server testbench. Volt-           |     |
|      | age measurements are ac-coupled, and current measurements are dc-           |     |
|      | coupled. (b) Transient response of a 25 W step load change at port #6.      |     |
|      | The settling time is $0.5$ ms, and the voltage overshoot is less than $250$ |     |
|      | mV. Voltage measurements are ac-coupled, and current measurements           |     |
|      | are dc-coupled.                                                             | 220 |
| 4.46 | Measured system efficiency when a different number of voltage domains       |     |
|      | are swapped out. The average overall load power is annotated beside         |     |
|      | each data point. The system efficiency drops as more HDDs are removed       | 221 |

| 4.47 | Thermal images of the MAC-DPP prototype in (a) balanced load and           |     |
|------|----------------------------------------------------------------------------|-----|
|      | (b) hot-swapping an entire voltage domain. The thermal images were         |     |
|      | taken at 25°C ambient temperature after the testbench had run for 10       |     |
|      | min without forced air flow                                                | 221 |
| 4.48 | Comparison of the 10-port MAC-DPP prototype with many state-of-            |     |
|      | the-art commercial 48V-5V dc-dc converters. The MAC-DPP con-               |     |
|      | verter achieves over 10x power loss reduction compared with most           |     |
|      | industry products with top-ranking power density. This comparison is       |     |
|      | based on the DPP system efficiency. The port-to-port converter effi-       |     |
|      | ciency is shown in Fig. 4.36a. The size of the microcontroller is not      |     |
|      | included in the volume calculation.                                        | 223 |
| 4.49 | (a) Two different RAID levels: RAID 0 (striped volume) and RAID 1          |     |
|      | (mirrored volume) [180]. (b) Implementation of different RAID levels       |     |
|      | on the $10 \times 5$ HDD array. HDDs can be vertically or horizontally     |     |
|      | grouped into RAID systems                                                  | 224 |
| 4.50 | Experimental results of writing test under different storage architec-     |     |
|      | tures. HDD server performance was analyzed in multiple aspects in-         |     |
|      | cluding: (a) time consumption; (b) system efficiency; (c) energy con-      |     |
|      | sumption of the overall system (including working/idling HDDs and          |     |
|      | backplanes), or just the HDDs accessed by the writing test                 | 226 |
| 4.51 | (a) Experimental test bench with a $30 \times 20$ LED array. (b) Power and |     |
|      | signal configuration. 600 LEDs are divided into 10 series-stacked volt-    |     |
|      | age domains and supported by the 10-port MAC-DPP converter. Each           |     |
|      | LED is individually addressable from the MCU controller.                   | 227 |

| 4.52 | (a) The 10-port MAC-DPP prototype. (b) Equivalent circuits of the                |     |
|------|----------------------------------------------------------------------------------|-----|
|      | MAC-DPP prototype when delivering power from 5 ports to 5 ports                  |     |
|      | $(V_{IN} = V_{OUT} = 5 \text{ V})$ . (c) Measured power loss vs. output current  |     |
|      | square for 5-port-to-5-port power delivery. This measurement is per-             |     |
|      | formed on common ground without sampling resistors, etc., so the                 |     |
|      | $485~\mathrm{mW}$ control and auxiliary losses are not captured in static loss   | 228 |
| 4.53 | (a) Measured total input power and each domain power in LabVIEW.                 |     |
|      | (b) Measured average power loss of the DPP system in LabVIEW. (c)                |     |
|      | Vertical correlation matrix based on sampled data (2 min) of each do-            |     |
|      | main power. Diagonal histograms plot the distribution of each domain             |     |
|      | power. Non-diagonal scatter plots depict power correlation between               |     |
|      | each pair of domains and the correlation coefficients                            | 230 |
| 4.54 | Example zooms from Fig. 4.53c: (a) diagonal histogram of domain #1;              |     |
|      | (b) diagonal histogram of domain #2; (c) non-diagonal scatter plot of            |     |
|      | domain #1 power and domain #2 power                                              | 231 |
| 4.55 | Experimental setup to validate the model as: (a) $M$ increases; (b) $\sigma_0^2$ |     |
|      | increases.                                                                       | 231 |
| 4.56 | Comparison between expected power loss and measured average loss as              |     |
|      | M increases in the case of: (a) independent load; (b) worst-case hori-           |     |
|      | zontal load correlation. The calibrated loss is the sum of the modeled           |     |
|      | loss and the estimated 667 mW overhead                                           | 232 |
| 4.57 | Comparison between expected power loss and measured average loss                 |     |
|      | when $\sigma_0^2$ increases. The calibrated loss is the sum of modeled loss and  |     |
|      | the estimated 667 mW overhead                                                    | 232 |
| 4.58 | Experimental setup for horizontal correlation. Here, each horizontally           |     |
|      | correlated group contains two correlated LEDs with $\rho = 1.$                   | 233 |

| 4.59 | LED screen pattern, power waveform and the probability histogram of            |     |
|------|--------------------------------------------------------------------------------|-----|
|      | domain $\#1$ when $60$ LEDs of each voltage domain are: (a) independent;       |     |
|      | (b) horizontally grouped with 6 LEDs/group; (c) horizontally grouped           |     |
|      | with 20 LEDs/group; (d) horizontally grouped with 60 LEDs/group                | 233 |
| 4.60 | Comparison between expected power loss and measured average loss as            |     |
|      | the number of LEDs per horizontal group increases. A larger number             |     |
|      | of LEDs per group represents a stronger positive horizontal correlation.       |     |
|      | The calibrated loss is the sum of modeled loss and estimated $667~\mathrm{mW}$ |     |
|      | overhead                                                                       | 234 |
| 4.61 | Experimental setup for vertical load correlation. Here is an example           |     |
|      | in which two vertically correlated groups are set up. In each vertically       |     |
|      | correlated group, $\rho=1$ for any two loads within the group                  | 235 |
| 4.62 | (a) Power distribution histogram of domain $\#1$ and power correlation         |     |
|      | graph between domains $\#1$ and $\#2$ with different number of vertically      |     |
|      | correlated groups. (b) Comparison between expected power loss and              |     |
|      | measured average loss as the number of vertically correlated groups            |     |
|      | increases. A larger number of correlated groups represents a stronger          |     |
|      | positive vertical correlation. The calibrated loss is the sum of modeled       |     |
|      | loss and estimated 667 mW overhead                                             | 236 |
| 4.63 | Circuit topology of a buck SVC attached to the MAC-DPP converter.              | 237 |
| 4.64 | Normalized power rating of the buck SVC and the 10-port DPP con-               |     |
|      | verter. Power ratings are normalized to the maximum system power.              | 239 |
| 4.65 | (a) Control block diagram. (b) Prototype of the buck SVC and the               |     |
|      | 10-port DPP converter in comparison with a US quarter                          | 241 |
| 4.66 | Component volume breakdown of buck SVC and DPP converter                       | 242 |

| 4.67 | (a) Picture of an example application. The SVC-DPP is powering a         |     |
|------|--------------------------------------------------------------------------|-----|
|      | 600-LED screen. (b) Power and signal configuration of the SVC-DPP        |     |
|      | system                                                                   | 242 |
| 4.68 | (a) Measured waveforms of input dc bus voltage, regulated DPP string     |     |
|      | voltage, and the gate driving signal and inductor current of the buck    |     |
|      | SVC. (b) Measured waveforms of DPP string voltage and the voltage        |     |
|      | of domain #1 when input voltage ramps up and down between 55 V           |     |
|      | and 60 V. Input dc bus voltage is measured in dc coupling; DPP string    |     |
|      | voltage and the voltage of domain #1 are measured in ac coupling         | 243 |
| 4.69 | (a) Measured SVC converter efficiency, 1-port-to-9-port DPP converter    |     |
|      | efficiency, and the system efficiency when SVC converting 55 V input     |     |
|      | dc bus voltage into 50 V DPP string voltage. The SVC processed           |     |
|      | power and the DPP processed differential power are labeled along the     |     |
|      | curves. (b) Power loss breakdown of SVC and DPP converter at 100 $\rm W$ |     |
|      | system load power                                                        | 244 |
| 4.70 | (a) System efficiency when converting input dc bus voltage from 55 V,    |     |
|      | 60 V, 65 V into 50 V for DPP system with identical domain load           |     |
|      | powers. (b) System efficiency in the best-case and the worst-case load   |     |
|      | distributions. The buck SVC is converting 55 V input voltage into        |     |
|      | 50 V DPP string voltage                                                  | 245 |
| 4.71 | Load distributions of: (a) best-case system efficiency; (b) worst-case   |     |
|      | system efficiency. The buck SVC is converting 55 V input voltage into    |     |
|      | 50 V DPP string voltage                                                  | 246 |
| 5.1  | Comparison between (a) magnetic-core memory [181] and (b) prospec-       |     |
|      | tive power processor. In the power processor, multiple power loads and   |     |
|      | sources will store and transfer energy through the centralized magnetic  |     |
|      | core structure                                                           | 254 |

| B.1 | Current flow in a Dickson-SC DPP during: (a) phase 1; (b) phase 2.         |     |
|-----|----------------------------------------------------------------------------|-----|
|     | Current flow (in blue) on the left of the dashed line is the average current |     |
|     | per period; current flow (in red) on the right is the average current        |     |
|     | per phase                                                                  | 264 |
| B.2 | Current flow in a ladder-SC DPP during: (a) phase 1; (b) phase 2.          |     |
|     | Color code is the same as that of Fig. B.1                                 | 266 |
| B.3 | Distribution of expected switch conduction loss at FSL (in blue) and       |     |
|     | capacitor charge loss at SSL (in red) of Dickson-SC DPP and ladder-SC      |     |
|     | DPP                                                                        | 267 |
| B.4 | Power consumption waveforms of two example voltage domains when            |     |
|     | the data storage server is running a random read/write program             | 268 |
| B.5 | Probability distribution and correlation of the two example domain         |     |
|     | powers: (a) power distribution histogram of domain A; (b) power dis-       |     |
|     | tribution of domain B; (c) correlation plot of domain A power and          |     |
|     | domain B power.                                                            | 268 |
| B.6 | Differential current $(\Delta I_i)$ of the two example voltage domains     | 269 |

# Chapter 1

# Introduction

#### 1.1 Granular Power Electronics Architecture

Power electronics plays an important role in emerging energy systems including data centers, electric vehicles, and grid-scale energy storage. These high-impact applications demand extreme performance from power conversion, driving a growing need for more efficient, dense, and reliable power electronics systems with faster response.

Take CPU/GPU/ASIC power delivery as an example. The explosive growth of cloud computing and AI applications is pushing computing power consumption to an extreme. High-performance microprocessors can consume hundreds or even thousands of amperes of current within a few cm<sup>2</sup> of chip area. The power density reaches several hundred watts per cm<sup>2</sup>, equivalent to the surface power density of a nuclear reactor or the sun [1,2]. In these systems, there is a strong desire for advanced power electronics converters that can support such high power consumption in a limited space while meeting the efficiency, thermal, and bandwidth requirements.

The introduction of semiconductors into power supply design brought about remarkable improvements in size, efficiency, and reliability. However, subsequent advances in power supplies have been mainly owing to the development of power semiconductor switches and integrated control and driver circuits. Off-the-shelf power converters nowadays still tend to use conventional architectures with simple topologies and employ multiple discrete magnetics, which are bulky and inefficient. As power supply requirements become more sophisticated and stringent, these designs are struggling to keep up.

To pursue extreme performance power conversion and reduce both converter loss and volume, it is insufficient to merely count on iterated product optimizations. We need holistic innovations including new materials, new devices, and, more importantly, innovative power architectures that can make the best use of passive and active devices. The theories and technologies developed in this thesis mainly focus on advanced computing applications, and are widely applicable to other power electronics applications where efficiency, density, and control bandwidth are desired, such as power electronics for vehicles and robotics.

#### 1.1.1 The Component Scaling Laws

Figure 1.1: (a) Application power rating vs. operation frequency and (b) theoretical $R_{on,sp}Q_{gd,sp}$ limit vs. breakdown voltage $V_B$ of different semiconductor device materials. Theoretical material limits are calculated based on [7], assuming drain-source voltage $V_D = 0.7V_B$ and gate-drain overlapping area ratio $k = 0.1$.

1. Power semiconductor switches are critical in power converters. Over the years, many advances have been made in power switches, which have developed from current-controlled devices (e.g., GTO) to voltage-controlled devices (e.g., IGBT, MOSFET) and from silicon-based to wide band-gap (WBG) materials (SiC and GaN), continuously evolving towards higher power ratings and faster speeds to support more emerging applications (Fig. 1.1a). Next-generation power electronics needs to leverage the progress in WBG switches, where the higher band-gap energy contributes to a higher thermal limit and a higher critical electric field [3]. Given the same breakdown voltage, this allows the channel length to be shortened, resulting in a lower on-resistance $R_{on}$ and a smaller switch package. Several figures of merit (FOMs) have been developed to quantify the material/device performance from different aspects [3–7]. According to [7], the theoretical minimum power loss (including both conduction and switching losses) of a power semiconductor switch in a typical switched-mode power supply is:

$$P_{loss,min} = \left(2I_{rms}\sqrt{\frac{V_D I_D f}{I_g}}\right) \cdot \sqrt{R_{on,sp}Q_{gd,sp}},\tag{1.1}$$

where  $I_{rms}$  and  $I_D$  are the switch root-mean-square (rms) current and the switch turn-on/-off current, respectively;  $I_g$  is the average gate current;  $V_D$  is the drain-source blocking voltage; f is the switching frequency. Equation (1.1) shows that the minimum device power loss is proportional to  $\sqrt{R_{on,sp}Q_{gd,sp}}$ . The scaling trends of  $R_{on,sp}Q_{gd,sp}$  versus breakdown voltage of different semiconductor materials are plotted in Fig. 1.1b. Compared to silicon switches, SiC and GaN switches can theoretically reduce  $R_{on,sp}Q_{gd,sp}$  by 15 times and 64 times, respectively, which can lead to a decrease in power loss of about 4 times and 8 times, respectively, as per Eq. (1.1).
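As a quick numerical check of the square-root dependence in Eq. (1.1), the following minimal sketch evaluates the expression and the loss reductions implied by the quoted 15-times and 64-times improvements in $R_{on,sp}Q_{gd,sp}$. The function names are illustrative and not part of the thesis; only the figure-of-merit ratios come from the text.

```python
import math

def min_switch_loss(i_rms, v_d, i_d, f_sw, i_g, ron_qgd):
    """Evaluate Eq. (1.1): theoretical minimum switch loss in watts.

    ron_qgd is the product R_on,sp * Q_gd,sp in ohm*coulomb
    (the per-area normalizations cancel in the product).
    """
    return 2.0 * i_rms * math.sqrt(v_d * i_d * f_sw / i_g) * math.sqrt(ron_qgd)

def loss_reduction(fom_reduction):
    """Loss reduction factor obtained when R_on,sp * Q_gd,sp shrinks by fom_reduction."""
    return math.sqrt(fom_reduction)

sic_gain = loss_reduction(15.0)  # roughly 3.9, quoted as "about 4 times"
gan_gain = loss_reduction(64.0)  # 8
```

Because every other term in Eq. (1.1) is fixed by the operating point, the ratio of two losses at the same operating point depends only on the ratio of the two figure-of-merit products.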

Figure 1.1b also suggests that $R_{on,sp}Q_{gd,sp}$ scales quadratically with $V_B$. Since the device blocking voltage $V_D$ scales linearly with $V_B$ in practical designs, $P_{loss,min}$ is proportional to $V_B^{\frac{3}{2}}$. This implies that if one single high-voltage switch is replaced with $n$ series switches, each rated for $\frac{1}{n}$ of the voltage, the total power loss of the switches is expected to drop to $\frac{1}{\sqrt{n}}$ of its original value. Besides, smaller switches have smaller parasitics, allowing a higher switching frequency that can further reduce the passive component size as well as increase control bandwidth and power density. Thus, it is advantageous to replace one lumped, "large", high voltage switch with many distributed, "small", low voltage switches, and switch at high frequency to benefit from the reduced parasitics and the reduced overall effective resistance. This is one of the key motivations for granular power conversion.
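The series-switch argument can be checked with a short calculation: each of the $n$ devices blocks $V_B/n$ and therefore dissipates $(1/n)^{3/2}$ of the original loss, so the total is $n \cdot n^{-3/2} = 1/\sqrt{n}$. The sketch below assumes that all devices see the same current, gate drive, and switching frequency as the original switch; the function name is hypothetical.

```python
def series_switch_loss_ratio(n, exponent=1.5):
    """Total loss of n series switches rated V_B/n, relative to one switch rated V_B.

    Assumes P_loss,min is proportional to V_B**exponent for each device and
    that current, gate drive, and switching frequency are unchanged.
    """
    per_switch = (1.0 / n) ** exponent  # each device blocks V_B / n
    return n * per_switch               # n devices in series

ratio_4 = series_switch_loss_ratio(4)   # 0.5, i.e. 1/sqrt(4)
ratio_9 = series_switch_loss_ratio(9)   # about 0.333, i.e. 1/sqrt(9)
```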

2. Magnetic components, including inductors and transformers, can provide various functions in power converters, such as filtering, resonance, and voltage conversion. Traditionally, magnetic components with different functions are implemented individually as discrete components. These bulky and inefficient discrete magnetics, with their high inductive energy storage, limit the efficiency, power density, and control bandwidth that can be achieved. Coupling multiple inductors and transformers with a shared single core offers reduced magnetic size and loss, increased level of integration, higher converter efficiency, and higher power density [8–13].





Figure 1.2: (a) Example structure of a transformer. (b) Scaling trend of transformer VA power rating, volume, and onboard area versus linear dimension  $\lambda$ .

For instance, in a transformer (Fig. 1.2a), the maximum winding voltage depends on the flux density limit  $B_0$  in the cross-sectional area of the magnetic core  $(A_c)$ , and the maximum current rating is determined by the current density limit  $J_0$  in the window area  $(A_w)$  [8]. Thus, the transformer VA power rating will be proportional to the  $A_c$  and  $A_w$  product:

$$\text{VA Power Rating} = V \cdot I \propto (NfB_0A_c) \cdot \left(\frac{J_0A_w}{N}\right) = f \cdot B_0 \cdot J_0 \cdot (A_cA_w) \propto \lambda^4 \tag{1.2}$$

Here, $\lambda$ is a linear dimension factor. The transformer power rating scales up with $\lambda^4$, faster than its volume ($\lambda^3$) and onboard area ($\lambda^2$). This implies that merging many transformers into one can achieve higher power density as well as smaller total volume and onboard area. Besides, coupling many inductors into one can leverage multiphase interleaving to reduce winding current ripple, allowing the use of lower inductance with reduced core size and faster transient speed [14,15]. As a result, it is beneficial to combine many distributed, uncoupled, single-function magnetic components into one centralized, coupled, multi-function magnetic component to leverage the reduced size and enable additional circuit design opportunities.
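To illustrate the $\lambda^4$ scaling, the sketch below compares one merged transformer rated for $n$ times the power against $n$ separate unit transformers. It is an idealized estimate that assumes geometric similarity and unchanged $f$, $B_0$, and $J_0$; thermal limits, which may lower the allowable $B_0$ and $J_0$ in larger structures, are ignored. The function name is illustrative.

```python
def merged_transformer_scaling(n):
    """Size of one transformer rated n*P relative to n separate transformers rated P.

    Assumes the VA rating scales as lambda**4, volume as lambda**3, and
    onboard area as lambda**2, with f, B_0, and J_0 held constant.
    """
    lam = n ** 0.25               # linear dimension needed for n times the rating
    volume_ratio = lam ** 3 / n   # merged volume / total discrete volume
    area_ratio = lam ** 2 / n     # merged area / total discrete area
    return lam, volume_ratio, area_ratio

lam, volume_ratio, area_ratio = merged_transformer_scaling(16)
```

Under these assumptions, merging 16 unit transformers doubles the linear dimension while requiring half the total volume and a quarter of the total onboard area.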

3. Capacitors are another primary passive component in power converters. They can be used to filter ac ripple at the dc link, block dc bias voltage, or form a resonant tank with inductors. Recently, the growing power demand in space-limited applications (e.g., smartphones) has drawn increased attention to switched-capacitor (SC) converters [17–20]. These transformerless, capacitor-intensive topologies use capacitors to undertake the major voltage stress for large conversion ratios, greatly reducing the converter volume due to the superior energy density of capacitors.

Figure 1.3: (a) Peak energy density versus voltage rating of commercial Class-I and Class-II ceramic capacitors. (b) Comparison of total volume and energy storage between one large capacitor and three small capacitors. In (a), data are sourced from Murata Database [16]. Derated capacitance due to dc bias is considered. Peak energy storage density is calculated based on $\int_0^{V_R} vC(v)dv$. Detailed capacitor specifications of the data points are listed in Table A.1.

Figure 1.3a plots the peak energy storage density versus voltage rating of some example commercial ceramic capacitors [16]. The capacitor energy is stored in the electric field, whose peak energy density is determined only by $\varepsilon E_c^2$ ($\varepsilon$ and $E_c$ are the permittivity and the critical electric field of the dielectric). Therefore, as indicated by Fig. 1.3a, the energy density limit of the same material stays roughly constant across different voltage ratings. This means that, to store the same energy, one large high voltage capacitor will have a similar total volume to multiple small low voltage capacitors, as demonstrated in Fig. 1.3b. Hence, voltage ratings and dimensions have little impact on capacitor performance. "Large" and "small" capacitors are equally good.
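The voltage-independence of the energy density can be seen from an ideal parallel-plate model. The following sketch assumes a linear dielectric (unlike real Class-II ceramics, whose capacitance derates with dc bias) and sizes the dielectric thickness so that the field reaches $E_c$ at the rated voltage. The function name and the material values in the usage lines are illustrative assumptions, not data from this thesis.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def energy_density(eps_r, e_crit, v_rated, area):
    """Peak energy density (J/m^3) of an ideal parallel-plate capacitor.

    The dielectric thickness is chosen so that the electric field equals
    e_crit (V/m) at v_rated (V); area is the plate area in m^2.
    """
    thickness = v_rated / e_crit            # dielectric thickness, m
    cap = EPS0 * eps_r * area / thickness   # capacitance, F
    energy = 0.5 * cap * v_rated ** 2       # stored energy at rated voltage, J
    return energy / (area * thickness)      # equals 0.5 * eps * e_crit**2

# Hypothetical dielectric: relative permittivity 2000, critical field 20 MV/m.
density_10v = energy_density(2000.0, 2e7, 10.0, 1e-4)
density_100v = energy_density(2000.0, 2e7, 100.0, 1e-4)
```

In this model both ratings give the same density, $\frac{1}{2}\varepsilon E_c^2$, because a higher voltage rating only thickens the dielectric in proportion.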

#### 1.1.2 Distributed Switching Cells and Magnetics Integration

Consequently, conventional power architectures (Fig. 1.4a) using one single (or few) lumped switching cell(s) and multiple discrete magnetic components cannot make the most of the devices. In contrast, the granular power architecture (Fig. 1.4b), which employs multiple distributed switching cells with one single (or few) coupled magnetics, can leverage the device performance scaling and is a promising approach to achieving extreme performance power conversion. Coupling multiple discrete magnetics into one, using well-designed magnetic paths to replace complex circuit connections, is known as *magnetics integration*. Figure 1.5 shows a specific converter example of each of the two power architectures.

Figure 1.4: (a) Conventional power architecture with single (or few) lumped switching cell(s) and multiple discrete magnetics. (b) Granular power architecture with single (or few) coupled magnetics and multiple distributed switching cells.

Figure 1.5: (a) An example commercial converter with conventional power architecture [21]. (b) A multiport ac-coupled converter with granular power architecture [22].

The author envisions that next-generation power electronics will shift from simple circuit topologies and controls to more complex systems [23], from centralized bulky power supplies to distributed power conversion [24–26], and from simple discrete magnetic structures to sophisticated coupled magnetic structures. The advantages of the granular power architecture include, but are not limited to, the following:

- Better Device Performance: As mentioned in Section 1.1.1, the granular power architecture can reduce the power loss of semiconductor switches and decrease the size, loss, and inductive energy storage of magnetic components.
- Higher Switching Frequency: Compared to the lumped large switching cell, the distributed small switching cells can utilize active and passive components of smaller packages with lower parasitic inductance/capacitance, allowing the converter to switch at higher frequencies. Switching faster can further increase the control bandwidth and decrease the passive component size [27–29].
- Reduced Device Power Rating: As shown in Fig. 1.6, the granular power architecture enables reconfigurable multi-input-multi-output (MIMO) power conversion with reduced device power rating [30]. The MIMO power converter can reconfigure the input and the output voltage/current ratings by connecting multiple low-voltage/-current switching cells in series and parallel. To cover a wide operation range with a constant output power curve  $(P_o)$ , device ratings of the conventional power converter need to be higher than both the maximum voltage and the maximum current, resulting in a much higher total device power rating than  $P_o$ . In contrast, the MIMO power converter can cover the wide operation range by reconfiguring granular switching cells with a total device power rating close to  $P_o$ .
- Improved Scalability and Operation Flexibility: The granular power architecture comprising modular building blocks can be easily extended to meet the required voltage/current ratings for different applications [22, 31]. The distributed switching cells also offer more flexibility for various operation schemes that might enhance the converter functionality. For example, parallel-distributed switching cells can be operated in multiphase interleaving, as demonstrated in Fig. 1.7. The interleaved phase current ripples will be canceled at the output, leading to decreased ripple amplitude and increased equivalent switching frequency. These advantages can greatly reduce the filter size and EMI noise as well as improve the control bandwidth [32–34].

Figure 1.6: (a) Example circuit implementations and (b) corresponding device power ratings of a conventional power converter and a reconfigurable MIMO power converter.

- Enhanced Reliability: The granular power architecture using smaller device packages can improve the manufacturing and assembly reliability [32]. Besides, benefiting from the distributed power conversion, the reduced electrical and thermal stress can potentially improve the system reliability, although the component count is high [24].
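The device rating argument in the "Reduced Device Power Rating" item can be illustrated with a small arithmetic sketch. The power level, the voltage range, and the idealized cell model below are hypothetical values chosen only for illustration; they are not taken from Fig. 1.6, and the sketch assumes the voltage range is an integer multiple of the cell voltage.

```python
def conventional_rating(p_o, v_min, v_max):
    """Single-cell converter: devices must block v_max and carry I_max = p_o / v_min."""
    i_max = p_o / v_min
    return v_max * i_max


def reconfigurable_rating(p_o, v_min, v_max):
    """Idealized MIMO converter built from identical cells rated (v_min, p_o / v_max).

    All cells in parallel cover (v_min, p_o / v_min); all cells in series cover
    (v_max, p_o / v_max). Returns (number of cells, total device VA rating).
    """
    cells = round(v_max / v_min)          # assumes v_max is a multiple of v_min
    cell_v, cell_i = v_min, p_o / v_max
    return cells, cells * cell_v * cell_i


# Hypothetical operating range: 1 kW constant power, 50 V to 200 V.
lumped = conventional_rating(1000.0, 50.0, 200.0)
cells, granular = reconfigurable_rating(1000.0, 50.0, 200.0)
print(lumped, cells, granular)
```

Under these assumptions the lumped design needs a 4 kVA total device rating, while four 50 V/5 A cells cover the same range with a total rating equal to the 1 kW output power.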



Figure 1.7: Multiphase interleaving operation of parallel-distributed switching cells.

Since the performance improvement comes from component scaling, the advantages of granular power conversion are fundamental and hold across the full range of power levels in power electronics applications, from power management integrated circuits to grid-scale power electronics.

#### 1.2 Contributions and Thesis Organization

Fully achieving these advantages of the granular power architecture needs appropriate design, which requires a good understanding of both the effective approaches to magnetics integration and the fundamental limits of different converter topologies. The author envisions a promising trend to minimize the power conversion stress and maximize the passive component utilization by architecture and magnetics co-design. In pursuit of this vision, the thesis is developed from three major aspects.

Chapter 2 answers the question "how to design all-in-one magnetics for granular power conversion". Contributions of this chapter include:

- A matrix coupled all-in-one magnetic structure combining both series and parallel couplings is developed for pulse-width-modulated (PWM) power conversion.

- The mechanism of current ripple reduction and current ripple steering among the matrix coupled windings is systematically analyzed.
- The benefits of matrix coupling in size reduction, ripple compression, and transient acceleration are quantified, indicating that a higher number of phases or a stronger coupling coefficient yields more advantages.
- A single-ended primary-inductor converter (SEPIC) with planar matrix coupled magnetics is built and tested. The matrix coupled SEPIC prototype can support load current up to 185 A at 5 V-to-1 V voltage conversion with a maximum power density over 470 W/in<sup>3</sup>. This is the first demonstration of a multiphase interleaved SEPIC converter with all-in-one magnetics featuring series and parallel coupling.
- Compared to commercial discrete inductors, the designed matrix coupled inductor has a 5.6 times smaller size and 8.5 times faster transient speed with similar current ripple and current rating.

Chapter 3 studies the granular power conversion for miniaturized point-of-load voltage regulators with extreme performance. Contributions are summarized as:

- A multistack switched-capacitor point-of-load (MSC-PoL) architecture is developed. The MSC-PoL architecture comprises many switched-capacitor cells connected with switched-inductor cells for soft charging and voltage regulation. Parallel coupled inductors with interleaving operation are utilized to reduce current ripple and boost transient speed. Mutual balancing between capacitor voltages and inductor currents can be achieved during the soft charging process.
- The MSC-PoL architecture is a hybrid switched-capacitor-magnetics system containing many L-C resonant poles that may influence stability and transient. A systematic analysis of this intrinsic L-C resonant behavior is performed.

- The impacts of coupled inductors on the resonant amplitude, frequency, and settling time during a line transient are analyzed, and the influence of intrinsic resonance on control stability is clarified, offering guidance for controller design.
- A 48-to-1-V/450-A prototype containing two MSC-PoL modules with ladder-structured coupled inductors is built. Benefiting from the 3D stacked inductor-driver packaging, one MSC-PoL module encloses all circuits and components into a <sup>1</sup>/<sub>16</sub>-brick/0.31-in<sup>3</sup>/6-mm-thick space, achieving 724 W/in<sup>3</sup> power density.
- When including the gate loss, the MSC-PoL prototype can achieve a peak efficiency of 91.7% (@170 A) and a full-load efficiency of 85.8% (@450 A). It can be further embedded into the CPU socket for power-supply-in-package (PwrSiP) voltage regulation with extreme efficiency, density, and bandwidth.

Chapter 4 investigates the granular power conversion for large-scale modular energy systems, featuring hardware-software-power co-design. Contributions are:

- A multiport ac-coupled differential power processing (MAC-DPP) architecture is presented, which couples all granular switching ports through a series coupled multi-winding transformer. The MAC-DPP architecture with high modularity and scalability has a reduced component count, smaller magnetic volume, and fewer differential power conversion stages compared to other DPP solutions.
- A stochastic analytical framework is developed to estimate DPP power loss under probabilistic load distributions. Scaling factors are introduced to describe DPP performance scaling limits. The theoretical analysis is verified by SPICE simulations, indicating the MAC-DPP is superior to other DPP solutions.
- Two control strategies, feedback and feedforward controls, are proposed to regulate the MIMO power flows and port voltages. A small signal modeling approach for large-scale MAC-DPP systems is derived to guide the parameter designs of the feedback control. Besides, a customized Newton-Raphson solver is designed to identify the cross-coupled control variables for the feedforward control.

- To regulate the series-stacked string voltage, a series voltage compensator (SVC) that leverages the partial power processing is presented. Compared to a standalone dc-dc regulator, the SVC only processes a small fraction of the total load power, significantly reducing the power conversion stress and power loss.
- To validate the MAC-DPP architecture and theoretical analysis, a 10-port 450 W MAC-DPP prototype with 700 W/in<sup>3</sup> power density is built. The prototype is tested on: (1) a 50-HDD storage server, realizing 99.77% system efficiency and the hot-swapping capability; and (2) a 30×20 LED screen to verify the stochastic loss model. A buck SVC prototype is designed to regulate the DPP string voltage from a 50~65 V dc bus to a precise 50 V, achieving 98.8% peak system efficiency. Exploring the co-design of software, hardware, and power architecture offers valuable insights for designing next-generation power architectures in data centers.

Finally, Chapter 5 concludes this thesis and presents multiple potential research avenues arising from its development. The proposed granular power architecture opens up the possibility of the future general-purpose power processor that can interface with multiple power sources and loads, featuring a reconfigurable number of ports and a reconfigurable current/voltage rating for each port.

# Chapter 2

# A Systematic Approach to All-in-One Magnetics Integration

#### 2.1 Background and Motivation

Magnetics integration techniques are widely adopted in various applications with different types of implementations. Figure 2.1 demonstrates several example implementations of magnetics integration, which can be generally classified into three levels: the functional level, the package level, and the integrated circuit (IC) level. For the functional level integration, multiple magnetics performing various functions are coupled by magnetic flux, yielding mutual benefits among functionally-correlated magnetic components. The package level integration focuses on merging many magnetics into one package within the converter on the PCB. As for the IC level integration, magnetic components are fabricated on die or together with the integrated circuits. The magnetics integration discussed in this thesis refers to the functional and the package level integrations. However, the developed design methodology can be extended to the IC level, where the space is severely limited for magnetic components.

This chapter presents a systematic all-in-one magnetics integration approach to merging multiple magnetic components into one with matrix coupling. A comprehensive analysis of the current ripple reduction mechanism is performed, revealing the fundamental benefits of matrix coupling. The transient performance of matrix coupled inductors is demystified, providing guidance on large- and small-signal modeling. A figure of merit based on current ripple and transient performance is defined to quantify the benefits obtained from matrix coupling. To validate the matrix coupled magnetic structure and theoretical analysis, a four-phase matrix coupled synchronous SEPIC converter with planar PCB integrated magnetics is designed and tested. The prototype measures $0.392 \text{ in}^3$ in volume and is capable of flexibly delivering power from $1\sim5$ V input to $1\sim5$ V output. At 5 V-to-1 V voltage conversion, the prototype can support load current up to 185 A with power density over 470 W/in<sup>3</sup>. Compared to discrete commercial inductors of similar current ratings and ripples, the matrix coupled inductor reduces the magnetic component size by over 5.6 times and increases the transient speed by over 8.5 times.

Figure 2.1: Magnetics integration on the functional and the package levels: (a) WE, dual mode inductors [35]; (b) UCC, split-winding integrated magnetics [36]; (c) EPC, planar matrix transformer [37]; and (d) ViTEC, coupled inductor [38], as well as on the IC level: (e) Intel, FIVR [39]; (f) Apple M1 Pro, integrated coupled inductors [40].

In the remainder of this chapter, Section 2.2 introduces the all-in-one matrix coupled magnetics integration approach and demonstrates several example PWM converter topologies that may apply matrix coupled magnetics. Section 2.3 performs the systematic analysis of current ripple reduction and steering mechanism for matrix coupled magnetics. Section 2.4 reveals the transient performance, quantifies the matrix coupling benefits, and provides insights for developing large- and small-signal modeling. Section 2.5 demonstrates the hardware design of a four-phase matrix coupled synchronous SEPIC prototype. Experimental results are summarized in Section 2.6. Finally, Section 2.7 concludes this chapter.

#### 2.2 Matrix Coupling Structure and Example Topologies

There are two fundamental ways of coupling magnetic components: series coupling and parallel coupling [41], each offering distinct advantages in reducing current ripple and magnetic size for power converters. In the series coupled magnetic structure, windings with in-phase voltages are coupled by a serial flux linkage, as shown in Fig. 2.2a. Examples include multi-winding transformers and series coupled inductors. For the multi-winding transformer, the cross-sectional area of the magnetic core does not scale up with the winding count, but depends on the maximum volt-second-per-turn of all the windings [22, 42–45]. The total VA power rating scales faster than the transformer volume [8]. Thus, integrating many transformers into a single multi-winding transformer with the same total power rating can reduce the overall transformer size. For the series coupled inductor, winding current ripple can be reduced given the same self inductance. It has been successfully implemented in many topologies such as Ćuk, SEPIC, and tapped-inductor buck or boost converters, and has been proven to offer improved efficiency, reduced converter size, and more benign control characteristics [46–50]. Moreover, by adjusting the leakage inductance and turns ratio, current ripples can be steered among the series coupled windings [46,47].



Figure 2.2: Coupled magnetic structures: (a) series coupled; (b) parallel coupled; (c) matrix coupled.

With appropriate configuration, current ripples on specific windings can be significantly decreased, even to zero, which is beneficial to ripple-sensitive applications like microprocessor power supplies [51].

Figure 2.2b plots the parallel coupled magnetic structure, where windings on multiple core legs are coupled by parallel flux linkages. Parallel coupled inductors are widely used in interleaved multiphase topologies [14,52–59]. In these converters, interleaved winding voltages lead to ripple cancellation between winding currents. The resulting reduced current ripples in all circuit components (switches, inductors, capacitors, etc.) and printed circuit board (PCB) traces decrease power losses and extend the operation range of continuous-conduction mode (CCM). In parallel coupling with equally-shared phase currents, magneto-motive forces (MMF) from parallel coupled windings cancel each other. Almost the entire dc inductive energy is stored in leakage fluxes. Due to current ripple reduction, small leakage inductance is allowed, which can reduce total energy storage and boost transient performance. Since the majority of dc fluxes are leakage fluxes that flow through high reluctance paths, current saturation ratings are greatly improved.

Combining both series coupling and parallel coupling, this chapter develops a systematic all-in-one magnetics integration approach, namely matrix coupling, as shown in Fig. 2.2c. Motivations for merging multiple discrete magnetics into one originate from their voltage relationships: magnetics with in-phase voltages can be coupled in series on the same core leg, while magnetics with interleaved voltages can be coupled on parallel core legs. Benefiting from both series and parallel couplings as well as interleaving, matrix coupled magnetics can achieve miniaturized magnetic size and inductive energy storage, reduced current ripple and power loss, and improved transient response. Similar coupled magnetics applied in current doubler rectifier [60,61], dual flyback [62], multiphase LLC [63], and cross commutated buck converters [64] are a subset of the generalized matrix coupled magnetic structure presented here.

Matrix coupled magnetics are generally applicable to power converters that have both in-phase and interleaved voltage relationships between magnetic components. In-phase voltages motivate series coupling to reduce cross-sectional core area, and interleaved voltage excitations motivate parallel coupling to reduce ac current ripples. Typical examples are multiphase, multi-order PWM converters [65], as shown in Fig. 2.3. Figures 2.3a-2.3c plot a series of multiphase buck-boost topologies (SEPIC, ZETA, and Ćuk) with matrix coupled inductors. Each topology contains two inductors and one capacitor per phase. Given that the capacitor maintains stable dc voltage, two inductors of each phase have identical square wave voltages and thus are coupled in series on the same core leg. Many phases on different core legs are coupled in parallel and are operated in interleaving. Figures 2.3d-2.3e plot a few multiphase tapped-inductor topologies, where two windings of each tapped inductor are originally series coupled. The turns ratio can be adjusted to enlarge voltage conversion ratio, but in-phase square wave voltages are always applied to the series coupled windings [66]. Through parallel coupling and multiphase interleaving, ac current ripples on the tapped inductors are decreased, enabling the use of a smaller magnetic core with faster transient speed. Figure 2.3f shows an isolated matrix coupled converter based on flyback topology. In the flyback converter, transformer windings are coupled in series. While providing the functions of galvanic isolation and voltage conversion, the transformer also needs to store energy due to dc magnetizing current. Applying matrix coupling and interleaving operation to the multiphase flyback converter can help reduce the required energy storage in the transformer.

Figure 2.3: Example PWM topologies that may apply matrix coupled magnetics: (a) multiphase SEPIC; (b) multiphase ZETA; (c) multiphase Ćuk; (d) multiphase tapped-inductor buck; (e) multiphase tapped-inductor boost; (f) multiphase flyback.

Matrix coupled converters in Fig. 2.3 are constructed based on identical switching converter cells connected in parallel. One can also combine multiple switching cells of different types into a composite converter [67] or sigma converter [68] to leverage the resulting mutual advantages.

#### 2.3 Mechanism of Current Ripple Reduction and Steering

This section systematically analyzes the mechanism of current ripple reduction and ripple steering in matrix coupled magnetics operated under PWM voltage excitations. Fundamental ripple reduction benefits from both parallel and series couplings are revealed. The analysis is first performed based on symmetric matrix coupling structures, where series coupled windings of each phase have the same voltages, winding turns, and leakage reluctance, and winding configurations across parallel coupled phases are identical. Matrix coupling structures are then generalized as asymmetric series coupling plus symmetric parallel coupling, in which the above-mentioned quantities may vary across series coupled windings, but parallel phases are still identical. In this case, current ripples can be steered among the series coupled windings. Although asymmetric parallel coupling is not elaborated in this chapter, it shares the same ripple reduction mechanism and benefits as the symmetric cases. The conditions discussed herein already cover most of the matrix coupling applications, especially the multiphase topologies. The following current ripple analysis is based on CCM operation, but the analysis in discontinuous-conduction mode (DCM) can follow the same procedures presented below.

#### 2.3.1 Phase Current Ripple Reduction

Assume the matrix coupled magnetic component in Fig. 2.2c contains $M$ parallel core legs, and on each leg are wound $N$ series coupled windings. Figure 2.4a plots its equivalent magnetic circuit model [69]. Each core leg is modeled as a circuit branch with a leg reluctance $\mathcal{R}_L$ and $N$ MMF sources. Given that parallel coupled phases are symmetric, the leakage fluxes between phases can be modeled as a central branch with an equivalent reluctance $\mathcal{R}_C$. Applying topological duality to the magnetic circuit model leads to the inductance dual model [70–73], as shown in Fig. 2.4b. In the inductance dual model, effective resistance can be further added in parallel with $1/\mathcal{R}_C$ and $1/\mathcal{R}_L$ to capture the core loss of each portion of the magnetic core, or in series at each port to capture the winding conduction loss. In a well-designed magnetic component, these effective resistors are usually small enough and have negligible impacts on the current ripple. Thus, they are ignored in the analysis below.

Figure 2.4: Equivalent magnetic models for the matrix coupled magnetics: (a) magnetic circuit model; (b) inductance dual model.

Figure 2.5: (a) Interleaved winding voltages for parallel phases. (b) Modified inductance dual model where windings of each phase are combined as one port delivering the summed currents into the inductance network.

Figure 2.6: Detailed superposition procedures for phase current ripple analysis.

As a start, symmetric series coupling with the same winding voltages and number of turns (denoted as $n$) is considered. Cases with unmatched voltages or number of turns can be converted back to Fig. 2.4b as long as the series coupled windings have the same voltage-per-turn. More general cases allow unmatched voltage-per-turn for series coupling and are discussed later in Section 2.3.3. In the case of symmetric coupling, series coupled windings driven by identical voltages can be combined as one port delivering the summed winding currents into the inductance network, as shown in Fig. 2.5b. Summed current ripple of each phase is related to all the voltage excitations $v_1, ..., v_M$, which become phase-shifted square wave voltages (in Fig. 2.5a) under multiphase interleaved operation. We analytically derive the current ripple based on superposition. Figure 2.6 illustrates the key steps of the superposition analysis. Denote the summed winding current ripple in the $k^{th}$ phase as $\Delta i_k$. Interleaved voltage excitations $v_1, ..., v_M$ have identical voltage patterns so that their positive volt-second integrals of one cycle are the same (denoted as $\sigma$). $M$ superposition subcircuits are created. In each subcircuit, the square wave voltage results in triangular ac currents in the two parallel inductors, and their peak-to-peak ripple values are:

$$(\Delta i_{Lk})_{pp} = \frac{\sigma \mathcal{R}_L}{n}, \qquad (\Delta i_{Ck})_{pp} = \frac{\sigma \mathcal{R}_C}{n}. \tag{2.1}$$

In the inductance dual model, each branch inductor  $(1/\mathcal{R}_L)$  is only excited by its parallel voltage source, while the shared inductor  $(1/\mathcal{R}_C)$  is excited by all voltage sources. The overall shared inductor current ripple  $(\Delta i_C)$  is the summation of the interleaved triangular current ripples  $(\Delta i_{Ck})$  in all subcircuits, so its peak-to-peak current ripple is:

$$(\Delta i_C)_{pp} = \left(\sum_{k=1}^M \Delta i_{Ck}\right)_{pp} = \frac{\Gamma M \sigma \mathcal{R}_C}{n}, \tag{2.2}$$

where  $\Gamma$  is the ripple cancellation ratio of the summed triangular currents due to interleaving [41]:

$$\Gamma = \frac{(k+1-DM)(DM-k)}{(1-D)DM^2}, \text{ for } \frac{k}{M} \le D < \frac{k+1}{M}. \tag{2.3}$$
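The closed form in Eq. (2.3) can be cross-checked numerically by summing $M$ phase-shifted triangular ripples and measuring the peak-to-peak value of the sum. The following is a minimal sketch under stated assumptions: the function names, the unit-amplitude triangle, and the sample count are illustrative choices, not part of the thesis.

```python
import math


def ripple_cancellation_ratio(D, M):
    """Gamma of Eq. (2.3) for duty ratio 0 < D < 1 and M interleaved phases."""
    k = min(math.floor(D * M), M - 1)      # integer k with k/M <= D < (k+1)/M
    return (k + 1 - D * M) * (D * M - k) / ((1 - D) * D * M ** 2)


def triangle(t, D):
    """Unit peak-to-peak triangular ripple that rises for a fraction D of the period."""
    t %= 1.0
    return t / D - 0.5 if t < D else 0.5 - (t - D) / (1 - D)


def summed_ripple_ratio(D, M, samples=20000):
    """Peak-to-peak value of M interleaved unit triangles, normalized by M."""
    total = [sum(triangle(i / samples - j / M, D) for j in range(M))
             for i in range(samples)]
    return (max(total) - min(total)) / M


for D, M in [(0.25, 2), (0.2, 4), (0.6, 3)]:
    print(D, M, ripple_cancellation_ratio(D, M), summed_ripple_ratio(D, M))
```

The normalization by $M$ follows Eq. (2.2), where the summed ripple equals $\Gamma M$ times the single-subcircuit ripple. The two columns agree up to the sampling resolution, and $\Gamma$ drops to zero whenever $D$ is a multiple of $1/M$.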

Since  $\Delta i_C$  and  $\Delta i_{Lk}$  are synchronized to the PWM switching clock, summing their peak-to-peak values results in the peak-to-peak ripple of  $\Delta i_k$  for the interleaved operation:

$$(\Delta i_k)_{pp}^{interleaved} = \frac{1}{n} \left( (\Delta i_C)_{pp} + (\Delta i_{Lk})_{pp} \right) = \frac{\sigma(\Gamma M \mathcal{R}_C + \mathcal{R}_L)}{n^2}. \tag{2.4}$$

If the M phases are not interleaved and driven by synchronized voltage excitations, the inductance dual model can be equivalent to multiple standalone ones as shown in Fig. 2.7. Then the peak-to-peak ripple of  $\Delta i_k$  becomes

$$(\Delta i_k)_{pp}^{non-interleaved} = \frac{\sigma(M\mathcal{R}_C + \mathcal{R}_L)}{n^2}. \tag{2.5}$$



Figure 2.7: Equivalent inductance dual model per phase for non-interleaved voltages.

Define the parallel coupling coefficient as  $\beta = M\mathcal{R}_C/\mathcal{R}_L$ . A higher  $\beta$  indicates a stronger parallel coupling. Define the ratio between the peak-to-peak values of  $\Delta i_k$  with interleaved and non-interleaved operations as  $\gamma$ :

$$\gamma \stackrel{\text{def}}{=} \frac{(\Delta i_k)_{pp}^{interleaved}}{(\Delta i_k)_{pp}^{non-interleaved}} = \frac{(\Gamma M \mathcal{R}_C + \mathcal{R}_L)}{(M \mathcal{R}_C + \mathcal{R}_L)} = \frac{1 + \beta \Gamma}{1 + \beta}. \tag{2.6}$$

The analysis above reveals that the benefits of parallel coupling fundamentally come from multiphase interleaving. A similar conclusion is drawn from a different perspective in [41]. If parallel-coupled phases are operated in a non-interleaved manner, they are equivalent to multiple discrete inductors of $\frac{n^2}{M\mathcal{R}_C+\mathcal{R}_L}$, as indicated by Fig. 2.7. In this case, phase current ripples, power losses, magnetic energy storage, and transient speed are all the same as the discrete ones. Only when the parallel phases are both coupled and interleaved is the phase current ripple reduced by a factor of $\gamma$, which is determined by $\beta$ and $\Gamma$. As implied by Eq. (2.6), with a strong parallel coupling coefficient (i.e., $\beta \to \infty$), $\gamma$ approaches $\Gamma$. If the coupling is weak (i.e., $\beta \to 0$), $\gamma$ approaches one and no ripple reduction is achieved.
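Equation (2.6) can also be checked against a direct time-domain evaluation of the magnetic circuit. The sketch below assumes that each leg flux ripple is a triangle with normalized amplitude ($\sigma/n = 1$) and that the central branch carries the sum of all leg fluxes, so the MMF on one leg is $\mathcal{R}_L\phi_k + \mathcal{R}_C\sum_j\phi_j$. The function names and numeric values are hypothetical.

```python
import math


def gamma_closed_form(D, M, beta):
    """Ripple reduction ratio of Eq. (2.6), with Gamma taken from Eq. (2.3)."""
    k = min(math.floor(D * M), M - 1)
    Gamma = (k + 1 - D * M) * (D * M - k) / ((1 - D) * D * M ** 2)
    return (1 + beta * Gamma) / (1 + beta)


def triangle(t, D):
    """Unit peak-to-peak triangular flux ripple rising for a fraction D of the period."""
    t %= 1.0
    return t / D - 0.5 if t < D else 0.5 - (t - D) / (1 - D)


def gamma_time_domain(D, M, R_L, R_C, samples=20000):
    """Interleaved over synchronized peak-to-peak MMF on leg 0 (assumed circuit model)."""
    def mmf_pp(shifts):
        mmf = []
        for i in range(samples):
            phi = [triangle(i / samples - s, D) for s in shifts]
            mmf.append(R_L * phi[0] + R_C * sum(phi))
        return max(mmf) - min(mmf)

    interleaved = mmf_pp([j / M for j in range(M)])
    synchronized = mmf_pp([0.0] * M)
    return interleaved / synchronized


# Hypothetical example: M = 4, D = 0.2, R_C = 2 * R_L, so beta = M * R_C / R_L = 8.
print(gamma_closed_form(0.2, 4, 8.0), gamma_time_domain(0.2, 4, 1.0, 2.0))
```

Setting `R_C` close to zero in this sketch drives the ratio toward one, while a large `R_C` drives it toward $\Gamma$, matching the two limits stated above.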

#### 2.3.2 Winding Current Ripple Reduction

Equation (2.6) reveals the benefits of parallel coupling assuming the series coupled windings are perfectly coupled. To capture the impact of non-ideal series coupling, the leakage flux between windings on the same core leg is modeled as a leakage reluctance ($\mathcal{R}_K$) in parallel with each MMF source, as shown in Fig. 2.8a. In the inductance dual model of Fig. 2.8b, the parallel leakage path is then converted to a series leakage inductance ($1/\mathcal{R}_K$) at each port. Define the series coupling coefficient as $\alpha = N\mathcal{R}_K/\mathcal{R}_L$. We analyze the current ripple under the impacts of series coupling by mapping its inductance dual model (in Fig. 2.8b) back to the one without the leakage inductance $1/\mathcal{R}_K$ (in Fig. 2.4b). The mapping process is illustrated in Fig. 2.9 with parameter mapping described as:

$$\text{(Fig. 2.8b) } \begin{bmatrix} \mathcal{R}_L \\ \mathcal{R}_C \\ \mathcal{R}_K \\ \alpha \\ \beta \end{bmatrix} \xrightarrow{\text{Parameter Mapping}} \begin{bmatrix} \mathcal{R}'_L \\ \mathcal{R}'_C \\ \beta' \end{bmatrix} \text{ (Fig. 2.4b)}. \tag{2.7}$$

Since the two inductance dual models in Fig. 2.9 are equivalent and should have the same impedance matrix, the parameter mapping relationship can be obtained as

$$\begin{cases}
\mathcal{R}'_{L} = \mathcal{R}_{L} || N \mathcal{R}_{K}, \\
M \mathcal{R}'_{C} = (\mathcal{R}_{L} + M \mathcal{R}_{C}) || N \mathcal{R}_{K} - \mathcal{R}_{L} || N \mathcal{R}_{K}, \\
K_{\alpha\beta} \stackrel{\text{def}}{=} \beta' = \frac{M \mathcal{R}'_{C}}{\mathcal{R}'_{L}} = \frac{\alpha\beta}{1 + \alpha + \beta}.
\end{cases} \tag{2.8}$$

 $K_{\alpha\beta}$ , determined by  $\alpha$  and  $\beta$ , is the matrix coupling coefficient, which describes the overall coupling strength among all the windings in the matrix coupled magnetics. After model mapping, phase current ripple and its reduction ratio when considering the series coupling coefficient can be analyzed in the same way as in Section 2.3.1.
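The closed form of $K_{\alpha\beta}$ follows from substituting $N\mathcal{R}_K = \alpha\mathcal{R}_L$ and $M\mathcal{R}_C = \beta\mathcal{R}_L$ into the mapped reluctances. The intermediate steps below are a worked sketch reconstructed from the definitions of $\alpha$ and $\beta$; they are not quoted from the original derivation.

$$\begin{aligned}
\mathcal{R}'_L &= \mathcal{R}_L || N\mathcal{R}_K = \frac{\alpha}{1+\alpha}\,\mathcal{R}_L, \\
\mathcal{R}'_L + M\mathcal{R}'_C &= (\mathcal{R}_L + M\mathcal{R}_C) || N\mathcal{R}_K = \frac{\alpha(1+\beta)}{1+\alpha+\beta}\,\mathcal{R}_L, \\
M\mathcal{R}'_C &= \left[\frac{\alpha(1+\beta)}{1+\alpha+\beta} - \frac{\alpha}{1+\alpha}\right]\mathcal{R}_L = \frac{\alpha^2\beta}{(1+\alpha)(1+\alpha+\beta)}\,\mathcal{R}_L, \\
K_{\alpha\beta} &= \frac{M\mathcal{R}'_C}{\mathcal{R}'_L} = \frac{\alpha\beta}{1+\alpha+\beta}.
\end{aligned}$$

As a sanity check, perfect series coupling ($\alpha \to \infty$) gives $K_{\alpha\beta} \to \beta$, which recovers the result of Section 2.3.1, and $K_{\alpha\beta}$ is always smaller than both $\alpha$ and $\beta$.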

Figure 2.8: Magnetic models including leakage inductance between series coupled windings on each core leg: (a) magnetic circuit model; (b) inductance dual model.

Figure 2.9: Mapping the inductance dual model back to the one without $\mathcal{R}_K$.

| Parameter                                 | Expression                                                                                  |
|-------------------------------------------|---------------------------------------------------------------------------------------------|
| Series Coupling Coefficient               | $\alpha = \frac{N\mathcal{R}_K}{\mathcal{R}_L}$                                             |
| Parallel Coupling Coefficient             | $\beta = \frac{M\mathcal{R}_C}{\mathcal{R}_L}$                                              |
| Matrix Coupling Coefficient               | $K_{\alpha\beta} = \frac{\alpha\beta}{1+\alpha+\beta}$                                      |
| Current Ripple Reduction Ratio            | $\gamma = \frac{1 + K_{\alpha\beta} \Gamma}{1 + K_{\alpha\beta}}$                           |
| $(\Delta i_k)_{pp}^{non-interleaved}$     | $\frac{\sigma\left((M\mathcal{R}_C + \mathcal{R}_L) \parallel N\mathcal{R}_K\right)}{n^2}$ |
| $(\Delta i_k)_{pp}^{interleaved}$         | $\gamma \times (\Delta i_k)_{pp}^{non-interleaved}$                                         |
| $(\Delta i_{winding})_{pp}^{interleaved}$ | $\frac{1}{N} \left( \Delta i_k \right)_{pp}^{interleaved}$                                  |

Table 2.1: Key Parameters of Current Ripple Analysis for Symmetric Matrix Coupling

Table 2.1 lists the summarized parameters of the current ripple analysis for symmetric matrix coupling. Here, impacts of both the series and parallel coupling coefficients are included. As indicated by Table 2.1, compared to the non-interleaved operation, multiphase interleaving can reduce the peak-to-peak phase current ripple by a factor of $\gamma$. The phase current ripple $((\Delta i_k)_{pp}^{interleaved})$ is the summation of winding current ripples in each phase and will be equally split across the series coupled windings if they have the same leakage inductance $(1/\mathcal{R}_K)$. In this case, $\gamma$ is also the current ripple reduction ratio for each winding current:

$$\frac{(\Delta i_{winding})_{pp}^{interleaved}}{(\Delta i_{winding})_{pp}^{non-interleaved}} = \frac{(\Delta i_k)_{pp}^{interleaved}}{(\Delta i_k)_{pp}^{non-interleaved}} = \gamma. \tag{2.9}$$

A higher  $\alpha$  or  $\beta$  results in a higher  $K_{\alpha\beta}$  so that  $\gamma$  will be lower. Therefore, both strong series coupling and parallel coupling are preferred in order to achieve lower current ripples and larger benefits for matrix coupled magnetics.
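The quantities of Table 2.1 can be evaluated together for a given design point. The following sketch assumes reluctances expressed in 1/H and $\sigma$ in V·s; the function name and all numeric values are hypothetical and do not correspond to the prototype.

```python
import math


def parallel(a, b):
    """Parallel combination a || b, as used for the reluctances in Eq. (2.8)."""
    return a * b / (a + b)


def symmetric_matrix_coupling(R_L, R_C, R_K, M, N, n, sigma, D):
    """Evaluate the rows of Table 2.1 for one operating point (ripples in A, pk-pk)."""
    k = min(math.floor(D * M), M - 1)
    Gamma = (k + 1 - D * M) * (D * M - k) / ((1 - D) * D * M ** 2)   # Eq. (2.3)
    alpha = N * R_K / R_L
    beta = M * R_C / R_L
    K = alpha * beta / (1 + alpha + beta)
    gamma = (1 + K * Gamma) / (1 + K)
    phase_non = sigma * parallel(M * R_C + R_L, N * R_K) / n ** 2
    phase_int = gamma * phase_non
    return {
        "alpha": alpha, "beta": beta, "K": K, "gamma": gamma,
        "phase_pp_non_interleaved": phase_non,
        "phase_pp_interleaved": phase_int,
        "winding_pp_interleaved": phase_int / N,
    }


# Two hypothetical designs that differ only in the series leakage reluctance R_K.
base = symmetric_matrix_coupling(1e6, 2e6, 5e6, M=4, N=2, n=1, sigma=1e-6, D=0.2)
tighter = symmetric_matrix_coupling(1e6, 2e6, 50e6, M=4, N=2, n=1, sigma=1e-6, D=0.2)
print(base["K"], base["gamma"])
print(tighter["K"], tighter["gamma"])
```

In this example, raising $\mathcal{R}_K$ tenfold increases $\alpha$ and $K_{\alpha\beta}$ and lowers $\gamma$, consistent with the trend described above.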

#### 2.3.3 Winding Current Ripple Steering

The above analysis is based on symmetric series coupling when all windings are continuously conducting. In some applications, however, series coupled windings can have different turns ratios and voltages (e.g., tapped-inductor topologies) or intermittent winding currents caused by switching behaviors (e.g., flyback topology). If certain windings are disconnected due to switching, current ripples will be distributed among the other conducting windings, but the summed current ripple of each phase still follows the analysis above as long as the voltage relationships still hold.



Figure 2.10: Generalized series coupled winding configuration in the  $k^{th}$  phase and its Thevenin-equivalent network.

Therefore, without loss of generality, we assume all the windings are always conducting. Figure 2.10 plots a generalized inductance dual model referring to the series coupled windings in the $k^{th}$ phase. Winding voltage $v_{kj}$, turns ratio $n_j$, and leakage inductance $1/\mathcal{R}_{K_j}$ are all independent. As shown in the figure, the multi-source inductance network can be converted to a single-source Thevenin-equivalent network with equivalent quantities as:

$$\mathcal{R}_{eq} = \sum_{j=1}^{N} \mathcal{R}_{K_j}, \quad v_{eq} = \frac{n_{eq}}{\mathcal{R}_{eq}} \sum_{j=1}^{N} \frac{\mathcal{R}_{K_j} v_j}{n_j}. \tag{2.10}$$

$n_{eq}$ is a reference turns ratio and can be any one of $n_1 \sim n_N$. Then the Thevenin-equivalent network can be substituted into the model in Fig. 2.9, following the same procedures above to get the phase current ripple $\Delta i_k$.
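Equation (2.10) amounts to a reluctance-weighted average of the per-turn winding voltages, referred to the reference turns $n_{eq}$. A minimal sketch is given below; the function name, the default choice $n_{eq} = n_1$, and the numeric values are assumptions made for illustration.

```python
def thevenin_equivalent(R_K, v, n, n_eq=None):
    """Equivalent leakage reluctance and source voltage of Eq. (2.10).

    R_K, v, n: per-winding leakage reluctances, voltages, and turns of one phase.
    n_eq: reference turns; defaults to the first winding (an arbitrary choice).
    """
    if n_eq is None:
        n_eq = n[0]
    R_eq = sum(R_K)
    v_eq = n_eq / R_eq * sum(r * vj / nj for r, vj, nj in zip(R_K, v, n))
    return R_eq, v_eq


# Symmetric case: N identical windings give R_eq = N * R_K and v_eq = v.
print(thevenin_equivalent([5e6, 5e6], [5.0, 5.0], [1, 1]))

# Unequal turns but identical voltage-per-turn (3 V/turn): v_eq = n_eq * 3 V.
print(thevenin_equivalent([1e6, 3e6], [3.0, 6.0], [1, 2]))
```

In the symmetric case the result reduces to $\mathcal{R}_{eq} = N\mathcal{R}_K$, which is the term that appears in Eq. (2.8).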

Figure 2.11: Current ripple steering among the series coupled windings in the $k^{th}$ phase. If the windings have identical voltage-per-turn, the phase current ripple $\Delta i_k$ will be linearly divided into each winding by a steering coefficient $s_j$.

For general asymmetric series coupling, how phase current ripple $\Delta i_k$ is distributed into series coupled windings depends on winding voltage, turns ratio, and leakage inductance. There is no general solution to the winding current ripple. Instead, it needs to be analyzed case by case. However, if all series coupled windings share the same voltage-per-turn, $\Delta i_k$ will be linearly split into each winding, proportionally to its leakage reluctance $\mathcal{R}_{K_j}$, as shown in Fig. 2.11. As explored in [46], there are opportunities to steer the current ripple among series coupled windings by adjusting the leakage reluctance. The steering coefficient and current ripple for the $j^{th}$ winding in the $k^{th}$ phase are:

$$s_j = \frac{\mathcal{R}_{K_j}}{\sum_{i=1}^N \mathcal{R}_{K_i}}, \quad \Delta i_{kj} = s_j \times \Delta i_k. \tag{2.11}$$

Equation (2.11) indicates that, with appropriate adjustment of leakage reluctance, switching current ripples on specific windings can be reduced to nearly zero. Rippleless winding currents can be used at important outputs to supply ripple-sensitive applications or to reduce the filtering capacitor size.
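The steering rule of Eq. (2.11) can be sketched in a few lines, assuming all series coupled windings share the same voltage-per-turn. The function name and the reluctance values are hypothetical.

```python
def steer_ripple(delta_i_k, R_K):
    """Split the phase ripple among windings in proportion to leakage reluctance.

    Returns (steering coefficients s_j, winding ripples delta_i_kj) per Eq. (2.11).
    """
    total = sum(R_K)
    s = [r / total for r in R_K]
    return s, [sj * delta_i_k for sj in s]


# Hypothetical two-winding phase with a 2 A phase ripple: the second winding has a
# much lower leakage reluctance (higher leakage inductance) and takes less ripple.
s, ripples = steer_ripple(2.0, [9e6, 1e6])
print(s, ripples)
```

Making one leakage reluctance much smaller than the others pushes its steering coefficient toward zero, which is the nearly ripple-free winding condition described above.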

### 2.4 Transient Performance and Figure of Merit

Besides the current ripple, another important metric for a matrix coupled inductor is the transient performance, which impacts converter dynamics and control design. The transient performance of the matrix coupled inductor can be analyzed based on switching-cycle averaging. To analytically derive the transient inductance, symmetric matrix coupling is assumed. Asymmetric cases can be analyzed by following the same procedure.



Figure 2.12: (a) Duty ratio command for each phase remains identical during transients. (b) Switching-cycle averaged voltage of each winding is identical.

In a multiphase PWM converter with the matrix coupled inductor, duty ratios of parallel phases are usually identical. Under the control methods that maintain the same duty ratio for parallel phases during transients, all the windings will always have identical square wave voltages (same amplitude and pulse width), as shown in Fig. 2.12a. Applying switching-cycle averaging to the inductance dual model (Fig. 2.12b), the averaged winding voltage is the same for each winding, regardless of phase shifts. It indicates that in the switching-cycle averaged model, all the windings are always driven by identical voltage excitations. In this case, each winding can be modeled as an individual inductance network, as shown in Fig. 2.13a. Therefore, the equivalent inductance seen at each winding during transient (defined as transient inductance  $L_{tr}$  in Fig. 2.13b) is:

$$L_{tr} = \frac{n^2 N}{(M \mathcal{R}_C + \mathcal{R}_L)||N \mathcal{R}_K}.$$
 (2.12)



Figure 2.13: (a) Switching-cycle averaged model for each winding. (b) Equivalent inductance seen at each winding during transients.

 $L_{tr}$  determines the transient performance of the matrix coupled inductor under common-mode excitations. A smaller  $L_{tr}$  enables the designer to achieve a faster transient response with proper closed loop control [14]. This chapter only discusses this common-mode transient inductance, because it is typically used to solve the input or output transient speed and can indicate the scaling trend of the transient performance for matrix coupled magnetics. Transient response for parallel coupled inductors under differential-mode excitations is discussed in [58].

Under common-mode excitations, the averaged component voltages and currents when using the matrix coupled inductor are the same as when using discrete inductors $L_{tr}$. Therefore, the large- and small-signal models can be developed by treating the matrix coupled inductor as multiple discrete $L_{tr}$, as illustrated in Fig. 2.14. To quantify the benefits of matrix coupling, a figure of merit (FOM) is defined by comparing the current ripple of a matrix coupled inductor (under interleaved operation) to that of using discrete $L_{tr}$:

$$FOM \stackrel{\text{def}}{=} \frac{(\Delta i_{winding})_{pp}^{matrix-coupled}}{(\Delta i_{winding})_{pp}^{discrete-L_{tr}}} = \frac{1 + K_{\alpha\beta}\Gamma}{1 + K_{\alpha\beta}} = \gamma.$$
 (2.13)

The FOM describes the current ripple reduction ratio of using a matrix coupled inductor compared to using discrete inductors given the same transient performance.

A lower FOM indicates a lower current ripple than that of discrete inductors and



Figure 2.14: Switching-cycle averaged dynamics of the converter with the matrix coupled inductor are the same as that with discrete inductors of  $L_{tr}$ .

larger benefits. Accordingly, the effective steady-state inductance  $L_{ss}$  that has the same steady-state current ripple can be expressed as:  $L_{ss} = L_{tr}/\gamma$ .

Equation (2.13) also implies that the ripple reduction ratio between using the matrix coupled inductor and using discrete $L_{tr}$ equals the ripple reduction ratio between interleaved and non-interleaved operations for the matrix coupled inductor itself. This is consistent with the fundamental characteristics of matrix coupling: $L_{tr}$ is in effect the equivalent winding inductance under common-mode voltage excitations, so it is equal to the effective winding inductance when operated by non-interleaved (synchronized) PWM signals. Consequently, the current ripple of discrete $L_{tr}$ is the same as that of the matrix coupled inductor under non-interleaved operation, and thus the FOM equals $\gamma$. Figure 2.15 plots the FOM as a function of duty ratio with different phase numbers (M) and coupling factors $(K_{\alpha\beta})$, indicating that a higher number of parallel phases or a stronger matrix coupling coefficient will result in a lower FOM (i.e., larger benefits) across the full duty ratio range.



Figure 2.15: FOM as a function of duty ratio (D) for various numbers of phases (M) and coupling factors  $(K_{\alpha\beta})$ .

### 2.5 Design of a Matrix Coupled SEPIC Converter

To validate the matrix coupled magnetic structure and the theoretical analysis, a four-phase synchronous SEPIC converter with a planar matrix coupled inductor is designed and built. This section elaborates the detailed prototype design including the planar inductor structure, optimized winding pattern, and converter PCB layout.

#### 2.5.1 Matrix Coupled Inductor Design

Figure 2.16a shows the circuit topology, in which eight discrete PWM inductors are merged into one matrix coupled inductor. The matrix coupled inductor is implemented as planar PCB integrated magnetics utilizing a four-leg planar magnetic core



Figure 2.16: (a) Circuit topology of the four-phase matrix coupled synchronous SEPIC converter. (b) Four-leg EE-type magnetic core built with Ferroxcube 3F4 material. (c) Cross section of the magnetic core and inductor winding annotations.

as shown in Fig. 2.16b. The magnetic core is built with Ferroxcube 3F4 material and takes up 12 mm×13 mm board area. Two core pieces are stacked as an EE-type structure with 5.25 mm total height. The magnetic core is four-way symmetric, allowing for identical parameters across the four phases. Details about this core shape design are provided in [57]. A ladder core structure as presented in [41] is also feasible.

Figure 2.16c plots the cross-section view of the magnetic core and the annotated inductor winding configurations. The labeled winding current directions follow those defined in Fig. 2.16a. As shown in Fig. 2.16c, the two inductor windings of each phase are wound on the same core leg and their ac currents flow in the same direction, whereas the ac winding currents of neighboring phases flow in opposite directions in the shared window area. Due to the proximity effect, this can concentrate currents on the adjacent conductor surfaces between phases. Thus, an appropriate winding design is needed to reduce the ac resistance.



Figure 2.17: Alternative winding designs of the matrix coupled inductor based on an 8-layer PCB board of 3-oz copper thickness: (a) side by side; (b) non-interleaved overlapping; (c) interleaved overlapping. Assume each inductor current is  $I_L$ . MMF diagrams for windings in the window area are plotted along horizontal and vertical directions.

Figure 2.17 shows three feasible PCB winding designs based on an 8-layer PCB board of 3-oz copper thickness. MMFs of the windings are plotted in both horizontal and vertical directions across the window area. Figure 2.17a shows a side-by-side winding design, where windings of neighboring phases are placed side by side in the window area. Each inductor comprises four layers of windings connected in parallel and takes up half of the window width. Assume each inductor current is  $I_L$ . In the window area, MMFs of two reverse inductor currents on the same layer cancel each other, so the MMF along vertical direction remains zero. Along horizontal direction,



Figure 2.18: FEM simulation of magnetic field strength distributions and ac winding current distributions in the designs of (a) side by side; (b) non-interleaved overlapping; (c) interleaved overlapping. Each inductor is driven by a 1-MHz sinusoidal current excitation of 10-A amplitude. Copper thickness and current directions are consistent with Fig. 2.17.

the opposite winding currents on the left and right sides enhance the magnetic flux, resulting in a maximum MMF of  $2I_L$  in the center. Figure 2.17b shows a non-interleaved overlapping design, in which windings between phases take up the full window width and are overlapped in the window area. Winding layers between phases are not interleaved but separated into top and bottom halves of the PCB. Each inductor is comprised of two layers of windings connected in parallel. Due to the overlapped windings, MMFs of reverse currents cancel along horizontal axis, but the opposite currents of the top and bottom halves result in a maximum MMF of  $2I_L$  in the middle of vertical direction. Figure 2.17c plots the interleaved overlapping winding design, where windings between neighboring phases are both overlapped in the window area and interleaved across PCB layers. The interleaved winding layers can effectively reduce the MMF along vertical direction. The maximum MMF along vertical direction is  $I_L/2$ , and MMF along horizontal direction remains zero. Accordingly, the interleaved overlapping winding design maintains low MMFs in both the horizontal and vertical directions.

For the three winding designs in Fig. 2.17, the dc resistance of the windings in the window area is the same, but the ac resistance varies. Figure 2.18 shows the finite-element-method (FEM) simulations of magnetic field distributions and ac winding current distributions. The simulations are performed by applying a 1-MHz sinusoidal current excitation of 10-A amplitude to each inductor, and eddy current effects are captured for each winding. Since the core has a large permeability $\mu_{core} \gg \mu_0$, the H field of leakage flux in the window area is much higher than in the core. The FEM simulation results are consistent with the MMF analysis in Fig. 2.17. For the side-by-side design, the H field mainly flows vertically in the center of the window area; for the non-interleaved overlapping design, it flows horizontally and concentrates between the middle layers. For the interleaved overlapping design, the major H field in the window area also flows horizontally along the conductor layers, but it remains low and is well balanced across different layers. The high H field in the side-by-side and non-interleaved overlapping designs causes concentrated current at nearby conductor surfaces with increased ac resistance, as can be observed in Fig. 2.18. Consequently, the interleaved overlapping design has the most balanced ac current distribution with the lowest ac resistance, and thus is selected.

Figure 2.19 plots the overall 3-D structure and detailed PCB winding patterns of the planar matrix coupled inductor. Circuit connections and current flow direction of each winding (as defined in Fig. 2.16a) are labeled in the figure. Parallel winding terminal connections of phases 1 & 2 are drawn for demonstration. Winding patterns and terminal connections of phases 3 & 4 are centrosymmetric to phases 1 & 2.

#### 2.5.2 Matrix Coupled SEPIC Prototype

The matrix coupled SEPIC prototype is designed to flexibly deliver power from  $1 \text{ V}\sim 5 \text{ V}$  input to  $1 \text{ V}\sim 5 \text{ V}$  output. Figure 2.20 shows the annotated prototype from top, bottom, and side views. The prototype measures 35 mm  $\times$  35 mm in



Figure 2.19: 3-D structure of the matrix coupled inductor and PCB winding patterns on (a) layer 1 & 3, (b) layer 2 & 4, (c) layer 6 & 8, and (d) layer 5 & 7. Winding terminal connections of phases 1 & 2 are plotted for demonstration. Phases 3 & 4 are centrosymmetric to phases 1 & 2. The multilayer overlapped implementation of multiple windings enables greatly reduced ac resistance.

area and 5.25 mm in height. Its total volume is 6431 mm<sup>3</sup> (i.e., 0.392 in<sup>3</sup>). On the converter, the matrix coupled inductor is located in the center with the four SEPIC phases surrounding it. As a result, both the matrix coupled inductor and the overall converter structure are centrosymmetric, which helps keep the parameters balanced across the four phases.

In the matrix coupled SEPIC converter, the blocking capacitors might resonate with the leakage inductance of the coupled inductor. To avoid resonance and keep the blocking capacitors working as dc sources, sufficient leakage inductance is needed to keep the resonant frequency far below the switching frequency [64]. In this design, as shown in Fig. 2.20, through-hole connections for external windings are reserved to adjust the leakage inductance. Figure 2.21a shows two alternative external windings: a current measurement loop (27 nH) and a compact rectangular winding (10 nH). Their assembly setups are shown in Figs. 2.21b and 2.21c, respectively. To achieve high power density, the compact rectangular winding is designed to reduce the height so that the prototype thickness is determined only by the magnetic core, as shown in Fig. 2.20c. In the remainder of this chapter, all the current measurements are performed with the current measurement loop, whereas efficiency and maximum output power are measured with the compact rectangular winding. As implied by Fig. 2.8, the external winding inductance can be merged into $1/\mathcal{R}_k$ to directly leverage the developed analysis of current ripple and transient performance. Table 2.2 lists detailed component descriptions and equivalent magnetic parameters of the two external winding setups. Experiments below are performed based on the parameters in Table 2.2, unless otherwise specified. The leakage inductance of external windings can be further integrated into the matrix coupled inductor design for maximizing the power density.

Figure 2.20: Annotated matrix coupled SEPIC prototype: (a) top view; (b) bottom view; (c) side view. The prototype measures $35 \text{ mm} \times 35 \text{ mm} \times 5.25 \text{ mm}$.

Figure 2.21: External winding setup: (a) two winding options; (b) current measurement setup; (c) compact winding setup for high power density.

| Device & Symbol                                 | Description                                                                       |
|-------------------------------------------------|-----------------------------------------------------------------------------------|
| Low Side Switch, $S_{11} \sim S_{41}$           | Infineon BSZ010NE2LS5                                                             |
| High Side Switch, $S_{12} \sim S_{42}$          | Infineon BSZ011NE2LS5I                                                            |
| Switch Gate Driver                              | TI LM5114                                                                         |
| Blocking Capacitor, $C_{B1} \sim C_{B4}$        | X5R, 6.3 V, 100 $\mu$F × 6                                                        |
| Input Capacitor, $C_{IN1} \sim C_{IN4}$         | X5R, 6.3 V, 100 $\mu$F × 6                                                        |
| Output Capacitor, $C_{OUT1} \sim C_{OUT4}$      | X5R, 6.3 V, 100 $\mu$F × 10                                                       |
| Core Material                                   | Ferroxcube 3F4                                                                    |
| Core Leg Reluctance, $\mathcal{R}_L$            | $1.02 \times 10^6 \ \mathrm{H^{-1}}$                                              |
| Leakage Reluctance, $\mathcal{R}_C$             | $19.9 \times 10^6 \ \mathrm{H^{-1}}$                                              |
| Winding Leakage Reluctance, ${\mathcal{R}_k}^*$ | ① $36.9 \times 10^6 \ \mathrm{H^{-1}}$; ② $99.0 \times 10^6 \ \mathrm{H^{-1}}$    |

Table 2.2: Bill-of-Material of the Matrix Coupled SEPIC Prototype

### 2.6 Experimental Results

This section shows the experimental results in terms of inductor current ripple, transient speed, converter efficiency, and magnetics comparison.

#### 2.6.1 Inductor Current Ripple and Converter Dynamics

Current ripple and transient speed are usually tradeoffs for discrete inductors, whereas the matrix coupled inductor can achieve both low current ripple and fast transient

<sup>\*</sup> Equivalent  $\mathcal{R}_k$  including the external winding inductance. ① is for the current measurement loop. ② is for the compact rectangular winding.



Figure 2.22: (a) Measured two switch-node voltages, a blocking capacitor voltage, and an inductor current (as defined in Fig. 2.16), when  $V_{in} = 5 \text{ V}$ ,  $V_{out} = 3.3 \text{ V}$ ,  $I_{out} = 50 \text{ A}$ , and  $f_{sw} = 806 \text{ kHz}$ . (b) Current ripple reduction ratio as a function of duty ratio with different external winding setups.

speed at the same time. This subsection experimentally validates the analysis of current ripple reduction, current ripple steering, and converter dynamics as discussed in Sections 2.3 and 2.4.

Figure 2.22a plots the measured steady-state operation waveforms when the prototype is switching at 806 kHz and converting 5 V into 3.3 V with 50 A load current. As shown in the figure, the blocking capacitor voltage remains stable without resonance, so the capacitor functions like a dc source as expected. Owing to multiphase interleaving, the inductor current ripple is reduced and appears at four times the switching frequency. Figure 2.22b plots the current ripple reduction ratio ( $\gamma$ ) as a function of duty cycle (D) for the two external winding setups and for the case without external winding. A larger external winding leakage inductance leads to a lower coupling coefficient and a larger transient inductance. According to the magnetic parameters in Table 2.2, matrix coupling coefficients and equivalent transient inductances for the current measurement loop and compact winding setups are: $K_{\alpha\beta} = 37$ & 56 and $L_{tr} = 52$ nH & 35 nH, respectively. When directly connecting without external winding, they are $K_{\alpha\beta} = 78$ and $L_{tr} = 25$ nH. Figure 2.22b implies that the coupling coefficients with the two external winding setups are sufficiently high so that the ripple reduction ratios with and without external windings are similar. The major difference lies in the transient inductance $L_{tr}$ that varies from 25 nH to 52 nH, resulting in the variation of steady-state inductance $L_{ss}$ as well as winding current ripple. The figure also indicates that inductor current ripple of the four-phase matrix coupled SEPIC will approach almost zero when D = k/4, k = 1, 2, 3.
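The quoted transient inductances can be cross-checked against Eq. (2.12) with the reluctances listed in Table 2.2. The sketch below assumes single-turn windings ($n = 1$), $N = 2$ series coupled windings per phase, and $M = 4$ phases, and it models the case without external winding as $\mathcal{R}_K \to \infty$. These settings are assumptions made here for illustration; they are not stated explicitly alongside the quoted values, although they reproduce them.

```python
import math


def parallel(a, b):
    """Parallel combination a || b = a*b / (a + b); an infinite b returns a."""
    if math.isinf(b):
        return a
    return a * b / (a + b)


def transient_inductance(n, N, M, R_C, R_L, R_K):
    """Eq. (2.12): L_tr = n^2 * N / ((M*R_C + R_L) || (N*R_K))."""
    return n**2 * N / parallel(M * R_C + R_L, N * R_K)


# Reluctances from Table 2.2 (unit: 1/H).
R_L = 1.02e6
R_C = 19.9e6
R_K_SETUPS = {
    "current measurement loop": 36.9e6,
    "compact rectangular winding": 99.0e6,
    "no external winding (assumed R_K -> inf)": math.inf,
}

# Assumed structure: n = 1 turn, N = 2 windings per phase, M = 4 phases.
l_tr_nh = {
    name: transient_inductance(1, 2, 4, R_C, R_L, r_k) * 1e9
    for name, r_k in R_K_SETUPS.items()
}
for name, value in l_tr_nh.items():
    print(f"{name}: L_tr = {value:.1f} nH")
```

Under these assumptions the three results round to 52 nH, 35 nH, and 25 nH, in line with the values quoted above.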

To verify the analysis of current ripple reduction, the matrix coupled SEPIC prototype (with measurement loop) is tested under both interleaved and non-interleaved operations when converting voltage from 1 V to 3.3 V at 1 MHz switching frequency. In this case,  $L_{tr} = 52$  nH and  $L_{ss} = 1.07$   $\mu$ H, as indicated by Fig. 2.22b. Figure 2.23 shows the measured inductor current ripples, which are well-balanced across the four phases, indicating a good symmetry of the prototype. Under interleaved operation, inductor current ripple is the same as using discrete  $L_{ss}$  and is only 0.8 A. Under non-interleaved operation, it is the same as using discrete  $L_{tr}$  and increases to 15.5 A. The measured ripple reduction ratio at this duty cycle is 5.2%, reflecting about 20x current ripple reduction compared to using discrete inductors with the same transient speed. The measured current ripples under interleaved and non-interleaved operations as well as ripple reduction ratio match well with the theoretically calculated ones (0.72 A, 14.8 A, 4.8%).
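The theoretical ripple values quoted above can be reproduced with a short calculation. The sketch assumes an ideal lossless SEPIC, so that $D = V_{out}/(V_{in}+V_{out})$, and assumes each inductor sees $V_{in}$ during the on-time, so that the peak-to-peak ripple of an equivalent discrete inductor $L$ is $V_{in} D /(f_{sw} L)$. Both relations are standard textbook approximations used here for illustration, and the function names are hypothetical.

```python
def sepic_duty(v_in, v_out):
    """Ideal SEPIC duty ratio from V_out / V_in = D / (1 - D)."""
    return v_out / (v_in + v_out)


def pk_pk_ripple(v_in, duty, f_sw, inductance):
    """Peak-to-peak ripple of a discrete inductor driven by v_in for duty / f_sw."""
    return v_in * duty / (f_sw * inductance)


V_IN, V_OUT, F_SW = 1.0, 3.3, 1e6      # operating point of Fig. 2.23
L_TR, L_SS = 52e-9, 1.07e-6            # values quoted in the text

duty = sepic_duty(V_IN, V_OUT)
ripple_non_interleaved = pk_pk_ripple(V_IN, duty, F_SW, L_TR)  # discrete L_tr
ripple_interleaved = pk_pk_ripple(V_IN, duty, F_SW, L_SS)      # discrete L_ss
gamma_model = L_TR / L_SS              # from L_ss = L_tr / gamma
gamma_measured = 0.8 / 15.5            # measured ripples from Fig. 2.23

print(f"D = {duty:.3f}")
print(f"non-interleaved ripple = {ripple_non_interleaved:.2f} A")
print(f"interleaved ripple     = {ripple_interleaved:.2f} A")
print(f"gamma (model)    = {gamma_model:.1%}")
print(f"gamma (measured) = {gamma_measured:.1%}")
```

With these inputs the modeled ripples come out at roughly 14.8 A and 0.72 A, and the measured ratio at about 5.2%, consistent with the numbers reported in this paragraph.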

The matrix coupled SEPIC prototype is also tested with asymmetric series coupling to validate current ripple steering. In the experiment, external winding leakage is adjusted by changing the size of current measurement loop. Phases 1 & 3 are selected for demonstration. In phase 1, two external winding leakages are identical (both are 27 nH), while in phase 3, they are 22 nH and 37 nH, respectively. Figure 2.24 shows the measured winding current ripples of phases 1 & 3. As indicated by the figure, the summed phase current ripples are still balanced between phases 1 & 3 (i.e.,  $7 \text{ A} \approx 6.9 \text{ A}$ ), because the lumped leakage inductances of the series



Figure 2.23: Inductor current ripple under (a) interleaved operation and (b) non-interleaved operation.  $V_{in} = 1 \text{ V}$ ,  $V_{out} = 3.3 \text{ V}$ ,  $f_{sw} = 1 \text{ MHz}$ , and tested in the setup with current measurement loops.



Figure 2.24: Measured waveforms for verifying current ripple steering due to asymmetric series coupling. In phase 1, two external winding leakages are both 27 nH, while in phase 3, they are 22 nH and 37 nH, respectively. The ripple steering ratio is inversely proportional to external winding leakage inductances.

coupled windings in the two phases are identical (i.e.,  $27||27 \text{ nH} \approx 22||37 \text{ nH}$ ). Due to asymmetric series coupling, however, phase current ripple is unevenly distributed between the two windings in phase 3. The distributed ripple percentage is inversely proportional to external winding leakage inductances, consistent with the analysis in Section 2.3.3.
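The steering experiment can be checked against Eq. (2.11) using the quoted leakage inductances. The sketch below treats each quoted value as the winding's equivalent leakage inductance, so that the leakage reluctance is its reciprocal (single-turn windings assumed). The helper names are hypothetical.

```python
def steering_coefficients(leakage_inductances):
    """Eq. (2.11): s_j = R_Kj / sum(R_Ki), with R_Kj = 1 / L_j (n = 1 assumed)."""
    reluctances = [1.0 / l for l in leakage_inductances]
    total = sum(reluctances)
    return [r / total for r in reluctances]


def lumped_leakage(leakage_inductances):
    """Lumped leakage inductance 1 / R_eq, i.e. the parallel combination."""
    return 1.0 / sum(1.0 / l for l in leakage_inductances)


phase1 = [27e-9, 27e-9]   # identical leakages
phase3 = [22e-9, 37e-9]   # asymmetric leakages

s1 = steering_coefficients(phase1)
s3 = steering_coefficients(phase3)
l1 = lumped_leakage(phase1)
l3 = lumped_leakage(phase3)

print(f"phase 1: lumped = {l1 * 1e9:.1f} nH, steering = {[round(s, 3) for s in s1]}")
print(f"phase 3: lumped = {l3 * 1e9:.1f} nH, steering = {[round(s, 3) for s in s3]}")
```

The lumped leakages of the two phases differ by only about 2%, which explains the balanced phase ripples, while in phase 3 the 22-nH winding is predicted to carry roughly 63% of the phase ripple and the 37-nH winding the remaining 37%.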

As discussed in Section 2.4, converter dynamics of the matrix coupled SEPIC can be analyzed by replacing the matrix coupled inductor with discrete  $L_{tr}$ . Then existing modeling methods for SEPIC converter can be directly applied. Figure 2.25 plots the



Figure 2.25: (a) Small-signal circuit model of the four-phase matrix coupled SEPIC converter. (b) Simplified small-signal circuit model. D is the gate driving duty ratio of the lower switch  $S_{k1}$ , and D' = 1 - D. Assume blocking capacitors have stable voltages and can be treated as constant voltage sources.

small-signal circuit model of the four-phase matrix coupled SEPIC. In Fig. 2.25a, each phase is modeled in the same way as a conventional SEPIC converter with discrete  $L_{tr}$ , and small-signal circuits of multiple phases are connected in parallel. An  $R_{eq}$  is inserted to capture the power losses of each phase. The simplified small-signal circuit of four parallel phases is plotted in Fig. 2.25b, indicating that the matrix coupled SEPIC converter is a second-order system. Accordingly, the control  $(\hat{d})$  to output  $(\hat{v}_{out})$  transfer function can be derived as follows:

$$\frac{\hat{v}_{out}}{\hat{d}} = \frac{1}{MR_o D'^2 + R_{eq}} \cdot \frac{\left(MR_o - \frac{D}{D'^2}R_{eq} - \frac{D}{2D'^2}L_{tr}s\right)V_{IN}}{\frac{s^2}{\omega_n^2} + \frac{s}{\omega_n Q} + 1},\tag{2.14}$$

$$\omega_n = \sqrt{\frac{2R_{eq} + 2MR_oD'^2}{L_{tr}R_oC_{out}}}, \quad Q = \frac{\sqrt{2L_{tr}R_oC_{out}}}{L_{tr} + 2R_{eq}R_oC_{out}}$$
(2.15)



Figure 2.26: (a) Modeled and measured Bode plots of the control  $(\hat{d})$  to output  $(\hat{v}_{out})$  transfer function. (b) Duty ratio perturbation from 50% to 53%. Duty ratio is indicated by the controller DAC output with  $0 \sim 3.3$  V to represent  $0 \sim 100\%$ .  $V_{in} = 3.3$  V,  $V_{out} = 3.3$  V,  $f_{sw} = 1$  MHz, effective  $C_{out} = 168~\mu F$  in (a) and  $8.8~\mu F$  in (b),  $R_o = 2.5~\mathrm{k}\Omega$  in (a) and  $0.4~\Omega$  in (b). Tested with current measurement loops.

To verify the transfer function, the matrix coupled SEPIC prototype was operated at 3.3-V input to 3.3-V output with 2.5-k $\Omega$ $R_o$ and 168- $\mu$F effective $C_{out}$ (considering dc bias degradation). The equivalent $R_{eq}$ is 15.5 m $\Omega$. The control to output transfer function is measured from gate signal to output voltage in the same way as in [58]. Figure 2.26a compares the measured and modeled transfer functions. The discrepancies mainly come from errors in the estimated resistance ( $R_{eq}$ ), non-linear effects in inductors and capacitors, and other factors that the small-signal circuit model does not capture, such as deadtime and switching loss.

Figure 2.26b shows the measured transients during a duty ratio perturbation from 50 % to 53 %. In this test, the effective output capacitance  $C_{out} = 8.8 \ \mu F$ , and the load resistance  $R_o = 0.4 \ \Omega$ . According to Eq. (2.15), the resonant frequency of the control-to-output transfer function is  $f_n = \frac{\omega_n}{2\pi} = 333 \ \text{kHz}$ . The transient output voltage in Fig. 2.26b is a typical underdamped second-order system response. The measured resonant frequency of the output voltage waveform is 345 kHz, which is close to the theoretical calculation.



Figure 2.27: (a) Measured efficiency of different voltage conversion ratios at 806 kHz switching frequency. (b) Full-load hot-spot temperature of the prototype under 36 CFM airflow. ( $V_{in} = 5 \text{ V}$ ,  $V_{out} = 1 \text{ V}$ ,  $I_{out} = 185 \text{ A}$ , and  $f_{sw} = 806 \text{ kHz}$ .)

#### 2.6.2 Efficiency Measurement and Magnetics Comparison

Figure 2.27a shows the measured converter efficiency of different conversion ratios when switching at 806 kHz. The peak efficiency and maximum output power for the 5 V-to-3.3 V, 5 V-to-1 V, and 1 V-to-3.3 V conversions are (93.2%, 430 W), (90.3%, 185 W), and (93.5%, 170 W), respectively. The maximum output power in each case is obtained when the hot-spot temperature reaches around 95 °C under 36 CFM airflow, as demonstrated in Fig. 2.27b. The measurement results indicate that the matrix coupled SEPIC prototype can flexibly deliver power from 1 V $\sim$ 5 V input to 1 V $\sim$ 5 V output and can deliver up to 185-A output current at 5 V-to-1 V voltage conversion with power density over 470 W/in<sup>3</sup>.
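The power density figure follows directly from the prototype dimensions and the full-load operating point. The short check below uses only values quoted in this chapter; the variable names are illustrative.

```python
MM_PER_INCH = 25.4

# Prototype box dimensions (Fig. 2.20), in mm.
length_mm, width_mm, height_mm = 35.0, 35.0, 5.25
volume_mm3 = length_mm * width_mm * height_mm
volume_in3 = volume_mm3 / MM_PER_INCH**3

# Full-load point of the 5 V-to-1 V conversion.
v_out, i_out = 1.0, 185.0
p_out = v_out * i_out

power_density = p_out / volume_in3
print(f"volume = {volume_mm3:.0f} mm^3 = {volume_in3:.3f} in^3")
print(f"power density = {power_density:.0f} W/in^3")
```

The result is slightly above 470 W/in<sup>3</sup>, matching the reported figure.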

Detailed power loss breakdown versus output current for the 5 V-to-1 V voltage conversion is plotted in Fig. 2.28. In Fig. 2.28a, the calculated efficiency based on the estimated power loss is compared with the measured efficiency. Figure 2.28b shows the power loss proportion in the peak-efficiency load condition and full load condition. At light load, the switching losses of high side and low side switches dominate and limit the peak efficiency, while at full load, conduction losses of inductor windings and



Figure 2.28: (a) Detailed power loss breakdown and calculated efficiency for 5 V-to-1 V voltage conversion at 806 kHz switching frequency. (b) Power loss proportion in the peak-efficiency load condition ( $I_{out} = 29$  A) and full load condition ( $I_{out} = 185$  A). Loss breakdown includes conduction loss and switching loss of high side and low side switches,  $P_{HS.Cond}$ ,  $P_{HS.SW}$ ,  $P_{LS.Cond}$ ,  $P_{LS.SW}$ ; ESR loss of blocking capacitors  $P_{Cap}$ ; inductor winding loss and core loss,  $P_{Winding}$ ,  $P_{Core}$ ; conduction loss of PCB traces and vias,  $P_{PCB}$ .

high side switches dominate. In the loss breakdown, winding conduction loss takes up a large portion, especially at heavy load. Therefore, one straightforward way of improving converter efficiency is to integrate the external leakage inductance into the matrix coupled inductor to achieve lower winding resistance. Besides, by replacing the switches with lower current-rated ones that have smaller parasitic capacitance, the switching loss can be reduced, and converter light-load efficiency (including the peak efficiency) can be further improved. The tradeoffs are the increased $R_{ds(on)}$ and the decreased maximum power rating. Notably, the switching loss of the low side switches is much higher than their conduction loss, especially at light load. However, the low side switches should still keep a current rating similar to that of the



Figure 2.29: Size comparison between commercial discrete inductors and the matrix coupled inductor. The background grid cell size is 1 cm. Comparison is based on the same current ripple, similar winding dc resistance (DCR), and similar current ratings (i.e.,  $I_{rms} \geq 40$  A with inductor temperature rise less than 40 °C) when converting voltage from 5 V to 1 V at 806 kHz switching frequency. Box dimensions (length×width×height) of the eight discrete inductors and the matrix coupled inductor are  $44\times30.48\times12.66$  mm<sup>3</sup> and  $24\times24\times5.25$  mm<sup>3</sup>, respectively.

high-side switches in order to maintain balanced performance across wide buck and boost conversion range.

According to Fig. 2.22b, the designed matrix coupled inductor (with compact winding) has the same fast transient speed as a 35-nH discrete inductor and maintains the same low current ripple as a 302-nH discrete inductor at 5 V-to-1 V voltage conversion. Figure 2.29 compares the matrix coupled inductor with state-of-the-art commercial discrete inductors. Here, the Coilcraft SER1412-301ME inductor, which has a similar current ripple and current rating, is selected. Both the magnetic core and inductor windings (i.e., PCB & external windings) are included in the matrix coupled inductor for the size comparison. The box volumes of eight discrete inductors and one matrix coupled inductor are 16,978 mm<sup>3</sup> and 3,024 mm<sup>3</sup>, respectively, indicating over 5.6 times size reduction.

To further compare the converter performance, the four-phase SEPIC prototype is also tested with discrete inductors as shown in Fig. 2.30a. Figure 2.30b plots the



Figure 2.30: (a) Four-phase SEPIC prototype equipped with discrete inductors. (b) Efficiency comparison of the SEPIC prototype with one matrix coupled inductor (compact winding) and with eight discrete inductors.



Figure 2.31: Measured open-loop transient waveforms of the SEPIC prototype with (a) matrix coupled inductor (compact winding); (b) discrete inductors. Duty ratio steps from 17% to 41.9%.  $V_{in}=5$  V;  $V_{out}$  changes from 1 V to 3.3 V;  $I_{out}=20$  A;  $f_{sw}=806$  kHz; effective  $C_{out}=300$   $\mu$ F.

converter efficiency with the matrix coupled inductor and with discrete inductors, which are almost the same. It is consistent with the analysis since they have the same current ripple and similar winding resistance. Figure 2.31 shows the measured open-loop transient response of the two inductor setups during a duty ratio step change. As indicated by the figure, the matrix coupled inductor can significantly

improve the transient performance by reducing both settling time and voltage overshoot. Consequently, compared to commercial discrete inductors, the designed matrix coupled inductor can reduce total magnetic volume by over 5.6 times and improve the transient speed by over 8.5 times (i.e.,  $L_{tr}$  reduced from 300 nH to 35 nH) while maintaining similar current ripple and current rating.
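The two headline ratios can be recomputed from the box dimensions in the caption of Fig. 2.29 and the quoted transient inductances. This is a plain arithmetic check with illustrative variable names.

```python
# Box dimensions (length x width x height) in mm, from the caption of Fig. 2.29.
discrete_box_mm = (44.0, 30.48, 12.66)   # eight discrete inductors
matrix_box_mm = (24.0, 24.0, 5.25)       # one matrix coupled inductor


def box_volume(dimensions):
    """Volume of a rectangular box given (length, width, height)."""
    length, width, height = dimensions
    return length * width * height


volume_discrete = box_volume(discrete_box_mm)
volume_matrix = box_volume(matrix_box_mm)
size_ratio = volume_discrete / volume_matrix

# Transient inductance: 300 nH discrete inductor vs. 35 nH for the matrix coupled inductor.
l_tr_discrete, l_tr_matrix = 300e-9, 35e-9
transient_ratio = l_tr_discrete / l_tr_matrix

print(f"discrete: {volume_discrete:.0f} mm^3, matrix coupled: {volume_matrix:.0f} mm^3")
print(f"size reduction  = {size_ratio:.2f}x")
print(f"L_tr reduction  = {transient_ratio:.2f}x")
```

Both ratios land just above the reported 5.6 times and 8.5 times figures.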

### 2.7 Chapter Summary

This chapter presents a matrix coupled all-in-one magnetics integration approach that combines both series coupling and parallel coupling. A systematic analysis of the current ripple reduction is performed, which implies that current ripple reduction fundamentally comes from multiphase interleaving, and coupling coefficients will scale the ripple reduction ratio gained from interleaving. To have a lower current ripple, both stronger series coupling and parallel coupling are preferred. Current ripple steering due to asymmetric series coupling is discussed, and steering ratios are derived. By adjusting steering ratios, ripple can be steered away from specific windings, beneficial to ripple-sensitive applications. The transient performance of the matrix coupled inductor is demystified, providing guidance on converter dynamics analysis and large- or small-signal model derivation. To quantify the benefits of matrix coupling, an FOM is defined by comparing the current ripple of a matrix coupled inductor to that of a discrete inductor given the same transient speed. The comparison results indicate that a higher number of phases and a stronger matrix coupling coefficient will amplify the benefits of matrix coupled inductors compared to discrete ones. A 1 V-to-5 V input, 1 V-to-5 V output, four-phase matrix coupled synchronous SEPIC converter was designed and built. The matrix coupled inductor is implemented as a PCB planar magnetic component with an optimized winding design to reduce ac resistance. The matrix coupled SEPIC prototype achieves a maximum power density over 470 W/in<sup>3</sup> at 5 V-to-1 V voltage conversion. Compared to discrete commercial inductors, the designed matrix coupled inductor has a 5.6 times smaller size and 8.5 times faster transient speed with similar current ripple and current rating. The experimental results validate both the matrix coupling concept and the theoretical analysis.

The methodology presented in this chapter enables a holistic rethinking of power magnetic component design in PWM power converters – there are opportunities to merge all PWM power magnetics in a topology into one to reduce size, improve efficiency, and enhance transient performance.

#### **Related Publications**

- 1. P. Wang, D. H. Zhou, Y. Elasser, J. Baek and M. Chen, "Matrix Coupled All-in-One Magnetics for PWM Power Conversion," *IEEE Transactions on Power Electronics*, vol. 37, no. 12, pp. 15035-15050, Dec. 2022.
- 2. P. Wang, Y. Elasser, V. Yang and M. Chen, "WAN Converter: A Family of Multicell PWM Converter with All-in-One Magnetics," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, Houston, TX, USA, 2022, pp. 1035-1042.
- 3. P. Wang, D. Zhou, V. Yang and M. Chen, "Matrix Coupled All-in-One Magnetics for PWM Power Conversion," in *Proc. IEEE Workshop on Control and Modelling of Power Electronics (COMPEL)*, Colombia, 2021, pp. 1-8.

# Chapter 3

# Granular Architecture with Parallel Coupled Magnetics for High Current Computing Systems

### 3.1 Background and Motivation

The microprocessor industry has gone through tremendous change over the past few decades (Fig. 3.1a). It had been driven by Moore's law and Dennard scaling [78] for a long time, during which integrated transistor count on one single chip grew exponentially without significantly increasing the power consumption density. In recent years as Dennard scaling tapered out, processor performance-per-Watt improvement gained from the advances in fabrication process gradually faded away [79]. To meet the growing computational demand of artificial intelligence (AI) applications and cloud computing, microprocessors have entered a new era, where multiple cores are integrated on one chip and many chiplets are co-located on one interposer [80], incessantly pushing towards larger die area and higher power consumption.

However, the continuous scaling of computing systems is hitting both the power wall and the memory wall (Fig. 3.1b) [81]. With billions of transistors, high-performance microprocessors nowadays can consume hundreds of amperes of



Figure 3.1: (a) Microprocessor trend data during  $1972 \sim 2022$  (replotted from [74]). (b) As microprocessors develop from single-core, monolithic die to multi-core, multiple chiplets, modern computing systems are hitting both power wall and memory wall (replotted from [75]). Process node geometry and die area of selected high-performance-tier GPUs in [76,77] are plotted along the scaling curve of GPU thermal design power.

current at very low voltage (< 1 V), greatly increasing the conduction loss on power distribution networks (PDN) and narrowing the tolerance for supply voltage variations [1]. In addition, the development of AI algorithms dramatically boosts the memory bandwidth demand. These trends pose severe challenges to the design of sophisticated signal and power networks, which require high converter efficiency, high control bandwidth, and high signal and power integrity.

A recent trend in data centers is to replace the ac power distribution with 48~54 V dc distribution networks on the server racks [82]. To deliver power from the 48 V dc bus to low-voltage microprocessors, conventional voltage regulation solutions rely heavily on on-board power conversion, with little or no conversion stress inside the processor package (Fig. 3.2a). The on-board power circuits of various point-of-load (PoL) converters can be generally classified into two categories: the two-stage architecture [54, 83–87] and the single-stage architecture [68, 88–91]. In two-stage architectures, an intermediate dc voltage bus is employed to decouple the voltage conversion stress and transient dynamics between the two converter stages. The first stage is usually a transformer-based converter (e.g., LLC converter) or a switched-capacitor (SC) circuit functioning as a fixed-ratio dc transformer (DCX), and the second stage is a multiphase buck switching at high frequencies for high control bandwidth. Compared to transformer-based topologies, SC converters utilize capacitors to undertake the major voltage stress for the large step-down ratio and can substantially decrease the converter size due to the superior capacitor energy storage density. If the two stages are merged, the soft charging technique can be leveraged on the SC circuits to reduce the charge sharing loss [31,92–94], allowing the use of smaller capacitors or lower switching frequency with decreased converter size and improved efficiency. Single-stage architectures, which have a low component count and fewer power conversion stages, can attain high efficiency and high power density, but they might have difficulty realizing high control bandwidth. Although on-board power conversion solutions are currently the mainstream due to mature techniques and easier implementation, their long PDN traces lead to high conduction loss, and their large on-board areas impede microprocessors from communicating with peripherals, limiting the achievable efficiency, power density, and control and communication bandwidth.

An alternative 48-to-1-V voltage regulation solution is to embed a substantial part of or complete power conversion circuits into the processor package, enabling ultra-compact power-supply-in-package (PwrSiP) systems [95], as shown in Fig. 3.2b. With PwrSiP voltage regulation, power conversion stress is shifted from on-board circuits to in-package circuits. The shortened interconnection lengths can substantially decrease PDN losses and benefit signal integrity, making it extremely attractive for powering future high-performance microprocessors. Figure 3.3 shows an example PwrSiP implementation, where a voltage regulator module (VRM) is co-packaged with a CPU. To fit into the CPU socket, the VRM is required to have both small area and low z



Figure 3.2: Microprocessor power architecture comparison between (a) traditional solution that heavily relies on the on-board power conversion and (b) PwrSiP solution that focuses on the in-package power conversion. A two-stage on-board conversion architecture is demonstrated in (a) as an example. Labeled efficiencies are sourced from [37, 39, 97] and Section 3.6.4 (including gate loss).



Figure 3.3: Ultra-thin VRM embedded into a CPU package that fits in a land-grid array (LGA) socket for extreme efficiency, density, and control bandwidth.

height. Typically, the VRM height is set by the magnetic components, whose sizes are limited by the fundamental trade-off between transient and ripple performance. Parallel coupled magnetics with interleaving operation can obtain both high di/dt in transient and low current ripple in steady-state, greatly reducing dc energy storage and magnetic size [14, 41, 96].

In pursuit of an ultra-compact PwrSiP CPU VRM with miniaturized z height for high current computing systems, this chapter presents a multistack switched-capacitor point-of-load (MSC-PoL) architecture with parallel coupled magnetic components [98], as demonstrated in Fig. 3.4. Multiple granular SC cells are stacked in front and break down the high input voltage into many intermediate voltage rails, which are loaded with granular switched-inductor current sources to perform soft



Figure 3.4: MSC-PoL architecture for microprocessor voltage regulation. Stacked SC cells break down the high input voltage and create many intermediate voltage rails loaded with switched-inductor cells to perform voltage regulation. Multiple capacitors of the SC stage are soft charged by one single parallel coupled magnetic component.

charging and voltage regulation. Different from the two-stage PoL architectures, the intermediate voltage rail herein is not necessarily a fixed dc bus but may shift between several dc levels at different switching states [99–101]. The dc rail voltage is provided by the capacitor network of the SC stage, and thus large intermediate bus capacitors can be eliminated. The switched-inductor cell is switched in at the right time to get the desired voltage level. Many inductors of the switched-inductor cells are merged into one and operated in an interleaved fashion. Through soft charging multiple switched capacitors with one single coupled magnetic component, the MSC-PoL architecture can minimize both capacitor and magnetic size, achieving extremely low z-height as well as high efficiency and high transient speed.

As a hybrid switched-capacitor/magnetics system, the MSC-PoL architecture has intrinsic L-C resonant dynamics that might influence its control stability and transient response. This chapter presents a systematic approach to analyzing this intrinsic resonant behavior based on a series-capacitor buck (SCB) converter that has a similar small-signal model [102]. Two types of resonance, the output L- $C_o$  resonance and the interphase L- $C_B$  resonance, are identified through common-mode and differential-mode decomposition. The impacts of coupled inductors on the resonant amplitude, frequency, and settling time during a step line transient are analyzed, and the influence of intrinsic resonance on control stability is clarified, providing comprehensive guidance for controller design.

To validate the granular MSC-PoL architecture, a 48-to-1-V, 6-mm-thick MSC-PoL VRM with 3D-stacked ladder-core coupled inductors is built and tested. A 0.8-mm-thick leakage magnetic plate is designed to adjust the leakage inductance for lower current ripple. The MSC-PoL VRM encloses all components of the power stage, bootstrap, and gate driver circuits into a  $\frac{1}{16}$ -brick module with an ultra-compact size of 0.31 in<sup>3</sup>. Two MSC-PoL modules can support up to 450 A load current with over 724 W/in<sup>3</sup> power density. The peak efficiency (including gate loss) of the MSC-PoL prototype with and without the leakage plate is 91.7% and 89.5%, respectively. This converter is one of the smallest 48-to-1-V PoL converters in the 400 A range, with among the highest power density and efficiency and the lowest thickness reported in the literature.

The remainder of this chapter is structured as follows. Section 3.2 introduces the multistack switched-capacitor architecture together with several example topology implementations. Section 3.3 presents a specific 48-to-1-V MSC-PoL topology, clarifies its working principles, and analyzes its dynamic performance with small-signal modeling. Section 3.4 performs a systematic analysis of the intrinsic resonance issue on hybrid SC systems. Section 3.5 elaborates the design of the MSC-PoL converter, including the ladder-structured coupled inductor, gate driver circuits and 3D stacked packaging. Hardware prototype and detailed experimental results are summarized in Section 3.6. Finally, Section 3.7 concludes this chapter.

### 3.2 Multistack Switched-Capacitor (MSC) Architecture

There are many different ways of implementing the granular SC cells and switched-inductor current sources of the multistack switched-capacitor architecture. The SC cells can be implemented as any SC structure that can leverage soft charging, such



Figure 3.5: MSC-PoL architecture based on modular H-bridge structures. Voltage conversion ratio can be extended by stacking more H-bridges. The switched-inductor current sources can be interleaved to reduce the output current ripple.

as Dickson derived topologies or flying capacitor derived topologies; the switched-inductor cells functioning as voltage regulators can be implemented as PWM or resonant converters, such as buck, series-capacitor buck (SCB), and SEPIC converters. One can combine different switched-capacitor and switched-inductor cells to meet specific design requirements of diverse applications.

Figure 3.5 shows an MSC-PoL architecture based on modular "H-bridge" structures. The SC cell is configured as a 2:1 H-bridge circuit with one terminal connected to the input side, one terminal connected to ground, and two intermediate voltage rails each providing a half of the input voltage. Two voltage rails are loaded with switched-inductor circuits that function as voltage regulators and can soft charge and discharge the flying capacitor of the H-bridge SC cell. The MSC-PoL architecture is granular and extendable. One can stack many H-bridge structures to interface with higher voltages (e.g., 96 V, 192 V), or parallel multiple voltage regulator structures to support higher output currents. Redundant switches within the stacked H-bridges or between the SC stage and the switched-inductor stage are merged to reduce component count and power loss [86]. The switched-inductor current sources are operated in an interleaved fashion to decrease the output current ripple.



Figure 3.6: Example implementations of the MSC-PoL architecture: (a) current sources are implemented as parallel multiphase buck converters; (b) current sources are separately regulated to supply different output voltage levels; (c) current sources are tapped into different locations of the stacked SC circuits and can be implemented as different converters, such as multiphase buck and multiphase SCB.

Figure 3.6 shows several example MSC-PoL topologies with sixteen output phases. The 16-phase inductors can be implemented as eight 2-phase coupled inductors, four 4-phase coupled inductors, or one 16-phase coupled inductor. The 16-phase switched-inductor cells can be implemented as multiphase buck (Figs. 3.6a and 3.6b), multiphase SCB, or a hybrid (Fig. 3.6c). Figure 3.6b shows an alternative implementation of the MSC-PoL architecture that can generate multiple output voltages, which could be used for chiplet systems [103]. The current sources are connected in parallel but are separately regulated to supply different output voltage levels. Figure 3.6c shows another example multi-output topology with current sources tapped into different locations of the stacked SC circuits. The switched-inductor current sources connected to higher levels of the SC circuits can provide higher output voltages. Benefiting from the stacked switched-capacitor/inductor structure, capacitor soft charging, parallel coupled magnetics, and interleaving operation, the MSC-PoL architecture has the following advantages:

- Reduced Passive Component Size: The MSC-PoL architecture enables transformerless voltage conversion with extremely high power density because of: 1) reduced capacitor size owing to superior capacitor energy storage density and soft charging; 2) miniaturized magnetic component size by magnetics coupling; and 3) reduced filter size due to the decreased output current ripple resulting from the interleaving operation. The greatly reduced passive component size makes the MSC-PoL architecture a very attractive solution for CPU PwrSiP VRMs.
- Improved Efficiency and Transient Speed: Soft charging the flying capacitors reduces the capacitor charge sharing loss; parallel coupled magnetics with interleaving operation decrease inductor current ripple, reducing both switching loss and conduction loss; the ultra-compact converter size enables PwrSiP voltage regulation with shortened interconnections, reducing the PDN conduction loss. Besides, the reduced coupled inductor current ripple allows the use of smaller leakage inductance with smaller inductive dc energy storage and faster transient speed.
- Automatic Current Sharing and Voltage Balancing: Mutual balancing between capacitor voltages and inductor currents can be achieved during the capacitor charging and discharging processes: 1) the flying capacitor voltage of the H-bridge SC cell and the two following switching cell currents are automatically balanced; 2) the blocking capacitor voltage of the switched-inductor cell



Figure 3.7: (a) Circuit topology and (b) key operation waveforms of the 48-to-1-V MSC-PoL converter. In subfigure (a), one 2:1 H-bridge SC cell is stacked in front and drives two 4-phase SCB cells. GaN FETs are plotted in blue and Silicon MOSFETs are plotted in red. Maximum voltage stress of each switch is labeled aside. In subfigure (b), inductor currents and blocking capacitor voltages of the SCB cell A are plotted. Two SCB cells are interleaved by 180° phase shift as an example.

(e.g., SCB and SEPIC) and parallel phase inductor currents are automatically balanced. Coupled magnetics can also suppress the unbalanced voltages and currents caused by nonideal factors including resistance variation between phases [104], phase shift error [105], and source impedance [106].

### 3.3 A 48-to-1-V MSC-PoL CPU Voltage Regulator

This section presents the operation principles and small-signal models of a 48-V-to-1-V 450-A MSC-PoL converter.

#### 3.3.1 Topology and Operation Principle

Figure 3.7a shows the 48-to-1-V MSC-PoL topology. It consists of one H-bridge SC cell stacked on top of two 4-phase SCB cells. The H-bridge SC cell steps down  $V_{in}$  by half and distributes 24 V to each SCB cell. Two switches at the output terminals of the H-bridge are merged with the input switches of the SCB circuits. Voltage conversion ratios or power ratings can be extended by stacking more H-bridges or paralleling more series-capacitor buck phases [99]. In Fig. 3.7a, the maximum drain-source voltage stress is labeled beside each switch. Switches in the H-bridge SC cell can use high-voltage GaN FETs to undertake high voltage stress, while switches in the SCB cells can utilize low-voltage, low-resistance Silicon MOSFETs to support large output current.

Figure 3.7b plots key steady-state waveforms of the 48-to-1-V MSC-PoL converter. Switches  $S_{0A}$  &  $S_{0B}$  are synchronized with  $S_{1A}$  &  $S_{1B}$ , respectively. High-side and low-side switches of each SCB phase are driven by complementary gate signals, and the four phases of each SCB cell are interleaved by 90° phase shifts. The four interleaved inductors are coupled in parallel, which reduces the inductor current ripple and raises its frequency to four times the switching frequency. In Fig. 3.7b, two SCB cells are operated with a 180° phase shift as an example. Other phase shifts between SCB cells (e.g., 145° or 225°) and alternative coupled inductor solutions (e.g., coupling all eight inductors in parallel) can also be applied to realize eight-phase interleaving with further reduced ripple amplitudes and increased ripple frequency for inductor and output currents. The flying capacitor  $C_{fly}$  in the H-bridge SC cell is soft charged and discharged in turns by the first two SCB phases (i.e., phases 1A and 1B), while the blocking capacitors  $C_{1X\sim 3X}$  in each SCB cell are soft charged and discharged by neighboring inductor currents. As a result, the 48-to-1-V MSC-PoL topology is capable of automatic voltage balancing for all the capacitors and automatic current sharing for all the parallel output branches. Based on inductor volt-second balance, the steady-state output voltage can be expressed as:

$$V_o = \frac{D}{8}V_{in}. \tag{3.1}$$



Figure 3.8: Small-signal circuit model of the 48-to-1-V MSC-PoL converter.

 $D = \frac{1}{6}$  for the 48:1 voltage conversion ratio. As indicated by Eq. (3.1), the steady-state operation of the MSC-PoL converter resembles that of a multiphase buck converter, but with a reduced input voltage of one eighth the original value.
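As a quick numerical check of Eq. (3.1), the short Python sketch below solves for the duty ratio of a given conversion. The function name `required_duty` and the `stack_ratio` parameter are illustrative names introduced here, not part of the original design flow; the comparison against 25% reflects the duty-ratio limit noted for this topology in Section 3.3.2.

```python
from fractions import Fraction

def required_duty(v_in, v_o, stack_ratio=8):
    """Duty ratio solved from Eq. (3.1): V_o = D * V_in / stack_ratio."""
    return Fraction(v_o) * stack_ratio / Fraction(v_in)

duty = required_duty(48, 1)           # 48-to-1-V conversion
within_limit = duty < Fraction(1, 4)  # duty ratio must stay below 25%
print(duty, within_limit)
```

Exact fractions are used only to avoid rounding in the comparison; floating-point arithmetic would serve equally well here.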

#### 3.3.2 Dynamic Modeling and Analysis

This subsection analyzes the converter dynamics based on the small-signal model in Fig. 3.8. For the 4-phase coupled inductor, winding voltages and currents are associated by an inductance matrix:

$$\begin{bmatrix} v_{L1} \\ v_{L2} \\ v_{L3} \\ v_{L4} \end{bmatrix} = \begin{bmatrix} L_{11} & L_{12} & L_{13} & L_{14} \\ L_{21} & L_{22} & L_{23} & L_{24} \\ L_{31} & L_{32} & L_{33} & L_{34} \\ L_{41} & L_{42} & L_{43} & L_{44} \end{bmatrix} \begin{bmatrix} \frac{di_{L1}}{dt} \\ \frac{di_{L2}}{dt} \\ \frac{di_{L3}}{dt} \\ \frac{di_{L4}}{dt} \end{bmatrix}. \tag{3.2}$$

Two effective discrete inductances, the transient inductance  $(L_{tr})$  and the steady-state inductance  $(L_{ss})$ , can be defined, which have the same transient speed and the same current ripple as the coupled inductor, respectively [41]. If the 4-phase coupled inductor is symmetrically coupled, the summation of each column in the inductance matrix is the transient inductance for each phase:  $L_{tr} = \sum_{j=1}^{4} L_{jk}$   $(k = 1 \sim 4)$ .
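The column-sum definition of  $L_{tr}$  can be illustrated numerically. The following Python sketch builds a symmetric 4-phase inductance matrix and checks that, when all phase currents slew at the same rate, each winding behaves like a discrete inductor of value  $L_{tr}$ . The self and mutual inductance values are hypothetical, and the negative mutual term is an assumed sign convention for inversely coupled windings; neither is taken from the prototype.

```python
import numpy as np

def symmetric_inductance_matrix(l_self, l_mutual, phases=4):
    """Symmetric matrix: l_self on the diagonal, l_mutual everywhere else."""
    return l_mutual * np.ones((phases, phases)) + (l_self - l_mutual) * np.eye(phases)

def transient_inductance(l_matrix):
    """Column sums of the inductance matrix, i.e. L_tr of each phase."""
    return l_matrix.sum(axis=0)

# Hypothetical values in henries (not the prototype's parameters)
L = symmetric_inductance_matrix(l_self=200e-9, l_mutual=-50e-9)
L_tr = transient_inductance(L)

# Common-mode transient: every phase current changes at the same di/dt
didt = 1e6                               # A/s, arbitrary
v_windings = L @ (didt * np.ones(4))     # Eq. (3.2) with equal di/dt in all phases
assert np.allclose(v_windings, L_tr * didt)
```

For a symmetric matrix the row sums equal the column sums, which is why the matrix-vector product in the last step reproduces  $L_{tr}\,di/dt$  on every winding.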

Applying switching-cycle averaging and small-signal approximation to the MSC-PoL converter leads to the small-signal circuit model shown in Fig. 3.8. It can be treated as the combination of two SCB small-signal circuits linked by the flying capacitor  $C_{fly}$ . According to Fig. 3.8, the following modeling equations can be obtained:

$$\begin{cases}
D\left(\hat{v}_{Cfly} - \hat{v}_{C1A}\right) + \left(V_{Cfly} - V_{C1A}\right)\hat{d} - s\sum_{k=1}^{4} L_{1k}\hat{i}_{LkA} = \hat{v}_{o}, \\
D\left(\hat{v}_{C1A} - \hat{v}_{C2A}\right) + \left(V_{C1A} - V_{C2A}\right)\hat{d} - s\sum_{k=1}^{4} L_{2k}\hat{i}_{LkA} = \hat{v}_{o}, \\
D\left(\hat{v}_{C2A} - \hat{v}_{C3A}\right) + \left(V_{C2A} - V_{C3A}\right)\hat{d} - s\sum_{k=1}^{4} L_{3k}\hat{i}_{LkA} = \hat{v}_{o}, \\
D\cdot\hat{v}_{C3A} + V_{C3A}\cdot\hat{d} - s\sum_{k=1}^{4} L_{4k}\hat{i}_{LkA} = \hat{v}_{o},
\end{cases} \tag{3.3}$$

$$\begin{cases}
D\left(\hat{v}_{in} - \hat{v}_{Cfly} - \hat{v}_{C1B}\right) + (V_{in} - V_{Cfly} - V_{C1B})\hat{d} - s \sum_{k=1}^{4} L_{1k}\hat{i}_{LkB} = \hat{v}_{o}, \\
D\left(\hat{v}_{C1B} - \hat{v}_{C2B}\right) + (V_{C1B} - V_{C2B})\hat{d} - s \sum_{k=1}^{4} L_{2k}\hat{i}_{LkB} = \hat{v}_{o}, \\
D\left(\hat{v}_{C2B} - \hat{v}_{C3B}\right) + (V_{C2B} - V_{C3B})\hat{d} - s \sum_{k=1}^{4} L_{3k}\hat{i}_{LkB} = \hat{v}_{o}, \\
D\cdot\hat{v}_{C3B} + V_{C3B}\cdot\hat{d} - s \sum_{k=1}^{4} L_{4k}\hat{i}_{LkB} = \hat{v}_{o}.
\end{cases} \tag{3.4}$$

By summing up the equations in (3.3) - (3.4), impacts of the flying capacitor and blocking capacitors are eliminated. The overall converter dynamics are modeled as:

$$D \cdot \hat{v}_{in} + V_{in} \cdot \hat{d} - (R_{eq} + sL_{tr}) \underbrace{\sum_{k=1}^{4} \left( \hat{i}_{LkA} + \hat{i}_{LkB} \right)}_{\hat{i}_{o}} = 8\hat{v}_{o}.$$
 (3.5)
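The summation step can be written out explicitly. The lines below are a sketch of the intermediate algebra, assuming a symmetrically coupled inductor so that every column of the inductance matrix sums to  $L_{tr}$ ; the capacitor terms telescope within each group of four equations. The resistive term  $R_{eq}$  does not appear in (3.3) - (3.4) and is added in (3.5) to account for losses.

$$\begin{aligned}
\text{sum of (3.3):}\quad & D\,\hat{v}_{Cfly} + V_{Cfly}\,\hat{d} - s\sum_{j=1}^{4}\sum_{k=1}^{4} L_{jk}\hat{i}_{LkA} = 4\hat{v}_{o}, \\
\text{sum of (3.4):}\quad & D\left(\hat{v}_{in} - \hat{v}_{Cfly}\right) + \left(V_{in} - V_{Cfly}\right)\hat{d} - s\sum_{j=1}^{4}\sum_{k=1}^{4} L_{jk}\hat{i}_{LkB} = 4\hat{v}_{o}, \\
\text{total:}\quad & D\,\hat{v}_{in} + V_{in}\,\hat{d} - sL_{tr}\sum_{k=1}^{4}\left(\hat{i}_{LkA} + \hat{i}_{LkB}\right) = 8\hat{v}_{o}, \qquad L_{tr} = \sum_{j=1}^{4} L_{jk}.
\end{aligned}$$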

 $R_{eq}$  is the equivalent series resistance at each phase that captures the power losses. Based on (3.5), the input-to-output and control-to-output transfer functions are:

$$G_{v_{in}v_{o}} = \frac{\hat{v}_{o}}{\hat{v}_{in}} = \frac{DR_{o}}{L_{tr}R_{o}C_{o}} \cdot \frac{1}{s^{2} + 2\xi\omega_{n}s + \omega_{n}^{2}},$$

$$G_{dv_{o}} = \frac{\hat{v}_{o}}{\hat{d}} = \frac{V_{in}R_{o}}{L_{tr}R_{o}C_{o}} \cdot \frac{1}{s^{2} + 2\xi\omega_{n}s + \omega_{n}^{2}},$$

$$\omega_{n} = \sqrt{\frac{R_{eq} + 8R_{o}}{L_{tr}R_{o}C_{o}}}, \quad \xi = \frac{L_{tr} + R_{eq}R_{o}C_{o}}{2\sqrt{L_{tr}R_{o}C_{o}(R_{eq} + 8R_{o})}}.$$
(3.6)
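The closed-form  $\omega_n$  and  $\xi$  in (3.6) can be cross-checked against the poles obtained directly from (3.5). The Python sketch below does this under the assumption that the summed inductor current feeds the parallel  $R_o$ - $C_o$  load, i.e.  $\hat{i}_o = \hat{v}_o (1/R_o + sC_o)$ . All component values are hypothetical and chosen only to exercise the formulas.

```python
import numpy as np

def msc_pol_second_order(l_tr, c_o, r_o, r_eq):
    """Natural frequency (rad/s) and damping ratio from Eq. (3.6)."""
    omega_n = np.sqrt((r_eq + 8 * r_o) / (l_tr * r_o * c_o))
    xi = (l_tr + r_eq * r_o * c_o) / (2 * np.sqrt(l_tr * r_o * c_o * (r_eq + 8 * r_o)))
    return omega_n, xi

def poles_from_eq_3_5(l_tr, c_o, r_o, r_eq):
    """Poles of Eq. (3.5) after substituting i_o = v_o * (1/R_o + s*C_o).

    Characteristic polynomial: (R_eq + s*L_tr) * (1 + s*R_o*C_o) + 8*R_o = 0.
    """
    coeffs = [l_tr * r_o * c_o, l_tr + r_eq * r_o * c_o, r_eq + 8 * r_o]
    return np.roots(coeffs)

# Hypothetical parameters: 35 nH, 1 mF, 1 V / 225 A load, 2 mOhm per phase
params = dict(l_tr=35e-9, c_o=1e-3, r_o=1 / 225, r_eq=2e-3)
omega_n, xi = msc_pol_second_order(**params)
poles = poles_from_eq_3_5(**params)

# For s^2 + 2*xi*omega_n*s + omega_n^2: pole product = omega_n^2, pole sum = -2*xi*omega_n
assert np.isclose(np.prod(poles).real, omega_n**2)
assert np.isclose(-np.sum(poles).real, 2 * xi * omega_n)
```

Comparing the pole product and sum avoids having to distinguish between real and complex pole pairs, so the check holds for both underdamped and overdamped parameter choices.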

Eqs. (3.5) - (3.6) indicate that the overall system dynamics and transfer functions of the MSC-PoL converter are the same as those of a multiphase buck with  $\frac{v_{in}}{8}$  input voltage and  $\frac{L_{tr}}{8}$  output inductance. Therefore, it can be controlled by typical control methods for a multiphase buck (e.g., voltage mode control), except that the duty ratio is limited to below 25%, which might restrict its maximum transient speed.

### 3.4 Interphase L-C Resonance and Stability Analysis

As shown in Fig. 3.8, the MSC-PoL converter containing multiple inductors and capacitors is a higher-order PWM converter. The resulting L-C resonant poles might influence its control stability and transient performance. Since the MSC-PoL converter has a small-signal model similar to that of a multiphase SCB converter (Fig. 3.9), the intrinsic L-C resonant behavior can be studied in the same way as for the SCB converter. However, previous large-/small-signal models of the SCB converter [107,108] only explain the overall converter dynamics, and interphase dynamics were not investigated.



Figure 3.9: Circuit topology and operation waveforms of an example two-phase series-capacitor buck converter with discrete inductors. The maximum switch voltage stress is labeled in red. Coupled inductors can be utilized to replace the discrete ones, and phase number can be extended by stacking more series-capacitor buck cells [54, 108], as indicated by the grey lines and grey dots.

Ref. [104] unveiled the interphase L-C resonance whose damping ratio is proportional to the conduction-path resistance. Therefore, a well-designed high efficiency converter with low conduction loss might result in an underdamped system with long settling time and large resonant amplitude. Models and design methods for describing and mitigating the interphase L-C resonance are still needed.

This section presents a systematic analysis of the intrinsic L-C resonance by decomposing a disturbance and its response into common-mode and differential-mode dynamics, clarifying the underlying mechanisms of the L-C resonant behaviors. The analysis below starts with discrete inductors, and the impacts of coupled inductors are discussed in Section 3.4.1. In a two-phase SCB topology (Fig. 3.9), the blocking capacitor  $(C_B)$  functions as a dc voltage source with  $v_{in}/2$  across it. Switch node voltages step between 0 and  $v_{in}/2$ , doubling the duty ratio compared to a regular buck. Two phases are typically interleaved, and  $C_B$  is charged and discharged by the inductor currents of the two phases alternately as their high-side switches turn on. The large-signal average model of the two-phase SCB converter is described in Fig. 3.10 together with the modeling equations and their equivalent



Figure 3.10: Large-signal average model and its equivalent circuit model.



Figure 3.11: Input voltage disturbance and its response decomposed into: (a) common-mode dynamics; (b) differential-mode dynamics.

circuits. Conduction-path resistances (including switch  $R_{ds}$ , capacitor ESR, inductor winding resistance, etc.) are lumped into an effective resistance  $R_C$  in series with  $C_B$ .

The load transient dynamics of an SCB converter are similar to those of a multiphase buck converter and have been discussed in [54]. However, the line transient dynamics and their impacts on flying capacitor voltage and current sharing have not been systematically explored. The input voltage step change of a line transient results in blocking capacitor voltage variation and causes ringing and long settling time, which are the main focus of this section. Similar analysis methods can be applied to describe the responses to other perturbations, such as duty ratio change, unbalanced initial conditions, load transients, etc.

Assume  $d_1 = d_2 = D$ ,  $L_1 = L_2 = L$ . The input voltage perturbation  $\tilde{v_{in}}$  can be decomposed into common mode  $\{+\frac{\tilde{v_{in}}}{2}, +\frac{\tilde{v_{in}}}{2}\}$  and differential mode  $\{+\frac{\tilde{v_{in}}}{2}, -\frac{\tilde{v_{in}}}{2}\}$  for the two phases, as illustrated in Fig. 3.11. The common-mode perturbations drive the two phases to change in the same way, while the differential-mode perturbations cause opposite variations on the two phases. The resulting differential inductor currents  $\pm \frac{\Delta \tilde{i_L}}{2}$  cancel at the output, so the output current is carried entirely by the common-mode response, which is  $+\frac{\tilde{i_o}}{2}$  for each inductor. The overall current response of each inductor is:

$$\tilde{i_{L_1}} = \frac{1}{2}\tilde{i_o} + \frac{1}{2}\Delta\tilde{i_L}, \qquad \tilde{i_{L_2}} = \frac{1}{2}\tilde{i_o} - \frac{1}{2}\Delta\tilde{i_L}. \tag{3.7}$$

Applying state-space averaging, the  $\tilde{v_{in}}$ -to- $\tilde{i_o}$  transfer function is:

$$G_{v_{in}i_o} = \frac{\tilde{i_o}}{\tilde{v_{in}}} = \frac{D}{2R_o + DR_C} \cdot \frac{\frac{s}{\omega_z} + 1}{\frac{s^2}{\omega_{nop}^2} + \frac{s}{Q_{op}\omega_{nop}} + 1},$$
 (3.8)

$$\omega_{nop} = \sqrt{\frac{2R_o + DR_C}{R_o C_o L}}, \quad Q_{op} = \frac{\sqrt{R_o C_o L \left(2R_o + DR_C\right)}}{L + R_C R_o C_o D}, \quad \omega_z = \frac{1}{R_o C_o}. \tag{3.9}$$

Accordingly, the line transient  $\tilde{v_{in}}$ -to- $\tilde{v_o}$  transfer function is:

$$G_{v_{in}v_o} = \frac{\tilde{v_o}}{\tilde{v_{in}}} = G_{v_{in}i_o} \cdot Z_o, \quad Z_o = \frac{R_o}{R_o C_o s + 1}.$$
 (3.10)

Similarly, the  $\tilde{v_{in}}$ -to- $\Delta \tilde{i_L}$  transfer function is:

$$G_{v_{in}\Delta i_L} = \frac{\Delta\tilde{i_L}}{\tilde{v_{in}}} = \frac{C_B}{2D} \cdot \frac{s}{\frac{s^2}{\omega_{nip}^2} + \frac{s}{Q_{ip}\omega_{nip}} + 1}, \tag{3.11}$$

$$\omega_{nip} = D\sqrt{\frac{2}{LC_B}}, \quad Q_{ip} = \frac{1}{R_C}\sqrt{\frac{2L}{C_B}}.$$
 (3.12)

It can be seen from (3.8) and (3.11) that there exist two types of intrinsic L-C resonances in SCB converter dynamic responses:

1. Output L- $C_o$  resonance with  $\omega_{nop} \& Q_{op}$ : higher  $R_o$  leads to a higher  $Q_{op}$  and lower damping ratio.

2. Interphase L- $C_B$  resonance with  $\omega_{nip}$  &  $Q_{ip}$ : higher  $R_C$  results in a lower  $Q_{ip}$  and higher damping ratio.
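The dependence of the interphase resonance on  $R_C$  can be checked numerically. The following Python sketch evaluates (3.12) and confirms that the result matches the roots of the denominator in (3.11). The values of  $L$ ,  $C_B$ , and  $D$  are borrowed from the 3-phase example in the Fig. 3.12 caption, while the value of  $R_C$  is hypothetical; the damping-ratio relation  $\zeta = 1/(2Q)$  is the standard second-order identity.

```python
import numpy as np

def interphase_resonance(l, c_b, r_c, d):
    """omega_nip (rad/s) and Q_ip of a two-phase SCB converter, Eq. (3.12)."""
    omega_nip = d * np.sqrt(2 / (l * c_b))
    q_ip = np.sqrt(2 * l / c_b) / r_c
    return omega_nip, q_ip

# L, C_B, D from the Fig. 3.12 caption; R_C = 5 mOhm is a hypothetical value
omega_nip, q_ip = interphase_resonance(l=50e-9, c_b=30e-6, r_c=5e-3, d=1 / 6)
zeta_ip = 1 / (2 * q_ip)              # damping ratio of the interphase resonance

# Roots of the denominator of Eq. (3.11): s^2/w^2 + s/(Q*w) + 1 = 0
poles = np.roots([1 / omega_nip**2, 1 / (q_ip * omega_nip), 1.0])
assert np.allclose(np.abs(poles), omega_nip)

# Doubling R_C halves Q_ip, i.e. doubles the damping ratio
_, q_ip_doubled = interphase_resonance(l=50e-9, c_b=30e-6, r_c=10e-3, d=1 / 6)
assert np.isclose(q_ip_doubled, q_ip / 2)
```

With a conduction-path resistance in the milliohm range, the sketch yields a quality factor well above unity, consistent with the observation that an efficient, low-resistance design tends to be underdamped.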

Dynamic responses of common-mode variables (e.g.,  $\tilde{i_o}$  and  $\tilde{v_o}$ ) see only the output L- $C_o$  resonant pole, and their associated transfer functions (e.g.,  $G_{v_{in}i_o}$ ,  $G_{v_{in}v_o}$ , and  $G_{dv_o}$ ) are the same as those of a regular two-phase buck. In contrast, responses of differential-mode variables (e.g.,  $\Delta \tilde{i_L}$  and  $\tilde{v_C}$ ) see only the interphase L- $C_B$  resonant pole, and their transfer functions (e.g.,  $G_{v_{in}v_c}$ ,  $G_{v_{in}\Delta i_L}$ , and  $G_{\Delta d\Delta i_L}$ ) differ from those of the buck converter. According to (3.7), the responses of  $\tilde{i_{L_1}}$  and  $\tilde{i_{L_2}}$  contain both common-mode and differential-mode dynamics and see both the output L- $C_o$  and interphase L- $C_B$  resonant poles. The  $\tilde{v_{in}}$ -to- $\tilde{i_{L_1}}$  and  $\tilde{v_{in}}$ -to- $\tilde{i_{L_2}}$  transfer functions can be obtained by combining  $G_{v_{in}i_o}$  and  $G_{v_{in}\Delta i_L}$ . The same analysis approach and conclusions also apply to SCB converters with a higher number of phases.

For a general M-phase SCB converter, transfer functions can be derived through state-space modeling. An M-phase SCB converter contains M inductors, M-1 blocking capacitors, and one output capacitor, so there are 2M state variables. Select the state vector as  $\mathbf{x} = [i_{L_1}, i_{L_2}, \dots, i_{L_M}, v_{C_1}, v_{C_2}, \dots, v_{C_{M-1}}, v_o]^T$ , the input vector as  $\mathbf{u} = [v_{in}]$ , and the output vector as  $\mathbf{y} = [i_{L_1}, i_{L_2}, \dots, i_{L_M}, i_o, v_o]^T$ . Applying switching-cycle averaging, the state-space model can be obtained as:

$$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u},\tag{3.13}$$

$$\mathbf{y} = \mathbf{E}\mathbf{x},\tag{3.14}$$

where the coefficient matrix **A** is:

$$\mathbf{A} = \begin{bmatrix} \mathbf{0}_{M \times M} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix}, \tag{3.15}$$

and the block matrices  $A_{12}$ ,  $A_{21}$ , and  $A_{22}$  are:

$$\mathbf{A_{12}} = \begin{bmatrix} \frac{-D}{L_1} & 0 & \cdots & 0 & \frac{-1}{L_1} \\ \frac{D}{L_2} & \frac{-D}{L_2} & & \vdots & \frac{-1}{L_2} \\ 0 & \ddots & \ddots & 0 & \vdots \\ \vdots & & \frac{D}{L_{M-1}} & \frac{-D}{L_{M-1}} & \frac{-1}{L_{M-1}} \\ 0 & \cdots & 0 & \frac{D}{L_M} & \frac{-1}{L_M} \end{bmatrix}, \mathbf{A_{21}} = \begin{bmatrix} \frac{D}{C_{B_1}} & \frac{-D}{C_{B_1}} & 0 & \cdots & 0 \\ 0 & \frac{D}{C_{B_2}} & \frac{-D}{C_{B_2}} & & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \frac{D}{C_{B_{M-1}}} & \frac{-D}{C_{B_{M-1}}} \\ \frac{1}{C_o} & \frac{1}{C_o} & \cdots & \frac{1}{C_o} & \frac{1}{C_o} \end{bmatrix},$$

$$\mathbf{A_{22}} = \begin{bmatrix} \mathbf{0}_{M-1 \times M-1} & \mathbf{0}_{M-1 \times 1} \\ \mathbf{0}_{1 \times M-1} & -\frac{1}{C_o R_o} \end{bmatrix}.$$

$$(3.16)$$

The coefficient matrices **B** and **E** are:

$$\mathbf{B} = \begin{bmatrix} \frac{D}{L_1} & \mathbf{0}_{1 \times (2M - 1)} \end{bmatrix}^T, \quad \mathbf{E} = \begin{bmatrix} \mathbf{I}_{M \times M} & \mathbf{0}_{M \times M} \\ 1, 1, \dots, 1 & 0, 0, \dots, 0 \\ 0, 0, \dots, 0 & 0, \dots, 0, 1 \end{bmatrix}. \tag{3.17}$$

Accordingly, the transfer functions can be derived as:

$$\begin{cases}
G_{v_{in}i_{L_k}} = \frac{\tilde{i_{L_k}}}{\tilde{v_{in}}} = (\mathbf{E})^{\text{row } k} \cdot (s\mathbf{I} - \mathbf{A})^{-1}\mathbf{B}, \\
G_{v_{in}i_o} = \frac{\tilde{i_o}}{\tilde{v_{in}}} = (\mathbf{E})^{\text{row } M+1} \cdot (s\mathbf{I} - \mathbf{A})^{-1}\mathbf{B}, \\
G_{v_{in}v_o} = \frac{\tilde{v_o}}{\tilde{v_{in}}} = (\mathbf{E})^{\text{row } M+2} \cdot (s\mathbf{I} - \mathbf{A})^{-1}\mathbf{B}.
\end{cases} \tag{3.18}$$
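As a numerical cross-check of Eqs. (3.13)-(3.18), the following minimal sketch builds the averaged matrices for the special case of identical inductors and identical blocking capacitors, then evaluates the transfer functions at a chosen frequency. The function names `scb_state_space` and `transfer` are illustrative and not part of the original work; the component values are those of the 3-phase example quoted in the caption of Fig. 3.12.

```python
import numpy as np

def scb_state_space(M, D, L, C_B, C_o, R_o):
    """Averaged A, B, E of an M-phase SCB converter (identical L and C_B assumed).

    State order: [i_L1..i_LM, v_C1..v_C(M-1), v_o]; input: v_in;
    output order: [i_L1..i_LM, i_o, v_o].
    """
    n = 2 * M
    A = np.zeros((n, n))
    for k in range(M):                      # inductor rows (block A12)
        if k > 0:
            A[k, M + k - 1] = D / L         # +D * v_C(k-1)
        if k < M - 1:
            A[k, M + k] = -D / L            # -D * v_Ck
        A[k, n - 1] = -1.0 / L              # -v_o
    for k in range(M - 1):                  # blocking-capacitor rows (block A21)
        A[M + k, k] = D / C_B
        A[M + k, k + 1] = -D / C_B
    A[n - 1, :M] = 1.0 / C_o                # output-capacitor row
    A[n - 1, n - 1] = -1.0 / (C_o * R_o)    # block A22
    B = np.zeros((n, 1))
    B[0, 0] = D / L                         # v_in only drives phase 1
    E = np.zeros((M + 2, n))
    E[:M, :M] = np.eye(M)                   # individual inductor currents
    E[M, :M] = 1.0                          # i_o = sum of inductor currents
    E[M + 1, n - 1] = 1.0                   # v_o
    return A, B, E

def transfer(A, B, E, f_hz):
    """Evaluate E (sI - A)^-1 B at s = j*2*pi*f (column of M+2 complex gains)."""
    s = 2j * np.pi * f_hz
    return E @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)

# 3-phase example values from the Fig. 3.12 caption
M, D, L, C_B, C_o, R_o = 3, 1 / 6, 50e-9, 30e-6, 100e-6, 1.0
A, B, E = scb_state_space(M, D, L, C_B, C_o, R_o)
dc_gain = (E @ np.linalg.solve(-A, B)).ravel()   # s = 0
poles = np.linalg.eigvals(A)
print("dc gain v_in -> v_o:", dc_gain[M + 1])
print("pole frequencies (kHz):",
      np.unique(np.round(np.abs(poles.imag) / (2e3 * np.pi), 1)))
```

In this sketch the dc gain from the input voltage to the output voltage evaluates to D/M, and the imaginary parts of the eigenvalues of **A** give the interphase and output resonant frequencies directly.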

An intuitive way of deriving the response to a perturbation is by superposing the responses to its common-mode and differential-mode components. Fig. 3.12a plots the small-signal average model of an M-phase SCB converter and its equivalent circuits seen by common-mode and differential-mode perturbations individually. Denote the effective perturbations applied to each phase as  $\tilde{v_1} \sim \tilde{v_M}$ , which can represent input voltage or converted duty ratio perturbations (e.g.,  $\tilde{v_1} = \tilde{v_{in}}$ ,  $\tilde{v_{2\sim M}} = 0$  representing



Figure 3.12: (a) Response decomposition of common-mode and differential-mode dynamics for a general M-phase SCB converter ( $R_C$  is ignored here). (b) The  $\tilde{v_{in}}$ -to- $\tilde{i_L}$  transfer functions of an example 3-phase SCB converter, where L=50 nH,  $C_{B1,2}=30~\mu\text{F},~C_o=100~\mu\text{F},~R_o=1~\Omega,~D=\frac{1}{6}$ .

the input voltage perturbation). As shown in Fig. 3.12a, the common-mode perturbation component $\tilde{v_{cm}}$ ($\tilde{v_{cm}} = \frac{1}{M}\sum_{k=1}^{M} \tilde{v_{k}}$) is effectively applied to an output L-$C_o$-$R_o$ network with a resonant frequency of $\omega_{nop} = \sqrt{\frac{M}{LC_o}}$. The same holds for an output load transient, which contains only a common-mode perturbation component.

As for differential-mode perturbations, the incurred variations are canceled at the output, so the output terminals are effectively shorted in the small-signal average

$$\begin{cases} (v_{in} - v_C - i_{L1}R_C) \cdot d_1 = v_{out} + \left( L_S \frac{di_{L1}}{dt} + L_M \frac{di_{L2}}{dt} \right) \\ (v_C - i_{L2}R_C) \cdot d_2 = v_{out} + \left( L_M \frac{di_{L1}}{dt} + L_S \frac{di_{L2}}{dt} \right) \\ i_{L1}d_1 - i_{L2}d_2 = C_B \frac{dv_C}{dt} \end{cases}$$

(a)

$$\begin{cases} L_S + L_M = L_k \\ L_S - L_M = (1 + \beta)L_k \end{cases}$$

(b)

Figure 3.13: (a) Large-signal average model of using coupled inductors. (b) Equivalent circuit and parameter conversion for a two-phase coupled inductor.

model, resulting in an equivalent M-level L-$C_B$ ladder network. The transfer function of this L-C ladder circuit can be determined using DFFz triangles [109]; it contains up to M-1 resonant poles (assuming all $C_{B,k}$ are identical):

$$\omega_{nip,k} = \frac{2D}{\sqrt{LC_B}} \sin\left(\frac{k\pi}{2M}\right), \quad k = 1 \sim M - 1. \tag{3.19}$$
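The closed-form resonant frequencies can be evaluated directly. The sketch below applies Eq. (3.19) and the common-mode expression $\omega_{nop} = \sqrt{M/(LC_o)}$ to the 3-phase example values quoted in the caption of Fig. 3.12; the helper names are illustrative only.

```python
import math

def interphase_resonances(M, D, L, C_B):
    """Eq. (3.19): interphase L-C_B resonant frequencies (rad/s), k = 1..M-1."""
    base = 2.0 * D / math.sqrt(L * C_B)
    return [base * math.sin(k * math.pi / (2 * M)) for k in range(1, M)]

def output_resonance(M, L, C_o):
    """Common-mode L-C_o resonant frequency (rad/s)."""
    return math.sqrt(M / (L * C_o))

w_ip = interphase_resonances(M=3, D=1 / 6, L=50e-9, C_B=30e-6)
w_op = output_resonance(M=3, L=50e-9, C_o=100e-6)
for k, w in enumerate(w_ip, start=1):
    print(f"interphase pole {k}: {w / (2 * math.pi) / 1e3:.1f} kHz")
print(f"output pole: {w_op / (2 * math.pi) / 1e3:.1f} kHz")
```

For M = 3 the two interphase frequencies differ by a factor of $\sqrt{3}$, since $\sin(\pi/3)/\sin(\pi/6) = \sqrt{3}$.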

Lumping the common-mode and the differential-mode dynamic responses yields the overall response. Fig. 3.12b shows the  $\tilde{v_{in}}$ -to- $\tilde{i_L}$  transfer functions of an example 3-phase SCB converter, which has two interphase L- $C_B$  resonant poles and one output L- $C_o$  resonant pole, as expected.

# 3.4.1 Impacts of Coupled Inductors

Coupled inductors that exhibit different inductances to common-mode and differential-mode excitations can improve inductor current sharing and capacitor voltage balancing for multiphase hybrid switched-capacitor-magnetic topologies. This subsection discusses the impacts of coupled inductors on intrinsic resonance of the SCB converter. Fig. 3.13 shows the large-signal average model with coupled inductors.  $L_S$  and  $L_M$  are self and mutual inductances in the inductance matrix, and  $L_k$  and  $\beta$  are effective leakage inductance and coupling coefficient as defined in [96]. Table 3.1 lists the parameters of an example two-phase SCB for all calculations

Table 3.1: Parameters of a Two-Phase SCB Converter

| $V_{in}$ | $V_{out}$ | $f_{sw}$ | $C_B$ | $R_C$ | $L_k$ | $C_o$ | $R_o$ |
|----------|-----------|----------|-------|-------|-------|-------|-------|
| 12 V | 1 V | 1 MHz | 30 $\mu$F | 3 m$\Omega$ | 50 nH | 100 $\mu$F | 20 m$\Omega$ |



Figure 3.14: Bode plots of  $G_{v_{in}\Delta i_L}$  with different coupling coefficients.

and simulations in Sections 3.4.1 and 3.4.2, unless otherwise specified. A higher $\beta$ indicates a higher coupling coefficient for the coupled inductor. When using coupled inductors, the transfer functions $G_{v_{in}i_o}$, $G_{v_{in}v_o}$, and $G_{v_{in}\Delta i_L}$ have the same expressions as in Eqs. (3.8)-(3.12), except that $\omega_n$ and $Q$ become:

$$\omega_{nop} = \sqrt{\frac{2R_o + DR_C}{R_o C_o L_k}}, \quad Q_{op} = \frac{\sqrt{R_o C_o L_k (2R_o + DR_C)}}{L_k + R_C R_o C_o D}, \tag{3.20}$$

$$\omega_{nip} = D\sqrt{\frac{2}{(1+\beta)L_kC_B}}, \quad Q_{ip} = \frac{1}{R_C}\sqrt{\frac{2(1+\beta)L_k}{C_B}}. \tag{3.21}$$

Accordingly, common-mode or output dynamics will see a small inductance $L_k$, while the differential-mode or interphase dynamics will see a large inductance $(1 + \beta)L_k$. If $L_k$ is fixed (i.e., under the same transient speed), $\beta$ will only influence differential-mode dynamics. As shown in Fig. 3.14, a larger coefficient $\beta$ results in a lower interphase resonant frequency $\omega_{nip}$ and a higher quality factor $Q_{ip}$, but the gain at



Figure 3.15: Simulated and calculated  $\Delta i_L$  during a line transient ( $v_{in}$  steps from 12 V to 14 V) when using (a) discrete inductors ( $\beta = 0$ ) and (b) a coupled inductor ( $\beta = 5$ ).

resonance remains unchanged as $\frac{1}{R_C}$. When $\beta$ increases, the higher $Q$ with a narrower high-gain bandwidth may benefit the line transient response, since a $v_{in}$ step change contains multiple frequency components. In the frequency domain, a $v_{in}$ step change is $\tilde{v_{in}} = \frac{U}{s}$ ($U$ is the step amplitude), and the $\Delta i_L$ response is $\Delta \tilde{i_L} = G_{v_{in}\Delta i_L} \cdot \frac{U}{s}$. Accordingly, its time-domain response is:

$$\Delta i_L(t) = \mathcal{L}^{-1} \left\{ G_{v_{in}\Delta i_L} \cdot \frac{U}{s} \right\} = A \cdot e^{-\sigma t} \sin(\omega_d t), \tag{3.22}$$

$$A = 2U\sqrt{\frac{C_B}{8(1+\beta)L_k - R_C^2 C_B}}, \quad \sigma = \frac{DR_C}{2(1+\beta)L_k},$$

$$\omega_d = \frac{D}{2(1+\beta)L_k}\sqrt{\frac{8(1+\beta)L_k - R_C^2 C_B}{C_B}}. \tag{3.23}$$

Fig. 3.15 shows the simulated and calculated responses of  $\Delta i_L$  to an input voltage step change, in which the calculated results match well with the simulated ones,



Figure 3.16: Block diagram of an SCB converter with typical voltage-mode control.

validating the analysis. The 2% settling time of the $\Delta i_L$ envelope is $t_s = \frac{4}{\sigma}$. Fig. 3.15 also indicates that using coupled inductors can effectively suppress the amplitude of interphase resonance for SCB converters, at the cost of increased settling time. This feature fundamentally comes from the larger effective inductance $(1 + \beta)L_k$ for differential-mode (i.e., interphase) dynamics.
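The amplitude and settling-time tradeoff can be quantified with a short calculation. The sketch below evaluates Eq. (3.23) and $t_s = 4/\sigma$ for the Table 3.1 parameters and the 12 V to 14 V line step of Fig. 3.15 ($U = 2$ V). The duty ratio $D = 1/6$ is an assumption here, inferred from the 12-to-1-V two-phase conversion, and the function name is illustrative.

```python
import math

def line_step_response(U, D, L_k, beta, C_B, R_C):
    """Return (A, sigma, omega_d, t_s) per Eq. (3.23), with t_s = 4 / sigma."""
    L_dm = (1.0 + beta) * L_k                 # differential-mode inductance
    disc = 8.0 * L_dm - R_C ** 2 * C_B
    if disc <= 0:
        raise ValueError("response is not underdamped; Eq. (3.22) does not apply")
    A = 2.0 * U * math.sqrt(C_B / disc)       # envelope amplitude (A)
    sigma = D * R_C / (2.0 * L_dm)            # decay rate (1/s)
    omega_d = D / (2.0 * L_dm) * math.sqrt(disc / C_B)  # damped frequency (rad/s)
    return A, sigma, omega_d, 4.0 / sigma

# Table 3.1 values; D = 1/6 is an assumed operating point
params = dict(U=2.0, D=1 / 6, L_k=50e-9, C_B=30e-6, R_C=3e-3)
results = {beta: line_step_response(beta=beta, **params) for beta in (0, 5)}
for beta, (A, sigma, omega_d, t_s) in results.items():
    print(f"beta={beta}: A={A:.1f} A, f_d={omega_d / (2 * math.pi) / 1e3:.1f} kHz, "
          f"t_s={t_s * 1e3:.2f} ms")
```

Under these assumptions the envelope amplitude scales approximately as $1/\sqrt{1+\beta}$, while the settling time scales as $(1+\beta)$, which is the tradeoff described above.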

### 3.4.2 Influence on Control Stability

This subsection explains the impacts of intrinsic resonance on control stability when the SCB converter is controlled in voltage mode. A typical multiphase PWM voltage-mode controller generates identical duty ratio command for each phase by sensing  $\tilde{v}_o$ . Fig. 3.16 plots its block diagram.  $H_s$  and  $A_c$  are transfer functions for sampling and compensation networks, respectively. As implied by the equivalent circuit model in Fig. 3.10, the identical duty commands will cause common-mode variations (e.g.,  $\tilde{i}_o$ ), but will not incur differential-mode variations (e.g.,  $\Delta \tilde{i}_L$  and  $\tilde{v}_c$ ). Substituting  $d_1 = d_2 = D + \tilde{d}$  into the average model, the  $\tilde{d}$ -to- $\tilde{i}_o$  transfer function is:

$$G_{di_o} = \frac{\tilde{i_o}}{\tilde{d}} = \frac{V_{in} - I_o R_C}{2R_o + DR_C} \cdot \frac{\frac{s}{\omega_z} + 1}{\frac{s^2}{\omega_{nop}^2} + \frac{s}{Q_{op}\omega_{nop}} + 1}, \tag{3.24}$$

where  $\omega_z$ ,  $\omega_{nop}$ , and  $Q_{op}$  are for output L- $C_o$ - $R_o$  network and are the same as in (3.9) or (3.20).  $G_{dv_o} = G_{di_o} \cdot Z_o$ . Since the sensed  $\tilde{v_o}$  is also a common-mode variable, the overall feedback loop only senses and affects common-mode dynamics; it will



Figure 3.17: Measured open and closed loop transfer functions in SPICE simulations: (a) open loop  $G_{dv_o}$  & loop gain, (b)  $G_{v_{in}i_o}$ , (c)  $G_{v_{in}v_o}$ , and (d)  $G_{v_{in}\Delta i_L}$ . ( $\beta = 0$ )

not be influenced by or have impacts on differential-mode dynamics. Consequently, the interphase L-$C_B$ resonance does not affect control stability; a stable loop design only needs to consider the output L-$C_o$ resonance, as in a multiphase buck. Similar conclusions can be drawn for other control methods that sense the common-mode dynamics and generate identical commands for all the phases.

Fig. 3.17 shows the simulated open loop and closed loop transfer functions for an example SCB converter with a typical voltage-mode controller. Denote the loop gain as $T = Z_o \cdot H_s \cdot A_c \cdot G_{di_o}$. A smaller $L_k$ results in a larger $\omega_{nop}$ in $G_{di_o}$, allowing a higher loop-gain bandwidth and thus a faster transient response. The responses of $\tilde{i}_o$ and $\tilde{v}_o$ are involved in the loop, so the closed loop gains of $G_{v_{in}i_o}$ and $G_{v_{in}v_o}$ are greatly suppressed; the responses of $\Delta \tilde{i}_L$ are not affected by the loop, so the closed loop gain of $G_{v_{in}\Delta i_L}$ is unchanged:

$$(G_{v_{in}i_o})^{\text{CL}} = \frac{G_{v_{in}i_o}}{1+T}, \quad (G_{v_{in}v_o})^{\text{CL}} = \frac{G_{v_{in}v_o}}{1+T},$$

$$(G_{v_{in}\Delta i_L})^{\text{CL}} = G_{v_{in}\Delta i_L}. \tag{3.25}$$

Equation (3.25) indicates that the voltage-mode control loop can restrain the output variation, but it cannot suppress the interphase resonance. Therefore, as shown in Fig. 3.18, while the output voltage and current are effectively controlled to remain stable against a line transient, the resonances of $v_c$ and $\Delta i_L$ are still left underdamped with high amplitude and long settling time.

Actively controlling  $\Delta i_L$  resonance with unequal duty ratios will face more complicated  $\Delta \tilde{d}$ -to- $\Delta \tilde{i_L}$  dynamics than that of the multiphase buck. Substituting  $d_1 = D + \frac{1}{2}\Delta \tilde{d}$  and  $d_2 = D - \frac{1}{2}\Delta \tilde{d}$  into the average model, the  $\Delta \tilde{d}$ -to- $\Delta \tilde{i_L}$  transfer function can be obtained as:

$$G_{\Delta d\Delta i_L} = \frac{\Delta \tilde{i_L}}{\Delta \tilde{d}} = -\frac{I_o}{2D} \cdot \frac{1 - \frac{s}{\omega_{z_{rhp}}}}{\frac{s^2}{\omega_{nip}^2} + \frac{s}{Q_{ip}\omega_{nip}} + 1}, \tag{3.26}$$

where  $\omega_{nip}$  and  $Q_{ip}$  are the same as in (3.12) or (3.21), and the right-half-plane zero is  $\omega_{z_{rhp}} = \frac{2I_o D}{(V_{in} - I_o R_C)C_B}$ . Fig. 3.19a shows the Bode plots of  $G_{\Delta d\Delta i_L}$  under different load conditions. The right-half-plane zero together with the interphase resonant poles results in a 270° phase reduction. As  $I_o$  decreases, both  $\omega_{z_{rhp}}$  and the dc gain will reduce towards zero. The dc gain might even flip the sign due to nonlinear factors at very light load. All these issues could bring challenges to the active control of  $\Delta i_L$  and need to be properly handled.



Figure 3.18: Simulated voltage and current responses to a line transient  $(V_{in} = 12 \text{ V} \rightarrow 14 \text{ V} \rightarrow 12 \text{ V})$  in the case of (a) open loop and (b) closed loop.  $(\beta = 0)$ 

An alternative way of actively suppressing interphase resonance is to control  $v_c$  resonance. Similar to (3.26), the  $\Delta \tilde{d}$ -to- $v_c$  transfer function can be derived as:

$$G_{\Delta dv_c} = \frac{\tilde{v_c}}{\Delta \tilde{d}} = \frac{V_{in}}{4D} \cdot \frac{1 + \frac{s}{\omega_{z_c}}}{\frac{s^2}{\omega_{nip}^2} + \frac{s}{Q_{ip}\omega_{nip}} + 1}, \tag{3.27}$$

where  $\omega_{nip}$  and  $Q_{ip}$  are the same as in (3.12) or (3.21), and  $\omega_{z_c} = \frac{DV_{in}}{I_o(1+\beta)L_k}$ . Fig. 3.19b shows the Bode plots of  $G_{\Delta dv_c}$ , in which there is no right-half-plane zero and the



Figure 3.19: Bode plots of (a) $G_{\Delta d\Delta i_L}$ and (b) $G_{\Delta dv_c}$ under different load conditions.

maximum phase reduction is 180° under all load conditions, making it attractive to design a $C_B$ voltage control loop for suppressing the interphase L-$C_B$ resonance.
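The load dependence of the two candidate control paths can be illustrated numerically. The sketch below evaluates the zero locations and dc gains stated with Eqs. (3.26) and (3.27) using the Table 3.1 parameters. The duty ratio $D = 1/6$ and the 5 A and 20 A load points are assumptions chosen for illustration (50 A corresponds to 1 V across the listed 20 m$\Omega$ load); the function names are hypothetical.

```python
def dd_to_dil_zero_and_gain(I_o, D, V_in, R_C, C_B):
    """Right-half-plane zero (rad/s) and dc gain of Eq. (3.26)."""
    w_rhp = 2.0 * I_o * D / ((V_in - I_o * R_C) * C_B)
    dc_gain = -I_o / (2.0 * D)
    return w_rhp, dc_gain

def dd_to_vc_zero_and_gain(I_o, D, V_in, beta, L_k):
    """Left-half-plane zero (rad/s) and dc gain of Eq. (3.27)."""
    w_zc = D * V_in / (I_o * (1.0 + beta) * L_k)
    dc_gain = V_in / (4.0 * D)
    return w_zc, dc_gain

D, V_in, R_C, C_B, L_k, beta = 1 / 6, 12.0, 3e-3, 30e-6, 50e-9, 0.0
loads = (5.0, 20.0, 50.0)   # A; 5 A and 20 A are illustrative light-load points
dil = [dd_to_dil_zero_and_gain(I, D, V_in, R_C, C_B) for I in loads]
vc = [dd_to_vc_zero_and_gain(I, D, V_in, beta, L_k) for I in loads]
for I, (w_rhp, g_i), (w_zc, g_v) in zip(loads, dil, vc):
    print(f"I_o={I:4.0f} A: w_rhp={w_rhp:.3g} rad/s, dc gain={g_i:.0f} A | "
          f"w_zc={w_zc:.3g} rad/s, dc gain={g_v:.0f} V")
```

In this sketch the right-half-plane zero and the dc gain of $G_{\Delta d\Delta i_L}$ both shrink as the load decreases, whereas the dc gain of $G_{\Delta dv_c}$ does not depend on the load and its zero stays in the left half-plane.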

# 3.5 MSC-PoL Converter Design with 3D Stacked Packaging

To validate the MSC-PoL architecture, a 48-to-1-V, 450-A, 6-mm-thick MSC-PoL VRM with 3D-stacked ladder-core coupled inductors is designed and built [110]. This

section elaborates the design of the ultra-thin MSC-PoL VRM, including coupled inductors, gate driver circuits, and 3D stacked packaging.

### 3.5.1 Ladder-Structured Coupled Inductor

In the 48-to-1-V MSC-PoL converter, each SCB cell requires a four-phase coupled inductor. Figure 3.20 shows two ladder-structured coupled inductor designs based on: (1) a ladder core only; and (2) a ladder core plus a leakage plate. The ladder magnetic core, made of DMR51W ( $\mu_r = 900$ ), couples four horizontally arranged windings in parallel. Stacking the leakage plate on top creates a low-reluctance path for the leakage magnetic flux, and the resulting larger leakage inductance can reduce the inductor current ripple, achieving higher efficiency. In a fully symmetric coupled inductor structure, the frequency of the leakage magnetic flux is four times the switching frequency. As a result, the leakage plate adopts a higher frequency magnetic material DMR53 ( $\mu_r = 900$ ) for lower core loss.

Figure 3.21 annotates the design dimensions for the ladder core. The overall core and winding shapes are determined by three free dimension variables:  $X_{Leg}$ ,  $H_{Leg}$ , and  $H_{tot}$ . In this section, geometries of the ladder core are optimized for the minimum sum of conduction loss and core loss. Since the ac root-mean-squared (RMS) current



Figure 3.20: Two four-phase coupled inductor designs based on (a) a ladder core and (b) a ladder core plus a leakage plate. The ladder core is made of DMR51W ( $\mu_r = 900$ ), while the leakage plate is made of DMR53 ( $\mu_r = 900$ ), a higher frequency magnetic material to enhance the leakage flux path.



Figure 3.21: Annotated design dimensions for the ladder core. To fit the PCB layout, the entire inductor shape can be determined by three dimension variables:  $X_{Leg}$ ,  $H_{Leg}$ , and  $H_{tot}$ . Predicted core loss for geometry optimization is based on the flux density in each core segment (labeled in blue) using iGSE.

is negligible at heavy load, the winding conduction loss is calculated based only on the dc resistance (DCR). The core loss is predicted using the improved Generalized Steinmetz Equation (iGSE) [111], where the power loss density of each core segment can be expressed as:

$$P_{\rm v} = \frac{1}{T} \int_0^T k_i \left| \frac{\mathrm{d}B}{\mathrm{d}t} \right|^{\alpha} (\Delta B)^{\beta - \alpha} \mathrm{d}t, \tag{3.28}$$

$$k_i = \frac{k}{(2\pi)^{\alpha - 1} \int_0^{2\pi} |\cos \theta|^{\alpha} 2^{\beta - \alpha} d\theta}. \tag{3.29}$$

$k$, $\alpha$, and $\beta$ are the material Steinmetz coefficients provided by the manufacturer. Note that the core loss predicted by iGSE does not capture the impacts of temperature and dc flux density, and the calculated winding conduction loss does not include the loss from the winding solder joints and the winding return path on the PCB. However, the resistance of the solder joints and the PCB return path is less dependent on inductor geometry and is relatively constant. Therefore, the calculated inductor loss herein can still provide good guidance for optimizing the dimensions of the coupled inductor. Advanced core loss modeling tools, such as neural network models, can be used to estimate the core loss under particular operating conditions (e.g., waveform, temperature, dc-bias) [112].
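A minimal numerical sketch of Eqs. (3.28) and (3.29) for a piecewise-linear flux density waveform is given below. It assumes a single major loop per period (no minor loops), and the Steinmetz coefficients and the flux swing used in the demo are placeholders rather than DMR51W or DMR53 data; the function names are illustrative.

```python
import numpy as np

def igse_ki(k, alpha, beta, n=200_000):
    """Eq. (3.29), with the integral evaluated on a uniform periodic grid."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    integral = 2.0 * np.pi * np.mean(np.abs(np.cos(theta)) ** alpha)
    integral *= 2.0 ** (beta - alpha)
    return k / ((2.0 * np.pi) ** (alpha - 1.0) * integral)

def igse_loss_density(t, B, k, alpha, beta):
    """Eq. (3.28) for piecewise-linear B(t) given at breakpoints over one period."""
    t = np.asarray(t, dtype=float)
    B = np.asarray(B, dtype=float)
    dt = np.diff(t)
    slope = np.diff(B) / dt                  # dB/dt is constant on each segment
    delta_B = B.max() - B.min()              # peak-to-peak flux density
    period = t[-1] - t[0]
    integral = np.sum(np.abs(slope) ** alpha * dt)
    return igse_ki(k, alpha, beta) * delta_B ** (beta - alpha) * integral / period

# Placeholder coefficients (NOT manufacturer data) and an assumed 60 mT swing
k, alpha, beta = 1.0, 1.5, 2.5
T, D, dB = 2e-6, 1 / 6, 0.06
t_tri = [0.0, D * T, T]                      # falling for D*T, rising for (1-D)*T
B_tri = [dB / 2, -dB / 2, dB / 2]
p_tri = igse_loss_density(t_tri, B_tri, k, alpha, beta)
print(f"triangular-flux loss density (arbitrary units): {p_tri:.4g}")
```

A useful sanity check is that, for a sinusoidal flux density of amplitude $\hat{B}$, this calculation reduces to the classical Steinmetz form $k f^{\alpha} \hat{B}^{\beta}$.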



Figure 3.22: Equivalent magnetic models for a ladder-structured coupled inductor: (a) magnetic circuit model; (b) inductance dual model. The magnetic flux in each core segment can be calculated through probing the current in the inductance dual model and dividing it by the corresponding reluctance. For the designed coupled inductors, the turns ratio n=1.

In Eq. (3.28), the flux density of each core segment can be calculated based on the equivalent magnetic models shown in Fig. 3.22. Figure 3.22a plots the magnetic circuit model. Each core leg is modeled as a leg reluctance $\mathcal{R}_L$ in series with an MMF source. The top and bottom core segments between two legs are lumped as a header reluctance $\mathcal{R}_H$. The leakage flux path of each phase is modeled as a parallel leakage reluctance $\mathcal{R}_K$. Generally, for a ladder-structured coupled inductor, $\mathcal{R}_K$ is not identical for all the phases. The $\mathcal{R}_K$ discrepancy tends to increase as the phase number increases, but for the designed four-phase coupled inductor, the difference is small enough that $\mathcal{R}_K$ can be analyzed using average values in most cases. Adding the leakage plate reduces $\mathcal{R}_K$, but it remains much larger than the core reluctances $\mathcal{R}_L$ and $\mathcal{R}_H$. Applying circuit duality to the magnetic circuit model yields the inductance dual model (Fig. 3.22b). Denote the winding voltages as $v_{L1} \sim v_{L4}$, which can be expressed as:

$$v_{Lk} = \begin{cases} \left(1 - \frac{1}{D}\right) v_o & \frac{(k-1)T}{4} \le t < \left(D + \frac{k-1}{4}\right)T \\ v_o & \text{otherwise} \end{cases} \tag{3.30}$$

Magnetic flux in each core segment can be calculated through probing the current in the inductance dual model and dividing it by corresponding reluctance. As shown in Fig. 3.22b, the ac current of the inductor  $1/\mathcal{R}_L$  is directly determined by its parallel voltage source:  $di_{\mathcal{R}_{Lk}}/dt = v_{Lk} \cdot \mathcal{R}_L$ . Accordingly, the ac flux density in the k<sup>th</sup> core leg can be derived as:

$$B_{Lk} = \frac{1}{S_{Leg}} \cdot \frac{i_{\mathcal{R}_{Lk}}}{\mathcal{R}_L} = \frac{1}{S_{Leg}} \int v_{Lk} dt. \tag{3.31}$$

 $S_{Leg}$  is the cross-sectional area of each core leg. Equation (3.31) can also be developed from Faraday's law. It implies the ac flux density in one core leg is only related to its own winding voltage, irrelevant to other phases.
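The leg flux density waveform of Eqs. (3.30) and (3.31) can be reproduced with a few lines of numerical integration. In the sketch below, the leg cross-sectional area `S_leg` is a placeholder (25 mm$^2$) rather than a value from the design, the operating point follows the Fig. 3.23 caption ($V_o = 1$ V, $D = 1/6$, $f_{sw} = 500$ kHz), and the function names are illustrative.

```python
import numpy as np

def winding_voltage(t, k, v_o, D, T, phases=4):
    """Eq. (3.30): winding voltage of phase k (1-based), interleaved by T/phases."""
    tau = np.mod(t - (k - 1) * T / phases, T)
    return np.where(tau < D * T, (1.0 - 1.0 / D) * v_o, v_o)

def leg_flux_density(t, v, S_leg):
    """Eq. (3.31) with n = 1: integrate v_Lk and divide by the leg area."""
    steps = 0.5 * (v[1:] + v[:-1]) * np.diff(t)      # trapezoidal increments
    flux = np.concatenate(([0.0], np.cumsum(steps)))
    B = flux / S_leg
    return B - B.mean()                              # keep the ac component only

v_o, D, T = 1.0, 1 / 6, 2e-6
S_leg = 25e-6            # m^2, assumed placeholder, not a design value
t = np.linspace(0.0, T, 60001)
B_legs = [leg_flux_density(t, winding_voltage(t, k, v_o, D, T), S_leg)
          for k in range(1, 5)]
pk_pk = [B.max() - B.min() for B in B_legs]
print("peak-to-peak leg flux density (mT):", [round(1e3 * x, 2) for x in pk_pk])
```

Each leg sees the same peak-to-peak swing, $v_o(1-D)T/S_{Leg}$, shifted in time by a quarter period per phase, which reflects the statement that the leg flux depends only on its own winding voltage.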

In Fig. 3.22b, $1/\mathcal{R}_H \gg 1/\mathcal{R}_K$ even with the leakage plate. Therefore, the voltage across the inductor $1/\mathcal{R}_H$ is primarily determined by the voltage division along the series-connected $1/\mathcal{R}_{K1} \sim 1/\mathcal{R}_{K4}$. Similar to Eq. (3.31), the ac flux density in core headers (i.e., segments between core legs) can be derived as:

$$\begin{cases}
B_{H1} = \frac{1}{S_{Head}} \int \left( v_{L1} - \sum_{j=1}^{4} v_{Lj} \cdot \frac{\frac{1}{\mathcal{R}_{K1}}}{\sum_{j=1}^{4} \frac{1}{\mathcal{R}_{Kj}}} \right) dt, \\
B_{H2} = \frac{1}{S_{Head}} \int \left( v_{L1} + v_{L2} - \sum_{j=1}^{4} v_{Lj} \cdot \frac{\frac{1}{\mathcal{R}_{K1}} + \frac{1}{\mathcal{R}_{K2}}}{\sum_{j=1}^{4} \frac{1}{\mathcal{R}_{Kj}}} \right) dt, \\
B_{H3} = \frac{1}{S_{Head}} \int \left( -v_{L4} + \sum_{j=1}^{4} v_{Lj} \cdot \frac{\frac{1}{\mathcal{R}_{K4}}}{\sum_{j=1}^{4} \frac{1}{\mathcal{R}_{Kj}}} \right) dt.
\end{cases} \tag{3.32}$$



Figure 3.23: Calculated and ANSYS-simulated magnetic flux density in: (a) each core header  $(B_{H1} \sim B_{H3})$  and (b) each core leg  $(B_{L1} \sim B_{L4})$ .  $V_o = 1$  V;  $D = \frac{1}{6}$ ;  $f_{sw} = 500$  kHz.

 $S_{Head}$  is the cross-sectional area of each core header;  $\mathcal{R}_{K1} \sim \mathcal{R}_{K4}$  can be obtained from the extracted inductance matrix in ANSYS simulation.

Figure 3.23 compares the calculated and simulated ac flux density for the two coupled inductor designs. As indicated in the figure, the ac flux density is almost the same with or without using the leakage plate. The calculated and simulated results match well, validating the theoretical analysis.

To simplify the optimization calculation, $\mathcal{R}_{K1} \sim \mathcal{R}_{K4}$ are treated as identical, since their differences are small. Figure 3.24 demonstrates the optimization process for the ladder-core coupled inductor (without the leakage plate) under the conditions of 125 A average current (31.25 A/phase) and 500 kHz switching frequency. Given a specific inductor height $H_{tot}$, the optimized inductor geometries are obtained from the inductor loss contour plot by sweeping $X_{Leg}$ and $H_{Leg}$, as shown in Fig. 3.24a. The optimized inductor loss versus $H_{tot}$ is plotted in Fig. 3.24b. Weighing the tradeoff between inductor loss and height, $H_{tot}$ is selected as 2.9 mm. Key parameters for the optimal coupled inductor design are listed in Table 3.2. Figures 3.25 and 3.26 show the CNC-machined magnetic cores and copper windings based on the optimized geometries. The ladder core measures 28.9 mm × 13 mm × 2.9 mm. A customized



Figure 3.24: Optimization process for the ladder-core coupled inductor: (a) total inductor loss contour plot at a specific  $H_{tot}$ ; (b) optimized inductor loss versus  $H_{tot}$ . Core loss and conduction loss are optimized for one coupled inductor (four-phase) supporting 125 A at 500 kHz switching frequency.

Table 3.2: Parameters for the Optimal Coupled Inductor Design

| Parameter                               | Value                                                                                          |
|-----------------------------------------|------------------------------------------------------------------------------------------------|
| Total Length, $L$                       | 28.9 mm                                                                                        |
| Total Width, $W$                        | 13 mm                                                                                          |
| Total Height, $H_{tot}$                 | 2.9 mm                                                                                         |
| Leg Width, $X_{Leg}$                    | 4.6 mm                                                                                         |
| Leg Height, $H_{Leg}$                   | 2 mm                                                                                           |
| Window Width, $X_{Win}$                 | 1.9 mm                                                                                         |
| Header Width, $X_{Head}$                | 3.5 mm                                                                                         |
| Leg Reluctance, $\mathcal{R}_L$         | $0.91 \times 10^6 \text{ H}^{-1}$                                                              |
| Header Reluctance, $\mathcal{R}_H$      | $1.21 \times 10^6 \text{ H}^{-1}$                                                              |
| Leakage Reluctance, ${\mathcal{R}_K}^*$ | ① $52.1 \sim 65.9 \times 10^6 \text{ H}^{-1}$<br>② $12.5 \sim 14.1 \times 10^6 \text{ H}^{-1}$ |

<sup>\*</sup> Simulated leakage reluctance per phase: ① is for the design with ladder core only; ② is for using ladder core plus leakage plate.

0.8-mm magnetic plate can be put on top of the ladder core with a 0.2-mm air gap for enhanced leakage flux. A comparison of the two coupled inductors is summarized in Table 3.3. Notice that the transient inductance is equivalent to the leakage inductance for the two parallel coupled inductors.



Figure 3.25: Customized magnetic components: (a) four-phase ladder magnetic core (DMR51W,  $\mu_r = 900$ ); (b) CNC-machined windings; (c) leakage magnetic plate (DMR53,  $\mu_r = 900$ ).



Figure 3.26: Coupled inductor height of: (a) using the ladder core only; (b) using the ladder core plus the 0.8-mm leakage plate with a 0.2-mm air gap.

Table 3.3: Comparison between the Two Four-Phase Coupled Inductors

| Inductor Design                | Height | $L_{tr}^*$ | $L_{ss}^*$ | DCR$^{\dagger}$ | Current<br>Rating |
|--------------------------------|--------|------------|------------|------------------|-------------------|
| Ladder Core                    | 2.9 mm | 17 nH      | 140 nH     | 0.06 m$\Omega$   | > 125 A           |
| Ladder Core +<br>Leakage Plate | 3.9 mm | 75 nH      | 381 nH     | 0.06 m$\Omega$   | > 125 A           |

<sup>\*</sup> $L_{tr}$ and $L_{ss}$ are simulated average values for each phase when D = 1/6.

<sup>†</sup> DCR is the measured winding dc resistance per phase.

The two coupled inductor structures are verified by both FEM and SPICE simulations. Figure 3.27 shows the FEM magnetic field simulation in ANSYS. In Fig. 3.27a, a magnetostatic simulation is performed to display the dc flux distribution when each phase conducts 31.25 A dc current (125 A in total). The dc flux density in the core leg is 0.066 T without the leakage plate. After installing the leakage plate, it increases to 0.28 T, which is still much lower than the saturation flux density (0.5 T) of the magnetic material used. Therefore, both coupled inductors can support 125 A dc current, which is sufficient for the designed MSC-PoL converter. Although adding the leakage plate will reduce the saturation current limit, it is acceptable in



Figure 3.27: ANSYS FEM simulation of the two coupled inductor designs: (a) dc flux density distribution when supporting 31.25 A average current per phase (125 A in total) and (b) ac flux density distribution at  $t=1~\mu s$  of one switching cycle.  $V_{in}=48~V, V_o=1~V, f_{sw}=500~kHz$ .

most cases because the current rating of a coupled inductor is usually constrained by unbalanced phase currents and semiconductor devices. In Fig. 3.27b, a transient magnetic field simulation is conducted for one switching cycle (2  $\mu$ s), displaying the ac flux density at t=1  $\mu$ s when it reaches its peak in the middle core header and the third core leg. As shown in Fig. 3.27b, the ac flux density is similar with or without using the leakage plate. This indicates the core losses of the two coupled inductors are comparable, though they might be influenced by the dc bias.

Figure 3.28 shows the SPICE simulation of the 48-to-1-V MSC-PoL converter when using different coupled inductor designs as well as discrete inductors of equivalent  $L_{ss}$  and  $L_{tr}$ . Simulations with coupled inductors are based on the extracted inductance matrix from ANSYS. Simulated steady-state inductor current ripples and transient output voltages during a duty ratio step change are plotted in the figure. Since the transfer function  $G_{dv_o}$  in Eq. (3.6) is a second-order system, its maximum



Figure 3.28: Simulated steady-state inductor currents and transient output voltages during a duty ratio step change when using: (a) the coupled inductor with ladder core only and discrete inductors of its equivalent  $L_{ss}$  and  $L_{tr}$ ; (b) the coupled inductor with ladder core plus leakage plate.  $V_{in}=48~\rm V,~V_o=1\rightarrow1.2~\rm V,~f_{sw}=500~\rm kHz,~R_{eq}=3~\rm m\Omega,~R_o=0.01~\Omega,~C_o=1~\rm mF.$  (Steady-state inductor currents are simulated at  $V_o=1~\rm V$ ).

percent overshoot $(M_p)$ and 2% settling time $(t_s)$ of a step response are:

$$M_p = e^{\frac{-\pi\xi}{\sqrt{1-\xi^2}}}, \quad t_s = \frac{4}{\xi\omega_n} = \frac{8R_oC_o}{1 + \frac{R_{eq}R_oC_o}{L_{tr}}}. \tag{3.33}$$

Lower $L_{tr}$ results in a faster transient with a shorter $t_s$, but $M_p$ is not necessarily smaller, since it also depends on other circuit parameters. As implied by Fig. 3.28a, the ladder-core coupled inductor achieves a transient speed as fast as that of small 17 nH discrete inductors while maintaining a current ripple as low as that of large 140 nH discrete inductors. Adding the leakage plate, with 1 mm of extra thickness, further reduces the current ripple to an extremely low level (Fig. 3.28b), significantly decreasing switching-related loss and improving converter efficiency. The disadvantages of adding the leakage plate are slower transient speed, lower saturation current limit, and larger thickness.
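The settling-time expression in Eq. (3.33) can be evaluated for the inductance values of Table 3.3 as a quick illustration. The sketch below uses the circuit values given in the Fig. 3.28 caption ($R_{eq} = 3$ m$\Omega$, $R_o = 0.01\ \Omega$, $C_o = 1$ mF); the function names are illustrative, and the overshoot helper takes the damping ratio $\xi$ as an input because its expression is not restated in this section.

```python
import math

def settling_time(L_tr, R_eq, R_o, C_o):
    """2% settling time from Eq. (3.33), in seconds."""
    return 8.0 * R_o * C_o / (1.0 + R_eq * R_o * C_o / L_tr)

def percent_overshoot(xi):
    """Maximum overshoot M_p of an underdamped second-order step response."""
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must lie in (0, 1) for an underdamped response")
    return math.exp(-math.pi * xi / math.sqrt(1.0 - xi ** 2))

R_eq, R_o, C_o = 3e-3, 0.01, 1e-3          # from the Fig. 3.28 caption
t_s = {L: settling_time(L, R_eq, R_o, C_o) for L in (17e-9, 75e-9, 140e-9)}
for L, ts in t_s.items():
    print(f"L_tr = {L * 1e9:.0f} nH: t_s = {ts * 1e6:.1f} us")
```

Under these values the settling time grows monotonically with $L_{tr}$, consistent with the observation that the leakage plate trades transient speed for lower ripple.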



Figure 3.29: Design of gate driver circuits and bootstrap chains (plotted in green) for one MSC-PoL module. All gate driver and bootstrap circuits are laid out together with the power stage inside the compact converter package.

# 3.5.2 Gate Driver Circuits and 3D Stacked Packaging

Table 3.4 tabulates key component parameters of the 48-to-1-V MSC-PoL module. GaN switches with higher voltage ratings are used for  $S_{0X} \sim S_{1X}$  in the SC cell to undertake high voltage stress; Silicon MOSFETs with lower voltage ratings are used for  $S_{2X} \sim S_{8X}$  in the SCB cells to undertake high current stress. The hybrid GaN-Si switch combination maximizes the advantages of material characteristics and state-of-the-art performance of GaN FETs and Silicon MOSFETs.

Figure 3.29 plots the detailed gate driver and bootstrap circuit design for one MSC-PoL module. Supported by an external voltage rail $V_{drive}$ ($V_{drive} = 8 \text{ V}$), the

Table 3.4: Bill-of-Material of the 48-to-1-V MSC-PoL Converter

| Semiconductor Devices                              | Description            |
|----------------------------------------------------|------------------------|
| Switches, $S_{0X} \sim S_{1X}$                     | EPC 2065               |
| Gate Drivers for $S_{0X} \sim S_{1X}$              | ADI LTC4440-5          |
| LDO Regulators                                     | On-Semi NCP711         |
| High-Side Switches, $S_{2X} \sim S_{4X}$           | Infineon BSZ0902NS     |
| Low-Side Switches, $S_{5X} \sim S_{8X}$            | Infineon BSZ011NE2LS5I |
| Gate Drivers for $S_{5X}$                          | TI LM5114              |
| Gate Drivers for $S_{2X/6X} \sim S_{4X/8X}$        | TI UCC27282            |

| Capacitors* | Description                                                                                  |
|-------------|----------------------------------------------------------------------------------------------|
| $C_{in}$    | 0805 X5R 100 V 4.7 $\mu$ F × 36, $C_{eff} = 20.3 \mu$ F                                      |
| $C_{fly}$   | $0805 \text{ X5R } 35 \text{ V } 22 \ \mu\text{F} \times 22, \ C_{eff} = 38.7 \ \mu\text{F}$ |
| $C_{1X}$    | $0805 \text{ X5R } 25 \text{ V } 22 \mu\text{F} \times 9, C_{eff} = 20 \mu\text{F}$          |
| $C_{2X}$    | $0805 \text{ X5R } 25 \text{ V } 22 \mu\text{F} \times 7, C_{eff} = 27.7 \mu\text{F}$        |
| $C_{3X}$    | $0805 \text{ X5R } 25 \text{ V } 22 \mu\text{F} \times 6, C_{eff} = 52 \mu\text{F}$          |
| $C_o$       | $0805 \text{ X5R } 6.3 \text{ V } 100 \ \mu\text{F} \times 12, C_{eff} = 0.94 \text{ mF}$    |

<sup>*</sup> Capacitor count and $C_{eff}$ are listed for one MSC-PoL module.

bootstrap chain creates multiple floating dc voltages referenced to floating switch source terminals. In each SCB cell, half-bridge gate drivers (UCC27282) are used to drive  $S_{2X} \sim S_{4X}$  and  $S_{6X} \sim S_{8X}$ , and low-side gate drivers (LM5114) are used to drive  $S_{5X}$ . In the H-bridge SC cell, high-side gate drivers (LTC4440-5) and 5-V LDOs are utilized for driving the GaN switches  $S_{0X} \sim S_{1X}$ . The PWM input side of each gate driver is ground referenced and powered by  $V_{drive}$ . The driving output side is powered by the bootstrap chain for the floating switches or by  $V_{drive}$  for the grounded switches.

Detailed PCB layout and 3D stacked packaging of the MSC-PoL VRM are plotted in Fig. 3.30. The VRM measures 31.9 mm × 26.6 mm in area, and the overall height is only 6 mm (7 mm if including the leakage plate). All power devices are placed on the top side of the PCB, while the coupled inductors and gate drivers are stacked on the bottom side. Placing all power components on one side simplifies the cooling requirements by enabling single-sided heat dissipation. The bootstrap circuit chain is



Figure 3.30: PCB layout and 3D stacked packaging of the MSC-PoL VRM: (a) annotated top view; (b) annotated bottom assembly view. The PCB area is  $31.9 \text{ mm} \times 26.6 \text{ mm} = 848.54 \text{ mm}^2$ , and the total VRM height is only 6 mm (7 mm if including the leakage plate).

laid out in the center of the converter, and the H-bridge SC cell and the two 4-phase SCB cells (cells A&B) are placed symmetrically on its two sides. To minimize both converter height and on-board area, a 3D stacked inductor-driver packaging is implemented as shown in Fig. 3.30b. At the bottom side of the PCB, the coupled inductors are stacked on top of the gate drivers with a copper backbone inserted in between to draw the high output currents out. The winding structures of the two inductors are symmetric so that all the output currents are brought to the middle, which shortens the PCB traces and reduces the conduction loss of the overall system. All components including power stage, bootstrap chain, gate driver circuits, and coupled inductors are packaged into a  $\frac{1}{16}$ -brick module with an ultra-compact 0.31 in<sup>3</sup> volume and an ultra-thin 6-mm thickness. Only PWM pins, a voltage rail  $V_{drive}$, and an optional heat sink are needed to operate the MSC-PoL VRM.



Figure 3.31: (a) Block diagram of the prototype power stage. (b) An example phase shift strategy, which enables 16-phase interleaving with multiplied ripple frequency  $(16 \times f_{sw})$  and reduced ripple amplitude of the output current.

# 3.6 Experimental Results

A 48-to-1-V/450-A MSC-PoL prototype comprising two parallel-connected MSC-PoL modules is fabricated and tested. This section presents the overall hardware prototype, the experimental testbench, and detailed experimental results.

# 3.6.1 Prototype and Testbench

Figure 3.31a plots the block diagram of the prototype power stage, which contains 16 output phases. An appropriate phase shift strategy can be designed to achieve 16-phase interleaving, which multiplies the ripple frequency and reduces the ripple amplitude of the output current, as shown in Fig. 3.31b. Figure 3.32 shows the complete hardware prototype, including the power stage, the signal interface board, and two F28388D controllers. A heat sink (SKV38538514-CU) equipped with a dc fan (9GA0312P3J001) is placed on top of each MSC-PoL module through a thermal interface. Benefiting from the single-sided heat dissipation, the heat sink can easily remove most of the heat generated by the power devices.



Figure 3.32: Picture of the 48-to-1-V/450-A MSC-PoL prototype containing two MSC-PoL modules, a signal interface board, and two microcontroller boards. Each MSC-PoL module is covered by a heat sink together with a dc fan.



Figure 3.33: (a) One MSC-PoL module (w/o the leakage plate) compared with a U.S. quarter. (b) Mechanical demonstration of a 225 W 48-to-1-V MSC-PoL module embedded into a 3D-printed FCLGA-3647 socket to support a server CPU (Intel Xeon Platinum 8280, 205 W).

Figure 3.33a exhibits the ultra-thin MSC-PoL VRM. Each MSC-PoL module is enclosed within a 31.9 mm × 26.6 mm × 6 mm box volume, which is comparable in size to a U.S. quarter. With the ultra-compact size and the ultra-thin thickness, the MSC-PoL VRM can be embedded into an FCLGA-3647 socket to power an Intel Xeon



Figure 3.34: Picture of the experimental testbench. Digital multimeters are interfaced with the BenchVue platform to automatically collect efficiency measurement results. Two current shunts are utilized for measuring the input and the output currents. A dc power source is used as the 48 V dc bus. Multiple electronic loads are connected in parallel to drain high load currents.

Platinum 8280 CPU (205 W), enabling PwrSiP voltage regulation as demonstrated in Fig. 3.33b.

Figure 3.34 shows the experimental testbench. Four digital multimeters (Agilent 34401A) are utilized in combination with the BenchVue software platform to set up an automatic efficiency measurement system. Two current shunts (Riedon RSN-50 and RSC-1000), calibrated by Agilent 34330A, are connected in series at the input and output for precise current measurement. A dc power source (BK Precision 9117)



Figure 3.35: Steady-state waveforms of switch drain-source voltages and intermediate rail voltages.  $V_{\text{Rail1A}}$  and  $V_{\text{Rail1B}}$  are the positive and the negative terminal voltages of the flying capacitor  $C_{fly}$ .  $f_{sw} = 400$  kHz;  $V_o = 1$  V.

is used to provide the 48 V dc voltage. Multiple electronic loads (Chroma 63103A and 63203) are connected in parallel to draw high load currents from the converter.

In the following experiments, the MSC-PoL prototype is tested based on the component parameters in Table 3.4 and phase shift strategy in Fig. 3.31b, unless otherwise specified. Measured experimental results when using different coupled inductor designs shown in Table 3.3 are compared and discussed.

# 3.6.2 Steady-State Operation

This subsection demonstrates the steady-state operation of the MSC-PoL prototype when delivering power from 48 V to 1 V and switching at 400 kHz. The leakage plate is installed on the coupled inductor for lower current ripple.

Figure 3.35 shows the measured waveforms of switch drain-source voltages and two intermediate rail voltages. The maximum switch voltage stresses are labeled beside the waveforms, which are 24 V for  $S_{0X}$ , 30 V for  $S_{1A/C}$ , 18 V for  $S_{1B/D}$ , 12 V for SCB high-side switches ( $S_{2X} \sim S_{4X}$ ), and 6 V for SCB low-side switches ( $S_{5X} \sim S_{8X}$ ), consistent with the analysis in Fig. 3.7. Two intermediate rail voltages  $V_{\text{Rail1A}}$  and  $V_{\text{Rail1B}}$  refer to the voltages of positive and negative terminals of the flying capacitor  $C_{fly}$ .  $V_{\text{Rail1A}}$ 



Figure 3.36: Steady-state waveforms of switch node voltages and output voltage ripples. The 16-phase interleaving operation in Fig. 3.31 is applied, yielding  $16f_{sw}$  ripple frequency for the output voltage.  $f_{sw} = 400$  kHz;  $V_o = 1$  V.

switches between 24 V and 48 V, while  $V_{\text{Rail1B}}$  alternates between 0 V and 24 V. By turning on  $S_{1X}$ , each SCB cell is connected to its corresponding voltage rail when that rail is at 24 V.

Figure 3.36 shows the measured waveforms of switch node voltages and output voltage ripples. The phase shift strategy in Fig. 3.31 is applied, where the phase shifts between the four SCB cells (Cell A~D) are 202.5°, 112.5°, and 202.5°, and neighboring phases within each SCB cell are shifted by 90°. As implied by Fig. 3.36, this phase shift scheme enables 16-phase interleaving, yielding  $16f_{sw}$  ripple frequency and greatly reduced ripple amplitude for the output voltage. The peak-to-peak steady-state output voltage ripple is less than 10 mV.
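The claim that these shifts produce uniform 16-phase interleaving can be checked numerically. The sketch below is illustrative only: it assumes the three inter-cell shifts are applied cumulatively from Cell A to Cell D and that each cell has four phases 90° apart, as stated above. The function names are hypothetical and not part of the prototype firmware.

```python
def phase_angles(cell_shifts_deg, phases_per_cell=4, intra_shift_deg=90.0):
    """Return the sorted phase angles (degrees, modulo 360) of all output phases.

    Assumption: cell_shifts_deg lists the shift from each cell to the next
    (A->B, B->C, C->D), applied cumulatively starting from Cell A at 0 degrees.
    """
    angles = []
    cell_offset = 0.0
    for cell_index in range(len(cell_shifts_deg) + 1):
        for k in range(phases_per_cell):
            angles.append((cell_offset + k * intra_shift_deg) % 360.0)
        if cell_index < len(cell_shifts_deg):
            cell_offset += cell_shifts_deg[cell_index]
    return sorted(angles)


def spacings(angles):
    """Angular gaps between neighboring phases, including the wrap-around gap."""
    n = len(angles)
    return [(angles[(i + 1) % n] - angles[i]) % 360.0 for i in range(n)]


angles = phase_angles([202.5, 112.5, 202.5])
gaps = spacings(angles)
print("phase angles:", angles)
print("gaps:", gaps)
```

Under these assumptions the 16 phases land on distinct multiples of 360°/16, which is the condition for the output ripple frequency to become $16f_{sw}$.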

Figure 3.37 shows the measured capacitor dc voltages and ac voltage ripples when delivering 400 A load current. As indicated by Fig. 3.37a, both the flying capacitor and the blocking capacitors can maintain stable voltages at heavy load, functioning like a dc source with expected dc values. As shown in Fig. 3.37b, the capacitor ac voltage ripples can remain less than 0.8 V at 400 A load current (89% of full load).



Figure 3.37: Steady-state waveforms of: (a) capacitor dc voltages; (b) capacitor ac voltage ripples and output current.  $f_{sw} = 400 \text{ kHz}$ ;  $V_o = 1 \text{ V}$ ;  $I_o = 400 \text{ A}$ .



Figure 3.38: Measured open-loop transient waveforms with one MSC-PoL module when (a) using the leakage plate and (b) not using the leakage plate. Duty ratio steps from 15.8% to 22.2%, yielding a step change  $V_o$  from 0.8 V to 1.2 V.  $f_{sw} = 704$  kHz;  $I_o = 100$  A;  $C_o = 3$  mF.

# 3.6.3 Transient Performance

This subsection exhibits the open-loop and the closed-loop transient experiments tested on one MSC-PoL module with and without using the leakage plate. The transient experiments are performed when  $V_{in} = 48 \text{ V}$ ,  $f_{sw} = 704 \text{ kHz}$ ,  $C_o = 3 \text{ mF}$ .

Figure 3.38 shows the measured transient waveforms during an open-loop duty ratio step change at 100 A load current. The duty ratio steps from 15.8% to 22.2%,



Figure 3.39: Measured closed-loop transient waveforms with one MSC-PoL module (w/o the leakage plate) during a load step change between 50 A and 150 A. A typical voltage-mode feedback control is applied. The maximum voltage overshoot is less than 80 mV during the 100 A load step (44% of the full load) with 4 A/$\mu$s current slope.  $f_{sw} = 704$  kHz;  $C_o = 3$  mF.

yielding a step change in  $V_o$  from 0.8 V to 1.2 V. The settling time to reach within a 5% error band of the final voltage is 26 $\mu$s with the leakage plate and 18 $\mu$s without it. As discussed in Section 3.5.1, adding the leakage plate reduces the current ripple but also slows down the transient response due to the larger leakage inductance, resulting in a longer settling time. However, one MSC-PoL module contains eight output phases in parallel. This narrows the transient performance difference between the two coupled inductor designs, since both have a very small total output leakage inductance that is comparable to the parasitic trace inductance. Therefore, after adding the leakage plate, the MSC-PoL VRM still maintains a fast transient speed. In addition, the flying capacitor and the blocking capacitor voltages remain stable during the open-loop duty ratio step change.
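For readers who want to apply the same settling-time definition to their own captured waveforms, a minimal sketch follows. It is not the measurement script used for Fig. 3.38. The waveform is a synthetic first-order response with a hypothetical 6 $\mu$s time constant, chosen only to exercise the function, and the error band is taken as 5% of the final voltage.

```python
import math


def settling_time(times, values, final_value, band=0.05):
    """Earliest time after which the waveform stays within band * |final_value|
    of final_value until the end of the record. Returns None if it never settles."""
    tolerance = band * abs(final_value)
    entered = None
    for t, v in zip(times, values):
        if abs(v - final_value) > tolerance:
            entered = None          # left the band, so discard the earlier entry time
        elif entered is None:
            entered = t             # first sample of the current in-band stretch
    return entered


# Synthetic 0.8 V -> 1.2 V step response (hypothetical time constant, not measured data).
TAU_S = 6e-6
times = [i * 0.1e-6 for i in range(1000)]
values = [0.8 + 0.4 * (1.0 - math.exp(-t / TAU_S)) for t in times]

t_settle = settling_time(times, values, final_value=1.2)
print(f"settling time: {t_settle * 1e6:.1f} us")
```

The function scans the whole record so that a waveform which re-exits the band after an overshoot is not reported as settled too early.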

Figure 3.39 shows the measured waveforms of closed-loop transient experiments. A typical voltage-mode feedback control with PI compensator is implemented, which changes the duty ratio based on the error between reference and output voltages. The output load current is programmed to step between 50 A and 150 A with  $4 \text{ A}/\mu\text{s}$  downslope. As indicated by the figure, the maximum voltage overshoot is less than



Figure 3.40: Measured 48-to-1-V efficiency of the MSC-PoL prototype when (a) using the leakage plate and (b) not using the leakage plate. Efficiencies of different switching frequencies excluding and including the gate losses are plotted and compared.  $V_{drive} = 8 \text{ V}$ .

80 mV during this 100 A load step (44% of the full load). The flying capacitor and blocking capacitor voltages also remain stable in the closed-loop transient test. The transient performance can be further enhanced by increasing the control loop bandwidth (e.g., reducing the delay of controller and gate drivers) or by using advanced nonlinear controls (e.g., constant-on-time control). However, demonstrating the extreme transient performance of the converter is beyond the scope of this section.

# 3.6.4 Efficiency Measurement

The efficiencies of the MSC-PoL prototype with and without using the leakage plate are measured at multiple switching frequencies. The gate drivers and the bootstrap chain are powered by an auxiliary dc-dc converter, and the gate losses are estimated by  $Q_gV_{drive}f_{sw}$ .  $V_{drive}$  is the voltage of the auxiliary power rail, and  $V_{drive} = 8$  V in all experiments.

Figure 3.40 summarizes the 48-to-1-V efficiencies of the MSC-PoL prototype with and without using the leakage plate, respectively. Efficiencies of different switching

frequencies excluding and including the gate losses are collected and compared. As shown in the figure, the MSC-PoL prototype with the leakage plate achieves a higher efficiency than the prototype without it. As the switching frequency increases, there is a tradeoff between the decreased ac conduction losses and the increased switching-related losses (including switching losses, deadtime losses, parasitic loop inductance losses, etc.). With the leakage plate, the inductor current ripple is already very small. Increasing the switching frequency does not significantly reduce the ac conduction losses, so the increased switching-related losses dominate. In this case, a higher switching frequency yields a lower efficiency. Without the leakage plate, the inductor current ripple is large, and increasing the switching frequency can greatly reduce the ac conduction losses. The decreased ac conduction losses dominate the frequency impact at light load, but at heavy load the increased switching-related losses are predominant. Consequently, a higher switching frequency leads to a higher efficiency at light load but a lower efficiency at heavy load. At full load, where the current ripple amplitude has little influence on the total power losses, the two coupled inductor designs yield similar efficiencies at the same switching frequency. The efficiency measurement results indicate that, excluding the gate losses, the MSC-PoL prototype with the leakage plate achieves 93.1% peak efficiency at  $140~\mathrm{A}/400~\mathrm{kHz}$  and 86.2% full-load efficiency at  $450~\mathrm{A}/400~\mathrm{kHz}$ . In contrast, the MSC-PoL prototype without the leakage plate achieves 91% peak efficiency at 150 A/602 kHz and 84.6% full-load efficiency at 450 A/602 kHz. The gate drive losses are estimated as 2.48 W at 400 kHz, 3.10 W at 500 kHz, and 3.74 W at 602 kHz.
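As a consistency check on these numbers, the sketch below back-calculates the gate energy per switching cycle, $Q_gV_{drive}$, from the three reported gate losses and then folds the 400 kHz gate loss into the full-load efficiency. The output power of 450 W assumes $V_o = 1$ V at 450 A. The per-cycle energy is inferred here from the reported losses and is not a value stated in this chapter, and the helper names are illustrative.

```python
def gate_loss_w(gate_energy_j, f_sw_hz):
    """Gate loss estimated as (Q_g * V_drive) * f_sw, with Q_g * V_drive given in joules."""
    return gate_energy_j * f_sw_hz


def efficiency_with_gate_loss(p_out_w, efficiency_excl_gate, p_gate_w):
    """Add the gate loss to the input power implied by the gate-excluded efficiency."""
    p_in_w = p_out_w / efficiency_excl_gate
    return p_out_w / (p_in_w + p_gate_w)


# Reported gate losses (W) keyed by switching frequency (Hz).
reported_gate_loss = {400e3: 2.48, 500e3: 3.10, 602e3: 3.74}

# Inferred energy per cycle; it should be roughly constant if loss scales with f_sw.
energy_per_cycle = {f: p / f for f, p in reported_gate_loss.items()}
for f, e in energy_per_cycle.items():
    print(f"{f / 1e3:.0f} kHz: {e * 1e6:.2f} uJ per cycle")

# Full load with the leakage plate: 450 W output (assuming V_o = 1 V), 86.2% excluding gate loss.
eff_incl = efficiency_with_gate_loss(450.0, 0.862, reported_gate_loss[400e3])
print(f"full-load efficiency including gate loss: {eff_incl * 100:.1f}%")
```

The three frequencies give nearly the same energy per cycle, as expected from a loss that is linear in $f_{sw}$, and the resulting full-load efficiency agrees with the gate-loss-inclusive entry for the leakage-plate design in Table 3.5.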

Figure 3.41 shows the thermal image of the MSC-PoL prototype under dc fan and heat sink cooling. After operating at 450 A full load for more than 10 minutes, the hot-spot temperature of the heat sink remains around 45 °C when the ambient temperature is around 25 °C. Featuring single-side heat dissipation, the MSC-PoL



Figure 3.41: Thermal image of the MSC-PoL prototype when operating at 48-to-1-V/450-A,  $f_{sw}=400$  kHz under dc fan and heat sink cooling for more than 10 minutes. The hot-spot temperature of the heat sink remains around 45 °C. The ambient temperature is around 25 °C.

prototype greatly simplifies its cooling design, enabling long-term operation at heavy load while maintaining a low temperature.

# 3.6.5 Performance Discussions and Comparison

The 48-to-1-V MSC-PoL CPU VRM is a combination of many state-of-the-art technologies, including multistack SC architecture, soft charging technique, hybrid GaN-Si switch combination, coupled magnetics, and 3D stacked packaging. It achieves an ultra-compact size with both a small area and a low z-height. The overall VRM height is only 6 mm (7 mm if adding the leakage plate), making it an extremely attractive PwrSiP solution for CPU voltage regulation.

Appropriate coupled inductor design can be selected based on specific application requirements. Adding the leakage plate can reduce the inductor current ripple, and the resulting smaller RMS and peak current values decrease conduction loss, switching loss, and parasitic inductance loss, yielding a higher efficiency. The tradeoff is the increased VRM height and slower transient response. However, with 8-phase (or 16-phase) interleaving, the coupled inductor that uses the leakage plate can still achieve



Figure 3.42: Loss breakdown of the 48-to-1-V/400 kHz MSC-PoL prototype (with the leakage plate) at (a) full load range and (b) two specific load conditions. Gate loss is included. The power loss breakdown listed in the legend is ordered from bottom to top in the bar chart and clockwise from 12 o'clock in the pie charts.

a fast transient speed, as demonstrated in Section 3.6.3. Although the light-load efficiencies for the two coupled inductor designs are quite different, their heavy-load efficiencies are very close given the same operation frequency.

Detailed loss breakdown of the 48-to-1-V/400 kHz MSC-PoL prototype (with the leakage plate) is plotted in Fig. 3.42. The power loss breakdown contains 1) losses of the H-Bridge SC stage including switching and conduction losses of the GaN switches  $(S_{0X} \sim S_{1X})$  as well as ESR loss of the flying capacitors  $(C_{fly})$ ; 2) losses of the SCB stage including switching and conduction losses of the MOSFETs  $(S_{2X} \sim S_{8X})$ , ESR loss of the blocking capacitors  $(C_{1X} \sim C_{3X})$ , core loss and winding loss of the coupled inductors; 3) parasitic loop inductance loss estimated by  $\frac{1}{2}L_{loop}i_L^2f_{sw}$ ; 4) deadtime loss, PCB trace conduction loss, and gate loss estimated by  $Q_gV_{drive}f_{sw}$ . At light load, gate loss, core loss, and switching loss are predominant. When load current increases to 170 A where the peak efficiency is achieved, the major power losses are

Table 3.5: Performance Comparison of 48 V-to-1 V Point-of-Load VRMs

| Year      | Note                            | Output Current @ Peak Efficiency | Efficiency @ Peak Efficiency | Box Power Density* @ Peak Efficiency | Output Current @ Full Load | Efficiency @ Full Load | Box Power Density* @ Full Load | Switching Frequency<sup>†</sup> | Including Gate Drive Loss & Size |
|-----------|---------------------------------|----------------------------------|------------------------------|--------------------------------------|----------------------------|------------------------|--------------------------------|---------------------------------|----------------------------------|
| This Work | Ladder Only, 6-mm height        | 150 A                            | 91.0%                        | $241~\mathrm{W/in^3}$                | 450 A                      | 84.6%                  | $724~\mathrm{W/in^3}$          | 602 kHz<sup>‡</sup>             | Loss ×; Size ✓                   |
| This Work | Ladder Only, 6-mm height        | 210 A                            | 89.5%                        | $338~\mathrm{W/in^3}$                | 450 A                      | 85.6%                  | $724~\mathrm{W/in^3}$          | 400 kHz                         | Loss ✓; Size ✓                   |
| This Work | Ladder + Leakage, 7-mm height   | 140 A                            | 93.1%                        | $193~\mathrm{W/in^3}$                | 450 A                      | 86.2%                  | $621~\mathrm{W/in^3}$          | 400 kHz                         | Loss ×; Size ✓                   |
| This Work | Ladder + Leakage, 7-mm height   | 170 A                            | 91.7%                        | $235~\mathrm{W/in^3}$                | 450 A                      | 85.8%                  | $621~\mathrm{W/in^3}$          | 400 kHz                         | Loss ✓; Size ✓                   |
| 2020      | Sigma [68]                      | 40 A                             | 94.0%                        | $210~\mathrm{W/in^3}$                | 80 A                       | 92.5%                  | $420~\mathrm{W/in^3}$          | 600 kHz                         | Loss ×; Size ✓                   |
| 2020      | TSAB [113]                      | 30 A                             | 91.5%                        | $12~\mathrm{W/in^3}$                 | 90 A                       | 85.0%                  | $36~\mathrm{W/in^3}$           | 500 kHz                         | Loss ×; Size ✓                   |
| 2020      | Vicor [114, 115]                | 120 A                            | 90.1%                        | $224~\mathrm{W/in^3}$                | 214 A                      | 87%¶                   | $400~\mathrm{W/in^3}$          | 1,025 kHz                       | Loss ✓; Size ✓                   |
| 2021      | ADI [116]                       | 30 A                             | 90.8%                        | $53.1~\mathrm{W/in^3}$               | 50 A                       | 88.1%                  | $88.5~\mathrm{W/in^3}$         | 350 kHz                         | Loss ✓; Size ✓                   |
| 2021      | On-Chip [117]                   | 1.5 A                            | 90.2%                        | $37.1~\mathrm{W/in^3}$               | 8 A                        | 76%                    | $198~\mathrm{W/in^3}$          | 2,500 kHz                       | Loss ✓; Size ✓                   |
| 2021      | LEGO-PoL [31]                   | 190 A                            | 88.4%                        | $124~\mathrm{W/in^3}$                | 450 A                      | 84.8%                  | $294~\mathrm{W/in^3}$          | 1,000 kHz                       | Loss ✓; Size ✓                   |
| 2021      | VIB-PoL [54]                    | 144 A                            | 93.3%                        | $74.2~\mathrm{W/in^3}$               | 450 A                      | 88.1%                  | $232~\mathrm{W/in^3}$          | 417 kHz                         | Loss ✓; Size ✓                   |
| 2022      | MLB-PoL [118]                   | 23 A                             | 91.5%                        | $101~\mathrm{W/in^3}$                | 60 A                       | 88.4%                  | $263~\mathrm{W/in^3}$          | 250 kHz                         | Loss ✓; Size ✓                   |
| 2022      | Symmetric-DIH [119]             | 36 A                             | 81.4%§                       | $205~\mathrm{W/in^3}$                | 105 A                      | 70.9%§                 | $598~\mathrm{W/in^3}$          | 750 kHz                         | Loss ✓; Size ✓                   |
| 2022      | Dickson<sup>2</sup>-PoL [101]   | 100 A                            | 91.6%                        | $133~\mathrm{W/in^3}$                | 270 A                      | 87.7%                  | $360~\mathrm{W/in^3}$          | 280 kHz                         | Loss ✓; Size ✓                   |
| 2023      | Mini-LEGO [120]                 | 160 A                            | 84.1%                        | $929~\mathrm{W/in^3}$                | 240 A                      | 82.3%                  | $1{,}390~\mathrm{W/in^3}$      | 1,515 kHz                       | Loss ✓; Size ✓                   |

<sup>\*</sup> The power density is calculated with the box volume (defined as the maximum Length×Width×Height).

relatively evenly distributed among switching loss, conduction loss, and gate loss. As load current keeps rising, the low-side conduction loss and parasitic loop inductance loss increase dramatically and dominate at 450 A full load. To further improve the efficiency and power density, multiple switches and gate drivers can be integrated together to reduce the parasitic loop inductance, especially for the SCB stage.

Table 3.5 compares several key metrics of the MSC-PoL prototype with other state-of-the-art 48 V-to-1 V point-of-load voltage regulators. As implied by the table, both the efficiency and power density of the MSC-PoL prototype are at the top tier among state-of-the-art VRM designs. The full-load power density with and without using the leakage plate is 621 W/in<sup>3</sup> and 724 W/in<sup>3</sup>, respectively. A performance metric represented as the connection curve of the efficiency and power density points at full load and peak-efficiency load is introduced and plotted in Fig. 3.43. The

<sup>†</sup> The switching frequency of the voltage regulation stage.

<sup>‡</sup> The frequency of the MSC-PoL prototype is selected for the maximum peak efficiency w/ or w/o gate loss.

 $<sup>\</sup>P$  The full load efficiency of the Vicor product is not available and is estimated.

<sup>§</sup> Efficiency including gate loss is calculated based on the gate driving energy per cycle provided in [119].



Figure 3.43: Performance comparison of the MSC-PoL prototype (with the leakage plate) and other 48 V-to-1 V VRMs. Efficiency and power density points (including gate loss and size) at full load and peak-efficiency load are plotted and connected with a line. Switching frequencies are color coded, corresponding to the logarithmic color bar. The MSC-PoL prototype achieves both excellent efficiency and power density among state-of-the-art VRM designs.

MSC-PoL prototype presented in this chapter expands the performance boundary of point-of-load VRMs by pushing towards higher efficiency and higher power density.
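The box power densities quoted for the MSC-PoL prototype in Table 3.5 can be reproduced from the module dimensions. The sketch below assumes 450 W of output power (1 V at 450 A) shared by two modules, each occupying a 31.9 mm × 26.6 mm footprint with a height of 6 mm (ladder only) or 7 mm (with the leakage plate). The function name is illustrative.

```python
MM3_PER_IN3 = 25.4 ** 3  # cubic millimetres in one cubic inch


def box_power_density(p_out_w, length_mm, width_mm, height_mm, modules=1):
    """Output power divided by the total box volume (L x W x H per module), in W/in^3."""
    volume_in3 = modules * length_mm * width_mm * height_mm / MM3_PER_IN3
    return p_out_w / volume_in3


# Volume of one module at 6-mm height, in cubic inches.
module_volume_in3 = 31.9 * 26.6 * 6.0 / MM3_PER_IN3

density_6mm = box_power_density(450.0, 31.9, 26.6, 6.0, modules=2)
density_7mm = box_power_density(450.0, 31.9, 26.6, 7.0, modules=2)

print(f"one module volume: {module_volume_in3:.2f} in^3")
print(f"6-mm height: {density_6mm:.0f} W/in^3")
print(f"7-mm height: {density_7mm:.0f} W/in^3")
```

The same function applied to the peak-efficiency load currents gives the peak-efficiency density columns, since only the output power changes.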

# 3.7 Chapter Summary

This chapter presents a granular power architecture with parallel coupled magnetics to support high current computing systems. The MSC-PoL architecture comprising multiple granular switched-capacitor and switched-inductor cells is introduced and analyzed. The stacked switched-capacitor cells split the high input voltage into multiple intermediate voltage rails, which are loaded with the switched-inductor cells to achieve soft charging and voltage regulation. Many inductors of the switched-inductor cells are coupled into one and operated in an interleaved fashion to miniaturize the dc

magnetic energy storage, reduce the inductor current ripple, and improve the VRM transient speed. We develop a 48-to-1-V MSC-PoL topology and analyze its converter dynamics by small signal modeling. The small signal model and transfer functions of the 48-to-1-V MSC-PoL converter are similar to those of a multiphase buck converter. Therefore, similar control methods (e.g., voltage-mode or constant-on-time control) can be directly applied.

Containing multiple inductors and capacitors, the MSC-PoL converter is a high-order PWM converter. The resulting L-C resonant poles might challenge its control design. This chapter presents a systematic approach to analyzing the intrinsic L-C resonant behavior in a hybrid switched-capacitor/magnetics converter. By decomposing the disturbance and its response into common-mode and differential-mode dynamics, the intrinsic resonant behavior can be classified into output L- $C_o$  resonance and interphase L- $C_B$  resonance. A similar analysis approach can be extended to a higher number of phases, enabling a more intuitive understanding of transient and balancing behaviors. The impacts of coupled inductors are analyzed, indicating that a higher coupling coefficient results in smaller resonant amplitude and lower resonant frequency with longer settling time. Comprehensive guidelines for designing a controller that covers both input-output dynamics and interphase resonance are provided.

To validate the granular MSC-PoL architecture, a 48-to-1-V/450-A, 6-mm-thick MSC-PoL VRM with ladder-structured coupled inductors is built and tested. Two coupled inductor designs based on a ladder-structured core are developed and compared. A leakage magnetic plate of 0.8-mm thickness is designed to adjust the leakage inductance for lower current ripple. All power stage, gate drive, and bootstrap circuits as well as the coupled inductors of one MSC-PoL module are packaged into a 0.31 in<sup>3</sup> box volume. Including gate loss, the MSC-PoL prototype with the leakage plate achieves 91.7% peak efficiency (@170 A), 85.8% full-load efficiency (@450 A), and 621 W/in<sup>3</sup> full-load power density. Without the leakage plate, it achieves 89.5% peak efficiency (@210 A), 85.6% full-load efficiency (@450 A), and 724 W/in<sup>3</sup> full-load power density. The ultra-compact MSC-PoL VRM enables CPU PwrSiP voltage regulation and expands the performance boundary of point-of-load converters towards higher efficiency and higher power density.

### **Related Publications**

- 1. P. Wang, Y. Chen, G. Szczeszynski, S. Allen, D. Giuliano and M. Chen, "MSC-PoL: Hybrid GaN-Si Multistacked Switched Capacitor 48V PwrSiP VRM for Chiplets," TechRxiv, 23-Feb-2023.
- 2. P. Wang, D. Zhou, H. Li, D. Giuliano, G. Szczeszynski, S. Allen and M. Chen, "Interphase L-C Resonance and Stability Analysis of Series-Capacitor Buck Converters," *IEEE Transactions on Power Electronics*, 2023.
- 3. P. Wang, D. Giuliano, S. Allen and M. Chen, "MSC-PoL: An Ultra-Thin 220-A/48-to-1-V Hybrid GaN-Si CPU VRM with Multistack Switched Capacitor Architecture and Coupled Magnetics," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, Orlando, FL, USA, 2023.
- 4. P. Wang, D. Zhou, D. Giuliano, M. Chen and Y. Chen, "Multistack Switched-Capacitor Architecture with Coupled Magnetics for 48V-to-1V VRM," in *Proc. IEEE 23rd Workshop on Control and Modeling for Power Electronics (COMPEL)*, Tel Aviv, Israel, 2022.

# Chapter 4

# Granular Architecture with Series Coupled Magnetics for Large-Scale Modular Energy Systems

# 4.1 Background and Motivation

Large-scale energy systems usually contain massive numbers of modular loads or sources connected in parallel and series as a large-scale array. Figure 4.1 shows a few example large-scale energy systems, including photovoltaic systems, battery storage systems, and data center servers. In these systems, identical modular loads or sources



Figure 4.1: Example large-scale energy systems: (a) photovoltaic systems; (b) battery storage systems; and (c) data center servers.



Figure 4.2: A data storage server with series stacked power delivery architecture. It comprises a cluster of  $N \times M$  HDDs divided into N series-stacked voltage domains with differential power processing.

perform similar functions and have similar power profiles, making them a perfect fit for the series-stacked differential power processing (DPP) architecture.

Figure 4.2 illustrates the key principles of the DPP power architecture based on an example hard-disk-drive (HDD) server. In data storage servers, numerous HDDs perform similar reading/writing tasks and have similar power consumption. Power difference (i.e., differential power) among HDDs is small. To deliver power from high voltage bus to low voltage loads, conventional power architecture employs a cascaded dc-dc converter at the front, and full load power needs to be processed by the dc-dc converter. In contrast, by stacking multiple groups of HDDs in series to the high voltage bus, inherent voltage step down can be achieved due to the series connection. In this way, the vast majority of power is directly delivered to the loads, while only a small amount of power difference is processed by the DPP converter, yielding significantly reduced power conversion stress and improved efficiency.
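A rough numerical illustration of this principle is sketched below. It is a simplified, lossless bookkeeping model and not the loss analysis developed later in this chapter. It assumes equal domain voltages, so the string delivers the average domain power to every domain, and an ideal fully coupled DPP converter that moves power directly from domains below the average to domains above it. The ten load powers are hypothetical values chosen for the example.

```python
def dpp_power_summary(domain_powers_w):
    """Split each domain's power into the directly delivered average and a mismatch.

    Assumptions (illustrative): equal domain voltages, lossless conversion, and a
    fully coupled DPP converter, so the processed power equals the sum of the
    positive mismatches.
    """
    n = len(domain_powers_w)
    total = sum(domain_powers_w)
    mean = total / n
    mismatch = [p - mean for p in domain_powers_w]
    processed = sum(m for m in mismatch if m > 0)
    return {
        "total_w": total,
        "mean_w": mean,
        "mismatch_w": mismatch,
        "processed_w": processed,
        "processed_fraction": processed / total,
    }


# Hypothetical power draw of ten series-stacked voltage domains, in watts.
loads_w = [45.0, 47.0, 44.0, 46.0, 48.0, 43.0, 45.0, 46.0, 44.0, 42.0]
summary = dpp_power_summary(loads_w)

print(f"total load power: {summary['total_w']:.0f} W")
print(f"power processed by the DPP converter: {summary['processed_w']:.0f} W")
print(f"processed fraction: {summary['processed_fraction'] * 100:.2f}%")
```

In a conventional cascaded architecture the dc-dc converter would process the full load power, whereas in this idealized picture the DPP converter handles only the mismatch between domains.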

DPP architectures originate from battery active equalization circuits, including switched-inductor (buck-boost) types [121, 122], switched-capacitor types [18, 123],

and ac-link or dc-link fully-coupled types based on flyback [124–126], forward [127], half-bridge [128], and dual-active-bridge (DAB) converters [129,130]. Similar topologies were later applied to photovoltaic (PV) systems to manage mismatch among series PV cells [131,132]. Control strategies and architectures have been proposed to achieve PV maximum power point tracking (MPPT) [133–138]. DPP architectures have also been implemented in emerging dc systems such as data center servers [22,139] and multi-processor systems [140–142].

In this chapter, a multiport ac-coupled differential power processing (MAC-DPP) converter is presented, which couples all series-stacked voltage domains through the series coupled magnetic flux in a multi-winding transformer [22]. The proposed MAC-DPP converter features reduced component count, smaller magnetic volume, and fewer differential power conversion stages compared to other DPP solutions. The granular power architecture of the MAC-DPP converter offers high modularity and extendability, enabling linear extension without customizing the design for each port. Other key contributions of this chapter include:

- Stochastic power loss analysis for DPP: A stochastic modeling approach is developed to analyze power loss scaling in a DPP system based on probability distributions of loads or sources [143]. Scaling factors are introduced to describe how losses change with DPP system size and load or source power variance.
- Modeling and control of MIMO power flow: The MAC-DPP converter is a multi-input-multi-output system. To precisely transfer the required power flow of each port and balance each domain voltage, this chapter introduces two control methods: (a) a feedback control based on distributed phase shift modules [144]; and (b) a feedforward control based on Newton-Raphson method [145]. A generalized small signal model is derived to provide guidance on the control loop design of large-scale MAC-DPP systems.
- String voltage regulation for differential power processing: One challenge for the DPP system design is to regulate the series-stacked string voltage. Leveraging the partial power processing concept [68, 146–153], a series voltage compensator (SVC) architecture is developed to regulate the DPP string voltage with improved efficiency and power density [43].

To validate the MAC-DPP architecture and theoretical analysis, a 10-port 450 W MAC-DPP prototype with 700 W/in<sup>3</sup> power density is built. The prototype is tested on both a 50-HDD storage server and a 30×20 LED screen. Supported by the MAC-DPP prototype, the HDD server can achieve 99.77% system efficiency and can maintain normal reading/writing operation of all HDDs against the worst hot-swapping scenarios, realizing the first complete demonstration of a DPP-powered data storage server with full reading, writing, and hot-swapping capabilities. The exploration of the co-design of software, hardware, and power architecture provides profound insight for designing next-generation power architectures in data centers. To verify the effectiveness of the stochastic power loss model, random load tasks (independent or correlated) are set up and assigned to the LED array, and the tested results match well with theoretical analysis. Besides, a buck-derived SVC prototype is designed and built, which can regulate the DPP string voltage precisely at 50 V from a 50~65 V dc bus and can achieve 98.8% peak system efficiency.

In the remainder of this chapter, Section 4.2 introduces the MAC-DPP architecture and qualitatively compares it with existing DPP solutions. Section 4.3 presents the stochastic loss modeling approach and defines a figure of merit to quantitatively compare different DPP topologies given a random load power distribution. Section 4.4 develops a generalized small signal model for large-scale MAC-DPP systems and proposes two control methods to manipulate the MIMO power flow. Section 4.5 introduces the SVC architecture for DPP string voltage regulation. Experimental verification of the MAC-DPP and the SVC architectures as well as their theoretical analysis is summarized in Section 4.6. Finally, Section 4.7 concludes this chapter.

### 4.2 Multiport ac-Coupled Differential Power Processing

Many DPP converter topologies have been explored so far. In [121], a load-to-load DPP architecture was proposed, which uses a bidirectional buck-boost circuit to process the differential power between two neighboring loads. Compared to DPP converters that connect each load to the input dc bus [141, 154], the load-to-load DPP converter has reduced switch voltage stress  $(2V_{load})$ . However, the differential power between two non-adjacent loads has to go through multiple power conversion stages due to the laddered structure. This creates higher power conversion losses and limits the system dynamic performance. Another ladder-structured DPP topology is based on ladder switched-capacitor (SC) circuits [134,140]. The ladder SC-DPP converter can achieve high efficiency and high power density, but during load transient, it can only transfer power between neighboring voltage domains within one switching cycle. If two voltage domains are not directly connected, it takes multiple switching cycles to transfer energy from one domain to the other. One alternative DPP approach is to employ multiple isolated dc-dc converters (e.g., flyback, dual active bridge (DAB), etc.) and connect each voltage domain to a virtual dc bus or an input dc bus [126,139]. The dc-coupled DPP architecture can transfer power directly between two arbitrary loads. Compared to ladder-structured DPP options, this architecture is more scalable and can offer better dynamic performance. However, the dc-coupled DPP topology requires multiple magnetic elements (i.e., transformers) as well as high component count, which increases the cost and total converter size. Moreover, the differential power needs to go through at least two "dc-ac-dc" stages from one port to another, resulting in additional power conversion stress and losses [145].



Figure 4.3: (a) Proposed MAC-DPP architecture. (b) Magnetic flux in the magnetic core of a multi-winding transformer with a single magnetic linkage.  $\Phi_i$  is the magnetizing flux, and  $\Delta\Phi_{ij}$  is the leakage flux. (c) Waveforms of winding volt-per-turn and peak-peak flux variation.

In this chapter, we propose a MAC-DPP architecture that connects each voltage domain to a multi-winding transformer through a dc-ac unit, as shown in Fig. 4.3a. The differential power of each voltage domain is coupled to the multi-winding transformer with series coupled magnetic flux. The granular dc-ac inverter can be implemented as a half-bridge inverter with a dc blocking capacitor. Other dc-ac inverter circuits, such as full-bridge inverters, or Class-E-based inverters, are also applicable [45, 155–157]. The power transferred between two different loads is galvanically isolated and is bidirectional. The advantages of the MAC-DPP architecture include:

- Fewer "dc-ac-dc" power conversion stages: The MAC-DPP architecture directly transfers power between any two ports with one single "dc-ac-dc" conversion stage. Existing DPP solutions usually need two or more "dc-ac-dc" stages when delivering power between two arbitrary loads. The reduced power conversion stress improves the dynamic performance and reduces the losses.
- Reduced component count: In the MAC-DPP architecture, each voltage domain is connected to one dc-ac unit, so n voltage domains need only n dc-ac units, half as many as in a dc-coupled DPP architecture.
- Smaller magnetic size: Compared to the dc-coupled DPP converter that needs multiple transformers, the MAC-DPP architecture has only one magnetic core. In principle, the magnetic core area of a multi-winding transformer is determined by the highest volt-second-per-turn among all windings, not by the number of windings. In a MAC-DPP architecture with a fully symmetric configuration, each dc-ac unit has an identical voltage rating, and all windings have identical volt-second-per-turn, which stays the same as the winding count increases. Therefore, the core area of a multi-winding transformer in the MAC-DPP is roughly the same as that of a two-winding transformer in other isolated DPP options. Only the window area increases as the winding count increases. Theoretically, the MAC-DPP architecture can reduce the magnetic core area by a factor of n compared to other isolated DPP implementations (n is the number of series voltage domains).

One challenge of designing a MAC-DPP converter is to build a high performance miniaturized multi-winding transformer with a single magnetic linkage. A basic requirement is to effectively couple all windings without saturating the magnetic core. In a two-winding transformer, the cross-section area of the core is determined by the maximum volt-second-per-turn in the windings. Here, this rule is extended to the generalized multi-winding cases. Fig. 4.3b shows the magnetic flux diagram in the magnetic core of the multi-winding transformer. There are two types of magnetic flux in the core: (a) magnetizing flux, which is coupled with each individual winding:  $\Phi_i$ ; and (b) leakage flux, which leaks out through the spacing between two windings:  $\Delta\Phi_{ij} = \Phi_i - \Phi_j$ . The magnetizing flux of a specific coupled winding is linked to the  $V_k(t)/N_k$  (volt-per-turn) by Faraday's Law.



Figure 4.4: (a) FEM simulation setup: two windings are driven by two sinusoidal voltage sources with different phase-shifts. (b) Simulated magnetic flux density inside the core at phase-shifts of 0° and 180°, respectively. (c) Peak magnetic flux density in the spacing between two adjacent windings when sweeping the voltage phase-shift from 0° to 180°.

Fig. 4.3c shows two example arbitrary periodic waveforms of the voltage at two windings. The shaded area (volt-second-per-turn) is the peak-peak flux variation within one period. The maximum magnetizing flux in the core is:

$$\Phi_M^{\max} = \frac{1}{2} \times \max_{k=1,\dots,n} \{ \Delta \Phi_k \} = \frac{1}{2} \times \max_{k=1,\dots,n} \left\{ \int_{t_{a,k}}^{t_{b,k}} \frac{V_k(t)}{N_k} dt \right\}. \tag{4.1}$$

The maximum leakage flux in the core is:

$$\Phi_L^{\text{max}} = \frac{1}{2} \times \max_{k=1,\dots,n-1} \left\{ \int_{t_{pos}} \left( \frac{V_k(t)}{N_k} - \frac{V_{k+1}(t)}{N_{k+1}} \right) dt \right\}, \tag{4.2}$$

where  $t_{pos}$  represents the time period of the positive integral.
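As a numerical illustration of (4.1) and (4.2), the following sketch evaluates both integrals on sampled volt-per-turn waveforms. It is only a sketch under stated assumptions: the function names are hypothetical, each waveform is assumed to have zero average and a single positive interval per period (so the positive volt-second area equals the peak-peak flux swing), and the example values (two single-turn windings, 2.5 V amplitude, 100 kHz sinusoids) are chosen for illustration.

```python
import numpy as np

def positive_area(v, dt):
    """Volt-seconds-per-turn accumulated while the sampled waveform is positive."""
    return float(np.sum(np.clip(v, 0.0, None)) * dt)

def max_magnetizing_flux(volt_per_turn, dt):
    """Eq. (4.1): half of the largest peak-peak flux swing over all windings."""
    return 0.5 * max(positive_area(v, dt) for v in volt_per_turn)

def max_leakage_flux(volt_per_turn, dt):
    """Eq. (4.2): half of the largest positive volt-second difference between neighbors."""
    pairs = zip(volt_per_turn[:-1], volt_per_turn[1:])
    return 0.5 * max(positive_area(a - b, dt) for a, b in pairs)

# Illustrative values (assumed): two single-turn windings, 2.5 V amplitude, 100 kHz.
AMPLITUDE, PERIOD, SAMPLES = 2.5, 1e-5, 20000
DT = PERIOD / SAMPLES
TIME = np.arange(SAMPLES) * DT

def two_winding_fluxes(phase_deg):
    """Return (magnetizing, leakage) flux maxima in Wb for a given phase shift."""
    w = 2.0 * np.pi / PERIOD
    v1 = AMPLITUDE * np.sin(w * TIME)
    v2 = AMPLITUDE * np.sin(w * TIME - np.deg2rad(phase_deg))
    return max_magnetizing_flux([v1, v2], DT), max_leakage_flux([v1, v2], DT)

if __name__ == "__main__":
    for phase in (0, 90, 180):
        phi_m, phi_l = two_winding_fluxes(phase)
        print(f"phase {phase:3d} deg: Phi_M = {phi_m:.3e} Wb, Phi_L = {phi_l:.3e} Wb")
```

For sinusoids of amplitude $A$ and period $T$, the two integrals have the closed forms $AT/(2\pi)$ and $AT\sin(\varphi/2)/\pi$, where $\varphi$ is the phase shift, so the leakage flux vanishes for in-phase windings and is largest at 180°.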

As a result, the maximum leakage flux density in a multi-winding transformer is located at the spacing between two windings if the winding voltages have opposite phases (assuming equal voltage amplitudes at all ports). Fig. 4.4 shows an example transformer simulated in ANSYS Maxwell to validate the design guidelines with finite element modeling (FEM). This transformer has a ferrite planar core (ELP18/10 with $\mu_r = 1000$). Each winding has one single turn. Two sinusoidal voltage sources (2.5 V amplitude, 100 kHz) were connected to the two windings. Fig. 4.4b shows the simulated magnetic flux density inside the core with different phase-shifts. If two voltage sources are in phase, the magnetic flux density in the core is relatively uniform, and the maximum flux density ($B_{\text{max}}$) is low. When the phase-shift increases to 180°, the two voltage sources have exactly opposite phases, and the magnetic flux concentrates at the center pole surface between two windings, leading to a high peak flux density that might saturate the core. Fig. 4.4c shows the peak flux density of the spacing area between two windings when sweeping the phase-shift from 0° to 180°. The $B_{\text{max}}$ increases as the phase shift increases.

To avoid saturating the core, the minimum core area should be designed for the maximum volt-second-per-turn, and the spacing distance between two windings should be designed for the maximum phase-shift between two neighboring ports. Whether a core will saturate or not is independent of the number of windings. A large number of windings driven by different voltage sources can be coupled to a single magnetic linkage without saturating the core, as long as the maximum volt-second-per-turn does not exceed the designed limit. If all windings are driven by square wave voltage sources with the same volt-per-turn amplitude  $V_0$  and period T, the maximum magnetizing flux in the core is:

$$\Phi_{\text{max}} = \frac{1}{2} \int_{\frac{T}{2}} V_0 dt = \frac{1}{4} V_0 T. \tag{4.3}$$

The maximum magnetizing flux is independent of the number of windings n and is determined only by the maximum volt-second-per-turn $(V_0T)$ of all windings. Accordingly, the minimum core area $(A_{\min})$ of a multi-winding transformer driven by an arbitrary number of square wave voltage sources with amplitude of $V_0$ is:

$$A_{\min} = \frac{\Phi_{\max}}{B_{\text{sat}}} = \frac{V_0 T}{4B_{\text{sat}}}. \tag{4.4}$$
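A minimal sketch of (4.3) and (4.4) is given below to show how the core cross-section follows from the volt-per-turn amplitude alone. The function names, the numbers in the usage note, and the baseline of one separate two-winding transformer per port are assumptions made for illustration.

```python
def max_flux_square_wave(v0, period):
    """Eq. (4.3): peak magnetizing flux (Wb) for a square-wave volt-per-turn of amplitude v0."""
    return 0.25 * v0 * period

def min_core_area(v0, period, b_sat):
    """Eq. (4.4): minimum core cross-section (m^2) that keeps the flux density below b_sat."""
    return max_flux_square_wave(v0, period) / b_sat

def total_core_area(n_ports, v0, period, b_sat, shared_core=True):
    """Total core cross-section for n_ports windings.

    shared_core=True: one multi-winding transformer; the area does not depend on n_ports.
    shared_core=False: assumed baseline of one separate two-winding transformer per port.
    """
    area = min_core_area(v0, period, b_sat)
    return area if shared_core else n_ports * area

if __name__ == "__main__":
    # Hypothetical design point: 5 V per turn, 100 kHz, 0.3 T saturation limit.
    print(min_core_area(5.0, 1e-5, 0.3) * 1e6, "mm^2")
```

With the hypothetical design point of 5 V per turn, 100 kHz, and a 0.3 T limit, the required area is about 41.7 mm² regardless of how many windings share the core.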

Therefore, coupling many voltage domains with a single linkage multi-winding transformer can significantly reduce the required magnetic core volume of a multi-port topology. This is the fundamental reason why the proposed MAC-DPP architecture can achieve much higher power density and better magnetic utilization than other isolated DPP implementations. Compared to non-isolated DPP options without transformers [128], the MAC-DPP architecture also offers reduced power conversion stress (fewer "dc-ac-dc" stages), lower component voltage rating, higher modularity, and lower component count.

### 4.3 Stochastic Power Loss Analysis and Performance Scaling

Power flows in DPP systems are usually dynamic and unpredictable [158]. Power distribution and mismatch among series voltage domains are influenced by factors that include aging, manufacturing variation, temperature differences [18], illuminance variation [159,160], and random task requests for data center servers [22]. Potentially, each module power is a random process. Previous work to analyze how power loss and power ratings of DPP converters change with statistical variance has been based on numerical simulations or data-driven methods [154, 161, 162]. An analytical method to evaluate performance with large-scale stochastic loads or sources is still needed.

In this section, DPP topologies are grouped into two primary categories: fully coupled DPP and ladder DPP. We perform a systematic analysis of power flow for each, and develop a stochastic model to predict conduction loss and its distribution. The purpose of the stochastic model is not to predict all losses in DPP systems, but rather to understand how performance scales with system dimension and load or source power variance. The model provides guidance on topology selection and design optimization. Instead of estimating loss for a specific case, the model is an ensemble evaluation for stochastic power distributions (e.g., Gaussian, Poisson, Bernoulli, etc.). A scaling factor, $\mathcal{S}(\bullet)$, is introduced to describe how loss changes with system size or module power variance. Representative DPP topologies are analyzed and compared to a reference N:1 DAB converter [163, 164], given the same total switch die area and magnetic core size. The models are validated with SPICE simulations and with experiments designed (in Section 4.6) to test loss scaling.

Figure 4.5: (a) An $N \times M$ DPP system with N series-stacked voltage domains, each comprising M load or source modules. $P_{ij}(t)$ and $P_i(t)$ are the power of one dc module and of one voltage domain, respectively; $\Delta P_i(t)$ is the power mismatch for one voltage domain. (b) Load power and mismatched power of each voltage domain is a random process with a certain probability distribution (Gaussian distributions are shown here as an example).

#### 4.3.1 Parameter Definitions and Modeling Assumptions

Fig. 4.5a shows a general DPP system. An $N \times M$ array of load or source modules is configured in N series-stacked voltage domains. Each domain comprises M modules connected in parallel. Analysis in this section is based on modular loads, and analysis for modular sources is the same. Denote the power consumption of the $j^{th}$ load in the $i^{th}$ voltage domain as $P_{ij}(t)$. The total domain power consumed within the $i^{th}$ voltage domain is

$$P_i(t) = P_{i1}(t) + P_{i2}(t) + \dots + P_{iM}(t). \tag{4.5}$$

DPP converters deliver power mismatch  $\Delta P_i(t)$  among series voltage domains. In practical applications, the power distribution can be complicated, with unpredictable patterns or correlations. In this section, each individual load power  $P_{ij}(t)$ , domain power  $P_i(t)$ , and mismatched power  $\Delta P_i(t)$  is modeled as a random process with certain probability distributions as indicated in Fig. 4.5b. We first analyze the case when all module powers are statistically independent with identical distributions (i.i.d.), and later extend the analysis to cases with correlation. In the case with i.i.d. loads, individual load power mean values  $\mathbb{E}[P_{ij}(t)]$  and variances  $\mathrm{Var}[P_{ij}(t)]$  are identical and are denoted as  $\mu_0$  and  $\sigma_0^2$ . Each domain has the same voltage, denoted as  $V_0$ . A more general case allows unbalanced voltages (as when each domain has its own power droop characteristic), but matched domain voltages are explored here for clarity. The analytical framework in this section can be applied to DPP systems with more complicated patterns such as unmatched load power expectations across different voltage domains.

#### 4.3.2 Fully Coupled DPP and Ladder DPP

The two primary DPP categories are shown in Fig. 4.6. Fig. 4.6a depicts the architecture of a fully-coupled DPP converter, in which all voltage domains are coupled by the DPP circuitry. A typical fully-coupled DPP circuit functions as a multiport dc-dc converter [131], with a direct power flow path between any two domains. Due to the series architecture, the same bus current $\overline{I}(t) = \frac{\sum_{k=1}^{N} P_k(t)}{NV_0}$ flows through each voltage domain plus its corresponding DPP port. The instantaneous differential power processed for the $i^{th}$ voltage domain is

$$\Delta P_i(t) = \overline{I}(t)V_0 - P_i(t) = \overline{P}(t) - P_i(t). \tag{4.6}$$



Figure 4.6: Typical DPP architectures: (a) fully-coupled DPP; (b) ladder DPP.

Here,  $\overline{P}(t) = \sum_{k=1}^{N} P_k(t)/N$  is the arithmetic average of the N domain powers. Eq. (4.6) indicates that in a fully-coupled DPP converter, the differential power processed at the  $i^{th}$  port is the power mismatch between the average domain power  $\overline{P}(t)$  and the  $i^{th}$  domain power  $P_i(t)$ . With i.i.d. loads, the power rating of each port in a fully-coupled DPP is the same.
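The bookkeeping in (4.6) can be sketched in a few lines. The function names and the example domain powers are hypothetical; the sketch only illustrates that the differential powers sum to zero and are small when the domain powers are similar.

```python
def fully_coupled_differential_power(domain_powers):
    """Eq. (4.6): Delta P_i = average domain power minus P_i, for every domain."""
    average = sum(domain_powers) / len(domain_powers)
    return [average - p for p in domain_powers]

def transferred_fraction(domain_powers):
    """Power moved between domains (sum of positive Delta P_i) relative to total load power."""
    delta = fully_coupled_differential_power(domain_powers)
    return sum(d for d in delta if d > 0) / sum(domain_powers)

if __name__ == "__main__":
    powers = [10.0, 12.0, 8.0, 10.0]  # hypothetical domain powers in watts
    print(fully_coupled_differential_power(powers))
    print(transferred_fraction(powers))
```

In this example the four domains draw 40 W in total, while only 2 W (5%) has to be moved from the lightly loaded domain to the heavily loaded one. A positive $\Delta P_i$ means the domain draws less than the average, so its port takes up the surplus.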

Fig. 4.6b shows the architecture of a domain-to-domain or ladder DPP system, in which multiple standalone dc-dc converters (termed *DPP submodules*) link neighboring voltage domains. The differential power processed for one voltage domain is related to multiple DPP submodules,

$$P_i(t) + \Delta P_{i \leftrightarrow i+1}(t) - \Delta P_{i-1 \leftrightarrow i}(t) = \overline{I}(t)V_0 = \overline{P}(t). \tag{4.7}$$

 $\Delta P_{i \leftrightarrow i+1}(t)$  is the differential power that the  $i^{th}$  submodule delivers from the  $i^{th}$  domain to the  $(i+1)^{th}$  domain  $(\Delta P_{i \leftrightarrow i+1}(t) = 0$ , if i=0 or N). Reorganizing (4.7),

$$\begin{aligned} \Delta P_{i \leftrightarrow i+1}(t) &= \sum_{k=1}^{i} (\overline{P}(t) - P_k(t)) = \sum_{k=1}^{i} \Delta P_k(t) \\ &= i \times \overline{P}(t) - \sum_{k=1}^{i} P_k(t). \end{aligned} \tag{4.8}$$

In a ladder DPP converter, there is no direct power path between non-neighboring voltage domains. Differential power must go through multiple submodules to manage non-neighboring domains, potentially resulting in differential power accumulation. As indicated in (4.8), the $i^{th}$ submodule needs to process the accumulated mismatched power of the first i voltage domains, i.e., $\sum_{k=1}^{i} \Delta P_k(t)$. This causes additional power to be processed in a ladder DPP system compared to a fully-coupled DPP system. It also leads to varied power ratings among submodules in a ladder DPP converter.
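The accumulation described by (4.8) can be illustrated with a short sketch. The function name and the example domain powers are hypothetical.

```python
from itertools import accumulate

def ladder_submodule_power(domain_powers):
    """Eq. (4.8): power handled by submodule i (between domains i and i+1), i = 1..N-1."""
    n = len(domain_powers)
    average = sum(domain_powers) / n
    running = list(accumulate(average - p for p in domain_powers))
    # The N-th partial sum is zero by construction: there is no submodule below the last domain.
    return running[:-1]

if __name__ == "__main__":
    powers = [12.0, 12.0, 8.0, 8.0]  # hypothetical domain powers in watts
    print(ladder_submodule_power(powers))
```

In this example every domain deviates from the average by only 2 W, yet the middle submodule has to carry 4 W because the mismatch of the two upper domains accumulates on its way down the stack.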

#### 4.3.3 Stochastic Loss Model and Scaling Factor

In Fig. 4.5a, parameters N, M, and  $\sigma_0^2$  impact the differential power processed by DPP converters. Here, we develop a stochastic model with i.i.d. loads to quantify the impact. Scale-dependent loss (i.e., loss that scales with system size or load power variance) is derived based on processed differential power. Losses that are expected to be approximately scale independent, such as control power and losses linked to switching frequency, are not included in the model but are explored during experiments to test scaling validity. The expected value of scale-dependent power loss is used to describe the average loss of a DPP system. For comparison, a stochastic loss model is derived for a conventional N:1 DAB converter delivering the same total load power  $\sum_{i=1}^{N} P_i(t)$ , and this is used as a reference case. Detailed derivations of the expected scale-dependent power loss are provided in Appendix B.1.1.



Figure 4.7: Equivalent circuit model for loss analysis of: (a) conventional N:1 dc-dc converter based on a DAB; (b) fully-coupled DPP; (c) ladder DPP.

Fig. 4.7 shows equivalent circuit models of the reference converter and of the two typical DPP architectures. Conduction loss dominates scale-dependent losses, and is captured by an effective output resistance, $R_{out}$, for each module or circuit. Switching loss, core loss, control power, and other nonideal effects can be added, typically as polynomial functions of the processed power, to enhance accuracy, but the modeling procedure for any of these follows from that presented below.

1. Conventional reference N:1 DAB: A stochastic loss model for a conventional N:1 DAB converter outputting $V_0$ is derived here as a comparative reference or baseline. This converter can be modeled as an N:1 transformer with an output resistance $R_{out}$ [17], as shown in Fig. 4.7a. All loads are connected in parallel at the output. The loss in this converter when processing full power is

$$\begin{aligned} \mathbb{E}[P_{loss}(t)] &= \mathbb{E}[R_{out}I_{out}^{2}(t)] = \frac{R_{out}}{V_{0}^{2}}\mathbb{E}\left[\left(\sum_{i=1}^{N}P_{i}(t)\right)^{2}\right] \\ &= \left(MN\sigma_{0}^{2} + M^{2}N^{2}\mu_{0}^{2}\right) \times \frac{R_{out}}{V_{0}^{2}} \Rightarrow \underbrace{\mathcal{S}(M^{2}N^{2}\mu_{0}^{2})}_{scaling\ factor}. \end{aligned} \tag{4.9}$$

We use the symbol $\mathcal{S}(\bullet)$ to represent a performance scaling factor that describes how power loss changes with system size or load power variance. As indicated by (4.9), loss in the reference converter depends on average load power as well as on load variance, and scales quadratically with the total average load power $MN\mu_0$ unless the variance $\sigma_0^2$ is extremely high.
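The second line of (4.9) follows from the identity $\mathbb{E}[X^2] = \mathrm{Var}[X] + (\mathbb{E}[X])^2$ applied to the total load power, which under the i.i.d. assumption is a sum of $MN$ independent module powers with mean $\mu_0$ and variance $\sigma_0^2$:

$$\begin{aligned} \mathbb{E}\left[\left(\sum_{i=1}^{N}P_{i}(t)\right)^{2}\right] &= \mathrm{Var}\left[\sum_{i=1}^{N}\sum_{j=1}^{M}P_{ij}(t)\right] + \left(\mathbb{E}\left[\sum_{i=1}^{N}\sum_{j=1}^{M}P_{ij}(t)\right]\right)^{2} \\ &= MN\sigma_{0}^{2} + \left(MN\mu_{0}\right)^{2}. \end{aligned}$$

The ratio of the variance term to the mean term is $\sigma_0^2/(MN\mu_0^2)$, so the mean term sets the scaling factor unless $\sigma_0^2$ is comparable to $MN\mu_0^2$.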

2. Fully-Coupled DPP Converter: As illustrated in Fig. 4.7b, a fully-coupled DPP topology can be modeled as an N-port network coupled with an N-winding transformer of uniform turns ratios. Each port has an effective output resistance  $R_{out}$ , matched for this analysis. The  $i^{th}$  port processes  $\Delta P_i(t)$ , so the instantaneous loss and expected loss at the  $i^{th}$  port are

$$P_{loss.i}(t) = \Delta I_i(t)^2 R_{out} = R_{out} \left(\frac{\Delta P_i(t)}{V_0}\right)^2 = R_{out} \left(\frac{\overline{P}(t) - P_i(t)}{V_0}\right)^2, \tag{4.10}$$

$$\mathbb{E}[P_{loss.i}(t)] = \frac{R_{out}}{V_0^2} \times \frac{M(N-1)}{N} \sigma_0^2. \tag{4.11}$$

Here,  $\Delta I_i(t)$  is the current flowing through  $R_{out}$  at each port and is also the mismatch between the average current and domain load current:  $\Delta I_i(t) = \overline{I}(t) - I_i(t)$ . Notice that  $\mathbb{E}[P_{loss.i}(t)]$  is proportional to  $\sigma_0^2$  because  $P_{loss.i}(t)$  depends on  $\Delta I_i^2(t)$ . Each port has the same expected loss, and the total is

$$\mathbb{E}[P_{loss}(t)] = \sum_{i=1}^{N} \mathbb{E}[P_{loss.i}(t)] = M(N-1)\sigma_0^2 \times \frac{R_{out}}{V_0^2} \Rightarrow \underbrace{\mathcal{S}(MN\sigma_0^2)}_{scaling\ factor}. \tag{4.12}$$

The loss scaling in (4.12) is linear in N, M, and  $\sigma_0^2$  but independent of the average load power  $\mu_0$ .
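The closed form in (4.12) can be cross-checked with a small Monte Carlo experiment. The sketch below is not the validation procedure used in this chapter; the function names, the Gaussian module powers, and the numerical values are assumptions, and the loss is normalized to $R_{out}/V_0^2$.

```python
import numpy as np

def simulate_fully_coupled_loss(n_domains, m_modules, mu0, sigma0, trials=50_000, seed=0):
    """Monte Carlo estimate of the total expected loss, normalized to R_out / V0^2."""
    rng = np.random.default_rng(seed)
    modules = rng.normal(mu0, sigma0, size=(trials, n_domains, m_modules))
    domain_power = modules.sum(axis=2)                                # Eq. (4.5)
    delta = domain_power.mean(axis=1, keepdims=True) - domain_power   # Eq. (4.6)
    return float((delta ** 2).sum(axis=1).mean())                     # Eq. (4.10) summed over ports

def predicted_fully_coupled_loss(n_domains, m_modules, sigma0):
    """Closed form of Eq. (4.12), normalized to R_out / V0^2."""
    return m_modules * (n_domains - 1) * sigma0 ** 2

if __name__ == "__main__":
    for mu0 in (20.0, 200.0):  # changing the mean should not change the loss
        est = simulate_fully_coupled_loss(10, 5, mu0, 2.0)
        print(mu0, est, predicted_fully_coupled_loss(10, 5, 2.0))
```

Running the sketch with two different mean powers gives essentially the same estimate, which illustrates that only the variance of the module power enters (4.12).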

3. Ladder DPP Converter: In a ladder DPP topology, each submodule can be modeled as a 1:1 transformer with output resistance  $R_{out}$ , as illustrated in Fig. 4.7c. The  $i^{th}$  submodule is processing  $\Delta P_{i\leftrightarrow i+1}(t)$ , so the instantaneous and expected loss of the  $i^{th}$  submodule are

$$P_{loss.i}(t) = \Delta I_{i \leftrightarrow i+1}(t)^2 R_{out} = R_{out} \left( \frac{\Delta P_{i \leftrightarrow i+1}(t)}{V_0} \right)^2 = R_{out} \left( \frac{i \times \overline{P}(t) - \sum_{k=1}^{i} P_k(t)}{V_0} \right)^2, \tag{4.13}$$

$$\mathbb{E}[P_{loss.i}(t)] = \frac{R_{out}}{V_0^2} \times \frac{M(N-i)i}{N} \sigma_0^2. \tag{4.14}$$

Here,  $\Delta I_{i\leftrightarrow i+1}(t)$  is the effective current that flows through  $R_{out}$  at each submodule and is equal to the accumulated mismatched current of the top i voltage domains:  $\Delta I_{i\leftrightarrow i+1}(t) = \sum_{k=1}^{i} \Delta P_k(t)/V_0 = \sum_{k=1}^{i} \Delta I_k(t)$ . Expected loss varies among submodules, and the total expected loss is

$$\mathbb{E}[P_{loss}(t)] = \sum_{i=1}^{N-1} \mathbb{E}[P_{loss.i}(t)] = \frac{M(N-1)(N+1)}{6} \sigma_0^2 \times \frac{R_{out}}{V_0^2} \Rightarrow \underbrace{\mathcal{S}(MN^2\sigma_0^2)}_{scaling\ factor}. \tag{4.15}$$

The loss scales linearly with M and  $\sigma_0^2$ , and quadratically with N. Compared to a fully-coupled DPP converter, a ladder DPP converter has a higher order of scaling factor with N since differential power accumulates along the series stack. Notice that the total loss is still independent of the average load power  $\mu_0$ .
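The summation step from the per-submodule loss to (4.15) can be checked exactly with rational arithmetic. The sketch below uses hypothetical function names, normalizes all losses to $R_{out}/V_0^2$, and assumes the same $R_{out}$ for both architectures when forming the ratio.

```python
from fractions import Fraction

def ladder_submodule_expected_loss(i, n_domains, m_modules=1, variance=1):
    """Expected loss of submodule i: M (N - i) i / N * sigma0^2."""
    return Fraction(m_modules * (n_domains - i) * i, n_domains) * variance

def ladder_total_expected_loss(n_domains, m_modules=1, variance=1):
    """Sum of the per-submodule expected loss over the N - 1 submodules."""
    return sum(ladder_submodule_expected_loss(i, n_domains, m_modules, variance)
               for i in range(1, n_domains))

def ladder_closed_form(n_domains, m_modules=1, variance=1):
    """Eq. (4.15): M (N - 1)(N + 1) / 6 * sigma0^2."""
    return Fraction(m_modules * (n_domains - 1) * (n_domains + 1), 6) * variance

def ladder_to_fully_coupled_ratio(n_domains):
    """Ratio of Eq. (4.15) to Eq. (4.12) for equal R_out, which reduces to (N + 1) / 6."""
    return ladder_closed_form(n_domains) / (n_domains - 1)

if __name__ == "__main__":
    for n in (2, 5, 11, 50):
        print(n, ladder_total_expected_loss(n), ladder_to_fully_coupled_ratio(n))
```

Under the equal-$R_{out}$ assumption, the ratio $(N+1)/6$ grows linearly with the number of stacked domains. The comparison in Section 4.3.4 refines this by giving each topology its own $R_{out}$.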

Table 4.1 summarizes the expected power loss and scaling factors of the three architectures. For DPP solutions, the expected loss scales linearly with variance  $\sigma_0^2$  but is independent of average load power  $\mu_0$ . This is consistent with the fundamental benefit: loss in a DPP system is determined by power differences, expected to be only a fraction of total load power. If the individual load powers match, a DPP system has no conduction loss.

| Architecture      | Expected Loss of the $i^{th}$ Port or Submodule            | Expected Total Power Loss                                      | Scaling Factor                |
|-------------------|------------------------------------------------------------|----------------------------------------------------------------|-------------------------------|
| N:1 DAB Converter | N/A                                                        | $(MN\sigma_0^2 + M^2N^2\mu_0^2) \times \frac{R_{out}}{V_0^2}$  | $\mathcal{S}(M^2N^2\mu_0^2)$  |
| Fully-Coupled DPP | $\frac{M(N-1)}{N}\sigma_0^2 \times \frac{R_{out}}{V_0^2}$  | $M(N-1)\sigma_0^2 \times \frac{R_{out}}{V_0^2}$                | $\mathcal{S}(MN\sigma_0^2)$   |
| Ladder DPP        | $\frac{M(N-i)i}{N}\sigma_0^2 \times \frac{R_{out}}{V_0^2}$ | $\frac{M(N-1)(N+1)}{6}\sigma_0^2 \times \frac{R_{out}}{V_0^2}$ | $\mathcal{S}(MN^2\sigma_0^2)$ |

Table 4.1: Stochastic Power Loss Model Comparison  $(M \ge 1, N \ge 2)$ 



Figure 4.8: Expected power loss of the  $i^{th}$  port or submodule in a fully-coupled DPP converter and a ladder DPP converter with N series voltage domains.

Fig. 4.8 plots the expected loss distribution in a fully-coupled DPP converter and a ladder DPP converter. In a fully-coupled DPP, the expected loss is uniformly distributed among different ports, whereas in a ladder DPP, submodules closer to the center of the series stack tend to process more power and generate more loss.
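The loss distribution of a ladder DPP converter can be reproduced qualitatively with a Monte Carlo sketch. The function names, the Gaussian module powers, and the numerical values are assumptions; losses are normalized to $R_{out}/V_0^2$, and the per-port and per-submodule formulas are those listed in Table 4.1.

```python
import numpy as np

def ladder_loss_profile(n_domains, m_modules, mu0, sigma0, trials=50_000, seed=0):
    """Monte Carlo estimate of the expected loss of each ladder submodule (1..N-1)."""
    rng = np.random.default_rng(seed)
    modules = rng.normal(mu0, sigma0, size=(trials, n_domains, m_modules))
    domain_power = modules.sum(axis=2)
    delta = domain_power.mean(axis=1, keepdims=True) - domain_power
    accumulated = np.cumsum(delta, axis=1)[:, :-1]   # Eq. (4.8), last partial sum dropped
    return (accumulated ** 2).mean(axis=0)

def predicted_ladder_profile(n_domains, m_modules, sigma0):
    """Per-submodule expected loss from Table 4.1: M (N - i) i / N * sigma0^2."""
    i = np.arange(1, n_domains)
    return m_modules * (n_domains - i) * i / n_domains * sigma0 ** 2

def fully_coupled_level(n_domains, m_modules, sigma0):
    """Per-port expected loss of a fully-coupled DPP: M (N - 1) / N * sigma0^2."""
    return m_modules * (n_domains - 1) / n_domains * sigma0 ** 2

if __name__ == "__main__":
    print(ladder_loss_profile(10, 5, 20.0, 2.0))
    print(predicted_ladder_profile(10, 5, 2.0))
    print(fully_coupled_level(10, 5, 2.0))
```

The ladder profile is a parabola in the submodule index that peaks at the middle of the stack, whereas the fully-coupled level is the same for every port.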

#### 4.3.4 Performance Scaling of Various DPP Topologies

Figure 4.9: Magnetic core window area distribution and winding conductance. Total core window area is proportional to $\sum G_m n^2$. $A_w$ represents the distributed area for each winding, n is the effective number of turns in each winding, $\rho$ is the winding resistivity, and MLT is the mean length per turn, set to be identical for all windings.

This subsection explores DPP performance scaling trends as the system size or power variance scales up. The expected power loss of specific DPP topologies is calculated and compared with that of the reference DAB converter. Since the switch count and magnetic component count of a DPP topology track the number of voltage domains N, a reasonable comparison needs to be based on the same total semiconductor switch size and magnetic component volume. Here, several example DPP topologies are explored this way. Their output resistance $R_{out}$ is analyzed and compared with that of the reference converter under the following constraints:

1. Identical Total Semiconductor Die Area: For switches, semiconductor die area scales linearly with the $G_{sw}V_{sw}^k$ product [6,93]. $G_{sw}$ is switch conductance; $V_{sw}$ is switch blocking voltage; and coefficient k, typically 2, depends on material and process. The total semiconductor die area is represented as the sum $\sum G_{sw}V_{sw}^2$ for all switches, constrained to be identical for topologies compared here and normalized to $G_{SW}V_0^2$.
2. Identical Total Volume of Magnetic Components: In this section, total volume of magnetic components is evaluated using core window area, which in turn tracks core cross sectional area. As illustrated in Fig. 4.9, the window area of each winding is proportional to $G_m n^2$ (each winding is assigned the same fill factor). $G_m$ is the winding conductance and n is the number of series turns. Here, n is determined by flux limits on volts per turn. Volts per turn values are scaled to $V_0$. The total window area is the sum $\sum G_m n^2$ over all windings, constrained to be identical for topologies compared here and normalized to $G_M$. Notice that SC DPP topologies compared in this section do not require magnetics.

To model the output resistance  $R_{out}$  in Fig. 4.7,  $R_{ds(on)}$  of each switch and winding dc resistance are lumped together and constrained as above.

Figs. 4.10 and 4.11 exhibit several typical circuit implementations of fully-coupled DPP architectures and ladder DPP architectures, respectively. An energy buffering capacitor can be added in parallel to each voltage domain for stable voltage. Table 4.2 compares these topologies to the reference converter, in terms of normalized quantities. In Table 4.2, the root-mean-square (RMS) current in each component is calculated based on the output current ( $I_{out}$ ) or the effective differential current ( $\Delta I_i$  or  $\Delta I_{i\leftrightarrow i+1}$ ) as defined in Fig. 4.7. For the reference DAB converter, the semiconductor die area  $G_{SW}V_0^2$  and winding window area  $G_M$  are equally distributed between the primary and secondary sides; for DPP converters, they are equally distributed among DPP ports or submodules.

To model $R_{out}$ of magnetic-based topologies (reference converter, Figs. 4.10a-4.10b, and Figs. 4.11a-4.11b), the component RMS current is calculated with the following approximations: (1) trapezoidal current waveforms in topologies with active bridges (reference converter, Figs. 4.10a-4.10b, and Fig. 4.11b) are treated as square waves; (2) the inductor current in the DPP topology with buck-boost cells (Fig. 4.11a) has low ripple. Based on switch $R_{ds(on)}$, winding dc resistance, and RMS current, effective output resistance $R_{out}$ of the magnetic-based topologies can be obtained.

Figure 4.10: Fully-coupled DPP topologies: (a) ac fully-coupled DPP [22, 130]; (b) dc fully-coupled DPP [129, 139]; (c) Dickson-SC DPP [123].

Figure 4.11: Ladder DPP topologies: (a) ladder DPP with buck-boost cells [121, 135, 137, 158]; (b) ladder DPP with DAB cells; (c) ladder-SC DPP [18, 123, 124, 134, 140].

Fig. 4.10a shows an ac fully-coupled DPP (i.e., MAC-DPP) converter with full bridge coupling to a multiwinding transformer. This converter comprises 4N switches, each blocking  $V_0$ , and N windings. Volts-per-turn values are scaled to  $V_0$ , so each winding contains one turn per unit. The resistances of each switch and each winding are  $\frac{4N}{G_{SW}}$  and  $\frac{N}{G_M}$ . The RMS currents in each switch and transformer winding at the  $i^{th}$  port are  $\frac{\sqrt{2}}{2}\Delta I_i$  and  $\Delta I_i$ , respectively, so the conduction loss at the  $i^{th}$  port is

$$P_{loss.i} = \left(\frac{\sqrt{2}}{2}\Delta I_i\right)^2 \frac{4N}{G_{SW}} \times 4 + \Delta I_i^2 \frac{N}{G_M} = \Delta I_i^2 R_{out}. \tag{4.16}$$

This indicates that the output resistance of each port is  $\frac{8N}{G_{SW}} + \frac{N}{G_M}$ . Results for  $R_{out}$  of other magnetic-based DPP topologies and the reference converter can be modeled similarly and are summarized in Table 4.2.
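The accounting behind (4.16) can be written out as a short sketch. The function names and numerical values are hypothetical; the sketch assumes, as stated above, that the die-area and window-area budgets are split equally among the 4N switches and N single-turn windings.

```python
def mac_dpp_output_resistance(n_ports, g_sw, g_m):
    """Per-port R_out of the full-bridge MAC-DPP converter, following Eq. (4.16)."""
    switches_per_port = 4
    r_switch = switches_per_port * n_ports / g_sw   # 4N / G_SW per switch
    r_winding = n_ports / g_m                       # N / G_M per winding
    switch_rms_sq = 0.5    # (sqrt(2)/2)^2, relative to Delta I_i^2
    winding_rms_sq = 1.0   # winding carries Delta I_i
    return switches_per_port * switch_rms_sq * r_switch + winding_rms_sq * r_winding

def die_area_budget(n_ports, g_sw, v0=1.0):
    """Sum of G_sw * V_sw^2 over all 4N switches; equals G_SW * V0^2 by construction."""
    per_switch_conductance = g_sw / (4 * n_ports)
    return 4 * n_ports * per_switch_conductance * v0 ** 2

def window_area_budget(n_ports, g_m, turns=1):
    """Sum of G_m * n^2 over all N windings; equals G_M by construction."""
    per_winding_conductance = g_m / n_ports
    return n_ports * per_winding_conductance * turns ** 2

if __name__ == "__main__":
    # Hypothetical normalized budgets: G_SW = 2.0, G_M = 5.0, ten ports.
    print(mac_dpp_output_resistance(10, 2.0, 5.0))
```

Because the fixed budgets are divided among more ports as N grows, the per-port $R_{out}$ increases linearly with N. Substituting it into (4.12) therefore gives a total expected loss proportional to $N(N-1)$ for this topology under the two constraints.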

| Architecture | Topology | Switch Count | Switch Voltage Rating | Switch $R_{ds(on)}$ | Switch RMS Current<sup>a</sup> | Winding Count | Turns per Winding<sup>b</sup> | Winding Resistance | Winding RMS Current<sup>c</sup> | Output Resistance $R_{out}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| N:1 Converter (Conventional Reference) | DAB, primary<sup>d</sup> | 4 | $NV_0$ | $\frac{8N^2}{G_{SW}}$ | $\frac{\sqrt{2}}{2N}I_{out}$ | 1 | $N$ | $\frac{2N^2}{G_M}$ | $\frac{I_{out}}{N}$ | $\frac{32}{G_{SW}} + \frac{4}{G_M}$ |
| | DAB, secondary<sup>e</sup> | 4 | $V_0$ | $\frac{8}{G_{SW}}$ | $\frac{\sqrt{2}}{2}I_{out}$ | 1 | 1 | $\frac{2}{G_M}$ | $I_{out}$ | |
| Fully-Coupled DPP | Ac-coupled | $4N$ | $V_0$ | $\frac{4N}{G_{SW}}$ | $\frac{\sqrt{2}}{2}\Delta I_i$ | $N$ | 1 | $\frac{N}{G_M}$ | $\Delta I_i$ | $\frac{8N}{G_{SW}} + \frac{N}{G_M}$ |
| | Dc-coupled | $8N$ | $V_0$ | $\frac{8N}{G_{SW}}$ | $\frac{\sqrt{2}}{2}\Delta I_i$ | $2N$ | 1 | $\frac{2N}{G_M}$ | $\Delta I_i$ | $\frac{32N}{G_{SW}} + \frac{4N}{G_M}$ |
| | SC-based (FSL) | $2N$ | $V_0$ | $\frac{2N}{G_{SW}}$ | $\sqrt{2}\Delta I_i$ | N/A | N/A | N/A | N/A | $\frac{8N}{G_{SW}}$ |
| Ladder DPP | Buck-boost cell | $2N-2$ | $2V_0$ | $\frac{8N-8}{G_{SW}}$ | $\sqrt{2}\Delta I_{i\leftrightarrow i+1}$ | $N-1$ | 2 | $\frac{4N-4}{G_M}$ | $2\Delta I_{i\leftrightarrow i+1}$ | $\frac{32N-32}{G_{SW}} + \frac{4N-4}{G_M}$ |
| | DAB cell | $8N-8$ | $V_0$ | $\frac{8N-8}{G_{SW}}$ | $\frac{\sqrt{2}}{2}\Delta I_{i\leftrightarrow i+1}$ | $2N-2$ | 1 | $\frac{2N-2}{G_M}$ | $\Delta I_{i\leftrightarrow i+1}$ | $\frac{32N-32}{G_{SW}} + \frac{4N-4}{G_M}$ |

Table 4.2: Comparison between the DAB Converter and DPP Topologies $(N \ge 2)$

To model $R_{out}$ of switched-capacitor (SC) DPP topologies (Figs. 4.10c and 4.11c), power loss should be analyzed at both the slow switching limit (SSL) and the fast switching limit (FSL) [93]. Fig. 4.10c shows a Dickson-SC DPP converter in which all voltage domains are coupled through capacitors. Since charge can be transferred through the capacitors between any two voltage domains within one switching cycle, there is a direct power flow path between arbitrary voltage domains, and the circuit functions like a fully-coupled DPP topology. Fig. 4.11c shows a ladder-SC DPP in which neighboring voltage domains are linked by one capacitor. Charge can be transferred only between two neighboring voltage domains in each switching cycle, so this circuit functions like a ladder DPP topology. The detailed loss mechanisms of the two SC DPP topologies at the SSL and FSL are discussed in Appendix B.2. Here, we briefly introduce their loss modeling for a unified comparison.

Notes for Table 4.2:

<sup>a,c</sup> These two columns list the RMS current in each component. For the reference converter, they list the RMS current in each component on the primary side or the secondary side; for DPP topologies, they list the RMS current in the $i^{th}$ port or submodule.

<sup>b</sup> This column lists the number of turns per winding, normalized to a volts-per-turn value of $V_0$.

<sup>d,e</sup> These two rows show primary side and secondary side information of the reference converter. Semiconductor die area $G_{SW}V_0^2$ and winding window area $G_M$ are allocated equally across the primary and secondary sides.

| Topologies | Capacitor Count | Charge Transfer<sup>a</sup> | Output Resistance $R_{out}$ (@ SSL) |
|---|---|---|---|
| Dickson-SC DPP | $N$ | $\frac{\Delta I_i}{f_{sw}}$ | $\frac{1}{Cf_{sw}}$ (Fig. 4.7b) |
| Ladder-SC DPP | $N-1$ | $\frac{\Delta I_{i \leftrightarrow i+1}}{f_{sw}}$ | $\frac{1}{Cf_{sw}}$ (Fig. 4.7c) |

Table 4.3: $R_{out}$ Modeling of SC DPP Topologies at SSL $(N \ge 2)$

At the SSL, power loss of an SC converter is dominated by capacitor charge sharing loss. Table 4.3 summarizes the charge transfer of each capacitor and $R_{out}$ at the SSL for a Dickson-SC DPP and a ladder-SC DPP. Denote the capacitance as $C$ and the switching frequency as $f_{sw}$. The energy buffering capacitor at each voltage domain should be large, with a stable voltage, so its charge sharing loss is neglected. In the Dickson-SC DPP, the charge transfer of the $i^{th}$ capacitor is $\Delta I_i/f_{sw}$ per half switching cycle, so the charge sharing loss at the $i^{th}$ port is

$$P_{loss.i} = \frac{\Delta Q_i^2}{C} f_{sw} = \frac{(\Delta I_i / f_{sw})^2}{C} f_{sw} = \Delta I_i^2 R_{out}. \tag{4.17}$$

Accordingly,  $R_{out}$  of the Dickson-SC DPP as defined in Fig. 4.7b is  $\frac{1}{Cf_{sw}}$ . In the ladder-SC DPP, charge transfer of the  $i^{th}$  capacitor that links the  $i^{th}$  and  $(i+1)^{th}$  voltage domains is  $\sum_{k=1}^{i} \Delta I_k / f_{sw} = \Delta I_{i \leftrightarrow i+1} / f_{sw}$  per half switching cycle. Similarly,  $R_{out}$  of the ladder-SC DPP as defined in Fig. 4.7c is also  $\frac{1}{Cf_{sw}}$ . Although ladder-SC topologies have the same  $R_{out}$ , they generate higher loss due to differential power accumulation, especially if the voltage domain is close to the center or if N is large.
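The effect of differential power accumulation can be illustrated with a short sketch. It applies (4.17) to every capacitor and sums the results, which is an assumption about how the per-capacitor expressions combine into a total. The function name and the sample currents, capacitance, and switching frequency are hypothetical.

```python
from itertools import accumulate


def ssl_total_losses(delta_i, c, f_sw):
    """Total SSL charge sharing loss of a Dickson-SC and a ladder-SC DPP.

    delta_i : port differential currents [dI_1, ..., dI_N] in amperes
    c       : flying capacitance (F), assumed identical for all capacitors
    f_sw    : switching frequency (Hz)
    Returns (dickson_loss, ladder_loss) in watts.
    """
    r_out = 1.0 / (c * f_sw)          # same R_out for both topologies at the SSL

    # Dickson-SC: the i-th capacitor carries only the i-th differential current.
    dickson = sum(di**2 for di in delta_i) * r_out

    # Ladder-SC: the i-th link capacitor (i = 1..N-1) carries the accumulated
    # current sum_{k<=i} dI_k.
    link_currents = list(accumulate(delta_i))[:-1]
    ladder = sum(il**2 for il in link_currents) * r_out
    return dickson, ladder


# Hypothetical 4-domain example with differential currents summing to zero
dickson_loss, ladder_loss = ssl_total_losses([1.0, 1.0, -1.0, -1.0], c=10e-6, f_sw=1e6)
```

In this example the center link capacitor of the ladder carries 2 A, twice any single port current, so the ladder total exceeds the Dickson total even though $R_{out}$ is identical.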

At the FSL, capacitor charge sharing loss of an SC DPP is negligible. Conduction loss dominates. All capacitors act as fixed voltage sources. In this case, both the Dickson-SC DPP and the ladder-SC DPP function like a fully-coupled DPP and are equivalent. Each switch at the $i^{th}$ domain conducts $2\Delta I_i$ for half a switching cycle, and the corresponding $R_{out}$ values are listed in Table 4.2. For a unified comparison, internal capacitor power loss is not included here, and SC DPP topologies are compared with the reference converter as fully-coupled circuits based on conduction loss at the FSL.

<sup>a</sup> This column of Table 4.3 lists the charge transfer per half switching cycle of the $i^{th}$ capacitor (from top to bottom) in an SC DPP.

As listed in Table 4.2, $R_{out}$ of a dc fully-coupled DPP is four times that of an ac fully-coupled DPP (i.e., MAC-DPP) due to the doubling of switch and winding counts and the doubling of "dc-ac-dc" differential power conversion stages [22]. At the FSL, the two SC DPP topologies have the same conduction loss as a MAC-DPP when winding loss is not considered. Although an SC topology has no winding loss, the capacitor charge sharing loss is non-negligible if the capacitors are not large enough or if the switching frequency is not high enough. Table 4.2 also indicates that with a fixed total switch die area and a fixed total magnetic volume, the output resistance of DPP topologies increases linearly with the number of voltage domains due to the linear growth of component count, whereas $R_{out}$ of the reference converter is fixed.

In DPP systems, the processed differential power increases as load power variance increases, and advantages in terms of output resistance diminish when N scales up, as shown in Table 4.2. To evaluate trends, a comparative expected loss ratio $\beta = \frac{\mathbb{E}[P_{loss,DPP}]}{\mathbb{E}[P_{loss,ref}]}$ can be used as a performance figure of merit. The coefficient of variance $C_V = \frac{\sigma_0}{\mu_0}$ is used to represent the normalized variance of $P_{ij}(t)$. Values of $\beta$ for a variety of topologies have been calculated based on the analysis. Lower values are better, and DPP advantages disappear if $\beta > 1$. The calculated $\beta$ values and their asymptotic limits as M, N, and $C_V$ scale up are plotted in Figs. 4.12 - 4.14.

Calculated results have been compared to Monte Carlo simulations in SPICE, in which a random sequence is generated for each load power. In simulations, the domain voltage $V_0$ is 5 V, and the domain power is mostly below 10 W. For a given M, N, and $C_V$, each simulation was run 10,000 times to obtain an average power loss. For each case, simulated $\beta$ was obtained as the ratio of the simulated average DPP loss to the calculated loss of the reference converter delivering the same total power. Switch $R_{ds(on)}$ and winding resistance in each topology are set based on Table 4.2. Since the Dickson-SC DPP and the ladder-SC DPP are equivalent with fast switching, the simulation uses a ladder-SC DPP at the FSL. When comparing SC DPP circuits to the reference converter, winding conduction loss has been excluded.

Figure 4.12: Calculated and simulated loss ratio $\beta$ as a function of N in: (a) fully-coupled DPP converters; (b) ladder DPP converters.

Figure 4.13: Calculated and simulated loss ratio $\beta$ as a function of the number of the parallel loads M in (a) fully-coupled and (b) ladder DPP converters.

Figs. 4.12 - 4.14 compare calculated and simulated $\beta$ values for various DPP topologies as functions of load array dimensions N and M, and coefficient of variance $C_V$. Considering the scaling of $R_{out}$, when N increases, the expected loss of fully-coupled DPP topologies increases as $N^2$, the same growth rate as for the reference converter. The expected power loss of ladder DPP topologies grows as $N^3$. Therefore, as N scales up, $\beta$ of fully-coupled DPP topologies converges to an upper limit, but $\beta$ of ladder DPP topologies keeps increasing, as shown in Fig. 4.12. The figure suggests that ladder DPP circuits lose their advantages for $N \geq 25$, given M = 4 and $C_V = 1$.

Figure 4.14: Calculated and simulated loss ratio $\beta$ as a function of coefficient of variance $C_V$ in (a) fully-coupled and (b) ladder DPP converters.

When the number of parallel load units M increases, the expected loss in both fully-coupled DPP and ladder DPP circuits increases as M, while the expected loss in the reference converter tracks $M^2$. Thus, the loss ratio $\beta$ decreases for both fully-coupled DPP and ladder DPP circuits with increasing M, as shown in Fig. 4.13. As M increases, power consumption of each voltage domain becomes relatively more balanced since multiple random loads with the same probability distribution in parallel will narrow the domain population variance. The asymptotic limits are $\beta \to \frac{C_V^2}{4M}$ for an ac-coupled or an SC DPP (FSL), $\beta \to \frac{C_V^2}{M}$ for a dc-coupled DPP, and $\beta \to \frac{NC_V^2}{6M}$ for a ladder DPP with DAB or buck-boost cells.

Fig. 4.14 shows log-log plots of $\beta$ for various DPP topologies as a function of $C_V$. As $C_V$ increases, power variation among voltage domains increases, so the DPP converters need to process more power. Thus, $\beta$ increases with $C_V$ for all DPP topologies, but it converges to an upper limit. This is because the power loss of the reference converter, as in (4.9), is ultimately dominated by $MN\sigma_0^2$ when $C_V$ (i.e., $\frac{\sigma_0}{\mu_0}$) increases, the same rate of increase with $C_V$ as for DPP topologies. Asymptotic upper limits of $\beta$ for ac-coupled or SC DPP (FSL), dc-coupled DPP, and ladder DPP with DAB or buck-boost cells are $\frac{N-1}{4}$, $N-1$, and $\frac{(N+1)(N-1)^2}{6N}$, respectively.

In Figs. 4.12 - 4.14, calculated ratios match simulated ones well, validating the stochastic model. Mismatches are caused by active bridge trapezoidal current waveforms (Figs. 4.10a-4.10b, Fig. 4.11b), inductor current ripple in buck-boost cells (Fig. 4.11a), and capacitor charge sharing loss in SC converters (Fig. 4.10c, Fig. 4.11c). For larger M or smaller $C_V$, the average differential power processed by each buck-boost cell is reduced. In this case, inductor ripple current becomes comparable to average current, yielding increased mismatch between calculated and simulated results for ladder DPP with buck-boost cells, as shown in Figs. 4.13b and 4.14b.

Figs. 4.12 - 4.14, together with Tables 4.1 - 4.3, provide useful design insights for DPP architectures. For example, the asymptotic upper limit of $\beta$ in an ac-coupled DPP topology is $\frac{C_V^2}{4M}$ as N increases. When $M=4$, $N\geq 2$, and $C_V=1$, the loss ratio of an ac-coupled DPP converter is below 0.0625, indicating at least a 16x loss reduction compared to the reference converter. A dc-coupled DPP converter can offer at least a 4x reduction under the same conditions. If $M>C_V^2$, then $\beta$ of fully-coupled DPP converters will always be less than 1, indicating that a fully-coupled DPP solution will be more efficient than the reference converter for arbitrary N. For a ladder DPP converter, $\beta$ will be larger than 1 if N exceeds $\frac{6M}{C_V^2}$, indicating that a ladder DPP converter will lose its advantages if N is large. It should be pointed out, however, that ladder DPP circuits are attractive if load variance is limited. A $C_V$ value of 0.1, for instance, supports a large value of N before $\beta$ exceeds unity. Figs. 4.12 - 4.14 and Tables 4.1 and 4.2 reveal that the proposed MAC-DPP solution stands out from the others explored here, although SC solutions are equally good if the FSL applies.
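The limits quoted above can be cross-checked with a short sketch. The closed form used here is not stated in the text; it is reconstructed under two assumptions: (i) for i.i.d. loads, the expected loss of a fully-coupled DPP is $R_{out}(N-1)M\sigma_0^2/V_0^2$, and (ii) the expected loss of the reference converter is proportional to $MN\sigma_0^2 + M^2N^2\mu_0^2$. With the $R_{out}$ entries of Table 4.2, this gives $\beta = k\,\frac{(N-1)C_V^2}{C_V^2 + MN}$, with $k = \frac{1}{4}$ for ac-coupled or SC (FSL) and $k = 1$ for dc-coupled. The function name is hypothetical.

```python
def beta_fully_coupled(n, m, cv, k=0.25):
    """Reconstructed expected loss ratio of a fully-coupled DPP (i.i.d. loads).

    n  : number of series voltage domains N
    m  : number of parallel loads per domain M
    cv : coefficient of variance C_V = sigma_0 / mu_0
    k  : R_out ratio factor from Table 4.2
         (0.25 for ac-coupled or SC at the FSL, 1.0 for dc-coupled)
    NOTE: this closed form is an assumption derived from the stated scaling
    laws, not a formula quoted from the text.
    """
    return k * (n - 1) * cv**2 / (cv**2 + m * n)


# Example from the text: M = 4, C_V = 1
beta_ac_n10 = beta_fully_coupled(10, 4, 1.0)            # ac-coupled, N = 10
beta_dc_n10 = beta_fully_coupled(10, 4, 1.0, k=1.0)     # dc-coupled, N = 10
beta_ac_large_n = beta_fully_coupled(10**9, 4, 1.0)     # approaches C_V^2 / (4M)
beta_ac_large_cv = beta_fully_coupled(5, 4, 1e6)        # approaches (N - 1) / 4
```

Under these assumptions the sketch reproduces the stated asymptotes: $\frac{C_V^2}{4M}$ for large N and $\frac{N-1}{4}$ for large $C_V$ in the ac-coupled case.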



Figure 4.15: Two types of load correlation in an  $N \times M$  DPP system: (1) vertical correlation across different voltage domains is denoted in green; (2) horizontal correlation between loads within one voltage domain is denoted in blue.

#### 4.3.5 Impacts of Load Correlation

Load (or source) power correlation is common in DPP applications, such as when managing partial shading in a solar panel array, thermal hot spots in a series battery pack, or task distribution algorithms for a hard-disk storage cluster. In this subsection, the i.i.d. condition is relaxed to generalize the stochastic loss analysis. Each load power  $P_{ij}(t)$  is given the same distribution but the values are not independent. Detailed derivations are provided in Appendix B.1.2.

As shown in Fig. 4.15, load correlation can happen between loads across different voltage domains (vertical correlation) or between loads within one voltage domain (horizontal correlation). These can be described using correlation matrices as in Fig. 4.16. Fig. 4.16a is the vertical correlation matrix $\rho_V$, in which each entry $\rho_V(i,j)$ represents the correlation coefficient between the $i^{th}$ domain power $P_i(t)$ and the $j^{th}$ domain power $P_j(t)$. Fig. 4.16b shows the horizontal correlation matrix $\rho_{Hk}$ of the $k^{th}$ voltage domain, in which $\rho_{Hk}(i,j)$ is the correlation coefficient of the $i^{th}$ load power $P_{ki}(t)$ and the $j^{th}$ load power $P_{kj}(t)$ within the $k^{th}$ domain. These are Pearson's correlation coefficients [165]: $\rho_{X,Y} = \frac{\text{Cov}[X,Y]}{\sqrt{\text{Var}[X]\text{Var}[Y]}} \in [-1,1]$.

Figure 4.16: (a) Vertical correlation matrix $\rho_{\mathbf{V}}$: $\rho_{V}(i,j)$ is the correlation coefficient between the $i^{th}$ and $j^{th}$ domain power, $P_{i}(t)$ and $P_{j}(t)$; (b) Horizontal correlation matrix $\rho_{Hk}$: $\rho_{Hk}(i,j)$ is the correlation coefficient between the $i^{th}$ and $j^{th}$ load power in the $k^{th}$ domain, $P_{ki}(t)$ and $P_{kj}(t)$.

The expected power loss of a fully-coupled DPP converter when considering load correlation is

$$\mathbb{E}[P_{loss}(t)] = \frac{R_{out}}{NV_0^2} \left( \underbrace{(N-1)\sum_{k=1}^N \operatorname{Var}[P_k(t)]}_{\boxed{1}} - \underbrace{2\sum_{1 \le i < j \le N} \operatorname{Cov}[P_i(t), P_j(t)]}_{\boxed{2}} \right). \tag{4.18}$$

In part ① of (4.18), the variance of each domain power, $\operatorname{Var}[P_k(t)]$, can be expanded as

$$\begin{aligned} \operatorname{Var}[P_{k}(t)] &= \sum_{i=1}^{M} \operatorname{Var}[P_{ki}(t)] + 2 \sum_{1 \le i < j \le M} \operatorname{Cov}[P_{ki}(t), P_{kj}(t)] \\ &= \left(M + 2 \sum_{1 \le i < j \le M} \rho_{Hk}(i, j)\right) \underbrace{\operatorname{Var}[P_{ij}(t)]}_{=\sigma_{0}^{2}}. \end{aligned} \tag{4.19}$$

In part ②, the covariance between any two domain powers, $\operatorname{Cov}[P_i(t), P_j(t)]$, can be expressed as

$$\operatorname{Cov}[P_i(t), P_j(t)] = \rho_V(i, j) \sqrt{\operatorname{Var}[P_i(t)]\operatorname{Var}[P_j(t)]}. \tag{4.20}$$

Eqs. (4.18) - (4.20) indicate that positive vertical correlation  $\rho_V(i,j) > 0$  reduces the total expected power loss, whereas positive horizontal correlation  $\rho_{Hk}(i,j) > 0$  increases the total expected power loss.

The worst-case horizontal load correlation is to have  $\rho_{Hk}(i,j) = 1$  for two arbitrary loads within the  $k^{th}$  voltage domain, i.e., two arbitrary loads are linearly related and change exactly in the same direction. In this case, the  $k^{th}$  domain power variance reaches a maximum of  $\text{Var}[P_k(t)] = M^2 \sigma_0^2$ , and that domain can be treated as a single load. The worst-case vertical correlation can be analyzed by reorganizing (4.18) as

$$\mathbb{E}[P_{loss}(t)] = \frac{R_{out}}{V_0^2} \left( \sum_{k=1}^N \operatorname{Var}[P_k(t)] - \frac{\operatorname{Var}\left[\sum_{k=1}^N P_k(t)\right]}{N} \right). \tag{4.21}$$

The worst-case vertical correlation is when  $\operatorname{Var}\left[\sum_{k=1}^{N}P_{k}(t)\right]=0$ , i.e., the total power across all voltage domains is constant.
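The equivalence of (4.18) and (4.21), and the effect of vertical correlation, can be checked numerically. The sketch below evaluates both forms from a covariance matrix of the domain powers. The helper names and the uniform-correlation test matrix are hypothetical choices for illustration.

```python
def loss_eq_4_18(cov, r_out, v0):
    """Expected loss from Eq. (4.18); cov[i][j] = Cov[P_i, P_j]."""
    n = len(cov)
    var_sum = sum(cov[k][k] for k in range(n))
    cov_sum = sum(cov[i][j] for i in range(n) for j in range(i + 1, n))
    return r_out / (n * v0**2) * ((n - 1) * var_sum - 2 * cov_sum)


def loss_eq_4_21(cov, r_out, v0):
    """Expected loss from the reorganized form, Eq. (4.21)."""
    n = len(cov)
    var_sum = sum(cov[k][k] for k in range(n))
    var_total = sum(sum(row) for row in cov)   # Var of the sum of all domain powers
    return r_out / v0**2 * (var_sum - var_total / n)


def uniform_cov(n, var, rho):
    """Covariance matrix with equal variances and one common vertical correlation rho."""
    return [[var if i == j else rho * var for j in range(n)] for i in range(n)]


# Hypothetical values: N = 4 domains, unit variance, R_out = 0.1 ohm, V_0 = 5 V
losses = {rho: loss_eq_4_21(uniform_cov(4, 1.0, rho), 0.1, 5.0)
          for rho in (-0.3, 0.0, 0.5, 1.0)}
```

In this example, positive vertical correlation lowers the expected loss, and it vanishes when all domain powers move together ($\rho_V = 1$), since no differential power needs to be processed.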

With both worst-case horizontal and vertical correlation, an  $N \times M$  DPP system becomes equivalent to a system in which each voltage domain contains a single load with mean power  $M\mu_0$  and power variance  $M^2\sigma_0^2$ , and the system load power  $\sum_{k=1}^{N} P_k(t)$  is constant, as depicted in Fig. 4.17. In this case, the expected loss of a fully-coupled DPP converter is

$$\mathbb{E}[P_{loss}(t)] = M^2 N \sigma_0^2 \times \frac{R_{out}}{V_0^2} \Rightarrow \underbrace{\mathcal{S}(M^2 N \sigma_0^2)}_{\text{scaling factor}}. \tag{4.22}$$



Figure 4.17: Power loss of a fully-coupled DPP converter reaches its maximum with worst case load correlation, where  $\rho_{Hk}(i,j) = 1$  for all loads within a voltage domain, and  $\operatorname{Var}\left[\sum_{k=1}^{N} P_k(t)\right] = 0$ .

Worst-case horizontal correlation results in the expected loss scaling quadratically with M. Worst-case vertical correlation increases the domain scaling rate from N-1 to N. Based on (4.22), comparing an ac-coupled DPP to the reference converter under worst-case load correlation, the upper limit of  $\beta$  is  $\frac{C_V^2}{4}$ . In practice,  $C_V$  is usually less than one, and a MAC-DPP converter can reduce the expected loss by at least a factor of four even with arbitrary load correlation. When  $C_V$  is lower, the benefits are substantial.
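The $\frac{C_V^2}{4}$ bound can be reproduced with a few lines. The sketch takes the DPP loss from (4.22) and the $R_{out}$ entries from Table 4.2. The reference loss is assumed to be $R_{out,ref}(MN\mu_0)^2/V_0^2$, i.e., the reference converter delivers the constant total power $MN\mu_0$ at $V_0$; this form is an assumption and is not quoted from the text. All numeric values are hypothetical.

```python
def worst_case_beta_ac(n, m, cv, g_sw, g_m, mu0=1.0, v0=5.0):
    """Loss ratio of an ac-coupled DPP under worst-case load correlation.

    n, m       : load array dimensions N and M
    cv         : coefficient of variance sigma_0 / mu_0
    g_sw, g_m  : switch and winding budgets from Table 4.2 (illustrative values)
    mu0, v0    : mean load power and domain voltage (illustrative values)
    """
    sigma0 = cv * mu0
    r_dpp = 8 * n / g_sw + n / g_m      # ac-coupled R_out, Table 4.2
    r_ref = 32 / g_sw + 4 / g_m         # reference R_out, Table 4.2
    dpp_loss = m**2 * n * sigma0**2 * r_dpp / v0**2    # Eq. (4.22)
    ref_loss = (m * n * mu0)**2 * r_ref / v0**2        # assumed reference loss
    return dpp_loss / ref_loss


beta_wc = worst_case_beta_ac(n=12, m=4, cv=0.8, g_sw=1000.0, g_m=500.0)
```

The ratio is independent of N, M, $G_{SW}$, and $G_M$ under these assumptions, which matches the stated upper limit.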

# 4.4 Modeling and Control of Multi-Input-Multi-Output (MIMO) Power Flow

Transferring differential power among multiple ports, the MAC-DPP converter functions as a multi-input-multi-output (MIMO) system. This section discusses how to model and precisely control the MIMO power flow.



Figure 4.18: (a) The MAC-DPP converter functions as a multi-input-multi-output (MIMO) system. (b) Equivalent lumped circuit model to analyze the MIMO power flow. The N-port passive network is represented by a delta network, and each dc-ac unit is modeled as a square-wave voltage source.

#### 4.4.1 Modeling of MIMO Power Flow

In a MAC-DPP converter, all ports are bidirectional and are closely coupled with the multi-winding transformer. The multi-winding transformer together with the series inductors is indeed an N-port passive network, whose port voltages and currents are connected by an  $N \times N$  impedance matrix:

$$Z = j\omega \begin{bmatrix} L_{11} + L_{s1} & M_{12} & \dots & M_{1n} \\ M_{21} & L_{22} + L_{s2} & \dots & M_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ M_{n1} & M_{n2} & \dots & L_{nn} + L_{sn} \end{bmatrix}. \tag{4.23}$$

$L_{ii}$ is the self-inductance of the $i^{th}$ winding, $M_{ij,(i\neq j)}$ is the mutual inductance between windings, and $\omega=2\pi f_s$ is the angular frequency of the system. $L_{si}$ is the series inductance of each winding, which can be implemented either as a discrete inductor or as the transformer leakage inductance. To analyze the MIMO power flow, the N-port passive network (multi-winding transformer with series inductors) is converted into a delta network as depicted in Fig. 4.18b. Here, the dc-ac units are implemented as half-bridge or full-bridge circuits, which can be modeled as square-wave voltage sources with normalized voltage amplitudes $\frac{V_i}{N_i}$.

Figure 4.19: (a) Example waveforms of normalized port voltages $(\frac{V_1}{N_1} \sim \frac{V_3}{N_3})$ and branch inductor current $(I_{13})$ with phase-shift modulation. (b) Average power flow between the $i^{\text{th}}$ and the $j^{\text{th}}$ ports as a function of phase shift $\phi_{ij}$.

Each branch inductor, $L_{ij,(i\neq j)}$, which links the $i^{th}$ and the $j^{th}$ ports, can be directly obtained from the admittance matrix of the passive network [166]:

$$Y = Z^{-1} = \frac{1}{j\omega} \begin{bmatrix} y_{11} & \dots & y_{1n} \\ \vdots & \ddots & \vdots \\ y_{n1} & \dots & y_{nn} \end{bmatrix}, \ L_{ij} = -\frac{1}{N_i N_j y_{ij}}. \tag{4.24}$$
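The conversion from the inductance matrix to the delta-network branch inductances can be sketched as follows. Because $Z = j\omega$ times the inductance matrix, the entries $y_{ij}$ are those of the inverse inductance matrix. The helper names are hypothetical, and the example assumes an idealized three-winding transformer with unity turns, a common mutual inductance $L_m$, and a series inductance $L_s$ per winding; these values are illustrative only.

```python
def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def branch_inductances(l_matrix, turns):
    """Branch inductances L_ij = -1 / (N_i N_j y_ij), with y the inverse of l_matrix."""
    y = invert(l_matrix)
    n = len(l_matrix)
    return {(i, j): -1.0 / (turns[i] * turns[j] * y[i][j])
            for i in range(n) for j in range(n) if i != j}


# Hypothetical example: L_m = 1 mH coupling, L_s = 1 uH series inductance per winding
L_M, L_S = 1e-3, 1e-6
l_matrix = [[L_M + (L_S if i == j else 0.0) for j in range(3)] for i in range(3)]
l_branch = branch_inductances(l_matrix, turns=[1, 1, 1])
```

For this symmetric case the result can be checked by hand: every branch inductance equals $L_s(L_s + 3L_m)/L_m$, which is close to $3L_s$ when $L_m \gg L_s$.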

As shown in Fig. 4.19a, the MIMO power flow can be modulated by adjusting the phase shift at each port. Other power flow modulation methods, such as time-sharing modulation [167], are also applicable. When adjusting the phase shifts, the power delivered through each branch inductor $(L_{ij})$ can be calculated in the same way as in a dual active bridge (DAB) converter [163], and the power carried by each grounded inductor $(L_{gi})$ is reactive power, which has no impact on the average power of each port. Thus, the total average power fed into the passive network from the $i^{th}$ port is:

$$P_{i} = \sum_{j \neq i} P_{ij} = \sum_{j \neq i} \frac{V_{i}V_{j}}{2\pi f_{s}N_{i}N_{j}L_{ij}} \phi_{ij} \left(1 - \frac{|\phi_{ij}|}{\pi}\right). \tag{4.25}$$

Figure 4.19b plots the average power flow $P_{ij}$ between the $i^{\text{th}}$ and the $j^{\text{th}}$ ports as a function of phase shift $\phi_{ij}$. Since $P_{ij} \propto 1/L_{ij}$, a larger branch inductance reduces the maximum output power, but it may improve the control resolution of digital controllers because a larger phase shift is required for the same power.
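A single term of (4.25) can be evaluated directly to see this trade-off. The function name is hypothetical; the 5 V port voltage follows the simulation setup of Section 4.3, while the inductance and switching frequency are illustrative assumptions.

```python
import math


def branch_power(v_i, v_j, n_i, n_j, l_ij, f_s, phi):
    """Average power through branch inductor L_ij: one term of Eq. (4.25).

    phi is the phase shift between the two ports in radians, within [-pi, pi].
    """
    return (v_i * v_j / (2 * math.pi * f_s * n_i * n_j * l_ij)
            * phi * (1 - abs(phi) / math.pi))


# Hypothetical operating point: 5 V ports, unity turns, 100 nH branch, 100 kHz
p_max = branch_power(5.0, 5.0, 1, 1, 100e-9, 100e3, math.pi / 2)
p_half_inductance = branch_power(5.0, 5.0, 1, 1, 50e-9, 100e3, math.pi / 2)
```

The power peaks at $\phi_{ij} = \pi/2$, where it equals $\frac{V_iV_j}{8 f_s N_i N_j L_{ij}}$, and halving the branch inductance doubles this maximum.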

Open-loop phase-shift modulation can control the multiway differential power flow in steady state, but the system may oscillate without feedback control. According to (4.25), the average input power of one port, $P_i$ (i.e., the input differential power in the MAC-DPP system), depends on the phase shifts of all the ports $\{\phi_1, \phi_2, ..., \phi_n\}$. This closely coupled power flow complicates port voltage regulation, especially when a large number of loads are stacked in series.

## 4.4.2 Small Signal Model for Very Large-Scale MAC-DPP Systems

Designing a stable and fast control scheme requires accurate modeling of converter dynamics. This subsection provides a compact and scalable small signal model for very-large-scale MAC-DPP systems. The small signal model is first derived for an ideal lossless MAC-DPP converter with an arbitrary number of ports, and then extended to capture the impacts of power losses. Fig. 4.20a shows the general architecture of an n-port MAC-DPP system. According to Section 4.4.1, the small signal modeling of a MAC-DPP converter can follow that of a DAB converter, where the average output current is (ignoring power loss):

$$I_{out} = \frac{P_{out}}{V_{out}} = \frac{V_{in}}{2\pi f_s L_{eq}} \Phi\left(1 - \frac{|\Phi|}{\pi}\right). \tag{4.26}$$

In this subsection, all winding turns are assumed to be unity without loss of generality. $L_{eq}$ is the inductance linking the two ports, $\Phi$ is the phase difference between the two ports, and $f_s$ is the switching frequency.

Figure 4.20: (a) Block diagram of a MAC-DPP architecture with a large number of ac-coupled voltage domains connected in series; Small signal model of (b) a DAB converter and (c) a MAC converter.

The small signal output current is:

$$\hat{i}_{out} = G_v \hat{v}_{in} + G_\phi \hat{\phi}, \tag{4.27}$$

$$\begin{cases} G_v = \frac{\Phi}{2\pi f_s L_{eq}} \left( 1 - \frac{|\Phi|}{\pi} \right) \\ G_\phi = \frac{V_{in}}{2\pi f_s L_{eq}} \left( 1 - \frac{2|\Phi|}{\pi} \right) \end{cases} \tag{4.28}$$

As illustrated in Fig. 4.20b, the small signal model of a DAB converter contains two current sources  $(G_v \hat{v}_{in} \text{ and } G_\phi \hat{\phi})$  depending on  $\hat{v}_{in}$  and  $\hat{\phi}$ . Note that  $\hat{i}_{out}$  is not a function of  $\hat{v}_{out}$  in an ideal lossless DAB.
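The gains in (4.28) are the partial derivatives of (4.26) with respect to $V_{in}$ and $\Phi$. A minimal sketch that evaluates them, and that allows a finite-difference cross-check against (4.26), is given below. The function names and the operating point are hypothetical.

```python
import math


def dab_iout(v_in, phi, f_s, l_eq):
    """Average output current of a lossless DAB, Eq. (4.26)."""
    return v_in / (2 * math.pi * f_s * l_eq) * phi * (1 - abs(phi) / math.pi)


def dab_gains(v_in, phi, f_s, l_eq):
    """Small signal gains (G_v, G_phi) of a lossless DAB, Eq. (4.28)."""
    k = 1 / (2 * math.pi * f_s * l_eq)
    g_v = k * phi * (1 - abs(phi) / math.pi)
    g_phi = k * v_in * (1 - 2 * abs(phi) / math.pi)
    return g_v, g_phi


def central_difference(f, x, h):
    """Numerical derivative of a single-variable function f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Hypothetical operating point: 5 V input, 0.4 rad phase shift, 100 kHz, 100 nH
V_IN, PHI, F_S, L_EQ = 5.0, 0.4, 100e3, 100e-9
g_v, g_phi = dab_gains(V_IN, PHI, F_S, L_EQ)
g_v_numeric = central_difference(lambda v: dab_iout(v, PHI, F_S, L_EQ), V_IN, 1e-4)
g_phi_numeric = central_difference(lambda p: dab_iout(V_IN, p, F_S, L_EQ), PHI, 1e-6)
```

Note that $G_\phi$ falls to zero at $\Phi = \pi/2$, the point of maximum power transfer, so the phase shift loses control authority there.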

Similarly, for a lossless MAC converter, the average output current of port #i can be derived based on the average power flow in (4.25):

$$I_{i} = \sum_{j=1}^{n} \frac{V_{j}}{2\pi f_{s} L_{ij}} \Phi_{ij} \left( \frac{|\Phi_{ij}|}{\pi} - 1 \right), \tag{4.29}$$

where  $L_{ij}$  is the equivalent inductance linking port #i and port #j, and  $\Phi_{ij}$  is the phase difference between port #i and port #j. Based on (4.29), the small-signal current at one port is a function of the voltage perturbation of all ports  $\{\hat{v}_1, \hat{v}_2, ..., \hat{v}_n\}$ , and phase perturbation of all ports  $\{\hat{\phi}_1, \hat{\phi}_2, ..., \hat{\phi}_n\}$ :

$$\hat{i}_i = \sum_{j=1}^n G_v(i,j)\hat{v}_j + \sum_{j=1}^n G_\phi(i,j)\hat{\phi}_j. \tag{4.30}$$

Here,  $G_v(i,j)$  and  $G_\phi(i,j)$  are functions of the large-signal voltage  $\{V_1, V_2, ..., V_n\}$ , and the large-signal phase  $\Phi_{ij}$ :

$$G_v(i,j) = \frac{\Phi_{ij}}{2\pi f_s L_{ij}} \left( \frac{|\Phi_{ij}|}{\pi} - 1 \right) \tag{4.31}$$

$$G_{\phi}(i,j) = \begin{cases} \frac{V_{j}}{2\pi f_{s} L_{ij}} \left(1 - \frac{2|\Phi_{ij}|}{\pi}\right) & j \neq i, \\ \sum_{k \neq i} \frac{V_{k}}{2\pi f_{s} L_{ik}} \left(\frac{2|\Phi_{ik}|}{\pi} - 1\right) & j = i. \end{cases} \tag{4.32}$$

As shown in Fig. 4.20c, the small signal output current at each port of a MAC converter can be represented as two current sources determined by the voltage perturbation and the phase perturbation of all ports. The small signal current $\hat{i}_i$ of port #i is not a function of the voltage perturbation of the same port $\hat{v}_i$, because $G_v(i,i)$ is zero. The output ports of a MAC-DPP architecture are connected to a series-stacked R-C network. The series-stacked load structure adds another constraint on the small signal output voltages $\{\hat{v}_1, \hat{v}_2, ..., \hat{v}_n\}$ and output currents $\{\hat{i}_1, \hat{i}_2, ..., \hat{i}_n\}$, which can be described by an impedance matrix:

$$\begin{bmatrix} \hat{v}_1 \\ \hat{v}_2 \\ \vdots \\ \hat{v}_n \end{bmatrix} = \begin{bmatrix} Z_1 || \sum_{k \neq 1} Z_k & -\frac{Z_1 Z_2}{\sum_{k=1}^n Z_k} & \dots & -\frac{Z_1 Z_n}{\sum_{k=1}^n Z_k} \\ -\frac{Z_2 Z_1}{\sum_{k=1}^n Z_k} & Z_2 || \sum_{k \neq 2} Z_k & \dots & -\frac{Z_2 Z_n}{\sum_{k=1}^n Z_k} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{Z_n Z_1}{\sum_{k=1}^n Z_k} & -\frac{Z_n Z_2}{\sum_{k=1}^n Z_k} & \dots & Z_n || \sum_{k \neq n} Z_k \end{bmatrix} \begin{bmatrix} \hat{i}_1 \\ \hat{i}_2 \\ \vdots \\ \hat{i}_n \end{bmatrix}, \tag{4.33}$$

where  $Z_i$  is the lumped load impedance at port i:

$$Z_i = R_{Li} || \frac{1}{sC_i} = \frac{R_{Li}}{sR_{Li}C_i + 1}. \tag{4.34}$$
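The impedance matrix of (4.33) is straightforward to assemble numerically at a given complex frequency $s$. The sketch below uses $Z_i || \sum_{k\neq i} Z_k = Z_i\left(\sum_k Z_k - Z_i\right)/\sum_k Z_k$ for the diagonal entries. The function names and the load values are hypothetical.

```python
import math


def load_impedance(r_l, c, s):
    """Lumped R-C load impedance at one port, Eq. (4.34)."""
    return r_l / (s * r_l * c + 1)


def gz_matrix(z):
    """Series-stacked load impedance matrix of Eq. (4.33) for port impedances z."""
    total = sum(z)
    n = len(z)
    return [[z[i] * (total - z[i]) / total if i == j else -z[i] * z[j] / total
             for j in range(n)]
            for i in range(n)]


# Hypothetical 3-port stack: 1, 2, and 4 ohm loads with 100 uF each, at 1 kHz
s = 2j * math.pi * 1e3
z_ports = [load_impedance(r, 100e-6, s) for r in (1.0, 2.0, 4.0)]
g_z = gz_matrix(z_ports)

# Same stack at dc (s = 0), where each port impedance reduces to its resistance
g_z_dc = gz_matrix([1.0, 2.0, 4.0])
```

Each column of the matrix sums to zero, so the port voltage perturbations sum to zero for any injected current. One interpretation, assumed here rather than stated in the text, is that this reflects a fixed total voltage across the series stack.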

Eq. (4.30) and Eq. (4.33) can be reorganized as:

$$\begin{cases} \hat{\boldsymbol{i}} = \boldsymbol{G}_{\boldsymbol{v}} \times \hat{\boldsymbol{v}} + \boldsymbol{G}_{\boldsymbol{\phi}} \times \hat{\boldsymbol{\phi}}, \\ \hat{\boldsymbol{v}} = \boldsymbol{G}_{\boldsymbol{Z}} \times \hat{\boldsymbol{i}}. \end{cases} \tag{4.35}$$

Based on Eq. (4.35), the transfer function matrix from the phase perturbation $(\hat{\phi})$ to the port voltage perturbation $(\hat{v})$ in a lossless MAC-DPP architecture is:

$$\hat{\boldsymbol{v}} = (\boldsymbol{I} - \boldsymbol{G}_{\boldsymbol{Z}} \boldsymbol{G}_{\boldsymbol{v}})^{-1} \boldsymbol{G}_{\boldsymbol{Z}} \boldsymbol{G}_{\boldsymbol{\phi}} \times \hat{\boldsymbol{\phi}} = \boldsymbol{G}_{\boldsymbol{S}} \times \hat{\boldsymbol{\phi}}. \tag{4.36}$$

This  $n \times n$  transfer function matrix  $G_S$  can be used to analyze the stability of the MAC-DPP system and assist in the controller design.

Power loss in a MAC-DPP converter can change the power flow and reduce the dc gain of the system transfer function. To capture the influence of loss, an improved small signal model is developed. Fig. 4.21a shows an equivalent circuit of a DAB with the power conversion loss modeled as a few resistors $R_1$, $R_2$, and $R_m$. $R_1$ and $R_2$ capture the resistance of the inductors, switches, and the transformer windings, and $R_m$ captures the core loss. Switching losses can be included in either $R_1$, $R_2$, or $R_m$. With significant $R_1$, $R_2$, and $R_m$, Eq. (4.26) is no longer valid and needs to be modified. As illustrated in Fig. 4.21c, the inductor current is no longer trapezoidal but has a significant exponential component. The lower the quality factor of the equivalent L-R circuit, the more the inductor current deviates from the trapezoidal waveform. $G_v(i,j)$ and $G_\phi(i,j)$ need to be modified to capture the impact of these resistors. One of the most distinct differences is in $G_v(i,i)$. The previous analysis indicates that the output current perturbation $(\hat{i}_i)$ of an ideal MAC converter is not affected by the voltage perturbation $(\hat{v}_i)$ at the same port (i.e., $G_v(i,i) = 0$). However, this observation is not valid if losses are considered.

Figure 4.21: (a) Equivalent lumped circuit model to analyze the transfer function of a DAB converter. (b) Inductor current variation $\Delta I_L$ due to $\Delta V_{out}$. (c) Current and voltage waveforms of DAB with power losses.

As shown in Fig. 4.21b, assuming that there is a voltage perturbation  $(\Delta V_{out})$  on the output voltage, the change of the inductor current  $(\Delta I_L)$  can be considered as if there is only one square wave voltage source  $(\Delta V_{out})$ , based on superposition. The change in the average output current  $(\Delta I_{out})$  is the time average integral of



Figure 4.22: (a) Improved small signal model of DAB considering power losses. (b) Equivalent circuit of MAC showing the  $i^{th}$  port inductor current variation  $\Delta I_{Li}$  due to  $\Delta V_i$ . (c) Improved small signal model of MAC considering power losses.

$\Delta I_L$ in the positive half cycle of $\Delta V_{out}$, which is not zero if the current waveform is exponential. As a result, one additional current source, $G_{vout}\hat{v}_{out}$, should be added to the small signal model of Fig. 4.20b; it can be modeled as an output resistance $R_s = -1/G_{vout}$ (Fig. 4.22a). Fig. 4.21b indicates that $\Delta I_{out}$ is determined only by $\Delta V_{out}$ and is independent of the phase-shift operating point $(\Phi_{12})$. If the impedances of $R_m$ and $L_m$ are much larger than those of $R_1$, $R_2$, $L_1$, and $L_2$, the effective output resistance of a DAB converter can be derived analytically as:

$$R_{s} = -\frac{1}{G_{vout}} = \frac{(R_{1} + R_{2})}{1 - \frac{4\tau \left(1 - \exp\left(-\frac{T}{2\tau}\right)\right)}{T\left(1 + \exp\left(-\frac{T}{2\tau}\right)\right)}}, \tag{4.37}$$

in which the time constant  $\tau = (L_1 + L_2)/(R_1 + R_2)$ .
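To give a feel for the magnitude of $R_s$, Eq. (4.37) can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and the example borrows the per-port inductance, path resistance, and switching frequency of Table 4.4 for a two-port (DAB) case, which is an assumption rather than a configuration reported in this chapter.

```python
import math

def dab_output_resistance(R_total, L_total, T):
    """Effective output resistance R_s of Eq. (4.37).

    R_total = R1 + R2 (ohm), L_total = L1 + L2 (henry), T = switching period (s).
    """
    tau = L_total / R_total                      # time constant of the L-R path
    e = math.exp(-T / (2.0 * tau))
    ratio = 4.0 * tau * (1.0 - e) / (T * (1.0 + e))
    return R_total / (1.0 - ratio)

# Assumed example: two 20 mOhm paths, two 120 nH inductors, 100 kHz switching.
R_s = dab_output_resistance(R_total=40e-3, L_total=240e-9, T=1.0 / 100e3)
print(f"R_s = {R_s:.3f} ohm")
```

Under these assumed values $R_s$ is roughly an order of magnitude larger than $R_1 + R_2$. When $T \gg \tau$ the current settles within each half cycle and $R_s$ approaches $R_1 + R_2$.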

In a MAC architecture, the output current perturbation caused by the voltage perturbation at the same port can be interpreted as a square wave voltage source  $(\Delta V_i)$  driving a linear L-R network, as shown in Fig. 4.22b. Also, the induced output

current change $\Delta I_i$ is determined only by $\Delta V_i$ and is independent of the phase shifts $\{\Phi_1, \Phi_2, ..., \Phi_n\}$. The corresponding current source $G_v(i, i)\hat{v}_i$ can be interpreted as an output resistance $R_{si}$ added at each port, as shown in Fig. 4.22c. For a MAC-DPP architecture, Eq. (4.34) should be modified as follows if losses are considered:

$$Z_i = R_{Li}||R_{si}||\frac{1}{sC_i} = \frac{(R_{Li}||R_{si})}{s(R_{Li}||R_{si})C_i + 1}. \tag{4.38}$$

With the modified impedance matrix, the improved system transfer function can be derived from Eq. (4.36). The output resistance of each port $R_{si}$ can be estimated by circuit analysis, SPICE simulations, or experimental calibration (by measuring $\Delta I_i/\Delta V_i$ while keeping the phase shifts and the voltages of all other ports constant).
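The effect of $R_{si}$ on a single port impedance can be sketched numerically. The Python snippet below evaluates Eq. (4.38) and the corner frequency of one $Z_i$; the function names and the value $R_s = 0.74~\Omega$ are hypothetical, and an infinite $R_s$ is assumed to recover the lossless case.

```python
import math

def parallel_resistance(R_L, R_s=math.inf):
    """R_L || R_s, with R_s = inf meaning no loss-induced output resistance."""
    return R_L if math.isinf(R_s) else R_L * R_s / (R_L + R_s)

def port_impedance(s, R_L, C, R_s=math.inf):
    """Z_i of Eq. (4.38) evaluated at complex frequency s."""
    R_par = parallel_resistance(R_L, R_s)
    return R_par / (s * R_par * C + 1)

def port_pole(R_L, C, R_s=math.inf):
    """Corner frequency (rad/s) of a single Z_i."""
    return 1.0 / (parallel_resistance(R_L, R_s) * C)

# Assumed example: R_L = 3 ohm and C = 200 uF (Table 4.4), R_s = 0.74 ohm (hypothetical).
print(port_pole(3.0, 200e-6), port_pole(3.0, 200e-6, R_s=0.74))
```

Because $R_{Li}||R_{si}$ is always smaller than $R_{Li}$, the pole of $Z_i$ moves to a higher frequency and the dc value of $Z_i$ drops when losses are included.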

To validate the effectiveness of the proposed small signal model, a 10-port MAC-DPP converter, which connects ten 5 V voltage domains in series, was simulated in PLECS. The circuit parameters are listed in Table 4.4. Each dc-ac unit of the 10-port MAC-DPP converter is implemented as a half-bridge circuit, so a coefficient of $\frac{1}{2}$ needs to be added to the previously derived transfer functions. In this example analysis, we investigate the transfer function matrix with the MAC-DPP converter delivering power from nine ports (port #1 $\sim$ #9) to one port (port #10). The control-to-output transfer function of port #10 (i.e., $\frac{\hat{v}_{10}}{\hat{\phi}_{10}}$) is simulated with and without considering

Table 4.4: Simulation Parameters

| Parameter                                      | Value                 |
|------------------------------------------------|-----------------------|
| Switching Frequency $(f_s)$                    | 100 kHz               |
| External Inductance $(L_1 \sim L_{10})$        | $120~\mathrm{nH}$     |
| Magnetic Inductance $(L_m)$                    | $3.2~\mu\mathrm{H}$   |
| Output Capacitance $(C_1 \sim C_{10})$         | $200~\mu\mathrm{F}$   |
| Equivalent Path Resistance $(R_1 \sim R_{10})$ | $20~\mathrm{m}\Omega$ |
| Magnetic Resistance $(R_m)$                    | $220~\Omega$          |
| Load Resistance $(R_{L1} \sim R_{L9})$         | $10~\Omega$           |
| Load Resistance $(R_{L10})$                    | $3~\Omega$            |



Figure 4.23: (a) Comparison between calculated and simulated transfer function of a 10-port MAC-DPP converter with and without power losses. (b) Calculated and simulated $\phi$-to-$v$ Bode plots for three arbitrary ports in a 100-port lossless MAC-DPP system: (1) transfer function from $\hat{\phi}_{10}$ to $\hat{v}_1$; (2) from $\hat{\phi}_{45}$ to $\hat{v}_1$; (3) from $\hat{\phi}_{92}$ to $\hat{v}_1$.

losses. The conduction losses and core losses are represented by the  $R_1 \sim R_{10}$  and  $R_m$ , respectively.

Fig. 4.23a compares the calculated Bode plot and simulated Bode plot of the phase to voltage transfer function of the example MAC-DPP converter, with and without considering the losses. The calculated Bode plot matches well with the simulated Bode plot in both cases. The power conversion loss reduces the dc gain and changes the phases of the transfer function. The improved small signal model precisely captures the impact of the losses. The dominant pole  $(1/\sum R_{load.i}C_i)$  of the transfer function is pushed to higher frequency by the output resistance.

The developed small signal modeling approach can be easily extended to model a MAC architecture with an arbitrary number of ports. To validate the scalability and applicability of the approach, a 100-port SPICE simulation platform is built and tested in PLECS to capture the transfer function from $\hat{\phi}_i$ to $\hat{v}_j$ (Fig. 4.24) among the 100 ports. Compared to a conventional state-space based small signal model, the proposed modeling approach greatly reduces the computational load without sacrificing the model



Figure 4.24: PLECS simulation platform of a 100-port lossless MAC-DPP converter.

accuracy. Running SPICE simulations for large-scale MAC-DPP systems is computationally heavy and time-consuming, while an analytical model can rapidly produce the same results with very low computational requirements, opening opportunities to design and optimize very large scale MAC-DPP systems. As shown in Fig. 4.23b, the calculated Bode plots match very well with the simulated results, validating the effectiveness and scalability of the proposed small-signal modeling approach.

# 4.4.3 Feedback Control based on Distributed Phase Shift Modulation

Based on the derived small signal model, this subsection presents a simple but robust distributed control strategy that scales to very large MAC-DPP systems. Figure 4.25a plots the control block diagram, where each port uses a feedback loop to adjust its own phase based on the locally measured port voltage. As shown in Fig. 4.25b, the interactions among ports can be treated as disturbances. The closely coupled feedback loops can then be simplified as multiple standalone feedback loops with the explicit transfer function $G_s(s)$ captured in $(I - G_Z G_v)^{-1} G_Z G_\phi$ (Eq. (4.36)). The PI loop and phase controller of each port can be implemented as distributed phase-shift (DPS) modules synchronized by a system clock. The DPS module can be further integrated into each half bridge to enable a fully integrated modular building block. This distributed control strategy allows independent voltage



Figure 4.25: (a) Principles of the modular distributed control strategy of an example 3-port MAC-DPP converter. (b) Equivalent single loop for each port. (c) Loop gain Bode plot of port #1, port #10 with and without PI controller. Port #1 has the heaviest load with the largest phase margin, while port #10 has the lightest load with the lowest phase margin.

regulation of each port and can be easily applied to large-scale MAC-DPP systems with a very large number of ports.

The small signal model can provide useful guidance to designing the control loops for the MAC-DPP architecture. Fig. 4.25c shows a design example of the PI parameters in a 10-port MAC-DPP converter with distributed phase-shift control. Here, the feedback gain is considered as a delay unit, and its delay time is one switching cycle (T). The loop gain of each port before adding a PI controller is:

$$G_{Li}(s) = G_s(s)(i,i) \times e^{-Ts} \approx G_s(s)(i,i) \times \frac{1}{1+sT}. \tag{4.39}$$
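The first-order lag used in Eq. (4.39) tracks the phase of the pure delay closely only well below the switching frequency. The short Python sketch below compares the two phase contributions; the function names are hypothetical and the 100 kHz switching frequency is taken from Table 4.4.

```python
import numpy as np

def delay_phase_deg(f, T):
    """Phase of the pure delay exp(-sT) at frequency f (Hz), in degrees."""
    return -np.degrees(2 * np.pi * f * T)

def lag_phase_deg(f, T):
    """Phase of the first-order approximation 1/(1 + sT) at frequency f, in degrees."""
    return -np.degrees(np.arctan(2 * np.pi * f * T))

T = 1.0 / 100e3                       # one switching cycle at 100 kHz
for f in (5e3, 25e3):                 # f_s/20 and f_s/4
    print(f, delay_phase_deg(f, T), lag_phase_deg(f, T))
```

At one twentieth of the switching frequency the two phases differ by less than one degree (about 18.0° versus 17.4° of lag), whereas at one quarter of the switching frequency the delay contributes 90° of lag and the approximation only about 57.5°. This suggests that the approximation is most trustworthy when the loop crossover is kept well below the switching frequency.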

Fig. 4.25c shows the Bode plots of two ports with the highest phase margin and smallest phase margin, respectively. The phase margin of both loop gains without PI is higher than 45°. Therefore, the system bandwidth can be improved by trading off phase margin for bandwidth with a PI controller. By tuning the lowest phase

margin of port #10 close to 45°, the bandwidth of all the ports is expanded. Since the lowest phase margin among the 10 ports is still higher than 45°, the distributed control loop of each port is stable. It is worth noting that the power conversion losses of the MAC system usually shift the phase response rightwards and increase the system phase margin, so in practice the power losses create additional stability margin for the system dynamic response. The system may become unstable if the phase difference between two ports is greater than 90°, because the slope of the power-versus-phase characteristic changes sign beyond this point (see Eq. (4.43)). As a result, a phase limiting stage $(-45^{\circ} < \phi_i < 45^{\circ})$ should be included in the control loop.

# 4.4.4 Feedforward Control based on the Newton-Raphson Method

This subsection presents an alternative feedforward control strategy based on the Newton-Raphson method. Four variables are associated with each port of the MAC-DPP converter: $(P_i, Q_i)$ and $(V_i, \phi_i)$. $P_i$ and $Q_i$ are the active and reactive power injected into the $i^{th}$ port from the $i^{th}$ voltage domain. $V_i$ and $\phi_i$ are the amplitude and phase of the square wave voltage at each port. In power electronics designs, the reactive power at each port is usually absorbed by a large dc-filtering capacitor and is usually neglected. Depending on the design goals, one can control the voltage amplitude and the phase shift to modulate the injected power, or control the injected power and the phase shift to modulate the voltage of each port. When designing the control framework, the n ports can be grouped into two major categories:

- **PV port**: $P_i$ and $V_i$ are specified, and $\phi_i$ is unknown. In multiport power converters, sources and loads that require specific voltage amplitudes or active power injection can be modeled as PV ports.
- $\mathbf{V}\phi$ **port**: the reference port for the power flow calculation. $V_i$ is selected as the nominal voltage, $\phi_i$ is considered as zero, and $P_i$ is free-wheeling and is determined by the system needs. At least one $V\phi$ port is needed in a system to meet the energy conservation requirements. Usually, a port that is connected to an energy storage device can be selected as a $V\phi$ port.

In a multiport power converter, the majority of ports are usually PV ports, and the $V\phi$ port functions as an energy buffer that balances the input and output power of the system. As a result, a multiport power converter with n ports usually has n-1 PV ports and one $V\phi$ port. The phases of all n ports need to be precisely controlled to control the power flow in the multiport passive network. The targets of this control framework are the active powers of the n-1 PV ports, and the input variables are the phases of the n-1 PV ports (the phase of the $V\phi$ port is zero). The key challenge is to solve for n-1 unknown variables from the n-1 nonlinear power flow equations reorganized from (4.25):

$$\begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_{n-1} \end{bmatrix} = f\left(\begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_{n-1} \end{bmatrix}\right), \quad f : \mathbb{R}^{n-1} \to \mathbb{R}^{n-1}. \tag{4.40}$$

We adopted the Newton-Raphson method from power system analysis [168] to solve these nonlinear equations. The Newton-Raphson method linearizes the nonlinear equations and iterates toward the desired solution. It converges fast and requires low computation power (enabling a microcontroller or FPGA implementation), but it is sensitive to the initial anticipated solution (i.e., the initial guess). Other methods such as Gauss-Seidel iteration and fast decoupled methods [168] may be applicable to specific cases. Sophisticated power flow calculation tools such as Matpower [169] can also be used at the cost of higher computational requirements. In this work, a Newton-Raphson solver customized for power flow analysis has been developed, and a corresponding software tool named Mapflow (Mapping the Power Flow) is open-sourced on GitHub<sup>1</sup>. The solver takes the following inputs: 1) network information, including the branch inductance, magnetizing inductance, branch capacitance, and branch resistance; 2) the targeted active power and voltage amplitude of the PV ports; 3) the voltage amplitude of the reference port; 4) the initial anticipated solutions for the phases of the PV ports. The iteration step for the solver is:

$$\begin{bmatrix} \Delta \phi_1 \\ \Delta \phi_2 \\ \vdots \\ \Delta \phi_{n-1} \end{bmatrix} = \begin{bmatrix} \frac{\partial f_1}{\partial \phi_1} & \cdots & \frac{\partial f_1}{\partial \phi_{n-1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_{n-1}}{\partial \phi_1} & \cdots & \frac{\partial f_{n-1}}{\partial \phi_{n-1}} \end{bmatrix}^{-1} \begin{bmatrix} \Delta P_1 \\ \Delta P_2 \\ \vdots \\ \Delta P_{n-1} \end{bmatrix}, \tag{4.41}$$

$$\phi_k = \phi_{k-1} - \Delta \phi. \tag{4.42}$$

Accordingly, the Jacobian matrix is:

$$\frac{\partial f_i}{\partial \phi_j} = \begin{cases}
\frac{V_i V_j}{2\pi f_s L_{ij}} \left(\frac{2|\Phi_{ij}|}{\pi} - 1\right) & j \neq i \\
\sum_{k \neq i} \frac{V_i V_k}{2\pi f_s L_{ik}} \left(1 - \frac{2|\Phi_{ik}|}{\pi}\right) & j = i
\end{cases} \tag{4.43}$$
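The iteration in Eqs. (4.41)-(4.43) can be sketched in a few lines of Python for a lossless network. This is a minimal sketch under stated assumptions and is not the Mapflow implementation: it assumes the per-branch power law $\frac{V_i V_k}{2\pi f_s L_{ik}}\Phi_{ik}\left(1 - \frac{|\Phi_{ik}|}{\pi}\right)$ with $\Phi_{ik} = \phi_i - \phi_k$ (the form whose derivative reproduces Eq. (4.43) and whose peak reproduces Eq. (4.46)), takes $\Delta P$ as the computed power minus the target, and uses hypothetical function names and example values.

```python
import numpy as np

def branch_gain(V, L, fs):
    """g[i, k] = V_i V_k / (2 pi f_s L_ik); diagonal entries of L are ignored."""
    V = np.asarray(V, dtype=float)
    g = np.outer(V, V) / (2 * np.pi * fs * np.asarray(L, dtype=float))
    np.fill_diagonal(g, 0.0)
    return g

def port_power(phi, g):
    """Active power of every port under the assumed phase-shift power law."""
    d = phi[:, None] - phi[None, :]              # Phi_ik = phi_i - phi_k
    return np.sum(g * d * (1 - np.abs(d) / np.pi), axis=1)

def jacobian(phi, g):
    """Jacobian of Eq. (4.43) for all ports."""
    d = phi[:, None] - phi[None, :]
    slope = g * (1 - 2 * np.abs(d) / np.pi)
    J = -slope                                   # off-diagonal entries (j != i)
    J[np.diag_indices_from(J)] = slope.sum(axis=1)  # diagonal entries (j == i)
    return J

def solve_phases(P_target, V, L, fs, phi0=None, tol=1e-9, max_iter=50):
    """Find the n-1 PV-port phases; the last port is the V-phi reference (phase 0)."""
    g = branch_gain(V, L, fs)
    phi = np.zeros(len(V))
    if phi0 is not None:
        phi[:-1] = phi0                          # initial anticipated solution
    P_target = np.asarray(P_target, dtype=float)
    for _ in range(max_iter):
        dP = port_power(phi, g)[:-1] - P_target  # power mismatch of the PV ports
        if np.max(np.abs(dP)) < tol:
            return phi
        J = jacobian(phi, g)[:-1, :-1]
        phi[:-1] -= np.linalg.solve(J, dP)       # Eqs. (4.41)-(4.42)
    raise RuntimeError("Newton-Raphson did not converge")

# Assumed three-port example: 5 V ports, 1 uH between every pair, 100 kHz.
V = [5.0, 5.0, 5.0]
L = np.full((3, 3), 1e-6)
phi = solve_phases([10.0, -5.0], V, L, fs=100e3)
print(phi, port_power(phi, branch_gain(V, L, 100e3)))
```

The sketch does not wrap phase differences beyond $\pm\pi$ and does not model losses or the magnetizing branch, so it is only meaningful for targets well inside the feasible range.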

With an appropriate initial operating point, the phases of all ports can usually be found within a few iteration steps. These phases are then used to control the dc-ac converter at each port. If full bridges are implemented as the dc-ac converters in Fig. 4.18a, the voltage amplitude of a port becomes an additional control variable, and more feasible solutions can be found for the nonlinear power flow equations. Such ports can be defined as P ports, for which only the active power is specified. Assume there are n ports in total, of which the first m ports are P ports, the last port is the reference port, and the remaining ports are PV ports. The iteration step of the

<sup>&</sup>lt;sup>1</sup>https://github.com/PingWang3741/Multiport-Power-Converter.git

solver is:

$$\begin{bmatrix} \Delta V_{1} \\ \vdots \\ \Delta V_{m} \\ \Delta \phi_{1} \\ \vdots \\ \Delta \phi_{n-1} \end{bmatrix} = \begin{bmatrix} \frac{\partial f_{1}}{\partial V_{1}} & \cdots & \frac{\partial f_{1}}{\partial V_{m}} & \frac{\partial f_{1}}{\partial \phi_{1}} & \cdots & \frac{\partial f_{1}}{\partial \phi_{n-1}} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_{n-1}}{\partial V_{1}} & \cdots & \frac{\partial f_{n-1}}{\partial V_{m}} & \frac{\partial f_{n-1}}{\partial \phi_{1}} & \cdots & \frac{\partial f_{n-1}}{\partial \phi_{n-1}} \end{bmatrix}^{-1} \begin{bmatrix} \Delta P_{1} \\ \Delta P_{2} \\ \vdots \\ \Delta P_{n-1} \end{bmatrix}, \tag{4.44}$$

$$\mathbf{V}_k = \mathbf{V}_{k-1} - \Delta \mathbf{V}, \quad \boldsymbol{\phi}_k = \boldsymbol{\phi}_{k-1} - \Delta \boldsymbol{\phi}. \tag{4.45}$$

There are n-1+m unknown variables and only n-1 nonlinear equations, so the Jacobian matrix is not square. The Moore-Penrose pseudoinverse of the Jacobian matrix can be used in place of the inverse to calculate the iteration step. When $V_i$ of the $i^{th}$ P port reaches its maximum, this P port reverts to a PV port in the next iteration.
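For a wide Jacobian with full row rank, the pseudoinverse returns the smallest step (in the Euclidean norm) that cancels the linearized power mismatch. The following Python sketch illustrates this with a small hypothetical Jacobian; the function name and the numbers are placeholders and do not come from a MAC converter model.

```python
import numpy as np

def min_norm_step(J, dP):
    """Minimum-norm solution of J @ step = dP for a non-square Jacobian."""
    return np.linalg.pinv(J) @ dP

# Hypothetical 2-equation, 3-unknown example.
J = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0]])
dP = np.array([2.0, 4.0])
step = min_norm_step(J, dP)
print(step)          # the mismatch in the first row is shared equally by unknowns 1 and 3
```

Among all steps that satisfy the linearized equations, this choice changes the voltages and phases the least, which is one reasonable way to resolve the extra degrees of freedom.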

It is known that the convergence of the Newton-Raphson method is very sensitive to the initial anticipated solution. The Newton-Raphson method will not converge if there is no feasible solution. As a result, it is critical to determine if the targeted power is even feasible before the iteration starts. In a large scale MAC converter, the maximal and minimal power injected into port i are:

$$P_{i,\max} = \frac{\pi}{4} \sum_{k \neq i} \frac{V_i V_k}{2\pi f_s L_{ik}}, \tag{4.46}$$

$$P_{i,\min} = -\frac{\pi}{4} \sum_{k \neq i} \frac{V_i V_k}{2\pi f_s L_{ik}}. \tag{4.47}$$
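A quick pre-check based on Eqs. (4.46) and (4.47) can reject targets that are clearly out of range before any iteration starts. The Python sketch below uses hypothetical function names and an assumed three-port example; passing this check is necessary but not sufficient for feasibility.

```python
import numpy as np

def power_bounds(V, L, fs):
    """Per-port (P_min, P_max) from Eqs. (4.46)-(4.47); diagonal of L is ignored."""
    V = np.asarray(V, dtype=float)
    g = np.outer(V, V) / (2 * np.pi * fs * np.asarray(L, dtype=float))
    np.fill_diagonal(g, 0.0)
    p_max = np.pi / 4 * g.sum(axis=1)
    return -p_max, p_max

def within_port_bounds(P_target, V, L, fs):
    """True if every targeted port power lies inside its individual bound."""
    p_min, p_max = power_bounds(V, L, fs)
    P = np.asarray(P_target, dtype=float)
    return bool(np.all((P >= p_min) & (P <= p_max)))

# Assumed example: three 5 V ports, 1 uH between every pair, 100 kHz.
V = [5.0, 5.0, 5.0]
L = np.full((3, 3), 1e-6)
print(power_bounds(V, L, 100e3)[1])            # 62.5 W per port
print(within_port_bounds([10.0, -5.0, -5.0], V, L, 100e3))
```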

Different ports need different control phases to reach their maximal rated power, so they cannot all reach the maximum simultaneously. We can use the sum of power squares (SPS) to describe the joint feasible power range of multiple ports. A conservative subset of the



Figure 4.26: (a) Feasible power range of a three-port MAC converter. (b) Fractal convergence region of the Newton-Raphson solver. The solver is more likely to converge if the initial anticipated points are close to the final solution. Empirically, for a symmetric multiport network, starting from the origin is always a good strategy.

complete feasible power region can be defined using SPS, which is written as:

$$\sum_{i=1}^{n} P_i^2 \le \beta_{max},\tag{4.48}$$

where $\beta_{max}$ is a conservative bound that guarantees a feasible solution as long as Eq. (4.48) holds true. Here, we use a three-port MAC converter as an example to demonstrate the effectiveness of using Eq. (4.48) as a convergence bound. We randomly selected the targeted power in a range and recorded the targets that have phase-shift solutions. Results are plotted in Fig. 4.26a, where the blue points form the feasible power region and the red points form the infeasible power region. The blue inner circle is a conservative region for the feasible power range as described by Eq. (4.48). The convergence of the Newton-Raphson method is also determined by the initial anticipated solutions. Using the developed Newton-Raphson solver, a three-port MAC converter is tested, where two ports are PV ports and the other one is the $V\phi$ port. We selected a feasible targeted power in the range of Eq. (4.48) and swept the initial

phase of the two ports from $-\pi$ to $\pi$, recording whether the solver converged. As shown in Fig. 4.26b, two target solutions exist, and the convergence region for the initial phases has a fractal boundary. It also indicates that initial solutions close to the target solution converge more reliably. With a look-up table that stores a certain number of target powers with their solutions, the solver can always start from an initial point near the target solution to ensure the convergence of the algorithm. Empirically, we found that for a MAC converter with identical impedance on all branches, using zero phase shift as the initial solution almost always leads to a converging result if the targeted power is feasible.

### 4.5 String Voltage Regulation for DPP

In DPP systems where the series loads are directly connected to the input dc bus, the stacked DPP string voltage is fixed to the input dc bus voltage. The tightly coupled load voltage and bus voltage might deteriorate system performance. In PV systems, for example, the voltage needed to achieve MPPT for each PV string varies due to differences in illuminance. Directly connecting multiple PV strings to the dc bus forces different PV strings to share the same string voltage, lowering the overall power generation. In other cases such as servers in data centers, the dc bus voltage in a server rack may vary between 48 V and 54 V, whereas the IT equipment needs a precisely regulated voltage (e.g., 24 V, 12 V, or 5 V) to function properly. An input regulation stage that decouples the series domain voltage from the dc bus voltage is needed and is the main focus of this section.

The most straightforward way of implementing an input voltage regulator is to design a standalone front-end dc-dc converter. In this case, however, the front-stage dc-dc converter processes the full load power, reducing the benefits gained from differential power processing and limiting the overall system efficiency and power density that can be achieved. An alternative way is to regulate the DPP string voltage



Figure 4.27: A series voltage compensator (SVC) leveraging the partial power processing concept for voltage regulation of DPP systems. SVC only processes a fraction of total power. Major power is directly delivered to the DPP loads.

through partial power processing. The partial power processing concept first appeared in PV applications [170–172]. Similar topologies such as the sigma converter [68] and the composite converter [146] were proposed later. These topologies can be generally classified into two categories: the input-parallel-output-series (IPOS) structure and the input-series-output-parallel (ISOP) structure [147, 148]. Usually, in partial power converters, the majority of the power is delivered directly to the loads. This direct power has no impact on the size or losses of the power converters [149–151]. Some partial power converters deliver the majority of the power through a high-efficiency fixed-ratio dc-dc conversion stage to minimize size and losses. Consequently, only a fraction of the power (i.e., the partial power) is processed by the voltage regulation stage, which reduces the power rating and improves the efficiency [152, 153].

Leveraging the partial power processing concept, a variety of series voltage compensators (SVCs) for voltage pre-regulation in DPP systems are investigated and compared in this section. Fig. 4.27 shows the general architecture of an SVC. Different from a standalone voltage pre-regulator, the SVC converter is in effect connected in series with the DPP loads, compensating for the voltage difference between the input dc bus and the stacked DPP loads. The negative terminals of the input and output ports of the SVC are tied to the middle of the stacked loads, leading to a decreased voltage rating. Therefore, the SVC only processes a fraction of the overall system power and delivers it to the top few voltage domains. The majority of the power is directly delivered to the DPP loads. The topologies presented in [137, 173, 174] are a subset of the SVC family investigated in this section. We generalize these topologies and perform a systematic analysis of the power rating and the additional power conversion stress that the SVC brings to DPP converters. To validate the theoretical analysis, a buck SVC topology was designed and tested with the MAC-DPP converter, which will be presented in Section 4.6.

### 4.5.1 Power Stress Analysis of SVC

As shown in Fig. 4.27, an SVC converter is a modified dc-dc converter with unique input/output terminal configurations to take advantage of partial power processing. The input and output negative terminals of the SVC converter are connected to the middle of the stacked loads. In this way, the SVC current rating is the same as the overall system, but its voltage rating is only a portion of the system voltage rating. Therefore, the SVC converter processes a fraction of the total system power and can have much lower power loss and component size compared to a standalone dc-dc regulator. Although the SVC converter only processes a portion of the system power, it may increase the power conversion stress of the DPP converters. As indicated in Fig. 4.28, the SVC only delivers power to the top few voltage domains, creating additional power imbalance which needs to be handled by the DPP converter.

The main purpose of this subsection is to identify the operating conditions in which an SVC is attractive or not compared to a conventional DPP pre-regulation converter that has to process the full power as shown in Fig. 4.29. Both the SVC power conversion stress and the additional stress introduced by the SVC to the DPP are considered in this comparison. The operation boundaries for SVC to achieve



Figure 4.28: The SVC incurred power processing consists of: (1) SVC processed power  $P_{SVC}$ ; (2) additional differential power in DPP converters  $\Delta P_{DPP}$ .



Figure 4.29: Conventional voltage pre-regulator for DPP system. In contrast, the standalone dc-dc regulator needs to process total system input power  $P_{IN}$ .

lower overall system power stress than a traditional DPP pre-regulation converter are derived. To the best of our knowledge, this is the first systematic analysis of the impact of SVC on DPP.

Fig. 4.28 labels the voltage and current ratings of an SVC. Assume the DPP system comprises N series voltage domains, and the negative terminals of the input and output ports of the SVC are tied to the negative terminal of the $K^{th}$ domain, whose voltage is $\frac{N-K}{N}V_{DPP}$. The power processed by the SVC is then

$$P_{SVC} = \left(V_{IN} - \frac{N - K}{N} V_{DPP}\right) I_{IN}. \tag{4.49}$$

To quantify the benefits of partial power processing, the SVC processed power is normalized to the total system input power and is denoted as  $\rho_{SVC}$ :

$$\rho_{SVC} = \frac{\left(V_{IN} - \frac{N - K}{N} V_{DPP}\right) I_{IN}}{V_{IN} I_{IN}} = 1 - (1 - K_s) M_v. \tag{4.50}$$

Here,  $M_v = \frac{V_{DPP}}{V_{IN}}$  is the voltage regulation ratio;  $K_s = \frac{K}{N}$  is the ratio of the SVC-tied voltage domain to the overall number of voltage domains. The SVC converters discussed herein are non-inverting converters and their input or output polarity does not flip during the operation. Thus, the input voltage  $(V_{IN})$  should be larger than the negative terminal voltage of the  $K^{th}$  domain  $(\frac{N-K}{N}V_{DPP})$ . As a result, the feasible range for  $M_v$  is:  $M_v < \frac{1}{1-K_s}$ .

Fig. 4.30a plots the normalized SVC power as a function of the voltage regulation ratio for different values of $K_s$. If K = 1 and N is very large, $K_s \to 0$ and $\rho_{SVC} = 0$ at $M_v = 1$, indicating that the SVC is not processing power. When $K_s \to 1$, the output of the SVC is almost directly attached to the entire stack of DPP series voltage domains. In this case, the SVC becomes a conventional standalone dc-dc regulator, and $\rho_{SVC}$ becomes one, as shown in Fig. 4.30a. $\rho_{SVC}$ increases as $M_v$ decreases in both the buck region $(M_v < 1)$ and the boost region $(1 < M_v < \frac{1}{1-K_s})$, but it always remains less than one, indicating that the SVC processed power is always less than the total load power. As $K_s$ increases, the voltage regulation range in the boost region becomes larger, but the SVC voltage rating also increases, resulting in a higher $\rho_{SVC}$.



Figure 4.30: (a) Normalized SVC processed power $\rho_{SVC}$ as a function of voltage regulation ratio $M_v$. (b) Normalized additional DPP processed power $\rho_{DPP}$ as a function of the voltage regulation ratio $M_v$. (c) Normalized total SVC incurred power processing $\rho_{tot} = \rho_{SVC} + \rho_{DPP}$ as a function of the voltage regulation ratio $M_v$. Each curve is plotted only within its feasible range: $M_v < \frac{1}{1-K_s}$.

In DPP systems, power converters work to balance the differential power among the series domains. An SVC that only delivers power to the top few voltage domains causes a power imbalance among the series domains and brings additional power conversion stress to the DPP system. DPP converters need to cope with both the inherent power mismatch of the series domains and the power imbalance caused by the SVC. Here, we quantitatively analyze the additional differential power in a fully-coupled DPP architecture [143], where there is a direct power flow path between any two voltage domains. The analytical framework can be further extended to other DPP architectures (e.g., ladder DPP [121,161]) with indirect power delivery paths.

Differential power flow in DPP systems is determined by the power distribution across the series voltage domains, which is dynamic and unpredictable [143, 158]. For a well-designed DPP system, however, the load powers of different voltage domains are expected to be close on average [139]. Thus, we analyze the average increase in differential power caused by the SVC by assuming a uniform load power across all series domains. In this case, the average summed load power of the top K voltage domains is $\frac{K}{N}P_{IN}$, and the power that the SVC delivers to the top K domains is $\rho_{SVC}P_{IN}$. If the SVC operates in the buck region, $\rho_{SVC} > \frac{K}{N}$, so the DPP converter needs to deliver the differential power $(\rho_{SVC} - \frac{K}{N}) P_{IN}$ from the top K domains to the lower N - K domains. When the SVC operates in the boost region, $\rho_{SVC} < \frac{K}{N}$, and the DPP converter needs to deliver the differential power $(\frac{K}{N} - \rho_{SVC}) P_{IN}$ in the opposite direction. In both regions, the average additional differential power that an SVC brings to the system is

$$\Delta P_{DPP} = \left| \rho_{SVC} - \frac{K}{N} \right| \times V_{IN} I_{IN}. \tag{4.51}$$

Similarly, we normalize the additional differential power to the total input power:

$$\rho_{DPP} = \frac{\left| \rho_{SVC} - \frac{K}{N} \right| V_{IN} I_{IN}}{V_{IN} I_{IN}} = (1 - K_s) \left| 1 - M_v \right|. \tag{4.52}$$

Fig. 4.30b plots the relationship between the normalized additional differential power and the voltage regulation ratio $M_v$. In both the buck and boost regions, $\rho_{DPP}$ increases as $M_v$ deviates from one (i.e., as the gap between $V_{IN}$ and $V_{DPP}$ becomes larger). $\rho_{DPP}$ becomes zero if $M_v = 1$ (i.e., $V_{IN}$ equals $V_{DPP}$). Different from $\rho_{SVC}$, $\rho_{DPP}$ decreases as $K_s$ increases. As $K_s \to 1$, the SVC behaves more like a standalone dc-dc regulator, and the additional power stress reduces.

A normalized total SVC incurred power processing  $\rho_{tot}$  ( $\rho_{tot} = \rho_{SVC} + \rho_{DPP}$ ) is used as a performance metric for evaluating the performance of an SVC. A lower  $\rho_{tot}$  indicates a lower total power stress and better performance. If  $\rho_{tot} > 1$ , the total

|             | Buck Operation Region $(M_v < 1)$ | Boost Operation Region $(1 < M_v < \frac{1}{1 - K_s})$ |
|-------------|-----------------------------------|--------------------------------------------------------|
| $\rho_{SVC}$ | $1 - (1 - K_s)M_v$                | $1 - (1 - K_s)M_v$                                     |
| $\rho_{DPP}$ | $(1-K_s)(1-M_v)$                  | $(1-K_s)(M_v-1)$                                       |
| $\rho_{tot}$ | $(2-K_s)-(2-2K_s)M_v$             | $K_s$                                                  |

Table 4.5: SVC Incurred Power Processing in Buck and Boost Region

SVC incurred power processing will be higher than the total input power, and the SVC loses its advantage over a standalone dc-dc regulator. Fig. 4.30c plots $\rho_{tot}$ as a function of $M_v$. $\rho_{tot}$ remains constant ($\rho_{tot} = K_s$) when the SVC operates in the boost region. In the buck region, $\rho_{tot}$ increases as $M_v$ decreases (i.e., as the difference between $V_{IN}$ and $V_{DPP}$ grows), and it exceeds one if $M_v < 0.5$ for any $K_s$. If the voltage regulation ratio $M_v$ is larger than 0.5 ($V_{IN} < 2V_{DPP}$), the overall SVC incurred power processing ($\rho_{SVC} + \rho_{DPP}$) is lower than one regardless of how the SVC and DPP are configured (i.e., independent of $K_s$), indicating that a DPP with an SVC can offer better performance than a traditional DPP architecture with a standalone, fully rated regulation stage in most operating conditions. The normalized figures of merit for the SVC in the boost and buck regions are summarized in Table 4.5.
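The expressions in Eqs. (4.50) and (4.52) and Table 4.5 are simple enough to check numerically. The Python sketch below evaluates the three normalized stress metrics for a given operating point; the function name and the example operating points are illustrative assumptions.

```python
def svc_stress(M_v, K_s):
    """Return (rho_SVC, rho_DPP, rho_tot) from Eqs. (4.50) and (4.52).

    M_v = V_DPP / V_IN is the voltage regulation ratio and K_s = K / N.
    """
    if not 0.0 < K_s <= 1.0:
        raise ValueError("K_s must lie in (0, 1]")
    if M_v <= 0.0 or (K_s < 1.0 and M_v >= 1.0 / (1.0 - K_s)):
        raise ValueError("M_v is outside the feasible range M_v < 1/(1 - K_s)")
    rho_svc = 1.0 - (1.0 - K_s) * M_v
    rho_dpp = (1.0 - K_s) * abs(1.0 - M_v)
    return rho_svc, rho_dpp, rho_svc + rho_dpp

# Assumed operating points: buck (M_v = 0.9) and boost (M_v = 1.05) with K_s = 0.1.
print(svc_stress(0.9, 0.1))    # rho_tot = (2 - K_s) - (2 - 2 K_s) M_v = 0.28
print(svc_stress(1.05, 0.1))   # rho_tot = K_s = 0.1
```

Evaluating the buck-region expression at $M_v = 0.5$ gives $\rho_{tot} = 1$ for every $K_s$, which is the break-even point discussed above.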

### 4.5.2 Example Topology Implementations of SVC

SVC can be implemented in many different ways with trade-offs in voltage regulation range, control complexity, efficiency, and component count. Fig. 4.31 shows several circuit implementations of SVC. One can either implement the SVC as an individual partial power converter (Figs. 4.31a-4.31c), or merge the SVC into the DPP converter as one extra element (Fig. 4.31d). Fig. 4.31a shows a buck SVC which applies to the circumstances where the input voltage is higher than the string voltage of DPP systems. In the case where the input voltage is lower, SVC can be implemented as a boost converter as shown in Fig. 4.31b. The buck SVC and boost SVC have a



Figure 4.31: Several circuit implementations of SVC: (a) buck SVC; (b) boost SVC; (c) buck-boost SVC; (d) extra DPP port [137, 174]. The negative terminal of the input and output ports of the SVC is connected to the negative terminal of the first voltage domain to achieve the maximum benefits.

very low component count, but they can only regulate the input voltage towards one direction (either up or down).

Fig. 4.31c is a non-inverting buck-boost SVC that can regulate the input voltage in both directions. It requires more components and more sophisticated control, but it offers a wider regulation range. Fig. 4.31d shows an SVC topology implemented as an extra port of the DPP converter, where the extra DPP port compensates for the gap between dc bus voltage and stacked string voltage. Input string current is bypassed through the DPP converter. It can either step up or step down the input voltage depending on the designed polarity of the extra port. Voltage regulation of the extra port can be merged with the master controller of the DPP converter. For an extra-port SVC of a fully-coupled DPP converter, the total SVC incurred power processing is exactly the additional differential power. The normalized total SVC incurred power for an extra-port SVC is:

$$\rho_{tot} = \rho_{DPP} = \frac{|V_{IN} - V_{DPP}| \times I_{IN}}{V_{IN}I_{IN}} = |1 - M_v|. \tag{4.53}$$

To compare different SVC topologies, the component load factor (CLF) that includes the impacts of component count and stress is used as an evaluation metric [175, 176]:

$$CLF = \frac{V^*I^*}{P_{tot}}. (4.54)$$

Here, $V^*$  is the maximum blocking voltage of switches or the ac average voltage of inductors;  $I^*$  is the root-mean-square (RMS) current value of switches and inductors;  $P_{tot}$  is the total load power. A lower CLF indicates lower component stress or better utilization of the components. In the buck SVC or boost SVC (Figs. 4.31a-4.31b), the upper and lower switches  $(S_1 \& S_2)$  are controlled by two complementary gate signals. As for the buck-boost converter (Fig. 4.31c), different control methods exist with trade-offs in driving circuit complexity, inductor size, switch utilization, and converter efficiency [177, 178]. The discussion of the buck-boost SVC in this section assumes that the two switches in each half-bridge are controlled by complementary signals, and that the two half-bridges switch oppositely (i.e.,  $S_1\&S_4$  or  $S_2\&S_3$  are in phase). Define  $D$  as the duty ratio of  $S_1$  in the three SVC topologies. The CLFs of switches and inductors are calculated and compared across the buck, boost, and buck-boost SVCs. The current ripple is ignored when calculating the current RMS value; power loss is not considered so that  $P_{tot} = V_{IN}I_{IN}$ . In a buck SVC, the negative terminals of the input and output ports are connected to the negative terminal of the  $K^{th}$  domain, whose voltage is  $(1 - K_s)V_{DPP}$ . To maintain the volt-second balance of the inductor, the duty cycle of the buck SVC should satisfy:

$$D \times \underbrace{[V_{IN} - (1 - K_s)V_{DPP}]}_{\text{SVC Input Voltage}} = \underbrace{K_s V_{DPP}}_{\text{SVC Output Voltage}}. \tag{4.55}$$

As a result, the duty ratio of the buck SVC is:

$$D = \frac{M_v K_s}{M_v K_s + 1 - M_v} \tag{4.56}$$

| Topologies        | Transistor CLF                                                        | Inductor CLF                    | Duty Ratio $(D)$                     |
|-------------------|-----------------------------------------------------------------------|---------------------------------|--------------------------------------|
| Buck SVC          | $\frac{\left(\sqrt{D} + \sqrt{1 - D}\right)K_s}{(D - D^2)K_s + D^2}$  | $\frac{(1-D)K_s}{(1-D)K_s+D}$   | $\frac{M_v K_s}{M_v K_s + 1 - M_v}$  |
| Boost SVC         | $\frac{\left(\sqrt{D} + \sqrt{1 - D}\right)K_s}{1 - (1 - D)K_s}$      | $\frac{(D-D^2)K_s}{1-(1-D)K_s}$ | $\frac{M_v K_s + 1 - M_v}{M_v K_s}$  |
| Buck-Boost SVC    | $\frac{\left(\sqrt{D} + \sqrt{1 - D}\right)K_s}{(D - 2D^2)K_s + D^2}$ | $\frac{(1-D)K_s}{(1-2D)K_s+D}$  | $\frac{M_v K_s}{2M_v K_s + 1 - M_v}$ |
| Buck              | $\frac{\sqrt{D} + \sqrt{1 - D}}{D}$                                   | 1-D                             | $M_v$                                |
| Boost             | $\frac{\sqrt{D}+\sqrt{1-D}}{D}$                                       | 1-D                             | $\frac{1}{M_v}$                      |
| Buck-Boost        | $\frac{\sqrt{D} + \sqrt{1 - D}}{D(1 - D)}$                            | 1                               | $\frac{M_v}{1+M_v}$                  |

Table 4.6: Comparison of Different SVC Topologies and Standalone dc-dc Regulators

In a buck SVC, the blocking voltage of each switch is the SVC input voltage which can be reorganized as  $\frac{K_s V_{IN}}{(1-D)K_s+D}$ , and the RMS currents of  $S_1$  and  $S_2$  are  $\frac{\sqrt{D}}{D}I_{IN}$  and  $\frac{\sqrt{1-D}}{D}I_{IN}$ , respectively, so the transistor CLF of buck SVC is:

$$\begin{aligned}
\text{Transistor CLF} &= \frac{\frac{K_s V_{IN}}{(1-D)K_s+D} \times \left(\frac{\sqrt{D}}{D} I_{IN} + \frac{\sqrt{1-D}}{D} I_{IN}\right)}{V_{IN} I_{IN}} \\
&= \frac{(\sqrt{D} + \sqrt{1-D})K_s}{(D-D^2)K_s + D^2}.
\end{aligned} \tag{4.57}$$

The average voltage of the inductor is  $\frac{D(1-D)K_sV_{IN}}{(1-D)K_s+D}$ , and the RMS inductor current is  $\frac{I_{IN}}{D}$ , so the inductor CLF of buck SVC is:

$$\text{Inductor CLF} = \frac{\frac{D(1-D)K_sV_{IN}}{(1-D)K_s+D} \times \frac{I_{IN}}{D}}{V_{IN}I_{IN}} = \frac{(1-D)K_s}{(1-D)K_s+D}. \tag{4.58}$$

Similarly, the component load factors and required duty ratios for the other SVC topologies can be derived; they are summarized in Table 4.6. Fig. 4.32a plots the switch and inductor CLFs of the three SVC topologies when  $K_s = 0.5$ . While the buck-boost SVC has a wider regulation range than the buck SVC and the boost SVC, its switch and inductor CLFs are higher over the full regulation range, as shown in Fig. 4.32a.
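As an illustration, the SVC rows of Table 4.6 can be evaluated directly. The Python sketch below is a minimal transcription of those closed-form expressions, evaluated at $K_s = 0.5$ as in Fig. 4.32a. The function names and the sampled $M_v$ values are assumptions made for this example, and each topology is only evaluated where its duty ratio lies between 0 and 1.

```python
from math import sqrt


def duty(topology: str, m_v: float, k_s: float) -> float:
    """Duty ratio of S1 for each SVC topology (Table 4.6)."""
    if topology == "buck":
        return m_v * k_s / (m_v * k_s + 1.0 - m_v)
    if topology == "boost":
        return (m_v * k_s + 1.0 - m_v) / (m_v * k_s)
    if topology == "buck-boost":
        return m_v * k_s / (2.0 * m_v * k_s + 1.0 - m_v)
    raise ValueError(f"unknown topology: {topology}")


def transistor_clf(topology: str, m_v: float, k_s: float) -> float:
    """Transistor CLF column of Table 4.6."""
    d = duty(topology, m_v, k_s)
    num = (sqrt(d) + sqrt(1.0 - d)) * k_s
    if topology == "buck":
        return num / ((d - d * d) * k_s + d * d)
    if topology == "boost":
        return num / (1.0 - (1.0 - d) * k_s)
    return num / ((d - 2.0 * d * d) * k_s + d * d)


def inductor_clf(topology: str, m_v: float, k_s: float) -> float:
    """Inductor CLF column of Table 4.6."""
    d = duty(topology, m_v, k_s)
    if topology == "buck":
        return (1.0 - d) * k_s / ((1.0 - d) * k_s + d)
    if topology == "boost":
        return (d - d * d) * k_s / (1.0 - (1.0 - d) * k_s)
    return (1.0 - d) * k_s / ((1.0 - 2.0 * d) * k_s + d)


K_S = 0.5
# Step-down points compare buck vs. buck-boost; step-up points compare boost vs. buck-boost.
for m_v, topologies in ((0.6, ("buck", "buck-boost")), (0.8, ("buck", "buck-boost")),
                        (1.2, ("boost", "buck-boost")), (1.5, ("boost", "buck-boost"))):
    for name in topologies:
        print(f"M_v = {m_v}, {name:10s}: transistor CLF = {transistor_clf(name, m_v, K_S):.3f}, "
              f"inductor CLF = {inductor_clf(name, m_v, K_S):.3f}")
```

Substituting Eq. (4.56) into Eq. (4.58) reduces the buck SVC inductor CLF to $1 - M_v$, which is the inductor CLF of a conventional buck converter with $D = M_v$.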



Figure 4.32: Transistor and inductor CLFs comparison between: (a) different SVC topologies when  $K_s = 0.5$ ; (b) buck SVC and conventional buck; (c) boost SVC and conventional boost; and (d) buck-boost SVC and conventional buck-boost. For each  $K_s$ , CLFs are plotted within the feasible range:  $M_v < \frac{1}{1-K_s}$ .

In Table 4.6, the CLF and D of the three topologies when implemented as conventional buck, boost, and buck-boost converters are also calculated for comparison. These conventional dc-dc converters compared here are performing the same voltage pre-regulation task as the SVC for the DPP system. Figs. 4.32b - 4.32d plot the switch



Figure 4.33: Regulation ratio  $(M_v)$  at the crossing point where transistor CLFs of the SVC topology and its conventional counterpart are equal. The crossing point  $M_v$  value is not continuous at  $K_s = 1$ , where the transistor CLF curves of the SVC and conventional converter will overlap instead of crossing.

and inductor CLFs of the three SVC topologies with different  $K_s$  values and plot the CLFs of their conventional counterparts as references. As indicated by the figures, for all the three SVC topologies, their CLFs become closer to their counterparts as  $K_s$  increases from  $0 \to 1$ . If  $K_s = 1$ , the SVC becomes a standalone dc-dc regulator, processing full power. For the buck SVC, its inductor CLF is always the same as the conventional buck converter, but its switch CLF might be higher when  $M_v$  is low. For the boost SVC, both its transistor and inductor CLFs are lower than the conventional boost converter in the full regulation range, implying that its component stress is always less than its conventional counterpart. As for the buck-boost SVC, its inductor CLF is always less than the conventional buck-boost converter, but its transistor CLF might be higher if  $M_v$  is low, similar to the buck SVC. Regulation ratio  $(M_v)$  at the crossing point when transistor CLF of the buck SVC or buck-boost SVC is equal to its conventional counterpart is plotted in Fig. 4.33. To maintain the advantage in terms of component stress, a buck SVC or buck-boost SVC is suggested to operate in the condition when  $M_v$  is larger than the crossing point so that the SVC

has lower transistor CLF. One can adjust the number of series domains in the DPP, or change the configuration of SVC to achieve this goal.
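One way to locate such a crossing point numerically is sketched below in Python for the buck SVC, using the transistor CLF and duty ratio expressions of Table 4.6. The bisection search, its interval, and the function names are assumptions of this example rather than the procedure used to produce Fig. 4.33. The search presumes a single sign change of the CLF difference inside the interval.

```python
from math import sqrt


def buck_svc_transistor_clf(m_v: float, k_s: float) -> float:
    d = m_v * k_s / (m_v * k_s + 1.0 - m_v)                                # Eq. (4.56)
    return (sqrt(d) + sqrt(1.0 - d)) * k_s / ((d - d * d) * k_s + d * d)  # Eq. (4.57)


def buck_transistor_clf(m_v: float) -> float:
    d = m_v  # conventional buck row of Table 4.6
    return (sqrt(d) + sqrt(1.0 - d)) / d


def crossing_point(k_s: float, lo: float = 0.05, hi: float = 0.99, tol: float = 1e-9) -> float:
    """Bisection on g(M_v) = CLF_SVC - CLF_buck over [lo, hi]."""
    def g(m_v: float) -> float:
        return buck_svc_transistor_clf(m_v, k_s) - buck_transistor_clf(m_v)

    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change in the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid  # the sign change lies in the lower half
        else:
            lo = mid  # the sign change lies in the upper half
    return 0.5 * (lo + hi)


for k_s in (0.2, 0.5, 0.8):
    print(f"K_s = {k_s}: buck SVC crossing point M_v = {crossing_point(k_s):.3f}")
```

Above the returned $M_v$, the buck SVC has the lower transistor CLF; below it, the conventional buck converter does.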

## 4.6 Experimental Results

To validate the MAC-DPP architecture and theoretical analysis, a 450-W/50-to-5-V 10-port MAC-DPP prototype is designed and built. First, the MAC-DPP prototype is tested on a 50-HDD storage server. The system efficiency and thermal performance as well as the control stability and transient speed have been tested when the HDD server performs daily tasks. The MAC-DPP prototype achieves over 99.7% system efficiency with 700 W/in³ power density, realizing the first complete demonstration of a DPP-powered data storage server with full reading, writing, and hot-swapping capabilities. The MAC-DPP prototype is then applied to a 600-LED screen. Various random load tasks (independent or dependent) have been created and assigned to the LED screen, and the tested results match well with the loss scaling trend predicted by the stochastic loss model. To verify the principles of SVC, a buck SVC is designed and applied to the MAC-DPP prototype. The buck SVC can efficiently convert an input voltage ranging from 50 V to 65 V into a regulated 50 V for the DPP system. The size of the SVC is only 20% of the MAC-DPP converter, and the peak efficiency of the SVC-DPP system achieves 98.8%.

### 4.6.1 A 450-W/50-to-5-V 10-Port MAC-DPP Prototype

This subsection introduces the design of the DPP power stage. Fig. 4.34a shows the circuit topology of the 10-port MAC-DPP prototype. The dc-ac units are implemented as half-bridge circuits with dc blocking capacitors, and all ports are ac-coupled to a 10-winding transformer. The port-to-port operation of this converter is the same as that of a DAB converter with a 1:1 conversion ratio. It offers the lowest power



Figure 4.34: (a) Topology of a 10-port MAC-DPP converter with dc-ac units implemented as half-bridge circuits. (b) Modular isolated PWM driving circuit (in red) and voltage sampling circuit (in blue) at each port. (c) Annotated top view, side view, and 3D assembly view of the 10-port MAC-DPP prototype. The prototype is 40 mm×35 mm in area and 7.56 mm in height. (d) Winding patterns on main power board (4 layers) and bottom cover (6 layers).

conversion stress, and can realize soft switching across the full operation range [164]. The 50 V dc bus is split into 10 series-stacked 5 V voltage domains, which support dc loads like HDDs and LEDs. The distributed phase shift (DPS) control units are implemented as standalone phase-shift modules synchronized by a system clock. The

Table 4.7: Bill-of-Material of the MAC-DPP Converter

| Device & Symbol                                                | Component Description                                                                  |
|----------------------------------------------------------------|----------------------------------------------------------------------------------------|
| Half-Bridge Switch, $S_1 \sim S_{10}$                          | DrMOS, CSD95377Q4M                                                                     |
| Blocking Capacitor, $C_{B1} \sim C_{B10}$                      | Murata X5R, 100 $\mu$ F × 3                                                            |
| Series Inductor, $L_{s1} \sim L_{s10}$                         | Coilcraft SLC7649, 100 nH                                                              |
| Port Voltage, $V_1 \sim V_{10}$                                | 5 V                                                                                    |
| Switching Frequency, $f_{sw}$                                  | 100 kHz                                                                                |
| Transformer Core                                               | Ferroxcube, ELP18-3C95                                                                 |
| Main Power Board Winding                                       | 2 oz, single turn $\times$ 4                                                           |
| Bottom Cover Winding                                           | 2 oz, single turn $\times$ 6                                                           |

voltage sampling circuits and isolated PWM signal circuits are designed as scalable modules as depicted in Fig. 4.34b. In each driving and sampling module, a bootstrapping circuit (annotated in red) is utilized to create a dc bias voltage on the capacitor and generate an isolated PWM signal referred to the floating negative node (V-). The voltage sampling circuit (in blue) uses a resistive divider to scale down the positive node voltage (V+) and sends it back to the controller. The driving and sampling circuit together with the distributed phase-shift module can be further integrated into the half-bridge power stage, enabling fully integrated modular building blocks for the MAC-DPP architecture. Detailed prototype parameters are listed in Table 4.7.

Fig. 4.34c shows top, side, and 3D assembly views of the MAC-DPP prototype. To create symmetric winding paths, the 10-winding transformer is placed in the middle, surrounded by the 10 ports. The driving and sampling circuits and the power stage are all included. The prototype is 40 mm×35 mm in area, 7.56 mm in height, and the total volume is only 10.58 cm<sup>3</sup> (0.64 in<sup>3</sup>). Two PCBs are stacked and integrated with an ELP18/10 magnetic core, whose effective core area is 39.5 mm<sup>2</sup>. To avoid saturation, the core area is selected as twice the minimum core area calculated from Eq. (4.4). This area is comparable to that of a two-winding transformer with the same volt-seconds-per-turn. Since the additional window area is negligible, the MAC-DPP prototype reduces the magnetic volume by 10 times compared to a



Figure 4.35: The 450 W 10-port MAC-DPP prototype and a U.S. quarter. The peak system efficiency is >99%, and the peak converter efficiency is >96%.

10-port dc-coupled DPP converter. Fig. 4.34d shows the PCB patterns of the ten windings. Each winding consists of one single turn in one PCB layer. The main power board comprises four windings, while the bottom cover comprises six windings, which are connected vertically to the main power board through vias. The copper thickness of the PCB is 2 oz. Since all windings are single-turn PCB windings, and the core has high permeability, the magnetic field distribution within the core can be approximated as 1D. Many models can capture the high-frequency skin and proximity effects in 1D planar magnetics and provide guidance for the geometry design. For example, reference [179] presents a systematic approach to modeling the impedance and current distribution in multi-winding planar magnetics, which can be used as a guideline to design the windings in the multi-winding transformer.

Fig. 4.35 shows the MAC-DPP prototype in comparison with a U.S. quarter. The MAC-DPP prototype is a 10-port dc-dc converter, and all ten ports are bidirectional ports. Fig. 4.36a shows the measured efficiency of the converter under a variety of different power delivery scenarios. Each port is connected to a 5 V dc source/load and switching at 100 kHz. A few ports are connected in parallel as input ports, and a few other ports are in parallel as output ports. The entire MAC-DPP converter functions



Figure 4.36: (a) Port-to-port power converter efficiency in different cases. When delivering 40 W from 9 ports to 1 port, the hot-spot temperature of the output port reached 114 °C under 110 CFM airflow. (b) System power conversion efficiency (total load power: 450 W).

equivalently as a one-to-one converter. When delivering power from 9 ports to 1 port, current concentrates at one port. Since conduction loss increases quadratically with current, the 9-port-to-1-port scenario dissipates a large loss at one port, yielding the lowest efficiency. The 5-port-to-5-port case has the highest efficiency because the power conversion stress is well distributed. The peak port-to-port conversion efficiency is 96.5% when delivering power from 5 ports to 5 ports. The peak efficiency in the worst power delivery scenario (9-port-to-1-port) is still maintained above 95%. Limited by the concentrated heat at one port, the MAC-DPP prototype can deliver a maximum of 40 W from 9 ports to 1 port, at which point the hot-spot temperature of the output port reaches 114 °C under 110 CFM airflow. Two key figures of merit are defined to evaluate the DPP performance:

• System Power Rating: The MAC-DPP converter is designed for a DPP system with 10 series-stacked voltage domains. The system power rating is defined as the maximum overall load power that the DPP system can support for the desired application, which is different from the actual power processed by the power converter. In a DPP system, the load power,  $P_i$  at each voltage domain changes

between  $[0, P_{max}]$ . The differential power that the MAC-DPP converter needs to process in the  $i^{th}$  domain is:

$$\Delta P_i = \left| P_i - \frac{\sum_{i=1}^{10} P_i}{10} \right|. \tag{4.59}$$

The maximum differential power at one port is reached if nine voltage domains have no load while the remaining one operates at full load ( $P_{max}$ ) or if one voltage domain has no load and the other nine are operating at full load. In this case, the maximum differential power that the MAC-DPP converter needs to deliver from 9 ports to 1 port is  $\frac{9}{10}P_{max}$ , which is 40 W according to Fig. 4.36a. As a result, the maximum power of each voltage domain,  $P_{max}$ , is approximately 45 W, and the maximum load power that the 10-port MAC-DPP converter can support is 450 W. The power density of the MAC-DPP converter is 700 W/in<sup>3</sup>.

• System Efficiency: The system efficiency of the MAC-DPP system is defined as the overall load power of all voltage domains divided by the input power from the dc bus:

$$\eta_{sys} = \frac{\sum_{i=1}^{10} P_i}{P_{input}} = 1 - \frac{P_{loss}}{P_{input}}. \tag{4.60}$$

 $P_{loss}$  is the power loss resulting from differential power processing. In a DPP system, the processed differential power is a small portion of the total load power, so only a small amount of power loss is generated and the system efficiency of a DPP converter can be much higher than the converter efficiency. Define the ratio between the total processed differential power and the total load power as:  $r = \sum_{i=1}^{10} \Delta P_i / \sum_{i=1}^{10} P_i$ . The generated power loss of the MAC-DPP converter can be calculated as:

$$P_{loss} = r \cdot \sum_{i=1}^{10} P_i \cdot (1 - \eta_{con}), \tag{4.61}$$

where  $\eta_{con}$  is the converter efficiency of the MAC-DPP prototype. Based on the converter efficiency in Fig. 4.36a and Eqs. (4.60)-(4.61), the system efficiency at 450 W total load power is estimated in Fig. 4.36b.

A well-designed DPP system usually has uniformly allocated load power across voltage domains. Therefore, as shown in Fig. 4.36b, the MAC-DPP prototype can maintain over 99% system efficiency in a 450 W DPP system if the differential power ratio is below 13.5%, which should cover most operating conditions. Compared to conventional 50-to-5-V dc-dc converters, the proposed MAC-DPP converter can achieve extremely high system efficiency with a very small converter size.
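The two figures of merit defined above can be reproduced with a short calculation. The Python sketch below implements Eqs. (4.59)-(4.61), taking the input power as the total load power plus $P_{loss}$, and recomputes the power density from the prototype dimensions. The load profile and the converter efficiency of 0.95 are hypothetical values chosen for illustration, not measured data, and the function names are assumptions of this example.

```python
MM3_PER_IN3 = 25.4 ** 3  # cubic millimetres in one cubic inch


def differential_powers(loads_w):
    """Eq. (4.59): |P_i - average load| for every voltage domain."""
    average = sum(loads_w) / len(loads_w)
    return [abs(p - average) for p in loads_w]


def system_efficiency(loads_w, eta_con):
    """Eqs. (4.60)-(4.61), taking P_input = total load power + P_loss."""
    total = sum(loads_w)
    r = sum(differential_powers(loads_w)) / total  # differential power ratio
    p_loss = r * total * (1.0 - eta_con)
    return total / (total + p_loss)


def power_density_w_per_in3(power_w, length_mm, width_mm, height_mm):
    """Power density from the box volume of the prototype."""
    return power_w / (length_mm * width_mm * height_mm / MM3_PER_IN3)


# Worst case for one port: a single fully loaded domain and nine unloaded ones.
p_max = 40.0 / 0.9  # (9/10) * P_max = 40 W, the limit read from Fig. 4.36a
worst_case = [p_max] + [0.0] * 9
print(f"P_max = {p_max:.1f} W, "
      f"largest port differential power = {max(differential_powers(worst_case)):.1f} W")

# Hypothetical, mildly unbalanced load profile and converter efficiency.
loads = [30.0, 32.0, 34.0, 35.0, 35.0, 35.0, 35.0, 36.0, 38.0, 40.0]
print(f"system efficiency = {system_efficiency(loads, eta_con=0.95):.4f}")
print(f"power density = {power_density_w_per_in3(450.0, 40.0, 35.0, 7.56):.0f} W/in^3")
```

With perfectly balanced loads, every $\Delta P_i$ is zero and the system efficiency from these equations equals one regardless of $\eta_{con}$, which is why the system efficiency can far exceed the converter efficiency.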

The power loss of the MAC-DPP prototype mainly consists of core loss, conduction loss, and switching loss. Fig. 4.37 presents a loss analysis for the MAC-DPP converter under different operating conditions. The core loss is calculated using the Steinmetz equation with coefficients fitted from the Ferroxcube 3C95 datasheet. The root-mean-square (RMS) current of each conduction path is calculated based on the output load current and the phase-shift between input and output.

Based on Eq. (4.25), when outputting the same amount of power, the phase-shift of the DAB converter increases as the switching frequency increases, leading to higher RMS current and higher conduction loss as shown in Fig. 4.37a. When operating at 200 kHz, the maximum output power of the MAC-DPP converter is determined by the phase-shift: it delivers 26.3 W from 9 ports to 1 port at 90° phase-shift. When the switching frequency is 150 kHz, 100 kHz, and 50 kHz, the maximum powers that the MAC-DPP converter can deliver are 34 W, 40 W, and 44.5 W, respectively, limited by the maximum allowable component temperature (assuming the temperature limit is reached when the conduction loss reaches the same value as that of the experiment with 114 °C temperature in Fig. 4.36a).

Fig. 4.37b shows the estimated core loss and switching loss as a function of the switching frequency. Fig. 4.37c shows the estimated full system loss at different



Figure 4.37: (a) Estimated conduction loss when delivering power from 9 ports to 1 port at different switching frequencies. (b) Estimated core loss and switching loss as a function of the switching frequency from 50 kHz to 200 kHz. Gate drive loss is not included. (c) Estimated total power loss of the MAC-DPP prototype when delivering power from 9 ports to 1 port at different frequencies. The total power loss includes conduction loss, core loss and switching loss.

frequencies. The core loss and switching loss dominate the system loss at light load. The conduction loss dominates the system loss at heavy load.

### 4.6.2 HDD Server Testbench

This subsection presents the details of a MAC-DPP supported data storage server, including the power and communication infrastructure as well as the software configuration of the testbench. A Backblaze 4U 45 Drive Storage Pod is selected as the base model for the server. Fig. 4.38a shows an annotated photograph of the



Figure 4.38: Pictures of the Backblaze server (a) with the original ac-dc power supply; (b) after replacing the power supply with MAC-DPP converter. Both the power and the communication circuitry are reconfigured. The server comprises an Intel i3-2100 3.10 GHz CPU, a Supermicro MBD-X9SCM-F motherboard, and 8 GB RAMs.

Backblaze server with an original ac-dc power supply, and Fig. 4.38b shows the same Backblaze server after modification, where it is now powered by the 10-port 450 W MAC-DPP power converter. The original server contained forty-five 2.5-inch 320 GB HDDs (TOSHIBA MQ01ABD032V). After modification, its original power supply was replaced with the MAC-DPP converter, and the 45 HDDs were extended to 50 HDDs. Both the power and the communication configuration of the SATA-to-PCIe extension card were modified to enable data transfer across different voltage domains.

Fig. 4.39 and Fig. 4.40 show the detailed implementation of the high-speed data link infrastructure across series-stacked voltage domains. The data link infrastructure comprises three layers. The 50 HDDs are divided into 10 groups, and each group contains five 2.5-inch HDDs in parallel on a SATA III port multiplier, namely backplane board. Ten backplanes in different voltage domains transfer data to the SATA-to-PCIe extension card through isolated differential signals with dc blocking capacitors. Indeed, the SATA/SAS protocol signal is differential. By simply removing the common ground wires and adding blocking capacitors to the SATA/SAS differential signal links, the isolated signal transfer across voltage domains is achieved without major modification to standard communication protocols and existing wiring configuration,



Figure 4.39: Data link infrastructure of the series-stacked HDD server testbench: (a) Three-layer data link block diagram. (b) Component connection diagram.

as shown in Fig. 4.40. At Layer 2, a group of SATA-to-PCIe extension cards are placed on the same voltage domain. They are directly connected to the motherboard through PCIe slots. The 3-layer data link infrastructure is scalable to large-scale data storage systems with numerous stacked voltage domains.

Figure 4.40: Isolated SATA wiring pattern of the modified Backblaze storage server. The three ground wires are removed, and the four differential signals are capacitively isolated. Note the SATA extension cards selected in this prototype have internal isolation capacitors. No external capacitors are needed.

Figure 4.41: Experimental setup for the HDD read/write speed comparison between isolated SATA and standard SATA communication. Ten 2.5-inch HDDs are in series to a 50 V dc bus. The same HDD was swapped from the first voltage domain (isolated SATA) to the last domain (standard SATA) to test the read/write speed in sequential and 4KB random mode. The speed was tested using the disk drive benchmark tool, CrystalDiskMark V6.0.

Fig. 4.41 demonstrates the experimental setup for the HDD read/write speed test of the isolated SATA communication based on a disk drive benchmark tool, CrystalDiskMark V6.0. Ten 2.5-inch HDDs are connected in series to a 50 V dc bus. In this experiment, one HDD was swapped from an isolated voltage domain to a ground-referenced voltage domain, and the reading and writing speeds were compared. As listed in Table 4.8, both the sequential read/write speed and the 4KB random read/write speed are nearly the same for the two SATA connections. The results indicate

Table 4.8: Read/Write Speed Comparison of Isolated SATA and Standard SATA Link

|          | Reading (MB/s) |            | Writing (MB/s) |            |
|----------|----------------|------------|----------------|------------|
|          | Sequential     | 4KB Random | Sequential     | 4KB Random |
| Isolated | 104.0          | 1.037      | 104.1          | 1.036      |
| Standard | 104.3          | 0.987      | 104.1          | 1.055      |



Figure 4.42: (a) Side view and (b) top view of the HDD server testbench supported by the MAC-DPP converter.

that the bottleneck of SATA transmission speed is the read/write speed of mechanical HDDs, and is independent of whether the SATA connection is grounded or not. In applications where a high data rate is needed, the isolated SATA transmission can also be replaced with optical fibers, which are isolated by nature and can offer higher communication bandwidth.

Fig. 4.42 shows the 50-HDD storage server testbench with a LabVIEW measurement system. A Linux-based OS (Ubuntu) is installed to manage the reading, writing, and hot-swapping functions. A dc source (QPX-600D) is utilized for the 50 V dc bus. The LabVIEW system was set up to monitor the power consumption of the HDD server testbench. The monitoring system utilizes an NI-compactDAQ (cDAQ-9178) together with extendable analog input modules (NI9221 and NI9227) to simultaneously sample the voltages and currents of all the 10 voltage domains as well as the input voltage and current of the dc bus. The sampling rate of each voltage or current sampling channel is 1600 Samples/s (the sampling period is 625  $\mu$ s), and the sampled voltage and current were calibrated by a Keysight Digital Multimeter (34401A). In the LabVIEW console shown in Fig. 4.43, the voltage and current of the ten voltage domains are monitored in real time, including the voltage ripple, load power, and differential power of each voltage domain as well as the system efficiency.

Figure 4.43: LabVIEW real-time monitoring system. It measures and records the voltage and current waveforms of all ten series-stacked domains, and calculates the system efficiency in real time. In this example, the input power is 93.31 W, the load power is 92.99 W, and the instantaneous system efficiency is 99.79%.

An HDD usually has two operating states: (a) reading or writing, in which each HDD used in this hardware setup consumes about 2.8 W to drive the motor; and (b) idling, in which each HDD in the hardware setup consumes about 0.7 W to remain active. In data centers, the reading/writing operation of each HDD is commanded by external software requests. To validate the MAC-DPP architecture on the HDD server with typical data center tasks, a random reading/writing program was created, in which each HDD has a 20% probability of performing reading/writing tasks and an 80% probability of idling at any time instant. Fig. 4.44 shows the measured voltage and current waveforms of the



Figure 4.44: Experiment waveforms of all voltage domains at random reading/writing test measured by LabVIEW: (a) voltage waveforms; (b) current waveforms.

Table 4.9: Long-Term Random Read/Write Testing Results

| Elapsed Time | Input Energy | Load Energy | System Efficiency |
|--------------|--------------|-------------|-------------------|
| 60 min       | 333.801 kJ   | 333.031 kJ  | 99.77 %           |

ten voltage domains under the random reading/writing test. The average power of each voltage domain is about 9 W, consisting of the random HDD load power and the power consumption of the backplane board. Due to the random reading/writing tasks, the load currents fluctuated continuously, but the voltages of all the domains remained stable at 5 V. The random reading/writing task was run for one hour, during which the accumulated input and load energy was recorded, as listed in Table 4.9. The total input energy from the dc bus was 333.801 kJ, while the total load energy (including the energy consumption of HDDs and backplanes) was 333.031 kJ, so the average system efficiency was as high as 99.77%. The testing results show that the MAC-DPP converter can feed power to the ten voltage domains with extremely high system efficiency.
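The random load profile and the energy-based efficiency figure can be illustrated with a small simulation. The Python sketch below uses the per-HDD power levels and the 20%/80% task probabilities stated for this test, with five HDDs per voltage domain. The random seed, sample count, and function names are assumptions of this example, and the backplane power is deliberately left out because it is not quantified in the text.

```python
import random

P_ACTIVE_W = 2.8      # reading or writing (from the text)
P_IDLE_W = 0.7        # idling (from the text)
P_BUSY = 0.2          # probability of reading/writing at any instant
HDDS_PER_DOMAIN = 5   # five HDDs share one backplane


def expected_hdd_power_per_domain() -> float:
    """Mean HDD power of one voltage domain under the random task model."""
    return HDDS_PER_DOMAIN * (P_BUSY * P_ACTIVE_W + (1.0 - P_BUSY) * P_IDLE_W)


def sample_domain_power(rng: random.Random) -> float:
    """One random snapshot of the HDD power drawn in a single voltage domain."""
    return sum(P_ACTIVE_W if rng.random() < P_BUSY else P_IDLE_W
               for _ in range(HDDS_PER_DOMAIN))


def energy_efficiency(load_energy_kj: float, input_energy_kj: float) -> float:
    """Average system efficiency from accumulated energies (Table 4.9)."""
    return load_energy_kj / input_energy_kj


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_domain_power(rng) for _ in range(20000)]
print(f"expected HDD power per domain : {expected_hdd_power_per_domain():.2f} W")
print(f"simulated HDD power per domain: {sum(samples) / len(samples):.2f} W")
print(f"one-hour system efficiency    : {100 * energy_efficiency(333.031, 333.801):.2f} %")
```

The expected HDD power per domain is 5.6 W under this model, so the remainder of the roughly 9 W average per domain reported above would be drawn by the backplane board.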

Maintaining a dc voltage within a narrow ripple range is of great importance for the robust operation of HDDs. A typical requirement for 2.5-inch HDDs is to regulate the voltage within 5% of the nominal value (250 mV out of 5 V). In data centers, to avoid interrupting the normal operation, HDDs are usually removed or replaced while the server systems are still running (i.e., hot swapping). Hot swapping induces large load current transient, bringing challenges to voltage regulation. In the random reading/writing experiment, a worst-case hot-swapping test was performed, where an entire voltage domain (five HDDs and one backplane) was abruptly pulled out and plugged in. In this scenario, the differential power change at one port reaches the maximum, resulting in the largest voltage fluctuation during the transient. Distributed phase shift control regulates the voltage of the ten voltage domains. Fig. 4.45a shows the measured port voltage and load current waveforms at the  $5^{th}$  and  $6^{th}$  voltage domain during the hot-swapping test. A 2.2 mF electrolytic capacitor was included at each port, and the  $5^{th}$  domain was hot-swapped while the HDDs in other voltage domains were kept performing the random reading/writing task. During the hot-swapping, the voltage transition was very smooth. The fluctuation is almost negligible. Fig. 4.45a also shows that the current variation during swapping in is higher than that of swapping out, because of the current overshoot caused by the motor spinning up when swapping in. A soft starting circuit can also be implemented to meet higher requirements on HDD voltage ripple.

Since the MAC-DPP prototype is designed to support 45 W peak power at each voltage domain, the transient response of the prototype was also tested in an extreme case with 25 W load step change (i.e., 56% of full load step change) in one voltage domain, as shown in Fig. 4.45b. In the test, each series-stacked voltage domain was connected to an electronic load. All the load currents were kept at 1 A except for the current at port #6, which was stepped up from 1 A to 6 A and then returned back to 1 A. The MAC-DPP converter can successfully limit the overshoot of the



Figure 4.45: (a) Transient response when hot-swapping an entire voltage domain (removing 5 HDDs from port #5) of the HDD server testbench. Voltage measurements are ac-coupled, and current measurements are dc-coupled. (b) Transient response of a 25 W step load change at port #6. The settling time is 0.5 ms, and the voltage overshoot is less than 250 mV. Voltage measurements are ac-coupled, and current measurements are dc-coupled.

"hot-swapping" port voltage to 250 mV with only 0.5 ms settling time, fulfilling the 5% voltage ripple requirements. Fig. 4.45b also indicates that the load step change in one port induces voltage fluctuation on other ports (e.g.,  $V_5$ ), but they can also be effectively controlled by the DPS control strategy. These hot-swapping experiments verified that the designed MAC-DPP prototype is capable of maintaining a smooth operation of the HDD server against the worst-case hot-swapping scenarios.

Benefiting from the control strategy to support hot-swapping, the DPP system is robust against device failure. By connecting a protection device in series with the loads in each voltage domain which fails as open (e.g., a fuse or a current limiting device), the challenge of managing a failure condition is translated into managing a hot-swapping transient - the voltage domain which has a fault condition is removed from the series stack and the power is instantly redistributed.

Hot swapping leads to unbalanced load power, yielding reduced system efficiency. As more voltage domains are swapped out, the power mismatch between different voltage domains usually increases. Fig. 4.46 shows the measured system efficiency



Figure 4.46: Measured system efficiency when different numbers of voltage domains are swapped out. The average overall load power is annotated beside each data point. The system efficiency drops as more HDDs are removed.



Figure 4.47: Thermal images of the MAC-DPP prototype with (a) balanced load and (b) an entire voltage domain hot-swapped. The thermal images were taken at 25°C ambient temperature after the testbench had run for 10 min without forced air flow.

in the random reading/writing test when different numbers of voltage domains were swapped out. The overall load power decreased as more voltage domains were removed, and the system efficiency also dropped. In the worst case where nine voltage domains were out, the system efficiency dropped to 94.7%. Under this circumstance, power was delivered to the load bypassing nine voltage domains. The lowest efficiency, 94.7%, is still comparable to that of the state-of-the-art 10:1 dc-dc converters. A DPP solution can offer much higher efficiency than dc-dc converters in most cases.

Fig. 4.47 shows the thermal images of the MAC-DPP converter operating in different load conditions. Both thermal images were taken after the testbench had been running for over 10 minutes. The experiment was performed at 25°C ambient temperature with no forced airflow. At the beginning, when all HDDs were doing the same random reading/writing tasks, the load power was well balanced, with only a small amount of differential power to be processed by the MAC-DPP converter. The temperature distribution on the MAC-DPP converter was uniform, and hardly any hot spot could be observed. The transformer was the hottest component due to core loss. When all five HDDs of an entire voltage domain were removed, the hot-swapping port delivered about 9 W of differential power to the other 9 ports. Since the current at the hot-swapping port was roughly the sum of the currents of all other 9 ports, its loss was much higher than that of the others. A significant temperature rise was observed at the hot-swapping port (port #8 in this case), as shown in Fig. 4.47b. Even in this worst case, the temperature of the MAC-DPP converter stayed below 40 °C without forced air cooling.

Fig. 4.48 compares the system efficiency and power density of the MAC-DPP prototype with many state-of-the-art commercial 48V-to-5V dc-dc converters. Benefiting from the DPP architecture and the single "dc-ac-dc" power delivery path, the MAC-DPP prototype can support a 450 W HDD server with about 1 W of loss (99.77% system efficiency), reducing the power loss by 10x compared to most of the commercial products. By employing the MAC-DPP topology, the prototype has a smaller overall magnetic volume and lower component count compared to many other DPP topologies. The MAC-DPP converter is miniaturized with a power density above 700 W/in³, which is higher than most commercial products. The voltage sampling circuit and isolated driving signal circuit are all included in the MAC-DPP prototype and are considered in volume calculation. The microcontroller (TI F28379D) is off-board and is not included in the power density calculation.

In data centers, hardware infrastructure and software algorithms affect the power consumption, and thus influence the performance of power



Figure 4.48: Comparison of the 10-port MAC-DPP prototype with many state-of-the-art commercial 48V-5V dc-dc converters. The MAC-DPP converter achieves over 10x power loss reduction compared with most industry products, with top-ranking power density. This comparison is based on the DPP system efficiency. The port-to-port converter efficiency is shown in Fig. 4.36a. The size of the microcontroller is not included in the volume calculation.

converters. There are opportunities to investigate software, hardware, and power codesign of large-scale computing systems in data centers, such as CPU/GPU clusters, memory banks, and HDD arrays. RAID (Redundant Array of Independent Disks) is a popular data storage architecture adopted in commercial cloud storage HDD arrays [180]. It combines multiple HDDs into one or more logical units in order to improve storage reliability or storage speed. Fig. 4.49a demonstrates two typical RAID configurations: (a) RAID 0, where the data is divided into multiple parts (namely striped) and written into multiple disks in parallel; there is no redundancy of data, but the storage speed is improved. (b) RAID 1, where the data is duplicated and stored in multiple disks (namely mirror); the storage speed is the same as for a single disk, but the storage reliability is improved due to the data redundancy. Other



Figure 4.49: (a) Two different RAID levels: RAID 0 (striped volume) and RAID 1 (mirrored volume) [180]. (b) Implementation of different RAID levels on the  $10 \times 5$  HDD array. HDDs can be vertically or horizontally grouped into RAID systems.

RAID levels like RAID 5 (striped with parity check), RAID 10 (striped and mirrored), etc., are extensions of these two RAID levels.

The MAC-DPP system was tested together with different storage architectures. RAID 0 and RAID 1 levels were applied, and a 10 GB file chunk was used as the test sample. Fig. 4.49b shows the implementation of four different RAID configurations (RAID 0 and RAID 1, each grouped vertically or horizontally) on the $10 \times 5$ HDD array. The following five modes were tested:

- 1. **Vertical RAID 0:** The 10 GB file chunk was striped across 10 HDDs in 10 voltage domains. A 1 GB file chunk was written to each HDD.
- 2. **Horizontal RAID 0:** The 10 GB file chunk was striped across 5 HDDs within one voltage domain. A 2 GB file chunk was written to each HDD.
- 3. **Vertical RAID 1:** The 10 GB file chunk was mirrored onto 2 HDDs in two different voltage domains. The full 10 GB file chunk was written to each HDD.
- 4. **Horizontal RAID 1:** The 10 GB file chunk was mirrored onto 2 HDDs within one voltage domain. The full 10 GB file chunk was written to each HDD.
- 5. **Direct Storage:** The 10 GB file chunk was directly written to one HDD.

A systematic performance analysis of the HDD server was performed. The time consumption, system efficiency, and energy consumption of the HDD array when writing the 10 GB file sample under different storage strategies were measured in LabVIEW, and the experimental results are shown in Fig. 4.50. As indicated by the results, RAID 0 offers faster transmission speed due to parallel storage. Although RAID 1 requires higher HDD energy consumption, it provides higher storage redundancy. Fig. 4.50b shows that vertical RAID 0 has the highest system efficiency, and horizontal RAID 0 is the least efficient. This is because the load distribution of vertical RAID 0 is the most balanced across different voltage domains, whereas horizontal RAID 0 has the most unbalanced load distribution. The difference in system efficiency between HDD storage architectures will be more distinct in larger HDD arrays with more HDDs involved in the storage tasks. Due to the limited bandwidth, the speed advantage of parallel storage was not completely exploited. Because of these non-ideal factors in the test, a more rigorous study is needed to fully reveal the advantages and disadvantages of grouping HDDs in different ways. However, the results still clearly show that vertical RAID modes have higher system efficiency and lower energy consumption than their horizontal counterparts, due to more balanced power distribution among different voltage domains. This suggests that storage algorithms and storage architectures in data centers can be optimized to allocate storage tasks more evenly across different voltage domains, creating a more balanced load power and thus improving the overall performance of the system.
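The balance argument above can be illustrated with a minimal sketch. It assumes that every HDD taking part in the write draws the same extra power and that all other HDDs contribute equally in every domain; the imbalance metric (sum of squared deviations from the stack average) is a hypothetical choice made only for this illustration, not a quantity used in the measurements.

```python
# Number of HDDs written to in each of the 10 voltage domains, per storage mode.
# Assumption: each active HDD adds the same load power to its domain.
MODES = {
    "vertical RAID 0": [1] * 10,
    "horizontal RAID 0": [5] + [0] * 9,
    "vertical RAID 1": [1, 1] + [0] * 8,
    "horizontal RAID 1": [2] + [0] * 9,
    "direct storage": [1] + [0] * 9,
}


def imbalance(active):
    """Sum of squared deviations from the stack average (hypothetical metric)."""
    mean = sum(active) / len(active)
    return sum((n - mean) ** 2 for n in active)


# Sort the modes from most balanced to least balanced.
ranking = sorted(MODES, key=lambda mode: imbalance(MODES[mode]))
for mode in ranking:
    print(f"{mode:18s} imbalance = {imbalance(MODES[mode]):.2f}")
```

Under these assumptions, vertical RAID 0 is perfectly balanced and horizontal RAID 0 concentrates the most load in a single domain.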



Figure 4.50: Experimental results of writing test under different storage architectures. HDD server performance was analyzed in multiple aspects including: (a) time consumption; (b) system efficiency; (c) energy consumption of the overall system (including working/idling HDDs and backplanes), or just the HDDs accessed by the writing test.

#### 4.6.3 LED Screen Testbench

To validate the stochastic model, a 30×20 LED array was built and tested as a large-scale DPP system with probabilistic load distribution. Random load tasks (independent or correlated) were set up and assigned to the LED array, which is supported by the MAC-DPP converter. Measured average DPP power loss was compared to the expected conduction power loss predicted by the model to validate scaling factors. The analytical framework developed in this chapter is applicable to a range of DPP applications. An extended application study and model verification on a data storage server powered by DPP are provided in Appendix B.3.

Recall that the stochastic model captures conduction losses, expected to dominate scale-dependent DPP system losses. Switching loss, core loss, and control and



Figure 4.51: (a) Experimental test bench with a 30×20 LED array. (b) Power and signal configuration. 600 LEDs are divided into 10 series-stacked voltage domains and supported by the 10-port MAC-DPP converter. Each LED is individually addressable from the MCU controller.

auxiliary losses could be weakly load dependent, so the key validation challenge is to determine whether total losses measured in experiments show the same scaling effects as conduction losses in the model.

Fig. 4.51 shows an overview of the test bench. The 30×20 LEDs were divided into ten voltage domains, connected in series to a 50 V dc bus. Each voltage domain supplied 5 V to 60 LEDs, and the full load power of the 600-LED screen is 108 W. Differential power of the ten domains was processed by the 10-port MAC-DPP prototype [22]. All 60 LEDs in each voltage domain were controlled by a serial signal path connected to a digital pin on the microcontroller (Arduino Mega) through a digital isolator (ADuM1200). Each LED was controlled individually by the microcontroller (MCU). A LabVIEW measurement system (cDAQ-9178 & NI9221 & NI9227) monitored and recorded total input power, load power of each voltage domain, and average power loss of the DPP system, in real time.

Fig. 4.52a shows the MAC-DPP prototype. A ten-winding printed-circuit-board (PCB) transformer in the center is surrounded by ten half-bridge ports. Each port couples one voltage domain to the transformer, and has the same  $R_{out}$  as that of a full-bridge implementation given the same switch die area and magnetic size. The prototype measures 4 cm  $\times$  3.5 cm  $\times$  0.76 cm, switches at 100 kHz, and supports up



Figure 4.52: (a) The 10-port MAC-DPP prototype. (b) Equivalent circuits of the MAC-DPP prototype when delivering power from 5 ports to 5 ports ( $V_{IN} = V_{OUT} = 5$  V). (c) Measured power loss vs. output current square for 5-port-to-5-port power delivery. This measurement is performed on common ground without sampling resistors, etc., so the 485 mW control and auxiliary losses are not captured in static loss.

to 450 W system power with a power density of 700 W/in<sup>3</sup>. More details about the prototype can be found in Section 4.6.1.

 $R_{out}$  of each port was measured with a five-port-to-five-port power delivery test in which five ports are connected in parallel as the input and five other ports are in parallel as the output. Fig. 4.52b depicts the equivalent circuit of this test. In this case, the DPP prototype is equivalent to a dc-dc converter with an output resistance of  $\frac{2}{5}R_{out}$ . The measured power loss versus  $I_{out}^2$  is plotted in Fig. 4.52c. Measured data are fitted with a line. The slope is the output resistance  $\frac{2}{5}R_{out}$  and the intercept comprises switching loss and magnetic core loss. The  $R_{out}$  value is estimated as 0.12  $\Omega$ .
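The extraction of $R_{out}$ from the fitted line can be sketched as follows. The data points are hypothetical: they are synthesized from $R_{out} = 0.12\ \Omega$ and a 182 mW static loss purely to show the procedure, and they are not the measured values in Fig. 4.52c. The helper `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical test points (A), not measured data.
currents = [2.0, 4.0, 6.0, 8.0, 10.0]
i_squared = [i * i for i in currents]
# Synthetic loss (W): slope (2/5) * R_out with R_out = 0.12 ohm, plus 0.182 W static loss.
loss = [0.4 * 0.12 * x + 0.182 for x in i_squared]

slope, intercept = fit_line(i_squared, loss)
r_out = slope * 5 / 2  # the slope equals (2/5) * R_out in the 5-port-to-5-port test
print(f"R_out = {r_out:.3f} ohm, static loss = {intercept * 1e3:.0f} mW")
```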

As shown in Fig. 4.37b, when switching at 100 kHz, the estimated core loss is 156 mW and the switching loss is 26 mW, for a sum of 182 mW. The current meter (NI-9227) was calibrated with an Agilent 34401A digital multimeter. Its tolerance is $\pm 1$ mA on a 5 A scale, translating into 50 mW of power measurement tolerance on the full 50 V stack, or 5 mW for each 5 V port. Control and auxiliary losses (including level shifters, resistive dividers, etc.) were measured with inactive switches, and totalled $485 \pm 50$ mW. Gate drives were powered from a separate source (which also powers the microcontroller and other auxiliary circuits). Thus, the estimated loss above and beyond conduction loss totals $667 \pm 50$ mW (182 mW plus 485 mW). This difference is observed in all measurements. As will be noted below, it is load independent and has minimal impact on scaling. Since this section is not seeking to design an extreme-performance DPP implementation on the LED screen and it is vital to have extensive real-time measurements, control overhead power is not optimized in the design and might be higher than in a commercial implementation.

In the random load experiment, power to each LED is controlled by a random variable  $\xi$  that follows a Bernoulli distribution, Bernoulli(p). Here, p is the probability of turning on the LED. The load power of each LED therefore follows  $P_{ij} = \xi P_{on}$ , where  $P_{on} = 0.18$  W is the power consumption of one LED at full brightness, and the value of  $\xi \in \{0,1\}$  is updated once per second. By changing the turn-on probability p, the number of active loads per voltage domain M, and the vertical and horizontal load correlations, various random load tasks can be set up on the LED screen.
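A minimal simulation of this load model, assuming 60 independent LEDs per voltage domain (no horizontal or vertical correlation) and treating each update as an independent draw, is sketched below. The function name and the fixed seed are illustrative only.

```python
import random

P_ON = 0.18           # W per LED at full brightness (from the text)
LEDS_PER_DOMAIN = 60  # LEDs in one voltage domain


def domain_power(p, rng):
    """One sample of a domain's power with independent Bernoulli(p) LEDs."""
    leds_on = sum(rng.random() < p for _ in range(LEDS_PER_DOMAIN))
    return P_ON * leds_on


rng = random.Random(0)  # fixed seed so the illustration is repeatable
p = 0.5
samples = [domain_power(p, rng) for _ in range(20000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)

# Binomial predictions for independent LEDs.
expected_mean = LEDS_PER_DOMAIN * p * P_ON                # 5.4 W
expected_var = LEDS_PER_DOMAIN * p * (1 - p) * P_ON ** 2  # 0.486 W^2
print(f"mean {mean:.3f} W (expected {expected_mean:.3f} W)")
print(f"variance {var:.3f} W^2 (expected {expected_var:.3f} W^2)")
```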

Fig. 4.53 illustrates the method for comparing measured average power loss to expected power loss from the stochastic model. Fig. 4.53a shows the instantaneous input system power and domain power measured by LabVIEW when performing a particular random load task. Measured average power loss over time is displayed in Fig. 4.53b. For each random load task in the experiments, the full system is operated long enough for measured average power loss to converge (typically 10 min).

The expected power loss is obtained from statistics of the sampled domain power waveforms. As shown in Fig. 4.53c, measured power waveforms of all voltage domains are sampled every second for two minutes and plotted in the vertical correlation matrix. Fig. 4.54 zooms in on three example diagonal and non-diagonal entries in Fig. 4.53c. The diagonal entries (such as Fig. 4.54a and Fig. 4.54b) are histograms of domain powers  $P_1(t)$  through  $P_{10}(t)$ . The variance of each domain power,  $Var[P_k(t)]$  in part ① of (4.18), can be obtained from the histograms. Horizontal correlation within



Figure 4.53: (a) Measured total input power and each domain power in LabVIEW. (b) Measured average power loss of the DPP system in LabVIEW. (c) Vertical correlation matrix based on sampled data (2 min) of each domain power. Diagonal histograms plot the distribution of each domain power. Non-diagonal scatter plots depict power correlation between each pair of domains and the correlation coefficients.

a voltage domain is also reflected in the probability distribution of each diagonal histogram. The non-diagonal scatter plots (such as Fig. 4.54c) describe vertical correlation coefficients between any two domain powers. For scatter plots in Fig. 4.53c, red boxes show positive correlation, blue boxes show negative correlation, and green boxes show weak correlation.  $Cov[P_i(t), P_j(t)]$  in part ② can be obtained from correlation coefficients of non-diagonal scatter plots. The statistical information provided in Fig. 4.53c can be used in the developed stochastic model to predict the expected power loss of a DPP system.
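The statistics read from the correlation matrix can be computed directly from the sampled waveforms. The sketch below uses two short, hypothetical power traces (not measured data) and plain sample estimators; how these quantities enter (4.18) is not reproduced here.

```python
from math import sqrt


def variance(x):
    """Unbiased sample variance, as used for the diagonal entries."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / (len(x) - 1)


def covariance(x, y):
    """Unbiased sample covariance, as used for the non-diagonal entries."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation coefficient between two domain powers."""
    return covariance(x, y) / sqrt(variance(x) * variance(y))


# Hypothetical domain powers (W), one sample per second.
p1 = [5.4, 5.8, 4.9, 6.1, 5.0, 5.6]
p2 = [5.2, 5.9, 5.0, 6.0, 4.8, 5.5]

print(f"Var[P1] = {variance(p1):.3f} W^2")
print(f"Cov[P1, P2] = {covariance(p1, p2):.3f} W^2")
print(f"rho(P1, P2) = {correlation(p1, p2):.3f}")
```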

To validate stochastic model scaling with M and  $\sigma_0^2$ , we perform two experiments as shown in Fig. 4.55. In the M scaling experiment (Fig. 4.55a), sets of 12 LEDs in each voltage domain are bundled as one load and controlled by one random variable. The turn-on probability of each load is fixed at 0.5. By controlling the number of



Figure 4.54: Example zooms from Fig. 4.53c: (a) diagonal histogram of domain #1; (b) diagonal histogram of domain #2; (c) non-diagonal scatter plot of domain #1 power and domain #2 power.



Figure 4.55: Experimental setup to validate the model as: (a) M increases; (b)  $\sigma_0^2$  increases.

active loads (non-active loads are kept off), M can be adjusted from 1 to 5. Fig. 4.56 compares measured average loss and expected loss with and without horizontal correlation as M increases. The figure shows the conduction loss from the model, the model loss plus the estimated 667 mW overhead (shown as calibrated loss), and the total measured loss. The results confirm that average power loss of this ac-coupled DPP circuit scales linearly with M when loads are independent, but scales quadratically with M with worst-case horizontal correlation, as predicted by (4.12) and (4.22). The tracking match is as tight as the power measurement tolerance supports, with error bounds ( $\pm 50$  mW) highlighted.
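The linear and quadratic trends follow from how the variance of a sum behaves. As a short worked step, assume that $\sigma_0^2$ denotes the power variance of a single load, that the $M$ loads in a domain have identical variance, and that every pair of loads has the same correlation coefficient $\rho$:

$$\mathrm{Var}\Big[\sum_{j=1}^{M} P_j\Big] = \sum_{j=1}^{M}\mathrm{Var}[P_j] + \sum_{i \neq j}\mathrm{Cov}[P_i, P_j] = M\sigma_0^2 + M(M-1)\rho\,\sigma_0^2 .$$

For independent loads ($\rho = 0$) the domain power variance is $M\sigma_0^2$, while for fully correlated loads ($\rho = 1$) it is $M^2\sigma_0^2$. In this experiment each load is a bundle of 12 LEDs ($12 \times 0.18$ W $= 2.16$ W) with $p = 0.5$, so $\sigma_0^2 = p(1-p)\,(2.16\ \mathrm{W})^2 \approx 1.17\ \mathrm{W}^2$ under the same assumptions.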

To test  $\sigma_0^2$  scaling, all 60 LEDs in each voltage domain are bundled as one load as shown in Fig. 4.55b, and the load power variance is adjusted by changing the turn-on



Figure 4.56: Comparison between expected power loss and measured average loss as M increases in the case of: (a) independent load; (b) worst-case horizontal load correlation. The calibrated loss is the sum of the modeled loss and the estimated 667 mW overhead.



Figure 4.57: Comparison between expected power loss and measured average loss when  $\sigma_0^2$  increases. The calibrated loss is the sum of modeled loss and the estimated 667 mW overhead.

probability p. Fig. 4.57 compares the measured average loss and expected loss as a function of  $\sigma_0^2$ . The figure shows the conduction loss from the model, the calibrated loss with added 667 mW overhead, and the measured total loss. The average loss of this ac-coupled DPP circuit increases linearly with load variance  $\sigma_0^2$ , consistent with the scaling factor in (4.12). The tracking match is as tight as power measurement tolerance supports.

Fig. 4.58 shows the setup to test horizontal correlation. In the experiment, each LED is controlled individually with p = 0.5. Positive horizontal correlation is created



Figure 4.58: Experimental setup for horizontal correlation. Here, each horizontally correlated group contains two correlated LEDs with  $\rho = 1$ .



Figure 4.59: LED screen pattern, power waveform and the probability histogram of domain #1 when 60 LEDs of each voltage domain are: (a) independent; (b) horizontally grouped with 6 LEDs/group; (c) horizontally grouped with 20 LEDs/group; (d) horizontally grouped with 60 LEDs/group.

by dividing 60 LEDs in a voltage domain equally into horizontally correlated groups in which  $\rho = 1$  for LEDs within a group. Fig. 4.58 shows an example in which each horizontal group contains two LEDs. By increasing the number of LEDs in a horizontal group, a stronger positive horizontal correlation is created.

Figs. 4.59 and 4.60 show experimental results for horizontal correlation. Fig. 4.59 shows four cases of horizontal correlation as the LEDs of each voltage domain shift from independent to fully correlated. The number of correlated LEDs per group increases from zero (i.e., independent) to six LEDs, 20 LEDs, and then 60 LEDs per group. When all LEDs are independent, the domain power consumption has a single smooth peak in the histogram that follows a binomial distribution, and the variance is small. When



Figure 4.60: Comparison between expected power loss and measured average loss as the number of LEDs per horizontal group increases. A larger number of LEDs per group represents a stronger positive horizontal correlation. The calibrated loss is the sum of modeled loss and estimated 667 mW overhead.

LEDs are horizontally correlated and the number of LEDs per correlated group increases, multiple split peaks appear in the histogram, with a higher power variance, as indicated by the power waveforms and probability histograms of domain #1. Fig. 4.60 compares the measured average loss to the expected loss and the calibrated loss with 667 mW overhead as the number of LEDs per horizontal group increases. The tracking match to the model is as tight as the power measurement tolerance supports. Figs. 4.59 and 4.60 confirm that positive horizontal correlation increases power variance, and thus the system needs to process more power and generates more loss. More positive horizontal correlation leads to higher DPP system loss, consistent with conclusions in Section 4.3.5.
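The effect of group size on the domain power distribution can be estimated with a small sketch. It assumes that LEDs within a group switch together ($\rho = 1$), that groups are independent Bernoulli(0.5) variables, and that a group size of 1 stands for the independent case; the function name is illustrative.

```python
P_ON = 0.18   # W per LED (from the text)
N_LEDS = 60   # LEDs per voltage domain


def grouped_domain_stats(group_size, p=0.5):
    """Variance (W^2) and number of distinct power levels of one domain."""
    n_groups = N_LEDS // group_size
    group_power = group_size * P_ON
    # Independent groups: variances add, each group contributes p(1-p) * power^2.
    variance = n_groups * p * (1 - p) * group_power ** 2
    # The domain power can only take the values 0, 1, ..., n_groups group powers.
    levels = n_groups + 1
    return variance, levels


for size in (1, 6, 20, 60):
    var, levels = grouped_domain_stats(size)
    print(f"{size:2d} LEDs/group: variance = {var:6.3f} W^2, {levels} power levels")
```

Under these assumptions the variance grows in proportion to the group size, and the number of possible power levels shrinks from 61 to 2, which matches the split peaks seen in the histograms.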

To test vertical load correlation, sets of 12 LEDs in a voltage domain are bundled as one load and controlled with p=0.5. Each domain contains five loads in total. As shown in Fig. 4.61, vertical correlation is created by grouping one load from each voltage domain, with  $\rho=1$  for loads within a vertical group. Fig. 4.61 demonstrates an example with two vertically correlated groups. By increasing the number of correlated groups, stronger positive vertical correlation can be generated. In this case,



Figure 4.61: Experimental setup for vertical load correlation. In this example, two vertically correlated groups are set up. In each vertically correlated group, $\rho = 1$ for any two loads within the group.

loads in each domain are controlled by five independent random variables, i.e., loads are vertically correlated but horizontally independent. The distribution and variance of each domain power (part ① of (4.18)) remain unchanged. DPP power loss variation in this experiment is only related to vertical load correlation (part ② of (4.18)).

Fig. 4.62 shows experimental results for vertical correlation. Fig. 4.62a plots the power distribution histogram of domain #1 and power correlation between domains #1 and #2. As the number of vertically correlated groups increases, positive load correlation across voltage domains becomes stronger and  $\rho_V$  increases from 0 to 1. During this process, the power distribution histogram of each voltage domain changes little, as expected. The measured average loss, calibrated loss with 667 mW overhead, and expected loss are compared in Fig. 4.62b. The average loss of a fully-coupled DPP system decreases when  $\rho_V$  increases, validating the conclusions in Section 4.3.5. Again, the tracking match is as tight as the power measurement tolerance supports, and error bounds are highlighted.
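The trend can be reproduced qualitatively with a small Monte Carlo sketch. It assumes an idealized lossless stack in which each domain's differential power is its deviation from the stack average, and it reports only the mean-square differential power in units of one load's power squared. This is a proxy for the processed power, not the loss model of this chapter; the function name, trial count, and seed are illustrative.

```python
import random

N_DOMAINS = 10
LOADS_PER_DOMAIN = 5


def mean_sq_differential(n_groups, trials=4000, seed=1):
    """Mean of (n_k - n_avg)^2 over domains, n_k = loads that are on in domain k."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counts = [0] * N_DOMAINS
        for j in range(LOADS_PER_DOMAIN):
            if j < n_groups:
                # Vertically correlated group: one draw shared by all domains.
                states = [rng.random() < 0.5] * N_DOMAINS
            else:
                # Independent loads: one draw per domain.
                states = [rng.random() < 0.5 for _ in range(N_DOMAINS)]
            for k, on in enumerate(states):
                counts[k] += on
        stack_sum = sum(counts)
        # Integer arithmetic: N * n_k - sum is exactly zero when all domains match.
        total += sum((N_DOMAINS * c - stack_sum) ** 2 for c in counts)
    return total / (trials * N_DOMAINS * N_DOMAINS ** 2)


results = [mean_sq_differential(g) for g in range(LOADS_PER_DOMAIN + 1)]
for g, value in enumerate(results):
    print(f"{g} correlated groups: mean-square differential = {value:.3f}")
```

With all five loads vertically correlated, every domain draws the same power and the differential power vanishes in this idealized picture.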



Figure 4.62: (a) Power distribution histogram of domain #1 and power correlation graph between domains #1 and #2 with different numbers of vertically correlated groups. (b) Comparison between expected power loss and measured average loss as the number of vertically correlated groups increases. A larger number of correlated groups represents a stronger positive vertical correlation. The calibrated loss is the sum of modeled loss and estimated 667 mW overhead.

#### 4.6.4 A Buck SVC for the 10-Port MAC-DPP

To validate the SVC concept, a buck SVC is built and tested with the 10-port MAC-DPP converter. The details of the DPP converter are introduced in Section 4.6.1. Here, we focus on the impacts of SVC on the DPP operation. Fig. 4.63 shows the circuit topology of the buck SVC and the 10-port DPP converter. Ten voltage domains are connected in series and fully-coupled by a multi-winding transformer through half-bridge circuits. The DPP converter functions to balance the differential power among series loads, making the dc bus voltage  $(V_{DPP})$  evenly distributed into ten series-stacked voltage domains. The buck SVC is attached to the first voltage domain with the negative terminals of its input and output ports connected to the negative terminal of the first domain and its switch node linked to the DPP dc bus through a filter inductor. By controlling the duty ratio of the buck SVC, the DPP dc bus voltage can be regulated. Besides input voltage regulation and partial power processing, the buck SVC offers the following additional advantages for DPP system operation:



Figure 4.63: Circuit topology of a buck SVC attached to the MAC-DPP converter.

• Soft Start: In a DPP architecture, multiple voltage domains are connected in series to the input side. If the input voltage has a high slew rate at startup, a small power unbalance might cause significant voltage overshoot at some of the series voltage domains, leading to severe damage to the loads in that voltage domain. By adjusting the duty ratio, the buck SVC can control the voltage difference between the input bus and the load, limiting the load voltage slew rate during startup or input transient.

• Fault Protection: In fault conditions, the buck SVC can provide fast protection by disabling the upper arm switch and detaching the DPP loads from the input dc bus as shown in Fig. 4.63.

As indicated in Section 4.6.4, the voltage regulation ratio of the buck SVC needs to satisfy $M_v > 0.5$ to keep the total SVC-incurred power processing lower than the full load power. In addition, to keep the buck SVC component stress lower than that of a standalone buck converter, $M_v$ should be larger than the crossing point in Fig. 4.33, which is $M_v > 0.76$ for $K_s = 0.1$ in this design. Considering both requirements, the buck SVC is designed to operate in the regulation range of $0.76 < M_v < 1$. Based on this regulation range, we design the power ratings for the buck SVC and the DPP converter. Note that the buck SVC may still function when operating outside this range, and the power rating design for other feasible regulation ranges can follow the discussion below.

Power ratings of the buck SVC and the DPP converter should be designed for their maximum processed power in all operating scenarios. Assume load power of each voltage domain  $(P_{load,i})$  is within the range of  $[0, P_{max}]$ . The power processed by the buck SVC is

$$P_{SVC} = \rho_{SVC} \times P_{IN} = (1 - 0.9M_v) \times \sum_{i=1}^{10} P_{load,i}. \tag{4.62}$$

According to (4.62), the buck SVC processed power reaches maximum when all voltage domains consume  $P_{\text{max}}$ , and the voltage regulation ratio  $M_v = 0.76$ . The maximum value is  $3.16P_{\text{max}}$ , so the buck SVC power rating should be larger than 31.6% of the maximum system load power (i.e.,  $10P_{\text{max}}$ ).

In Fig. 4.63, the buck SVC processed power is only delivered to the first voltage domain, so the power rating requirement for the first DPP port is different from that of the other nine ports. In the first voltage domain, the differential power processed



Figure 4.64: Normalized power rating of the buck SVC and the 10-port DPP converter. Power ratings are normalized to the maximum system power.

by the DPP converter is

$$P_{DPP,1} = P_{SVC} - P_{load,1} = \rho_{SVC} \sum_{i=1}^{10} P_{load,i} - P_{load,1}. \tag{4.63}$$

Here,  $\rho_{SVC} \in [0.1, 0.316]$  for  $M_v \in [0.76, 1]$ . Therefore, the maximum differential power processed for the first domain is reached when the first domain consumes zero power and each of the other nine domains consumes  $P_{\text{max}}$  at the regulation ratio of  $M_v = 0.76$  (i.e.,  $\rho_{SVC} = 0.316$ ). The maximum value is  $2.84P_{\text{max}}$ , so the power rating for DPP port #1 should be larger than 28.4% of the maximum system load power.

As for voltage domains  $2\sim10$ , differential power processed by the DPP converter for each voltage domain is

$$P_{DPP,m(m\geq 2)} = \frac{P_{IN} - P_{SVC}}{9} - P_{load,m} = \frac{1 - \rho_{SVC}}{9} \sum_{i=1}^{10} P_{load,i} - P_{load,m}. \tag{4.64}$$

Different from the first voltage domain, the maximum differential power processed by the $m^{th}$ ($m \ge 2$) voltage domain is reached when the $m^{th}$ domain consumes $P_{\text{max}}$ and each of the other nine domains consumes zero power at $M_v = 0.76$. The maximum value is $0.92P_{\text{max}}$, so the power rating for DPP ports $\#2\sim\#10$ should be larger than 9.2% of the maximum system load power. Fig. 4.64 shows the normalized power rating requirements for the buck SVC and the 10-port DPP converter. As shown in the figure, the buck SVC can have a significantly reduced power rating compared to the maximum system power. Fig. 4.64 also indicates that the SVC processed power delivered to the first domain brings additional differential power conversion stress to the DPP converter and increases the power rating requirement for the first DPP port.
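The three worst-case values can be cross-checked numerically from (4.62) to (4.64). The sketch below assumes a lossless system, normalizes $P_{\text{max}}$ to 1, and searches only the corner cases (each load at 0 or $P_{\text{max}}$, and $M_v$ at the ends of the regulation range), which is sufficient because each expression is linear in the loads and in $M_v$. Function and variable names are illustrative.

```python
from itertools import product

N_DOMAINS = 10


def rho_svc(m_v):
    """Fraction of input power processed by the buck SVC, from (4.62)."""
    return 1 - 0.9 * m_v


def worst_case(m_v_values=(0.76, 1.0)):
    """Maximum processed power of the SVC, DPP port #1, and DPP ports #2 to #10."""
    svc = port1 = other = 0.0
    for m_v in m_v_values:
        rho = rho_svc(m_v)
        # Corner load cases: every domain draws either 0 or P_max (normalized to 1).
        for loads in product((0.0, 1.0), repeat=N_DOMAINS):
            total = sum(loads)
            svc = max(svc, rho * total)                                # (4.62)
            port1 = max(port1, abs(rho * total - loads[0]))            # (4.63)
            other = max(other, abs((1 - rho) / 9 * total - loads[1]))  # (4.64), m = 2
    return svc, port1, other


svc, port1, other = worst_case()
for name, value in (("buck SVC", svc), ("DPP port #1", port1), ("DPP ports #2-#10", other)):
    print(f"{name:17s} {value:.3f} P_max = {value / N_DOMAINS:.1%} of max system power")
```

Port #2 stands in for ports #2 to #10 by symmetry. The search returns roughly 3.16, 2.84, and 0.92 times $P_{\text{max}}$, in line with the ratings quoted in the text.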

In the SVC-DPP architecture, SVC output capacitor and DPP system input capacitor decouple the dynamics of the SVC stage and the DPP system, so the buck SVC and the DPP converter can be controlled separately. Existing voltage regulation methods for a buck converter can be easily applied to the buck SVC. Fig. 4.65a shows one way of implementing the closed-loop control for the buck SVC. The DPP string voltage  $(V_{DPP})$  is regulated by controlling the duty ratio (D) of the buck stage. According to Table 4.6, the regulated DPP string voltage can be formulated as a function of  $V_{IN}$  and D:

$$V_{DPP} = \frac{10D}{9D+1} \times V_{IN}. \tag{4.65}$$

Eq. (4.65) indicates that the DPP string voltage monotonically increases as the duty ratio increases. Therefore, a PI feedback loop can be applied to regulate the DPP string voltage. The feedback loop adjusts the duty ratio based on the sampled DPP string voltage as shown in Fig. 4.65a. The 10-port DPP converter works as a ten-active-bridge converter, and the power flows among all the ports are controlled by phase-shift modulation. To balance the voltage of series domains, a distributed phase-shift control strategy is applied, where an individual feedback loop is implemented in each domain to control the phase-shift based on the locally measured voltage [144].
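As a quick numerical illustration of (4.65), the sketch below evaluates the steady-state string voltage and inverts the relation to find the duty ratio that a feedback loop would settle at for a given input voltage. It covers the ideal static relation only, not the PI loop dynamics, and the function names are illustrative.

```python
def v_dpp(duty, v_in):
    """DPP string voltage from (4.65) for an ideal buck SVC."""
    return 10 * duty / (9 * duty + 1) * v_in


def duty_for(v_in, v_target):
    """Duty ratio that gives v_target, obtained by solving (4.65) for D."""
    m_v = v_target / v_in
    return m_v / (10 - 9 * m_v)


for v_in in (50.0, 55.0, 60.0, 65.0):
    duty = duty_for(v_in, 50.0)
    print(f"V_IN = {v_in:4.1f} V -> D = {duty:.3f}, V_DPP = {v_dpp(duty, v_in):.1f} V")
```

Under this ideal relation, regulating a 50 V string requires a duty ratio of 1 at 50 V input, 0.5 at 55 V input, and 0.25 at 65 V input.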

To experimentally validate the analysis, a buck SVC was built and tested. Fig. 4.65b shows the buck SVC and DPP converter prototype next to a U.S. quarter. The prototype is designed to support ten series voltage domains with a 50 V overall string voltage, and each domain can supply 5 V loads such as hard disk drives or LEDs. The buck SVC regulates an input voltage of $50 \text{ V} \sim 65 \text{ V}$ to 50 V for the DPP system. In this input range, according to (4.50), the buck SVC only processes 10%~31% of the overall load power.

Figure 4.65: (a) Control block diagram. (b) Prototype of the buck SVC and the 10-port DPP converter in comparison with a US quarter.

Table 4.10: Bill-of-Material of the Prototype

| tion                                                                                                         |
|--------------------------------------------------------------------------------------------------------------|
| S, LMG5200MOFT CM Shielded, $1.5\mu H$                                                                       |
| f, CSD95377Q4M<br>X5R, 100 $\mu$ F × 3<br>ft SLC7649, 100 nH<br>z<br>ube, ELP18-3C95<br>gle turn per winding |
| 1                                                                                                            |

The board area of the buck SVC is about 1/4 of that of the DPP converter and is comparable to a U.S. quarter. Table 4.10 lists the key component values and parameters of the prototype. A detailed component volume breakdown of the prototype is plotted in Fig. 4.66. Fig. 4.67a shows an example application, where the SVC-DPP architecture is powering a 30×20 LED array. The 600 LEDs are evenly divided into ten groups, and the LEDs in each group are connected in parallel in one 5-V voltage domain. The DPP converter is operated to balance the power difference among different LED groups and



Figure 4.66: Component volume breakdown of buck SVC and DPP converter.



Figure 4.67: (a) Picture of an example application. The SVC-DPP is powering a 600-LED screen. (b) Power and signal configuration of the SVC-DPP system.

maintain stable 5 V for each voltage domain. Fig. 4.67b shows the power and signal configuration of the SVC-DPP system with programmable LED arrays.

Fig. 4.68a shows the measured steady-state voltage and current waveforms. Here, the input dc bus voltage is 55 V. The buck SVC can effectively compensate for the difference between the input voltage and the DPP string voltage, converting 55 V into 50 V for the DPP system. The duty ratio of the buck SVC is 50%, consistent with (4.65). Fig. 4.68b shows the regulated DPP string voltage and the voltage of domain #1 during the input voltage ramping transient. Both the DPP string voltage and the voltage of domain #1 remain stable during the transient, validating the SVC and DPP control strategy.



Figure 4.68: (a) Measured waveforms of input dc bus voltage, regulated DPP string voltage, and the gate driving signal and inductor current of the buck SVC. (b) Measured waveforms of DPP string voltage and the voltage of domain #1 when input voltage ramps up and down between 55 V and 60 V. Input dc bus voltage is measured in dc coupling; DPP string voltage and the voltage of domain #1 are measured in ac coupling.

In the SVC-DPP architecture, the DPP converter needs to cope with both the inherent power mismatch of the series loads and the power imbalance caused by the SVC. The system efficiency is defined as the total load power divided by the input power. We first consider only the power imbalance caused by the SVC by assuming identical load power across the series voltage domains. In this case, the system efficiency describes the average performance of the SVC-DPP system with matched average domain load powers. The best-case and worst-case load distributions are discussed later, and the corresponding system efficiencies are plotted to show the upper and lower efficiency limits of the SVC-DPP system.

Figure 4.69 shows the efficiency curves and power loss breakdown in the case of identical domain load powers. Fig. 4.69a plots the measured SVC converter efficiency, DPP converter efficiency, and the system efficiency when SVC is converting 55 V input voltage into 50 V DPP string voltage. Losses of control and auxiliary circuits are not included here. In the figure, the SVC processed power and the DPP processed differential power are labeled along the curves, and both of them are only a small portion of the total load power, leading to significantly improved system efficiency.



Figure 4.69: (a) Measured SVC converter efficiency, 1-port-to-9-port DPP converter efficiency, and system efficiency when the SVC is converting 55 V input dc bus voltage into 50 V DPP string voltage. The SVC processed power and the DPP processed differential power are labeled along the curves. (b) Power loss breakdown of the SVC and DPP converter at 100 W system load power.

As shown in Fig. 4.69a, the maximum SVC processed power at 55 V input voltage is 53.9 W, indicating over 290 W system power according to (4.50). The peak converter efficiency of the buck SVC and the DPP converter (measured in the 1-port-to-9-port power delivery case) is around 95%, but the efficiency of the full SVC-DPP system can be much higher, reaching a 98.8% peak at around 80 W load power while losing less than 1 W. A detailed power loss breakdown for the SVC converting 55 V to 50 V for the DPP system at 100 W load power is plotted in Fig. 4.69b. The labeled conduction loss for the SVC converter covers all resistive paths except the inductor winding wire, whose loss is included in the inductor ac and dc loss. For the DPP converter, the labeled conduction loss covers all resistive paths. Based on the manufacturer's core loss calculation tool (Coilcraft Inductor Analyzer), the core loss of the DPP inductor (Coilcraft SLC7649S-100nH) at this operating point is negligible and is not included in the graph. In Fig. 4.69b, conduction loss accounts for the majority of the system power loss at this operating point. Fig. 4.70a plots the system efficiency at different input voltages in the case of identical domain load powers. When the input voltage increases, the voltage regulation ratio  $M_v$  decreases. As indicated by Figs. 4.30a and 4.30b, the power processed by the buck SVC and the DPP converter increases as  $M_v$  decreases, yielding higher loss and lower system efficiency.

Figure 4.70: (a) System efficiency when converting input dc bus voltages of 55 V, 60 V, and 65 V into 50 V for the DPP system with identical domain load powers. (b) System efficiency in the best-case and the worst-case load distributions. The buck SVC is converting 55 V input voltage into 50 V DPP string voltage.
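The figures quoted in this discussion can be cross-checked with a few lines of arithmetic. The sketch assumes a lossless power balance in which nine of the ten 5-V domains are fed directly by the input current and the SVC processes the remaining voltage difference. This assumption reproduces the 10%~31% range quoted for the 50 V ~ 65 V input span, but it is only a stand-in for (4.50), which is not restated here. The function names are illustrative.

```python
def svc_power_ratio(v_in, v_dpp=50.0, n_domains=10):
    """Assumed lossless share of the load power processed by the buck SVC."""
    # Voltage of the domains that are fed directly by the input current.
    v_direct = v_dpp * (n_domains - 1) / n_domains
    return (v_in - v_direct) / v_in


def loss_from_efficiency(p_load, efficiency):
    """Power loss implied by a load power and an efficiency (efficiency = p_load / p_in)."""
    return p_load * (1.0 / efficiency - 1.0)


if __name__ == "__main__":
    for v_in in (50.0, 55.0, 65.0):
        print(f"V_IN = {v_in:.0f} V: SVC share = {svc_power_ratio(v_in):.3f}")
    print(f"System power at 53.9 W SVC power: {53.9 / svc_power_ratio(55.0):.1f} W")
    print(f"Loss at 80 W and 98.8%: {loss_from_efficiency(80.0, 0.988):.2f} W")
```

Under this assumption the SVC share grows as the input voltage rises (that is, as $M_v$ falls), which is consistent with the efficiency trend described for Fig. 4.70a.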

To examine the best-case and worst-case load distributions for the system efficiency, both the inherent power mismatch of the series loads and the power imbalance caused by the SVC need to be considered. As indicated by (4.50), the SVC processed power ratio ( $\rho_{SVC}$ ) only depends on  $M_v$  and  $K_s$ . Therefore, in a specific SVC-DPP system (i.e., when  $M_v$  and  $K_s$  are fixed), the SVC processed power and the loss it generates are determined by the total load power regardless of the load distribution. The load distribution affects the total power loss and system efficiency only through differential power processing. Fig. 4.71 shows the load distributions for the best-case and worst-case system efficiencies when the SVC is regulating 55 V input voltage to 50 V DPP string voltage. Denote the total load power as  $P_{tot}$ ; the SVC processed power is then fixed at  $\frac{2}{11}P_{tot}$ , which is directly delivered to the first voltage domain. In the best-case load distribution, domain #1 consumes  $\frac{2}{11}P_{tot}$ , and each of domains #2~#10 consumes  $\frac{1}{11}P_{tot}$ , as shown in Fig. 4.71a. In this case, the power of each domain is balanced, so the DPP converter does not need to deliver differential power and the total power loss is minimized. Note in Fig. 4.71a that the best-case load distribution of an SVC-DPP system differs from that of a conventional DPP system because of the power imbalance caused by the SVC. In the worst-case load distribution, domain #1 consumes zero power and one of domains #2~#10 consumes  $P_{tot}$ . Fig. 4.71b shows one example of the worst-case load distribution, where the total processed differential power and the generated power loss are maximal. Fig. 4.70b plots the system efficiency in the best-case and worst-case load distributions. The system efficiency curve of any other load distribution lies in between. As shown in the figure, the peak system efficiency in the best case reaches 99%, and even the worst-case system efficiency can reach 92.3%. In a well-designed DPP system, however, the worst-case load distribution rarely happens.

Figure 4.71: Load distributions of: (a) best-case system efficiency; (b) worst-case system efficiency. The buck SVC is converting 55 V input voltage into 50 V DPP string voltage.
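The best-case and worst-case distributions can be compared with a small bookkeeping sketch for the 55 V to 50 V operating point. It assumes a lossless system in which, before any DPP action, domain #1 receives 2/11 of the total load power through the SVC and each of domains #2~#10 receives 1/11, as stated in the text. The "differential power" computed here is simply the sum of the per-domain shortfalls that the DPP converter would have to supply; it is an illustrative proxy, not the loss model used in this chapter, and the function names are hypothetical.

```python
from fractions import Fraction

N_DOMAINS = 10


def direct_delivery(p_tot):
    """Power reaching each domain without DPP action (55 V -> 50 V case)."""
    unit = Fraction(p_tot) / 11
    # Domain #1 gets the SVC output (2/11); the other nine get 1/11 each.
    return [2 * unit] + [unit] * (N_DOMAINS - 1)


def differential_power(loads):
    """Sum of per-domain shortfalls that the DPP converter must make up."""
    if len(loads) != N_DOMAINS:
        raise ValueError("expected one load value per domain")
    delivered = direct_delivery(sum(loads))
    return sum(max(load - fed, 0) for load, fed in zip(loads, delivered))


if __name__ == "__main__":
    best = [2] + [1] * 9          # matches the direct delivery exactly
    worst = [0, 11] + [0] * 8     # all load on one of domains #2~#10
    identical = [11] * 10         # equal load in every domain
    for name, loads in (("best", best), ("worst", worst), ("identical", identical)):
        share = Fraction(differential_power(loads), sum(loads))
        print(name, share)
```

With this proxy, the best case requires no differential power, while the worst case routes 10/11 of the total load power through the DPP converter.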

In summary, the SVC leverages the partial power processing concept and only compensates for the voltage difference between the input voltage and the DPP string voltage. The DPP converter only processes the differential power among the series-stacked voltage domains and inherits natural voltage step-down. An SVC may induce additional power conversion stress to the DPP, and the system should be jointly optimized to achieve optimal performance.

### 4.7 Chapter Summary

In this chapter, a granular power architecture with series coupled magnetics, namely the MAC-DPP architecture, is developed to support large-scale modular energy systems. Benefiting from differential power processing, the MAC-DPP architecture can significantly reduce the power conversion stress. It couples all series-stacked voltage domains through a series coupled multi-winding transformer, featuring reduced component count, smaller magnetic volume, and fewer differential power conversion stages compared to other DPP solutions.

To explore the performance scaling limit of DPP systems, this chapter develops a stochastic analytical framework which estimates average power loss of various DPP topologies under probabilistic load distributions. Scaling factors are introduced to describe how power loss scales as the dimension (N, M), average load power  $(\mu_0)$ , and load power variance  $(\sigma_0^2)$  of a modular load array increase. The scaling characteristics of general DPP topologies were analyzed and compared. The analytical framework was verified by both SPICE simulations and experiments, and the comparison results indicate that the proposed MAC-DPP solution stands out from other explored DPP solutions, although SC solutions are equally good if FSL applies. The analytical framework, scaling factors, and stochastic models provide useful guidelines for designing large-scale DPP systems.

Essentially, a MAC-DPP converter is a MIMO system. To precisely control the MIMO power flow and regulate port voltages, this chapter presents two solutions based on feedback control and feedforward control, respectively. A systematic small signal modeling approach is first derived to guide the control loop design of large-scale MAC-DPP systems. The small signal model captures the impact of the lossy components in a MAC converter with an output resistance and can be easily extended to capture the transfer function of a MAC converter with an arbitrary number of ports. Based on the small signal model, a feedback control strategy with distributed phase-shift modules is designed, which is simple, robust, and scalable. The proposed distributed control strategy can effectively keep the voltages stable in the worst-case hot-swapping scenario of data storage servers. For the feedforward control framework, a customized Newton-Raphson solver is designed to identify the cross-coupled control variables in the nonlinear power flow equations. The feasible target powers and the initial solutions required for convergence are discussed.
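To illustrate the feedforward idea, the sketch below applies a plain Newton-Raphson iteration to a toy three-port case. It is not the customized solver developed in this chapter. It assumes a lossless model in which every pair of ports exchanges power according to the standard phase-shift relation $P = G\,\varphi\,(1 - |\varphi|/\pi)$, with the same hypothetical link gain $G$ for every pair and port 1 used as the phase reference. All names and numbers are illustrative.

```python
import math


def link_power(phi, gain):
    """Power sent over one link for phase lead phi (rad), lossless phase-shift model."""
    return gain * phi * (1.0 - abs(phi) / math.pi)


def link_slope(phi, gain):
    """Derivative of link_power with respect to phi."""
    return gain * (1.0 - 2.0 * abs(phi) / math.pi)


def injected_powers(phi2, phi3, gain):
    """Power injected by ports 2 and 3; port 1 is the reference (phi1 = 0)."""
    p2 = link_power(phi2, gain) + link_power(phi2 - phi3, gain)
    p3 = link_power(phi3, gain) + link_power(phi3 - phi2, gain)
    return p2, p3


def solve_phases(p2_target, p3_target, gain, tol=1e-9, max_iter=50):
    """Newton-Raphson on the two coupled power equations, starting from zero phase."""
    phi2 = phi3 = 0.0
    for _ in range(max_iter):
        p2, p3 = injected_powers(phi2, phi3, gain)
        r2, r3 = p2 - p2_target, p3 - p3_target
        if max(abs(r2), abs(r3)) < tol:
            return phi2, phi3
        s2 = link_slope(phi2, gain)
        s3 = link_slope(phi3, gain)
        s23 = link_slope(phi2 - phi3, gain)
        # Jacobian of (p2, p3) with respect to (phi2, phi3).
        j11, j12, j21, j22 = s2 + s23, -s23, -s23, s3 + s23
        det = j11 * j22 - j12 * j21
        # Newton step: subtract J^-1 * residual (2x2 inverse written out).
        phi2 -= (j22 * r2 - j12 * r3) / det
        phi3 -= (j11 * r3 - j21 * r2) / det
    raise RuntimeError("Newton-Raphson did not converge")


if __name__ == "__main__":
    targets = injected_powers(0.3, -0.2, 100.0)
    print(solve_phases(targets[0], targets[1], 100.0))
```

The phase shifts are cross-coupled because changing one port's phase alters the power on every link attached to it, which is why the two equations are solved jointly rather than one port at a time.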

To validate the granular MAC-DPP architecture and theoretical analysis, a 450 W 10-port MAC-DPP converter was designed and tested on both a 600-LED screen and a 50-HDD data storage server. On the LED testbench, different independent and dependent random load tasks have been created and assigned to the LED screen. The measured loss scaling trends match well with the predicted results from the developed stochastic loss model. The modified HDD server is the first data storage server supported by series-stacked differential power processing. It can maintain normal reading/writing operation against the worst hot-swapping scenario for the HDDs. The storage server was also tested in an extreme case when 25 W load was hot-swapped at one port. The transient response of the MAC-DPP system meets the requirements of typical HDDs, and the system efficiency for a 450 W storage server remains above 99% for a majority of operating conditions. The storage server was also tested with various HDD storage modes including direct storage and different RAID levels. Experimental results showed that the performance of large-scale modular information systems can be greatly improved by software, hardware, and power architecture co-design.

In addition, this chapter presents an SVC converter that leverages partial power processing to regulate the series-stacked string voltage of DPP systems. Compared to a standalone dc-dc regulator, the SVC only processes a small fraction of the total load power but may introduce additional stress to the DPP system. A theoretical analysis compares two quantities: the sum of the SVC processed power and the additional power conversion stress that the SVC brings to the DPP converter, and the power conversion stress of a traditional DPP architecture with a standalone pre-regulator. The operating conditions in which the total SVC-incurred power processing is less than the total load power are identified. Several SVC topologies are compared based on their component load factors. A buck SVC converter is designed and applied to the 10-port MAC-DPP converter. In addition to improved efficiency and reduced size, the SVC also enables soft-start and fault protection of the DPP system. Experimental results show that the SVC can effectively regulate the DPP bus voltage with minimal impact on the DPP performance.

The MAC-DPP converter, the SVC regulation stage, and the stochastic modeling approach presented in this chapter create a comprehensive framework for designing and analyzing power architectures that can support large-scale modular loads with ultra-high energy efficiency.

#### Related Publications

- 1. P. Wang, Y. Chen, J. Yuan, R. C. N. Pilawa-Podgurski and M. Chen, "Differential Power Processing for Ultra-Efficient Data Storage," *IEEE Transactions on Power Electronics*, vol. 36, no. 4, pp. 4269-4286, April 2021.
- 2. P. Wang, R. C. N. Pilawa-Podgurski, P. T. Krein and M. Chen, "Stochastic Power Loss Analysis of Differential Power Processing," *IEEE Transactions on Power Electronics*, vol. 37, no. 1, pp. 81-99, Jan. 2022.
- 3. P. Wang and M. Chen, "Analysis and Design of Series Voltage Compensator for Differential Power Processing," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 10, no. 6, pp. 7890-7903, Dec. 2022.
- 4. P. Wang and M. Chen, "Series Voltage Compensator for Differential Power Processing," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, Detroit, MI, USA, 2020, pp. 135-142.

- 5. P. Wang, R. C. N. Pilawa-Podgurski, P. T. Krein and M. Chen, "Performance Limits of Differential Power Processing," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, Denmark, 2020, pp. 1-8.
- 6. P. Wang, Y. Chen, P. Kushima, Y. Elasser, M. Liu and M. Chen, "A 99.7% Efficient 300 W Hard Disk Drive Storage Server with Multiport Ac-Coupled Differential Power Processing (MAC-DPP) Architecture," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, Baltimore, MD, USA, 2019, pp. 5124-5131.
- 7. P. Wang, Y. Chen, Y. Elasser and M. Chen, "Small Signal Model for Very-Large-Scale Multi-Active-Bridge Differential Power Processing (MAB-DPP) Architecture," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, Toronto, ON, Canada, 2019, pp. 1-8.
- 8. P. Wang and M. Chen, "Towards Power FPGA: Architecture, Modeling and Control of Multiport Power Converters," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, Padua, Italy, 2018, pp. 1-8.

# Chapter 5

## Conclusion

#### 5.1 Conclusion

This thesis explores the granular power architecture for extreme performance power conversion. Consisting of distributed switching cells with magnetics integration, the granular power architecture can minimize power conversion stress and maximize passive component utilization, making it a promising solution to powering emerging energy systems such as data centers, electric vehicles, and grid-scale energy storage.

The advantages of granular power conversion fundamentally come from component scaling laws: (1) the switch  $R_{on}Q_{gd}$  scaling indicates it is beneficial to replace one single "large" switch with multiple "small" switches; (2) the transformer power rating scaling suggests one "large" coupled magnetic component is superior to multiple "small" discrete magnetics; (3) capacitor energy density scaling implies "large" and "small" capacitors are equally good. Therefore, the granular power architecture can achieve better device performance by reducing the power loss of semiconductor switches and decreasing the size, loss, and inductive energy storage of magnetic components. The granular power architecture also offers many other benefits, including smaller device power ratings and passive component sizes, higher efficiency, scalability, and functionality, as well as reduced parasitics that enable higher frequency operation.

To fully realize the benefits of granular power architecture, a comprehensive understanding of both the magnetics integration approaches and the performance limits of various converter topologies is essential. The thesis addresses this objective from three major perspectives:

- 1. A Systematic All-in-One Magnetics Integration Approach: Chapter 2 presents a matrix coupled all-in-one magnetic structure combining both series coupling and parallel coupling for PWM power conversion. A step-by-step analysis framework is developed to unveil the mechanism of current ripple reduction and current ripple steering among the matrix coupled windings. A figure of merit is defined to quantify the advantages gained from matrix coupling, indicating that a higher number of phases and a stronger coupling coefficient yield larger benefits. The matrix coupling magnetic structure and theoretical analysis are verified by a matrix coupled SEPIC prototype which can support load current up to 185 A at 5 V-to-1 V voltage conversion with over 470 W/in³ power density. The matrix coupled inductor we built is 5.6 times smaller and 8.5 times faster than commercial discrete inductors with similar current ripples and current ratings.
- 2. A Granular Power Architecture with Parallel Coupled Magnetics: Chapter 3 develops the MSC-PoL architecture to support high current computing systems with high efficiency and ultra-compact size. The MSC-PoL architecture consists of multiple granular switched-capacitor and switched-inductor cells. Parallel-coupled inductors with interleaving operation are leveraged to minimize dc magnetic energy storage, reduce inductor current ripple, and improve transient speed. The switched capacitor cells are soft charged with switched-inductor current sources, removing the charge sharing loss and achieving mutual balancing between capacitor voltages and inductor currents. A systematic analysis of the intrinsic L-C resonant issue in hybrid switched-capacitor/magnetics systems is performed, providing guidelines for control design. Both the architecture and theoretical analysis are verified by a 48-to-1-V/450-A/724 W/in<sup>3</sup> MSC-PoL prototype, which combines many state-of-the-art technologies to achieve PwrSiP voltage regulation for high-performance microprocessors.
- 3. A Granular Power Architecture with Series Coupled Magnetics: Chapter 4 develops the MAC-DPP architecture to power large-scale modular energy systems with extremely high efficiency. The proposed MAC-DPP converter utilizes one series-coupled multi-winding transformer to connect all ports, resulting in fewer components, smaller magnetic volume, and fewer differential power conversion stages compared to other DPP solutions. Its granular power architecture offers high modularity and scalability, allowing for easy expansion without customizing the design for each port. A stochastic loss model is developed to quantify the DPP performance under random loads and a scaling factor is introduced to explore the performance limits of various DPP topologies. Two control strategies, feedback and feedforward, are proposed to regulate the MIMO power flow and port voltages. A 10-port 450 W MAC-DPP prototype with 700 W/in<sup>3</sup> power density is built and tested on a 50-HDD storage server, achieving 99.77% system efficiency and demonstrating the feasibility of using DPP power architecture for data storage servers with full reading, writing, and hot-swapping capabilities. The DPP-powered HDD server is also tested under different storage architectures, highlighting the importance of software, hardware, and power architecture co-design in next-generation data center power architectures.



Figure 5.1: Comparison between (a) magnetic-core memory [181] and (b) prospective power processor. In the power processor, multiple power loads and sources will store and transfer energy through the centralized magnetic core structure.

#### 5.2 Future Work

In the 1960s, the MIT team invented the first magnetic-core memory, a random-access computer memory shown in Fig. 5.1a, in which multiple computing units store and transmit data through a centralized magnetic network. In this thesis, a similar granular power architecture with distributed switching cells and magnetics integration is discussed, providing a promising solution to future power processors where multiple power loads and sources will store and transfer energy through a centralized magnetic core, as demonstrated in Fig. 5.1b.

The prospective power processor consists of four basic units: 1) CMU: central magnetics unit for energy storage and power transfer; 2) DAU: dc-ac unit for dc and ac power conversion; 3) EBU: energy buffer unit to filter power fluctuation; 4) PRU: power routing unit to configure input/output interfaces for surrounding sources and loads. The power processor can potentially provide extreme performance power conversion by leveraging the scaling trends of device performance. More importantly, it enables the general-purpose MIMO power architecture with a reconfigurable number of ports as well as reconfigurable current and voltage ratings for each port.

The development of this thesis also opens up many other avenues for further exploration, including but not limited to the following:

- In Section 4.3, a stochastic loss model is developed to evaluate DPP system performance through the expected power loss. More sophisticated stochastic models (e.g., Markov chains and random processes) can be applied to power electronics systems to address practical problems with uncertain outcomes (e.g., reliability).
- AI techniques can be leveraged to design and control power electronics systems. For example, the magnetic components in this thesis are designed and optimized based on the traditional Steinmetz equation (or iGSE), which cannot capture many operating conditions, including waveform shape, dc bias, and temperature. Because there is no uniform measurement standard, the Steinmetz coefficients provided by manufacturers are less convincing, limiting the accuracy of the calculated magnetic loss. In contrast, machine learning models can be applied to predict magnetic loss under arbitrary operating conditions [112]. AI models can also be used to control sophisticated power electronics systems such as multi-port converters with multi-input-multi-output power flows [182].
- Integration techniques can be utilized to improve converter performance. As mentioned in Section 3.6.5, integrating multiple switches and drivers together can reduce circuit parasitics, suppressing non-ideal voltage/current spikes and ringing as well as reducing parasitic-related losses and improving efficiency. In addition, passive components including capacitors and magnetics can be integrated into the package or together with the semiconductor die to achieve miniaturized power conversion.

- Increasing the switching frequency can reduce passive component size and expand control bandwidth. Extensive research has investigated the advantages of high-frequency (HF) and very-high-frequency (VHF) power electronics systems [28]. In Section 3.6, the MSC-PoL prototype is tested at 400~700 kHz switching frequency. It would be meaningful to design an MSC-PoL prototype that supports a higher switching frequency (e.g., 1~10 MHz) with faster transient speed and better handling of microprocessor power dynamics.
- WBG devices will play a crucial role in next-generation power electronics systems. Given the same breakdown voltage, WBG devices offer lower resistance, smaller footprint, and better thermal conductivity. As shown in Fig. 1.1b, the SiC and GaN switches can theoretically reduce the power loss by about 4 times and 8 times compared to silicon-based switches. Their lower parasitics allow higher frequency operation. Currently, WBG devices are facing challenges including gate driver circuit design, device packaging, and  $R_{on}$  degradation. Addressing these challenges will further enhance WBG device performance and broaden their applications.
- Last but not least, this thesis envisions a promising trend to replace complicated circuit connections with well-designed magnetic paths. The matrix coupling approach presented in Chapter 2 provides an example, opening opportunities for more sophisticated magnetic structures that merge many magnetic components into one with improved performance and functionality.
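As a brief illustration of the limitation noted in the list above, the classical Steinmetz equation estimates core loss density as $P_v = k f^{\alpha} \hat{B}^{\beta}$, so only the frequency and the peak flux density enter the calculation. The sketch below uses hypothetical coefficients (`k`, `alpha`, `beta`) chosen purely for illustration; they are not taken from any datasheet or from this thesis.

```python
def steinmetz_core_loss(freq_hz, b_peak_t, k, alpha, beta):
    """Classical Steinmetz estimate of core loss density for sinusoidal flux."""
    return k * freq_hz ** alpha * b_peak_t ** beta


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    k, alpha, beta = 1.0, 1.5, 2.5
    low = steinmetz_core_loss(500e3, 0.05, k, alpha, beta)
    high = steinmetz_core_loss(500e3, 0.10, k, alpha, beta)
    # Waveform shape, dc bias, and temperature are not inputs to the formula,
    # so two operating points that differ only in those respects get the same estimate.
    print(high / low)
```

Because the formula takes no waveform, bias, or temperature input, any dependence on those conditions has to be folded into the fitted coefficients, which is the gap that data-driven loss models aim to close.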

# Appendix A

# Capacitor Survey Datasheet

Table A.1: Capacitor Specifications for Data Points in Fig. 1.3a

| Part Number | Material | Package | Voltage Rating [V] | Thickness [mm] | Volume [mm3] | Maximum Energy Storage [J] | Maximum Energy Density [J/mm3] |
|---|---|---|---|---|---|---|---|
| GRT033R70J472KE01 | X7R | 0201 | 6.3 | 0.33 | 0.0594 | 8.77E-08 | 1.48E-06 |
| GRT033R70J103KE01 | X7R | 0201 | 6.3 | 0.33 | 0.0594 | 1.85E-07 | 3.12E-06 |
| GRT033R71A103KE01 | X7R | 0201 | 10 | 0.33 | 0.0594 | 4.46E-07 | 7.50E-06 |
| GCM033R71A182MA01 | X7R | 0201 | 10 | 0.33 | 0.0594 | 8.95E-08 | 1.51E-06 |
| GCM033R71C182MA40 | X7R | 0201 | 16 | 0.33 | 0.0594 | 2.01E-07 | 3.39E-06 |
| GCM033R71C222KA55 | X7R | 0201 | 16 | 0.33 | 0.0594 | 2.35E-07 | 3.96E-06 |
| GCM033R71E222KE02 | X7R | 0201 | 25 | 0.33 | 0.0594 | 4.54E-07 | 7.64E-06 |
| GCM033R71E332KE02 | X7R | 0201 | 25 | 0.33 | 0.0594 | 6.73E-07 | 1.13E-05 |
| GCM033R71E331KA03 | X7R | 0201 | 25 | 0.33 | 0.0594 | 9.11E-08 | 1.53E-06 |
| GRT155R70G105KE01 | X7R | 0402 | 4 | 0.55 | 0.275 | 5.14E-06 | 1.87E-05 |
| GRT155R70J105KE01 | X7R | 0402 | 6.3 | 0.55 | 0.275 | 9.24E-06 | 3.36E-05 |
| GRT155R71A224KE01 | X7R | 0402 | 10 | 0.55 | 0.275 | 6.18E-06 | 2.25E-05 |
| GXT155R71C224KE01 | X7R | 0402 | 16 | 0.55 | 0.275 | 9.95E-06 | 3.62E-05 |
| GRT155R71E104KE01 | X7R | 0402 | 25 | 0.55 | 0.275 | 1.86E-05 | 6.76E-05 |
| GCM155R71E333KA55 | X7R | 0402 | 25 | 0.55 | 0.275 | 7.63E-06 | 2.77E-05 |
| GXT155R71H104KE01 | X7R | 0402 | 50 | 0.55 | 0.275 | 3.98E-05 | 1.45E-04 |
| GGM155R72A472KA37 | X7R | 0402 | 100 | 0.55 | 0.275 | 1.12E-05 | 4.07E-05 |
| GCM155R72A222KA37 | X7R | 0402 | 100 | 0.55 | 0.275 | 5.19E-06 | 1.89E-05 |
| GCM155R72A221KA01 | X7R | 0402 | 100 | 0.55 | 0.275 | 7.36E-07 | 2.68E-06 |
| GCM21BR70J106KE21 | X7R | 0805 | 6.3 | 1.4 | 3.36 | 1.29E-04 | 3.84E-05 |
| GCJ21BR71A106ME01 | X7R | 0805 | 10 | 1.45 | 3.48 | 2.41E-04 | 6.92E-05 |
| GCM21BR71C475KA67 | X7R | 0805 | 16 | 1.4 | 3.36 | 2.43E-04 | 7.22E-05 |
| GCM21BR71E225KA67 | X7R | 0805 | 25 | 1.4 | 3.36 | 3.90E-04 | 1.16E-04 |
| GCM21BR7YA155KA54 | X7R | 0805 | 35 | 1.4 | 3.36 | 3.26E-04 | 9.69E-05 |
| GGM21BR71H105KA03 | X7R | 0805 | 50 | 1.4 | 3.36 | 6.29E-04 | 1.87E-04 |
| GCD21BR72A823MA01 | X7R | 0805 | 100 | 1.4 | 3.36 | 1.95E-04 | 5.81E-05 |
| GCJ219R72A273MA01 | X7R | 0805 | 100 | 0.95 | 2.28 | 8.04E-05 | 3.53E-05 |
| GCE21BR72A273MA01 | X7R | 0805 | 100 | 1.45 | 3.48 | 8.71E-05 | 2.50E-05 |
| GCJ21AR72E222KXJ1 | X7R | 0805 | 250 | 1 | 2.4 | 4.04E-05 | 1.69E-05 |
| GCM21AR72E102KX01 | X7R | 0805 | 250 | 1 | 2.4 | 2.25E-05 | 9.37E-06 |
| GCM31CR70J226KE26 | X7R | 1206 | 6.3 | 1.8 | 9.216 | 3.14E-04 | 3.41E-05 |
| GGM31CR71A226KE02 | X7R | 1206 | 10 | 1.8 | 9.216 | 5.47E-04 | 5.94E-05 |

| Part Number | Material | Package | Voltage<br>Rating [V] | Thickness [mm] | Volume<br>[mm3] | Maximum Energy<br>Storage [J] | Maximum Energy<br>Density [J/mm3] |  |
|---|---|---|---|---|---|---|---|--|
| GGM31CR71C106KA64 | X7R | 1206 | 16 | 1.8 | 9.216 | 8.18E-04 | 8.88E-05 |  |
|  |  |  |  |  |  |  | 1.07E-04 |  |
|  |  |  |  |  |  |  | 3.85E-05 |  |
|  |  |  |  |  |  |  | 7.47E-05 |  |
|  |  |  |  |  |  |  | .47E-03 |  |
|  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  | 5.51E-05<br>1.17E-05 |  |
|  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  | .87E-04 |  |
|  |  |  |  |  |  |  | .01E-04 |  |
|  |  |  |  |  |  |  | 3.40E-04 |  |
|  |  |  |  |  |  |  | 1.79E-04 |  |
|  |  |  |  |  |  |  | 3.12E-04 |  |
|  |  |  |  |  |  |  | .49E-04 |  |
|  |  |  |  |  |  |  | 2.06E-04 |  |
|  |  |  |  |  |  |  | .49E-04 |  |
|  |  |  |  |  |  |  | .69E-05 |  |
|  |  |  |  |  |  |  | .52E-05 |  |
|  |  |  |  |  |  |  | .40E-05 |  |
|  |  |  |  |  |  |  | .28E-05 |  |
|  |  |  |  |  |  |  | .33E-04 |  |
|  |  |  |  |  |  |  | .18E-04 |  |
|  |  |  |  |  |  |  | .12E-04 |  |
|  |  |  |  |  |  |  | .39E-04 |  |
|  |  |  |  |  |  |  | .58E-04 |  |
|  |  |  |  |  |  |  | .33E-04 |  |
|  |  |  |  |  |  |  | .26E-06 |  |
|  |  |  |  |  |  |  | .31E-07 |  |
|  |  |  |  |  |  |  | .05E-07 |  |
| GCM0335C1E100GA16 | C0G | 0201 | 25 | 0.33 | 0.0594 | 3.13E-09 | 5.26E-08 |  |
| GRT0335C1E471GA02 | C0G | 0201 | 25 | 0.33 | 0.0594 | 1.47E-07 | 2.47E-06 |  |
| GCM0335C1E470GA16 | C0G | 0201 | 25 | 0.33 | 0.0594 | 1.47E-08 | 2.47E-07 |  |
| GRT0335C1E9R0DA02 | C0G | 0201 | 25 | 0.33 | 0.0594 | 2.81E-09 | 4.73E-08 |  |
| GRT0335C1E120GA02 | C0G | 0201 | 25 | 0.33 | 0.0594 | 3.75E-09 | 6.31E-08 |  |
| GRT0335C1E220FA02 | C0G | 0201 | 25 | 0.33 | 0.0594 | 6.88E-09 | 1.16E-07 |  |
| GCM0335C1E200JA16 | C0G | 0201 | 25 | 0.33 | 0.0594 | 6.25E-09 | 1.05E-07 |  |
| GRT0335C1E270FA02 | C0G | 0201 | 25 | 0.33 | 0.0594 | 8.44E-09 | 1.42E-07 |  |
| GRT0335C1H221FA02 | C0G | 0201 | 50 | 0.33 | 0.0594 | 2.75E-07 | 4.63E-06 |  |
| GCM0335C1H910FA16 | C0G | 0201 | 50 | 0.33 | 0.0594 | 1.14E-07 | 1.91E-06 |  |
| GRT0335C2A7R0DA02 | C0G | 0201 | 100 | 0.33 | 0.0594 | 3.50E-08 | 5.89E-07 |  |
| GRT0335C2A680FA02 | C0G | 0201 | 100 | 0.33 | 0.0594 | 3.40E-07 | 5.72E-06 |  |
| GRT1555C1E101JA02 | C0G | 0402 | 25 | 0.55 | 0.275 | 3.13E-08 | 1.14E-07 |  |
| GCQ1555C1H2R4BB01 | C0G | 0402 | 50 | 0.55 | 0.275 | 3.00E-09 | 1.09E-08 |  |
| GCQ1555C1H3R8BB01 | C0G | 0402 | 50 | 0.55 | 0.275 | 4.75E-09 | 1.73E-08 |  |
| GCM1555C1H8R0BA16 | C0G | 0402 | 50 | 0.55 | 0.275 | 1.00E-08 | 3.64E-08 |  |
| GCQ1555C1H4R9BB01 | C0G | 0402 | 50 | 0.55 | 0.275 | 6.13E-09 | 2.23E-08 |  |
|  |  |  |  |  |  |  | .36E-07 |  |
|  |  |  |  |  |  |  | .68E-08 |  |
|  |  |  |  |  |  |  | .50E-08 |  |
|  |  |  |  |  |  |  | .14E-08 |  |
|  |  |  |  |  |  |  | .18E-08 |  |
|  |  |  |  |  |  |  | .50E-08 |  |
|  |  |  |  |  |  |  | .26E-07 |  |
|  |  |  |  |  |  |  | .45E-06 |  |

| Part Number                                       | Material | Package | Voltage<br>Rating [V] | Thickness [mm] | Volume<br>[mm3] | Maximum Energy<br>Storage [J] | Maximum Energy<br>Density [J/mm3] |  |
|---------------------------------------------------|----------|---------|-----------------------|----------------|-----------------|-------------------------------|-----------------------------------|--|
| GCM1885C1J242GA16                                 | C0G      | 0603    | 63                    | 0.9            | 1.152           | 4.76E-06                      | 4.13E-06                          |  |
| GCM1885C1J162GA16                                 | C0G      | 0603    | 63                    | 0.9            | 1.152           | 3.18E-06                      | 2.76E-06                          |  |
| GCM1885C1J182JA16                                 | C0G      | 0603    | 63                    | 0.9            | 1.152           | 3.57E-06                      | 3.10E-06                          |  |
| GCM1885C1J222JA16                                 | C0G      | 0603    | 63                    | 0.9            | 1.152           | 4.37E-06                      | 3.79E-06                          |  |
| GCM1885C1J182GA16                                 | C0G      | 0603    | 63                    | 0.9            | 1.152           | 3.57E-06                      | 3.10E-06                          |  |
| GCM1885C1J302JA16                                 | C0G      | 0603    | 63                    | 0.9            | 1.152           | 5.95E-06                      | 5.17E-06                          |  |
| GCM1885C1K392GA16                                 | C0G      | 0603    | 80                    | 0.9            | 1.152           | 1.25E-05                      | 1.08E-05                          |  |
| GCM1885C1K302GA16                                 | C0G      | 0603    | 80                    | 0.9            | 1.152           | 9.60E-06                      | 8.33E-06                          |  |
| GCM1885C1K222JA16                                 | C0G      | 0603    | 80                    | 0.9            | 1.152           | 7.04E-06                      | 6.11E-06                          |  |
| GCM1885C2A221JA16                                 | C0G      | 0603    | 100                   | 0.9            | 1.152           | 1.10E-06                      | 9.55E-07                          |  |
| GCM1885C2A4R0DA16                                 | C0G      | 0603    | 100                   | 0.9            | 1.152           | 2.00E-08                      | 1.74E-08                          |  |
| GCM1885C2A5R6CA16                                 | C0G      | 0603    | 100                   | 0.9            | 1.152           | 2.80E-08                      | 2.43E-08                          |  |
| GCH1885C2A9R0DE01                                 | C0G      | 0603    | 100                   | 0.9            | 1.152           | 4.50E-08                      | 3.91E-08                          |  |
| GCM2195C1K682GA16                                 | C0G      | 0805    | 80                    | 0.95           | 2.28            | 2.18E-05                      | 9.54E-06                          |  |
| GCM2195C1K103GA16                                 | C0G      | 0805    | 80                    | 0.95           | 2.28            | 3.20E-05                      | 1.40E-05                          |  |
| GCM2165C2A332JA16                                 | C0G      | 0805    | 100                   | 0.7            | 1.68            | 1.65E-05                      | 9.82E-06                          |  |
| GCM21B5C2E682JX0A                                 | C0G      | 0805    | 250                   | 1.45           | 3.48            | 2.13E-04                      | 6.11E-05                          |  |
| GCM21A5C2E180JX01                                 | C0G      | 0805    | 250                   | 1              | 2.4             | 5.63E-07                      | 2.34E-07                          |  |
| GCM21A5C2E181JX01                                 | C0G      | 0805    | 250                   | 1              | 2.4             | 5.63E-06                      | 2.34E-06                          |  |
| GCM21A5C2E150JX01                                 | C0G      | 0805    | 250                   | 1              | 2.4             | 4.69E-07                      | 1.95E-07                          |  |
| GCM21A5C2E102JX01                                 | C0G      | 0805    | 250                   | 1              | 2.4             | 3.13E-05                      | 1.30E-05                          |  |
| GCM21A5C2E121JX01                                 | C0G      | 0805    | 250                   | 1              | 2.4             | 3.75E-06                      | 1.56E-06                          |  |
| GCM21A5C2E391JX01                                 | C0G      | 0805    | 250                   | 1              | 2.4             | 1.22E-05                      | 5.08E-06                          |  |
| GCM21B5C2J122JX03                                 | C0G      | 0805    | 630                   | 1.45           | 3.48            | 2.38E-04                      | 6.84E-05                          |  |
| GCM21A5C2J390JX01                                 | C0G      | 0805    | 630                   | 1              | 2.4             | 7.74E-06                      | 3.22E-06                          |  |
| GCM21A5C2J181JX01                                 | C0G      | 0805    | 630                   | 1              | 2.4             | 3.57E-05                      | 1.49E-05                          |  |
| GRT31C5C1C124JA02                                 | C0G      | 1206    | 16                    | 1.8            | 9.216           | 1.54E-05                      | 1.67E-06                          |  |
| GRT31C5C1H823JA02                                 | C0G      | 1206    | 50                    | 1.8            | 9.216           | 1.03E-04                      | 1.11E-05                          |  |
| GCM3195C1K273GA16                                 | C0G      | 1206    | 80                    | 0.95           | 4.864           | 8.64E-05                      | 1.78E-05                          |  |
| GCM3195C1K333JA16                                 | C0G      | 1206    | 80                    | 0.95           | 4.864           | 1.06E-04                      | 2.17E-05                          |  |
| GCM31A5C2J121JX01                                 | C0G      | 1206    | 630                   | 1              | 5.12            | 2.38E-05                      | 4.65E-06                          |  |
| GCM31A5C2J150JX01                                 | C0G      | 1206    | 630                   | 1              | 5.12            | 2.98E-06                      | 5.81E-07                          |  |
| GCM31A5C2J102JX01                                 | C0G      | 1206    | 630                   | 1              | 5.12            | 1.98E-04                      | 3.88E-05                          |  |
| GCM31A5C3A120JX01                                 | C0G      | 1206    | 1000                  | 1              | 5.12            | 6.00E-06                      | 1.17E-06                          |  |
| GCM31A5C3A180JX01                                 | C0G      | 1206    | 1000                  | 1              | 5.12            | 9.00E-06                      | 1.76E-06                          |  |
| GCM31A5C3A391JX01                                 | C0G      | 1206    | 1000                  | 1              | 5.12            | 1.95E-04                      | 3.81E-05                          |  |
| GCM31C5C3A102JX03                                 | C0G      | 1206    | 1000                  | 1.8            | 9.216           | 5.00E-04                      | 5.43E-05                          |  |
| GCM31A5C3A560JX01                                 | C0G      | 1206    | 1000                  | 1              | 5.12            | 2.80E-05                      | 5.47E-06                          |  |
| GCM31A5C3A471JX01                                 | C0G      | 1206    | 1000                  | 1              | 5.12            | 2.35E-04                      | 4.59E-05                          |  |
| GCM31B5C3A681JX01                                 | C0G      | 1206    | 1000                  | 1.25           | 6.4             | 3.40E-04                      | 5.31E-05                          |  |
| GCM32D5C2J183JX01                                 | C0G      | 1210    | 630                   | 2.5            | 20              | 3.57E-03                      | 1.79E-04                          |  |

### Appendix B

# Stochastic Loss Model for DPP: Detailed Derivation and Extended Case Study

#### B.1 Derivations of the Expected Power Loss

This appendix derives expected power loss for the stochastic model under conditions of independent loads and of correlated loads. Definitions and constraints are the same as those introduced in Section 4.3.

#### B.1.1 Expected Power Loss with Independent Load

In Section 4.3.3, the stochastic model is developed based on independent and identically distributed (i.i.d.) individual load powers  $P_{ij}(t)$ . Under this condition, the domain powers  $P_i(t)$  are also i.i.d.

For the conventional reference converter, loss is related to total load power, and the expected value in (4.9) can be derived as

$$\mathbb{E}[P_{loss}(t)] = \frac{R_{out}}{V_0^2} \mathbb{E}\left[\left(\sum_{i=1}^N P_i(t)\right)^2\right] = \frac{R_{out}}{V_0^2} \left\{\sum_{i=1}^N \mathbb{E}[P_i^2(t)] + 2\sum_{1 \le i < j \le N} \mathbb{E}[P_i(t)P_j(t)]\right\}$$

$$\stackrel{\text{(i)}}{=} \frac{R_{out}}{V_0^2} \left(N\mathbb{E}[P_i^2(t)] + N(N-1)\mathbb{E}^2[P_i(t)]\right). \tag{B.1}$$

Here, line (i) follows because the  $P_i$  values are i.i.d. Therefore,  $\mathbb{E}[P_i(t)]$  and  $\mathbb{E}[P_i^2(t)]$  are identical for  $i = 1, ..., N$ , and  $\mathbb{E}[P_i(t)P_j(t)]=\mathbb{E}^2[P_i(t)]$  for any  $i\neq j$ . Considering  $P_i(t)=P_{i1}(t)+...+P_{iM}(t)$ , where  $P_{i1}(t),...,P_{iM}(t)$  are also i.i.d.,  $\mathbb{E}[P_i^2(t)]$  and  $\mathbb{E}^2[P_i(t)]$  in (B.1) can be expanded to

$$\mathbb{E}[P_i^2(t)] = M\mathbb{E}[P_{ij}^2(t)] + M(M-1)\mathbb{E}^2[P_{ij}(t)],$$

$$\mathbb{E}^2[P_i(t)] = (M\mathbb{E}[P_{ij}(t)])^2 = M^2\mathbb{E}^2[P_{ij}(t)]. \tag{B.2}$$

Substituting (B.2) into (B.1), the expected power loss is

$$\mathbb{E}[P_{loss}(t)] = \frac{R_{out}}{V_0^2} \left\{ MN \left( \mathbb{E}[P_{ij}^2(t)] - \mathbb{E}^2[P_{ij}(t)] \right) + M^2 N^2 \mathbb{E}^2[P_{ij}(t)] \right\}$$

$$\stackrel{\text{(i)}}{=} \frac{R_{out}}{V_0^2} \left( MN \underbrace{\text{Var}[P_{ij}(t)]}_{=\sigma_0^2} + M^2 N^2 \underbrace{\mathbb{E}^2[P_{ij}(t)]}_{=\mu_0^2} \right). \tag{B.3}$$

Line (i) is based on  $Var[X] = \mathbb{E}[X^2] - \mathbb{E}^2[X]$ .
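As a quick sanity check on (B.3), the closed form can be compared with a brute-force average over every outcome of a small discrete load model. The sketch below is illustrative only: the two-level load values, the function names, and the choice of $M = 2$, $N = 3$ are assumptions made for the example, not values from Section 4.3.

```python
from itertools import product


def reference_loss_closed_form(r_out, v0, m, n, mu0, var0):
    """Expected loss of the reference converter, following (B.3)."""
    return r_out / v0**2 * (m * n * var0 + (m * n) ** 2 * mu0**2)


def reference_loss_enumerated(r_out, v0, m, n, values):
    """Average (R_out / V0^2) * (total load power)^2 over all equally likely outcomes."""
    outcomes = list(product(values, repeat=m * n))
    mean_square = sum(sum(o) ** 2 for o in outcomes) / len(outcomes)
    return r_out / v0**2 * mean_square


# Hypothetical i.i.d. load: each P_ij is 0 W or 2 W with equal probability.
values = (0.0, 2.0)
mu0 = sum(values) / len(values)                            # mean of P_ij
var0 = sum((v - mu0) ** 2 for v in values) / len(values)   # variance of P_ij

closed = reference_loss_closed_form(0.4, 5.0, 2, 3, mu0, var0)
brute = reference_loss_enumerated(0.4, 5.0, 2, 3, values)
print(closed, brute)
```

The two numbers should agree to floating-point precision, because the enumeration evaluates the same expectation that (B.1)-(B.3) expand analytically.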

To calculate expected loss of DPP converters,  $P'_i(t) = P_i(t) - \mathbb{E}[P_i(t)]$  is defined to subtract out the mean value  $M\mu_0$  of  $P_i(t)$ , so that  $\mathbb{E}[P'_i(t)] = 0$ . The i.i.d. property still holds for  $P'_i(t)$ . For a fully-coupled DPP with this loading condition, instantaneous power loss at each port has the same probability distribution. The expected power loss at the  $i^{th}$  port can be derived from (4.10) as

$$\mathbb{E}[P_{loss.i}(t)] \stackrel{\text{(i)}}{=} \frac{R_{out}}{V_0^2} \mathbb{E}\left[\left(\frac{\sum_{k=1}^N P_k'(t)}{N} - P_i'(t)\right)^2\right] \\
\stackrel{\text{(ii)}}{=} \frac{R_{out}}{V_0^2} \left\{\sum_{k \neq i} \frac{1}{N^2} \mathbb{E}[P_k'^2(t)] + \frac{(1-N)^2}{N^2} \mathbb{E}[P_i'^2(t)]\right\} \stackrel{\text{(iii)}}{=} \frac{R_{out}}{V_0^2} \frac{N-1}{N} \mathbb{E}[P_i'^2(t)] \\
\stackrel{\text{(iv)}}{=} \frac{R_{out}}{V_0^2} \frac{N-1}{N} \left(\mathbb{E}[P_i^2(t)] - \mathbb{E}^2[P_i(t)]\right) = \frac{R_{out}}{V_0^2} \frac{N-1}{N} \text{Var}[P_i(t)] \\
\stackrel{\text{(v)}}{=} \frac{R_{out}}{V_0^2} \frac{M(N-1)}{N} \underbrace{\text{Var}[P_{ij}(t)]}_{=\sigma_0^2}. \tag{B.4}$$

Here, lines (i) and (iv) change the variables between  $P_i(t)$  and  $P'_i(t)$ ; (ii) follows because  $P'_1(t), ..., P'_N(t)$  are independent with zero mean, and hence  $\mathbb{E}[P'_i(t)P'_j(t)] = 0$  for any  $i \neq j$ ; (iii) follows because the  $P'_i(t)$  values are identically distributed, and hence  $\mathbb{E}[P_i'^2(t)]$  is the same for all  $i$ ; (v) follows because all  $P_{ij}(t)$  values are i.i.d., and hence  $\operatorname{Var}[P_i(t)] = \operatorname{Var}[P_{i1}(t) + ... + P_{iM}(t)] = M \operatorname{Var}[P_{ij}(t)]$ .
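The per-port result (B.4) can be checked the same way. The following sketch enumerates all outcomes of a hypothetical two-level load and averages the squared difference between the mean domain power and the power of one port; because the common mean cancels in that difference, the raw powers $P_i(t)$ can be used directly instead of $P'_i(t)$. All names and numeric values here are assumptions chosen for illustration.

```python
from itertools import product


def fully_coupled_port_loss(r_out, v0, m, n, var0):
    """Expected loss at one port of a fully-coupled DPP, following (B.4)."""
    return r_out / v0**2 * m * (n - 1) / n * var0


def enumerated_port_loss(r_out, v0, m, n, values, port=0):
    """Average (R_out / V0^2) * (mean domain power - P_port)^2 over all outcomes."""
    total = 0.0
    count = 0
    for outcome in product(values, repeat=m * n):
        # Group the M loads of each domain and add them up to get P_1..P_N.
        domains = [sum(outcome[d * m:(d + 1) * m]) for d in range(n)]
        mean_power = sum(domains) / n
        total += (mean_power - domains[port]) ** 2
        count += 1
    return r_out / v0**2 * total / count


values = (0.0, 2.0)   # hypothetical two-level load power in watts
var0 = 1.0            # variance of that two-level load, in W^2
model = fully_coupled_port_loss(0.4, 5.0, 2, 3, var0)
check = enumerated_port_loss(0.4, 5.0, 2, 3, values, port=1)
print(model, check)
```

Changing `port` should not change the enumerated value, which reflects the statement that every port sees the same loss distribution under i.i.d. loading.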

In a ladder DPP, power loss varies among submodules. Similar to (B.4), the expected power loss at the  $i^{th}$  submodule can be calculated from (4.13) as

$$\mathbb{E}[P_{loss.i}(t)] = \frac{R_{out}}{V_0^2} \mathbb{E}\left[\left(i\overline{P}'(t) - \sum_{k=1}^{i} P_k'(t)\right)^2\right] \\
= \frac{R_{out}}{V_0^2} \mathbb{E}\left[\left(\sum_{k=1}^{i} \left(\frac{i}{N} - 1\right) P_k'(t) + \sum_{k=i+1}^{N} \frac{i}{N} P_k'(t)\right)^2\right] \\
= \frac{R_{out}}{V_0^2} \left\{\sum_{k=1}^{i} \left(\frac{i}{N} - 1\right)^2 \mathbb{E}[P_k'^2(t)] + \sum_{k=i+1}^{N} \frac{i^2}{N^2} \mathbb{E}[P_k'^2(t)]\right\} \\
= \frac{R_{out}}{V_0^2} \frac{(N-i)i}{N} \mathbb{E}[P_k'^2(t)] \\
= \frac{R_{out}}{V_0^2} \frac{M(N-i)i}{N} \underbrace{\operatorname{Var}[P_{ij}(t)]}_{=\sigma_0^2}. \tag{B.5}$$
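To make the position dependence in (B.5) concrete, the short sketch below tabulates the expected loss for each submodule index. The parameter values are borrowed from the case study in Section B.3 purely for illustration (that study uses a fully-coupled converter, not a ladder DPP), and the function name is hypothetical.

```python
def ladder_loss_profile(r_out, v0, m, n, var0):
    """Expected loss at submodules i = 1..N of a ladder DPP, following (B.5)."""
    scale = r_out / v0**2 * m * var0
    return [scale * (n - i) * i / n for i in range(1, n + 1)]


# Illustrative parameters only: R_out = 0.4 ohm, V0 = 5 V, M = 1, N = 10, variance 0.17 W^2.
profile = ladder_loss_profile(r_out=0.4, v0=5.0, m=1, n=10, var0=0.17)
worst = max(range(len(profile)), key=profile.__getitem__) + 1   # 1-based index
for i, loss in enumerate(profile, start=1):
    print(f"submodule {i}: {loss * 1e3:.2f} mW")
print("largest expected loss at submodule", worst)
```

The factor $(N-i)i/N$ is symmetric about $i = N/2$ and vanishes at $i = N$, so the profile peaks at the middle of the stack.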

#### B.1.2 Expected Power Loss with Correlated Load

In Section 4.3.5, load correlation is considered to generalize the stochastic loss model. The i.i.d. condition is relaxed so that each load power has an identical probability distribution but is not necessarily independent. In this case,  $\mathbb{E}[P_{ij}(t)]$  and  $\text{Var}[P_{ij}(t)]$  are still identical for each load;  $\mathbb{E}[P_i(t)] = M\mu_0$  is identical for each domain, but  $\text{Var}[P_i(t)] = M\sigma_0^2 + 2\sum_{1 \le k < l \le M} \text{Cov}[P_{ik}(t), P_{il}(t)]$  might vary among domains due to load correlation. The expected total power loss of a fully-coupled DPP in (4.18) can then be derived as

$$\mathbb{E}[P_{loss}(t)] = \frac{R_{out}}{V_0^2} \mathbb{E}\left[\sum_{k=1}^{N} \left(\overline{P}(t) - P_k(t)\right)^2\right] \\
= \frac{R_{out}}{V_0^2} \mathbb{E}\left[\frac{1}{N}\left((N-1)\sum_{k=1}^{N} P_k^2(t) - 2\sum_{1 \le i < j \le N} P_i(t)P_j(t)\right)\right] \\
= \frac{R_{out}}{NV_0^2} \left\{(N-1)\sum_{k=1}^{N} \mathbb{E}[P_k^2(t)] - 2\sum_{1 \le i < j \le N} \mathbb{E}[P_i(t)P_j(t)]\right\} \\
\stackrel{\text{(i)}}{=} \frac{R_{out}}{NV_0^2} \left\{(N-1)\sum_{k=1}^{N} \left(\mathbb{E}[P_k^2(t)] - \mathbb{E}^2[P_k(t)]\right) - 2\sum_{1 \le i < j \le N} \text{Cov}[P_i(t), P_j(t)]\right\} \\
= \frac{R_{out}}{NV_0^2} \left\{(N-1)\sum_{k=1}^{N} \text{Var}[P_k(t)] - 2\sum_{1 \le i < j \le N} \text{Cov}[P_i(t), P_j(t)]\right\}. \tag{B.6}$$

Line (i) follows because  $\mathbb{E}[P_i(t)P_j(t)] = \mathbb{E}[P_i(t)] \times \mathbb{E}[P_j(t)] + \text{Cov}[P_i(t), P_j(t)]$ , and  $\mathbb{E}[P_i(t)]$  is identical for  $i = 1, ..., N$ , so the squared-mean terms cancel. Eq. (4.21) is obtained by rearranging (B.6) as

$$\mathbb{E}[P_{loss}(t)] = \frac{R_{out}}{NV_0^2} \left\{ N \sum_{k=1}^{N} \text{Var}[P_k(t)] - \left( \sum_{k=1}^{N} \text{Var}[P_k(t)] + 2 \sum_{1 \le i < j \le N} \text{Cov}[P_i(t), P_j(t)] \right) \right\}$$

$$\stackrel{\text{(i)}}{=} \frac{R_{out}}{V_0^2} \left\{ \sum_{k=1}^{N} \text{Var}[P_k(t)] - \frac{\text{Var}\left[\sum_{k=1}^{N} P_k(t)\right]}{N} \right\}. \tag{B.7}$$

Here, (i) holds because  $\operatorname{Var}[X_1 + X_2 + ... + X_N] = \sum_{k=1}^N \operatorname{Var}[X_k] + 2 \sum_{1 \le k < l \le N} \operatorname{Cov}[X_k, X_l]$ .
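The equivalence of (B.6) and (B.7) can be verified numerically for any covariance matrix of the domain powers. The sketch below uses a hypothetical equicorrelated matrix (equal variances and a single pairwise correlation coefficient `rho`), which is an assumption made only for this example; the function names are likewise illustrative.

```python
def loss_from_b6(r_out, v0, cov):
    """Total expected loss from (B.6), given the covariance matrix of P_1..P_N."""
    n = len(cov)
    var_sum = sum(cov[k][k] for k in range(n))
    cov_sum = sum(cov[i][j] for i in range(n) for j in range(i + 1, n))
    return r_out / (n * v0**2) * ((n - 1) * var_sum - 2 * cov_sum)


def loss_from_b7(r_out, v0, cov):
    """Total expected loss from (B.7); Var[sum of P_k] is the sum of all entries."""
    n = len(cov)
    var_sum = sum(cov[k][k] for k in range(n))
    var_of_total = sum(sum(row) for row in cov)
    return r_out / v0**2 * (var_sum - var_of_total / n)


def equicorrelated_cov(n, var, rho):
    """Hypothetical covariance matrix: equal variances, one shared correlation rho."""
    return [[var if i == j else rho * var for j in range(n)] for i in range(n)]


for rho in (0.0, 0.5, 1.0):
    cov = equicorrelated_cov(10, 0.17, rho)
    print(rho, loss_from_b6(0.4, 5.0, cov), loss_from_b7(0.4, 5.0, cov))
```

For this particular matrix both expressions reduce to $\frac{R_{out}}{V_0^2}(N-1)(1-\rho)\operatorname{Var}[P_k(t)]$, so the loss shrinks as the domain powers become more positively correlated and vanishes when they move together exactly.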

#### B.2 Loss Analysis for SC-based DPP Topologies

This appendix provides a detailed power loss analysis of the Dickson-SC DPP (Fig. 4.10c) and the ladder-SC DPP (Fig. 4.11c). As pointed out in Section 4.3.4, if capacitor charge-sharing loss is dominant, the Dickson-SC DPP operates as a fully-coupled DPP, while the ladder-SC DPP operates as a ladder DPP. When conduction loss is dominant, the two SC-DPP topologies are equivalent, and both work as a fully-coupled DPP. The reasons are as follows.



Figure B.1: Current flow in a Dickson-SC DPP during: (a) phase 1; (b) phase 2. Current flow (in blue) on the left of the dashed line is the average current per period; current flow (in red) on the right is the average current per phase.

In SC-based dc-dc converters, there are two asymptotic limits for power loss: the slow switching limit (SSL) and the fast switching limit (FSL) [93]. At SSL, the power loss of an SC converter is mainly caused by the charge sharing between capacitors and voltage sources, and the current tends to be impulsive. The charge sharing loss decreases as the capacitor size or switching frequency increases. If the capacitance or switching frequency is high enough so that the capacitors can be treated as fixed voltage sources and current flows are close to constant, FSL is reached. At FSL, charge sharing loss is negligible. The SC converter power loss becomes independent of the capacitance and switching frequency, and is dominated by the conduction loss of the current path resistance, which is mainly comprised of the switch  $R_{ds(on)}$ .

Figs. B.1-B.2 illustrate the current flow of the Dickson-SC DPP and the ladder-SC DPP in two phases. In the figures, the load current of the  $i^{th}$  domain  $(I_i)$  and its mismatched current  $(\Delta I_i = \overline{I} - I_i)$  are average currents per period and are labeled in blue. Currents labeled in red are average currents per phase that flow through each switch or capacitor in one phase, and their values can be obtained from the amp-second balance of the capacitors. For example, in the Dickson-SC DPP, the total charge transferred from the first voltage domain to the Dickson-SC DPP in one switching cycle is  $\Delta I_1/f_{sw}$ , which is also the total charge delivered through the upper switch of the first domain in one switching cycle, because the average charge transfer per cycle of the buffering capacitor  $C_{out}$  is zero due to amp-second balance. Since the upper switch conducts only during phase one, i.e., for half of the cycle, the average current that flows through the upper switch in phase one should be  $2\Delta I_1$ . Similarly, current flows of other components as well as current flows in the ladder-SC DPP can be obtained. Since the energy-buffering capacitor at each domain has a large capacitance with stable voltage, its charge sharing loss is negligible, so its current flow is not explored.

If an SC-based DPP topology works at SSL, capacitor charge sharing loss dominates. As indicated in Fig. B.1, the charge transfer of the  $i^{th}$  flying capacitor in a Dickson-SC DPP is  $\Delta I_i/f_{sw}$  per phase, which is only related to the mismatched current at the  $i^{th}$  domain. Thus, it is categorized as a fully-coupled DPP. The charge sharing loss attributed to the  $i^{th}$  flying capacitor at SSL is

$$P_{loss.i} = \frac{\Delta Q_i^2}{C_{Fly}} \cdot f_{sw} = \frac{\Delta I_i^2}{C_{Fly} \cdot f_{sw}}. \tag{B.8}$$

In Fig. B.2, the charge transfer of the  $i^{th}$  flying capacitor in a ladder-SC DPP is  $\sum_{k=1}^{i} \Delta I_k / f_{sw}$  per phase, indicating that there is differential power accumulation through the flying capacitors in the ladder-SC DPP, so it is classified as a ladder-DPP. The charge sharing loss of the  $i^{th}$  flying capacitor at SSL in the ladder-SC DPP is

$$P_{loss.i} = \frac{\Delta Q_i^2}{C_{Fly}} \cdot f_{sw} = \frac{(\sum_{k=1}^i \Delta I_k)^2}{C_{Fly} \cdot f_{sw}} = \frac{\Delta I_{i \leftrightarrow i+1}^2}{C_{Fly} \cdot f_{sw}}. \tag{B.9}$$

Based on Eqs. (B.8)-(B.9), the output resistance  $R_{out}$  defined in Fig. 4.7b and Fig. 4.7c for the two SC-DPP topologies, respectively, is  $\frac{1}{C_{Fly} \cdot f_{sw}}$  in both cases.



Figure B.2: Current flow in a ladder-SC DPP during: (a) phase 1; (b) phase 2. Color code is the same as that of Fig. B.1.

When the SC-based DPP topologies work at FSL, switch conduction loss dominates, and current flows are close to constant. In this case, the Dickson-SC DPP and ladder-SC DPP are equivalent and conduct the same current through each corresponding switch, as indicated in Figs. B.1-B.2. For both SC-DPP topologies, each of the two switches located at the  $i^{th}$  voltage domain conducts a constant current of  $2\Delta I_i$  for half of the switching cycle, so the conduction loss generated at the  $i^{th}$  domain is

$$P_{loss.i} = 2 \times I_{RMS}^2 \cdot R_{ds(on)} = 4\Delta I_i^2 \cdot R_{ds(on)}. \tag{B.10}$$

Eq. (B.10) indicates that the conduction loss of switches at one voltage domain is only related to the mismatched current/power of that domain and there is no differential power accumulation along the stacked voltage domains. As a result, the loss characteristics of both the Dickson-SC DPP and the ladder-SC DPP are similar to a fully-coupled DPP at FSL.
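The three loss expressions (B.8)-(B.10) can be compared side by side for a given set of domain currents. The sketch below is a minimal illustration under assumed values: the three domain currents, the 10 µF flying capacitance, the 100 kHz switching frequency, and all function names are hypothetical and are not taken from the prototype.

```python
def mismatch_currents(domain_currents):
    """Delta I_i = mean current - I_i for each voltage domain."""
    mean_current = sum(domain_currents) / len(domain_currents)
    return [mean_current - i for i in domain_currents]


def dickson_ssl_loss(delta_i, c_fly, f_sw):
    """Per-capacitor charge sharing loss at SSL, following (B.8)."""
    return [d**2 / (c_fly * f_sw) for d in delta_i]


def ladder_ssl_loss(delta_i, c_fly, f_sw):
    """Per-capacitor charge sharing loss at SSL, following (B.9)."""
    losses, running = [], 0.0
    for d in delta_i:
        running += d                      # accumulated differential current
        losses.append(running**2 / (c_fly * f_sw))
    return losses


def fsl_conduction_loss(delta_i, r_ds_on):
    """Per-domain switch conduction loss at FSL, following (B.10)."""
    return [4 * d**2 * r_ds_on for d in delta_i]


# Hypothetical operating point: three domains, 10 uF flying capacitors, 100 kHz.
delta_i = mismatch_currents([1.0, 2.0, 3.0])
print(dickson_ssl_loss(delta_i, 10e-6, 100e3))
print(ladder_ssl_loss(delta_i, 10e-6, 100e3))
print(fsl_conduction_loss(delta_i, 0.1))
```

In this example the middle domain is perfectly matched, so its Dickson SSL loss and FSL loss are zero, while the ladder SSL expression still assigns loss to that index because the mismatch of the first domain is carried through it. The last ladder entry is zero because the mismatched currents sum to zero over the whole stack.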



Figure B.3: Distribution of expected switch conduction loss at FSL (in blue) and capacitor charge loss at SSL (in red) of Dickson-SC DPP and ladder-SC DPP.

If all the domain load currents  $I_i$  (i = 1, ..., N) are i.i.d. random variables, the expected (or average) power loss of Eqs. (B.8)-(B.10) can be calculated in the same way as in Section 4.3.3. Fig. B.3 plots the expected distribution of the switch conduction loss at FSL and of the capacitor charge sharing loss at SSL in the two SC-DPP systems. For the Dickson-SC DPP, both the expected conduction loss at FSL and the expected charge sharing loss at SSL are uniform across the voltage domains. For the ladder-SC DPP, the expected conduction loss at FSL is uniform, but the expected charge sharing loss at SSL varies among the flying capacitors. Therefore, if a ladder-SC DPP system works at SSL, a higher power loss is expected in the middle of the stacked voltage domains.

#### B.3 Application Study and Model Verification on a Data Storage Server

To further verify the model in a practical application, we recorded the power profiles of a data storage server (in [22]) and applied them to a SPICE simulation (PLECS v4.5).



Figure B.4: Power consumption waveforms of two example voltage domains when the data storage server is running a random read/write program.



Figure B.5: Probability distribution and correlation of the two example domain powers: (a) power distribution histogram of domain A; (b) power distribution of domain B; (c) correlation plot of domain A power and domain B power.

The data storage server contains ten series voltage domains. Each domain supplies 5 V to multiple parallel hard disk drives (HDDs). A random read/write program was running on the server. Fig. B.4 shows power waveforms of two example voltage domains. Probability distributions of the two domain powers and their correlation are plotted in Fig. B.5, indicating that the ten measured domain powers are i.i.d. Differential current waveforms of the two voltage domains are plotted in Fig. B.6.

In the SPICE simulation, a DPP system with ten series domains was built and supported by the MAC-DPP converter (Fig. 4.10a). Here, each domain contains one random load with the recorded domain power profile, so in this system  $N = 10, M = 1, V_0 = 5 \text{ V}, \mu_0 = 9.2 \text{ W}, \text{ and } \sigma_0^2 = 0.17 \text{ W}^2$ . For the DPP converter, each switch  $R_{ds(on)}$  is set as 0.1  $\Omega$  and each winding resistance is set as 0.2  $\Omega$ , yielding  $R_{out} = 0.4 \Omega$ . Table B.1 lists the average power consumption and DPP power loss of each voltage domain and of the total system. It also compares the modeled system power loss (based on (4.12)) to the simulated system power loss. As shown in the table, the modeled system loss (24.5 mW) is about 6% below the simulated system loss (26.1 mW), validating the stochastic loss model.



Figure B.6: Differential current  $(\Delta I_i)$  of the two example voltage domains.

Table B.1: Average Power Consumption and DPP Power Loss of Each Voltage Domain and of the Total System

| Domain              | #1   | #2   | #3   | #4   | #5   | #6   | #7   | #8   | #9   | #10  | Simulated<br>Domain<br>Average | Simulated<br>Total<br>System | Modeled<br>Total<br>System |
|---------------------|------|------|------|------|------|------|------|------|------|------|--------------------------------|------------------------------|----------------------------|
| Average Power [W]   | 9.10 | 9.17 | 9.13 | 9.21 | 9.19 | 9.12 | 9.10 | 9.17 | 9.19 | 9.19 | 9.16                           | 91.6                         | 91.6                       |
| DPP Power Loss [mW] | 2.73 | 2.62 | 2.51 | 2.52 | 2.67 | 2.66 | 2.68 | 2.53 | 2.46 | 2.70 | 2.61                           | 26.1                         | 24.5                       |

<sup>\*</sup> The system operated for 60 seconds. Conduction losses are considered. Switching loss, core loss, control, and other auxiliary losses are not included.
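The modeled total loss quoted for this case study can be reproduced with a few lines of arithmetic. The sketch below assumes that the total expected loss of the fully-coupled DPP is $N$ times the per-port result (B.4), i.e. $\frac{R_{out}}{V_0^2} M(N-1)\sigma_0^2$; this is an inference from the derivation in Section B.1.1 rather than a restatement of (4.12), and the function name is hypothetical.

```python
def modeled_total_loss(r_out, v0, m, n, var0):
    """N times the per-port expected loss of (B.4) for a fully-coupled DPP."""
    return r_out / v0**2 * m * (n - 1) * var0


# Case-study parameters: R_out = 0.4 ohm, V0 = 5 V, M = 1, N = 10, variance 0.17 W^2.
modeled = modeled_total_loss(r_out=0.4, v0=5.0, m=1, n=10, var0=0.17)   # watts
simulated = 26.1e-3                                                     # watts, from Table B.1
deviation = (simulated - modeled) / simulated
print(f"modeled loss: {modeled * 1e3:.1f} mW")
print(f"deviation from simulation: {deviation:.1%}")
```

Under this assumption the modeled value lands at roughly 24.5 mW, consistent with the modeled total reported in Table B.1.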

## Bibliography

- [1] K. Radhakrishnan, M. Swaminathan, and B. K. Bhattacharyya, "Power delivery for high-performance microprocessors challenges, solutions, and future trends," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 11, no. 4, pp. 655–671, 2021.
- [2] M. A. Abdou, A. Team, et al., "Exploring novel high power density concepts for attractive fusion systems," *Fusion Engineering and Design*, vol. 45, no. 2, pp. 145–167, 1999.
- [3] A. Lidow, M. De Rooij, J. Strydom, D. Reusch, and J. Glaser, "Chapter I: GaN technology overview," in *GaN Transistors for Efficient Power Conversion*, 3rd Edition, John Wiley & Sons, 2019.
- [4] E. Johnson, "Physical limitations on frequency and power parameters of transistors," in *Proc. IEEE International Convention Record*, vol. 13, pp. 27–34, 1965.
- [5] R. Keyes, "Figure of merit for semiconductors for high-speed switches," *Proceedings of the IEEE*, vol. 60, no. 2, pp. 225–225, 1972.
- [6] B. J. Baliga, "Semiconductors for high-voltage, vertical channel field-effect transistors," *Journal of Applied Physics*, vol. 53, no. 3, pp. 1759–1764, 1982.
- [7] A. Huang, "New unipolar switching power device figures of merit," *IEEE Electron Device Letters*, vol. 25, no. 5, pp. 298–301, 2004.
- [8] C. R. Sullivan, B. A. Reese, A. L. F. Stein, and P. A. Kyaw, "On size and magnetics: Why small efficient power inductors are rare," in *Proc. International Symposium on 3D Power Electronics Integration and Manufacturing (3D-PEIM)*, pp. 1–23, 2016.
- [9] I. Bassett, "Constant frequency ZVS converter with integrated magnetics," in *Proc. IEEE Applied Power Electronics Conference and Exposition*, pp. 709–716, 1992.
- [10] A. Kats, G. Ivensky, and S. Ben-Yaakov, "Application of integrated magnetics in resonant converters," in *Proc. IEEE Applied Power Electronics Conference*, vol. 2, pp. 925–930, 1997.

- [11] W. Chen, G. Hua, D. Sable, and F. Lee, "Design of high efficiency, low profile, low voltage converter with integrated magnetics," in *Proc. IEEE Applied Power Electronics Conference*, vol. 2, pp. 911–917, 1997.
- [12] M. H. Ahmed, A. Nabih, F. C. Lee, and Q. Li, "Low-loss integrated inductor and transformer structure and application in regulated LLC converter for 48-V bus converter," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 8, no. 1, pp. 589–600, 2020.
- [13] B.-K. Kang, S.-K. Chung, and D.-S. Oh, "Integrated magnetics for boost PFC and flyback converters with phase-shifted PWM," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1018–1024, 2013.
- [14] P.-L. Wong, Performance Improvements of Multi-Channel Interleaving Voltage Regulator Modules with Integrated Coupling Inductors. PhD thesis, Virginia Polytechnic Institute and State University, 2001.
- [15] J. Li, A. Stratakos, A. Schultz, and C. Sullivan, "Using coupled inductors to enhance transient performance of multi-phase buck converters," in *Proc. IEEE Applied Power Electronics Conference and Exposition*, vol. 2, pp. 1289–1293, 2004.
- [16] Murata, "Murata design support software SimSurfing." https://www.murata.com/en-us/tool/simsurfing. Accessed: 2022-04-06.
- [17] M. Makowski and D. Maksimovic, "Performance limits of switched-capacitor dc-dc converters," in *Proc. IEEE Power Electronics Specialist Conference*, vol. 2, pp. 1215–1221, 1995.
- [18] C. Pascual and P. Krein, "Switched capacitor system for automatic series battery equalization," in *Proc. IEEE Applied Power Electronics Conference*, vol. 2, pp. 848–854, 1997.
- [19] J. M. Henry and J. W. Kimball, "Practical performance analysis of complex switched-capacitor converters," *IEEE Transactions on Power Electronics*, vol. 26, no. 1, pp. 127–136, 2011.
- [20] Z. Ye, Hybrid Switched-Capacitor Power Converters: Fundamental Limits and Design Techniques. PhD thesis, UC Berkeley, 2020.
- [21] Infineon, "3600W, 385V to 52V LLC dc-dc demonstration board using CoolGaN 600V e-mode HEMT IGT60R070D1." https://www.infineon.com/cms/jp/product/evaluation-boards/eval\_3k6w\_llc\_gan/. Accessed: 2023-01-06.
- [22] P. Wang, Y. Chen, J. Yuan, R. C. N. Pilawa-Podgurski, and M. Chen, "Differential power processing for ultra-efficient data storage," *IEEE Transactions on Power Electronics*, vol. 36, no. 4, pp. 4269–4286, 2021.

- [23] M. Chen, Merged Multi-Stage Power Conversion: A Hybrid Switched-Capacitor/Magnetics Approach. PhD thesis, Massachusetts Institute of Technology, 2015.
- [24] W. Tabisz, M. Jovanovic, and F. Lee, "Present and future of distributed power systems," in *Proc. IEEE Applied Power Electronics Conference and Exposition*, pp. 11–18, 1992.
- [25] Y. Xi and P. Jain, "Distributed on-board power supply architectures for electronic cards employing high-speed low-voltage semiconductor circuits," in *Proc. IEEE Annual Power Electronics Specialists Conference*, vol. 6, pp. 4333–4339, 2004.
- [26] J. Kassakian and M. Schlecht, "High-frequency high-density converters for distributed power supply systems," *Proceedings of the IEEE*, vol. 76, no. 4, pp. 362–376, 1988.
- [27] B. Miwa, L. Casey, and M. Schlecht, "Copper-based hybrid fabrication of a 50 W, 5 MHz 40 V-5 V dc/dc converter," *IEEE Transactions on Power Electronics*, vol. 6, no. 1, pp. 2–10, 1991.
- [28] D. J. Perreault, J. Hu, J. M. Rivas, Y. Han, O. Leitermann, R. C. Pilawa-Podgurski, A. Sagneri, and C. R. Sullivan, "Opportunities and challenges in very high frequency power conversion," in *Proc. IEEE Applied Power Electronics Conference and Exposition*, pp. 1–14, 2009.
- [29] J. Rivas, R. Wahby, J. Shafran, and D. Perreault, "New architectures for radio-frequency dc-dc power conversion," *IEEE Transactions on Power Electronics*, vol. 21, no. 2, pp. 380–393, 2006.
- [30] Y. Chen, P. Wang, Y. Elasser, and M. Chen, "Multicell reconfigurable multi-input multi-output energy router architecture," *IEEE Transactions on Power Electronics*, vol. 35, no. 12, pp. 13210–13224, 2020.
- [31] J. Baek, Y. Elasser, K. Radhakrishnan, H. Gan, J. P. Douglas, H. K. Krishnamurthy, X. Li, S. Jiang, C. R. Sullivan, and M. Chen, "Vertical stacked LEGO-PoL CPU voltage regulator," *IEEE Transactions on Power Electronics*, vol. 37, no. 6, pp. 6305–6322, 2022.
- [32] D. J. Perreault, Design and Evaluation of Cellular Power Converter Architectures. PhD thesis, Massachusetts Institute of Technology, 1997.
- [33] B. A. Miwa, Interleaved Conversion Techniques for High Density Power Supplies. PhD thesis, Massachusetts Institute of Technology, 1992.
- [34] Y. Qiu, K. Yao, Y. Meng, M. Xu, F. Lee, and M. Ye, "Control-loop bandwidth limitations for multiphase interleaving buck converters," in *Proc. IEEE Applied Power Electronics Conference and Exposition*, vol. 2, pp. 1322–1328, 2004.

- [35] Würth Elektronik, "Coupled inductor confusion," in *Proc. IEEE APEC Virtual Conference*, 2020. https://www.youtube.com/watch?v=e3o76v208JQ. Accessed: 2023-01-07.
- [36] J. G. Hayes, K. J. Hartnett, and M. Rylko, "Split-Winding Integrated Magnetic Structure," June 6 2013. US Patent App. 13/602,727.
- [37] M. H. Ahmed, M. A. de Rooij, and J. Wang, "High-power density, 900-W LLC converters for servers using GaN FETs: Toward greater efficiency and power density in 48 V to 6/12 V converters," *IEEE Power Electronics Magazine*, vol. 6, no. 1, pp. 40–47, 2019.
- [38] ViTEC, "Multi-Phase SMD Coupled Inductor Designed for Advanced Voltage Regulator Modules." https://www.viteccorp.com/data/af4268.pdf. Accessed: 2023-01-07.
- [39] E. A. Burton, G. Schrom, F. Paillet, J. Douglas, W. J. Lambert, K. Radhakrishnan, and M. J. Hill, "FIVR - Fully integrated voltage regulators on 4th generation Intel® Core™ SoCs," in *Proc. IEEE Applied Power Electronics Conference and Exposition*, pp. 432–439, 2014.
- [40] C. Ó. Mathúna and S. O'Driscoll, "Heterogeneous integration of power conversion using power supply on chip and power supply in package," in *Proc. IEEE European Conference on Power Electronics and Applications (EPE'22 ECCE Europe)*, pp. 1–2, 2022.
- [41] M. Chen and C. Sullivan, "Unified models for coupled inductors applied to multiphase PWM converters," *IEEE Transactions on Power Electronics*, vol. 36, no. 12, pp. 14155–14174, 2021.
- [42] C. Zhao, S. Round, and J. Kolar, "An isolated three-port bidirectional dc-dc converter with decoupled power flow management," *IEEE Transactions on Power Electronics*, vol. 23, no. 5, pp. 2443–2453, 2008.
- [43] P. Wang and M. Chen, "Analysis and design of series voltage compensator for differential power processing," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 10, no. 6, pp. 7890–7903, 2022.
- [44] Y. Chen, Y. Elasser, P. Wang, J. Baek, and M. Chen, "Turbo-MMC: Minimizing the submodule capacitor size in modular multilevel converters with a matrix charge balancer," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–8, 2019.
- [45] M. Liu, P. Wang, Y. Guan, and M. Chen, "A 13.56 MHz multiport-wireless-coupled (MWC) battery balancer with high frequency online electrochemical impedance spectroscopy," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 537–544, 2019.

- [46] S. Cuk, "A new zero-ripple switching dc-to-dc converter and integrated magnetics," in *Proc. IEEE Power Electronics Specialists Conference*, vol. 19, pp. 57–75, 1983.
- [47] S. Cuk and Z. Zhang, "Coupled-inductor analysis and design," in *Proc. IEEE Power Electronics Specialists Conference*, pp. 655–665, 1986.
- [48] J. Betten, "Benefits of a coupled-inductor SEPIC converter," *Texas Instruments Analog Applications Journal*, 2011.
- [49] K. Yao, M. Ye, M. Xu, and F. Lee, "Tapped-inductor buck converter for high-step-down dc-dc conversion," *IEEE Transactions on Power Electronics*, vol. 20, no. 4, pp. 775–780, 2005.
- [50] Q. Zhao, F. Tao, and F. Lee, "A front-end dc/dc converter for network server applications," in *Proc. IEEE Annual Power Electronics Specialists Conference*, vol. 3, pp. 1535–1539, 2001.
- [51] Intel, "Desktop platform form factors power supply design guide." https://www.intel.com/content/dam/www/public/us/en/documents/guides/power-supply-design-guide-june.pdf. Accessed: 2023-01-04.
- [52] Y. Dong, Investigation of Multiphase Coupled-Inductor Buck Converters in Point-of-Load Applications. PhD thesis, Virginia Polytechnic Institute and State University, 2009.
- [53] C. Shi, A. Khaligh, and H. Wang, "Interleaved SEPIC power factor preregulator using coupled inductors in discontinuous conduction mode with wide output voltage," *IEEE Transactions on Industry Applications*, vol. 52, no. 4, pp. 3461–3471, 2016.
- [54] Y. Chen, P. Wang, H. Cheng, G. Szczeszynski, S. Allen, D. Giuliano, and M. Chen, "Virtual intermediate bus CPU voltage regulator," *IEEE Transactions on Power Electronics*, vol. 37, no. 6, pp. 6883–6898, 2022.
- [55] C. Sullivan and M. Chen, "Coupled inductors for fast-response high-density power delivery: discrete and integrated," in *Proc. IEEE Custom Integrated Circuits Conference (CICC)*, pp. 1–8, 2021.
- [56] A. Ikriannikov and T. Schmid, "Magnetically coupled buck converters," in *Proc. IEEE Energy Conversion Congress and Exposition*, pp. 4948–4954, 2013.
- [57] Y. Elasser, J. Baek, C. Sullivan, and M. Chen, "Modeling and design of vertical multiphase coupled inductors with inductance dual model," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1717–1724, 2021.

- [58] D. Zhou, Y. Elasser, J. Baek, and M. Chen, "Reluctance-based dynamic models for multiphase coupled inductor buck converters," *IEEE Transactions on Power Electronics*, vol. 37, no. 2, pp. 1334–1351, 2022.
- [59] P. Wang, Y. Elasser, V. Yang, and M. Chen, "WAN converter: A family of multicell PWM converter with all-in-one magnetics," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1035–1042, 2022.
- [60] P. Xu, Q. Wu, P.-L. Wong, and F. Lee, "A novel integrated current doubler rectifier," in *Proc. IEEE Applied Power Electronics Conference and Exposition*, vol. 2, pp. 735–740, 2000.
- [61] S. Chandrasekaran, V. Mehrotra, and H. Sun, "A new matrix integrated magnetics (MIM) structure for low voltage, high current dc-dc converters," in *Proc. IEEE Annual Power Electronics Specialists Conference*, vol. 3, pp. 1230–1235, 2002.
- [62] T. Qian and B. Lehman, "Coupled input-series and output-parallel dual interleaved flyback converter for high input voltage application," *IEEE Transactions on Power Electronics*, vol. 23, no. 1, pp. 88–95, 2008.
- [63] M. Noah, K. Umetani, J. Imaoka, and M. Yamamoto, "Lagrangian dynamics model and practical implementation of an integrated transformer in multi-phase LLC resonant converter," *IET Power Electronics*, vol. 11, no. 2, pp. 339–347, 2018.
- [64] T. Ge and K. Ngo, "Omnicoupled inductors (OCI) applied in a resonant cross-commutated buck converter," *IEEE Transactions on Industrial Electronics*, vol. 68, no. 6, pp. 4894–4902, 2021.
- [65] D. Maksimovic, Synthesis of PWM and Quasi-Resonant Dc-to-Dc Power Converters. PhD thesis, California Institute of Technology, 1989.
- [66] Y. Guan, P. Wang, M. Liu, D. Xu, and M. Chen, "MSP-LEGO: Modular series-parallel (MSP) architecture and LEGO building blocks for non-isolated high voltage conversion ratio hybrid dc-dc converters," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 143–150, 2019.
- [67] H. Chen, K. Sabi, H. Kim, T. Harada, R. Erickson, and D. Maksimovic, "A 98.7% efficient composite converter architecture with application-tailored efficiency characteristic," *IEEE Transactions on Power Electronics*, vol. 31, no. 1, pp. 101–110, 2016.
- [68] M. Ahmed, C. Fei, F. Lee, and Q. Li, "Single-stage high-efficiency 48/1 V sigma converter with integrated magnetics," *IEEE Transactions on Industrial Electronics*, vol. 67, no. 1, pp. 192–202, 2020.
- [69] MIT EE Staff, "Magnetic Circuits and Transformers," Cambridge MA, USA: MIT Press, 1943.

- [70] F. E. Terman, "Section 2: Circuit Elements," in *Radio Engineers' Handbook*, New York, NY, USA: McGraw-Hill, 1943.
- [71] E. Cherry, "The duality between interlinked electric and magnetic circuits and the formation of transformer equivalent circuits," *Proceedings of the Physical Society. Section B*, vol. 62, no. 2, pp. 101–111, 1949.
- [72] S.-A. El-Hamamsy and E. Chang, "Magnetics modeling for computer-aided design of power electronics circuits," in *Proc. Annual IEEE Power Electronics Specialists Conference*, vol. 2, pp. 635–645, 1989.
- [73] G. Ludwig and S.-A. El-Hamamsy, "Coupled inductance and reluctance models of magnetic components," *IEEE Transactions on Power Electronics*, vol. 6, no. 2, pp. 240–250, 1991.
- [74] K. Rupp, "50 Years of Microprocessor Trend Data." https://github.com/karlrupp/microprocessor-trend-data. Accessed: 2023-01-29.
- [75] R. Mahajan, B. Penmecha, and K. Radhakrishnan, "Advanced packaging architecture for heterogeneous integration," in *International Workshop on Power Supply on Chip (PwrSoC)*, 2021.
- [76] PC Watch, "GPU Die Size & Process Technology." https://pc.watch.impress.co.jp/img/pcw/docs/752/331/html/6.jpg.html, Apr. 2016. Accessed: 2023-01-29.
- [77] TechPowerUp, "GPU Specs Database." https://www.techpowerup.com/gpu-specs/. Accessed: 2023-01-29.
- [78] R. Dennard, F. Gaensslen, H.-N. Yu, V. Rideout, E. Bassous, and A. LeBlanc, "Design of ion-implanted MOSFET's with very small physical dimensions," *IEEE Journal of Solid-State Circuits*, vol. 9, no. 5, pp. 256–268, 1974.
- [79] S. Borkar and A. A. Chien, "The future of microprocessors," *Communications of the ACM*, vol. 54, no. 5, pp. 67–77, 2011.
- [80] J. Held, J. Bautista, and S. Koehl, "From a few cores to many: A tera-scale computing research overview," White Paper, Intel, 2006.
- [81] S. A. McKee, "Reflections on the memory wall," in *Proc. of the 1st Conference on Computing Frontiers*, p. 162, 2004.
- [82] X. Li and S. Jiang, "Google 48 V rack adaptation and onboard power technology update," in *Open Compute Project (OCP) 2019 Summit*, 2019.
- [83] C. Fei, M. H. Ahmed, F. C. Lee, and Q. Li, "Two-stage 48 V-12 V/6 V-1.8 V voltage regulator module with dynamic bus voltage control for light-load efficiency improvement," *IEEE Transactions on Power Electronics*, vol. 32, no. 7, pp. 5628–5636, 2017.

- [84] M. H. Ahmed, F. C. Lee, and Q. Li, "Two-stage 48-V VRM with intermediate bus voltage optimization for data centers," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 9, no. 1, pp. 702–715, 2021.
- [85] Z. Ye, R. A. Abramson, and R. C. N. Pilawa-Podgurski, "A 48-to-6 V multiresonant-doubler switched-capacitor converter for data center applications," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 475–481, 2020.
- [86] T. Ge, Z. Ye, and R. C. Pilawa-Podgurski, "A 48-to-12 V cascaded multiresonant switched capacitor converter with 4700 W/in<sup>3</sup> power density and 98.9% efficiency," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 1959–1965, 2021.
- [87] S. Jiang, S. Saggini, C. Nan, X. Li, C. Chung, and M. Yazdani, "Switched tank converters," *IEEE Transactions on Power Electronics*, vol. 34, no. 6, pp. 5048–5062, 2019.
- [88] P. S. Shenoy, O. Lazaro, R. Ramani, M. Amaro, W. Wiktor, J. Khayat, and B. Lynch, "A 5 MHz, 12 V, 10 A, monolithically integrated two-phase series capacitor buck converter," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 66–72, 2016.
- [89] G.-S. Seo, R. Das, and H.-P. Le, "Dual inductor hybrid converter for point-of-load voltage regulator modules," *IEEE Transactions on Industry Applications*, vol. 56, no. 1, pp. 367–377, 2020.
- [90] X. Lou and Q. Li, "300A single-stage 48V voltage regulator with multiphase current doubler rectifier and integrated transformer," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1004–1010, 2022.
- [91] J. A. Cobos, A. Castro, Ó. García-Lorenz, J. Cruz, and Á. Cobos, "Direct power converter -DPx- for high gain and high current applications," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1016–1022, 2022.
- [92] R. C. N. Pilawa-Podgurski and D. J. Perreault, "Merged two-stage power converter with soft charging switched-capacitor stage in 180 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 7, pp. 1557–1567, 2012.
- [93] M. Seeman and S. Sanders, "Analysis and optimization of switched-capacitor dc-dc converters," in *Proc. IEEE Workshops on Computers in Power Electronics*, vol. 23, pp. 841–851, 2008.
- [94] Y. Lei and R. C. N. Pilawa-Podgurski, "A general method for analyzing resonant and soft-charging operation of switched-capacitor converters," *IEEE Transactions on Power Electronics*, vol. 30, no. 10, pp. 5650–5664, 2015.

- [95] C. Ó. Mathúna, "PwrSiP power supply in package power system in package," in *Proc. International Symposium on 3D Power Electronics Integration and Manufacturing (3D-PEIM)*, pp. 1–21, 2016.
- [96] P. Wang, D. H. Zhou, Y. Elasser, J. Baek, and M. Chen, "Matrix coupled all-in-one magnetics for PWM power conversion," *IEEE Transactions on Power Electronics*, vol. 37, no. 12, pp. 15035–15050, 2022.
- [97] MPS, "MP86998 Integrated Intelli-Phase™ solution in TLGA package." https://www.monolithicpower.com/en/documentview/productdocument/index/version/2/document\_type/Datasheet/lang/en/sku/MP86998GMJT/, 2020. Accessed: 2023-01-18.
- [98] P. Wang et al., "MSC-PoL: An ultra-thin 220-A/48-to-1-V hybrid GaN-Si CPU VRM with multistack switched capacitor architecture and coupled magnetics," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, 2023.
- [99] M. Chen, Y. Chen, and P. Wang, "Methods, Devices, and Systems for Power Converters," Feb. 2022. US Patent 63/313,256.
- [100] P. Wang, D. Zhou, D. Giuliano, M. Chen, and Y. Chen, "Multistack switched-capacitor architecture with coupled magnetics for 48V-to-1V VRM," in *Proc. IEEE 23rd Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–7, 2022.
- [101] Y. Zhu, T. Ge, Z. Ye, and R. C. Pilawa-Podgurski, "A Dickson-squared hybrid switched-capacitor converter for direct 48 V to point-of-load conversion," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1272–1278, 2022.
- [102] P. Wang et al., "Interphase L-C resonance and stability analysis of series-capacitor buck converters," *IEEE Transactions on Power Electronics*, 2023.
- [103] M. Liao, D. H. Zhou, P. Wang, and M. Chen, "Power Systems on Chiplet: Inductor-linked multi-output switched-capacitor multi-rail power delivery on chiplets," in *Proc. IEEE International Symposium on 3D Power Electronics Integration and Manufacturing (3D-PEIM)*, pp. 1–7, 2023.
- [104] P. S. Shenoy *et al.*, "Automatic current sharing mechanism in the series capacitor buck converter," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 2003–2009, 2015.
- [105] D. H. Zhou, A. Bendory, P. Wang, and M. Chen, "Intrinsic and robust voltage balancing of FCML converters with coupled inductors," in *Proc. IEEE 22nd Workshop on Control and Modelling of Power Electronics (COMPEL)*, pp. 1–8, 2021.

- [106] Z. Ye, Y. Lei, Z. Liao, and R. C. N. Pilawa-Podgurski, "Investigation of capacitor voltage balancing in practical implementations of flying capacitor multilevel converters," *IEEE Transactions on Power Electronics*, vol. 37, no. 3, pp. 2921–2935, 2022.
- [107] B. Oraw and R. Ayyanar, "Small signal modeling and control design for new extended duty ratio, interleaved multiphase synchronous buck converter," in *Proc. International Telecommunications Energy Conference*, pp. 1–8, 2006.
- [108] B. Oraw and R. Ayyanar, "Large signal average model for an extended duty ratio and conventional buck," in *Proc. IEEE International Telecommunications Energy Conference*, pp. 1–8, 2008.
- [109] M. Faccio, G. Ferri, and A. D'Amico, "A new fast method for ladder networks characterization," *IEEE Transactions on Circuits and Systems*, vol. 38, no. 11, pp. 1377–1382, 1991.
- [110] P. Wang, Y. Chen, G. Szczeszynski, S. Allen, D. Giuliano, and M. Chen, "MSC-PoL: Hybrid GaN-Si multistacked switched capacitor 48V PwrSiP VRM for chiplets," *TechRxiv*, Feb. 23, 2023.
- [111] J. Li, T. Abdallah, and C. Sullivan, "Improved calculation of core loss with nonsinusoidal waveforms," in *Proc. IEEE Industry Applications Society Annual Meeting*, vol. 4, pp. 2203–2210, 2001.
- [112] H. Li, D. Serrano, T. Guillod, E. Dogariu, A. Nadler, S. Wang, M. Luo, V. Bansal, Y. Chen, C. R. Sullivan, and M. Chen, "MagNet: An open-source database for data-driven magnetic core loss modeling," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 588–595, 2022.
- [113] J. Zhu and D. Maksimovic, "48 V-to-1 V transformerless stacked active bridge converters with merged regulation stage," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–6, 2020.
- [114] Vicor, "PRM Regulator PRM48BH480T250A00." http://www.vicorpower.com/documents/datasheets/PRM48BH480T250A00\_ds.pdf, 2020. Accessed: 2023-01-29.
- [115] Vicor, "VTM Current Multiplier VTM48MP010x107AA1." http://www.vicorpower.com/documents/datasheets/VTM48M\_010\_107AA1.pdf, 2017. Accessed: 2023-01-29.
- [116] ADI, "LTM4664 54V<sub>IN</sub> Dual 25A, Single 50A μModule Regulator with Digital Power System Management." https://www.analog.com/en/products/ltm4664.html, 2021. Accessed: 2023-01-29.
- [117] H. Cao, X. Yang, C. Xue, L. He, Z. Tan, M. Zhao, Y. Ding, W. Li, and W. Qu, "A 12-level series-capacitor 48-1V dc-dc converter with on-chip switch and GaN hybrid power conversion," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 12, pp. 3628–3638, 2021.
- [118] T. Ge, R. Abramson, Z. Ye, and R. C. Pilawa-Podgurski, "Core size scaling law of two-phase coupled inductors demonstration in a 48-to-1.8 V hybrid switched-capacitor MLB-PoL converter," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1500–1505, 2022.
- [119] N. M. Ellis and R. C. Pilawa-Podgurski, "A symmetric dual-inductor hybrid Dickson converter for direct 48V-to-PoL conversion," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1267–1271, 2022.
- [120] Y. Elasser *et al.*, "Mini-LEGO: A 1.5-MHz 240-A 48-V-to-1-V CPU VRM with 8.4-mm height for vertical power delivery," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1–8, 2023.
- [121] G. Brainard, "Non-Dissipative Battery Charger Equalizer," Dec. 26 1995. US Patent 5,479,083.
- [122] Y.-S. Lee and M.-W. Cheng, "Intelligent control battery equalization for series connected lithium-ion battery strings," *IEEE Transactions on Industrial Electronics*, vol. 52, no. 5, pp. 1297–1307, 2005.
- [123] Y. Ye, K. Cheng, Y. Fong, X. Xue, and J. Lin, "Topology, modeling, and design of switched-capacitor-based cell balancing systems and their balancing exploration," *IEEE Transactions on Power Electronics*, vol. 32, no. 6, pp. 4444–4454, 2017.
- [124] H. Schmidt and C. Siedle, "The charge equalizer - a new system to extend battery lifetime in photovoltaic systems, UPS and electric vehicles," in *Proc. IEEE International Telecommunications Energy Conference*, pp. 146–151, 1993.
- [125] S. Hung, D. Hopkins, and C. Mosling, "Extension of battery life via charge equalization control," *IEEE Transactions on Industrial Electronics*, vol. 40, no. 1, pp. 96–104, 1993.
- [126] A. Imtiaz and F. Khan, ""Time shared flyback converter" based regenerative cell balancing technique for series connected Li-Ion battery strings," *IEEE Transactions on Power Electronics*, vol. 28, no. 12, pp. 5960–5975, 2013.
- [127] N. Kutkut, D. Divan, and D. Novotny, "Charge equalization for series connected battery strings," *IEEE Transactions on Industry Applications*, vol. 31, no. 3, pp. 562–568, 1995.
- [128] S. Dam and V. John, "A modular fast cell-to-cell battery voltage equalizer," *IEEE Transactions on Power Electronics*, vol. 35, no. 9, pp. 9443–9461, 2020.

- [129] M. Evzelman, M. Ur Rehman, K. Hathaway, R. Zane, D. Costinett, and D. Maksimovic, "Active balancing system for electric vehicles with incorporated low-voltage bus," *IEEE Transactions on Power Electronics*, vol. 31, no. 11, pp. 7887–7895, 2016.
- [130] M. Liu, Y. Chen, Y. Elasser, and M. Chen, "Dual frequency hierarchical modular multilayer battery balancer architecture," *IEEE Transactions on Power Electronics*, vol. 36, no. 3, pp. 3099–3110, 2021.
- [131] T. Shimizu, M. Hirakata, T. Kamezawa, and H. Watanabe, "Generation control circuit for photovoltaic modules," *IEEE Transactions on Power Electronics*, vol. 16, no. 3, pp. 293–300, 2001.
- [132] G. Walker and J. Pierce, "Photovoltaic dc-dc module integrated converter for novel cascaded and bypass grid connection topologies design and optimisation," in *Proc. IEEE Power Electronics Specialists Conference*, pp. 1–7, 2006.
- [133] C. Olalla, D. Clement, M. Rodriguez, and D. Maksimovic, "Architectures and control of submodule integrated dc–dc converters for photovoltaic applications," *IEEE Transactions on Power Electronics*, vol. 28, no. 6, pp. 2980–2997, 2013.
- [134] J. Stauth, M. Seeman, and K. Kesarwani, "Resonant switched-capacitor converters for sub-module distributed photovoltaic power management," *IEEE Transactions on Power Electronics*, vol. 28, no. 3, pp. 1189–1198, 2013.
- [135] S. Qin, S. Cady, A. Dominguez-Garcia, and R. Pilawa-Podgurski, "A distributed approach to maximum power point tracking for photovoltaic submodule differential power processing," *IEEE Transactions on Power Electronics*, vol. 30, no. 4, pp. 2024–2040, 2015.
- [136] A. Chang, A.-T. Avestruz, and S. Leeb, "Capacitor-less photovoltaic cell-level power balancing using diffusion charge redistribution," *IEEE Transactions on Power Electronics*, vol. 30, no. 2, pp. 537–546, 2015.
- [137] C. Liu, D. Li, Y. Zheng, and B. Lehman, "Modular differential power processing (mDPP)," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–7, 2017.
- [138] Y.-T. Jeon, H. Lee, K. Kim, and J.-H. Park, "Least power point tracking method for photovoltaic differential power processing systems," *IEEE Transactions on Power Electronics*, vol. 32, no. 3, pp. 1941–1951, 2017.
- [139] E. Candan, P. Shenoy, and R. Pilawa-Podgurski, "A series-stacked power delivery architecture with isolated differential power conversion for data centers," *IEEE Transactions on Power Electronics*, vol. 31, no. 5, pp. 3690–3703, 2016.
- [140] A. Stillwell and R. Pilawa-Podgurski, "A resonant switched-capacitor converter with GaN transistors for series-stacked processors with 99.8% power delivery efficiency," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 563–570, 2015.
- [141] P. Shenoy and P. Krein, "Differential power processing for dc systems," *IEEE Transactions on Power Electronics*, vol. 28, no. 4, pp. 1795–1806, 2013.
- [142] P. T. Krein, R. H. Campbell, and N. R. Shanbhag, "System and Method for Improving Power Conversion for Advanced Electronic Circuits," Aug. 25 2015. US Patent 9,116,692.
- [143] P. Wang, R. Pilawa-Podgurski, P. Krein, and M. Chen, "Stochastic power loss analysis of differential power processing," *IEEE Transactions on Power Electronics*, vol. 37, no. 1, pp. 81–99, 2022.
- [144] P. Wang, Y. Chen, Y. Elasser, and M. Chen, "Small signal model for very-large-scale multi-active-bridge differential power processing (MAB-DPP) architecture," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–8, 2019.
- [145] P. Wang and M. Chen, "Towards power FPGA: Architecture, modeling and control of multiport power converters," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–8, 2018.
- [146] H. Chen, H. Kim, R. Erickson, and D. Maksimovic, "Electrified automotive powertrain architecture using composite dc–dc converters," *IEEE Transactions on Power Electronics*, vol. 32, no. 1, pp. 98–116, 2017.
- [147] J. Zhao, K. Yeates, and Y. Han, "Analysis of high efficiency dc/dc converter processing partial input/output power," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, 2013.
- [148] H. Zhou, J. Zhao, and Y. Han, "PV balancers: Concept, architectures, and realization," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, vol. 30, pp. 3479–3487, 2015.
- [149] E. Moore and T. Wilson, "Basic considerations for dc to dc conversion networks," *IEEE Transactions on Magnetics*, vol. 2, no. 3, pp. 620–624, 1966.
- [150] J. Cobos, H. Cristobal, D. Serrano, R. Ramos, J. Oliver, and P. Alou, "Differential power as a metric to optimize power converters and architectures," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 2168–2175, 2017.
- [151] C. Li and J. Cobos, "Classification of differential power processing architectures based on VA area modeling," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 10, no. 6, pp. 7849–7866, 2022.

- [152] J. Zapata, S. Kouro, G. Carrasco, H. Renaudineau, and T. Meynard, "Analysis of partial power dc–dc converters for two-stage photovoltaic systems," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 7, no. 1, pp. 591–603, 2019.
- [153] C. Li, Y. Bouvier, A. Berrios, P. Alou, J. Oliver, and J. Cobos, "Revisiting "partial power architectures" from the "differential power" perspective," in *Proc. Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–8, 2019.
- [154] K. A. Kim, P. S. Shenoy, and P. T. Krein, "Converter rating analysis for photovoltaic differential power processing systems," *IEEE Transactions on Power Electronics*, vol. 30, no. 4, pp. 1987–1997, 2015.
- [155] Y. Chen, P. Wang, Y. Elasser, and M. Chen, "LEGO-MIMO Architecture: A universal multi-input multi-output (MIMO) power converter with linear extendable group operated (LEGO) power bricks," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 5156–5163, 2019.
- [156] P. Wang, Y. Chen, P. Kushima, Y. Elasser, M. Liu, and M. Chen, "A 99.7% efficient 300 W hard disk drive storage server with multiport ac-coupled differential power processing (MAC-DPP) architecture," in *Proc. IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 5124–5131, 2019.
- [157] Y. Elasser, Y. Chen, P. Wang, and M. Chen, "Sparse operation of multi-winding transformer in multiport-ac-coupled converters," in *Proc. IEEE Workshop on Control and Modeling for Power Electronics (COMPEL)*, pp. 1–8, 2019.
- [158] M. Kasper, D. Bortis, and J. Kolar, "Unified power flow analysis of string current diverters," *Electrical Engineering*, vol. 100, no. 3, pp. 2085–2094, 2018.
- [159] Y. Cao, J. Magerko, T. Navidi, and P. Krein, "Power electronics implementation of dynamic thermal inertia to offset stochastic solar resources in low-energy buildings," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 4, no. 4, pp. 1430–1441, 2016.
- [160] P. Krein and J. Galtieri, "Active management of photovoltaic system variability with power electronics," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 9, no. 6, pp. 6507–6523, 2021.
- [161] P. Shenoy, K. Kim, B. Johnson, and P. Krein, "Differential power processing for increased energy production and reliability of photovoltaic systems," *IEEE Transactions on Power Electronics*, vol. 28, no. 6, pp. 2968–2979, 2013.
- [162] C. Olalla, C. Deline, D. Clement, Y. Levron, M. Rodriguez, and D. Maksimovic, "Performance of power-limited differential power processing architectures in mismatched PV systems," *IEEE Transactions on Power Electronics*, vol. 30, no. 2, pp. 618–631, 2015.

- [163] R. De Doncker, D. Divan, and M. Kheraluwala, "A three-phase soft-switched high-power-density dc/dc converter for high-power applications," *IEEE Transactions on Industry Applications*, vol. 27, no. 1, pp. 63–73, 1991.
- [164] M. Kheraluwala, R. Gascoigne, D. Divan, and E. Baumann, "Performance characterization of a high-power dual active bridge dc-to-dc converter," *IEEE Transactions on Industry Applications*, vol. 28, no. 6, pp. 1294–1301, 1992.
- [165] D. Bertsekas and J. Tsitsiklis, "Section 4.2: Covariance and Correlation," in *Introduction to Probability*, Nashua, NH: Athena Scientific, 2008.
- [166] R. Erickson and D. Maksimovic, "A multiple-winding magnetics model having directly measurable parameters," in *Proc. IEEE Annual Power Electronics Specialists Conference*, vol. 2, pp. 1472–1478, 1998.
- [167] Y. Chen, P. Wang, H. Li, and M. Chen, "Power flow control in multi-active-bridge converters: theories and applications," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, pp. 1500–1507, 2019.
- [168] J. Glover, M. Sarma, and T. Overbye, *Power System Analysis and Design*. Toronto, ON, Canada: Thomson, 2008.
- [169] R. Zimmerman, C. Murillo-Sanchez, and R. Thomas, "MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education," *IEEE Transactions on Power Systems*, vol. 26, no. 1, pp. 12–19, 2011.
- [170] R. Button, "An advanced photovoltaic array regulator module," in *Proc. IEEE Intersociety Energy Conversion Engineering Conference*, vol. 1, pp. 519–524, 1996.
- [171] J. Enslin and D. Snyman, "Combined low-cost, high-efficient inverter, peak power tracker and regulator for PV applications," *IEEE Transactions on Power Electronics*, vol. 6, no. 1, pp. 73–82, 1991.
- [172] B.-D. Min, J.-P. Lee, J.-H. Kim, T.-J. Kim, D.-W. Yoo, and E.-H. Song, "A new topology with high efficiency throughout all load range for photovoltaic PCS," *IEEE Transactions on Industrial Electronics*, vol. 56, no. 11, pp. 4427–4435, 2009.
- [173] F. Xue, R. Yu, and A. Huang, "A family of ultrahigh efficiency fractional dc–dc topologies for high power energy storage device," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 9, no. 2, pp. 1420–1427, 2021.
- [174] M. Shousha, T. Mcrae, A. Prodic, V. Marten, and J. Milios, "Design and implementation of high power density assisting step-up converter with integrated battery balancing feature," *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 5, no. 3, pp. 1068–1077, 2017.

- [175] B. Carsten, "Converter component load factors: A performance limitation of various topologies," in *Proc. Power Conversion International Conference*, 1988.
- [176] M. Kasper, D. Bortis, and J. Kolar, "Classification and comparative evaluation of PV panel-integrated dc-dc converter concepts," *IEEE Transactions on Power Electronics*, vol. 29, no. 5, pp. 2511–2526, 2014.
- [177] R. Erickson and D. Maksimovic, "Section 6.1.2: Cascade Connection of Converters," in *Fundamentals of Power Electronics*, Norwell, MA: Kluwer, 2001.
- [178] Y.-J. Lee, A. Khaligh, A. Chakraborty, and A. Emadi, "Digital combination of buck and boost converters to control a positive buck-boost converter and improve the output transients," *IEEE Transactions on Power Electronics*, vol. 24, no. 5, pp. 1267–1279, 2009.
- [179] M. Chen, M. Araghchini, K. Afridi, J. Lang, C. Sullivan, and D. Perreault, "A systematic approach to modeling impedances and current distribution in planar magnetics," *IEEE Transactions on Power Electronics*, vol. 31, no. 1, pp. 560–580, 2016.
- [180] P. Chen, E. Lee, G. Gibson, R. Katz, and D. Patterson, "RAID: high-performance, reliable secondary storage," *ACM Computing Surveys*, vol. 26, no. 2, pp. 145–185, 1994.
- [181] W. N. Papian, "The MIT magnetic-core memory," in *Proc. Information Processing Systems Reliability and Requirements*, pp. 37–42, 1953.
- [182] M. Liao, H. Li, P. Wang, T. Sen, Y. Chen, and M. Chen, "Machine learning methods for feedforward power flow control of multi-active-bridge converters," *IEEE Transactions on Power Electronics*, vol. 38, no. 2, pp. 1692–1707, 2023.