# Wideband Tunable True-Time-Delay Architecture Using a Variable Order All-Pass Filter and its Applications to Continuous-Time Pulse Processing

A THESIS

submitted by

**IMON MONDAL** 

for the award of the degree

of

DOCTOR OF PHILOSOPHY



# DEPARTMENT OF ELECTRICAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY MADRAS.

December 2017

©2017, Imon Mondal. All rights reserved.

#### THESIS CERTIFICATE

This is to certify that the thesis titled Wideband Tunable True-Time-Delay Architecture Using a Variable Order All-Pass Filter and its Applications to Continuous-Time Pulse Processing, submitted by Imon Mondal, to the Indian Institute of Technology Madras, for the award of the degree of Doctor of Philosophy, is a bona fide record of the research work done by him under my supervision. The contents of this thesis, in full or in parts, have not been submitted to any other Institute or University for the award of any degree or diploma.

Nagendra Krishnapura Associate Professor Dept. of Electrical Engineering IIT-Madras, Chennai 600 036

#### To Sharmistha

for being there through thick and thin

To my parents

for making me the person I am today

#### **ACKNOWLEDGEMENTS**

Reading the acknowledgment section in a thesis often used to be the interesting bit, partly because it used to be the only section I could fully understand. But I never thought starting to write my own would be such a difficult job; not because I have nothing to acknowledge, but because I have too much to. And, as usual, gathering my thoughts before they all got jumbled up again took up some time.

The decision to apply for a Ph.D. was a leap of faith, and the fact that I never thought of looking back was almost entirely due to my guide, Dr. Nagendra Krishnapura. I would like to express my heartfelt thanks for his guidance, unwavering support, encouragement and the innumerable technical discussions that we had. I cannot over-emphasize his contribution in not only shaping my technical abilities, but also in changing the way I think about research. His vision to give us (his research students) an environment that fosters the freedom of individual thinking, while keeping track of everyone's progress, is something that I will always be grateful for. As has often been said before, a research program at times can lead to blind alleys. During these times he knew exactly how to get me back on track; looking back, I really cannot figure out how. His inputs and guidance while writing a research paper have been immense, and his constant endeavour to strive for perfection is inspiring. He has been everything and more that I could have ever asked for in a mentor.

I would like to thank Prof. Shanthi Pavan, whose lectures on Active Filter Design and VLSI Data Conversion Circuits have been constant sources of reference for many years. Thanks for being an inspiring teacher and a researcher.

I would like to thank Prof. Shanthi Pavan, Dr. Nitin Chandrachoodan, Dr. Balakrishna Rao and Prof. Babu Viswanathan for their inputs as my doctoral committee members.

I would also like to thank Dr. Aniruddhan for his lectures on RFIC design, Prof. Pavan Hanumolu for his short course on serial links, Dr. Harish Krishnaswamy for his course on mm-wave circuits and Dr. Sudhakar Pamarti for the course on PLL design.

I would like to thank Mrs. Janaki and Mr. Saranath for quick resolution of administrative and maintenance hurdles.

My guide once told me, "One probably learns more from his peers than from his guide." Even though I don't agree with him on this, he is not too far off either, and for that I have my lab-mates to thank. I had heard somewhere, "You should always get into an argument. If you win, you gain confidence, and if you lose, you gain knowledge." If this is indeed true, I have learnt the most from Sumit. His technical rigor, his propensity to question everything (sometimes to the verge of insanity), and his attitude of not accepting anything at face value have bailed me out multiple times. Thanks to Praveen for introducing me to the world of power amplifiers. Thanks to Abhishek Bhat for our discussions on VCOs. I would like to especially thank Sumit, Praveen and Abhishek for helping me review the thesis draft.

Thanks to Naga Rajesh for our discussions on delay lines and beamforming during the initial part of my work. My brief interaction with Pradeep Shettigar about interesting titbits of analog design has remained memorable. Thanks to Animesh and Ashwin for helping me with any issue with CAD tools. I would like to thank Debasish, Ashwin and Sujith for helping me with my PCB debug. Thanks to Debasish for teaching me basic soldering skills and showing me how not to panic even if I messed up a PCB. I would also like to thank Vallabh, Amrith, Ankesh, Radha, Chithra, Ananya, Aditya, Ashish, Ramakrishna, Abhishek Kumar, Rakshit, Aswani, Ravi Teja, Saravana, Aravind, Naresh, Madhavi, Naveen, Rohit and Priya for making this phase of my life a memorable one.

Among the many cherishable memories that I have made, the one that will stand out is the attitude of each member in the lab to accept, discuss, debate and answer (seemingly) stupid questions over and over again. Among other things, this gave me the license, right from the beginning, to say "I don't know" without having to think twice about being judged, and I have everyone to thank for that.

"When you are stuck, prove that it's not meant to work, and you will find a solution more often than not." Thanks to my friend and ex-colleague, Varun Gupta for this suggestion that he once gave me. This has stood me in good stead time and again.

Thanks to Ashique for many interesting and counter-intuitive discussions over the years.

Throughout this research journey my better half, Sharmistha, has always stood by my side. Not once did she ask me the question every Ph.D. student dreads, "When are you submitting your thesis?" Being my best friend, she understood this journey, and I cannot thank her enough. A long time back she had shown me the beauty of picturing a problem out of the equations embedded in a textbook. I had tried to inculcate the process, and needless to say, as an analog engineer it turned out to be very helpful.

My daughter, Biyas, has been a source of innocence and mischief in my life. I observe her absorb her surroundings and learn from them, and wish I could do so at the same rate too. Also, because of her, I have found in myself a reservoir of patience that I never knew existed.

If whatever happens to us in life is a matter of chance, then I have been blessed. But the luckiest thing to have happened to me was to be born to my loving parents. From my childhood my father taught me how to think, and my mother stood by everything in my life. They have been constant pillars of support on whom I have leaned countless times. Thank you Ma! Thank you Baba!

#### **ABSTRACT**

Delay lines are integral parts of wideband beamforming systems and continuous-time equalizers. Ideal delay lines can only be implemented using lossless transmission lines terminated with their characteristic impedance at both ends. A lumped element realization using an all-pass filter having linear phase can approximate a delay line within its delay bandwidth. The higher the order, the larger the realizable delay. However, all-pass filter architectures reported in the literature are limited to first and second order filters, which can realize only limited delays within a given bandwidth. Larger delays are realized by cascading multiple units of the lower order filters. Cascading introduces parasitic poles, thus causing distortions in magnitude or delay characteristics. This limits the maximum number of cascadable unit cells, in turn limiting the maximum achievable delay.
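As a rough illustration of the statement that a linear-phase all-pass filter approximates a delay only within a limited bandwidth, the sketch below evaluates a first-order all-pass section $H(s) = (1 - s\tau/2)/(1 + s\tau/2)$. This textbook transfer function, the 1 ns value of `tau`, and the function names are assumptions chosen for illustration; they are not the filter architecture proposed in this thesis.

```python
import cmath
import math


def allpass1(omega, tau):
    """First-order all-pass H(jw) = (1 - jw*tau/2) / (1 + jw*tau/2)."""
    s = 1j * omega
    return (1 - s * tau / 2) / (1 + s * tau / 2)


def group_delay(omega, tau):
    """Analytic group delay, -d(phase)/d(omega), of allpass1."""
    return tau / (1 + (omega * tau / 2) ** 2)


def numeric_group_delay(omega, tau, d_omega=1e5):
    """Group delay from a central difference of the phase (sanity check)."""
    phase_hi = cmath.phase(allpass1(omega + d_omega, tau))
    phase_lo = cmath.phase(allpass1(omega - d_omega, tau))
    return -(phase_hi - phase_lo) / (2 * d_omega)


if __name__ == "__main__":
    tau = 1e-9  # assumed low-frequency delay of 1 ns
    for f_hz in (0.0, 0.1e9, 0.3e9, 1e9):
        w = 2 * math.pi * f_hz
        # The magnitude stays at unity; the group delay falls away from tau
        # as the frequency increases.
        print(f_hz, abs(allpass1(w, tau)), group_delay(w, tau))
```

The magnitude is unity at every frequency, while the group delay equals `tau` only at low frequencies and has dropped to `tau / 2` at $\omega = 2/\tau$. This is why a single low-order section offers a small delay-bandwidth product.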

This thesis proposes an all-pass filter architecture that can be generalized to high orders, and can be realized using active circuits. Using this architecture, a compact true-time-delay element with a widely tunable delay and a large delay-bandwidth product is demonstrated. This is useful for beamforming and equalization in the lower GHz range where the use of LC or transmission line based solutions to realize large delays is infeasible. Coarse tuning of delay is realized by changing the filter's order while keeping the bandwidth constant, and fine tuning is implemented by changing the filter's bandwidth, utilizing the delay-bandwidth tradeoff. A test chip fabricated in a 0.13 µm CMOS process demonstrates a delay tuning range of 250 ps-1.7 ns, over a bandwidth of 2 GHz, while maintaining a magnitude deviation of $\pm 0.7$ dB. The filter achieves a delay-bandwidth product of 3.4 and a delay per unit area of $5.8\,\mathrm{ns/mm^2}$. The filter has a worst case noise figure of 20 dB, and -40 dB IM3 distortion for $37\,\mathrm{mV_{ppd}}$ inputs. The chip occupies an active area of $0.6\,\mathrm{mm^2}$, and dissipates $112\,\mathrm{mW}$ to $364\,\mathrm{mW}$ of power between its minimum and maximum delay settings. The computed radiation pattern with four antennas spaced by 7.5 cm (half wavelength at the maximum frequency of 2 GHz) shows $\pm 90^{\circ}$ beam steering off broadside.
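The headline numbers in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below assumes free-space propagation and a uniform linear array; the function names are illustrative only. It recomputes the delay-bandwidth product from the maximum delay and the bandwidth, the half wavelength at 2 GHz, and the delay needed between adjacent antennas to steer the beam to the endfire direction.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def half_wavelength(freq_hz):
    """Half of the free-space wavelength at freq_hz, in metres."""
    return C / (2.0 * freq_hz)


def inter_element_delay(spacing_m, angle_deg):
    """Delay between adjacent antennas for a plane wave arriving
    angle_deg off broadside (uniform linear array assumed)."""
    return spacing_m * math.sin(math.radians(angle_deg)) / C


# Values quoted in the abstract.
max_delay_s = 1.7e-9
bandwidth_hz = 2e9
delay_bandwidth_product = max_delay_s * bandwidth_hz

if __name__ == "__main__":
    print("delay-bandwidth product:", delay_bandwidth_product)
    print("half wavelength at 2 GHz (m):", half_wavelength(bandwidth_hz))
    print("adjacent-element delay at 90 deg (s):",
          inter_element_delay(0.075, 90.0))
```

Under these assumptions, a 7.5 cm spacing corresponds to half a wavelength at 2 GHz, and steering to $\pm 90^{\circ}$ calls for roughly 250 ps of delay between adjacent antennas.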

Exploiting the large delay-bandwidth product achievable with this architecture, a high order all-pass filter has been used to demonstrate true-time expansion and compression of narrow, wideband, finite width, continuous-time pulses. It is based on storing an input pulse as state-variables in a continuous-time filter whose delay exceeds the pulse duration, and, once the pulse is completely "inside" the filter, reducing or increasing its bandwidth. Expansion and compression enable processing and generation of high speed pulses using low speed circuits. The proposed method can be implemented on an IC, unlike photonic or microwave implementations based on dispersive media. It is more accurate and less complex than IC implementations using a high frequency chirped VCO and on-chip group delay dispersion. It avoids high speed sampling and is more immune to jitter than sampling the signal on a capacitor array. Pulse expansion and compression by factors of 1.8× and 1.7× respectively are demonstrated in a 0.13 µm CMOS process. The prototype chip includes a filter whose bandwidth can be switched between 870 MHz and 472 MHz, and circuitry to generate Gaussian pulses and monopulses for testing. It occupies 1.6 mm² and consumes 370 mW.
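The principle of expansion by bandwidth scaling can be sketched numerically. The example assumes a hypothetical, normalized second-order state-space system $\dot{x} = \alpha A x$ with no input (the pulse is taken to be already stored in the states) and a simple Runge-Kutta integrator. The matrix, initial state, and function names are illustrative and are not the filter of the prototype. Scaling all the rates by $\alpha$ makes the stored response play out on a time axis stretched by $1/\alpha$.

```python
import math


def deriv(x, alpha):
    """dx/dt = alpha * A x for the assumed matrix A = [[0, 1], [-1, -1]]."""
    return (alpha * x[1], alpha * (-x[0] - x[1]))


def rk4_step(x, alpha, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(x, alpha)
    k2 = deriv((x[0] + 0.5 * dt * k1[0], x[1] + 0.5 * dt * k1[1]), alpha)
    k3 = deriv((x[0] + 0.5 * dt * k2[0], x[1] + 0.5 * dt * k2[1]), alpha)
    k4 = deriv((x[0] + dt * k3[0], x[1] + dt * k3[1]), alpha)
    return (
        x[0] + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        x[1] + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
    )


def output_at(alpha, t_end, x0=(1.0, 0.0), steps=4000):
    """First state at time t_end, starting from the stored state x0."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x = rk4_step(x, alpha, dt)
    return x[0]


def exact_output(alpha, t):
    """Closed-form first state for x0 = (1, 0), used as a reference."""
    w = math.sqrt(3.0) / 2.0
    ts = alpha * t  # scaled time
    return math.exp(-ts / 2.0) * (math.cos(w * ts) + math.sin(w * ts) / math.sqrt(3.0))


if __name__ == "__main__":
    # Halving alpha (the bandwidth) stretches the stored response by two:
    # the value reached at t = 2 with alpha = 1 appears at t = 4 with alpha = 0.5.
    print(output_at(1.0, 2.0), output_at(0.5, 4.0))
```

In this idealized picture the expansion factor equals the ratio of the two bandwidths. For the two settings quoted above, 870 MHz / 472 MHz is approximately 1.84, in line with the reported expansion factor of 1.8×.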

### **Contents**

#### **ACKNOWLEDGEMENTS**

#### **ABSTRACT**

| Chapter | Section | Title | Page |
|---|---|---|---|
| 1 | | Continuous-time Delay Elements for Broadband Signal Processing | |
| | 1.1 | Introduction | 1 |
| | 1.2 | Narrow band beamforming | 3 |
| | 1.3 | Broadband beamforming | 5 |
| | 1.4 | Integrated circuit delay elements in the literature and their limitations | 9 |
| | 1.5 | Objective and organization of the thesis | 13 |
| | 1.6 | Contributions of the thesis | 15 |
| 2 | | Proposed Architecture of the Tunable All-pass Filter | 16 |
| | 2.1 | Selection of $D(s)$ | 19 |
| | 2.2 | Quantifying the error in a real delay line | 20 |
| | 2.2.1 | Error due to in-band group delay ripple | 22 |
| | 2.3 | Changing the delay | 24 |
| | 2.4 | Delay tunability: fine tuning | 28 |
| | 2.4.1 | Errors due to fine tuning | 28 |
| 3 | | Implementation of the Prototype All-pass Filter | 31 |
| | 3.1 | Integrating capacitors | 31 |
| | 3.2 | Transconductor | 33 |
| | 3.2.1 | Effect of device mismatch and $G_m$ sizing | 35 |
| | 3.2.2 | Bandwidth tuning | 36 |
| | 3.3 | Summing taps and gain tuning | 38 |
| | 3.4 | Distortion and noise | 40 |
| | 3.4.1 | Design trade-offs: mismatch and noise | 41 |
| | 3.5 | Chip architecture | 43 |
| 4 | | Measurement Results of the Variable Delay All-Pass Filter | 45 |
| 5 | | Expansion and Compression of Analog Pulses by Bandwidth Scaling of Continuous-Time Filters | 59 |
| | 5.1 | Motivation | 59 |
| | 5.2 | Principle of pulse expansion/compression | 61 |
| | 5.3 | Filter with switched bandwidth | 64 |
| | 5.3.1 | Delay filter for signal storage | 64 |
| | 5.3.2 | Filter used for pulse expansion and compression | 65 |
| | 5.3.3 | Mismatch induced transients while dynamically switching the filter's bandwidth | 66 |
| | 5.3.4 | Effect of AC coupling between the two halves | 67 |
| | 5.4 | Transconductor | 71 |
| | 5.4.1 | Summing taps | 74 |
| | 5.5 | Pulse generator | 75 |
| | 5.6 | Prototype chip architecture | 77 |
| 6 | | Measurement Results for Pulse Expansion and Compression Prototype | 79 |
| 7 | | …n Enhanced High Frequency Transconductor With On-Chip Tuned … | |
| | 7.1 | Introduction | 88 |
| | 7.2 | Proposed architecture | 89 |
| | 7.3 | $G_N$ tracking $g_{ds}$ | 90 |
| | 7.4 | Secondary effects and design trade-offs | 94 |
| | 7.4.1 | Results and discussion | 95 |
| 8 | | Accurate Constant Transconductance Generation Without Off-chip Components | 98 |
| | 8.1 | Motivation | 98 |
| | 8.2 | Proposed architecture | 99 |
| | 8.2.1 | Generation of accurate on-chip resistance | 100 |
| | 8.2.2 | Generation of fixed transconductance | 101 |
| | 8.2.3 | Differential implementation | 101 |
| | 8.3 | Alternate compact implementation | 104 |
| | 8.4 | Simulation results and discussion | 107 |
| 9 | | Conclusion and Future Scope | 109 |
| | 9.1 | Suggestions for future work | 110 |
| A | | Measurement of Noise Spectral Density | 113 |
| | A.1 | Measurement of insertion loss ($\alpha$) | 116 |
| B | | Multi-variable Numerical Optimization for Fitting Measured and Simulated Filter's Responses Using Space Mapping Technique | 119 |

# **List of Figures**

| 1.1 | (a) Radiation pattern of an isotropic antenna. (b) Illustration of beam-   |    |
|-----|----------------------------------------------------------------------------|----|
|     | forming using a two antenna system. (c) Illustration of changing the       |    |
|     | beam direction using electronic delays                                     | 1  |
| 1.2 | Phase shift mimicking time delay for a sinusoidal signal                   | 3  |
| 1.3 | (a) Principle of narrow band beamforming using phase shifters for a        |    |
|     | two antenna system. (b) Realization of (a) using LO phase shifting.        | 3  |
| 1.4 | Applications of true-time-delay elements: (a) Beamforming by delay-        |    |
|     | ing and combining at RF[7]. (b) Beamforming at IF by delaying and          |    |
|     | combining after downconversion[9]. (c) Channel response equaliza-          |    |
|     | tion [8]                                                                   | 5  |
| 1.5 | Spatial filtering illustration of an $N+1$ element TTD beamformer for      |    |
|     | a broadband input. Delays are assigned to produce maximum(a) and           |    |
|     | minimum (b) at the output                                                  | 6  |
| 1.6 | (a) Normalized magnitude of the beamformed output for different scan-      |    |
| 1   | ning angles using a narrowband signal. (b) Polar representation of (a).    |    |
|     |                                                                            | 7  |
| 1.7 | (a) Normalized magnitude of the beamformed output for different in-        |    |
|     | cident angles using a broadband Gaussian monopulse input. (b) Polar        |    |
|     | representation of (a)                                                      | 8  |
| 1.8 | (a) An unit lattice filter cell. (b) Higher order all-pass filter realized |    |
|     | using cascades of unit lattice filters [12]                                | 10 |
| 1.9 | Active all-pass delay cells. (a) First order APF in [13] (b) First order   |    |
|     | APF in [14] (c) Second order APF in [15]. $C_L$ at the output of each      |    |
|     | delay cell is the output parasitic capacitance                             | 11 |

| 1.10 | Realization of large delays using cascade of first and second order de-          |    |
|------|----------------------------------------------------------------------------------|----|
|      | lay cells. $D_1(s)$ , $D_2(s)$ , $D_3(s)$ are first or second order polynomials. |    |
|      | Excess phase lag and magnitude droop caused by output parasitic ca-              |    |
|      | pacitance of each delay cell is modeled as $1/(1+s\tau_p)$                       | 12 |
| 1.11 | Scatter plot showing the magnitude deviations of reported delay lines            |    |
|      | with respect to delay (range)-bandwidth product. Red and blue markers            |    |
|      | represent active and passive realizations respectively                           | 12 |
| 2.1  | Forms of $LC$ ladder. (a) Odd order 'capacitor first' (b) Even order             |    |
|      | 'capacitor first' (c) Even order 'inductor first' (d) Odd order 'inductor        |    |
|      | first'                                                                           | 16 |
| 2.2  | All-pass filter architecture using singly terminated $LC$ ladder architec-       |    |
|      | tures of (a) Fig. 2.1(a, c) and (b) Fig. 2.1(b, d)                               | 18 |
| 2.3  | Singly terminated transmission line analogy for delay realization                | 19 |
| 2.4  | Comparison of group delay characteristics of a $9^{th}$ order Bessel filter to   |    |
|      | a $9^{th}$ order EGD filter having the same bandwidth                            | 20 |
| 2.5  | (a) Gaussian monopulse having FWHM $\approx$ 2 s. (b) Normalized spectrum        |    |
|      | of (a)                                                                           | 23 |
| 2.6  | Group delay characteristics of an EGD filter with change in order. (a)           |    |
|      | Constant delay. (b) Constant bandwidth                                           | 24 |
| 2.7  | LC ladder topologies for capacitor first architectures of Fig. 2.1 for (a)       |    |
|      | third order, (b) second order and (c) first order                                | 25 |
| 2.8  | Single ended representation of the fully differential variable order all-        |    |
|      | pass filter. Transconductors in each shaded box are turned on to config-         |    |
|      | ure the filter in the respective order. Inset: Single ended equivalent of        |    |
|      | the transconductor                                                               | 25 |
| 2.9  | (a) Group delay characteristics of APF with fine tuning. (b) Transient           |    |
|      | response of the APF with the monopulse input of Fig. 2.5                         | 28 |
| 2.10 | Simulated rms error between the outputs and the delayed and scaled               |    |
|      | input of Fig. 2.9(b)                                                             | 29 |

| 2.11       | (a) Simulated phase errors for the delay response of Fig. 2.9(a). (b)                   |      |
|------------|-----------------------------------------------------------------------------------------|------|
|            | RMS error in 2× FWHM of a Gaussian monopulse corresponding to                           |      |
|            | the phase error of (a)                                                                  | 29   |
| 3.1        | nMOS gate capacitance versus control voltage                                            | 32   |
| 3.2        | (a) Utilization of differential signaling to generate virtual shorts at drain/so        | urce |
|            | terminals of the MOS transistors. (b) Layout arrangement of the struc-                  |      |
|            | ture [34]                                                                               | 32   |
| 3.3        | (a) Transconductor, $G_m$ loaded with negative conductance $G_N$ . (b)                  |      |
|            | Principle of generating a transconductance, $G_N$ , to track the parasitic              |      |
|            | conductance, $g_{ds}$ . (c) Schematic representation of (a)                             | 34   |
| 3.4        | (a) Dependence of $F_{\rm T}$ of the transconductor in Fig. 3.3 on current den-         |      |
|            | sities in a 0.13 $\mu m$ CMOS process. (b) $G_m$ of the transconductor per              |      |
|            | unit width corresponding the current densities of (a)                                   | 35   |
| 3.5        | (a) Group delay characteristics of the ninth order EGD filter with $\mathcal{G}_m$      |      |
|            | mismatch with standard deviation of 2%. (b) Computed transient re-                      |      |
|            | sponse of the EGD APF having delay characteristics of (a) when excited                  |      |
|            | with a monopulse having FWHM=1 ns                                                       | 36   |
| 3.6        | RMS error between the outputs and delayed and scaled input of Fig. 3.5(b).              | 36   |
| 3.7        | (a) Principle demonstrating $G_m$ locking to $\alpha I_{ref}/(2\Delta V)$ . (b) Genera- |      |
|            | tion of $2\Delta V$ . (c) Schematic representation of (a)                               | 37   |
| 3.8        | Realization of the APF summer.                                                          | 39   |
| 3.9        | (a) Illustration of the effect of transconductor's mismatch on compo-                   |      |
|            | nent values. (b) Noise analysis using reciprocity. (c) Schematic of the                 |      |
|            | ninth order APF and the normalized signal swings at each state across                   |      |
|            | frequencies                                                                             | 41   |
| 3.10       | Simplified block diagram of the chip.                                                   | 43   |
| 3.11       | Output buffer architecture                                                              | 43   |
| <i>4</i> 1 | Chin micrograph and snapshot of the test hoard                                          | 45   |

| 4.2  | Simplified measurement setup for characterizing the APF. VNA, SA              |    |
|------|-------------------------------------------------------------------------------|----|
|      | and DSO refer to vector network analyzer, spectrum analyzer, and dig-         |    |
|      | ital storage oscilloscope respectively.                                       | 46 |
| 4.3  | Measured frequency response of the variable order APF with $V_{DD} =$         |    |
|      | 1.4 V for coarse delay settings                                               | 46 |
| 4.4  | Measured frequency response of the variable order APF with $V_{DD} =$         |    |
|      | 1.4 V for fine delay settings                                                 | 47 |
| 4.5  | Back-annotated $G_m$ s of the all-pass filter's transconductors with relative |    |
|      | distance from each other                                                      | 48 |
| 4.6  | Measured (from two chips) and back annotated group delay response             |    |
|      | for coarse delay settings                                                     | 48 |
| 4.7  | Computed phase error from the AC response of Fig. 4.3                         | 49 |
| 4.8  | Computed rms error from the AC response of Fig. 4.4 with a Gaussian           |    |
|      | monopulse input using (2.30)                                                  | 50 |
| 4.9  | Test setup for measuring response of the APF to transient inputs              | 50 |
| 4.10 | Measured response to a to a 3.2 ns wide (FWHM) monopulse. (a) Coarse          |    |
|      | delay resolution. (b) Fine delay resolution                                   | 51 |
| 4.11 | (a) Transient plots (computed from AC response of Fig. 4.4) demon-            |    |
|      | strating monotonically varying true-time-delay when a monopulse of            |    |
|      | 1 ns width (FWHM) is fed to the filter. (b) Computed rms error between        |    |
|      | the outputs and delayed and scaled input of (a)                               | 51 |
| 4.12 | Ninth order APF gain and group delay frequency characteristics for dif-       |    |
|      | ferent summing tap voltages, $V_{DDH}$ . $V_{DD}$ set at 1.4 V                | 52 |
| 4.13 | Measured input referred noise spectral density and the integrated noise       |    |
|      | figure versus order                                                           | 53 |
| 4.14 | Measured peak to peak differential input voltage for which IM3 falls          |    |
|      | 40 dB below the applied tones for (a) nominal delay setting, (b) in-          |    |
|      | creased delay setting with fine tuning                                        | 54 |

| 4.15 | setting, (b) increased delay setting with fine tuning                                                                                                                                                                          | 55 |
|------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 4.16 | (a) Model of the test setup used for computing radiation pattern. (b) Computed normalized radiation pattern for a four element array with uniform spacing of 7.5 cm ( $\lambda_{\rm fmax}/2$ ) using a monopulse of 1 ns FWHM. | 55 |
| 4.17 | Scatter plot showing the magnitude deviations of the reported delay lines (active and passive) including this work with respect to delay (range)-                                                                              |    |
| ~ 1  | bandwidth product.                                                                                                                                                                                                             | 56 |
| 5.1  | (a) Slowing down a signal by storing the samples at high speed and reading at low speed [44], (b) Slowing down a continuous-time signal [45]                                                                                   | 59 |
| 5.2  | (a) State-space model of a filter, (b) $G_m$ - $C$ realization of (a); Values shown on conductances and transconductances are multipliers of a cer-                                                                            |    |
| 5.2  | tain unit transconductance $G_{m0}$                                                                                                                                                                                            | 61 |
| 5.3  | (a) and (c) Normal operation of a continuous-time filter, (b) Pulse expansion by dynamically reducing $\alpha$ , (d) Pulse compression by dynamically increasing $\alpha$                                                      | 62 |
| 5.4  | (a) Cascade of an all-pass and a low pass filter. (b) APF7 using an $LC$ ladder. (c) $G_m$ -C realization of APF7 and LPF5                                                                                                     | 64 |
| 5.5  | Illustration of switching a first order filter between (a) High bandwidth, and (b) Low bandwidth modes. Mismatches between the filter halves                                                                                   |    |
| 5.6  | cause disturbance to the states while switching                                                                                                                                                                                | 66 |
|      | enth order $G_m$ – $C$ filter                                                                                                                                                                                                  | 67 |
| 5.7  | $k^{th}$ stage of the filter: (a) With both halves turned on, (b) With the lower half turned off. $R_b$ in Fig. 5.6 is assumed to be very high and is                                                                          |    |
|      | not shown here                                                                                                                                                                                                                 | 68 |
| 5.8  | Schematic of the transconductor used in APF7                                                                                                                                                                                   | 71 |
| 5.9  | Common mode feedback circuit for the transconductor in Fig. 5.8                                                                                                                                                                | 72 |

| 5.10 | (a) Switched transconductor in the ladder filter. (b) Unbalanced transcon-                                                                 |          |
|------|--------------------------------------------------------------------------------------------------------------------------------------------|----------|
|      | ductor while switching (negative conductance not shown)                                                                                    | 73       |
| 5.11 | Simulated bias current settling time of the transconductor of Fig. 5.8.                                                                    | 74       |
| 5.12 | (a) Simplified schematic of the all-pass taps. (b) Architecture of a summing tap unit                                                      | 74       |
| 5.13 | (a) Gaussian and monopulse generation architecture. (b) Schematic of the impulse generator circuit and illustration of impulse generation. | 76       |
| 5.14 | Simulated differential outputs of the impulse, Gaussian pulse and monopulse generators                                                     | 76       |
| 5.15 | Simplified block diagram of the chip                                                                                                       | 77       |
| 5.16 | Output buffer schematic                                                                                                                    | 78       |
| 6.1  | (a) Chip photograph and (b) snapshot of the test board                                                                                     | 79       |
| 6.2  | Magnitude and group delay of (a) all-pass filter (APF7), and (b) the total filter chain (APF7+LPF5) for both bandwidth settings | 80 |
| 6.3  | Measured input noise spectral density for APF7 and APF7 + LPF5 for (a) high bandwidth and (b) low bandwidth mode | 81 |
| 6.4  | Measured IIP3 with $50\Omega$ reference resistance for the APF7 and APF7 + LPF5 for (a) high bandwidth and (b) low bandwidth mode | 82 |
| 6.5  | Simplified test set up for pulse expansion and compression | 83 |
| 6.6  | Demonstration of pulse expansion with Gaussian input pulse with FWHM of 1.4 ns | 84 |
| 6.7  | Demonstration of pulse expansion with a monopulse input with FWHM of 2.4 ns | 84 |
| 6.8  | Compression of a Gaussian input pulse with FWHM = 2 ns                                                                                     | 85       |
| 6.9  | Compression of a monopulse input with FWHM = 3.2 ns                                                                                        | 85       |
| 7.1  | Schematic representation of Nauta transconductor [29]                                                                                      | 89       |

| 7.2  | (a): Transconductor $G_m$ loaded with negative conductance $G_N$. (b): Symbolic representation of (a) | 90  |
|------|-------------------------------------------------------------------------------------------|-----|
| 7.3  | Common mode feedback circuit                                                              | 91  |
| 7.4  | Principle of conductance tracking by a transconductor                                     | 91  |
| 7.5  | Schematic of transconductance generation circuit to track the output conductance $g_{ds}$ | 92  |
| 7.6  | Schematic of the transconductors $T1, T2, T3$ used in Fig. 7.5                            | 93  |
| 7.7  | Inclusion of chopping to eliminate the effect of offsets in $G_m$ and $G_N$ .             | 94  |
| 7.8  | Schematic for generation of chopping clock                                                | 94  |
| 7.9  | Transconductor gain variation with temperature and supply voltage across process corners with chopping enabled | 96  |
| 7.10 | Transconductor gain distribution with mismatch over 500 Monte-Carlo runs with gain enhancement post chopping | 96  |
| 8.1  | Block diagram demonstrating principle of operation for generating constant on-chip resistance | 100 |
| 8.2  | Block diagram demonstrating principle of $G_m$ tracking $1/R$                             | 101 |
| 8.3  | Generation of constant resistance for differential operation                              | 101 |
| 8.4  | Schematic representation of the two stage opamp used in Fig. 8.3                          | 102 |
| 8.5  | Generation and routing of bias current for fixed transconductance                         | 103 |
| 8.6  | Schematic of the transconductors $T1, T2, T3$ used in Fig. 8.5                            | 104 |
| 8.7  | Principle demonstrating $G_m$ locking to $I_{ref}/\Delta V$                               | 104 |
| 8.8  | Transconductance generation circuit to track $I_{ref}/\Delta V$                           | 105 |
| 8.9  | Inclusion of chopping to eliminate offset of $M1, 4. \ldots$                              | 106 |
| 8.10 | Simulation schematic of the transconductor whose bias current is generated in Fig. 8.9 | 107 |
| 8.11 | Simulated variation of $G_{m M1a,M2a}$ of Fig. 8.10 with temperature                      | 107 |

| 8.12 | Simulated variation of $G_{m M1a,M2a}$ of Fig. 8.10 with supply voltage.  | 108 |
|------|---------------------------------------------------------------------------|-----|
| A.1  | Noise contributors in the test setup (a) with filter path enabled and (b) with direct path enabled | 113 |
| A.2  | Test setup for evaluating $ H_{AP}H_{int} $                               | 114 |
| A.3  | Effect of path loss on power transfer                                     | 115 |
| A.4  | Setup for measuring balun loss under                                      | 117 |
| A.5  | Insertion loss of the balun and the associated connectors                 | 117 |
| B.1  | Schematic of the all-pass filter of Fig. 2.8 with mismatches in the transconductors (other than the summing taps) | 120 |

## **List of Tables**

| 2.1 | Values of integrating capacitors in Fig. 2.8 for all orders normalized to the minimum capacitance, $C_1$ | 27 |
|-----|---------------------------------------------------------------------------|----|
| 4.1 | Comparison of the proposed all-pass filter with the delay element architectures | 57 |
| 4.2 | Performance summary of the chip | 58 |
| 6.1 | Table comparing expansion/compression architectures | 86 |
| 6.2 | Performance summary of the test chip | 86 |
| 7.1 | Comparison of the proposed transconductor (with and without negative conductance cancellation) with the state of the art negative conductance cancellation based architecture | 97 |

#### **Chapter 1**

# Continuous-time Delay Elements for Broadband Signal Processing

#### 1.1 Introduction

Figure 1.1: (a) Radiation pattern of an isotropic antenna. (b) Illustration of beamforming using a two antenna system. (c) Illustration of changing the beam direction using electronic delays.

Wireless communication systems use antennas to propagate information through space. When a single antenna is used, the radiation is spread out broadly around the antenna. Fig. 1.1(a) shows an example of an antenna with a radially symmetric radiation pattern. However, there are many applications which demand directional communication, where it is required to focus the radiated energy in the desired direction. Using two antennas results in a more focussed beam pattern, as shown in Fig. 1.1(b). The radiation patterns of the individual antennas interfere constructively in the direction normal to the antenna array and produce a beam maximum, and the radiation intensity rolls off at angles further from the normal. The beam direction of this antenna array can be pointed to an arbitrary angle $\theta$ by applying an appropriate electronic time delay, $\tau_d$, to the signal arriving at one of the antennas. This is illustrated in Fig. 1.1(c). The beam rotates to an angle $\theta$ such that the total path delay of the signal through each of the antenna paths is identical, which implies $(d/c)\sin(\theta)=\tau_d$, where $d$ is the spacing between the antennas and $c$ is the speed of light. This spatial selectivity in the angle of reception is akin to filtering a signal based on the direction of arrival. Such arrangements of two or more antennas are called phased array antennas, and the electronic circuitry controlling the spatial filtering is called a beamforming system.
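The steering relation $(d/c)\sin(\theta)=\tau_d$ can be evaluated numerically. The following is a minimal sketch; the function names and the 5 cm spacing are illustrative assumptions and are not taken from the thesis.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def steering_delay(spacing_m, theta_deg):
    """Delay tau_d = (d/c) * sin(theta) that points the beam to angle theta."""
    return (spacing_m / C) * math.sin(math.radians(theta_deg))


def steering_angle(spacing_m, tau_d):
    """Inverse relation: beam angle in degrees for a given delay tau_d."""
    return math.degrees(math.asin(tau_d * C / spacing_m))


# Hypothetical example: 5 cm antenna spacing, beam steered to 30 degrees.
tau = steering_delay(0.05, 30.0)
print(f"tau_d = {tau * 1e12:.1f} ps")
print(f"recovered angle = {steering_angle(0.05, tau):.2f} degrees")
```

For centimetre-scale spacings the required delays are in the range of tens to hundreds of picoseconds, and they grow in proportion to the antenna spacing.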

As a consequence of spatial filtering, phased array systems have the following advantages.

- The antenna array of a phased array transmitter performs a vector summation of the signals in the electromagnetic (EM) domain, so that the signals add in magnitude in the direction of transmission. However, the uncorrelated noise from each of the individual paths adds in power. This increases the signal-to-noise ratio (SNR) of the transmitted and the received signal, thus relaxing the noise specifications of the receiver.
- Thanks to spatial filtering, a phased array receiver is able to attenuate strong nearby blocker signals from any unwanted direction while continuing to communicate with a transmitter in another direction. This relaxes the linearity requirements of the circuitry succeeding the beamformer.
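The SNR advantage in the first point can be quantified with a short calculation. This sketch assumes that all paths carry equal signal amplitudes and equal, mutually uncorrelated noise powers; the function name is hypothetical.

```python
import math


def array_snr_gain_db(num_paths):
    """SNR gain when equal signals add in amplitude and uncorrelated noise adds in power."""
    signal_power_gain = num_paths ** 2   # coherent (voltage) addition
    noise_power_gain = num_paths         # uncorrelated (power) addition
    return 10.0 * math.log10(signal_power_gain / noise_power_gain)


for n in (2, 4, 8):
    print(n, "paths:", round(array_snr_gain_db(n), 2), "dB")
```

Under these assumptions the gain is $10\log_{10}(n)$ dB for $n$ paths, i.e. about 3 dB for every doubling of the number of antennas.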

The above advantages have made beamforming systems an attractive option for applications like radar and through wall imaging. Also, in the near future the roll out of the fifth generation (5G) wireless standards having carrier frequencies in the range of 28–30 GHz will enable the integration of multiple antennas [1] in a single cell phone, thus paving the way for beamforming systems to infiltrate the world of millions of handheld devices.

From the simplistic illustration of the phased array system in Fig. 1.1, it is evident that the core of the technique is the electronically controlled variable delay element, which is responsible for steering the beam in the direction of choice. The first integrated circuit phased array beamformer was reported in [2] in 2004. There has been a plethora of publications in this domain ever since, using a variety of delay elements. The following section reviews some of the most widely used beamforming architectures. The rest of the document will concentrate on beamforming receivers unless otherwise mentioned.

#### 1.2 Narrow band beamforming

$$s(t) \rightarrow \tau \rightarrow s(t - \tau)$$

$$\sin(\omega_c t) \rightarrow \tau \rightarrow \sin(\omega_c t - \omega_c \tau)$$

$$\sin(\omega_c t) \rightarrow \Phi \rightarrow \sin(\omega_c t - \Phi)$$

$$\boxed{\text{Make } \Phi = \omega_c \tau}$$

Figure 1.2: Phase shift mimicking time delay for a sinusoidal signal.

A time delay of $\tau$ applied to a narrow band signal around a center frequency $\omega_c/(2\pi)$ can be approximately modeled as a phase shift of the sinusoidal carrier by $\omega_c\tau$. This is shown in Fig. 1.2. Because phase shifters are easier to design than true-time-delay elements in CMOS technologies, architectures based on narrow band phase shifters have become popular. To do this, the baseband signal of interest is modulated onto a high frequency carrier. For most of the reported literature the baseband bandwidth is $\leq 1\%$ of the carrier frequency [3]. The narrow band nature of the up-converted signal enables the architecture to mimic a time delay with the phase shift of the sinusoidal carrier. The individual outputs are summed to achieve spatial filtering and down-converted back to baseband through a mixer. The underlying assumption behind a narrowband implementation is that the baseband signal remains *almost* constant across the time delay of interest (about one cycle of the carrier).

Figure 1.3: (a) Principle of narrow band beamforming using phase shifters for a two antenna system. (b) Realization of (a) using LO phase shifting.

A conceptual illustration of the same is shown in Fig. 1.3(a) for a two antenna system. Here $d$ represents the antenna spacing, $\theta$ the direction of beam arrival, $s(t)$ the baseband signal, $\omega_c$ the angular frequency of the modulating carrier, $\Phi$ the incremental phase shift in the signal path, and $\omega_{LO}$ the angular frequency of the LO. For a signal arriving at an angle $\theta$ from the normal to the antenna array, the delay in arrival experienced by the waveform incident on channel 1, with respect to channel 2, is $\tau = (d/c)\sin(\theta)$. Assuming that the phase shifters do not affect the baseband envelope of the high frequency carrier, the signals in each channel are expressed as

$$x_1(t) = s(t - \tau)e^{-j\omega_c\tau}e^{j(\omega_c - \omega_{LO})t} \quad \text{and} \quad x_2(t) = s(t)e^{-j\Phi}e^{j(\omega_c - \omega_{LO})t} \tag{1.1}$$

The beamformed signal at the output of the summer can be expressed as

$$s_0(t) = \left(s(t)e^{-j\Phi} + s(t-\tau)e^{-j\omega_c\tau}\right)e^{j(\omega_c - \omega_{LO})t} \tag{1.2}$$

Choosing $\Phi = \omega_c \tau = \omega_c (d/c) \sin(\theta)$, and assuming $s(t) \approx s(t-\tau)$, yields a maximum of $2s(t)$ at the output, indicating the direction of beam arrival. Since there is a one-to-one relation between $\theta$ and $\Phi$ for $\theta \in [-90^\circ, +90^\circ]$, $\Phi$ uniquely represents the direction of arrival of the incoming beam.

The architecture in Fig. 1.3(b) instead phase shifts the local oscillator (LO) signal [3]. Mixing, phase shifting and summing yields

$$s_0(t) = \left(s(t)e^{-j\Phi} + s(t-\tau)e^{-j\omega_c\tau}\right)e^{j(\omega_c - \omega_{LO})t} \tag{1.3}$$

Again, for narrow band applications $s(t) \approx s(t - \tau)$. Thus, if $\Phi = \omega_c \tau$, the beam maximum of $2s(t)$ is detected at a phase setting $\Phi$, which represents the direction of arrival.

However, for both Fig. 1.3(a) and (b), if the carrier frequency changes from $\omega_c$, the angle of observation of the beam maximum also deviates from $\Phi$. This phenomenon of frequency dependence of the observed beam direction is called beam squinting [4], which eventually limits the bandwidth of the baseband signal.
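Beam squinting can be illustrated with a short calculation. The sketch below assumes an ideal phase shifter that holds $\Phi = \omega_c (d/c)\sin(\theta_0)$ at all frequencies, so that at another frequency $\omega$ the maximum moves to the angle satisfying $\omega (d/c)\sin(\theta) = \Phi$. The function name and the numerical values are illustrative assumptions.

```python
import math


def squinted_angle_deg(theta0_deg, f_design, f_actual):
    """Angle of the beam maximum at f_actual when Phi was set for f_design.

    Solves f_actual * sin(theta) = f_design * sin(theta0) for theta.
    """
    s = (f_design / f_actual) * math.sin(math.radians(theta0_deg))
    if abs(s) > 1.0:
        raise ValueError("no real beam angle at this frequency")
    return math.degrees(math.asin(s))


# Hypothetical example: phase set for 45 degrees at a 28 GHz carrier.
for f in (27.5e9, 28.0e9, 28.5e9):
    print(f / 1e9, "GHz ->", round(squinted_angle_deg(45.0, 28.0e9, f), 2), "degrees")
```

In this model a beam steered to the array normal ($\theta_0 = 0$) does not squint, and the angular error grows as the beam is steered further from the normal. A true-time-delay element avoids the effect because its phase scales with frequency.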

Also, as the bandwidth of the baseband signal becomes a significant fraction of $\omega_c/(2\pi)$, $s(t-\tau)$ can no longer be approximated as $s(t)$. This makes the observed maximum of the detected beam direction-dependent. If such a system is used for wireless reception of digital signals, this error will lead to degradation in the error vector magnitude (EVM) [3], especially for standards demanding dense constellation points. This is likely to make narrow band beamforming challenging for 5G standards, where the baseband signal bandwidth is expected to be around 1 GHz for a carrier frequency of 28 GHz [1]. However, if the phase shifters are replaced by true-time-delays, both beam squinting and EVM degradation are reduced.

In applications like ground penetrating radars or through wall imaging, considerable high frequency signal attenuation [5][6] makes modulated carrier based narrowband beamforming unattractive. For these reasons, exploring efficient wideband true-time-delay architectures is of interest.

#### 1.3 Broadband beamforming



Figure 1.4: Applications of true-time-delay elements: (a) Beamforming by delaying and combining at RF [7]. (b) Beamforming at IF by delaying and combining after downconversion [9]. (c) Channel response equalization [8].

From the discussions in Section 1.2 it is evident that using true-time-delays instead of phase shifters can mitigate challenges like bandwidth limitations and EVM degradation for wideband signals. Fig. 1.4(a) [7] shows an $N+1$ channel beamforming architecture where the signal received by each element of an antenna array is delayed and combined to achieve directional selectivity. $\tau$ is the arrival delay of the signal $s(t)$ with respect to its neighbor and $\tau_i$, $i = 0, \dots, N$, are the variable electronic delays in each channel. The beamformed output can be expressed as

$$s_0(t) = \sum_{i=0}^{N} s(t - i\tau - \tau_i) \tag{1.4}$$

Figure 1.5: Spatial filtering illustration of an N+1 element TTD beamformer for a broadband input. Delays are assigned to produce maximum (a) and minimum (b) at the output.

To produce a beam maximum, the extra path delay $i\tau$ with which the signal arrives at the $i^{th}$ channel is compensated by a frequency independent electronic delay, $\tau_i = (N-i)\tau$, to produce $(N+1)s(t - N\tau)$, i.e. $(N+1)s(t)$ apart from a common delay. The output is independent of frequency, implying that there is no beam squinting. This is shown in Fig. 1.5(a). To detect an incoming beam at an angle $90^\circ$ from the normal, and thus support full spatial coverage, the maximum required delay is given by the time taken by the EM wave to traverse the antenna array along its axis, i.e.,

$$\tau_{max} = N(d/c) \tag{1.5}$$

For a given delay range of $\tau_{max} < N(d/c)$, the range of angles to which the beam can be steered is given by $[-\theta_{max}, \theta_{max}]$, where $\theta_{max} = \sin^{-1}(\tau_{max}c/(Nd))$, consistent with (1.5) for $\theta_{max} = 90^\circ$.
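The delay assignment and the delay range requirement can be sketched as follows. The helper names and numerical values are hypothetical. The steering-angle helper assumes that the available delay range is spread evenly over the $N$ inter-element gaps, which reduces to (1.5) at $90^\circ$.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def compensation_delays(num_gaps, tau):
    """Electronic delays tau_i = (N - i) * tau for channels i = 0..N."""
    return [(num_gaps - i) * tau for i in range(num_gaps + 1)]


def max_delay(num_gaps, spacing_m):
    """tau_max = N * (d/c): delay needed to receive a beam at 90 degrees, as in (1.5)."""
    return num_gaps * spacing_m / C


def max_steering_angle_deg(tau_range, num_gaps, spacing_m):
    """Largest steerable angle when tau_range is shared evenly by the N gaps (assumption)."""
    return math.degrees(math.asin(min(1.0, tau_range * C / (num_gaps * spacing_m))))


N, tau = 3, 50e-12                      # hypothetical 4-element array, 50 ps arrival skew
taus = compensation_delays(N, tau)
totals = [i * tau + t for i, t in enumerate(taus)]   # arrival delay + electronic delay
print("per-channel total delay (ps):", [round(x * 1e12, 1) for x in totals])
print("tau_max for d = 5 cm (ps):", round(max_delay(N, 0.05) * 1e12, 1))
```

Every channel ends up with the same total delay $N\tau$, which is why the pulses add coherently irrespective of their frequency content.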

For effective spatial filtering the beam pattern shown in Fig. 1.1(b) should be as narrow as possible. For an $N+1$ element beamformer the beam pattern of a narrowband signal has a maximum of $(N+1)s(t)$ and a minimum of 0. The minimum occurs when the signals from different paths destructively interfere with each other. The beamformed output for a sinusoidal signal with carrier frequency of $\omega_c$ incident normal to the antenna array is expressed as

$$s_0(t) = 1 + e^{-j\omega_c \tau_d} + e^{-j2\omega_c \tau_d} + \dots + e^{-jN\omega_c \tau_d} \tag{1.6}$$

where  $\tau_d$  is the unit delay of the true-time-delay element. The scanning angle  $\theta$  corresponding to the delay  $\tau_d$  can be expressed as  $\theta = \sin^{-1}((c/d)\tau_d)$ . This implies

$$|s_0(t)| = \left| \frac{\sin(\omega_c(N+1)(d/c)\sin(\theta)/2)}{\sin(\omega_c(d/c)\sin(\theta)/2)} \right| \tag{1.7}$$

Fig. 1.6(a) shows the beamformed output for a sinusoidal signal with antenna spacing of $\lambda_{\text{fmax}}/2$ for different numbers of antennas, where $\lambda_{\text{fmax}}$ is the wavelength of the EM wave at the frequency $\omega_c$. Fig. 1.6(b) is the polar representation of (a). From (1.7) and Fig. 1.6(a, b) it is evident that the beam patterns become sharper with an increasing number of antenna elements.
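The closed form (1.7) can be checked against the direct sum (1.6). The following sketch uses $\omega_c (d/c) = 2\pi d/\lambda$ with half-wavelength spacing as the default; the function names are hypothetical.

```python
import numpy as np


def array_factor_direct(n_gaps, theta_rad, spacing_over_lambda=0.5):
    """|s_0| from the sum in (1.6), with w_c*tau_d = 2*pi*(d/lambda)*sin(theta)."""
    psi = 2.0 * np.pi * spacing_over_lambda * np.sin(theta_rad)
    k = np.arange(n_gaps + 1)
    return np.abs(np.exp(-1j * np.outer(psi, k)).sum(axis=1))


def array_factor_closed(n_gaps, theta_rad, spacing_over_lambda=0.5):
    """|s_0| from the closed form (1.7); the 0/0 limit at psi = 0 equals N + 1."""
    psi = 2.0 * np.pi * spacing_over_lambda * np.sin(theta_rad)
    num = np.sin((n_gaps + 1) * psi / 2.0)
    den = np.sin(psi / 2.0)
    is_zero = np.abs(den) < 1e-12
    safe_den = np.where(is_zero, 1.0, den)
    return np.where(is_zero, n_gaps + 1.0, np.abs(num / safe_den))


theta = np.radians(np.linspace(-90.0, 90.0, 361))
for n in (1, 3, 7):
    direct = array_factor_direct(n, theta)
    closed = array_factor_closed(n, theta)
    print(n + 1, "antennas, max deviation:", float(np.max(np.abs(direct - closed))))
```

With half-wavelength spacing the first null falls at $\sin(\theta) = 2/(N+1)$, so the main lobe narrows as antennas are added.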



Figure 1.6: (a) Normalized magnitude of the beamformed output for different scanning angles using a narrowband signal. (b) Polar representation of (a).

Beam patterns using wideband pulses depend on the type of pulses and their widths with respect to the time taken by the EM wave to traverse the antenna array. It can be verified with numerical simulations that increasing the number of antennas leads to sharper broadband radiation patterns. This is shown in Fig. 1.7 for a Gaussian monopulse having a width<sup>1</sup> of $2(d/c)$.

<sup>1</sup>Width of a Gaussian monopulse is defined as the time interval between half its maximum and minimum values. For a more elaborate discussion on the choice of the pulse the reader is referred to Section 2.2.

Figure 1.7: (a) Normalized magnitude of the beamformed output for different incident angles using a broadband Gaussian monopulse input. (b) Polar representation of (a).

The intuition behind this can be illustrated from Fig. 1.5. When the time delay elements are adjusted to align the input waveforms in time, the output has a maximum value of $(N+1)\times s(t)_{max}$. This is shown in Fig. 1.5(a). Conversely, when the time delays are set such that none of the pulses in any of the channels overlap with each other, the output has a maximum of $s(t)_{max}$. This is shown in Fig. 1.5(b). The ratio of the maximum and minimum of the beam pattern is $N+1$. The beam width becomes smaller with the increase in the number of antennas in the array. As beam patterns become narrower, beam steering with smaller resolutions becomes necessary to ensure full spatial coverage between $\pm \theta_{max}$. This requires finer granularity in the delays of the delay elements.
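A numerical experiment of this kind can be sketched as follows. This is not the simulation used for Fig. 1.7. It assumes a monopulse given by the first derivative of a Gaussian, takes the "width" to be the spacing $2\sigma$ between its positive and negative peaks, measures time in units of $d/c$, and steers the beamformer to the array normal. All names are hypothetical.

```python
import numpy as np


def monopulse(t, sigma):
    """Gaussian monopulse (first derivative of a Gaussian) with a peak of 1 at t = sigma."""
    return (t / sigma) * np.exp(0.5 * (1.0 - (t / sigma) ** 2))


def broadband_pattern(n_gaps, theta_deg, width_over_transit=2.0):
    """Peak |sum of N+1 pulses| versus incidence angle, normalized to N + 1.

    Channel i receives the pulse delayed by i*sin(theta), in units of d/c.
    Assumption: width = 2*sigma (peak-to-peak spacing of the monopulse).
    """
    sigma = width_over_transit / 2.0
    span = n_gaps + 6.0 * sigma
    t = np.linspace(-span, span, 8001)
    peaks = []
    for theta in np.atleast_1d(theta_deg):
        tau = np.sin(np.radians(theta))
        total = sum(monopulse(t - i * tau, sigma) for i in range(n_gaps + 1))
        peaks.append(np.max(np.abs(total)))
    return np.array(peaks) / (n_gaps + 1)


angles = [0.0, 15.0, 30.0, 60.0, 90.0]
for n in (1, 3, 7):
    print(n + 1, "antennas:", np.round(broadband_pattern(n, angles), 3))
```

Under these assumptions the normalized output away from the normal falls faster as the number of antennas grows, in line with the sharpening of the broadband pattern described above.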

The spatial filtering, resolution and the SNR of these beamforming systems improve as the number of antenna elements increases. From (1.5), this implies an increase in the required maximum delay. All the above analysis and conclusions are also valid for modulated carrier based communication systems after downconversion, as shown in Fig. 1.4(b) [9]. Fig. 1.4(c) shows a continuous-time equalizer. Here again, the longer the impulse response of the channel, the longer must be the delay span of the equalizer. The group delay has to be uniform over the signal bandwidth. Continuous-time wideband pipelined ADCs such as the one reported in [10] require a delay of 1.5 clock cycles to compensate for the time delay between the continuous-time analog input and the continuous-time DAC output for calculating residues at each stage. Since the input is wideband, the delay element also needs to have a wide bandwidth.

Therefore, for beamforming with a large number of antennas, for realizing large-span equalizers, or even in some modern continuous-time pipelined ADCs, there is a need to realize delay elements which maintain a large, flat group delay over a wide bandwidth. In other words, true-time-delay elements with a *large delay-bandwidth product* (DBW) are necessary.

#### 1.4 Integrated circuit delay elements in the literature and their limitations

A true-time-delay element imparting a delay $T_d$ has a transfer function of $e^{-sT_d}$, which has linear phase and constant magnitude characteristics. This is realizable only using ideal transmission lines (or passive realizations of the same) terminated with the characteristic impedance [7]. For applications targeted at lower GHz frequencies like ground penetrating radars, wall imaging systems [5], or beamforming at baseband [9] in millimeter-wave communication, the length of the transmission lines or the size of the inductors is impractically large for integrated circuit realizations. Moreover, the insertion loss of transmission lines increases with their length, and inductor losses increase with inductor size. This causes roll-off of the magnitude characteristics, which eventually limits the operating frequency of the delay lines [7][11].

Figure 1.8: (a) A unit lattice filter cell. (b) Higher order all-pass filter realized using cascades of unit lattice filters [12].

All-pass filters (APFs) having linear phase over a range of frequencies have unity magnitude response and a flat delay response over the same frequency range. The lattice APF [12] is a passive element based filter architecture which is systematically synthesizable. Synthesis of a lattice filter based APF is shown in Fig. 1.8. Fig. 1.8(a) shows a unit cell of the filter. If the impedances $Z_1$ and $Z_2$ are related to the termination impedance $Z_0$ as $Z_0^2 = Z_1 Z_2$, it can be shown [12] that the output voltage is $V_1 = V_0 \frac{Z_0 - Z_1}{Z_0 + Z_1}$. Let $Z_1$ and $Z_2$ be a capacitor and an inductor having impedances $1/sC$ and $sL$ respectively such that $L/C = R^2$. For $Z_0 = R$, $V_1/V_0 = \frac{1 - sCR}{1 + sCR}$ to within a sign inversion, which is a first order all-pass transfer function having a magnitude response of unity and a nominal delay of $2RC$. Also, since the input impedance of this unit cell is equal to $Z_0$ [12], multiple such blocks can be cascaded to realize longer delays. Variable delays can be generated by using a multiplexer (MUX) as shown in Fig. 1.8(b). Multiplexing stages adds output parasitic capacitance at each stage (due to the input capacitance of the MUX). This degrades the all-pass transfer function by introducing magnitude and delay droops. This degradation becomes more severe as the number of stages increases, eventually limiting the maximum number of stages that can be cascaded. Along with the insertion loss of the inductors, this eventually limits the maximum realizable delay.
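The unity magnitude and the nominal delay of $2RC$ of the first order cell can be confirmed numerically. The sketch below evaluates $(1 - sCR)/(1 + sCR)$ on the $j\omega$ axis and compares the numerically differentiated phase with the analytical group delay $2RC/(1 + (\omega RC)^2)$. The value of $RC$ and the function names are illustrative assumptions.

```python
import numpy as np


def first_order_allpass(omega, rc):
    """H(jw) = (1 - jwRC) / (1 + jwRC) for the unit lattice cell."""
    return (1.0 - 1j * omega * rc) / (1.0 + 1j * omega * rc)


def first_order_group_delay(omega, rc):
    """Analytical group delay, -d(phase)/dw = 2RC / (1 + (wRC)^2)."""
    return 2.0 * rc / (1.0 + (omega * rc) ** 2)


rc = 50e-12                                   # hypothetical RC product, seconds
omega = 2.0 * np.pi * np.linspace(0.0, 2e9, 2001)
h = first_order_allpass(omega, rc)
phase = np.unwrap(np.angle(h))
numerical_delay = -np.gradient(phase, omega)

print("max |H| deviation from 1:", float(np.max(np.abs(np.abs(h) - 1.0))))
print("delay at DC (ps):", first_order_group_delay(0.0, rc) * 1e12)
print("delay at 2 GHz (ps):", round(first_order_group_delay(omega[-1], rc) * 1e12, 2))
```

The delay equals $2RC$ only at low frequencies and falls as $\omega RC$ approaches unity, which is why a single first order cell offers a limited delay-bandwidth product.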

Inductorless, active-RC implementations of long time constants can lead to compact realizations which can be tiled to form large arrays. In the literature, active true-time-delays are implemented using all-pass filters (APFs) [14][13] which have a transfer function of the form $H_{AP}(s) = D(-s)/D(s)$, where $D(s)$ is a polynomial in $s$. An APF has a flat magnitude response and twice the delay of the corresponding lowpass filter $1/D(s)$, and is a better alternative for realizing a true-time-delay element. For realizing a true-time-delay, $D(s)$ can be chosen to be some polynomial approximation to the ideal exponential $e^{sT_d}$, such as those used for Bessel or equiripple group delay (EGD) filters. The higher the order, the better the approximation. To realize long delays over wide bandwidths, i.e. a large delay-bandwidth product, a high order filter is necessary. But unlike lowpass filters, which can be systematically synthesized for any order, APF architectures in the literature are mostly limited to first and second orders. Examples of these are shown in Fig. 1.9 ([13, 15, 14]).

Figure 1.9: Active all-pass delay cells. (a) First order APF in [13] (b) First order APF in [14] (c) Second order APF in [15]. $C_L$ at the output of each delay cell is the output parasitic capacitance.

In the absence of parasitic poles, Fig. 1.9(a) and (b) have a first order all-pass transfer function of the form

$$H_{AP1}(s) = \frac{1 - s/\omega_1}{1 + s/\omega_1} \tag{1.8}$$

and Fig. 1.9(c) has a second order transfer function of the form

$$H_{AP2}(s) = \frac{1 - s/(\omega_2 Q) + (s/\omega_2)^2}{1 + s/(\omega_2 Q) + (s/\omega_2)^2} \tag{1.9}$$

where  $\omega_1$ ,  $\omega_2$ , and Q are functions of the circuit components. Both (1.8) and (1.9) have unity magnitude response and their phase response can be tailored to provide linear phase by modifying  $\omega_1$ ,  $\omega_2$ , and Q, by adjusting the component values. However, the parasitic capacitances at the output (and at some internal node) of each of these architectures introduce one or two additional parasitic poles which introduce magnitude and phase deviations. Higher order APFs are realized using a cascade of first or second order filters and their outputs multiplexed to realize a variable delay as shown in Fig. 1.10. Cascading and multiplexing of stages adds to the output parasitic capacitance and lowers the parasitic pole frequency, worsening the roll-off and limiting the bandwidth of the system. Techniques like inductive peaking are used to counter the phase roll-off [14], but this causes in-band gain deviations for different delay settings [14]. This limits the number of cells which can be cascaded, and consequently the maximum achievable delay.
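The earlier statement that an all-pass filter $D(-s)/D(s)$ has unity magnitude and twice the delay of the lowpass filter $1/D(s)$ can be checked numerically for any order. The sketch below uses a Bessel polynomial normalized for a unit lowpass delay as one possible choice of $D(s)$. The order of 5 and the helper names are assumptions made for illustration.

```python
import math
import numpy as np


def reverse_bessel_coeffs(order):
    """Coefficients of the reverse Bessel polynomial, highest power of s first."""
    n = order
    ascending = [
        math.factorial(2 * n - k) / (2 ** (n - k) * math.factorial(k) * math.factorial(n - k))
        for k in range(n + 1)
    ]
    return np.array(ascending[::-1])


def group_delay(h, omega):
    """Numerical group delay -d(phase)/dw of a sampled frequency response."""
    return -np.gradient(np.unwrap(np.angle(h)), omega)


order = 5
d = reverse_bessel_coeffs(order)
omega = np.linspace(0.0, 2.0, 2001)              # normalized rad/s
s = 1j * omega
lowpass = d[-1] / np.polyval(d, s)               # D(0) / D(s)
allpass = np.polyval(d, -s) / np.polyval(d, s)   # D(-s) / D(s)

gd_lp = group_delay(lowpass, omega)
gd_ap = group_delay(allpass, omega)
print("lowpass delay at DC:", round(float(gd_lp[0]), 6))
print("all-pass delay at DC:", round(float(gd_ap[0]), 6))
print("lowpass |H| at w = 2:", round(float(np.abs(lowpass[-1])), 4))
print("all-pass |H| at w = 2:", round(float(np.abs(allpass[-1])), 4))
```

The lowpass magnitude rolls off within the band while the all-pass magnitude stays at unity, and the all-pass delay is twice the lowpass delay at every frequency.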



Figure 1.10: Realization of large delays using cascade of first and second order delay cells.  $D_1(s)$ ,  $D_2(s)$ ,  $D_3(s)$  are first or second order polynomials. Excess phase lag and magnitude droop caused by output parasitic capacitance of each delay cell is modeled as  $1/(1+s\tau_p)$ .
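The accumulation of droop in such a cascade can be estimated with the parasitic pole model $1/(1+s\tau_p)$ used in Fig. 1.10. The sketch below assumes ideal all-pass cells and identical parasitic poles. The parasitic time constant and the evaluation frequency are hypothetical values.

```python
import math


def cascade_droop_db(num_cells, omega, tau_p):
    """Magnitude droop of num_cells ideal all-pass cells, each followed by 1/(1 + s*tau_p)."""
    return -10.0 * num_cells * math.log10(1.0 + (omega * tau_p) ** 2)


def cascade_excess_delay(num_cells, omega, tau_p):
    """Extra group delay contributed by the parasitic poles."""
    return num_cells * tau_p / (1.0 + (omega * tau_p) ** 2)


tau_p = 10e-12                      # hypothetical parasitic time constant, seconds
omega = 2.0 * math.pi * 2e9         # evaluated at 2 GHz for illustration
for cells in (1, 4, 16):
    droop = cascade_droop_db(cells, omega, tau_p)
    print(cells, "cells:", round(droop, 3), "dB")
```

The droop in dB grows linearly with the number of cascaded cells, so a longer delay obtained by cascading comes with a larger magnitude deviation at the band edge.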

Hayahara's structure [16] is an example of an active filter based higher order APF which can be systematically synthesized without cascading unit cells. However, like the lattice filter based architecture, this too suffers from the effect of parasitic capacitances to ground.



Figure 1.11: Scatter plot showing the magnitude deviations of reported delay lines with respect to delay (range)-bandwidth product. Red and blue markers represent active and passive realizations respectively.

To summarize, the efficacy of delay lines (both passive and active) is limited by the magnitude deviations they incur while realizing large delay-bandwidth products. The trade-off between large delay-bandwidth product and magnitude flatness can be captured by plotting the gain variation of the delay lines reported in literature versus delay-bandwidth product. This is shown in Fig. 1.11. This is a scatter plot of the *measured* magnitude deviation versus the *measured* delay-bandwidth product of most of the reported<sup>2</sup> monolithic implementations of the true-time-delay cells in IEEE®. The points marked in red and blue represent active and passive realizations respectively. An ideal delay line with a large delay and flat magnitude response will be positioned far right on the x axis in Fig. 1.11. However, it can be observed that an attempt to achieve higher delay has invariably led to larger magnitude variations.

#### 1.5 Objective and organization of the thesis

The previous sections discussed the need for true-time-delay elements and the architectures reported in literature. It was concluded that, even though delay lines realizing large delays over wide bandwidths are necessary for applications like wideband beamforming, equalization, and even modern continuous-time pipelined ADCs, the realization of large delay-bandwidth products has hit a bottleneck. The magnitude and/or delay characteristics of the delay lines are distorted by insertion loss and by the interface parasitic effects arising out of cascading multiple stages. Expectedly, these effects are more prominent in the architectures realizing higher delay-bandwidth products. Active implementations of delay lines, which try to equalize these artifacts and realize compact, area efficient solutions using unit delay cells, have been handicapped by the interface parasitic capacitances of these units. Keeping these design issues in the forefront, one of the objectives of this thesis is to break the trade-off between realizing large delays and maintaining the flatness of the magnitude response, and to realize a delay line architecture that validates this objective.

In the latter half of the thesis, the proposed delay line architecture is used for the expansion and compression of continuous-time analog pulses, an application hitherto not reported in literature. This is an IC realization of the method proposed in [28] using the proposed delay line, with the objective of realizing a monolithic solution and reducing the chip area of the pulse expansion and compression architectures.

Chapter 2 proposes the all-pass filter architecture used to realize the true-time-delay line. An analysis of the efficacy of the architecture for maintaining flat magnitude response regardless of the delay is presented. An analysis technique to quantify the error due to distortion of the AC response of the delay line is presented.

<sup>2</sup>To the best of our knowledge this diagram contains all the reported architectures of TTD implementations which contain measured data for both group delay and magnitude till Sep. 2017. Some implementations of the same architecture for different applications have been omitted. [11] has been left out due to its extremely large gain droop of 35 dB.

Chapter 3 discusses the design and circuit details of the all-pass filter introduced in Chapter 2. Methods to realize a range of delays from the single active delay line are introduced. Several design trade-offs between noise, mismatch and distortions are discussed.

Chapter 4 presents the measurement results of the delay line architecture introduced thus far. The proposed delay line is benchmarked against the state-of-the-art architectures reported in literature and its efficacy is validated.

Chapter 5 discusses the motivation behind realizing expansion and compression of continuous-time analog pulses, and the architectures reported in literature. The usefulness of the proposed delay line in realizing an area efficient IC implementation of pulse expansion/compression is discussed. The design challenges, techniques to mitigate them, and circuit details are presented.

Chapter 6 presents the measurement results demonstrating expansion and compression of continuous-time wideband analog pulses. The proposed solution is benchmarked with the state-of-the-art architectures realizing pulse expansion and compression and its efficacy in reducing the chip area validated.

The building block of the all-pass filter which is used as a delay line in this work is a gain-enhanced high frequency transconductor. This is proposed in Chapter 7. The design details, theoretical analysis of its operating conditions and simulation results of the architecture are presented.

Chapter 8 presents two architectures for realizing constant (process, voltage and temperature independent) transconductance for the transconductors realizing the all-pass filters.

Chapter 9 concludes the thesis and puts forth suggestions for future work.

#### 1.6 Contributions of the thesis

This thesis presents an architecture for realizing large tunable delays over a wide bandwidth using a variable order all-pass filter, and uses the proposed architecture to demonstrate true-time expansion and compression of continuous-time, high frequency, analog pulses. The contributions of the thesis are summarized below.

- Active all-pass filters reported in literature have been used to realize delay lines. However, their usefulness has not been exploited to the fullest because they cascade unit cells, which introduces interface artifacts. This thesis proposes and realizes a delay line architecture based on an all-pass filter topology that does not cascade delay cells to realize large tunable delays. It has no unwanted interfaces distorting the AC response of the delay line, except for the final output node. The topology is systematically synthesizable using well known LC ladder synthesis techniques and can be extended to realize large delays without compromising the magnitude response. Measurement results of a prototype variable order all-pass filter validate the architecture.
- A comprehensive analysis of the effect of distortion of the AC response of a delay line on a wideband input signal is presented. It is shown that the error energy between the input and the output of a wideband pulse travelling through a real delay line depends on the square of the phase error and the fourth power of the magnitude deviation of the real delay line, and is not directly related to the group delay flatness. This quantifies the acceptable phase deviation for the requirements of a given application, and the importance of magnitude deviation in contributing to pulse shape distortion. This is contrary to the reported literature on delay lines, which mostly concentrates on maintaining group delay flatness and not the magnitude response.
- In the latter half of the thesis, the proposed all-pass filter is used to demonstrate expansion and compression of continuous-time wideband analog pulses based on the technique proposed in [28]. Due to the compactness of the all-pass filter architecture, it was possible to realize this in a chip area almost three times smaller than that of the state-of-the-art architectures reported in literature, which mostly used narrowband techniques based on chirped carrier modulation and dispersive delay lines to realize pulse expansion and compression.
- The building block of the all-pass filter is a high frequency transconductor without any internal nodes. The absence of internal nodes avoids excess phase lag, which is necessary to prevent unwanted peaking in the AC response. The transconductor uses negative conductance to cancel the parasitic conductance of a single stage differential amplifier. This thesis proposes a technique to ensure that the cancellation takes place automatically across process, voltage and temperature variations.
- Transconductances of the transconductors in active filters need to be invariant to process, voltage and temperature variations. This thesis presents a solution to this problem which is independent of the transistor model and is based on negative feedback and the linear behaviour of a non-linear element within a range of voltages.

#### Chapter 2

### **Proposed Architecture of the Tunable All-pass Filter**

Realization of a large delay-bandwidth product by cascading multiple delay cells leads to undesired parasitic poles, which cause phase and magnitude deviations, eventually imposing an upper limit on the number of cells that can be cascaded. This limits the maximum realizable delay-bandwidth product for an architecture. In contrast, if a filter topology has just as many nodes as its order, all parasitic capacitors can be absorbed into the integrating capacitors of the filter [29]. This will not distort the frequency response. Such architectures for realizing lowpass filters are widely used in literature [29][30]. However, due to the absence of a systematic design procedure for realizing higher order all-pass filters, these filter design techniques have not been utilized.

This chapter introduces a tunable delay, variable order APF architecture based on a singly terminated ladder filter which can be systematically synthesized for any transfer function. It can be realized using a transconductor with no internal nodes. This allows realization of a large delay-bandwidth product without the need to cascade multiple units or multiplex high frequency analog signals.

Figure 2.1: Forms of LC ladder. (a) Odd order 'capacitor first' (b) Even order 'capacitor first' (c) Even order 'inductor first' (d) Odd order 'inductor first'.

Consider the singly terminated LC ladder in Fig. 2.1(a) or Fig. 2.1(c) where the termination resistor R is the only dissipative element. The driving point impedance,  $Z_{11}(s)$ , looking into the lossless LC network can be represented as [31]

$$Z_{11}(s) = \frac{N_e(s)}{D_o(s)} \tag{2.1}$$

where  $N_e(s)$  and  $D_o(s)$  are polynomials with only even and odd powers of s respectively. For example,

$$Z_{11}(s) = \frac{s^2 L_2 C_3 + 1}{s^3 C_1 L_2 C_3 + s(C_1 + C_3)} \tag{2.2}$$

for a third order filter having configuration of Fig. 2.1(a), or

$$Z_{11}(s) = \frac{s^2 L_1 C_2 + 1}{s C_2} \tag{2.3}$$

for a second order filter having configuration of Fig. 2.1(c). For filters like those in Fig. 2.1(b), or Fig. 2.1(d) the driving point impedance,  $Z_{11}(s)$ , looking into the lossless LC network can be represented as [31]

$$Z_{11}(s) = \frac{N_o(s)}{D_e(s)} \tag{2.4}$$

where  $N_o(s)$  and  $D_e(s)$  are polynomials with only odd and even powers of s respectively, such as

$$Z_{11}(s) = \frac{sL_2}{s^2L_2C_1 + 1} \tag{2.5}$$

for a second order filter having configuration of Fig. 2.1(b), or

$$Z_{11}(s) = \frac{s^3 L_1 C_2 L_3 + s(L_1 + L_3)}{s^2 L_3 C_2 + 1} \tag{2.6}$$

for a third order filter having configuration of Fig. 2.1(d). In Fig. 2.1(a, c) the node voltage,  $V_1(s)$ , can be represented as

$$V_1(s) = V_i \frac{Z_{11}(s)}{Z_{11}(s) + R}$$
(2.7)

i.e.

$$V_1(s) = V_i \frac{N_e(s)}{N_e(s) + RD_o(s)} \tag{2.8}$$

i.e.

$$V_1(s) = \frac{V_i}{2} \frac{N_e(s) - RD_o(s)}{N_e(s) + RD_o(s)} + \frac{V_i}{2} \tag{2.9}$$

Let

$$H_{AP}(s) = \frac{N_e(s) - RD_o(s)}{N_e(s) + RD_o(s)}$$
(2.10)

which implies

$$V_1(s) = \frac{V_i}{2} H_{AP}(s) + \frac{V_i}{2}$$
 (2.11)

Since  $N_e(s)$  and  $D_o(s)$  contain only even and odd powers of s respectively,  $N_e(j\omega)$  is purely real and  $D_o(j\omega)$  is purely imaginary. Therefore,

$$|H_{AP}(j\omega)| = \left| \frac{N_e(j\omega) - RD_o(j\omega)}{N_e(j\omega) + RD_o(j\omega)} \right| = 1.$$
 (2.12)

Proceeding as above, and taking the all-pass output as  $V_i - 2V_1(s)$ , it can be shown that

$$H_{AP}(j\omega) = \frac{RD_e(j\omega) - N_o(j\omega)}{RD_e(j\omega) + N_o(j\omega)} \tag{2.13}$$

for the configurations of Fig. 2.1(b, d).



Figure 2.2: All-pass filter architecture using singly terminated LC ladder architectures of (a) Fig. 2.1(a, c) and (b) Fig. 2.1(b, d)

Thus  $H_{AP}(s)$  is an all-pass transfer function. Denoting  $V_{AP}(s) = H_{AP}(s)V_i$  and rearranging (2.11) results in  $V_{AP}(s) = 2V_1(s) - V_i$ . Thus, a weighted summation of the input voltage  $V_i$  and the first node voltage of the filter  $V_1$  results in an all-pass function. This is shown in Fig. 2.2(a). Note that, if  $N_e(j\omega) + RD_o(j\omega)$  has linear phase (constant group delay) within the signal bandwidth, so will  $H_{AP}(j\omega)$ . Thus, if the components of the LC ladder in Fig. 2.2(a) are chosen such that  $V_n/V_i$  is a Bessel lowpass filter,  $H_{AP}(s)$  will be a filter with a constant delay within the signal band, and unity magnitude across all frequencies. Since the order of the filter is the same as that of the singly terminated ladder, it can be made arbitrarily large. This opens up possibilities for realizing arbitrarily large delays without compromising the magnitude flatness.
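As a quick numerical check of (2.7) to (2.12), the sketch below evaluates the weighted sum  $2V_1 - V_i$  for a third order 'capacitor first' ladder. It assumes that  $C_1$  sits at node  $V_1$  and  $C_3$  at the open-circuited far end. The element values and the helper names (`ladder_z11`, `allpass_response`) are arbitrary illustrations and are not taken from the prototype.

```python
import cmath
import math


def ladder_z11(s, c1, l2, c3):
    """Driving point impedance of a third order 'capacitor first' ladder.

    Assumed topology: C1 shunt at node V1, L2 in series, C3 shunt at the
    open-circuited far end.
    """
    z_far = s * l2 + 1.0 / (s * c3)       # L2 in series with C3
    return 1.0 / (s * c1 + 1.0 / z_far)   # C1 in parallel with that branch


def allpass_response(omega, r, c1, l2, c3):
    """V_AP / V_i = 2 V_1 / V_i - 1, with V_1 / V_i from the divider in (2.7)."""
    s = 1j * omega
    z11 = ladder_z11(s, c1, l2, c3)
    v1_over_vi = z11 / (z11 + r)
    return 2.0 * v1_over_vi - 1.0


# Arbitrary normalized element values (assumption, for illustration only).
R, C1, L2, C3 = 1.0, 1.0, 2.0, 3.0

# Frequencies chosen away from the lossless ladder's own resonances.
for omega in (0.1, 0.5, 1.0, 3.0):
    h = allpass_response(omega, R, C1, L2, C3)
    print(f"omega = {omega:4.1f} rad/s  |H_AP| = {abs(h):.9f}  "
          f"phase = {math.degrees(cmath.phase(h)):8.2f} deg")
```

The magnitude stays at unity for every frequency while only the phase changes, which is the property the summation in Fig. 2.2(a) relies on.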

Fig. 2.2(a, b) summarize the proposed all-pass filter realizations. The expressions for the all-pass outputs are as follows:  $V_{AP}(s) = 2V_1(s) - V_i(s)$  for LC ladders in Fig. 2.1(a, c), and  $V_{AP}(s) = V_i(s) - 2V_1(s)$  for LC ladders in Fig. 2.1(b, d). Note that the lowpass transfer function 1/D(s), where D(s) denotes the denominator of  $H_{AP}(s)$ , can be realized simultaneously by tapping the last state-variable of the LC ladder (capacitor voltage or inductor current, e.g.  $V_n$  in Fig. 2.1(a)).



Figure 2.3: Singly terminated transmission line analogy for delay realization.

This all-pass structure can also be understood by analogy with transmission lines. This is shown in Fig. 2.3. The singly terminated ladders in Fig. 2.1(a, c) are equivalent to a transmission line terminated by an open circuit (Fig. 2.3(a)) and those in Fig. 2.1(b, d) are equivalent to a transmission line terminated by a short circuit (Fig. 2.3(b)). Voltage  $V_1$  at the input of the line consists of the attenuated version of input pulse ( $V_i/2$ ) and the reflected pulse which arrives  $2T_d$  later, where  $T_d$  is the one-way delay of the transmission line. The reflected pulse is in phase with the input with an open circuit termination and out of phase with the input with a short circuit termination. Thus taking  $2V_1 - V_i$  in Fig. 2.3(a) or  $V_i - 2V_1$  in Fig. 2.3(b) cancels the incident pulse and leaves only the reflected pulse, which is the input pulse delayed by  $2T_d$ .

# **2.1** Selection of D(s)

One of the ways to realize flat group delay for  $H_{AP}(s) = D(-s)/D(s)$  is to ensure that 1/D(s) has the characteristics of a Bessel filter. Fig. 2.4 shows the group delay of ninth-order Bessel and equiripple group delay (EGD) all-pass filters. Since EGD filters have higher group delay than their Bessel counterparts [31] for the same bandwidth, EGD filters have been chosen for this implementation. The effect of the in-band group delay ripple is explained in the following section.

Figure 2.4: Comparison of group delay characteristics of a  $9^{th}$  order Bessel filter to a  $9^{th}$  order EGD filter having the same bandwidth.

# 2.2 Quantifying the error in a real delay line

An ideal delay line offering a delay  $T_d$  to an input  $v_i(t)$  has an output  $v_i(t-T_d)$ . In the frequency domain the output can be represented as  $V_i(j\omega)e^{-j\omega T_d}$ , where  $V_i(j\omega)$  is the Fourier transform of  $v_i(t)$ . The output of a real delay line of Fig. 2.2 is  $H_{AP}(j\omega)V_i(j\omega)$ . For a real delay line

$$H_{AP}(j\omega) = |H_{AP}(j\omega)|e^{-j\omega T_d - j\phi_e(\omega)}$$
(2.14)

where  $\phi_e(\omega)$  is the phase deviation of the real delay line from its ideal counterpart. The error  $V_e(j\omega)$  is quantified as

$$V_e(j\omega) = H_{AP}(j\omega)V_i(j\omega) - V_i(j\omega)e^{-j\omega T_d}.$$
 (2.15)

Using Parseval's theorem the error energy can be expressed as

$$\int_{-\infty}^{\infty} (v_0(t) - v_i(t - T_d))^2 dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |V_e(j\omega)|^2 d\omega$$
 (2.16)

where  $v_0(t)$  is the time domain output. A high frequency broadband pulse has most of its energy concentrated within a narrow time duration. If  $T_p$  represents this duration of interest an rms error can be obtained by modifying (2.16) as

$$E_{rms} = \sqrt{\frac{1}{T_p} \int_{-\infty}^{\infty} (v_0(t) - v_i(t - T_d))^2 dt} = \sqrt{\frac{1}{2\pi T_p} \int_{-\infty}^{\infty} |V_e(j\omega)|^2 d\omega}.$$
 (2.17)

It is common in communication systems to quantify an rms error with respect to the signal peak. Therefore this error is quantified as

$$E_{rms}/v_{pp} = \sqrt{\frac{1}{T_p v_{pp}^2} \int_{-\infty}^{\infty} (v_0(t) - v_i(t - T_d))^2 dt} = \sqrt{\frac{1}{2\pi T_p v_{pp}^2} \int_{-\infty}^{\infty} |V_e(j\omega)|^2 d\omega}$$
(2.18)

where  $v_{pp}$  is the peak-peak signal amplitude. Also, from (2.15) and (2.14)

$$\left|\frac{V_e(j\omega)}{V_i(j\omega)}\right|^2 = \left||H_{AP}(j\omega)|e^{-j\phi_e(\omega)} - 1\right|^2 \tag{2.19}$$

which implies

$$\left|\frac{V_e(j\omega)}{V_i(j\omega)}\right|^2 = (|H_{AP}(j\omega)|e^{-j\phi_e(\omega)} - 1)(|H_{AP}(j\omega)|e^{j\phi_e(\omega)} - 1). \tag{2.20}$$

Thus,

$$\left|\frac{V_e(j\omega)}{V_i(j\omega)}\right|^2 = |H_{AP}(j\omega)|^2 - 2|H_{AP}(j\omega)|\cos(\phi_e(\omega)) + 1.$$
 (2.21)

Let

$$|H_{AP}(j\omega)|^2 = 1 + \epsilon(\omega^2) \tag{2.22}$$

where  $\epsilon(\omega^2)$  is a real quantity denoting magnitude deviation. Therefore,

$$\left| \frac{V_e(j\omega)}{V_i(j\omega)} \right|^2 = 2 + \epsilon(\omega^2) - 2\sqrt{1 + \epsilon(\omega^2)} \cos(\phi_e(\omega)). \tag{2.23}$$

For  $|\epsilon(\omega^2)| \ll 1$ 

$$\left|\frac{V_e(j\omega)}{V_i(j\omega)}\right|^2 \approx 2 + \epsilon(\omega^2) - 2\cos(\phi_e(\omega))[1 + \epsilon(\omega^2)/2 - \epsilon(\omega^2)^2/8]$$
 (2.24)

which implies,

$$\left|\frac{V_e(j\omega)}{V_i(j\omega)}\right|^2 = 2[1 - \cos(\phi_e(\omega))] + \epsilon(\omega^2)[1 - \cos(\phi_e(\omega))] + \cos(\phi_e(\omega))\epsilon(\omega^2)^2/4. \quad (2.25)$$

For  $\phi_e(\omega) \ll 1 \, \mathrm{rad}$ 

$$\left| \frac{V_e(j\omega)}{V_i(j\omega)} \right|^2 \approx (\phi_e(\omega))^2 + \epsilon(\omega^2)^2 / 4.$$
 (2.26)

Using (2.22) in (2.26)

$$\left|\frac{V_e(j\omega)}{V_i(j\omega)}\right|^2 \approx (\phi_e(\omega))^2 + \frac{1}{4} \left(|H_{AP}(j\omega)|^2 - 1\right)^2. \tag{2.27}$$

Thus, for small magnitude and phase deviations in a real delay line, the total energy of the error is given by

$$\int_{-\infty}^{\infty} \frac{|V_e(j\omega)|^2}{2\pi} d\omega = \int_{-\infty}^{\infty} \frac{(\phi_e(\omega))^2}{2\pi} |V_i(j\omega)|^2 d\omega + \frac{1}{8\pi} \int_{-\infty}^{\infty} (|H_{AP}(j\omega)|^2 - 1)^2 |V_i(j\omega)|^2 d\omega.$$
(2.28)

The effect of the non-idealities of a real delay line in terms of its AC characteristics is quantified in (2.28). It shows that the energy of the error is the sum of two contributions, each weighted by the input spectrum: the integral of the square of the phase deviation, and the integral of the square of  $\epsilon(\omega^2)$ , which is a fourth order term in the magnitude. The error energy is not directly related to the peak deviation in group delay. Particularly, with a broadband input whose energy is spread across frequency, the cumulative effect of  $\epsilon(\omega^2)$  and  $\phi_e(\omega)$  over that range is what determines the output error.
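The accuracy of the small-deviation approximation can be gauged numerically. The following sketch compares the exact ratio in (2.19) with the approximation (2.27) for a few magnitude and phase deviations; the sample values and function names are assumptions chosen only for illustration.

```python
import cmath


def exact_error_ratio(mag, phi_e):
    """|V_e / V_i|^2 from (2.19): | |H_AP| exp(-j phi_e) - 1 |^2."""
    return abs(mag * cmath.exp(-1j * phi_e) - 1.0) ** 2


def approx_error_ratio(mag, phi_e):
    """Small-deviation form (2.27): phi_e^2 + (|H_AP|^2 - 1)^2 / 4."""
    eps = mag ** 2 - 1.0
    return phi_e ** 2 + eps ** 2 / 4.0


# (magnitude, phase error in rad) pairs, arbitrary small deviations.
samples = [(1.00, 0.01), (0.99, 0.00), (0.98, 0.02), (1.05, 0.05)]

for mag, phi in samples:
    exact = exact_error_ratio(mag, phi)
    approx = approx_error_ratio(mag, phi)
    print(f"|H| = {mag:.2f}, phi_e = {phi:.2f} rad: "
          f"exact = {exact:.3e}, approx = {approx:.3e}")
```

For deviations of a few percent in magnitude and a few hundredths of a radian in phase, the two expressions agree closely, which supports using (2.28) for error budgeting in that regime.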

From (2.18) and (2.28)

$$E_{rms}/v_{pp} = \sqrt{\frac{1}{T_p v_{pp}^2} \int_{-\infty}^{\infty} (v_0(t) - v_i(t - T_d))^2 dt} \tag{2.29}$$

$$= \sqrt{\frac{1}{2\pi T_p v_{pp}^2} \int_{-\infty}^{\infty} (\phi_e(\omega))^2 |V_i(j\omega)|^2 d\omega + \frac{1}{8\pi T_p v_{pp}^2} \int_{-\infty}^{\infty} (|H_{AP}(j\omega)|^2 - 1)^2 |V_i(j\omega)|^2 d\omega}. \tag{2.30}$$

#### 2.2.1 Error due to in-band group delay ripple

This section estimates the effect of group delay ripple of the EGD-APF having group delay characteristics of Fig. 2.4 on a wideband pulse.

In most wideband beamforming systems the Gaussian monopulse and its derivatives are used as inputs [32][17]. The width of a Gaussian monopulse is defined as the interval between the negative pulse reaching half the negative peak and the positive pulse reaching half the positive peak. This is commonly referred to as the full width at half maximum (FWHM). 2×FWHM encompasses the entire pulse. The APF having the group delay response of the EGD filter in Fig. 2.4 is fed with a Gaussian monopulse. The pulse and its spectrum are shown in Fig. 2.5. To be consistent with the inputs used in literature, the monopulse spectrum is chosen such that it drops to about 25 dB below its maximum at the filter's band-edge [17],[27].

Figure 2.5: (a) Gaussian monopulse having FWHM  $\approx$  2 s. (b) Normalized spectrum of (a).

The phase error  $\phi_e(\omega)$  is obtained from the group delay ripple of the EGD filter with delay characteristics of Fig. 2.4. A flat magnitude response is assumed. Using  $|H_{AP}(\omega)| = 1$ , (2.30) gets modified as

$$E_{rms}/v_{pp} = \sqrt{2\frac{1}{2\pi T_p v_{pp}^2} \int_0^\infty (\phi_e(\omega))^2 |V_i(j\omega)|^2 d\omega}.$$
 (2.31)

The multiplication factor of two in (2.31) accounts for the one sided spectral density. However, since (2.30) is valid only for  $\phi_e(\omega) \ll 1$  rad, the upper integration limit of (2.31) has to be changed to a finite frequency  $\omega_H$ . A convenient limit is the delay bandwidth of the APF, which from Fig. 2.4 is approximately equal to  $2\pi$  rad/s. However, recognizing that the error energy is proportional to  $|\phi_e(\omega)V_i(j\omega)|^2$ , it is necessary to have an integration limit which also encompasses most of the energy of  $V_i(j\omega)$ . It can be numerically worked out that the energy content of the monopulse of Fig. 2.5 beyond  $2.1\pi$  rad/s is more than 40 dB below its total energy<sup>1</sup>. Also, the maximum phase error within this frequency range (0 to 1.05 Hz) is approximately 0.01 rad. Hence  $\omega_H$  has been chosen to be  $2.1\pi$  rad/s. Thus

$$E_{rms}/v_{pp} = \sqrt{2\frac{1}{2\pi T_p v_{pp}^2} \int_0^{2.1\pi} (\phi_e(\omega))^2 |V_i(j\omega)|^2 d\omega}.$$
 (2.32)

Evaluating (2.32) numerically with  $T_p=2\times {\rm FWHM}$  yields an rms error 49 dB below the peak-peak monopulse input.

# 2.3 Changing the delay

Beam steering in a broadband beamforming system necessitates the availability of variable delay elements. This section investigates a method to generate variable delays from the APF architectures in Fig. 2.2.



Figure 2.6: Group delay characteristics of an EGD filter with change in order. (a) Constant delay. (b) Constant bandwidth.

Consider the APF architecture in Fig. 2.2 using the LC ladder of Fig. 2.1(a). The low frequency group delay from  $V_i$  to  $V_n$  in Fig. 2.1(a) is equal to  $R \sum_k C_k$ . Since the APF provides double the delay of its lowpass counterpart, the group delay for the APF in Fig. 2.2 is  $2R \sum_k C_k$ . If the components have been adjusted to realize an EGD characteristic equation, these DC group delay values persist till the filter's band edge.
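The low frequency delay relation can be checked numerically for a small ladder. The sketch below estimates the group delay of  $2V_1/V_i - 1$  near DC for a third order 'capacitor first' ladder and compares it with  $2R\sum_k C_k$ , where the sum runs over the shunt capacitors  $C_1$  and  $C_3$  of that ladder. The topology, element values and function names are assumptions made for this illustration.

```python
import cmath


def allpass_response(omega, r, c1, l2, c3):
    """H_AP(j omega) = 2 V_1 / V_i - 1 for a third order 'capacitor first' ladder."""
    s = 1j * omega
    z_far = s * l2 + 1.0 / (s * c3)       # L2 in series with C3 (open far end)
    z11 = 1.0 / (s * c1 + 1.0 / z_far)    # C1 in parallel with that branch
    return 2.0 * z11 / (z11 + r) - 1.0


def low_frequency_group_delay(r, c1, l2, c3, omega=1e-4):
    """Estimate -d(phase)/d(omega) near DC with a finite difference."""
    phase_1 = cmath.phase(allpass_response(omega, r, c1, l2, c3))
    phase_2 = cmath.phase(allpass_response(2.0 * omega, r, c1, l2, c3))
    return -(phase_2 - phase_1) / omega


# Arbitrary normalized element values (assumption, for illustration only).
R, C1, L2, C3 = 1.0, 1.0, 2.0, 3.0

print("numerical APF group delay near DC:", low_frequency_group_delay(R, C1, L2, C3))
print("2 R (C1 + C3)                    :", 2.0 * R * (C1 + C3))
```

The inductor value does not enter the low frequency delay in this example; it only shapes how the delay evolves towards the band edge.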

<sup>1</sup>Depending on the desired amount of accuracy this limit can be modified.

The higher the order of the filter, the higher is the filter's bandwidth for the same group delay. This is shown in Fig. 2.6(a). A corollary to this observation is that, for the same bandwidth, it is possible to get higher delays by increasing the filter's order. This is shown in Fig. 2.6(b).



Figure 2.7: LC ladder topologies for capacitor first architectures of Fig. 2.1 for (a) third order, (b) second order and (c) first order.



Figure 2.8: Single ended representation of the fully differential variable order all-pass filter. Transconductors in each shaded box are turned on to configure the filter in the respective order. Inset: Single ended equivalent of the transconductor

To realize a tunable delay element, an equiripple group delay all-pass filter whose order can be changed has been implemented. Fig. 2.7 shows a conceptual realization of a variable order LC ladder whose order is changed from three to one. The component values of an EGD filter need to be changed with a change in the filter's order. Changing the order of the passive LC ladder is difficult because switches (not shown in the figure) are required in the signal path to disconnect a portion of the ladder, to provide a short circuit termination, and to change the element values. However, the active filter counterpart of the LC ladder can be more conveniently programmed because transconductors can be easily tuned. Additionally, active filters occupy much smaller area due to the absence of inductors.

In this work a  $G_m$–C realization of a tunable APF architecture has been implemented. The single ended block diagram is shown in Fig. 2.8. The transconductors marked as +/- have transconductance of  $G_m$  with appropriate polarities, and form the singly terminated LC ladder. The transconductors used for summing the input voltage,  $V_i$ , and the node voltage,  $V_1$ , are marked as -1 and +2 respectively. Enabling all the transconductors realizes a ninth order ladder. Progressively turning off the transconductors from the right, as shown in the shaded boxes, reduces the order. For a given order N, all transconductors up to  $V_N$  are turned on. Extending the analogy of the transmission line based APF in Fig. 2.3, this can be seen as changing the delay of the reflected pulse by changing the length of the transmission line.

Note that the architecture has only one parasitic pole, due to the capacitance  $C_p$  at the output summation node, which contributes to high frequency roll-off for all delay settings. The parasitic capacitances associated with the input and output of each ladder transconductor are absorbed in the ladder's integrating capacitances. Also, unlike the cascade of delay cell architectures of Fig. 1.10, neither the input nor the output is required to be multiplexed to vary the delay. The output parasitic pole at  $\omega_{3dB}=1/(R_{AP}C_p)$  contributes a delay of  $\approx R_{AP}C_p$  due to excess phase lag<sup>2</sup>, and signal dispersion due to magnitude droop. A constant excess delay in all paths of a beamformer acts as an offset delay and does not affect its beam patterns. Hence, this analysis concentrates on the error energy introduced by the magnitude roll-off contributed by the first order output pole. From (2.28), the energy is expressed as

$$\int_{-\infty}^{\infty} \frac{|V_e(\omega)|^2}{2\pi} d\omega = \int_{-\infty}^{\infty} \frac{1}{8\pi} (1 - |H_{AP}(\omega)|^2)^2 |V_i(\omega)|^2 d\omega$$
 (2.33)

<sup>2</sup>Assuming a high frequency output pole beyond  $\omega_H$

Expressing the response of the output pole as  $H_{AP}(\omega) = 1/(1+j\omega/\omega_{3dB})$ , where  $\omega_{3dB}$  is the -3 dB frequency of  $|H_{AP}(\omega)|$ , and considering  $\omega \ll \omega_{3dB}$ ,

$$1 - |H_{AP}(\omega)|^2 \approx \left(\frac{\omega}{\omega_{3dB}}\right)^2 \tag{2.34}$$

The monopulse input of Fig. 2.5 is used in (2.33) and (2.30) and integrated till  $\omega_H$  to get

$$E_{rms} = \sqrt{2 \frac{1}{2 \text{FWHM}} \int_0^{\omega_H} \frac{1}{8\pi} \left(\frac{\omega}{\omega_{3dB}}\right)^4 |V_i(\omega)|^2 d\omega}$$
 (2.35)

Numerical computation of (2.35) shows that to get an rms error due to signal dispersion 40 dB below the peak-peak amplitude of the monopulse, the location of the output pole, i.e.,  $\omega_{3dB}/2\pi$ , needs to be at least 1.8 times the delay bandwidth of the APF. Since this is the only parasitic pole, efforts can be made to push it out and keep the magnitude response almost flat regardless of the delay setting. This property enables the realization of large delays in a given bandwidth. The capacitors are made programmable based on the order such that an EGD transfer function is realized for any given order.
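The approximation (2.34) for the output pole can be examined with a short calculation. The sketch below compares the exact magnitude deficit of a first order pole with  $(\omega/\omega_{3dB})^2$ . The function names and the sample frequency ratios are illustrative assumptions; the ratio 1/1.8 corresponds to the delay band edge when the pole is placed at 1.8 times the delay bandwidth.

```python
def exact_deficit(x):
    """1 - |H|^2 for H = 1 / (1 + j x), with x = omega / omega_3dB."""
    return 1.0 - 1.0 / (1.0 + x * x)


def approx_deficit(x):
    """Low frequency approximation (2.34): (omega / omega_3dB)^2."""
    return x * x


# Sample ratios omega / omega_3dB (assumed values for illustration).
for x in (0.05, 0.1, 0.3, 1.0 / 1.8):
    print(f"omega/omega_3dB = {x:.3f}: exact = {exact_deficit(x):.5f}, "
          f"approx = {approx_deficit(x):.5f}")
```

Since the exact deficit equals  $x^2/(1+x^2)$  with  $x = \omega/\omega_{3dB}$ , the approximation never underestimates it, so an error estimated using (2.34) errs on the conservative side.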

The values of the integrating capacitors in the APF of Fig. 2.8 have to be changed according to the selected order to realize the group delay characteristics of Fig. 2.6(b). For either EGD or Bessel filter the capacitance values increase further down the ladder. Table 2.1 lists their values normalized to  $C_1$ .

Table 2.1: Values of integrating capacitors in Fig. 2.8 for all orders normalized to the minimum capacitance,  $C_1$ .

| Filter's order | $C_1$ | $C_2$ | $C_3$ | $C_4$ | $C_5$ | $C_6$ | $C_7$ | $C_8$ | $C_9$ |
|----------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 2              | 1              | 2.45  | -     | -     | -     | -     | -     | -     | -     |
| 3              | 1              | 2.18  | 3.44  | -     | -     | -     | -     | -     | -     |
| 4              | 1              | 2.12  | 2.88  | 4.03  | -     | -     | -     | -     | -     |
| 5              | 1              | 2.10  | 2.74  | 3.20  | 4.81  | -     | -     | -     | -     |
| 6              | 1              | 2.09  | 2.69  | 2.96  | 3.66  | 5.22  | -     | -     | -     |
| 7              | 1              | 2.09  | 2.65  | 2.85  | 3.31  | 3.85  | 5.96  | -     | -     |
| 8              | 1              | 2.09  | 2.64  | 2.81  | 3.16  | 3.42  | 4.29  | 6.24  | -     |
| 9              | 1              | 2.09  | 2.63  | 2.78  | 3.08  | 3.22  | 3.80  | 4.45  | 6.96  |

# 2.4 Delay tunability: fine tuning

The APF architecture of Fig. 2.8 achieves a three bit delay tunability as the filter's order is varied from two to nine. It is however desirable to have finer resolution of delays to steer a beam in a beamforming system at finer angles. If the delay steps obtained by changing filter's order are denoted as coarse, the finer steps in between them can be obtained by tuning the filter's delay bandwidth. Since the group delay of the filter is inversely proportional to its bandwidth, an increase in delay can be realized by reducing the latter. This technique of fine tuning sacrifices the filter's bandwidth, and increases the rms error.

#### 2.4.1 Errors due to fine tuning

Fine tuning by reducing the filter's bandwidth leads to droop in the group delay characteristics. This increases the pulse shape distortion.



Figure 2.9: (a) Group delay characteristics of APF with fine tuning. (b) Transient response of the APF with the monopulse input of Fig. 2.5.

To evaluate this, the monopulse of Fig. 2.5 is fed to the APF of Fig. 2.8, and both the input and the output are captured. The two waveforms are scaled so that their peak-peak amplitudes match, and shifted in time until their positive peaks coincide. Equalizing the peaks of the output and input is not strictly necessary. However, since most beamforming systems use gain tuning, this is a valid assumption. The error is defined as the rms difference between the aligned waveforms over a duration of 2×FWHM, which encompasses the entire pulse.

Figure 2.10: Simulated rms error between the outputs and the delayed and scaled input of Fig. 2.9(b).

Fig. 2.9(a) shows the AC response for all orders including fine tuning. Fig. 2.9(b) shows the output of the APF having group delay characteristics of Fig. 2.9(a) when fed with a monopulse of Fig. 2.5. Fig. 2.10 shows the simulated rms error for all orders including fine tuning versus the corresponding group delay of the filter from Fig. 2.9(b). The bandwidth reduction factor due to fine tuning is the highest when the low frequency group delay of the first order filter is tuned to match the nominal delay of the second order. This is validated by the steep increase of the rms error of the first order filter with fine tuning. Due to this reason the first order configuration was omitted and the filter was made programmable from two to nine.



Figure 2.11: (a) Simulated phase errors for the delay response of Fig. 2.9(a). (b) RMS error in  $2 \times$  FWHM of a Gaussian monopulse corresponding to the phase error of (a).

The above analysis can also be done using the error equation (2.30), as in Section 2.2.1.

As before, the effect of the increased variation of group delay on the Gaussian monopulse input is estimated by modifying (2.30) as

$$E_{rms}/v_{pp} = \sqrt{\frac{2}{2\pi T_p v_{pp}^2} \int_0^{\omega_H} (\phi_e(\omega))^2 |V_i(j\omega)|^2 d\omega}$$
 (2.36)

where  $\omega_H$  is as defined in Section 2.2.1, and

$$\phi_e(\omega) = -\int_0^\omega \left(GD(\omega') - \zeta GD(0)\right)d\omega'. \tag{2.37}$$

$GD(\omega)$  represents the group delay at a frequency  $\omega$  rad/s and is extracted from Fig. 2.9(a). The parameter  $\zeta$  is varied to minimize  $E_{rms}/v_{pp}$  in (2.36). The time domain equivalent of this optimization is to minimize the error between the output and the delayed input by shifting the input until it aligns with the output with minimum error.

Fig. 2.11(a) shows the resulting phase errors. Fig. 2.11(b) shows the computed rms error with respect to the signal peak for all the delay settings of Fig. 2.9(a). It can be seen that even though the error plots extracted from time and frequency domain analyses are similar, the rms errors extracted from the AC analysis are lower than those from the transient analysis. This is because aligning the peak of the output pulse with that of the delayed input does not necessarily satisfy the condition of minimum error.
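The optimization over  $\zeta$  in (2.36) and (2.37) can be sketched as follows. Because  $\phi_e(\omega)$  is linear in  $\zeta$ , the weighted integral is quadratic in  $\zeta$  and its minimum has a closed form. The group delay droop profile and the input spectrum used here are hypothetical placeholders, not data extracted from Fig. 2.9(a) or Fig. 2.5, and the function names are assumptions.

```python
import math


def trapz(y, x):
    """Trapezoidal integral of samples y over the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def phase_error(omega, gd, zeta):
    """phi_e from (2.37): -(running integral of GD) + zeta * GD(0) * omega."""
    phi, running = [], 0.0
    for i, w in enumerate(omega):
        if i > 0:
            running += 0.5 * (gd[i] + gd[i - 1]) * (w - omega[i - 1])
        phi.append(-running + zeta * gd[0] * w)
    return phi


def error_integral(omega, gd, spectrum_sq, zeta):
    """Integral of phi_e^2 |V_i|^2 over the grid, as in (2.36)."""
    phi = phase_error(omega, gd, zeta)
    return trapz([s * p * p for s, p in zip(spectrum_sq, phi)], omega)


def optimal_zeta(omega, gd, spectrum_sq):
    """Closed-form minimizer: phi_e = a + zeta * b, so zeta = -<a, b> / <b, b>."""
    a = phase_error(omega, gd, 0.0)
    b = [gd[0] * w for w in omega]
    num = trapz([s * ai * bi for s, ai, bi in zip(spectrum_sq, a, b)], omega)
    den = trapz([s * bi * bi for s, bi in zip(spectrum_sq, b)], omega)
    return -num / den


# Hypothetical data: 5% parabolic group delay droop and a monopulse-like spectrum.
OMEGA_H = 2.1 * math.pi
omega = [OMEGA_H * i / 400 for i in range(401)]
gd = [4.0 * (1.0 - 0.05 * (w / OMEGA_H) ** 2) for w in omega]
spectrum_sq = [w * w * math.exp(-(0.5 * w) ** 2) for w in omega]

zeta = optimal_zeta(omega, gd, spectrum_sq)
print("optimal zeta            :", zeta)
print("error integral, zeta = 1:", error_integral(omega, gd, spectrum_sq, 1.0))
print("error integral, optimal :", error_integral(omega, gd, spectrum_sq, zeta))
```

With a drooping group delay the optimal  $\zeta$  is slightly less than one, which corresponds to referencing the output to a delay slightly smaller than the DC group delay.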

## Chapter 3

## Implementation of the Prototype All-pass Filter

This chapter presents the design and implementation details of the proposed variable order all-pass filter of Fig. 2.8.

# 3.1 Integrating capacitors

The integrating capacitors in the APF of Fig. 2.8 have to be programmed for each order to satisfy the values of Table 2.1. To do this, a set of switchable capacitors is used at each node so that the desired capacitance can be obtained by progressively switching them on. An nMOS transistor with its gate set at a bias voltage,  $V_{cm}$  ( $\approx 860\,\mathrm{mV}$ ), bulk tied to ground and a control voltage,  $V_{ctl}$ , applied to the shorted drain and source terminals has a capacitance profile as shown in Fig. 3.1. Since the bulk is always tied to ground, and  $V_{GB}$  is more than the threshold voltage, the nMOS is always in inversion. When  $V_{ctl}$  is close to 0, the drain/source terminals provide the mobile carriers (electrons) required to respond to high frequency excitations<sup>1</sup> at the gate. When  $V_{ctl}$  is close to  $V_{DD}$ , the drain/source terminals are reverse biased with respect to the oxide/bulk interface. The channel charge can no longer respond to any high frequency excitation at the gate. Thus the gate capacitance varies from a high value (2.8 fF) to a low value (0.7 fF) as the control voltage is varied from 0 to  $V_{DD}$  for an nMOS having  $W/L=1.2\,\mu\mathrm{m}/0.2\,\mu\mathrm{m}$  in a 0.13  $\mu$m CMOS technology. Hence, a switchable capacitor can be realized by tying  $V_{ctl}$  to the control bits [33]. Moreover, since the device capacitance is fairly constant at the maximum and minimum control voltage levels, non-linearities of the integrating capacitors do not significantly contribute to the filter's distortion. However, in order to realize the C-V profile of Fig. 3.1, it is necessary to engineer an incremental ground at the control voltage terminal.

<sup>1</sup>High frequency in this context refers to the frequency much greater than that supported by the thermal generation of mobile carriers in the channel.
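The programming of such a switchable capacitor bank can be illustrated with a small calculation. The sketch below uses the two unit capacitance values quoted above (2.8 fF and 0.7 fF) and picks the number of units to switch on for a target capacitance. The bank size, the target values and the function names are hypothetical and do not describe the actual array used in the prototype.

```python
C_ON_FF = 2.8    # unit gate capacitance with V_ctl near 0 (value quoted in the text)
C_OFF_FF = 0.7   # unit gate capacitance with V_ctl near V_DD (value quoted in the text)


def bank_capacitance(n_units, n_on):
    """Total capacitance (fF) of a bank with n_on of n_units devices switched on."""
    if not 0 <= n_on <= n_units:
        raise ValueError("n_on must lie between 0 and n_units")
    return n_on * C_ON_FF + (n_units - n_on) * C_OFF_FF


def units_on_for_target(n_units, target_ff):
    """Number of devices to switch on so the bank is closest to target_ff."""
    ideal = (target_ff - n_units * C_OFF_FF) / (C_ON_FF - C_OFF_FF)
    return min(n_units, max(0, round(ideal)))


N_UNITS = 10  # hypothetical bank size

for target in (7.0, 12.0, 17.5, 28.0):
    k = units_on_for_target(N_UNITS, target)
    print(f"target = {target:5.1f} fF -> {k:2d} units on, "
          f"realized = {bank_capacitance(N_UNITS, k):5.1f} fF")
```

Under these assumptions the bank spans a 4:1 range between all-off and all-on, in steps equal to the difference between the two unit capacitance values.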



Figure 3.1: nMOS gate capacitance versus control voltage.



Figure 3.2: (a) Utilization of differential signaling to generate virtual shorts at drain/source terminals of the MOS transistors. (b) Layout arrangement of the structure [34].

Utilizing the differential nature of the filter, virtual shorts have been realized at the source/drain terminals of the transistors by placing the devices corresponding to the anti-symmetric gate voltage swings in alternating columns [34]. Fig. 3.2(a) and Fig. 3.2(b) show the schematic representation and the layout arrangement of the capacitor array. Since the control voltage is applied at either end of the array, no lateral DC electric field exists along any of the device channels, thus making their channel potentials identical. This enables the removal of the source/drain channel contacts from all of the intermediate transistors, thus allowing closer spacing between adjacent devices.

To realize higher bandwidths, the filter's integrating capacitances have to be reduced, but the minimum capacitance at any node in Fig. 2.8 is set by the parasitic capacitances from the transconductors and routing. From Table 2.1 it is evident that the EGD APF has increasing capacitor values as one progresses down the ladder ( $V_1$  to  $V_9$  in Fig. 2.8) when equal  $G_m$s are used for all the transconductors. Therefore, the maximum realizable bandwidth in a given process is limited by the smallest parasitic capacitance which is present at the first node ( $V_1$ ) in Fig. 2.8. To maximize the bandwidth, the filter was designed such that  $C_1$  comprises only parasitic capacitances and has a fixed value. When the APF is programmed to an order N, in addition to all transconductors up to the  $N^{th}$  node, the transconductor driven by the  $N^{th}$  node is also switched on to retain its input capacitance value (e.g. for the second order case,  $G_{m2+}$  in Fig. 2.8 was also kept on<sup>2</sup>).

#### 3.2 Transconductor

The APF prototype in Fig. 2.8 assumes that the transconductors'  $G_m$ s are ideal. In reality all transconductors have non-zero output conductances and capacitances, and parasitic poles in their transfer functions. Finite output resistance of a transconductor in a  $G_m$ -C filter has the same effect as series loss in an inductor or a shunt loss in a capacitor in an LC ladder. In that case, the driving point impedance,  $Z_{11}(s)$  in Fig. 2.2 will not be a ratio of exclusively even and odd polynomials of s as assumed in (2.1). Transconductor parasitic poles also have similar detrimental effects. Load capacitance at the output node can be absorbed into the filter's integrating capacitances. However, the larger the parasitic capacitance, the smaller is the filter's achievable bandwidth. Hence, a transconductor with a high DC gain, no parasitic poles and small output capacitance (high unity gain bandwidth) is desired.

The design and analysis of the transconductor are outlined in Chapter 7. A brief explanation is given below. In order to faithfully emulate a lossless LC ladder, the output conductance of the transconductor is cancelled using an incremental negative conductance. This is shown in Fig. 3.3(a). The bias current for the negative conductance,  $G_N$, is derived from a servo loop to ensure that it tracks the net positive output conductance of the transconductor. The principle of operation of the loop is shown in Fig. 3.3(b). Fig. 3.3(c) shows the transistor level schematic. Minimum length

<sup>&</sup>lt;sup>2</sup> This is not strictly required. The capacitors  $C_{2-9}$  can be varied to account for the change in capacitance at  $V_N$  when the transconductor driven by  $V_N$  is turned off.



Figure 3.3: (a) Transconductor,  $G_m$  loaded with negative conductance  $G_N$ . (b) Principle of generating a transconductance,  $G_N$ , to track the parasitic conductance,  $g_{ds}$ . (c) Schematic representation of (a).

transistors are used in the transconductor to minimize the parasitic capacitances and maximize the achievable bandwidth. Since the output conductance is cancelled, this does not compromise the DC gain. Without output conductance cancellation, the DC gain of the transconductor is 14 dB and when cancellation is incorporated, it improves to 48 dB across PVT without mismatch, and has average and minimum values of 46 dB and 34 dB with mismatch. The key advantage of incorporating output conductance cancellation is that minimum length transistors can be used in the transconductors, maximizing the achievable bandwidth in a given technology. The parasitic capacitance contributed by the negative conductance circuit is  $\approx 20$  fF. The main transconductor's output capacitance is  $\approx 140$  fF. Hence, the negative conductance circuit does not significantly compromise the bandwidth.
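The gain figures above can be translated into the degree of cancellation they imply. The sketch below assumes that the DC gain equals $G_m$ divided by the net output conductance and that $G_m$ itself is unchanged when $G_N$ is added; both are simplifying assumptions made for illustration, not statements from the design.

```python
def db_to_ratio(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def residual_conductance_fraction(gain_before_db, gain_after_db):
    """Fraction of the original output conductance left after cancellation.

    Assumes DC gain = G_m / (net output conductance) with G_m unchanged.
    """
    return db_to_ratio(gain_before_db) / db_to_ratio(gain_after_db)


# Gains quoted in the text: 14 dB uncancelled, 48 dB nominal, 34 dB worst case.
nominal_residual = residual_conductance_fraction(14.0, 48.0)
worst_residual = residual_conductance_fraction(14.0, 34.0)

# Capacitance added by the negative conductance relative to the main output.
cap_overhead = 20.0 / 140.0
```

Under these assumptions roughly 2% of the original output conductance remains in the nominal case and about 10% in the worst mismatch case, while the added capacitance is about 14% of the transconductor's own output capacitance.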

The speed of a transconductor can be gauged from the unity gain frequency of its short circuit current gain, commonly referred to as the  $F_T$  of the transconductor. To maximize  $F_T$, it is desirable to bias the transconductor at high current densities. Fig. 3.4(a) shows the dependence of the  $F_T$  of the transconductor of Fig. 3.3 on its quiescent current density. It can be observed that  $F_T$  tends to saturate at higher current levels. Hence, the transconductors have been biased at an  $F_T$  of around 100 GHz, which corresponds to a current density of 100  $\mu$A/ $\mu$m, and  $G_m/W$  of  $\approx$  0.7 mS/ $\mu$m. This is shown in Fig. 3.4(b).
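As a rough cross-check of the bias point, the quoted $F_T$ and $G_m/W$ can be combined into an equivalent capacitance per unit width. The sketch assumes the textbook single-pole relation $F_T = G_m/(2\pi C)$, where $C$ is the total capacitance seen by the input for a short-circuited output; the resulting number is an inferred illustration, not a value reported in this work.

```python
import math

F_T_HZ = 100e9            # bias-point F_T quoted in the text
GM_PER_UM_S = 0.7e-3      # G_m / W quoted in the text, in S per um


def capacitance_per_width(gm_per_um, f_t):
    """Equivalent capacitance per um of width, assuming F_T = G_m / (2*pi*C)."""
    return gm_per_um / (2.0 * math.pi * f_t)


c_per_um_ff = capacitance_per_width(GM_PER_UM_S, F_T_HZ) * 1e15
```

With these inputs the equivalent capacitance works out to a little over 1 fF per micrometre of width.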



Figure 3.4: (a) Dependence of  $F_T$  of the transconductor in Fig. 3.3 on current densities in a 0.13  $\mu$m CMOS process. (b)  $G_m$  of the transconductor per unit width corresponding to the current densities of (a).

#### 3.2.1 Effect of device mismatch and $G_m$ sizing

Fig. 3.4 shows the current density at which to bias the transconductor to achieve the desirable transconductance per unit width. Therefore to save power it is of interest to use minimum possible transistor widths. To determine this, the effect of random device mismatch on the APF's characteristics is evaluated in this section.

As before, a monopulse is fed to the APF of Fig. 2.8 whose  $G_m$ s are mismatched, and its output captured. To quantify the effect of the pulse shape distortion due to device mismatches the time domain error expression of (2.29) is used and integrated over  $2\times FWHM$ .

Fig. 3.5(a) shows MATLAB© simulation results of the group delay of the ninth order APF with  $G_m$s having 2% mismatch standard deviation. This results in a group delay variation of  $\pm 200 \,\mathrm{ps}\,(\pm 11.7\%)$  for the ninth order filter. The corresponding APF output is shown in Fig. 3.5(b), and this results in an rms error which is about 35 dB below the peak-peak value of the input pulse as shown in Fig. 3.6. Transistor level Monte-Carlo simulations show that a  $G_m$  standard deviation of 2% is obtained when the transconductors have  $W \times L = 44\mu \times 0.12\mu$. This corresponds to  $G_m \approx 28 \,\mathrm{mS}$, which also fixes the parasitic capacitance,  $C_1$, associated with the first node of the APF of Fig. 2.8. This sets the bandwidth of the APF to 2 GHz in the typical corner. On-chip capacitor mismatches are smaller than the  $G_m$  mismatches due to their dependence primarily on lateral dimensions [35]. Simulations show that nMOS capacitances in inversion have a mismatch  $\sigma/\mu \leq 0.1\%$. Therefore their mismatch contribution was neglected in the analysis.

Figure 3.5: (a) Group delay characteristics of the ninth order EGD filter with  $G_m$  mismatch with standard deviation of 2%. (b) Computed transient response of the EGD APF having delay characteristics of (a) when excited with a monopulse having FWHM=1 ns.

Figure 3.6: RMS error between the outputs and delayed and scaled input of Fig. 3.5(b).
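The sizing numbers in this section can be cross-checked against the bias point of Fig. 3.4 with simple arithmetic. The sketch below assumes that the quoted $G_m/W \approx 0.7$ mS/$\mu$m applies directly to the full 44 $\mu$m width, and that the $\pm 11.7\%$ figure is referred to the nominal ninth order delay; both are interpretations made for illustration.

```python
WIDTH_UM = 44.0            # transconductor width from the Monte-Carlo sizing
GM_PER_UM_MS = 0.7         # approximate G_m / W at the chosen bias point
GM_REPORTED_MS = 28.0      # approximate G_m quoted for this width

DELAY_VARIATION_PS = 200.0
DELAY_VARIATION_FRACTION = 0.117

# G_m predicted from the per-width figure, and its ratio to the quoted value.
gm_from_density_ms = WIDTH_UM * GM_PER_UM_MS
gm_ratio = GM_REPORTED_MS / gm_from_density_ms

# Nominal ninth-order delay implied by the +/-200 ps (+/-11.7%) variation.
implied_delay_ps = DELAY_VARIATION_PS / DELAY_VARIATION_FRACTION
```

The per-width estimate lands within about 10% of the quoted 28 mS, which is consistent with both figures being approximate, and the implied nominal delay is close to 1.7 ns.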

## 3.2.2 Bandwidth tuning

The bandwidth of a  $G_m$ -C filter is controlled by controlling the transconductors'  $G_m$ s. Their transconductances can be tuned to a desired value using various techniques. [36]



Figure 3.7: (a) Principle demonstrating  $G_m$  locking to  $\alpha I_{ref}/(2\Delta V)$ . (b) Generation of  $2\Delta V$ . (c) Schematic representation of (a).

demonstrates a transistor model independent technique based on linear behavior for small signals to stabilize an on-chip  $G_m$  to a targeted conductance value. In this work a constant  $G_m$  generation scheme which is similar to [36], but whose transconductance can be tuned by controlling a current source has been used. Details of the architecture are provided in Chapter 8. A brief description of the same is outlined below.

The differential incremental picture of the principle is shown in Fig. 3.7(a).  $G_m$  is the transconductance which is to be fixed. A DC incremental voltage  $2\Delta V$  at its input produces an output current  $2G_m\Delta V$, which is compared to a fixed current  $\alpha I_{ref}$, and the value of the transconductance is tuned by changing its bias current,  $I_{gm}$, through negative feedback. The loop settles at a value of  $I_{gm}$  for which  $2G_m\Delta V = \alpha I_{ref}$, thus making  $G_m = \alpha I_{ref}/(2\Delta V)$. Constant  $\alpha I_{ref}$  and  $\Delta V$  ensure a constant  $G_m$.
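The settling behaviour of this loop can be sketched with a discrete-time toy model. The square-law dependence $G_m = k\sqrt{I_{gm}}$, the value of $\Delta V$, the loop gain and the starting bias current below are all hypothetical and serve only to show that the feedback drives $G_m$ to $\alpha I_{ref}/(2\Delta V)$ regardless of the device constant $k$; the actual loop is described in Chapter 8.

```python
import math


def gm_target(alpha_iref, delta_v):
    """Transconductance at which the loop settles: alpha*I_ref / (2*dV)."""
    return alpha_iref / (2.0 * delta_v)


def gm_square_law(i_bias, k):
    """Hypothetical device model, G_m = k * sqrt(I_gm)."""
    return k * math.sqrt(i_bias)


def settle_bias(alpha_iref, delta_v, k, i_start, loop_gain=2.0, steps=200):
    """Integrate the current error until the bias current stops changing."""
    i_gm = i_start
    for _ in range(steps):
        # Error between the reference current and the transconductor output.
        error = alpha_iref - 2.0 * gm_square_law(i_gm, k) * delta_v
        i_gm += loop_gain * error
    return i_gm


# Hypothetical numbers: dV = 20 mV, alpha*I_ref chosen for a 28 mS target,
# and k chosen so that 28 mS is reached at a 4.4 mA bias current.
DELTA_V = 20e-3
ALPHA_IREF = 1.12e-3
K = 28e-3 / math.sqrt(4.4e-3)

i_final = settle_bias(ALPHA_IREF, DELTA_V, K, i_start=1e-3)
gm_final = gm_square_law(i_final, K)
```

Changing `K` in this model changes the bias current at which the loop settles but not the final `gm_final`, which is the property that makes the scheme independent of the transistor model.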

The circuit implementation of the architecture is shown in Fig. 3.7(b, c). Since the ratio of  $I_{ref}$  to  $2\Delta V$  is important, the exact value of  $2\Delta V$  is not critical, as the effect of process and voltage variation can be calibrated by tuning  $I_{ref}$ . However, it needs to be temperature invariant. For the purpose of demonstration of this architecture  $2\Delta V$  is generated by a resistive divider as shown in Fig. 3.7(b). This makes the effective differential input voltage to the transconductor of Fig. 3.7(a) to be equal to  $\Delta V_1 + \Delta V_2$ . The resistors, R1-4 are sized to ensure  $\Delta V_1 \approx \Delta V_2 \approx \Delta V$ . Mismatch between  $\Delta V_1$  and  $\Delta V_2$  results in a non-zero common mode input to the transconductor in Fig. 3.7(a). Since the transconductor is fully differential, it rejects any common mode input, and makes the architecture insensitive to mismatch between  $\Delta V_1$  and  $\Delta V_2$ . Hence, for the

rest of the discussion it is assumed that  $\Delta V_1 = \Delta V_2 = \Delta V$ .

The transistors marked in black in Fig. 3.7(c) form the transconductor whose transconductance is to be stabilized.  $T_{err}$  is a single stage error amplifier with nMOS input pair. The combination of  $T_{err}$  and M8 forms the error amplifier, and the fixed current,  $I_{ref}$ , is mirrored into  $V_{op|om}$  as  $\alpha I_{ref}$ . Tuning the mirroring ratio,  $\alpha$ , or the current reference  $I_{ref}$  ensures a proportional tuning of  $G_m$ , thus tuning the bandwidth of the filter. In the prototype in this work  $I_{ref}$  has been replaced with an off-chip resistor,  $R_{ext}$ . Since the common mode feedback loop sets  $(V_{op} + V_{om})/2$  to  $V_{cm}$ , and the overall negative feedback makes the differential input to  $T_{err}$  zero,  $V_{op} = V_{om} = V_{cm}$ . The opamp based current mirroring scheme makes  $I_{ref} = V_{cm}/R_{ext}$ . This sets  $G_m = \alpha V_{cm}/(R_{ext}\Delta V)$ . Tuning  $R_{ext}$  tunes the transconductances, and thus the filter's bandwidth. The transconductance can in principle be tuned by varying  $\Delta V$  as well. However, the trade-off associated with this can be understood as follows. Since this transconductance tracking scheme is based on the principle of small signal input,  $\Delta V$  needs to be much smaller than the overdrive voltage of M1, 2. If the transconductor needs to be tuned for smaller transconductance, the overdrive of M1, 2 needs to be decreased from its nominal value. On the other hand, since  $G_{m|M1,2} = \alpha I_{ref}/\Delta V$ ,  $\Delta V$  needs to be increased from its nominal value. Decreasing overdrive and increasing  $\Delta V$  will introduce more non-linear incremental current from M1, 2 and the small signal based analysis will no longer hold good. This effect is not as severe when  $\Delta V$  is held constant, and  $I_{ref}$  is tuned to reduce the transconductance. 
Chopping and low pass filtering (details shown in Chapter 8) have been implemented to nullify the effects of device mismatch. Simulations show that  $G_{m|M1,2}$  varies within  $\pm 2\%$  across process, voltage and temperature variations. Mismatch simulations with 200 Monte-Carlo runs yield  $\sigma/\mu \le 2\%$.

# 3.3 Summing taps and gain tuning

In any beamforming architecture mismatches between different paths (from different antennas to the beamformer) lead to gain errors. Gain programmability, without affecting the filter's delay, provides a way to correct these mismatches and also introduces an additional degree of freedom to the beamformer to tailor the frequency response. In this work gain tuning has been implemented by changing the gain of the APF summing



Figure 3.8: Realization of the APF summer.

taps in Fig. 2.8. In a complete system such as the ones shown in Fig. 1.4, gain tuning is typically incorporated in various stages of the signal chain starting from the LNA. Having a programmable gain in the delay lines provides an additional degree of freedom to distribute the gains across the signal chain. Fig. 3.8 is a schematic representation of the summing taps used in the prototype APF of Fig. 2.8.  $V_{ip|m}$  and  $V_{1p|m}$  represent the differential inputs and the first state-variable of the filter in Fig. 2.8 respectively. Summing is done in the current domain using differential pairs, and the resultant current is passed through resistors,  $R_{AP} \approx 50\Omega$, to realize the all-pass output voltages.  $R_{AP}$  is chosen such that the output pole associated with the parasitic capacitance  $C_p$  is moved out to 5 GHz to minimize the pulse shape distortion due to magnitude droop. To maintain symmetry and reduce the effect of doping gradients across the IC, the unit taps were laid out in a common centroid pattern along one direction, i.e., the summing taps of Fig. 3.8 were laid out as  $T_1T_iT_1$. The summing taps have a supply voltage  $(V_{DDH})$  different from that of the filter's ladder. Reducing  $V_{DDH}$  lowers the  $V_{DS}$  of M1-2 and drives them from saturation into the triode region of operation. The  $G_m$  of a transistor biased in the triode region is a strong function of the drain to source voltage across it. This helps in tuning the summing tap  $G_m$s, thus realizing gain tuning. Since supply voltage modulation of the taps does not affect the filter ladder, the delay characteristics remain unaffected. A tunable voltage regulator is required to implement this scheme.
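The choice of output pole can be put into numbers with a short sketch. It assumes that the summer output behaves as a single-pole RC network formed by $R_{AP}$ and $C_p$; the capacitance computed below is therefore the value implied by a 5 GHz pole with 50 $\Omega$, not a reported parasitic.

```python
import math

R_AP_OHM = 50.0    # summing resistor quoted in the text
F_POLE_HZ = 5e9    # output pole frequency quoted in the text


def implied_capacitance(r_ohm, f_pole_hz):
    """Capacitance that places a single RC pole at f_pole_hz."""
    return 1.0 / (2.0 * math.pi * r_ohm * f_pole_hz)


def droop_db(f_hz, f_pole_hz=F_POLE_HZ):
    """Magnitude of a single-pole response at f_hz, in dB relative to DC."""
    return -10.0 * math.log10(1.0 + (f_hz / f_pole_hz) ** 2)


c_p_ff = implied_capacitance(R_AP_OHM, F_POLE_HZ) * 1e15
droop_at_band_edge_db = droop_db(2e9)
```

Under the single-pole assumption the pole contributes roughly 0.6 dB of droop at the 2 GHz band edge.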

#### 3.4 Distortion and noise

In  $G_m$–C filters the transconductors are the prime contributors to distortion. The distortion introduced by the transconductors is primarily due to the non-linear relationship of their input voltages to output currents. At lower frequencies where the currents through the filter's integrating capacitors are negligible, (almost) all the current into and out of each transconductor flows into or out of another identical transconductor to develop the node voltages. This current to voltage conversion is *almost an exact* inverse of the voltage to current conversion at each node. This (almost) masks the distortion effects at the filter's node voltages at low frequencies. As frequency increases, an appreciable portion of the transconductors' output currents flows into the integrating capacitors. Since the current sourced by a transconductor is now not completely sunk by another, the current to voltage conversion is no longer an exact inverse of its voltage to current counterpart. This introduces distortion in the voltage domain at higher frequencies [37].

The linearity of the transconductor in general is improved by biasing it at the maximum possible overdrive voltage within the constraints imposed by the supply voltage. The input pair M1–2, and the negative conductance transistors, M5–6, of the transconductor in Fig. 3.3 have been designed to work at a nominal overdrive of  $\approx 160\,\mathrm{mV}$  (for  $F_T$  of  $\approx 100\,\mathrm{GHz}$  at a typical corner) to maximize the linearity of the  $G_m$s. For fine delay tuning, the bandwidth is reduced by reducing the transconductors' bias currents. This increases the distortion to some extent. The distortion due to the non-linear integrating capacitors (Fig. 3.1) is insignificant in this implementation.

The transconductors of the APF contribute to random noise, among which the thermal noise is significant at high frequencies. In cases where the beamformer is used in the baseband (after downconversion) the effect of noise gets reduced due to the gain in the RF stages, and its distortion metrics become more significant. If the beamformer is used in the RF path, the importance of noise and distortion reverses.

When the filter's order is increased by one, two more transconductors become active. Each transconductor is noisy and non-linear. Because of additional noise sources and additional non-linear currents injected in the filter, it is expected that both noise and distortion increase with increasing filter order. The tradeoff between the dynamic range of this architecture and its power consumption is similar to that of a general active filter architecture. More details can be found in [39] and the references therein. However, there is a tradeoff between the effect of the transconductors' mismatch on the output characteristics of the APF and the output noise due to the transconductors, which is particular to this architecture. This is analysed in the following section.

#### 3.4.1 Design trade-offs: mismatch and noise



Figure 3.9: (a) Illustration of the effect of transconductor's mismatch on component values. (b) Noise analysis using reciprocity. (c) Schematic of the ninth order APF and the normalized signal swings at each state across frequencies.

The size (and hence, the power consumption) of the filter's transconductors depends on the effect of mismatch, noise and distortion of the transconductors on the filter's output. Without loss of generality consider the second order  $G_m$-C APF and its equivalent LC ladder representation in Fig. 3.9(a). Ideally, without any mismatch the

capacitor  $C_2$  at the node  $V_2$  appears as an inductor of value  $C_2/G_m^2$  at  $V_1$ . However, due to random mismatch if the transconductance of the two ladder transconductors changes to  $G_m + \delta g_m$ , the corresponding inductance changes to  $C_2(1-2\delta g_m/G_m)/G_m^2$  for small  $\delta g_m$ . This deviation causes distortion in the group delay flatness of the filter. In general for a higher order filter an integrating capacitance at node  $V_N$ , can deviate from its intended transformed reactance at  $V_1$  by  $2 \times (N-1) \times \delta g_m/G_m$ . Hence it is beneficial to increase the  $G_m$ s of successive stages in the ladder to mitigate the effect of component mismatches.
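The first-order expression for the perturbed inductance follows from a binomial expansion. The step below assumes, as in the text, that both ladder transconductors forming the gyrator deviate by the same $\delta g_m$ and that $\delta g_m \ll G_m$:

$$L_{eq} = \frac{C_2}{(G_m+\delta g_m)^2} = \frac{C_2}{G_m^2}\left(1+\frac{\delta g_m}{G_m}\right)^{-2} \approx \frac{C_2}{G_m^2}\left(1-\frac{2\,\delta g_m}{G_m}\right)$$

Under the same assumption, a capacitor at $V_N$ is reflected to $V_1$ through $N-1$ such transformations, each contributing a fractional error of about $2\,\delta g_m/G_m$, which gives the $2 \times (N-1) \times \delta g_m/G_m$ figure quoted above.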

To analyze the contributions of the transconductor's thermal noise at the APF output, consider the second order APF in Fig. 3.9(b)(i), where the noise sources from each of the ladder  $G_m$s have been lumped into an equivalent noise source  $(i_1, i_2)$  at each node. The summing taps have transconductances of  $-G_T$  and  $2G_T$. Fig. 3.9(b)(ii) is an adjoint representation of (i) [38] (without the input). The contribution of the  $j^{th}$  noise source in (i) at the output can be assessed by evaluating the transfer function from  $i_j$  to  $V_j$  in (ii). Assuming all  $i_j$s are of the same order, having a PSD of  $\eta kTG_m$, the total output noise PSD can be represented as

$$S_{v|APF} = \eta k T G_m R_{AP}^2 \sum_{j} \left(\frac{2G_T}{G_m}\right)^2 |H_j|^2 \tag{3.1}$$

where  $H_j$  is the transfer function from the input  $(V_i)$  of the ladder in (i) to the  $j^{th}$  node. Since the reciprocal ladder in (ii) is driven by  $2G_T$  instead of  $G_m$, the transfer function is scaled by  $2G_T/G_m$. For the ninth order filter, the transfer functions from the input to  $V_{1-9}$  are plotted in Fig. 3.9(c). Since the number of zeros in the transfer functions ( $H_j$s) decreases from  $V_1$  to  $V_9$, their magnitude roll-offs become sharper at higher frequencies. Due to this sharper high frequency roll-off, it follows from (3.1) that the integrated noise contributions of the transconductors in the stages near  $V_1$  are larger than those of the stages further away<sup>3</sup>. Hence, to mitigate noise it is beneficial to increase the  $G_m$s of consecutive stages in the ladder from  $V_9$  to  $V_1$. This requirement conflicts with mismatch mitigation. As a compromise between these two conflicting conditions, identical  $G_m$s have been chosen for all the filter's stages.

 $<sup>^3</sup>$ Exact analysis involves integrating the area under the curve under each  $|H_j|^2$  and taking into account the noise from an extra transconductor connected at  $V_1$ . For our filter the area under each of the curves sum up to [2.35, 2.03, 1.82, 1.85, 1.6, 1.6, 1.3, 1.17, 0.65] $\times$ 10 $^9$  V $^2$ Hz. This shows that noise contribution of the transconductors driving the nodes  $V_1$  to  $V_9$  mostly reduce further down the ladder.

## 3.5 Chip architecture



Figure 3.10: Simplified block diagram of the chip.



Figure 3.11: Output buffer architecture.

Fig. 3.10 shows the simplified block diagram of the prototype chip. APF is the variable order all-pass filter whose order and integrating capacitances are programmed by the 'Order select' digital logic. The filter's order is programmable from 2 to 9. The bias currents and distribution network required for cancelling the transconductors' output conductances and maintaining constant  $G_m$ s are produced in the 'Bias generator', which receives an external reference current and a chopping clock with a frequency of 2 MHz. An on-chip divide-by-two counter produces internal non-overlapping clocks, which are used for chopping the offsets in the servo loops within the Bias generator block. These are explained in Chapters 7 and 8. The filter's summing taps operate on a different supply,  $V_{DDH}$ , than the rest of the filter which runs on  $V_{DD}$ . The output MUX selects between the filter input and the output, thus helping in de-embedding the effects of package and test board parasitics. The polarity of the output is changed by enabling/disabling the control bit, FLIP, which helps in cancelling the input-output

feedthrough [40]. The architecture of the MUX is shown in Fig. 3.11, which consists of two transconductors, one each for the filter input and output. One of these two is selected by a digital logic block, which also produces signals  $f_i$  and  $f_{bi}$  to flip its polarity. The termination resistor R at the input of the APF is used to provide input matching and to de-embed the effect of the noise of the test setup. This is explained in Appendix A.



# **Chapter 4**

# Measurement Results of the Variable Delay All-Pass Filter



Figure 4.1: Chip micrograph and snapshot of the test board.

The test chip was fabricated in a standard  $0.13\,\mu m$  CMOS process, was housed in a QFN48 package and was mounted on a four layer printed circuit board. The chip occupies an active area of  $0.6\,\mathrm{mm^2}$, out of which the APF takes up  $0.29\,\mathrm{mm^2}$. The die photograph and the PCB snapshot are shown in Fig. 4.1. The chip was tested with a



Figure 4.2: Simplified measurement setup for characterizing the APF. VNA, SA and DSO refer to vector network analyzer, spectrum analyzer, and digital storage oscilloscope respectively.

ladder supply voltage  $(V_{DD})$  of 1.4 V, and common mode voltage,  $V_{cm}$ , of 860 mV. The summing tap supply voltage  $(V_{DDH})$  was varied from 0.95 to 1.75 V<sup>1</sup> to vary the gain. Fig. 4.2 shows the simplified test setup used for characterizing the test chip. Fig. 4.3 and



Figure 4.3: Measured frequency response of the variable order APF with  $V_{DD} = 1.4 \text{ V}$  for coarse delay settings.

4.4 show the measured magnitude and group delay responses of the APF for  $V_{DDH}$  of 1.75 V. The filter's response has been extracted with the techniques used in [40]. Each color corresponds to the frequency response of a respective filter order between two and nine. Multiple plots in the same color correspond to the filter's response with fine tuning of the group delay by changing the bandwidth.

The APF achieves a group delay range of 250 ps to 1.7 ns over a worst case group

<sup>&</sup>lt;sup>1</sup>Safe operation of the devices in Fig. 3.8 is ensured by enforcing sufficient voltage drop across  $R_{AP}$ .



Figure 4.4: Measured frequency response of the variable order APF with  $V_{DD}=1.4\,\mathrm{V}$  for fine delay settings.

delay bandwidth of 2 GHz. This makes the maximum delay-bandwidth product equal to 3.4. The absolute delay variations within the bandwidth for the maximum and the minimum delay settings are  $\pm 140$  ps ( $\pm 8\%$ ), and  $\pm 30$  ps ( $\pm 12\%$ ) respectively. The always active summing taps consume 45 mW of power, and as the filter's order is swept from two to nine, the total consumption increases from 112 mW to 364 mW.
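The headline figures in this paragraph are related by simple arithmetic, which the sketch below reproduces from the quoted measurements. The average power increment per order is a derived quantity that assumes the power grows uniformly with order, which the text does not state explicitly.

```python
MAX_DELAY_S = 1.7e-9
MIN_DELAY_S = 250e-12
BANDWIDTH_HZ = 2e9

# Maximum delay-bandwidth product.
delay_bandwidth_product = MAX_DELAY_S * BANDWIDTH_HZ

# Relative in-band delay variation at the two extreme settings, in percent.
variation_max_setting_pct = 100.0 * 140e-12 / MAX_DELAY_S
variation_min_setting_pct = 100.0 * 30e-12 / MIN_DELAY_S

# Average power added per unit increase in order (orders 2 to 9: 7 steps).
power_per_order_mw = (364.0 - 112.0) / (9 - 2)

# Average coarse delay step over the same 7 steps, in picoseconds.
delay_per_order_ps = (MAX_DELAY_S - MIN_DELAY_S) / (9 - 2) * 1e12
```

The computed variations round to the quoted 8% and 12%, and the average coarse step of about 207 ps agrees with the approximately 200 ps per order seen in the measurements.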

With a unit increment in the filter's order, the delay increases approximately by 200 ps. The finer delay increments between the coarse delay increments are obtained by reducing the filter's bandwidth by increasing  $R_{ext}$  which reduces the  $G_m$ s of the ladder transconductors proportionately. The granularity of the finer delay resolution is limited by the granularity with which the filter  $G_m$ s are changed. The peak-to-peak magnitude response deviation for all delay settings is within 1.4 dB.

The difference between the measured and the simulated group delay is attributed to the mismatch [41] between the filter's transconductors as they are spread over a length of 1 mm. To verify this the APF of Fig. 2.8 was modeled in MATLAB© using state space representations. The optimization routine "fminsearch" in MATLAB was used to find the transconductance values of the eighteen transconductors (all transconductors in Fig. 2.8 except the two used for summing taps) such that the simulated group delay response of the APF from the state-space model fits the measured response for the ninth order. To do this the space mapping method outlined in [42] was used. This is explained in detail in Appendix B. This generated a set of eighteen different transconductances (say  $G_{m1} - G_{m18}$ ), each of which corresponds to a transconductor of Fig. 2.8. These  $G_m$s are plotted against their relative distance from each other in Fig. 4.5.  $g_{kl}$  in Fig. 4.5 corresponds to the  $G_m$  in the APF of Fig. 2.8 which has its input connected to the node  $V_k$  and its output to the node  $V_l$. Subsets of these eighteen transconductances were used to simulate group delays for lower orders. For example, for the  $N^{th}$  order (N < 9),  $G_{m1} - G_{m2 \times N}$  were used to simulate the group delay response.

Figure 4.5: Back-annotated  $G_m$s of the all-pass filter's transconductors with relative distance from each other.

Figure 4.6: Measured (from two chips) and back annotated group delay response for coarse delay settings.

Fig. 4.6 shows the measured group delay responses from two chips and the simulated response obtained with the  $G_m$  values of Fig. 4.5 in the filter's core for all orders. The close match between the plots validates the assumption that mismatch between the  $G_m$s is responsible for the deviation of the measured delay response from the simulated ones. Since the summing taps from  $V_i$  and  $V_1$  in Fig. 2.8 were laid out in a common centroid manner, the mismatch due to doping gradient across the IC has been neglected.



Figure 4.7: Computed phase error from the AC response of Fig. 4.3.

The effect of the increased delay ripple on a Gaussian monopulse input whose strength falls to about 25 dB below its maximum at 2 GHz is estimated from error equation (2.36) which is reproduced below.

$$E_{rms}/v_{pp} = \sqrt{\frac{2}{2\pi T_p v_{pp}^2} \int_0^{\omega_H} (\phi_e(\omega))^2 |V_i(j\omega)|^2 d\omega} \tag{4.1}$$

where  $\omega_H$  represents the frequency beyond which the energy content of the monopulse is 40 dB below the total energy of the pulse, which in this case equals  $2\pi \times 2.1$  Grad/s. As before, the phase error,  $\phi_e(\omega)$  is approximated as

$$\phi_e(\omega) = \int_0^\omega -(GD(\omega) - \zeta GD(0))d\omega \tag{4.2}$$

where  $GD(\omega)$  represents the group delay at a frequency  $\omega$  rad/s and is extracted from Fig. 4.3. The parameter  $\zeta$  is varied to minimize  $E_{rms}/v_{pp}$  in (4.1). Fig. 4.7 shows the resulting phase errors for the coarse delay settings. Fig. 4.8 shows the computed rms error with respect to the signal peak for all the delay settings of Fig. 4.4. The worst case

error is about 36 dB below the peak-peak signal amplitude.
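The procedure of (4.1) and (4.2) can be sketched numerically. The example below uses a synthetic group delay (1.7 ns with a hypothetical 140 ps sinusoidal ripple) in place of the measured data and a hypothetical Gaussian monopulse spectrum $|V_i| \propto \omega\, e^{-\omega^2\sigma^2/2}$. The constant prefactor of (4.1) is omitted because it does not affect the choice of $\zeta$, and the optimum is found by a simple grid scan rather than by whatever method was used for the thesis results.

```python
import math


def cumulative_trapezoid(values, step):
    """Running trapezoidal integral of uniformly sampled values."""
    total = [0.0]
    for left, right in zip(values, values[1:]):
        total.append(total[-1] + 0.5 * (left + right) * step)
    return total


def phase_error(group_delay, d_omega, zeta):
    """phi_e(w) per (4.2): minus the integral of GD(w) - zeta*GD(0)."""
    reference = zeta * group_delay[0]
    deviation = [gd - reference for gd in group_delay]
    return [-v for v in cumulative_trapezoid(deviation, d_omega)]


def weighted_error(group_delay, spectrum, d_omega, zeta):
    """Integral of phi_e^2 |V_i|^2, the zeta-dependent part of (4.1)."""
    phi = phase_error(group_delay, d_omega, zeta)
    integrand = [(p * v) ** 2 for p, v in zip(phi, spectrum)]
    return cumulative_trapezoid(integrand, d_omega)[-1]


# Frequency grid up to omega_H = 2*pi*2.1 Grad/s, as quoted in the text.
POINTS = 421
OMEGA_H = 2.0 * math.pi * 2.1e9
D_OMEGA = OMEGA_H / (POINTS - 1)
omega = [i * D_OMEGA for i in range(POINTS)]

# Synthetic (hypothetical) group delay and monopulse spectrum.
SIGMA_S = 0.2e-9
gd = [1.7e-9 + 0.14e-9 * math.sin(3.0 * math.pi * w / OMEGA_H) for w in omega]
vi = [w * math.exp(-0.5 * (w * SIGMA_S) ** 2) for w in omega]

# Scan zeta from 0.9 to 1.1 and keep the value with the smallest error.
zetas = [0.9 + 0.001 * i for i in range(201)]
best_zeta = min(zetas, key=lambda z: weighted_error(gd, vi, D_OMEGA, z))
```

Allowing $\zeta$ to differ from unity removes the linear-phase (pure delay) component of the error, so that only the part of the ripple that actually distorts the pulse shape is counted.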



Figure 4.8: Computed rms error from the AC response of Fig. 4.4 with a Gaussian monopulse input using (2.30).

The effect of magnitude deviation on the pulse can be obtained by proceeding in the same manner. This resulted in errors at least 43 dB below the peak-peak signal amplitude, which can hence be neglected for the purpose of this analysis.

Fig. 4.9 shows the test setup for measuring the response of the APF to a Gaussian monopulse input. The method of generating the input pulses is outlined in Chapter 6. It is based on the principle that the impulse response of a high order Bessel filter approximates a Gaussian pulse. Fig. 4.10(a) shows the measured output of the filter demonstrating coarse tuning when excited by a Gaussian monopulse having FWHM = 3.2 ns<sup>2</sup> as the order of the filter is changed from 2 to 9.

Figure 4.9: Test setup for measuring response of the APF to transient inputs.

<sup>&</sup>lt;sup>2</sup>The input amplitude is limited because of the small energy in the narrow pulse used to approximate the impulse excitation to the lowpass filter.



Figure 4.10: Measured response to a 3.2 ns wide (FWHM) monopulse. (a) Coarse delay resolution. (b) Fine delay resolution.



Figure 4.11: (a) Transient plots (computed from AC response of Fig. 4.4) demonstrating monotonically varying true-time-delay when a monopulse of 1 ns width (FWHM) is fed to the filter. (b) Computed rms error between the outputs and delayed and scaled input of (a).

The input has transients beyond the pulse duration mainly due to cabling between the pulse generator board and the test board for the all-pass filter. Similar transients are seen in the delayed outputs as well. The inset shows an observable delay resolution of 200 ps. Fig. 4.10(b) shows the measured delays with fine tuning. The filter faithfully retains the shape of the input pulse and delays it. The monotonicity of the true-time-delay with change in delay settings is evident from the transient plots, thus corroborating the delay tuning nature of the architecture. Since the pulse generation setup could not generate pulses narrower than 3.2 ns, to ascertain the wideband behavior of the filter, the transient response is computed for a 1 ns wide Gaussian monopulse (Fig. 2.5) by convolving it with the filter's impulse response which is extracted from the measured frequency response in Fig. 4.4. Fig. 4.11(a) shows some of these computed responses. They behave similarly to the measured responses in Fig. 4.10 and confirm the true-time-delay behavior for wideband pulses. Fig. 4.11(b) shows the computed rms error using the time domain method. The input in Fig. 4.11(a) is delayed and scaled to match the



Figure 4.12: Ninth order APF gain and group delay frequency characteristics for different summing tap voltages,  $V_{DDH}$ .  $V_{DD}$  set at 1.4 V.

peaks of the outputs. The corresponding worst case rms error in twice the FWHM range for the outputs in Fig. 4.11(a) is 36 dB below the peak to peak input pulse.

The filter's gain was trimmed by varying the summing tap supply voltage, $V_{DDH}$, from 0.95 V to 1.75 V. Fig. 4.12 shows the measured frequency response for the ninth order APF with $V_{DDH}$ as a parameter. The gain varies from $-8\,\mathrm{dB}$ to 0.6 dB over this range. The delay response is identical for all gain settings, which allows the beamformer to tune the gain and delay of each signal path independently.

The technique demonstrated in [40] was used to de-embed the effects of frequency dependent measurement paths for measuring noise and distortion. It is also described in Appendix A. Fig. 4.13(a), and Fig. 4.13(b) show the measured input referred noise power spectral density for all orders, and the integrated noise figure (with a reference impedance of  $50\,\Omega$ ) versus order respectively<sup>3</sup>. The noise figure is maximum for the delay settings corresponding to the highest delay (order =9) and is equal to 19.2 dB. It improves to 14 dB when the order is decreased to 2. The total output integrated noise in the range of  $50\,\text{MHz}$ –2 GHz varies from  $170\,\mu\text{V}$  (order =2) to  $332\,\mu\text{V}$  (order =9). To further reduce the noise figure, an LNA similar to [14] can be used.

<sup>3</sup>This work was published as "A 2 GHz bandwidth, 0.25-1.7 ns true-time-delay element using a variable-order all-pass filter architecture in 0.13 μm CMOS," in *IEEE J. Solid-State Circuits*, vol. 52, no. 8, pp. 2180-2193, Aug. 2017, where the reported noise was worse by 4.4 dB and the IIP3 better by 1.6 dB than the ones reported in the thesis. The noise reported in the paper was inadvertently measured with the peak settings in the spectrum analyzer, as opposed to rms. The numbers reported here have been updated after new measurements and a slightly different parasitic de-embedding technique. The de-embedding technique is explained in detail in Appendix A.



Figure 4.13: Measured input referred noise spectral density and the integrated noise figure versus order.

For measuring distortion, the filter was excited with two tones 1 MHz apart, and their third order intermodulation (IM3) component was observed as their center frequency was varied. Fig. 4.14(a) shows the measured maximum input voltage for which the output IM3 falls 40 dB below the fundamental tone. This voltage varies from $100\,\mathrm{mV_{ppd}}$ (order = 2) to $35\,\mathrm{mV_{ppd}}$ (order = 9) at the respective nominal bandwidth settings. The observed worst case IIP3, which corresponds to the highest delay setting, was $-5.1\,\mathrm{dBm}$. As mentioned in Section 3.4, distortion increases when bias current is reduced to reduce the bandwidth for fine tuning. Fig. 4.14(b) shows the input voltage for which the output IM3 falls 40 dB below the fundamental tone when the delay of the



Figure 4.14: Measured peak to peak differential input voltage for which IM3 falls 40 dB below the applied tones for (a) nominal delay setting, (b) increased delay setting with fine tuning.

$N^{th}$ order APF is increased to match the nominal delay for the $(N+1)^{th}$ order where $2 \le N \le 8$. The worst case occurs when the delay of the eighth order configuration is increased to match the nominal delay of the ninth order. The maximum input voltage for $-40\,\mathrm{dB}$ IM3 is $31\,\mathrm{mV_{ppd}}$ in that case. Using this, the worst case measured signal-to-noise ratio (for inputs corresponding to $-40\,\mathrm{dB}$ IM3) is $39.4\,\mathrm{dB}$ for the highest delay configuration. It improves to $55\,\mathrm{dB}$ for the second order configuration.
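The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below rests on assumptions that are not stated in the text: the peak-to-peak differential voltages are treated as per-tone sinusoidal amplitudes referred to a 50 Ω impedance, IIP3 is obtained from the usual two-tone extrapolation (input power plus half the IM3 suppression in dB), and the signal-to-noise ratio is taken as the ratio of the peak-to-peak input to the rms integrated output noise. The function names are illustrative only. Under these assumptions the results land close to the quoted $-5.1\,\mathrm{dBm}$, $39.4\,\mathrm{dB}$ and $55\,\mathrm{dB}$.

```python
import math

R_REF = 50.0  # assumed reference impedance in ohms


def vppd_to_dbm(v_ppd, r_ref=R_REF):
    """Power in dBm of a sinusoid with peak-to-peak amplitude v_ppd (volts)."""
    v_rms = v_ppd / (2.0 * math.sqrt(2.0))
    return 10.0 * math.log10(v_rms ** 2 / r_ref / 1e-3)


def iip3_dbm(v_ppd, im3_suppression_db=40.0):
    """Two-tone extrapolation: IIP3 = P_in + (IM3 suppression) / 2."""
    return vppd_to_dbm(v_ppd) + im3_suppression_db / 2.0


def snr_db(v_ppd, noise_rms):
    """Peak-to-peak input over rms noise, in dB (assumed definition)."""
    return 20.0 * math.log10(v_ppd / noise_rms)


iip3_order9 = iip3_dbm(35e-3)         # order 9, nominal bandwidth setting
snr_order9 = snr_db(31e-3, 332e-6)    # order 9, worst case with fine tuning
snr_order2 = snr_db(100e-3, 170e-6)   # order 2, nominal bandwidth setting

print(f"IIP3 (order 9): {iip3_order9:.1f} dBm")
print(f"SNR  (order 9): {snr_order9:.1f} dB")
print(f"SNR  (order 2): {snr_order2:.1f} dB")
```

If the signal-to-noise ratio were instead computed from the rms value of the input sinusoid, the numbers would be about 9 dB lower, so the definition matters when comparing against other work.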

Fig. 4.15(a) shows the measured 1 dB compression points (P1dB) for all orders across frequency for the nominal bandwidth setting. P1dB varies from –5.5 dBm to –14.2 dBm as the filter's order varies from 2 to 9. Fig. 4.15(b) shows the measured P1dB with fine tuning. For the same configuration the worst case P1dB falls to –16.4 dBm.

In order to compute a broadband radiation pattern for a four antenna system, the beamforming receiver of Fig. 4.16(a) is assumed with adjacent antenna spacing of 7.5 cm which corresponds to  $\lambda_{\rm fmax}/2$  with  $f_{\rm max}=2$  GHz. The APF having the AC response of Fig. 4.4 mimics the delay line. The impulse response of the APF (obtained by inverse Fourier transform of Fig. 4.4) in the  $i^{th}$  channel is denoted as  $h_{APF|i=0-3}^{\tau_i}$ . The inputs are 1 ns wide monopulses. The output of each of the four channels is computed by convolving the input with the impulse response of the APF in the corresponding channel. For each set of  $h_{APF}^{\tau_{0-3}}$  the angle of beam arrival,  $\theta$ , was varied from 0 to 360°. This



Figure 4.15: Measured peak to peak 1 dB compression points for (a) nominal delay setting, (b) increased delay setting with fine tuning.



Figure 4.16: (a) Model of the test setup used for computing radiation pattern. (b) Computed normalized radiation pattern for a four element array with uniform spacing of 7.5 cm ( $\lambda_{\rm fmax}/2$ ) using a monopulse of 1 ns FWHM.

mimics the delay in the incoming wavefront to each channel. The peak of the received pulse  $(v_0(t))$  at the output of the summer is used as a measure of the received signal strength.
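The computation described above can be sketched as follows. This is a simplified stand-in, not the procedure used for Fig. 4.16(b): each channel is modeled as an ideal delay instead of the measured APF impulse response, the pulse is assumed to be a plain Gaussian with 1 ns FWHM, and the angle is assumed to be measured from the array normal. All names (`pulse`, `received_peak`, `pattern`) are hypothetical.

```python
import math

C = 3.0e8        # speed of light in m/s (rounded)
SPACING = 0.075  # adjacent antenna spacing in m, from the text
FWHM = 1.0e-9    # pulse width in s, from the text


def pulse(t, fwhm=FWHM):
    """Gaussian pulse with unit peak at t = 0 (assumed pulse shape)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * (t / sigma) ** 2)


def received_peak(theta_deg, taus, dt=10e-12, span=6e-9):
    """Peak of the summed output for a wavefront arriving at theta_deg."""
    arrival = SPACING * math.sin(math.radians(theta_deg)) / C
    n_steps = int(round(span / dt))
    best = 0.0
    for k in range(n_steps + 1):
        t = -span / 2.0 + k * dt
        # Channel i sees the wavefront i * arrival later, then adds its own delay.
        v0 = sum(pulse(t - i * arrival - tau) for i, tau in enumerate(taus))
        best = max(best, v0)
    return best


def pattern(taus, step_deg=5):
    """Normalized received signal strength versus arrival angle."""
    peaks = {th: received_peak(th, taus) for th in range(0, 360, step_deg)}
    top = max(peaks.values())
    return {th: p / top for th, p in peaks.items()}


broadside = pattern([0.0, 0.0, 0.0, 0.0])            # equal delays in all channels
endfire = pattern([750e-12, 500e-12, 250e-12, 0.0])  # 250 ps between neighbours

for th in (0, 45, 90, 270):
    print(th, round(broadside[th], 3), round(endfire[th], 3))
```

With equal delays the maximum falls on the array normal, and with 250 ps between adjacent channels it moves to 90° off the normal, in line with the steering behaviour discussed in this section.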

Fig. 4.16(b) shows the computed radiation pattern. With the same delay settings for all the APFs, the radiation pattern has maximum intensity normal to the array, whereas a delay difference of $\approx 250\,\mathrm{ps}$ between adjacent antennas steers the beam by $90^\circ$ off broadside. $\pm 90^\circ$ beam steering can be realized as the delay range of the filter (1.45 ns)



Figure 4.17: Scatter plot showing the magnitude deviations of the reported delay lines (active and passive) including this work with respect to delay (range)-bandwidth product.

exceeds the time required for the EM wave to traverse the antenna array in the endfire direction ( $3/2 \times \lambda_{\rm fmax}/c = 0.75 \, \rm ns$ ). The delay range provided by the proposed delay line allows the use of up to 6 antennas for broadside steering, and up to 9 antennas for steering to $45^\circ$ of boresight. Using a 5-bit DAC to provide bias currents can realize digital delay tuning with a resolution better than $10 \, \rm ps$.
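The antenna counts above follow from comparing the delay range with the wavefront travel time across the array. A minimal sketch of that arithmetic is given below, assuming that the required delay range for an $N$-element array steered to an angle $\theta$ is $(N-1)\,d\sin\theta/c$ and that $c$ is rounded to $3\times10^8$ m/s. The helper names are illustrative.

```python
import math

C = 3.0e8              # speed of light in m/s (rounded)
SPACING = 0.075        # adjacent antenna spacing in m
DELAY_RANGE = 1.45e-9  # tunable delay range of the filter in s


def adjacent_delay(theta_deg, spacing=SPACING):
    """Delay between neighbouring antennas for steering angle theta_deg."""
    return spacing * math.sin(math.radians(theta_deg)) / C


def max_elements(theta_max_deg, delay_range=DELAY_RANGE, spacing=SPACING):
    """Largest N for which (N - 1) * adjacent delay fits in the delay range."""
    step = adjacent_delay(theta_max_deg, spacing)
    return int(math.floor(delay_range / step)) + 1


print(adjacent_delay(90.0))      # delay step for steering 90 degrees off the normal
print(3 * adjacent_delay(90.0))  # travel time across a four-element array
print(max_elements(90.0), max_elements(45.0))
```

The two counts agree with the 6 and 9 antennas stated above, and the four-element travel time agrees with the 0.75 ns figure.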

Fig. 4.17 shows a scatter plot of the measured magnitude deviation against the measured delay-bandwidth product for the proposed architecture and the reported IC implementations of true-time-delay cells in the literature. The proposed architecture clearly breaks the trade-off between a high delay-bandwidth product and magnitude flatness. Table 4.1 compares the performance of the proposed all-pass filter with earlier work in the literature. The high delay-bandwidth product and the low magnitude deviation for all delay settings are direct consequences of the high order APF implementation with only one extra parasitic pole. Compared to [14], the delay-bandwidth product is $3 \times$ larger and the magnitude deviation is 1.4 dB smaller. [11] uses LC delay lines, where a second order all-pass network acts as taps. As expected, it achieves a higher delay-bandwidth product at the expense of area, and also has a very high insertion loss. The filter in this work achieves a delay per unit area of $5.8 \, \mathrm{ns/mm^2}$. Only [14] has a higher delay per unit area ( $7.8 \, \mathrm{ns/mm^2}$ ).

Table 4.1: Comparison of the proposed all-pass filter with the delay element architectures.

|                                   | Proposed APF             | JSSC 2015 [14]                | IMS 2015 [11]         | JSSC 2007 [17]            | JSSC 2014 [27]        |
|-----------------------------------|--------------------------|-------------------------------|-----------------------|---------------------------|-----------------------|
| Technology                        | 0.13 μm CMOS             | 0.14 μm CMOS                  | 0.13 μm CMOS          | 0.13 μm CMOS              | SiGe BiCMOS           |
| Technique                         | $G_m$–$C$                | $G_m$–$C$                     | Delay line            | Delay line                | LC                    |
| Maximum delay (ps)                | 1700 $\pm 140 (\pm 8\%)$ | 550 $\pm 10 (\pm 2\%)$        | 550 $\pm 8 (\pm 2\%)$ | 400 $\pm 25 (\pm 6\%)^*$  | 180 $\pm 6 (\pm 4\%)$ |
| Delay range (ps)                  | 250–1700                 | 0–550                         | 150–550               | 190–400 ‡                 | 40–180                |
| Bandwidth (GHz)                   | 0.1–2                    | 1–2.5                         | 1–20                  | 3–16                      | 3–10.6                |
| Delay × bandwidth                 | 3.4                      | 0.8                           | 10.45                 | 5.2                       | 1.9                   |
| Delay range × bandwidth           | 2.9                      | 0.8                           | 8                     | 2.73                      | 1.1                   |
| Delay step (ps)                   | 10**                     | 13                            | 5**                   | 15                        | 0.5**                 |
| Gain variation (dB)               | $\pm 0.7$                | ±1.4                          | 35                    | ±4                        | ±2.5                  |
| Gain (dB)                         | 0.6                      | 12                            | −10 to −45            | 10                        | 10                    |
| Supply (V)                        | 1.4                      | 1.8                           | 1.2                   | 1.5                       | 2.5                   |
| NF (dB)                           | 14.5–19.2                | 8–10†                         | –                     | 2.9–4.8†                  | –                     |
| $IIP3_{50\Omega}$ (dBm)           |                          | $-20 \text{ to } -13 \dagger$ |                       |                           | –                     |
| Power (mW)                        | 112–364                  | 90†                           | 6                     | 60                        | 53                    |
| Delay cell area (mm<sup>2</sup>)  | 0.29                     | 0.07                          | 4††                   | 10‡ ††                    | 1††                   |
| Delay/Area ($\mathrm{ns/mm^2}$)   | 5.8                      | 7.8                           | 0.1                   | 0.08                      | 0.18                  |
| FoM (fJ)                          | 23                       | 42‡‡                          | –                     |                           |                       |

<sup>†</sup> With LNA. ‡ 4 channels. \* Extracted from Fig. 21 in [17].

FoM =  $\frac{P_d}{DBW \times BW \times 10^{2/3[IIP3-N]/10}}$ .

The strengths of the proposed architecture are its large delay-bandwidth product, large delay per unit area, and a response with minimum magnitude deviation among its peers for all delay settings. The delay cell FoM listed in Table 4.1 takes into account the dynamic range, power dissipation, and the delay-bandwidth product and is similar to that for filters [43]. It is seen that the proposed APF improves the state of the art of integrated circuit active delay lines in terms of the delay-bandwidth product. Table 4.2

<sup>††</sup> Chip area due to absence of separate delay cell numbers.

<sup>‡‡</sup> Without LNA. Noise and IIP3 back calculated from [14]

<sup>\*\*</sup> Continuous Tuning. 5-bit DAC for bias current is assumed in our work.

Table 4.2: Performance summary of the chip.

| 0.13 μm CMOS                                     |         |  |
|--------------------------------------------------|---------|--|
| $1.4 \text{ V}(V_{DD}), 1.75 \text{ V}(V_{DDH})$ |         |  |
| 2 GHz                                            |         |  |
| 250 ps-1.7 ns                                    |         |  |
| $\pm 0.14\mathrm{ns}$                            |         |  |
| 0.6 dB                                           |         |  |
| $\pm0.7\mathrm{dB}$                              |         |  |
| -8.4 dB to 0.6 dB                                |         |  |
| 2                                                | 9       |  |
| 170 μV                                           | 330 μV  |  |
|                                                  |         |  |
| 81 mV                                            | 31 mV   |  |
| 112 mW                                           | 364 mW  |  |
| 37 mV                                            |         |  |
| $0.55\mathrm{mm}^2$                              |         |  |
|                                                  | 1.4 V ( |  |

<sup>†</sup> Nominal bandwidth setting. †† Worst case, including fine tuning.

summarizes the performance of the chip.



# **Chapter 5**

# **Expansion and Compression of Analog Pulses by Bandwidth Scaling of Continuous-Time Filters**

#### 5.1 Motivation



Figure 5.1: (a) Slowing down a signal by storing the samples at high speed and reading at low speed [44], (b) Slowing down a continuous-time signal [45].

Many modern data acquisition systems, such as radars, indoor localization systems, and neutrino detectors, require digitization of bursts of narrow (≈ nanoseconds), *continuous-time*, *wide-bandwidth*, *analog* pulses which appear infrequently in time. Radar systems used for through-wall imaging and ground penetration [46][47] transmit short nanosecond pulses and wait for the echo to return. The reflected echo is used to study the image of the desired object. Bursts of high frequency pulses are also used in some gaming devices to detect player position and movement [44]. In neutrino and other particle detectors [48, 49], features of the pulse shape such as its area and slope are used for identifying particles and calibrating the readout systems.

All these applications require digitization of high bandwidth analog pulses occurring infrequently in time. Direct digitization of these narrow pulses requires wide bandwidth (multi-GHz), high resolution ADCs. Since the signals occur only in bursts separated by long idle intervals, attempts have been made to avoid a full-fledged high-speed ADC, a low-jitter, high-frequency clock and the associated complexity. These methods involve slowing down the pulse in various ways and using a low-speed ADC for digitization. [44],[50] store closely spaced samples on an array of capacitors and read out and digitize the samples using a low speed ADC as shown in Fig. 5.1(a). Expansion of continuous-time pulses has been demonstrated in the photonic [51][52] and microwave [53] domains by modulating a chirped carrier by the input pulse, delaying different parts of the pulse modulated carrier differently by passing it through a dispersive medium whose delay varies with frequency, and demodulating the resulting expanded pulse. The same principle has been demonstrated on an integrated circuit (IC) in [45] where the chirped carrier is produced by modulating a voltage-controlled oscillator (VCO) with a ramp, and frequency dependent delay is realized using lossy LC structures near resonance. This is shown in Fig. 5.1(b). Just as expanding a pulse makes its analysis easier, compressing a pulse makes it easier to generate short pulses [54]. All of the techniques above can in principle be adapted for pulse compression. Along similar lines, time amplification [55][56][57] has been proposed to ease the resolution of closely spaced signals. This is, however, applicable only to digital signals.

This work demonstrates expansion and compression of continuous-time pulses using bandwidth scaling (which scales the time-domain response in an inverse manner) of continuous-time filters [28]. The core idea is to drive the pulse into a filter whose group delay exceeds the pulse width, and once the pulse is completely "inside" the filter, decrease (increase) the bandwidth to produce a stretched (compressed) output pulse. Unlike the photonic and microwave techniques[51][52][53], the proposed technique can be completely integrated on a chip. Integrated continuous-time filters have been extensively studied in the literature and the proposed technique can exploit these for efficient IC realizations. Compared to the method of storing samples in a capacitor array in Fig. 5.1(a), it does not require a high speed clock, and the associated complexity of a



Figure 5.2: (a) State-space model of a filter, (b)  $G_m$ -C realization of (a); Values shown on conductances and transconductances are multipliers of a certain unit transconductance  $G_{m0}$ .

high frequency clock distribution network. Compared to the IC implementation of the dispersive method in Fig. 5.1(b)[45], it does not require modulation, demodulation, or chirped carrier, and results in a simpler system with lower distortion. The ideas are demonstrated using a 0.13  $\mu$ m prototype which realizes 1.8× expansion and 1.7× compression of nanosecond-wide pulses.

# 5.2 Principle of pulse expansion/compression

The basic principle used for expanding the width of the pulse is outlined in [28]. Fig. 5.2(a) shows the state-space model of a filter with an input u(t) and an output y(t), state vector  $\mathbf{x}$  and state matrices  $\alpha \mathbf{A}$ ,  $\alpha \mathbf{B}$ ,  $\mathbf{C}$ ,  $\mathbf{D}$ . The equations governing this filter are

$$\dot{\mathbf{x}}(t) = \alpha \mathbf{A} \mathbf{x}(t) + \alpha \mathbf{B} u(t) \tag{5.1}$$

$$y(t) = \mathbf{C}\mathbf{x}(t) + \mathbf{D}u(t). \tag{5.2}$$

$\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$ are constant with time, and $\alpha$ is a scalar. $\alpha$ is not unique because any number can be factored out of all elements of $\mathbf{A}$ and $\mathbf{B}$. In a linear, time-invariant system, the matrices in the state-space description are constant with time. The reason to include $\alpha$ as a multiplier for $\mathbf{A}$ and $\mathbf{B}$ is to form a time-varying system in which all elements of $\mathbf{A}$ and $\mathbf{B}$ are varied in the same proportion. In this implementation, $\alpha$ is unity at the beginning by definition and is then varied in a piecewise-constant fashion at certain instants of time. Fig. 5.2(b) shows a possible $G_m$-C realization of this filter with a second-order example. The capacitor voltages form the state vector. Conductances and transconductances realizing $\alpha \mathbf{A}$, $\alpha \mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$ are indicated on the figure. The value of each transconductance is the product of a certain unit $G_{m0}$ with the multiplier shown on the transconductance. Those realizing $\mathbf{A}$, the path gains between the states, and $\mathbf{B}$, the path gains between the input and the states, have a multiplier $\alpha$.

Starting from an initial state  $\mathbf{x}(t_0)$ , and assuming a constant  $\alpha$  for  $t > t_0$ , the state  $\mathbf{x}$  of the above filter for  $t > t_0$  can be written as [58]

$$\mathbf{x}(t) = \exp\left(\alpha \mathbf{A}(t - t_0)\right) \mathbf{x}(t_0) + \underbrace{\int_{t_0}^t \exp\left(\alpha \mathbf{A}(t - \gamma)\right) \alpha \mathbf{B}u(\gamma) d\gamma}_{\text{zero-state response}}. \tag{5.3}$$

If $u(t)$ is zero for $t > t_0$ for some $t_0$, the output for $t > t_0$ consists of only the zero-input response starting from state $\mathbf{x}(t_0)$. Since $\alpha$ appears as a multiplier for time $t$, with a zero input, the state $\mathbf{x}$ and the output $y$ evolve at a rate proportional to $\alpha$. This is also apparent from the $G_m$-C example in Fig. 5.2(b). With zero input, and a given initial voltage across the capacitors, the capacitor currents, and hence the rates of change of capacitor voltages, scale with $\alpha$. It can be easily shown from (5.1) and (5.2) that, for a constant $\alpha$, the transfer function from input to the output is given by [58]

$$\frac{Y(s)}{U(s)} = \mathbf{D} + \mathbf{C} \left(\frac{s}{\alpha}\mathbf{I} - \mathbf{A}\right)^{-1} \mathbf{B} \tag{5.4}$$

where $\mathbf{I}$ is an identity matrix of the order of $\mathbf{A}$, and $U(s)$ and $Y(s)$ are the Laplace transforms of $u(t)$ and $y(t)$ respectively. On the right hand side of (5.4), $s$ is divided by $\alpha$. Therefore the bandwidth is proportional to $\alpha$. Hence, changing $\alpha$ is referred to as bandwidth scaling.
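For completeness, one way to arrive at (5.4) is sketched here, assuming zero initial state and a constant, nonzero $\alpha$. The symbol $\mathbf{X}(s)$, the Laplace transform of $\mathbf{x}(t)$, and the shorthand $H_\alpha(s)$ for the transfer function at a given $\alpha$ are introduced only for this sketch. Transforming (5.1) and solving for the state gives

$$s\mathbf{X}(s) = \alpha \mathbf{A}\mathbf{X}(s) + \alpha \mathbf{B}U(s) \;\;\Rightarrow\;\; \mathbf{X}(s) = \left(s\mathbf{I} - \alpha \mathbf{A}\right)^{-1} \alpha \mathbf{B}\, U(s) = \left(\frac{s}{\alpha}\mathbf{I} - \mathbf{A}\right)^{-1} \mathbf{B}\, U(s),$$

and substituting into the transform of (5.2) gives

$$Y(s) = \mathbf{C}\mathbf{X}(s) + \mathbf{D}U(s) = \left[\mathbf{D} + \mathbf{C}\left(\frac{s}{\alpha}\mathbf{I} - \mathbf{A}\right)^{-1}\mathbf{B}\right] U(s).$$

Equivalently, $H_\alpha(s) = H_1(s/\alpha)$, so every pole and zero of the filter is multiplied by $\alpha$ while the shape of the response is unchanged.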



Figure 5.3: (a) and (c) Normal operation of a continuous-time filter, (b) Pulse expansion by dynamically reducing  $\alpha$ , (d) Pulse compression by dynamically increasing  $\alpha$ .

The scaling of the zero-input response with  $\alpha$  suggests a method for expanding or compressing a pulse along the time axis. This is illustrated in Fig. 5.3. The input is a smooth pulse of width w. The filter is assumed to have a unit magnitude and a flat group delay  $\tau$  over the bandwidth of the input pulse. The state-variables are zero before the pulse is applied. By definition,  $\alpha=1$  at the beginning. If  $\alpha=1$  throughout as in Fig. 5.3(a), the output  $y(t)=u(t-\tau)$  as shown. If  $\tau>w$ , there is a time instant  $t_0$  at which the input has fallen to zero and the output has not yet changed from zero. To simplify the notation in what follows,  $t_0$  is defined as  $t_0 \triangleq 0$ . y(t) for t>0 can be seen as the zero-input response starting from an initial condition  $\mathbf{x}(0)$ .

In Fig. 5.3(b), $\alpha$ is changed from 1 to 0.5 at $t=0$. This scales the zero-input response by a factor of 2 in time from $t=0$. Therefore, the output $y_1$ is given by

$$y_1(t) = \begin{cases} y(t) = 0, & t \le 0 \\ y(t/2) = u(t/2 - \tau), & t > 0 \end{cases} \tag{5.5}$$

 $y_1$  is therefore a  $2\times$  expanded version of u(t). Similarly, if  $\alpha$  is increased from 1 to 2 at t=0 as shown in Fig. 5.3(d), it can be worked out that the output  $y_2(t)=u\,(2t-\tau)$ , that is, a  $2\times$  compressed pulse.

The input pulse can therefore be compressed or expanded using a filter whose delay exceeds the pulse duration and changing  $\alpha$ , the scaling factor for **A** and **B** matrices, as a piecewise constant with time. Decreasing  $\alpha$  expands the input pulse and increasing  $\alpha$  compresses the input pulse.
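The time-scaling behaviour described above can be checked numerically with a small simulation. The sketch below is an illustration under stated assumptions, not the filter of this work: the delay filter is replaced by a hypothetical cascade of four identical first-order lowpass sections, the input is a raised-cosine pulse that ends exactly at $t=0$, and a fixed-step Runge-Kutta integrator is used. Such a cascade does not have a flat group delay, so its output is not a clean replica of the input. The check is only that, once the input is zero, the output read with $\alpha=0.5$ equals $y(t/2)$ and the output read with $\alpha=2$ equals $y(2t)$.

```python
import math

W = 1.0    # input pulse width (arbitrary time units)
T = 0.3    # time constant of each first-order section at alpha = 1
ORDER = 4  # number of cascaded sections (hypothetical stand-in filter)
H = 1e-3   # integration step


def u(t):
    """Raised-cosine input pulse occupying -W <= t <= 0, zero elsewhere."""
    if -W <= t <= 0.0:
        return 0.5 * (1.0 - math.cos(2.0 * math.pi * (t + W) / W))
    return 0.0


def deriv(x, u_val, alpha):
    """x' = alpha * (A x + B u) for a cascade of first-order sections."""
    dx = [alpha * (u_val - x[0]) / T]
    for k in range(1, len(x)):
        dx.append(alpha * (x[k - 1] - x[k]) / T)
    return dx


def rk4_step(x, t, alpha):
    """One classical fourth-order Runge-Kutta step of size H."""
    def shifted(k, scale):
        return [xi + scale * ki for xi, ki in zip(x, k)]
    k1 = deriv(x, u(t), alpha)
    k2 = deriv(shifted(k1, 0.5 * H), u(t + 0.5 * H), alpha)
    k3 = deriv(shifted(k2, 0.5 * H), u(t + 0.5 * H), alpha)
    k4 = deriv(shifted(k3, H), u(t + H), alpha)
    return [xi + H * (a + 2 * b + 2 * c + d) / 6.0
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def read_out(alpha_read, n_read):
    """Write the pulse with alpha = 1 for t < 0, then read with alpha_read."""
    x = [0.0] * ORDER
    for n in range(round(W / H)):
        x = rk4_step(x, -W + n * H, 1.0)
    y = [x[-1]]                  # output sample at t = 0
    for n in range(n_read):
        x = rk4_step(x, n * H, alpha_read)
        y.append(x[-1])          # output sample at t = (n + 1) * H
    return y


y_ref = read_out(1.0, 2000)  # y(t): no switching
y_exp = read_out(0.5, 4000)  # y1(t): alpha switched to 0.5 at t = 0
y_cmp = read_out(2.0, 1000)  # y2(t): alpha switched to 2 at t = 0

# y1(t) should match y(t/2), and y2(t) should match y(2t), for t >= 0.
err_exp = max(abs(y_exp[2 * k] - y_ref[k]) for k in range(2001))
err_cmp = max(abs(y_cmp[k] - y_ref[2 * k]) for k in range(1001))
print(err_exp, err_cmp)
```

Both errors are limited only by the integration accuracy, since the identity holds for any $\mathbf{A}$ and $\mathbf{B}$. A filter with delay larger than the pulse width is what turns this identity into a faithful expanded or compressed copy of the input.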

Another way to think of this is to view the filter with a delay $\tau$ as a continuous-time memory for inputs of duration $\leq \tau$. The continuous-time input pulse is "written" into the memory (where it is stored in the filter's state $\mathbf{x}$) for $t < 0$ and read out for $t > 0$. Changing $\alpha$ at $t = 0$ makes the write and read speeds different, which achieves pulse expansion or compression. This is analogous to the discrete-time system in Fig. 5.1(a) where $\phi_{sk}$ and $\phi_{rk}$ have different phase spacing to achieve different write and read speeds.

If $\alpha$ switches to zero, $\dot{\mathbf{x}} = 0$ and the states will be frozen (zero capacitor currents in Fig. 5.2(b)). The familiar track-and-hold operation can itself be thought of as a pulse expansion system in which $\alpha$ is switched to zero. In track mode, the filter has a very



Figure 5.4: (a) Cascade of an all-pass and a low pass filter. (b) APF7 using an LC ladder. (c)  $G_m$ -C realization of APF7 and LPF5.

high bandwidth (compared to that of the input signal) and a very small delay. It stores the input integrated over a small aperture. This is nearly the same as its instantaneous value. In hold mode, the bandwidth reduces to zero and the delay increases to infinity. The stored "pulse", i.e. the instantaneous value, is stretched out infinitely, meaning that it is held forever.

# 5.3 Filter with switched bandwidth

### **5.3.1** Delay filter for signal storage

A filter that stores the input signal is at the core of the technique outlined in the previous section. In order to maintain the shape of the wideband pulse, constant group delay and magnitude are essential. As mentioned in Chapter 2, the best lumped element approximation to a constant group delay and magnitude is the Bessel or the equiripple group delay (EGD) [59] all-pass filter with a transfer function $H_{AP}(s) = D(-s)/D(s)$. A high order filter is required to realize a large delay over a wide bandwidth. In this work, a cascade of a seventh order all-pass filter (APF7) and a fifth-order lowpass filter (LPF5) is used as shown in Fig. 5.4(a). The all-pass filter architecture proposed in Chapter 2 is suitable for this purpose. Fig. 5.4(b) shows a seventh order example. The lowpass stage is added to filter out the high frequency transients that occur when switching the bandwidth. These are discussed in detail in the next section. $\alpha$ controls the bandwidth of both APF7 and LPF5. The all-pass output $V_{AP}$ is available at the output of the summing taps, and the lowpass output $V_{LP} = (1/D(s))V_i$ is available at the far end of the ladder.

The $\mathbf{A}$ and $\mathbf{B}$ matrices can be conveniently scaled electronically in the $G_m$-C realization of Fig. 5.4(c). The transconductors marked "+" and "-" have transconductances of $+\alpha G_{m0}$ and $-\alpha G_{m0}$ respectively, where $G_{m0}$ is the unit transconductance and $\alpha$ is the scaling factor.

### 5.3.2 Filter used for pulse expansion and compression

As mentioned before, in this work, a cascade of a seventh order all-pass filter (APF7) and a fifth-order lowpass filter (LPF5) is used as shown in Fig. 5.4(a). Both filters use the architecture in Fig. 5.4(c) and  $\alpha$  controls their bandwidth. The lowpass stage LPF5 is added to filter out the high frequency transients that occur when switching the bandwidth. A highpass filter using  $C_{hp}$  and  $R_{hp}$  is introduced between the stages to remove certain artifacts that occur during pulse compression. Details of these deviations from the ideal behavior are discussed later in this and the next sections.

Based on preliminary simulation of $G_m$-C filters in the 0.13 µm CMOS process, it was determined that 900 MHz was the highest bandwidth with which delay filters with rapidly switchable bandwidth could be reliably realized. APF7 was designed for 2.8 ns delay with 2.5% ripple and LPF5 for 1 dB attenuation and 0.3 ns delay over a 900 MHz bandwidth. Mismatch in transconductors is more significant than mismatch in capacitors. Therefore, identical transconductors and appropriate capacitor values $C_1$ to $C_{12}$ were chosen. Pulse expansion and compression by $2\times$ was targeted. Therefore, the transconductors were split into two units of 12 mS each. For pulse expansion, the filter's bandwidth is switched from 900 MHz to 450 MHz. This corresponds to $\alpha$ switching from 1 to 0.5 with $G_{m0} = 24$ mS. For pulse compression, the filter's bandwidth is switched from 450 MHz to 900 MHz. This corresponds to switching $\alpha$ from 1 to 2 with $G_{m0}=12\,\mathrm{mS}$. The integrating capacitors were implemented as nMOS transistors biased in inversion. The total capacitance at any of the filter nodes is the sum of the nMOS capacitance, the parasitic capacitance of the transistors in the transconductors, and the metal routing capacitances. Since EGD filters have increasing capacitor values as one progresses down the ladder ( $C_1$ to $C_7$ and $C_8$ to $C_{12}$ in Fig. 5.4(c)), the smallest capacitances $C_1$ and $C_8$ are made up entirely of parasitic capacitances.

### 5.3.3 Mismatch induced transients while dynamically switching the filter's bandwidth



Figure 5.5: Illustration of switching a first order filter between (a) High bandwidth, and (b) Low bandwidth modes. Mismatches between the filter halves cause disturbance to the states while switching.

Fig. 5.5 shows a simplified example of bandwidth scaling of a first order $G_m$–C filter. Fig. 5.5(a) shows the high-bandwidth mode in which two sets of transconductors are connected in parallel. Fig. 5.5(b) shows the low-bandwidth mode in which the lower set of transconductors is turned off. All transconductors are nominally identical, but are mismatched in reality. Due to this, $V_1$ and $V_1'$, the quiescent voltages in the two cases, are different from each other. When the filter is switched from one mode to the other, there is a transient response as the nodes settle to the new quiescent voltage. This transient is an unwanted interference to the filter's states.

In order to minimize transients due to mismatch, the two filter halves are connected through coupling capacitors instead of directly. This decouples the bias of the two halves without affecting the transient functionality. Fig. 5.6 shows this arrangement for a seventh order  $G_m$ -C filter. The two halves of the filter are connected in parallel



Figure 5.6: Bandwidth switching technique using a representative single ended seventh order  $G_m$ –C filter.

through coupling capacitors  $C_{ck}$ . In the low-bandwidth mode, the transconductors in the grayed out portion are switched off. Since nMOS transistors in inversion are used as integrating capacitors, it is imperative to preserve their DC bias even when half the filter is turned off. Hence, a bank of high valued resistors  $(R_b)$  has been used to keep the capacitor gates at  $V_{cm}$  when the transconductors are switched off. The output is taken only from the upper half.

### 5.3.4 Effect of AC coupling between the two halves

Consider the seventh order all-pass filter of Fig. 5.6. Because the two halves of the filter in Fig. 5.6 are connected through capacitors $C_{ck}$ instead of being directly connected, there are fourteen state-variables instead of the originally intended seven (excluding the parasitic pole at the all-pass output node). The rest of this section analyzes the resulting effects. The analysis is presented for the specific case of expansion and compression by $2\times$, but it can be easily generalized to other factors. Mismatch between transconductors is ignored. Fig. 5.7(a) shows the $k^{th}$ stage of the filter. It has capacitors $C_k$ in the upper and lower halves with $C_{ck}$ between the two. Transconductors driven by $v_{k-1}$ and $v_{k+1}$ push currents into the capacitors. Voltages in the upper and lower halves are denoted by additional subscripts $u$ and $l$ respectively. Transconductors in both halves are active in the high-bandwidth mode in Fig. 5.7(a) and the lower half is turned off in the low-bandwidth mode in Fig. 5.7(b).



Figure 5.7:  $k^{th}$  stage of the filter: (a) With both halves turned on, (b) With the lower half turned off.  $R_b$  in Fig. 5.6 is assumed to be very high and is not shown here.

For pulse expansion, the filter is initially in the high-bandwidth mode shown in Fig. 5.7(a), with state-variables  $v_k$  being zero. Since the input drives both halves of the filter equally, the voltages in the two halves are identical, i.e.  $v_{k,l} = v_{k,u}$ .  $C_{ck}$  has zero voltage across it and  $i_{Cck} = 0$ . For the upper half, the state equation is,

$$C_k \dot{v}_{k,u} = G_m \left( v_{k-1,u} - v_{k+1,u} \right). \tag{5.6}$$

The same applies to the lower half. By definition,  $\alpha=1$  in this case. The elements of  $\bf A$  are the coefficients of voltages on the right-hand side when the coefficient of the derivative on the left-hand side is unity. When the filter switches to the low-bandwidth mode in Fig. 5.7(b), if  $C_{ck}$  were replaced by a short, the governing equation would have been

$$2C_k \dot{v}_{k,u} = G_m \left( v_{k-1,u} - v_{k+1,u} \right) \tag{5.7}$$

implying that elements of **A** are halved and  $\alpha$  becomes 1/2 as intended. But, with  $C_{ck}$  in place, when the filter switches to the low-bandwidth mode,  $C_k$  in the lower half and  $C_{ck}$  are in series. The governing equation is

$$C_{eq}\dot{v}_{k,u} = G_m \left( v_{k-1,u} - v_{k+1,u} \right) \tag{5.8}$$

where  $C_{eq} = C_k + C_k C_{ck} / (C_k + C_{ck})$ . The **A** matrix elements in (5.8) and (5.6) are in the ratio  $C_k / C_{eq}$ . For scaling all elements of **A** by the same factor,  $C_k / C_{eq}$ , and hence  $C_{ck} / C_k$ , should be the same for all k. From (5.6) and (5.8), the value of  $\alpha$ 

after switching is $C_k/C_{eq} = (C_k + C_{ck})/(C_k + 2C_{ck})$. The larger the value of $C_{ck}/C_k$, the closer $\alpha$ is to the ideal value of 1/2. But increasing $C_{ck}$ also increases the parasitic capacitance from each node to ground, reducing the highest bandwidth achievable in a given process. In this prototype, $C_{ck} = 4C_k$. Consequently, for $t > 0$, $\alpha$ changes to 1/1.8 instead of 1/2. That is, instead of $u(t/2 - \tau)$ given by (5.5), the output will be $u(t/1.8 - \tau)$. In pulse expansion mode, this change in the expansion factor is the only effect of capacitively coupling the two halves of the filter.
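As a quick numerical check of the expression above, the following minimal sketch evaluates $\alpha = (C_k + C_{ck})/(C_k + 2C_{ck})$ as a function of the ratio $C_{ck}/C_k$. The function name and the list of ratios are illustrative choices, not part of the design; only the ratio of 4 corresponds to the prototype.

```python
def alpha_after_expansion(cck_over_ck: float) -> float:
    """Bandwidth scaling factor alpha = C_k / C_eq once the lower half is off.

    With r = C_ck / C_k, C_eq / C_k = 1 + r / (1 + r), which gives
    alpha = (1 + r) / (1 + 2 r).
    """
    r = cck_over_ck
    return (1.0 + r) / (1.0 + 2.0 * r)


if __name__ == "__main__":
    # Illustrative ratios; r = 4 is the value quoted for the prototype.
    for r in (1, 4, 10, 100):
        a = alpha_after_expansion(r)
        print(f"C_ck/C_k = {r:>3}: alpha = {a:.4f}, expansion factor = {1 / a:.3f}")
```

For a ratio of 4 this evaluates to 5/9, i.e. an expansion factor of 1.8, and it approaches the ideal 1/2 only slowly as the ratio grows.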

In pulse compression mode however, capacitively coupling the two halves has an undesirable side effect. Initially the filter is in the low-bandwidth mode in Fig. 5.7(b) and the state-variables are zero before the input pulse is applied. The state equation for  $v_{k,u}$  is given by

$$C_{eq}\dot{v}_{k,u} = G_m \left( v_{k-1,u} - v_{k+1,u} \right). \tag{5.9}$$

As before,  $C_{eq} = C_k + C_k C_{ck} / (C_k + C_{ck})$ . Because  $C_{ck}$  and  $C_k$  are in series,  $v_{k,l} = v_{k,u} C_{ck} / (C_{ck} + C_k)$ . The state equation for  $v_{k,l}$  is the same as (5.9). By definition, this case corresponds to  $\alpha = 1$ . Unlike in the pulse expansion mode,  $v_{k,l} \neq v_{k,u}$  and  $i_{Cck} \neq 0$  before  $\alpha$  is changed at t = 0. After switching to the high-bandwidth mode in Fig. 5.7(a), the equations are

$$(C_k + C_{ck})\dot{v}_{k,u} - C_{ck}\dot{v}_{k,l} = G_m(v_{k-1,u} - v_{k+1,u}) \tag{5.10}$$

$$(C_k + C_{ck})\dot{v}_{k,l} - C_{ck}\dot{v}_{k,u} = G_m(v_{k-1,l} - v_{k+1,l}). \tag{5.11}$$

$v_{k,u}$ and $v_{k,l}$ evolve differently, whereas, ideally, they should have been equal to each other. Instead of determining $v_{k,u}$ and $v_{k,l}$ individually, it is more illuminating to separate them into an even-mode component $v_{k,e} = (v_{k,u} + v_{k,l})/2$ and an odd-mode component $v_{k,o} = (v_{k,u} - v_{k,l})/2$. This is analogous to representing a pair of signals by their common-mode and differential components. Ideally, $C_{ck}$ should have been a short and the odd-mode component should have been zero. The state equations for $v_{k,e}$ and $v_{k,o}$ before $\alpha$ is switched can be obtained from (5.9) and the relationship $v_{k,l} = v_{k,u}C_{ck}/(C_{ck} + C_k)$. They are

$$C_{eq}\dot{v}_{k,e} = G_m (v_{k-1,e} - v_{k+1,e}) \tag{5.12}$$

$$C_{eq}\dot{v}_{k,o} = G_m(v_{k-1,o} - v_{k+1,o}). \tag{5.13}$$

 $\alpha$  is by definition unity for this case. The state equations after  $\alpha$  is switched are obtained by rewriting (5.10) and (5.11) in terms of  $v_{k,e}$  and  $v_{k,o}$ .

$$C_k \dot{v}_{k,e} = G_m (v_{k-1,e} - v_{k+1,e}) \tag{5.14}$$

$$(C_k + 2C_{ck})\dot{v}_{k,o} = G_m(v_{k-1,o} - v_{k+1,o}). \tag{5.15}$$

Only the even-mode component appears in (5.12) and (5.14) and only the odd-mode component appears in (5.13) and (5.15). The odd- and even-mode components evolve independently, but with  $\alpha$  taking on different values after t=0. Comparing (5.14) and (5.12),  $\alpha$  for the even-mode after switching is  $(C_k+2C_{ck})/(C_k+C_{ck})$ . The even mode therefore contains the desired compressed output. Comparing (5.15) and (5.13),  $\alpha$  for the odd-mode after switching is  $C_k/(C_k+C_{ck})$  which is less than unity. The odd mode therefore contains an undesired *expanded* output.
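For completeness, the algebra behind the even- and odd-mode equations after switching can be written out explicitly. This is only a restatement of the step from (5.10) and (5.11), using the same symbols and the definition of $C_{eq}$ given earlier; no new quantities are introduced.

$$
\begin{aligned}
(5.10)+(5.11):\quad & C_k\left(\dot{v}_{k,u}+\dot{v}_{k,l}\right) = G_m\left[(v_{k-1,u}+v_{k-1,l})-(v_{k+1,u}+v_{k+1,l})\right] \\
& \Rightarrow\; C_k\,\dot{v}_{k,e} = G_m\left(v_{k-1,e}-v_{k+1,e}\right), \\
(5.10)-(5.11):\quad & (C_k+2C_{ck})\left(\dot{v}_{k,u}-\dot{v}_{k,l}\right) = G_m\left[(v_{k-1,u}-v_{k-1,l})-(v_{k+1,u}-v_{k+1,l})\right] \\
& \Rightarrow\; (C_k+2C_{ck})\,\dot{v}_{k,o} = G_m\left(v_{k-1,o}-v_{k+1,o}\right).
\end{aligned}
$$

Since $C_{eq} = C_k + C_kC_{ck}/(C_k+C_{ck}) = C_k(C_k+2C_{ck})/(C_k+C_{ck})$, the ratios of the capacitances before and after switching are

$$
\alpha_{even} = \frac{C_{eq}}{C_k} = \frac{C_k+2C_{ck}}{C_k+C_{ck}}, \qquad
\alpha_{odd} = \frac{C_{eq}}{C_k+2C_{ck}} = \frac{C_k}{C_k+C_{ck}}.
$$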

Using the definitions of  $v_{k,e}$  and  $v_{k,o}$ , and the relationship  $v_{k,l} = v_{k,u}C_{ck}/(C_{ck} + C_k)$ , the even- and odd-mode components at the switching instant t = 0 are calculated as

$$v_{k,e}(0) = \frac{v_{k,u}(0) + v_{k,l}(0)}{2} = \frac{1}{2} \frac{2C_{ck} + C_k}{C_{ck} + C_k} v_{k,u}(0) = k_e v_{k,u}(0). \tag{5.16}$$

$$v_{k,o}(0) = \frac{v_{k,u}(0) - v_{k,l}(0)}{2} = \frac{1}{2} \frac{C_k}{C_{ck} + C_k} v_{k,u}(0) = k_o v_{k,u}(0). \tag{5.17}$$

All even-mode state-variables and the even mode output suffer an attenuation because  $k_e < 1$ . As  $C_{ck}/C_k \to \infty$ ,  $k_e \to 1$  and  $k_o \to 0$ , implying that, if  $C_{ck}$  is a short circuit, the undesired odd mode disappears and desired even mode appears with no attenuation.
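The four quantities that characterize compression with finite coupling capacitance ($\alpha$ for each mode, and the initial weights $k_e$ and $k_o$) all depend only on the ratio $C_{ck}/C_k$. The sketch below simply evaluates the closed-form expressions derived above; the function name and dictionary keys are illustrative.

```python
def compression_modes(cck_over_ck: float) -> dict:
    """Even/odd-mode parameters for pulse compression, with r = C_ck / C_k.

    alpha_even, alpha_odd : bandwidth scaling of each mode after switching
    k_even, k_odd         : fraction of v_{k,u}(0) carried by each mode
    """
    r = cck_over_ck
    return {
        "alpha_even": (1.0 + 2.0 * r) / (1.0 + r),
        "alpha_odd": 1.0 / (1.0 + r),
        "k_even": 0.5 * (1.0 + 2.0 * r) / (1.0 + r),
        "k_odd": 0.5 / (1.0 + r),
    }


if __name__ == "__main__":
    # r = 4 is the ratio quoted for the prototype; the others are illustrative.
    for r in (1, 4, 20):
        m = compression_modes(r)
        print(r, {name: round(value, 3) for name, value in m.items()})
```

Note that $k_e + k_o = 1$ for any ratio, as it must, since $v_{k,u} = v_{k,e} + v_{k,o}$.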

In this implementation, $C_{ck}/C_k=4$, and $\alpha$ switches to 1.8 for the even mode and 0.2 for the odd mode, and $k_e$ and $k_o$ are 0.9 and 0.1 respectively. Therefore, instead of the desired $2\times$ compressed output $u(2t-\tau)$, an attenuated even-mode output $0.9u(1.8t-\tau)$ and an undesired odd-mode output $0.1u(0.2t-\tau)$ are obtained. The change in the value of $\alpha$ due to finite $C_{ck}/C_k$ is similar to that during pulse expansion. The purpose of the highpass filter using $C_{hp}$ and $R_{hp}$ in Fig. 5.4(c) is to further attenuate the slower, odd-mode component. $R_{hp}$ is programmed to set the corner frequency to 20 MHz in compression mode and 40 Hz in expansion mode. The filter has 480 MHz bandwidth in the low-bandwidth mode, which is the initial state for compression. Gaussian or monopulse signals which fit into this bandwidth have a substantial portion of their energy at low frequencies. The bandwidth of the undesired odd-mode signal is

five times smaller. Therefore, the highpass corner frequency of 20 MHz attenuates a substantial portion of the odd-mode signal without disturbing the desired even-mode signal<sup>1</sup>.
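To get a feel for how selective the highpass filter is, the sketch below evaluates the magnitude of a first-order RC highpass response, $|H(f)| = (f/f_c)/\sqrt{1+(f/f_c)^2}$, at a few frequencies. It assumes that the $C_{hp}$, $R_{hp}$ network behaves as an ideal first-order section with a 20 MHz corner; the probe frequencies are illustrative, with 96 MHz taken as one fifth of the 480 MHz low-bandwidth setting.

```python
import math


def highpass_gain_db(f_hz: float, fc_hz: float) -> float:
    """Magnitude (in dB) of an ideal first-order RC highpass at frequency f_hz."""
    ratio = f_hz / fc_hz
    magnitude = ratio / math.sqrt(1.0 + ratio * ratio)
    return 20.0 * math.log10(magnitude)


if __name__ == "__main__":
    fc = 20e6  # corner frequency used in compression mode
    # Illustrative probe frequencies (Hz); 96e6 = 480e6 / 5.
    for f in (2e6, 5e6, 20e6, 96e6, 480e6):
        print(f"f = {f / 1e6:6.1f} MHz: {highpass_gain_db(f, fc):7.2f} dB")
```

A first-order section only attenuates components well below the corner, so the benefit depends on how much of the odd-mode energy lies below 20 MHz; this sketch does not attempt to quantify that.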

The coupling capacitors  $C_{ck}$ , which are realized using anti-parallel connected pMOS capacitors, experience no signal swing in high-bandwidth mode and some signal swing in low-bandwidth mode. Their non-linearity is exercised in the latter, increasing the distortion.

#### 5.4 Transconductor

The APF prototype in Fig. 5.4 assumes ideal transconductors with zero output conductance and no parasitic poles. To approximate these conditions, the transconductor used for the APF architecture in Chapter 2 is reused. Fig. 5.8 shows the transistor level schematic. The



Figure 5.8: Schematic of the transconductor used in APF7.

differential pair M1,2 realizes the transconductance and the cross-coupled pair M5,6 realizes the negative conductance. The pMOS current sources M3,4 are driven by the

<sup>&</sup>lt;sup>1</sup>One could also attempt to extract only the even-mode output by summing the signals from both halves instead of taking the output from only the upper half as done in Fig. 5.6. But, the states in the low bandwidth mode are scaled by  $C_{eq}/C_k$ , unlike the input to the lower half which is unscaled. This requires another set of transconductors with different non-integer weights to form the summing taps from the lower half. After considering these alternatives, the arrangement in Fig. 5.6 with the high pass filter shown in Fig. 5.4(c) has been chosen.

common mode feedback (CMFB) signal  $V_{cmfb}$  derived from the split transistor CMFB circuit shown in Fig. 5.9. Since the output conductance is canceled, minimum length transistors can be used in the transconductor without compromising the DC gain. This minimizes the parasitic capacitances and maximizes the achievable bandwidth.



Figure 5.9: Common mode feedback circuit for the transconductor in Fig. 5.8.

For pulse expansion / compression the bandwidth has to be changed instantaneously, i.e. some transconductors have to be rapidly turned on or off. When a transconductor is turned off, dummy differential pairs M1d, 2d and M5d, 6d, biased at a small current, are switched in by the logic signal ENB so that the capacitances at the input and output nodes do not change. Though M1, 2 and M1d, 2d have different currents, they are in deep inversion, and their capacitances are nearly identical. The tail nodes of the main differential pair and the cross coupled pair are connected to a voltage $V_{cmx}$ (nominally $V_{cm}$) through switches S3, 4 so that these transistors are turned off. The pMOS current sources M3, 4 are turned off by connecting their gates to $V_{DD}$.

Fig. 5.10 shows sources of error when the transconductor is switched. In the prototype filter,  $V_{cmx}$  is provided externally. This means that the tail current from the differential pair flows through the bond wire inductance of this pin. This large current step causes large transients on the tail node and the ground node. To avoid this, an additional current source M0d is added to the circuit. This is switched off when the transconductor is turned off and switched on when the transconductor is switched on, in principle maintaining a constant current through the bond wire inductance. Transients during



Figure 5.10: (a) Switched transconductor in the ladder filter. (b) Unbalanced transconductor while switching (negative conductance not shown).

this switching are filtered by $C_{vcm}$. At the instant the transconductor is turned off, due to the signal that is present, the input pair transistors have different gate-to-source and drain-to-source voltages, and the load transistors have different drain-to-source voltages. Since the switching is not instantaneous, differential error currents are injected into the integrating capacitors. This effect is most prominent for the transconductors closer to the input (those connected to $C_{1,2,3}$ in Fig. 5.4) in APF7. This observation is a corollary of the discussion in Section 3.4.1. For small current injections, their effect at the APF output can be estimated by evaluating the adjoint network as shown in Fig. 3.9. The number of zeros in the transfer functions from the input to the APF nodes (and hence the high frequency contributions of the injected noise beyond the signal band) decreases from $V_1$ to $V_7$. The lowpass filter LPF5 which follows helps limit these fast transients.

When the transconductors are required to rapidly turn on during pulse compression, the reverse operation takes place. The CMFB loop wakes up to drive M3-4 and is designed to settle fast enough such that the differential properties of the transconductor remain unaffected. The CMFB amplifier shown in Fig. 5.9 is always on and is switched in and out of the loop through S1-2 of Fig. 5.8. Also since the tail current sources in Fig. 5.8 are never turned off, the charging time of the tail node capacitance dominates the transconductor's wake up time. This capacitance is restricted to 130 fF. The main transconductor has a nominal tail current of 2.55 mA, and  $V_{cm}$  of 800 mV. Simulations show that the transconductor's bias current takes around 100 ps to settle to within 90%



Figure 5.11: Simulated bias current settling time of the transconductor of Fig. 5.8.

of its intended value. This is shown in Fig. 5.11. To ensure fast settling of the CMFB loop during wake up, the CMFB transconductor consumes $\approx 3$ mW of power. If only pulse expansion is desired, a slower CMFB loop with lower power consumption can be used.

### 5.4.1 Summing taps



Figure 5.12: (a) Simplified schematic of the all-pass taps. (b) Architecture of a summing tap unit.

In case of an ideal LC ladder, or an ideal  $G_m$ –C filter, the weights of the summing taps in the differential all-pass architecture (Fig. 5.4) are in the ratio of -1:2. Since the tap weights are integer multiples of each other, a unit transconductor can be used to

implement the summing operation. This is shown in Fig. 5.12(a).

Unlike the APF architecture of Chapter 2, pulse expansion/compression does not require gain tuning. Hence, the supply of the summing taps of Fig. 5.12 is kept fixed at $V_{DD}$. Fig. 5.12(b) shows the schematic of a summing tap unit cell. The input pair M1, M2 provides the summing transconductance. In order to minimize the quiescent drop across $R_L$, and keep the input pair M1, 2 in saturation while supplying it with sufficient bias current to maintain $G_m \approx 15 \, \mathrm{mS}$, most of the bias current through M1, M2 is diverted into M4, 5.

The output parasitic capacitance ($C_p$) in Fig. 5.12(a) introduces a pole and causes a droop in the frequency response. The interconnect resistances to the filter capacitors also introduce a similar droop. Wider interconnects increase the parasitic capacitance at each of the filter states and compromise the bandwidth. Hence, a high pass response to equalize the magnitude droop is included in the summing taps. Fig. 5.12 shows the simplified schematic. $C_z$ creates a high pass response with a zero at $1/(2\pi RC_z)$. Including the effect of layout parasitics, the combination of APF7 and LPF5 has a 3 dB bandwidth of 900 MHz.

# **5.5** Pulse generator

Measurement of pulse expansion and compression requires high frequency pulses which are time and bandlimited. A Gaussian pulse and its derivatives satisfy this requirement. A pulse generator, producing a Gaussian pulse and a monopulse (first derivative of a Gaussian), is also included in the chip. It is based on the principle that the impulse response of a high order Bessel filter closely resembles an ideal Gaussian curve [59][60]. This property has been utilized in [32] to generate short Gaussian pulses and their higher derivatives. In this work, the Gaussian pulse generator consists of a narrow pulse ($\approx$ impulse) generator followed by a seventh order Bessel lowpass filter, Bessel7. Bessel7 has the same architecture as APF7 in Fig. 5.4, without the summing taps and with capacitor values adjusted for flat group delay. This is shown in Fig. 5.13(a). The impulse response of the last state-variable, $V_7$, is a Gaussian pulse, and that of $V_6$ is a Gaussian monopulse [32], which is used to test the system. The transconductors of Bessel7 are half the size of those used in APF7 since matching is not critical for generating test pulses.
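The resemblance between a Bessel impulse response and a Gaussian can be checked numerically. The following is a minimal sketch using only the Python standard library: it builds the seventh order delay-normalized Bessel transfer function $H(s) = a_0/\sum_k a_k s^k$ from the reverse Bessel polynomial coefficients, integrates its impulse response with a fourth-order Runge-Kutta scheme, and compares it with a Gaussian of the same mean and variance. The normalization (unit DC group delay), the time step, and all function names are assumptions made for illustration; they do not describe the on-chip Bessel7, whose element values are not reproduced here.

```python
import math


def bessel_coeffs(n):
    """Coefficients a_k (k = 0..n) of the order-n reverse Bessel polynomial."""
    f = math.factorial
    return [f(2 * n - k) // (2 ** (n - k) * f(k) * f(n - k)) for k in range(n + 1)]


def impulse_response(n, t_end=4.0, dt=1e-3):
    """Impulse response of H(s) = a_0 / sum_k a_k s^k, integrated with RK4.

    The state is [y, y', ..., y^(n-1)]; a unit-area impulse at t = 0 is
    modelled by the initial condition y^(n-1)(0+) = a_0 / a_n.
    """
    a = bessel_coeffs(n)

    def deriv(x):
        top = -sum(a[k] * x[k] for k in range(n)) / a[n]
        return x[1:] + [top]

    x = [0.0] * (n - 1) + [a[0] / a[n]]
    ts, hs = [], []
    for i in range(int(round(t_end / dt)) + 1):
        ts.append(i * dt)
        hs.append(x[0])
        k1 = deriv(x)
        k2 = deriv([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
        k3 = deriv([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
        k4 = deriv([xi + dt * ki for xi, ki in zip(x, k3)])
        x = [xi + dt * (p + 2 * q + 2 * r + s) / 6.0
             for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]
    return ts, hs


def gaussian_match(ts, hs):
    """Area, mean, variance of h(t) and its correlation with a matched Gaussian."""
    dt = ts[1] - ts[0]
    area = sum(hs) * dt
    mean = sum(t * h for t, h in zip(ts, hs)) * dt / area
    var = sum((t - mean) ** 2 * h for t, h in zip(ts, hs)) * dt / area
    g = [math.exp(-(t - mean) ** 2 / (2.0 * var)) for t in ts]
    num = sum(h * gi for h, gi in zip(hs, g))
    den = math.sqrt(sum(h * h for h in hs) * sum(gi * gi for gi in g))
    return area, mean, var, num / den


if __name__ == "__main__":
    ts, hs = impulse_response(7)
    area, mean, var, corr = gaussian_match(ts, hs)
    print(f"area = {area:.4f}, mean = {mean:.4f}, variance = {var:.4f}, "
          f"correlation with Gaussian = {corr:.4f}")
```

For this normalization the variance of the impulse response follows from the polynomial coefficients as $1/(2n-1)$, i.e. $1/13$ for $n = 7$, which gives an independent check on the numerical integration.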



Figure 5.13: (a) Gaussian and monopulse generation architecture. (b) Schematic of the impulse generator circuit and illustration of impulse generation.



Figure 5.14: Simulated differential outputs of the impulse, Gaussian pulse and monopulse generators.

The technique to produce a sharp (pseudo) differential pulse mimicking the impulse is shown in Fig. 5.13(b). A clock and its delayed and inverted form are fed to the nMOS transistors M1 and M2 respectively, whose drains are connected to a low voltage supply $(V_{DDL})$ through a resistance R. During any clock transition there is a brief span of time when both clk and clkb are either low or high. This changes the current through R and generates a short pulse. The delay between the adjacent edges of clk and clkb is kept around 100 ps. The combination of R, M3 and M4 produces impulses of opposite polarities, thus enabling a differential input to the filter. The sharp impulses are AC coupled to Bessel7 to produce the required Gaussian pulses and monopulses. The resistance R is realized by an nMOS pass transistor, MR, whose gate is controlled by digital logic to enable or disable pulse generation. $V_{DDL}$ is set to around 200 mV. This is necessary to ensure the impulse heights are smaller than the overdrives at which the input transistors of Bessel7 are biased. The simulated differential outputs of the impulse generator and the Gaussian and monopulse generators are shown in Fig. 5.14.

# 5.6 Prototype chip architecture

Fig. 5.15 shows the block diagram of the prototype chip. APF7 and LPF5 form the filter which stores the signal and performs pulse expansion or compression.  $S_{cmpb}$  is a digital signal which enables the RC high pass filter for compression and disables it for expansion. When  $S_{cmpb}$  is high,  $M_{hp}$  is off, but, since it is in an n-well connected



Figure 5.15: Simplified block diagram of the chip.

to  $V_{cm}$ , it provides a very high resistance path to  $V_{cm}$ , resulting in a very low cutoff frequency for the highpass filter.

The input MUX allows us to drive the programmable filter from an external input or one of the two internally generated test inputs. The output MUX allows us to probe various points in the filter. The polarity of the output can be inverted using the control bit, FLIP. Taking the difference between measurements with normal and inverted positions of the MUX helps cancel input-output feedthrough [40]. The simplified schematic



Figure 5.16: Output buffer schematic.

of the output MUX is shown in Fig. 5.16(a). It consists of three transconductors, one of which is selected by a decoding logic block. The schematic of the transconductor is shown in Fig. 5.16(b).  $f_{1-3}$ ,  $fb_{1-3}$  are logic signals used to flip the polarity of the output buffers. The input termination resistance R is used for input matching and deembedding the effect of noise of the measurement setup as explained in [40]. The rest of the circuitry is for testing purposes. The clock (clk) fed to the impulse generator produces a sharp pulse ( $\approx$  impulse) which is AC coupled to the seventh order equiripple group delay LPF to generate the Gaussian/monopulses required for testing.

# **Chapter 6**

# Measurement Results for Pulse Expansion and Compression Prototype



Figure 6.1: (a) Chip photograph and (b) snapshot of the test board.

The test chip was fabricated in a  $0.13 \,\mu m$  CMOS process, packaged in a standard QFN-48 package and mounted on a four layer printed circuit board for testing. The die

photograph and the snapshot of the test board are shown in Fig. 6.1. The chip occupies an active area of 1.6 mm<sup>2</sup>.



Figure 6.2: Magnitude and group delay of (a) all-pass filter (APF7), and (b) the total filter chain (APF7+LPF5) for both bandwidth settings.

Fig. 6.2 shows the simulated and measured magnitude and group delay of APF7 and of the total filter chain for the two bandwidth settings. The effect of the test setup is de-embedded using the technique in [40]. Group delay and bandwidth scaling are clearly seen. The bandwidths ($\approx$ frequency at which the group delay falls 10% below its nominal low frequency value) in the two settings are 870 MHz and 472 MHz respectively. In the high bandwidth mode, APF7 has a group delay of $2.82\pm0.23$ ns and a magnitude variation of 2.4 dB. The total filter, APF7 + LPF5, has a group delay of $3.15\pm0.3$ ns and a magnitude droop of 4.3 dB. For the lower bandwidth setting, APF7 has a group delay of $5\pm0.4$ ns and a magnitude variation of 2.4 dB. The total filter has a group delay of $5.8\pm0.45$ ns and a magnitude droop of 4.1 dB. The variation below 100 MHz in the high bandwidth setting

is due to the mismatch of integrating capacitors between the two filter halves on either side of the coupling capacitors, which gives rise to pole zero doublets. The extra group delay ripple in the passband as compared to simulation is attributed to the mismatch between the filter's transconductors, which are spread over a wide area; the effect of such mismatch has been analysed in Chapter 4.
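The scaling between the two settings can be read directly from the measured APF7 numbers. The short sketch below computes the bandwidth ratio and the nominal group delay ratio from the values quoted above; both would equal 1.8 if the scaling followed the $C_{ck} = 4C_k$ analysis of Section 5.3.4 exactly. The function name and argument names are illustrative.

```python
def scaling_ratios(bw_high_mhz=870.0, bw_low_mhz=472.0,
                   gd_high_ns=2.82, gd_low_ns=5.0):
    """Return (bandwidth ratio, group delay ratio) between the two settings.

    Defaults are the measured APF7 bandwidths and nominal group delays
    quoted in the text.
    """
    return bw_high_mhz / bw_low_mhz, gd_low_ns / gd_high_ns


if __name__ == "__main__":
    bw_ratio, gd_ratio = scaling_ratios()
    print(f"bandwidth ratio   = {bw_ratio:.2f}")
    print(f"group delay ratio = {gd_ratio:.2f}")
```

Both ratios come out within a few percent of 1.8, which is consistent with the quoted tolerances on the group delay.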



Figure 6.3: Measured input noise spectral density for APF7 and APF7 + LPF5 for (a) high bandwidth and (b) low bandwidth mode.

Fig. 6.3 shows the measured input noise spectral density for the high and low bandwidth modes for the APF7 and the total filter chain. The technique described in [40] and explained in Appendix A was used to de-embed the frequency-dependent characteristics of the measurement paths. However, the chip architecture of Fig. 5.15, unlike the one in Fig. 3.10, has an input MUX. Following the analysis presented in [40] and in Appendix A, it can be appreciated that the gain of the input MUX needs to be de-embedded for accurate estimation of the noise at the input of the APF7 in Fig. 5.15. Since this could not be explicitly measured (due to the absence of a direct path around the input MUX in Fig. 5.15), its gain has been estimated from simulation, which shows a nominal DC gain of approximately 6 dB which drops by less than 0.1 dB across the bandwidth of the filter. Note that this affects the measurement of distortion identically, thus keeping the dynamic range of the filter unaffected. The integrated inband input referred noise for APF7 (APF7 + LPF5) in the high (100–900 MHz) and low (100–480 MHz) bandwidth modes is 349 $\mu$V (423 $\mu$V) and 330 $\mu$V (416 $\mu$V) respectively.

Fig. 6.4 shows the measured IIP3 (with test tones 1 MHz apart) of the system for both bandwidth modes. APF7 and the filter chain have worst case IIP3s of $-5.6 \, \mathrm{dBm}$ 



Figure 6.4: Measured IIP3 with  $50\Omega$  reference resistance for the APF7 and APF7 + LPF5 for (a) high bandwidth and (b) low bandwidth mode.

$(332\,\mathrm{mV_{ppd}})$ and $-5.7\,\mathrm{dBm}$ $(327\,\mathrm{mV_{ppd}})$ respectively in the low bandwidth setting. Since the non-linear coupling capacitors do not contribute any distortion in the high bandwidth mode, the IIP3 improves to $-4.5\,\mathrm{dBm}\,(376\,\mathrm{mV_{ppd}})$ and $-4.75\,\mathrm{dBm}\,(365\,\mathrm{mV_{ppd}})$ respectively. For APF7, the maximum input voltage swing corresponding to 1% IM3 (third order intermodulation component) is $33\,\mathrm{mV_{ppd}}$ at the band edge in the low bandwidth mode, and the same for the filter chain is $32\,\mathrm{mV_{ppd}}$. For the high bandwidth mode, the maximum input voltage swing increases to $38\,\mathrm{mV_{ppd}}$ and $37\,\mathrm{mV_{ppd}}$ for APF7 and APF7 + LPF5 respectively.
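The relation between the quoted IIP3 values in dBm and in $\mathrm{mV_{ppd}}$, and between IIP3 and the 1% IM3 input level, can be sketched as follows. The conversion assumes that the power refers to a single sinusoidal tone dissipated in the $50\,\Omega$ reference resistance, and the IM3 relation assumes ideal third-order behaviour, $\mathrm{IM3\,(dBc)} = 2(P_{in} - \mathrm{IIP3})$. Both are assumed conventions chosen because they reproduce the quoted figures to within rounding; the function names are illustrative.

```python
import math


def dbm_to_mvpp(p_dbm: float, r_ohm: float = 50.0) -> float:
    """Peak-to-peak voltage (mV) of a sinusoid delivering p_dbm into r_ohm."""
    p_watt = 1e-3 * 10.0 ** (p_dbm / 10.0)
    v_rms = math.sqrt(p_watt * r_ohm)
    return 2.0 * math.sqrt(2.0) * v_rms * 1e3


def input_for_im3(iip3_dbm: float, im3_dbc: float) -> float:
    """Per-tone input power (dBm) at which IM3 equals im3_dbc relative to the tone.

    Assumes ideal third-order behaviour: IM3_dBc = 2 * (P_in - IIP3).
    """
    return iip3_dbm + im3_dbc / 2.0


if __name__ == "__main__":
    for iip3 in (-5.6, -5.7, -4.5, -4.75):
        p_1pct = input_for_im3(iip3, -40.0)  # 1% IM3 corresponds to -40 dBc
        print(f"IIP3 = {iip3:6.2f} dBm -> {dbm_to_mvpp(iip3):6.1f} mVpp, "
              f"1% IM3 at {dbm_to_mvpp(p_1pct):5.1f} mVpp")
```

Under these assumptions the 1% IM3 input level is simply one tenth of the IIP3 voltage, since $-40$ dBc corresponds to an input 20 dB below the intercept point.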

To demonstrate pulse expansion / compression, the input waveforms were generated using the pulse generator of Fig. 5.13. Fig. 6.5 shows the simplified test set up. The parts marked in blue represent active parts of the set up. The clock, clk, from an external clock source generates the test pulse of choice (Gaussian or monopulse) in Board 1. This emulates an "external" input to the test chip mounted on Board 2. The interval between the edge of clk and the pulse appearing at the output of Board 2 (without expansion / compression) is deterministic and constant<sup>1</sup> and can be observed in the digital oscilloscope by setting $\alpha=1$ in Board 2. Using this information, a delay is set in the programmable digital delay block in Board 1 to apply the trigger $(\alpha)$ when the pulse has entered the device under test in Board 2 and enable expansion / compression. Maxim DS1023S is used as the programmable delay block, which has a minimum delay resolution of 250 ps.

<sup>1</sup>Disregarding the effect of noise.

Figure 6.5: Simplified test set up for pulse expansion and compression.

Fig. 6.6 and Fig. 6.7 demonstrate expansion of a Gaussian pulse and a monopulse with input FWHM of 1.4 ns and 2.4 ns respectively. In each case, the input pulse (output of the pulse generator), output of the filter without expansion ($\alpha = 1$), output of the filter post expansion ($\alpha$ switched from 1 to 1/1.8 after the input pulse falls to zero), and the expected output pulse are shown. The "expected output" is obtained as follows: the measured output without expansion is mathematically expanded in time and scaled in amplitude such that the starting point (at the switching instant $t_0$) and the first peak are the same as the measured expanded pulse. If expansion is ideal, the measured expanded pulse must coincide with the "expected output". It is seen from Fig. 6.6 and Fig. 6.7 that they are very close to each other. The measured expansion factor is within 1.78-1.81. The variation is suspected to be due to the resolution in capturing the waveforms. The waveforms shown in the figures are obtained by taking the difference between the waveforms captured with the output MUX in the direct and cross-connected positions. This eliminates effects of direct coupling on the package and the board. But, the effects of the bondwires, PCB traces, coaxial cables, buffer and the MUX remain. The input pulse has extra transients after the main pulse. This is suspected to be from the signal bounce initiated by the generation of sharp impulses. Multiple sharp impulses at the input of LPF7



Figure 6.6: Demonstration of pulse expansion with Gaussian input pulse with FWHM of 1.4 ns.



Figure 6.7: Demonstration of pulse expansion with a monopulse input with FWHM of 2.4 ns.

used for pulse generation in Fig. 5.15 can generate multiple pulses at its output. Similar transients are seen at the filter's output as well.

The trigger to change the filter's bandwidth has been applied externally through an off-chip programmable digital delay generator<sup>2</sup>. In all the observed cases, the rms error of the expanded pulse with respect to its expected output within the FWHM is more than 34 dB below the peak to peak output.
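The construction of the "expected output" and the error metric can be illustrated with a small sketch. It time-scales a sampled waveform about a switching instant $t_0$ using linear interpolation, and expresses the rms difference between two waveforms in dB relative to the peak to peak value of the reference. The synthetic Gaussian test pulse, the sample spacing, the choice of $t_0$, and all function names are assumptions for illustration; the amplitude alignment step described for the measurements is omitted, and the error is computed over the whole record rather than within the FWHM.

```python
import math
from bisect import bisect_right


def interp(t, ts, ys):
    """Linearly interpolate the samples (ts, ys) at time t; zero outside the record."""
    if t < ts[0] or t > ts[-1]:
        return 0.0
    i = min(bisect_right(ts, t) - 1, len(ts) - 2)
    frac = (t - ts[i]) / (ts[i + 1] - ts[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def time_scaled(ts, ys, t0, alpha):
    """y(t) for t <= t0 and y(t0 + alpha * (t - t0)) for t > t0, on the grid ts."""
    return [interp(t if t <= t0 else t0 + alpha * (t - t0), ts, ys) for t in ts]


def fwhm(ts, ys):
    """Width of the region where a single-lobe pulse exceeds half its maximum."""
    half = 0.5 * max(ys)
    above = [t for t, y in zip(ts, ys) if y >= half]
    return above[-1] - above[0]


def rms_error_db(measured, expected):
    """RMS difference in dB relative to the peak to peak value of `expected`."""
    n = len(expected)
    rms = math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, expected)) / n)
    return 20.0 * math.log10(rms / (max(expected) - min(expected)))


if __name__ == "__main__":
    ts = [0.01 * i for i in range(2001)]                      # 0 to 20 ns
    ys = [math.exp(-(t - 5.0) ** 2 / (2 * 0.6 ** 2)) for t in ts]
    expanded = time_scaled(ts, ys, t0=2.0, alpha=1 / 1.8)     # switch before the pulse
    print(f"FWHM ratio = {fwhm(ts, expanded) / fwhm(ts, ys):.2f}")
```

With $\alpha = 1/1.8$ the FWHM of the synthetic pulse grows by a factor close to 1.8, limited only by the sample spacing.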

<sup>&</sup>lt;sup>2</sup>The trigger could in principle be automatically generated by comparing the signal at the end of the filter ( $V_7$  in Fig. 5.4) to a threshold.  $V_7$  rising above the threshold indicates that the signal has entered the filter. To compensate for delays in the trigger pulse generation, an earlier state-variable such as  $V_5$  or  $V_6$  could be used.



Figure 6.8: Compression of a Gaussian input pulse with FWHM = 2 ns.



Figure 6.9: Compression of a monopulse input with FWHM = 3.2 ns.

Fig. 6.8 and Fig. 6.9 demonstrate waveform compression with a Gaussian pulse and a monopulse. The observed compression factor is 1.72 and the ratio between the peaks of the uncompressed and the compressed pulses is 0.83. The rms error of the compressed pulse with respect to its expected output within the FWHM is more than 33 dB below the peak to peak output.

Table 6.1 compares the performance of this work with the state of the art pulse expansion architectures in literature. [45][53][52] use chirped carrier and dispersion based continuous time techniques for pulse expansion. Among them, [45] demonstrates a monolithic implementation. Compared to these, the demonstrated technique occupies a smaller area. This is a direct consequence of the solution exploiting the fundamental principle of expansion of the natural response of a filter on bandwidth scaling. This requires neither a high frequency carrier nor a dispersive medium. The scheme in [45] requires linearity in the chirped carrier, modulator, group delay variation, and the demodulator, and deviations in any of them cause significant distortion. The proposed technique, on the other hand, depends mainly on transconductor linearity. Distortion is not reported in [45], but appears to be visibly higher in the measured results. [44] demonstrates expansion based on storing the input on an array of capacitors. The high frequency (20 GHz) clock distribution network along with an array of sampling capacitors also increases the area. The proposed implementation does not require a clock and has a smaller area. The achievable expansion and compression factors are however higher in the sampling technique. The proposed technique is advantageous when a modest expansion factor (< 10) is desired without the availability of accurate multi-GHz clocks for GHz bandwidth pulses. Electronic pulse compression using chirped carrier and off-chip dispersive media is demonstrated in [54]. To the best of our knowledge the proposed work is the first monolithic demonstration of broadband pulse compression.

Table 6.1: Table comparing expansion / compression architectures.

|                          | This work                   | [45] (MTT 2012) | [53] (MTT 2007) | [52] (TCAS 2005) | [44] (ISSCC 2015) | [54] (RWS 2008) |
|--------------------------|-----------------------------|-----------------|-----------------|------------------|-------------------|-----------------|
| Technology               | 130 nm                      | 130 nm          | Discrete        | Discrete         | 65 nm             | Discrete        |
| Method                   | Cont. time                  | Cont. time      | Cont. time      | Cont. time       | Sampling<sup>†</sup> | Cont. time   |
| Carrier/clock            | NA                          | Chirped carrier | Chirped carrier | Chirped carrier<sup>¶</sup> | Sampling clock | Chirped carrier |
| Bandwidth                | 0.87 GHz                    | 1 GHz           | 8 GHz           | 4 GHz            | 3.75-4.25 GHz     | 1.5 GHz         |
| Expansion factor         | 1.8                         | 2               | 5               | 1.5              | 400               | -               |
| Compression factor       | 1.7                         | -               | -               | -                | -                 | 2.8             |
| Power (mW)               | 370 (high BW), 260 (low BW) | -               | -               | -                | 70<sup>†</sup>    | -               |
| Area (mm<sup>2</sup>)    | 1.4                         | 4.5<sup>‡</sup> | -               | -                | 4<sup>†</sup>     | -               |

<sup>†</sup> External 20 GHz clock. Clock generation power and area not included. <sup>‡</sup> Off chip peak detector. <sup>¶</sup> Optical carrier.

Table 6.2: Performance summary of the test chip.

|                | Parameter                         | High bandwidth mode | Low bandwidth mode |
|----------------|-----------------------------------|---------------------|--------------------|
| Technology     |                                   | 0.13 μm CMOS        |                    |
| Supply voltage |                                   | 1.2 V               |                    |
| Active area    |                                   | $1.6\,\mathrm{mm}^2$ |                   |
| APF7           | Bandwidth (MHz)                   | 870                 | 472                |
|                | Group delay (ns)                  | $2.82 \pm 0.23$     | $5 \pm 0.4$        |
|                | Gain (dB)                         | -1.5 to -4.2        | -1.5 to -3.7       |
|                | Input noise ($\mu$V)              | 349                 | 330                |
|                | $V_{\mathrm{ippd}}$ @ 1% IM3 (mV) | 38                  | 33                 |
|                | IIP3 (dBm @ $50\,\Omega$)         | -4.5                | -5.6               |
| APF7 + LPF5    | Bandwidth (MHz)                   | 900                 | 474                |
|                | Group delay (ns)                  | $3.15 \pm 0.3$      | $5.8 \pm 0.45$     |
|                | Gain (dB)                         | -1.6 to -6.2        | -1.6 to -6.2       |
|                | Input noise ($\mu$V)              | 423                 | 416                |
|                | $V_{\mathrm{ippd}}$ @ 1% IM3 (mV) | 36                  | 32                 |
|                | IIP3 (dBm @ $50\,\Omega$)         | -4.75               | -5.7               |
| Power          | APF7 (mW)                         | 182                 | 111                |
|                | LPF5 (mW)                         | 121                 | 85                 |
|                | PulseGen (mW)                     | 50                  |                    |
|                | Switching overhead (mW)           | 70                  |                    |

Table 6.2 summarizes the performance of the chip. At the high (low) bandwidth setting, the power consumptions of APF7 and LPF5 are 182 mW (111 mW) and 121 mW (85 mW) respectively. The 24 CMFB circuits, which are designed for fast settling of the filter's transconductors during compression and are always on, consume around 72 mW. The pulse generator consumes 50 mW, most of it in LPF7. The dummy tail currents in the transconductors in Fig. 5.10 consume 70 mW.

Potential applications of the time-scaling technique include slowing down a high speed signal so that it can be processed by low-speed circuitry and speeding up a low speed signal so that low-speed circuitry can be used to generate high-speed signals.

# **Chapter 7**

# Gain Enhanced High Frequency Transconductor With On-Chip Tuned Negative Conductance Load

The transconductors used in the all-pass filters in Chapters 2 and 5 use negative conductance loads to cancel the large parasitic conductance of a single stage differential amplifier. This chapter discusses the motivation behind such an architecture, describes its design details, and presents its simulation results.

#### 7.1 Introduction

High frequency continuous time $G_m - C$ filters require integrators with high unity gain bandwidth (UGB). However, since ideal integrators are not a reality, transconductors with high DC gain, low capacitive load, and a parasitic pole position substantially higher than the filter's cut-off frequency are sought. These conditions impose conflicting requirements on the transconductor design. Cascoding or cascading of transistors to improve DC gain inserts parasitic poles, which introduce additional undesired phase lag. The high frequency requirements can be met by using a simple transconductor topology with a minimum number of internal nodes [30] and minimum length transistors, but such topologies usually have poor DC gain. The DC gain can be improved by using a negative incremental conductance in parallel with the output, which effectively cancels the output conductance of the transconductor without adding any internal nodes to the signal path (Fig. 7.1) [29]. However, this method requires tuning of the negative conductance to make it equal to the parasitic conductance at the output nodes. In [29], this is achieved by monitoring and adjusting the supply ($V_{DDT}$) of I4, I5 through a dedicated Q tuning loop. The tuning scheme requires an accurate reference clock, which is a system overhead. Moreover, since this method is based on the usage of inverters as $G_m$s, they are not biased at fixed currents. Hence, they draw transient currents from the supply, which increases the potential for signal coupling between stages [61], especially for higher order high frequency filters.

Figure 7.1: Schematic representation of Nauta transconductor [29].

Even though cancellation of parasitic conductance with a negative conductance load for single stage transconductors has been reported in the literature ([29],[62]), none of these works has investigated a way to implement the cancellation across PVT without any external intervention. The following section presents a fully differential current biased transconductor architecture based on output conductance cancellation by a negative conductance load. The negative conductance is automatically adjusted with fully on-chip circuitry across PVT.

# 7.2 Proposed architecture

Consider the transconductor schematic shown in Fig. 7.2. $G_m$ is a fully differential transconductor cell where M1, M2 form the input pair, M3, M4 the PMOS loads, and M0 the tail current source. $V_{cmfb}$ is the common mode feedback voltage generated by the circuit in Fig. 7.3, which keeps the average of $V_{outp}$ and $V_{outm}$ at $V_{cm}$. $I_{gm}$ is a current reference setting the value of $G_m$, which can be derived in a variety of ways ([30],[63]). It is kept constant for the context of this discussion. High frequency operation calls for all the transistors in the differential signal path to be of minimum length. Short channel devices suffer from low output resistance and hence load the output nodes ($V_{outp}, V_{outm}$). To mitigate this effect, an incremental negative conductance load is added in shunt with the transconductor. M5, M6 form the cross coupled pair which produces the negative differential conductance ($G_N$) and M8 forms its tail current source. Equating the input voltages and output currents between Fig. 7.2(a) and (b), $G_m = G_{m|M1,2}/2$ and $G_N = G_{m|M5,6}/2$. The total differential output conductance (for the half circuit) at the output nodes $(V_{outp}, V_{outm})$ of the transconductor in Fig. 7.2 can thus be expressed as $(g_{ds1} + g_{ds3} + g_{ds5})/2 - G_N$, where $g_{ds1}, g_{ds3}, g_{ds5}$ are the parasitic conductances of M1, 2, M3, 4 and M5, 6 respectively. For the structure in Fig. 7.2 to behave like an ideal transconductor, the total differential output conductance must be set to zero.

Figure 7.2: (a): Transconductor $G_m$ loaded with negative conductance $G_N$. (b): Symbolic representation of (a).

Fig. 7.3 shows the common mode feedback circuit used to maintain the common mode of the transconductor's outputs at $V_{cm}$. The input transistors (M1a, M1b, M2) in Fig. 7.3 should be sized such that their capacitance contribution at the output of $G_m$ is non-dominant, yet large enough to ensure minimum deviation of the transconductor's output voltages due to their offsets.

# **7.3** $G_N$ tracking $g_{ds}$

The basic principle of generating a  $G_N$  which tracks an output conductance  $g_{ds}$  is similar to the method used in [30], and its incremental picture is shown in Fig. 7.4.



Figure 7.3: Common mode feedback circuit.

An incremental current $(G_N \Delta V)$ is generated by applying a small voltage $(\Delta V)$ to a transconductor, $G_N$. When passed through the output conductance, $g_{ds}$, it produces a differential voltage, $2G_N\Delta V/g_{ds}$, which is compared with the applied voltage, $\Delta V$. The value of the transconductance is adjusted by changing its bias current $(I_N)$ through negative feedback until $2G_N\Delta V/g_{ds}=\Delta V$. This ensures $G_N=g_{ds}/2$. $G_{m\_error}$ is a differential difference error amplifier [64] which compares the differential output of $G_N$ with the applied input.
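The tracking principle can be illustrated with a short behavioural sketch. It is not a model of the actual circuit: the square-root dependence of $G_N$ on $I_N$, the constant `k`, the loop step size, and all numerical values are assumptions chosen only for illustration. The loop integrates the error between the applied voltage and the voltage developed across $g_{ds}$ until the two are equal.

```python
import math

def settle_negative_conductance(g_ds, delta_v=0.02, k=0.02, step=1e-2, iters=5000):
    """Behavioural model of the tracking loop; returns (G_N, I_N) after settling.

    g_ds    : conductance to be tracked (S)
    delta_v : applied small voltage (V)
    k       : hypothetical constant in the assumed law G_N = k * sqrt(I_N)
    step    : hypothetical integrator gain of the feedback path
    """
    i_n = 1e-4  # initial guess for the bias current (A)
    for _ in range(iters):
        g_n = k * math.sqrt(i_n)             # assumed G_N(I_N) law, illustration only
        v_out = 2.0 * g_n * delta_v / g_ds   # differential voltage developed across g_ds
        error = delta_v - v_out              # what the error amplifier senses
        i_n = max(i_n + step * error, 1e-9)  # integrating negative feedback on I_N
    return k * math.sqrt(i_n), i_n

for g_ds in (1e-3, 2.4e-3, 4e-3):
    g_n, i_n = settle_negative_conductance(g_ds)
    print(f"g_ds = {g_ds * 1e3:.2f} mS -> G_N = {g_n * 1e3:.4f} mS, I_N = {i_n * 1e3:.3f} mA")
```

In this sketch the settled $G_N$ equals $g_{ds}/2$ for every value of $g_{ds}$, while the bias current needed to get there depends on the assumed device law.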



Figure 7.4: Principle of conductance tracking by a transconductor.

The circuit implementation of the architecture is demonstrated in Fig. 7.5. $G_m$ is a replica of the transconductor used in Fig. 7.2b. The transistors marked in black form the transconductor $G_N$, and $g_{ds}$ is the net parasitic conductance at $V_{outp}$, $V_{outm}$.



Figure 7.5: Schematic of transconductance generation circuit to track the output conductance  $g_{ds}$ .

A small differential DC voltage $2\Delta V$, which rides on the common mode voltage $V_{cm}$, is applied to the differential pair M5a, M6a. These voltages $(V_{cm} \pm \Delta V)$ can be derived from an on-chip reference voltage by using a resistor divider network. An incremental current, $2G_N\Delta V$, thus flows through M5a, M6a into the output conductance $(g_{ds})$, producing incremental voltages $-2G_N\Delta V/g_{ds}$ and $2G_N\Delta V/g_{ds}$ at $V_{outm}$ and $V_{outp}$ respectively. The circuit operation can be understood from the argument of negative feedback. If the transconductance of M5a, M6a is too high, $V_{outm}$ tends to drop and $V_{outp}$ tends to rise, as a result of which the transconductors T1, T2 drive the gate of M12 higher, thus reducing the current through M9a. As this current is mirrored through M8a into the differential pair, the increase of $G_N$ is corrected. Negative feedback with high loop gain forces the difference between the differential inputs of T1, T2 to zero. This ensures $2G_N\Delta V/g_{ds}=\Delta V$, which forces $G_N$ to be equal to $g_{ds}/2$ under all conditions. T1 and T2 form a differential difference amplifier [64] which amplifies the difference between the output of $G_N$ and the references $V_{cm} \pm \Delta V$. Note that the differential difference amplifier can tolerate some offset between the common mode at the output of $G_N$ and the common mode of the reference input. The schematic representation of T1, T2, T3 is shown in Fig. 7.6. M1, 4 in Fig. 7.6 should be made large enough to minimize the input offset.



Figure 7.6: Schematic of the transconductors T1, T2, T3 used in Fig. 7.5.

Note that the above argument is completely based on small signal analysis and does not rely on any transistor model. This makes the method universal. The exact value of $\Delta V$ is not critical as long as it is small enough to keep the transistors in the small signal regime and large enough to override the input referred offsets of M5a, 6a, and T1, T2 in Fig. 7.5. $I_N$ carries all the information necessary to ensure that $G_N$ accurately tracks the output parasitic conductance at the nodes $V_{outp}$, $V_{outm}$. Hence, this current is mirrored through an accurate current mirror (M13, M14) and supplied as the reference to the negative transconductance load in Fig. 7.2.

Let us now analyze the effect of mismatch on the circuit in Fig. 7.5. It can be appreciated from Fig. 7.5 that any threshold voltage mismatch in M5a, M6a adds to the applied voltage  $\Delta V$ , and the transconductor  $G_m$  generates an additional output current, thus making  $G_N \neq g_{ds}/2$ . In order to alleviate this problem a chopper operation is necessary which can average out the existing mismatches in the transconductors. Fig. 7.7 shows the implementation of the chopper used for neutralizing the effect of mismatch of input pair  $(G_N)$  and that in the transconductor,  $G_m$ . The chopping switches  $(S_1 - S_8)$  are pMOS transistors.  $\phi$  and  $\phi_b$  are complementary clocks running at 1 MHz with 50% duty cycle. The capacitors  $C_c$  and  $C_{LP}$  are used for filtering out the ripples in the loop. Note that this clock need not be accurate and can be generated on chip using a ring oscillator. In this work an external 2 MHz reference clock has been used. This is passed through an on-chip divide-by-two frequency divider followed by a non-overlapping clock generator as shown in Fig. 7.8.



Figure 7.7: Inclusion of chopping to eliminate the effect of offsets in  $G_m$  and  $G_N$ .



Figure 7.8: Schematic for generation of chopping clock.

# 7.4 Secondary effects and design trade-offs

Consider the schematic of Fig. 7.5. Due to the application of a differential input voltage of $2\Delta V$, the current in M5a is greater than that in M6a, which causes $G_{m|M5a} > G_{m|M6a}$. Nominally, $G_{m|M5a} = G_{m|M6a} = 2G_N$. For small deviations of the transconductance around $2G_N$, the Taylor series expansions of the transconductances of M5a and M6a for small $\Delta V$ can be expressed as

$$G_{m|M5a} = 2\left(G_N + \Delta V G_N' + \frac{(\Delta V)^2}{2} G_N''\right) \tag{7.1}$$

$$G_{m|M6a} = 2\left(G_N - \Delta V G_N' + \frac{(\Delta V)^2}{2} G_N''\right) \tag{7.2}$$

Assuming negligible steady state error, the loop settles when

$$G_{m|M6a}\Delta V/g_{ds} + G_{m|M5a}\Delta V/g_{ds} = 2\Delta V \tag{7.3}$$

Using (7.1) and (7.2) in (7.3), the first order terms cancel, leaving

$$G_N + \frac{(\Delta V)^2}{2} G_N'' = \frac{g_{ds}}{2}. \tag{7.4}$$

From (7.4) it is evident that for effective conductance tracking, the value of  $\Delta V$  should be low enough such that the effect of device non-linearities (second order and higher) remains negligible.
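The sensitivity to $\Delta V$ can be made concrete with a short numerical sketch. It substitutes the expansions (7.1) and (7.2) into the settling condition (7.3) and solves for $G_N$. The values used for $g_{ds}$, $G_N'$ and $G_N''$ are hypothetical and serve only to show how the tracking error scales; $G_N''$ is treated as a constant.

```python
def settled_gn(g_ds, delta_v, gn_pp):
    """G_N at which (7.3) is satisfied when (7.1) and (7.2) are substituted.

    Adding (7.1) and (7.2) cancels the first-order terms:
        Gm5a + Gm6a = 4*G_N + 2*dV^2*G_N''
    and (7.3) requires (Gm5a + Gm6a) * dV / g_ds = 2*dV.
    """
    return 0.5 * g_ds - 0.5 * delta_v ** 2 * gn_pp

def loop_balance(g_ds, delta_v, gn_p, gn_pp):
    """Return (lhs, rhs) of (7.3) evaluated at the settled G_N."""
    g_n = settled_gn(g_ds, delta_v, gn_pp)
    gm_m5a = 2.0 * (g_n + delta_v * gn_p + 0.5 * delta_v ** 2 * gn_pp)  # (7.1)
    gm_m6a = 2.0 * (g_n - delta_v * gn_p + 0.5 * delta_v ** 2 * gn_pp)  # (7.2)
    return (gm_m5a + gm_m6a) * delta_v / g_ds, 2.0 * delta_v

def tracking_error(g_ds, delta_v, gn_pp):
    """Relative deviation of the settled G_N from the ideal value g_ds / 2."""
    ideal = 0.5 * g_ds
    return (settled_gn(g_ds, delta_v, gn_pp) - ideal) / ideal

# Hypothetical values, for illustration only
G_DS, GN_P, GN_PP = 2.4e-3, 5e-3, -0.05  # S, S/V, S/V^2
for dv in (0.01, 0.02, 0.04):
    print(f"dV = {dv * 1e3:.0f} mV -> tracking error = {100 * tracking_error(G_DS, dv, GN_PP):.3f} %")
```

Under these assumptions the error grows with the square of $\Delta V$, so doubling $\Delta V$ quadruples the deviation of $G_N$ from $g_{ds}/2$, while the first derivative $G_N'$ has no effect on the settled value.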

Now consider the DC loop gain of the schematic in Fig. 7.5. Assume that the combined gain of the error amplifiers T1, T2 is $A_{err}$, and that the current mirroring ratio between M9a, M7, M8a is 1:1:1. Hypothetically breaking the loop at the input of the error amplifier, the loop gain can be expressed as

$$L(0) = A_{err} G_{m|M12} \frac{\delta G_N}{\delta I_{tail}} \Delta V \frac{1}{g_{ds}} \tag{7.5}$$

Since the DC loop gain in (7.5) is proportional to $\Delta V$, a high value of $\Delta V$ is desired. Given the conflicting requirements of (7.4) and (7.5), it is desirable to increase $\Delta V$ up to the point where the second and higher order device non-linearities start degrading the conductance tracking.

#### 7.4.1 Results and discussion

The transconductor in Fig. 7.2 was realized in a 0.13 $\mu$m CMOS process. $I_{gm}$ was adjusted to set $G_m$ at 12 mS. $I_N$ was generated from Fig. 7.7. The transconductor was simulated over a range of temperature and supply voltages. The gain of the transconductor without the negative conductance load was as low as 5 (14 dB). Using the negative conductance load, the gain went up to 48 dB, across all process, voltage, and temperature variations (Fig. 7.9), indicating the effectiveness of the negative transconductance tracking method. Simulated integrator gain distribution with mismatch for 500 Monte-Carlo runs (Fig. 7.10) shows a minimum DC gain of 34 dB post enhancement. The distribution shows 86% of cases having gains of more than 40 dB, a mean of 46 dB, and a standard deviation of 9 dB.

Figure 7.9: Transconductor gain variation with temperature and supply voltage across process corners with chopping enabled.

Figure 7.10: Transconductor gain distribution with mismatch over 500 Monte-Carlo runs with gain enhancement post chopping.
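To put these gain figures in perspective, the following sketch estimates how closely the output conductance has to be cancelled to reach a given DC gain. It assumes the simple relation gain $= G_m / g_{out}$ with $G_m$ unaffected by the load, and infers the uncancelled output conductance from the quoted gain of 5; it is an illustrative calculation, not a simulation result.

```python
import math

G_M = 12e-3              # transconductance quoted in the text (S)
GAIN_UNCANCELLED = 5.0   # quoted gain without the negative conductance load

def to_db(ratio):
    """Voltage ratio expressed in dB."""
    return 20.0 * math.log10(ratio)

# Output conductance implied by the uncancelled gain (assumes gain = G_M / g_out)
G_OUT = G_M / GAIN_UNCANCELLED

def residual_conductance(gain_db):
    """Net output conductance that yields the given DC gain."""
    return G_M / 10.0 ** (gain_db / 20.0)

def cancelled_fraction(gain_db):
    """Fraction of G_OUT that the negative conductance must cancel."""
    return 1.0 - residual_conductance(gain_db) / G_OUT

print(f"uncancelled gain: {to_db(GAIN_UNCANCELLED):.1f} dB, g_out = {G_OUT * 1e3:.2f} mS")
for gain_db in (34.0, 40.0, 48.0):
    print(f"{gain_db:.0f} dB needs residual {residual_conductance(gain_db) * 1e6:.1f} uS "
          f"({100 * cancelled_fraction(gain_db):.1f} % cancelled)")
```

Under these assumptions, a 40 dB gain corresponds to cancelling about 95% of the output conductance, which gives a feel for the tracking accuracy that the loop has to maintain.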

The proposed transconductor (with and without gain enhancement) was compared with the one in [29] (Fig. 7.1) in a 0.13 $\mu$m CMOS process. The sizes of the inverters (all identical) were set such that the combination of the power consumed by the proposed transconductor (Fig. 7.2) and the CMFB network (Fig. 7.3) was the same as the DC power consumption of the circuit in Fig. 7.1. $V_{DDT}$ in Fig. 7.1 was set such that the transconductor in Fig. 7.1 provides a DC gain >40 dB at the typical corner. The power consumption of the bias setting inverter $I_0$ was left out during comparison. Table 7.1 lists the comparative performance of the proposed transconductor (with and without conductance cancellation). The proposed transconductor architecture does not use any external gain tuning mechanism and achieves higher UGB as compared to the transconductor in [29]. This is attributed to the capacitive loading of the transconductor in Fig. 7.1, which is primarily due to the input gate capacitances of I3 to I6. Since all inverters are sized identically, the loading is greater than that in Fig. 7.2, where the capacitance offered by M5, M6 and the CMFB circuit (Fig. 7.3) is smaller owing to their lower $G_m$ requirements. The common mode stability of the transconductors in Fig. 7.1 dictates that $g_{m|I3} + g_{m|I4} > g_{m|I1}$. Hence, even if I3 and I4 are scaled down, the maximum attainable UGB is around 13 GHz. However, as pointed out in [29], this will be at the cost of linearity. Due to its inverter based topology, the transistors in Fig. 7.1 have higher overdrives, and hence can afford greater input swing, thus resulting in a higher dynamic range than the proposed architecture.

Table 7.1: Comparison of the proposed transconductor (with and without negative conductance cancellation) with the state of the art negative conductance cancellation based architecture.

|                                           | Proposed transconductor | Transconductor without load cancellation | [29]           |
|-------------------------------------------|-------------------------|------------------------------------------|----------------|
| Technology                                | 0.13 μm                 | 0.13 μm                                  | 0.13 μm        |
| $V_{DD}$                                  | 1.2 V                   | 1.2 V                                    | 1.2 V          |
| Gain                                      | >40 dB                  | 16 dB                                    | >40 dB         |
| Power                                     | 9 mW                    | 8 mW                                     | 9 mW           |
| Integrator UGB                            | 20 GHz                  | 22 GHz                                   | 11 GHz         |
| Integrated input noise                    | 0.5 mV                  | 0.48 mV                                  | 0.56 mV        |
| $V_{inp p-p}$ for THD< $-40  \mathrm{dB}$ | 220 mV                  | 220 mV                                   | 560 mV         |
| External Gain tuning                      | Not Required            | Not Applicable                           | Necessary      |

# **Chapter 8**

# Accurate Constant Transconductance Generation Without Off-chip Components

The transconductances used in the all-pass filters in Chapters 2 and 5 are stabilised and tuned using the architecture explained in Section 3.2.2. This scheme is based on the fixed  $G_m$  generation outlined in this chapter.

#### 8.1 Motivation

The necessity of constant (i.e., process, voltage and temperature invariant) transconductance $(G_m)$ bias generation circuits arises from the need for fixed transconductances in many analog circuits like filters [30], oscillators [65], and low noise amplifiers [63]. The constant $G_m$ bias circuits in the literature can broadly be categorized into the following:

- Beta-multiplier circuits [66], [67], each of which generates a $G_m$ that tracks an off-chip conductance $(G_{offchip}=1/R_{offchip})$ to maintain its constancy with changes in temperature and on-chip process variation. However, maintaining an off-chip component to augment the on-chip circuitry increases the cost of the solution. Moreover, the beta-multiplier topology is based on the assumption of square law behavior of the MOS devices, which does not hold for modern sub-micron processes. This causes considerable deviation from its ideal behavior [36], [63].
- Master slave topology of the beta-multiplier circuit which tracks an on-chip resistor made of an MOS transistor in linear region [68]. The resistance of the transistor is controlled by the master by comparing its value to a switched capacitor resistor. Even though this approach removes the need for an off-chip component, it suffers from the requirement of an accurate external clock. Furthermore, it also inherits the shortcomings of the beta-multiplier circuit.
- A temperature compensated approach to derive a constant $G_m$ is presented in [63], which also gets rid of the off-chip resistor. The authors generate current references which are proportional, constant, and complementary to the variation of absolute temperature, and use them to generate a current source which compensates for the temperature variation of electron mobility in an MOS transistor. In this approach as well, the authors assume a square law model of the MOS transistor biased in saturation whose $G_m$ is to be fixed. Moreover, this approach is fully dependent on the model of the particular device whose transconductance is being monitored, and hence is difficult to generalize.
- [36] presents a small signal method for generating a fixed transconductance. It is based on applying a small voltage, $I \times R_{offchip}$, to a differential pair and adjusting the bias current of the pair by negative feedback such that the incremental differential drain current is $I$. The bias current thus generated ensures that the $G_m$ of the differential pair is set to $1/R_{offchip}$. Since this does not depend on any assumption about the model of the device or its operating condition, it produces $G_m$s which track the off-chip conductance with very high accuracy. Nevertheless, this method also suffers from the requirement of an off-chip resistor.

This chapter presents two architectures which can generate constant transconductance with precision without the requirement of any external components and without depending on the square law model of the MOS transistor in saturation.

# 8.2 Proposed architecture

The basic premise of the idea is to generate an on-chip conductance with a pMOS transistor in linear region and use it to track the  $G_m$  of the transconductor through a negative feedback loop. The efficacy of the solution lies in the effectiveness of making the conductance of the pMOS transistor constant with respect to the variation of ambient conditions. The source to drain current,  $I_{SD}$ , through a long channel pMOS transistor biased in linear region is expressed as

$$I_{SD} = \mu C_{ox} \frac{W}{L} \left((V_{SG} - V_{th})V_{SD} - \frac{1}{2}V_{SD}^2\right) \tag{8.1}$$

where $\mu$ is the hole mobility, $C_{ox}$ the gate capacitance per unit area, $W/L$ the aspect ratio and $V_{th}$ the threshold voltage of the device. The subscripts D, G, S refer to the drain, gate and source of the transistor respectively. The incremental source to drain current when the source is excited, with the gate and drain held fixed, can be expressed as

$$\Delta I_{SD} = \mu C_{ox} \frac{W}{L} (V_{SG} - V_{th}) \Delta V_S \tag{8.2}$$

For operation in deep linear region i.e.,  $V_{SD} \ll V_{SG} - V_{th}$ 

$$\Delta I_{SD} \approx \frac{I_{SD}}{V_{SD}} \Delta V_S \tag{8.3}$$

(8.3) shows that the incremental resistance of a pMOS transistor in linear region with a small source to drain voltage, $V_{SD}$, is equal to $V_{SD}/I_{SD}$, where $I_{SD}$ is the quiescent source to drain current of the device. The ratio $V_{SD}/I_{SD}$ can be made constant given on-chip constant voltage and current references. Fortunately, there is extensive literature on making highly accurate process and temperature insensitive references [69], [70]. The availability of such references has been assumed in this work, and hence they have not been generated separately.
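The quality of the deep linear region approximation can be checked numerically. The sketch below evaluates the long channel expression (8.1), differentiates it with respect to the source voltage by a central difference, and compares the result with $I_{SD}/V_{SD}$. The device constant `K_P` (standing for $\mu C_{ox} W/L$), the threshold voltage and the bias voltages are hypothetical values chosen for illustration.

```python
K_P = 2e-3    # hypothetical mu*Cox*W/L (A/V^2)
V_TH = 0.35   # hypothetical threshold voltage magnitude (V)

def i_sd(v_s, v_g, v_d):
    """Source-to-drain current of a long-channel pMOS in linear region, per (8.1)."""
    v_sg = v_s - v_g
    v_sd = v_s - v_d
    return K_P * ((v_sg - V_TH) * v_sd - 0.5 * v_sd ** 2)

def incremental_conductance(v_s, v_g, v_d, h=1e-6):
    """d(I_SD)/d(V_S) with gate and drain fixed, by central difference."""
    return (i_sd(v_s + h, v_g, v_d) - i_sd(v_s - h, v_g, v_d)) / (2.0 * h)

def large_signal_conductance(v_s, v_g, v_d):
    """I_SD / V_SD, the quantity set by the current and voltage references."""
    return i_sd(v_s, v_g, v_d) / (v_s - v_d)

V_S, V_G = 1.2, 0.0
for v_sd in (0.005, 0.02, 0.1):
    g_inc = incremental_conductance(V_S, V_G, V_S - v_sd)
    g_ls = large_signal_conductance(V_S, V_G, V_S - v_sd)
    print(f"V_SD = {v_sd * 1e3:5.1f} mV: incremental {g_inc * 1e3:.4f} mS, "
          f"I_SD/V_SD {g_ls * 1e3:.4f} mS, difference {100 * (1 - g_ls / g_inc):.2f} %")
```

For this model the relative difference between the two conductances is $V_{SD}/(2(V_{SG}-V_{th}))$, so it shrinks in proportion to $V_{SD}$, which is why a small source to drain voltage is preferred.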

#### 8.2.1 Generation of accurate on-chip resistance



Figure 8.1: Block diagram demonstrating principle of operation for generating constant on-chip resistance.

The on-chip resistance, R, is created by a pMOS transistor ( $M_{linear}$ ) in deep linear region, through which a precise current  $I_{bias}$  is passed and a precise voltage  $\Delta V$  maintained between its source and drain terminals through negative feedback (Fig. 8.1). This ensures that the resistance between the source and drain terminals of this transistor is always  $\Delta V/I_{bias}$  irrespective of the changes in ambient conditions.



Figure 8.2: Block diagram demonstrating principle of  $G_m$  tracking 1/R.

#### 8.2.2 Generation of fixed transconductance

The basic principle of generating a fixed $G_m$ from an accurately controlled resistor is similar to [36], and is shown in Fig. 8.2. An incremental current $(G_m\Delta V)$ is generated by applying a small voltage $(\Delta V)$ to a transconductor, and is subsequently passed through a fixed resistance, $R$. The incremental voltage $(G_m\Delta VR)$ thus produced is compared to the applied voltage, $\Delta V$, and the value of the transconductance is adjusted by changing its bias current through negative feedback until $G_m\Delta VR = \Delta V$. This ensures $G_m = 1/R$. The bias current thus generated is mirrored to all the transconductors on-chip whose $G_m$s need to be stabilized. A replica of the pMOS transistor $M_{linear}$ in Fig. 8.1 is used to replicate the accurate resistance $R$ of Fig. 8.2.

# 8.2.3 Differential implementation



Figure 8.3: Generation of constant resistance for differential operation.

A differential implementation procedure for generating constant on-chip resistance is shown in Fig. 8.3. Two negative feedback loops hold the source voltage of $M_{lin\_p}$ and drain voltage of $M_{lin\_m}$ at $V_{cm} + \Delta V$, and $V_{cm} - \Delta V$ respectively, while a current $I_{bias}$ is passed through them. $V_{cm}$ is the common mode voltage of the differential signals that the transconductors are supposed to process. This ensures $V_{SD} = \Delta V$ for both the pMOS transistors and a source to drain resistance of $\Delta V/I_{bias}$. Transistors $M_{lin\_p}$ and $M_{lin\_m}$ operate in linear region and hence degrade the loop gain. A two stage opamp (Fig. 8.4) is used in Fig. 8.3 in order to alleviate the problem.

Figure 8.4: Schematic representation of the two stage opamp used in Fig. 8.3.

Replicas of  $M_{lin\_p}$  and  $M_{lin\_m}$ , (MR1 and MR2 respectively) are used to replicate constant resistances in the fully differential implementation of the constant  $G_m$  bias generation circuit (Fig. 8.5). The gate voltages,  $V_{g1}$  and  $V_{g2}$  which act as the control for generating fixed resistances in Fig. 8.3 are routed to the gates of MR1 and MR2 respectively. These transistors are sized large enough such that random on-chip mismatches invoke negligible difference in the conductances between  $M_{lin\_p}$  and  $M_{lin\_m}$  and their replicas.

M1, M2 in Fig. 8.5 form the input pair of the transconductor whose $G_m$ is to be fixed. M0 controls the bias current through the pair. M3, M4 form the pMOS active loads and are sized such that their output conductances are negligible compared to those of MR1, MR2. The circuit operation can be understood from the argument of negative feedback. If the $G_m$ of M1, M2 is too high, $V_{om}$ tends to drop and $V_{op}$ tends to rise, as a result of which the transconductors T2, T3 drive the gate of M7 higher, thus reducing the current through M11. As this current is mirrored through M0 into the differential pair, the increase of $G_m$ of M1, M2 is corrected.

Figure 8.5: Generation and routing of bias current for fixed transconductance.

A small differential DC voltage $2\Delta V$, which rides on the common mode voltage $V_{cm}$, is applied to the differential pair M1, M2. The incremental current, $G_m\Delta V$, through M1, M2 flows into MR1, MR2, producing incremental voltages $-G_m\Delta VR_{MR1}$ and $G_m\Delta VR_{MR2}$ at $V_{om}$ and $V_{op}$ respectively. $R_{MR1}$ and $R_{MR2}$ are the source to drain resistances of MR1 and MR2 respectively. The transconductors T2, T3, which are in the negative feedback loop, force $V_{op}$ and $V_{om}$ to $V_{cm}+\Delta V$ and $V_{cm}-\Delta V$ respectively. This ensures that the quiescent conditions of MR1, MR2 are identical to those of $M_{lin\_p}$, $M_{lin\_m}$ in Fig. 8.3 respectively, thus making their incremental resistances identical (i.e. $R_{MR1}=R_{MR2}=\Delta V/I_{bias}$). Also, since $G_{m|M1,M2}\Delta VR_{MR1,MR2}=\Delta V$, $G_{m|M1,M2}$ settles to $I_{bias}/\Delta V$.

The bias current that flows into the differential pair has all the dependencies needed to make $G_{m|M1,M2}$ insensitive to changes in ambient conditions. Hence, this current is mirrored through an accurate current mirror M12, M13 to the transconductors in the rest of the chip. T1 assists in keeping the drain voltages of M12, M13 identical, ensuring accurate mirroring without loss of head-room. The schematic representation of the transconductors T1, T2, T3 is shown in Fig. 8.6.



Figure 8.6: Schematic of the transconductors T1, T2, T3 used in Fig. 8.5.

# 8.3 Alternate compact implementation

The architecture presented thus far generates a constant on-chip conductance using negative feedback and subsequently uses the generated conductance to track the transconductance of a transconductor using another negative feedback loop. This requires three negative feedback loops (Fig. 8.3 and Fig. 8.2) with large error amplifiers, which consume a large area. This section presents an alternate, simpler realization of the fixed transconductance generation circuit that uses a single negative feedback loop and is thus more area efficient than the architecture of Section 8.2.3.



Figure 8.7: Principle demonstrating  $G_m$  locking to  $I_{ref}/\Delta V$ .

The principle of operation is shown in Fig. 8.7. $G_m$ is the transconductance which is to be fixed. A DC incremental voltage, $\Delta V$, at its input produces an output current $G_m\Delta V$, which is compared to a fixed current $I_{ref}$, and the value of the transconductance is adjusted by changing its bias current, $I_{gm}$, through negative feedback. The loop settles when $G_m\Delta V=I_{ref}$, thus making $G_m=I_{ref}/\Delta V$. Constant $I_{ref}$ and $\Delta V$ ensure a constant $G_m$.



Figure 8.8: Transconductance generation circuit to track  $I_{ref}/\Delta V$ .

The circuit implementation of the architecture is shown in Fig. 8.8. The transistors marked in black form the transconductor cell whose $G_m$ is to be fixed. M1, 2 form the nMOS input pair, M0 is the tail transistor, and M3, 4 are the pMOS active loads driven by *cmfb*, which is generated by the split transistor telescopic differential pair shown in Fig. 7.3. A small incremental DC voltage $2\Delta V$ riding on a common mode voltage $V_{cm}$ is applied to the differential pair M1, M2. An incremental current, $G_m \Delta V$, that flows into the drain of M1 (and out of M2) is compared to a constant reference, $I_{ref}$. As in Section 8.2.3, the circuit operation can be understood from the argument of negative feedback. If the transconductance of M1, M2 is too high, $V_{om}$ tends to drop and $V_{op}$ tends to rise, as a result of which transconductor T1 drives the gate of M8 higher, thus reducing the current through M5. Since this current is eventually mirrored through M0 to the differential pair, the increase in $G_m$ of M1, M2 is corrected. Negative feedback with high loop gain forces $G_m$ of M1, M2 to adjust such that $G_{m|M1,2}\Delta V = I_{ref}$, thus ensuring $G_{m|M1,2} = I_{ref}/\Delta V$. High loop gain also forces the input of the error amplifier T1 to be a virtual short. This ensures that the differential parasitic conductances of M1-4 do not carry any incremental current. The bias current through M5 has all the necessary information to ensure fixed transconductances for M1, M2. Hence, this current (or some multiple, $k$, of it) is mirrored through an accurate current mirror, M10, M11, and supplied as the reference current to all the transconductors on-chip.

Note that the circuit in Fig. 8.8 does not have any constraint on speed, as it operates in isolation from the other transconductors on-chip. Hence, the feedback loop can be made arbitrarily slow to achieve stability and high loop gain. Transconductors T1, T2 are single stage telescopic cascode opamps as shown in Fig. 8.6, which ensure high enough loop gain to minimize the steady state errors. Also note that the above argument is completely based on small signal analysis and does not rely on any transistor model. This makes the method universal.



Figure 8.9: Inclusion of chopping to eliminate offset of M1, 4.

Let us now analyze the effect of transistor mismatch on the circuit in Fig. 8.8. It can be appreciated that any threshold voltage mismatch $2\delta v$ between M1, M2 adds to the applied incremental input $2\Delta V$, and forces the loop to settle at $G_m(\Delta V + \delta v) = I_{ref}$. This can be alleviated by introducing chopping, which averages out the mismatches in the input pair and the active loads. This is shown in Fig. 8.9. The chopping switches are pMOS transistors (not shown in the figure), and $\phi$ and $\phi_b$ are complementary clocks running at $1 \mathrm{MHz}$. Note that, as in Chapter 7, this clock need not be accurate and can be generated on-chip using a ring oscillator. The capacitors $C_{LP}$ and $C_c$ are used to filter out the ripples in the loop.
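The effect of the offset, and of averaging it out, can be estimated with a first-order sketch. It assumes an idealised chopper with 50% duty cycle in which the offset adds to the input in one phase and subtracts from it in the other, and a loop filter that responds only to the average output current. The reference current and offset values are hypothetical.

```python
def gm_without_chopping(i_ref, delta_v, offset):
    """Loop settles at G_m * (delta_v + offset) = i_ref."""
    return i_ref / (delta_v + offset)

def gm_with_chopping(i_ref, delta_v, offset):
    """Idealised chopping: the loop sees the average of the two phases,
    G_m * (delta_v + offset) and G_m * (delta_v - offset)."""
    average_input = 0.5 * ((delta_v + offset) + (delta_v - offset))
    return i_ref / average_input

I_REF = 200e-6    # hypothetical reference current (A)
DELTA_V = 0.02    # applied incremental voltage (V)
target = I_REF / DELTA_V

for offset in (0.5e-3, 2e-3, 5e-3):  # hypothetical half-offsets (V)
    gm_raw = gm_without_chopping(I_REF, DELTA_V, offset)
    gm_chop = gm_with_chopping(I_REF, DELTA_V, offset)
    print(f"offset = {offset * 1e3:.1f} mV: error without chopping "
          f"{100 * (gm_raw / target - 1):+.2f} %, with chopping "
          f"{100 * (gm_chop / target - 1):+.2f} %")
```

With $\Delta V = 20$ mV, an uncorrected offset of 2 mV shifts the settled $G_m$ by roughly 9% in this model, which illustrates why the offset must be small compared to $\Delta V$ or be removed by chopping.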



Figure 8.10: Simulation schematic of the transconductor whose bias current is generated in Fig. 8.9.

## 8.4 Simulation results and discussion

The mirrored current, $I_{gm}$, in Fig. 8.9 was used to bias another transconductor on-chip (Fig. 8.10). The circuits in Fig. 8.9 and Fig. 8.10 were simulated in conjunction over a range of temperature and supply voltages in a standard 0.13 $\mu$m CMOS process, and the $G_m$ of M1a, M2a was measured. $V_{cm}$ was set to 800 mV. $\Delta V$ was set at 20 mV, and $I_{bias}$ at 230 $\mu$A, which makes the expected $G_m=11.65$ mS. The input pair M1, M2 of Fig. 8.9 was sized such that its overdrive is $\approx 160$ mV for a $G_m=11.65$ mS. This ensured that, with a $\Delta V$ of 20 mV, the input pair M1, 2 operates within its linear range.



Figure 8.11: Simulated variation of  $G_{m|M1a,M2a}$  of Fig. 8.10 with temperature.



Figure 8.12: Simulated variation of  $G_{m|M1a,M2a}$  of Fig. 8.10 with supply voltage.

Fig. 8.11 shows the variation of  $G_{m|M1a,M2a}$  over a temperature range from  $0^{\circ}C$  to  $70^{\circ}C$  across the process corners. The cumulative variation of  $G_m$  is less than 1% over this range. The supply voltage was varied from 1.1 V to 1.4 V, and the resulting variation of  $G_m$  is plotted in Fig. 8.12. Again, it deviated by less than 2% from its nominal value. However, as mentioned earlier, these plots have been obtained with the assumption of perfect current and voltage references. Single-trim constant voltage references, popularly known as bandgap references, have been reported to vary by less than 1% over a temperature range of  $0-70^{\circ}C$  [69], whereas [70] reports a variation of 0.25% for its constant current reference. With the use of these references, an on-chip constant  $G_m$  circuit can be realized without the necessity of any off-chip intervention.

# Chapter 9

# **Conclusion and Future Scope**

This thesis proposed a way of realizing large tunable delays over a wide bandwidth, as needed for wideband beamforming, using a variable order all-pass filter, and used the proposed architecture to demonstrate true-time expansion and compression of continuous-time, high frequency analog pulses.

Lumped element realizations of delay lines using all-pass filters have been demonstrated in the literature. However, the usefulness of an all-pass filter for generating large delays over a wide bandwidth has not been explored to the fullest. The primary reason is that almost all the reported works on all-pass filters resort to cascading unit delay cells to realize a large delay range. This approach leads to distortions in the magnitude and delay characteristics as the number of cascaded units increases, owing to the effect of the parasitic capacitances at the unit cell interfaces. This thesis discussed the limitations of this approach and introduced an architecture suitable for realizing high order all-pass filters that generate large delays without compromising on the delay range and magnitude flatness. This was possible by recognizing that a pulse launched into a transmission line terminated by an open or a short circuit returns to the input port after a round trip delay. A simple arrangement cancels the incident pulse at the input port and leaves only the reflected pulse, which is a delayed version of the incident pulse. Changing the length of the transmission line changes the delay. A lumped element approximation of this is a singly terminated LC filter with programmable order and a flat group delay characteristic. A  $G_m$ -C counterpart of the LC filter was used as the variable order ladder filter. Coarse tuning of the delay was realized by changing the filter's order while keeping the bandwidth constant, and fine tuning was implemented by changing the filter's bandwidth, utilizing the delay-bandwidth tradeoff. The proposed claims were validated with a test chip fabricated in a 0.13 µm CMOS process, which demonstrated a delay tuning range of 250 ps-1.7 ns over a bandwidth of 2 GHz, while maintaining a magnitude deviation of  $\pm 0.7$  dB and dissipating 112 mW-364 mW of power between its minimum and maximum delay settings.

The latter part of the thesis explored an area-efficient method of realizing expansion and compression of continuous-time, high frequency analog pulses. Continuous-time methods of expanding high frequency pulses are beneficial in that they slow down the signal without any loss of information, which allows subsequent digitization using a slower and more accurate ADC. Compressing a pulse, on the other hand, helps in generating high frequency pulses using slower hardware. Prevalent methods of pulse expansion/compression modulate these high frequency pulses onto even higher frequency carriers (often photonic) and then stretch the modulated envelope using a dispersive medium. Dispersive media take a large area and often force the solution off-chip. This thesis demonstrated pulse expansion/compression using the method proposed in [28]. This is based on storing an input pulse as state variables in a continuous-time filter whose delay exceeds the pulse duration, and, once the pulse is completely "inside" the filter, reducing or increasing its bandwidth. Using this method, this work demonstrated an IC implementation of both expansion and compression on the same chip. Circuit design challenges of IC implementations were discussed, and pulse expansion and compression by factors of  $1.8 \times$  and  $1.7 \times$  respectively were demonstrated in a 0.13 µm CMOS process. The prototype chip included a seventh order all-pass filter with bandwidth switchable between 870 MHz and 472 MHz, and circuitry to generate Gaussian and monopulse waveforms for testing.

This thesis also proposed a gain enhanced, high frequency transconductor with no internal nodes, based on automatic cancellation of its parasitic conductance across variations of process, voltage and temperature. Architectures to tune the transconductance while keeping it constant across variations of process, voltage and temperature were discussed. These transconductors, with their automatic tuning loops, were used in the variable and fixed order filters to realize the delay lines, and their efficacy was demonstrated.

# 9.1 Suggestions for future work

The building blocks of the delay lines used in this thesis were the unit  $G_m$  cells. They were sized such that their transconductances vary within  $\pm 2\%$  with random mismatch, which was essential to keep the distortion of the pulse shape due to group delay distortion within acceptable limits. This condition fixed the power consumption of the unit cell and thus of the active delay line as well. Calibration of random mismatches of these transconductors through bias currents can reduce the power consumption. Reducing the quiescent current would reduce the transconductor's  $G_m$ , which would increase the noise of the APF. However, as has been argued in the thesis, the noise contribution of the transconductors in the APF reduces further down the ladder. Calibration of these unit cells in the latter half of the ladder can significantly reduce the power consumption of the delay line without degrading its noise performance appreciably.

The bandwidth and the linearity of the active delay line demonstrated in this thesis were limited by the transistor parasitics and the linearity of the transconductors. An LC or transmission line based delay line using the proposed architecture can significantly increase the bandwidth and improve linearity. This improvement will occur at the expense of the chip area. However, the realizable delay of the proposed architecture corresponds to the round trip delay of a transmission line, as opposed to the existing architectures which use the one way delay of the input signal in a doubly terminated line. This can result in a  $2\times$  area improvement with respect to the existing trombone like architectures [7] for the same delay.

Full duplex communication has gained a lot of research interest in recent years in the quest to increase the bandwidth of wireless transceivers. The critical challenge in realizing an effective full duplex system is the cancellation of the large transmitted power that couples into the receiver path. With the advent of 5G systems which can support multiple antennas, beamforming systems with high linearity can be used to significantly suppress this interference without compromising the EVM of the received signal, especially in wireless standards based on dense constellations. The proposed architecture can also be used for broadband beamforming after downconversion at baseband for 5G systems, where the expected baseband bandwidth is in excess of 1 GHz [1]. Since baseband signal processing is preceded by many gain stages in a receiver, the effect of its noise becomes insignificant.

This thesis also demonstrated an IC implementation of expansion and compression of continuous-time analog signals. Using the same principles, realization of higher expansion/compression factors can be explored. Two of the primary impediments towards such a realization in this work were the power consumption of the delay lines, and the operating point mismatch between the two halves of the filters used for transfer function scaling. To get around the operating point mismatch, coupling capacitors were used. This increased the chip area and reduced the achievable bandwidth due to the parasitic bottom plate capacitances. Both of these issues can be addressed with the help of mismatch calibration of the transconductors used in the filter's ladders.

The bandwidth changing trigger signal used in this work for demonstrating pulse expansion/compression was provided externally. In any future implementation of this architecture the on-chip integration of the switching trigger can be explored. One of the possible ways of realizing the same is mentioned in the footnote of Chapter 6.

# Appendix A

# **Measurement of Noise Spectral Density**

This appendix explains the method used for characterizing the noise of the filters in Chapters 4 and 6. It is the same as the method used in [40], with a slight modification in the estimation of the loss of the test setup.

Figure A.1: Noise contributors in the test setup (a) with filter path enabled and (b) with direct path enabled.

Fig. A.1(a) shows the test setup used to characterize the noise of the filter chain and de-embed the effect of the test setup.  $S_R$ ,  $S_f$  and  $S_{int}$  are the input referred noise spectral densities of the on-chip termination resistor at the input of the filter, the filter under test, and the test interface between the output of the filter and the spectrum analyzer (SA), respectively.  $H_{AP}$  and  $H_{int}$  are the transfer functions of the filter and the test interface respectively. Assuming negligible current at the input of the APF, the observed noise spectral density at the spectrum analyzer is given by

$$S_{full} = (S_R |H_{AP}|^2 + S_f |H_{AP}|^2 + S_{int})|H_{int}|^2 \tag{A.1}$$

Fig. A.1(b) shows the same setup with the direct path in the chip enabled. The filter no longer contributes to the output noise and the observed output noise PSD is given by

$$S_d = (S_R + S_{int})|H_{int}|^2 \tag{A.2}$$

Subtracting (A.2) from (A.1) and assuming  $|H_{AP}| \approx 1$  and  $S_f \gg S_R$ 

$$S_{full} - S_d = S_f |H_{AP}|^2 |H_{int}|^2 \tag{A.3}$$

which implies

$$S_f = \frac{S_{full} - S_d}{|H_{AP}|^2|H_{int}|^2} \tag{A.4}$$
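For reference, the step from (A.1) and (A.2) to (A.3) can be written out explicitly. Subtracting the two expressions without any approximation gives

$$S_{full} - S_d = \left[S_R\left(|H_{AP}|^2 - 1\right) + S_f |H_{AP}|^2\right]|H_{int}|^2$$

The first term vanishes for  $|H_{AP}| = 1$ , and for small deviations from unity it remains negligible compared to the second term when  $S_f \gg S_R$ , which leaves (A.3).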

From (A.4) it is evident that to measure the filter's input referred noise, it is necessary to find  $|H_{AP}H_{int}|$ . Fig. A.2 shows the test setup used to find  $|H_{AP}H_{int}|$ , using a vector network analyzer (VNA).  $V_+$  denotes the voltage wave incident on the input port of the



Figure A.2: Test setup for evaluating  $|H_{AP}H_{int}|$ .

device under test.  $\Gamma V_+$  is the reflected wave at the input port.  $V_R$  is the filter's input voltage observed across the termination resistor R. From inspection of Fig. A.2

$$\frac{V_0}{V_R} = H_{AP}H_{int} \tag{A.5}$$

Assuming lossless power transfer through the balun between the input and the termination resistor R

$$\frac{V_{+}^{2}}{Z_{0}}(1-|\Gamma|^{2}) = \frac{V_{R}^{2}}{R} \tag{A.6}$$

Also the measured transmission scattering parameter  $S_{21}$  is given by

$$S_{21} = \frac{V_0}{V_+} \tag{A.7}$$

From (A.7), (A.6) and (A.5), and using  $\Gamma = S_{11}$ , where  $S_{11}$  is the scattering parameter representing reflection at the input port,

$$|H_{AP}H_{int}|^2 = \frac{|S_{21}|^2 Z_0}{R(1 - |S_{11}|^2)} \tag{A.8}$$

Using (A.8) in (A.4), the filter's noise spectral density can be represented as

$$S_f = \frac{(S_{full} - S_d)(1 - |S_{11}|^2)\frac{R}{Z_0}}{|S_{21}|^2} \tag{A.9}$$

Since all the quantities on the right hand side of (A.9) are explicitly measurable, (A.9) gives an accurate estimate of the filter's noise spectral density [40]. It is worth mentioning that, since  $|H_{AP}H_{int}|$  is accurately known, any signal swing at the output of the filter can be referred back to its input by dividing it by  $|H_{AP}H_{int}|$ . This is useful for estimating the peak-to-peak input signal swing by observing the output on a spectrum analyzer. This technique is used to measure the allowable swing at the input of the filter for which it introduces a certain amount of distortion. Any error in the measurement of  $|H_{AP}H_{int}|$  affects the de-embedded noise and distortion quantities equally, and hence does not affect the measured dynamic range.
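The de-embedding in (A.8) and the input referral described above can be sketched numerically as follows. This is a minimal illustration and not the measurement scripts used in this work: the function names and the example values are assumptions, and all scattering parameters are taken as linear magnitudes at a single frequency.

```python
import math


def path_gain_mag(s21_mag, s11_mag, r_term, z0=50.0):
    """|H_AP * H_int| from (A.8), assuming lossless power transfer to R."""
    if not 0.0 <= s11_mag < 1.0:
        raise ValueError("|S11| must lie in [0, 1)")
    return math.sqrt(s21_mag ** 2 * z0 / (r_term * (1.0 - s11_mag ** 2)))


def refer_to_input(v_out, gain_mag):
    """Refer an output swing back to the filter input by dividing by the path gain."""
    return v_out / gain_mag


# Hypothetical single-frequency readings (illustrative values only).
gain = path_gain_mag(s21_mag=0.5, s11_mag=0.0, r_term=50.0, z0=50.0)
v_in = refer_to_input(v_out=0.1, gain_mag=gain)
```

In a frequency sweep the same two functions would simply be applied point by point to the measured  $|S_{21}|$  and  $|S_{11}|$ .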



Figure A.3: Effect of path loss on power transfer.

In practice the lossless power transfer assumption is not strictly valid, especially at high frequencies. The effect of the power loss through the balun and the connectors is modeled as an attenuation of the incident power  $P_+$  by a factor  $\alpha$  (insertion loss) through the balun, as illustrated in Fig. A.3. The power transferred to the resistor R is now expressed as

$$\frac{V_R^2}{R} = P_+(1 - |\Gamma|^2) - P_{bal} \tag{A.10}$$

where  $P_+ = V_+^2/Z_0$  and  $P_{bal}$  is the power dissipated in the balun and the cables, and can be expressed as the sum of the power lost by the incident and the reflected wave flowing through it. For real  $\alpha$  (implying resistive loss)

$$P_{bal} = P_{+}(1 - \alpha) + \Gamma^{2} P_{+}(1/\alpha - 1) \tag{A.11}$$

Substituting  $P_+ = V_+^2/Z_0$ 

$$P_{bal} = \frac{V_{+}^{2}(1-\alpha)}{Z_{0}} + \frac{\Gamma^{2}V_{+}^{2}(1/\alpha - 1)}{Z_{0}} \tag{A.12}$$

Substituting (A.12) into (A.10) yields

$$\frac{\alpha V_{+}^{2}}{Z_{0}} (1 - (\Gamma/\alpha)^{2}) = \frac{V_{R}^{2}}{R} \tag{A.13}$$

and (A.9) is modified to [40]

$$S_f = \frac{(S_{full} - S_d)(1 - |S_{11}/\alpha|^2)\frac{\alpha R}{Z_0}}{|S_{21}|^2} \tag{A.14}$$
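A minimal numerical sketch of (A.14) is given below for illustration. The function name, argument names and example numbers are assumptions introduced here; the spectral densities are taken in consistent units (for example  $\mathrm{V^2/Hz}$ ) and the scattering parameters as linear magnitudes. Setting `alpha=1` recovers the lossless expression (A.9).

```python
import math


def filter_noise_psd(s_full, s_d, s11_mag, s21_mag, r_term, z0=50.0, alpha=1.0):
    """Input referred noise PSD of the filter from (A.14).

    s_full, s_d : PSDs observed with the filter path and the direct path enabled.
    alpha       : power insertion loss factor of the balun path (0 < alpha <= 1);
                  alpha = 1 reduces the expression to the lossless case (A.9).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if s21_mag <= 0.0:
        raise ValueError("|S21| must be positive")
    reflection_term = 1.0 - (s11_mag / alpha) ** 2
    return (s_full - s_d) * reflection_term * (alpha * r_term / z0) / s21_mag ** 2


# Hypothetical readings at one frequency (illustrative values only).
s_f_lossless = filter_noise_psd(3e-17, 1e-17, 0.0, 2.0, r_term=100.0)
s_f_lossy = filter_noise_psd(3e-17, 1e-17, 0.25, 2.0, r_term=100.0, alpha=0.5)
```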

# **A.1** Measurement of insertion loss ( $\alpha$ )

The work in [40] assumes a frequency independent balun loss. In this section a method similar to the power transfer analysis of the previous section is used to estimate the frequency dependent insertion loss due to the test setup at the input of the APF.

Note that  $\alpha$  in Fig. A.3 is defined as the power loss of the forward or the backward travelling wave through the balun. The power of the forward travelling wave can be conveniently measured using a VNA with matched output termination.

Figure A.4: Setup for measuring balun loss under.

To evaluate the insertion loss in the test setup due to the balun and the connectors from input to output, two identical baluns were connected back to back between the two ports of a VNA as shown in Fig. A.4. Their  $S_{21}$  was then measured to obtain the loss of the entire setup, consisting of two back-to-back connected baluns and the associated connectors. However, to estimate the actual power transfer ratio between the input and the output ports of the VNA, it is necessary to nullify the effect of reflection of power at the input port. The reflection was accounted for by measuring the  $S_{11}$  of the setup, and the insertion loss was estimated as

$$\alpha = \sqrt{\frac{|S_{21bal}|^2}{1 - |S_{11bal}|^2}} \tag{A.15}$$

where  $S_{21bal}$  and  $S_{11bal}$  are the forward transmission and input reflection scattering parameters of the test setup of Fig. A.4 respectively. The square root term in (A.15) accounts for the loss contributed by half the setup on the one side of the dashed line in Fig. A.4.
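The estimate in (A.15) can be sketched as below. This is an illustrative sketch only: the function names and the sweep values are hypothetical, and the inputs are assumed to be linear magnitudes of the back-to-back  $S_{21bal}$  and  $S_{11bal}$ .

```python
import math


def balun_insertion_loss(s21_bal_mag, s11_bal_mag):
    """Per-balun power transfer factor alpha from (A.15)."""
    if not 0.0 <= s11_bal_mag < 1.0:
        raise ValueError("|S11bal| must lie in [0, 1)")
    return math.sqrt(s21_bal_mag ** 2 / (1.0 - s11_bal_mag ** 2))


def alpha_to_loss_db(alpha):
    """Express the power factor alpha as a positive insertion loss in dB."""
    return -10.0 * math.log10(alpha)


# Hypothetical sweep: (frequency in GHz, |S21bal|, |S11bal|).
sweep = [(0.5, 0.70, 0.05), (1.0, 0.65, 0.10), (2.0, 0.55, 0.20)]
loss_db = [(f, alpha_to_loss_db(balun_insertion_loss(s21, s11)))
           for f, s21, s11 in sweep]
```

With a perfectly matched setup ( $S_{11bal} = 0$ ), a back-to-back  $|S_{21bal}|$  of  $10^{-0.2}$  corresponds to a loss of 2 dB per balun under this definition.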



Figure A.5: Insertion loss of the balun and the associated connectors.

Fig. A.5 shows the de-embedded insertion loss<sup>1</sup> of a single balun and the associated connectors of Fig. A.4.



 $<sup>^1</sup>$ The variable order APF that was published as "A 2 GHz bandwidth, 0.25-1.7 ns true-time-delay element using a variable-order all-pass filter architecture in 0.13  $\mu$ m CMOS," in *IEEE J. Solid-State Circuits*, vol. 52, no. 8, pp. 2180-2193, Aug. 2017, assumed a constant, frequency independent balun loss of 2 dB.

# Appendix B

# Multi-variable Numerical Optimization for Fitting Measured and Simulated Filter's Responses Using Space Mapping Technique

This appendix explains the method used to fit the measured group delay response of the APF in Chapter 4 with simulation by varying the transconductances of the eighteen transconductors of Fig. 2.8, as shown in Fig. 4.6. This is an application of the technique used for post layout optimization of AC responses of continuous-time filters in [42].

This optimization is based on developing a simplified model of the filter which can be simulated much faster than the SPICE based simulations, and using the model to simulate the filter multiple times to match the response to the desired waveform. In this case a state space model of the all-pass filter of Fig. 2.8 was developed in MATLAB©. The APF of Fig. 2.8 is reproduced in Fig. B.1 whose state space model can be expressed as

$$\dot{\mathbf{X}} = \mathbf{A}\mathbf{X} + \mathbf{B}\mathbf{U} \tag{B.1}$$

$$\mathbf{Y} = \mathbf{C}\mathbf{X} + \mathbf{D}\mathbf{U} \tag{B.2}$$

where A, B, C, D are the state-variable matrices, and X, Y, U are the filter's states (node voltages), outputs and inputs respectively.

Let a transconductor (other than the ones forming the summing taps) with its input connected to  $V_k$  and output to  $V_l$  in Fig. B.1 be marked as  $g_{kl}$  with an appropriate polarity. Also let  $g_{i1}$  represent the transconductor connected between the input  $V_i$  and the node  $V_1$ . Using these notations, the A, B, C, D and the X, Y, U matrices for the all-pass filter of Fig. B.1 can be represented as



Figure B.1: Schematic of the all-pass filter of Fig. 2.8 with mismatches in the transconductors (other than the summing taps).

$$\mathbf{A} = \begin{bmatrix} -\frac{g_{11}}{C_1} & -\frac{g_{21}}{C_1} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \frac{g_{12}}{C_2} & 0 & -\frac{g_{32}}{C_2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{g_{23}}{C_3} & 0 & -\frac{g_{43}}{C_3} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{g_{34}}{C_4} & 0 & -\frac{g_{54}}{C_4} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{g_{45}}{C_5} & 0 & -\frac{g_{65}}{C_5} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{g_{56}}{C_6} & 0 & -\frac{g_{76}}{C_6} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{g_{67}}{C_7} & 0 & -\frac{g_{87}}{C_7} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{g_{78}}{C_8} & 0 & -\frac{g_{98}}{C_8} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{g_{89}}{C_9} & 0 \end{bmatrix} \tag{B.3}$$

$$\mathbf{B} = \begin{bmatrix} \frac{g_{i1}}{C_1} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}^T \tag{B.4}$$

$$\mathbf{C} = \begin{bmatrix} 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \tag{B.5}$$

$$\mathbf{D} = \begin{bmatrix} -1 \end{bmatrix} \tag{B.6}$$

$$\mathbf{X} = \begin{bmatrix} V_1 & V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 & V_9 \end{bmatrix}^T \tag{B.7}$$

$$\mathbf{Y} = \begin{bmatrix} V_{AP} \end{bmatrix} \tag{B.8}$$

$$\mathbf{U} = \begin{bmatrix} V_i \end{bmatrix} \tag{B.9}$$

The inbuilt MATLAB© function "ss2tf" gives the filter's transfer function from the A, B, C, D matrices.
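The state-space description above can be sketched in a few lines of plain Python. This is a hedged illustration, not the MATLAB code used in this work. Instead of extracting polynomial coefficients as "ss2tf" does, it evaluates  $\mathbf{C}(j\omega\mathbf{I}-\mathbf{A})^{-1}\mathbf{B}+\mathbf{D}$  directly at each frequency. All function and argument names are assumptions: `g_fwd[k]` stands for  $g_{k+1,k+2}$  and `g_rev[k]` for  $g_{k+2,k+1}$  (with `k` counted from zero), and the numerical values are normalized and hypothetical.

```python
import cmath


def ladder_state_space(g_self, g_in, g_fwd, g_rev, caps):
    """Build A, B, C, D in the form of (B.3)-(B.6) for an n-th order ladder."""
    n = len(caps)
    A = [[0.0] * n for _ in range(n)]
    A[0][0] = -g_self / caps[0]                  # termination g11 at node V1
    for k in range(n - 1):
        A[k][k + 1] = -g_rev[k] / caps[k]        # -g_{k+2,k+1} / C_{k+1}
        A[k + 1][k] = g_fwd[k] / caps[k + 1]     # +g_{k+1,k+2} / C_{k+2}
    B = [g_in / caps[0]] + [0.0] * (n - 1)
    C = [2.0] + [0.0] * (n - 1)
    return A, B, C, -1.0


def solve(M, rhs):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def freq_response(A, B, C, D, w):
    """H(jw) = C (jw I - A)^-1 B + D."""
    n = len(B)
    M = [[(1j * w if r == c else 0.0) - A[r][c] for c in range(n)] for r in range(n)]
    X = solve(M, [complex(b) for b in B])
    return sum(c * x for c, x in zip(C, X)) + D


def group_delay(A, B, C, D, w, dw=1e-4):
    """Group delay -d(phase)/dw from a central difference of the phase."""
    h_lo = freq_response(A, B, C, D, w - dw)
    h_hi = freq_response(A, B, C, D, w + dw)
    return -cmath.phase(h_hi / h_lo) / (2.0 * dw)


# Normalized 9th-order example with a few percent of hypothetical mismatch.
g_fwd = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00]
g_rev = [0.99, 1.00, 1.02, 0.98, 1.01, 1.00, 1.02, 0.98]
A, B, C, D = ladder_state_space(1.0, 1.0, g_fwd, g_rev, [1.0] * 9)
delays = [group_delay(A, B, C, D, w) for w in (0.1, 0.3, 0.5)]
```

With ideal transconductors and  $g_{i1} = g_{11}$ , the magnitude of the response stays at unity even with mismatch in the ladder cells, so the mismatch shows up only in the group delay.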

To match the measured group delay response of Fig. 4.6 with simulations, the measured delay response was imported into MATLAB©. All the transconductances in (B.3) and (B.4) were assigned nominal initial values, and the capacitors were assigned their ideal values (since capacitances depend mostly on lateral dimensions, and not on doping and mobility, their gradient across the die was neglected in comparison to that of the transconductors). The optimization routine "fminsearch" in MATLAB© was run to minimize the mean squared error, defined as the integral of the squared difference between the simulated and measured group delay over a certain bandwidth, while varying the transconductance values  $g_{kl}$ . The optimization routine runs in less than 2 minutes on a 3.4 GHz, 8 core processor. The transconductance values extracted for different orders are consistent; for example, the transconductance values extracted for the seventh order case are the same as the transconductance values in the first seven stages when extracted for the ninth order case. This confirms that the variations are mainly in the transconductance values and not in the capacitance values. The extracted transconductances represent the  $G_m$ s of the transconductors that provide the measured group delay response, and these are shown in Fig. 4.5.
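In symbols, the error being minimized can be written as below. The notation is introduced here only for illustration:  $\mathbf{g}$  collects the transconductances  $g_{kl}$  of (B.3) and (B.4),  $\tau_{sim}$  is the group delay computed from the state space model,  $\tau_{meas}$  is the measured group delay, and  $[\omega_1, \omega_2]$  is the band over which the fit is performed.

$$E(\mathbf{g}) = \int_{\omega_1}^{\omega_2} \left[\tau_{sim}(\omega;\mathbf{g}) - \tau_{meas}(\omega)\right]^2 d\omega$$

Since the measured response is available only at discrete frequency points, the integral would in practice be approximated by a sum over those points.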

## **Bibliography**

- [1] A. Gupta and R. K. Jha, "A Survey of 5G Network: Architecture and Emerging Technologies," *IEEE Access*, vol. 3, pp. 1206-1232, 2015.
- [2] X. Guan, H. Hashemi, A. Komijani and A. Hajimiri, "Multiple phase generation and distribution for a fully-integrated 24-GHz phased-array receiver in silicon," *Proc. 2004 IEEE RFIC*, pp. 229-232, 2004.
- [3] H. Hashemi, X. Guan, A. Komijani and A. Hajimiri, "A 24-GHz SiGe phased-array receiver-LO phase-shifting approach," *IEEE Trans. Microw. Theory and Tech.*, vol. 53, no. 2, pp. 614-626, Feb. 2005.
- [4] S. K. Garakoui, E. A. M. Klumperink, B. Nauta and F. E. V. Vliet, "Phased-array antenna beam squinting related to frequency dependency of delay circuits," *Proc. 2011 IEEE EuMC*, pp. 1304-1307, 2011.
- [5] N. Paulino, J. Goes and A. S. Garcao, "UWB radar systems," in *Low Power UWB CMOS Radar Sensors*, Caparica, Portugal, Springer, 2008. pp. 53.
- [6] W. Qing, Z. Kuangyu, L. Qiao and X. Huagang, "Design and implementation of sub-GHz transmitter for ultra-wideband through-wall radar," in *Proc. 2010 IEEE ICU*, pp. 1-4, Nov. 2010.
- [7] J. Roderick, H. Krishnaswamy, K. Newton and H. Hashemi, "Silicon-based ultrawideband beam-forming," *IEEE J. Solid-State Circuits*, vol. 41, pp. 1726-1739, Jul. 2006.
- [8] H. Wu, J. A. Tierno, P. Pepeljugoski, J. Schaub, S. Gowda, J. A. Kash and A. Hajimiri, "Integrated transversal equalizers in high-speed fiber-optic systems," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2131-2137, Dec. 2003.
- [9] V. Szortyka, K. Raczkowski, M. Kuijk and P. Wambacq, "A 42 mW wideband baseband receiver section with beamforming functionality for 60 GHz applications in 40 nm low-power CMOS," *Proc. 2012 IEEE RFIC*, pp. 261-264, Jun. 2012.
- [10] H. Shibata, V. Kozlov, Z. Ji, A. Ganesan, H. Zhu, D. Paterson, J. Zhao, S. Patil and S. Pavan, "A 9-GS/s 1.125-GHz BW Oversampling Continuous-Time Pipeline ADC Achieving –164 dBFS/Hz NSD," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3219-3234, Dec. 2017.
- [11] F. Hu and K. Mouthaan, "A 1-20 GHz 400 ps true-time-delay with small delay error in 0.13 μm CMOS for broadband phased array antennas," *Proc. 2015 IEEE IMS*, pp. 1-3, May 2015.
- [12] R. Schaumann and M. E. V. Valkenburg, "LC ladder filters," in *Design of Analog Filters*, New York, Oxford University Press, 2001, pp. 524-528.

- [13] P. Horowitz and W. Hill, "Unity-gain phase splitter," in *The Art of Electronics*, 2nd ed. New York, NY, USA: Cambridge Univ. Press, 1999, ch. 2, sec. 8, pp. 77-78.
- [14] S. K. Garakoui, E. A. M. Klumperink, B. Nauta and F. E. V. Vliet, "Compact cascadable gm-C all-pass true-time-delay cell with reduced delay variation over frequency," *IEEE J. Solid-State Circuits*, vol. 50, pp. 693-703, Mar. 2015.
- [15] Y. W. Chang, Z. C. Yan and C. N. Kuo, "Wideband time-delay circuit," *Proc. 2011 EuMIC*, pp. 1391-1394, May 2011.
- [16] E. Hayahara, "Synthesis of an active all-pass network," *IEEE Trans. Circuits Syst.*, vol. 22, no. 5, pp. 404-407, May 1975.
- [17] T. S Chu, J. Roderick and H. Hashemi, "An integrated ultra-wideband timed array receiver in 0.13 μm CMOS using a path-sharing true-time-delay architecture," *IEEE J. Solid-State Circuits*, vol. 42, pp. 2834-2850, Dec. 2007.
- [18] S. Moallemi, R. Welker and J. Kitchen, "Wide band programmable true time delay block for phased array antenna applications," *Proc. 2016 IEEE DCAS*, pp. 1-4, Oct. 2016.
- [19] S. Park and S. Jeon, "A 15-40 GHz CMOS True-Time Delay Circuit for UWB Multi-Antenna Systems," *IEEE Microwave and Wireless Components Letters*, vol. 23, no. 3, pp. 149-151, Mar. 2013.
- [20] A. Ulusoy, B. Schleicher and H. Schumacher, "A Tunable Differential All-Pass Filter for UWB True Time Delay and Phase Shift Applications," *IEEE Microwave and Wireless Components Letters*, vol. 21, no. 9, pp. 462-464, Sep. 2011.
- [21] Q. Ma, D. Leenaerts, and R. Mahmoudi, "A 10-50 GHz True-Time-Delay phase shifter with max 3.9% delay variation," *Proc. 2014 IEEE RFIC*, pp. 83-86, 2014.
- [22] Q. Ma, D. Leenaerts, and R. Mahmoudi, "A 12 ps true-time-delay phase shifter with 6.6% delay variation at 20-40GHz," *Proc. 2013 IEEE RFIC*, pp. 61-64, Jun. 2013.
- [23] P. Ahmadi, B. Maundy, A. S. Elwakil, L. Belostotski and A. Madanayake, "A New Second-Order All-Pass Filter in 130-nm CMOS," *IEEE Trans. Circuits Syst. II: Express Briefs*, vol. 63, no. 3, pp. 249-253, Mar. 2016.
- [24] Y. W. Chang, T. C. Yan and C. N. Kuo, "Wideband time-delay circuit," *Proc. IEEE EuMIC*, pp. 454-457, Oct. 2011.
- [25] Y. Gao, Y. Zheng, S. Diao, Y. Zhu and C. H. Heng, "An integrated beamformer for IR-UWB receiver in 0.18 μm CMOS," *Proc. 2011 IEEE ISCAS*, pp. 1548-1551, May 2011.
- [26] E. Adabi and A. M. Niknejad, "Broadband variable passive delay elements based on an inductance multiplication technique," *Proc. 2008 IEEE RFIC*, pp. 445-448, Apr. 2008.
- [27] N. Rajesh and S. Pavan, "Design of lumped-component programmable delay elements for ultra-wideband beamforming," *IEEE J. Solid-State Circuits*, vol. 49, pp. 1800-1814, Aug. 2014.

- [28] N. Krishnapura, "Electronic time stretching for fast digitization," *Proc. 2011 IEEE ISCAS*, pp. 1391-1394, May 2011.
- [29] B. Nauta, "A CMOS transconductance-C filter technique for very high frequencies," *IEEE J. Solid-State Circuits*, vol. 27, pp. 142-153, Feb. 1992.
- [30] T. Laxminidhi, V. Prasadu and S. Pavan, "Widely programmable high frequency active-RC filters in CMOS technology," *IEEE Trans. Circuits Syst. I: Reg. Papers*, vol. 56, no. 2, pp. 327-336, Feb. 2009.
- [31] A. B. Williams and F. J. Taylor, "Introduction to modern network theory," and "Selecting the response characteristic," in *Electronic Filter Design Handbook*, 4th ed. New York, The McGraw-Hill Companies, 2006, pp. 5, 64-70.
- [32] N. Rajesh and S. Pavan, "Programmable analog pulse shaping for ultra-wideband applications," *Proc. 2015 IEEE ISCAS*, pp. 461-464, May 2015.
- [33] T. Laxminidhi and S. Pavan, "A 70–500 MHz programmable CMOS filter compensated for MOS nonquasistatic effects," *Proc. 2006 IEEE ESSCIRC*, pp. 328-331, Sep. 2006.
- [34] A. S. Porret, T. Melly, C. C. Enz, and E. A. Vittoz, "Design of high-Q varactors for low-power wireless applications using a standard CMOS process," *IEEE J. Solid-State Circuits*, vol. 35, pp. 337-345, Aug. 2002.
- [35] K. Sawada, G. V. Plas, Y. Miyamori, T. Oishi, C. Vladimir, A. Mercha, V. Diederik and H. Ammo, "Characterization of capacitance mismatch using simple difference charge-based capacitance measurement (DCBCM) test structure," *Proc. 2013 IEEE ICMTS*, pp. 49-52, 2013.
- [36] S. Pavan, "A fixed transconductance bias circuit for CMOS analog integrated circuits," *Proc. 2011 IEEE ISCAS*, pp. 661-664, May 2004.
- [37] Y. Tsividis and B. Shi, "Cancellation of distortion of any order in integrated active RC filters," *Electronics Letters*, vol. 21, no. 4, pp. 132-134, Feb. 1985.
- [38] G. W. Roberts and A. S. Sedra "Adjoint networks revisited," *Proc. 1990 IEEE ISCAS*, vol. 1, pp. 540-544, May 1990.
- [39] Y. Palaskas and Y. Tsividis, "Dynamic range optimization of weakly nonlinear, fully balanced, Gm-C filters with power dissipation constraints," *IEEE Trans. Circuits Syst. II: Analog and Digital Signal Processing*, vol. 50, no. 10, pp. 714-727, Oct. 2003.
- [40] S. Pavan and T. Laxminidhi, "Accurate characterization of integrated continuous-time filters," *IEEE J. Solid-State Circuits*, vol. 42, no. 8, pp. 1758-1766, Aug. 2008.
- [41] M. J. M. Pelgrom, A. C. Duinmaijer, and A.P. G. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid-State Circuits*, vol. 24, no. 5, pp. 1433-1440, Aug. 1989.
- [42] T. Laxminidhi and S. Pavan, "Efficient Design Centering of High-Frequency Integrated Continuous-Time Filters," *IEEE Trans. Circuits Syst. I: Reg. Papers*, vol. 54, no. 7, pp. 1481-1488, Jul. 2007.

- [43] N. Krishnapura, A. Agrawal and S. Singh, "A high-IIP3 third-order elliptic filter with current-efficient feedforward-compensated opamps," *IEEE Trans. Circuits Syst. II: Express Briefs*, vol. 58, pp. 205-209, Apr. 2011.
- [44] H. Han, B. G. Yu and T. W. Kim, "A 1.9 mm-Precision 20 GS/s real-time sampling receiver using time-extension method for indoor localization," *Proc. 2015 IEEE ISSCC*, Feb. 2015.
- [45] B. Xiang, A. Kopa, Z. Fu and A. B. Apsel, "Theoretical analysis and practical considerations for the integrated time-stretching system using dispersive delay line," *IEEE Trans. Microw. Theory Tech.*, vol. 60, no. 11, pp. 3449-3457, Nov. 2012.
- [46] P. T. W. Wong, W. W. L. Lai and M. Sato, "Time-frequency spectral analysis of step frequency continuous wave and impulse ground penetrating radar," *Proc. 2016 IEEE Int. Conf. on GPR*, Jun. 2016.
- [47] Z. H. Xia, Y. Hao, S. Y. Wu, G. Y. Fang, G. J. He, J. X. Jiang, H. Guo, H. Q. Ma, Y. J. Zhan, X. J. Liu, H. J. Yin and H. J. Yin, "Design of high scanning rate for array impulse ground penetrating radar," *Proc. 2016 IEEE Int. Conf. on GPR*, Jun. 2016.
- [48] H. Y. Ramirez, "The antares neutrino detector instrumentation," *Journal of Instrumentation*, vol. 7, no. 1, pp. C01022, Jan. 2012.
- [49] M. Mutterer, W. H. Trzaska, G. P. Tyuri, A.V. Evsenin, J. von Kalben, J. Kemmer, M. Kapusta, V.G. Lyapin and S.V. Khlebnikov, "Breakthrough in pulse-shape based particle identification with silicon detectors," *IEEE Trans. Nuclear Science*, vol. 47, no. 3, pp. 756-759, Jun. 2000.
- [50] G. Haller and B. Wooley, "A 700 MHz switched-capacitor analog waveform sampling circuit," *IEEE J. Solid-State Circuits*, vol. 29, no. 4, pp. 500-508, Apr. 1994.
- [51] S. Gupta and B. Jalali, "Time stretch enhanced recording oscilloscope," *Applied Physics Letters*, vol. 94, issue 4, Jan. 2009.
- [52] Y. Han and B. Jalali, "Continuous-time time-stretched analog-to-digital converter array implemented using virtual time gating," *IEEE Trans. Circuits Syst. I: Reg. Papers*, vol. 52, pp. 1502-1507, Aug. 2005.
- [53] J. D. Schwartz, J. Azana and D. V. Plant, "A fully electronic system for the time magnification of ultra-wideband signals," *IEEE Trans. Microw. Theory Tech.*, vol. 55, no. 2, pp. 327-334, Feb. 2007.
- [54] J. D. Schwartz, J. Azana and D. V. Plant, "An electronic temporal imaging system for compression and reversal of arbitrary UWB waveforms," *Proc. 2008 IEEE RWS*, Jan. 2008.
- [55] M. Lee and A. Abidi, "A 9b, 1.25 ps resolution coarse-fine time-to-digital converter in 90 nm CMOS that amplifies a time residue," *2007 IEEE Symp. VLSI Circuits*, pp. 168-169, Jun. 2007.
- [56] M. S. Harb and G. W. Roberts, "Embedded measurement of GHz digital signals with time amplification in CMOS," *IEEE Trans. Circuits Syst. I: Reg. Papers*, vol. 55, no. 7, pp. 1884-1896, Aug. 2008.

- [57] T. Nakura, S. Mandai, M. Ikeda and K. Asada, "Time difference amplifier using closed-loop gain control," *2009 IEEE Symp. VLSI Circuits*, pp. 208-209, Jun. 2009.
- [58] B. P. Lathi, in *Signal Processing and Linear Systems*, Oxford University Press, 2003.
- [59] A. I. Zverev, "Filter characteristics in the frequency domain," in *Handbook of Filter Synthesis*, New York, John Wiley and Sons, Inc, 1967, pp. 71-75.
- [60] N. Krishnapura, S. Pavan, C. Mathiazhagan and B. Ramamurthi, "A Baseband Pulse Shaping Method for Gaussian Minimum Shift Keying," *Proc. 1998 IEEE ISCAS*, vol. 1, pp. 249-252, Jun. 1998.
- [61] S. Pavan and Y. Tsividis, "Application of scaling to other filter techniques," in *High Frequency Continuous Time Filters In Digital CMOS Process*, ed. Massachusetts, Kluwer Academic Publisher, 2000, pp. 149-151.
- [62] Y. Jie and R. L. Geiger, "A negative conductance voltage gain enhancement technique for low voltage high speed CMOS op amp design," *Proc. 2000 IEEE MWS-CAS*, vol. 1, pp. 502-505, Aug. 2000.
- [63] J. Chen and B. Shi, "Novel constant transconductance references and the comparisons with the traditional approach," *Proc. 2003 IEEE SSMSD*, pp. 104-107, Feb. 2003.
- [64] E. Sackinger and W. Guggenbuhl, "A versatile building block: the CMOS differential difference amplifier," *IEEE J. Solid-State Circuits*, vol. 22, pp. 287-294, Apr. 1987.
- [65] M. Kumngern and K. Dejhan, "Electronically tunable current-mode quadrature oscillator using current differencing transconductance amplifiers," *Proc. 2009 IEEE TENCON*, pp. 1-4, Jan. 2009.
- [66] J. Steininger, "Understanding wide-band MOS transistors," *IEEE Circuits and Devices Magazine*, vol. 6, no. 3, pp. 21-26, May 1990.
- [67] R. Zele and D. Allstot, "Low power CMOS continuous time filters," *IEEE J. Solid-State Circuits*, vol. 31, no. 2, pp. 157-168, Feb. 1996.
- [68] N. Talebbeydokhti, P. K. Hanumolu, P. Kurahashi and U.-K. Moon, "Constant transconductance bias circuit with an on-chip resistor," *Proc. 2006 IEEE ISCAS*, pp. 2860-2863, May 2006.
- [69] G. Guang, Z. Cheng, G. Hoogzaad and K. A. A. Makinwa, "A Single-Trim CMOS Bandgap Reference With a $3\sigma$ Inaccuracy of $\pm 0.15\%$ From $-40^{\circ}C$ to $125^{\circ}C$," *IEEE J. Solid-State Circuits*, vol. 46, no. 11, pp. 2693-2701, Nov. 2011.
- [70] J. Lee and S. H. Cho, "A 1.4-μW 24.9-ppm/°C Current Reference With Process-Insensitive Temperature Compensation in 0.18-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, pp. 2527-2533, Oct. 2012.

# List of publications based on the thesis

- I. Mondal and N. Krishnapura, "A 2 GHz bandwidth, 0.25-1.7 ns true-time-delay element using a variable-order all-pass filter architecture in 0.13 μm CMOS," *IEEE J. Solid-State Circuits*, vol. 52, no. 8, pp. 2180-2193, Aug. 2017.
- I. Mondal and N. Krishnapura, "Expansion and compression of analog pulses using bandwidth switching of continuous-time filters," *IEEE Trans. Circuits Syst. I: Reg. Papers*, DOI 10.1109/TCSI.2018.2799080.
- I. Mondal and N. Krishnapura, "Gain enhanced high frequency OTA with on-chip tuned negative conductance load," *Proc. 2015 IEEE ISCAS*, pp. 2085-2088, May 2015.
- I. Mondal and N. Krishnapura, "Accurate constant transconductance generation without off-chip components," *Proc. 2015 IEEE Int. Conf. on VLSI Design*, pp. 249-253, Jan. 2015.