

# A 0.2-2 GHz Time-Interleaved Multi-Stage Switched-Capacitor Delay Element Achieving 448.6 ns Delay and 330 ns/mm<sup>2</sup> Area Efficiency

Travis Forbes<sup>1</sup>, Benjamin Magstadt, Jesse Moody, Andrew Suchanek, Spencer Nelson

Sandia National Laboratories, USA

<sup>1</sup>tmforbe@sandia.gov

**Abstract**—A 0.2-2 GHz digitally programmable RF delay element based on a time-interleaved multi-stage switched-capacitor (TIMS-SC) approach is presented. The proposed approach enables hundreds of ns of broadband RF delay by employing sample time expansion in multiple stages of switched-capacitor storage elements. The delay element was implemented in a 45 nm SOI CMOS process and achieves a 2.55-448.6 ns programmable delay range with <0.12 % delay variation across 1.8 GHz of bandwidth at maximum delay, 2.42 ns programmable delay steps, and 330 ns/mm<sup>2</sup> area efficiency. The device achieves 24 dB gain, 7.1 dB noise figure, and consumes 80 mW from a 1 V supply with an active area of 1.36 mm<sup>2</sup>.

**Keywords**—programmable delay element, true-time delay, broadband, switched-capacitor, low-noise amplifier.

## I. INTRODUCTION

Achieving large delays at GHz frequencies with significant bandwidth and programmability is key for applications including radar testers (>400 ns), digital RF memory (DRFM) (>400 ns), and emerging full-duplex communications (>100 ns). While programmable RF delay <2 ns has been deployed for broadband phased array applications [1]–[3] and <8 ns for full-duplex communications [4], achieving >8 ns RF delay has been limited to non-programmable surface acoustic wave (SAW) devices and long coax cables. While SAW devices can achieve >100 ns RF delay, the devices have small bandwidth, significant delay variation, and no programmability (e.g. [5] achieves 172 ns delay with 40 ns variation and 68 MHz bandwidth). Coax cables enable >400 ns delay across a large bandwidth with small delay variation but with significant size, achieving only 5 ns/m of length, and no programmability.

Radar testers and DRFM devices simulate radar returns by capturing the transmit (TX) waveform, applying RF delay related to distance (1 ns/ft), and re-transmitting the delayed waveform. With radar waveforms using >500 MHz bandwidths for good range resolution, radar testers use coaxial delays but are limited to a small set of delay values because of physical size. DRFM devices employ either digital implementations, which require watts of power consumption, or arrays of SAW devices, which capture only a portion of the radar waveform bandwidth and have a small set of delay values. For full-duplex communications, the TX signal arriving at the receiver (RX) input, from both TX-to-RX leakage and radar-like TX reflections in the environment, must be cancelled (self-interference cancellation (SIC)) to prevent saturation of the RX front-end. While SIC FIR cancellers have achieved RF delays up to 8 ns (4 ft reflection) [4], reflection from a car can saturate the front-end at distances >>15 ft (>>30 ns) [6].



Fig. 1. Delay element techniques including (a) gm-C all-pass filter, (b) switched-capacitor, and (c) the proposed time-interleaved multi-stage switched-capacitor (TIMS-SC) approach.

Previously published RF delay element approaches include delay line (e.g. [3]), gm-C filter [2], and switched-capacitor [7] techniques (Fig. 1). Gm-C filter-based delays achieved >20x increase in area efficiency [2] compared to delay lines but are limited to <2 ns delays [1]. Switched-capacitor delays achieved another >5x increase in area efficiency, but are limited to <8 ns RF delay [4], which remains an order of magnitude lower than required for implementation in radar testers and DRFM devices, as well as deployable full-duplex systems.

To solve these challenges, this paper introduces a time-interleaved multi-stage switched-capacitor (TIMS-SC) delay element which employs time expansion in the second delay stage to reduce sample leakage and overcome the <8 ns delay limit of prior art, while maintaining low capacitive input loading and simple, timing-skew tolerant clock generation. At sample rate  $F_s = 3.3$  GHz, the delay element achieves 448.6 ns maximum delay across a bandwidth of 0.2-2 GHz, 330 ns/mm<sup>2</sup> area efficiency, and covers a 175.9x delay range, representing increases of 58x in delay and 9x in area efficiency compared to prior art. The approach enables significant miniaturization and increase in programmable delay in radar testers and DRFM devices, breaks the bandwidth/power consumption tradeoff in DRFM devices, and greatly increases SIC FIR delay tap coverage towards deployable full-duplex communications.



## II. DELAY ELEMENT OPERATION

Fig. 2 shows the functional and timing diagram of the proposed TIMS-SC approach, shown single-ended for clarity while the implementation is differential (Fig. 3). To achieve almost 450 ns RF delay at $F_s = 3.3$ GHz, >1480 RF samples must be stored with low leakage. Achieving this in a direct switched-capacitor implementation is impractical because of significant capacitive loading at the input, large signal leakage through sampling switches which must support settling time at the RF sample rate, and complex multi-phase clock generation at the full sample rate. To overcome these challenges, the proposed delay element is constructed in three stages: (Stage 1) an 8-phase switched-capacitor network sampling at full sample rate $F_s$, (Stage 2) a 186-capacitor storage element operating at $F_s/8$ following each stage 1 sampler and enabling long sample storage times, and (Stage 3) an 8-phase recombining stage operating at $F_s$. In this implementation, a buffer is inserted between each delay stage to prevent gain loss from charge sharing, but a passive implementation is also possible as a tradeoff between power consumption and gain loss.



Fig. 3. Block diagram of the delay element RFIC.



Fig. 4. Circuit diagrams for the (a) broadband LNA, (b) input buffer, (c) stage 1 delay buffer, and (d) output buffer.

As shown in Fig. 2, the RF input is sampled sequentially onto 8 capacitors using 8-phase non-overlapping clocks $P_0$-$P_7$. Each stage 1 sample is transferred to 1 of 186 storage capacitors in the associated stage 2 sub-block, where there are $8 \times 186$ total capacitors in stage 2. While stage 1 settling time is $1/F_s$, settling time expansion is created in stage 2 by allowing sample transfer from stage 1 to 2 to continue during the stage 1 hold time ($PI_{xy}$, where $x$ is stage 1 path and $y$ is stage 2 capacitor). With the expanded settling time, the sampler bandwidth required in stage 2 is greatly reduced, allowing much smaller sampling switches in stage 2 ($8\times$ in this work), enabling a large reduction in OFF state sample leakage and savings in clock power consumption. The leakage reduction enables an equal increase in maximum achievable hold time, key to achieving nearly 450 ns of RF delay. To reduce timing skew sensitivity, the stage 2 input clock $PI_{xy}$ transitions prior to the stage 1 sample clock $P_x$ (e.g. $PI_{10}$ before $P_1$) such that the stage 2 input is static during stage 2 clock transitions to prevent signal distortion. After the programmed delay time, a stage 2 output clock $PO_{xy}$ initiates the transfer of the sample to the input of the associated stage 3 buffer, again time expanded. The stage 3 buffers output the delayed RF signal employing the same 8-phase clock timing as the stage 1 delay ($P_x$). Timing skew is again mitigated by transitioning the stage 2 output clock $PO_{xy}$ after the stage 3 output clock $P_x$.
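The sample flow described above can be illustrated with an idealized discrete-time model. This is only a behavioral sketch under stated assumptions: the function name `tims_sc_delay`, the constants `N_PATHS` and `N_CAPS`, the integer `code` argument, and the read-before-write ordering are all illustrative choices, and the model ignores buffers, leakage, settling, noise, and the zero-order hold at the output.

```python
import numpy as np

N_PATHS = 8    # stage 1 / stage 3 time-interleaved paths (from the text)
N_CAPS = 186   # stage 2 storage capacitors per path (from the text)

def tims_sc_delay(x, code):
    """Ideal behavioral model: delay x by 8*code samples.

    `code` is a hypothetical delay setting in 1..185; each unit holds a
    sample in stage 2 for one more stage 2 clock period (8 samples at F_s).
    """
    if not 1 <= code <= N_CAPS - 1:
        raise ValueError("code must be in 1..185")
    store = np.zeros((N_PATHS, N_CAPS))   # stage 2 capacitor array
    y = np.zeros(len(x))
    for n, sample in enumerate(x):
        path = n % N_PATHS                 # stage 1 phase P_x
        slot = (n // N_PATHS) % N_CAPS     # stage 2 capacitor being written
        read_slot = (slot - code) % N_CAPS # capacitor written `code` cycles ago
        y[n] = store[path, read_slot]      # stage 3 readout (assumed before write)
        store[path, slot] = sample         # stage 1 -> stage 2 transfer
    return y

x = np.arange(1, 3001, dtype=float)
y = tims_sc_delay(x, code=185)             # longest setting: 1480 samples
```

In this model every sample passes through exactly one stage 2 capacitor, so the output is simply the input shifted by `8 * code` samples, with zeros until the first stored sample is read out.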

The input and output clocks in each stage 2 block are generated by two separate, but synchronous, divide-by-186 circuits implemented in standard logic operating at frequency $F_s/8$ (Fig. 3). The RF delay is programmed by delaying the enable timing (Fig. 3) of the stage 2 output clock $PO_{x0}$ relative to the associated input clock $PI_{x0}$ and is fully configurable over a standard SPI digital interface. The RF delay can be programmed over a range of $8/F_s$ to $1480/F_s$ in $8/F_s$ steps, so the delay scales with the sample period $1/F_s$. The high configurability and broadband delay provide flexibility in operating frequency and bandwidth around $F_s/2$ alias intervals.
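As a quick check of the programming range, the nominal delay for each setting can be computed directly from the $8/F_s$ step. The sketch below assumes an integer code running from 1 to 185 (inferred from the $8/F_s$ to $1480/F_s$ range, not a documented register map); the names `nominal_delay_s` and `FS_HZ` are illustrative. It gives only the nominal sampled delay, not the measured values reported in Section IV.

```python
FS_HZ = 3.3e9   # sample rate used for characterization
N_PATHS = 8     # stage 2 runs at F_s/8, so one step is 8 sample periods
MAX_CODE = 185  # assumed: 1480/F_s divided by the 8/F_s step

def nominal_delay_s(code, fs_hz=FS_HZ):
    """Nominal delay in seconds for a hypothetical integer delay code."""
    if not 1 <= code <= MAX_CODE:
        raise ValueError("code must be in 1..185")
    return N_PATHS * code / fs_hz

step_ns = nominal_delay_s(1) * 1e9             # about 2.42 ns per step
max_ns = nominal_delay_s(MAX_CODE) * 1e9       # about 448.5 ns
max_ns_at_4p4 = nominal_delay_s(MAX_CODE, 4.4e9) * 1e9  # shorter at higher F_s
print(f"step = {step_ns:.2f} ns, max = {max_ns:.1f} ns")
```

Raising the sample rate shortens both the step and the maximum delay by the same factor, which is the sense in which delay scales with the sample period.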

## III. CIRCUIT IMPLEMENTATION

Fig. 3 shows the block diagram and Fig. 4 shows the circuit diagrams for the proposed delay element. A low-noise amplifier (LNA) provides gain and differential conversion, similar to [8] but with key modifications (Fig. 4a). By biasing the lower NMOS devices at the same current density and applying the same $V_{GS} = V_{DS}$ across these devices in both cascode legs through shared cascode bias voltage, the largest AC coupling capacitor between the inverter and cascode node in [8] is removed for significant size savings. The LNA only consumes $0.0016 \text{ mm}^2$ active area while achieving a simulated noise figure of $3.5 \text{ dB}$ at $3.5 \text{ mW}$ power consumption. An input buffer (Fig. 4b) provides LNA output isolation to the switched-capacitor circuits and employs a push-pull output stage. Each switched-capacitor circuit employs a differential $250 \text{ fF}$ capacitor for small area and low sampling noise contribution. The stage 1 buffer (Fig. 4c) employs an NMOS common-source with diode-connected load for unity gain matching between the 8 paths, all placed close in layout, to limit gain mismatch induced signal distortion. The stage 2 buffer employs a dynamic inverter clocked at both $VSS$ and $VDD$ by $PO_{xy}$, where only 1 of 186 in each path are enabled at a time and 186 share a self-biased inverter load in each of the 8 delay paths. Stage 3 buffers are placed close in layout for matching and employ dynamic common-source amplifiers with a shared resistive load. An output buffer (Fig. 4d) provides balun and matching operation, employing a common-source amplifier and push-pull output stage. Clocking is provided through a divide-by-2 ($F_{clk} = 2F_s$) and 8-phase clocks are generated by two synchronous divide-by-8 circuits for low timing skew at delay stages 1 and 3. 8-phase clocks are pulse extended to 50% duty cycle for standard logic compatibility to drive divide-by-186 circuits inside each stage 2 delay area.

Fig. 5. Die micrograph of the programmable RF delay element.

Fig. 6. Measured performance ($F_s = 3.3 \text{ GHz}$) (a) across delay code at $F_{RF} = 1 \text{ GHz}$, (b) delay DNL/INL at $F_{RF} = 1 \text{ GHz}$, (c) maximum and (d) minimum delay across $F_{RF}$ frequency.

Fig. 7. Measured RF delay element gain and noise figure ($F_s = 3.3 \text{ GHz}$).

## IV. MEASURED PERFORMANCE

The delay element was implemented in a $45 \text{ nm}$ SOI CMOS process with $4 \text{ mm}^2$ chip area and $1.36 \text{ mm}^2$ active area. The die micrograph is shown in Fig. 5. The device was packaged in a $5 \times 5 \text{ mm}^2$ QFN and soldered on PCB for device testing. The sample frequency was chosen to be $F_s = 3.3 \text{ GHz}$ ($F_{clk} = 6.6 \text{ GHz}$) for full characterization, while the device was found to operate properly beyond $F_s = 4.4 \text{ GHz}$ ($F_{clk} = 8.8 \text{ GHz}$) and below $F_s = 3.3 \text{ GHz}$, providing system flexibility in clock frequency, delay range, and frequency coverage. The device consumes $80 \text{ mW}$ ($3.5 \text{ mW}$ LNA, $3.5 \text{ mW}$ output buffer, 30 mW clocking, 43 mW delay buffers) from a 1 V core supply and includes a 1.8 V digital interface.

Fig. 8. Maximum delay achieved vs area efficiency scatter plot of prior art.

Table 1. Performance summary and comparison to prior art.

|                 | This Work                         | JSSC 2021 [4]                    | JSSC 2015 [2]         | JSSC 2017 [1]         |
|-----------------|-----------------------------------|----------------------------------|-----------------------|-----------------------|
| Design          | Delay Element                     | SIC Receiver                     | 4 Channel Beamformer  | Delay Element         |
| Architecture    | TI-MS Switched-Cap                | Switched-Cap                     | Gm-C                  | Gm-C                  |
| Frequency Range | 0.2-2 GHz                         | 0.1-1 GHz                        | 1-2.5 GHz             | 0.1-2 GHz             |
| 3-dB Bandwidth  | 0.2-1.1 GHz <sup>a</sup>          | 0.1-0.5 GHz <sup>b</sup>         | 1-2.5 GHz             | 0.1-2 GHz             |
| Max Delay       | 448.6 ns <sup>a</sup>             | 7.75 ns <sup>b</sup>             | 0.55 ns               | 1.7 ns                |
| Area Efficiency | 330 ns/mm<sup>2</sup> <sup>a</sup> | 37 ns/mm<sup>2</sup> <sup>b</sup> | 7.9 ns/mm<sup>2</sup> | 5.9 ns/mm<sup>2</sup> |
| Delay Range     | 175.9x                            | 31x <sup>b</sup>                 | 39.3x <sup>c</sup>    | 6.8x                  |
| Gain            | 24 dB                             | -19 dB <sup>b</sup>              | 12 dB                 | 0.6 dB                |
| Noise Figure    | 7.1 dB                            | -                                | 8 dB                  | 23 dB                 |
| IP1dB           | -27 dBm                           | -                                | -21 dBm               | -13 dBm               |
| Power           | 80 mW <sup>a</sup>                | 7.4 mW <sup>b</sup>              | 90 mW <sup>d</sup>    | 364 mW                |
| Technology      | 45 nm SOI                         | 65 nm                            | 140 nm                | 130 nm                |
| Active Area     | 1.36 mm<sup>2</sup>               | 0.21 mm<sup>2</sup> <sup>b</sup> | 0.07 mm<sup>2</sup>   | 0.29 mm<sup>2</sup>   |

<sup>a</sup>F<sub>s</sub> = 3.3 GHz. <sup>b</sup>Max RF delay element. <sup>c</sup>Based on delay step. <sup>d</sup>Single channel.

RF delay was measured using both an RF oscilloscope with correlation-based post-processing and a 2-port VNA with group delay capture. Delay performance was verified across all delay settings at F<sub>RF</sub> = 1 GHz, and at minimum/maximum delay across RF input frequency (Fig. 6). The maximum achieved delay was 448.6 ns, minimum delay 2.55 ns, and delay slope showed expected 2.42 ns/step (8/F<sub>s</sub>) over a 175.9x delay range. Delay DNL/INL was  $<\pm 4$  ps across all delay codes. Delay response was relatively flat at minimum and maximum delay settings across 0.2-2 GHz, with delay variation at maximum delay of  $<0.12$  %.

Fig. 7 shows the measured gain and noise figure of the delay element. The device achieved a maximum gain of 24 dB, 1.1 GHz 3-dB bandwidth and 7.1 dB minimum NF at the maximum delay setting, while $<0.1$ dB gain/NF change was observed at the minimum delay setting, showing successful sample leakage mitigation. Gain flatness across any 100 MHz BW was $<\pm 0.5$ dB across 0.2-2 GHz. Bandwidth and flatness can be further improved by operating at F<sub>s</sub> > 3.3 GHz to reduce the sinc roll-off inherent to the zero-order hold operation ($\sin(\pi F_{RF}/F_s)/(\pi F_{RF}/F_s)$), which produces 1.4 dB and 6 dB loss at F<sub>RF</sub> = 1 GHz and 2 GHz, respectively. Input-referred P1dB was -27 dBm at 1 GHz and -25 dBm at 2 GHz, dominated by the output buffer. S11/S22 was $<-10$ dB from 0.2-3 GHz for the RF input/output, and $<-10$ dB from 3.5-9 GHz for the clock input, including an open-stub board match. Input-referred clock spurious performance was $<-47$ dBm at F<sub>s</sub>/8 and 3F<sub>s</sub>/8, and $<-76$ dBm for all other broadband spurs.
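The zero-order hold roll-off quoted above follows directly from the sinc expression. The short sketch below evaluates it at the characterized sample rate and at a higher one. The function name `zoh_loss_db` is illustrative, and the calculation covers only the ideal sinc term, not the measured gain response of the buffers.

```python
import math

def zoh_loss_db(f_rf_hz, fs_hz):
    """Loss in dB of an ideal zero-order hold at f_rf_hz for sample rate fs_hz."""
    x = math.pi * f_rf_hz / fs_hz
    if x == 0.0:
        return 0.0                      # no loss at DC
    return -20.0 * math.log10(abs(math.sin(x) / x))

loss_1g = zoh_loss_db(1e9, 3.3e9)       # about 1.4 dB
loss_2g = zoh_loss_db(2e9, 3.3e9)       # about 6 dB
loss_2g_fast = zoh_loss_db(2e9, 4.4e9)  # lower loss at a higher sample rate
print(f"{loss_1g:.2f} dB, {loss_2g:.2f} dB, {loss_2g_fast:.2f} dB")
```

The comparison at 2 GHz shows why a higher sample rate is expected to improve bandwidth and flatness: the same RF frequency sits further from the first sinc null at $F_s$.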

Fig. 8 and Table 1 compare this work against prior state-of-the-art RF delay elements. This work is the first to achieve  $>8$  ns programmable delay at GHz frequencies with increases of 58x in maximum delay, 9x in area efficiency, and 4.5x in delay range compared to prior art while maintaining comparable gain, NF, linearity, and power consumption.
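The improvement factors quoted above can be reproduced from the Table 1 entries. The sketch below takes, for each metric, the best value among the three prior-art columns; the dictionary layout and key names are illustrative, while the numbers are copied from Table 1.

```python
# Values copied from Table 1; key names are illustrative.
this_work = {"max_delay_ns": 448.6, "active_area_mm2": 1.36, "delay_range_x": 175.9}
prior_art = {
    "[4]": {"max_delay_ns": 7.75, "area_eff_ns_per_mm2": 37.0, "delay_range_x": 31.0},
    "[2]": {"max_delay_ns": 0.55, "area_eff_ns_per_mm2": 7.9, "delay_range_x": 39.3},
    "[1]": {"max_delay_ns": 1.7, "area_eff_ns_per_mm2": 5.9, "delay_range_x": 6.8},
}

def best_prior(metric):
    """Largest value of `metric` among the prior-art entries."""
    return max(entry[metric] for entry in prior_art.values())

# Area efficiency is maximum delay divided by active area.
area_eff = this_work["max_delay_ns"] / this_work["active_area_mm2"]

delay_gain = this_work["max_delay_ns"] / best_prior("max_delay_ns")
area_eff_gain = area_eff / best_prior("area_eff_ns_per_mm2")
range_gain = this_work["delay_range_x"] / best_prior("delay_range_x")
print(f"{area_eff:.0f} ns/mm^2, {delay_gain:.0f}x, {area_eff_gain:.0f}x, {range_gain:.1f}x")
```

Note that the best prior-art value comes from [4] for maximum delay and area efficiency, and from [2] for delay range.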

## V. CONCLUSION

A 0.2-2 GHz RF delay element was presented. The proposed time-interleaved multi-stage switched-capacitor implementation enabled a programmable delay of up to 448.6 ns at GHz frequencies while achieving small size at 330 ns/mm<sup>2</sup> area efficiency. The delay element breaks the  $<8$  ns delay limit in prior art by over an order of magnitude, enabling miniaturization and increased programmable delay range in radar testers, DRFM devices, and full-duplex FIR SIC filters.

## ACKNOWLEDGMENT

The authors thank Rodrigo Llanes and Jeff Chiu for the board design and system integration. This work was supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.

## REFERENCES

- [1] I. Mondal and N. Krishnapura, "A 2-GHz bandwidth, 0.25-1.7 ns true-time-delay element using a variable-order all-pass filter architecture in 0.13 $\mu$m CMOS," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 8, pp. 2180-2193, Aug. 2017.
- [2] S. K. Garakoui, E. A. M. Klumperink, B. Nauta, and F. E. van Vliet, "Compact cascadable gm-C all-pass true time delay cell with reduced delay variation over frequency," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 3, pp. 693-703, Mar. 2015.
- [3] M. Li, N. Li, H. Gao, Z. Zhang, S. Wang, Y.-C. Kuan, C. Song, X. Yu, Q. J. Gu, and Z. Xu, "An 800-ps origami true-time-delay-based CMOS receiver front end for 6.5-9-GHz phased arrays," *IEEE Solid-State Circuits Letters*, vol. 3, pp. 382-385, 2020.
- [4] A. Nagulu, A. Gaonkar, S. Ahasan, S. Garikapati, T. Chen, G. Zussman, and H. Krishnaswamy, "A full-duplex receiver with true-time-delay cancelers based on switched-capacitor-networks operating beyond the delay-bandwidth limit," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 5, pp. 1398-1411, May 2021.
- [5] R. Lu, Y. Yang, A. E. Hassanien, and S. Gong, "Gigahertz low-loss and high power handling acoustic delay lines using thin-film lithium-niobate-on-sapphire," *IEEE Transactions on Microwave Theory and Techniques*, vol. 69, no. 7, pp. 3246-3254, 2021.
- [6] K. E. Kolodziej, B. T. Perry, and J. S. Herd, "In-band full-duplex operation in high-speed mobile environments: Not so fast!" *IEEE Microwave Magazine*, vol. 22, no. 12, pp. 60-72, 2021.
- [7] A. Nagulu, A. Gaonkar, S. Ahasan, T. Chen, G. Zussman, and H. Krishnaswamy, "A full-duplex receiver leveraging multiphase switched-capacitor-delay based multi-domain FIR filter cancelers," in *2020 IEEE Radio Frequency Integrated Circuits Symposium (RFIC)*, 2020, pp. 43-46.
- [8] P. I. Mak and R. P. Martins, "A 0.46-mm<sup>2</sup> 4-dB NF unified receiver front-end for full-band mobile TV in 65-nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 9, pp. 1970-1984, Sep. 2011.