# **A High-Resolution Dual-Loop Digital DLL**

Jongsun Kim and Sang-woo Han

Abstract-A new dual-loop digital delay-locked loop (DLL) using a hybrid (binary + sequential) search algorithm is presented to achieve both wide-range operation and high delay resolution. A new phase-interpolation range selector (PIRS) and a variable successive approximation register (VSAR) algorithm are adopted to resolve the boundary switching and harmonic locking problems of conventional digital DLLs. The proposed digital DLL, implemented in a 0.18-µm CMOS process, occupies an active area of 0.19 mm<sup>2</sup> and operates over a wide frequency range of 0.15-1.5 GHz. The DLL dissipates 11.3 mW from a 1.8 V supply at 1 GHz. The measured peak-to-peak output clock jitter is 24 ps (effective pk-pk jitter = 16.5 ps) with an input clock jitter of 7.5 ps at 1.5 GHz. The delay resolution is 2.2 ps.

*Index Terms*—Delay locked loop, DLL, DRAM, digital DLL

## I. INTRODUCTION

Delay-locked loops (DLLs) are widely used in high-speed integrated circuits, such as dynamic random-access memories (DRAMs) and microprocessors, to support the highest data rates between chips by allowing for the precise synchronization of external and internal clocks. In addition, DLLs are often used in high-speed time-interleaved and pipelined analog-to-digital converters (ADCs). In general, DLLs can be classified into three categories: analog [1, 2], digital [3-7, 12], and mixed-mode DLLs [8, 9].



Fig. 1. Conventional DLLs (a) Analog DLL, (b) Digital DLL.

As shown in Fig. 1(a), a typical analog DLL generally consists of a variable voltage-controlled delay line (VCDL) and a phase detector. A charge pump and a loop filter are often used to control the delay of the VCDL, which continuously adjusts the phase difference between the input and the output clocks. Although analog DLLs show relatively good jitter performance, they require a long locking time, are susceptible to process variations, and are sensitive to power-supply noise. More importantly, analog DLLs are typically designed without support for the standby or power-down modes needed for low power dissipation. Further, dynamic voltage scaling is usually not available in analog DLLs. Therefore, digital DLLs are currently preferred because of their fast locking time and lower standby power consumption. Moreover, digital DLLs allow easy design migration to future advanced process technologies. A conventional digital DLL, shown in Fig. 1(b), generally consists of a digitally controlled delay line (DCDL), a phase detector, and a control block. The control block that adjusts the delay of the DCDL is usually register-controlled, counter-controlled, or successive approximation register (SAR)-controlled [8]. Digital DLLs, however, usually have higher jitter than analog DLLs because of the limited resolution of the delay line element of the DCDL. In order to minimize the quantization error of the discrete delay steps of conventional digital DLLs, a digital DLL with coarse and fine delay lines has been presented in [4]. However, the design in [4] has a coarse delay resolution of 14 ps. Although the digital DLL in [6] achieved a delay resolution of 4 ps, it has a limited operating frequency range in that it cannot operate at frequencies below 1.5 GHz. Mixed-mode DLLs that combine the advantages of analog DLLs and digital DLLs have been proposed [8, 9]. Because the DLL in [8] consists of both a DCDL and a VCDL, it occupies a large area and still comprises analog circuits, such as a charge pump, that cannot be turned off during the power-down mode, resulting in large standby power dissipation.

Manuscript received Jan. 2, 2016; accepted Feb. 29, 2016. Electronic and Electrical Engineering, Hongik University. E-mail: js.kim@hongik.ac.kr

Fig. 2. Proposed dual-loop digital DLL architecture.

In this paper, a high-resolution dual-loop digital DLL is presented [12]. To resolve the low-resolution constraint of the conventional digital DLL, a new dual-loop (coarse + fine loop) architecture using a hybrid (binary + sequential) search algorithm is adopted. The hybrid search algorithm, a combination of the binary search and sequential search algorithms, is one of the most efficient techniques for achieving both fast locking and high delay resolution in a DLL design. The dual-loop architecture forms a closed loop that allows for tracking of process, voltage, and temperature (PVT) variations. By adopting a variable SAR algorithm [3], the harmonic-locking problem is eliminated in this design. Further, a new phase-interpolation range selector (PIRS) is proposed to solve the boundary switching problem [7] of conventional digital DLLs with coarse and fine delay lines. The proposed DLL achieves a high delay resolution of approximately 2.2 ps.

This paper is organized as follows. Section II describes the proposed dual-loop digital DLL architecture. The boundary switching problem is discussed in Section III. Section IV shows the implementation results of the fabricated DLL chip. Finally, the conclusions are given in Section V.

## II. PROPOSED DUAL-LOOP DIGITAL DLL ARCHITECTURE

Fig. 2 shows a block diagram of the proposed dual-loop digital DLL, which consists of two cascaded loops: a coarse loop and a fine loop. The coarse loop contains a coarse DCDL, a PIRS, two thermometer decoders (5-to-32 and 2-to-3), a variable successive approximation register (VSAR), an SAR controller, and a phase detector. The fine loop consists of a phase interpolator (PI), a small-swing to full-swing level converter, a digital-to-analog converter (DAC), an up/down counter, and a phase detector. This architecture has two operating modes: a binary search mode for the coarse loop and a sequential search mode for the fine loop. The coarse loop sets the DLL output clock near the locking point, within half of the digitally controlled delay unit (DCDU) delay, in only a few clock cycles. Then the fine loop makes small adjustments to correct the output phase, resulting in a fine delay resolution of one DCDU delay divided by $2^7$.

Fig. 3. Proposed hybrid (binary + sequential) search algorithm (a) flow-chart, (b) detailed locking process.

Fig. 3(a) and (b) show a detailed flow chart and the locking process of the proposed hybrid search algorithm with a 7-bit VSAR and a 7-bit up/down counter. First, the coarse loop performs a binary search that controls the amount of delay by changing the number of delay elements in the DCDL. Initially, Q[6:0] of the 7-bit VSAR is set to [0000000] and C[6:0] of the 7-bit up/down counter is set to [0100000]. The 5 most-significant bits (MSBs) of the VSAR, Q[6:2], control the number of cascaded DCDUs of the DCDL, which consists of a total of 32 delay units as shown in Fig. 4. The Q[6:2] bits are converted to thermometer codes, T0/T0b to T31/T31b, by the 5-to-32 thermometer decoder. The 2 least-significant bits (LSBs) of the VSAR, Q[1:0], are converted to thermometer codes, K0/K0b to K2/K2b, by the 2-to-3 thermometer decoder, and they are used to control the PIRS. As shown in Fig. 3, a binary search begins with Q[4:0] = [10000], performing a 5-bit search without using the first 2 MSBs. If the delay amount of the DCDL is out of the coarse locking range, the VSAR control bit increases by one and the binary search restarts with Q[5:0] = [100000], performing a 6-bit binary search. If the delay amount is still out of the coarse locking range, a 7-bit binary search starts again. After the binary search is completed, the fine loop starts the sequential search.
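The variable-width binary search described above can be illustrated with a small behavioral sketch. It is a simplification rather than the authors' circuit: the function name `vsar_search`, the uniform per-code delay `lsb_ps`, and the target delay are hypothetical, and the actual mapping from Q[6:0] to DCDL and PIRS delay is not modeled. The search starts 5 bits wide and widens by one bit whenever the full-scale code still falls short of the target.

```python
def vsar_search(target_ps, lsb_ps, start_bits=5, max_bits=7):
    """Behavioral sketch of a variable-width SAR (binary) search.

    Assumes, hypothetically, that every code step adds lsb_ps of delay.
    Returns (search width in bits, final code).
    """
    code = 0
    for nbits in range(start_bits, max_bits + 1):
        code = 0
        # Standard SAR loop: try each bit from MSB to LSB and keep it
        # only if the resulting delay does not overshoot the target.
        for bit in reversed(range(nbits)):
            trial = code | (1 << bit)
            if trial * lsb_ps <= target_ps:
                code = trial
        full_scale = (1 << nbits) - 1
        # Out of range: every bit was kept and one more step would still fit.
        out_of_range = code == full_scale and (code + 1) * lsb_ps <= target_ps
        if not out_of_range:
            return nbits, code
    return max_bits, code


# Hypothetical numbers: 70 ps per code step and a 3670 ps target delay.
width, code = vsar_search(target_ps=3670.0, lsb_ps=70.0)
print(width, code)  # 6 52
```

With these assumed numbers the 5-bit search saturates at code 31, so the search restarts 6 bits wide and settles at code 52.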

Fig. 4. Proposed DCDL, DCDU, and PIRS.

Fig. 3(b) shows the locking process in more detail. The DLL starts a 5-bit binary search with Q[4:0] = [10000] (= 16 in decimal) at the beginning. Since the delay amount of the DCDL is less than the required delay amount even when the coarse loop uses all of the delay cells with Q[4:0] = [11111], the Reset pulse is generated by the SAR controller to increase the VSAR bit by one and restart a 6-bit binary search with Q[5:0] = [100000] (= 32 in decimal). When the binary search is completed with Q[5:0] = 52 in decimal, the SAR controller generates the PI\_EN signal to start the sequential search of the fine loop. After the sequential search is completed around C[6:0] counts of 33 and 34, the DLL keeps a closed loop to track PVT variations with 1-LSB dithering. If the DLL loses the locked state because of an unexpected external clock phase shift, it restarts the sequential search until it is re-locked. The variable SAR algorithm eliminates the harmonic-locking issue in this design [3].
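The sequential search and the 1-LSB dithering described above can be sketched as a bang-bang up/down counter. This is a minimal model under stated assumptions: the function name `sequential_search` and the 73 ps residual delay are hypothetical, the step size is taken as 2.1875 ps (a 280 ps DCDU delay divided by $2^7$), and the phase detector is idealized as a simple comparison.

```python
def sequential_search(residual_ps, step_ps, start=32, bits=7, cycles=12):
    """Bang-bang sketch of the fine loop's up/down counter.

    Each cycle an idealized phase detector compares the PI delay with the
    residual delay left by the coarse loop and moves the counter by one.
    Returns the list of counter values, starting with the initial value.
    """
    max_count = (1 << bits) - 1
    count = start
    history = [count]
    for _ in range(cycles):
        if count * step_ps < residual_ps:
            count = min(count + 1, max_count)  # too little delay: count up
        else:
            count = max(count - 1, 0)          # too much delay: count down
        history.append(count)
    return history


# Hypothetical residual delay of 73 ps, counter starting at [0100000] = 32.
print(sequential_search(residual_ps=73.0, step_ps=2.1875, cycles=6))
# [32, 33, 34, 33, 34, 33, 34]
```

In this model the counter never settles on a single value; once it reaches the target it alternates between the two adjacent codes, which is the 1-LSB dithering of a locked bang-bang loop.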

Fig. 4 shows the proposed DCDL, DCDU, and PIRS. The DCDL is based on a DCDU, which is a cascaded lattice delay unit (LDU) [3]. The delay of a DCDU, $t_{d2}$, is twice the LDU delay $t_{d1}$. The PIRS consists of three LDUs connected in series. The total delay of the fine loop is equal to one DCDU delay of the coarse loop. At the beginning of the binary search mode, Q[1:0] of the VSAR is set to [01], and therefore the PIRS has an initial delay of $2 \times t_{d1}$. The 7-bit PI of the fine loop achieves a small delay resolution of $t_{d2}/2^7 = 2.1875$ ps, where $t_{d2} = 280$ ps. The PI has an initial delay $t_m$, which is equal to $32 \times 2.1875$ ps = 70 ps, because the 7-bit up/down counter is set to [0100000] at the beginning.
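The delay figures in this paragraph follow from simple arithmetic, reproduced here as a short check. The variable names are illustrative only; the numeric inputs (280 ps DCDU delay, 7-bit PI, initial counter value [0100000]) are the ones quoted in the text.

```python
TD2_PS = 280.0            # DCDU delay t_d2 quoted in the text
PI_BITS = 7               # width of the PI control code C[6:0]
COUNTER_INIT = 0b0100000  # initial up/down counter value (32 in decimal)

td1_ps = TD2_PS / 2              # LDU delay, since t_d2 = 2 * t_d1
step_ps = TD2_PS / 2**PI_BITS    # fine-loop delay resolution t_d2 / 2^7
t_m_ps = COUNTER_INIT * step_ps  # initial PI delay t_m

print(td1_ps, step_ps, t_m_ps)  # 140.0 2.1875 70.0
```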

## III. BOUNDARY SWITCHING PROBLEM OF A DIGITAL DLL

In conventional digital DLLs with coarse and fine delay lines, if the locking point lies at the boundary between two adjacent DCDU delay steps, the coarse lock code may toggle back and forth by one bit. This dithering, caused by the boundary switching problem [7], may severely increase the output clock jitter. In this paper, the phase-interpolation range selector (PIRS) is proposed to eliminate the boundary switching problem of conventional digital DLLs. Fig. 5 illustrates an example of a locking process in which the proposed PIRS eliminates the boundary switching problem.

We assume that the locking point is located between phases 13 and 14 with a delay of $t_{LP}$ from phase 13. In the binary search mode, the phase detector will select phase 12 to generate a reference signal $DL_{MID}$, which is the output of the DCDL. If the total delay through the PIRS and the PI is less than $t_{LP}$, then phase 13 will be selected. The PIRS can generate three phase boundaries (phases p0, p1, and p2) with delay steps of $t_{d1}$. One of these three phases is selected to generate the PI input, $DL_{OUTA}$. In this case, the phase detector may select phase p2 for the phase of $DL_{OUTA}$. Then, the phase difference between the input and output clocks is reduced to the PIRS delay step, $t_{d1}$. $DL_{OUTB}$ is a copy of the signal delayed through a DCDU, which has a delay of $t_{d2}$. Since the PIRS overlaps with two adjacent DCDUs (= two MSB delay steps), the proposed DLL architecture can eliminate the boundary dithering of the coarse lock code bits Q[6:0]. Finally, in the sequential search mode, the PI will select phase 33 with a delay resolution of $t_{d3}$, which is equal to $t_{d2}/2^7$.

Fig. 5. Proposed PIRS operation that can eliminate the boundary switching problem.

Fig. 6. PI delay versus 7-bit DAC control bit.

Since the delay resolution of the DCDL impacts the jitter performance of a digital DLL, the fine loop shown in Fig. 2 adopts a 7-bit DAC-controlled PI, which is conceptually similar to that of [11], to increase the delay resolution and to achieve smaller jitter in this design. The PI shown in Figs. 2 and 4 receives two reference signals, $DL_{OUTA}$ and $DL_{OUTB}$, and produces an output signal whose phase lies between those of the two input signals. Fig. 6 illustrates the simulated delay profile of the PI as a function of the 7-bit DAC control bits. Although it is not perfectly linear, the PI shows good monotonicity. The DAC used in this design is a simple binary-weighted current-steering DAC that achieves low power, small area, and good linearity. The DAC provides bias currents $I_{DAC_A}$ and $I_{DAC_B}$ to the PI. The delay resolution is determined by the size of the DAC's LSB. The simulated differential nonlinearity (DNL) is 1.09 LSB, which is approximately 2.4 ps in this design. As long as the delay monotonicity is maintained, the non-linearity in the DAC does not cause a problem.
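The conversion of the quoted DNL from LSB to picoseconds, and the monotonicity condition, can be checked with a short sketch. The helper names `dnl_lsb` and `is_monotonic` and the five-point delay profile are hypothetical illustrations, not the simulated data of Fig. 6; the LSB size is taken as 280 ps / $2^7$ from the text.

```python
def dnl_lsb(delays_ps):
    """Per-step DNL in LSB, taking the average step as the LSB size."""
    steps = [b - a for a, b in zip(delays_ps, delays_ps[1:])]
    lsb = (delays_ps[-1] - delays_ps[0]) / len(steps)
    return [s / lsb - 1.0 for s in steps]


def is_monotonic(delays_ps):
    """True if every code step strictly increases the delay."""
    return all(b > a for a, b in zip(delays_ps, delays_ps[1:]))


# Quoted DNL of 1.09 LSB converted to picoseconds.
LSB_PS = 280.0 / 2**7
dnl_ps = 1.09 * LSB_PS
print(round(dnl_ps, 2))  # 2.38

# Made-up delay profile in ps, for illustration only (not simulation data).
profile = [0.0, 1.5, 5.0, 6.5, 8.0]
print(dnl_lsb(profile))       # [-0.25, 0.75, -0.25, -0.25]
print(is_monotonic(profile))  # True
```

A profile stays monotonic as long as no step has a DNL of -1 LSB or lower, so a large positive DNL widens one step without reversing the direction of the sequential search.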

## IV. EXPERIMENTAL RESULTS

The simulated locking process of the proposed DLL is shown in Fig. 7. The VSAR starts with a 5-bit binary search of the coarse loop. In this simulation, the VSAR Reset pulse is generated to restart with a 6-bit binary search because the delay amount of the DCDL is less than the coarse lock delay required for proper locking. After the coarse lock is completed, the PD generates the Lock signal and the SAR controller enables the PI\_EN signal to start the sequential search of the fine loop. The Counter trace is the C[6:0] code shown in decimal. Finally, the input and output clock signals are correctly locked to each other.

Fig. 7. Simulated locking process of the proposed dual-loop DLL.

The proposed dual-loop digital DLL is fabricated in a 0.18-µm CMOS process and tested in a chip-on-board (CoB) assembly. Fig. 8(a) shows the chip layout and die microphotograph of the proposed DLL with an active area of only 0.19 mm<sup>2</sup>. Fig. 8(b) shows the measurement setup. Fig. 9 shows the measured peak-to-peak (pk-pk) jitter of the input and output clocks at 150 MHz, 500 MHz, and 1.5 GHz. The digital DLL achieves a measured pk-pk jitter of 24 ps at 1.5 GHz with an input clock jitter of 7.5 ps. It also achieves a measured pk-pk jitter of 40 ps and 32 ps at 150 MHz and 500 MHz, respectively, with an input clock jitter of 20 ps. After subtracting the input clock jitter, the effective pk-pk jitter is only 16.5 ps at 1.5 GHz. Fig. 10 shows the measured input and output pk-pk clock jitter versus operating frequency. The proposed DLL dissipates 11.3 mW at 1.0 GHz from a supply voltage of 1.8 V. A performance comparison between the proposed dual-loop digital DLL and other DLLs is given in Table 1.
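The effective jitter quoted above is the measured output jitter minus the input clock jitter. The sketch below applies that same linear subtraction to the three measured frequencies; the helper name `effective_jitter_ps` is illustrative, and only the 1.5 GHz result (16.5 ps) is stated in the text, so the other two values are derived here by assumption rather than reported measurements.

```python
# (frequency, measured output pk-pk jitter in ps, input pk-pk jitter in ps)
measurements = [
    ("150 MHz", 40.0, 20.0),
    ("500 MHz", 32.0, 20.0),
    ("1.5 GHz", 24.0, 7.5),
]


def effective_jitter_ps(output_ps, input_ps):
    """Output pk-pk jitter minus input pk-pk jitter (linear subtraction)."""
    return output_ps - input_ps


for label, out_ps, in_ps in measurements:
    print(label, effective_jitter_ps(out_ps, in_ps))
# 150 MHz 20.0
# 500 MHz 12.0
# 1.5 GHz 16.5
```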




Fig. 8. (a) Chip layout and die microphotograph, (b) Measurement setup.






Fig. 9. Measured input and output clocks at (a) 150 MHz, (b) 500 MHz, (c) 1.5 GHz.



Fig. 10. Measured input and output pk-pk clock jitters.

|                              | JSSC [2]         | JSSC [3]            | JSSC [4]          | JSSC [6]            | This work          |
|------------------------------|------------------|---------------------|-------------------|---------------------|--------------------|
| Process, supply              | 0.18 µm, 1.8 V   | 0.18 µm, 1.8 V      | 0.13 µm, 1.6 V    | 0.13 µm, 1.5 V      | 0.18 µm, 1.8 V     |
| Type                         | Analog dual-loop | Digital single-loop | Digital dual-loop | Digital single-loop | Digital dual-loop  |
| Frequency range              | 60-760 MHz       | 40-550 MHz          | 66-500 MHz        | 1.5-2.5 GHz         | 150 MHz-1.5 GHz    |
| Delay resolution             | -                | 10 ps               | 14 ps             | 4 ps                | 2.2 ps             |
| Pk-pk jitter                 | 28 ps @ 700 MHz  | 12 ps @ 500 MHz     | -                 | 14 ps @ 2.5 GHz     | 16.5 ps @ 1.5 GHz  |
| Active area (mm<sup>2</sup>) | 0.189            | 0.2                 | 0.053             | 0.03                | 0.19               |
| Power (mW)                   | 63 @ 700 MHz     | 12.6 @ 550 MHz      | 29 @ 266 MHz      | 30 @ 2.5 GHz        | 11.3 @ 1 GHz       |

Table 1. Performance summary and comparison.

## V. CONCLUSION

A 0.15–1.5 GHz digital DLL for high-speed DRAMs has been implemented in a 0.18-$\mu$m CMOS process. The proposed DLL includes a dual-loop architecture with a hybrid search algorithm to achieve both wide-range operation and high delay resolution. The boundary switching and harmonic locking problems are eliminated by adopting the proposed PIRS and the VSAR algorithm. The proposed digital DLL achieves a delay resolution of 2.2 ps, and the measured pk-pk output clock jitter is 24 ps with an input clock jitter of 7.5 ps at 1.5 GHz (effective pk-pk jitter = 16.5 ps). The proposed DLL occupies an active area of only 0.19 mm<sup>2</sup> and dissipates 11.3 mW from a 1.8 V supply at 1.0 GHz.

## ACKNOWLEDGMENTS

This work (C0249896) was supported by the Business for Cooperative R&D between Industry, Academy, and Research Institute program funded by the Korea Small and Medium Business Administration in 2015. The chip fabrication was supported by IDEC.

## REFERENCES

- [1] Y. Moon, J. Choi, K. Lee, D. Jeong, and M. Kim, "An all-analog multiphase delay-locked loop using a replica delay line for wide-range operation and low-jitter performance," *IEEE J. Solid-State Circuits*, vol. 35, no. 3, pp. 377-384, 2000.
- [2] S. Bae, H. Chi, Y. Sohn, and H. Park, "A VCDL-based 60-760MHz dual-loop DLL with infinite phase-shift capability and adaptive-bandwidth scheme," *IEEE J. Solid-State Circuits*, vol. 40, no. 5, pp. 1119-1129, 2005.
- [3] R. Yang and S. Liu, "A 40-550MHz harmonic-free all-digital delay-locked loop using a variable SAR algorithm," *IEEE J. Solid-State Circuits*, vol. 42, no. 2, pp. 361-373, Feb. 2007.
- [4] T. Matano et al., "A 1-Gb/s/pin 512-Mb DDRII SDRAM using a digital DLL and a slew-rate-controlled output buffer," *IEEE J. Solid-State Circuits*, vol. 38, no. 5, pp. 762-768, 2003.
- [5] L. Wang, L. Liu, and H. Chen, "An implementation of fast-locking and wide-range 11-bit reversible SAR DLL," *IEEE Trans. Circuits and Systems II*, vol. 57, no. 6, pp. 421-425, Jun. 2010.
- [6] R. Yang and S. Liu, "A 2.5 GHz all-digital delay-locked loop in 0.13μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 42, no. 11, pp. 2338-2347, 2007.
- [7] J.-T. Kwak et al., "A low cost high performance register-controlled digital DLL for 1Gbps x32 DDR SDRAM," *Dig. Tech. Papers, VLSI Circuits Symp.*, pp. 112-113, 2002.
- [8] G.-K. Dehng, J.-W. Lin, and S.-I. Liu, "A fast-lock mixed-mode DLL using a 2-b SAR algorithm," *IEEE J. Solid-State Circuits*, vol. 36, pp. 1464-1471, 2001.
- [9] J. Kim, S. Lee, T. Jung, C. Kim, S. Cho, and B. Kim, "A low-jitter mixed-mode DLL for high-speed DRAM applications," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1430-1436, Oct. 2000.
- [10] Jong-Chern Lee et al., "A low-power small-area open loop digital DLL for 2.2Gb/s/pin 2Gb DDR3 SDRAM," *IEEE Asian Solid State Circuits Conference*, pp. 157-160, 2011.
- [11] S. Sidiropoulos et al., "A semidigital dual delay-locked loop," *IEEE J. Solid-State Circuits*, vol. 32, pp. 1683-1692, Nov. 1997.
- [12] Sangwoo Han and Jongsun Kim, "A high-resolution wide-range dual-loop digital delay-locked loop using a hybrid-search algorithm," *IEEE Asian Solid State Circuits Conference*, pp. 293-296, 2012.



Jongsun Kim received his Ph.D. degree in electrical engineering from the University of California, Los Angeles (UCLA) in 2006 in the field of Integrated Circuits and Systems. He was a postdoctoral fellow at UCLA from 2006 to 2007. From 1994 to 2001 and from 2007 to 2008, he was with Samsung Electronics as a senior research engineer in the DRAM Design Team, where he worked on the design and development of Synchronous DRAMs, SGDRAMs, Rambus DRAMs, DDR3 and DDR4 DRAMs. Dr. Kim joined the School of Electronic & Electrical Engineering, Hongik University in March 2008. Professor Kim's research interests are in the areas of high-performance mixed-signal circuits and systems design. His current research areas include high-speed and low-power transceiver circuits for chip-to-chip communications, clock recovery circuits (PLLs/DLLs/CDRs), frequency synthesizers, signal integrity and power integrity, ultra low-power memories, power-management ICs (PMICs), RF-interconnect circuits, and low-power memory interface circuits and systems.



Sangwoo Han was born in Seoul, Korea, in 1985. He received the B.S., M.S., and Ph.D. degrees from the Department of Electronic and Electrical Engineering, Hongik University, Korea, in 2010, 2012, and 2016, respectively.