Paper 2011-48IE-2-2

# Phase Error Accumulation Methodology for On-chip Cell Characterization

Chang Soo Kang\* and In Ho Im\*\*

\* Regular member, Dept. of Electronic & Information Engineering, Yuhan University

\*\* Regular member, AMNT Co., Ltd.

Received: April 7, 2011; revision completed: June 15, 2011

#### **Summary**

This paper presents a new design method for measuring propagation delay when characterizing ASIC standard library cells in nanostructures. Providing accurate timing information about library cells (NOR, AND, XOR, etc.) can improve timing analysis in the ASIC design flow. This analysis can also be useful to the semiconductor foundry team in the technology process. The propagation delay of a CMOS element, together with SPICE simulation, makes it possible to predict the accuracy of the transistor parameters. The phase error accumulation method was realized physically in a semiconductor manufacturing process (0.11 µm, GL130SB). The propagation delay in the standard cell library could be measured with an accuracy down to $10^{-12}$ seconds. The solution for VLSI STPE can be used for layout, simulation, and verification.

#### **Abstract**

This paper describes the design of a new method of propagation delay measurement in micro- and nanostructures during the characterization of ASIC standard library cells. By providing more accurate timing information about library cells (NOR, AND, XOR, etc.) to the design team, we can improve the quality of timing analysis within the ASIC design flow. This information could also be very useful to the semiconductor foundry team for making corrections to the technology process. By comparing the propagation delay in the CMOS element with the result of analog SPICE simulation, we can draw conclusions about the accuracy and quality of the transistor parameters. A physical implementation of the phase error accumulation method (PHEAM) can easily be integrated on the same chip, as close as possible to the device under test (DUT). It was implemented as a digital IP core for a semiconductor manufacturing process (0.11 µm, GL130SB). The method makes it possible to observe the propagation delay in a single element of the standard-cell library with picosecond accuracy or better. Useful solutions are thus presented for VLSI schematic-to-parameters extraction (STPE), basic cell layout verification, and design simulation and verification.

Keywords: Transmitter, Phase Error Accumulation Methodology (PHEAM), BIST, DFT

### I. Introduction

The Phase Error Accumulation Methodology (PHEAM) is one possible way to provide high-precision on-chip measurement of the propagation delay inside a single standard library cell. Another is the random sampling methodology $^{[1-3]}$. ASIC design flows involve several activities, from specification and design entry to place-and-route and timing closure. Timing closure is accomplished when all of the signal paths in the design satisfy the timing constraints imposed by the interface circuitry, the circuit's sequential elements, and the system clock. Timing verification ultimately depends on realistic values of the propagation delays in a library element model<sup>[4]</sup>. Extremely accurate analysis of the propagation delay in each STD delay cell therefore becomes crucial during the VLSI validation process. With element dimensions getting smaller each year, the ability to measure delay in the picosecond range is becoming essential. However, very few mechanisms exist for measuring propagation delay in VLSI mass production. Library timing information may therefore be shipped to library customers who assume the timing is within specification, when no concrete information has actually been developed to prove or disprove it. As a result, ASIC design specialists have recently been looking toward Built-In Self-Test (BIST) applications.

The proposed method is based on the hypothesis that small differences exist between two nominally identical physical micro- (or nano-) structures such as CMOS gates. These differences may come from geometrical distortions or chemical impurities in the layout of the element, and they are difficult to detect with ordinary test equipment. By using a number of ring oscillators and counters, however, this method measures the delay time in a single cell.
Various mathematical methods can extract more detailed information about cell characteristics, for example the separation of rising and falling times. By combining random sampling with the phase error accumulation methodology, we build a powerful VLSI validation tool that supports the test, extraction, and library design teams.

### II. Conception of Phase Shift

Figure 1 illustrates the underlying idea with a mechanical analogy before it is realized in electronics. The system consists of two flywheels, 1 and 2, with almost the same mass (m1 and m2). To start rotation, the same impulse (I1 = I2, or force F) is applied to both flywheels at the same time, t1 = t2 = 0. During rotation the phase shift (or phase error) $\Theta$ between the flywheels changes as a result of imperfections in the test conditions and in the masses. By observing synchronous points 1 and 2, the number of rotations can be determined. In addition, we can see how the phase angles and the phase shift $\Theta$ change during rotation.

Fig. 1. Illustration of the basic concept.

Fig. 2. Phase angles and phase shift.

The mapping of the synchronous points from the 3D view to a 2D diagram is depicted in Figure 2. Here "a" is the radius of the flywheels, $\Theta$ is the phase shift, and $\phi_1$ and $\phi_2$ are the phase angles of the synchronous points. A dynamical simulation is shown in Table 1, where t is the observation time of the flywheels in seconds, and N1 and N2 are the numbers of full rotations of each wheel. Let the phase angles advance by $\phi_1$ = 10 and $\phi_2$ = 15 degrees per second.

Consider the simulation results in Table 1. They contain points where the phase shift $\Theta$ = 0, at t = 0 and t = 72 seconds, and these points repeat periodically. It is very difficult to identify the instantaneous value of the phase shift $\Theta$ (and likewise of $\phi_1$ and $\phi_2$) directly, especially when it is very small, as it is in practical cases. However, it is easy to calculate $\phi_1$ and $\phi_2$ from the values of t, N1 and N2 after a certain span of simulation time. This is easiest when N1 and N2 change value at the same time. As we will see later, it is sometimes impossible to determine the exact moment outside the boundary points, where $\Theta \neq 0$.

Table 1. Dynamical simulation of phase shift in the system.

| t (s) | φ1 (deg) | φ2 (deg) | Θ (deg) | N1 | N2 |
|-------|----------|----------|---------|----|----|
| 1 | 10 | 15 | 5 | 0 | 0 |
| 2 | 20 | 30 | 10 | 0 | 0 |
| 3 | 30 | 45 | 15 | 0 | 0 |
| ... | ... | ... | ... | ... | ... |
| 23 | 230 | 345 | 115 | 0 | 0 |
| 24 | 240 | 360 | 120 | 0 | 1 |
| 25 | 250 | 15 | 235 | 0 | 1 |
| ... | ... | ... | ... | ... | ... |
| 35 | 350 | 165 | 185 | 0 | 1 |
| 36 | 360 | 180 | 180 | 1 | 1 |
| 37 | 10 | 195 | 185 | 1 | 1 |
| ... | ... | ... | ... | ... | ... |
| 47 | 110 | 345 | 235 | 1 | 1 |
| 48 | 120 | 360 | 240 | 1 | 2 |
| 49 | 130 | 15 | 115 | 1 | 2 |
| ... | ... | ... | ... | ... | ... |
| 71 | 350 | 345 | 5 | 1 | 2 |
| 72 | 360 | 360 | 0 | 2 | 3 |
| 73 | 10 | 15 | 5 | 2 | 3 |
| 74 | 20 | 30 | 10 | 2 | 3 |

Using formulas (1), (2) and (3), $\phi_1$, $\phi_2$ and $\Theta$ can be calculated.

$$\phi_1 = \frac{360 \cdot N_1}{t} \tag{1}$$

$$\phi_2 = \frac{360 \cdot N_2}{t} \tag{2}$$

$$\Theta = |\phi_2 - \phi_1| = \frac{360 \cdot |N_2 - N_1|}{t} \tag{3}$$

For example, at t = 72 with N1 = 2 and N2 = 3 from Table 1, these formulas give $\phi_1$ = 10, $\phi_2$ = 15 and $\Theta$ = 5.

Thus, the phase error accumulation method can be applied to the extraction of parameters ($\phi_1$, $\phi_2$ and $\Theta$). Knowledge of $\phi_1$, $\phi_2$ and $\Theta$ (the phase error) in turn gives indirect information about the mass difference of the flywheels (or other parameters of interest).

### III. Schematic realisation of PHEAM

Figure 3 shows simple ring oscillators (RO) based on NAND elements. The simplest procedure for calculating the propagation delay time is as follows. If counters are connected to the outputs of RO1 and RO2, the number of cycles of each oscillator is easily obtained. The waveform illustrating the activity of the ROs and the accumulation of phase error is depicted in Figure 4. The situation described above for the flywheels is thereby reproduced, and the delay is computed by the same method using formulas (4) and (5).
$$\Delta t_1 = \frac{T_{ref} \cdot N_t / N_1}{N_I} \tag{4}$$

$$\Delta t_2 = \frac{T_{ref} \cdot N_t / N_2}{N_I} \tag{5}$$

where $\Delta t_1$ and $\Delta t_2$ are the propagation delays in each of the elements, $T_{ref}$ is the period of the reference clock signal, $N_t$ is the number of periods of the reference clock (REF in the schematic), $N_1$ and $N_2$ are the values in the counters of RO1 and RO2, and $N_I$ is the total number of NAND elements in the delay line. The stop condition is reached when the value in counter 1 ≠ the value in counter 2. At this time the first measurement result is ready to be transferred out of the module.

Fig. 3. Schematic realization of PHEAM.

Fig. 4. Waveform of ring oscillators.

Fig. 5. Hardware implementation of PHEAM.

The hardware realization is illustrated in Figure 5. It consists of two ring oscillators containing 101 and 103 NAND primitives, respectively. To give a more accurate result, the stop condition is selected as the overflow of one of the counters.

### IV. Analysis of falling and rising time separation in PHEAM

Although this solution is extremely clear and simple to realize, it has a significant disadvantage: the proposed idea cannot separate the rising and falling delays in the results. Consider again the main schematic in Figure 3 and analyze how the edge of the input pulse passes from the input to the output of the RO to form the output CLOCK. Each time the signal propagates from the input of one element to the input of the next, some time elapses. We denote this time by a variable that depends on the kind of propagation involved, which gives a few variables. Let X3 be the propagation delay when the start signal propagates to output 3 of N1.1. Then X1 is the rising delay when the clock signal passes through N1.2 from input 1 to output 3. Continuing, X2 is the falling delay through N1.3, and so on. Finally, we form an equation of the type

$$AX1 + BX2 + CX3 = t \tag{6}$$

where t is the total time of simulation.
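To make the use of equation (6) concrete, the following Python sketch solves the two-unknown form of the system, with the X3 term omitted, using the coefficients and totals of equations (10) and (11) in this section. The function name `solve_2x2` and the choice of Cramer's rule with exact fractions are assumptions made for this illustration; the paper does not state how the system was solved.

```python
from fractions import Fraction


def solve_2x2(a1, b1, t1, a2, b2, t2):
    """Solve a1*X1 + b1*X2 = t1 and a2*X1 + b2*X2 = t2 by Cramer's rule.

    Exact rational arithmetic is used so that no rounding error is
    introduced by the solver itself.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        # The two equations carry the same information: no unique solution.
        raise ValueError("equations are linearly dependent")
    x1 = Fraction(t1 * b2 - t2 * b1, det)
    x2 = Fraction(a1 * t2 - a2 * t1, det)
    return x1, x2


# Coefficients and total times (ps) taken from equations (10) and (11).
rise_ps, fall_ps = solve_2x2(6, 5, 1105, 10, 9, 1909)
print(rise_ps, fall_ps)  # prints: 100 101
```

The determinant check matters in practice: two measurements whose coefficients are proportional to each other cannot separate the rising delay from the falling delay, and the sketch reports that case instead of returning a meaningless value.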
Writing this equation for each ring oscillator gives a system of linear equations:

$$6X1 + 5X2 + 1X3 = 1105 \tag{7}$$

$$10X1 + 9X2 + 1X3 = 1909 \tag{8}$$

$$14X1 + 13X2 + 1X3 = 2713 \tag{9}$$

However, this system has infinitely many solutions, because the three equations are linearly dependent: adding (7) and (9) gives twice (8). Analyzing the schematic shown in Figure 3, it is easy to see that X3 can be omitted, and the equations reduce to the following form:

$$6X1 + 5X2 = 1105 \tag{10}$$

$$10X1 + 9X2 = 1909 \tag{11}$$

Solving equations (10) and (11), we obtain X1 = 100 and X2 = 101, that is, a rising propagation delay of 100 ps and a falling propagation delay of 101 ps.

A possible simplification is depicted in Figure 6. The start pulse clearly does not affect the behavior of the system if the propagation delay from input 2 to output 3 is the same for N1.1 and N2.1. It is also unnecessary to wait for phase matching. In a real test case the equations can be more complex in order to provide higher accuracy in data extraction. By solving sets of linear systems under different test conditions, we can collect statistical data for further analysis and extrapolation.

Fig. 6. Another view of hardware implementation.

Fig. 7. Scheme of the pins differences characterization.

The accuracy of measuring small time intervals is not high, so in a real test the coefficients A and B of equation (6) should be very large. This means that the test time should be much longer than 1105 ps. We also need a special solution to capture the time differences at outputs 1, 2, 3 and 4 in the schematic of Figure 6. This can be done using the random sampling methodology [1,2]. The start pulse is registered at outputs 2 and 4, and the stop pulse is generated at outputs 1 and 3. With a small change to the schematic we obtain a simple way to characterize the propagation time differences between pins 1-3 and 2-3 (Figure 7). The method of calculating the propagation delay time was described above.

### V. Using technology scatter for super stability generation

Another useful application of PHEAM is the construction and characterization of ring oscillator arrays (ROA). In high-precision measurement, generating sequences of impulses with predictable positions is very important; such sequences can be used for time position identification. By using the same number of elements in every ring oscillator of the array, a high-accuracy generator can be built. The schematic realization is illustrated in Figure 8. Started at the same time, the ring oscillators produce the same periodic signal with slight differences, as shown in Figure 9.

Fig. 8. Array of ring oscillators with the same length.

Fig. 9. Dynamical simulation of counter values for each of the ring oscillators.

The FSM increments its internal counter every time a rising edge from an RO appears at its input. As a result, the counter values differ after some simulation time, so every RO module can be characterized with high accuracy. Based on previously collected test data, we can calculate the deviation of the generated frequency for each RO. In other words, we can tell when the periodic signal from an RO is being influenced from outside. This makes it possible to analyze the quality of RO generation and to increase the signal-to-noise ratio. By following the step-by-step increments of the value in the FSM, a generation error is easy to find. Moreover, errors can be corrected during the generation process. For example, in a different application, every value that does not match the test data indicates noise in the schematic. The same principle applies to the test chip characterization process, where a mismatch instead points to geometrical or chemical inaccuracies in the semiconductor technology process. The idea might also be used in the design of a Super Stability On-Chip Generator (SSOCG).
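The counter divergence described in this section can be sketched numerically. The Python example below is a minimal model and not the authors' implementation: each ring oscillator is reduced to a fixed period, the FSM counter is modeled as the number of full periods that fit in an observation window, and the periods, the window length and the tolerance are all hypothetical values chosen for illustration.

```python
def edge_count(period_ps: int, window_ps: int) -> int:
    """Counter value after the window: one increment per full oscillator period."""
    return window_ps // period_ps


def deviating(counts, reference_counts, tolerance=1):
    """Indices of oscillators whose count differs from the calibration
    count by more than `tolerance`."""
    return [
        i
        for i, (c, r) in enumerate(zip(counts, reference_counts))
        if abs(c - r) > tolerance
    ]


WINDOW_PS = 1_000_000_000  # hypothetical 1 ms observation window, in ps

# Hypothetical periods (ps) of three nominally identical ring oscillators,
# used here as the calibration ("previously collected test data") run.
calibration_periods = [20_200, 20_210, 20_195]
reference = [edge_count(p, WINDOW_PS) for p in calibration_periods]

# Later run: oscillator 1 is assumed to be disturbed and runs slightly slower.
measured_periods = [20_200, 20_260, 20_195]
measured = [edge_count(p, WINDOW_PS) for p in measured_periods]

print(reference)                       # counters differ despite equal length
print(measured)
print(deviating(measured, reference))  # indices flagged as disturbed
```

Even though the three oscillators have the same number of elements, their counters drift apart over the window, and a counter that no longer matches its calibration value identifies the disturbed oscillator.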
These facts show, first, that high-accuracy calibration can be done before use in a real device and, second, that dynamic correction of the deviation of the CLOCK signal in each RO may be provided, which gives a powerful solution for future applications.

### VI. Conclusion

This paper introduces a new advanced methodology for CMOS library characterization that is applicable to external and internal components. We built and analyzed a special mathematical model of PHEAM using C tools. During the design of the PHEAM module we created a special test chip for digital library characterization. The PHEAM idea was proved by implementation in FPGA and ASIC devices in 110 nm and 130 nm processes. The older methods were implemented in the new test chip as well, so it combines random sampling<sup>[1~2]</sup> and PHEAM. Two new concepts were also introduced: falling/rising time separation and super stability on-chip generation. Finally, we formulated a new task: the design of a PHEAM module with a hardware implementation of a solver for systems of linear equations, which will provide a significant speed improvement in PHEAM.

### References

- [1] S. Maggioni, A. Veggetti, A. Bogliolo, L. Croce, "Random sampling for on-chip characterization of standard cell propagation delay", Proceedings of the Fourth International Symposium on Quality Electronic Design, pp. 41-45, March 2003.
- [2] S. O. Churayev, B. T. Matkarimov, T. T. Paltashev, "On chip Measurements of Standard Cell Propagation Delay", Proceedings of IEEE East-West Design & Test Symposium (EWDTS'09), pp. 93-95, Moscow, Sep. 18-21, 2009.
- [3] S. K. Thompson, Sampling, 2nd Edition, Wiley, 2002.
- [4] C. Brezinski and M. Redivo Zaglia, Extrapolation Methods: Theory and Practice, North-Holland, 1991.

### About the Authors

Chang Soo Kang (regular member): B.Eng., Kwangwoon University, 1982; M.Eng., Hanyang University, 1986; Ph.D. in engineering, Kwangwoon University, 1992; postdoctoral researcher, Clemson University, 1996. Main research interests: semiconductors, AI, circuit design.

In Ho Im (regular member): B.Eng., Incheon University, 1988; M.Eng., Kwangwoon University, 1990; Ph.D. in engineering, Yonsei University, 2000. Main research interests: semiconductor materials, semiconductor design.