# Optimizing for Low Skew and Phase Error on PLL Based Clock Generators

National Semiconductor Application Note 968
Rahim Ahmed, Louis Malarsie
December 1994

## **ABSTRACT**

This application note discusses techniques used to optimize high speed Phase Locked Loop (PLL) clock distribution networks found in PCs and high end workstations. Static delay along clock paths can vary considerably. By placing a precision delay element in the feedback signal path of the PLL, the system designer gains a method of controlling the output edge placement of the PLL. Parameters that can affect skew and phase error include feedback length, input edge rate, loading and temperature. Data is presented to show how skew and phase error were optimized for a specific customer application in which National's CGS701A was used as a PLL clock distribution driver.

## WHAT IS A PLL?

A basic Phase Locked Loop (PLL) consists of a Phase Detector (PD), Low Pass Filter (LPF), Charge Pump (CP) and Voltage Controlled Oscillator (VCO). When put together, these blocks make up a control loop as shown in *Figure 1*. When the PLL is out of lock, the PD changes the frequency of the VCO in such a way as to minimize the phase difference between the input signal ($V_{REF}$) and the VCO feedback signal ($V_{FBK}$).

The error signal from the PD, in the form of an UP/DOWN signal, is proportional to the phase difference between $V_{REF}$ and $V_{FBK}$. This signal drives the charge pump input voltage higher or lower. If $V_{REF}$ leads $V_{FBK}$ ($V_{REF} > V_{FBK}$), an UP pulse equal to the phase difference is produced that accordingly increases the VCO frequency. Similarly, if $V_{REF}$ trails $V_{FBK}$ ($V_{REF} < V_{FBK}$), a DOWN pulse is produced that decreases the VCO frequency.
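
The UP/DOWN decision described above can be sketched in a few lines of Python. This is a toy model rather than a description of the CGS701A: the function names, the `gain` parameter and the idea of nudging the feedback edge time directly (instead of modelling the charge pump, filter and VCO frequency) are all simplifying assumptions made for illustration.

```python
def phase_detector(t_ref, t_fbk):
    """Compare one reference edge time with one feedback edge time.

    Returns (direction, pulse_width). A positive error means the reference
    edge arrived first (V_REF leads V_FBK), which calls for an UP pulse.
    """
    error = t_fbk - t_ref
    if error > 0:
        return "UP", error
    if error < 0:
        return "DOWN", -error
    return "NONE", 0.0


def settle(t_ref, t_fbk, gain=0.5, steps=20):
    """Toy loop: each pulse moves the feedback edge toward the reference edge.

    Assumption: speeding the VCO up pulls the feedback edge earlier and
    slowing it down pushes the edge later, by gain * pulse_width per step.
    """
    for _ in range(steps):
        direction, width = phase_detector(t_ref, t_fbk)
        if direction == "UP":
            t_fbk -= gain * width
        elif direction == "DOWN":
            t_fbk += gain * width
    return t_fbk


if __name__ == "__main__":
    print(phase_detector(0.0, 2.0))   # reference leads: UP pulse
    print(phase_detector(2.0, 0.0))   # reference trails: DOWN pulse
    print(settle(0.0, 1.0))           # residual error after 20 corrections
```

In this sketch the residual error shrinks by the factor `1 - gain` on every step, which is only meant to convey the corrective direction of the loop, not its real dynamics.
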
This process continues until the loop is locked, which occurs when both the rising and trailing edges of $V_{REF}$ and $V_{FBK}$ are aligned and have the same frequency and phase ($V_{REF} = V_{FBK}$). The phase error just described is also referred to as propagation delay when an output is used to drive the feedback input.

Due to the inherent characteristics and limitations of a PLL, such as mismatches in internal edge rates and internal chip routing, $V_{REF}$ and $V_{FBK}$ are aligned during phase lock only to within some non-zero value. This non-zero value, referred to as part to part skew, must be accounted for by the system designer when the final timing requirements for a system are computed.

Increasingly complex systems, especially those using high frequency multi-processors, require several clocks to be distributed to different loads on one or more boards. Figure 2 illustrates how PLLs can be used to achieve frequency multiplication by placing a divider circuit in the feedback path. The PLL circuitry adjusts the frequencies of $V_{REF}$ and $V_{FBK}$ to be equal, so in this example the VCO must run at $2\times$ the $V_{REF}$ frequency. Similarly, Figure 3 illustrates how using the feedback path increases fanout. The buffer placed in the feedback path allows for additional clock outputs, all phase locked to $V_{REF}$. In each example, the PLL adjusts the output clock to ensure phase and frequency synchronization between $V_{REF}$ and $V_{FBK}$.

## THE CHALLENGES OF CLOCK DISTRIBUTION

Clock distribution is a significant design problem for high frequency systems and plays a major role in the system design. Synchronous systems, as shown in Figure 4, use a single clock source to coordinate the functions of several circuits on the board. This also means that the clocks are now distributed over greater distances, as illustrated by traces 1, 2 and 3 in Figure 4.
The challenge for the system designer is to get the clock signals to the various destinations at very nearly the same time without sacrificing signal integrity or quality. Clock signals with square edges and zero rise and fall times are found only in textbooks. In real systems the low to high output edges do not happen at the same time. The difference in time between the earliest and the latest output transitions among all outputs is called output clock skew.

FIGURE 4. Clock Distribution on a Board

The designer must therefore minimize output clock skew while maximizing clock integrity. This is not always possible because of the intrinsic and extrinsic components that make up output clock skew. Intrinsic delay is attributed to differences in internal chip edge rates, internal chip routing and process variations. Extrinsic delay is attributed to the trace length plus the loading at the end of those traces. The system designer has control over the skew generated by these extrinsic components.

In general, adding capacitance to a clock output increases its propagation delay time. A typical delay number could be 1.0 ns per 50 pF of loading. If, therefore, one output had a 25 pF load and a second a 50 pF load, the result would be about a 500 ps skew. Added capacitance also increases delay through slower output rise and fall times. Similarly, skews due to trace delays on a printed circuit board can be as much as 150 to 200 ps per inch. Two traces, one two inches long and the other three inches long, will therefore result in a skew of up to about 200 ps.

What can be done to address these limitations? The standard approach in most applications has been to use matched trace lengths for clock signals to ensure minimal skew due to different trace lengths. Alternatives are to lay out the shorter traces in a serpentine path or to use a delay element on the clock paths.
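
The rule-of-thumb skew figures quoted above (1.0 ns per 50 pF of loading, and 150 to 200 ps per inch of trace) can be checked with a short calculation. The sketch below is only an estimate under those typical numbers; the function and constant names are hypothetical and real values depend on the driver and the board.

```python
# Typical figures quoted in the text; actual values vary with driver and board.
LOAD_DELAY_PS_PER_PF = 1000.0 / 50.0       # 1.0 ns per 50 pF = 20 ps/pF
TRACE_DELAY_PS_PER_INCH = (150.0, 200.0)   # (low, high) estimate per inch


def load_skew_ps(c1_pf, c2_pf):
    """Skew between two outputs caused only by unequal load capacitance."""
    return abs(c1_pf - c2_pf) * LOAD_DELAY_PS_PER_PF


def trace_skew_ps(len1_in, len2_in):
    """(low, high) skew estimate caused only by unequal trace lengths."""
    delta = abs(len1_in - len2_in)
    low, high = TRACE_DELAY_PS_PER_INCH
    return (delta * low, delta * high)


if __name__ == "__main__":
    print(load_skew_ps(25.0, 50.0))   # 500.0 ps for 25 pF versus 50 pF
    print(trace_skew_ps(2.0, 3.0))    # (150.0, 200.0) ps for 2 in versus 3 in
```

The two contributions are treated independently here; on a real board they add to the intrinsic skew of the driver itself.
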
In general, applications use one or a combination of these techniques (matched trace lengths, serpentine routing or delay elements), but this has not always been easy to achieve because of limited board space and because traces of different lengths behave differently and require special handling. Traces around five inches long, depending on frequency and edge rate, behave like transmission lines and as such require proper termination to avoid reflections and clock signal degradation.

## PLL FEEDBACK DELAY NETWORKS

As has already been described, PLLs maintain the phase and frequency relationship between the input reference and the output clocks by externally hardwiring the feedback output (FDBKOUT) pin to the feedback input (FDBKIN) pin. The PLL circuitry matches the rising edges of the input reference and the output clocks. In essence an ideal zero delay buffer is realized, but in reality there are output loads and propagation delays due to those loads. The goal here is to strive toward that ideal zero phase delay between the input reference and the output pins.

How is this achieved? There is no requirement in a PLL that the feedback connection be hardwired from the FDBKOUT pin to the FDBKIN pin. By placing a delay in this feedback path, the information presented to FDBKIN is delayed relative to FDBKOUT. The PLL circuitry then adjusts the outputs, including the feedback output, until lock is achieved. This delay in the feedback path can be customized by using an RC network and divider, see Figure 5. In other words, by making this delay equal to the phase error between $V_{REF}$ and the worst case clock load, it is possible to zero out this phase error and align $V_{REF}$ and $V_{Load}$.

FIGURE 5. RC Tuning Network

National's CGS701A in *Figure 5* is used to illustrate how this technique was used to optimize phase delay across loads and trace lengths. Before this technique can be used, an understanding of the following issues is required:

1.
The type of clock driver being used and its critical specifications, e.g., output skew and propagation delay.
2. The layout and loading of the clock signals on the board, e.g., short versus long traces.

We will work through each of these steps using the actual clock distribution design example shown in *Figure 6*. It is assumed that the final system timing requirements and the maximum allowable skew for error free operation are known. It is also assumed that data is sampled on the rising edge of the clock.

The CGS701A is a low skew 1 to 8 PLL clock driver, with an external feedback path, providing outputs at $1\times$, $2\times$ and $4\times$ the input. The output skew is less than 500 ps and the input-to-output propagation delay is $\pm 300$ ps (including jitter). In addition, its CMOS output drive capability of $\pm 30$ mA gives the CGS701A an overall performance and flexibility that make it an ideal clock source for this and other high frequency designs.

The layout of the board and the loading on the outputs are shown in Figure 6. The trace length from the 1 to 8 driver to the CGS701A reference input is between 2 and 6 inches. In this particular application, the trace length from each of the eight CGS701A outputs to its load has been kept to approximately four inches. There could be applications where the trace lengths are not the same; this additional trace delay can be compensated for by either phase advancing the longer trace or retarding the shorter trace. Examples showing this and other configurations are given in the results section of this application note.

Variations in output loading increase the output rise and fall times, so clock signals take longer to transition from a TTL high or low to the TTL threshold point of 1.5V.
An analysis of the CGS701A showed that when the FDBKIN input was connected to the FDBKOUT output with a 1/4 inch trace, the delay between $V_{REF}$ and FDBKIN was approximately −400 ps when measured at the DUT, the minus sign indicating that the FDBKIN signal leads $V_{REF}$. The difference in edge-rate between FDBKIN and $V_{REF}$ accounts for much of the −400 ps delay.

In analyzing the RC delay network in the feedback path, it must be remembered that the phase frequency detector (PFD) of the PLL is positive edge triggered. The FDBKOUT signal that drives the PFD, and the equations used to determine the specific phase delay required, are also based on the rising edge. Synchronizing these two edges, $V_{REF}$ and FDBKOUT, and their reference points allows us to proceed with the selection of the components necessary for phase compensation.

EQ1

$$\frac{R1}{R1 + R2} \times V_O = V_I$$

where R1 and R2 are resistor values in $\Omega$, $V_O$ is the CGS701A output $V_{OH}$, and $V_I$ is the reference input swing.

EQ2

$$\frac{\dfrac{R1 \cdot XC_T}{R1 + XC_T}}{R2 + \dfrac{R1 \cdot XC_T}{R1 + XC_T}} \times V_O = V_I$$

Equation 1 above represents a simplified version of a voltage divider; the attenuation desired between $V_I$ and $V_O$ determines the ratio of R1 and R2. In general the value of R2 is in the range from $25\Omega$ to $250\Omega$. Equation 2 shows a more complete version of the voltage divider. Assuming the following conditions:

1. If $XC_T$ is negligible, EQ2 simplifies and the simple voltage divider shown in EQ1 can be used.
2. If $R1 \gg R2$ and $R1 \gg XC_T$, the delay network shown in Figure 7 simplifies to an $R2\,C_T$ network. This allows equation 3 shown below to be used to determine the desired phase adjustment "dt".
EQ3

$$V_T = V_I \left[1 - e^{-t/(R2\,C_T)}\right]$$

where $V_T$ is the PFD switching threshold, $V_I$ is the reference input swing, R2 is the resistance in $\Omega$, t is the phase shift required, and $C_T$ is the capacitance in pF.

FIGURE 6. Trace Length and Loading

FIGURE 7. RC Delay Network (FDBKOUT to FDBKIN, phase adjustment dt)

The phase adjustment process requires several iterations to arrive at the final values of R1, R2 and $C_T$, where $C_T$ includes the input trace capacitance plus any additional capacitance required for the phase shift. It is left up to the user to use either equation 1 or 2 to determine these values. It is also recommended that the user validate the equations and assumptions by verifying and fine tuning on the bench. In general, the components used must have tight tolerance (<1%), and prudent PCB layout techniques must be used to minimize sources of skew and phase error.

## **PHASE OPTIMIZATION RESULTS**

The results of the phase optimization for the clock distribution design shown in Figure 6, using the delay network, are shown in this section. In addition, data is shown that can be used as a guideline by any system designer for any application using a PLL with an external feedback path.

The phase optimization process is not complete until it is understood how various other parameters can affect the results. We have attempted to provide data that shows how the following variables can affect the optimization process and phase delay: $V_{CC}$, frequency, temperature, $V_{REF}$ edge-rate, different loads and load distances.

For the application considered here, the system designer's propagation delay requirement was $\pm 500$ ps. Figure 5 shows the customized delay circuit used for optimization and subsequently used by the system designer in the design.
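
As a starting point for the bench iterations, Equation 3 can be inverted to give the delay directly: t = −R2 · C_T · ln(1 − V_T / V_I). The sketch below assumes the simplified R2·C_T network driven by an ideal step. The 1.5 V threshold, the 3 V swing, the 20 pF capacitance and the 400 ps target are illustrative assumptions, not measured CGS701A values, and the function names are hypothetical.

```python
import math


def rc_delay_ps(r2_ohm, ct_pf, v_t, v_i):
    """Delay t from EQ3 solved for t; ohms times picofarads gives picoseconds."""
    if not 0.0 < v_t < v_i:
        raise ValueError("threshold must lie between 0 and the input swing")
    return -r2_ohm * ct_pf * math.log(1.0 - v_t / v_i)


def ct_for_delay_pf(r2_ohm, t_ps, v_t, v_i):
    """Capacitance (pF) that produces a wanted delay t_ps for a given R2."""
    if not 0.0 < v_t < v_i:
        raise ValueError("threshold must lie between 0 and the input swing")
    return -t_ps / (r2_ohm * math.log(1.0 - v_t / v_i))


if __name__ == "__main__":
    # Assumed values for illustration only.
    R2_OHM, V_T, V_I = 25.0, 1.5, 3.0
    print(round(rc_delay_ps(R2_OHM, 20.0, V_T, V_I), 1))       # about 347 ps
    print(round(ct_for_delay_pf(R2_OHM, 400.0, V_T, V_I), 1))  # about 23 pF
```

Because the threshold here is half the swing, the delay reduces to R2 · C_T · ln 2. Values obtained this way are a first estimate that still needs bench tuning, since trace capacitance and finite edge rates are ignored.
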
Table I shows the results of phase optimization under the following conditions:

- 1 inch jumper
- Room temperature
- Load at 4 inches from the CGS701A output (see Figure 6)
- Tuning components: R1 = $50\Omega$, R2 = $25\Omega$

**TABLE I**

| Reference Frequency | $V_{CC}$ (V) | Optimized Prop Delay |
|---------------------|--------------|----------------------|
| 25 MHz | 4.5 | ±260 ps |
| 25 MHz | 5.5 | ±210 ps |
| 33 MHz | 4.5 | ±265 ps |
| 33 MHz | 5.5 | ±210 ps |
| 40 MHz | 4.5 | ±225 ps |
| 40 MHz | 5.5 | ±205 ps |

The effect of temperature ($-40^{\circ}$C to $+85^{\circ}$C) on phase shift can be anywhere from 50 ps to 150 ps. This shift, as can be seen from Table I, is about the same over the different input reference frequencies and supply voltages.

Table II details the impact on phase delay of varying the input reference signal edge-rate, keeping its amplitude at 3V and the feedback output rise time between 1.5 ns and 2.0 ns. Data is for 25°C, using the same loads as shown in Figure 6. With slower edge-rates the phase delay increases. In these examples, for an input reference with an edge rate of 2 ns, a phase delay of 50 ps (effectively zero delay) is realized. Conversely, for a 6 ns edge-rate the phase delay is considerable at 2.6 ns. If this were the actual edge-rate used in an application, it would be possible to compensate for this phase delay using the RC network described earlier.

**TABLE II**

| $V_{REF}$ Rise Time | Phase Delay ($V_{CC}$ = 5.0V) |
|---------------------|-------------------------------|
| 2.0 ns | 50 ps |
| 4.0 ns | 1.63 ns |
| 6.0 ns | 2.6 ns |

Table III below illustrates how phase delay is affected by the physical length of the feedback. The phase delay, for the loading described, is shown under two conditions: 1) with a straight jumper in the feedback path and 2) with an RC delay network in the feedback path.
Comparing the results shows that essentially zero phase delay (70 ps) has been achieved using the RC delay network in the feedback path. Additional data, under the same conditions, shows what happens to phase delay when six and twelve inch feedback lengths are used.

**TABLE III**

Load: $200\Omega/100\Omega$, see *Figure 6*. $V_{CC}$ = 5V, Freq = 40 MHz, Temp = 25°C.

| Feedback Length | Phase Delay, Jumper | Phase Delay, R1 = 25 and R2 = 50 |
|-----------------|---------------------|----------------------------------|
| 1 Inch | 210 ps | 70 ps |
| 6 Inch | −400 ps | −1.4 ns |
| 12 Inch | −1.7 ns | −2.0 ns |

Table IV illustrates what happens to the phase delay when the loads are at different distances from the clock output pins. The results, for a given $500\Omega$/50 pF load, show a dramatic increase in phase delay between a load positioned three inches away and another twenty inches away. By selecting the correct delay components, adjustments can be made to zero out this large 1.6 ns phase delay.

**TABLE IV**

Load: $500\Omega$ and 50 pF. $V_{CC}$ = 5V, Freq = 40 MHz, Temp = 25°C.

| Feedback Length | Load Distance | Tuned Phase Delay |
|-----------------|---------------|-------------------|
| 2 Inch | 3 Inch | −860 ps |
| 2 Inch | 7 Inch | 620 ps |
| 2 Inch | 20 Inch | 1.6 ns |

## CONCLUSION

This application note has discussed the high frequency challenges of clock distribution. We have shown how varying trace lengths and loads at the end of clock traces can affect propagation delays. Using a real life system application, we have shown how a customized delay element in the feedback path allowed us to optimize propagation delays to within $\pm 250$ ps.
This flexibility to optimize for low input-to-output phase delay in any application gives PLLs a performance advantage that makes high frequency clock distribution applications very attractive for the system designer.

For a copy of the CGS Design Databook (1995), please call our toll free hotline (800) 272-9959 and order Literature # 400046.