High-Speed CMOS Dual-Modulus Prescalers for Frequency Synthesis by Ranganathan Desikachari

High-Speed CMOS Dual-Modulus Prescalers for Frequency Synthesis
by
Ranganathan Desikachari
A THESIS
submitted to
Oregon State University
in partial fulfillment of
the requirements for the
degree of
Master of Science
Presented October 1, 2003
Commencement June 2004
ACKNOWLEDGMENTS
During the course of my graduate study over the past two years at Oregon State
University, several people have inspired and influenced my life. While the list of my
well-wishers and benefactors runs long, I hope to express my gratitude to all
those whose help and support made this thesis possible.
First and foremost, I wish to thank my research advisor Professor Un-Ku Moon
for providing me the opportunity to work on this research project. Over the past two
years, our several stimulating discussions, both technical and non-technical, have been a
constant source of inspiration. I believe I have imbibed a lot of values in life during my
research and teaching assistantship tenures with him. I am grateful to Mark Steeds, at
National Semiconductor, for being a huge source of support and encouragement. Without
his resourceful advice and kind help, it would not have been possible for me to fabricate
and test this chip within the time constraints.
I thank National Semiconductor Corp. for supporting this project and for fabricating the chip. Jeff Huard and Bijoy Chatterjee were instrumental in encouraging
and supporting this research endeavor and I express my heartfelt thanks to them. I am
grateful for all the help and useful suggestions extended out by engineers of the Wireless
Products group at NSC, Tacoma - in particular, Mark Steeds, Mike Harris, Mike Viafore,
Dan Suckow and Rodney Hughes for sharing their valuable design and layout experience
during my several project update reviews. I would also like to thank all the committee
members - Dr. Karti Mayaram, Dr. Huaping Liu and Dr. Joseph Nibler for sparing the
time to serve on my defense committee.
Having worked with the students of both the research labs (Owen 245 and Dearborn 211/212) over the past two years, I have several people to thank for their friendship
and cooperation. The analog circuit design research group at Owen 245 has provided a
scintillating environment that has fostered my growth as a circuit designer. Pavan Hanumolu, José Silva, Gil-Cho Ahn, José Ceballos, Jipeng Li, Anurag Pulincherry, Yoshio
Nishida and Min-Gyu Kim have been such great friends and mentors, that I feel honored to have had the opportunity to work with each of them. Pavan, Gil-Cho and José
have provided valuable feedback and suggestions that have helped me many a time in my
research. Gowtham deserves a special mention for all the interesting discussions we have
had during our parallel graduate work over the past two years. I thank Vova for putting
up with me both in the lab and at home, as well as for being a great information resource.
I thank Yoshio and José for several interesting discussions regarding measurements on
my chip and offering their kind help with preparing my thesis. What I have learnt from
my experienced colleagues has enriched my knowledge and will certainly benefit me in
my career.
I am grateful to Gowtham, Sirisha, Patrick, Husni, KP, Manu, Raghu, Manas,
Yuhua, Trimmy and my other colleagues in Dearborn 211 for their warm friendship and
help on innumerable occasions. I owe my deep gratitude to several friends in the AMS
lab who helped me get accustomed to the rigors of graduate studies - Neel Seshan,
Vinay Chandrashekar, Madhu Chennam, and Ravi Suravarapu, to name a few. I thank
my apartment-mates Rajan and Ajit for the several memorable experiences that we have
shared over the past two years.
Words cannot suffice to thank my family for all that they have done for me. I owe
whatever I am as a person largely to the values instilled in me by my mother, father and
sister. I thank them for being a great source of encouragement and support.
Above all, I thank God for everything in my life.
TABLE OF CONTENTS
1. INTRODUCTION
  1.1. Motivation
  1.2. Thesis Organization
2. PLL-BASED FREQUENCY SYNTHESIZERS
  2.1. Introduction to Frequency Synthesizers
  2.2. Characterization of Frequency Synthesizers
    2.2.1. Frequency Range
    2.2.2. Spectral Purity
    2.2.3. Transient Response Requirements
  2.3. PLL System Description
    2.3.1. Basic Operation of PLL
    2.3.2. PLL Loop Dynamics
  2.4. Frequency Synthesizer Architectures
    2.4.1. Static-Moduli Frequency Synthesizers
    2.4.2. Integer-N Frequency Synthesizers
    2.4.3. Fractional-N Synthesizers
3. DUAL-MODULUS PRESCALERS
  3.1. Dual-Modulus Operation
  3.2. Pulse-Swallow Architecture
  3.3. Technology Comparison - Bipolar Vs CMOS
  3.4. Current Mode Operation
    3.4.1. Speed-Power Advantage
    3.4.2. Common-Mode Noise Suppression
    3.4.3. Substrate Coupling
  3.5. Pulse-Swallow Feedback Delays
  3.6. Ring-Oscillator Speed Analysis
4. ANALYSIS, CIRCUIT DESIGN AND IMPLEMENTATION
  4.1. 8/9 Dual Modulus Prescaler Operation
  4.2. Design Considerations
    4.2.1. Voltage Swing
    4.2.2. Current Consumption
    4.2.3. Transistor Sizing
  4.3. Implementation Of Pulse-Swallow Logic
  4.4. Asynchronous Flip-Flop
  4.5. RF Buffer
    4.5.1. CMOS RF Buffer
    4.5.2. BiCMOS RF Buffer
  4.6. Output Buffer
  4.7. Layout Considerations
    4.7.1. Symmetry Considerations
    4.7.2. Synchronous Divider Floorplan
    4.7.3. Minimization of Interconnect Capacitance
5. SIMULATION AND MEASUREMENT RESULTS
  5.1. SpectreS Simulations
  5.2. Measurement Set-Up
  5.3. Measurement Results
  5.4. Conclusions
BIBLIOGRAPHY
LIST OF FIGURES
1.1 A typical integer-N PLL frequency synthesizer.
2.1 Frequency synthesizer operation in a generic transceiver.
2.2 PLL-based frequency synthesizer.
2.3 Phase noise and spurious tones.
2.4 Modulus switching transient model
2.5 Phase-locked loop (a) block diagram, (b) linear model.
2.6 Charge-Pump Phase-Locked Loop
2.7 (a) Classical static modulus divider; (b) Modified static modulus divider
2.8 Fractional-N synthesis based on (a) pulse removal, (b) dual-modulus prescaler
3.1 Pulse-swallow integer-N frequency synthesizer.
3.2 2/3 dual-modulus divider (a) divide-by-2; (b) divide-by-3 circuit; (c) 2/3 dual-modulus divider.
3.3 (a) CMOS ring oscillator; (b) Switching spikes in a CMOS inverter.
3.4 Principle of current-mode logic.
3.5 Substrate current injection in (a) CMOS, (b) CML.
3.6 Analogy between (a) ring oscillator and, (b) frequency divider.
4.1 8/9 Dual-Modulus Prescaler System.
4.2 Timing diagram explanation of pulse-swallow operation.
4.3 RC time-constant linear delay model in CML operation.
4.4 Current-Mode D-Latch.
4.5 Ring oscillator simulation comparisons.
4.6 Optimized D flip-flop
4.7 (a) Parallel pull-down latch structure for gated flip-flop. (b) Fully symmetric flip-flop gating.
4.8 Asynchronous divider.
4.9 CMOS RF buffer.
4.10 Clock waveforms: CMOS buffer output.
4.11 BiCMOS RF Buffer.
4.12 BiCMOS Output Buffer
4.13 Flip-Flop3 with dummy devices to maintain signal symmetry.
4.14 Floor plan to optimize pulse-swallow feedback delays.
4.15 Top-level chip snap-shot.
4.16 Layout.
5.1 Die photograph of OSU-Prescaler test chip.
5.2 Top-level of dual-modulus prescaler implementation.
5.3 Prescaler output waveform for slow process/hot temperature corner.
5.4 Prescaler output waveforms for 0 dBm input, fast/cold operating corner.
5.5 Dual-modulus division operation breakdown.
5.6 Test setup.
5.7 Operating frequency variation with input signal levels.
5.8 Measured operation at 2.1 GHz, 2.1 mA prescaler current, -16 dBm input signal.
5.9 Measured operation at 1.85 GHz, 1.3 mA prescaler current, -16 dBm input signal.
LIST OF TABLES
4.1 Current-Speed relation.
5.1 Chip performance over 21 samples.
HIGH-SPEED CMOS DUAL-MODULUS PRESCALERS
FOR FREQUENCY SYNTHESIS
CHAPTER 1. INTRODUCTION
Frequency synthesizers are critical components for frequency translation and channel selection in wireless transceivers. Synthesizer design is a challenging task due to the
stringent requirements imposed by RF systems.
This chapter provides an overview of the concepts of PLL-based frequency synthesis and the significance of variable-moduli prescalers.
1.1. Motivation
Oscillators and frequency synthesizers are key elements in a radio system that
provide a controlled frequency source for receive signal down conversion and transmit
signal up conversion. A stand-alone oscillator or a voltage controlled oscillator (VCO)
does not have the required frequency stability to satisfy the low phase noise requirement.
Therefore frequency synthesis is necessary to obtain accurate high frequencies from a
precise low frequency crystal oscillator. Phase locked loops (PLLs) are often used to
provide negative feedback in frequency synthesizers to suppress the phase noise due to
oscillators.
Figure 1.1 shows a generic PLL-based synthesizer. The reference divider ÷R can
be used to scale highly accurate crystal based input frequencies down to desired levels
for the PLL module. The PLL consists of a phase-frequency detector (PFD) and a
loop filter (LF) apart from the VCO. The operation of the PLL and the programmable
counter in the feedback path allow generation of accurate high frequencies from a pure
low frequency signal. The programmable P-counter is usually preceded by a prescaler
(÷N) that scales down the high output frequencies to a range at which standard CMOS
dividers can be implemented.
A dual-modulus division gives the flexibility to select channels on the basis of
the number of times each of the moduli is selected. The dual-modulus prescaler could
also be used in the feedback to obtain fractional output frequencies. Such synthesizer
architectures, called fractional-N frequency synthesizers, allow the reference frequencies
to be higher. PLL stability considerations require loop bandwidths to be a fraction of
the reference frequency. So fractional-N synthesizers allow for higher loop bandwidths,
resulting in faster output settling and lower oscillator noise characteristics.
[Figure 1.1: block diagram with fref, ÷R, PFD, LF, VCO, a ÷N prescaler with modulus control, and a P counter in the feedback path; fout = N·fref/R.]
FIGURE 1.1: A typical integer-N PLL frequency synthesizer.
In several RF systems, the synthesizer is commonly partitioned into three separate
chips: the VCO; the dual-modulus prescaler; and the channel selection logic (program
counters and swallow counters) along with the phase/frequency detectors. As the VCO
and the prescaler operate at the highest frequencies in the loop, they are usually fabricated in silicon
bipolar or GaAs technologies and the rest in CMOS technology. Although bipolar and GaAs technologies support faster transistors, they are not very cost-effective. With the growing
push towards integration of the entire synthesizer, research progress is being gradually
made towards obtaining comparable performance with CMOS processes.
In order to tackle the multiple challenges presented by the requirements of wireless
systems, it is necessary to understand the PLL frequency synthesizer at the system-level
before analyzing and designing dual-modulus prescalers. These system level issues and
the transistor-level implementation of dual-modulus prescalers have been addressed in
this thesis.
1.2. Thesis Organization
To provide a system-level understanding of PLL-based frequency synthesizers, it is
essential to discuss the features of commonly used structures. Chapter 2 gives an overview of frequency
synthesizer architectures and their characteristics, as well as a brief review of phase-locked loops. Chapter 3 deals with the details of the
pulse-swallow topology for dual-modulus division. Chapter 4 discusses the design considerations
and the actual implementation of the dual-modulus prescaler, and also presents the design
challenges posed by the input/output buffers that interface this circuit to the outside
world. Chapter 5 concludes the thesis with a compilation of the
simulation and measurement results.
CHAPTER 2. PLL-BASED FREQUENCY SYNTHESIZERS
With the boom of the wireless communications market, wireless transceivers have
become ubiquitous. At the heart of every transceiver is a frequency synthesizer, required
for accurate channel selection. This chapter is an introduction to the concept of PLL-based frequency synthesis.
2.1. Introduction to Frequency Synthesizers
A transceiver (transmitter-receiver) is the building block that interfaces between
the end user and the transmission medium. The transceiver consists of three blocks [1]:
the user-end block, which interfaces between the user information and the digital data representation; the back-end block, which modulates and demodulates the digital data to and from the analog
baseband signal suited to the transmission technique used (QPSK, GMSK, etc.);
and the front-end block, which performs the transmission, reception and frequency conversions.
Modern communication protocols typically allocate closely spaced channels at very
high frequencies. For example [2], Bluetooth, a short-range wireless protocol, allocates
79 channels from 2.402 GHz to 2.480 GHz, resulting in a 1 MHz channel spacing. The
phase noise from the local oscillator (LO) needs to be low enough not to interfere with frequencies in
the adjacent channels. However, a stand-alone oscillator with sufficiently high Q will
not be tunable over a 79 MHz band. Additionally, even crystals do not have resonance
frequencies as high as 2.4 GHz.
The above discussion illustrates the need for a block that synthesizes many discrete
frequencies from one or more fixed reference frequencies. Figure 2.1 illustrates this in
the context of a generic transceiver.
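[Figure 2.1: transceiver block diagram. Receive path: duplexer filter, band-pass filter, LNA, mixer. Transmit path: mixer, band-pass filter, power amplifier. A frequency synthesizer, with channel selection by digital control, drives both mixers.]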
FIGURE 2.1: Frequency synthesizer operation in a generic transceiver.
As many discrete frequencies need to be generated it is impractical to have a
reference frequency for each. Ideally, the reference frequency is a single spectrally pure
frequency, typically generated from a piezo-electric crystal. This leads to the idea of a
control input to generate required frequencies from a single reference.
Phase-locked loops (PLLs) are negative feedback systems whose output frequency can be digitally controlled with the help of a precise clock at their input as reference.
The VCO phase noise is suppressed within the loop bandwidth by the negative feedback. High output frequencies can be obtained from accurate references by frequency
division in the feedback path. PLLs are therefore well suited to frequency synthesis.
Figure 2.2 shows a typical PLL-based implementation of a frequency synthesizer. This is
also referred to as an integer-N synthesizer, discussed in detail in later sections.
[Figure 2.2: block diagram of a phase-locked loop with fref, ÷R, PFD, LF, VCO, a ÷N prescaler with modulus control, and a P counter in the feedback path; fout = N·fref/R.]
FIGURE 2.2: PLL-based frequency synthesizer.
2.2. Characterization of Frequency Synthesizers
Modern wireless standards impose several stringent requirements which challenge
the design of frequency synthesizers. This is exemplified by the Bluetooth protocol
mentioned in the previous section. Some of the important parameters that characterize
the performance of synthesizers are highlighted in this section.
2.2.1. Frequency Range
The range of frequencies generated by the synthesizer is defined by the wireless
standard (900 MHz, 1.9 GHz, 2.4 GHz etc.). In most cases, the frequency must be varied
in small increments determined by the channel spacing. This frequency resolution could
be as low as 30 kHz. A specification on the output frequency accuracy, as well as on
the channel’s upper and lower edges requires the error of the synthesizer to be less than
a few parts per million. The generated output frequencies could experience short term
(drift) or long term variations (aging) due to the environment. So a frequency stability
specification is usually defined with respect to time, temperature, power supply etc.
2.2.2. Spectral Purity
The spectral purity of the synthesized output is degraded by nonidealities in
the components of the PLL. The ideal output spectrum of a frequency synthesizer should
be a single tone at the desired frequency in order to provide the reference frequency for
frequency translation. A single tone in the frequency domain is equivalent to a pure
sinusoidal waveform in the time domain. The random and systematic amplitude and
phase deviations from the desired values produce energy in frequencies other than the
desired frequency. When this energy is mixed with the received RF signal or modulated
baseband signal, undesired sidebands are created. Phase noise and spurious tones are the
two key parameters to measure the quality of a frequency synthesizer. In this section,
the effects of phase noise and spurious tones on a transceiver have been investigated.
Phase noise is the phenomenon of phase disturbance of oscillators and has been
modeled and described extensively in the literature [3, 4, 5, 6]. The ideal synthesizer has a
pure sinusoidal waveform as given by Eq. (2.1).
\[ v(t) = V_0 \cos(2\pi f_0 t) \tag{2.1} \]
When amplitude and phase fluctuations are included, the waveform becomes
\[ v(t) = [V_0 + \epsilon(t)] \cos[2\pi f_0 t + \phi(t)] \tag{2.2} \]
where ε(t) represents an amplitude fluctuation and φ(t) represents a phase fluctuation.
Because amplitude fluctuations can be removed or greatly reduced by a limiter, phase
modulation is a bigger concern in frequency synthesizer design. The phase fluctuations
could arise in three different ways: systematic variations, due to aging of the resonator
material; deterministic periodic variations due to unwanted phase or frequency modulations in the PLL, and random variations due to noise sources such as thermal, shot or
flicker noise in the devices.
The phase noise of oscillators or synthesizers is measured as the ratio of the noise
power in a 1 Hz bandwidth at a certain offset frequency from the carrier to the
power of the carrier:
\[ \phi(f) = 10 \log_{10} \frac{P_{noise}}{P_{carrier}} \quad \text{(dBc/Hz)} \tag{2.3} \]
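As a small numerical illustration of Eq. (2.3), the sketch below converts a noise-to-carrier power ratio into dBc/Hz. The function name and the power values are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this work.

```python
import math

def phase_noise_dbc_per_hz(noise_power_w, carrier_power_w):
    """Eq. (2.3): noise power in a 1 Hz bandwidth at some offset,
    relative to the carrier power, expressed in dBc/Hz."""
    return 10.0 * math.log10(noise_power_w / carrier_power_w)

# Hypothetical values: 1 mW carrier, 1 pW of noise in a 1 Hz bandwidth.
example = phase_noise_dbc_per_hz(1e-12, 1e-3)
print(f"{example:.1f} dBc/Hz")
```

A ratio of 1e-9 between the noise and carrier powers corresponds to about -90 dBc/Hz.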
Although the loop components suffer from the above mentioned noise sources, the
two important contributors of phase noise in a PLL are the input reference noise and
the VCO noise. The PLL loop bandwidth is a design parameter that is determined on
the basis of the dominant noise source. A linear model of PLLs has been discussed in a
later section.
As shown in Figure 2.3, the sidebands caused by the phase modulation appear
as a phase noise skirt. Sometimes energy is concentrated at frequencies other than the
desired frequency, resulting in spurious tones appearing as spikes above the phase
noise skirt. These tones are usually artifacts of reference frequency feed-through due to
charge-pump nonidealities or due to any periodicity introduced by the modulus-selection
operation. Some standard strategies used to alleviate the problem of spurious tones are
to use larger loop filter capacitors, notch filters to suppress reference feed-through and
the use of delta-sigma modulators to shape the noise due to modulus-selection out of the
band of interest [7, 8].
2.2.3. Transient Response Requirements
As shown in the phase-locked architecture of Figure 2.2, a modulus change by the
control signal results in a loop transient. Every time a different channel is selected,
the PLL needs to lock to the new frequency. The lock time of synthesizers is an especially
critical parameter in fast frequency-hopped spread-spectrum systems.
[Figure 2.3: output spectrum sketch showing the desired frequency, a phase noise skirt on either side of the desired oscillation frequency, and spurious tones.]
FIGURE 2.3: Phase noise and spurious tones.
An interesting analysis to model the loop settling behavior of PLL architectures
has been carried out in [1]. The modulus change in the system can be modeled with
a simple feedback topology as illustrated in Figure 2.4. Suppose the divider modulus
(N) variation corresponds to a small step ε in the feedback factor. The closed-loop
output of this system is then
\[ Y(s) = \frac{H(s)}{1 + (N + \epsilon) H(s)} X(s) \approx \frac{H(s)}{1 + N H(s)} \cdot \frac{1}{1 + \epsilon/N} X(s) \approx \frac{H(s)}{1 + N H(s)} \left(1 - \frac{\epsilon}{N}\right) X(s) \tag{2.4} \]
where H(s) is the phase-domain transfer function of the PLL. The first approximation holds when the loop gain is large (|N H(s)| ≫ 1), and the second when ε ≪ N. The above relation implies
that the modulus change is equivalent to multiplying the input by (1 − ε/N). The output
frequency's response can thus be estimated with conventional second-order transient
equations to a step input.
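The approximation in Eq. (2.4) can be checked numerically. The following is a minimal sketch that evaluates the exact and approximate closed-loop gains at a single frequency where the loop gain is large; the values of H, N and ε are hypothetical and only illustrate the size of the error.

```python
def closed_loop_gain(h, n, eps=0.0):
    """Exact gain Y/X for a forward path h and feedback factor n + eps."""
    return h / (1.0 + (n + eps) * h)

def step_approximation(h, n, eps):
    """Approximate form of Eq. (2.4): nominal gain scaled by (1 - eps/n)."""
    return closed_loop_gain(h, n) * (1.0 - eps / n)

# Hypothetical operating point: large forward gain, modulus 64 stepped by 1.
h, n, eps = 1.0e4, 64, 1
exact = closed_loop_gain(h, n, eps)
approx = step_approximation(h, n, eps)
rel_error = abs(exact - approx) / abs(exact)
print(f"exact={exact:.6e} approx={approx:.6e} relative error={rel_error:.2e}")
```

For these values the relative error is on the order of (ε/N)², which is why the step in modulus can be treated as a simple scaling of the input.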
[Figure 2.4: feedback block diagram with input X(s), forward path H(s), output Y(s), and a feedback factor N + ε subtracted at the input summing node.]
FIGURE 2.4: Modulus switching transient model
2.3. PLL System Description
Since the frequency synthesizers are based on the phase-locking principle, a typical
PLL system has been described in this section. PLLs are negative feedback systems that
operate on the excess phase of nominally periodic signals. Their function is to lock the
frequency and phase of their two input signals with as small an error as possible.
2.3.1. Basic Operation of PLL
The simplest PLL, shown in Figure 2.5(a), consists of a phase detector (PD), a low-pass filter (LPF) and a voltage-controlled oscillator (VCO). The PD serves as an error
amplifier in the feedback loop, and tries to minimize the phase difference ∆Φ between
X(t) and Y(t). The loop is considered “locked” if ∆Φ is constant with time.
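[Figure 2.5: (a) block diagram: X(t), phase detector, low-pass filter, VCO, Y(t), with Y(t) fed back to the phase detector; (b) linear model: X(s), summing node, K_PD, G_LPF(s), K_VCO/s, Y(s), with Y(s) fed back negatively.]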
FIGURE 2.5: Phase-locked loop (a) block diagram,(b) linear model.
In the locked condition, the PD produces an output proportional to the phase
error between the input and output signals. The LPF averages out this error so that a
voltage error ∆V corresponding to the phase error is built up at the input of the VCO.
The output frequency is modulated to minimize the phase and frequency error.
2.3.2. PLL Loop Dynamics
Although the PLL transient is a nonlinear process, a linear approximation can
be used to arrive at a phase-domain PLL model that gives intuitive insight into the
tradeoffs involved in the design. From the linear PLL model of Figure 2.5(b), the closed-loop transfer function, also known as the jitter transfer function, is given by
\[ H(s) = \frac{\Phi_{out}(s)}{\Phi_{in}(s)} = \frac{K_{PD} K_{VCO} G_{LPF}(s)}{s + K_{PD} K_{VCO} G_{LPF}(s)} \tag{2.5} \]
If \( G_{LPF}(s) = \frac{1}{1 + s/\omega_{LPF}} \), then
\[ H(s) = \frac{K_{PD} K_{VCO}}{\frac{s^2}{\omega_{LPF}} + s + K_{PD} K_{VCO}} \tag{2.6} \]
From the above equations, it can be shown [1, 3] that the finite phase error is
inversely proportional to the loop gain K_PD K_VCO. Further, the above simple phase
detector and low-pass filter combination does not give independent control of the loop bandwidth and damping factor. These limitations necessitate the use of charge-pump
PLLs (CPLLs). A typical CPLL schematic is shown in Figure 2.6.
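The coupling between bandwidth and damping can be made concrete by writing Eq. (2.6) in the standard second-order form, which gives a natural frequency of sqrt(K_PD K_VCO ω_LPF) and a damping factor of (1/2) sqrt(ω_LPF / (K_PD K_VCO)). The sketch below is only an illustration: the function name and the numerical values are hypothetical.

```python
import math

def second_order_parameters(k_loop, w_lpf):
    """Natural frequency and damping factor of Eq. (2.6).

    k_loop is the loop gain K_PD*K_VCO (rad/s); w_lpf is the filter
    corner frequency (rad/s). Both follow from matching Eq. (2.6) to
    s^2 + 2*zeta*w_n*s + w_n^2 in the denominator.
    """
    w_n = math.sqrt(k_loop * w_lpf)
    zeta = 0.5 * math.sqrt(w_lpf / k_loop)
    return w_n, zeta

# Hypothetical values: raising the loop gain by 4x with a fixed filter corner.
w_n_low, zeta_low = second_order_parameters(1.0e5, 4.0e5)
w_n_high, zeta_high = second_order_parameters(4.0e5, 4.0e5)
print(w_n_low, zeta_low)
print(w_n_high, zeta_high)
```

With the filter corner fixed, increasing the loop gain to reduce the phase error also raises the natural frequency and lowers the damping factor, so the two cannot be set independently.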
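[Figure 2.6: charge-pump PLL schematic: a PFD comparing X(t) and Y(t) produces U and D signals that control switches S1 and S2 of two current sources Icp; the resulting control voltage Vctrl, filtered by R, C1 and C2, drives the VCO, whose output is Y(t).]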
FIGURE 2.6: Charge-Pump Phase-Locked Loop
This CPLL structure has been analyzed extensively in the literature [9, 10]. The
salient features of this PLL structure have been discussed below without digressing into
the mathematical details of the system.
The simple phase detector has been replaced by a phase-frequency detector (PFD).
The PFD is a circuit that can detect both phase and frequency difference between the
input and output signals. This ability is a consequence of realizing the PFD using
digital sequential logic that compares clock edges instead of continuous input-output
phase comparisons. In order to convert the phase error pulses into a control voltage,
charge-pumps are used to charge up/down the loop filter depending on the state of the
PFD control signals. By virtue of incorporating a charge-pump and loop filter to average
the phase error pulses, the CPLL has ideally infinite DC gain, making the phase error
negligibly small. The loop filter capacitance C1 and the VCO each contribute one pole
at the origin in the open-loop transfer function. The resistance R is, therefore, introduced
to add a zero that compensates for the net phase shift around the loop. This implementation of the
filter gives the additional advantage of independent control of the loop-bandwidth and
damping factor [1].
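The independent control mentioned above can be illustrated with the commonly used continuous-time, second-order approximation of a charge-pump PLL. This is a textbook approximation rather than a result derived in this thesis, and it assumes that C2 is neglected, that K_VCO is expressed in rad/s/V, and that an optional feedback divider n_div is present. All names and component values below are hypothetical.

```python
import math

def cpll_second_order(i_cp, k_vco, r, c1, n_div=1):
    """Second-order CPLL approximation (assumption: C2 neglected).

    w_n  = sqrt(i_cp * k_vco / (2*pi * c1 * n_div))
    zeta = (r * c1 / 2) * w_n
    """
    w_n = math.sqrt(i_cp * k_vco / (2.0 * math.pi * c1 * n_div))
    zeta = 0.5 * r * c1 * w_n
    return w_n, zeta

# Hypothetical component values.
i_cp = 100e-6                    # charge-pump current, A
k_vco = 2.0 * math.pi * 100e6    # VCO gain, rad/s/V
c1 = 100e-12                     # loop filter capacitor, F

w_n_a, zeta_a = cpll_second_order(i_cp, k_vco, 2e3, c1)
w_n_b, zeta_b = cpll_second_order(i_cp, k_vco, 4e3, c1)
print(w_n_a, zeta_a)
print(w_n_b, zeta_b)
```

Under these assumptions, changing R alters only the damping factor, while the natural frequency is set by the charge-pump current, the VCO gain and C1.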
Most PLL-based synthesizers use this basic topology with several variations in the
design of the individual building blocks. The switching introduced by the digital blocks
and the charge pumps can affect the purity of frequency synthesizer outputs. This has
been discussed in later sections.
2.4. Frequency Synthesizer Architectures
Output frequencies of frequency synthesizers vary in discrete steps corresponding
to the channel spacing i.e. fout = fo + k ·fch where fo is the lower limit of frequency. It
was shown in Figure 2.1 that the ‘k’ is selected by digital control. The need for output
frequencies to be accurately locked to a particular channel mandates the use of a PLL.
Some popular architectures for frequency synthesis and their features are discussed here.
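The relation fout = fo + k·fch can be illustrated with the Bluetooth channel plan quoted in Section 2.1 (79 channels from 2.402 GHz with 1 MHz spacing). The function name below is hypothetical and frequencies are kept as integers in Hz to avoid rounding.

```python
def channel_frequencies_hz(f_o_hz, f_ch_hz, num_channels):
    """Return f_out = f_o + k * f_ch for k = 0 .. num_channels - 1."""
    return [f_o_hz + k * f_ch_hz for k in range(num_channels)]

# Bluetooth example from Section 2.1: 79 channels, 1 MHz apart, from 2.402 GHz.
bluetooth = channel_frequencies_hz(2_402_000_000, 1_000_000, 79)
print(len(bluetooth), bluetooth[0], bluetooth[-1])
```

The last channel, k = 78, lands at 2.480 GHz, matching the upper band edge quoted earlier.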
2.4.1. Static-Moduli Frequency Synthesizers
The classic PLL frequency synthesizer (Figure 2.7(a)) comprises a reference oscillator and two static modulus dividers so that,
\[ \frac{f_{ref}}{R} = \frac{f_{out}}{N} \;\Rightarrow\; f_{out} = \frac{N}{R} \cdot f_{ref} \tag{2.7} \]
Thus, by varying the moduli, different output frequencies can be synthesized.
Output frequencies can be incremented in steps of fref /R , and this is the frequency at
which the PFD is updated. As mentioned earlier, this requires the PLL loop bandwidth
to be a fraction of this update rate. The lower bandwidth is a big disadvantage in terms
of the reference phase noise suppression and slower settling times.
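As a quick numerical check of Equation 2.7, the following sketch evaluates the output frequency and the frequency step of a static-moduli synthesizer. The function names and the example values (10 MHz reference, R = 1000, N = 90000) are illustrative assumptions rather than parameters of a specific design.

```python
from fractions import Fraction

def synth_output_hz(f_ref_hz: int, n: int, r: int) -> Fraction:
    """Output frequency of a static-moduli synthesizer: f_out = (N / R) * f_ref."""
    return Fraction(n, r) * f_ref_hz

def pfd_update_rate_hz(f_ref_hz: int, r: int) -> Fraction:
    """PFD comparison rate f_ref / R, which is also the output frequency step."""
    return Fraction(f_ref_hz, r)

if __name__ == "__main__":
    f_ref, r = 10_000_000, 1000  # assumed example values
    for n in (90_000, 90_001):
        print(n, float(synth_output_hz(f_ref, n, r)))
    print("step (Hz):", float(pfd_update_rate_hz(f_ref, r)))
```

Incrementing N by one moves the output by exactly one PFD update rate, which is why the loop bandwidth ends up tied to the channel spacing in this architecture.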
FIGURE 2.7: (a) Classical static modulus divider; (b) Modified static modulus divider
A modification to the above architecture is to include a static divider following the PLL, as shown in Figure 2.7(b), so that the output frequencies are given by
\[ f_{out} = \frac{N}{R \cdot P} \cdot f_{ref} \]
while the PLL update rate is fref/R, a factor of P higher than the output frequency step fref/(R·P).
However, although this modification relaxes the PLL loop bandwidth constraint,
the operating frequency of the PLL and the ÷N divider needs to be higher.
2.4.2. Integer-N Frequency Synthesizers
The integer-N frequency synthesizer is probably the most widely used architecture.
The feedback divider now consists of a static divider as well as a dual-modulus prescaler
coupled with two counters. The two counters are referred to as the swallow-counter
and program counter. The swallow counter, or the channel-spacing counter, can be
programmed to enable channel selection. The prescaler divides by N+1 until the swallow
counter overflows after which the overflow bit will set the prescaler in divide-by-N mode
until the program counter overflows. Stated more explicitly, the program counter (also
known as the frame counter) determines the total number of VCO cycles required for the
above operation. The detailed operation and the math involved in an integer-N system
to realize channel selection on the basis of programmed counters are discussed in
later chapters.
2.4.3. Fractional-N Synthesizers
As mentioned above, integer-N PLLs are restricted in reference frequencies by
the channel spacing. The principle of pulse-swallow or pulse-removal can be used to
implement fractional division ratios. Depicted in Figure 2.8, are two implementation
strategies for fractional-N synthesis. Figure 2.8(a) incorporates a pulse-remover which
blocks one VCO pulse upon assertion of the “remove” control. The average locked
frequency is hence,
\[ f_{out} = f_{ref} + \frac{1}{T_p} \tag{2.8} \]
where, Tp is the period with which the pulse-remove command is applied. Modern
implementations however are based on dithering between two moduli. If the prescaler
divides by N for A VCO pulses and by N+1 for B pulses, the average divide ratio would
be equal to
FIGURE 2.8: Fractional-N synthesis based on (a) pulse removal, (b) dual-modulus prescaler
\[ N_{avg} = \frac{N A + (N + 1) B}{A + B} = N + \frac{B}{A + B} \tag{2.9} \]
\[ f_{average} = \frac{A + B}{\dfrac{A}{N} + \dfrac{B}{N + 1}} \tag{2.10} \]
\[ \Rightarrow f_{average} = N.f \tag{2.11} \]
where N represents the integer portion of the division modulus and f represents the
fractional portion.
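The two averaging expressions above can be evaluated exactly with rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that Equation 2.9 counts A and B in divider output cycles while Equation 2.10 counts them in VCO pulses.

```python
from fractions import Fraction

def n_avg_output_cycles(n: int, a: int, b: int) -> Fraction:
    """Equation 2.9 form: divide by N for A output cycles and by N+1 for B output cycles."""
    return Fraction(n * a + (n + 1) * b, a + b)

def n_avg_vco_pulses(n: int, a: int, b: int) -> Fraction:
    """Equation 2.10 form: divide by N for A VCO pulses and by N+1 for B VCO pulses."""
    return Fraction(a + b) / (Fraction(a, n) + Fraction(b, n + 1))

if __name__ == "__main__":
    # Three output cycles at divide-by-8 and one at divide-by-9.
    print(n_avg_output_cycles(8, 3, 1))
    # 8 VCO pulses at divide-by-8 followed by 9 VCO pulses at divide-by-9.
    print(n_avg_vco_pulses(8, 8, 9))
```

In both forms the result lies between N and N+1, so the integer part is N and the remainder is the fractional part f.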
The obvious advantage of the above scheme is that the channel spacing need only be
a fraction of the reference frequency, allowing for higher PLL loop bandwidths. The critical drawback of fractional-N synthesis is that the periodicity introduced by the modulus-select operation usually appears as sidebands or spurs. The conventional technique to
eliminate the spurs is to dither the modulus. To avoid an increase of the noise floor,
delta-sigma techniques can be used to shape the noise out of band, where it can then be
filtered out. This randomization of the modulus control to suppress spurs is
discussed in [1].
CHAPTER 3. DUAL-MODULUS PRESCALERS
The applications of dual-modulus prescalers have been highlighted earlier in the context
of integer-N and fractional-N synthesizers. This chapter analyzes and discusses their
operation.
3.1. Dual-Modulus Operation
The complexity of the N counter in PLL frequency synthesizers has grown over the
years. In addition to a straightforward N counter, it has evolved to include a prescaler.
This structure, illustrated earlier in Figure 2.2, has developed as a solution to the problems inherent in using the basic divide-by-N structure to feed back to the phase detector
when very high-frequency outputs are required. For example, suppose a 900 MHz output
is required with 10 kHz channel spacing. A 10 MHz reference frequency might be used,
with the R-divider set at 1000. Then, the N-value in the feedback would need to be of
the order of 90000. This would need a 17-bit counter capable of dealing with an input
frequency of 900 MHz.
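The arithmetic of this example can be reproduced with a few lines of code. The function name below is hypothetical; the numbers are the ones quoted in the paragraph above.

```python
def feedback_divider_requirements(f_out_hz: int, f_ch_hz: int, f_ref_hz: int):
    """Return (R, N, counter_bits) for a plain integer divider without a prescaler."""
    r = f_ref_hz // f_ch_hz      # reference divider: brings f_ref down to the channel spacing
    n = f_out_hz // f_ch_hz      # feedback divider: f_out / channel spacing
    bits = n.bit_length()        # minimum counter width able to hold N
    return r, n, bits

if __name__ == "__main__":
    print(feedback_divider_requirements(900_000_000, 10_000, 10_000_000))
```

The bit count confirms that a 16-bit counter (maximum 65535) is too small for N = 90000, so 17 bits are needed.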
To handle this range, it makes sense to precede the programmable counter with
a fixed counter element (the prescaler), to bring the very high input frequency down
to a range at which standard CMOS will operate. However, using a standard prescaler
introduces other complications. The system resolution, or the effective channel spacing,
is now degraded by P, the modulus of the prescaler. This issue can be addressed by using
a dual-modulus prescaler (Figure 3.1). It has the advantages of the standard prescaler
but without any loss in system resolution. A dual-modulus prescaler is a counter whose
division ratio can be switched from one value to another by an external control signal.
By using the dual-modulus prescaler with an ‘S’ and ‘P’ counter one can still maintain
an output resolution specified by the input to the PLL (F1 ).
FIGURE 3.1: Pulse-swallow integer-N frequency synthesizer.
As long as the S counter has not timed out, the prescaler divides down by N +
1. So, both the S and P counters will count down by 1 every time the prescaler counts
(N + 1) VCO cycles. This means the S counter will time out after ((N + 1) S) VCO
cycles. At this point the prescaler is switched to divide-by-N mode. The P counter still
has (P - S) cycles to go before it times out. So after ((P - S) N) more cycles, the system
is now reset to the initial condition. Expressing the above discussion mathematically,
total number of VCO cycles for one dual-modulus division, is
\[ M = S \times (N + 1) + (P - S) \times N \]
\[ \phantom{M} = S N + S + P N - S N \]
\[ \Rightarrow M = S + P N \tag{3.1} \]
Typically the S counter is called the swallow counter, and the P counter is the program
counter.
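The counting argument behind Equation 3.1 can be checked with a small behavioural model. This is a minimal sketch with hypothetical names; it counts VCO cycles only and ignores all timing and implementation detail of the counters.

```python
def vco_cycles_per_division(n: int, p: int, s: int) -> int:
    """Count VCO cycles in one full P-counter period of a pulse-swallow divider."""
    if not 0 <= s <= p:
        raise ValueError("the S counter must time out before the P counter (0 <= S <= P)")
    total_vco_cycles = 0
    s_remaining = s
    for _ in range(p):               # one iteration per prescaler output pulse
        if s_remaining > 0:          # S counter still running: prescaler divides by N + 1
            total_vco_cycles += n + 1
            s_remaining -= 1
        else:                        # S counter timed out: prescaler divides by N
            total_vco_cycles += n
    return total_vco_cycles

if __name__ == "__main__":
    n, p, s = 8, 10, 3
    print(vco_cycles_per_division(n, p, s), s + p * n)
```

For any valid setting the counted total agrees with M = S + PN.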
Consider the expression M = S + PN. To ensure a continuous integer spacing for
M, S must be in the range 0 to (N - 1). Then, every time P is incremented there is
enough resolution to fill in all the integer values between PN and (P + 1)N. As was
already noted for the dual-modulus prescaler, P must be greater than or equal to S for
the dual modulus prescaler to work. From these we can say that the smallest division
ratio possible while being able to increment in discrete integer steps is:
\[ M_{MIN} = (P_{min} \cdot N) + S_{min} = ((N - 1) \cdot N) + 0 = N^2 - N \tag{3.2} \]
The highest value of M is given by
\[ M_{MAX} = (P_{max} \cdot N) + S_{max} \tag{3.3} \]
In this case Smax and Pmax are simply determined by the size of the S and P
counters. The range from M_MIN to M_MAX defines the multiple moduli of division.
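The lower limit of Equation 3.2 can be illustrated by testing which division ratios are reachable under the constraints 0 <= S <= N - 1 and P >= S. The sketch below uses hypothetical function names and assumes unlimited counter sizes, so only the lower bound is examined.

```python
def is_reachable(m: int, n: int) -> bool:
    """True if M = S + P*N has a solution with 0 <= S <= N-1 and P >= S."""
    p, s = divmod(m, n)   # the decomposition with 0 <= S <= N-1 is unique
    return p >= s

def min_contiguous_ratio(n: int) -> int:
    """Smallest M above which every integer ratio is reachable (Equation 3.2)."""
    return n * n - n

if __name__ == "__main__":
    n = 8
    print(min_contiguous_ratio(n), is_reachable(55, n), is_reachable(56, n))
```

For N = 8, the ratio 55 would need P = 6 and S = 7, which violates P >= S, whereas every ratio from 56 upward is reachable.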
3.2. Pulse-Swallow Architecture
One important factor that has not been addressed yet is how the dual-modulus operation
can be implemented. The conventional implementation of any divider/prescaler
uses digital counters. The division factor that can be easily realized using such logic
is of the form 2^N, i.e. the pattern in which the counter counts repeats every 2^N cycles.
To implement 2^N + 1, therefore, one extra state of the system needs to be inserted over
a single pulse duration in the repetitive pattern. This is referred to as a “pulse-swallow”
operation.
The principle of operation of the pulse-swallow architecture can be explained by
means of a simple divide-by-2/3 circuit (Figure 3.2). Figure 3.2(a) shows the simplest
divider, a ÷2 implemented with a D flip-flop (DFF2). Now if another DFF (DFF1) and
the combinational gate G are inserted in the feedback path of the divider (Figure 3.2(b)),
FIGURE 3.2: 2/3 dual-modulus divider: (a) divide-by-2; (b) divide-by-3 circuit; (c) 2/3 dual-modulus divider.
then the system can be in three states: Q1Q2 = 01, 10, 11. The state Q1Q2 = 00 is
illegal, as it implies that the previous values of Q2 and G would have had to be
in the impossible combination of ‘0’ and ‘1’, respectively. To control the modulus,
an extra gate is required, such as the OR gate in Figure 3.2(c). This simple 2/3 divider
works in divide-by-2 mode when Mod Select is ‘1’ and in divide-by-3 mode when Mod
Select is ‘0’. The above discussion extends readily to prescalers with higher division
moduli 2^N/(2^N + 1).
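The state sequence of such a divider can be explored with a small behavioural model. The gate wiring below (an OR gate on Mod Select followed by a NAND into DFF1, with DFF2 sampling Q1) is an assumption chosen to reproduce the behaviour described above; it is not necessarily the exact wiring of Figure 3.2(c).

```python
def step(q1: int, q2: int, mod_select: int):
    """One clock edge of a behavioural 2/3 divider (assumed gate wiring)."""
    d1 = 1 - (q1 & (q2 | mod_select))  # OR with Mod Select, then NAND with Q1
    d2 = q1                            # DFF2 samples Q1
    return d1, d2

def simulate(mod_select: int, n_clocks: int = 24, start=(1, 1)):
    """Return the list of (Q1, Q2) states after each clock edge."""
    states = []
    q1, q2 = start
    for _ in range(n_clocks):
        q1, q2 = step(q1, q2, mod_select)
        states.append((q1, q2))
    return states

def output_period(mod_select: int) -> int:
    """Number of input clocks between consecutive rising edges of Q2."""
    q2 = [state[1] for state in simulate(mod_select)]
    rising = [i for i in range(1, len(q2)) if q2[i - 1] == 0 and q2[i] == 1]
    return rising[-1] - rising[-2]

if __name__ == "__main__":
    print("Mod Select = 1:", output_period(1))
    print("Mod Select = 0:", output_period(0), sorted(set(simulate(0))))
```

In divide-by-3 mode this model visits only the states 01, 10 and 11, and the state 00 is never entered from any of them.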
3.3. Technology Comparison - Bipolar Vs CMOS
The two most important performance parameters to be optimized in the design of
prescalers are speed and power. The biggest limiting factor in this optimization is the
technology. As mentioned earlier, the prescaler being one of the components working at
full speed, is often implemented with bipolar or SiGe/GaAs technologies [11, 12, 13, 14].
One of the perennial questions that has been discussed often in several conference panel
discussions and by RF engineers is the wisdom in pursuing RF-CMOS. This section
compares bipolar and CMOS technology for RF applications, throwing light on some of
their merits and demerits.
The key transistor figures of merit for RF and microwave applications include the
unity-gain frequency fT, the maximum power-gain frequency fmax, and the 1/f noise corner
frequency. A comparison of the figures of merit of several technologies suitable for
wireless LAN applications has been tabulated in [15]. As silicon technologies are less expensive and more integrable, they would be the clear choice. However, it is not obvious
as to which silicon technology ought to be used, bipolar/BiCMOS or CMOS. Some of
the issues considered are listed below.
Device Performance Comparison: Bipolar transistors do have a lot of performance
advantages over MOSFETs in RF and analog applications. Some of the important comparisons are:
• The gm/I ratio of bipolar transistors is always higher than that of MOSFETs [16]. The NPN
transistor possesses a higher inherent gain compared to NMOSFETs and hence, a higher
drive capability.
• Bipolar transistors have lower 1/f noise than MOSFETs due to the absence of
surface charge effects. This is a significant advantage for low-noise RF circuits.
• Bipolar transistors are sometimes considered to be modeled better than MOSFETs,
especially in the deep-submicron processes currently used [17].
• Bipolar devices exhibit better device matching on the same die.
• Bipolar transistors, however, are more non-linear than MOS devices due to their
exponential I-V characteristics. This is especially significant in the context of devices
that are used as switches. However, distortion introduced by back-gate effects is
absent in bipolar transistors.
• MOS devices have the advantage of the availability of complementary PMOS devices.
For RF circuits, bipolar devices do seem to possess more desirable features. Several
commercial analog products, however, utilize the advantages of both the bipolar and
MOS characteristics in BiCMOS processes.
Availability and Accessibility: It has been well established that CMOS is the most
available and accessible process among all semiconductor technologies. Several foundries
around the world offer a wide range of CMOS processes. This is the biggest advantage
of RF CMOS over bipolar and BiCMOS processes.
Cost, Yield and Integration Levels: The main appeal of CMOS is the relatively
low cost combined with high levels of integration. Shrinking device sizes is an attractive
feature for digital CMOS circuits as it improves both speed and power dissipation [18].
This is an added motivation for CMOS integration of analog components. Bipolar transistors are also more prone to defect density as they are minority carrier devices. CMOS
processes therefore have a higher yield. However, arguments presented in [17] point out
that when costs associated with packaging and testing are included, the price tags on
RF chips are not significantly different. Also, any RF or analog CMOS process requires
more masks for good passive components, adding to fabrication costs. Combined with
the performance advantages of bipolar devices, the cost advantage of CMOS can
be challenged.
Based on the above discussion, it can be concluded that the choice of technology
depends on the kind of application. If integration levels of wireless systems become
sufficiently high for Radio on Chip (ROC) to be feasible, CMOS processes may reduce
the entire chip-set to one large chip with small supporting chips. When the die cost of
the chip is a significant portion of the overall system cost, CMOS could have a significant
edge. For several (low-end) radio systems, CMOS RF performance may be comparable
to BiCMOS implementations and may be preferable. The aim of this thesis is to
realize high-speed, low-power dual-modulus prescalers in RF CMOS technology. The
implementation of the dual-modulus prescalers in Chapter 4 highlights the tradeoffs
involved with the CMOS design.
3.4. Current Mode Operation
CMOS static logic is widely used in mixed-signal integrated circuits because of
its ease of design, high packing densities, wide noise margins, etc. The most significant
feature is that the static power dissipation is nearly zero. However, at high frequencies
the displacement current Cout(dVout/dt) gives rise to a dynamic power dissipation
Pdynamic ≈ Cout(∆VL)² f. As illustrated in Figure 3.3, the current spikes during
switching can flow through parasitic resistances and inductances associated with the
Vdd and Gnd power supply grid networks, bond pads, package parasitics, etc., and cause
Vdd bounce or Gnd bounce by virtue of the I·R or L·dI/dt voltage drops. This kind of digital
switching noise can show up as glitches in the analog part of a mixed-signal
chip. Although there has been much research progress in the modeling of this substrate
coupling in mixed-signal ICs, the effects of the digital switching are difficult to predict,
making them difficult to eliminate with conventional circuit and layout techniques [19].
CMOS static logic belongs to the category of voltage switching circuits in which
Vdd or Vss is switched to the output node. The fundamental reason for the digital
switching noise is that the power supply current is not held constant during output
voltage transitions [20]. This observation has motivated the development of source-coupled logic circuits. Source-coupled logic circuits, also referred to as MOS current
mode logic (MCML) circuits, work on the principle of current steering controlled by an
input to a differential pair. As shown in Figure 3.4, the tail current is steered to either
side of the source-coupled pair and the output differential voltage is determined by the tail
current and load resistances. This kind of differential logic has several advantages over
conventional CMOS, as discussed in detail in this section.
FIGURE 3.3: (a) CMOS ring oscillator; (b) Switching spikes in a CMOS inverter.
3.4.1. Speed-Power Advantage
Several comparisons of CMOS and MCML circuits have been carried out in the literature [21, 22, 23]. Consider a linear chain of N identical gates, each with an identical load
capacitance C on its output node, implemented in each of the two logic styles. For CMOS,
the total propagation delay (D) of the chain of gates is proportional to:
\[ D_{CMOS} = \frac{N \cdot C \cdot V_{dd}}{0.5\,k \cdot (V_{dd} - V_t)^{\alpha}} \tag{3.4} \]
where k and α are parameters depending on transistor dimension and process. Assuming
the CMOS logic is clocked at a frequency equal to the inverse of the propagation delay,
FIGURE 3.4: Principle of current-mode logic.
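the dynamic power dissipation, power-delay and energy-delay products are given by:
\[ P_{CMOS} = N \cdot C \cdot V_{dd}^2 \cdot \frac{1}{D_{CMOS}} \tag{3.5} \]
\[ PD_{CMOS} = N \cdot C \cdot V_{dd}^2 \tag{3.6} \]
\[ ED_{CMOS} = PD_{CMOS} \cdot D_{CMOS} = N^2 \cdot \frac{2 C^2}{k} \cdot \frac{V_{dd}^3}{(V_{dd} - V_t)^{\alpha}} \tag{3.7} \]
The objective of digital design is to optimize the energy-delay (ED) product.
Setting the derivative of Equation 3.7 with respect to Vdd to zero gives the supply
voltage that minimizes the ED product for CMOS:
\[ V_{dd} = \frac{3 V_t}{3 - \alpha} \tag{3.8} \]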
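The optimum supply voltage can be cross-checked numerically. Taking the ED product as the product of the power-delay product (Equation 3.6) and the delay (Equation 3.4) gives a dependence of Vdd³/(Vdd − Vt)^α, whose minimum lies at 3Vt/(3 − α). The sketch below compares this closed form with a brute-force search; the function names and the values Vt = 0.5 V and α = 1.5 are illustrative assumptions.

```python
def ed_cmos(vdd: float, vt: float, alpha: float, n: int = 1, c: float = 1.0, k: float = 1.0) -> float:
    """Energy-delay product of a CMOS chain, ED = PD * D (arbitrary units)."""
    return n**2 * 2.0 * c**2 / k * vdd**3 / (vdd - vt) ** alpha

def optimum_vdd_closed_form(vt: float, alpha: float) -> float:
    """Supply voltage at which d(ED)/d(Vdd) = 0, valid for alpha < 3."""
    return 3.0 * vt / (3.0 - alpha)

def optimum_vdd_numeric(vt: float, alpha: float, v_max: float = 5.0, steps: int = 200_000) -> float:
    """Brute-force search for the ED minimum on a uniform grid in (Vt, v_max]."""
    best_v, best_ed = vt, float("inf")
    for i in range(1, steps + 1):
        v = vt + (v_max - vt) * i / steps
        ed = ed_cmos(v, vt, alpha)
        if ed < best_ed:
            best_v, best_ed = v, ed
    return best_v

if __name__ == "__main__":
    vt, alpha = 0.5, 1.5  # assumed example values
    print(optimum_vdd_closed_form(vt, alpha), optimum_vdd_numeric(vt, alpha))
```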
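The power-delay equations for a CML inverter cascade are [22]:
\[ D_{CML} = N R C = \frac{N \cdot C \cdot \Delta V}{I} \tag{3.9} \]
\[ P_{CML} = N \cdot I \cdot V_{dd} \tag{3.10} \]
\[ PD_{CML} = N I V_{dd} \cdot \frac{N C \Delta V}{I} = N^2 \cdot C \cdot \Delta V \cdot V_{dd} \tag{3.11} \]
\[ ED_{CML} = N^2 C V_{dd} \Delta V \cdot \frac{N C \Delta V}{I} \tag{3.12} \]
\[ \phantom{ED_{CML}} = \frac{N^3 C^2 V_{dd} \Delta V^2}{I} \tag{3.13} \]
where ∆V is the output voltage swing, ∆V = I · R.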
The above results indicate that CML circuits can be optimized by reducing the
supply voltage, or the signal voltage swing, and by increasing the tail current.
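The dependence of the CML figures of merit on the tail current can be made concrete with a short calculation. This is a sketch only: the function name and all component values are assumptions chosen for illustration, not values from the implemented design.

```python
def cml_chain_metrics(n: int, c: float, delta_v: float, i_tail: float, vdd: float) -> dict:
    """Delay, power, power-delay and energy-delay of a chain of N CML gates."""
    delay = n * c * delta_v / i_tail   # D = N*R*C with R = delta_v / I
    power = n * i_tail * vdd           # static power of N tail currents
    pd = power * delay                 # equals N^2 * C * delta_v * Vdd
    ed = pd * delay                    # equals N^3 * C^2 * Vdd * delta_v^2 / I
    return {"D": delay, "P": power, "PD": pd, "ED": ed}

if __name__ == "__main__":
    base = cml_chain_metrics(n=4, c=20e-15, delta_v=0.4, i_tail=100e-6, vdd=1.8)
    fast = cml_chain_metrics(n=4, c=20e-15, delta_v=0.4, i_tail=200e-6, vdd=1.8)
    print(base)
    print(fast)
```

Doubling the tail current at fixed swing halves the delay and the ED product while leaving the PD product unchanged, which is the trend stated above.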
Intuitively, the higher speeds of current-mode operation can be attributed to two
main aspects - the transistors need not be completely turned on/off as in the case of
CMOS, and the lower voltage swings can charge/discharge the output node capacitance
much faster. The conventional power advantage of CMOS circuits does not hold at such
high frequencies as their dynamic power dissipation is comparable to or even worse than
the static power loss in CML circuits.
3.4.2. Common-Mode Noise Suppression
One of the most significant drawbacks of CMOS logic is the effect of the current
spikes during switching. The large transient currents could lead to L·dI/dt voltage drops of
the order of about 200 mV. Since many analog signals could be much smaller than this,
such variations could be disastrous. The constant current drawn by source-coupled pairs
reduces this noise coupling to a large extent.
3.4.3. Substrate Coupling
Another source of switching noise is the injection of currents into the substrate
by charging/discharging of the drain-bulk capacitance (Figure 3.5(a)). In the case of single-ended rail-to-rail CMOS logic, the voltage variation modulates the depletion widths,
causing a current
\[ i_{sub} = C_{db} \frac{dv_{out}}{dt} \tag{3.14} \]
The use of differential logic in CML circuits cancels these substrate currents to a first
order as illustrated in Figure 3.5(b). The total substrate current is now
\[ i_{sub} = C_{db1} \frac{dv_{out}^{+}}{dt} + C_{db2} \frac{dv_{out}^{-}}{dt} \tag{3.15} \]
where the complementary outputs move in opposite directions.
The cancelation is not exact as Cdb is non-linear and depends on the voltage across it.
FIGURE 3.5: Substrate current injection in (a) CMOS, (b) CML.
Apart from the above mentioned advantages, differential CML gates give some
implementation advantage with the availability of both true and complementary phases
of the signal without the need for separate inverters. Finally, their low swing makes
them more compatible for low-voltage designs.
3.5. Pulse-Swallow Feedback Delays
The conventional dual-modulus prescaler with the pulse-swallow architecture is
usually limited by the speed of the pulse-swallow operation. In other words, the divide-by-(N+1) operation is the speed bottleneck of dual-modulus prescalers. Since the primary
goal of this thesis is to optimize the speed, the feedback loop was analyzed. Referring
to the synchronous divide-by-2/3 circuit of Figure 3.2, following the clock edge on which
Q2 must change, the next valid clock transition needs to accommodate the propagation
delay through the gate G and the input stage of DFF2. This signal delay can make the
divide-by-3 operation about twice as slow as the divide-by-2 operation.
Some design techniques can be used to reduce these propagation delays in the
synchronous division. The combinational gates can be embedded into the first stage
of the D flip-flops. Previous implementations of dual-modulus prescalers [24, 14] have
incorporated a gate with the flip-flops. The differential current-mode implementation of
these “gated” flip-flops is discussed in the next chapter.
3.6. Ring-Oscillator Speed Analysis
The synchronous portion of the prescaler is the critical design to be optimized
for speed. Design optimization of a simple divide-by-two flip-flop begins with reducing
the propagation delay of the CML D flip-flops. To estimate the maximum obtainable
input frequency that can be divided by the DFF, the toggle flip-flop (divide-by-2) can be
regarded similar to a 3-inverter ring oscillator. With the availability of complementary
signals, ring oscillators can be made with even number of stages as well. A divide-by-4
circuit is similar in structure to a 4-stage ring oscillator with the complementary output
looping back to its input stage. This parallel between the two circuits is illustrated
in Figure 3.6. Theoretically the maximum input frequency toggled by the DFF would
be twice the oscillation frequency. However, because of the additional loading of the
positive feedback latch in the flip-flops, the input frequencies will be less than twice the
oscillation frequencies [24].
The equivalence of ring oscillators and prescalers is significant in analyses of speed-power tradeoffs and in understanding the role of various design parameters such as
FIGURE 3.6: Analogy between (a) ring oscillator and (b) frequency divider.
voltage swing, transistor sizes and current consumed in each stage. The results of the
analysis are explained in detail in the next chapter.
CHAPTER 4. ANALYSIS, CIRCUIT DESIGN AND IMPLEMENTATION
Having discussed system-level considerations in dual-modulus prescalers, this chapter discusses the analysis and transistor-level implementation aspects. The pulse-swallow
operation, explained in principle in Section 3.1, is discussed in the context of the divide-by-8/9 prescaler that was designed and implemented.
4.1. 8/9 Dual Modulus Prescaler Operation
The 8/9 dual-modulus prescaler is illustrated in Figure 4.1. The synchronous
portion, which works at the maximum frequency, is the critical block to design. The master-slave D flip-flops FF1 and FF2 perform conventional divide-by-4 in the absence of a
“pulse-swallow” signal. Such a control signal can be suppressed by disabling FF3 when
the Mod-Select signal is inactive. The output of FF2 is further divided asynchronously
to generate a divide-by-8 signal. When this divide-by-8 signal, Q4, is combined with
the Mod-Select signal appropriately, flip-flop FF3 gets included in the divider feedback
loop in such a way that FF1 is forced to hold state for exactly one extra clock period.
The output of the synchronous portion now has a duty cycle of 3/5, i.e., the output Q2
is high for 3 and low for 2 clock periods. The complementary output Q2bar, by virtue of
the differential operation of the current-mode logic, is high for 2 and low for 3 clock periods.
As the synchronous output clocks the asynchronous divider, this translates into Q3 being
high for 5 and low for 4 pulses (and vice versa for Q3bar). The period of the prescaler
output is now 9 pulses, giving it the 8/9 modulus operation.
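The resulting output period can be reproduced with a purely behavioural model in which a synchronous divide-by-4/5 stage drives a divide-by-2 stage. The sketch below is an abstraction with hypothetical names; it assumes that the extra clock is swallowed during the high phase of the output and does not model the individual flip-flops FF1 to FF4.

```python
def prescaler_output(mod_select: bool, n_clocks: int):
    """Return the divided output after each input clock (behavioural 8/9 model)."""
    output = 0
    sync_count = 0
    samples = []
    for _ in range(n_clocks):
        # The synchronous stage counts 5 clocks instead of 4 only while swallowing.
        length = 5 if (mod_select and output == 1) else 4
        sync_count += 1
        if sync_count == length:
            sync_count = 0
            output ^= 1          # the asynchronous divide-by-2 toggles
        samples.append(output)
    return samples

def output_period(mod_select: bool, n_clocks: int = 60) -> int:
    """Input clocks between consecutive rising edges of the output."""
    q = prescaler_output(mod_select, n_clocks)
    rising = [i for i in range(1, len(q)) if q[i - 1] == 0 and q[i] == 1]
    return rising[-1] - rising[-2]

if __name__ == "__main__":
    print("divide ratio, Mod-Select inactive:", output_period(False))
    print("divide ratio, Mod-Select active:  ", output_period(True))
```

With Mod-Select active the output is high for 5 clocks and low for 4, matching the 9-pulse period described above.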
The pulse-swallow operation is emphasized with a timing diagram (Figure 4.2).
FIGURE 4.1: 8/9 Dual-Modulus Prescaler System.
FIGURE 4.2: Timing diagram explanation of pulse-swallow operation (traces: CLK, Q1, Q2, Q4 asynchronous output, Q3 pulse-swallow signal, Q2bar/D1 input; FF1 is forced to hold state for one extra pulse, giving an output period of 9 clock pulses).
The prescaler is assumed to be in the ÷5 mode. The outputs of the flip-flops FF1 and
FF2 are, as expected, time-shifted by one clock period. The asynchronous divider is
clocked by Q2. The asynchronous output, by virtue of an ‘AND’ operation with Mod-Select,
clocks FF3 so that Q3 is a time-shifted version of Q2. The pulse-swallow control
signal Q3 can thus be considered a ‘NOR’ operation on the asynchronous output and Q2.
The control is used to ‘SET’ FF1, so that Q1 stays high for one pulse longer than
it would have without the pulse-swallow operation. The dashed lines show the
conventional ÷4 waveforms.
4.2. Design Considerations
The core block to be optimized for speed-power is the synchronous DFF. A detailed
analysis of the parameters involved and their optimization is presented here. As
discussed briefly in the previous chapter, the analysis of the divider structure can be simplified by exploiting its similarity with a ring oscillator. Figure 3.6 explicitly showed this
analogy between the two structures. The primary advantage of analysis on the basis of a
ring oscillator is that the maximum ring oscillation frequency is a clear indication of the
speed of the DFFs in the divider. In general, ring oscillators are used to characterize a
process because their oscillation frequency depends heavily on the fT of the transistors.
The primary design parameters involved in optimizing the CML flip-flops are analyzed
in detail in this section.
4.2.1. Voltage Swing
One of the most significant attributes of current-mode gates over CMOS is their
lower output voltage swing. Intuitively, the output node capacitance needs less time
to charge, which implies faster operation. The propagation delay of a CML gate
may be derived assuming a linear model as shown
in Figure 4.3. Assuming symmetry of the differential pair, the initial condition of the
circuit at the beginning of a switching transient initiated by an input voltage swing is:
\[ V_{o+}(t = 0^-) = V_{DD} \tag{4.1} \]
\[ V_{o-}(t = 0^-) = V_{DD} - I \cdot R \tag{4.2} \]
FIGURE 4.3: RC time-constant linear delay model in CML operation.
At the end of the transients, the current is steered from one leg to the other. The
output voltages after settling would be
\[ V_{o+}(t \to \infty) = V_{DD} - I \cdot R \tag{4.3} \]
\[ V_{o-}(t \to \infty) = V_{DD} \tag{4.4} \]
Equating transient currents at the output node (assuming instantaneous current
switching), we obtain the first order differential equation,
\[ C \cdot \frac{dV_o}{dt} + \frac{V_o}{R} = \frac{V_{DD} - I \cdot R}{R} \tag{4.5} \]
Solving the above differential equation with the initial and final conditions, the
output voltage can be expressed as
\[ V_{o+} = (V_{DD} - I \cdot R)(1 - e^{-t/RC}) + V_{DD}\, e^{-t/RC} \]
\[ V_{o+} = V_{DD} - I \cdot R\,(1 - e^{-t/RC}) \tag{4.6} \]
Propagation delay can be defined as the time taken for the output node to charge/discharge
to a desired fraction of the final voltage. For instance, the time taken for the output
to reach within 1% of its final value in the above case of Vo+ can be derived from
Equation 4.6:
\[ 1.01\,(V_{DD} - I \cdot R) = V_{DD} - I \cdot R\,(1 - e^{-t_{99\%}/RC}) \]
\[ \Rightarrow t_{99\%} = R \cdot C \,\ln \frac{I R}{0.01\,(V_{DD} - I R)} \tag{4.7} \]
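The settling-time expression can be verified by substituting it back into the transient solution of Equation 4.6. In the sketch below the function names are hypothetical and the component values (R = 6 kΩ, C = 20 fF, I = 100 µA, VDD = 2.5 V) are assumptions chosen only for illustration.

```python
import math

def vo_plus(t: float, r: float, c: float, i_tail: float, vdd: float) -> float:
    """Output voltage of the discharging leg (Equation 4.6)."""
    return vdd - i_tail * r * (1.0 - math.exp(-t / (r * c)))

def t99(r: float, c: float, i_tail: float, vdd: float) -> float:
    """Time for the output to come within 1% of its final value (Equation 4.7)."""
    swing = i_tail * r
    return r * c * math.log(swing / (0.01 * (vdd - swing)))

if __name__ == "__main__":
    r, c, i_tail, vdd = 6e3, 20e-15, 100e-6, 2.5  # assumed example values
    t = t99(r, c, i_tail, vdd)
    print("t99 (ps):", t * 1e12)
    print("Vo+ at t99:", vo_plus(t, r, c, i_tail, vdd), "target:", 1.01 * (vdd - i_tail * r))
```

For a fixed RC product, a larger swing I·R increases the logarithmic term and therefore the settling time.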
The propagation delay evidently increases with the voltage swing I · R.
The voltage swing is a design parameter that depends on other factors
as well. In standard CMOS digital circuits, the mid-swing voltage gain is considered
representative of the robustness of the circuit to noise [25]. Digital logic requires a point
on the DC transfer curve where the gain is greater than 1. This requirement on the
gain per stage must also hold for a ring of inverters to sustain oscillations. The
mid-swing gain is given by
\[ A_v = g_m \cdot R_L \tag{4.8} \]
\[ \phantom{A_v} = \frac{2I}{\Delta} \cdot \frac{V_{sw}}{I} = \frac{2 V_{sw}}{\Delta} \tag{4.9} \]
where Vsw is the swing and ∆ refers to the over-drive voltage VGS − VT, or VDsat.
Another significant reason for higher voltage swings is the response of the positive feedback latch. The output of the preamplifier (differential-pair) of the CML latch,
although amplified, still needs to be pulled to the output levels needed to avoid metastability. The latch positive feedback regenerates the output signals to maximum possible
swings.
The conventional CML Flip-Flop of Figure 4.4 works similar to a latched comparator with the positive feedback supplementing the gain of the differential pair. The
latch-mode time constant in the positive feedback phase has been derived using a linearized model in [26, pp.319-321]. The result derived indicates that the transient response
of the latch is represented by the solution
\[ \Delta V = \Delta V_o \, e^{(A_v - 1)\,t/\tau} \tag{4.10} \]
FIGURE 4.4: Current-Mode D-Latch.
where ∆Vo is the initial voltage difference at the beginning of the latch phase.
If it is necessary for a voltage difference of ∆Vlatch to be obtained in order for the
subsequent preamplifier to safely recognize the correct output value, the time required
for this to happen can be derived from Equation 4.10 to be
\[ T_{latch} = \frac{C_L}{G_m} \ln \frac{\Delta V_{latch}}{\Delta V_o} \tag{4.11} \]
So if ∆Vo is small, the latch time can be larger than the allowed time to latch (half the
clock period) causing metastability. Further, low voltage swings are more susceptible to
noise and mismatch. Although not very critical in the case of frequency dividers, this
would be relevant for the design of oscillator delay stages.
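The logarithmic dependence of the latch time on the initial voltage difference can be seen from a short calculation based on Equation 4.11. The function name and the values of CL, Gm and the voltages below are assumptions for illustration only.

```python
import math

def latch_time(c_load: float, g_m: float, dv_latch: float, dv_initial: float) -> float:
    """Regeneration time needed to grow dv_initial to dv_latch (Equation 4.11)."""
    return (c_load / g_m) * math.log(dv_latch / dv_initial)

if __name__ == "__main__":
    c_load, g_m, dv_latch = 20e-15, 1e-3, 0.3   # assumed example values
    for dv_initial in (30e-3, 3e-3, 0.3e-3):
        print(dv_initial, latch_time(c_load, g_m, dv_latch, dv_initial) * 1e12, "ps")
```

Each tenfold reduction in the initial difference adds a fixed increment of (CL/Gm)·ln 10 to the latch time, which is why a very small ∆Vo can exhaust the half clock period available for latching.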
The upper bound on voltage swings (Vsw) is established by biasing conditions of
differential pair transistors. When one differential delay stage drives a similar stage, then
the differential pair transistor with a high input voltage requires a large enough VDS to
remain in saturation, or
\[ V_{DS} \geq V_{GS} - V_T \]
\[ V_{DD} - V_{sw} - V_S \geq V_{DD} - V_S - V_T \]
\[ \Rightarrow V_{sw} \leq V_T \tag{4.12} \]
4.2.2. Current Consumption
The current flowing in each stage of the divider/oscillator contributes directly to
the static power consumption of the circuit. Since the propagation delay is the time
taken for the available current to charge the output node capacitance, the circuit speed
is directly dependent on the current through the stage. However, an interesting question
that arises is whether there is an upper bound on how fast a circuit can be made to
operate if there was unlimited power to burn. In the case of a ring oscillator, if the
voltage swing is assumed fixed, scaling up the currents would require (1) reduction
in load resistance to maintain swing, and (2) proportional increase in NMOS device
sizes so that the over-drive ∆ remains same. The increase in the device size implies
a proportional increase in parasitic capacitances. Therefore, with a RC time-constant
dependent propagation delay, the above two variations nullify the effect of higher current
on improvements in speed. At very low device sizes/currents, the gain of the oscillator
is not high enough to sustain oscillations.
This result was verified with a simulation on the ring oscillator. Table 4.1 shows
that the maximum oscillation frequency of the ring (fosc ) does vary with current (I),
but not as significantly as one would have liked for the amount of static power traded
off. The parasitic capacitances associated with the resistive loads(R) do not scale down
proportional to the resistance. So the net RC time constant of the output node starts
increasing with current giving diminishing returns in the speed.
TABLE 4.1: Current-Speed relation.
Current I (µA)    W (µm)    Load R (kΩ)    fosc (GHz)
50                0.8       12             5.8
100               1.6       6              6.1
200               3.2       3              6
400               6.4       1.5            6.13
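The entries of Table 4.1 can be checked for the constant-swing scaling described in the text. The sketch below copies the table values and uses hypothetical helper names; it only post-processes the tabulated numbers and does not reproduce the simulation.

```python
# (I in uA, W in um, R in kOhm, f_osc in GHz), copied from Table 4.1
TABLE_4_1 = [
    (50, 0.8, 12.0, 5.8),
    (100, 1.6, 6.0, 6.1),
    (200, 3.2, 3.0, 6.0),
    (400, 6.4, 1.5, 6.13),
]

def swing_volts(i_ua: float, r_kohm: float) -> float:
    """Nominal voltage swing I * R for one table row."""
    return i_ua * 1e-6 * r_kohm * 1e3

def speed_gain(table) -> float:
    """Ratio of oscillation frequency between the last and first rows."""
    return table[-1][3] / table[0][3]

if __name__ == "__main__":
    for i_ua, w_um, r_kohm, f_ghz in TABLE_4_1:
        print(i_ua, "uA:", swing_volts(i_ua, r_kohm), "V swing,", f_ghz, "GHz")
    print("current ratio:", TABLE_4_1[-1][0] / TABLE_4_1[0][0], "speed ratio:", speed_gain(TABLE_4_1))
```

Every row has the same nominal swing, and an eightfold increase in current buys only a few percent in oscillation frequency.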
4.2.3. Transistor Sizing
The sizes of the transistors in the current mode flip-flops are tightly coupled with
the other design parameters of swing and current. The primary considerations involved
in deciding the device size are those of speed, voltage swing and current steering ratios.
The RC time constant equations in Section 4.2.1 suggest that smaller device sizes (and
hence, lower parasitic capacitance) reduce the delay in each stage. Ideal CML inverters
have a perfect current switch that steers current from one leg of the differential pair to
the other. In reality, however, some finite current is going to flow in the “OFF” path
preventing full current from being available at the output node of the “ON” transistor.
Assuming a current of ION flows through the active transistor and IOF F through the
other leg so that I = ION + IOF F , the effective voltage swing is
\[ V_{sw} = R\,[I_{ON} - I_{OFF}] = R\,[2 I_{ON} - I] \tag{4.13} \]
This current steering ratio is a parameter that depends on the voltage swing and the
device size. It has been observed in [24] that the current steering ratio can be a useful
parameter to indicate the robustness of the circuit. The analysis in [24] also accounts
for process and temperature variations, which exacerbate the effect of incomplete current
steering on CML latches. In this prescaler design too, the devices were sized on the basis
of a fixed DC current steering ratio (taken as 95%). So the device size involves a tradeoff
between maximum operating speed and robustness to process and temperature variation.
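As a worked illustration of Equation (4.13), the sketch below evaluates the effective swing and the load resistance needed to reach a target swing when the steering is incomplete. It assumes that the steering ratio means $I_{ON}/I$, which is an interpretation rather than a definition given in the text, and the function names are hypothetical.

```python
def effective_swing(load_ohm, tail_current_a, steering):
    """Equation (4.13): Vsw = R*(I_ON - I_OFF) = R*(2*I_ON - I).

    `steering` is assumed here to mean I_ON / I (an assumption of this sketch).
    """
    if not 0.5 <= steering <= 1.0:
        raise ValueError("steering must lie between 0.5 and 1.0")
    i_on = steering * tail_current_a
    return load_ohm * (2.0 * i_on - tail_current_a)


def required_load(swing_v, tail_current_a, steering):
    """Load resistance needed to reach a target swing at a given steering ratio."""
    if not 0.5 < steering <= 1.0:
        raise ValueError("steering must be greater than 0.5 and at most 1.0")
    return swing_v / (tail_current_a * (2.0 * steering - 1.0))


if __name__ == "__main__":
    ideal = required_load(0.6, 100e-6, 1.0)    # complete steering
    real = required_load(0.6, 100e-6, 0.95)    # 95% steering
    print(f"R for 0.6 V swing: ideal {ideal:.0f} ohm, 95% steering {real:.0f} ohm")
```

Under this assumption, a 100 µA stage with 95% steering needs roughly 6.7 kΩ rather than the ideal 6 kΩ to reach a 0.6 V swing, which is why lower steering pushes the load resistance above $V_{sw}/I$.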
The approach to optimize the design of the inverter stage for each flip-flop began
with an analysis of ring oscillation frequency versus transistor size. Simulation results for
the following cases were investigated:
1. Ring oscillator speed with decreasing device dimensions, maintaining constant voltage swing and a tail current of 100 µA (Figure 4.5(a)). As the percentage current
steering will be lower for smaller transistor sizes, the load resistance is higher than
$V_{sw}/I$. The increase in resistance is, however, not significant enough to offset the
improvement in node capacitance. So, although the slope of the frequency variation flattens out at lower device dimensions, the general trend encourages smaller
size for faster operation.
2. Ring oscillator speed with decreasing device dimensions, maintaining the current steering percentage fixed and the total current constant at 100 µA (Figure 4.5(b)). The
load resistance used is assumed ideal for simplicity. The above discussion indicates
that the current steering requirement imposes a need for larger device dimensions,
and so the maximum oscillation frequencies are lower. The trend remains the
same as in the case of the fixed swing.
3. The above simulations, when run with the real resistance RBH2 available in the
National BiCMOS8i process, show a “sweet spot” at very low device sizes (Figure 4.5(c)). This inflection occurs because at very low device sizes, the load resistance needed to compensate for the lower steering is so high that the parasitic
capacitances associated with these resistances negate the reduction of the transistor parasitic capacitances. The parasitic capacitances associated with the load
resistance are small and may be swamped out when interconnect capacitances are
included in a real simulation with extracted netlists.
4. The fourth and most relevant set of simulations was done with the divide-by-4
circuits. The input clock frequency was stepped up for different device sizes in the
flip-flop until correct division operation was no longer observed. Although the ring oscillator is
similar to the divider circuit, there are some marked distinctions. The additional
positive feedback latch stage in the flip-flop not only loads the output of the CML
preamplifier stage, but also helps the speed with its gain. The differential pair and
the latch have similar transistor sizes for optimum current steering. The design
had to account for process and temperature variations as well. The schematic of
the final current-mode D flip-flop of the prescaler, designed on the basis of the above
considerations, is shown in Figure 4.6. The clock and signal swings were set to
about 0.6 V with the current through each CML latch at 100 µA.
FIGURE 4.5: Ring oscillator simulation comparisons. (a) Ring oscillator simulations with constant swing. (b) Ring oscillator simulations with fixed current steering. (c) Ring oscillator constant steering simulations with RBH2. [Plot: horizontal axis W; vertical axis scaled by 10^9.]
FIGURE 4.6: Optimized D flip-flop. [Schematic labels: Rload = 5.8K; device sizes 2u/0.25u and 8u/0.5u; nodes VDD, D, Dbar, Clk, Clkbar, Q, Qbar, Bias.]
4.3. Implementation Of Pulse-Swallow Logic
As mentioned earlier, the speed bottleneck of the dual-modulus prescalers is in
the divide-by-N+1 implementation. This is obvious considering the fact that the N+1
modulus division requires the divide-by-N signal (and hence, the delays associated with
it) as well as the delay in generating the pulse-swallow signal. There have been some
clever design techniques to reduce the pulse-swallow delays [14, 27, 15] using bipolar
ECL and ECL-like differential logic. Merging the logic gates into the flip-flop saves
power and increases operating speed. However, some of the above methods, especially
the previous generation prescaler implemented in the National BiCMOS7 process and
its MOS current-mode equivalent in [15] have their disadvantages.
The gated D-type master-slave CML latch is shown in Figure 4.7(a). The reset
signal needs to be combined with the divider signal to pull the output node to a logic
‘low’ state. Unlike the simple DFF where the signals are differential and symmetrical,
the OR function of these gated flip-flops requires that the input signals be compared
with a reference voltage to determine whether the signal is high or low.
In current-mode logic, the signal swing is low and the DC value of this reference may
shift due to process variations. The reset operation works by
providing a dominant pull-down path through the reset transistor in parallel
with the DFF signal transistor. The disadvantage of such a technique is that since
the reset operation is essentially single-ended and asymmetric, we lose many of the
common-mode noise immunity advantages discussed in the previous chapter. Also, since
it requires a dominant pull-down ‘reset’ transistor, this transistor needs to be 4× or 5× wider than
the conventional differential pair/latch transistors. This larger device, needed only for the one extra
pulse out of the N (divide modulus) pulses, loads the differential pair and slows down the
prescaler operation. Any logic that requires a “fight” between two signal paths cannot
be robust.
FIGURE 4.7: (a) Parallel pull-down latch structure for gated flip-flop. (b) Fully symmetric flip-flop gating. [Schematic labels: Reset / Set signal inputs.]
A more symmetric implementation of the reset/set operations on the gated flip-flops is shown in Figure 4.7(b). The principle of operation is based on the idea of stacking
CML gates (their low swing allows multiple stacking) to save power [14]. Since a CML
gate has only one structure, and all logic operations can be derived from one basic
CML cell, the reset/set operation is defined by combinational logic. For example, the
conventional ‘reset’ operation is an ‘AND’ of the signal with ‘LOW’ and the ‘set’ operation is an
‘OR’ with ‘HIGH’. A CML gate’s structure is inherently asymmetric with respect to the
output node loading. Careful layout by appropriate source/drain sharing of transistors
and use of dummy devices can alleviate problems due to this load mismatch.
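The statement that reset and set reduce to combinational logic can be illustrated with a small behavioural model. This is a logic-level sketch only, not a circuit description: the function names are hypothetical, and giving reset priority over set is an assumption made for the example.

```python
def gated_d(d, reset=False, set_=False):
    """Value presented to the latch after symmetric gating.

    reset: AND the data with LOW  -> output forced to 0
    set:   OR the data with HIGH  -> output forced to 1
    Reset takes priority over set here; that priority is an assumption.
    """
    if reset:
        return d and False
    if set_:
        return d or True
    return d


def differential(value):
    """Return the complementary pair (Q, Qbar) produced by a differential gate."""
    return value, (not value)


if __name__ == "__main__":
    for d in (False, True):
        print(d, gated_d(d, reset=True), gated_d(d, set_=True), differential(gated_d(d)))
```

Because both operations are ordinary AND/OR functions of differential signals, no pull-down path has to overpower another, unlike the parallel pull-down structure of Figure 4.7(a).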
4.4. Asynchronous Flip-Flop
The asynchronous divide-by-two flip-flop is simpler to design as it works at $f_{clk}/4$.
To design the asynchronous DFF4, the device sizes of DFF1 are retained, but the current
is scaled down. Since the output of DFF4 needs to be combined with the modulus-select
signal to control the pulse-swallow flip-flop DFF3, the output swing should be the same
as the signal swing in the rest of the circuit. The load resistance needs to be scaled up
to compensate for the lower current in the stage.
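The load scaling described above follows from keeping the product I·R constant. The sketch below uses the values that appear as labels in Figures 4.6 and 4.8 (5.8 kΩ at 100 µA for DFF1, 50 µA for the asynchronous stage); the function name is illustrative.

```python
def scaled_load(r_ref_ohm, i_ref_a, i_new_a):
    """Load resistance that keeps the nominal swing I*R unchanged at a new current."""
    return r_ref_ohm * i_ref_a / i_new_a


if __name__ == "__main__":
    r_dff1 = 5.8e3     # ohm, load of DFF1 (label in Figure 4.6)
    i_dff1 = 100e-6    # A, current through each CML latch of DFF1
    i_dff4 = 50e-6     # A, current label in Figure 4.8
    r_dff4 = scaled_load(r_dff1, i_dff1, i_dff4)
    print(f"DFF4 load: {r_dff4 / 1e3:.1f} kOhm, swing: {i_dff4 * r_dff4:.2f} V")
```

Halving the current therefore doubles the load to 11.6 kΩ, matching the resistor labels of Figure 4.8 and leaving the nominal swing at 0.58 V.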
Figure 4.8 shows the asynchronous ÷2 stage schematic. This stage also needs
an input buffer because the divide-by-4/5 signal that clocks this stage is usually not
a clean reference. Simple CML inverter buffers are used with source-followers for level
shifting the signal level to the clock bias levels. In this design, the targeted operation
was 8/9 dual-modulus and so, requires only one such asynchronous divider with the 4/5
synchronous stage. Higher moduli can be obtained by cascading more toggle flip-flops
(TFFs).
FIGURE 4.8: Asynchronous divider. [Schematic labels: load resistances 11.6 kΩ; device sizes 2u/0.25u, 4u/0.5u and 8u/0.5u; bias currents 50uA; clock input from DFF2 output; nodes VDD, D, Dbar, Clk, Clkbar, Q, Qbar.]
4.5. RF Buffer
The design of the dual-modulus prescaler was implemented entirely in CMOS,
which was one of the key objectives of this work. The prescaler in a PLL is
driven by the VCO output. It is typical to buffer the VCO outputs to shield the noise-sensitive block from the digital switching in the prescaler. The specifications on the
dual-modulus prescaler require operation for signals in the range -20 dBm to 0 dBm
power level. This needs to be converted to ≈ 0.6 V single-ended peak-to-peak swings
to clock the prescaler. The above requirement imposes a gain specification of about
26 dB (×20) at the targeted signal frequencies of 2.5-3 GHz, which translates to a gain-bandwidth product requirement on the order of 50 GHz. The buffer was implemented
using both CMOS and bipolar technology and the performances were compared.
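The gain and gain-bandwidth figures quoted for the RF buffer can be checked with a few lines of arithmetic. The sketch below assumes a sinusoidal input into a 50 Ω system and compares the peak input amplitude with the 0.6 V target; both the impedance and that reading of the swing are assumptions of this example, and the function names are hypothetical.

```python
import math


def dbm_to_peak_volts(p_dbm, r_ohm=50.0):
    """Peak amplitude of a sine wave delivering p_dbm into r_ohm (50 ohm assumed)."""
    p_watts = 1e-3 * 10 ** (p_dbm / 10.0)
    return math.sqrt(2.0 * p_watts * r_ohm)


def gain_db(voltage_ratio):
    """Voltage gain expressed in decibels."""
    return 20.0 * math.log10(voltage_ratio)


if __name__ == "__main__":
    v_min = dbm_to_peak_volts(-20.0)        # weakest specified input
    ratio = 0.6 / v_min                     # gain needed to reach 0.6 V
    print(f"-20 dBm -> {v_min * 1e3:.1f} mV peak, gain needed {ratio:.1f}x ({gain_db(ratio):.1f} dB)")
    for f_ghz in (2.5, 3.0):
        print(f"gain-bandwidth at {f_ghz} GHz with 20x gain: {20 * f_ghz:.0f} GHz")
```

Under these assumptions a -20 dBm input is about 32 mV peak, so a gain close to ×20 is needed, and a gain of 20 at 2.5 to 3 GHz corresponds to a gain-bandwidth product of 50 to 60 GHz.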
4.5.1. CMOS RF Buffer
The gain requirement for low-power input signals implies the need to cascade multiple gain stages. The primary requirements on the output clock signal generated by the
RF buffer are:
• The output signal should have a fast rise time so that the differential pair
transistors of the prescaler spend less time in the linear region. This achieves
lower total noise contributed by the transistors [5]. The output waveform should
effectively be as close to a square waveform as possible with signal swings of 0.6 V.
• Based on DC biasing conditions, if the clock signal amplitude is large, the differential pair transistors driven by the clock could swing into the triode region. This is
undesirable for robust current switching.
With the above requirements in consideration, a cascade of amplifiers was designed.
The first stage does a single-ended to differential conversion and the remaining stages
provide increasing gain (Figure 4.9). In order to limit the output swing, each of the
gain stages has a common mode load resistance with about 0.9 V drop across it. This is
effectively like working with a 1.6 V supply. To shape sinusoidal waveforms into square
waves, more amplifier stages and higher current are needed to sharpen the
edges of the clock.
FIGURE 4.9: CMOS RF buffer. [Schematic labels: RF input −20dBm−0dBm; In / Out CM = 1.25 V; DC = 1.25V; 0.9V; resistors 750, 2k, 500 and 50k; capacitor 1nF; device sizes 50u/0.25u, 27u/0.25u and 104u/0.5u; bias currents 1.2mA; output Out2; multiple differential pair stages in cascade.]
The load resistances were chosen to allow no more than 0.6 V swing across them.
The device sizes were chosen using a parametric sweep of the transistor widths to obtain a
high-gain, square waveform for a fixed current and swing. If the transistor W/L is large,
its $g_m$ is high, but it loads the previous stage as well. So the parametric sweep yields an
optimum device size for high gain. Only the first stage, which performs the single-ended to
differential conversion, does not require the supply level-shifting resistance.
In order to generate square waveforms for worst case conditions (slow transistors,
hot temperature conditions), 12 gain stages were required, with about 1.2 mA in
each stage. To characterize the “square”-ness of the waveforms, the rise/fall time of the
current through the final output stage (time to swing from 5% to 95% of final value) was
expressed as a proportion of the clock period (the clock frequency needs to be ≥ 2.5 GHz). In the worst corner of
slow process/hot temperature, the rise time by this criterion is about 26% of
the clock period. Effectively, the differential pair is in the linear region about 52% of the
time. For nominal case operation, this rise time is 15% of the clock period.
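The percentages above translate into absolute times as follows. This is a small sketch that assumes equal rise and fall times and one rising plus one falling edge per clock period; the function name is illustrative.

```python
def transition_budget(freq_hz, rise_fraction):
    """Return (rise time in seconds, fraction of each period spent in transitions).

    Assumes equal rise and fall times and two transitions per clock period.
    """
    period_s = 1.0 / freq_hz
    rise_s = rise_fraction * period_s
    return rise_s, 2.0 * rise_fraction


if __name__ == "__main__":
    for label, fraction in (("slow/hot corner", 0.26), ("nominal", 0.15)):
        rise_s, busy = transition_budget(2.5e9, fraction)
        print(f"{label}: rise time {rise_s * 1e12:.0f} ps, {busy * 100:.0f}% of period in transition")
```

At 2.5 GHz the period is 400 ps, so a 26% rise time is about 104 ps and the two edges together occupy about 52% of each period; the nominal 15% corresponds to about 60 ps.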
The output swing can achieve the desired 600 mV with 6 gain stages. The additional stages and larger current can be traded-off with the noise criterion. Some of the
clock buffer output waveforms (Figure 4.10) are shown here to illustrate the performance
of the design. This buffer’s power consumption is unacceptably high and the number of
cascaded stages would increase the clock jitter.
FIGURE 4.10: Clock waveforms : CMOS buffer output. [Panels: clock waveforms with 12-stage CMOS buffer; clock waveforms with 9-stage CMOS buffer. Horizontal axis: Time (s).]
4.5.2. BiCMOS RF Buffer
The significant advantages of bipolar transistors for high frequency analog applications were outlined in the previous chapter. The much higher $g_m/I_d$ of the device makes
it possible to obtain higher bandwidths. The RF amplifier/buffer shown in Figure 4.11
can be implemented with just two gain stages using the available NPN devices for the
differential pairs. The primary reason why the power supply had to be level shifted in the
CMOS implementation is that source-follower level shifters have an AC gain at 2.5 GHz
of about 0.4, therefore degrading the gain obtained from the differential amplifiers. In
the case of bipolar transistors, the small-signal AC response of emitter followers has a
wide bandwidth so that the small-signal gain is approximately unity. This enables full
utilization of the power supply.
FIGURE 4.11: BiCMOS RF Buffer. [Schematic labels: single-ended to differential conversion stage; resistors 1.8k, 2k and 12.5k; capacitors 27pF; device sizes 16u/0.5u, 4u/0.5u and 20u/0.5u; current 200uA; outputs CLK and CLKB with 0.6V clock swing; nodes VDD, Bias.]
The design issues for the bipolar buffer are listed below:
• The bias currents should be set such that the transistors work to the left of the
peak $f_T$ point. The $f_T$ vs current characteristic was obtained for the available npn
transistors and the bias point chosen accordingly.
• The swing of the clock signals must be maintained such that the collector-base
junction is not forward biased at any instant. The resistances and the bias currents
must be set such that the largest input signal amplitude does not saturate the npn
transistors.
• The frequency response of the bipolar buffer must allow for sufficient gain to switch
currents completely at the final output stage. The total capacitance at the common
emitter junction of the differential pairs determines the high frequency response
of the buffer and should be minimized. As bipolar transistors suffer from large collector-substrate capacitance, MOS current sources are used to bias the transistors.
The single-ended to differential conversion cannot be implemented as in the case of the
CMOS buffer due to the mismatch introduced by base currents. So the common modes
of the differential pair transistors of the first stage are set by independent references,
generated by resistive voltage division.
4.6. Output Buffer
The dual-modulus prescaler in a conventional frequency synthesizer drives a counter
or the phase-frequency detector of the PLL. So the load of the prescaler is usually capacitive, i.e., the gate capacitance of a few transistors. For this test chip, since the
dual-modulus prescaler was implemented as a stand-alone block, it needs larger drive
capability. The output buffer needs to be designed to drive a 50 Ω resistance and at
least a 5 pF capacitive load. The biggest concern with driving these large loads is the
amount of current that is drawn during voltage swings of the prescaler’s output.
To match the output impedance as well as to supply this large current, a source-follower/emitter-follower buffer may be used. The currents may be of the order of a few
milliamps. MOS implementations using source-followers required large device sizes to
support the current. These devices may then load the prescaler, affecting its operating
speed. A more feasible alternative to satisfy this high drive capability is to use bipolar
transistors (Figure 4.12). To avoid large DC currents due to the output resistance, the
buffer is AC coupled to the load. The sourcing current to charge the output node needs
to be supported by the transistor of the emitter follower. Multiple transistors are used
in parallel so that each transistor only carries current corresponding to its optimum $f_T$.
The sink current, however, is provided externally through resistances.
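FIGURE 4.12: BiCMOS Output Buffer. [Schematic labels: inputs In+ and In-; base currents IB+ and IB-; 8x devices; capacitors 100nF and 10pF; 50 loads; 300mV swing; resistor 5.8k; device sizes 4u/0.25u and 16u/0.5u; nodes VDD, bias.]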
A new issue that crops up with the use of bipolar drivers is the large base current
component. With a collector current gain β of 100, the base current could be as high
as 250 µA. The isolation buffer between the prescaler and the bipolar devices needs to
supply the extra base current into the input of the emitter follower.
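For reference, the base-current figure above follows from the usual relation between collector and base current. The peak collector current of 25 mA shown here is inferred from the quoted 250 µA and β = 100; it is not stated explicitly in the text.

```latex
I_B = \frac{I_C}{\beta}
\quad\Longrightarrow\quad
I_C = \beta\, I_B = 100 \times 250\,\mu\mathrm{A} = 25\,\mathrm{mA}
```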
4.7. Layout Considerations
The layout of RF building blocks is very important in realizing expected performance.
The main layout considerations and techniques to optimize the dual-modulus
prescaler system are summarized below.
4.7.1. Symmetry Considerations
One of the highlights of implementing the dual-modulus prescaler system with
current-mode logic is the advantage of common-mode noise immunity achieved by its
differential and symmetric operation. The effect of random mismatches could be significant in
introducing clock jitter, especially with small device sizes. Since the prescaler and every
flip-flop should have large-signal symmetry, the dummy transistors of Figure 4.13
ensure that the output nodes of the differential amplifier in the first level of stacking see the same
signal swing. The obvious drawback is that the capacitive loading is now asymmetrical
on the output nodes. However, since the transistor sizes are quite small, the dominant
capacitances on these asymmetric output nodes are the gate capacitance of the positive
feedback latch and the interconnect capacitance. So this asymmetry is expected to be
less significant than signal path symmetry. Conventional techniques, such
as sharing the common source terminal of differential pairs and implementing all transistors
as multiples of unit transistors, have also been used in drawing the layout.
4.7.2. Synchronous Divider Floorplan
As explained previously, the pulse-swallow feedback limits the speed of operation. So, to optimize the layout with speed consideration, the delays in the critical
synchronous divider path need to be reduced. This circuit is dominated by interconnect
capacitance, so it is imperative to ensure that wiring delays are minimized, especially
in the synchronous divider. Analysis of the critical delay paths showed
that placing the pulse-swallow flip-flop DFF3 as close as possible to the first-stage
flip-flop DFF1 reduces the feedback delay. The synchronous divide-by-4 operation
between DFF1 and DFF2 will have a smaller delay than the propagation path of
the divide-by-5. A simple floor plan of the layout of the dual-modulus prescaler, not
including the modulus-selection logic, is shown in Figure 4.14.
FIGURE 4.13: Flip-Flop3 with dummy devices to maintain signal symmetry. [Schematic labels: dummy transistors to match differential signal path; NOR gate with Resetp and Resetn inputs; Rload = 5.8K; Rbias; device sizes 2u/0.25u and 8u/0.5u; nodes VDD, D, Dbar, CLK, CLKB, Q, Qbar.]
FIGURE 4.14: Floor plan to optimize pulse-swallow feedback delays. [Blocks: RF Buffer, DFF1, DFF2, DFF3, CML Gate, Level Shift, DFF4 (Asynchronous), Output Buffer.]
It was verified from simulations on the extracted netlists that this placement of the
flip-flops and combinational logic ensures that division breaks down for both moduli
at approximately the same frequency. If the layout floor plan had been implemented
in the sequence of the block diagram of Figure 4.1, the divide-by-9 operation would break
down at an input clock frequency approximately 300 MHz below that of the divide-by-8
operation. Clearly, a higher operating speed for the divide-by-N stage alone is not
desired. By shuffling the arrangement of the flip-flops, we are trading off the speed of
the synchronous division against the speed of the pulse-swallow division.
4.7.3. Minimization of Interconnect Capacitance
As the interconnect capacitance at the output node of each differential amplifier/latch dominates the RC propagation delay, a study of the metal layers used for routing
was performed. Further, we have a choice between two types of routing: using a top-level
metal (M4, M5) that is routed above the rest of the layout to connect end-to-end
flip-flops, or routing over a longer path using lower-level (M1, M2, or M3) metal, avoiding
signal crossing. It was observed that routing using M5 right over the rest of the layout
has lower RC parasitics. Parasitic coupling between the signal lines and the power buses
can be avoided with this strategy as well. Simulations using the extracted netlist confirm
the above deductions. Another important rule that was followed in the layout was to
isolate the clock lines from overlapping signal lines to the maximum extent possible to
avoid disturbing the critical reference frequency.
The top-level layout of the dual-modulus prescaler is shown in Figure 4.15. The
prescaler, input and output buffer layout is captured in Figure 4.16.
FIGURE 4.15: Top-level chip snap-shot.
FIGURE 4.16: Layout.
CHAPTER 5. SIMULATION AND MEASUREMENT RESULTS
This chapter is a discussion of the simulation results obtained and a performance
evaluation of the prototype chip. The dual-modulus prescaler, analyzed and described
in the previous chapter, was designed and laid out using the CADENCE simulation
environment. A chip was fabricated using the National BiCMOS8i 0.25 µm, 5-metal
layer process. The chip was mounted on a 20-pin ultra-thin chip scale package (UTCSP)
provided by National Semiconductor. The die photograph of the fabricated chip is shown
in Figure 5.1.
FIGURE 5.1: Die photograph of OSU-Prescaler test chip.
5.1. SpectreS Simulations
The overall system that was simulated is shown in Figure 5.2. The dual-modulus
prescaler and the input/output buffer blocks have been discussed in the earlier chapter.
The modulus-selection logic is usually determined by the program counter and swallow
counter data in integer-N PLLs or, by means of a randomized modulator output, as in the
case of a ∆Σ fractional-N synthesizer. For the prototype testing a worst-case switching
was decided upon. Worst case switching happens when the modulus is changed between
N and N+1 every other complete cycle. This can be easily realized using a divide-by-2
asynchronous block clocked by the 8/9 modulus output from the prescaler.
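The worst-case switching pattern can be illustrated with a simple behavioural model. The sketch below is not the circuit: it only counts input clock cycles and toggles the modulus after every output cycle, mimicking a divide-by-2 block clocked by the prescaler output. The function names and the choice to start in divide-by-8 mode are assumptions of the example.

```python
def prescaler_output_edges(n_input_cycles, n=8):
    """Input-cycle indices at which the prescaler completes an output cycle.

    The modulus alternates between N and N+1 after every output cycle,
    as it would if Mod_Select came from a divide-by-2 on the output.
    """
    edges = []
    use_n_plus_1 = False   # start in divide-by-N mode (assumption)
    count = 0
    for cycle in range(1, n_input_cycles + 1):
        count += 1
        modulus = n + 1 if use_n_plus_1 else n
        if count == modulus:
            edges.append(cycle)
            count = 0
            use_n_plus_1 = not use_n_plus_1   # divide-by-2 toggles Mod_Select
    return edges


def output_periods(edges):
    """Length of each output cycle, measured in input clock cycles."""
    return [b - a for a, b in zip([0] + edges[:-1], edges)]


if __name__ == "__main__":
    edges = prescaler_output_edges(34)
    print(edges, output_periods(edges))
```

In this model the output periods alternate between 8 and 9 input cycles, so the average division ratio is 8.5 and the modulus-control path is exercised on every output cycle.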
FIGURE 5.2: Top-level of dual-modulus prescaler implementation. [Blocks: RF input (-20dBm - 0dBm), RF Buffer, CMOS Dual-Modulus Prescaler, Output Buffer (outputs fout and fout bar), Divide-by-2 generating the Mod_Select signal.]
The operation of the prescaler was verified by looking at the transient waveforms
and checking for dual-modulus operation. The slowest case operation was obtained
with the slow process transistor models and hot temperature (85 °C). A typical transient
simulation result of the prescaler operating at 2.75 GHz is shown in Figure 5.3.
Although the fast-case transistor models and higher input power levels appear
better for speed performance, a possible issue could be the robustness of the buffers. As
shown in the transient waveform of Figure 5.4, the bipolar buffers tend to saturate at
0 dBm input signal power levels.
FIGURE 5.3: Prescaler output waveform for slow process/hot temperature corner. [Waveform annotations: Divide-by-8, Divide-by-9.]
It was observed in simulations that the divide-by-9 operation limits the operating frequency
of the prescaler. Following the waveform delays through the pulse-swallow feedback loop indicates that the propagation delays through the asynchronous
divider output and modulus-selection logic cause a time delay between the output of
DFF2 and the RESET signal on DFF3. This delay suppresses the pulse-swallow signal
even when mod-select has been asserted. The simulation results indicating the measured
delays encountered in the pulse-swallow operation are shown in Figure 5.5.
FIGURE 5.4: Prescaler output waveforms for 0 dBm input, fast/cold operating corner.
5.2. Measurement Set-Up
The prescaler measurement set-up is shown in Figure 5.6. The chip’s input/output
pins have been labelled for easy identification. The input to the chip is supplied by
a signal generator, whose control voltage is ramped to obtain progressively higher clock
frequencies. The power levels are also controlled to verify operation of the RF buffer.
The speed of the prescaler is controlled by varying the current to each stage
of the synchronous divider. The current bias is set by an external reference voltage
or resistance. Since the bias currents of the RF buffer, prescaler and output buffer are
all mirrored from the same bias transistor, increasing the bias current worsens the power
consumption of all three blocks.
FIGURE 5.5: Dual-modulus division operation breakdown. [Diagram labels: blocks FF2 (Q2), FF3 (Q3), NOR, AND, Control, Level Shift/Buffer, Asynchronous ÷2, CLK; delay annotations CK to Q = 115ps, 65ps, 85ps, CK to Q = 180ps; total propagation delay ~330ps.]
FIGURE 5.6: Test setup. [Diagram labels: Signal Generator; OSU Prescaler with pins RFin+, RFin−, Divout+, Divout−, VREF, Bias, VDD, RFbuffVDD, OutbuffVDD, VSS; C = 27 pF capacitor to AC couple the DUT; C = 27pF dummy capacitance for matching RF input; Oscilloscope / Universal Counter.]
5.3. Measurement Results
The measurement results of the dual-modulus prescaler operation are tabulated
in this section. The chip testing yielded the following measurement observations
and inferences:
• The prescaler operating frequency varies with input power level. The characteristic of the divider speed when input levels are varied from -25 dBm to 0 dBm
(corresponding to signal amplitudes of 18 mV – 320 mV) has been obtained. As
shown in Figure 5.7, low power levels are likely to give a lower operating frequency
because of the low noise margins. Very low power, high-frequency signals may be
amplified insufficiently by the RF buffer, inducing metastability in the CML latches of
the prescaler. At higher power levels (≥ -16 dBm), corresponding to higher amplitudes, the second stage of the bipolar buffer may be driven into saturation. As
bipolar transistors take a longer time to come out of saturation, this could be the reason
for lower clock speeds. Note that this implies the prescaler could still be capable of
functioning and the dual-modulus operation is restricted by the buffer. Optimum
prescaler speeds were obtained at signal amplitudes of ≈ 50 mV (-16 dBm).
• The maximum operating frequencies obtained from 21 chip samples and the optimum power levels and biasing currents at which they were obtained are tabulated
in Table 5.1.
• The output waveform is a divide-by-8/divide-by-9 alternating waveform, which has
been observed operational up to a frequency of 2.1 GHz with about 2 mA consumed
in the CMOS prescaler. Figure 5.8 shows the output waveform as observed on the
oscilloscope, with the specified biasing and signal conditions.
• When the bias current is set such that the current mirrored through the buffers
and the prescaler is according to the design, the maximum obtainable dual-modulus
operation frequency is 1.85 GHz (Figure 5.9).
• Since the operation seems limited by the RF buffer stage transistors saturating, an
experiment was done with higher supply voltage levels for the buffer alone. The
frequency of operation was found to increase by only 10 MHz when the supply voltage
was increased from 2.5 V to 3 V.
TABLE 5.1: Chip performance over 21 samples.
Sample number   Max. frequency (GHz)   Input signal level (dBm)   Prescaler I (mA)
1               2.06                   -16                        2.10
2               1.85                   -16                        2.15
3               1.92                   -15                        2.10
4               2.00                   -16                        2.00
5               2.00                   -23                        2.30
6               2.06                   -15                        2.15
7               1.96                   -23                        1.96
8               1.81                   -16                        1.94
9               2.00                   -15                        2.45
10              2.00                   -16                        2.27
11              1.97                   -16                        2.35
12              2.03                   -19                        2.31
13              1.8                    -16                        1.96
14              2.10                   -15                        2.33
15              2.11                   -20                        2.38
16              1.85                   -20                        1.95
17              1.86                   -20                        1.78
18              1.8                    -21                        1.75
19              1.94                   -23                        2.06
20              1.9                    -19                        2.10
21              2.06                   -17                        2.46
Maximum prescaler operating frequency mean = 1.96 GHz
Standard deviation σ = 0.10 GHz
FIGURE 5.7: Operating frequency variation with input signal levels. [Plot: dual-modulus operation frequency vs signal power; horizontal axis: signal amplitude (dBm), −25 to 0; vertical axis: input frequency (GHz), 1.4 to 2.]
FIGURE 5.8: Measured operation at 2.1 GHz, 2.1 mA prescaler current, -16dBm input signal.
FIGURE 5.9: Measured operation at 1.85 GHz, 1.3 mA prescaler current, -16dBm input signal.
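The summary statistics quoted with Table 5.1 can be rechecked from the tabulated maximum frequencies. The sketch below uses the sample standard deviation from Python's standard library; whether the quoted σ is the sample or the population value is not stated, and both round to the same figure here.

```python
import statistics

# Maximum operating frequency (GHz) of samples 1..21, copied from Table 5.1
MAX_FREQ_GHZ = [
    2.06, 1.85, 1.92, 2.00, 2.00, 2.06, 1.96, 1.81, 2.00, 2.00, 1.97,
    2.03, 1.8, 2.10, 2.11, 1.85, 1.86, 1.8, 1.94, 1.9, 2.06,
]


def summarize(samples):
    """Return (mean, sample standard deviation) of the measurements."""
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean_ghz, sigma_ghz = summarize(MAX_FREQ_GHZ)
    print(f"mean = {mean_ghz:.2f} GHz, sigma = {sigma_ghz:.2f} GHz over {len(MAX_FREQ_GHZ)} samples")
```

Both values agree with the quoted mean of 1.96 GHz and standard deviation of 0.10 GHz when rounded to two decimal places.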
5.4. Conclusions
An analysis of high-performance dual-modulus prescalers has been presented in
this thesis. A design methodology was developed to implement high-speed dividers consuming the lowest possible power. The system level issues of the pulse-swallow topology were
investigated and modifications incorporated to optimize the propagation delays. The circuit was designed keeping in mind robustness to extremes of process and temperature.
The performance has been validated with measurements on a test chip fabricated in the
National BiCMOS8i 0.25 µm, 5-metal technology.
Future work on this topic could involve investigating other dual-modulus architectures like those in [28] and obtaining performance comparisons. The preliminary results of
this design could be used to design and fabricate the entire frequency synthesizer, as that
could open up several research avenues based on the influence of the various blocks in
the mixed-signal chip.
BIBLIOGRAPHY
1. B. Razavi, RF Microelectronics, New Jersey: Prentice Hall, 1998.
2. L. Dai, Design of High-Performance CMOS Voltage Controlled Oscillators, Ph.D.
thesis, University of Minnesota, Department of Electrical and Computer Engineering, 1996.
3. T. Lee, The Design of CMOS Radio-Frequency Integrated Circuits, Cambridge, UK:
Cambridge University Press, 1998.
4. L. Lin, Design Techniques for High-Performance Frequency Synthesizers for Multi-Standard Wireless Communication Applications, Ph.D. thesis, University of California, Department of Electrical Engineering and Computer Science, 2000.
5. A. Hajimiri and T. Lee, “A general theory of phase noise in electrical oscillators,”
IEEE Journal of Solid-State Circuits, vol. 33, no. 2, pp. 179–194, February 1998.
6. U. Moon, K. Mayaram, and J. Stonick, “Spectral analysis of time-domain phase
jitter measurements,” IEEE Transactions on Circuits and Systems–II: Analog and
Digital Signal Processing, vol. 49, no. 6, pp. 321–327, May 2002.
7. T.A.D. Riley, M.A. Copeland, and T.A. Kwasniewsky, “Sigma-Delta modulation
in fractional-N frequency synthesis,” IEEE Journal of Solid-State Circuits, vol. 28,
no. 1, pp. 553–559, May 1993.
8. K. Shu, E. Sanchez-Sinencio, F. Maloberti, and U. Eduri, “A comparative study
of digital sigma-delta modulators for fractional-N synthesis,” in Proc. of IEEE
International Conf. Electronics, Circuits and Systems, September 2001, pp. 1391–1394.
9. F. Gardner, “Charge pump phase-lock loops,” IEEE Transactions on Communications, vol. COM-28, pp. 1849–1858, November 1980.
10. J. Hein and J. Scott, “Z-domain model for discrete-time PLLs,” IEEE Transactions
on Circuits and Systems–II: Analog and Digital Signal Processing, vol. 35, no. 6,
pp. 1393–1400, November 1988.
11. H. Knapp, J. Bock, M. Wurzer, G. Ritzberger, K. Aufinger, and L. Treitinger, “2
GHz/2 mW and 12 GHz/ 30 mW dual-modulus prescalers in silicon bipolar technology,” in Proceedings of the Bipolar/BiCMOS Circuits and Technology Meeting,
September 2000, pp. 164–167.
12. L. Tournier, M. Sie, and J. Graffeuil, “A 14.5 GHz, 0.35-micron frequency divider
for dual-modulus prescaler,” in IEEE Radio Frequency Integrated Circuits (RFIC)
Symposium, June 2002, pp. 227–230.
13. B.-U Klepser, “SiGe Bipolar 5.5 GHz dual-modulus prescaler,” IEE Electronics
Letters, vol. 35, no. 20, pp. 1728–1730, September 1999.
14. T.S. Aytur and B. Razavi, “A 2-GHz, 6-mW BiCMOS frequency synthesizer,”
IEEE Journal of Solid-State Circuits, vol. 30, no. 12, pp. 1457–1462, December
1995.
15. X.Li, Evaluation of Radio Frequency CMOS Integrated Circuit Technology for Wireless LAN Applications, Ph.D. thesis, University of Florida, Department of Electrical
and Computer Engineering, 2003.
16. E. Abou-Allam, T. Manku, T. Ting, and M. Obrecht, “Impact of technology scaling
on CMOS RF devices and circuits,” in IEEE Custom Integrated Circuits Conference, May 2000, vol. 1, pp. 361–364.
17. B. Gilbert, “Why bipolar?,” Analog Avenue Columns, ChipCenter Electronics Group, April 1998, http://www.chipcenter.com/analog/c009.htm.
18. B. Razavi, Design of Analog CMOS Integrated Circuits, New York: McGraw-Hill,
2001.
19. A. Hastings, The Art of Analog Layout, New Jersey: Prentice Hall, 2001.
20. D.J. Allstot, S.-H. Chee, S. Kiaei, and M. Shrivastawa, “Folded source-coupled
logic vs. CMOS static logic for low-noise mixed-signal ICs,” IEEE Transactions on
Circuits and Systems–I: Fundamental Theory and Applications, vol. 40, no. 9, pp.
553–563, September 1993.
21. Jason Musicer, An Analysis of MOS Current Mode Logic for Low Power and High
Performance Digital Logic, Ph.D. thesis, University of California, Department of
Electrical Engineering and Computer Science, 2000.
22. M. Mizuno, M. Yamashina, K. Furuta, H. Igura, H. Abiko, K. Okabe, A. Ono, and
H. Yamada, “A GHz MOS adaptive pipeline technique using MOS current-mode
logic,” IEEE Journal of Solid-State Circuits, vol. 31, no. 6, pp. 784–791, June 1996.
23. M. Alioto, G. Palumbo, and S. Pennisi, “Delay estimation of SCL gates with output
buffer,” in Proc. of IEEE International Conf. Electronics, Circuits and Systems,
September 2001, pp. 719–722.
24. J. Craninckx and M. Steyaert, “A 1.75-GHz, 3-V dual-modulus divide-by-128/129
prescaler in 0.7-micron CMOS,” IEEE Journal of Solid-State Circuits, vol. 31, no.
7, pp. 890–897, July 1996.
25. J. Rabaey, Digital Integrated Circuits: A Design Perspective, New Jersey: Prentice
Hall, 1996.
26. D. A. Johns and K. Martin, Analog Integrated Circuit Design, New York: John
Wiley and Sons, 1997.
27. C.-Y. Yang and S.-I. Liu, “Fast-switching frequency synthesizer with a
discriminator-aided phase detector,” IEEE Journal of Solid-State Circuits, vol. 35,
no. 10, pp. 1445–1452, October 2000.
28. K. Shu, E. Sanchez-Sinencio, J. Silva-Martinez, and S.H.K. Embabi, “A 2.4-GHz
monolithic fractional-N frequency synthesizer with robust phase-switching prescaler
and loop capacitance multiplier,” IEEE Journal of Solid-State Circuits, vol. 38, no.
6, pp. 866–874, June 2003.