Imperial College London
Department of Electrical and Electronic Engineering

Analogue Circuits For Low Power Communication

Mark David Tuckwell
2010

Supervised by Dr. Christos Papavassiliou

A thesis submitted for the degree of Doctor of Philosophy in Electrical and Electronic Engineering of Imperial College London and the Diploma of Imperial College London

Declaration

I herewith certify that all material in this dissertation which is not my own work has been properly acknowledged.

Mark David Tuckwell

Abstract

Low power electronic circuits are required to extend the operational time of battery operated devices. They are also needed to reduce the power consumption of equipment in general, especially as the world tries to cut energy usage. The first section of this thesis explores fundamental and implementation limits for low power circuits. The energy requirements of amplification are presented and a lower bound on the energy required to transmit information over a point to point link is proposed. It is evident from the survey of low power limits that when a transistor is biased, significant thermodynamic energy is required to reduce the resistance of the channel. A transmitter is presented that turns on a transistor for only 0.1% of the transmitted time. This transmitter approximates a Gaussian pulse by allowing the impulse responses of two 2nd order transmitting elements to sum in free space. The transmitter is of low complexity and the receiver architecture ensures that no on-line tuning is required. Measured results indicate that, using coherent detection, a 1 Mbps link over a distance of 50 mm with a bit error rate of 10⁻³ can be achieved. The bandwidth of the transmitted pulse is 30-37.5 MHz and 30 dB of out of band attenuation is provided. An analogue Gabor transform is described which splits a signal into parallel paths of lower bandwidth. This enables post processing at lower clock rates, which can reduce energy dissipation.
An implementation of the transform using sub-threshold CMOS continuous time filters is presented. A novel method for designing low power gm-C filters using simple models of identical transconductors is used to specify transistor sizes. Measured results show that the transform consumes 7 µW for an input signal bandwidth of 4 kHz.

Acknowledgements

I would like to thank Dr. Christos Papavassiliou for his excellent supervision and guidance during my PhD. I would also like to thank the members of the CAS group for their useful advice and help during my time at Imperial. I would like to thank my fiancée Katie for her patience and support over the last few years. Finally I would like to thank my parents and my sister Hannah for many years of encouragement and support.

Contents

1 Introduction
  1.1 Communication
  1.2 Research Contributions
  1.3 Thesis Structure
2 Low Power Limits
  2.1 Introduction
  2.2 Information Transfer
  2.3 Classical Limits
    2.3.1 The Shannon-Hartley theorem
    2.3.2 Compression of a gas
    2.3.3 Thermal Equilibrium
    2.3.4 Achieving the kT ln 2 limit
    2.3.5 Beating the kT ln 2 limit
    2.3.6 Approaching the kT ln 2 limit
  2.4 Quantum Limits
    2.4.1 Uncertainty Principle
    2.4.2 Energy Per Operation
    2.4.3 Blackbody Radiation
    2.4.4 Blackbody Communication Limit
    2.4.5 Signal Representation Using Photons
  2.5 Implementation Limits
    2.5.1 Stein Limit
    2.5.2 Driving An RC Interconnect
    2.5.3 Power Per Pole
  2.6 Matched Transmission Line
    2.6.1 Energy Per Bit For Matched Information Transfer
    2.6.2 Comparison of Analogue and Digital Signal Representation
  2.7 Exploration of Information Transfer Through an Amplifier
    2.7.1 Information and Physical Energy
    2.7.2 Information Flow Through a Practical LNA
    2.7.3 Theoretical vs Practical Output Bit Energy
    2.7.4 Energy Required for Biasing
    2.7.5 Black Box Analogue Amplifier
    2.7.6 Conclusion
  2.8 Electromagnetic Limits
    2.8.1 Friis Limit
    2.8.2 Friis-Kraus Electromagnetic Limit
    2.8.3 Near Field
  2.9 A New Electromagnetic Lower Bound
    2.9.1 A numerical example
    2.9.2 Comparison With Current Standards
    2.9.3 Limitations Of This Bound
    2.9.4 Conclusion
  2.10 Biological Examples
  2.11 Conclusion
3 Pulsed Communication
  3.1 Introduction
  3.2 Comparison of FSK and PPM
    3.2.1 Minimum PPM Power
    3.2.2 Transmitter Complexity
    3.2.3 Channel Block Coding
  3.3 PPM Simulation
    3.3.1 PPM Spectrum Shape
    3.3.2 Orthogonal Signalling
    3.3.3 Pseudo Orthogonal BER
    3.3.4 PPM Detection
    3.3.5 Correlation Detection
    3.3.6 Matched Filter Detection
  3.4 Truncated Gaussian
  3.5 Gaussian Approximations
    3.5.1 All Pole Approximation
    3.5.2 Cascade of Poles
    3.5.3 Padé Approximation
    3.5.4 Creating bandpass filters
  3.6 Gaussian Approximation BER
  3.7 Conclusion
4 Communication using 2nd Order TX Elements
  4.1 Introduction
  4.2 Making Use of Antenna Topology
    4.2.1 Efficiency of the 2nd Order Element
  4.3 Pulse Decomposition
    4.3.1 Minimal Decomposition
    4.3.2 Maximal Decomposition
    4.3.3 TX Element Coupling
  4.4 2nd Order Receiving Element
    4.4.1 Implicit Matched Filter
    4.4.2 Coupling Matrix for N=2
    4.4.3 Transmission Distance
    4.4.4 Attenuation Map
  4.5 Demodulation
    4.5.1 Coherent Detection
    4.5.2 Sub Sampling
    4.5.3 Non Coherent Detection
  4.6 BER Performance
    4.6.1 Accuracy of Estimation
    4.6.2 Preamble Length
  4.7 Tuning
  4.8 Circuit Design
    4.8.1 I-Q Delay Circuit
    4.8.2 Pulse Generation Circuit
  4.9 Measured Results
    4.9.1 Transmitted Pulse
    4.9.2 Orthogonal Pulse
    4.9.3 Receiver Demodulation
    4.9.4 Power Consumption
    4.9.5 Measured BER Performance
  4.10 Comparison with State of the Art
    4.10.1 Integration
  4.11 Conclusion
5 An Analogue Gabor Transform
  5.1 Introduction
  5.2 The Gabor Transform
  5.3 Implementation of Convolution
    5.3.1 Number of Filters and Coefficient Rate
    5.3.2 Noise Analysis
  5.4 Design of the State Space Filter from Impulse Response Specifications
  5.5 Designing a Low Power gm-C Filter
    5.5.1 Noise and Distortion
    5.5.2 Mismatch
    5.5.3 Bandwidth and Output Resistance
    5.5.4 A Low Power gm-C Design Method
  5.6 Measured Results
    5.6.1 Impulse and Bode Response
    5.6.2 Centre Frequency Variation
    5.6.3 Power Consumption and SINAD
    5.6.4 Bit Error Test Performance
    5.6.5 Analysis Window Cross Correlation
    5.6.6 Transform Comparison
  5.7 Conclusion
6 Conclusion
  6.1 Future Work
7 Published Work
Bibliography
A Derivation of switching energy for a square wave
B A lower bound on the energy for a point to point communication link
  B.1 System Temperature
  B.2 Antenna Noise Temperature
  B.3 Lower Bound on Transmission Energy
  B.4 Energy per bit
  B.5 Isotropic and practical antennas
    B.5.1 Parabolic dish antenna
    B.5.2 Dipole antenna
C The relationship between SNR and Eb/N0
  C.1 Square wave SNR
D Time-Frequency Uncertainty
  D.1 Uncertainty Calculation
  D.2 Rectangular Pulse
  D.3 Sinusoidal Pulse
  D.4 Gaussian Pulse
  D.5 Gaussian Derivative
  D.6 Truncated Gaussian Pulse
E BER Simulation
  E.1 Correlation Detection
  E.2 Matched Filter Detection
  E.3 BER Results
F Impulse Approximation
G Inductive Coil Characterisation
  G.1 Lumped Element Model
    G.1.1 Measured Coil Characteristics
  G.2 Coupling Measurements
H 2nd Order Approximation
I Transmitter Coupling
J 2nd Order Element Temperature Change
K 2nd Order Element Tuning
L Pulse Generation and Receiver Schematics

List of Tables

2.1 Bit energies for several common communication standards.
2.2 Fundamental Classical Limits.
2.3 Fundamental Quantum Limits.
2.4 Implementation Limits.
2.5 Biological vs Electronic Energy Examples.
3.1 Examples of some pulse based transmitters for short range communications.
3.2 Comparison of time-frequency uncertainty for a variety of pulse shapes.
3.3 Comparison of bandwidth and attenuation for various pulses.
3.4 Approximated pulses with an attenuation of > 35 dB.
4.1 Characteristics for the 3rd order baseband minimal pulse.
4.2 Characteristics for the 3rd order baseband maximal pulse.
4.3 Characteristics for the N=2 pulse.
4.4 Transmission distance using matched TX and RX 2nd order elements.
4.5 Preamble frequency, phase and offset estimation algorithm.
4.6 Comparison of some transmitters suitable for short range communication.
5.1 Tradeoffs in filter design.
5.2 Comparison of model and BSIM3 simulation.
5.3 Variation in centre frequency for fixed bias currents.
5.4 Summary of measured results.
5.5 Cross Correlation of Cos Analysis Windows.
5.6 Transform Comparison.
List of Figures

2.1 Compression of an ideal gas using a piston.
2.2 Blackbody radiation spectrum also showing Wien's displacement law.
2.3 Minimum energy required per bit when representing a signal in terms of a Planck oscillator.
2.4 The ideal integrator requires a power of 8kT·SNR·f.
2.5 A typical interconnect with matched source and load impedances.
2.6 Information transfer using parallel digital, serial digital and analogue representations.
2.7 Energy per bit required to represent a signal in analogue and digital formats.
2.8 The mapping of free information to bound information.
2.9 Power and noise spectra of an idealised amplifier.
2.10 The MOSFET acting as an information isolator.
2.11 Comparison of output bit energy for many LNAs across a range of frequencies.
2.12 Low pass filter ideal spectrum.
2.13 Energy per bit lower bound for free space communications for a distance of 10 m.
3.1 An example of FSK and PPM modulation for L=4.
3.2 Block diagram of a transmitter capable of FSK or PPM modulation.
3.3 Examples of some typical circuit topologies for wireless transmitters.
3.4 Power consumption of PLL frequency generation circuits.
3.5 Comparison between 2PSK (antipodal) and M-ary FSK (orthogonal) modulation schemes.
3.6 Comparison of bit error rates for convolutional block coding.
3.7 Simulated power dissipation of a convolutional encoder.
3.8 Simulation of the power spectrum for the PPM information sequence.
3.9 The effect of cross correlation on the energy requirements of ML detection.
3.10 Possible detection schemes when the impulse of the transmit filter is sent across a channel.
3.11 The effect of using the same approximated filter for the transmit filter and matched filter.
3.12 Truncated Gaussian time domain pulse for T = 1.
3.13 Truncated Gaussian pulse frequency domain response for T = 1.
3.14 Attenuation achieved by the truncated Gaussian pulse.
3.15 Bandwidth-time product for the truncated Gaussian pulse.
3.16 Out of band attenuation versus αT².
3.17 All pole approximation of the Gaussian pulse function for αT² = 6 and T = 1.
3.18 Frequency response of the all pole Gaussian pulse approximation for αT² = 6 and T = 1.
3.19 Out of band attenuation vs αT² for the cascade of poles approximation.
3.20 Cascade of poles approximation of the Gaussian pulse function for N=2 and N=8.
3.21 Frequency response of the cascade of poles approximation for N=2 and N=8.
3.22 Out of band attenuation versus αT² for the Padé approximation.
3.23 Time domain response of the Padé approximation for N=4.
3.24 Time domain response of the Padé approximation for N=6.
3.25 Frequency response of the Padé approximation for N=4.
3.26 Frequency response of the Padé approximation for N=6.
3.27 Extra transmitter energy required over ideal orthogonal pulses when using a correlation detector.
3.28 Extra transmitter energy required over ideal orthogonal pulses when using the approximated matched filter detector.
4.1 Circuit diagram showing a 2nd order transmitting element.
4.2 Using the inductor as a transmitter in a direct modulation approach and a pulse based approach.
4.3 Sum of 2nd order impulse responses to form the desired pulse using a minimal decomposition.
4.4 Spectral comparison of the minimally decomposed pulses.
4.5 Sum of 2nd order impulse responses to form the desired pulse using a maximal decomposition.
4.6 Diagram indicating the coupling between each transmitting element and each receiving element.
4.7 Coupling paths when using two transmit and receive antennas.
4.8 Geometry of the transmit antennas.
4.9 Out of band attenuation with the transmit antennas placed 20 mm apart.
4.10 Block diagram of the transmitter and receiver for the pulse based communications link.
4.11 Effect of transmit distance on BER performance for an N=2 pulse with 4 time slots.
4.12 Effect of centre frequency shift on BER performance for an N=2 pulse.
4.13 Mean and variance of frequency, phase and symbol offset estimation.
4.14 Top level schematic of the transmitter.
4.15 Circuit diagram of the digital pulse delay circuit.
4.16 Circuit diagram of the digital pulse generator.
4.17 Photograph of the transmitter module.
4.18 Photograph of the receiver module.
4.19 Block diagram of measurement setup used to evaluate BER performance.
4.20 Measured spectrum of the transmitter magnetic field on the centreline of the inductors at a distance of 5 mm.
4.21 Measured time domain pulses of the transmitter magnetic field.
4.22 Measured time domain orthogonal pulses at 33 MHz.
4.23 Measured time domain orthogonal pulses after sub sampling operation.
4.24 A train of pulses measured at the receiver including the preamble sequence.
4.25 Modelled and measured power consumption of the transmitter.
4.26 BER performance over distance for orthogonal coherent detection.
4.27 BER performance versus Eb/N0 for orthogonal coherent detection.
4.28 BER performance over distance for non coherent detection.
4.29 BER performance versus Eb/N0 for non coherent detection.
5.1 Information diagram illustrating the discrete time and frequency energy.
5.2 Truncation of the Gaussian pulse.
5.3 Generation of a single coefficient using the direct filter approach.
5.4 Generation of a single coefficient using the time domain approach.
5.5 Coefficient time line showing the generation of the coefficients.
5.6 Block diagram of the gm-C filter.
5.7 Circuit diagram of the G matrix implementation using identical transconductors and capacitors.
5.8 Circuit diagram of the C matrix implementation.
5.9 Generation of the bias currents for a single complex filter using current mirrors.
5.10 Diagram of a simple transconductor.
5.11 Measured impulse response of each filter.
5.12 Die photograph showing a single complex filter.
5.13 Measurement setup.
5.14 Plot showing 200 overlaid cos 2500 Hz analysis windows for a single chip.
5.15 Measured Bode plot of the cos filters for each chip.
5.16 Measured Bode plot of the sin filters for each chip.
5.17 Variation in the impulse response of the cos 2500 Hz filter for each chip with a fixed bias current.
5.18 Variation in the frequency response of the cos 2500 Hz filter for each chip with a fixed bias current.
5.19 Bit error comparison.
5.20 Cross correlation error and bit error performance comparison between each chip.
A.1 A single pole RC filter.
D.1 Approximation of the time-frequency uncertainty for the Gaussian, Gaussian derivative, sinusoidal and rectangular pulses.
E.1 BER for L = 2 for correlation and approximated matched receivers.
E.2 BER for L = 4 for correlation and approximated matched receivers.
E.3 BER for L = 8 for correlation and approximated matched receivers.
E.4 BER for L = 16 for correlation and approximated matched receivers.
F.1 Approximation to the impulse function.
F.2 Frequency response of the approximated impulse response.
G.1 Measurement setup for characterisation of the inductive transmitter element.
G.2 Measured and modelled characteristics of a 10 mm long coil.
G.3 Measured and modelled characteristics of a 10 mm long coil with a 100 pF ± 10% capacitor in parallel with the coil.
G.4 Ideal circuit for measuring the coupling constant.
I.1 Circuit showing the coupling between two transmitter elements.
I.2 Pole plot showing how the poles of two transmitting elements vary as the coupling constant k is increased.
I.3 Frequency response of the Gaussian approximation with various coupling constants.
L.1 Top level block diagram of the transmitter and receiver circuits.
L.2 TX schematic.
L.3 TX2 schematic.
L.4 TX3 schematic.
L.5 TXpower schematic.
L.6 RXamp schematic.
L.7 RXamp2 schematic.
L.8 RX DAC schematic.
L.9 RXpower schematic.

1 Introduction

1.1 Communication

Communication is a word that describes the transferal of information from one point to another. The Oxford dictionary defines communication as [1]:

1 the action of communicating.
2 a letter or message.
3 (communications) means of sending or receiving information, such as telephone lines or computers.
4 (communications) means of travelling or of transporting goods, such as roads or railways.

The subject of communication in engineering often deals with information theoretic concepts, for example Shannon's paper [2]. A transmitter and receiver are a special case where electronics is used to transfer macroscopic information such as speech or video. In general, all circuits carry out communication between circuit elements: at a microscopic level, electrons pass information around a circuit. Therefore, communication is quite a general term that can be used to describe both macroscopic and microscopic information transfer. In this thesis an exploration of the fundamental communication limits is presented and two low power circuits motivated by these limits are described.

1.2 Research Contributions

The following list describes the main research contributions of this thesis:

• A comprehensive review of the physical limitations on the energy required by electronic circuits has been undertaken.
• A hypothesis for why energy is required by an amplifier is proposed.
• A minimum energy bound for transmission between two antennas is derived.
• A low power analogue pulse based transmitter using 2nd order transmitting elements has been described and implemented.
• A low power analogue Gabor transform using 180 nm CMOS technology has been fabricated.

1.3 Thesis Structure

Chapter 2 contains a review and discussion of fundamental and implementation limits related to low power electronic circuits. A discussion of the energy requirements of amplification is presented, and a lower bound on the transmit energy required to send information over a point to point link is proposed.

In Chapter 3 the advantages of using a pulse based scheme such as Pulse Position Modulation (PPM) over Frequency Shift Keying (FSK) are discussed.
Several continuous time approximations for generating Gaussian pulse shapes are shown, and the bit error rate performance of using these continuous time Gaussian approximations with PPM is presented.

A novel architecture for the transmission of information using tuned 2nd order transmitter elements is shown in Chapter 4. Here PPM is used together with two approximately orthogonal pulses to reduce the power consumption and complexity of the transmitter circuit. A possible application for this circuit is low power medical transmitters.

Chapter 5 describes the design and implementation of an analogue Gabor transform. The transform allows a signal to be split into several parallel paths, each of a lower bandwidth than the original signal. A possible application of this is in sensor networks, where an input signal can be transformed and then sampled using several low rate A/D converters instead of a single high rate converter. The transform is implemented in 180 nm CMOS technology.

Finally, conclusions and suggestions for future work are presented in Chapter 6.

2 Low Power Limits

2.1 Introduction

The quest for low power electronics is of particular importance for portable battery powered circuits, where longer operational times are required. Low power circuits are also needed to reduce the power consumption of fixed equipment, especially as the world tries to cut energy usage. This chapter provides a survey of physical and implementation limits related to minimal energy electronics. The aim of the survey is to give the reader some understanding of why energy is needed when carrying out a task such as computation or the transferal of information. In addition to the survey, a discussion of the energy requirements of amplification is presented (Section 2.7) and a lower bound on the energy required to transmit information over a point to point link is proposed (Section 2.9).
2.2 Information Transfer

Information transfer can be defined as the movement of knowledge from one place to another, and there are many different ways in which to move information. For example, information could be printed on paper, packed onto the back of a lorry and dropped off at a shop for someone to buy, walk home with and read at their leisure. Alternatively, the same information can be sent through the 'air' as electromagnetic waves which arrive at a handheld terminal for the user to read. A widely used technique of information transfer is via the internet, which may consist of wired and wireless links spanning many countries. Transferal of information between people using speech happens on a day to day basis.

Each form of information transmission and reception has its own energy cost. For example, the energy required to transfer the contents of a newspaper in print form would include the energy to print the pages, the energy to transport the newspaper and the energy used by the customer when picking the newspaper up from a shop. This whole process typically takes a few hours. The same information could be broadcast using electromagnetic waves (i.e. television), and the energy cost for this form of broadcast would include that for the transmitter and the receiver.

Research into low power communication can be divided into two distinct subjects: fundamental physical limits and implementation specific limits. The fundamental physical limits are those which describe limitations in terms of fundamental constants that do not change throughout the universe [3]¹. The implementation limits are those where some form of circuit structure, device or modulation scheme is assumed.
2.3 Classical Limits

2.3.1 The Shannon-Hartley theorem

The Shannon-Hartley theorem [2] describes the maximum rate at which information can be transferred through a channel with additive white Gaussian noise (AWGN):

\[
C \leq B \log_2\left(1 + \frac{P}{N}\right) \tag{2.1}
\]

where C is the channel capacity in bits/s, B is the channel bandwidth in Hz, P is the signal power and N is the thermal noise power. By setting N = kTB and rearranging (2.1) to find the ratio P/C, the minimum energy per bit of information transfer can be found [4]–[8]:

\[
\frac{P}{C} = kT\,\frac{B}{C}\left(2^{C/B} - 1\right) \tag{2.2}
\]

The minimum energy is found in the limit C/B → 0:

\[
\frac{P}{C} = E_{bit} \geq kT\ln 2 \tag{2.3}
\]

In order to derive the minimum energy it is necessary to set the thermal noise power to N = kTB. This is the minimum noise floor, achieved when the link is impedance matched and the noise factor is unity. The limit shown in (2.3) is only valid when the bandwidth is infinite or the rate of transmission is infinitesimally small. This is equivalent to assuming that the information transmission occurs sufficiently slowly to maintain thermodynamic equilibrium [9]. In [8] the result of (2.3) is derived by considering an ideal CMOS transistor operating as part of an inverter; that derivation essentially ignores the capacitance of the MOS transistor, which in effect implies an infinite bandwidth. Adding further evidence that this may be the lower bound on energy per bit, Levitin [10] proved using quantum theory that (2.3) is the minimum energy per bit.

¹ The view that fundamental constants do not change is simplistic. According to many, it is an open question whether fundamental constants change throughout the universe or with time.

2.3.2 Compression of a gas

Using the laws of thermodynamics it is possible to prove that the minimum energy required to compress a gas isothermally to half its volume is kT ln 2 per molecule [11].
Consider a piston immersed in a heat bath of temperature T which remains constant throughout the process, figure 2.1. By undertaking this compression the number of places where the gas molecules can reside is reduced by half, i.e. one bit of information is lost.

Figure 2.1: Compression of an ideal gas using a piston. To reduce the volume of the gas by half requires NkT ln 2 J of energy.

The work done on the gas by compressing it from an initial volume V₁ to a final volume V₁/2 can be written as:

W = −∫_{V₁}^{V₁/2} (NkT/V) dV = NkT ln 2   (2.4)

where N is the number of gas molecules. In the limit of a single molecule of gas, kT ln 2 J of energy is required for the loss of a single bit of information. This is an interesting result because the same basic result as (2.3) has been found using a different set of physical assumptions, indicating that the limit of kT ln 2 joules per bit is fundamental.

2.3.3 Thermal Equilibrium

It is important to remember that circuits designed to date do not operate in thermal equilibrium. For example, the temperature of a resistor that is short circuited will tend towards the temperature of the surroundings. However, as soon as a voltage is applied across the resistor, or a current passed through it, external energy is supplied to the resistor, which causes it to reach a temperature higher than that of the surroundings. The circuit is then no longer in thermal equilibrium as there is a flow of heat between the resistor and the surroundings. The same is true for modern processors. A processor containing millions of gates switching at GHz speeds becomes very hot; to stop the device from overheating a cooling fan is added to remove heat from the device. This simple example shows that practical electronic circuits do not operate in thermal equilibrium, so it should be expected that practical circuits will require more power than the fundamental limit of kT ln 2 joules per bit.
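The agreement between the communication-theoretic route (2.2)–(2.3) and the thermodynamic route (2.4) can be checked numerically. The following minimal sketch (assuming T = 300 K) evaluates both expressions and shows that each converges on kT ln 2:

```python
import math

k = 1.380649e-23  # Boltzmann constant, J/K
T = 300.0         # assumed bath temperature, K

def shannon_bit_energy(cb):
    """Energy per bit P/C = kT*(B/C)*(2^(C/B) - 1), eq. (2.2)."""
    return k * T * (2.0**cb - 1.0) / cb

def compression_work(N, V1, V2):
    """Isothermal work done on N ideal-gas molecules compressed V1 -> V2, eq. (2.4)."""
    return -N * k * T * math.log(V2 / V1)

limit = k * T * math.log(2.0)                 # kT ln 2 ~ 2.87e-21 J at 300 K
print(shannon_bit_energy(1e-9) / limit)       # -> 1.0 in the limit C/B -> 0
print(compression_work(1, 1.0, 0.5) / limit)  # -> 1.0 for a single molecule
```

Both ratios approach unity, illustrating that the two sets of physical assumptions yield the same kT ln 2 joules-per-bit limit.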
2.3.4 Achieving the kT ln 2 limit

Proakis [12] shows that the kT ln 2 limit can be met when representing information using an infinite number of orthogonal signals. In this region, C/B → 0, the system is power limited. This is in contrast to bandwidth limited systems, such as amplitude and phase modulation schemes, which have C/B > 1. The upper bound on the probability of error when using a set of M orthogonal signals is given by [12]:

P_e ≤ 2 exp[−(log₂ M/2)(E_b/N₀ − 2 ln 2)],   E_b/N₀ > 4 ln 2,
P_e ≤ 2 exp[−(log₂ M)(√(E_b/N₀) − √(ln 2))²],   ln 2 < E_b/N₀ ≤ 4 ln 2.   (2.5)

Chapter 4 makes use of orthogonal signalling to reduce the power consumption of a transmitter, albeit with a number of orthogonal signals much less than infinity. Eq. (2.3) is of somewhat limited practical value due to the requirement of infinite bandwidth or an infinitesimally small rate of transmission. However, there are attempts in the literature to get closer to the limit using adiabatic logic. There is also a school of thought which suggests that a reversible computer requiring zero energy could be constructed. These two cases are discussed in the following sections.

2.3.5 Beating the kT ln 2 limit

In 1961 Landauer [13] (reprinted 2000 [14]) introduced the idea that it may be possible to carry out computation without dissipation. The reason why it may be possible to beat the limit lies in the fact that physical reversibility exists under thermal equilibrium [9]. Under this condition energy can theoretically be conserved when switching between states. Physical reversibility is a consequence of the Second Law of Thermodynamics [9]. A reversible process is one where a change in a system can be reversed in order to return the system and the surrounding environment to their original condition. Consider the example of the piston (Section 2.3.2) where kT ln 2 J of energy is required to compress the gas to half its original volume under isothermal assumptions.
If the piston is now retracted so that the gas is allowed to expand to twice its volume, then the net energy loss between compression and expansion is zero. It should be noted that this example is a physical ideal and neglects any friction at the side walls. In fact reversibility is a theoretical concept in any physical system, as an infinite amount of time would be required to change state isothermally. Any process which involves friction, heat transfer across a temperature difference or electrical resistance is irreversible.

  In daily life, the concepts of Mr. Right and Ms. Right are also idealisations, just like the concept of a reversible (perfect) process. People who insist on finding Mr. or Ms. Right to settle down with are bound to remain Mr. or Ms. Single for the rest of their lives. The possibility of finding a perfect prospective mate is no higher than the possibility of finding a perfect (reversible) process. Likewise, a person who insists on perfection in friends is bound to have no friends. [15]

This quote gives some philosophical indication that finding a reversible process in practice is highly unlikely. Even if physical reversibility could be achieved, the very act of measurement requires energy [4], [13], [16]–[18]. In essence the measurement apparatus is reset to a standard state ready for the next measurement, i.e. information is erased between measurements, thus dissipating energy. Reversible computing aims to prevent this information erasure and so reduce energy dissipation. This is achieved using logical reversibility, where a machine can carry out a number of computations but is still able to return to a previous state. Landauer [13] makes the distinction that a machine is logically reversible only if each of the individual steps is logically reversible. Therefore, at each step the machine must remember some information so that it can get back to the previous state.
Attempts at implementing circuits using reversible computation have been made, but these typically use adiabatic circuits to try to achieve thermal equilibrium and thus reduce power consumption.

2.3.6 Approaching the kT ln 2 limit

There have been several attempts at producing adiabatic logic circuits, where gradual switching of a transistor is used to keep the circuit close to thermal equilibrium throughout the switching process [19]–[21]. The basic principle of adiabatic switching can be seen by considering the energy required to make a logic transition on an RC interconnect. Typically a fast rising edge voltage is used to make the transition, and the energy dissipation is then proportional to CV². Younis [20] has shown that by considering current steps instead of voltage steps, the energy dissipation per cycle becomes:

E_cycle = C V_dd² (2RC/T)   (2.6)

where T is the period of the cycle. Hence by increasing the time of the operation the energy per cycle can be reduced. For implementation purposes an ideal current source can be replaced by a voltage ramp [20]. Typically the clocking requirements of circuits using adiabatic logic are more complex than ‘standard’ voltage stepped logic. Adiabatic logic requires several voltage ramps at different phases, requiring a clock generation circuit which consumes considerable energy. For an 8 bit microcontroller implemented using adiabatic logic, over 50 % of the energy is consumed by the clock generator [22], [23]. In this microprocessor the energy per operation is less than 10 pJ. Gong [24] has shown that an adiabatic implementation of an 8 bit shift register can reduce energy consumption by 60-80 % compared with conventional logic. Fabricated results at operating frequencies up to 15 MHz are presented in [24]. It is clear that adiabatic logic can play an important role in reducing power, provided that the speed at which digital circuits operate can be reduced. This can typically be achieved by using parallelism.
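The scaling in (2.6) can be illustrated with a short sketch. The node capacitance, interconnect resistance and supply voltage below are hypothetical illustrative values; step charging is taken to dissipate CV_dd²/2 per charging event:

```python
def adiabatic_energy(C, Vdd, R, T):
    """Energy dissipated per cycle with ramp (adiabatic) charging, eq. (2.6)."""
    return C * Vdd**2 * (2.0 * R * C / T)

def step_energy(C, Vdd):
    """Energy dissipated charging the same node with an abrupt voltage step."""
    return 0.5 * C * Vdd**2

C, Vdd, R = 100e-15, 1.0, 1e3              # hypothetical: 100 fF node, 1 kOhm
fast = adiabatic_energy(C, Vdd, R, 1e-9)   # 1 ns ramp: little benefit
slow = adiabatic_energy(C, Vdd, R, 1e-6)   # 1 us ramp: 1000x less energy
print(step_energy(C, Vdd), fast, slow)
```

Slowing the ramp by three orders of magnitude reduces the dissipated energy by the same factor, which is precisely why adiabatic logic favours slow, parallelised operation.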
To this end Chapter 5 outlines a method to convert an analogue signal into several parallel paths of lower bandwidth. This would then enable a digital processor to process the signal at lower switching speeds, allowing exploitation of adiabatic logic circuits to reduce power in communication processing circuits.

2.4 Quantum limits

Quantum limits are those which consider the use of single particles to store, compute or transfer information. A single particle in a potential V is described deterministically by the Schrödinger wave equation:

−(ℏ²/2m) ∂²Ψ(x,t)/∂x² + V(x,t)Ψ(x,t) = iℏ ∂Ψ(x,t)/∂t   (2.7)

where ℏ is Planck's reduced constant, m is the mass of the particle, x is the position of the particle, t is the time, V(x,t) is the potential energy acting on the particle and Ψ(x,t) is the solution to the wave equation, a wave function. The general form of the wave function for free particles (V = 0) is:

Ψ(x,t) = A e^{i(kx−ωt)} + B e^{−i(kx−ωt)}.   (2.8)

Even though the wave function deterministically describes the position of a single particle, it is not possible to directly measure the position, energy or momentum of the particle without introducing error, because the act of measurement will affect the particle's position, energy or momentum. This is explained by the Heisenberg uncertainty principle (see Section 2.4.1). A useful property of the wave function is that the probability density function of a particle can be found by multiplying the wave function by its complex conjugate:

P(x,t) = Ψ*(x,t)Ψ(x,t)   (2.9)

The function P(x,t) can then be used to find expected values and variances of the particle's position. In effect the observation of the position of a single particle can only be made with a degree of uncertainty. The quantum wave function, when used to describe a group of particles, leads to an equation for acceleration which matches Newton's law a = F/m in the limit of a large number of particles.
Therefore, quantum mechanics is an underlying principle of classical mechanics. Formally, classical mechanics is quantum mechanics in the limit ℏ → 0.

2.4.1 Uncertainty Principle

The uncertainty principle is important as it embodies the impossibility of making a measurement without disrupting the particles being measured. The uncertainty relation between position and momentum is [25], [26]:

Δx Δp ≥ h/4π.   (2.10)

The corresponding energy-time uncertainty is given by:

ΔE Δt ≥ h/4π.   (2.11)

As pointed out by Bremermann [26], the interpretation of the energy-time uncertainty is not trivial. Essentially the uncertainty bounds state that it is impossible to measure one aspect of a quantum system without affecting the others. It is worth noting that only Gaussian type functions satisfy these bounds with equality. The energy-time uncertainty relation can be interpreted as noise [26]: for a given measurement time there is an uncertainty in the energy. This interpretation is used by Bremermann to find the maximum capacity of a photon channel. Rearranging for the maximum energy gives [26]:

E_max = İh / ln(1 + 4π)   (2.12)

where E_max is the maximum signal energy and İ is the information rate. This result shows that increasing the rate at which information is transferred requires the maximum signal energy to increase. The energy per bit in this case can be expressed as:

E_bit ≥ E_max/(İτ) = h/(τ ln(1 + 4π))   (2.13)

where τ is the length of the recorded signal. The shorter the received signal, the more energy per bit is required; faster rates of information transfer require more power. A similar result is found by Bekenstein [27], who derived his result by considering black hole theory and system entropy:

E_bit = ℏ ln 2/(πτ).   (2.14)

A similar bound has also been derived by Pendry [28], which written in terms of the energy per bit is:

E_bit = Ė/İ ≥ (3ℏ (ln 2)²/π) İ.   (2.15)

Both (2.14) and (2.15) show that the energy per bit increases as the rate of transmission increases.
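The rate dependence of the Bekenstein bound (2.14) and the Pendry bound (2.15) can be made concrete with a minimal sketch; the signal durations and rates used below are illustrative values only:

```python
import math

hbar = 1.054571817e-34  # reduced Planck constant, J s

def bekenstein_bit_energy(tau):
    """E_bit = hbar*ln2/(pi*tau), eq. (2.14): shorter signals cost more."""
    return hbar * math.log(2.0) / (math.pi * tau)

def pendry_bit_energy(rate):
    """E_bit >= 3*hbar*(ln2)^2*Idot/pi, eq. (2.15): faster rates cost more."""
    return 3.0 * hbar * math.log(2.0)**2 * rate / math.pi

print(bekenstein_bit_energy(1e-9))   # energy per bit for a 1 ns signal
print(pendry_bit_energy(1e9))        # energy per bit at 1 Gbit/s
```

Both bounds grow as the signalling becomes faster, in contrast to the rate-independent classical kT ln 2 limit.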
2.4.2 Energy Per Operation

In the quantum computation literature [29]–[31] there is evidence that a single state change, i.e. a change to a distinguishable/orthogonal state, requires a minimum amount of energy:

E ≥ πℏ/(2Δt)   (2.16)

where Δt is the time required to undertake the operation. As in the classical case, it is evident that the longer the operation takes, the less energy is required. This result implies that quantum computing architectures can potentially perform operations with much greater speed and less energy than their classical counterparts. For example, with kT ln 2 J of energy a quantum evolution rate of 5.5 T operations per second could be achieved. This result is not surprising, as a classical system would require many elementary particles to undertake a similar operation; many quantum state changes are required for a single classical state change.

2.4.3 Blackbody Radiation

Blackbody radiation theory describes the amount of energy radiated by objects. The theory was originally devised by Planck, with further work carried out by Wien, Stefan and Boltzmann [25]. Initially the spectrum of radiated electromagnetic energy was found experimentally, with Planck formalising the spectrum by writing the intensity of radiation as:

I(f,T) = (2hf³/c²) · 1/(e^{hf/kT} − 1)   Js⁻¹m⁻²Sr⁻¹Hz⁻¹   (2.17)

Wien's displacement law describes the frequency at which maximum radiation occurs:

f_max = α kT/h   (2.18)

where α = 2.821439 is a fitting constant. For example, the maximum radiation from a blackbody at room temperature (300 K) occurs at approximately 18 THz. The Stefan-Boltzmann law gives the total power emitted from a blackbody per unit area:

P_max = εσT⁴   Wm⁻²   (2.19)

where ε is the emissivity, which lies between zero and unity; a perfect blackbody has ε = 1. σ ≈ 5.67 × 10⁻⁸ Wm⁻²K⁻⁴ is Stefan's constant. A plot of the blackbody spectrum is provided in figure 2.2.

Figure 2.2: Blackbody radiation spectrum, also showing Wien's displacement law.
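The figures quoted above can be reproduced directly from (2.18) and (2.19); a minimal sketch:

```python
h = 6.62607015e-34   # Planck constant, J s
k = 1.380649e-23     # Boltzmann constant, J/K
sigma = 5.67e-8      # Stefan's constant, W m^-2 K^-4
alpha = 2.821439     # Wien fitting constant

def wien_peak(T):
    """Frequency of maximum radiation, f_max = alpha*k*T/h, eq. (2.18)."""
    return alpha * k * T / h

def radiated_power(T, emissivity=1.0):
    """Total emitted power per unit area, eq. (2.19)."""
    return emissivity * sigma * T**4

print(wien_peak(300.0))        # ~1.76e13 Hz, i.e. ~18 THz at room temperature
print(radiated_power(300.0))   # ~459 W/m^2 for a perfect blackbody
```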
2.4.4 Blackbody Communication Limit

In [32] a bound for a point to point link is postulated:

İ ≤ (512π⁴ A_t A_r P³ / (1215 h³c²d²))^{1/4}   (2.20)

where A_t and A_r are the areas of the transmit and receive antennas and d is the distance between the antennas. This limit assumes that the receiver can measure the position and time of arrival of the photons. In terms of energy per bit this limit can be written as:

E_bit ≥ (1215 h³c²d² R / (512π⁴ A_t A_r))^{1/3}   (2.21)

Notice that the bit energy in (2.21) increases as the rate of transmission increases, although with a weaker dependence than for the bounds derived directly from the uncertainty principle. As an example, consider a transmitter and receiver which each have an area of 1 m² and are 1 m apart. For a bit rate of 1 Gbit/s the minimum bit energy would be 86 × 10⁻²⁷ J. The above limits are fundamental and, as has been hinted at, they require the measurement of momentum, position and time of the photons. They provide no indication of how to achieve the limits, so their usefulness to practical engineering is somewhat limited. In Section 2.9 a lower bound on the energy per bit for a point to point communications link is proposed via recourse to blackbody radiation theory. That limit is based on a point to point link with known antenna sizes and is more closely related to present-day practical engineering problems.

2.4.5 Signal Representation Using Photons

In this section work by Gabor [33] is revisited in order to derive an expression for the energy per bit required to represent a signal, valid in both the quantum and classical regions. This comparison shows that the energy required per bit increases with operating frequency in the quantum case, which suggests that less energy is required for signal representation when operating in the classical case.
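Returning briefly to the point to point bound of Section 2.4.4, the worked example given there (1 m² antennas, 1 m apart, 1 Gbit/s) can be reproduced numerically from (2.21); a minimal sketch:

```python
import math

h = 6.62607015e-34    # Planck constant, J s
c = 2.99792458e8      # speed of light, m/s

def blackbody_bit_energy(rate, At, Ar, d):
    """E_bit >= (1215*h^3*c^2*d^2*R / (512*pi^4*At*Ar))^(1/3), eq. (2.21)."""
    return (1215.0 * h**3 * c**2 * d**2 * rate
            / (512.0 * math.pi**4 * At * Ar)) ** (1.0 / 3.0)

E = blackbody_bit_energy(1e9, 1.0, 1.0, 1.0)
print(E)   # ~8.6e-26 J, the 86e-27 J quoted in the text
```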
Gabor considers that a signal can be split into elementary information cells, each of which obeys the time-frequency uncertainty. He proposes that each of the elementary information cells can then be represented using a Planck oscillator resonating at a particular frequency. One way of representing information is to encode it in the amplitude of each cell. Gabor [33] shows that the number of uniformly quantised levels s for a signal represented with N photons is given by:

s² = 4N / (1 + 2Ē_T/hf)   (2.22)

where Ē_T is the energy associated with a Planck oscillator:

Ē_T = hf / (e^{hf/kT} − 1).   (2.23)

The total energy required by N photons is:

E = Nhf   (2.24)

and the number of bits is:

n = log₂(s).   (2.25)

Thus the energy per bit of information represented by each information cell can be written as:

E_bit = 2^{2n−2} hf + 2^{2n−1} kT.   (2.26)

Eq. (2.26) contains a low frequency and a high frequency asymptote, corresponding to the so-called classical and quantum regions of operation respectively. Figure 2.3 shows the quantum and classical parts of (2.26). This supports the previously seen fact that in the classical region the energy per bit is independent of frequency, whereas in the quantum region the energy is proportional to frequency.

Figure 2.3: Minimum energy required per bit when representing a signal in terms of a Planck oscillator.

2.5 Implementation Limits

In this section a number of results based on implied implementation specifics are discussed. As expected, the minimum energy found from these limitations is larger than the fundamental limits discussed above, hinting that the optimal way of reaching the fundamental limits has not yet been discovered.
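Before turning to implementation limits, the two asymptotes of the Gabor expression (2.26) can be checked with a short sketch (T = 300 K assumed):

```python
h = 6.62607015e-34   # Planck constant, J s
k = 1.380649e-23     # Boltzmann constant, J/K

def gabor_bit_energy(f, n, T=300.0):
    """Energy required for an n-bit information cell, eq. (2.26)."""
    return 2.0**(2 * n - 2) * h * f + 2.0**(2 * n - 1) * k * T

# Classical region (hf << kT): energy set by kT, essentially independent of f
assert abs(gabor_bit_energy(1e6, 1) - gabor_bit_energy(1e8, 1)) \
       < 0.01 * gabor_bit_energy(1e6, 1)
# Quantum region (hf >> kT): energy grows roughly in proportion to f
assert gabor_bit_energy(1e16, 1) > 9 * gabor_bit_energy(1e15, 1)
```

The crossover between the two regions occurs near hf ≈ 2kT, around 10 THz at room temperature, as seen in figure 2.3.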
2.5.1 Stein Limit

Stein [34] analysed the probability of distinguishing two voltage (current) levels when driving a capacitive (inductive) node in the presence of thermal noise, akin to switching in CMOS digital circuits. With a decision level half way between the supply rails, the energy per switching operation (i.e. bit change) can be expressed as:

E_op = 4kT [erfc⁻¹(2P_e)]²   (2.27)

where P_e is the probability of an error in the decision. As a numerical example, Stein argues for a bit error rate of 10⁻¹⁹ for a large digital switching system. In this case 165 kT is needed per switching operation. For a communication link, where coding and error correction are applied, a bit error rate of 10⁻³ may be suitable. In this case the amount of energy required per switching operation is only 19 kT.

2.5.2 Driving An RC Interconnect

Krishnapura [35] has shown that the minimum power required to drive an RC circuit with a sinusoid is:

P_min = 2πkT f_s SNR   (2.28)

where f_s is the signal frequency (which in this case is the same as the cut-off frequency) and SNR is the signal to noise ratio at the output of the filter. By extending the analysis it is also possible to show that the power required can be written in terms of the filter cut-off frequency ω_c as:

P_min = kT SNR (ω_s²/ω_c).   (2.29)

In [36] Andreou and Furth derive a similar result for driving an RC interconnect:

P = 4kT SNR f₀ [1 − (f_p/f₀) arctan(f₀/f_p)]   (2.30)

where f_p is the message bandwidth and f₀ is the cut-off frequency of the filter. When the cut-off is placed at the 3 dB point, f₀ = f_p, the power reduces to:

P = 0.86 kT SNR f_p.   (2.31)

The difference between (2.28) and (2.31) occurs because (2.31) assumes a uniformly distributed signal over a bandwidth f_p, whereas (2.28) assumes a sinusoidal input signal. This highlights the difficulty in producing universal lower bounds, as much depends on the signal representation. In many practical cases an interconnect is not driven by a sinusoid.
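The 0.86 coefficient in (2.31) follows directly from (2.30); a quick sketch (the physical values here are illustrative only):

```python
import math

k = 1.380649e-23  # Boltzmann constant, J/K

def sinusoid_power(fs, SNR, T=300.0):
    """Minimum power to drive an RC node with a sinusoid, eq. (2.28)."""
    return 2.0 * math.pi * k * T * fs * SNR

def uniform_signal_power(fp, f0, SNR, T=300.0):
    """Andreou/Furth result, eq. (2.30), for a uniform signal of bandwidth fp."""
    return 4.0 * k * T * SNR * f0 * (1.0 - (fp / f0) * math.atan(f0 / fp))

# At the 3 dB point f0 = fp the bracket evaluates to 4*(1 - pi/4)
coeff = uniform_signal_power(1.0, 1.0, 1.0) / (k * 300.0)
print(coeff)   # ~0.86, recovering eq. (2.31)
```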
An interesting case is that of a square wave, which is typically used by digital computers as a clock signal. The square wave can be represented as a Fourier series where the number of terms taken is related to the rise time of the pulse; see Appendix A for the derivation. The minimum energy per switching operation in this case is:

E_op = 2kT SNR_sq   (2.32)

where SNR_sq is the required SNR of the square wave. The SNR can be calculated by considering the probability that the signal is above a certain threshold, as hinted at in Section 2.5.1. The SNR required for binary detection is well known (see Appendix C):

SNR_sq = [erfc⁻¹(2P_e)]².   (2.33)

Therefore, the energy per operation required for driving an RC load with a square wave is given by:

E_op = 2kT [erfc⁻¹(2P_e)]².   (2.34)

Eq. (2.34) appears to be 3 dB better than the result derived by Stein. However, Stein assumed unipolar switching, whereas bipolar switching was assumed here. If bipolar switching is used with Stein's proof then the same result is obtained. To put the result into some kind of perspective, consider the number of flip-flops in a modern digital IC. There are many clocked devices (flip-flops, registers and latches) present on each IC and each of these devices requires a clock. For an IC with 1 × 10⁹ transistors, where there are on average 25 transistors per clocked device and 75 % of the chip contains clocked devices, 30 × 10⁶ clock paths would be required. Any misinterpretation of the clock would result in a system crash. For an error rate of 10⁻¹⁹ the energy per clock path is 82.5 kT and for the total chip is 2.5 × 10⁹ kT per clock edge. If the chip were clocked at 3 GHz the power dissipation solely from the clock paths would be around 40 mW at 100 °C.

Figure 2.4: The ideal integrator requires a power of 8kT SNR f.
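The clock-path example above can be reproduced from (2.34). The sketch below is minimal: the bisection inverse of erfc stands in for a library routine, and the chip statistics are those assumed in the text:

```python
import math

k = 1.380649e-23  # Boltzmann constant, J/K

def erfc_inv(y):
    """Invert the monotonically decreasing erfc by bisection on [0, 10]."""
    lo, hi = 0.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if math.erfc(mid) > y else (lo, mid)
    return 0.5 * (lo + hi)

def clock_edge_energy_kT(Pe):
    """Energy per switching operation in kT units, eq. (2.34)."""
    return 2.0 * erfc_inv(2.0 * Pe)**2

# Text's example: 1e9 transistors, 25 per clocked device, 75 % clocked
paths = 1e9 * 0.75 / 25.0                          # -> 30e6 clock paths
E_chip_kT = paths * clock_edge_energy_kT(1e-19)    # total kT per clock edge
P_clock = E_chip_kT * k * 373.0 * 3e9              # 3 GHz clock at 100 C
print(paths, clock_edge_energy_kT(1e-19), P_clock) # a few tens of mW
```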
2.5.3 Power Per Pole

In [37] Vittoz derives a power limitation based on an ideal transconductor driving a capacitive load, figure 2.4. This limit is valid for any technology which uses an active device to drive a capacitive load, such as CMOS, bipolar and other transistor technologies. The power required per pole is:

P_pole = 8kT SNR f (V_dd/V_pp)   (2.35)

where V_pp is the peak amplitude of the sinusoidal input and V_dd is the supply voltage of the transconductor. The minimum power per pole is found by setting V_pp equal to the supply voltage:

P_pole(min) = 8kT SNR f.   (2.36)

The fact that the minimum power occurs when the signal amplitude is maximised suggests that the signal amplitude should be maximised at all times. To take advantage of this, a technique known as the analogue floating point technique [38] was invented. This can be implemented using an automatic gain control circuit.

Figure 2.5: A typical interconnect with matched source and load impedances. The minimum energy per bit delivered by the source is 2kT ln 2 J/bit.

2.6 Matched Transmission Line

In this section the energy per bit for transferring information using a matched transmission line is derived. This analysis also enables a comparison between analogue and digital signal representations in terms of the energy required and the number of bits of resolution used. Appendix C shows the relationship between SNR and E_b/N₀. E_b/N₀ is the energy to noise ratio required at the load and this has a minimum value of ln 2.

2.6.1 Energy Per Bit For Matched Information Transfer

Consider the circuit shown in figure 2.5, where R_S and R_L are the source and load resistances. The power delivered by the source is:

P_S = [(R_S + R_L)/R_L] N₀ B SNR   (2.37)

with N₀ given by:

N₀ = 4kT [R_S/(R_S + R_L)]².   (2.38)

The energy per bit for the circuit shown in figure 2.5 is thus:

E_bit = 4kT [R_S²/(R_L(R_S + R_L))] (E_b/N₀).   (2.39)

For a matched link, R_S = R_L, the energy per bit in terms of E_b/N₀ is:

E_bit = 2kT (E_b/N₀).   (2.40)

This result indicates that 2kT ln 2 J/bit is required when transmitting information across a matched link, twice the minimum limit.
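A short numerical check of (2.39) and (2.40), assuming T = 300 K and 50 Ω source and load impedances:

```python
import math

k = 1.380649e-23  # Boltzmann constant, J/K

def source_bit_energy(Rs, Rl, EbN0, T=300.0):
    """Energy per bit delivered by the source, eq. (2.39)."""
    return 4.0 * k * T * Rs**2 / (Rl * (Rs + Rl)) * EbN0

# Matched link at the minimum Eb/N0 = ln 2 reduces to 2kT ln 2, eq. (2.40)
E = source_bit_energy(50.0, 50.0, math.log(2.0))
print(E / (k * 300.0 * math.log(2.0)))   # -> 2.0, twice the kT ln 2 limit
```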
2.6.2 Comparison of Analogue and Digital Signal Representation

There have been several attempts to determine whether an analogue or a digital signal representation requires the lower energy. Enz and Vittoz [38], [39] compare the power per pole required by an analogue filter to a digital filter which requires 50 operations per bit. For 8 kT of energy per digital transition they find that an analogue representation is only of benefit for SNR < 25 dB. Sarpeshkar [40] carried out the analysis with MOSFET devices and concluded that an analogue representation has a benefit in terms of power consumption for SNR < 60 dB. Hosticka [41] has provided a comparison by considering an analogue signal driving a low pass filter and a digital signal represented using a serial shift register; here the analogue representation is shown to be lower energy for SNR < 40 dB. It is evident from these studies that the maximum SNR at which analogue processing provides a benefit is very dependent on implementation specifics. In this work a comparison is made for the case of information transmission using matched interconnects. The three cases considered are shown in figure 2.6.

Figure 2.6: Information transfer using parallel digital, serial digital and analogue representations. The two digital representations require 2kT [erfc⁻¹(2P_e)]² J/bit and the analogue representation requires ((2^{2n} − 1)/n) kT J/bit.

For bipolar digital modulation over a matched line the energy per bit for the parallel link is given by:

E_bit(digital) = 2kT [erfc⁻¹(2P_e)]²   (2.41)

where P_e is the probability of a bit being in error. This result is identical to (2.34), where a square wave was driving an RC interconnect. The result is the same for the parallel and serial representations because, in the classical region of operation, the rate of information transfer does not fundamentally affect the energy per bit.
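The two costs quoted in the caption of figure 2.6 can be compared numerically. The sketch below uses a bisection inverse of erfc and the bit error probability of 10⁻⁵ assumed for the comparison in this section:

```python
import math

def erfc_inv(y):
    """Invert the monotonically decreasing erfc by bisection on [0, 10]."""
    lo, hi = 0.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if math.erfc(mid) > y else (lo, mid)
    return 0.5 * (lo + hi)

def digital_bit_energy_kT(Pe=1e-5):
    """Digital cost 2*[erfc^-1(2*Pe)]^2 in kT units, eq. (2.41)."""
    return 2.0 * erfc_inv(2.0 * Pe)**2

def analogue_bit_energy_kT(n):
    """Analogue cost (2^(2n) - 1)/n in kT units, eq. (2.45)."""
    return (2.0**(2 * n) - 1.0) / n

for n in (1, 2, 3, 4):
    print(n, analogue_bit_energy_kT(n) < digital_bit_energy_kT())
# analogue is cheaper only for n < 3, i.e. SNR below roughly 18 dB
```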
For analogue transmission consider the case when the bandwidth and the SNR of the channel are known. For a uniformly quantised signal the number of bits can be related to the SNR by [40]:

n = log₂(1 + SNR)/2.   (2.42)

Substituting the SNR from (2.42) into (C.3) gives the minimum analogue energy per bit for matched impedances as:

E_bit(analogue) = 2(2^{2n} − 1) kT (B/R).   (2.43)

The rate of an analogue channel is provided by the Shannon-Hartley law (2.1). Using (2.42) and (2.1) the rate is found to be:

R = 2Bn.   (2.44)

Equivalently the rate can be found by considering the digitisation of the analogue signal. In this case sampling must occur at the Nyquist rate in order to preserve all the information; the equivalent digital data rate for the analogue signal is therefore 2 × B × n bits/s. The energy per bit for the analogue representation can then be written as:

E_bit(analogue) = ((2^{2n} − 1)/n) kT.   (2.45)

Figure 2.7 shows a plot of the energy per bit required for the analogue and digital representations of the signal; the probability of bit error is taken as 10⁻⁵. This shows that representing the signal using an analogue channel is beneficial only up to a few bits of resolution: n < 3, corresponding to SNR < 18 dB.

Figure 2.7: Energy per bit required to represent a signal in analogue and digital formats. There is a benefit in the analogue format for n < 3, which corresponds to SNR < 18 dB.

2.7 Exploration of Information Transfer Through an Amplifier

In this section the energy required by Low Noise Amplifiers (LNAs) is discussed from a thermodynamic perspective. The link between bound information and physical energy is shown and a proposal for how information processing occurs in an LNA is made. It is shown that current LNA implementations require seven orders of magnitude more energy than the thermodynamic limit of kT ln 2 J/bit.
In order to make this comparison a figure of merit based on information transfer is presented.

2.7.1 Information and Physical Energy

It is possible to make a link between the disorder of a physical system and the information present [16]. In order to do this the concepts of entropy, elementary complexions and a definition of information are required. The first law of thermodynamics states that energy is conserved. For an isolated system:

W − q = 0   (2.46)

where W is the electrical work available and q is the heat transfer. The second law of thermodynamics states that entropy (S) cannot decrease in a closed system, therefore:

ΔS ≥ 0.   (2.47)

Entropy is related to the temperature (T) and heat flow by:

Δq = T ΔS.   (2.48)

When considering physical states the definition of information should be tightened to that of bound information. Bound information is the amount of information transmitted along a physical channel and includes any necessary error coding or sequencing information. Figure 2.8 shows the mapping between ‘free’ and ‘bound’ information. This work is not concerned with how this mapping takes place, but as an example consider a microphone which captures an audio wave of somebody reciting a passage from a book. The text is the free information, the mapping of the speech to an electrical wave is the encoding, and the transmission of this wave along a wire to a tape recorder is the bound information. Here the amount of energy required to represent the bound information is discussed.

Figure 2.8: The mapping of free information to bound information.

The Boltzmann-Planck formula for the physical entropy is:

S = k ln P   (2.49)

where P is the number of elementary complexions. An elementary complexion is a discrete configuration of a quantised physical system. It is a description of microscopic variables such as the position, momentum and velocity of individual atoms.
A link between bound information and entropy can be obtained if it is assumed that there is a one to many mapping between the information and the number of elementary complexions. For example, many electrons are required to represent a single bit. To clarify the situation consider a system which starts with zero bound information; this requires a number of elementary complexions, P₀. As there is a positive number of elementary complexions the system has been overdetermined: there are more physical states than are actually required to represent the information, as with a DC power line. A DC power line does not contain any bound information but requires a number of elementary complexions in order to define it. If the power line is modulated, by varying the voltage, then bound information is present. In this case there will be a number P₁ of elementary complexions. Bound information may be measured by considering the number of states:

I = K log₂ N   (2.50)

where K is a constant and N is the number of states. The increase in information after modulation is:

ΔI = (1/β) log₂(P₀/P₁)   (2.51)

where β is a positive constant which makes the link between the possible cases in the bound information and the elementary complexions. Effectively this means that the amount of bound information is β times less than that available by considering the elementary complexions on their own. When there is an increase in information more is known about the system, so the number of physical complexions P₁ < P₀. The entropy of the unmodulated line is:

S₀ = k ln P₀,   (2.52)

and for the modulated line is:

S₁ = k ln P₁.   (2.53)

The change in entropy is then:

ΔS = k ln(P₁/P₀).   (2.54)

The link between bound information and physical entropy is thus:

ΔI = −ΔS/(kβ ln 2).   (2.55)

Eq. (2.55) shows that in order to increase the amount of information the physical entropy must decrease. A decrease in entropy means that an increase in the grade of energy has occurred, i.e.
heat energy (low grade) has been converted to electrical energy (high grade). Information loss, on the other hand, always requires an increase in entropy. An increase in entropy is an irreversible process, i.e. a loss in the grade of energy, so electrical energy is converted to heat energy. Using the first and second laws of thermodynamics the link between bound information and the energy supplied to the system can be written as:

E = +kTβ ln(2) ΔI.   (2.56)

Note the positive sign on the right hand side of (2.56); this change of sign occurs because the energy supplied to the system is considered, rather than the internal energy of the system (E = −W). Eq. (2.56) shows that if information is lost then dissipation occurs, whereas if information is created then energy is required. In the case that the elementary complexions have a one-to-one mapping with the bound information (β = 1), (2.56) reduces to the thermal limit of kT ln 2. If there is a decrease in information then E is negative, meaning that energy is dissipated. This is equivalent to the case outlined in the analysis of thermodynamic computing, where it is well known that kT ln 2 J is dissipated per bit of information erased. In the following section a value for β is estimated by considering information flow through a practical LNA.

2.7.2 Information Flow Through a Practical LNA

In this section it will be shown that information is always lost through a simple electronic amplifier/attenuator system. Consider a single input single output system immersed in a bath of temperature T, from which energy can be provided to the system if required. The input will contain a signal and a noise component. Assume for simplicity that the amplifier has a brick wall frequency response. In this case the input signal and noise will be amplified by the same amount over the amplifier's bandwidth.

Figure 2.9: Power and noise spectra of an idealised amplifier.
The amplifying device will also add noise to the output signal (amplification requires an active device, which will add noise; even a passive system such as a resistor will add noise). The input and output spectra of the signal and noise for this idealised case are shown in figure 2.9. It is clear that the additive noise of the amplifier causes a bound information loss between the input and the output of the amplifier. By using the Shannon-Hartley law (2.1), the information gain of the amplifier is:

ΔI = −B log2[(1 + SNR_IN)/(1 + SNR_OUT)]. (2.57)

This information loss should theoretically require no external energy input, only dissipation of power from the input signal. However, this is not seen in practical circuits, where the bias current of a transistor always requires significant external power. A possible reason for this requirement is that an amplifying device effectively isolates the input information from the output. For example, consider the MOSFET device shown in figure 2.10.

Figure 2.10: The MOSFET acting as an information isolator.

If the MOSFET is ideal then the information at the input is separated from the output, meaning that there is a loss of information at the input. Consider the MOSFET as a device that is measuring the input signal. In order to recreate the information at the output, external energy is required because there is an increase in information. To make an estimate for β, the output bit energy of the amplifier is required. The maximum information rate at the output is given by:

İ_OUT = B log2(1 + SNR_OUT). (2.58)

The output SNR can be found by considering the noise factor (F) of the amplifier [42]:

F = SNR_IN / SNR_OUT. (2.59)

The best input SNR occurs when the source is matched to the input and when the maximum power is being supplied from the source. Typically the input 1 dB compression point provides a measure of the maximum input power that can be applied to the amplifier. Thus the maximum input SNR is:

SNR_IN = P_1dB / (kTB). (2.60)

A measure of the Output Bit Energy (OBE) is then:

OBE = PDC / İ_OUT = PDC / (B log2[1 + P_1dB/(F kTB)]) (2.61)

where PDC is the DC power supplied to the amplifier. If the input 1 dB compression point is not given then it may be estimated from the output 1 dB compression point divided by the LNA gain. Eq. (2.61) gives the energy per bit required to represent the information at the output of a low noise amplifier.

2.7.3 Theoretical vs Practical Output Bit Energy

In this section a comparison of many CMOS LNAs is made over a range of frequencies. The OBE for each amplifier is calculated using (2.61). Figure 2.11 shows a comparison of the output bit energies for many CMOS LNA circuits. It is clear that the bit energy is almost independent of operating frequency, as predicted by (2.56). From this graph a value of β = 1 × 10^7 is observed, implying that many elementary complexions are required to represent each bit of information. The probable reason why some of the 800 MHz - 2.4 GHz circuits achieve slightly better bit energies is the large amount of time and money spent on these designs. It is clear that current circuits require energy orders of magnitude greater than the theoretical minimum. In the following section the energy required to bias a transistor due to thermodynamic considerations is explored to try and explain this large discrepancy.

2.7.4 Energy Required for Biasing

So far it has been assumed that the amplifier starts with P0 elementary complexions. However, in reality, getting to this state requires a change in entropy, hence a change in energy. Consider a transistor at rest with no connections; this physical system will contain P−1 elementary complexions.
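Eq. (2.61) is straightforward to evaluate. The sketch below uses illustrative LNA figures; the PDC, P_1dB, F and B values are assumptions for demonstration, not taken from the surveyed designs.

```python
import math

k = 1.380649e-23   # Boltzmann constant [J/K]
T = 290.0          # standard noise temperature [K]

def output_bit_energy(P_dc, P_1dB, F, B):
    """Eq. (2.61): OBE = P_DC / (B * log2(1 + P_1dB / (F*k*T*B)))."""
    snr_in_max = P_1dB / (k * T * B)          # eq. (2.60)
    return P_dc / (B * math.log2(1.0 + snr_in_max / F))

# Illustrative figures: 10 mW DC power, -10 dBm input P1dB,
# noise factor 2 (3 dB noise figure), 100 MHz bandwidth.
obe = output_bit_energy(P_dc=10e-3, P_1dB=1e-4, F=2.0, B=100e6)
print(obe / (k * T * math.log(2)))   # OBE in units of kT ln 2, ~1e9 here
```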
Figure 2.11: Comparison of output bit energy for many LNAs across a range of frequencies. Current LNA circuits are more than 6 orders of magnitude greater than the fundamental limit of kT ln 2 and 3 orders of magnitude greater than the proposed biasing limitation.

The output of a transistor at a specified bias current with no input (i.e. the case with P0 complexions) can be modelled by a resistor. A resistor produces thermal voltage fluctuations which increase as the resistance increases [50]:

V̄f² = 4kTRB. (2.62)

It is also well known that this small signal resistance is inversely proportional to the applied current [51]. Thus, to reduce the fluctuations a larger bias current is required. This can alternatively be viewed as lowering the resistor's temperature: refrigeration. Consider a household fridge, where an external energy source is used to lower the temperature of the chamber. This results in a temperature rise outside of the fridge. The same can be said about biasing a transistor. To estimate the amount of energy required to get from P−1 complexions to P0 complexions, knowledge about the resistance and the form of the fluctuations is required. In order to reduce the fluctuations, the number of elementary complexions has to be compressed. Hence:

P0 << P−1. (2.63)

This implies that the information content has been increased and more is now known about the system. Thus external energy is required in order to make this happen, the same as when the line was modulated with information. The entropy rate of a signal with a Gaussian probability distribution is given by Shannon [2]:

Ḣ = B_N log2(2πeN) bits/s (2.64)

where N is the average white noise power and B_N is the noise bandwidth.
Here the difference between the entropy rates before and after biasing is required; this gives the change in entropy rate as:

ΔḢ = Ḣ0 − Ḣ−1 = B_N log2(R0/R−1). (2.65)

As R0 << R−1 the change in entropy rate is negative, implying an increase in information akin to that shown in (2.55). Shannon entropy is the negative of information, so the information gain can be written as:

Δİ = B_N log2(R−1/R0) bits/s. (2.66)

Thus the power required to reduce the thermal fluctuations is:

PN = +kTβB_N ln(R−1/R0) Joules/s. (2.67)

The energy per bit is then found by dividing this power by the output bit rate, İ_OUT, (2.58).

Numerical Example

In order to grasp the consequence of PN on the energy per bit required by an amplifier, a numerical example is presented. This allows the energy to be compared with that of current LNA designs. The noise bandwidth B_N is the bandwidth over which the thermal fluctuations are reduced, R−1 is the semiconductor resistance at rest and R0 is the small signal resistance after biasing. The noise bandwidth will not be the same as the signal bandwidth in this case. The small signal resistance of the transistor will be valid at least up to the transition frequency of the transistor, typically tens of GHz for integrated transistors. Assume that the typical small signal resistance is 500 Ω and that the resistance of a piece of semiconductor at rest is 1 TΩ. For a data rate of 100 Mbps and a transition frequency of 10 GHz, the energy per bit is 2.14 × 10³ βkT Joules. This is considerably more than the amount of energy required to represent the information. This bit energy is also plotted on figure 2.11 to show that this explanation of energy usage due to biasing goes some way to explaining why current circuits are orders of magnitude greater than the fundamental theoretical limit of kT ln 2 J/bit. In the following section the energy purely due to amplification is discussed.
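The numerical example above can be reproduced in a few lines. This sketch keeps β symbolic by expressing the per-bit energy from (2.67) in units of βkT; the resistances, bandwidth and data rate are those stated in the text.

```python
import math

# Reproduces the biasing-energy example: eq. (2.67) divided by the data rate.
B_N  = 10e9    # noise bandwidth ~ transistor transition frequency [Hz]
R_m1 = 1e12    # semiconductor resistance at rest (R_-1) [ohm]
R_0  = 500.0   # small-signal resistance after biasing [ohm]
rate = 100e6   # output data rate [bit/s]

# E_bit / (beta*k*T) = (B_N / rate) * ln(R_-1 / R_0)
E_bit_over_beta_kT = (B_N / rate) * math.log(R_m1 / R_0)
print(E_bit_over_beta_kT)   # ~2.14e3, the figure quoted in the text
```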
This is in contrast to the previous discussion, where a transistor was considered as the amplifying device.

2.7.5 Black Box Analogue Amplifier

In this section a lower bound on the energy required to process information through a single input single output filter is described. A filter is a generalisation of an amplifier. As has already been mentioned, the loss of information through an amplifier theoretically does not require any external energy, because information is always lost, causing dissipation of signal energy only. From Shannon [2] the entropy at the output of a linear filter can be given as:

H_O = H_I + (1/B) ∫_B ln|H(f)|² df (2.68)

where |H(f)| is the magnitude response of the filter. The entropy change through the filter is:

ΔH = (1/B) ∫_B ln|H(f)|² df. (2.69)

The entropy rate is given by:

ΔḢ = 2 ∫_B ln|H(f)|² df. (2.70)

It can immediately be deduced from this equation that amplification of a signal results in positive entropy and therefore negative information. As may be expected, an amplified signal requires more degrees of freedom to represent it. Therefore, the energy in the output signal will be less than that in the input. Figure 2.12 shows the spectrum of an ideal brick wall input and output response.

Figure 2.12: Low pass filter ideal spectrum.

The voltage gain of this system is:

Av = A_OUT/A_IN = |H(f)|. (2.71)

By treating the signal and noise spectra independently, and considering that the noise factor is given by [42]:

F = SNR_IN/SNR_OUT = N_OUT/(N_IN Av²), (2.72)

the respective entropy rates are:

ΔḢ(signal) = 4 S_BW ln(Av) (2.73)

ΔḢ(noise) = 4 N_BW ln√(F Av²). (2.74)

When considering amplification, F > 1 and Av > 1. Therefore the entropy gain of the amplifier is always positive. Thus an amplifier in thermal equilibrium does not theoretically require external energy in order to operate. This analysis backs up the intuition that the information gain through an amplifier is negative, c.f. (2.57).
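Eq. (2.73) can be checked against the integral form (2.70) by numerical integration. This is a sketch; the gain and bandwidth values are illustrative.

```python
import math

# Checks eq. (2.73) against eq. (2.70) for an ideal brick-wall response
# of constant voltage gain Av over the signal bandwidth S_BW.
def entropy_rate(H_mag, B, n=100_000):
    """dH_dot = 2 * integral over B of ln|H(f)|^2 df (eq. 2.70), midpoint rule."""
    df = B / n
    return 2.0 * sum(math.log(H_mag((i + 0.5) * df) ** 2) * df for i in range(n))

Av, S_BW = 10.0, 1e6                    # 20 dB of gain over a 1 MHz band
numeric = entropy_rate(lambda f: Av, S_BW)
closed  = 4.0 * S_BW * math.log(Av)     # eq. (2.73)
assert math.isclose(numeric, closed, rel_tol=1e-9)
```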
However, there is a serious flaw in this argument if power gain (Gp) is considered. For a positive power gain the output signal power is greater than the input signal power, therefore power is required. In fact, the entropy rate shown here is simply the entropy required to carry out the amplification operation, with no consideration of input and output impedances. The act of voltage amplification with constant power is equivalent to increasing the output impedance of the amplifier while keeping the power of the signal the same:

Av = √(RL/RS). (2.75)

In contrast, power amplification with constant voltage requires a decrease in resistance for the same signal voltage throughout amplification:

Gp = RS/RL. (2.76)

Eq. (2.67) shows that a decrease in effective resistance requires external energy (a cooling effect), whereas an increase in resistance does not (a heating effect). It is well known that voltage amplification does not require an external energy source; for example a transformer or a cascade of RC T-networks [52]. For all known power amplification tasks, external energy is required. Perhaps the most useful observation from the entropy rate equations for an amplifier is that it is physically impossible to obtain a 100 % efficient amplifier. There must always be power dissipation from the amplifier signal energy in order to satisfy the entropy increase.

2.7.6 Conclusion

It has been shown that the energy required by an LNA is many orders of magnitude larger than that required by the fundamental limit of kT ln 2 J/bit. The derivation of the energy per bit required to achieve a given bias is by no means exact and many assumptions have been made. However, this work gives insight into the amount of energy required to carry out information transfer operations and should provoke the reader to consider circuits from a thermodynamic perspective.
One key point from this analysis is that the effective cooling needed to reduce the fluctuations almost certainly increases the amount of energy required per bit of information. In effect this means that keeping the output resistance of the transistor high will reduce the energy requirement. Keeping the output resistance high is equivalent to keeping the transistor in the off position for as long as possible. This low duty cycle transistor usage is typical of digital circuits and is exploited in Chapter 4, where a low power transmitter circuit is described.

2.8 Electromagnetic Limits

2.8.1 Friis Limit

In 1946, Friis proposed a key formula that is now used in communications engineering to estimate the amount of transmission power required for a point to point communications system [53]:

Pr = Pt At Ar / (λ² d²) (2.77)

where Pr is the signal power at the receiver, Pt is the transmitter power, At and Ar are the effective areas of the transmit and receive antennas respectively, λ is the wavelength and d is the distance between the transmitter and the receiver. Appendix B.5 shows how the effective area of an antenna is related to physical size.

2.8.2 Friis-Kraus Electromagnetic Limit

An extension to the Friis formula (2.77) has been derived by Kraus [54]. In this extension the transmit power required to achieve a given SNR at the receiver is found to be:

Pt = SNR d² c² B k T_sys / (At Ar f²) (2.78)

where k is the Boltzmann constant (1.38 × 10⁻²³ J/K), B is the bandwidth, f is the centre frequency of transmission and T_sys is the system temperature. Eq. (2.78) can be derived by considering SNR = Ps/Pn and Pn = k T_sys B.

2.8.3 Near Field

Near field communication is becoming increasingly important for short range links, particularly in the medical arena and for short range smart payment devices.
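Eqs. (2.77) and (2.78) can be sketched directly. The check below verifies that substituting the Kraus transmit power (2.78) into the Friis formula (2.77), with λ = c/f and noise power k·T_sys·B, returns the requested SNR; all parameter values are illustrative.

```python
# Sketch of eqs. (2.77) and (2.78); symbols as defined in the text.
k, c = 1.380649e-23, 2.998e8   # Boltzmann constant, speed of light

def friis_received_power(Pt, At, Ar, lam, d):
    """Eq. (2.77): Pr = Pt*At*Ar / (lam^2 * d^2)."""
    return Pt * At * Ar / (lam ** 2 * d ** 2)

def kraus_transmit_power(snr, d, B, Tsys, At, Ar, f):
    """Eq. (2.78): Pt = SNR*d^2*c^2*B*k*Tsys / (At*Ar*f^2)."""
    return snr * d ** 2 * c ** 2 * B * k * Tsys / (At * Ar * f ** 2)

# Consistency check with illustrative values (10 m link at 2.4 GHz).
snr, d, B, Tsys, At, Ar, f = 100.0, 10.0, 1e6, 290.0, 0.005, 0.005, 2.4e9
Pt = kraus_transmit_power(snr, d, B, Tsys, At, Ar, f)
Pr = friis_received_power(Pt, At, Ar, c / f, d)
assert abs(Pr / (k * Tsys * B) - snr) < 1e-6 * snr   # recovered SNR
```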
Yates [55] has analysed a tuned narrow band inductive link using loop antennas to show that the minimum power required for transmission is:

P_TX = d⁶ P_RX / (ω² G_A) (2.79)

where G_A is a constant that depends on the antenna geometry and directivity. One difference between near field and far field is that the near field rolls off much more quickly with distance than far field radiation. Capps [56] attempts to summarise the distance at which near field communication is advantageous. More details on near field communication are provided in Chapter 4, where the design and measured results of a near field link operating at 33.75 MHz are presented.

2.9 A New Electromagnetic Lower Bound

Eqs. (2.77) and (2.78) imply that an ever increasing centre frequency of transmission will enable lower and lower energies to be used for communications. In Appendix B an electromagnetic limit is derived based on the fact that the transmitting antenna is a blackbody. Taking this into account, the energy of a link is bounded by:

Ebit ≥ [SNR / log2(1 + SNR)] [2εĀt f²/c² + (F − 1)d²c²/(At Ar f²)] kT. (2.80)

Eq. (2.80) has a minimum of:

Ebit(MIN) = 2d √[2εĀt(F − 1)/(At Ar)] kT ln 2, (2.81)

at a centre frequency of:

f0 = [(F − 1)d²c⁴/(2εĀt At Ar)]^(1/4). (2.82)

The frequency at which the minimum energy of transmission occurs is based on the dimensions of the antennas, the transmission distance and the receiver noise figure, so (2.82) could be used to optimise the areas of the transmit and receive antennas. The energy per bit given by (2.80) depends on the signal to noise ratio at the receiver. Minimising the signal to noise ratio will allow lower energy per bit transmissions. Lowering the SNR in (2.80) results in a lower energy per bit at the expense of increased bandwidth for a given transmission rate.
Schemes like Ultra-Wideband (UWB) try to achieve this by trading wider bandwidth utilisation for lower transmission power [57].

2.9.1 A Numerical Example

In this example, a typical size of parabolic antenna is considered together with a half wavelength dipole. The physical area of the parabolic antenna is taken to be 0.01 m². The same transmit and receive antennas are used; their efficiency is assumed to be 50 % and the transmission distance is 10 m. It is assumed that the antennas are made from a material with a blackbody emissivity of 0.1, and that the receiver noise factor is 1.1. The plot in figure 2.13 for these conditions shows that there is a minimum in the theoretical transmit energy required to transmit a bit of information using parabolic antennas. In this case the lowest energy transmission can be achieved at around 35 GHz, with a bit energy of 39 kT J/bit.

Figure 2.13: Energy per bit lower bound for free space communications for a distance of 10 m. The parabolic antenna areas are 0.01 m². The emissivity of the transmit antenna is 0.1. The receiver has a noise factor of 1.1.

The frequency at which the minimum energy occurs is proportional to the square root of the transmission distance. So for a 100 m transmission the minimum power point would occur at about 111 GHz for the sizes and conditions above, and the energy per bit would increase to 390 kT J/bit. The transmit energy required for the dipole at f0 is approximately five orders of magnitude greater than for the parabolic antennas. This implies that lower energy systems for point to point links can be optimised by choosing an antenna whose effective area is independent of frequency. Such an antenna is known as electrically large, as its dimensions are greater than λ/2π [58].
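The figures in this example can be reproduced from (2.81) and (2.82). The only assumption beyond the text is that the effective area equals efficiency times physical area.

```python
import math

# Reproduces the numerical example: eqs. (2.81) and (2.82).
c = 2.998e8       # speed of light [m/s]
eps    = 0.1      # blackbody emissivity of the antenna material
A_phys = 0.01     # physical antenna area (A-bar_t in the text) [m^2]
eta    = 0.5      # antenna efficiency
At = Ar = eta * A_phys   # assumption: effective area = eta * physical area
F, d = 1.1, 10.0  # receiver noise factor, link distance [m]

# Eq. (2.82): optimum centre frequency.
f0 = ((F - 1) * d**2 * c**4 / (2 * eps * A_phys * At * Ar)) ** 0.25
# Eq. (2.81): minimum energy per bit, expressed in units of kT.
E_min_kT = 2 * d * math.sqrt(2 * eps * A_phys * (F - 1) / (At * Ar)) * math.log(2)

print(f0 / 1e9, E_min_kT)   # ~35.6 GHz and ~39 kT J/bit, as in the text
```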
2.9.2 Comparison With Current Standards

There are several standards that define the amount of transmit power required for a given data rate. The Bluetooth, Zigbee and DS-UWB transmit energies are plotted together with the numerical example in figure 2.13. Table 2.1 shows the protocol specifications used in deriving the bit energies.

Standard              Tx distance (m)   Tx Power (mW)   Data Rate (Mbps)   Ebit (pJ/bit)   Ref
Bluetooth (Class 2)   10                2.5             3                  833.33          [59]
DS-UWB                10                0.23            110                2               [57]
Zigbee                70                1               0.25               4000            [60]

Table 2.1: Bit energies for several common communication standards.

Note that any extra transmission due to error control has not been included; however, this does not detract from the fact that these standards require more than four orders of magnitude more power than the limit suggests. The DS-UWB transmission power has been calculated by assuming that the FCC limit of -41.25 dBm/MHz has been adhered to over a 3 GHz bandwidth.

2.9.3 Limitations Of This Bound

The Friis-Kraus limit (B.6) assumes plane wave propagation and is only valid for distances of [53]:

d > 2a²f/c (2.83)

where a is the largest linear dimension of the transmit or receive antenna. For the numerical example above, when d = 10 m the maximum frequency for which the result is valid is 150 GHz. The limit (2.81) assumes that the transmitter is the only blackbody radiator in the system and that the temperatures of the transmitter and receiver are the same. Despite these limitations, the lower bound derived in this work shows that there will be a frequency at which minimum transmit power is required, which, to the author's knowledge, has not been shown before.

2.9.4 Conclusion

The work in this section shows that there is a fundamental lower limit to the amount of energy required to transmit a bit of information between two points in free space. In order to calculate the new limit, the blackbody radiation of the transmit antenna has been taken into account.
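The entries in Table 2.1 can be sanity checked in a few lines: Ebit is transmit power divided by data rate, and the DS-UWB power follows from the stated FCC mask.

```python
# Sanity check of the bit energies in Table 2.1.
def ebit_pJ(power_mW, rate_Mbps):
    """Energy per bit in pJ: Ebit = P / R."""
    return power_mW * 1e-3 / (rate_Mbps * 1e6) * 1e12

assert round(ebit_pJ(2.5, 3), 2) == 833.33    # Bluetooth (Class 2)
assert round(ebit_pJ(1.0, 0.25)) == 4000      # Zigbee

# DS-UWB: -41.25 dBm/MHz spectral mask over a 3 GHz bandwidth.
p_uwb_mW = 10 ** (-41.25 / 10) * 3000
assert abs(p_uwb_mW - 0.23) < 0.01            # ~0.23 mW, as tabulated
assert round(ebit_pJ(p_uwb_mW, 110)) == 2     # ~2 pJ/bit
```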
For practical antenna sizes this limit will present itself somewhere in the tens of GHz region. This result shows that as the frequency of communication systems is increased beyond this limit, more transmit power will be required. The type of antenna used is important in minimising the transmission energy. The dipole antenna shows an increasing bit energy with frequency, whereas the use of a parabolic or horn antenna allows a minimum energy point to be found, which can be four orders of magnitude lower in power than the dipole.

2.10 Biological Examples

It is interesting to note that some biological processes also carry out information processing. There is a trend towards trying to mimic biological processes by using electronic circuits [40], [61] or by directly manipulating cells in order to create biological microprocessors [62]. Bennett has described DNA replication via RNA polymerase in nature [17]. A nucleotide represents a single code of a DNA sequence. He states that approximately 20 kT J is dissipated per nucleotide insertion, which occurs at a rate of 30 nucleotides per second with an error probability of 1 × 10⁻⁴. He also notes that if the reaction is slowed down by reducing the enzyme concentration then the energy required decreases; that is, the reaction occurs closer to thermal equilibrium. These values are much closer to the classical lower bounds than any electronic implementation has achieved so far. Abshire [61] considers the energy required by a biological and a silicon photoreceptor. In that work it is found that the bit energy of the biological blowfly photoreceptor varies between 20 pJ/bit and 2 pJ/bit, depending on light intensity. The measured silicon model typically requires 5-6 orders of magnitude more energy per bit than the biological receptor, similar to that found in Section 2.7.3 for the low noise amplifiers. Another example is that of the human brain.
Sarpeshkar [40] has estimated that the human brain carries out 3.6 × 10¹⁵ synaptic operations per second and consumes 12 W of power. This equates to 3.33 fJ, or 805,000 kT J, per operation. In comparison, consider the latest Intel Itanium microprocessor [63], [64]. The Dual Core 9120N is able to perform up to 6 operations per clock cycle, equating to about 10¹⁰ operations per second, with a power consumption of 104 W; roughly 10 nJ, or 2.4 × 10¹² kT J, per operation. Therefore, the human brain uses 6 orders of magnitude less energy per operation than one of the fastest processors built to date.

2.11 Conclusion

This chapter has presented many limitations to energy requirements in electronic circuits. These limits are summarised in tables 2.2, 2.3 and 2.4. The first table contains limits due to classical physics, the second limits due to quantum physics, and the third shows implementation limits. From this summary it is evident that obtaining a true measure of the energy required to undertake an information processing task is extremely difficult, as heavy reliance on implementation specifics is required. Table 2.5 shows that biological systems operate closer to the fundamental limits. The rate of information transfer depends on the amount of energy available. This is explicitly shown in the quantum limits and hinted at in the classical limits, as the minimum energy relies on the SNR tending towards zero, i.e. a zero rate of transmission. For classical information processing the energy required per bit is independent of the operating frequency. The exploration of the power required by an amplifier has shown that current designs require 6 orders of magnitude more power than the fundamental limit. It has been shown that the biasing of a transistor requires considerable energy from a thermodynamic point of view. A lower bound for the energy per bit for a free space point to point link has been proposed.
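The arithmetic above can be verified directly. A processor rate of roughly 10¹⁰ operations per second is assumed here, consistent with 6 operations per cycle at a GHz-class clock and with the quoted ~10 nJ per operation.

```python
k, T = 1.380649e-23, 300.0   # Boltzmann constant, room temperature

# Human brain: 3.6e15 synaptic operations/s at 12 W.
e_brain = 12.0 / 3.6e15      # J per operation
# Itanium: assumed ~1e10 operations/s at 104 W (see lead-in).
e_cpu = 104.0 / 1e10         # J per operation

print(e_brain, e_cpu)        # ~3.3 fJ/op vs ~10 nJ/op
assert 7e5 < e_brain / (k * T) < 9e5      # ~805,000 kT per operation
assert 2e12 < e_cpu / (k * T) < 3e12      # ~2.4e12 kT per operation
```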
This is based on the fact that the transmitter antenna is a blackbody radiator which contributes to receiver noise as the frequency of operation increases. An important observation is that a digital representation of a signal could fundamentally reach the lower limit, whereas an analogue representation requires considerably more energy per bit. The input and output stimuli of the real world are not digital in nature; therefore analogue signals and circuits are required for information processing. In Chapter 4 a scheme for transmitting information using short pulses is shown which requires the main transmission transistor to be on for only 0.1 % of the transmitted time. The output of the transmitter is a well defined analogue pulse. This circuit takes advantage of the fact that there are no biased transistors. The transmitter is able to operate at bit energies of < 10 nJ/bit over short range links. Chapter 5 shows a scheme where an analogue signal can be split into several parallel channels, each of lower bandwidth. This allows information to be channelled into a digital processor at lower rates, suiting adiabatic logic, which consumes less power at lower rates. In the next chapter the advantages of using a Gaussian pulse generator for communication over short distances are discussed. This is followed by some theory on the Gaussian pulse and how it can be approximated. The approximations are used for the transmitter described in Chapter 4 and as a filter prototype for the Gabor transform described in Chapter 5.

Limit            Brief Description                                              Section   Ref.
Ebit > kT ln 2   Classical limit which can be derived by considering            2.3.1     [4]–[8], [10], [11]
                 Shannon's law, compression of a gas and quantum theory
Ebit < kT ln 2   Reversible computation could theoretically require             2.3.5     [13]
                 zero energy
Ebit → kT ln 2   Adiabatic computing aims to get closer to the limit            2.3.6     [19]–[21]

Table 2.2: Fundamental Classical Limits.
Limit                                      Brief Description                                    Section   Ref.
Ebit > h/(τ ln(1 + 4π))                    Bremermann's bound based on the uncertainty          2.4.1     [26]
                                           principle
Ebit > ħ ln 2/(τπ)                         Bekenstein's bound based on black hole theory        2.4.1     [27]
                                           and entropy
Ebit > 3ħ(ln 2)² İ/π                       Pendry's bound based on quantum information          2.4.1     [28]
                                           flow
Eop > πħ/(2Δt)                             Limit on the energy required to distinguish          2.4.2     [29]–[31]
                                           between orthogonal quantum states
Ebit ≥ [1215h³c²d²R/(512π⁴At Ar)]^(1/3)    Limit on the energy for blackbody                    2.4.4     [32]
                                           communications. The entire blackbody
                                           spectrum is used for transmission.
Ebit = 2^(2n−2)hf + 2^(2n−1)kT             n-bit signal representation in terms of photons.     2.4.5     [33], This Work
                                           Valid for quantum and classical regions

Table 2.3: Fundamental Quantum Limits.

Limit                                      Brief Description                                    Section   Ref.
Eop = 4kT erfc⁻¹(2Pe)                      Stein's limit on uni-polar digital switching         2.5.1     [34]
Eop = 2kT erfc⁻¹(2Pe)                      Minimum energy to drive an RC interconnect           2.5.2     [35]
                                           with a sinusoid
P = 4kT SNR fp[1 − (f0/fp)arctan(fp/f0)]   Energy required per clock cycle for driving an       2.5.2     This Work
                                           RC interconnect with a 50-50 duty cycle clock
Pmin = 2πkT fp SNR                         Minimum power required to drive an RC                2.5.2     [36]
                                           interconnect with a uniformly distributed signal
Ppole = 8kT SNR f                          Minimum power required per pole of a filter          2.5.3     [37]
Ebit = 2kT Eb/N0                           Minimum energy per bit to drive a matched            2.6       This Work
                                           transmission line
Ebit = (2^(2n) − 1)kT/(2n)                 Minimum amount of energy required to                 2.6       This Work
                                           represent an n-bit resolution signal on an
                                           analogue line
E = +kTβ ln(2)ΔI                           Energy required to represent an information          2.7       This Work
                                           change. β links the physical elementary
                                           complexions to macroscopic information
PN = +kTβBN ln(R−1/R0)                     Power required to make a change in resistance        2.7.4     This Work
Pt = SNR d²c²BkT_sys/(At Ar f²)            Transmission power required for                      2.8.2     [54]
                                           electromagnetic communication
Ebit = 2d√[2εĀt(F−1)/(At Ar)] kT ln 2      Minimum transmission energy required when            2.9       This Work
                                           the transmitter antenna is considered to be a
                                           blackbody
PTX = d⁶PRX/(ω²GA)                         Transmission power required for a tuned near         2.8.3     [55]
                                           field link using loop antennas

Table 2.4: Implementation Limits.

Biological        Electronic           Brief Description                                        Section   Ref.
20 kT J/code      -                    DNA replication in nature via use of the enzyme          2.10      [17]
                                       RNA polymerase at a rate of 30 codes a second
2 − 20 pJ/bit     200 nJ/bit           Energy cost of information transfer through a            2.10      [61]
                                       blowfly photoreceptor and a silicon replication
805,000 kT J/op   2.4 × 10¹² kT J/op   Estimation of the energy per synaptic operation          2.10      [40], [63], [64]
                                       for the human brain compared with that of an
                                       Itanium processor

Table 2.5: Biological vs Electronic Energy Examples.

3 Pulsed Communication

3.1 Introduction

The amount of transmission power required by orthogonal modulation schemes decreases as the number of symbols is increased. Two commonly used orthogonal modulation schemes are Frequency Shift Keying (FSK) and Pulse Position Modulation (PPM). The first part of this chapter considers why a pulse based transmitter using PPM could lead to lower power and lower complexity transmitters. An expression for the power consumption of a generic PPM transmitter is presented, which shows that the transmitter power can be optimised by choosing the number of time slots. The PPM transmitter is argued to be of lower complexity than an FSK transmitter because power hungry elements, such as a PLL, are not necessarily required. It is also suggested that block coding should be used to improve the bit error rate of data transmission, as the power overhead of an encoder is relatively small.
In the second part of this chapter the characteristics of PPM are explored. It is shown that as the number of symbols increases, the output spectrum tends towards that of the pulse shape used. There are many examples of pulse based transmitters in the literature that use approximations to different pulse shapes; see Section 3.2.2. However, little attention is paid to the effect of the approximation on the bit error rate (BER). Due to pulse approximation the symbols will not be strictly orthogonal, giving overlap between symbols. The effect of this overlap is to introduce Inter Symbol Interference (ISI), which degrades the ability to detect the symbols at the receiver. Simulation schemes for estimating the BER for matched filter and correlation detectors with arbitrary pulse shapes are described. The channel model used is AWGN and the detector used is the Maximum Likelihood (ML) estimator. These allow quick evaluation of the BER for an approximated pulse. The channel and estimator choices are only valid for systems where the effects of fading are small. This is not the case in UWB, where the ISI is dominated by multipath components, so different receiver architectures are required [12]. However, the simple AWGN channel is valid for wireline communications and for directional wireless systems such as the inductive transmitter shown in Chapter 4. The final part of the chapter explores the approximation of the optimally time-frequency localised Gaussian pulse. First the properties of the ideal truncated Gaussian pulse are explored. This is followed by a comparison of the truncated pulse with three approximation methods: All Pole, Cascade of Poles and Padé. Simulation results of the BER for the approximated pulses using coherent correlation and approximated matched filter detectors are presented. These show the performance penalty of using the approximated matched filter. For filter orders N > 3 the correlation detector achieves the ideal BER.
Finally, methods for converting the approximated baseband pulses to orthogonal bandpass filters are presented. The All Pole approximation is used in the low power PPM transmitter described in Chapter 4. The superior Padé approximation is used in Chapter 5 as the window function for the Gabor transform.

3.2 Comparison of FSK and PPM

For FSK there are M possible pulses of length T; the frequency spacing between the pulses is 1/T. FSK modulation leads to a continuous emission of power from the transmitter. In PPM modulation there are L time slots, each of length T. The transmitter remains in an idle state for L − 1 time slots. A combination of these two schemes results in time-frequency modulation. An example of FSK and PPM modulation is shown in figure 3.1.

Figure 3.1: An example of FSK and PPM modulation for L = 4. For the same number of symbols PPM modulation requires L times longer than FSK.

Tang [65] carries out an analysis between FSK and PPM for wireless sensor networks which includes non-linear battery effects. The conclusion of this study is that in dense networks PPM can outperform FSK, but in sparse networks FSK is generally a better choice. However, the model used for the transmission power does not take into account circuit complexity. It assumes that the transmitter contains only a power amplifier. An alternative power model is shown in figure 3.2. This generic circuit is capable of implementing FSK or PPM modulation. For FSK a continuous stream of pulses is generated, each with a length of T. For PPM modulation a pulse is generated on average once every LT. In this case the rates for FSK and PPM are:

R_{FSK} = \frac{\log_2(L)}{T}, \qquad (3.1)

R_{PPM} = \frac{\log_2(L)}{LT}. \qquad (3.2)

Figure 3.2: Block diagram of a transmitter capable of FSK or PPM modulation. P_{PULSE} is the power required by the circuit to generate a continuous stream of pulses. P_{QUIES} is the quiescent power of the pulse generator and RF conversion when no pulses are being generated. P_{TX} is the power required by the RF converter; P_{RF} = \eta P_{TX} is the transmitted power, where \eta is the efficiency of the conversion.

The DC energy per bit for the transmitter operating in FSK and PPM are given by:

E_{bit}^{DC}(FSK) = \frac{T}{\log_2(L)}\left[P_{PULSE} + P_{QUIES}\right] + \frac{E_{bit}^{RF}}{\eta}, \qquad (3.3)

E_{bit}^{DC}(PPM) = \frac{T}{\log_2(L)}\left[P_{PULSE} + L\,P_{QUIES}\right] + \frac{E_{bit}^{RF}}{\eta}. \qquad (3.4)

Eq. (3.4) shows that the PPM scheme requires more energy than the FSK scheme due to the quiescent power of the circuit. The value of P_{PULSE} will generally not be the same for the FSK and PPM schemes. FSK schemes typically require a PLL in order to accurately switch the frequencies. In (3.3) and (3.4) the value of E_{bit}^{RF} is also dependent on L. If E_{bit}^{RF} is kept the same for a given distance of transmission, a better error rate is achievable when L is increased. It is assumed that P_{RF} can be increased with the same efficiency \eta in order to keep the symbol energy the same when the number of time slots in the PPM scheme is increased.

3.2.1 Minimum PPM Power

The transmitter energy for PPM (3.4) can be minimised by increasing the efficiency of the RF output stage and by optimising the number of symbols, L:

\frac{d}{dL}\left[\frac{P_{PULSE}}{\log_2(L)} + \frac{L\,P_{QUIES}}{\log_2(L)}\right] = 0. \qquad (3.5)

Eq. (3.5) has a closed form expression which gives the optimal value of L as:

L_{opt} = e^{W\left(e^{-1}\frac{P_{PULSE}}{P_{QUIES}}\right) + 1} \qquad (3.6)

where W(z) is the Lambert W function. For P_{PULSE}/P_{QUIES} > 2 a good linear approximation to this equation is:

L_{opt} \approx \frac{P_{PULSE}}{3\,P_{QUIES}} + 5. \qquad (3.7)

In order to get closer to the fundamental energy per bit for transmission the number of time slots needs to be large.
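The optimisation in (3.5)–(3.7) can be checked numerically. The following pure-Python sketch (an illustrative stand-in for the thesis's Matlab calculations; the power ratio is an arbitrary example value) evaluates the closed form (3.6) via a small Newton iteration for the Lambert W function, and verifies that it minimises the L-dependent energy term:

```python
import math

def lambert_w(z, iters=50):
    # principal branch of w * e^w = z for z > 0, by Newton's method
    w = math.log(1.0 + z)
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - z) / (ew * (w + 1.0))
    return w

def energy_term(L, p_pulse, p_quies):
    # the L-dependent part of (3.4): (P_PULSE + L * P_QUIES) / log2(L)
    return (p_pulse + L * p_quies) / math.log2(L)

p_pulse, p_quies = 10.0, 1.0                                          # example powers
L_opt = math.exp(lambert_w(math.exp(-1) * p_pulse / p_quies) + 1.0)   # (3.6)
L_lin = p_pulse / (3.0 * p_quies) + 5.0                               # (3.7)
print(L_opt, L_lin)
```

Perturbing L around L_opt confirms that the closed form sits at the minimum, and the linear approximation (3.7) lands within a few percent of it for this power ratio.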
Hence the quiescent power must be reduced as much as possible in order to minimise the transmitter energy. This result is used in Chapter 4 to optimise the number of time slots for a low power transmitter implementation. In many PPM based transmitters a power amplifier is used to couple the pulse generator to the antenna. This requires large amounts of quiescent power. Power amplifiers have a filtering effect, so when they are turned on or off the step response of the filter's transfer function would be sent to the antenna. Thus, powering off a PA in between pulses in PPM is difficult to achieve. In Chapter 4 a novel architecture that does not use a power amplifier or oscillator is shown, which is designed to dissipate small amounts of quiescent power.

Figure 3.3: Examples of some typical circuit topologies for wireless transmitters: an OOK oscillator, baseband pulse shaping, DSP based systems (e.g. OFDM) and pulse based systems using an analogue/digital filter G(s)/G(z).

3.2.2 Transmitter Complexity

For short range links the RF transmission power is typically very low; < 1 mW. One of the latest Texas Instruments low power DSPs (the TMS320C5402-100) consumes 180 mW of power with a 100 MHz clock [66]. Even an ultra low power MSP430CG microcontroller typically consumes 6.6 mW at a 10 MHz clock rate [66]. These figures suggest that the use of a digital processor for implementing the modulation scheme directly would be highly inefficient at low transmit powers. In addition to the digital processor, a multiplier, phase locked loop (PLL) and power amplifier (PA) are typically required. Some typical block diagrams of transmitters are shown in figure 3.3. The OOK oscillator is an interesting example and could potentially consume very low power for short range links [55]. The advantage of this scheme is that the inductor forms the antenna.
However, the main drawbacks of such a scheme are that it is difficult to predict when the oscillator will turn on and also what the startup phase will be. This restricts use of this transmitter to non-coherent detection. The output spectrum is limited to either an exponentially shaped pulse or an approximation to a rectangular pulse. Examples of DSP intensive systems are narrow band and wideband OFDM [67], [68]. These schemes employ Fast Fourier Transforms in order to generate the transmission signal over many frequency bands. Baseband pulse shaping schemes are used extensively in commercial systems. For example, the Bluetooth and GSM standards use Gaussian filters to shape the data [69], [70]. Most developments in UWB pulse systems rely on either mixing an envelope of the baseband pulse or on directly generating a pulse using a filter. See table 3.1 for some examples of pulse based transmitters. The overall efficiency of the transmitter is defined as:

\eta_{TX} = \frac{P_{RF}(\text{average})}{P_{DC}} \times 100\%. \qquad (3.8)

Table 3.1: Examples of some pulse based transmitters for short range communications [71]–[80]. For each transmitter the table lists the pulse generation technique, the pulse shape, whether a PA and a PLL/VCO are used, the centre frequency f0 [GHz], bandwidth B [GHz], data rate R [Gbps], power consumption PDC [mW] and overall efficiency ηTX [%]. The pulse generation techniques include: mixing of a baseband triangular pulse with the output from a ring oscillator; distributed pulse generation with combination using transmission lines; several impulse generators generating the impulse response of an FIR filter; direct modulation of the antenna using a digitally controlled oscillator; FIR pulse generation using delay lines derived from a high frequency clock; switching of an oscillator to produce a pulse with no pulse shaping filters; summation of offset triangular pulses to create an approximated raised cosine; combination of 4 digitally produced monocycles to produce a monocycle train; switching of an LC oscillator to create a pulse; a tanh pulse shaping mixer producing a Gaussian pulse; and gating of a local oscillator. Footnotes: (a) suppressed operation by ramping up and down to prevent spurious output; (b) on for 1/3 of continuous operation; (c) only operational during pulse; (d) external frequency synthesiser power not included.

A majority of the circuits in table 3.1 are for GHz frequencies, although [72] operates at around 400 MHz and switches an oscillator on and off to produce a rectangular pulse shape. The efficiency of [72] could not be determined due to insufficient data. The architecture which has the best overall efficiency is [77], which turns an LC oscillator on and off to create an exponentially shaped pulse. The high efficiency results from designing the LC oscillator so that it has a large voltage swing. The circuit in [75] uses a power amplifier but this is only switched on for one third of the time. To prevent spurious transmissions, the gate voltage of the PA is ramped up slowly to turn the device on. Many of the circuits in table 3.1 require a frequency control device to act as the carrier or for synchronisation of a set of sub pulses which are summed to form a single pulse.
Figure 3.4 shows the power dissipation for a number of PLL designs which could be used as a frequency control device. All of these results have been extracted from journal papers containing fabricated PLL integrated circuits. The figure clearly shows that the power required increases as the operating frequency increases. A power improvement of approximately 2 orders of magnitude is seen over the last 10 years. The use of a phase locked loop typically requires several mW at operating frequencies in the 100's MHz region, thus making it a power hungry component for a low power transmitter. As seen in Chapter 2, every interconnect and clock signal fundamentally dissipates power. Many of the direct digital pulse based circuits [71] contain many hundreds of interconnects and typically require clocks at least twice the carrier frequency. These clocks and the information transfer between interconnects can be reduced by considering simple analogue pulse based architectures. To this end the transmitter presented in Chapter 4 has a simple topology with few interconnects and requires no clocks at the carrier frequency.

Figure 3.4: Power consumption of PLL frequency generation circuits (grouped by publication date: 2000 and before, 2001 to 2005, 2006 to 2009).

3.2.3 Channel Block Coding

Channel block coding is applied to the information sequence before it is modulated. Essentially k input bits are mapped to n output bits in order to provide an improvement in the error rate. When n > k a decrease in the rate of transmission occurs, i.e. some redundancy is added to the information sequence. The underlying modulation scheme (e.g. PSK, FSK or PPM) will influence the performance of the block coding. Figure 3.5 shows the bit error performance between PSK (which is an example of an antipodal modulation scheme) and FSK using an AWGN channel, coherent detection and no block coding.
It is clear that at an error rate of 10^{-4}, orthogonal signalling outperforms PSK for M ≥ 8. Figure 3.6 shows the bit error rates for convolutional coded modulation schemes. The trellis structure used halves the rate but shows significant improvement in the bit error rate for a given E_b/N_0.

Figure 3.5: Comparison between 2PSK (antipodal) and M-ary FSK (orthogonal) modulation schemes. M-ary orthogonal modulation outperforms antipodal signalling for M ≥ 8. Channel model is AWGN.

When orthogonal signalling with 32 symbols is used a reduction of 20 % in energy per bit can be seen for a P_e of 10^{-4}. Alternatively, for the same E_b/N_0 required for uncoded 32-FSK, a reduction of over 2 orders of magnitude in the probability of error can be seen. Convolutional coding is a powerful tool for reducing energy requirements in communication of information. The performance of block coding depends on the modulation scheme used and it is clear that although orthogonal signalling requires much larger bandwidths (or lower rates) than schemes such as 2PSK, it can fundamentally achieve lower energy per bit. The convolutional encoder is straightforward to implement. An estimate of the dynamic and static power for implementation of the encoder using a Xilinx Spartan3E device is shown in figure 3.7. For data rates of 1 Mbps, the power consumption is 170 µW. It is certainly worth considering adding this circuit block to any low power implementation of a transmitter as its cost in terms of power is small.

Figure 3.6: Comparison of bit error rates for convolutional block coding using a constraint length of 7 and a generator code of [171 133].
The convolutional code error rates shown are upper bounds. It is evident that the performance still relies on the underlying modulation scheme. The channel model is AWGN.

Figure 3.7: Simulated power dissipation of a convolutional encoder with a constraint length of 7. The target FPGA was a Spartan3E. The static power for the entire chip was 34 mW but the encoder uses only 0.21 % of the available logic. This gives an estimated static power consumption for an ASIC implementation of 71.4 µW. At bit rates of 1 Mbps, the total power consumption required by the encoder is 170 µW, making this a suitable circuit block for improving circuit performance of low power transmitters.

3.3 PPM Simulation

In this section the simulation of a PPM modulation scheme for an arbitrary pulse shape is presented. It is shown that the output spectrum of PPM tends towards that of the pulse shape chosen. An overview of detecting orthogonal and pseudo-orthogonal symbols is shown. Simulation methods for a correlation and a matched filter coherent detector are then presented. The use of these simple models enables the performance of approximated pulses to be compared quickly without carrying out time intensive continuous time simulations.

3.3.1 PPM Spectrum Shape

The use of PPM has an effect on the output spectrum of the transmitter. The output power spectrum is given by [12]:

S(f) = \frac{1}{T}\,|G(f)|^2\,S_I(f) \qquad (3.9)

where S_I(f) is the power spectrum of the discrete information sequence, G(f) is the Fourier transform of the pulse function and T is the length of the pulse. The power spectrum of the discrete information sequence is:

S_I(f) = \sum_{k=-\infty}^{+\infty} R_I(k)\,e^{-j2\pi k f T} \qquad (3.10)

where R_I(k) is the autocorrelation of the discrete information sequence, I_n.
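Eq. (3.10) can be checked numerically: for a finite record, the periodogram of the information sequence equals the transform of its biased sample autocorrelation exactly. A pure-Python sketch (a stand-in for the Matlab simulations used in this chapter; the slot count and number of frames are arbitrary choices):

```python
import cmath, random

rng = random.Random(0)
L, frames = 4, 64                      # L time slots per PPM frame
seq = []
for _ in range(frames):                # one pulse per frame, in a random slot
    slot = rng.randrange(L)
    seq += [1.0 if k == slot else 0.0 for k in range(L)]

def autocorr(x, k):
    # biased sample autocorrelation R_I(k), cf. (3.10)
    return sum(x[n] * x[n + k] for n in range(len(x) - k)) / len(x)

def spectrum_from_autocorr(x, f):
    # S_I(f) = sum_k R_I(k) e^{-j 2 pi k f}, frequency normalised to the slot rate
    s = autocorr(x, 0)
    for k in range(1, len(x)):
        s += 2.0 * autocorr(x, k) * cmath.exp(-2j * cmath.pi * k * f).real
    return s

def periodogram(x, f):
    # |sum_n x_n e^{-j 2 pi f n}|^2 / N
    X = sum(xn * cmath.exp(-2j * cmath.pi * f * n) for n, xn in enumerate(x))
    return abs(X) ** 2 / len(x)

print(spectrum_from_autocorr(seq, 0.13), periodogram(seq, 0.13))
```

The two evaluations agree to machine precision for any frequency, which is the finite-record form of the Wiener-Khinchin relation underlying (3.10).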
For the case that I_n contains either +1 or −1 (with equal probability) the power spectrum is equal to unity, so the pulse shape G(f) entirely determines the spectral usage. For PPM this is not the case. A single pulse is transmitted every L time slots and the position of the pulse is in one of the L slots. Figure 3.8 shows a simulation of the PPM information sequence power spectrum, S_I(f). As the number of time slots increases the spectrum tends to unity. Therefore, the spectrum of a PPM modulation with a large number of time slots is approximately that of the pulse function used.

Figure 3.8: Simulation of the power spectrum for the PPM information sequence, S_I(f), for L = 2, 4, 8 and 16. As the number of time slots increases the power spectrum tends to unity.

3.3.2 Orthogonal Signalling

The crux of orthogonal signalling is that the set of symbols used to represent the information must be orthogonal. Each symbol represents a number of bits:

n = \log_2 L \qquad (3.11)

where L is the number of symbols. For the set of continuous time real symbols:

s = \left[\, s_1(t) \;\; s_2(t) \;\; s_3(t) \;\; \cdots \;\; s_L(t) \,\right]^T. \qquad (3.12)

In the notation used in (3.12), each row of s contains discrete samples of s_i(t). The sampling frequency must be much larger than the Nyquist rate to ensure each row of s approximates its continuous time representation. s is then an L × N_{samp} matrix, where N_{samp} is the number of samples. Orthogonality exists if the cross correlation of the symbols is zero and the autocorrelation is constant:

\int_0^T s_i(t)\,s_j(t)\,dt = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \neq j \end{cases} \qquad (3.13)

Alternatively, the normalised cross correlation matrix (inner product) has only 1's on the diagonal:

R_{ss} = s\,s' = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}. \qquad (3.14)

Any symbol set which obeys (3.14) is orthogonal and therefore is a power constrained modulation scheme.
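As an illustration, the orthogonality condition (3.14) can be checked numerically for a PPM symbol set built from time-shifted, unit-energy pulses (a minimal sketch assuming an idealised rectangular pulse confined to one slot; an approximated Gaussian pulse that leaks into adjacent slots would give small non-zero off-diagonal terms):

```python
import math

L, nslot = 4, 64                     # L time slots, nslot samples per slot
nsamp = L * nslot

def ppm_symbol(m):
    # unit-energy pulse confined entirely to slot m
    s = [0.0] * nsamp
    for k in range(nslot):
        s[m * nslot + k] = 1.0 / math.sqrt(nslot)
    return s

symbols = [ppm_symbol(m) for m in range(L)]

# normalised cross correlation matrix Rss = s s', cf. (3.14)
Rss = [[sum(a * b for a, b in zip(si, sj)) for sj in symbols] for si in symbols]
print(Rss)
```

For this ideal symbol set Rss is the identity matrix, confirming (3.13) and (3.14) in discrete time.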
3.3.3 Pseudo Orthogonal BER

In this section a general method for determining the bit error rate for a signalling scheme that is not completely orthogonal is described. The ML (Maximum Likelihood) detector for the case when each symbol has equal energy, is equiprobable and the noise is additive white Gaussian is given by [12]:

\hat{m} = \arg\max_{1 \le m \le M} \; \mathbf{r} \cdot \mathbf{s}_m \qquad (3.15)

where r · s_m is the inner product and can be written as:

\mathbf{r} \cdot \mathbf{s}_m = \int_{-\infty}^{+\infty} r(t)\,s_m(t)\,dt \qquad (3.16)

where r(t) is the received signal corrupted by noise and s_m(t) is the m-th symbol. The measure of signal orthogonality is the cross correlation matrix, which for orthogonal signals is shown in (3.14). For the case of M = 2 the closed form expression for the energy per bit required is:

\frac{E_b}{N_0} = \frac{2\left[\operatorname{erfc}^{-1}(2P_e)\right]^2}{1 - R_{ss}[1,2]}. \qquad (3.17)

Eq. (3.17) is not dependent on the pulse shapes, only on the cross correlation between the two pulses. Figure 3.9 shows a plot of how the SNR varies for different values of cross correlation at a fixed probability of error. It is evident that antipodal signal schemes (i.e. {g(t), −g(t)}) require the least amount of energy. On-Off keying (i.e. {g(t), 0}) requires 3 dB more energy than antipodal signals. Strongly correlated signals require a further increase in energy; an extra 3 dB for R_ss = 0.5. For M > 2 there are closed form expressions for the upper and lower probability bounds [12]. In this work a simple numerical simulation method will be presented which is valid for any value of M and which can also be used to model the ISI. Cross correlation between symbols results in the spread of transmitted energy amongst all the symbols. The energy components of the symbol set can be found by decomposing R_ss to find s, where here s is an M × M matrix of energy coefficients. With s symmetrical, the symbol set that corresponds to the cross correlation matrix R_ss is:

s = R_{ss}^{\frac{1}{2}} = V D^{\frac{1}{2}} V^{-1} \qquad (3.18)

where V are the eigenvectors of R_ss and D is a diagonal matrix containing the corresponding eigenvalues.
Each row of s represents the energy of each symbol.

Figure 3.9: The effect of cross correlation on the energy requirements of ML detection for a symbol set of size M = 2 for various error rates (P_e = 10^{-3}, 10^{-4} and 10^{-5}).

To simulate the probability of detecting the wrong symbol, a row of s is chosen at random and white noise is added to this vector to produce the received vector:

\mathbf{r} = \mathbf{s}_m + \mathbf{n} \qquad (3.19)

where s_m is the m-th row of s. When the autocorrelation of each of the symbols is unity, the noise vector n contains white noise samples with a variance given by:

\sigma_n^2 = \frac{1}{2\log_2(M)\,(E_b/N_0)}. \qquad (3.20)

Eq. (3.15) is then used to make an ML estimate of the received symbol. As only the cross correlation of the symbols is required, the algorithm can be implemented efficiently in a few lines of Matlab code, thus allowing fast evaluation of the error rate for a symbol set. The following sections extend the method to take into account ISI and the use of an approximate matched filter at the receiver.

3.3.4 PPM Detection

The received pulse will be the transmitted pulse, plus the overlap from the previously transmitted pulses. This implies that this transmission scheme has memory. It is also the same mechanism by which intersymbol interference (ISI) occurs. The optimum receiver for this type of interference is a correlator followed by a maximum likelihood sequence estimator (MLSE) [12]. A sub-optimum estimate, yet much easier to implement, is to use the ML detector for the case of no ISI, as described in section 3.3.3. In this case the estimator is no longer a maximum likelihood estimator because the symbol overlap has not been taken into account when forming the likelihood function.
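The Monte Carlo procedure of section 3.3.3 can indeed be sketched in a few lines. The following pure-Python stand-in for the Matlab implementation mentioned above uses M = 2, for which the matrix square root of (3.18) has a closed form (eigenvalues 1 ± ρ), and compares the simulated error rate against the closed form P_e = Q(√((1 − ρ) E_b/N_0)) implied by (3.17) and (3.20):

```python
import math, random

def q_func(x):
    # Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def symbol_set(rho):
    # s = Rss^(1/2) for M = 2, cf. (3.18): eigenvalues 1 +/- rho,
    # eigenvectors [1, 1]/sqrt(2) and [1, -1]/sqrt(2)
    a = 0.5 * (math.sqrt(1 + rho) + math.sqrt(1 - rho))
    b = 0.5 * (math.sqrt(1 + rho) - math.sqrt(1 - rho))
    return [[a, b], [b, a]]

def simulate_ser(rho, ebn0, trials, seed=1):
    # Monte Carlo error rate for the ML detector (3.15) with noise per (3.19)-(3.20)
    rng = random.Random(seed)
    s = symbol_set(rho)
    sigma = math.sqrt(1.0 / (2.0 * math.log2(2) * ebn0))   # (3.20) with M = 2
    errors = 0
    for _ in range(trials):
        m = rng.randrange(2)
        r = [s[m][0] + rng.gauss(0.0, sigma), s[m][1] + rng.gauss(0.0, sigma)]
        m_hat = max(range(2), key=lambda i: r[0] * s[i][0] + r[1] * s[i][1])
        errors += (m_hat != m)
    return errors / trials

ebn0, rho = 4.0, 0.0          # Eb/N0 = 6 dB, orthogonal symbol pair
pe_theory = q_func(math.sqrt((1.0 - rho) * ebn0))
pe_sim = simulate_ser(rho, ebn0, 200_000)
print(pe_theory, pe_sim)
```

The same routine with ρ > 0 reproduces the SNR penalty of correlated symbols shown in figure 3.9.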
However, using this detector in the presence of ISI still provides a good estimate of the transmitted data. This detection scheme can also be formed by using a matched filter followed by a sampler followed by the ML estimator for the case of no ISI. A block diagram of three estimation methods is shown in figure 3.10.

Figure 3.10: Possible detection schemes when the impulse of the transmit filter is sent across a channel. Using the ML estimator for the case of no ISI is suboptimal but straightforward to implement. The MLSE uses a trellis search to find the most likely sequence in order to mitigate ISI effects. The correlation method acts on a symbol by symbol basis to find the most likely transmitted symbol.

3.3.5 Correlation Detection

As ISI depends on the previously transmitted symbols, the current transmitted signal can be written as:

\tilde{s}(t) = \sum_{i=0}^{P} g(t - I_i T + iLT) \quad \text{for } 0 < t < LT \qquad (3.21)

where P is the number of previous symbols to take into account and I_k is the symbol information sequence that takes on values between 0 and L − 1. Thus, the current symbol can be written as:

\tilde{s}(t) = g(t - I_0 T) + \sum_{i=1}^{P} g(t - I_i T + iLT). \qquad (3.22)

Eq. (3.22) shows that each transmitted symbol is the sum of the current symbol and a contribution from the past symbols. Appendix E shows the derivation of the ML detector for correlation detection. Using correlation detection is not the optimum way to detect a signal corrupted with ISI but, as shown in section 3.6, the approximated pulses' performance is close to the theoretical limit for orthogonal detection.

3.3.6 Matched Filter Detection

The matched filter impulse response is the time-reversed complex conjugate of the transmit filter. For the ideal Gaussian pulse the transmit and matched filter responses are the same:

g^*(-t) = g(t) = e^{-\pi\alpha t^2}. \qquad (3.23)

This allows the same filter to be used at the transmitter and the receiver.
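For a sampled version of the symmetric Gaussian pulse, the property (3.23) is easy to verify numerically: the time-reversed pulse equals the pulse itself, and the matched filter output at zero lag equals the pulse energy (a minimal sketch; the sample grid and value of α are arbitrary choices):

```python
import math

alpha, dt = 4.0, 0.01
# sample g(t) = exp(-pi * alpha * t^2) on a grid symmetric about t = 0
ts = [(n - 300) * dt for n in range(601)]
g = [math.exp(-math.pi * alpha * t * t) for t in ts]

# time reversal leaves the Gaussian unchanged: g*(-t) = g(t), cf. (3.23)
g_rev = list(reversed(g))
mismatch = max(abs(a - b) for a, b in zip(g, g_rev))

# matched filter output at zero lag equals the pulse energy
energy = sum(x * x for x in g) * dt
peak = sum(a * b for a, b in zip(g, g_rev)) * dt
print(mismatch, energy, peak)
```

The computed energy also agrees with the analytic value ∫ e^{-2παt²} dt = 1/√(2α), since the grid comfortably covers the pulse support.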
However, only an approximation to the Gaussian pulse can be made, so the implemented matched filter will not be perfect. Figure 3.11 shows the effect of using the same approximated Gaussian filter for the transmit and matched filters. This is for the all pole approximation, see section 3.5.1. It clearly shows that as the order is increased, a better approximation to the ideal matched filter is produced, as expected from (3.23). To find an equivalent cross correlation matrix which can be used for ML detection, the approximated filter characteristics need to be taken into account. Appendix E shows the derivation of the ML estimator for approximate matched filter detection. A comparison of the BER performance using approximated pulses is shown in section 3.6. In the following section the choice and approximation of the pulse g(t) is discussed.

Figure 3.11: The effect of using the same approximated filter for the transmit filter and matched filter, for orders N = 2, 4, 6 and 8. g(t) ∗ g(t) is the convolution when using the same approximated filters; g(t) ∗ g(−t) is the convolution when using an ideal matched filter. As the order increases, the difference between g(t) ∗ g(t) and g(t) ∗ g(−t) is reduced.

3.4 Truncated Gaussian

Having seen that the pulse shape directly affects the spectrum of the modulated signal, it is important to choose a pulse that provides optimum data rate and spectral use. The choice of the Gaussian function is optimal in the sense that it provides the best spectrum usage for a given pulse width. It also attains the time-frequency uncertainty bound. The calculation of the time-frequency uncertainty for a number of pulse shapes is shown in Appendix D. A summary of the time-frequency uncertainty is shown in table 3.2.
ǫ is the normalised time-frequency uncertainty bound that varies between zero and unity (D.1).

Pulse Shape              Time-Frequency Uncertainty, ǫ
Rectangular              0
Sinusoidal               0.3
Gaussian Derivative      1/3
Truncated Gaussian       ǫ → 1, for αT² ≫ 1
Gaussian                 1

Table 3.2: Comparison of time-frequency uncertainty for a variety of pulse shapes. ǫ is the normalised time-frequency uncertainty (D.1).

The Gaussian pulse is an analytic function and has a Taylor series expansion. However, there are several properties that make a practical implementation difficult. For implementation of the pulse using linear systems a causal transfer function is required. The Gaussian function is not causal so it must be approximated by shifting the signal so that it is zero for negative time. This results in truncation of the pulse. For a digital representation, for example when using a look up table, the pulse would also have to be truncated. When comparing linear approximated Gaussian pulses it is more relevant to compare them to the ideal truncated Gaussian pulse because this can be implemented by a digital process. In this section the bandwidth-time product and the achievable attenuation of the truncated pulse are compared with the ideal Gaussian. The ideal Gaussian pulse is given by:

g(t) = e^{-\pi\alpha t^2} \qquad (3.24)

where α describes the width of the pulse. The corresponding Fourier transform [81], ignoring the scaling constant, is:

G(f) = e^{-\frac{\pi}{\alpha} f^2}. \qquad (3.25)

The bandwidth of the Gaussian pulse is dependent on the attenuation required and the value of α:

B = \sqrt{\frac{A \ln 10}{5\pi}\,\alpha} \qquad (3.26)

where A is the attenuation in dB. This equation shows that the bandwidth and the attenuation are linked. The Fourier transform of the truncated pulse can be found by evaluating the integral:

G_t(f) = \int_{-T/2}^{T/2} e^{-\pi\alpha t^2}\,e^{-j2\pi f t}\,dt \qquad (3.27)

where T is the length of the pulse in the time domain. Evaluation of (3.27), ignoring any scaling factors, results in:

G_t(f) = R(M)\,e^{-\frac{\pi}{\alpha} f^2} \qquad (3.28)

where R(·) denotes the real part and M is given by:

M = \operatorname{erf}\left(\frac{\sqrt{\pi\alpha}\,T}{2} + jf\sqrt{\frac{\pi}{\alpha}}\right). \qquad (3.29)

The term αT² appears in these equations (and also in subsequent equations involving the truncated Gaussian pulse). Therefore αT² is the parameter that will be used when describing the truncation effects on the Gaussian pulse. Figures 3.12 and 3.13 show the time and frequency domain truncated Gaussian pulse for different values of αT².

Figure 3.12: Truncated Gaussian time domain pulse for T = 1 and αT² = 1, 5 and 10. As αT² goes towards 0 a square wave is obtained.

In these examples the window width is set to T = 1 and the energy of the pulses has been set to be equal. The truncated pulse attenuation can be found numerically from |G_t(f)|. This is achieved by comparing the height of the peak and the 1st side lobe. Figure 3.14 shows a plot of the attenuation versus αT², which is valid for any value of T. The numerical result can be approximated using a linear equation:

A\,[\text{dB}] = 7.5\,\alpha T^2 + 12.6. \qquad (3.30)

The bandwidth-time product is also approximated using a linear equation derived from numerical calculations. Figure 3.15 shows a plot of the numerical and approximated bandwidth-time product for different values of αT². The side lobe merges with the main lobe periodically, causing the discontinuities at αT² ≈ 3.5 and αT² ≈ 7.75. The linear approximation is:

BT \approx \alpha T^2 + 1.4. \qquad (3.31)

Figure 3.13: Truncated Gaussian pulse frequency domain response for T = 1 and αT² = 1, 5 and 10. The bandwidth of the pulse is dependent on the value of αT².

Figure 3.14: Attenuation achieved by the truncated Gaussian pulse for various values of αT² (numerical result and linear approximation).

Figure 3.15: Bandwidth-time product for the truncated Gaussian pulse for various values of αT² (numerical result and linear approximation).
The use of the truncated Gaussian pulse allows a trade-off between attenuation and time-bandwidth product that is not available when considering square or sinusoidal pulses. Table 3.3 shows a comparison of the attenuation and bandwidth-time product for the rectangular, sinusoidal and truncated Gaussian pulses.

Pulse                    BW_ml T    A [dB]
Rectangular              2          13.3
Sinusoid                 3          23
Gaussian, αT² = 1        2.4        20
Gaussian, αT² = 5        5.4        50
Gaussian, αT² = 10       11.4       90

Table 3.3: Comparison of bandwidth and attenuation for various pulses. BW_ml is the main lobe bandwidth.

3.5 Gaussian Approximations

A pulse can be approximated using a variety of methods, some of which were discussed in section 3.2.2. To reduce the complexity of the transmitter, the approach used here is to apply an impulse to a continuous time filter, with the resulting response being an approximation to the Gaussian pulse:

g(t) = \mathcal{L}^{-1}\left[G(s)\,G_I(s)\right] \qquad (3.32)

where g(t) is the Gaussian approximation, G(s) is the frequency domain response of the linear filter and G_I(s) is the impulse function. For an ideal impulse (Dirac delta) G_I(s) = 1. In reality an ideal impulse function is not realisable as this would require infinite amplitude and infinitesimal width in time. Therefore, an approximation to the impulse function has to be made. Details of this approximation and its effects on obtaining the impulse response are shown in Appendix F. In the following sections an overview and comparison of signalling performance for three approximation methods is shown.

3.5.1 All Pole Approximation

This is the simplest of approximations that can be made from the frequency description of the Gaussian pulse. The time shifted Laplace representation of (3.25) is:

G(s) = e^{\frac{s^2}{4\pi\alpha}}\,e^{-\frac{T}{2}s} \qquad (3.33)

where the time shift of T/2 has been added to ensure that the approximation is causal. This function has a Taylor series expansion, but using it directly would result in a transfer function with no poles, which is not realisable.
Therefore, the following expansion is used in order to ensure an all pole approximation:

G(s) = \frac{1}{\text{Taylor}_N\left[\frac{1}{G(s)}\right]} = \frac{1}{s^N + a_{N-1}s^{N-1} + \cdots + a_0}. \qquad (3.34)

Figure 3.16 shows how the out of band attenuation varies according to the value of αT² and the order of the filter. The out of band attenuation is calculated at the bandwidth described by (3.31). The solid line shows the attenuation versus αT² for the truncated pulse, (3.30). This figure shows that the order of the filter needs to be increased to provide a larger attenuation. The stability of the filter is also dependent on αT². In particular, higher order all pole filters do not exist for low values of αT². Figure 3.17 shows the all pole approximation of a pulse where αT² = 6 and T = 1. This figure shows the spreading of energy into adjacent time slots, which will cause non-zero cross correlation between the PPM symbol set. The corresponding frequency response is shown in figure 3.18. As the order of the filter is increased the approximated pulse is closer to the ideal Gaussian.

Figure 3.16: Out of band attenuation versus αT² for all pole filters of order N = 1 to 6. The solid line shows the attenuation for the ideal truncated pulse.

Figure 3.17: All pole approximation of the Gaussian pulse function for αT² = 6 and T = 1, for orders N = 1 to 5. As the order is increased the approximation is improved and there is less energy spread into adjacent time slots.

Figure 3.18: Frequency response of the all pole Gaussian pulse approximation for αT² = 6 and T = 1. As the order is increased the approximated pulse is closer to the truncated pulse.
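The Taylor coefficients needed in (3.34) follow from 1/G(s) = e^{(T/2)s − s²/(4πα)}: differentiating f = e^{p} with p quadratic gives the recurrence n·t_n = c_1 t_{n−1} + 2c_2 t_{n−2}. A pure-Python sketch of the N = 4 approximation for αT² = 6, checked against the exact magnitude |G(jω)| = e^{−ω²/(4πα)} (the test frequencies are arbitrary choices):

```python
import math

alpha, T, N = 6.0, 1.0, 4
c1, c2 = T / 2.0, -1.0 / (4 * math.pi * alpha)   # 1/G(s) = exp(c1*s + c2*s^2)

# Taylor coefficients t_n of exp(c1*s + c2*s^2) via n*t_n = c1*t_{n-1} + 2*c2*t_{n-2}
t = [1.0]
for n in range(1, N + 1):
    t.append((c1 * t[n - 1] + (2 * c2 * t[n - 2] if n >= 2 else 0.0)) / n)

def G_approx(omega):
    # all pole approximation (3.34): 1 / Taylor_N[1/G(s)] evaluated at s = j*omega
    denom = sum(tn * (1j * omega) ** n for n, tn in enumerate(t))
    return 1.0 / denom

def G_exact_mag(omega):
    # |e^{s^2/(4*pi*alpha)} e^{-T*s/2}| at s = j*omega
    return math.exp(-omega ** 2 / (4 * math.pi * alpha))

print(abs(G_approx(2.0)), G_exact_mag(2.0))
```

In band the N = 4 all pole response tracks the exact Gaussian magnitude closely, while far out of band the s⁴ denominator term takes over and the response rolls off, which is the filtering behaviour exploited in figure 3.16.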
3.5.2 Cascade of Poles

The crux of this approximation is to use a cascade of 1st order systems. Consider the following time domain function:

h(t) = \frac{t^{N-1}}{(N-1)!}\,e^{-at}. \qquad (3.35)

Eq. (3.35) has a well known Laplace transform, which is the cascade of identical poles [82]:

H(s) = \frac{1}{(s+a)^N}. \qquad (3.36)

To approximate the Gaussian function, firstly centre this function so its maximum is at t = 0. This can be achieved by substituting t in (3.35) with [83]:

t = \frac{\sqrt{2(N-1)}\,\tilde{t} + N - 1}{a} \qquad (3.37)

to form a new function h̃(t̃). As the order, N, is increased, h̃(t̃) approximates a Gaussian pulse:

\lim_{N \to \infty} \tilde{h}(\tilde{t}) = e^{-\tilde{t}^2}. \qquad (3.38)

Expressions for α and T can be obtained by rearranging (3.37):

\alpha = \frac{a^2}{2\pi(N-1)}, \qquad (3.39)

T = \frac{2(N-1)}{a}, \qquad (3.40)

\alpha T^2 = \frac{2(N-1)}{\pi}. \qquad (3.41)

In this case the value of αT² is fixed by the order of the filter, unlike in the all pole approximation. Also the approximation is only defined for N > 2. Figure 3.19 shows a plot of the attenuation for different values of αT². The solid line shows the ideal truncated Gaussian pulse attenuation. With reference to figure 3.16 it is evident that the order of the cascade of poles filter must be much greater than that of the all pole approximation for a given attenuation. For example, 40 dB of attenuation requires an N = 10 cascade filter or an N = 3 all pole filter. Figures 3.20 and 3.21 show the time and frequency responses of the approximated pulses for N = 2 and N = 8. The time domain pulses show much greater overlap between symbols in comparison to the all pole approximation. For large values of N a very good approximation of the Gaussian pulse can be made.

Figure 3.19: Out of band attenuation versus αT² for the cascade of poles approximation, for orders N = 2 to 20. The solid line shows the attenuation for the ideal truncated pulse.
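The convergence in (3.38) can be checked numerically by normalising h(t) to its peak at t = (N−1)/a and sampling it through the substitution (3.37) (a pure-Python sketch; the orders tested and the range of t̃ are arbitrary choices):

```python
import math

def h_norm(N, a, t):
    # h(t) of (3.35) normalised to unity at its peak t = (N - 1) / a
    tp = (N - 1) / a
    return (t / tp) ** (N - 1) * math.exp(-a * (t - tp))

def max_gauss_error(N, a=1.0):
    # compare h~(t~) with exp(-t~^2) over t~ in [-1, 1], cf. (3.37)-(3.38)
    err = 0.0
    for i in range(-20, 21):
        tt = i / 20.0
        t = (math.sqrt(2 * (N - 1)) * tt + N - 1) / a     # substitution (3.37)
        err = max(err, abs(h_norm(N, a, t) - math.exp(-tt * tt)))
    return err

print(max_gauss_error(10), max_gauss_error(20), max_gauss_error(40))
```

The mismatch shrinks as N grows, but only slowly (roughly as 1/√N), which is why figure 3.19 needs much higher orders than the all pole approximation for the same attenuation.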
An extension of this approximation is used by several authors [83], [84] to produce a Gaussian pulse using the sum of complex first order systems. This extension makes the approximation bandpass, with real and imaginary outputs.

Figure 3.20: Cascade of poles approximation of the Gaussian pulse function for N=2 (αT² = 0.64) and N=8 (αT² = 4.5). For low order filters this approximation shows considerable spread of energy into adjacent time slots.

Figure 3.21: Frequency response of the cascade of poles approximation for N=2 and N=8.

3.5.3 Padé Approximation

The Padé approximation includes zeros as well as poles in order to provide a better estimate of the pulse function. [85] provides a derivation of the Padé approximation for an arbitrary function. The idea of the Padé approximation is to rationalise the Taylor series expansion of G(s):

$$G(s) = c_0 + c_1 s + \cdots + c_k s^k + O(s^{k+1}) \qquad (3.42)$$

where k + 1 is the order of the approximation and G(s) is shown in (3.33). The Padé approximation is then:

$$\hat{G}(s) = \frac{P(s)}{Q(s)} = \frac{p_M s^M + \cdots + p_1 s + p_0}{q_N s^N + \cdots + q_1 s + q_0} \qquad (3.43)$$

where k = M + N and N > M to ensure a realisable transfer function. See [85] for a succinct derivation for finding the values of the coefficients p_i and q_i. From this derivation the denominator coefficients are given by:

$$\mathbf{q} = \mathrm{Nullspace}\begin{bmatrix} c_{M+1} & c_M & \cdots & c_{M+1-N} \\ c_{M+2} & c_{M+1} & \cdots & c_{M+2-N} \\ \vdots & & \ddots & \vdots \\ c_{M+N} & c_{M+N-1} & \cdots & c_M \end{bmatrix} \qquad (3.44)$$

where $\mathbf{q} = [q_0\; q_1\; \cdots\; q_N]^T$ and $c_i = 0$ for $i < 0$. For normalisation, scale q such that q_N = 1. As N > M, the numerator coefficients are then given by:

$$\mathbf{p} = \begin{bmatrix} c_0 & 0 & \cdots & 0 \\ c_1 & c_0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ c_M & c_{M-1} & \cdots & c_0 \end{bmatrix}\mathbf{q}_M \qquad (3.45)$$

where $\mathbf{p} = [p_0\; p_1\; \cdots\; p_M]^T$ and $\mathbf{q}_M$ is the first M + 1 entries of q.

Figure 3.22 shows a plot of the achievable attenuation when using the Padé approximation for N = 4 and N = 6 filters.

Figure 3.22: Out of band attenuation versus αT² for the Padé approximation (left: N = 4 with M = 0 to 3; right: N = 6 with M = 0 to 4). The solid line shows the attenuation for the truncated pulse.

As M is increased the out of band attenuation of the pulse decreases. This occurs because the Padé approximation introduces zeros in the transfer function, each of which tends to add 20 dB/decade of gain. Figure 3.23 shows the time domain approximation for the N = 4 filter and figure 3.24 shows it for N = 6. The corresponding frequency domain pulses are shown in figures 3.25 and 3.26, from which it is evident that a better approximation of the frequency response can be made using the Padé approximation than with the other approximation methods. This is useful for signal processing applications where a bank of filters is required; for example, the Padé approximation is used for the Gabor transform shown in Chapter 5. However, for a PPM modulation scheme the Padé approximation results in worse out of band attenuation than the other approximation methods; the all pole approximation is superior in terms of attenuation. An advantage of the Padé approximation is that the intersymbol interference is reduced when more zeros are used.

Figure 3.23: Time domain response of the Padé approximation for N=4.

Figure 3.24: Time domain response of the Padé approximation for N=6.
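The coefficient construction of (3.44) and (3.45) can be sketched numerically; numpy's SVD stands in for the nullspace computation. This is a sketch verified on the classic [1/1] Padé approximant of e^s, not on the thesis' G(s):

```python
import numpy as np

def pade_from_taylor(c, M, N):
    """[M/N] Pade coefficients from Taylor coefficients c = [c0, ..., c_{M+N}].

    q spans the nullspace of the Toeplitz-like matrix of Taylor coefficients
    (with c_i = 0 for i < 0), normalised so q_N = 1; p then follows from a
    lower-triangular multiply.  Coefficients are returned in ascending order."""
    A = np.zeros((N, N + 1))
    for i in range(1, N + 1):            # row i enforces sum_j q_j c_{M+i-j} = 0
        for j in range(N + 1):
            if 0 <= M + i - j <= M + N:
                A[i - 1, j] = c[M + i - j]
    _, _, Vt = np.linalg.svd(A)          # last right singular vector spans the nullspace
    q = Vt[-1] / Vt[-1][N]               # normalise q_N = 1
    p = np.array([sum(c[i - j] * q[j] for j in range(min(i, N) + 1))
                  for i in range(M + 1)])
    return p, q
```

With the Taylor coefficients of e^s this recovers the textbook approximant (1 + s/2)/(1 - s/2), up to a common scale factor absorbed by the q_N = 1 normalisation.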
Figure 3.25: Frequency response of the Padé approximation for N=4.

Figure 3.26: Frequency response of the Padé approximation for N=6.

3.5.4 Creating bandpass filters

So far the low pass (baseband) filter approximation has been shown; in this section several methods of creating a bandpass filter are described. When there are many oscillations in the transfer function the Taylor series has difficulty converging, which makes direct approximation of a bandpass filter difficult. To create a bandpass filter, create a prototype low pass filter with T = 1 and make the following substitution for s:

$$s \to \frac{T\left(s^2 + \omega_0^2\right)}{2s}. \qquad (3.46)$$

Making this substitution causes the poles to be reflected about ±ω0 on the imaginary axis, so the order of the filter is doubled. Using (3.46) will also introduce transmission zeros, which are undesired if an all pole filter is required. To make the all pole approximation bandpass, firstly scale the prototype such that the width of the pulse is correct, then reflect the modified poles about the required frequency on the imaginary axis. For each pole p_i of the baseband filter, apply the following equation to find the new pole positions p_n:

$$p_n = \mathrm{Re}[p_i] \pm j\left(\mathrm{Im}[p_i] + \omega_0\right). \qquad (3.47)$$

An orthogonal filter can be created by multiplying the bandpass transfer function by s, i.e. taking the derivative of the transfer function; this introduces a zero at the origin. In the cascade of poles approximation a common way to produce the bandpass filter is to make a in (3.35) complex. This technique is known as the cascade of first order complex systems (CFOS) [84].
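The pole reflection (3.47) is a one-line operation. A minimal sketch, assuming a real prototype pole; adding the conjugate copy keeps the resulting polynomial coefficients real:

```python
import numpy as np

def baseband_to_bandpass(poles, w0):
    """Apply (3.47): shift each baseband pole up by w0 on the imaginary axis
    and add its conjugate, doubling the filter order."""
    shifted = [complex(p.real, p.imag + w0) for p in poles]
    return shifted + [p.conjugate() for p in shifted]
```

For example, the 1st order prototype pole at s = -1 with ω0 = 10 becomes the pair -1 ± 10j, i.e. the real denominator s² + 2s + 101.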
For the case of the Padé approximation, a set of orthogonal filters can be created by finding the inverse Laplace transform of G(s) and multiplying it by a sine or cosine waveform of the desired frequency. This technique is used to produce the orthogonal bandpass filters in Chapter 5.

3.6 Gaussian Approximation BER

A comparison of the bit error performance of the all pole, cascade of poles and Padé approximations, for both the correlation and approximate matched filter receivers, is made using the simulation framework described in section 3.3. To prevent interference between adjacent channels the out of band emissions must be limited. For many channels the limit is defined by a spectral mask made available by communications regulators, for example Ofcom [86]. In Chapter 4 a modulator operating in the 30 - 37.5 MHz band is presented which has a measured out of band attenuation of 30 dB. At the time of writing no spectral mask is available for this band. In order to make a performance comparison between the approximate Gaussian pulses, several pulse prototypes with an attenuation of greater than 35 dB are compared. Table 3.4 shows the approximations to the truncated pulse that meet this criterion.

Approximation Method   Filter Order   αT²    BT
All Pole               N=2            8      9.4
All Pole               N=3            5      6.4
All Pole               N=4            5      6.4
Cascade                N=10           6      7.4
Padé                   M=1, N=4       4.75   6.15
Padé                   M=2, N=6       7      8.4

Table 3.4: Approximated pulses with an attenuation of > 35 dB and a bandwidth-time product given by BT = αT² + 1.4.

For the pulses shown in table 3.4, the BER performance for L=2, 4, 8 and 16 is shown in Appendix E.3. Figures 3.27 and 3.28 show the extra energy required over ideal orthogonal signalling for correlation detection and matched filter detection, respectively. Figure 3.27 shows that the bit error performance tends to that of ideal orthogonal signalling as the number of symbols is increased.
The N = 3 all pole approximation performs better than the orthogonal signalling limit because the normalised distance between adjacent time slots is greater than unity. This can be seen by examining the cross correlation matrix for L=4:

$$\tilde{R}_{ss} = \begin{bmatrix} 1.0021 & -0.0540 & 0.0073 & -0.0003 \\ -0.0540 & 1.0021 & -0.0539 & 0.0073 \\ 0.0073 & -0.0539 & 1.0020 & -0.0539 \\ -0.0003 & 0.0073 & -0.0539 & 0.9938 \end{bmatrix}. \qquad (3.48)$$

The Padé approximation is able to meet the ideal orthogonal performance for both correlation and matched filter receivers. The N = 2 all pole filter using the matched filter requires significantly more energy; however, as the number of time slots increases the performance improves. The reason for the poorer performance can be seen in figure 3.11, which shows that the approximation of the matched filter for N = 2 is poor, resulting in a smaller distance between the symbols. The same is true of the N=10 cascade of poles filter, which shows a roughly constant performance penalty as the number of time slots is increased. To summarise, it is possible to create a pseudo-orthogonal approximate Gaussian pulse PPM symbol set using continuous time filters whose BER performance approaches that of an ideal orthogonal symbol set as the number of symbols is increased.

Figure 3.27: Extra transmitter energy required over ideal orthogonal pulses when using a correlation detector with the approximated pulses.

Figure 3.28: Extra transmitter energy required over ideal orthogonal pulses when using the approximated matched filter detector with the approximated pulses.
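The cross correlation structure of (3.48) can be reproduced for any sampled pulse shape. A sketch (unlike (3.48), no frame truncation is modelled here, so the diagonal is exactly unity):

```python
import numpy as np

def ppm_crosscorrelation(g, L, slot):
    """Cross-correlation matrix of L copies of pulse g, each shifted by `slot`
    samples, normalised by the energy of the first pulse (cf. (3.48))."""
    n = len(g) + (L - 1) * slot
    S = np.zeros((L, n))
    for i in range(L):
        S[i, i * slot : i * slot + len(g)] = g
    return S @ S.T / np.dot(S[0], S[0])
```

For a well-localised pulse the off-diagonal terms are small, confirming the near-orthogonality of the time-shifted symbol set.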
3.7 Conclusion

In this chapter it has been suggested that a low complexity modulation scheme, such as PPM, is beneficial for low power transmitters. It has been shown that the quiescent power of a PPM transmitter determines the optimum number of symbols, which minimises the overall power consumption of the transmitter. Block coding circuits typically require a few hundred µW of power, so block coding would be a useful addition for dramatically improving the performance of low power transmitter architectures: it allows the specification of a raw BER of 10⁻⁴, and using block coding with L = 32 will improve the error rate to 10⁻⁷. The spectrum of PPM for a large number of symbols tends towards the spectrum of the pulse function used. A simulation framework for finding the BER of an arbitrary pulse, using a correlation receiver and an approximated matched filter, has been described. These approaches assume an AWGN channel, which is only valid where the multipath contributions are small; however, they allow a methodical comparison of arbitrary pulse shapes. The ideal truncated Gaussian pulse has been analysed in depth. The advantage of using a Gaussian pulse over a rectangular or sinusoidal pulse is that the bandwidth-time product can be chosen so as to provide better out of band attenuation. Approximate expressions for the attenuation and bandwidth of a truncated Gaussian pulse have been derived. A comparison between three continuous time approximation methods and the truncated Gaussian pulse has been made. This showed that the all pole approximation has better out of band attenuation than the other pulses. The Padé approximation is superior in approximating the shape of the Gaussian in both frequency and time, but it leads to worse out of band attenuation due to the addition of zeros in the transfer function. The simulated BERs for the approximated pulse shapes show that many of the pulses can reach or exceed the performance of ideal orthogonal signalling.
As the number of time slots is increased, the performance of the correlation receiver converges to the ideal orthogonal performance. The effect of implementation errors in the filter, for example due to component mismatch, can be analysed by simulating the BER with the erroneous g(t). The analysis in this chapter gives confidence that high performance PPM transmitters and receivers can be constructed using approximated Gaussian continuous-time filters. The all pole Gaussian pulse approximation is used as the pulse prototype for the PPM scheme described in the following chapter, and the Padé approximation is used for the analogue Gabor transform described in Chapter 5.

4 Communication using 2nd Order TX Elements

4.1 Introduction

The UK Office of Communications has made the 30-37.5 MHz band available for short range medical implant devices [87]. In this chapter a transmitter and receiver operating at a centre frequency of 33 MHz, with an out of band attenuation of 30 dB, is analysed and implemented. The transmitter is based upon 2nd order transmitting elements, which are naturally available when using inductive coils. It is shown that using the resonant properties of an inductive coil provides much better coupling of DC power to the magnetic field than direct modulation of the inductor with a current source. The all pole Gaussian pulse approximation shown in the previous chapter is used as the prototype pulse. This pulse can be decomposed into the sum of shifted and scaled 2nd order systems, each of which is then implemented using a separate transmitting element. If the same number of receiving elements as transmitting elements is used at the receiver then an implicit matched filter is implemented. The N=2 all pole Gaussian baseband pulse is used as the prototype function to implement a coherent modulation scheme. The out of phase pulse is generated by delaying the impulses applied to the transmitting elements.
BER results are presented which show that variation in the centre frequency of the approximated pulse has the most significant effect on performance. To mitigate the error due to centre frequency drift, an estimation technique for use at the receiver is proposed. Simulated results show that a preamble overhead of 1 % allows sufficiently accurate estimation of the centre frequency. Thus the transmitter requires no on-line tuning, which keeps the transmitter topology simple. A discrete circuit implementation of the transmitter and the front end of the receiver are described, and measured results are presented.

4.2 Making Use of Antenna Topology

Many antennas can be modelled as lumped element circuits. In particular, a solenoid coil can be modelled as an inductor in parallel with a capacitor. This immediately gives rise to a 2nd order circuit element, which can be used as part of the pulse generating filter. Figure 4.1 shows an example of a 2nd order transmitting element. Assuming that R_X > R_S >> R_L, the 2nd order response of the resonant circuit is approximately given by:

$$\frac{I_{TX}}{I_{IN}} = \frac{\omega_0^2}{s^2 + s\frac{\omega_0}{Q} + \omega_0^2} \qquad (4.1)$$
$$Q \approx \omega_0 C R_X \qquad (4.2)$$
$$\omega_0 \approx \frac{1}{\sqrt{LC}}. \qquad (4.3)$$

These equations show that Q and f0 can be tuned independently to obtain the required 2nd order response. The amplitude of the pulse can be controlled by adjusting V_IN or R_S, and negative pulses can be generated by reversing the orientation of the coil. Experimental results of coil characterisation are described in Appendix G.

4.2.1 Efficiency of the 2nd Order Element

Figure 4.2 shows two circuits that can be used as inductive transmitters. The current sources can be implemented using a transistor.

Figure 4.1: Circuit diagram showing a 2nd order transmitting element. The magnetic field in the inductor follows a damped sinusoidal response.

In the direct modulation approach all of the current flowing through the inductor must also flow through the current source.
In contrast, the pulse based modulation scheme applies an impulse, h_I(t), to a resonant tank, which produces an exponentially decaying magnetic field; only the impulse current is provided by the source. To make a comparison between the two cases it is ensured that the same rms current flows through the inductors in both schemes. In the pulse based scheme the current is an exponentially decaying sinusoid. In the direct modulation scheme the current could be a sinusoid, i.e. an OOK modulator. The rms current flow through the inductor is defined as:

$$I_{RMS} = \sqrt{\frac{1}{T}\int_0^T i_L(t)^2\,dt} \qquad (4.4)$$

where i_L(t) is the current in the inductor and T is the length of the transmitted pulse.

Figure 4.2: Using the inductor as a transmitter in a direct modulation approach and a pulse based approach. Direct modulation is typically found in low power OOK implementations where the current through the inductor is determined by an oscillator.

The 2nd order decaying current generated in the pulse based modulation circuit due to an impulse is:
≈ T 3.2 Q (4.11) The ratio of power dissipation of the pulse based modulation to the direct modulation is: (P M ) PRMS (DM ) PRMS 1.1 ≈√ . Q (4.12) Therefore pulse based modulation requires less power to generate the same rms current in the inductor when the Q of the resonant tank is greater than 1.2. This result is important as it shows the advantage in terms of power consumption when using high Q 2nd order transmitting elements. In the following sections the use of 2nd Order transmitting elements to produce shaped pulses is considered. 4.3 Pulse Decomposition The pulse shape of a 2nd order transmitter element excited by an impulse is a damped exponential. A single element is capable of implementing the All Pole Gaussian approximation with an order of N = 1. The 1st order approximation to a Gaussian is far from ideal, as shown in Chapter 3. To implement higher order Gaussian pulses the transfer function can be decomposed into the sum of 2nd order transfer functions. These can then be implemented using separate 2nd order transmitting elements. The magnetic fields produced in the inductors then sum in free space to give the required pulse approximation. In this 125 Section f0 [MHz] Q Delay [nS] Amplitude 1 33.63 41 0 -0.52 2 33.00 28 1.5 1 3 32.40 40 2.9 -0.54 Table 4.1: Characteristics for the 3rd order baseband minimal pulse. αT 2 = 4.6, T = 1 µs and f0 = 33 MHz. section a minimal and maximal decomposition of the transfer function will be considered. 4.3.1 Minimal Decomposition For hardware minimisation it is important to consider the minimal representation where the least number of components is required. In this case each bandpass transfer function can be decomposed into a sum of N 2nd order transfer functions each with a transmission zero: G(s) = aN s + b2N a1 s + b21 + · · · + . 
s2 + c1 s + d1 s2 + CN s + dN (4.13) In Appendix H it is shown how an approximation to each of these sections can be constructed by using the delayed and scaled impulse response of a standard 2nd order low pass filter. The characteristics of each 2nd order element for a 3rd order All Pole Approximation to the Gaussian pulse with αT 2 = 4.6, T = 1 µs and f0 = 33 MHz is shown in table 4.1. Figure 4.3 shows the individual impulse responses that sum to form the desired pulse. Due to the use of the delays in approximating the transfer function zeros there is an impulse at t = 0. This has the effect of raising the out of bound attenuation, see figure 4.4. 4.3.2 Maximal Decomposition For the maximal representation split (4.13) into its constituent low pass and bandpass filters, thus creating 2N 2nd order filters. The bandpass filter can be approximated by applying an impulse delayed by 4f10 , i.e. 126 0.5 0.4 0 0.3 −0.5 0 0.2 0.5 1 1.5 2 1 0.1 0.5 0 0 −0.5 −1 0 −0.1 0.5 1 1.5 2 0.5 −0.2 0 −0.3 −0.5 0 −0.4 0.5 1 Time [µS] 1.5 2 0 0.5 1 1.5 2 Time [µS] Figure 4.3: Sum of 2nd order impulse responses to form the desired pulse using a minimal decomposition. αT 2 = 4.6, T = 1 µs and f0 = 33 Mhz. The impulse at t = 0 is due to the delay used to approximate the zeroes of the 2nd order transfer functions. The dashed line is the envelope of the All Pole Gaussian approximation. 127 0 −10 N=1 −20 Attenuation [dB] −30 N=2 N=3 min −40 −50 −60 Ideal Truncated −70 −80 −90 25 N=3 max 30 35 40 45 Frequency [MHz] Figure 4.4: Spectral comparison of the minimally decomposed pulses. Due to the approximation of the transfer function zeroes the 3rd order minimal decomposition has worse out of band attenuation than the maximum decomposition. approximating orthogonality. With this scheme there is no undesired impulse at the beginning of the pulse. Also only a single delay is required. However, generally twice as many 2nd order transmitting elements are required. 
For the 3rd order representation one of the required amplitudes is an order of magnitude lower than the others and can be neglected, so only five 2nd order sections are required. The characteristics for the 3rd order maximal pulse with αT² = 4.6, T = 1 µs and f0 = 33 MHz are shown in table 4.2. The resulting sum of the 2nd order filter impulse responses is shown in figure 4.5, and the frequency response is shown in figure 4.4.

Section   f0 [MHz]   Q    Delay [ns]   Amplitude
1         33.63      41   0            0.15
2         33.63      41   7.58         0.5
3         33.00      28   7.58         -1
4         32.38      40   0            -0.15
5         32.38      40   7.58         0.5

Table 4.2: Characteristics for the 3rd order baseband maximal pulse. αT² = 4.6, T = 1 µs and f0 = 33 MHz.

Figure 4.5: Sum of 2nd order impulse responses to form the desired pulse using a maximal decomposition. αT² = 4.6, T = 1 µs and f0 = 33 MHz. There is no undesired impulse at t = 0 for the maximal representation. The dashed line is the envelope of the all pole Gaussian approximation.

The 2nd order baseband approximation is a special case where the minimal and maximal representations are the same. The 2nd order function also requires no delays, and its amplitudes are of the same magnitude. This makes implementation of the 2nd order approximation much easier, and less susceptible to component tolerances. The characteristics of each transmitter element for the 2nd order baseband approximation are shown in table 4.3. The frequency response is shown in figure 4.4.

Section   f0 [MHz]   Q    Delay [ns]   Amplitude
1         33.32      45   0            -1
2         32.69      44   0            1

Table 4.3: Characteristics for the N=2 pulse with αT² = 4.6, T = 1 µs and f0 = 33 MHz.

4.3.3 TX Element Coupling

As several transmitting elements are required there will be mutual coupling between all the transmitting inductors, which will affect the transmitted pulse shape. Appendix I describes the effect of coupling on the pulse shape when using two coupled transmitting elements.
This analysis shows that as the coupling is increased it becomes difficult to compensate for the change in pole positions: as coupling increases, the bandwidth of the pulse increases and the Q decreases. The coupling effect can be mitigated by ensuring that the transmitting antennas are far enough apart that the mutual coupling is small. From the experimental coupling results in Appendix G, a distance of 20 mm between the centres of the TX coils gives a coupling constant of 1.8 × 10⁻³. This value of coupling is small enough that the transmitting elements can be assumed to be independent.

4.4 2nd Order Receiving Element

The use of a parallel tuned tank at the receiver is advantageous as it allows a matched filter to be implemented, and the higher the Q the better the receiver sensitivity. In order to use a parallel tank as a receiving element, a high impedance amplifier is required to couple the coil to the measuring circuit. The high impedance is required so that the Q of the tank is not affected.

Figure 4.6: Diagram indicating the coupling between each transmitting element and each receiving element.

4.4.1 Implicit Matched Filter

When there are N transmitting and N receiving elements an approximated matched filter receiver can be implemented. Figure 4.6 shows a transmitter and receiver consisting of N transmitting elements and N receiving elements. In the frequency domain the sum of the individually transmitted pulses is defined as:

$$G_{TX} = \begin{bmatrix} G_{TX1} & G_{TX2} & \cdots & G_{TXN} \end{bmatrix}\mathbf{1}_{(N\times 1)} = \hat{G}_{TX}\,\mathbf{1}_{(N\times 1)}. \qquad (4.14)$$

In a similar way the transfer function of the receiver can be defined as:

$$G_{RX} = \hat{G}_{RX}\,\mathbf{1}_{(N\times 1)}. \qquad (4.15)$$

Each receiving element will see a contribution from each of the transmitting elements, i.e. N separate paths. The coupling constant of the ij-th path is given by k_ij. Defining the received signal as the sum of the outputs of the individual receiver elements gives:

$$Y = \hat{G}_{RX}\begin{bmatrix} k_{11} & k_{12} & \cdots & k_{1N} \\ k_{21} & k_{22} & \cdots & k_{2N} \\ \vdots & & \ddots & \vdots \\ k_{N1} & k_{N2} & \cdots & k_{NN} \end{bmatrix}\big[\hat{G}_{TX}\big]^T. \qquad (4.16)$$

For the approximated matched filter the received signal will be:

$$Y = G_{RX}\big[G_{TX}\big]^T = \hat{G}_{RX}\,\mathbf{1}_{(N\times N)}\big[\hat{G}_{TX}\big]^T. \qquad (4.17)$$

Therefore, if all the coupling constants between the transmitter and receiver elements are equal then the matched filter is exactly implemented at the receiver. For non-equal coupling constants the matched filter approximation does not hold, and this will affect the bit error performance.

4.4.2 Coupling Matrix for N=2

Figure 4.7 shows the coupling paths when using two solenoid transmit and receive antennas, with the receiver aligned with the transmitter. In practice the user may orientate the receiver at any angle; for simplicity it is assumed that the transmit antenna is aligned. This provides a reasonable estimate of the coupling matrix, which can be used to evaluate BER performance. For identical transmit and receive solenoids the coupling constant is given by:

$$k = \frac{lR^2}{2d^3}\sin\theta_{tx}\sin\theta_{rx} \qquad (4.18)$$

where l is the length of the solenoid, R is the radius of the solenoid, θ_tx and θ_rx describe the coil orientations and d is the distance between the two solenoids.

Figure 4.7: Coupling paths when using two transmit and receive antennas. The coupling matrix for an N=2 transmitter can be derived from this simple model.

The coupling matrix is then:

$$k = \frac{lR^2}{2d^3}\begin{bmatrix} 1 & \left(1+\frac{a_{tx}^2}{d^2}\right)^{-1} \\ \left(1+\frac{a_{tx}^2}{d^2}\right)^{-1} & 1 \end{bmatrix}. \qquad (4.19)$$

It is clear that in the limit of a large distance between the transmitter and receiver the matched filter approximation is achieved.

4.4.3 Transmission Distance

Due to the impulse applied to the 2nd order transmitting element, the peak current in the transmit inductor is:

$$I_{TX}^{(\mathrm{Peak})} = \frac{V_{DD}\,\omega_0\,(T_I - \tau)}{R_S}. \qquad (4.20)$$

The peak induced voltage in the receiver element will be:

$$V_{ind}^{(\mathrm{Peak})} = \omega_0 M I_{TX}^{(\mathrm{Peak})} \qquad (4.21)$$

where M is the mutual inductance, given by:

$$M = k\sqrt{L_{TX}L_{RX}}. \qquad (4.22)$$

The peak voltage across the receiver coil is thus:

$$V_{RX}^{(\mathrm{Peak})} = \omega_0 Q_{RX}\,k\sqrt{L_{TX}L_{RX}}\,\frac{V_{DD}}{R_S}\,\frac{3\pi}{5}. \qquad (4.23)$$

Table 4.4 shows the received voltage versus transmission distance when V_DD/R_S = 10 mA, k = 16×10⁻⁹/d³, L_TX = L_RX = 200 nH and Q_RX = 45. The resolution in bits of a 10 bit ADC with a 1 Vpp input range is also shown, with and without amplification.

Distance [mm]   VRX [mV] (G=0dB)   ADC [bits]   VRX [mV] (G=10dB)   ADC [bits]   VRX [mV] (G=60dB)   ADC [bits]
10              564                8            -                   -            -                   -
15              166                7            524                 8            -                   -
20              70                 3            222                 4            -                   -
50              4.6                -            14.2                1            -                   -
100             0.56               -            1.78                -            560                 6
200             0.08               -            0.22                -            80                  2
500             0.004              -            0.01                -            4                   -
1000            0.0006             -            0.002               -            0.6                 -

Table 4.4: Transmission distance using matched TX and RX 2nd order elements.

It is evident from table 4.4 that a gain of at least 10 dB is required in order to transmit data up to 50 mm. For longer distances a gain of 60 dB could be realised, which would take the detection range up to 500 mm.

4.4.4 Attenuation Map

The transmit antennas have to be placed a certain distance apart to reduce the coupling between them. The consequence of this is that the sum of the magnetic fields will be different at each point on the transmitting plane; therefore the pulse shape seen at each point will vary. Figure 4.8 shows the geometry of the problem with two transmit antennas. Here only a 2-D plane is shown; however, the analysis could easily be extended to three dimensions if required.

Figure 4.8: Geometry of the transmit antennas. The transmit antennas are placed far enough apart that the effect of coupling between them can be ignored. The magnetic field at an arbitrary point can be found if θ, d, a_tx and the magnetic fields at the centres of the two solenoids are known.
The magnetic field at an arbitrary point can be written as:

$$H = H_1\frac{\sin\theta_{tx1}}{d_{tx1}^3} + H_2\frac{\sin\theta_{tx2}}{d_{tx2}^3}. \qquad (4.24)$$

The path lengths are given by:

$$d_{tx1} = \sqrt{\left(d\cos\theta - \frac{a_{tx}}{2}\right)^2 + d^2\sin^2\theta} \qquad (4.25)$$
$$d_{tx2} = \sqrt{\left(d\cos\theta + \frac{a_{tx}}{2}\right)^2 + d^2\sin^2\theta} \qquad (4.26)$$

and the angles of the paths are given by:

$$\sin\theta_{tx1} = \frac{d\sin\theta}{d_{tx1}} \qquad (4.27)$$
$$\sin\theta_{tx2} = \frac{d\sin\theta}{d_{tx2}}. \qquad (4.28)$$

Figure 4.9 shows the out of band attenuation when the transmit antennas are 20 mm apart. The magnitude of the magnetic field has been set constant on the θ = 0 axis so that the comparative attenuation at any transmit power can be observed. The figure shows that the attenuation of 31 dB is close to the expected attenuation at distances far from the transmit antennas. To either side of the centre line the attenuation is smaller than the ideal because the contributions from the two antennas are no longer equal.

Figure 4.9: Out of band attenuation when the transmit antennas are 20 mm apart. The approximate pulse is centred at 33 MHz with T = 1 µs and αT² = 4.6. The positions of the two transmit antennas are shown by the two solid points at the top of the graph.

4.5 Demodulation

A block diagram of the transmitter and receiver is shown in figure 4.10. The receiver samples the output of both receive elements and sums them to form the approximated matched filter output.

Figure 4.10: Block diagram of the receiver for the pulse based communications link. The digital processing estimates the carrier frequency of the transmitted pulse from a preamble sequence.
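The attenuation map geometry of section 4.4.4, (4.24)-(4.28), translates directly into a field evaluation routine; a sketch assuming equal drives H1 = H2:

```python
import numpy as np

def field_at(theta, d, a_tx, H1=1.0, H2=1.0):
    """Summed field of the two transmit coils at polar point (theta, d),
    built directly from the path lengths (4.25)-(4.26) and angles (4.27)-(4.28)."""
    d1 = np.sqrt((d * np.cos(theta) - a_tx / 2) ** 2 + (d * np.sin(theta)) ** 2)
    d2 = np.sqrt((d * np.cos(theta) + a_tx / 2) ** 2 + (d * np.sin(theta)) ** 2)
    return (H1 * (d * np.sin(theta) / d1) / d1 ** 3
            + H2 * (d * np.sin(theta) / d2) / d2 ** 3)
```

For equal drives the field is symmetric about the centre line between the two antennas, and far from the array it reduces to twice the single-coil 1/d³ contribution, consistent with the 31 dB far-field attenuation seen in figure 4.9.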
4.5.1 Coherent Detection

For coherent detection it is important that the receiver knows the frequency and phase of the transmitted carrier. The carrier with a frequency offset may be written as:

$$g_c(t) = \sin\big(2\pi(f_0 + \delta f_0)t\big) = \sin\big(2\pi f_0 t + 2\pi\,\delta f_0\,t\big) \qquad (4.29)$$

which shows that the error in frequency can be considered as a phase error term. For a centre frequency of 33 MHz and a frequency variation of 1 %, the phase error at the centre of a 1 µs pulse will be almost 60°. Unless an estimate of the frequency is made at the receiver, using orthogonal signals without tuning the transmitter will require considerably more energy than the ideal receiver.

4.5.2 Sub Sampling

The receiver ADCs operate using subsampling. The sampling frequency is 30 MHz, so the centre frequency of the resulting pulses is around 3 MHz.

4.5.3 Non Coherent Detection

The receiver is capable of implementing non coherent detection, where the frequency and phase of the carrier do not need to be estimated. The rate of non coherent detection is reduced by 1/(LT) because the orthogonal symbol cannot be used.

4.6 BER Performance

In this section the BER performance of the order N=2 transmitter is analysed using the narrow band receiver with coherent demodulation. The simulation framework for the matched filter, see Chapter 3, is used to obtain an indication of BER performance.
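The phase accumulation penalty of (4.29) is easy to check; the figures below reproduce the roughly 60° error quoted in section 4.5.1 and the 12° per time slot used in section 4.6.2:

```python
import math

def phase_error_deg(f0, frac_offset, t):
    """Phase error 2*pi*(delta f0)*t accumulated by a fractional carrier
    frequency offset, from (4.29), expressed in degrees."""
    return math.degrees(2 * math.pi * f0 * frac_offset * t)
```

A 1 % offset on a 33 MHz carrier gives 59.4° at the centre (t = 0.5 µs) of a 1 µs pulse, and a 0.1 % offset gives 11.88° over a full 1 µs slot.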
The result of a BER test when changing only the centre frequency of the pulse is shown in figure 4.12. The sensitivity of the BER to a difference in the centre frequency is large; therefore a good frequency estimation scheme at the receiver is required. It is clear that the centre frequency must be estimated to within 0.1 % in order to provide a performance advantage over non-coherent detection. The main reason for the large error is phase accumulation, as indicated by (4.29).

Figure 4.11: Effect of transmit distance on BER performance for an N=2 pulse with 4 time slots and coherent demodulation; total of 8 symbols. The distance between the transmit antennas is 20 mm. The performance decreases as the transmit distance is decreased because the accuracy of the matched filter approximation decreases due to k ≠ 1_(2×2).

4.6.1 Accuracy of Estimation

The procedure for estimating frequency, phase and symbol offset from the preamble data is outlined in table 4.5. Standard techniques are used, which can be implemented in a digital processor at the receiver. Figure 4.13 shows the bias and variance of the frequency, phase and offset estimates of the subsampled preamble packet with random sampling phase. This figure shows that the estimation method is suitable for obtaining a frequency estimate well within 1 % and a phase estimate within a few degrees. The variance of the estimates could be reduced further by increasing the length of the preamble.

Table 4.5: Preamble frequency, phase and offset estimation algorithm. The received preamble sequence is y_p(t).

Step 1 - Offset estimation: Find the baseband preamble sequence by squaring y_p(t) and low pass filtering the result. Cross correlate the baseband preamble sequence with the known sequence to find the offset.
Step 2 - Carrier frequency estimation: Obtain a frequency estimate of each preamble pulse by computing the periodogram of each pulse. Estimate the centre frequency from the peak of the periodogram. Average the resulting estimates to find the centre frequency.
Step 3 - Carrier phase estimation: Using the frequency estimates for each pulse from step 2, find the most likely phase of the carrier for each pulse.
Step 4 - Phase accumulation: Compute the average difference in phase between the estimated phases of the pulses in step 3. Use this to estimate the symbol synchronisation.

Figure 4.12: Effect of centre frequency shift on BER performance for an N=2 pulse with 4 time slots and orthogonal signalling; total of 8 symbols. The plot clearly shows the performance penalty when the centre frequency of the pulse is not estimated closely enough. A change in f0 of 0.2 % makes the performance worse than the non-coherent L=4 scheme.

Figure 4.13: Mean and variance of frequency, phase and symbol offset estimation; 100 runs at each Eb/N0 value. The parameters that are varied are f0(TX) with standard deviation (sd) 1 %; f0(RX) with sd 1 %; sampling phase between 0 and 2π; transmit delay between 0 and 4 µs. The preamble length is 32 time slots.

4.6.2 Preamble Length

As shown in figure 4.12, the frequency estimation error for a 33 MHz carrier needs to be less than 0.1 %, which equates to a phase difference of 12° per time slot. For an equivalent phase shift on the sub sampled pulse the frequency needs to be estimated to within 1 %. The preamble sequence, when the transmitter and receiver clocks are synchronised, has been chosen to be 4 empty time slots followed by 32 time slots with the pulse sequence [g_I(t) 0 · · · g_I(t) 0], where g_I(t) is the in phase pulse. The preamble ends with 4 empty slots. Using a sequence of the same pulses allows the pulses to be averaged, thus reducing the noise when carrying out the estimation. 2048 time slots containing data follow the preamble, a 2 % overhead. When the clocks are not synchronised, as will typically be the case, a preamble length of 64 was chosen in order to improve the estimates. In both cases a moving average of the last twenty preamble estimates is used to smooth the estimates, which prevents catastrophic packet failures. For non synchronised clocks, 1024 time slots containing data follow the preamble, a 7 % overhead.

4.7 Tuning

The analysis in the previous sections shows that the accuracy of the centre frequency of each 2nd order element plays a significant role in BER performance. Using a preamble approach with an estimation algorithm relaxes the requirement of ensuring the coils are perfectly tuned. The main cause of centre frequency drift will be environmental factors such as temperature. Appendix J shows the variation of centre frequency and Q over temperature for each 2nd order element. Over a temperature range of 55 °C the centre frequency changes by ±0.4 % and Q changes by ±0.85 %. These values indicate that the estimation algorithm would be capable of estimating the centre frequency in the presence of temperature change. Therefore, the 2nd order coils could be tuned once at start up in order to take care of drift due to ageing.
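Step 2 of table 4.5, periodogram based carrier frequency estimation averaged over the preamble pulses, can be sketched as follows. The pulse model and parameter values are illustrative, not the thesis implementation:

```python
import numpy as np

def estimate_carrier(pulses, fs):
    """Estimate the carrier frequency from the periodogram peak of each
    preamble pulse, then average the estimates (table 4.5, step 2)."""
    nfft = 1 << 16                 # zero padding refines the peak location
    freqs = np.fft.rfftfreq(nfft, d=1 / fs)
    peaks = [freqs[np.argmax(np.abs(np.fft.rfft(p, nfft)) ** 2)] for p in pulses]
    return float(np.mean(peaks))

# Example: noisy subsampled pulses centred near 3 MHz, sampled at 30 MHz
fs, f0 = 30e6, 3.01e6
t = np.arange(256) / fs
rng = np.random.default_rng(0)
pulses = [np.exp(-((t - t.mean()) * 0.7e6) ** 2) * np.sin(2 * np.pi * f0 * t + ph)
          + 0.05 * rng.standard_normal(t.size)
          for ph in rng.uniform(0, 2 * np.pi, 16)]
f_hat = estimate_carrier(pulses, fs)
```

Averaging the per-pulse peak estimates over a preamble of identical pulses is what allows the estimator to land well within the 1 % target.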
2nd order circuit elements are popular sub circuits for building higher order filters, so there is a large amount of literature on tuning circuits for 2nd order filter elements. A survey of these techniques can be found in chapter 7 of [88]. This survey outlines a digital frequency tuning method which requires four clock signals. This tuning method is able to tune the centre frequency to within 0.3 %. The problem with the standard tuning techniques described in [88] is that they rely on being able to apply an input signal to the filter. This is fine for tuning the receive coils, but in the case of the 2nd order transmitting element, figure 4.1, the only available stimulus is the impulse response. The response of the filter to the impulse can be measured using a sense coil located close to the transmitting inductor. Using a sense coil with lower inductance and lower Q than the transmitting coil reduces the effect of pole shifting caused by excess coupling. Appendix K outlines a method for tuning the 2nd order transmitting element using the impulse response and a single sinusoid. This shows that a theoretical error of 0.02 % in the tuning of the centre frequency of a coil with a Q of 40 is obtainable.

4.8 Circuit Design

Having determined some key performance measures in the previous sections, an implementation of a transmitting circuit is presented. This circuit was designed using readily available discrete components. The top level design of the transmitter circuit is shown in figure 4.14. Detailed schematics for both the transmitter and receiver circuits can be found in Appendix L. The PULSE and IQ SELECT signals are generated as part of the measurement setup, see Section 4.9. For every transmitted symbol there are L time slots, and in each time slot either an I or Q pulse can be transmitted. Only a single pulse is transmitted per symbol. The most significant bit of the incoming symbol data determines whether an I or Q pulse is transmitted.
The rest of the bits determine which time slot the pulse is transmitted in. In the following sections a description of the main circuit blocks is provided. The supply voltage to the digital gates was chosen to be 2.8 V and the supply voltage to the TX elements was chosen to be 5 V.

Figure 4.14: Top level schematic of the transmitter.

4.8.1 I-Q Delay Circuit

The I-Q delay circuit selects whether the transmitted pulse is in phase or in quadrature phase. Figure 4.15 shows the circuit diagram of the I-Q delay circuit. The pulse delay is the difference between the time constants of the two paths. Coarse control of the delay is obtained using the variable capacitor; fine control is obtained by adjusting the supply voltage of an inverter, which also allows automatic tuning of the pulse width. The inverters used are NC7SV04 devices, which have a quiescent current of less than 1 µA. The I-Q select switch is an SN74LVC2G66, which has a quiescent current of less than 10 µA.

Figure 4.15: Circuit diagram of the digital pulse delay circuit. Coarse pulse width control is obtained using a variable capacitor. Fine pulse width control is obtained by adjusting the supply voltage to an inverter.

4.8.2 Pulse Generation Circuit

Figure 4.16 shows an implementation of a digital pulse generator. It uses the same operating principle as the I-Q delay circuit: the pulse width is the difference between the time constants of the two paths. The NAND gate used is the 74LVC1G00, which has a quiescent current of less than 10 µA. The buffers between the pulse generation circuit and the TX elements are in the same package to minimise mismatch in propagation delay, ensuring that the impulse arrives at each TX element at the same time. The 74LVC2G04 dual gate inverter was used for this purpose; it has a quiescent supply current of less than 10 µA.

Figure 4.16: Circuit diagram of the digital pulse generator. Coarse pulse width control is obtained using a variable capacitor. Fine pulse width control is obtained by adjusting the supply voltage to an inverter.

The TX element is shown in figure 4.1. RX is the series combination of a fixed and a variable resistor, used to adjust the Q of the resonant tank. C is the capacitance of the coil, a 68 pF fixed capacitor and a 6-30 pF variable capacitor in parallel. A varactor diode can be added to enable voltage controlled tuning. RS is 500 Ω, to set the impulse height to 10 mA. The transistor switch used is a BFG410W wide bandwidth NPN transistor.

4.9 Measured Results

Figures 4.17 and 4.18 show photographs of the transmitter and receiver modules. For this prototype, tuning circuits were not implemented; the centre frequencies and Q were set once, using a spectrum analyser to determine the values. Despite the lack of tuning circuits the resulting measured pulse is close to the desired response. In order to measure the BER performance an FPGA development board with two ADCs was used. The measurement setup for BER performance is shown in figure 4.19. Separate clock domains for the transmitter and receiver circuits were used so that the transmitter and receiver clocks could be chosen to be synchronised (derived from the same clock) or unsynchronised (derived from independent clocks). The receiver front end consists of two 2nd order elements followed by a 10 dB FET amplifier.

4.9.1 Transmitted Pulse

The transmitted pulse was measured in the frequency domain using an HP E4403B spectrum analyser and an HP 11940A magnetic field probe. The measured spectrum of the resulting sum of the 2nd order responses is shown in figure 4.20. The spectral contributions from each transmitting element, measured on the centreline at a distance of 5 mm, are also shown in figure 4.20.
The noise floor of the instrument is at -69.4 dBm, which prevents accurate measurement of the out of band attenuation. Extrapolation of the spectrum indicates that the attenuation at 30 MHz is around 30 dB, as expected. The time domain measurements of the individual transmitting elements are shown in figure 4.21. The 33.32 MHz pulse has a higher Q than desired, but the 32.69 MHz pulse shows a good match to the ideal envelope. The resulting sum of these pulses shows a very good match to the ideal envelope. The measured pulse shows slightly more energy spread into the adjacent time slot; therefore, inter-symbol interference will be increased, resulting in a BER performance worse than expected.

Figure 4.17: Photograph of the transmitter module. The centres of the two transmit elements are separated by 20 mm.

Figure 4.18: Photograph of the receiver module. A single FET stage has been used to buffer the tuned coil from the ADC input. The FET stage has a gain of 10 dB.

Figure 4.19: Block diagram of the measurement setup used to evaluate BER performance.

Figure 4.20: Measured spectrum of the transmitter magnetic field on the centreline of the inductors at a distance of 5 mm. The pulse repetition rate used was 200 kHz. (a) Magnetic field due to both transmitting elements. (b) Contribution of the 33.32 MHz transmitting element. (c) Contribution of the 32.69 MHz transmitting element.
Figure 4.21: Measured time domain pulses of the transmitter magnetic field, averaged over 128 pulses. (a) Resulting sum of the transmitting elements. (b) Contribution of the 33.32 MHz transmitting element. (c) Contribution of the 32.69 MHz transmitting element.

4.9.2 Orthogonal Pulse

The I-Q delay used to produce the orthogonal pulse was tuned by hand; a high speed oscilloscope was used to measure the pulse delay. The measured orthogonal pulses at 33 MHz are shown in figure 4.22. The cross correlation of these two pulses was measured at the receiver following the sub sampling operation. The sub-sampled measured orthogonal pulses are shown in figure 4.23. The cross correlation of these two pulses is:

C_XX = [  0.997   -0.0359
         -0.0359   1.0030 ]    (4.30)

Equation (4.30) shows that the orthogonality of the pulses is close to the ideal identity matrix.

4.9.3 Receiver Demodulation

Figure 4.24 shows a train of pulses measured at the receiver, together with the result of the IQ demodulation scheme. The preamble, which is used to estimate the centre frequency and phase of the carrier, can be seen at the beginning of the pulse train.

Figure 4.22: Measured time domain orthogonal pulses at 33 MHz, averaged over 128 pulses to increase SNR.

Figure 4.23: Measured time domain orthogonal pulses after the sub sampling operation.

Figure 4.24: A train of pulses measured at the receiver including the preamble sequence. The result of the IQ demodulation is also shown.
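The matrix in (4.30) is the energy-normalised correlation of the sampled I and Q pulses at zero lag. A sketch with synthetic sub sampled pulses (illustrative waveforms, not the measured data):

```python
import numpy as np

fs, fc = 30e6, 3e6                       # ADC rate and subsampled centre frequency
t = np.arange(0, 3e-6, 1 / fs)
env = np.exp(-((t - 1.5e-6) * 2e6) ** 2)      # pulse envelope (illustrative)
g_i = env * np.cos(2 * np.pi * fc * t)        # in phase pulse
g_q = env * np.sin(2 * np.pi * fc * t)        # quadrature pulse

def corr_matrix(a, b):
    """Inner products normalised so that a perfectly orthogonal,
    equal energy pair gives the 2x2 identity, cf. eq (4.30)."""
    e = np.sqrt(np.dot(a, a) * np.dot(b, b))
    return np.array([[np.dot(a, a), np.dot(a, b)],
                     [np.dot(b, a), np.dot(b, b)]]) / e

C = corr_matrix(g_i, g_q)
```

For ideal quadrature pulses C is the identity matrix; the small off-diagonal terms in (4.30) reflect the residual error of the hand-tuned delay.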
4.9.4 Power Consumption

Using the same notation for PPM power consumption as described in Chapter 3, the overall energy per bit of the transmitter for coherent detection using two orthogonal pulse shapes is given by:

E_bit = T (P_PULSE + L·P_QUIES + P_TX) / log2(2L)    (4.31)

The quiescent DC supply to the circuit was measured as 3.6 µW. P_PULSE and P_TX are measured at L = 1, which corresponds to a pulse repetition rate of 1 MHz. The dynamic power measurements were P_PULSE = 2.8 mW and P_TX = 3.6 mW. The optimal number of time slots resulting in minimum transmitter energy is given by (see Chapter 3):

L_opt = (P_PULSE + P_TX) / (3 P_QUIES)    (4.32)

Thus the optimum number of time slots is L_opt = 512. The measured and modelled (4.31) power consumption for various values of L is shown in figure 4.25. As the minimum lies on a shallow curve, the reduction in rate at values of L > 16 yields only a small improvement in the energy per bit. If non coherent detection were used the energy per bit would be the same but the rate would be reduced; for example, for two time slots the non coherent rate would be 500 kbps, compared with 1 Mbps for the coherent case. In the next section the BER performance of the system for L < 16 is presented.

Figure 4.25: Modelled and measured power consumption of the transmitter. The model allows extrapolation of the measured results to find the minimum transmitter energy.

4.9.5 Measured BER Performance

To measure the BER performance, 100,000 bits were transmitted and then demodulated for each test. Figures 4.26 and 4.27 show the BER performance for coherent detection. For a target BER of 10^-3 the maximum transmission distance when the TX and RX clocks are synchronised is 60 mm; for non synchronised clocks the maximum transmission distance is 50 mm. Figure 4.27 clearly shows the advantage of increasing the number of symbols in order to reduce the required transmission energy. In the high SNR regime the BER performance levels out. This effect can be attributed to the matched filter approximation worsening as the receiver moves closer to the transmitter, (4.19). The ideal BER performance for two time slots (four symbols) shows that an extra 10 dB of transmit power is required by the fabricated transmitter when using synchronised clocks. This is larger than predicted by the earlier theoretical BER analysis; there is a performance penalty because the circuit does not use continuous tuning methods and the accuracy of the initial tuning is only within 1 %. When using non synchronised clocks an extra 15 dB of transmitter power is required compared to ideal modulation. The extra 5 dB over synchronised clocks is due to the variance in estimating the frequency and phase parameters. Using longer preamble sequences would decrease the variance of the estimates and improve the non synchronised performance, at the expense of increased overhead. Figures 4.28 and 4.29 show the BER performance when using non coherent detection. In this case the range can be extended to 70 mm when using synchronised clocks and to 65 mm when using non synchronised clocks. These results show that non coherent detection can achieve a given BER with lower energy than the coherent case, albeit at lower transmission rates. The reason non coherent modulation achieves better results than coherent detection is the performance of the frequency and phase estimator. In summary, the BER performance of coherent detection is poorer than that of non coherent detection, primarily due to the frequency and phase estimator.
The advantage of using coherent detection is that higher bit rates can be achieved using the same bandwidth. Using non synchronised clocks, a bit error rate of 10^-3 at a rate of 1 Mbps and a distance of 50 mm is achievable using coherent detection. The maximum rate of non coherent detection is 500 kbps, over a distance of 65 mm for a BER of 10^-3.

Figure 4.26: BER performance over distance for orthogonal coherent detection. (a) RX and TX clocks synchronised; preamble overhead is 1 %. (b) RX and TX clocks independent; preamble overhead is 7 %.

Figure 4.27: BER performance versus Eb/N0 for orthogonal coherent detection. (a) RX and TX clocks synchronised; preamble overhead is 1 %. (b) RX and TX clocks independent; preamble overhead is 7 %.

Figure 4.28: BER performance over distance for non coherent detection. (a) RX and TX clocks synchronised; preamble overhead is 1 %. (b) RX and TX clocks independent; preamble overhead is 7 %.

Figure 4.29: BER performance versus Eb/N0 for non coherent detection. (a) RX and TX clocks synchronised; preamble overhead is 1 %. (b) RX and TX clocks independent; preamble overhead is 7 %.
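The "ideal" reference curves above correspond to orthogonal signalling with coherent ML detection over AWGN. Such a reference can be simulated directly; the sketch below is an illustrative idealised model, not the thesis simulation framework:

```python
import numpy as np

def orthogonal_ser(ebn0_db, M=8, trials=200_000, seed=0):
    """Monte Carlo symbol error rate of M-ary orthogonal signalling with
    coherent ML detection over AWGN (an idealised link model)."""
    rng = np.random.default_rng(seed)
    es_n0 = 10 ** (ebn0_db / 10) * np.log2(M)   # Es = Eb * log2(M)
    # Unit-variance correlator outputs; the sent symbol has mean sqrt(2*Es/N0)
    y = rng.standard_normal((trials, M))
    y[:, 0] += np.sqrt(2 * es_n0)
    return float(np.mean(np.argmax(y, axis=1) != 0))

ser_4db = orthogonal_ser(4.0)   # a few percent
ser_8db = orthogonal_ser(8.0)   # orders of magnitude lower
```

For orthogonal signalling the bit error rate is M/(2(M-1)) times the symbol error rate, so the same waterfall shape appears in the BER curves.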
4.10 Comparison with State of the Art

Table 4.6 shows comparisons between the transmitter presented in this chapter and some of the latest published inductive transmitters. Inductive circuits for inter-chip communication have not been included, as these are designed for transmission distances in the order of µm.

Table 4.6: Comparison of some transmitters suitable for short range communication.

Ref | Year | Modulation | f0 [MHz] | Rate [Mbps] | Power [µW] | Energy [pJ/bit] | Distance [mm] | BER
[89] | 2007 | Antipodal | ≈200 | 0.25 | 6 | 24 | 59 | 10^-2.5
[90] | 2009 | Antipodal | ≈400 | 200 | 2,300 | 12 | 20 | 10^-3
[91] | 2007 | FSK | 433 | 0.330 | 6,750 (b) | 20,500 | - | -
[92] | 2007 | Unknown (a) | 400 | 0.8 | 10,500 | 13,000 | - | -
This work | 2009 | PPM coherent L=2 | 33 | 1 | 3,200 | 3,200 | 50 | 10^-3
This work | 2009 | PPM non coherent L=2 | 33 | 0.5 | 3,200 | 6,400 | 65 | 10^-3
This work | 2009 | PPM coherent L=16 | 33 | 0.3125 | 400 | 1,300 | 50 | 10^-3

(a) ZL70101 commercial device; limited information available. (b) Includes receiver power.

[90] describes a high data rate inductive link suitable for multimedia applications. [91] uses FSK modulation to transmit neural data from a body sensor over a 433 MHz inductive link; direct modulation of the inductor by a VCO is used to generate the transmitted data. [89] and [90] both contain a pulse based transmitter where a bipolar voltage impulse is applied to the transmitting inductor; the current through the inductor is then approximately a triangular pulse. [89], [90] and this work all apply an impulse to an inductor. However, in this work the resonant aspects of the antenna are exploited in order to shape the pulse. [89] and [90] apply a voltage across the inductor for the whole duration of the transmitted pulse, which is akin to the direct modulation approach shown in figure 4.2. Applying a voltage across the inductor for the length of the pulse helps to reduce the intersymbol interference. In this work, by contrast, the intersymbol interference is reduced by using the resonant properties to approximate a Gaussian pulse shape.
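The rates, powers and energies quoted for this work all follow from the measured values in section 4.9.4: the pulse and TX circuits are active for one slot of duration T out of every L-slot symbol, and energy per bit is average power divided by bit rate. A consistency check (same model as eq (4.31)):

```python
import math

T = 1e-6                                  # time slot length [s]
P_PULSE, P_TX = 2.8e-3, 3.6e-3            # dynamic powers measured at L = 1 [W]
P_QUIES = 3.6e-6                          # measured quiescent supply [W]

def avg_power(L):
    # Pulse and TX circuits are active for T out of every L*T symbol period
    return (P_PULSE + P_TX) / L + P_QUIES

def energy_per_bit(L, coherent=True):
    # Coherent PPM carries log2(2L) bits per symbol, non coherent log2(L)
    bits = math.log2(2 * L) if coherent else math.log2(L)
    return avg_power(L) * L * T / bits
```

With these figures, coherent L = 2 gives about 3.2 mW and 3.2 nJ/bit at 1 Mbps, non coherent L = 2 gives about 6.4 nJ/bit at 500 kbps, and coherent L = 16 gives about 400 µW and 1.3 nJ/bit.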
4.10.1 Integration

The power consumption of the discrete circuit presented is significantly higher than that of the state of the art transmitters shown in table 4.6. The power consumption of the pulse generator could be reduced significantly if implemented as an integrated circuit. For a digital circuit the dynamic power consumption of each gate is given by [93]:

P_gate = C·V²·f    (4.33)

Equation (4.33) shows that the supply voltage and load capacitance play a significant role in the dynamic power consumption. For the discrete implementation the supply voltage was 2.8 V and the capacitance per gate (load plus a power dissipation capacitance modelling the internal gate) is approximately 35 pF. With integration the capacitance could easily be reduced to 1 pF and the supply voltage reduced to 1 V. With these conservative figures a reduction in power of approximately 270 times can be achieved per switching element. This would lead to a power consumption for the pulse generation circuitry of approximately P_PULSE = 10 µW. The quiescent power due to leakage will be much less than 0.5 µW. The use of a larger inductance would also decrease the RF power required, as described by the efficiency of generating the magnetic field, section 4.2.1. In [90] a planar inductor with a diameter of 10 mm was used, which has an inductance of 2 µH, ten times the inductance of the coil used in this chapter. The use of the planar inductor would increase the efficiency of creating the magnetic field by a factor of ten. Therefore, the transmitter power required to achieve the same performance as the discrete implementation shown in this chapter would be reduced to P_TX = 50 µW. For L = 8 the estimated energy per bit for a fabricated IC would be 15 pJ/bit.

4.11 Conclusion

A novel circuit that makes use of the resonant properties of 2nd order transmitting elements to form an approximated pulse has been shown.
It has been shown that applying a current impulse to a resonant tank, rather than directly modulating an inductor, reduces the rms power by a factor of √Q. A theoretical decomposition of an all pole Gaussian pulse approximation as the sum of 2nd order responses has been described. The N = 2 approximation does not require the creation of delays, and the amplitudes of the pulses from the two 2nd order elements are the same, thus simplifying implementation. It is shown that by using the same number of transmitting and receiving elements an implicit approximated matched filter can be implemented at the receiver. The coupling matrix and radiated magnetic field profile of the transmitter with two elements spaced at 20 mm are shown. As the carrier frequency of the transmitting elements can only be tuned to within 1 %, an estimate of the carrier frequency is required in order to carry out coherent demodulation. An algorithm that estimates the parameters from a known preamble sequence is proposed. The BER performance of coherent detection is poorer than that of non coherent detection, primarily due to the frequency and phase estimator. The advantage of coherent detection is that higher bit rates can be achieved using the same bandwidth. Using non synchronised clocks, a bit error rate of 10^-3 at a rate of 1 Mbps and a distance of 50 mm is achievable using coherent detection. The maximum rate of non coherent detection is 500 kbps, which can be achieved at a distance of 65 mm for a BER of 10^-3. The power consumption of the discrete circuit implementation is much higher than the state of the art; however, the projected power consumption would be 15 pJ/bit if the circuits were fabricated on an integrated circuit. In summary, a transmitter circuit and receiver topology suitable for a medical implant transmitter operating in the 30-37.5 MHz band have been presented.
The transmitter achieves 30 dB of out of band attenuation and can transmit at rates of up to 1 Mbps over a 50 mm distance by using coherent detection.

5 An Analogue Gabor Transform

5.1 Introduction

In the Gabor transform (also known as the short time Fourier transform) [94], a set of basis functions is used to represent the input signal. The input signal is projected onto these basis functions to form a set of discrete time, continuous amplitude coefficients. An approximate reconstruction can then be made by multiplying these coefficients by the same basis functions. The algorithm transforms a signal into multiple parallel channels, each of which occupies a lower bandwidth than the original signal. A possible application of this is low power sensor networks: apply the Gabor transform and then use a set of analogue to digital converters with lower sampling rates. The downstream digital processor can then process the received data using parallel channels, each at the lower rate. A lower clock rate processor would be required, which would reduce dynamic power dissipation [39]. In Section 5.2 an introduction to the Gabor transform is provided, followed by a comparison between a time domain and a direct filter approach in Section 5.3. This analysis shows that the direct filter approach requires significantly less hardware than the time domain approach. Section 5.4 discusses how the filters were designed based on the framework devised by Haddad [85]. Section 5.5 shows a novel method to quickly design state space continuous time filters. Finally, Section 5.6 presents measured results.

Figure 5.1: Information diagram illustrating the discrete time and frequency energy resulting from the projection of a signal onto the pseudo basis functions.

5.2 The Gabor Transform

The Gabor transform [94] was proposed in 1946. A useful way of describing this process is via the information diagram, figure 5.1.
In this diagram the time and the frequency axes are split into a set of units, each of size ∆t∆f, where the time-frequency uncertainty as defined by Gabor is:

∆t∆f ≥ 1/2    (5.1)

The arrow protruding from each unit is the complex coefficient obtained by projecting the signal onto an analysis window. For the case of Gaussian windows the bound in (5.1) is met, so ∆t∆f = 1/2. The continuous time and frequency Gabor transform is given by:

C_c(τ_c, f_c) = ∫_{-∞}^{∞} s(t - τ_c) h_c(f_c, t) dt    (5.2)

where C_c are the coefficients, τ_c and f_c are continuous variables describing the time and frequency localisation, and h_c is the window. The Gabor transform is a special case of a wavelet transform that has a uniform time and frequency grid. It is shown in [95] that the continuous wavelet transform can be made discrete in frequency and time without loss of information, provided that the sampling in both frequency and time is sufficiently dense:

C(k, p) = ∫_{-∞}^{∞} s(t - p∆t) h(k, t) dt    (5.3)

where C(k, p) are the coefficients, s(t) is the input signal, p is the time index, k is the frequency index and h(k, t) is the analysis window for each frequency. The window functions are complex sinusoidal Gaussian functions given by:

h(k, t) = e^{-α²t²} e^{j2π(f0 + (B/L_k)k)t}    (5.4)

where B = ∆f·L_k is the bandwidth and f0 is the frequency of the first analysis window. α sets the spread of the pulse envelope and is given by:

α = √(2π)·∆f    (5.5)

In order to reconstruct the original signal from the coefficients an approximate inverse can be used:

s(t) ≈ Σ_{p=0}^{L_p-1} Σ_{k=0}^{L_k-1} C(k, p) h(k, t - p∆t)    (5.6)

where L_p is the total number of time samples and L_k is the total number of frequency divisions. Eq. (5.6) uses the same windows as the forward transform. If different window functions were chosen for the inverse then an exact reconstruction could be arrived at. However, the inverse (5.6) still allows reconstruction of the original information in the signal, albeit with some error. Refer to [95] for more details on this issue. A bit error test is used in the measurement section to observe the performance of the transform and its inverse.

Figure 5.2: Truncation of the Gaussian pulse g(t) = e^{-α²t²} to a length Tw.

For a practical implementation the analysis window must be truncated to a length Tw, as shown in figure 5.2. In this case the integration limits of (5.3) are from -Tw/2 to Tw/2. To implement the window in the analogue domain the complex sinusoid is split into its real and imaginary parts and these are processed separately; figure 5.11 shows a set of analysis windows. The analysis windows must also be causal. This is achieved by shifting the window, figure 5.2, by Tw/2, thus ensuring the function is zero for t < 0.

5.3 Implementation of Convolution

There are several ways in which the convolution operator of (5.3) can be implemented using analogue circuits. In [96], [97] complex demodulation is used to convert the signal to baseband. The complex baseband signal is then passed through two low pass filters to produce the real and imaginary coefficients, which can then be sampled to obtain the discrete time transform. Both these schemes require two multipliers and two low pass filters per complex coefficient. The scheme described in [98] produces a time domain representation of the analysis window using a baseband filter and a local oscillator. The input signal is then mixed with the analysis window and integrated to find the coefficients. This scheme requires four multipliers and a single low pass filter to obtain one complex coefficient. The advantage of the circuits described above is that the complexity of the filters is reduced, so half the number of poles are required compared to using bandpass filters. However, these circuits require clock generation circuits and/or demodulation circuitry.
By using advances in higher order complex filter design [85], the convolution can be implemented directly using bandpass filters. This removes the requirement for clock generation circuits, potentially reducing the amount of hardware and power required. The direct filter approach requires two of the circuits shown in figure 5.3 per complex coefficient; the two filters have the same centre frequency but are orthogonal to each other. In the direct time domain approach the impulse response of the bandpass filter is multiplied with the input signal and then integrated using a resettable integrator, figure 5.4. Again, two of these circuits are required per complex coefficient. The analysis in the following section compares the direct filter approach and the direct time domain approach in terms of hardware and noise.

Figure 5.3: Generation of a single coefficient using the direct filter approach. n̄(t) is the output noise of the filter.

Figure 5.4: Generation of a single coefficient using the time domain approach. n̄(t) is the filter noise, p̄(t) is the signal noise and m̄(t) is the multiplier noise.

5.3.1 Number of Filters and Coefficient Rate

In the direct filter approach two filters are required per analysis window, one for each of the real and imaginary parts. In the time domain approach, integration is carried out over the length of each analysis window, and the windows overlap in time. Figure 5.5 shows the overlap in the calculation of coefficients for a truncated window of length 2∆t. Here four filters are required per analysis window; two per complex coefficient. In the general case the length of the window is T_w = m∆t, where m describes the amount of overlap. Thus for the time domain approach there are 2mN channels with a coefficient rate of 1/(m∆t).
In contrast, the direct filter approach requires 2N channels with a rate of 1/∆t.

Figure 5.5: Coefficient time line showing the generation of the coefficients for T_w = 2∆t.

Each channel requires a bandpass filter, so the time domain approach requires m times as many filters, but the rate of the coefficients is m times lower. Therefore, in terms of hardware requirement, the direct filter approach is superior to the direct time domain approach.

5.3.2 Noise Analysis

Generation of a single coefficient using the time domain approach is shown in figure 5.4. n̄, p̄ and m̄ are independent noise sources with zero mean: n̄ is the noise added by the filter used to generate the impulse response, p̄ is the signal noise and m̄ is the multiplier noise. h(t) is the analysis window, s(t) is the input signal, T_w is the length of the analysis window and y[n] is the discrete time, continuous amplitude coefficient. The integrator is reset to zero every T_w. At the output there is a signal component,

    y[n] = \int_0^{T_w} s(t) h(t)\, dt,

and a noise component:

    \bar{q}[n] = \int_0^{T_w} \left[ \bar{p}(t) h(t) + \bar{n}(t) s(t) + \bar{p}(t)\bar{n}(t) + \bar{m}(t) \right] dt.    (5.7)

As the random variables are independent and have zero mean, the expectation of q̄[n] is zero. The variance of q̄[n] in terms of the autocorrelation functions of the noise sources, R(τ), is [99]:

    E[\bar{q}[n]^2] = \int_0^{T_w}\!\!\int_0^{T_w} \left[ R_p(t_1 - t_2) h(t_1) h(t_2) + R_n(t_1 - t_2) s(t_1) s(t_2) + R_p(t_1 - t_2) R_n(t_1 - t_2) + R_m(t_1 - t_2) \right] dt_1\, dt_2.    (5.8)

For wide sense stationary white noise the autocorrelation function is:

    R(\tau) = \sigma^2 \delta(\tau)    (5.9)

where σ² is the variance of the white noise source and δ is the Dirac delta function. Thus (5.8) simplifies to:

    E[\bar{q}[n]^2] = \sigma_p^2 \int_0^{T_w} h(t)^2\, dt + \sigma_n^2 \int_0^{T_w} s(t)^2\, dt + \sigma_p^2 \sigma_n^2 T_w + \sigma_m^2 T_w    (5.10)

where σ_i² is the variance of the ith white noise source.
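The first term of (5.10) can be checked numerically: replacing p(t) with discrete samples of variance σ_p²/t_s (a band-limited stand-in for white noise) and integrating against a window reproduces σ_p²∫h². All values in this sketch are illustrative:

```python
import numpy as np

# Monte Carlo check of the sigma_p^2 * int h^2 term of (5.10).
rng = np.random.default_rng(0)
fs = 100e3
dt = 1.0 / fs
Tw = 1e-3
t = np.arange(100) * dt                       # 100 samples over Tw
h = np.exp(-((t - Tw / 2) / (Tw / 8)) ** 2) * np.cos(2 * np.pi * 5e3 * t)
sig2_p = 1e-12                                # white noise PSD sigma_p^2 [V^2/Hz]

trials = 20000
# Discrete white noise: N(0, sigma^2/dt) samples, matching R(tau) = sigma^2 delta(tau)
p = rng.normal(0.0, np.sqrt(sig2_p / dt), size=(trials, len(t)))
q = (p * h).sum(axis=1) * dt                  # q[n] = int p(t) h(t) dt per trial
var_mc = q.var()
var_theory = sig2_p * (h ** 2).sum() * dt     # sigma_p^2 * int h^2 dt
```

With 20000 trials the Monte Carlo variance agrees with the analytic term to within a few percent.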
The peak coefficient occurs when s(t) = h(t), so the peak signal to noise ratio (SNR) is:

    \mathrm{SNR}_{TD} = \frac{ \int_0^{T_w} h\!\left( t - \frac{T_w}{2} \right)^2 dt }{ \sqrt{ E[\bar{q}[n]^2] } }.    (5.11)

For the Gaussian analysis window (5.4), the maximum value of the coefficient output is:

    \int_0^{T_w} h\!\left( t - \frac{T_w}{2} \right)^2 dt \approx \frac{A^2 T_w}{2m}    (5.12)

where A is the peak amplitude of the window function and m is the length of the window as defined in Section 5.3.1. The peak SNRs of the signal, filter output and multiplier output, computed over the signal bandwidth ∆f, are:

    \mathrm{SNR}_p = \frac{A}{\sigma_p \sqrt{\Delta f}}    (5.13)
    \mathrm{SNR}_n = \frac{A}{\sigma_n \sqrt{\Delta f}}    (5.14)
    \mathrm{SNR}_m = \frac{A^2}{\sigma_m \sqrt{2 \Delta f}}.    (5.15)

Eq. (5.11) can then be approximated by:

    \mathrm{SNR}_{TD} \approx \sqrt{ \frac{1}{ \frac{1}{4\,\mathrm{SNR}_p^2} + \frac{1}{4\,\mathrm{SNR}_n^2} + \frac{m}{\mathrm{SNR}_m^2} } }.    (5.16)

In the direct filter approach of figure 5.3 the peak SNR can be written as:

    \mathrm{SNR}_{DF} = \frac{ \max_t \int_0^{T_w} s(\tau) h(t - \tau)\, d\tau }{ \sqrt{ \int_{f_0 - \Delta f/2}^{f_0 + \Delta f/2} \left[ \sigma_n^2 + \sigma_p^2 |H(f)|^2 \right] df } }    (5.17)

where H(f) is the filter transfer function. When the input signal has a maximum amplitude of A and the filter has unity gain then (5.17) reduces to:

    \mathrm{SNR}_{DF} \approx \sqrt{ \frac{1}{ \frac{1}{4\,\mathrm{SNR}_n^2} + \frac{\pi}{4\,\mathrm{SNR}_p^2} } }.    (5.18)

In the direct filter approach the noise in the signal path is reduced by the action of the filter; this is not the case in the time domain approach. In the trivial case where the SNRs of the filter, multiplier and signal are equal:

    \mathrm{SNR}_{DF} \approx 1.5 \times \mathrm{SNR}_{TD}.    (5.19)

This result implies that the SNR of the coefficients computed using the direct filter method gives almost 2 dB improvement over the time domain case for windows of length 2∆t. From a noise point of view there is therefore no benefit in using the time domain approach to implement the Gabor transform.

5.4 Design of the State Space Filter from Impulse Response Specifications

From the time domain window function a Padé approximation, as shown by Haddad [85] and described in Chapter 3, has been used to create causal and rational 8th order bandpass transfer functions in the Laplace domain.
For the implementation of this chip the bandwidth has been chosen to be 4 kHz and f_0 = 500 Hz, so the spacing in the frequency domain for N = 4 is 1 kHz, (5.4). An example of a transfer function, H(s) = N(s)/D(s), for the imaginary (sin) and real (cos) windows with T_w = 2 s and a centre frequency of 0.75 Hz is:

    N(s)_{(sin)} = 56.14 s^5 + 284.6 s^4 + 5332 s^3 + 34.33\mathrm{e}3\, s^2 + 28.64\mathrm{e}3\, s - 20.16\mathrm{e}3
    N(s)_{(cos)} = 5.956 s^6 + 63.75 s^5 + 583.3 s^4 + 5376 s^3 - 48.73\mathrm{e}3\, s^2 - 48.73\mathrm{e}3\, s - 10.45\mathrm{e}3    (5.20)
    D(s) = s^8 + 21.97 s^7 + 313.5 s^6 + 2850 s^5 + 18.93\mathrm{e}3\, s^4 + 87.66\mathrm{e}3\, s^3 + 286.9\mathrm{e}3\, s^2 + 581.2\mathrm{e}3\, s + 626.0\mathrm{e}3.

To convert this Laplace transfer function into a state space realisation, a method documented in [100] is used. In this method an LC ladder network for the characteristic polynomial is created, from which the entries of the state matrix can be found. The entire transfer function is then realised by taking a weighted sum of the nodes of the LC filter to form the output matrix. For the filter in (5.20), scaled so that it is centred at f_0 = 1500 Hz, the state space model using C_N = 20 pF of capacitance at each node, ignoring any parasitics and using the notation in [101], is:

    G = [   0      112.1    0       0       0       0       0       0
         -112.1    0      127.6     0       0       0       0       0
            0    -127.6    0      157.0     0       0       0       0
            0       0    -157.0     0     177.5     0       0       0
            0       0       0    -177.5     0     212.3     0       0
            0       0       0       0    -212.3     0     278.4     0
            0       0       0       0       0    -278.4     0     542.2
            0       0       0       0       0       0    -542.2  -878.8 ]    (5.21)

    C_{(sin)} = [ 605.4  -275.1  -538.2  200.1  -107.5  112.5  0  0 ]    (5.22)
    C_{(cos)} = [ -152.3  -781.5  291.4  169.4  -37.5  127.8  -83.1  0 ]    (5.23)
    B^T = [ 0  0  0  0  0  0  0  200 ]    (5.24)
    T_C = \mathrm{diag}(C_N I_{(1 \times 8)})    (5.25)
    H^{(sin)}(s) = C_{(sin)} (s T_C - G)^{-1} B    (5.26)
    H^{(cos)}(s) = C_{(cos)} (s T_C - G)^{-1} B    (5.27)

where all the values in G, C and B above are in nanosiemens. The characteristic polynomial determines the centre frequency of the filter, and multiplication of the G matrix by a constant enables control over the centre frequency of the filter.
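As a numerical sanity check, the state space model of (5.21)-(5.27) can be evaluated directly; the sketch below computes H^{(sin)}(j2πf) and confirms that the response near 1500 Hz dominates the response well below and well above the passband, and that the realisation is stable:

```python
import numpy as np

# Frequency response H(s) = C (s*Tc - G)^{-1} B for the 1500 Hz filter.
# Conductances are in nanosiemens, with 20 pF per node as in the text;
# H is a transconductance (output current over input voltage).
nS, pF = 1e-9, 1e-12
g = np.array([112.1, 127.6, 157.0, 177.5, 212.3, 278.4, 542.2]) * nS
G = np.zeros((8, 8))
for i, gi in enumerate(g):
    G[i, i + 1], G[i + 1, i] = gi, -gi     # skew-symmetric ladder coupling
G[7, 7] = -878.8 * nS                      # terminating (damping) conductance
C_sin = np.array([605.4, -275.1, -538.2, 200.1, -107.5, 112.5, 0, 0]) * nS
B = np.zeros(8); B[7] = 200.0 * nS
Tc = 20.0 * pF * np.eye(8)

def H(f):
    """H^(sin)(j*2*pi*f) of (5.26)."""
    return C_sin @ np.linalg.solve(2j * np.pi * f * Tc - G, B)

g_pass = abs(H(1500.0))                    # in the passband
g_lo, g_hi = abs(H(100.0)), abs(H(10e3))   # well outside the passband
```

The eigenvalues of T_C^{-1}G all have negative real parts (the only damping enters through the ladder termination), so the realisation is stable.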
This feature is exploited when tuning the filter, as only a single bias current is required to set the centre frequency. Another feature of the state space implementation is that the same G matrix can be used together with two different C matrices to implement the sin and cos components of the analysis windows, figure 5.6. The circuit design of the filter is based on a gmC ladder approach, with a topology similar to that used in [85]. Figure 5.7 shows the implementation of the G matrix using identically sized transconductors. The bias current of each transconductor is set such that the value of transconductance defined by the G matrix is realised. The implementation of the sin and cos C matrices is shown in figure 5.8. The individual bias currents are generated on-chip using a simple current mirror, figure 5.9. In order to design the filter for low power there are several tradeoffs that need to be addressed; the following section presents a methodology for the design of a low power gmC filter.

Figure 5.6: Block diagram of the gm-C filter.

Figure 5.7: Circuit diagram of the G matrix implementation using identical transconductors and capacitors. The gm of each transconductor is set via the bias currents IB81 and IGij.

Figure 5.8: Circuit diagram of the C matrix implementation. A weighted sum of the state voltages shown in figure 5.7 is created using transconductors.
ICij are the bias currents of the transconductors.

Figure 5.9: Generation of the bias currents for a single complex filter using current mirrors. The (W/L) of each transistor is scaled to provide the required bias current; the reference mirror device is W = L = 20 µm.

Figure 5.10: Diagram of a simple transconductor. The differential input voltage is defined as v_in = v_+ − v_−.

5.5 Designing a Low Power gmC Filter

The widths and lengths of the transistors used in the transconductor need to be chosen to meet a given specification. The following analysis shows how to size the transistors for optimum performance. The design is based on identical simple differential pair transconductors with no linearisation, figure 5.10. The following sections present simplified analyses which enable the filter's performance to be established before using a full analogue simulator, leading to faster design times.

5.5.1 Noise and Distortion

For a given filter topology it is important to maximise the SNR and minimise the distortion. This can be achieved by maximising the SINAD (signal to noise+distortion ratio) of the filter, defined as:

    \mathrm{SINAD} = \frac{P_f}{P_n + P_d}    (5.28)

where P_f is the power in the fundamental at the output, P_n is the integrated output noise and P_d = P_f (\mathrm{THD})^2 is the distortion power. THD is the total harmonic distortion.

Estimating THD

The transfer function of the transconductor in figure 5.10 using the EKV model [102] is:

    i_o = I_0 \tanh\!\left( \frac{v_{in}}{2 n U_t} \right)    (5.29)

where n is the subthreshold slope factor and U_t = kT/q is the thermal voltage. The g_m can be found by taking the first term of the Taylor series expansion:

    g_m = \frac{I_0}{2 n U_t}.    (5.30)

To compute the THD for a single transconductor a series expansion of (5.29) can be used to find the first few terms of the Fourier series [103]. There are also several methods reported for estimating the distortion of state space systems [104]–[107]. Any of these methods could be used to estimate the distortion of the filter; however, they rely on extracting a weakly non-linear model of the transconductor. For the subthreshold case the V-I characteristic is known to be a tanh function (5.29). Therefore, the distortion estimate can be simplified by forming a non-linear state space model:

    \dot{v} = \frac{1}{C_N} \left[ I_G \tanh\!\left( \frac{v}{2 n U_t} \right) + I_B \tanh\!\left( \frac{V_{IN}}{2 n U_t} \right) \right]    (5.31)
    I_{OUT} = I_C \tanh\!\left( \frac{v}{2 n U_t} \right)    (5.32)

where I_G, I_B and I_C are the bias current matrices for the transconductors, for example:

    I_G = 2 n U_t |G|.    (5.33)

V_{IN} is the input voltage to the filter shown in figure 5.7 and the state variable vector is defined as:

    v = \left[ V_1 \; V_2 \; \cdots \; V_8 \right]^T.    (5.34)

The non-linear state space system can then be simulated using Euler numerical integration:

    v_{d+1} = v_d + \frac{t_s}{C_N} \left[ I_G \tanh\!\left( \frac{v_d}{2 n U_t} \right) + I_B \tanh\!\left( \frac{V_{IN}(t_d)}{2 n U_t} \right) \right]    (5.35)

where t_d is the time index and the time step is:

    t_s = t_{d+1} - t_d.    (5.36)

Higher order integration methods could be used for improved speed and accuracy [108]; however, the simplicity of the Euler method is adequate here. The cumulative error over a time period is proportional to t_s. To find the THD a simulation over time T is carried out, where T is large enough to reach the steady state response. The last period of the simulation is then used to calculate the distortion via a Fourier transform. This method of simulating the behaviour of the filter is much quicker than using the full model in an analogue simulator. For example, calculation of the THD for the 1500 Hz filter block took 0.5 s in MATLAB compared to 25 s using the Spectre simulator with the analogue models.
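The Euler simulation of (5.31)-(5.36) can be sketched as follows. The matrices are those of Section 5.4; the slope factor n and the 10 mV test amplitude are illustrative assumptions, and the signs of G are carried directly (so the update uses 2nU_t G, matching I_G = 2nU_t|G| with the connection polarities):

```python
import numpy as np

# Euler simulation of the non-linear tanh state-space model to
# estimate THD of the 1500 Hz filter block.
nS, pF = 1e-9, 1e-12
g = np.array([112.1, 127.6, 157.0, 177.5, 212.3, 278.4, 542.2]) * nS
G = np.zeros((8, 8))
for i, gi in enumerate(g):
    G[i, i + 1], G[i + 1, i] = gi, -gi
G[7, 7] = -878.8 * nS
B = np.zeros(8); B[7] = 200.0 * nS
C_sin = np.array([605.4, -275.1, -538.2, 200.1, -107.5, 112.5, 0, 0]) * nS

CN = 20.0 * pF
n_sub, Ut = 1.3, 0.0259            # assumed slope factor and thermal voltage
V2 = 2.0 * n_sub * Ut              # tanh scale 2*n*Ut
fin, amp = 1500.0, 0.010           # 10 mV tone in the passband

Nper, periods = 4000, 30
ts = 1.0 / (fin * Nper)            # Euler step (5.36)
v = np.zeros(8)
out = np.zeros(Nper)
for d in range(Nper * periods):
    vin = amp * np.sin(2.0 * np.pi * fin * d * ts)
    # (5.35): each transconductor input passes through tanh; small-signal
    # behaviour reduces to v' = (G v + B Vin)/CN
    v = v + (ts / CN) * V2 * (G @ np.tanh(v / V2) + B * np.tanh(vin / V2))
    if d >= Nper * (periods - 1):              # keep the final period only
        out[d - Nper * (periods - 1)] = C_sin @ np.tanh(v / V2)

spec = np.abs(np.fft.rfft(out))
thd = np.sqrt(np.sum(spec[2:10] ** 2)) / spec[1]   # harmonics 2..9 / fundamental
```

The last simulated period contains an integer number of input cycles, so the FFT bins land exactly on the fundamental and its harmonics.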
Noise

The input referred spectral noise densities for subthreshold MOS transistors defined in [109] are:

    S_f^{gate} = \frac{K_F}{W L C_{OX}} \frac{1}{f}    (5.37)
    S_t^{gate} = \frac{2 k T n}{g_m}    (5.38)

where C_{OX} is the gate oxide capacitance and K_F is the SPICE flicker noise coefficient. For a transconductor the input referred noise spectrum may be written as [101]:

    S_{in}(f) = \frac{S_t}{g_m} + \frac{S_f}{f}.    (5.39)

By referring the input noise voltage of each transistor to the input of the simple transconductor, the following expressions can be derived for S_t and S_f (see Chapter 5 of [51]):

    S_f = \frac{2 K_F}{C_{OX}} \left[ \frac{1}{(WL)_{(1,2)}} + \left( \frac{n}{n_p} \right)^2 \frac{1}{(WL)_{(3,4)}} \right]    (5.40)
    S_t = 8 k T n    (5.41)

where n_p is the slope factor for the pmos transistors. This shows that the thermal noise is dictated by the process and not by the sizes of the transistors. In contrast, the flicker noise decreases as the areas of the nmos and pmos transistors are increased. Using these equations together with Kozeil's method [101], the total output referred noise of the filter can be found and P_n in (5.28) calculated.

5.5.2 Mismatch

Mismatch in the subthreshold region predominantly causes variation in threshold voltages, which in turn causes variation in the offsets and the gm of the transconductors [103]. Offsets sum together to produce an offset at the output and do not affect the AC response. There will also be mismatch from the capacitors, but as long as the ratios of the capacitors are well matched this error will be small compared to that produced by the transconductors. Assuming that the areas of the transistors supplying the bias currents are large compared to the transconductor transistors, the main error is due to mismatch in the active load transistors. Making the bias transistors large does not affect the output resistance, bandwidth or power consumption.
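Returning to the noise expressions (5.39)-(5.41), the integrated input-referred noise over the signal band can be evaluated directly. In this sketch C_OX, K_F, the bias current and the slope factors are generic illustrative values, not the foundry data used in the thesis:

```python
import numpy as np

# Input-referred noise density Sin(f) = St/gm + Sf/f of (5.39),
# integrated over the 500 Hz - 4 kHz band.
k, T, q = 1.381e-23, 300.0, 1.602e-19
Ut = k * T / q
n, n_p = 1.3, 1.25                  # nmos / pmos subthreshold slope factors
I0 = 60e-9                          # bias current [A] (assumed)
gm = I0 / (2 * n * Ut)              # (5.30)
Cox = 8e-3                          # oxide capacitance [F/m^2] (assumed)
KF = 1e-25                          # SPICE flicker coefficient (assumed)
WL12 = 10e-6 * 10e-6                # input pair gate area
WL34 = 5e-6 * 5e-6                  # load pair gate area

St = 8 * k * T * n                                      # (5.41)
Sf = (2 * KF / Cox) * (1 / WL12 + (n / n_p) ** 2 / WL34)  # (5.40)

f = np.linspace(500.0, 4000.0, 20001)
Sin = St / gm + Sf / f                                  # (5.39), [V^2/Hz]
# trapezoidal integration of the density gives the input noise power
vn2 = np.sum(0.5 * (Sin[1:] + Sin[:-1]) * np.diff(f))
```

The thermal floor St/gm sets most of the integrated noise here, with the flicker term only lifting the low-frequency end, in line with the observation that thermal noise is fixed by the process while flicker noise shrinks with device area.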
The variation in gm due to M3 and M4 can be written as [103]:

    g_m' = g_{mo} \left( 1 + \frac{X}{\sqrt{2}\, n_p U_t} \right)    (5.42)

where X is a random variable with a normal distribution:

    X \sim N\!\left( 0, \left( \frac{\xi}{\sqrt{(WL)_{(3,4)}}} \right)^{\!2} \right)    (5.43)

where ξ is the threshold voltage mismatch constant, which is provided by the foundry. The minimum value of (WL)_{(3,4)} can be found independently of the bias currents and capacitance value by considering the relative change in the gm value:

    \frac{\Delta g_m}{g_{mo}} = \frac{X}{\sqrt{2}\, n_p U_t}.    (5.44)

A Monte Carlo analysis using the standard state space model with the normalised matrices G_n, B_n and C_n can be used to determine the maximum value of ∆g_m/g_{mo}. The normalised matrices are those where the node capacitance (C_N) is equal to unity. The state space model including mismatch is:

    \dot{v} = [G_n + \delta G_n] v + [B_n + \delta B_n] V_{in}    (5.45)
    I_{OUT} = [C_n + \delta C_n] v    (5.46)

    \delta G_n = \frac{X_G}{\sqrt{2}\, n U_t} \circ G_n    (5.47)
    \delta B_n = \frac{X_B}{\sqrt{2}\, n U_t} \circ B_n    (5.48)
    \delta C_n = \frac{X_C}{\sqrt{2}\, n U_t} \circ C_n    (5.49)

where ◦ is the Hadamard product. X_G, X_B and X_C are matrices containing random variables sampled from the normal distribution (5.43). The error due to mismatch can be calculated by computing the magnitude of the transfer function over the frequencies of interest and comparing this to the desired transfer function. By running a Monte Carlo simulation the standard deviation of this error can be computed, which is then the mismatch error, e_{mm}:

    e_{mm} = \sqrt{ E\!\left[ \frac{ \sum_\omega \left( |H(j\omega)| - |H'(j\omega)| \right)^2 }{ \sum_\omega |H(j\omega)|^2 } \right] }    (5.50)

where H(jω) is the desired frequency response and H'(jω) is the simulated response. Summations are used here as ω is considered a discrete variable ranging over the bandwidth of the filter passband.

5.5.3 Bandwidth and Output Resistance

The bandwidth and output resistance are strongly dependent on the transfer function being implemented. For a simple transconductor the bandwidth can be approximated by a single pole:

    \omega_p \approx \frac{g_{m3}}{\sqrt{2}\, C_x}    (5.51)

where g_{m3} = I_0/(2 n U_t) and C_x = C_{gs3} + C_{gb3} + C_{gs4} + C_{gb4}. M3 and M4 are in the weak inversion saturated region, so C_x can be approximated by [102]:

    C_x = 2 (WL)_{(3,4)} C_{OX} \left[ \exp\!\left( \frac{v_{gs} - V_{th}}{n_p U_t} \right) + \frac{n_p - 1}{n_p} \right]    (5.52)

where V_{th} is the device threshold voltage. The gate source voltage v_{gs} is given by:

    v_{gs} = V_{th} + n_p U_t \ln\!\left( \frac{I_0}{4 n_p \mu C_{OX} (W/L)_{(3,4)} U_t^2} \right)    (5.53)

where µ is the channel mobility. To take account of the bandwidth a function F is defined, which is used to modify the gm of each transconductor:

    F_{ij}(s) = \frac{1}{\frac{s}{\omega_p} + 1}    (5.54)

where ij is the index of the matrix entry affected. The output resistance of a transconductor can be modelled as:

    R_{out} \approx \frac{1}{\lambda I_0}    (5.55)

where λ is the channel length modulation. The effect of finite output resistance is modelled by subtracting the total node conductance from the diagonal of G. A diagonal conductance matrix is defined as:

    D_g = 2 \lambda n U_t \, \mathrm{diag}\!\left( G\, 1_{(m \times 1)} \right)    (5.56)

where 1_{(m×1)} is a ones vector equal in length to the order of the transfer function, m. The total effect of bandwidth and output resistance can then be modelled as:

    H'(s) = (F_C \circ \beta_C C) \left( s T_C - (F_G \circ (G - D_g)) \right)^{-1} (F_B \circ B)    (5.57)

where β_C is a constant that scales the bias currents required for the C matrix. F_G, F_B and F_C are matrices whose elements are formed using (5.54). The error between the desired and modelled H(s) can then be found over the bandwidth of the filter. The error is defined in a similar way to the mismatch error (5.50):

    e_{bwro} = \sqrt{ \frac{ \sum_\omega \left( |H(j\omega)| - |H'(j\omega)| \right)^2 }{ \sum_\omega |H(j\omega)|^2 } }.    (5.58)

Table 5.1: Tradeoffs in filter design.

Analysis            Outputs    Inputs
Distortion          THD        C_N, V_in
Noise               P_n        (WL)_{(1,2)}, (WL)_{(3,4)}
Mismatch            e_mm       (WL)_{(3,4)}
Bandwidth & R_out   e_bwro     C_N, W_{(3,4)}, L_{(3,4)}, β_C

5.5.4 A Low Power gmC Design Method

The overall DC current consumption of the complex gmC filter is:

    I_{DC} = \sum I_G + \sum I_B + \sum I_C^{(sin)} + \sum I_C^{(cos)}.    (5.59)

The method presented here is one way to methodically obtain a low power realisation given the set of tradeoffs. The analyses used to investigate the tradeoffs are shown in table 5.1.
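The mismatch Monte Carlo of (5.45)-(5.50) can be sketched directly on the 1500 Hz filter. The threshold mismatch constant ξ below is a generic illustrative value for a 180 nm process, not the foundry figure used in the thesis:

```python
import numpy as np

# Monte Carlo estimate of the mismatch error e_mm of (5.50).
rng = np.random.default_rng(1)
nS, pF = 1e-9, 1e-12
g = np.array([112.1, 127.6, 157.0, 177.5, 212.3, 278.4, 542.2]) * nS
G = np.zeros((8, 8))
for i, gi in enumerate(g):
    G[i, i + 1], G[i + 1, i] = gi, -gi
G[7, 7] = -878.8 * nS
C_sin = np.array([605.4, -275.1, -538.2, 200.1, -107.5, 112.5, 0, 0]) * nS
B = np.zeros(8); B[7] = 200.0 * nS
Tc = 20.0 * pF * np.eye(8)

n_p, Ut = 1.25, 0.0259
xi = 5e-3 * 1e-6                    # ~5 mV*um threshold mismatch (assumed)
WL34 = 5e-6 * 5e-6
sigX = xi / np.sqrt(WL34)           # sigma of X, (5.43)

f = np.linspace(1000.0, 2000.0, 101)   # filter passband
def resp(Gm, Cm, Bm):
    return np.array([abs(Cm @ np.linalg.solve(2j * np.pi * fi * Tc - Gm, Bm))
                     for fi in f])

H0 = resp(G, C_sin, B)
errs = []
for _ in range(200):
    # Hadamard-scaled relative perturbation of each entry, (5.47)-(5.49)
    pert = lambda M: M * (1 + rng.normal(0, sigX, M.shape) / (np.sqrt(2) * n_p * Ut))
    Hm = resp(pert(G), pert(C_sin), pert(B))
    errs.append(np.sum((H0 - Hm) ** 2) / np.sum(H0 ** 2))
e_mm = float(np.sqrt(np.mean(errs)))    # (5.50)
```

With this ξ the per-transconductor gm spread is roughly 2%, and the resulting e_mm lands in the same few-percent range as the modelled values reported in table 5.2.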
All fixed process parameters and the normalised state space matrices are also passed to this analysis. To obtain a low power implementation, first find the SINAD for differing values of capacitance, then select a value of capacitance such that the required SINAD is achieved. Increase the size of (WL)_{(1,2)} to reduce flicker noise if greater SINAD is required. A suitable value for (WL)_{(3,4)} can be found from the mismatch analysis, and then a value for W_{(3,4)} can be found by carrying out the R_out and bandwidth analysis with β_C set to a large value. When W_{(3,4)} has been found, β_C should be reduced until the maximum error is reached. As there are four G matrices and eight C matrices, and the design uses identical transconductors for each filter, each stage of the analysis needs to be carried out on each filter to determine the worst case. For a SINAD of greater than 35 dB with an input voltage of 10 mV, a capacitance of 20 pF is suitable. For e_mm < 5% and e_bwro < 5%, W_{(1,2)} = L_{(1,2)} = 10 µm and W_{(3,4)} = L_{(3,4)} = 5 µm were calculated. The bias transistors were chosen to have an area at least ten times larger than (WL)_{(3,4)} so that the mismatch due to these could be ignored.

Table 5.2: Comparison of model and BSIM3 simulation. I_DC is given per complex (sin/cos) filter pair.

Filter        SINAD [dB]      e_mm [%]        e_bwro [%]      I_DC [µA]
              model  sim      model  sim      model  sim      model  sim
cos 500 Hz    41     40       2.1    4.5      1.3    8.6      0.9    1.1
sin 500 Hz    36     38       2.0    2.2      0.5    3.2
cos 1500 Hz   40     39       2.3    3.2      1.9    1.1      1.2    1.2
sin 1500 Hz   40     39       2.2    2.9      1.6    1.0
cos 2500 Hz   39     38       2.8    2.9      2.4    3.5      1.2    1.3
sin 2500 Hz   39     38       2.7    2.8      2.6    2.7
cos 3500 Hz   38     37       3.4    3.8      3.2    3.9      1.3    1.4
sin 3500 Hz   38     37       3.5    3.7      3.5    3.7

Table 5.2 lists the analysis results and the corresponding simulated results for each filter. The prediction of SINAD using the model shows a good match with the simulated results. In terms of estimating the bandwidth and mismatch effects, the models generally underestimate the error.
However, using the model has enabled a specification for the device sizes to be found before resorting to the simulator. Balanced transconductors were used for the implementation. These have a similar model to the simple transconductor, except that the bandwidth is slightly lower due to extra internal nodes and the power supply is twice that of the simple transconductor.

Figure 5.11: Measured impulse response of each filter (sin and cos windows at 500 Hz, 1500 Hz, 2500 Hz and 3500 Hz). The solid line shows the ideal state space impulse response as found in Section 5.4.

Figure 5.12: Die photograph showing a single complex filter. The entire chip contains four of these complex filters.

5.6 Measured Results

A chip was fabricated using UMC 180 nm technology. Each fabricated chip contains four complex filters centred at 500 Hz, 1500 Hz, 2500 Hz and 3500 Hz. The chip area for each complex filter is 0.11 mm². Figure 5.12 shows the die photograph for a single complex filter. Nine dies were packaged for testing to obtain an idea of the statistical spread in the response of the filters. The measurement set-up is shown in figure 5.13. The impulse function and the A/D trigger were generated using an FPGA running at 50 MHz, so the jitter on the impulse response is less than 10 ns. The length of the impulse signal was chosen to be 10 µs in order to give a bandwidth of the impulse response up to 33 kHz (Appendix F).
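The choice of a 10 µs impulse can be checked from the spectrum of a rectangular pulse: a pulse of width τ has a sinc-shaped spectrum, and its droop at 33 kHz is small, so the stimulus excites the whole filter band almost uniformly. A quick check:

```python
import numpy as np

# Spectral droop of a 10 us rectangular impulse at the 33 kHz band edge.
tau = 10e-6
f_edge = 33e3
droop = abs(np.sinc(f_edge * tau))   # np.sinc(x) = sin(pi*x)/(pi*x)
droop_db = 20 * np.log10(droop)
```

At 33 kHz the stimulus is still within about 1.6 dB of its low-frequency value, comfortably flat across the 4 kHz filter bank.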
The analogue output of the NI DAQ was used to provide sinusoidal stimulus in order to measure the Bode response of each filter.

Figure 5.13: Measurement set-up. A National Instruments acquisition board was used to capture the output from the four complex filters (8 channels).

5.6.1 Impulse and Bode Response

Figure 5.11 shows the measured impulse responses, averaged over 200 frames, for the fabricated ICs. The filters were tuned by adjusting the bias currents to alter the centre frequency of the filters. The output amplitude of the filters was also adjusted by tuning the bias current of the output I-V converter. All offsets have been removed from the measured results. The output offsets of the chips have a mean of 25.4 mV and a standard deviation of 6.61 mV. These offsets are large (due to subthreshold CMOS mismatch) but can be removed by AC coupling the output of the filter. Figure 5.14 shows the overlaid plot of 200 measured frames for the cos 2500 Hz window. The rms jitter around the zero crossing point (0.5 ms) for all chips is less than 21 µs; the simulated result was 33 µs. The Bode plots of the measured results for the tuned cos filters are shown in figure 5.15 and for the tuned sin filters in figure 5.16. These figures clearly show the effect of mismatch on the frequency response. It is also clear that the low frequency response of the filter deviates significantly from the ideal state space response. The reason for this is the variation in the finite output resistance of the transconductors.

Figure 5.14: Plot showing 200 overlaid cos 2500 Hz analysis windows for a single chip.

The bias currents required to tune the centre frequencies differ from chip to chip due to mismatch. This directly translates into different conductance matrices (5.56) for each chip, resulting in a different low frequency gain for each chip (5.57).
Figure 5.15: Measured Bode plot of the cos filters for each chip. The thick line shows the ideal state space frequency response.

Figure 5.16: Measured Bode plot of the sin filters for each chip. The thick line shows the ideal state space frequency response.

Table 5.3: Variation in centre frequency for fixed bias currents.

Filter    I_bias [nA]   Average f0 [Hz]   Standard Deviation [Hz]   Quality Factor
500 Hz    61.0          554               49.8                      0.5
1500 Hz   59.1          1559              50.8                      1.5
2500 Hz   58.5          2542              68.5                      2.5
3500 Hz   57.9          3547              76.12                     3.5

5.6.2 Centre Frequency Variation

Table 5.3 shows the variation in the centre frequency between the chips for fixed bias currents. It is evident from these results that the variation in the centre frequency is larger for the higher centre frequency filters. The reason is that the quality factor of the filters increases with increasing centre frequency. The quality factor is defined as Q = f0/B, where f0 is the centre frequency of the filter and B is the bandwidth. For these filters the bandwidth is fixed at 1000 Hz. As Q increases, the positions of the poles become more sensitive to errors in the G matrix. This effect can also be seen in the modelled mismatch results in table 5.2. For the cos 2500 Hz window, figure 5.17 shows the impulse response for each chip at a fixed bias current of 58.5 nA and figure 5.18 shows the frequency variation.

5.6.3 Power Consumption and SINAD

Table 5.4 shows the measured power consumption, the range of bias currents required and the range of measured SINAD. The average power consumption of all four complex filters is 7.06 µW when operating from a 1.2 V supply. The power shown for each complex filter includes the power consumed by all of the transconductors that make up the filter, figure 5.7 and figure 5.8.
It also includes the generation of all the bias currents, figure 5.9, but excludes the output buffers that are used to take the signal off-chip for measurement. A single bias current per complex filter is generated off-chip using a variable resistor so that the centre frequencies can be tuned.

Figure 5.17: Variation in the impulse response of the cos 2500 Hz filter for each chip with a fixed bias current of 58.5 nA.

Figure 5.18: Variation in the frequency response of the cos 2500 Hz filter for each chip with a fixed bias current of 58.5 nA.

Table 5.4: Summary of measured results. I_bias and power are given per complex (sin/cos) filter pair.

                    500 Hz        1500 Hz       2500 Hz       3500 Hz
                    cos    sin    cos    sin    cos    sin    cos    sin
SINAD (avg) [dB]    37     41     34     35     39     38     37     35
SINAD (min) [dB]    12     29     30     26     38     36     36     31
SINAD (max) [dB]    41     47     42     37     40     39     40     38
I_bias (avg) [nA]   58.3          56.9          57.7          57.3
I_bias (min) [nA]   56.2          54.1          55.2          55.6
I_bias (max) [nA]   61.4          61.0          60.8          58.6
Power (avg) [µW]    1.42          1.62          1.95          2.07
Power (min) [µW]    1.35          1.58          1.87          2.01
Power (max) [µW]    1.48          1.73          2.00          2.15

As an alternative to using a variable resistor, the voltage supply to a fixed resistor could be varied. To produce a filter where each value of the G and C matrices can be individually controlled, a scheme such as the "Stochastic I-Pot" presented in [110] could be used. The measured average SINAD, table 5.4, corresponds well with the modelled and simulated results shown in table 5.2.

5.6.4 Bit Error Test Performance

The direct filter technique of figure 5.3 was used to test the performance of the transform. A pseudo random binary signal was applied to the input of the filters and the outputs of the four complex filters were sampled to obtain the complex coefficients. The rate at which the coefficients were obtained is equal to 1/∆t, which for this implementation is 2 kHz. The effective resolution of the A/D converter is 10 bits.
Therefore, the noise floor of the measuring instrument is much lower than that of the test circuit. The measured coefficients were then used to reconstruct the binary input using the ideal windows, (5.6). The results of a bit error test of 10000 bits for the ideal and measured transform are shown in figure 5.19. This figure shows that with the measured windows, bit rates up to 7 kbit/s can pass through the transform at error rates of about 10⁻². The worst and best performing chips are also shown to provide an idea of the spread in the bit error rate. The departure from the ideal curve is due to the error in the cross correlation of the analysis windows, which is a result of mismatch error.

Figure 5.19: Bit error comparison. The chip average bit error is the average measured result from 9 chips. Chip 1 showed the best bit error performance and chip 2 the worst.

5.6.5 Analysis Window Cross Correlation

The cross correlation is computed using the average of 200 measured windows. The set of basis functions is not orthogonal, due to the truncation of the Gabor pulses and also due to errors in the implemented response caused by mismatch. The cross correlation of the real (cos) windows for the ideal (C_ii) and measured (C_mm) windows is shown in table 5.5. The cross correlation has been computed over 5 ms to allow the measured impulse response to decay to zero.

Table 5.5: Cross correlation of cos analysis windows.

C_ii = [  0.36  -0.12   0.01  -0.00
         -0.12   0.25  -0.11   0.01
          0.01  -0.11   0.25  -0.12
         -0.00   0.01  -0.12   0.24 ]

C_aa = [  0.37  -0.14   0.02  -0.00
         -0.14   0.25  -0.12   0.02
          0.02  -0.12   0.25  -0.12
         -0.00   0.02  -0.12   0.25 ]

C_mm = [  0.36  -0.15   0.03   0.01
         -0.15   0.25  -0.12   0.02
          0.03  -0.12   0.25  -0.12
          0.01   0.02  -0.12   0.24 ]

The measured cross correlation of the analysis windows is close to that of the ideal windows of (5.4).
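Using the C_ii and C_mm matrices of table 5.5, a cross correlation error can be computed directly; here it is taken as the sum of squared differences between corresponding coefficients (the exact normalisation used for figure 5.20 may differ):

```python
import numpy as np

# Cross-correlation error between the ideal (Cii) and measured (Cmm)
# cos-window Gram matrices of table 5.5.
Cii = np.array([[ 0.36, -0.12,  0.01, -0.00],
                [-0.12,  0.25, -0.11,  0.01],
                [ 0.01, -0.11,  0.25, -0.12],
                [-0.00,  0.01, -0.12,  0.24]])
Cmm = np.array([[ 0.36, -0.15,  0.03,  0.01],
                [-0.15,  0.25, -0.12,  0.02],
                [ 0.03, -0.12,  0.25, -0.12],
                [ 0.01,  0.02, -0.12,  0.24]])
err = float(((Cii - Cmm) ** 2).sum())    # sum of squared coefficient errors
```

The error is small (0.0032 for the averaged chip data above), but as figure 5.20 shows, even deviations of this size in the Gram matrix degrade the bit error performance at high bit rates.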
However, a consequence of the error in the cross correlation is that the transform will not perform as expected at high bit rates. This is clearly shown in figure 5.20, where the cross correlation error for each chip is plotted together with the bit error at 8 kbit/s. The cross correlation error is defined as the sum of the mean squared errors between the ideal and the measured cross correlation coefficients.

Figure 5.20: Cross correlation error and bit error performance comparison between each chip. There is a strong link between the amount of cross correlation error and the bit error performance of the chip.

5.6.6 Transform Comparison

A comparison between this work and other attempts at implementing similar transforms is shown in table 5.6. The work in this chapter presents a complete working transform; many of the other designs in the literature are for the individual filters only. In [111] sixteen 10th order filters were fabricated; these would be suitable for implementing eight complex analysis windows. The implementation shown in this chapter requires 0.44 mm² for four complex analysis windows.

Table 5.6: Transform comparison. Columns: This Work; Graham [111]; Haddad [85]; Haddad [112]; Moreira-Tamayo [98]; Edwards [96]; Justh [97]. Rows with fewer entries were not reported for every design.

Year: 2009; 2007; 2005; 2005; 1995; 1993; 1999
No. of complex windows: 4; 8; 1/2; 1/2; 1; 6; 16
Overall transform bandwidth: 4 kHz; 10 kHz; 10 kHz; 45 MHz
Maximum filter centre frequency: 3.5 kHz; 1 MHz; 5.8 kHz; 58 MHz; 25 kHz; 2 kHz; 9 kHz; 50 MHz
Process: 180 nm CMOS; 0.5 µm; 180 nm BiCMOS; BiCMOS; Discrete; 2 µm CMOS; 2 µm CMOS
Area: 0.44 mm²; 2.25 mm²; 0.28 mm²; 5.43 mm²
Transform power consumption: 7 µW; 320 µW; 1.5 µW; 24.3 mW; 6.75 µW; 1.6 W
Supply voltage: 1.2 V; 3.3 V; 1.2 V; 1.5 V
Transform methodology: Direct filter; Time domain; Complex demodulation
Bit rate [10⁻² error]: 7 kbit/s
Measured: Yes; Yes; No; Yes; Yes; Yes
This is a much smaller area per complex window than required by the most recent work in [112], which requires 0.28 mm2 to implement one half of a complex window, and in [111], which requires 2.25 mm2 to implement eight complex analysis windows. The most likely reason for the reduced area requirement is that in this work the size of the transconductors has been optimised using a given set of performance criteria (see Section 5.5). The power consumption per filter is similar to the simulated work shown in [85].

5.7 Conclusion

In this Chapter the design and implementation of an analogue Gabor transform have been discussed. An approach using bandpass filters has been compared to a time domain filtering approach. The results of the comparison show that the bandpass filter approach requires significantly less hardware and has slightly better noise performance than the time domain approach. The use of a state space filter allowed a complex filter to be implemented using the same characteristic equation but with different output summing networks for the sin and cos windows. A technique for optimising the transistor sizes to produce a low power implementation has been presented. Simplified models to analyse the effect of distortion, noise, mismatch, output resistance and bandwidth on the filter transfer function are derived. The simplified models allow rapid evaluation of the tradeoffs in the filter design. An approximation to the Gabor transform has been successfully implemented in 180 nm CMOS technology using devices operating in the sub-threshold region. Measured bit error rate tests confirm that the transform is operating as expected up to 7 kbit/s. Performance is limited by the approximation of the filter and the variation in the gm of the transconductors. A set of tradeoffs has been described together with mathematical descriptions so that device sizes can be found from a set of given constraints.
This has led to the fabrication of a low power, low area circuit that carries out the analogue Gabor transform. This circuit can be used to split a 4 kHz signal into eight separate 2 kHz bandwidth channels for further processing.

6 Conclusion

In today's society the reduction of energy consumption in electronic circuits is important for increasing the longevity of battery operated equipment, reducing heat dissipation to enable tighter integration of circuits and reducing our burden on the planet's energy resources. In Chapter 2 a survey of energy consumption limits has been presented which shows fundamentally how and why energy is dissipated by electronic circuits. The survey looks at fundamental limits as well as implementation specific limitations. It shows that modern circuits typically require many orders of magnitude more energy than the fundamental limits. For the case of a transistor amplifier, a possible reason for the large energy consumption is that significant energy is required to reduce the fluctuations in the transistor semiconductor. This is akin to cooling (refrigerating) the transistor, which requires energy. This observation implies that circuits that bias a transistor using a low duty cycle will benefit from reduced power consumption. The survey also highlights the fact that the rate of information transfer depends on the amount of energy available. This is explicitly shown in the quantum limits and is hinted at by the classical limits, as the minimum energy relies on the SNR tending towards zero, i.e. a zero rate of transmission. As part of the survey a lower bound for the energy per bit for a free space point to point link has been proposed. This is based on the fact that the transmitter antenna is a blackbody radiator that contributes to receiver noise. This bound is interesting because it shows that there is an operating frequency at which a minimum energy per bit occurs.
This frequency and the minimum energy are dependent on the antenna geometry and the distance between the transmitter and receiver. A possible application of this lower bound would be to create a transmitter whose centre frequency is adjusted depending on transmission distance in order to minimise the energy required. Inspired by the survey, two low power circuits designed to reduce energy consumption have been described and measured results have been presented. The PPM circuit in Chapter 4 shows a low complexity circuit which uses a transistor with a very low duty cycle. The analogue Gabor transform described in Chapter 5 decomposes a signal into smaller bandwidths to allow post processing to be carried out at a reduced rate, thus enabling the possibility of reducing energy consumption in the post processing circuits. Chapter 3 provides an overview of PPM and presents the Gaussian pulse as a suitable pulse prototype. It is suggested that low complexity PPM schemes are well suited to low power transmitter circuits. In particular, analogue pulse based systems are advantageous as they eliminate clocks at the carrier frequency and typically contain fewer interconnects than their digital counterparts. For PPM, theory shows that by increasing the number of orthogonal symbols, time slots in this case, the transmission energy can be reduced towards the fundamental limit. However, in practice the quiescent energy consumption of the transmitter circuit means that there is an optimum number of time slots which provides minimum energy. The Gaussian pulse is superior to a rectangular or sinusoidal pulse because the bandwidth-time product of the Gaussian pulse can be chosen arbitrarily to provide a desired out of band attenuation. This is particularly important as improving the out of band attenuation gives better spectral efficiency and reduces electromagnetic interference.
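The spectral advantage of the Gaussian pulse can be illustrated with a short numerical sketch. The pulse widths and the 200 kHz measurement offset below are arbitrary illustrative choices, not the thesis design values:

```python
import numpy as np

fs = 10e6                               # sample rate [Hz]
t = np.arange(-0.5e-3, 0.5e-3, 1 / fs)  # 1 ms observation window
T = 10e-6                               # rectangular pulse width [s]
sigma = T / 4                           # Gaussian spread, chosen for a
                                        # similar effective duration

rect = (np.abs(t) < T / 2).astype(float)
gauss = np.exp(-0.5 * (t / sigma) ** 2)

f = np.fft.rfftfreq(t.size, 1 / fs)

def spectrum_db(x):
    """Magnitude spectrum normalised to its own peak, in dB."""
    X = np.abs(np.fft.rfft(x))
    return 20 * np.log10(X / X.max() + 1e-300)

# Worst-case spectral level beyond a 200 kHz offset: the rectangular
# pulse's sinc sidelobes decay slowly, while the Gaussian rolls off fast.
oob = f > 200e3
rect_oob = spectrum_db(rect)[oob].max()    # roughly -18 dB
gauss_oob = spectrum_db(gauss)[oob].max()  # below -40 dB
```

Tightening the Gaussian's bandwidth-time product (larger sigma for a fixed bandwidth) pushes the out of band level down further, which is the degree of freedom a rectangular pulse lacks.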
A performance comparison between three continuous time approximation methods shows that the All Pole approximation has better out of band attenuation for a given filter order than the Cascade of Poles or Padé approximations. This is important as the number of poles is directly related to analogue circuit complexity and thus should be minimised in order to reduce power consumption. As the approximate pulses are not completely orthogonal, extra transmission energy is required to achieve the same BER as the ideal Gaussian pulse. However, results show that increasing the number of time slots reduces this extra energy requirement. This means that the performance of the approximate pulses is very close to the ideal Gaussian pulse when the number of time slots is large. In the case of using a correlation detector with 16 time slots, the extra transmitter energy due to the approximation of the pulse is less than 1 %. Chapter 4 presents the analysis and implementation of a PPM architecture suitable for use within the UK 30-37.5 MHz medical implant band. The transmitter makes use of two 2nd order RLC resonant tanks whose impulse responses sum in free space to approximate a Gaussian pulse. This significantly reduces circuit complexity as no clocks are required at the carrier frequency. Unlike simple OOK transmitters, the PPM transmitter shown in this thesis is able to use coherent transmission by utilising two orthogonal pulses. These pulses are produced by delaying the impulse response of the resonant circuits to approximate orthogonality. The advantage of this is that the rate of transmission can be increased for a given bandwidth of operation. The implemented transmitter provides 30 dB of out of band attenuation and has a maximum rate of 1 Mbps. The energy consumption of the discrete circuit is 3.2 nJ/bit at 1 Mbps and 1.3 nJ/bit at 312.5 kbps. The high power consumption of the discrete implementation is due to the large node capacitance of discrete logic gates.
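The free space summation of two tank responses can be caricatured numerically. The quality factor and delay below are illustrative assumptions, not the fabricated component values, and no attempt is made to reproduce the thesis's Gaussian fit; the sketch only shows that the summed pulse stays concentrated around the band centre:

```python
import numpy as np

fs = 1e9                      # simulation sample rate [Hz]
t = np.arange(0.0, 2e-6, 1 / fs)
f0 = 33.75e6                  # centre of the 30-37.5 MHz band
Q = 10.0                      # illustrative tank quality factor
alpha = np.pi * f0 / Q        # envelope decay rate of each tank

def tank_impulse(t, delay):
    """Impulse response of a 2nd order RLC tank excited at time `delay`:
    an exponentially decaying sinusoid at the resonant frequency."""
    ts = t - delay
    return np.where(ts >= 0.0,
                    np.exp(-alpha * ts) * np.sin(2 * np.pi * f0 * ts),
                    0.0)

# Two tank responses, the second delayed by a whole number of carrier
# cycles so they add in phase, summing as they would in free space.
delay = 3 / f0
pulse = tank_impulse(t, 0.0) + tank_impulse(t, delay)

# The radiated energy remains concentrated near f0.
f = np.fft.rfftfreq(t.size, 1 / fs)
S = np.abs(np.fft.rfft(pulse))
f_peak = f[np.argmax(S)]
```

Delaying the second excitation by a non-integer number of cycles instead yields a second pulse that is approximately orthogonal to the first, which is the mechanism behind the coherent two-pulse signalling described above.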
The estimated energy consumption of a CMOS implementation is 15 pJ/bit at a rate of 500 kbps. A major advantage of the transmitter circuit presented in Chapter 4 is that no dynamic tuning circuitry is required by the transmitter. This reduces the complexity and size of the transmitter. To overcome the drift in frequency of the transmitting and receiving elements, a preamble sequence sent by the transmitter is used by the receiver to estimate the centre frequency and provide symbol synchronisation. Measurements using the proposed estimation scheme indicate that a bit error rate of 10−3 can be achieved using coherent transmission at a rate of 1 Mbps and a distance of 50 mm. In Chapter 5 the design and implementation of an analogue Gabor transform has been presented. The implementation of the Gabor transform in the analogue domain provides a low power circuit which can separate an input signal into several parallel paths, each of lower bandwidth than the original signal. As pointed out in Chapter 2, processing signals at a lower rate may potentially allow circuits with lower energy dissipation to be designed. The transform topology in this chapter employs a bandpass filter approach as this requires significantly less hardware and has slightly better noise performance than a time domain approach. Using a state space filter allows a complex filter to be implemented using the same characteristic equation but with different output summing networks for the sin and cos windows, hence reducing the amount of hardware required. A technique for optimising the transistor sizes to produce a low power implementation has been presented. This technique involves using simplified models to analyse the effect of distortion, noise, mismatch, output resistance and bandwidth on the filter transfer function. The simplified models allow rapid evaluation of the tradeoffs in the filter design, which leads to a low power, low area circuit.
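The bandpass analysis idea can be sketched as follows; the centre frequencies and window spread are hypothetical, and an FIR correlation stands in for the thesis's gmC state space filters:

```python
import numpy as np

fs = 16_000.0                          # sample rate [Hz]
t = np.arange(0.0, 0.05, 1 / fs)
x = np.sin(2 * np.pi * 1500.0 * t)     # test tone inside channel 1

def gabor_analysis(x, fs, centres, sigma):
    """Correlate the input with complex Gabor windows (Gaussian envelope
    times a complex exponential). Each output is a narrowband coefficient
    stream that can be post-processed at a reduced rate."""
    n = np.arange(-int(3 * sigma * fs), int(3 * sigma * fs) + 1)
    env = np.exp(-0.5 * (n / (sigma * fs)) ** 2)
    return np.array([
        np.convolve(x, env * np.exp(1j * 2 * np.pi * fc * n / fs), mode="same")
        for fc in centres
    ])

# Four complex windows spanning a 4 kHz input bandwidth, as in the thesis.
coeffs = gabor_analysis(x, fs, [500.0, 1500.0, 2500.0, 3500.0], sigma=1e-3)
channel_energy = np.sum(np.abs(coeffs) ** 2, axis=1)
```

A tone falling inside one channel concentrates its energy in that channel's coefficient stream, which is what allows each path to be decimated and processed at a rate well below the input sample rate.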
The Gabor transform was successfully implemented in 180 nm CMOS technology using devices operating in the sub-threshold region. Measured bit error rate tests confirm that the transform operates as expected up to 7 kbit/s. Performance is limited by the approximation of the filter and the variation in the gm of the transconductors. This particular implementation can be used to split a 4 kHz signal into eight separate 2 kHz bandwidth channels for further processing. Using a modified optimisation technique with bipolar technology would allow the bandwidth of the transform to be increased, potentially making the circuit suitable as the front end of a receiver circuit, such as the one presented in Chapter 4. From the work carried out in this thesis it is surmised that in order to achieve lower energy dissipation in the future, the complexity of circuits needs to be kept to a minimum, the duty cycle of transistor operation needs to be reduced and the rate of information transfer within a circuit needs to be lowered.

6.1 Future Work

The fundamental limitations described in Chapter 2 assume that the system is operating under thermal equilibrium. Generally thermal equilibrium is not met in practical circuits, and so the fundamental energy is many orders of magnitude lower than that seen in practical circuits. Analysing a system that is not in thermal equilibrium is difficult; however, future research from the physics community into non-equilibrium systems may provide further insight into the minimum energy requirements of electronic circuits. Reference [113] provides some details of analysing systems that are not in thermal equilibrium. Chapter 4 shows the feasibility of a low power, low complexity coherent transmitter suitable for the 30-37.5 MHz band. To reduce power further an integrated circuit implementation of the transmitter electronics is required. To improve the BER, a convolutional block encoder should also be added.
The use of an integrated circuit will significantly reduce the node capacitance of the pulse generation gates, thus reducing power consumption. The convolutional encoder is a digital circuit which would benefit, in terms of lower power consumption, from using adiabatic logic, see Chapter 2. The algorithm used to estimate the carrier frequency at the receiver is not optimal and could be improved, thus reducing the amount of overhead, currently 7 %, when using non-synchronised transmit and receive clocks. Using the same techniques as shown in Chapter 4, the transmitter analysis could be extended to use 2nd order transmitting elements at higher frequencies. For example, the capacitive property of a short electrical dipole could be used. Using an electric field instead of a magnetic field would enable longer distance links to be achieved. The Gabor transform in Chapter 5 shows a general circuit that is suitable for splitting an analogue signal into several parallel paths, each of which has a lower bandwidth than the original signal. This would be particularly beneficial if the output of the transform is processed using adiabatic logic, as the lower coefficient rate would reduce the required clock rate. The Gabor transform could be implemented at a higher bandwidth and used as the front end to the receiver described in Chapter 4. The digital side of the receiver could then operate on the frequency domain coefficients using adiabatic logic. Doing this may offer considerable savings in energy over the standard logic approach. However, the adiabatic logic circuit would not be trivial, as the operations on the received coefficients need to occur in parallel in order to take advantage of the lower coefficient rate.

7 Published Work

• An Analogue Gabor Transform Using Sub-Threshold 180 nm CMOS Devices. Mark Tuckwell, Christos Papavassiliou. IEEE Transactions on Circuits and Systems I: Regular Papers, December 2009. Volume 56, Issue 12, pp. 2597–2608.
• Exploration of energy requirements at the output of an LNA from a thermodynamic perspective. Mark Tuckwell, Christos Papavassiliou. IEEE International Symposium on Circuits and Systems, 2007. pp. 2810–2813.

Bibliography

[1] Compact Oxford English Dictionary of Current English. Oxford Dictionaries, 2005. [2] C. E. Shannon, “A Mathematical Theory of Communication,” The Bell System Technical Journal, vol. 27, pp. 379–423, 1948. [3] Dictionary of Physics, 5th ed. Oxford University Press, 2005. [4] F. Adler, “Minimum Energy Cost of an Observation,” vol. 1, pp. 28–32, 1955. [5] R. W. Keyes, “Fundamental Limits in Digital Information Processing,” Proceedings of the IEEE, vol. 69, pp. 267–278, 1981. [6] R. Landauer, “Zig-Zag Path to Understanding [Physical Limits of Information Handling],” Physics and Computation, 1994. PhysComp ’94, Proceedings., Workshop on, pp. 54–59, Nov 1994. [7] R. Landauer, “Minimal Energy Requirements in Communication,” Science, vol. 272, pp. 1914–1918, June 1996. [8] J. D. Meindl and J. A. Davis, “The Fundamental Limit on Binary Switching Energy for Terascale Integration (TSI),” vol. 35, pp. 1515–1516, 2000. [9] J. S. Dugdale, Entropy and its Physical Meaning. Taylor & Francis, 1996. [10] L. B. Levitin, “Energy Cost of Information Transmission (Along the Path to Understanding),” Phys. D, vol. 120, no. 1-2, pp. 162–167, 1998. [11] R. P. Feynman, Feynman Lectures on Computation, A. J. G. Hey and R. W. Allen, Eds. Addison-Wesley, 1996. [12] J. G. Proakis and M. Salehi, Digital Communications, 5th ed. McGraw Hill, 2008. [13] R. Landauer, “Irreversibility and Heat Generation in the Computing Process,” IBM Journal of Research and Development, vol. 5, pp. 261–269, 1961. [14] R. Landauer, “Irreversibility and Heat Generation in the Computing Process,” IBM Journal of Research and Development, vol. 44, no. 1, pp. 261–269, January 2000. [15] Y. A. Cengel and M. A. Boles, Thermodynamics: An Engineering Approach. McGraw Hill, 2002. [16] L.
Brillouin, Science and Information Theory, 2nd ed. Academic Press, 1962. [17] C. H. Bennett, “Thermodynamics of Computation - a Review,” International Journal of Theoretical Physics, vol. 21, pp. 905–940, 1982. [18] R. Landauer, “Computation: A Fundamental Physical View,” Physica Scripta, vol. 35, no. 1, pp. 88–95, 1987. [19] R. C. Merkle, “Reversible Electronic Logic Using Switches,” Xerox, Tech. Rep., 1990. [20] G. S. Younis, “Asymptotically Zero Energy Computing Using Split-Level Charge Recovery Logic,” Ph.D. dissertation, Massachusetts Institute Of Technology, June 1994. 214 [21] W. Athas, L. Svensson, J. Koller, N. Tzartzanis, and E. Ying-Chin Chou, “Low-power digital systems based on adiabatic-switching principles,” Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 2, no. 4, pp. 398–407, Dec 1994. [22] S. Kim, J.-H. Kwon, and S.-I. Chae, “An 8-b nRERL Microprocessor for Ultra-Low-Energy Applications,” Design Automation Conference, 2001. Proceedings of the ASP-DAC 2001. Asia and South Pacific, pp. 27–28, 2001. [23] S. Kim and S. lk Chae, “Complexity Reduction in an nRERL Microprocessor,” Low Power Electronics and Design, 2005. ISLPED ’05. Proceedings of the 2005 International Symposium on, pp. 180–185, Aug. 2005. [24] C.-Y. Gong, M.-T. Shiue, C.-T. Hong, and K.-W. Yao, “Analysis and Design of an Efficient Irreversible Energy Recovery Logic in 0.18-µm CMOS,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 55, no. 9, pp. 2595–2607, Oct. 2008. [25] R. Eisberg, Robert Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei and Particles. Wiley, 1974. [26] H. J. Bremermann, “Minimum Energy Requirements of Information Transfer and Computing,” International Journal of Theoretical Physics, vol. 21, no. 3, pp. 203–217, 1982. [27] J. D. Bekenstein, “Energy Cost of Information Transfer,” The American Physical Society, December 1981. [28] J. 
Pendry, “Quantum Limits to the Flow of Information and Entropy,” Journal of Physics A: Mathematics General, vol. 16, pp. 2161–2171, 1983. 215 [29] S. Lloyd, “Ultimate Physical Limits to Computation,” Nature, vol. 406, pp. 1047–1054, August 2000. [30] L. L. B. Margolus Norman, “The Maximum Speed of Dynamical Evolution,” Physica D, vol. 120, pp. 188–201, 1998. [31] M. K. Andrecut, M Ali, “Maximum Speed of Quantum Evolution,” International Journal of Theoretical Physics, vol. 43, no. 4, pp. 969–974, 2004. [32] M. M. C. Lachmann, Michael Newman, “The Physical Limits of Communication or Why Any Sufficiently Advanced Technology is Indistinguishable from Noise,” Associaton of Physics teachers, vol. 72, no. 10, pp. 1290–1293, October 2004. [33] D. Gabor, “Lecture on Communication Theory,” Technical Report No. 238, MIT Research Laboratory of Electronics, vol. 238, pp. 1–46, 1952. [34] K. U. Stein, “Noise-induced Error Rate as Limiting Factor for Energy per Operation in Digital ICs,” Solid-State Circuits, IEEE Journal of, vol. 12, pp. 527–530, 1977. [35] N. Krishnapura, “Large Dynamic Range Dynamically Biased Log-Domain Filters,” Ph.D. dissertation, Columbia University, 2000. [36] A. G. Snchez-Sinencio, E. Andreou, Ed., Low-Voltage/Low-Power Integrated Circuits and Systems : Low-Voltage Mixed-Signal Circuits. Wiley-IEEE press, 1999, ch. 17, pp. 519–540. [37] E. A. Vittoz, “Future of Analog in the VLSI Environment,” in Circuits and Systems, 1990., IEEE International Symposium on, 1990, pp. 1372–1375. 216 [38] C. C. Enz and E. A. Vittoz, Designing Low Power Digital Systems, Emerging Technologies, 1996, ch. CMOS Low-Power Analog Circuit Design, pp. 79–133. [39] E. Vittoz, “Low-Power Design: Ways to Approach the Limits,” Solid-State Circuits Conference, 1994. Digest of Technical Papers. 41st ISSCC., 1994 IEEE International, pp. 14–18, Feb 1994. [40] R. Sarpeshkar, “Analog Versus Digital: Extrapolating from Electronics to Neurobiology,” Neural Computation, vol. 10, pp. 
1601–1638, October 1998. [41] B. J. Hosticka, “Performance Comparison of Analog and Digital Circuits,” Proceedings of the IEEE, vol. 73, pp. 25–29, 1985. [42] T. H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits. Cambridge University Press, 2004. [43] B. G. Perumana, S. Chakraborty, C. Lee, and J. Laskar, “A Fully Monolithic 260µW, 1-GHz Subthreshold Low Noise Amplifier,” vol. 15, no. 6, pp. 428–430, 2005. [44] X. Li, S. Shekhar, and D. J. Allstot, “Gm-Boosted Common-Gate LNA and Differential Colpitts VCO/QVCO in 0.18-µm CMOS,” Solid-State Circuits, IEEE Journal of, vol. 40, no. 12, pp. 2609–2619, 2005. [45] S.-C. Shin, M.-D. Tsai, R.-C. Liu, K.-Y. Lin, and H. Wang, “A 24-GHz 3.9-dB NF Low-Noise Amplifier Using 0.18 µm CMOS Technology,” Microwave and Wireless Components Letters, IEEE, vol. 15, no. 7, pp. 448–450, 2005. [46] M. Varonen, M. Karkkainen, M. Kantanen, and K. Halonen, “Millimeter-Wave Integrated Circuits in 65-nm CMOS,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 9, pp. 1991–2002, Sept. 2008. 217 [47] T. Yao, M. Gordon, K. Yau, M. Yang, and S. P. Voinigescu, “60-GHz PA and LNA in 90-nm RF-CMOS,” in Radio Frequency Integrated Circuits (RFIC) Symposium, 2006 IEEE, 2006. [48] S. Pellerano, Y. Palaskas, and K. Soumyanath, “A 64 GHz LNA With 15.5 dB Gain and 6.5 dB NF in 90 nm CMOS,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 7, pp. 1542–1552, July 2008. [49] M. Khanpour, K. Tang, P. Garcia, and S. Voinigescu, “A Wideband W-Band Receiver Front-End in 65-nm CMOS,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 8, pp. 1717–1730, Aug. 2008. [50] D. MacDonald, Noise and Fluctuations. John Wiley & Sons, Inc, 1962. [51] P. E. Allen and D. R. Holberg, CMOS Analog Circuit Design. Oxford University Press, 2002. [52] A. Fialkow and I. Gerst, “The Maximum Gain of an RC Network,” Proceedings of the IRE, vol. 41, no. 3, pp. 392–395, March 1953. [53] H. T. Friis, “A Note on a Simple Transmission Formula,” Proceedings of the IRE, vol. 
34, no. 5, pp. 254–256, May 1946. [54] J. D. Kraus, Electromagnetics, J. W. Beamesderfer, Lyn Bradley, Ed. McGraw-Hill International, 1991. [55] D. Yates, A. Holmes, and A. Burdett, “Optimal Transmission Frequency for Ultralow-Power Short-Range Radio Links,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 51, no. 7, pp. 1405–1413, July 2004. [56] C. Capps, “Near or Far Field?” EDN.com, 2001. 218 [57] Matt Welborn, “DS-UWB vs. 802.11n: What’s the Best Connectivity Option?” Microwave Engineering Online (CMP), Mar. 2005. [58] H. Wheeler, “Fundamental Limits of Small Antennas,” Proceedings of the IRE, vol. 35, no. 12, pp. 1479–1484, Dec. 1947. [59] Bluetooth, “How Bluetooth Technology Works,” Website, 2007. [Online]. Available: http://www.bluetooth.com/Bluetooth/Learn/Works/ [60] Mark Norris and Joe Oldak, “Single-chip ZigBee for Indoor Mobile Telemetry,” Cambridge Consultants, Tech. Rep., 21 Jun. 2005. [61] P. Abshire and A. G. Andreou, “Capacity and Energy Cost of Information in Biological and Silicon Photoreceptors,” Proceedings of the IEEE, vol. 89, pp. 1052–1064, 2001. [62] Engineering and Technology Magazine. IET, January 2009, vol. 4, no. 1. [63] B. Stackhouse, S. Bhimji, C. Bostak, D. Bradley, B. Cherkauer, J. Desai, E. Francom, M. Gowan, P. Gronowski, D. Krueger, C. Morganti, and S. Troyer, “A 65 nm 2-Billion Transistor Quad-Core Itanium Processor,” Solid-State Circuits, IEEE Journal of, vol. 44, no. 1, pp. 18–31, Jan. 2009. [64] “Intel.” [Online]. Available: www.intel.com [65] Q. Tang, L. Yang, G. Giannakis, and T. Qin, “Battery Power Efficiency of PPM and FSK in Wireless Sensor Networks,” Wireless Communications, IEEE Transactions on, vol. 6, no. 4, pp. 1308–1319, April 2007. [66] “Texas Instruments.” [Online]. Available: www.ti.com 219 [67] F. Frederiksen and R. Prasad, “An Overview of OFDM and Related Techniques Towards Development of Future Wireless Multimedia Communications,” Radio and Wireless Conference, 2002. RAWCON 2002. IEEE, pp. 
19–22, 2002. [68] T. Tsang and M. El-Gamal, “Ultra-Wideband (UWB) Communications Systems: An Overview,” IEEE-NEWCAS Conference, 2005. The 3rd International, pp. 381–386, June 2005. [69] H. Darabi, B. Ibrahim, and A. Rofougaran, “An Analog GFSK Modulator in 0.35-µm CMOS,” Solid-State Circuits, IEEE Journal of, vol. 39, no. 12, pp. 2292–2296, Dec. 2004. [70] C. Wong, “A 3-V GSM Baseband Transmitter,” Solid-State Circuits, IEEE Journal of, vol. 34, no. 5, pp. 725–730, May 1999. [71] Y. Zhu, J. Zuegel, J. Marciante, and H. Wu, “Distributed Waveform Generator: A New Circuit Technique for Ultra-Wideband Pulse Generation, Shaping and Modulation,” Solid-State Circuits, IEEE Journal of, vol. 44, no. 3, pp. 808–823, March 2009. [72] J. Bohorquez, J. Dawson, and A. Chandrakasan, “A 350µW CMOS MSK Transmitter and 400µW OOK Super-regenerative Receiver for Medical Implant Communications,” VLSI Circuits, 2008 IEEE Symposium on, pp. 32–33, June 2008. [73] M. Demirkan and R. Spencer, “A Pulse-Based Ultra-Wideband Transmitter in 90-nm CMOS for WPANs,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 12, pp. 2820–2828, Dec. 2008. [74] T.-A. Phan, J. Lee, V. Krizhanovskii, S.-K. Han, and S.-G. Lee, “A 18-pJ/Pulse OOK CMOS Transmitter for Multiband UWB Impulse Radio,” Microwave and Wireless Components Letters, IEEE, vol. 17, no. 9, pp. 688–690, Sept. 2007. 220 [75] T. Norimatsu, R. Fujiwara, M. Kokubo, M. Miyazaki, A. Maeki, Y. Ogata, S. Kobayashi, N. Koshizuka, and K. Sakamura, “A UWB-IR Transmitter With Digitally Controlled Pulse Generator,” Solid-State Circuits, IEEE Journal of, vol. 42, no. 6, pp. 1300–1309, June 2007. [76] V. Kulkarni, M. Muqsith, K. Niitsu, H. Ishikuro, and T. Kuroda, “A 750 Mb/s, 12 pJ/b, 6-to-10 GHz CMOS IR-UWB Transmitter With Embedded On-Chip Antenna,” Solid-State Circuits, IEEE Journal of, vol. 44, no. 2, pp. 394–403, Feb. 2009. [77] S. Diao, Y. Zheng, and C.-H. 
Heng, “A CMOS Ultra Low-Power and Highly Efficient UWB-IR Transmitter for WPAN Applications,” Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 56, no. 3, pp. 200–204, March 2009. [78] D. Wentzloff and A. Chandrakasan, “Gaussian Pulse Generators for Subbanded Ultra-Wideband Transmitters,” Microwave Theory and Techniques, IEEE Transactions on, vol. 54, no. 4, pp. 1647–1655, June 2006. [79] R. Xu, Y. Jin, and C. Nguyen, “Power-efficient Switching-based CMOS UWB Transmitters for UWB Communications and Radar Systems,” Microwave Theory and Techniques, IEEE Transactions on, vol. 54, no. 8, pp. 3271–3277, Aug. 2006. [80] J. Ryckaert, C. Desset, A. Fort, M. Badaroglu, V. De Heyn, P. Wambacq, G. Van der Plas, S. Donnay, B. Van Poucke, and B. Gyselinckx, “Ultra-wide-band Transmitter for Low-power Wireless Body Area Networks: Design and Evaluation,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 52, no. 12, pp. 2515–2525, Dec. 2005. [81] B. V. P. Vliet, L; Rieger, “Fourier Transform of a Gaussian,” Delft University of Technology, Tech. Rep., 2002. 221 [82] P. A. Lynn, An Introduction to the analysis and processing of signals. MacMillan, 1989. [83] H. Kamada and N. Aoshima, “Analog Gabor Transform Filter with Complex First Order System,” SICE ’97. Proceedings of the 36th SICE Annual Conference. International Session Papers, pp. 925–930, Jul 1997. [84] S. Bagga, G. de Vita, S. Haddad, W. Serdijn, and J. Long, “A PPM Gaussian Pulse Generator for Ultra-wideband Communications,” Circuits and Systems, 2004. ISCAS ’04. Proceedings of the 2004 International Symposium on, vol. 1, pp. I–109–I–112 Vol.1, May 2004. [85] S. Haddad, S. Bagga, and W. Serdijn, “Log-domain Wavelet Bases,” IEEE Trans. Circuits Syst. I, vol. 52, no. 10, pp. 2023–2032, Oct. 2005. [86] “UK commumications regulator.” [Online]. Available: http://www.ofcom.org.uk/ [87] Ofcom, “UK Interface Requirement 2030 - Licence Exempt Short Range Devices,” 2006. [88] Y. 
Sun, Ed., Design of high frequency integrated analogue filters. IEE, 2002. [89] D. Guermandi, S. Gambini, and J. Rabaey, “A 1 V 250 kbps 90 nm CMOS Pulse Based Transceiver for cm-range Wireless Communication,” Solid State Circuits Conference, 2007. ESSCIRC 2007. 33rd European, pp. 135–138, Sept. 2007. [90] S. Lee, J. Yoo, and H.-J. Yoo, “A 200-Mbps 0.02-nJ/b Dual-Mode Inductive Coupling Transceiver for cm-Range Multimedia Application,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 56, no. 5, pp. 1063–1072, May 2009. 222 [91] R. R. Harrison, P. T. Watkins, R. J. Kier, R. O. Lovejoy, D. J. Black, B. Greger, and F. Solzbacher, “A Low-Power Integrated Circuit for a Wireless 100-Electrode Neural Recording System,” Solid-State Circuits, IEEE Journal of, vol. 42, no. 1, pp. 123–133, Jan. 2007. [92] “Zarlink Semiconductor.” [Online]. Available: http://www.zarlink.com/ [93] Fairchild Semiconductor, “NC7SV04 TinyLogic ULP-A Inverter,” Tech. Rep., 2009. [94] G. Gabor, “Theory Of Communication,” IEE Journal of Radio and Communications, vol. 93, pp. 429–457, 1946. [95] Y. T. Chan, Wavelet Basics, J. Allen, Ed. Kluwer Academic Publishers, 1995. [96] R. Edwards and M. Godfrey, “An Analog Wavelet Transform Chip,” Neural Networks, 1993., IEEE International Conference on, pp. 1247–1251 vol.3, 1993. [97] E. Justh and F. Kub, “Analog CMOS High-frequency Continuous Wavelet Transform Circuit,” Circuits and Systems, 1999. ISCAS ’99. Proceedings of the 1999 IEEE International Symposium on, vol. 2, pp. 188–191 vol.2, Jul 1999. [98] O. Moreira-Tamayo and J. Pineda de Gyvez, “Time-domain Analog Wavelet Transform in Real-time,” Circuits and Systems, 1995. ISCAS ’95., 1995 IEEE International Symposium on, vol. 3, pp. 1640–1643 vol.3, Apr-3 May 1995. [99] A. Papoulis, Probability, Random Variables, and Stochastic Processes, S. W. Director, Ed. McGraw-Hill, 1984. [100] D. Johns, W. Snelgrove, and A. Sedra, “Orthonormal Ladder Filters,” IEEE Trans. Circuits Syst, vol. 36, no. 
3, pp. 337–343, Mar 1989. 223 [101] S. Koziel, R. Schaumann, and H. Xiao, “Analysis and Optimization of Noise in Continuous-Time OTA-C Filters,” IEEE Trans. Circuits Syst. I, vol. 52, no. 6, pp. 1086–1094, June 2005. [102] C. Enz, F. Krummenacher, and E. Vittoz, “An Analytical MOS Transistor Model Valid in all Regions of Operation and Dedicated to Low-voltage and Low-current Applications,” Analog Integrated Circuits and Signal Processing, vol. 8, pp. 83–114, 1995. [103] P. Corbishley and E. Rodriguez-Villegas, “Design Tradeoffs in Low-power Low-voltage Transconductors in Weak Inversion,” Circuits and Systems, 2006. MWSCAS ’06. 49th IEEE International Midwest Symposium on, vol. 2, pp. 444–448, Aug. 2006. [104] J. F. Fernandez-Bootello, M. Delgado-Restituto, and A. Rodriguez-Vazquez, “Matrix Methods for the Dynamic Range Optimization of Continuous-Time Gm - C Filters,” IEEE Trans. Circuits Syst. I, vol. 55, no. 9, pp. 2525–2538, Oct. 2008. [105] Z. Zhang, A. Celik, and P. Sotiriadis, “State-space Harmonic Distortion Modeling in Weakly Nonlinear, Fully Balanced Gm-C Filters - A Modular Approach Resulting in Closed-form Solutions,” IEEE Trans. Circuits Syst. I, vol. 53, no. 1, pp. 48–59, Jan. 2006. [106] P. Sotiriadis, A. Celik, D. Loizos, and Z. Zhang, “Fast State-Space Harmonic-Distortion Estimation in Weakly Nonlinear GmC Filters,” IEEE Trans. Circuits Syst. I, vol. 54, no. 1, pp. 218–228, Jan. 2007. [107] A. Celik, Z. Zhang, and P. Sotiriadis, “A State-Space Approach to Intermodulation Distortion Estimation in Fully Balanced 224 Bandpass Gm C Filters With Weak Nonlinearities,” IEEE Trans. Circuits Syst. I, vol. 54, no. 4, pp. 829–844, April 2007. [108] P. Maffezzoni, L. Codecasa, and D. D’Amore, “Time-Domain Simulation of Nonlinear Circuits Through Implicit RungeKutta Methods,” IEEE Trans. Circuits Syst. I, vol. 54, no. 2, pp. 391–400, Feb. 2007. [109] A. Arnaud and C. Galup-Montoro, “Consistent Noise Models for Analysis and Design of CMOS Circuits,” IEEE Trans. 
Circuits Syst. I, vol. 51, no. 10, pp. 1909–1915, Oct. 2004. [110] R. Serrano-Gotarredona, L. Camuñas-Mesa, T. Serrano-Gotarredona, J. Leñero-Bardallo, and B. Linares-Barranco, “The Stochastic I-Pot: A Circuit Block for Programming Bias Currents,” IEEE Trans. Circuits Syst. II, vol. 54, no. 9, pp. 760–764, Sept. 2007. [111] D. Graham, P. Hasler, R. Chawla, and P. Smith, “A Low-Power Programmable Bandpass Filter Section for Higher Order Filter Applications,” IEEE Trans. Circuits Syst. I, vol. 54, no. 6, pp. 1165–1176, June 2007. [112] S. Haddad, J. Karel, R. Peeters, R. Westra, and W. Serdijn, “Ultra Low-power Analog Morlet Wavelet Filter in 0.18 µm BiCMOS Technology,” Solid-State Circuits Conference, 2005. ESSCIRC 2005. Proceedings of the 31st European, pp. 323–326, Sept. 2005. [113] J. Ross, Thermodynamics and Fluctuations far from Equilibrium. Springer, 2008. [114] E. W. Weisstein, “Fourier Series–Square Wave,” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/FourierSeriesSquareWave.html. [115] J. D. Kraus, Antennas, A. E. Elken, Ed. McGraw-Hill International Editions, 1998. [116] J. G. Proakis, Digital Communications, S. W. Director, Ed. McGraw Hill, 2001. [117] B. Le Floch, M. Alard, and C. Berrou, “Coded Orthogonal Frequency Division Multiplex [TV Broadcasting],” Proceedings of the IEEE, vol. 83, no. 6, pp. 982–996, Jun 1995.

A Derivation of switching energy for a square wave

The proof in [35] for a sinusoid driving an RC interconnect, figure A.1, assumes that the driving signal is at the cut off frequency of the filter. For a square wave there are many components which are well below the cut off frequency. The proof in this section expands on that in [35] by considering a square wave driving an RC interconnect.
The input signal is a square wave, which is the sum of sinusoids given by [114]:

    V_in(t) = V_P (4/π) Σ_{n=1,3,5,···}^{N} (1/n) sin(nω₀t)    (A.1)

where N is the maximum harmonic index, ω₀ is the fundamental frequency of the clock and V_P is the peak amplitude of the clock. Here the clock is assumed to swing between +V_P and −V_P. By using superposition the total input power required can be found by considering the input power for each component of V_in. The cut-off frequency of the filter is related to the number of harmonics which are included in the representation of the square wave, ω_c = Nω₀.

Figure A.1: A single pole RC filter.

The transfer function for the nth harmonic is thus given by:

    H(n) = [1/√(1 + (n/N)²)] e^{jθ(n)}    (A.2)

where θ(n) = −tan⁻¹(n/N). The output voltage of the nth harmonic is given by:

    V_out(n, t) = (4V_P/πn) √[1/(1 + (n/N)²)] sin(nω₀t + θ(n)).    (A.3)

The mean squared output voltage of the nth harmonic can then be written as:

    ⟨V²_out(n)⟩ = (1/2)(4V_P/πn)² · 1/(1 + (n/N)²).    (A.4)

The output noise of an RC filter is V² = kT/C, so the output SNR for the nth harmonic is:

    SNR(n) = C(4V_P/πn)² / [2kT(1 + (n/N)²)]    (A.5)

By considering the voltage drop across the resistor the input power of the nth harmonic can be written as:

    P_n = [(4V_P/πn)²/(2R)] [1 − 1/(1 + (n/N)²)]    (A.6)

which reduces to:

    P_in(n) = kT SNR(n) ω₀ n²/N.    (A.7)

The SNR of an ideal square wave, where N is taken to infinity, is:

    SNR_sq = V_P²C/(kT)    (A.8)

The total input power can then be expressed in terms of the number of harmonics and the SNR of the ideal square wave as:

    P_T = (16/π) kT SNR_sq f₀ (1/N) Σ_{n=1,3,5,···}^{N} 1/(1 + (n/N)²)    (A.9)

In the limit of N going to infinity the series sum can be replaced by:

    (1/N) Σ_{n=1,3,5,···}^{N} 1/(1 + (n/N)²) = π/8    (A.10)

The energy per switching operation for a square wave driving an RC interconnect is P_T/f₀:

    E_op = 2kT SNR_sq    (A.11)

B A lower bound on the energy for a point to point communication link

In this section a lower bound on the energy required for transmitting a bit using electromagnetic radiation is sought in terms of distance, antenna sizes and noise factor of the receiver. The Friis equation for free space path loss is widely used in radio communication calculations [53]. This equation, together with Planck's blackbody radiation formula [25], is used to show that there is a lower bound on the energy required to transmit information between two antennas in free space. This lower bound takes into account the fact that the transmit antenna is a blackbody and radiation from it increases with the square of signal frequency. This blackbody radiation can be treated as noise generated by the transmitter. In this work, the noise generated at the transmitter due to blackbody radiation will be considered. This will show that there is a frequency at which an energy minimum occurs.

B.1 System Temperature

The system temperature (T_sys) is the overall noise temperature of the system, taking into account the noise temperature that the receive antenna sees, together with the noise introduced in the receiver electronics.
The definition of T_sys in [115] is:

    T_sys = T_A + T_AP(1/ε₁ − 1) + T_LP(1/ε₂ − 1) + T_R/ε₃    (B.1)

T_A is the noise temperature seen by the antenna when no transmission is taking place, but includes the noise temperature due to the blackbody radiation emitted from the transmitting antenna. T_AP is the physical temperature of the receive antenna. If the thermal efficiency of the antenna is 100% then:

    T_AP(1/ε₁ − 1) = 0    (B.2)

T_LP is the physical temperature of the transmission line and again, if there are no thermal losses due to this, then:

    T_LP(1/ε₂ − 1) = 0    (B.3)

T_R is the noise temperature of the receiver electronics, which is dictated by the gain and number of stages as:

    T_R = T₁ + T₂/G₁ + T₃/(G₁G₂) + ··· + T_n/∏_{i=1}^{n−1} G_i    (B.4)

where n is the number of stages, T_i is the noise temperature of the ith stage and G_i is the gain of the ith stage. For a multistage receiver the overall noise temperature, T_R, can be related to the physical temperature of the receiver, T₀, by the noise figure:

    T_R = (F − 1)T₀    (B.5)

By setting T_sys = T_A + (F − 1)T₀, the following relationship for the minimum transmit power can be written:

    P_t ≥ SNR d²c²kB [T_A + (F − 1)T₀] / (A_t A_r f²)    (B.6)

Eq. (B.6) shows that the transmit power has an inverse relationship to signal frequency. This implies that as the frequency of transmission increases, the power required to transmit information becomes smaller. In the following sections a lower bound will be derived which shows that this inverse relationship to frequency does not hold, because T_A is also frequency dependent (∝ f⁴) due to the blackbody radiation from the transmit antenna. Note also that increasing the quality factor (Q = f/B) will reduce the power, implying that narrowband systems require less transmit power.

B.2 Antenna Noise Temperature

The antenna noise temperature, T_A, seen by the receiving antenna is a function of the coupling of the antenna to all radiating sources.
In general, there are many radiating objects that couple with the antenna. The radiation of the sky is approximately 3 K, so some of this may couple into the receive antenna. Also the ground is at around 290 K, so some of the antenna field will couple with this. However, in order to derive a lower bound on transmit power these effects are ignored and only the radiation due to the transmit antenna is taken into account. Every object can be considered as a blackbody that radiates energy according to Planck's radiation formula [25]:

    I(f, T) = ε (2hf³/c²) · 1/(e^{hf/kT} − 1)    (B.7)

where I is the intensity of radiation at frequency f and temperature T, h is Planck's constant (6.626068 × 10⁻³⁴ Js) and ε is the emissivity of the blackbody. Dealing only with frequencies well below kT/h (of the order of terahertz at room temperature), (B.7) can be simplified by keeping up to the linear term of the Taylor expansion, e^x = 1 + x:

    I(f, T) = ε 2f²kT/c²    (B.8)

For an antenna with a physical area of Ā_t, the noise power generated by the transmitter antenna due to blackbody radiation can be given by:

    P_n(Tx) = ε Ā_t (2kT/c²) ∫_B f² df    (B.9)

For narrowband systems with Q > 1 it is possible to approximate the integral in (B.9) by the area of the rectangle:

    P_n(Tx) ≈ ε Ā_t 2kT f²B / c²    (B.10)

This blackbody noise source originates at the transmit antenna and is attenuated by the path loss (2.77); therefore, the noise power at the receive antenna is given by:

    P_n(Rx) ≈ 2ε Ā_t A_t A_r f²kTB / (λ²d²c²)    (B.11)

The equivalent noise temperature of this received power is given by:

    T_A = P_n(Rx) / (kB)    (B.12)

which results in the noise temperature at the receiver due to transmitter blackbody radiation as:

    T_A = 2ε Ā_t A_t A_r f⁴T / (d²c⁴)    (B.13)

Eq. (B.13) shows that the temperature due to the blackbody radiation of the transmitter antenna is proportional to the 4th power of frequency, which means that the antenna will appear increasingly hotter as the centre frequency of the transmitted information is increased.
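As a numerical illustration of (B.13), the sketch below evaluates T_A for some assumed antenna areas and link distance. All parameter values here are illustrative, not taken from the thesis; the point is the f⁴ dependence, which raises T_A by four decades per decade of frequency:

```python
C_LIGHT = 3.0e8  # speed of light, m/s

def antenna_noise_temp(f, T=300.0, eps=1.0, A_phys=1e-4, A_t=1e-4, A_r=1e-4, d=1.0):
    """Noise temperature at the receiver due to transmitter blackbody
    radiation, eq. (B.13): T_A = 2*eps*A_phys*A_t*A_r*f^4*T/(d^2*c^4).
    Areas in m^2, distance in m, frequency in Hz."""
    return 2.0 * eps * A_phys * A_t * A_r * f**4 * T / (d**2 * C_LIGHT**4)

for f in (1e9, 10e9, 100e9):
    # each decade increase in f raises T_A by a factor of 1e4
    print(f, antenna_noise_temp(f))
```

With these centimetre-scale areas the blackbody contribution is negligible at gigahertz frequencies, which is why the receiver noise term dominates until f approaches the optimum f₀ of (B.18).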
B.3 Lower Bound on Transmission Energy

By substituting the antenna temperature due to blackbody radiation (B.13) into (B.6) and rearranging the resulting equation, a new limit on the amount of power required for communication between two points in free space can be shown to be:

    P_t ≥ SNR kTB [2ε Ā_t f²/c² + (F − 1)d²c²/(A_t A_r f²)]    (B.14)

Here the temperature of the transmit antenna and the receive antenna have been set equal to T, which is not unreasonable for systems which transmit over limited distances. Doing this simplifies the analysis in order to clearly show the lower bound on the energy required for transmission. Notice that the contribution due to the blackbody radiation increases as the square of frequency, whereas the contribution due to the thermal noise of the receiver electronics decreases as 1/f². This is interesting as it will provide a minimum amount of power required for transmission.

B.4 Energy per bit

In order to make comparisons between systems, it often makes more sense to compare the energy per bit as this takes into account the information processing ability of the circuit. The energy per bit can be written as:

    E = P/İ    (B.15)

where İ is the channel capacity in bits/s. The upper bound on channel capacity is given by Shannon [2] as:

    İ ≤ B log₂(1 + SNR)    (B.16)

By using this equation the energy per bit can be found from (B.14):

    E_bit ≥ kT [SNR/log₂(1 + SNR)] [2ε Ā_t f²/c² + (F − 1)d²c²/(A_t A_r f²)]    (B.17)

This equation has its minimum at f₀ given by:

    f₀ = [(F − 1)d²c⁴ / (2ε Ā_t A_t A_r)]^{1/4}    (B.18)

and at low signal to noise ratios (SNR << 1):

    SNR/log₂(1 + SNR) = ln 2    (B.19)

which gives the minimum energy per bit as:

    E_bit(MIN) = 2d √[2ε Ā_t (F − 1)/(A_t A_r)] kT ln 2    (B.20)

B.5 Isotropic and practical antennas

In order to analyse this lower bound further and make comparisons with practical scenarios, some knowledge about how the physical area relates to the effective area for an antenna is required.
An isotropic antenna is a hypothetical antenna which radiates equally in all directions and one in which all power at the input to the antenna is transmitted. The effective area of an antenna is defined, via the antenna gain G, as:

    A_effective = G λ²/(4π)    (B.21)

For the isotropic case the gain is equal to unity and it is found that the antenna noise temperature, T_A, becomes a constant, which means that the minimum energy occurs at DC. However, for other types of antenna that show a direct relation between effective area and physical area, the minimum energy can be selected to be at much higher frequencies.

B.5.1 Parabolic dish antenna

For the parabolic reflective dish the antenna gain is:

    G_parabolic = π²D²/λ²    (B.22)

where D is the reflector diameter. Essentially the effective antenna area is proportional to the area of the transmitter, with the constant of proportionality being the efficiency [54]:

    A_parabolic = η Ā    (B.23)

where Ā is the physical antenna area and η < 1 is the efficiency.

B.5.2 Dipole antenna

For a half wavelength dipole antenna the gain of the antenna is [53]:

    G_dipole ≈ π/2    (B.24)

which leads to a frequency dependent effective area:

    A_dipole ≈ 0.125c²/f²    (B.25)

In the case of antennas whose physical area is proportional to the effective area there will be a minimum, at a frequency other than zero, of (B.20). Whereas for those cases where a dipole antenna is used, the contribution due to the blackbody relationship becomes a constant for all frequencies, which means that:

    E_bit(MIN) ∝ f    (B.26)

C The relationship between SNR and Eb/N0

In digital communications a widely accepted measure of system performance is E_b/N₀ [116]. This measure gives the amount of energy over the noise floor required in order to transmit a single bit of information. Typically, graphs showing E_b/N₀ against bit error rate are used to compare communications protocols.
Signal to noise ratio is defined as:

    SNR = P_S/P_N    (C.1)

where P_S is the average signal power and P_N is the average noise power. For white noise it is more convenient to write:

    P_N = N₀B    (C.2)

where B is the bandwidth and N₀ is the spectral height of the white noise. E_b/N₀ is thus related to the SNR:

    E_b/N₀ = P_S/(N₀R) = SNR·B/R    (C.3)

Using (C.3) and the channel capacity formula (2.1) the minimum value of E_b/N₀ can be obtained:

    E_b/N₀(min) = lim_{SNR→0} SNR/log₂(1 + SNR) = ln 2    (C.4)

For matched source and load impedances N₀ = kT, thus in a straightforward manner the minimum energy per bit required at the load is kT ln 2 joules. As this is only true in the limit of SNR → 0, the rate of information also tends towards zero. This is the same result as shown several times in Section 2.3.

C.1 Square wave SNR

For detection of a bi-polar binary signal the following relationship can be written [116]:

    E_b/N₀ = [erfc⁻¹(2P_e)]².    (C.5)

Thus the SNR required in order to represent this signal is:

    SNR = (R/B)[erfc⁻¹(2P_e)]².    (C.6)

For a rectangular pulse the ratio R/B is approximately equal to unity, thus the SNR for a square wave is:

    SNR_sq = [erfc⁻¹(2P_e)]².    (C.7)

D Time-Frequency Uncertainty

In this appendix the time-frequency uncertainty bound is calculated for the square, Gaussian, Gaussian derivative and sinusoidal pulse functions.

D.1 Uncertainty Calculation

For a pulse shape g(t) the degree of time-frequency uncertainty can be written as [117]:

    ε = 1/(4πΔtΔf)    (D.1)

where Δt and Δf can be computed from the second moments of the time and frequency representations of the function:

    (Δt)² = ∫ t²|g(t)|² dt / ∫ |g(t)|² dt    (D.2)

    (Δf)² = ∫ f²|G(f)|² df / ∫ |G(f)|² df.    (D.3)

The Fourier transform of g(t) is:

    G(f) = ∫_{−∞}^{∞} g(t) e^{−j2πft} dt.    (D.4)

D.2 Rectangular Pulse

The baseband rectangular pulse can be written as:

    g(t) = √(2/T),  0 < t ≤ T;  0 elsewhere.    (D.5)

The Fourier transform of the rectangular pulse is:

    G(f) = √(2T) sinc(fT).    (D.6)

The frequency uncertainty is given by:

    (Δf)² = ∫ f² sinc²(fT) df / ∫ sinc²(fT) df = ∞.    (D.7)

Therefore, the time-frequency uncertainty for the rectangular pulse is ε = 0.

D.3 Sinusoidal Pulse

The sinusoidal pulse function is:

    g(t) = √(4/T) sin(πt/T),  0 < t ≤ T;  0 elsewhere.    (D.8)

The Fourier transform of the sinusoidal pulse is:

    G(f) = [√(4T)/π] (1 + e^{−j2πfT}) / (1 − (2fT)²).    (D.9)

The time uncertainty can be found as follows:

    ∫₀^T |g(t)|² dt = 2    (D.10)

    ∫₀^T t²|g(t)|² dt = T²(2/3 − 1/π²)    (D.11)

    Δt = T √[(2/3 − 1/π²)/2].    (D.12)

The frequency uncertainty is:

    ∫_{−∞}^{∞} |G(f)|² df = 2    (D.13)

    ∫_{−∞}^{∞} f²|G(f)|² df = 1/(2T²)    (D.14)

    Δf = 1/(2T).    (D.15)

Therefore, the time-frequency uncertainty is ε = 0.3. Although much better than the rectangular pulse, this is still much less than the value of unity which is achieved by the Gaussian pulse.

D.4 Gaussian Pulse

The Gaussian pulse is given by:

    g(t) = √2 (2α)^{1/4} e^{−πα(t−T/2)²}    (D.16)

The Fourier transform of this pulse is:

    G(f) = √2 (2/α)^{1/4} e^{−(π/α)f²}.    (D.17)

The time uncertainty is:

    ∫_{−∞}^{∞} |g(t)|² dt = 2    (D.18)

    ∫_{−∞}^{∞} t²|g(t)|² dt = 1/(2πα)    (D.19)

    Δt = 1/(2√(πα))    (D.20)

The frequency uncertainty is:

    ∫_{−∞}^{∞} |G(f)|² df = 2    (D.21)

    ∫_{−∞}^{∞} f²|G(f)|² df = α/(2π)    (D.22)

    Δf = (1/2)√(α/π).    (D.23)

The time-frequency uncertainty of the Gaussian pulse is ε = 1. This is the optimum time-frequency uncertainty, which is only reached by the Gaussian pulse.

D.5 Gaussian Derivative

The 1st Gaussian derivative is given by:

    g(t) = 2 [4α³/(2π)]^{1/4} t e^{−αt²}.    (D.24)

The Fourier transform of the pulse is:

    G(f) = 2j [8π⁵/α³]^{1/4} f e^{−(π²/α)f²}.    (D.25)

The time uncertainty is:

    ∫_{−∞}^{∞} |g(t)|² dt = 2    (D.26)

    ∫_{−∞}^{∞} t²|g(t)|² dt = 3/(2α)    (D.27)

    Δt = √[3/(4α)]    (D.28)

The frequency uncertainty is:

    ∫_{−∞}^{∞} |G(f)|² df = 2    (D.29)

    ∫_{−∞}^{∞} f²|G(f)|² df = 3α/(2π²)    (D.30)

    Δf = √[3α/(4π²)].    (D.31)

For the derivative of a Gaussian pulse the time-frequency uncertainty is ε = 1/3, which is slightly better than for the sinusoidal pulse.
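The closed-form results above are easy to cross-check numerically. The sketch below recomputes ε for the sinusoidal pulse of (D.8), evaluating the time moment of (D.12) with a simple midpoint rule and using the closed-form Δf of (D.15):

```python
import math

def eps_sinusoid(T=1.0, n=20000):
    """Numerical time-frequency uncertainty eps = 1/(4*pi*dt*df) for the
    half-sine pulse g(t) = sqrt(4/T)*sin(pi*t/T) on (0, T]."""
    h = T / n
    norm = second = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        g2 = (4.0 / T) * math.sin(math.pi * t / T) ** 2
        norm += g2 * h
        second += t * t * g2 * h
    dt = math.sqrt(second / norm)   # matches (D.12): T*sqrt((2/3 - 1/pi^2)/2)
    df = 1.0 / (2.0 * T)            # closed form, (D.15)
    return 1.0 / (4.0 * math.pi * dt * df)

print(eps_sinusoid())  # approximately 0.3, as quoted in section D.3
```

The same routine applied to a sampled Gaussian reproduces ε = 1, the Gabor limit.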
D.6 Truncated Gaussian Pulse

The truncated Gaussian pulse is:

    g(t) = √2 (2α)^{1/4} e^{−πα(t−T/2)²},  0 < t ≤ T;  0 elsewhere.    (D.32)

The Fourier transform of this pulse is:

    G(f) = [1/(2α)]^{1/4} (M + M*) e^{−(π/α)f²}    (D.33)

    M = erf[√(πα)T/2 − jf√(π/α)].    (D.34)

The magnitude squared function is then:

    |G(f)|² = [4/√(2α)] (Re(M))² e^{−2(π/α)f²}.    (D.35)

The frequency uncertainty does not have a closed form expression, so a numerical approximation is required. The complex error function may be approximated using:

    erf(z) = (2/√π) Σ_{n=0}^{∞} (−1)ⁿ z^{2n+1}/[n!(2n+1)] = (2/√π)(z − z³/3 + z⁵/10 − z⁷/42 + ···).    (D.36)

Figure D.1 shows a plot of the approximated time-frequency uncertainty evaluated over a bandwidth of 20/T. The plot also shows the approximation of the Gaussian derivative, sinusoid and rectangular pulses over the same bandwidth. The figure clearly shows the superiority of the truncated Gaussian pulse over the other pulses in terms of the time-frequency uncertainty. The reason why the rectangular uncertainty value is greater than 0 is that the frequency uncertainty is only evaluated over 20/T, which results in a value of ε = 0.2725. As expected, as the value of αT² decreases towards zero the time uncertainty becomes smaller, thus approximating a square wave.

Figure D.1: Approximation of the time-frequency uncertainty for the truncated Gaussian, Gaussian derivative, sinusoidal and rectangular pulses. The approximation is taken over a bandwidth of 20/T.

E BER Simulation

E.1 Correlation Detection

The entire set of symbols for a given past history can be written in matrix form as:

    s̃^{(ψ)} = [g(t), g(t − T), ···, g(t − (L − 1)T)]ᵀ + 1_{(L×1)} Σ_{i=1}^{P} g(t − I_i^{(ψ)}T + iLT)    (E.1)

where 1_{(L×1)} is a vector containing L ones. ψ is an integer between 0 and L^P − 1.
For every transmitted symbol the previous information sequence will be different, so there are L^P possible values for s̃. To ensure that the transmitted power is equal to unity, a scaling factor can be found such that:

    (1/L^P) Σ_{ψ=0}^{L^P−1} tr[s̃^{(ψ)} (s̃^{(ψ)})ᵀ] = L.    (E.2)

A cross correlation matrix can be formed for each of the previous information sequences. Assuming that each symbol is equally likely and that ML detection on a symbol by symbol basis is used, an average cross correlation matrix can be formed:

    R̃_ss = (1/L^P) Σ_{ψ=0}^{L^P−1} s̃^{(ψ)} [s]ᵀ    (E.3)

where s = [g(t), g(t − T), ···, g(t − (L − 1)T)]ᵀ. The L × L matrix created by (E.3), together with the L × L matrix R_ss = s sᵀ, can then be used to find the bit error rate performance. Given these two matrices, the transmitted symbol set and the original symbol set can be found as:

    s̃ = s⁻¹ R̃_ss    (E.4)

    s = [R_ss]^{1/2}.    (E.5)

The ML estimate (3.15) then becomes:

    m̂ = arg max_{1≤m≤L} [s̃_i + n] · s_m    (E.6)

where n has a variance shown in (3.20) and s̃_i is the ith row of s̃ selected at random.

E.2 Matched Filter Detection

In the same way as in (E.1), the complete time domain signals for the current symbol and the past P symbols need to be constructed. The difference here is that s(t) must be defined over the interval −PLT < t < LT. For unity transmit energy, ensure that s̃^{(ψ)} is scaled so that (E.2) is satisfied. Each row of s̃^{(ψ)} can be convolved with g(t), the approximated matched filter:

    ỹ_m(t)^{(ψ)} = s̃_m^{(ψ)} * g(t).    (E.7)

The result of the convolution is sampled at the times:

    t = {T, 2T, ···, LT}    (E.8)

in order to construct an L × L matrix of samples for each value of ψ:

    ỹ^{ψ} = [ ỹ₁(T)^{(ψ)}  ỹ₁(2T)^{(ψ)}  ···  ỹ₁(LT)^{(ψ)} ;  ⋮  ;  ỹ_L(T)^{(ψ)}  ỹ_L(2T)^{(ψ)}  ···  ỹ_L(LT)^{(ψ)} ]    (E.9)

The variance of the noise at the output of the matched filter will be:

    σ_y² = σ_z² ∫_{−∞}^{+∞} |G(f)|² df = σ_z² E_g    (E.10)

where E_g is the energy of g(t). Create an averaged matrix ỹ:

    ỹ = (1/L^P) Σ_{ψ=0}^{L^P−1} ỹ^{ψ}.    (E.11)

The ML estimate (3.15) then becomes:

    m̂ = max [ỹ_i/√(E_g) + n]    (E.12)

where n has a variance shown in (3.20) and ỹ_i is the ith row of ỹ chosen at random.

E.3 BER Results

Figures E.1, E.2, E.3 and E.4 show the simulated BER performance for a variety of different pulse approximations. In each figure the curves compare the N = 2, 3 and 4 all-pole, N = 10 cascade, M = 1, N = 4 Padé and M = 2, N = 6 Padé approximations against the ideal pulse.

Figure E.1: BER for L = 2 for correlation and approximated matched receivers.

Figure E.2: BER for L = 4 for correlation and approximated matched receivers.

Figure E.3: BER for L = 8 for correlation and approximated matched receivers.

Figure E.4: BER for L = 16 for correlation and approximated matched receivers.

F Impulse Approximation

It is impossible to implement a Dirac delta impulse because it is defined as having infinite height and infinitesimally small width with unity energy. Any implementation is an approximation of the ideal impulse.
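The spectrum (F.4) derived in this appendix lends itself to a quick numerical check of the flat region and of the usable bandwidth f_max ≈ 1/(3T_I). The pulse parameters below are illustrative:

```python
import cmath, math

def H(f, a, tau, TI):
    """Spectrum of the trapezoidal impulse approximation, eq. (F.4),
    evaluated on the imaginary axis s = j*2*pi*f."""
    s = 2j * math.pi * f
    return (a / (tau * s * s)) * (1.0 - cmath.exp(-s * tau)
                                  - cmath.exp(-s * (TI - tau))
                                  + cmath.exp(-s * TI))

a, TI = 1.0, 1e-6
tau = TI / 10.0
dc = a * (TI - tau)                                  # low-frequency limit, eq. (F.5)
r_low = abs(H(1e3, a, tau, TI)) / dc                 # well below f_max: flat, ratio ~1
r_edge = abs(H(1.0 / (3.0 * TI), a, tau, TI)) / dc   # at f_max: response rolling off
print(r_low, r_edge)
```

The first ratio confirms (F.5): at low frequency the spectrum equals the pulse area a(T_I − τ); the second shows the response is already a fraction of a dB down at f_max.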
Figure F.1 shows an approximation to an impulse. The length of the pulse is T_I, the rise and fall times are τ and it has an amplitude of a. The time domain representation of this pulse is:

    h(t) = (a/τ)t,           0 < t ≤ τ
    h(t) = a,                τ < t ≤ T_I − τ    (F.1)
    h(t) = (a/τ)(T_I − t),   T_I − τ < t ≤ T_I

Using the following two relationships the Laplace domain impulse response can be formulated:

    ∫_{t₁}^{t₂} t e^{−st} dt = [e^{−st₁} + t₁se^{−st₁} − e^{−st₂} − t₂se^{−st₂}]/s²    (F.2)

    ∫_{t₁}^{t₂} e^{−st} dt = [e^{−st₁} − e^{−st₂}]/s    (F.3)

Due to the large amount of symmetry in the shape of the pulse, the following simplified form is obtained:

    H(s) = [a/(τs²)] [1 − e^{−sτ} − e^{−s(T_I−τ)} + e^{−sT_I}]    (F.4)

A plot of this function over frequency for several values of T_I, with T_I = 10τ, is shown in figure F.2. On this figure the ideal impulse response is also plotted. It is evident that the approximate pulse is flat across frequency, up to a certain point. Therefore it can be expected that the correct shape of the impulse response can be obtained, albeit with a loss of amplitude.

Figure F.1: Approximation to the impulse function. The length is T_I and the rise and fall times are τ.

To find the low frequency constant value of the approximated impulse response the limit of H(s) as s → 0 is required. One way of doing this is by using the Taylor expansion of e^{−x}. The first 3 terms of the expansion are required in order to find the limit: e^{−x} = 1 − x + x²/2. Using this expansion on (F.4) results in:

    H(s → 0) = a(T_I − τ)    (F.5)

(F.5) shows that the amplitude of the impulse response at the output of the filter is directly related to the area underneath the approximate pulse in the time domain. This can be verified by referring to figure F.2. Here it is assumed that a is equal to unity. From the graph the frequency up to which the approximation is valid can be found.
This is:

    f_max ≈ 1/(3T_I)    (F.6)

This analysis shows that the approximation of the impulse provides a constant value in the Laplace domain up to a certain frequency, which is dependent on the length of the pulse. Provided the pulse width is small enough, the roll off in frequency can be neglected.

Figure F.2: Frequency response of the approximated impulse response for T_I = 10τ for several values of T_I and a = 1.

For a triangular pulse, where T_I = 2τ and the flat section disappears, (F.4) simplifies to:

    H_T(s) = [2a/(τs²)] [1 − cos(ωτ)].    (F.7)

In this case the cutoff frequency is:

    f_Triangle = 1/(2πτ)    (F.8)

and the DC gain is:

    H_T(0) = aτ.    (F.9)

G Inductive Coil Characterisation

This appendix shows the characterisation of an inductive coil for use at 33 MHz. The measured results of the characterisation have been used in order to simulate the behaviour of the inductive communications link.

G.1 Lumped Element Model

The measurement setup is shown in figure G.1. The vector network analyser (VNA) is set up to measure the reflection s-parameter, S₁₁. These can then be converted into Z parameters. The impedance function of the transmitter element is:

    Z = [s/C + R/(LC)] / [s² + s(R/L) + 1/(LC)].    (G.1)

The magnitude squared of the impedance at resonance is:

    |Z_res|² = [Q²/(ω₀²C²)] (1 + 1/Q²).    (G.2)

Assuming that R << ω₀L, (G.2) can be simplified and the magnitude of the impedance at resonance is then:

    |Z_res| ≈ Q/(ω₀C).    (G.3)

The resonance frequency and Q can be found from the measured Z₁₁ parameters, thus a value for C can be found.

Figure G.1: Measurement setup for characterisation of the inductive transmitter element. The vector network analyser measures the reflection S parameters, S₁₁.

From (G.1) the centre frequency and Q are given by:

    ω₀ = 1/√(LC)    (G.4)

    Q = Lω₀/R    (G.5)

which enables L and then R to be found.

G.1.1 Measured Coil Characteristics

For a 10 mm long coil with a radius of 2.5 mm using 1 mm diameter enamelled copper wire, the measured and approximated impedance of the coil are shown in figure G.2. The modeled parameters were:

    L = 190 nH    (G.6)
    C = 715 fF    (G.7)
    R = 31.15 Ω    (G.8)
    Q = 16.5    (G.9)
    f₀ = 432 MHz    (G.10)

Figure G.2: Measured and modeled characteristics of a 10 mm long coil with radius 2.5 mm using 1 mm diameter enamelled copper wire.

With a 100 pF ± 10% capacitor placed in parallel with the coil, figure G.3, the following modeled parameters were obtained:

    L = 212 nH    (G.11)
    C = 98.5 pF    (G.12)
    R = 0.36 Ω    (G.13)
    Q = 128.8    (G.14)
    f₀ = 34.85 MHz    (G.15)

Figure G.3: Measured and modeled characteristics of a 10 mm long coil with radius 2.5 mm using 1 mm diameter enamelled copper wire. A 100 pF ± 10% capacitor is placed in parallel with the coil.

The results of this characterisation show that this coil is suitable for use in a transmitting element with a Q of less than 128. The reason for the large discrepancy between the resistance of the coil at 432 MHz and 35 MHz is probably the proximity effect. Wires placed in close proximity cause current crowding, which results in much higher losses. If the coil windings were spaced out then the proximity effect would be smaller. However, this is not a problem with this coil at frequencies around 35 MHz.

G.2 Coupling Measurements

To estimate the coupling between the inductors two series R-L circuits are used. In this case the coil resistance and capacitance can be neglected, see figure G.4.
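The extracted element values can be sanity-checked against (G.4) and (G.5); a minimal sketch using the bare-coil model values from G.1.1:

```python
import math

def resonance(L, C, R):
    """Centre frequency (Hz) and quality factor of the lumped coil model,
    eqs. (G.4)-(G.5): w0 = 1/sqrt(L*C), Q = w0*L/R."""
    w0 = 1.0 / math.sqrt(L * C)
    return w0 / (2.0 * math.pi), w0 * L / R

# bare-coil model values from (G.6)-(G.8)
f0, Q = resonance(190e-9, 715e-15, 31.15)
print(f0 / 1e6, Q)   # approximately 432 MHz and a Q of about 16.5
```

Repeating the calculation with the 100 pF parallel-capacitor values of (G.11)-(G.13) reproduces the 34.85 MHz centre frequency and Q of about 129 quoted above.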
The coupling constant of this ideal circuit is given by:

    V_out/V_in = (MsR/L²) / [s² + s(2R/L) + (R/L)²]    (G.16)

where M is the mutual coupling given by:

    M = kL.    (G.17)

Figure G.4: Ideal circuit for measuring the coupling constant.

The value of R can be chosen to place the centre frequency of the filter at 30 MHz. The gain at the centre frequency is given by:

    G = kL/(2L) = k/2.    (G.18)

By measuring the gain at the centre frequency an estimate for the coupling constant can be found. A resistance value of 50 Ω was used in the experiment. The Q of this circuit is low (< 0.5) so any deviation in the centre frequency between the TX and RX coils will not produce a large error.

The coupling when the antennas are aligned along the axis of the magnetic field (i.e. the coupling between transmit and receive antennas) was found to be:

    k = 16 × 10⁻⁹ / d³.    (G.19)

For the perpendicular coils (i.e. between a pair of transmit antennas) the coupling was found to be:

    k_p = 6.3 × 10⁻⁹ / d³.    (G.20)

The theoretical model predicts that the coupling constant between axis aligned coils is:

    k_i = lR²/(2d³)    (G.21)

where l is the length of the coil and R is the radius of the coil. This equates to a coupling constant of 31.25 × 10⁻⁹. This is double the value that was measured. The reason for this error is likely to be the orientation of the two coils during the measurement.

H 2nd Order Approximation

For the low pass and bandpass filter the Laplace transfer function and time domain impulse response for Q >> 1 are:

    H_LPF(s) = b_i² / [s² + (ω₀/Q)s + ω₀²]    (H.1)

    h_lpf(t) = b_i e^{−ω₀t/(2Q)} sin ω₀t    (H.2)

    H_BPF(s) = a_i s / [s² + (ω₀/Q)s + ω₀²]    (H.3)

    h_bpf(t) = −a_i e^{−ω₀t/(2Q)} cos ω₀t    (H.4)

where ω₀ is the centre frequency of the filter and Q is the quality factor. a_i and b_i are constants. A second order filter with a transmission zero may be represented as the sum of low pass and bandpass filters:

    H_ZERO(s) = (a_i s + b_i²) / [s² + (ω₀/Q)s + ω₀²]    (H.6)

    h_zero(t) = G e^{−ω₀t/(2Q)} sin ω₀[t − dt]    (H.7)

where G and dt can be found by considering the vector sum of the low pass sinusoid and bandpass cosinusoid. When dt << 2π/ω₀, the impulse response of the filter with the transmission zero is well approximated by a scaled and delayed low pass filter:

    ĥ_zero = G e^{−(ω₀/(2Q))[t−dt]} sin ω₀[t − dt].    (H.8)

I Transmitter Coupling

In this appendix the state-space model for the coupling between two adjacent transmitter elements is derived. The impulse response of the sum of the inductor currents is then found. From this expression the resulting transfer function in the limit of small and large coupling is shown. The coupling of two transmitter elements is shown in figure I.1. For simplicity assume that the same impulse function is applied to each of the elements; this is true for the 2nd order Gaussian approximation. The inductive elements are also assumed to have the same inductance. As the inductances are equal, the voltage induced in each coil by the current in the other can be written as:

    V_ind1 = kLs i₂    (I.1)

    V_ind2 = kLs i₁    (I.2)

Thus the state space model is:

    [v̇₁]   [ −1/(C₁R₁)       0           −1/C₁    0   ] [v₁]   [1/C₁]
    [v̇₂] = [     0       −1/(C₂R₂)        0     −1/C₂ ] [v₂] + [1/C₂] I_P    (I.3)
    [i̇₁]   [ 1/(L(1−k²))  −k/(L(1−k²))    0       0   ] [i₁]   [  0 ]
    [i̇₂]   [ −k/(L(1−k²))  1/(L(1−k²))    0       0   ] [i₂]   [  0 ]

I_P is the input pulse, which is assumed to be the ideal Dirac delta function, i.e. I_P(s) = 1. This formulation could easily be extended to a greater number of coupled elements if required.

Figure I.1: Circuit showing the coupling between two transmitter elements.

The characteristic polynomial of the sum of the inductor currents is given by:

    I_sum(s) = s⁴ + s³[1/(R₁C₁) + 1/(R₂C₂)] + s²[1/(LC₁(1−k²)) + 1/(LC₂(1−k²)) + 1/(R₁C₁R₂C₂)] + s[1/(LR₁C₁C₂(1−k²)) + 1/(LR₂C₁C₂(1−k²))] + 1/(L²C₁C₂(1−k²))    (I.4)

When there is no coupling this factors into the product of two separate state space systems:

    I_sum(s)|_{k=0} = [s² + s/(R₁C₁) + 1/(LC₁)] [s² + s/(R₂C₂) + 1/(LC₂)]    (I.5)

This is not the case when k ≠ 0.
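The factorisation at k = 0 can be verified numerically by expanding (I.5) and comparing coefficients with (I.4). A small sketch with illustrative component values (not those of the transmitter):

```python
def quartic_coeffs(L, C1, C2, R1, R2, k):
    """Coefficients [s^4, s^3, s^2, s^1, s^0] of the characteristic
    polynomial (I.4) for two coupled transmitter elements."""
    u = 1.0 - k * k
    return [1.0,
            1.0 / (R1 * C1) + 1.0 / (R2 * C2),
            1.0 / (L * C1 * u) + 1.0 / (L * C2 * u) + 1.0 / (R1 * C1 * R2 * C2),
            1.0 / (L * R1 * C1 * C2 * u) + 1.0 / (L * R2 * C1 * C2 * u),
            1.0 / (L * L * C1 * C2 * u)]

def section_product(L, C1, C2, R1, R2):
    """Expand (s^2 + s/(R1*C1) + 1/(L*C1))*(s^2 + s/(R2*C2) + 1/(L*C2)),
    i.e. the uncoupled factorisation (I.5)."""
    a = [1.0, 1.0 / (R1 * C1), 1.0 / (L * C1)]
    b = [1.0, 1.0 / (R2 * C2), 1.0 / (L * C2)]
    out = [0.0] * 5
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

p = quartic_coeffs(200e-9, 100e-12, 120e-12, 50.0, 60.0, 0.0)
q = section_product(200e-9, 100e-12, 120e-12, 50.0, 60.0)
match = all(abs(x - y) <= 1e-9 * abs(y) for x, y in zip(p, q))
print(match)  # True: with k = 0, (I.4) factors into (I.5)
```

For k ≠ 0 the coefficient lists differ, consistent with the statement that the coupled polynomial no longer factors into the two separate sections.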
To see the effect of the coupling constant consider the pole plot shown in figure I.2. This figure shows that for small coupling constants there is little effect on the pole positions. When the coupling is greater than 0.5 then there is a rapid change in the frequency difference between the poles, which causes the bandwidth of the pulse to increase. It is possible to adjust the 268 8 8 x 10 k=0.9 6 4 k=0 k=0.26 k=0.0166 Imaginary 2 k=0.068 0 −2 −4 −6 −8 −2.23 −2.225 −2.22 −2.215 −2.21 Real −2.205 −2.2 −2.195 −2.19 6 x 10 Figure I.2: Pole Plot showing how the poles of two transmitting elements vary as the coupling constant k is increased. The pulse is centred at 33 MHz with T = 1 µS and αT 2 = 4.6. values of R and L to compensate for the coupling when k is small. However, when k is less than 10 × 10−3 there is only a small difference in pole position. Figure I.3 shows the frequency domain plot for a pulse centred at 33 MHz with T = 1 µS and αT 2 = 4.6 for a variety of coupling constants. As the coupling constant is increased the frequency difference between the poles increases and the quality factor decreases. For k = 1 × 10−3 there is very little difference between the ideal and coupled responses; the difference cannot be seen in figure I.3. Therefore, provided that the coupling is of the order of 1e − 3 then it can be assumed that the transmitting elements are not coupled, thus simplifying the analysis and implementation of the transmitter coils. In practice this means that a certain separation in space between the coils is required. 269 35 k=0 −3 k=1 × 10 30 k=1 × 10−4 −5 k=1 × 10 Magnitude [dB20] 25 20 15 10 5 0 30 31 32 33 34 35 36 Frequency [MHz] Figure I.3: Frequency response of the a Gaussian approximation with the pulse centred at 33 MHz with T = 1 µS and αT 2 = 6. As the coupling constant is increased the frequency difference between the poles increases and the quality factor decreases. 
J 2nd Order Element Temperature Change

The centre frequency and Q of the transmitting element will be affected by the temperature coefficients of the capacitor, inductor and resistor. In this section the change in centre frequency and Q over a standard temperature range is determined. This shows that the percentage change in centre frequency and Q due to temperature is small compared to that achievable using tuning techniques. Therefore, the 2nd order elements may be tuned once at circuit build and left to run without tuning. The effects of age on the components have not been taken into account; however, ageing is typically a smaller effect than the change due to temperature.

The inductive coils used for this transmitter are custom built. A change in temperature has the effect of altering the dimensions of the inductor. The linear dimensions of a metal wire follow the thermodynamic expansion:

\Delta x = \alpha x_0 \Delta T  (J.1)

where Δx is the change in linear dimension due to thermal expansion and x₀ is the linear dimension at 20 °C. α is the expansion coefficient, which is 17 ppm/°C for copper. The inductance of a solenoid coil depends on the area of the core and the length of the coil. Thus the inductance after thermal expansion can be written as:

\hat{L} = \frac{\mu_0 N^2 \pi r^2 (1 + \alpha\Delta T)^2}{l\,(1 + \alpha\Delta T)} = L_0(1 + \alpha\Delta T).  (J.2)

The number of turns N is not affected by thermal expansion. Therefore, the temperature coefficient of the inductance is the same as that of copper, i.e. 17 ppm/°C. The change in centre frequency of a 2nd order element when using a combination of fixed and variable capacitors is given by:

\hat{f}_0 = f_0 \sqrt{\frac{C_V + C_F}{\left[C_V(1 + A_V\Delta T) + C_F(1 + A_F\Delta T)\right](1 + A_L\Delta T)}}  (J.3)

where C_F is the fixed capacitance with temperature coefficient A_F, C_V is the variable capacitance with temperature coefficient A_V and A_L is the inductor temperature coefficient.
For a 2nd order element where a maximum of 25 % of the capacitance contribution is from a variable capacitor, the change in centre frequency can be written as:

\hat{f}_0 = f_0 \sqrt{\frac{1.25}{\left[\frac{1 + A_V\Delta T}{4} + (1 + A_F\Delta T)\right](1 + A_L\Delta T)}}.  (J.4)

For an NPO ceramic capacitor the temperature coefficient is typically 30 ppm/°C; however, a trimmer capacitor (Murata TZC03 series) has a much larger temperature coefficient of 500 ppm/°C. Over a 55 °C temperature range (J.4) shows a ±0.4 % variation in the centre frequency. In a similar way the variation in Q can be written as:

\hat{Q} = Q\,(1 + A_R\Delta T)\sqrt{\frac{\frac{1 + A_V\Delta T}{4} + (1 + A_F\Delta T)}{1.25\,(1 + A_L\Delta T)}}  (J.5)

where A_R is the temperature coefficient of the resistance. A typical trimmer resistor has a temperature coefficient of 100 ppm/°C. Over a 55 °C temperature range (J.5) shows a variation in Q of approximately ±0.85 %.

K 2nd Order Element Tuning

This appendix describes a tuning method to set the centre frequency of each 2nd order transmitting element. A sense coil is used to measure the response of the transmitting coil to an impulse. The voltage across the sense coil due to current in the transmitter coil, ignoring gain factors, will be:

v(t) = g(t)\sin\omega_c t  (K.1)

where g(t) = \exp\left(-\frac{\omega_c t}{2Q}\right) is the exponentially decaying envelope and ω_c is the centre frequency. Multiplying (K.1) by I and Q sinusoids, with the required centre frequency ω₀, and low pass filtering the products gives:

y_I = g(t)\sin(\omega_e t + \theta)  (K.2)

y_Q = g(t)\sin\left(\omega_e t + \theta + \frac{\pi}{2}\right)  (K.3)

where ω_e = ω_c − ω₀ is the error in the frequency and θ is the phase of the reference clock, which is unknown.
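Using the coefficients quoted above (A_V = 500 ppm/°C, A_F = 30 ppm/°C, A_L = 17 ppm/°C, A_R = 100 ppm/°C), (J.4) and (J.5) can be evaluated directly; a minimal sketch:

```python
# Numerical check of (J.4) and (J.5): fractional drift of f0 and Q over a
# 55 degC excursion, using the temperature coefficients quoted in the text.
A_V = 500e-6   # trimmer capacitor tempco [1/degC]
A_F = 30e-6    # NPO fixed capacitor tempco [1/degC]
A_L = 17e-6    # copper inductor tempco [1/degC]
A_R = 100e-6   # trimmer resistor tempco [1/degC]

def f0_ratio(dT):
    # Equation (J.4): centre frequency ratio f0_hat / f0
    return (1.25 / (((1 + A_V*dT)/4 + (1 + A_F*dT)) * (1 + A_L*dT))) ** 0.5

def q_ratio(dT):
    # Equation (J.5): quality factor ratio Q_hat / Q
    return (1 + A_R*dT) * ((((1 + A_V*dT)/4 + (1 + A_F*dT))
                            / (1.25 * (1 + A_L*dT))) ** 0.5)

dT = 55.0
print(f"f0 drift: {100*(f0_ratio(dT) - 1):+.2f} %")   # prints "-0.39 %"
print(f"Q  drift: {100*(q_ratio(dT) - 1):+.2f} %")    # prints "+0.85 %"
```

The results reproduce the roughly ±0.4 % centre frequency and ±0.85 % Q variation quoted in the text, supporting the conclusion that a single tune at build time is sufficient.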
Taking the integral of y_I and y_Q over a time period T, which is long enough that the pulse can be assumed to have decayed to zero, results in:

\int_0^T y_I(t)\,dt = \frac{2Q}{\omega_c^2 + 4\omega_e^2 Q^2}\left[2\omega_e Q\cos\theta + \omega_c\sin\theta\right]  (K.4)

\int_0^T y_Q(t)\,dt = -\frac{2Q}{\omega_c^2 + 4\omega_e^2 Q^2}\left[2\omega_e Q\sin\theta - \omega_c\cos\theta\right]  (K.5)

The dependence on the unknown phase of the reference clock can be removed by finding the sum of the squares of (K.4) and (K.5):

R = \frac{16\omega_e^2 Q^4 + 4Q^2(\omega_e + \omega_0)^2}{\left[(\omega_e + \omega_0)^2 + 4\omega_e^2 Q^2\right]^2}  (K.6)

which when simplified, assuming Q >> 1, results in:

R \approx \frac{4Q^2}{4\omega_e^2 Q^2 + 2\omega_e\omega_0 + \omega_0^2}.  (K.7)

Equation (K.7) has a maximum which occurs when:

\omega_e = -\frac{\omega_0}{4Q^2}.  (K.8)

Therefore, the percentage error in tuning the 2nd order transmitting coils using this method can be at best:

\frac{f_e}{f_0} = \frac{100}{4Q^2}\,[\%].  (K.9)

For a Q of 40 the percentage error due to tuning is 0.02 %.

L Pulse Generation and Receiver Schematics

The following pages show the schematics of the circuits used for the pulse transmitter and receiver. These circuits include provision for digital to analogue converters to enable calibration of the centre frequency and Q of the second order elements. However, this functionality was not required, so the DAC ICs were not fitted. Both schematics and PCBs were created using KiCad (an open source program for circuit development).

Figure L.1: Top level block diagram of the transmitter and receiver circuits
Figure L.2: TX schematic
Figure L.3: TX2 schematic
Figure L.4: TX3 schematic
Figure L.5: TXpower schematic
Figure L.6: RXamp schematic
Figure L.7: RXamp2 schematic
Figure L.8: RX DAC schematic
Figure L.9: RXpower schematic
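As a numerical check of the tuning bound in Appendix K, the peak of the metric in (K.7) can be located by a simple grid search and compared with the prediction of (K.8); the values ω₀ = 2π · 33 MHz and Q = 40 are those used in the text.

```python
import numpy as np

# Locate the peak of the phase-independent metric R of (K.7) and compare
# with the predicted offset -w0/(4 Q^2) from (K.8).
Q = 40.0
w0 = 2 * np.pi * 33e6   # target centre frequency, 33 MHz as in the text

def R(we):
    # Approximate form of (K.7), valid for Q >> 1
    return 4 * Q**2 / (4 * we**2 * Q**2 + 2 * we * w0 + w0**2)

we = np.linspace(-1e-3 * w0, 0.0, 2_000_001)   # fine grid around the peak
we_peak = we[np.argmax(R(we))]

print(f"peak at we/w0 = {we_peak / w0:.4e}")
print(f"-1/(4Q^2)     = {-1 / (4 * Q**2):.4e}")
print(f"tuning error  = {100 * abs(we_peak) / w0:.3f} %")  # about 0.016 %
```

The peak sits at ω_e/ω₀ ≈ −1/(4Q²) = −1.5625 × 10⁻⁴, i.e. a best-case tuning error of about 0.016 %, which the text rounds to 0.02 %.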
